Polarity lexicon: from English to Dutch

07/18/08 | by Valentin Jijkoun | Tags: lexicon, wordnet

We need a reasonably large lexicon of Dutch "opinionated" words, i.e., words that serve as subjectivity clues in text. For English, for example, excellent, wow or admire are often positive clues, whereas awful, dislike or shame are negative clues.

Here is what we tried. We took a large publicly available English opinion lexicon and translated its lists of "positive" and "negative" words into Dutch using Google Translate. Of course, the result is quite noisy: many words are left untranslated, and many are translated incorrectly (for our purposes).

To get cleaner Dutch word lists, we used the Dutch Wordnet. We viewed Wordnet (essentially, a network of inter-word relations) as a... network of inter-word relations (to be precise, relations between words and synsets, i.e., sets of synonymous words). Each word (a node in the network) was assigned weight 1 if it occurred in the positive word list we got from Google Translate and weight -1 if it occurred in the negative list; all other words were assigned weight 0. Then we ran a weighted version of PageRank on this network, using Wordnet relations as links between nodes. Finally, we looked at the nodes with the highest and lowest PageRank scores. This simple method gave us surprisingly clean lists of positive and negative words. Here are the top words:

Positive clues   Negative clues
productief       ...
staan            rotzak
recht            schoelje
zoet             smiecht
leven            tyfuslijer
passen           zakkenwasser
gelukkig         afzetten
aardig           vallen
lekker           dwaasheid
fijn             onzin
doen             snijden
goed             pan
aangenaam        kwaad
leuk             rommel
plezierig        band
verfrissend      rot
herstellen       duister
mooi             vuil
winstgevend      afnemen
gewicht          aftrekken
helder           lam
waarde           gek
positief         puinhoop
plaats           lelijk
...              ...
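
For readers who want to try something similar, the propagation step can be sketched roughly as follows. This is only one possible reading of "weighted PageRank", not the code we actually ran: the seed weights (1, -1, 0) act as the bias term, and each node repeatedly collects a damped share of its neighbours' scores. The function name, the damping factor of 0.85, the iteration count and the toy edges are all assumptions made for illustration; the edges are invented and are not real Dutch Wordnet relations.

```python
from collections import defaultdict


def propagate_polarity(edges, seeds, damping=0.85, iterations=50):
    """Spread seed weights over an undirected graph, PageRank-style.

    edges: iterable of (node, node) pairs (stand-ins for Wordnet relations)
    seeds: dict mapping a node to +1 or -1; missing nodes count as 0
    Seed words that appear in no edge are ignored.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Start from the seed weights themselves.
    scores = {node: float(seeds.get(node, 0)) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, linked in neighbours.items():
            # Each neighbour shares its score equally among its own links.
            incoming = sum(scores[n] / len(neighbours[n]) for n in linked)
            updated[node] = (1 - damping) * seeds.get(node, 0) + damping * incoming
        scores = updated
    return scores


# Toy data: hypothetical links between a few words from the lists above.
positive_seeds = ["goed"]
negative_seeds = ["rot"]
toy_edges = [
    ("goed", "fijn"), ("fijn", "leuk"), ("goed", "aangenaam"),
    ("rot", "vuil"), ("vuil", "lelijk"), ("rot", "rommel"),
]

seeds = {word: 1 for word in positive_seeds}
seeds.update({word: -1 for word in negative_seeds})

scores = propagate_polarity(toy_edges, seeds)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # most positive first, most negative last
```

In this toy graph the unseeded words pick up the sign of the seed they are connected to, which is the effect we relied on to clean up the noisy translated lists.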

In total, we got 4466 positive and 7091 negative words. After a bugfix, all but 15 words in the Wordnet were assigned a score, of course. The resulting rankings still need a careful evaluation, but the lists do look good, given the simplicity of the method. Thanks to OpinionFinder for the English lexicon, and to Google for the translation service and the PageRank! ;)
