[NetBehaviour] Software learns new words from Wikipedia.

marc marc.garrett at furtherfield.org
Wed Sep 6 22:34:06 CEST 2006

Software learns new words from Wikipedia.

Tom Simonite.

A program that works out the meaning of newly coined words using the 
online encyclopaedia Wikipedia could help machines understand the slang 
used in blogs and other informal texts, say researchers.

The program – called Zeitgeist – hunts through Wikipedia looking for 
entries about new words that do not appear in an online resource called 
WordNet, an official linguistics tool that is both a dictionary and a 
thesaurus. WordNet is used by researchers to help computers understand 
human language. New words, or neologisms, that do not appear in WordNet 
inevitably leave computers stumped.
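
The filtering step described above amounts to a set difference between encyclopaedia entry titles and the lexicon. The following is a minimal sketch only: the small KNOWN_WORDS set is a toy stand-in for WordNet, and find_neologisms and the sample titles are hypothetical names used for illustration, not part of Zeitgeist.

```python
# Toy stand-in for the WordNet lexicon (assumption: a real system
# would query WordNet itself rather than a hard-coded set).
KNOWN_WORDS = {"pub", "gastronomy", "blog", "encyclopaedia"}


def find_neologisms(titles, lexicon=KNOWN_WORDS):
    """Return the entry titles that are absent from the lexicon, in order."""
    return [t for t in titles if t.lower() not in lexicon]


# Hypothetical sample of entry titles.
candidates = find_neologisms(["Pub", "Gastropub", "Blog"])
print(candidates)
```

Only titles missing from the lexicon survive, so they become the candidates for further analysis.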

When Zeitgeist finds a Wikipedia entry about a new word, it looks at the 
links to and from the page, explains lead researcher Tony Veale from 
University College Dublin, Ireland. "Is there a pattern amongst those 
linkages that allows us to understand what the new word means?" he asks.

For example, having found an entry for the word "gastropub" – a bar that 
specialises in food – Zeitgeist can work out the definition for itself 
thanks to the links to the entries for "pub" and "gastronomy". The 
program does not read the linked-to pages but relates their titles to 
entries in the WordNet database.
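
The "gastropub" example can be sketched in a few lines. This is a rough illustration under stated assumptions, not the researchers' actual method: the article does not say how Zeitgeist matches link titles to the new word, so the suffix and shared-prefix heuristic below is invented for the sketch, as are the names explain_neologism, shared_prefix_len, min_overlap and the toy KNOWN_WORDS set standing in for WordNet.

```python
# Toy stand-in for WordNet (assumption): only these words count as known.
KNOWN_WORDS = {"pub", "gastronomy", "food", "bar"}


def shared_prefix_len(a, b):
    """Length of the common leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def explain_neologism(word, linked_titles, lexicon=KNOWN_WORDS, min_overlap=3):
    """Guess which known linked titles supply the head and modifier of `word`.

    Hypothetical heuristic: a known title that ends the new word is taken
    as its head; a known title sharing a leading stem is its modifier.
    Only the titles are inspected, never the linked pages themselves.
    """
    word = word.lower()
    head, modifier = None, None
    for title in linked_titles:
        t = title.lower()
        if t not in lexicon or t == word:
            continue  # skip titles the lexicon does not already know
        if word.endswith(t) and len(t) >= min_overlap:
            head = t
        elif shared_prefix_len(word, t) >= min_overlap:
            modifier = t
    return {"head": head, "modifier": modifier}


# Hypothetical link titles for the "gastropub" entry.
result = explain_neologism("gastropub", ["Pub", "Gastronomy", "London"])
print(result)
```

Here "London" is ignored because it is not in the toy lexicon, while "Pub" and "Gastronomy" are matched against the end and the start of the new word respectively.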

