[NetBehaviour] Software learns new words from Wikipedia.
marc.garrett at furtherfield.org
Wed Sep 6 22:34:06 CEST 2006
A program that works out the meaning of newly coined words using the
online encyclopaedia Wikipedia could help machines understand the slang
used in blogs and other informal texts, say researchers.
The program – called Zeitgeist – hunts through Wikipedia looking for
entries about new words that do not appear in an online resource called
WordNet, a lexical database that serves as both a dictionary and a
thesaurus. WordNet is used by researchers to help computers understand
human language. New words, or neologisms, that do not appear in WordNet
inevitably leave computers stumped.
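The filtering step described above can be sketched roughly as follows. This is only an illustration, not Zeitgeist's actual code: the vocabulary set, the list of entry titles and the function name find_candidate_neologisms are all made up for the example, and a real system would query WordNet and Wikipedia rather than two small hard-coded collections.

```python
# Toy stand-in for WordNet's vocabulary (hypothetical data, not the real database).
KNOWN_WORDS = {"pub", "gastronomy", "blog", "food"}

# Toy stand-in for a list of Wikipedia entry titles (hypothetical data).
ENTRY_TITLES = ["Pub", "Gastropub", "Blog", "Podcast"]


def find_candidate_neologisms(titles, known_words):
    """Return the titles whose lowercased form is absent from the known vocabulary."""
    return [title for title in titles if title.lower() not in known_words]


candidates = find_candidate_neologisms(ENTRY_TITLES, KNOWN_WORDS)
print(candidates)
```

Any title that survives the filter is treated as a candidate new word whose meaning still has to be worked out.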
When Zeitgeist finds a Wikipedia entry about a new word, it looks at the
links to and from the page, explains lead researcher Tony Veale from
University College Dublin, Ireland. "Is there a pattern amongst those
linkages that allows us to understand what the new word means?" he asks.
For example, having found an entry for the word "gastropub" – a bar that
specialises in food – Zeitgeist can work out the definition for itself
thanks to the links to the entries for "pub" and "gastronomy". The
program does not read the linked-to pages but relates their titles to
entries in the WordNet database.
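The "gastropub" example might look something like the sketch below. It is an assumption-laden illustration rather than the researchers' method: the link tables, the vocabulary set and the function name related_known_terms are invented for the example, and only the titles of linked pages are compared, never their contents.

```python
# Toy stand-in for WordNet's vocabulary (hypothetical data).
KNOWN_WORDS = {"pub", "gastronomy", "bar"}

# Hypothetical link tables: titles a page links to, and titles that link to it.
LINKS_OUT = {"gastropub": ["pub", "gastronomy", "london"]}
LINKS_IN = {"gastropub": ["pub", "bistro"]}


def related_known_terms(word, links_out, links_in, known_words):
    """Collect linked titles (both directions) that the vocabulary already knows."""
    linked = set(links_out.get(word, [])) | set(links_in.get(word, []))
    return sorted(title for title in linked if title.lower() in known_words)


anchors = related_known_terms("gastropub", LINKS_OUT, LINKS_IN, KNOWN_WORDS)
print(anchors)
```

The known terms returned here are only raw material; deciding how they combine into a definition is the harder part, and the article does not spell out how Zeitgeist does it.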