For a query entity (a noun phrase), the Targeted Hypernym Discovery (THD) algorithm uses lexico-syntactic patterns to extract a hypernym from the Wikipedia article that defines the noun phrase. The SCM classifier can use this hypernym to map the noun phrase to a WordNet synset, but the hypernym can also serve as the classification result in its own right, which makes THD usable as an unsupervised classification system.
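To make the idea of a lexico-syntactic pattern concrete, the following is a minimal Python sketch and not the THD implementation itself. It assumes the defining sentence of the article is already available as a string and applies a single, simplified "X is a Y" pattern with a regular expression. The names COPULA_PATTERN, STOP_WORDS and extract_hypernym are hypothetical and chosen for illustration only.

```python
import re

# Simplified stand-in for one lexico-syntactic pattern (an assumption,
# not the grammar shipped with THD):
#   "<subject> is|was|are|were a|an|the <modifiers> <head noun> ..."
COPULA_PATTERN = re.compile(
    r"\b(?:is|was|are|were)\s+(?:an|a|the)\s+(?P<np>[^.,;()]+)",
    re.IGNORECASE,
)

# Tokens that are taken to end the noun phrase following the copula.
STOP_WORDS = {
    "of", "in", "on", "at", "for", "from", "with", "by", "to",
    "that", "which", "who", "and", "or",
}


def extract_hypernym(first_sentence):
    """Return a candidate hypernym from a defining sentence, or None.

    The candidate is the last token before the first stop word in the
    phrase that follows the copula and article, lowercased.
    """
    match = COPULA_PATTERN.search(first_sentence)
    if match is None:
        return None

    phrase_tokens = []
    for token in match.group("np").split():
        if token.lower() in STOP_WORDS:
            break
        phrase_tokens.append(token)

    if not phrase_tokens:
        return None
    # Treat the last token of the phrase as its head noun.
    return phrase_tokens[-1].lower()
```

For the sentence "Diego Maradona was an Argentine professional footballer and manager." this sketch returns "footballer". Unlike this sketch, a pattern written as a grammar over tokenized and part-of-speech-tagged text can identify the head noun without relying on a stop-word list.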
Running THD within GATE requires two pipelines: a corpus acquisition pipeline and a pipeline that annotates the resulting corpus of Wikipedia articles (see Figure).
The application consists of two GATE modules and a JAPE grammar:
Compatibility (tested): GATE 4, GATE 6, GATE 7