Datasets
Highlights: WordSim-353 word pairs reannotated according to the word interchangeability guidelines; SimLex-999 word pairs reannotated according to the same guidelines; and a Czech version of the SimLex-999 word pairs annotated according to the original guidelines.
- Kliegr, Tomáš, and Ondřej Zamazal. "Antonyms are similar: Towards paradigmatic association approach to rating similarity in SimLex-999 and WordSim-353." Data & Knowledge Engineering 115 (2018): 174-193.
This dataset can be used to train or validate keyword detection algorithms and is described in:
- Dojchinovski, Milan, Dinesh Reddy, Tomáš Kliegr, Tomas Vitvar, and Harald Sack. "Crowdsourced Corpus with Entity Salience Annotations." In LREC. 2016.
This dataset extends DBpedia with additional types and is described in:
- Kliegr, Tomáš. "Linked hypernyms: Enriching DBpedia with targeted hypernym discovery." Web Semantics: Science, Services and Agents on the World Wide Web 31 (2015): 59-69.
This dataset is intended for the evaluation of entity typing algorithms and is described in:
- Kliegr, Tomáš, and Ondřej Zamazal. "LHD 2.0: A text mining approach to typing entities in knowledge graphs." Web Semantics: Science, Services and Agents on the World Wide Web 39 (2016): 47-61. (free preprint).