Project Title and Abstract

Project Title

Combinatorial and Relational Network as Toolkit for Dutch Language Technology.

Acronym

Cornetto

Abstract

Cornetto is a lexical semantic database for Dutch, covering 92K entries, including the most generic and central part of the language. The database combines the structure and content of Wordnet and FrameNet-like data. It contains both vertical and horizontal semantic relations and combinatorial lexical constraints such as multiword expressions, idioms and collocations on the one hand, and lexical functions and frames on the other. The concepts are aligned with the English Wordnet, which allowed ontologies and domain labels to be imported.
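To make the combination of semantic relations and combinatorial information more concrete, the following is a minimal illustrative sketch in Python. It is not the Cornetto data model or API: the class names, field names, and the three Dutch sample entries are all assumptions made up for illustration. It shows concepts linked by a vertical (hypernym) relation, with collocations attached to individual word senses.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LexicalUnit:
    """A word sense with combinatorial information (hypothetical fields)."""
    lemma: str
    collocations: list[str] = field(default_factory=list)


@dataclass
class Synset:
    """A concept grouping lexical units (hypothetical structure)."""
    synset_id: str
    units: list[LexicalUnit] = field(default_factory=list)
    hypernyms: list[str] = field(default_factory=list)  # vertical relation
    antonyms: list[str] = field(default_factory=list)   # horizontal relation
    english_wordnet_id: Optional[str] = None             # alignment slot, assumed


def hypernym_chain(synsets: dict[str, Synset], start: str) -> list[str]:
    """Follow the first hypernym link upward until no parent remains."""
    chain = [start]
    seen = {start}
    current = synsets[start]
    while current.hypernyms:
        parent = current.hypernyms[0]
        # Stop on cycles or on links to concepts missing from the lexicon.
        if parent in seen or parent not in synsets:
            break
        chain.append(parent)
        seen.add(parent)
        current = synsets[parent]
    return chain


# Made-up sample entries, not taken from the Cornetto database.
synsets = {
    "dier": Synset("dier", [LexicalUnit("dier")]),
    "hond": Synset(
        "hond",
        [LexicalUnit("hond", collocations=["de hond uitlaten"])],
        hypernyms=["dier"],
    ),
    "poedel": Synset("poedel", [LexicalUnit("poedel")], hypernyms=["hond"]),
}

print(hypernym_chain(synsets, "poedel"))  # ['poedel', 'hond', 'dier']
```

Keeping collocations on the lexical unit and relations on the concept mirrors the split the abstract describes between combinatorial constraints and semantic relations; the actual database may organize this differently.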

In addition, Cornetto developed a toolkit for acquiring new concepts and relations and for tuning and extracting a domain-specific sub-lexicon from a compiled corpus. The lexical database was evaluated by integrating it into IR and QA applications. The Cornetto goals fit the resources priority for Electronic lexicons and the research priority for Semantic analysis. In terms of applications, it is relevant to:

  • Monolingual and multilingual Information extraction
  • Semantic web
  • Dialogue and QA solutions
  • Automatic summarization and text generation applications
  • Machine translation
  • Educational systems

Project Proposal

Project Proposal (PDF)

News

Cornetto version 2.0

A new release of the Cornetto database is now available from the TST centrale: Cornetto database at the TST centrale. Cornetto was revised during the DutchSemCor project. DutchSemCor resulted in the annotation of the SONAR corpus with the meanings of the most frequent and polysemous words in Cornetto. The release of Cornetto 2.0 will be extended with word-sense-disambiguation systems that were developed during DutchSemCor.

Cornetto Demo

Try the Cornetto client yourself and explore the content of the Cornetto database with our online Demo!

For reference

P. Vossen, I. Maks, R. Segers, H. van der Vliet, M.-F. Moens, K. Hofmann, E. Tjong Kim Sang, and M. de Rijke. Cornetto: a Combinatorial Lexical Semantic Database for Dutch. Chapter 10 in: Spyns, Peter; Odijk, Jan (Eds.) Essential Speech and Language Technology for Dutch. Results by the STEVIN-programme. Series: Theory and Applications of Natural Language Processing, 2013, XVII.

Sponsor

Nederlandse Taalunie


Last update: 2 December, 2012, p.vossen(at)let.vu.nl