The goal of this paper is to provide an annotation scheme for compounds based on generative lexicon theory (GL, Pustejovsky, 1995; Bassac and Bouillon, 2001). This scheme has been tested on a set of compounds automatically extracted from the Europarl corpus (Koehn, 2005) both in Italian and French. The motivation is twofold. On the one hand, it should help refine existing compound classifications and better explain lexicalization in both languages. On the other hand, we hope that the extracted generalizations can be used in NLP, for example for improving MT systems or for query reformulation (Claveau, 2003). In this paper, we focus on the annotation scheme and its on going evaluation.
n 14 years – the first LREC was held in Granada in 1998 – LREC has become the major event on Language Resources (LRs) and Evaluation for Human Language Technologies (HLT). The aim of LREC is to provide an overview of the state-of-the-art, explore new R&D directions and emerging trends, exchange information regarding LRs and their applications, evaluation methodologies and tools, ongoing and planned activities, industrial uses and needs, requirements coming from the e-society, both with respect to policy issues and to technological and organisational ones.
Bouillon P. ; Jezek J. ; Melloni C. ; Picton A.,
Annotating Qualia Relations in Italian and French Complex Nominals in Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12)
, Atti di "Eight International Conference on Language Resources and Evaluation (LREC'12)"
, 23-25 maggio 2012
, pp. 1527-1532