LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon

Abstract

This paper introduces LexFr, a corpus-based French lexical resource built by adapting LexIt, a framework originally developed to describe the combinatorial potential of Italian predicates. As in the original framework, the behavior of a set of target predicates is characterized by statistical information, known as distributional profiles, at both the syntactic level (subcategorization frames) and the semantic level (selectional preferences), extracted by a mostly unsupervised process. The first release of LexFr includes information for 2,493 verbs, 7,939 nouns and 2,628 adjectives. We describe the adaptation process and evaluate the final resource by comparing the information collected for 20 test verbs against the information available in a gold standard dictionary. In the best-performing setting, we obtained 0.74 precision, 0.66 recall and 0.70 F-measure.
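
As a quick sanity check on the reported scores, the sketch below recomputes the F-measure from the precision and recall given in the abstract. The abstract does not say which F-measure variant was used, so this assumes the balanced F1 (harmonic mean of precision and recall); the function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the balanced F1 (an assumption here)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract for the best-performing setting.
score = f_measure(0.74, 0.66)
print(f"{score:.2f}")
```

Under the F1 assumption, the result rounds to the 0.70 reported in the abstract.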

Publication
In *Proceedings of the Tenth International Conference on Language Resources and Evaluation*

Giulia Rambelli
PhD student in Computational Linguistics

In my research, I investigate the mechanisms underlying natural language comprehension, bringing together Construction Grammars, Distributional Semantics and psycholinguistic findings.
