Constructing a Constructional MWE Lexicon for psycho-conceptual Annotation: Evaluation of CPA and DuELME for Lexicographic Description


Luder, M; Clematide, S (2010). Constructing a Constructional MWE Lexicon for psycho-conceptual Annotation: Evaluation of CPA and DuELME for Lexicographic Description. In: Dykstra, A; Schoonheim, T. Proceedings of the XIV Euralex International Congress. Leeuwarden, NL: Fryske Akademy, 402-410.

Abstract

The German JAKOB lexicon provides a basis for the coding of patient narratives and is currently extended in the direction of a phraseological and construction-grammar resource. For this purpose, we will compare two formalisms for the representation of multiword expressions (MWE): The Dutch Electronic Lexicon of Multiword Expressions (DuELME, Grégoire 2009) and the verb patterns from Corpus Pattern Analysis (CPA, Hanks 2008). We are looking for a representation format which is human-readable, and equally adapted for natural language processing (NLP). The JAKOB lexicon is implemented in the OLIF format and currently contains 7000 entries. The MWEs investigated are verbal phraseologisms and originate from the corpora of three different clients, consisting of a total of more than 400 transcribed sessions.

The narrative analysis method JAKOB is a tool for investigating everyday stories from psychotherapy transcripts (Boothe 2004). Stories are annotated on the basis of our predefined psycho-conceptual coding system represented in the lexicon. JAKOB allows formulating hypotheses about the client’s conflicts, the analysis of the discourse being one component thereof.

DuELME is an NLP lexicon project which encodes MWE descriptions in a theory- and implementation-independent way. Every MWE is an instance of a construction class with elements including morphosyntactic parameters. CPA patterns represent semantic properties for the elements of a (verbal) construction, whereas syntactic properties are represented in the JAKOB lexicon by the subcategorization frames (Satzmuster) of Wahrig (2007). We are implementing an additional lexicon property ‘bauplan’ which is formally constructed as a combination of the DuELME component list, the Wahrig subcategorization frame and semantic information out of the CPA-pattern. Because this structure is difficult to read for the lexicographer, it is generated automatically and can be hidden from the user, but is available for NLP tasks.
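
To make the idea of a combined 'bauplan' property more concrete, the following is a minimal sketch of how a component list, a subcategorization frame and CPA-style semantic types might be merged into one machine-readable string. It is not the authors' implementation: the class names, the tag set, the frame label "S+V+Dat+PP", the output syntax and the example MWE ("jemandem auf die Nerven gehen") are all assumptions chosen for illustration, and the actual OLIF, DuELME and Wahrig notations are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """One element of an MWE, loosely modelled on a component-list entry."""
    form: str                            # lemma, or a slot name such as "SUBJ"
    pos: str                             # part-of-speech tag (illustrative tag set)
    semantic_type: Optional[str] = None  # CPA-style semantic type, open slots only


@dataclass
class Bauplan:
    """Hypothetical combined record: components plus a syntactic frame label."""
    components: List[Component]
    subcat_frame: str                    # placeholder label, not Wahrig's notation

    def render(self) -> str:
        """Generate the compact string that would be hidden from the lexicographer."""
        parts = []
        for c in self.components:
            sem = f"[[{c.semantic_type}]]" if c.semantic_type else ""
            parts.append(f"{c.form}/{c.pos}{sem}")
        return self.subcat_frame + " | " + " ".join(parts)


def build_example() -> Bauplan:
    # Assumed example: "jemandem auf die Nerven gehen" (to get on someone's nerves)
    return Bauplan(
        components=[
            Component("SUBJ", "NP", "Human"),  # open slot with semantic type
            Component("DAT", "NP", "Human"),   # open slot with semantic type
            Component("auf", "P"),             # fixed lexical components
            Component("der", "D"),
            Component("Nerv", "N"),
            Component("gehen", "V"),
        ],
        subcat_frame="S+V+Dat+PP",
    )


if __name__ == "__main__":
    print(build_example().render())
```

Keeping the structured record separate from its rendered string reflects the design described above: the lexicographer edits readable fields, while the dense string is generated automatically for NLP tasks.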

Statistics

Downloads

84 downloads since deposited on 02 Nov 2010
18 downloads in the past 12 months

Additional indexing

Item Type: Book Section, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Psychology
Dewey Decimal Classification: 150 Psychology
Language: English
Date: 2010
Deposited On: 02 Nov 2010 15:57
Last Modified: 05 Apr 2016 14:14
Publisher: Fryske Akademy
ISBN: 978-90-6273-850-3

Download

Filetype: PDF
Size: 1MB