Knowing a language, learning it, and processing it all require knowledge of its
words. Yet, the information encoded in a lexical entry, its use,
the relation of words to each other in the lexicon, and the
relationship of the lexicon to the grammar are complex and
unsettled issues, on which researchers hold very different views.
We think that the accumulated evidence that lexical effects are
strong (``it is all in the words'') can be reconciled with the
theoretical needs for generalisation and succinctness by exploring
the notion of classes of words. Investigating the notion of class
promises to inform some of the common concerns that have appeared in
the recent theoretical and computational linguistics literature: in
particular, automatic lexical acquisition and organisation, and
the integration of grammatical and probabilistic information. The
goal of the current project is to investigate these issues by
studying what part of a verb's lexical entry contains
class-related information, using a corpus-based experimental
approach.
The linguistic notions that we investigate are related to the argument
structure of a verb -- the entities that constitute the fundamental
architecture of a proposition (who did what to whom). Merlo and Stevenson
(2001) investigate statistical correlates of notions such as Agent or
Patient to automatically distinguish
action verbs from change of state verbs in English. We focus our
attention on the thematic relations of the NP arguments and the
notion of argumenthood for prepositional phrases.
In the proposed research, we extend this corpus-based approach to new
semantic roles, such as Beneficiary and Instrument, and to new languages
(French and Italian). In this way, we intend to demonstrate the
completeness of the method by applying it to a large portion of the
thematic inventory, and its cross-linguistic validity by applying it to
languages other than English.