Thursday, May 31, 2012

Seminar by Prof. Mark Steedman: The Statistical Problem of Language Acquisition (31/05/12)

Title: The Statistical Problem of Language Acquisition
Speaker: Mark Steedman (Informatics, University of Edinburgh)
When: Thursday 15.45
Where: Aula Seminari, Via Salaria, 113 (third floor)

Abstract:

The talk reports recent work with Tom Kwiatkowski, Sharon Goldwater,
and Luke Zettlemoyer on semantic parser induction by machine from a
number of corpora pairing sentences with logical forms, including
GeoQuery, ATIS, and a corpus consisting of real child-directed utterances from
the CHILDES corpus.

The problems of semantic parser induction and child language acquisition
are both similar to the problem of inducing a grammar and a
parsing model from a treebank such as the Penn treebank, except that
the trees are unordered logical forms, in which the preterminals
are not aligned with words in the target language, and there may be
noise and spurious distracting logical forms supported by the context
but irrelevant to the utterance.

The talk shows that this class of problem can be solved if the child
or machine initially parses with the entire space of possibilities
that universal grammar allows under the assumptions of the Combinatory
Categorial theory of grammar (CCG), and learns a statistical
parsing model for that space using EM-related methods such
as Variational Bayes learning.

This can be done without all-or-none "parameter-setting" or attendant
"triggers", and without invoking any "subset principle" of the kind
proposed in linguistic theory, provided the system is presented with a
representative sample of reasonably short string-meaning pairs from
the target language.


Bio: Mark Steedman is Professor of Cognitive Science in the School of
Informatics at the University of Edinburgh, to which he moved in 1998
from the University of Pennsylvania, where he taught as
Professor in the Department of Computer and Information Science.  He
is a Fellow of the British Academy, the Royal Society of Edinburgh,
the American Association for Artificial Intelligence, and the European
Academy.

His research covers a range of problems in computational linguistics,
artificial intelligence, computer science, and cognitive science,
including syntax and semantics of natural language, and parsing and
comprehension of natural language discourse by humans and by machine
using Combinatory Categorial Grammar (CCG).  Much of his current NLP
research concerns wide-coverage parsing for robust semantic
interpretation and natural language inference, and the problem of
inducing such grammars from data and grounded meaning representations,
including those arising in robotics domains. Some of his research
concerns the analysis of music using robust NLP methods.
