Multilingual Natural Language Processing @ Sapienza: April 2012

Monday, April 30, 2012

Seminar by Prof. Iryna Gurevych: How to UBY - a Large-Scale Unified Lexical-Semantic Resource (30/04/12)

Title: How to UBY - a Large-Scale Unified Lexical-Semantic Resource
Speaker: Iryna Gurevych

Abstract: The talk will present UBY, a large-scale resource integration
project based on the Lexical Markup Framework (LMF, ISO 24613:2008).
Currently, nine lexicons in two languages (English and German) have been
integrated: WordNet, GermaNet, FrameNet, VerbNet, Wikipedia (DE/EN),
Wiktionary (DE/EN), and OmegaWiki. All resources have been mapped to the
LMF-based model and imported into an SQL-DB. The UBY-API, a common Java
software library, provides access to all data in the database. The nine
lexicons are densely interlinked using monolingual and cross-lingual sense
alignments. These sense alignments yield enriched sense representations
and increased coverage. A sense alignment framework has been developed for
automatically aligning any pair of resources mono- or cross-lingually. As
an example, the talk will report on the automatic alignment of WordNet and
Wiktionary. Further information on UBY and UBY-API is available at:
http://www.ukp.tu-darmstadt.de/data/lexical-resources/uby/

Bio: Iryna Gurevych leads the UKP Lab in the Department of Computer
Science of the Technische Universität Darmstadt (UKP-TUDA) and at the
Institute for Educational Research and Educational Information (UKP-DIPF)
in Frankfurt. She holds an endowed Lichtenberg Chair, "Ubiquitous Knowledge
Processing", of the Volkswagen Foundation. Her research in NLP primarily
concerns applied lexical-semantic algorithms, such as computing the semantic
relatedness of words or paraphrase recognition, and their use to enhance the
performance of NLP tasks, such as information retrieval, question answering,
or summarization.

Sunday, April 29, 2012

Mid-term exam (27/04/12)

Mid-term exam: morphology and regular expressions, language modeling, probabilistic part-of-speech tagging, probabilistic syntactic parsing.

Lecture 12: introduction to semantics (20/04/12)

More exercises in preparation for the mid-term exam. Introduction to computational semantics. Syntax-driven semantic analysis. Semantic attachments. First-Order Logic. Lambda notation and lambda calculus for semantic representation.
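
To make the idea of semantic attachments in lambda notation more concrete, here is a minimal sketch in Python. It is not taken from the lecture material: the tiny lexicon, the predicate names, and the ASCII rendering of the formulas ("forall", "->") are assumptions made purely for illustration. Each word is paired with a lambda term, and composing constituents corresponds to function application (beta-reduction).

```python
# Lexical semantic attachments as Python lambdas that build FOL-style strings.
# All entries below are hypothetical examples, not the course's lexicon.
sleeps = lambda x: f"Sleep({x})"                  # \x.Sleep(x)
student = lambda x: f"Student({x})"               # \x.Student(x)
loves = lambda y: lambda x: f"Love({x},{y})"      # \y.\x.Love(x,y)

# Generalized quantifier: \P.\Q.forall x.(P(x) -> Q(x))
every = lambda p: lambda q: f"forall x.({p('x')} -> {q('x')})"

# VP -> Verb NP: apply the verb semantics to the object semantics.
loves_rome = loves("Rome")

# S -> NP VP with a proper-name subject: apply the VP semantics to the subject.
maria_loves_rome = loves_rome("Maria")

# S -> NP VP with a quantified subject: the NP semantics takes the VP as argument.
every_student_sleeps = every(student)(sleeps)

print(maria_loves_rome)
print(every_student_sleeps)
```

With these assumed entries, the two sentences come out as Love(Maria,Rome) and forall x.(Student(x) -> Sleep(x)).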

Wednesday, April 18, 2012

Lecture 11: exercises (17/04/12)

Exercises on regular expressions for morphological analysis, n-gram models, and stochastic part-of-speech tagging.
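
As an illustration of the kind of n-gram exercise mentioned above, the following sketch estimates bigram probabilities with add-one (Laplace) smoothing. The two-sentence corpus and the function names `train_bigram` and `bigram_prob` are assumptions made for this example, not material from the course. For simplicity, the vocabulary here includes the padding symbols `<s>` and `</s>`.

```python
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)  # simplification: counts <s> and </s> as vocabulary
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

# Hypothetical toy corpus.
CORPUS = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
UNIGRAMS, BIGRAMS = train_bigram(CORPUS)

print(bigram_prob("the", "dog", UNIGRAMS, BIGRAMS))     # seen bigram: (1 + 1) / (2 + 7)
print(bigram_prob("dog", "sleeps", UNIGRAMS, BIGRAMS))  # unseen bigram: (0 + 1) / (1 + 7)
```

The unseen bigram still receives a non-zero probability, which is the point of smoothing.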

Friday, April 13, 2012

Lecture 10: syntax, part 2 (13/04/12)

Backtracking vs. dynamic programming for parsing. The CKY algorithm. The Earley algorithm. Probabilistic CFGs (PCFGs). PCFGs for disambiguation: the probabilistic CKY algorithm. PCFGs for language modeling.
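
The probabilistic CKY algorithm listed above can be sketched roughly as follows. This is a minimal illustration, not the lecture's own code: the toy grammar in Chomsky normal form, its probabilities, and the function name `pcky` are all assumptions. The chart cell for a span stores, for each nonterminal, the probability of the best subtree covering that span.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form.
BINARY = {  # (B, C) -> list of (A, P(A -> B C))
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Det", "N"): [("NP", 0.6)],
}
LEXICAL = {  # word -> list of (A, P(A -> word))
    "she": [("NP", 0.4)],
    "eats": [("V", 1.0)],
    "the": [("Det", 1.0)],
    "fish": [("N", 1.0)],
}

def pcky(words, start="S"):
    """Return the probability of the best parse rooted in `start` (0.0 if none)."""
    n = len(words)
    # best[i][j] maps a nonterminal to the best probability for words[i:j].
    best = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for a, p in LEXICAL.get(word, []):
            best[i][i + 1][a] = max(best[i][i + 1][a], p)
    for span in range(2, n + 1):              # span length
        for i in range(n - span + 1):         # span start
            j = i + span
            for k in range(i + 1, j):         # split point
                for b, pb in list(best[i][k].items()):
                    for c, pc in list(best[k][j].items()):
                        for a, pr in BINARY.get((b, c), []):
                            cand = pr * pb * pc
                            if cand > best[i][j][a]:
                                best[i][j][a] = cand
    return best[0][n].get(start, 0.0)

print(pcky("she eats the fish".split()))
print(pcky("fish the eats".split()))
```

Under the assumed grammar, the first sentence has a single parse with probability 0.4 × 0.6 = 0.24, while the second has no parse and gets 0.0. Keeping the maximum in each cell gives disambiguation; replacing the maximum with a sum would instead give the total string probability used for language modeling.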

Wednesday, April 4, 2012

Lecture 9: syntax, part 1 (03/04/12)

Introduction to syntax. Context-free grammars and languages. Treebanks. Normal forms. Dependency grammars. Syntactic parsing: top-down and bottom-up. Structural ambiguity.

Monday, April 2, 2012

Lecture 8: part-of-speech tagging, part 2 (30/03/12)

Stochastic part-of-speech tagging. Hidden Markov models. Deleted interpolation. Transformation-based POS tagging. Handling out-of-vocabulary words.
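
As a rough illustration of stochastic tagging with a hidden Markov model, the sketch below decodes the most probable tag sequence with the Viterbi algorithm. The three-tag model, all probabilities, and the function name `viterbi` are assumptions invented for this example; a real tagger would estimate them from a tagged corpus and would also smooth the estimates (for instance with deleted interpolation) and handle unknown words.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Most probable tag sequence under a first-order HMM."""
    # Each column maps tag -> (best path probability ending in tag, backpointer).
    v = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        prev = v[-1]
        col = {}
        for t in tags:
            best_prev = max(tags, key=lambda s: prev[s][0] * trans_p[s][t])
            score = prev[best_prev][0] * trans_p[best_prev][t] * emit_p[t].get(word, 0.0)
            col[t] = (score, best_prev)
        v.append(col)
    # Follow backpointers from the best final tag.
    last = max(tags, key=lambda t: v[-1][t][0])
    path = [last]
    for col in reversed(v[1:]):
        path.append(col[path[-1]][1])
    return list(reversed(path))

# Hypothetical toy model.
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.6, "dog": 0.05},
}

print(viterbi("the dog barks".split(), TAGS, START_P, TRANS_P, EMIT_P))
```

With these assumed parameters, "the dog barks" is tagged DET NOUN VERB: "barks" could be a noun, but the NOUN to VERB transition and the higher verb emission probability win.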

Lecture 7: part-of-speech tagging, part 1 (27/03/12)

Introduction to part-of-speech (POS) tagging. POS tagsets: the Penn Treebank tagset and the Google Universal Tagset. Rule-based POS tagging. Introduction to stochastic POS tagging.