Friday, March 27, 2015

Lecture 5: syntax

Introduction to syntax. Context-free grammars and languages. Treebanks. Normal forms. Dependency grammars. Syntactic parsing: top-down and bottom-up. Structural ambiguity. Backtracking vs. dynamic programming for parsing. The CKY algorithm. The Earley algorithm. Probabilistic CFGs (PCFGs). PCFGs for disambiguation: the probabilistic CKY algorithm. PCFGs for language modeling.
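
As a rough illustration of dynamic programming for parsing, here is a minimal sketch of a CKY recognizer. The toy grammar, the lexicon and the function name `cky_recognize` are assumptions made for this example, not material from the lecture. The grammar is assumed to be in Chomsky normal form already.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (illustrative assumption).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cky_recognize(words, start="S"):
    """Return True if the toy grammar derives `words` from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[(i, j)] holds the nonterminals that derive words[i:j]
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    # Fill longer spans from shorter ones (bottom-up).
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start in chart[(0, n)]


print(cky_recognize("the dog sees the cat".split()))
```

Storing back-pointers in each chart cell would turn this recognizer into a parser; attaching rule probabilities and keeping the best score per nonterminal gives the probabilistic variant.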

Friday, March 20, 2015

Lecture 4: Part-of-Speech Tagging

Introduction to part-of-speech (POS) tagging. POS tagsets: the Penn Treebank tagset and the Google Universal Tagset. Rule-based POS tagging. Stochastic part-of-speech tagging. Hidden Markov models. Deleted interpolation. Linear and logistic regression: Maximum Entropy models. Transformation-based POS tagging. Handling out-of-vocabulary words.
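
To make HMM-based tagging concrete, here is a minimal sketch of Viterbi decoding over a tiny hand-made model. All probabilities, the three-tag inventory and the function name `viterbi` are invented for illustration and do not come from the lecture or from any treebank.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for a non-empty list of words."""

    def lp(p):
        # Log-probability, with log(0) mapped to -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][tag] = (log-prob of the best path ending in tag at position t, previous tag)
    best = [{t: (lp(start_p[t]) + lp(emit_p[t].get(words[0], 0.0)), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max((best[-1][p][0] + lp(trans_p[p][t]), p) for p in tags)
            column[t] = (score + lp(emit_p[t].get(word, 0.0)), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy model (all numbers are made up).
TAGS = ["DT", "NN", "VB"]
START = {"DT": 0.6, "NN": 0.3, "VB": 0.1}
TRANS = {
    "DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
    "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
    "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1},
}
EMIT = {
    "DT": {"the": 1.0},
    "NN": {"dog": 0.5, "barks": 0.1, "walk": 0.4},
    "VB": {"barks": 0.7, "walk": 0.3},
}

print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
```

In this sketch an out-of-vocabulary word gets emission probability zero under every tag, so a real tagger would need one of the unknown-word strategies listed above.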

Saturday, March 14, 2015

Lecture 3: language modeling

We introduced N-gram models (unigrams, bigrams, trigrams), together with their probability modeling and issues. We discussed perplexity and its close relationship with entropy, and we introduced smoothing and interpolation techniques to deal with the issue of data sparsity.
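
The ideas above can be sketched with a bigram model that uses add-one (Laplace) smoothing and is evaluated by perplexity. This is only an illustration under simple assumptions: the function names, the `<s>`, `</s>` and `<unk>` symbols and the two-sentence corpus are all made up for the example, and add-one is just one of many smoothing techniques.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their contexts over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[:-1])  # every token except the last is a context
        bigrams.update(zip(tokens, tokens[1:]))
    # Words that can be predicted, plus an <unk> slot for unseen words.
    vocab = {w for sent in sentences for w in sent} | {"</s>", "<unk>"}
    return unigrams, bigrams, vocab


def prob(prev, word, unigrams, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    word = word if word in vocab else "<unk>"
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))


def perplexity(sentences, model):
    """Perplexity = 2 ** (cross-entropy in bits per token)."""
    log_prob, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_prob += math.log2(prob(prev, word, *model))
            n += 1
    return 2 ** (-log_prob / n)


train = [["the", "dog", "barks"], ["the", "cat", "meows"]]
model = train_bigram(train)
print(perplexity(train, model))
print(perplexity([["dog", "the", "meows"]], model))
```

The second sentence contains only unseen bigrams in its first three positions, so its perplexity is higher than that of the training data.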

We also discussed homework 1a in detail (see slides on the class group).

Friday, March 6, 2015

Lecture 2: morphological analysis + homework 1b

We introduced words and morphemes. Before delving into morphology and morphological analysis, we introduced regular expressions as a powerful tool to deal with different forms of a word. We also introduced finite state transducers for encoding the lexicon and orthographic rules. We assigned homework 1b for Wiktionary-based morphological analysis (which covers two of the three homeworks).
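
As a small example of using a regular expression to capture different forms of a word, the sketch below matches a few inflected forms of "walk". The pattern and the helper name `find_forms` are assumptions for illustration; regular inflection in English has many spelling rules that this does not handle.

```python
import re

# "walk" optionally followed by one of three suffixes, as a whole word.
WALK_FORMS = re.compile(r"\bwalk(?:s|ed|ing)?\b", re.IGNORECASE)


def find_forms(text):
    """Return every form of 'walk' found in text, in order of appearance."""
    return WALK_FORMS.findall(text)


print(find_forms("She walks; he walked. Walking is fun, a walkway is not."))
```

The word boundaries keep the pattern from matching inside longer words such as "walkway".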


Sunday, March 1, 2015

Lecture 1: introduction

We gave an introduction to the course and its field, Natural Language Processing, with a focus on the Turing Test as a tool to understand whether "machines can think". We also discussed the pitfalls of the test, including Searle's Chinese Room argument. We then provided examples of tasks in desperate need of accurate NLP: computer-assisted and machine translation, text summarization, personal assistance, text understanding, machine reading, question answering, information retrieval.


Friday, February 13, 2015

SIGN UP NOW!

IMPORTANT: The 2015 class hour schedule will be on Fridays 2.30pm-5.45pm. BUT: we will discuss tomorrow whether we can move it to 4-7pm.
Please sign up for the NLP class!





Friday, October 31, 2014

2015 class hour schedule

News: The 2015 class hour schedule will be on Fridays 2.30pm-5.45pm. BUT: we will discuss tomorrow whether we can move it to 4-7pm.


Friday, June 6, 2014

Lecture 12: Statistical Machine Translation

Introduction to Machine Translation. Rule-based vs. Statistical MT. Statistical MT: the noisy channel model. The language model and the translation model. The phrase-based translation model. Learning a model of training. Phrase-translation tables. Parallel corpora. Extracting phrases from word alignments. Word alignments.

IBM models for word alignment. Many-to-one and many-to-many alignments. IBM model 1 and the HMM alignment model. Training the alignment models: the Expectation Maximization (EM) algorithm. Symmetrizing alignments for phrase-based MT: symmetrizing by intersection; the growing heuristic. Calculating the phrase translation table. Decoding: stack decoding. Evaluation of MT systems. BLEU. Log-linear models for MT.
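
To illustrate how EM trains a word alignment model, here is a minimal sketch of IBM Model 1 on a three-sentence toy corpus. The corpus, the function name `ibm_model1` and the simplification of leaving out the NULL word are assumptions made for this example. `t[(f, e)]` estimates the probability of foreign word `f` given English word `e`.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t(f | e) with EM.

    pairs: list of (foreign_tokens, english_tokens).
    Simplification (assumption): no NULL word on the English side.
    """
    foreign_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e
        # E-step: distribute each foreign word over the English words.
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = ibm_model1(corpus)
for f, e in [("das", "the"), ("Haus", "the"), ("Buch", "book"), ("das", "book")]:
    print(f, e, round(t[(f, e)], 3))
```

Because "das" co-occurs with "the" in two sentence pairs, EM gradually shifts probability mass towards that pairing and away from the competing links.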

Monday, June 2, 2014

Lecture 11: Homework 1 correction + homework Q&A + Combinatory Categorial Grammar (CCG)

Homework 1 correction. Q&A on the other two homeworks. Combinatory Categorial Grammar (CCG).

 

Lecture 10: NLP research at LCL, Sapienza