Word representations. Word embeddings. Word2vec (CBOW and skip-gram). PyTorch notebook on word2vec.
Multilingual Natural Language Processing @ Sapienza
Home Page and Blog of the Multilingual NLP course @ Sapienza University of Rome
Thursday, March 14, 2024
Lecture 4 (08/03/2024, 3 hours): first hands-on with PyTorch on language detection
Recap of the Supervised Learning framework. Hands-on practice with PyTorch on the Language Detection Model: tensors, gradient tracking, the Dataset and DataLoader classes, the Module class, the backward step, the training loop, evaluating a model.
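As a rough illustration of how the pieces listed above fit together, here is a minimal sketch of a PyTorch training and evaluation loop. It is not the notebook used in class: the toy dataset, the two count features, and the names ToyLanguageDataset and LanguageDetector are assumptions made up for this example.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class ToyLanguageDataset(Dataset):
    """Hypothetical toy data: two count features per sentence; label 1 = English, 0 = Italian."""

    def __init__(self):
        self.features = torch.tensor([[2.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 1.0]])
        self.labels = torch.tensor([1.0, 1.0, 0.0, 0.0])

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


class LanguageDetector(nn.Module):
    """A single linear layer that outputs one raw logit per example."""

    def __init__(self, n_features):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)

    def forward(self, x):
        return self.linear(x).squeeze(-1)


torch.manual_seed(0)
dataset = ToyLanguageDataset()
loader = DataLoader(dataset, batch_size=2, shuffle=True)
model = LanguageDetector(n_features=2)
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy on logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# Training loop: forward pass, loss, backward step, parameter update.
model.train()
for epoch in range(200):
    for x, y in loader:
        optimizer.zero_grad()        # clear gradients from the previous step
        loss = loss_fn(model(x), y)  # forward pass and loss
        loss.backward()              # backward step: compute gradients
        optimizer.step()             # update the parameters

# Evaluation: no gradient tracking needed.
model.eval()
with torch.no_grad():
    predictions = (torch.sigmoid(model(dataset.features)) > 0.5).float()
    accuracy = (predictions == dataset.labels).float().mean().item()
print(f"training-set accuracy: {accuracy:.2f}")
```

In a real setting the model would be evaluated on held-out data rather than on the training examples; the toy set is reused here only to keep the sketch short.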
Lecture 3 (07/03/2024, 2 hours): Supervised vs. unsupervised vs. reinforcement learning. PyTorch
Thursday, March 7, 2024
Lecture 2 (01/03/2024, 3 hours): Machine Learning for NLP and Logistic Regression
Basics of Machine Learning for NLP. Probabilistic classification. Logistic Regression and its use for classification. Explicit vs. implicit features. The cross-entropy loss function.
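To make the link between logistic regression and the cross-entropy loss concrete, here is a minimal pure-Python sketch. The two explicit features and the toy examples are hypothetical, chosen only for illustration; the update rule uses the fact that the gradient of the cross-entropy loss with respect to the pre-sigmoid score is (p - y).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """P(y = 1 | features) under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy between predicted probability p and gold label y."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def train(examples, n_features, lr=0.5, epochs=200):
    """Stochastic gradient descent on the cross-entropy loss."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, y in examples:
            p = predict_proba(weights, bias, features)
            error = p - y  # d(loss)/d(z) for sigmoid + cross-entropy
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical explicit features: [count of "the", count of "il"]; label 1 = English.
examples = [([2.0, 0.0], 1), ([3.0, 0.0], 1), ([0.0, 2.0], 0), ([0.0, 1.0], 0)]
weights, bias = train(examples, n_features=2)
mean_loss = sum(
    cross_entropy(predict_proba(weights, bias, f), y) for f, y in examples
) / len(examples)
print(f"mean cross-entropy after training: {mean_loss:.4f}")
```

Before training, every prediction is 0.5 and the loss per example is log 2; training should push the mean loss well below that value.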
Lecture 1 (29/02/2024, 2 hours): Introduction
Introduction to the course. Introduction to Natural Language Processing: understanding and generation. What is NLP? The Turing Test, criticisms and alternatives. Tasks in NLP and their importance (with examples). Key areas and publication venues.
Tuesday, February 27, 2024
Tuesday, February 6, 2024
Welcome to Multilingual Natural Language Processing!!!
Welcome to the Sapienza Multilingual NLP course blog 2024! The course is held at DIAG! Cool things about to happen:
- The course will contain lots of up-to-date content on deep learning, neural networks, Large Language Models, and improved hands-on sessions with PyTorch!
- For attending students, there will be only TWO homeworks (and no additional duty), one of which is due by the end of September and replaces the project. Non-attending students, instead, will have to work on three homeworks.
- There will be cool challenges throughout the whole course, including the possibility of writing and publishing papers. You will be updated on the most relevant events in the area, including the Italian/Multimodal LLM national endeavor headed by Prof. Navigli.
- We will include the most recent additions (including from 2024) from the world of NLP!
IMPORTANT: The current lecture model is in-person attendance. See the updated Syllabus.
IMPORTANT (bis): Note that the course has been renamed to Multilingual Natural Language Processing (if you have NLP in your plan and want to attend my course, please contact me at [surname]@diag.uniroma1.it).
Monday, June 5, 2023
Lecture 22 (29/05/2023, 2.5 hours): text summarization, open issues in NLP, topics for thesis and more, closing
Introduction to text summarization and evaluation metrics (BLEU, ROUGE, BERTScore, alternatives). Open issues in NLP: superhuman performance in current benchmarks, stochastic parrots, evaluation of text quality. Thesis topics and more. Closing.
Friday, May 26, 2023
Lecture 21 (26/05/2023, 4.5 hours): seq2seq, Machine Translation
Foundations of sequence-to-sequence models and their use within Huggingface.
Introduction to machine translation (MT) and history of MT. Overview of statistical MT. Beam search for decoding. Introduction to neural machine translation (NMT): the encoder-decoder neural architecture and attention in NMT. The BLEU evaluation score. Performance and recent improvements.
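As an illustration of beam search for decoding, here is a minimal pure-Python sketch. The NEXT_TOKEN_PROBS table is a hypothetical stand-in for a translation model: it conditions only on the previous token, whereas a real decoder would condition on the source sentence and the full prefix.

```python
import math

# Hypothetical toy "model": next-token probabilities given the previous token.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, "</s>": 0.1},
    "a": {"cat": 0.4, "dog": 0.5, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def beam_search(beam_size=2, max_len=5):
    """Return finished hypotheses as (log-probability, tokens), best first."""
    beams = [(0.0, ["<s>"])]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, tokens in beams:
            for token, prob in NEXT_TOKEN_PROBS[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_size]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished


best_score, best_tokens = beam_search()[0]
print(" ".join(best_tokens), f"(probability {math.exp(best_score):.2f})")
```

With beam_size=1 this reduces to greedy decoding; larger beams trade more computation for a wider search over candidate outputs.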
Monday, May 22, 2023
Lecture 20 (22/05/2023, 2.5 hours): More on semantic role labeling; Semantic Parsing
More on Semantic Role Labeling. Semantic Parsing: task, motivation and applications, Abstract Meaning Representation (AMR) and BabelNet Meaning Representation (BMR), Natural Language Generation from semantic parses