Home Page and Blog of the Multilingual NLP course @ Sapienza University of Rome
Thursday, April 10, 2025
Lecture 10 (04/04/2025, 3h): Perplexity, Wikidata hands-on
In-vitro vs. in-vivo evaluation. Introduction to perplexity. Relationship with probability of a sentence and cross-entropy. Wikidata hands-on session.
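The link between perplexity, sentence probability and cross-entropy can be checked numerically. What follows is a minimal sketch, not taken from the lecture material: the per-token probabilities are hypothetical values that some language model might assign to a four-token sentence, and the function names are chosen only for illustration.

```python
import math

def sentence_probability(token_probs):
    """Product of the conditional token probabilities (chain rule)."""
    return math.prod(token_probs)

def cross_entropy(token_probs):
    """Average negative log-probability per token, in nats."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def perplexity(token_probs):
    """Perplexity as the exponential of the per-token cross-entropy."""
    return math.exp(cross_entropy(token_probs))

# Hypothetical probabilities P(w_i | w_1 .. w_{i-1}) for a 4-token sentence.
probs = [0.2, 0.5, 0.1, 0.4]

p_sentence = sentence_probability(probs)
ppl = perplexity(probs)

# Equivalent definition: inverse sentence probability, normalized by length.
ppl_from_prob = p_sentence ** (-1 / len(probs))
```

Both routes give the same number up to floating-point error, which is why perplexity can be read either as a length-normalized inverse probability or as exponentiated cross-entropy. A model that spreads its probability uniformly over 4 choices at every step has perplexity 4.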
Lecture 8 (28/03/2025, 3h): introduction to probabilistic language modeling
What is a language model? N-gram models (unigrams, bigrams, trigrams), together with their probability modeling and issues. Chain rule and n-gram estimation.
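As an illustration of n-gram estimation and the chain rule, here is a minimal sketch of an unsmoothed bigram model. The toy corpus, the boundary markers `<s>` and `</s>`, and the function names are assumptions made for this example, not part of the course material.

```python
from collections import Counter

# Hypothetical toy corpus; <s> and </s> mark sentence boundaries.
corpus = [
    ["<s>", "i", "like", "nlp", "</s>"],
    ["<s>", "i", "like", "pizza", "</s>"],
    ["<s>", "you", "like", "nlp", "</s>"],
]

history_counts = Counter()
bigram_counts = Counter()
for sentence in corpus:
    # Every token except the last one acts as the history of some bigram.
    history_counts.update(sentence[:-1])
    bigram_counts.update(zip(sentence, sentence[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    if history_counts[prev] == 0:
        return 0.0  # unseen history: one of the issues of unsmoothed n-grams
    return bigram_counts[(prev, word)] / history_counts[prev]

def sentence_prob(sentence):
    """Chain rule under the bigram (first-order Markov) approximation."""
    prob = 1.0
    for prev, word in zip(sentence, sentence[1:]):
        prob *= bigram_prob(prev, word)
    return prob
```

With this corpus, `sentence_prob(["<s>", "i", "like", "nlp", "</s>"])` is 2/3 * 1 * 2/3 * 1 = 4/9, while any sentence containing an unseen bigram such as ("i", "love") gets probability zero, which is the sparsity issue that smoothing is meant to address.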

Lecture 6 (21/03/2025, 3h): More on word2vec, negative sampling
More on word2vec. Negative sampling: the skip-gram case; changes in the loss function.
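The change in the loss function can be sketched for a single training pair. The code below assumes the standard skip-gram negative-sampling objective, loss = -log σ(u_o · v_c) - Σ_k log σ(-u_k · v_c), where v_c is the center-word vector, u_o the true context vector and u_k the sampled negative vectors. The 3-dimensional embeddings are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def negative_sampling_loss(center, context, negatives):
    """Skip-gram loss for one (center, context) pair with sampled negatives.

    loss = -log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c)
    """
    loss = -math.log(sigmoid(dot(context, center)))
    for neg in negatives:
        loss -= math.log(sigmoid(-dot(neg, center)))
    return loss

# Hypothetical embeddings: one center word, one true context, two negatives.
v_center = [0.5, -0.2, 0.1]
u_context = [0.4, -0.1, 0.3]
u_negatives = [[-0.3, 0.2, 0.0], [0.1, 0.4, -0.5]]

loss = negative_sampling_loss(v_center, u_context, u_negatives)
```

Compared with a full softmax over the vocabulary, each update here touches only the true context vector and the few sampled negatives, so the cost per pair no longer grows with vocabulary size.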
Lecture 4 (14/03/2025, 3h): first PyTorch hands-on with language detection
Recap of the Supervised Learning framework. Hands-on practice with PyTorch on the Language Detection Model: tensors, gradient tracking, the Dataset and DataLoader classes, the Module class, the backward step, the training loop, evaluating a model.
Lecture 3 (13/03/2025, 2h): Introduction to Supervised, Unsupervised & Reinforcement Learning
Introduction to Supervised, Unsupervised & Reinforcement Learning. The Supervised Learning framework. From real to computational: feature extraction and feature vectors. Feature Engineering and inferred features. PyTorch. Introduction to Colab notebooks and first part of the PyTorch hands-on.
Lecture 2 (07/03/2025, 3h): Logistic regression for NLP
Basics of Machine Learning for NLP. Probabilistic classification. Logistic Regression and its use for classification. Explicit vs. implicit features. The cross-entropy loss function.
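To make the connection between logistic regression and the cross-entropy loss concrete, here is a minimal sketch in plain Python. The dataset is hypothetical: each example has two explicit features (counts of positive and negative words) and a binary label. The learning rate and number of epochs are arbitrary choices for this toy case.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """P(y = 1 | x) under a logistic regression model."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def cross_entropy(weights, bias, data):
    """Mean binary cross-entropy over (features, label) pairs."""
    total = 0.0
    for features, label in data:
        p = predict(weights, bias, features)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)

def train(data, epochs=200, lr=0.5):
    """Batch gradient descent; the loss gradient w.r.t. the logit is (p - y)."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias

# Hypothetical explicit features: [positive-word count, negative-word count].
data = [([2, 0], 1), ([3, 1], 1), ([0, 2], 0), ([1, 3], 0)]

initial_loss = cross_entropy([0.0, 0.0], 0.0, data)
weights, bias = train(data)
final_loss = cross_entropy(weights, bias, data)
```

With all parameters at zero the model predicts 0.5 for every example, so the initial loss equals log 2; training lowers it by pushing the weight of the positive-word count up and that of the negative-word count down.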
Lecture 1 (06/03/2025, 2h): Introduction
Introduction to the course. Introduction to Natural Language Processing: understanding and generation. What is NLP? The Turing Test, criticisms and alternatives. Tasks in NLP and their importance (with examples). Key areas and publication venues.