Wednesday, April 24, 2024

Lecture 14 (19/04/2024, 4h, TAs): homework 1b

Introduction to homework 1b. In-class lab for the homework.

Lecture 13 (18/04/2024, 2h, S): Pre-trained language models

Pre-trained language models: BERT, GPT, RoBERTa, XLM.

Lecture 12 (12/04/2024, 3.5h): the Transformer

Introduction to the Transformer architecture. Encoder, decoder. Positional embeddings. Self-attention. Cross-attention. Decoding. Introduction to homework 1b.
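
As a rough illustration of the self-attention step listed above, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the lecture's notebook code. For brevity it assumes that queries, keys and values are the input vectors themselves, whereas an actual Transformer layer first applies learned projection matrices. The function names `softmax` and `attention` are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V, one query at a time."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy sequence of three 2-dimensional token vectors (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: queries, keys and values all come from the same sequence.
outputs, weights = attention(x, x, x)
```

Cross-attention uses the same function with queries taken from the decoder states and keys and values taken from the encoder outputs.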

Lecture 11 (11/04/2024, 2h): the attention mechanism

Neural language modeling. Context2vec. Neural language models with BiLSTMs. Contextualized word representations. Introduction to attention.

Tuesday, April 9, 2024

Lecture 10 (05/04/2024, 4h): notebooks on a real-world review classification example, training in NLP, hyperparameters, LSTMs

Notebooks on a real-world review classification example. Brief introduction to Part-of-Speech tagging. LSTMs recap. Notebook on Part-of-Speech tagging with LSTMs. Best practices for data preprocessing and the training procedure.


Lecture 9 (04/04/2024, 2h): RNNs and Long Short-Term Memory Networks

Recurrent Neural Networks. Issues. Long Short-Term Memory Networks.

Friday, March 22, 2024

Lecture 8 (22/03/2024, 4h): introduction to language modeling

Negative sampling in the word2vec notebook. What is a language model? N-gram models (unigrams, bigrams, trigrams), together with their probability modeling and issues. Chain rule and n-gram estimation. Static vs. contextualized embeddings. Introduction to Recurrent Neural Networks.
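
The n-gram estimation and chain rule mentioned above can be sketched with a tiny bigram model. This is a minimal illustration on a made-up two-sentence corpus, not course material. The `<s>` and `</s>` boundary markers and all function names are assumptions chosen for the example.

```python
from collections import Counter

def train_bigram(sentences):
    """Count context tokens and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])  # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Maximum-likelihood estimate: P(word | prev) = count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(tokens, unigrams, bigrams):
    """Chain rule under the bigram assumption: product of P(w_i | w_{i-1})."""
    padded = ["<s>"] + tokens + ["</s>"]
    prob = 1.0
    for prev, word in zip(padded, padded[1:]):
        prob *= bigram_prob(prev, word, unigrams, bigrams)
    return prob

corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"]]
unigrams, bigrams = train_bigram(corpus)
p_seen = sentence_prob(["the", "cat", "sleeps"], unigrams, bigrams)    # 0.5
p_unseen = sentence_prob(["the", "cat", "barks"], unigrams, bigrams)   # 0.0
```

The second sentence gets probability zero because the bigram ("cat", "barks") never occurs in the corpus, which is the sparsity issue that smoothing and, later, neural language models address.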

Lecture 7 (21/03/2024, 2h): Word2vec notebook

PyTorch notebook on word2vec. More on homework 1.

Lecture 6 (15/03/2024, 4h): Negative sampling, homework 1 assignment

Negative sampling: the skipgram case; changes in the loss function. Homework 1 assignment.
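
To make the change in the loss function concrete, here is a minimal sketch of the skip-gram negative-sampling loss for a single (center, context) pair, assuming the standard formulation: maximize log σ(u_context · v_center) plus the sum of log σ(−u_neg · v_center) over the sampled negative words. The vectors below are made-up numbers and the function names are illustrative.

```python
import math

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgns_loss(center, context, negatives):
    """Negative-sampling loss for one training pair and its sampled negatives."""
    # Pull the true context vector towards the center vector.
    positive_term = log_sigmoid(dot(context, center))
    # Push each sampled negative vector away from the center vector.
    negative_term = sum(log_sigmoid(-dot(neg, center)) for neg in negatives)
    return -(positive_term + negative_term)

center = [0.5, -0.2]                    # input embedding of the center word
context = [0.4, 0.1]                    # output embedding of the observed context word
negatives = [[-0.3, 0.8], [0.9, 0.9]]   # output embeddings of two sampled words
loss = sgns_loss(center, context, negatives)
```

Compared with the full softmax, each update touches only the context word and the sampled negatives instead of the whole vocabulary.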

Word2Vec Tutorial Part 2 - Negative Sampling · Chris McCormick

Thursday, March 14, 2024

Lecture 5 (14/03/2024, 2h): Word embeddings, word2vec

Word representations. Word embeddings. Word2vec (CBOW and skipgram), PyTorch notebook on word2vec.

Lecture 4 (08/03/2024, 4h): first hands-on with PyTorch with language detection

Recap of the Supervised Learning framework. Hands-on practice with PyTorch on the Language Detection Model: tensors, gradient tracking, the Dataset and DataLoader classes, the Module class, the backward step, the training loop, evaluating a model.

Lecture 3 (07/03/2024, 2h): Supervised vs. unsupervised vs. reinforcement learning. PyTorch

Introduction to Supervised, Unsupervised & Reinforcement Learning. The Supervised Learning framework. From real to computational: feature extraction and feature vectors. Feature engineering and inferred features. PyTorch. Introduction to Colab notebooks and first part of the PyTorch hands-on.

Thursday, March 7, 2024

Lecture 2 (01/03/2024, 4h): Machine Learning for NLP and Logistic Regression

Basics of Machine Learning for NLP. Probabilistic classification. Logistic Regression and its use for classification. Explicit vs. implicit features. The cross-entropy loss function.
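
As a small illustration of logistic regression with the cross-entropy loss, the sketch below scores one feature vector, computes the binary cross-entropy, and takes a single gradient step. It is plain Python with made-up numbers, not the course notebook. The function names and the learning rate are assumptions.

```python
import math

def sigmoid(z):
    # Fine for this toy example; very negative z would overflow math.exp.
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """P(y = 1 | features) under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def binary_cross_entropy(p, y):
    """Cross-entropy between the predicted probability p and the gold label y in {0, 1}."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sgd_step(weights, bias, features, y, lr=0.1):
    """One gradient step; the gradient of the loss w.r.t. the logit is (p - y)."""
    error = predict_proba(weights, bias, features) - y
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias

weights, bias = [0.0, 0.0], 0.0
features, label = [1.0, 2.0], 1   # toy explicit features and gold label
loss_before = binary_cross_entropy(predict_proba(weights, bias, features), label)
weights, bias = sgd_step(weights, bias, features, label)
loss_after = binary_cross_entropy(predict_proba(weights, bias, features), label)
```

With all-zero parameters the model predicts 0.5, so the initial loss is log 2. After the update the loss on this example is lower.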

Lecture 1 (29/02/2024, 2h): Introduction

Introduction to the course. Introduction to Natural Language Processing: understanding and generation. What is NLP? The Turing Test, criticisms and alternatives. Tasks in NLP and their importance (with examples). Key areas and publication venues.

Tuesday, February 27, 2024

Ready! We are starting this Thursday in A2 at 12! Meanwhile, please register here:

  • Thursday (12.00-14.00), room A2, DIAG, via Ariosto 25
  • Friday (8.30-12.00), room A2, DIAG, via Ariosto 25

Tuesday, February 6, 2024

Welcome to Multilingual Natural Language Processing!!!


Welcome to the Sapienza Multilingual NLP course blog 2024! The course is held at DIAG! Cool things about to happen:

  1. The course will contain lots of up-to-date content on deep learning, neural networks, Large Language Models, and an improved hands-on with PyTorch!
  2. For attending students, there will be only TWO homeworks (and no additional duty), one of which is due by the end of September and will replace the project. Non-attending students, instead, will have to work on three homeworks.
  3. There will be cool challenges throughout the whole course, including the possibility of writing and publishing papers. You will be updated on the most relevant events in the area, including the Italian/Multimodal LLM national endeavor headed by Prof. Navigli.
  4. We will include the most recent additions (including from 2024) from the world of NLP!
Class hours are: TBD, DIAG, via Ariosto 25

IMPORTANT: The current lecture model is in-person attendance. See the updated Syllabus.

IMPORTANT (bis): Note that the course has been renamed to Multilingual Natural Language Processing (if you have NLP in your plan and want to attend my course, please contact me at [surname]