Wednesday, June 2, 2021

Lecture 25 (27/05/2021, 3 hours): encoder-decoder architecture; multilingual Semantic Parsing; Neural Machine Translation

Introduction to multilingual Semantic Parsing. Abstract Meaning Representation. Introduction to machine translation (MT) and history of MT. Overview of statistical MT. Beam search for decoding. Introduction to neural machine translation (NMT): the encoder-decoder neural architecture; BART; advantages; results. The BLEU evaluation score. Performance and recent improvements. Attention in NMT. Additional uses of the encoder-decoder architecture: Generationary.
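
As a quick illustration of the decoding step, here is a minimal beam-search sketch in plain Python (toy code, not the class implementation); `step_log_probs` is a hypothetical stand-in for an NMT decoder that returns log-probabilities over the vocabulary given a prefix:

```python
import math

def beam_search(step_log_probs, vocab, beam_size=3, max_len=10, eos="</s>"):
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:        # finished hypotheses survive as-is
                candidates.append((seq, score))
                continue
            for tok, lp in step_log_probs(seq, vocab).items():
                candidates.append((seq + [tok], score + lp))
        # keep only the beam_size best partial hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0]

# toy uniform "decoder"; a real one would come from the trained model
toy = lambda seq, vocab: {t: math.log(1.0 / len(vocab)) for t in vocab}
print(beam_search(toy, ["a", "b", "</s>"], beam_size=2, max_len=4))
```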


Closing of the course!

Monday, May 24, 2021

Lecture 24 (24/05/2021, 3 hours): more on WSD; Homework 3: WSD of Word-in-Context data; Semantic Role Labeling

More on Word Sense Disambiguation. Homework 3: WSD of Word-in-Context datasets. From word to sentence representations. Semantic roles. Resources: PropBank, VerbNet, FrameNet, VerbAtlas. Semantic Role Labeling (SRL). State-of-the-art neural approaches.
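
To make the task concrete, here is what a PropBank-style SRL annotation looks like when encoded as BIO tags over tokens (a made-up example, not taken from any resource release):

```python
tokens = ["The", "cat", "ate", "the", "fish", "yesterday"]
tags   = ["B-ARG0", "I-ARG0", "B-V", "B-ARG1", "I-ARG1", "B-ARGM-TMP"]
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")  # ARG0 = eater, ARG1 = thing eaten, ARGM-TMP = when
```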


Thursday, May 20, 2021

Lecture 23 (20/05/2021, 2.15 hours): Supervised and Knowledge-Based Word Sense Disambiguation

Introduction to Word Sense Disambiguation. Elements necessary for performing WSD. Supervised vs. unsupervised vs. knowledge-based WSD. Supervised WSD techniques. Neural WSD: LSTM and Transformer-based approaches. Integration of knowledge and neural WSD.
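
For a taste of knowledge-based WSD, here is a minimal simplified-Lesk sketch using NLTK's WordNet interface (toy code, not one of the systems discussed in class; assumes `nltk.download("wordnet")` has been run):

```python
from nltk.corpus import wordnet as wn

def simplified_lesk(word, context_words):
    """Pick the sense whose gloss overlaps most with the context."""
    context = set(w.lower() for w in context_words)
    best, best_overlap = None, -1
    for sense in wn.synsets(word):
        gloss = set(sense.definition().lower().split())
        overlap = len(gloss & context)
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

sense = simplified_lesk("bank", "I deposited money at the bank".split())
print(sense, "->", sense.definition())
```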

Lecture 22 (17/05/2021, 2.15 hours): BERT, GLUE and SuperGLUE benchmarks, Word Sense Disambiguation

Byte Pair Encodings (BPEs). BERT. RoBERTa. XLM-R. Evaluation: GLUE and SuperGLUE benchmarks. 
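As a small illustration of how BPE merges are learned (after Sennrich et al., 2016), here is a toy sketch; the corpus, frequencies, and number of merges are made up, and the naive string replace is only safe in this toy setting:

```python
from collections import Counter

def get_pair_counts(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    old, new = " ".join(pair), "".join(pair)   # e.g. "w e" -> "we"
    return {word.replace(old, new): freq for word, freq in vocab.items()}

# words as space-separated symbol sequences with an end-of-word marker
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6}
for _ in range(5):
    best = get_pair_counts(vocab).most_common(1)[0][0]
    vocab = merge_pair(best, vocab)
    print("merged:", best)
print(vocab)
```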

Introduction to Natural Language Understanding (NLU): Word Sense Disambiguation, Semantic Role Labeling, Semantic Parsing. Lexical substitution.



Friday, May 14, 2021

Lecture 21 (13/05/2021, 2.5 hours): attention and Transformers

Introduction to the attention in deep learning: motivation, attention scores, approaches. The Transformer architecture. Introduction to BERT.
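
The core of the Transformer is scaled dot-product attention; here is a minimal PyTorch sketch, with made-up tensor sizes:

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # similarity of queries and keys
    weights = F.softmax(scores, dim=-1)            # attention distribution
    return weights @ v                             # weighted sum of the values

q = torch.randn(2, 5, 64)  # (batch, query positions, d_k)
k = torch.randn(2, 7, 64)  # (batch, key positions, d_k)
v = torch.randn(2, 7, 64)
print(attention(q, k, v).shape)  # torch.Size([2, 5, 64])
```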



Monday, May 10, 2021

Lecture 20 (10/05/2021, 3 hours): bilingual embeddings, contextualized word embeddings, ELMo + homework 2

More on semantic vector representations. Bilingual and multilingual embeddings. Contextualized word embeddings. ELMo. Presentation of homework 2: aspect-based sentiment analysis.
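
One standard way to obtain bilingual embeddings is to align two monolingual spaces with an orthogonal mapping learned from a seed dictionary (orthogonal Procrustes). A NumPy sketch, with random matrices standing in for the real embeddings of seed-dictionary pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 300))   # source-language vectors of seed pairs
Y = rng.normal(size=(1000, 300))   # target-language vectors of the same pairs

U, _, Vt = np.linalg.svd(X.T @ Y)  # solve min_W ||XW - Y||_F with W orthogonal
W = U @ Vt
aligned = X @ W                    # source vectors mapped into the target space
print(aligned.shape)
```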


Friday, April 30, 2021

Lecture 17 (29/04/2021, 2 hours): semantic vector representations: SensEmbed and NASARI; linkage to BabelNet

Semantic vector representations: importance of their multilinguality; linkage to BabelNet; latent vs. explicit representations; monolingual vs. multilingual representations. The NASARI lexical, unified and embedded representations.

Lecture 16 (26/04/2021, 3 hours): intro to NER; semantic vector representations; Q&A on homework

Introduction to the Named Entity Recognition task. Semantic vector representations. Q&A on homework 1.
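
A quick way to see NER in action is an off-the-shelf tagger such as spaCy (not one of the systems presented in class; assumes `python -m spacy download en_core_web_sm` has been run):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple was founded by Steve Jobs in Cupertino.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. "Apple" ORG, "Steve Jobs" PERSON
```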


Tuesday, April 20, 2021

Lecture 14 (19/04/2021, 3 hours): lexical semantics (2/2) and lexical-semantic knowledge resources

Human vs. computer dictionaries. Introduction to WordNet. The notion of synset. Lexical and semantic relations. Multilingual lexical-semantic knowledge graphs. BabelNet: motivation, creation, organization and evolution. What comes first? Language or concept? Two different views.
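
Synsets and semantic relations are easy to explore through NLTK's WordNet interface (assumes `nltk.download("wordnet")` has been run):

```python
from nltk.corpus import wordnet as wn

for synset in wn.synsets("bank")[:3]:
    print(synset.name(), "-", synset.definition())

car = wn.synsets("car")[0]
print("lemmas:", [l.name() for l in car.lemmas()])  # synonyms in the synset
print("hypernyms:", car.hypernyms())                # "is-a" semantic relation
```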



Saturday, April 17, 2021

Lecture 13 (15/04/2021, 15-17, 2 hours): introduction to lexical semantics

Introduction to lexical semantics. Lexicon: lemmas and word forms. Word senses: monosemy vs. polysemy. Special kinds of polysemy. Computational sense representations: enumeration vs. generation. Graded word sense assignment.


Monday, April 12, 2021

Lecture 12 (12/04/2021, 14-17, 3 hours): jump into the future + RNN notebook

Talk on cross-lingual Semantic Role Labeling and Semantic Parsing (Cleopatra workshop at The Web Conference) + RNNs in PyTorch (notebook).


Wednesday, March 31, 2021

Lecture 10 (29/03/2021, 3.5 hours): more on word2vec, GloVe, RNNs, LSTMs and PyTorch Lightning

More on word2vec and word embeddings: hierarchical softmax; negative sampling. GloVe. Recurrent Neural Networks. Gated architectures: Long Short-Term Memory networks (LSTMs). Bidirectional LSTMs and stacked LSTMs. Character embeddings. Introduction to PyTorch Lightning.
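
A stacked bidirectional LSTM takes one line in PyTorch; the sizes here are made up:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=100, hidden_size=128,
               num_layers=2, bidirectional=True, batch_first=True)
x = torch.randn(8, 20, 100)   # (batch, sequence length, embedding dim)
output, (h_n, c_n) = lstm(x)
print(output.shape)           # (8, 20, 256): forward + backward hidden states
```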

Lecture 9 (25/03/2021, 2 hours): part-of-speech tagging

Part-of-speech tagging. Hidden Markov Models. Deleted interpolation. Linear and logistic regression: Maximum Entropy models; logit and logistic function; relationship to sigmoid and softmax. Transformation-based POS tagging. Handling out-of-vocabulary words.
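
To make HMM-based tagging concrete, here is a compact Viterbi sketch over toy, hand-set transition and emission probabilities (all numbers invented):

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    # probability and best path for each tag at the first word
    V = [{t: (start_p[t] * emit_p[t].get(words[0], 1e-6), [t]) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][t] * emit_p[t].get(word, 1e-6),
                 V[-1][prev][1] + [t])
                for prev in tags
            )
            column[t] = (prob, path)
        V.append(column)
    return max(V[-1].values())   # (probability, best tag sequence)

tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {"DET": {"DET": 0.1, "NOUN": 0.8, "VERB": 0.1},
           "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
           "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1}}
emit_p = {"DET": {"the": 0.9}, "NOUN": {"dog": 0.5, "barks": 0.1},
          "VERB": {"barks": 0.6}}
print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
```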


Tuesday, March 16, 2021

Lecture 7 (15/03/2021, 3 hours): probabilistic language modeling

Word2vec in PyTorch. Introduction to n-gram models (unigrams, bigrams, trigrams), their probability modeling, and the issues they raise.

Chain rule and n-gram estimation. Perplexity and its close relationship with entropy. Smoothing and interpolation.
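
A toy bigram model with add-one (Laplace) smoothing, plus the corresponding perplexity computation (corpus and test sentence are made up):

```python
import math
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size

def p(w, prev):
    # add-one smoothing: P(w | prev) = (c(prev, w) + 1) / (c(prev) + V)
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

test = "the cat sat".split()
log_prob = sum(math.log2(p(w, prev)) for prev, w in zip(test, test[1:]))
perplexity = 2 ** (-log_prob / (len(test) - 1))
print(f"perplexity: {perplexity:.2f}")
```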

Thursday, March 11, 2021

Lecture 6 (11/03/2021, 2 hours): word embeddings and Word2vec

One-hot encodings and word embeddings. Introduction to word2vec. Differences between CBOW and skip-gram. The loss function in word2vec. Implementation with PyTorch.
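
A bare-bones skip-gram forward pass in PyTorch: predict a context word from a centre word with cross-entropy over the vocabulary. Sizes are made up, and a real implementation would use negative sampling or hierarchical softmax rather than a full softmax:

```python
import torch
import torch.nn as nn

vocab_size, dim = 1000, 100
in_embed = nn.Embedding(vocab_size, dim)   # centre-word vectors
out_proj = nn.Linear(dim, vocab_size)      # scores for each candidate context word
loss_fn = nn.CrossEntropyLoss()

centre = torch.tensor([5, 42])             # centre-word ids (batch of 2)
context = torch.tensor([17, 99])           # observed context-word ids
logits = out_proj(in_embed(centre))
loss = loss_fn(logits, context)
print(loss.item())
```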



Tuesday, March 9, 2021

Lecture 5 (08/03/2021, 3 hours): classifying Amazon reviews with a feedforward network; loss functions

Regression vs. classification. Hands-on: classifying Amazon reviews with a feedforward network. Vector representations of text. Loss functions: Mean Squared Error (MSE), Binary Cross Entropy (BCE), Categorical Cross Entropy (CCE). Sigmoid and softmax.
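
To see how the three losses differ in the inputs and targets they expect, here they are side by side in PyTorch (tensor values are arbitrary):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()(torch.tensor([2.5]), torch.tensor([3.0]))           # regression
bce = nn.BCELoss()(torch.sigmoid(torch.tensor([0.7])),                 # binary:
                   torch.tensor([1.0]))                                # needs probabilities
cce = nn.CrossEntropyLoss()(torch.tensor([[2.0, 0.5, 0.1]]),           # multi-class:
                            torch.tensor([0]))                         # softmax applied inside
print(mse.item(), bce.item(), cce.item())
```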

Saturday, March 6, 2021

Lecture 4 (04/03/2021, 2 hours): deep learning basics hands-on in PyTorch

Introduction to the Perceptron. Activation functions. Loss functions: MSE and CCE. Colab notebook. Language classification with the perceptron.
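
A from-scratch perceptron with the classic update rule, trained on a toy linearly separable problem (logical AND; the data is invented):

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])                 # logical AND
w, b, lr = np.zeros(2), 0.0, 0.1

for _ in range(20):                        # a few epochs suffice here
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0  # step activation
        w += lr * (yi - pred) * xi         # perceptron update rule
        b += lr * (yi - pred)

print([(1 if xi @ w + b > 0 else 0) for xi in X])  # expect [0, 0, 0, 1]
```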



Lecture 3 (01/03/2021, 3 hours): Machine Learning basics and PyTorch

Introduction to Machine Learning for Natural Language Processing: supervised vs. unsupervised vs. reinforcement learning. Features, feature vector representations.
Introduction to PyTorch. Introduction to deep learning for NLP. The perceptron. Colab notebook with PyTorch basics.
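
A few PyTorch basics of the kind covered in the notebook: tensor creation, elementwise operations, and autograd:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # a scalar function of x
y.backward()         # autograd computes dy/dx
print(x.grad)        # tensor([2., 4., 6.])
```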


Lecture 2 (25/02/2021, 2 hours): more on NLP

More on Natural Language Processing and its applications.

Lecture 1 (22/02/2021, 3 hours): introduction to NLP

Introduction to the course and to the field it focuses on: Natural Language Processing and its challenges.

Sunday, February 14, 2021

Ready, steady, go!

Welcome to the Sapienza NLP course blog! New this year:

  1. The course will contain lots of up-to-date content on deep learning and neural networks, and improved hands-on sessions with PyTorch and PyTorch Lightning!
  2. For attending students, there will be only TWO homeworks (and no additional duty), one of which will be delivered by the end of September and will replace the project. Non-attending students, instead, will have to work on three homeworks.
  3. There will be cool challenges throughout the whole course, including the possibility of writing and publishing papers.

IMPORTANT: The current lecture model is blended: 50% of the students can attend in person, while the others attend via online streaming. Please get access to the Facebook group. For students attending in person, the 2021 class schedule is Monday 14-17 and Thursday 14-16, Aula 1 - Aule L ingegneria, via del Castro Laurenziano. Students attending online will still be considered attending students.

Please sign up to the NLP class!