Introduction to Word Sense Disambiguation (WSD). Motivation. The typical WSD framework. Lexical sample vs. all-words. WSD viewed as lexical substitution and cross-lingual lexical substitution. Knowledge resources. Representation of context: flat and structured representations. Main approaches to WSD: Supervised, unsupervised and knowledge-based WSD. Two important dimensions: supervision and knowledge. Supervised Word Sense Disambiguation: pros and cons. Vector representation of context. Main supervised disambiguation paradigms: decision trees, neural networks, instance-based learning, Support Vector Machines, IMS with embeddings, neural approaches to WSD. Unsupervised Word Sense Disambiguation: Word Sense Induction. Context-based clustering. Co-occurrence graphs: curvature clustering, HyperLex. Knowledge-based Word Sense Disambiguation. The Lesk and Extended Lesk algorithm. Structural approaches: similarity measures and graph algorithms. Conceptual density. Structural Semantic Interconnections. Evaluation: precision, recall, F1, accuracy. Baselines. Entity Linking.