Week 01

LLMs in Linguistic Research WiSe 2024/25

Akhilesh Kakolu Ramarao


09 Oct 2024

Who are we

WhoamI: Akhilesh

Photo of Akhilesh
Akhilesh Kakolu Ramarao (he/him)
PhD researcher at the English Language and Linguistics department

WhoamI: Anna

Picture of the TA Anna
Anna Sophia Stein (she/her)
MSc Linguistics student
Focus on Computational Linguistics
📧 anna.stein@hhu.de
  • Main interests: anything Natural Language Processing, open source, …
  • Part of research at the Anglistics, Linguistics and Computer Science department
  • Currently writing my MA thesis on making LLMs better at pragmatics
  • Part of the Slamlab: https://slam.phil.hhu.de/


What to expect

  • You will see code and math
  • Math will be annotated and explained in natural language
  • You will not be asked to do calculations or derive equations
  • You'll learn how to work with code snippets but not how to build something from scratch (unless you already know how to code)

Syllabus Overview

Date  | General Topic | Homework Assignments
09.10 | Admin, Architectures: Statistical and Probabilistic Language Models | Familiarization with Google Colab
16.10 | Architectures: Perceptrons and Neural Networks |
23.10 | Architectures: Recurrent Neural Networks | Assignment 1

Syllabus Overview

Date  | General Topic | Homework Assignments
30.10 | Transformer: General architecture |
06.11 | Transformer: The Attention mechanism |
13.11 | Transformer: Decoder-only/Encoder-only models | Assignment 2
20.11 | Using pre-trained models |

Syllabus Overview

Date  | General Topic | Homework Assignments
27.11 | Study week |
04.12 | Transfer learning: fine-tuning |
11.12 | Adapting models for specific tasks | Assignment 3
18.12 | Adapting models for specific tasks |
08.01 | Adapting models for specific tasks | Assignment 4
15.01 | Probing LLMs |
22.01 | Probing LLMs | Assignment 5
29.01 | TBD |

BN Requirements

  • Completion of the homework
  • Active participation in the class
  • Pass 4/5 assignments

Who are you

  • Who are you? What name do you prefer?
  • What are you studying?
  • Do you have any prior programming experience? If so, with what language?
  • Have you worked with LLMs before?
  • What are you hoping to get out of this course?

Basics of Language Models

  • A language model is a model trained to predict a probability distribution over words or sequences of words in a language
  • A probability distribution is a mathematical description of how likely each word or word sequence is
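
One common way to write down the probability of a whole word sequence is the chain rule of probability. The notation below (w_1 ... w_n for the words of a sequence) is introduced here for illustration and is not taken from the slides:

```latex
P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```

In natural language: the probability of a sequence is the probability of the first word, times the probability of the second word given the first, times the probability of the third word given the first two, and so on.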

Statistical Language Models

  • These models use statistical patterns in the data to make predictions about the likelihood of specific sequences of words
  • N-gram models are the most common type of statistical language model; they predict the probability of a word given the previous n-1 words
  • They are simple and computationally efficient but have limitations in capturing long-range dependencies
  • They are widely used in speech recognition, machine translation, and other NLP tasks
  • They are used as a baseline for more complex language models
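
To make the n-gram idea concrete, here is a minimal sketch of a bigram model (n = 2), which estimates the probability of a word from the single preceding word. The toy corpus and the function names `train_bigram` and `next_word_probs` are made up for illustration; real models are trained on far more text.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for this example
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts

def next_word_probs(counts, prev):
    """Relative frequencies: P(word | prev) = count(prev, word) / count(prev)."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

bigram_counts = train_bigram(corpus)
probs = next_word_probs(bigram_counts, "the")
print(probs)
```

In this corpus "the" is followed by "cat" twice out of six occurrences, so the model assigns "cat" a probability of 2/6 after "the". The model only ever looks one word back, which is why it cannot capture long-range dependencies.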

Probabilistic Language Models

  • These models assign a probability to sequences of words based on the training data
  • They are based on the principles of probability theory and use probabilistic methods to model the language
  • They can capture complex patterns in the data and are more flexible than statistical models
  • They are used in a wide range of NLP tasks, such as machine translation, text generation, and speech recognition
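
As a rough sketch of what "assigning a probability to a sequence of words" can look like in practice, the example below scores whole sentences by summing log probabilities of word pairs. The toy corpus, the name `log_prob`, and the use of add-one smoothing (so that unseen word pairs do not get probability zero) are assumptions made for this illustration, not something prescribed by the slides.

```python
import math
from collections import Counter

# Toy training data, invented for this example
corpus = ["the cat sat", "the dog sat", "the cat ran"]
sentences = [["<s>"] + s.split() + ["</s>"] for s in corpus]

# How often each word pair occurs, and how often each word occurs as a context
bigrams = Counter((a, b) for sent in sentences for a, b in zip(sent, sent[1:]))
contexts = Counter(a for sent in sentences for a in sent[:-1])
vocab = {token for sent in sentences for token in sent}

def log_prob(sentence):
    """Sum of log P(word | previous word), with add-one smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
        total += math.log(p)
    return total

print(log_prob("the cat sat"))  # word order seen in training
print(log_prob("sat cat the"))  # scrambled word order
```

Log probabilities are always negative here; the closer to zero, the more likely the model considers the sentence. The sentence with the familiar word order receives the higher score.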

Neural Language Models

  • These models use neural networks to predict the likelihood of a sequence of words
  • They are trained on a large corpus of text data and can learn complex patterns and dependencies
  • They are more powerful than traditional statistical and probabilistic models
  • They are used in a wide range of NLP tasks, such as machine translation, text generation, and sentiment analysis
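
A neural language model typically produces one raw score per vocabulary word and then turns those scores into probabilities with the softmax function. The sketch below shows only that final step; the three-word vocabulary and the scores are hypothetical stand-ins for what a trained network might output, and no actual network is involved.

```python
import math

def softmax(scores):
    """Convert arbitrary real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates for the next word after "the cat sat on the"
vocab = ["mat", "dog", "moon"]
scores = [2.0, 1.0, -1.0]  # made-up network outputs, one per word

probs = softmax(scores)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Higher scores become higher probabilities, and the probabilities always add up to 1, so the output can be read as a distribution over the next word.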

Large Language Models

  • They are advanced language models with billions of parameters that are trained on very large amounts of text and generate text output
  • They are trained on large-scale datasets and can generate human-like text
  • They are used in a wide range of NLP tasks, such as machine translation, text generation, and question answering
  • They are the focus of the course

Thank you!