Week 01 Introduction
LLMs in Linguistic Research WiSe 2024/25
Akhilesh Kakolu Ramarao
HHU
09 Oct 2024
What we will cover today:
Who am I: Akhilesh
Akhilesh Kakolu Ramarao (he/him) PhD researcher at the English Language and Linguistics department
Who am I: Anna
Main interests: anything Natural Language Processing, open source, …
Part of research across the Anglistics, Linguistics, and Computer Science departments
Currently writing my MA thesis on making LLMs better at pragmatics
Part of the Slamlab: https://slam.phil.hhu.de/
What to expect
You will see code and math
Math will be annotated and explained in natural language
You will not be asked to do calculations or derive equations
You'll learn how to work with code snippets but not how to build something from scratch (unless you already know how to code)
Syllabus Overview
09.10
Admin; Architectures: Statistical and Probabilistic Language Models
Familiarization with Google Colab
16.10
Architectures: Perceptrons and Neural Networks
23.10
Architectures: Recurrent Neural Networks
Assignment 1
Syllabus Overview
30.10
Transformer: General architecture
06.11
Transformer: The Attention mechanism
13.11
Transformer: Decoder-only and Encoder-only models
Assignment 2
20.11
Using pre-trained models
Syllabus Overview
27.11
Study week
04.12
Transfer learning: fine-tuning
11.12
Adapting models for specific tasks
Assignment 3
18.12
Adapting models for specific tasks
08.01
Adapting models for specific tasks
Assignment 4
15.01
Probing LLMs
22.01
Probing LLMs
Assignment 5
29.01
TBD
BN (Beteiligungsnachweis) Requirements
Completion of the homework
Active participation in the class
Pass 4 out of 5 assignments
Who are you
Who are you? What name do you prefer?
What are you studying?
Do you have any prior programming experience? If so, with what language?
Have you worked with LLMs before?
What are you hoping to get out of this course?
Basics of Language Models
A model trained to predict the probability distribution over words or sequences of words in a language
A probability distribution is a mathematical description of how likely each possible word sequence is
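As a small anchor for this definition (the notation is mine, not from the slides), the chain rule of probability writes the probability of a whole sequence as a product of per-word conditional probabilities:

P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})

In words: the probability of a sequence is the product of each word's probability given all the words before it. The models in this course differ mainly in how they estimate these conditional probabilities.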
Statistical Language Models
These models use statistical patterns in the data to make predictions about the likelihood of specific sequences of words
N-gram models are the most common type of statistical language model; they predict the probability of a word given the previous n-1 words (a small sketch follows this list)
They are simple and computationally efficient but have limitations in capturing long-range dependencies
They are widely used in speech recognition, machine translation, and other NLP tasks
They are used as a baseline for more complex language models
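To make the n-gram idea concrete, here is a minimal bigram (n = 2) sketch in Python; the toy corpus and the function name bigram_prob are illustrative choices, not from the slides:

from collections import Counter, defaultdict

# Toy corpus; a real model would be trained on a large text collection
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each preceding word
bigram_counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigram_counts[prev][word] += 1

def bigram_prob(prev, word):
    """Estimate P(word | prev) by relative frequency."""
    context = bigram_counts[prev]
    total = sum(context.values())
    return context[word] / total if total else 0.0

print(bigram_prob("the", "cat"))  # 0.25: "the" is followed by cat/mat/dog/rug

Note that an unseen word pair gets probability zero here; in practice n-gram models use smoothing to avoid exactly this.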
Probabilistic Language Models
These models assign a probability to sequences of words based on the training data (a scoring sketch follows this list)
They are based on the principles of probability theory and use probabilistic methods to model the language
They can capture complex patterns in the data and are more flexible than statistical models
They are used in a wide range of NLP tasks, such as machine translation, text generation, and speech recognition
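Building on the chain rule from above, a hedged sketch of how such a model scores a whole sentence, again with illustrative toy data (log probabilities are used so that products of small numbers do not underflow):

import math
from collections import Counter, defaultdict

# Same toy bigram counts as in the previous sketch
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
bigram_counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigram_counts[prev][word] += 1

def sentence_logprob(words):
    """Chain rule with a bigram approximation:
    log P(w_1..w_n) is approximated by the sum of log P(w_i | w_{i-1})."""
    total = 0.0
    for prev, word in zip(words, words[1:]):
        context = bigram_counts[prev]
        total += math.log(context[word] / sum(context.values()))
    return total

print(sentence_logprob("the cat sat on the mat .".split()))  # about -2.77

A sentence containing an unseen word pair would raise a math domain error here (log of zero), which is the weakness that smoothing addresses.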
Neural Language Models
These models use neural networks to predict the likelihood of a sequence of words
They are trained on a large corpus of text data and can learn complex patterns and dependencies
They are more powerful than traditional statistical and probabilistic models
They are used in a wide range of NLP tasks, such as machine translation, text generation, and sentiment analysis (a minimal model sketch follows this list)
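A minimal sketch of a neural language model, assuming PyTorch is installed; the class name, layer sizes, and the one-word context window are illustrative simplifications, not an architecture from the slides:

import torch
import torch.nn as nn

class TinyNeuralLM(nn.Module):
    """Predict a distribution over the next word from the current word id."""
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # word id -> vector
        self.hidden = nn.Linear(embed_dim, hidden_dim)     # learned features
        self.out = nn.Linear(hidden_dim, vocab_size)       # one score per word

    def forward(self, word_ids):
        h = torch.tanh(self.hidden(self.embed(word_ids)))
        return self.out(h)  # raw logits; softmax turns them into probabilities

model = TinyNeuralLM(vocab_size=1000)
logits = model(torch.tensor([42]))       # a single word id as input
probs = torch.softmax(logits, dim=-1)    # P(next word | current word)
print(probs.shape)                       # torch.Size([1, 1000])

The key contrast with the n-gram sketches above: probabilities come from learned, continuous representations rather than raw counts, which is what lets these models generalize to word combinations never seen in training.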
Large Language Models
They are advanced language models with billions of parameters
They are trained on large-scale datasets and can generate human-like text
They are used in a wide range of NLP tasks, such as machine translation, text generation, and question answering
They are the focus of this course; a short preview of using one follows
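As a preview of the "Using pre-trained models" session, a hedged example with the Hugging Face transformers library (assuming it is installed, e.g. in Google Colab; gpt2 is just one small, freely available model):

from transformers import pipeline

# Load a small pre-trained language model; larger LLMs are used the same way
generator = pipeline("text-generation", model="gpt2")
result = generator("Language models are", max_new_tokens=20)
print(result[0]["generated_text"])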