Syllabus

Readings

We will use parts of the following online books

We will also read some articles

Weekly overview

Time & Place

Program

22 Aug

2:15 pm

 

Lec1: Introduction & Looking at data

Presentation

Recommended reading

For the descriptive statistics part, any of the following will cover the lecture
  • OpenIntro 1.1, 1.2.0-1.2.2, 1.6.0-1.6.5, 1.7.0-1.7.2
  • Nutshell: Ch. 4, "Descriptive Statistics and Graphical Displays", p. 83-120
  • Moore and McCabe: Ch. 1, "Looking at Data - Distributions", sec. 1.1-1.2
  • Cartoon Guide: Ch. (1-) 2, "Data Description"

23 Aug

No Lecture

27 Aug

2:15 pm

Lab1: Python, NLTK, plotting

Workbench

Exercises

29 Aug

2:15 pm

Lec2: Words, lexicon, texts

Presentation

Recordings: video | audio only

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Ch. 2 Regular expressions etc
    • Sec. 2.0
    • Sec. 2.2 Words
    • Sec. 2.3 Corpora
    • Sec. 2.4 Normalization, except 2.4.3 Chinese
  • Ch. 8 Part-of-speech tagging
    • Sec. 8.1 and 8.2

NLTK Book

  • Ch. 3, sec. 6 Normalizing Text
  • Ch. 3, sec. 8 Segmentation
  • Ch. 5, sec. 1 Using a tagger
  • Ch. 5, sec. 2 Tagged corpora

Wikipedia

Recommended reading

Wikipedia

30 Aug

10:15 am

Lec3: Probabilities

Presentation

Mandatory reading

OpenIntro Ch. 2, "Probability", sec. 2.1-2.4

3 Sep

2:15 pm

Lab2: Mandatory assignment 1

5 Sep

2:15 pm

Lec4: Text classification

Presentation

Recordings: video | audio only

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Ch. 4, "Naive Bayes Classification and Sentiment"

NLTK Book

  • Ch. 6, "Learning to Classify Text"

6 Sep

10:15 am

Lec5: Probability distributions

Presentation

Mandatory reading

OpenIntro Ch. 3, "Distributions of random variables":

  • Sec. 3.3.1 Bernoulli distribution
  • Sec. 3.4.1 Binomial distribution

Recommended exercises

Recommended reading

OpenIntro Ch. 3, "Distributions of random variables":

  • Sec. 3.1 Normal distribution
  • Sec. 3.4.2 Normal approximation

12 Sep

2:15 pm

Lec6: N-gram models and more on Naive Bayes

Presentations

Recordings:

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Ch. 3, "N-gram language models"
    • Sec. 3.0-3.4
    • Sec. 3.6

17 Sep

2:15 pm

Lab3: Text classification 1

Exercises

Last minute help with mandatory assignment 1

19 Sep

2:15 pm

Lec7: HMM-tagging

Presentations

No recordings

  • The equipment in the room did not work.

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Ch. 8, "Part-of-Speech Tagging"
    • Sec. 8.0-8.4

24 Sep

2:15 pm

Lab4: Text classification 2

Exercises

26 Sep

2:15 pm

Lec8: Logistic regression

Presentation

Recordings: video | audio only

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Ch. 5, "Logistic Regression"
    • We will be light on some of the mathematical details (cf. presentation)

1 Oct

2:15 pm

Lab5

  • First hour: Jointly look at solutions to exercise set 3 and set 4

  • Second hour: Start work on mandatory set 2 (= set 5) (to appear)

4 Oct

2:15 pm

Lec9: Sequence labelling, summary so far

Presentation

Recordings: video | audio only

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Loose ends from chapters 5 and 8
  • Chapter 7, sec. 7.0-7.3
  • Chapter 17, sec. 17.0-17.2

Matthew Honnibal: A Good Part-of-Speech Tagger in about 200 Lines of Python

8 Oct

2:15 pm

Lab6

  • Joint work on exercise set 6 (=Mandatory assignment 2, part B)

  • Individual advice on mandatory assignment 2

10 Oct

2:15 pm

Lec10: Vector Semantics - Sparse Vectors

Presentation

Recordings: video | audio only

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.: 

15 Oct

2:15 pm

Lab7

  • Last minute help with mandatory assignment 2

17 Oct

2:15 pm

Lec11: Vector semantics - Word2Vec

Presentation

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

Links

18 Oct

10:15 am

Lec12: Phrase structure, grammar and parsing

Presentation

No recording

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Chapter 10, sec. 10.0-10.5
  • Chapter 11, sec. 11.0-11.1

NLTK Book

  • Ch. 8, sec. (8.)1 Some Grammatical Dilemmas to (8.)3 Context-Free Grammar
  • Ch. 8, sec. (8.)4.2 Shift-Reduce parsing

Recommended reading

NLTK Book

  • Ch. 8, sec. (8.)4.1 Recursive Descent Parsing
  • J&M, chapter 11, sec. 11.2 CKY-parsing

24 Oct

2:15 pm

Lec13: Dependency parsing

Presentation

Recording: video | audio only

Mandatory reading

  • Nivre, Joakim: "Dependency grammar and dependency parsing", MSI report 05133, 2005. Växjö University: School of Mathematics and Systems Engineering. Sections 1, 2, 4. On-line copy.
  • Jurafsky and Martin, Speech and Language Processing, 3. ed.:
    • Chapter 13, sec. 13.0-13.4, 13.6-13.7
  • Nivre, Joakim: "Two Strategies for Text Parsing" in A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday, 2006. On-line copy.

Recommended reading

31 Oct

2:15 pm

Lec14: Information extraction

Presentation

No recording (due to a bad cough)

Mandatory reading

Jurafsky and Martin, Speech and Language Processing, 3. ed.:

  • Chapter 11, sec. 11.3
  • Chapter 17, sec. 17.0-17.2

NLTK Book

  • Ch. 7, sec. 7.0-7.3, 7.5

5 Nov

2:00 pm

Lab8

  • Joint work on Mandatory assignment 3

7 Nov

2:15 pm

Lec15: Relation extraction; Evaluation and significance

Presentation

Recordings

Mandatory reading

Miroslav Kubat, An Introduction to Machine Learning (accessible within the uio domain)

  • Chapter 11 Performance
  • Chapter 12 Statistical significance

OpenIntro Ch. 6, "Inferences for categorical data"

  • Sec. 6.5
  • Parts of Sec. 6.1 and 6.2

Recommended reading

OpenIntro Ch. 3, "Distributions of random variables":

  • Sec. 3.1 Normal distribution
  • Sec. 3.4.2 Normal approximation

14 Nov

2:15 pm

Lec16: Evaluation and significance

Presentation

Recordings: video | audio only

Mandatory reading

Miroslav Kubat, An Introduction to Machine Learning (accessible within the uio domain)

  • Chapter 13, section 13.7 Criteria for performance evaluation

21 Nov

2:15 pm

Exercises and repetition 

Exercises

28 Nov

2:15 pm

More exercises and repetition

Exercises

 

 

Published Aug. 17, 2018 2:10 PM - Last modified Aug. 19, 2019 9:54 AM