Machine Learning for Speech and Audio Processing

Lecturer: Prof. Dr.-Ing. Peter Jax

Contact: Lars Thieling, Maximilian Kentgens

Type: Master lecture

Credits: 4

Lecture in RWTHonline
Exercise in RWTHonline
Learning room RWTHmoodle
(Registration via RWTHonline)

Course language: English

Lecture slides are sold in the first lecture as well as by Irina Ronkartz. Exercise problems will be published in RWTHmoodle.


Given the current situation, the dates for the start of the lecture and exercise may change. You will be informed of any changes via RWTHmoodle.


from Friday, April 17, 2020
08:30 - 10:00
Lecture room FT

The lecture will take place online from 17.04.2020.
Further information via RWTHmoodle.


from Friday, April 17, 2020
10:15 - 11:00
Lecture room FT

The exercise will take place online from 17.04.2020.
Further information via RWTHmoodle.


The exam for WS2019/20 is provisionally held as an oral exam on 22.04. and 29.04., from 11:00 to 12:00 and from 14:00 to 16:00. Appointments are made by arrangement. Please contact Ms Sedgwick.

Please note: Bring your student ID (BlueCard)!

Exam SS2020:
Thursday, August 20th, 2020

08:30 - 10:00
Lecture room AM

Exam duration: 90 minutes

Remarks: The exam is written. The same date also applies to written proofs of achievement (Leistungsnachweise).

Resources: The exam is open-book. Programmable pocket calculators and communication devices are not permitted.

Please note: Bring your student ID (BlueCard)!

The new lecture "Machine Learning for Speech and Audio Processing (MLSAP)" is aimed primarily at students of the Master's program "Electrical Engineering, Information Technology and Computer Engineering". Since the summer term 2019, the course has been part of the elective module catalogues of the majors "Communications Engineering" (COMM), "Computer Engineering" (COMP), and "Systems and Automation" (SYAT).


This one-term lecture presents the fundamental methods of machine learning, with applications to problems in speech and audio signal processing:

  • Fundamentals of Classification and Estimation
    • Basic Problems of Classification
    • Feature Extraction Techniques
    • Basic Classification Schemes
  • Probabilistic Models
    • Stochastic Processes and Models
    • Gaussian Mixture Models (GMMs)
    • Hidden Markov Models (HMMs)
    • Training Methods
    • Bayesian Probability Theory: Classification and Estimation
    • Particle Filter
  • Non-Negative Matrix Factorization (NMF)
    • Dictionary-based concept
  • Neural Networks and Deep Learning
    • Feed-Forward Neural Networks
    • Fundamental Applications
    • Learning Strategies: Supervised vs Unsupervised vs Reinforcement Learning
    • Training of Synaptic Weights: Backpropagation and Stochastic Gradient Descent
    • Behavior of Learning and the “Magic” of Setting Hyper‐Parameters
    • Generative Networks as a Complement to Directed Graphs
    • From “Shallow” to “Deep”: Trade Comprehensibility for Performance
    • Specific Network Architectures
    • Applications in Signal Processing
    • Interpretations and Realizations

Exercises are offered to gain a deeper understanding on the basis of practical examples.


The results of the evaluation are summarized below.

Summer term 2019

Participants of the evaluation (lecture/exercise): 32/32

Lecture:
Global grade: 1.3
Concept of the lecture: 1.4
Instruction and behaviour: 1.3

Exercise:
Global grade: 1.4
Concept of the exercise: 1.5
Instruction and behaviour: 1.4