Machine Learning for Speech and Audio Processing (New!)

Lecturer: Prof. Dr.-Ing. Peter Jax

Contact: Lars Thieling, Maximilian Kentgens

Type: Master lecture

Credits: 4

RWTHonline Lecture
RWTHonline Exercise
Learning room (Moodle)
(Registration via RWTHonline)

Course language: English

Lecture slides will be sold at the first lecture and are also available from Irina Ronkartz. Exercise problems will be published in the learning room.



Lecture: from Friday, April 5, 2019
08:30 - 10:00
Lecture room 4G


Exercise: from Friday, April 5, 2019
10:15 - 11:00
Lecture room 4G

Consultation hours:

If needed, please contact Lars Thieling, stating the topic of your inquiry.


Exam: Thursday, August 1, 2019
Lecture room Aula 2

Resources: Written resources (e.g., lecture notes or books) are not permitted.

Please note: Bring along your student ID (BlueCard)!

The new lecture "Machine Learning for Speech and Audio Processing" (MLSAP) is aimed particularly at students of the Master's program "Electrical Engineering, Information Technology and Computer Engineering". Starting in the summer term 2019, the course is anchored in the curriculum as part of the ELECTIVE module catalogues of the majors "Communications Engineering" (COMM), "Computer Engineering" (COMP), and "Systems and Automation" (SYAT).


This one-term lecture presents the fundamental methods of machine learning together with their applications to problems in speech and audio signal processing:

  • Fundamentals of Classification and Estimation
    • Basic Problems of Classification
    • Feature Extraction Techniques
    • Basic Classification Schemes
  • Probabilistic Models
    • Stochastic Processes and Models
    • Gaussian Mixture Models (GMMs)
    • Hidden Markov Models (HMMs)
    • Training Methods
    • Bayesian Probability Theory: Classification and Estimation
    • Particle Filter
  • Non-Negative Matrix Factorization (NMF)
    • Dictionary-based concept
  • Neural Networks and Deep Learning
    • Feed-Forward Neural Networks
    • Fundamental Applications
    • Learning Strategies: Supervised vs. Unsupervised vs. Reinforcement Learning
    • Training of Synaptic Weights: Backpropagation and Stochastic Gradient Descent
    • Behavior of Learning and the “Magic” of Setting Hyper-Parameters
    • Generative Networks as a Complement to Directed Graphs
    • From “Shallow” to “Deep”: Trading Comprehensibility for Performance
    • Specific Network Architectures
    • Applications in Signal Processing
    • Interpretations and Realizations
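To give a flavor of the dictionary-based NMF concept listed above, the following is a minimal sketch using the well-known multiplicative update rules (Lee–Seung). All dimensions, the toy matrix, and function names are illustrative assumptions, not course material:

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V ~ W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # dictionary (e.g., basis spectra)
    H = rng.random((rank, m)) + eps   # activations (e.g., over time)
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram" built from two alternating spectral patterns
V = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 2.0, 0.0, 2.0],
              [1.0, 0.0, 1.0, 0.0]])
W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H)   # small, since V has exact rank 2
```

In audio applications, the columns of W are often interpreted as spectral dictionary atoms and the rows of H as their time-varying activations.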

Exercises are offered to deepen the understanding on the basis of practical examples.
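In the spirit of such practical examples, here is a minimal numpy sketch of backpropagation with gradient descent, as named in the outline above. The architecture, learning rate, and the XOR toy problem are illustrative assumptions chosen for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR toy problem: not linearly separable, needs a hidden layer
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer with sigmoid activations
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
losses = []
for _ in range(2000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((p - y) ** 2)))
    # Backward pass: chain rule through output and hidden layers
    dp = 2.0 * (p - y) / len(X) * p * (1.0 - p)
    dW2 = h.T @ dp;  db2 = dp.sum(axis=0)
    dh = (dp @ W2.T) * h * (1.0 - h)
    dW1 = X.T @ dh;  db1 = dh.sum(axis=0)
    # Gradient-descent step on all synaptic weights
    W2 -= lr * dW2;  b2 -= lr * db2
    W1 -= lr * dW1;  b1 -= lr * db1
```

The stochastic variant (SGD) covered in the lecture differs mainly in computing these gradients on random mini-batches rather than on the full data set.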


The results of the evaluation are summarized below.

Summer term 2018

Participants of the evaluation: 12
Global grade: 1.3

Concept of the lecture: 1.2
Instruction and behaviour: 1.3

Concept of the exercise: 1.5
Instruction and behaviour: 1.3