Speech and Audio Signal Processing (SASP)

The course covers fundamentals and algorithms for speech and audio signal processing with applications in telecommunications and multimedia, in particular:

physiology and models of human speech production and hearing: source-filter model, filterbank model of the cochlea, masking effects;
representation of speech and audio signals: estimation and representation of short-term and long-term statistics in the time, frequency, and cepstral domains; typical examples and visualizations;
source coding for speech and audio signals: criteria, scalar and vector quantization, linear prediction, pitch prediction; waveform coding, parametric coding, hybrid coding; codec standards (ITU, GSM, ISO-MPEG);
basic concepts of automatic speech recognition (ASR): feature extraction, dynamic time warping, Hidden Markov Models (HMMs);
basic concepts of speech synthesis: text-to-speech systems, model-based and data-driven synthesis, PSOLA synthesis;
signal enhancement for acquisition and reproduction: noise reduction, acoustic echo cancellation, and dereverberation using single-channel and multichannel algorithms.

