Department of Computer Science & Engineering

CSE552/652
Hidden Markov Models for Speech Recognition

Spring 2006
Mondays and Wednesdays, 12:00 p.m. to 1:30 p.m., Room WCC 403, Wilson Clark Center

John-Paul Hosom
'hosom' at cslu . ogi .edu
503.748.1456

Course Description
Hidden Markov Model (HMM) technology is widely used in today's speech recognition systems. This course introduces the theory and practice of speech recognition using HMMs. Topics include dynamic time warping, Markov Models and Hidden Markov Models (discrete, semi-continuous, and continuous), vector quantization, Gaussian Mixture Models, the Viterbi search algorithm, the forward-backward training algorithm, language modeling, and speech-specific adaptations of HMMs. The course focuses on understanding these fundamental technologies and on developing the main components of speech recognition systems. Students can expect to come away able to write programs for training and running simple HMM systems and to extend these systems to more complex cases. Prerequisite: C programming experience.

The course syllabus is available as a Word document and as a PDF document.

Textbooks
There are two recommended textbooks:

  • Fundamentals of Speech Recognition, Lawrence Rabiner and Biing-Hwang Juang, Prentice Hall, New Jersey, 1993
  • Statistical Methods for Speech Recognition, Frederick Jelinek, The MIT Press, Cambridge, MA, 1999

The lecture notes provide the necessary material, and the textbooks offer valuable supplementary information. Both textbooks should be on reserve at the library.

Grading Policy
Grading is based on three programming assignments, a midterm, and a final. The programming projects provide a template for basic functions such as file I/O and the basic program structure; the student must write the relevant functions. The three projects are worth 15%, 20%, and 25% of the total grade, respectively. The midterm is worth 20%, and the final is worth 20%.

Weekly Schedule
Lecture notes are added as links to PowerPoint files. Files related to programming assignments will also be posted here as ZIP files.

For each week: class dates, links to lecture notes and project files, and lecture topics.
Week 1: April 3, April 5
Links: lecture1, lecture2, project1
  • Course Overview
  • Why Is Automatic Speech Recognition Difficult?
  • Background: Speech Production, Representations of Speech, Models of Human Speech Recognition
  • General Issues in Developing ASR Systems
  • Induction
  • DTW Motivation / Algorithm / Implementation
  • DTW Examples
  • Assign Project 1 on April 5th.
Week 2: April 10, April 12
Links: lecture3, lecture4
  • (Review) Relevant Probability / Statistics Background
  • Markov Models
  • Log-Domain Mathematics
  • Hidden Markov Models
  • HMM Topologies
  • Vector Quantization
Week 3: April 17, April 19
Links: lecture5, lecture6
  • Gaussian Mixture Models
  • Speech Features (LPC, PLP, MFCC)
  • HMMs for Speech
Week 4: April 24, April 26
Links: lecture7, lecture8, project2
  • Project 1 Due April 24th at midnight
  • Viterbi Search
  • Lots of Viterbi Search Examples
  • Assign Project 2
Week 5: May 1, May 3
Links: lecture9, lecture10
  • Review Project 1 on May 1
  • Semi-Markov Models
  • Initializing an HMM
  • Forward Procedure
  • Backward Procedure
Week 6: May 8, May 10
Links: midterm sample, lecture11
  • In-Class Midterm on May 8
  • Go over Midterm on May 10
  • Gamma
  • Xi
  • Baum-Welch or Forward-Backward or EM Algorithm for training HMMs
  • Training on Multiple Files
Week 7: May 15, May 17
Links: lecture12, lecture13, project3
  • Project 2 Due May 15 at midnight
  • Expectation-Maximization in General
  • Embedded Training
  • Search Algorithms: Two-Level
  • Search Algorithms: Level-Building
  • Assign Project 3 on May 17
Week 8: May 22a, May 22b, May 24
Links: lecture14, lecture15, bonus project
  • May 22 is a double class! It starts at 10:30 a.m. and runs until 1:30 p.m.
  • Review Project 2 on May 22
  • Search Algorithms: One-Pass
  • Search Strategies: Beam Search, Grammar/Tree Search, On-Line Processing, Balancing Insertions and Deletions, N-Best Output
  • Acoustic-Model Strategies: Semi-Continuous HMMs, State Tying, State Clustering, Cloning, Pause Models
Week 9: May 31a, May 31b
Links: lecture16, lecture17
  • May 29 is Memorial Day; no class
  • May 31 is a double class! It starts at 10:30 a.m. and runs until 1:30 p.m.
  • Language Models: Incorporating N-Gram LM, Linear Smoothing, Good-Turing Smoothing, Discounting and Back-Off, Cache LM, Class-Based LM, Perplexity
  • Other Approaches to ASR: Segment-Based Systems, HMM/ANN Hybrids, others
  • Tree-Based Search with Language Models
  • Evaluation of System Performance and State of the Art
Week 10: June 5, June 7

  • NO CLASS JUNE 5 OR JUNE 7 (Paul leaves June 5th and returns June 13th)
Finals Week

  • Project 3 Due June 16 at midnight
  • Optional Project, if done, is due June 16
  • Take-Home Final Exam Due June 16

General inquiries:
csedept@cse.ogi.edu
503.748.1151

Department of Computer Science and Engineering
OGI School of Science & Engineering
OHSU
20000 NW Walker Road
Beaverton, OR 97006-8921

503.748.1553 FAX
