Dissertation Defense
WHAT:
Wei Chai: "Automated Analysis of Musical Structure"
WHEN: Monday, June 6, 2005, 1:00 PM EST
WHERE:
Bartos Theatre, MIT Media Lab (E15)
DISSERTATION COMMITTEE:
Barry Vercoe
Professor of Media Arts and Sciences
MIT Media Laboratory
Tod Machover
Professor of Music and Media
MIT Media Laboratory
Rosalind Picard
Associate Professor of Media Arts and Sciences
MIT Media Laboratory
ABSTRACT:
Listening to music and perceiving its structure is fairly easy for humans,
even for listeners without formal musical training. For example, we notice
changes in notes, chords, and keys, though we may not be able to name them
(i.e., segmentation based on tonality and harmonic analysis); we can parse
a musical piece into phrases or sections (i.e., segmentation based on
recurrent structural analysis); we can identify and memorize the main
themes or hooks of a piece (i.e., summarization based on hook analysis);
and we can detect the musical parts most informative for making certain
judgments (i.e., detection of salience for classification).
However, building computational models that mimic these processes is a
hard problem. Furthermore, the amount of digital music being generated and
stored has become vast; efficiently storing and retrieving this content is
an important real-world problem.
This dissertation presents our research on automatic music segmentation,
summarization, and classification using a framework that combines music
cognition, machine learning, and signal processing. It inquires
scientifically into the nature of human perception of music and offers
practical solutions to difficult problems of machine intelligence in
automatic musical content analysis and pattern discovery.
Specifically, for segmentation, an HMM-based approach will be used for key
change and chord change detection, and a method for detecting the
self-similarity property using approximate pattern matching will be
presented for recurrent structural analysis. For summarization, we will
investigate where the catchiest parts of a musical piece normally appear
and develop strategies for automatically generating music thumbnails based
on this analysis. For musical salience detection, we will examine methods
for weighting the importance of musical segments by classification
confidence; two classification techniques and their definitions of
confidence will be explored. The effectiveness of all of our methods will
be demonstrated by quantitative evaluations and/or human experiments on
complex real-world musical stimuli.
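To give a flavor of the recurrent-structural-analysis idea, here is a minimal, hypothetical sketch (not the dissertation's actual algorithm, which uses approximate pattern matching): a repeated section of a piece shows up as a high-similarity off-diagonal stripe in a cosine self-similarity matrix computed over frame-level features such as 12-dimensional chroma vectors. The toy "song" and its sections below are invented for illustration.

```python
import numpy as np

def self_similarity(features):
    """Cosine self-similarity matrix of a feature sequence (frames x dims)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # unit-normalize each frame
    return unit @ unit.T

# Toy "song": section A (frames 0-9) recurs as frames 20-29.
rng = np.random.default_rng(0)
A = rng.random((10, 12))   # 12-dim chroma-like feature vectors
B = rng.random((10, 12))
song = np.vstack([A, B, A])

S = self_similarity(song)
# The repeat of A appears as a stripe of high values S[i, i + 20]:
stripe = float(np.mean([S[i, i + 20] for i in range(10)]))
print(round(stripe, 3))  # 1.0 for an exact repeat
```

In real audio the repeat is never exact, which is why approximate matching (rather than this exact-similarity toy) is needed to trace such stripes robustly.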