Sanjoy Dasgupta on active learning

Theory Lunch, September 7, 2005

Minutes by Henry Lin

At theory lunch today, Sanjoy Dasgupta spoke about active learning. 

In the standard supervised learning setting, we are given a set of labeled data points, and we would like to find a hypothesis that predicts the labels of unlabeled data. In this setting, we assume that the user (or learning algorithm) has no choice in which data points are labeled. However, there are many settings in which the user can choose which points to request labels for; this is the active learning model.

In the active learning model, far fewer labeled data points are often needed to learn the correct hypothesis. In some settings, the standard supervised learning model requires at least Ω(1/ε) labeled data points to learn the correct hypothesis to within error ε, while in the active learning model there are algorithms that need only O(log(1/ε)) labeled data points to reach the same error ε.
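
The gap between 1/ε and log(1/ε) labels can be illustrated with a standard textbook example, which is not necessarily the one presented in the talk: learning a threshold on the interval [0, 1]. The sketch below assumes noise-free labels and a pool of about 1/ε unlabeled points drawn uniformly at random. A passive learner would have all of these points labeled; an active learner can binary search over the sorted pool and query only about log2(1/ε) labels, yet end up with a threshold consistent with every label in the pool. The names active_learn_threshold, demo, and oracle are hypothetical and chosen only for this illustration.

```python
import math
import random


def active_learn_threshold(pool, query_label):
    """Binary search for a threshold over a pool of unlabeled points.

    Assumes noise-free labels of the form: 1 if x >= threshold else 0.
    Returns (estimated threshold, number of label queries made).
    """
    xs = sorted(pool)
    lo, hi = 0, len(xs)  # invariant: xs[:lo] are labeled 0, xs[hi:] are labeled 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if query_label(xs[mid]) == 1:
            hi = mid       # first positive point is at mid or to its left
        else:
            lo = mid + 1   # first positive point is to the right of mid
    # lo is now the index of the first positive point (len(xs) if there is none)
    estimate = xs[lo] if lo < len(xs) else 1.0
    return estimate, queries


def demo(epsilon=0.01, true_threshold=0.37, seed=0):
    """Compare the pool size (passive label count) with the active query count."""
    rng = random.Random(seed)
    n = math.ceil(1 / epsilon)  # pool size of about 1/epsilon
    pool = [rng.random() for _ in range(n)]
    oracle = lambda x: 1 if x >= true_threshold else 0  # hypothetical labeler
    estimate, queries = active_learn_threshold(pool, oracle)
    return n, queries, estimate, pool, oracle


if __name__ == "__main__":
    n, queries, estimate, _, _ = demo()
    print(f"passive labels: {n}, active queries: {queries}, estimate: {estimate:.3f}")
```

With ε = 0.01 the pool holds 100 points, and the binary search makes at most 7 label queries. This favorable behavior depends on the hypothesis class and on the noise-free assumption; it is a sketch of the best case, not a general guarantee.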

Sanjoy discussed these results and others, and noted that there are still many open questions in the field of active learning.