Rebecca Fiebrink is an assistant professor at Princeton University in the Department of Computer Science, and is also associated with the Department of Music.
Perry R. Cook is a professor emeritus at Princeton University in the Department of Computer Science and the Department of Music.
Daniel Trueman is an associate professor of music at Princeton University.
This paper was presented at CHI 2011.
Summary
Hypothesis
The researchers' main hypothesis was that humans using a machine learning system could themselves evaluate and adjust the generated model, and in doing so train the system more effectively.
Methods
To test their hypothesis, the researchers created a system called the Wekinator. The Wekinator is a software system that lets a human user train a machine learning model on gestures or other real-time input. Based on the human's input, the system predicts the action the user intends. If the system is incorrect, or isn't confident enough, the user can tell it the correct action and continue training the system by performing the gesture or input over more iterations.
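The train-evaluate-correct loop described above can be sketched in a few lines. This is a hypothetical illustration, not code from the paper: a tiny 1-nearest-neighbor classifier stands in for the Wekinator's learning algorithms, and the gesture data and labels are invented.

```python
import math

def predict(examples, x):
    """Return the label of the recorded example closest to input x (1-NN)."""
    return min(examples, key=lambda e: math.dist(e[0], x))[1]

# (feature vector, label) pairs the user has recorded so far
examples = [([0.1, 0.9], "pluck"), ([0.8, 0.2], "bow")]

# The user performs a new gesture and the system proposes an action.
new_gesture = [0.6, 0.4]
guess = predict(examples, new_gesture)   # system's guess: "bow"

# If the guess is wrong, the user supplies the correct label, and the
# model is "re-trained" simply by adding the corrected example.
examples.append((new_gesture, "pluck"))  # user's correction
```

Repeating this loop over many iterations is exactly the kind of iterative model-rebuilding the study participants engaged in.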
To test the Wekinator and discover how people used it, the researchers conducted three studies.
In Study A, the researchers had members of the Music Composition department (PhD students and a faculty member) use the Wekinator over a set period of time. Afterwards, the participants sat down with the researchers, discussed how they had used the tool, and offered various suggestions. Most of the participants used the Wekinator to create new sounds and new instruments through its gesture system.
In Study B, students ranging from first year to fourth year were tasked with creating two different types of interfaces for the machine learning system. The first was a simple interaction in which a given input (like a gesture or a movement on a trackpad) triggered a certain sound. The second was a "continuously controlled" instrument whose sound varied continuously with how the input was being entered.
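The two interface types can be contrasted with a minimal sketch. The function names, sound labels, and frequency range below are invented for illustration; the point is only that the first mapping is a discrete classification while the second is a continuous one.

```python
def discrete_trigger(x):
    """Discrete interface: classify a 0..1 trackpad position into one sound."""
    return "chime" if x < 0.5 else "drum"

def continuous_control(x, low_hz=220.0, high_hz=880.0):
    """Continuous interface: map a 0..1 position onto an oscillator frequency."""
    return low_hz + x * (high_hz - low_hz)

discrete_trigger(0.2)      # -> "chime"
continuous_control(0.5)    # -> 550.0
```

In the first case the model only has to pick the right category; in the second it has to produce a smoothly varying output, which is why the paper treats the two as distinct tasks.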
In Study C, the researchers worked with a cellist to build a system that could respond to the movements of a cello bow and correctly recognize those movements.
Results
The researchers found that in all three studies, participants focused on iterative model-rebuilding: they repeatedly re-trained the system, adjusting the models with new input as needed.
In all of the studies, direct evaluation was used more than cross-validation; participants preferred directly evaluating the model and modifying it over relying on cross-validation scores.
The researchers also noted that both cross-validation and direct evaluation gave participants real feedback, letting them know whether or not the system was properly interpreting their input.
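The two evaluation styles can be made concrete with a small sketch. Again this is a hypothetical illustration using an invented 1-nearest-neighbor model and made-up data, not the paper's setup: cross-validation reduces the model to an aggregate accuracy number, while direct evaluation means the user tries a fresh input and inspects the response.

```python
import math

def predict(examples, x):
    """Return the label of the training example closest to x (1-NN)."""
    return min(examples, key=lambda e: math.dist(e[0], x))[1]

examples = [([0.1], "a"), ([0.2], "a"), ([0.8], "b"), ([0.9], "b")]

# Leave-one-out cross-validation: hold each example out in turn and
# check whether a model trained on the rest predicts it correctly.
hits = sum(predict(examples[:i] + examples[i + 1:], x) == y
           for i, (x, y) in enumerate(examples))
accuracy = hits / len(examples)

# Direct evaluation: the user performs a new input and judges the
# model's response themselves, which also reveals *where* it fails.
response = predict(examples, [0.45])
```

A high cross-validation score says the model fits the recorded examples, but only direct evaluation tells the user how the model behaves on the inputs they actually care about, which may explain why participants leaned on it.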
Discussion
The usage of a system like the Wekinator could have some very far-reaching benefits. At the moment, machine learning seems bulky and hard to use if you're not an AI programmer or researcher, but the Wekinator makes it easier to understand and put into practice. In the second study, many of the participants knew little about machine learning until they were briefed on the subject, yet they were still able to use the Wekinator to create new types of musical instruments.
The Wekinator (and systems like it) could open the door to many different input tasks. Gesture and motion control can be better fine-tuned through quick training. Since different people may perform the same gesture differently, a system's model can be quickly and efficiently "tweaked" to fit a given person. This makes analog input (like speech recognition or gestures) more reliable and more accessible to a larger group of people and technologies.