Speaker: Jonathan Yedidia, Disney Research

Title: The "Boundary Graph" Supervised Learning Algorithm for Regression and Classification

Abstract:

I introduce the "boundary graph" (BG) supervised learning algorithm for regression and classification. The BG algorithm is a general-purpose machine learning algorithm with a wide range of potential applications, and is a competitor with other general-purpose families of algorithms, such as deep neural networks, support vector machines, boosting algorithms, or K-nearest neighbor algorithms. The motivation behind the algorithm is that existing algorithms tend to be either slow to train (like deep neural networks) or slow to query (like naive K-nearest neighbor algorithms), while the BG algorithm has very fast training and querying. The BG algorithm is an online algorithm with a good learning rate that exhibits "one-shot" learning. As the training set grows, the memory requirements only grow slowly, and the query response time grows very slowly. The BG algorithm works by using a "boundary graph," in which nodes of the graph are training examples, and edges exist between nodes when the corresponding training examples are similar in their inputs but significantly dissimilar in their outputs. Training examples are incorporated into the graph if greedy walks toward the training example end at locally nearest nodes that have outputs that are dissimilar from those of the example. Querying also proceeds by making greedy walks toward the test query, except that locally nearest nodes are used to estimate the output. The BG algorithm can be understood as a nearest neighbor algorithm that automatically stores only the most relevant training examples into a data structure that allows for very fast querying.

For additional information, contact:

Kevin Swersky