# The Boundary Forest Algorithm for Online Supervised and Unsupervised Learning

@article{Mathy2015TheBF,
title={The Boundary Forest Algorithm for Online Supervised and Unsupervised Learning},
author={Charles Mathy and Nate Derbinsky and Jos{\'e} Bento and Jonathan Rosenthal and Jonathan S. Yedidia},
journal={ArXiv},
year={2015},
volume={abs/1505.02867}
}
• Published 25 January 2015
• Computer Science
• ArXiv
We describe a new instance-based learning algorithm called the Boundary Forest (BF) algorithm, which can be used for supervised and unsupervised learning. The algorithm builds a forest of trees whose nodes store previously seen examples. It can be shown data points one at a time and updates itself incrementally, hence it is naturally online. Few instance-based algorithms have this property while also being fast, as the BF is. This is crucial for applications where one…
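The abstract sketches the core mechanism: each tree node stores a training example, a query descends greedily toward the nearest stored example, and examples the current tree misclassifies are added as new nodes (they tend to lie near decision boundaries, hence the name). Below is a minimal illustrative sketch of that idea, not the paper's full algorithm; the class names, the Euclidean metric, and the majority-vote forest are assumptions made here for illustration:

```python
import math


class BoundaryTree:
    """Simplified sketch of one boundary tree (after the BF abstract).

    Each node stores a training example and its label; queries walk
    greedily from the root toward the nearest stored example.
    """

    def __init__(self, x, y):
        self.x, self.y = x, y      # stored example (point) and its label
        self.children = []

    def query(self, q):
        """Greedy descent: stop when the current node beats all its children."""
        node = self
        while node.children:
            best = min([node] + node.children,
                       key=lambda n: math.dist(n.x, q))
            if best is node:
                break
            node = best
        return node

    def train(self, x, y):
        """Add (x, y) as a child of its nearest node only if misclassified."""
        nearest = self.query(x)
        if nearest.y != y:
            nearest.children.append(BoundaryTree(x, y))


class BoundaryForest:
    """Several trees queried together; labels combined by majority vote.
    (In the paper the trees differ by seeding/ordering; here they are
    simply seeded with different examples.)"""

    def __init__(self, trees):
        self.trees = trees

    def train(self, x, y):
        for t in self.trees:
            t.train(x, y)

    def predict(self, q):
        labels = [t.query(q).y for t in self.trees]
        return max(set(labels), key=labels.count)
```

Because an example is stored only when the existing tree gets it wrong, the structure stays compact and each update costs one greedy traversal, which is what makes the scheme naturally online.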

## Figures and Tables from this paper

*(figure and table thumbnails not captured in this extract)*

## Citations

Efficient learning of neighbor representations for boundary trees and forests
• Computer Science
2019 53rd Annual Conference on Information Sciences and Systems (CISS)
• 2019
Differentiable Boundary Sets is introduced, an algorithm that overcomes the computational issues of the differentiable boundary tree scheme and also improves its classification accuracy and data representability.
Alternating optimization of decision trees, with application to learning sparse oblique trees
• Computer Science
NeurIPS
• 2018
An algorithm is given that, given an input tree, produces a new tree with the same or smaller structure but new parameter values that provably lower or leave unchanged the misclassification error, and can handle a sparsity penalty.
Learning data representations for robust neighbour-based inference
Differentiable Boundary Sets is introduced, an algorithm that overcomes the computational issues of the differentiable boundary tree (DBT) scheme and also improves its classification accuracy and data representability, and offers a significant reduction in training time.
Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees
• Computer Science
ArXiv
• 2017
A new method called differentiable boundary tree is introduced which allows for learning deep kNN representations, yielding very efficient trees with a clearly interpretable structure by modelling traversals in the tree as stochastic events.
Latent source models for nonparametric inference
This thesis bridges the gap between theory and practice for nearest-neighbor inference methods in the three specific case studies of time series classification, online collaborative filtering, and patch-based image segmentation by deriving theoretical performance guarantees for these methods.
k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks
• Computer Science
IJCAI
• 2021
This paper proposes two families of models built on a sequence to sequence model and a memory network model to mimic the k-Nearest Neighbors model, which generate a sequence of labels, a sequence of out-of-sample feature vectors, and a final label for classification, and thus could also function as oversamplers.
Interpretable Synthetic Reduced Nearest Neighbor: An Expectation Maximization Approach
• Computer Science
2020 IEEE International Conference on Image Processing (ICIP)
• 2020
A novel optimization of Synthetic Reduced Nearest Neighbor based on Expectation Maximization (EM-SRNN) is provided that always converges while monotonically decreasing the objective function.
Q-learning with Nearest Neighbors
• Computer Science
NeurIPS
• 2018
This work considers model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernel and establishes a lower bound that argues that the dependence of $\tilde{\Omega}\big(1/\varepsilon^{d+2}\big)$ is necessary.
Distributed Nearest Neighbor Classification.
• Computer Science
• 2018
This work replaces majority voting with the weighted voting scheme, and provides sharp theoretical upper bounds of the number of subsamples in order for the distributed nearest neighbor classifier to reach the optimal convergence rate.
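The weighted voting mentioned above replaces the equal votes of plain k-NN with distance-dependent weights, so that closer neighbors count for more. A minimal sketch of one common variant, inverse-distance weighting (the cited paper's exact scheme and its distributed subsampling setup are not reproduced here):

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-12):
    """Distance-weighted k-NN vote: each of the k nearest neighbors
    contributes weight 1/d rather than a single majority vote.
    `train` is a list of (point, label) pairs; `eps` guards against
    division by zero for an exact match."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = defaultdict(float)
    for x, y in neighbors:
        votes[y] += 1.0 / (math.dist(x, query) + eps)
    return max(votes, key=votes.get)
```

With equal votes a far neighbor counts as much as a near one; the 1/d weighting lets a single very close neighbor outvote a majority of distant ones, which is what enables the sharper convergence rates such papers analyze.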
Explaining the Success of Nearest Neighbor Methods in Prediction
• Computer Science
Found. Trends Mach. Learn.
• 2018
This monograph aims to explain the success of nearest neighbor prediction methods, and covers recent theoretical guarantees on nearest neighbor prediction in the three case studies of time series forecasting, recommending products to people over time, and delineating human organs in medical images by looking at image patches.

## References

Showing 1–10 of 22 references
Instance-Based Learning Algorithms
• Computer Science
Machine Learning
• 2005
This paper describes how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy and extends the nearest neighbor algorithm, which has large storage requirements.
Reduction Techniques for Instance-Based Learning Algorithms
• Computer Science
Machine Learning
• 2004
Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest average generalization accuracy in these experiments, especially in the presence of uniform class noise.
Cover trees for nearest neighbor
• Computer Science
ICML
• 2006
A tree data structure for fast nearest neighbor operations in general n-point metric spaces (where the data set consists of n points) that shows speedups over the brute force search varying between one and several orders of magnitude on natural machine learning datasets.
Distance Metric Learning for Large Margin Nearest Neighbor Classification
• Computer Science
NIPS
• 2005
This paper shows how to learn a Mahalanobis distance metric for kNN classification from labeled examples in a globally integrated manner and finds that metrics trained in this way lead to significant improvements in kNN classification.
Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration
• Computer Science
VISAPP
• 2009
A system that answers the question, “What is the fastest approximate nearest-neighbor algorithm for my data?” and a new algorithm that applies priority search on hierarchical k-means trees, which is found to provide the best known performance on many datasets.
Five Balltree Construction Algorithms
This report compares five different algorithms for constructing ball trees from data and finds that the bottom-up approach usually produces the best trees but has the longest construction time.
Random Forests
Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Online learning of robust object detectors during unstable tracking
• Computer Science
2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops
• 2009
This work proposes a new approach, called Tracking-Modeling-Detection (TMD), that closely integrates adaptive tracking with online learning of the object-specific detector and shows the real-time learning and classification is achievable with random forests.
Near Neighbor Search in Large Metric Spaces
A data structure to solve the problem of finding approximate matches in a large database, called a GNAT (Geometric Near-neighbor Access Tree), is introduced based on the philosophy that the data structure should act as a hierarchical geometrical model of the data as opposed to a simple decomposition of the data that does not use its intrinsic geometry.
Optimised KD-trees for fast image descriptor matching
• Computer Science
2008 IEEE Conference on Computer Vision and Pattern Recognition
• 2008
This paper has extended priority search to priority search among multiple trees, by creating multiple KD-trees from the same data set and simultaneously searching among these trees, and improved the KD-tree's search performance significantly.