Andrew Moore
Professor of Robotics and Computer Science

Biography
Andrew began his career writing video-games for an obscure British personal computer. He rapidly became a thousandaire and retired to academia, where he received a PhD from the University of Cambridge in 1991. He researched robot learning as a Post-doc working with Chris Atkeson, and then moved to a faculty position at Carnegie Mellon.
Research Interests
At the moment, I'm most interested in learning graphical models efficiently, probabilistic models of person-person interactions, spatio-temporal algorithms for biosurveillance, active learning, new kinds of searches for interesting interactions between variables, and any kind of spatial data structure for caching sufficient statistics.
Tags
Active Learning, AD-trees, Applications, Association Rules, Astrostatistics, Auton Fast Classifiers, Bayesian Networks, Biosurveillance, Cached Sufficient Statistics, Clustering, Efficient Statistical Algorithms, GDA, Kd-trees and Ball-trees, Kernel Density Estimation, K Nearest Neighbor, Life Science Data Mining, Link Analysis, Locally Weighted Learning, Logistic Regression, Markov Decision Processes, Memory-based Learning, Mixture Models, Optimization, Reinforcement Learning, Spatial Statistics, Statistical Data Mining for Astrophysics, WSARE
Recent Papers
-
Fast Dynamic Reranking in Large Graphs
(2009)
-
Fast Incremental Proximity Search in Large Graphs
(2008)
Slightly revised from the ICML camera-ready version -
Fast State Discovery for HMM Model Selection and Learning
(2007)
-
A Tractable Approach to Finding Closest Truncated-commute-time Neighbors in Large Graphs
(2007)
-
Sequential Update of ADtrees
(2006)
- (110 more)
-
Fast Dynamic Reranking in Large Graphs
(2009)
-
Fast Incremental Proximity Search in Large Graphs
(2008)
Slightly revised from the ICML camera-ready version -
Fast State Discovery for HMM Model Selection and Learning
(2007)
-
A Tractable Approach to Finding Closest Truncated-commute-time Neighbors in Large Graphs
(2007)
-
Sequential Update of ADtrees
(2006)
-
Dependency Trees in Sub-linear Time and Bounded Memory
(2006)
Efficient learning of dependency trees for huge datasets. -
Monitoring Food Safety by Detecting Patterns in Consumer Complaints
(2006)
-
Autonomous Visualization
(2006)
-
Efficient Analytics for Effective Monitoring of Biomedical Security
(2005)
-
Making Logistic Regression A Core Data Mining Tool With TR-IRLS
(2005)
This short paper is the easiest, fastest way to learn about Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS), my algorithm for fast, parameter-free logistic regression. TR-IRLS can also be used for any generalized linear model. This -
A Multiple Tree Algorithm for the Efficient Association of Asteroid Observations
(2005)
-
Alias Detection in Link Data Sets
(2005)
Combining string similarity with contextual similarity when searching for aliases using active learning. -
Making Logistic Regression A Core Data Mining Tool: A Practical Investigation of Accuracy, Speed, and Simplicity
(2005)
Regularized logistic regression can be fast, accurate, and simple. This paper includes the most important findings of my thesis, and a few new details. -
Fast Inference and Learning in Large-State-Space HMMs
(2005)
-
Efficiently Identifying Close Track/Observation Pairs in Continuous Timed Data
(2005)
-
Detecting Significant Multidimensional Spatial Clusters
(2005)
Applying the fast multidimensional spatial scan statistic to detect clusters in epidemiological and brain imaging data. -
A Bayesian scan statistic for spatial cluster detection
(2005)
A new Bayesian method for cluster detection -
Dynamic Social Network Analysis using Latent Space Models
(2005)
-
A Bayesian spatial scan statistic
(2005)
A new Bayesian method for spatial cluster detection -
Detecting Anomalous Patterns in Pharmacy Retail Data
(2005)
A bio-surveillance system to collect disease outbreak feedback from public health officials -
Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery
(2005)
-
Anomalous Spatial Cluster Detection
(2005)
A general and powerful framework for spatial cluster detection. -
Finding Optimal Bayesian Networks by Dynamic Programming
(2005)
Learning the optimal Bayes net structure -
An Investigation of Practical Approximate Nearest Neighbor Algorithms
(2004)
How to use variations on classic exact data structures for nearest neighbor, if you want to get faster answers and are prepared to accept approximation? -
High-Dimensional Probabilistic Classification for Drug Discovery
(2004)
Discriminative probabilistic classifiers have been used successfully on large life-sciences datasets, but high dimensionalities have prohibited the use of nonparametric class probability estimation. This paper explores a method (SLAMDUNK) which addresses -
Fast Nonlinear Regression via Eigenimages Applied to Galactic Morphology
(2004)
Determining the shapes of millions of galaxies. -
Active Learning for Anomaly and Rare-Category Detection
(2004)
How to use active learning in a real-life scenario. -
Semantic based Biomedical Image Indexing and Retrieval
(2004)
Volumetric pathological neuroimage retrieval under the framework of classification-driven feature selection. -
The IOC algorithm: Efficient Many-Class Non-parametric Classification for High-Dimensional Data
(2004)
Performing k-nearest-neghbor classifications on multi-class problems without actually finding the k-nearest neighbors. -
Rapid Detection of Significant Spatial Clusters
(2004)
A new spatial scan algorith that searches over arbitrary rectangles in addition to squares. -
Alias Detection in Link Data Sets
(2004)
An active learning approach to deciding whether two names correspond to the same entity, combining string similarity information and context similarity. -
Tractable Learning of Large Bayes Net Structures from Sparse Data
(2004)
in this paper we propose an algorithm that allows to learn a Bayes Net structure from sparse data (e.g., power-law distributed) with over 100,000 variables. we also report time and performance accuracy when applied to several very large datasets -
Belief State Approaches to Signaling Alarms in Surveillance Systems
(2004)
-
A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters
(2003)
Rapid detection of disease clusters using a fast spatial scan statistic algorithm. -
Rapid Evaluation of Multiple Density Models
(2003)
A way to quickly evaluate and compare multiple nonparametric density estimates. -
Efficient Exact k-NN and Nonparametric Classification in High Dimensions
(2003)
Can we do non-approximate k-NN classification without actually finding the k-NN? -
A Fast Multi-Resolution Method for Detection of Significant Spatial Overdensities
(2003)
Scaling up the classical Kulldorff scan statistic. -
Bayesian Network Anomaly Pattern Detection for Disease Outbreaks
(2003)
Handles temporal trends in data by replacing the baseline of WSARE 2.0 with a baseline generated by a Bayesian network. -
Empirical Bayes Screening for Link Analysis
(2003)
An algorithm for discovering top N strange co-occurences of size 2,3,4, etc Uses ideas of frequent sets, but stratifies them according to a statistically justified hierarchical bayes model, using empirical bayes to find the parameters -
What's Strange About Recent Events
(2003)
A shorter paper on WSARE that was submitted to the Journal of Urban Health -
Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning
(2003)
Very aggressive but computationally efficient search steps for Bayes net learning. -
Tractable Group Detection on Large Link Data Sets
(2003)
We present the k-groups algorithm, an improvement of the GDA algorithm that includes significant computational advantages. The k-groups algorithm allows tractable group detection on large data sets. -
Finding Underlying Connections: A Fast Graph-Based Method for Link Analysis and Collaboration Queries
(2003)
CGraph is an algorithm to quickly learn a graph-based model of the underlying connections of a set of entities given link data. -
cGraph: A Fast Graph-Based Method for Link Analysis and Queries
(2003)
This paper is an extended version of the 2003 ICML conference paper. -
Probabilistic Noise Identification and Data Cleaning
(2003)
We examine the use of explicit noise and corruption models to aid in the task of noise identification and data cleaning. -
A Comparison of Statistical and Machine Learning Algorithms on the Task of Link Completion
(2003)
This paper examines the task of link completion, relative algorithm performance, and what this can tell us about the structure of the data. -
Fast Robust Logistic Regression for Large Sparse Datasets with Binary Outputs
(2003)
Logistic regression can provide faster, better results than SVM for life-sciences datasets with hundreds of thousands of attributes. -
Active Learning in Discrete Input Spaces
(2002)
Using modified Gittins indices to decide which datapoint to actively label next whilst being rewarded for each new label. -
Real-valued All-Dimensions search: Low-overhead rapid searching over subsets of attributes
(2002)
Searching over large numbers of contingency tables quickly -
Using Tarjan's Red Rule for Fast Dependency Tree Construction
(2002)
Very fast growth of dependency trees. -
Interpolating Conditional Density Trees
(2002)
Very fast non-parametric Bayesian Network nodes -
Efficient Algorithms for Non-Parametric Clustering with Clutter
(2002)
Finding and counting the high density regions in spatial data. -
Stochastic Link and Group Detection
(2002)
This paper introduces the GDA algorithm. We use noisy link data (n-tuples of entities) to learn underlying groupings of entities. -
Rule-based Anomaly Pattern Detection for Detecting Disease Outbreaks
(2002)
Answering the question: "What's Strange About Recent Events?" -
Summary of Biosurveillance-relevant statistical and data mining technologies
(2002)
A brief informal survey of some techniques that have been used for Biosurveillance. -
Variable Resolution Discretization in Optimal Control
(2002)
A short paper on choosing the right resolution in a tessalation of state space. -
Reinforcement Learning for Cooperating and Communicating Reactive Agents in Electrical Power Grids
(2001)
One approach to distributed reinforcement learning -
Classification-Driven Pathological Neuroimage Retrieval Using Statistical Asymmetry Measures
(2001)
Using machine learning to detect abnormalities in neuro-imaging output. -
N-Body Problems in Statistical Learning
(2001)
A way to use multiple trees simultaneously to solve a large class of statistical problems efficiently. -
Repairing Faulty Mixture Models using Density Estimation
(2001)
Intelligent automatic selection of new mixture model components -
Mixtures of Rectangles: Interpretable Soft Clustering
(2001)
A mixture model that is easily readable by humans. -
Direct Policy Search using Paired Statistical Tests
(2001)
If you're going to choose the best policy by roll-outs, how can statistical tests and "racing" help? -
Mixnets: Learning Bayesian Networks with mixtures of discrete and continuous attributes
(2000)
Bayes Nets with Mixture Models in the nodes -
Learning Filaments
(2000)
A generative model and efficient algorithm for identifying noisy networks of points in k-dimensional space -
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
(2000)
Extension to popular K-means, where the number of clusters K is also estimated. -
A Dynamic Adaptation of AD-trees for Efficient Machine Learning on Large Data Sets
(2000)
Fast implementation of on-demand AD-trees for scores of high-arity attributes and millions of rows. -
Condensed Representations for computationally tractable data mining of massive sky surveys
(2000)
-
A Nonparametric Approach to Noisy and Costly Optimization
(2000)
Optimizing a noisy process in a possibly discontinuous or non-euclidian space. -
The Anchors Hierarchy: Using the Triangle Inequality to Survive High-Dimensional Data
(2000)
Using Ball-trees allows cached sufficient statistics-based accelerations even in high dimensions. The Anchors approach quickly optimizes the Ball-tree structure for this purpose. -
Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs
(1999)
An efficient procedure to approximately compute all policies for all possible goal states. -
Accelerating Exact k-means Algorithms with Geometric Reasoning (Extended version)
(1999)
This is an extended version of the KDD99 paper. -
Accelerating Exact k-means Algorithms with Geometric Reasoning
(1999)
Using cached counts and a different kind of search operator during k-means updates, with no approximation -
Distributed Value Functions
(1999)
Distributed Reinforcement learning for applications such a power grids -
Variable Resolution discretizations for high-accuracy solutions of optimal control problems
(1999)
-
Influence and Variance of a Markov Chain: Application to adaptive discretization in optimal control
(1999)
You're using variable resolution splines to approximate a value function. Where is most profitable to increasing the resolution? -
Efficient Multi-Object Dynamic Query Histograms
(1999)
Using Multiresolution kdtrees to accelerate visualization algorithms -
Bayesian Networks for Lossless Dataset Compression
(1999)
Practical ways to compress large tabular datasets -
Very Fast EM-based Mixture Model Clustering Using Multiresolution KD-trees
(1999)
Using kdtrees with centroids in the nodes can allow accurate EM updates in time sublinear in the number of records -
Value Function Based Production Scheduling
(1998)
Production scheduling in which we account for a probability distribution on future jobs by means of kernel-based value function approximation -
Q2: Memory-based active learning for optimizing noisy continuous functions
(1998)
Maximizing a very noisy function in k-dimensional space with few samples -
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets
(1998)
Introduces AD-trees: a new way to implicitly pre-cache the answers to all possible counting queries for a dataset. -
Learning Evaluation Functions for Global Optimization and Boolean Satisfiability
(1998)
-
AD-trees for Fast Counting and for Fast Learning of Association Rules
(1998)
Using AD-trees to learn conjunctive rules via beam search. -
Applying online search techniques to Continuous-State reinforcement learning
(1998)
Using a specialized version of A-star to boost the performance of approximate value functions -
The Racing Algorithm: Model Selection for Lazy Learners
(1997)
A detailed analysis and study of Racing. -
Efficient Locally Weighted Polynomial Regression Predictions
(1997)
Using Multiresolution KD-trees with cached first and second moments in the nodes -
Locally Weighted Learning
(1997)
Survey of the use of kernel functions in kernel regression, locally weighted regression and related function approximators. -
Using Prediction to Improve Combinatorial Optimization Search
(1997)
Automatically improving combinatorial search by reinforcement-learning-style analysis of earlier runs -
Locally Weighted Learning for Control
(1997)
How can kernel methods and locally weighted regression help robots learn to control themselves? -
A tutorial on using the Vizier memory-based learning system
(1997)
A tutorial on using the Windows Vizier software for fast locally weighted and k-NN style classification and regression. -
Simulation-based optimization of a stochastic product coating problem using hash-table cacheing of costs
(1997)
-
Memory-based Stochastic Optimization
(1996)
Using locally weighted regression to model response surfaces and to choose the next experiment -
Reinforcement Learning: A Survey
(1996)
Surveys MDPs, TD, Q-learning and many other Reinforcement Learning staples. -
Algorithms for Approximating Optimal Value Functions in Acyclic Domains
(1996)
Using "Rollouts" to make value-function-based RL more practical -
Multiresolution Instance-based Learning
(1995)
Multiresolution Kd-trees with cached statistics for accelerating kernel regression -
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
(1995)
Automatic variable resolution discretization of a multidimensional state space during searches for shortest paths -
Learning Automated Product Recommendations Without Observable Features: An Initial Investigation
(1995)
-
Generalization in Reinforcement Learning: Safely Approximating the Value Function
(1995)
An introduction to the ways that naive application of function approximation of value functions can fail. -
An Empirical Investigation of Brute Force to choose Features, Smoothers and Function Approximators
(1995)
What happens when you use very intense cross-validation? -
Proceedings of the Workshop on Value Function Approximation, Machine Learning Conference 1995.
(1995)
Short talks from a workshop about value function approximation -
Variable Resolution Reinforcement Learning
(1995)
Briefly surveys a number of approaches to making Bellman updates faster -
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces
(1994)
A short introduction to an efficient learning-shortest-paths algorithm. -
Efficient Algorithms for Minimizing Cross Validation Error
(1994)
-
Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation
(1994)
Perform many cross-validation operations in a round-robin fashion, pruning likely non-winners early. -
A short tutorial note on computing information gain from counts
(1994)
A simple 2-page tutorial. -
Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time
(1993)
As estimates of rewards and transition probabilites improve during learning, how can we efficiently allow the value function to keep up? -
Memory-based Reinforcement Learning: Efficient Computation with Prioritized Sweeping
(1992)
Using a priority queue to schedule the most useful value function updates -
Fast, Robust Adaptive Control by Learning only Forward Models
(1992)
A real robot pool player achieves high accuracy by learning in a forward direction. -
A tutorial on kd-trees
(1991)
A description of Bentley et al's classic nearest neighbor algorithm -
Variable Resolution Dynamic Programming: Efficiently Learning Action Maps in Multivariate Real-valued State-spaces
(1991)
-
Knowledge of Knowledge and Intelligent Experimentation for Learning Control
(1991)
-
Efficient Memory-based Learning for Robot Control
(1990)
Using KD-trees, nearest neighbor and active learning. -
Acquisition of Dynamic Control Knowledge for a Robotic Manipulator
(1990)
-
Experiments in Adaptive State Space Robotics
(1989)
Using a nearest neighbor classifier to design control experiments -
Learning Robotic Control: PhD. Thesis Proposal
(1988)
Thesis Proposal
Talks
-
Activity From Demographics and Links
Pittsburgh, 9/1/05
Recently Updated Software
-
K-means
-
npt
N-point Spatial Statistics. -
SBNS
Screen-based Bayes Net Structure search. A computationally efficient algorithm that performs Bayes Net structural learning from a very large binary dataset. -
Simple kd-Trees Source Code
This package contains source code for the simkd kd-tree implementation. -
Sparse Logistic Regression
This program performs fast sparse Logistic Regression classification. - (21 more)
-
K-means
-
npt
N-point Spatial Statistics. -
SBNS
Screen-based Bayes Net Structure search. A computationally efficient algorithm that performs Bayes Net structural learning from a very large binary dataset. -
Simple kd-Trees Source Code
This package contains source code for the simkd kd-tree implementation. -
Sparse Logistic Regression
This program performs fast sparse Logistic Regression classification. -
Sparse K Nearest Neighbor
This program performs fast sparse K Nearest Neighbor classification. -
AFDL (Activity From Demographics and Links)
Predicting activity of entities from linkages between entities and their demographics -
Sparse Naive Bayes Classifier
This program performs fast sparse Naive Bayes Classifier classification. -
Cuevas CFF Clustering
Cuevas uses the 2-step CFF algorithm to perform clustering against a noisy background. -
Dense Naive Bayes Classifier
This program performs fast dense Naive Bayes Classifier classification. -
convert
Utility to convert between various file formats used by the Auton Lab software. -
MDP and Reinforcement Learning Visualization
-
Dense Logistic Regression
This program performs fast dense Logistic Regression classification. -
Dense Association Rules
This program efficiently searches for high scoring association rules given dense data. -
Fast EM Clustering
Rapid Learning of Gaussian Mixture Models from large datasets. -
Scan Statistics
A fast implementation of scan statistic search for spatial overdensities. Our goal is to find rectangular regions where the count (e.g. number of disease cases) is higher than expected, given the underlying population distribution. -
Simple kd-Trees
This program constructs a kd-tree from the contents of an input dataset of k-dimensional vectors, and then performs nearest neighbor searches within the kd-tree using query points from a query dataset. -
WSARE
This program takes as input a date-indexed biosurveillance data stream such as Emergency Department data, and looks for recent strange events. -
cGraph
CGraph is an algorithm to quickly learn a graph-based model of the underlying connections of a set of entities given link data. -
k-groups/GDA
The group detection algorithm (GDA) finds underlying groupings of entities given a set of observed links and demographic information. -
Dense K Nearest Neighbor
This program performs fast dense K Nearest Neighbor classification. -
Bayes Net Learner
This program learns Bayesian network structure from categorical data. -
lr_trirls
This is a logistic regression implementation using our truncated regularized iteratively re-weighted least squares (TR-IRLS) algorithm. -
XGDA Learn
The XGDA Learn software takes link information as input, and learns groups, subgroups and friends (i.e., most likely collaborators) from that link information. -
Vizier
Old but fast locally weighted regression for Windows. -
Fast Classifiers
A collection of fast classifiers including knn, aknn, naive bayes, decision tree, and logistic regression.