Overview

I study machine learning, particularly nonparametric statistics, where I am interested in developing and proving guarantees for algorithms and in studying the basic mathematics underlying certain nonparametric problem settings. In deep learning I have mostly focused on deep anomaly detection, but I am beginning to move into other topics such as deep probabilistic models.

Potential Collaborators and Students: Students sometimes send me emails looking for PhD or internship positions. I do not control any budget, so I cannot hire PhD students or other researchers. I can, however, help advise students in their research (PhD or otherwise), and I am open to considering collaborations in this capacity.

Deep Anomaly Detection

Anomaly detection is the task of determining whether a new data point is anomalous or unusual, given a collection of data known to be nominal (normal looking). Many methods exist for anomaly detection in the classic low-dimensional setting. Deep approaches to anomaly detection for high-dimensional data, particularly images, are far less developed. This gap is not due to a lack of usefulness: deep anomaly detection has many applications, including medical imaging tasks like tumor detection and quality control in industrial settings.
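As a point of reference for the classic low-dimensional setting, here is a minimal sketch of shallow anomaly detection with a one-class SVM from scikit-learn; the data is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Training data known to be nominal ("normal looking").
X_nominal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# nu upper-bounds the fraction of training points treated as outliers.
detector = OneClassSVM(kernel="rbf", nu=0.05).fit(X_nominal)

X_new = np.array([[0.1, -0.2],   # near the nominal data
                  [6.0, 6.0]])   # far away: should be flagged
print(detector.predict(X_new))   # +1 = nominal, -1 = anomalous
```

Methods like this work well in low dimensions but degrade on raw high-dimensional inputs like images, which is what motivates the deep approaches below.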

I work on deep one-class methods for anomaly detection, for example Deep Support Vector Data Description (Deep SVDD). These methods are quite effective and have garnered a fair amount of attention from the machine learning community. I am interested in collaborations with domain experts who have intriguing applications for deep anomaly detection.
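To give a flavor of the one-class idea, here is a minimal PyTorch sketch of a Deep SVDD-style training objective: a network is trained to map nominal data close to a fixed center, and distance from that center serves as the anomaly score. The toy network, data, and hyperparameters are my own placeholders; the published method includes further details (e.g. architectural constraints that prevent trivial solutions) omitted here.

```python
import torch

def one_class_loss(phi, x, c):
    """Mean squared distance of embeddings phi(x) to the fixed center c."""
    return ((phi(x) - c) ** 2).sum(dim=1).mean()

# Toy feature network and nominal training batch (illustrative only).
phi = torch.nn.Sequential(
    torch.nn.Linear(32, 16), torch.nn.ReLU(), torch.nn.Linear(16, 8)
)
x = torch.randn(64, 32)

with torch.no_grad():
    c = phi(x).mean(dim=0)  # center: mean of the initial embeddings

opt = torch.optim.Adam(phi.parameters(), lr=1e-3, weight_decay=1e-5)
for _ in range(200):
    opt.zero_grad()
    one_class_loss(phi, x, c).backward()
    opt.step()

# Score a new point: larger distance from c = more anomalous.
with torch.no_grad():
    score = ((phi(torch.randn(1, 32)) - c) ** 2).sum()
```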

Relevant Works:
[Figure: CIFAR-10 images ranked by our method, with the most normal examples of several classes (e.g. "cat" in the middle) on the left and the most anomalous on the right.]

Nonparametric Mixture Modelling

In mixture modelling one tries to find a convex combination of densities that fits the data well, the idea being that the mixture components will reveal interesting structure in the data or can be used for some downstream task, e.g. clustering. For mixture modelling to make sense one typically restricts the mixture components to some class of densities, for example multivariate Gaussian densities in the case of the classic Gaussian mixture model. I study the setting where one makes no assumptions on the mixture components and instead has access to collections of samples known to come from the same mixture component. This problem has close connections to nonnegative tensor factorization.
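Formally, a minimal version of this grouped-sample setting can be written as follows (the notation here is mine and only illustrative):

```latex
% The mixture density is a convex combination of unknown component densities,
\[
  p(x) = \sum_{k=1}^{K} w_k\, f_k(x), \qquad w_k \ge 0, \quad \sum_k w_k = 1,
\]
% and each group i of observations shares one latent component label Z_i:
\[
  Z_i \sim \mathrm{Categorical}(w_1, \dots, w_K), \qquad
  X_{i,1}, \dots, X_{i,m} \mid Z_i = k \;\overset{\text{i.i.d.}}{\sim}\; f_k .
\]
```

No parametric form is assumed for the f_k; identifiability instead comes from the grouped structure, under suitable conditions.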

As an example of this problem setting, let's consider topic modelling. In topic modelling a mixture component is a distribution over words, representing a "topic." A "collection of samples known to come from the same component" could be a document, say a tweet, under the assumption that the tweet is generated from a single topic. The words in this example are discrete, and this discrete topic modelling setting has already seen a fair amount of investigation. One could, however, use a powerful word embedding (e.g. BERT) to transform the discrete words into real-valued vectors, yielding a continuous version of the topic model that fits the "nonparametric mixture model" setting. The geometry of the embedded words encodes abstract meaning that is unavailable in the discrete data, where the only meaningful relation between two words is whether or not they are equal. A mixture component in this case is a probability distribution over the word embedding space, i.e. a probability distribution over abstract concepts.
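Here is a minimal sketch of how documents become grouped continuous samples in this continuous topic model. The `embed` function below is a hypothetical stand-in (a random embedding table) for a pretrained encoder like BERT, so that the sketch stays self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {w: i for i, w in enumerate(["cats", "purr", "stocks", "fell", "sharply"])}
E = rng.normal(size=(len(VOCAB), 8))  # hypothetical embedding table

def embed(word):
    """Stand-in for a pretrained word embedding: word -> real-valued vector."""
    return E[VOCAB[word]]

tweets = [["cats", "purr"], ["stocks", "fell", "sharply"]]

# Each tweet becomes a group of vectors, all assumed to be drawn from a single
# mixture component (one "topic" as a density over the embedding space).
groups = [np.stack([embed(w) for w in tweet]) for tweet in tweets]
print([g.shape for g in groups])  # [(2, 8), (3, 8)]
```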

Beyond topic modelling, there exist applications for nonparametric mixture modelling in settings where one expects subjects to be grouped, but the grouping only becomes apparent through repeated observations. This occurs, for example, in psychometrics, where a condition like depression manifests in data collected over a period of time rather than in a single observation.

I study both the algorithmic and theoretical aspects of nonparametric mixture modelling.

Relevant Works:
[Figure: our method recovers arbitrary mixture model components so long as they are identifiable, enabling provably correct clustering even when the cluster distributions overlap.]

Low-Rank Nonparametric Methods

Low-rank methods have proven useful for estimation problems like matrix completion and linear regression. I am interested in extending this intuition to the nonparametric setting, starting with nonparametric density estimation. Thus far I have found that techniques used previously for finite-dimensional problems like matrix completion do not extend to infinite-dimensional problems like nonparametric density estimation. Low-rank nonparametric methods will require their own novel algorithms and analysis.
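To illustrate what "low rank" can mean for a density (a toy example of mine, not a method from my papers): a mixture of a few product densities, once binned into a 2D histogram, is approximately a low-rank matrix, so a truncated SVD recovers most of its structure.

```python
import numpy as np

rng = np.random.default_rng(0)
# A mixture of two product (axis-aligned Gaussian) densities has rank ~2:
# f(x, y) = w1*g1(x)*h1(y) + w2*g2(x)*h2(y).
a = rng.normal([-2.0, -2.0], 0.5, size=(2000, 2))
b = rng.normal([+2.0, +2.0], 0.5, size=(2000, 2))
X = np.vstack([a, b])

H, _, _ = np.histogram2d(X[:, 0], X[:, 1], bins=30, density=True)

U, s, Vt = np.linalg.svd(H)
H_rank2 = (U[:, :2] * s[:2]) @ Vt[:2]  # keep the top two singular pairs

print("top singular values:", np.round(s[:4], 3))  # first two dominate
print("relative error of rank-2 fit:",
      round(np.linalg.norm(H - H_rank2) / np.linalg.norm(H), 4))
```

The finite-dimensional tools stop here, though: carrying this intuition over to the underlying infinite-dimensional density, with estimation guarantees, is exactly where new algorithms and analysis are needed.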

Relevant Works:

Other Misc. Topics

I have worked a bit on a few other research topics; a couple are listed here.