At this year’s ICML, some interesting work on Neural Processes was presented. In this blog post, I discuss what Neural Processes are and how they behave as a prior over functions.


Dimensionality reduction is a key step towards gaining insights into complex high-dimensional data. However, real-life data often exhibits strong structure that is known to us a priori, typically in the form of covariate information (e.g. continuous measurements, class labels, or censored survival times). Here, we propose the covariate-GPLVM, which learns a covariate-adjusted low-dimensional representation that reveals meaningful latent structure shared across different class labels or covariate values.
Preprint (2018)

Here, we introduce an augmented ensemble MCMC technique to improve on existing poorly mixing samplers for factorial HMMs. This is achieved by combining parallel tempering and an auxiliary variable scheme to exchange information between the chains in an efficient way.
Accepted to AISTATS (2019)
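
As a rough illustration of the parallel tempering ingredient mentioned above, here is a minimal sketch of generic parallel tempering on a toy bimodal target. It is not the augmented ensemble sampler from the paper: the target density, the temperature ladder `betas`, the step size, and the function names are all assumptions made for this example, and the auxiliary variable scheme is not shown.

```python
import math
import random


def log_target(x):
    """Unnormalised log density of a toy mixture of N(-3, 0.5^2) and N(3, 0.5^2)."""
    a = -0.5 * ((x + 3.0) / 0.5) ** 2
    b = -0.5 * ((x - 3.0) / 0.5) ** 2
    m = max(a, b)
    # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    """Metropolis accept/reject given a log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(n_iter, betas, step=1.0, seed=0):
    """Run one chain per inverse temperature; return samples of the beta = betas[0] chain."""
    rng = random.Random(seed)
    states = [0.0] * len(betas)
    samples = []
    for _ in range(n_iter):
        # Within-chain update: random-walk Metropolis on the tempered target pi(x)^beta.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            log_ratio = beta * (log_target(proposal) - log_target(states[k]))
            if accept(log_ratio, rng):
                states[k] = proposal
        # Between-chain update: propose swapping the states of a random adjacent pair.
        k = rng.randrange(len(betas) - 1)
        log_ratio = (betas[k] - betas[k + 1]) * (
            log_target(states[k + 1]) - log_target(states[k])
        )
        if accept(log_ratio, rng):
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])
    return samples


# Hypothetical temperature ladder; betas[0] = 1.0 is the chain targeting the original density.
samples = parallel_tempering(n_iter=20000, betas=[1.0, 0.5, 0.2, 0.05])
```

The hotter chains (small `beta`) see a flattened target and move between modes easily; the swap moves are what pass that information down to the `beta = 1` chain, which on its own would tend to stay stuck in one mode.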

Here, we explore whether accurate complex trait predictions can be achieved in practice. Using a genome-sequenced population of ∼7,000 yeast strains of high but varying relatedness, we consider a variety of models for predicting growth traits from several sources of information (family information, genetic variants, and growth in other environments).
In Nature Communications (2016)

Here, we propose a statistical method for identifying differentially methylated regions, based on the minimum description length (MDL) principle. Our method is available as an R package.
In Bioinformatics (2016)

No-U-Turn sampler

NUTS implementation in R

Non-parametric mixture models

Rcpp implementation for DP and MFM mixtures

Bayesian logistic regression via Polya-Gamma latent variables

R package implementing the Polya-Gamma augmentation scheme


Course on Data Science and Visualisation

Here you can find course material (in Estonian) on Data Science and Visualisation, which we created together with Tanel Pärnamaa. The course, “Statistiline andmeteadus ja visualiseerimine” (“Statistical Data Science and Visualisation”), is centered around a number of interesting case studies, and it focuses on teaching good data science practices in R by applying statistical methods to solve real-life problems.

