Bayesian variable selection

Stage 3 project, 2024/25

Supervisor: Darren Wilkinson

Project outline

Bayesian variable selection (BVS) addresses the commonly encountered problem of deciding, in a Bayesian manner, which variables to include in a statistical model on the basis of whether or not they are useful. Since every subset of variables corresponds to a different model, this is a model selection problem. However, if there are p candidate variables, there are 2^p possible models, so explicit evaluation of the posterior probability of every model is rarely practical. Nevertheless, a Bayesian analysis proceeds by placing a prior distribution over models and model parameters, and computationally intensive methods are typically employed to explore the resulting posterior distribution. Several quite different approaches can be taken to this problem, and the potential applications are many and varied.
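To make the exhaustive-enumeration idea concrete, here is a small sketch (in Python rather than R, purely for illustration) that scores all 2^p subsets of a toy linear regression. The simulated data, the choice of Zellner's g-prior with g = n, and the uniform prior over models are my own illustrative assumptions; the closed-form Bayes factor against the null (intercept-only) model is the standard g-prior expression. The point is that this brute-force loop is feasible for p = 4 (16 models) but hopeless for large p.

```python
# Exhaustive Bayesian model comparison for a toy linear regression.
# Each of the 2^p subsets is scored via the closed-form log Bayes factor
# against the null (intercept-only) model under Zellner's g-prior with
# g = n, then normalised under a uniform prior over models.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, 3.0, 0.0])  # variables 0 and 2 are useful
y = X @ beta_true + rng.normal(size=n)

g = n
yc = y - y.mean()                  # centre to absorb the intercept
tss = yc @ yc

def log_bf(subset):
    """Log Bayes factor of the model with these predictors vs the null model."""
    k = len(subset)
    if k == 0:
        return 0.0
    Xs = X[:, list(subset)] - X[:, list(subset)].mean(axis=0)
    coef, *_ = np.linalg.lstsq(Xs, yc, rcond=None)
    resid = yc - Xs @ coef
    r2 = 1.0 - (resid @ resid) / tss
    return 0.5 * (n - k - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

models = [s for r in range(p + 1) for s in itertools.combinations(range(p), r)]
logs = np.array([log_bf(m) for m in models])
post = np.exp(logs - logs.max())
post /= post.sum()                 # posterior model probabilities

for m, pr in sorted(zip(models, post), key=lambda t: -t[1])[:3]:
    print(m, round(float(pr), 3))  # the top models should contain variables 0 and 2
```

With 20 variables the same loop would have to score over a million models, which is why the MCMC-based approaches listed below are needed.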

Potential areas for more in-depth study

  • Discrete spike-and-slab regression in the style of Kuo and Mallick
  • Gibbs variable selection (GVS)
  • Reversible jump MCMC for BVS
  • Continuous shrinkage priors
    • The Bayesian LASSO
    • Global-local shrinkage
    • Regularised horseshoe priors
    • Non-local priors
  • Mixing of MCMC algorithms for various shrinkage/selection priors
  • BVS in probabilistic programming languages
  • Modelling sparsity
    • Applications to tractable and intractable regression models
    • Graphical model selection
    • Applications to multivariate time series and Granger causality
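As a flavour of the first topic above, here is a minimal Gibbs sampler in the spirit of Kuo and Mallick's indicator-variable approach (again sketched in Python for illustration). Each coefficient beta_j is paired with an indicator gamma_j, the regression mean is X @ (gamma * beta), and gamma_j is updated from its Bernoulli full conditional given the current beta_j. The priors and hyperparameter values (tau2, the inverse-gamma parameters, the 0.5 inclusion probability) are my own illustrative choices, not prescribed by the paper, and the block update of the active coefficients is a common simplification.

```python
# Toy spike-and-slab Gibbs sampler with explicit inclusion indicators,
# in the spirit of Kuo and Mallick (1998). Illustrative priors:
# beta_j ~ N(0, tau2), gamma_j ~ Bern(0.5), sigma2 ~ InvGamma(a, b).
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 4
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, 3.0, 0.0])  # variables 0 and 2 are useful
y = X @ beta_true + rng.normal(size=n)

tau2, a, b, prior_pi = 4.0, 1.0, 1.0, 0.5
iters, burn = 2000, 500

beta = np.zeros(p)
gamma = np.ones(p, dtype=int)
sigma2 = 1.0
keep = np.zeros(p)

for it in range(iters):
    # beta | gamma, sigma2: active coefficients get a joint normal full
    # conditional; inactive ones are refreshed from the prior, since the
    # likelihood does not involve them.
    active = np.flatnonzero(gamma)
    if active.size:
        Xa = X[:, active]
        V = np.linalg.inv(Xa.T @ Xa / sigma2 + np.eye(active.size) / tau2)
        m = V @ (Xa.T @ y) / sigma2
        beta[active] = rng.multivariate_normal(m, V)
    inactive = np.flatnonzero(gamma == 0)
    beta[inactive] = rng.normal(0.0, np.sqrt(tau2), size=inactive.size)

    # gamma_j | rest: Bernoulli, comparing the residual sum of squares
    # with gamma_j = 1 versus gamma_j = 0 at the current beta_j.
    for j in range(p):
        r0 = y - X @ (gamma * beta) + gamma[j] * beta[j] * X[:, j]  # j excluded
        r1 = r0 - beta[j] * X[:, j]                                  # j included
        log_odds = np.log(prior_pi / (1 - prior_pi)) - (r1 @ r1 - r0 @ r0) / (2 * sigma2)
        log_odds = np.clip(log_odds, -30.0, 30.0)  # avoid overflow in exp
        gamma[j] = int(rng.random() < 1.0 / (1.0 + np.exp(-log_odds)))

    # sigma2 | rest: conjugate inverse-gamma update.
    resid = y - X @ (gamma * beta)
    sigma2 = 1.0 / rng.gamma(a + n / 2, 1.0 / (b + resid @ resid / 2))

    if it >= burn:
        keep += gamma

incl = keep / (iters - burn)  # posterior inclusion probabilities
print(np.round(incl, 2))
```

The posterior inclusion probabilities for the two useful variables should sit near 1, while those for the two noise variables stay well below. The well-known weakness of this scheme, and a possible discussion point, is that an inactive beta_j wanders under its prior, so proposals to re-include a variable can mix poorly; this motivates alternatives such as Gibbs variable selection and reversible jump MCMC listed above.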

Pre-requisites

You should have taken Statistical Inference II (MATH2711) at Stage 2, and will need to be taking Bayesian Computation and Modelling III (MATH3421) at Stage 3 as a co-requisite. You must be comfortable with programming in R.

Some relevant resources

Books

Papers

Blog posts, Wikipedia pages, R packages