IBD Bayesian Pipeline

Two-stage Bayesian pipeline for IBD diagnosis from high-dimensional metagenomic data, combining horseshoe shrinkage for variable selection with polynomial logistic regression on synthetic cohorts.

Overview

Capstone project implementing a two-stage Bayesian approach to predicting inflammatory bowel disease (IBD) from microbial sequencing data. Stage 1 applies a horseshoe prior for aggressive variable selection via posterior inclusion probabilities. Stage 2 feeds the selected variables into a polynomial logistic regression classifier. The pipeline is built on synthetic cohort data and preprocesses abundances with a log-ratio transformation suited to compositional microbiome data. It also includes cross-validation benchmarking against baseline models and a Streamlit web app for interactive model inference.
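As a rough illustration of the preprocessing and Stage 2 steps, here is a minimal sketch using NumPy and scikit-learn. It is not the project's actual code. It assumes the log-ratio step is a centered log-ratio (CLR) transform, a degree-2 polynomial expansion, and a Poisson-simulated cohort; the `clr_transform` helper, the pseudocount, and the `selected` index list are all hypothetical. Stage 1 (the horseshoe model) is not implemented here, so a hard-coded list of feature indices stands in for its output.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a samples-by-taxa count matrix."""
    # The pseudocount (an assumed value) avoids log(0) for absent taxa.
    log_counts = np.log(np.asarray(counts, dtype=float) + pseudocount)
    # Subtracting each sample's mean log value makes every row sum to zero.
    return log_counts - log_counts.mean(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Synthetic cohort for illustration only: 200 samples, 50 taxa.
n_samples, n_taxa = 200, 50
labels = rng.integers(0, 2, size=n_samples)
rates = np.full((n_samples, n_taxa), 20.0)
# Assumption: the first three taxa are enriched in cases.
rates[labels == 1, :3] *= 3.0
counts = rng.poisson(rates)

features = clr_transform(counts)

# Stand-in for Stage 1: indices a horseshoe model might retain (hypothetical).
selected = [0, 1, 2, 3, 4]

# Stage 2: polynomial expansion followed by logistic regression.
stage2 = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    LogisticRegression(max_iter=1000),
)

# 5-fold cross-validation, scored by ROC AUC.
scores = cross_val_score(
    stage2, features[:, selected], labels, cv=5, scoring="roc_auc"
)
print(f"Mean 5-fold ROC AUC: {scores.mean():.3f}")
```

Wrapping the scaler and polynomial expansion in one pipeline means they are refit inside each cross-validation fold, which keeps held-out samples from leaking into preprocessing. In a full pipeline the variable selection step would need the same treatment.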

Technical Specs

Timeline
2025
Stack
Python, NumPy, Streamlit, scikit-learn