



leo@joseph:~$
Leo Joseph — BS/MS Bioengineering: Bioinformatics · UC San Diego
leo@joseph:~$ cat tagline.txt
I build computational tools at the intersection of deep learning and genomics: microbiome analysis, cancer genomics, and multi-omics integration.
Outside the lab: coffee, cameras, and athletics.
leo@joseph:~$ echo $LOCATION
San Diego, CA
leo@joseph:~$




leo@joseph:~$
[01] Evo2 Fine-Tuning Python · PyTorch
↳Leading fine-tuning of Evo2, a 40B-parameter genomic foundation model, on PacBio HiFi long-read gut metagenomic data for strain-level functional prediction and HGT network reconstruction. Ported BioNeMo training to AMD ROCm on an MI300A HPC cluster. Master's thesis project.
40B-param genomic foundation model · AMD MI300A
↳Case-control matching tool for microbiome studies. Demonstrated an 11.7% increase in IBD effect size in the American Gut Project and a 13.5% improvement on HMP2, with substantially lower variance. Built with Scikit-Bio and QIIME 2.
11.7% IBD effect-size increase (R² 1.38→1.54) · SD 0.30%→0.02%
↳QIIME2 plugin providing VAE-based mechanistic interpretability for metagenomic and transcriptomic data. Identified a sparse signature of 96 taxa characteristic of IBD from over 3,000 input features. Enables biologically interpretable dimensionality reduction.
96-taxon IBD signature from 3,000+ features
↳Nextflow workflow for microbial characterization from cancer sequencing data. Integrates human read filtration with taxonomic profiling via KrakenUniq and MetaPhlAn4. Analyzed samples across colorectal cancer, esophageal squamous cell carcinoma, and other cancer types.
2,000+ cancer samples across CRC, ESCC, and other cancer types
↳ML model detecting homologous recombination deficiency (HRD) from RNA-seq data in breast and ovarian cancer. Autoencoder-based interpretability identifies gene signatures in RNA-seq panels that are linked to positive patient survival outcomes.
AACR Annual Meeting 2026 · UCSD BMES Bioengineering Day 2025
↳Microbiome-specific knowledge graph RAG system. Extracts entities (microbes, genes, metabolites, diseases) and relationships from research literature, enabling semantic querying and citation-grounded answers for metagenomic study design.
Citation-grounded RAG over microbiome literature
[07] CNV Transformer Python · transformer
↳Deep learning transformer for copy number variant detection from whole-exome sequencing data. Transfer-learning fine-tuned for somatic CNV calling in cancer samples. Integrates Parascopy for paralog-specific copy number estimation.
700+ 1000 Genomes samples · outperforms existing WES methods
[08] ASD Microbiome Pipeline Python · metagenomics
↳Cross-sectional shotgun metagenomics pipeline characterizing microbial composition, diversity, and differential abundances in Autism Spectrum Disorder fecal samples. Integrates BIRDMAn, Qiita, and QIIME2 with ML for pre-clinical diagnostic potential.
Co-authored poster · Society of Biological Psychiatry 2025 · Toronto
leo@joseph:~$
↳Interactive Differential Expression Analysis. Python package for differential expression analysis on gene expression data, designed as a Python equivalent to DESeq2. Supports standard RNA-seq workflows with visualization and statistical testing.
Python equivalent to DESeq2
↳Warehouse location optimizer using Particle Swarm Optimization. Finds optimal placement for a set of warehouses given stores and residential areas, balancing minimum distance from residential zones against maximum distance to stores.
Particle Swarm Optimization for warehouse placement
leo@joseph:~$
[poster · AACR Annual Meeting 2026] Joseph L, Rahman D, Madakamutil Y, Abbasi A, Alexandrov LB
Leveraging transcriptomic profiles and deep learning to detect homologous recombination deficiency in breast cancer with softHRD
AACR Annual Meeting 2026 · San Diego, CA · 2026
[poster · Biological Psychiatry 97(9):S128] Carlson AL, Patel L, Joseph L, Lopez L, Andreason C, Barnes CC, Arias S, Courchesne E, Knight R, Pierce K
Gut Microbiome Dysbiosis and Metagenomic Kynurenine Pathway Enrichment in Early Autism Spectrum Disorder
Biological Psychiatry 97(9):S128 · Society of Biological Psychiatry · Toronto, ON · 2025
leo@joseph:~$
{
"languages" : [ Python, R, Bash, C, C++, Java, JavaScript, Rust, SQL ],
"ml_and_data_science" : [ PyTorch, TensorFlow, scikit-learn, JAX ],
"bioinformatics" : [ QIIME2, BLAST, Bowtie, BWA, DESeq2, GATK, SAMtools, STAR, BIRDMAn, HMMER ],
"workflow_and_infra" : [ Nextflow, Snakemake, SLURM, Docker, Kubernetes, AWS, Azure, Git ],
"visualization" : [ ggplot2, matplotlib, seaborn, D3.js ],
"databases_and_web" : [ MongoDB, Flask, React, SvelteKit ]
}
leo@joseph:~$
I'm an MS Bioinformatics student at UC San Diego. I build computational tools that turn raw sequencing data into biological insight.
At the Knight Lab I develop deep learning methods to unlock new metagenomic analyses, with a focus on ASD and IBD. At the Alexandrov Lab I work on cancer genomics, developing methods for characterizing tumor biology from sequencing data.
My focus is on designing novel deep learning architectures that make previously intractable biological questions answerable.
leo@joseph:~$ cat education.txt
Expected June 2027 MS Bioengineering: Bioinformatics · University of California, San Diego
June 2026 BS Bioengineering: Bioinformatics · University of California, San Diego · GPA: 3.7 · Citron-Chien Fellow
leo@joseph:~$ ls experience/research/
— Knight Lab · Prof. Rob Knight Dec 2023 – Present Microbiome, ASD, IBD
— Alexandrov Lab · Prof. Ludmil Alexandrov Jun 2024 – Present Cancer genomics, HRD detection
— Bansal Lab · Prof. Vikas Bansal Sep 2024 – Jun 2025 CNV detection from WES
leo@joseph:~$ ls experience/industry/
— Kaiser Permanente · Business Process Intern Jun – Sep 2023
— Clear Labs · R&D Intern Jun – Aug 2021
— Cisco · Software Engineer Intern Jun – Aug 2021
leo@joseph:~$