AI

Nearly everything here is built on the transformer, the architecture that now powers language, vision, audio, and speech models alike. I studied these models in school just as they were emerging, and what holds my attention now is less the architecture itself than what you can build on it: applications that retrieve, remember, and measure their own output rather than models in isolation. The projects cover the areas I keep coming back to: speech and natural language, retrieval-augmented generation, computer vision, and making models cheaper to adapt and serve with techniques like LoRA, along with notes on what I learn as I go.

Machine learning

Decision-focused machine learning on tabular data, where the constraints are proof, fairness, and governance rather than scale.

Audited against planted bias

A credit risk model whose fairness audit can be proven to work

Replacing a bank's scoring model is a proof problem: the challenger must beat the incumbent, survive a fair-lending audit, and explain every decline. This project builds that whole arc on a simulated portfolio of 800,000 applications in which protected attributes are causally inert and bias is injected deliberately, so the fairness suite is tested against ground truth. It has to catch the planted redlining proxy, and it does, rejecting the challenger that has the best Gini. The mitigated champion keeps a +0.102 Gini lift over the incumbent scorecard, worth an estimated $13.8M a year in avoided default losses at a constant approval rate. Declined applicants get SHAP reason codes that sum to their actual score margin, and the governance set (model card, SR 11-7 style validation, monitoring plan) ships with it. A twelve-model zoo on the approved features maps the rest of the trade space: deep models tie the boosted trees, a glass-box EBM recovers 97% of the lift, and regulator-friendly monotone constraints cost 0.053 Gini.
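For readers new to the metric, here is a minimal sketch of what a Gini lift measures. It assumes the usual credit-scoring definition, Gini = 2 × AUC − 1, which the write-up does not spell out. The scores and outcomes are toy values, not the project's data, and the names `auc`, `gini`, `incumbent`, and `challenger` are illustrative.

```python
from itertools import product


def auc(scores, defaulted):
    """Chance that a random defaulter is scored riskier than a random
    non-defaulter, counting ties as half a win."""
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum(
        1.0 if b > g else 0.5 if b == g else 0.0
        for b, g in product(bad, good)
    )
    return wins / (len(bad) * len(good))


def gini(scores, defaulted):
    """Assumed definition: Gini = 2 * AUC - 1 (0 is random, 1 is perfect)."""
    return 2.0 * auc(scores, defaulted) - 1.0


# Toy portfolio: 1 means the applicant defaulted. Higher score = riskier.
defaulted = [1, 1, 0, 0]
incumbent = [0.9, 0.3, 0.5, 0.1]   # ranks one defaulter below a good payer
challenger = [0.9, 0.6, 0.5, 0.1]  # ranks both defaulters on top

lift = gini(challenger, defaulted) - gini(incumbent, defaulted)
```

On this toy data the incumbent gets three of the four defaulter/non-defaulter pairs in the right order and the challenger gets all four, so `lift` is the gap between the two Gini values. A lift such as +0.102 is read the same way, as a difference between two models scored on the same applications.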

LightGBM, FT-Transformer, SHAP, Fair lending, FastAPI, Python
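The claim that reason codes sum to the score margin can be illustrated with a small sketch. It is not the project's implementation: it uses a hypothetical linear log-odds model, for which SHAP values (assuming independent features and a portfolio-average baseline) reduce to weight × (value − baseline). The feature names, weights, and baseline are invented, and "margin" is taken here to mean the gap between the applicant's score and the baseline score.

```python
import math

# Hypothetical linear model on the log-odds of default (higher = riskier).
WEIGHTS = {"utilization": 2.0, "delinquencies": 0.4, "income_k": -0.01}
INTERCEPT = -3.0
# Assumed baseline: the portfolio-average applicant.
BASELINE = {"utilization": 0.25, "delinquencies": 0.0, "income_k": 60.0}


def score(applicant):
    """Raw model score: intercept plus weighted features."""
    return INTERCEPT + sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)


def contributions(applicant):
    """Per-feature attribution relative to the baseline applicant."""
    return {f: WEIGHTS[f] * (applicant[f] - BASELINE[f]) for f in WEIGHTS}


def reason_codes(applicant, top_n=2):
    """Features that pushed the score toward decline, largest first."""
    adverse = [(f, c) for f, c in contributions(applicant).items() if c > 0]
    adverse.sort(key=lambda item: item[1], reverse=True)
    return [f for f, _ in adverse[:top_n]]


applicant = {"utilization": 0.75, "delinquencies": 2.0, "income_k": 40.0}

margin = score(applicant) - score(BASELINE)
total = sum(contributions(applicant).values())

# Additivity check: the attributions account for the whole margin.
assert math.isclose(total, margin)
```

The additivity check is the point: whatever is reported to the applicant adds up to the distance between their score and the baseline, with nothing left unexplained. For tree ensembles the per-feature values would come from a SHAP explainer rather than this closed form, but the same sum-to-margin property is what gets verified.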

Notes

Write-ups on models, papers, and coursework I'm working through.