About

I’m a data scientist and engineer with particular skills in data cleaning, visualization, data pipeline design, and writing production-quality code. I’m currently a fellow at the competitive Insight Data Science program.

I earned my PhD in computational genetics at the University of Washington, where I worked with TB-scale DNA sequence data on compute clusters to investigate the ways that natural selection and population size changes have affected human genetic diversity.

I also spent five years as a research scientist engineer in UW’s Biostatistics department, using linear regression to identify genetic differences that contribute to human health metrics (e.g., cholesterol levels, smoking behavior). In addition to running data analyses, I built databases, apps, and custom packages for combining data from multiple existing research studies. I led a team of research scientists that produced cleaned and QCed datasets in support of dozens of publications.

I have a passion for Python, R, statistics and machine learning, visualization, automating analysis workflows, well-commented code, keyboard shortcuts, and productivity tools.

LinkedIn · GitHub · Google Scholar · ORCID
