RESUME
EDUCATION
2016-Present: Ph.D. Dept. of Computer Science, Johns Hopkins University
- Advisor: Ben Langmead
- Research Interests: Computational Genomics, Next Generation Sequencing, Text Indexing, Succinct Data Structures, Big Data, Sketching.
2012-2016: B.S. Tufts University
Majors: Computer Science and Biology
WORK EXPERIENCE
2016-Present: Johns Hopkins University. Baltimore, MD
Research Assistant
- Develop efficient and scalable tools for analysis of next generation sequencing reads (see “Projects”).
- Published 3 papers (2 as first author) and presented at 3 conferences (see “Publications”).
- 2 semesters as teaching assistant for “Computational Genomics: Sequences”.
Summer 2020: Illumina, Inc. San Diego, CA
Bioinformatics Intern
- Developed tool to detect false positives in population-scale database of thousands of structural variants.
- Leveraged information from short-read (Illumina) and long-read (PacBio HiFi/CCS) technologies.
- Added initial support for extending to short tandem repeats (STRs).
- Presented research findings internally to multiple departments.
Summer 2016: Berg Health. Framingham, MA
Analytics Intern
- Developed web portal to streamline analysis of multi-omics data (PHP, R).
- Created NLP machine learning tool to extract information from clinical health records (Python).
2012-2016: Tufts Academic Resource Center. Medford, MA
On-Call Tutor
- Duties: 3 one-on-one tutoring sessions every week with beginner comp. sci. students.
- Topics covered: programming basics, data structures, discrete mathematics, algorithms, C/C++.
PROJECTS
rowbowt (C++) : Query large, repetitive genomic collections quickly with space sublinear to input size.
Collaboration with researchers across multiple universities.
https://github.com/alshai/rowbowt/
levioSAM (C++) : Lift over alignments from variant-aware alternate references.
https://github.com/alshai/liftover
Personal Genome Constructor : Use low-coverage imputation to improve NGS read alignment accuracy and alleviate reference bias in downstream analyses (e.g. variant calling, allele-specific expression).
Draws upon alignment data from SRA and variant data from the 1000 Genomes Project.
https://github.com/alshai/genome_inference/
pfbwt-f (fork) (C++) : Efficiently build a Burrows-Wheeler Transform from a highly repetitive sequence.
https://github.com/alshai/pfbwt-f
varcount (C++) : Calculate NGS alignment coverage over a predefined set of variants.
https://github.com/alshai/varcount
VOLUNTEER EXPERIENCE
Illumina Cares
- Led team of fellow interns from a wide range of disciplines to provide a non-profit with a playbook for teaching STEM and professional development skills to students in Zimbabwe.
SKILLS
Programming Languages: Bash, C++, C, LaTeX, Python, R, Rust
Tools/Frameworks: Unix, Docker, HPC (SLURM), tidyverse, NumPy/SciPy, Snakemake