My journey to pursuing a PhD in Computer Science
All the preconceived notions I held about computer science at UCLA were thrown out the window after the first year of classes. From dissecting assembly code to developing a router in C, I was exposed to a broad range of topics that led me to discover that software engineering is only a minuscule part of computer science. To open myself to the true range and potential of computing, I discovered an alternate path in academic research. Through genomics, I was able to set a clear direction for my path in computer science: harnessing the unprecedented scalability and precision of computational methods to develop quantitative solutions in biomedicine.
Intro to academia: finding faculty and research collaborators (ZarLab UCLA, Mangul Lab USC)
Having had little to no idea of UCLA’s opportunities in computer science research, I consulted my engineering faculty advisor, Dr. Eleazar Eskin. Because he specialized in computational genomics, I learned through him about undergraduate research programs, scholarship opportunities, and course tracks that would tailor my degree to the field. With the benefit of hindsight, here are tips that I found helpful in getting a head start on research:
Searching for a faculty advisor. Professors are often incredibly busy with their schedules, and not all of them are guaranteed to accommodate undergraduates in their lab. I was fortunate to have found Dr. Eskin’s ZarLab quite early on in my undergrad at UCLA. In order to forge my path in academia, I would refer to the following resources:
- Broadening your network. Through my first research project at ZarLab, I met Dr. Serghei Mangul as my postdoc mentor. Dr. Mangul soon joined USC as an assistant professor at the School of Pharmacy. I continue to seek research opportunities and helpful professional advice from him to this day. To expand your network in academia, you can reach out to different faculty within the same department or partake in summer research programs at different institutions.
- Fully funded research programs. Faculty advisors who participate in funded research programs, such as NSF REUs, are most likely to support undergraduate research. I worked full-time under Dr. Eskin and Dr. Mangul as a funded student researcher in the Bruins in Genomics (B.I.G) Summer Institute program. You can read more about my experiences at B.I.G Summer and discovering bioinformatics here:
- Faculty advising hours. UCLA Engineering required professors to dedicate biweekly hours to advising students one-on-one. I used these hours to meet Dr. Eskin and learn more about opportunities in bioinformatics, and he helped me get started on a research project right away during my first year at UCLA.
- Find the best method of communication. Some professors are simply more available on other platforms such as Slack. Make sure to ask which way is best to contact them. Attending regular lab meetings is also a reliable way to get face time with your professor.
Publications: navigating the unpredictable road
While working under Dr. Eskin, I joined a project on benchmarking structural variant callers. Dr. Mangul, my postdoc mentor at the time, guided me and my fellow undergraduate collaborators through the publishing process.
I learned quickly that the road to publication wasn’t straightforward. Despite working on the structural variant project throughout my entire undergrad and the preprint being finished by my third year, the paper was not accepted until the summer after I graduated from UCLA:
My Bioinformatics Journey: B.I.G Summer Institute 2019
I was also fortunate to join another research project — benchmarking error-correction methods — which was successfully published in Genome Biology:
A comprehensive benchmarking of WGS-based deletion structural variant callers
Having undertaken two separate research endeavors, I learned the following from Dr. Mangul’s mentorship on overcoming the unpredictability of publishing research:
- Overlapping literature. Ensure that there are no published papers on the same subject matter. A year or two into the structural variant project, another benchmarking effort was accepted at Genome Biology, which prompted us to reconsider our approach.
- Differentiability. What unique factors would make your work stand out? It is best to understand the demands of the field and how your research would help it progress. Discussing ideas with fellow researchers at conferences, and taking feedback from journal reviewers, helps ensure that your work stands out from the rest.
- Drafts don’t have to be bullet-proof. It is perfectly understandable to be rejected multiple times. In fact, submitting drafts and receiving feedback early on could help you work out the necessary areas of improvement that should be covered so that you make a robust paper.
- Make your work presentable. Besides submitting drafts, you can present your work at conferences and use the discussion and questions to test your research in the eyes of your peers. Clear presentation and communication of your ideas are essential to this process. The same is true for the paper itself, as the abstract and important figures must effectively demonstrate your key ideas.
Internships: undergrad research as a window into the industry
Toward the end of my first year, I took upper-division computer science courses in bioinformatics at UCLA. Getting a head start on learning advanced algorithms and tools improved my capabilities as a computer scientist and researcher. I was able to apply valuable skills from bioinformatics (Bash, Python, Jupyter Notebooks, high-performance computing) to academic coursework. For instance, I trained a neural network on a large dataset for my natural language processing class much faster by running parallel jobs on UCLA’s GPU clusters. The same skill set would prove to be essential in industry.
My unique skill set and experience in bioinformatics research led me to pursue a summer internship opportunity at Illumina, the current industry leader in DNA sequencing. The bioinformatics intern position was a perfect fit for my research background.
The hiring process. While the usual rule of thumb is to keep your resume limited to a single page for industry, I decided to submit a longer CV-style resume that emphasized the following:
- Bioinformatics research publications
- NSF REU summer experience
- UCLA bioinformatics minor + computer science major coursework
- Keywords that are relevant to bioinformatics (genomics, next-generation sequencing, structural variants)
After passing the resume screening, I was given a HireVue to record audio answers to interview questions. The questions were a mix of background questions (resume review) and conceptual questions on bioinformatics and machine learning. The final round was a live interview with the team. Unlike software engineering interviews, there were no coding questions. Rather, the questions tested my ability to think on the spot and communicate my answers coherently using math, stats, and computer science concepts.
The tools I used in my research translated extremely well to my position as a bioinformatics intern on the Illumina Primary Analysis (IPA) team. As the team consisted solely of senior scientists with PhDs, I was absorbed into a research-oriented and collaborative work environment. It was inspiring to learn from their personal experiences as PhD graduates and how they combined their expertise from different STEM fields to solve complex problems. At Illumina, I drew on my prior experience in researching error correction methods to engineer new computational tools for addressing sequence-specific errors and bias in primary analysis. I also got my first taste of machine learning visualization through t-SNE dimensionality reduction, which I used to measure how primary metrics such as error rate and chastity would impact sequencing errors.
Keep in mind that, as a PhD applicant, your research experience is valued far more than your internships. Internships mainly serve to build a career in industry, unless the position is highly research-oriented and tailored to your subfield of interest.
Why PhD?
After a broad mix of industry and research experiences, I decided to head straight into a PhD program after graduation. Based on advice from my research professors, advisors, and industry mentors, a PhD was the right choice for me for the following reasons:
- Specialization in a field. A bachelor’s degree provides a holistic view of the field rather than allowing students to delve deeper into the material. Pursuing a PhD also helps you gain access to more opportunities that require specialized experience applicable to your research field.
- Cultivating the research mindset. The PhD learning environment is ideal for taking the initiative and dedicating time to seek interesting problems.
- Academia in the long run. I quickly realized that a PhD would be mandatory to pursue a long-term career in academia. This especially applies if you ever plan on becoming a full-time professor.
- The best time to do research is during a PhD. The PhD is perhaps the only time in your career when you can truly dedicate yourself to independent research.
PhD Admissions and the NSF GRFP Fellowship
Finding and being accepted at a PhD program best suited for your needs is a completely different process. The advisor, program, and learning environment are the most important factors to consider. I recommend reaching out directly to faculty to determine program fit and availability. While professors can be too busy to respond to emails, it never hurts to try.
The ideal program should provide you with the proper resources and mentorship to guide you through your PhD. One of the most important resources is PhD funding. While master’s degree candidates are expected to pay tuition, PhD students are usually funded either by the university or by external fellowships. Funding provides peace of mind for students to dedicate themselves fully to research. Given that either the PhD advisor or the departmental program is expected to support you over the average five-year duration, you must ensure that your research advisor of interest is willing to accept PhD candidates.
Fortunately, receiving a graduate fellowship for your PhD studies could help circumvent these issues. The fellowship would reduce the financial burden on your program or advisor, and you would likely receive more funding than through university offers. One of the most well-known fellowships for PhDs is the NSF Graduate Research Fellowship Program (GRFP).
NSF Graduate Research Fellowship Program (GRFP). The GRFP distributes more than 2000 awards annually. You are allowed to apply twice: once as a non-PhD candidate (usually as a graduating senior at your undergraduate institution) and once as a first- or second-year PhD student. I would strongly advise using both windows of opportunity to apply for the GRFP. Applying as a non-PhD candidate can also make you a more attractive prospect in the admissions process. I highly recommend checking out the FAQ and Eligibility requirements on the official NSF website.
The NSF GRFP application is due before most if not all PhD applications. Use the GRFP to prepare your recommendation and application materials early for applying to PhD programs. For both GRFP and PhD applications, here are common requirements and things to look for when preparing your application:
- Three letters of recommendation. The reference letters should be able to discuss your qualities as an independent researcher at length. Recommendations should come from faculty you have conducted research under at your home institution or summer programs (refer to the earlier blog section on finding faculty).
- Intellectual Merit. Specify your research interest, how your experience has shaped those interests, and how the PhD will continue your trajectory. While it is not necessary to map out every detail of what you plan to do in your PhD, you must highlight a clear direction that allows the faculty to trust in your commitment and abilities.
- Broader Impacts. Exclusive to the GRFP application, you must detail how you have promoted STEM through your extracurricular activities. Such activities could include tutoring (honor societies), leading club organizations, and arranging career events and workshops. Local outreach activities that promote diversity within the STEM community would particularly stand out.
- Directness preferred. Your application addresses the faculty in your desired program or field. Hence, essays should take an informative, direct tone that speaks to what the admissions committee is looking for. Rather than being overly descriptive, delve deeper into what makes the research appeal to you.
Comments from GRFP reviewers. The NSF GRFP review process requires every reviewer to provide comments on the quality of the applications they receive. While the reviewers examining your fellowship application will most likely be different from mine, I noticed all three reviewers emphasized similar qualities that stood out to them:
- Overall background and qualifications. Academic record (GPA), research (publications, posters), scholarships, awards and internships were all heavily considered. Putting effort into the personal resume and making sure to cover all relevant experience is strongly recommended.
- Research plan. Based on my research statement, they commented on whether I held a deep understanding of the field (bioinformatics) and communicated my ideas clearly. GRFP reviewers don’t expect you to follow through with your plan, but a well-thought-out research statement provides evidence of clear direction.
What’s next: NSF GRFP Fellow, Columbia CS PhD
As a Computer Science PhD Candidate at Columbia under Dr. David Knowles, I look forward to delving deeper into applied machine learning methods in genomics. I am excited to engage in interdisciplinary initiatives that empower researchers of different backgrounds to solve the greatest open challenges in biomedicine. At the New York Genome Center, I hope to pursue research that sparks innovation at the intersection of computing and human health.
Thank you to Dr. Eleazar Eskin (UCLA Computer Science and Computational Medicine), Dr. Serghei Mangul (USC School of Pharmacy), Dr. Harold Pimentel (UCLA Computational Medicine), and Dr. Bo Lu (Illumina) for providing me the guidance and self-belief to jumpstart a career in research. I could not have come this far without your support.