Carnegie Mellon University

Computational Biology

Empowering bright young minds to unlock the power of computation to solve research problems at the frontier of modern biology. 

Program Length
June 22 to July 13, 2024 (3 weeks)
Early Decision & International Applications Due
February 1
Scholarship* & Regular Decision Applications Due
March 1
Housing Options

Resident or Commuter**

*Scholarship decision notifications released on Friday, April 5, 2024. All others are rolling admission.
**To be a commuter, the student and parent/guardian must have a permanent residence within approximately 30 miles of our Pittsburgh campus or within Allegheny County.
       

Program Overview


The Pre-College program in Computational Biology provides extensive training in both the cutting-edge laboratory experiments that generate biological data and the computational analysis of those data.

Computer science has revolutionized biology and medicine. Tomorrow's life scientists need deep knowledge of not only the laboratory techniques used for generating experimental data but also the rigorous computational techniques necessary to analyze and model these data. Pre-College Computational Biology offers an unparalleled experience for high school students to explore this relationship in a university setting.

Our work in the program focuses on answering big-picture biological questions about the microbes living in Pittsburgh’s three rivers as well as the ongoing COVID-19 pandemic. After sampling water from one of the rivers, students will use modern laboratory techniques to isolate the bacterial DNA from the water and break the DNA strands into millions of tiny fragments that are then read. What, then, should we do with all this information? This is where computational biology flies to the rescue.

Our program is structured to allow students to appreciate the inherent synergy between experimentation and computational analysis in modern biology. We will spend approximately half of each day of the program following a hackathon model, in which students will work in small groups to write programs solving computational problems, with hands-on guidance from the instructor and teaching assistants. Students will spend the other half of each day in the laboratory, conducting experiments to generate large datasets to be analyzed with student code.

Carnegie Mellon University is a leader in automated science and, as part of the experimental side of the program, students will get the chance to work in our automation lab. They will use robots to run experiments while learning how machine learning can be used in the design and execution of experiments.

Final projects at the close of the program allow students to present their work to peers, parents, guardians, and other guests. Example student projects can be found at our program homepage.

We are looking for students who love biology, have demonstrated proficiency in mathematics, and want a program that will teach them how computational approaches are fundamental to a complete understanding of modern biology.

Programming experience is not required. 

We do not require that students have experience in programming. Our preparatory materials give students the programming foundation they will need to be successful. (See “Programming Preparatory Materials” below.)

Curriculum

  • Bacterial colonization and genome sequencing
  • DNA extraction
  • Genome assembly
  • Polymerase chain reaction
  • Gel electrophoresis
  • 16S ribosomal RNA gene sequencing
  • Transformation (adding genes to E. coli)
  • Downstream genome analysis, such as gene finding
  • Sequence alignment and its applications to species identification, genome annotation, and gene comparison
  • Evolutionary tree construction
  • Metagenomics analysis

Sampling

  • How do we design an experiment to learn about microbes in the environment?
  • How are DNA sequence data generated?
  • How can you isolate and identify individual colonies of bacteria?

Microbiome diversity

  • How can we extract DNA from samples with a variety of organic material with different structures (viruses, plants, bacteria, other microorganisms)?
  • How can we use our knowledge of evolution and molecular biology to focus our experiments on studying bacteria?
  • How can we use sequence data to determine the diversity of microbes in the rivers?
  • How can we measure the difference between two samples? 
  • How can we determine what drives microbial diversity in river water?
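
As a rough illustration of the questions above, the sketch below computes two standard ecological measures: the Shannon index for the diversity within one sample, and the Bray-Curtis dissimilarity for the difference between two samples. These are common choices, not necessarily the ones used in the program, and the genus counts are hypothetical values invented for this example.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 0 for identical samples, 1 for no shared taxa."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 1 - 2 * shared / total

# Hypothetical read counts per genus (assumed data, for illustration only).
sample_one = {"Pseudomonas": 40, "Flavobacterium": 40, "Acinetobacter": 20}
sample_two = {"Pseudomonas": 10, "Flavobacterium": 90}

print(round(shannon_diversity(sample_one), 3))
print(round(bray_curtis(sample_one, sample_two), 3))
```

A higher Shannon index means the reads are spread more evenly across more taxa, while a Bray-Curtis value near 1 means the two samples share little of their composition.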

DNA comparison

  • How can we quantitatively determine the difference between two DNA strands containing only A’s, C’s, T’s, and G’s?
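
One way to make this question concrete is with two classic string distances, sketched below. The Hamming distance counts mismatched positions between equal-length strands, and the edit (Levenshtein) distance also allows insertions and deletions. This is only a minimal sketch of well-known measures, not the program's own method, and the function names are chosen here for illustration.

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length strands differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance requires strands of equal length")
    return sum(a != b for a, b in zip(s, t))

def edit_distance(s, t):
    """Minimum number of substitutions, insertions, and deletions turning s into t."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, 1):
        curr = [i]  # deleting the first i symbols of s yields the empty string
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # match or substitute
        prev = curr
    return prev[-1]

print(hamming_distance("GATTACA", "GACTATA"))
print(edit_distance("ACGT", "AGT"))
```

The edit distance is the more flexible of the two because real DNA strands often differ in length, but it costs time proportional to the product of the two strand lengths.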

Bacteria identification

  • How can we isolate bacteria in the laboratory?
  • From bacteria, how do we isolate DNA?
  • How can we match a DNA sequence to a database of known bacteria?
  • How can we use computational techniques to understand and characterize images of bacterial colonies?

SARS-CoV-2 application

  • How can we compare the SARS-CoV-2 genome against related viruses? Does it differ more in some genes than others?

Sequencing

  • How can we generate short fragments of DNA taken from an organism in a lab?

Genome reconstruction

  • How do we assemble our short strands of DNA and reconstruct them into a complete SARS-CoV-2 or bacterial genome?
  • Given a complete coronavirus or bacterial genome, how can we determine where the genes are?
  • Can we infer the function of a gene from only its sequence?
  • What genes are present in the coronavirus genome and what do they do?
  • What are the evolutionary relationships among bacteria in Pittsburgh’s rivers?
  • Can we use evolutionary trees of viruses sampled from patients to determine the origin of SARS-CoV-2 in the U.S.?
  • How can we visually compare multiple sequences to one another?
  • How can we quickly determine where mutations in the coronavirus occurred and use this to identify variants?
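
To give a flavor of the gene-finding question above, here is a deliberately simplified sketch that scans the forward strand of a genome for open reading frames (ORFs): stretches that begin with the start codon ATG and end at the first in-frame stop codon. Real gene finders are far more sophisticated. The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(genome, min_codons=3):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    Its length in codons counts both the start and the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(genome) - 2, 3):
            codon = genome[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

toy_genome = "CCATGAAATTTTAGGG"  # hypothetical sequence, for illustration only
for start, end in find_orfs(toy_genome):
    print(start, end, toy_genome[start:end])
```

A fuller version would also scan the reverse complement strand, since genes can lie on either strand of the DNA.
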

Programming Preparatory Materials

Admitted students will receive preparatory materials from Professor Compeau’s Programming for Lovers open education project in advance of the program to build fundamental programming skills.

IMPORTANT: Admitted students will be required to complete some assignments taken from this project before starting the program.

Eligibility and Application Requirements

To be eligible for Pre-College Computational Biology, students must: 
  • Be at least 16 years old by the program start date.
  • Be a current sophomore or junior in high school at the time of application submission. Please note: talented sophomores are encouraged to apply; however, most of our admitted students will be juniors.
  • Have an academic average of B (3.0/4.0) or better.
The complete application for Pre-College Computational Biology consists of the following:
  • Completed online application
  • Unofficial transcript
  • Standardized test scores (optional)
Standardized tests are not required. We assess applicants holistically and take into consideration many factors, including quantitative background and skill. Applicants can demonstrate this skill through mathematics coursework and/or the optional submission of PSAT, SAT, ACT, or SAT Subject Test scores.
  • One letter of recommendation
  • Responses to essay prompts
Essays are required for the following prompts (300-500 words each):
  • What do you hope to gain from participating in Carnegie Mellon’s Pre-College Programs?
  • Why are you interested in studying Computational Biology?

What is computational biology?

Great question! The short answer is that computational biology is the application of high-powered computational approaches to the analysis of biological or medical datasets. For a lengthier explanation, check out the first 20 minutes of this video recorded by Professor Compeau.

Do I need to bring my own computer? What other supplies do I need?

Because our program is heavily dependent on coding, each student in our program will need to bring a laptop. We will provide all other resources needed.

Will I earn college credit from this program?

No, Pre-College Computational Biology students do not earn college credit. 

Are international students allowed to participate?

Yes, our program is open to international students as long as they are able to enter the United States and come to Pittsburgh. They are not, however, eligible for scholarship consideration.
