Published quarterly by the Research Collaboratory
for Structural Bioinformatics Protein Data Bank
Summer 2012
Number 54


Education Corner

Dr. Andrew K. Vershon is a Professor and Undergraduate Director in the Department of Molecular Biology and Biochemistry at Rutgers University and the Director of the Waksman Student Scholars Program.

He also directs a research laboratory at the Waksman Institute, where the major focus of his research is on the regulation of transcription in the yeast Saccharomyces cerevisiae.

For more information about the Waksman Student Scholars Program, please visit wssp.rutgers.edu or contact Sue Coletta.

The Waksman Student Scholars Program:
Learning Science by Doing Science

by Andrew K. Vershon, Ph.D.

The Waksman Student Scholars Program (WSSP) provides opportunities for high school students and their biology teachers to contribute to authentic research in molecular biology and bioinformatics.

Since July 1993, over 250 high school science teachers and approximately 5,000 high school students have participated in the project. The WSSP is a year-long program that begins with a summer Institute in which teachers and one or two of their students learn the background content, rationale, and methods required to conduct the research project. It continues during the academic year, when as many as 60 students from each school carry out the research at their own high schools. Schools select how the research project will be integrated into their existing curriculum structure: as an independent research course, as part of an Advanced Placement (AP) Biology course, or as an after-school club. The program is currently being conducted at the Waksman Institute at Rutgers University, the Johns Hopkins University, the University of Texas at Austin, and the Lawrence Livermore National Laboratory in California. The WSSP and its affiliated programs receive support from the Waksman Institute, GE Healthcare Life Sciences, and the National Science Foundation.

Principles Guiding the Project

Over the last two decades, the biological sciences have experienced a remarkable transformation. New technologies have produced a flood of information that has to be stored, linked, accessed, and distributed. A cyberinfrastructure in support of biology has become widespread,1 and computers are now an essential tool in the biologist's repertoire. Yet reports indicate that the K-12 curriculum has not reflected these changes.2

Evidence shows that students who engage in authentic research increase their understanding of scientific processes and change their perception of science.3 Working with an open-ended, unsolved research problem affords students the opportunity to "learn by doing." They can learn how to develop experimental designs, analyze data, devise and test models, and share ideas.4 Students can also access Internet resources to work on research problems using the authentic data that they generate. The skills that students acquire by using online tools, navigating through professional websites, interpreting complex data, and reporting their findings will be essential throughout their careers.5

Figure 1: A model of ascorbate peroxidase (1APX) created by students from Bayonne High School, NJ. The heme group is shown in light green, and the side chains and backbone atoms that contact the heme are shown in aqua. The red and blue residues show negatively and positively charged side chains, respectively, that form contacts along the dimer interface of the protein in the crystal structure. Yellow marks positions that are not conserved between the protein sequence determined from the gene the students isolated from the duckweed plant Wolffia australiana and the sequence from the pea Pisum sativum.

Figure 2: A physical model of thioredoxin (2TRX) created by the students of Hillsborough High School, NJ. A dimer is shown in which one monomer is displayed in spacefill and the other in backbone.

Figure 3: Students from New Brunswick Health, Sciences, and Technology High School, NJ presenting their research on a homolog of carbonic anhydrase (1EKJ) at the year-end WSSP Poster Forum.

One problem with using scientific computational resources is that high school teachers and their students may be unfamiliar with some of the specialized content and methodologies that these resources require. However, there are many examples of individual high school teachers or students who have successfully learned to use these scientific tools by working in a research laboratory. This raises the question of whether resources can be developed so that diverse populations of high school students can learn to use these tools to conduct authentic research and make genuine contributions to scientific knowledge.

To address this question, the WSSP has developed an authentic research project and an accompanying software program that guides students through the analysis of novel DNA sequences to determine a) whether they are similar to other sequences stored in the scientific databases, b) whether they code for proteins, c) if so, what the functions of those proteins are, and d) what the 3D structures of homologs of these proteins look like. Students conducting the research project select random clones from a plasmid cDNA library provided by the WSSP, purify the plasmid DNA, and determine the size of the DNA insert by agarose gel electrophoresis of restriction enzyme digests and polymerase chain reaction (PCR) products. The DNA sequences of the inserts are then determined, and students use an online program to evaluate the quality of the sequence data and resolve ambiguities. They search the online databases for similar DNA sequences and, where applicable, for the proteins that are coded by the DNA. The students use this information to investigate the likely function of the protein and determine whether it is associated with a specific cellular process or disease. When analyzing the DNA sequence and protein structure data, students work with the same research tools that scientists use every day.
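To make step b) concrete, the following is a minimal Python sketch of one common way to ask whether a DNA sequence might code for a protein: translate all six reading frames with the standard genetic code and report the longest stretch that runs from a start codon (ATG) to a stop codon. This is an illustration only and is not the WSSP software; the function names, the example sequence, and the choice to ignore open reading frames that lack a stop codon are all assumptions made for the sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(dna):
    """Return the longest ATG-to-stop translation found in the six frames.

    Hypothetical helper: frames that start but never reach a stop codon
    are ignored, which a real tool for partial cDNA clones might not do.
    """
    best = ""
    for strand in (dna.upper(), reverse_complement(dna)):
        for frame in range(3):
            protein = ""
            in_orf = False
            for i in range(frame, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[i:i + 3], "X")  # X = unknown codon
                if not in_orf:
                    if aa == "M":          # start codon opens a candidate ORF
                        in_orf = True
                        protein = "M"
                elif aa == "*":            # stop codon closes it
                    if len(protein) > len(best):
                        best = protein
                    in_orf = False
                else:
                    protein += aa
    return best


if __name__ == "__main__":
    # Made-up example sequence, not real student data.
    print(longest_orf("CCATGGCTAAATGACC"))
```

A translation found this way is only a candidate; the database searches described above are what indicate whether it resembles a known protein.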

Each student has the opportunity to analyze a unique DNA sequence that has not been previously examined. Students are therefore able to publish their findings in GenBank,6 the DNA sequence database maintained by the National Center for Biotechnology Information (NCBI) and used by scientists throughout the world. In this way, the students add to the body of scientific knowledge and become part of a community of practice.

Most of the students' DNA sequences code for proteins that are similar to proteins in the PDB, whose 3D structures have been determined. As part of the analysis of their DNA sequences, students examine the structures of these proteins using one of the many freely available molecular graphics programs, such as Jmol,7 RCSB PDB's Simple Viewer,8 or Cn3D.9 Students view the secondary, tertiary, and quaternary structures of the proteins and identify active sites and important structural components. Some of the students extend the structural analysis by creating models in Jmol. After examining the primary citations associated with an entry, students design models that can show the active site residues, contact regions with other proteins, or other important structural features (Figures 1 and 2). Other students design models that highlight sequence identity and conservation between the protein they identified and the protein in the PDB. Through a collaboration with the Center for BioMolecular Modeling (cbm.msoe.edu), and as part of the HHMI-funded Students Modeling A Research Topic (SMART) Team project (see the Winter 2006 Education Corner), it has been possible to obtain physical models of the proteins designed by students (Figures 2 and 3). By working with these modeling programs, students gain a deeper understanding of protein structure and function.
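The comparison of sequence identity and conservation mentioned above can be sketched in a few lines of Python. The example assumes the two protein sequences have already been aligned to the same length, with "-" marking gaps; it reports the percent identity and the alignment positions that differ, which are the kind of positions a model might highlight. The function name and the short sequences are hypothetical and are not taken from the students' work.

```python
def compare_aligned(seq_a, seq_b):
    """Compare two pre-aligned protein sequences of equal length.

    Returns (percent_identity, differing_positions). Positions are
    1-based alignment columns; columns with a gap ("-") in either
    sequence are skipped when computing identity.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    differing = []
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1):
        if a == "-" or b == "-":
            continue                      # ignore gapped columns
        compared += 1
        if a != b:
            differing.append(pos)         # non-conserved position

    if compared == 0:
        return 0.0, differing
    identity = 100.0 * (compared - len(differing)) / compared
    return identity, differing


if __name__ == "__main__":
    # Made-up aligned fragments for illustration only.
    identity, positions = compare_aligned("MKV-LA", "MRVTLA")
    print(f"{identity:.1f}% identical; differing columns: {positions}")
```

The differing positions are alignment columns, so they would still need to be mapped to residue numbers in the PDB entry before being selected in a viewer such as Jmol.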



  1. D. Hart (2008) Cyberinfrastructure. National Science Foundation Special Reports: www.nsf.gov/news/special_reports/cyber.
  2. B. L. Lowell, H. Salzman, H. Bernstein, E. Henderson (2009) Steady as She Goes? Three generations of students through the science and engineering pipeline. Annual Meetings of the Association for Public Policy Analysis and Management (Washington, DC):
  3. S. Barab, K. Hay (2001) Doing science at the elbows of experts: Issues related to the science apprenticeship camp. Journal of Research in Science Teaching 38: 70-102.
  4. C. A. Chinn, B. A. Malhotra (2002) Epistemologically authentic inquiry in schools: A theoretical framework for evaluating inquiry tasks. Science Education 86: 175-218.
  5. National Research Council (2010) Exploring the Intersection of Science Education and 21st Century Skills: A Workshop Summary. The National Academies Press.
  6. D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, D. L. Wheeler (2008) GenBank. Nucleic Acids Res 36: D25-30.
  7. Jmol: an open-source Java viewer for chemical structures in 3D. www.jmol.org.
  8. J. L. Moreland, A. Gramada, O. V. Buzko, Q. Zhang, P. E. Bourne (2005) The Molecular Biology Toolkit (MBT): a modular platform for developing molecular visualization applications. BMC Bioinformatics 6: 21.
  9. C. W. Hogue (1997) Cn3D: a new generation of three-dimensional molecular structure viewer. Trends Biochem Sci 22: 314-316.