Images of molecules are becoming more and more common in educational and entertainment
media. These pictures are often created by computer graphics artists using state-of-the-art programs such as Maya
and Cinema4D. However, the methods used to import PDB structures into these advanced programs can be challenging.
David Goodsell recently spoke with two molecular graphics professionals to see what is available and what still
needs to be done.
Q. First off, can you tell me a bit about yourselves and the work you are doing in molecular graphics?
A. Gaël McGill: Cell and molecular biology has been a passion since
my middle school days. I came to the USA as an undergraduate specifically to study molecular biology and my
dream was to be involved in research. I went on to do my Ph.D. at Harvard Medical School (mostly focusing on
cancer signal transduction pathways and apoptosis using a varied mix of cell and molecular biology, biochemistry,
animal model/genetics, and screening approaches). Having identified a need in the local academic, medical,
biotech and pharmaceutical communities for "scientifically-informed" graphic design and web programming services,
I started my company Digizyme (www.digizyme.com) in
1999 during my Ph.D. years. On a personal level, this really served as a creative outlet outside of the long
hours at the bench. Digizyme has grown to offer more advanced services in recent years--including 3D animation
services and even product design and visualization for the biomedical device industry. Over the past few years,
I have "reintegrated" into academia in hopes of establishing a full-time team of scientist-animators at Harvard
Medical School (where I currently teach Maya molecular visualization classes year-round). I enjoy the variety
and challenge of client-driven projects as part of my work at Digizyme, and look forward to the freedom of
pursuing larger-scale, longer-term collaborative projects relating to fundamental cell/molecular visualization
challenges as an academic.
A. Graham Johnson: I graduated from the Department of Art as Applied to Medicine
in the Johns Hopkins School of Medicine with a master's degree in Medical and Biological Illustration in 1997. At
Hopkins, we studied anatomy and physiology with medical school students while simultaneously sketching and
illustrating our dissections along with autopsies and surgeries observed at the hospital. After graduating,
I focused on studying molecular and cell biology while illustrating the textbook Cell Biology with
Tom Pollard and Bill Earnshaw. I began animating this content for other clients and realized that trained medical
illustrators could contribute a great deal to this relatively unillustrated subject. It occurred to me that one
could simulate most of my animations either by scripting the physics or by translating imported data to a format
that my 3D software package could recognize. I began simulating a handful of my client animations, but quickly
learned that the out-of-the-box software was not intended for this purpose and could only be used for inaccurate
and simple molecular interactions. In 2005, I applied to the Ph.D. program at The Scripps Research Institute to
work in the Molecular Graphics Laboratory under Arthur Olson to better understand the content and to communicate
directly with a team of talented molecular graphics coders. As I head into my fourth year of the program, I'm
finally producing some useable scripts that have made it easier to create illustrations and animations of
molecular realms. I hope to distribute many of the tools very soon.
Q. How do you import PDB structures? Are these tools generally available?
A. Gaël McGill: For the most part, we use existing molecular
graphics applications like Chimera and PyMOL to generate geometry files. These are typically exported as
VRML (Virtual Reality Modeling Language) and then converted to OBJ format (a common data
format for 3D data) before being imported into Maya. We also use MEL (Maya Embedded Language) scripts--either
ones already available online (although currently there are not many related to PDB), or in-house/custom ones.
Which method we use depends on what we will be doing with the geometry once inside Maya. A great option for
bringing in large PDB datasets has been Chimera's "multiscale models" feature. Eventually it would be great
to create a similar functionality for creating polygonal models within Maya itself in order to have more
control over the output geometry. Still, this type of tool has been very useful in creating animations
showcasing large complexes (like entire viruses).
A. Graham Johnson: I've written a COFFEE plug-in (Cinema4D's native
scripting language) that imports a single PDB file or a list of PDB files directly into my viewer window
as a set of points in space (used to generate smooth surface models such as metaballs),
CPK spheres, or backbone spline. I'm building a primitive ribbon generator and hope to make the tools
available for use within the next year. If I require a more sophisticated surface model, e.g., one
colored by electrostatic potential, I'll export it from one of the popular molecular viewers as either a
VRML or an OBJ file. Again, for static images, I'll often just export a screen grab from a molecular
viewer that offers a style I'm after.
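As an illustration of the "set of points in space" and backbone-spline imports described above, here is a minimal Python sketch (not the COFFEE plug-in or any MEL script mentioned in the interview). It reads the fixed-column ATOM/HETATM records of a PDB file and returns atom positions, plus the alpha-carbon trace that a spline or ribbon could follow. The names `Atom`, `parse_atoms`, and `backbone_trace` are hypothetical, and the sketch assumes well-formed records of at least 54 columns.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str      # atom name, e.g. "CA"
    res_name: str  # residue name, e.g. "ALA"
    chain: str     # chain identifier
    x: float
    y: float
    z: float


def parse_atoms(pdb_text: str) -> list[Atom]:
    """Extract atoms from the fixed-column ATOM/HETATM records of a PDB file."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append(Atom(
                name=line[12:16].strip(),      # columns 13-16
                res_name=line[17:20].strip(),  # columns 18-20
                chain=line[21],                # column 22
                x=float(line[30:38]),          # columns 31-38
                y=float(line[38:46]),          # columns 39-46
                z=float(line[46:54]),          # columns 47-54
            ))
    return atoms


def backbone_trace(atoms: list[Atom]) -> list[tuple[float, float, float]]:
    """Return alpha-carbon positions in file order, usable as spline control points."""
    # Assumption: any atom named "CA" is an alpha carbon (a calcium HETATM
    # would also match; a real importer should check the record type).
    return [(a.x, a.y, a.z) for a in atoms if a.name == "CA"]
```

The full list from `parse_atoms` could feed a sphere or metaball generator, while `backbone_trace` gives the points for a backbone spline. How those points become geometry depends on the 3D package and is not shown here.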
The Synapse Revealed, created by Graham Johnson for the Howard Hughes Medical Institute Bulletin, ©2004
Q. What type of molecular imagery is most popular with your clients?
A. Graham Johnson: Because molecular graphics viewers are so user-friendly
these days, clients rarely come to me to request an image or movie of a single molecule spinning on a monochromatic
background. Most of my clients ask me to generate an editorial image, or to illustrate or animate a process or
cell event involving multiple molecule types.
A. Gaël McGill: Although it depends on the project, I find that
clients (especially biotech and pharma) want images of molecules "in context." In other words, scenery that
captures a molecular process of interest but also places it within a cellular landscape. The challenge is
to create a still image that captures or suggests a narrative or mechanism... essentially an "action shot"
in which the visual context of the structure being depicted (and its binding partners) helps to communicate
Q. I imagine that assembly of biologically relevant complexes (such as
chromatin or a transcription complex) and modeling their dynamics poses difficult challenges--what
types of tools do you use for this?
A. Gaël McGill: This is one of the toughest challenges at
the moment (and it does not apply only to complexes): how does one visually represent the dynamic
aspects of proteins based on available (mostly static) data? The ability to create linear morphs between
multiple conformational states of a protein using the adiabatic mapping technique (used
by Mark Gerstein's method at www.molmovdb.org, for
example) is very useful to visualize one possible trajectory, but it is only one possible trajectory and
it also cannot tackle more complex morphs that involve partial refolding of protein domains. Drew Berry
at The Walter and Eliza Hall Institute of Medical Research has pioneered a visual style that suggests
the dynamics of proteins, but it would be nice to create animations that are based on actual data for
these dynamics (i.e., as opposed to using noise/fractal motions throughout, having vibrations
and degrees of flexibility that reflect the protein's actual range of 'thermodynamically-permissible'
motion). In packages like Maya, we are currently limited to using pretty basic kinematic tools
(i.e., building rigs driven by forward or inverse kinematics) that intrinsically
have no knowledge of the molecular structure and its limitations or range of permissible torsion/bending.
The software does not even register or warn against impending self-intersections--a problem that
we are currently exploring in collaboration with topologists/software developers from the entertainment
industry. At the moment (and depending on the target audience), we try to find as many sources of
reference data as possible and use them as "inspiration" to create a dynamic representation of a
protein or complex. The goal is to find more direct ways of integrating these data into the
visualization (inasmuch as it helps communicate crucial parts of the story).
A. Graham Johnson: I've attempted to rig a handful of
complex builders over the years with out-of-the-box toolsets. Such tools often do a great
job of roughing a concept together, but fail when applied to large-scale systems or attempts to
accurately simulate the rigor and detail often required for molecular imagery. Years ago, for
example, I tried to stitch together thousands of blocks with pairs of springs to represent the
persistence length and flexible backbone of DNA in a plasmid. I animated a twist to see if it would
supercoil, but the collision detector would always overload and the system would come to a
screeching halt before the DNA could achieve a single twist. I've tried pouring virtual molecules
into virtual organelles to fill them with random recipes of non-colliding molecules, but again,
the technique has always proved to be slow, limited in volume, and relatively uncontrollable.
Most particle generators I've toyed with have similar limitations to their physics simulation.
To overcome these challenges, I've begun to construct scripts from scratch that attempt to combine
the capabilities of simplified molecular dynamics with the visualization power of commercial 3D software.
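The idea of filling a virtual organelle with non-colliding molecules can be sketched in a few lines of Python. This is a hedged illustration, not Johnson's scripts: it treats every molecule as a sphere of the same radius, treats the organelle as a spherical container, and uses simple rejection sampling (propose a random position, keep it only if it overlaps nothing placed so far). The function name `pack_spheres` and all of its parameters are assumptions made for this example.

```python
import math
import random


def pack_spheres(container_radius, sphere_radius, count, max_tries=10000, seed=0):
    """Place up to `count` non-overlapping spheres inside a spherical container.

    Returns the list of accepted sphere centers; it may be shorter than
    `count` if `max_tries` proposals are used up first.
    """
    rng = random.Random(seed)
    origin = (0.0, 0.0, 0.0)
    limit = container_radius - sphere_radius  # centers must stay this close to the origin
    centers = []
    tries = 0
    while len(centers) < count and tries < max_tries:
        tries += 1
        # Propose a point in the bounding cube, then reject it if it lies outside the container.
        p = tuple(rng.uniform(-limit, limit) for _ in range(3))
        if math.dist(p, origin) > limit:
            continue
        # Accept only if the new sphere touches or clears every sphere placed so far.
        if all(math.dist(p, q) >= 2.0 * sphere_radius for q in centers):
            centers.append(p)
    return centers
```

Every proposal is checked against all previously placed spheres, so the cost grows quadratically with the number of molecules, and acceptance becomes rare as the volume fills up. That is consistent with the "slow, limited in volume" behavior described above; a spatial grid or tree would be the usual next step.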
Early Events in Reovirus Entry
by Gaël McGill. The full movie can be viewed online at
Q. Are there any resources that you would suggest to artists interested in incorporating PDB structures into their work?
A. Gaël McGill: Other than the fantastic PDB itself (not sure what we would do without it!),
I recently launched a free resource for scientists interested in learning 3D software packages
for cell and molecular visualization at www.molecularmovies.org. One section
is a showcase/directory of some of the web's best cell and molecular movies (organized by scientific topic),
and another is dedicated to tutorials and lectures. There are currently hundreds of pages of free tutorials
that approach learning Maya in the context of biological visualization. More specifically, several of these
tutorials focus on getting PDB data into 3D applications like Maya. Expansion of the site in the near future
will also include a "Toolkit" section where animators can share scripts and plugins for PDB import (and other
tasks related to molecular animation), and a new section that provides a more general directory of visual
resources. The idea behind this last section is to find and organize non "narrative-driven" raw data
visualizations (e.g., time-lapse movies, MD simulations, and other datasets) that animators
can use as reference materials to create better visualizations.
A. Graham Johnson: The updated and integrated Electron Microscopy
Data Bank (emdatabank.org) offers many low-resolution models of macromolecular
structures and has a new online EM viewer. Many files in the PDB exist as low resolution structures with
only alpha carbon coordinates published. If you need a rough approximation of the sidechains to generate
a teaching model for such a molecule, you can generate or download a pre-generated version from MaxSprout.
Lastly, I find the TransMembrane PDB indispensable (pdbtm.enzim.hu).
Q. Have you had any projects that posed an insurmountable challenge?
A. Gaël McGill: The great thing about cell and molecular visualization
is that there is an endless source of topics/mechanisms to visualize and each of these comes with its own
unique challenges. We may not always use the optimal solution or have the perfect tool available, but there is
almost always a creative way to solve the visual representation challenges that emerge. It is one of the
aspects of visualization with powerful packages like Maya that make this work so fresh and exciting!
A. Graham Johnson: Many projects have, and I've often had to truncate my
personal goals or compromise with the client to find some work-around because of strict deadlines. In years
past, I sometimes had to resort to keyframe animation, hand drawn animation, simple 2D
vector animation, and even static imagery to convey a message that could have been most clearly presented
as a 3D animated sequence... I simply lacked the technology, skill, or time. Finding out, however, that
molecular animation posed more challenges than my other medical illustration jobs directly inspired me to
build tool sets to help meet such challenges.
Q. What new tools would you like to have?
A. Gaël McGill: As noted above, we are in the process of
creating a suite of MEL scripts that can address some of the basic geometry-building
tasks for getting PDB data into Maya (without having to resort to molecular graphics software-exported
meshes). Once we have this first set of scripts (that just focus on efficient/clean geometry creation),
the next step would be to explore the development of programmatically-driven rigging
tools for defining the articulation of the models. In other words, to write scripts that not only
create Maya-native geometry directly from the PDB but also automatically create a rig that has some
inherent motion constraints applied. This is easier said than done and will of course depend on the
type of molecular representation (ball & stick versus cartoon for example would have very different
'rules' applied to constrain motion). Having geometry that is more 'self-aware' (and that can at least
avoid or warn about self-intersections) would be useful.
A. Graham Johnson: I agree that methods for exporting molecular
models in styles that are animation-ready would be very helpful to everyone in the molecular illustration
field. I would primarily like to see an extension of the PDB that offers biological unit matrices to help
users generate pertinent symmetries. This works great in PDB files for viruses that have BIOMT lines in
REMARK 350 to describe the transformation and orientation matrices needed to generate a complete virus [1].
More specifically, I'd love to see this for other common cell complexes. How can one generate an in vivo
microtubule with 13 protofilaments and a proper seam from 1TUB
for example? What rotation per y translation might a user need to enter to generate an actin filament from an actin monomer?
A handful of filamentous files exist in the PDB, but animators can benefit from viewport and render time
efficiencies afforded by modern software by cloning a single molecule rather than rendering coordinates
from the thousands of copies of 1TUB needed
to generate a lengthy microtubule.
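To make the BIOMT idea concrete, the following Python sketch reads the BIOMT rows from REMARK 350 and applies each 3x4 operator (rotation plus translation) to a set of coordinates, producing one transformed copy per operator. It is an illustration under stated assumptions: the function names are hypothetical, the fields are assumed to be whitespace-separated, and the file is assumed to describe a single biomolecule (operator serial numbers restart for each biomolecule, which this sketch does not handle).

```python
def parse_biomt(pdb_text):
    """Return {operator_serial: [row1, row2, row3]}, each row being [r1, r2, r3, t]."""
    ops = {}
    for line in pdb_text.splitlines():
        fields = line.split()
        if (len(fields) == 8 and fields[:2] == ["REMARK", "350"]
                and fields[2] in ("BIOMT1", "BIOMT2", "BIOMT3")):
            row = int(fields[2][5]) - 1  # BIOMT1/2/3 -> matrix row 0/1/2
            serial = int(fields[3])      # operator number
            ops.setdefault(serial, [None, None, None])[row] = [float(v) for v in fields[4:8]]
    return ops


def apply_operator(op, point):
    """Apply one 3x4 operator (rotation columns plus translation) to an (x, y, z) point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in op)


def build_assembly(ops, coords):
    """Return one transformed copy of `coords` per operator, ordered by serial number."""
    return [[apply_operator(op, p) for p in coords] for _, op in sorted(ops.items())]
```

In a 3D package, the same operators could instead be set as the transform of cloned instances of a single mesh, which is the efficiency described above: the coordinates are stored once and only the matrices differ between copies.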
A. Gaël McGill: Basic collision detection is also not easy to implement at the moment (whether between
different parts of the same continuous mesh or between meshes). Some way of integrating electrostatic
forces would also be amazing! Better simulation tools would also help us create molecular vistas with
some resemblance to what is happening in vivo. By simulation I don't mean at the same atom-by-atom
level that molecular dynamics offers, but something that would drive the stochastic behavior of numerous
molecules within a defined volume or environment, for example.
Finally, an area that is ripe for exploration: we need to tap into the full promise of educational gaming
and interactive environments by harnessing the power of modern gaming engines. In many cases, the digital
assets (models, textures, rigs) used to develop high-end games are created in packages like Maya. So one
could easily imagine a scenario where a lot of the work being done to create 'narrative-driven' molecular
movies in Maya could be repurposed and adapted to generate interactive molecular environments for
Q. What packages do you typically use for your molecular animation projects?
A. Graham Johnson: In 1998, I generated most of my instructional static images directly from a
package called Ribbons. It offered the most attractive defaults, endlessly adjustable styles, and one of the
better-developed graphic user interfaces for its time. To this day, its outlining feature helps produce some
of the most pedagogically useful rendering styles that can be reduced to nearly impossible sizes while allowing
structures to remain legible. For glitzier editorial renderings, I used a computer graphics pipeline
patched together by Dr. Witek Kwiatkowski (The Salk Institute) to meet my specific goals. It converts output
from a variety of molecular viewers to a freeware renderer called PovRay. We can export surface models with
electrostatic potentials from GRASP, for example, and fancy beaded ribbon models from MolScript. The modern
molecular viewer PyMOL now emulates this multi-hour process with the click of 2 or 3 buttons on any operating system.
For static pedagogic imagery, I still prefer to use renderings from molecular graphics viewers such as Ribbons,
PyMOL, or Chimera. I've also recently had my eyes opened to PMV (Python Molecular Viewer), which can generate
a variety of outline styles in real-time that rival or beat the best of the commercial cartoon-style renderers
for creating a pencil sketch styled contour for example. For a workflow pipeline, such as for creating a
secondary signaling cascade involving 12 PDB files, I find it most efficient to first create a sketch (in
a familiar medium like pencil and paper if you want the final composition to look its best!). I render each
molecule individually, combine snapshots of the images, and then fill in the background to couch the
molecules in their proper context (e.g., draw organelle bilayers and matrices) with familiar tools
as found in Photoshop or Illustrator. I use the model snapshots directly or just as skeletons for
hand-rendered blobs to make each molecule less busy for more complicated scenes or to match an intended style
so molecules blend into the background. To this day, I paint over most every 3D rendering I create to reduce
that geometric, plastic, "computery" look, but as non-photorealistic algorithms and my lighting/texturing
skills improve, I can spend significantly less time on this step each year.
Although PMV may change this soon, most molecular viewers do not offer a thorough (or even basic) set of
animation tools that commercial 3D software users would recognize. For that reason, I have always turned
to commercial packages. I started using Strata Studio Pro in 1996, but switched to Maxon's more stable
Cinema 4D (C4D) in 1999. I still use C4D to generate images with fancy textures, lights, or complicated
scenes involving more than one molecule or editorial scenes involving molecules out of context (e.g., a
nucleosome clamped under an old dissection magnifying glass on a desk). I also use C4D to generate all of
my animations because it offers some easy-to-use tools for keyframing, character rigging,
and physics/particle simulation.
A. Gaël McGill: I use Autodesk Maya Unlimited & Adobe After Effects usually in
combination with UCSF's Chimera, Warren Delano's PyMOL (and sometimes Maxon's Cinema4D for metaballs)
when dealing with PDB, EM, or microscopy datasets. Occasionally, I also use Pixologic ZBrush,
Luxology modo, and Adobe Flash, depending on the animation and its delivery format: self-running versus
interactive and/or web-based movie.
Q. We often use many representations to visualize biological molecules: space filling,
bonds, ribbons, etc. Do any of these cause particular problems?
A. Gaël McGill: When initially created within molecular graphics packages (like Chimera)
and then exported to Maya, certain types of geometry can result in very heavy files (cartoon/ribbons and
high-resolution surface meshes in particular). The other aspect is that the meshes tend to be messy and
unpredictable in terms of the order of vertices in the polygonal model and other properties that one could
use to programmatically rebuild the geometry within an application like Maya. The best solution moving
forward will be to create MEL scripts that start directly from the PDB coordinate file and
generate identical-looking ribbon or surface representations--but ones that are much lighter and cleaner
because they have been built within Maya using more 'optimized' types of geometry, like NURBS
(non-uniform rational B-splines) for example.
A. Graham Johnson: I've encountered similar problems with each type of representation export. Some
molecular viewers don't take advantage of the VRML file format and will export each sphere in a CPK model
as a spherical mesh containing dozens or hundreds of points. Most packages, however, do just export a
translation, radius and texture for each atom. Also, depending on the formatting used, a CPK model from
one package might take 2 seconds to import, while the same model from a different molecular viewer export
may take 3 minutes to import. Some will come in with a new texture map for each atom rather than
references, so you'll end up with hundreds of oxygen reds for a typical protein. Each molecular viewer
creates a slightly different looking ribbon style, which is nice to keep them recognizable, but difficult
if you want to merge offerings from different packages. Most offer little adjustment, but PMV has a
profile editor that will let you draw your own extrusion shape for helices, coils and beta sheets.
Surface models tend to be less problematic, but I find that most of them have redundant points describing
each vertex, i.e., each polygon has its own corner point, even though the corner points meet.
To fix this, I usually "optimize" any set of polygons I bring into C4D, which merges all of the redundant
points, cleans the look of the model mesh by allowing the shading model to function properly, and reduces
the memory requirement of the model, often to ~30% of the original import. Ribbon models have become
more predictable and easy to import in recent years, but be sure to set your Phong tags correctly so the
nice 90 degree edges on your beta strands don't look rounded off. These models will also appear
tessellated and require an "optimization" to repair.
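The redundant-point problem described above (each polygon carrying its own copy of a shared corner) is commonly fixed by "welding" coincident vertices. The Python sketch below shows one simple way to do that. It is not the implementation behind C4D's optimize command; the function name `weld_vertices` and the `tolerance` parameter are assumptions for this example.

```python
def weld_vertices(vertices, faces, tolerance=1e-6):
    """Merge vertices that coincide (within `tolerance`) and reindex the faces.

    vertices: list of (x, y, z) tuples
    faces:    list of tuples of indices into `vertices`
    Returns (welded_vertices, reindexed_faces).
    """
    key_to_new = {}  # quantized position -> index in the welded list
    remap = []       # old vertex index -> new vertex index
    welded = []
    for v in vertices:
        # Snap each coordinate to a grid of cell size `tolerance` so that
        # coincident points produce the same dictionary key.
        key = tuple(round(c / tolerance) for c in v)
        if key not in key_to_new:
            key_to_new[key] = len(welded)
            welded.append(v)
        remap.append(key_to_new[key])
    new_faces = [tuple(remap[i] for i in face) for face in faces]
    return welded, new_faces
```

For two triangles that share an edge but were exported with six separate points, this returns four vertices and faces that reference the shared ones, which is what lets smooth shading interpolate across the seam. Grid snapping can miss near-duplicates that fall on opposite sides of a cell boundary, so it is a sketch of the idea rather than a robust tool.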
Clients often want to animate a probable transition between two crystallized states of a particular molecule.
All of the styles molecular viewers generate have inconsistent point numbers and point assignments. Morphing,
for example, from a surface model of one protein conformation derived from a PDB file to a second conformation
from a different PDB file inevitably fails. Even if the PDB files have the exact same sequences, the point
order of their surface or ribbon meshes will differ with surface area and portions of the model will therefore
turn inside out en route to their new position when interpolated. To work around this, I've had to use the PDB
data more directly and generate a surface skin on the fly with metaballs while the raw data's
atom skeleton moves "properly" below the surface (proper for animation rigging, not for detailed structural biology).
A similar approach is required to transition smoothly from a helix to a random coil in a ribbon model as it morphs
between states, but easy and automated methods don't yet exist and most techniques produce strobing intermediates
that severely distract from understanding the backbone's motion.
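The workaround described above (move the atoms, then regenerate the skin each frame) can be sketched in Python as follows. This is an illustration only. It assumes both conformations list the same atoms in the same order, it uses plain linear interpolation (which distorts bond lengths mid-transition and is meant for animation, not structural analysis), and it uses a Gaussian falloff as one common metaball-style field, not the specific function of any 3D package. All function names are hypothetical.

```python
import math


def interpolate_atoms(start, end, t):
    """Linearly blend two matched coordinate lists; t=0 gives `start`, t=1 gives `end`."""
    if len(start) != len(end):
        raise ValueError("conformations must contain the same atoms in the same order")
    return [tuple(a + (b - a) * t for a, b in zip(p, q)) for p, q in zip(start, end)]


def metaball_field(point, centers, radius):
    """Sum a Gaussian falloff from every atom center at `point`.

    A surface mesher would extract the isosurface where this value equals a
    chosen threshold, so the skin is rebuilt from the atoms at each frame.
    """
    total = 0.0
    for c in centers:
        d2 = sum((p - q) ** 2 for p, q in zip(point, c))
        total += math.exp(-d2 / (radius * radius))
    return total
```

Because the surface is recomputed from the interpolated atom positions at every frame, there is no vertex ordering to keep consistent between the two states, which is what avoids the inside-out artifacts of direct mesh morphing.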
The RCSB PDB is funded by a grant from the
National Science Foundation, the
National Institutes of Health, and the
US Department of Energy.