PDB COMMUNITY FOCUS: Robert M. Sweet, Brookhaven National Laboratory

Robert (Bob) Sweet is a member of the Biology Department at Brookhaven National Laboratory (BNL), and group leader of the Macromolecular Crystallography Research Resource (PXRR) at the National Synchrotron Light Source (NSLS; funded by BER/DOE and NCRR/NIH). Raised in the rural Midwestern US, he was educated at Caltech and the University of Wisconsin, Madison. His first exposure to crystallography was at the knee of Dick Marsh at Caltech, where in about 1963 he estimated intensities visually from Weissenberg photographs and calculated his first Fourier synthesis with Beevers-Lipson strips and a Marchant calculator. In the lab of Larry Dahl in Madison, an automated diffractometer helped him solve a few cephalosporin structures, which earned Bob a PhD at the beginning of 1970. Postdoctoral work with David Blow at the MRC Lab in Cambridge then gave him an introduction to protein crystallography. Bob also managed to play a small role in the creation of modern oscillation photography in cooperation with Uli Arndt and Alan Wonacott. He spent a decade in Chemistry at UCLA, and has been at BNL since 1983. There, the importance of the NSLS to the PX community has grown steadily: the PXRR now comprises six beam lines, and has contributed to over 280 publications during the last year.

Q: It's very clear that the PXRR facility will be generating a tremendous number of macromolecular structures, all of which should ultimately be deposited in the PDB. What is your vision for optimal interactions between facilities such as the PXRR and the PDB?

A: Well, to begin with we can remind our users that you're important to us. For six or seven years, our beam-time request form has had a check box (persuasively pre-checked) followed by the words, "Acknowledge your intent to submit the coordinates of the structure derived from this work to the Protein Data Bank." Also, for at least that long, we've had a dream that our data collection and processing programs would create a stream of information in mmCIF format that would represent a major part of the experimental portion of a PDB entry. We have fragments of the data-harvesting code available, but our proudest achievement in this regard is that just a year ago we released our experiment-tracking database, PXDB. This system (please play with it: www.px.nsls.bnl.gov/database/pxdb_intro.html) accepts the information from the user's initial application for beam time, logs actual visits, records the identities of specimens, and registers every image taken. We expect it to be hugely useful as Dieter Schneider and Alex Soares get our specimen-mounting robots in place. In summary, we'll grease the rails for you as best we can.

Q: What is the current ratio of "FedEx" data collection to hands-on data collection at the PXRR, and how do you see this changing as time goes on?

A: This program was started by Mike Becker, brought into regular practice by Howard Robinson, and is now also operated by Annie Héroux and Alex Soares. (These folks prefer to call it "mail-in," but I think the name you used has a certain ring to it.) At the last attempt to answer this question we found that, integrated over many months, the mail-in scientists have been employing 0.8 of our 6.0 beam lines. It's a growth industry – it has nowhere to go but up; buy stock in it if you can. Another interesting question is, "What fraction of the users follow the traditional cycle of trimester proposals vs. those who gain 'rapid' access?" For five or six years we have had a rapid-access mechanism for some of our less heavily loaded beam lines. It started out as a simple web form, is now an integral part of PXDB, and will eventually be a part of the NSLS user program. Rapid access to at least five of the beam lines is overseen by Anand Saxena, and it can be really fast. Essentially, if there's time available and you have crystals, Anand will get you in. So the answer to the question is that something over half of our users come to us this way. It's quick, efficient, and more personalized than you might think. In this context, the FedEx work can be quick: typically a week elapses from the time a dewar of cryocooled crystals arrives until data are reported back to the user and the project is essentially finished.

Q: Rapid on-site data collection, mail-in service, and automatic data reduction and structure solving packages are having a profound effect on macromolecular structure determination. How much crystallography will a researcher need to know to solve structures? And, as an educator continually involved in teaching crystallography, what do you see as the best way to teach the fundamentals of crystallography to the growing base of scientists producing and using the structural data stored in the PDB?

A: I think your first question is really, "How little may a researcher know and still be able to solve a structure reliably?" Well, how much do you know about optics and diffraction gratings when you measure a UV/Vis absorption spectrum? Not very much, probably, but you still get really good spectra because the instrument just works. I think that we who devise instruments, software, and methods take this as a model: eventually for some range of "routine" structures, macromolecular crystallography will work about that well. Of course, if there are half a dozen anomalies that can give a misleading absorption spectrum (a one-dimensional pattern), then there are at least that many cubed for a crystal structure.

But I'm not answering your questions. We've had seven cycles of our RapiData course – http://www.px.nsls.bnl.gov/RapiData2005/. We have a number of the software "gods" come to teach first data reduction, and then solving the phase problem. These descriptions are at a fairly high level – some knowledge of diffraction is really necessary. We found fairly quickly that, whereas we had expected to be judging applicants to the course on their preparation, we were instead hoping they knew enough to understand! Four years ago I started teaching a "fundamentals" course as an optional extra day at the beginning of RapiData and at least 3/4 of the students have been coming for the five-hour series of lectures. You're welcome to have a look at the visuals I used during the lectures here: www.px.nsls.bnl.gov/fundamentals_lecture/. You'll see that I teach a bit of diffraction theory as related to lens optics, show how a repetitive pattern gives spots, and then move on to reciprocal space. I look briefly at ancient and modern x-ray cameras and diffractometers (students liked the history). Then I develop the expressions for the structure factor and Fourier synthesis. I do a very brief treatment of symmetry, including defining a space group or two and showing how the symmetry of the diffraction pattern comes from the symmetry of the crystal. Then I talk about heavy-atom and direct-methods phasing, and I'm done: all of that in five hours.

I believe this represents the sort of thing the users of our equipment, software, and methods ought to understand in order to have any idea what is going on. The only evidence I have of the usefulness of the approach is that several students each year will say things like, "So that's what that is all about," or "I always wondered how that worked." Certainly no one has said (in our anonymous course evaluations) that it is a waste of time. It would be better, of course, if the same material were presented in a more detailed and leisurely format over ten to twenty lectures back at the university. I'm not experienced with people using but not producing PDB data, so I won't comment on that part of your question.

Q: The PDB is growing at a steadily increasing rate. How have your interactions with the PDB changed over the years and where do you see them going as both macromolecular crystallography and the PDB continue to evolve?

A: Well, a big change is that before 1999 I used to be able to walk across the street to talk with the PDB workers. That changed, of course, when the RCSB took over. Seriously though, the changes are small and incremental. We've had the idea in mind of pipelining information from the data stream to PDB deposition for a long time. We can see that the increased pace makes this even more important. The possibility of having a new synchrotron here (NSLS-II; www.nsls2.bnl.gov) gives us an accelerated mission to be ready for the increment in productivity that this will engender. I'm impressed with the ease with which the PDB is providing interchangeability among its file formats from PDB to mmCIF to the more modern XML. This innovation matches our own migration of information-exchange media. I believe we'll be able to communicate easily, and will continue to grow in parallel.