
Biological Applications

PNNL scientists are applying quantitative proteomics approaches to an increasing number of biological systems important for the advancement of Genomics: GtL projects and other DOE missions.

Current biological systems include Shewanella and Geobacter, dissimilatory metal-reducing bacteria that reduce metals and radionuclides as a consequence of their respiratory metabolism. An improved fundamental understanding of this metabolism will be useful for designing strategies for bioremediation of sites contaminated with heavy metals and radionuclides and for predicting the behavior of contaminants in the environment.

Rhodobacter sphaeroides is a metabolically diverse bacterium that can generate energy by fermentation, aerobic respiration, or anoxygenic photosynthesis. Understanding how Rhodobacter senses its environment and adjusts its cellular machinery to generate energy under these distinct lifestyles is essential for exploiting microbial systems for carbon management and energy generation.

Caulobacter crescentus cultures can readily be synchronized, which makes the organism a useful model system for studies of cell cycle progression. It is the focus of a project currently supported by the Genomics: GtL program.

Pelagibacter ubique is one of the dominant heterotrophic bacteria in open oceans and the only known cultured isolate possessing the recently discovered proteorhodopsin light-driven proton pump that is abundant in such environments. This organism and culture enrichments containing close relatives are a model system for characterizing the role of marine microbial communities in ocean carbon and nutrient cycling.

A "bottom-up" proteomics approach

Most of the high-throughput studies will center on LC-MS-based measurements, specifically the accurate mass and time (AMT) tag approach. This approach was developed at PNNL as a "bottom-up" comprehensive proteomics method that provides greater sensitivity, dynamic range, and throughput than other methods. Briefly, the LC-MS-based measurement strategy (see figure) combines the high mass measurement accuracy of Fourier transform ion cyclotron resonance (FTICR) MS with the accurate elution time measurements provided by liquid chromatography (LC) separation(s) to identify peptides.

The approach involves first generating a database of peptides that serve as AMT tags (i.e., biomarkers) using low-throughput MS/MS, and then using this database to enable subsequent high-throughput LC-MS analyses of biological samples. The approach relies on the combination of mass and LC elution time of a given peptide being unique among all possible peptides deduced from a genome sequence, provided that the measured molecular mass and elution time of the peptide are sufficiently accurate.
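The lookup step described above can be illustrated with a minimal sketch. It assumes that elution times have been normalized to a 0 to 1 scale and that a match requires both a mass error within a parts-per-million tolerance and an elution time difference within a fixed window. The class name, function names, tolerances, and peptide entries are hypothetical and chosen only for illustration; they are not the PNNL pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # peptide sequence previously identified by MS/MS
    mass: float    # monoisotopic mass in Da
    net: float     # normalized elution time on a 0..1 scale (assumption)


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error of `observed` relative to `reference`, in ppm."""
    return (observed - reference) / reference * 1e6


def match_feature(mass, net, tags, ppm_tol=5.0, net_tol=0.02):
    """Return every tag whose mass AND elution time fall within tolerance."""
    return [
        t for t in tags
        if abs(ppm_error(mass, t.mass)) <= ppm_tol
        and abs(net - t.net) <= net_tol
    ]


# Hypothetical database: two tags are nearly isobaric but elute far apart.
tags = [
    AmtTag("PEPTIDEA", 1000.0000, 0.30),
    AmtTag("PEPTIDEB", 1000.0030, 0.70),
    AmtTag("PEPTIDEC", 1500.0000, 0.30),
]

# One LC-MS feature: the mass alone fits two tags, mass plus time fits one.
hits = match_feature(1000.0010, 0.31, tags)
mass_only = match_feature(1000.0010, 0.31, tags, net_tol=1.0)
print([t.peptide for t in hits], [t.peptide for t in mass_only])
```

In this toy database the mass tolerance alone leaves two candidates, and the elution time constraint reduces them to one, which is the uniqueness property the approach depends on.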

The following is a list of organisms that are being investigated in this project:

Proteomics Approaches

Our objective is to provide a comprehensive analysis of protein abundance and how it changes with changing environmental conditions, the general location of proteins within the cell, and protein modification state. We will also assess the rate of protein turnover that occurs when biological systems, either wild type or mutant, transition between two or more different growth states. Such measurements are important for understanding the dynamics of transcription and translation and for comparing gene expression data with proteome measurements.

Additionally, we will advance the state of the art in proteomics for examining complex biological systems, initially microbial communities and plants, as well as microbial systems for which sequenced genomes are not available. The latter will also require significant development and application of sample preparation techniques and bioinformatics that will be addressed in this project, as well as improvements in data quality, sensitivity, and informatics addressed by the companion effort.

Improved protein annotation through peptide/protein identifications: High-throughput annotation methods rely on computer-based algorithms to predict the location of protein-encoding genes (CDS) and the functions they encode. With S. oneidensis, we have shown that proteomics can be used to validate that hypothetical genes encode proteins and to improve the accuracy of the predicted N-termini. Further, we have shown that proteomics can be used to identify CDS that were either missed during the annotation process or were the result of sequencing errors. For the other biological systems listed, we will work with collaborators to facilitate use of the proteome data for improving the genome annotations of their respective organisms. This use of proteome data is especially important in eukaryotic systems, such as plants, where the prediction of coding sequences is more difficult than in prokaryotes. Determination of protein coding regions, correct protein termini, and reading frame will use tools developed in the companion HTP project and through collaborations with outside researchers (e.g., the Joint Genome Institute).

Broaden the application of quantitative protein abundance determinations: We have previously shown that using absolute peak intensity for a peptide analyzed by a high-resolution instrument is one of the most effective and sensitive methods for protein quantitation. We are broadening the application of this quantitative approach to include the additional organisms described above and to address the organism-specific hypotheses outlined for each model organism (Shewanella, Rhodobacter, etc.).

For example, comparisons of Pelagibacter grown in different cell states, Caulobacter swarmer and stalked cells, or Geobacter grown on an anode versus a soluble electron acceptor are a few of the many different experiments that will use this proteomic approach. Experiments of this type generate complex experimental designs, including hierarchical levels of replication (e.g., biological replicates, sample replicates, and technical replicates) across multiple conditions or mutants. Data analysis techniques must incorporate these complex experimental designs to provide statistically rigorous and relevant biological conclusions based on label-free abundance information.
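One simple way to respect the replicate hierarchy described above is to average log2 peak intensities level by level, so that each biological replicate contributes a single value to the comparison no matter how many technical replicates it has. The sketch below assumes this roll-up scheme and uses invented intensities and labels; a mixed-effects model would be a more rigorous alternative, and nothing here is taken from the project's own analysis tools.

```python
import math
from collections import defaultdict
from statistics import mean


def rollup(rows):
    """Collapse technical -> sample -> biological replicates.

    `rows` holds (condition, bio_rep, sample_rep, intensity) tuples, one per
    technical replicate. Returns {condition: {bio_rep: mean log2 intensity}}.
    """
    tech = defaultdict(list)
    for cond, bio, samp, intensity in rows:
        tech[(cond, bio, samp)].append(math.log2(intensity))

    # Average technical replicates within each sample replicate.
    samples = defaultdict(list)
    for (cond, bio, _samp), values in tech.items():
        samples[(cond, bio)].append(mean(values))

    # Average sample replicates within each biological replicate.
    result = defaultdict(dict)
    for (cond, bio), values in samples.items():
        result[cond][bio] = mean(values)
    return dict(result)


# Hypothetical peak intensities for one peptide in two growth conditions.
rows = [
    ("anode", "b1", "s1", 1024.0),
    ("anode", "b1", "s1", 4096.0),   # second technical replicate of b1/s1
    ("anode", "b1", "s2", 2048.0),
    ("anode", "b2", "s1", 8192.0),
    ("soluble", "b1", "s1", 512.0),
    ("soluble", "b2", "s1", 2048.0),
]

per_bio = rollup(rows)
# Biological replicates are the unit of comparison between conditions.
log2_fc = mean(per_bio["anode"].values()) - mean(per_bio["soluble"].values())
print(per_bio, log2_fc)
```

Averaging within each level first keeps heavily re-injected samples from dominating the estimate, and it leaves the number of biological replicates as the sample size for any downstream test.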

Sub-cellular localization: One biological question that proteomics is uniquely suited to answer is the location of proteins within a cell. Location, in this case, is usually operationally defined on the basis of a cell fraction obtained via physical separations, chemical separations, or a combination of the two. In gram-negative bacteria, for example, cells can be fractionated into cytosolic and membrane components. In addition, membranes can be further separated into inner and outer membrane fractions or, in the case of photosynthetic bacteria, photosynthetic (PS) membranes.

On a low-resolution level, we can determine whether a protein is associated with a water-soluble or water-insoluble fraction. Differences between the predicted (based on hydrophobicity and PSORT prediction) and observed locations yield potential candidates for protein interaction studies.
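The comparison of predicted and observed fractions can be sketched as follows, using only the hydrophobicity part of the prediction. The sketch assumes the Kyte-Doolittle hydropathy scale, with the mean hydropathy (GRAVY score) of a sequence compared against a cutoff of zero as a rough stand-in for a soluble versus insoluble call. The cutoff, sequences, and protein names are hypothetical, and PSORT is not invoked here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a protein sequence (GRAVY score)."""
    return sum(KD[aa] for aa in sequence) / len(sequence)


def predicted_fraction(sequence: str, cutoff: float = 0.0) -> str:
    """Crude prediction: hydrophobic on average -> insoluble fraction."""
    return "insoluble" if gravy(sequence) > cutoff else "soluble"


def discrepancies(sequences, observed, cutoff=0.0):
    """Names of proteins whose observed fraction differs from the prediction."""
    return sorted(
        name for name, seq in sequences.items()
        if predicted_fraction(seq, cutoff) != observed[name]
    )


# Hypothetical sequences and the fraction in which each protein was observed.
sequences = {"protA": "LLIVALLVIA", "protB": "DEKRSDEKRS"}
observed = {"protA": "insoluble", "protB": "insoluble"}

candidates = discrepancies(sequences, observed)
print(candidates)
```

Here the hydrophilic protein observed in the insoluble fraction is the one flagged, which is the kind of mismatch that could point to association with a membrane-bound partner.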

At a higher resolution level, separations can be integrated with quantitative proteomic studies to determine the primary location of proteins in a system, including proteins that co-localize between two fractions. Such studies are the basis for understanding not only how the organism interacts with its environment but also how proteins interact with each other. Proteins that localize between various fractions may be candidates for protein interaction partners. We are extending our work on R. sphaeroides to further examine the sub-cellular fractionation of wild type and mutant forms of this organism grown under different conditions. Every sample submitted to the proteomic pipeline undergoes routine examination of soluble and insoluble fractions, with low-resolution studies proceeding on all biological systems.
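As a rough illustration of combining fractionation with quantitation, the sketch below assigns each protein a primary location from its share of total abundance across fractions and reports any other fraction that holds a substantial share. The 25 percent threshold, function name, and abundance values are assumptions made for the example, not parameters from the project.

```python
def localize(abundance, min_share=0.25):
    """Return (primary fraction, other fractions holding >= min_share).

    `abundance` maps fraction name -> summed peptide abundance for one
    protein. Shares are computed relative to the total across fractions.
    """
    total = sum(abundance.values())
    if total <= 0:
        raise ValueError("protein was not detected in any fraction")
    shares = {fraction: value / total for fraction, value in abundance.items()}
    primary = max(shares, key=shares.get)
    co_located = sorted(
        fraction for fraction, share in shares.items()
        if fraction != primary and share >= min_share
    )
    return primary, co_located


# Hypothetical abundances for one protein across three fractions.
protein = {"cytosol": 10.0, "inner membrane": 60.0, "outer membrane": 30.0}
primary, co_located = localize(protein)
print(primary, co_located)
```

A protein with a non-empty second list is one that splits between fractions, which the text above identifies as a possible candidate for interaction studies.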

BER-PNNL Proteomics