Office of Biological & Environmental Research - PNNL Proteomics

Research Topics

This project aims to apply quantitative proteomics approaches to an increasing number of biological systems important for the advancement of Genomics: GtL projects and other DOE missions. Current biological systems include Shewanella and Geobacter, which are dissimilatory metal-reducing bacteria able to reduce metals and radionuclides as a consequence of their respiratory metabolism. An improved fundamental understanding of this metabolism will be useful for designing strategies for bioremediation of sites contaminated with heavy metals and radionuclides and for predicting the behavior of contaminants in the environment. Rhodobacter sphaeroides is a metabolically diverse bacterium that can generate energy by fermentation, aerobic respiration, or anoxygenic photosynthesis. Understanding how Rhodobacter senses its environment and adjusts its cellular machinery to allow for energy generation under these distinct lifestyles is essential for exploiting microbial systems for carbon management and energy generation. Caulobacter crescentus cultures can readily be synchronized, which makes this organism a useful model system for studies of cell cycle progression; it is the focus of a project currently supported by the Genomics: GtL program. Pelagibacter ubique is one of the dominant heterotrophic bacteria in open oceans and the only cultured isolate to date possessing the recently discovered proteorhodopsin light-driven proton pump that is abundant in such environments. This organism and culture enrichments containing close relatives serve as a model system for characterizing the role of marine microbial communities in ocean carbon and nutrient cycling.

AMT

We will center most of the high-throughput studies on LC-MS based measurements (the accurate mass and time (AMT) tag approach), developed as a "bottom-up" comprehensive proteomics method that provides greater sensitivity, dynamic range and throughput than other methods. Briefly, the LC-MS based measurement strategy (see Figure) combines the high mass measurement accuracy of FTICR mass spectrometry (MS) with the accurate elution time measurements provided by liquid chromatography (LC) separation(s) to identify peptides. The approach involves first generating a database of peptides that serve as AMT tags (i.e., biomarkers) using low-throughput tandem mass spectrometry (MS/MS), and then using this database to enable subsequent high-throughput LC-MS analyses of biological samples [14]. The approach relies on the mass and LC elution time of a given peptide being unique among all possible peptides deduced from a genome sequence, provided that the measured molecular mass and elution time are sufficiently accurate.
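The matching step described above can be illustrated with a minimal sketch. It assumes elution times are expressed as normalized elution times (NET) between 0 and 1, and it uses hypothetical peptide names, masses, and tolerances (5 ppm, 0.02 NET) chosen only for illustration; it is not the project's production software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # hypothetical peptide label
    mass: float    # monoisotopic mass in Da (illustrative value)
    net: float     # normalized elution time, 0..1 (assumed scale)


@dataclass(frozen=True)
class Feature:
    mass: float       # measured monoisotopic mass in Da
    net: float        # measured normalized elution time
    intensity: float  # peak intensity


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_features(features, tags, ppm_tol=5.0, net_tol=0.02):
    """Return {feature_index: peptide} for features that match exactly one tag
    within both the mass and the elution time tolerance."""
    matches = {}
    for i, feature in enumerate(features):
        hits = [
            tag for tag in tags
            if abs(ppm_error(feature.mass, tag.mass)) <= ppm_tol
            and abs(feature.net - tag.net) <= net_tol
        ]
        if len(hits) == 1:  # ambiguous or missing matches are not reported
            matches[i] = hits[0].peptide
    return matches


# Hypothetical database: two tags are indistinguishable by mass alone.
tags = [
    AmtTag("PEPTIDE_A", 1000.5000, 0.30),
    AmtTag("PEPTIDE_B", 1000.5020, 0.60),
    AmtTag("PEPTIDE_C", 1500.7500, 0.45),
]
features = [
    Feature(1000.5010, 0.305, 1.0e6),  # mass fits A and B, NET fits only A
    Feature(1500.7800, 0.450, 2.0e5),  # about 20 ppm from C: rejected
    Feature(1000.5010, 0.450, 3.0e5),  # mass fits A and B, NET fits neither
]
amt_matches = match_features(features, tags)
print(amt_matches)
```

In this toy data the first feature lies within 5 ppm of two tags, and only the elution time resolves it, which is the reason the approach uses both measurements.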


Proteomics Approaches:

Our objective is to provide a comprehensive analysis of protein abundance, how abundance changes with changing environmental conditions, the general location of proteins within the cell, and their modification state. We will also assess the rate of protein turnover that occurs in response to transitions of biological systems, either wild type or mutant, between two or more different growth states. Such measurements are important for understanding the dynamics of transcription and translation and for comparing gene expression data with proteome measurements. Additionally, we will advance the state of the art in proteomics for examining complex biological systems, initially microbial communities and plants as well as microbial systems for which sequenced genomes are not available. These latter tasks will also require significant efforts in the development and application of sample preparation techniques and bioinformatics that will be addressed in this project, as well as improvements in data quality, sensitivity, and informatics addressed by the companion effort.

Improved protein annotation through peptide/protein identifications: High-throughput annotation methods rely on computer-based algorithms to predict the location of protein-encoding genes (CDS) and the functions they encode. With S. oneidensis, we have shown that proteomics can be used to validate that hypothetical genes encode proteins and to improve the accuracy of the predicted N-termini. Furthermore, we have shown that proteomics can be used to identify CDS that were either missed during the annotation process or were obscured by sequencing errors. For the other biological systems listed, we will work with collaborators to facilitate use of the proteome data for improving the genome annotations of their respective organisms. This use of proteome data is especially important in eukaryotic systems, such as plants, where the prediction of coding sequences is more difficult than in prokaryotes. Determination of protein coding regions, correct protein termini and reading frame will utilize tools developed in the companion HTP project and through collaborations with outside researchers (e.g., the Joint Genome Institute).
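As a rough illustration of how identified peptides can support an annotation, the sketch below maps peptides onto predicted protein sequences by exact substring search and checks whether any peptide begins at the annotated start (position 0) or immediately after the initiator methionine (position 1). The locus names and sequences are hypothetical, and exact matching is a simplifying assumption; the tools referred to above are not shown here.

```python
def map_peptides(peptides, proteins):
    """Map each protein ID to a sorted list of (start, peptide) hits.

    Proteins with no peptide evidence are omitted from the result.
    """
    hits = {}
    for protein_id, sequence in proteins.items():
        found = []
        for peptide in peptides:
            start = sequence.find(peptide)
            while start != -1:  # record every occurrence
                found.append((start, peptide))
                start = sequence.find(peptide, start + 1)
        if found:
            hits[protein_id] = sorted(found)
    return hits


def supports_annotated_start(protein_hits):
    """True if a peptide starts at residue 0 (intact N-terminus)
    or residue 1 (initiator methionine removed)."""
    return any(start in (0, 1) for start, _ in protein_hits)


# Hypothetical predicted proteins and identified peptides.
proteins = {
    "hyp_0001": "MSTELVEAGRALLDK",
    "hyp_0002": "MAAQWEGTYK",
}
peptides = ["STELVEAGR", "ALLDK", "GGGGK"]

peptide_hits = map_peptides(peptides, proteins)
print(peptide_hits)
print(supports_annotated_start(peptide_hits["hyp_0001"]))
```

Here hyp_0001 is supported by two peptides, one of which is consistent with the annotated N-terminus, while hyp_0002 has no peptide evidence. A peptide such as GGGGK that maps to no predicted protein would be a candidate for a search against a six-frame translation of the genome, which this sketch does not attempt.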

Broaden the application of quantitative protein abundance determinations: We have previously shown that using absolute peak intensity for a peptide analyzed by a high-resolution instrument is one of the most effective and sensitive methods for protein quantitation. We will broaden the application of this quantitative approach to include the additional organisms described above and to address the organism-specific hypotheses outlined for each model organism (Shewanella, Rhodobacter, etc.). For example, comparisons of Pelagibacter grown in different cell states, Caulobacter swarmer and stalked cells, or Geobacter grown on an anode versus a soluble electron acceptor are a few of the many different experiments that will utilize this proteomic approach. We expect that these experiments will require complex experimental designs including hierarchical levels of replication (biological replicates, sample replicates, technical replicates) across multiple conditions or mutants. These experimental designs will be incorporated into our data analysis to provide statistically rigorous and biologically relevant conclusions based on label-free abundance information. In addition, normalization approaches will need to be advanced to handle these experimental designs. We will also pursue peptide quantitation by the use of stable isotope labeling where applicable.
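A minimal sketch of one way to handle hierarchical replication with label-free intensities follows. It assumes median centering of log2 intensities as the normalization step (reasonable only when most peptides do not change between runs) and a simple roll-up in which technical replicates are averaged within each biological replicate before biological replicates are averaged within each condition. The conditions, peptide labels, and intensities are hypothetical, and this is not presented as the normalization method the project will adopt.

```python
import math
from collections import defaultdict
from statistics import mean, median


def median_center(log_intensities):
    """Subtract the run median from the log2 intensities of one LC-MS run."""
    run_median = median(log_intensities.values())
    return {pep: value - run_median for pep, value in log_intensities.items()}


def condition_means(runs):
    """Roll up runs given as (condition, biological_replicate, {peptide: intensity}).

    Technical replicates are averaged within a biological replicate first,
    so a heavily re-injected sample does not dominate its condition.
    """
    per_bio = defaultdict(lambda: defaultdict(list))
    for condition, bio_rep, intensities in runs:
        logged = {p: math.log2(v) for p, v in intensities.items() if v > 0}
        for pep, value in median_center(logged).items():
            per_bio[(condition, bio_rep)][pep].append(value)

    per_condition = defaultdict(lambda: defaultdict(list))
    for (condition, _), by_peptide in per_bio.items():
        for pep, values in by_peptide.items():
            per_condition[condition][pep].append(mean(values))

    return {
        condition: {pep: mean(values) for pep, values in by_peptide.items()}
        for condition, by_peptide in per_condition.items()
    }


# Hypothetical data: two technical replicates of one biological replicate on
# an anode (the second injected at twice the load) and one soluble-acceptor run.
runs = [
    ("anode", 1, {"pepA": 8.0, "pepB": 4.0, "pepC": 2.0}),
    ("anode", 1, {"pepA": 16.0, "pepB": 8.0, "pepC": 4.0}),
    ("soluble", 1, {"pepA": 4.0, "pepB": 4.0, "pepC": 4.0}),
]
abundance = condition_means(runs)
log2_fold_change = abundance["anode"]["pepA"] - abundance["soluble"]["pepA"]
print(abundance)
print(log2_fold_change)
```

The twofold loading difference between the two anode injections disappears after median centering, so both technical replicates yield the same normalized values before they are averaged.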

Sub-cellular localization: One biological question that proteomics is uniquely suited to answer is the location of proteins within a cell. Location, in this case, is usually operationally defined on the basis of a cell fraction obtained via physical separations, chemical separations, or a combination thereof. In gram-negative bacteria, for example, cells can be fractionated into cytosolic and membrane components. In addition, membranes can be further separated into inner and outer membrane fractions or, in the case of photosynthetic bacteria, photosynthetic (PS) membranes. On a low-resolution level, one can determine whether a protein is associated with a water-soluble or water-insoluble fraction. Differences between the predicted (based on hydrophobicity and PSORT prediction) and observed locations yield potential candidates for protein interaction studies. On a higher-resolution level, separations can be integrated with quantitative proteomic studies to determine the primary location of proteins in a system, including proteins that co-localize between two fractions. Such studies serve as the basis for understanding not only how the organism interacts with its environment, but also how proteins interact with each other. Proteins that localize between various fractions may serve as candidates for protein interaction partners. We will extend the work on R. sphaeroides to further examine the sub-cellular fractionation of wild type and mutant forms of this organism grown under different conditions. Every sample that is submitted to the proteomic pipeline undergoes routine examination of soluble and insoluble fractions, and the low-resolution studies will proceed on all biological systems.
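The comparison of observed and predicted locations can be sketched as follows. The example assumes that a protein's abundance has already been quantified in each fraction, assigns a primary location when one fraction holds at least 80% of the total, and otherwise reports every fraction holding at least 20% as a shared location. Both thresholds, the protein names, the abundances, and the predicted locations are hypothetical and do not represent actual PSORT output.

```python
def assign_location(abundances, primary_share=0.8, min_share=0.2):
    """Return the list of fractions a protein is assigned to.

    One entry means a primary location; several entries mean the protein
    is shared between fractions; an empty list means it was not detected.
    """
    total = sum(abundances.values())
    if total <= 0:
        return []
    shares = {fraction: a / total for fraction, a in abundances.items()}
    top = max(shares, key=shares.get)
    if shares[top] >= primary_share:
        return [top]
    return sorted(f for f, share in shares.items() if share >= min_share)


def disagreements(observed, predicted):
    """Proteins whose predicted location is not among the observed ones."""
    return sorted(
        protein for protein, locations in observed.items()
        if predicted.get(protein) not in locations
    )


# Hypothetical per-fraction abundances and predicted locations.
measurements = {
    "protA": {"cytosol": 90.0, "inner membrane": 10.0, "outer membrane": 0.0},
    "protB": {"cytosol": 50.0, "inner membrane": 50.0, "outer membrane": 0.0},
    "protC": {"cytosol": 5.0, "inner membrane": 0.0, "outer membrane": 95.0},
}
predicted = {"protA": "cytosol", "protB": "inner membrane", "protC": "cytosol"}

observed = {protein: assign_location(a) for protein, a in measurements.items()}
candidates = disagreements(observed, predicted)
print(observed)
print(candidates)
```

In this toy example protB is split evenly between the cytosol and the inner membrane and is reported as shared, while protC is observed in a fraction that contradicts its prediction and is flagged as a candidate for follow-up.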