Ever wondered how all the processes inside a living cell work together as one system, in real time? How they all collaborate with and affect each other?
Well, wonder no more.
Researchers from Stanford University and the J. Craig Venter Institute have created a computerized, simulated model of the entire life cycle of the bacterium Mycoplasma genitalium (the bacterium with the smallest genome [as far as we know]).
M.g. contains only 525 genes [but how many proto-genes?], and data on its genome, transcriptome, proteome and metabolome have recently been acquired. Karr et al. combined data from >900 publications to define >1900 parameters, which they assigned to 28 modules of sub-cellular processes. Each module was simulated using the best-fitting model type (e.g. metabolism was modeled using flux-balance analysis, whereas RNA and protein degradation were modeled as Poisson processes). The difficult part, according to them, was to combine all 28 modules into one system.
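To get a feel for what "modeled as a Poisson process" means here, the degradation idea can be sketched in a few lines. This is a minimal illustration, not the paper's actual sub-model; the rates and molecule counts below are invented:

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw a Poisson-distributed count (Knuth's algorithm; fine for
    the modest means used in this sketch)."""
    L = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def degrade(count, rate_per_sec, dt, rng):
    """One time step of first-order degradation: the number of decay
    events is Poisson with mean count * rate * dt."""
    events = poisson_sample(count * rate_per_sec * dt, rng)
    return max(count - events, 0)   # can't decay below zero

rng = random.Random(0)
mrna = 1000                          # illustrative starting copy number
for _ in range(60):                  # sixty 1-second steps
    mrna = degrade(mrna, 0.01, 1.0, rng)
# on average roughly 1000 * exp(-0.6) ≈ 550 molecules remain
```

The point of the stochastic formulation is that two runs with different seeds give different trajectories, which is exactly the cell-to-cell variability the model exploits later on.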
They therefore assumed that (1) on short time scales (<1 sec) each module is independent of the others, and (2) by running each module in 1-second loops, they can use the variables produced by all the modules in the previous loop to calculate the variables of the current loop. Looping ends when the cell completes one division.
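That looping scheme is the conceptually interesting bit, so here is a toy sketch of it. The two "modules" and all the numbers are invented stand-ins for the paper's 28 sub-models; only the freeze-then-merge structure mirrors their approach:

```python
def metabolism(state):
    # invented module: produce 5 dNTPs per second
    return {"dntp": state["dntp"] + 5}

def replication(state):
    # invented module: copy up to 8 nucleotides per second, consuming dNTPs
    used = min(state["dntp"], 8)
    return {"dntp": state["dntp"] - used, "dna": state["dna"] + used}

def run_cell_cycle(state, genome_size):
    t = 0
    while state["dna"] < genome_size:       # loop until division completes
        prev = dict(state)                  # freeze last step's variables
        deltas = {}
        for module in (metabolism, replication):
            out = module(prev)              # each module sees only `prev`,
            for k, v in out.items():        # so modules are independent
                deltas[k] = deltas.get(k, 0) + (v - prev[k])
        for k, d in deltas.items():         # merge all module updates
            state[k] += d
        t += 1                              # one 1-second step elapsed
    return t

t = run_cell_cycle({"dntp": 20, "dna": 0}, genome_size=100)
```

Within a step the modules never see each other's updates; they only communicate through the shared state left over from the previous second, which is exactly assumption (1) above.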
They ran the simulation 128 times and then validated their model against databases that were not used to construct it. They claim that the model predictions closely matched the experimental data.
Now here comes the interesting part – their predictions. I won’t go over all of them – you can read that in the paper. But I do want to discuss a few unexpected predictions.
Metabolism regulates the cell cycle
The cell cycle in their model consists of three stages: initiation, replication and cytokinesis (cell division). While running the simulations, they noticed large cell-to-cell variability in the duration of the first two stages (~65% and ~38%), but very little variability in the duration of the last stage (~4%) or in the overall duration of the cell cycle (~9%). This was strange. Why was there so much variability in the first two stages, and how did it disappear when looking at the whole cell cycle?
With respect to replication initiation, they found a correlation with the initial copy number of a protein called DnaA, which must accumulate to a certain threshold at the replication start site in order to recruit DNA polymerase. They also found that the stochastic nature of the transcription and translation modules creates variability in the number of DnaA monomers throughout the simulation, not just at its start.
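The threshold idea is easy to demonstrate. In this hypothetical sketch, DnaA accumulates in random increments until it crosses a threshold, and the crossing time varies from cell to cell; the threshold and production rate are illustrative, not the paper's parameters:

```python
import random

def initiation_time(rng, threshold=100):
    """Time (in 1-second steps) until DnaA reaches the threshold that
    triggers replication. Production is stochastic: a crude Binomial(10,
    0.2) stand-in for Poisson production with mean 2 per second."""
    dnaA, t = 0, 0
    while dnaA < threshold:
        dnaA += sum(rng.random() < 0.2 for _ in range(10))
        t += 1
    return t

rng = random.Random(1)
times = [initiation_time(rng) for _ in range(200)]   # 200 simulated "cells"
mean = sum(times) / len(times)
spread = max(times) - min(times)
# identical parameters, yet initiation times differ between cells,
# purely because of the stochastic production of DnaA
```

The mean crossing time is set by threshold/rate (here ~50 s), but the spread around it is irreducible noise, which is the proposed source of the initiation-stage variability.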
As for the replication itself – since this is, as they call it, a deterministic process, they expected to see a straightforward correlation between replication and cell-cycle progression. Instead, their model suggests a two-step replication process: the first step is fast, but as soon as the pool of free deoxyribonucleotide triphosphates (dNTPs) drops below a certain threshold, replication slows to a rate equal to the rate of dNTP production. Hence, the duration of the replication stage in individual cells is related more to the free dNTP content at the start of replication than to the dNTP content at the start of the cell cycle. The availability of dNTPs imposes a sort of balance on the duration of the cell cycle: in cells with a long initiation phase, the dNTP concentration will be high once replication begins, and the replication stage will be short. But if replication begins early, after a short initiation phase, the dNTP pool will be depleted quickly, slowing the replication phase. Either way, the total duration of the cell cycle does not vary much among individual cells.
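That compensation mechanism can be captured in a toy model. Here replication runs fast until the free dNTP pool empties, then drops to the dNTP production rate; the genome size and rates below are made up, but the qualitative behavior matches the prediction:

```python
def replication_duration(dntp0, genome=10_000, fast=50, supply=20):
    """Two-step replication sketch (illustrative numbers): copy `fast`
    nucleotides/s while free dNTPs last, consuming one dNTP per
    nucleotide while `supply` new dNTPs/s are produced; once the pool
    runs low, replication is limited to the production rate."""
    dntp, copied, t = dntp0, 0, 0
    while copied < genome:
        rate = fast if dntp >= fast else supply   # pool sets the speed
        dntp = max(dntp + supply - rate, 0)
        copied += rate
        t += 1
    return t

short_init = replication_duration(dntp0=1_000)   # replication began early
long_init = replication_duration(dntp0=8_000)    # pool had time to build up
# the larger starting pool keeps the fast phase going longer,
# so the replication stage itself ends up shorter
```

A short initiation phase is thus paid back with a long replication phase and vice versa, which is why the total cycle length has so little variance.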
Discovering new roles and obtaining kinetic data for enzymatic activities
After reviewing the behavior of the wild-type cell, they went on to simulate single-gene mutants for all 525 genes (a total of >3000 runs!).
For each deletion they could determine whether the cell remains viable. The simulation predictions were ~80% accurate compared with experimental data.
They then went on to test several mutants experimentally. An interesting case was the gene lpdA. The model predicted its deletion to be inviable, which was inconsistent with previous data. They therefore tested the viability of this mutant and found the strain to grow 40% slower than the wild type. They reasoned that another protein must perform a task similar to LpdA's. Indeed, they found that another gene, nox, shows similarity to lpdA in sequence and function. Correcting the model to include this additional role for Nox yielded a viable simulated cell.
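The redundancy argument behind this fix can be expressed as a tiny sketch: a deletion is lethal only if no other gene covers the same essential function. The gene-to-function mapping below is invented for illustration, not taken from the model:

```python
def deletion_viable(gene_functions, knockout):
    """A knockout is viable iff every function carried by some gene is
    still covered by the remaining genes (toy essentiality rule)."""
    remaining = {fn for g, fns in gene_functions.items()
                 if g != knockout for fn in fns}
    required = {fn for fns in gene_functions.values() for fn in fns}
    return required <= remaining   # subset test: all functions still covered

# before the correction: only lpdA carries its function -> deletion lethal
before = {"lpdA": {"NADH turnover"}, "nox": {"other role"}}
# after crediting Nox with the shared function -> deletion survivable
after = {"lpdA": {"NADH turnover"}, "nox": {"other role", "NADH turnover"}}

lethal_before = not deletion_viable(before, "lpdA")
viable_after = deletion_viable(after, "lpdA")
```

The actual model encodes this as enzymatic reactions rather than abstract "functions", but the logic of the correction is the same: give a second gene the missing activity and the knockout stops being lethal.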
In several other cases, there was a discrepancy between the growth rates of viable mutants in silico and in vivo. These mutations were in genes coding for certain enzymes. They adjusted the kinetic parameters of each enzyme until the in silico outcome, in terms of cell-cycle duration, matched the in vivo outcome. In this way, they were able to estimate the kinetic parameters of enzymes in silico.
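This kind of parameter estimation can be sketched as a simple search. In the hypothetical model below, a faster enzyme monotonically shortens the simulated doubling time, so bisection on the rate constant recovers the value matching an "observed" duration; both the growth model and the numbers are made up:

```python
def doubling_time(kcat, target_biomass=2.0, dt=0.1):
    """Toy growth model: biomass grows at a rate proportional to a
    single enzyme's rate constant kcat; return the time to double."""
    biomass, t = 1.0, 0.0
    while biomass < target_biomass:
        biomass += biomass * kcat * dt
        t += dt
    return t

def fit_kcat(observed_time, lo=1e-4, hi=10.0, iters=60):
    """Bisect on kcat: simulated doubling time falls as kcat rises, so
    home in on the value whose prediction matches the observation."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if doubling_time(mid) > observed_time:
            lo = mid          # simulated cell too slow: enzyme must be faster
        else:
            hi = mid
    return (lo + hi) / 2

k = fit_kcat(observed_time=7.0)   # 7.0 is an invented "in vivo" measurement
```

The real exercise tunes Michaelis–Menten parameters inside the whole-cell model rather than a one-line growth law, but the principle – treat the mismatch between simulated and measured phenotype as a fitting objective – is the same.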
In each of these cases, the new catalytic values are consistent with experimental data.
Biologists are starting to realize more and more the importance of looking at the big picture, beyond their favorite protein or pathway. Systems biology is a field that is developing very fast.
Having a whole-cell in silico model is a dream of every systems biologist, indeed, of every biologist.
The fact that such a model is imperfect allows us to find discrepancies, and from them – new biological knowledge.
This model is a huge step forward in systems biology, and it is only the beginning.
Who knows, this may be the basis for designing new organisms in the future…
Jonathan R. Karr, Jayodita C. Sanghvi, Derek N. Macklin, Miriam V. Gutschow, Jared M. Jacobs, Benjamin Bolival Jr., Nacyra Assad-Garcia, John I. Glass, & Markus W. Covert (2012). A Whole-Cell Computational Model Predicts Phenotype from Genotype. Cell, 150(2), 389-401. DOI: 10.1016/j.cell.2012.05.044