versioning metaphor • 2007 Dec 10 • mahiwaga

Still reading stuff about multicellular computing.

Not necessarily applicable to software above the level of a single device, but Burbeck gets it wrong with regard to versioning issues in multicellular organisms.

For one thing, there is the key versioning difference between somatic cells and gametes. Somatic cells, the type of cells that make up neurons, myocytes, hepatocytes, blood cells, etc., are diploid: they carry two copies of nearly every gene, one version from each parent. (Genes on the sex chromosomes in males are the main exception.) Gametes are haploid, containing only one copy of every gene. The creation of gametes involves choosing which version of a gene to include. This has been thought to be a completely random process, although as we learn more, it may not be completely random.
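To make the somatic/gamete distinction concrete, here is a minimal sketch in Python. It assumes the simplest possible model: every gene is chosen independently and with equal odds, which ignores linkage and crossing-over. The gene names, the allele names, and the `make_gamete` function are all made up for illustration.

```python
import random

# Hypothetical diploid "somatic cell": each gene maps to its two versions,
# one inherited from each parent. Names are illustrative only.
SOMATIC_CELL = {
    "eye_color": ("brown", "blue"),
    "blood_type": ("A", "O"),
    "earwax": ("wet", "dry"),
}


def make_gamete(diploid, rng=random):
    """Return a haploid cell: one randomly chosen version of every gene.

    Assumption: genes are picked independently with equal probability.
    """
    return {gene: rng.choice(versions) for gene, versions in diploid.items()}


print(make_gamete(SOMATIC_CELL))
```

Each call yields one of eight possible gametes here; with thousands of genes the number of possible "releases" from a single parent becomes astronomically large.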

More dramatic is the versioning that occurs in cells of the immune system: each developing lymphocyte gets random changes to its DNA. Bits/base pairs are spliced out and discarded permanently, allowing each cell to recognize different parts of different foreign organisms (different epitopes). Versioning becomes important because the body has to deactivate versions that react to itself. Versioning also occurs when a B lymphocyte (a specific type of cell in the immune system) class-switches: it dumps entire segments of DNA in order to form different classes of immunoglobulin, that is, different classes of antibody, corresponding to different phases of infection. (IgM is produced acutely; IgG generally appears a week or two later.)
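The splice-then-screen idea can be sketched in a few lines of Python. This is a toy under loud assumptions: the segment lists are tiny and invented, a "receptor" is just a tuple of segment names, and self-reactivity is a simple lookup in a made-up set. Real lymphocyte development is far messier.

```python
import random

# Hypothetical gene segments; real loci have many more, plus extra
# diversity added at the joins.
V_SEGMENTS = ["V1", "V2", "V3", "V4"]
D_SEGMENTS = ["D1", "D2"]
J_SEGMENTS = ["J1", "J2", "J3"]

# Made-up receptor patterns that would react against the body itself.
SELF_REACTIVE = {("V1", "D1", "J1"), ("V2", "D2", "J3")}


def rearrange(rng):
    """Keep one segment of each kind; the others count as discarded for good."""
    return (rng.choice(V_SEGMENTS), rng.choice(D_SEGMENTS), rng.choice(J_SEGMENTS))


def negative_selection(cells, self_reactive):
    """Deactivate (drop) every cell whose receptor matches a self pattern."""
    return [cell for cell in cells if cell not in self_reactive]


rng = random.Random(42)
repertoire = [rearrange(rng) for _ in range(20)]
survivors = negative_selection(repertoire, SELF_REACTIVE)
print(len(repertoire), "cells generated,", len(survivors), "survive screening")
```

The point of the sketch is that every cell ends up with its own irreversible "version", and the organism then has to vet each version after the fact.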

Another type of versioning occurs with the specialization of stem cells. A pluripotent stem cell gives rise to multipotent stem cells, which in turn give rise to more narrowly committed progenitors. A stem cell can divide asymmetrically, so that one daughter cell remains a stem cell while the other daughter cell goes on to differentiate.

The reason why you can’t use differentiated cells to form other types of cells like you can with stem cells is that there are differences in how the resultant DNA is packaged. It doesn’t necessarily mean that actual bits/base pairs are discarded; rather, the histone scaffolding only allows certain genes to be expressed, and we don’t (yet) have a good handle on how to manipulate the scaffolding (although, as recent research has shown, we can reverse differentiation to a certain extent). The problem with non-selective manipulation of the scaffolding is that disrupting it is one of the mechanisms implicated in cancer: cells that mutate and become malignant tend to de-differentiate.

Yet another example of versioning is what occurs on the X chromosome in women: in each somatic cell, one of the two X chromosomes gets randomly deactivated early in development so that there is a proper balance in gene expression. (This process is called lyonization.) Because the two versions are essentially guaranteed to be different (one is from mother and the other from father), it can sometimes result in mosaicism of traits. More interesting is the fact that, in some instances, the cells are able to uniformly deactivate the version that has a deleterious mutant gene.

Then there are the sub-organismal transactions of data. Viruses continually infect cells, and transposons move around within the genome. Viral DNA can sit in the nucleus, often quiescent until some stress stimulus forces it to manifest itself, which then provokes an immune response and eventually leads to apoptosis. The viruses we are most familiar with in daily life (rhinoviruses, adenoviruses, rotavirus, etc.) infect cells that have limited life cycles, making such infections self-limited. Other viruses essentially take up permanent residence in our cells, and some in our DNA itself: varicella, herpes simplex, cytomegalovirus. Also hepatitis B and C. And HIV. These predominantly cause diseases of reactivation or chronic infection. (For example, the primary infection in HIV really just causes flu-like symptoms which eventually subside. The disease doesn’t really progress until much later.) While one can argue that these are foreign, invading DNA/code, it is clear that the packaging, transporting, and integration of pieces of DNA has long been a part of life, and may have a direct hand in evolution.

Again, the similarities between cell biology and computer software became more apparent with the popularization of Open Source. The prime example of DNA versioning is the Linux kernel. The same source code (the same DNA) is used to generate widely different kinds of kernels. Some run on desktop computers, others on servers, but even more drastic are the different versions that run mobile phones, PDAs, routers, DVRs, and various other embedded devices.
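Here is a rough sketch of that "same DNA, different cell types" idea in Python. It is not how the kernel's real configuration system works; the feature names, the `PROFILES` table, and the `build` function are all invented to illustrate one shared source tree being "expressed" differently per target.

```python
# One shared "genome": every feature the source tree knows how to build.
# These names are illustrative, not real kernel configuration symbols.
SOURCE_TREE = frozenset(
    {"scheduler", "networking", "usb", "graphics", "wifi", "raid", "power_saving"}
)

# Each target "expresses" a different subset of the same source.
PROFILES = {
    "desktop": {"scheduler", "networking", "usb", "graphics", "wifi"},
    "server": {"scheduler", "networking", "raid"},
    "router": {"scheduler", "networking", "wifi"},
    "phone": {"scheduler", "networking", "wifi", "power_saving"},
}


def build(target):
    """Return the sorted list of features compiled in for a given target."""
    enabled = PROFILES[target]
    unknown = enabled - SOURCE_TREE
    if unknown:
        # A profile cannot express a gene the genome does not contain.
        raise ValueError(f"features not in source tree: {sorted(unknown)}")
    return sorted(enabled)


for target in PROFILES:
    print(target, "->", build(target))
```

Nothing is deleted from the source tree; each build simply leaves most of it switched off, which is closer to the histone scaffolding than to the immune system's cut-and-discard approach.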

We are beginning to enter the age where code can become self-sustaining. The software update programs in Mac OS X and Windows are just simple examples of what is likely to become standard fare. Google and Amazon already give suggestions about what other content we might be interested in. We can make DVRs selectively record shows we might be interested in. iTunes already has the rudiments of being able to pick out the songs you like in your collection, simply by keeping track of the play count. For very specific things, the computer can anticipate your needs.

Imagine if you extrapolated this principle to software (RNA) itself. Obviously there would be enormous security risks. But what if your system anticipated and automatically downloaded software you might be interested in? What if your Linux distro could anticipate which of the cutting-edge patches you might be more interested in, and offer to immediately download and compile them in? What if embedded devices, when connected to the Internet, could detect when a patch relevant to that particular device was released, and essentially update themselves?
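The device-side half of that idea might look something like the sketch below. Everything in it is hypothetical: the `Patch` record, the device names, and the in-memory `FEED` stand in for whatever a real update service would publish. It also deliberately skips the hard part, verifying that a patch is authentic, which is where the security risks live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    target_device: str
    version: tuple  # e.g. (1, 2, 0); tuples compare element by element


def relevant_patches(device, installed_version, feed):
    """Patches aimed at this device and newer than what it runs, oldest first."""
    matches = [
        p for p in feed
        if p.target_device == device and p.version > installed_version
    ]
    return sorted(matches, key=lambda p: p.version)


# Made-up feed of published patches.
FEED = [
    Patch("fix-wifi", "router-x", (1, 3, 0)),
    Patch("fix-dns", "router-x", (1, 2, 0)),
    Patch("old-fix", "router-x", (1, 0, 5)),
    Patch("camera-tweak", "dvr-y", (2, 0, 0)),
]

for patch in relevant_patches("router-x", (1, 1, 0), FEED):
    print("would apply:", patch.name, patch.version)
```

A router running version 1.1.0 would pick up the two newer router patches in order and ignore both the stale one and the patch meant for the DVR.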