central dogma • 2007 Dec 9 • mahiwaga

Of course, I suppose I really should’ve searched Google before trying to coin a phrase. Other people have already drawn the analogy between the mechanisms of life and the mechanisms of computer programming and information technology.

While the notion of objects (in a programming sense) being self-contained entities consisting of both executable code and inert data is accurately descriptive of cellular mechanisms, the idea of software above the level of a single device being analogous to multicellular organisms hasn’t quite been addressed.

For this, we need to discuss the dominant paradigm in cellular biology, ostentatiously called the central dogma: DNA→RNA→protein. This fits well with the usual flow of code: source→raw object code/byte code→machine language. This also matches the trickle-down concept of the current World Wide Web: you download stuff from the Web onto your computer, and you then transfer digital music or videos to your device.
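To make the one-way flow concrete, here is a minimal sketch of DNA→RNA→protein written as a two-stage pipeline. The function names and the tiny codon table are my own illustration, not a real bioinformatics library: the table holds only the four codons the example needs, and the input is assumed to be the coding strand.

```python
# Partial codon table: only the codons used in the example below.
# A real table has 64 entries; this one is deliberately incomplete.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "GCU": "Ala",
    "UGG": "Trp",
    "UAA": None,   # stop codon
}


def transcribe(coding_dna: str) -> str:
    """DNA -> RNA: copy the coding strand, writing U wherever DNA has T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """RNA -> protein: read three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGCTTGGTAA"
mrna = transcribe(dna)     # "AUGGCUUGGUAA"
protein = translate(mrna)  # ["Met", "Ala", "Trp"]
print(dna, "->", mrna, "->", "-".join(protein))
```

Each stage only consumes the output of the one before it, which is the same shape as source→byte code→machine language.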

But in biology, the discovery of retroviruses proved that the dogma wasn’t quite that strict. In their case, information flows RNA→DNA. In fact, it is becoming more accepted that life may have started out with RNA rather than DNA. And proteins aren’t left out of the flow of information: certainly ribosomes and histones affect how DNA is expressed, not to mention the action of nuclear receptors, as well as the transcription and replication machinery that copies DNA→RNA and DNA→DNA.
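The retroviral exception can be sketched the same way, as a stage that runs backwards. This is a deliberate simplification under my own assumptions: both function names are hypothetical, and sequences are treated as plain strings on the coding strand, ignoring complementary strands and everything else the real enzyme (reverse transcriptase) has to deal with.

```python
def transcribe(coding_dna: str) -> str:
    """DNA -> RNA: the 'forward' direction of the central dogma."""
    return coding_dna.upper().replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """RNA -> DNA: the direction retroviruses add to the picture."""
    return rna.upper().replace("U", "T")


viral_rna = "AUGGCUUGGUAA"
proviral_dna = reverse_transcribe(viral_rna)  # "ATGGCTTGGTAA"

# Round trip: the DNA copy can be transcribed back into the original RNA.
round_trip = transcribe(proviral_dna)
print(viral_rna, "->", proviral_dna, "->", round_trip)
```

In this toy model the two functions are inverses of each other, which is all the point requires: information can move RNA→DNA as well as DNA→RNA.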

While it is unknown whether or not life started as RNA, code definitely started on single-cell computers. I’m not sure where to put the old mainframe servers into the paradigm, but most modern servers are essentially single-cell computers or networks of single-cell computers. In the current incarnation, interpreted languages such as Perl, PHP, Python, and Ruby, not to mention JavaScript, are the “duct tape that holds the Web together.” Java and C# also fit into this schema, although there is an extra level of abstraction in the form of a virtual machine. These languages all correspond to RNA: partly structural, partly functional; executable code that isn’t raw machine language. Meanwhile, compiled languages like C/ObjC/C++ correspond to DNA: pure source code that needs to be transcribed to object code, which then needs to be linked into an executable before the machine can run it.
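The "executable code that isn’t raw machine language" stage is easy to see in Python itself. The sketch below uses only the built-in `compile` and `exec`; the source string, the `"<example>"` filename label, and the variable names are placeholders of mine. It shows CPython's behavior specifically, and other interpreters may handle the intermediate form differently.

```python
# Source text: the human-readable form.
source = "result = 6 * 7"

# Source -> code object. The code object carries byte code, an
# intermediate form that the interpreter runs, not machine language.
code_obj = compile(source, "<example>", "exec")
bytecode = code_obj.co_code  # raw byte code as a bytes object

# Byte code -> behavior: the interpreter executes it in a namespace.
namespace = {}
exec(code_obj, namespace)

print(type(bytecode).__name__, len(bytecode) > 0)
print(namespace["result"])
```

The code object sits between the source and the running program in much the way RNA sits between DNA and protein: it is derived from the source, and it is the thing the machinery actually reads.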

But the idea of multicellular computing applying to software above the level of a single device is less concrete than this. DNA is the content that sits on the servers. RNA is the software that manages the content: browsers, media players, sync software, iTunes, and RSS aggregators, and arguably the OS as well. Protein is the content actually being used, activated, or consumed: an MP3 file being listened to, an AVI being watched, a Flash or Java application being deployed on a mobile device.

The Open Source paradigm makes it readily apparent that DNA, that is, content—source code—can be turned into RNA. You no longer have to buy RNA directly from the software developer. You can build it yourself.

While security practices are meant to prevent the inadvertent running of untrusted code, it’s going to happen anyway. In fact, it’s meant to happen. You can cut-and-paste scripts from the Web and deploy them willy-nilly. Scripts will mutate, reproduce, metastasize. The evolution of the Web is dependent on the flow of information to and fro.