Discussion

Two achievements have been shown: the stability of non-self-replicator systems over very long timescales and the emergence of non-self-replicator systems without specific seeding. Both results are extremely important. Firstly, molecular replication systems can hardly be envisaged as being constituted of self-replicators, because self-replicators need to solve the compartmentalization problem. Secondly, non-self-replicator systems are extremely vulnerable to parasitism, and especially at the onset of non-self-replication, perturbation by other micro-controllers is very likely. Only after many sequences have been replicated is a sufficient number of replicas available to form the basis for the proliferation of the species and, in addition, to change the environment such that disturbing events can no longer happen, e.g. in the simplest case by overwriting all competitors with the species' own code.
A third, indirect result of this work, apparent when studying the emerging replication systems, is that the development of replication from scratch purely through network interactions is unlikely. This could already have been seen in [50], where no modular structures evolved when solving the multiplier problem; even after long evolutionary times the evolving robustness did not yield modules. On the contrary, every function-generator of the multiplier evolved its own special solution (4x16 function-generators with 4-bit inputs each evolved and together yielded a 4-bit multiplier), and the mathematical structure was not recognized or learned explicitly by the system. This does not contradict the astonishingly simple solution obtained when introducing self-assembly [16], because this self-assembly procedure only allows modules to be evolved. A consequence of this third finding is that probably no miraculous network topology will help to create a replication system; what matters is only the sheer size of the search space for finding a sufficiently small non-self-replication system. In the current work, with the currently available computing power, this amounts to around 22 unknown bits to be found. Transferring this result to biochemical systems requires a system setup in which physical and chemical properties provide the vast majority of the information for getting replication running, and only a tiny amount of flexibility in the information-carrying modules can be tolerated.
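
To give a feeling for what "around 22 unknown bits" means as a search problem, the following minimal sketch computes the size of such a search space and the chance of hitting a solution by blind random sampling. It is an illustration only and not the simulation used in this work: the function names are hypothetical, and the assumptions of uniform, independent draws and of a single solution (`solutions=1`) are simplifications introduced here.

```python
import math


def search_space_size(unknown_bits: int) -> int:
    """Number of distinct bit patterns for the given number of unknown bits."""
    return 2 ** unknown_bits


def expected_draws_until_hit(unknown_bits: int, solutions: int = 1) -> float:
    """Mean number of uniform random draws until a solution is hit.

    Assumption (not from the source): draws are independent and uniform,
    so the waiting time follows a geometric distribution.
    """
    return search_space_size(unknown_bits) / solutions


def hit_probability(unknown_bits: int, draws: int, solutions: int = 1) -> float:
    """Probability that at least one of `draws` random samples is a solution."""
    p = solutions / search_space_size(unknown_bits)
    # 1 - (1 - p)**draws, written in a numerically stable form.
    return -math.expm1(draws * math.log1p(-p))


if __name__ == "__main__":
    bits = 22
    print("search space size:", search_space_size(bits))
    print("expected draws   :", expected_draws_until_hit(bits))
    print("P(hit) after 2^22 draws:", hit_probability(bits, 2 ** bits))
```

Under these assumptions 22 bits correspond to 4,194,304 candidates, and every additional unknown bit doubles the expected effort, which is why only a small number of bits can be left to chance.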

Reflecting on the literature in light of the current results

The following list is a loose collection of comments and remarks on literature related to this work:
  • Almost all micro-controller-based software evolution studies assumed the existence of cellular environments, see e.g. [42] and follow-ups [2]. The question of the transition from the abiotic to the biotic world has hardly been tackled with evolving software, apart from von Neumann's and Holland's work. The second commonly taken approach was to search for self-replicators. But as [14] pointed out, self-replication has a strong tendency towards simplified replication phenomena and, as mentioned in the introductory section, self-replicators are not feasible when addressing molecular replication without cellular environments.
  • The first work that tried to develop a computational model and provided formal proofs of emerging replication systems is Holland's α-universe. He gave no evidence, however, that he actually implemented this model and checked his proofs, and only one work, [34], tried to realize it. McMullin was not able to validate Holland's findings and argued that especially interactions with other, not yet complete replication systems would hinder the emergence of stable replicators.
  • Pargellis [36] was the first to show that self-replicating software could emerge in micro-controllers. He streamlined the Tierra [42] instruction set such that about one in 100,000 random five-instruction sequences resulted in a self-replicator. The follow-up work [37] extended the instruction set to 32 instructions (1 in 20 million random sequences yielded a self-replicator) and also showed the emergence of self-replicators that are Turing-universal. As with Tierra, the allocation of memory and the division of copies from the parent are difficult to map to biological systems.

  • In the presented model two questions have not yet been answered: firstly, whether the software evolving in the micro-controllers is indeed able to evolve arbitrarily complex features, and secondly, whether externally given tasks can be solved by these evolving micro-controllers. Currently, successfully solving externally given tasks requires very special fitness landscapes, introduced from the outside, to allow complex features to evolve [29]. In particular, the intermediate steps had to be rewarded for software evolution in micro-controllers to evolve even some simple Boolean functions [30]. This problematic situation might be relieved if self-assembly, and thus structure-learning processes, could be utilized [16].
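
The success rates quoted above for the work of Pargellis (1 in 100,000 and 1 in 20 million random sequences) can be put on the same scale as a search space measured in bits. The sketch below does this by taking the base-2 logarithm of the "1 in N" figure. Reading a success rate as an equivalent number of bits is an interpretation made here for illustration, not a statement from the cited works, and the function name is hypothetical.

```python
import math


def bits_equivalent(one_in_n: float) -> float:
    """Express a '1 in N' success rate as log2(N) bits.

    Assumption (introduced for illustration): a hit rate of 1/N is treated
    as equivalent to guessing log2(N) unknown bits correctly at random.
    """
    if one_in_n < 1:
        raise ValueError("one_in_n must be at least 1")
    return math.log2(one_in_n)


if __name__ == "__main__":
    reported = {
        "five-instruction sequences [36]": 100_000,
        "32-instruction set [37]": 20_000_000,
    }
    for label, n in reported.items():
        print(f"{label}: 1 in {n:,} is about {bits_equivalent(n):.1f} bits")
```

With this reading the two rates correspond to roughly 16.6 and 24.3 bits.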

From different types of machines towards evolvable hardware

With the different types of micro-controllers at hand, and with evolution allowed to switch between these types, the natural extension is to let the machines be constructed directly by the dynamics or evolutionary processes in the system. Of course the search space increases dramatically in size, and it is an open question whether a pathway from simple machines towards more complex ones still exists. It was the recurring result of research in this area that machines of even moderate complexity failed to give rise spontaneously to replication systems. The gaps between neighbouring machine types can be small, however: for example, only six additional bits are needed to modify the replicating program of the simplest machine so that it runs successfully on the machine where the End-instruction really means End and the instruction SetFB is replaced with a Goto-instruction, see table 1, left part. It should be possible for evolution to jump over this gap of six bits. Also straightforward is the concurrent existence of differing widths of the cargo-part in the system. Whether these jumps are possible during evolution has to be tested thoroughly. The more plasticity is built into the hardware, meaning the more the hardware as such is evolvable, the more it becomes a target of evolution. The pathway from one machine type to another, or from one hardware to the next, should be as smooth as possible. How to realize such a smooth "hardware-landscape" is part of future research. The big advantage of replicating systems, though, is the high abundance of copies of successful replicators. This hopefully gives enough robustness to test and play with many types of machines in one system. Of course, another extremely interesting question is how evolution behaves if only certain types of machines are possible in different parts of the simulation space; for example, containers 0-9 would allow only machines with 2 special-bits (SP) and containers 10-19 only machines with 3 special-bits.
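
The spatial restriction of machine types described at the end of the paragraph above could be expressed as a simple rule table. The following is a minimal sketch of that one example (containers 0-9 with 2 special-bits, containers 10-19 with 3 special-bits). The names `ALLOWED_SP_BITS`, `allowed_sp_bits` and `machine_permitted` are hypothetical and do not refer to the simulation code of this work.

```python
# Hypothetical rule table: each entry maps a range of container indices
# to the number of special-bits (SP) a machine must have to run there.
ALLOWED_SP_BITS = [
    (range(0, 10), 2),   # containers 0-9: machines with 2 special-bits
    (range(10, 20), 3),  # containers 10-19: machines with 3 special-bits
]


def allowed_sp_bits(container: int) -> int:
    """Return the number of special-bits permitted in the given container."""
    for containers, sp_bits in ALLOWED_SP_BITS:
        if container in containers:
            return sp_bits
    raise ValueError(f"no rule defined for container {container}")


def machine_permitted(container: int, machine_sp_bits: int) -> bool:
    """Check whether a machine with the given SP width may run in a container."""
    return allowed_sp_bits(container) == machine_sp_bits


if __name__ == "__main__":
    for container, sp in [(3, 2), (3, 3), (12, 3)]:
        print(container, sp, machine_permitted(container, sp))
```

Keeping the rule in a table rather than in the machine logic would make it easy to vary the layout of the regions between experiments.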

Connecting to molecular dynamics (MD)

Now that the spontaneous emergence of replicators is at hand, the artificial restriction to simple spatial topologies can be relaxed. It is straightforward to incorporate the evolving software into a molecular dynamics code such as LAMMPS (http://lammps.sandia.gov/). Information processing in the world of simulated molecules then becomes feasible. Using the mesoscale simulation facility DPD of LAMMPS, or the extension multipolar reactive DPD [18], we will be able to combine physically valid system dynamics with evolving soft- and hardware.

Transferring the results to biochemical experiments

Certain physical assumptions had to be made to allow for a successful spontaneous emergence of replicator systems. Most of these assumptions have been reported in the section about physical assumptions and are only summarized here:
  • tri-molecular reactions had to be assumed, which in biochemistry translates to catalysts sliding along the template. This could, for example, be a stochastic ratchet-like process. If this is not possible, then a physical structure has to provide the same effect, e.g. the template being pushed and pulled by hydrodynamic pressure through the eye of a needle or a cavity, with the catalyst attached to the opening.

  • only very few bits can be encoded in the sequences; everything else has to be provided by physics and chemistry. There is no hope that the parallelism available in nature is gigantic enough, compared to what we have available in the computer, to change this: firstly, the explosion of the search space outnumbers the available resources right from the beginning, and secondly, physical non-determinism, fuzziness and Brownian motion consume much of the parallelism resources. This is a very important finding. Even though proponents of the RNA-world hypothesis, [2026], believe that ribozymes can in principle solve the replicator-emergence problem, a gap still exists between the required fidelity of replication and the capabilities of ribozymes. The upper bound of perhaps 20 to 30 bits on the exploitable informational search space implies that ribozymes needing more than a few nucleotides will probably not be able to emerge spontaneously.

  • magic network behavior probably doesn't help. Before, it was not clear whether some combination of building blocks of sequences that accidentally happen to be in the same area could mimic a replicator. That no hints of such behavior have yet been found in the experiments makes it unlikely that networks of cooperating molecules would develop into a replication system in nature. If several components were to work together, then the connections between these components would have to be very profound and reliable, so that they work as one entity and not as a network of loosely coupled operators. This does not contradict the intriguing catalytic properties of, e.g., cleaving deoxyribozymes [31] that build an auto-catalytic cleavage process. Auto-catalytic replication of information is a much more complex process than simply letting ribozymes cleave circular rings of RNA, which then become active ribozymes.

  • already known: spatial resolution is important. That spatial resolution is an important ingredient has been known for many years [8, 47, 33, 17]; it helps to overcome parasites and generates diversity due to time delays in the communication. Translated to biochemical scenarios, this means that well-stirred and turbulent fluidic ensembles are not very suitable environments for the first emergence of life-like processes.

  • sequence specificity might be sufficient; compartments in a physical sense are probably not needed. That physical compartments in a strict sense are not important is good news, because containers always raise the question of how they are created and maintained in the course of evolution. Though protocell research has made considerable progress [39], the problem of linking informational molecules to the creation of vesicles from amphiphiles produced by the chemistry itself, and to the necessary division of vesicles, is still poorly understood. Furthermore, getting resources in and waste out is a fundamental problem of containers.

  • circular plasmid-like templates, and perhaps also circular catalysts, have the big advantage that the catalyst can stay at the template and produce many copies of the same template. This gives a dynamically completely different robustness. The only problem, and a common one, is how the copy is released from the ternary complex. In the experiments an End-of-Sequence recognition was used, which could be realized as a sort of 'STOP-codon' in a biochemical implementation. Another possibility could be a sequence-replacement system, which has been realized successfully in the DNA-computing realm [56]. Refolding of the secondary structure due to different salt conditions or temperature gradients could also change the enzymatic functions of the ribozymes [27].

  • low perturbation by random sequences. It turned out that too many random sequences in the vicinity of a replication system are a problem for spontaneous emergence. In contrast to the protein world of [25], labile replication systems are subject to perturbations by other, unrelated sequences. Trinks [51] proposed ice cavities as a possible site of the origin of life, which seems plausible because of the huge parallelism, low energy intake and reliable environmental conditions. This view is supported by [52], who argues that the phosphodiester backbone of RNA can be stabilized in ice.
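
The bound of perhaps 20 to 30 bits mentioned in the list above can be translated into a rough sequence length. The sketch below assumes a four-letter nucleotide alphabet in which every position must be fully specified and all letters are equally likely, so that each position carries log2(4) = 2 bits. These assumptions are introduced here for illustration, real ribozymes tolerate variation at many positions, and the names used are hypothetical.

```python
import math

# Assumption (not from the source): four equally likely nucleotides,
# each position fully specified, hence 2 bits per position.
BITS_PER_NUCLEOTIDE = math.log2(4)


def max_specified_nucleotides(bit_budget: float) -> int:
    """Largest number of fully specified positions that fits the bit budget."""
    return math.floor(bit_budget / BITS_PER_NUCLEOTIDE)


if __name__ == "__main__":
    for budget in (20, 30):
        n = max_specified_nucleotides(budget)
        print(f"{budget} bits allow about {n} fully specified nucleotides")
```

Under these assumptions a budget of 20 to 30 bits corresponds to only about 10 to 15 fully specified nucleotide positions.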

Conclusions

A long-standing problem has been solved: the de novo emergence of enzyme-like replicating systems could be achieved. The key to success was the steady simplification of the micro-controllers and the addition of physically plausible constraints, which in turn allowed further simplifications of the system. With the new knowledge gained, further origin-of-life models can be changed accordingly, and it is expected that many of them will be able to show emergent replication systems. The even more interesting question is whether we now have a guideline for realizing emergent replication systems in real physical and chemical environments. It is the hope of this work that it can produce hints on how chemical experiments have to be set up. A bridge between computer science and experimental origin-of-life research is now visible, and future work will certainly strengthen this tiny pathway.