Role of optimization in artificial cells

Artificial cells, by their very nature, are composed of subsystems that must function together to satisfy the criteria of life.  As described in the theoretical work packages ("Artificial cells: conception and simulation" and "Artificial cell evolution and functionality"), there are three basic functionalities that a living artificial cell must include:  (i) containment / identity, (ii) metabolism, (iii) an informational encoding of all components that can evolve through reproduction under selective pressure.


Subsystems of an artificial cell will typically target each of these functionalities, but combining the subsystems will, in general, strongly affect their individual functionality, because the interactions between subsystems are strongly nonlinear.  The effect is extremely difficult, if not impossible, to predict from first principles.


The problem may be formulated as one of experimental design.  The subsystems together can typically be characterized by real variables (concentrations of constituent chemicals, experimental conditions, etc.), and so represented by a point in a high-dimensional space, which we will call the experimental space.  The ultimate target functionality is the degree to which an artificial cell survives and reproduces, but for particular subsystems and their combinations there may be various other functionalities, typically measurable by an assay, e.g. a fluorescence assay or a spectrophotometric assay.  The measured result defines a real-valued function on the experimental space, typically called the response.  Thus, the experimental response is a surface over a high-dimensional space.  The optimization problem is to find the points (experiments) in the experimental space that generate a high response.
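To make the formulation concrete, the following minimal Python sketch treats an experiment as a tuple of real numbers and the response as a function of that tuple. The function `toy_response`, its peak at 0.5, and the three candidate experiments are illustrative assumptions, not data or a model from this work.

```python
import math
from typing import Callable, Sequence

# One experiment = one point in the experimental space
# (e.g. a tuple of normalized concentrations).
Experiment = Sequence[float]


def toy_response(x: Experiment) -> float:
    """Hypothetical stand-in for an assay readout.

    Peaks at 1.0 when every coordinate equals 0.5 (an assumed optimum).
    """
    return math.exp(-sum((xi - 0.5) ** 2 for xi in x))


def best_experiment(
    candidates: Sequence[Experiment],
    response: Callable[[Experiment], float],
) -> Experiment:
    """Return the candidate experiment with the highest response."""
    return max(candidates, key=response)


# Three made-up experiments in a 3-dimensional experimental space.
candidates = [(0.1, 0.9, 0.5), (0.5, 0.5, 0.5), (0.9, 0.1, 0.2)]
best = best_experiment(candidates, toy_response)
```

In a real setting the response would come from a laboratory assay rather than a formula, so each evaluation is expensive; that cost is what makes the choice of candidates matter.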


There are two fundamental issues that make this optimization problem difficult:


(i) The space is typically high dimensional, in the sense that the experiments have many constituent components.  Even when the space is discretized, e.g. by varying concentrations in discrete steps, the number of possible experiments can be daunting.  For example, if the experimental space is a 20-dimensional space of concentrations of 20 constituent chemicals, and each dimension is discretized to five possible concentration values, the total number of experiments is 5^20, or about 10^14.


(ii) The response surface typically cannot be known a priori and is nonlinear, in the sense that the response in the full experimental space is not a simple linear combination of the responses to the individual experimental factors (each dimension in the space).  The nonlinearity of the response surface can arise from nonlinear chemical kinetics as well as from other physico-chemical effects, such as phase transitions associated with self-assembly processes.
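Both issues can be illustrated with a short Python sketch. The first part counts the experiments in the discretized example of point (i), five levels for each of 20 factors. The second part shows one simple way to detect the non-additivity described in point (ii) for two factors. The functions `additive` and `interacting` and their coefficients are hypothetical and chosen only for illustration.

```python
from typing import Callable

# Point (i): size of the discretized experimental space.
n_factors = 20  # constituent chemicals
n_levels = 5    # concentration values per chemical
total_experiments = n_levels ** n_factors  # 95_367_431_640_625, roughly 1e14


# Point (ii): a two-factor check for non-additivity.
def additive(a: float, b: float) -> float:
    """Hypothetical response that is a linear combination of the factors."""
    return 2.0 * a + 3.0 * b


def interacting(a: float, b: float) -> float:
    """Same response plus an assumed a*b interaction term."""
    return 2.0 * a + 3.0 * b + 4.0 * a * b


def interaction_contrast(f: Callable[[float, float], float]) -> float:
    """Return f(1,1) - f(1,0) - f(0,1) + f(0,0).

    This is zero whenever the response is a sum of single-factor effects.
    """
    return f(1.0, 1.0) - f(1.0, 0.0) - f(0.0, 1.0) + f(0.0, 0.0)


additive_contrast = interaction_contrast(additive)        # 0.0
interacting_contrast = interaction_contrast(interacting)  # 4.0
```

A nonzero contrast means the effect of one factor depends on the level of the other, so responses measured one factor at a time cannot simply be summed to predict the combined system.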


Thus, the optimization has a particular character: the experimental space can only be subsampled by a set of experiments vastly smaller than the entire space, and those experiments must be designed to maximize the target response.
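As a rough illustration of how sparse such a subsample is, the Python sketch below draws a small budget of distinct experiments uniformly at random from a 5-level, 20-factor grid and keeps the one with the highest response. Uniform random sampling is used here only as a naive baseline, not as the design method of this work; the budget of 200, the concentration levels, the seed, and `toy_response` are all assumptions.

```python
import random
from typing import List, Tuple

LEVELS = (0.0, 0.25, 0.5, 0.75, 1.0)  # assumed normalized concentration levels
N_FACTORS = 20                        # dimensions of the experimental space
BUDGET = 200                          # assumed number of affordable experiments

Experiment = Tuple[float, ...]


def toy_response(x: Experiment) -> float:
    """Hypothetical nonlinear response: mean level plus neighbor interactions."""
    mean_level = sum(x) / len(x)
    interactions = sum(a * b for a, b in zip(x, x[1:]))
    return mean_level + interactions


def sample_experiments(rng: random.Random, budget: int) -> List[Experiment]:
    """Draw `budget` distinct grid points without enumerating the grid."""
    seen = set()
    while len(seen) < budget:
        seen.add(tuple(rng.choice(LEVELS) for _ in range(N_FACTORS)))
    return sorted(seen)  # sorted so the result order is reproducible


rng = random.Random(0)  # fixed seed so the sketch is repeatable
experiments = sample_experiments(rng, BUDGET)
best = max(experiments, key=toy_response)

# Fraction of the full grid that the budget covers (about 2e-12 here).
fraction_sampled = BUDGET / len(LEVELS) ** N_FACTORS
```

With so small a covered fraction, blind sampling is unlikely to land near an optimum, which is why the experiments have to be chosen deliberately, for example by using earlier responses to guide later choices.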