Information and Entropy in Biological Systems Workshop
• the maximum entropy principle in ecology,
• Shannon and Rényi entropies as measures of biodiversity,
• information theory in evolutionary game theory,
• information and the thermodynamics of individual cells.
Because they work in diverse fields, researchers in these specific topics have had little opportunity to trade insights and take stock of the progress so far. The aim of the workshop is to do just this.
In what follows, participants’ names are in boldface, while the main goals of the workshop are in italics.
Roderick Dewar is a key advocate of the principle of Maximum Entropy Production, which says that biological systems—and indeed all open, non-equilibrium systems—act to produce entropy at the maximum rate. Along with others, he has applied this principle to make testable predictions in a wide range of biological systems, from ATP synthesis [DJZ2006] to respiration and photosynthesis of individual plants [D2010] and plant communities. He has also sought to derive this principle from ideas in statistical mechanics [D2004, D2009], but it remains controversial.
The first goal of this workshop is to study the validity of this principle .
While they may be related, the principle of Maximum Entropy Production should not be confused with the MaxEnt inference procedure, which says that we should choose the probabilistic hypothesis with the highest entropy subject to the constraints provided by our data. MaxEnt was first explicitly advocated by Jaynes. He noted that it is already implicit in the procedures of statistical mechanics, but convincingly argued that it can also be applied to situations where entropy is more ‘informational’ than ‘thermodynamic’ in character.
Recently John Harte has applied MaxEnt in this way to ecology, using it to make specific testable predictions for the distribution, abundance and energy usage of species across spatial scales and across habitats and taxonomic groups [Harte2008, Harte2009, Harte2011]. Annette Ostling is an expert on other theories that attempt to explain the same data, such as the ‘neutral model’ [AOE2008, ODLSG2009, O2005, O2012]. Dewar has also used MaxEnt in ecology [D2008], and he has argued that it underlies the principle of Maximum Entropy Production.
Thus, a second goal of this workshop is to familiarize all the participants with applications of the MaxEnt method to ecology, compare it with competing approaches, and study whether MaxEnt provides a sufficient justification for the principle of Maximum Entropy Production.
Entropy is not merely a predictive tool in ecology: it is also widely used as a measure of biodiversity. Here Shannon’s original concept of entropy naturally generalizes to ‘Rényi entropy’, which depends on a parameter . This equals
where is the fraction of organisms of the th type (which could mean species, some other taxon, etc.). In the limit this reduces to the Shannon entropy:
As increases, we give less weight to rare types of organisms. Christina Cobbold and Tom Leinster have described a systematic and highly flexible system of biodiversity measurement, with Rényi entropy at its heart [CL2012]. They consider both the case where all we have are the numbers , and the more subtle case where we take the distance between different types of organisms into account.
John Baez has explained the role of Rényi entropy in thermodynamics [B2011], and together with Tom Leinster and Tobias Fritz he has proved other theorems characterizing entropy which explain its importance for information processing [BFL2011]. However, these ideas have not yet been connected to the widespread use of entropy in biodiversity studies. More importantly, the use of entropy as a measure of biodiversity has not been clearly connected to MaxEnt methods in ecology. Does the success of MaxEnt methods imply a tendency for ecosystems to maximize biodiversity subject to the constraints of resource availability? This seems surprising, but a more nuanced statement along these general lines might be correct.
So, a third goal of this workshop is to clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.
As the amount of data to analyze in genomics continues to surpass the ability of humans to analyze it, we can expect automated experiment design to become ever more important. In Chris Lee and Marc Harper ’s RoboMendel program [LH2013], a mathematically precise concept of ‘potential information’—how much information is left to learn—plays a crucial role in deciding what experiment to do next, given the data obtained so far. It will be useful for them to interact with William Bialek , who has expertise in estimating entropy from empirical data and using it to constrain properties of models [BBS, BNS2001, BNS2002], and Susanne Still , who applies information theory to automated theory building and biology [CES2010, PS2012].
However, there is another link between biology and potential information. Harper has noted that in an ecosystem where the population of each type of organism grows at a rate proportional to its fitness (which may depend on the fraction of organisms of each type), the quantity
always decreases if there is an evolutionarily stable state [Harper2009]. Here is the fraction of organisms of the th genotype at a given time, while is this fraction in the evolutionarily stable state. This quantity is often called the Shannon information of ‘relative to’ . But in fact, it is precisely the same as Lee and Harper ’s potential information! Indeed, there is a precise mathematical analogy between evolutionary games and processes where a probabilistic hypothesis is refined by repeated experiments.
Thus, a fourth goal of this workshop is to develop the concept of evolutionary games as ‘learning’ processes in which information is gained over time.
We shall try to synthesize this with Carl Bergstrom and Matina Donaldson-Matasci ’s work on the ‘fitness value of information’: a measure of how much increase in fitness a population can obtain per bit of extra information [BL2004, DBL2010, DM2013]. Following Harper , we shall consider not only relative Shannon entropy, but also relative Rényi entropy, as a measure of information gain [Harper2011].
A fifth and final goal of this workshop is to study the interplay between information theory and the thermodynamics of individual cells and organelles.
Susanne Still has studied the thermodynamics of prediction in biological systems [BCSS2012]. And in a celebrated related piece of work, Jeremy England used thermodynamic arguments to a derive a lower bound for the amount of entropy generated during a process of self-replication of a bacterial cell [England2013]. Interestingly, he showed that E. coli comes within a factor of 3 of this lower bound.
In short, information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The time is ripe for an investigative workshop that brings together experts from different fields and lets them share insights and methods and begin to tackle some of the big remaining questions.