December 26, 2012

Happy Holidays (Sci-Fi)!


This is being cross-posted to Tumbld Thoughts.


Happy Holidays! Brought to you by science fiction [1].

NOTES:

[1] References courtesy Charlie X (Star Trek: TOS) and The X-Files.

December 23, 2012

Media Neuroscience Lab Profile


Here is a new research lab [1] recently founded by my colleague Rene Weber [2] at UC-Santa Barbara. The Media Neuroscience Lab does research at the intersection of Virtual Worlds and Neuroscience. Their emphasis is on using physiological data [3] to understand the substrates of communication and cognition during use of entertainment media and other virtual worlds. Check it out.

Figure 1. This figure features an example from the lab's most recent paper [4] featured in a summary post on the Neuroskeptic blog [5]. Images courtesy [4] (middle, bottom) and [5] (top).

NOTES:

[1] the Media Neuroscience Lab is affiliated with the SAGE Center for the Study of the Mind and the Institute for Collaborative Biotechnologies (both at UCSB). The lab also maintains a Twitter feed.

[2] a pioneer in the area of applying innovative neuroimaging techniques to the study of video gameplay and television watching. He and I have worked together on the role of dynamic attentional processing during immersion in virtual environments (so far unpublished).

Picture of Dr. Weber in front of his primary scientific instrument.

[3] the lab also maintains an archive of neuroimaging (fMRI) and psychophysiological (Biopac) datasets if you are interested.

[4] Klasen, M., Weber, R., Kircher, T.T., Mathiak, K.A., and Mathiak, K. (2012). Neural contributions to flow experience during video game playing. Social Cognitive and Affective Neuroscience, 7(4), 485-495.

The free-viewing condition (fMRI experimental design) was used to understand what the brain is doing while the subjects played "Tactical Ops" (Figure 1, middle). The results were interpreted in the context of Csíkszentmihályi's flow theory of experiential cognitive (in this case, neural) processing.

[5] Post from Neuroskeptic blog on their recent paper "Neural contributions to flow experience during video game playing": How Your Brain Gets In The Game. Neuroskeptic, May 23, 2011.

December 13, 2012

Perhaps it's not too late.....or too soon


Just found out (via IEEE Spectrum) about an interesting contest sponsored by Intel and the HYVE [1]. It's called the Intel Future Contest. The point is to come up with an idea for an idealized application of a smart sensor suite. What do you do with it? Use it in the home? Use it to monitor your health? Or use it in your favorite hobby [2]? See the description below:

"Imagine five years into the future. You have on you (or with you) this new sensing technology. It can see, hear, remember and understand everything around you all the time".

They also provide a number of criteria to keep in mind during the design process. These are included in the table below:


The deadline is December 18. To submit, they require a series of sketches or something similar. Judges will (hopefully) be fair and impartial [3]. Categories include: healthcare, communication, work, knowledge production, entertainment, infrastructure and environmental modeling, creative expression, and an "other" category. If you have an idea (that you are not submitting or have already submitted), perhaps you could also submit it in the comments section below [4]. 

NOTES:

[1] a self-proclaimed "open-source innovation company".


[2] skydiving reference and pic (below) courtesy Ars Technica and Google I/O 2012 keynote address.


[3] one of the judges is Mark Roth, whom I recognize as the conductor of experiments on H2S-induced hibernation in mice (metabolic flexibility).

[4] to see the pool of submissions (which features some impressive ideas), go to this site. My submission, "Sensor-enabled Relativistic Virtual Worlds", can be viewed here (or Figshare version here).


December 12, 2012

"Disrupting" the Scientific Establishment?

This series of videos [1] is for my academic scientist friends frustrated with traditional funding streams [2]. It is part of my continuing series on alternative funding models. The first is a cartoon-laden introduction to crowdsourcing (as a concept) called: "Crowdsourcing: what is it?".


The second video speaks more directly to the disruptive potential of crowdsourcing [4] on traditional academic research power structures [3]: "Disrupting conventional funding models".



The third featured video profiles the SciFund challenge [5], and provides insights into how crowdsourcing is effective as a practical tool: "Crowdfunding for Academic Research".


Finally, there is an example from the UK called Open Science Space (an IndieGoGo project) run by Peter Cubbin. In the video, he explains one proposal for how "open-source" funding fits into the scientific enterprise.



So the questions are raised: is this a viable alternate path forward? Do such models have the potential to "disrupt" how academic science is done and rewarded [6]? Please feel free to discuss and share.

NOTES:

[1] All contributions courtesy various YouTube contributors.

[2] NIH, NSF, DARPA, etc. While these are the predominant ways to fund science, such models do not work well for every type of research or project.

[3] And markets for research and researchers (scientists). But more on that in another post.

[4] Actually, the disruptive innovation here is the enabling technology (e.g. the internet).

[5] Currently in Round 3 (the initial round began a few years back). I direct you to the call to arms, put out by Jai Ranganathan and Jarrett Byrnes.

[6] This is somewhat of an aside, but there has been much talk recently in the blogosphere and news media about academic elitism (e.g. money and attitudes) and its perpetuation. See the following two articles for a taste of this zeitgeist:

"Is Michigan State really better than Yale?" (New York Times)

"Ph.D.'s From Top Political-Science Programs Dominate Hiring, Research Finds" (Chronicle of Higher Education).


December 8, 2012

Algorithmic Self-assembly (with DNA) Profile



Another popular post that is being re-posted from my microblog, Tumbld Thoughts (on Algorithmic Self-assembly).


Here is David Doty (Math/CS) from Caltech discussing the theory of Algorithmic Self-Assembly [1], featured in a Communications of the ACM article (picture on the left) and a Vimeo video slideshow (picture on the right). Here is a related blog post from 80 Beats (Discover magazine science blog) on DNA LEGO bricks. Enjoy both.

Associated trivia: the “Abstract Tile Assembly Model” [2] featured in the Vimeo video was developed by Erik Winfree (another DNA Computing person), who is the son of Arthur Winfree. Art Winfree wrote an excellent book called the “Geometry of Biological Time”, and was a mentor of Steven Strogatz [3].
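
To make the aTAM mechanics concrete, here is a minimal sketch in Python (my own toy example, not code from the Doty article or from Winfree's Xgrow). It assumes square tiles with labeled, strength-weighted glues on each side and a temperature threshold of 2: a tile may attach to the growing assembly wherever the summed strength of its matching glues meets the threshold. The tile set and glue labels below are invented purely for illustration.

from collections import namedtuple

Tile = namedtuple("Tile", ["name", "glues"])  # glues: dict side -> (label, strength)

TAU = 2  # temperature: total strength of matching glues needed to attach

# Toy tile set: a seed plus tiles that grow an L-shaped arm north and east.
TILES = [
    Tile("seed",  {"N": ("a", 2), "E": ("b", 2), "S": (None, 0), "W": (None, 0)}),
    Tile("north", {"N": ("a", 2), "E": (None, 0), "S": ("a", 2), "W": (None, 0)}),
    Tile("east",  {"N": (None, 0), "E": ("b", 2), "S": (None, 0), "W": ("b", 2)}),
]

OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
OFFSET = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def binding_strength(tile, pos, assembly):
    """Sum the strengths of glues that match already-placed neighbors."""
    total = 0
    for side, (dx, dy) in OFFSET.items():
        neighbor = assembly.get((pos[0] + dx, pos[1] + dy))
        if neighbor is None:
            continue
        label, strength = tile.glues[side]
        n_label, n_strength = neighbor.glues[OPPOSITE[side]]
        if label is not None and label == n_label:
            total += min(strength, n_strength)
    return total

def grow(steps=10):
    assembly = {(0, 0): TILES[0]}  # start from the seed tile
    for _ in range(steps):
        frontier = {(x + dx, y + dy) for (x, y) in assembly
                    for dx, dy in OFFSET.values()} - set(assembly)
        attached = False
        for pos in sorted(frontier):
            for tile in TILES[1:]:
                if binding_strength(tile, pos, assembly) >= TAU:
                    assembly[pos] = tile
                    attached = True
                    break
            if attached:
                break
        if not attached:
            break
    return assembly

for pos, tile in sorted(grow().items()):
    print(pos, tile.name)

Running it grows a small L-shaped assembly north and east of the seed, which is about the simplest non-trivial behavior the model allows.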


NOTES:

[1] process by which “things autonomously (no directedness) assemble themselves into larger functional units”.

[2] for a demo, try out the Xgrow simulator.

[3] a nice story about this is featured in the book "Sync".

December 5, 2012

Triangulating Scientific “Truths”: an ignorant perspective

I have recently read the book "Ignorance: How It Drives Science" by Stuart Firestein, a neuroscientist at Columbia University. The title refers to the ubiquity of what we don't know, and how it shapes the scientific facts we teach, reference, and hold up as popular examples. In fact, he teaches a course at Columbia on how "Ignorance" can guide research, with guest lecturers from a variety of fields [1]. This is of particular interest to me, as I hosted a workshop last summer (at Artificial Life 13) called "Hard-to-Define Events" (HTDE). From what I have seen in the past year or so [2], people seem to be converging on this idea.


In my opinion, two converging trends seem to be generating interest in this topic. One is the rise of big data and the internet, which make the communication of results and the rendering of the "research landscape" easier. Literature mining tools [3] are enabling discovery in their own right, but are also revealing the shortcomings of previously-published studies. The other is the controversy raised over the last 10 years around a replication crisis [4], coupled with the realization that the scientific method is not applied as rigorously [5] as previously thought.

The work of Jeff Hawkins, head of Numenta, Inc. [6], addresses many of these issues from the perspective of formulating a theoretical synthesis. For a number of years, he has been interested in a unified theory of the human brain. While there are challenges both in testing such a theory and in getting the field to fit it into its conceptual schema, Jeff has nevertheless found success building technological artifacts based on these ideas.


Jeff's work illustrates the balance between what we do know about a scientific field and what we don't. This involves using novel neural network models to generate "intelligent" and predictive behavior. Computational abstraction is a useful tool in this regard, but in the case of empirical science the challenge is to include what we do know in our models and exclude what we don't.

According to this viewpoint, the success of scientific prediction (e.g. the extent to which a theory is useful) depends upon whether findings and deductions are convergent or divergent. By convergent and divergent, I mean the degree to which independent findings can be used to triangulate a single set of principles or predict similar outcomes. Examples of convergent findings include the use of beak diversity in Darwin's finches to understand natural selection [7] and the use of behavioral and neuroimaging assays [8] to understand attention.


There are two ways in which the author proposes that ignorance operates in the domain of science. For one, ignorance defines our knowledge. The more we discover, the more we discover we don't know. This is often the case with problems and in fields for which little a priori knowledge exists. The field of neuroscience has certainly had its share of landmark discoveries that ultimately raise more questions than they answer in terms of function or mechanism. Many sciences go through a "stamp-collecting" or "natural history" phase, during which characterization is the primary goal [9]. Only later does hypothesis-driven, predictive science even seem appropriate.

The second role of ignorance is a caveat, based on the root of the word: "to ignore". In this sense, scientific models can be viewed as tools of conformity. There is a tendency to ignore what does not fit the model, treating these data as outliers or noise. You can think about this as a challenge to the traditional use of curve-fitting and normalization models [10], both of which are biased towards treating normalcy as the statistical signal.
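
As a small, hedged illustration of this conformity bias, the Python sketch below fits a line by least squares, flags points with large residuals as "outliers", and refits without them. The synthetic data and the 2-sigma cutoff are arbitrary choices of mine, not a recommended procedure; the point is only how easily data that do not fit the model get reclassified as noise.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, x.size)
y[::10] += 15.0  # a few points that do not fit the linear model

# First pass: ordinary least-squares fit to everything.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Second pass: "ignore" whatever deviates by more than 2 standard deviations.
keep = np.abs(residuals) < 2.0 * residuals.std()
slope2, intercept2 = np.polyfit(x[keep], y[keep], 1)

print(f"fit to all points:       y = {slope:.2f}x + {intercept:.2f}")
print(f"fit ignoring 'outliers': y = {slope2:.2f}x + {intercept2:.2f}")
print(f"points treated as noise: {np.count_nonzero(~keep)}")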

If we think about this algorithmically [11], it requires a constantly growing problem space, but one that grows in a manner typically associated with reflexivity. What would an algorithm defining "what we don't know" and "reflexive" science look like? Perhaps this can be better understood using a metaphor of the Starship Enterprise embedded in a spacetime topology. Sometimes, the Enterprise must venture into uncharted regions of space (ones that still correspond to spacetime). While the newly-discovered features are embedded in the existing metric, these features are unknown a priori [12]. Now consider features that exist beyond the spacetime framework (beyond the edge of the known universe) [13]. How does a faux spacetime get extrapolated to features found there? The word extrapolation is key, since the features will not necessarily be classified in a fundamentally new way (e.g. prior experience will dictate what the extended space will look like).



With this in mind, there are several points that occurred to me as I was reading “Ignorance” that might serve as heuristics for doing exploratory and early-stage science:

1) Instead of focusing on convexity (optimal points), examine the trajectory:

* problem spaces which are less well-known have a higher degree of nonconvexity, and have a moving global optimum.

* this allows us to derive trends in problem space instead of merely finding isolated solutions, especially for an ill-defined problem. It also prevents an answer from becoming marooned in solution space.

2) Instead of getting the correct answer, focus on defining the correct questions:

* according to Stuart Firestein, the approach of David Hilbert (the mathematician) was to predict the best questions rather than the best answers (e.g. what a futurist would do).

* a new book by Michael Brooks [14] focuses on outstanding mysteries across a number of scientific fields, from dark matter to abiogenesis.

3) People tend to ask questions where we have the most complete information (e.g. look where the light shines brightest, not where the answer actually is):

* this leads us to make the comparison between prediction (function) and phenomenology (structure). Which is better? Are the relative benefits for each mode of investigation problem-dependent?


Stepping back from the solution space-as-Starfleet mission metaphor, we tend to study what we can measure well, and in turn what we can characterize well. But what is the relationship between characterization, measurement, and a solution (or ground-breaking scientific finding)? There are two components in the progression from characterization to solution. The first is to characterize enough of the problem space so as to create a measure, and then use that measure for further characterization, ultimately arriving at a solution. The second is to characterize enough of the problem space so that coherent questions can be asked, which allows a solution to be derived. When combined, this may provide the best tradeoff between precision and profundity. 

Yet which should come first, the measurement or the question? This largely depends on the nature of the measurement. In some cases, measures are more universal than single questions or solutions (e.g. information entropy, fMRI, optical spectroscopy). Metric spaces are also subject to this universality. If a measure can lead to any possible solution, then it is much more independent of the question. In "Ignorance", von Neumann's universal constructor, as applied by Wolfram in NKS theory [15], is discussed as a potentially universal measurement scheme.
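
As a toy example of a question-independent measure, the short sketch below computes the Shannon entropy of two very different discrete datasets. The datasets are made up, and nothing here is specific to any hypothesis; that indifference to the question being asked is the sense of "universality" I have in mind.

import numpy as np

def shannon_entropy(samples):
    """Entropy (in bits) of the empirical distribution of a discrete sample."""
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

spike_counts = np.random.default_rng(1).poisson(3, 1000)     # stand-in for neural counts
coin_flips = np.random.default_rng(2).integers(0, 2, 1000)   # stand-in for binary outcomes

print("entropy of spike counts:", round(shannon_entropy(spike_counts), 3), "bits")
print("entropy of coin flips:  ", round(shannon_entropy(coin_flips), 3), "bits")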


There are two additional points I found intriguing. The first is a fundamental difference between scientific fields where there is a high degree of "ignorance" (e.g. neuroscience) versus those where there is a relatively low degree (e.g. particle physics). This is not a new observation, but it has implications for applied science. For example, the interferometer is a tool used in the physical sciences to build inferences between, and find information among, signals in a system. Would it be possible to build an interferometer based on neural data? Yes and no. While there is an emerging technology called brain-machine interfaces (BMI), these interfaces are limited to well-characterized signals and favorable electrophysiological conditions [16]. Indeed, as we uncover more about brain function, perhaps brain-machine interface technology will come closer to being an interferometer. Or perhaps not, which would reveal a lot about how intractable ignorance (e.g. an abundance of unknowable features) might be in this field.

The second point involves the nature of innovation, or rather, innovations which lead to useful inventions. It is generally thought that engaging in applied science is the shortest route to success in this area. After all, pure research (e.g. asking questions about the truly unknown) involves blind trial-and-error and ad-hoc experiments which lead to hard-to-interpret results. Yet "Ignorance" author Firestein argues that pure research might be more useful in terms of generating future innovation than we might recognize. This is because while there are many blind alleyways in the land of pure research, there are also many opportunities for serendipity (e.g. luck). It is the experiments that benefit from luck which potentially drive innovation along the furthest.

NOTES:

[1] One example is Brian Greene, a popularizer of string theory and theoretical physics and a faculty member at Columbia.

[2] via researching the literature and internal conversations among colleagues.

[3] for an example, please see: Frijters, R., van Vugt, M., Smeets, R., van Schaik, R., de Vlieg, J., and Alkema, W. (2010). Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases. PLoS Computational Biology, 6(9), e1000943.

[4] Yong, E. (2012). Bad Copy. Nature, 485, 298.

[5] Ioannidis, J.P. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8), e124.

[6] Also the author of the following book: Hawkins, J. and Blakeslee, S. (2004). On Intelligence. Basic Books.

[7] here is a list of examples for adaptation and natural selection (COURTESY: PBS).

[8] for an example of how this has affected the field of Psychology, please see: Sternberg, R.J. and Grigorenko, E.L. (2001). Unified Psychology. American Psychologist, 56(12), 1069-1079.

[9] this was true of biology in the 19th century, and neuroimaging in the late 20th century. There are likely other examples I have not included here.

[10] here is more information on curve fitting (demo) and normalization (Wiki).

[11] this was discussed in the HTDE Workshop (2012). What is the computational complexity of a scientific problem? Can this be solved via parallel computing, high-throughput simulation, or other strategies?

Here are some additional insights from the philosophy of science (A) and the emerging literature on solving well-defined problems through optimizing experimental design (B):

A] Casti, J.L. (1989). Paradigms Lost: images of man in the mirror of science. William Morrow.

B] Feala, J.D., Cortes, J., Duxbury, P.M., Piermarocchi, C., McCulloch, A.D., and Paternostro, G. (2010). Systems approaches and algorithms for discovery of combinatorial therapies. Wiley Interdisciplinary Reviews: Systems Biology and Medicine, 2, 127.  AND  Lee, C.J. and Harper, M. (2012). Basic Experiment Planning via Information Metrics: the RoboMendel Problem. arXiv, 1210.4808 [cs.IT].

[12] our spacetime topology corresponds to a metric space, a common context, or conceptual framework. In an operational sense, this could be the dynamic range of a measurement device or the logical structure of a theory.

[13] I have no idea how this would square away with current theory. For now, let’s just consider the possibility.....

[14] the citation is: Brooks, M. (2009). 13 Things That Don't Make Sense. Doubleday.

[15] NKS (New Kind of Science) demos are featured at Wolfram's NKS website.

[16] this is a theme throughout the BMI and BCI literature, and includes variables such as type of signal used, patient population, and what is being controlled.

December 1, 2012

CoE #46 is now on Figshare


After reading a blog post (Posting Blog Entries to Figshare) by C. Titus Brown (from Living in an Ivory Basement) on strategies for furthering open-source science, I decided to post the Carnival of Evolution edition I hosted back in April (The Tree (Structures) of Life) to Figshare [1].


Figshare is a public repository that provides a digital object identifier (DOI) for each document shared. DOIs make it easy for non-standard content (blog posts, slideshows, etc.) to be formally cited and cataloged. In his post, C. Titus Brown suggests a programming solution to convert blog posts to Figshare documents. I did not use his method here; I simply reformatted the post as a more formal document. Enjoy.
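
As a small illustration of what the DOI buys you, the sketch below resolves the DOI from reference [1] through doi.org content negotiation and asks for a BibTeX record. DataCite-registered DOIs (which Figshare uses) generally support this, though the exact Accept header is worth verifying against current documentation. Note that this is not the conversion method from C. Titus Brown's post; it only shows that the post is now citable by standard tooling.

import requests

doi = "10.6084/m9.figshare.99887"  # CoE #46, from reference [1] below

# Ask the DOI resolver for a BibTeX record via content negotiation.
resp = requests.get(f"https://doi.org/{doi}",
                    headers={"Accept": "application/x-bibtex"},
                    timeout=30)
resp.raise_for_status()
print(resp.text)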


REFERENCES:

[1] CoE #46: The Tree (Structures) of Life. http://dx.doi.org/10.6084/m9.figshare.99887

November 28, 2012

Research Paper "Emergence"


I am cross-posting this from my microblog (on Tumblr), Tumbld Thoughts. It's a YouTube video documenting the progression of a research paper [1]. The paper was submitted [2] to the Computer Science conference "WWW" (World Wide Web). A total of 463 changes were made to the document as it was being fleshed out, all of which are captured in the 1-minute, 32-second time-lapse video. Enjoy.

NOTES:

[1] shown here is a short animation of the process. I tend to be a bit more nonlinear when writing research papers, so this is pretty impressive.

[2] Courtesy of Timothy Weninger, UIUC.

November 27, 2012

Topological references, courtesy of Futurama

I am re-posting this from my microblog Tumbld Thoughts, as it has been one of my most popular posts there. The theme also fits in well with this blog.


Currently catching up on TV. In the Futurama episode "Benderama", there was a reference to the Banach-Tarski paradox (in the form of a duplicator machine) [1]. The machine is used to duplicate things (ultimately Bender) at progressively smaller scales [2]. All this in 23 minutes. Great stuff.

NOTES: 

[1] the point was not to be technically accurate. The point was to make an analogical and highly obscure reference so that people like me could talk about it.

[2] the "shrinker" eventually leads to the potential destruction of the world's drinking water supply (a nice subreference to the gray goo hypothesis/scenario in the field of nanotechnology).

November 21, 2012

Artificial Life meets Geodynamics (EvoGeo)

For the past few years, I have been working on a novel approach to modeling biological evolution. It utilizes a computational tool from fluid mechanics called Lagrangian Coherent Structures (LCS). This method was developed by George Haller [1] at ETH Zurich, and has been previously used in biology to study the biomechanics of locomotion in fluids [2] and the relationship between predation and ocean currents
[3]. I presented on this at ALife XIII (Contextual Geometric Structures - CGS, [4]) and have a related paper (LCS-like structures, [5]) on the arXiv.

The premise of this model is simple: a population of automata, with either a neural (CGS) or genomic (LCS-like) representation, diffuses across an n-dimensional flow field. As the automata diffuse according to local flow conditions (which can mimic environmental selection), they form various clusters called coherent structures. While these can be identified using qualitative means, there are also highly-complex differential equations that can be used to estimate evolutionary distance and perhaps even reconstruct evolutionary trajectories [6]. However, much more work needs to be done on the simulation capabilities of this hybrid [7] evolutionary model, which for now exists somewhere between conceptual metaphor and bona fide simulation.
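
To give a flavor of the premise, here is a toy Python sketch (my own, not the CGS or LCS-like code, which is unpublished): a population of automata is passively advected through the double-gyre flow field commonly used in LCS demonstrations, and coarse grid occupancy stands in, crudely, for identifying coherent structures. All parameters are illustrative.

import numpy as np

A, EPS, OMEGA = 0.1, 0.25, 2 * np.pi / 10  # standard double-gyre parameters

def velocity(x, y, t):
    """Time-dependent double-gyre velocity field on the domain [0,2] x [0,1]."""
    a = EPS * np.sin(OMEGA * t)
    b = 1 - 2 * a
    f = a * x**2 + b * x
    dfdx = 2 * a * x + b
    u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)
    v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return u, v

rng = np.random.default_rng(0)
pos = rng.uniform([0, 0], [2, 1], size=(500, 2))  # 500 automata at random positions

dt, steps = 0.05, 400
for k in range(steps):
    u, v = velocity(pos[:, 0], pos[:, 1], k * dt)
    pos += dt * np.column_stack([u, v])           # forward-Euler advection
    pos[:, 0] = np.clip(pos[:, 0], 0, 2)
    pos[:, 1] = np.clip(pos[:, 1], 0, 1)

# Crude proxy for coherent structures: occupancy of a coarse grid.
hist, _, _ = np.histogram2d(pos[:, 0], pos[:, 1], bins=(20, 10), range=[[0, 2], [0, 1]])
print("most crowded grid cells (particle counts):", np.sort(hist.ravel())[-5:])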

Figure 1. RIGHT: An example of LCS models as applied to fluid dynamics. LEFT: An example of the LCS-inspired model (from CGS paper).

Figure 2. LEFT: Internal representations for automata in the CGS paper. RIGHT: Internal representations for automata in the LCS-paper.

In my application, some of the equations and data structures are borrowed, and some are uniquely "evolutionary". This gives me flexibility in applying the model to many different kinds of evolutionary scenarios. One of these (identified in the LCS-like paper) is biogeography, particularly island biogeography. For the uninitiated, biogeography [8] is the study of population processes in a geographic context. As organisms migrate and geomorphology changes, population genetics and demography are affected in corresponding fashion.

In my LCS-inspired evolutionary model, recall that the environment consists of a generic flow field (in the CGS paper, this is already exploited as a quasi-geography). If this substrate could be replaced by a dynamic topology (e.g. a more explicit geography), the LCS-inspired model might provide insights into evolutionary "deep" time. By deep time, I mean a period of time long enough for uplift, continental drift, and seafloor spreading to occur and affect the distribution of populations and species.

How do we go about establishing this dynamic topology? There are a number of options here, two of which I will discuss in detail. The first is the terraforming engine used in virtual worlds such as SimCity, Spore, and Second Life [9]. The second involves using a mathematical tool called plate motion vectors to predict tectonic drift [10]. Figure 3 shows examples of each. While this has not been formally worked out, the basic goal is to create kinematics (underlying movements that govern environmental constraints) much as the quasi-flow regimes do in the CGS and LCS-like models.

Once the kinematics of geomorphology have been established as an evolutionary field, the kinetics (e.g. how these movements unfold in time) must then be accounted for. A model of plate tectonics (the Burridge-Knopoff model, [11]) can be used to approximate tectonic activity as a series of sliding blocks that interact according to local rules. A simpler method might be to represent the buildup and release of stresses between tectonic units as a non-uniform probability density function (PDF). In any case, examples of the concept on a three-dimensional space can be seen in Figure 4.
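
Here is a hedged sketch of the "simpler method" mentioned above: tectonic units that accumulate stress and pass a fraction of it to their neighbors when a threshold is crossed. This is a stick-slip caricature rather than a faithful Burridge-Knopoff implementation, and the thresholds, loading rates, and redistribution fraction are all arbitrary.

import numpy as np

rng = np.random.default_rng(42)
n_units = 30
stress = np.zeros(n_units)
threshold = rng.uniform(0.8, 1.2, n_units)  # non-uniform failure thresholds

slip_events = []
for step in range(2000):
    stress += rng.exponential(0.001, n_units)     # slow, uneven loading
    failed = np.where(stress > threshold)[0]
    for i in failed:
        released = stress[i]
        stress[i] = 0.0
        for j in (i - 1, i + 1):                  # pass part of the stress to neighbors
            if 0 <= j < n_units:
                stress[j] += 0.4 * released
        slip_events.append((step, i, released))

print("number of slip (tectonic) events:", len(slip_events))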

Figure 3. LEFT: Examples of terraforming (SimCity) in a virtual world. RIGHT: Predictions of tectonic drift for the Jurassic period [see 10].

Figure 4. Single generation examples of organisms (automata, black balls) and populations (clusters of black balls) on a dynamic topology. LEFT: Two separate landmasses, one with mountains. UPPER RIGHT: Newly-uplifted mountains and an isthmus (land bridge). LOWER RIGHT: Pre-uplifted mountains and continental drift. Simulated using pseudo-data.

In an artificial life context, a fitness function is used to disallow diffusion of organisms (and thus gene flow) across cells that are either too high or too low (representing mountains and oceans, respectively). Since there is a geodynamic component to the model, these fitnesses and the regions colonized can both change over time [12]. Organisms can colonize a patch of land that becomes an island or valley over time, only to be cut off from the main population by uplift or ocean floor spreading. By contrast, members of the population can evolve the ability to either live in or traverse these new environments, leading to some potentially interesting scenarios. It should be noted that agents (automata) in the LCS-inspired models reproduce asexually and have unique evolutionary dynamics, so any conclusions drawn using this model should not necessarily be extrapolated to paleobiological scenarios.
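
The sketch below is a stripped-down version of this idea (again, my own toy code rather than the project model): organisms random-walk on a grid, moves into cells outside a habitable elevation band are rejected, and gradual uplift of a central ridge eventually splits the population into eastern and western groups.

import numpy as np

rng = np.random.default_rng(7)
H, W = 20, 40
elevation = np.zeros((H, W))
organisms = [(int(rng.integers(H)), int(rng.integers(W))) for _ in range(200)]

SEA, TREELINE = 0.0, 3.0  # habitable band of elevations
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

for generation in range(100):
    elevation[:, W // 2] += 0.05  # slow uplift of a central ridge
    next_gen = []
    for (r, c) in organisms:
        dr, dc = MOVES[rng.integers(4)]
        nr, nc = r + dr, c + dc
        on_grid = 0 <= nr < H and 0 <= nc < W
        if on_grid and SEA <= elevation[nr, nc] <= TREELINE:
            next_gen.append((nr, nc))   # dispersal allowed
        else:
            next_gen.append((r, c))     # blocked by the grid edge, ocean, or mountains
    organisms = next_gen

west = sum(1 for _, c in organisms if c < W // 2)
print(f"after uplift: {west} organisms west of the ridge, {len(organisms) - west} east of it")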

What I am describing, then, is a three-level hybrid model (see Figure 5): a geomorphological model to partition and add dimensionality to the underlying substrate of evolution, a general model of particle diffusion (genetic drift), and a genomic representation that provides diversity and relatedness to the automata population. This is only a quick sketch from the Fluid Models of Evolutionary Dynamics project -- if you are interested in collaborating or otherwise helping develop this model, let it be known. Other comments are also welcome.

Figure 5. Diagram of the three-level model structure in context.

References and Notes:

[1] Haller, G. (2007). Uncovering the Lagrangian Skeleton of Turbulence. Physical Review Letters, 98, 144502. Here is a tutorial.

[2] Nawroth, J.C., Feitl, K.E., Colin, S.P., Costello, J.H., and Dabiri, J.O. (2010). Phenotypic plasticity in juvenile jellyfish medusae facilitates effective animal-fluid interaction. Biology Letters, 6(3), 389-393.

[3] Tew Kai, E., Rossi, V., Sudre, J., Weimerskirch, H., Lopez, C., Hernandez-Garcia, E., Marsac, F., and Garcon, V. (2009). Top marine predators track Lagrangian coherent structures. PNAS, 106(20), 8245–8250.

[4] Alicea, B. (2012). Contextual Geometric Structures: modeling the fundamental components of cultural behavior. Proceedings of Artificial Life, 13, 147-154.

[5] Alicea, B. (2011). Lagrangian Coherent Structures (LCS) may describe evolvable frontiers in natural populations. arXiv Repository, arXiv:1101.6071 [nlin.AO, physics.bio-ph, q-bio.PE].

Supplementary Information for [4] and [5], and slides for [4].

[6] For examples, please see: Lipinski, D. and Mohseni, K. (2010). A ridge tracking algorithm and error estimate for efficient computation of Lagrangian coherent structures. Chaos, 20, 017504.

[7] Fromentin, J., Eveillard, D., and Roux, O. (2010). Hybrid modeling of biological networks: mixing temporal and qualitative biological properties. BMC Systems Biology, 4, 79.

[8] There are many relevant reviews. For an example, please see: Ronquist, F. and Sanmartin, I. (2011). Phylogenetic Methods in Biogeography. Annual Review of Ecology, Evolution, and Systematics, 42, 441-464.

[9] Here is the terraforming and ground textures documentation page from Second Life support. They use something called a .raw file, which converts a height field into a georeferenced surface (similar to digital elevation maps or digital elevation models).

[10] For more information on plate motion vectors, please see this link to "Teaching Geosciences in the 21st Century". For a tutorial on measuring relative plate motion, please see this tutorial. For tectonic drift methods (e.g. plate motion vectors) used to create predictions (maps) of continental drift, see the research of Ronald Blakey, Northern Arizona University.

[11] Rundle, J.B., Tiampo, K.F., Klein, W., and Sa Martins, J.S. (2002). Self-organization in leaky threshold systems: The influence of near-mean field dynamics and its implications for earthquakes, neurobiology, and forecasting. PNAS, 99(S1), 2514-2521.

[12] An important distinction: the surfaces shown in Figure 4 are not traditional fitness landscapes. Rather, these surfaces are based on elevation, which can change as the model evolves. Any fitness parameters are calculated independently of elevation, although they can be affected by elevation (e.g. uplift) or by general topographic changes (new mountains, seafloor).

November 14, 2012

Paper of the Week (Detecting Causality)



I haven't done this in a while [1], but here is my choice for paper of the week, published several weeks ago in the journal Science. It is called "Detecting Causality in Complex Ecosystems" [2], and the authors include George Sugihara and Robert May [3]. By "causality", the authors mean differentiating truly causal events from intermittent correlative events (termed "mirage" correlations in the paper, see Figure 1). And by "complex ecosystem", the authors mean comparisons between two or more time-series traces [4], although they use real-world examples [5] in the paper to test their model. Uncovering causality in two or more time series is typically done using an approach called Granger causality, in which the events of one time series predict the events in another time series given some degree of lag [6].
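
For readers who want to try the Granger approach directly, here is a short Python sketch using the grangercausalitytests function from statsmodels (the notes below point to MATLAB code instead). The toy data build in a lag-2 influence of one series on the other, so low p-values are expected; everything else is an arbitrary choice for illustration.

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 2] + rng.normal(scale=0.5)

# Column order matters: the test asks whether the 2nd column Granger-causes the 1st.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=3)  # prints a test summary per lag

for lag, (tests, _) in results.items():
    print(f"lag {lag}: ssr F-test p-value = {tests['ssr_ftest'][1]:.4g}")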

Figure 1. Three instances of mirage correlations between two time series (which can occur in a range of systems, from ecological to financial). COURTESY: Figure 1 in [2].

However, in cases where coupling between the two systems under analysis is weak [6], a new approach called Convergent Cross Mapping (CCM) can be used to detect causality. The CCM approach measures the extent to which the historical record (e.g. a prior time-series) can predict a current time-series [see 7]. This approach relies on the principle of cross-prediction: the current time-series must causally influence the prior time series via feedback or transitive couplings (see Figure 2). The success of this approach also depends heavily on convergence within complex systems [8] and the ability to reconstruct the state space for both time-series (Figure 3) using historical and current information [9].
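
And here is a condensed sketch of the CCM idea itself, using my own simplified implementation rather than the authors' code: reconstruct the shadow manifold of one series by time-delay embedding, use its nearest neighbors to estimate states of the other series, and watch whether the cross-map skill rises ("converges") as the library grows. The coupled logistic maps are in the spirit of the paper's examples, but the parameters, embedding dimension, and neighbor weighting here are illustrative only.

import numpy as np

def coupled_logistic(n=1000, r_x=3.8, r_y=3.7, b_yx=0.32, b_xy=0.0):
    """Two coupled logistic maps: y drives x (b_yx > 0); x does not drive y."""
    x, y = np.empty(n), np.empty(n)
    x[0], y[0] = 0.4, 0.2
    for t in range(n - 1):
        x[t + 1] = x[t] * (r_x - r_x * x[t] - b_yx * y[t])
        y[t + 1] = y[t] * (r_y - r_y * y[t] - b_xy * x[t])
    return x, y

def embed(series, E=2, tau=1):
    """Time-delay embedding: rows are (s_t, s_{t-tau}, ..., s_{t-(E-1)tau})."""
    start = (E - 1) * tau
    return np.column_stack([series[start - i * tau: len(series) - i * tau]
                            for i in range(E)])

def ccm_skill(source, target, E=2, tau=1, lib=400):
    """Cross-map 'source' from the shadow manifold of 'target' (skill = correlation)."""
    M = embed(target, E, tau)[:lib]
    s = source[(E - 1) * tau:][:lib]
    preds = np.empty(len(M))
    for i in range(len(M)):
        d = np.linalg.norm(M - M[i], axis=1)
        d[i] = np.inf
        nn = np.argsort(d)[:E + 1]                    # E+1 nearest neighbors
        w = np.exp(-d[nn] / max(d[nn].min(), 1e-12))  # exponential distance weighting
        preds[i] = np.dot(w / w.sum(), s[nn])
    return float(np.corrcoef(preds, s)[0, 1])

x, y = coupled_logistic()
for L in (50, 100, 200, 400):
    print(f"library {L:3d}: y from x's manifold = {ccm_skill(y, x, lib=L):.3f} "
          f"(should rise, since y drives x); x from y's manifold = {ccm_skill(x, y, lib=L):.3f}")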

Figure 2. Cases and examples of coupling between dynamical systems and/or variables. COURTESY: Figure 4 in [2].


Figure 3. LEFT: Example of the CCM method for three time-series sharing the same attractor basin manifold. RIGHT: an example of the Simplex projection method (see notes [3] and [9]).

In my opinion, this is a very interesting and perhaps even landmark paper. You should read it and save it to your Mendeley (or similar application) library immediately.

NOTES:

[1] Haven't done this since 2011, so here and here are my previous "papers of the week". FYI. 

[2] Formal citation: Sugihara, G., May, R., Ye, H., Hsieh, C-H., Deyle, E., Fogarty, M., and Munch, S. (2012). Detecting Causality in Complex Ecosystems. Science, 338, 496. 

[3] Both are legends in the field of ecology. Figure 3 shows an example of their previously introduced (and theory-based) method called Simplex analysis.

[4] one example: an n-dimensional phase space trajectory such as a Lorenz attractor.

[5] one example: the sardine-anchovy temperature problem. Pictures courtesy mbari.org.


[6] MATLAB code for Granger causality can be found here (basic analysis) and here (toolbox for inferring network connectivity).

[7] What type of complex systems are most amenable to the CCM method? Nonseparable, weakly connected dynamical systems. These are, according to the authors, beyond the scope of (linear) Granger causality analysis.

[8] convergence is to be contrasted with Lyapunov divergence (characterized by an exponent), where two systems begin at nearly the same initial condition and diverge over time.

[9] YouTube animation of this process from the Sugihara Lab. In implementing the CCM method, an algorithm based on the simplex projection is used to generate a nearest-neighbor solution for kernel density estimation. More details can be found in the Supplementary Materials.




November 5, 2012

5 years or 20,000 visits, whichever comes first....

Five years ago next month, I founded Synthetic Daisies. The name [1] is based on a merger of my interest in artificial and synthetic life (Synthetic) with my interest in systems thinking (Daisies, based on the Daisyworld experiments of James Lovelock). And yet, historically at least, the blog has covered much more. The first post was a review of the book "The Complementary Nature", followed by two more posts: one on non-optimality [2] and another featuring YouTube videos from the grand opening of Dickinson Hall at the University of Florida [3]. Since then, I have posted on a mix of biological, computational, and innovation-related topics (broadly construed). I try to keep it balanced between these three areas, with some pop culture and geeky technology fun thrown in.

Images from the first Synthetic Daisies posts (using current template). LEFT: Complementary Nature review, CENTER: non-optimality post, RIGHT: YouTube videos of the Dickinson Hall opening.

My blogging style has evolved since starting Synthetic Daisies. For one thing, I have optimized the template design and layout towards something that is as attractive to look at as it is to read (not sure I have totally succeeded at this, but I've tried). I have also used a number of devices to convey what is sometimes highly complicated information to a general audience. One of these is a preponderance of diagrams and images. Another is moving a lot of technical detail to footnotes. Being true to the interactive nature of a blog, I sometimes update posts with retrospective information (after the original event or thoughts that inspired the post). This is a good way to keep your posts from becoming "archival" (and perhaps even embarrassing) in feel after a few months or years.

After posting sporadically for the first three years, I made a commitment to keep up the frequency of new posts (at least one post every 1-2 weeks). If you are starting a blog, don't be afraid to publish a large number of posts without immediate readership or feedback. If the content is good and you promote it, the traffic will follow. I must also say that blogging on a regular basis has improved my writing skills. Although it takes a lot of effort, I hope that people find this blog both useful and entertaining.

An example of the previous blog template (circa May 2012).

Not bad for the first 4.917 years. According to the blogspot.com analytics, it appears that I have reached 20,000 [4] views (this includes all posts and pages). Many of these views are due to a handful of posts, as the number of views per post follows a power-law distribution [5]. One thing that helped my blog along was participating in the Carnival of Evolution (CoE) [6]. Another boost to readership comes from thinking about innovative topics to cover (e.g. thinking outside the blog template, so to speak). I also publicize my posts on Facebook, Tumblr [7], and my research website where appropriate. Finally, I have integrated the blog with my research projects and teaching activities, which helps along the topical innovation.

Power-law distribution [8] of views per post (left) and time-series of views by week (top right, subsample) and for all time (lower right).

The ten most viewed posts and their rates of accession (data collated on 11/5/2012).
Views    Viewing Rate (per day)
2522     11.57
814      3.21
702      2
614      2.15
297      1.16
293      0.79
292      0.93
277      0.43
267      5.45
264      0.88
* = featured in an edition of CoE.

NOTES:

[1] See the explanation page for more information.

[2] This was a very rough idea at the time. It is also quite unconventional. In engineering and economics, there are whole subfields, library stacks, and conferences devoted to the notion of "optimization". However, in a number of fields (such as Molecular Biology, Evolutionary Ecology, and Cultural Anthropology), models of optimality do not explain the data well. I eventually worked this idea into an arXiv paper called "The 'Machinery' of Biocomplexity: understanding non-optimal architectures in biological systems".

[3] Dickinson Hall was opened in the 1960s as the original home of the Florida Museum of Natural History (FLMNH). It is now used as a research facility and to house the collections not on exhibition.  Link to post.


[4] To be precise, probably sometime tomorrow. Most of these visits are legitimate (e.g. not spambots or other redundant counts). What people are getting out of each visit, however, is not known.

[5] Notably, the traffic with respect to time is bursty, especially on a day-to-day basis (as expected).

[6] I hosted CoE #46 with the theme of evolutionary trees (which has earned roughly 2500 views). See it here. I plan to host again next year. CoE is the longest-running blog carnival (which is a monthly review of the blogosphere for a certain topic). Thanks to The Genealogical World of Phylogenetic Networks for this analysis.


[7] I keep a Tumblr site (Tumbld Thoughts) for some of my shorter concepts, observations, and sets of hyperlinks. This “microblogging” platform is particularly good for this purpose. Analytics are currently being collected.

[8] Rank order distribution plotted on a log-log plot. The distribution of views per post follows a power-law distribution, assuming that all traffic flow to the site over time is a Poisson process.......you know, all the good stuff.......
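
For the curious, here is a short sketch of that rank-order view using the ten view counts from the table above; with so few points the fitted slope is only suggestive, not a rigorous power-law test.

import numpy as np

views = np.array([2522, 814, 702, 614, 297, 293, 292, 277, 267, 264])
rank = np.arange(1, len(views) + 1)

slope, intercept = np.polyfit(np.log10(rank), np.log10(views), 1)
print(f"approximate rank-order (power-law) slope: {slope:.2f}")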
