March 3, 2018

Open Data Day 2018: Orthogonal Research Version

Time once again for International Open Data Day, an annual event hosted by organizations all around the world. For the Orthogonal Research contribution, I am sharing a presentation on the role of theory in data science (and the analysis of open data).

The full set of slides is available on Figshare, doi:10.6084/m9.figshare.5483746

A theory of data goes back to before there were concepts such as "big data" or "open data". In fact, we can learn a lot from attempts to characterize regularities in scientific phenomena, particularly in the behavioral sciences (e.g. Psychophysics).

There are a number of ways to build a mini-theory, but one advantage of the approach we are working on is that (assuming partial information about the data being analyzed) a theoretical model can be built with very limited amounts of data. I did not mention the role of non-empirical reasoning [1] in theory-building, but it might be an important issue for future consideration.

The act of theory-building also creates generalized models of pattern interpretation. In this case, our mini-theory detects sheep-shaped arrays. But there are bottom-up and top-down assumptions that go into this recognition, and theory-building is a way to make those explicit.

Naive theories are a particular mode of error in theory-building from sparse or incomplete data. In the case of human reasoning, naive theories result from generalization based on limited empirical observation and blind inference of mechanism. They are characterized in the Cognitive Science literature as being based on implicit and non-domain-specific knowledge [2].

Taken together, mini-theories and naive theories can help us not only better characterize unlabeled and sparsely labelled data, but also gain an appreciation for local features in the dataset. In some cases, naive theory-building might be beneficial for enabling feature engineering, ontologies/metadata [3] and other characteristics of the data.

In terms of usefulness, theory-building in data science lies somewhere in between mathematical discovery programs and epistemological models. 

[1] Dawid, R. (2013). Novel Confirmation and the Underdetermination of Scientific Theory Building. PhilSci Archive.

[2] Gelman, S.A., Noles, N.S. (2011). Domains and naive theories. WIREs Cognitive Science, 2, 490–502. doi:10.1002/wcs.124

[3] Rzhetsky, A., Evans, J.A. (2011). War of Ontology Worlds: Mathematics, Computer Code, or Esperanto? PLoS Computational Biology, 7(9), e1002191. doi:10.1371/journal.pcbi.1002191

February 12, 2018

Darwin as a Universal Principle

Background Diagram: Mountian-Sky-Astronomy-Big-Bang blog.

For this year's Darwin Day post, I would like to introduce the concept of Universal Darwinism. To understand what is meant by universal Darwinism, we need to explore the meaning of the term as well as the many instances to which Darwinian ideas have been applied. The most straightforward definition of Universal Darwinism is a set of Darwinian processes that can be extended to any adaptive system, regardless of its suitability. Darwinian processes can be boiled down to three essential features:
1) production of random diversity/variation (or stochastic process).  
2) replication and heredity (reproduction, historical contingency). 
3) natural selection (selective mechanism based on some criterion). 
A fourth feature, one that underlies all three of these points, is the production and maintenance of populations (e.g. population dynamics). These features are a starting point for many applications of universal Darwinism. Depending on the context of the application, these four features may be emphasized in different ways or additional features may be added.
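
The features listed above can be sketched as a toy simulation. This is only a minimal illustration under stated assumptions: the bitstring genomes, the count-the-ones fitness criterion, truncation selection, and every name and parameter value (`evolve`, `pop_size`, `mutation_rate`, and so on) are hypothetical choices made for this example, not part of any particular application of universal Darwinism.

```python
import random


def evolve(pop_size=50, genome_len=20, generations=40, mutation_rate=0.02, seed=0):
    """Toy Darwinian process; returns the mean fitness of each generation."""
    rng = random.Random(seed)
    fitness = sum  # hypothetical selection criterion: number of 1-bits

    # Fourth feature: a population of individuals is created and maintained.
    population = [
        [rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    history = [sum(fitness(g) for g in population) / pop_size]

    for _ in range(generations):
        # 3) Selection: only the fitter half of the population reproduces.
        survivors = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        # 2) Heredity: each survivor leaves two offspring that copy its genome.
        # 1) Variation: each copied bit flips with a small probability.
        population = [
            [bit ^ (rng.random() < mutation_rate) for bit in parent]
            for parent in survivors
            for _ in range(2)
        ]
        history.append(sum(fitness(g) for g in population) / pop_size)
    return history


history = evolve()
print("mean fitness, first generation:", history[0])
print("mean fitness, last generation:", history[-1])
```

Swapping in a different `fitness` function or a different way of generating variation changes the application but not the overall scheme, which is roughly the sense in which the features are said to be universal.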

Taken collectively, these three features constitute many different types of process, ranging from evolutionary epistemology [1] to cultural systems [2], neural systems [3, 4], physical systems [5, 6], and informational/cybernetic systems [7, 8]. Many of these universal applications are explicitly selectionist, and do not have uniform fitness criteria. In fact, fitness is assumed in the adaptive mechanism. This provides a very loose analogy to organismal evolution indeed.

Universal computational model shaped by Darwinian processes. COURTESY: Dana Edwards, Universal Darwinism and Cyberspace.

Of these, the application to cybernetic systems is the most general. Taking inspiration from both cybernetics theory and the selectionist aspects of Darwinian models, Universal Selection Theory [7, 8] has four basic claims that can be paraphrased in the following three statements:
1) "operate on blindly-generated variation with selective retention". 
2) "process itself reveals information about the environment". 
3) "processes built atop selection also operate on variation with selective retention".
The key notions are that evolution acts to randomly generate variation, retains only the most fit solutions, then builds upon this in a modular and hierarchical manner. In this way, universal Darwinian processes act to build complexity. As with the initial list of features, the formation and maintenance of populations is an important bootstrapping and feedback mechanism. Populations and heredity underlie all Darwinian processes, even if they are not defined in the same manner as biological populations. Therefore, all applications of Darwinian principles must at least provide an analogue to dynamic populations, even at a superficial level.

There is an additional advantage of using universal Darwinian models: capturing the essence of Darwinian processes in a statistical model. Commonalities between Darwinian processes and Bayesian inference [3, 5] can be proposed as a mechanism for change in models of cosmic evolution. In the Darwinian-Bayesian comparison, heredity and selection are approximated using the relationship between statistical priors and empirical observation. The theoretical and conceptual connections between phylogeny, populations, and Bayesian priors are a post-worthy topic in and of themselves.
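
One commonly noted formal parallel can be written down in a few lines: a discrete replicator step (type frequencies reweighted by relative fitness) has the same algebraic form as Bayes' rule (prior reweighted by likelihood). The sketch below assumes this particular reading of the comparison, which may not be the exact formulation used in [3, 5]; the function names and numbers are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]


def replicator_step(frequencies, fitness):
    """Next-generation type frequencies: frequency times fitness, renormalized."""
    mean_fitness = sum(x * w for x, w in zip(frequencies, fitness))
    return [x * w / mean_fitness for x, w in zip(frequencies, fitness)]


# Hypothetical numbers: three hypotheses (or three types in a population).
prior = [0.5, 0.3, 0.2]        # inherited state: priors / parental frequencies
likelihood = [0.1, 0.4, 0.8]   # fit to observation / fitness in the environment

posterior = bayes_update(prior, likelihood)
next_generation = replicator_step(prior, likelihood)
print(posterior)
print(next_generation)
```

Under this reading, the prior plays the role of heredity (what is carried over from the previous round) and the likelihood plays the role of selection (how well each variant matches what is observed).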

At this point, we can step out a bit and discuss the origins of universal Darwinian systems. The origin of a Darwinian (or evolutionary) system can take a number of forms [9]. There are two forms of "being from nothingness" in [9] that could be proposed as origin points for Darwinian systems. The first is an origin in the lowest possible energetic (or in our case also fitness) state, and the other is what exists when you remove the governance of natural laws. While the former is easily modeled using variations of the NK model (which can be generalized across different types of systems), the latter is more interesting and is potentially even more universal.
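
As a rough sketch of the first case, here is a small NK landscape with an adaptive walk that starts from the lowest-fitness genotype. The details are assumptions made for illustration: cyclic neighborhoods, uniformly random contribution tables, a greedy one-mutation walk, and the names `make_nk_landscape` and `adaptive_walk` are all hypothetical choices rather than a specific published variant.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function over binary genotypes of length n (requires k < n)."""
    rng = random.Random(seed)
    # One lookup table per locus: its contribution depends on its own state
    # and the states of the next k loci (wrapping around the genome).
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            neighborhood = tuple(genotype[(i + j) % n] for j in range(k + 1))
            total += tables[i][neighborhood]
        return total / n

    return fitness


def adaptive_walk(fitness, start):
    """Greedy hill climb: move to the best one-bit mutant until none is fitter."""
    current = tuple(start)
    path = [fitness(current)]
    while True:
        mutants = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(mutants, key=fitness)
        if fitness(best) <= fitness(current):
            return current, path
        current = best
        path.append(fitness(current))


n, k = 8, 2
fitness = make_nk_landscape(n, k)
# "Origin" in the lowest possible fitness state, found here by brute force.
lowest = min(itertools.product((0, 1), repeat=n), key=fitness)
peak, path = adaptive_walk(fitness, lowest)
print("steps taken:", len(path) - 1)
print("fitness along the walk:", [round(f, 3) for f in path])
```

Raising `k` makes the landscape more rugged, so walks of this kind tend to stall on local peaks sooner.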

An iconic diagram of Cosmic Evolution. COURTESY: Inflation Theory by Dr. Alan Guth.

An iconic diagram of Biological Evolution. COURTESY: Palaeontological Scientific Trust (PAST).

So did Darwin essentially construct a "theory of everything" over 150 years ago? Did he find "42" in the Galapagos while observing finches and tortoises? There are a number of features from complexity theory that might also fit into the schema of Darwinian models. These include concepts from self-organization not explicitly part of the Darwinian formulation: scaling and complexity, dependence on initial conditions, tradeoffs between exploitation and exploration, and order arising from local interactions in a disordered system. More explicitly, contributions from chaos theory might provide a bridge between nonlinear adaptive mechanisms and natural selection.

The final relationship I would like to touch on here is a comparison between Darwinian processes and Universality in complex systems. The simplest definition of Universality states that the properties of a system are independent of the dynamical details and behavior of the system. Universal properties such as scale-free behavior [10] and conformity to a power law [11] occur in a wide range of systems, from biological to physical and from behavioral to social systems. Much like applications of Universal Darwinism, Universality allows us to observe commonalities among entities as diverse as human cultures, organismal orders/genera, and galaxies/universes. The link to Universality also provides a basis for the abstraction of a system's Darwinian properties. This is the key to developing more representationally-complete computational models.
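
To make the power-law idea concrete, the sketch below draws samples from a continuous power law and then recovers the exponent with the standard maximum-likelihood estimator. The exponent of 2.5, the sample size, and the function names are hypothetical values chosen for the example; real datasets also require choosing the lower cutoff `x_min`, which is simply assumed here.

```python
import math
import random


def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Draw n samples with density proportional to x**(-alpha) for x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling; 1 - random() lies in (0, 1], so no division by zero.
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)
    ]


def estimate_exponent(samples, x_min=1.0):
    """Maximum-likelihood estimate of alpha for the tail at or above x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


samples = sample_power_law(20000, alpha=2.5)
alpha_hat = estimate_exponent(samples)
print("estimated exponent:", round(alpha_hat, 3))
```

The same estimator applies whatever the samples represent, which is the practical side of the claim that the statistical signature does not depend on the details of the system.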

8-bit Darwin. COURTESY: Diego Sanches.

Darwin viewed his development of the theory of evolution by natural selection as an exercise in inductive empiricism [12]. Ironically, people are now using his purely observational exercise as inspiration for theoretical mechanisms for systems from the natural world and beyond.

[1] Radnitzky, G., Bartley, W.W., and Popper, K. (1993). Evolutionary Epistemology, Rationality, and the Sociology of Knowledge. Open Court Publishing, Chicago. AND Dennett, D. (1995). Darwin's Dangerous Idea. Simon and Schuster, New York.

[2] Claidière, N., Scott-Phillips, T.C., and Sperber, D. (2014). How Darwinian is cultural evolution? Philosophical Transactions of the Royal Society B, 369(1642), 20130368.

[3] Friston, K. (2007). Free Energy and the Brain. Synthese, 159, 417-458.

[4] Edelman, G.M. (1987). Neural Darwinism: the theory of neuronal group selection. Oxford University Press, Oxford, UK.

[5] Campbell, J. (2011). Universal Darwinism: the path to knowledge. CreateSpace Independent Publishing.

[6] Smolin, L. (1992). Did the universe evolve? Classical and Quantum Gravity, 9, 173-191.

[7] Campbell, D.T. (1974). Unjustified Variation and Selective Retention in Scientific Discovery. In "Studies in the Philosophy of Biology", F.J. Ayala and T. Dobzhansky eds., pgs. 139-161. Palgrave, London.

[8] Cziko, G.A. (2001). Universal Selection Theory and the complementarity of different types of blind variation and selective retention. In "Selection Theory and Social Construction", C. Heyes and D. Hull eds. Chapter 2. SUNY Press, Albany, NY.

[9] Siegel, E. (2018). The Four Scientific Meanings Of ‘Nothing’. Starts with a Bang! blog, February 7.

[10] Barabási, A-L. (2009). Scale-Free Networks: a decade and beyond. Science, 325, 412-413.

[11] Lorimer, T., Gomez, F., and Stoop, R. (2015). Two universal physical principles shape the power-law statistics of real-world networks. Scientific Reports, 5, 12353.

[12] Ayala, F.J. (2009). Darwin and the Scientific Method. PNAS, 106(1), 10033–10039.

February 1, 2018

Things that Just Happened in London.....

This week, the Royal Society is hosting a workshop called "From Connectome to Behavior", organized by the OpenWorm Foundation. A program can be found here.

The Monday and Tuesday sessions included talks by OpenWorm senior contributors as well as mathematical, biological, and engineering researchers from around the world (including John White, a C. elegans research legend). Fortunately, you can get a taste for the topical diversity on the OpenWorm Twitter feed, and from the screenshots below.

The Wednesday session was a day for demos and less formal talks, as evidenced by the robotics contingent showing off their latest hardware. Living worms also made an appearance!

How good is the OpenWorm simulation suite? Take a simple test: which one is the real worm, the worm on the left or the worm on the right? View the video footage and vote here.

Here is some OpenWorm-related artwork on display, with designs by Matteo Farinella.

If what you see here looks good and you would like to learn more, please get in touch with the OpenWorm community! Hope to see you soon!

Thanks to the Royal Society of London for being an excellent host!

January 27, 2018

News and Views: January 2018 edition

Here are a few updates on Synthetic Daisies-adjacent news and views originally posted on Tumbld Thoughts.

First: For those interested in scientific community-building, fostering diversity, and mentorship, here is a working Laboratory Contribution Philosophy [1] developed for the Orthogonal Research Lab.

The full series of slides can be found on the Open Science Framework as part of the Open Career Development project.

Secondly: It’s time for the 2018 Foundational Questions Institute essay contest! This year's theme: “What is Fundamental?” Submissions are now closed, but the entries are open for evaluation. Hundreds of essays providing insights from physics, math, philosophy, and more. Check out my entry here: “Towards the meta-fundamental: introducing intercontextual invariants”.

[1] inspired by the contribution methodology to Open Source projects.

January 13, 2018

Using PowerPoint for Simple 3-D Rendering

I've recently become acquainted with a feature in PowerPoint that allows you to build 3-D worlds within a normal slideshow. Microsoft has been developing these modeling capabilities not only natively in PowerPoint but also in Paint 3D [1]. As an alternative to breaking free from PowerPoint, I want to briefly show how this can be used for everyday scientific exposition and communication.

Tutorial for creating 3-D in a PowerPoint presentation using Paint 3D
(from Microsoft and LinkedIn Learning).

Having an integrated platform is particularly useful for scientific and engineering presentations where a simple display is desired with minimal training and digital bandwidth requirements. Although PowerPoint is not an open source platform, packages such as OpenOffice Impress do many of the same things I will show you here. To create your own image, you will have to become proficient in "working with shapes". Create a 2-D shape, and overlay/join multiple 2-D shapes [2] if you prefer. Right-clicking on the shape allows you to recolor, extrude, and rotate the shape as desired.

A cross-organizational advertisement for Google Summer of Code offerings. This example demonstrates a background plus a variety of composite shapes (more on these later).

A project advertisement for the DevoWorm group. This example demonstrates surface definitions and object embedding. 

We will begin our discussion with a few simple examples that I created in a short period of time. These types of objects can be used symbolically, decoratively, or as a labelled object. While joined objects can be rotated and sized (as in something like Blender), composite objects are a bit trickier to work with.

Simple "doo-dad" in stationary 3-D. Components not joined.

Architectural feature in stationary 3-D. Components not joined.

In both examples (particularly the latter), multiple shapes are combined to produce a 3-D geometry. In the first example, four circles and an oval are used to build a mechanical "doo-dad". In the second example, 15 cubes are layered and rotated to form three isometric rows. These are joined at the row ends by two flat rectangles layered so as to achieve the visual illusion of continuity. The shapes were not physically joined, but layered together they still create the desired visual effect.

Creating a donut shape in PowerPoint from a circle.

One way to see the advantage of using extruded objects to achieve 3-D is to discuss the concept of stereoscopy. Shown below is an elaborate design that appears to have depth cues, but is actually the same pattern overlapping at a narrow visual angle. While a series of offset cubes provides the illusion of depth, overlapping squares do not produce very rich depth and orientational cues. We can also use extruded shapes, which are also not very perceptually rich.

Four overlapping squares (left) vs. a cube with four segments (right).

Another alternative is to use isometric shapes joined together in a scene. In the following example, we have 12 cubes joined together in pairs with one end removed. These long cubes are then stacked in a tiered arrangement. While this example is largely decorative, this structure could also be made to be interactive.

This structure forms an isometric lattice (a type of axonometric projection) that provides the illusion of a 3-D scene without fully representing the third dimension [3]. To accomplish this, a full isometric lattice is projected at a 120 degree angle, with three prominent sides of a projected object (in this case, a cube) displayed at a 60 degree angle. Isometric geometries have been used in video game design for many years, including games such as Zaxxon, Q*Bert, and classic Sim City. We can also use them to create objects and projections for representing datasets and their structure. The latter will be the subject of a future post.
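
The geometry of an isometric projection can be checked with a short calculation. The sketch below uses one common convention, which is my assumption rather than anything PowerPoint exposes: the vertical axis points straight up on screen and the two ground axes slope down at 30 degrees to either side. The function names are hypothetical.

```python
import itertools
import math

COS30 = math.cos(math.radians(30))
SIN30 = math.sin(math.radians(30))


def isometric(x, y, z):
    """Project a 3-D point onto 2-D screen coordinates (screen y points up)."""
    return ((x - y) * COS30, z - (x + y) * SIN30)


def angle_between(a, b):
    """Angle in degrees between two 2-D vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    return math.degrees(math.acos(dot / (math.hypot(*a) * math.hypot(*b))))


# The three coordinate axes land 120 degrees apart on the screen.
x_axis = isometric(1, 0, 0)
y_axis = isometric(0, 1, 0)
z_axis = isometric(0, 0, 1)
for name, (a, b) in {
    "x-y": (x_axis, y_axis),
    "y-z": (y_axis, z_axis),
    "z-x": (z_axis, x_axis),
}.items():
    print(name, round(angle_between(a, b), 6))

# Project the eight corners of a unit cube.
corners = [isometric(*corner) for corner in itertools.product((0, 1), repeat=3)]
for corner in corners:
    print(tuple(round(c, 3) for c in corner))
```

Because the three axes are foreshortened equally, the nearest and farthest corners of the cube project to (almost exactly) the same screen point, which is why an isometric cube has a hexagonal outline with three visible rhombus-shaped faces.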

Isometric "hive" of square pipes in stationary 3-D.

UPDATE (2/2/2018): Orthogonal Research Badge System now has a microcredential in 3-D Power Point Design! Work your way through some simple examples such as basic shape extrusion, then try your hand at more complex models. 

[1] for a more detailed demonstration and discussion of Paint 3D, watch "3D Tools in Powerpoint" from the Presentation Guild, featuring Stephanie Horn from Microsoft.

[2] this is done using the Insert Shapes function on the Format toolbar. You must place a shape on the slide, then Format will appear in the Drawing Tools.

[3] Krikke, J. (2000). Axonometry: a matter of perspective. IEEE Computer Graphics and Applications, 20(4), 7-11.