Theory Driven Data Science

This is a personal website intended to promote a concept I have developed called “theory driven data science.”

Theory driven data science is based on the premise that a theoretical understanding of the things one is trying to predict will produce more accurate predictions and enhance the ability to develop prescriptive solutions. The theory driven data science model is based on three core functions: theory, method, and practice.

  1. Theory: A Physics of Living Systems
  2. Method: Data Pattern Analysis
  3. Practice: Improved Prediction and Enhanced Prescription

The theoretical foundation of theory driven data science is the “Physics of Living Systems,” a set of core principles concerning the structure of data about living systems. By understanding the “nature of nature,” we can build better predictive models and choose better options for fixing the problems we are trying to address. The Physics of Living Systems has three core propositions, which are referred to as “The Facts of Life.”

  1. Living systems have infinite variety
  2. Living systems are constantly changing
  3. Living systems are subject to selection processes

As an example of improved prediction, I developed a theory that health is normally distributed. The theory arose from examining the percentages of chronic conditions in the healthcare population. Using this normal health theory, I was able to boost the explanatory power of a population health ranking model by 50%. That is theory driven data science in practice.
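
To make the idea of a normally distributed health measure more concrete, here is a minimal Python sketch. It is not the ranking model mentioned above, whose details are not given here. It only illustrates one hypothetical way to act on a "health is normally distributed" assumption: rank people on a skewed measure and map each rank to a z-score on the standard normal curve. The function name normal_scores and the sample condition counts are made up for illustration.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map each value to a z-score based on its rank (a rank-based normal score).

    Assumption for illustration: the underlying trait is normally distributed,
    so a person's percentile rank can be read as a position on a normal curve.
    """
    n = len(values)
    # Indices of the inputs in ascending order of value (ties keep input order).
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the percentile strictly between 0 and 1.
        percentile = (rank + 0.5) / n
        scores[idx] = std_normal.inv_cdf(percentile)
    return scores


# Hypothetical chronic-condition counts for five people (not real data).
condition_counts = [0, 1, 2, 4, 9]
health_scores = normal_scores(condition_counts)
for count, score in zip(condition_counts, health_scores):
    print(f"{count} conditions -> z-score {score:.2f}")
```

The raw counts are heavily skewed, but the resulting scores are symmetric around zero. A score like this could then be used as an input to a predictive model in place of the raw count, although whether that helps depends on the data and the model.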

A theory driven approach to data science focuses on the “six honest serving men” that Kipling wrote about in his poem from “The Elephant’s Child.”

I KEEP six honest serving-men
(They taught me all I knew);
Their names are What and Why and When 
And How and Where and Who.

Posted by arnoldtk