Published by Weisser Zwerg Blog on
Judea Pearl’s Causality and the do()-calculus in a state-based and an operation-based picture.
I have been struggling for quite some time with Judea Pearl’s causality theory and his do-calculus. His work is summarized in the following three books (in order of publication and in decreasing order of difficulty):
- Causality: Models, Reasoning and Inference
- Causal Inference in Statistics - A Primer
- The Book of Why: The New Science of Cause and Effect
The main reason I am struggling is that his representation does not match my intuition. He uses a state-based representation, while my intuition of causality is operation-based (I’ll explain the terminology below). In a follow-up blog post I’ll go into further details about Judea Pearl’s causality theory, but in this one I’ll only focus on these representational issues.
The preceding sub-title and the following quote are Leslie Lamport’s words:
- Writing is nature’s way of letting you know how sloppy your thinking is.
- Mathematics is nature’s way of letting you know how sloppy your writing is.
- Formal mathematics is nature’s way of letting you know how sloppy your mathematics is.
This blog post develops in parallel to my thinking, i.e. don’t take it as the final word …
The distinction between operation-based and state-based representations is quite common in computer science, but I found the best description of the topic in the papers on Conflict-Free Replicated Data Types (CRDTs). They distinguish between Commutative Replicated Data Types (CmRDTs or operation-based CRDTs) and Convergent Replicated Data Types (CvRDTs or state-based CRDTs). For the full details I suggest you read the paper A comprehensive study of Convergent and Commutative Replicated Data Types.
In the state-based representation you only track the state of an object (or the world), e.g. is-the-light-switch-on-p (this is Lisp notation where the -p at the end indicates a predicate) with values yes/no.
In the operation-based representation you track (or often know) a starting-state and then you track the operations. Imagine a bank account that starts out with 0 credits. The operations would be [deposit: 10], [withdraw: 5], … and if you know all the operations then you know the state of the bank account.
Both representations are equivalent and the above-mentioned paper shows how to convert from one to the other.
My intuition of causality works in the operation-based picture. You start out with a world state, you apply an action to it, and you end up with a new world state.
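The bank-account example can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the CRDT paper: the names `Operation`, `apply_operation` and `replay` are made up for this post. The operation log is the operation-based picture, and the balance you get by replaying it is the state-based picture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    kind: str    # "deposit" or "withdraw" (hypothetical operation names)
    amount: int


def apply_operation(balance: int, op: Operation) -> int:
    """Apply a single operation to a balance and return the new balance."""
    if op.kind == "deposit":
        return balance + op.amount
    if op.kind == "withdraw":
        return balance - op.amount
    raise ValueError(f"unknown operation: {op.kind}")


def replay(start: int, ops: list[Operation]) -> int:
    """Derive the state from the starting state plus the ordered operation log."""
    balance = start
    for op in ops:
        balance = apply_operation(balance, op)
    return balance


# Operation-based picture: the starting state and the log of operations.
ops = [Operation("deposit", 10), Operation("withdraw", 5)]

# State-based picture: only the resulting balance.
balance = replay(0, ops)
print(balance)  # 5
```

Note that the conversion in this direction loses information: from the balance alone you cannot tell which operations produced it.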
There is also an implicit happened-before order (e.g. time) in the sequence of actions (at least locally, e.g. when you look at an object like the light switch).
The representation of Markov-Decision-Processes (MDPs) would match my intuition quite well:
You have states (green) and actions (reddish/orange). If you apply an action to a state then you have some transition probabilities to other states. The rewards part of MDPs is irrelevant for our analysis of causality here.
But the do-calculus of Judea Pearl works “on tables” (each column represents the instantiations of a random variable) like in statistics, i.e. it is completely state-based.
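To make the state/action/transition-probability structure concrete, here is a minimal sketch for the light switch from above. The states, the actions and all probabilities are invented for illustration, and rewards are left out on purpose.

```python
import random

# transitions[state][action] -> {next_state: probability}
# All numbers are made-up illustration values, not measured data.
transitions = {
    "off": {"flip": {"on": 0.9, "off": 0.1}, "wait": {"off": 1.0}},
    "on": {"flip": {"off": 0.9, "on": 0.1}, "wait": {"on": 1.0}},
}


def step(state: str, action: str, rng: random.Random) -> str:
    """Sample the next state after applying an action to a state."""
    distribution = transitions[state][action]
    next_states = list(distribution)
    weights = [distribution[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


rng = random.Random(0)
state = "off"
history = [state]
for action in ["flip", "wait", "flip"]:
    state = step(state, action, rng)
    history.append(state)

print(history)
```

The actions are explicit here and the order of the loop gives the happened-before relation, which is exactly what I am missing in the table picture.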
One typical example is the Rain-Sprinkler-GrassWet example:
You have three variables as three columns: one column RAIN, one column SPRINKLER and one column GRASS_WET. This representation lacks actions (like “turn the sprinkler on”).
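As a small illustration of what “working on tables” means, the following sketch stores a handful of observations and computes a conditional relative frequency by counting rows. The rows are invented toy data and `conditional_frequency` is a hypothetical helper; this is not an implementation of the do-calculus.

```python
# Each row is one observed world state; the values are invented toy data.
rows = [
    {"RAIN": False, "SPRINKLER": True, "GRASS_WET": True},
    {"RAIN": True, "SPRINKLER": False, "GRASS_WET": True},
    {"RAIN": False, "SPRINKLER": False, "GRASS_WET": False},
    {"RAIN": True, "SPRINKLER": True, "GRASS_WET": True},
    {"RAIN": False, "SPRINKLER": True, "GRASS_WET": False},
]


def conditional_frequency(rows, target, given):
    """Relative frequency of target being True among the rows matching `given`."""
    matching = [r for r in rows if all(r[k] == v for k, v in given.items())]
    if not matching:
        raise ValueError("no rows match the condition")
    return sum(r[target] for r in matching) / len(matching)


# Fraction of rows with wet grass among the rows where the sprinkler is on.
p_wet_given_sprinkler = conditional_frequency(rows, "GRASS_WET", {"SPRINKLER": True})
print(p_wet_given_sprinkler)
```

Nothing in the table says who switched the sprinkler on or when; each row only records which values occurred together.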
In such simple examples like Voltage-Current (like in Ohm’s law) this is fine. But for more difficult examples it constantly confuses me.
Imagine the example of someone throwing a projectile (stone, foot-ball, snow-ball, cotton-ball) and maybe hitting a window that then perhaps breaks. In order to work with this example it would not be sufficient to have a column projectile with some value in it. You would need to store the whole trajectory in the “column” projectile and then assign to each trajectory a probability for the window breaking.
I cannot tell exactly why, but to my intuition it also makes a difference whether the projectile is a stone or a snow-ball. In the case of a stone: if somebody threw the stone three weeks ago and the window broke, you could at least forensically reconstruct what happened by finding the projectile inside the room the window is part of. But in the case of a snow-ball the window would break, and three weeks later the snow-ball has melted and the water has evaporated, i.e. there is no trace left of the object that caused the window to break. This is not relevant for forward causality (from cause to effect), but it feels important for the backward reasoning from effect to most probable cause, especially if you have “missing variables” (like the missing snow-ball).
I am not sure if this will remain my final answer, but for the moment I’ve settled on predicates like in logic. In the window example you would introduce a column projectile-did-hit-the-window-in-the-past-p or similar. Once you have converted the operation-based scenario into a (kind-of) state-based picture you can apply the do-calculus.
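The conversion from an event log to predicate columns could look roughly like this. It is only a sketch under my own assumptions: the event names, the integer time stamps and the helper functions are all hypothetical.

```python
# Operation-based picture: a time-ordered log of (time, event) pairs.
# Event names and time stamps are hypothetical.
events = [(1, "throw"), (2, "hit_window"), (3, "window_breaks")]


def did_happen_in_the_past(events, kind, now):
    """Predicate: did an event of this kind happen at or before time `now`?"""
    return any(t <= now and k == kind for t, k in events)


def snapshot(events, now):
    """State-based picture: one table row of predicate columns at time `now`."""
    return {
        "projectile_did_hit_the_window_in_the_past_p": did_happen_in_the_past(
            events, "hit_window", now
        ),
        "window_broken_p": did_happen_in_the_past(events, "window_breaks", now),
    }


print(snapshot(events, now=1))
print(snapshot(events, now=3))
```

The predicate column stays true even after the snow-ball has melted, which is the reason I like it for the backward reasoning from effect to cause.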
I would still hope for another representation that is more like Markov-Decision-Processes.
- Causality for Machine Learning by Bernhard Schölkopf is a great summary of the current state of affairs. I am especially grateful that he shares his insights into how the concept of causality, as discussed in the machine learning/statistics literature, fits into the picture of physics.
- Elements of Causal Inference: Foundations and Learning Algorithms by Jonas Peters, Dominik Janzing, Bernhard Schölkopf is also a great resource for gaining a better understanding of the topic.