# Causality, State Based vs. Operation Based Representation

Published by Weisser Zwerg Blog on

Judea Pearl’s causality and the do()-calculus in a state-based and an operation-based picture.

## Rationale

I have been struggling for quite some time with Judea Pearl’s causality theory and his do-calculus. His work is summarized in the following three books (in order of publication and in decreasing order of difficulty):

- Causality: Models, Reasoning and Inference
- Causal Inference in Statistics - A Primer
- The Book of Why: The New Science of Cause and Effect

The main reason I am struggling is that his representation does not match my intuition. He uses a state-based representation, while my intuition of causality is operation-based (I’ll explain the terminology below). In a follow-up blog post I’ll go into further detail about Judea Pearl’s causality theory, but in this one I’ll focus only on these representational issues.

### Thinking is writing and writing is thinking / Writing is hard, because thinking is hard

The preceding sub-title and the following quote are Leslie Lamport’s words:

- Writing is nature’s way of letting you know how sloppy your thinking is.
- Mathematics is nature’s way of letting you know how sloppy your writing is.
- Formal mathematics is nature’s way of letting you know how sloppy your mathematics is.

This blog post develops in parallel to my thinking, i.e. don’t take it as the final word …

## Operation Based vs. State Based representations

The distinction between operation-based and state-based representations is quite common in computer science, but I found the best description of the topic in the papers on Conflict-Free Replicated Data Types (CRDTs). They distinguish between Commutative Replicated Data Types (CmRDTs, or operation-based CRDTs) and Convergent Replicated Data Types (CvRDTs, or state-based CRDTs). For the full details I suggest you read the paper A comprehensive study of Convergent and Commutative Replicated Data Types.

### State-Based Representation

In a state-based representation you only track the state of an object (or the world), e.g. is-the-light-switch-on-p (this is Lisp notation, where the -p at the end indicates a predicate) with the values yes/no.

### Operation-Based Representation

In the operation-based representation you track (or often know) a starting-state and then you track the operations. Imagine a bank account that starts out with 0 credits. The operations would be [deposit: 10], [withdraw: 5], … and if you know all the operations then you know the state of the bank account.

Both representations are equivalent, and the above-mentioned paper shows how to convert from one to the other.
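A minimal sketch of the two views for the bank account example, in Python. The operation names, the third operation, and the helper names `apply_operation` and `replay` are my own assumptions for illustration; they are not taken from the CRDT paper.

```python
from functools import reduce

# Hypothetical operation log for the bank account: (kind, amount) pairs.
operations = [("deposit", 10), ("withdraw", 5), ("deposit", 7)]

def apply_operation(balance, operation):
    """Apply a single operation to a balance and return the new balance."""
    kind, amount = operation
    if kind == "deposit":
        return balance + amount
    if kind == "withdraw":
        return balance - amount
    raise ValueError(f"unknown operation: {kind}")

def replay(start_balance, ops):
    """Operation-based view: fold all operations over the starting state."""
    return reduce(apply_operation, ops, start_balance)

# State-based view: only the resulting snapshot is kept.
state = {"balance": replay(0, operations)}
print(state)  # {'balance': 12}
```

The snapshot on its own no longer shows which operations produced it; that history only lives in the operation log.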

## Causality Representation vs. my Intuition

My intuition of causality works in the operation-based picture. You start out with a world state $W_0$, you apply an action $A$ to it and you end up with a new world state $W_1$.

$f: (W_0, A) \mapsto W_1$

There is also an implicit happened-before order (e.g. time) in the sequence of actions (at least locally, e.g. when you look at an object like the light switch).
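As a rough sketch, this is how I picture $f$ for the light switch in Python. The `World` type and the two action names are hypothetical, and the world is reduced to a single boolean; the order of the `actions` list plays the role of the happened-before order.

```python
from typing import NamedTuple

class World(NamedTuple):
    """A deliberately tiny world state: just the light switch."""
    light_switch_on: bool

def f(world: World, action: str) -> World:
    """Transition function f: (W_0, A) -> W_1 for two hypothetical actions."""
    if action == "toggle_switch":
        return world._replace(light_switch_on=not world.light_switch_on)
    if action == "do_nothing":
        return world
    raise ValueError(f"unknown action: {action}")

w0 = World(light_switch_on=False)
actions = ["toggle_switch", "do_nothing", "toggle_switch", "toggle_switch"]

# history[i] is the world state after the first i actions.
history = [w0]
for action in actions:
    history.append(f(history[-1], action))

print(history[-1])  # World(light_switch_on=True)
```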

The representation of Markov-Decision-Processes (MDPs) would match my intuition quite well:

You have states (green) and actions (reddish/orange). If you apply an action to a state, you get transition probabilities to other states. The rewards part of MDPs is irrelevant for our analysis of causality here.
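To make the MDP picture concrete, here is a small sketch in Python without rewards. The states, actions, and all probabilities are invented for illustration (a light that sometimes fails to switch on or burns out); only the structure, a mapping from (state, action) to a distribution over next states, is the point.

```python
import random

# Hypothetical transition table: (state, action) -> {next_state: probability}.
transitions = {
    ("light_off", "flip_switch"): {"light_on": 0.95, "light_off": 0.05},
    ("light_on", "flip_switch"): {"light_off": 1.0},
    ("light_off", "wait"): {"light_off": 1.0},
    ("light_on", "wait"): {"light_on": 0.99, "light_off": 0.01},
}

def step(state, action, rng):
    """Sample the next state from the distribution for (state, action)."""
    distribution = transitions[(state, action)]
    next_states = list(distribution)
    weights = list(distribution.values())
    return rng.choices(next_states, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so that runs are repeatable
state = "light_off"
trajectory = [state]
for action in ["flip_switch", "wait", "flip_switch"]:
    state = step(state, action, rng)
    trajectory.append(state)

print(trajectory)
```

Unlike a table of random variables, this structure has the action and the ordering of steps built in.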

But the do-calculus of Judea Pearl works “on tables” (each column represents the instantiations of a random variable) like in statistics, i.e. it is completely state-based.

One typical example is the Rain-Sprinkler-GrassWet example:

You have three variables as three columns: one column `RAIN`, one column `SPRINKLER`, and one column `GRASS_WET`. This representation lacks actions (like “turn the sprinkler on”) and time.
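A sketch of what such a table looks like in Python, with six made-up observations. The data and the helper `conditional_frequency` are assumptions for illustration. It computes a plain conditional relative frequency on the table, which is ordinary conditioning and not an intervention in Pearl’s sense.

```python
from fractions import Fraction

# Hypothetical observations: one row per observation, one key per column.
rows = [
    {"RAIN": 1, "SPRINKLER": 0, "GRASS_WET": 1},
    {"RAIN": 0, "SPRINKLER": 1, "GRASS_WET": 1},
    {"RAIN": 0, "SPRINKLER": 0, "GRASS_WET": 0},
    {"RAIN": 1, "SPRINKLER": 1, "GRASS_WET": 1},
    {"RAIN": 0, "SPRINKLER": 1, "GRASS_WET": 0},
    {"RAIN": 0, "SPRINKLER": 0, "GRASS_WET": 0},
]

def conditional_frequency(rows, target, given):
    """Relative frequency of `target` among the rows that match `given`."""
    matching = [r for r in rows if all(r[k] == v for k, v in given.items())]
    hits = [r for r in matching if all(r[k] == v for k, v in target.items())]
    return Fraction(len(hits), len(matching))

# Share of rows with wet grass among the rows where the sprinkler was on.
print(conditional_frequency(rows, {"GRASS_WET": 1}, {"SPRINKLER": 1}))  # 2/3
```

Nothing in a row says who switched the sprinkler on or when; each row is just a snapshot of three values.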

In simple examples like `Rain-Sprinkler-GrassWet`, `LightSwitch-Light`, or `Voltage-Current` (as in Ohm’s law) this is fine. But for more difficult examples it constantly confuses me.

Imagine the example of someone throwing a `projectile` (stone, football, snowball, cotton ball) and maybe hitting a `window` that then perhaps breaks. To work with this example, it would not be sufficient to have a column `projectile` with a single value in it. You would need to store the whole trajectory in the “column” `projectile` and then assign to each trajectory a probability for the `window` breaking.

I cannot tell exactly why, but for my intuition it would also make a difference whether the `projectile` is a stone or a snowball. In the case of a stone: if somebody threw the stone three weeks ago and the window broke, you could at least forensically reconstruct what happened by finding the `projectile` inside the room the window is part of. But in the case of a snowball, the window would break, and three weeks later the snowball has melted and the water has evaporated, i.e. there is no trace left of the object that caused the window to break. This is not relevant for forward causality (from cause to effect), but it feels important for backward reasoning from effect to most probable cause, especially if you have “missing variables” (like the missing snowball).

## Possible resolution

I am not sure if this will remain my final answer, but for the moment I’ve settled on predicates like in logic. In the window example you would introduce a column `projectile-did-hit-the-window-in-the-past-p` or similar. Once you have converted the operation-based scenario into a (kind of) state-based picture, you can apply the do-calculus.
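A sketch of that conversion in Python, assuming a hypothetical timestamped event log for the window example. The event names, the timestamps, and the second predicate `window-is-broken-p` are my own inventions for illustration.

```python
# Hypothetical event log (operation-based picture): (time, kind, object).
events = [
    (1, "throw", "snowball"),
    (2, "hit_window", "snowball"),
    (3, "window_breaks", None),
    (20, "snowball_melts", "snowball"),
]

def to_predicates(events, now):
    """Collapse all events up to `now` into one row of yes/no predicates."""
    kinds_so_far = {kind for time, kind, _ in events if time <= now}
    return {
        "projectile-did-hit-the-window-in-the-past-p": "hit_window" in kinds_so_far,
        "window-is-broken-p": "window_breaks" in kinds_so_far,
    }

# One state-based row per point in time at which we look at the world.
for now in (0, 2, 30):
    print(now, to_predicates(events, now))
```

The predicate stays true after the snowball has melted, so the row still carries the information that the physical evidence no longer does.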

I would still hope for another representation that is more like Markov decision processes.