![[noise-fischer-black.png]]
**Causal inference**
The abstract above is from Black's 1986 article "Noise".^[Black, F. (1986). Noise. The Journal of Finance, 41(3), 528-543.] Black goes on to say:
*"Even highly trained people, though, seem to make certain kinds of errors consistently. For example, there is a strong tendency in looking at data to assume that when two events frequently happen together, one causes the other. There is an even stronger tendency to assume that the one that occurs first causes the one that occurs second. These tendencies are easy to resist in the simplest cases. But they seem to creep back in when econometric studies become more complex. Sometimes I wonder if we can draw any conclusions at all from the results of regression studies."*
Finding a link between a cause and its effect in the midst of noise is difficult.
This book offers a curated set of design patterns for causal inference, with modern applications of each pattern across the following three approaches: Statistics (traditional frequentist methods), Machine Learning, and Bayesian. Each design pattern is supported by business cases that use real(istic) data for causal inference. The three approaches are compared using the latest tools on the same data and model.
> [!NOTE]
> **What is a design pattern?**
> A (software) design pattern is a general, reusable solution to a common problem in a given (software) design context. It is not a finished design that can be transformed directly into code. Rather, it is a description or template for solving a problem that can be used in many different situations. Design patterns are formalized best practices that can be used to solve common problems.^[[Software design pattern - Wikipedia](https://en.wikipedia.org/wiki/Software_design_pattern)]
>
> **And why design patterns?**
> It's a tribute to my early career in programming (using C# and then Java). Our most valuable resources back then were design patterns. I still have a copy of the book _Head First Design Patterns: A Brain-Friendly Guide_ on my bookshelf from 20 years ago. It was a lifesaver when I moved from C# to Java.
**What is the goal here, beyond the design pattern idea?**
This project is not meant to replace the many great books on causal inference methods. Instead, it strives to accomplish two goals in an interactive format that presents all of the book's content in a network of ideas, concepts, and applications:
1\. To present solution patterns of the same conceptual model for problems requiring causal inference (using the same data, and often the same estimators) using the three approaches: Statistics, Machine Learning, and Bayesian. The applications include modern R and Python code and use the latest libraries and packages. In some cases, there are comparative evaluations of the R and Python implementations.
2\. To discuss the intricacies of modeling data for causal inference, along with lesser-known and sometimes puzzling situations that come up along the way, such as larger coefficients in IV models, negative R-squared values, potentially detrimental effects of matching in difference-in-differences, or unexpected multimodal posterior distributions.
**All kinds of analytics**
These design patterns are necessary because modern analytics makes it too easy to train models without deeply thinking about the underlying causal mechanisms.
With recent advances in data centers, computational power, and algorithms, it is easier than ever to train models on large datasets and make predictions. Given enough data, and with the help of almost off-the-shelf libraries, one can model data without having to think deeply about the problem.
This is more evident with nonparametric methods, where the problem formulation loses its appeal and the value of method-dependent metrics such as variable importance is easily overstated. But even with parametric methods, one can use models for decision making without checking the underlying assumptions.
Conceptual models and assumptions go hand in hand. For example, a regression model alone, with a statistically significant interaction between brightness and a weekday indicator, may lead to the erroneous conclusion that increasing the brightness of a retail store during the week will increase sales more than on weekends. If we rely on such a model alone and increase store brightness during the week, we may be making an intervention that is costly but not necessarily beneficial.
Would an algorithm like XGBoost solve this? No, because the missing piece is the conceptual model and underlying assumptions, not a better curve fitting method.
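To make the brightness example concrete, here is a minimal sketch in plain Python on simulated data. Everything in it is an assumption made for illustration: the helper names (`ols_slope`, `simulate_sales`), the sample sizes, and the made-up slopes of 3.0 on weekdays and 1.0 on weekends. It relies on the fact that, in a regression with brightness, a weekday indicator, and their product, the interaction coefficient equals the difference between the two group-specific slopes.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_sales(n, slope, rng):
    """Hypothetical store-days: sales = 50 + slope * brightness + noise."""
    brightness = [rng.uniform(0, 10) for _ in range(n)]
    sales = [50 + slope * b + rng.gauss(0, 5) for b in brightness]
    return brightness, sales


rng = random.Random(42)

# Assumed (made-up) slopes: 3.0 on weekdays, 1.0 on weekends.
weekday_b, weekday_s = simulate_sales(2000, 3.0, rng)
weekend_b, weekend_s = simulate_sales(2000, 1.0, rng)

weekday_slope = ols_slope(weekday_b, weekday_s)
weekend_slope = ols_slope(weekend_b, weekend_s)

# Equivalent to the interaction coefficient in the pooled model
# sales ~ brightness + weekday + brightness:weekday.
interaction = weekday_slope - weekend_slope

print(f"weekday slope: {weekday_slope:.2f}")
print(f"weekend slope: {weekend_slope:.2f}")
print(f"interaction:   {interaction:.2f}")
```

The estimated interaction should land close to the assumed difference of 2.0. Note that the fit only describes how sales and brightness move together in each group. Nothing in the code says how brightness came to vary, which is exactly the piece a conceptual model has to supply.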
**There are at least two distinct questions here:**
1\. What is the relationship between brightness and sales in a retail store on weekdays versus weekends?
*This question seeks an association and can be answered using a mere regression. The positive interaction term indicates that sales move more strongly with brightness on weekdays than on weekends (i.e., the estimated slope is steeper).*
2\. **Why** is the effect of brightness greater during the week than on the weekend?
*This is a causal problem and the regression model alone is not enough. It could be that shoppers are in a hurry during the week and tend to make more purchases under brighter lighting. It could also be that weekday shoppers visit stores after work and the effect of brightness is greater because it is dark outside. Another reason could be that weekday shoppers are older and need more lighting than weekend shoppers.*
Some of this is noise and some of it is true cause and effect. The regression (or XGBoost) model alone is not capable of eliminating the noise and identifying the true relationship. That is why it cannot answer the second question. However, it is possible to use the same regression or XGBoost model to answer the question if a causal design pattern is applied: a conceptual model and some assumptions along with the necessary data (either existing observational data or experimental data).
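The difference between observational and experimental data can be sketched with a small simulation. This is an illustration under stated assumptions, not a recipe from the book: the confounder (after-work foot traffic), the assumed true effect of 1.0, and all coefficients are hypothetical, and the weekday dimension is dropped to keep the example short. The same least-squares slope is computed twice, once on data where stores turn the lights up when traffic is high, and once on data where brightness is randomized.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(7)
n = 5000
TRUE_EFFECT = 1.0  # assumed causal effect of brightness on sales

# Hypothetical confounder: after-work foot traffic (standardized).
traffic = [rng.gauss(0, 1) for _ in range(n)]

# Observational data: brightness is raised when traffic is high,
# and traffic also raises sales on its own.
obs_brightness = [5 + 2 * t + rng.gauss(0, 1) for t in traffic]
obs_sales = [
    TRUE_EFFECT * b + 4 * t + rng.gauss(0, 1)
    for b, t in zip(obs_brightness, traffic)
]

# Experimental data: brightness is randomized, independent of traffic.
exp_brightness = [rng.uniform(0, 10) for _ in range(n)]
exp_sales = [
    TRUE_EFFECT * b + 4 * t + rng.gauss(0, 1)
    for b, t in zip(exp_brightness, traffic)
]

naive_slope = ols_slope(obs_brightness, obs_sales)
experimental_slope = ols_slope(exp_brightness, exp_sales)

print(f"naive slope (observational): {naive_slope:.2f}")
print(f"slope under randomization:   {experimental_slope:.2f}")
```

Under these made-up parameters, the naive slope is expected to sit well above the assumed effect (about 2.6 analytically), while the randomized version should recover a value near 1.0. The estimator is identical in both cases. What changes is the design: the assumption, encoded in how the data were generated, that brightness is unrelated to everything else that drives sales.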
**Got it, where can I start?**
[[Table of Contents]] would be a good place to start.
\---
*Causal Book is a work in progress. I would appreciate it if you contact me [here](https://www.linkedin.com/in/gtozer/) for anything related to this project (comment, typo, question, error, anything). G.T.*^[*Gorkem Turgut Ozer © 2026*]
> [!info]- Last updated: June 11, 2026