Confounding by intention - Causal Book: Design Patterns in Causal Inference

[[Confounding]] by intention can be defined at the individual level for problems involving consumer decision making for the purchase of goods and services. If a consumer already intended to buy Product B while shopping for Product A, how can one conclude that an ad for Product B on the product page of Product A increases the sales of Product B? This is a typical problem in measuring the causal impact of [[Recommender system|recommender systems]]. If the consumer would have purchased Product B regardless of whether an ad was displayed on Product A's page, then the effect of the ad is confounded by the consumer's intent.[^1]

What would be a case in which we might observe such confounding? Marketing provides an excellent and very common case study. It is called the multi-touch or multi-channel [[Attribution problem|attribution]] problem: the consumer sees a TV ad and forms the intention to buy. Between intent and purchase, the consumer is also exposed to other forms of the ad (e.g., radio, smartphone). In this case, it is difficult to attribute the purchase to the (final) mobile ad.

![[Pasted image 20210922101002.png]]

For example, suppose a fashion retailer sells pink shirts and navy blazers. If the pink shirt is recommended on the product page of the navy blazer, does that mean that an increase in sales of pink shirts is attributable to the retailer's merchandising efforts through the recommender system? Not necessarily. What if the customer who visited the site already intended to buy both items? What if the number of customers who want to buy both items suddenly increases because their favorite actor wore a pink shirt and a navy blazer to the Oscars? Such customers would search for the pink shirt even if it was not conveniently recommended and linked on the navy blazer's product page. In such a case, the observed positive association cannot be attributed to the recommender system, as it is the result of an external shock (the confounder!).
In this case, a naive estimate of the effect size would likely be an overestimate (positive [[bias]]). What can be done in the absence of a randomized experiment? The *truer* effect of the recommender system can potentially be isolated using an instrumental variable. Here, a good instrument would affect sales of navy blazers (hence the views of the recommendation) but not sales of pink shirts, except through that recommendation.

Let's look at the problem graphically. The following DAG ([[Directed acyclic graph]]) shows the scenario explained above. The link between the intention and the sales of the two products is shown as the edges from I (Intention) to $S_a$ (sales of Product A) and $S_b$ (sales of Product B).

![[Pasted image 20211018105027.png]]

Consumer intent and the associated demand cannot be measured, so their circles are unshaded. Demand for Product A and Product B is a direct result of intention. Sales, on the other hand, are measured for both products and are shaded gray accordingly ($S_a$ and $S_b$ in the DAG above).

Now suppose that a recommendation for Product B is placed on Product A's page, so that the sales of Product B are a combination of sales through the recommendation and sales from the consumer's inherent demand.
The sales of Product A lead to some of the sales of Product B through the recommendation, so an arrow now connects $S_a$ and $Sb_r$ (sales due to the recommendation). For the sake of simplicity, ad clicks are excluded as a mediating variable.

![[Pasted image 20211018102049.png]]

Our goal is to know whether $Sb_r$ is a significant enough contributor to the sales of Product B. The problem with isolating this effect is that $Sb_r$ and $Sb_d$ (direct sales) are correlated through the confounding effect of latent demand. Consumers may have clicked on the recommendation because they wanted Product B anyway. In other words, even if the recommendation had not been on Product A's page, these consumers would have purchased Product B.

One solution to such a confounding problem is to use an exogenous shock as an [[Design Pattern I - Instrumental Variable (IV)|Instrumental Variable (IV)]]. The shock can help isolate the truer causal effect only if it affects the sales of Product A but not those of Product B, other than through Product A's page.

![[Pasted image 20211018102358.png]]

For example, this exogenous shock may be a spike in views of Product A due to promotional coverage by an influencer, where the coverage does not mention Product B at all. The exogenous shock helps to measure the truer causal effect because it allows the coefficient of the relationship between sales of Product A and sales of Product B through recommendations to be identified. This is because:

![[Pasted image 20211108141122.png]]

Assuming the relationships are linear, we can formulate from the DAG that:

$S_a = \alpha E + U$

$Sb_r = \beta S_a + U$

Therefore, with $E$ uncorrelated with $U$ and scaled to unit variance:

$cov(E, S_a) = \alpha$

$cov(E, Sb_r) = \alpha \cdot \beta$

Solving for $\beta$:

$\beta = \dfrac{cov(E, Sb_r)}{cov(E, S_a)}$

Voila! We have an estimate of $\beta$, which is the causal effect of the recommendation on Product A's page on Product B's sales, reasonably isolated from the correlation between $S_a$ and $Sb_r$ through the demand for both products.
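The covariance ratio above can be checked on simulated data. The following is a minimal sketch, not the book's own code: it draws data from the two linear equations with assumed values $\alpha = 1$ and $\beta = 0.3$, treats $E$ and $U$ as independent standard normal variables, and compares the naive regression slope with the ratio $cov(E, Sb_r) / cov(E, S_a)$. The function names and the `gamma` parameter (a hypothetical direct effect of the shock on Product B, zero by default) are illustrative assumptions.

```python
import numpy as np


def simulate_sales(n=200_000, alpha=1.0, beta=0.3, gamma=0.0, seed=0):
    """Draw (E, Sa, Sb_r) from the linear model in the text.

    All parameter values are assumptions chosen for illustration.
    gamma is a hypothetical direct effect of the shock on Product B;
    gamma = 0 means the shock reaches Product B only through Product A.
    """
    rng = np.random.default_rng(seed)
    E = rng.normal(size=n)  # exogenous shock, unit variance
    U = rng.normal(size=n)  # latent demand, unobserved in practice
    Sa = alpha * E + U
    Sb_r = beta * Sa + U + gamma * E
    return E, Sa, Sb_r


def cov(x, y):
    """Sample covariance between two 1-D arrays."""
    return np.cov(x, y)[0, 1]


def naive_and_iv(E, Sa, Sb_r):
    """Return (naive slope of Sb_r on Sa, covariance-ratio IV estimate)."""
    naive = cov(Sa, Sb_r) / cov(Sa, Sa)  # confounded by U
    iv = cov(E, Sb_r) / cov(E, Sa)       # ratio derived in the text
    return naive, iv


naive, iv = naive_and_iv(*simulate_sales())
print(f"true beta = 0.30, naive = {naive:.2f}, IV = {iv:.2f}")
```

Under these assumptions the naive slope converges to $\beta + \frac{1}{\alpha^2 + 1} = 0.8$, an overestimate, while the ratio converges to $\beta = 0.3$. Passing a nonzero `gamma` moves the ratio to roughly $\beta + \gamma / \alpha$, so the estimate is only as good as the assumption that the shock does not reach Product B directly.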
**Why might this not work?**

If the shock also affects sales of Product B directly (by creating demand for both products), as shown below, it no longer serves the purpose of isolating the causal effect. This is a case where the exogenous shock causes a consumer to buy not only Product A but also Product B, not because of the recommendation on Product A's page but directly because of the shock. The DAG below illustrates this additional effect with the colored arrow:

![[Pasted image 20211018123446.png]]

This can happen when Product A and Product B are highly complementary: the consumer may buy Product B anyway. For example, if the promotional coverage is for a smartphone, consumers may also think of buying a case for the phone. In other words, the consumer may buy the case not because it is recommended on the smartphone's product page but because the idea of buying the smartphone fuels the intention to buy the case as well.

In this case, the identification is not as clean because the relationship between the shock and the sales of Product B through the recommendation has one more path to close, which changes the covariance between the two from what was calculated earlier. The following DAG reproduces the case with the relationships identified:

![[Pasted image 20211108151748.png]]

In this case, the exclusion restriction, a requirement that the exogenous shock be absent from the function defining the outcome, is violated, as seen below:

$Sb_r = \beta S_a + \sigma_2 (\gamma E + U) + ...$

In other words, the shock does not help us separate the increased demand for the case due to an ad on the smartphone's product page from the customer's inherent intention to buy the case.

Our discussion here can be extended almost indefinitely to many potential DAGs based on the problem at hand, so this is a good time to stop and get your hands dirty with data in [[Data and conceptual model]].

[^1]: See [[Impact of recommendations on Amazon]] for a case study on the problem.
> [!info]- Last updated: September 3, 2024