#### Replication of the naïve model
In what follows, we replicate the naïve model in [[Statistical modeling of RD]] using Python's *[[PyFixest]]* library.
```python
import pandas as pd
import pyfixest as pf
df = pd.read_csv('synthetic-loyalty-data-for-rd_v100.csv')
df.columns = df.columns.str.lower() # standardize column names
pf.feols("future_spending ~ status_platinum", data = df).summary()
```
The results are as follows:
```python
Estimation: OLS
Dep. var.: future_spending, Fixed effects: 0
Inference: iid
Observations: 50000
| Coefficient | Estimate | Std. Error | t value | Pr(>|t|) | 2.5% | 97.5% |
|:----------------|-----------:|-------------:|----------:|-----------:|---------:|---------:|
| Intercept | 8414.359 | 8.213 | 1024.471 | 0.000 | 8398.260 | 8430.457 |
| status_platinum | 2101.010 | 10.122 | 207.561 | 0.000 | 2081.170 | 2120.850 |
---
RMSE: 1073.419 R2: 0.463
```
The results are identical to the R model in [[Statistical modeling of RD]]: Platinum customers spend approximately \$2,101 more than non-Platinum customers in the following year on average. The standard errors also match because both use IID standard errors.
##### Replication of the RD model
We now replicate the (Sharp) RD estimation using the Python port of `rdrobust`:[^1]
```python
from rdrobust import rdrobust
rd = rdrobust(y = df['future_spending'], x = df['past_spending'], c = 10000)
print(rd)
```
The results are as follows:
```python
Call: rdrobust
Number of Observations: 50000
Polynomial Order Est. (p): 1
Polynomial Order Bias (q): 2
Kernel: Triangular
Bandwidth Selection: mserd
Var-Cov Estimator: NN
Left Right
------------------------------------------------
Number of Observations 17081 32919
Number of Unique Obs. 16816 32487
Number of Effective Obs. 6697 6838
Bandwidth Estimation 1849.278 1849.278
Bandwidth Bias 3385.648 3385.648
rho (h/b) 0.546 0.546
Method Coef. S.E. t-stat P>|t| 95% CI
-------------------------------------------------------------------------
Conventional 1567.178 38.508 40.697 0.000e+00 [1491.703, 1642.653]
Robust - - 36.225 2.452e-287 [1496.639, 1667.855]
```
The point estimate (\$1,567), bandwidth (\$1,849), and robust confidence interval all match the R results. This is expected: both the R and Python versions of `rdrobust` are maintained by the same authors and use the same underlying algorithms. Note that the Python implementation presents the output in a slightly different layout than the R version: the bandwidth selection details come first, followed by a two-row table showing the Conventional and Robust estimates side by side.
##### Replication of the diagnostic tests
**McCrary density test**
```python
from rddensity import rddensity
dens = rddensity(X = df['past_spending'], c = 10000)
print(dens.test)
```
The results are as follows:
```python
t_asy NaN
t_jk -0.957637
p_asy NaN
p_jk 0.338246
dtype: float64
```
The jackknife-based test statistic (`t_jk = -0.96`) and p-value (`p_jk = 0.34`) match the R result exactly. Unlike the R version, the default `print(dens)` in Python does not show the test result, so we access it through the `.test` attribute. The `_asy` entries are `NaN` here because the asymptotic variance is not computed by default (the default `vce='jackknife'`); they would populate if `vce='plugin'` were specified.
**Covariate balance**
```python
covariates = ['age_years', 'total_bookings', 'avg_booking_value', 'pct_international_trips']
for var in covariates:
rd_cov = rdrobust(y = df[var], x = df['past_spending'], c = 10000)
print(f"{var:30s} Estimate: {rd_cov.coef.iloc[0, 0]:8.3f} P-value: {rd_cov.pv.iloc[0, 0]:.3f}")
```
The results are as follows:
```python
age_years Estimate: 0.387 P-value: 0.326
total_bookings Estimate: 0.028 P-value: 0.582
avg_booking_value Estimate: -42.269 P-value: 0.223
pct_international_trips Estimate: -0.007 P-value: 0.255
```
All covariates are balanced at the cutoff, consistent with the R results.
##### RD plot
The Python `rdrobust` package also includes `rdplot` for visualization:
```python
from rdrobust import rdplot
rdplot(y = df['future_spending'], x = df['past_spending'], c = 10000,
x_label = "Past Spending ($)",
y_label = "Future Spending ($)",
title = "")
```
The resulting plot should be visually identical to the R version shown in [[Statistical modeling of RD]] except for the aesthetics.
> [!NOTE]
> The Python port of `rdrobust` uses the same bandwidth selection, kernel functions, and bias-correction procedures as the R version. Unlike the [[Replication of IV in Python|IV replication]], where differences in small sample corrections between `fixest` and `PyFixest` led to slight discrepancies in standard errors, the `rdrobust` outputs should be numerically identical across the two languages because the package does not involve the same clustering or degrees-of-freedom adjustments.
[^1]: The Python packages `rdrobust` and `rddensity` can be installed via `pip install rdrobust rddensity`.
> [!info]- Last updated: April 17, 2026