
Multiple Treatments with Econml #930

Open
turankeles opened this issue Nov 25, 2024 · 8 comments

@turankeles

Hi,
I greatly enjoy the EconML library. However, there is an issue with multiple treatments that I could not figure out, and I would really appreciate your help.
Here is a brief summary of my problem:

I have two binary treatment columns (email_campaign, social_media_ad), an X variable, and a binary outcome. I ran CausalForestDML once with a combined treatment and also ran a separate CausalForestDML for each treatment. Why do I get different ATE results? When running with multiple treatments, if I set T0=0, T1=1 (or T1=2), why is the ATE different from running a separate model with only social_media_ad (or email_campaign) as the treatment? The combined treatment column is 0 when email_campaign and social_media_ad are both 0, 1 when social_media_ad is 1 and email_campaign is 0, 2 when email_campaign is 1 and social_media_ad is 0, and 3 when both are 1. A sample of the data is:

[screenshot: sample of the data]

import pandas as pd
import numpy as np
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
np.random.seed(123)

# Sample data (replace with your actual data)
data = pd.DataFrame({
    'Customer ID': range(1, 1001),
    'Sales': np.random.randint(0, 1000, 1000),
    'churn': np.random.randint(0, 2, 1000),
    'Email Campaign': np.random.randint(0, 2, 1000),
    'Social Media Ad': np.random.randint(0, 2, 1000)
})

# Create the combined treatment variable
data['Combined Treatment'] = data['Email Campaign'] * 2 + data['Social Media Ad']
data.columns = data.columns.str.lower().str.replace(' ', '_')

# Define features and target variable
X = data[['sales']]
T = data['combined_treatment']
Y = data['churn']

# Initialize the CausalForestDML model
est = CausalForestDML(
    model_t=RandomForestClassifier(random_state=123),
    model_y=RandomForestRegressor(random_state=123),
    discrete_treatment=True,
    random_state=123
)

# Fit the model
model_est = est.fit(Y, T, X=X)

The ATE result for each treatment:
est.ate(X, T0=0, T1=1) --> -0.0016  (social_media_ad, combined_treatment == 1)
est.ate(X, T0=0, T1=2) --> -0.033   (email_campaign, combined_treatment == 2)
est.ate(X, T0=1, T1=2) --> -0.032
est.ate(X, T0=0, T1=3) --> -0.051

Email:
est_mail = CausalForestDML(
    model_t=RandomForestClassifier(random_state=123),
    model_y=RandomForestRegressor(random_state=123),
    discrete_treatment=True,
    random_state=123
)

est_mail.fit(Y, data["email_campaign"], X=X)
est_mail.ate(X) --> -0.019

In the example above, T0=0, T1=2 corresponds to the email_campaign treatment. My question is: why does the multiple-treatment model yield different results from the separate single-treatment models, and how should the multiple-treatments approach be used in EconML?
Social media ad:
est_social_media_ad = CausalForestDML(
    model_t=RandomForestClassifier(random_state=123),
    model_y=RandomForestRegressor(random_state=123),
    discrete_treatment=True,
    random_state=123
)

est_social_media_ad.fit(Y, data["social_media_ad"], X=X)
est_social_media_ad.ate(X) --> 0.010

In the example above, T0=0, T1=1 corresponds to the social_media_ad treatment. The result from the multiple-treatment model is negative, but the result from the single-treatment model is positive. Why?

Notes:
1. I even get contrasting (negative vs. positive) results when running on different datasets.
2. I get inconsistent results even when the two treatment variables are completely independent, i.e. when each customer receives only one treatment.

Best

@kbattocchi
Collaborator

At least with the sample data in your example, the confidence intervals are pretty wide (e.g. (-0.37, 0.37) for est.ate_interval(X, T0=0, T1=1)), so the point estimates from each estimator are well within the confidence intervals of the other; I wouldn't worry about it.

It's not surprising that the point estimates aren't exactly the same: we stratify on treatment when creating samples for cross-fitting, so the estimators aren't seeing exactly the same samples, and the treatment models will behave slightly differently since they're predicting different things (email vs. not email in one case, as opposed to distinguishing between all of None, Email, Social, Both in the other).
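
For example, a minimal sketch of this kind of check, reusing the est and est_mail estimators fitted above (illustrative only; the values it prints are not output quoted in this thread):

lb, ub = est.ate_interval(X, T0=0, T1=2)   # CI for the email contrast in the combined-treatment model
ate_single = est_mail.ate(X)               # point estimate from the email-only model
print(f"combined-model CI: ({lb:.3f}, {ub:.3f}); single-model ATE: {ate_single:.3f}")
print("single-model estimate inside combined-model CI:", lb <= ate_single <= ub)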

@fhz-3722

Hi, I guess this example will answer your question https://github.com/py-why/EconML/blob/main/notebooks/Double%20Machine%20Learning%20Examples.ipynb

@turankeles
Author

Thank you very much!
I have a few other questions, though:
1. For a binary (discrete) treatment and a binary outcome, should model_t and model_y both be classifiers?
2. Does a negative ATE mean the treatment decreases churn, and a positive ATE that it increases churn?
3. How should the ATE be interpreted? As the probability of churn?

@turankeles
Author

> Hi, I guess this example will answer your question https://github.com/py-why/EconML/blob/main/notebooks/Double%20Machine%20Learning%20Examples.ipynb

Thanks!

@kbattocchi
Collaborator

> Thank you very much!
> I have a few other questions, though:
> 1. For a binary (discrete) treatment and a binary outcome, should model_t and model_y both be classifiers?
> 2. Does a negative ATE mean the treatment decreases churn, and a positive ATE that it increases churn?
> 3. How should the ATE be interpreted? As the probability of churn?

1. Yes, pass discrete_treatment=True and discrete_outcome=True and then use classifiers for both models (a minimal sketch follows below).
2. A negative ATE would mean that on average the treatment decreases the likelihood of the 'high' outcome. If your outcome is churn, then yes, a negative ATE would mean that it decreases churn.
3. The ATE is the average change in the probability of the outcome if the treatment goes from 0 to 1.
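
A minimal sketch of the setup from point 1, reusing Y, T, and X from the example above (this assumes an EconML version recent enough that CausalForestDML accepts discrete_outcome):

from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier

est_bin = CausalForestDML(
    model_t=RandomForestClassifier(random_state=123),  # classifier for the discrete treatment
    model_y=RandomForestClassifier(random_state=123),  # classifier for the binary outcome
    discrete_treatment=True,
    discrete_outcome=True,
    random_state=123
)
est_bin.fit(Y, T, X=X)
# ATE of moving from control (0) to email-only (2), as an average change in P(churn)
print(est_bin.ate(X, T0=0, T1=2))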

@turankeles
Author

Following the previous questions: I have encoded four treatments into one column, combined_treatment, whose values range from 0 to 15. I am running CausalForestDML with XGBClassifier, as shown below. However, some of the point estimates are greater than 1 or less than -1, and I get similar results for all of the treatment interactions. If the output of this CausalForestDML model is the probability of the outcome (churn, binary), why do I get point estimates greater than 1 or less than -1?
Switching from XGBClassifier to other algorithms such as RandomForestClassifier reduces the number of point estimates outside the (-1, 1) range, but some remain.

model = CausalForestDML(
    model_t=XGBClassifier(),
    model_y=XGBClassifier(),
    discrete_treatment=True
)

est_model = model.dowhy.fit(Y, combined_treatment, X=X, W=W)

output1 = est_model.effect_inference(X_test, T0=0, T1=1)
output2 = est_model.effect_inference(X_test, T0=0, T1=2)
output3 = est_model.effect_inference(X_test, T0=0, T1=3)
...
output15 = est_model.effect_inference(X_test, T0=0, T1=15)

All of these outputs include some point estimates outside the (-1, 1) range.
If the results are probabilities of the outcome, how can they be interpreted or justified? If they are not probabilities of the outcome, how should they be interpreted?

This is the output for T0=0, T1=12:
[screenshot: effect_inference output for T0=0, T1=12]

Really appreciate your input!

@turankeles
Author

Setting discrete_treatment=False does not help either.
CausalForestDML and LinearDML do not have discrete_outcome, so I cannot set discrete_outcome=True.

I tried the wrapper class here, #334 (comment), but it doesn't change the results either.

@kbattocchi
Collaborator

I'm a bit confused by your last statement - both CausalForestDML and LinearDML do have discrete_outcome arguments to their initializers (and as a side note, if your treatment is discrete you might want to use the DRLearner subclasses instead of DML ones anyway, though this same issue can also happen there).

The basic issue that can cause this type of result is just a kind of extrapolation. Imagine a setting where there's a binary treatment and we've learned first stage models where P(treatment=1) = 0.4 and P(outcome=1) = 0.2 for a given set of characteristics (e.g. for some rare combination of Xs). Then imagine that when we're training our final model, we have only one data point with this set of Xs, and it has treatment=1, outcome=1. Then the "surprise" portion of the outcome is 1-0.2=0.8, and the "surprise" portion of the treatment is 1-0.4=0.6, so the resulting treatment effect we'd calculate for this one-element subset would be 0.8/0.6 > 1.
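
A toy illustration of that arithmetic (plain Python, not EconML code; the numbers are the ones from the paragraph above):

p_treatment = 0.4    # first-stage prediction P(treatment=1 | X) for this rare X cell
p_outcome = 0.2      # first-stage prediction P(outcome=1 | X) for the same cell
t_obs, y_obs = 1, 1  # the single observed data point in the cell

t_resid = t_obs - p_treatment  # 0.6, the "surprise" in the treatment
y_resid = y_obs - p_outcome    # 0.8, the "surprise" in the outcome

effect = y_resid / t_resid     # residual-on-residual slope for this one-element subset
print(effect)                  # 1.33..., outside [-1, 1] even though the outcome is binary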

As your sample size increases, this problem should become more and more rare (assuming your first stage models get arbitrarily accurate) - as the distribution of observed (treatment, outcome) pairs approaches the true density, it becomes mathematically guaranteed that the computed effect will be in [-1,1].
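
Regarding the side note above about the DRLearner subclasses, here is a minimal sketch of that alternative, reusing Y, T, and X from the earlier example (the first-stage model choices are illustrative assumptions, not a recommendation from this thread):

from econml.dr import LinearDRLearner
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

dr_est = LinearDRLearner(
    model_propensity=RandomForestClassifier(random_state=123),  # models P(T | X, W)
    model_regression=RandomForestRegressor(random_state=123),   # models E[Y | T, X, W]
    random_state=123
)
dr_est.fit(Y, T, X=X)
print(dr_est.ate(X, T0=0, T1=2))  # email-only vs. control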
