Causal Inference

Introduction

Causal inference is an analysis that uncovers the causal effect of an action on an outcome. Its goal is to determine whether a change in the treatment variables actually causes a change in the outcome variables. Traditionally, the standard approach to finding causal effects is the randomized controlled trial (RCT, also known as A/B testing). However, RCTs are often expensive, time-consuming, or even unethical. Our Causal Inference analysis uses the latest techniques from Causal AI to uncover causal effects from observational data, which does not have to be randomized. Running it is as simple as training an ML model, with no programming required.

To avoid bias from spurious correlations, the causal analysis algorithm first removes the influence of the common causes from both the treatment and the outcome. It then estimates the association between what remains of the treatment and what remains of the outcome.
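
The two steps above can be sketched as a residual-on-residual regression (often called partialling out). This is only an illustration on synthetic data with a single common cause and plain least squares; it is not necessarily the estimator the product uses, and every name and number in it is an assumption made for the example.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def residuals(x, y):
    """The part of y that is not explained by x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Synthetic data (assumption): the common cause drives both treatment and outcome.
random.seed(0)
n = 5000
true_effect = 2.0
common_cause = [random.gauss(0, 1) for _ in range(n)]
treatment = [1.5 * c + random.gauss(0, 1) for c in common_cause]
outcome = [true_effect * t + 3.0 * c + random.gauss(0, 1)
           for t, c in zip(treatment, common_cause)]

# Naive association: biased, because the common cause is ignored.
_, naive_slope = fit_line(treatment, outcome)

# Step 1: remove the common cause from both treatment and outcome.
treatment_res = residuals(common_cause, treatment)
outcome_res = residuals(common_cause, outcome)

# Step 2: associate the two residuals.
_, adjusted_slope = fit_line(treatment_res, outcome_res)

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f} (simulated true effect: {true_effect})")
```

In this simulation the naive slope overstates the effect because the common cause pushes treatment and outcome in the same direction, while the adjusted slope lands close to the simulated value.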

To run a Causal Inference analysis, you need a dataset in which each row contains values of the treatment, the outcome, and all common causes.

Parameters

  • Treatment: The variable that causes the effect of interest. It can be numeric, boolean, or categorical. Actable AI groups the outcome values by treatment and compares the groups.
  • Outcome: The variable on which we want to measure the causal effect of the treatment. It can be numeric, boolean, or categorical.
  • Common Causes (optional): Also known as confounders, these are variables that can affect both the treatment and the outcome. The choice of common causes matters for the results, so please include the columns that have an effect on both Outcome and Treatment.
  • Effect Modifiers (optional): A special kind of common cause variable that affects the estimated causal effects. If selected, the causal effects are estimated at the observed values of this variable. It can be numeric or categorical.
  • Logarithmic treatment: If checked, a logarithmic transformation is applied to the treatment. This is helpful when one wants to analyze the causal effect of a percentage change in treatment instead of an absolute change.
  • Logarithmic outcome: If checked, a logarithmic transformation is applied to the outcome. This is helpful when one wants to measure the causal effect as a percentage change in outcome instead of an absolute change.
  • Filters (optional): Set conditions on columns to filter the original dataset. If selected, only the matching subset of the original data is used in the analysis.
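
The effect of the two logarithmic options can be illustrated with a small sketch. The data below is synthetic and noise-free (an assumption made so the result is easy to verify): the outcome is proportional to the square root of the treatment. On the raw scale the slope depends on the units and on where it is measured, but after taking logs of both variables the slope reads directly as "percentage change in outcome per percentage change in treatment".

```python
import math

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Synthetic relationship (assumption): outcome = 3 * treatment ** 0.5
treatment = [float(v) for v in range(1, 101)]
outcome = [3.0 * t ** 0.5 for t in treatment]

# "Logarithmic treatment" and "Logarithmic outcome" both checked.
log_treatment = [math.log(t) for t in treatment]
log_outcome = [math.log(y) for y in outcome]

raw_slope = slope(treatment, outcome)             # absolute change per unit
elasticity = slope(log_treatment, log_outcome)    # % change per % change

print(f"raw slope:     {raw_slope:.3f}")
print(f"log-log slope: {elasticity:.3f}")
```

Here the log-log slope recovers the exponent 0.5, meaning a 1% increase in treatment goes with roughly a 0.5% increase in outcome.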

Result View

  • Treatment effect: Shows how the treatment affects the outcome, with the values grouped by effect modifier if one is chosen. The Case Study explains how to interpret this graph.
  • Causal Graph: A directed acyclic graph (DAG) that visualises the user’s selection of treatment, outcome, common cause, and effect modifier variables. Each edge represents a causal relationship between two variables. Blue stands for the effect modifier, yellow for the treatment, white for the common causes, and red for the outcome.
  • Table: Displays the original dataset.
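
As a rough sketch of what the Causal Graph encodes, the snippet below builds an edge list and a colour map from a set of selections. The edge layout is an assumption (common causes point to both treatment and outcome, effect modifiers point to the outcome, and the treatment points to the outcome); the product may draw the graph differently. The function name is hypothetical, and the column names are borrowed from the Case Study dataset.

```python
def build_causal_graph(treatment, outcome, common_causes=(), effect_modifiers=()):
    """Return (edges, colours) for the selected variables.

    Assumed edge layout, for illustration only:
      treatment -> outcome
      common cause -> treatment and common cause -> outcome
      effect modifier -> outcome
    """
    edges = [(treatment, outcome)]
    for cause in common_causes:
        edges.append((cause, treatment))
        edges.append((cause, outcome))
    for modifier in effect_modifiers:
        edges.append((modifier, outcome))

    # Colour scheme as described in the Result View list.
    colours = {treatment: "yellow", outcome: "red"}
    colours.update({cause: "white" for cause in common_causes})
    colours.update({modifier: "blue" for modifier in effect_modifiers})
    return edges, colours

edges, colours = build_causal_graph(
    treatment="location",
    outcome="rental_price",
    common_causes=["number_of_rooms", "number_of_bathrooms"],
    effect_modifiers=["sqft"],
)

for source, target in edges:
    print(f"{source} ({colours[source]}) -> {target} ({colours[target]})")
```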

Case Study

Imagine we are a real estate agent who would like to put more properties on the market. We’d like to understand which factors affect our rental price and how they affect it.

An example of the dataset could be:

number_of_rooms  number_of_bathrooms  sqft   location  days_on_market  initial_price  neighborhood  rental_price
0                1                    484,8  great     10              2271           south_side    2271
1                1                    674    good      1               2167           downtown      2167
1                1                    554    poor      19              1883           westbrae      1883
0                1                    529    great     3               2431           south_side    2431
3                2                    1219   great     3               5510           south_side    5510
1                1                    398    great     11              2272           south_side    2272
3                2                    1190   poor      58              4463           westbrae      4123.812

Causal inference analysis can describe how the outcome changes in response to a change in a continuous value or a change in category. The examples below show how to interpret the result graphs.

Categorical

As experienced real estate agents, we know that the location of a property affects the rental price. In the example dataset, location is categorical, with the values poor, good, and great. We also believe that the room counts (number_of_rooms, number_of_bathrooms) and the property size (sqft) might affect both rental price and location, while days_on_market and initial_price do not. The question now is: how do the different location tags raise or lower the final rental price?

We would set up the analysis as follows:

_images/category_setup_treatment_control_good.png

Review Result

The result is shown below:

_images/category_treatment_effect_plot_treatment_control_good.png

Contrary to what we might expect, a better location does not necessarily lead to a higher rental price. The resulting graph tells us that, compared with properties categorised as being in a good location, properties in both great and poor locations are less competitive on rental price: properties in a great location show an average price difference of -193.24 (+/-18.35), and properties in a poor location show an average price difference of -386.48 (+/-36.69).
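
To make the comparison concrete, the reported estimates can be turned into ranges. The snippet below only does arithmetic on the numbers quoted above. Treating the +/- value as a symmetric margin around the estimate is an assumption, since the text does not say whether it is a standard error or a confidence half-width.

```python
# Estimated effect on rental price relative to the "good" location,
# as (estimate, +/- margin), copied from the result discussed above.
estimates = {
    "great": (-193.24, 18.35),
    "poor": (-386.48, 36.69),
}

ranges = {}
for location, (effect, margin) in estimates.items():
    low = round(effect - margin, 2)
    high = round(effect + margin, 2)
    ranges[location] = (low, high)
    print(f"{location} vs good: {effect:+.2f} (range {low:+.2f} to {high:+.2f})")

# A range that excludes zero suggests the difference from "good" is not just noise.
excludes_zero = {loc: not (low <= 0 <= high) for loc, (low, high) in ranges.items()}
print(excludes_zero)
```

Under that reading, both ranges lie entirely below zero and do not overlap each other, which is consistent with the ordering good, then great, then poor.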

Numerical

We now understand how location affects the final rental price, but we are also curious how the price changes as the property size (sqft) grows in different locations. To find out, we need to use Effect Modifiers.

We would set up the analysis as follows:

_images/numeric_setup.png

Review Result

The result shows a line graph at the top and a binary tree at the bottom.

The line graph tells us how much the outcome changes on average when the treatment value changes by one unit. In this example, compared with properties in a great location, properties in good locations charge more. As the property size increases, the overall price delta (the rental price for a great location compared with the price for a good or poor location) increases, but the change is not pronounced. The graph also shows that the rental price fluctuates more for larger properties. At some points, a property in a poor location could even charge more than a property in a good location.

_images/numeric_treatment_effect_plot.png

The bottom graph shows how our data is distributed. Each node contains the following information:

  • Condition boundary: The value of the selected numeric effect modifier at which the data is split. The node has two children: the left child holds the samples that meet the condition and the right child holds those that do not.
  • Sample: The number of samples in this segment. It helps show how the data is distributed as the effect modifier changes.
  • CATE: Stands for conditional average treatment effect. A CATE is an average treatment effect specific to a subgroup of subjects, where the subgroup is defined by the subjects’ attributes or by attributes of the context in which the experiment occurs. The mean and std (standard deviation) describe the subgroup via the coefficient between effect modifier and outcome. Comparing the mean values shows how strongly the effect modifier influences the outcome, and the std indicates how stable the estimate is across the samples in the subgroup.
_images/numeric_decision_tree.png
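
The contents of a single tree node can be sketched as follows. The per-row effect estimates, the boundary of 800 sqft, and the use of "less than or equal to" as the split condition are all hypothetical values chosen for illustration; they are not taken from the analysis shown in the figure.

```python
from statistics import mean, stdev

def summarise_split(rows, boundary):
    """Split (effect_modifier_value, estimated_effect) pairs at a boundary.

    Returns summaries for the left child (condition met) and the
    right child (condition not met).
    """
    left = [effect for value, effect in rows if value <= boundary]
    right = [effect for value, effect in rows if value > boundary]

    def describe(effects):
        return {
            "samples": len(effects),
            "cate_mean": mean(effects),
            "cate_std": stdev(effects),
        }

    return describe(left), describe(right)

# Hypothetical (sqft, estimated effect on rental price) pairs.
rows = [
    (400, -150.0), (500, -170.0), (550, -160.0),
    (1100, -300.0), (1200, -340.0), (1250, -320.0),
]

left, right = summarise_split(rows, boundary=800)
print("sqft <= 800:", left)
print("sqft >  800:", right)
```

In this made-up example the larger properties have both a larger average effect and a larger spread, which is the kind of pattern the mean and std in each node are meant to expose.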