Correlation analysis is a statistical method for studying the relationship between two factors. It indicates whether variables in the same dataset may be connected. For example, in the advertising industry, we would expect a correlation between advertising spend and the ad impression rate. However, correlation does not imply causation: a correlation can be misleading and suggest a relationship between variables that does not really exist. To discover and understand the true causal relationships, use Causal Inference.

- **Correlation target**: The column chosen to be studied against the selected compared factors.
- **Compared factors**: Features for which the correlation with the target is computed.
- **De-correlation (optional)**: If set, the data are sampled to reduce the correlation between the target and the selected column. It is useful when you want to decouple specific columns from the chosen target.
- **Number of displayed factors**: Number of most strongly correlated factors to display.
- **Filters (optional)**: Conditions on columns used to filter the original dataset. If set, only the matching subset of the original data is used in the analysis.
- **Bar Values**: Whether values are shown on the bars. Changing this control takes effect instantly.

Imagine we own a bike rental shop. We have two years of bike demand data and would like to understand how changes in the weather affect rental demand. All weather values in the table are normalised.

An example of the dataset could be:

temp | hum | windspeed | casual | registered | cnt | dteday |
---|---|---|---|---|---|---|
0.344167 | 0.805833 | 0.160446 | 331 | 654 | 985 | 2011-01-01 |
0.363478 | 0.696087 | 0.248539 | 131 | 670 | 801 | 2011-01-02 |
0.196364 | 0.437273 | 0.248309 | 120 | 1229 | 1349 | 2011-01-03 |
0.2 | 0.590435 | 0.160296 | 108 | 1454 | 1562 | 2011-01-04 |
0.226957 | 0.436957 | 0.1869 | 82 | 1518 | 1600 | 2011-01-05 |
0.204348 | 0.518261 | 0.0895652 | 88 | 1518 | 1606 | 2011-01-06 |
0.196522 | 0.498696 | 0.168726 | 148 | 1362 | 1510 | 2011-01-07 |
0.165 | 0.535833 | 0.266804 | 68 | 891 | 959 | 2011-01-08 |
0.138333 | 0.434167 | 0.36195 | 54 | 768 | 822 | 2011-01-09 |

We set our parameters as follows, where `cnt` stands for the rental demand count.

The result view contains a **Chart** tab, a **Data** tab and a **Table** tab.

The **Chart** tab provides an overview chart showing the correlation between the *Compared factors* and the *Correlation target*. As we can tell, `temp` has a positive correlation with `cnt`, while `windspeed` and `hum` (humidity) have negative correlations with `cnt`.
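To make the overview concrete, here is a minimal sketch of how factors could be ranked by their correlation with the target. It is not Actable AI's implementation: the helper names `average_ranks`, `spearman` and `top_factors` are hypothetical, and the data are only the nine sample rows from the table above, so the values will not match the chart computed on the full two-year dataset.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Compared factors: the nine sample rows shown in the example table.
factors = {
    "temp": [0.344167, 0.363478, 0.196364, 0.2, 0.226957,
             0.204348, 0.196522, 0.165, 0.138333],
    "hum": [0.805833, 0.696087, 0.437273, 0.590435, 0.436957,
            0.518261, 0.498696, 0.535833, 0.434167],
    "windspeed": [0.160446, 0.248539, 0.248309, 0.160296, 0.1869,
                  0.0895652, 0.168726, 0.266804, 0.36195],
}
# Correlation target: the rental demand count.
cnt = [985, 801, 1349, 1562, 1600, 1606, 1510, 959, 822]


def top_factors(factors, target, n):
    """Return the n factors with the largest absolute correlation."""
    scores = {name: spearman(col, target) for name, col in factors.items()}
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


for name, rho in top_factors(factors, cnt, n=3):
    print(f"{name}: {rho:+.3f}")
```

Sorting by absolute value mirrors the idea behind *Number of displayed factors*: a strongly negative factor is as informative as a strongly positive one.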

Actable AI also gives a breakdown view for each compared factor. For example, the following graph describes the correlation between `temp` and `cnt`. Most of the data points fall within the range covered by the regression fit, and the correlation coefficient is 0.622.

Actable AI uses the Spearman correlation coefficient to indicate how strong the correlation is. In a nutshell, the coefficient ranges from -1 to 1: the sign gives the direction of the relationship, and the absolute value, between 0 and 1, gives its strength:

- 0.00-0.19 “very weak”
- 0.20-0.39 “weak”
- 0.40-0.59 “moderate”
- 0.60-0.79 “strong”
- 0.80-1.00 “very strong”
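The bands above can be written as a small lookup. This is an illustrative sketch only: `describe` is a hypothetical helper, and it assumes the bands apply to the absolute value of the coefficient, with each band ending just below the start of the next one.

```python
def describe(rho):
    """Map a correlation coefficient to a (strength, direction) pair."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(rho)
    # Upper bound (exclusive) of each band, in increasing order.
    bands = [(0.20, "very weak"), (0.40, "weak"),
             (0.60, "moderate"), (0.80, "strong")]
    strength = next(
        (label for upper, label in bands if magnitude < upper),
        "very strong",
    )
    if rho > 0:
        direction = "positive"
    elif rho < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(describe(0.622))   # ('strong', 'positive')
print(describe(-0.25))   # ('weak', 'negative')
```

With the coefficient of 0.622 reported for `temp` and `cnt`, the lookup gives a strong positive correlation.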

The **Data** tab provides the Spearman’s rank correlation coefficient and the p-value for the correlation analysis. The p-value is the probability of observing a correlation at least as strong as the measured one if the null hypothesis were true. It ranges from 0 (0%) to 1 (100%). Here, the null hypothesis is that there is no correlation between the factor and the target.

- A value close to 1 means the data provide no evidence of a correlation.
- A value close to 0 means the observed correlation is very unlikely to be due to chance alone. It indicates that the correlation is statistically significant, not necessarily that it is strong.
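One way to build intuition for the p-value is a permutation test: shuffle the target so that any real relationship is destroyed, and count how often a shuffled dataset shows a correlation at least as strong as the observed one. The sketch below uses this technique for illustration only. How Actable AI computes its p-value is not stated here, and `permutation_p_value` along with its parameters `n_permutations` and `seed` are hypothetical names.

```python
import random


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    hits = 0
    for _ in range(n_permutations):
        # Shuffling breaks any link between x and y (the null hypothesis).
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_permutations + 1)


# hum and cnt from the nine sample rows in the example table.
hum = [0.805833, 0.696087, 0.437273, 0.590435, 0.436957,
       0.518261, 0.498696, 0.535833, 0.434167]
cnt = [985, 801, 1349, 1562, 1600, 1606, 1510, 959, 822]

print(f"p-value: {permutation_p_value(hum, cnt):.3f}")
```

With only nine rows the p-value is a rough estimate. On a small sample, even a sizeable coefficient can fail to reach significance.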

The **Table** tab displays the original dataset.