Similar to Regression, Bayesian Linear Regression predicts a continuous value from other variables, and additionally provides an interpretation of the chosen variables.

Actable AI uses the entire table as the source data and automatically splits it into three parts:

- **Train data**: rows in the table where both the predictors and the target are filled.
- **Prediction data**: rows where the target column is missing.
- **Validation data**: Actable AI samples a part of the data to verify the reliability of the trained model. This part of the data is also used in the performance-tuning stage if performance optimisation is selected.
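The three-way split above can be sketched with pandas. This is an illustration of the idea only, not Actable AI's internal implementation; the function name and the `validation_pct` parameter are hypothetical:

```python
# Hypothetical sketch of the train/prediction/validation split described above.
import pandas as pd

def split_table(df: pd.DataFrame, target: str, validation_pct: float = 0.2):
    """Split a table into train, validation, and prediction parts."""
    labelled = df[df[target].notna()]       # rows with a known target
    prediction = df[df[target].isna()]      # rows whose target is missing
    # Sample a fraction of the labelled rows for validation.
    validation = labelled.sample(frac=validation_pct, random_state=0)
    train = labelled.drop(validation.index)
    return train, validation, prediction
```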

- **Predicted target**: The column whose missing values are to be predicted.
- **Predictors**: Columns that are used to predict the predicted target.
- **Validation percentage**: The model is evaluated at the end on the validation dataset. By sliding this value, one can control the percentage of rows with a non-empty predicted target that is used for validation.
- **Polynomial degree**: Calculates exponential and cross-intersection values for numeric variables; these values are used as additional inputs.
- **Quantile low**: Quantiles divide the prediction result range into continuous intervals with equal probabilities. If set, a lower bound with the chosen confidence is returned.
- **Quantile high**: If set, an upper bound with the chosen confidence is returned.
- **Filters (optional)**: Conditions on columns used to filter the original dataset. If set, only a subset of the original data is used in the analysis.
- **Number of trials**: Number of trials for hyper-parameter optimisation. Increasing the number of trials usually results in better predictions but longer training time.
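To see what the **Polynomial degree** option produces, here is a sketch using scikit-learn's `PolynomialFeatures` (shown for illustration; Actable AI's internals may differ). With degree 2 and two numeric predictors, the expansion adds the squares and the cross-intersection term:

```python
# Illustration of degree-2 polynomial expansion of two numeric predictors.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])                          # two numeric predictors
poly = PolynomialFeatures(degree=2, include_bias=False)
expanded = poly.fit_transform(X)
# Columns: x0, x1, x0^2, x0*x1 (the cross-intersection), x1^2
print(expanded)                                     # [[2. 3. 4. 6. 9.]]
```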

Imagine we are a real estate company and would like to forecast rental prices for properties that remain on the market. An example of the dataset could be:

days_on_market | initial_price | location | neighborhood | number_of_bathrooms | number_of_rooms | sqft | rental_price |
---|---|---|---|---|---|---|---|
10 | 2271 | great | south_side | 1 | 0 | 4848 | 2271 |
1 | 2167 | good | downtown | 1 | 1 | 674 | 2167 |
19 | 1883 | poor | westbrae | 1 | 1 | 554 | 1883 |
3 | 2431 | great | south_side | 1 | 0 | 529 | 2431 |
58 | 4463 | poor | westbrae | 2 | 3 | 1190 | 4123.812 |
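On data like this, a Bayesian linear regression can be sketched with scikit-learn's `BayesianRidge` (purely illustrative; not necessarily the model Actable AI uses internally). The key property is that predictions come with a standard deviation, which is what makes the quantile bounds possible:

```python
# Illustrative Bayesian linear regression on the numeric columns above.
import numpy as np
from sklearn.linear_model import BayesianRidge

# number_of_bathrooms, number_of_rooms, sqft (from the table above)
X = np.array([
    [1, 0, 4848],
    [1, 1, 674],
    [1, 1, 554],
    [1, 0, 529],
    [2, 3, 1190],
], dtype=float)
y = np.array([2271.0, 2167.0, 1883.0, 2431.0, 4123.812])

model = BayesianRidge().fit(X, y)
# A predictive mean and standard deviation for the first row.
mean, std = model.predict(X[:1], return_std=True)
```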

Now we have added some new properties and would like to find out what their rental prices should be:

days_on_market | initial_price | location | neighborhood | number_of_bathrooms | number_of_rooms | sqft | rental_price |
---|---|---|---|---|---|---|---|
18 | 1725 | poor | westbrae | 1 | 0 | 509 | |
49 | 1388 | poor | westbrae | 1 | 0 | 481 | |
1 | 4677 | good | downtown | 2 | 3 | 808 | |
30 | 1713 | poor | westbrae | 1 | 1 | 522 | |
10 | 1903 | good | downtown | 1 | 1 | 533 | |

We set our parameters as follows.

The result view contains a **Prediction** tab, a **Performance** tab, a **Multivariate** tab, a **Univariate** tab and a **Table** tab.

The **Prediction** tab shows the prediction results for the rows that are missing the target value. The table has two new columns, `<target>_low` and `<target>_high`.
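Since a Bayesian linear model yields a predictive mean and standard deviation, the low/high columns can be derived from the predictive distribution. A minimal sketch, assuming a Gaussian predictive distribution (the function name and defaults here are hypothetical):

```python
# Hypothetical derivation of <target>_low / <target>_high from a
# Gaussian predictive distribution.
import numpy as np
from scipy.stats import norm

def quantile_bounds(mean, std, q_low=0.05, q_high=0.95):
    """Turn predictive mean/std into low and high quantile columns."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    low = norm.ppf(q_low, loc=mean, scale=std)
    high = norm.ppf(q_high, loc=mean, scale=std)
    return low, high
```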

The **Performance** tab shows the performance of our model with the *Root Mean Square Error (RMSE)* metric (14.292) and the *R-squared (R2)* metric (1.0).

- **RMSE**: Root Mean Square Error is calculated as the square root of the second sample moment of the differences between predicted values and observed values.
- **R2**: R-squared is the coefficient of determination. It indicates how well the target can be predicted from the predictors: 0 means the predictors have no predictive power for the target, while 1 means the target is fully predictable from the predictors.
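Both metrics are straightforward to compute by hand, which may help when interpreting the Performance tab:

```python
# Direct implementations of the two performance metrics.
import numpy as np

def rmse(y_true, y_pred):
    """Square root of the mean of squared errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```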

- The **Multivariate** tab contains a table showing the variables used or generated by the model.
  - Variable name: The name of the variable used. Multiple variable names correspond to a cross-intersection of variables; in simple terms, these variables are multiplied by each other.
  - Coefficient Value: Coefficients of the regression model (mean of the distribution).
  - Standard Deviation: Standard deviation of the coefficients.
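With scikit-learn's `BayesianRidge`, for example, the posterior mean and covariance of the coefficients are exposed directly, so the two columns above correspond to the following quantities (a sketch; the toy data is hypothetical):

```python
# Posterior mean and standard deviation of a regression coefficient.
import numpy as np
from sklearn.linear_model import BayesianRidge

X = np.array([[0.0], [1.0], [2.0], [3.0]])    # one predictor, e.g. rooms
y = np.array([1.0, 3.1, 4.9, 7.0])            # roughly y = 2x + 1

model = BayesianRidge().fit(X, y)
coef_mean = model.coef_                       # "Coefficient Value" column
coef_std = np.sqrt(np.diag(model.sigma_))     # "Standard Deviation" column
```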

- The **Univariate** tab focuses on the relationship between each predictor and the target.
  - The first graph below shows a regression of our target `rental price` using only `number of rooms` as a feature. We can see that `number of rooms` alone can predict `rental price` with an R-squared of 0.912. We can also see that the more rooms, the higher the price.
  - The second graph shows the probability density function of the `rental price`. We can see that the number of rooms has a positive influence on the price.


We generate a univariate analysis for every original variable *(i.e. every variable that is not a cross-intersection or an exponent)* in the dataset.
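A univariate analysis of this kind amounts to a single-feature regression plus its R-squared. A sketch using the room counts and prices from the first table (illustrative; the resulting score will differ from the 0.912 reported above):

```python
# Single-feature regression: number_of_rooms vs rental_price.
import numpy as np
from sklearn.linear_model import LinearRegression

rooms = np.array([[0], [1], [1], [0], [3]])
price = np.array([2271.0, 2167.0, 1883.0, 2431.0, 4123.812])

uni = LinearRegression().fit(rooms, price)
score = uni.score(rooms, price)               # univariate R-squared
slope = uni.coef_[0]                          # positive => more rooms, higher price
```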

The **Table** tab displays the original dataset.