7 Rain Prediction
20200901 Decision trees, and the ensemble of decision trees that makes up a random forest, are two common approaches to building classification models in AI. The concept of an ensemble of decision trees was introduced in 1988 in the paper "Combining Decision Trees: Initial results from the MIL algorithm", which demonstrates the improved performance obtained from multiple trees. The rattle package in R provides the weatherAUS dataset, which can be used to predict whether it will rain tomorrow (or any other target variable of choice).
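The intuition behind the gain from multiple trees can be illustrated with a little arithmetic. The sketch below is not taken from the 1988 paper or from the rain package. It assumes, unrealistically, that every tree votes independently and is correct with the same probability, and it uses an odd number of trees to avoid ties. The function name `majority_vote_accuracy` and the per-tree accuracy of 0.7 are illustrative choices only.

```python
from math import comb


def majority_vote_accuracy(n_trees: int, tree_accuracy: float) -> float:
    """Probability that a strict majority of independent trees votes correctly.

    Assumption: each tree is correct with probability tree_accuracy and the
    trees err independently of one another (real forests only approximate this).
    """
    needed = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * tree_accuracy**k * (1 - tree_accuracy) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


# Compare a single tree with ensembles of 5 and 25 trees.
for n in (1, 5, 25):
    print(n, round(majority_vote_accuracy(n, 0.7), 3))
```

Under these assumptions the ensemble accuracy rises as trees are added. Trees trained on the same data are correlated in practice, so the real improvement is usually smaller than this idealised calculation suggests.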
MLHub’s rain package uses the weatherAUS dataset from R’s rattle package to train a model that predicts the probability of rain tomorrow from today’s weather observations. The training dataset consists of daily weather observations from weather stations across Australia, capturing the amount of sunshine, the humidity, the amount of rain today, and so on. The simplest approach uses the decision tree induction algorithm to build a model that captures knowledge in the form of a decision tree. Other models, often more accurate but more complex, include the random forest, which builds a forest (that is, a collection) of decision trees and combines them into an ensemble model. Ensembles have been shown over many years to produce more accurate models (see, for example, the original work on multiple inductive learning).
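To make "knowledge in the form of a decision tree" concrete, a tree can be read as a set of nested tests on today's observations, each path ending in a probability of rain tomorrow. The following is a hand-written sketch, not the model that the rain package actually builds. The variable names, the thresholds, and the leaf probabilities are all hypothetical and chosen only for illustration.

```python
def rain_tree(humidity_3pm: float, sunshine_hours: float, rainfall_mm: float) -> float:
    """Return an illustrative probability of rain tomorrow.

    Every threshold and leaf value here is made up for the example; a tree
    induced from weatherAUS would choose its own splits from the data.
    """
    if humidity_3pm >= 70:          # humid afternoon
        if sunshine_hours < 5:      # and little sunshine today
            return 0.8
        return 0.5
    if rainfall_mm > 1:             # drier afternoon, but it rained today
        return 0.3
    return 0.1                      # dry afternoon and no rain today


# One day's observations (hypothetical values).
today = {"humidity_3pm": 82.0, "sunshine_hours": 2.5, "rainfall_mm": 4.0}
print(rain_tree(**today))  # prints 0.8
```

A random forest would hold many such trees, each induced from a different sample of the data, and combine their outputs into a single prediction.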
The example model and code come from the book Essentials of Data Science by Graham Williams (https://bit.ly/essentials_data_science).
We install, configure and demonstrate the model with these three commands:
ml install rain
ml configure rain
ml demo rain
Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.