The CART (Classification and Regression Trees) algorithm builds decision trees using the Gini impurity index as its splitting criterion. It was described by the statistician Leo Breiman and can be used for both classification and regression.
CART is an umbrella term that covers the following types of decision trees:
Classification trees: When the target variable is categorical, the tree is used to find the "class" into which the target variable is most likely to fall.
Regression trees: When the target variable is continuous, the tree is used to predict its value.
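To make the regression case concrete, here is a minimal sketch of how a regression tree might pick one split: it tries each midpoint between consecutive feature values and keeps the one with the lowest sum of squared errors in the two children. The names `sse` and `best_regression_split` and the toy data are assumptions made for illustration, not part of any library.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean; 0.0 for an empty list."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_regression_split(xs, ys):
    """Return (threshold, total_sse) for the split with the lowest child SSE."""
    best = (None, float("inf"))
    points = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (t, total)
    return best

# Hypothetical toy data: two clearly separated groups of target values.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
threshold, error = best_regression_split(xs, ys)
```

Each resulting leaf would then predict the mean of the target values that fall into it.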
In a decision tree, nodes are first split into sub-nodes based on attributes. CART chooses each split by searching for the one that makes the sub-nodes most homogeneous, as measured by the Gini index criterion. The root node holds the entire training dataset and is split into two by the best attribute.
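The split search described above can be sketched as follows. This is a simplified illustration that assumes numeric features stored as lists of equal length; the helper names `gini` and `best_gini_split` and the sample data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_gini_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    n = len(rows)
    best = (None, None, float("inf"))
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            # Weight each child's impurity by its share of the samples.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
rows = [[2.0, 7.0], [3.0, 1.0], [4.0, 8.0], [8.0, 2.0], [9.0, 9.0], [10.0, 3.0]]
labels = ["a", "a", "a", "b", "b", "b"]
feature, threshold, score = best_gini_split(rows, labels)
```

A weighted Gini of 0 means both sub-nodes are pure, which is the most homogeneous split possible.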
The resulting subsets are then split in the same way. This process continues until every subset is pure or the tree reaches its maximum number of leaves. This phase is known as growing the tree; tree pruning is the separate step of cutting back branches of the grown tree to reduce overfitting.
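The recursive procedure can be illustrated with a small sketch. To keep it short, it assumes a single numeric feature and uses a depth limit as the stopping rule in place of a leaf count; `grow`, `predict`, and the dictionary layout of the tree are hypothetical choices, not a reference implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(xs, ys, depth=0, max_depth=3):
    """Recursively grow a binary tree on one numeric feature."""
    values = sorted(set(xs))
    # Stop when the node is pure, the depth limit is hit, or no split exists.
    if len(set(ys)) == 1 or depth >= max_depth or len(values) < 2:
        return {"leaf": Counter(ys).most_common(1)[0][0]}
    best_t, best_score = None, float("inf")
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    left = [(x, y) for x, y in zip(xs, ys) if x <= best_t]
    right = [(x, y) for x, y in zip(xs, ys) if x > best_t]
    return {
        "threshold": best_t,
        "left": grow([x for x, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([x for x, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, x):
    """Walk from the root to a leaf and return its majority class."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]

# Hypothetical toy data that needs two levels of splits to become pure.
xs = [1, 2, 3, 4, 5, 6]
ys = ["a", "a", "b", "b", "a", "a"]
tree = grow(xs, ys)
```

Limiting the depth while growing is a form of early stopping; it is not the same as pruning an already grown tree.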
To understand how to calculate the Gini impurity index, you can refer to Decision Tree: Gini Impurity.
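As a quick reminder, the Gini impurity of a node is commonly written as below. The notation is an assumption made here: K is the number of classes and p_k is the fraction of the node's samples that belong to class k. The worked case uses a hypothetical node with 75% of its samples in one class and 25% in the other.

```latex
G = 1 - \sum_{k=1}^{K} p_k^2
\qquad \text{e.g. } p = (0.75,\ 0.25): \quad
G = 1 - (0.75^2 + 0.25^2) = 1 - 0.625 = 0.375
```

A pure node has G = 0, and for two classes the maximum is G = 0.5, reached when both classes are equally frequent.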
CART has the following advantages:
- It is nonparametric, so it does not rely on the data following a particular type of distribution.
- It combines testing on a held-out test set with cross-validation to assess goodness of fit more accurately.
- It can easily be used in combination with other prediction algorithms to select the set of input variables.
CART is the building block of random forest, one of the most popular algorithms in machine learning, which combines many decision trees. It is widely used to build decision trees for classification and regression. Decision trees are widely used in data mining to create a model that predicts the value of a target based on the values of many input variables.