How the Random Forest Algorithm Works in Machine Learning

  • To model a larger number of decision trees to create the forest, you are not going to use the same approach of constructing a single decision tree with the information gain or Gini index method.
  • If you are not familiar with the concepts of the decision tree classifier, please spend some time on the articles below, as you need to know how the decision tree classifier works before learning how the random forest algorithm works.
  • Given a training dataset with targets and features, the decision tree algorithm will come up with a set of rules.
  • In the decision tree algorithm, calculating these nodes and forming the rules is done using information gain and Gini index calculations.
  • In the random forest algorithm, instead of using information gain or the Gini index to choose the root node, the process of finding the root node and splitting the feature nodes happens randomly.



You are going to learn about the most popular classification algorithm: the random forest algorithm, or, in machine learning terms, the random forest classifier. As motivation to go further, I am going to give you one of the best advantages of the random forest.

The same algorithm works for both classification and regression. You might think I am kidding, but the truth is, yes, we can use the same random forest algorithm for both classification and regression.

Excited? I had the same feeling when I first heard about this advantage of the random forest algorithm: the same algorithm can be used for both regression and classification problems.

In this article, you will learn how the random forest algorithm works in machine learning for the classification task. In an upcoming article, you can learn how the random forest algorithm can be used for regression.

Get a cup of coffee before you begin, as this is going to be a long article.


How the random forest classifier works for classification

The random forest algorithm is a supervised classification algorithm. As the name suggests, this algorithm creates a forest out of a number of trees.

In general, the more trees in the forest, the more robust the forest looks. In the same way, in the random forest classifier, a higher number of trees in the forest tends to give higher accuracy results, as the sketch below illustrates.
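To make that concrete, here is a minimal sketch (my own, not from the original article) that trains scikit-learn's RandomForestClassifier with different numbers of trees on a synthetic dataset and prints the resulting test accuracy:

```python
# Sketch: effect of the number of trees (n_estimators) on accuracy.
# The dataset is synthetic and the values are purely illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for n_trees in [1, 10, 100]:
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=42)
    forest.fit(X_train, y_train)
    preds = forest.predict(X_test)
    print(n_trees, "trees -> accuracy:", accuracy_score(y_test, preds))
```

On most runs you will see the accuracy climb and then level off as more trees are added, which is the intuition behind "more trees, more robust forest."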

If you know the decision tree algorithm, you might be wondering whether we are creating a larger number of decision trees, and how we can create them, since all the node-selection calculations would be the same for the same dataset.

Yes, you are right. To model a larger number of decision trees to create the forest, you are not going to use the same approach of constructing a single decision tree with the information gain or Gini index method.

If you are not aware of the concepts of the decision tree classifier, please spend some time on the articles below, as you need to know how the decision tree classifier works before learning how the random forest algorithm works.

If you are new to the concept of a decision tree, here is a basic overview.

The decision tree concept is closer to a rule-based system. Given a training dataset with targets and features, the decision tree algorithm will come up with a set of rules. The same set of rules can then be used to perform predictions on the test dataset.

Suppose you would like to predict whether your daughter will like a newly released animated movie. To model the decision tree, you would use a training dataset such as the animated characters your daughter liked in past movies.

Once you pass the dataset to the decision tree classifier, with the target being whether your daughter will like the movie or not, the decision tree will start building rules, with the characters your daughter likes as the nodes and the targets (like or not) as the leaf nodes. By following the path from the root node to a leaf node, you get the rules.

A simple rule could be: if character X is playing the leading role, then your daughter will like the movie. You can think of a few more rules based on this example.

Then, to predict whether your daughter will like the newly released movie or not, you just need to check it against the rules created by the decision tree.
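Here is a hedged sketch of that movie example using scikit-learn's DecisionTreeClassifier; the two features and the tiny dataset are made up purely for illustration, not taken from the article:

```python
# Sketch: a decision tree as a rule-based system on a toy "movie" dataset.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features per movie:
# [favourite_character_leads (1/0), is_animated (1/0)]
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 1]]
# Target: 1 = liked the movie, 0 = did not like it
y = [1, 1, 0, 0, 1, 0]

tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)

# Print the learned rules, i.e. the root-to-leaf paths, as text.
print(export_text(tree, feature_names=["favourite_character_leads", "is_animated"]))

# Predict for a new movie where the favourite character plays the lead
# and the movie is animated.
print(tree.predict([[1, 1]]))
```

The printed rules read exactly like the example above: "if the favourite character leads, predict like," which is what is meant by the decision tree being a rule-based system.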

In the decision tree algorithm, calculating these nodes and forming the rules is done using information gain and Gini index calculations.
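As a reminder of what that calculation looks like, here is a small sketch (mine, not the article's) of the Gini impurity, one minus the sum of squared class proportions in a node, and the impurity reduction a candidate split would give:

```python
# Sketch: Gini impurity and the impurity reduction of a candidate split.
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())

node = [1, 1, 1, 0, 0, 0]            # mixed node: maximum impurity for 2 classes
left, right = [1, 1, 1], [0, 0, 0]   # a candidate split that separates the classes

print(gini(node))                    # 0.5
weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(node)
print(gini(node) - weighted)         # impurity reduction of the split = 0.5
```

The decision tree evaluates candidate splits like this for every feature and picks the one with the largest reduction; information gain plays the same role using entropy instead of Gini impurity.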

In the random forest algorithm, instead of using information gain or the Gini index to choose the root node, the process of finding the root node and splitting the feature nodes happens randomly. We will look at this in detail in the coming section.
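As a rough illustration of that idea, the sketch below (my own assumption of typical settings, not the article's code) uses scikit-learn's RandomForestClassifier, where max_features limits each split to a random subset of the features and bootstrap gives every tree a random sample of the training rows:

```python
# Sketch: the randomness in a random forest, via scikit-learn parameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=16, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,     # number of trees in the forest
    max_features="sqrt",  # each split considers a random subset of features
    bootstrap=True,       # each tree trains on a random bootstrap sample of rows
    random_state=0,
)
forest.fit(X, y)
print(forest.score(X, y))
```

Because each tree sees different rows and considers different feature subsets at each split, the trees end up different from one another, which is what makes the combined forest more robust than any single decision tree.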
