Orange: Random Forest

From OnnoWiki
 
Revision as of 09:37, 23 January 2020

Source: https://docs.biolab.si//3/visual-programming/widgets/model/randomforest.html


Predict using an ensemble of decision trees.

Inputs

   Data: input dataset
   Preprocessor: preprocessing method(s)

Outputs

   Learner: random forest learning algorithm
   Model: trained model

Random forest is an ensemble learning method used for classification, regression and other tasks. It was first proposed by Tin Kam Ho and further developed by Leo Breiman (Breiman, 2001) and Adele Cutler.

Random Forest builds a set of decision trees. Each tree is developed from a bootstrap sample from the training data. When developing individual trees, an arbitrary subset of attributes is drawn (hence the term “Random”), from which the best attribute for the split is selected. The final model is based on the majority vote from individually developed trees in the forest.
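The procedure just described (bootstrap sample, random attribute subset, majority vote) can be sketched in plain Python. This is a toy illustration that uses one-level decision "stumps" as the trees, not the widget's actual implementation:

```python
import math
import random
from collections import Counter

def best_stump(rows, labels, feature_idxs):
    """Pick the (feature, threshold) split that best separates the labels,
    considering only the randomly drawn subset of features."""
    best, best_err = None, float("inf")
    for f in feature_idxs:
        for t in sorted({r[f] for r in rows}):
            left = Counter(l for r, l in zip(rows, labels) if r[f] <= t).most_common(1)
            right = Counter(l for r, l in zip(rows, labels) if r[f] > t).most_common(1)
            left = left[0][0] if left else None
            right = right[0][0] if right else None
            err = sum(l != (left if r[f] <= t else right)
                      for r, l in zip(rows, labels))
            if err < best_err:
                best_err, best = err, (f, t, left, right)
    return best

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)        # fixed seed -> replicable forest
    n, m = len(rows), len(rows[0])
    k = round(math.sqrt(m))          # default: square root of attribute count
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(m), k)             # random attribute subset
        forest.append(best_stump([rows[i] for i in idx],
                                 [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote across the individually developed trees."""
    votes = [(left if row[f] <= t else right) for f, t, left, right in forest]
    return Counter(v for v in votes if v is not None).most_common(1)[0][0]

# Toy two-class data: points near the origin vs. points near (5, 5)
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a"] * 4 + ["b"] * 4
forest = fit_forest(rows, labels)
print(predict(forest, (0, 0)), predict(forest, (6, 6)))
```

Each stump is weak on its own, but the vote across many bootstrap-trained stumps is stable; replacing the stumps with full decision trees gives the actual Random Forest.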

Random Forest works for both classification and regression tasks.

[[File:RandomForest-stamped.png|center|200px|thumb]]
   1. Specify the name of the model. The default name is “Random Forest”.
   2. Specify how many decision trees will be included in the forest (Number of trees in the forest), and how many attributes will be arbitrarily drawn for consideration at each node. If the latter is not specified (option Number of attributes… left unchecked), this number is equal to the square root of the number of attributes in the data. You can also choose to fix the seed for tree generation (Fixed seed for random generator), which enables replicability of the results.
   3. Breiman’s original proposal is to grow the trees without any pre-pruning, but since pre-pruning often works quite well and is faster, the user can set the depth to which the trees will be grown (Limit depth of individual trees). Another pre-pruning option is to set the smallest subset that can be split (Do not split subsets smaller than).
   4. Produce a report.
   5. Click Apply to communicate the changes to other widgets. Alternatively, tick the box on the left side of the Apply button and the changes will be communicated automatically.
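For readers who script Orange directly, these options roughly correspond to parameters of scikit-learn's RandomForestClassifier, which Orange wraps under the hood. The mapping below is an illustrative assumption (the values shown are arbitrary examples, not defaults):

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical mapping of the widget's options onto scikit-learn parameters
model = RandomForestClassifier(
    n_estimators=10,      # Number of trees in the forest
    max_features="sqrt",  # default when "Number of attributes..." is unchecked
    random_state=42,      # Fixed seed for random generator -> replicable results
    max_depth=None,       # Limit depth of individual trees (None = no pre-pruning)
    min_samples_split=5,  # Do not split subsets smaller than
)
```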

Examples

For classification tasks, we use the iris dataset. Connect the File widget to Predictions. Then connect File to Random Forest and to Tree, and connect both models to Predictions as well. Finally, observe the predictions for the two models.

[[File:RandomForest-classification.png|center|200px|thumb]]
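The same workflow can be approximated in a script. The sketch below uses scikit-learn's bundled iris data in place of the File widget and prints both models' predictions, much as the Predictions widget would (an illustrative analogue, not the widget's code):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# File widget stand-in: load the iris data
X, y = load_iris(return_X_y=True)

# Tree and Random Forest stand-ins
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Observe the predictions of the two models on a few instances
print("tree  :", tree.predict(X[:5]))
print("forest:", forest.predict(X[:5]))
```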

For regression tasks, we will use the housing data. Here, we compare three models, namely Random Forest, Linear Regression and Constant, in the Test & Score widget.

[[File:RandomForest-regression.png|center|200px|thumb]]
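A scripted analogue of this comparison, assuming scikit-learn is available: recent scikit-learn releases no longer ship the original housing data, so a synthetic regression problem stands in for it, DummyRegressor plays the role of Constant, and cross-validation plays the role of Test & Score:

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the housing data (mostly linear, with noise)
X, y = make_regression(n_samples=200, n_features=8, noise=10, random_state=0)

models = [("Random Forest", RandomForestRegressor(random_state=0)),
          ("Linear Regression", LinearRegression()),
          ("Constant", DummyRegressor())]

for name, model in models:
    # 5-fold cross-validated R^2, as Test & Score would report
    r2 = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:17s} R2 = {r2:.2f}")
```

On linearly generated data, Linear Regression should score best, with Constant near zero; the point is the head-to-head comparison, not the exact numbers.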


References

Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.



Interesting Links