Using “random forest” for classification and regression
Author of the article:
Author's Workplace: Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
Key Words: random forest, classification tree, discriminant analysis, regression, machine learning
Abstract: “Random forest” is an algorithm developed by Breiman and Cutler in 2001. It operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes output by the individual trees. It improves on the performance of a single decision tree, and it is much more efficient than traditional machine learning techniques such as artificial neural networks, especially when the dataset is large. Random forest can handle up to thousands of explanatory variables, and it can be used to rank the importance of variables when implemented with the R package “randomForest”. It is well suited to demonstrating the nonlinear effects of variables and can model complex interactions among them. Random forest is also robust to outliers. In this paper, three examples are used to show how random forest can be applied to a discrimination problem (the dependent variable has multiple categories), to presence/absence data (the dependent variable has two categories), and to regression (the dependent variable is continuous).
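The voting scheme described in the abstract, in which each tree predicts a class and the forest returns the modal class, can be sketched briefly. The paper itself uses the R package randomForest; the snippet below is an assumed stand-in using Python's scikit-learn and its built-in iris dataset, not the authors' own examples.

```python
# Minimal sketch of random-forest classification by majority vote,
# using scikit-learn as an analogue of the R package randomForest.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees are grown on bootstrap samples; each tree votes for a class,
# and the forest outputs the mode of those votes.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)       # held-out classification accuracy
importances = clf.feature_importances_     # variable-importance ranking
print(accuracy)
print(importances)
```

The `feature_importances_` attribute illustrates the variable-ranking capability mentioned in the abstract: each explanatory variable receives a score reflecting how much it contributes to reducing impurity across the trees.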