Revision as of 10:20, 12 October 2022

Feature selection

@@ Line 22: / Line 22: @@
 ** [https://avikarn.com/2020-06-26-MachineLearning_rf_glmnet/ Utilizing Machine Learning algorithms (GLMnet and Random Forest models) for Genomic Prediction of a Quantitative trait]
 * [https://link.springer.com/article/10.1186/1471-2105-7-3 Gene selection and classification of microarray data using random forest] 2006, and the R package [https://cran.r-project.org/web/packages/varSelRF/ varSelRF]
+** The most reliable measure of variable importance is based on the decrease in classification accuracy when the values of a variable in a node of a tree are permuted randomly.
+** This measure of variable importance is not the same as a non-parametric statistic of difference between groups, such as could be obtained with a Kruskal-Wallis test.
 * [https://www.r-bloggers.com/2021/07/feature-importance-in-random-forest/ Feature Importance in Random Forest]
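
The permutation-based importance described above can be illustrated with a minimal sketch. It is not the varSelRF or randomForest implementation: it assumes a fixed stand-in classifier (the hypothetical <code>predict</code> function) instead of a forest, and it permutes a whole feature column over the data set rather than within tree nodes or out-of-bag samples. The function names and the toy data are made up for illustration.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after randomly permuting each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other columns stay untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption): feature 0 determines the class, feature 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def predict(row):
    """Stand-in classifier that uses feature 0 only (not a random forest)."""
    return int(row[0] > 0.5)


importances = permutation_importance(predict, X, y)
print(importances)
```

In this toy setup the informative feature loses a large share of accuracy when permuted, while the noise feature loses none, because the stand-in classifier never reads it. A group-difference statistic such as Kruskal-Wallis would be computed per feature without reference to any classifier, which is why the two measures need not agree.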