Eylem Seç
Random Forest Robustness, Variable Importance, and Tree Aggregation
Başlık:
Random Forest Robustness, Variable Importance, and Tree Aggregation
Yazar:
Sage, Andrew John, author. (orcid)0000-0003-4656-9748
ISBN:
9780438077010
Yazar Ek Girişi:
Fiziksel Tanımlama:
1 electronic resource (120 pages)
Genel Not:
Source: Dissertation Abstracts International, Volume: 79-11(E), Section: B.
Advisors: Ulrike Genschel; Dan Nettleton Committee members: Daniel Nordman; Craig Ogilvie; Vivekananda Roy.
Özet:
Random forest methodology is a nonparametric, machine learning approach capable of strong performance in regression and classification problems involving complex datasets. In addition to making predictions, random forests can be used to assess the relative importance of explanatory variables. In this dissertation, we explore three topics related to random forests: tree aggregation, variable importance, and robustness. In Chapter 2, we show that the method of tree aggregation used in one popular random forest implementation can lead to biased class probability estimates and that it is often beneficial to combine the tree partitioning algorithm used in one implementation with the aggregation scheme used in another. In Chapter 3, we show that imputing missing values proir to assessing variable importance often leads to inaccurate variable importance measures. Using simulation studies, we investigate the impact on variable importance of six random-forest-based imputation techniques and find that some techniques are prone to overestimating the importance of variables whose values have been imputed, while other techniques tend to underestimate the importance of such variables. In Chapter 4, we propose a new robust approach for random forest regression. Adapted from a popular approach used in polynomial regression, our method uses residual analysis to modify the weights associated with training cases in random forest predictions, so that outlying training cases have less impact. We show, using simulation studies, that this approach outperforms existing robust techniques on noisy, contaminated datasets.
Notlar:
School code: 0097
Konu Başlığı:
Tüzel Kişi Ek Girişi:
Mevcut:*
Yer Numarası | Demirbaş Numarası | Shelf Location | Lokasyon / Statüsü / İade Tarihi |
---|---|---|---|
XX(690127.1) | 690127-1001 | Proquest E-Tez Koleksiyonu | Arıyor... |
On Order
Liste seç
Bunu varsayılan liste yap.
Öğeler başarıyla eklendi
Öğeler eklenirken hata oldu. Lütfen tekrar deneyiniz.
:
Select An Item
Data usage warning: You will receive one text message for each title you selected.
Standard text messaging rates apply.