Random Forest - Visualization


To build an intuitive visualization of the model space represented by a random forest, a dataset of 200 random points (100 green and 100 red) was created. The green points were drawn from a Gaussian distribution centered at (0,1), and the red points from a Gaussian distribution centered at (1,0). In both cases, the variance was circular (the same in every direction), with an average radius of 1.
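A minimal sketch of how such a dataset might be generated with NumPy. The text does not say how "an average radius of 1" maps onto the distribution parameters, so this sketch assumes an isotropic Gaussian with a standard deviation of 1 on each axis. The seed, the variable names, and the 0/1 label encoding are illustrative choices, not taken from the original.

```python
import numpy as np

# Fixed seed so the sketch is reproducible (arbitrary choice).
rng = np.random.default_rng(0)

# Assumption: "circular variance" is read as an isotropic Gaussian
# with a standard deviation of 1 on each axis.
green = rng.normal(loc=(0.0, 1.0), scale=1.0, size=(100, 2))  # centroid (0, 1)
red = rng.normal(loc=(1.0, 0.0), scale=1.0, size=(100, 2))    # centroid (1, 0)

# Stack into one feature matrix; label green as 0 and red as 1.
X = np.vstack([green, red])
y = np.array([0] * 100 + [1] * 100)

print(X.shape, y.shape)
```

Because the two centroids are only about 1.41 units apart while each cloud has a spread of roughly 1, the classes overlap heavily, so no classifier can separate them perfectly.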

A random forest model consisting of 50 trees was trained on this data. In the resulting visualization, the purity of the color indicates the proportion of the 50 trees that voted in agreement. Significant overfitting can be observed in this visualization.
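The vote fraction described above can be sketched with scikit-learn, which is an assumption here since the original does not name the software used. The sketch regenerates the data under the same assumed reading (isotropic Gaussians with unit standard deviation), fits 50 trees, and counts the hard vote of each tree at every point of a grid. The grid bounds, the resolution, and the seed are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Recreate the two Gaussian clouds (assumed unit standard deviation).
rng = np.random.default_rng(0)
green = rng.normal(loc=(0.0, 1.0), scale=1.0, size=(100, 2))
red = rng.normal(loc=(1.0, 0.0), scale=1.0, size=(100, 2))
X = np.vstack([green, red])
y = np.array([0] * 100 + [1] * 100)  # 0 = green, 1 = red

# 50 trees, as in the text; other hyperparameters are library defaults.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Evaluate every tree on a regular grid covering the data.
xs = np.linspace(-3.0, 4.0, 60)
ys = np.linspace(-3.0, 4.0, 60)
gx, gy = np.meshgrid(xs, ys)
grid = np.column_stack([gx.ravel(), gy.ravel()])

# Each fitted tree casts a hard vote (0 or 1) for every grid point.
votes = np.stack([tree.predict(grid) for tree in forest.estimators_])

# Fraction of trees voting red, and the share that sides with the majority.
red_fraction = votes.mean(axis=0).reshape(gx.shape)
agreement = np.maximum(red_fraction, 1.0 - red_fraction)

print("training accuracy:", forest.score(X, y))
```

Mapping `red_fraction` to a color scale (for example with matplotlib's `pcolormesh`) would give a picture of the kind described: saturated color where the trees agree, washed-out color where they split. A training accuracy far above what the overlap of the two classes allows on new data is one sign of the overfitting mentioned in the text.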

For contrast, a logistic regression model (which is somewhat less prone to overfitting) was also trained on the same data.
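A comparable sketch for the logistic regression model, again assuming scikit-learn and the same assumed data-generating parameters (neither is specified in the original). With two continuous features the fitted model has a single straight-line decision boundary, which is why it cannot carve out small regions around individual training points.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Recreate the two Gaussian clouds (assumed unit standard deviation).
rng = np.random.default_rng(0)
green = rng.normal(loc=(0.0, 1.0), scale=1.0, size=(100, 2))
red = rng.normal(loc=(1.0, 0.0), scale=1.0, size=(100, 2))
X = np.vstack([green, red])
y = np.array([0] * 100 + [1] * 100)  # 0 = green, 1 = red

# Library defaults (L2 regularization) are used; this is an illustrative choice.
logit = LogisticRegression().fit(X, y)

# The decision boundary is the line w0*x + w1*y + b = 0.
w = logit.coef_[0]
b = logit.intercept_[0]

# Probability of "red" on a grid, for a smooth color gradient.
xs = np.linspace(-3.0, 4.0, 60)
ys = np.linspace(-3.0, 4.0, 60)
gx, gy = np.meshgrid(xs, ys)
grid = np.column_stack([gx.ravel(), gy.ravel()])
red_probability = logit.predict_proba(grid)[:, 1].reshape(gx.shape)

print("weights:", w, "intercept:", b)
print("training accuracy:", logit.score(X, y))
```

Plotting `red_probability` would show a smooth transition across one straight boundary rather than the patchwork of regions a forest of deep trees tends to produce.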

(Typically, random forests are best suited to categorical features, but continuous features were used in this illustration because they are easier to visualize.)
