Feature Selection

In machine learning and statistics, feature selection, also known as variable selection, feature reduction, attribute selection or variable subset selection, is the process of selecting a subset of relevant features for use in model construction. The central assumption behind any feature selection technique is that the data contain many redundant or irrelevant features. Redundant features provide no more information than the currently selected features, while irrelevant features provide no useful information in any context.

Feature selection can be viewed as a special case of the more general field of feature extraction. Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the original features unchanged.

Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points). The archetypal case is the analysis of DNA microarrays, where there are many thousands of features and only a few tens to hundreds of samples.

Feature selection techniques provide three main benefits when constructing predictive models:

  • improved model interpretability,
  • shorter training times,
  • enhanced generalisation by reducing overfitting.

Feature selection is also useful as part of the data analysis process, as it shows which features are important for prediction and how these features are related.
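
As a minimal sketch of the idea, the following Python example ranks features by the absolute value of their Pearson correlation with the target and keeps the top k columns unchanged. This simple scoring rule is only one illustrative possibility and is not a method prescribed by the text; the helper names pearson and select_top_k, the synthetic data, and the choice of columns 3 and 17 as the informative features are all assumptions made for the illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_top_k(X, y, k):
    """Return the sorted indices of the k columns of X with the largest |correlation| to y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Synthetic "many features, few samples" data: 200 features, 40 samples.
# Only columns 3 and 17 (an arbitrary, hypothetical choice) influence the target.
rng = random.Random(0)
n_samples, n_features = 40, 200
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [row[3] + row[17] + rng.gauss(0, 0.1) for row in X]

selected = select_top_k(X, y, k=2)

# Selection keeps the original columns as they are; no new features are computed.
X_reduced = [[row[j] for j in selected] for row in X]
print("selected feature indices:", selected)
```

Because each feature is scored on its own, a ranking of this kind can discard irrelevant features but cannot detect redundancy between the features it keeps. With so few samples, an uninformative column may also score highly by chance, which is one reason the selected subset is normally validated on held-out data.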
