An Overview of Feature Selection + History-Based Feature Selection (HBFS)

This note was generated by AI analysis.

Created: 2026-03-28. Source: https://towardsdatascience.com/an-overview-of-feature-selection-1c50965551dd

Summary

A thorough overview of feature selection motivations and techniques for tabular ML, positioned as the first installment in a series on History-based Feature Selection (HBFS). It covers three motivations (accuracy, computation, robustness), two technique categories (per-feature evaluation vs. set-based search), and a taxonomy running from filter methods through wrapper methods and genetic algorithms.

Key Points

- Three motivations: (1) increase model accuracy, since irrelevant features confuse tree-based models at deeper nodes; (2) reduce compute, since fewer features cut tuning, training, and inference time; (3) improve robustness to future data drift
- Two broad technique categories:
  - Per-feature evaluation (filter methods): correlation, mutual information; fast, but they ignore feature interactions
  - Set-based search (wrapper methods): evaluate candidate subsets; slower, but they capture interactions; includes genetic algorithms and swarm intelligence
- HBFS preview: an experimental approach that learns from past feature-set evaluations to estimate which untried subsets might perform well, treating feature selection as a regression problem over subset space
- Unintuitive finding: using fewer features often increases accuracy, because irrelevant features cause tree-based models to make spurious splits at nodes where few samples remain
- Cloud cost dimension: in BigQuery-style column-pricing environments, fewer features reduce query costs on top of the compute-time savings
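
The trade-off between per-feature and set-based evaluation can be made concrete with a small sketch. It is an illustration on synthetic data, not code from the article: the column names (`a`, `b`, `linear`), the target formula, and the helper `filter_scores` are all assumptions chosen for the example. Each column is scored on its own by absolute Pearson correlation, which is how a filter method works, and two columns that matter only jointly end up looking useless.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def filter_scores(columns, target):
    """Filter-style scoring: |correlation| of each column with the target, one at a time."""
    return {name: abs(pearson(values, target)) for name, values in columns.items()}

rng = random.Random(0)
n_rows = 2000
a = [rng.choice([-1.0, 1.0]) for _ in range(n_rows)]
b = [rng.choice([-1.0, 1.0]) for _ in range(n_rows)]
linear = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
# Synthetic target (an assumption): depends on `linear` directly,
# and on `a` and `b` only through their product.
y = [lin + 2.0 * ai * bi for lin, ai, bi in zip(linear, a, b)]

scores = filter_scores({"a": a, "b": b, "linear": linear}, y)
# Evaluating `a` and `b` together (here via their product) reveals the signal
# that the per-feature scores miss.
joint_score = abs(pearson([ai * bi for ai, bi in zip(a, b)], y))

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
print(f"a*b jointly: {joint_score:.3f}")
```

On this data the individual scores for `a` and `b` sit near zero while the joint score is high, which is the interaction blind spot that set-based search is meant to cover, at the cost of evaluating many more candidates.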

Insights

The “fewer features = better accuracy for tree models” point is counterintuitive but well established. Tree learners pick each split greedily from the candidate features: a random subset per split in random forests, and all features (or a subsample, if column subsampling is enabled) in gradient boosting. At deep nodes, where few samples remain, an irrelevant feature can look like the best split purely by chance, and the more irrelevant features there are, the more often that happens. Feature selection removes the noise source rather than hoping the model learns to ignore it.
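
The chance-split effect can be checked with a small simulation. This is a sketch under stated assumptions, not an experiment from the article: labels and features are pure noise, splits are scored by Gini impurity reduction, and the function names (`best_gain`, `mean_best_noise_gain`) are hypothetical. Because no feature carries any signal, every gain measured here is spurious.

```python
import random

def gini(positives, count):
    """Gini impurity of a binary node with `positives` ones among `count` samples."""
    p = positives / count
    return 2.0 * p * (1.0 - p)

def best_gain(feature, labels):
    """Largest impurity reduction over all threshold splits on one feature."""
    order = sorted(range(len(labels)), key=feature.__getitem__)
    n, total_pos = len(labels), sum(labels)
    parent = gini(total_pos, n)
    best, left_pos = 0.0, 0
    for cut in range(1, n):  # left child holds the `cut` smallest feature values
        left_pos += labels[order[cut - 1]]
        child = (cut * gini(left_pos, cut)
                 + (n - cut) * gini(total_pos - left_pos, n - cut)) / n
        best = max(best, parent - child)
    return best

def mean_best_noise_gain(n_samples, n_noise_features, trials=50, seed=0):
    """Average, over random trials, of the best gain any pure-noise feature achieves."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        labels = [rng.randint(0, 1) for _ in range(n_samples)]
        total += max(
            best_gain([rng.random() for _ in range(n_samples)], labels)
            for _ in range(n_noise_features)
        )
    return total / trials

# A "deep" node with 10 samples versus a "shallow" node with 200 samples.
deep_node = mean_best_noise_gain(n_samples=10, n_noise_features=20)
shallow_node = mean_best_noise_gain(n_samples=200, n_noise_features=20)
deep_node_few_features = mean_best_noise_gain(n_samples=10, n_noise_features=2)

print(f"10 samples, 20 noise features:  {deep_node:.3f}")
print(f"200 samples, 20 noise features: {shallow_node:.3f}")
print(f"10 samples, 2 noise features:   {deep_node_few_features:.3f}")
```

The apparent gain from noise grows as the node gets smaller and as more irrelevant features compete for the split, so a genuinely useful feature has a harder time winning at deep nodes when many irrelevant columns are present.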

The HBFS concept is interesting because it reframes feature selection as a meta-learning problem: instead of searching the exponential feature-subset space exhaustively (infeasible for all but small feature counts) or greedily (often suboptimal), HBFS trains a model to predict subset performance from subset composition. This is analogous to neural architecture search (NAS) approaches that use a surrogate model to guide the search.
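
The general idea of predicting subset performance from subset composition can be sketched as follows. This is not the HBFS implementation, and the note does not specify which model or candidate-generation scheme HBFS uses. Everything here is an assumption made for illustration: the toy scorer `evaluate_subset` stands in for an expensive train-and-validate run, the surrogate is a plain linear regression over binary feature masks, and all untried subsets are ranked exhaustively, which only works because the example has 8 features.

```python
import itertools
import random

N_FEATURES = 8
USEFUL = {0, 3, 7}  # hypothetical: the features that help in this toy problem

def evaluate_subset(mask):
    """Toy stand-in (an assumption) for an expensive train-and-validate run."""
    gain = 0.1 * sum(1 for i in USEFUL if mask[i])
    penalty = 0.02 * sum(1 for i, on in enumerate(mask) if on and i not in USEFUL)
    return 0.5 + gain - penalty

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_surrogate(masks, scores, ridge=1e-6):
    """Ridge-regularised least squares: score ~ intercept + one weight per feature."""
    rows = [[1.0] + [float(bit) for bit in mask] for mask in masks]
    dim = N_FEATURES + 1
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(dim)] for i in range(dim)]
    xty = [sum(r[i] * s for r, s in zip(rows, scores)) for i in range(dim)]
    return solve(xtx, xty)

def predict(coef, mask):
    """Surrogate's predicted score for a feature mask."""
    return coef[0] + sum(c * bit for c, bit in zip(coef[1:], mask))

def history_guided_search(initial=30, rounds=3, per_round=5, seed=0):
    rng = random.Random(seed)
    all_masks = list(itertools.product((0, 1), repeat=N_FEATURES))
    # Seed the history with randomly chosen subsets and their observed scores.
    history = {mask: evaluate_subset(mask) for mask in rng.sample(all_masks, initial)}
    for _ in range(rounds):
        coef = fit_surrogate(list(history), list(history.values()))
        untried = [mask for mask in all_masks if mask not in history]
        untried.sort(key=lambda mask: predict(coef, mask), reverse=True)
        for mask in untried[:per_round]:  # spend real evaluations on the top predictions
            history[mask] = evaluate_subset(mask)
    best_mask = max(history, key=history.get)
    return best_mask, history[best_mask], history

best_mask, best_score, history = history_guided_search()
print(best_mask, round(best_score, 3), len(history))
```

The loop spends 45 real evaluations out of 256 possible subsets. A real problem would need a surrogate that can capture interactions between features and a way to propose candidates without enumerating every subset; both are left open here because the source does not describe them.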

Connections

Raw Excerpt

It is often the case that we find a higher accuracy by using fewer features than the full set of features available. This can be a bit unintuitive — in principle, models should ideally be able to ignore irrelevant features, but in practice, they very often cannot.
