This document was generated by AI analysis
Created: 2026-03-28 Source: https://www.explainthis.io/zh-hant/swe/feature-flags
Summary
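ExplainThis opens with the 2017 AWS S3 outage, in which a typo caused a worldwide service disruption and losses of more than USD 150 million, to explain the concept and engineering value of feature flags. A feature flag lets software switch a feature on or off dynamically through external configuration, without redeployment, which enables gradual rollout, fast rollback, and A/B testing. This corresponds to one of the three remedies identified in the S3 incident review, and the article presents feature flags as key infrastructure for modern CI/CD.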
Key Points
- Feature flags enable turning features on/off without redeployment, via external config (LaunchDarkly, PostHog, AWS AppConfig)
- The 2017 S3 outage traced to three gaps: no input validation, config pushed everywhere at once, and no automated rollback; flags address the second
- Gradual rollout: enable for 1% of users first; detect issues before full release
- Code pattern: if (user.shouldSee("feature")) { showNewFeature() } else { showOld() } (the flag value lives in config, not code)
- Modern usage extends beyond binary toggle: percentage rollout, user targeting, A/B testing
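The flag check, percentage rollout, and user targeting listed above can be sketched roughly as follows. This is a minimal illustration, not the article's code or any vendor's API: the `FlagRule` shape, the `isEnabled` and `bucketOf` helpers, and the `new-checkout` flag are all hypothetical, and the config is defined inline where a real system would load it from an external flag service.

```typescript
// Hypothetical flag rule; a real system would fetch this from an external
// config store or flag service instead of defining it in code.
interface FlagRule {
  enabled: boolean;       // master switch: set to false for an instant rollback
  rolloutPercent: number; // 0-100, share of users who get the feature
  allowUserIds: string[]; // users who always get the feature (targeting)
}

type FlagConfig = Record<string, FlagRule>;

// FNV-1a 32-bit hash. It is deterministic, so a given user lands in the
// same bucket on every request.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a (flag, user) pair to a stable bucket in [0, 100).
function bucketOf(flagName: string, userId: string): number {
  return fnv1a(`${flagName}:${userId}`) % 100;
}

function isEnabled(config: FlagConfig, flagName: string, userId: string): boolean {
  const rule: FlagRule | undefined = config[flagName];
  if (rule === undefined || !rule.enabled) return false; // unknown or switched-off flag
  if (rule.allowUserIds.includes(userId)) return true;   // explicit targeting
  return bucketOf(flagName, userId) < rule.rolloutPercent; // percentage rollout
}

// Usage: roughly 1% of users plus one targeted tester see the new path.
const exampleConfig: FlagConfig = {
  "new-checkout": { enabled: true, rolloutPercent: 1, allowUserIds: ["qa-1"] },
};

function render(userId: string): string {
  return isEnabled(exampleConfig, "new-checkout", userId) ? "new" : "old";
}
```

Hashing the flag name together with the user ID keeps each user's experience stable across requests and avoids putting the same users first in line for every flag. Raising `rolloutPercent` only adds users; nobody who already has the feature loses it.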
Insights
The S3 incident framing is pedagogically brilliant: it turns feature flags from an abstract engineering practice into a concrete response to a catastrophic real-world failure. The engineering lesson is that human error is inevitable; what matters is whether the system design contains the damage and makes the change easy to reverse. Feature flags address the “blast radius” problem: even if a bad change ships, it affects only a controlled subset of users.
Connections
Raw Excerpt
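If the configuration were pushed to end users gradually rather than all at once, the problem could be caught early, while the change was reaching only a small number of users. That avoids pushing it to the whole world in one go, where any problem immediately has worldwide impact.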
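The gradual push described in this excerpt can be sketched as a staged rollout that stops and reverts as soon as a stage looks unhealthy. This is an assumption-laden sketch rather than how AWS or the article implements it: the stage percentages, the `stagedRollout` function, and the `applyPercent` and `errorRateAt` callbacks are hypothetical stand-ins for whatever a real system uses to push config and read metrics. A production rollout would also wait between stages to let metrics settle.

```typescript
// Hypothetical stages: share of users receiving the new config at each step.
const STAGES: readonly number[] = [1, 5, 25, 50, 100];

interface RolloutResult {
  finalPercent: number;  // exposure when the rollout stopped
  rolledBack: boolean;   // true if a stage failed its health check
  stagesTried: number[]; // stages that were actually applied
}

function stagedRollout(
  applyPercent: (percent: number) => void, // pushes the config to that share of users
  errorRateAt: (percent: number) => number, // observed error rate at that exposure
  maxErrorRate: number,
  stages: readonly number[] = STAGES,
): RolloutResult {
  const stagesTried: number[] = [];
  let current = 0;
  for (const percent of stages) {
    applyPercent(percent);
    current = percent;
    stagesTried.push(percent);
    if (errorRateAt(percent) > maxErrorRate) {
      applyPercent(0); // automated rollback: no user receives the change any more
      return { finalPercent: 0, rolledBack: true, stagesTried };
    }
  }
  return { finalPercent: current, rolledBack: false, stagesTried };
}
```

With these stages, a change that fails its first health check has reached only 1% of users before being reverted, which is the reduced blast radius the excerpt argues for.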