Benjamini-Hochberg FDR

What it is

PQC uses Benjamini-Hochberg (BH) false discovery rate control on day-level test p-values derived from residual anomaly scores.

FDR definition

False discovery rate is:

\[\mathrm{FDR} = \mathbb{E}\left[\frac{V}{\max(R,1)}\right]\]

where \(V\) is the number of false positives and \(R\) is the number of rejected hypotheses.

BH procedure (used by PQC)

For sorted p-values \(p_{(1)} \le \cdots \le p_{(m)}\) and target \(q\):

  1. find largest \(k\) such that \(p_{(k)} \le (k/m)q\)

  2. reject all hypotheses with \(p \le p_{(k)}\)

Why PQC uses it

Outlier scans test many days. BH controls expected false discoveries while retaining more power than family-wise error controls (e.g., Bonferroni).

Assumptions and caveats

  • BH guarantees hold under independence or certain positive dependence.

  • If p-values are miscalibrated (model mismatch), realized FDR can drift.

  • Day-level aggregation can miss finer within-day effects.

Small worked example

Suppose q=0.02 and sorted p-values are: [0.001, 0.012, 0.08, 0.2]. Thresholds are [0.005, 0.01, 0.015, 0.02]. Only the first p-value passes, so one day is flagged.

References

[BH1995]

Benjamini, Y., & Hochberg, Y. (1995). “Controlling the false discovery rate: a practical and powerful approach to multiple testing.” Journal of the Royal Statistical Society Series B, 57(1), 289-300.