pqc.detect.bad_measurements

Detect bad measurements using OU innovations and day-level FDR control.

This module implements a two-stage statistical procedure:

  1. Model short-timescale correlated residuals with an Ornstein–Uhlenbeck (OU) process and compute normalized innovations \(z_i\).

  2. Control multiplicity across days by converting daily maxima of \(|z_i|\) into p-values and applying Benjamini–Hochberg false discovery rate (FDR) control.

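The OU step is used because pulsar timing residuals can be locally correlated; whitening them into innovations removes that correlation, so a single discrepant point can be judged against its own expected scatter rather than against a correlated excursion. The day-level FDR step limits the expected fraction of falsely flagged days when many days are tested at once.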

Notes

Definition (OU innovations)

For irregularly sampled times \(t_i\), residuals \(y_i\), and correlation timescale \(\tau\), an OU predictor uses \(\phi_i = \exp(-(t_i-t_{i-1})/\tau)\). The innovation is \(e_i = y_i - \phi_i y_{i-1}\) and the normalized innovation is \(z_i = e_i / \sqrt{\mathrm{Var}(e_i)}\).
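
The definition above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the module's implementation: the helper name `ou_innovations` and the `sigma2` parameter are hypothetical. It assumes strictly increasing times and a stationary OU process with marginal variance `sigma2` and no additional measurement noise, in which case \(\mathrm{Var}(e_i) = \sigma^2 (1 - \phi_i^2)\).

```python
import math


def ou_innovations(t, y, tau, sigma2):
    """Normalized OU innovations z_i for every sample that has a predecessor.

    Hypothetical helper. Assumes strictly increasing times and a stationary
    OU process with marginal variance sigma2 and no extra measurement noise,
    so that Var(e_i) = sigma2 * (1 - phi_i**2).
    """
    z = []
    for i in range(1, len(t)):
        phi = math.exp(-(t[i] - t[i - 1]) / tau)   # decay over the gap
        e = y[i] - phi * y[i - 1]                  # innovation
        var_e = sigma2 * (1.0 - phi * phi)         # innovation variance
        z.append(e / math.sqrt(var_e))
    return z


# Three residuals with unequal spacing; the first sample has no innovation.
z = ou_innovations([0.0, 1.0, 3.0], [1.0, 0.5, 2.0], tau=2.0, sigma2=1.0)
```

Longer gaps drive \(\phi_i\) toward zero, so the prediction falls back to the process mean and the innovation variance approaches the marginal variance.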

Definition (FDR)

FDR is \(\mathbb{E}[V/\max(R,1)]\), where \(V\) is the number of false rejections and \(R\) is the total number of rejections.

Assumptions
  • Innovations are approximately Gaussian under the null model.

  • Day-level tests are independent or positively dependent, as BH requires in order to control the FDR at the nominal level.

  • A daily max-\(|z|\) statistic captures day-level contamination.
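
One way to turn a daily maximum of \(|z_i|\) into a p-value under these assumptions is sketched below. The conversion shown is an assumption, not necessarily the one this module uses: it treats the \(n\) innovations within a day as independent standard normals under the null, so that \(P(\max |z| \ge m) = 1 - (1 - p_1)^n\) with \(p_1\) the two-sided Gaussian tail probability of \(m\). The helper name `daily_max_pvalues` and the grouping by integer MJD are likewise hypothetical.

```python
import math
from collections import defaultdict


def daily_max_pvalues(mjd, z):
    """Map each integer MJD day to a p-value for its largest |z|.

    Hypothetical helper. Assumes the innovations within a day are
    independent N(0, 1) under the null model.
    """
    by_day = defaultdict(list)
    for t, zi in zip(mjd, z):
        by_day[math.floor(t)].append(abs(zi))

    pvals = {}
    for day, abs_z in sorted(by_day.items()):
        # Two-sided Gaussian tail probability of the day's maximum.
        p_single = math.erfc(max(abs_z) / math.sqrt(2.0))
        # Probability that the max of n independent |z| is at least this large.
        pvals[day] = 1.0 - (1.0 - p_single) ** len(abs_z)
    return pvals


p = daily_max_pvalues([58000.1, 58000.7, 58001.2], [0.5, -3.0, 1.0])
```

Days with more measurements get a larger p-value for the same maximum, which accounts for the extra chances a busy day has to produce an extreme value.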

Interpretation

A flagged day contains at least one measurement that is unusually inconsistent with the OU noise model, judged at the chosen FDR level.

Caveats
  • Misspecified \(\tau\) or variance can distort p-values.

  • Daily aggregation can lose within-day structure.

  • Heavy tails can inflate false positives if Gaussian tails are assumed.

Worked Example

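Suppose day-level p-values are [0.001, 0.012, 0.08, 0.2] and fdr_q = 0.02. With \(m = 4\) days, the BH thresholds \(k \cdot \mathrm{fdr\_q} / m\) for the sorted p-values (ranks \(k = 1, \dots, 4\)) are [0.005, 0.01, 0.015, 0.02]. BH rejects every p-value up to the largest rank whose p-value is at or below its threshold. Here 0.001 ≤ 0.005, but 0.012 > 0.01 and no later p-value meets its threshold, so only the first day is flagged.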
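
The worked example can be checked with a minimal implementation of the Benjamini–Hochberg step-up rule. This is a sketch for illustration; the function name `benjamini_hochberg` is hypothetical and is not part of this module's API.

```python
def benjamini_hochberg(pvalues, q):
    """Return a list of booleans, True where BH rejects at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k_max = rank

    # Reject the k_max smallest p-values (none if k_max == 0).
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


flags = benjamini_hochberg([0.001, 0.012, 0.08, 0.2], q=0.02)
print(flags)  # [True, False, False, False]
```

Because the rule is step-up, a p-value above its own threshold is still rejected whenever a larger p-value meets its threshold; that case does not arise in this example.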

Functions

detect_bad(df, *[, mjd_col, resid_col, ...])

Flag bad measurements using OU innovations and BH-FDR on days.