Overview
========

The EPTA Data Combination Pipeline (``pleb``) is a Python toolkit for combining
pulsar timing data across observing backends and generating diagnostics that
highlight timing-model quality, residual behavior, and data-set consistency.
It wraps standard timing workflows (via tempo2 outputs) and adds a structured
reporting layer aimed at PTA-style data sets, where long baselines and multiple
instruments are the norm. [Edwards2006]_ [Hobbs2006]_

If you are new to pulsar timing, the key object is the **timing residual**:
the difference between an observed pulse time-of-arrival (TOA) and the model
prediction. Residuals summarize how well the timing model explains the data,
and they are the primary input to PTA diagnostics and noise modeling. [Lorimer2005]_ [Stairs2003]_

The pipeline is designed to help you:

- Compare timing residuals and summary statistics across git branches or data
  combinations.
- Generate residual plots, covariance heatmaps, and change reports to track
  the impact of model or data updates.
- Run optional quality-control (QC) stages that detect outliers and transient
  behavior in residuals.
- Produce per-pulsar reports suitable for review by analysts new to the data set.

Pulsar timing context
---------------------

Pulsar timing models predict TOAs using a mixture of deterministic and
stochastic terms. Deterministic terms include spin-down, astrometry, and
binary motion; stochastic terms include red timing noise and dispersion-measure
(DM) variations. The pipeline itself does not perform full stochastic modeling,
but it surfaces residual patterns that commonly motivate those models. [Coles2011]_ [Keith2013]_

For example, residuals can be summarized by a reduced chi-square statistic:

.. math::

   \chi^2_\nu = \frac{1}{N - p} \sum_{i=1}^{N} \frac{r_i^2}{\sigma_i^2},

where :math:`r_i` are the residuals, :math:`\sigma_i` are the TOA uncertainties,
:math:`N` is the number of TOAs, and :math:`p` is the number of fitted
parameters. A value of :math:`\chi^2_\nu` far from 1, or structure in the
residuals versus time, frequency, or backend, indicates that the model is
incomplete or that the data contain systematic effects. [Edwards2006]_ [Hobbs2006]_
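
The reduced chi-square statistic above can be sketched in a few lines of plain
Python. This is an illustration only, not part of the ``pleb`` API: the
function name ``reduced_chi_square`` and the sample values are assumptions made
for this example, and the residuals and uncertainties are taken to be in the
same units (microseconds here).

.. code-block:: python

   def reduced_chi_square(residuals, uncertainties, n_fit_params):
       """Return sum((r_i / sigma_i)**2) / (N - p) for N residuals and p fitted parameters."""
       if len(residuals) != len(uncertainties):
           raise ValueError("residuals and uncertainties must have the same length")
       dof = len(residuals) - n_fit_params  # degrees of freedom, N - p
       if dof <= 0:
           raise ValueError("need more residuals than fitted parameters")
       chi2 = sum((r / s) ** 2 for r, s in zip(residuals, uncertainties))
       return chi2 / dof

   # Made-up values in microseconds; not real timing data.
   residuals_us = [0.5, -1.0, 1.5, -0.5, 1.0]
   uncertainties_us = [1.0, 1.0, 1.0, 1.0, 1.0]

   chi2_nu = reduced_chi_square(residuals_us, uncertainties_us, n_fit_params=1)
   print(f"reduced chi-square = {chi2_nu:.4f}")

With these values the sum of squared normalized residuals is 4.75 over 4
degrees of freedom, so the statistic is close to 1, as expected when the model
and the quoted uncertainties are consistent with the scatter.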

Radio telescopes and backends
-----------------------------

PTA data sets combine observations from multiple radio telescopes and signal
processing backends. Backends can differ in bandwidth, channelization, and
time-tagging, which in turn affects residual scatter and systematic offsets.
This pipeline helps compare those backend-dependent behaviors by enforcing a
consistent metadata schema and plotting residuals grouped by backend. [Manchester2005]_
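
To make the idea of grouping residuals by backend concrete, the sketch below
computes a per-backend weighted mean and weighted RMS from a list of TOA
records. It is a hypothetical illustration rather than the pipeline's own
implementation: the record keys (``backend``, ``residual_us``, ``sigma_us``),
the backend labels, and the function name ``per_backend_summary`` are
assumptions made for this example and do not reflect the actual ``pleb``
metadata schema.

.. code-block:: python

   import math
   from collections import defaultdict

   def per_backend_summary(toas):
       """Group TOA records by backend and return weighted statistics per group."""
       groups = defaultdict(list)
       for toa in toas:
           groups[toa["backend"]].append(toa)

       summary = {}
       for backend, rows in groups.items():
           # Inverse-variance weights, w_i = 1 / sigma_i**2.
           weights = [1.0 / row["sigma_us"] ** 2 for row in rows]
           wsum = sum(weights)
           wmean = sum(w * row["residual_us"] for w, row in zip(weights, rows)) / wsum
           wvar = sum(
               w * (row["residual_us"] - wmean) ** 2 for w, row in zip(weights, rows)
           ) / wsum
           summary[backend] = {
               "n_toas": len(rows),
               "weighted_mean_us": wmean,
               "weighted_rms_us": math.sqrt(wvar),
           }
       return summary

   # Made-up records; backend labels and values are placeholders.
   toas = [
       {"backend": "BACKEND_A", "residual_us": 1.0, "sigma_us": 1.0},
       {"backend": "BACKEND_A", "residual_us": -1.0, "sigma_us": 1.0},
       {"backend": "BACKEND_B", "residual_us": 3.0, "sigma_us": 2.0},
       {"backend": "BACKEND_B", "residual_us": 5.0, "sigma_us": 2.0},
   ]

   summary = per_backend_summary(toas)
   for backend, stats in sorted(summary.items()):
       print(backend, stats)

In this toy input both backends show the same scatter, but ``BACKEND_B`` has a
weighted mean of 4 microseconds compared with 0 for ``BACKEND_A``, which is the
kind of systematic offset between instruments that grouped residual plots are
meant to expose.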

Where to start
--------------

If you are installing the package for the first time, begin with
:doc:`installation` and :doc:`quickstart`. For the timing concepts that appear
throughout the documentation, see :doc:`concepts`. For end-to-end workflows,
see :doc:`examples`.