API Reference

Run the data-combination diagnostics pipeline.

This package exposes a small public API for running the full pipeline, running parameter scans, and applying FixDataset operations programmatically. The implementation was refactored from the original notebooks and is designed to import quickly: heavy dependencies (GitPython, libstempo/pqc) are imported lazily, only when an entry point that needs them runs.
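The lazy-import idea described above can be sketched as follows. This is a minimal illustration rather than the package's actual code: `run_entry_point` is a hypothetical name, and the standard-library `json` module stands in for a heavy dependency such as GitPython.

```python
import importlib


def run_entry_point(module_name: str = "json"):
    """Import a dependency only when the entry point is called.

    `module_name` defaults to the stdlib `json` module as a stand-in for a
    heavy package; the function name is hypothetical.
    """
    # The import happens inside the function body, not at module import
    # time, so importing the enclosing package stays cheap.
    module = importlib.import_module(module_name)
    return module


loaded = run_entry_point()
```

With this layout, users who only build configuration objects would not pay the import cost of the heavy packages.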

Examples

Run the pipeline programmatically:

from pathlib import Path
from pleb import PipelineConfig, run_pipeline

cfg = PipelineConfig(
    home_dir=Path("/data/epta"),
    singularity_image=Path("/images/tempo2.sif"),
    dataset_name="EPTA",
)
outputs = run_pipeline(cfg)

Run a parameter scan:

from pathlib import Path
from pleb import PipelineConfig, run_param_scan

cfg = PipelineConfig(
    home_dir=Path("/data/epta"),
    singularity_image=Path("/images/tempo2.sif"),
    dataset_name="EPTA",
    param_scan_typical=True,
)
results = run_param_scan(cfg)

See also

pleb.pipeline.run_pipeline: Full pipeline implementation.

pleb.param_scan.run_param_scan: Parameter scan runner.

pleb.dataset_fix: FixDataset helpers.

class pleb.BinaryAnalysisConfig(only_models=None)[source]

Bases: object

Configuration for binary/orbital diagnostics derived from .par files.

Attributes

only_models : list of str, optional

If set, only report pulsars whose BINARY parameter matches one of these model names.
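The filtering rule can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: the helper names `binary_model` and `filter_by_model` are hypothetical, the pulsar names are made up, and matching is assumed to be an exact, case-sensitive comparison against the second token of the BINARY line.

```python
from typing import Dict, List, Optional


def binary_model(par_text: str) -> Optional[str]:
    """Return the value of the BINARY line in .par text, or None if absent."""
    for line in par_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "BINARY":
            return parts[1]
    return None


def filter_by_model(par_texts: Dict[str, str],
                    only_models: Optional[List[str]]) -> List[str]:
    """Keep pulsars whose BINARY model is in only_models (all binaries if None)."""
    selected = []
    for name, text in par_texts.items():
        model = binary_model(text)
        if model is None:
            continue  # no BINARY line: not a binary pulsar
        if only_models is None or model in only_models:
            selected.append(name)
    return selected


# Made-up .par fragments keyed by made-up pulsar names.
pars = {
    "J0000+0001": "PSRJ J0000+0001\nBINARY BTX\n",
    "J0000+0002": "PSRJ J0000+0002\nBINARY ELL1\n",
    "J0000+0003": "PSRJ J0000+0003\nF0 100.0\n",
}
btx_only = filter_by_model(pars, ["BTX"])
all_binaries = filter_by_model(pars, None)
```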

Examples

Limit output to BTX binaries:

from pleb import BinaryAnalysisConfig
cfg = BinaryAnalysisConfig(only_models=["BTX"])
only_models: List[str] | None
Parameters:

only_models (List[str] | None)

class pleb.FixDatasetConfig(apply=False, backup=True, dry_run=False, update_alltim_includes=True, min_toas_per_backend_tim=10, generate_alltim_variants=False, backend_classifications_path=None, alltim_variants_path=None, relabel_rules_path=None, overlap_rules_path=None, overlap_exact_catalog_path=None, jump_reference_variants=False, jump_reference_keep_tmp=False, jump_reference_jump_flag='-sys', jump_reference_csv_dir=None, tempo2_home_dir=None, tempo2_dataset_name=None, tempo2_singularity_image=None, required_tim_flags=<factory>, infer_system_flags=False, flag_sys_freq_rules_enabled=False, flag_sys_freq_rules_path=None, system_flag_table_path=None, system_flag_mapping_path=None, system_flag_overwrite_existing=False, wsrt_p2_force_sys_by_freq=False, wsrt_p2_prefer_dual_channel=False, wsrt_p2_mjd_tol_sec=9.9e-07, wsrt_p2_action='comment', wsrt_p2_comment_prefix='C WSRT_P2_PREFER_DUAL', backend_overrides=<factory>, raise_on_backend_missing=False, dedupe_toas_within_tim=True, dedupe_mjd_tol_sec=0.0, dedupe_freq_tol_mhz=None, dedupe_freq_tol_auto=False, check_duplicate_backend_tims=False, remove_overlaps_exact=True, insert_missing_jumps=True, jump_flag='-sys', prune_stale_jumps=False, ensure_ephem=None, ensure_clk=None, ensure_ne_sw=None, force_ne_sw_overwrite=False, remove_patterns=<factory>, coord_convert=None, qc_remove_outliers=False, qc_outlier_cols=None, qc_action='comment', qc_comment_prefix='C QC_OUTLIER', qc_backend_col='sys', qc_remove_bad=True, qc_remove_transients=False, qc_remove_solar=False, qc_solar_action='comment', qc_solar_comment_prefix='# QC_SOLAR', qc_remove_orbital_phase=False, qc_orbital_phase_action='comment', qc_orbital_phase_comment_prefix='# QC_BIANRY_ECLIPSE', qc_merge_tol_days=2.3148148148148147e-05, qc_results_dir=None, qc_branch=None)[source]

Bases: object

Controls for dataset fixing utilities.

These features are adapted from the FixDataset notebook. They are disabled by default because automatic edits can be repo-specific.

Notes

Use apply=False for report-only runs. When apply=True, this module will modify .par/.tim files and optionally create backups.
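How the apply, backup, and dry_run switches could interact is sketched below. This is a hedged illustration, not the module's code: `write_fix` is a hypothetical helper, and the precedence shown (dry_run suppresses writes even when apply=True) is an assumption. The `.orig` suffix follows the backup attribute description.

```python
import tempfile
from pathlib import Path


def write_fix(path: Path, new_text: str, apply: bool = False,
              backup: bool = True, dry_run: bool = False) -> bool:
    """Return True if the file would change; write only when allowed."""
    old_text = path.read_text()
    changed = old_text != new_text
    if not changed or not apply or dry_run:
        return changed  # report-only: nothing is written to disk
    if backup:
        # Keep the original next to the edited file, e.g. example.par.orig
        Path(str(path) + ".orig").write_text(old_text)
    path.write_text(new_text)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    par = Path(tmp) / "example.par"
    par.write_text("EPHEM DE421\n")

    # Default apply=False: the change is reported but the file is untouched.
    reported = write_fix(par, "EPHEM DE440\n")
    untouched = par.read_text()

    # apply=True with the default backup=True writes the file and a backup.
    applied = write_fix(par, "EPHEM DE440\n", apply=True)
    final_text = par.read_text()
    backup_text = Path(str(par) + ".orig").read_text()
```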

Parameters:
  • apply (bool)

  • backup (bool)

  • dry_run (bool)

  • update_alltim_includes (bool)

  • min_toas_per_backend_tim (int)

  • generate_alltim_variants (bool)

  • backend_classifications_path (str | None)

  • alltim_variants_path (str | None)

  • relabel_rules_path (str | None)

  • overlap_rules_path (str | None)

  • overlap_exact_catalog_path (str | None)

  • jump_reference_variants (bool)

  • jump_reference_keep_tmp (bool)

  • jump_reference_jump_flag (str)

  • jump_reference_csv_dir (str | None)

  • tempo2_home_dir (str | None)

  • tempo2_dataset_name (str | None)

  • tempo2_singularity_image (str | None)

  • required_tim_flags (Dict[str, str])

  • infer_system_flags (bool)

  • flag_sys_freq_rules_enabled (bool)

  • flag_sys_freq_rules_path (str | None)

  • system_flag_table_path (str | None)

  • system_flag_mapping_path (str | None)

  • system_flag_overwrite_existing (bool)

  • wsrt_p2_force_sys_by_freq (bool)

  • wsrt_p2_prefer_dual_channel (bool)

  • wsrt_p2_mjd_tol_sec (float)

  • wsrt_p2_action (str)

  • wsrt_p2_comment_prefix (str)

  • backend_overrides (Dict[str, str])

  • raise_on_backend_missing (bool)

  • dedupe_toas_within_tim (bool)

  • dedupe_mjd_tol_sec (float)

  • dedupe_freq_tol_mhz (float | None)

  • dedupe_freq_tol_auto (bool)

  • check_duplicate_backend_tims (bool)

  • remove_overlaps_exact (bool)

  • insert_missing_jumps (bool)

  • jump_flag (str)

  • prune_stale_jumps (bool)

  • ensure_ephem (str | None)

  • ensure_clk (str | None)

  • ensure_ne_sw (str | None)

  • force_ne_sw_overwrite (bool)

  • remove_patterns (List[str])

  • coord_convert (str | None)

  • qc_remove_outliers (bool)

  • qc_outlier_cols (List[str] | None)

  • qc_action (str)

  • qc_comment_prefix (str)

  • qc_backend_col (str)

  • qc_remove_bad (bool)

  • qc_remove_transients (bool)

  • qc_remove_solar (bool)

  • qc_solar_action (str)

  • qc_solar_comment_prefix (str)

  • qc_remove_orbital_phase (bool)

  • qc_orbital_phase_action (str)

  • qc_orbital_phase_comment_prefix (str)

  • qc_merge_tol_days (float)

  • qc_results_dir (Path | None)

  • qc_branch (str | None)

apply

Apply changes when True; otherwise run report-only.

Type:

bool

backup

Create .orig backups when applying.

Type:

bool

dry_run

Compute changes without writing to disk.

Type:

bool

update_alltim_includes

Update INCLUDE lines in *_all.tim.

Type:

bool

min_toas_per_backend_tim

Minimum TOAs for a backend tim to be included.

Type:

int

generate_alltim_variants

Generate additional *_all.<variant>.tim files.

Type:

bool

backend_classifications_path

TOML with class->system mappings.

Type:

str | None

alltim_variants_path

TOML with variant selection rules.

Type:

str | None

relabel_rules_path

TOML with declarative TOA relabel rules.

Type:

str | None

overlap_rules_path

TOML with declarative overlap/dedup preference rules.

Type:

str | None

overlap_exact_catalog_path

TOML keep->drop map used by remove_overlaps_exact.

Type:

str | None

jump_reference_variants

Build per-variant reference-system JUMP parfiles.

Type:

bool

required_tim_flags

Flags to ensure on each TOA line.

Type:

Dict[str, str]

infer_system_flags

Infer -sys/-group/-pta flags.

Type:

bool

flag_sys_freq_rules_enabled

Enable YAML-based system/frequency rules.

Type:

bool

flag_sys_freq_rules_path

Path to flag_sys_freq_rules.yaml.

Type:

str | None

system_flag_table_path

Path to the system-flag table (JSON/TOML).

Type:

str | None

system_flag_mapping_path

Path to editable mapping/allowlist JSON.

Type:

str | None

system_flag_overwrite_existing

Overwrite existing system flags.

Type:

bool

backend_overrides

Map tim basename to backend name override.

Type:

Dict[str, str]

raise_on_backend_missing

Raise when backend cannot be inferred.

Type:

bool

dedupe_toas_within_tim

Remove duplicate TOAs within each tim.

Type:

bool

check_duplicate_backend_tims

Detect duplicated backend tims.

Type:

bool

remove_overlaps_exact

Remove known overlapping TOAs across backends.

Type:

bool

insert_missing_jumps

Insert missing JUMP lines into par files.

Type:

bool

jump_flag

Flag used to label inserted jumps.

Type:

str

prune_stale_jumps

Drop JUMP lines not present in timfile flags.

Type:

bool

ensure_ephem

Ensure EPHEM param exists (optional value).

Type:

str | None

ensure_clk

Ensure CLK param exists (optional value).

Type:

str | None

ensure_ne_sw

Ensure NE_SW param exists (optional value).

Type:

str | None

force_ne_sw_overwrite

Overwrite existing NE_SW only when True.

Type:

bool

remove_patterns

Remove lines matching these patterns.

Type:

List[str]

coord_convert

Coordinate conversion mode (equ2ecl or ecl2equ).

Type:

str | None

qc_remove_outliers

Apply/remove outliers flagged by PQC.

Type:

bool

qc_action

Action for flagged TOAs: comment or delete.

Type:

str

qc_comment_prefix

Prefix for commented-out TOAs.

Type:

str

qc_backend_col

Backend column for matching QC results.

Type:

str

qc_remove_bad

Apply bad/bad_day flags from QC.

Type:

bool

qc_remove_transients

Apply transient flags from QC.

Type:

bool

qc_remove_solar

Apply solar-elongation flags from QC.

Type:

bool

qc_solar_action

Action for solar-flagged TOAs: comment or delete.

Type:

str

qc_solar_comment_prefix

Prefix for solar-flagged TOA comments.

Type:

str

qc_remove_orbital_phase

Apply orbital-phase flags from QC.

Type:

bool

qc_orbital_phase_action

Action for orbital-phase-flagged TOAs: comment or delete.

Type:

str

qc_orbital_phase_comment_prefix

Prefix for orbital-phase TOA comments.

Type:

str

qc_merge_tol_days

MJD tolerance for QC matching.

Type:

float

qc_results_dir

Directory containing QC CSV outputs.

Type:

pathlib.Path | None

qc_branch

Subdirectory for QC results (optional).

Type:

str | None
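The QC-related attributes above (qc_action, qc_comment_prefix, qc_merge_tol_days) suggest a match-then-edit step, which can be sketched as follows. Everything here is an assumption made for illustration: `apply_qc` is a hypothetical helper, the TOA layout (file, frequency, MJD, error, site, flags) is assumed, and the file names and MJDs are made up. Note that the default qc_merge_tol_days corresponds to about 2 seconds expressed in days.

```python
from typing import List

SECONDS_PER_DAY = 86400.0
# 2 seconds expressed in days; close to the default qc_merge_tol_days.
TOL_DAYS = 2.0 / SECONDS_PER_DAY


def apply_qc(tim_lines: List[str], flagged_mjds: List[float],
             action: str = "comment", prefix: str = "C QC_OUTLIER",
             tol_days: float = TOL_DAYS) -> List[str]:
    """Comment out or delete TOA lines whose MJD matches a flagged MJD."""
    out = []
    for line in tim_lines:
        fields = line.split()
        # Assumed TOA layout: file, frequency, MJD, error, site, then flags.
        try:
            mjd = float(fields[2])
        except (IndexError, ValueError):
            out.append(line)  # not a TOA line (e.g. "FORMAT 1"); keep as-is
            continue
        is_flagged = any(abs(mjd - m) <= tol_days for m in flagged_mjds)
        if not is_flagged:
            out.append(line)
        elif action == "comment":
            out.append(f"{prefix} {line}")
        elif action == "delete":
            continue
        else:
            raise ValueError(f"unknown action: {action!r}")
    return out


tim = [
    "FORMAT 1",
    "a.ar 1400.0 55000.1000000 1.2 eff -sys EFF.A",
    "b.ar 1400.0 55001.2000000 1.5 eff -sys EFF.A",
]
# The flagged MJD differs from the second TOA by 1e-5 days, inside the tolerance.
commented = apply_qc(tim, [55001.20001])
deleted = apply_qc(tim, [55001.20001], action="delete")
```

Commenting rather than deleting keeps the original TOA recoverable, which may be why comment is the default action.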

Examples

Run a report-only pass for a pulsar:

from pathlib import Path
from pleb import FixDatasetConfig
cfg = FixDatasetConfig()
report = fix_pulsar_dataset(Path("/data/epta/J1234+5678"), cfg)
alltim_variants_path: str | None
apply: bool
backend_classifications_path: str | None
backend_overrides: Dict[str, str]
backup: bool
check_duplicate_backend_tims: bool
coord_convert: str | None
dedupe_freq_tol_auto: bool
dedupe_freq_tol_mhz: float | None
dedupe_mjd_tol_sec: float
dedupe_toas_within_tim: bool
dry_run: bool
ensure_clk: str | None
ensure_ephem: str | None
ensure_ne_sw: str | None
flag_sys_freq_rules_enabled: bool
flag_sys_freq_rules_path: str | None
force_ne_sw_overwrite: bool
generate_alltim_variants: bool
infer_system_flags: bool
insert_missing_jumps: bool
jump_flag: str
jump_reference_csv_dir: str | None
jump_reference_jump_flag: str
jump_reference_keep_tmp: bool
jump_reference_variants: bool
min_toas_per_backend_tim: int
overlap_exact_catalog_path: str | None
overlap_rules_path: str | None
prune_stale_jumps: bool
qc_action: str
qc_backend_col: str
qc_branch: str | None
qc_comment_prefix: str
qc_merge_tol_days: float
qc_orbital_phase_action: str
qc_orbital_phase_comment_prefix: str
qc_outlier_cols: List[str] | None
qc_remove_bad: bool
qc_remove_orbital_phase: bool
qc_remove_outliers: bool
qc_remove_solar: bool
qc_remove_transients: bool
qc_results_dir: Path | None
qc_solar_action: str
qc_solar_comment_prefix: str
raise_on_backend_missing: bool
relabel_rules_path: str | None
remove_overlaps_exact: bool
remove_patterns: List[str]
required_tim_flags: Dict[str, str]
system_flag_mapping_path: str | None
system_flag_overwrite_existing: bool
system_flag_table_path: str | None
tempo2_dataset_name: str | None
tempo2_home_dir: str | None
tempo2_singularity_image: str | None
update_alltim_includes: bool
wsrt_p2_action: str
wsrt_p2_comment_prefix: str
wsrt_p2_force_sys_by_freq: bool
wsrt_p2_mjd_tol_sec: float
wsrt_p2_prefer_dual_channel: bool
class pleb.IngestConfig(ingest_mapping_file=None, ingest_output_dir=None, home_dir=None, dataset_name=None, ingest_verify=False, ingest_commit_branch_name=None, ingest_commit_base_branch=None, ingest_commit_message=None)[source]

Bases: object

Configuration model for ingest-only CLI mode.

Notes

This config intentionally decouples ingest from full pipeline execution so users can ingest/commit datasets without specifying Tempo2/PQC settings.

dataset_name: str | None
static from_dict(d)[source]
Parameters:

d (Dict[str, Any])

Return type:

IngestConfig

home_dir: Path | None
ingest_commit_base_branch: str | None
ingest_commit_branch_name: str | None
ingest_commit_message: str | None
ingest_mapping_file: Path | None
ingest_output_dir: Path | None
ingest_verify: bool
static load(path)[source]
Parameters:

path (Path)

Return type:

IngestConfig

resolved_output_root()[source]

Resolve ingest output root from explicit or fallback settings.

Return type:

Path

to_dict()[source]
Return type:

Dict[str, Any]

Parameters:
  • ingest_mapping_file (Path | None)

  • ingest_output_dir (Path | None)

  • home_dir (Path | None)

  • dataset_name (str | None)

  • ingest_verify (bool)

  • ingest_commit_branch_name (str | None)

  • ingest_commit_base_branch (str | None)

  • ingest_commit_message (str | None)

class pleb.ParamScanConfig(home_dir, singularity_image, dataset_name=None, results_dir=PosixPath('.'), reference_branch='main', pulsars='ALL', outdir_name=None, epoch='55000', force_rerun=False, jobs=1, cleanup_output_tree=True, cleanup_work_dir=False, param_scan_typical=True, param_scan_dm_redchisq_threshold=2.0, param_scan_dm_max_order=4, param_scan_btx_max_fb=3)[source]

Bases: object

Configuration model for parameter-scan mode.

Notes

Statistical scan thresholds in this model (for example reduced chi-square gates and tested derivative orders) are passed through to pleb.param_scan; this class only stores and validates values.
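For readers unfamiliar with a reduced chi-square gate, the sketch below shows one plausible reading of how a threshold and a maximum derivative order might be combined. It is not taken from pleb.param_scan: the function names, the selection rule (lowest order whose reduced chi-square falls at or below the threshold), and the numbers are all assumptions for illustration.

```python
from typing import Dict, Optional


def reduced_chisq(chisq: float, n_toas: int, n_free: int) -> float:
    """Chi-square divided by the degrees of freedom."""
    dof = n_toas - n_free
    if dof <= 0:
        raise ValueError("need more TOAs than free parameters")
    return chisq / dof


def first_order_passing(chisq_by_order: Dict[int, float], n_toas: int,
                        base_free: int, threshold: float = 2.0,
                        max_order: int = 4) -> Optional[int]:
    """Return the lowest tested order whose reduced chi-square passes the gate."""
    for order in range(0, max_order + 1):
        if order not in chisq_by_order:
            continue
        # Assumption: each extra derivative order adds one free parameter.
        value = reduced_chisq(chisq_by_order[order], n_toas, base_free + order)
        if value <= threshold:
            return order
    return None


# Made-up chi-square values for fits with 0, 1, and 2 derivative orders.
fits = {0: 900.0, 1: 450.0, 2: 180.0}
chosen = first_order_passing(fits, n_toas=110, base_free=8)
```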

cleanup_output_tree: bool
cleanup_work_dir: bool
dataset_name: str | None
epoch: str
force_rerun: bool
static from_dict(d)[source]
Parameters:

d (Dict[str, Any])

Return type:

ParamScanConfig

home_dir: Path
jobs: int
static load(path)[source]
Parameters:

path (Path)

Return type:

ParamScanConfig

outdir_name: str | None
param_scan_btx_max_fb: int
param_scan_dm_max_order: int
param_scan_dm_redchisq_threshold: float
param_scan_typical: bool
pulsars: str | List[str]
reference_branch: str
results_dir: Path
singularity_image: Path
to_dict()[source]
Return type:

Dict[str, Any]

to_pipeline_config()[source]
Return type:

PipelineConfig

Parameters:
  • home_dir (Path)

  • singularity_image (Path)

  • dataset_name (str | None)

  • results_dir (Path)

  • reference_branch (str)

  • pulsars (str | List[str])

  • outdir_name (str | None)

  • epoch (str)

  • force_rerun (bool)

  • jobs (int)

  • cleanup_output_tree (bool)

  • cleanup_work_dir (bool)

  • param_scan_typical (bool)

  • param_scan_dm_redchisq_threshold (float)

  • param_scan_dm_max_order (int)

  • param_scan_btx_max_fb (int)

class pleb.PipelineConfig(home_dir, singularity_image, dataset_name=None, results_dir=PosixPath('.'), branches=<factory>, reference_branch='main', pulsars='ALL', outdir_name=None, cleanup_output_tree=True, cleanup_work_dir=False, epoch='55000', force_rerun=False, jobs=1, run_tempo2=True, make_toa_coverage_plots=True, make_change_reports=True, make_covariance_heatmaps=True, make_residual_plots=True, make_outlier_reports=True, make_plots=None, make_reports=None, make_covmat=None, testing_mode=False, run_pqc=False, pqc_backend_col='group', pqc_drop_unmatched=False, pqc_merge_tol_seconds=2.0, pqc_tau_corr_minutes=30.0, pqc_fdr_q=0.01, pqc_mark_only_worst_per_day=True, pqc_tau_rec_days=7.0, pqc_window_mult=5.0, pqc_min_points=6, pqc_delta_chi2_thresh=25.0, pqc_exp_dip_min_duration_days=21.0, pqc_step_enabled=True, pqc_step_min_points=20, pqc_step_delta_chi2_thresh=25.0, pqc_step_scope='both', pqc_dm_step_enabled=True, pqc_dm_step_min_points=20, pqc_dm_step_delta_chi2_thresh=25.0, pqc_dm_step_scope='both', pqc_robust_enabled=True, pqc_robust_z_thresh=5.0, pqc_robust_scope='both', pqc_add_orbital_phase=True, pqc_add_solar_elongation=True, pqc_add_elevation=False, pqc_add_airmass=False, pqc_add_parallactic_angle=False, pqc_add_freq_bin=False, pqc_freq_bins=8, pqc_observatory_path=None, pqc_structure_mode='none', pqc_structure_detrend_features=<factory>, pqc_structure_test_features=<factory>, pqc_structure_nbins=12, pqc_structure_min_per_bin=3, pqc_structure_p_thresh=0.01, pqc_structure_circular_features=<factory>, pqc_structure_group_cols=None, pqc_outlier_gate_enabled=False, pqc_outlier_gate_sigma=3.0, pqc_outlier_gate_resid_col=None, pqc_outlier_gate_sigma_col=None, pqc_event_instrument=False, pqc_solar_events_enabled=False, pqc_solar_approach_max_deg=30.0, pqc_solar_min_points_global=30, pqc_solar_min_points_year=10, pqc_solar_min_points_near_zero=3, pqc_solar_tau_min_deg=2.0, pqc_solar_tau_max_deg=60.0, pqc_solar_member_eta=1.0, pqc_solar_freq_dependence=True, 
pqc_solar_freq_alpha_min=0.0, pqc_solar_freq_alpha_max=4.0, pqc_solar_freq_alpha_tol=0.001, pqc_solar_freq_alpha_max_iter=64, pqc_orbital_phase_cut_enabled=False, pqc_orbital_phase_cut_center=0.25, pqc_orbital_phase_cut=None, pqc_orbital_phase_cut_sigma=3.0, pqc_orbital_phase_cut_nbins=18, pqc_orbital_phase_cut_min_points=20, pqc_eclipse_events_enabled=False, pqc_eclipse_center_phase=0.25, pqc_eclipse_min_points=30, pqc_eclipse_width_min=0.01, pqc_eclipse_width_max=0.5, pqc_eclipse_member_eta=1.0, pqc_eclipse_freq_dependence=True, pqc_eclipse_freq_alpha_min=0.0, pqc_eclipse_freq_alpha_max=4.0, pqc_eclipse_freq_alpha_tol=0.001, pqc_eclipse_freq_alpha_max_iter=64, pqc_gaussian_bump_enabled=False, pqc_gaussian_bump_min_duration_days=60.0, pqc_gaussian_bump_max_duration_days=1500.0, pqc_gaussian_bump_n_durations=6, pqc_gaussian_bump_min_points=20, pqc_gaussian_bump_delta_chi2_thresh=25.0, pqc_gaussian_bump_suppress_overlap=True, pqc_gaussian_bump_member_eta=1.0, pqc_gaussian_bump_freq_dependence=True, pqc_gaussian_bump_freq_alpha_min=0.0, pqc_gaussian_bump_freq_alpha_max=4.0, pqc_gaussian_bump_freq_alpha_tol=0.001, pqc_gaussian_bump_freq_alpha_max_iter=64, pqc_glitch_enabled=False, pqc_glitch_min_points=30, pqc_glitch_delta_chi2_thresh=25.0, pqc_glitch_suppress_overlap=True, pqc_glitch_member_eta=1.0, pqc_glitch_peak_tau_days=30.0, pqc_glitch_noise_k=1.0, pqc_glitch_mean_window_days=180.0, pqc_glitch_min_duration_days=1000.0, pqc_backend_profiles_path=None, qc_report=False, qc_report_backend_col=None, qc_report_backend=None, qc_report_dir=None, qc_report_no_plots=False, qc_report_structure_group_cols=None, qc_report_no_feature_plots=False, qc_report_compact_pdf=False, qc_report_compact_pdf_name='qc_compact_report.pdf', qc_report_compact_outlier_cols=None, qc_cross_pulsar_enabled=False, qc_cross_pulsar_time_col=None, qc_cross_pulsar_window_days=1.0, qc_cross_pulsar_min_pulsars=2, qc_cross_pulsar_include_outliers=True, qc_cross_pulsar_include_events=True, 
qc_cross_pulsar_outlier_cols=None, qc_cross_pulsar_event_cols=None, qc_cross_pulsar_dir=None, run_fix_dataset=False, make_binary_analysis=False, param_scan_typical=False, param_scan_dm_redchisq_threshold=2.0, param_scan_dm_max_order=4, param_scan_btx_max_fb=3, fix_apply=False, fix_branch_name=None, fix_base_branch=None, fix_commit_message=None, fix_backup=True, fix_dry_run=False, fix_update_alltim_includes=True, fix_min_toas_per_backend_tim=10, fix_required_tim_flags=<factory>, fix_system_flag_mapping_path=None, fix_system_flag_table_path=None, fix_flag_sys_freq_rules_enabled=False, fix_flag_sys_freq_rules_path=None, fix_generate_alltim_variants=False, fix_backend_classifications_path=None, fix_alltim_variants_path=None, fix_relabel_rules_path=None, fix_overlap_rules_path=None, fix_overlap_exact_catalog_path=None, fix_jump_reference_variants=False, fix_jump_reference_keep_tmp=False, fix_jump_reference_jump_flag='-sys', fix_jump_reference_csv_dir=None, fix_infer_system_flags=False, fix_system_flag_overwrite_existing=False, fix_wsrt_p2_force_sys_by_freq=False, fix_wsrt_p2_prefer_dual_channel=False, fix_wsrt_p2_mjd_tol_sec=9.9e-07, fix_wsrt_p2_action='comment', fix_wsrt_p2_comment_prefix='C WSRT_P2_PREFER_DUAL', fix_backend_overrides=<factory>, fix_raise_on_backend_missing=False, fix_dedupe_toas_within_tim=True, fix_dedupe_mjd_tol_sec=0.0, fix_dedupe_freq_tol_mhz=None, fix_dedupe_freq_tol_auto=False, fix_check_duplicate_backend_tims=False, fix_remove_overlaps_exact=True, fix_insert_missing_jumps=True, fix_jump_flag='-sys', fix_prune_stale_jumps=False, fix_ensure_ephem=None, fix_ensure_clk=None, fix_ensure_ne_sw=None, fix_force_ne_sw_overwrite=False, fix_remove_patterns=<factory>, fix_coord_convert=None, fix_prune_missing_includes=True, fix_drop_small_backend_includes=True, fix_system_flag_update_table=True, fix_default_backend=None, fix_group_flag='-group', fix_pta_flag='-pta', fix_pta_value=None, fix_standardize_par_values=True, fix_prune_small_system_toas=False, 
fix_prune_small_system_flag='-sys', fix_qc_remove_outliers=False, fix_qc_outlier_cols=None, fix_qc_action='comment', fix_qc_comment_prefix='C QC_OUTLIER', fix_qc_backend_col='sys', fix_qc_remove_bad=True, fix_qc_remove_transients=False, fix_qc_remove_solar=False, fix_qc_solar_action='comment', fix_qc_solar_comment_prefix='# QC_SOLAR', fix_qc_remove_orbital_phase=False, fix_qc_orbital_phase_action='comment', fix_qc_orbital_phase_comment_prefix='# QC_BIANRY_ECLIPSE', fix_qc_merge_tol_days=2.3148148148148147e-05, fix_qc_results_dir=None, fix_qc_branch=None, binary_only_models=None, dpi=120, max_covmat_params=None, ingest_mapping_file=None, ingest_output_dir=None, ingest_commit_branch=True, ingest_commit_branch_name=None, ingest_commit_base_branch=None, ingest_commit_message=None, ingest_verify=False)[source]

Bases: object

Configure the data-combination pipeline.

The configuration is intentionally flat so that it can be serialized to JSON/TOML and overridden via CLI flags without nested structures. Most fields correspond directly to CLI options and pipeline stages.
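The benefit of a flat layout can be illustrated with a small stand-in. `MiniConfig`, `to_json`, and `with_overrides` are hypothetical names used only for this sketch; the real class has many more fields and its own serialization methods.

```python
import json
from dataclasses import asdict, dataclass, fields
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class MiniConfig:
    """Hypothetical three-field stand-in for a flat configuration."""
    home_dir: Path
    dataset_name: Optional[str] = None
    jobs: int = 1


def to_json(cfg: MiniConfig) -> str:
    # Path is not JSON-serializable, so unknown objects are converted with str().
    return json.dumps(asdict(cfg), default=str, sort_keys=True)


def with_overrides(cfg: MiniConfig, overrides: Dict[str, Any]) -> MiniConfig:
    """Apply CLI-style overrides; flat keys map one-to-one onto fields."""
    known = {f.name for f in fields(cfg)}
    unknown = set(overrides) - known
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return MiniConfig(**{**asdict(cfg), **overrides})


base = MiniConfig(home_dir=Path("/data/epta"), dataset_name="EPTA")
text = to_json(base)
fast = with_overrides(base, {"jobs": 8})
```

Because every key is top-level, an override such as a `--jobs 8` flag needs no path into nested sections.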

Notes

The dataset_name field is interpreted by resolved() as a filesystem path when it contains a path separator (/ or \) or starts with .; otherwise it is treated as a directory name under home_dir.
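The resolution rule above can be written out as a short function. This is a sketch of the documented rule only, not the body of resolved(): `resolve_dataset` is a hypothetical name, and returning home_dir when dataset_name is unset is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_dataset(home_dir: Path, dataset_name: Optional[str]) -> Path:
    """Map dataset_name to a directory following the documented rule."""
    if not dataset_name:
        # Assumption: with no dataset name, fall back to home_dir itself.
        return home_dir
    looks_like_path = (
        "/" in dataset_name
        or "\\" in dataset_name
        or dataset_name.startswith(".")
    )
    if looks_like_path:
        return Path(dataset_name)  # used as a filesystem path as given
    return home_dir / dataset_name  # plain name: directory under home_dir


home = Path("/data/epta")
by_name = resolve_dataset(home, "EPTA")
by_path = resolve_dataset(home, "/scratch/EPTA")
relative = resolve_dataset(home, "./EPTA")
```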

Examples

Basic construction and JSON save:

from pathlib import Path
from pleb import PipelineConfig

cfg = PipelineConfig(
    home_dir=Path("/data/epta"),
    singularity_image=Path("/images/tempo2.sif"),
    dataset_name="EPTA",
)
cfg.save_json(Path("pipeline.json"))
Parameters:
  • home_dir (Path)

  • singularity_image (Path)

  • dataset_name (str | None)

  • results_dir (Path)

  • branches (List[str])

  • reference_branch (str)

  • pulsars (str | List[str])

  • outdir_name (str | None)

  • cleanup_output_tree (bool)

  • cleanup_work_dir (bool)

  • epoch (str)

  • force_rerun (bool)

  • jobs (int)

  • run_tempo2 (bool)

  • make_toa_coverage_plots (bool)

  • make_change_reports (bool)

  • make_covariance_heatmaps (bool)

  • make_residual_plots (bool)

  • make_outlier_reports (bool)

  • make_plots (bool | None)

  • make_reports (bool | None)

  • make_covmat (bool | None)

  • testing_mode (bool)

  • run_pqc (bool)

  • pqc_backend_col (str)

  • pqc_drop_unmatched (bool)

  • pqc_merge_tol_seconds (float)

  • pqc_tau_corr_minutes (float)

  • pqc_fdr_q (float)

  • pqc_mark_only_worst_per_day (bool)

  • pqc_tau_rec_days (float)

  • pqc_window_mult (float)

  • pqc_min_points (int)

  • pqc_delta_chi2_thresh (float)

  • pqc_exp_dip_min_duration_days (float)

  • pqc_step_enabled (bool)

  • pqc_step_min_points (int)

  • pqc_step_delta_chi2_thresh (float)

  • pqc_step_scope (str)

  • pqc_dm_step_enabled (bool)

  • pqc_dm_step_min_points (int)

  • pqc_dm_step_delta_chi2_thresh (float)

  • pqc_dm_step_scope (str)

  • pqc_robust_enabled (bool)

  • pqc_robust_z_thresh (float)

  • pqc_robust_scope (str)

  • pqc_add_orbital_phase (bool)

  • pqc_add_solar_elongation (bool)

  • pqc_add_elevation (bool)

  • pqc_add_airmass (bool)

  • pqc_add_parallactic_angle (bool)

  • pqc_add_freq_bin (bool)

  • pqc_freq_bins (int)

  • pqc_observatory_path (str | None)

  • pqc_structure_mode (str)

  • pqc_structure_detrend_features (List[str] | None)

  • pqc_structure_test_features (List[str] | None)

  • pqc_structure_nbins (int)

  • pqc_structure_min_per_bin (int)

  • pqc_structure_p_thresh (float)

  • pqc_structure_circular_features (List[str] | None)

  • pqc_structure_group_cols (List[str] | None)

  • pqc_outlier_gate_enabled (bool)

  • pqc_outlier_gate_sigma (float)

  • pqc_outlier_gate_resid_col (str | None)

  • pqc_outlier_gate_sigma_col (str | None)

  • pqc_event_instrument (bool)

  • pqc_solar_events_enabled (bool)

  • pqc_solar_approach_max_deg (float)

  • pqc_solar_min_points_global (int)

  • pqc_solar_min_points_year (int)

  • pqc_solar_min_points_near_zero (int)

  • pqc_solar_tau_min_deg (float)

  • pqc_solar_tau_max_deg (float)

  • pqc_solar_member_eta (float)

  • pqc_solar_freq_dependence (bool)

  • pqc_solar_freq_alpha_min (float)

  • pqc_solar_freq_alpha_max (float)

  • pqc_solar_freq_alpha_tol (float)

  • pqc_solar_freq_alpha_max_iter (int)

  • pqc_orbital_phase_cut_enabled (bool)

  • pqc_orbital_phase_cut_center (float)

  • pqc_orbital_phase_cut (float | None)

  • pqc_orbital_phase_cut_sigma (float)

  • pqc_orbital_phase_cut_nbins (int)

  • pqc_orbital_phase_cut_min_points (int)

  • pqc_eclipse_events_enabled (bool)

  • pqc_eclipse_center_phase (float)

  • pqc_eclipse_min_points (int)

  • pqc_eclipse_width_min (float)

  • pqc_eclipse_width_max (float)

  • pqc_eclipse_member_eta (float)

  • pqc_eclipse_freq_dependence (bool)

  • pqc_eclipse_freq_alpha_min (float)

  • pqc_eclipse_freq_alpha_max (float)

  • pqc_eclipse_freq_alpha_tol (float)

  • pqc_eclipse_freq_alpha_max_iter (int)

  • pqc_gaussian_bump_enabled (bool)

  • pqc_gaussian_bump_min_duration_days (float)

  • pqc_gaussian_bump_max_duration_days (float)

  • pqc_gaussian_bump_n_durations (int)

  • pqc_gaussian_bump_min_points (int)

  • pqc_gaussian_bump_delta_chi2_thresh (float)

  • pqc_gaussian_bump_suppress_overlap (bool)

  • pqc_gaussian_bump_member_eta (float)

  • pqc_gaussian_bump_freq_dependence (bool)

  • pqc_gaussian_bump_freq_alpha_min (float)

  • pqc_gaussian_bump_freq_alpha_max (float)

  • pqc_gaussian_bump_freq_alpha_tol (float)

  • pqc_gaussian_bump_freq_alpha_max_iter (int)

  • pqc_glitch_enabled (bool)

  • pqc_glitch_min_points (int)

  • pqc_glitch_delta_chi2_thresh (float)

  • pqc_glitch_suppress_overlap (bool)

  • pqc_glitch_member_eta (float)

  • pqc_glitch_peak_tau_days (float)

  • pqc_glitch_noise_k (float)

  • pqc_glitch_mean_window_days (float)

  • pqc_glitch_min_duration_days (float)

  • pqc_backend_profiles_path (str | None)

  • qc_report (bool)

  • qc_report_backend_col (str | None)

  • qc_report_backend (str | None)

  • qc_report_dir (Path | None)

  • qc_report_no_plots (bool)

  • qc_report_structure_group_cols (str | None)

  • qc_report_no_feature_plots (bool)

  • qc_report_compact_pdf (bool)

  • qc_report_compact_pdf_name (str)

  • qc_report_compact_outlier_cols (List[str] | None)

  • qc_cross_pulsar_enabled (bool)

  • qc_cross_pulsar_time_col (str | None)

  • qc_cross_pulsar_window_days (float)

  • qc_cross_pulsar_min_pulsars (int)

  • qc_cross_pulsar_include_outliers (bool)

  • qc_cross_pulsar_include_events (bool)

  • qc_cross_pulsar_outlier_cols (List[str] | None)

  • qc_cross_pulsar_event_cols (List[str] | None)

  • qc_cross_pulsar_dir (Path | None)

  • run_fix_dataset (bool)

  • make_binary_analysis (bool)

  • param_scan_typical (bool)

  • param_scan_dm_redchisq_threshold (float)

  • param_scan_dm_max_order (int)

  • param_scan_btx_max_fb (int)

  • fix_apply (bool)

  • fix_branch_name (str | None)

  • fix_base_branch (str | None)

  • fix_commit_message (str | None)

  • fix_backup (bool)

  • fix_dry_run (bool)

  • fix_update_alltim_includes (bool)

  • fix_min_toas_per_backend_tim (int)

  • fix_required_tim_flags (Dict[str, str])

  • fix_system_flag_mapping_path (str | None)

  • fix_system_flag_table_path (str | None)

  • fix_flag_sys_freq_rules_enabled (bool)

  • fix_flag_sys_freq_rules_path (str | None)

  • fix_generate_alltim_variants (bool)

  • fix_backend_classifications_path (str | None)

  • fix_alltim_variants_path (str | None)

  • fix_relabel_rules_path (str | None)

  • fix_overlap_rules_path (str | None)

  • fix_overlap_exact_catalog_path (str | None)

  • fix_jump_reference_variants (bool)

  • fix_jump_reference_keep_tmp (bool)

  • fix_jump_reference_jump_flag (str)

  • fix_jump_reference_csv_dir (str | None)

  • fix_infer_system_flags (bool)

  • fix_system_flag_overwrite_existing (bool)

  • fix_wsrt_p2_force_sys_by_freq (bool)

  • fix_wsrt_p2_prefer_dual_channel (bool)

  • fix_wsrt_p2_mjd_tol_sec (float)

  • fix_wsrt_p2_action (str)

  • fix_wsrt_p2_comment_prefix (str)

  • fix_backend_overrides (Dict[str, str])

  • fix_raise_on_backend_missing (bool)

  • fix_dedupe_toas_within_tim (bool)

  • fix_dedupe_mjd_tol_sec (float)

  • fix_dedupe_freq_tol_mhz (float | None)

  • fix_dedupe_freq_tol_auto (bool)

  • fix_check_duplicate_backend_tims (bool)

  • fix_remove_overlaps_exact (bool)

  • fix_insert_missing_jumps (bool)

  • fix_jump_flag (str)

  • fix_prune_stale_jumps (bool)

  • fix_ensure_ephem (str | None)

  • fix_ensure_clk (str | None)

  • fix_ensure_ne_sw (str | None)

  • fix_force_ne_sw_overwrite (bool)

  • fix_remove_patterns (List[str])

  • fix_coord_convert (str | None)

  • fix_prune_missing_includes (bool)

  • fix_drop_small_backend_includes (bool)

  • fix_system_flag_update_table (bool)

  • fix_default_backend (str | None)

  • fix_group_flag (str)

  • fix_pta_flag (str)

  • fix_pta_value (str | None)

  • fix_standardize_par_values (bool)

  • fix_prune_small_system_toas (bool)

  • fix_prune_small_system_flag (str)

  • fix_qc_remove_outliers (bool)

  • fix_qc_outlier_cols (List[str] | None)

  • fix_qc_action (str)

  • fix_qc_comment_prefix (str)

  • fix_qc_backend_col (str)

  • fix_qc_remove_bad (bool)

  • fix_qc_remove_transients (bool)

  • fix_qc_remove_solar (bool)

  • fix_qc_solar_action (str)

  • fix_qc_solar_comment_prefix (str)

  • fix_qc_remove_orbital_phase (bool)

  • fix_qc_orbital_phase_action (str)

  • fix_qc_orbital_phase_comment_prefix (str)

  • fix_qc_merge_tol_days (float)

  • fix_qc_results_dir (Path | None)

  • fix_qc_branch (str | None)

  • binary_only_models (List[str] | None)

  • dpi (int)

  • max_covmat_params (int | None)

  • ingest_mapping_file (Path | None)

  • ingest_output_dir (Path | None)

  • ingest_commit_branch (bool)

  • ingest_commit_branch_name (str | None)

  • ingest_commit_base_branch (str | None)

  • ingest_commit_message (str | None)

  • ingest_verify (bool)

home_dir

Root of the data repository containing pulsar folders.

Type:

pathlib.Path

singularity_image

Singularity/Apptainer image with tempo2.

Type:

pathlib.Path

dataset_name

Dataset name or path (see resolved()).

Type:

str | None

results_dir

Output directory for pipeline reports.

Type:

pathlib.Path

branches

Git branches to run/compare.

Type:

List[str]

reference_branch

Branch used as change-report reference.

Type:

str

pulsars

"ALL" or an explicit list of pulsar names.

Type:

str | List[str]

outdir_name

Optional output subdirectory name.

Type:

str | None

epoch

Tempo2 epoch used for fitting.

Type:

str

force_rerun

Re-run tempo2 even if outputs exist.

Type:

bool

jobs

Parallel workers per branch.

Type:

int

run_tempo2

Whether to run tempo2.

Type:

bool

make_toa_coverage_plots

Generate coverage plots.

Type:

bool

make_change_reports

Generate change reports.

Type:

bool

make_covariance_heatmaps

Generate covariance heatmaps.

Type:

bool

make_residual_plots

Generate residual plots.

Type:

bool

make_outlier_reports

Generate outlier tables.

Type:

bool

make_plots

Convenience toggle to disable all plotting outputs.

Type:

bool | None

make_reports

Convenience toggle to disable report outputs.

Type:

bool | None

make_covmat

Convenience toggle to control covariance heatmaps.

Type:

bool | None

testing_mode

If True, skip change reports (useful for CI).

Type:

bool

run_pqc

Enable optional pqc stage.

Type:

bool

pqc_backend_col

Backend grouping column for pqc.

Type:

str

pqc_drop_unmatched

Drop unmatched TOAs in pqc.

Type:

bool

pqc_merge_tol_seconds

Merge tolerance in seconds for pqc.

Type:

float

pqc_tau_corr_minutes

OU correlation time for pqc.

Type:

float

pqc_fdr_q

False discovery rate for pqc.

Type:

float

pqc_mark_only_worst_per_day

Mark only worst TOA per day.

Type:

bool

pqc_tau_rec_days

Recovery time for transient scan.

Type:

float

pqc_window_mult

Window multiplier for transient scan.

Type:

float

pqc_min_points

Minimum points for transient scan.

Type:

int

pqc_delta_chi2_thresh

Delta-chi2 threshold for transients.

Type:

float

pqc_exp_dip_min_duration_days

Minimum duration (days) for exp dips.

Type:

float

pqc_add_orbital_phase

Add orbital-phase feature.

Type:

bool

pqc_add_solar_elongation

Add solar elongation feature.

Type:

bool

pqc_add_elevation

Add elevation feature.

Type:

bool

pqc_add_airmass

Add airmass feature.

Type:

bool

pqc_add_parallactic_angle

Add parallactic-angle feature.

Type:

bool

pqc_add_freq_bin

Add frequency-bin feature.

Type:

bool

pqc_freq_bins

Number of frequency bins if enabled.

Type:

int

pqc_observatory_path

Optional observatory file path.

Type:

str | None

pqc_structure_mode

Feature-structure mode (none/detrend/test/both).

Type:

str

pqc_structure_detrend_features

Features to detrend against.

Type:

List[str] | None

pqc_structure_test_features

Features to test for structure.

Type:

List[str] | None

pqc_structure_nbins

Bin count for structure tests.

Type:

int

pqc_structure_min_per_bin

Minimum points per bin.

Type:

int

pqc_structure_p_thresh

p-value threshold for structure detection.

Type:

float

pqc_structure_circular_features

Circular features in [0,1).

Type:

List[str] | None

pqc_structure_group_cols

Grouping columns for structure tests.

Type:

List[str] | None

pqc_outlier_gate_enabled

Enable hard sigma gate for outlier membership.

Type:

bool

pqc_outlier_gate_sigma

Sigma threshold for outlier gate.

Type:

float

pqc_outlier_gate_resid_col

Residual column to gate on (optional).

Type:

str | None

pqc_outlier_gate_sigma_col

Sigma column to gate on (optional).

Type:

str | None

pqc_event_instrument

Enable per-event membership diagnostics.

Type:

bool

pqc_solar_events_enabled

Enable solar event detection.

Type:

bool

pqc_solar_approach_max_deg

Max elongation for solar approach region.

Type:

float

pqc_solar_min_points_global

Min points for global fit.

Type:

int

pqc_solar_min_points_year

Min points for per-year fit.

Type:

int

pqc_solar_min_points_near_zero

Min points near zero elongation.

Type:

int

pqc_solar_tau_min_deg

Min elongation scale for exponential.

Type:

float

pqc_solar_tau_max_deg

Max elongation scale for exponential.

Type:

float

pqc_solar_member_eta

Per-point membership SNR threshold.

Type:

float

pqc_solar_freq_dependence

Fit 1/f^alpha dependence.

Type:

bool

pqc_solar_freq_alpha_min

Lower bound for alpha.

Type:

float

pqc_solar_freq_alpha_max

Upper bound for alpha.

Type:

float

pqc_solar_freq_alpha_tol

Optimization tolerance for alpha.

Type:

float

pqc_solar_freq_alpha_max_iter

Max iterations for alpha optimizer.

Type:

int

pqc_orbital_phase_cut_enabled

Enable orbital-phase based flagging.

Type:

bool

pqc_orbital_phase_cut_center

Eclipse center phase (0..1).

Type:

float

pqc_orbital_phase_cut

Fixed orbital-phase cutoff (0..0.5), or None for auto.

Type:

float | None
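
The 0..0.5 range of the cutoff follows from orbital phase being circular: no TOA can be more than half an orbit away from the eclipse center. The following is a minimal sketch of that geometry, assuming the cut is applied as a symmetric window around pqc_orbital_phase_cut_center. Both function names are hypothetical and are not part of pleb or pqc.

```python
def circular_phase_distance(phase: float, center: float) -> float:
    """Shortest distance between two orbital phases on the unit circle."""
    # Wrap the signed difference into [-0.5, 0.5) before taking its magnitude.
    return abs(((phase - center + 0.5) % 1.0) - 0.5)


def in_phase_cut(phase: float, center: float, cut: float) -> bool:
    """Hypothetical membership test: True if the TOA lies inside the cut window."""
    return circular_phase_distance(phase, center) <= cut
```

With center=0.25 and cut=0.1, a TOA at phase 0.3 falls inside the window, while one at phase 0.75 (half an orbit away) does not.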

pqc_orbital_phase_cut_sigma

Sigma threshold for automatic cutoff estimation.

Type:

float

pqc_orbital_phase_cut_nbins

Number of bins for cutoff estimation.

Type:

int

pqc_orbital_phase_cut_min_points

Minimum points for cutoff estimation.

Type:

int

pqc_eclipse_events_enabled

Enable eclipse event detection.

Type:

bool

pqc_eclipse_center_phase

Eclipse center phase (0..1).

Type:

float

pqc_eclipse_min_points

Min points for global fit.

Type:

int

pqc_eclipse_width_min

Min eclipse width in phase.

Type:

float

pqc_eclipse_width_max

Max eclipse width in phase.

Type:

float

pqc_eclipse_member_eta

Per-point membership SNR threshold.

Type:

float

pqc_eclipse_freq_dependence

Fit 1/f^alpha dependence.

Type:

bool

pqc_eclipse_freq_alpha_min

Lower bound for alpha.

Type:

float

pqc_eclipse_freq_alpha_max

Upper bound for alpha.

Type:

float

pqc_eclipse_freq_alpha_tol

Optimization tolerance for alpha.

Type:

float

pqc_eclipse_freq_alpha_max_iter

Max iterations for alpha optimizer.

Type:

int

pqc_gaussian_bump_enabled

Enable Gaussian-bump event detection.

Type:

bool

pqc_gaussian_bump_min_duration_days

Minimum bump duration in days.

Type:

float

pqc_gaussian_bump_max_duration_days

Maximum bump duration in days.

Type:

float

pqc_gaussian_bump_n_durations

Number of duration grid points.

Type:

int

pqc_gaussian_bump_min_points

Minimum points for bump detection.

Type:

int

pqc_gaussian_bump_delta_chi2_thresh

Delta-chi2 threshold for bump detection.

Type:

float

pqc_gaussian_bump_suppress_overlap

Suppress overlapping bumps.

Type:

bool

pqc_gaussian_bump_member_eta

Per-point membership SNR threshold.

Type:

float

pqc_gaussian_bump_freq_dependence

Fit 1/f^alpha dependence.

Type:

bool

pqc_gaussian_bump_freq_alpha_min

Lower bound for alpha.

Type:

float

pqc_gaussian_bump_freq_alpha_max

Upper bound for alpha.

Type:

float

pqc_gaussian_bump_freq_alpha_tol

Optimization tolerance for alpha.

Type:

float

pqc_gaussian_bump_freq_alpha_max_iter

Max iterations for alpha optimizer.

Type:

int

pqc_glitch_enabled

Enable glitch event detection.

Type:

bool

pqc_glitch_min_points

Minimum points for glitch detection.

Type:

int

pqc_glitch_delta_chi2_thresh

Delta-chi2 threshold for glitch detection.

Type:

float

pqc_glitch_suppress_overlap

Suppress overlapping glitches.

Type:

bool

pqc_glitch_member_eta

Per-point membership SNR threshold.

Type:

float

pqc_glitch_peak_tau_days

Peak exponential timescale for glitch model.

Type:

float

pqc_glitch_noise_k

Noise-aware threshold multiplier.

Type:

float

pqc_glitch_mean_window_days

Rolling-mean window (days) for zero-crossing.

Type:

float

pqc_glitch_min_duration_days

Minimum glitch duration (days).

Type:

float

pqc_backend_profiles_path

Optional TOML with per-backend pqc overrides.

Type:

str | None

qc_report

Generate pqc report artifacts after the run.

Type:

bool

qc_report_backend_col

Backend column name for reports (optional).

Type:

str | None

qc_report_backend

Optional backend key to plot.

Type:

str | None

qc_report_dir

Output directory for reports (optional).

Type:

pathlib.Path | None

qc_report_no_plots

Skip transient plots in reports.

Type:

bool

qc_report_structure_group_cols

Structure grouping override for reports.

Type:

str | None

qc_report_no_feature_plots

Skip feature plots in reports.

Type:

bool

qc_report_compact_pdf

Generate compact composite PDF report.

Type:

bool

qc_report_compact_pdf_name

Filename for compact PDF report.

Type:

str

run_fix_dataset

Enable FixDataset stage.

Type:

bool

make_binary_analysis

Enable binary analysis table.

Type:

bool

param_scan_typical

Enable typical param-scan profile.

Type:

bool

param_scan_dm_redchisq_threshold

Threshold for DM scan.

Type:

float

param_scan_dm_max_order

Max DM derivative order.

Type:

int

param_scan_btx_max_fb

Max FB derivative order.

Type:

int

fix_apply

Whether FixDataset applies changes and commits.

Type:

bool

fix_branch_name

Name of FixDataset branch. If unset and fix_apply is true, a name is auto-generated as branch_run_ddmmyyhhmm.

Type:

str | None

fix_base_branch

Base branch for FixDataset.

Type:

str | None

fix_commit_message

Commit message for FixDataset.

Type:

str | None

fix_backup

Create backup before FixDataset modifications.

Type:

bool

fix_dry_run

If True, FixDataset does not write changes.

Type:

bool

fix_update_alltim_includes

Update INCLUDE lines in .tim files.

Type:

bool

fix_min_toas_per_backend_tim

Minimum TOAs per backend .tim.

Type:

int

fix_required_tim_flags

Required flags for .tim entries.

Type:

Dict[str, str]

fix_system_flag_mapping_path

Editable system-flag mapping JSON (optional).

Type:

str | None

fix_relabel_rules_path

Declarative TOA relabel rules TOML (optional).

Type:

str | None

fix_overlap_rules_path

Declarative overlap rules TOML (optional).

Type:

str | None

fix_overlap_exact_catalog_path

TOML keep->drop map for exact overlap removal.

Type:

str | None

fix_jump_reference_variants

Build per-variant reference-system jump parfiles.

Type:

bool

fix_jump_reference_keep_tmp

Keep temporary split tim/par files.

Type:

bool

fix_jump_reference_jump_flag

Jump flag used in generated variant parfiles.

Type:

str

fix_jump_reference_csv_dir

Output directory for jump-reference CSV files.

Type:

str | None

fix_insert_missing_jumps

Insert missing JUMP lines.

Type:

bool

fix_jump_flag

Flag used for inserted jumps.

Type:

str

fix_prune_stale_jumps

Drop JUMPs not present in timfile flags.

Type:

bool

fix_ensure_ephem

Ensure ephemeris parameter exists.

Type:

str | None

fix_ensure_clk

Ensure clock parameter exists.

Type:

str | None

fix_ensure_ne_sw

Ensure NE_SW parameter exists.

Type:

str | None

fix_force_ne_sw_overwrite

Overwrite existing NE_SW values when true.

Type:

bool

fix_remove_patterns

Patterns to remove from .par/.tim.

Type:

List[str]

fix_coord_convert

Optional coordinate conversion.

Type:

str | None

fix_qc_remove_outliers

Comment/delete TOAs flagged by pqc outputs.

Type:

bool

fix_qc_action

Action for pqc outliers (comment/delete).

Type:

str

fix_qc_comment_prefix

Prefix for commented TOA lines.

Type:

str

fix_qc_backend_col

Backend column for pqc matching (if needed).

Type:

str

fix_qc_remove_bad

Act on bad/bad_day flags.

Type:

bool

fix_qc_remove_transients

Act on transient flags.

Type:

bool

fix_qc_remove_solar

Act on solar-elongation flags.

Type:

bool

fix_qc_solar_action

Action for solar-flagged TOAs (comment/delete).

Type:

str

fix_qc_solar_comment_prefix

Prefix for solar-flagged TOA comments.

Type:

str

fix_qc_remove_orbital_phase

Act on orbital-phase flags.

Type:

bool

fix_qc_orbital_phase_action

Action for orbital-phase flagged TOAs (comment/delete).

Type:

str

fix_qc_orbital_phase_comment_prefix

Prefix for orbital-phase TOA comments.

Type:

str

fix_qc_merge_tol_days

MJD tolerance when matching TOAs.

Type:

float

fix_qc_results_dir

Directory containing pqc CSV outputs. If unset and fix_apply is true, defaults to <results>/qc/<fix_branch_name>.

Type:

pathlib.Path | None

fix_qc_branch

Branch subdir for pqc CSV outputs. If unset and fix_qc_results_dir is set, defaults to fix_branch_name.

Type:

str | None

binary_only_models

Limit binary analysis to model names.

Type:

List[str] | None

dpi

Plot resolution.

Type:

int

max_covmat_params

Max params in covariance heatmaps.

Type:

int | None

ingest_mapping_file

JSON mapping file for ingest mode (optional).

Type:

pathlib.Path | None

ingest_output_dir

Output root directory for ingest mode (optional).

Type:

pathlib.Path | None

ingest_commit_branch

Create a new branch and commit ingest outputs.

Type:

bool

ingest_commit_branch_name

Optional name for the ingest branch.

Type:

str | None

ingest_commit_base_branch

Base branch for the ingest commit.

Type:

str | None

ingest_commit_message

Commit message for ingest.

Type:

str | None

binary_only_models: List[str] | None
branches: List[str]
cleanup_output_tree: bool
cleanup_work_dir: bool
dataset_name: str | None
dpi: int
epoch: str
fix_alltim_variants_path: str | None
fix_apply: bool
fix_backend_classifications_path: str | None
fix_backend_overrides: Dict[str, str]
fix_backup: bool
fix_base_branch: str | None
fix_branch_name: str | None
fix_check_duplicate_backend_tims: bool
fix_commit_message: str | None
fix_coord_convert: str | None
fix_dedupe_freq_tol_auto: bool
fix_dedupe_freq_tol_mhz: float | None
fix_dedupe_mjd_tol_sec: float
fix_dedupe_toas_within_tim: bool
fix_default_backend: str | None
fix_drop_small_backend_includes: bool
fix_dry_run: bool
fix_ensure_clk: str | None
fix_ensure_ephem: str | None
fix_ensure_ne_sw: str | None
fix_flag_sys_freq_rules_enabled: bool
fix_flag_sys_freq_rules_path: str | None
fix_force_ne_sw_overwrite: bool
fix_generate_alltim_variants: bool
fix_group_flag: str
fix_infer_system_flags: bool
fix_insert_missing_jumps: bool
fix_jump_flag: str
fix_jump_reference_csv_dir: str | None
fix_jump_reference_jump_flag: str
fix_jump_reference_keep_tmp: bool
fix_jump_reference_variants: bool
fix_min_toas_per_backend_tim: int
fix_overlap_exact_catalog_path: str | None
fix_overlap_rules_path: str | None
fix_prune_missing_includes: bool
fix_prune_small_system_flag: str
fix_prune_small_system_toas: bool
fix_prune_stale_jumps: bool
fix_pta_flag: str
fix_pta_value: str | None
fix_qc_action: str
fix_qc_backend_col: str
fix_qc_branch: str | None
fix_qc_comment_prefix: str
fix_qc_merge_tol_days: float
fix_qc_orbital_phase_action: str
fix_qc_orbital_phase_comment_prefix: str
fix_qc_outlier_cols: List[str] | None
fix_qc_remove_bad: bool
fix_qc_remove_orbital_phase: bool
fix_qc_remove_outliers: bool
fix_qc_remove_solar: bool
fix_qc_remove_transients: bool
fix_qc_results_dir: Path | None
fix_qc_solar_action: str
fix_qc_solar_comment_prefix: str
fix_raise_on_backend_missing: bool
fix_relabel_rules_path: str | None
fix_remove_overlaps_exact: bool
fix_remove_patterns: List[str]
fix_required_tim_flags: Dict[str, str]
fix_standardize_par_values: bool
fix_system_flag_mapping_path: str | None
fix_system_flag_overwrite_existing: bool
fix_system_flag_table_path: str | None
fix_system_flag_update_table: bool
fix_update_alltim_includes: bool
fix_wsrt_p2_action: str
fix_wsrt_p2_comment_prefix: str
fix_wsrt_p2_force_sys_by_freq: bool
fix_wsrt_p2_mjd_tol_sec: float
fix_wsrt_p2_prefer_dual_channel: bool
force_rerun: bool
static from_dict(d)[source]

Construct a PipelineConfig from a dict.

Parameters:

d (Dict[str, Any]) – Dictionary of configuration values.

Returns:

A new PipelineConfig instance.

Raises:

KeyError – If required keys (home_dir or singularity_image) are missing.

Return type:

PipelineConfig

Examples

Load from a dict (e.g., parsed JSON/TOML):

from pleb import PipelineConfig

cfg = PipelineConfig.from_dict(
    {
        "home_dir": "/data/epta",
        "singularity_image": "/images/tempo2.sif",
        "dataset_name": "EPTA",
    }
)
home_dir: Path
ingest_commit_base_branch: str | None
ingest_commit_branch: bool
ingest_commit_branch_name: str | None
ingest_commit_message: str | None
ingest_mapping_file: Path | None
ingest_output_dir: Path | None
ingest_verify: bool
jobs: int
static load(path)[source]

Load configuration from a JSON or TOML file.

Parameters:

path (Path) – Path to a .json or .toml file.

Returns:

A PipelineConfig instance.

Raises:
  • FileNotFoundError – If the path does not exist.

  • RuntimeError – If TOML is requested but tomllib is unavailable.

  • ValueError – If the file extension is unsupported.

Return type:

PipelineConfig

Examples

Load from JSON:

cfg = PipelineConfig.load(Path("pipeline.json"))

Load from TOML ([pipeline] table supported):

cfg = PipelineConfig.load(Path("pipeline.toml"))
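
The extension dispatch and error cases described for load() can be illustrated with a standalone sketch. This is not the pleb implementation: load_config_dict is a hypothetical helper that only returns a plain dict, and the handling of the [pipeline] table (use it when present, otherwise use the top-level keys) is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Dict

try:
    import tomllib  # Standard library from Python 3.11 onwards.
except ModuleNotFoundError:
    tomllib = None


def load_config_dict(path: Path) -> Dict[str, Any]:
    """Hypothetical loader: read a .json or .toml file into a plain dict."""
    if not path.exists():
        raise FileNotFoundError(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(path.read_text(encoding="utf-8"))
    if suffix == ".toml":
        if tomllib is None:
            raise RuntimeError("TOML config requested but tomllib is unavailable")
        with path.open("rb") as fh:
            data = tomllib.load(fh)
        # Assumption: accept either top-level keys or a [pipeline] table.
        return data.get("pipeline", data)
    raise ValueError(f"Unsupported config extension: {path.suffix}")
```

A dict produced this way has the shape that PipelineConfig.from_dict() expects.
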
make_binary_analysis: bool
make_change_reports: bool
make_covariance_heatmaps: bool
make_covmat: bool | None
make_outlier_reports: bool
make_plots: bool | None
make_reports: bool | None
make_residual_plots: bool
make_toa_coverage_plots: bool
max_covmat_params: int | None
outdir_name: str | None
param_scan_btx_max_fb: int
param_scan_dm_max_order: int
param_scan_dm_redchisq_threshold: float
param_scan_typical: bool
pqc_add_airmass: bool
pqc_add_elevation: bool
pqc_add_freq_bin: bool
pqc_add_orbital_phase: bool
pqc_add_parallactic_angle: bool
pqc_add_solar_elongation: bool
pqc_backend_col: str
pqc_backend_profiles_path: str | None
pqc_delta_chi2_thresh: float
pqc_dm_step_delta_chi2_thresh: float
pqc_dm_step_enabled: bool
pqc_dm_step_min_points: int
pqc_dm_step_scope: str
pqc_drop_unmatched: bool
pqc_eclipse_center_phase: float
pqc_eclipse_events_enabled: bool
pqc_eclipse_freq_alpha_max: float
pqc_eclipse_freq_alpha_max_iter: int
pqc_eclipse_freq_alpha_min: float
pqc_eclipse_freq_alpha_tol: float
pqc_eclipse_freq_dependence: bool
pqc_eclipse_member_eta: float
pqc_eclipse_min_points: int
pqc_eclipse_width_max: float
pqc_eclipse_width_min: float
pqc_event_instrument: bool
pqc_exp_dip_min_duration_days: float
pqc_fdr_q: float
pqc_freq_bins: int
pqc_gaussian_bump_delta_chi2_thresh: float
pqc_gaussian_bump_enabled: bool
pqc_gaussian_bump_freq_alpha_max: float
pqc_gaussian_bump_freq_alpha_max_iter: int
pqc_gaussian_bump_freq_alpha_min: float
pqc_gaussian_bump_freq_alpha_tol: float
pqc_gaussian_bump_freq_dependence: bool
pqc_gaussian_bump_max_duration_days: float
pqc_gaussian_bump_member_eta: float
pqc_gaussian_bump_min_duration_days: float
pqc_gaussian_bump_min_points: int
pqc_gaussian_bump_n_durations: int
pqc_gaussian_bump_suppress_overlap: bool
pqc_glitch_delta_chi2_thresh: float
pqc_glitch_enabled: bool
pqc_glitch_mean_window_days: float
pqc_glitch_member_eta: float
pqc_glitch_min_duration_days: float
pqc_glitch_min_points: int
pqc_glitch_noise_k: float
pqc_glitch_peak_tau_days: float
pqc_glitch_suppress_overlap: bool
pqc_mark_only_worst_per_day: bool
pqc_merge_tol_seconds: float
pqc_min_points: int
pqc_observatory_path: str | None
pqc_orbital_phase_cut: float | None
pqc_orbital_phase_cut_center: float
pqc_orbital_phase_cut_enabled: bool
pqc_orbital_phase_cut_min_points: int
pqc_orbital_phase_cut_nbins: int
pqc_orbital_phase_cut_sigma: float
pqc_outlier_gate_enabled: bool
pqc_outlier_gate_resid_col: str | None
pqc_outlier_gate_sigma: float
pqc_outlier_gate_sigma_col: str | None
pqc_robust_enabled: bool
pqc_robust_scope: str
pqc_robust_z_thresh: float
pqc_solar_approach_max_deg: float
pqc_solar_events_enabled: bool
pqc_solar_freq_alpha_max: float
pqc_solar_freq_alpha_max_iter: int
pqc_solar_freq_alpha_min: float
pqc_solar_freq_alpha_tol: float
pqc_solar_freq_dependence: bool
pqc_solar_member_eta: float
pqc_solar_min_points_global: int
pqc_solar_min_points_near_zero: int
pqc_solar_min_points_year: int
pqc_solar_tau_max_deg: float
pqc_solar_tau_min_deg: float
pqc_step_delta_chi2_thresh: float
pqc_step_enabled: bool
pqc_step_min_points: int
pqc_step_scope: str
pqc_structure_circular_features: List[str] | None
pqc_structure_detrend_features: List[str] | None
pqc_structure_group_cols: List[str] | None
pqc_structure_min_per_bin: int
pqc_structure_mode: str
pqc_structure_nbins: int
pqc_structure_p_thresh: float
pqc_structure_test_features: List[str] | None
pqc_tau_corr_minutes: float
pqc_tau_rec_days: float
pqc_window_mult: float
pulsars: str | List[str]
qc_cross_pulsar_dir: Path | None
qc_cross_pulsar_enabled: bool
qc_cross_pulsar_event_cols: List[str] | None
qc_cross_pulsar_include_events: bool
qc_cross_pulsar_include_outliers: bool
qc_cross_pulsar_min_pulsars: int
qc_cross_pulsar_outlier_cols: List[str] | None
qc_cross_pulsar_time_col: str | None
qc_cross_pulsar_window_days: float
qc_report: bool
qc_report_backend: str | None
qc_report_backend_col: str | None
qc_report_compact_outlier_cols: List[str] | None
qc_report_compact_pdf: bool
qc_report_compact_pdf_name: str
qc_report_dir: Path | None
qc_report_no_feature_plots: bool
qc_report_no_plots: bool
qc_report_structure_group_cols: str | None
reference_branch: str
resolved()[source]

Return a copy with paths expanded and resolved.

The dataset_name field is interpreted as:

  1. absolute path -> use as-is

  2. looks like a path (contains “/” or starts with “.”) -> resolve relative to cwd

  3. plain name -> treat as a directory inside home_dir

Returns:

A new PipelineConfig with resolved paths.

Raises:

TypeError – If dataset_name is None (it must be a string or path-like value when resolving).

Return type:

PipelineConfig

Examples

Resolve a dataset by name relative to home_dir:

from pathlib import Path
from pleb import PipelineConfig

cfg = PipelineConfig(
    home_dir=Path("/data/epta"),
    singularity_image=Path("/images/tempo2.sif"),
    dataset_name="EPTA",
)
resolved = cfg.resolved()
assert resolved.dataset_name == Path("/data/epta/EPTA")
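
The three dataset_name rules listed for resolved() can be restated as a small standalone function. This is a sketch under stated assumptions, not the library code: resolve_dataset_name is a hypothetical name, and the current working directory is passed in explicitly so the behaviour is easy to check.

```python
from pathlib import Path


def resolve_dataset_name(dataset_name: str, home_dir: Path, cwd: Path) -> Path:
    """Hypothetical restatement of the three documented rules."""
    candidate = Path(dataset_name).expanduser()
    if candidate.is_absolute():
        return candidate                    # 1. absolute path: use as-is
    if "/" in dataset_name or dataset_name.startswith("."):
        return (cwd / candidate).resolve()  # 2. path-like: relative to cwd
    return home_dir / dataset_name          # 3. plain name: inside home_dir
```

Under these rules, "EPTA" maps to <home_dir>/EPTA, while "./EPTA" is resolved against the working directory instead.
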
results_dir: Path
run_fix_dataset: bool
run_pqc: bool
run_tempo2: bool
save_json(path)[source]

Write configuration to a JSON file.

Parameters:

path (Path) – Output file path.

Return type:

None

Examples

Save to disk:

cfg.save_json(Path("pipeline.json"))
singularity_image: Path
testing_mode: bool
to_dict()[source]

Serialize configuration to a JSON-friendly dict.

Returns:

Dictionary representation of the config with Path values converted to strings.

Return type:

Dict[str, Any]

Examples

Convert to a dict suitable for JSON serialization:

from pathlib import Path
from pleb import PipelineConfig

cfg = PipelineConfig(
    home_dir=Path("/data/epta"),
    singularity_image=Path("/images/tempo2.sif"),
    dataset_name="EPTA",
)
payload = cfg.to_dict()
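
The Path-to-string conversion that makes the dict JSON-friendly can be sketched on a small stand-in dataclass. MiniConfig and to_jsonable are hypothetical names used only for illustration; the real PipelineConfig has many more fields and may convert values differently.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class MiniConfig:
    """Hypothetical stand-in with a few PipelineConfig-like fields."""
    home_dir: Path
    singularity_image: Path
    dataset_name: Optional[str] = None


def to_jsonable(cfg: MiniConfig) -> Dict[str, Any]:
    """Convert Path values to strings so json.dumps accepts the result."""
    return {k: str(v) if isinstance(v, Path) else v for k, v in asdict(cfg).items()}


payload = to_jsonable(MiniConfig(Path("/data/epta"), Path("/images/tempo2.sif"), "EPTA"))
text = json.dumps(payload, indent=2)  # Would raise TypeError if a Path remained.
```
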
class pleb.QCReportConfig(run_dir, backend_col='group', backend=None, report_dir=None, no_plots=False, structure_group_cols=None, no_feature_plots=False, compact_pdf=False, compact_pdf_name='qc_compact_report.pdf')[source]

Bases: object

Configuration model for qc-report mode.

Notes

This model configures report rendering only. It does not run PQC; it consumes existing *_qc.csv outputs and generates summary artifacts (plots, tables, optional compact PDF).

backend: str | None
backend_col: str
compact_pdf: bool
compact_pdf_name: str
static from_dict(d)[source]
Parameters:

d (Dict[str, Any])

Return type:

QCReportConfig

static load(path)[source]
Parameters:

path (Path)

Return type:

QCReportConfig

no_feature_plots: bool
no_plots: bool
report_dir: Path | None
run_dir: Path
structure_group_cols: str | None
Parameters:
  • run_dir (Path)

  • backend_col (str)

  • backend (str | None)

  • report_dir (Path | None)

  • no_plots (bool)

  • structure_group_cols (str | None)

  • no_feature_plots (bool)

  • compact_pdf (bool)

  • compact_pdf_name (str)

class pleb.WorkflowRunConfig(workflow_file)[source]

Bases: object

Configuration model for workflow-file execution mode.

Parameters

workflow_file : pathlib.Path

Path to workflow definition (TOML/JSON) executed by pleb.workflow.run_workflow().

static from_dict(d)[source]
Parameters:

d (Dict[str, Any])

Return type:

WorkflowRunConfig

static load(path)[source]
Parameters:

path (Path)

Return type:

WorkflowRunConfig

workflow_file: Path
Parameters:

workflow_file (Path)

pleb.fix_pulsar_dataset(psr_dir, cfg)[source]

Apply or report dataset fixes for a single pulsar directory.

Parameters:
  • psr_dir (Path) – Pulsar directory containing .par/.tim files.

  • cfg (FixDatasetConfig) – FixDataset configuration.

Returns:

Report dictionary with per-step results.

Return type:

Dict[str, object]

Examples

Run a report-only pass:

report = fix_pulsar_dataset(Path("/data/epta/J1234+5678"), FixDatasetConfig())

See also

write_fix_report: Write aggregated reports to disk.

pleb.run_param_scan(cfg, **kwargs)[source]

Run a parameter scan (fit-only) workflow.

This wrapper lazily imports the parameter scan module.

Parameters

cfg : PipelineConfig

Pipeline configuration.

**kwargs

Forwarded to pleb.param_scan.run_param_scan().

Returns

dict

Output-path mapping produced by the scan.

See Also

pleb.param_scan.run_param_scan: Full scan implementation.

Parameters:

cfg (PipelineConfig)

pleb.run_pipeline(cfg)[source]

Run the full data-combination pipeline.

This is a lightweight wrapper that lazily imports the heavy pipeline module.

Parameters

cfg : PipelineConfig

Pipeline configuration.

Returns

dict

Output-path mapping as returned by pleb.pipeline.run_pipeline().

See Also

pleb.pipeline.run_pipeline: Full pipeline implementation.

Parameters:

cfg (PipelineConfig)
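
The lazy-import pattern that run_pipeline and run_param_scan are described as using can be sketched generically. This is an illustration of the technique only, not the pleb source: call_lazily and dumps_lazily are hypothetical, and the standard-library json module stands in for a heavy dependency.

```python
import importlib
from typing import Any


def call_lazily(module_name: str, func_name: str, *args: Any, **kwargs: Any) -> Any:
    """Import module_name only when called, then forward to func_name."""
    module = importlib.import_module(module_name)  # Deferred until first use.
    return getattr(module, func_name)(*args, **kwargs)


def dumps_lazily(obj: Any) -> str:
    """Stand-in wrapper: json plays the role of the heavy module here."""
    return call_lazily("json", "dumps", obj)
```

Because the import happens inside the function body, importing the package that defines the wrapper stays cheap; the cost is paid on the first call, and later calls reuse the module cached in sys.modules.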

pleb.write_binary_analysis(home_dir, out_dir, pulsars, branches, config=None)[source]

Write a per-branch, per-pulsar binary analysis TSV.

Looks for <home_dir>/<pulsar>/<pulsar>.par on each branch.

Parameters

home_dir : pathlib.Path

Root data repository.

out_dir : pathlib.Path

Output directory for the TSV.

pulsars : list of str

Pulsar names to include.

branches : list of str

Branch names (used for labeling).

config : BinaryAnalysisConfig, optional

Optional binary analysis configuration.

Returns

pathlib.Path

Path to the written TSV file.

Examples

Write a binary analysis table for two branches:

from pathlib import Path
from pleb import write_binary_analysis

out_path = write_binary_analysis(
    home_dir=Path("/data/epta/EPTA"),
    out_dir=Path("results/binary"),
    pulsars=["J1234+5678"],
    branches=["main", "EPTA"],
)
Parameters:
  • home_dir (Path)

  • out_dir (Path)

  • pulsars (List[str])

  • branches (List[str])

  • config (BinaryAnalysisConfig | None)

Return type:

Path

pleb.write_fix_report(reports, out_dir)[source]

Write FixDataset reports to disk.

Parameters:
  • reports (List[Dict[str, object]]) – List of per-pulsar report dictionaries.

  • out_dir (Path) – Output directory.

Returns:

Path to the detailed JSON report file.

Return type:

Path

Examples

Save reports for multiple pulsars:

detail_path = write_fix_report(reports, Path("results/fix_dataset"))

Internal Modules (No Index)

Provide the command-line interface for the data-combination pipeline.

This module wires config loading/overrides to pleb.pipeline.run_pipeline() and pleb.param_scan.run_param_scan(), including convenience flags for parameter scans and PQC reporting.

Examples

Run the full pipeline from a JSON config:

python -m pleb.cli --config pipeline.json

Run a parameter scan with a typical profile:

python -m pleb.cli --config pipeline.toml --param-scan --scan-typical

Generate a PQC report from a run directory:

python -m pleb.cli qc-report --run-dir results/run_2024-01-01

See also

  • pleb.config.PipelineConfig: Configuration model.

  • pleb.pipeline.run_pipeline: Pipeline execution entry point.

  • pleb.param_scan.run_param_scan: Parameter scan entry point.

  • pleb.qc_report.generate_qc_report: QC report generator.

Define configuration models for the data-combination pipeline.

This module provides PipelineConfig, a flattened dataclass used by the CLI and pipeline entry points to control data ingestion, fitting, reporting, and optional FixDataset or parameter-scan stages. The config is intentionally flat to simplify JSON/TOML serialization and CLI overrides.

See also

  • pleb.pipeline.run_pipeline: Main pipeline entry point.

  • pleb.param_scan.run_param_scan: Parameter scan entry point.

  • pleb.cli: Command-line interface that consumes PipelineConfig.

FixDataset utilities for cleaning and normalizing pulsar datasets.

This module implements deterministic file-level transformations for .par and .tim trees: include maintenance, flag standardization, deduplication, JUMP maintenance, declarative relabel/overlap rules, and optional QC-driven comment/delete actions.

Notes

Statistical operations in this module are lightweight and mostly descriptive:

  • Reference-system selection uses robust ranking by median TOA uncertainty and TOA count (a pragmatic proxy for a stable timing anchor).

  • Deduplication can use user-defined tolerance or auto-derived frequency tolerance from channel spacing.

  • QC application consumes precomputed pqc flags; this module does not fit statistical models itself.

Worked example

For a variant J1713+0747_all.new.tim, reference-system generation:

  1. Split included backend timfiles by -sys.

  2. Count TOAs per system and compute median TOA error (microseconds).

  3. Choose system with smallest median error; tie-break by largest TOA count.

  4. Write J1713+0747.new.par with JUMP -sys <system> 0 <fitflag> lines.
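
Steps 2 to 4 above can be sketched as follows. This is a minimal illustration, not the module's code: the function names and the system labels in the usage note are hypothetical, and leaving the reference system itself without a JUMP line is an assumption.

```python
from statistics import median
from typing import Dict, List


def choose_reference_system(errors_by_system: Dict[str, List[float]]) -> str:
    """Pick the system with the smallest median TOA error (microseconds).

    Ties are broken in favour of the system with the most TOAs.
    """
    if not errors_by_system:
        raise ValueError("no systems to choose from")
    return min(
        errors_by_system,
        key=lambda s: (median(errors_by_system[s]), -len(errors_by_system[s])),
    )


def jump_lines(systems: List[str], reference: str, fit_flag: int = 1) -> List[str]:
    """Assumed layout: one JUMP line per non-reference system."""
    return [f"JUMP -sys {s} 0 {fit_flag}" for s in sorted(systems) if s != reference]
```

For example, if systems "sysA" and "sysB" both have a median error of 1.0 microseconds but "sysB" has more TOAs, "sysB" becomes the reference and only "sysA" receives a JUMP line.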

See Also

pleb.config.PipelineConfig

Pipeline-level integration settings.

pleb.pipeline.run_pipeline

Orchestrates FixDataset execution and reporting.

Legacy robust .tim parsing import path.

This module is kept for backward compatibility and re-exports the canonical reader from pleb.tim_reader.

Legacy dataset-fix utilities extracted from FixDataset.ipynb.

These functions implement the original notebook’s dataset correction features:

  • TIM fixes: whitespace/padd cleanup, missing flag insertion (-be/-pta/-group/-sys), NUPPI splitting, overlap removal, missing tim INCLUDE updates.

  • PAR fixes: ensure ephem/clk/ne_sw, add missing JUMPs, coordinate conversion helpers, optional param additions.

They are intentionally kept close to the notebook logic for parity.

Lightweight Git helpers used by the pipeline.

These helpers wrap common GitPython operations with minimal logging.

Kepler/orbit helper functions.

This module is adapted from the AnalysePulsars notebook and provides small, float-based orbital conversions suitable for tempo2-style workflows.

Notes

The original notebook mixed unit-aware calculations (Astropy) and raw floats. Here we keep the core conversions in plain floats and expose more advanced solvers behind optional SciPy/Astropy imports when needed.

Legacy dataset-fix utilities extracted from FixDataset.ipynb.

These functions implement the original notebook’s dataset correction features:

  • TIM fixes: whitespace/padd cleanup, missing flag insertion (-be/-pta/-group/-sys), NUPPI splitting, overlap removal, missing tim INCLUDE updates.

  • PAR fixes: ensure ephem/clk/ne_sw, add missing JUMPs, coordinate conversion helpers, optional param additions.

They are intentionally kept close to the notebook logic for parity.

Logging helpers for the pipeline package.

This module provides a small logger factory that writes to stdout and a timestamped log file under logs/ (or $PLEB_LOG_DIR).
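
A logger factory of this kind can be sketched with the standard logging module. The code below is an assumption-laden illustration, not the package's implementation: make_logger, the log-file naming scheme, and the message format are all hypothetical; only the stdout-plus-file behaviour and the $PLEB_LOG_DIR override come from the description above.

```python
import logging
import os
import sys
from datetime import datetime
from pathlib import Path


def make_logger(name: str = "pleb") -> logging.Logger:
    """Hypothetical factory: log to stdout and to a timestamped file."""
    log_dir = Path(os.environ.get("PLEB_LOG_DIR", "logs"))
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # Avoid duplicate handlers on repeated calls.
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        for handler in (
            logging.StreamHandler(sys.stdout),
            logging.FileHandler(log_dir / f"{name}_{stamp}.log", encoding="utf-8"),
        ):
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger
```
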

Optional PQC integration for per-TOA quality control.

This module converts PTAQCConfig values into pqc detector configuration objects, runs pqc per pulsar, and writes a per-TOA QC table. It also supports per-backend override profiles and subprocess isolation for libstempo-related crashes.

Notes

The QC CSV produced here is intentionally diagnostic rather than a final truth label. It carries detector outputs such as:

  • bad_ou: outlier evidence from the OU-innovation bad-measurement model.

  • bad_mad: robust outlier evidence from MAD-based detectors.

  • bad_hard: optional hard sigma-gate failures.

  • bad_point: combined outlier indicator after event-aware reconciliation.

  • event_member: coherent-event membership (transient/step/solar/etc.).

  • outlier_any: compatibility field from pqc (bad_point OR event).

Statistical concepts used by the wrapped PQC pipeline include:

  1. False discovery rate (FDR) control: controls the expected fraction of false positives among detections. The Benjamini-Hochberg rule sorts the m p-values in ascending order, finds the largest rank k with p_(k) <= (k/m) q for target FDR q, and marks every p_(i) with i <= k.

  2. OU-correlated innovations: residuals are tested under a short-timescale correlated (Ornstein-Uhlenbeck) process to avoid over-flagging clustered noise as independent outliers.

  3. Robust z-scores with MAD: the robust scale estimate MAD = median(|x - median(x)|), with sigma_robust ~= 1.4826 * MAD (for Gaussian data), gives the outlier score |x - median(x)| / sigma_robust.

  4. Delta-chi-square model comparison: event detectors compare null vs alternative local models; a large Delta chi^2 supports a structured deviation.
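
Two of the concepts above, the Benjamini-Hochberg rule and the MAD-based robust z-score, are simple enough to write out directly. The sketch below uses only the standard library and textbook definitions; the function names are hypothetical and this is not how pqc itself is implemented.

```python
from statistics import median
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float) -> List[bool]:
    """Return a detection mask under Benjamini-Hochberg FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) with p_(k) <= (k / m) * q; 0 means no detections.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k = rank
    mask = [False] * m
    for idx in order[:k]:
        mask[idx] = True
    return mask


def robust_z(values: Sequence[float]) -> List[float]:
    """Robust z-scores |x - median| / (1.4826 * MAD)."""
    med = median(values)
    mad = median(abs(x - med) for x in values)
    sigma = 1.4826 * mad
    if sigma == 0:
        raise ValueError("MAD is zero; robust scale is undefined")
    return [abs(x - med) / sigma for x in values]
```

The procedure is step-up: with p-values [0.03, 0.04] and q = 0.05, the smaller p-value misses its own threshold (0.025) but is still marked because the larger one passes at rank 2.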

Worked example

If residuals in one backend include a coherent eclipse-like dip, robust/MAD detectors may initially mark those TOAs. Event detectors can then model the dip and reclassify those points as event_member so bad_point is cleared.
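
The reclassification in this worked example can be sketched as a small boolean rule. The exact reconciliation logic lives in pqc and may differ; the code below assumes that any event membership clears the combined outlier label, and ToaFlags and reconcile are hypothetical names built around the documented column names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToaFlags:
    """Hypothetical per-TOA detector outputs, named after the QC CSV columns."""
    bad_ou: bool = False
    bad_mad: bool = False
    bad_hard: bool = False
    event_member: bool = False


def reconcile(flags: ToaFlags) -> Dict[str, bool]:
    """Assumed reconciliation: event membership clears the outlier label."""
    raw_outlier = flags.bad_ou or flags.bad_mad or flags.bad_hard
    bad_point = raw_outlier and not flags.event_member
    return {
        "bad_point": bad_point,
        "event_member": flags.event_member,
        # Documented compatibility field: bad_point OR event.
        "outlier_any": bad_point or flags.event_member,
    }
```

Under this assumption, a TOA inside the eclipse-like dip that was first caught by the MAD detector ends up with bad_point False, event_member True, and outlier_any True.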

See Also

pleb.pipeline.run_pipeline

Pipeline stage that invokes this module.

pleb.dataset_fix.apply_pqc_outliers

Optional downstream action stage (comment/delete flagged TOAs).

Parse tempo2 outputs and pipeline text formats.

This module provides small, resilient parsers for tempo2 logs and output artifacts such as plk logs, covariance matrices, and general2 output.

See also

  • pleb.reports: Utilities that consume parsed outputs.

  • pleb.tim_reader.read_tim_file_robust: Robust .tim reader used here.

Parameter-scan utilities for rapid fit diagnostics.

This module implements a fit-only workflow that evaluates candidate parameter additions or edits by running tempo2 on temporary .par variants. It is used by the CLI --param-scan mode.

See also

pleb.pipeline.run_pipeline: Full pipeline workflow. pleb.reports.write_new_param_significance: Related reporting utilities.

Orchestrate the data-combination pipeline end to end.

This module coordinates git branch management, tempo2 runs, report generation, and optional quality-control steps. It stitches together the core building blocks in pleb.tempo2, pleb.reports, and pleb.dataset_fix.

See also

pleb.config.PipelineConfig: Primary configuration model. pleb.param_scan.run_param_scan: Fit-only parameter scan workflow.

Plotting helpers for pipeline outputs.

This module renders summary plots and tables from tempo2 outputs and pipeline metadata. It relies on Matplotlib and optionally Seaborn for styling.

See also

pleb.reports: Tabular report generation. pleb.parsers: Parsing helpers used by plots.

Binary/orbital analysis helpers for pulsar .par files.

This module provides lightweight parsing and derived-parameter calculations intended for summary reports, not full timing-model validation.

See also

pleb.kepler_orbits: Orbital mechanics helpers used in derived quantities. pleb.config.PipelineConfig: Enables binary analysis in the pipeline.

Generate report artifacts from existing pqc CSV outputs.

This module is a post-processing/reporting layer. It does not re-run pqc; instead it reads *_qc.csv files, renders helper-script diagnostics, and can assemble a compact PDF with actionable per-backend tables.

Notes

Compact decisions are derived from two logical sets:

  • outlier set (by default the union of outlier_any, bad_point, robust/bad-mad columns, etc.)

  • event set (transient, solar, eclipse, Gaussian bump, glitch, orbital flags)

Decision rules:

  • BAD_TOA: outlier and not event

  • REVIEW_EVENT: outlier and event

  • EVENT: event and not outlier

  • KEEP: neither set
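The four rules form a two-by-two truth table over the outlier and event sets. A minimal sketch follows, assuming each row is a mapping of column name to an already-parsed boolean; the function name and the default column tuples are illustrative, not the module's actual configuration.

```python
from typing import Iterable, Mapping

# Illustrative column choices; the real defaults may include more columns.
OUTLIER_COLS = ("outlier_any", "bad_point")
EVENT_COLS = ("event_member",)


def compact_decision(
    row: Mapping[str, object],
    outlier_cols: Iterable[str] = OUTLIER_COLS,
    event_cols: Iterable[str] = EVENT_COLS,
) -> str:
    """Map one QC row to BAD_TOA / REVIEW_EVENT / EVENT / KEEP.

    Values are assumed to be real booleans. Strings read straight from a
    CSV (e.g. "False") would need converting first, since bool("False")
    is True in Python.
    """
    is_outlier = any(bool(row.get(col, False)) for col in outlier_cols)
    is_event = any(bool(row.get(col, False)) for col in event_cols)

    if is_outlier and is_event:
        return "REVIEW_EVENT"
    if is_outlier:
        return "BAD_TOA"
    if is_event:
        return "EVENT"
    return "KEEP"


decisions = [
    compact_decision({"bad_point": True, "event_member": False}),
    compact_decision({"outlier_any": True, "event_member": True}),
    compact_decision({"event_member": True}),
    compact_decision({}),
]
```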

Generate comparison and QC reports from pipeline outputs.

This module builds change reports, model-comparison summaries, and outlier tables from tempo2 outputs parsed by pleb.parsers.

See also

pleb.parsers: Parsing helpers for tempo2 logs. pleb.pipeline.run_pipeline: Orchestrates report generation.

Infer system flags for EPTA-style tempo2 FORMAT 1 .tim files.

Goal:

  • When -sys/-group/-pta are missing (and sometimes -be missing), infer them cheaply and consistently.

  • If bandwidth (-bw) and number-of-bands (-nchan/-nband) are available, assign sub-band systems by binning frequencies into equal-width sub-bands.

  • Keep the system label format <TEL>.<BACKEND>.<CENTRE_MHZ> (the value written with the “-sys” flag).
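The equal-width binning can be sketched as follows. This assumes the overall band centre is already known and that the centre is written as an integer number of MHz in the label; both are assumptions for illustration, and the helper names are hypothetical rather than the module's API.

```python
import math


def subband_centre(freq_mhz, band_centre_mhz, bw_mhz, nband):
    """Return the centre (MHz) of the equal-width sub-band containing freq_mhz."""
    bw = abs(bw_mhz)                   # tolerate a negative bandwidth sign
    width = bw / nband
    low = band_centre_mhz - bw / 2.0   # lower band edge
    idx = math.floor((freq_mhz - low) / width)
    idx = min(max(idx, 0), nband - 1)  # clamp edge and out-of-band values
    return low + (idx + 0.5) * width


def system_label(tel, backend, centre_mhz):
    """Format a label as <TEL>.<BACKEND>.<CENTRE_MHZ> (integer MHz assumed)."""
    return f"{tel}.{backend}.{int(round(centre_mhz))}"


# A 400 MHz band centred on 1400 MHz, split into 4 sub-bands of 100 MHz.
centre = subband_centre(1380.0, band_centre_mhz=1400.0, bw_mhz=400.0, nband=4)
label = system_label("TEL", "BE", centre)
```

Here 1380 MHz falls in the 1300 to 1400 MHz sub-band, so the label carries the centre 1350. A frequency exactly on the upper band edge is clamped into the last sub-band rather than creating a fifth one.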

Design choices (cheap + robust):

  • Only TOA lines are processed; directives/comments are preserved.

  • We never try to infer a header; we assume FORMAT 1 and use the 2nd column as frequency (MHz).

  • We drop/ignore any TOA lines whose frequency is non-numeric.

  • Backend inference:
    1. per-TOA “-be” flag if present

    2. filename stem heuristic: <TEL>.<BACKEND>….tim

    3. otherwise raise BackendMissingError with a sample TOA line for the UI to show the user

  • Second pass canonicalisation across pulsars:

    Use canonicalise_centres() on a combined table of inferred centres to “snap” them across pulsars within a tolerance (default 1 MHz).
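Two of the design choices above lend themselves to a short sketch: the backend fallback order and the cross-pulsar snapping. The code below is an assumption-laden illustration. The exception class is a local stand-in for the module's BackendMissingError, and `snap_centres` uses a simple greedy one-dimensional clustering that may differ from what canonicalise_centres() actually does.

```python
from pathlib import Path


class BackendMissingError(Exception):
    """Local stand-in for the module's exception of the same name."""


def infer_backend(toa_line, filename):
    """Try the per-TOA -be flag, then the filename stem, else raise."""
    tokens = toa_line.split()
    if "-be" in tokens:
        i = tokens.index("-be")
        if i + 1 < len(tokens):
            return tokens[i + 1]
    parts = Path(filename).stem.split(".")  # <TEL>.<BACKEND>... heuristic
    if len(parts) >= 2 and parts[1]:
        return parts[1]
    raise BackendMissingError(f"no backend for TOA line: {toa_line!r}")


def snap_centres(centres, tol_mhz=1.0):
    """Map each centre to the mean of its cluster (greedy, sorted order)."""
    mapping, cluster = {}, []

    def flush():
        canonical = sum(cluster) / len(cluster)
        for value in cluster:
            mapping[value] = canonical

    for c in sorted(set(centres)):
        # Start a new cluster once we move beyond tol of the cluster's first value.
        if cluster and c - cluster[0] > tol_mhz:
            flush()
            cluster = []
        cluster.append(c)
    if cluster:
        flush()
    return mapping


snapped = snap_centres([1380.0, 1380.4, 1396.0, 1380.8])
```

With the default 1 MHz tolerance, the three centres near 1380 MHz collapse to one value while 1396 MHz stays separate.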

See also

pleb.dataset_fix.infer_and_apply_system_flags: Integration point for FixDataset.

tempo2 execution helpers for the pipeline.

This module wraps the tempo2 CLI invocation used by the pipeline and parameter-scan workflows. It assumes tempo2 is available inside a Singularity or Apptainer container.
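As a rough picture of what such a wrapper does, the sketch below builds an argument list for running tempo2 inside a container. The helper name is hypothetical, and any bind mounts, environment variables, or extra tempo2 options the real wrapper passes are not shown.

```python
def build_tempo2_command(image, par, tim, runtime="singularity"):
    """Return an argv list for `<runtime> exec <image> tempo2 -f <par> <tim>`.

    Hypothetical helper for illustration; not the module's actual function.
    """
    if runtime not in ("singularity", "apptainer"):
        raise ValueError(f"unsupported container runtime: {runtime!r}")
    return [runtime, "exec", str(image), "tempo2", "-f", str(par), str(tim)]


cmd = build_tempo2_command("/images/tempo2.sif", "J0000.par", "J0000.tim")
# The list can be passed to subprocess.run(cmd, capture_output=True, text=True)
# so that stdout is available to the log parsers.
```

Passing a list rather than a shell string avoids quoting problems when paths contain spaces.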

See also

pleb.pipeline.run_pipeline: Main workflow orchestration. pleb.param_scan.run_param_scan: Fit-only parameter scan workflow.

Robust .tim file parsing utilities.

These helpers implement tolerant parsing and filtering for tempo2 .tim files that contain mixed headers, directives, and TOA rows.
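One way to picture tolerant parsing is a reader that keeps any line that looks like a FORMAT 1 TOA row and sets everything else aside instead of failing. The sketch below is a simplified illustration, not the module's reader: it assumes whitespace-separated rows of name, frequency, MJD, error, and site, followed by flag/value pairs.

```python
def parse_tim_lines(lines):
    """Split lines into parsed TOA rows and skipped (non-TOA) lines."""
    toas, skipped = [], []
    for raw in lines:
        line = raw.strip()
        # Blank lines and comments ("C " or "#") are dropped silently.
        if not line or line.startswith(("C ", "#")):
            continue
        tokens = line.split()
        try:
            if len(tokens) < 5:
                raise ValueError("too few columns for a TOA row")
            freq = float(tokens[1])
            float(tokens[2])  # validate the MJD but keep the original string
            err = float(tokens[3])
        except ValueError:
            # Directives (FORMAT, MODE, JUMP, ...) and malformed rows land here.
            skipped.append(raw)
            continue
        rest = tokens[5:]
        flags = {rest[i]: rest[i + 1] for i in range(0, len(rest) - 1, 2)}
        toas.append(
            {
                "name": tokens[0],
                "freq_mhz": freq,
                "mjd": tokens[2],  # string preserves full MJD precision
                "err_us": err,
                "site": tokens[4],
                "flags": flags,
            }
        )
    return toas, skipped


toas, skipped = parse_tim_lines(
    [
        "FORMAT 1",
        "C old.toa 1400.0 55000.1 1.0 g",
        "toa1 1400.000 55000.1234567890123 1.25 g -sys TEL.BE.1400 -be BE",
        "toa2 notanumber 55001.0 1.0 g",
        "",
    ]
)
```

Keeping the MJD as a string is deliberate: a 64-bit float cannot hold the sub-microsecond digits that TOAs often carry.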

General utility helpers for the pipeline.

These helpers provide small filesystem utilities and shared path conventions used across pipeline modules.