Single-Pulsar Three-Pass Workflow

Once the separate stages are understood, the next step is to chain them into one reproducible workflow file.

Workflow mode is not a good first introduction to the pipeline. It is most useful once each stage has been run and understood on its own.

The Three Passes

For one pulsar, a clean three-pass workflow is:

Pass 1: Build the first coherent branch with system flags, jumps, and variants.
Pass 2: Run tempo2 and PQC on top of Pass 1 without applying QC edits.
Pass 3: Read the QC outputs from Pass 2 and comment flagged TOAs on a new branch.

The Three Run Profiles

You should already have these files:

configs/runs/fixdataset/single_pulsar_step1_fix.toml
configs/runs/pqc/single_pulsar_pqc_detect.toml
configs/runs/fixdataset/single_pulsar_pqc_apply.toml

The third one is new and is shown below.

In practice, the workflow file works best when each referenced run profile can also be executed directly on its own. That makes stage-level debugging much simpler.

The QC-Apply Profile

Example: configs/runs/fixdataset/single_pulsar_pqc_apply.toml

Tracked repository example: configs/runs/fixdataset/single_pulsar_pqc_apply.example.toml

home_dir = "/data/canonical"
dataset_name = "EPTA-DR3/epta-dr3-data"
results_dir = "results"
singularity_image = "/work/containers/psrpta.sif"

branches = ["step3_apply_qc_comments"]
reference_branch = "step3_apply_qc_comments"
pulsars = ["J1909-3744"]
jobs = 1
outdir_name = "j1909_pqc_apply"

# Pass 3 only edits the dataset, so the analysis and reporting stages stay off.
run_tempo2 = false
run_pqc = false
qc_report = false
make_plots = false
make_reports = false
make_covmat = false

run_fix_dataset = true
fix_apply = true
fix_base_branch = "step2_pqc_balanced_detect"
fix_branch_name = "step3_apply_qc_comments"
fix_commit_message = "Step3: apply PQC comments for J1909-3744"

fix_qc_results_dir = "results/j1909_pqc_detect/qc"
fix_qc_branch = "step2_pqc_balanced_detect"

fix_qc_remove_outliers = true
fix_qc_action = "comment"
fix_qc_outlier_cols = ["bad_point", "robust_outlier", "robust_global_outlier", "bad_mad"]
fix_qc_remove_bad = true
fix_qc_remove_transients = false
fix_qc_remove_solar = false
fix_qc_remove_orbital_phase = false

fix_generate_alltim_variants = true
fix_backend_classifications_path = "configs/catalogs/variants/backend_classifications_legacy_new.toml"
fix_alltim_variants_path = "configs/catalogs/variants/alltim_variants_legacy_new.toml"
fix_jump_reference_variants = true
fix_jump_reference_jump_flag = "-sys"

This is the single-pulsar version of configs/workflows/steps/step3_apply_qc_comments_variants.toml.

How To Explain The QC-Apply Keys

fix_qc_results_dir: Directory containing the QC outputs produced by the detect run.
fix_qc_branch: Branch name that the QC outputs correspond to.
fix_qc_remove_outliers: Enable QC-driven action on flagged TOAs.
fix_qc_action = "comment": Comment flagged TOAs rather than delete them. This is the recommended first policy.
fix_qc_outlier_cols: Explicit QC columns that should count as actionable outlier evidence.
fix_qc_remove_transients = false: Do not automatically comment transient or event families until their meaning has been reviewed explicitly in the QC outputs.

Why The Apply Stage Uses Explicit Outlier Columns

Action policy is distinct from detection policy: detection decides which TOAs get flagged, while action decides which of those flags are acted on.

For a first apply pass, a narrow explicit list like:

fix_qc_outlier_cols = ["bad_point", "robust_outlier", "robust_global_outlier", "bad_mad"]

is better than a vague “anything suspicious” rule.

This keeps the first apply pass auditable.
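
The effect of an explicit column list can be sketched in a few lines. The rows below are made up for illustration and do not reflect the real PQC output schema. The rule "act if any listed column is set" is an assumption about how the list is used, and the helper `actionable` and the `transient` column name are hypothetical.

```python
# Column list copied from the apply profile.
OUTLIER_COLS = ["bad_point", "robust_outlier", "robust_global_outlier", "bad_mad"]

# Made-up QC rows for illustration only; real PQC output has its own schema.
qc_rows = [
    {"toa": "toa_001", "bad_point": False, "robust_outlier": False, "transient": True},
    {"toa": "toa_002", "bad_point": True, "robust_outlier": False, "transient": False},
    {"toa": "toa_003", "bad_point": False, "robust_outlier": True, "transient": True},
]


def actionable(row: dict, cols: list[str]) -> list[str]:
    """Return the listed columns that are set for this row (missing counts as unset)."""
    return [c for c in cols if row.get(c, False)]


# Assumption: a TOA is acted on when at least one listed column is set.
decisions = {row["toa"]: actionable(row, OUTLIER_COLS) for row in qc_rows}
for toa, reasons in decisions.items():
    verdict = "comment" if reasons else "keep"
    print(f"{toa}: {verdict} {reasons}")
```

In this sketch `toa_001` is kept even though its `transient` flag is set, because that column is not in the list. Every commented TOA also carries the names of the columns that triggered it, which is what makes the pass auditable.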

How `fix_qc_results_dir` And `fix_qc_branch` Work Together

These two keys are easy to misunderstand.

fix_qc_results_dir: Points to the run-directory location where the QC outputs were written.
fix_qc_branch: Tells the apply stage which branch-specific QC subdirectory or context those outputs correspond to.

Together they define the hand-off from Step 2 to Step 3:

Step 2 generates QC outputs under its run directory,
Step 3 reads those outputs back in and applies the chosen action policy to a new branch.

If these paths are wrong, the apply stage can appear to ignore QC results even though the real problem is that it is reading the wrong run location.
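
A small pre-flight check can catch a broken hand-off before the apply stage runs. The sketch below is an assumption-laden illustration: `handoff_problems` is a hypothetical helper, and the `detect_pass` dictionary simply records the Pass 2 branch and run directory used in this tutorial rather than keys read from the detect profile.

```python
from pathlib import PurePosixPath

# Where Pass 2 wrote its outputs (values from this tutorial, held in a
# hypothetical dictionary rather than read from the detect profile).
detect_pass = {
    "branch": "step2_pqc_balanced_detect",
    "run_dir": "results/j1909_pqc_detect",
}

# Hand-off keys copied from the apply profile.
apply_profile = {
    "fix_qc_results_dir": "results/j1909_pqc_detect/qc",
    "fix_qc_branch": "step2_pqc_balanced_detect",
}


def handoff_problems(detect: dict, apply: dict) -> list[str]:
    """Hypothetical check: list mismatches between Pass 2 outputs and Pass 3 inputs."""
    problems = []
    qc_dir = PurePosixPath(apply["fix_qc_results_dir"])
    run_dir = PurePosixPath(detect["run_dir"])
    # The QC directory should live somewhere below the detect run directory.
    if run_dir not in qc_dir.parents:
        problems.append(f"{qc_dir} is not inside detect run directory {run_dir}")
    # The QC branch should be the branch the detect run analysed.
    if apply["fix_qc_branch"] != detect["branch"]:
        problems.append(
            f"fix_qc_branch={apply['fix_qc_branch']!r} but detect ran on {detect['branch']!r}"
        )
    return problems


problems = handoff_problems(detect_pass, apply_profile)
print(problems or "hand-off looks consistent")
```

The check only compares strings and paths. It does not confirm that the QC files exist on disk.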

The Workflow File

Once the three run profiles exist, make a workflow file under configs/workflows/.

Example: configs/workflows/single_pulsar_3pass.toml

Tracked repository example: configs/workflows/single_pulsar_3pass.example.toml

config = "configs/runs/fixdataset/single_pulsar_step1_fix.toml"
mode = "serial"

[[groups]]
name = "step1_fix_flags_and_jumps"
mode = "serial"

[[groups.steps]]
name = "pipeline"
config = "configs/runs/fixdataset/single_pulsar_step1_fix.toml"

[[groups]]
name = "step2_detect"
mode = "serial"

[[groups.steps]]
name = "pipeline"
config = "configs/runs/pqc/single_pulsar_pqc_detect.toml"

[[groups]]
name = "step3_apply"
mode = "serial"

[[groups.steps]]
name = "pipeline"
config = "configs/runs/fixdataset/single_pulsar_pqc_apply.toml"

This is the stripped-down single-pulsar form of the repository’s branch-chained workflow pattern in configs/workflows/branch_chained_fix_pqc_variants.toml.

How Run Directories And Branches Relate In The Workflow

The workflow coordinates two parallel pieces of state:

dataset branches,
run directories under results_dir.

These are related, but they are not the same thing.

Example sequence:

Pass 1 writes branch step1_fix_flags_variants and run directory results/j1909_step1_fix,
Pass 2 writes branch step2_pqc_balanced_detect and run directory results/j1909_pqc_detect,
Pass 3 writes branch step3_apply_qc_comments and run directory results/j1909_pqc_apply.

The branch names define the mutation history of the dataset. The run directories define where logs, summaries, plots, and QC products are stored.
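
The two kinds of state can be kept side by side in a small lookup table. This is only an illustrative sketch: `PassState` and `run_dir_for_branch` are hypothetical names, and the values are the ones from the example sequence above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassState:
    name: str
    branch: str   # dataset mutation history
    run_dir: str  # logs, summaries, plots, QC products


PASSES = [
    PassState("Pass 1", "step1_fix_flags_variants", "results/j1909_step1_fix"),
    PassState("Pass 2", "step2_pqc_balanced_detect", "results/j1909_pqc_detect"),
    PassState("Pass 3", "step3_apply_qc_comments", "results/j1909_pqc_apply"),
]


def run_dir_for_branch(branch: str) -> str:
    """Return the run directory of the pass that wrote the given branch."""
    for state in PASSES:
        if state.branch == branch:
            return state.run_dir
    raise KeyError(f"no pass wrote branch {branch!r}")


print(run_dir_for_branch("step2_pqc_balanced_detect"))
```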

How To Run The Workflow

Run:

pleb workflow --file configs/workflows/single_pulsar_3pass.toml

Running the full workflow is most useful after each stage has been run manually at least once.

When To Prefer Manual Runs Over The Workflow File

Use the workflow file when:

the stage order is stable,
the branch hand-off is already understood,
the goal is repeatability.

Run stages manually when:

a config is still being tuned,
a branch name or output path is changing,
the detect/apply hand-off is being debugged,
one stage is failing and needs isolated inspection.

Why Branch Chaining Matters

Each pass starts from the previous branch and writes a new branch:

raw_ingest -> step1_fix_flags_variants
step1_fix_flags_variants -> step2_pqc_balanced_detect
step2_pqc_balanced_detect -> step3_apply_qc_comments

This branch pattern matters because it keeps the effect of each stage on its own branch:

Step 1 changes metadata and jump structure,
Step 2 generates diagnostics,
Step 3 applies selected QC actions.
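
The chaining rule, that each pass starts from the branch the previous pass wrote, is easy to verify mechanically. The sketch below uses the branch names from this section. `broken_links` is a hypothetical helper, not a pleb function.

```python
# (base branch, output branch) for each pass, in order.
CHAIN = [
    ("raw_ingest", "step1_fix_flags_variants"),
    ("step1_fix_flags_variants", "step2_pqc_balanced_detect"),
    ("step2_pqc_balanced_detect", "step3_apply_qc_comments"),
]


def broken_links(chain: list[tuple[str, str]]) -> list[tuple[int, str, str]]:
    """Return (pass number, expected base, actual base) for every pass that
    does not start from the previous pass's output branch."""
    problems = []
    for i in range(1, len(chain)):
        expected_base = chain[i - 1][1]
        actual_base = chain[i][0]
        if actual_base != expected_base:
            problems.append((i + 1, expected_base, actual_base))
    return problems


print(broken_links(CHAIN) or "branch chain is consistent")
```

A non-empty result points at the pass whose base branch skips or misnames its predecessor.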

Debugging Workflow Mode

If the full workflow fails, do not debug the whole workflow at once.

Instead:

identify which pass failed,
run that pass directly with pleb --config ...,
inspect the resolved config in run_settings/,
fix the stage-specific issue,
rerun the workflow.

This avoids treating workflow mode like a black box.

Final Rule

A workflow file is a convenience layer, not a substitute for understanding the individual run profiles.

If each of the three run profiles cannot be explained independently, the workflow file is still too opaque for routine use.

pleb - The EPTA Data Combination Pipeline