Pipeline overview

CardiacNexus converts UK Biobank cardiovascular magnetic resonance (CMR) inputs into modality-specific segmentations, tracking artifacts, feature CSV files, NPZ time series, QC visualizations, and aggregate phenotype tables.

Modality: Multimodal CMR
Pipeline step: Step 1 preparation, Step 2 segmentation, Step 3 modality-specific feature extraction, Step 4 combined features and aggregation
Outputs: Per-subject features, aggregate phenotype tables, time-series files, QC artifacts
Maturity: Source-audited overview page

Current workflow

Stage | Current role | Main files
--- | --- | ---
Preparation | Creates analysis-ready subject directories and NIfTI inputs from UKB source imaging | step1_prepare_data_cmr.py, scripts/prepare_data.py
Segmentation | Runs modality-specific segmentation wrappers for cine, aorta, flow, LVOT, and native T1 inputs | step2_segment.py, src/segmentation/**
Feature extraction | Computes modality-specific phenotype rows, time-series files, and QC plots | step3_extract_feature_separate.py, src/feature_extraction/**
Combined features | Computes cross-modality rows such as valve/AVPD/IPVT and corrected native T1 | step4_extract_feature_combined.py, src/feature_extraction/Combined_Features/**
Aggregation and validation | Concatenates per-subject outputs, or documents output contracts, for downstream analysis | scripts/aggregate_csv.py, docs/data/**, website/scripts/**
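The aggregation stage can be pictured with a minimal sketch. It is not the contents of scripts/aggregate_csv.py. The function signature, the one-directory-per-subject layout, the `features.csv` filename, and the added `subject_id` column are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def aggregate_csv(subject_root: Path, filename: str, out_path: Path) -> int:
    """Concatenate one per-subject CSV into a single table; return rows written.

    Hypothetical layout: <subject_root>/<subject_id>/<filename>.
    """
    rows = []
    columns = ["subject_id"]
    for csv_path in sorted(subject_root.glob(f"*/{filename}")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Assumption: the directory name is the subject identifier.
                row["subject_id"] = csv_path.parent.name
                rows.append(row)
                # Keep first-seen column order; later subjects may add columns.
                for name in row:
                    if name not in columns:
                        columns.append(name)
    with out_path.open("w", newline="") as handle:
        # restval="" leaves a cell empty when a subject lacks that column.
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Leaving absent columns empty, rather than dropping the subject, keeps conditional rows visible as gaps in the aggregate table.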

QC dependency

Many phenotypes depend on segmentation quality, cardiac frame selection, temporal smoothing, geometry metadata, BSA, ECG timing, pressure data, or DICOM acquisition parameters. The output row alone is not the full provenance.

Extraction layers

Structural phenotypes describe size and geometry: ventricular volumes, myocardial mass, wall thickness, atrial diameters and volumes, aortic diameter/length/curvature/torsion, and valve/root diameters.

Functional phenotypes describe dynamics: stroke volume, ejection fraction, filling and emptying rates, strain, torsion, recoil, distensibility, phase-contrast flow, regurgitant fraction, flow displacement, and cross-chamber coupling.
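As a worked illustration of two of these phenotypes, the sketch below derives stroke volume and ejection fraction from a ventricular volume curve using the standard definitions SV = EDV - ESV and EF = 100 × SV / EDV. Taking the curve maximum and minimum as end-diastole and end-systole is an assumption of this sketch. The pipeline's own frame selection may differ, and the function name is hypothetical.

```python
from typing import Sequence


def stroke_volume_and_ef(volumes_ml: Sequence[float]) -> tuple[float, float]:
    """Return (stroke volume in mL, ejection fraction in %) for one cardiac cycle."""
    if len(volumes_ml) < 2:
        raise ValueError("need at least two frames to define a cycle")
    # Assumption: end-diastole is the largest volume, end-systole the smallest.
    edv = max(volumes_ml)
    esv = min(volumes_ml)
    if edv <= 0:
        raise ValueError("end-diastolic volume must be positive")
    sv = edv - esv
    return sv, 100.0 * sv / edv


# Example: a five-frame volume curve in mL.
sv, ef = stroke_volume_and_ef([150.0, 120.0, 60.0, 90.0, 140.0])
print(sv, ef)  # 90.0 60.0
```

Because both values hinge on which frames are chosen, a mis-segmented frame at either extreme shifts them directly, which is one reason these rows depend on QC.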

Tissue phenotypes currently describe native T1 and blood-corrected native T1. ECV is contrast-dependent, so it is documented as clinical context only unless a current pipeline contract emits ECV rows.

What source-audited means here

A source-audited page is checked against the current implementation and data registries, but it is not a guarantee that every subject row exists. Some rows are conditional on BSA, ECG, pulse pressure, valid timing, QC success, or backend peak detection. Public pages should name these conditions instead of silently implying complete coverage.
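One way to make such conditions explicit is to declare them next to each row and report which are unmet for a subject. The sketch below is illustrative only: `RowContract`, `missing_conditions`, the row name, and the condition labels are hypothetical and are not taken from the pipeline's data registries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowContract:
    """A phenotype row and the inputs it needs (hypothetical structure)."""
    name: str
    requires: tuple[str, ...]


def missing_conditions(contract: RowContract, available: set[str]) -> list[str]:
    """List the required conditions a subject does not satisfy, in declared order."""
    return [cond for cond in contract.requires if cond not in available]


# Illustrative contract; the row name and condition labels are assumptions.
distensibility = RowContract(
    name="aortic_distensibility",
    requires=("pulse_pressure", "qc_success"),
)

# This subject passed QC and has BSA, but has no pulse pressure recorded.
unmet = missing_conditions(distensibility, {"qc_success", "bsa"})
print(unmet)  # ['pulse_pressure']
```

An empty list means the row can be emitted. A non-empty list gives a page or log the exact reason to name when a row is absent.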

Source audit

  • Step roles were checked against the current step*.py orchestration files and the repository architecture rules.
  • Feature families were checked against the current source tree under src/feature_extraction/**.
  • Output-contract and validation expectations were checked against docs/data/** and website/scripts/**.
  • Textbook context boundary: broad clinical textbook context is not surfaced here because this page documents pipeline architecture rather than disease interpretation.