discovery

DICOM discovery and loading.

Two primitives:

  • :func:discover_dicom walks a directory, groups files by SeriesInstanceUID, and annotates each series with a best-effort role hint (dynamic / dynamic_frame / vfa / t1_look_locker / unknown). It never touches pixel data and logs a transparent summary of what it found and why.
  • :func:load_dicom_series loads one :class:SeriesInfo (3D or 4D, depending on its TemporalPositionIdentifier (TPI) structure) or stacks a list of per-timepoint series into a 4D volume with a derived time vector.

Design: discovery is observation only. Role hints are advisory: callers may ignore them and select series by any attribute on SeriesInfo. No modality-specific behaviour is baked into loading; modality is simply carried through onto the returned :class:PerfusionDataset.
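The grouping step described above can be sketched without any pixel access. This is a minimal illustration, not the library's implementation: group_by_series is a hypothetical helper, and it assumes the (path, SeriesInstanceUID) pairs were already read from file headers (for instance with pydicom's dcmread(path, stop_before_pixels=True); whether the library does so is not stated here).

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def group_by_series(headers: Iterable[Tuple[Path, str]]) -> Dict[str, List[Path]]:
    """Group file paths by SeriesInstanceUID.

    Hypothetical helper for illustration only. `headers` yields
    (path, series_instance_uid) pairs taken from file headers, so
    pixel data is never needed for this step.
    """
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path, uid in headers:
        groups[uid].append(path)
    # Sort each series' files so the result is deterministic.
    return {uid: sorted(paths) for uid, paths in groups.items()}


# Made-up paths and UIDs, purely for demonstration.
example = [
    (Path("scan/b.dcm"), "1.2.3"),
    (Path("scan/a.dcm"), "1.2.3"),
    (Path("scan/c.dcm"), "1.2.4"),
]
grouped = group_by_series(example)
```

Each key of grouped corresponds to one series, which is the unit that SeriesInfo describes.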

SeriesInfo dataclass

SeriesInfo(uid, study_instance_uid, files, description, series_number, dicom_modality, manufacturer, flip_angle, tr, te, field_strength, rows, columns, n_temporal_positions, n_acquisition_numbers, n_slice_locations, image_types=set(), acquisition_time=None, acquisition_time_sec=None, trigger_times=set(), trigger_time_hint=None, role_hint='unknown', group_key=None, reason='')

Metadata for one DICOM series, produced by :func:discover_dicom.

ATTRIBUTE DESCRIPTION
uid

SeriesInstanceUID (0020,000E).

TYPE: str

study_instance_uid

StudyInstanceUID (0020,000D). Used to keep per-timepoint frame clustering from crossing study boundaries (e.g. multiple visits exported under one root).

TYPE: str | None

files

All files belonging to this series, sorted.

TYPE: list[Path]

description

SeriesDescription (0008,103E).

TYPE: str

series_number

SeriesNumber (0020,0011).

TYPE: int | None

dicom_modality

DICOM Modality (0008,0060) — e.g. "MR", "KO".

TYPE: str | None

manufacturer

Manufacturer (0008,0070).

TYPE: str | None

flip_angle

FlipAngle (0018,1314), only set when uniform across the series.

TYPE: float | None

tr, te, field_strength

RepetitionTime (0018,0080), EchoTime (0018,0081), MagneticFieldStrength (0018,0087).

TYPE: float | None

rows, columns

Image matrix size (from first file).

TYPE: int | None

n_temporal_positions

Count of unique TemporalPositionIdentifier values. 0 when the tag is absent on all files. >1 marks a single-series dynamic (all timepoints packaged inside one SeriesInstanceUID).

TYPE: int

n_acquisition_numbers

Count of unique AcquisitionNumber values. Used as a fallback single-series dynamic signal when TemporalPositionIdentifier is absent (common on older TCIA DCE exports).

TYPE: int

n_slice_locations

Count of unique SliceLocation values.

TYPE: int

image_types

Unique ImageType (0008,0008) combinations. Used to detect mixed magnitude/phase series (Philips dual-output exports).

TYPE: set[tuple[str, ...]]

acquisition_time

First file's AcquisitionTime (HHMMSS.FFFFFF).

TYPE: str | None

acquisition_time_sec

Parsed acquisition_time in seconds since midnight.

TYPE: float | None

trigger_times

Unique TriggerTime values across the series (ms).

TYPE: set[float]

trigger_time_hint

Seconds parsed from the SeriesDescription TT=X.Xs suffix, when present — only emitted by per-timepoint exports (one 3D series per dynamic frame).

TYPE: float | None

role_hint

Best-effort classification: one of dynamic, dynamic_frame, vfa, t1_look_locker, or unknown (as listed in the module docstring).

TYPE: RoleHint

group_key

For dynamic_frame series: a key shared by every frame of the same multi-series dynamic. None otherwise.

TYPE: str | None

reason

Human-readable justification for the role_hint.

TYPE: str
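Two of the attributes above, acquisition_time_sec and trigger_time_hint, are parsed from strings. The sketch below shows one plausible way to derive them. Both function names are hypothetical, and the exact pattern the library uses for the TT=X.Xs suffix is an assumption.

```python
import re
from typing import Optional

# Assumed shape of the description suffix, e.g. "DCE dyn TT=12.5s".
TT_PATTERN = re.compile(r"TT=(\d+(?:\.\d+)?)s\s*$")


def parse_dicom_time(value: str) -> Optional[float]:
    """Convert a DICOM time string (HHMMSS.FFFFFF) to seconds since midnight.

    Returns None when the string does not look like a DICOM time.
    """
    match = re.fullmatch(r"(\d{2})(\d{2})?(\d{2})?(\.\d{1,6})?", value.strip())
    if match is None:
        return None
    hours = int(match.group(1))
    minutes = int(match.group(2) or 0)
    seconds = int(match.group(3) or 0)
    fraction = float(match.group(4) or 0)
    return hours * 3600 + minutes * 60 + seconds + fraction


def parse_trigger_time_hint(description: str) -> Optional[float]:
    """Extract seconds from a trailing TT=X.Xs suffix, or None if absent."""
    match = TT_PATTERN.search(description)
    return float(match.group(1)) if match else None


acq_seconds = parse_dicom_time("101530.250000")
tt_seconds = parse_trigger_time_hint("DCE dyn TT=12.5s")
no_hint = parse_trigger_time_hint("T1 map")
```

Returning None rather than raising mirrors the float | None types documented for these attributes.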

discover_dicom

discover_dicom(path, modality=None, recursive=True)

Walk path, group DICOM files by series, and classify each series.

PARAMETER DESCRIPTION
path

Directory (or single file) to scan.

TYPE: str | Path

modality

Reserved for future modality-specific classification tweaks. Currently ignored — the heuristics do not change by modality because the signals they rely on (TPI, FlipAngle, ImageType, description patterns) are modality-agnostic.

TYPE: Modality | str | None DEFAULT: None

recursive

Recurse into subdirectories. Handles both flat (NKI) and nested (TCIA) layouts.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
list[SeriesInfo]

One entry per unique SeriesInstanceUID under path. Sorted by series_number (when available), then UID.
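The documented ordering (series_number when available, then UID) can be expressed as a sort key. This is a sketch only: SortStub is a hypothetical stand-in for SeriesInfo, and placing series without a number last is an assumption, since the text does not say where they go.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SortStub:
    """Hypothetical stand-in carrying only the fields used for sorting."""
    uid: str
    series_number: Optional[int]


def series_sort_key(series: SortStub) -> Tuple[bool, int, str]:
    # Assumption: series without a SeriesNumber sort after numbered ones.
    return (series.series_number is None, series.series_number or 0, series.uid)


unsorted_series = [
    SortStub("1.2.9", None),
    SortStub("1.2.5", 3),
    SortStub("1.2.1", 3),
    SortStub("1.2.7", 1),
]
ordered_uids = [s.uid for s in sorted(unsorted_series, key=series_sort_key)]
```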

Notes

Role hints (SeriesInfo.role_hint) are advisory. Callers can — and should — inspect SeriesInfo directly when the heuristic is wrong. A summary of the classification is logged at INFO level.
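Because role hints are advisory, a caller can select series from the raw attributes instead. The sketch below uses a hypothetical HintStub class with a few of the documented SeriesInfo fields; the series descriptions and values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HintStub:
    """Hypothetical stand-in with a subset of the SeriesInfo attributes."""
    description: str
    flip_angle: Optional[float]
    n_temporal_positions: int
    role_hint: str = "unknown"


candidates = [
    HintStub("T1 VFA FA=5", 5.0, 0, "vfa"),
    HintStub("DCE perfusion", 12.0, 40, "unknown"),  # heuristic missed this one
    HintStub("Localizer", None, 0, "unknown"),
]

# Accept the hint when it is present, but also fall back to the raw
# TemporalPositionIdentifier count when the hint is "unknown".
dynamic = [
    s for s in candidates
    if s.role_hint == "dynamic" or s.n_temporal_positions > 1
]
```

The same pattern works for any attribute, for example filtering on description or flip_angle.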

load_dicom_series

load_dicom_series(series, modality=None)

Load one or more DICOM series as a :class:PerfusionDataset.

PARAMETER DESCRIPTION
series

Either a single :class:SeriesInfo (loaded as 3D or 4D depending on its TPI structure), or a list of series that should be stacked into one 4D volume. When every member of the list is VFA (role_hint == "vfa"), they are stacked along the flip-angle axis (sorted by flip angle) with acquisition_params.flip_angles populated from the source series. Otherwise the list is treated as per-timepoint frames — each a separate 3D volume representing one dynamic timepoint — which are sorted by the embedded timing and stacked into a 4D dataset with a derived time vector.

TYPE: SeriesInfo | list[SeriesInfo]

modality

Attached to the returned PerfusionDataset. Does not alter loading behaviour.

TYPE: Modality | str | None DEFAULT: None
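The two list-handling branches described for series can be sketched as follows. This illustrates the ordering logic only, not the library's code: FrameStub and order_for_stacking are hypothetical. Preferring trigger_time_hint over acquisition_time_sec as the "embedded timing" is an assumption, as is expressing the time vector relative to the first frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FrameStub:
    """Hypothetical stand-in with the SeriesInfo fields used for ordering."""
    role_hint: str
    flip_angle: Optional[float] = None
    trigger_time_hint: Optional[float] = None
    acquisition_time_sec: Optional[float] = None


def frame_time(frame: FrameStub) -> Optional[float]:
    # Assumption: prefer the description-derived hint, else acquisition time.
    if frame.trigger_time_hint is not None:
        return frame.trigger_time_hint
    return frame.acquisition_time_sec


def order_for_stacking(frames: List[FrameStub]) -> Tuple[List[FrameStub], List[float]]:
    """Return frames in stacking order plus the values along the last axis.

    Assumes every VFA frame has a flip angle and every other frame has
    some timing value; a real loader would need to validate this.
    """
    if all(f.role_hint == "vfa" for f in frames):
        ordered = sorted(frames, key=lambda f: f.flip_angle)
        return ordered, [f.flip_angle for f in ordered]
    ordered = sorted(frames, key=frame_time)
    start = frame_time(ordered[0])
    # Assumption: the derived time vector is relative to the first frame.
    return ordered, [frame_time(f) - start for f in ordered]


vfa_frames = [FrameStub("vfa", flip_angle=a) for a in (15.0, 5.0, 10.0)]
dyn_frames = [FrameStub("dynamic_frame", trigger_time_hint=t) for t in (7.0, 2.0, 4.5)]

_, flip_angles = order_for_stacking(vfa_frames)
_, time_vector = order_for_stacking(dyn_frames)
```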

RETURNS DESCRIPTION
PerfusionDataset

For a single 3D series, time_points is None. For 4D data (a single-series dynamic, a per-timepoint stack of separate 3D volumes, or a VFA stack), time_points is populated. VFA stacks use an integer placeholder because the last axis is flip angle, not time.

RAISES DESCRIPTION
IOError

If no magnitude frames can be read.

DataValidationError

If a multi-series load sees inconsistent shapes across frames, or if fewer than 2 series are passed as a list.
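The two DataValidationError conditions can be sketched as a pre-stacking check. The exception class below is a local stand-in with the same name as the library's, and validate_frame_shapes is a hypothetical helper; the real checks and messages may differ.

```python
from typing import Sequence, Tuple


class DataValidationError(ValueError):
    """Local stand-in for the library's exception of the same name."""


def validate_frame_shapes(shapes: Sequence[Tuple[int, ...]]) -> None:
    """Raise if a multi-series stack cannot be formed from these 3D shapes."""
    if len(shapes) < 2:
        raise DataValidationError(
            f"need at least 2 series to stack, got {len(shapes)}"
        )
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if shape != reference:
            raise DataValidationError(
                f"frame {index} has shape {shape}, expected {reference}"
            )


# Consistent shapes pass silently.
validate_frame_shapes([(64, 64, 20), (64, 64, 20)])

# A mismatched frame is reported with its index.
try:
    validate_frame_shapes([(64, 64, 20), (64, 64, 18)])
    mismatch_message = ""
except DataValidationError as exc:
    mismatch_message = str(exc)
```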