Skip to content

Conversation

@MGAMZ
Copy link
Owner

@MGAMZ MGAMZ commented Jan 17, 2026

Combine multiple label folders by intersecting filenames and merging labels according to ordered mapping rules. This tool is useful when you have multiple specialized segmentations for the same cases and want to create a unified label map.

Usage

itk_combine --source <name>=<folder> --map <mapping_rule> <dest_folder> [options]

Parameters

  • --source: Specify a label source in the format name=/path/to/folder. Can be specified multiple times for different sources.
  • --map: Specify a mapping rule in the format <source_name>:<source_labels>-><target_label>.
    • <source_name> must match one of the names defined in --source.
    • <source_labels> can be a single integer or a comma-separated list of integers.
    • Multiple --map rules are allowed. Priority is determined by order: the first rule that matches a voxel determines its value in the output.
  • dest_folder: Destination folder for the combined label files.
  • --mp: Enable multiprocessing.
  • --workers: Number of worker processes (defaults to half of CPU cores).

Example

Suppose you have:

  • Source A: Organ segmentations (1: Liver, 2: Spleen)
  • Source B: Tumor segmentations (1: Liver Tumor)

To combine them into a single map where Background=0, Liver=1, Spleen=2, and Liver Tumor=3 (with tumor taking priority over the organ label):

itk_combine \
    --source organs=/path/to/organs \
    --source tumors=/path/to/tumors \
    --map tumors:1->3 \
    --map organs:1->1 \
    --map organs:2->2 \
    /path/to/combined_output \
    --mp

@MGAMZ MGAMZ marked this pull request as ready for review January 17, 2026 11:22
Copilot AI review requested due to automatic review settings January 17, 2026 11:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new itk_combine command-line tool to merge multiple label segmentations from different sources into a unified label map. The tool intersects filenames across multiple source folders and applies ordered mapping rules to combine labels with priority handling.

Changes:

  • Added CombineProcessor class implementing label merging logic with validation and priority-based rule application
  • Implemented CLI with --source and --map arguments for flexible label combination
  • Added comprehensive tests covering 2-4 source combinations with various mapping scenarios

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
itkit/process/itk_combine.py Main implementation with processor class, argument parsing, and mapping logic
tests/itk_process/test_itk_combine.py Test suite with 4 test cases covering simple and complex mapping scenarios
pyproject.toml Added itk_combine entry point for CLI
docs/itk_combine.md Comprehensive documentation with usage examples and parameter descriptions
docs/index.md Added reference to itk_combine in documentation index

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

help="Mapping rule in form `<source>:<src_labels>-><target>`, e.g., `A:1,2->3` (repeatable)",
)
parser.add_argument(
"-o", "dest_folder",
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The positional argument syntax is incorrect. The first string in add_argument for a positional argument should not start with a dash. This should be:

parser.add_argument(
"dest_folder",
type=Path,
help="Destination folder for combined labels",
)

Without this fix, argparse will interpret this as an optional argument with flag "-o" but users won't be able to specify the destination folder as a positional argument as shown in the documentation.

Suggested change
"-o", "dest_folder",
"dest_folder",

Copilot uses AI. Check for mistakes.
seen_names: set[str] = set()
for item in source_args:
if "=" not in item:
raise ValueError(f"Invalid source format: {item}. Expected name=/path/to/labels")
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message text is inconsistent with the expected format described in the error message itself. The message says "Expected name=/path/to/labels" but the parsing splits on "=" to get name and folder. Consider changing the message to "Expected name=/path/to/folder" to match the actual parameter description in the documentation and help text.

Suggested change
raise ValueError(f"Invalid source format: {item}. Expected name=/path/to/labels")
raise ValueError(f"Invalid source format: {item}. Expected name=/path/to/folder")

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +187
import os
import tempfile
from pathlib import Path

import numpy as np
import pytest
import SimpleITK as sitk

from itkit.process.itk_combine import CombineProcessor, MappingRule, SourceSpec


@pytest.fixture
def temp_dir():
with tempfile.TemporaryDirectory() as tmpdir:
yield tmpdir


def _write_label(folder: str, name: str, array: np.ndarray) -> str:
path = os.path.join(folder, name)
image = sitk.GetImageFromArray(array.astype(np.uint8))
image.SetSpacing((1.0, 1.0, 1.0))
sitk.WriteImage(image, path, True)
return path


def _make_array(shape=(3, 3, 3)) -> np.ndarray:
return np.zeros(shape, dtype=np.uint8)


@pytest.mark.itk_process
class TestCombineProcessor:
def test_two_sources_simple_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 0)

def test_two_sources_complex_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_a[1, :, :] = 2
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_b[2, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="A", source_labels=(2,), target_label=4),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="B", source_labels=(2,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 4)
assert np.all(output[2, :, :] == 3)

def test_three_sources_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1
arr_c = _make_array()
arr_c[2, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="C", source_labels=(1,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 3)

def test_four_sources_priority(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
src_d = os.path.join(temp_dir, "D")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(src_d)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_c = _make_array()
arr_c[1, :, :] = 2
arr_d = _make_array()
arr_d[1, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)
_write_label(src_d, "case.mha", arr_d)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
SourceSpec(name="D", folder=Path(src_d)),
]
rules = [
MappingRule(source_name="B", source_labels=(1,), target_label=5),
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="D", source_labels=(2,), target_label=6),
MappingRule(source_name="C", source_labels=(2,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 5)
assert np.all(output[1, :, :] == 6)
assert np.all(output[2, :, :] == 0)
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the command-line interface. Looking at test_process_mains.py, other similar tools (itk_extract, itk_resample, itk_orient, itk_patch, itk_aug) have main function tests that verify the CLI can be called successfully. Consider adding a test_itk_combine_main function that verifies the full command-line invocation works correctly, including argument parsing and execution.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +187
import os
import tempfile
from pathlib import Path

import numpy as np
import pytest
import SimpleITK as sitk

from itkit.process.itk_combine import CombineProcessor, MappingRule, SourceSpec


@pytest.fixture
def temp_dir():
with tempfile.TemporaryDirectory() as tmpdir:
yield tmpdir


def _write_label(folder: str, name: str, array: np.ndarray) -> str:
path = os.path.join(folder, name)
image = sitk.GetImageFromArray(array.astype(np.uint8))
image.SetSpacing((1.0, 1.0, 1.0))
sitk.WriteImage(image, path, True)
return path


def _make_array(shape=(3, 3, 3)) -> np.ndarray:
return np.zeros(shape, dtype=np.uint8)


@pytest.mark.itk_process
class TestCombineProcessor:
def test_two_sources_simple_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 0)

def test_two_sources_complex_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_a[1, :, :] = 2
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_b[2, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="A", source_labels=(2,), target_label=4),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="B", source_labels=(2,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 4)
assert np.all(output[2, :, :] == 3)

def test_three_sources_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1
arr_c = _make_array()
arr_c[2, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="C", source_labels=(1,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 3)

def test_four_sources_priority(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
src_d = os.path.join(temp_dir, "D")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(src_d)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_c = _make_array()
arr_c[1, :, :] = 2
arr_d = _make_array()
arr_d[1, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)
_write_label(src_d, "case.mha", arr_d)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
SourceSpec(name="D", folder=Path(src_d)),
]
rules = [
MappingRule(source_name="B", source_labels=(1,), target_label=5),
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="D", source_labels=(2,), target_label=6),
MappingRule(source_name="C", source_labels=(2,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 5)
assert np.all(output[1, :, :] == 6)
assert np.all(output[2, :, :] == 0)
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the parsing functions. The _parse_sources and _parse_mapping_rule functions handle error cases and validation, but there are no tests verifying these functions work correctly or handle edge cases properly. Consider adding tests similar to the TestParseLabelMappings class in test_itk_extract.py to cover:

  • Valid source parsing
  • Invalid source format (missing "=", empty name, non-existent folder)
  • Duplicate source names
  • Valid mapping rule parsing
  • Invalid mapping rule format (missing "->", missing ":", invalid integers)
  • Empty source labels list

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +187
import os
import tempfile
from pathlib import Path

import numpy as np
import pytest
import SimpleITK as sitk

from itkit.process.itk_combine import CombineProcessor, MappingRule, SourceSpec


@pytest.fixture
def temp_dir():
with tempfile.TemporaryDirectory() as tmpdir:
yield tmpdir


def _write_label(folder: str, name: str, array: np.ndarray) -> str:
path = os.path.join(folder, name)
image = sitk.GetImageFromArray(array.astype(np.uint8))
image.SetSpacing((1.0, 1.0, 1.0))
sitk.WriteImage(image, path, True)
return path


def _make_array(shape=(3, 3, 3)) -> np.ndarray:
return np.zeros(shape, dtype=np.uint8)


@pytest.mark.itk_process
class TestCombineProcessor:
def test_two_sources_simple_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 0)

def test_two_sources_complex_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_a[1, :, :] = 2
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_b[2, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="A", source_labels=(2,), target_label=4),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="B", source_labels=(2,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 4)
assert np.all(output[2, :, :] == 3)

def test_three_sources_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1
arr_c = _make_array()
arr_c[2, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="C", source_labels=(1,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 3)

def test_four_sources_priority(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
src_d = os.path.join(temp_dir, "D")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(src_d)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_c = _make_array()
arr_c[1, :, :] = 2
arr_d = _make_array()
arr_d[1, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)
_write_label(src_d, "case.mha", arr_d)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
SourceSpec(name="D", folder=Path(src_d)),
]
rules = [
MappingRule(source_name="B", source_labels=(1,), target_label=5),
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="D", source_labels=(2,), target_label=6),
MappingRule(source_name="C", source_labels=(2,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 5)
assert np.all(output[1, :, :] == 6)
assert np.all(output[2, :, :] == 0)
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the main function's validation logic. The main function (lines 173-210) contains important validation such as:

  • Checking that mapping rules reference valid source names
  • Ensuring at least one mapping rule is provided

These validation cases are not tested. Consider adding tests that verify these error conditions are properly handled.

Copilot uses AI. Check for mistakes.
def get_items_to_process(self) -> list[tuple[str, list[str]]]:
source_files: dict[str, dict[str, str]] = {}
for src in self.sources:
files = {p.name: str(p) for p in src.folder.glob("*.mha")}
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file discovery only looks for .mha files, but the base class SUPPORTED_EXTENSIONS includes '.mha', '.mhd', '.nii', '.nii.gz', '.nrrd'. While the documentation mentions that output is normalized to .mha format, there's no mention that input files must also be .mha. Consider either:

  1. Using a broader glob pattern to support all SUPPORTED_EXTENSIONS for input files, or
  2. Clearly documenting in itk_combine.md that only .mha input files are supported.

This hardcoded limitation is inconsistent with other processors in the codebase that typically support multiple input formats.

Suggested change
files = {p.name: str(p) for p in src.folder.glob("*.mha")}
files: dict[str, str] = {}
for ext in self.SUPPORTED_EXTENSIONS:
pattern = f"*{ext}" if ext.startswith(".") else f"*.{ext}"
for p in src.folder.glob(pattern):
files[p.name] = str(p)

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +186
if not rules:
raise ValueError("At least one mapping rule is required.")

Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for empty rules is redundant because the --map argument is marked as required=True in the argument parser. This validation can never be reached since argparse will fail earlier if no --map arguments are provided. Consider removing this redundant check.

Suggested change
if not rules:
raise ValueError("At least one mapping rule is required.")

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +187
import os
import tempfile
from pathlib import Path

import numpy as np
import pytest
import SimpleITK as sitk

from itkit.process.itk_combine import CombineProcessor, MappingRule, SourceSpec


@pytest.fixture
def temp_dir():
with tempfile.TemporaryDirectory() as tmpdir:
yield tmpdir


def _write_label(folder: str, name: str, array: np.ndarray) -> str:
path = os.path.join(folder, name)
image = sitk.GetImageFromArray(array.astype(np.uint8))
image.SetSpacing((1.0, 1.0, 1.0))
sitk.WriteImage(image, path, True)
return path


def _make_array(shape=(3, 3, 3)) -> np.ndarray:
return np.zeros(shape, dtype=np.uint8)


@pytest.mark.itk_process
class TestCombineProcessor:
def test_two_sources_simple_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 0)

def test_two_sources_complex_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_a[1, :, :] = 2
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_b[2, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="A", source_labels=(2,), target_label=4),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="B", source_labels=(2,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 4)
assert np.all(output[2, :, :] == 3)

def test_three_sources_mapping(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[1, :, :] = 1
arr_c = _make_array()
arr_c[2, :, :] = 1

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
]
rules = [
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="B", source_labels=(1,), target_label=2),
MappingRule(source_name="C", source_labels=(1,), target_label=3),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 1)
assert np.all(output[1, :, :] == 2)
assert np.all(output[2, :, :] == 3)

def test_four_sources_priority(self, temp_dir):
src_a = os.path.join(temp_dir, "A")
src_b = os.path.join(temp_dir, "B")
src_c = os.path.join(temp_dir, "C")
src_d = os.path.join(temp_dir, "D")
out_dir = os.path.join(temp_dir, "out")
os.makedirs(src_a)
os.makedirs(src_b)
os.makedirs(src_c)
os.makedirs(src_d)
os.makedirs(out_dir)

arr_a = _make_array()
arr_a[0, :, :] = 1
arr_b = _make_array()
arr_b[0, :, :] = 1
arr_c = _make_array()
arr_c[1, :, :] = 2
arr_d = _make_array()
arr_d[1, :, :] = 2

_write_label(src_a, "case.mha", arr_a)
_write_label(src_b, "case.mha", arr_b)
_write_label(src_c, "case.mha", arr_c)
_write_label(src_d, "case.mha", arr_d)

sources = [
SourceSpec(name="A", folder=Path(src_a)),
SourceSpec(name="B", folder=Path(src_b)),
SourceSpec(name="C", folder=Path(src_c)),
SourceSpec(name="D", folder=Path(src_d)),
]
rules = [
MappingRule(source_name="B", source_labels=(1,), target_label=5),
MappingRule(source_name="A", source_labels=(1,), target_label=1),
MappingRule(source_name="D", source_labels=(2,), target_label=6),
MappingRule(source_name="C", source_labels=(2,), target_label=2),
]

processor = CombineProcessor(sources=sources, dest_folder=Path(out_dir), mapping_rules=rules)
processor.process("Combine")

output = sitk.GetArrayFromImage(sitk.ReadImage(os.path.join(out_dir, "case.mha")))
assert np.all(output[0, :, :] == 5)
assert np.all(output[1, :, :] == 6)
assert np.all(output[2, :, :] == 0)
Copy link

Copilot AI Jan 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for edge cases in the CombineProcessor. Consider adding tests for:

  1. Empty intersection (no common files across all sources)
  2. Size mismatch between source images (should raise ValueError)
  3. Spacing mismatch between source images (should raise ValueError)
  4. Multiple rules mapping to the same target label from different sources
  5. A rule with multiple source labels (e.g., source_labels=(1, 2, 3))

These edge cases are important to verify the robustness of the implementation.

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Jan 17, 2026

Codecov Report

❌ Patch coverage is 70.22901% with 78 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@2fc24c2). Learn more about missing BASE report.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
itkit/process/itk_combine.py 44.28% 74 Missing and 4 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main      #61   +/-   ##
=======================================
  Coverage        ?   72.18%           
=======================================
  Files           ?      108           
  Lines           ?    12060           
  Branches        ?     1077           
=======================================
  Hits            ?     8705           
  Misses          ?     3106           
  Partials        ?      249           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MGAMZ MGAMZ merged commit 5a2e2a3 into main Jan 17, 2026
7 checks passed
@MGAMZ MGAMZ deleted the dev/itk-combine branch January 17, 2026 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants