
batch

Batch processing utilities for GPU memory management.

This module provides the BatchProcessor class for processing large datasets in batches on the GPU while managing memory. It supports batching across many voxels with configurable batch sizes and falls back gracefully to CPU execution if GPU memory is exhausted.

Example

>>> from osipy.common.backend import BatchProcessor
>>> import numpy as np

>>> def process_batch(data):
...     return data ** 2

>>> processor = BatchProcessor(batch_size=10000)
>>> large_data = np.random.randn(100000, 50)
>>> result = processor.map(large_data, process_batch)

References

.. [1] CuPy Memory Management: https://docs.cupy.dev/en/stable/user_guide/memory.html

BatchResult dataclass

BatchResult(
    data,
    used_gpu=False,
    fallback_occurred=False,
    batches_processed=0,
)

Result from batch processing.

ATTRIBUTES

data (NDArray)
    The processed data.

used_gpu (bool)
    Whether the GPU was used for processing.

fallback_occurred (bool)
    Whether CPU fallback occurred due to GPU memory issues.

batches_processed (int)
    Number of batches processed.
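The fields above can be illustrated with a minimal stand-in dataclass. The real BatchResult is provided by the module; the definition below is only a sketch mirroring the documented fields, so the inspection pattern runs without osipy installed:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BatchResultSketch:
    """Illustrative stand-in mirroring the documented BatchResult fields."""

    data: np.ndarray
    used_gpu: bool = False
    fallback_occurred: bool = False
    batches_processed: int = 0


# Inspect the metadata alongside the processed data.
result = BatchResultSketch(data=np.zeros(4), batches_processed=2)
if result.fallback_occurred:
    print("GPU ran out of memory; result was computed on CPU")
print(result.batches_processed)
```

Checking `fallback_occurred` after a run is how calling code can detect that a silent CPU fallback happened.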

BatchProcessor dataclass

BatchProcessor(
    batch_size=None,
    use_gpu=True,
    auto_fallback=True,
    memory_safety_margin=0.1,
)

Process data in batches with automatic GPU memory management.

This class provides efficient batch processing for large datasets, automatically managing GPU memory and falling back to CPU when necessary.

PARAMETERS

batch_size (int, default: None)
    Number of elements per batch. Default uses the global configuration.

use_gpu (bool, default: True)
    Whether to attempt GPU acceleration.

auto_fallback (bool, default: True)
    Whether to automatically fall back to CPU on GPU memory errors.

memory_safety_margin (float, default: 0.1)
    Fraction of estimated memory to keep free (0.0 to 0.5). The default of 0.1 keeps a 10% safety margin.

Example

>>> processor = BatchProcessor(batch_size=5000)
>>> result = processor.map(data, lambda x: x ** 2)
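One way a safety margin like this can interact with batch sizing is to shrink the usable memory budget before deriving a batch size. The arithmetic below is illustrative only, not the library's internal formula; the free-memory figure and element size are made-up example values:

```python
# Illustrative only: how a 10% safety margin shrinks the memory budget.
free_bytes = 8 * 1024**3       # e.g. 8 GiB reported free on the GPU (assumed)
memory_safety_margin = 0.1     # keep 10% of the free memory untouched
bytes_per_element = 50 * 8     # e.g. one row of 50 float64 values

usable = free_bytes * (1 - memory_safety_margin)
max_rows_per_batch = int(usable // bytes_per_element)
print(max_rows_per_batch)
```

A larger margin trades throughput for headroom: it leaves room for temporaries allocated inside the user-supplied function, which the size estimate cannot see.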

__post_init__

__post_init__()

Initialize with defaults from global config if needed.

map

map(data, func, axis=0)

Apply a function to data in batches.

PARAMETERS

data (NDArray)
    Input data array.

func (Callable)
    Function to apply to each batch. Should accept and return arrays.

axis (int, default: 0)
    Axis along which to batch.

RETURNS

BatchResult
    Result containing processed data and metadata.

Notes

If GPU memory is exhausted, this will automatically fall back to CPU processing (if auto_fallback is True) with a warning.
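The fallback behavior described in the note follows a common try/except pattern, sketched below. CuPy's real exception is cupy.cuda.memory.OutOfMemoryError; a stand-in exception class and a fake GPU runner are used here so the sketch runs without a GPU, and map_with_fallback is a hypothetical helper, not the library's method:

```python
import warnings

import numpy as np


class FakeGpuOutOfMemory(RuntimeError):
    """Stand-in for cupy.cuda.memory.OutOfMemoryError."""


def run_on_gpu(batch):
    # Simulate GPU memory exhaustion for this sketch.
    raise FakeGpuOutOfMemory("out of memory")


def map_with_fallback(data, func, auto_fallback=True):
    """Try the GPU path; on memory exhaustion, warn and rerun on CPU."""
    try:
        return run_on_gpu(data), True       # (result, used_gpu)
    except FakeGpuOutOfMemory:
        if not auto_fallback:
            raise
        warnings.warn("GPU memory exhausted; falling back to CPU")
        return func(data), False


result, used_gpu = map_with_fallback(np.arange(4), lambda x: x ** 2)
```

With auto_fallback disabled, the memory error propagates to the caller instead of being converted into a warning.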

batch_apply

batch_apply(data, func, batch_size=None, axis=0)

Convenience function to apply a function in batches.

PARAMETERS

data (NDArray)
    Input data array.

func (Callable)
    Function to apply to each batch.

batch_size (int, default: None)
    Batch size. Default uses the global configuration.

axis (int, default: 0)
    Axis along which to batch.

RETURNS

NDArray
    Processed data.

Example

>>> result = batch_apply(large_array, lambda x: x ** 2, batch_size=10000)
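In pure NumPy terms, batching along an axis amounts to splitting, applying, and reassembling. The helper name batch_apply_sketch and the use of np.array_split are illustrative assumptions, not the library's internals:

```python
import numpy as np


def batch_apply_sketch(data, func, batch_size, axis=0):
    """Apply func to slices of data along axis, then reassemble."""
    n = data.shape[axis]
    n_batches = -(-n // batch_size)  # ceiling division
    # array_split yields n_batches nearly equal pieces along the axis.
    pieces = np.array_split(data, n_batches, axis=axis)
    return np.concatenate([func(p) for p in pieces], axis=axis)


large_array = np.random.randn(100_000, 50)
result = batch_apply_sketch(large_array, lambda x: x ** 2, batch_size=10_000)
```

The GPU version adds transfers and memory checks around each piece, but the split/apply/concatenate shape of the computation is the same.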