Pipeline

End-to-end orchestration of synthetic training-data generation.

This module combines cr_mech_coli simulation with synthetic image generation:

  1. Samples simulation parameters

  2. Runs growth simulations and tracks cell lineages

  3. Renders raw images with segmentation masks

  4. Applies microscope-style post-processing

  5. Writes TIFF outputs with JSON metadata

sample_range(value: Any, rng: Generator) → Any

Sample a value from a range specification.

Parameters:
  • value – Either a scalar (returned as-is) or a [min, max] list (sampled uniformly).

  • rng (np.random.Generator) – NumPy random generator.

Returns:

Sampled or fixed value.
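
The scalar-or-range rule described above can be sketched as follows. This is a minimal illustration, not the module's implementation: it assumes a range is a two-element list and that sampling is continuous. The source does not say how integer ranges are handled. To keep the sketch runnable without NumPy, `random.Random` stands in for `np.random.Generator`; both provide `uniform(low, high)`.

```python
import random
from typing import Any


def sample_range(value: Any, rng) -> Any:
    """Return scalars unchanged; draw uniformly from a [min, max] list."""
    # Assumption: a range is exactly a two-element list.
    if isinstance(value, list) and len(value) == 2:
        low, high = value
        return rng.uniform(low, high)
    return value


rng = random.Random(0)  # stand-in for np.random.default_rng(0)
fixed = sample_range(0.5, rng)         # scalar: returned as-is
drawn = sample_range([1.0, 2.0], rng)  # range: sampled uniformly
```

Passing a seeded generator explicitly, rather than relying on global random state, is what makes a sampled configuration reproducible.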

sample_simulation_params(sim_config: Dict[str, Any], rng: Generator) → Dict[str, Any]

Sample concrete values from simulation parameter ranges.

Parameters:
  • sim_config (Dict[str, Any]) – The [simulation] section from config with [min, max] ranges.

  • rng (np.random.Generator) – NumPy random generator.

Returns:

Dictionary with sampled concrete values.

Return type:

Dict[str, Any]
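
As a rough sketch of what sampling concrete values might look like, the function below resolves every entry of a config dictionary, drawing from [min, max] lists and copying scalars through. The keys `growth_rate` and `damping` are hypothetical; the source does not list the contents of the [simulation] section. `random.Random` is again a stand-in for `np.random.Generator`.

```python
import random
from typing import Any, Dict


def sample_simulation_params(sim_config: Dict[str, Any], rng) -> Dict[str, Any]:
    """Resolve each entry: [min, max] lists are sampled, scalars are kept."""
    sampled: Dict[str, Any] = {}
    for key, value in sim_config.items():
        if isinstance(value, list) and len(value) == 2:
            low, high = value
            sampled[key] = rng.uniform(low, high)
        else:
            sampled[key] = value
    return sampled


# Hypothetical keys, for illustration only.
sim_config = {"growth_rate": [0.01, 0.05], "damping": 1.5}
params = sample_simulation_params(sim_config, random.Random(42))
```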

jitter_synthetic_params(params: Dict[str, Any], variation_factor: float, rng: Generator) → Dict[str, Any]

Apply variation jitter to the 7 optimized parameters.

Parameters:
  • params (Dict[str, Any]) – Dictionary containing the base synthetic parameters.

  • variation_factor (float) – Fraction of the value to vary (0.1 = +/-10%).

  • rng (np.random.Generator) – NumPy random generator.

Returns:

New dictionary with jittered values (only the 7 optimized params are modified).

Return type:

Dict[str, Any]
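
One plausible reading of the jitter is a multiplicative perturbation: each selected value is scaled by a factor drawn uniformly from [1 - variation_factor, 1 + variation_factor]. The sketch below makes that assumption explicit. The source does not name the 7 optimized parameters, so `JITTERED_KEYS` and the extra `keys` argument are hypothetical, and `random.Random` stands in for `np.random.Generator`.

```python
import random
from typing import Any, Dict, Sequence

# Hypothetical names: the source does not list the 7 optimized parameters.
JITTERED_KEYS = ("param_a", "param_b")


def jitter_synthetic_params(
    params: Dict[str, Any],
    variation_factor: float,
    rng,
    keys: Sequence[str] = JITTERED_KEYS,
) -> Dict[str, Any]:
    """Return a copy of params with the selected keys scaled by 1 +/- variation."""
    jittered = dict(params)  # shallow copy, so the input is left untouched
    for key in keys:
        if key in jittered:
            scale = 1.0 + rng.uniform(-variation_factor, variation_factor)
            jittered[key] = jittered[key] * scale
    return jittered


base = {"param_a": 10.0, "param_b": 2.0, "other": 5}
out = jitter_synthetic_params(base, 0.1, random.Random(1))
```

With `variation_factor=0.1`, `param_a` lands within 10% of 10.0 while `other` is copied through unchanged.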

load_parameter_sets(param_files: List[str]) → List[Dict[str, Any]]

Load parameter sets from JSON files.

Parameters:

param_files (List[str]) – List of JSON file paths containing parameters.

Returns:

List of parameter dictionaries.

Return type:

List[Dict[str, Any]]
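
A minimal sketch of the loading step, assuming each JSON file holds a single top-level object (the source does not describe the file layout). The key `sigma` and the temporary files exist only for the demonstration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def load_parameter_sets(param_files: List[str]) -> List[Dict[str, Any]]:
    """Read one parameter dictionary per JSON file, preserving input order."""
    parameter_sets = []
    for path in param_files:
        with open(path, "r", encoding="utf-8") as handle:
            parameter_sets.append(json.load(handle))
    return parameter_sets


# Demonstration with two throwaway files; "sigma" is a hypothetical key.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, content in enumerate([{"sigma": 1.0}, {"sigma": 2.0}]):
        path = Path(tmp) / f"set_{i}.json"
        path.write_text(json.dumps(content), encoding="utf-8")
        paths.append(str(path))
    loaded = load_parameter_sets(paths)
```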

compute_cell_ages(container: CellContainer, iteration: int) → Dict[CellIdentifier, int]

Compute age for all cells at a specific iteration.

Age is defined as current_iteration - birth_iteration. Daughter cells start at age 0 after division.

Parameters:
  • container (crm.CellContainer) – CellContainer from simulation.

  • iteration (int) – Target iteration to compute ages at.

Returns:

Dictionary mapping CellIdentifier to age (in iterations).

Return type:

Dict[crm.CellIdentifier, int]
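
The age definition can be illustrated without cr_mech_coli. The sketch below replaces the CellContainer with a plain dictionary of birth iterations, which is an assumption about what the container exposes, and the helper name `ages_at` is hypothetical. Skipping cells born after the target iteration is also an assumption. Cells that no longer exist at the target iteration are not modeled.

```python
from typing import Dict, Hashable


def ages_at(birth_iterations: Dict[Hashable, int], iteration: int) -> Dict[Hashable, int]:
    """Age = target iteration - birth iteration, for cells already born."""
    return {
        cell: iteration - born
        for cell, born in birth_iterations.items()
        if born <= iteration  # assumption: unborn cells are omitted
    }


# String keys stand in for CellIdentifier values.
births = {"cell_a": 0, "cell_b": 40, "cell_c": 40, "cell_d": 90}
ages = ages_at(births, 40)
```

Here `cell_b` and `cell_c` appear at iteration 40 and therefore have age 0 at that iteration, matching the rule for daughter cells.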

run_simulation(n_frames: int, image_size: Tuple[int, int], n_bacteria_range: Tuple[int, int], border_distance: float, max_bacteria_length: float, simulation_seed: int, n_vertices: int = 8, sim_params: Dict[str, Any] = None) → Tuple[CellContainer, Configuration]

Run cr_mech_coli bacteria growth simulation.

Parameters:
  • n_frames (int) – Number of frames (saved iterations) to generate.

  • image_size (Tuple[int, int]) – (width, height) of the simulation domain in pixels.

  • n_bacteria_range (Tuple[int, int]) – (min, max) range for initial bacteria count.

  • border_distance (float) – Minimum distance from border for initial positions.

  • max_bacteria_length (float) – Maximum bacteria length before division.

  • simulation_seed (int) – Random seed for simulation.

  • n_vertices (int) – Number of vertices per bacterium.

  • sim_params (Dict[str, Any]) – Optional dict of sampled simulation parameters to override defaults.

Returns:

CellContainer with simulation results and the Configuration used.

Return type:

Tuple[crm.CellContainer, crm.Configuration]

render_and_save_frame(container: CellContainer, iteration: int, domain_size: Tuple[float, float], output_dir: Path, render_settings: RenderSettings) → Tuple[ndarray, ndarray, Dict]

Render and save a single frame (image and mask).

Parameters:
  • container (crm.CellContainer) – Simulation results container.

  • iteration (int) – Iteration number to render.

  • domain_size (Tuple[float, float]) – Simulation domain size.

  • output_dir (Path) – Directory to save output files.

  • render_settings (crm.RenderSettings) – Rendering configuration.

Returns:

(image, mask, cell_colors_dict).

Return type:

Tuple[np.ndarray, np.ndarray, Dict]

process_frame_for_synthetic(args)

Worker function for parallel synthetic image generation.

Parameters:

args – Tuple of (frame_idx, iteration, generated_dir, synthetic_dir, cell_ages, cell_colors_serializable, synthetic_config, background_config, halo_config, brightness_config, variation_factor, delete_after_processing, bg_seed).

Returns:

(frame_idx, iteration, params) with the parameters used for this frame.

Return type:

Tuple[int, int, Dict]
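
Packing all inputs into one tuple suits map-style parallel APIs such as `multiprocessing.Pool.map`, which pass a single argument per task. The sketch below shows only that calling convention, with a shortened, hypothetical argument tuple and a stub worker named `process_frame`. It does no image processing and uses the built-in `map` instead of a process pool so that it stays deterministic.

```python
from typing import Any, Dict, Tuple

# Shortened, hypothetical argument tuple; the real worker takes 13 fields.
FrameArgs = Tuple[int, int, Dict[str, Any], float]


def process_frame(args: FrameArgs) -> Tuple[int, int, Dict[str, Any]]:
    """Unpack one task tuple and report the parameters used for the frame."""
    frame_idx, iteration, synthetic_config, variation_factor = args
    # Stub: record the inputs instead of generating an image.
    params = dict(synthetic_config, variation_factor=variation_factor)
    return frame_idx, iteration, params


# "blur" is a hypothetical config key.
tasks = [(idx, it, {"blur": 1.0}, 0.1) for idx, it in enumerate([0, 50, 100])]
results = list(map(process_frame, tasks))  # a pool's map() has the same shape
```

Returning `frame_idx` with each result lets the caller match parameters to frames even if tasks finish out of order.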

run_pipeline(output_dir: str = './outputs', n_frames: int = 10, image_size: Tuple[int, int] = (512, 512), n_bacteria_range: Tuple[int, int] = (1, 10), border_distance: float = 5.0, max_bacteria_length: float = 6.0, simulation_seed: int = None, n_vertices: int = 8, parameter_sets: List[str] = None, brightness_range: Tuple[float, float] = (0.8, 0.3), num_dark_spots_range: Tuple[int, int] = (0, 5), skip_synthetic: bool = False, delete_generated: bool = False, n_workers: int = None, sim_param_ranges: Dict[str, Any] = None, rendering_config: Dict[str, Any] = None, synthetic_config: Dict[str, Any] = None, background_config: Dict[str, Any] = None, halo_config: Dict[str, Any] = None, brightness_config: Dict[str, Any] = None, n_simulations: int = 1)

Run the complete synthetic image generation pipeline.

Parameters:
  • output_dir (str) – Directory to save outputs.

  • n_frames (int) – Number of frames to generate.

  • image_size (Tuple[int, int]) – (width, height) of images in pixels.

  • n_bacteria_range (Tuple[int, int]) – (min, max) range for initial bacteria count.

  • border_distance (float) – Minimum distance from border for bacteria.

  • max_bacteria_length (float) – Max bacteria length before division.

  • simulation_seed (int) – Random seed. If None, generates a random seed.

  • n_vertices (int) – Number of vertices per bacterium.

  • parameter_sets (List[str]) – List of JSON file paths with parameters (legacy).

  • brightness_range (Tuple[float, float]) – (young, old) brightness for age-based mode (legacy).

  • num_dark_spots_range (Tuple[int, int]) – (min, max) range for dark spots (legacy).

  • skip_synthetic (bool) – Only run simulation, skip synthetic generation.

  • delete_generated (bool) – Delete raw generated images after processing.

  • n_workers (int) – Number of parallel workers. If None, uses all CPUs.

  • sim_param_ranges (Dict[str, Any]) – Simulation parameter ranges from TOML [simulation] section.

  • rendering_config (Dict[str, Any]) – Rendering settings from TOML [rendering] section.

  • synthetic_config (Dict[str, Any]) – Synthetic params from TOML [synthetic] section.

  • background_config (Dict[str, Any]) – Background params from TOML [background] section.

  • halo_config (Dict[str, Any]) – Halo params from TOML [halo] section.

  • brightness_config (Dict[str, Any]) – Brightness params from TOML [brightness] section.

  • n_simulations (int) – Number of simulations with different randomized parameters.
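
The correspondence between TOML sections and keyword arguments listed above can be sketched as a small mapping. The helper name `build_pipeline_kwargs` and the keys inside each section are hypothetical. The config is shown as an already-parsed dictionary, and the final `run_pipeline` call is left as a comment because the module's import path is not given in this section.

```python
from typing import Any, Dict

# Section name in the TOML file -> run_pipeline keyword argument.
SECTION_TO_KWARG = {
    "simulation": "sim_param_ranges",
    "rendering": "rendering_config",
    "synthetic": "synthetic_config",
    "background": "background_config",
    "halo": "halo_config",
    "brightness": "brightness_config",
}


def build_pipeline_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Map the sections present in a parsed config to keyword arguments."""
    return {
        kwarg: config[section]
        for section, kwarg in SECTION_TO_KWARG.items()
        if section in config
    }


# Hypothetical section contents, for illustration only.
config = {"simulation": {"growth_rate": [0.01, 0.05]}, "halo": {"width": 3}}
kwargs = build_pipeline_kwargs(config)
# run_pipeline(output_dir="./outputs", n_frames=10, **kwargs)  # not executed here
```

Sections missing from the config are omitted, so the corresponding arguments keep their documented default of None.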