colliderml.simulate
API reference for the simulation subsystem. For the conceptual overview (container runtimes, caching, the pipeline stages) see Local Simulation; for the SaaS variant see Remote Simulation.
Top-level entry point
simulate(...) → SimulationResult
```python
colliderml.simulate(
    *,
    preset: str | Preset | None = None,
    channel: str | None = None,
    events: int | None = None,
    pileup: int | None = None,
    seed: int = 42,
    output_dir: str | Path | None = None,
    image: str = DEFAULT_IMAGE,
    remote: bool = False,
    run_id: str = "0",
    prod_root: str | Path | None = None,
    runtime: str | None = None,
    quiet: bool = False,
) -> SimulationResult
```

Runs the full pipeline for one (channel, events, pileup) configuration.
Parameter semantics
- `preset`: name of a bundled preset (see `list_presets()`), or a `Preset` instance. Supplies defaults for `channel`, `events`, and `pileup`.
- Explicit `channel` / `events` / `pileup` always win over the preset's values. Pass just one (e.g. `pileup=40`) to tweak a preset without defining a new one.
- `output_dir` defaults to `./colliderml_output/<channel>_pu<pileup>_<events>evt/`.
- `seed` is forwarded to every stage so the full run is reproducible.
- `image` pins the OpenDataDetector software image. Leave at the default unless you are rebuilding the image yourself.
- `remote=True` submits to the SaaS backend instead of running locally. It requires `pip install "colliderml[remote]"` and a valid HuggingFace token. See `colliderml.remote` for the SaaS client.
- `run_id` names the subdirectory under `output_dir/runs/`. The default `"0"` is fine for single-run workflows; bump it for parameter sweeps.
- `prod_root` lets advanced callers point at a pre-existing `colliderml-production` checkout. Leave `None` for the default auto-clone behaviour.
- `runtime` forces a container runtime (`"docker"` or `"podman"`); auto-detected otherwise.
- `quiet=True` suppresses per-stage progress messages.
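The precedence rule and the default output directory described above can be sketched in a few lines. This is an illustrative stand-in, not the library's implementation: `resolve_config` and `default_output_dir` are hypothetical helper names, the local `Preset` only mirrors the documented fields, and the preset name and values are invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Preset:
    # Local stand-in mirroring the documented Preset fields.
    name: str
    channel: str
    events: int
    pileup: int
    description: str = ""


def resolve_config(
    preset: Preset,
    *,
    channel: str | None = None,
    events: int | None = None,
    pileup: int | None = None,
) -> tuple[str, int, int]:
    """Hypothetical helper: an explicit argument wins, otherwise use the preset."""
    return (
        preset.channel if channel is None else channel,
        preset.events if events is None else events,
        preset.pileup if pileup is None else pileup,
    )


def default_output_dir(channel: str, events: int, pileup: int) -> Path:
    """Hypothetical helper following the documented default layout."""
    return Path("colliderml_output") / f"{channel}_pu{pileup}_{events}evt"


# Invented preset; only the pileup is overridden.
quick = Preset(name="example-quick", channel="ttbar", events=100, pileup=0)
channel, events, pileup = resolve_config(quick, pileup=40)
out = default_output_dir(channel, events, pileup)
print(out.as_posix())  # colliderml_output/ttbar_pu40_100evt
```

Comparing against `None` rather than relying on truthiness matters here: `pileup=0` is a meaningful explicit value and should not fall back to the preset.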
Errors
- `ValueError`: neither `preset` nor an explicit `channel` is supplied, or the channel is unknown.
- `ContainerRuntimeError`: no container runtime is available and `remote=False`.
- `RuntimeError`: a pipeline stage exited non-zero.
Example
```python
import colliderml

result = colliderml.simulate(
    preset="higgs-portal-quick",
    output_dir="runs/hp10",
    seed=1234,
)
print(result.run_dir)
for fp in result.list_files():
    print(" ", fp.name)
```

SimulationResult dataclass
```python
@dataclass
class SimulationResult:
    channel: str
    events: int
    pileup: int
    output_dir: Path
    run_dir: Path
    stages: list[StageRun]
    remote_request_id: str | None = None
```

| Field | Meaning |
|---|---|
| `channel` | Physics channel that was simulated. |
| `events` | Requested event count (actual may differ slightly for some channels). |
| `pileup` | Pileup level. |
| `output_dir` | The directory you passed (or the computed default). |
| `run_dir` | `<output_dir>/runs/<run_id>/`, the per-run artefact directory. |
| `stages` | One `StageRun(name, stage, returncode)` per executed stage. |
| `remote_request_id` | Only set when `remote=True`; the backend's request ID. |
Methods:
- `list_files()` → sorted `list[Path]` of every file under `run_dir`.
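As a rough picture of what "every file under `run_dir`, sorted" means, the sketch below builds a throwaway directory tree and lists it with `pathlib`. It is a stand-in written for this page, not the method's actual source; it assumes the listing is recursive and excludes directories, and the file names are invented.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def list_files(run_dir: Path) -> list[Path]:
    """Stand-in: every regular file below run_dir, in sorted order."""
    return sorted(p for p in run_dir.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    # Mimic the documented <output_dir>/runs/<run_id>/ layout.
    run_dir = Path(tmp) / "runs" / "0"
    (run_dir / "logs").mkdir(parents=True)
    (run_dir / "events.root").touch()        # invented file name
    (run_dir / "logs" / "stage.log").touch()  # invented file name
    names = [p.relative_to(run_dir).as_posix() for p in list_files(run_dir)]

print(names)  # ['events.root', 'logs/stage.log']
```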
Preset dataclass
```python
@dataclass(frozen=True)
class Preset:
    name: str
    channel: str
    events: int
    pileup: int
    description: str = ""
```

Helper methods:
- `as_dict()`: plain dict view for JSON/YAML serialisation.
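A minimal sketch of how a frozen dataclass like this behaves in practice, assuming `as_dict()` returns a plain field-name to value mapping (the way `dataclasses.asdict` does). The class below is a local stand-in with invented values, not an import from the library.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass, replace


@dataclass(frozen=True)
class Preset:
    name: str
    channel: str
    events: int
    pileup: int
    description: str = ""

    def as_dict(self) -> dict:
        # Assumption: a flat dict of the dataclass fields.
        return asdict(self)


demo = Preset(name="example-quick", channel="ttbar", events=100, pileup=0)

# The dict view serialises directly to JSON.
payload = json.dumps(demo.as_dict(), sort_keys=True)
print(payload)

# frozen=True rejects in-place mutation...
try:
    demo.pileup = 40
    mutated = True
except FrozenInstanceError:
    mutated = False

# ...so a variant is derived as a new instance instead.
tweaked = replace(demo, pileup=40)
print(tweaked.pileup, demo.pileup)  # 40 0
```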
load_presets() -> dict[str, Preset]
Loads the bundled preset catalogue from the package's presets.yaml. The file ships as package data, so it works in editable installs, wheels, and sdists alike.
```python
from colliderml.simulate import load_presets

catalogue = load_presets()
for name, preset in sorted(catalogue.items()):
    print(f"{name:25s} {preset.channel:15s} events={preset.events:>6} pileup={preset.pileup}")
```

resolve_preset(name, presets=None) -> Preset
Looks up a single preset by name. Raises ValueError with the full list of available names on a miss — so typos surface immediately.
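The lookup-with-helpful-error pattern can be sketched as follows. This is not the library's code: the function here takes the catalogue as a required argument (the documented one defaults `presets` to `None`), the catalogue is a plain dict with invented entries, and the error wording is an assumption.

```python
def resolve_preset(name, presets):
    """Sketch: return presets[name], or raise ValueError listing valid names."""
    try:
        return presets[name]
    except KeyError:
        available = ", ".join(sorted(presets))
        raise ValueError(
            f"unknown preset {name!r}; available presets: {available}"
        ) from None


# Invented catalogue entries, for illustration only.
catalogue = {
    "example-quick": {"channel": "ttbar", "events": 100, "pileup": 0},
    "example-full": {"channel": "ttbar", "events": 10000, "pileup": 200},
}

try:
    resolve_preset("exampel-quick", catalogue)  # deliberate typo
except ValueError as err:
    message = str(err)

print(message)
```

Listing the valid names in the exception means a misspelt preset fails before any container is started, with enough context to fix the call.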
Auto-cloning the production repo
The pipeline scripts, YAML stage configs, and container bootstrap script live in OpenDataDetector/colliderml-production. The public library clones that repository on first use into a cache directory and mounts it inside the container as /workspace.
```python
from colliderml.simulate.docker import (
    clone_colliderml_production,
    default_cache_root,
    COLLIDERML_PRODUCTION_REF,
)

print("cache root:", default_cache_root())
print("pinned ref:", COLLIDERML_PRODUCTION_REF)

# Idempotent; just fetches if the clone already exists.
prod = clone_colliderml_production()
print("cloned to:", prod)
```

Overriding the pinned ref: set the `COLLIDERML_PRODUCTION_REF` environment variable before calling `simulate()`. This is the supported way for pipeline developers to test an in-flight branch without editing the library.

Forcing a re-clone: pass `force_refresh=True` to `clone_colliderml_production`, or delete the cache directory and let `simulate()` rebuild it.
Container runtime helpers
All exported from colliderml.simulate.docker:
| Function | Purpose |
|---|---|
| `get_container_runtime()` | Return `"docker"` or `"podman"`; raise `ContainerRuntimeError` if neither is installed. |
| `check_runtime_available(runtime=None)` | Verify the runtime's daemon is actually reachable (runtime info). |
| `check_image_available(image, *, runtime=None)` | Return `True` iff the image is already pulled locally. |
| `pull_image(image, *, interactive=True, runtime=None)` | Pull the image, prompting before the ~10 GB download when running on a tty. |
| `default_cache_root()` | `$COLLIDERML_CACHE/simulate`, or `~/.cache/colliderml/simulate` if unset. |
| `clone_colliderml_production(cache_root=None, *, ref=None, force_refresh=False)` | Clone or refresh the production repo at the pinned ref. |
| `run_pipeline(...)` | Low-level multi-stage runner used by `simulate()`. |
You rarely need to call these directly — they are documented here so that users debugging CI failures or writing their own orchestrators can see the primitives.
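For debugging where the clone ends up, the cache-root rule from the table can be reproduced with the standard library alone. The function below is a sketch of that documented rule, not the library's source; the optional `environ` parameter is added here only to make the example testable, and treating an empty `COLLIDERML_CACHE` as unset is an assumption.

```python
import os
from pathlib import Path


def default_cache_root(environ=None) -> Path:
    """Sketch: $COLLIDERML_CACHE/simulate, else ~/.cache/colliderml/simulate."""
    env = os.environ if environ is None else environ
    base = env.get("COLLIDERML_CACHE")
    if base:  # assumption: an empty value counts as unset
        return Path(base) / "simulate"
    return Path.home() / ".cache" / "colliderml" / "simulate"


# With the variable set, the cache lives under it.
overridden = default_cache_root({"COLLIDERML_CACHE": "/scratch/cml"})
print(overridden.as_posix())  # /scratch/cml/simulate

# Without it, the per-user cache directory is used.
fallback = default_cache_root({})
print(fallback)
```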
Pipeline introspection
colliderml.simulate.pipeline exposes the channel→stages mapping that simulate() uses internally. Useful when you want to reason about what would run before spinning up a container:
```python
from colliderml.simulate.pipeline import (
    CHANNEL_STAGES,
    get_channel_stages,
    list_channels,
)

print(list_channels())
for stage in get_channel_stages("ttbar"):
    print(f" {stage.name:35s} ({stage.script})")
```

And to preview the manifest that will be handed to the container runner:
```python
from pathlib import Path

from colliderml.simulate.docker import clone_colliderml_production
from colliderml.simulate.pipeline import generate_stage_manifest

prod = clone_colliderml_production()
for step in generate_stage_manifest("higgs_portal", prod_root=prod):
    print(step["name"], "→", step["config_path"])
```

See also
- Local Simulation guide: conceptual overview
- Remote Simulation guide: SaaS variant
- `colliderml.load()`: load simulation output
- Benchmark tasks: score simulated events against a task