# stratoflights-predictor

High-altitude balloon trajectory prediction service. Forecasts ascent, descent, and float trajectories from NOAA GFS and GEFS wind data, exposed as a REST API.

The trajectory engine is a propagator-and-constraint system: any flight profile can be expressed as a chain of propagators (constant-rate ascent, parachute descent, piecewise rates with absolute / profile-relative / propagator-relative timing, wind drift) with attached constraints (scalar comparisons over altitude or time, terrain contact, geographic polygons). Constraints can stop the profile, hand off to a fallback propagator, or clip the violated coordinate to the boundary.

The legacy Tawhiri request shape is kept as a compatibility endpoint so existing clients work unchanged.

## Quick start

```bash
make build                  # produces bin/{predictor,predictor-cli,compare-tawhiri}
./bin/predictor             # downloads ~9 GB of GFS data on first start
./bin/predictor-cli ready
./bin/predictor-cli predict \
    launch_latitude=52.2 launch_longitude=0.1 \
    launch_datetime=2026-03-28T12:00:00Z \
    ascent_rate=5 burst_altitude=30000 descent_rate=5
```

## Configuration

Layered configuration: built-in defaults < YAML file < env vars < CLI flags.


| Setting | Env var | CLI flag | Default |
|---|---|---|---|
| HTTP port | `PREDICTOR_PORT` | `-port` | `8080` |
| Data directory | `PREDICTOR_DATA_DIR` | `-data-dir` | `/tmp/predictor-data` |
| Elevation dataset | `PREDICTOR_ELEVATION_DATASET` | `-elevation` | `/srv/ruaumoko-dataset` |
| Source variant | `PREDICTOR_SOURCE` | (none) | `gfs-0p50-3h` |
| Download parallelism | `PREDICTOR_DOWNLOAD_PARALLEL` | `-download-parallel` | `8` |
| Download bandwidth (bytes/s; 0 = unlimited) | `PREDICTOR_DOWNLOAD_BANDWIDTH` | `-download-bandwidth` | `0` |
| Scheduler interval | `PREDICTOR_UPDATE_INTERVAL` | `-update-interval` | `6h` |
| Dataset freshness TTL | `PREDICTOR_DATASET_TTL` | `-freshness-ttl` | `48h` |
| Metrics enabled | `PREDICTOR_METRICS_ENABLED` | `-metrics` | `true` |
| Metrics HTTP path | `PREDICTOR_METRICS_PATH` | `-metrics-path` | `/metrics` |
| Log level | `PREDICTOR_LOG_LEVEL` | `-log-level` | `info` |

YAML config mirrors the same structure; see `internal/config/config.go`.

Supported source variants:

| `source` | Resolution | Cadence | Notes |
|---|---|---|---|
| `gfs-0p50-3h` | 0.5° | 3h to 192h | historical Tawhiri default |
| `gfs-0p25-3h` | 0.25° | 3h to 192h | |
| `gfs-0p25-1h` | 0.25° | 1h to 120h | |
| `gefs-0p50-3h` | 0.5° | 3h to 192h | 21-member ensemble; each member is a separate dataset |

## REST API

### Tawhiri-compatible (legacy)

`GET /api/v1/prediction`: preserves the exact request and response shape of the upstream Cambridge University Spaceflight predictor.

`GET /ready`: returns `{"status":"ok", "dataset_time":"..."}` once a dataset is loaded.

### Profile-driven (synchronous)

`POST /api/v2/prediction`: execute a profile synchronously and return the trajectory.

Request shape:

```json
{
  "launch": { "time": "2026-03-28T12:00:00Z", "latitude": 52.2, "longitude": 0.1, "altitude": 0 },
  "direction": "forward",
  "profile": [
    {
      "name": "ascent",
      "model": { "type": "constant_rate", "rate": 5, "include_wind": true },
      "constraints": [{ "type": "altitude", "op": ">=", "limit": 30000 }]
    },
    {
      "name": "descent",
      "model": { "type": "parachute_descent", "sea_level_rate": 5, "include_wind": true },
      "constraints": [{ "type": "terrain_contact" }]
    }
  ],
  "globals": [{ "type": "time", "op": ">", "limit": 1799999999 }]
}
```

- Model types: `constant_rate`, `parachute_descent`, `piecewise`, `wind`.
- Constraint types: `altitude`, `time`, `terrain_contact`, `polygon`.
- Operators: `<`, `<=`, `>`, `>=`, `==`.
- Actions: `stop` (default), `fallback`, `clip`.
- Direction: `forward` (default) or `reverse`.

Piecewise segments support a `reference` field (`absolute`, `profile_start`, or `propagator_start`) so a single rate schedule can be reused across profiles with different launch times.

The response includes per-stage trajectories, detailed termination info (violation state + refined state + constraint name), an `events` array of non-fatal observations (e.g. `above_model` when altitude exceeded the dataset's highest pressure level), and dataset metadata.

### Profile-driven (asynchronous)

`POST /api/v1/predictions`: enqueue a prediction. Returns `202` with a job ID:

```json
{"id":"842107d9-…","status":"pending","created_at":"…"}
```

`GET /api/v1/predictions/{id}`: poll status. When `status == "complete"`, the response includes a `result` field with the full v2 PredictionResponse.

`DELETE /api/v1/predictions/{id}`: cancel a queued job.

A worker pool (`http.async_workers`, default 4) services the queue; completed results are retained for `http.async_result_ttl` (default 1h).

### Dataset admin

```
GET    /api/v1/admin/datasets              list stored datasets (epoch, subset, coverage, loaded?)
POST   /api/v1/admin/datasets              trigger a download
DELETE /api/v1/admin/datasets/{filename}   delete by filename (DatasetID.Filename())
GET    /api/v1/admin/jobs                  list every download job
GET    /api/v1/admin/jobs/{id}             fetch one job
DELETE /api/v1/admin/jobs/{id}             cancel a running download
GET    /api/v1/admin/status                consolidated status (uptime, mem, goroutines, jobs, datasets)
```

Trigger-download body:

```json
{
  "epoch": "2026-03-28T06:00:00Z",
  "subset": {
    "region": { "min_lat": -10, "max_lat": 10, "min_lng": 0, "max_lng": 30 },
    "hour_range": { "min_hour": 0, "max_hour": 72 },
    "members": [5]
  }
}
```

`{"latest": true}` is a shortcut that refreshes the latest global dataset for the configured source.

Each `(epoch, subset)` combination is a separate dataset; the loader auto-selects which loaded dataset covers a given prediction query.

### Metrics

`GET /metrics`: Prometheus text exposition. Counters: `predictor_predictions_total{profile,status}`, `predictor_downloads_total`, `predictor_download_bytes_total`. Gauge: `predictor_active_dataset_epoch_seconds`.

## Architecture

```
cmd/
  predictor/          main server
  predictor-cli/      HTTP client
  compare-tawhiri/    end-to-end validation against the public Tawhiri instance
internal/
  numerics/           pure numerical primitives (interp, bisect, RK4, refinement)
  engine/             propagator + constraint system + concrete models + registry
  weather/            WindField interface
    gfs/              variant-parameterized GFS file format + WindField
  datasets/           Source / Storage / Manager + transactional, resumable, subsettable downloads
    grib/             shared GRIB downloader skeleton (idx parser, HTTP, parallel blit)
    gfs/              GFS Source (URL templating only)
    gefs/             GEFS Source (URL templating + member resolution)
  elevation/          ruaumoko-format ground elevation reader
  config/             layered file+env+CLI config
  metrics/            Sink interface + Prometheus text impl
  api/                HTTP transport
    tawhiri/          legacy v1 endpoint via ogen
    v2/               synchronous profile-driven endpoint
    async/            asynchronous prediction jobs
    admin/            dataset + service-status endpoints
    httpjson/         tiny JSON response helpers
    middleware/
api/rest/predictor.swagger.yml   OpenAPI 3 spec for v1 + /ready
pkg/rest/                        ogen-generated code (regenerate via `make generate-ogen`)
docs/numerics.tex                end-to-end mathematical reference
scripts/build_elevation.py       ETOPO 2022 → ruaumoko converter
```

## Subsetting and ensembles

Each stored dataset is identified by `DatasetID = (epoch, subset)`. A subset restricts the data fetched by region, forecast-hour range, or ensemble member. The downloader honours the subset (skipping out-of-range forecast steps; member-selecting URLs for GEFS), the storage tracks each subset as a separate file (filename includes a deterministic subset key), and the Manager exposes coverage so per-query dataset selection picks the right one.

## Deployment

Local single instance, Docker container, or load-balanced cluster behind a shared filesystem for the dataset cache.
The async API stores results in memory only, so a job is visible only on the node that accepted it. In cluster deployments, use sticky sessions so that clients poll the same node they submitted to.

## Validation

`./bin/compare-tawhiri --server http://localhost:8080` runs the same prediction against the local server and the public SondeHub Tawhiri instance, reporting the great-circle distance between landing points.

## Numerical methods

`docs/numerics.tex` is the complete mathematical reference: state vector, equations of motion (constant rate, parachute drag, piecewise, wind transport), numerical methods (multilinear interpolation, bisection, classical RK4, binary-search termination refinement), constraint geometry (scalar comparisons, point-in-polygon with antimeridian handling), and design notes on the deferred items (WGS84/ECEF coordinate system, mass-aware drift, Monte Carlo).

## References

- [Tawhiri](https://github.com/cuspaceflight/tawhiri): reference Python/Cython predictor
- [ruaumoko](https://github.com/cuspaceflight/ruaumoko): global elevation dataset format
- [NOAA GFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast)
- [NOAA GEFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-ensemble-forecast)
- [ETOPO 2022](https://www.ncei.noaa.gov/products/etopo-global-relief-model)
- [SondeHub Tawhiri API](https://api.v2.sondehub.org/tawhiri): public Tawhiri instance