# stratoflights-predictor

High-altitude balloon trajectory prediction service. Forecasts ascent, descent, and float trajectories from NOAA GFS wind data, exposed as a REST API.

The trajectory engine is a propagator-and-constraint system: any flight profile can be expressed as a chain of propagators (constant-rate ascent, parachute descent, piecewise rates, wind drift) with attached constraints (altitude, time, terrain contact). The legacy Tawhiri request shape is kept as a compatibility endpoint so existing clients work unchanged.

## Quick start

```bash
# Build all three binaries (server, CLI, validation tool)
make build

# Run the server (first start downloads ~9 GB of GFS data over 30-60 min)
./bin/predictor

# Check readiness
./bin/predictor-cli ready

# Run a Tawhiri-style prediction
./bin/predictor-cli predict \
  launch_latitude=52.2 launch_longitude=0.1 \
  launch_datetime=2026-03-28T12:00:00Z \
  ascent_rate=5 burst_altitude=30000 descent_rate=5
```

## Configuration

Configuration is layered: built-in defaults, then a YAML file (`--config path.yml` or `PREDICTOR_CONFIG_FILE=path.yml`), then env vars, then CLI flags. Flags override env vars override file values override defaults.
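
The precedence order can be illustrated with a minimal Go sketch. It is not the code in `internal/config`; the `layer` type and `resolve` function are hypothetical names chosen for illustration, and the key names are placeholders.

```go
package main

import "fmt"

// layer is one configuration source (hypothetical type for this sketch).
// A key that is absent from the map means "not set in this source".
type layer map[string]string

// resolve returns the value for key from the highest-precedence layer that
// sets it. Layers are passed lowest precedence first.
func resolve(key string, layers ...layer) (string, bool) {
	for i := len(layers) - 1; i >= 0; i-- {
		if v, ok := layers[i][key]; ok {
			return v, true
		}
	}
	return "", false
}

func main() {
	defaults := layer{"port": "8080", "log_level": "info"}
	file := layer{"port": "9000"}
	env := layer{"port": "9100", "log_level": "debug"}
	flags := layer{"port": "9200"}

	port, _ := resolve("port", defaults, file, env, flags)
	level, _ := resolve("log_level", defaults, file, env, flags)

	// The flag wins for port; the env var wins for log_level because no
	// flag sets it.
	fmt.Println(port, level)
}
```

Scanning from the last layer backward keeps the rule in one place: adding a new source only means choosing its position in the argument list.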

| Setting | Env var | CLI flag | Default |
|---|---|---|---|
| HTTP port | `PREDICTOR_PORT` | `-port` | `8080` |
| Data directory | `PREDICTOR_DATA_DIR` | `-data-dir` | `/tmp/predictor-data` |
| Elevation dataset | `PREDICTOR_ELEVATION_DATASET` | `-elevation` | `/srv/ruaumoko-dataset` |
| Source | `PREDICTOR_SOURCE` | (none) | `noaa-gfs-0p50` |
| Download parallelism | `PREDICTOR_DOWNLOAD_PARALLEL` | `-download-parallel` | `8` |
| Download bandwidth (bytes/s; 0 = unlimited) | `PREDICTOR_DOWNLOAD_BANDWIDTH` | `-download-bandwidth` | `0` |
| Scheduler interval | `PREDICTOR_UPDATE_INTERVAL` | `-update-interval` | `6h` |
| Dataset freshness TTL | `PREDICTOR_DATASET_TTL` | `-freshness-ttl` | `48h` |
| Metrics enabled | `PREDICTOR_METRICS_ENABLED` | `-metrics` | `true` |
| Metrics HTTP path | `PREDICTOR_METRICS_PATH` | `-metrics-path` | `/metrics` |
| Log level | `PREDICTOR_LOG_LEVEL` | `-log-level` | `info` |

A YAML config file mirrors the same structure:

```yaml
http:
  port: 8080
data:
  dir: /var/lib/predictor
  elevation_path: /var/lib/predictor/elevation
source: noaa-gfs-0p50
download:
  parallel: 8
  bandwidth_bytes_per_second: 0
update_interval: 6h
freshness_ttl: 48h
metrics:
  enabled: true
  path: /metrics
log:
  level: info
```

## REST API

### Tawhiri-compatible

`GET /api/v1/prediction`: preserves the exact request and response shape of the upstream Cambridge University Spaceflight predictor.

Query parameters:

| Parameter | Required | Description |
|---|---|---|
| `launch_latitude` | yes | Degrees, -90 to 90 |
| `launch_longitude` | yes | Degrees, -180 to 180 or 0 to 360 |
| `launch_datetime` | yes | RFC 3339 |
| `launch_altitude` | no | Metres ASL (default 0) |
| `profile` | no | `standard_profile` (default) or `float_profile` |
| `ascent_rate` | no | m/s (default 5) |
| `burst_altitude` | no | Metres (default 28000) |
| `descent_rate` | no | m/s (default 5) |
| `float_altitude` | no | Metres (float profile only) |
| `stop_datetime` | no | Float-profile end time |

`GET /ready`: returns `{"status": "ok", "dataset_time": "..."}` once a dataset is loaded; `{"status": "not_ready", ...}` before then.

### Profile-driven (new primary)

`POST /api/v2/prediction`: accepts an arbitrary chain of propagators with optional constraints. Useful when the frontend wants flight profiles the Tawhiri shape can't express (e.g. piecewise rates, fallback on constraint violation).

```json
{
  "launch": {
    "time": "2026-03-28T12:00:00Z",
    "latitude": 52.2,
    "longitude": 0.1,
    "altitude": 0
  },
  "profile": [
    {
      "name": "ascent",
      "model": {"type": "constant_rate", "rate": 5, "include_wind": true},
      "constraints": [{"type": "max_altitude", "limit": 30000}]
    },
    {
      "name": "descent",
      "model": {"type": "parachute_descent", "sea_level_rate": 5, "include_wind": true},
      "constraints": [{"type": "terrain_contact"}]
    }
  ]
}
```

Model types: `constant_rate`, `parachute_descent`, `piecewise`, `wind`.

Constraint types: `max_altitude`, `min_altitude`, `max_time`, `terrain_contact`.

Constraint actions: `stop` (default), `fallback`, `clip`.

Set `"direction": "reverse"` to integrate backward from a known landing.

### Dataset admin

```
GET    /api/v1/admin/datasets                   list stored epochs
POST   /api/v1/admin/datasets {epoch | latest}  trigger a download
DELETE /api/v1/admin/datasets/{epoch}           delete a stored dataset
GET    /api/v1/admin/jobs                       list every job
GET    /api/v1/admin/jobs/{id}                  fetch one job
DELETE /api/v1/admin/jobs/{id}                  cancel a running job
```

Returns `JobInfo`:

```json
{"id":"…","source":"noaa-gfs-0p50","epoch":"…","status":"running",
 "started_at":"…","total_units":130,"done_units":47,"bytes":510000000}
```

### Metrics

`GET /metrics`: Prometheus text exposition. Counters: `predictor_predictions_total{profile,status}`, `predictor_downloads_total{source,status}`, `predictor_download_bytes_total{source}`, and a gauge `predictor_active_dataset_epoch_seconds`.

## Architecture

```
cmd/
  predictor/main.go          main server entry point
  predictor-cli/main.go      HTTP client
  compare-tawhiri/main.go    end-to-end validation against the public Tawhiri instance
internal/
  numerics/       pure numerical primitives (interp, bisect, RK4, refinement)
  engine/         propagator + constraint system + concrete models
  weather/        WindField interface
    gfs/          NOAA GFS file format + impl
  datasets/       Source/Storage/Manager + transactional, resumable downloads
    gfs/          NOAA GFS source impl
  elevation/      ruaumoko-format ground elevation reader
  config/         layered file+env+CLI config
  metrics/        Sink interface + Prometheus text impl
  api/            HTTP transport
    tawhiri/      legacy v1 endpoint via ogen
    v2/           profile-driven endpoint
    admin/        dataset/job admin endpoints
    middleware/
api/rest/predictor.swagger.yml    OpenAPI 3 spec for v1 + /ready
pkg/rest/                         ogen-generated code (regenerate via `make generate-ogen`)
docs/numerics.tex                 LaTeX math reference for the numerics package
scripts/build_elevation.py        ETOPO 2022 → ruaumoko converter
```

## Deployment

### Local single instance

```bash
./bin/predictor --data-dir /var/lib/predictor
```

No external dependencies beyond the NOAA S3 mirror.

### Docker single container

```dockerfile
FROM golang:1.25 AS build
WORKDIR /src
COPY . .
RUN go build -o /predictor ./cmd/predictor

FROM gcr.io/distroless/base
COPY --from=build /predictor /predictor
EXPOSE 8080
ENTRYPOINT ["/predictor"]
```

Mount a volume at `/data` and set `PREDICTOR_DATA_DIR=/data`.

### Load-balanced cluster

The server is stateless apart from the on-disk dataset cache and the in-memory job table. To run multiple replicas, point them all at a shared filesystem (NFS or similar) for `data_dir`; each replica opens its own read-only mmap of the dataset. Download coordination across replicas is not implemented: either run downloads on one node, or accept that two nodes may download the same epoch concurrently (only one Commit wins via atomic rename).

## Elevation dataset

Without elevation data, descent terminates at sea level. With elevation, descent terminates at ground level, matching upstream Tawhiri.

```bash
pip install xarray netcdf4 numpy
python3 scripts/build_elevation.py /var/lib/predictor/elevation
```

Then start the server with the dataset path:

`PREDICTOR_ELEVATION_DATASET=/var/lib/predictor/elevation ./bin/predictor`

## Numerical methods

The numerics package (`internal/numerics`) provides:

- regular-grid multilinear interpolation,
- monotone bisection,
- classical RK4 (forward and reverse time),
- binary-search refinement of a termination point.

Detailed math reference: `docs/numerics.tex`. The package has no domain dependencies and is small enough for manual verification (~300 lines of Go), enabling a future C or Rust port without changes to the trajectory engine.

## Wind data

| Property | Value |
|---|---|
| Source | NOAA GFS, S3 mirror (`noaa-gfs-bdp-pds.s3.amazonaws.com`) |
| Resolution | 0.5° |
| Grid | 361 × 720 (lat × lng) |
| Forecast steps | 65 (every 3 hours, 0–192h) |
| Pressure levels | 47 (1000 → 1 hPa) |
| Variables | Geopotential height, U-wind, V-wind |
| File size | ~8.87 GiB (float32 flat binary, mmap-backed) |
| Update cadence | every 6 hours |

Downloads use HTTP Range requests against `.idx` index files to fetch only the needed GRIB messages. Downloads are transactional (temp file, manifest, atomic rename on commit) and resumable: interrupted downloads pick up where they left off via the manifest.

## Validation

`./bin/compare-tawhiri --server http://localhost:8080` runs an identical prediction against the local server and against the public SondeHub Tawhiri instance, reporting the great-circle distance between landing points.

## References

- [Tawhiri](https://github.com/cuspaceflight/tawhiri): reference Python/Cython predictor
- [ruaumoko](https://github.com/cuspaceflight/ruaumoko): global elevation dataset format
- [NOAA GFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast)
- [ETOPO 2022](https://www.ncei.noaa.gov/products/etopo-global-relief-model)
- [SondeHub Tawhiri API](https://api.v2.sondehub.org/tawhiri): public Tawhiri instance