# stratoflights-predictor

High-altitude balloon trajectory prediction service. Forecasts ascent, descent,
and float trajectories from NOAA GFS and GEFS wind data, exposed as a REST API.

The trajectory engine is a propagator-and-constraint system: any flight
profile can be expressed as a chain of propagators (constant-rate ascent,
parachute descent, piecewise rates with absolute / profile-relative /
propagator-relative timing, wind drift) with attached constraints
(scalar comparisons over altitude or time, terrain contact, geographic
polygons). Constraints can stop the profile, hand off to a fallback
propagator, or clip the violated coordinate to the boundary. The legacy
Tawhiri request shape is kept as a compatibility endpoint so existing
clients work unchanged.

## Quick start

```bash
make build            # produces bin/{predictor,predictor-cli,compare-tawhiri}
./bin/predictor       # downloads ~9 GB of GFS data on first start

./bin/predictor-cli ready
./bin/predictor-cli predict \
  launch_latitude=52.2 launch_longitude=0.1 \
  launch_datetime=2026-03-28T12:00:00Z \
  ascent_rate=5 burst_altitude=30000 descent_rate=5
```

## Configuration

Layered configuration: built-in defaults < YAML file < env vars < CLI flags.

| Setting | Env var | CLI flag | Default |
|---|---|---|---|
| HTTP port | `PREDICTOR_PORT` | `-port` | `8080` |
| Data directory | `PREDICTOR_DATA_DIR` | `-data-dir` | `/tmp/predictor-data` |
| Elevation dataset | `PREDICTOR_ELEVATION_DATASET` | `-elevation` | `/srv/ruaumoko-dataset` |
| Source variant | `PREDICTOR_SOURCE` | — | `gfs-0p50-3h` |
| Download parallelism | `PREDICTOR_DOWNLOAD_PARALLEL` | `-download-parallel` | `8` |
| Download bandwidth (bytes/s; 0 = unlimited) | `PREDICTOR_DOWNLOAD_BANDWIDTH` | `-download-bandwidth` | `0` |
| Scheduler interval | `PREDICTOR_UPDATE_INTERVAL` | `-update-interval` | `6h` |
| Dataset freshness TTL | `PREDICTOR_DATASET_TTL` | `-freshness-ttl` | `48h` |
| Metrics enabled | `PREDICTOR_METRICS_ENABLED` | `-metrics` | `true` |
| Metrics HTTP path | `PREDICTOR_METRICS_PATH` | `-metrics-path` | `/metrics` |
| Log level | `PREDICTOR_LOG_LEVEL` | `-log-level` | `info` |

YAML config mirrors the same structure; see `internal/config/config.go`.
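
The precedence rule can be illustrated with a minimal Go sketch. It is not the code in `internal/config`; the map keys (`port`, `source`, `log_level`) and the `resolve` helper are hypothetical, and the only thing it demonstrates is that later layers override earlier ones.

```go
package main

import "fmt"

// resolve merges configuration layers in ascending priority order.
// Later layers override earlier ones; empty values count as "not set".
// Hypothetical helper, not part of the predictor code base.
func resolve(layers ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			if v != "" {
				out[k] = v
			}
		}
	}
	return out
}

func main() {
	defaults := map[string]string{"port": "8080", "log_level": "info", "source": "gfs-0p50-3h"}
	yamlFile := map[string]string{"source": "gfs-0p25-1h"} // hypothetical file contents
	env := map[string]string{"port": "9090"}               // as if PREDICTOR_PORT=9090
	flags := map[string]string{"log_level": "debug"}       // as if -log-level debug

	cfg := resolve(defaults, yamlFile, env, flags)
	fmt.Println(cfg["port"], cfg["source"], cfg["log_level"])
}
```

Each setting ends up with the value from the highest-priority layer that sets it, so a CLI flag always wins over an env var, which wins over the YAML file.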

Supported source variants:

| `source` | Resolution | Cadence | Notes |
|---|---|---|---|
| `gfs-0p50-3h` | 0.5° | 3h to 192h | historical Tawhiri default |
| `gfs-0p25-3h` | 0.25° | 3h to 192h | |
| `gfs-0p25-1h` | 0.25° | 1h to 120h | |
| `gefs-0p50-3h` | 0.5° | 3h to 192h | 21-member ensemble; each member is a separate dataset |

## REST API

### Tawhiri-compatible (legacy)

`GET /api/v1/prediction` — preserves the exact request and response shape of
the upstream Cambridge University Spaceflight predictor.

`GET /ready` — returns `{"status":"ok", "dataset_time":"..."}` once a dataset
is loaded.

### Profile-driven (synchronous)

`POST /api/v2/prediction` — execute a profile synchronously and return the
trajectory. Request shape:

```json
{
  "launch": { "time": "2026-03-28T12:00:00Z", "latitude": 52.2, "longitude": 0.1, "altitude": 0 },
  "direction": "forward",
  "profile": [
    {
      "name": "ascent",
      "model": { "type": "constant_rate", "rate": 5, "include_wind": true },
      "constraints": [{ "type": "altitude", "op": ">=", "limit": 30000 }]
    },
    {
      "name": "descent",
      "model": { "type": "parachute_descent", "sea_level_rate": 5, "include_wind": true },
      "constraints": [{ "type": "terrain_contact" }]
    }
  ],
  "globals": [{ "type": "time", "op": ">", "limit": 1799999999 }]
}
```

Model types: `constant_rate`, `parachute_descent`, `piecewise`, `wind`.
Constraint types: `altitude`, `time`, `terrain_contact`, `polygon`.
Operators: `<`, `<=`, `>`, `>=`, `==`. Actions: `stop` (default), `fallback`, `clip`.
Direction: `forward` (default) or `reverse`.
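
As an illustration of how a scalar constraint might be evaluated, here is a small Go sketch. It assumes that a constraint triggers when `value op limit` is true, which is what the ascent stage in the request above suggests (`altitude >= 30000` ends the stage). The function name `fires` is hypothetical and is not taken from the engine.

```go
package main

import (
	"errors"
	"fmt"
)

// fires reports whether a scalar constraint triggers for the given value.
// Assumption: the constraint triggers when "value op limit" holds.
func fires(op string, value, limit float64) (bool, error) {
	switch op {
	case "<":
		return value < limit, nil
	case "<=":
		return value <= limit, nil
	case ">":
		return value > limit, nil
	case ">=":
		return value >= limit, nil
	case "==":
		return value == limit, nil
	default:
		return false, errors.New("unknown operator: " + op)
	}
}

func main() {
	// Altitudes around the 30000 m limit of the ascent stage.
	for _, alt := range []float64{29995, 30000, 30005} {
		hit, err := fires(">=", alt, 30000)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(alt, hit)
	}
}
```

What happens after a constraint triggers (`stop`, `fallback`, or `clip`) is a separate decision and is not modelled here.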

Piecewise segments support a `reference` field (`absolute`, `profile_start`, or
`propagator_start`) so a single rate schedule can be reused across profiles
with different launch times.
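
One way to read the three `reference` values is sketched below in Go. The representation is an assumption: times are Unix seconds, a segment time `t` is either an absolute instant or an offset in seconds, and `resolveTime` is a hypothetical helper rather than the engine's actual function.

```go
package main

import (
	"fmt"
	"time"
)

// resolveTime converts a segment time into an absolute Unix time in seconds.
// Assumption: "absolute" means t is already absolute; the other two
// references treat t as an offset from the respective start time.
func resolveTime(reference string, t, profileStart, propagatorStart float64) (float64, error) {
	switch reference {
	case "absolute":
		return t, nil
	case "profile_start":
		return profileStart + t, nil
	case "propagator_start":
		return propagatorStart + t, nil
	default:
		return 0, fmt.Errorf("unknown reference %q", reference)
	}
}

func main() {
	launch := float64(time.Date(2026, time.March, 28, 12, 0, 0, 0, time.UTC).Unix())
	stageStart := launch + 6000 // hypothetical: this propagator began 6000 s after launch

	for _, ref := range []string{"profile_start", "propagator_start"} {
		abs, err := resolveTime(ref, 600, launch, stageStart)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(ref, time.Unix(int64(abs), 0).UTC().Format(time.RFC3339))
	}
}
```

With relative references the same 600 s offset lands at a different wall-clock time depending on the launch or stage start, which is what makes a schedule reusable.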

The response includes per-stage trajectories, detailed termination info
(violation state + refined state + constraint name), an `events` array of
non-fatal observations (e.g. `above_model` when altitude exceeded the dataset's
highest pressure level), and dataset metadata.

### Profile-driven (asynchronous)

`POST /api/v1/predictions` — enqueue a prediction. Returns `202` with a job ID:

```json
{"id":"842107d9-…","status":"pending","created_at":"…"}
```

`GET /api/v1/predictions/{id}` — poll status. When `status == "complete"`,
the response includes a `result` field with the full v2 PredictionResponse.

`DELETE /api/v1/predictions/{id}` — cancel a queued job.

A worker pool (`http.async_workers`, default 4) services the queue; completed
results are retained for `http.async_result_ttl` (default 1h).

### Dataset admin

```
GET    /api/v1/admin/datasets             list stored datasets (epoch, subset, coverage, loaded?)
POST   /api/v1/admin/datasets             trigger a download
DELETE /api/v1/admin/datasets/{filename}  delete by filename (DatasetID.Filename())
GET    /api/v1/admin/jobs                 list every download job
GET    /api/v1/admin/jobs/{id}            fetch one job
DELETE /api/v1/admin/jobs/{id}            cancel a running download
GET    /api/v1/admin/status               consolidated status (uptime, mem, goroutines, jobs, datasets)
```

Trigger-download body:

```json
{
  "epoch": "2026-03-28T06:00:00Z",
  "subset": {
    "region": { "min_lat": -10, "max_lat": 10, "min_lng": 0, "max_lng": 30 },
    "hour_range": { "min_hour": 0, "max_hour": 72 },
    "members": [5]
  }
}
```

`{"latest": true}` is a shortcut that refreshes the latest global dataset
for the configured source. Each `(epoch, subset)` combination is a
separate dataset; the loader auto-selects which loaded dataset covers a
given prediction query.

### Wind visualization

`GET /api/v1/wind/field` — a velocity grid in the
[wind-js-server](https://github.com/danwild/wind-js-server) / leaflet-velocity
format (a two-element `[U, V]` array of `{header, data}` records), suitable for
animated particle layers. Query params: `time`, `altitude`, `min_lat`,
`max_lat`, `min_lng`, `max_lng`, `step` (degrees, min `0.25`). Responses are
cached in memory by parameters.

`GET /api/v1/wind/meta` — active dataset source, epoch, suggested altitudes,
and bounding box.

A runnable browser client is in [`examples/wind-demo`](examples/wind-demo).

### Documentation & metrics

`GET /docs` serves a [ReDoc](https://github.com/Redocly/redoc) rendering of the
full OpenAPI spec, which is also available raw at `GET /openapi.yaml`.

`GET /metrics` — Prometheus text exposition. Counters:
`predictor_predictions_total{profile,status}`, `predictor_downloads_total`,
`predictor_download_bytes_total`, and a gauge
`predictor_active_dataset_epoch_seconds`.

## Architecture

The entire REST API is defined by one OpenAPI spec and served by an
[ogen](https://ogen.dev)-generated server; the `internal/api` package only
implements the generated `Handler` interface, mapping between the wire types
and the engine/dataset/wind subsystems. `/metrics`, `/docs`, and
`/openapi.yaml` are mounted on the same `http.ServeMux` alongside it.

```
cmd/
  predictor/                  main server
  predictor-cli/              HTTP client
  compare-tawhiri/            end-to-end validation against the public Tawhiri instance
api/
  rest/predictor.swagger.yml  OpenAPI 3 spec — ogen input AND served at /openapi.yaml
  spec.go                     embeds the spec (go:embed) for the docs handler
internal/
  numerics/                   performance-critical core: interpolation, bisection,
                              RK4 + crossing refinement, atmosphere density, vector
                              and polygon math (portable to C/Rust)
  engine/                     propagator + constraint orchestration + registry (thin over numerics)
  weather/                    WindField interface; gfs/ — variant-parameterized GFS cube + sampler
  datasets/                   Source / Storage / Manager + transactional, resumable, subsettable downloads
    grib/                     — shared GRIB downloader skeleton (idx parser, HTTP, parallel blit)
    gfs/                      — GFS Source (URL templating only)
    gefs/                     — GEFS Source (URL templating + member resolution)
  windviz/                    cube-agnostic wind-field rasterizer + cache
  elevation/                  ruaumoko-format ground elevation reader
  config/                     layered file+env+CLI config
  metrics/                    Sink interface + Prometheus text impl
  api/                        ogen Handler implementation
    handler.go                — composite handler + NewError
    prediction.go             — v1 (Tawhiri), v2, async predictions
    datasets.go               — dataset + job admin + status
    wind.go                   — wind visualization endpoints
    mapping.go                — ogen <-> engine conversions
    async/                    — prediction worker pool
    docs/                     — ReDoc page + /openapi.yaml
    middleware/               — ogen logging, CORS
pkg/rest/                     ogen-generated server/client/types (regenerate via `make generate-ogen`)
examples/wind-demo/           Leaflet + leaflet-velocity sample client
docs/numerics.tex             end-to-end mathematical reference
scripts/build_elevation.py    ETOPO 2022 → ruaumoko converter
```

## Subsetting and ensembles

Each stored dataset is identified by `DatasetID = (epoch, subset)`. A subset
restricts the data fetched by region, forecast-hour range, or ensemble
member. The downloader honours the subset (skipping out-of-range
forecast steps; member-selecting URLs for GEFS), the storage tracks each
subset as a separate file (filename includes a deterministic subset key),
and the Manager exposes coverage so per-query dataset selection picks the
right one.
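
Coverage-based selection can be pictured with the Go sketch below. The types and the first-match policy are assumptions made for illustration (the Manager's real selection rules are not described in this README), longitudes are treated as plain numbers without antimeridian wrapping, and the regional bounds reuse the values from the trigger-download example.

```go
package main

import "fmt"

// Region and Dataset are hypothetical stand-ins for the coverage
// information the Manager exposes.
type Region struct{ MinLat, MaxLat, MinLng, MaxLng float64 }

type Dataset struct {
	Name             string
	Region           Region
	MinHour, MaxHour int
}

// covers reports whether the dataset contains the point and forecast hour.
func (d Dataset) covers(lat, lng float64, hour int) bool {
	r := d.Region
	return lat >= r.MinLat && lat <= r.MaxLat &&
		lng >= r.MinLng && lng <= r.MaxLng &&
		hour >= d.MinHour && hour <= d.MaxHour
}

// pick returns the first dataset that covers the query (assumed policy).
func pick(datasets []Dataset, lat, lng float64, hour int) (Dataset, bool) {
	for _, d := range datasets {
		if d.covers(lat, lng, hour) {
			return d, true
		}
	}
	return Dataset{}, false
}

func main() {
	datasets := []Dataset{
		{Name: "regional", Region: Region{-10, 10, 0, 30}, MinHour: 0, MaxHour: 72},
		{Name: "global", Region: Region{-90, 90, -180, 180}, MinHour: 0, MaxHour: 192},
	}
	for _, q := range [][2]float64{{5, 15}, {52.2, 0.1}} {
		d, ok := pick(datasets, q[0], q[1], 24)
		fmt.Println(q, d.Name, ok)
	}
}
```

Listing the narrower subset first means it is preferred when it covers the query, with the global dataset acting as the catch-all.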

## Deployment

The service ships as a single static binary in a distroless image and runs in
three configurations — see **[DEPLOYMENT.md](DEPLOYMENT.md)** for the full guide.

| Environment | File |
|---|---|
| Local dev | `docker compose up --build` (`docker-compose.yml`) |
| Staging (single host, + Prometheus) | `docker-compose.staging.yml` |
| Production (Docker Swarm) | `docker-compose.swarm.yml` |

Production runs on Docker Swarm pinned to ≤2 nodes labelled `predictor.data=true`,
each holding one copy of the dataset on **node-local disk** (never NFS).
Replicas spread across the two nodes for redundancy; multiple replicas per node
share the node's dataset and coordinate downloads with a file lock so only one
fetches the ~9 GiB cube. The predictor is an internal backend reached by the
API gateway over an overlay network; it enforces no auth itself. CI/CD is a
Forgejo pipeline that builds, tests, and deploys to Swarmpit
(`.forgejo/workflows/ci-cd.yml`).

The async prediction API stores results in memory only; behind a load balancer,
clients must poll the same instance they submitted to (or use the synchronous
`/api/v2/prediction`).

### Health

- `GET /health` — liveness, always 200 while the process runs (used by the
  container `HEALTHCHECK` via `predictor -healthcheck`).
- `GET /ready` — readiness, 200 only once a dataset is loaded.

## Validation

`./bin/compare-tawhiri --server http://localhost:8080` runs an identical
prediction against the local server and the public SondeHub Tawhiri
instance, reporting the great-circle distance between landing points.

## Numerical methods

`docs/numerics.tex` is the complete mathematical reference: state vector,
equations of motion (constant rate, parachute drag, piecewise, wind
transport), numerical methods (multilinear interpolation, bisection,
classical RK4, binary-search termination refinement), constraint
geometry (scalar comparisons, point-in-polygon with antimeridian
handling), and design notes on the deferred items (WGS84/ECEF
coordinate system, mass-aware drift, Monte Carlo).
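
The idea behind binary-search termination refinement can be shown on a scalar toy problem. The Go sketch below is not the code in `internal/numerics`: it assumes a bracket `[lo, hi]` where the constraint is satisfied at `lo` and violated at `hi`, and it uses a closed-form altitude instead of re-integrating the state, which the real engine would need to do.

```go
package main

import "fmt"

// refineCrossing halves the bracket [lo, hi] until it is no wider than tol.
// Precondition: violated(lo) is false and violated(hi) is true.
// It returns hi, a time known to violate and within tol of the crossing.
func refineCrossing(lo, hi, tol float64, violated func(t float64) bool) float64 {
	for hi-lo > tol {
		mid := lo + (hi-lo)/2
		if violated(mid) {
			hi = mid
		} else {
			lo = mid
		}
	}
	return hi
}

func main() {
	// Toy model: constant 5 m/s ascent from sea level, limit at 30000 m.
	altitude := func(t float64) float64 { return 5 * t }
	violated := func(t float64) bool { return altitude(t) >= 30000 }

	// Suppose a coarse integrator step took the state from t=5990 s to t=6050 s.
	t := refineCrossing(5990, 6050, 0.01, violated)
	fmt.Printf("crossing refined to t = %.2f s, altitude = %.1f m\n", t, altitude(t))
}
```

Each iteration halves the uncertainty, so refining a 60 s step to 0.01 s takes 13 evaluations of the violation test.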

## References

- [Tawhiri](https://github.com/cuspaceflight/tawhiri) — reference Python/Cython predictor
- [ruaumoko](https://github.com/cuspaceflight/ruaumoko) — global elevation dataset format
- [NOAA GFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast)
- [NOAA GEFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-ensemble-forecast)
- [ETOPO 2022](https://www.ncei.noaa.gov/products/etopo-global-relief-model)
- [SondeHub Tawhiri API](https://api.v2.sondehub.org/tawhiri) — public Tawhiri instance