stratoflights-predictor
High-altitude balloon trajectory prediction service. Forecasts ascent, descent, and float trajectories from NOAA GFS wind data, exposed as a REST API.
The trajectory engine is a propagator-and-constraint system: any flight profile can be expressed as a chain of propagators (constant-rate ascent, parachute descent, piecewise rates, wind drift) with attached constraints (altitude, time, terrain contact). The legacy Tawhiri request shape is kept as a compatibility endpoint so existing clients work unchanged.
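As a rough illustration of the propagator-and-constraint idea, the sketch below chains two constant-rate stages, each ended by an altitude constraint. The `State`, `Propagator`, `Constraint`, and `Stage` names are hypothetical and are not taken from the `internal/engine` package. Wind drift, terrain, and the parachute descent model are left out, so only the vertical motion is simulated.

```go
package main

import "fmt"

// State is a minimal balloon state: elapsed seconds and altitude in metres.
// Horizontal position and wind drift are omitted to keep the sketch short.
type State struct {
	T   float64
	Alt float64
}

// Propagator advances the state by dt seconds (hypothetical signature).
type Propagator func(s State, dt float64) State

// Constraint reports whether the current stage should end at this state.
type Constraint func(s State) bool

// Stage pairs one propagator with the constraints that terminate it.
type Stage struct {
	Name        string
	Step        Propagator
	Constraints []Constraint
}

// constantRate returns a propagator that climbs (rate > 0) or sinks
// (rate < 0) at a fixed vertical speed in m/s.
func constantRate(rate float64) Propagator {
	return func(s State, dt float64) State {
		return State{T: s.T + dt, Alt: s.Alt + rate*dt}
	}
}

// violated reports whether any constraint fires for the given state.
func violated(s State, cs []Constraint) bool {
	for _, c := range cs {
		if c(s) {
			return true
		}
	}
	return false
}

// run executes the stages in order, handing each stage's final state to the
// next one. maxSteps bounds each stage in case its constraints never fire.
func run(start State, stages []Stage, dt float64, maxSteps int) State {
	s := start
	for _, st := range stages {
		for i := 0; i < maxSteps && !violated(s, st.Constraints); i++ {
			s = st.Step(s, dt)
		}
	}
	return s
}

func main() {
	profile := []Stage{
		{"ascent", constantRate(5), []Constraint{func(s State) bool { return s.Alt >= 30000 }}},
		{"descent", constantRate(-5), []Constraint{func(s State) bool { return s.Alt <= 0 }}},
	}
	end := run(State{}, profile, 1, 1000000)
	fmt.Printf("landed after %.0f s at %.0f m\n", end.T, end.Alt)
}
```

A real engine would integrate position with a proper ODE scheme and refine the exact termination point. The fixed-step loop here only shows how stages and constraints compose.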
Quick start
# Build all three binaries (server, CLI, validation tool)
make build
# Run the server (first start downloads ~9 GB of GFS data over 30-60 min)
./bin/predictor
# Check readiness
./bin/predictor-cli ready
# Run a Tawhiri-style prediction
./bin/predictor-cli predict \
launch_latitude=52.2 launch_longitude=0.1 \
launch_datetime=2026-03-28T12:00:00Z \
ascent_rate=5 burst_altitude=30000 descent_rate=5
Configuration
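Configuration is layered: built-in defaults, then a YAML file
(--config path.yml or PREDICTOR_CONFIG_FILE=path.yml), then environment
variables, then CLI flags. Each layer overrides the ones before it: flags take
precedence over environment variables, which take precedence over file values,
which take precedence over defaults.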
| Setting | Env var | CLI flag | Default |
|---|---|---|---|
| HTTP port | PREDICTOR_PORT | -port | 8080 |
| Data directory | PREDICTOR_DATA_DIR | -data-dir | /tmp/predictor-data |
| Elevation dataset | PREDICTOR_ELEVATION_DATASET | -elevation | /srv/ruaumoko-dataset |
| Source | PREDICTOR_SOURCE | (none) | noaa-gfs-0p50 |
| Download parallelism | PREDICTOR_DOWNLOAD_PARALLEL | -download-parallel | 8 |
| Download bandwidth (bytes/s; 0 = unlimited) | PREDICTOR_DOWNLOAD_BANDWIDTH | -download-bandwidth | 0 |
| Scheduler interval | PREDICTOR_UPDATE_INTERVAL | -update-interval | 6h |
| Dataset freshness TTL | PREDICTOR_DATASET_TTL | -freshness-ttl | 48h |
| Metrics enabled | PREDICTOR_METRICS_ENABLED | -metrics | true |
| Metrics HTTP path | PREDICTOR_METRICS_PATH | -metrics-path | /metrics |
| Log level | PREDICTOR_LOG_LEVEL | -log-level | info |
A YAML config file mirrors the same structure:
http:
  port: 8080
data:
  dir: /var/lib/predictor
  elevation_path: /var/lib/predictor/elevation
  source: noaa-gfs-0p50
download:
  parallel: 8
  bandwidth_bytes_per_second: 0
  update_interval: 6h
  freshness_ttl: 48h
metrics:
  enabled: true
  path: /metrics
log:
  level: info
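The precedence order described in this section can be sketched in a few lines of Go. This is a minimal illustration and not the code in `internal/config`: `firstSet` and `resolvePort` are hypothetical helpers, and an empty string stands in for "not set" at every layer.

```go
package main

import (
	"fmt"
	"os"
)

// firstSet returns the first non-empty value, so callers list sources from
// highest to lowest precedence.
func firstSet(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolvePort applies flag > env var > file value > built-in default
// for the HTTP port setting.
func resolvePort(flagVal, envVal, fileVal string) string {
	return firstSet(flagVal, envVal, fileVal, "8080")
}

func main() {
	// No flag given; the file sets 9000; the env var wins if it is present.
	port := resolvePort("", os.Getenv("PREDICTOR_PORT"), "9000")
	fmt.Println("port:", port)
}
```

Treating the empty string as unset keeps the sketch short, but it cannot represent a setting explicitly set to an empty value. A fuller version would track presence separately, for example with pointers.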
REST API
Tawhiri-compatible
GET /api/v1/prediction — preserves the exact request and response shape of
the upstream Cambridge University Spaceflight predictor. Query parameters:
| Parameter | Required | Description |
|---|---|---|
| launch_latitude | yes | Degrees, -90 to 90 |
| launch_longitude | yes | Degrees, -180 to 180 or 0 to 360 |
| launch_datetime | yes | RFC 3339 |
| launch_altitude | no | Metres ASL (default 0) |
| profile | no | standard_profile (default) or float_profile |
| ascent_rate | no | m/s (default 5) |
| burst_altitude | no | Metres (default 28000) |
| descent_rate | no | m/s (default 5) |
| float_altitude | no | Metres (float profile only) |
| stop_datetime | no | Float-profile end time |
GET /ready — returns {"status": "ok", "dataset_time": "..."} once a
dataset is loaded; {"status": "not_ready", ...} before then.
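A client for these two endpoints mostly needs to build a query string and read the status field. The sketch below shows both steps without making network calls. `predictionURL`, `readyResponse`, and `isReady` are hypothetical helper names, and the base URL assumes a local server on the default port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// predictionURL builds a v1 request URL from query parameters.
// url.Values.Encode sorts keys and percent-encodes values.
func predictionURL(base string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	return base + "/api/v1/prediction?" + q.Encode()
}

// readyResponse mirrors the two fields shown for GET /ready.
type readyResponse struct {
	Status      string `json:"status"`
	DatasetTime string `json:"dataset_time"`
}

// isReady reports whether a /ready response body carries status "ok".
func isReady(body []byte) (bool, error) {
	var r readyResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, err
	}
	return r.Status == "ok", nil
}

func main() {
	u := predictionURL("http://localhost:8080", map[string]string{
		"launch_latitude":  "52.2",
		"launch_longitude": "0.1",
		"launch_datetime":  "2026-03-28T12:00:00Z",
		"ascent_rate":      "5",
		"burst_altitude":   "30000",
		"descent_rate":     "5",
	})
	fmt.Println(u)

	ok, err := isReady([]byte(`{"status": "not_ready"}`))
	fmt.Println(ok, err)
}
```

Passing the resulting URL to `http.Get` and the response body to `isReady` would complete a small client.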
Profile-driven (new primary)
POST /api/v2/prediction — accepts an arbitrary chain of propagators with
optional constraints. Useful when the frontend wants flight profiles the
Tawhiri shape can't express (e.g. piecewise rates, fallback on constraint
violation).
{
  "launch": {
    "time": "2026-03-28T12:00:00Z",
    "latitude": 52.2,
    "longitude": 0.1,
    "altitude": 0
  },
  "profile": [
    {
      "name": "ascent",
      "model": {"type": "constant_rate", "rate": 5, "include_wind": true},
      "constraints": [{"type": "max_altitude", "limit": 30000}]
    },
    {
      "name": "descent",
      "model": {"type": "parachute_descent", "sea_level_rate": 5, "include_wind": true},
      "constraints": [{"type": "terrain_contact"}]
    }
  ]
}
Model types: constant_rate, parachute_descent, piecewise, wind.
Constraint types: max_altitude, min_altitude, max_time,
terrain_contact. Constraint actions: stop (default), fallback, clip.
Set "direction": "reverse" to integrate backward from a known landing.
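For Go clients, the request body above can be assembled from typed structs instead of hand-written JSON. The struct and function names below are hypothetical. They mirror only the fields that appear in the example request, so other model types, constraint actions, and the direction setting are not covered.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Launch mirrors the "launch" object of the example request.
type Launch struct {
	Time      string  `json:"time"`
	Latitude  float64 `json:"latitude"`
	Longitude float64 `json:"longitude"`
	Altitude  float64 `json:"altitude"`
}

// Model covers the two model types used in the example.
type Model struct {
	Type         string  `json:"type"`
	Rate         float64 `json:"rate,omitempty"`
	SeaLevelRate float64 `json:"sea_level_rate,omitempty"`
	IncludeWind  bool    `json:"include_wind"`
}

// Constraint covers the two constraint types used in the example.
type Constraint struct {
	Type  string  `json:"type"`
	Limit float64 `json:"limit,omitempty"`
}

// Stage is one propagator with its constraints.
type Stage struct {
	Name        string       `json:"name"`
	Model       Model        `json:"model"`
	Constraints []Constraint `json:"constraints"`
}

// Request is the top-level body for POST /api/v2/prediction.
type Request struct {
	Launch  Launch  `json:"launch"`
	Profile []Stage `json:"profile"`
}

// standardProfile builds an ascent-then-parachute-descent profile
// equivalent to the example request.
func standardProfile(l Launch, ascentRate, burstAlt, descentRate float64) Request {
	return Request{
		Launch: l,
		Profile: []Stage{
			{
				Name:        "ascent",
				Model:       Model{Type: "constant_rate", Rate: ascentRate, IncludeWind: true},
				Constraints: []Constraint{{Type: "max_altitude", Limit: burstAlt}},
			},
			{
				Name:        "descent",
				Model:       Model{Type: "parachute_descent", SeaLevelRate: descentRate, IncludeWind: true},
				Constraints: []Constraint{{Type: "terrain_contact"}},
			},
		},
	}
}

func main() {
	req := standardProfile(Launch{Time: "2026-03-28T12:00:00Z", Latitude: 52.2, Longitude: 0.1}, 5, 30000, 5)
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```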
Dataset admin
GET    /api/v1/admin/datasets                    list stored epochs
POST   /api/v1/admin/datasets {epoch | latest}   trigger a download
DELETE /api/v1/admin/datasets/{epoch}            delete a stored dataset
GET    /api/v1/admin/jobs                        list every job
GET    /api/v1/admin/jobs/{id}                   fetch one job
DELETE /api/v1/admin/jobs/{id}                   cancel a running job
Returns JobInfo:
{"id":"…","source":"noaa-gfs-0p50","epoch":"…","status":"running",
"started_at":"…","total_units":130,"done_units":47,"bytes":510000000}
Metrics
GET /metrics — Prometheus text exposition. Counters:
predictor_predictions_total{profile,status},
predictor_downloads_total{source,status},
predictor_download_bytes_total{source},
and a gauge predictor_active_dataset_epoch_seconds.
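For readers unfamiliar with the Prometheus text format, the sketch below renders one counter sample with labels. It is not the code in `internal/metrics`: `counterLine` is a hypothetical helper, and the label value "ok" for `status` is an assumed example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelEscaper escapes backslash, double quote, and newline, the three
// characters the Prometheus text format requires escaping in label values.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// counterLine renders one sample line: name{k="v",...} value.
// Label keys are sorted so the output is deterministic.
func counterLine(name string, labels map[string]string, value uint64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf(`%s="%s"`, k, labelEscaper.Replace(labels[k])))
	}
	if len(pairs) == 0 {
		return fmt.Sprintf("%s %d", name, value)
	}
	return fmt.Sprintf("%s{%s} %d", name, strings.Join(pairs, ","), value)
}

func main() {
	fmt.Println("# TYPE predictor_predictions_total counter")
	fmt.Println(counterLine("predictor_predictions_total",
		map[string]string{"profile": "standard_profile", "status": "ok"}, 3))
}
```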
Architecture
cmd/
  predictor/main.go              main server entry point
  predictor-cli/main.go          HTTP client
  compare-tawhiri/main.go        end-to-end validation against the public Tawhiri instance
internal/
  numerics/                      pure numerical primitives (interp, bisect, RK4, refinement)
  engine/                        propagator + constraint system + concrete models
  weather/                       WindField interface
    gfs/                         NOAA GFS file format + impl
  datasets/                      Source/Storage/Manager + transactional, resumable downloads
    gfs/                         NOAA GFS source impl
  elevation/                     ruaumoko-format ground elevation reader
  config/                        layered file+env+CLI config
  metrics/                       Sink interface + Prometheus text impl
  api/                           HTTP transport
    tawhiri/                     legacy v1 endpoint via ogen
    v2/                          profile-driven endpoint
    admin/                       dataset/job admin endpoints
    middleware/
api/rest/predictor.swagger.yml   OpenAPI 3 spec for v1 + /ready
pkg/rest/                        ogen-generated code (regenerate via `make generate-ogen`)
docs/numerics.tex                LaTeX math reference for the numerics package
scripts/build_elevation.py       ETOPO 2022 → ruaumoko converter
Deployment
Local single instance
./bin/predictor --data-dir /var/lib/predictor
No external dependencies beyond the NOAA S3 mirror.
Docker single container
FROM golang:1.25 AS build
WORKDIR /src
COPY . .
RUN go build -o /predictor ./cmd/predictor
FROM gcr.io/distroless/base
COPY --from=build /predictor /predictor
EXPOSE 8080
ENTRYPOINT ["/predictor"]
Mount a volume at /data and set PREDICTOR_DATA_DIR=/data.
Load-balanced cluster
The server is stateless apart from the on-disk dataset cache and the in-memory
job table. To run multiple replicas, point them all at a shared filesystem
(NFS or similar) for data_dir; each replica opens its own read-only mmap of
the dataset. Download coordination across replicas is not implemented: either
run downloads on a single node, or accept that two nodes may download the same
epoch concurrently, in which case only one Commit wins via atomic rename.
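The commit-by-rename pattern mentioned above can be sketched as follows. This is a generic illustration and not the Commit implementation in `internal/datasets`: `commitFile` and `demo` are hypothetical names. The atomicity of rename holds on local POSIX filesystems, and its behaviour on a given network filesystem should be checked separately.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// commitFile writes data to a temp file in the destination directory and
// then renames it into place. Readers see either the old file or the
// complete new one, never a partial write.
func commitFile(dst string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temp file on failure; after a successful rename the
	// name no longer exists and the error from Remove is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before publishing
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// demo commits a small file into a scratch directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "predictor-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	dst := filepath.Join(dir, "dataset.bin")
	if err := commitFile(dst, []byte("epoch-data")); err != nil {
		return "", err
	}
	b, err := os.ReadFile(dst)
	return string(b), err
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```

Creating the temp file in the destination directory keeps the rename on a single filesystem, which is what makes it a single atomic step.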
Elevation dataset
Without elevation data, descent terminates at sea level. With elevation, descent terminates at ground level, matching upstream Tawhiri.
pip install xarray netcdf4 numpy
python3 scripts/build_elevation.py /var/lib/predictor/elevation
PREDICTOR_ELEVATION_DATASET=/var/lib/predictor/elevation ./bin/predictor
Numerical methods
The numerics package (internal/numerics) provides:
- regular-grid multilinear interpolation,
- monotone bisection,
- classical RK4 (forward and reverse time),
- binary-search refinement of a termination point.
Detailed math reference: docs/numerics.tex. The package has no
domain dependencies and is small enough for manual verification (~300
lines of Go), enabling a future C or Rust port without changes to the
trajectory engine.
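To make two of the listed primitives concrete, here is a minimal scalar sketch of a classical RK4 step and a bisection root finder. It is written independently of `internal/numerics`, whose actual signatures are not shown in this README. The function names and the scalar state are assumptions made for brevity.

```go
package main

import "fmt"

// rk4Step advances y' = f(t, y) by one step of size h using the classical
// fourth-order Runge-Kutta scheme. A negative h integrates backward in time.
func rk4Step(f func(t, y float64) float64, t, y, h float64) float64 {
	k1 := f(t, y)
	k2 := f(t+h/2, y+h/2*k1)
	k3 := f(t+h/2, y+h/2*k2)
	k4 := f(t+h, y+h*k3)
	return y + h/6*(k1+2*k2+2*k3+k4)
}

// integrate applies n RK4 steps of size h starting from (t0, y0).
func integrate(f func(t, y float64) float64, t0, y0, h float64, n int) float64 {
	t, y := t0, y0
	for i := 0; i < n; i++ {
		y = rk4Step(f, t, y, h)
		t += h
	}
	return y
}

// bisect finds a root of g in [lo, hi] to within tol, assuming g(lo) and
// g(hi) have opposite signs.
func bisect(g func(float64) float64, lo, hi, tol float64) float64 {
	glo := g(lo)
	for hi-lo > tol {
		mid := lo + (hi-lo)/2
		gm := g(mid)
		if (gm < 0) == (glo < 0) {
			lo, glo = mid, gm // root lies in the upper half
		} else {
			hi = mid // root lies in the lower half
		}
	}
	return lo + (hi-lo)/2
}

func main() {
	growth := func(t, y float64) float64 { return y } // y' = y, so y(1) = e
	fmt.Printf("forward: y(1) ~ %.6f\n", integrate(growth, 0, 1, 0.1, 10))

	root := bisect(func(x float64) float64 { return x*x - 2 }, 0, 2, 1e-9)
	fmt.Printf("sqrt(2) ~ %.6f\n", root)
}
```

In a trajectory setting, bisection of this kind can locate the time at which a constraint such as altitude crossing is met between two integration steps.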
Wind data
| Property | Value |
|---|---|
| Source | NOAA GFS, S3 mirror (noaa-gfs-bdp-pds.s3.amazonaws.com) |
| Resolution | 0.5° |
| Grid | 361 × 720 (lat × lng) |
| Forecast steps | 65 (every 3 hours, 0–192h) |
| Pressure levels | 47 (1000 → 1 hPa) |
| Variables | Geopotential height, U-wind, V-wind |
| File size | ~8.87 GiB (float32 flat binary, mmap-backed) |
| Update cadence | every 6 hours |
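The file size in the table follows directly from the grid dimensions. The short check below multiplies them out, assuming the file holds exactly one float32 per grid point, forecast step, pressure level, and variable, with no header or padding.

```go
package main

import "fmt"

const (
	latPoints     = 361 // 180 / 0.5 + 1, both poles included
	lngPoints     = 720 // 360 / 0.5
	forecastSteps = 65  // 192 / 3 + 1
	pressureLvls  = 47
	variables     = 3 // geopotential height, U-wind, V-wind
	bytesPerValue = 4 // float32
)

// datasetBytes returns the size of one full dataset in bytes.
func datasetBytes() int64 {
	return int64(latPoints) * lngPoints * forecastSteps * pressureLvls * variables * bytesPerValue
}

func main() {
	b := datasetBytes()
	fmt.Printf("%d bytes = %.2f GiB\n", b, float64(b)/(1<<30))
}
```

The product is 9,528,667,200 bytes, which is about 8.87 GiB and matches the table.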
Downloads use HTTP Range requests against .idx index files to fetch only
the needed GRIB messages. Downloads are transactional (temp file, manifest,
atomic rename on commit) and resumable: interrupted downloads pick up where
they left off via the manifest.
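The sketch below illustrates how byte ranges can be derived from a GRIB2 .idx inventory, where each line has the form `number:offset:d=date:variable:level:forecast:` and a message ends where the next one begins. `wantedRanges` and `byteRange` are hypothetical names, the offsets in `sampleIdx` are made up, and the filter on levels ending in " mb" is an assumption about which messages the downloader needs.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sampleIdx is a made-up inventory in the .idx line format.
const sampleIdx = `1:0:d=2026032812:PRMSL:mean sea level:anl:
2:1000:d=2026032812:HGT:500 mb:anl:
3:2500:d=2026032812:UGRD:500 mb:anl:
4:4000:d=2026032812:VGRD:500 mb:anl:
5:5500:d=2026032812:TMP:500 mb:anl:`

// byteRange is one GRIB message; End == -1 means "to end of file".
type byteRange struct {
	Var, Level string
	Start, End int64
}

// header renders the value for an HTTP Range request header.
func (r byteRange) header() string {
	if r.End < 0 {
		return fmt.Sprintf("bytes=%d-", r.Start)
	}
	return fmt.Sprintf("bytes=%d-%d", r.Start, r.End)
}

// wantedRanges parses the inventory and returns the ranges of messages
// whose variable is in vars and whose level is a pressure level.
func wantedRanges(idx string, vars map[string]bool) ([]byteRange, error) {
	var all []byteRange
	for _, line := range strings.Split(strings.TrimSpace(idx), "\n") {
		f := strings.Split(line, ":")
		if len(f) < 5 {
			return nil, fmt.Errorf("malformed idx line %q", line)
		}
		off, err := strconv.ParseInt(f[1], 10, 64)
		if err != nil {
			return nil, err
		}
		// The previous message ends one byte before this one starts.
		if n := len(all); n > 0 {
			all[n-1].End = off - 1
		}
		all = append(all, byteRange{Var: f[3], Level: f[4], Start: off, End: -1})
	}

	var out []byteRange
	for _, r := range all {
		if vars[r.Var] && strings.HasSuffix(r.Level, " mb") {
			out = append(out, r)
		}
	}
	return out, nil
}

func main() {
	ranges, err := wantedRanges(sampleIdx, map[string]bool{"HGT": true, "UGRD": true, "VGRD": true})
	if err != nil {
		panic(err)
	}
	for _, r := range ranges {
		fmt.Println(r.Var, r.Level, r.header())
	}
}
```

Adjacent ranges could be merged into one request to cut round trips; the sketch keeps one range per message for clarity.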
Validation
./bin/compare-tawhiri --server http://localhost:8080 runs an identical
prediction against the local server and against the public SondeHub Tawhiri
instance, reporting the great-circle distance between landing points.
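Great-circle distance between two landing points can be computed with the haversine formula. The sketch below is a generic version and may differ from what compare-tawhiri does internally. It assumes a spherical Earth with a mean radius of 6371 km, and the second landing point in `main` is a made-up example.

```go
package main

import (
	"fmt"
	"math"
)

// earthRadiusKm is the mean Earth radius used for the spherical model.
const earthRadiusKm = 6371.0

// haversineKm returns the great-circle distance in kilometres between two
// points given as latitude/longitude in degrees.
func haversineKm(lat1, lon1, lat2, lon2 float64) float64 {
	rad := math.Pi / 180
	phi1, phi2 := lat1*rad, lat2*rad
	dphi := (lat2 - lat1) * rad
	dlam := (lon2 - lon1) * rad

	sinP := math.Sin(dphi / 2)
	sinL := math.Sin(dlam / 2)
	a := sinP*sinP + math.Cos(phi1)*math.Cos(phi2)*sinL*sinL

	// Clamp to 1 so rounding error cannot push Asin out of its domain.
	return 2 * earthRadiusKm * math.Asin(math.Min(1, math.Sqrt(a)))
}

func main() {
	// Hypothetical landing points from the local server and a reference run.
	d := haversineKm(52.30, 1.20, 52.31, 1.22)
	fmt.Printf("landing points differ by %.2f km\n", d)
}
```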
References
- Tawhiri — reference Python/Cython predictor
- ruaumoko — global elevation dataset format
- NOAA GFS
- ETOPO 2022
- SondeHub Tawhiri API — public Tawhiri instance