feat: refactor
parent 82ef1cb3b8
commit 51bbf3c579
44 changed files with 8589 additions and 0 deletions

README.md
@@ -0,0 +1,261 @@

# Balloon Trajectory Predictor

High-altitude balloon trajectory prediction service. It predicts ascent, burst, and descent trajectories using GFS wind forecast data from NOAA.

The prediction algorithms are an exact port of [Tawhiri](https://github.com/cuspaceflight/tawhiri) (Cambridge University Spaceflight) to Go, verified to produce identical results.

## Quick Start

```bash
# Build
make build

# Run (downloads ~9 GB of GFS data on first start, takes 30-60 min)
PREDICTOR_DATA_DIR=/tmp/predictor-data go run ./cmd/api

# Check readiness
curl http://localhost:8080/ready

# Run a prediction
curl 'http://localhost:8080/api/v1/prediction?launch_latitude=52.2&launch_longitude=0.1&launch_datetime=2026-03-28T12:00:00Z&launch_altitude=0&ascent_rate=5&burst_altitude=30000&descent_rate=5'
```

## Configuration

All configuration is via environment variables.

| Variable | Default | Description |
|---|---|---|
| `PREDICTOR_PORT` | `8080` | HTTP server port |
| `PREDICTOR_DATA_DIR` | `/tmp/predictor-data` | Directory for wind datasets and temp files |
| `PREDICTOR_DOWNLOAD_PARALLEL` | `8` | Max concurrent GRIB download goroutines |
| `PREDICTOR_UPDATE_INTERVAL` | `6h` | How often to check for new forecasts |
| `PREDICTOR_DATASET_TTL` | `48h` | Max age before a dataset is considered stale |
| `PREDICTOR_ELEVATION_DATASET` | `/srv/ruaumoko-dataset` | Path to elevation dataset (optional) |
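
As a rough illustration of how the table above maps to code, the sketch below reads each variable and falls back to the documented default. The `Config` struct and `loadConfig` function are hypothetical names chosen for this example and are not taken from the repository; only the variable names and defaults come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the table above. The struct and field names are illustrative.
type Config struct {
	Port             int
	DataDir          string
	DownloadParallel int
	UpdateInterval   time.Duration
	DatasetTTL       time.Duration
	ElevationDataset string
}

// loadConfig reads each variable through lookup and falls back to the
// documented default when the variable is unset or empty.
func loadConfig(lookup func(string) (string, bool)) (Config, error) {
	get := func(key, def string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return def
	}
	var c Config
	var err error
	if c.Port, err = strconv.Atoi(get("PREDICTOR_PORT", "8080")); err != nil {
		return c, fmt.Errorf("PREDICTOR_PORT: %w", err)
	}
	if c.DownloadParallel, err = strconv.Atoi(get("PREDICTOR_DOWNLOAD_PARALLEL", "8")); err != nil {
		return c, fmt.Errorf("PREDICTOR_DOWNLOAD_PARALLEL: %w", err)
	}
	if c.UpdateInterval, err = time.ParseDuration(get("PREDICTOR_UPDATE_INTERVAL", "6h")); err != nil {
		return c, fmt.Errorf("PREDICTOR_UPDATE_INTERVAL: %w", err)
	}
	if c.DatasetTTL, err = time.ParseDuration(get("PREDICTOR_DATASET_TTL", "48h")); err != nil {
		return c, fmt.Errorf("PREDICTOR_DATASET_TTL: %w", err)
	}
	c.DataDir = get("PREDICTOR_DATA_DIR", "/tmp/predictor-data")
	c.ElevationDataset = get("PREDICTOR_ELEVATION_DATASET", "/srv/ruaumoko-dataset")
	return c, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the parsing testable without touching the process environment.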

## API

### `GET /api/v1/prediction`

Run a balloon trajectory prediction.

**Parameters** (query string):

| Parameter | Required | Description |
|---|---|---|
| `launch_latitude` | yes | Launch latitude in degrees (-90 to 90) |
| `launch_longitude` | yes | Launch longitude in degrees (-180 to 180 or 0 to 360) |
| `launch_datetime` | yes | Launch time in RFC 3339 format |
| `launch_altitude` | no | Launch altitude in metres ASL (default: 0) |
| `profile` | no | `standard_profile` (default) or `float_profile` |
| `ascent_rate` | no | Ascent rate in m/s (default: 5) |
| `burst_altitude` | no | Burst altitude in metres (default: 28000) |
| `descent_rate` | no | Sea-level descent rate in m/s (default: 5) |
| `float_altitude` | no | Float altitude in metres (float_profile only) |
| `stop_datetime` | no | Float end time (float_profile only, default: +24h) |
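
A client can assemble the query string from these parameters with the standard library. The sketch below is one possible helper, not part of the service: `predictionURL` is a hypothetical name, and the range checks simply restate the limits from the table.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// predictionURL builds a request URL for GET /api/v1/prediction.
// launch must be an RFC 3339 timestamp; optional holds any of the
// non-required parameters from the table (for example "ascent_rate").
func predictionURL(base string, lat, lon float64, launch string, optional map[string]string) (string, error) {
	if lat < -90 || lat > 90 {
		return "", fmt.Errorf("launch_latitude %v out of range", lat)
	}
	if lon < -180 || lon > 360 {
		return "", fmt.Errorf("launch_longitude %v out of range", lon)
	}
	if _, err := time.Parse(time.RFC3339, launch); err != nil {
		return "", fmt.Errorf("launch_datetime: %w", err)
	}
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/prediction"
	q := url.Values{}
	q.Set("launch_latitude", strconv.FormatFloat(lat, 'f', -1, 64))
	q.Set("launch_longitude", strconv.FormatFloat(lon, 'f', -1, 64))
	q.Set("launch_datetime", launch)
	for k, v := range optional {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	s, err := predictionURL("http://localhost:8080", 52.2, 0.1, "2026-03-28T12:00:00Z",
		map[string]string{"ascent_rate": "5", "burst_altitude": "30000", "descent_rate": "5"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```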

**Response** (Tawhiri-compatible):

```json
{
  "prediction": [
    {
      "stage": "ascent",
      "trajectory": [
        {"datetime": "2026-03-28T12:00:00Z", "latitude": 52.2, "longitude": 0.1, "altitude": 0},
        ...
      ]
    },
    {
      "stage": "descent",
      "trajectory": [...]
    }
  ],
  "metadata": {
    "start_datetime": "...",
    "complete_datetime": "..."
  },
  "request": {
    "dataset": "2026-03-28T06:00:00Z",
    "launch_latitude": 52.2,
    ...
  }
}
```
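
For consumers written in Go, the response shape above can be decoded with `encoding/json`. The types below are a minimal sketch based only on the fields shown in the example; the type names are hypothetical, and `request` is left as a generic map because its full field list is elided above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Point is one trajectory sample.
type Point struct {
	Datetime  time.Time `json:"datetime"`
	Latitude  float64   `json:"latitude"`
	Longitude float64   `json:"longitude"`
	Altitude  float64   `json:"altitude"`
}

// Stage is one flight stage ("ascent", "descent", ...).
type Stage struct {
	Stage      string  `json:"stage"`
	Trajectory []Point `json:"trajectory"`
}

// Response covers only the fields visible in the example response.
type Response struct {
	Prediction []Stage `json:"prediction"`
	Metadata   struct {
		StartDatetime    string `json:"start_datetime"`
		CompleteDatetime string `json:"complete_datetime"`
	} `json:"metadata"`
	Request map[string]any `json:"request"`
}

// decode parses a response body.
func decode(data []byte) (Response, error) {
	var r Response
	err := json.Unmarshal(data, &r)
	return r, err
}

// lastPoint returns the final point of the final stage, if any.
func lastPoint(r Response) (Point, bool) {
	if len(r.Prediction) == 0 {
		return Point{}, false
	}
	tr := r.Prediction[len(r.Prediction)-1].Trajectory
	if len(tr) == 0 {
		return Point{}, false
	}
	return tr[len(tr)-1], true
}

func main() {
	// Made-up sample data in the documented shape.
	body := []byte(`{"prediction":[{"stage":"ascent","trajectory":[
		{"datetime":"2026-03-28T12:00:00Z","latitude":52.2,"longitude":0.1,"altitude":0}]}],
		"metadata":{},"request":{"launch_latitude":52.2}}`)
	r, err := decode(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if p, ok := lastPoint(r); ok {
		fmt.Printf("%s ends at %.4f, %.4f, %.0f m\n", r.Prediction[len(r.Prediction)-1].Stage, p.Latitude, p.Longitude, p.Altitude)
	}
}
```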

### `GET /ready`

Readiness check. Returns `{"status": "ok"}` once a dataset is loaded.

## Elevation Dataset

Without elevation data, descent terminates at sea level (altitude <= 0). With elevation data, descent terminates at ground level, matching Tawhiri's behaviour.

### Building the elevation dataset

The elevation dataset uses ETOPO 2022 at 30 arc-second resolution, converted to a ruaumoko-compatible binary format (21601 x 43200 grid of int16 little-endian elevation values in metres).
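
The grid dimensions above fix the file size, and a cell can be read with plain little-endian decoding. The sketch below assumes row-major storage (all columns of row 0, then row 1, and so on); that ordering, and the function names, are assumptions for illustration. It deliberately does not map latitude and longitude to rows and columns, since the README does not specify that mapping.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	elevRows     = 21601 // from the format description above
	elevCols     = 43200
	bytesPerCell = 2 // int16
)

// elevationFileSize returns the expected size of the binary file in bytes.
func elevationFileSize() int64 {
	return int64(elevRows) * int64(elevCols) * bytesPerCell
}

// cellAt decodes the int16 little-endian value at (row, col) from a
// row-major grid with the given number of columns.
func cellAt(data []byte, cols, row, col int) (int16, error) {
	if row < 0 || col < 0 || col >= cols {
		return 0, fmt.Errorf("cell (%d, %d) out of range", row, col)
	}
	off := (row*cols + col) * bytesPerCell
	if off+bytesPerCell > len(data) {
		return 0, fmt.Errorf("cell (%d, %d) is past the end of the data", row, col)
	}
	return int16(binary.LittleEndian.Uint16(data[off:])), nil
}

func main() {
	size := elevationFileSize()
	fmt.Printf("expected file size: %d bytes (%.2f GiB)\n", size, float64(size)/(1<<30))

	// A tiny 2x2 grid holding 1, -2, 300, 4 to demonstrate decoding.
	grid := []byte{0x01, 0x00, 0xFE, 0xFF, 0x2C, 0x01, 0x04, 0x00}
	v, _ := cellAt(grid, 2, 1, 0)
	fmt.Println("cell (1, 0):", v)
}
```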

**Requirements**: Python 3, xarray, netcdf4, numpy.

```bash
pip install xarray netcdf4 numpy

# Downloads ~1.1 GB from NOAA, produces a ~1.74 GiB (~1.87 GB) binary file
python3 scripts/build_elevation.py /tmp/predictor-data/ruaumoko-dataset
```

To skip the download if you already have the ETOPO NetCDF file:

```bash
ETOPO_NC_PATH=/path/to/ETOPO_2022_v1_30s_N90W180_surface.nc \
python3 scripts/build_elevation.py /tmp/predictor-data/ruaumoko-dataset
```

The ETOPO 2022 NetCDF can be manually downloaded from:
https://www.ncei.noaa.gov/products/etopo-global-relief-model

### Using the elevation dataset

```bash
PREDICTOR_ELEVATION_DATASET=/tmp/predictor-data/ruaumoko-dataset go run ./cmd/api
```

If the file doesn't exist or can't be read, the service starts normally with a warning and falls back to sea-level termination.

## Architecture

```
cmd/api/main.go                  Entry point, config, scheduler, HTTP server
internal/
dataset/
dataset.go                       Shape constants, pressure levels, S3 URLs
file.go                          mmap-backed dataset file (read/write/blit)
downloader/
downloader.go                    S3 partial GRIB download (idx + range requests)
idx.go                           NOAA .idx file parser
config.go                        Environment-based configuration
elevation/
elevation.go                     Ruaumoko-compatible elevation dataset (mmap int16)
prediction/
interpolate.go                   4D wind interpolation (time, lat, lon, altitude)
solver.go                        RK4 integrator with binary search termination
models.go                        Ascent, descent, wind models; flight profiles
warnings.go                      Prediction warning counters
service/
service.go                       Dataset lifecycle, concurrent-safe access
transport/
middleware/log.go                Request logging middleware
rest/
handler/handler.go               ogen API handler implementation
handler/deps.go                  Service interface
transport.go                     ogen HTTP server, CORS
api/rest/predictor.swagger.yml   OpenAPI 3.0 spec
pkg/rest/                        Generated ogen code (17 files)
scripts/
build_elevation.py               ETOPO 2022 to ruaumoko converter
```

## Wind Dataset

The service downloads GFS 0.5-degree forecast data from NOAA S3:

| Property | Value |
|---|---|
| Source | `noaa-gfs-bdp-pds.s3.amazonaws.com` |
| Resolution | 0.5 degrees |
| Grid | 361 lat x 720 lon |
| Time steps | 65 (every 3 hours, 0-192h) |
| Pressure levels | 47 (1000 to 1 hPa) |
| Variables | Geopotential height, U-wind, V-wind |
| Dataset size | 9,528,667,200 bytes (~8.87 GiB) |
| Update cadence | Every 6 hours (GFS runs at 00, 06, 12, 18 UTC) |

Data is downloaded using HTTP Range requests against `.idx` index files, fetching only the needed GRIB messages (HGT, UGRD, VGRD at 47 pressure levels). Full download takes 30-60 minutes depending on bandwidth.
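
The idea behind the partial download can be sketched as follows. NOAA `.idx` files list one GRIB message per line in the form `number:byte_offset:d=YYYYMMDDHH:VARIABLE:level:forecast:`, so each message's byte range runs from its own offset to one byte before the next message's offset. The code below is an illustrative parser under that assumption and is not the repository's `idx.go`; the type and function names are hypothetical, and it filters by variable and a ` mb` level suffix rather than by the exact list of 47 levels.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// idxEntry describes one GRIB message listed in an .idx file.
type idxEntry struct {
	Start int64 // first byte of the message
	End   int64 // last byte, inclusive; -1 means "to end of file"
	Var   string
	Level string
}

// parseIdx turns .idx text into entries with byte ranges filled in.
func parseIdx(text string) ([]idxEntry, error) {
	var entries []idxEntry
	for _, line := range strings.Split(text, "\n") {
		if strings.TrimSpace(line) == "" {
			continue
		}
		f := strings.Split(line, ":")
		if len(f) < 5 {
			return nil, fmt.Errorf("malformed idx line %q", line)
		}
		start, err := strconv.ParseInt(f[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad offset in %q: %w", line, err)
		}
		// The previous message ends one byte before this one starts.
		if n := len(entries); n > 0 {
			entries[n-1].End = start - 1
		}
		entries = append(entries, idxEntry{Start: start, End: -1, Var: f[3], Level: f[4]})
	}
	return entries, nil
}

// wanted reports whether the entry is a wind or height field on a pressure level.
func wanted(e idxEntry) bool {
	switch e.Var {
	case "HGT", "UGRD", "VGRD":
		return strings.HasSuffix(e.Level, " mb")
	}
	return false
}

// rangeHeader formats the value of an HTTP Range header for the entry.
func rangeHeader(e idxEntry) string {
	if e.End < 0 {
		return fmt.Sprintf("bytes=%d-", e.Start)
	}
	return fmt.Sprintf("bytes=%d-%d", e.Start, e.End)
}

func main() {
	// Made-up offsets in the documented line format.
	sample := "1:0:d=2026032806:PRMSL:mean sea level:anl:\n" +
		"2:1000:d=2026032806:HGT:1000 mb:anl:\n" +
		"3:2500:d=2026032806:UGRD:1000 mb:anl:\n"
	entries, err := parseIdx(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range entries {
		if wanted(e) {
			fmt.Println(e.Var, e.Level, rangeHeader(e))
		}
	}
}
```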

The dataset is stored as a memory-mapped flat binary file of float32 values in C-order with shape `(65, 47, 3, 361, 720)`.
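
With C-order (row-major) layout the last axis varies fastest, so the byte offset of any value follows directly from the shape. The sketch below shows that arithmetic and reproduces the dataset size from the table. The function names are illustrative, and the axis order (time, pressure level, variable, latitude, longitude) is inferred from the shape and the table rather than stated explicitly in the README.

```go
package main

import "fmt"

// shape is (time steps, pressure levels, variables, latitudes, longitudes).
var shape = [5]int64{65, 47, 3, 361, 720}

const bytesPerValue = 4 // float32

// datasetBytes returns the total file size implied by the shape.
func datasetBytes() int64 {
	n := int64(bytesPerValue)
	for _, d := range shape {
		n *= d
	}
	return n
}

// offset returns the byte offset of one value in C order.
func offset(hour, level, variable, lat, lon int64) int64 {
	idx := hour
	idx = idx*shape[1] + level
	idx = idx*shape[2] + variable
	idx = idx*shape[3] + lat
	idx = idx*shape[4] + lon
	return idx * bytesPerValue
}

func main() {
	fmt.Println("dataset bytes:", datasetBytes())
	fmt.Println("bytes per time step:", offset(1, 0, 0, 0, 0))
}
```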

## Prediction Algorithms

All algorithms are exact ports of the reference implementations in Tawhiri. The following sections describe the key components.

### Interpolation (`internal/prediction/interpolate.go`)

4D wind interpolation from the dataset grid to arbitrary coordinates.

1. **Trilinear weights** (`pick3`): compute 8 interpolation weights for the (hour, lat, lon) cube corners.
2. **Altitude search** (`search`): binary search on interpolated geopotential height to find the two pressure levels bracketing the target altitude.
3. **Wind extraction** (`interp4`): 8-point weighted sum at each bracket level, then linear interpolation between levels.

Reference: `tawhiri/interpolate.pyx`
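
Step 1 can be illustrated with a small sketch. It assumes the grid starts at forecast hour 0, latitude -90 and longitude 0, with the 3-hour and 0.5-degree spacing from the Wind Dataset table, and that longitude wraps from index 719 back to 0. The names `pick` and `weight3` are chosen for this example and may not match `interpolate.go`.

```go
package main

import (
	"fmt"
	"math"
)

// lerp is one of the two neighbours along a single axis.
type lerp struct {
	index int
	w     float64
}

// weight3 is one corner of the (hour, lat, lng) cube with its weight.
type weight3 struct {
	Hour, Lat, Lng int
	W              float64
}

// pick finds the two grid indices bracketing value on an axis that starts
// at left with spacing step and has n points.
func pick(left, step float64, n int, value float64, name string) ([2]lerp, error) {
	a := (value - left) / step
	b := int(math.Floor(a))
	if b < 0 || b >= n-1 {
		return [2]lerp{}, fmt.Errorf("%s=%v is outside the dataset", name, value)
	}
	l := a - float64(b)
	return [2]lerp{{b, 1 - l}, {b + 1, l}}, nil
}

// pick3 returns the 8 trilinear weights; they always sum to 1.
func pick3(hour, lat, lng float64) ([8]weight3, error) {
	var out [8]weight3
	lh, err := pick(0, 3, 65, hour, "hour")
	if err != nil {
		return out, err
	}
	la, err := pick(-90, 0.5, 361, lat, "lat")
	if err != nil {
		return out, err
	}
	// One extra virtual column lets longitudes in [359.5, 360) wrap to 0.
	lo, err := pick(0, 0.5, 720+1, lng, "lng")
	if err != nil {
		return out, err
	}
	if lo[1].index == 720 {
		lo[1].index = 0
	}
	i := 0
	for _, h := range lh {
		for _, a := range la {
			for _, o := range lo {
				out[i] = weight3{h.index, a.index, o.index, h.w * a.w * o.w}
				i++
			}
		}
	}
	return out, nil
}

func main() {
	ws, err := pick3(1.5, 52.25, 359.75)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range ws {
		fmt.Printf("hour=%d lat=%d lng=%d w=%.3f\n", w.Hour, w.Lat, w.Lng, w.W)
	}
}
```

Steps 2 and 3 then apply these same 8 weights at each candidate pressure level, which is why the weights are computed once up front.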

### Solver (`internal/prediction/solver.go`)

4th-order Runge-Kutta integrator with dt = 60 seconds.

- State vector: (latitude, longitude, altitude) in degrees and metres.
- Time: UNIX timestamp in seconds.
- Longitude is kept in [0, 360) via Python-style modulo after each `vecadd`.
- When a terminator fires, binary search refinement (tolerance 0.01) finds the precise termination point between the last good step and the first terminated step.
- Longitude interpolation (`lngLerp`) handles the 0/360 wrap-around.

Reference: `tawhiri/solver.pyx`
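
The integration step and the longitude handling described above can be sketched as below. This is a generic classical RK4 step with a wrapping `vecadd`, written for illustration; it omits the terminator and binary search refinement, and the exact signatures in `solver.go` may differ. Go's `math.Mod` keeps the sign of the dividend, so the Python-style modulo needs an explicit correction for negative values.

```go
package main

import (
	"fmt"
	"math"
)

// vec is the state (or its time derivative): degrees, degrees, metres.
type vec struct{ lat, lng, alt float64 }

// model returns the derivative of the state at time t.
type model func(t float64, y vec) vec

// pymod behaves like Python's % for a positive divisor: the result is non-negative.
func pymod(a, b float64) float64 {
	m := math.Mod(a, b)
	if m < 0 {
		m += b
	}
	return m
}

// vecadd returns a + k*b with longitude wrapped into [0, 360).
func vecadd(a vec, k float64, b vec) vec {
	return vec{a.lat + k*b.lat, pymod(a.lng+k*b.lng, 360), a.alt + k*b.alt}
}

// rk4Step advances y by one classical Runge-Kutta step of size dt.
func rk4Step(f model, t float64, y vec, dt float64) vec {
	k1 := f(t, y)
	k2 := f(t+dt/2, vecadd(y, dt/2, k1))
	k3 := f(t+dt/2, vecadd(y, dt/2, k2))
	k4 := f(t+dt, vecadd(y, dt, k3))
	y = vecadd(y, dt/6, k1)
	y = vecadd(y, dt/3, k2)
	y = vecadd(y, dt/3, k3)
	y = vecadd(y, dt/6, k4)
	return y
}

// lngLerp interpolates between two longitudes the short way around.
func lngLerp(a, b, l float64) float64 {
	if a > b {
		a, b = b, a
		l = 1 - l
	}
	if b-a > 180 {
		a += 360
		return pymod((1-l)*a+l*b, 360)
	}
	return (1-l)*a + l*b
}

func main() {
	// A made-up constant model: drift east while climbing at 5 m/s.
	f := func(t float64, y vec) vec { return vec{0, 0.001, 5} }
	y := rk4Step(f, 0, vec{52.2, 359.99, 0}, 60)
	fmt.Printf("after 60 s: lat=%.4f lng=%.4f alt=%.1f\n", y.lat, y.lng, y.alt)
	fmt.Println("midpoint of 359 and 1:", lngLerp(359, 1, 0.5))
}
```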

### Models (`internal/prediction/models.go`)

- **Constant ascent**: vertical velocity = ascent_rate m/s.
- **Drag descent**: NASA atmosphere density model with drag coefficient = sea_level_rate * 1.1045. Descent rate increases with altitude due to thinner air.
- **Wind velocity**: u, v components from interpolation converted to degrees/second: `dlat = (180/pi) * v / (R)`, `dlng = (180/pi) * u / (R * cos(lat))` where R = 6371009 + altitude.
- **Linear model**: sum of component models (e.g., wind + ascent).
- **Elevation termination**: `ground_elevation > altitude` using ruaumoko dataset.

Reference: `tawhiri/models.py`
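
The wind conversion and drag descent formulas above translate into a few lines of Go. In the sketch below, the wind formulas and the 1.1045 factor come from the list above; the piecewise density constants are the commonly published metric form of NASA's simple Earth atmosphere model, written from memory, and should be treated as an assumption to check against `models.go`. The descent speed is taken as the drag coefficient divided by the square root of air density, which is what makes it grow with altitude. Function names are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

const earthRadius = 6371009.0 // metres, as given above

// windToDegrees converts wind components u (eastward) and v (northward),
// in m/s, to degrees per second at the given latitude and altitude.
func windToDegrees(u, v, latDeg, alt float64) (dlat, dlng float64) {
	r := earthRadius + alt
	dlat = (180 / math.Pi) * v / r
	dlng = (180 / math.Pi) * u / (r * math.Cos(latDeg*math.Pi/180))
	return dlat, dlng
}

// density returns air density in kg/m^3 at altitude alt (metres).
// Constants are assumed from NASA's simple atmosphere model.
func density(alt float64) float64 {
	var temp, pres float64 // degrees Celsius, kPa
	switch {
	case alt > 25000:
		temp = -131.21 + 0.00299*alt
		pres = 2.488 * math.Pow((temp+273.1)/216.6, -11.388)
	case alt > 11000:
		temp = -56.46
		pres = 22.65 * math.Exp(1.73-0.000157*alt)
	default:
		temp = 15.04 - 0.00649*alt
		pres = 101.29 * math.Pow((temp+273.1)/288.08, 5.256)
	}
	return pres / (0.2869 * (temp + 273.1))
}

// dragDescentRate returns the vertical velocity in m/s (negative = down).
func dragDescentRate(seaLevelRate, alt float64) float64 {
	dragCoefficient := seaLevelRate * 1.1045
	return -dragCoefficient / math.Sqrt(density(alt))
}

func main() {
	dlat, dlng := windToDegrees(10, 10, 60, 0)
	fmt.Printf("dlat=%.3e deg/s dlng=%.3e deg/s\n", dlat, dlng)
	for _, alt := range []float64{0, 10000, 30000} {
		fmt.Printf("descent rate at %5.0f m: %.1f m/s\n", alt, dragDescentRate(5, alt))
	}
}
```

A linear model in this scheme is simply the component-wise sum of such derivatives, for example the wind's (dlat, dlng, 0) plus the descent's (0, 0, rate).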

### Profiles

- **standard_profile**: ascent (constant rate + wind) until burst altitude, then descent (drag + wind) until ground level.
- **float_profile**: ascent to float altitude, then drift at constant altitude until stop time.

## Verification

The predictor has been verified against the reference Tawhiri implementation:

| Test | Result |
|---|---|
| Dataset (step 0): 36.6M float32 values vs Python/cfgrib | 0 mismatches, max diff = 0.0 |
| Prediction burst point vs public Tawhiri API | Identical (lat, lon, alt all match) |
| Prediction landing point vs public Tawhiri API | Identical lat/lon, 5m altitude diff (different elevation datasets) |
| Descent point count | Identical (46 points) |
| Ascent point count | Identical (101 points) |

## Development

```bash
# Regenerate ogen API code after modifying the swagger spec
make generate-ogen

# Run tests
make test

# Format
make fmt
```

### Comparison tools

```bash
# Compare single dataset step against Python/cfgrib reference
go run ./cmd/compare_step0 <run_YYYYMMDDHH> <output_path>

# Run prediction and compare against public Tawhiri API
go run ./cmd/compare_prediction
```

## References

- [Tawhiri](https://github.com/cuspaceflight/tawhiri): Reference Python/Cython predictor (Cambridge University Spaceflight)
- [tawhiri-downloader](https://github.com/cuspaceflight/tawhiri-downloader): OCaml dataset downloader
- [ruaumoko](https://github.com/cuspaceflight/ruaumoko): Global elevation dataset
- [NOAA GFS](https://www.ncei.noaa.gov/products/weather-climate-models/global-forecast): Global Forecast System
- [NOAA GFS on S3](https://noaa-gfs-bdp-pds.s3.amazonaws.com/index.html): Public S3 bucket
- [ETOPO 2022](https://www.ncei.noaa.gov/products/etopo-global-relief-model): Global relief model for elevation data
- [SondeHub Tawhiri API](https://api.v2.sondehub.org/tawhiri): Public Tawhiri instance for comparison