API Docs
This page documents the data source algorithm behind the Pirate Weather API; in short, it explains which parameter comes from where. Since the goal of this API is to provide raw model data with as little processing as possible, results from the API should very closely match the models described in this document, with some minor differences due to interpolation.
Data sources
Several models are used to produce the forecast. Most are hosted on AWS's Open Data Platform, and the fantastic Herbie package is used to download and perform initial processing for many of them.
RTMA Rapid Update
The Real-Time Mesoscale Analysis Rapid Update (RTMA-RU) provides real-time analysis for the continental US and parts of Canada. The model runs every 15 minutes and combines the HRRR first guess with satellite and station observations.
Because RTMA-RU blends a rapidly updating HRRR first guess with whatever observations are available at each 15-minute cycle, the analyses can sometimes show noticeable jumps from one update to the next. Changes in observation availability, timing, or quality, as well as shifts in how strongly the system weights those observations relative to the HRRR first guess, can cause sudden increases or decreases in the analyzed values. These cycle-to-cycle fluctuations are a normal artifact of the rapid-update data assimilation process, and they can appear in any variable, especially in areas with sparse or intermittent observational coverage.
NBM
The National Blend of Models (NBM) is a calibrated blend of both NOAA and non-NOAA weather models from around the world. Running every hour for about 7 days, the NBM produces a forecast that aims to leverage strengths from each of the source models, as well as providing some probabilistic forecasts. For most weather elements in the US and Canada, this is the primary source.
HRRR
The High Resolution Rapid Refresh (HRRR) provides forecasts over all of the continental US, as well as most of the Canadian population. Every hour the model produces an 18-hour forecast at 15-minute intervals, and every 6 hours it produces a 48-hour forecast, all at a 3 km resolution. This was perfect for this project, since Dark Sky provided a minute-by-minute forecast for 1 hour, which can be loosely approximated using the 15-minute HRRR forecasts.
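As a rough illustration of how 15-minute values could be stretched into a minute-by-minute series, the sketch below uses plain linear interpolation. This is an assumption made for the example, not a description of the API's actual method; the function name and the sample intensities are hypothetical.

```python
def interpolate_to_minutes(values, step_minutes=15):
    """Linearly interpolate evenly spaced values down to 1-minute spacing."""
    if len(values) < 2:
        return list(values)
    minutely = []
    # Walk over each pair of neighbouring sub-hourly values.
    for start, end in zip(values, values[1:]):
        for minute in range(step_minutes):
            fraction = minute / step_minutes
            minutely.append(start + (end - start) * fraction)
    # Close the series with the final sub-hourly value.
    minutely.append(values[-1])
    return minutely


# Hypothetical precipitation intensities at 0, 15, 30, 45 and 60 minutes.
hrrr_subhourly = [0.0, 1.5, 3.0, 1.5, 0.0]
minutely = interpolate_to_minutes(hrrr_subhourly)
print(len(minutely))  # 61 values: minute 0 through minute 60
```

Five sub-hourly values span one hour, so the result has 61 entries. A smoother interpolation scheme could be swapped in without changing the calling code.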
GFS
The Global Forecast System (GFS) is NOAA's global weather model. Running with a resolution of about 30 km (0.25 degrees), the GFS model provides hourly forecasts out to 120 hours, and 3-hour forecasts out to 240 hours. Here, GFS data is used for anywhere in the world not covered by the HRRR model, and for all results past 48 hours.
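The mixed time step described above (hourly to 120 hours, then 3-hourly to 240 hours) can be sketched as a list of forecast lead times. Whether hour 0 is included is an assumption made here, and the function name is hypothetical.

```python
def gfs_forecast_hours(hourly_until=120, three_hourly_until=240):
    """Return forecast lead times in hours: hourly steps, then 3-hour steps."""
    hourly = list(range(0, hourly_until + 1))
    # The 3-hourly portion starts one 3-hour step after the last hourly value.
    three_hourly = list(range(hourly_until + 3, three_hourly_until + 1, 3))
    return hourly + three_hourly


hours = gfs_forecast_hours()
print(hours[119:124])  # [119, 120, 123, 126, 129]
print(len(hours))      # 161
```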
The GFS model also underpins the Global Ensemble Forecast System (GEFS), which is the 30-member ensemble (the website says 21, but there are 30 data files) version of the GFS. This means that 30 different "versions" of the model are run, each with slightly different starting assumptions. How the API uses the GEFS is covered in the GEFS section below.
GEFS
The Global Ensemble Forecast System (GEFS) is the ensemble version of NOAA's GFS model. By varying parameters and inputs, 30 different versions of this model are run at the same time, providing 3-hour forecasts out to 240 hours. The API uses the GEFS to get precipitation type, quantity, and probability, since it seemed like the most accurate way of determining this. I have no idea how Dark Sky did it, and I am very open to feedback about other ways it could be assigned, since getting the precipitation probability number turned out to be one of the most complex parts of the entire setup!
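One common way to turn an ensemble into a probability is to count the fraction of members that predict precipitation above some threshold. The sketch below shows that idea for 30 members. It is an illustration under stated assumptions, not the API's actual calculation; the threshold, function name, and sample values are all hypothetical.

```python
def ensemble_precip_stats(member_intensities, threshold=0.0):
    """Return (probability, mean intensity) from per-member precipitation values."""
    if not member_intensities:
        raise ValueError("at least one ensemble member is required")
    count = len(member_intensities)
    # Probability: share of members with precipitation above the threshold.
    wet_members = sum(1 for value in member_intensities if value > threshold)
    probability = wet_members / count
    # Quantity: simple mean across all members, dry ones included.
    mean_intensity = sum(member_intensities) / count
    return probability, mean_intensity


# Hypothetical 30-member ensemble: 18 dry, 6 light, 6 heavier.
members = [0.0] * 18 + [0.5] * 6 + [2.0] * 6
probability, mean_intensity = ensemble_precip_stats(members)
print(probability, mean_intensity)
```

With 12 of 30 members showing precipitation, the probability comes out at 40%. Raising the threshold would ignore trace amounts and lower the reported probability.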
ECMWF IFS
The European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (ECMWF IFS) is a global numerical weather prediction model used for medium-range to long-range atmospheric forecasting. It combines a spectral atmospheric model, an ocean model, and advanced data assimilation techniques to produce some of the most accurate weather forecasts in the world. Probability results are also included from the Ensemble version of this forecast.
The ECMWF IFS underpins many operational forecasting systems worldwide, serving as a benchmark for global models due to its strong performance in forecast skill, particularly for medium-range (3–10 days) predictions and ensemble probabilistic guidance.
DWD MOSMIX
Deutscher Wetterdienst Model Output Statistics-MIX (DWD MOSMIX) is a statistically post-processed forecast product produced by the German Weather Service. Rather than a single numerical model, MOSMIX blends output from several global and regional models and applies bias corrections based on historical station observations. The result is high-quality point forecasts optimized for specific locations.
MOSMIX provides hourly forecasts for thousands of stations worldwide, though not all parameters are available at every station. Here, MOSMIX data is used wherever it is available, offering refined, observation-tuned guidance—particularly strong within Europe, where DWD’s station network is most comprehensive.
ERA5
To provide historical weather data, Google's copy of the ECMWF Reanalysis v5 (ERA5) dataset is used, specifically their full_37-1h-0p25deg-chunk-1.zarr-v3 product. Details on the Google implementation are available in their repository. In the medium term, I'll be exploring adding a local copy of this dataset, which would significantly improve performance.
Forecast element sources
Every Pirate Weather forecast element for each time block (currently, minutely, hourly, or daily) is included in the table below, along with the primary, secondary, and tertiary data sources. Fallback sources are used if model data is intentionally excluded, the request point is outside of the primary model coverage area, or if there's some sort of data interruption.
At a high level, the general approach is to use NBM first, then HRRR, then DWD MOSMIX, then ECMWF IFS, then GEFS, and finally GFS. However, for currently and minutely blocks, data from the sub-hourly (15-minute) HRRR and the RTMA-RU models are preferred when available.
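The fallback idea can be sketched as walking the priority list and taking the first model that has a usable value. This is a minimal sketch: the function name, the dictionary layout, and the use of `None` or NaN to mark missing data are assumptions made for the example.

```python
import math

# General order described above; model names are used as plain labels.
GENERAL_PRIORITY = ["NBM", "HRRR", "DWD MOSMIX", "ECMWF IFS", "GEFS", "GFS"]


def pick_source(values_by_model, priority=GENERAL_PRIORITY):
    """Return (model, value) for the first model in priority order with data."""
    for model in priority:
        value = values_by_model.get(model)
        # Skip models that are absent, excluded, or returned no data.
        if value is not None and not math.isnan(value):
            return model, value
    return None, None


# Hypothetical point outside NBM and HRRR coverage.
temperature = {"NBM": float("nan"), "HRRR": None, "ECMWF IFS": 11.2, "GFS": 10.8}
print(pick_source(temperature))  # ('ECMWF IFS', 11.2)
```

Passing a different `priority` list per parameter or time block would cover the exceptions listed in the tables below.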
Currently
| Parameter | Global/Standard Priority | North America Priority |
|---|---|---|
| apparentTemperature | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| cape | HRRR_SubH > NBM > GFS | HRRR_SubH > NBM > GFS |
| cloudCover | RTMA-RU > NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| currentDayIce | NBM > HRRR > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS |
| currentDayLiquid | NBM > HRRR > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS |
| currentDaySnow | NBM > HRRR > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS |
| dewPoint | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| fireIndex | NBM | NBM |
| feelsLike | NBM > GFS | NBM > GFS |
| humidity | RTMA-RU > HRRR > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| nearestStormBearing | GFS | GFS |
| nearestStormDistance | GFS | GFS |
| ozone | GFS | GFS |
| precipIntensity | HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GEFS | HRRR_SubH > NBM > ECMWF IFS > GEFS > DWD MOSMIX |
| precipIntensityError | ECMWF IFS > GEFS | ECMWF IFS > GEFS |
| precipProbability | NBM > ECMWF IFS > GEFS | NBM > ECMWF IFS > GEFS |
| precipType | HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GEFS | HRRR_SubH > NBM > ECMWF IFS > GEFS > DWD MOSMIX |
| pressure | HRRR > ECMWF IFS > DWD MOSMIX > GFS | HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| solar | HRRR_SubH > NBM > DWD MOSMIX > GFS | HRRR_SubH > NBM > GFS > DWD MOSMIX |
| smoke | HRRR | HRRR |
| temperature | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| uvIndex | GFS | GFS |
| visibility | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| windBearing | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
| windGust | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > GFS | RTMA-RU > HRRR_SubH > NBM > GFS > DWD MOSMIX |
| windSpeed | RTMA-RU > HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GFS | RTMA-RU > HRRR_SubH > NBM > ECMWF IFS > GFS > DWD MOSMIX |
Minutely
| Parameter | Global/Standard Priority | North America Priority |
|---|---|---|
| precipIntensity | HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GEFS | HRRR_SubH > NBM > ECMWF IFS > GEFS > DWD MOSMIX |
| precipIntensityError | ECMWF IFS > GEFS | ECMWF IFS > GEFS |
| precipProbability | NBM > ECMWF IFS > GEFS | NBM > ECMWF IFS > GEFS |
| precipType | HRRR_SubH > NBM > DWD MOSMIX > ECMWF IFS > GEFS | HRRR_SubH > NBM > ECMWF IFS > GEFS > DWD MOSMIX |
Hourly / Daily / Day/Night
| Parameter | Global/Standard Priority | North America Priority |
|---|---|---|
| apparentTemperature | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| cape | NBM > HRRR > GFS | NBM > HRRR > GFS |
| cloudCover | NBM > HRRR > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS |
| dewPoint | NBM > HRRR > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS |
| fireIndex | NBM | NBM |
| feelsLike | NBM > GFS | NBM > GFS |
| humidity | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| iceAccumulation | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS > DWD MOSMIX |
| liquidAccumulation | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS > DWD MOSMIX |
| nearestStormBearing | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS > DWD MOSMIX |
| nearestStormDistance | GFS | GFS |
| ozone | GFS | GFS |
| precipAccumulation | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS > DWD MOSMIX |
| precipIntensity | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS | NBM > HRRR > ECMWF IFS > GEFS > DWD MOSMIX |
| precipIntensityError | ECMWF IFS > GEFS | ECMWF IFS > GEFS |
| precipProbability | NBM > ECMWF IFS > GEFS | NBM > ECMWF IFS > GEFS |
| precipType | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS | NBM > HRRR > ECMWF IFS > GEFS > DWD MOSMIX |
| pressure | HRRR > DWD MOSMIX > ECMWF IFS > GFS | HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| solar | NBM > HRRR > DWD MOSMIX > GFS | NBM > HRRR > GFS > DWD MOSMIX |
| snowAccumulation | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GEFS > GFS | NBM > HRRR > ECMWF IFS > GEFS > GFS > DWD MOSMIX |
| smoke | HRRR | HRRR |
| temperature | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| uvIndex | GFS | GFS |
| visibility | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| windBearing | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
| windGust | NBM > HRRR > DWD MOSMIX > GFS | NBM > HRRR > GFS > DWD MOSMIX |
| windSpeed | NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS | NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX |
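The two priority columns can be represented as data and parsed into ordered lists. The sketch below copies two rows from the hourly table and picks a column based on a flag. How a request point is classified as North America is not covered here, so the `in_north_america` flag is a hypothetical input, as are the dictionary layout and function name.

```python
# Two rows copied from the hourly table above; the layout is illustrative.
PRIORITY_TABLE = {
    "temperature": {
        "global": "NBM > HRRR > DWD MOSMIX > ECMWF IFS > GFS",
        "north_america": "NBM > HRRR > ECMWF IFS > GFS > DWD MOSMIX",
    },
    "windGust": {
        "global": "NBM > HRRR > DWD MOSMIX > GFS",
        "north_america": "NBM > HRRR > GFS > DWD MOSMIX",
    },
}


def priority_for(parameter, in_north_america):
    """Return the ordered list of model names for a parameter and region."""
    column = "north_america" if in_north_america else "global"
    row = PRIORITY_TABLE[parameter][column]
    # Split "A > B > C" into ["A", "B", "C"], trimming the padding spaces.
    return [name.strip() for name in row.split(">")]


print(priority_for("temperature", in_north_america=True))
print(priority_for("windGust", in_north_america=False))
```

In both rows the only difference between the columns is where DWD MOSMIX sits: ahead of the global models in the standard order, and last in the North America order.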
Data Pipeline
Trigger
Forecasts are saved onto the AWS Public Cloud into buckets for the HRRR, GFS, GEFS, RTMA-RU, and ECMWF IFS models. Since I couldn't find a good way to trigger processing tasks based on S3 events in a public bucket, the ingest system relies on timed events scheduled through AWS EventBridge Rules, with the timings shown in the table below:
| Model | Run Times (UTC) | Delay (h:mm) | Ingest Times (UTC) |
|---|---|---|---|
| GFS | 0,6,12,18 | 5:00 | 5,11,17,23 |
| GEFS | 0,6,12,18 | 7:00 | 7,13,19,1 |
| NBM | 0-24 | 1:45 | 1:45-0:45 |
| HRRR (48h) | 0,6,12,18 | 2:30 | 2:30,8:30,14:30,20:30 |
| HRRR (18h / SubHourly) | 0-24 | 1:45 | 1:45-0:45 |
| RTMA-RU | 0-24 | 0:25 | :25,:40,:55,:10 |
| ECMWF IFS | 0,12 | 8:00 | 8,20 |
| DWD MOSMIX | 0-24 | 1:00 | 1:00-0:00 |
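The ingest times in the table follow from adding the delay to each run time and wrapping around midnight. A small sketch of that arithmetic is shown here; the function name is hypothetical and the code is not taken from the actual ingest system.

```python
from datetime import timedelta


def ingest_times(run_hours, delay):
    """Return ingest times (UTC, as h:mm strings) for run hours plus a delay."""
    one_day = timedelta(hours=24)
    times = []
    for hour in run_hours:
        # Wrap past midnight so that 18:00 + 7:00 becomes 1:00.
        total = (timedelta(hours=hour) + delay) % one_day
        minutes = int(total.total_seconds() // 60)
        times.append(f"{minutes // 60}:{minutes % 60:02d}")
    return times


print(ingest_times([0, 6, 12, 18], timedelta(hours=5)))
# ['5:00', '11:00', '17:00', '23:00']  (GFS row)
print(ingest_times([0, 6, 12, 18], timedelta(hours=7)))
# ['7:00', '13:00', '19:00', '1:00']  (GEFS row)
print(ingest_times([0, 6, 12, 18], timedelta(hours=2, minutes=30)))
# ['2:30', '8:30', '14:30', '20:30']  (HRRR 48h row)
```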