
Climate and air pollution prediction from spatio-temporal observational constraints
Current climate models cannot resolve key small-scale processes such as clouds, aerosols, and ocean eddies, which introduces significant uncertainties. These models are generally evaluated against spatio-temporally aggregated observations. At the same time, increasingly sophisticated satellite retrievals are becoming available at matching spatial resolution and would in theory provide strong constraints; in practice, however, data volumes are too large for routine pixel-level evaluation. Novel approaches are required.
This use case addresses this issue by developing embedding-based methods to analyse, evaluate, and track small-scale processes in a compressed latent space:
- One strand of experiments focuses on understanding the utility of embeddings for specific applications in atmospheric science, including how the atmosphere interacts with Earth's surface.
- Another strand focuses on benchmarking the compression of atmospheric remote sensing data, ensuring that novel neural compression schemes meet the requirements of weather experts, with a focus on clouds.
Data sources used
Our use case leverages the following Earth observation missions: multi-spectral imagery from the MODIS (NASA) and Sentinel-2 (ESA) low-Earth-orbit satellites (Fig. 1), and longwave radiation data derived from the GOES-16 (NOAA/NASA) geostationary satellite. The methodology is nevertheless intended to be independent of the specific choice of satellite. In addition, we will augment these remote sensing modalities with:
- weather and climate data such as the ERA5 reanalysis (ECMWF) and the Coupled Model Intercomparison Project (CMIP, WCRP)
- nitrogen dioxide reanalysis data (CAMS)
- longwave radiation and precipitation from the km-scale model developed in nextGEMS (ICON)

Methodology
Strand 1 focuses on creating embeddings that capture cloud and aerosol structure from high-resolution satellite data. Foundation models and tokenisation networks such as finite scalar quantised variational autoencoders (Mentzer et al. 2024) can produce discrete latent tokens from raw remote sensing modalities; Figure 2 illustrates this process. Clustering in the latent space then enables unsupervised cloud classification, aerosol detection, and storm-regime identification. Latent diffusion models such as Stable Diffusion (Rombach et al. 2022) help to predict Earth surface parameters in embedding space from weather information.
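To make the tokenisation step concrete, the following is a minimal sketch of the quantisation idea behind finite scalar quantisation, not the project's implementation. It assumes odd level counts per channel, omits the encoder, decoder, and the straight-through gradient used during training, and uses random numbers in place of real encoder outputs. The function names, the 4x4 feature map, and the level configuration are hypothetical.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Quantise latents z of shape (..., d) to integer codes per channel.

    Simplified sketch: odd level counts only, inference only.
    """
    levels = np.asarray(levels)
    if np.any(levels % 2 == 0):
        raise ValueError("this sketch handles odd level counts only")
    half = (levels - 1) / 2          # e.g. 5 levels -> codes in {-2, ..., 2}
    bounded = np.tanh(z) * half      # squash each channel into (-half, half)
    return np.round(bounded).astype(np.int64)

def codes_to_tokens(codes, levels):
    """Map per-channel codes to one token index in the implicit codebook."""
    levels = np.asarray(levels)
    half = (levels - 1) // 2
    # Mixed-radix basis: 1, L0, L0*L1, ...
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return np.sum((codes + half) * basis, axis=-1)

rng = np.random.default_rng(0)
latents = rng.normal(size=(4, 4, 3))  # hypothetical 4x4 feature map, 3 latent channels
levels = [5, 5, 3]                    # implicit codebook of 5 * 5 * 3 = 75 tokens
codes = fsq_quantize(latents, levels)
tokens = codes_to_tokens(codes, levels)
print(tokens.shape, int(tokens.min()), int(tokens.max()))
```

Each spatial position ends up with a single integer token, which is the form of discrete representation that clustering or sequence models can then operate on.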
Strand 2 develops the ClimateBenchPress benchmark (Reichelt et al. 2026), which is tailored to the needs of atmospheric scientists and addresses gaps in existing neural compressors (Gomes et al. 2025). We evaluate reconstruction quality against a set of metrics selected in discussion with stakeholders. Moreover, we study the long-term evolution of clouds in (compressed) embedding space, guided by sparse labels (Assran et al. 2021), to better understand the interaction between Earth's atmosphere and its surface.
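As an illustration of what evaluating reconstruction quality can look like, the sketch below compresses a synthetic field with a simple stand-in compressor (8-bit uniform quantisation followed by zlib) and reports three generic metrics. The compressor, the synthetic field, and the metrics (RMSE, maximum absolute error, compression ratio) are assumptions chosen for illustration; they are not the ClimateBenchPress metric set or any of the neural compressors under study.

```python
import zlib
import numpy as np

def compress_uniform(field):
    """Stand-in lossy compressor: 8-bit uniform quantisation + zlib."""
    lo, hi = float(field.min()), float(field.max())
    scale = (hi - lo) / 255 or 1.0   # fall back to 1.0 for constant fields
    q = np.round((field - lo) / scale).astype(np.uint8)
    return zlib.compress(q.tobytes()), lo, scale

def decompress_uniform(payload, lo, scale, shape):
    q = np.frombuffer(zlib.decompress(payload), dtype=np.uint8).reshape(shape)
    return q.astype(np.float64) * scale + lo

def evaluate(original, reconstructed, compressed_bytes):
    """Generic reconstruction metrics (illustrative choice)."""
    err = reconstructed - original
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "max_abs_error": float(np.max(np.abs(err))),
        "compression_ratio": original.nbytes / compressed_bytes,
    }

# Synthetic smooth 64x64 field standing in for a real variable.
y, x = np.mgrid[0:64, 0:64]
field = 250.0 + 30.0 * np.sin(x / 10.0) * np.cos(y / 7.0)

payload, lo, scale = compress_uniform(field)
restored = decompress_uniform(payload, lo, scale, field.shape)
metrics = evaluate(field, restored, len(payload))
print(metrics)
```

Reporting a pointwise bound such as the maximum absolute error alongside an average such as RMSE is one way to expose localised artefacts that an average alone would hide.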

Embeddings
Embeddings are compact vector representations produced by a geospatial foundation model. Instead of storing raw imagery, the model encodes the relevant information into a lower-dimensional representation that can be used for downstream analysis.
In this project, embeddings are generated as a time series of spatial feature maps, where each embedding summarises the information contained in the corresponding satellite observations.
These embeddings act as a bridge between large EO datasets and lightweight machine learning workflows. Users can train simple models on top of them without needing to access or process the underlying satellite imagery.
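A minimal sketch of such a lightweight workflow is a linear probe fitted directly on embedding vectors. The example below uses closed-form ridge regression in NumPy; the embeddings and the target variable are synthetic stand-ins, and the dimensions, noise level, and regularisation strength are arbitrary assumptions.

```python
import numpy as np

def fit_ridge(X, y, alpha=1e-2):
    """Closed-form ridge regression with a bias column.

    Solves (X^T X + alpha I) w = X^T y; the bias is regularised too,
    which is acceptable for a sketch.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(X, w):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w

rng = np.random.default_rng(0)
n_samples, dim = 500, 16
embeddings = rng.normal(size=(n_samples, dim))   # stand-in for model embeddings
true_w = rng.normal(size=dim)
# Synthetic target that depends linearly on the embedding, plus noise.
target = embeddings @ true_w + 0.1 * rng.normal(size=n_samples)

train, test = slice(0, 400), slice(400, None)
w = fit_ridge(embeddings[train], target[train])
residual = predict(embeddings[test], w) - target[test]
rmse = float(np.sqrt(np.mean(residual ** 2)))
print(f"held-out RMSE: {rmse:.3f}")
```

The probe only ever touches the embedding array, so the raw imagery does not need to be downloaded or processed at this stage.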
The exact characteristics of the embeddings, such as their dimensionality, spatial resolution, and compression strategy, will depend on the selected foundation model and are being investigated as part of the project.
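Pending those choices, one way to picture a time series of spatial feature maps is as an array of shape (T, H, W, D): time steps, grid rows, grid columns, and embedding dimension. The sketch below assumes that layout, which is hypothetical, and computes the cosine distance between consecutive time steps at every grid cell as a simple indicator of temporal change. The data are synthetic, with one abrupt change injected on purpose.

```python
import numpy as np

def cosine_change(series):
    """Cosine distance between consecutive time steps, per grid cell.

    series: array of shape (T, H, W, D); returns shape (T - 1, H, W).
    """
    a, b = series[:-1], series[1:]
    dot = np.sum(a * b, axis=-1)
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return 1.0 - dot / np.maximum(norms, 1e-12)

rng = np.random.default_rng(0)
T, H, W, D = 6, 8, 8, 32                      # hypothetical sizes
base = rng.normal(size=(H, W, D))
# Nearly static scene: the same map at every step plus small noise.
series = np.repeat(base[None], T, axis=0) + 0.01 * rng.normal(size=(T, H, W, D))
# Inject an abrupt change at grid cell (2, 5) from time step 3 onwards.
series[3:, 2, 5] = rng.normal(size=D)

change = cosine_change(series)
t, i, j = np.unravel_index(np.argmax(change), change.shape)
print(f"largest change between steps {t} and {t + 1} at cell ({i}, {j})")
```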

Expected Impact
Neural compression of remote sensing data into foundation model embedding spaces is key to analysing the multi-petabyte geospatial datasets produced by projects such as nextGEMS, DYAMOND3 global cloud, and future Destination Earth simulations.
Embeddings offer a scalable way to evaluate and constrain clouds, aerosols, and air‑pollution processes in both satellite observations and next‑generation cloud‑resolving climate models. By enabling classification from sparse labels, self‑supervised detection of clouds, dust storms, and pollution features, and the tracking of convective systems and aerosol plumes, embeddings provide a practical route to extract scientific information from datasets that are otherwise too large for routine pixel‑level comparison. They also create a compact space for detecting changes linked to temperature shifts or emission reductions, identifying extreme events and outlier climate responses, and revealing potential model issues.
Future work
Several research directions are explored for this use case and may continue beyond the project timeframe.
New benchmarking datasets
A computer vision benchmark for near real-time forest disturbance detection and agent attribution will be assembled to support systematic evaluation.
Benchmarking geospatial foundation models
Existing GFMs pretrained on Sentinel-1 and Sentinel-2 data are evaluated to determine which provide the most informative embeddings for disturbance monitoring.
Temporal reasoning for disturbance detection
A major research focus will be the development of temporal models that analyse embedding time series in order to detect disturbances as early as possible and attribute their likely causes.
Continental-scale disturbance mapping
In collaboration with experts in forest monitoring, this research direction would pave the way to producing high-resolution disturbance maps at European scale covering the full Sentinel mission history.

References
- Assran et al. (2021). Semi-supervised learning of visual features by non-parametrically predicting view assignments with support samples. Proceedings of the IEEE/CVF International Conference on Computer Vision.
- CAMS. Copernicus Atmosphere Monitoring Service (CAMS). https://atmosphere.copernicus.eu
- ECMWF. ECMWF Reanalysis v5 (ERA5) dataset. https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5
- ESA. Sentinel-2. https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2
- Gomes et al. (2025). Lossy neural compression for geospatial analytics: A review. IEEE Geoscience and Remote Sensing Magazine.
- ICON. Next Generation Earth Modelling Systems. https://nextgems-h2020.eu
- Mentzer et al. (2024). Finite scalar quantization: VQ-VAE made simple. International Conference on Learning Representations.
- NASA. GOES Satellite Network. https://science.nasa.gov/mission/goes/
- NASA. MODIS Web. https://modis.gsfc.nasa.gov/about/
- Reichelt et al. (2026). ClimateBenchPress (v1.0): A Benchmark for Lossy Compression of Climate Data. EGUsphere.
- Rombach et al. (2022). High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- WCRP. Coupled Model Intercomparison Project: CMIP. https://wcrp-cmip.org

