Embed2Scale Introduces TerraCodec: Neural Compression for Optical Earth Observation Data


The Embed2Scale project has released TerraCodec, an open-source family of neural compression models designed for optical Earth observation data.

Earth observation satellites continuously produce massive streams of multispectral and temporal imagery, creating growing challenges for storage, transmission, and large-scale analysis. TerraCodec addresses these challenges using neural compression models that encode multispectral imagery into compact latent representations and generate efficient bitstreams through learned entropy coding.
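
To make the idea of learned entropy coding concrete, the sketch below estimates how many bits a quantized latent tensor would cost under an ideal entropy coder. It is a minimal illustration and not TerraCodec's implementation: the latent is random data standing in for an encoder output, the empirical symbol histogram stands in for a learned probability model, and the function name `estimate_bits`, the tensor shape, and the 256×256 tile size are all assumptions made for this example.

```python
import numpy as np


def estimate_bits(latent: np.ndarray) -> float:
    """Ideal code length in bits of the rounded latent.

    Assumption: the empirical histogram of symbols replaces the
    learned entropy model that a neural codec would use.
    """
    # Quantize to integers, as neural codecs typically do before coding
    symbols = np.rint(latent).astype(np.int64).ravel()
    _, counts = np.unique(symbols, return_counts=True)
    probs = counts / counts.sum()
    # A symbol with probability p costs -log2(p) bits under an ideal coder
    return float(-(counts * np.log2(probs)).sum())


rng = np.random.default_rng(0)
# Hypothetical latent: 64 channels on a 16x16 grid (not a real encoder output)
latent = rng.normal(scale=2.0, size=(64, 16, 16))
bits = estimate_bits(latent)
# Assumed source tile of 256x256 pixels, i.e. 16x spatial downsampling
bpp = bits / (256 * 256)
print(f"{bits:.0f} bits, {bpp:.3f} bits per pixel")
```

The better the probability model predicts the quantized symbols, the fewer bits the bitstream needs, which is why a learned entropy model can outperform a fixed one.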

While most neural compression research has focused on RGB imagery, TerraCodec is designed specifically for multispectral satellite data. The model family includes efficient image compression baselines and temporal models that further improve compression by exploiting redundancy across satellite time series. All models are pretrained and evaluated at scale on the public SSL4EO-S12 v1.1 dataset and released open-source.

What TerraCodec enables

TerraCodec introduces neural compression models tailored for Earth observation workflows. The models allow users to:

  • Compress multispectral Sentinel-2 imagery using learned neural codecs
  • Exploit temporal redundancy in seasonal satellite-image sequences
  • Use flexible-rate compression with a single model checkpoint
  • Integrate compression directly into geospatial ML pipelines

TerraCodec achieves 3–10× higher compression than classical codecs such as JPEG2000 or WebP at comparable reconstruction quality. Beyond compression, the temporal models also enable applications such as cloud inpainting in satellite-image sequences, and the project studies how compression affects downstream Earth observation tasks.
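
As a rough illustration of what a relative gain such as "3–10×" means, the following sketch computes compression ratios for a single multispectral tile. Every number is an assumption chosen for the arithmetic: a 12-band, 16-bit, 256×256 tile and two hypothetical compressed sizes. None of these values are measured TerraCodec or JPEG2000 results, and the helper names are invented for this example.

```python
def raw_size_bytes(bands: int, height: int, width: int, bits_per_sample: int = 16) -> int:
    """Uncompressed size of a multispectral tile in bytes."""
    return bands * height * width * bits_per_sample // 8


def compression_ratio(raw_bytes: int, compressed_bytes: int) -> float:
    """How many times smaller the compressed file is than the raw data."""
    return raw_bytes / compressed_bytes


# Assumed tile: 12 bands, 256x256 pixels, 16 bits per sample
raw = raw_size_bytes(bands=12, height=256, width=256)

# Hypothetical compressed sizes at comparable quality (illustrative only)
classical_bytes = 100_000
neural_bytes = 20_000

classical_ratio = compression_ratio(raw, classical_bytes)
neural_ratio = compression_ratio(raw, neural_bytes)
# Relative gain of the neural codec over the classical one
relative_gain = neural_ratio / classical_ratio

print(f"raw size: {raw} bytes")
print(f"classical: {classical_ratio:.1f}x, neural: {neural_ratio:.1f}x")
print(f"relative gain: {relative_gain:.1f}x")
```

With these made-up sizes the relative gain is 5×, which would sit inside the reported 3–10× range. Such a comparison is only meaningful when both codecs are evaluated at matched reconstruction quality.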

Model family

The release includes models for different compression scenarios:

  • TEC-FP – compact factorised-prior image codec for efficient image compression
  • TEC-ELIC – improved entropy model with better rate–distortion performance
  • TEC-TT – temporal transformer codec for multispectral time-series data
  • FlexTEC – flexible-rate temporal model supporting multiple compression levels with one checkpoint
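
For readers unfamiliar with the term "rate–distortion performance" used above, learned codecs are commonly trained to balance bitrate against reconstruction error. The formulation below is the standard objective from the learned-compression literature, given here as background. The announcement does not state TerraCodec's exact loss, so the mean-squared-error distortion and the symbols are assumptions.

```latex
\mathcal{L} = R + \lambda D, \qquad
R = \mathbb{E}\left[-\log_2 p_{\hat{y}}(\hat{y})\right], \qquad
D = \mathbb{E}\left[\lVert x - \hat{x} \rVert_2^2\right]
```

Here $x$ is the input image, $\hat{x}$ its reconstruction, $\hat{y}$ the quantized latent, $p_{\hat{y}}$ the entropy model, and $\lambda$ a weight that sets the trade-off. A fixed-rate model is typically trained for one value of $\lambda$, whereas a flexible-rate model covers several operating points with a single checkpoint.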

Open-source release

TerraCodec is released as an open-source library with pretrained checkpoints and example notebooks. The models can be installed from PyPI and used standalone, or accessed through the TerraTorch model registry.

Towards scalable EO data pipelines

As Earth observation datasets continue to grow, neural compression methods like TerraCodec can make storage, transfer, and sharing of satellite data more efficient. By releasing models and code openly, the Embed2Scale project aims to accelerate research on compact representations for Earth observation.