Enki: Reconstructing Masked Pixels in Sea Surface Temperature with a Large Language Model
Presented at: Ocean Sciences Meeting 2024
Abstract
We will present Enki, a large language model created to reconstruct masked pixels (e.g. clouds) in sea surface temperature (SST) data. The model achieves excellent performance by combining (1) the attentive skill of a natural language processing algorithm (specifically, a vision transformer masked autoencoder) with (2) high-fidelity SST outputs from LLC4320, the fine-scale 1/48 degree, 90-level ocean simulation of the Estimating the Circulation and Climate of the Ocean project. We demonstrate that Enki consistently outperforms previously adopted inpainting techniques, by up to an order of magnitude in reconstruction error, while maintaining high performance even when the majority of pixels are masked.
We train, validate, and test Enki on 64x64-pixel cutouts of LLC4320 SST outputs drawn uniformly from across the ocean and achieve reconstruction root-mean-square errors (RMSE) below 0.05 K even in data with 50% masked pixels. We then apply Enki to VIIRS Level 2 products with artificially injected clouds and recover RMSE below the sensor uncertainty (RMSE < ~0.1 K) for all but the most complex fields. With Enki, we can increase the dataset available for analysis by an order of magnitude and/or adopt the algorithm to generate high-level products. Finally, we will report on new results that incorporate realistic cloud coverage and on performance at masking ratios exceeding 50%.
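The evaluation described above (mask a fraction of a 64x64 cutout, reconstruct, and score the RMSE on the masked pixels) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the helper names `random_patch_mask` and `masked_rmse`, the 4-pixel patch size, the synthetic SST field, and the mean-fill reconstruction, which is only a trivial stand-in baseline and not Enki or any of the inpainting techniques it is compared against.

```python
import numpy as np

def random_patch_mask(size=64, patch=4, ratio=0.5, rng=None):
    """Return a boolean (size, size) mask, True = masked pixel.

    Patches are chosen at random without replacement, loosely in the
    spirit of masked-autoencoder training. The patch size is an
    illustrative assumption, not a value taken from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = size // patch                      # patches per side
    n_masked = int(round(ratio * n * n))   # number of patches to hide
    flat = np.zeros(n * n, dtype=bool)
    flat[rng.choice(n * n, size=n_masked, replace=False)] = True
    grid = flat.reshape(n, n)
    # Expand each patch flag to a patch x patch block of pixels.
    return grid.repeat(patch, axis=0).repeat(patch, axis=1)

def masked_rmse(truth, recon, mask):
    """RMSE (same units as the inputs) over the masked pixels only."""
    diff = (recon - truth)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in for an SST cutout in kelvin (not real data).
rng = np.random.default_rng(0)
y, x = np.mgrid[0:64, 0:64]
truth = 290.0 + 0.05 * x + 0.5 * np.sin(y / 8.0)

mask = random_patch_mask(size=64, patch=4, ratio=0.5, rng=rng)

# Trivial baseline: fill masked pixels with the mean of the visible ones.
recon = truth.copy()
recon[mask] = truth[~mask].mean()

rmse = masked_rmse(truth, recon, mask)
print(f"masked fraction = {mask.mean():.2f}, baseline RMSE = {rmse:.3f} K")
```

Scoring only the masked pixels keeps the metric from being diluted by pixels the reconstruction never had to predict; a learned model would replace the mean-fill step while the masking and scoring stay the same.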