What if pretraining scientific foundation models didn’t require massive datasets at all?
We show that this is not only possible, but also comes with a range of neat benefits: our “Tadpole” models learn from canonical PDE data that is generated on the fly by efficient and accurate spectral solvers. This effectively provides unlimited training data and circumvents storage and I/O bottlenecks.
Three key ingredients make the Tadpole approach work:
- Autoencoding instead of dynamics pretraining: we learn transferable spatial representations rather than system-specific dynamics. This improves generalization across heterogeneous PDE systems.
- Custom parameter-efficient fine-tuning (PEFT): we combine low-rank adaptation (LoRA), latent-space transformations, and reintroduced skip connections. This gives very good performance for temporal predictions, e.g., outperforming the Walrus model, which has orders of magnitude more parameters.
- Online pretraining at scale: as outlined above, Tadpole generates PDE data online via GPU-based exponential time differencing Runge-Kutta (ETDRK) solvers. This scales to the equivalent of hundreds of terabytes of training data.
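To make the LoRA part of the PEFT ingredient concrete, here is a minimal NumPy sketch of a low-rank adapter on a single frozen linear layer. It is not taken from the Tadpole code: the class name `LoRALinear`, the `rank` and `alpha` defaults, and the initialization are illustrative assumptions, and the latent transformations and skip connections are not shown.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (hypothetical helper)."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                      # pretrained weight, kept frozen
        # Trainable factors: A starts small and random, B starts at zero,
        # so the adapted layer initially equals the pretrained one.
        self.A = rng.normal(scale=0.01, size=(rank, d_in))
        self.B = np.zeros((d_out, rank))
        self.scale = alpha / rank

    def delta_weight(self):
        # Effective weight update; its rank is at most `rank`.
        return self.scale * self.B @ self.A

    def __call__(self, x):
        # x has shape (batch, d_in); the low-rank path is added to the frozen path.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def num_trainable(self):
        return self.A.size + self.B.size
```

For a 256 x 256 layer with `rank=4`, only 2,048 of the 65,536 weights' worth of parameters are trained, which is the kind of saving that makes fine-tuning cheap.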
Most scientific foundation models are bottlenecked by data generation and storage. Tadpole flips this paradigm: data is no longer a static, pre-computed asset. Instead, it is generated online and becomes part of the training loop. I think this is a key step toward scalable foundation models for applications in science and engineering.
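As a rough illustration of what "data as part of the training loop" can look like, the sketch below streams freshly simulated samples from a small spectral solver instead of reading them from disk. Everything specific here is an assumption for illustration: it uses the 1D viscous Burgers equation, a second-order ETD Runge-Kutta scheme (ETDRK2), and NumPy on the CPU, whereas Tadpole works with 3D PDEs and GPU-based solvers. The function names and default parameters are hypothetical.

```python
import numpy as np


def make_etdrk2_stepper(n=128, nu=0.1, dt=1e-2):
    """Return step(u_hat): one ETDRK2 step of u_t = nu*u_xx - u*u_x in Fourier space."""
    k = np.arange(n // 2 + 1)               # wavenumbers on a 2*pi-periodic domain
    z = -nu * k**2 * dt                     # linear (diffusion) operator times dt
    nonzero = z != 0
    safe_z = np.where(nonzero, z, 1.0)      # placeholder to avoid 0/0 at k = 0
    decay = np.exp(z)
    # phi1 = (e^z - 1)/z and phi2 = (e^z - 1 - z)/z^2, with their limits at z = 0
    phi1 = np.where(nonzero, np.expm1(safe_z) / safe_z, 1.0)
    phi2 = np.where(nonzero, (np.expm1(safe_z) - safe_z) / safe_z**2, 0.5)
    dealias = k < n / 3                     # 2/3 rule for the quadratic term

    def nonlinear(u_hat):
        u = np.fft.irfft(u_hat, n=n)
        return -0.5j * k * np.fft.rfft(u * u) * dealias

    def step(u_hat):
        n0 = nonlinear(u_hat)
        a = decay * u_hat + dt * phi1 * n0              # exponential Euler predictor
        return a + dt * phi2 * (nonlinear(a) - n0)      # second-order correction

    return step


def random_initial_state(rng, n=128, max_mode=4):
    """Smooth, zero-mean random field built from a few low Fourier modes."""
    u_hat = np.zeros(n // 2 + 1, dtype=complex)
    u_hat[1:max_mode + 1] = rng.normal(size=max_mode) + 1j * rng.normal(size=max_mode)
    u = np.fft.irfft(u_hat, n=n)
    return u / np.abs(u).max()


def online_batches(batch_size=8, n=128, steps=50, seed=0):
    """Endless generator of freshly simulated batches; nothing is stored on disk."""
    rng = np.random.default_rng(seed)
    step = make_etdrk2_stepper(n=n)
    while True:
        batch = np.empty((batch_size, n))
        for i in range(batch_size):
            u_hat = np.fft.rfft(random_initial_state(rng, n))
            for _ in range(steps):
                u_hat = step(u_hat)
            batch[i] = np.fft.irfft(u_hat, n=n)
        yield batch
```

A training loop would simply call `next(...)` on the generator at each iteration, so the dataset size is bounded only by compute time, not by storage.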
- Code: https://github.com/tum-pbs/Tadpole
- Preprint: https://arxiv.org/abs/2605.15284
- Data: https://huggingface.co/thuerey-group/Tadpole
Full abstract: We introduce Tadpole, a novel foundation model for three-dimensional partial differential equations (PDEs) that addresses key challenges in transferability, scalability to high dimensionality, and multi-functionality. Tadpole is pre-trained as an autoencoder on synthetic 3D PDE data generated by an efficient online data-generation framework. This enables large-scale, diverse training without storage or I/O overhead, demonstrated by scaling to an equivalent of hundreds of terabytes of training data. By autoencoding single-channel spatial crops, Tadpole learns rich and transferable representations across heterogeneous physical systems with varying numbers of state variables and spatial resolutions. Although pre-trained solely as an autoencoder, Tadpole can be efficiently applied for multiple downstream tasks beyond reconstruction, including dynamics learning and generative modeling. For dynamics learning, we propose a novel parameter-efficient fine-tuning strategy that integrates low-rank adaptation, latent-space transformations, and reintroduced skip connections, achieving accurate temporal modeling with a minimal number of trainable parameters. Tadpole demonstrates strong fine-tuning performance across various downstream tasks, highlighting its versatility and effectiveness as a foundation model for 3D PDE learning.
