Sharmila Karumuri, Lori Graham-Brady and Somdatta Goswami.
The presentation slides and the recording of our talk, outlining our approach and the results we achieved, are available here:
- Slides: DeepONet-randomization-study-final.pdf
- Recording: DeepONet-Efficient-Training-with-Random-Sampling_presentation.mp4
In this work, we introduce a novel random sampling technique for training DeepONet, designed to enhance model generalization and reduce computational demands. This technique focuses on the trunk network of DeepONet, which generates basis functions for spatiotemporal locations within a bounded domain where the physical system operates.
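To make the branch/trunk structure concrete, here is a minimal unstacked DeepONet sketch in PyTorch. The layer widths, sensor count `m`, and latent dimension `p` are illustrative assumptions, not the configurations used in the paper:

```python
# Minimal unstacked DeepONet sketch (illustrative sizes, not the paper's).
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    def __init__(self, m=100, dim_y=2, p=64):
        super().__init__()
        # Branch net: input function sampled at m fixed sensors -> p coefficients.
        self.branch = nn.Sequential(nn.Linear(m, 128), nn.Tanh(), nn.Linear(128, p))
        # Trunk net: spatiotemporal query point y -> p basis-function values.
        self.trunk = nn.Sequential(nn.Linear(dim_y, 128), nn.Tanh(), nn.Linear(128, p))

    def forward(self, u, y):
        b = self.branch(u)  # (n_funcs, p)
        t = self.trunk(y)   # (n_pts, p)
        return b @ t.T      # (n_funcs, n_pts): G(u)(y) as a coefficient-basis inner product
```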
Traditionally, DeepONet training evaluates the trunk network on a uniform grid of spatiotemporal points to construct the loss function at every iteration. This inflates the effective batch size, which can lead to poor generalization, slower convergence, and increased memory usage. Our method instead draws a random subset of trunk-network inputs at each iteration, reducing the effective batch size and thereby improving generalization, lowering memory usage, and enhancing computational efficiency.
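As a rough illustration of this idea (our own simplified naming here; the repository's DeepONet_analysis.py scripts are the authoritative implementation), a training step that subsamples the trunk inputs might look like the following, reusing the DeepONet sketch above:

```python
# Hypothetical training step: evaluate the trunk net on a random subset of the
# grid each iteration instead of on all N points, shrinking the effective
# batch (n_funcs x n_points) and the memory footprint.
import torch

def train_step(model, optimizer, u_batch, grid, targets, n_sub=256):
    # u_batch: (n_funcs, m) input functions; grid: (N, dim_y) all spatiotemporal
    # points; targets: (n_funcs, N) solution values at those points.
    idx = torch.randperm(grid.shape[0])[:n_sub]  # fresh random subset per step
    pred = model(u_batch, grid[idx])             # trunk evaluated only at the subset
    loss = torch.mean((pred - targets[:, idx]) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because each iteration sees only `n_sub` of the N grid points, per-step compute and memory drop roughly by a factor of N / `n_sub`, while fresh subsets across iterations still cover the full domain over the course of training.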
We validate our approach with three benchmark examples, demonstrating significant reductions in training time while maintaining or even improving generalization performance compared to the traditional training approach.
Figure 1: Traditional Approach
The dotted lines in the train and test plots represent the traditional training approach, while the solid lines depict our randomized sampling approach. The animation shows that we reach the same test accuracy in one-fifth of the training time.
The labeled datasets used for the problems in the manuscript, the data-generation scripts, and the postprocessing results are uploaded here.
The code for the examples is written in PyTorch. Clone our repository and install the dependencies listed in requirements.txt:
git clone https://github.com/Centrum-IntelliPhysics/Efficient_DeepONet_training.git
cd Efficient_DeepONet_training
pip install -r requirements.txt
This repository contains implementations and analyses for the experiments described in the paper. It is organized as follows:
- Example Folders: Each example discussed in the paper is located in its respective folder. Within these folders, you will find a Python file named DeepONet_analysis.py, which demonstrates the process of random subsampling of the inputs to the trunk network for efficient training of the DeepONet model.
- Results: Outputs generated by running the DeepONet_analysis.py scripts are saved in the 'analysis_results' folder.
- Postprocessing: The 'postprocessing' folder contains code for generating plots and visualizations based on the analysis results.
Our preprint is available on arXiv, and the published version is available here. If you use this code in your research, please cite our paper:
@article{doi:10.1142/S2811032325400016,
  author  = {Karumuri, Sharmila and Graham-Brady, Lori and Goswami, Somdatta},
  title   = {Efficient Training of Deep Neural Operator Networks via Randomized Sampling},
  journal = {World Scientific Annual Review of Artificial Intelligence},
  volume  = {03},
  pages   = {2540001},
  year    = {2025},
  doi     = {10.1142/S2811032325400016}
}
