Managing OpenFOAM Physical Simulations with DVC, CML, and Studio (Part 1)
In this series of blog posts, we discuss the challenges of using OpenFOAM for computational fluid dynamics simulations, as well as the benefits of using DVC, CML, and Iterative Studio for data versioning, experiment management, and cloud resource management. In the first part, we build a demo project with OpenFOAM and DVC to automate running simulations and to capture and track data and code.
- Mikhail Rozhkov+1
- April 17, 2023 • 14 min read
OpenFOAM is a powerful, open-source software tool used for computational fluid dynamics (CFD) simulations. It allows engineers and scientists to model and analyze the flow of fluids, such as gases and liquids, through intricate geometries and physical phenomena. For example, such physical phenomena could be turbulence, heat transfer, and chemical reactions. OpenFOAM has a large and dedicated user base and is utilized in a variety of industries, including aerospace, automotive, chemical, energy, and marine engineering.
This post focuses on the following challenges that users of OpenFOAM may encounter:
Complexity: OpenFOAM is a highly flexible and powerful tool, but this can also make it difficult for new users to learn and navigate. The software has a large number of solvers and utilities, and it can be challenging to understand which solver is most suitable for a given problem.
Data management: OpenFOAM simulations generate a number of outputs that need to be stored, versioned, shared, and cleaned up when needed.
Interfacing with other software: OpenFOAM may need to be used in conjunction with other software, such as CAD or mesh generation tools, and there can be challenges in integrating these tools and transferring data between them.
Software version control: OpenFOAM and related simulation tools are very complex software packages that are updated constantly, so the versions used for each simulation need to be tracked.
All of the challenges above are harder for a small team of researchers who develop and run simulations. They may lack experience with DevOps and cloud infrastructure management. Therefore, a handy toolset is needed to help with pipelines and infrastructure setup.
With DVC, you can manage versions of simulation outputs and pipelines, and control the software versions used to execute the pipeline, ensuring consistent results. These features allow users to check that a new version of the software produces the same results as previous versions, helping to maintain the reliability and accuracy of the simulations. CML and Iterative Studio together provide a way to manage cloud resources, run new experiments from a UI, and show the parameters and results of the simulations.
We describe these and other features in the two following posts. In this post, we discuss how Iterative tools help with physical and computational simulations. To do this, we’ll go over a simple demo project built with OpenFOAM. The demo shows how to set up DVC for simulation experiments and data management.
These posts are a result of collaboration between the Iterative.ai and PlasmaSolve teams. PlasmaSolve was founded in 2016 by plasma physicists and software engineers to provide a platform for cutting-edge physics simulation services and research. The PlasmaSolve team strives to deliver top-notch solutions and well-designed physics simulations to speed up research and reduce development costs using various open-source and commercial simulation tools.
In this post, you will learn how to:
Configure and run OpenFOAM simulations with DVC
Store and share simulation data in the cloud using DVC
OpenFOAM simulations may include several computational steps, from mesh generation through one or more solvers to post-processing of the simulation results. sonicFoam is one of the solvers included in the open-source CFD (Computational Fluid Dynamics) software OpenFOAM. It is used to simulate transient, compressible gas flows at high Mach numbers, such as transonic and supersonic flows.
In this demo, we simulate a supersonic flow over a step located at the front of the flow. The scenario involves a Mach 3 flow entering a rectangular area with a step near the inlet, which creates shock waves. We use the same geometry to run two chained simulations: sonicFoam and scalarTransportFoam.
Shock fronts in the forward step problem (source)
Our demo simulation pipeline contains a few steps:
Generate geometry with blockMesh;
Run a sonicFoam simulation to get velocity (U) and temperature (T);
Post-process simulation results;
Run a subsequent scalarTransportFoam simulation that uses the velocity field computed before.
In reality, simulations sometimes need to be “chained”, i.e. outputs of one
simulation go as an input to another simulation. When running a parametric study
of such a simulation chain, intermediate simulations are often recomputed even
if the parameter change does not influence them. We demonstrate how to use DVC
to cache all the results and only trigger a computation if really necessary.
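The idea of "only trigger a computation if really necessary" can be sketched in a few lines of Python. This is an illustration of the principle, not DVC's implementation: DVC records its own hashes in a dvc.lock file, while the `fingerprint` and `needs_rerun` helpers, the SHA-256 choice, and the sample inputs below are assumptions made for this example.

```python
import hashlib
import json


def fingerprint(deps: dict[str, bytes], params: dict) -> str:
    """Hash the contents of a stage's dependencies together with its parameters."""
    digest = hashlib.sha256()
    for name in sorted(deps):
        digest.update(name.encode())
        digest.update(hashlib.sha256(deps[name]).digest())
    digest.update(json.dumps(params, sort_keys=True).encode())
    return digest.hexdigest()


def needs_rerun(stage: str, current: str, lock: dict[str, str]) -> bool:
    """A stage is re-run only if its fingerprint differs from the recorded one."""
    return lock.get(stage) != current


# Hypothetical inputs for the first simulation in the chain.
mesh = {"blockMeshDict": b"vertices (...);"}
lock = {"sonicFoam": fingerprint(mesh, {"endTime": 4})}

# Same inputs: the cached result can be reused.
unchanged = needs_rerun("sonicFoam", fingerprint(mesh, {"endTime": 4}), lock)
# Changed parameter: the stage has to be recomputed.
changed = needs_rerun("sonicFoam", fingerprint(mesh, {"endTime": 10}), lock)
```

In a chain, a downstream simulation whose inputs and parameters are unchanged keeps its fingerprint and is skipped, which is the behaviour we rely on in a parametric study.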
Results of the sonicFoam solver go as inputs to the scalarTransportFoam solver.
As a basis for the demo, we use OpenFOAM Supersonic flow over a forward-facing step tutorial. The original code can be found here.
💡 For this part of the post, we follow the no-dvc branch in the demo repository.
The easiest way to follow the demo is to run the OpenFOAM simulations in Docker containers. Follow the setup section in the README to build a Docker image, set up a Python virtual environment, and install dependencies.
After the environment is set up, we only need to run the openfoam-cse-docker script, which runs a new OpenFOAM job in a Docker container. For example, to run the OpenFOAM simulation in an interactive way, use the command:
To use sonicFoam, a user must first create a 3D geometry model of the flow domain using a tool such as CAD software. The user must then define the boundary conditions and physical properties of the flow, such as the temperature, pressure, and velocity at each boundary. The user can then run the simulation with the sonicFoam solver, which will solve the governing equations of compressible flow using the finite volume method.
$ ./openfoam-cse-docker -c 'cd sonicFoam && blockMesh'
Geometry of the forward step (source)
During the simulation,
sonicFoam will calculate various flow quantities, such
as the pressure, velocity, and temperature, at each point in the flow domain.
The user can then visualize and analyze these results using post-processing
tools, such as ParaView, to gain insight into the flow behavior.
$ ./openfoam-cse-docker -c 'cd sonicFoam && sonicFoam'
As an example of post-processing stages in the simulation demo, we have a few tasks:
calculate the magnitude of the velocity
generate VTK and visualize mesh
Calculate the magnitude of the velocity
postProcess is a command that allows users to perform post-processing operations on simulation data. The -func option specifies the function that should be applied to the data. In this case, it calculates the magnitude of the velocity field and writes it into a file named mag(U) in each time directory generated during the simulation:
$ ./openfoam-cse-docker -c 'cd sonicFoam && postProcess -func "mag(U)"'
The postProcess command can be used in conjunction with various options and functions to perform a wide range of post-processing tasks, such as calculating flow quantities, generating plots, and creating animations. It is an important tool for gaining insight into the results of CFD simulations.
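To make the mag(U) quantity concrete: for each cell, it is the Euclidean norm of the velocity vector, |U| = sqrt(Ux² + Uy² + Uz²). The short sketch below applies this formula to two made-up velocity vectors. The sample values are assumptions for illustration and are not taken from the demo results.

```python
import math


def magnitude(u: tuple[float, float, float]) -> float:
    """Euclidean norm of a velocity vector (Ux, Uy, Uz)."""
    return math.sqrt(sum(component * component for component in u))


# Two hypothetical cell velocities, in m/s.
velocity_field = [(3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
mag_u = [magnitude(u) for u in velocity_field]
```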
To produce a 1D dataset and its visualization, we compute the flow rate over the “outlet” patch. For this purpose, we apply the flowRatePatch(name=outlet) function to the simulation data. The flowRatePatch function calculates the flow rate through a patch, which is a specified boundary in the flow domain. The name input specifies the patch to use, in this case, outlet. The outlet patch represents the boundary at the outlet of the flow domain, so the flowRatePatch function will calculate the flow rate through the outlet.
$ ./openfoam-cse-docker -c 'cd sonicFoam && \
postProcess -func "flowRatePatch(name=outlet)"'
This operation saves results into the
foamToVTK is a utility that converts simulation data stored in the OpenFOAM format to the VTK (Visualization ToolKit) format.
VTK is a popular file format for storing and visualizing scientific data, and it
is often used for post-processing and visualization of CFD simulations.
$ ./openfoam-cse-docker -c 'cd sonicFoam && foamToVTK'
This will convert the simulation data stored in the sonicFoam directory from the OpenFOAM format to the VTK format, allowing it to be visualized and analyzed using tools that support the VTK format. It creates a VTK directory with the converted simulation results.
To visualize the results of a simulation performed using the OpenFOAM toolkit's sonicFoam solver, you can use one of the post-processing tools included with the OpenFOAM toolkit, such as foamToVTK. These tools allow you to view and analyze the simulation results in a graphical interface.
In the demo example, a 3D geometry mesh and float pressure diagram are generated. There are examples of generated files below.
scalarTransportFoam is a solver in the open-source CFD software OpenFOAM that is used to solve a transport equation for a passive scalar using a specified stationary velocity field. It is typically used to calculate the convection and diffusion of a scalar in a given velocity field.
To run the scalarTransportFoam solver, we need to update the stage configuration based on the sonicFoam results:
U config from the last simulation stage in sonicFoam
T config with the boundaryField from the last simulation stage in sonicFoam
polyMesh to use the same geometry
# Configure scalarTransportFoam
$ python3 src/config_scalarTransportFoam.py
# Run scalarTransportFoam simulation
$ ./openfoam-cse-docker -c 'cd scalarTransportFoam && scalarTransportFoam'
The simulation will calculate the transport of the passive scalar using the specified velocity field and other input parameters. The resulting simulation data can then be post-processed and analyzed to gain insight into the transport of the scalar in the flow.
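The configuration step above depends on locating the last time step written by sonicFoam. The sketch below shows one way this could be done. It is not the contents of src/config_scalarTransportFoam.py: the helper names are hypothetical, and the demonstration runs on a throwaway directory tree rather than a real case.

```python
import shutil
import tempfile
from pathlib import Path


def is_time_dir(path: Path) -> bool:
    """OpenFOAM writes results into directories named after the simulation time."""
    try:
        float(path.name)
    except ValueError:
        return False
    return path.is_dir()


def latest_time_dir(case_dir: Path) -> Path:
    """Return the time directory with the largest numeric name."""
    time_dirs = [p for p in case_dir.iterdir() if is_time_dir(p)]
    return max(time_dirs, key=lambda p: float(p.name))


def copy_velocity(source_case: Path, target_case: Path) -> Path:
    """Copy U from the last time step of one case into the 0 directory of another."""
    target = target_case / "0" / "U"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(latest_time_dir(source_case) / "U", target)
    return target


# Demonstration on a temporary directory tree instead of a real case.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for time in ("0", "2", "10"):
        step = root / "sonicFoam" / time
        step.mkdir(parents=True)
        (step / "U").write_text(f"velocity at t={time}")
    (root / "sonicFoam" / "system").mkdir()
    copied = copy_velocity(root / "sonicFoam", root / "scalarTransportFoam")
    copied_text = copied.read_text()
```

Comparing the directory names as numbers matters here: a plain string sort would rank "2" after "10" and pick the wrong time step.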
💡 For this part of the post, we follow the main branch in the demo repository.
Please follow the README to prepare your environment and install dependencies.
Up to this point, we have run the different tasks of the simulation pipeline using separate commands. Let’s see how DVC can help with automating the simulation pipeline and handling simulation output data.
DVC pipelines are a feature of the DVC (Data Version Control) tool. A DVC pipeline is a series of commands that are executed in a specific order and can be used to run all the steps that are needed: the simulation itself, post-processing of the results, and report generation. DVC automatically captures and tracks the data and code associated with your OpenFOAM simulations to make them reproducible and shareable with your team.
A DVC pipeline configuration file (dvc.yaml) is written in YAML format and consists of a list of steps (stages), each of which corresponds to a command that should be executed as part of the pipeline. The steps can depend on one another, meaning that the output from one step is used as input for another step. More details can be found on the DVC documentation website.
Let’s consider an example of the DVC pipeline configuration for the blockMesh stage:
- bash run.sh 'cd sonicFoam && blockMesh'
The cmd field specifies the command to be executed, which in this case is a utility shell script run.sh that changes the file permissions and runs the blockMesh command directly or using the openfoam-cse-docker script. The script “knows” how to run the simulation pipeline in your local environment (manually) or as a part of the GitLab CI pipeline in the cloud environment (automatically). We will discuss CI configuration in later sections.
The deps field in this pipeline step specifies the input files that the blockMesh command depends on, such as the blockMeshDict file. These files contain information about the mesh and the simulation parameters, and are required by the blockMesh command to generate the mesh.
The outs field specifies the output files generated by the blockMesh command. In this case, the output is the polyMesh directory, which contains the generated mesh data. The mesh data is captured and versioned by DVC.
The DVC parameters file (params.yaml) configures an OpenFOAM simulation. In the extract used by the demo, the params field of the controlDict section specifies the values of the simulation control parameters. In this case, the parameters are set to specific values.
In the DVC simulation setup, the user is responsible for putting the values from the params.yaml file into the controlDict. Unlike tools that handle this process automatically, this approach requires some manual effort on the user's end, but it provides greater flexibility because DVC does not need built-in support for each tool used in the simulation. The demo showcases how this task is carried out.
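As a rough idea of what "putting the values into the controlDict" can look like, here is a minimal sketch that rewrites `key value;` entries in a controlDict-style text. It is not the demo's script: the `apply_params` helper and the template contents are assumptions, and a real controlDict contains a header and more entries.

```python
import re


def apply_params(control_dict: str, params: dict) -> str:
    """Replace the value of each `key value;` entry named in params."""
    for key, value in params.items():
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]+)[^;]*;", re.MULTILINE
        )
        control_dict, count = pattern.subn(
            lambda match: f"{match.group(1)}{value};", control_dict
        )
        if count == 0:
            raise KeyError(f"{key} not found in controlDict")
    return control_dict


# Hypothetical controlDict fragment.
template = """application     sonicFoam;
endTime         10;
deltaT          0.002;
writeInterval   0.5;
"""

# In the demo, the values would come from params.yaml.
updated = apply_params(template, {"endTime": 4, "writeInterval": 1})
```

Raising an error for a missing key is a design choice of this sketch: it makes a typo in the parameters file fail loudly instead of silently running the simulation with old values.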
DVC pipeline configuration expects that all inputs and outputs of each stage are explicitly defined in the dvc.yaml file. This is a common pattern in Machine Learning and Data Management pipelines. DVC uses explicit dependencies and outputs to build a computational DAG and “understand” whether it needs to re-run a stage if some of its dependencies change. This ensures the reproducibility of the pipeline.
However, OpenFOAM simulation pipelines are different. Depending on the simulation parameters (e.g. writeInterval in the controlDict parameters), a different number of files and folders can be generated. Therefore, it may be impossible to specify all outputs in dvc.yaml in advance.
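A small calculation shows why the set of outputs is not fixed. The sketch assumes that results are written at regular intervals of simulated time starting from 0 (in OpenFOAM terms, writeControl set to runTime), which may not match every case. The `expected_write_times` helper is hypothetical.

```python
from decimal import Decimal


def expected_write_times(end_time: str, write_interval: str) -> list[str]:
    """List the time directory names expected for a run starting at t=0."""
    end, step = Decimal(end_time), Decimal(write_interval)
    count = int(end // step)
    # Format without trailing zeros, e.g. 1.0 -> "1".
    return [format((step * i).normalize(), "f") for i in range(1, count + 1)]


short_run = expected_write_times("2", "0.5")
long_run = expected_write_times("4", "0.5")
```

Doubling the end time doubles the number of time directories, so a static list of outputs would have to be edited every time a parameter changes.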
But because these files are not specified in dvc.yaml, DVC can’t manage them properly. To solve this problem, we introduced two helper scripts that “help” DVC to find and handle generated files and folders for the simulation use case. Supporting wildcard patterns in the dvc.yaml configuration file would simplify such use cases!
Let’s introduce two additional helper scripts:
dvc_outs_remove.py - removes the stage outputs from the previous simulation. This script checks if there are files previously added by the dvc_outs_handler.py script and removes them from DVC.
dvc_outs_handler.py - finds all “untracked” files and adds them to DVC control. By default, only files tracked by either Git or DVC are saved to the experiment. This script checks if there are files or directories generated by the stage and adds them to DVC.
# Remove previous sim results
- python3 src/dvc_outs_remove.py --stage=sonicFoam ...
# Run sim
- bash run.sh 'cd sonicFoam && sonicFoam'
# Add generated files to DVC and create outputs index files
- python3 src/dvc_outs_handler.py --stage=sonicFoam ...
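The core of the "find what the stage generated" step can be illustrated without DVC at all. The sketch below compares a snapshot of the case directory taken before the run with one taken after it. This is a swapped-in technique for illustration: the demo's scripts work with files that are untracked by Git and DVC, and the `snapshot` and `new_outputs` helpers are hypothetical.

```python
import tempfile
from pathlib import Path


def snapshot(case_dir: Path) -> set[str]:
    """Collect relative paths of all files currently present in the case."""
    return {
        path.relative_to(case_dir).as_posix()
        for path in case_dir.rglob("*")
        if path.is_file()
    }


def new_outputs(before: set[str], after: set[str]) -> list[str]:
    """Files that appeared between two snapshots, in a stable order."""
    return sorted(after - before)


with tempfile.TemporaryDirectory() as tmp:
    case = Path(tmp)
    (case / "system").mkdir()
    (case / "system" / "controlDict").write_text("endTime 1;")
    before = snapshot(case)

    # Stand-in for the solver run: it writes one directory per output time.
    for time in ("0.5", "1"):
        (case / time).mkdir()
        (case / time / "U").write_text("...")

    generated = new_outputs(before, snapshot(case))
```

The resulting list is what a handler script would then hand over to DVC and record in an index file, so that the matching cleanup script knows what to remove before the next run.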
It is common for OpenFOAM simulations to involve complex pipelines with multiple steps and dependencies between the steps. This is because simulations often require the use of multiple solvers, each of which may have its own input and output files and dependencies on other solvers.
For example, a simulation may require the use of multiple solvers to simulate different physical phenomena, such as fluid flow, heat transfer, and chemical reactions. These solvers may need to be run in a specific order and may depend on the output of other solvers as input.
It’s possible to manage these dependencies with DVC! DVC allows you to specify the steps in the simulation pipeline and the dependencies between them in a configuration file.
The demo project example has two solvers: sonicFoam and scalarTransportFoam.
Both solvers depend on the same geometry generated by the blockMesh stage. If we know the exact path to the output (outs) of the sonicFoam solver, we may explicitly define it as a dependency (deps) of the scalarTransportFoam stage. In our case, we use a utility script (src/config_scalarTransportFoam.py) to get the results of the sonicFoam solver and prepare the initial state for the scalarTransportFoam solver.
- python3 src/config_scalarTransportFoam.py
- bash run.sh 'cd scalarTransportFoam && scalarTransportFoam'
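The dependency structure described above forms a small directed acyclic graph. The following sketch uses Python's standard graphlib module to derive an execution order and to find which stages are affected by a change. The graph is a simplified assumption containing only three of the demo's stages, and the `affected_by` helper is hypothetical. DVC builds the real graph from dvc.yaml.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (simplified).
stages = {
    "blockMesh": set(),
    "sonicFoam": {"blockMesh"},
    "scalarTransportFoam": {"blockMesh", "sonicFoam"},
}

# A valid execution order: dependencies always come first.
order = list(TopologicalSorter(stages).static_order())


def affected_by(changed: str, graph: dict[str, set[str]]) -> set[str]:
    """Return the changed stage plus every stage downstream of it."""
    affected = {changed}
    for stage in TopologicalSorter(graph).static_order():
        if graph[stage] & affected:
            affected.add(stage)
    return affected


rerun_after_solver_change = affected_by("sonicFoam", stages)
rerun_after_mesh_change = affected_by("blockMesh", stages)
```

A change that only touches sonicFoam leaves the mesh untouched, so blockMesh does not need to run again, while a change to the mesh invalidates everything downstream.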
After the DVC pipeline is set up, you may run a new simulation experiment with a command:
$ dvc exp run
To run a new simulation with updated parameters, you may manually change the parameter value in the params.yaml file and run dvc exp run. It’s also possible to modify parameters on the fly with the -S (--set-param) option.
For example, let’s change the length of our simulation:
$ dvc exp run -S 'configureSim.controlDict.params.endTime=4'
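The dotted path in this command addresses a nested key in the parameters file. The sketch below mimics that lookup on a plain Python dictionary to show which value is being replaced. The `set_param` helper and the original endTime value of 10 are assumptions for illustration. This is not how DVC implements the option.

```python
import copy


def set_param(params: dict, dotted_path: str, value) -> dict:
    """Return a copy of params with the nested key at dotted_path replaced."""
    updated = copy.deepcopy(params)
    *parents, leaf = dotted_path.split(".")
    node = updated
    for key in parents:
        node = node[key]
    node[leaf] = value
    return updated


# Hypothetical structure mirroring the path used in the command above.
params = {"configureSim": {"controlDict": {"params": {"endTime": 10}}}}
overridden = set_param(params, "configureSim.controlDict.params.endTime", 4)
```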
It is also possible to queue and run multiple simulations in parallel.
In the next post, we will show how to visualize and compare simulation data with CML and Iterative Studio.
Effective data management is essential for successful OpenFOAM simulations. Proper data management can help you organize and track the data and code associated with your simulations, and make it easier to reproduce simulation results.
There are several challenges that users of OpenFOAM may encounter in managing the data associated with their simulations:
Large data volumes: OpenFOAM simulations can generate large amounts of data, particularly for complex or high-resolution simulations. This can make it difficult to store, transfer, and analyze the data effectively.
Data version control: It is important for users to be able to track changes to the input files and simulation results over time and to be able to reproduce past simulations. This can be challenging without a version control system or other means of tracking changes.
Data transfer: Users may need to transfer large amounts of data between different systems or devices, such as between their personal computers and a high-performance computing cluster. This can be challenging due to the size of the data and the potential for data transfer bottlenecks.
Collaboration: Users may want to share simulation results with colleagues or collaborate on simulations. This can be done by sharing the simulation input files and results, as well as using tools such as online collaborative platforms or version control systems.
Luckily, DVC may help with all of them. Let’s review the core features of DVC
that we used in the demo project.
Data versioning is a core feature of DVC that helps to capture the versions of simulation data in Git commits, while storing the data itself on-premises or in cloud storage. Moreover, with DVC pipelines, all outputs specified as outs in the dvc.yaml configuration are automatically added to DVC version control! Other files, generated by different stages, are added to DVC via the dvc_outs_handler.py script. The next step is to set up DVC remote storage and upload these files.
DVC helps to store large volumes of data in on-premises or cloud storage (e.g. SSH, S3, HDFS, etc.). The demo project uses AWS S3 as remote storage. For more details on the remote storage configuration, you may check Example: Customize an additional S3 remote.
You may point the remote at your own AWS S3 bucket using the following command:
$ dvc remote modify s3remote url s3://<bucket>/<path>
After the remote storage is set up, you need a single additional command to transfer your results to the storage:
$ dvc exp push
With this, DVC takes care of pushing and pulling to/from both Git and DVC remotes in the case of experiments. Therefore, collaboration with colleagues is simple. Your colleagues may access your last simulation results with the dvc exp pull command (after updating their local repository):
$ dvc exp pull
This post details how Iterative tools help in physical and computational simulations. The demo shows how to set up DVC for simulation experiments and data management.
Overall, DVC can help OpenFOAM users to:
Reduce the complexity of simulation pipelines and automate tasks such as running simulations, post-processing results, and generating reports.
Manage and track the data and code associated with your OpenFOAM simulations, and make it easier to reproduce simulation results.
Manage simulation experiments with YAML config files.
Store and share simulation data in the cloud using DVC and AWS S3.
Easily collaborate with your colleagues around simulation results, share and reuse data.
In the next post, we will discuss how to utilize cloud computing resources and visualize and compare simulation data with CML and Iterative Studio.