Computational Fluid Dynamics Simulation Data of Spatial Deposition
Deposition - train data
Description | Matrix of shape (15000,1000,1000,3) corresponding to (training case, height, width, RGB), respectively. |
Inputs - train data
Description | Matrix of shape (15000,4), where the rows correspond to the first 15,000 simulations and columns are sx, sy, wu, and wv (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y). |
Deposition - test data
Description | Matrix of shape (1000,1000,1000,3) corresponding to (testing case, height, width, RGB), respectively. |
Inputs - test data
Description | Matrix of shape (1000,4), where the rows correspond to the last 1,000 simulations and columns are sx, sy, wu, and wv (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y). |
- Collection
- Cite This Work
-
Fernandez-Godino, M. Giselle; Lucas, Donald D.; Gunawardena, Nipun (2023). Computational Fluid Dynamics Simulation Data of Spatial Deposition. In Lawrence Livermore National Laboratory (LLNL) Open Data Initiative. UC San Diego Library Digital Collections. https://doi.org/10.6075/J0D50N50
- Description
-
Data Set Description:
The dataset consists of two folders. The files used for training are stored inside the folder "train." The files used for testing are stored inside the folder "test."
There are 16,000 simulations in total, divided into 15,000 training cases and 1,000 test cases.
The file “inputs_15k_train.npy” contains a matrix of shape (15000,4), where the rows correspond to the first 15,000 simulations and columns are sx, sy, wu, and wv (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y). Similarly, the file “inputs_1k_test.npy” contains a matrix of shape (1000,4), where the rows correspond to the last 1,000 simulations, and columns are also sx, sy, wu, and wv.
The file “RGB_deposition_15k_train.npy” contains a matrix of shape (15000,1000,1000,3) corresponding to (training case, height, width, RGB), respectively. Similarly, the file “RGB_deposition_1k_test.npy” contains a matrix of shape (1000,1000,1000,3) corresponding to (test case, height, width, RGB), respectively.
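The file layout described above can be opened as in the following minimal sketch. It assumes the image arrays are best memory-mapped rather than read fully into RAM; the helper name `open_split` and the tiny demo files are hypothetical and only let the sketch run without the real data.

```python
import os
import tempfile

import numpy as np


def open_split(folder, inputs_name, images_name):
    """Open one split; the image array is memory-mapped, not read into RAM."""
    inputs = np.load(os.path.join(folder, inputs_name))
    images = np.load(os.path.join(folder, images_name), mmap_mode="r")
    if inputs.ndim != 2 or inputs.shape[1] != 4:
        raise ValueError(f"expected inputs of shape (N, 4), got {inputs.shape}")
    if images.ndim != 4 or images.shape[-1] != 3:
        raise ValueError(f"expected images of shape (N, H, W, 3), got {images.shape}")
    if images.shape[0] != inputs.shape[0]:
        raise ValueError("inputs and images disagree on the number of cases")
    return inputs, images


# Demo on tiny synthetic stand-ins (hypothetical file names, 5 cases of 8x8 pixels).
demo_dir = tempfile.mkdtemp()
np.save(os.path.join(demo_dir, "inputs_demo.npy"), np.zeros((5, 4)))
np.save(os.path.join(demo_dir, "RGB_demo.npy"), np.zeros((5, 8, 8, 3), dtype=np.uint8))
demo_inputs, demo_images = open_split(demo_dir, "inputs_demo.npy", "RGB_demo.npy")
first_image = np.asarray(demo_images[0])  # only this slice is read from disk
```

For the real data, the equivalent call would be `open_split("train", "inputs_15k_train.npy", "RGB_deposition_15k_train.npy")`, assuming the script runs from the directory that contains the "train" folder.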
Purpose:
Based on simulations of the atmospheric transport and dispersion of a passive tracer using a computational fluid dynamics (CFD) model, we use autoencoder-based models to learn complex plume spatial patterns (a megapixel image) from four scalars (sx,sy,wu,wv). In other words, the goal is to predict a deposition image given its associated release location and wind velocity (four scalar quantities). We are interested in the mapping: [sx,sy,wu,wv]→[height×width×RGB channel]. The publication associated with the data set can be found in [1].
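As a shape-level illustration of the mapping [sx,sy,wu,wv]→[height×width×RGB channel], the sketch below pairs input rows with their images in batches. It is not the preprocessing used in [1]; the function name `iter_batches` and the rescaling of channels to [0, 1] are assumptions made for illustration, and the demo uses small synthetic arrays in place of the megapixel images.

```python
import numpy as np


def iter_batches(inputs, images, batch_size):
    """Yield (x, y) pairs: x has shape (B, 4), y has shape (B, H, W, 3) in [0, 1]."""
    n_cases = inputs.shape[0]
    for start in range(0, n_cases, batch_size):
        stop = min(start + batch_size, n_cases)
        x = np.asarray(inputs[start:stop], dtype=np.float32)
        # Channels hold integers in 0..255 per the description; rescale to [0, 1].
        y = np.asarray(images[start:stop], dtype=np.float32) / 255.0
        yield x, y


# Tiny synthetic stand-in (5 cases, 8x8 pixels) instead of the megapixel data.
rng = np.random.default_rng(0)
demo_inputs = rng.uniform(size=(5, 4))
demo_images = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)
batches = list(iter_batches(demo_inputs, demo_images, batch_size=2))
```

Slicing per batch keeps memory use bounded when `images` is a memory-mapped array, since only the requested cases are converted to floating point.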
References:
[1] Fernández-Godino, M. G., D. D. Lucas, and Q. Kong, Predicting wind-driven spatial deposition through simulated color images using deep autoencoders, Scientific Reports 2023, 13(1), 1394, https://doi.org/10.1038/s41598-023-28590-4.
[2] Gowardhan, A., D. McGuffin, D. D. Lucas, S. Neuscamman, O. Alvarez, and L. Glascoe, Large Eddy Simulations of Turbulent and Buoyant Flows in Urban and Complex Terrain Areas Using the Aeolus Model, Atmosphere 2021, 12(9), 1107, https://doi.org/10.3390/atmos12091107.
- Creation Date
- 2020-11-15
- Date Issued
- 2023
- Principal Investigator
- Advisor
- Contributor
- Methods
-
Relevant Information:
This dataset’s physics problem is a two-dimensional spatial pattern formed by a pollutant that has been released into the atmosphere and dispersed for up to an hour while undergoing deposition to the surface. The pollutant’s release location (sx,sy) can be anywhere in a two-dimensional domain of 5000 m × 5000 m. The release is initialized from a small bubble that is centered five meters above the surface, has a radius of five meters, and has internal momentum that causes it to expand radially and rise to a height of about 100 meters within the first minute of simulation time. As a simplification, the same bubble source was used for all the simulations, so only the (sx,sy) coordinates of the bubble source are relevant. All the realizations used unit mass releases, and the resulting deposition patterns can be scaled proportionately for other mass amounts. The simulated data represent the cumulative mass deposited on the surface over one hour.
The pollutant is blown in a direction controlled by the large-scale atmospheric inflow winds, expressed as wind speed (ws), which varies from 0.5 to 15 m/s, and wind direction (wd), which can be anywhere in the interval [0,360) degrees following standard mathematical convention. The files “inputs_15k_train.npy” and “inputs_1k_test.npy”, however, store wu = ws cos(wd) and wv = ws sin(wd), the wind velocity components projected onto the x and y axes.
We assume that the spatial patterns were collected by a hypothetical imaging device that records the magnitude of the logarithm of deposition as a red, green, and blue (RGB) color image with channels containing integer values ranging from 0 to 255. The goal is to predict a deposition image given its associated release location and wind velocity (four scalar quantities). In other words, we are interested in the following mapping: [sx,sy,wu,wv]→[height×width×RGB channel]. See [1].
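The relation between wind speed and direction (ws, wd) and the stored components (wu, wv) can be sketched as follows. The function names are hypothetical, and the sketch assumes that wd is given in degrees and measured counterclockwise from the positive x axis, which is how "standard mathematical convention" is read here.

```python
import numpy as np


def wind_to_components(ws, wd_deg):
    """Project wind speed ws at angle wd_deg (degrees) onto the x and y axes."""
    wd = np.deg2rad(wd_deg)
    return ws * np.cos(wd), ws * np.sin(wd)


def components_to_wind(wu, wv):
    """Recover wind speed and direction in [0, 360) degrees from (wu, wv)."""
    ws = np.hypot(wu, wv)
    wd_deg = np.mod(np.rad2deg(np.arctan2(wv, wu)), 360.0)
    return ws, wd_deg


# Round trip for a 10 m/s wind at 90 degrees (blowing along +y).
wu, wv = wind_to_components(10.0, 90.0)
ws, wd = components_to_wind(wu, wv)
```

The inverse is useful for checking that the rows of the input files fall inside the stated ranges for ws and wd. Directions very close to 0 degrees may come back as values just below 360 because of floating-point rounding.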
The data were obtained from simulations and later post-processed to make them suitable for machine learning training. Given large-scale winds as an inflow boundary condition, the CFD code Aeolus [2] uses millions of grid cells to simulate fluid flow and material transport in complex, three-dimensional environments at high resolution, accounting for turbulence from structures, terrain features, and obstacles and predicting deposition on the ground and other surfaces. Megapixel deposition images were obtained by processing the output of Aeolus simulations, which were run using a resolution of (x,y,z)=1000×1000×100 cells, each cell representing 5 m × 5 m × 5 m. Within Aeolus, pollutant concentration and deposition values are calculated by releasing and transporting Lagrangian particles of specified masses and sizes within the flow field. Particles that intersect the ground or other surfaces through turbulence or gravitational settling are removed from the atmosphere and recorded as deposition having units of mass per area. The releases were modeled as small, rising bubbles of mass carried by the winds about a minute into the simulations. Note that the actual deposition values are not given in this dataset.
The entire dataset, created by running Aeolus multiple times, contains 16,000 deposition images. The images are stored as [number of images, height, width, RGB channels] = [16,000, 1000, 1000, 3]. Each megapixel image shows the spatial deposition pattern of a unique release scenario in Aeolus. The scenarios differ in source location and inflow wind, [sx,sy,wu,wv], which were chosen by Latin hypercube sampling in the design of experiments. The data can potentially be augmented for different wind directions by rotating the spatial plume pattern to predict deposition patterns. This augmentation is not always possible in practice due to terrain-based asymmetries in transport and dispersion.
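For readers unfamiliar with Latin hypercube sampling, the following minimal NumPy sketch shows the idea. It is not the authors' design-of-experiments code. The helper names are hypothetical, and sampling in (sx, sy, ws, wd) over the ranges quoted in this record is an assumption, since the record does not state which variables were sampled directly.

```python
import numpy as np


def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) Latin hypercube sample in [0, 1)."""
    # One point per stratum: stratum index plus a uniform jitter inside it.
    jitter = rng.uniform(size=(n_samples, n_dims))
    strata = np.arange(n_samples)[:, None]
    unit = (strata + jitter) / n_samples
    # Shuffle each column independently so strata are paired at random.
    for d in range(n_dims):
        unit[:, d] = rng.permutation(unit[:, d])
    return unit


def scale(unit, lows, highs):
    """Map unit-cube samples onto per-dimension [low, high) ranges."""
    lows = np.asarray(lows, dtype=float)
    highs = np.asarray(highs, dtype=float)
    return lows + unit * (highs - lows)


rng = np.random.default_rng(0)
unit = latin_hypercube(16, 4, rng)
# Assumed columns: sx [m], sy [m], ws [m/s], wd [deg].
design = scale(unit, [0.0, 0.0, 0.5, 0.0], [5000.0, 5000.0, 15.0, 360.0])
```

Each column of `unit` contains exactly one point in each of the 16 equal-width strata, which is the property that distinguishes a Latin hypercube from plain uniform sampling.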
The Python rainbow colormap is used to create the RGB images for training and testing the autoencoder. As previously noted, RGB pixel colors are associated with the logarithm of the deposition values.
- Technical Details
-
Attribute Information:
Each *.npy file contains an array. The train and test input file arrays have a shape of (15000,4) and (1000,4), respectively. The train and test RGB file (output) arrays have a shape of (15000,1000,1000,3) and (1000,1000,1000,3), respectively.
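The array shapes above imply substantial storage requirements, which the short calculation below estimates. It assumes one byte per channel (`uint8`) for the images, which is consistent with integer values from 0 to 255 but is not stated in this record, and 8-byte floats for the inputs, which is also an assumption. The helper name `array_nbytes` is hypothetical.

```python
import math

import numpy as np


def array_nbytes(shape, dtype=np.uint8):
    """Bytes needed for a dense array of the given shape and dtype."""
    return math.prod(shape) * np.dtype(dtype).itemsize


train_bytes = array_nbytes((15000, 1000, 1000, 3))
test_bytes = array_nbytes((1000, 1000, 1000, 3))
inputs_bytes = array_nbytes((15000, 4), np.float64)

print(f"train images: {train_bytes / 1e9:.1f} GB")
print(f"test images:  {test_bytes / 1e9:.1f} GB")
print(f"train inputs: {inputs_bytes / 1e3:.1f} kB")
```

Under these assumptions the training images occupy about 45 GB and the test images about 3 GB, so reading them lazily (for example with the `mmap_mode` argument of `numpy.load`) is likely to be more practical than loading them whole.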
Software used:
Python, version 3.9.15
NumPy, version 1.23.4
- Funding
-
This research was funded by the National Nuclear Security Administration, Defense Nuclear Nonproliferation Research and Development (NNSA DNN R&D), was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and is released under tracking number LLNL-MI-84834.
- Topics
- Language
- English
- Identifier
-
Identifier: M. Giselle Fernandez-Godino: https://orcid.org/0000-0002-3837-8661
- Related Resources
- Primary associated publication: Fernández-Godino, M.G., Lucas, D.D. & Kong, Q. (2023). Predicting wind-driven spatial deposition through simulated color images using deep autoencoders. Sci Rep 13, 1394. https://doi.org/10.1038/s41598-023-28590-4
- Reference: Gowardhan, A., D. McGuffin, D. D. Lucas, S. Neuscamman, O. Alvarez, and L. Glascoe (2021). Large Eddy Simulations of Turbulent and Buoyant Flows in Urban and Complex Terrain Areas Using the Aeolus Model, Atmosphere, 12(9), 1107. https://doi.org/10.3390/atmos12091107
- License
-
Creative Commons Attribution 4.0 International Public License
- Rights Holder
- Lawrence Livermore National Laboratory
- Copyright
-
Under copyright (US)
Use: This work is available from the UC San Diego Library. This digital copy of the work is intended to support research, teaching, and private study.
Constraint(s) on Use: This work is protected by the U.S. Copyright Law (Title 17, U.S.C.). Use of this work beyond that allowed by "fair use" or any license applied to this work requires written permission of the copyright holder(s). Responsibility for obtaining permissions and any use and distribution of this work rests exclusively with the user and not the UC San Diego Library. Inquiries can be made to the UC San Diego Library program having custody of the work.
- Digital Object Made Available By
-
Research Data Curation Program, UC San Diego, La Jolla, 92093-0175 (https://lib.ucsd.edu/rdcp)
- Last Modified
2023-06-01