Computational Fluid Dynamics Simulation Data of Spatial Deposition

Readme

File Size	6.62 KB
File Format	Plain text

Deposition - train data

File Size	5.8 GB
File Format	ZIP Format
Description	Matrix of shape (15000,1000,1000,3) corresponding to (training case, height, width, RGB), respectively.

Inputs - train data

File Size	127 KB
File Format	ZIP Format
Description	Matrix of shape (15000,4), where the rows correspond to the first 15,000 simulations and columns are s_x, s_y, w_u, and w_v (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y).

Deposition - test data

File Size	414 MB
File Format	ZIP Format
Description	Matrix of shape (1000,1000,1000,3) corresponding to (testing case, height, width, RGB), respectively.

Inputs - test data

File Size	11 KB
File Format	ZIP Format
Description	Matrix of shape (1000,4), where the rows correspond to the last 1,000 simulations and columns are s_x, s_y, w_u, and w_v (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y).

Collection

Lawrence Livermore National Laboratory (LLNL) Open Data Initiative

Cite This Work

Fernandez-Godino, M. Giselle; Lucas, Donald D.; Gunawardena, Nipun (2023). Computational Fluid Dynamics Simulation Data of Spatial Deposition. In Lawrence Livermore National Laboratory (LLNL) Open Data Initiative. UC San Diego Library Digital Collections. https://doi.org/10.6075/J0D50N50

Description

Data Set Description:
The dataset consists of two folders. The files used for training are stored inside the folder "train." The files used for testing are stored inside the folder "test."

There are 16,000 simulations in total, divided into 15,000 training cases and 1,000 test cases.

The file “inputs_15k_train.npy” contains a matrix of shape (15000,4), where the rows correspond to the first 15,000 simulations and columns are s_x, s_y, w_u, and w_v (source location in x, source location in y, wind velocity projection in x, and wind velocity projection in y). Similarly, the file “inputs_1k_test.npy” contains a matrix of shape (1000,4), where the rows correspond to the last 1,000 simulations, and columns are also s_x, s_y, w_u, and w_v.

The file “RGB_deposition_15k_train.npy” contains a matrix of shape (15000,1000,1000,3) corresponding to (training case, height, width, RGB), respectively. Similarly, the file “RGB_deposition_1k_test.npy” contains a matrix of shape (1000,1000,1000,3) corresponding to (test case, height, width, RGB), respectively.
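
The files described above can be opened with NumPy. The following is a minimal sketch, not part of the original dataset documentation. It assumes the archives have been extracted into local "train" and "test" folders (the paths in the usage comment are hypothetical), and the helper name load_split is introduced here only for illustration. Because the training image array holds 15000×1000×1000×3 values, the sketch memory-maps the image file instead of reading it all at once.

```python
import numpy as np


def load_split(inputs_path, images_path, n_cases):
    """Open one split and check it against the documented layout.

    mmap_mode="r" maps the image file read-only, so pixel data is only
    read from disk when slices of the array are accessed.
    """
    inputs = np.load(inputs_path)                  # columns: s_x, s_y, w_u, w_v
    images = np.load(images_path, mmap_mode="r")   # (case, height, width, RGB)
    if inputs.shape != (n_cases, 4):
        raise ValueError(f"unexpected inputs shape {inputs.shape}")
    if images.ndim != 4 or images.shape[0] != n_cases or images.shape[-1] != 3:
        raise ValueError(f"unexpected image shape {images.shape}")
    return inputs, images


# Usage, assuming the archives were extracted into ./train and ./test
# (hypothetical local paths):
# x_train, y_train = load_split("train/inputs_15k_train.npy",
#                               "train/RGB_deposition_15k_train.npy", 15000)
# x_test, y_test = load_split("test/inputs_1k_test.npy",
#                             "test/RGB_deposition_1k_test.npy", 1000)
```

Row i of the inputs array and image i of the deposition array describe the same simulation, so the two arrays should always be indexed together.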

Purpose:
Based on simulations of the atmospheric transport and dispersion of a passive tracer using a computational fluid dynamics (CFD) model, we use autoencoder-based models to learn complex plume spatial patterns (a megapixel image) from four scalars (s_x,s_y,w_u,w_v). In other words, the goal is to predict a deposition image given its associated release location and wind velocity (four scalar quantities). We are interested in the mapping: [s_x,s_y,w_u,w_v]→[height×width×RGB channel]. The publication associated with the data set can be found in [1].
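
To make the mapping concrete, the sketch below pairs each four-scalar input with its image in small batches, as a training loop for any image-prediction model might consume them. It is an illustration only and does not reproduce the preprocessing used in [1]. The function name iter_batches, the batch size, and the scaling of channels to [0, 1] are assumptions made here.

```python
import numpy as np


def iter_batches(inputs, images, batch_size=8):
    """Yield (x, y) pairs: x is (b, 4) and y is (b, height, width, 3) in [0, 1]."""
    if len(inputs) != len(images):
        raise ValueError("inputs and images must have the same number of cases")
    for start in range(0, len(inputs), batch_size):
        stop = start + batch_size
        x = np.asarray(inputs[start:stop], dtype=np.float32)
        # Assumption: channels are stored as integers in 0..255, as described
        # for the RGB images; dividing by 255 rescales them to [0, 1].
        y = np.asarray(images[start:stop], dtype=np.float32) / 255.0
        yield x, y
```

Slicing before converting to float32 keeps memory use proportional to the batch size, which matters when the image array is memory-mapped.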

References:
[1] Fernández-Godino, M. G., Lucas, D. D., & Kong, Q. Predicting wind-driven spatial deposition through simulated color images using deep autoencoders. Scientific Reports, 2023, 13(1), 1394, https://doi.org/10.1038/s41598-023-28590-4.

[2] Gowardhan, A., D. McGuffin, D. D. Lucas, S. Neuscamman, O. Alvarez, and L. Glascoe, Large Eddy Simulations of Turbulent and Buoyant Flows in Urban and Complex Terrain Areas Using the Aeolus Model, Atmosphere 2021, 12(9), 1107, https://doi.org/10.3390/atmos12091107.

Creation Date

2020-11-15

Date Issued

2023

Principal Investigator

Advisor

Contributor

Methods

Relevant Information:
This dataset’s physics problem is a two-dimensional spatial pattern formed by a pollutant that has been released into the atmosphere and dispersed for up to an hour while undergoing deposition to the surface. The pollutant’s release location (s_x,s_y) can lie anywhere in a two-dimensional domain of 5000 m × 5000 m. The release is initialized from a small bubble that is centered five meters above the surface, has a radius of five meters, and has internal momentum that causes it to expand radially and rise to a height of about 100 meters within the initial minute of simulation time. As a simplification, the same bubble source was used for all the simulations, so only the (s_x,s_y) coordinates of the bubble source are relevant. All the realizations used unit mass releases, and the resulting deposition patterns can be scaled proportionately for other mass amounts. The simulated data represent the cumulative mass deposited on the surface over one hour.

The pollutant is blown in a direction controlled by the large-scale atmospheric inflow winds, expressed as wind speed (w_s), which varies from 0.5 to 15 m/s, and wind direction (w_d), which can be anywhere in the interval [0,360) degrees following standard mathematical convention. The files “inputs_15k_train.npy” and “inputs_1k_test.npy”, however, include w_u = w_s cos(w_d) and w_v = w_s sin(w_d), the wind velocity components projected onto the x and y axes. We assume that the spatial patterns were collected by a hypothetical imaging device that records the magnitude of the logarithm of deposition as a red, green, and blue (RGB) color image with channels containing integer values ranging from 0 to 255. The goal is to predict a deposition image given its associated release location and wind velocity (four scalar quantities). In other words, we are interested in the following mapping: [s_x,s_y,w_u,w_v]→[height×width×RGB channel]. See [1].
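
The relation between wind speed and direction and the stored components can be written as a short conversion in both directions. This is a sketch under the stated convention only: w_d is taken in degrees, measured counterclockwise from the positive x axis, and the function names are introduced here for illustration.

```python
import numpy as np


def wind_to_components(w_s, w_d_deg):
    """Convert speed (m/s) and direction (degrees) to (w_u, w_v)."""
    theta = np.deg2rad(w_d_deg)
    return w_s * np.cos(theta), w_s * np.sin(theta)


def components_to_wind(w_u, w_v):
    """Recover speed and direction (degrees, wrapped by modulo 360) from (w_u, w_v)."""
    w_s = np.hypot(w_u, w_v)
    w_d_deg = np.rad2deg(np.arctan2(w_v, w_u)) % 360.0
    return w_s, w_d_deg
```

Both functions also accept NumPy arrays, so the w_u and w_v columns of an inputs array can be converted back to speed and direction in one call.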

The data were obtained from simulations and later post-processed to make them suitable for machine learning training. Given large-scale winds as an inflow boundary condition, the CFD code Aeolus [2] uses millions of grid cells to simulate fluid flow and material transport in complex, three-dimensional environments at high resolution, accounting for turbulence from structures, terrain features, and obstacles and predicting deposition on the ground and other surfaces. Megapixel deposition images were obtained by processing the output of Aeolus simulations, which were run using a resolution of (x,y,z)=1000×1000×100 cells, each cell representing 5 m × 5 m × 5 m. Within Aeolus, pollutant concentration and deposition values are calculated by releasing and transporting Lagrangian particles of specified masses and sizes within the flow field. Particles that intersect the ground or other surfaces through turbulence or gravitational settling are removed from the atmosphere and recorded as deposition having units of mass per area. The releases were modeled as small, rising bubbles of mass carried by the winds about a minute into the simulations. Note that the actual deposition values are not given in this dataset. The entire dataset, created by running Aeolus multiple times, contains 16,000 deposition images. The images are stored as [number of images, height, width, RGB channels] = [16,000, 1000, 1000, 3], split across the training and test files. Each megapixel image shows the spatial deposition pattern of a unique release scenario in Aeolus, with the source location and inflow wind, [s_x,s_y,w_u,w_v], varied using Latin hypercube sampling in the design of experiments. The data can potentially be augmented for different wind directions by rotating the spatial plume pattern to predict deposition patterns. This augmentation is not always possible in practice due to terrain-based asymmetries in transport and dispersion.
The Python rainbow colormap is used to create the RGB images for training and testing the autoencoder. As previously noted, RGB pixel colors are associated with the logarithm of the deposition values.
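
For readers unfamiliar with the Latin hypercube sampling mentioned above, the sketch below shows the basic technique with NumPy only. It is not the sampler used to build this dataset: the record does not say which implementation or which sampled variables were used. The choice to sample (s_x, s_y, w_s, w_d) over the ranges quoted in the text, the function name, and the seed are all assumptions made for illustration. SciPy's scipy.stats.qmc.LatinHypercube is a library alternative.

```python
import numpy as np


def latin_hypercube(n_samples, bounds, seed=0):
    """Return an (n_samples, n_dims) Latin hypercube design within bounds.

    Each dimension is split into n_samples equal strata, and every stratum
    is used exactly once, at a uniformly random position inside it.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)        # shape (n_dims, 2)
    n_dims = bounds.shape[0]
    # Row i of every column starts in stratum i, with random jitter inside it.
    unit = (np.arange(n_samples)[:, None]
            + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle each column independently so strata are paired at random.
    for j in range(n_dims):
        unit[:, j] = rng.permutation(unit[:, j])
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])


# Assumed variables and ranges, taken from the text above:
# s_x and s_y in [0, 5000] m, w_s in [0.5, 15] m/s, w_d in [0, 360) degrees.
BOUNDS = [(0.0, 5000.0), (0.0, 5000.0), (0.5, 15.0), (0.0, 360.0)]
design = latin_hypercube(16, BOUNDS)   # 16 cases here; the dataset has 16,000
```

Compared with independent uniform draws, this stratification guarantees that each input variable is covered evenly across its whole range, whatever the number of cases.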

Technical Details

Attribute Information:
Each *.npy file contains an array. The train and test input file arrays have a shape of (15000,4) and (1000,4), respectively. The train and test RGB file (output) arrays have a shape of (15000,1000,1000,3) and (1000,1000,1000,3), respectively.
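
The array shapes above imply uncompressed sizes much larger than the ZIP downloads. The arithmetic below is a sketch that assumes one byte per channel value (dtype uint8), which is consistent with integer channel values from 0 to 255 but is not stated explicitly in this record. The helper name array_nbytes is introduced here for illustration.

```python
import numpy as np


def array_nbytes(shape, dtype=np.uint8):
    """Uncompressed size in bytes of an array with this shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


train_bytes = array_nbytes((15000, 1000, 1000, 3))   # 45,000,000,000 bytes
test_bytes = array_nbytes((1000, 1000, 1000, 3))     # 3,000,000,000 bytes
image_bytes = array_nbytes((1000, 1000, 3))          # 3,000,000 bytes per image
```

Under that assumption the training images occupy about 45 GB once extracted, so reading them in slices or through a memory map is usually more practical than loading the whole array.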

Software used:
Python, version 3.9.15
Numpy, version 1.23.4

Funding

This research was funded by the National Nuclear Security Administration, Defense Nuclear Nonproliferation Research and Development (NNSA DNN R&D), was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and is released under tracking number LLNL-MI-84834.

Topics

Format

Language

English

Identifier

Identifier: M. Giselle Fernandez-Godino: https://orcid.org/0000-0002-3837-8661

Related Resources

Primary associated publication

Fernández-Godino, M.G., Lucas, D.D. & Kong, Q. (2023). Predicting wind-driven spatial deposition through simulated color images using deep autoencoders. Sci Rep 13, 1394. https://doi.org/10.1038/s41598-023-28590-4

Reference

Gowardhan, A., D. McGuffin, D. D. Lucas, S. Neuscamman, O. Alvarez, and L. Glascoe (2021). Large Eddy Simulations of Turbulent and Buoyant Flows in Urban and Complex Terrain Areas Using the Aeolus Model, Atmosphere, 12(9), 1107. https://doi.org/10.3390/atmos12091107

License

Creative Commons Attribution 4.0 International Public License

Rights Holder

Lawrence Livermore National Laboratory

Copyright

Under copyright (US)

Use: This work is available from the UC San Diego Library. This digital copy of the work is intended to support research, teaching, and private study.

Constraint(s) on Use: This work is protected by the U.S. Copyright Law (Title 17, U.S.C.). Use of this work beyond that allowed by "fair use" or any license applied to this work requires written permission of the copyright holder(s). Responsibility for obtaining permissions and any use and distribution of this work rests exclusively with the user and not the UC San Diego Library. Inquiries can be made to the UC San Diego Library program having custody of the work.

Digital Object Made Available By

Research Data Curation Program, UC San Diego, La Jolla, 92093-0175 (https://lib.ucsd.edu/rdcp)

Last Modified

2023-06-01