ETL INTERNAL_CREATION

Creation of a large synthetic dataset

The script uses a DATA step to create the `myLib.biggerDataset` dataset. The iterative DO statement lists two specifications: a single missing value (`.`), then the range from -1,000,000 to 1,000,000 in steps of 1. The loop therefore runs 2,000,002 times: once with `i` missing, then once for each integer in the range. On each iteration, `j` is set to the character string that the PUT function returns when it applies the `fmtNum.` format to `i`. `fmtNum.` is not defined in this script, so it is presumably a user-defined format that must already exist in the session. `k` receives a pseudo-random value from `RANUNI(17)`, uniformly distributed between 0 and 1; the seed 17 is used only on the first call and makes the sequence reproducible. The explicit OUTPUT statement writes one observation per iteration.
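A DO statement with a comma-separated list of specifications is easier to follow on a small range. The sketch below is only illustrative: it swaps the original bounds for -2 to 2 and writes to the log instead of a dataset.

```sas
/* Illustrative only: same DO-list shape as the script, on a tiny range. */
data _null_;
    do i = ., -2 to 2;   /* first the single missing value, then -2, -1, 0, 1, 2 */
        put i=;          /* write the current value of i to the log */
    end;
run;
```

The log shows six lines, starting with `i=.` and continuing with `i=-2` through `i=2`, which confirms that the missing value is an extra iteration rather than a starting point for the range.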
Data Analysis

Type : INTERNAL_CREATION


The `myLib.biggerDataset` dataset is entirely created internally within the script via a DATA STEP. The values for variables `i`, `j`, and `k` are generated by a DO loop, the PUT function for format conversion, and the RANUNI function for random number generation.
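Although the data are generated internally, the step still relies on two things the script does not show: the `myLib` libref and the `fmtNum.` format. A minimal setup sketch follows. Both the library location and the format definition are assumptions made for illustration, not the original author's definitions.

```sas
/* Assumption: point myLib at the WORK directory; the original location is unknown. */
libname myLib "%sysfunc(pathname(work))";

/* Assumption: a hypothetical fmtNum. definition; the real one is not in the source. */
proc format;
    value fmtNum
        .          = 'missing'    /* numeric missing value */
        low -< 0   = 'negative'   /* everything below 0 (LOW excludes missing) */
        0          = 'zero'
        0 <- high  = 'positive';  /* everything above 0 */
run;
```

With a format like this, `put(i, fmtNum.)` returns a label for each value of `i`. Any other numeric format would work the same way; only the content of `j` would change.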

1 Code Block
DATA STEP Data
Explanation :
This DATA step creates the `myLib.biggerDataset` dataset. The `DO i = ., -1e6 to 1e6` statement first sets `i` to a missing value for one iteration, then steps `i` from -1,000,000 to 1,000,000 in increments of 1. `j` is the character result of applying the `fmtNum.` format to `i`, and `k` is a pseudo-random number between 0 and 1 from `RANUNI(17)`. `OUTPUT` writes one observation per iteration.
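DATA myLib.biggerDataset;
    DO i = ., -1e6 to 1e6;      /* one missing value, then -1,000,000 to 1,000,000 */
        j = put(i, fmtNum.);    /* character version of i; fmtNum. must already be defined */
        k = ranuni(17);         /* uniform pseudo-random number; seed 17 applies to the first call */
        OUTPUT;                 /* write one observation per iteration */
    END;
RUN;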
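Because the DO statement lists a missing value followed by the range -1e6 to 1e6, the step should write 2,000,002 observations. A quick check, sketched here with PROC SQL and assuming the dataset was created successfully, is:

```sas
/* Sanity check on the generated dataset; column aliases are illustrative. */
proc sql;
    select count(*)  as n_obs,        /* total observations */
           nmiss(i)  as n_missing_i,  /* observations where i is missing */
           min(i)    as min_i,        /* summary functions ignore missing values */
           max(i)    as max_i,
           min(k)    as min_k,
           max(k)    as max_k
    from myLib.biggerDataset;
quit;
```

The result should show `n_obs` of 2,000,002, `n_missing_i` of 1, `min_i` of -1,000,000 and `max_i` of 1,000,000, with `min_k` and `max_k` both strictly between 0 and 1.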
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.