
Bulk Data Loading (Bulkload)

The script first declares two global macro variables, `PORT` and `HOST`, which hold the port and host required by the bulkload mechanism. It then deletes the destination tables ('mydblib.testblkld1' and 'mydblib.testblkld2') so that each run starts from a clean test environment. A source dataset ('work.testblkld') is created internally with a DATA step and inline data ('cards;'). Finally, the script performs two bulk loads: the first through a PROC SQL CREATE TABLE ... AS SELECT statement, the second through a DATA step with a SET statement. Both specify the port, host, and 'gpfdist' protocol; only the first also sets the 'CSV' format.
Data Analysis

Type : INTERNAL_CREATION


The data used for bulk loading is created directly in the script by a DATA step with the 'cards;' statement, which produces the 'work.testblkld' dataset.

1 Code Block
MACRO GLOBAL
Explanation :
This block declares two global macro variables, `PORT` and `HOST`, which hold the configurable parameters for the bulkload destination. Both are assigned empty values here, so the user must fill them in before running the script.
/* CREATE GLOBAL MACROS FOR BULKLOAD */

%GLOBAL PORT; /* Port for Hawq bulk loader */
%GLOBAL HOST; /* Client box for Hawq bulk load */

/* ASSIGN GLOBAL MACRO VALUES FOR BULKLOAD */

%let PORT =;
%let HOST =;
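Both macro variables are left empty above, and the 'mydblib' libref used by the later blocks is never assigned in this script. The following is a minimal sketch of that missing setup, assuming the SAS/ACCESS Interface to HAWQ engine (`hawq`) and its usual connection options. Every value shown (port numbers, host names, database, credentials) is a hypothetical placeholder, and the host is quoted on the assumption that `BL_HOST=` expects a quoted string.

```sas
/* Hypothetical values: replace with the settings of your own environment */
%let PORT = 8081;                    /* port for the bulk loader (placeholder) */
%let HOST = "sasclient.example.com"; /* client box for the bulk load (placeholder) */

/* Assumed connection details; mydblib is not assigned in the original script */
libname mydblib hawq
    server="hawqmaster.example.com"
    database=testdb
    port=5432
    user=myuser
    password="mypassword";
```

With these values in place, `&port` and `&host` resolve to usable option values in the bulk load blocks.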
2 Code Block
PROC DELETE
Explanation :
These two `PROC DELETE` steps clean the environment by deleting the 'testblkld1' and 'testblkld2' tables from the 'mydblib' library before the new loads run, so that each script execution starts from a known state. The 'mydblib' libref is not assigned in this script and must already point to the target database.
PROC DELETE DATA=mydblib.testblkld1;
RUN;

PROC DELETE DATA=mydblib.testblkld2;
RUN;
3 Code Block
DATA STEP Data
Explanation :
This DATA step creates a temporary dataset named 'work.testblkld'. The `INPUT` statement defines the variables `name`, `age`, `sex`, and `bdate` (read as a SAS date with the `mmddyy.` informat), and the rows are supplied inline after the `cards;` statement. This dataset serves as the source for the subsequent bulk loading operations.
DATA work.testblkld;
   INPUT name $ age sex $ bdate mmddyy.;
   CARDS;
amy 3 f 030185
bill 12 m 121277
charlie 35 m 010253
david 19 m 101469
elinor 42 f 080845
pearl 78 f 051222
vera 96 f 101200
frank 24 m 092663
georgia 1 f 040687
henry 46 m 053042
joann 27 f 020461
buddy 66 m 101432
;
RUN;
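Because `bdate` is read with the `mmddyy.` informat but has no format attached, it prints as a raw SAS date number. A quick check of the source rows before loading might look like the sketch below. The `date9.` format is an illustrative choice and not part of the original script. Note that the two-digit years in the input are resolved according to the session's `YEARCUTOFF=` option.

```sas
/* Display the source rows with bdate rendered as a readable date */
proc print data=work.testblkld;
    format bdate date9.;
run;
```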
4 Code Block
PROC SQL
Explanation :
This block uses `PROC SQL` to create a table named 'testblkld1' in the 'mydblib' library from the rows of 'work.testblkld'. The `BULKLOAD=YES` data set option enables bulk loading, and `BL_PORT`, `BL_HOST`, `BL_PROTOCOL` (set to 'gpfdist'), and `BL_FORMAT` (set to 'CSV') configure the data transfer. The port and host values come from the `PORT` and `HOST` macro variables.
PROC SQL;
   CREATE TABLE mydblib.testblkld1
      (BULKLOAD=YES
       BL_PORT=&port
       BL_HOST=&host
       BL_PROTOCOL="gpfdist"
       BL_FORMAT='CSV')
   AS SELECT * FROM work.testblkld;
QUIT;
5 Code Block
DATA STEP
Explanation :
This DATA step also performs a bulk load, this time to a new table 'testblkld2' in the 'mydblib' library. As in the previous example, the bulkload options (`BULKLOAD`, `BL_PORT`, `BL_HOST`, `BL_PROTOCOL`) are given as data set options on the target table. The difference is that `BL_FORMAT` is omitted, so the engine's default format applies. The data source is 'work.testblkld', read via the `set` statement.
DATA mydblib.testblkld2 (
   BULKLOAD=YES
   BL_PORT=&port
   BL_HOST=&host
   BL_PROTOCOL="gpfdist"
   );
   SET work.testblkld;
RUN;
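To confirm that both loads transferred every row, the row counts of the source and the two target tables can be compared. This is a minimal sketch rather than part of the original sample. It assumes that 'mydblib' is assigned and that both loads completed without errors.

```sas
/* Compare row counts between the source and the two bulk-loaded tables */
proc sql;
    select count(*) as source_rows     from work.testblkld;
    select count(*) as testblkld1_rows from mydblib.testblkld1;
    select count(*) as testblkld2_rows from mydblib.testblkld2;
quit;
```

The inline data contains 12 rows, so all three counts should match that number if the loads succeeded.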
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : SAS SAMPLE LIBRARY