Attention : This code requires administrator privileges.
The script begins by establishing a LIBNAME connection to a Hadoop server. It then copies a SASHELP table to Hadoop. The PROC RANK procedure is executed twice: once leveraging in-database processing (default behavior), and once explicitly disabling this feature for Hadoop to show local processing. Finally, the created tables are cleaned up.
Data Analysis
Type : MIXTE
Initial data is taken from the SASHELP library (`sashelp.class`), then copied to an external system (Hadoop) via the 'mycdh' LIBNAME. A temporary table (`work.class_rank`) is also created internally by SAS.
1 Code Block
LIBNAME / DATA STEP Data
Explanation : This block establishes a connection to a Hadoop server by creating the 'mycdh' LIBNAME with the specified connection information. Then, a DATA step is used to copy the 'sashelp.class' table to this new Hadoop library, making the 'mycdh.class' table available in the distributed file system.
Copied!
libname mycdh hadoop server='quickstart.cloudera' user=cloudera password=cloudera schema='default';
data mycdh.class;
set sashelp.class;
run;
Explanation : Enables SAS tracing options ('sastrace') to display potential SQL code generation. The 'PROC RANK' procedure is then executed on the 'mycdh.class' table (residing in Hadoop). By default, SAS will attempt to execute this procedure directly in-database for performance optimization, with ranking performed by descending 'weight'.
Explanation : Modifies SAS options to explicitly disable in-database SQL generation for Hadoop ('EXCLUDEDB='HADOOP''). The same 'PROC RANK' procedure is executed, but this time, SAS will retrieve data from Hadoop locally to perform the ranking processing on the SAS server.
Explanation : Displays the content of the 'work.class_rank' table generated by the previous RANK procedures. This allows verification of the ranking operation results.
Copied!
proc print data=work.class_rank;
run;
1
PROC PRINTDATA=work.class_rank;
2
RUN;
5 Code Block
PROC SQL
Explanation : Uses 'PROC SQL' to drop the temporary tables 'work.class_rank' and 'mycdh.class' created during script execution, thus ensuring a clean environment.
Copied!
proc sql;
drop table work.class_rank;
drop table mycdh.class;
quit;
1
PROC SQL;
2
drop TABLE work.class_rank;
3
drop TABLE mycdh.class;
4
QUIT;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Exercise 06 - SAS3880 / An Insider's Guide to SAS/ACCESS to Hadoop / In-Database PROC Example
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.
This site uses technical and analytical cookies to improve your experience.
Read more.