SAS and Impala: How to Define a Request Pool via an ODBC Connection

Simon 26/05/2024

When working with SAS® to interact with Big Data environments such as Apache Impala on Hadoop, resource management is crucial. A common requirement is to assign a session to a specific "Request Pool" so that queries are prioritized and managed correctly by the cluster.

Although SQL Pass-Through lets you execute configuration commands via an EXECUTE statement, this method does not offer the flexibility of a SAS® library (LIBNAME).
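
For comparison, the Pass-Through approach looks like this (a minimal sketch; the DSN name MYIMPALA matches the example further down, and the pool name MY_POOL is a placeholder):

proc sql;
   /* Open an explicit pass-through connection to Impala via ODBC */
   connect to odbc (noprompt="dsn=MYIMPALA;");
   /* Send the session command manually */
   execute (set request_pool='MY_POOL') by odbc;
   /* ... pass-through queries would go here ... */
   disconnect from odbc;
quit;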

Here's how to directly configure a request pool when declaring a library via ODBC.

The Problem

The objective is to create a SAS® library (not just a simple pass-through query) that points to an Impala schema, while forcing the session to use a specific request_pool from the moment the connection is established.

A standard ODBC connection does not automatically transmit these session parameters without explicit configuration.
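
To see the problem concretely, a plain ODBC library assignment like the sketch below connects fine, but the session keeps the cluster's default pool (the libref imp is arbitrary; the DSN and schema are reused from the example further down):

/* Without dbconinit, no session command is sent:
   queries run in the default request pool */
libname imp odbc
   noprompt="dsn=MYIMPALA;"
   schema=pc_anltcclientref_analysis;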

The Solution: The dbconinit Parameter

The key lies in using the dbconinit (Database Connection Initialization) option. This option allows you to specify an SQL command that will be executed immediately after the connection is established, but before any data is read or written.

This is the ideal place to put the Impala SET request_pool command.

Implementation

Here is the complete syntax for declaring the library. In this example, we connect to an ODBC DSN named MYIMPALA and set the pool to a specific value.

LIBNAME eap32 odbc
   noprompt="dsn=MYIMPALA;"
   schema=pc_anltcclientref_analysis
   dbconinit="set request_pool='ACE-S-FRA-EAP-PROD-PC_ANLTCCLIENTREF'"
   access=readonly;
Parameter Details:

noprompt="dsn=...": Defines the ODBC data source without opening a dialog box.

schema: Targets the specific database schema.

dbconinit: This is where the magic happens. The set request_pool='...' command is sent to the Impala server to configure the session environment.

access=readonly: A good practice that keeps access to the analytical data read-only.

Thanks to the dbconinit option, you don't need to resort to more complex Pass-Through methods for a simple session configuration. You retain the advantage of a standard, navigable SAS® library that can be used directly in your DATA steps and procedures, while respecting the resource allocation rules of your Hadoop cluster.
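
As a quick illustration of that last point, here is how the library might be used once assigned (a sketch; the table name client_ref and the column country are hypothetical):

/* Browse the Impala tables visible through the library */
proc datasets library=eap32;
quit;

/* Read a table: the query runs under the configured request pool */
data work.extract;
   set eap32.client_ref;
   where country = 'FR';   /* WHERE clause is passed down to Impala when possible */
run;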