In the SAS Viya architecture, effective data management relies on a fundamental concept: the Caslib. A Caslib is an in-memory space on the CAS (Cloud Analytic Services) server intended to hold tables, access controls, and information about data sources.
This article explores the different types of Caslibs, their scope, and how to manipulate them using SAS code, with a focus on best practices for loading and sharing data.
What is a Caslib?
A Caslib acts as a unified access point. It connects the CAS server to:
External data sources (files, databases like Oracle or Hadoop).
In-memory tables that have been loaded onto the CAS server.
It also associates access controls that define which user groups or individuals are allowed to interact with the data.
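To see which Caslibs a session can reach, the CASLIB statement can list them. The following is a minimal sketch; the session name mysess is an arbitrary choice for illustration, and the output depends on how the server is configured and on the permissions of the connected user.

```sas
/* Start a CAS session; "mysess" is an illustrative session name */
cas mysess;

/* Write the definition of every Caslib visible to this session to the log */
caslib _all_ list;
```

For each Caslib, the log reports details such as its data source type and path.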
Caslib Types
There are three main categories of Caslibs, defined by how they are created and managed:
1. Personal Caslib
This library is configured during the CAS server installation. When a CAS session is initiated, the personal Caslib (typically named CASUSER) is always available with a global scope for the current user. It allows access to that user's CAS tables from any session started under the same user ID.
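As an illustration, the sketch below loads a small sample table into the personal Caslib with PROC CASUTIL. It assumes that the personal Caslib is named casuser, that the SASHELP.CLASS sample data set is available, and that a session can be started under the illustrative name mysess.

```sas
/* Start a CAS session; "mysess" is an illustrative session name */
cas mysess;

proc casutil;
   /* Copy the SASHELP.CLASS sample data set into the personal Caslib */
   load data=sashelp.class outcaslib="casuser" casout="class" replace;

   /* Show the in-memory tables now held in the personal Caslib */
   list tables incaslib="casuser";
quit;
```

Because PROMOTE is not specified, the loaded table itself has session scope even though the Caslib is global.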
2. Predefined Caslib
Managed by CAS administrators, these libraries have a global scope. They are typically used for popular data sources shared by a wide range of users (for example, a Hadoop-Hive or Oracle connection common to the entire team). The administrator manages access permissions.
3. Manually Added Caslib
Authorized users can add Caslibs via a CASLIB statement (for example in SAS Studio). This is the preferred method for ad hoc data access, when the user does not necessarily want to share the data with the entire server.
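A manually added Caslib can be sketched as follows for a directory of files. The Caslib name mydata and the path /data/project are hypothetical placeholders; the path must exist on the CAS server and be readable by the account running the session.

```sas
/* Start a CAS session; "mysess" is an illustrative session name */
cas mysess;

/* Add a path-based Caslib; the name and the path are hypothetical */
caslib mydata datasource=(srctype="path") path="/data/project";

/* Confirm the definition of the new Caslib in the log */
caslib mydata list;
```

Other source types, such as database connections, take additional DATASOURCE= options that depend on the data connector in use.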
Caslib Scope: Session vs. Global
The concept of scope is crucial for understanding data visibility and persistence.
Session-Scope Caslib
If a Caslib is defined without the GLOBAL option, it is limited to the current session.
Availability: Tables loaded into this Caslib are only visible to the user's specific CAS session.
Persistence: The Caslib and its tables exist only for the life of the session. If the user opens a new session, they will no longer be accessible.
Code Example (Session-Scope):
The code below creates a Hive connection that is local to the session. Note the absence of both the GLOBAL option and the PROMOTE option.