Published on :
Data Access CREATION_INTERNE

Greenplum Data Connector

This code is also available in: Deutsch Español Français
Awaiting validation
This data connector facilitates the integration of data stored in a Greenplum database with the in-memory processing environment of SAS© Viya. It supports various Greenplum data types, converting them to corresponding CAS data types. Configuration involves defining a 'caslib' that specifies database connection details, including source type, server, credentials, database, and schema. Data loading and unloading operations can be performed via CASLIB statements or procedures such as PROC CASUTIL. It is crucial to respect the case sensitivity of table names and identifiers as defined in Greenplum.
Data Analysis

Type : CREATION_INTERNE


Examples use generated data (datalines) to simulate a Greenplum table or SASHELP data, and generic connection parameters. The examples are self-contained.

1 Code Block
CASLIB
Explanation :
This example demonstrates the simplest way to establish a connection between SAS Cloud Analytic Services (CAS) and a Greenplum database. The `CASLIB` statement is used to define a CAS library (`greenplumcaslib_basic`) that points to the Greenplum data source. The `srctype`, `server`, `port`, `database`, `username`, `password`, and `schema` options are essential for specifying connection details. Once the caslib is defined, the `proc cas; list caslibs; quit;` procedure allows verifying its presence.
Copied!
1/* Remplacez les valeurs fictives par vos informations de connexion Greenplum */
2caslib greenplumcaslib_basic
3 datasource=(
4 srctype='greenplum',
5 server='votre_serveur_greenplum.com',
6 port=5432,
7 database='votre_bdd_greenplum',
8 username='votre_utilisateur',
9 password='votre_motdepasse',
10 schema='public'
11 );
12 
13/* Vérifier que la caslib a été ajoutée */
14PROC CAS;
15 list caslibs;
16QUIT;
2 Code Block
PROC CASUTIL Data
Explanation :
This example illustrates loading data from Greenplum (simulated here by `casuser.myGPdata` for example autonomy) into an in-memory CAS table using `PROC CASUTIL`. The `LOAD` subcommand specifies the source table (`casdata`), the source caslib (`incaslib`), the destination caslib (`outcaslib`), and the name of the new CAS table (`casout`). The `promote` option makes the table visible to all sessions. The `table.tableInfo` and `table.fetch` commands are then used to verify that the table has been loaded correctly and to display its first rows.
Copied!
1/* Créer une table CAS factice pour simuler des données Greenplum (si non existantes) */
2DATA casuser.myGPdata;
3 INPUT ID Name $ Value;
4 DATALINES;
5 1 John 100
6 2 Jane 150
7 3 Mike 200
8 4 Sarah 120
9 ;
10RUN;
11 
12/* Définir une caslib Greenplum (assumant qu'elle est déjà définie et configurée comme dans l'Exemple 1) */
13/* Remplacez les valeurs fictives par vos informations de connexion Greenplum */
14caslib greenplumcaslib
15 datasource=(
16 srctype='greenplum',
17 server='votre_serveur_greenplum.com',
18 port=5432,
19 database='votre_bdd_greenplum',
20 username='votre_utilisateur',
21 password='votre_motdepasse',
22 schema='public'
23 );
24 
25/* Utiliser PROC CASUTIL pour charger la table 'myGPdata' (simulant une table Greenplum) dans 'casuser' */
26PROC CASUTIL;
27 load casdata='myGPdata' incaslib='greenplumcaslib' outcaslib='casuser'
28 casout='myGPdata_cas' promote;
29QUIT;
30 
31/* Vérifier le chargement et le contenu de la table CAS */
32PROC CAS;
33 TABLE.tableInfo caslib='casuser' TABLE='myGPdata_cas';
34 TABLE.fetch caslib='casuser' TABLE='myGPdata_cas' sastableonly=true;
35QUIT;
3 Code Block
PROC CASUTIL Data
Explanation :
This example demonstrates advanced functionalities of `PROC CASUTIL` for data loading. It includes source-side filtering (`where='Quantity > 8'`) to reduce the volume of transferred data and import options (`importoptions`) to rename columns (`OrderID` to `CommandeID`, `Product` to `Produit`) and specify their types (`double`, `varchar`) during loading into CAS. These optimizations improve the efficiency and relevance of data in the CAS environment. The `PROC PRINT` procedure is used to display the result of the filtered and transformed loading.
Copied!
1/* Créer une table CAS factice pour simuler des données Greenplum */
2DATA casuser.greenplum_sales;
3 INPUT OrderID Product $ Quantity Price;
4 DATALINES;
5 101 A 10 25.50
6 102 B 5 10.00
7 103 A 12 25.50
8 104 C 8 50.25
9 105 B 7 10.00
10 ;
11RUN;
12 
13/* Supposons que greenplumcaslib est déjà définie */
14/* Simuler un chargement depuis Greenplum avec filtrage et sélection/renommage de colonnes */
15PROC CASUTIL;
16 load casdata='greenplum_sales' incaslib='greenplumcaslib' outcaslib='casuser'
17 casout='filtered_greenplum_data' promote
18 where='Quantity > 8' /* Filtrage des données dès la source Greenplum */
19 importoptions=(
20 columns=(
21 {name='OrderID', rename='CommandeID', type='double'},
22 {name='Product', rename='Produit', type='varchar'}
23 )
24 );
25QUIT;
26 
27/* Vérifier le chargement et les transformations */
28PROC PRINT DATA=casuser.filtered_greenplum_data;
29RUN;
4 Code Block
CAS ACTION Data
Explanation :
This example illustrates the direct use of a CAS action, specifically `table.fetch`, to interact with Greenplum data without using `PROC CASUTIL`. It shows how to read data directly from the `greenplumcaslib` (simulating a Greenplum table called `greenplum_inventory`), apply a filtering condition (`where='Stock_Level < 100'`), and save the results into a new temporary CAS table (`limited_stock_items`). This approach is typical for Viya scenarios where data manipulation is performed in distributed in-memory on the CAS server.
Copied!
1/* Créer une table CAS factice pour simuler des données Greenplum */
2DATA casuser.greenplum_inventory;
3 INPUT Item $ Stock_Level Location $;
4 DATALINES;
5 Laptop 50 WarehouseA
6 Mouse 200 WarehouseB
7 Keyboard 120 WarehouseA
8 Monitor 75 WarehouseC
9 ;
10RUN;
11 
12/* Définir une caslib Greenplum (assumant qu'elle est déjà définie et configurée) */
13/* Remplacez les valeurs fictives par vos informations de connexion Greenplum */
14caslib greenplumcaslib
15 datasource=(
16 srctype='greenplum',
17 server='votre_serveur_greenplum.com',
18 port=5432,
19 database='votre_bdd_greenplum',
20 username='votre_utilisateur',
21 password='votre_motdepasse',
22 schema='public'
23 );
24 
25/* Utiliser l'action 'table.fetch' pour lire directement depuis la source Greenplum via la caslib */
26PROC CAS;
27 SESSION casauto;
28 TABLE.fetch caslib='greenplumcaslib' TABLE='greenplum_inventory' /* Simule une table Greenplum */
29 where='Stock_Level < 100'
30 to='limited_stock_items';
31
32 /* Afficher le résultat de l'action CAS */
33 TABLE.contents caslib='casuser' TABLE='limited_stock_items';
34 TABLE.fetch caslib='casuser' TABLE='limited_stock_items' sastableonly=true;
35QUIT;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info : Copyright © SAS Institute Inc. All Rights Reserved


Related Documentation : Data Access

Sujet / Mot-cléLien vers la ressource
DOC FedSQL en/sampleCode/FEDSQL9D66
Banner
Expert Advice
Expert
Michael
Responsable de l'infrastructure Viya.
« To maximize performance, ensure that the network infrastructure between your SAS Viya deployment and the Greenplum segment hosts is optimized for high throughput. Since both are MPP systems, the speed of data ingestion is often limited only by the "pipe" connecting the two clusters. »