The fedsql.execdirect action allows submitting SQL queries directly to the CAS server. By specifying the 'showStages=true' parameter, it is possible to obtain a detailed report of the query's execution plan. This report breaks down execution into several stages, displaying the methods used (e.g., sorting, aggregation, sequential scan), the SQL query generated for each stage (stage query), the number of data access threads per worker, input tables with their properties (number of rows, columns, partitioning), and execution times for each stage and for the global query. This information is crucial for query optimization and understanding FedSQL's behavior on the distributed CAS architecture.
Data Analysis
Type : CREATION_INTERNE
The examples use an internally created 'crimeCensus' table with a DATA Step and dummy data to simulate the behavior described in the documentation.
1 Code Block
DATA STEP / FEDSQL.EXECDIRECT Data
Explanation : This minimal example creates a temporary 'crimeCensus' table in CASUSER with dummy data. Then, it executes a simple FedSQL query that counts the number of occurrences for each 'primary_type' and sorts the results in descending order of the count. The 'showStages' parameter is not used here to illustrate the basic use of the action.
Copied!
data casuser.crimeCensus;
infile datalines dlm=';';
input primary_type $ date :yymmdd10. community_area hardship_index per_capita_income;
format date yymmdd10.;
datalines;
THEFT;2015-01-15;10;80;25000
THEFT;2015-02-20;10;80;25000
THEFT;2015-03-01;10;80;25000
THEFT;2015-04-10;10;80;25000
THEFT;2015-05-05;10;80;25000
THEFT;2016-01-20;10;80;25000
THEFT;2016-02-25;10;80;25000
THEFT;2016-03-05;10;80;25000
THEFT;2016-04-15;10;80;25000
THEFT;2016-05-10;10;80;25000
ASSAULT;2015-01-01;20;70;30000
ASSAULT;2015-01-02;20;70;30000
ASSAULT;2015-01-03;20;70;30000
ASSAULT;2015-01-04;20;70;30000
ASSAULT;2015-01-05;20;70;30000
ROBBERY;2016-06-01;30;90;20000
ROBBERY;2016-06-02;30;90;20000
ROBBERY;2016-06-03;30;90;20000
ROBBERY;2016-06-04;30;90;20000
ROBBERY;2016-06-05;30;90;20000
;
run;
proc cas;
fedsql.execdirect
query="select primary_type, count(*) as count from crimeCensus group by primary_type order by count desc";
quit;
1
DATA casuser.crimeCensus;
2
INFILEDATALINES dlm=';';
3
INPUT primary_type $ date :yymmdd10. community_area hardship_index per_capita_income;
4
FORMAT date yymmdd10.;
5
DATALINES;
6
THEFT;2015-01-15;10;80;25000
7
THEFT;2015-02-20;10;80;25000
8
THEFT;2015-03-01;10;80;25000
9
THEFT;2015-04-10;10;80;25000
10
THEFT;2015-05-05;10;80;25000
11
THEFT;2016-01-20;10;80;25000
12
THEFT;2016-02-25;10;80;25000
13
THEFT;2016-03-05;10;80;25000
14
THEFT;2016-04-15;10;80;25000
15
THEFT;2016-05-10;10;80;25000
16
ASSAULT;2015-01-01;20;70;30000
17
ASSAULT;2015-01-02;20;70;30000
18
ASSAULT;2015-01-03;20;70;30000
19
ASSAULT;2015-01-04;20;70;30000
20
ASSAULT;2015-01-05;20;70;30000
21
ROBBERY;2016-06-01;30;90;20000
22
ROBBERY;2016-06-02;30;90;20000
23
ROBBERY;2016-06-03;30;90;20000
24
ROBBERY;2016-06-04;30;90;20000
25
ROBBERY;2016-06-05;30;90;20000
26
;
27
RUN;
28
29
PROC CAS;
30
fedsql.execdirect
31
query="select primary_type, count(*) as count from crimeCensus group by primary_type order by count desc";
32
QUIT;
33
2 Code Block
FEDSQL.EXECDIRECT
Explanation : This example reproduces the use case provided in the documentation. It submits a more complex FedSQL query, filtering by crime type ('THEFT') and a date range, grouping by several variables and having a 'HAVING' condition. The 'showStages=true' parameter is enabled to obtain the execution plan details, showing the different stages of query processing on the CAS server.
Copied!
proc cas;
fedsql.execdirect
showStages=true
query="select primary_type, count(*) as count, community_area,
hardship_index, per_capita_income from crimeCensus
where primary_type='THEFT'
and date between date '2015-01-01' and date'2016-01-01'
group by primary_type, community_area, hardship_index, per_capita_income
having count(*) >= 5
order by count desc";
quit;
1
PROC CAS;
2
fedsql.execdirect
3
showStages=true
4
query="select primary_type, count(*) as count, community_area,
5
hardship_index, per_capita_income from crimeCensus
6
where primary_type='THEFT'
7
and date between date '2015-01-01' and date'2016-01-01'
8
group by primary_type, community_area, hardship_index, per_capita_income
9
having count(*) >= 5
10
order by count desc";
11
QUIT;
3 Code Block
FEDSQL.EXECDIRECT
Explanation : This advanced example introduces a join between the 'crimeCensus' table (created in Example 1) and a new 'population_data' table. The query calculates the total number of incidents and the incident rate per 1000 inhabitants for specific crime types over a longer period. Using 'showStages=true' allows analyzing the execution plan for this multi-table query, including how the join and complex calculations are handled by FedSQL on CAS.
Copied!
data casuser.population_data;
infile datalines dlm=';';
input community_area $ population;
datalines;
10;50000
20;75000
30;60000
;
run;
proc cas;
fedsql.execdirect
showStages=true
query="select c.primary_type, count(c.*) as total_incidents,
c.community_area, p.population,
(count(c.*) * 1000 / p.population) as incidents_per_1000
from crimeCensus as c
join population_data as p
on c.community_area = p.community_area
where c.primary_type in ('THEFT', 'ROBBERY')
and c.date between date '2015-01-01' and date '2016-12-31'
group by c.primary_type, c.community_area, p.population
having count(c.*) >= 2
order by incidents_per_1000 desc";
quit;
1
DATA casuser.population_data;
2
INFILEDATALINES dlm=';';
3
INPUT community_area $ population;
4
DATALINES;
5
10;50000
6
20;75000
7
30;60000
8
;
9
RUN;
10
11
PROC CAS;
12
fedsql.execdirect
13
showStages=true
14
query="select c.primary_type, count(c.*) as total_incidents,
15
c.community_area, p.population,
16
(count(c.*) * 1000 / p.population) as incidents_per_1000
17
from crimeCensus as c
18
join population_data as p
19
on c.community_area = p.community_area
20
where c.primary_type in ('THEFT', 'ROBBERY')
21
and c.date between date '2015-01-01' and date '2016-12-31'
22
group by c.primary_type, c.community_area, p.population
23
having count(c.*) >= 2
24
order by incidents_per_1000 desc";
25
QUIT;
4 Code Block
FEDSQL.EXECDIRECT / TABLE.REPLICATE
Explanation : This example focuses on Viya/CAS integration by replicating the 'crimeCensus' table (for demonstration purposes, the table is first recreated if necessary) in CAS server memory using the 'table.replicate' action. Replication can improve performance for small, frequently accessed tables. Then, a FedSQL query is executed on the replicated table with 'showStages=true' to observe how the execution plan leverages the replicated table (e.g., 'Replicated to one worker' in the plan output), which can show more efficient data access time.
Copied!
proc cas;
/* Créer la table crimeCensus si elle n'existe pas déjà (dupliqué pour l'autonomie) */
data casuser.crimeCensus;
infile datalines dlm=';';
input primary_type $ date :yymmdd10. community_area hardship_index per_capita_income;
format date yymmdd10.;
datalines;
THEFT;2015-01-15;10;80;25000
THEFT;2015-02-20;10;80;25000
THEFT;2015-03-01;10;80;25000
THEFT;2015-04-10;10;80;25000
THEFT;2015-05-05;10;80;25000
THEFT;2016-01-20;10;80;25000
THEFT;2016-02-25;10;80;25000
THEFT;2016-03-05;10;80;25000
THEFT;2016-04-15;10;80;25000
THEFT;2016-05-10;10;80;25000
ASSAULT;2015-01-01;20;70;30000
ASSAULT;2015-01-02;20;70;30000
ASSAULT;2015-01-03;20;70;30000
ASSAULT;2015-01-04;20;70;30000
ASSAULT;2015-01-05;20;70;30000
ROBBERY;2016-06-01;30;90;20000
ROBBERY;2016-06-02;30;90;20000
ROBBERY;2016-06-03;30;90;20000
ROBBERY;2016-06-04;30;90;20000
ROBBERY;2016-06-05;30;90;20000
;
run;
/* Répliquer la table crimeCensus dans CAS pour une meilleure performance */
table.replicate /
name='crimeCensus',
caslib='casuser',
target='crimeCensus_replicated',
numreplicas=1; /* Répliquer sur un seul worker pour une petite table */
/* Exécuter une requête FedSQL sur la table répliquée avec showStages */
fedsql.execdirect
showStages=true
query="select primary_type, count(*) as count from crimeCensus_replicated group by primary_type order by count desc";
quit;
1
PROC CAS;
2
/* Créer la table crimeCensus si elle n'existe pas déjà (dupliqué pour l'autonomie) */
3
DATA casuser.crimeCensus;
4
INFILEDATALINES dlm=';';
5
INPUT primary_type $ date :yymmdd10. community_area hardship_index per_capita_income;
6
FORMAT date yymmdd10.;
7
DATALINES;
8
THEFT;2015-01-15;10;80;25000
9
THEFT;2015-02-20;10;80;25000
10
THEFT;2015-03-01;10;80;25000
11
THEFT;2015-04-10;10;80;25000
12
THEFT;2015-05-05;10;80;25000
13
THEFT;2016-01-20;10;80;25000
14
THEFT;2016-02-25;10;80;25000
15
THEFT;2016-03-05;10;80;25000
16
THEFT;2016-04-15;10;80;25000
17
THEFT;2016-05-10;10;80;25000
18
ASSAULT;2015-01-01;20;70;30000
19
ASSAULT;2015-01-02;20;70;30000
20
ASSAULT;2015-01-03;20;70;30000
21
ASSAULT;2015-01-04;20;70;30000
22
ASSAULT;2015-01-05;20;70;30000
23
ROBBERY;2016-06-01;30;90;20000
24
ROBBERY;2016-06-02;30;90;20000
25
ROBBERY;2016-06-03;30;90;20000
26
ROBBERY;2016-06-04;30;90;20000
27
ROBBERY;2016-06-05;30;90;20000
28
;
29
RUN;
30
31
/* Répliquer la table crimeCensus dans CAS pour une meilleure performance */
32
TABLE.replicate /
33
name='crimeCensus',
34
caslib='casuser',
35
target='crimeCensus_replicated',
36
numreplicas=1; /* Répliquer sur un seul worker pour une petite table */
37
38
/* Exécuter une requête FedSQL sur la table répliquée avec showStages */
39
fedsql.execdirect
40
showStages=true
41
query="select primary_type, count(*) as count from crimeCensus_replicated group by primary_type order by count desc";
42
QUIT;
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
« Pay close attention to the replicate action in Example 4. For small lookup tables, replicating the data to every worker node allows the FedSQL engine to perform "local joins." This eliminates the need for expensive data shuffles across the network, often reducing query time from seconds to milliseconds. »
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. WeAreCAS is an independent community site and is not affiliated with SAS Institute Inc.
This site uses technical and analytical cookies to improve your experience.
Read more.