optNetwork

minSpanTree

Description

The minSpanTree action calculates the minimum spanning tree (MST) of a graph. An MST is a subgraph that connects all nodes together with the minimum possible total link weight, without forming any cycles. For directed graphs, it finds a minimum spanning arborescence, which is a directed tree rooted at a specified source node.

optNetwork.minSpanTree <result=results> <status=rc> / deterministic=TRUE | FALSE, direction="DIRECTED" | "UNDIRECTED", display={...}, distributed=TRUE | FALSE, graph=integer, indexOffset=integer, links={...}, linksVar={...}, logFreqTime=integer, logLevel="AGGRESSIVE" | "BASIC" | "MODERATE" | "NONE", multiLinks=TRUE | FALSE, nodes={...}, nodesVar={...}, nThreads=integer, out={...}, outGraphList={...}, outLinks={...}, outNodes={...}, outputTables={...}, selfLinks=TRUE | FALSE, source="string" | double | 64-bit-integer, standardizedLabels=TRUE | FALSE, standardizedLabelsOut=TRUE | FALSE ;
Settings
ParameterDescription
deterministic When set to True, ensures that each invocation (with the same machine configuration and parameter settings) produces the same final result.
direction Specifies whether to consider the input graph as directed or undirected. For MST, this is typically 'UNDIRECTED'. When 'DIRECTED' is used, a source node must be specified to find a minimum spanning arborescence.
display Specifies a list of results tables to send to the client for display.
distributed When set to True, uses a distributed graph algorithm. Set to false for single-machine processing.
graph Specifies the in-memory graph to use for the calculation, if it has been previously loaded.
indexOffset Specifies the index offset for identifiers in the log and results tables. For example, an offset of 1 would label entities starting from 1 instead of 0.
links Specifies the input data table that contains the graph link information, including 'from', 'to', and 'weight' columns.
linksVar Specifies the data variable names for the links table if they differ from the defaults (e.g., from, to, weight).
logLevel Controls the amount of information displayed in the SAS log during action execution.
out Specifies the output data table to contain the links that form the minimum spanning tree.
source Specifies the source node for calculating a directed minimum spanning tree (arborescence). This parameter is required when direction is 'DIRECTED'.
Data Preparation View data prep sheet
Data Creation: Weighted Links

Create a CAS table named 'mycas.LinkSetIn' representing a graph. It contains 'from' and 'to' columns for nodes, and a 'weight' column for the cost or distance of each link.

Copied!
1DATA mycas.LinkSetIn;
2 attrib from varchar(1) to varchar(1) weight numeric;
3 INFILE DATALINES dlm=' ';
4 INPUT from $ to $ weight;
5 DATALINES;
6A B 10
7A C 4
8B C 2
9B D 5
10C D 8
11C E 6
12D E 1
13;
14RUN;

Examples

This example finds the minimum spanning tree on an undirected graph. It identifies the subset of links that connects all nodes with the minimum total weight.

SAS® / CAS Code Code awaiting community validation
Copied!
1PROC CAS;
2 ACTION optNetwork.minSpanTree /
3 links={name='LinkSetIn'}
4 out={name='MstOut', replace=true};
5 RUN;
6 ACTION TABLE.fetch / TABLE='MstOut';
7 RUN;
8QUIT;
Result :
The action returns a table 'MstOut' containing the links of the MST: (C, A), (C, B), (B, D), (D, E). The total weight of the MST is 22.

This example calculates a directed minimum spanning tree (an arborescence) rooted at the source node 'A'. It finds the cheapest paths from 'A' to all other nodes, forming a tree structure.

SAS® / CAS Code Code awaiting community validation
Copied!
1PROC CAS;
2 ACTION optNetwork.minSpanTree /
3 links={name='LinkSetIn'}
4 direction='DIRECTED'
5 SOURCE='A'
6 out={name='MstOutDirected', replace=true};
7 RUN;
8 ACTION TABLE.fetch / TABLE='MstOutDirected';
9 RUN;
10QUIT;
Result :
The output table 'MstOutDirected' contains the links of the minimum spanning arborescence rooted at 'A'. The links will be (A, C), (C, B), (B, D), and (D, E), with a total weight of 12.

FAQ

What is the main purpose of the minSpanTree action?
What types of graphs does the minSpanTree action support?
What are the primary input data tables for the minSpanTree action?
What is the main output of the minSpanTree action?
Can the minSpanTree computation be parallelized?