network

community

Description

The community action detects communities in a graph, that is, groups of nodes that are more densely connected to one another than to the rest of the network. Identifying these groups helps reveal the structure of a network. In a social network, for example, communities can represent groups of friends or colleagues.

network.community <result=results> <status=rc> / algorithm="LABELPROPAGATION" | "LOUVAIN", deterministic=TRUE | FALSE, direction="DIRECTED" | "UNDIRECTED", display={...}, distributed=TRUE | FALSE, fix="variable-name", graph=integer, indexOffset=integer, labelUpdateMode="ASYNCHRONOUS" | "SYNCHRONOUS", linkRemovalRatio=integer, links={...}, linksVar={...}, logFreqTime=integer, logLevel="AGGRESSIVE" | "BASIC" | "MODERATE" | "NONE", maxIters=integer, multiLinks=TRUE | FALSE, nodes={...}, nodesVar={...}, nThreads=integer, outCommLinks={...}, outCommunity={...}, outGraphList={...}, outLevel={...}, outLinks={...}, outNodes={...}, outOverlap={...}, outputTables={...}, recursive={...}, resolutionList={double-1 <, double-2, ...>}, selfLinks=TRUE | FALSE, standardizedLabels=TRUE | FALSE, standardizedLabelsOut=TRUE | FALSE, tolerance=double, warmStart="variable-name";
Settings
| Parameter | Description |
| --- | --- |
| algorithm | Specifies the algorithm to use for community detection. Can be 'LABELPROPAGATION' or 'LOUVAIN'. |
| deterministic | When set to True, ensures that each invocation (with the same machine configuration and parameter settings) produces the same final result. |
| direction | Specifies whether to consider the input graph as directed or undirected. |
| display | Specifies a list of results tables to send to the client for display. |
| distributed | When set to True, uses a distributed graph for processing. |
| fix | Specifies the variable that defines groups of nodes to fix together for community detection. |
| graph | Specifies the in-memory graph to use for the analysis. |
| indexOffset | Specifies the index offset for identifiers in the log and results output data tables. |
| labelUpdateMode | Specifies whether nodes update their labels according to the labels of their neighbors at the current iteration (ASYNCHRONOUS) or the previous iteration (SYNCHRONOUS). |
| linkRemovalRatio | Specifies the percentage of small-weight links to be removed around each node neighborhood. |
| links | Specifies the input data table that contains the graph link information. |
| linksVar | Specifies the data variable names for the links table. |
| logFreqTime | Controls the frequency in seconds for displaying iteration logs. |
| logLevel | Controls the amount of information that is displayed in the SAS log. |
| maxIters | Specifies the maximum number of iterations that the algorithm can run. |
| multiLinks | When set to True, includes multilinks when an input graph is read. |
| nodes | Specifies the input data table that contains the graph node information. |
| nodesVar | Specifies the data variable names for the nodes table. |
| nThreads | Specifies the maximum number of threads to use for multithreaded processing. |
| outCommLinks | Specifies the output data table to describe the links between each community. |
| outCommunity | Specifies the output data table to contain properties about each community. |
| outGraphList | Specifies the output data table to contain summary information about in-memory graphs. |
| outLevel | Specifies the output data table to contain community information at different resolution levels. |
| outLinks | Specifies the output data table to contain the graph link information along with any results from the algorithms that calculate metrics on links. |
| outNodes | Specifies the output data table to contain the graph node information along with any results from the algorithms that calculate metrics on nodes. |
| outOverlap | Specifies the output data table to describe the intensity of each node's membership to multiple communities. |
| outputTables | Lists the names of results tables to save as CAS tables on the server. |
| recursive | Breaks down large communities into smaller ones until the specified conditions are satisfied. |
| resolutionList | Specifies a list of resolution values for community detection. |
| selfLinks | When set to True, includes self-links when an input graph is read. |
| standardizedLabels | When set to True, specifies that the input graph data are in a standardized format. |
| standardizedLabelsOut | When set to True, requests that the output graph data include standardized format. |
| tolerance | Specifies the tolerance value for when to stop iterations. |
| warmStart | Specifies the variable that defines initial community identifiers for warm starting community detection. |
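
As an illustration of how a few of these settings combine in a single call, the sketch below selects label propagation, caps the number of iterations, and raises the log detail. The parameter names and allowed values come from the syntax above; the iteration cap of 20 and the choice of log level are illustrative assumptions, and the code has not been validated against a CAS server. It reads the `LinkSetIn` table that is created in the Data Creation section.

```sas
proc cas;
   /* Sketch only: the values below are illustrative, not recommended defaults. */
   action network.community /
      algorithm = "LABELPROPAGATION",   /* one of the two documented algorithms */
      maxIters  = 20,                   /* assumed cap on iterations */
      logLevel  = "MODERATE",           /* one of the documented log levels */
      links     = {name = "LinkSetIn"},
      outNodes  = {name = "NodeSetOut", replace = true};
run;
quit;
```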
Data Preparation
Data Creation

This example uses the `mycas.LinkSetIn` data table to represent the links of a graph. The `from` and `to` columns define the nodes of the links, and the `weight` column provides the link weights.

DATA mycas.LinkSetIn;
   INFILE DATALINES delimiter=',';
   INPUT from $ to $ weight;
   DATALINES;
A,B,1
A,C,1
A,D,1
B,C,1
B,D,1
C,D,1
E,F,1
E,G,1
F,G,1
;
RUN;
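
The DATA step writes to the `mycas` libref, so it presumably expects a CAS session and a CAS engine libref to exist already. A minimal setup sketch, to be run before the DATA step, could look like the following. The session name `mysess` is a hypothetical placeholder, and the libref is assumed to point at the session's active caslib.

```sas
/* Assumption: no CAS session exists yet; "mysess" is an illustrative name. */
cas mysess;

/* Bind the libref used in this example to the session's active caslib. */
libname mycas cas sessref=mysess;
```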

Examples

This example illustrates the use of the community detection algorithm on an undirected graph. It produces several output tables detailing the communities, nodes, links, and other properties.

SAS® / CAS Code
PROC CAS;
   ACTION network.community /
      links        = {name = "LinkSetIn"},
      outNodes     = {name = "NodeSetOut", replace=true},
      outLinks     = {name = "LinkSetOut", replace=true},
      outLevel     = {name = "LevelSetOut", replace=true},
      outCommunity = {name = "CommSetOut", replace=true},
      outOverlap   = {name = "OverlapSetOut", replace=true},
      outCommLinks = {name = "CommLinkSetOut", replace=true};
RUN;
QUIT;
Result:
The output data table mycas.NodeSetOut contains the community identifier of each node. Other tables provide detailed information about the network structure and community properties.
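
The `resolutionList` and `outLevel` parameters described in the settings can be combined to compare communities at more than one resolution. The sketch below is unvalidated and rests on two assumptions: that the Louvain algorithm accepts a list of resolution values in this form, and that 1.0 and 0.5 are reasonable values to try (they are chosen purely for illustration).

```sas
proc cas;
   /* Sketch: request Louvain communities at two illustrative resolution values. */
   action network.community /
      algorithm      = "LOUVAIN",
      resolutionList = {1.0, 0.5},      /* assumed values, not recommendations */
      links          = {name = "LinkSetIn"},
      outNodes       = {name = "NodeSetOut", replace = true},
      outLevel       = {name = "LevelSetOut", replace = true};
run;
quit;
```

If the call behaves as the parameter descriptions suggest, `LevelSetOut` would hold community information for each resolution level.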

This example illustrates the use of the community detection algorithm on a directed graph. The `direction` parameter is set to 'DIRECTED'.

SAS® / CAS Code
PROC CAS;
   ACTION network.community /
      direction = "DIRECTED",
      links     = {name = "LinkSetIn",
                   vars = {"from", "to", "weight"}},
      outNodes  = {name = "NodeSetOut", replace=true};
RUN;
QUIT;
Result:
The output data table mycas.NodeSetOut contains the community identifier of each node, computed with link direction taken into account.
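
The syntax also allows the results and the return status of the call to be captured through the optional `result=` and `status=` arguments. A minimal sketch, assuming the variable names `r` and `s` (both hypothetical) and the same directed input as above:

```sas
proc cas;
   /* r receives the results returned by the action; s receives its status. */
   action network.community result=r status=s /
      direction = "DIRECTED",
      links     = {name = "LinkSetIn"},
      outNodes  = {name = "NodeSetOut", replace = true};
   run;

   /* Print both variables to see what the action returned. */
   print r;
   print s;
run;
quit;
```

Checking the status variable before reading the output tables is one way to catch a failed call early in a longer program.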

FAQ

What is the purpose of the network.community action?
What algorithms can be used for community detection with this action?
How does the Louvain algorithm work in this context?
What does the 'resolutionList' parameter do?
How can I handle directed versus undirected graphs?
What is the purpose of the 'outNodes' and 'outLinks' output tables?
How can I break down very large communities into smaller ones?
What information does the 'outOverlap' table provide?

Associated Scenarios

Use Case
Detection of Money Laundering Rings (Directed Graph)

A financial institution wants to identify potential money laundering rings where funds circulate in a closed loop between accounts. Since money flow is directional, the analysis...

Use Case
Granular Segmentation of Large Social Networks

A marketing firm analyzes a large social network. The initial communities are too large to be actionable for targeted campaigns. They need to recursively break down these giant ...

Use Case
IT Infrastructure Dependency & Overlap Analysis

An IT department maps server dependencies. Some servers (like gateways) belong to multiple subsystems. The goal is to identify these 'bridge' servers (Overlap) and analyze the n...