network

community

Description

The community action detects communities in a graph. It is a fundamental tool for understanding the structure of a network. In a social network, for example, communities can represent groups of friends or colleagues.

network.community <result=results> <status=rc> / algorithm="LABELPROPAGATION" | "LOUVAIN", deterministic=TRUE | FALSE, direction="DIRECTED" | "UNDIRECTED", display={...}, distributed=TRUE | FALSE, fix="variable-name", graph=integer, indexOffset=integer, labelUpdateMode="ASYNCHRONOUS" | "SYNCHRONOUS", linkRemovalRatio=integer, links={...}, linksVar={...}, logFreqTime=integer, logLevel="AGGRESSIVE" | "BASIC" | "MODERATE" | "NONE", maxIters=integer, multiLinks=TRUE | FALSE, nodes={...}, nodesVar={...}, nThreads=integer, outCommLinks={...}, outCommunity={...}, outGraphList={...}, outLevel={...}, outLinks={...}, outNodes={...}, outOverlap={...}, outputTables={...}, recursive={...}, resolutionList={double-1 <, double-2, ...>}, selfLinks=TRUE | FALSE, standardizedLabels=TRUE | FALSE, standardizedLabelsOut=TRUE | FALSE, tolerance=double, warmStart="variable-name";
Settings
Parameter: Description
algorithm: Specifies the algorithm to use for community detection. Can be 'LABELPROPAGATION' or 'LOUVAIN'.
deterministic: When set to True, ensures that each invocation (with the same machine configuration and parameter settings) produces the same final result.
direction: Specifies whether to consider the input graph as directed or undirected.
display: Specifies a list of results tables to send to the client for display.
distributed: When set to True, uses a distributed graph for processing.
fix: Specifies the variable that defines groups of nodes to fix together for community detection.
graph: Specifies the in-memory graph to use for the analysis.
indexOffset: Specifies the index offset for identifiers in the log and results output data tables.
labelUpdateMode: Specifies whether nodes update their labels according to the labels of their neighbors at the current iteration (ASYNCHRONOUS) or the previous iteration (SYNCHRONOUS).
linkRemovalRatio: Specifies the percentage of small-weight links to be removed around each node neighborhood.
links: Specifies the input data table that contains the graph link information.
linksVar: Specifies the data variable names for the links table.
logFreqTime: Controls the frequency in seconds for displaying iteration logs.
logLevel: Controls the amount of information that is displayed in the SAS log.
maxIters: Specifies the maximum number of iterations that the algorithm can run.
multiLinks: When set to True, includes multilinks when an input graph is read.
nodes: Specifies the input data table that contains the graph node information.
nodesVar: Specifies the data variable names for the nodes table.
nThreads: Specifies the maximum number of threads to use for multithreaded processing.
outCommLinks: Specifies the output data table to describe the links between each community.
outCommunity: Specifies the output data table to contain properties about each community.
outGraphList: Specifies the output data table to contain summary information about in-memory graphs.
outLevel: Specifies the output data table to contain community information at different resolution levels.
outLinks: Specifies the output data table to contain the graph link information along with any results from the algorithms that calculate metrics on links.
outNodes: Specifies the output data table to contain the graph node information along with any results from the algorithms that calculate metrics on nodes.
outOverlap: Specifies the output data table to describe the intensity of each node's membership to multiple communities.
outputTables: Lists the names of results tables to save as CAS tables on the server.
recursive: Breaks down large communities into smaller ones until the specified conditions are satisfied.
resolutionList: Specifies a list of resolution values for community detection.
selfLinks: When set to True, includes self-links when an input graph is read.
standardizedLabels: When set to True, specifies that the input graph data are in a standardized format.
standardizedLabelsOut: When set to True, requests that the output graph data include standardized format.
tolerance: Specifies the tolerance value for when to stop iterations.
warmStart: Specifies the variable that defines initial community identifiers for warm starting community detection.
Data Preparation
Data Creation

This example uses the `mycas.LinkSetIn` data table to represent the links of a graph. The `from` and `to` columns define the nodes of the links, and the `weight` column provides the link weights.

DATA mycas.LinkSetIn;
   INFILE DATALINES delimiter=',';
   INPUT from $ to $ weight;
   DATALINES;
A,B,1
A,C,1
A,D,1
B,C,1
B,D,1
C,D,1
E,F,1
E,G,1
F,G,1
;
RUN;

Examples

This example illustrates the use of the community detection algorithm on an undirected graph. It produces several output tables detailing the communities, nodes, links, and other properties.

SAS® / CAS Code
PROC CAS;
   /* Load the network action set before calling its actions */
   loadactionset "network";
   ACTION network.community /
      links        = {name = "LinkSetIn"}
      outNodes     = {name = "NodeSetOut", replace=true}
      outLinks     = {name = "LinkSetOut", replace=true}
      outLevel     = {name = "LevelSetOut", replace=true}
      outCommunity = {name = "CommSetOut", replace=true}
      outOverlap   = {name = "OverlapSetOut", replace=true}
      outCommLinks = {name = "CommLinkSetOut", replace=true};
RUN;
QUIT;
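Result:
The output data table `mycas.NodeSetOut` contains the community identifier of each node. The remaining tables hold the link-level output (`LinkSetOut`), community information at each resolution level (`LevelSetOut`), properties of each community (`CommSetOut`), each node's membership intensity across communities (`OverlapSetOut`), and the links between communities (`CommLinkSetOut`).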
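
The `resolutionList` and `outLevel` settings can be exercised on the same input. The following is a minimal sketch rather than a validated program: it assumes the `network` action set is available and that `LinkSetIn` is in the active caslib. The resolution values 1.0 and 0.5 are arbitrary illustrative choices, and the output table name `NodeSetOutRes` is hypothetical.

```sas
PROC CAS;
   /* Assumption: the network action set is installed and can be loaded */
   loadactionset "network";
   ACTION network.community /
      algorithm      = "LOUVAIN"
      links          = {name = "LinkSetIn"}
      /* Illustrative resolution values only; tune for your own graph */
      resolutionList = {1.0, 0.5}
      outLevel       = {name = "LevelSetOut", replace=true}
      /* NodeSetOutRes is a hypothetical table name */
      outNodes       = {name = "NodeSetOutRes", replace=true};
RUN;
QUIT;
```

Comparing the results across the listed resolution values is one way to judge how coarse or fine the detected communities should be for a given analysis.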

This example illustrates the use of the community detection algorithm on a directed graph. The `direction` parameter is set to 'DIRECTED'.

SAS® / CAS Code
PROC CAS;
   /* Load the network action set before calling its actions */
   loadactionset "network";
   ACTION network.community /
      direction = "directed"
      links     = {name = "LinkSetIn",
                   vars = {"from", "to", "weight"}}
      outNodes  = {name = "NodeSetOut", replace=true};
RUN;
QUIT;
Result:
The output data table `mycas.NodeSetOut` contains the community identifier of each node, computed with link direction taken into account.
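
The `algorithm`, `labelUpdateMode`, and `maxIters` settings are specified in the same way as `direction`. The sketch below is an untested illustration that switches to label propagation on the undirected graph. It assumes that `labelUpdateMode` applies when `algorithm` is "LABELPROPAGATION". The iteration limit of 20 is an arbitrary value, and `NodeSetOutLP` is a hypothetical table name.

```sas
PROC CAS;
   /* Assumption: the network action set is installed and can be loaded */
   loadactionset "network";
   ACTION network.community /
      algorithm       = "LABELPROPAGATION"
      /* SYNCHRONOUS: labels are updated from the previous iteration's neighbor labels */
      labelUpdateMode = "SYNCHRONOUS"
      /* Arbitrary illustrative iteration limit */
      maxIters        = 20
      links           = {name = "LinkSetIn"}
      /* NodeSetOutLP is a hypothetical table name */
      outNodes        = {name = "NodeSetOutLP", replace=true};
RUN;
QUIT;
```

Writing the result to a separate table keeps the earlier `NodeSetOut` output available for a side-by-side comparison of the community assignments.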

FAQ

What is the purpose of the network.community action?
What algorithms can be used for community detection with this action?
How does the Louvain algorithm work in this context?
What does the 'resolutionList' parameter do?
How can I handle directed versus undirected graphs?
What is the purpose of the 'outNodes' and 'outLinks' output tables?
How can I break down very large communities into smaller ones?
What information does the 'outOverlap' table provide?

Associated Scenarios

Use Case
Detection of Money Laundering Rings (Directed Graph)

A financial institution wants to identify potential money laundering rings where funds circulate in a closed loop between accounts. Since money flow is directional, the analysis...

Use Case
Granular Segmentation of Large Social Networks

A marketing firm analyzes a large social network. The initial communities are too large to be actionable for targeted campaigns. They need to recursively break down these giant ...

Use Case
IT Infrastructure Dependency & Overlap Analysis

An IT department maps server dependencies. Some servers (like gateways) belong to multiple subsystems. The goal is to identify these 'bridge' servers (Overlap) and analyze the n...