simple

groupBy

Description

Builds BY groups in terms of the variable value combinations given the variables in the variable list.

CASL: simple.groupBy <result=results> <status=rc> /\n aggregator="MAX" | "MEAN" | "MIN" | "N" | "SUM",\n attributes={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},\n casOut={caslib="string", compress=TRUE | FALSE, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},\n descending=TRUE | FALSE,\n freq="variable-name",\n includeMissing=TRUE | FALSE,\n inputs={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},\n partition=TRUE | FALSE,\n repeatedTable=TRUE | FALSE,\n resultLimit=integer,\n scoreGt=double,\n scoreLt=double,\n *table={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},\n weight="variable-name"\n ;\nLUA: results, info = s:simple_groupBy{\n aggregator="MAX" | "MEAN" | "MIN" | "N" | "SUM",\n attributes={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},\n casOut={caslib="string", compress=true | false, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=true | false, replace=true | false, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},\n descending=true | false,\n freq="variable-name",\n includeMissing=true | false,\n inputs={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}},\n partition=true | false,\n repeatedTable=true | false,\n resultLimit=integer,\n scoreGt=double,\n scoreLt=double,\n *table={caslib="string", computedOnDemand=true | false, computedVars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", singlePass=true | false, vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, *name="table-name", vars={{format="string", formattedLength=integer, label="string", *name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},\n weight="variable-name"\n};\nPYTHON: results=s.simple.groupBy(\n aggregator="MAX" | "MEAN" | "MIN" | "N" | "SUM",\n attributes=[{"format":"string", "formattedLength":integer, "label":"string", "name":"variable-name", "nfd":integer, "nfl":integer}, {...}],\n casOut={"caslib":"string", "compress":True | False, "indexVars":["variable-name-1" <, "variable-name-2", ...>], "label":"string", "lifetime":64-bit-integer, "maxMemSize":64-bit-integer, "memoryFormat":"DVR" | "INHERIT" | "STANDARD", "name":"table-name", "promote":True | False, "replace":True | False, "replication":integer, "tableRedistUpPolicy":"DEFER" | "NOREDIST" | "REBALANCE", "threadBlockSize":64-bit-integer, "timeStamp":"string", "where":["string-1" <, "string-2", ...>]},\n descending=True | False,\n freq="variable-name",\n includeMissing=True | False,\n inputs=[{"format":"string", "formattedLength":integer, "label":"string", "name":"variable-name", "nfd":integer, "nfl":integer}, {...}],\n partition=True | False,\n repeatedTable=True | False,\n resultLimit=integer,\n scoreGt=double,\n scoreLt=double,\n table={"caslib":"string", "computedOnDemand":True | False, "computedVars":[{"format":"string", "formattedLength":integer, "label":"string", "name":"variable-name", "nfd":integer, "nfl":integer}, {...}], "computedVarsProgram":"string", "dataSourceOptions":{"key-1":{any-list-or-data-type-1} <, "key-2":{any-list-or-data-type-2}, ...>}, "importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, "name":"table-name", "singlePass":True | False, "vars":[{"format":"string", "formattedLength":integer, "label":"string", "name":"variable-name", "nfd":integer, "nfl":integer}, {...}], "where":"where-expression", "whereTable":{"casLib":"string", "dataSourceOptions":{adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, "importOptions":{"fileType":"ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, "name":"table-name", "vars":[{"format":"string", "formattedLength":integer, "label":"string", "name":"variable-name", "nfd":integer, "nfl":integer}, {...}], "where":"where-expression"}},\n weight="variable-name"\n)\nR: results <- cas.simple.groupBy(s,\n aggregator="MAX" | "MEAN" | "MIN" | "N" | "SUM",\n attributes=list( list(format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer) <, list(...)>),\n casOut=list(caslib="string", compress=TRUE | FALSE, indexVars=list("variable-name-1" <, "variable-name-2", ...>), label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where=list("string-1" <, "string-2", ...>)),\n descending=TRUE | FALSE,\n freq="variable-name",\n includeMissing=TRUE | FALSE,\n inputs=list( list(format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer) <, list(...)>),\n partition=TRUE | FALSE,\n repeatedTable=TRUE | FALSE,\n resultLimit=integer,\n scoreGt=double,\n scoreLt=double,\n table=list(caslib="string", computedOnDemand=TRUE | FALSE, computedVars=list( list(format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer) <, list(...)>), computedVarsProgram="string", dataSourceOptions=list(key-1=list(any-list-or-data-type-1) <, key-2=list(any-list-or-data-type-2), ...>), importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters), name="table-name", singlePass=TRUE | FALSE, vars=list( list(format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer) <, list(...)>), where="where-expression", whereTable=list(casLib="string", dataSourceOptions=list(adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters), importOptions=list(fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters), name="table-name", vars=list( list(format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer) <, list(...)>), where="where-expression")),\n weight="variable-name"\n)
Settings
ParameterDescription
aggregatorspecifies the aggregator for which the values of the weight variable are rolled up into a rank order score if a weight variable is specified. Alias: agg Default: SUM MAX: specifies the maximum value. MEAN: specifies the arithmetic mean. MIN: specifies the minimum value. N: specifies the number of observations. SUM: specifies the sum of nonmissing values.
attributesspecifies the variable attributes. For more information about specifying the attributes parameter, see the common casinvardesc parameter. Aliases: attribute, attrs, attr, varAttrs
casOutspecifies the settings for an output table. For more information about specifying the casOut parameter, see the common casouttable parameter.
descendingwhen set to True, the levels of the group-by variables are arranged in descending order. Default: FALSE
freqspecifies the numeric frequency variable used to compute the ranking of a distinct grouping. Alias: frequency
includeMissingwhen set to True, missing values are included in the determination of distinct values in group-by analysis. Default: TRUE
inputsspecifies the input variables for the analysis. For more information about specifying the inputs parameter, see the common casinvardesc parameter. Alias: input
partitionwhen set to True, the table is assumed to be partitioned. Default: FALSE
repeatedTablewhen set to True, generate a repeated group-by table with the same rows on all worker nodes. Default: FALSE
resultLimitspecifies the maximum number of distinct groupings to return. Alias: limit Range: 0–MACINT
scoreGtspecifies the exclusive lower bound of the numeric rank order scores of the distinct groupings to return.
scoreLtspecifies the exclusive upper bound of the numeric rank order scores of the distinct groupings to return.
tablespecifies the table name, caslib, and other common parameters. Long form: table={name="table-name"} Shortcut form: table="table-name" The castable value can be one or more of the following: caslib: specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib. computedOnDemand: when set to True, creates the computed variables when the table is loaded instead of when the action begins. Alias: compOnDemand Default: FALSE computedVars: specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included. Alias: compVars format: specifies the format to apply to the variable. formattedLength: specifies the length of the format field plus the length of the format precision. label: specifies the descriptive label for the variable. name: specifies the name for the variable. nfd: specifies the length of the format precision. nfl: specifies the length of the format field. computedVarsProgram: specifies an expression for each computed variable that you include in the computedVars parameter. Alias: compPgm dataSourceOptions: specifies data source options. Aliases: options, dataSource importOptions: specifies the settings for reading a table from a data source. Alias: import For more information about specifying the importOptions parameter, see the common importOptions parameter. name: specifies the name of the input table. singlePass: when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs. Default: FALSE vars: specifies the variables to use in the action. format: specifies the format to apply to the variable. formattedLength: specifies the length of the format field plus the length of the format precision. label: specifies the descriptive label for the variable. name: specifies the name for the variable. nfd: specifies the length of the format precision. nfl: specifies the length of the format field. where: specifies an expression for subsetting the input data. whereTable: specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first. casLib: specifies the caslib for the filter table. By default, the active caslib is used. dataSourceOptions: specifies data source options. Aliases: options, dataSource For more information about specifying the dataSourceOptions parameter, see the common dataSourceOptions parameter. importOptions: specifies the settings for reading a table from a data source. Alias: import For more information about specifying the importOptions parameter, see the common importOptions parameter. name: specifies the name of the filter table. vars: specifies the variable names to use from the filter table. format: specifies the format to apply to the variable. formattedLength: specifies the length of the format field plus the length of the format precision. label: specifies the descriptive label for the variable. name: specifies the name for the variable. nfd: specifies the length of the format precision. nfl: specifies the length of the format field. where: specifies an expression for subsetting the data from the filter table.
weightspecifies the numeric weight variable used to compute the rank order score of a distinct grouping.

Examples

FAQ

What is the purpose of the groupBy action in SAS Visual Analytics?
What is the 'aggregator' parameter used for in the groupBy action?
What are the available options for the 'aggregator' parameter in groupBy action?
How do I specify variable attributes using the 'attributes' parameter in the groupBy action?
What does the 'casOut' parameter do in the groupBy action?
What is the effect of the 'descending' parameter in the groupBy action?
How is the 'freq' parameter used in the groupBy action?
What does 'includeMissing' control in the groupBy action?
What is the 'inputs' parameter for in the groupBy action?
When should I use the 'partition' parameter in the groupBy action?
What does 'repeatedTable' do in the groupBy action?
How can I limit the number of distinct groupings returned by the groupBy action?
How do 'scoreGt' and 'scoreLt' filter results in the groupBy action?
What is the 'table' parameter used for in the groupBy action?
What is the 'weight' parameter in the groupBy action?
What is the "groupBy" action?
What are the common parameters for the "groupBy" action?
What is the 'aggregator' parameter?
What does 'aggregator="MAX"' mean?
What does 'aggregator="MEAN"' mean?
What does 'aggregator="MIN"' mean?
What does 'aggregator="N"' mean?
What does 'aggregator="SUM"' mean?
What is the 'attributes' parameter?
What is the 'casOut' parameter?
What is the 'descending' parameter?
What is the 'freq' parameter?
What is the 'includeMissing' parameter?
What is the 'inputs' parameter?
What is the 'partition' parameter?
What is the 'repeatedTable' parameter?
What is the 'resultLimit' parameter?
What is the 'scoreGt' parameter?
What is the 'scoreLt' parameter?
What is the required 'table' parameter?
What is the 'table.caslib' subparameter?
What is the 'table.computedOnDemand' subparameter?
What is the 'table.computedVars' subparameter?
What is the 'table.computedVars.format' subparameter?
What is the 'table.computedVars.formattedLength' subparameter?
What is the 'table.computedVars.label' subparameter?
What is the required 'table.computedVars.name' subparameter?
What is the 'table.computedVars.nfd' subparameter?
What is the 'table.computedVars.nfl' subparameter?
What is the 'table.computedVarsProgram' subparameter?
What is the 'table.dataSourceOptions' subparameter?
What is the 'table.importOptions' subparameter?
What is the required 'table.name' subparameter?
What is the 'table.singlePass' subparameter?
What is the 'table.vars' subparameter?
What is the 'table.vars.format' subparameter?
What is the 'table.vars.formattedLength' subparameter?
What is the 'table.vars.label' subparameter?
What is the required 'table.vars.name' subparameter?
What is the 'table.vars.nfd' subparameter?
What is the 'table.vars.nfl' subparameter?
What is the 'table.where' subparameter?
What is the 'table.whereTable' subparameter?
What is the 'table.whereTable.casLib' subparameter?
What is the 'table.whereTable.dataSourceOptions' subparameter?
What is the 'table.whereTable.importOptions' subparameter?
What is the required 'table.whereTable.name' subparameter?
What is the 'table.whereTable.vars' subparameter?
What is the 'table.whereTable.vars.format' subparameter?
What is the 'table.whereTable.vars.formattedLength' subparameter?
What is the 'table.whereTable.vars.label' subparameter?
What is the required 'table.whereTable.vars.name' subparameter?
What is the 'table.whereTable.vars.nfd' subparameter?
What is the 'table.whereTable.vars.nfl' subparameter?
What is the 'table.whereTable.where' subparameter?
What is the 'weight' parameter?