Gaussian mixture model. This action performs clustering on input data using a Gaussian mixture model.
| Parameter | Description |
|---|---|
| alpha | Specifies the concentration parameter for the Dirichlet process. |
| attributes | Changes the attributes of variables used in this action. Currently, attributes specified on the inputs and nominals parameters are ignored. For more information about specifying the attributes parameter, see the common casinvardesc parameter (Appendix A: Common Parameters). |
| clusterCovOut | Creates a table on the server that contains the covariance matrix of each cluster. For more information about specifying the clusterCovOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters). |
| clusterSumOut | Creates a table on the server that contains the summary of the clustering results including the size, neighbor, and mean of each cluster. For more information about specifying the clusterSumOut parameter, see the common casouttable (Form 1) parameter (Appendix A: Common Parameters). |
| display | Specifies a list of results tables to send to the client for display. For more information about specifying the display parameter, see the common displayTables parameter (Appendix A: Common Parameters). |
| freq | Names the numeric variable that contains the frequency of occurrence for each observation. |
| inference | Specifies the inference method to use in the analysis. The method parameter determines which other parameters apply. The only method currently available is "VB" (variational Bayesian). Parameters for method="VB": covariance="DIAGONAL" \| "FULL" specifies the covariance matrix type of the Gaussian mixtures (default: DIAGONAL); maxVbIter=64-bit-integer specifies the number of iterations for the variational Bayesian (VB) inference; threshold=double specifies the threshold of the difference between the current and previous likelihoods. |
| inputs | Specifies variables to use for analysis. For more information about specifying the inputs parameter, see the common casinvardesc parameter (Appendix A: Common Parameters). |
| maxClusters | Specifies the maximum number of clusters. |
| output | Creates a table on the server that contains the predicted cluster as well as the probability distribution over all obtained clusters for each observation. For more information about specifying the output parameter, see the common outputStatement parameter (Appendix A: Common Parameters). |
| outputTables | Lists the names of results tables to save as CAS tables on the server. For more information about specifying the outputTables parameter, see the common outputTables parameter (Appendix A: Common Parameters). |
| saveState | Specifies the table in which to save the model state for future model prediction. The casouttable value can be one or more of the following: caslib="string" specifies the name of the caslib for the output table. label="string" specifies the descriptive label to associate with the table. lifetime=64-bit-integer specifies the number of seconds to keep the table in memory after it is last accessed. The table is dropped if it is not accessed for the specified number of seconds. memoryFormat="DVR" \| "INHERIT" \| "STANDARD" specifies the memory format for the output table. name="table-name" specifies the name for the output table. promote=TRUE \| FALSE, when set to True, adds the output table with a global scope. This enables other sessions to access the table, subject to access controls. The target caslib must also have a global scope. replace=TRUE \| FALSE, when set to True, overwrites an existing table that has the same name. tableRedistUpPolicy="DEFER" \| "NOREDIST" \| "REBALANCE" specifies the table redistribution policy when the number of worker pods increases on a running CAS server. |
| seed | Specifies a double to use to start the pseudorandom number generator for initialization. |
| table | Specifies the input data table. The castable value can be one or more of the following: caslib="string" specifies the caslib for the input table that you want to use with the action. By default, the active caslib is used. Specify a value only if you need to access a table from a different caslib. computedOnDemand=TRUE \| FALSE, when set to True, creates the computed variables when the table is loaded instead of when the action begins. computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}} specifies the names of the computed variables to create. Specify an expression for each variable in the computedVarsProgram parameter. If you do not specify this parameter, then all variables from computedVarsProgram are automatically included. computedVarsProgram="string" specifies an expression for each computed variable that you include in the computedVars parameter. dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2>} specifies data source options. importOptions={fileType="ANY" \| "AUDIO" \| "AUTO" \| "BASESAS" \| "CSV" \| "DELIMITED" \| "DOCUMENT" \| "DTA" \| "ESP" \| "EXCEL" \| "FMT" \| "HDAT" \| "IMAGE" \| "JMP" \| "LASR" \| "PARQUET" \| "SOUND" \| "SPSS" \| "VIDEO" \| "XLS", fileType-specific-parameters} specifies the settings for reading a table from a data source. name="table-name" specifies the name of the input table. singlePass=TRUE \| FALSE, when set to True, does not create a transient table on the server. Setting this parameter to True can be efficient, but the data might not have stable ordering upon repeated runs. vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}} specifies the variables to use in the action. where="where-expression" specifies an expression for subsetting the input data. whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters \| bigquery-parameters \| cas_noreq-parameters \| clouddex-parameters \| db2-parameters \| dnfs-parameters \| esp-parameters \| fedsvr-parameters \| gcs_noreq-parameters \| hadoop-parameters \| hana-parameters \| impala-parameters \| informix-parameters \| jdbc-parameters \| mongodb-parameters \| mysql-parameters \| odbc-parameters \| oracle-parameters \| path-parameters \| postgres-parameters \| redshift-parameters \| s3-parameters \| sapiq-parameters \| sforce-parameters \| singlestore_standard-parameters \| snowflake-parameters \| spark-parameters \| spde-parameters \| sqlserver-parameters \| ss_noreq-parameters \| teradata-parameters \| vertica-parameters \| yellowbrick-parameters}, importOptions={fileType="ANY" \| "AUDIO" \| "AUTO" \| "BASESAS" \| "CSV" \| "DELIMITED" \| "DOCUMENT" \| "DTA" \| "ESP" \| "EXCEL" \| "FMT" \| "HDAT" \| "IMAGE" \| "JMP" \| "LASR" \| "PARQUET" \| "SOUND" \| "SPSS" \| "VIDEO" \| "XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"} specifies an input table that contains rows to use as a WHERE filter. If the vars parameter is not specified, then all the variable names that are common to the input table and the filtering table are used to find matching rows. If the where parameter for the input table and this parameter are specified, then this filtering table is applied first. |
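
The parameters in the table above nest inside one another (for example, covariance, maxVbIter, and threshold belong to inference), which is easier to see in a concrete call. The following is a minimal Python sketch, not an official example: it only assembles the parameters as a plain dictionary. The helper `build_gmm_params`, the table and variable names, and every numeric value are hypothetical. The `{"name": ...}` form for inputs and the name/replace keys on clusterSumOut are assumptions based on the common casinvardesc and casouttable parameters. The commented-out client call assumes the SWAT package and that the action is exposed as `nonParametricBayes.gmm`; this section does not confirm either.

```python
def build_gmm_params(table_name, input_vars, max_clusters=10, alpha=1.0,
                     covariance="DIAGONAL", max_vb_iter=100,
                     threshold=0.01, seed=12345.0):
    """Assemble a parameter dict that mirrors the table above.

    All default values here are illustrative placeholders, not the
    action's documented defaults (only covariance=DIAGONAL is stated).
    """
    covariance = covariance.upper()
    if covariance not in ("DIAGONAL", "FULL"):
        raise ValueError("covariance must be 'DIAGONAL' or 'FULL'")
    if max_clusters < 1:
        raise ValueError("max_clusters must be at least 1")

    return {
        # castable: only the table name is set in this sketch
        "table": {"name": table_name},
        # casinvardesc entries; the {"name": ...} form is an assumption
        "inputs": [{"name": v} for v in input_vars],
        "maxClusters": max_clusters,
        "alpha": alpha,   # concentration parameter for the Dirichlet process
        "seed": seed,     # described above as a double
        # sub-parameters that apply when method is "VB"
        "inference": {
            "method": "VB",
            "covariance": covariance,
            "maxVbIter": max_vb_iter,
            "threshold": threshold,
        },
        # casouttable values; name and replace are listed under saveState,
        # and are assumed to apply to clusterSumOut as well
        "clusterSumOut": {"name": table_name + "_clusters", "replace": True},
        "saveState": {"name": table_name + "_gmm_state", "replace": True},
    }


params = build_gmm_params("customers", ["income", "age"], covariance="full")

# Hypothetical submission through the SWAT client (requires a CAS server):
#   import swat
#   conn = swat.CAS("cas-host.example.com", 5570)
#   conn.loadactionset("nonParametricBayes")   # action set name is an assumption
#   result = conn.nonParametricBayes.gmm(**params)
```

Keeping parameter assembly separate from the server call makes it possible to check the nesting locally before a CAS session is available.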