nominalVarsDimReduction

mca

Description

Reduces the dimensionality of nominal variables by using multiple correspondence analysis (MCA), a statistical technique for analyzing the relationships among a set of nominal variables. The action takes a CAS table as input, where each observation represents an object and each variable represents a nominal characteristic of that object. It produces an output table that contains the reduced-dimensional representation of the input data.
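For orientation, the standard textbook formulation of MCA as a correspondence analysis of an indicator matrix can be sketched as follows. This describes the general technique and is not a statement of how the mca action computes its results internally. The symbols (Z, P, r, c, S, F, n, Q, J, k) are introduced here for illustration only, with k playing the role of the dimensions parameter.

```latex
% Z: n x J indicator (one-hot) matrix for n observations and Q nominal variables,
%    where J is the total number of categories; each row of Z sums to Q.
% D_r = diag(r) and D_c = diag(c) hold the row and column masses.
\begin{align*}
P &= \frac{1}{nQ}\, Z, \qquad r = P\,\mathbf{1}, \qquad c = P^{\top}\mathbf{1}, \\
S &= D_r^{-1/2}\,\bigl(P - r\,c^{\top}\bigr)\,D_c^{-1/2} = U\,\Sigma\,V^{\top}, \\
F &= D_r^{-1/2}\,U\,\Sigma .
\end{align*}
% The first k columns of F give the k reduced coordinates of each observation.
```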

nominalVarsDimReduction.mca / dimensions=integer, table={...}, display={...}, freq="variable-name", id={"variable-name-1", ...}, inputs={{...}, ...}, nominals={{...}, ...}, output={...}, outputTables={...}, prefix="string", saveState={...};
Settings
Parameter      Description
dimensions     Specifies the number of reduced variables.
table          Specifies the input table that contains the data to be analyzed.
display        Specifies a list of results tables to send to the client for display.
freq           Specifies the frequency variable used for the analysis.
id             Specifies the variables to use as record identifiers and to transfer to the output table.
inputs         Specifies the variables to use in the analysis.
nominals       Specifies the nominal variables to use in the training.
output         Specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables   Lists the names of results tables to be saved as CAS tables on the server.
prefix         Specifies a prefix to apply to the names of the reduced variables in the output table.
saveState      Specifies the output data table in which to save the dimensionality reduction model for future scoring.
Data Preparation
Creating the Input Data Set

This example uses a fictional data set about car preferences. The data set contains several nominal variables such as the car's origin, type, and the driver's gender and marital status. We will load this data into a CAS table named 'cars_data'.

/* Assumes an active CAS session with a CASUSER libref assigned. */
DATA casuser.cars_data;
   INFILE DATALINES;
   /* Make uses the :$12. informat because the default character length of 8 */
   /* would truncate 'Chevrolet'. The other values fit within 8 characters.  */
   INPUT ID Make :$12. Model $ Origin $ Type $ Gender $ MaritalStatus $;
   DATALINES;
1 Toyota Camry USA Sedan Male Married
2 Honda Civic Japan Sedan Female Single
3 Ford F-150 USA Truck Male Married
4 Chevrolet Malibu USA Sedan Female Divorced
5 Nissan Rogue Japan SUV Male Single
6 Hyundai Elantra Korea Sedan Female Married
7 BMW 3-Series Germany Sedan Male Single
8 Audi A4 Germany Sedan Female Married
9 Jeep Wrangler USA SUV Male Divorced
10 Kia Sorento Korea SUV Female Single
;
RUN;

Examples

This example performs a basic MCA to reduce the dimensionality of the nominal variables 'Make', 'Origin', and 'Type' to 2 dimensions. The results are stored in an output table, keeping the ID variable for identification.

SAS® / CAS Code
PROC CAS;
   nominalVarsDimReduction.mca /
      TABLE={name='cars_data'},
      nominals={'Make', 'Origin', 'Type'},   /* variables to analyze */
      dimensions=2,                          /* number of reduced variables */
      OUTPUT={casout={name='cars_mca_basic', replace=true}, copyVars={'ID'}};
RUN;
QUIT;
Result:
The action generates an output table 'cars_mca_basic' in the active caslib (CASUSER in this example). This table contains the original 'ID' variable and two new variables representing the reduced dimensions, named 'rv1' and 'rv2' by default.
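To check what the action wrote, one option is to list the columns of the output table and fetch its first rows. This is a minimal sketch, assuming that the table action set is available in the session and that 'cars_mca_basic' resides in the active caslib.

```sas
PROC CAS;
   /* List the column names and types of the output table */
   table.columnInfo / table={name='cars_mca_basic'};
   /* Display up to 10 rows: ID plus the reduced variables */
   table.fetch / table={name='cars_mca_basic'}, to=10;
RUN;
QUIT;
```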

This example performs a more detailed MCA on five of the nominal variables in the table ('Make', 'Origin', 'Type', 'Gender', and 'MaritalStatus'; 'Model' is left out). It generates 3 dimensions, prefixes the new variable names with 'MCA_Dim', includes all original variables in the output, and saves the scoring model to a separate CAS table for later use.

SAS® / CAS Code
PROC CAS;
   nominalVarsDimReduction.mca /
      TABLE={name='cars_data'},
      nominals={'Make', 'Origin', 'Type', 'Gender', 'MaritalStatus'},
      dimensions=3,
      id={'ID'},                                       /* record identifier */
      prefix='MCA_Dim',                                /* names become MCA_Dim1, MCA_Dim2, ... */
      OUTPUT={casout={name='cars_mca_detailed', replace=true}, copyVars='ALL'},
      saveState={name='mca_model_state', replace=true}; /* model saved for scoring */
RUN;
QUIT;
Result:
This creates two tables in the active caslib: 'cars_mca_detailed', which contains the original data plus three new dimension variables named 'MCA_Dim1', 'MCA_Dim2', and 'MCA_Dim3'; and 'mca_model_state', which contains the scoring model for applying this MCA transformation to new data.
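The saved model can then be applied to new observations. The sketch below assumes that 'mca_model_state' is an analytic store that the aStore.score action can consume, which this page does not confirm. The table names 'cars_new' and 'cars_new_scored' are hypothetical, and 'cars_new' would need to contain the same five nominal variables that were used in training.

```sas
PROC CAS;
   loadactionset 'aStore';
   aStore.score /
      table={name='cars_new'},                     /* hypothetical table of new observations */
      rstore={name='mca_model_state'},             /* state table written by saveState */
      out={name='cars_new_scored', replace=true};  /* hypothetical output table name */
RUN;
QUIT;
```

If the saved state turns out not to be an analytic store, this call will fail and a different scoring route is needed.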

FAQ

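What is the purpose of the mca action in SAS Viya?
It reduces the dimensionality of nominal variables by using multiple correspondence analysis and writes the reduced representation to an output table.
How do I specify the number of new variables to be created?
Use the dimensions parameter, which specifies the number of reduced variables.
Which parameter is used to define the input table for the analysis?
The table parameter specifies the input table that contains the data to be analyzed.
How can I save the dimensionality reduction model for future scoring?
Use the saveState parameter to name the output table in which the model is saved.
Is it possible to add a custom prefix to the names of the output reduced variables?
Yes. The prefix parameter sets the prefix applied to the names of the reduced variables in the output table.
How can I specify which variables are nominal for the training?
List them in the nominals parameter.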