nominalVarsDimReduction

mca

Description

Reduces the dimensionality of nominal variables by using multiple correspondence analysis (MCA), a statistical technique for analyzing the relationships among a set of nominal variables. The action takes a CAS table as input, where each observation represents an object and each variable represents a nominal characteristic of that object. It produces an output table that contains the reduced-dimensional representation of the input data.
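For orientation, the standard textbook formulation of MCA treats it as a correspondence analysis of the indicator matrix. The summary below is a sketch of that general formulation only; the symbols are introduced here for illustration, and the page does not state which algorithm or coordinate scaling the action uses internally.

```latex
% Z : n x J indicator (one-hot) matrix for n observations and Q nominal
%     variables with J levels in total, so every row of Z sums to Q.
\begin{aligned}
P &= \frac{1}{nQ}\, Z, \qquad r = P\mathbf{1}, \qquad c = P^{\top}\mathbf{1},\\
D_r &= \operatorname{diag}(r), \qquad D_c = \operatorname{diag}(c),\\
S &= D_r^{-1/2}\,\bigl(P - r c^{\top}\bigr)\, D_c^{-1/2} = U \Sigma V^{\top},\\
F &= D_r^{-1/2}\, U \Sigma .
\end{aligned}
```

In this formulation the first k columns of F are the row (observation) coordinates, with k playing the role of the dimensions parameter.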

nominalVarsDimReduction.mca / dimensions=integer, table={...}, display={...}, freq="variable-name", id={"variable-name-1", ...}, inputs={{...}, ...}, nominals={{...}, ...}, output={...}, outputTables={...}, prefix="string", saveState={...};
Settings
Parameter      Description
dimensions     Specifies the number of reduced variables.
table          Specifies the input table that contains the data to be analyzed.
display        Specifies a list of results tables to send to the client for display.
freq           Specifies the frequency variable used for the analysis.
id             Specifies the variables to use as record identifiers and to transfer to the output table.
inputs         Specifies the variables to use in the analysis.
nominals       Specifies the nominal variables to use in the training.
output         Specifies the output data table that contains the values of the reduced variables for the training nominal data.
outputTables   Lists the names of results tables to be saved as CAS tables on the server.
prefix         Specifies a prefix to apply to the names of the reduced variables in the output table.
saveState      Specifies the output data table in which to save the dimensionality reduction model for future scoring.
Data Preparation
Creating the Input Data Set

This example uses a fictional data set about car preferences. The data set contains several nominal variables such as the car's origin, type, and the driver's gender and marital status. We will load this data into a CAS table named 'cars_data'.

/* Assumes an active CAS session and a CAS libref named casuser. */
DATA casuser.cars_data;
   /* Declare lengths so that values longer than 8 characters (for example, Chevrolet) are not truncated. */
   LENGTH Make $ 12 Model $ 12 Origin $ 8 Type $ 8 Gender $ 8 MaritalStatus $ 8;
   INFILE DATALINES;
   INPUT ID Make $ Model $ Origin $ Type $ Gender $ MaritalStatus $;
   DATALINES;
1 Toyota Camry USA Sedan Male Married
2 Honda Civic Japan Sedan Female Single
3 Ford F-150 USA Truck Male Married
4 Chevrolet Malibu USA Sedan Female Divorced
5 Nissan Rogue Japan SUV Male Single
6 Hyundai Elantra Korea Sedan Female Married
7 BMW 3-Series Germany Sedan Male Single
8 Audi A4 Germany Sedan Female Married
9 Jeep Wrangler USA SUV Male Divorced
10 Kia Sorento Korea SUV Female Single
;
RUN;
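
Before running the analysis, it can help to confirm that the table loaded as expected and to see how many levels each nominal variable has. The following is a minimal sketch that uses the general-purpose table.columnInfo and simple.distinct actions; it assumes the table was created in the casuser caslib as in the step above.

```sas
proc cas;
   /* List column names, types, and lengths of the input table */
   table.columnInfo / table={name='cars_data', caslib='casuser'};

   /* Count the distinct levels of each nominal variable */
   simple.distinct /
      table={name='cars_data', caslib='casuser'},
      inputs={'Make', 'Origin', 'Type', 'Gender', 'MaritalStatus'};
run;
quit;
```

With only ten observations, a variable such as Make has a distinct level in every row, which is worth keeping in mind when interpreting the reduced dimensions.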

Examples

This example performs a basic MCA to reduce the dimensionality of the nominal variables 'Make', 'Origin', and 'Type' to 2 dimensions. The results are stored in an output table, keeping the ID variable for identification.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
   nominalVarsDimReduction.mca /
      table={name='cars_data'},
      nominals={'Make', 'Origin', 'Type'},
      dimensions=2,
      output={casout={name='cars_mca_basic', replace=true}, copyVars={'ID'}};
RUN;
Result:
The action creates the output table 'cars_mca_basic' in the active caslib (casuser in this example). The table contains the original 'ID' variable and two new variables for the reduced dimensions, named 'rv1' and 'rv2' by default.
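
To check what the action actually wrote, including the names given to the reduced variables, the output table can be inspected with the general-purpose table actions. This is a small sketch that assumes 'cars_mca_basic' is in the active caslib.

```sas
proc cas;
   /* Show the columns of the output table, including the reduced variables */
   table.columnInfo / table={name='cars_mca_basic'};

   /* Print the first rows of the output table */
   table.fetch / table={name='cars_mca_basic'}, to=10;
run;
quit;
```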

This example performs a more detailed MCA on all nominal variables in the table. It generates 3 dimensions, prefixes the new variable names with 'MCA_Dim', includes all original variables in the output, and saves the scoring model to a separate CAS table for later use.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
   nominalVarsDimReduction.mca /
      table={name='cars_data'},
      nominals={'Make', 'Origin', 'Type', 'Gender', 'MaritalStatus'},
      dimensions=3,
      id={'ID'},
      prefix='MCA_Dim',
      output={casout={name='cars_mca_detailed', replace=true}, copyVars='ALL'},
      saveState={name='mca_model_state', replace=true};
RUN;
Result:
The action creates two tables in the active caslib: 'cars_mca_detailed', which contains the original data plus three new dimension variables named 'MCA_Dim1', 'MCA_Dim2', and 'MCA_Dim3', and 'mca_model_state', which contains the scoring model for applying this MCA transformation to new data.
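
The saved state can then be used to score new observations. The sketch below assumes that 'mca_model_state' is stored as an analytic store, which is the usual form of saveState output in CAS but is not confirmed on this page, and that it can therefore be scored with the aStore action set. The table names 'cars_new' and 'cars_new_mca' are hypothetical; 'cars_new' stands for a table with the same nominal variables as the training data.

```sas
proc cas;
   loadactionset 'aStore';

   /* Inspect the saved state: expected input and output variables */
   aStore.describe / rstore={name='mca_model_state'};

   /* Apply the saved transformation to new data (hypothetical table cars_new) */
   aStore.score /
      table={name='cars_new'},
      rstore={name='mca_model_state'},
      copyVars={'ID'},
      out={name='cars_new_mca', replace=true};
run;
quit;
```

Running aStore.describe first is a cheap way to confirm that the input variable names and types in the new table match what the saved model expects.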

FAQ

What is the purpose of the mca action in SAS Viya?
How do I specify the number of new variables to be created?
Which parameter is used to define the input table for the analysis?
How can I save the dimensionality reduction model for future scoring?
Is it possible to add a custom prefix to the names of the output reduced variables?
How can I specify which variables are nominal for the training?