The `annTrain` action, part of the `neuralNet` action set, is used to train an artificial neural network (ANN) in SAS Viya. This process involves adjusting the network's weights based on a given dataset to minimize prediction errors. The action supports various architectures like Multi-Layer Perceptrons (MLP), Generalized Linear Models (GLIM), and direct connection models. It offers extensive customization options, including different activation functions, optimization algorithms (like LBFGS and SGD), and data standardization methods, making it a versatile tool for building predictive models.
| Parameter | Description |
|---|---|
| acts | Specifies the activation function for the neurons on each hidden layer. |
| applyRowOrder | Specifies that the action should use a prespecified row ordering. |
| arch | Specifies the network architecture to be trained (MLP, GLIM, or DIRECT). |
| attributes | Specifies temporary attributes, such as a format, to apply to input variables. |
| bias | Specifies a fixed bias value for all hidden and output neurons, which will not be optimized. |
| casOut | Specifies the output table for the trained model. |
| code | Requests that the action produce SAS score code for deployment. |
| combs | Specifies the combination function for the neurons on each hidden layer. |
| delta | Specifies the annealing parameter for simulated annealing (SA) global optimization. |
| dropOut | Specifies the dropout ratio for the hidden layers, valid only with SGD optimization and linear combinations. |
| dropOutInput | Specifies the dropout ratio for the input layers, valid only with SGD optimization and linear combinations. |
| errorFunc | Specifies the error function to train the network (e.g., ENTROPY, NORMAL). |
| freq | Specifies a numeric variable that contains the frequency of occurrence of each observation. |
| fullWeights | Generates the full weight model for LBFGS optimization. |
| hiddens | Specifies the number of hidden neurons for each hidden layer in the model. |
| includeBias | When set to False, bias parameters are not included for the hidden and output units. |
| inputs | Specifies the input variables to use in the analysis. |
| inversePriors | Calculates the weight for prediction error based on the inverse of class frequencies. |
| listNode | Specifies which nodes (input, hidden, output, or all) to include in the scoring output table. |
| missing | Specifies how to impute missing values for input or target variables. |
| modelId | Specifies a model ID variable name to be included in the generated DATA step scoring code. |
| modelTable | Specifies the table containing a pre-trained model whose weights are used to initialize the network. |
| nAnns | Specifies the number of networks to select from multiple tries, based on the smallest error. |
| nloOpts | Specifies the nonlinear optimization options. |
| nominals | Specifies the nominal input and target variables to use in the analysis. |
| nTries | Specifies the number of training attempts with random initial weights. |
| randDist | Specifies the distribution for randomly generating initial network connection weights. |
| resume | Resumes a training optimization using weights from a previous training session. |
| samplingRate | Specifies the fraction of the data to use for training the neural network. |
| saveState | Specifies the table in which to save the model state for future scoring. |
| scaleInit | Specifies how to scale the initial weights. |
| seed | Specifies the random number seed for initializing network weights. |
| std | Specifies the standardization method to use on the interval variables. |
| step | Specifies a step size for weight perturbations during Monte Carlo or simulated annealing. |
| t | Specifies the artificial temperature parameter for Monte Carlo or simulated annealing. |
| table | Specifies the input table containing the training data. |
| target | Specifies the target or response variable for training. |
| targetAct | Specifies the activation function for the neurons on the output layer. |
| targetComb | Specifies the combination function for the neurons on the target output nodes. |
| targetMissing | Specifies how to impute missing values for the target variable. |
| targetStd | Specifies the standardization method to use on the target variable. |
| validTable | Specifies the table with validation data for early stopping. |
| weight | Specifies a variable to weight the prediction errors for each observation during training. |
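The `errorFunc` choices correspond to standard loss functions: ENTROPY is the cross-entropy loss typically used for classification targets, and NORMAL is the squared-error loss used for interval targets. As a point of reference only (this is plain Python, not SAS code, and not how `annTrain` is invoked), the two losses compute as follows:

```python
import math

def entropy_error(y, p):
    """Cross-entropy loss for a binary target y in {0, 1} and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def normal_error(y, yhat):
    """Squared-error loss for an interval target."""
    return (y - yhat) ** 2

# A confident correct prediction incurs a small cross-entropy loss...
print(entropy_error(1, 0.9))   # ~0.105
# ...while a confident wrong prediction is penalized heavily.
print(entropy_error(1, 0.1))   # ~2.303
print(normal_error(2.0, 1.5))  # 0.25
```

This asymmetry is why ENTROPY is usually preferred for binary targets such as `BAD` in the examples below: it pushes predicted probabilities away from confident mistakes much harder than squared error does.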
This example uses the `HMEQ` dataset from the `SAMPSIO` library, which contains information about home equity loans. The goal is to predict loan default. The data is loaded into a CAS table named `my_hmeq`.
```sas
data my_hmeq;
   set sampsio.hmeq;
run;

proc casutil;
   load data=my_hmeq casout="my_hmeq" replace;
run;
```
This example trains a simple Multi-Layer Perceptron (MLP) with one hidden layer of 10 neurons to predict the binary target `BAD` using several interval inputs.
```sas
proc cas;
   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ',
              'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      hiddens={10},
      arch='MLP',
      nloOpts={algorithm='LBFGS', maxIters=50},
      saveState={name='ann_model', replace=true};
   run;
quit;
```
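To make concrete what an MLP with `hiddens={10}` computes at score time, the following plain-Python sketch (illustrative only, with placeholder weights rather than values from a trained model) traces a forward pass through one tanh hidden layer and a logistic output unit for a binary target:

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: one tanh hidden layer, logistic output for a binary target."""
    # Each hidden neuron applies a linear combination of the inputs, then tanh.
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    # The output neuron combines the hidden activations and squashes to (0, 1).
    z = sum(wo * h for wo, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))  # predicted P(target = 1)

# Tiny illustrative network: 3 inputs, 2 hidden neurons (placeholder weights).
w_hidden = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
b_hidden = [0.0, 0.1]
w_out = [1.2, -0.7]
b_out = 0.05
p = forward([0.2, -1.0, 0.5], w_hidden, b_hidden, w_out, b_out)
print(0.0 < p < 1.0)  # True: the output is a probability
```

Training the network amounts to searching for the hidden and output weights that minimize the chosen `errorFunc` over the training table; the `saveState` table captures those fitted weights for later scoring.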
This example demonstrates a more complex training scenario. It partitions the data into training and validation sets. It then trains an MLP with two hidden layers (20 and 15 neurons), uses the RECTIFIER activation function, and employs the Stochastic Gradient Descent (SGD) optimizer with a specific learning rate and momentum. Early stopping is enabled by referencing the validation data.
```sas
proc cas;
   /* Create a stratified 70/30 split on BAD. The sampling.stratified action
      writes a _PartInd_ variable: 1 = sampled (training), 0 = validation. */
   sampling.stratified /
      table={name='my_hmeq', groupBy={'BAD'}},
      samppct=70,
      partind=true,
      seed=12345,
      output={casOut={name='my_hmeq_part', replace=true}, copyVars='ALL'};
   run;

   neuralNet.annTrain /
      table={name='my_hmeq_part', where='_PartInd_=1'},
      validTable={name='my_hmeq_part', where='_PartInd_=0'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ',
              'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      hiddens={20, 15},
      acts={'RECTIFIER', 'RECTIFIER'},
      arch='MLP',
      std='STD',
      nloOpts={
         algorithm='SGD',
         maxIters=100,
         sgdOpt={learningRate=0.01, momentum=0.5, miniBatchSize=50},
         validate={frequency=5, stagnation=10}
      },
      seed=12345,
      saveState={name='ann_model_sgd', replace=true};
   run;
quit;
```
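The `sgdOpt` settings above map onto the classic momentum update rule: on each mini-batch, a velocity vector is damped by the momentum factor and nudged opposite the gradient, scaled by the learning rate. A minimal pure-Python sketch of that update (the exact internal update `annTrain` applies may differ in details such as learning-rate scheduling):

```python
def sgd_momentum_step(weights, velocity, gradient, learning_rate=0.01, momentum=0.5):
    """One momentum-SGD update: v <- momentum*v - lr*g; w <- w + v."""
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradient)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

# One step from zero velocity with learning_rate=0.01, momentum=0.5.
w, v = sgd_momentum_step([1.0, -0.5], [0.0, 0.0], [0.2, -0.4])
print(w)  # [0.998, -0.496]
```

Momentum lets consistent gradient directions accumulate speed across mini-batches while oscillating directions partially cancel, which is why it is commonly paired with small mini-batch sizes like the `miniBatchSize=50` used here.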
An industrial manufacturing company wants to implement a predictive maintenance program. The goal is to train a neural network to predict imminent machine failure based on real-...
A major telecommunications provider needs to build a customer churn prediction model using a dataset of several million subscribers. Due to the data volume, training efficiency ...
A research organization is analyzing clinical trial data to predict patient response to a new treatment. The dataset is small, contains numerous missing values from incomplete l...