neuralNet

annTrain

Description

The `annTrain` action, part of the `neuralNet` action set, trains an artificial neural network (ANN) in SAS Viya by adjusting the network's weights on a given dataset to minimize prediction error. It supports three architectures: multilayer perceptrons (MLP), generalized linear models (GLIM), and direct-connection models (DIRECT). It also offers a choice of activation functions, optimization algorithms (such as LBFGS and SGD), and data standardization methods.

    neuralNet.annTrain {
        acts={"EXP", "IDENTITY", "LOGISTIC", "RECTIFIER", "SIN", "SOFTPLUS", "TANH"},
        applyRowOrder=TRUE | FALSE,
        arch="DIRECT" | "GLIM" | "MLP",
        attributes={{...}, ...},
        bias=double,
        casOut={...},
        code={...},
        combs={"ADD", "LINEAR", "RADIAL"},
        delta=double,
        dropOut=double,
        dropOutInput=double,
        errorFunc="ENTROPY" | "GAMMA" | "NORMAL" | "POISSON",
        freq="variable-name",
        fullWeights=TRUE | FALSE,
        hiddens={64-bit-integer-1, ...},
        includeBias=TRUE | FALSE,
        inputs={{...}, ...},
        inversePriors=TRUE | FALSE,
        listNode="ALL" | "HIDDEN" | "INPUT" | "OUTPUT",
        missing="MAX" | "MEAN" | "MIN" | "NONE",
        modelId="string",
        modelTable={...},
        nAnns=64-bit-integer,
        nloOpts={...},
        nominals={{...}, ...},
        nTries=64-bit-integer,
        randDist="CAUCHY" | "MSRA" | "NORMAL" | "UNIFORM" | "XAVIER",
        resume=TRUE | FALSE,
        samplingRate=double,
        saveState={...},
        scaleInit=64-bit-integer,
        seed=double,
        std="MIDRANGE" | "NONE" | "STD",
        step=double,
        t=double,
        table={...},
        target="variable-name",
        targetAct="EXP" | "IDENTITY" | "LOGISTIC" | "SIN" | "SOFTMAX" | "TANH",
        targetComb="ADD" | "LINEAR" | "RADIAL",
        targetMissing="MAX" | "MEAN" | "MIN" | "NONE",
        targetStd="MIDRANGE" | "NONE" | "STD",
        validTable={...},
        weight="variable-name"
    };
Settings

| Parameter | Description |
| --- | --- |
| `acts` | Specifies the activation function for the neurons on each hidden layer. |
| `applyRowOrder` | Specifies that the action should use a prespecified row ordering. |
| `arch` | Specifies the network architecture to be trained (MLP, GLIM, or DIRECT). |
| `attributes` | Specifies temporary attributes, such as a format, to apply to input variables. |
| `bias` | Specifies a fixed bias value for all hidden and output neurons, which will not be optimized. |
| `casOut` | Specifies the output table for the trained model. |
| `code` | Requests that the action produce SAS score code for deployment. |
| `combs` | Specifies the combination function for the neurons on each hidden layer. |
| `delta` | Specifies the annealing parameter for simulated annealing (SA) global optimization. |
| `dropOut` | Specifies the dropout ratio for the hidden layers, valid only with SGD optimization and linear combinations. |
| `dropOutInput` | Specifies the dropout ratio for the input layers, valid only with SGD optimization and linear combinations. |
| `errorFunc` | Specifies the error function to train the network (e.g., ENTROPY, NORMAL). |
| `freq` | Specifies a numeric variable that contains the frequency of occurrence of each observation. |
| `fullWeights` | Generates the full weight model for LBFGS optimization. |
| `hiddens` | Specifies the number of hidden neurons for each hidden layer in the model. |
| `includeBias` | When set to False, bias parameters are not included for the hidden and output units. |
| `inputs` | Specifies the input variables to use in the analysis. |
| `inversePriors` | Calculates the weight for prediction error based on the inverse of class frequencies. |
| `listNode` | Specifies which nodes (input, hidden, output, or all) to include in the scoring output table. |
| `missing` | Specifies how to impute missing values for input or target variables. |
| `modelId` | Specifies a model ID variable name to be included in the generated DATA step scoring code. |
| `modelTable` | Specifies the table containing a pre-trained model whose weights are used to initialize the network. |
| `nAnns` | Specifies the number of networks to select from multiple tries, based on the smallest error. |
| `nloOpts` | Specifies the nonlinear optimization options. |
| `nominals` | Specifies the nominal input and target variables to use in the analysis. |
| `nTries` | Specifies the number of training attempts with random initial weights. |
| `randDist` | Specifies the distribution for randomly generating initial network connection weights. |
| `resume` | Resumes a training optimization using weights from a previous training session. |
| `samplingRate` | Specifies the fraction of the data to use for training the neural network. |
| `saveState` | Specifies the table in which to save the model state for future scoring. |
| `scaleInit` | Specifies how to scale the initial weights. |
| `seed` | Specifies the random number seed for initializing network weights. |
| `std` | Specifies the standardization method to use on the interval variables. |
| `step` | Specifies a step size for weight perturbations during Monte Carlo or simulated annealing. |
| `t` | Specifies the artificial temperature parameter for Monte Carlo or simulated annealing. |
| `table` | Specifies the input table containing the training data. |
| `target` | Specifies the target or response variable for training. |
| `targetAct` | Specifies the activation function for the neurons on the output layer. |
| `targetComb` | Specifies the combination function for the neurons on the target output nodes. |
| `targetMissing` | Specifies how to impute missing values for the target variable. |
| `targetStd` | Specifies the standardization method to use on the target variable. |
| `validTable` | Specifies the table with validation data for early stopping. |
| `weight` | Specifies a variable to weight the prediction errors for each observation during training. |

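
To illustrate how several of these settings combine, the following is a minimal sketch rather than a validated program. It assumes an active CAS session and a CAS table named `my_hmeq` like the one created in the Data Preparation section; the output table name `ann_glim` is hypothetical. It trains a GLIM architecture (no hidden layers, so `hiddens` is omitted), imputes missing inputs with the mean, and rescales interval inputs with the midrange method.

```sas
proc cas;
    /* Sketch only: assumes an active CAS session and a CAS table 'my_hmeq' */
    neuralNet.annTrain /
        table={name='my_hmeq'},
        inputs={'LOAN', 'MORTDUE', 'VALUE', 'DEBTINC'},
        nominals={'BAD'},
        target='BAD',
        arch='GLIM',            /* no hidden layers, so hiddens= is not specified */
        missing='MEAN',         /* impute missing input values with the mean */
        std='MIDRANGE',         /* standardize interval inputs by midrange */
        casOut={name='ann_glim', replace=true};  /* hypothetical output table for the model */
run;
quit;
```

The parameter names and values come from the syntax listing above; whether this particular combination suits a given dataset is a modeling decision to verify in your own environment.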
Data Preparation
Data Creation

This example uses the `HMEQ` dataset from the `SAMPSIO` library, which contains information about home equity loans. The goal is to predict loan default. The data is loaded into a CAS table named `my_hmeq`.

    /* Copy the sample data set into the WORK library */
    DATA my_hmeq;
        SET sampsio.hmeq;
    RUN;

    /* Load the WORK copy into the active caslib as the CAS table my_hmeq */
    PROC CASUTIL;
        load DATA=my_hmeq casout='my_hmeq' replace;
    RUN;
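
Before training, it can help to check the loaded table, because HMEQ contains missing values in several inputs and the `missing` parameter controls how `annTrain` treats them. The sketch below assumes an active CAS session and that the `simple` action set is available; it reports missing-value counts for the interval inputs and the class balance of `BAD`.

```sas
proc cas;
    /* Count non-missing (N) and missing (NMISS) values for each interval input */
    simple.summary /
        table={name='my_hmeq'},
        inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ',
                'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
        subSet={'N', 'NMISS'};

    /* Frequency of each level of the binary target BAD */
    simple.freq /
        table={name='my_hmeq'},
        inputs={'BAD'};
run;
quit;
```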

Examples

This example trains a simple Multi-Layer Perceptron (MLP) with one hidden layer of 10 neurons to predict the binary target `BAD` using several interval inputs.

SAS® / CAS Code (awaiting community validation)

    PROC CAS;
        ACTION neuralNet.annTrain /
            TABLE={name='my_hmeq'},
            inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ', 'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
            target='BAD',
            /* BAD is binary, so declare it as nominal */
            nominals={'BAD'},
            /* One hidden layer with 10 neurons */
            hiddens={10},
            arch='MLP',
            /* The iteration limit is set in the optmlOpt suboption of nloOpts */
            nloOpts={algorithm='LBFGS', optmlOpt={maxIters=50}},
            saveState={name='ann_model', replace=true};
    RUN;
Result:
The action trains the neural network and saves the model weights and state to a CAS table named 'ann_model'. The results will include model information, optimization details, and fit statistics.
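
Once the state is saved, the model can be applied to data. The following is a minimal sketch, assuming that the `ann_model` table written by `saveState` is an analytic store that the `astore.score` action can read and that the `astore` action set is available in the session. The output table name `my_hmeq_scored` is hypothetical.

```sas
proc cas;
    /* Score the table with the saved model state.
       Assumption: 'ann_model' is the saveState table from the example above. */
    astore.score /
        table={name='my_hmeq'},
        rstore={name='ann_model'},
        out={name='my_hmeq_scored', replace=true},  /* hypothetical output table */
        copyVars={'BAD'};                            /* keep the actual target for comparison */

    /* Preview the first rows of the scored table */
    table.fetch / table={name='my_hmeq_scored'}, to=5;
run;
quit;
```

Scoring the training table itself only confirms that the saved state loads and runs; a held-out table gives a more honest view of model quality.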

This example demonstrates a more complex training scenario. It first partitions the data into training (70%) and validation (30%) sets, stratified by `BAD`. It then trains an MLP with two hidden layers (20 and 15 neurons) using the RECTIFIER activation function and the stochastic gradient descent (SGD) optimizer, with a learning rate of 0.01, a momentum of 0.5, and a mini-batch size of 50. Passing the validation data through `validTable` enables early stopping.

SAS® / CAS Code (awaiting community validation)

    PROC CAS;
        /* Stratified 70/30 split on BAD: _PartInd_ is 1 for training rows, 2 for validation rows */
        sampling.stratified /
            TABLE={name='my_hmeq', groupBy={'BAD'}},
            sampPct=70,
            sampPct2=30,
            partInd=true,
            seed=12345,
            output={casOut={name='my_hmeq_part', replace=true}, copyVars='ALL'};
    RUN;

        ACTION neuralNet.annTrain /
            TABLE={name='my_hmeq_part', where='_PartInd_=1'},
            validTable={name='my_hmeq_part', where='_PartInd_=2'},
            inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ', 'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
            target='BAD',
            nominals={'BAD'},
            /* Two hidden layers with 20 and 15 neurons */
            hiddens={20, 15},
            acts={'RECTIFIER'},
            arch='MLP',
            std='STD',
            nloOpts={
                algorithm='SGD',
                /* The iteration limit is set in the optmlOpt suboption */
                optmlOpt={maxIters=100},
                sgdOpt={learningRate=0.01, momentum=0.5, miniBatchSize=50},
                /* Check validation error every 5 iterations; stop after 10 checks without improvement */
                validate={frequency=5, stagnation=10}
            },
            seed=12345,
            saveState={name='ann_model_sgd', replace=true};
    RUN;
Result:
The action trains a more complex neural network using the training partition and uses the validation partition to monitor performance and stop training early if the validation error stops improving. The final model is saved to 'ann_model_sgd'.

FAQ

What is the primary purpose of the annTrain action in SAS Viya?
What types of network architectures can be trained using the annTrain action?
How can I define the structure of the hidden layers in my neural network?
Which optimization algorithms are available for training the network?
How does the annTrain action handle missing values in the training data?
Is it possible to use a validation dataset to prevent overfitting during training?
What activation functions can be used for the hidden and target layers?
How can I save the state of my trained model for later use?

Associated Scenarios

Use Case
Standard Case: Predicting Industrial Machine Failure with an MLP

An industrial manufacturing company wants to implement a predictive maintenance program. The goal is to train a neural network to predict imminent machine failure based on real-...

Use Case
Performance Test: Training on Large-Scale Telecom Churn Data with SGD

A major telecommunications provider needs to build a customer churn prediction model using a dataset of several million subscribers. Due to the data volume, training efficiency ...

Use Case
Edge Case: Handling Missing Values and Imbalanced Classes in Clinical Data

A research organization is analyzing clinical trial data to predict patient response to a new treatment. The dataset is small, contains numerous missing values from incomplete l...