annTrain - WeAreCAS

Q: What is the primary purpose of the annTrain action in SAS Viya?

The annTrain action trains an artificial neural network. It is part of the neuralNet action set and supports several network architectures.

Q: What types of network architectures can be trained using the annTrain action?

You can specify the network architecture using the 'arch' parameter. The available options are 'MLP' for a multilayer perceptron, 'GLIM' for a generalized linear model architecture (a two-layer perceptron with no hidden layers), and 'DIRECT' which is an extension of MLP with direct connections from the input to the output layer. The default architecture is 'GLIM'.
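As a minimal sketch of how 'arch' changes the call, the snippet below trains a GLIM network and then a DIRECT network on the same inputs. It assumes an active CAS session and the `my_hmeq` table loaded in the Data Creation section of this page. The output table names `ann_glim` and `ann_direct`, the three inputs, and the hidden layer size are illustrative choices, not recommendations.

```sas
proc cas;
   loadactionset 'neuralNet';

   /* GLIM: no hidden layers, so hiddens is not specified. */
   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'VALUE', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='GLIM',
      casOut={name='ann_glim', replace=true};      /* hypothetical table name */

   /* DIRECT: an MLP plus direct input-to-output connections. */
   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'VALUE', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='DIRECT',
      hiddens={5},
      casOut={name='ann_direct', replace=true};    /* hypothetical table name */
run;
quit;
```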

Q: How can I define the structure of the hidden layers in my neural network?

The 'hiddens' parameter allows you to specify the number of hidden neurons for each hidden layer. For example, specifying hiddens={10, 5} would create a network with two hidden layers, the first having 10 neurons and the second having 5.
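The two-layer case described above might look like the following sketch. It assumes an active CAS session and the `my_hmeq` table from the Data Creation section; the output table name `ann_two_layer` and the input list are illustrative.

```sas
proc cas;
   loadactionset 'neuralNet';

   /* Two hidden layers: 10 neurons in the first, 5 in the second. */
   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='MLP',
      hiddens={10, 5},
      casOut={name='ann_two_layer', replace=true};   /* hypothetical table name */
run;
quit;
```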

Q: Which optimization algorithms are available for training the network?

The optimization algorithm can be selected via the 'algorithm' subparameter within 'nloOpts'. The supported algorithms are 'ADAM', 'HF' (Hessian-Free), 'LBFGS', and 'SGD' (Stochastic Gradient Descent).
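A hedged sketch of selecting the optimizer through 'nloOpts' follows. It picks SGD and reuses the 'sgdOpt' settings from the advanced example on this page; the specific values are placeholders rather than tuned settings, and the table name `ann_sgd` is hypothetical. An active CAS session and the `my_hmeq` table are assumed.

```sas
proc cas;
   loadactionset 'neuralNet';

   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='MLP',
      hiddens={10},
      std='STD',                      /* standardize interval inputs */
      nloOpts={
         algorithm='SGD',             /* other values listed above: ADAM, HF, LBFGS */
         sgdOpt={learningRate=0.01, momentum=0.5, miniBatchSize=50}
      },
      seed=12345,
      casOut={name='ann_sgd', replace=true};   /* hypothetical table name */
run;
quit;
```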

Q: How does the annTrain action handle missing values in the training data?

You can control missing value imputation using the 'missing' parameter for input variables and 'targetMissing' for the target variable. Options include 'MEAN', 'MAX', or 'MIN' to replace missing values with the mean, maximum, or minimum value of the variable. If set to 'NONE' (the default for interval variables), observations with missing values are ignored. For nominal variables, a new category is created for missing values by default.
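The HMEQ data used on this page contains missing values in several inputs, so it makes a convenient illustration. The sketch below asks for mean imputation of the inputs instead of the default behavior. It assumes an active CAS session and the `my_hmeq` table; `ann_mean_imputed` is a hypothetical output name.

```sas
proc cas;
   loadactionset 'neuralNet';

   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='MLP',
      hiddens={10},
      missing='MEAN',     /* replace missing input values with the variable mean */
      casOut={name='ann_mean_imputed', replace=true};   /* hypothetical table name */
run;
quit;
```

With 'NONE' in place of 'MEAN', rows with missing interval inputs would be ignored instead, as described above.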

Q: Is it possible to use a validation dataset to prevent overfitting during training?

Yes, you can provide a validation dataset using the 'validTable' parameter. This allows for early stopping of the training process, which can be configured through the 'validate' subparameter of the 'nloOpts' option, based on the model's performance on this validation data.

Q: What activation functions can be used for the hidden and target layers?

For hidden layers, the 'acts' parameter supports 'EXP', 'IDENTITY', 'LOGISTIC', 'RECTIFIER', 'SIN', 'SOFTPLUS', and 'TANH'. For the target layer, the 'targetAct' parameter supports 'EXP', 'IDENTITY', 'LOGISTIC', 'SIN', 'SOFTMAX', and 'TANH'. The default activation function depends on the variable type.
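As an illustration, the sketch below sets one activation per hidden layer and pairs a SOFTMAX target layer with the ENTROPY error function for the nominal target `BAD`. The particular combination is an assumption chosen for the example, not a recommendation from this page. It assumes an active CAS session and the `my_hmeq` table; `ann_acts` is a hypothetical output name.

```sas
proc cas;
   loadactionset 'neuralNet';

   neuralNet.annTrain /
      table={name='my_hmeq'},
      inputs={'LOAN', 'MORTDUE', 'VALUE', 'DEBTINC'},
      target='BAD',
      nominals={'BAD'},
      arch='MLP',
      hiddens={10, 5},
      acts={'TANH', 'LOGISTIC'},   /* one entry per hidden layer */
      targetAct='SOFTMAX',         /* output-layer activation */
      errorFunc='ENTROPY',
      casOut={name='ann_acts', replace=true};   /* hypothetical table name */
run;
quit;
```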

Q: How can I save the state of my trained model for later use?

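You can use the 'saveState' parameter to specify an output table where the model's state is saved for future scoring. The trained model table itself is written through 'casOut'. To continue an earlier optimization, set 'resume' and supply the previously trained model through 'modelTable', whose weights are used to initialize the network. The related 'annScore' action scores a data table with a pre-trained network.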

At a glance

Within SAS Viya, the annTrain action is the main tool for training artificial neural networks. Part of the neuralNet action set, it supports architectures such as the multilayer perceptron (MLP) for both regression and classification. Beyond basic training, it offers controls such as dropout and solvers such as SGD that help reduce overfitting and improve model stability, and it can generate scoring code to ease the move from training to deployment. The FAQ on this page answers common questions about configuring these options and troubleshooting typical implementation scenarios.

Description

The `annTrain` action, part of the `neuralNet` action set, is used to train an artificial neural network (ANN) in SAS Viya. This process involves adjusting the network's weights based on a given dataset to minimize prediction errors. The action supports various architectures like Multi-Layer Perceptrons (MLP), Generalized Linear Models (GLIM), and direct connection models. It offers extensive customization options, including different activation functions, optimization algorithms (like LBFGS and SGD), and data standardization methods, making it a versatile tool for building predictive models.

neuralNet.annTrain {
    acts={"EXP", "IDENTITY", "LOGISTIC", "RECTIFIER", "SIN", "SOFTPLUS", "TANH"},
    applyRowOrder=TRUE | FALSE,
    arch="DIRECT" | "GLIM" | "MLP",
    attributes={{...}, ...},
    bias=double,
    casOut={...},
    code={...},
    combs={"ADD", "LINEAR", "RADIAL"},
    delta=double,
    dropOut=double,
    dropOutInput=double,
    errorFunc="ENTROPY" | "GAMMA" | "NORMAL" | "POISSON",
    freq="variable-name",
    fullWeights=TRUE | FALSE,
    hiddens={64-bit-integer-1, ...},
    includeBias=TRUE | FALSE,
    inputs={{...}, ...},
    inversePriors=TRUE | FALSE,
    listNode="ALL" | "HIDDEN" | "INPUT" | "OUTPUT",
    missing="MAX" | "MEAN" | "MIN" | "NONE",
    modelId="string",
    modelTable={...},
    nAnns=64-bit-integer,
    nloOpts={...},
    nominals={{...}, ...},
    nTries=64-bit-integer,
    randDist="CAUCHY" | "MSRA" | "NORMAL" | "UNIFORM" | "XAVIER",
    resume=TRUE | FALSE,
    samplingRate=double,
    saveState={...},
    scaleInit=64-bit-integer,
    seed=double,
    std="MIDRANGE" | "NONE" | "STD",
    step=double,
    t=double,
    table={...},
    target="variable-name",
    targetAct="EXP" | "IDENTITY" | "LOGISTIC" | "SIN" | "SOFTMAX" | "TANH",
    targetComb="ADD" | "LINEAR" | "RADIAL",
    targetMissing="MAX" | "MEAN" | "MIN" | "NONE",
    targetStd="MIDRANGE" | "NONE" | "STD",
    validTable={...},
    weight="variable-name"
};

Settings

Parameter	Description
acts	Specifies the activation function for the neurons on each hidden layer.
applyRowOrder	Specifies that the action should use a prespecified row ordering.
arch	Specifies the network architecture to be trained (MLP, GLIM, or DIRECT).
attributes	Specifies temporary attributes, such as a format, to apply to input variables.
bias	Specifies a fixed bias value for all hidden and output neurons, which will not be optimized.
casOut	Specifies the output table for the trained model.
code	Requests that the action produce SAS score code for deployment.
combs	Specifies the combination function for the neurons on each hidden layer.
delta	Specifies the annealing parameter for simulated annealing (SA) global optimization.
dropOut	Specifies the dropout ratio for the hidden layers, valid only with SGD optimization and linear combinations.
dropOutInput	Specifies the dropout ratio for the input layer, valid only with SGD optimization and linear combinations.
errorFunc	Specifies the error function to train the network (e.g., ENTROPY, NORMAL).
freq	Specifies a numeric variable that contains the frequency of occurrence of each observation.
fullWeights	Generates the full weight model for LBFGS optimization.
hiddens	Specifies the number of hidden neurons for each hidden layer in the model.
includeBias	When set to FALSE, bias parameters are not included for the hidden and output units.
inputs	Specifies the input variables to use in the analysis.
inversePriors	Calculates the weight for prediction error based on the inverse of class frequencies.
listNode	Specifies which nodes (input, hidden, output, or all) to include in the scoring output table.
missing	Specifies how to impute missing values for input variables (see targetMissing for the target variable).
modelId	Specifies a model ID variable name to be included in the generated DATA step scoring code.
modelTable	Specifies the table containing a pre-trained model whose weights are used to initialize the network.
nAnns	Specifies the number of networks to select from multiple tries, based on the smallest error.
nloOpts	Specifies the nonlinear optimization options.
nominals	Specifies the nominal input and target variables to use in the analysis.
nTries	Specifies the number of training attempts with random initial weights.
randDist	Specifies the distribution for randomly generating initial network connection weights.
resume	Resumes a training optimization using weights from a previous training session.
samplingRate	Specifies the fraction of the data to use for training the neural network.
saveState	Specifies the table in which to save the model state for future scoring.
scaleInit	Specifies how to scale the initial weights.
seed	Specifies the random number seed for initializing network weights.
std	Specifies the standardization method to use on the interval variables.
step	Specifies a step size for weight perturbations during Monte Carlo or simulated annealing.
t	Specifies the artificial temperature parameter for Monte Carlo or simulated annealing.
table	Specifies the input table containing the training data.
target	Specifies the target or response variable for training.
targetAct	Specifies the activation function for the neurons on the output layer.
targetComb	Specifies the combination function for the neurons on the target output nodes.
targetMissing	Specifies how to impute missing values for the target variable.
targetStd	Specifies the standardization method to use on the target variable.
validTable	Specifies the table with validation data for early stopping.
weight	Specifies a variable to weight the prediction errors for each observation during training.

Data Preparation

Data Creation

This example uses the `HMEQ` dataset from the `SAMPSIO` library, which contains information about home equity loans. The goal is to predict loan default. The data is loaded into a CAS table named `my_hmeq`.

    /* Copy the sample data to the WORK library. */
    data my_hmeq;
       set sampsio.hmeq;
    run;

    /* Load the WORK copy into CAS as my_hmeq (requires an active CAS session). */
    proc casutil;
       load data=my_hmeq casout='my_hmeq' replace;
    run;
    quit;

Examples

Basic MLP Training

This example trains a simple Multi-Layer Perceptron (MLP) with one hidden layer of 10 neurons to predict the binary target `BAD` using several interval inputs.

SAS / CAS code (awaiting community validation)

    proc cas;
       /* Train an MLP with one hidden layer of 10 neurons on the binary target BAD.
          The iteration limit is nested under optmlOpt inside nloOpts. */
       neuralNet.annTrain /
          table={name='my_hmeq'},
          inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ', 'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
          target='BAD',
          hiddens={10},
          arch='MLP',
          nominals={'BAD'},
          nloOpts={algorithm='LBFGS', optmlOpt={maxIters=50}},
          saveState={name='ann_model', replace=true};
    run;
    quit;

Result :
The action trains the neural network and saves the model weights and state to a CAS table named 'ann_model'. The results will include model information, optimization details, and fit statistics.

Advanced Training with Validation and SGD

This example demonstrates a more complex training scenario. It first partitions the data into training and validation sets. It then trains an MLP with two hidden layers (20 and 15 neurons), uses the RECTIFIER activation function, and employs the Stochastic Gradient Descent (SGD) optimizer with a specific learning rate and momentum. Early stopping is enabled by referencing the validation data.

SAS / CAS code (awaiting community validation)

    proc cas;
       /* Stratified 70/30 split on BAD using the sampling action set.
          The output table carries _PartInd_: 1 = training, 2 = validation. */
       sampling.stratified /
          table={name='my_hmeq', groupBy={'BAD'}},
          sampPct=70,
          sampPct2=30,
          partInd=true,
          seed=12345,
          output={casOut={name='my_hmeq_part', replace=true}, copyVars='ALL'};
    run;

       /* Train on partition 1 and monitor partition 2 for early stopping. */
       neuralNet.annTrain /
          table={name='my_hmeq_part', where='_PartInd_=1'},
          validTable={name='my_hmeq_part', where='_PartInd_=2'},
          inputs={'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ', 'CLAGE', 'NINQ', 'CLNO', 'DEBTINC'},
          target='BAD',
          hiddens={20, 15},
          acts={'RECTIFIER', 'RECTIFIER'},   /* one activation per hidden layer */
          arch='MLP',
          nominals={'BAD'},
          std='STD',
          nloOpts={
             algorithm='SGD',
             optmlOpt={maxIters=100},
             sgdOpt={learningRate=0.01, momentum=0.5, miniBatchSize=50},
             validate={frequency=5, stagnation=10}
          },
          seed=12345,
          saveState={name='ann_model_sgd', replace=true};
    run;
    quit;

Result :
The action trains a more complex neural network using the training partition and uses the validation partition to monitor performance and stop training early if the validation error stops improving. The final model is saved to 'ann_model_sgd'.

Associated Scenarios

Use Case

Standard Case: Predicting Industrial Machine Failure with an MLP

An industrial manufacturing company wants to implement a predictive maintenance program. The goal is to train a neural network to predict imminent machine failure based on real-...

Use Case

Performance Test: Training on Large-Scale Telecom Churn Data with SGD

A major telecommunications provider needs to build a customer churn prediction model using a dataset of several million subscribers. Due to the data volume, training efficiency ...

Use Case

Edge Case: Handling Missing Values and Imbalanced Classes in Clinical Data

A research organization is analyzing clinical trial data to predict patient response to a new treatment. The dataset is small, contains numerous missing values from incomplete l...

Related Actions

neuralNet

annCode

The annCode action generates SAS DATA step scoring code from a trained artifi...

neuralNet

annScore

The `annScore` action scores a data table using a pre-trained artificial neur...
