
SVMACHINE Procedure

The SVMACHINE procedure trains support vector machine (SVM) models for binary classification. An SVM separates the two classes with a hyperplane chosen to maximize the margin, that is, the distance between the hyperplane and the closest training points (the support vectors). Kernel functions (linear, polynomial, and RBF) let the model capture non-linear relationships in the data. The procedure runs on the CAS server, so training is distributed and parallelized across large tables.
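For reference, the margin-maximization idea described above is usually written as the textbook soft-margin problem below. This is a general sketch, not a statement of how the procedure solves it internally: the symbols (weights w, intercept b, slack variables ξ, labels y coded as -1 and +1) are notation assumed here for illustration, and C is the cost parameter used in the examples that follow.

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\qquad\text{subject to}\qquad
y_i\,\bigl(w^{\top}x_i + b\bigr) \ge 1-\xi_i,\quad \xi_i \ge 0,\quad i=1,\dots,n
```

The margin width is 2/‖w‖, so minimizing ‖w‖ widens the margin, while the term weighted by C penalizes points that fall inside the margin or on the wrong side of the hyperplane.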


Examples use generated data (datalines) or SASHELP data, loaded into the CAS session.

1 Code Block
PROC SVMACHINE Data
Explanation :
This example illustrates a simple binary classification using the SVMACHINE procedure with a linear kernel. A subset of the SASHELP.IRIS dataset is created in CAS, keeping only two species. The 'target' variable is binarized (0 or 1). The SVMACHINE procedure is then executed to train a model, and fit statistics are displayed.
/* Bind the mycas libref to the CASUSER caslib (assumes an active CAS session) */
libname mycas cas caslib=casuser;

/* Keep two species and recode the class as a 0/1 target */
data mycas.iris_subset;
   set sashelp.iris;
   where species in ('Setosa', 'Versicolor');
   if species = 'Setosa' then target = 0;
   else target = 1;
   drop species;
run;

/* Train a linear SVM and capture the fit statistics */
proc svmachine data=mycas.iris_subset;
   input sepalwidth sepallength petalwidth petallength / level=interval;
   target target;
   kernel linear;
   ods output FitStatistics=FitStat;
run;

proc print data=FitStat;
   title 'Fit statistics for the linear SVM model';
run;
2 Code Block
PROC SVMACHINE Data
Explanation :
This example uses a simulated two-dimensional dataset with a non-linear decision boundary. The SVMACHINE procedure is configured with an RBF (Radial Basis Function) kernel and a cost parameter 'C' of 10. The 'C' parameter controls the penalty for misclassified data points, influencing the margin width and model complexity.
/* Bind the mycas libref to the CASUSER caslib (assumes an active CAS session) */
libname mycas cas caslib=casuser;

data mycas.simulated_data;
   call streaminit(123);
   do i = 1 to 100;
      x1 = rand('Uniform') * 10;
      x2 = rand('Uniform') * 10;
      /* Points inside the circle of radius sqrt(10) centered at (5, 5) are class 0 */
      if (x1 - 5)**2 + (x2 - 5)**2 < 10 then target = 0;
      else target = 1;
      output;
   end;
run;

/* The cost parameter is the C= option of the PROC statement */
proc svmachine data=mycas.simulated_data c=10;
   input x1 x2 / level=interval;
   target target;
   kernel rbf;
   ods output FitStatistics=FitStat;
run;

proc print data=FitStat;
   title 'Fit statistics for the RBF SVM model';
run;
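To see how the cost parameter influences the fit, the model can be refit with a small and a large value of C and the fit statistics compared. The sketch below assumes that `mycas.simulated_data` from the example above already exists and that the cost is set with the `C=` option of the PROC statement; the macro name `fit_svm_c` and its parameters are hypothetical helpers introduced only for this illustration.

```sas
/* Hypothetical helper macro (not part of the original example):
   refit the RBF model for one C value and print its fit statistics. */
%macro fit_svm_c(cval=, suffix=);
   proc svmachine data=mycas.simulated_data c=&cval;
      input x1 x2 / level=interval;
      target target;
      kernel rbf;
      ods output FitStatistics=FitStat_&suffix;
   run;

   proc print data=FitStat_&suffix;
      title "RBF SVM fit statistics, C=&cval";
   run;
%mend fit_svm_c;

/* Small C tolerates more margin violations; large C penalizes them more heavily */
%fit_svm_c(cval=0.1, suffix=low);
%fit_svm_c(cval=10,  suffix=high);

title;   /* clear the title for later steps */
```

Because both runs are scored on the training data only, a better fit at the larger C does not by itself indicate better generalization; a validation partition, as in the next example, is a more reliable basis for choosing C.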
3 Code Block
PROC SVMACHINE Data
Explanation :
This example demonstrates the use of a polynomial kernel of degree 2 to capture more complex relationships in the data. It also incorporates a data partitioning step into training (70%) and validation (30%) sets to evaluate model generalization. The output includes fit statistics and a confusion matrix.
/* Bind the mycas libref to the CASUSER caslib (assumes an active CAS session) */
libname mycas cas caslib=casuser;

data mycas.complex_data;
   call streaminit(456);
   do i = 1 to 200;
      x1 = rand('Normal', 0, 2);
      x2 = rand('Normal', 0, 2);
      /* Class 0 is the right half of the disk of radius 2 centered at the origin */
      if x1**2 + x2**2 < 4 and x1 > 0 then target = 0;
      else target = 1;
      output;
   end;
run;

proc svmachine data=mycas.complex_data c=1;
   input x1 x2 / level=interval;
   target target;
   kernel polynom / degree=2;   /* degree-2 polynomial kernel */
   /* Hold out 30% for validation; the remaining 70% is used for training */
   partition fraction(validate=0.3 seed=789);
   /* ODS table names can be confirmed by running ODS TRACE ON first */
   ods output FitStatistics=FitStat
              Misclassification=MisClass;
run;

proc print data=FitStat;
   title 'Fit statistics for the polynomial SVM model';
run;

proc print data=MisClass;
   title 'Confusion matrix for the polynomial SVM model';
run;
4 Code Block
PROC CAS / SVMACHINE Data
Explanation :
This advanced example drives the same workflow through CASL (the CAS language) within PROC CAS. The training table and the new observations to score are created with DATA steps before PROC CAS starts, because a DATA step cannot run inside PROC CAS. The model is trained with the `svm.svmTrain` action, which saves its state as an analytic store; the `astore.score` action then applies that store to the new data, and `table.fetch` displays the predictions.
/* Bind the mycas libref to the CASUSER caslib (assumes an active CAS session) */
libname mycas cas caslib=casuser;

/* Simulated training data, loaded into CAS */
data mycas.score_data;
   input x1 x2 target;
   datalines;
1 1 0
9 9 1
5 5 0
2 8 1
7 3 0
;
run;

/* New observations to score */
data mycas.new_data;
   input x1 x2;
   datalines;
1.5 1.5
8.5 8.5
4.8 5.2
3.0 7.0
6.5 2.5
;
run;

proc cas;
   loadactionset 'svm';
   loadactionset 'astore';

   /* Train a simple linear model for the demonstration and save its state */
   svm.svmTrain /
      table={caslib='casuser', name='score_data'},
      inputs={'x1', 'x2'},
      target='target',
      kernel='linear',
      savestate={caslib='casuser', name='my_svm_model', replace=true};

   /* Score the new data with the saved analytic store */
   astore.score /
      table={caslib='casuser', name='new_data'},
      rstore={caslib='casuser', name='my_svm_model'},
      copyVars={'x1', 'x2'},
      out={caslib='casuser', name='predictions', replace=true};

   /* Display the predictions */
   table.fetch / table={caslib='casuser', name='predictions'};
quit;
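The same save-and-score workflow can also be sketched with procedures instead of actions. The example below is a minimal illustration under stated assumptions: it reuses `mycas.simulated_data` from the second example, stores the model with the SAVESTATE statement of PROC SVMACHINE, and scores with PROC ASTORE. The table names `svm_store` and `svm_scored` are hypothetical names chosen for this sketch.

```sas
/* Assumes the mycas libref and mycas.simulated_data from the second example */
proc svmachine data=mycas.simulated_data c=10;
   input x1 x2 / level=interval;
   target target;
   kernel rbf;
   /* Save the trained model as an analytic store (hypothetical table name) */
   savestate rstore=mycas.svm_store;
run;

/* Apply the analytic store to a table that contains the input variables */
proc astore;
   score data=mycas.simulated_data
         rstore=mycas.svm_store
         out=mycas.svm_scored
         copyvars=(x1 x2 target);
run;

/* Inspect the first few scored rows */
proc print data=mycas.svm_scored(obs=10);
   title 'First 10 scored observations';
run;

title;   /* clear the title for later steps */
```

Scoring the training table is done here only to keep the sketch short; in practice the `data=` table in the SCORE statement would hold new observations with the same input variables.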
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Expert Advice
Michael
Viya Infrastructure Manager.
« In SAS Viya, PROC SVMACHINE implements the Support Vector Machine (SVM) algorithm, a gold standard for high-accuracy binary classification. By finding the optimal hyperplane that maximizes the margin between classes, SVMs are uniquely robust against overfitting, even in high-dimensional feature spaces. »