copula

copulaFit

Description

The copulaFit action estimates the parameters of a specified copula type. Copulas are multivariate distribution functions whose one-dimensional marginal distributions are uniform on the interval [0,1]. They describe the dependence structure of random variables separately from the marginal distributions. Copula fitting is widely used in financial risk management, insurance, and other fields where the joint behavior of multiple variables matters.
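The separation of marginals from dependence described above is usually stated through Sklar's theorem. As a brief reminder in standard textbook notation (the symbols H, F_i, and C are not taken from this page): for a joint distribution function H with continuous marginals F_1, ..., F_d, there is a unique copula C such that

```latex
H(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr),
\qquad
C(u_1, \dots, u_d) = H\bigl(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)\bigr).
```

Fitting a copula therefore amounts to estimating C from data that have been transformed to (approximately) uniform marginals.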

copula.copulaFit <result=results> <status=rc> /
  copulatype="CLAYTON" | "FRANK" | "GUMBEL" | "NORMAL" | "T",
  corrtable={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT" | "REDISTRIBUTE", importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", orderBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
  df=double,
  display={caseSensitive=TRUE | FALSE, exclude=TRUE | FALSE, excludeAll=TRUE | FALSE, keyIsPath=TRUE | FALSE, names={"string-1" <, "string-2", ...>}, pathType="LABEL" | "NAME", traceNames=TRUE | FALSE},
  initialvalues={"string-1" <, "string-2", ...>},
  KendallCorrtable={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT" | "REDISTRIBUTE", importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", orderBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
  margApproxOpts={algorithm="BIN" | "SORT", interpolation="LINEAR" | "MONOCUBIC" | "STEP", maxiters=integer, refineres=integer, sampletol=double},
  marginals="EMPIRICAL" | "UNIFORM",
  method="CAL" | "MLE",
  name="string",
  optimizer={aftl=double, agtl=double, algorithm="CONJUGATEGRADIENT" | "DOUBLEDOGLEG" | "NEWTONRAPHSONWITHLINESEARCH" | "NEWTONRAPHSONWITHRIDGING" | "NONE" | "QUASINEWTON" | "TRUSTREGION", atol=double, axtl=double, ceps=double, ftol=double, gtol=double, iterationHistory={basic=TRUE | FALSE, estimatesForEachStep=TRUE | FALSE}, maxf=double, maxit=double, maxtime=double, sing=double},
  outpseudo={caslib="string", compress=TRUE | FALSE, indexVars={"variable-name-1" <, "variable-name-2", ...>}, label="string", lifetime=64-bit-integer, maxMemSize=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, replication=integer, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE", threadBlockSize=64-bit-integer, timeStamp="string", where={"string-1" <, "string-2", ...>}},
  outputTables={groupByVarsRaw=TRUE | FALSE, includeAll=TRUE | FALSE, names={"string-1" <, "string-2", ...>} | {key-1={casouttable-1} <, key-2={casouttable-2}, ...>}, repeated=TRUE | FALSE, replace=TRUE | FALSE},
  plot={kendall=TRUE | FALSE, marginals=TRUE | FALSE, nsamples=integer, nvar=integer, resolution=integer, scatter=TRUE | FALSE, tail=TRUE | FALSE, uniform=TRUE | FALSE, unpackpanel=TRUE | FALSE},
  store={caslib="string", label="string", lifetime=64-bit-integer, memoryFormat="DVR" | "INHERIT" | "STANDARD", name="table-name", promote=TRUE | FALSE, replace=TRUE | FALSE, tableRedistUpPolicy="DEFER" | "NOREDIST" | "REBALANCE"},
  table={caslib="string", computedOnDemand=TRUE | FALSE, computedVars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, computedVarsProgram="string", dataSourceOptions={key-1=any-list-or-data-type-1 <, key-2=any-list-or-data-type-2, ...>}, groupBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, groupByMode="NOSORT" | "REDISTRIBUTE", importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", orderBy={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, singlePass=TRUE | FALSE, vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression", whereTable={casLib="string", dataSourceOptions={adls_noreq-parameters | bigquery-parameters | cas_noreq-parameters | clouddex-parameters | db2-parameters | dnfs-parameters | esp-parameters | fedsvr-parameters | gcs_noreq-parameters | hadoop-parameters | hana-parameters | impala-parameters | informix-parameters | jdbc-parameters | mongodb-parameters | mysql-parameters | odbc-parameters | oracle-parameters | path-parameters | postgres-parameters | redshift-parameters | s3-parameters | sapiq-parameters | sforce-parameters | singlestore_standard-parameters | snowflake-parameters | spark-parameters | spde-parameters | sqlserver-parameters | ss_noreq-parameters | teradata-parameters | vertica-parameters | yellowbrick-parameters}, importOptions={fileType="ANY" | "AUDIO" | "AUTO" | "BASESAS" | "CSV" | "DELIMITED" | "DOCUMENT" | "DTA" | "ESP" | "EXCEL" | "FMT" | "HDAT" | "IMAGE" | "JMP" | "LASR" | "PARQUET" | "SOUND" | "SPSS" | "VIDEO" | "XLS", fileType-specific-parameters}, name="table-name", vars={{format="string", formattedLength=integer, label="string", name="variable-name", nfd=integer, nfl=integer}, {...}}, where="where-expression"}},
  theta=double,
  timingReport={details=TRUE | FALSE, summary=TRUE | FALSE},
  tolerance=double,
  var={"string-1" <, "string-2", ...>},
  varSummary=TRUE | FALSE ;

Settings

Parameter | Description
copulatype | Specifies the type of copula to estimate: CLAYTON, FRANK, GUMBEL, NORMAL, or T.
corrtable | Specifies the data table that contains the Pearson correlation matrix to use when you are fitting a t copula.
df | Specifies an initial value for the degrees of freedom (df) when you are fitting a t copula.
display | Specifies the list of display tables that you want the action to create. If omitted, all tables are created.
initialvalues | Provides the initial values for the numerical optimization. For Archimedean copulas, the initial values of the parameters are computed using the calibration method.
KendallCorrtable | Specifies the data table that contains the Kendall correlation matrix to use when you are fitting a t copula.
margApproxOpts | Specifies the options used when approximating the empirical marginal distribution function using the adaptive method.
marginals | Specifies the marginal distribution of the individual variables: EMPIRICAL or UNIFORM. Default is EMPIRICAL.
method | Specifies the method used to estimate parameters: CAL (calibration) or MLE (maximum likelihood estimation). Default is MLE.
name | Specifies an identifier for the fit, which is stored as an ID variable in the OUTCOPULA data set.
optimizer | Specifies parameters that control the parameter estimation process, such as the algorithm and convergence criteria.
outpseudo | Specifies the output data set for saving the pseudo-samples with uniform marginal distributions.
outputTables | Specifies the list of display tables that you want to output as CAS tables.
plot | Specifies the options used to produce correlation plots.
store | Stores model properties and fit results in an item store.
table | Specifies the input data table.
theta | Specifies an initial value for theta, the dependence parameter for Archimedean copulas.
timingReport | Specifies the kind of timing information that you want the action to provide.
tolerance | Specifies the tolerance that is allowed for the fit.
var | Specifies the list of variable names for fitting a copula.
varSummary | When set to TRUE, produces a table that describes some basic statistical properties of the variables in your model.

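To show how several of these settings combine in one call, here is a minimal sketch that uses only parameter names listed in the syntax above. It assumes a CAS table named simudata with variables y1, y2, and y3 in the active caslib (such as the one built in the Data Creation section). The starting value theta=2 and the iteration limit are illustrative choices, not recommendations, and the call has not been validated against a server.

```sas
proc cas;
   copula.copulaFit /
      table={name='simudata'},        /* assumed to exist in the active caslib */
      var={'y1', 'y2', 'y3'},
      copulatype='FRANK',
      method='MLE',
      theta=2,                        /* illustrative starting value for theta */
      marginals='EMPIRICAL',          /* the documented default, shown explicitly */
      optimizer={algorithm='QUASINEWTON', maxit=200},
      varSummary=true,                /* request the variable summary table */
      timingReport={summary=true};    /* request summary timing information */
run;
quit;
```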
Data Preparation
Data Creation

This example requires a dataset with multiple continuous variables to model their dependence structure. We will create a sample dataset named 'simudata' with three variables (y1, y2, y3) that are correlated.

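/* Simulate 1,000 observations of three correlated standard normal variables. */
data simudata;
   keep y1 y2 y3;
   call streaminit(12345);   /* arbitrary seed, added so the sample is reproducible */
   do i = 1 to 1000;
      u1 = rand('UNIFORM');
      u2 = rand('UNIFORM');
      u3 = rand('UNIFORM');
      y1 = quantile('NORMAL', u1);
      /* y2 has unit variance and correlation 0.5 with y1. */
      y2 = 0.5*y1 + sqrt(1 - 0.5*0.5)*quantile('NORMAL', u2);
      /* The square-root term (equal to 0.9) keeps the variance of y3 at 1. */
      y3 = 0.2*y1 + 0.3*y2 + sqrt(1 - 0.2*0.2 - 0.3*0.3 - 2*0.2*0.3*0.5)*quantile('NORMAL', u3);
      output;
   end;
run;

/* Load the data into CAS; assumes a CAS libref named mycas is already assigned. */
data mycas.simudata;
   set simudata;
run;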
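
As a quick check on the simulated dependence, the construction implies theoretical Pearson correlations of 0.5 for (y1, y2), 0.2 + 0.3*0.5 = 0.35 for (y1, y3), and 0.2*0.5 + 0.3 = 0.4 for (y2, y3). The sketch below compares these with sample estimates by running PROC CORR on the local WORK copy of the data. Sample values will differ somewhat from the theoretical ones. The KENDALL option is included only because Kendall's tau is the quantity that calibration-based fitting relies on.

```sas
/* Sketch: compare sample correlations with the values implied by the DATA step. */
/* Runs on the WORK copy of simudata, not on the CAS table.                      */
proc corr data=simudata pearson kendall;
   var y1 y2 y3;
run;
```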

Examples

This example demonstrates how to fit a Normal (Gaussian) copula to the 'simudata' dataset using the Maximum Likelihood Estimation (MLE) method.

SAS® / CAS Code (awaiting community validation)
proc cas;
   /* Fit a normal (Gaussian) copula by maximum likelihood. */
   copula.copulaFit /
      table={name='simudata'},
      var={'y1', 'y2', 'y3'},
      copulatype='NORMAL',
      method='MLE',
      store={name='myfit', replace=true};   /* save the fit in an item store */
run;
quit;
Result:
The action produces several tables, including 'Fit Details', 'Parameter Estimates', and 'Pearson Correlation Matrix'. The log reports the successful completion of the action and the creation of the 'myfit' item store.

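This example fits a t copula, which is useful for capturing tail dependence. It supplies an initial value for the degrees of freedom (df) and a correlation matrix through corrtable. The code assumes that a CAS table named 'mycorr' holding that matrix already exists in the 'casuser' caslib, because nothing on this page creates it. The example also saves the resulting pseudo-samples (the data transformed to uniform marginals) to a new CAS table named 'outpseudo_t'.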

SAS® / CAS Code (awaiting community validation)
proc cas;
   /* Fit a t copula by maximum likelihood. */
   copula.copulaFit /
      table={name='simudata'},
      var={'y1', 'y2', 'y3'},
      copulatype='T',
      method='MLE',
      df=5,                                           /* initial degrees of freedom */
      corrtable={name='mycorr', caslib='casuser'},    /* must already exist */
      outpseudo={name='outpseudo_t', replace=true},   /* pseudo-samples */
      store={name='myfit_t', replace=true};
run;
quit;
Result:
The results include parameter estimates for the t copula, including the estimated degrees of freedom. A new CAS table named 'outpseudo_t' is created in the active caslib, containing the transformed data. The 'myfit_t' item store contains the full model details.
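
One way to confirm that the pseudo-samples look as expected is to inspect the output table directly. The sketch below assumes that 'outpseudo_t' was created in the active caslib by the call above. It uses the general-purpose table.fetch and simple.summary actions. The column names of the output table are not documented on this page, so the summary is run over all numeric columns.

```sas
proc cas;
   /* Print the first five rows of the pseudo-sample table. */
   table.fetch / table={name='outpseudo_t'}, to=5;

   /* Summarize each numeric column; with uniform marginals, the minimum */
   /* and maximum should lie within [0, 1].                              */
   simple.summary / table={name='outpseudo_t'}, subSet={'MIN', 'MAX', 'MEAN'};
run;
quit;
```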

This example fits a Clayton copula, which is an Archimedean copula known for modeling lower tail dependence. It uses the Calibration (CAL) method based on Kendall's tau for faster estimation.

SAS® / CAS Code (awaiting community validation)
proc cas;
   /* Fit a Clayton copula by calibration on Kendall's tau. */
   copula.copulaFit /
      table={name='simudata'},
      var={'y1', 'y2', 'y3'},
      copulatype='CLAYTON',
      method='CAL',
      store={name='myfit_clayton', replace=true};
run;
quit;
Result:
The action outputs the estimated dependence parameter (theta) for the Clayton copula. The estimation method is listed as 'Calibration'. The 'myfit_clayton' item store is created.
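
For intuition about what calibration on Kendall's tau means, the standard textbook relations for the bivariate Clayton copula with parameter θ > 0 are shown below. This page does not say how the action combines pairwise values of tau when more than two variables are supplied, so the formulas illustrate the idea rather than describe the action's internals.

```latex
\tau = \frac{\theta}{\theta + 2}
\quad\Longleftrightarrow\quad
\theta = \frac{2\tau}{1 - \tau},
\qquad
\lambda_L = 2^{-1/\theta}.
```

For example, τ = 1/3 gives θ = 1 and a lower-tail dependence coefficient λ_L = 0.5.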

FAQ

What is the primary function of the copulaFit action in SAS Viya?
Which copula types can be estimated using the copulaFit action?
What is the difference between the 'MLE' and 'CAL' methods for parameter estimation in copulaFit?
How do you specify the input data for the copulaFit action?
What is the purpose of the `var` parameter in the copulaFit action?
How can I save the fitted copula model for later use?
What does the `marginals` parameter control?
When fitting a t-copula, how can I provide an initial correlation matrix?
Is it possible to generate plots to visualize the fit results?
How can I obtain the pseudo-samples with uniform marginal distributions as an output table?