The copulaFit action estimates the parameters for a specified copula type. Copulas are multivariate distribution functions whose one-dimensional marginal distributions are uniform on the interval [0,1]. They are used to model the dependence structure of random variables, separating the marginal distributions from the dependence structure. This action is fundamental in financial risk management, insurance, and other fields where understanding the joint behavior of multiple variables is critical.
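The separation of marginals from dependence described above is usually stated through Sklar's theorem. The notation below is introduced here for illustration and is not taken from the action's documentation: F is the joint distribution function of the d variables, F_1 through F_d are their marginal distribution functions, and C is the copula.

```latex
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)
```

When the marginals are continuous, C is unique, so fitting a copula amounts to estimating C after each variable has been mapped to the interval [0,1] by its own marginal distribution function.
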
| Parameter | Description |
|---|---|
| copulatype | Specifies the type of the copula to be estimated. Can be CLAYTON, FRANK, GUMBEL, NORMAL, or T. |
| corrtable | Specifies the data table that contains the Pearson correlation matrix to use when you are fitting a t copula. |
| df | Specifies an initial value for the degrees of freedom (df) when you are fitting a t copula. |
| display | Specifies the list of display tables that you want the action to create. If omitted, all tables are created. |
| initialvalues | Provides the initial values for the numerical optimization. For Archimedean copulas, the initial values of the parameters are computed using the calibration method. |
| KendallCorrtable | Specifies the data table that contains the Kendall correlation matrix to use when you are fitting a t copula. |
| margApproxOpts | Specifies the options used when approximating the empirical marginal distribution function using the adaptive method. |
| marginals | Specifies the marginal distribution of the individual variables. Can be EMPIRICAL or UNIFORM. Default is EMPIRICAL. |
| method | Specifies the method to use to estimate parameters. Can be CAL (calibration) or MLE (maximum likelihood estimation). Default is MLE. |
| name | Specifies an identifier for the fit, which is stored as an ID variable in the OUTCOPULA data set. |
| optimizer | Specifies parameters that control various aspects of the parameter estimation process, such as the algorithm and convergence criteria. |
| outpseudo | Specifies the output data set for saving the pseudo-samples with uniform marginal distributions. |
| outputTables | Specifies the list of display tables that you want to output as CAS tables. |
| plot | Specifies the options used to produce correlation plots. |
| store | Stores model properties and fit results in an item store. |
| table | Specifies the input data table. |
| theta | Specifies an initial value for theta, the dependence parameter for Archimedean copulas. |
| timingReport | Specifies the kind of timing information that you want the action to provide. |
| tolerance | Specifies the tolerance that is allowed for the fit. |
| var | Specifies the list of variable names for fitting a copula. |
| varSummary | When set to True, produces a table that describes some basic statistical properties of the variables in your model. |
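
The EMPIRICAL setting for marginals and the pseudo-samples saved by outpseudo both rest on mapping each variable to the interval (0,1). A minimal Python sketch of one common way to do that follows. The function name `pseudo_observations` and the rank / (n + 1) scaling are assumptions made for illustration; this page does not state the exact formula that copulaFit uses.

```python
def pseudo_observations(values):
    """Map a sample to (0, 1) through its scaled ranks: rank / (n + 1).

    Tied values share their average rank. The scaling is an assumption,
    not a documented detail of copulaFit.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Dividing by n + 1 keeps every value strictly inside (0, 1).
    return [r / (n + 1) for r in ranks]


sample = [2.5, -1.0, 0.3, 7.1]
print(pseudo_observations(sample))  # [0.6, 0.2, 0.4, 0.8]
```

Dividing by n + 1 rather than n avoids a pseudo-observation of exactly 1, which matters because many copula densities are unbounded at the edges of the unit cube.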
The examples on this page require a data set with several continuous variables whose dependence structure can be modeled. The DATA step below creates a sample data set named 'simudata' with three correlated variables (y1, y2, y3) and then copies it to CAS.

    /* Simulate 1,000 rows of three correlated standard normal variables. */
    DATA simudata;
       keep y1 y2 y3;
       DO i = 1 to 1000;
          u1 = rand('UNIFORM');
          u2 = rand('UNIFORM');
          u3 = rand('UNIFORM');
          y1 = quantile('NORMAL', u1);
          /* The sqrt() factors keep the variances of y2 and y3 equal to 1. */
          y2 = 0.5*y1 + sqrt(1 - 0.5*0.5)*quantile('NORMAL', u2);
          y3 = 0.2*y1 + 0.3*y2 + sqrt(1 - 0.2*0.2 - 0.3*0.3 - 2*0.2*0.3*0.5)*quantile('NORMAL', u3);
          OUTPUT;
       END;
    RUN;

    /* Copy the data to CAS. Assumes the libref mycas is already assigned to a CAS library. */
    DATA mycas.simudata;
       SET simudata;
    RUN;

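The coefficients in the DATA step fix the correlations among y1, y2, and y3, and those can be worked out without simulating anything. The Python sketch below rewrites each variable as a weighted sum of three independent standard normal draws (called z1, z2, z3 here; these names are not in the SAS code) and multiplies out the weights. The helper name `implied_covariance` is hypothetical.

```python
import math

# Coefficients copied from the DATA step; z1, z2, z3 are independent N(0, 1).
s = math.sqrt(1 - 0.5 * 0.5)
c = math.sqrt(1 - 0.2 * 0.2 - 0.3 * 0.3 - 2 * 0.2 * 0.3 * 0.5)

# Row i holds the weights of (z1, z2, z3) in variable y(i+1).
loadings = [
    [1.0, 0.0, 0.0],                # y1 = z1
    [0.5, s, 0.0],                  # y2 = 0.5*y1 + s*z2
    [0.2 + 0.3 * 0.5, 0.3 * s, c],  # y3 = 0.2*y1 + 0.3*y2 + c*z3
]


def implied_covariance(a):
    """Return A * A^T, the covariance of A*z when z has identity covariance."""
    n = len(a)
    return [
        [sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


cov = implied_covariance(loadings)
for row in cov:
    print([round(v, 4) for v in row])
```

The diagonal comes out as 1, so the covariance matrix is also the correlation matrix: about 0.5 for (y1, y2), 0.35 for (y1, y3), and 0.4 for (y2, y3). A Normal copula fitted to 'simudata' should recover correlations near these values, up to sampling error.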
This example demonstrates how to fit a Normal (Gaussian) copula to the 'simudata' dataset using the Maximum Likelihood Estimation (MLE) method.

    PROC CAS;
       copula.copulaFit /
          table={name='simudata'},
          var={'y1', 'y2', 'y3'},
          copulatype='NORMAL',
          method='MLE',
          store={name='myfit', replace=true};
    RUN;
    QUIT;

This example fits a t copula, which is useful for capturing tail dependence. It supplies an initial value for the degrees of freedom (df=5) and a correlation table through corrtable. The table 'mycorr' is assumed to already exist in the casuser caslib, because none of the steps on this page create it. The example also saves the pseudo-samples (the data transformed to uniform marginals) to a new CAS table named 'outpseudo_t'.

    PROC CAS;
       copula.copulaFit /
          table={name='simudata'},
          var={'y1', 'y2', 'y3'},
          copulatype='T',
          method='MLE',
          df=5,
          corrtable={name='mycorr', caslib='casuser'},
          outpseudo={name='outpseudo_t', replace=true},
          store={name='myfit_t', replace=true};
    RUN;
    QUIT;

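The parameter table lists both corrtable (Pearson) and KendallCorrtable (Kendall) as correlation inputs for a t copula. For elliptical copulas such as the Normal and t copulas, the two measures are linked by a closed-form relation, sketched below in Python. The function names are hypothetical, and whether copulaFit applies this conversion internally is an assumption that this page does not confirm.

```python
import math


def kendall_tau_to_pearson(tau):
    """Correlation parameter of an elliptical copula implied by Kendall's tau."""
    return math.sin(math.pi * tau / 2)


def pearson_to_kendall_tau(rho):
    """Kendall's tau implied by the correlation parameter of an elliptical copula."""
    return 2 / math.pi * math.asin(rho)


# Round trip for the y1-y2 pair, whose correlation parameter is 0.5 by construction.
tau_12 = pearson_to_kendall_tau(0.5)
print(round(tau_12, 4))                          # 0.3333
print(round(kendall_tau_to_pearson(tau_12), 4))  # 0.5
```

The relation applies to the copula's correlation parameter, so it is a reasonable way to sanity-check a Kendall matrix against a Pearson one before passing either to the action.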
This example fits a Clayton copula, which is an Archimedean copula known for modeling lower tail dependence. It uses the Calibration (CAL) method based on Kendall's tau for faster estimation.

    PROC CAS;
       copula.copulaFit /
          table={name='simudata'},
          var={'y1', 'y2', 'y3'},
          copulatype='CLAYTON',
          method='CAL',
          store={name='myfit_clayton', replace=true};
    RUN;
    QUIT;

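To make the idea of calibration from Kendall's tau concrete, the Python sketch below inverts the standard bivariate Clayton relation tau = theta / (theta + 2) and then reports the implied lower tail dependence. It is a two-variable illustration with hypothetical function names; how copulaFit combines the pairwise taus of three variables into a single theta is not stated on this page.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a from concordant and discordant pairs (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the bivariate Clayton copula."""
    if not 0 < tau < 1:
        raise ValueError("this sketch requires 0 < tau < 1")
    return 2 * tau / (1 - tau)


def clayton_lower_tail_dependence(theta):
    """Lower tail dependence coefficient of a Clayton copula with theta > 0."""
    return 2 ** (-1 / theta)


theta = clayton_theta_from_tau(0.5)
print(theta)                                           # 2.0
print(round(clayton_lower_tail_dependence(theta), 4))  # 0.7071
```

Because the inversion is a closed-form expression, calibration needs no iterative optimization, which is why it is faster than maximum likelihood estimation.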
An investment bank needs to estimate the Value at Risk (VaR) of a diversified portfolio containing Equities, Bonds, and Commodities. The goal is to model the dependency structur...
A manufacturing plant monitors the health of 500,000 IoT sensors. They need to detect synchronized failures where multiple sensor readings (Temperature, Vibration) drop simultan...
An actuarial team is modeling the dependency between 'Claim Cost' and 'Time to Report'. The raw data is messy, containing random missing values. Additionally, the team wants to ...