The package OpVaR is a toolkit of statistical methods for operational risk modeling. Anticipating a loss frequency/loss severity decomposition, it especially tackles the issues:
In the following, the functionality of OpVaR is briefly sketched using the hypothetical example data set lossdat. Load the package and this dataset by:
library(OpVaR)
data(lossdat)
The key element of OpVaR is the opriskmodel structure. Inspired by the regulatory 8x7 business line/event type matrix, it is a list object whose elements correspond, for example, to the cells of a business line/event type matrix. Each element comprises a loss frequency model, a loss severity model and, if specified, a dependency model between frequencies and severities. Our lossdat dataset provides a minimal (2x2) example. Like the opriskmodel structure, the loss data must be stored as a list of cells, each cell being a data frame of loss severities with an integer time period assignment (e.g., years or quarters). For the example data set lossdat, the corresponding empty opriskmodel comprises 4 cells (list elements) and is initialized by:
opriskmodel=list()
for(i in seq_along(lossdat)){
  opriskmodel[[i]]=list()
}
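To make the expected data layout concrete, the following base R sketch builds a hypothetical two-cell loss data list and a matching empty model list. The helper `make_cell`, the objects `toy_lossdat` and `toy_model`, the column name `Period` and all numbers are assumptions made for illustration. Inspect `str(lossdat[[1]])` for the actual layout of the example data.

```r
# Hypothetical toy data: two cells, each a data frame with one row per loss.
# The column name `Period` is an assumption; `Loss` follows the usage
# lossdat[[3]]$Loss elsewhere in this document.
set.seed(123)
make_cell <- function(n_losses, n_periods) {
  data.frame(
    Loss   = rgamma(n_losses, shape = 2, rate = 0.001),       # loss severities
    Period = sample.int(n_periods, n_losses, replace = TRUE)  # integer period index
  )
}
toy_lossdat <- list(make_cell(200, 10), make_cell(150, 10))

# One empty list element per cell, to be filled with freqdist, sevdist, ...
toy_model <- lapply(seq_along(toy_lossdat), function(i) list())

length(toy_model)      # same number of cells as the data
str(toy_lossdat[[1]])  # data frame with columns Loss and Period
```

Using `seq_along()` instead of `1:length()` keeps the construction safe when the data list is empty.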
There are two options for modeling loss frequencies: the Poisson and the negative binomial distribution. Loss frequency models are fitted with the fitFreqdist function, which takes into account the discrete time period classification in the data.
### Fit Frequency Distribution
opriskmodel[[1]]$freqdist=fitFreqdist(lossdat[[1]],"pois")
opriskmodel[[2]]$freqdist=fitFreqdist(lossdat[[2]],"pois")
opriskmodel[[3]]$freqdist=fitFreqdist(lossdat[[3]],"nbinom")
opriskmodel[[4]]$freqdist=fitFreqdist(lossdat[[4]],"nbinom")
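The role of the time period assignment in frequency fitting can be illustrated in base R. The sketch below is not the internals of fitFreqdist. It only shows the usual idea of counting losses per period and estimating a Poisson rate from those counts. The data, the column name `Period` and all variable names are hypothetical.

```r
# Hypothetical cell: losses with an integer period index (names are assumptions).
set.seed(1)
n_periods   <- 40
true_counts <- rpois(n_periods, lambda = 12)
cell <- data.frame(
  Loss   = rgamma(sum(true_counts), shape = 2, rate = 0.001),
  Period = rep(seq_len(n_periods), times = true_counts)
)

# Number of losses per period; tabulate() keeps periods with zero losses.
counts <- tabulate(cell$Period, nbins = n_periods)

# Poisson maximum likelihood estimate: the mean count per period.
lambda_hat <- mean(counts)

# Index of dispersion: values clearly above 1 suggest overdispersion,
# where a negative binomial model may fit better than a Poisson model.
dispersion <- var(counts) / mean(counts)

c(lambda_hat = lambda_hat, dispersion = dispersion)
```

A dispersion index near 1 is consistent with the "pois" choice, while a clearly larger value is one informal argument for "nbinom".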
For loss severities three types of models are available:
### Fit Severity Distributions
opriskmodel[[1]]$sevdist=fitPlain(lossdat[[1]],"gamma")
opriskmodel[[2]]$sevdist=fitPlain(lossdat[[2]],"weibull")
opriskmodel[[3]]$sevdist=fitSpliced(lossdat[[3]],"gamma","gpd",method="Fixed",thresh=2000)
opriskmodel[[4]]$sevdist=fitSpliced(lossdat[[4]],"gamma","gpd",method="Fixed",thresh=2000)
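The idea behind a spliced severity model with a fixed threshold can be sketched in base R: a body distribution truncated at the threshold carries a weight w, and a generalized Pareto tail above the threshold carries the remaining 1 - w. The functions `pgpd_sketch` and `pspliced_sketch` and all parameter values below are hypothetical, and the parameterization used inside OpVaR may differ.

```r
# Generalized Pareto CDF for shape > 0 (written out to avoid extra packages).
pgpd_sketch <- function(x, loc, scale, shape) {
  z <- pmax(x - loc, 0) / scale
  1 - (1 + shape * z)^(-1 / shape)
}

# Spliced CDF: a gamma body truncated at `thresh` carrying probability `w`,
# and a GPD tail above `thresh` carrying the remaining 1 - w.
pspliced_sketch <- function(x, thresh, w, body_shape, body_rate,
                            tail_scale, tail_shape) {
  body <- w * pgamma(pmin(x, thresh), shape = body_shape, rate = body_rate) /
    pgamma(thresh, shape = body_shape, rate = body_rate)
  tail <- w + (1 - w) * pgpd_sketch(x, thresh, tail_scale, tail_shape)
  ifelse(x <= thresh, body, tail)
}

# Hypothetical parameters, chosen only for illustration.
x <- c(500, 2000, 5000, 50000)
p <- pspliced_sketch(x, thresh = 2000, w = 0.8,
                     body_shape = 2, body_rate = 0.002,
                     tail_scale = 700, tail_shape = 0.5)
p
```

By construction the CDF equals w exactly at the threshold, so the two pieces join without a jump.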
Goodness-of-fit tests are available for the continuous loss severity models (Anderson-Darling, Cramer-von Mises and Kolmogorov-Smirnov tests) as well as for the discrete loss frequency models (chi-squared test):
### Test Model Fit (Severities)
goftest(lossdat[[3]],opriskmodel[[3]]$sevdist)
## [[1]]
##
## Anderson-Darling test of goodness-of-fit
## Null hypothesis: psevdist
##
## data: cell$Loss
## An = 2.8527, p-value = 0.03253
##
##
## [[2]]
##
## Cramer-von Mises test of goodness-of-fit
## Null hypothesis: psevdist
##
## data: cell$Loss
## omega2 = 0.50876, p-value = 0.03784
##
##
## [[3]]
##
## One-sample Kolmogorov-Smirnov test
##
## data: cell$Loss
## D = 0.034021, p-value = 0.01974
## alternative hypothesis: two-sided
plot(opriskmodel[[3]]$sevdist)
lines(density(lossdat[[3]]$Loss))
### Test Model Fit (Frequencies)
goftest(lossdat[[3]],opriskmodel[[3]]$freqdist)
##
## Chi-squared test for given probabilities
##
## data: frequencies
## X-squared = 28.215, df = 68, p-value = 1
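For readers who want to see how such a test reacts to a good and a poor model, the following sketch uses `ks.test` from base R's stats package on hypothetical data. It is not a substitute for goftest, and the sample and parameters are assumptions chosen for illustration.

```r
set.seed(7)
# Hypothetical loss sample drawn from a gamma distribution.
losses <- rgamma(500, shape = 2, rate = 0.001)

# Test against the true model and against a deliberately wrong one.
# All parameters are fixed in advance here; if they are estimated from the
# same sample, the nominal p-values of the plain KS test are no longer exact.
ks_good <- ks.test(losses, "pgamma", shape = 2, rate = 0.001)
ks_bad  <- ks.test(losses, "pexp", rate = 0.0005)

c(good = ks_good$p.value, bad = ks_bad$p.value)
```

A small p-value, as for the exponential model here, indicates that the hypothesized distribution is unlikely to have generated the sample.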
If bivariate dependencies between loss frequencies and severities within the single cells are to be considered, copula models can be fitted using the fitDependency function. fitDependency is an interface to the VineCopula package and therefore uses its family encoding (0 = independence copula, 1 = Gaussian copula, 2 = t copula, 3 = Clayton copula, 4 = Gumbel copula, 5 = Frank copula, 6 = Joe copula; for details, see the BiCop function documentation in the VineCopula package). If independence is assumed, either specify the independence copula (0) or leave the dependency element unspecified.
### Fit Dependency Model
opriskmodel[[1]]$dependency=fitDependency(lossdat[[1]],6)
opriskmodel[[2]]$dependency=fitDependency(lossdat[[2]],0)
opriskmodel[[4]]$dependency=fitDependency(lossdat[[4]],4)
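Because the numeric family codes are easy to mix up, a small named lookup vector can make the calls more readable. The vector name `copula_family` is hypothetical and the codes are those listed in the paragraph on fitDependency.

```r
# Family codes as listed in this document (VineCopula encoding).
copula_family <- c(
  independence = 0, gaussian = 1, t = 2, clayton = 3,
  gumbel = 4, frank = 5, joe = 6
)

copula_family[["joe"]]                    # code used for cell 1
names(copula_family)[copula_family == 4]  # family used for cell 4

# Hypothetical usage with the fitting call from this document:
# opriskmodel[[1]]$dependency <- fitDependency(lossdat[[1]], copula_family[["joe"]])
```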
Given a correctly specified opriskmodel, the total loss for each cell can be estimated by Monte Carlo simulation (mcSim function). Store the simulation result so that the Value-at-Risk can be determined afterwards with the VaR function. Depending on the complexity of the dependency and loss severity models, the simulation can be time consuming.
### Monte Carlo Simulation
mc_out=mcSim(opriskmodel,100,verbose=FALSE)
### Value-at-Risk Calculation
VaR(mc_out,.95)
## 95% 95% 95% 95%
## 72234.44 76767.34 43163.74 41194.08
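What the simulation does conceptually for one cell under independence can be sketched in base R: draw a loss count per period, sum that many severities, and take an empirical quantile of the totals. This is a minimal illustration with hypothetical Poisson and gamma parameters, not the implementation of mcSim or VaR.

```r
set.seed(42)
n_sim  <- 10000  # number of simulated periods
lambda <- 50     # hypothetical Poisson frequency per period

# Simulated number of losses in each period.
n_losses <- rpois(n_sim, lambda)

# Total loss per period: sum of n independent gamma severities
# (sum over zero losses is 0).
total_loss <- vapply(
  n_losses,
  function(n) sum(rgamma(n, shape = 2, rate = 0.01)),
  numeric(1)
)

# Empirical 95% Value-at-Risk of the simulated total loss.
var_95 <- quantile(total_loss, 0.95, names = FALSE)
c(mean = mean(total_loss), VaR_95 = var_95)
```

Quantile estimates from few simulation runs are noisy, so the number of runs is a trade-off between runtime and the stability of the reported risk figure.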
A closed-form approximation of the Value-at-Risk can also be determined using a tail index based quantile approximation that assumes independence of loss severity and loss frequency. This procedure is numerically fast and can be used, for example, to validate Monte Carlo simulation based risk figures.
### Benchmark: Value-at-Risk by Single Loss Approximation
sla(opriskmodel,.95)
## For infinite domains Gauss integration is applied!
## For infinite domains Gauss integration is applied!
## For infinite domains Gauss integration is applied!
## For infinite domains Gauss integration is applied!
## OpRiskmodel - SLA: 95 %
## ----------------------------------------------
## VaR Interpolation
## Cell 1 : 6036.38867342468 FALSE
## Cell 2 : 65444.5967578263 FALSE
## Cell 3 : 62567.0910851835 FALSE
## Cell 4 : 53163.5839199277 FALSE
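The basic idea of a single-loss approximation can be written in a few lines of base R. The sketch below uses only the textbook first-order form for heavy-tailed severities, in which the alpha-quantile of the total loss is approximated by the severity quantile at level 1 - (1 - alpha)/lambda. It is an assumption that this conveys the spirit of the sla function; the refinements and the interpolation used by OpVaR are not reproduced, and all parameters are hypothetical.

```r
alpha   <- 0.95  # confidence level, as in the calls in this document
lambda  <- 50    # hypothetical expected number of losses per period
meanlog <- 8     # hypothetical lognormal severity parameters
sdlog   <- 2

# First-order single-loss approximation: the alpha-quantile of the total loss
# is approximated by a far-out quantile of a single loss.
p_single <- 1 - (1 - alpha) / lambda
var_sla  <- qlnorm(p_single, meanlog = meanlog, sdlog = sdlog)

c(p_single = p_single, var_sla = var_sla)
```

The first-order form ignores the contribution of the expected loss, so it is most reliable for very heavy tails and confidence levels close to 1.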
If the spline-based interpolation is needed, it can also be shown graphically. Consider the following example:
opriskmodel2 = list()
opriskmodel2[[1]] = list()
opriskmodel2[[1]]$sevdist = buildSplicedSevdist("lgamma", c(1.23, 0.012), "gpd", c(200, 716, 0.9), 2000, 0.8)
opriskmodel2[[1]]$freqdist = buildFreqdist("pois", 50)
# Generate plot if interpolation was performed
sla(opriskmodel2, alpha = 0.95, plot = TRUE)
## For infinite domains Gauss integration is applied!
## OpRiskmodel - SLA: 95 %
## ----------------------------------------------
## VaR Interpolation
## Cell 1 : 442934.287600749 TRUE