Value-at-risk (VaR) models are useful only if they predict risk with reasonable accuracy. This is why the application of these models should always be accompanied by validation. Model validation is the general process of checking whether a model is adequate. This can be done with a set of tools, including backtesting, stress testing, and independent review and oversight.
Backtesting is a formal statistical framework that consists of verifying that actual losses are in line with projected losses. This involves systematically comparing the history of VaR forecasts with their associated portfolio returns.
Importance of Backtesting
Backtesting is essential for VaR users and risk managers, who need to check that their VaR forecasts are well calibrated. If not, the models should be reexamined for faulty assumptions, wrong parameters, or inaccurate modeling. Then the models should be improved based on the ideas provided by the backtesting process.
Backtesting is also central to the Basel Committee’s ground-breaking decision to allow internal VaR models for capital requirements. The backtesting framework should be designed to maximize the probability of catching banks that deliberately understate their risk, but at the same time, the framework should also ensure that banks whose VaR is exceeded simply because of bad luck should not be penalized.
Setup For Backtesting
When the model is perfectly calibrated, the number of observations falling outside VaR should be in line with the confidence level. The number of exceedances is also known as the number of exceptions. With too many exceptions, the model underestimates risk. This is a major problem because too little capital may be allocated to risk-taking units; penalties also may be imposed by the regulator. Too few exceptions are also a problem because they lead to an excessive, or inefficient, allocation of capital across units.
An Example
An example of model calibration is described in this figure, which displays the fit between actual and forecast daily VaR numbers for Bankers Trust. The diagram shows the absolute value of the daily profit and loss (P&L) against the 99 percent VaR, defined here as the daily price volatility. Observations that lie above the diagonal line indicate days when the absolute value of the P&L exceeded the VaR.
Assuming symmetry in the P&L distribution, about 2 percent of the daily observations (both positive and negative) should lie above the diagonal, or about 5 data points in a year. Here we observe four exceptions. Thus the model seems to be well calibrated. We could have observed, however, a greater number of deviations simply owing to bad luck. The question is: At what point do we reject the model?
Data Issues in Backtesting
VaR measures assume that the current portfolio is “frozen” over the horizon. In practice, the trading portfolio evolves dynamically during the day. Thus the actual portfolio is “contaminated” by changes in its composition. The actual return corresponds to the actual P&L, taking into account intraday trades and other profit items such as fees, commissions, spreads, and net interest income.
This contamination will be minimized if the horizon is relatively short, which explains why backtesting usually is conducted on daily returns.
Sometimes an approximation is obtained by using a cleaned return, which is the actual return minus all non-mark-to-market items, such as fees, commissions, and net interest income. Under the latest update to the market-risk amendment, supervisors will have the choice to use either hypothetical or cleaned returns.
For verification to be meaningful, the risk manager should track both the actual portfolio return Rt and the hypothetical return Rt∗ that most closely matches the VaR forecast. The hypothetical return Rt∗ represents a frozen portfolio, obtained from fixed positions applied to the actual returns on all securities, measured from close to close.
Ideally, both actual and hypothetical returns should be used for backtesting because both sets of numbers yield informative comparisons. If, for instance, the model passes backtesting with hypothetical but not actual returns, then the problem lies with intraday trading. In contrast, if the model does not pass backtesting with hypothetical returns, then the modeling methodology should be reexamined.
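As a minimal sketch of this bookkeeping (the array names and sample values below are hypothetical, not taken from the text), the following Python snippet counts exceptions for actual and hypothetical returns against the same series of VaR forecasts:

import numpy as np

def count_exceptions(returns, var_forecasts):
    # Count days on which the loss exceeded the VaR forecast.
    # returns: daily P&L (negative = loss); var_forecasts: VaR reported as a positive number.
    returns = np.asarray(returns, dtype=float)
    var_forecasts = np.asarray(var_forecasts, dtype=float)
    return int(np.sum(returns < -var_forecasts))

# Hypothetical inputs: actual (traded) P&L and hypothetical (frozen-portfolio) P&L
actual_returns = np.array([-1.2, 0.4, -2.5, 0.1, -0.3])
hypothetical_returns = np.array([-1.0, 0.5, -2.1, 0.2, -0.4])
var_forecasts = np.array([2.0, 2.0, 2.0, 2.0, 2.0])

print("exceptions (actual):      ", count_exceptions(actual_returns, var_forecasts))
print("exceptions (hypothetical):", count_exceptions(hypothetical_returns, var_forecasts))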
Model Backtesting With Exceptions
Model backtesting involves systematically comparing historical VaR measures with the actual returns. Since VaR is reported only at a specified confidence level, actual losses should exceed the VaR figure some of the time. For example, with a 95 percent confidence level VaR, we expect the actual loss to exceed the VaR 5 percent of the time.
But most of the time we will not observe exactly 5 percent exceptions. A greater percentage could occur because of bad luck, perhaps 8 percent. At some point, if the frequency of deviations becomes too large, say, 20 percent, the user must conclude that the problem lies with the model, not bad luck, and undertake corrective action.
The issue is how to decide whether the deviations were because of bad luck or because of model flaws. This can be formulated as an accept or reject decision and is a classic statistical decision problem. This decision of accept or reject must be made at some confidence level. The choice of this level for the test, however, is not related to the quantitative level p selected for VaR.
Model Backtesting with Exceptions – Example
A 97 percent VaR model can be tested at 95 percent confidence.
Interpretation –
The exceptions can be expected to occur 3 percent of the time.
Through the backtesting process, we want to be 95 percent confident that any difference between the observed number of exceptions and the expected number (3% of n) is due to sampling variation only; or, simply stated, we want to be 95 percent confident before we reject the model as incorrect.
A 95 percent VaR model can be tested at 97 percent confidence.
Interpretation –
The exceptions can be expected to occur 5 percent of the time.
Through the backtesting process, we want to be 97 percent confident that any difference between the observed number of exceptions and the expected number (5% of n) is due to sampling variation only; or, simply stated, we want to be 97 percent confident before we reject the model as incorrect.
Model Verification Based on Failure Rates – Example
The simplest method to verify the accuracy of the model is to record the failure rate, which gives the proportion of times VaR is exceeded in a given sample.
Suppose a bank provides a VaR figure at the 1 percent left-tail level (p = 1 – c = 0.01) for a total of T days. The number of exceptions, say N, is the number of days on which the actual loss exceeds the VaR level. Hence the failure rate is N/T. Ideally, the failure rate should give an unbiased measure of p, that is, it should converge to p as the sample size increases (N/T → p as T → ∞).
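A minimal numerical illustration of the failure rate (all values below are hypothetical):

T = 1000           # days in the sample (hypothetical)
N = 14             # days on which the loss exceeded the 99 percent VaR (hypothetical)
p = 1 - 0.99       # tail probability implied by the VaR confidence level

failure_rate = N / T
print(f"failure rate = {failure_rate:.2%} versus expected p = {p:.2%}")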
We want to know whether the model is correct or not at a given confidence level, by assessing if N is too small or too large. So our null hypothesis is
H0 : Model is Correct
This test makes no assumption about the return distribution. The distribution could be normal, or skewed, or with heavy tails, or time-varying. We simply count the number of exceptions. As a result, this approach is fully nonparametric.
The setup for this test is the classic testing framework for a sequence of successes and failures, also called Bernoulli trials. Under the null hypothesis that the model is correctly calibrated, the number of exceptions x follows a binomial probability distribution:

f(x) = C(T, x) × p^x × (1 – p)^(T – x),  where C(T, x) = T! / [x!(T – x)!] is the number of ways of choosing x exception days out of T days.
Jorion performs the test with 1 year of data (i.e. 250 trading days), and so T = 250.
Since the VaR is at the 99 percent confidence level, we know that p = 1 – 0.99 = 0.01.
So, for example, the probability of observing exactly 3 exceptions is f(3) = C(250, 3) × 0.01^3 × 0.99^247 ≈ 21.49 percent, as shown in the first table below.
CASE 1 – WHEN MODEL IS CORRECT (p = 0.01)

Number of Exceptions, x   Probability, f(x)   Cumulative Probability, F(x)
 0                          8.1059%              8.1059%
 1                         20.4693%             28.5752%
 2                         25.7417%             54.3169%
 3                         21.4948%             75.8117%
 4                         13.4071%             89.2188%
 5                          6.6629%             95.8817%
 6                          2.7482%             98.6299%
 7                          0.9676%             99.5975%
 8                          0.2969%             99.8943%
 9                          0.0806%             99.9750%
10                          0.0196%             99.9946%
11                          0.0043%             99.9989%
12                          0.0009%             99.9998%
13                          0.0002%            100.0000%
14                          0.0000%            100.0000%
15                          0.0000%            100.0000%
CASE 2 – WHEN MODEL IS INCORRECT (p = 0.03)

Number of Exceptions, x   Probability, f(x)   Cumulative Probability, F(x)
 0                          0.0493%              0.0493%
 1                          0.3813%              0.4306%
 2                          1.4681%              1.8986%
 3                          3.7534%              5.6520%
 4                          7.1682%             12.8202%
 5                         10.9074%             23.7276%
 6                         13.7749%             37.5025%
 7                         14.8501%             52.3526%
 8                         13.9507%             66.3032%
 9                         11.6016%             77.9048%
10                          8.6474%             86.5521%
11                          5.8351%             92.3873%
12                          3.5943%             95.9816%
13                          2.0352%             98.0168%
14                          1.0655%             99.0823%
15                          0.5185%             99.6008%
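The probabilities in the two tables above follow directly from the binomial formula; a minimal sketch that reproduces them with scipy.stats.binom (using T = 250, as in the example above):

from scipy.stats import binom

T = 250  # one year of trading days

for p in (0.01, 0.03):            # correct model vs. incorrect model
    print(f"\np = {p}")
    for x in range(16):
        f = binom.pmf(x, T, p)    # probability of exactly x exceptions
        F = binom.cdf(x, T, p)    # probability of x or fewer exceptions
        print(f"{x:2d}  {f:9.4%}  {F:9.4%}")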
Model Verification Based on Failure Rates
The expected value of x is E[x] = p×T, and the variance is V(x) = p×(1 – p)×T. When T is large, we can use the central limit theorem and approximate the binomial distribution by the normal distribution:

z = (x – p×T) / √[p×(1 – p)×T]

which is approximately distributed as a standard normal, N(0, 1).
This provides a convenient shortcut. If the decision rule is defined at the two-tailed 95 percent test confidence level, then the cutoff value of |z| is 1.96.
Example – JP Morgan’s Exceptions
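The JP Morgan figures themselves are not reproduced in these notes; the sketch below simply applies the z-test just described to illustrative numbers (the exception count and sample size are hypothetical):

import math

def z_statistic(exceptions, T, p):
    # Normal approximation to the binomial test of the failure rate.
    expected = p * T
    std_dev = math.sqrt(p * (1.0 - p) * T)
    return (exceptions - expected) / std_dev

# Hypothetical example: 9 exceptions of a 99 percent VaR over 250 trading days
z = z_statistic(exceptions=9, T=250, p=0.01)
print(f"z = {z:.2f}, reject at the 95% level: {abs(z) > 1.96}")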
Model Verification Based on Failure Rates – Errors
When designing a verification test, the user faces a trade-off between the two types of errors. The table below summarizes the two states of the world, correct versus incorrect model, and the decision:

Decision            Model is correct     Model is incorrect
Accept the model    Correct decision     Type 2 error
Reject the model    Type 1 error         Correct decision
For backtesting purposes, users of VaR models need to balance type 1 errors against type 2 errors. Ideally, one would want to set a low type 1 error rate and then have a test that creates a very low type 2 error rate, in which case the test is said to be powerful. It should be noted that the choice of the confidence level for the decision rule is not related to the quantitative level p selected for VaR. This confidence level refers to the decision rule to reject the model.
Kupiec (1995) develops approximate 95 percent confidence regions for such a test, defined by the tail points of the log-likelihood ratio:

LR_UC = -2 ln[(1 – p)^(T–N) × p^N] + 2 ln[(1 – N/T)^(T–N) × (N/T)^N]
which is asymptotically (i.e., when T is large) distributed chi-square with one degree of freedom under the null hypothesis that p is the true probability. Thus we would reject the null hypothesis if LR > 3.841.
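A sketch of this computation under the definitions above (N exceptions out of T days, VaR tail probability p; the function name is ours):

import math

def kupiec_lr_uc(N, T, p):
    # Kupiec unconditional-coverage likelihood-ratio statistic.
    phat = N / T
    log_null = (T - N) * math.log(1.0 - p) + N * math.log(p)
    # Under the alternative, terms with a zero count contribute nothing (0 * log 0 = 0).
    log_alt = 0.0
    if N < T:
        log_alt += (T - N) * math.log(1.0 - phat)
    if N > 0:
        log_alt += N * math.log(phat)
    return -2.0 * log_null + 2.0 * log_alt

# Hypothetical example: 8 exceptions of a 99 percent VaR over 250 trading days
lr = kupiec_lr_uc(N=8, T=250, p=0.01)
print(f"LR_UC = {lr:.3f}, reject at 95%: {lr > 3.841}")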
Kupiec’s approximate 95 percent confidence regions are reported in this table
For instance, with 2 years of data (T = 510) and a VaR confidence level of 99 percent (p = 0.01), we would expect to observe N = p×T = 0.01 × 510 ≈ 5 exceptions. But the VaR user will not be able to reject the null hypothesis as long as N is within the [1 < N < 11] confidence interval. Values of N greater than or equal to 11 indicate that the VaR is too low or that the model understates the probability of large losses. Values of N less than or equal to 1 indicate that the VaR model is overly conservative.
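The interval quoted above can be recovered numerically by scanning N and keeping the values for which LR_UC stays below 3.841, reusing the kupiec_lr_uc sketch from the previous snippet:

T, p = 510, 0.01
accepted = [N for N in range(T + 1) if kupiec_lr_uc(N, T, p) < 3.841]
print(f"nonrejection region: {accepted[0]} <= N <= {accepted[-1]}")
# With these inputs the accepted values run from 2 to 10, i.e. 1 < N < 11.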
The previous table also shows that this interval, expressed as a proportion N/T, shrinks as the sample size increases. Select, for instance, the p = 0.05 row.
With more data, we should be able to reject the model more easily if it is false.
The table, however, points to a disturbing fact. For small values of the VaR parameter p, it becomes increasingly difficult to confirm deviations. For instance, the nonrejection region under p = 0.01 and T = 252 is [N < 7]. Therefore, there is no way to tell if N is abnormally small or whether the model systematically overestimates risk. Intuitively, detection of systematic biases becomes increasingly difficult for low values of p because the exceptions in these cases are very rare events. This explains why some banks prefer to choose a VaR confidence level such as c = 95 percent (p = 0.05), in order to be able to observe a sufficient number of deviations to validate the model.
Conditional Coverage
So far, the framework focuses on unconditional coverage because it ignores conditioning, or time variation in the data. The observed exceptions, however, could cluster or “bunch” closely in time, which should also invalidate the model. With a 95 percent VaR confidence level, we would expect to have about 13 exceptions every year. In theory, these occurrences should be evenly spread over time. If, instead, we observed that 10 of these exceptions occurred over the last 2 weeks, this should raise a red flag. The market, for instance, could experience increased volatility that is not captured by VaR. Or traders could have moved into unusual positions or risk “holes”.
Whatever the explanation, a verification system should be designed to measure proper conditional coverage, that is, conditional on current conditions. Management then can take the appropriate action.
Such a test has been developed by Christoffersen (1998), who extends the LR_UC statistic to specify that the deviations must be serially independent. The serial independence of deviations is tested with a separate statistic, LR_IND, and the combined test statistic for conditional coverage then is

LR_CC = LR_UC + LR_IND
Each component is asymptotically distributed as a chi-square variable with 1 degree of freedom [i.e., χ^2(1)]. The sum is distributed as a chi-square variable with 2 degrees of freedom [i.e., χ^2(2)]. Thus we would reject at the 95 percent test confidence level if LR_CC > 5.991. We would reject independence alone if LR_IND > 3.841.
If exceptions do seem to cluster abnormally, the risk manager may want to explore models that allow for time variation in risk.
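A sketch of the independence component, following Christoffersen's first-order Markov-chain construction (the function name is ours, and the code assumes the 0/1 hit series contains both at least one exception and at least one non-exception before its last observation):

import math

def christoffersen_lr_ind(hits):
    # hits: sequence of 0/1 indicators, one per day (1 = VaR exceeded).
    # Count transitions trans[i][j]: state i on day t-1 followed by state j on day t.
    trans = [[0, 0], [0, 0]]
    for prev, curr in zip(hits[:-1], hits[1:]):
        trans[prev][curr] += 1

    pi0 = trans[0][1] / (trans[0][0] + trans[0][1])          # P(exception | no exception yesterday)
    pi1 = trans[1][1] / (trans[1][0] + trans[1][1])          # P(exception | exception yesterday)
    pi = (trans[0][1] + trans[1][1]) / sum(map(sum, trans))  # unconditional exception probability

    def loglik(p01, p11):
        ll = 0.0
        for count, prob in [(trans[0][0], 1 - p01), (trans[0][1], p01),
                            (trans[1][0], 1 - p11), (trans[1][1], p11)]:
            if count > 0:
                ll += count * math.log(prob)
        return ll

    # LR_IND compares independence (same exception probability after a hit or a miss)
    # against the first-order Markov alternative.
    return 2.0 * (loglik(pi0, pi1) - loglik(pi, pi))

# Hypothetical daily hit series; combined with the Kupiec sketch above,
# LR_CC = LR_UC + LR_IND is compared with 5.991.
hits = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0]
print(f"LR_IND = {christoffersen_lr_ind(hits):.3f}")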
Basel Committee Rules For Backtesting
The Basel (1996a) rules for backtesting the internal models approach are derived directly from the failure rate test. To design such a test, one has to choose first the type 1 error rate, which is the probability of rejecting the model when it is correct. When this happens, the bank simply suffers bad luck and should not be penalized unduly. Hence one should pick a test with a low type 1 error rate, say, 5 percent (depending on its cost). The heart of the conflict is that, inevitably, the supervisor will also commit type 2 errors for a bank that willfully cheats on its VaR reporting.
The current verification procedure consists of recording daily exceptions of the 99 percent VaR over the last year. One would expect, on average, 1 percent of 250, or 2.5 instances of exceptions over the last year.
The Basel Committee has decided that up to four exceptions are acceptable, which defines a “green light” zone for the bank. If the number of exceptions is five or more, the bank falls into a “yellow” or “red” zone and incurs a progressive penalty whereby the multiplicative factor k is increased from 3 to 4, as described in Table 3-3. An incursion into the “red” zone generates an automatic penalty.
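A minimal sketch of the traffic-light classification (the red-zone boundary of 10 exceptions is the standard Basel value, which the text above does not state explicitly):

def basel_zone(exceptions):
    # Classify one year of 99 percent VaR backtesting results.
    if exceptions <= 4:
        return "green"    # no penalty
    if exceptions <= 9:
        return "yellow"   # penalty at the supervisor's discretion, k raised toward 4
    return "red"          # automatic penalty, k = 4

for n in (3, 6, 12):
    print(n, basel_zone(n))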
Within the “yellow” zone, the penalty is up to the supervisor, depending on the reason for the exception. The Basel Committee uses the following categories of causes for exceptions, and the subsequent application of penalties:
Basic integrity of the model – Exceptions occurred because of incorrect data or errors in the model programming. The penalty “should” apply.
Model accuracy needs to be improved – Exceptions occurred because the model does not describe risks with enough precision. The penalty “should” apply.
Intraday trading – The exceptions occurred because positions changed during intra-day trading. The penalty should be “considered”, but not necessarily applied.
Bad luck – The exceptions occurred because markets were unusually volatile or correlations changed unexpectedly. Such exceptions should be expected to occur at least some of the time. No penalty guidance is provided.