Stress testing has been part of the risk manager’s toolkit for a long time. Perhaps the most basic risk question is how resilient an exposure, be it a single position, a loan, or a whole portfolio, is to deteriorating conditions.
Typically, the stresses take the form of:
Sensitivities (spreads double, prices drop, volatilities rise) or
Scenarios (Black Monday 1987, autumn of 1998, post-Lehman bankruptcy, severe recession, stagflation).
These types of stresses lend themselves naturally to understanding financial risks, particularly in a data-rich environment such as that found in a trading operation. Nonfinancial risks, such as operational, reputational, and other business risks, are much harder to quantify and parameterize, so their assessment relies heavily on scenario analysis (e.g., earthquakes, computer hacking, legal risks).
While the original Basel I Accord of 1988 did not make any formal mention of stress testing, it merited its own section in the Market Risk Amendment of 1995 and thus became embedded in the regulatory codex.
There are three kinds of capital and liquidity:
The capital/liquidity firms have.
The capital/liquidity firms need (to support business activities).
The capital/liquidity regulators think firms need.
Stress testing, regulatory, and bank-internal (so-called “economic”) models all seek to do the same thing: assess the amount of capital and liquidity needed to support the business activities of the financial institution.
Capital adequacy addresses the right side of the balance sheet (net worth), and liquidity the left side (share of assets that are “liquid,” however defined).
If all goes well, both the capital/liquidity that firms need (economic) and the capital/liquidity that regulators think firms need (regulatory) are less than what firms actually have, and the difference between the two requirements is small.
Of course, neither firm-internal (economic) nor regulatory capital and liquidity models can guarantee failure prevention. That is not their purpose either, as every firm accepts some probability of failure, sized by its risk appetite. During the financial crisis, the cascading of defaults and the market’s resulting deep skepticism of stated capital adequacy forced regulators to turn to stress testing to assess the capital adequacy of banks in a credible way.
Supervisory Capital Assessment Program (SCAP)
A successful macro-prudential stress testing program, particularly in a crisis, has at least two components:
A credible assessment of the capital strength of the tested institutions, to size the capital “hole” that needs to be filled.
A credible way of filling that hole.
The US bank stress test in 2009, the Supervisory Capital Assessment Program (SCAP), had these two components.
The US entered 2009 with enormous uncertainty about the health of its banking system. In the absence of a more concrete and credible understanding of the problems with bank balance sheets, investors were reluctant to commit capital, especially given the looming threat of possible government dilution. With a credible assessment of losses under a sufficiently stressful macroeconomic scenario, the supervisors hoped to establish limits for the markets: fill this hole, and you won’t risk being diluted later because the scenario wasn’t tough enough.
Moreover, if some institutions could not convince investors to fill the hole, a US government program, namely the Treasury’s Capital Assistance Program (CAP), stood ready to supply the required capital. Importantly, the US Treasury was a sufficiently credible debt issuer that the CAP promise was itself credible.
All banks with assets greater than $100 billion at year-end 2008 were included, 19 in total, accounting for two-thirds of the total assets and about half of the total loans in the US banking system. In the end, ten of the 19 SCAP banks were required to raise a total of $75 billion in capital within six months and indeed raised $77 billion of Tier 1 common equity in that period. None needed to draw on CAP funds.
The SCAP was the first of the macro-prudential stress tests of this crisis, but the changes at the micro-prudential or bank-specific level were at least equally significant. With the SCAP, stress testing at banks went:
From mostly single-factor shocks (or a handful) to using a broad macro scenario with market-wide stresses.
From product or business unit stress testing, focusing mostly on losses, to firm-wide and comprehensive testing, encompassing losses, revenues, and costs.
Features of Stress Testing, Pre-SCAP and Post-SCAP
Pre-SCAP | Post-SCAP
Mostly single shock | Broad macro scenario and market stress
Product or business unit level | Comprehensive, firm-wide
Static | Dynamic and path dependent
Not usually tied to capital adequacy | Explicit post-stress common equity threshold
Losses only | Losses, revenues and costs
EBA Stress Testing
The European experience in 2010 and 2011 stands in stark contrast to the 2009 SCAP. The Committee of European Banking Supervisors (CEBS) conducted a stress test of 91 European banks in 2010 amidst the looming debt crisis. The level of disclosure was notably lower than in the SCAP. For instance, loss rates by firm were only made available for two sub-categories: overall retail and overall corporate. By contrast, the SCAP results released loss rates by major asset classes such as first-lien mortgages, credit cards, commercial real estate, and so on. Subsequent stress tests conducted largely by outside independent advisors revealed that the CEBS tests were inadequate. Hence, to help close the credibility gap, the extent and degree of disclosure were increased later.
The stakes for the 2011 European stress test, now conducted by the successor to the CEBS—the European Banking Authority (EBA)—had risen substantially. The degree of disclosure was much more extensive, including information on exposure by asset class and by geography. Importantly, all bank-level results were available to download in spreadsheet form, enabling market analysts to easily impose their own loss rate assumptions. In this way, the “official” results were no longer so final: analysts could (and did) easily apply their own sovereign haircuts on all exposures and thus test the solvency of any of the 90 institutions themselves.
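The kind of do-it-yourself analysis that this disclosure enabled can be sketched in a few lines. This is a minimal illustration, not the EBA’s methodology: the bank figures, the country labels, the haircut percentages, and the function name `ratio_after_haircuts` are all hypothetical, and the haircut loss is simply deducted from capital while risk-weighted assets are held fixed.

```python
def ratio_after_haircuts(capital, rwa, sovereign_exposures, haircuts):
    """Capital ratio after deducting analyst-chosen sovereign haircuts.

    capital, rwa: reported capital and risk-weighted assets (same currency).
    sovereign_exposures: dict mapping country -> exposure amount.
    haircuts: dict mapping country -> haircut fraction (0.0 to 1.0);
              countries missing from the dict receive no haircut.
    Simplifying assumption: losses reduce capital one-for-one, RWA is unchanged.
    """
    loss = sum(amount * haircuts.get(country, 0.0)
               for country, amount in sovereign_exposures.items())
    return (capital - loss) / rwa


# Hypothetical bank and haircuts, for illustration only.
exposures = {"A": 10.0, "B": 5.0, "C": 20.0}
analyst_haircuts = {"A": 0.40, "B": 0.20}

ratio = ratio_after_haircuts(capital=12.0, rwa=150.0,
                             sovereign_exposures=exposures,
                             haircuts=analyst_haircuts)
print(f"Post-haircut capital ratio: {ratio:.2%}")
```

Changing the entries in `analyst_haircuts` and rerunning is the code equivalent of an analyst overriding the official loss assumptions in the downloaded spreadsheet.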
Comprehensive Capital Analysis And Review (CCAR)
All supervisory stress tests imposed the same scenario on all banks. A common scenario makes results comparable across banks, but it may miss the vulnerabilities specific to a given bank’s portfolio and business mix. The US CCAR program, which has been in operation since 2011, recognized this problem and asked banks to submit results using their own scenarios (baseline and stress) in addition to results under the common supervisory stress scenario. This was an important step forward from the 2009 SCAP.
By asking banks to develop their own stress scenario(s), thus revealing the particular sensitivities and vulnerabilities of their specific portfolio and business mix, supervisors could learn what the banks themselves thought to be the high-risk scenarios. This is useful not just for micro-prudential supervision—learning about the risk of a given bank—but also for macro-prudential supervision, by allowing for the possibility of discovering common risks across banks which may hitherto have been undiscovered or under-emphasized. With this dual approach, supervisors could compare results across banks from the common scenario directly, without sacrificing risk discovery.
Stress Testing Design
Perhaps the most fundamental choice in stress testing design is the risk appetite of the authorities—how severe and how long the stress scenario should be, and what the post-stress hurdle is. The target calibration is not strict solvency (i.e., just enough capital to have a positive net worth), but rather some notion of adequate capitalization post-stress. For example, the 2009 SCAP in the US presented a two-year scenario with a post-stress hurdle of 4% Tier 1 common capital.
While length and post-stress hurdles are easy to compare across macro-prudential stress tests, scenario severity is not: a single variable can be compared directly, but a multivariate assessment is much more difficult.
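Given a post-stress hurdle, the size of the capital “hole” follows from simple arithmetic. The sketch below assumes the hurdle is a ratio of capital to risk-weighted assets (RWA) and that raising capital leaves RWA unchanged; the function name and the balance sheet figures are hypothetical.

```python
def capital_shortfall(post_stress_capital, post_stress_rwa, hurdle=0.04):
    """Capital needed to reach the post-stress hurdle ratio (0 if already met)."""
    required = hurdle * post_stress_rwa
    return max(0.0, required - post_stress_capital)


# Hypothetical bank: 30 of capital against 1,000 of RWA at the end of the
# stress horizon, tested against a 4% hurdle.
shortfall = capital_shortfall(post_stress_capital=30.0, post_stress_rwa=1000.0)
print(f"Capital to raise: {shortfall:.1f}")
```

A bank whose post-stress capital already exceeds the hurdle returns a shortfall of zero, which corresponds to a bank that was not required to raise capital.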
Once the risk appetite has been established, one of the principal challenges faced by both the supervisors and the firms when designing stress scenarios is coherence.
The scenarios are inherently multi-factor. A rich description of adverse states of the world has to be developed in the form of several risk factors, be they financial or real, taking on extreme yet coherent (or possible) values.
It is not sufficient to specify only high unemployment, a significant widening of credit spreads, or a sudden drop in equity prices. When one risk factor moves significantly, the others move too.
The real difficulty is in specifying a coherent joint outcome of all the relevant risk factors. For example, not all exchange rates can depreciate at once; some have to appreciate. A high inflation scenario needs to account for likely monetary policy responses, such as an increase in the policy interest rate.
Every market shock scenario resulting in a flight from risky assets—“flight to quality”—must have a (usually small) set of assets that can be considered safe havens. These are typically government bonds from the safest sovereigns (e.g., the US, Japan, Germany, Switzerland).
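One simple way to make a stressed factor drag the other factors along coherently is to condition on it under a joint distribution. The sketch below assumes zero-mean, jointly normal factor changes, which is a modeling assumption not made in the text; the factor names, volatilities, and correlations are hypothetical.

```python
def implied_shocks(vols, corr_with_stressed, stressed_name, stressed_move):
    """Expected moves in other factors given a move in one stressed factor.

    For zero-mean jointly normal changes:
        E[x_j | x_s] = rho_js * (vol_j / vol_s) * x_s
    vols: dict mapping factor name -> volatility of its change.
    corr_with_stressed: dict mapping factor name -> correlation with the
                        stressed factor (the stressed factor itself excluded).
    """
    vol_s = vols[stressed_name]
    scenario = {stressed_name: stressed_move}
    for name, rho in corr_with_stressed.items():
        scenario[name] = rho * vols[name] / vol_s * stressed_move
    return scenario


# Hypothetical annualized volatilities and correlations with equities.
vols = {"equity": 0.20, "credit_spread": 0.01, "safe_haven_yield": 0.008}
corr_with_equity = {"credit_spread": -0.6, "safe_haven_yield": 0.4}

# Stress: equities fall 40%. Spreads and safe-haven yields follow from the
# assumed correlations rather than being set independently.
scenario = implied_shocks(vols, corr_with_equity, "equity", -0.40)
for name, move in scenario.items():
    print(f"{name}: {move:+.4f}")
```

With these inputs the equity drop implies wider credit spreads and lower safe-haven yields, which is the flight-to-quality pattern described above. Historical correlations may not hold in a crisis, so this is a starting point for a coherent scenario rather than a substitute for judgment.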
The problem of coherence is especially acute when considering stress scenarios for market risk, i.e., for portfolios of traded securities and derivatives.
These portfolios are typically marked to market as a matter of course and risk managed in the context of a value-at-risk (VaR) system. In practice, this means that the hundreds of thousands (or more) of positions in the trading book are mapped to tens of thousands of risk factors, which are tracked on a (usually) daily basis and form the “data” used to estimate risk parameters like volatilities and correlations. Finding coherent outcomes in such a high-dimensional space, short of resorting to historical realizations, is extremely difficult.
The challenge of finding a scenario in which the real and financial factors are jointly coherent further complicates the process.
The 2009 SCAP had a rather simple scenario specification. The state space had only three dimensions: GDP growth, unemployment, and the house price index (HPI). The market risk scenario was based on historical experience: an instantaneous risk factor impact reflecting changes from June 30 to December 31, 2008. This period represented a massive flight to quality, with the markets experiencing the failure of at least one global financial institution (Lehman), and risk premia at the time arguably placed a significant probability on the kind of adverse real economic outcome painted by the tri-variate SCAP scenario. This solution achieved a loose coherence of the real and financial stresses. However, it did not test for anything new.
Challenges In Modeling Losses
Geographic Heterogeneity – For a firm that is active in many markets (both product and geography), the first task is to map from the few macro-factors to the many intermediate risk factors that drive losses for particular products by geography.
The EBA was forced to confront the problem of geographic heterogeneity directly, since it spans 21 sovereign nations with rather different economies.
US supervisors left the task of accounting for the geographic heterogeneity to individual firms.
Regional differences are critical in modeling losses for real estate lending (residential and commercial), wholesale lending (particularly for SMEs), etc.
Since the US experiences regional business cycles—the national business cycle obscures a considerable degree of variation across states—nearly all lending has some geographic component. For example, credit card losses are especially sensitive to unemployment, and in July 2011, with the national rate at 9.1%, the state-level unemployment rate ranged from 3.3% in North Dakota to 12.9% in Nevada.
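The gap between a national macro factor and the factor a specific portfolio actually faces can be illustrated with an exposure-weighted average. The unemployment rates below are the July 2011 figures quoted above; the credit card exposures and the function name are hypothetical.

```python
def exposure_weighted_rate(exposures, local_rates):
    """Average of local rates, weighted by the exposure in each region."""
    total = sum(exposures.values())
    return sum(exposures[region] * local_rates[region]
               for region in exposures) / total


# July 2011 unemployment rates (percent), as quoted in the text.
national_rate = 9.1
state_unemployment = {"North Dakota": 3.3, "Nevada": 12.9}

# Hypothetical credit card book concentrated in Nevada.
card_exposures = {"North Dakota": 20.0, "Nevada": 80.0}

effective_rate = exposure_weighted_rate(card_exposures, state_unemployment)
print(f"National: {national_rate:.1f}%  Portfolio-weighted: {effective_rate:.2f}%")
```

For this hypothetical book the portfolio-weighted rate is about 10.98%, well above the national 9.1%, so a loss model driven only by the national rate would understate stress on the portfolio.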
Loss Severity Independence from the Business Cycle – Even if the default rate on auto loans increases significantly during a recession, the corresponding loss given default (LGD), or loss severity, need not.
An interesting example is auto lending and leasing, where the collateral assets are used cars. While new car sales invariably decline during a recession, used car sales typically suffer less, which tends to support collateral values.
The problem of loose coupling of the loss severity to the business cycle is not limited to auto loans. For corporate credit, an important determinant of LGD is whether the industry of the defaulted firm is in distress at the time of default. For example, if the airline industry is in distress, and a bank is stuck with the collateral on defaulted aircraft loans or leases, it will be hard to sell those aircraft except at very depressed prices. The healthcare sector may be relatively robust at the time, as it was during the recent recession, but it is difficult to transform an airplane into a hospital.
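Exposure and CVA Stress Test – In a stressful environment, not only does the counterparty’s probability of default (PD) change, but so does the exposure to that counterparty. Thus, any stress test of the credit valuation adjustment (CVA) involves two distinct simulation exercises: one for default risk and one for exposure.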
If the collateral posted by the counterparty is anything other than cash or a cash equivalent, a revaluation of that collateral under the same stress scenario needs to be added to the process.
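The two exercises can be seen in a standard discretized approximation of unilateral CVA, namely LGD times the sum over periods of discount factor, expected exposure, and marginal default probability. The sketch below uses that approximation with hypothetical profiles; the stress multipliers are crude stand-ins for a proper re-simulation, and collateral revaluation is left out for brevity.

```python
def cva(expected_exposure, marginal_pd, discount_factors, lgd=0.6):
    """Approximate unilateral CVA: LGD * sum_t DF_t * EE_t * PD_t."""
    if not (len(expected_exposure) == len(marginal_pd) == len(discount_factors)):
        raise ValueError("all profiles must have the same length")
    return lgd * sum(df * ee * p
                     for ee, p, df in zip(expected_exposure, marginal_pd,
                                          discount_factors))


# Hypothetical per-period profiles for one counterparty.
base_ee = [10.0, 12.0, 8.0]      # expected exposure
base_pd = [0.010, 0.012, 0.015]  # marginal default probability per period
dfs = [0.99, 0.97, 0.95]         # discount factors

# Exercise 1: exposure under stress (assumed 1.5x the base profile).
stressed_ee = [1.5 * ee for ee in base_ee]
# Exercise 2: default probabilities under stress (assumed to double).
stressed_pd = [2.0 * p for p in base_pd]

base_cva = cva(base_ee, base_pd, dfs)
stressed_cva = cva(stressed_ee, stressed_pd, dfs)
print(f"Base CVA: {base_cva:.4f}  Stressed CVA: {stressed_cva:.4f}")
```

Because the two effects multiply, stressing only the PD or only the exposure would understate the change in CVA.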
Challenges In Modeling Revenues
Lack of Details – Implementing stress scenarios on the revenue side of the equation remains largely a black box and seems far less developed than stress testing for losses. Neither the 2009 SCAP nor the otherwise richly documented 2011 EBA disclosures devoted much space or revealed much detail about the methods and approaches for computing revenues under stressful conditions.
Interest Rates Variability – Banks’ total income can be divided roughly into interest and non-interest income. The interest income is clearly a function of the yield curve and credit spreads posited under the stress scenario, but the net impact of rising or falling rates on bank profitability remains ambiguous, perhaps in part because of interest rate hedging strategies.
Non-Interest Income – The impact of stress scenarios on non-interest income, which includes service charges, fiduciary fees, and other income (e.g., from trading), is far harder to assess, and there has been precious little discussion of its determinants in the literature.
Challenges In Modeling Balance Sheets
Capital adequacy is defined in terms of a capital ratio, roughly capital over assets. The denominator is typically risk-weighted assets (RWA), where the risk weights are determined by the prevailing regulatory capital regime, namely Basel I (in the US cases of the SCAP and CCAR) and Basel II (in the European stress tests). A bank may be forced to raise capital under one regime but not the other, and there is no way to know which regime will result in a more favorable treatment without knowing about the portfolio in considerable detail.
Regardless of the risk weight regime, determining the post-stress capital adequacy requires modeling both the income statement and the balance sheet, including both flows and stocks, over the course of the stress test horizon, which is typically two years. This is illustrated in the figure on the next page.
The point of departure is the current balance sheet, at which point the bank meets the required capital (and, if included, liquidity) ratios.
The starting balance sheet generates the first quarter’s income and loss, which in turn determines the quarter-end balance sheet.
The modeler is then faced with the problem of considering the nature and amount of new assets originated and/or sold during the quarter, and any other capital-depleting or conserving actions, such as acquisitions or spin-offs, dividend changes, or share repurchase or issuance programs, including employee stock and stock option programs.
The problem of balance sheet modeling exists under either a static or a dynamic balance sheet assumption. In either case, the bank should not drop below the required capital (and liquidity) ratios in any quarter.
Moreover, at the end of the stress horizon, the bank needs to estimate the amount of reserves needed to cover expected losses on loans and leases for the following year. In this way, a two-year stress test is effectively three years long (or T + 1 years for a T-year stress test).
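The quarter-by-quarter mechanics described above can be sketched as a simple roll-forward. This is a minimal illustration under a static balance sheet: RWA is held constant, taxes and new originations are ignored, and the reserve build for the year after the horizon is charged in the final quarter. The function name and all figures are hypothetical.

```python
def project_capital_path(capital, rwa, revenues, losses, dividends,
                         min_ratio, final_reserve_build=0.0):
    """Roll capital forward one quarter at a time.

    Returns (ratios, breached): the capital ratio at each quarter end and
    whether any quarter falls below min_ratio.
    """
    if not (len(revenues) == len(losses) == len(dividends)):
        raise ValueError("quarterly inputs must have the same length")
    ratios = []
    n_quarters = len(losses)
    for q in range(n_quarters):
        # Income adds to capital; losses and distributions deplete it.
        capital += revenues[q] - losses[q] - dividends[q]
        if q == n_quarters - 1:
            # Reserves for expected losses in the year after the horizon.
            capital -= final_reserve_build
        ratios.append(capital / rwa)
    breached = any(r < min_ratio for r in ratios)
    return ratios, breached


# Hypothetical two-year (eight-quarter) stress path.
ratios, breached = project_capital_path(
    capital=60.0, rwa=1000.0,
    revenues=[5.0] * 8,
    losses=[6.0, 8.0, 10.0, 12.0, 10.0, 8.0, 6.0, 5.0],
    dividends=[0.5] * 8,
    min_ratio=0.04,
    final_reserve_build=4.0,
)
for quarter, r in enumerate(ratios, start=1):
    print(f"Q{quarter}: {r:.2%}")
print("Breach of minimum ratio:", breached)
```

Checking every quarter rather than only the end point matters because the ratio can fall below the minimum partway through the horizon even if later quarters recover. A dynamic balance sheet would additionally update `rwa` each quarter for originations, sales, and changes in risk weights.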