Model risk

Neptune, Basinghall’s leading edge model management solution


A typical model risk management solution

As shown below, comprehensive model management consists of three distinct components:

  1. Modelling platform (MP)
  2. Model risk management (MRM)
  3. Model risk quantification (MRQ)

Basinghall offers an all-in-one model management solution

It has been built, from the ground up, to handle both model management and model risk management


Domain diagram


Neptune – principal differentiators

  • Comprehensive solution (most competitor solutions cover only one aspect)
  • Designed from the ground up by practitioners, using cutting-edge technology, to work in practical environments

Full model management
  • Only tool on the market that combines model risk management with a model execution platform

Populated tool
  • Typical MRM workflows embedded in the tool
  • Includes population of placeholder artefacts

Technology
  • API-first and domain-driven design
  • Microservices architecture
  • Cloud / Spark implementation

Model risk quantification
  • Only tool on the market with MRQ
  • Algorithms to compute MRQ are part of the model library
  • Once calculated, these metrics become part of MRM monitoring
  • They then naturally feed into visualisation and reporting

Embedded domain knowledge
  • The concept of a model version tracks all model changes and allows parallel use of model variants
  • The lifecycle, represented as a directed graph, encodes the client's model governance process
  • The concept of model usage enables focus on the suitability of models for specific applications
  • Relationships between models are easy to set up and visualise as a directed graph
  • The concept of a model lake serves as the golden source of all model-related information

Sandbox and main tool
  • Unique combination of user flexibility and robust control
  • The MRM component is the same in both tools (microservices-based)
  • The MP sandbox uses Jupyter notebooks to execute models and for model build/validation
  • The MP production tool uses a full cloud/Spark solution with an enterprise-level UI


How to manage model risk at the enterprise level

A senior stakeholder, e.g. a Board member or the chair of the BRC, would want a view of model risk across the organisation, or across the model portfolio that impacts regulatory capital.
In addition, they would like to see how this aggregated model risk evolves over time and how it compares to the model risk appetite.
The solution is a robust methodology for aggregating model risk across the relevant model portfolio. The extended MRQ module provides this, as shown below.
Enterprise-wide Model Risk Capital
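The aggregation step can be sketched in a few lines. This is a minimal illustration, assuming each model contributes an independent, normally distributed error described by a bias and a noise standard deviation, weighted by its materiality; the function names and the portfolio numbers are hypothetical, not Neptune's actual API.

```python
import math

def aggregate_model_risk(models):
    """Aggregate per-model error parameters into a portfolio-level figure.

    Each entry is (materiality_weight, bias, noise_sd). Assuming the
    model errors are independent and normal, the weighted biases add
    and the weighted variances add.
    """
    total_bias = sum(w * b for w, b, _ in models)
    total_sd = math.sqrt(sum((w * s) ** 2 for w, _, s in models))
    return total_bias, total_sd

def enterprise_risk_figure(models, z=1.645):
    """One-sided 95% aggregate: bias plus z times the aggregate noise."""
    bias, sd = aggregate_model_risk(models)
    return bias + z * sd

# Hypothetical portfolio: (materiality weight, bias, noise sd)
portfolio = [
    (0.5, 0.02, 0.05),   # e.g. a mortgage LGD model
    (0.3, -0.01, 0.03),  # e.g. an SME PD model
    (0.2, 0.00, 0.01),
]
```

Tracking `enterprise_risk_figure` over successive monitoring dates gives the time series that can be compared against the risk appetite.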

Typical commentary accompanying
summary model risk report

General commentary
  • The aggregated model risk has always been below the tolerance, except in June 2017; that peak was due to a policy change that increased the risk appetite as part of the lending strategy
  • The aggregated risk has breached the appetite a few times, essentially due to two material LGD models for mortgage lending
Portfolios and models
  • UK mortgage lending: there is a well-known issue with two LGD models. These carry high risk due to a lack of calibration data and continuous changes in policy regarding cut-off thresholds
  • The PD and LGD models for SME lending also have high model risk (discovered after the portfolio was bought)
Actions
  • More resources are required in the model development team to address the issues highlighted on the left
  • Perhaps a few resources can be “borrowed” from the validation team in the interim

Where is my risk?

We need to understand where the model risk sits.
This is accomplished by plotting model risk (bias and noise) against materiality (impact on RWA or P&L).
This view concerns MRQ only; the related operational controls, which also mitigate risk, are summarised separately.
Guide
  • Material and risky (red): close monitoring and quick action required
  • Material and safe: no concerns as long as there is no movement, trigger notification if the situation deteriorates
  • Non-material: no concern
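The guide above can be expressed as a simple classification rule. A minimal sketch with illustrative, uncalibrated thresholds; the function name and the category labels are hypothetical.

```python
def classify_model(materiality, risk,
                   materiality_threshold=0.1, risk_threshold=0.05):
    """Map a model's materiality (e.g. RWA or P&L impact) and its
    quantified risk (bias plus noise) onto the guide above.

    Thresholds are illustrative placeholders, not calibrated values.
    """
    if materiality < materiality_threshold:
        return "green"   # non-material: no concern
    if risk >= risk_threshold:
        return "red"     # material and risky: close monitoring, quick action
    return "amber"       # material but safe: watch for deterioration
```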

Model risk quantification

1
Measure

For every sample, measure the realised model error and express it as the post-model adjustment that aligns the model with the observed outcomes

2
Update

Combine the observed error estimates to get cumulative error parameters, taking into account the dynamics of the model errors, e.g.

  • Stationary, i.e. no dynamics, but an increasing volume of observed data
  • External factors, e.g. dependent on cyclical driver
  • Hidden variables, e.g. Kalman filter process
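The hidden-variable case can be illustrated with a scalar Kalman filter that tracks a slowly drifting model bias from a sequence of observed errors. This is a sketch of the general technique under a random-walk assumption, not Neptune's implementation; all numbers are illustrative.

```python
def kalman_update(mean, var, obs, obs_var, process_var):
    """One step of a scalar Kalman filter tracking a drifting bias.

    Predict: the bias follows a random walk, so the prior variance
    grows by process_var. Update: blend the prior with the newly
    observed error, weighted by precision.
    """
    var = var + process_var                 # predict
    gain = var / (var + obs_var)            # Kalman gain
    mean = mean + gain * (obs - mean)       # update the bias estimate
    var = (1.0 - gain) * var                # shrink the uncertainty
    return mean, var

# Run over a sequence of observed model errors (illustrative numbers),
# starting from a diffuse prior.
bias_mean, bias_var = 0.0, 1.0
for err in [0.04, 0.05, 0.03, 0.06]:
    bias_mean, bias_var = kalman_update(bias_mean, bias_var, err,
                                        obs_var=0.01, process_var=0.001)
```

Each monitoring dataset tightens the posterior, so the uncertainty around the bias estimate shrinks as evidence accumulates.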
3
Usage

Use the latest error parameters to augment the model output with its uncertainty

  • Siloed approach for similar use cases (e.g. BoE ST, EBA ST, ICAAP)

Model risk quantification output

The graphs show the quantified model risk (bias and noise) for the BB rating grade, as computed at successive monitoring points. The model's output is an IRB 1-year through-the-cycle PD, and hence shows only small variations (blue line).

As expected, the MRQ procedure generally moves the corrected PD (orange circles) in the direction of the observed default rate. We can see that the uncertainty is not symmetric around its mean value due to nonlinearities (in logistic regression).

As monitoring data gets collected over time it sheds further light on a model’s uncertainty and we estimate the bias and noise after each monitoring dataset becomes available. In some situations, we may want to adjust for the bias and compute a “corrected” PD.
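The asymmetry described above can be reproduced in a few lines: if the bias and noise are estimated on the logit scale, a symmetric interval there maps back to an asymmetric interval in PD space. A minimal sketch under that assumption; the parameter values are illustrative, not taken from the charts.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def corrected_pd_interval(model_pd, bias, noise_sd, z=1.645):
    """Apply the estimated bias on the logit scale, then map a
    symmetric +/- z*sd interval back to PD space. The logistic
    transform makes the PD interval asymmetric around the centre.
    """
    centre = logit(model_pd) + bias
    lo = inv_logit(centre - z * noise_sd)
    mid = inv_logit(centre)          # the "corrected" PD
    hi = inv_logit(centre + z * noise_sd)
    return lo, mid, hi

# Illustrative inputs: model PD 2%, small positive bias, logit-scale noise
lo, mid, hi = corrected_pd_interval(model_pd=0.02, bias=0.1, noise_sd=0.3)
```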

Traditional metrics
Model Risk Quantification: PD Model [BB Rating]

How does model error arise?


Multiple sources contribute to the overall model risk:

  • General estimation error – reflecting the dispersion of the distribution of the statistical estimator
  • Data deficiencies – may lead to errors at model build and / or production stage
  • Drivers – models containing different drivers are likely to have different error profiles
  • Functional form – the type of model selected for the build will drive the type of error distribution
  • Exogenous – changes to the environment, post implementation, may shift the error distribution.

The various drivers will interact to produce the overall error distribution:

  • Where model risk is low, actual outcomes will be centred closer to the model estimate, and dispersion will be low
  • Where model risk is high, actual outcomes will be centred further from the model estimate, and / or dispersion will be high.
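The two regimes can be illustrated with a small simulation, assuming normally distributed errors: bias shifts the centre of actual outcomes away from the model estimate, and dispersion widens them. All numbers are illustrative.

```python
import random
import statistics

def simulate_outcomes(model_estimate, bias, dispersion, n=10_000, seed=0):
    """Draw actual outcomes around a model estimate: bias shifts the
    centre of the error distribution, dispersion widens it."""
    rng = random.Random(seed)
    return [model_estimate + bias + rng.gauss(0.0, dispersion)
            for _ in range(n)]

# Low model risk: outcomes centred close to the estimate, low dispersion
low_risk = simulate_outcomes(0.05, bias=0.001, dispersion=0.005)
# High model risk: outcomes centred further away, and/or high dispersion
high_risk = simulate_outcomes(0.05, bias=0.02, dispersion=0.03)

low_spread = statistics.stdev(low_risk)
high_spread = statistics.stdev(high_risk)
```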

Margin of conservatism

The essence of MoC is that we shift the output of the model to make it sufficiently conservative; the flip side is that this makes the model less accurate. One needs to manage the trade-off between accuracy and conservatism judiciously. MoC applies only in contexts where being conservative is more important than being accurate (e.g. the regulatory requirements of an IRB framework).
1
Usage Level Quantification and
conservatism requirement
  • Model risk quantification at the usage level estimates the error distribution for the usage metric (e.g. IRB capital) as shown below
  • If conservatism is required, then underestimation beyond an acceptable threshold needs to be limited
  • The grey area on the left shows the part of the error distribution that exceeds the tolerance on the downside (underestimation)
  • A wide error distribution implies a trade-off between accuracy and conservatism – one cannot tune for both
2
Margin of Conservatism
  • MoC is applied as a mitigating control against underestimation by shifting the error distribution to the upside (the grey area reduces)
  • This ensures prudential outcomes even in the presence of model deficiencies
  • This comes at the cost of a loss of accuracy (which should be kept to a minimum)
  • MoC is a key regulatory requirement
  • MoC is expected to decrease over time as one accumulates more data to assess or recalibrate the model
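The shift can be computed directly when the error distribution is assumed normal: the MoC is the amount needed to move the tolerance quantile of the error distribution up to zero, so that the probability of underestimation does not exceed the tolerance. A minimal sketch; the tolerance and parameters are illustrative.

```python
from statistics import NormalDist

def margin_of_conservatism(bias, noise_sd, tolerance=0.05):
    """Shift needed so that P(error below zero, i.e. underestimation)
    is at most `tolerance`, assuming a normal error N(bias, noise_sd).

    The required shift moves the tolerance-quantile of the error
    distribution up to zero; no MoC is needed if it is already there.
    """
    q = NormalDist(mu=bias, sigma=noise_sd).inv_cdf(tolerance)
    return max(0.0, -q)

# Unbiased but noisy model: MoC covers the noise at the 5% tolerance
moc = margin_of_conservatism(bias=0.0, noise_sd=0.1, tolerance=0.05)
```

After applying the shift, the residual probability of underestimation equals the tolerance, which is the "grey area reduces" effect described above.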

Margin of Conservatism
for IRB models

For mortgages, conservatism is usually applied at several points due to factors such as low default volumes, lack of historical data, changes in portfolio composition over time, and uncertainties around idiosyncratic portfolio issues. Here are some examples:
Regulatory Category A
  • Historical forbearance data is unavailable for a portfolio. Historical PDs are inferred using the relationship between the PD with and without forbearance in the known period
  • The inference may be considered an “appropriate adjustment”, however any conservatism added as part of the process would be captured under Category A
Regulatory Category B
  • Risk appetite was recently expanded, and a small increase in PD has been observed. An uplift is applied to the historic PD to reflect the uncertainty of the full impact
  • The uplift would be captured as Category B conservatism
Over time, the true impact can be evaluated and the conservatism may be reduced
Regulatory Category C
  • The Forced Sale Discount for a segment was estimated based on a limited number of possession sales. An uplift was applied to reflect uncertainty in the estimate
    Number of observations: 70
    Mean FSD: 29.3
    Standard deviation: 4.6
    Upper 95% confidence level: 32.3
  • If the 95% confidence level is taken as the estimate, then the difference from the mean may be captured as Category C conservatism
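One common construction of such a bound is the sample mean plus a one-sided normal quantile times the standard error; the source does not state the exact method behind the quoted figure, so this sketch is illustrative only.

```python
import math

def upper_confidence_bound(mean, sd, n, z=1.645):
    """One-sided upper confidence bound on the mean:
    mean + z * standard error, assuming an approximately
    normal sampling distribution. One common construction;
    other methods (e.g. t-based or distributional bounds) differ."""
    return mean + z * sd / math.sqrt(n)

# Inputs from the Category C example above
ucb = upper_confidence_bound(mean=29.3, sd=4.6, n=70)
moc_category_c = ucb - 29.3   # difference from the mean, captured as MoC
```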
While it may be possible to reduce conservatism over time, the direction of travel for IRB means that floors (at model component and output level) are likely to limit the impact on the ultimate capital requirement. However, the framework for MoC remains a useful tool to help understand and explain the conservatism built into models.
In the case where proper model risk quantification has been implemented, then the expert judgments illustrated above can be compared to the MoC that follows from the quantification – providing further justification and insight.

Artificial Intelligence and Machine Learning models

  • The services of Basinghall Analytics in AI and ML cover the full model lifecycle: build, train, validate, implement, productionise, monitor and govern
  • Our differentiated offering provides a scoring mechanism for AI/ML models: it computes the bias and noise in AI/ML models and scores them accordingly
The AI/ML universe
Performance comparison - Traditional vs. ML models