Wrap-up from 2020 Autumn conference talks

We discuss some questions following our 2020 Autumn conference talks
machine learning, talks

Author: Nigel Carpenter

Published: November 2, 2020

Between September and November 2020, a number of working party members presented at three conferences:

Our talks provided an update on the activities of the working party from some of our workstreams. We covered:

We’ve already shared some of this material on our blog and will be sharing more over the next few months, so check back to read our latest posts and keep an eye on the workstream pages, which will also curate the material.

Questions

We received a number of questions at the American conferences. The questions and our answers raise some interesting points, so we decided to share them more widely.

The documentation of reserving using ML tools and methods, particularly where it will be reviewed by auditors or insurance regulators, seems like a big hurdle to implementation, especially if external data is used. A really robust approach to documentation and testing could be required. Thoughts?

Documentation and testing are increasingly becoming a positive point of differentiation for ML. Good machine learning is reproducible and objective in its decision making, rather than subjective as can be the case with traditional reserving. What’s more, there are many ways of explaining the choices and outputs of machine learning algorithms. These can even be put into human-readable form, which opens up the possibility of automated model documentation, a product I have seen some vendors actively developing.
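
As a simple illustration of the explainability point, the sketch below uses scikit-learn’s permutation importance to turn a fitted model’s behaviour into plain-English statements of the kind that could feed automated documentation. The dataset, feature names and model choice are illustrative assumptions, not anything the working party prescribes.

```python
# A minimal sketch, assuming a small illustrative reserving-style dataset,
# of turning model behaviour into human-readable statements with
# scikit-learn's permutation importance. All names and data are made up.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "accident_period": rng.integers(2010, 2020, n),
    "development_period": rng.integers(1, 11, n),
    "claim_count": rng.poisson(20, n),
})
# Illustrative target: incremental paid amounts decaying over development.
y = 1000 * np.exp(-0.3 * X["development_period"]) * X["claim_count"] \
    + rng.normal(0, 50, n)

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Report each feature's importance as a plain-English sentence.
for name, imp in sorted(zip(X.columns, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"Shuffling '{name}' reduces the model's R-squared by {imp:.3f} on average.")
```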

So there are solutions to the technical challenges. The bigger barrier is awareness and practical knowledge of how to apply these tools and methods, which is one of the aspects the working party is seeking to address.

ML reserving vs. MCMC Bayesian reserving - which one offers better accuracy?

In answering the question, we have to recognise that we are answering at a point in time along the different development curves of a number of algorithmic techniques. So, though I’m not the expert, I’d hazard a guess that Bayesian MCMC techniques will outperform ML reserving techniques here and now, as the former have had more intellectual capital invested in them.

The extension to the question is: which one will ultimately offer the better accuracy? In that regard I have confidence in saying that deep neural network based approaches will ultimately dominate, because they are among the most flexible learning algorithms and their accuracy keeps improving the more data they are trained on.
We have seen that to be true in other domains, such as image recognition and natural language processing, where deep neural networks now dominate. Of course, for it to be true in insurance reserving we will need to bring big data to the task and find the forms of neural network architecture that excel.

An additional comment from one of the other speakers:

Approaching this from a different angle, and based on the current state of ML and MCMC, when you say “which one offers better accuracy?”, this should be considered in the context of your particular problem, what you want to get out of it and the data you have available to you. Do you have a lot of data available? Do you need a full distribution of reserves or is a point estimate sufficient?

If you have limited variables available (e.g. accident/underwriting period, development, calendar period, maybe one or two others), then it may be difficult for ML to really outperform other methods - ML likes big data. On the other hand, MCMC can perform well in a situation like this. In particular, it provides a way to include prior knowledge or expert opinion. Furthermore, MCMC will return a full distribution of outcomes; ML techniques often will not, or will require some type of bootstrapping method to be bolted on.
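
To illustrate what “bolting on” a bootstrap might look like in practice, here is a rough sketch that refits a gradient-boosted model on resampled data to obtain a distribution of the total reserve rather than a single point estimate. The data, feature set and model choice are illustrative assumptions, and this simple resampling only captures estimation uncertainty, not process variance.

```python
# A rough sketch, under illustrative assumptions, of bolting a bootstrap onto
# an ML point-estimate model to obtain a distribution of the total reserve.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
n = 400
train = pd.DataFrame({
    "accident_period": rng.integers(2012, 2020, n),
    "development_period": rng.integers(1, 9, n),
})
# Illustrative incremental payments decaying with development period.
train["incremental_paid"] = (
    500 * np.exp(-0.4 * train["development_period"])
    * (1 + 0.2 * rng.standard_normal(n))
)

# Hypothetical future cells (the unobserved part of the triangle) to predict.
future = pd.DataFrame({
    "accident_period": np.repeat(np.arange(2012, 2020), 3),
    "development_period": np.tile([9, 10, 11], 8),
})

features = ["accident_period", "development_period"]
reserve_samples = []
for _ in range(200):  # number of bootstrap resamples
    resample = train.sample(frac=1.0, replace=True)
    model = GradientBoostingRegressor(random_state=0).fit(
        resample[features], resample["incremental_paid"]
    )
    reserve_samples.append(model.predict(future[features]).sum())

reserve_samples = np.array(reserve_samples)
print(f"Mean reserve: {reserve_samples.mean():,.0f}")
print(f"75th / 99.5th percentiles: {np.percentile(reserve_samples, [75, 99.5])}")
```

Even this simple scheme shows how an ML workflow has to be extended before it yields the distributional output that MCMC produces natively.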

However, if you have a lot of variables, it can be more difficult to set this up as an MCMC model - you need to fully specify the structure of the Bayesian model. Even if you do have a reasonable structure, it may be slow computationally. ML methods may perform better in this case.
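
To make “fully specify the structure” concrete, here is a minimal sketch of a small Bayesian model written in PyMC3. The model form, priors and data are purely illustrative assumptions, not a recommended reserving model; the point is that every prior and the likelihood must be written out explicitly, which is what becomes burdensome as the number of variables grows.

```python
# A minimal sketch, assuming PyMC3 and illustrative data, of what fully
# specifying an MCMC model involves. Model form and priors are made up.
import numpy as np
import pymc3 as pm

# Illustrative incremental payments indexed by development period.
dev = np.tile(np.arange(1, 6), 10).astype(float)
paid = 1000 * np.exp(-0.5 * dev) * np.exp(0.1 * np.random.randn(dev.size))

with pm.Model():
    # Explicit priors for each parameter...
    log_level = pm.Normal("log_level", mu=np.log(1000.0), sigma=1.0)
    decay = pm.HalfNormal("decay", sigma=1.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)

    # ...and an explicit likelihood linking parameters to the data.
    mu = log_level - decay * dev
    pm.Lognormal("paid", mu=mu, sigma=sigma, observed=paid)

    # Sampling returns a full posterior, from which a reserve distribution
    # can be simulated.
    trace = pm.sample(1000, tune=1000, chains=2)

print(pm.summary(trace))
```

By contrast, a tree-based ML model only needs the feature matrix and target; the trade-off is that the explicit MCMC specification delivers a coherent posterior distribution as standard.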

How long does it take to get up to speed with ML?

Coursera courses give a good indication: they often say you need around 60 hours to complete them, given some pre-requisite knowledge. So, depending on your starting point and available time, you can become knowledgeable in a narrow domain in a week. Realistically, though, a month is the quickest I’ve seen someone become proficient at, say, XGBoost with no prior knowledge but good maths knowledge and basic programming ability. You do need to be able to set aside time to focus; these are new concepts that will be difficult to pick up in 15-minute sessions! Gaining deep knowledge takes longer and requires practice; I’d say six months of dedication and effort can get you to that point.

What I would say is that gaining ML proficiency is easier than gaining the equivalent proficiency in traditional insurance GLMs. I’ve regularly seen people who have struggled with traditional GLM modelling quickly pick up machine learning and XGBoost and use them to outperform experienced GLM modellers.

More questions?

Do you have other questions you’d like to ask? Feel free to ask them in the comments below, or contact us directly.
