Reserving using machine learning - an advanced example in R (Part 3) – Machine Learning in Reserving Working Party

This article introduces the last in a series of three R notebooks. The notebooks provide R code to replicate the central scenario in Maximilien Baudry’s paper “NON-PARAMETRIC INDIVIDUAL CLAIM RESERVING IN INSURANCE”.

Introduction

Maximilien Baudry’s paper illustrates a novel approach, applying machine learning to individual claim transaction data in order to estimate both RBNS and IBNR reserves.

Baudry illustrates his approach with a simulated mobile phone insurance dataset. The working party has recreated Baudry’s mobile phone example in a series of three R notebooks that show how to simulate the dataset, create the reserving database and apply machine learning reserving techniques, respectively.

In sharing example code we hope to make machine learning approaches more accessible and encourage further development and research among the wider actuarial community.

Details of suggested pre-reading and a link to the third notebook are given below.

Suggested pre-reading

Baudry’s use of granular and wide-ranging data sources, in conjunction with carefully considered data preparation, makes this an advanced paper. I would therefore recommend reading Baudry’s original paper first.

We hope you will be encouraged to try the code and adapt it yourself. To do so you will need basic knowledge and access to the R programming language. The Foundations workstream provides links to suitable resources.

You should also read Part 1 and Part 2 of this series of posts.

R Code Notebook

The third and final notebook shows the end-to-end process of estimating IBNR and RBNS reserves from a simulated mobile phone dataset using the xgboost machine learning algorithm. It also shows how you can inspect and interpret the resulting model and gain insight into which features contribute to the reserve calculation.
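
To give a flavour of the kind of code involved, here is a minimal sketch of fitting an xgboost regression model in R, summing its predictions and listing feature importance. It is not taken from the notebook: the data are synthetic, and the feature names, target variable and parameter values are assumptions made purely for illustration.

```r
# Illustrative sketch only: synthetic data, not Baudry's simulated dataset.
library(xgboost)

set.seed(42)
n <- 500

# Hypothetical claim-level features (names are assumptions for this sketch)
features <- data.frame(
  phone_price    = runif(n, 200, 1000),
  claim_age_days = sample(0:365, n, replace = TRUE),
  is_theft       = rbinom(n, 1, 0.3)
)

# Hypothetical target: future payments on each claim
future_paid <- 0.4 * features$phone_price * (1 + features$is_theft) +
  rnorm(n, sd = 20)

# Hold out the last 100 rows to stand in for claims that need a reserve
train_idx <- 1:400
dtrain <- xgb.DMatrix(
  data  = as.matrix(features[train_idx, ]),
  label = future_paid[train_idx]
)
dtest <- xgb.DMatrix(data = as.matrix(features[-train_idx, ]))

# Fit a small gradient boosted tree model (parameter values are arbitrary)
fit <- xgb.train(
  params  = list(objective = "reg:squarederror", max_depth = 3),
  data    = dtrain,
  nrounds = 50
)

# Predict a future payment per held-out claim and sum to an aggregate figure
predicted <- predict(fit, dtest)
reserve_estimate <- sum(predicted)

# Rank features by their contribution to the fitted model
importance <- xgb.importance(model = fit)
print(importance)
```

The notebook itself works with a much richer reserving database and separate treatment of RBNS and IBNR, so treat this only as an orientation to the xgboost calls you will meet there.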

The third notebook can be found here.

This final notebook is a little more involved than the first two in the series, so I recommend setting aside 30 to 45 minutes to read it. It will help to be familiar with Baudry’s original paper and the first two notebooks beforehand.

If you wish to experiment and run the code in your own local instance of R, I would recommend setting aside a good two hours to give yourself time to install the R libraries and follow the code line by line. Instructions and links to the source code can be found at the end of the notebook.

October 2022 edit: The location of the source code has changed from that shown at the end of the notebooks. It can now be found here.

Video

A presentation of this work is available here.

About the author

Nigel Carpenter