Reserving using machine learning - an advanced example in R (Part 3)
This article introduces the last in a series of three R notebooks. The notebooks provide R code to replicate the central scenario in Maximilien Baudry’s paper “NON-PARAMETRIC INDIVIDUAL CLAIM RESERVING IN INSURANCE”.
Maximilien Baudry’s paper illustrates a novel approach, applying machine learning to individual claim transaction data in order to estimate both RBNS (reported but not settled) and IBNR (incurred but not reported) reserves.
Baudry illustrates his approach with a simulated mobile phone insurance dataset. The working party has recreated Baudry’s mobile phone example in a series of three R notebooks that show, respectively, how to simulate the dataset, create the reserving database and apply machine learning reserving techniques.
In sharing example code we hope to make machine learning approaches more accessible and encourage further development and research among the wider actuarial community.
Details of suggested pre-reading and a link to the third notebook are given below.
Baudry’s use of granular and wide-ranging data sources, in conjunction with carefully considered data preparation, makes this an advanced paper. I would therefore recommend reading Baudry’s original paper first.
We hope you will be encouraged to try the code and adapt it yourself. To do so you will need basic knowledge of, and access to, the R programming language. The Foundations workstream provides links to suitable resources.
R Code Notebook
The third and final notebook shows the end-to-end process of creating IBNR and RBNS reserves from a simulated mobile phone dataset using the xgboost machine learning algorithm. It also shows how you can inspect and interpret the resulting model and gain insights into which features contribute to the reserve calculation.
The third notebook can be found here.
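To give a flavour of the workflow before opening the notebook, the short R sketch below fits an xgboost regression to a toy dataset, sums the predictions into a reserve-like total and lists feature importance. It is only an illustration under stated assumptions: the data are randomly generated here, and the column names (phone_price, claim_age, is_theft, future_payment) and parameter values are hypothetical and are not taken from Baudry’s paper or from the notebook.

```r
# Minimal sketch only: toy data and hypothetical column names,
# not the simulated dataset or model settings used in the notebook.
library(xgboost)

set.seed(42)
n <- 500

# Hypothetical claim-level features (assumed for illustration)
claims <- data.frame(
  phone_price = runif(n, 200, 1000),
  claim_age   = sample(0:12, n, replace = TRUE),  # months since occurrence
  is_theft    = rbinom(n, 1, 0.3)
)

# Toy target: future payment on each claim, floored at zero
claims$future_payment <- pmax(
  0,
  0.4 * claims$phone_price * (1 + claims$is_theft) -
    20 * claims$claim_age + rnorm(n, sd = 30)
)

features  <- c("phone_price", "claim_age", "is_theft")
train_idx <- 1:400

# xgboost works on numeric matrices wrapped in an xgb.DMatrix
dtrain <- xgb.DMatrix(
  data  = as.matrix(claims[train_idx, features]),
  label = claims$future_payment[train_idx]
)
dtest <- xgb.DMatrix(data = as.matrix(claims[-train_idx, features]))

# Fit a small gradient-boosted regression model
model <- xgb.train(
  params  = list(objective = "reg:squarederror", eta = 0.1, max_depth = 3),
  data    = dtrain,
  nrounds = 100
)

# Predicted future payments for the held-out claims, summed to a total
pred <- predict(model, dtest)
reserve_estimate <- sum(pred)

# Which features the fitted trees rely on most
importance <- xgb.importance(model = model)
print(importance)
```

The notebook itself builds separate models for the RBNS and IBNR components on a properly constructed reserving database, so treat this only as a picture of the fit, predict and inspect pattern.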
This final notebook is a little more involved than the first two of the series, so I recommend setting aside 30 to 45 minutes to read it, and being familiar with Baudry’s original paper and the first two notebooks of this series beforehand.
If you wish to experiment and run the code in your own local instance of R, then I would recommend you set aside a good two hours to give yourself time to install R libraries and follow the code line by line. Instructions and links to the source code can be found at the end of the notebook.