Francis Duval, PhD.

PhD in Mathematics
UQAM

2024/02

Research supervision:

Mathieu Pigeon, Professor, Department of Mathematics, Université du Québec à Montréal
Jean-Philippe Boucher, Professor, Department of Mathematics, Université du Québec à Montréal

Doctoral thesis:

Duval, Francis (2024), « Modélisation des sinistres en assurance automobile avec l’utilisation de données télématiques : approches d’apprentissage automatique en classification et régression de comptage » [Modeling claims in automobile insurance using telematics data: machine learning approaches in classification and count regression], Supervisors: Jean-Philippe Boucher and Mathieu Pigeon, Doctoral thesis. Montréal (Québec, Canada), Université du Québec à Montréal, PhD in Mathematics.

GitHub page:

https://github.com/francisduval

Publications

F. Duval, J.-P. Boucher & M. Pigeon (2024), Telematics Combined Actuarial Neural Networks for Cross-Sectional and Longitudinal Claim Count Data, ASTIN Bulletin, 1–24.

We present novel cross-sectional and longitudinal claim count models for vehicle insurance built upon the Combined Actuarial Neural Network (CANN) framework proposed by Mario Wüthrich and Michael Merz. The CANN approach combines a classical actuarial model, such as a generalized linear model, with a neural network. This blending of models results in a two-component model comprising a classical regression model and a neural network part. The CANN model leverages the strengths of both components, providing a solid foundation and interpretability from the classical model while harnessing the flexibility and capacity to capture intricate relationships and interactions offered by the neural network. In our proposed models, we use well-known log-linear claim count regression models for the classical regression part and a multilayer perceptron (MLP) for the neural network part. The MLP part is used to process telematics car driving data given as a vector characterizing the driving behavior of each insured driver. In addition to the Poisson and negative binomial distributions for cross-sectional data, we propose a procedure for training our CANN model with a multivariate negative binomial (MVNB) specification. By doing so, we introduce a longitudinal model that accounts for the dependence between contracts from the same insured. Our results reveal that the CANN models exhibit superior performance compared to log-linear models that rely on manually engineered telematics features.
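The two-component structure described above can be sketched for the Poisson case as follows. The notation is an assumption made for illustration and is not taken from the paper: $e_i$ is the exposure, $\mathbf{x}_i$ the traditional covariates, $\mathbf{t}_i$ the telematics vector, $\boldsymbol{\beta}$ the regression coefficients, and $f_{\boldsymbol{\theta}}$ the MLP with weights $\boldsymbol{\theta}$.

```latex
\log \mu_i \;=\;
\underbrace{\log e_i + \mathbf{x}_i^{\top}\boldsymbol{\beta}}_{\text{log-linear part}}
\;+\;
\underbrace{f_{\boldsymbol{\theta}}(\mathbf{t}_i)}_{\text{MLP part}},
\qquad
N_i \sim \text{Poisson}(\mu_i)
```

Under this form, setting $f_{\boldsymbol{\theta}} \equiv 0$ recovers the plain log-linear model, so the neural network acts as a correction on top of the classical regression.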

F. Duval, J.-P. Boucher & M. Pigeon (2023), Enhancing Claim Classification with Feature Extraction from Anomaly-Detection-Derived Routine and Peculiarity Profiles, Journal of Risk and Insurance, vol. 90, no. 2, p. 421–458.

Usage-based insurance is becoming the new standard in vehicle insurance; it is therefore relevant to find efficient ways of using insureds’ driving data. Applying anomaly detection to vehicles’ trip summaries, we develop a method for deriving a “routine” and a “peculiarity” anomaly profile for each vehicle. To this end, anomaly detection algorithms are used to compute a routine and a peculiarity anomaly score for each trip a vehicle makes. The former measures how anomalous the trip is compared to the other trips made by the same vehicle, while the latter measures how anomalous it is compared to trips made by any vehicle. The resulting vectors of anomaly scores are used as routine and peculiarity profiles. Features are then extracted from these profiles, and we investigate their predictive power in the claim classification framework. Using real data, we find that features extracted from the vehicles’ peculiarity profile improve classification.
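The distinction between the two profiles can be illustrated with a minimal Python sketch. It is not the paper's method: the anomaly detector is swapped for a simple mean absolute z-score, and the function names (`anomaly_scores`, `build_profiles`) and the two trip features (distance in km, duration in minutes) are hypothetical. The only point shown is the choice of reference set: a vehicle's own trips for the routine score, all trips for the peculiarity score.

```python
from statistics import mean, pstdev


def anomaly_scores(points, reference):
    """Score each point by its mean absolute z-score relative to `reference`.

    Stand-in for a real anomaly detection algorithm (assumption).
    """
    n_features = len(reference[0])
    mus = [mean(r[j] for r in reference) for j in range(n_features)]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    sds = [pstdev([r[j] for r in reference]) or 1.0 for j in range(n_features)]
    return [
        mean(abs(p[j] - mus[j]) / sds[j] for j in range(n_features))
        for p in points
    ]


def build_profiles(trips_by_vehicle):
    """Return a routine and a peculiarity score vector for every vehicle."""
    all_trips = [t for trips in trips_by_vehicle.values() for t in trips]
    return {
        vehicle: {
            # Routine: each trip compared with the same vehicle's trips.
            "routine": anomaly_scores(trips, trips),
            # Peculiarity: each trip compared with trips from any vehicle.
            "peculiarity": anomaly_scores(trips, all_trips),
        }
        for vehicle, trips in trips_by_vehicle.items()
    }


# Hypothetical trip summaries: (distance_km, duration_min).
trips_by_vehicle = {
    "A": [(10, 15), (11, 16), (10, 14), (80, 70)],
    "B": [(5, 10), (6, 11), (5, 9)],
}
profiles = build_profiles(trips_by_vehicle)

# Example of features extracted from a profile: its mean and maximum.
features_a = {
    "routine_mean": mean(profiles["A"]["routine"]),
    "routine_max": max(profiles["A"]["routine"]),
}
```

In this toy data, the long fourth trip of vehicle A receives the highest routine score for that vehicle, since it departs most from A's usual trips.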

F. Duval, J.-P. Boucher & M. Pigeon (2022), How Much Telematics Information Do Insurers Need For Claim Classification?, North American Actuarial Journal, 1–21.

It has been shown several times in the literature that telematics data collected in motor insurance help to better understand an insured’s driving risk. Insurers that use these data reap several benefits, such as a better estimate of the pure premium, more segmented pricing and less adverse selection. The flip side of the coin is that collected telematics information is often sensitive and can therefore compromise policyholders’ privacy. Moreover, due to its large volume, this type of data is costly to store and hard to manipulate. These factors, combined with the fact that insurance regulators tend to issue more and more recommendations regarding the collection and use of telematics data, make it important for an insurer to determine the right amount of telematics information to collect. In addition to traditional contract information such as the age and gender of the insured, we have access to a telematics dataset where information is summarized by trip. We first derive several features of interest from these trip summaries before building a claim classification model using both traditional and telematics features. By comparing a few classification algorithms, we find that logistic regression with lasso penalty is the most suitable for our problem. Using this model, we develop a method to determine how much information about policyholders’ driving should be kept by an insurer. Using real data from a North American insurance company, we find that telematics data become redundant after about 3 months or 4,000 kilometers of observation, at least from a claim classification perspective.
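One way to picture the idea of limiting the amount of telematics information is to truncate each policyholder's trip history at a distance budget before computing features. The Python sketch below is an illustration under stated assumptions, not the procedure used in the paper: the helper names (`keep_first_km`, `summarize`), the trip fields (`km`, `night`) and the toy values are all hypothetical.

```python
def keep_first_km(trips, max_km):
    """Keep trips, in chronological order, until `max_km` has been driven."""
    kept, total = [], 0.0
    for trip in trips:
        if total >= max_km:
            break
        kept.append(trip)
        total += trip["km"]
    return kept


def summarize(trips):
    """Derive a few illustrative telematics features from trip summaries."""
    total_km = sum(t["km"] for t in trips)
    night_km = sum(t["km"] for t in trips if t["night"])
    return {
        "n_trips": len(trips),
        "total_km": total_km,
        # Share of distance driven at night; 0.0 when nothing was driven.
        "night_share": night_km / total_km if total_km else 0.0,
    }


# Hypothetical trip summaries for one policyholder, in chronological order.
trips = [
    {"km": 15, "night": False},
    {"km": 20, "night": True},
    {"km": 10, "night": False},
    {"km": 30, "night": True},
]

# Features computed on a truncated history versus the full history.
truncated_features = summarize(keep_first_km(trips, max_km=40))
full_features = summarize(trips)
```

Refitting a classifier on features built from increasingly long truncated histories, and comparing its performance with the full-history version, would then indicate the point beyond which extra data adds little.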

F. Duval & M. Pigeon (2019), Individual Loss Reserving using a Gradient Boosting-Based Approach, Risks, 7(3), 1–19.

In this paper, we propose models for non-life loss reserving that combine traditional approaches, such as Mack’s model or generalized linear models, with a gradient boosting algorithm in an individual framework. These claim-level models use information about each of the payments made for each of the claims in the portfolio, as well as characteristics of the insured. We provide an example based on a detailed dataset from a property and casualty insurance company. We contrast some traditional aggregate techniques, at the portfolio level, with our individual-level approach, and we discuss some points related to practical applications.

Scientific presentations

Anomaly Detection Techniques For Feature Extraction In Automobile Claim Classification, Actuarial Research Conference (ARC), University of Illinois, Urbana-Champaign, USA (IL), August 3, 2022.
Anomaly Detection Techniques For Feature Extraction In Automobile Claim Classification, 10th Annual Canadian Statistics Student Conference, Virtual, May 28, 2022.
How Much Telematics Information Do Insurers Need For Claim Classification?, Casualty Actuaries of Greater New York (CAGNY) Spring Meeting, Virtual, April 16, 2021.
Claim Classification Using Partial Telematics Information, Ratemaking, Product and Modeling Virtual Seminar of the Casualty Actuarial Society, Virtual, March 15, 2021.
Gradient Boosting-Based Model For Individual Loss Reserving, 3rd International Conference on Statistical Distributions and Applications, Grand Rapids, USA (MI), October 10, 2019.
Gradient Boosting-Based Model for Individual Loss Reserving, Annual Meeting of the Statistical Society of Canada, Calgary, Canada (AB), May 26, 2019.
Claim-Level Models Using Statistical Learning Techniques and Risk Analysis, Joint Statistical Meetings (JSM), Vancouver, Canada (BC), July 28, 2018.

Involvement

Local presentations

Optimize Your Presentation Slides with Xaringan, Seminar of the Co-operators Chair in Actuarial Risk Analysis, UQAM, Montréal, Canada (QC), March 1, 2023.
Improving Your R Workflow with Targets, Seminar of the Co-operators Chair in Actuarial Risk Analysis, UQAM, Montréal, Canada (QC), March 1, 2022.
Modeling Individual Reserves with a Gradient Boosting Algorithm in Automobile Insurance, Summer Seminar of Actuarial and Statistics Students, Virtual, July 1, 2021.
How Much Telematics Information Should Be Kept to Predict Claims?, Co-operators 2021 Science and Analytics Summit, Virtual, June 1, 2021.
Micro-level Loss Reserving for General Insurance… and Attempts at Improvement, Seminar of the Co-operators Chair in Actuarial Risk Analysis, UQAM, Montréal, Canada (QC), January 31, 2020.
Unsupervised Learning Applied to Telematics, Seminar of the Co-operators Chair in Actuarial Risk Analysis, UQAM, Montréal, Canada (QC), September 1, 2019.
Gradient Boosting Techniques for Modeling Individual Reserves in Non-Life Insurance, Workshop of the Co-operators Chair in Actuarial Risk Analysis, UQAM, Montréal, Canada (QC), April 12, 2019.

Teaching

ACT6100 - Data Analysis in Actuarial Science (Fall 2021)

Tutorial sessions (teaching assistant)

ACT2040 - Property and Casualty Insurance: Ratemaking and Valuation (Fall 2017, Winter 2018)
ACT2060 - Probabilistic Applications of Actuarial Risks (Winter 2018, Fall 2018, Winter 2019)
ACT3035 - Actuarial Laboratory (Fall 2017, Winter 2018, Fall 2018)
ACT3400 - Loss Distributions (Fall 2020, Fall 2022)
ACT4400 - Survival Models (Fall 2020)
ACT6061 - Actuarial Models in Non-Life Insurance (Fall 2018, Winter 2019, Fall 2020, Winter 2021, Fall 2021)
ACT6100 - Data Analysis in Actuarial Science (Fall 2020, Winter 2021)
MAT2080 - Statistical Methods (Fall 2019, Winter 2020)

Other

Co-organizer of the Canadian Statistics Student Conference (2020, 2021)
Co-organizer of the Montréal Data Science Competition (2022, 2023)