# Minimum Bias and Exponential Family Distributions

## Introduction

This post introduces four short expository notes describing exponential family distributions that are often used to model insurance losses. The exponential family underlies generalized linear models (GLMs), a “near-ubiquitous” (Goldburd, Khare, and Tevet 2016) insurance modeling technique. The notes have an unashamedly theoretical focus and aim to educate an advanced modeler or mathematically curious actuary. You don’t need to know how a car works to commute to work, but it helps if you are a race car driver, and at the cutting edge of pricing in a competitive market, you are in a race.

The four parts are as follows.

• Part I reexamines the Bailey and Simon minimum bias method. It presents an overview of how the actuarial approach to modeling has evolved since 1960 as it incorporated statistical models and GLMs. It explains why exponential family distributions are an ideal choice for modeling losses.
• Part II describes nine different ways to define an exponential family distribution. Each approach reveals a different statistical property, and the fact that there are nine reflects the richness of the family. The variance function, relating variance to mean, emerges as a new way to define a distribution.
• Part III analyzes probability models for insurance losses, with an emphasis on ways to extend a compound Poisson distribution, how to embed a static model into a dynamic one, and how to create new distributions from old ones.
• Part IV demystifies the Tweedie-power variance function families of distributions. These distributions are particularly important because they appear as limiting types for small and large expected losses from most exponential family distributions.

Each part contains its own introduction.
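As a small preview of the variance function idea from Parts II and IV, here is a minimal sketch of the power variance function $V(\mu)=\mu^p$, under which $\mathrm{Var}(Y)=\phi V(\mu)$ for a dispersion parameter $\phi$. The function names `power_variance` and `describe_power` are hypothetical and used only for illustration, and the mapping from $p$ to named distributions lists only the standard cases.

```python
def power_variance(mu: float, p: float) -> float:
    """Power variance function V(mu) = mu**p (hypothetical helper)."""
    if p != 0 and mu <= 0:
        raise ValueError("mu must be positive unless p == 0")
    return mu ** p


# Standard named members of the power variance family, keyed by p.
NAMED_MEMBERS = {0: "normal", 1: "Poisson", 2: "gamma", 3: "inverse Gaussian"}


def describe_power(p: float) -> str:
    """Return a short label for the distribution with V(mu) = mu**p."""
    if p in NAMED_MEMBERS:
        return NAMED_MEMBERS[p]
    if 1 < p < 2:
        return "Tweedie (compound Poisson with gamma severity)"
    if 0 < p < 1:
        return "no exponential dispersion model exists"
    return "other power variance family member"


for p in (0, 1, 1.5, 2, 3):
    print(p, describe_power(p), power_variance(2.0, p))
```

The variance at a fixed mean grows with $p$ when $\mu>1$, which is one way to see how the choice of $p$ encodes prior knowledge about how volatility scales with expected loss.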

What will the actuary learn?

• Why exponential family distributions are so useful and tractable.
• How to identify and interpret the different components defining an exponential family distribution.
• How to use the variance function to distinguish distributions that can be used to model losses.
• How to discern the small and large expected loss behavior of a loss distribution from its variance function, and whether it is discrete, mixed or continuous.
• How to incorporate prior knowledge by choosing an appropriate variance function. (Most GLM software allows for the use of custom error distributions defined by their variance functions.)
• How to measure residual error at the observation level using the variance function.
• How to use a size-of-loss frequency modeling paradigm, similar to that used by catastrophe models, and how it extends the frequency and severity paradigm.
• How to embed a static model into a dynamic stochastic (Lévy) process.
• How to build a general Lévy process from its size of loss frequency distribution.
• Why the distributions in the power variance function family appear in the order they do.
• Why the Tweedie is a compound Poisson distribution with a gamma severity.
• Which distribution can be regarded as the universal severity.
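To make the Tweedie item in the list above concrete, here is a minimal sketch of the standard correspondence between a Tweedie distribution with mean $\mu$, dispersion $\phi$, and power $1<p<2$, and a compound Poisson distribution with Poisson frequency $\lambda$ and gamma severity with shape $\alpha$ and scale $\theta$. The function names are hypothetical, the parameter values are arbitrary, and the Poisson sampler is a simple textbook method that suits only small $\lambda$.

```python
import math
import random


def tweedie_to_compound_poisson(mu: float, phi: float, p: float):
    """Map Tweedie (mu, phi, p), 1 < p < 2, to (lam, alpha, theta)."""
    if not 1 < p < 2:
        raise ValueError("the compound Poisson case requires 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))   # expected claim count
    alpha = (2 - p) / (p - 1)               # gamma shape
    theta = phi * (p - 1) * mu ** (p - 1)   # gamma scale
    return lam, alpha, theta


def compound_poisson_moments(lam: float, alpha: float, theta: float):
    """Mean and variance of a Poisson sum of gamma(alpha, theta) severities."""
    mean = lam * alpha * theta
    variance = lam * alpha * (alpha + 1) * theta ** 2  # lam * E[X^2]
    return mean, variance


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for small lam only."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_aggregate(lam, alpha, theta, rng: random.Random) -> float:
    """One aggregate loss: a Poisson number of gamma-distributed claims."""
    n = sample_poisson(lam, rng)
    return sum(rng.gammavariate(alpha, theta) for _ in range(n))


mu, phi, p = 2.0, 1.0, 1.5  # illustrative values, not from the notes
lam, alpha, theta = tweedie_to_compound_poisson(mu, phi, p)
mean, variance = compound_poisson_moments(lam, alpha, theta)
print("mean:", mean, "variance:", variance, "phi * mu**p:", phi * mu ** p)
print("probability of no loss:", math.exp(-lam))
```

The compound Poisson mean and variance reproduce $\mu$ and $\phi\mu^p$, and the positive probability $e^{-\lambda}$ of no loss shows why this member of the family is a mixed distribution. Calling `sample_aggregate(lam, alpha, theta, random.Random(0))` repeatedly gives simulated aggregate losses.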

The four parts overlap, and there is some duplication between them to help make each stand-alone. The core material builds sequentially, and any undefined term will appear in an earlier part.

Enjoy!

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| CP | Compound Poisson |
| EDM | Exponential Dispersion Model |
| EF | Exponential Family |
| GHS | Generalized Hyperbolic Secant distribution |
| GLM | Generalized Linear Model |
| IACP | Infinite Activity Compound Poisson |
| ID | Infinitely Divisible |
| iid | Independent and identically distributed |
| MGF | Moment Generating Function |
| MLE | Maximum Likelihood Estimator |
| MVB | Minimum Variance Bound |
| NEF | Natural Exponential Family |
| PVF | Power Variance Family |

The end of an example or exercise is marked with a square.

The gamma function is defined by $\Gamma(\alpha):=\int_0^\infty x^{\alpha-1} e^{-x}\,dx$. Integration by parts shows $\Gamma(\alpha+1)=\alpha\Gamma(\alpha)$, and so $\Gamma(n)=(n-1)!$ for a positive integer $n$.
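The integration-by-parts step can be written out as follows, a short sketch assuming $\alpha>0$ so that the boundary term vanishes:

$$
\Gamma(\alpha+1)=\int_0^\infty x^{\alpha}e^{-x}\,dx
=\Big[-x^{\alpha}e^{-x}\Big]_0^\infty+\alpha\int_0^\infty x^{\alpha-1}e^{-x}\,dx
=\alpha\,\Gamma(\alpha).
$$

Since $\Gamma(1)=\int_0^\infty e^{-x}\,dx=1$, applying the recursion repeatedly gives $\Gamma(n)=(n-1)(n-2)\cdots 1\cdot\Gamma(1)=(n-1)!$.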

## References

There are many good introductions to GLMs available, including general treatments: McCullagh and Nelder (1989), Kaas et al. (2008), Dobson and Barnett (2008), Heller and De Jong (2008), and Dunn and Smyth (2018), as well as specific actuarial applications: Renshaw (1994), Haberman and Renshaw (1996), Mildenhall (1999), Smyth and Jørgensen (2002), Wüthrich (2003), Taylor (2007), Meyers (2009), and Goldburd, Khare, and Tevet (2016).

Dobson, Annette J., and Adrian G. Barnett. 2008. An Introduction to Generalized Linear Models. Chapman & Hall/CRC. https://doi.org/10.1360/zd-2013-43-6-1064.
Dunn, Peter K., and Gordon K. Smyth. 2018. Generalized Linear Models With Examples in R. New York: Springer. http://www.springer.com/series/417.
Goldburd, Mark, Anand Khare, and Dan Tevet. 2016. Generalized Linear Models for Insurance Rating. CAS Monograph Series. https://doi.org/10.1017/CBO9780511755408.
Haberman, Steven, and Arthur E. Renshaw. 1996. “Generalized Linear Models and Actuarial Science.” Journal of the Royal Statistical Society Series D: The Statistician 45 (4): 407–36. https://doi.org/10.2307/2988543.
Heller, Gillian Z., and Piet De Jong. 2008. Generalized Linear Models for Insurance Data. Cambridge University Press. https://feb.kuleuven.be/public/u0017833/boek.pdf.
Kaas, Rob, Marc Goovaerts, Jan Dhaene, and Michel Denuit. 2008. Modern Actuarial Risk Theory. Springer. https://doi.org/10.1007/978-3-540-70998-5.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. 2nd ed. London; New York: Chapman and Hall.
Meyers, Glenn. 2009. “Predictive Modeling with the Tweedie Distribution.”
Mildenhall, Stephen J. 1999. “A Systematic Relationship Between Minimum Bias and Generalized Linear Models.” Proceedings of the Casualty Actuarial Society LXXXVI (164): 393–487. http://www.casualtyactuarialsociety.net/pubs/proceed/proceed99/99317.pdf.
Renshaw, Arthur E. 1994. “Modelling the Claims Process in the Presence of Covariates.” ASTIN Bulletin 24 (2): 265–85. https://doi.org/10.2143/ast.24.2.2005070.
Smyth, Gordon K., and Bent Jørgensen. 2002. “Fitting Tweedie’s Compound Poisson Model to Insurance Claims Data: Dispersion Modelling.” ASTIN Bulletin 32 (01): 143–57. https://doi.org/10.2143/ast.32.1.1020.
Taylor, Greg. 2007. “The Chain Ladder and Tweedie Distributed Claims Data.” November.
Wüthrich, Mario V. 2003. “Claims Reserving Using Tweedie’s Compound Poisson Model.” ASTIN Bulletin 33 (2): 331–46. https://doi.org/10.1017/S0515036100013490.

posted 2020-10-20 | tags: probability, minimum bias, glm, exponential family