Count regression models — countreg • flexCountReg

The purpose of this function is to estimate count regression models using maximum likelihood estimation (MLE) or Maximum Simulated Likelihood Estimation (MSLE). The function can estimate the following models:

Poisson (Poisson)
Negative Binomial 1 (NB1)
Negative Binomial 2 (NB2)
Negative Binomial P (NBP)
Poisson-Lognormal (PLN)
Poisson-Generalized-Exponential (PGE)
Poisson-Inverse-Gaussian Type 1 (PIG1)
Poisson-Inverse-Gaussian Type 2 (PIG2)
Poisson-Inverse-Gamma (PIG)
Poisson-Lindley (PL)
Poisson-Lindley-Gamma (PLG), also known as the Negative Binomial-Lindley (NBL)
Poisson-Lindley-Lognormal (PLL)
Poisson-Weibull (PW)
Sichel (SI)
Generalized Waring (GW)
Conway-Maxwell-Poisson (COM)

Usage

countreg(
  formula,
  data,
  family = "NB2",
  offset = NULL,
  weights = NULL,
  verbose = FALSE,
  dis_param_formula_1 = NULL,
  dis_param_formula_2 = NULL,
  underreport_formula = NULL,
  underreport_family = "logit",
  ndraws = 1500,
  method = "NM",
  max.iters = 1000,
  start.vals = NULL,
  stderr = "normal",
  bootstraps = NULL
)

Arguments

formula: a symbolic description of the model to be fitted.
data: a data frame containing the variables in the model.
family: the name of the distribution/model type to estimate. The default "NB2" is the standard negative binomial distribution with a log link. other options are listed below.
offset: the name of a variable, or vector of variable names, in the data frame that should be used as an offset (i.e., included but forced to have a coefficient of 1). The normal method of setting an offset in the equation can also be used (overrides the offset option).
weights: the name of a variable in the data frame that should be used as a frequency weight.
verbose: an optional parameter. If `TRUE`, the function will print out the progress of the model fitting. Default is `FALSE`.
dis_param_formula_1: a symbolic description of the model for the natural log of the dispersion parameter or first parameter of the count distribution used. Further details are provided below.
dis_param_formula_2: a symbolic description of the model for the second parameter of the count distribution used. Further details are provided below.
underreport_formula: an optional formula to estimate the underreporting for any of the count model options. The underreporting is estimated as a function (logit or probit) of the predictors in the model. For the model to be tractable, the independent variables cannot be the exact same as the count model. The default is `NULL`.
underreport_family: the name of the distribution/model type to estimate the underreporting portion of the model when `underreport_formula` is specified. The default is "logit" for a binary logistic regression model. The other option is "probit" for a probit model.
ndraws: The number of Halton draws for integrating the distribution being compounded with the Poisson distribution when there is not a closed-form solution. Default is 1500. It is recommended to test different numbers of draws to determine if the model is stable (i.e., doesn't change or has minimal change as the number of draws changes within a reasonable range).
method: Optimization method to be used for maximum likelihood estimation. See `maxLik` documentation for options. The default is "NM" for the Nelder-Mead method.
max.iters: Maximum number of iterations for the optimization method.
start.vals: Optional vector of starting values for the optimization.
stderr: Type of standard errors to use. The default is "normal". Other options include "boot" for bootstrapped standard errors, or "robust" for robust standard errors.
bootstraps: Optional integer specifying the number of bootstrap samples to be used for estimating standard errors when `stderr`= "boot". Note that this currently does not work when an offset variable is used.

Value

An object of class `countreg` which is a list with the following components:

model: the fitted model object.
data: the data frame used to fit the model.
call: the matched call.
formula: the formula used to fit the model.

Details

For the `family` argument, the following options are available:

"POISSON" for Poisson distribution with a log link.
"NB1" for Negative Binomial 1 distribution with a log link.
"NB2" for Negative Binomial 2 distribution with a log link (i.e., the standard negative binomial model).
"NBP" for Negative Binomial P distribution with a log link.
"PLN" for Poisson-Lognormal distribution with a log link.
"PGE" for Poisson-Generalized-Exponential distribution with a log link.
"PIG1" for Poisson-Inverse-Gaussian Type-1 distribution with a log link.
"PIG2" for Poisson-Inverse-Gaussian Type-2 distribution with a log link.
"PIG" for Poisson-Inverse-Gamma distribution with a log link.
"PL" for Poisson-Lindley distribution with a log link.
"PLG" for Poisson-Lindley-Gamma distribution with a log link.
"PLL" for Poisson-Lindley-Lognormal distribution with a log link.
"PW" for Poisson-Weibull distribution with a log link.
"SI" for Sichel distribution with a log link.
"GW" for Generalized Waring distribution with a log link.
"COM" for Conway-Maxwell-Poisson (COM) distribution with a log link.

The `dis_param_formula_1` and `dis_param_formula_2` parameters are used to estimate the dispersion parameter or other parameters of the count distribution used. This leads to the distributions parameters being functions rather than constants in the model. For example, if the user wants to estimate the overdispersion parameter of the Negative Binomial 2 distribution as a function of the variable `x1` and `x2`, the user would specify `dis_param_formula_1 = ~ x1 + x2`. In the case of the Negative Binomial distributions, the model is known as a Generalized Negative Binomial model when the overdispersion parameter is specified as a function.

The function linking the distribution parameters to the predictors is: $$Param = \exp((Intercept) + \sum \beta X)$$

The parameters for the different models are as follows:

For `dis_param_formula_1`, the models are for the parameters:

$ln(\alpha)$ for the Negative Binomial 1 model.
$ln(\alpha)$ for the Negative Binomial 2 model.
$ln(\alpha)$ for the Negative Binomial P model.
$ln(\sigma)$ for the Poisson-Lognormal model.
shape parameter for the Poisson-Generalized-Exponential model.
$ln(\eta)$ for the Poisson-Inverse-Gaussian model.
$ln(\eta)$ for the Poisson-Inverse-Gamma model.
$ln(\theta)$ for the Poisson-Lindley model.
$ln(\theta)$ for the Poisson-Lindley-Gamma model.
$ln(\theta)$ for the Poisson-Lindley-Lognormal model.
$ln(\alpha)$ for the Poisson-Weibull model.
$\gamma$ for the Sichel model.
$k$ for the Generalized Waring model.
$ln(\nu)$ for the Conway-Maxwell-Poisson model.

For `dis_param_formula_2`, the models are for the parameters:

Not Applicable for the Negative Binomial 1 model.
Not Applicable for the Negative Binomial 2 model.
p for the Negative Binomial P model.
Not Applicable for the Poisson-Lognormal model.
scale parameter for the Poisson-Generalized-Exponential model.
Not Applicable for the Poisson-Inverse-Gaussian model.
Not Applicable for the Poisson-Inverse-Gamma model.
Not Applicable for the Poisson-Lindley model.
$ln(\alpha)$ for the Poisson-Lindley-Gamma model.
$ln(\sigma)$ for the Poisson-Lindley-Lognormal model.
$ln(\sigma)$ for the Poisson-Weibull model.
$ln(\sigma)$ for the Sichel model.
$ln(\rho)$ for the Generalized Waring model.
Not Applicable for the Conway-Maxwell-Poisson model.

The `ndraws` parameter is used to estimate the distribution when there is not a closed-form solution. This uses Halton draws to integrate the distribution being compounded with the Poisson distribution. The default is 1500. The models this is applicable for include:

Poisson-Lognormal
Poisson-Generalized-Exponential
Poisson-Lindley-Gamma (more efficient than using hypergeometric functions)
Poisson-Lindley-Lognormal
Poisson-Weibull

Model Details

## Poisson Model This implements the Poisson regression model using Maximum Likelihood Estimation, as opposed to the Iteratively Reweighted Least Squares (IRLS) method used in the `glm` function.

The PMF and log-likelihood functions are: $$P(Y = y) = \frac{e^{-\mu} \mu^y}{y!}$$ $$LL_{\text{Poisson}}(\beta) = \sum_{i=1}^n \left[ -\mu_i + y_i \ln(\mu_i) - \ln(y_i!) \right]$$

The mean is: $$\mu = exp(X\beta)$$

The variance is: $$\text{Var}(Y) = \mu$$

## Negative Binomial Models**

The NB-1, NB-2, and NB-P versions of the negative binomial distribution are based on Greene (2008). The details of each of these are provided below.

### NB-1 Model The PMF and log-likelihood functions are: $$P(Y = y) = \frac{\Gamma(y + \frac{\mu}{\alpha})}{y! \, \Gamma(\frac{\mu}{\alpha})} \left( \frac{\frac{\mu}{\alpha}}{\frac{\mu}{\alpha} + \mu} \right) ^{\frac{\mu}{\alpha}} \left( \frac{\mu}{\frac{\mu}{\alpha} + \mu} \right)^y$$ $$LL_{\text{NB1}}(\beta, \alpha) = \sum_{i=1}^n \left[ \ln \Gamma\left( y_i + \frac{\mu_i}{\alpha} \right) - \ln \Gamma\left( \frac{\mu_i}{\alpha} \right) - \ln y_i! + \frac{\mu_i}{\alpha} \ln \left( \frac{\frac{\mu_i}{\alpha}}{\frac{\mu_i}{\alpha} + \mu_i} \right) + y_i \ln \left( \frac{\mu_i}{\frac{\mu_i}{\alpha} + \mu_i} \right) \right]$$

The mean is: $$\mu = exp(X\beta)$$

The variance is: $$\text{Var}(Y) = \mu + \alpha\mu$$

### NB-2 Model The PMF and log-likelihood functions are: $$P(Y = y) = \frac{\Gamma(y + \alpha)}{y! \, \Gamma(\alpha)} \left( \frac{\alpha}{\alpha + \mu} \right)^\alpha \left( \frac{\mu}{\alpha + \mu} \right)^y$$ $$LL_{\text{NB2}} = \sum_{i=1}^n \left[ \ln \Gamma(y_i + \alpha) - \ln \Gamma(\alpha) - \ln y_i! + \alpha \ln \left( \frac{\alpha}{\alpha + \mu_i} \right) + y_i \ln \left( \frac{\mu_i}{\alpha + \mu_i} \right) \right]$$

The mean is: $$\mu = exp(X\beta)$$

The variance is: $$\text{Var}(Y) = \mu + \alpha\mu^2$$

### NB-P Model The PMF and log-likelihood functions are: $$P(Y = y) = \frac{\Gamma(y + \frac{\mu^{2-p}}{\alpha})}{y! \, \Gamma(\frac{\mu^{2-p}}{\alpha})} \left( \frac{\frac{\mu^{2-p}}{\alpha}} {\frac{\mu^{2-p}}{\alpha} + \mu} \right)^{\frac{\mu^{2-p}}{\alpha}} \left( \frac{\mu}{\frac{\mu^{2-p}}{\alpha} + \mu} \right)^y$$ $$LL_{\text{NBP}}(\beta, \alpha, p) = \sum_{i=1}^n \left[ \ln \Gamma\left( y_i + \frac{\mu_i^{2-p}}{\alpha} \right) - \ln \Gamma\left( \frac{\mu_i^{2-p}}{\alpha} \right) - \ln y_i! + \frac{\mu_i^{2-p}}{\alpha} \ln \left( \frac{\frac{\mu_i^{2-p}}{\alpha}}{\frac{\mu_i^{2-p}}{\alpha} + \mu_i} \right) + y_i \ln \left( \frac{\mu_i}{\frac{\mu_i^{2-p}}{\alpha} + \mu_i} \right) \right]$$

The mean is: $$\mu = exp(X\beta)$$

The variance is: $$\text{Var}(Y) = \mu + \alpha\mu^P$$

## Poisson-Lognormal (PLN) Model The compound Probability Mass Function(PMF) for the Poisson-Lognormal distribution is: $$f(y|\lambda,\sigma)=\int_0^\infty \frac{\lambda^y x^y e^{-\lambda x}} {y!}\frac{exp\left(-\frac{ln^2(x)}{2\sigma^2} \right)}{x\sigma\sqrt{2\pi}}dx$$

Where $\sigma$ is a parameter for the lognormal distribution with the restriction $\sigma>0$, and $y$ is a non-negative integer.

The expected value of the distribution is: $$E[y]=e^{X\beta+\sigma^2/2} = \mu e^{\sigma^2/2}$$

When `ln.sigma.formula` is used, the parameter $\sigma$ is modeled as: $$ln(\sigma)=\beta_0+\beta_1 x_1 + \cdots + \beta_n x_n$$

Thus, the resulting value for the parameter $\sigma$ is: $$\sigma=e^{\beta_0+\beta_1 x_1 + \cdots + \beta_n x_n}$$

The t-statistics and p-values for the coefficients related to ln(sigma) are, by default, testing if the coefficients are different from a value of 0. This has little practical meaning given that they are coefficients for ln(sigma). They are not testing if the coefficients have statistical significance in terms of improvement over a Poisson model. The Likelihood-Ratio test results provided in the output provide a test comparing if the Poisson-Lognormal model provides a statistically significant improvement in model fit over the Poisson model.

## Poisson Generalized-Exponential (PGE) Model The Generalized Exponential distribution can be written as a function with a shape parameter $\alpha>0$ and scale parameter $\gamma>0$. The distribution has strictly positive continuous values. The PDF of the distribution is: $$f(x|\alpha,\gamma)=\frac{\alpha}{\gamma}\left(1-e^{-\frac{x}{\gamma}} \right)^{\alpha-1}e^{-\frac{x}{\gamma}}$$

Thus, the compound Probability Mass Function(PMF) for the PGE distribution is: $$f(y|\lambda,\alpha,\beta)=\int_0^\infty \frac{\lambda^y x^y e^{-\lambda x}}{y!}\frac{\alpha}{\gamma}\left(1-e^{-\frac{x}{\gamma}} \right)^{\alpha-1}e^{-\frac{x}{\gamma}} dx$$

The expected value of the distribution is: $$E[y]=\mu=\lambda \left(\frac{\psi(\alpha+1)-\psi(1)}{\gamma}\right)$$

Where $\psi(\cdot)$ is the digamma function.

The variance is: $$\sigma^2=\lambda \left(\frac{\psi(\alpha+1)-\psi(1)}{\gamma}\right) + \left(\frac{-\psi'(\alpha+1)+\psi'(1)}{\gamma^2}\right)\lambda^2$$

Where $\psi'(\cdot)$ is the trigamma function.

To ensure that $\mu=e^{X\beta}$, $\lambda$ is replaced with: $$\lambda=\frac{\gamma e^{X\beta}}{\psi(\alpha+1)-\psi(1)}$$

This results in: $$f(y|\mu,\alpha,\beta)=\int_0^\infty \frac{\left(\frac{\gamma e^{X\beta}}{\psi(\alpha+1)-\psi(1)}\right)^y x^y e^{-\left(\frac{\gamma e^{X\beta}}{\psi(\alpha+1)-\psi(1)}\right) x}}{y!}\frac{\alpha}{\gamma} \left(1-e^{-\frac{x}{\gamma}}\right)^{\alpha-1}e^{-\frac{x}{\gamma}} dx$$

Halton draws are used to perform simulation over the lognormal distribution to solve the integral.

## Poisson-Inverse-Gaussian Type 1 (PIG1) and Type 2 (PIG2) Models The Poisson-Inverse-Gaussian regression model is based on the Poisson-Inverse-Gaussian Distribution.

The expected value of the distribution in the regression utilizes a log-link function. Thus, the mean is: $$\mu=e^{X\beta}$$

The variance function for the Type 1 distribution (which is the default) is: $$\sigma^2=\mu+\eta\mu$$

While the variance for the Type 2 distribution is: $$\sigma^2=\mu+\eta\mu^2$$

The parameter $\eta$ is estimated as the natural logarithm transformed value, $\ln(\eta)$, to ensure that $\eta>0$.

## Poisson-Inverse-Gamma (PIG) Model The PDF of the distribution is: $$f(x|\eta,\mu)=\frac{2\left(\mu\left(\frac{1}{\eta}+1\right)\right) ^{\frac{x+\frac{1}{eta}+2}{2}}}{x!\Gamma\left(\frac{1}{\eta}+2\right)} K_{x-\frac{1}{\eta}-2}\left(2\sqrt{\mu\left(\frac{1}{\eta}+1\right)}\right)$$

Where $\eta$ is a shape parameter with the restriction that $\eta>0$, $\mu>0$ is the mean value, $y$ is a non-negative integer, and $K_i(z)$ is the modified Bessel function of the second kind. This formulation uses the mean directly.

The variance of the distribution is: $$\sigma^2=\mu+\eta\mu^2$$

## Poisson-Lindley (PL) Model The Poisson-Lindley regression is based on a compound Poisson-Lindley distribution. It handles count outcomes with high levels of zero observations (or other high densities at low outcome values) that standard count regression methods, including the negative binomial, may struggle to adequately capture or model.

The compound Probability Mass Function(PMF) for the Poisson-Lindley (PL) distribution is: $$f(y|\theta,\lambda)=\frac{\theta^2\lambda^y(\theta+\lambda+ y+1)}{(\theta+1)(\theta+\lambda)^{y+2}}$$

Where $\theta$ and $\lambda$ are distribution parameters with the restrictions that $\theta>0$ and $\lambda>0$, and $y$ is a non-negative integer.

The expected value of the distribution is: $$\mu=\frac{\lambda(\theta+2)}{\theta(\theta+1)}$$

If a log-link function is used, the mean is: $$\mu=e^{X\beta}=\frac{\lambda(\theta+2)}{\theta(\theta+1)}$$

Thus, the parameter $\lambda$ in the PL distribution when applied to regression analysis is: $$\lambda=\frac{\mu\theta(\theta+1)}{\theta+2}$$

Using the replacement and simplifying results in: $$f(y \mid \theta, \mu) = \\ \frac{\theta^2 (\mu \theta (\theta+1))^y (\theta^2 (1+\mu) + \theta (2+\mu) + (\theta+2) (y+1))}{(\theta+1) (\theta+2)^{y+1} (\theta^2 (1+\mu) + \theta (2+\mu))^{y+2}}$$ And $$LL=2 \log(\theta) + y (\log(\mu) + \log(\theta) + \log(\theta+1)) + \log(\theta^2 (1+\mu) + \theta (2+\mu) + (\theta+2) (y+1)) - \log(\theta+1) - (y+1) \log(\theta+2) - (y+2) \log(\theta^2 (1+\mu) + \theta (2+\mu))$$

The variance function is defined as: $$\sigma^2=\mu+\left(1-\frac{2}{(\theta+2)^2}\right)\mu^2$$

It should be noted that the p-value for the parameter `ln(theta)` in the model summary is testing if the parameter `theta` is equal to a value of 1. This has no practical meaning. The Likelihood-Ratio (LR) test compares the Poisson-Lindley regression with a Poisson regression with the same independent variables. Thus, the PR test result indicates the statistical significance for the improvement in how well the model fits the data over a Poisson regression. This indicates the statistical significance of the `theta` parameter.

## Poisson-Lindley-Gamma (PLG) Model The Poisson-Lindley-Gamma regression is based on a compound Poisson-Lindley- Gamma distribution. Details of the distribution can be seen at dplindGamma.

The mean for the regression model is: $$\mu=e^{X\beta}$$

The variance function is defined as: $$\sigma^2=\mu+\left(2\alpha+1-\frac{2(1+\alpha)} {(\theta+2)^2}\right)\mu^2$$

It should be noted that the p-value for the parameters `ln(theta)` and `ln(alpha)` in the model summary are testing if the parameter `theta` and `alpha` are equal to a value of 1.

## Poisson-Lindley-Lognormal (PLL) Model The Poisson-Lindley-Lognormal regression is based on a compound Poisson- Lindley-Lognormal distribution. Details of the distribution can be seen at dplindLnorm.

The mean for the regression model is: $$\mu=e^{X\beta}$$

The variance function is defined as: $$\sigma^2=\mu+\left(\frac{1-\frac{2}{(\theta+2)^2}}{e^{\frac{\sigma^2} {2}}}+e^{\sigma^2}-1\right)\mu^2$$

It should be noted that the p-value for the parameters `ln(theta)` and `ln(sigma)` in the model summary are testing if the parameter `theta` and `sigma` are equal to a value of 1.

## Poisson-Weibull (PW) Model The Poisson-Weibull distribution uses the Weibull distribution as a mixing distribution for a Poisson process. It is useful for modeling overdispersed count data. The density function (probability mass function) for the Poisson-Weibull distribution is given by: $$P(y|\lambda,\alpha,\sigma) = \int_0^\infty \frac{e^{-\lambda x} \lambda^y x^y }{y!} \left(\frac{\alpha}{\sigma}\right)\left(\frac{x}{\sigma} \right)^{\alpha-1}e^{-\left(\frac{x}{\sigma}\right)^\alpha} dx$$ where $f(x| \alpha, \sigma)$ is the PDF of the Weibull distribution and $\lambda$ is the mean of the Poisson distribution.

For the Poisson-Weibull Regression model, the expected values is: $$E[Y]=\lambda\sigma\Gamma\left(1+\frac{1}{\alpha}\right)$$ Where $\lambda$ is the mean of the Poisson distribution, $\alpha$ is the shape parameter, and $\sigma$ is the scale parameter.

To ensure that the regression model predicts the mean value, the regression utilizes: $$\mu=\exp{X\gamma}=\lambda\sigma\Gamma\left(1+\frac{1}{\alpha}\right)$$ Where $X$ is a matrix of independent variables and $\gamma$ is a vector of coefficients.

This leads to: $$\lambda=\frac{\mu}{\sigma\Gamma\left(1+\frac{1}{\alpha}\right)}$$

The variance for the Poisson-Weibull regression is: $$V[Y]=\mu+\left(\frac{\Gamma\left(1+\frac{2}{\alpha}\right)} {\Gamma\left(1+\frac{1}{\alpha}\right)^2}-1\right)\mu^2$$

## Sichel (SI) Model The compound Probability Mass Function (PMF) for the Sichel distribution uses the formulation from Zhou et al. (2011) and Rigby et al. (2008): $$f(y|\mu, \sigma, \gamma)=\frac{\left(\frac{\mu}{c}\right)^y K_{y+\gamma}(\alpha)}{K_\gamma(1/\sigma)y!(\alpha\sigma)^{y+\gamma}}$$

Where $\sigma$ and $\gamma$ are distribution parameters with $-\infty < \gamma < \infty$ and $\sigma>0$, $c=\frac{K_{\gamma+1}(1/\sigma)}{K_\gamma(1/\sigma)}$, $\alpha^2=\sigma^{-2}+2\mu(c\sigma)^{-1}$, a mean value of $\mu$, $y$ is a non-negative integer, and $K_j(x)$ is a modified Bessel function of the third kind with order $j$ and argument $x$.

The variance of the distribution is: $$\sigma^2=\mu+\left(\frac{2\sigma(\gamma+1)}{c}+\frac{1}{c^2}-1\right) \mu^2$$

## Generalized Waring (GW) Model The following are the versions of the PMF, mean, and variance used for the Generalized Waring model. This is adjusted from the typical formulation by replacing parameter k with $\mu$ $$PMF=\frac{\Gamma(\alpha+y)\Gamma(k+y)\Gamma(\rho+k)\Gamma(\alpha+\rho)} {y!\Gamma(\alpha)\Gamma(k)\Gamma(\rho)\Gamma(\alpha+k+\rho+y)}$$ $$\mu=e^{X\beta}=\frac{\alpha k}{\rho-1}$$ $$\sigma^2=\frac{\alpha k(\alpha+k+\rho-1)}{(\rho-1)^2(\rho-2)}$$

The distribution parameters are often considered to capture the randomness (parameter $$\alpha$$), proneness (parameter $$$$k), and liability (parameter $$\rho$$) of the data.

If we use: $$\alpha=\frac{\mu k}{\rho-1}$$

The PMF becomes:

$$PMF=\frac{\Gamma\left(\frac{\mu k}{\rho-1}+y\right)\Gamma(k+y) \Gamma(\rho+k)\Gamma\left(\frac{\mu k}{\rho-1}+\rho\right)}{y! \Gamma\left(\frac{\mu k}{\rho-1}\right)\Gamma(k)\Gamma(\rho) \Gamma\left(\frac{\mu k}{\rho-1}+k+\rho+y\right)}$$

This results in a regression model where: $$\mu=e^{X\beta}$$ $$\sigma^2=\frac{\mu k^2\left(\frac{\mu k}{\rho-1}+k+\rho-1\right)} {(\rho-1)^3(\rho-2)}=\left(\frac{k^3+\rho k^2- k^2}{(\rho-1)^3(\rho-2)} \right)\mu+\left(\frac{k^3}{(\rho-1)^4(\rho-2)}\right)\mu^2$$

Note that when $$p=1$$ or $$p=2$$, the distribution is undefined.

## Conway-Maxwell-Poisson (COM) Model The following is the the PMF for the COM model. $$f(x|\lambda, \nu)=\frac{\lambda^x}{(x!)^\nu Z(\lambda,\nu)}$$

Where $\lambda$ and $\nu$ are distribution parameters with $\lambda>0$ and $\nu>0$, and $Z(\lambda,\nu)$ is the normalizing constant.

The normalizing constant is given by: $$Z(\lambda,\nu)=\sum_{n=0}^{\infty}\frac{\lambda^n}{(n!)^\nu}$$

The mean and variance are: $$\mu=e^{X\beta}=\lambda \frac{\delta}{\delta \lambda} \log(Z(\lambda, \nu))$$ $$\sigma^2=\lambda \frac{\delta}{\delta \lambda} \mu$$

Note that the COM distribution parameter $\lambda$ is solved for using $\mu$ and $\nu$, so the regression model provides direct predictions for the mean.

## Underreporting Models for underreporting combine a binary probability model (logit or probit) with a count model. This is accomplished using a model for the probability of crashes being reported multiplied by the estimated mean for the count model, based on the observed data. This is discussed in Wood et. al. (2016), Pararai et. al., (2006), and Pararai et. al., (2010). The underreporting model is based on: $$\mu_{true}=\mu_{observed}\cdot P(\text{event is reported})$$

This allows the inference of both the true event count and the probability of the event being reported as a function of independent variables.

References

Greene, W. (2008). Functional forms for the negative binomial model for count data. Economics Letters, 99(3), 585-590.

Pararai, M., Famoye, F., & Lee, C. (2006). Generalized Poisson regression model for underreported counts. Advances and applications in Statistics, 6(3), 305-322.

Pararai, M., Famoye, F., & Lee, C. (2010). Generalized Poisson-Poisson mixture model for misreported counts with an application to smoking data. Journal of Data Science, 8(4), 607-617.

Rigby, R. A., Stasinopoulos, D. M., & Akantziliotou, C. (2008). A framework for modelling overdispersed count data, including the Poisson-shifted generalized inverse Gaussian distribution. Computational Statistics & Data Analysis, 53(2), 381-393.

Wood, J.S., Eric T. Donnell, & Christopher J. Fariss. "A method to account for and estimate underreporting in crash frequency research." Accident Analysis & Prevention 95 (2016): 57-66.

Zou, Y., Lord, D., & Zhang, Y. (2012). Analyzing highly dispersed crash data using the Sichel generalized additive models for location, scale and shape.

Examples

# \donttest{
# Load the Washington data
data("washington_roads")
washington_roads$AADT10kplus <- ifelse(washington_roads$AADT > 10000, 1, 0)
# Estimate an NB2 model with a dispersion parameter as a function of the 
# variable `speed50` (i.e., generalized NB2), verbose output, and use the 
# BFGS optimization method
nb2 <- countreg(Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus,
               data = washington_roads, family = "NB2",
               dis_param_formula_1 = ~ speed50, verbose = TRUE, 
               method='BFGS')
#> initial  value 1087.972055 
#> iter   2 value 1086.761180
#> iter   3 value 1082.460820
#> iter   4 value 1081.386386
#> iter   5 value 1080.983879
#> iter   6 value 1079.708978
#> iter   7 value 1070.603400
#> iter   8 value 1068.239212
#> iter   9 value 1065.881518
#> iter  10 value 1065.612937
#> iter  11 value 1065.068995
#> iter  12 value 1064.893660
#> iter  13 value 1064.876776
#> iter  14 value 1064.876161
#> iter  15 value 1064.876105
#> iter  15 value 1064.876105
#> iter  15 value 1064.876105
#> final  value 1064.876105 
#> converged
summary(nb2)
#> Call:
#>  Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus 
#> 
#>  Method:  countreg 
#> Iterations:  44 
#> Convergence:  successful convergence  
#> Log-likelihood:  -1064.876 
#> 
#> Parameter Estimates:
#> (Using bootstrapped standard errors)
#> # A tibble: 7 × 7
#>   parameter           coeff `Std. Err.` `t-stat` `p-value` `lower CI` `upper CI`
#>   <chr>               <dbl>       <dbl>    <dbl>     <dbl>      <dbl>      <dbl>
#> 1 (Intercept)        -7.40        0.043  -172.       0         -7.49      -7.32 
#> 2 lnaadt              0.912       0.005   182.       0          0.902      0.921
#> 3 lnlength            0.843       0.037    22.9      0          0.771      0.915
#> 4 speed50            -0.47        0.102    -4.61     0         -0.67      -0.27 
#> 5 AADT10kplus         0.77        0.09      8.59     0          0.594      0.945
#> 6 ln(alpha):(Interc… -1.62        0.288    -5.62     0         -2.18      -1.06 
#> 7 ln(alpha):speed50   1.31        0.458     2.85     0.004      0.409      2.20 


# Estimate a Poisson-Lognormal model (a low number of draws is used to speed 
# up the estimation for examples - not recommended in practice)
pln <- countreg(Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus,
              data = washington_roads, family = "PLN", ndraws=10)
summary(pln)  
#> Call:
#>  Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus 
#> 
#>  Method:  countreg 
#> Iterations:  485 
#> Convergence:  successful convergence  
#> Log-likelihood:  -1075.729 
#> 
#> Parameter Estimates:
#> (Using bootstrapped standard errors)
#> # A tibble: 6 × 7
#>   parameter    coeff `Std. Err.` `t-stat` `p-value` `lower CI` `upper CI`
#>   <chr>        <dbl>       <dbl>    <dbl>     <dbl>      <dbl>      <dbl>
#> 1 (Intercept) -7.78        0.038  -205.       0         -7.85      -7.70 
#> 2 lnaadt       0.907       0.004   209.       0          0.898      0.915
#> 3 lnlength     0.827       0.033    24.9      0          0.762      0.892
#> 4 speed50     -0.434       0.086    -5.08     0         -0.602     -0.267
#> 5 AADT10kplus  0.774       0.071    10.9      0          0.635      0.913
#> 6 ln(sigma)    0.117       0.096     1.22     0.222     -0.071      0.304

# Estimate an Poisson-Lognormal with underreporting (probit)
plogn_underreport <- countreg(Total_crashes ~ lnaadt + lnlength + speed50 + 
               AADT10kplus,
              data = washington_roads, family = "NB2",
              underreport_formula = ~ speed50 + AADT10kplus, 
              underreport_family = "probit")
summary(plogn_underreport)
#> Call:
#>  Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus 
#> 
#>  Method:  countreg 
#> Iterations:  494 
#> Convergence:  successful convergence  
#> Log-likelihood:  -1065.087 
#> 
#> Parameter Estimates:
#> (Using bootstrapped standard errors)
#> # A tibble: 9 × 7
#>   parameter           coeff `Std. Err.` `t-stat` `p-value` `lower CI` `upper CI`
#>   <chr>               <dbl>       <dbl>    <dbl>     <dbl>      <dbl>      <dbl>
#> 1 (Intercept)        -6.69        0.043  -157.       0         -6.78      -6.61 
#> 2 lnaadt              0.901       0.005   182.       0          0.892      0.911
#> 3 lnlength            0.832       0.037    22.6      0          0.76       0.904
#> 4 speed50            -0.924       0.093    -9.92     0         -1.11      -0.741
#> 5 AADT10kplus         2.22        0.09     24.7      0          2.04       2.40 
#> 6 ln(alpha)          -1.50        0.34     -4.42     0         -2.17      -0.838
#> 7 Underreporting:(I… -0.085       0.044    -1.93     0.054     -0.172      0.001
#> 8 Underreporting:sp… -0.774       0.21     -3.69     0         -1.18      -0.362
#> 9 Underreporting:AA…  1.25        0.056    22.4      0          1.14       1.35 

# Estimate a Conway-Maxwell-Poisson model
com_model <- countreg(Total_crashes ~ lnaadt + lnlength + speed50 + 
                      AADT10kplus,
                      data = washington_roads, family = "COM", method="BHHH")
summary(com_model)
#> Call:
#>  Total_crashes ~ lnaadt + lnlength + speed50 + AADT10kplus 
#> 
#>  Method:  countreg 
#> Iterations:  6 
#> Convergence:  successive function values within tolerance limit (tol) 
#> Log-likelihood:  -1066.063 
#> 
#> Parameter Estimates:
#> (Using bootstrapped standard errors)
#> # A tibble: 6 × 7
#>   parameter    coeff `Std. Err.` `t-stat` `p-value` `lower CI` `upper CI`
#>   <chr>        <dbl>       <dbl>    <dbl>     <dbl>      <dbl>      <dbl>
#> 1 (Intercept) -7.42        0.043  -174.           0     -7.50      -7.34 
#> 2 lnaadt       0.912       0.005   185.           0      0.903      0.922
#> 3 lnlength     0.84        0.036    23.2          0      0.769      0.911
#> 4 speed50     -0.445       0.088    -5.05         0     -0.618     -0.272
#> 5 AADT10kplus  0.788       0.085     9.25         0      0.621      0.955
#> 6 ln(nu)      -0.62        0.171    -3.63         0     -0.955     -0.285
## }