The Capital Region of Denmark - a part of Copenhagen University Hospital

Overdispersion and tabulation of rates from time-to-event data

Research output: Contribution to conference › Conference abstract for conference › Research › peer-review

  1. Omega-3 fatty acids and risk of cardiovascular disease in Inuit: First prospective cohort study

    Research output: Contribution to journal › Journal article › Research › peer-review

  2. Components of diabetes prevalence in Denmark 1996-2016 and future trends until 2030

    Research output: Contribution to journal › Journal article › Research › peer-review

  3. Prevalence, incidence and mortality of type 1 and type 2 diabetes in Denmark 1996-2016

    Research output: Contribution to journal › Journal article › Research › peer-review

Context: When applying Poisson regression to tables of rates, one often encounters overdispersion, i.e. residual variation larger than expected under Poisson sampling (Breslow 1984). A standard way of handling the extra variation is to use quasi-likelihood, introducing a dispersion parameter that measures the extraneous variation on a multiplicative scale. Reporting is then done by multiplying the standard error of predicted rates by the estimated dispersion, an ad hoc procedure.
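For orientation, the two usual dispersion estimates can be written out as below. This is a sketch of the textbook definitions, not notation taken from the abstract: the symbols are assumptions, with d_i the event count and y_i the person-years in cell i, \hat\lambda_i the fitted rate, n the number of cells in the table and p the number of model parameters.

```latex
\hat\mu_i = \hat\lambda_i \, y_i,
\qquad
\hat\phi_{\mathrm{Pearson}}
  = \frac{1}{n-p} \sum_{i=1}^{n} \frac{(d_i - \hat\mu_i)^2}{\hat\mu_i},
\qquad
\hat\phi_{\mathrm{deviance}}
  = \frac{2}{n-p} \sum_{i=1}^{n}
    \left[ d_i \log\frac{d_i}{\hat\mu_i} - (d_i - \hat\mu_i) \right]
```

Both sums run over the cells of the table, so n, and with it the residual degrees of freedom n - p, is fixed by the data layout rather than by the model.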

Objective: We illustrate that the standard way of using quasi-likelihood to provide a measure of dispersion in the setting of time-to-event data is not well defined and has dangerous unintended consequences. Prime amongst these is a dependence on the tabulation of data. This dependence is not formally included in the modelling, making interpretation of the dispersion essentially meaningless. Additionally, the lack of an interpretable model implies that a relevant simulation scheme cannot be devised (in fact, the existence of such a scheme could be taken as a defining trait of a meaningful model).

Methods: We applied quasi-Poisson regression to a real-world dataset containing the number of testis cancer cases and male person-years in the Danish population 1943-1996, ages 15-65. For a range of labelling and tabulation choices (1x1 year, 2x2 years, …, 5x5 years), with models fitted using natural splines, we report the resulting dispersion as well as the deviance and the residual deviance divided by the residual degrees of freedom. Here labelling refers to the different sets with constant rates (a model characteristic), in contrast to tabulation, which describes the data layout.
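The tabulation step itself can be sketched as a plain aggregation of a fine age-by-period table into coarser blocks. The function name `retabulate` and the list-of-rows layout are hypothetical choices for illustration; the abstract does not say how the authors prepared their tables.

```python
def retabulate(events, person_years, k):
    """Aggregate 1x1 age-by-period cells into k-by-k blocks.

    events, person_years: equally shaped lists of rows (age) by
    columns (period). Blocks at the edges are smaller when the table
    dimensions are not multiples of k. This layout is an assumption
    for illustration, not the authors' data format.
    """
    n_age, n_per = len(events), len(events[0])
    out_d, out_y = [], []
    for a0 in range(0, n_age, k):
        row_d, row_y = [], []
        for p0 in range(0, n_per, k):
            # All fine cells that fall inside the current coarse block.
            block = [
                (a, p)
                for a in range(a0, min(a0 + k, n_age))
                for p in range(p0, min(p0 + k, n_per))
            ]
            row_d.append(sum(events[a][p] for a, p in block))
            row_y.append(sum(person_years[a][p] for a, p in block))
        out_d.append(row_d)
        out_y.append(row_y)
    return out_d, out_y
```

Aggregation preserves the total number of events and person-years; only the number of cells, and hence the number of observations seen by the regression, changes.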

Results: The dispersion parameter, whether estimated using the Pearson residuals or the deviance residuals, depends on tabulation rather than on labelling. For comparable models (using splines of degree 5 for the timescales), dispersion ranged from 1.011 for 1x1 tabulation to 1.457 for 5x5 tabulation.
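A toy illustration of the same effect, under strong simplifying assumptions: the data are invented, the table is one-dimensional, and the model is a single constant rate (so the labelling is identical in both fits) rather than the spline models used in the study. The helper names are hypothetical.

```python
def pearson_dispersion(cells):
    """Pearson dispersion for a constant-rate Poisson model.

    cells: list of (events, person_years) tuples. The maximum
    likelihood estimate of the single rate is total events divided by
    total person-years, so the model has one parameter and
    len(cells) - 1 residual degrees of freedom.
    """
    total_d = sum(d for d, _ in cells)
    total_y = sum(y for _, y in cells)
    rate = total_d / total_y
    chi2 = sum((d - rate * y) ** 2 / (rate * y) for d, y in cells)
    return chi2 / (len(cells) - 1)


def merge_adjacent(cells, k):
    """Collapse each run of k consecutive cells into one coarser cell."""
    return [
        (
            sum(d for d, _ in cells[i:i + k]),
            sum(y for _, y in cells[i:i + k]),
        )
        for i in range(0, len(cells), k)
    ]


# Hypothetical toy data: 8 one-year cells with 1000 person-years each.
fine = [(3, 1000.0), (5, 1000.0), (4, 1000.0), (8, 1000.0),
        (6, 1000.0), (9, 1000.0), (7, 1000.0), (6, 1000.0)]
coarse = merge_adjacent(fine, 2)  # same data in 4 two-year cells

phi_fine = pearson_dispersion(fine)
phi_coarse = pearson_dispersion(coarse)
print(round(phi_fine, 3), round(phi_coarse, 3))  # prints: 0.667 0.722
```

The fitted rate is the same in both cases (48 events over 8000 person-years), yet the dispersion estimate changes with the layout alone. The direction and size of the change in this toy case say nothing about the study's data.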

Conclusions: The standard measures of dispersion when applying quasi-Poisson regression to a table of rates are not tabulation invariant.
Original language: Danish
Publication date: 2019
Publication status: Published - 2019
Event: International Society of Clinical Biostatistics - KU Leuven, Leuven, Belgium
Duration: 14 Jul 2019 to 18 Aug 2019



Event: Conference

ID: 57796173