### BibTeX

@conference{dd101eea2a664ffc98f95ee7fb281cc5,

title = "Overdispersion and tabulation of rates from time-to-event data",

abstract = "Context: When applying Poisson regression to tables of rates, one often encounters the issue of overdispersion, i.e. the residual variation being larger than expected from Poisson sampling, see (Breslow 1984). A standard way of handling the extra variation is to use quasi-likelihood, introducing a dispersion parameter to measure the extraneous variation on a multiplicative scale. Reporting is then done by multiplying the standard error of predicted rates by the estimated dispersion, an ad hoc procedure. Objective: We illustrate that the standard way of using quasi-likelihood to provide a measure of dispersion in the setting of time-to-event data is not well defined and has dangerous unintended consequences. Prime amongst these is a dependence on the tabulation of data. This dependence is not formally included in the modelling, making interpretation of the dispersion essentially meaningless. Additionally, the lack of an interpretable model implies that a relevant simulation scheme cannot be devised (in fact, the existence of such a scheme could be taken as a defining trait of a meaningful model). Methods: We applied quasi-Poisson regression to a real-world dataset containing the number of testis cancer cases and male person-years in the Danish population 1943-1996, ages 15-65. For a range of labelling and tabulation choices (1x1 year, 2x2 years, …, 5x5 years), using natural splines, the resulting dispersion as well as deviance and residual deviance divided by the residual degrees of freedom are reported. Here labelling refers to the different sets with constant rates (a model characteristic), in contrast to tabulation, which describes the data layout. Results: The dispersion parameter, whether estimated using the Pearson residuals or the deviance residuals, exhibits dependence on tabulation as opposed to labelling. For comparable models (using splines of degree 5 for the timescales), dispersion ranged from 1.011 for 1x1 tabulation to 1.457 for 5x5 tabulation. Conclusions: The standard measures of dispersion when applying quasi-Poisson regression to a table of rates are not tabulation invariant.",

author = "Diaz, Lars Jorge and Carstensen, Bendix",

year = "2019",

language = "Danish",

note = "International Society of Clinical Biostatistics ; Conference date: 14-07-2019 Through 18-07-2019",

url = "https://kuleuvencongres.be/iscb40/",

}

### RIS

TY - ABST

T1 - Overdispersion and tabulation of rates from time-to-event data

AU - Diaz, Lars Jorge

AU - Carstensen, Bendix

PY - 2019

Y1 - 2019

AB - Context: When applying Poisson regression to tables of rates, one often encounters the issue of overdispersion, i.e. the residual variation being larger than expected from Poisson sampling, see (Breslow 1984). A standard way of handling the extra variation is to use quasi-likelihood, introducing a dispersion parameter to measure the extraneous variation on a multiplicative scale. Reporting is then done by multiplying the standard error of predicted rates by the estimated dispersion, an ad hoc procedure. Objective: We illustrate that the standard way of using quasi-likelihood to provide a measure of dispersion in the setting of time-to-event data is not well defined and has dangerous unintended consequences. Prime amongst these is a dependence on the tabulation of data. This dependence is not formally included in the modelling, making interpretation of the dispersion essentially meaningless. Additionally, the lack of an interpretable model implies that a relevant simulation scheme cannot be devised (in fact, the existence of such a scheme could be taken as a defining trait of a meaningful model). Methods: We applied quasi-Poisson regression to a real-world dataset containing the number of testis cancer cases and male person-years in the Danish population 1943-1996, ages 15-65. For a range of labelling and tabulation choices (1x1 year, 2x2 years, …, 5x5 years), using natural splines, the resulting dispersion as well as deviance and residual deviance divided by the residual degrees of freedom are reported. Here labelling refers to the different sets with constant rates (a model characteristic), in contrast to tabulation, which describes the data layout. Results: The dispersion parameter, whether estimated using the Pearson residuals or the deviance residuals, exhibits dependence on tabulation as opposed to labelling. For comparable models (using splines of degree 5 for the timescales), dispersion ranged from 1.011 for 1x1 tabulation to 1.457 for 5x5 tabulation. Conclusions: The standard measures of dispersion when applying quasi-Poisson regression to a table of rates are not tabulation invariant.

M3 - Conference abstract for conference

ER -