The linearity assumption means that each continuous predictor is
related to the outcome in a linear fashion in the regression formula
of a generalized linear model. There are several ways to deal
with a continuous variable. Some prefer to categorize the variable
into 2 or more classes. This effectively means that a piecewise
constant relation with the outcome is assumed, which is very
unnatural. For example, if we dichotomize age at 65 years, a
64-year-old is supposed to be at a clearly different risk than a
66-year-old, while the 64-year-old's risk would be identical to
that of, say, a 54-year-old. This is illustrated in the graph below.
Figure 5.1: Age and 30-day Mortality Relationship

Illustration of the relationship between age and 30-day mortality
after acute myocardial infarction. Data from the GUSTO-I trial
(Lee et al., 1995) were analyzed with age as a linear, continuous variable
(thick line) and with a dichotomized version of age (<65 years versus ≥65
years). With the dichotomized version of age, there is an unnaturally
big step between age 64 and age 65, and no difference in predicted risk
among patients younger than 65 or among those aged 65 years or older.
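The contrast between a continuous and a dichotomized age effect can be sketched in a few lines of Python. This is a minimal illustration only: the function names and all coefficients are hypothetical values chosen for the example, not estimates from the GUSTO-I data, and a logistic model is assumed as the generalized linear model.

```python
import math


def inv_logit(linear_predictor):
    """Convert a linear predictor (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def risk_linear(age, intercept=-7.0, slope=0.07):
    """Predicted risk with age as a linear, continuous term.

    The intercept and slope are hypothetical, for illustration only.
    """
    return inv_logit(intercept + slope * age)


def risk_dichotomized(age, intercept=-3.5, effect_old=1.5, cutoff=65):
    """Predicted risk with age dichotomized at the cutoff.

    The coefficients are hypothetical, for illustration only.
    """
    is_old = 1 if age >= cutoff else 0
    return inv_logit(intercept + effect_old * is_old)


# Compare the two codings at the ages used in the text.
for age in (54, 64, 66):
    print(age, round(risk_linear(age), 3), round(risk_dichotomized(age), 3))
```

With the dichotomized coding, the 54-year-old and the 64-year-old receive exactly the same predicted risk, and the whole age effect is concentrated in a single step at the cutoff. With the linear coding, the predicted risk rises gradually over the full age range.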
We advocate the use of continuous smooth functions. Polynomials
such as age^2, age^3, etc. have the disadvantage that the functions
may behave erratically at the tails. This disadvantage also holds
for more recent suggestions, such as fractional polynomials
(e.g. age^2 + age^0.5) (Royston, 2000), but less so for functions
such as restricted cubic splines (Harrell et al., 1988).
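The difference in tail behavior can be made concrete with a small sketch. The code below builds restricted cubic spline terms using the truncated power basis (without the scaling that some software applies) and compares the curvature of a spline term with that of a plain cubic term beyond the last knot. The knot locations and function names are assumptions made for this example, not recommendations.

```python
def cubed_positive_part(u):
    """Return u**3 if u is positive, otherwise 0."""
    return u ** 3 if u > 0 else 0.0


def rcs_basis(x, knots):
    """Restricted cubic spline terms for x: one linear term plus
    len(knots) - 2 nonlinear terms (truncated power basis, unscaled)."""
    if len(knots) < 3:
        raise ValueError("at least 3 knots are required")
    t_last, t_prev = knots[-1], knots[-2]
    span = t_last - t_prev
    terms = [x]
    for t_j in knots[:-2]:
        # The two correction terms cancel the cubic and quadratic
        # parts beyond the last knot, so the tail is linear.
        term = (cubed_positive_part(x - t_j)
                - cubed_positive_part(x - t_prev) * (t_last - t_j) / span
                + cubed_positive_part(x - t_last) * (t_prev - t_j) / span)
        terms.append(term)
    return terms


def second_difference(f, x, h=1.0):
    """Discrete curvature of f at x; zero for a straight line."""
    return f(x + h) - 2 * f(x) + f(x - h)


knots = [40, 55, 65, 80]  # hypothetical knot locations for age


def spline_term(age):
    """First nonlinear spline term for the knots above."""
    return rcs_basis(age, knots)[1]


def cubic_term(age):
    """Plain polynomial term age^3."""
    return age ** 3


# Curvature at age 90, which lies beyond the last knot.
print(second_difference(spline_term, 90))
print(second_difference(cubic_term, 90))
```

Beyond the outer knots the spline terms have no curvature apart from floating-point rounding, whereas the cubic term keeps bending ever more steeply. This restriction to linear tails is what makes restricted cubic splines less prone to extreme predictions at the edges of the data.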
QUESTION 5.1
Linearity can be tested by: