The idea of GLMs is that, given some covariates X, Y has a distribution in the exponential family (Gaussian, Poisson, Gamma, etc.). But that does not mean that Y itself has a similar distribution, so there is no reason to test for a Gamma model for Y before running a Gamma regression, for instance. But are there cases where it might work, that is, where the non-conditional distribution is the same (or at least in the same family) as the conditional ones?
For instance, if (X, Y) has a joint Gaussian distribution, then both marginals are Gaussian, and so is Y | X. So, in that case, if the covariate is normally distributed, it is possible to have a Gaussian distribution for Y as well. The econometric interpretation is that, with a standard Gaussian linear model, if X is normally distributed, then not only is the conditional distribution of Y | X Gaussian, but so is the non-conditional distribution of Y.
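> set.seed(1)
> n=1e3
> X=rnorm(n,10,2)        # covariate: Gaussian, mean 10, standard deviation 2
> Y=1+3*X+rnorm(n)       # Gaussian linear model with standard normal noise
> plot(X,Y,xlim=c(4,20))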
Indeed, here the distribution of Y is also Gaussian:
> library(nortest)
> ad.test(Y)

    Anderson-Darling normality test

data:  Y
A = 0.23155, p-value = 0.802

> shapiro.test(Y)

    Shapiro-Wilk normality test

data:  Y
W = 0.99892, p-value = 0.8293
(and this is not only a statistical finding: the theory of Gaussian random vectors confirms that the non-conditional distribution of Y is indeed Gaussian)
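To spell out that argument for the simulated example, here is a short worked derivation. It assumes, as in the simulation, that the noise term (written ε here, a symbol not used in the original) is independent of X:

```latex
\begin{aligned}
X &\sim \mathcal{N}(10,\ 2^2), \qquad \varepsilon \sim \mathcal{N}(0,\ 1), \qquad X \perp \varepsilon, \qquad Y = 1 + 3X + \varepsilon \\
Y \mid X = x &\sim \mathcal{N}(1 + 3x,\ 1) \\
\mathbb{E}[Y] &= 1 + 3 \cdot 10 = 31 \\
\operatorname{Var}(Y) &= 3^2 \cdot 2^2 + 1 = 37 \\
Y &\sim \mathcal{N}(31,\ 37)
\end{aligned}
```

The last line holds because Y is an affine function of two independent Gaussian variables, so it is Gaussian itself.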
Here X is continuous. What if we consider a finite mixture here, i.e. X takes only a finite number of values? Actually, Teicher (1963) proved that it is then not possible to have a non-conditional Gaussian distribution for Y. But in practice, would we really reject the Gaussian assumption for Y? If the number of classes is too small, yes. But with a large number of classes (a sufficiently large number of mixture components), …
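The question can be explored numerically. What follows is a minimal sketch, not the author's code. It assumes X is a discretised version of the Gaussian covariate used earlier, taking k equally likely values placed at quantiles of a Gaussian with mean 10 and standard deviation 2. The function name `sim_pvalue`, the choice of support points and the values of k are all illustrative assumptions.

```r
# Illustrative sketch (assumptions: discretised Gaussian covariate, same
# linear model as in the simulation earlier in the post).
set.seed(1)

# Hypothetical helper: simulate Y when X takes only k distinct values,
# and return the Shapiro-Wilk p-value for the non-conditional sample of Y.
sim_pvalue <- function(k, n = 1000) {
  # k equally likely support points, placed at Gaussian quantiles
  support <- qnorm((seq_len(k) - 0.5) / k, mean = 10, sd = 2)
  X <- sample(support, size = n, replace = TRUE)
  # Conditionally on X, Y is Gaussian; unconditionally, Y is a finite mixture
  Y <- 1 + 3 * X + rnorm(n)
  shapiro.test(Y)$p.value
}

ks <- c(2, 5, 20, 100)          # number of classes to try (arbitrary)
pvals <- sapply(ks, sim_pvalue)
print(data.frame(classes = ks, p_value = pvals))
```

Each row reports one Shapiro-Wilk p-value for the non-conditional distribution of Y. Rerunning with other seeds or sample sizes gives a sense of how often normality is rejected for a given number of classes.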
Source: r-bloggers.com