I'm posting a quotation here that, in this or a similar form, is needed in the forum every now and then, so that it can be linked to here together with a citable source reference.
I'm quoting from the following book:
Andrew Gelman, Jennifer Hill and Aki Vehtari: Regression and Other Stories, Cambridge: Cambridge University Press, 2020, DOI: 10.1017/9781139161879
Part 2: Linear Regression, chapter 11 - Assumptions, diagnostics, and model evaluation, subheading 11.1 - Assumptions of regression analysis, pages 153ff
On pages 154f. it says:
6. Normality of errors. The distribution of the error term is relevant when predicting individual data points. For the purpose of estimating the regression line (as compared to predicting individual data points), the assumption of normality is typically barely important at all. Thus we do not recommend diagnostics of the normality of regression residuals. For example, many textbooks recommend quantile-quantile (Q-Q) plots, in which the ordered residuals are plotted vs. the corresponding expected values of ordered draws from a normal distribution, with departures of this plot from linearity indicating nonnormality of the error term. There is nothing wrong with making such a plot, and it can be relevant when evaluating the use of the model for predicting individual data points, but we are typically more concerned with the assumptions of validity, representativeness, additivity, linearity, and so on, listed above.
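For anyone who wants to see what the Q-Q construction described in the quote amounts to, here is a minimal Python sketch (my own illustration, not from the book). It makes a few assumptions: the expected values of ordered normal draws are approximated by the normal quantiles at the plotting positions (i - 0.5)/n, the helper name `qq_points` is made up, and the two residual vectors are simulated rather than taken from a real regression. Instead of drawing the plot, it summarises how straight the Q-Q line is by the correlation between the two axes.

```python
import numpy as np
from statistics import NormalDist


def qq_points(residuals):
    """Return (theoretical, sample) coordinates of a normal Q-Q plot.

    Assumption: expected normal order statistics are approximated by
    standard normal quantiles at the plotting positions (i - 0.5) / n.
    """
    sample = np.sort(np.asarray(residuals, dtype=float))
    n = sample.size
    probs = (np.arange(1, n + 1) - 0.5) / n
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    return theoretical, sample


rng = np.random.default_rng(1)
normal_res = rng.normal(size=2000)             # simulated normal "residuals"
skewed_res = rng.exponential(size=2000) - 1.0  # simulated skewed "residuals"

# A correlation near 1 means the Q-Q points lie close to a straight line.
t_norm, s_norm = qq_points(normal_res)
t_skew, s_skew = qq_points(skewed_res)
corr_normal = np.corrcoef(t_norm, s_norm)[0, 1]
corr_skewed = np.corrcoef(t_skew, s_skew)[0, 1]

print(f"Q-Q correlation, normal residuals: {corr_normal:.3f}")
print(f"Q-Q correlation, skewed residuals: {corr_skewed:.3f}")
```

Plotting `theoretical` against `sample` (for example with matplotlib) gives the usual picture; the skewed residuals bend away from the straight line, which is the "departure from linearity" the quote refers to.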
I'm skipping one paragraph here.
The regression model does not assume or require that predictors be normally distributed. In addition, the normal distribution on the outcome refers to the regression errors, not to the raw data. Depending on the structure of the predictors, it is possible for data y to be far from normally distributed even when coming from a linear regression model.
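To make the last sentence of the quote concrete, here is a small simulation sketch in Python (again my own illustration, not from the book). It assumes a single binary predictor, an intercept of 1, a slope of 10 and standard normal errors; these numbers and the helper names `simulate` and `excess_kurtosis` are invented for the example. The outcome y then falls into two well separated clusters and is far from normal, although the errors are exactly normal by construction.

```python
import numpy as np


def simulate(n=10_000, seed=0):
    """Simulate y = 1 + 10*x + error with a binary x and normal errors.

    All parameter values are hypothetical and chosen only so that the
    two groups of y are clearly separated.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)          # binary predictor (0 or 1)
    errors = rng.normal(0.0, 1.0, size=n)   # normally distributed errors
    y = 1.0 + 10.0 * x + errors

    # Ordinary least squares fit of y on an intercept and x.
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return y, residuals, coef


def excess_kurtosis(v):
    """Excess kurtosis: about 0 for normal data, clearly negative for
    a distribution with two separated clusters."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z**4) - 3.0)


y, residuals, coef = simulate()
kurt_y = excess_kurtosis(y)
kurt_res = excess_kurtosis(residuals)

print("estimated intercept and slope:", coef)
print("excess kurtosis of raw y:     ", kurt_y)
print("excess kurtosis of residuals: ", kurt_res)
```

A histogram of `y` shows two humps, while a histogram of `residuals` looks like an ordinary bell curve. So checking the raw outcome for normality says nothing about whether the regression errors are normal.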
Best regards,
Bernhard