Projection matrix (statistics)

In statistics, a projection matrix is a symmetric and idempotent matrix.^[1] All eigenvalues of a projection matrix are either 0 or 1, and its rank equals its trace.^[2] The only nonsingular projection matrix is the identity matrix; all other projection matrices are singular. The most important projection matrices in statistics are the prediction matrix ${\boldsymbol {P}}$ and the residual-maker matrix (or residual matrix) ${\boldsymbol {Q}}={\boldsymbol {I}}-{\boldsymbol {P}}$. They are examples of an orthogonal projection in the sense of linear algebra: given a projection matrix ${\boldsymbol {P}}$, every vector $y$ of an inner product space can be decomposed uniquely as $y={\boldsymbol {P}}y+({\boldsymbol {I}}-{\boldsymbol {P}})y$. Another projection matrix that is important in statistics is the centering matrix.
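The centering matrix mentioned above makes a convenient small example of these defining properties. The sketch below assumes the usual definition $\mathbf {C} =\mathbf {I} -{\tfrac {1}{n}}\mathbf {1} \mathbf {1} ^{\top }$, which the text does not spell out; the data vector and all variable names are made up for illustration.

```python
import numpy as np

n = 5
# Centering matrix C = I - (1/n) * 1 1^T (assumed standard definition).
C = np.eye(n) - np.ones((n, n)) / n

# Arbitrary illustrative data vector.
y = np.array([2.0, 4.0, 4.0, 5.0, 10.0])

symmetric = np.allclose(C, C.T)            # C equals its transpose
idempotent = np.allclose(C @ C, C)         # C applied twice equals C
eigenvalues = np.linalg.eigvalsh(C)        # real eigenvalues in ascending order
rank_C = np.linalg.matrix_rank(C)
trace_C = np.trace(C)
centered = C @ y                           # subtracts the sample mean from y
```

In this sketch `centered` should coincide with `y - y.mean()`, and `rank_C` and `trace_C` should both equal `n - 1`, in line with the statement that rank and trace of a projection matrix coincide.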

As a starting point, consider a typical multiple linear regression model with given data $\{y_{i},x_{ik}\}_{i=1,\dots ,n,k=1,\dots ,K}$ for $n$ statistical units and $K$ regressors. The relationship between the dependent variable and the independent variables can be written as

y_{i}=\beta _{0}+x_{i1}\beta _{1}+x_{i2}\beta _{2}+\ldots +x_{iK}\beta _{K}+\varepsilon _{i}=\mathbf {x} _{i}^{\top }{\boldsymbol {\beta }}+\varepsilon _{i},\quad i=1,2,\dotsc ,n.

In matrix notation this reads

{\begin{pmatrix}y_{1}\\y_{2}\\\vdots \\y_{n}\end{pmatrix}}_{(n\times 1)}\quad =\quad {\begin{pmatrix}1&x_{11}&x_{12}&\cdots &x_{1K}\\1&x_{21}&x_{22}&\cdots &x_{2K}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&x_{n1}&x_{n2}&\cdots &x_{nK}\end{pmatrix}}_{(n\times p)}\quad \cdot \quad {\begin{pmatrix}\beta _{0}\\\beta _{1}\\\vdots \\\beta _{K}\end{pmatrix}}_{(p\times 1)}\quad +\quad {\begin{pmatrix}\varepsilon _{1}\\\varepsilon _{2}\\\vdots \\\varepsilon _{n}\end{pmatrix}}_{(n\times 1)}

with $p=K+1$. In compact form,

\mathbf {y} =\mathbf {X} {\boldsymbol {\beta }}+{\boldsymbol {\varepsilon }}.

Here ${\boldsymbol {\beta }}$ is a vector of unknown parameters (the regression coefficients) that must be estimated from the data. Furthermore, the error terms are assumed to have mean zero, $\mathbb {E} [{\boldsymbol {\varepsilon }}]=\mathbf {0}$, which means that the model is assumed to be correct on average.

One of the most important projection matrices in statistics is the prediction matrix. It is defined as

{\boldsymbol {P}}\equiv \mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\quad {\text{with}}\quad {\boldsymbol {P}}\in \mathbb {R} ^{n\times n},

where $\mathbf {X}$ is the data matrix, which must have full column rank for the inverse to exist. The diagonal elements of the prediction matrix ${\boldsymbol {P}}$ are denoted $p_{ii}$ and can be interpreted as leverage values.
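The definition of the prediction matrix can be tried out numerically. The following is a minimal sketch on simulated data; the sample size, the seed and the variable names are arbitrary choices, and `np.linalg.solve` is used in place of an explicit inverse purely as a numerical convenience.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 2

# Data matrix with an intercept column, so p = K + 1 columns.
X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])
p = X.shape[1]

# P = X (X'X)^{-1} X', computed by solving (X'X) Z = X' for Z.
P = X @ np.linalg.solve(X.T @ X, X.T)

# Diagonal elements p_ii, interpreted as leverage values.
leverages = np.diag(P)
```

Under these assumptions `P` is an `n x n` matrix, and the leverages lie between 0 and 1 and sum to `p`.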

The residual-maker matrix^[3], also called the residual-generating matrix or residual matrix, is defined as

{\boldsymbol {Q}}=\left(\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\right)=\left(\mathbf {I} -{\boldsymbol {P}}\right),

where ${\boldsymbol {P}}$ is the prediction matrix. The name reflects the fact that multiplying this projection matrix by the vector $\mathbf {y}$ yields the residual vector ${\hat {\boldsymbol {\varepsilon }}}$. The residual vector can therefore be written compactly in terms of the prediction matrix as

{\hat {\boldsymbol {\varepsilon }}}=\mathbf {y} -\mathbf {\hat {y}} =\mathbf {y} -{\boldsymbol {P}}\mathbf {y} =\left(\mathbf {I} -{\boldsymbol {P}}\right)\mathbf {y} ={\boldsymbol {Q}}\mathbf {y}.
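To illustrate why ${\boldsymbol {Q}}$ is called the residual maker, the sketch below compares ${\boldsymbol {Q}}\mathbf {y}$ with the residuals of an ordinary least-squares fit. The data are simulated, and the coefficient vector `beta` is a hypothetical choice made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10

# Simulated regression data; beta holds hypothetical "true" coefficients.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.3, size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction matrix
Q = np.eye(n) - P                       # residual-maker matrix

fitted_via_P = P @ y                    # y_hat = P y
residuals_via_Q = Q @ y                 # eps_hat = Q y

# Reference: least-squares fit with NumPy's solver.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals_via_fit = y - X @ beta_hat
```

Both routes should give the same residual vector up to floating-point error. For large $n$, forming the $n\times n$ matrices explicitly is wasteful; they are used here only to mirror the formulas.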

For the projection matrices of a linear model, rank and trace coincide. If $\mathbf {X}$ has full column rank $p$, the rank of the residual-maker matrix is therefore

{\begin{aligned}\operatorname {rank} ({\boldsymbol {Q}})&=\operatorname {tr} ({\boldsymbol {Q}})\\&=\operatorname {tr} (\mathbf {I} -\mathbf {P} )\\&=\sum \nolimits _{i=1}^{n}(1-p_{ii})\\&=n-\sum \nolimits _{i=1}^{n}p_{ii}\\&=n-\operatorname {tr} ({\boldsymbol {P}})\\&=n-\operatorname {rank} ({\boldsymbol {P}})\\&=n-p\\&=n-(K+1)\end{aligned}}
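The rank calculation can be checked on a simulated design matrix. This is a sketch under the assumption that the design matrix has full column rank; the tolerance passed to `np.linalg.matrix_rank` is an ad hoc choice that separates singular values near 0 from those near 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 12, 3
p = K + 1

# Random design with intercept; assumed here to have full column rank.
X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])

P = X @ np.linalg.solve(X.T @ X, X.T)
Q = np.eye(n) - P

# Numerical ranks with an explicit tolerance.
rank_P = np.linalg.matrix_rank(P, tol=1e-8)
rank_Q = np.linalg.matrix_rank(Q, tol=1e-8)

trace_P = np.trace(P)
trace_Q = np.trace(Q)
```

With these choices `rank_P` and `trace_P` should equal `p`, while `rank_Q` and `trace_Q` should equal `n - p`.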

Idempotence

The idempotence of the residual-maker matrix can be shown as follows:

{\begin{aligned}{\boldsymbol {Q}}^{2}&={\boldsymbol {Q}}\cdot {\boldsymbol {Q}}\\&=\left(\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\right)\left(\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\right)\\&=\mathbf {I} \mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\mathbf {I} +\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\\&=\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }-\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }+\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\\&=\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\\&=\left(\mathbf {I} -{\boldsymbol {P}}\right)\\&={\boldsymbol {Q}}\qquad \Box \end{aligned}}

Symmetry

The symmetry of the residual-maker matrix follows directly from the symmetry of the prediction matrix and can be shown as follows:

{\begin{aligned}{\boldsymbol {Q}}^{\top }&=\left(\mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\right)^{\top }\\&=\ \mathbf {I} ^{\top }-\left(\left(\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\right)\left(\mathbf {X} ^{\top }\right)\right)^{\top }\\&=\ \mathbf {I} -\left(\mathbf {X} ^{\top }\right)^{\top }\left(\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\right)^{\top }\\&=\ \mathbf {I} -\mathbf {X} \left(\left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\right)^{\top }\mathbf {X} ^{\top }\\&=\ \mathbf {I} -\mathbf {X} \left(\mathbf {X} ^{\top }\mathbf {X} \right)^{-1}\mathbf {X} ^{\top }\\&=\left(\mathbf {I} -{\boldsymbol {P}}\right)\\&={\boldsymbol {Q}}\qquad \Box \end{aligned}}

The prediction matrix has a number of useful algebraic properties.^[4]^[5] In the language of linear algebra, it is the orthogonal projection onto the column space of the data matrix $\mathbf {X}$. Further properties of these projection matrices are summarized below:

The residual vector $\mathbf {u} =(\mathbf {I} -\mathbf {P} )\mathbf {y} =\mathbf {y} -\mathbf {P} \mathbf {y}$ is orthogonal to the columns of $\mathbf {X}$: $\mathbf {u} \perp \mathbf {X}$.
$\mathbf {X}$ is invariant under $\mathbf {P}$: $\mathbf {PX} =\mathbf {X}$, hence $\left(\mathbf {I} -\mathbf {P} \right)\mathbf {X} =\mathbf {0}$.
$\left(\mathbf {I} -\mathbf {P} \right)\mathbf {P} =\mathbf {P} \left(\mathbf {I} -\mathbf {P} \right)=\mathbf {0}$ ("applying the regression to the residuals yields ${\hat {y}}=0$").
$\mathbf {P}$ is unique for a given subspace.
All eigenvalues of a projection matrix are either 0 or 1.
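The properties listed above can be verified numerically. The following sketch uses simulated data with arbitrary dimensions and an arbitrary response vector; it illustrates the identities and is not a proof.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 9, 3

X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)                   # arbitrary response vector

P = X @ np.linalg.solve(X.T @ X, X.T)
Q = np.eye(n) - P

u = Q @ y                                # residual vector
orthogonality = X.T @ u                  # should be (numerically) zero
PX = P @ X                               # should reproduce X
QX = Q @ X                               # should be the zero matrix
QP = Q @ P                               # should be the zero matrix
eigenvalues = np.linalg.eigvalsh(P)      # ascending: n - p zeros, then p ones
```

Up to rounding error, `eigenvalues` consists of `n - p` zeros followed by `p` ones.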

Estimating the variance parameter after least-squares estimation

The residual sum of squares, abbreviated SQR (from the German Summe der Quadrate der Restabweichungen; in English sum of squared residuals, SSR), is given in matrix notation by

SQR:={\hat {\boldsymbol {\varepsilon }}}^{\top }{\hat {\boldsymbol {\varepsilon }}}=\mathbf {y} ^{\top }(\mathbf {I} -\mathbf {P} )^{\top }(\mathbf {I} -\mathbf {P} )\mathbf {y} =\mathbf {y} ^{\top }{\boldsymbol {Q}}{\boldsymbol {Q}}\mathbf {y} =\mathbf {y} ^{\top }{\boldsymbol {Q}}\mathbf {y}.

Equivalently,

SQR={\hat {\boldsymbol {\varepsilon }}}^{\top }{\hat {\boldsymbol {\varepsilon }}}=\|\mathbf {y} -\mathbf {\hat {y}} \|_{2}^{2}=\sum \limits _{i=1}^{n}(y_{i}-{\hat {y}}_{i})^{2}.

Assuming uncorrelated disturbances with common variance $\sigma ^{2}$, an unbiased estimator of that variance is the "mean squared residual":

{\hat {\sigma }}^{2}={\frac {SQR}{n-p}}={\frac {\sum \nolimits _{i=1}^{n}(y_{i}-{\hat {y}}_{i})^{2}}{n-p}}.

Using the residual-maker matrix, this estimator of the error variance can also be written as

{\hat {\sigma }}^{2}={\frac {\mathbf {y} ^{\top }{\boldsymbol {Q}}\mathbf {y} }{n-p}}={\frac {\mathbf {y} ^{\top }{\boldsymbol {Q}}\mathbf {y} }{\operatorname {rank} ({\boldsymbol {Q}})}}.
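The two expressions for ${\hat {\sigma }}^{2}$ can be compared in a short computation. The sketch below simulates data from a hypothetical model (the coefficients `beta` and the noise level `sigma` are made up) and evaluates the estimator once through the quadratic form $\mathbf {y} ^{\top }{\boldsymbol {Q}}\mathbf {y}$ and once through the fitted residuals.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50

# Hypothetical model used only to generate example data.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
p = X.shape[1]
beta = np.array([0.5, 1.5, -2.0])
sigma = 0.7
y = X @ beta + rng.normal(scale=sigma, size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)
Q = np.eye(n) - P

# Residual sum of squares as the quadratic form y' Q y.
sqr_quadratic_form = y @ Q @ y

# Residual sum of squares from an explicit least-squares fit.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sqr_from_residuals = np.sum((y - X @ beta_hat) ** 2)

# Variance estimate with n - p degrees of freedom.
sigma2_hat = sqr_quadratic_form / (n - p)
```

The two sums of squares agree up to rounding. `sigma2_hat` estimates `sigma**2`, but for a single simulated sample it will generally differ from that value.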

[1] Alexander Basilevsky: Applied Matrix Algebra in the Statistical Sciences. Dover, 2005, ISBN 0-486-44538-0, pp. 160–176.
[2] Wilhelm Caspary: Fehlertolerante Auswertung von Messdaten, p. 124.
[3] Peter Hackl: Einführung in die Ökonometrie. 2nd updated edition, Pearson Deutschland GmbH, 2008, ISBN 978-3-86894-156-2, p. 75.
[4] P. Gans: Data Fitting in the Chemical Sciences. Wiley, 1992, ISBN 0-471-93412-7.
[5] N. R. Draper, H. Smith: Applied Regression Analysis. Wiley, 1998, ISBN 0-471-17082-8.
