1 files changed, 92 insertions, 0 deletions
diff --git a/regression.tex b/regression.tex
new file mode 100644
index 0000000..ee73520
--- /dev/null
+++ b/regression.tex
@@ -0,0 +1,92 @@
+\font\bbten=msbm10
+\font\bbsev=msbm7
+\font\bbfiv=msbm5
+\newfam\bbold
+\textfont\bbold=\bbten
+\scriptfont\bbold=\bbsev
+\scriptscriptfont\bbold=\bbfiv
+\def\bb{\fam\bbold}
+\def\bmatrix#1{\left[\matrix{#1}\right]}
+\def\fr#1#2{{#1\over#2}}
+\def\E{{\bb E}}
+\def\var{\mathop{\rm var}}
+\def\cov{\mathop{\rm cov}}
+
+We want to find a quadratic regression model (with errors) for a dataset
+$$\{(X_1,Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$$
+where each $X_k$ is a constant and the $Y_k$ are independent, normally
+distributed random variables with
+$$Y_k \sim {\cal N}(aX_k^2 + bX_k + c, \sigma^2),$$
+where $a,$ $b,$ $c,$ and $\sigma$ are unknown model parameters.
+
+We use a least-squares estimator for the quadratic model parameters,
+giving $\hat a,$ $\hat b,$ and $\hat c$ from the least-squares solution
+to the following linear system:
+$$
+\bmatrix{X_1^2&X_1&1\cr X_2^2&X_2&1\cr&\vdots\cr X_n^2&X_n&1}
+\bmatrix{\hat a\cr\hat b\cr\hat c} =
+\bmatrix{Y_1\cr Y_2\cr\vdots\cr Y_n}
+$$
+
+With $v_k$ being the $k$th column of the leftmost matrix $X$ and $y$
+being the vector of $Y_k$ on the right, we get the following
+estimators (note that these estimated parameters are themselves random
+variables):
+
+% The least-squares solution satisfies the normal equations X^T X beta = X^T y.
+% Projecting onto v_1, v_2, v_3 one at a time only works for orthogonal columns.
+$$\bmatrix{\hat a\cr\hat b\cr\hat c} = (X^TX)^{-1}X^Ty
+= (X^TX)^{-1}\bmatrix{v_1\cdot y\cr v_2\cdot y\cr v_3\cdot y},$$
+assuming $X^TX$ is invertible (at least three distinct $X_k$). Each
+estimate is therefore linear in $y$:
+$$\hat a = \vec a\cdot y, \qquad \hat b = \vec b\cdot y, \qquad
+\hat c = \vec c\cdot y,$$
+where $\vec a,$ $\vec b,$ and $\vec c$ are the rows of $(X^TX)^{-1}X^T.$
+
+Note that $\vec a,$ $\vec b,$ and $\vec c$ are constant vectors determined only
+by the values of $(X_1, X_2, \ldots, X_n).$
+
+We now want to find an estimator for $\sigma^2,$ so we'll start by
+treating the regression parameters as known and then adapting the result
+to the estimates we have. The derivation is omitted, but if the regression
+parameters were known, the following would be a fairly straightforward
+maximum-likelihood estimator:
+$$\hat\sigma^2 = \sum_{i=1}^n {(Y_i - aX_i^2 - bX_i - c)^2\over n}$$
+From this we obtain the statistic
+$$S^2 = \sum_{i=1}^n {(Y_i - \hat a X_i^2 - \hat b X_i - \hat c)^2\over n}
+= \sum_{i=1}^n {(y\cdot(\vec e_i - \vec a X_i^2 - \vec b X_i - \vec c))^2\over n}$$
+
+Let $\vec R_i := \vec e_i - \vec a X_i^2 - \vec b X_i - \vec c,$ where $\vec e_i$
+is the $i$th standard basis vector, giving a matrix $R$ whose $i$th row is $\vec R_i.$
+The statistic is then $S^2 = (Ry)\cdot(Ry)/n.$
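+To check whether this estimator is biased, we take the expectation of the sum $(Ry)\cdot(Ry),$ writing $r_{ki}$ for the $i$th entry of $\vec R_k$:
+$$\E((Ry)\cdot(Ry)) = \sum_{k=1}^n \E((\vec R_k\cdot y)^2) = \sum_{k=1}^n \left[\var(\vec R_k\cdot y) + \E(\vec R_k\cdot y)^2\right]
+$$$$
+= \sum_{1\leq i,j,k\leq n} r_{ki}r_{kj}\cov(Y_i,Y_j) + (R\E(y))\cdot(R\E(y))
+= \sigma^2\sum_{1\leq i,k\leq n} r_{ki}^2 + (R\E(y))\cdot(R\E(y)),$$
+because $\cov(Y_i,Y_j)$ is $\sigma^2$ if $i=j,$ and $0$ otherwise (by independence).
+With the least-squares estimates and $X^TX$ invertible, $R = I - X(X^TX)^{-1}X^T,$ so $R\E(y) = RX\bmatrix{a\cr b\cr c} = 0.$
+This $R$ is a symmetric projection, so $\sum_{i,k} r_{ki}^2$ equals its trace, $n-3,$ giving
+$\E(S^2) = (n-3)\sigma^2/n$: $S^2$ is biased, while $nS^2/(n-3)$ is unbiased for $n>3.$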
+
+\iffalse
+We might expand this statistic to
+$$S^2 = \fr1n\sum_{1\leq i,j,k \leq n} Y_j\cdot R_{i_j} Y_k\cdot R_{i_k}$$
+To transform this into an unbiased estimator, if it even is a decent
+estimator, we try to find its expectation
+$$\E[S^2] = \fr1n\sum_{1\leq i,j\leq n} \E[(Y_j R_{i_j})^2]
++
+\fr1n\sum_{1\leq i,j\neq k \leq n} \E[Y_j R_{i_j}]\E[Y_k R_{i_k}]
+$$$$
+= \fr1n\sum_{1\leq i,j\leq n} [R_{i_j}^2\sigma^2 + R_{i_j}^2(aX_j^2+bX_j+c)^2]
++
+\fr1n\sum_{1\leq i,j\neq k \leq n} \E[Y_jR_{i_j}]\E[Y_kR_{i_k}]
+$$$$
+= \sigma^2\sum_{1\leq i,j\leq n} R_{i_j}^2
++
+\sum_{1\leq i,j,k\leq n}
+R_{i_j}(aX_j^2+bX_j+c)R_{i_k}(aX_k^2+bX_k+c)
+$$
+\fi
+
+\bye