-rw-r--r-- | regression.tex | 92 |
1 files changed, 92 insertions, 0 deletions
diff --git a/regression.tex b/regression.tex
new file mode 100644
index 0000000..ee73520
--- /dev/null
+++ b/regression.tex
@@ -0,0 +1,92 @@
+\font\bbten=msbm10
+\font\bbsev=msbm7
+\font\bbfiv=msbm5
+\newfam\bbold
+\textfont\bbold=\bbten
+\scriptfont\bbold=\bbsev
+\scriptscriptfont\bbold=\bbfiv
+\def\bb{\fam\bbold}
+\def\bmatrix#1{\left[\matrix{#1}\right]}
+\def\fr#1#2{{#1\over#2}}
+\def\E{{\bb E}}
+\def\var{\mathop{\rm var}}
+\def\cov{\mathop{\rm cov}}
+
+We want to find a quadratic regression model (with errors) for a dataset
+$$\{(X_1,Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\}$$
+where each $X_k$ is a known constant and the $Y_k$ are independent
+random variables with
+$$Y_k \sim {\cal N}(aX_k^2 + bX_k + c; \sigma^2),$$
+where $a,$ $b,$ $c,$ and $\sigma$ are unknown model parameters.
+
+We use a least-squares estimator for the quadratic model parameters,
+giving $\hat a,$ $\hat b,$ and $\hat c$ from the least-squares solution
+to the following linear system:
+$$
+\bmatrix{X_1^2&X_1&1\cr X_2^2&X_2&1\cr&\vdots\cr X_n^2&X_n&1}
+\bmatrix{\hat a\cr\hat b\cr\hat c} =
+\bmatrix{Y_1\cr Y_2\cr\vdots\cr Y_n}
+$$
+
+With $v_k$ being the $k$th column of the leftmost matrix $X$ and $y$
+being the vector of the $Y_k$ on the right, we get the following
+estimators (note that these estimated parameters are themselves random
+variables):
+
+% ugh, projecting y onto v_1, v_2, v_3 one at a time is wrong: the columns aren't orthogonal.
+$$\bmatrix{\hat a\cr\hat b\cr\hat c} = (X^TX)^{-1}X^Ty,$$
+where $X^TX$ is the $3\times3$ matrix with entries $v_i\cdot v_j$ and
+$X^Ty$ is the vector with entries $v_k\cdot y.$ This requires at least
+three distinct $X_k,$ so that $X^TX$ is invertible. Writing $\vec a,$
+$\vec b,$ and $\vec c$ for the three rows of $(X^TX)^{-1}X^T,$ this reads
+$$\hat a = \vec a\cdot y \qquad
+\hat b = \vec b\cdot y \qquad
+\hat c = \vec c\cdot y.$$
+
+Note that $\vec a,$ $\vec b,$ and $\vec c$ are constants determined only
+by the values of $(X_1, X_2, \ldots, X_n).$
+
+We now want to find an estimator for $\sigma^2,$ so we'll start by
+treating the regression parameters as known and then adapt the result to
+the estimates we have. The derivation is omitted, but if the regression
+parameters were known, the following would be the maximum-likelihood
+estimator:
+$$\hat\sigma^2 = \sum_{i=1}^n {(Y_i - aX_i^2 - bX_i - c)^2\over n}$$
+From this we obtain the statistic
+$$S^2 = \sum_{i=1}^n {(Y_i - \hat a X_i^2 - \hat b X_i - \hat c)^2\over n}
+= \sum_{i=1}^n {(y\cdot(\vec e_i - \vec a X_i^2 - \vec b X_i - \vec c))^2\over n}$$
+
+Let $\vec R_i := \vec e_i - \vec a X_i^2 - \vec b X_i - \vec c,$ where $\vec e_i$ is the $i$th standard basis vector,
+giving a matrix $R$ whose $i$th row is $\vec R_i$ and whose entries are $r_{ki}.$
+The sum is then $(Ry)\cdot(Ry),$ so $S^2 = (Ry)\cdot(Ry)/n.$
+To find the expectation of this estimator and check whether it is biased, we
+take $$\E((Ry)\cdot(Ry)) = \sum_{k=1}^n \E((\vec R_k\cdot y)^2)
+= \sum_{k=1}^n \var(\vec R_k\cdot y) + \sum_{k=1}^n (\E(\vec R_k\cdot y))^2
+$$$$
+= \sum_{1\leq i,j,k\leq n} r_{ki}r_{kj}\cov(Y_i,Y_j) + (R\E(y))\cdot(R\E(y))
+$$$$
+= \sigma^2\sum_{1\leq i,k\leq n} r_{ki}^2 + (R\E(y))\cdot(R\E(y)),$$
+because $\cov(Y_i,Y_j)$ is $\sigma^2$ if $i=j,$ and $0$ otherwise (by independence).
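+
+As a rough sketch of where the $\sigma^2$ term leads (this assumes, beyond
+what is written above, that $\hat a,$ $\hat b,$ and $\hat c$ are the exact
+least-squares solution, so that $R$ is the orthogonal projection
+$I - X(X^TX)^{-1}X^T$ with $X^TX$ invertible; ${\rm tr}$ denotes the trace):
+the sum of the squared entries of $R$ is
+$${\rm tr}(RR^T) = {\rm tr}(R) = n - {\rm tr}\bigl((X^TX)^{-1}X^TX\bigr) = n - 3,$$
+using $R^T = R$ and $R^2 = R.$ If the $R\E(y)$ term contributes nothing, this
+would give $\E(S^2) = (n-3)\sigma^2/n,$ suggesting $nS^2/(n-3)$ as an
+unbiased estimator of $\sigma^2$ for $n > 3.$
+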
+For the least-squares solution $R = I - X(X^TX)^{-1}X^T,$ so $R\E(y) = RX\bmatrix{a\cr b\cr c} = 0$ and the $R\E(y)$ term vanishes.
+
+\iffalse
+We might expand this statistic to
+$$S^2 = \fr1n\sum_{1\leq i,j,k \leq n} Y_j\cdot R_{i_j} Y_k\cdot R_{i_k}$$
+To transform this into an unbiased estimator, if it even is a decent
+estimator, we try to find its expectation
+$$\E[S^2] = \fr1n\sum_{1\leq i,j\leq n} \E[(Y_j R_{i_j})^2]
++
+\fr1n\sum_{1\leq i,j\neq k \leq n} \E[Y_j R_{i_j}]\E[Y_k R_{i_k}]
+$$$$
+= \fr1n\sum_{1\leq i,j\leq n} [R_{i_j}^2\sigma^2 + R_{i_j}^2(aX_j^2+bX_j+c)^2]
++
+\fr1n\sum_{1\leq i,j\neq k \leq n} \E[Y_jR_{i_j}]\E[Y_kR_{i_k}]
+$$$$
+= \fr{\sigma^2}n\sum_{1\leq i,j\leq n} R_{i_j}^2
++
+\fr1n\sum_{1\leq i,j,k\leq n}
+R_{i_j}(aX_j^2+bX_j+c)R_{i_k}(aX_k^2+bX_k+c)
+$$
+\fi
+
+\bye