1. The difference between regression and classification
2. Linear regression
3. The least-squares method
4. Gradient descent

## Linear Regression

$y=h(x;\theta)$

### Univariate Linear Regression

$y=w_{1}x+w_{0}$

$J(w_{0},w_{1})=\frac{1}{N}\sum_{i=1}^{N}(h(x_{i};\theta) - y_{i})^{2}$
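The cost $J(w_{0},w_{1})$ above can be sketched directly in code. The sample data below is made up for illustration (it lies exactly on $y=2x+1$):

```python
# Minimal sketch: mean-squared-error cost J(w0, w1) for the model y = w1*x + w0.
def cost(w0, w1, xs, ys):
    n = len(xs)
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]           # exactly y = 2x + 1
print(cost(1.0, 2.0, xs, ys))  # a perfect fit gives cost 0.0
```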

### Multivariate Linear Regression

$\left\{\begin{matrix} y_{1}=w_{0}+w_{1}\cdot x_{11}+...+w_{n}\cdot x_{1n}\\ y_{2}=w_{0}+w_{1}\cdot x_{21}+...+w_{n}\cdot x_{2n}\\ ...\\ y_{m}=w_{0}+w_{1}\cdot x_{m1}+...+w_{n}\cdot x_{mn} \end{matrix}\right.$

$\left\{\begin{matrix} y_{1}=w_{0}\cdot x_{10}+w_{1}\cdot x_{11}+...+w_{n}\cdot x_{1n}\\ y_{2}=w_{0}\cdot x_{20}+w_{1}\cdot x_{21}+...+w_{n}\cdot x_{2n}\\ ...\\ y_{m}=w_{0}\cdot x_{m0}+w_{1}\cdot x_{m1}+...+w_{n}\cdot x_{mn} \end{matrix}\right.$

$Y=X\cdot W$

$Y_{m\times 1}=\begin{bmatrix} y_{1}\\ y_{2}\\ ...\\ y_{m} \end{bmatrix}$

$X_{m\times (n+1)}=\begin{bmatrix} x_{10} & x_{11} & ... & x_{1n}\\ x_{20} & x_{21} & ... & x_{2n}\\ ... & ... & ... & ...\\ x_{m0} & x_{m1} & ... & x_{mn} \end{bmatrix} =\begin{bmatrix} 1 & x_{11} & ... & x_{1n}\\ 1 & x_{21} & ... & x_{2n}\\ ... & ... & ... & ...\\ 1 & x_{m1} & ... & x_{mn} \end{bmatrix}$

$W_{(n+1)\times 1}= \begin{bmatrix} w_{0} \\ w_{1} \\ ... \\ w_{n} \end{bmatrix}$

$J(W)=\frac{1}{N}\sum_{i=1}^{N}(h(x_{i};W) - y_{i})^{2}$
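The matrix formulation above translates to a few lines of numpy: prepend a column of ones to the raw features to obtain the $m\times(n+1)$ design matrix $X$, then evaluate $J(W)=\frac{1}{N}(XW-Y)^{T}(XW-Y)$. The data here is illustrative only:

```python
import numpy as np

# Assemble the design matrix X (leading column of ones for the bias w0)
# and evaluate J(W) = (1/N) * ||XW - Y||^2 in matrix form.
x_raw = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5]])                         # m=3 samples, n=2 features
X = np.hstack([np.ones((x_raw.shape[0], 1)), x_raw])   # shape m x (n+1)
Y = np.array([[4.0], [3.0], [6.0]])

def J(W):
    r = X @ W - Y
    return float(r.T @ r) / X.shape[0]
```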

## The Least-Squares Method

$loss=N\cdot J=\sum_{i=1}^{N}(h(x_{i};\theta) - y_{i})^{2}$

### Scalar Computation

When $$J$$ attains its minimum, the partial derivatives with respect to $$w_{0}$$ and $$w_{1}$$ must be 0, so $$w_{0}$$ and $$w_{1}$$ are obtained as follows:

$\frac{\partial J}{\partial w_{0}} =\frac{\partial }{\partial w_{0}}\frac{1}{N} \sum_{i=1}^{N}(h(x_{i})-y_{i})^{2} =\frac{\partial }{\partial w_{0}}\frac{1}{N} \sum_{i=1}^{N}(w_{0}+w_{1}\cdot x_{i}-y_{i})^{2}$ $=\frac{2}{N} \sum_{i=1}^{N}(w_{0}+w_{1}\cdot x_{i}-y_{i}) =2\cdot w_{0}+\frac{2}{N} \sum_{i=1}^{N}(w_{1}\cdot x_{i}-y_{i})$

$\frac{\partial J}{\partial w_{0}}=0 \Rightarrow w_{0}=-\frac{1}{N}\sum_{i=1}^{N}(w_{1}\cdot x_{i}-y_{i}) =\frac{1}{N}(\sum_{i=1}^{N}y_{i}-\sum_{i=1}^{N}w_{1}\cdot x_{i}) =\bar{y}-w_{1}\cdot \bar{x}$

$\frac{\partial J}{\partial w_{1}} =\frac{\partial }{\partial w_{1}}\frac{1}{N} \sum_{i=1}^{N}(h(x_{i})-y_{i})^{2} =\frac{\partial }{\partial w_{1}}\frac{1}{N} \sum_{i=1}^{N}(w_{0}+w_{1}\cdot x_{i}-y_{i})^{2}$ $=\frac{2}{N} \sum_{i=1}^{N}(w_{0}+w_{1}\cdot x_{i}-y_{i})\cdot x_{i} =\frac{2\cdot w_{0}}{N}\sum_{i=1}^{N}x_{i}+\frac{2\cdot w_{1}}{N}\sum_{i=1}^{N}x_{i}\cdot x_{i}-\frac{2}{N}\sum_{i=1}^{N}x_{i}\cdot y_{i}$ $=2\cdot w_{0}\cdot \bar{x}+2\cdot w_{1}\cdot \bar{x^{2}}-2\cdot \bar{x\cdot y}$

$\frac{\partial J}{\partial w_{1}}=0, w_{0}=\bar{y}-w_{1}\cdot \bar{x} \Rightarrow \bar{x}\cdot \bar{y}-w_{1}\cdot \bar{x}^{2}+w_{1}\cdot \bar{x^{2}}-\bar{x\cdot y}=0 \Rightarrow w_{1}=\frac{\bar{x\cdot y} - \bar{x}\cdot \bar{y}}{\bar{x^{2}}-\bar{x}^{2}}$

$w_{0}=\bar{y}-w_{1}\cdot \bar{x}$

$w_{1}=\frac{\bar{x\cdot y} - \bar{x}\cdot \bar{y}}{\bar{x^{2}}-\bar{x}^{2}}$

• $$\bar{y}$$ is the mean of the true outputs
• $$\bar{x}$$ is the mean of the inputs
• $$\bar{x\cdot y}$$ is the mean of the products of the inputs and the true outputs
• The other barred quantities are defined analogously
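The closed-form formulas above are easy to implement with plain means. Fitting data drawn exactly from $y=2x+1$ recovers $w_{0}=1$, $w_{1}=2$; the data is illustrative:

```python
# Sketch of the closed-form univariate solution:
# w1 = (mean(x*y) - mean(x)*mean(y)) / (mean(x^2) - mean(x)^2)
# w0 = mean(y) - w1 * mean(x)
def fit_univariate(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    mxy = sum(x * y for x, y in zip(xs, ys)) / n
    mxx = sum(x * x for x in xs) / n
    w1 = (mxy - mx * my) / (mxx - mx * mx)
    w0 = my - w1 * mx
    return w0, w1

print(fit_univariate([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]))  # ≈ (1.0, 2.0)
```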

### Matrix Computation

$(X\pm Y)^T = X^T\pm Y^T$

$(X\cdot Y)^{T}=Y^{T}\cdot X^{T}$

$(A^T)^T=A$

$\left | A^{T} \right |=\left | A \right |$

$\frac{\partial (\theta ^{T}\cdot X)}{\partial \theta}=X$

$\frac{\partial (X^{T}\cdot \theta )}{\partial \theta}=X$

$\frac{\partial (\theta ^{T}\cdot \theta )}{\partial \theta}=2\cdot \theta$

$\frac{\partial (\theta ^{T}\cdot C\cdot \theta )}{\partial \theta}=2\cdot C\cdot \theta$ (for symmetric $$C$$, which holds below since $$X^{T}\cdot X$$ is symmetric)
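The quadratic-form identity can be checked numerically with central finite differences; for a quadratic the central difference is exact up to floating-point error. The matrix and vector here are arbitrary random values:

```python
import numpy as np

# Numerically verify d(theta^T C theta)/d theta = 2*C*theta for symmetric C.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
C = (A + A.T) / 2                  # symmetrize
theta = rng.standard_normal(3)

f = lambda t: t @ C @ t
eps = 1e-6
numeric = np.array([
    (f(theta + eps * e) - f(theta - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
analytic = 2 * C @ theta
print(np.max(np.abs(numeric - analytic)))  # tiny
```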

$J(W) =\frac{1}{N}\cdot \sum_{i=1}^{N}(h(x_{i};W)-y_{i})^2 =\frac{1}{N}(X\cdot W-Y)^{T}\cdot (X\cdot W -Y)$ $=\frac{1}{N}((X\cdot W)^T-Y^T)\cdot (X\cdot W-Y) =\frac{1}{N}(W^T\cdot X^T-Y^T)\cdot (X\cdot W-Y)$ $=\frac{1}{N}(W^T\cdot X^{T}\cdot X\cdot W-W^T\cdot X^{T}\cdot Y-Y^{T}\cdot X\cdot W+Y^{T}\cdot Y)$

Since $$W^{T}\cdot X^{T}\cdot Y$$ is a scalar, $$Y^{T}\cdot X\cdot W=(W^{T}\cdot X^{T}\cdot Y)^{T}=W^{T}\cdot X^{T}\cdot Y$$, so the two middle terms combine:

$J(W)=\frac{1}{N}(W^T\cdot X^{T}\cdot X\cdot W-2\cdot W^T\cdot X^{T}\cdot Y+Y^{T}\cdot Y)$

$\frac{\partial J(W)}{\partial W}=\frac{2}{N}\cdot X^{T}\cdot X\cdot W-\frac{2}{N}\cdot X^{T}\cdot Y=0$

$\Rightarrow X^{T}\cdot X\cdot W=X^{T}\cdot Y$

$\Rightarrow W=(X^{T}\cdot X)^{-1}\cdot X^{T}\cdot Y$

$$X^{T}\cdot X$$ must be non-singular, i.e. $$\left | X^{T}\cdot X \right |\neq 0$$, for the inverse to exist.

1. For an $$n\times n$$ matrix $$A$$, $$R(A)=n$$ if and only if $$\left | A \right | \neq 0$$; such an $$A$$ is called full-rank
2. $$R(A^T)=R(A)$$
3. $$R(AB)\leq \min \left \{ R(A), R(B)\right \}$$
4. For an $$m\times n$$ matrix $$A$$, $$0\leq R(A)\leq \min \left \{ m,n \right \}$$
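A minimal sketch of the normal-equation solution $X^{T}XW=X^{T}Y$: solving the linear system with `np.linalg.solve` avoids forming the explicit inverse. The design matrix and true weights below are made up; with noiseless targets the true weights are recovered exactly:

```python
import numpy as np

# Solve the normal equation X^T X W = X^T Y for the least-squares weights.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 0.5],
              [1.0, 3.0, 1.5],
              [1.0, 4.0, 2.5]])    # first column of ones for w0
true_W = np.array([[1.0], [2.0], [0.5]])
Y = X @ true_W                      # noiseless targets

W = np.linalg.solve(X.T @ X, X.T @ Y)
print(W.ravel())                    # recovers true_W
```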

## Summary

• For univariate linear regression, use the scalar least-squares computation
• For multivariate linear regression, when the number of features is small and $$\left | X^{T}\cdot X \right |\neq 0$$ (equivalently $$R(X) = n+1$$), use the matrix least-squares computation; otherwise, update the weights with gradient descent
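The gradient-descent alternative can be sketched with the gradient derived above, $\frac{2}{N}X^{T}(XW-Y)$. The learning rate and iteration count are illustrative choices, not prescriptions, and the data again lies on $y=2x+1$:

```python
import numpy as np

# Batch gradient descent on J(W) = (1/N)||XW - Y||^2,
# using the gradient (2/N) * X^T (XW - Y).
def gradient_descent(X, Y, lr=0.05, steps=5000):
    W = np.zeros((X.shape[1], 1))
    N = X.shape[0]
    for _ in range(steps):
        grad = (2.0 / N) * X.T @ (X @ W - Y)
        W -= lr * grad
    return W

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
Y = np.array([[3.0], [5.0], [7.0], [9.0]])   # y = 2x + 1
print(gradient_descent(X, Y).ravel())        # approaches [1, 2]
```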