Basis Expansions & Regularization

We cannot be sure that the true regression function is linear in X.

To deal with non-linear problems, we can use transformations of X in place of the original X.


$$ f(X)=\sum^M_{m=1}\beta_mh_m(X) $$

The model $f(X)$ is linear in the basis functions $h_m(X)$, even though each $h_m(X)$ may be a non-linear transformation of $X$.

 

| Form | Model |
| --- | --- |
| $h_m(X)=X_m$ | Basic linear model |
| $h_m(X)=X_j^2$ or $h_m(X)=X_jX_k$ | Polynomial model |
| $h_m(X)=\log(X_j),\ \sqrt{X_j}$ | Log / square-root transformation |
| $h_m(X)=I(L_m\leq X_k \leq U_m)$ | Range (indicator) model, for analyzing the data locally |

 

    When we add a third-order term to the model, we usually also include the second-order, first-order, and constant terms, so a polynomial basis quickly becomes high-dimensional. To control the number of basis functions, there are three common approaches:
 
| Method | Example |
| --- | --- |
| Restriction | Limit the class of functions in advance, e.g. an additive model |
| Selection | Keep only the basis functions that contribute significantly to the fit |
| Regularization | Use the full basis but constrain the coefficients, e.g. ridge regression |
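As a rough illustration of the regularization approach, here is a minimal sketch (assuming numpy, a one-dimensional input, and arbitrary choices of degree and penalty) that expands the input into a polynomial basis and shrinks the coefficients with a ridge penalty:

import numpy as np

def ridge_on_basis(x, y, degree=5, lam=1.0):
    # Basis expansion: h_m(X) = X^m for m = 0, ..., degree
    H = np.column_stack([x**m for m in range(degree + 1)])
    # Ridge estimate (H^T H + lambda*I)^{-1} H^T y constrains the coefficients
    p = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(p), H.T @ y)

# Toy usage: noisy sine data fitted with a shrunken degree-5 polynomial
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + np.random.normal(scale=0.1, size=50)
beta = ridge_on_basis(x, y)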
 

Natural Cubic Spline

$$
N_1(X)=1, \quad N_2(X)=X, \quad N_{k+2}(X)=d_k(X)-d_{K-1}(X) \\
d_k(X)=\frac{(X-\xi_k)^3_+-(X-\xi_K)^3_+}{\xi_K-\xi_k}
$$

$$
\hat{\theta}=(N^TN+\lambda\Omega_N)^{-1}N^Ty \\
\hat{f}(x)=\sum^N_{j=1}N_j(x)\hat{\theta}_j
$$

import numpy as np

class spline:
    def __init__(self, x, y):
        # Two interior knots placed at 1/3 and 2/3 of the range of x
        b1 = min(x) + (max(x) - min(x)) / 3
        b2 = min(x) + 2 * (max(x) - min(x)) / 3
        self.knots = (b1, b2)
        self.x = self.basis(x)
        self.y = y

    def basis(self, x):
        # Truncated power basis of the piecewise cubic polynomial:
        # f(X) = b1 + b2*X + b3*X^2 + b4*X^3 + b5*(X-xi1)^3_+ + b6*(X-xi2)^3_+
        b1, b2 = self.knots
        return np.column_stack([
            np.ones_like(x), x, np.power(x, 2), np.power(x, 3),
            np.power(np.maximum(0, x - b1), 3),
            np.power(np.maximum(0, x - b2), 3),
        ])

    def training(self):
        x, y = self.x, self.y
        xt = np.transpose(x)
        # Least-squares estimate of the six spline coefficients
        self.beta = np.linalg.inv(xt @ x) @ xt @ y
        return self.beta

    def prediction(self, x_test):
        # Evaluate the fitted spline at new input values
        return self.basis(x_test) @ self.beta
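A quick usage sketch of the class above, continuing from the same code (the toy data here is made up for the example):

x = np.linspace(0, 10, 200)
y = np.sin(x) + np.random.normal(scale=0.2, size=x.shape)

model = spline(x, y)
model.training()             # fit the six spline coefficients
y_hat = model.prediction(x)  # evaluate the fitted spline on the inputs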

 

Piecewise Polynomials and Splines

✏️ Piecewise-constant regression using range (indicator) functions.

 

$$ f(X)=\beta_1I(X<\xi_1)+\beta_2I(\xi_1\leq X<\xi_2)+\beta_3I(\xi_2 \leq X) $$

    In this case, each estimated $\beta_m$ is simply the mean of the target values in the corresponding region.
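A quick numerical check of this claim, as a minimal sketch (assuming numpy; the data and the two knots are arbitrary):

import numpy as np

# Piecewise-constant fit: the least-squares beta in each region is the region mean of y
x = np.random.uniform(0, 3, size=100)
y = x + np.random.normal(scale=0.3, size=100)
xi1, xi2 = 1.0, 2.0

beta1 = y[x < xi1].mean()                 # mean of y where X < xi_1
beta2 = y[(xi1 <= x) & (x < xi2)].mean()  # mean of y where xi_1 <= X < xi_2
beta3 = y[x >= xi2].mean()                # mean of y where xi_2 <= X

If we also give each region its own slope and require $f$ to be continuous at the two knots, the model becomes the piecewise-linear form below.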

 

 

$$ \begin{split} f(X) = & \beta_1I(X<\xi_1)+\beta_2I(\xi_1\leq X<\xi_2)+\beta_3I(\xi_2 \leq X)+ \\ & \beta_4I(X<\xi_1)X+\beta_5I(\xi_1\leq X<\xi_2)X+\beta_6I(\xi_2\leq X)X \\ & (f(\xi_1^-)=f(\xi_1^+), f(\xi_2^-)=f(\xi_2^+)) \end{split}  $$

    $(X-\xi_1)_+$ is shorthand for $\max(0, X-\xi_1)$.
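Equivalently, after imposing the two continuity constraints the model has four free parameters, and it can be written directly with truncated linear terms. A minimal sketch of that basis (assuming numpy):

import numpy as np

def piecewise_linear_basis(x, xi1, xi2):
    # Columns 1, X, (X - xi1)_+, (X - xi2)_+ give a continuous piecewise-linear fit
    return np.column_stack([
        np.ones_like(x),
        x,
        np.maximum(0, x - xi1),
        np.maximum(0, x - xi2),
    ])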

✏️ Piecewise Cubic Polynomials

$$
f(X)=\beta_1+\beta_2X+\beta_3X^2+\beta_4X^3+\beta_5(X-\xi_1)^3_++\beta_6(X-\xi_2)^3_+
$$

    This equation satisfies three constraints at each knot: the function, its first derivative, and its second derivative are all continuous there. The truncated cubic term $(X-\xi_k)^3_+$ satisfies all of these constraints because it and its first two derivatives vanish at the knot $\xi_k$.

 

Parameter number

(# of regions) $\times$ (# of parameters per region) $-$ (# of knots) $\times$ (# of constraints per knot) $= 3\times4-2\times3=6$

In terms of Lagrange multipliers, the following two statements are equivalent:

  • Maximize $f(x,y)$ subject to $g(x,y)=k$
  • Find the stationary points of $h(x,y,d)=f(x,y)+d\,(g(x,y)-k)$, with no constraint

    This implies that each constraint contributes one term to the Lagrangian and removes one free parameter. That is why we subtract the number of constraints when deriving the parameter count above.

✏️ Weaknesses of local polynomial regression

  1. It behaves erratically near the boundaries.
  2. It is unreliable for extrapolation beyond the range of the data.

 

    Here the boundaries are the minimum and maximum of the input variable. Near these boundaries, the variance of the predicted value becomes large:

$$
\mathrm{Pointwise\; variance}=\mathrm{Var}[\hat{f}(x_0)]
$$
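For a least-squares fit on a basis matrix $H$, this pointwise variance is $\sigma^2\, h(x_0)^T(H^TH)^{-1}h(x_0)$, which typically grows near the boundaries. A minimal numerical sketch (assuming numpy; the cubic basis and noise variance are arbitrary choices for the example):

import numpy as np

def pointwise_variance(H, h0, sigma2=1.0):
    # Var[f_hat(x0)] = sigma^2 * h(x0)^T (H^T H)^{-1} h(x0) for a least-squares fit
    return sigma2 * h0 @ np.linalg.inv(H.T @ H) @ h0

# Variance curve for a global cubic fit; it grows near min(x) and max(x)
x = np.linspace(0, 1, 50)
H = np.column_stack([np.ones_like(x), x, x**2, x**3])
grid = np.linspace(0, 1, 11)
var_curve = [pointwise_variance(H, np.array([1.0, x0, x0**2, x0**3])) for x0 in grid]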

 

Natural Cubic Spline

    To overcome these weaknesses, the natural cubic spline adds the constraint that the fitted function is linear beyond the boundary knots. With this constraint, the model can be written in the following basis:

$$
f(X)=\beta_1+\beta_2X+\beta_3(d_1(X)-d_{K-1}(X))+\cdots+\beta_K(d_{K-2}(X)-d_{K-1}(X))
$$

 

$$
d_k(X)=\dfrac{(X-\xi_k)^3_+-(X-\xi_K)^3_+}{\xi_K-\xi_k}
$$
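A minimal code sketch of this basis (assuming numpy and an arbitrary increasing sequence of knots $\xi_1<\cdots<\xi_K$):

import numpy as np

def natural_cubic_basis(x, knots):
    # N_1 = 1, N_2 = X, N_{k+2} = d_k(X) - d_{K-1}(X) for k = 1, ..., K-2
    K = len(knots)
    def d(k):  # d_k(X) with 1-based k, as defined above
        num = np.maximum(0, x - knots[k - 1])**3 - np.maximum(0, x - knots[K - 1])**3
        return num / (knots[K - 1] - knots[k - 1])
    cols = [np.ones_like(x), x]
    cols += [d(k) - d(K - 1) for k in range(1, K - 1)]
    return np.column_stack(cols)

# Toy usage: K = 3 knots give 3 basis functions
x = np.linspace(0, 10, 100)
N = natural_cubic_basis(x, knots=[2.0, 5.0, 8.0])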

 

Proof: https://statkwon.github.io/ml/natural-spline/

 

Reference

Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning. New York, NY: Springer.
