Derivative to Zero, What does it intuitively mean?

$\begingroup$

I'm currently learning machine learning, and I came across this equation called Least Squares Regression.

X and w are both matrices. Their product, Xw, is y hat (the prediction), which ideally should equal y.

We want to minimize the squared error given by this equation by changing w.

[Image: the squared-error objective, presumably of the form $\lVert Xw - y \rVert_{2}^{2}$]

We can solve for w by differentiating the function with respect to w and setting the derivative equal to zero.

The question is, what does that intuitively mean?

I know that a derivative measures the rate of change. But what does it mean, intuitively, for the rate of change to be 0?

$\endgroup$

2 Answers

$\begingroup$

The least squares objective is a convex function, and its minimizers form a convex set; therefore any extremum found by setting the derivative to zero is a minimum.

To show the set of least squares minimizers is convex, consider the linear system $Ax = b$ where the system matrix $A\in\mathbb{C}^{m\times n}$, the data vector $b\in\mathbb{C}^{m}$, and the solution vector $x\in\mathbb{C}^{n}$. The least squares solution $x_{LS}$ is defined as $$ x_{LS} = \left\{x\in\mathbb{C}^{n} \colon \lVert Ax - b \rVert_{2}^{2} \text{ is minimized} \right\}. $$
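To connect this with the question, here is a sketch of where the zero-derivative condition comes from. It assumes real-valued $A$ and $b$ (so conjugate transposes reduce to plain transposes), and the name $f$ for the objective is introduced here only for illustration. $$ \begin{align} f(x) &= \lVert Ax - b \rVert_{2}^{2} = x^{T}A^{T}Ax - 2\,b^{T}Ax + b^{T}b \\ \nabla f(x) &= 2A^{T}Ax - 2A^{T}b = 2A^{T}(Ax - b) \end{align} $$ Setting $\nabla f(x) = 0$ gives the normal equations $A^{T}Ax = A^{T}b$. The Hessian $2A^{T}A$ is positive semidefinite, so $f$ is convex and every stationary point is a global minimizer.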

Take a vector $\eta$ from the null space, $\eta\in\mathcal{N}(A)$. Because $A\eta = 0$, we have $A(x+\eta) = Ax$, so $x_{LS} + \eta$ is a minimizer whenever $x_{LS}$ is. It remains to show that a convex combination of these minimizers is also in the set of minimizers.

Given $0 \le \lambda \le 1$, $$ \begin{align} \lVert A(\lambda x_{LS}) + A(1 - \lambda)(x_{LS} + \eta) - b\rVert_{2}^{2} &= \lVert \lambda Ax_{LS} + A x_{LS} + \underbrace{A \eta}_{0} - \lambda A x_{LS} - \underbrace{\lambda A \eta}_{0} - b \rVert_{2}^{2} \\ &= \lVert Ax_{LS} - b \rVert_{2}^{2} \end{align} $$ Because the convex combination of minimizers is within the set of minimizers, the set is convex.
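As a quick numerical sanity check of the argument above, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than part of the original problem: the small real-valued, rank-deficient matrix `A`, the data `b`, and the helper names `matvec`, `sq_residual`, `gradient`, and `combo`.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sq_residual(A, x, b):
    """Squared 2-norm of Ax - b."""
    return sum((r - bi) ** 2 for r, bi in zip(matvec(A, x), b))

def gradient(A, x, b):
    """Gradient 2 A^T (Ax - b) of the squared residual (real case)."""
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return [2 * sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(x))]

# Hypothetical example: both columns of A are equal, so A has a null space.
A = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x_ls = [1.0, 1.0]   # one minimizer: x1 + x2 equals the mean of b
eta = [1.0, -1.0]   # null-space vector: A @ eta = 0

def combo(lam):
    """Convex combination lam * x_ls + (1 - lam) * (x_ls + eta)."""
    other = [xi + ei for xi, ei in zip(x_ls, eta)]
    return [lam * xi + (1 - lam) * oi for xi, oi in zip(x_ls, other)]

# The residual is identical for every convex combination of the two minimizers.
residuals = [sq_residual(A, combo(k / 4), b) for k in range(5)]
print(residuals)
print(gradient(A, x_ls, b))              # the derivative vanishes at a minimizer
print(sq_residual(A, [1.5, 1.0], b))     # stepping outside the set raises the error
```

The point of the design is that `eta` changes `x` without changing `Ax`, so the error cannot tell the combinations apart, while a step that does change `Ax` makes the error larger.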

$\endgroup$

$\begingroup$

The derivative of a function $f(x)$ being zero at a point $p$ means that $p$ is a stationary point. That is, the function is not "moving" there (its rate of change is $0$). There are a few things that could happen.

The function has a local maximum, a local minimum, or a saddle point there. To determine which one, you need to find out what happens around the point. For example, $f(x)=x^2$ has a minimum at $x=0$, $f(x)=-x^2$ has a maximum at $x=0$, and $f(x)=x^3$ has neither. You can see this by looking at the sign of the derivative just to the left and just to the right of the point. If there is a sign change, it's an extremum. If there's no sign change, it's a saddle point. I'll leave it to you to figure out which sign change means maximum or minimum.
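If you want to experiment with the sign test, here is a minimal Python sketch. The helper names `sign`, `derivative_signs`, and `has_extremum`, and the step size `h`, are my own assumptions for illustration; the derivatives of the three example functions are written out by hand. It only reports the signs and whether they differ, so the maximum-versus-minimum exercise is still yours.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def derivative_signs(df, p, h=1e-3):
    """Signs of the derivative df slightly left and slightly right of p."""
    return sign(df(p - h)), sign(df(p + h))

def has_extremum(df, p, h=1e-3):
    """True if the derivative changes sign across p."""
    left, right = derivative_signs(df, p, h)
    return left * right < 0

# Hand-written derivatives of x^2, -x^2, and x^3.
examples = {
    "x^2": lambda x: 2 * x,
    "-x^2": lambda x: -2 * x,
    "x^3": lambda x: 3 * x ** 2,
}

for name, df in examples.items():
    print(name, derivative_signs(df, 0.0), has_extremum(df, 0.0))
```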

$\endgroup$
