
Derivation of Soft Thresholding Operator / Proximal Operator of $ {L}_{1} $ Norm

Writer Olivia Zamora

I was going through the derivation of soft thresholding at .

It says the three unique solutions for

$\operatorname{arg min}_x \|x-b\|_2^2 + \lambda\|x\|_1$ are given by

$\operatorname{arg min}_x \|x-b\|_2^2 + \lambda x$, assuming $x > 0$

$\operatorname{arg min}_x \|x-b\|_2^2 - \lambda x$, assuming $x < 0$

$\operatorname{arg min}_x \|x-b\|_2^2$ with solution $x = 0$, assuming $x = 0$

I am confused by the third case. For $x = 0$, how can the solution be zero? When $x = 0$, the $\lambda\|x\|_1$ term vanishes and I am left with $\operatorname{arg min}_x \|x-b\|_2^2$; solving that gives $x = b$, so how can the solution be $x = 0$?


3 Answers


I wrote a more detailed derivation of the soft-thresholding operator, following the source you mention and others. I hope it helps.

Soft thresholding is just the proximal mapping of the $l_1$-norm. Let $f(x) = \lambda\|x\|_1$; then the proximal mapping of $f$ is defined as

\begin{equation} \operatorname{prox}_f(x) = \operatorname{argmin}_z\{\frac{1}{2}\|x-z\|^2_2 + \lambda\|z\|_1 \} \end{equation}

The optimality condition for the previous problem is

\begin{equation} 0 \in \nabla(\frac{1}{2}\|x-z\|^2_2) + \partial(\lambda\|z\|_1) \Leftrightarrow 0 \in z-x + \lambda\partial\|z\|_1 \end{equation}

The $l_1$-norm is separable, so we can consider each of its components separately. Let's first examine the case where $z_i \neq 0$. Then $\partial |z_i| = \operatorname{sign}(z_i)$ and the optimum $z_i^*$ is obtained from

\begin{equation} 0 = z_i-x_i + \lambda \operatorname{sign}(z_i) \Leftrightarrow z_i^* = x_i - \lambda \operatorname{sign}(z_i^*) \end{equation}

Note also that if $z_i^* < 0$, then $x_i < -\lambda$ and equivalently if $z_i^* > 0 \Rightarrow x_i > \lambda$. Thus, $|x_i| > \lambda$ and $\operatorname{sign}(z_i^*) = \operatorname{sign}(x_i)$. Substituting in the previous equation we get

\begin{equation} z_i^* = x_i - \lambda \operatorname{sign}(x_i) \end{equation}

In the case where $z_i = 0$, the subdifferential of the $l_1$-norm is the interval $[-1,1]$ and the optimality condition is

\begin{equation} 0 \in -x_i + \lambda[-1,1] \Leftrightarrow x_i \in [-\lambda,\lambda] \Leftrightarrow |x_i| \leq \lambda \end{equation}

Putting it all together we get

\begin{equation} [\operatorname{prox}_f(x)]_i = z_i^* = \left\{ \begin{array}{lr} 0 & \text{if } |x_i| \leq \lambda \\ x_i - \lambda \operatorname{sign}(x_i) & \text{if } |x_i| > \lambda \end{array}\right. \end{equation}

The previous equation can also be written as

\begin{align*} [\operatorname{prox}_f(x)]_i &= \operatorname{sign}(x_i)\max(|x_i|-\lambda, 0) \\ &= \operatorname{sign}(x_i)(|x_i|-\lambda)_+ \end{align*}

where $(\cdot)_+$ denotes the positive part.
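The closed form above is easy to check numerically against a brute-force minimization of the prox objective. A minimal sketch in Python with NumPy (the name `soft_threshold` is just illustrative):

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft thresholding: prox of lam * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Brute-force check: minimize 0.5*(x_i - z)^2 + lam*|z|
# over a fine grid of z, one component at a time.
lam = 0.7
x = np.array([-2.0, -0.3, 0.0, 0.5, 1.5])
grid = np.linspace(-5.0, 5.0, 200001)
brute = np.array([grid[np.argmin(0.5 * (xi - grid)**2 + lam * np.abs(grid))]
                  for xi in x])
print(np.allclose(soft_threshold(x, lam), brute, atol=1e-4))  # True
```

Components with $|x_i| \leq \lambda$ (here $-0.3$, $0$, and $0.5$) are sent to zero; the others are shrunk toward zero by $\lambda$.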


Suppose $\hat{x} \in \mathbb R$ and we want to find a minimizer of $f(x) = |x| + \frac{1}{2t}(x - \hat{x})^2$. We want to make $|x|$ small without straying too far from $\hat{x}$.

Consider the case where $\hat{x} > 0$.

In this case there's no reason to let $x$ be negative: if $x$ is negative, then $-x$ is closer to $\hat{x}$ and has the same absolute value as $x$.

So we need to minimize the quadratic function $g(x) = x + \frac{1}{2t} (x - \hat{x})^2$ over $[0,\infty)$. The graph of $g$ is an upward-opening parabola with vertex at $x = \hat{x} - t$ (set $g'(x) = 1 + \frac{1}{t}(x - \hat{x}) = 0$), so the constrained minimizer is $x^* = \max(\hat{x} - t, 0)$.

The case where $\hat{x} < 0$ is similar.
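Both cases combine into $x^* = \operatorname{sign}(\hat{x})\max(|\hat{x}| - t, 0)$. A quick sanity check in Python (the helper name `prox_abs` is made up here) compares this closed form against a grid search over the full objective:

```python
import numpy as np

def prox_abs(xhat, t):
    """Closed-form minimizer of |x| + (x - xhat)**2 / (2*t)."""
    return np.sign(xhat) * max(abs(xhat) - t, 0.0)

# Grid search over the full objective, with no case analysis.
xhat, t = 2.5, 1.0
xs = np.linspace(-5.0, 5.0, 100001)
f = np.abs(xs) + (xs - xhat) ** 2 / (2 * t)
print(xs[np.argmin(f)], prox_abs(xhat, t))  # both are approximately 1.5
```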


I wonder about something.

One could write the Soft Thresholding as:

$$ x - \lambda \operatorname{sgn} \left( x \right) \min \left( \left| \frac{x}{\lambda} \right|, 1 \right) $$

This is of the form $ x - \lambda g \left( \frac{x}{\lambda} \right) $.
Using the trick from my answer to Closed Form Solution of $ \arg \min_{x} {\left\| x − y \right\|}_{2}^{2} + \lambda {\left\| x \right\|}_{2} $ - Tikhonov Regularized Least Squares, it might suggest that $ \operatorname{sgn} \left( x \right) \min \left( \left| x \right|, 1 \right) $ is the projection onto the $ {L}_{1} $ unit ball (which doesn't have a closed-form solution).

So what's wrong here?

Remark
As noted below, the formula gives the projection onto the dual ball, i.e. the $ {L}_{\infty} $ unit ball.
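This is the Moreau decomposition at work: $\operatorname{prox}_{\lambda\|\cdot\|_1}(x) = x - \lambda P\left(\frac{x}{\lambda}\right)$, where $P$ is the projection onto the $L_\infty$ unit ball (the dual ball of the $L_1$ norm). A small numerical check in Python (function names are just illustrative):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of lam * ||.||_1 (soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proj_linf_ball(u):
    """Projection onto the L-infinity unit ball, the dual ball of L1."""
    return np.sign(u) * np.minimum(np.abs(u), 1.0)

# Moreau decomposition: prox_{lam*||.||_1}(x) = x - lam * P(x / lam)
lam = 0.7
x = np.array([-2.0, -0.3, 0.0, 0.5, 1.5])
print(np.allclose(soft_threshold(x, lam), x - lam * proj_linf_ball(x / lam)))  # True
```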

