where ||·||₁ represents the matrix 1-norm (i.e., the sum of the absolute values of all entries of the matrix), and λ is a positive constant that scales as m⁻½. We call the above convex program the Robust Principal Component Analysis (RPCA). This formulation
performs well in practice, recovering the true low-rank solution even when up to a third of the observations are grossly corrupted. It also
comes with excellent theoretical guarantees. For example, for generic matrices A, it can be shown to successfully correct a constant fraction of errors, even when the rank of A is nearly proportional to the ambient dimension: rank(A) = C m / log(m). Please refer to [Candès et al., 2009] for more details.
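To make the formulation concrete, here is a minimal sketch that hands the RPCA program (minimize ||A||_* + λ||E||₁ subject to A + E = D) to a generic convex solver via cvxpy. The function name rpca_cvx and the default choice λ = 1/√max(m, n) are illustrative assumptions, not part of the paper.

```python
import numpy as np
import cvxpy as cp

def rpca_cvx(D, lam=None):
    """Illustrative RPCA via a generic convex solver (hypothetical helper):
    minimize ||A||_* + lam * ||E||_1  subject to  A + E = D."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # lambda scaling ~ m^(-1/2), as in the text
    A = cp.Variable((m, n))
    E = cp.Variable((m, n))
    objective = cp.Minimize(cp.normNuc(A) + lam * cp.sum(cp.abs(E)))
    problem = cp.Problem(objective, [A + E == D])
    problem.solve()
    return A.value, E.value
```

This direct approach is convenient for small matrices, but it inherits the scaling limits of general-purpose solvers discussed next.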
The RPCA formulation is appealingly simple: it is a convex optimization problem and, more specifically, it can be recast as a semidefinite program (SDP). It is well known that SDPs are tractable, and interior point methods are a popular class of algorithms for solving them. However, while interior point methods offer superior accuracy and convergence rates, they do not scale well with the size of the matrix: computing the Newton step at every iteration has a complexity of O(m⁶). As a result, off-the-shelf SDP solvers can often handle matrices of size only up to about 70 × 70.
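For context, the recasting uses the standard semidefinite characterization of the nuclear norm: ||A||_* equals the minimum of ½(trace(W₁) + trace(W₂)) over symmetric W₁, W₂ such that the block matrix [W₁ A; Aᵀ W₂] is positive semidefinite. The resulting SDP has on the order of m² scalar variables, which is consistent with the O(m⁶) Newton cost noted above.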
We approach the RPCA optimization problem in two different ways. In the
first approach, we use a first-order method to solve the
primal problem directly. Our algorithm is a special case of a more
general class of algorithms called
proximal
gradient algorithms, and is a direct application of the FISTA algorithm. The algorithm has a theoretical convergence rate of O(k⁻²), i.e., the objective error decays as 1/k² after k iterations; in practice, however, we observe even faster convergence. The computational bottleneck of each iteration is a Singular Value Decomposition (SVD), while the other steps involve simple matrix operations and are naturally amenable to parallelization.
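As a rough illustration of the primal iteration, the sketch below applies FISTA-style acceleration to the penalized relaxation min μ(||A||_* + λ||E||₁) + ½||D − A − E||_F². The function names, the penalty μ, and the stopping rule are illustrative assumptions, not the exact scheme in the paper (which may, e.g., vary μ over the iterations).

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*.
    The SVD here is the per-iteration bottleneck mentioned in the text."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_apg(D, lam, mu=1e-3, max_iter=500, tol=1e-7):
    """FISTA-style accelerated proximal gradient for the penalized RPCA
    relaxation (illustrative parameter choices)."""
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    A_prev, E_prev = A.copy(), E.copy()
    t = 1.0
    for _ in range(max_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next          # FISTA momentum weight
        YA = A + beta * (A - A_prev)       # extrapolated points
        YE = E + beta * (E - E_prev)
        G = YA + YE - D                    # gradient of the smooth term;
        A_prev, E_prev = A, E              # its Lipschitz constant is 2,
        A = svt(YA - 0.5 * G, mu / 2.0)    # hence the step size 1/2
        E = shrink(YE - 0.5 * G, lam * mu / 2.0)
        t = t_next
        if np.linalg.norm(A - A_prev) + np.linalg.norm(E - E_prev) < tol:
            break
    return A, E
```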
The second approach to solving the RPCA optimization problem is to
formulate and solve the dual problem, and retrieve the
primal solution from the dual optimal solution. The dual of the RPCA problem can be written as
max_Y ⟨D, Y⟩ subject to J(Y) ≤ 1,

where ⟨A, B⟩ = trace(AᵀB), J(Y) = max(||Y||₂, λ⁻¹|Y|∞), ||·||₂ represents the matrix spectral norm, and |·|∞ represents the matrix infinity norm (i.e., the maximum absolute value of the matrix entries).
The above dual
problem can be
solved by constrained steepest ascent.
The highlight of this method is that it does not require a full SVD computation every iteration. Please refer to our
paper for more details on
fast algorithms to solve the RPCA problem.
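For the dual route, a bare-bones sketch of constrained steepest ascent might look as follows. The rescaling step is a simple feasibility heuristic standing in for the exact projection used in the paper, and all parameter values are illustrative; the sketch shows the mechanics rather than guaranteeing convergence. Note that evaluating J(Y) needs only the largest singular value of Y, which is why no full SVD is required.

```python
import numpy as np

def J(Y, lam):
    """Dual constraint functional J(Y) = max(||Y||_2, ||Y||_inf / lam).
    Only the top singular value is needed, not a full SVD."""
    spectral = np.linalg.norm(Y, 2)     # largest singular value
    entrywise = np.abs(Y).max() / lam   # lam^{-1} * max_ij |Y_ij|
    return max(spectral, entrywise)

def dual_ascent(D, lam, step=1e-2, iters=1000):
    """Illustrative steepest ascent on  max <D, Y>  s.t.  J(Y) <= 1.
    The gradient of the linear objective is D itself; after each step
    we rescale Y back into the constraint set (J is positively
    homogeneous, so Y / J(Y) always satisfies J = 1)."""
    Y = D / J(D, lam)                   # feasible starting point
    for _ in range(iters):
        Y = Y + step * D                # ascent step
        jY = J(Y, lam)
        if jY > 1.0:
            Y = Y / jY                  # restore feasibility
    return Y
```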