Johnson-Lindenstrauss lemma
The Johnson-Lindenstrauss lemma asserts that a set of $n$ points in any high-dimensional Euclidean space can be mapped down into an $O(\log n / \varepsilon^2)$-dimensional Euclidean space such that the distance between any two points changes by only a factor of $(1 \pm \varepsilon)$, for any $0 < \varepsilon < 1$.
Introduction
Johnson and Lindenstrauss {cite} proved a fundamental mathematical result: any $n$-point set in any Euclidean space can be embedded in $O(\log n / \varepsilon^2)$ dimensions without distorting the distances between any pair of points by more than a factor of $(1 \pm \varepsilon)$, for any $0 < \varepsilon < 1$. The original proof of Johnson and Lindenstrauss was much simplified by Frankl and Maehara {cite}, using geometric insights and refined approximation techniques.
Proof
Suppose we have a set of $n$ $d$-dimensional points $x_1, \ldots, x_n$ and we map them down to $k = C \log n / \varepsilon^2$ dimensions, for an appropriate constant $C$. Define $T$ as the linear map, that is, if $x \in \mathbb{R}^d$, then $T(x) = Ax$. For example, $A$ could be a $k \times d$ matrix.
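For concreteness, here is a minimal sketch of such a map (an illustrative assumption, not the construction from the original proof): $A$ is a $k \times d$ matrix with i.i.d. Gaussian entries, scaled so that $\mathbb{E}\,\|Ax\|^2 = \|x\|^2$ for every $x$.

```python
import numpy as np

def make_map(d, k, rng=None):
    """Return a k x d matrix A defining the linear map T(x) = Ax.

    Entries are i.i.d. N(0, 1/k), so E[||Ax||^2] = ||x||^2 for every x.
    """
    rng = np.random.default_rng(rng)
    return rng.normal(scale=1.0 / np.sqrt(k), size=(k, d))

A = make_map(d=1000, k=100, rng=0)
x = np.ones(1000)
print(np.linalg.norm(A @ x) / np.linalg.norm(x))  # concentrates near 1
```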
The general proof framework
All known proofs of the Johnson-Lindenstrauss lemma proceed according to the following scheme: For given $n$ and $\varepsilon$, one defines a suitable probability distribution $F$ on the set of all linear maps $T: \mathbb{R}^d \to \mathbb{R}^k$. Then one proves the following statement:

Statement: If $T$ is a random linear mapping drawn from the distribution $F$, then for every vector $x \in \mathbb{R}^d$ we have

$$\Pr\big[(1 - \varepsilon)\|x\| \le \|T(x)\| \le (1 + \varepsilon)\|x\|\big] \ge 1 - \frac{1}{n^2}.$$
Having established this statement for the considered distribution $F$, the JL result follows easily: We choose $T$ at random according to $F$. Then for every pair $i < j$, using linearity of $T$ and the above Statement with $x = x_i - x_j$, we get that $T$ fails to satisfy

$$(1 - \varepsilon)\|x_i - x_j\| \le \|T(x_i) - T(x_j)\| \le (1 + \varepsilon)\|x_i - x_j\|$$

with probability at most $1/n^2$. Consequently, the probability that any of the $\binom{n}{2}$ pairwise distances is distorted by $T$ by more than $(1 \pm \varepsilon)$ is at most $\binom{n}{2} \cdot \frac{1}{n^2} < \frac{1}{2}$. Therefore, a random $T$ works with probability at least $\frac{1}{2}$.
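As a quick empirical illustration of this union-bound argument (a sketch under assumed parameters; the constant $C = 8$ below is chosen loosely and is not taken from any particular proof), one can check all $\binom{n}{2}$ pairwise distances after a random Gaussian projection:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, d, eps = 50, 1000, 0.2
k = int(np.ceil(8 * np.log(n) / eps**2))  # k = C log(n) / eps^2, with assumed C = 8

X = rng.normal(size=(n, d))                          # n points in R^d
A = rng.normal(scale=1.0 / np.sqrt(k), size=(k, d))  # random linear map T(x) = Ax
Y = X @ A.T                                          # images T(x_1), ..., T(x_n)

worst = max(
    abs(np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j]) - 1)
    for i, j in combinations(range(n), 2)
)
print(f"worst relative distortion: {worst:.3f} (target eps = {eps})")
```

With these parameters the worst observed distortion typically comes in below $\varepsilon$, in line with the probability-at-least-$\frac{1}{2}$ guarantee.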
References
- S. Dasgupta and A. Gupta. An elementary proof of the Johnson-Lindenstrauss lemma. Tech. Rep. TR-99-006, International Computer Science Institute, Berkeley, CA, 1999.
- W. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189–206, 1984.
Fast Monte-Carlo algorithms for matrix multiplication
Given an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$, we present two simple and intuitive algorithms to compute an approximation $P$ to the product $AB$, with provable bounds for the norm of the "error matrix" $P - AB$.
Both algorithms run in $O(mcp)$ time. In both algorithms, we randomly pick $c$ columns of $A$ to form an $m \times c$ matrix $S$ and the corresponding rows of $B$ to form a $c \times p$ matrix $R$. After scaling the columns of $S$ and the rows of $R$, we multiply them together to obtain our approximation $P$. The choice of the probability distribution we use for picking the columns of $A$ and the scaling are the crucial features which enable us to give fairly elementary proofs of the error bounds.
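A minimal sketch of this sampling scheme follows, assuming length-based probabilities $p_i \propto \|A^{(i)}\|\,\|B_{(i)}\|$ (column norm of $A$ times row norm of $B$) and $1/\sqrt{c\,p_i}$ scaling; the function name and parameters are illustrative, not taken from the paper:

```python
import numpy as np

def approx_matmul(A, B, c, rng=None):
    """Monte-Carlo approximation P to A @ B from c sampled column/row pairs.

    Column i of A (and row i of B) is sampled with probability probs[i]
    proportional to ||A[:, i]|| * ||B[i, :]||; each sampled pair is rescaled
    by 1/sqrt(c * probs[i]) so that E[P] = A @ B.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    probs = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = probs / probs.sum()
    idx = rng.choice(n, size=c, p=probs)      # sample with replacement
    scale = 1.0 / np.sqrt(c * probs[idx])
    S = A[:, idx] * scale                     # m x c, scaled columns of A
    R = B[idx, :] * scale[:, None]            # c x p, scaled rows of B
    return S @ R

# Example: relative error of the approximation
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 500))
B = rng.normal(size=(500, 300))
P = approx_matmul(A, B, c=100, rng=0)
print(np.linalg.norm(P - A @ B) / np.linalg.norm(A @ B))
```

Because each term $\frac{1}{c\,p_i} A^{(i)} B_{(i)}$ has expectation $\frac{1}{c} AB$, summing the $c$ sampled terms makes $P$ an unbiased estimator of $AB$.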
Our first algorithm can be implemented without storing the matrices $A$ and $B$ in Random Access Memory, provided we can make two passes through the matrices (stored in external memory). The second algorithm has a smaller bound on the 2-norm of the error matrix, but requires storage of $A$ and $B$ in RAM. We also present a fast algorithm that "describes" $P$ as a sum of rank-one matrices if $B = A^T$.
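In the case $B = A^T$, the sampled approximation has an explicit rank-one structure (a sketch of the form this takes, in the same notation as the snippet above, with $i_t$ the index of the $t$-th sampled column):

$$P = \sum_{t=1}^{c} \frac{1}{c\, p_{i_t}}\, A^{(i_t)} \left(A^{(i_t)}\right)^T,$$

so $P$ can be stored and manipulated via the $c$ sampled columns alone rather than as a dense $m \times m$ matrix.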