Tuesday, May 08, 2007

Faure sequences and quasi-Monte Carlo methods

Fix a number n, and consider the following recursive procedure to construct a permutation $\sigma(n)$ of the numbers 0..n-1. We'll view the permutation as a sequence, so the permutation (0 2 1) maps 0 to 0, 1 to 2, and 2 to 1. When convenient, we'll also treat the sequences as vectors, so we can do arithmetic on them.
  1. If n is even, then $\sigma(n) = s_1 \circ s_2$ (where $\circ$ denotes concatenation), with $s_1 = 2\sigma(n/2)$ and $s_2 = 2\sigma(n/2)+1$.

    For example, $\sigma(2) = (0 1)$. $\sigma(4) = (0 2) \circ (1 3) = (0 2 1 3)$
  2. If n is odd, then $\sigma(n) = s_1 \circ (n-1)/2 \circ s_2$, where $s_1$ and $s_2$ are constructed by taking the sequence for $\sigma(n-1)$, incrementing all elements that are at least $(n-1)/2$, and then splitting the sequence into two parts of equal length.

    For example, $\sigma(4) = (0 2 1 3)$. Let n = 5. Then incrementing all elements at least 2 gives us $(0 3 1 4)$. Splitting and inserting (2) gives us $\sigma(5) = (0 3 2 1 4)$.
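This recursion translates directly into code. Here's a minimal Python sketch (the function name is mine); it builds the whole permutation, which is exactly what we'd like to avoid doing below, but it's handy for checking small cases:

```python
def faure_perm(n):
    """Build the permutation sigma(n) of 0..n-1 by the recursion above."""
    if n == 1:
        return [0]
    if n % 2 == 0:
        half = faure_perm(n // 2)
        # even case: concatenate 2*sigma(n/2) and 2*sigma(n/2) + 1
        return [2 * x for x in half] + [2 * x + 1 for x in half]
    # odd case: bump every element of sigma(n-1) that is >= (n-1)/2,
    # split into two equal halves, and insert (n-1)/2 in the middle
    mid = (n - 1) // 2
    bumped = [x + 1 if x >= mid else x for x in faure_perm(n - 1)]
    return bumped[:mid] + [mid] + bumped[mid:]

print(faure_perm(5))  # [0, 3, 2, 1, 4], matching the example above
```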
Here's the question: given n and j, can we return the jth entry of $\sigma(n)$ using fewer than $\log(n)$ operations? Note that writing down $j$ takes $\log n$ bits already, so the question is couched in terms of operations. For reasons explained later, an amortized bound is not useful (i.e., we can't compute the entire sequence first and then pick off elements in constant time).

For the backstory of where this sequence, known as a Faure sequence, comes from, read on...

One of the applications of geometric discrepancy is in what are called quasi-Monte Carlo methods. If you want to estimate the integral of some complicated function over a domain (for simulation purposes, for example), one way of doing this is to sample points at random from the domain, and use the function evaluations at the sampled points to estimate the integral. Typically the function is expensive to evaluate as well, so you don't want to pick too many points.
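As a toy illustration (a sketch over the unit interval; the names are mine), the plain Monte Carlo estimator is just an average of function values at random sample points:

```python
import random

def mc_estimate(f, n):
    """Plain Monte Carlo: average f over n uniform samples from [0, 1]."""
    return sum(f(random.random()) for _ in range(n)) / n

# estimate the integral of x^2 over [0, 1]; the true value is 1/3
print(mc_estimate(lambda x: x * x, 100_000))
```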

Of course, in practice you're picking random points using a random number generator that is itself based on some pseudorandom generation process. An important area of study, called quasi-Monte Carlo methods, asks if this middleman can be eliminated. In other words, is it possible to generate a deterministic set of points that suffices to approximate the integral? This is of course the question of finding a low-discrepancy set, something that we've used in computational geometry to derandomize geometric algorithms (especially those based on $\epsilon$-net constructions).

There are many ways of constructing good sequences, and a good overview can be found in Chazelle's discrepancy book (read it free here). One of the better-known constructions is due to Halton, and works by what is called radical inversion: given a base p, write down the numbers from 1 to n in base p, reverse the digits, and map the result back to a number less than one. For example, using a base of 2, we can write 6 as 110, which reversed becomes 011, or, after adjoining a decimal point, 0.011 = 3/8. This specific sequence, using base 2, is called a van der Corput sequence and is also used to construct a Hammersley point set. For sampling in d-dimensional space, the trick is to let the jth coordinate of the ith point be the ith entry of such a radical-inverse sequence in some base $p_j$ (usually a prime); the resulting d-dimensional sequence is the Halton sequence.
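In code, radical inversion looks like this (a small sketch; the function name is mine, not taken from any particular QMC library):

```python
def radical_inverse(i, base=2):
    """Write i in the given base and reverse its digits across the decimal point,
    e.g. 6 = 110 in base 2 becomes 0.011 = 3/8."""
    inv, denom = 0.0, 1.0
    while i > 0:
        denom *= base
        inv += (i % base) / denom
        i //= base
    return inv

print(radical_inverse(6, 2))  # 0.375 = 3/8, as in the example above
```

The jth coordinate of the ith d-dimensional point is then radical_inverse(i, p_j) for d distinct primes $p_1, \ldots, p_d$.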

It can be shown that these sequences have low discrepancy; however, they can have "aliasing" problems, or repeated patterns, if the primes are not large enough. One way around this problem is by observing that if we ignore the decimal point, all we're really doing is constructing a permutation of the numbers from 0 to $p^d-1$ (for a d-digit number in base p). Thus, if we could scramble the permutation further, this might allow us to preserve the discrepancy properties without the annoying aliasing effects. The procedure described at the top of this post is a well-known scramble called the Faure sequence. It maintains the desired properties of the sequence, and is quite popular in the quasi-MC community.
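One common way such a scramble is applied in practice is digit by digit: each digit of the radical inverse is passed through a permutation of 0..p-1 before the digits are reversed. The sketch below shows that per-digit variant (my choice; exactly how the scramble is applied varies across the quasi-MC literature), reusing a permutation like the $\sigma$ defined above:

```python
def scrambled_radical_inverse(i, base, perm):
    """Radical inverse of i in the given base, with every digit mapped through perm
    (perm is assumed to be a permutation of 0..base-1)."""
    inv, denom = 0.0, 1.0
    while i > 0:
        denom *= base
        inv += perm[i % base] / denom
        i //= base
    return inv

# e.g., scramble base 5 with the permutation sigma(5) = (0 3 2 1 4) from above
print(scrambled_radical_inverse(6, 5, [0, 3, 2, 1, 4]))
```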

Note that if we were to preprocess all needed sequences, we'd need $n \cdot d$ space, where n is the number of samples and d is the dimension of the space. This is not desirable for high-dimensional sampling problems, and hence the question about the direct evaluation of coordinates.

5 comments:

  1. You can compute the jth entry of $\sigma(2^k)$ by taking the reverse of the k-bit binary representation of j. E.g., in binary, $\sigma(8)$ = (000, 100, 010, 110, 001, 101, 011, 111). Do you count this as fewer than log(n) operations or exactly equal to log(n) operations?

  2. I'm willing to count this as a single operation. Basically, I'm using a RAM model: the reverse operation is somewhat unorthodox for a RAM model, but I'm allowing it, because it yields a closed form.

  3. Two further observations:

    To simplify notation, let $\sigma(n,j)$ denote the jth entry of $\sigma(n)$.

    If you have access to all of $\sigma(n)$ and the binary Halton sequence H(i) (H(1) = 0.5, H(2) = 0.25, H(3) = 0.75, etc...), you can compute $\sigma(2^k n, j)$ as $2^k \sigma(n, j \bmod n) + 2^k H(\lfloor j/n\rfloor)$ (indexing j from 0 here, and taking H(0) = 0). With this observation -- and constant-time access to the Halton sequence -- you can compute $\sigma(n, j)$ in time proportional to the Hamming weight of n. Not great, but maybe slightly better.

    Secondly, suppose that instead of requiring that we compute each $\sigma(n,j)$ in less than $\log n$ operations, we require that we compute all $\sigma(n,j)$ for j=1,...,n in less than $n\log n$ operations; then I think we can do better. Note that $\sigma(n,1) = 0$, $\sigma(n,n) = n-1$ and $\sigma(2n+1,n+1) = n$ for all n. Now compute $\sigma(n,j)$ recursively in the obvious way (e.g., $\sigma(2n, j) = 2\sigma(n,j)$ if $j \le n$ and $2\sigma(n,j-n)+1$ otherwise; $\sigma(2n+1,j) = n$ if $j = n+1$, $2\sigma(n,j)$ if $j \le n$ and $2\sigma(n,j) < n$, etc...), stopping whenever we reach some calculation of the form $\sigma(m,1)$ or $\sigma(m,m)$. We can compute at least 2 entries of $\sigma(n)$ with zero recursive levels (I say "at least" because we get 3 entries if n is odd), at least 2 entries with one recursion, at least 4 entries with two recursions, at least 8 entries with three, etc... Following this line of reasoning and working out all the sums, we find that to calculate all n entries of $\sigma(n)$, we only need to explore a linear number of recursions.

  4. Whoops. I think I meant the van der Corput sequence, not the Halton sequence.

    You're right, and the recurrence, I think, will work out to $n + \log n$ overall.

    However, this is why I ruled out such methods: I know that such a procedure can be used, but for the specific application in mind, we can't rely on computing the entire sequence in advance: the storage requirements are too onerous.

