Rejection Sampling: The Ultimate Cheatsheet

Rejection sampling is a Monte Carlo method used to generate samples from a probability distribution p(x), even when p(x) is difficult or impossible to sample from directly. Instead, it relies on having access to a simpler distribution q(x) that we can sample from, and a constant k such that kq(x) >= p(x) for all x. The basic idea is to sample from q(x) and then “reject” some of the samples so that the remaining samples follow the desired distribution p(x).

Why is it Important and What Kind of Problems Does it Solve?

Rejection sampling is important because it allows us to draw samples from distributions that we can evaluate (possibly only up to a normalizing constant) but cannot sample from directly. This is particularly useful in Bayesian inference, where the posterior distribution is often known only up to its normalizing constant, and more generally whenever direct sampling from the target distribution is computationally expensive or impossible.

It solves problems where:

  • You need to sample from a target distribution p(x).
  • You can’t directly sample from p(x).
  • You have a proposal distribution q(x) that is easy to sample from.
  • You can find a constant k such that kq(x) >= p(x) for all x.

Core Concepts, Underlying Principles, and Key Terminology

  • Target Distribution p(x): The distribution we want to sample from. This distribution is often only known up to a normalizing constant.
  • Proposal Distribution q(x): A simpler distribution that we can easily sample from. Also called the “envelope” distribution.
  • Envelope Function: The function kq(x) that upper bounds the target distribution p(x).
  • Constant k: A scaling factor such that kq(x) >= p(x) for all x. Finding a suitable k is crucial for the efficiency of the algorithm. The smaller the k, the more efficient the sampling.
  • Acceptance Probability: The probability that a sample drawn from q(x) will be accepted. This is given by p(x) / (kq(x)).
  • Rejection: The process of discarding samples drawn from q(x) that do not meet the acceptance criterion.
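Why the accepted samples follow the target can be verified in two lines (written here with a normalized p for clarity). The probability that a proposed sample is accepted is

```latex
\Pr(\text{accept}) \;=\; \int q(x)\,\frac{p(x)}{k\,q(x)}\,dx \;=\; \frac{1}{k},
```

and the density of a sample conditional on acceptance is

```latex
f(x \mid \text{accept}) \;=\; \frac{q(x)\cdot\frac{p(x)}{k\,q(x)}}{1/k} \;=\; p(x),
```

which is exactly the target. This also shows that, on average, k proposals are needed per accepted sample.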

2. When to Use Rejection Sampling (and When Not To)

Rejection sampling is a good fit when:

  • You have a target distribution that is difficult to sample from directly.
  • You can find a simpler proposal distribution that is easy to sample from.
  • You can find a constant k such that kq(x) >= p(x) for all x.
  • The target distribution is defined over a relatively small and well-defined space.

Scenarios Where Other Algorithms Are More Appropriate

Rejection sampling can be inefficient if:

  • Finding a suitable proposal distribution q(x) and a constant k is difficult.
  • The constant k is very large, leading to a low acceptance rate and many rejected samples. This is especially problematic in high-dimensional spaces, where the acceptance rate can become exponentially small.
  • The target distribution is defined over a very large or unbounded space.
  • More efficient Markov Chain Monte Carlo (MCMC) methods like Metropolis-Hastings or Gibbs sampling are applicable. MCMC methods don’t require knowing a constant k and can be more efficient for complex distributions.
  • Importance sampling might be a better alternative if evaluating p(x) / q(x) is relatively easy.
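To make the last point concrete, here is a minimal self-normalized importance sampling sketch (function names are illustrative, not from any particular library): instead of discarding samples, it reweights every proposal draw by p(x) / q(x), which is often preferable when you only need expectations under p rather than the samples themselves.

```python
import math
import random

def importance_estimate(p, q_pdf, q_sample, f, n):
    """Self-normalized importance sampling estimate of E_p[f(X)].

    Works with an unnormalized target density p: the weights
    w_i = p(x_i) / q_pdf(x_i) are normalized, so the unknown
    normalizing constant cancels, and no envelope constant k is needed.
    """
    xs = [q_sample() for _ in range(n)]
    ws = [p(x) / q_pdf(x) for x in xs]
    total = sum(ws)
    return sum(w * f(x) for w, x in zip(ws, xs)) / total

# Estimate E[X^2] = 1 for an unnormalized standard normal,
# using a Uniform(-5, 5) proposal (density 0.1 everywhere).
random.seed(0)
p = lambda x: math.exp(-x * x / 2)
est = importance_estimate(p, lambda x: 0.1,
                          lambda: random.uniform(-5, 5),
                          lambda x: x * x, 50_000)
```

Unlike rejection sampling, every draw contributes to the estimate; the trade-off is that the output is a weighted estimate of expectations, not a set of exact samples from p.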

3. Core Algorithm / Data Structure Template

Pseudo-code for Rejection Sampling:

function rejection_sampling(p, q, k, N):
    """
    Generates N samples from the target distribution p(x) using rejection sampling.
    Args:
        p: The target probability density function (PDF).
        q: The proposal probability density function (PDF).
        k: A constant such that k * q(x) >= p(x) for all x.
        N: The number of samples to generate.
    Returns:
        A list of N samples drawn from p(x).
    """
    samples = []
    while len(samples) < N:
        # 1. Sample x from the proposal distribution q(x).
        x = sample_from_q(q)
        # 2. Sample u from a uniform distribution U(0, 1).
        u = random.uniform(0, 1)
        # 3. Calculate the acceptance probability: p(x) / (k * q(x)).
        acceptance_probability = p(x) / (k * q(x))
        # 4. Accept or reject the sample.
        if u <= acceptance_probability:
            samples.append(x)
    return samples

4. Code Implementations (Python, Java, C++)

import random
import numpy as np
def rejection_sampling(p, q_sample, q_pdf, k, n_samples):
    """
    Rejection sampling implementation.
    Args:
        p: Target density (function of x; may be unnormalized).
        q_sample: Function that draws one sample from the proposal q.
        q_pdf: Proposal density (function of x).
        k: Constant such that k * q_pdf(x) >= p(x) for all x.
        n_samples: Number of samples to generate.
    Returns:
        List of samples from p(x).
    """
    samples = []
    while len(samples) < n_samples:
        x = q_sample()            # Sample from q(x)
        u = random.uniform(0, 1)
        acceptance_probability = p(x) / (k * q_pdf(x))
        if u <= acceptance_probability:
            samples.append(x)
    return samples
# Example usage
# def p(x):                        # Example target distribution
#     return np.exp(-x**2 / 2)     # Normal distribution (unnormalized)
# def q_sample():                  # Example proposal sampler
#     return random.uniform(-5, 5)
# def q_pdf(x):                    # Density of Uniform(-5, 5)
#     return 0.1
# k = 10.0  # sup p(x) / q_pdf(x) = 1 / 0.1, so k * q_pdf(x) >= p(x) everywhere
# samples = rejection_sampling(p, q_sample, q_pdf, k, 1000)
# print(f"Generated {len(samples)} samples.")
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleSupplier;
import java.util.function.DoubleUnaryOperator;

public class RejectionSampling {

    public static List<Double> rejectionSampling(DoubleUnaryOperator p, DoubleSupplier qSample,
                                                 DoubleUnaryOperator qPdf, double k, int nSamples) {
        List<Double> samples = new ArrayList<>();
        Random random = new Random();
        while (samples.size() < nSamples) {
            double x = qSample.getAsDouble(); // Sample from q(x)
            double u = random.nextDouble();
            double acceptanceProbability = p.applyAsDouble(x) / (k * qPdf.applyAsDouble(x));
            if (u <= acceptanceProbability) {
                samples.add(x);
            }
        }
        return samples;
    }

    // Example usage
    public static void main(String[] args) {
        // Example: sampling from N(0,1) with a uniform proposal on [-5, 5]
        Random rng = new Random();
        DoubleUnaryOperator p = x -> Math.exp(-x * x / 2.0); // Target density (unnormalized normal)
        DoubleSupplier qSample = () -> rng.nextDouble() * 10 - 5;
        DoubleUnaryOperator qPdf = x -> 0.1;                 // Density of Uniform(-5, 5)
        double k = 10.0; // sup p(x) / qPdf(x) = 1 / 0.1
        int nSamples = 1000;
        List<Double> samples = rejectionSampling(p, qSample, qPdf, k, nSamples);
        System.out.println("Generated " + samples.size() + " samples.");
    }
}
#include <iostream>
#include <vector>
#include <random>
#include <cmath>
#include <functional>
using namespace std;

vector<double> rejectionSampling(function<double(double)> p, function<double()> qSample,
                                 function<double(double)> qPdf, double k, int nSamples) {
    vector<double> samples;
    random_device rd;
    mt19937 gen(rd());
    uniform_real_distribution<> dis(0.0, 1.0);
    while ((int)samples.size() < nSamples) {
        double x = qSample(); // Sample from q(x)
        double u = dis(gen);
        double acceptanceProbability = p(x) / (k * qPdf(x));
        if (u <= acceptanceProbability) {
            samples.push_back(x);
        }
    }
    return samples;
}

// Example usage
int main() {
    // Example: sampling from N(0,1) with a uniform proposal on [-5, 5]
    mt19937 gen(random_device{}());
    uniform_real_distribution<> uniform(-5.0, 5.0);
    auto p = [](double x) { return exp(-x * x / 2.0); }; // Target density (unnormalized normal)
    auto qSample = [&]() { return uniform(gen); };
    auto qPdf = [](double) { return 0.1; };              // Density of Uniform(-5, 5)
    double k = 10.0; // sup p(x) / qPdf(x) = 1 / 0.1
    int nSamples = 1000;
    vector<double> samples = rejectionSampling(p, qSample, qPdf, k, nSamples);
    cout << "Generated " << samples.size() << " samples." << endl;
    return 0;
}
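As a sanity check on the implementations above, here is a self-contained Python sketch (it inlines the same uniform-proposal setup rather than importing the code above) that draws from a standard normal and verifies the first two sample moments:

```python
import math
import random

def sample_normal_rejection(n, seed=0):
    """Draw n samples from N(0, 1) via rejection sampling with a
    Uniform(-5, 5) proposal. With the unnormalized target
    p(x) = exp(-x^2 / 2) and q(x) = 0.1, the tightest constant is
    k = sup p(x) / q(x) = 1 / 0.1 = 10, so k * q(x) = 1 for all x."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.uniform(-5.0, 5.0)                  # propose from q
        if rng.random() <= math.exp(-x * x / 2.0):  # p(x) / (k * q(x))
            samples.append(x)
    return samples

samples = sample_normal_rejection(20_000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
# mean should be near 0 and var near 1
```

Note the acceptance rate here is only about 0.25 (the normal's mass is concentrated near 0, while the uniform proposal spreads its mass over [-5, 5]), which previews the efficiency discussion below.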

The complexity of rejection sampling is highly dependent on the choice of the proposal distribution q(x) and the constant k.

Operation          Time Complexity    Space Complexity
Sampling from q    O(Tq)              O(1)
Evaluating p(x)    O(Tp)              O(1)
Evaluating q(x)    O(Tq)              O(1)
Acceptance Test    O(1)               O(1)
Storing Sample     O(1)               O(1)

Where Tq and Tp are the time complexities of sampling from and evaluating the proposal and target distributions, respectively.

  • Best Case: The acceptance rate is high (k is close to 1). In this case, the algorithm is relatively efficient.
  • Average Case: The acceptance rate is moderate. The algorithm’s efficiency depends on the distributions chosen. The expected number of iterations to get one sample is k. So, generating N samples takes approximately kN iterations.
  • Worst Case: The acceptance rate is very low (k is very large), which happens when the proposal distribution is a poor fit for the target. The algorithm can then be extremely inefficient, and if the acceptance probability is effectively zero it may never terminate. The space complexity is always O(N), where N is the number of samples we wish to draw.
  • Choosing a Good Proposal Distribution: The choice of the proposal distribution q(x) is crucial for the efficiency of rejection sampling. Ideally, q(x) should be similar in shape to p(x) and easy to sample from. A tighter fit between q(x) and p(x) will result in a smaller value of k and a higher acceptance rate.
  • Finding the Optimal k: Determining the smallest possible value of k that satisfies the condition kq(x) >= p(x) is essential for maximizing the acceptance rate. This often requires careful analysis of the target and proposal distributions. Sometimes, analytical methods can be used to find the optimal k. Otherwise, numerical optimization techniques or visual inspection can be helpful.
  • Dealing with Unnormalized Distributions: Rejection sampling works even if the target distribution p(x) is only known up to a normalizing constant: run the algorithm with the unnormalized density and choose k so that kq(x) upper-bounds it; the accepted samples still follow the normalized p(x).
  • High-Dimensional Spaces: Rejection sampling can become very inefficient in high-dimensional spaces due to the “curse of dimensionality.” The acceptance rate tends to decrease exponentially with the number of dimensions. Consider using other methods like MCMC in high-dimensional settings.
  • Pitfall: Incorrectly Calculating Acceptance Probability: A common mistake is to incorrectly calculate the acceptance probability, especially when dealing with complex or unnormalized distributions. Ensure that you are using the correct formulas for p(x) and q(x).
  • Pitfall: Unbounded Domains: If the proposal distribution and target distribution are defined over unbounded domains, care must be taken to ensure that the constant k exists and is finite.
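Both the expected-iterations claim (k proposals per accepted sample on average) and the high-dimensional warning can be checked empirically. In this sketch (the setup is illustrative) the target is N(0, I_d) and the proposal is N(0, σ²I_d) with σ = 1.5; the tightest envelope constant is k = σ^d, so the acceptance rate should decay geometrically as σ^(-d):

```python
import math
import random

def acceptance_rate(d, sigma=1.5, trials=100_000, seed=0):
    """Empirical acceptance rate for target N(0, I_d) with proposal
    N(0, sigma^2 I_d). The tightest constant is k = sigma**d, and
    p(x) / (k * q(x)) simplifies to exp(-||x||^2 / 2 * (1 - 1/sigma^2))."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        s = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(d))  # ||x||^2 under q
        accept_prob = math.exp(-s / 2.0 * (1.0 - 1.0 / sigma ** 2))
        if rng.random() <= accept_prob:
            accepted += 1
    return accepted / trials

rates = {d: acceptance_rate(d) for d in (1, 2, 5, 10)}
# Theory: rate ≈ sigma**(-d), i.e. roughly 0.67, 0.44, 0.13, 0.017
```

Even with this well-matched Gaussian proposal, over 98% of proposals are wasted by d = 10; this is the curse of dimensionality that motivates MCMC methods in high-dimensional settings.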

Problem Description: Given the API rand7() that generates a uniform random integer in the range [1, 7], write a function rand10() that generates a uniform random integer in the range [1, 10]. You can only call the API rand7(), and you shouldn’t call any other API. Please do not use a language’s built-in random API.

High-Level Approach using Rejection Sampling:

  1. Generate a Wider Range: Call rand7() twice to generate a uniform number in the range [0, 48]. Specifically, (rand7() - 1) * 7 + (rand7() - 1) generates each value in [0, 48] with equal probability.
  2. Rejection: If the generated number is greater than 39, reject it and repeat step 1. This is because numbers 40-48 are out of range for our desired output.
  3. Mapping to [1, 10]: The numbers [0, 39] are uniformly distributed. We can map these to the range [1, 10] by taking the number modulo 10 and adding 1. i.e. (generated_number % 10) + 1.

Explanation:

  • Target Distribution: Uniform distribution over the integers [1, 10].
  • Proposal Distribution: We’re effectively creating a temporary uniform distribution over [0, 48] using two calls to rand7(). We then reject the values that fall outside the accepted range [0, 39], ensuring that the remaining values are uniformly distributed over a range that maps cleanly to [1, 10].
  • Rejection Step: The key is that the rejection step ensures uniformity. By rejecting the higher values, we avoid biasing the resulting distribution.
import random

def rand7():
    """Mock rand7 function for testing."""
    return random.randint(1, 7)

def rand10():
    """
    Implements rand10() using rand7().
    """
    while True:
        # Generate a number in [0, 48]
        num = (rand7() - 1) * 7 + (rand7() - 1)
        # Accept numbers in [0, 39] and map them to [1, 10]
        if num < 40:
            return (num % 10) + 1
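A quick empirical check of the construction (self-contained, with the same mock rand7): each value in [1, 10] should appear with frequency close to 1/10. Since each attempt succeeds with probability 40/49, the expected cost is 49/40 attempts, i.e. 2 * 49/40 = 2.45 calls to rand7() per rand10() call.

```python
import random

random.seed(42)

def rand7():
    """Mock rand7 for testing."""
    return random.randint(1, 7)

def rand10():
    while True:
        num = (rand7() - 1) * 7 + (rand7() - 1)  # uniform over [0, 48]
        if num < 40:                             # accept [0, 39], reject [40, 48]
            return num % 10 + 1

n = 100_000
counts = [0] * 10
for _ in range(n):
    counts[rand10() - 1] += 1
frequencies = [c / n for c in counts]
# every entry of frequencies should be close to 0.10
```

The rejection step is what guarantees uniformity: mapping [0, 48] to [1, 10] directly with a modulo would make the values 1 through 9 slightly more likely than 10.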