The function on that slide is dominated by the call to rand, which has quite different implementations in Julia and Python, so it may not be the best example.
Julia is compiled, and for simple code like that example it will have performance on par with C, Rust, etc.
I tested how PyPy performs on that. Just changing the Python implementation drops the runtime from ~16.5s to ~3.5s on my computer, approximately a 5x speedup:
xxxx@xxxx:~
$ python3 -VV
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0]
xxxx@xxxx:~
$ pypy3 -VV
Python 3.9.16 (7.3.11+dfsg-2+deb12u3, Dec 30 2024, 22:36:23)
[PyPy 7.3.11 with GCC 12.2.0]
xxxx@xxxx:~
$ cat original_benchmark.py
#-------------------------------------------
import random
import time
def monte_carlo_pi(n):
    inside = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
#-------------------------------------------
xxxx@xxxx:~
$ python3 original_benchmark.py
Time: 16.487 seconds
Estimated pi: 3.14177012
xxxx@xxxx:~
$ pypy3 original_benchmark.py
Time: 3.357 seconds
Estimated pi: 3.14166756
xxxx@xxxx:~
$ python3 -c "print(round(16.487/3.357, 1))"
4.9
I changed the code to take advantage of some basic performance tips that are commonly given for CPython (lean on the standard library - itertools, math - and prefer comprehensions/generator expressions over plain for loops), and was able to improve the CPython time by ~1.3x. But then the PyPy time took a hit:
xxxx@xxxx:~
$ cat mod_benchmark.py
#-------------------------------------------
from itertools import repeat
from math import hypot
from random import random
import time
def monte_carlo_pi(n):
    inside = sum(hypot(random(), random()) <= 1.0 for i in repeat(None, n))
    return 4.0 * inside / n
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
#-------------------------------------------
xxxx@xxxx:~
$ python3 mod_benchmark.py
Time: 12.998 seconds
Estimated pi: 3.14149268
xxxx@xxxx:~
$ pypy3 mod_benchmark.py
Time: 12.684 seconds
Estimated pi: 3.14160844
xxxx@xxxx:~
$ python3 -c "print(round(16.487/12.684, 1))"
1.3
I tested staying in CPython but jitting the main function with numba (no code changes beyond adding the jit decorator with the expected type signature, plus the same jit warm-up call before the benchmark that the Julia version uses), and it's about an 11× speedup. Code:
import random
import time
from numba import jit, int32, float64
@jit(float64(int32), nopython=True)
def monte_carlo_pi(n):
    inside = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n
# Warm up (compile)
monte_carlo_pi(100)
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
Base version (using the unmodified Python code from the slide):
$ python -m monte
Time: 13.758 seconds
Estimated pi: 3.14159524
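As an aside on the harness itself: for wall-clock benchmarks like these, time.perf_counter() is generally preferred over time.time(), since it uses a monotonic, high-resolution clock that isn't affected by system clock adjustments. A minimal sketch of the same benchmark with that change (smaller n here so it finishes quickly; the function body is the unmodified slide code):

```python
import random
import time

def monte_carlo_pi(n):
    inside = 0
    for _ in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n

# perf_counter() is monotonic and high-resolution, so the measured
# interval can't be skewed by NTP adjustments or clock changes the
# way a time.time() delta can.
start = time.perf_counter()
result = monte_carlo_pi(1_000_000)
elapsed = time.perf_counter() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
```

It doesn't change any of the conclusions above (the deltas here are seconds, not microseconds), but it's the habit worth having for micro-benchmarks.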