If you’re running on PostgreSQL, you could try the following cool query:
WITH RECURSIVE
r (r, i) AS (
SELECT random(), i
FROM generate_series(1, 1000000) AS t (i)
),
s (ri, s, i) AS (
SELECT i, r, i
FROM r
UNION ALL
SELECT s.ri, r.r + s.s, s.i + 1
FROM r
JOIN s ON r.i = s.i + 1
WHERE r.r + s.s <= 1
),
n (n) AS (
SELECT max(i) - min(i) + 2
FROM s
GROUP BY ri
)
SELECT avg(n)
FROM n
What does it print (after a while)? It prints e
(almost). Here are some sample results:
2.7169115477960698 2.7164145522690296 2.7172065451410937 2.7170815462660836
Not perfect, sure, here’s a better approximation written in SQL:
SELECT exp(1);
Producing:
2.718281828459045
Close enough… How does it work? It’s a cool approximation that has been described many times, e.g. here. In prose:
On average, it takes e random values between 0 and 1 until the sum of those values exceeds 1.
Looking at the query again:
WITH RECURSIVE
-- "random values between 0 and 1"
r (r, i) AS (
SELECT random(), i
FROM generate_series(1, 1000000) AS t (i)
),
s (ri, s, i) AS (
SELECT i, r, i
FROM r
UNION ALL
SELECT s.ri, r.r + s.s, s.i + 1
FROM r
JOIN s ON r.i = s.i + 1
-- "... until the sum exceeds 1"
WHERE r.r + s.s <= 1
),
-- "number of values taken until ..."
n (n) AS (
SELECT max(i) - min(i) + 2
FROM s
GROUP BY ri
)
-- "on average"
SELECT avg(n)
FROM n
In prose, read from top to bottom:
- I’m generating 1 million random values between 0 and 1
- Starting from each one of those values, I’m adding consecutive values as long as their sum does not exceed 1
- For each value, I check how many values it took until the sum exceeded 1
- I take the average of that number of values
Highly inefficient, but that wasn’t the point. :-)
from Java, SQL and jOOQ. https://ift.tt/9zKcupe
via IFTTT
No comments:
Post a Comment