Chi2 Goodness of fit test for Random Numbers

Hey all, today I wanted to share a little program I’ve made for my International Baccalaureate Extended Essay.

My Extended Essay consists of evaluating the quality of different Pseudo-Random Number Generators (PRNGs) using different tests. This program specifically tests Python’s random module, which is based on the “Mersenne Twister” PRNG, using a chi2 goodness of fit test.

The program is composed of three scripts, which are the following:

“RNG”

import sys
import random

# Set the amount of numbers to be randomly generated
num = 100000
# Keep a reference to the terminal output, as the "Reading" script imports it
TerminalOut = sys.stdout

with open("PythonRandomNums", "w") as NewOut:
    # Print a random integer from 0 to 9 (inclusive) to the text file, num times.
    # Passing file=NewOut writes to the file without redirecting sys.stdout.
    for x in range(num):
        print(random.randint(0, 9), file=NewOut)

“Reading”

from RNG import num, TerminalOut
import sys

# Make sure the output goes to the terminal, in case RNG redirected it
sys.stdout = TerminalOut

# Create a dictionary for the frequency of each number (0 to 9)
count = {digit: 0 for digit in range(10)}

# Open the RNG numbers file, read each line,
# and increase the frequency of the corresponding number.
# The with block closes the file once all lines are read.
with open("PythonRandomNums", "r") as f:
    for line in range(num):
        current_num = int(f.readline())
        count[current_num] += 1

# Print out the observed frequencies of the test
print(f"Observed Frequencies: {count}")

“chi2 test”

import numpy
from RNG import num
from Reading import count
import scipy.stats as stats

# Put all the frequencies into a list, ordered from 0 to 9
f_array = [count[digit] for digit in range(10)]

# Create an array with the expected frequencies for each number,
# the expectation is that all frequencies will be spread evenly,
# so the expected frequency for each number will be the number of numbers generated {num}
# divided by all possible numbers {10}
single_expect = num / 10
expected_array = numpy.full(10, single_expect)

# Perform the chi2 test and print the results
print(stats.chisquare(f_obs=f_array, f_exp=expected_array, ddof=0))

The code is commented, and quite tidy if I may say so myself, so it should be easy to understand. What I’m doing here is basically creating a text file with 100000 “random” numbers, recording the frequency of each number, and putting these values into a chi2 test, which outputs the test statistic and a p-value describing how “good” the spread is, so that I can analyze it later.
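To make what the test computes a bit more concrete, here is a minimal sketch of the same idea using only the standard library. It is not part of the three scripts: the names chi_square_statistic, digit_frequencies and the seed parameter are made up for this illustration. The statistic is the sum of (observed - expected)^2 / expected over the ten digits, and the sketch compares it against 16.919, the standard table value for 9 degrees of freedom (10 digits minus 1) at a 5% significance level.

```python
import random
from collections import Counter

# Critical value of the chi-square distribution for 9 degrees of freedom
# at a 5% significance level (standard table value, rounded).
CRITICAL_VALUE_DF9_5PCT = 16.919


def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def digit_frequencies(n, seed=None):
    """Generate n random digits (0 to 9) and return their counts as a list."""
    # The seed parameter is a hypothetical addition, only for repeatable runs
    rng = random.Random(seed)
    counts = Counter(rng.randint(0, 9) for _ in range(n))
    return [counts[digit] for digit in range(10)]


if __name__ == "__main__":
    n = 100000
    observed = digit_frequencies(n)
    # Each digit is expected to appear n / 10 times if the spread is even
    expected = [n / 10] * 10
    statistic = chi_square_statistic(observed, expected)
    print(f"chi2 statistic: {statistic:.3f}")
    # A statistic above the critical value would suggest the digits are not uniform
    print(f"exceeds 5% critical value: {statistic > CRITICAL_VALUE_DF9_5PCT}")
```

The statistic computed this way should match the one stats.chisquare reports for the same frequencies. The fixed critical value only stands in for the p-value that SciPy calculates, so it gives a yes/no answer at one significance level rather than the full picture.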

If this piques your interest, you may want to stay tuned for my full Extended Essay, which I’ll upload once I graduate (I need to be careful with people who may plagiarize it).

Nothing crazy, but I’m quite happy with the simplicity of it, and I wanted to share something while I’m busy with school. Just know that I haven’t forgotten about the post on that AI Minecraft project I wrote about a while ago.

Until next time!
