Lucky Guesses: Simulating a Multiple-Choice Exam

Simulate a multiple-choice exam in which the questions fall into three groups, each with its own probability of a correct answer:

Questions you know, with a 100% chance of being correct
Educated guesses, with a 1/3 chance of being correct
Random guesses, with a 1/4 chance of being correct

After running 10,000 simulations of the exam, determine the following:

Average number of correct answers
Minimum and maximum number of correct answers
Percentage of the exams where the number of correct answers is at or above some desired value
Standard deviation of the number of correct answers

In my simulation I chose a 50-question exam (30 known answers, 10 educated guesses, and 10 random guesses), where my desired score of 35 or greater was attained 74% of the time.

Here are the results of the print statements from my Python code below.

The Pandas Series after the summary lines shows the number of correct answers on the left and the number of simulated exams with that score on the right.

For example, 1,916 exam simulations had 36 correct answers.

Probability of obtaining a score of 35 or greater is 73.83%
The average score is 35.84
The minimum score is 30.00
The maximum score is 45.00
The standard deviation is 2.02
30       6
31      77
32     299
33     867
34    1368
35    1866
36    1916
37    1528
38    1111
39     576
40     260
41      93
42      21
43      10
44       1
45       1

Reviewing the bar chart, we can see the data is roughly bell-shaped and approximately follows a normal distribution.
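
As a cross-check on the shape of the chart, the exact distribution of the score can be computed without simulation. The following is a minimal sketch, assuming the same split of 30 known answers, 10 educated guesses, and 10 random guesses; the helper name binomial_pmf and the variable names are my own, chosen for illustration. It convolves the two binomial probability mass functions and then adds the 30 guaranteed points.

```python
from math import comb

import numpy as np


def binomial_pmf(n, p):
    """Return P(K = k) for k = 0..n, where K ~ Binomial(n, p)."""
    return np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])


# Assumed split, matching the simulation: 30 known, 10 educated, 10 random
known_questions = 30
educated_pmf = binomial_pmf(10, 1/3)
random_pmf = binomial_pmf(10, 1/4)

# The PMF of a sum of independent variables is the convolution of their PMFs.
# Index k of guessed_pmf is the probability of k correct guesses (0..20).
guessed_pmf = np.convolve(educated_pmf, random_pmf)

# Shift by the guaranteed points to get the possible total scores (30..50)
scores = known_questions + np.arange(len(guessed_pmf))

desired_score = 35
exact_probability = guessed_pmf[scores >= desired_score].sum()
exact_mean = (scores * guessed_pmf).sum()

print(f'Exact probability of {desired_score} or greater: {exact_probability:.4f}')
print(f'Exact mean score: {exact_mean:.2f}')
```

If the simulation is behaving, the exact probability should land close to the simulated 73.83%, and plotting guessed_pmf against scores should give the same bell-like shape as the bar chart.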

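The mathematical formula to determine the mean is pretty simple: multiply the number of questions in each group by that group's probability of a correct answer, then add the results. Here we can see the average score will be 35.83, which aligns with our simulated average of 35.84.

E(X) = 30 * 1 + 10 * (1/3) + 10 * (1/4) = 30 + 3.33 + 2.5 = 35.83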
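
The same kind of calculation works for the standard deviation. As a sketch, assuming the three groups are independent: a binomial count with n questions and success probability p has mean n * p and variance n * p * (1 - p), and the variances of independent groups add. The variable names below are illustrative only.

```python
import math

# (number of questions, probability of a correct answer) for each group,
# assuming the 30/10/10 split used in the simulation
groups = [(30, 1.0), (10, 1/3), (10, 1/4)]

# Means add, and for independent groups the variances add too
expected_score = sum(n * p for n, p in groups)
variance = sum(n * p * (1 - p) for n, p in groups)
std_dev = math.sqrt(variance)

print(f'Expected score: {expected_score:.2f}')
print(f'Standard deviation: {std_dev:.2f}')
```

These come out to about 35.83 and 2.02, in line with the simulated average of 35.84 and standard deviation of 2.02. The known questions contribute nothing to the spread because their variance is zero.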

Here is the code I developed in Python.

# import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# define variables for the simulation
desired_score = 35
known_guess = 1
educated_guess = 1/3
random_guess = 1/4
simulations = 10_000

# simulate the exam results using the binomial distribution:
# 30 known answers, 10 educated guesses, and 10 random guesses
np_result = np.random.binomial(30, known_guess, simulations)
np_result = np.add(np_result, np.random.binomial(10, educated_guess, simulations))
np_result = np.add(np_result, np.random.binomial(10, random_guess, simulations))

# filter the results to find the probability of obtaining the desired score or greater
filter_arr = np_result >= desired_score
newarr = np_result[filter_arr]
probability_desired_score = len(newarr)/len(np_result)

# output the results
print(f'Probability of obtaining a score of {desired_score} or greater is {probability_desired_score*100:.2f}%')
print(f'The average score is {np.average(np_result):.2f}')
print(f'The minimum score is {np.min(np_result):.2f}')
print(f'The maximum score is {np.max(np_result):.2f}')
print(f'The standard deviation is {np.std(np_result):.2f}')

# create a pandas series and plot the distribution
df = pd.Series(np_result)
df = df.value_counts()
print(df.sort_index())

plt.bar(df.index, df.values)
plt.title('Distribution of Correct Answers')
plt.xlabel('Correct Answers')
plt.ylabel('Frequency')
plt.xticks(np.arange(min(df.index), 51, 2))
plt.savefig('guessing-multiple-choice-distribution-answers.png', dpi=300, bbox_inches='tight')
plt.show()
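
To experiment with other exam layouts, the simulation step can be wrapped in a function. This is a sketch rather than part of the original script: the function name simulate_exam, its parameters, and the seed value are assumptions for illustration. It uses NumPy's Generator API (np.random.default_rng) in place of the legacy np.random.binomial call so that a run can be reproduced from a seed.

```python
import numpy as np


def simulate_exam(n_known, n_educated, n_random, simulations=10_000, seed=None):
    """Return an array of simulated scores, one per exam (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Known answers are always correct, so they contribute a fixed amount
    scores = np.full(simulations, n_known)
    # Educated guesses are right 1/3 of the time, random guesses 1/4 of the time
    scores = scores + rng.binomial(n_educated, 1/3, simulations)
    scores = scores + rng.binomial(n_random, 1/4, simulations)
    return scores


# Example: the 30/10/10 exam from this post, with an arbitrary seed
scores = simulate_exam(30, 10, 10, seed=42)
print(f'Average score: {scores.mean():.2f}')
print(f'Share of exams scoring 35 or greater: {np.mean(scores >= 35):.2%}')
```

Calling the function twice with the same seed returns identical scores, which makes it easier to compare layouts such as simulate_exam(25, 15, 10) against the original.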