Data Analytics

Welcome to the data analytics portion of my WordPress site.

I’ve divided this page into three sections: Baseball Analytics, Data Explorations, and Inferential Statistics.


Baseball Analytics

This section comprises my data analytics projects on Major League Baseball.


MLB Correlating Runs Scored

Correlating runs scored with AVG, SLG, OBP, and OPS. Which metric correlates best with runs scored?

Click here to view my analysis.


MLB Batting Statistics By Year

An analysis of offensive baseball metrics and how they have trended since the start of the Live Ball Era in 1920.

Click here to view my analysis.


MLB No-Hitters and the Exponential Distribution

A review of the exponential distribution of major league no-hitters.

Click here to view my analysis.


MLB Player Performance Index

Here I create an offensive score and analyze MLB position players with over 4,000 at bats.

Click here to view my analysis.


Greatest NY Yankee

For this analysis, I look at the career totals of four of the greatest Yankees to play the game.

Click here to view my analysis.


Data Explorations

This section encompasses a collection of analytic projects, each exploring different datasets that have captured my interest.


Iris Dataset

Here is my foray into the famous iris dataset. Besides the usual EDA, I fit a number of supervised and unsupervised machine learning models to the data. Because the dataset is used so extensively for training ML models, it seems appropriate to spend some time learning it in full.

Click here to view my analysis.


Rainfall in Austin, Texas

Analysis of the average rainfall in my home city of Austin, Texas.

Click here to view my analysis.


Literacy vs Fertility

Analysis of literacy vs fertility throughout the world using linear regression and bootstrap sampling.

Click here to view my analysis.


Canelo Álvarez vs. Gennady Golovkin II

Who really won the Álvarez vs. Golovkin II fight? Here I perform an analysis of the CompuBox stats.

Click here to view my analysis.


Dice Roll Game

A fun dice roll game where I run 10,000 simulations and perform a statistical analysis of my results.

Click here to view my analysis.


High Low Card Game

A fun card game where I run 1 million simulations and perform a statistical analysis of my results.

Click here to view my analysis.


Simulating a Basketball Game

In this study, we simulate two distinct scenarios involving basketball teams: one team exclusively shoots three-pointers, while the other focuses solely on two-pointers. We then visualize the data in a graph and evaluate which team exhibits a higher win percentage. Finally, we conduct a two-sample t-test to ascertain whether the average win percentages of the two teams are statistically different from one another.

Click here to view my analysis.
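
As a rough illustration of the procedure described above, here is a minimal Python sketch. It is not the code from the linked analysis. The shooting percentages (35% on threes, 50% on twos), the 100 shots per team per game, and the season sizes are all assumptions chosen for illustration, and the function names are hypothetical. The t statistic is computed by hand as Welch's two-sample t, using only the standard library.

```python
import math
import random
import statistics


def simulate_game(rng, shots=100, p_three=0.35, p_two=0.50):
    """Return (three_team_points, two_team_points) for one game.

    The make probabilities and shot count are illustrative assumptions.
    """
    threes = sum(3 for _ in range(shots) if rng.random() < p_three)
    twos = sum(2 for _ in range(shots) if rng.random() < p_two)
    return threes, twos


def win_percentages(rng, games=200):
    """Return (three_team_win_pct, two_team_win_pct); ties count for neither."""
    three_wins = two_wins = 0
    for _ in range(games):
        a, b = simulate_game(rng)
        if a > b:
            three_wins += 1
        elif b > a:
            two_wins += 1
    return three_wins / games, two_wins / games


def welch_t(x, y):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        vx / len(x) + vy / len(y)
    )


rng = random.Random(42)  # fixed seed so the run is repeatable

# Use separate batches of simulated seasons for each team so the two
# samples of win percentages are independent of each other.
three_pcts = [win_percentages(rng)[0] for _ in range(30)]
two_pcts = [win_percentages(rng)[1] for _ in range(30)]

print(f"mean win % (threes only): {statistics.mean(three_pcts):.3f}")
print(f"mean win % (twos only):   {statistics.mean(two_pcts):.3f}")
print(f"Welch t statistic:        {welch_t(three_pcts, two_pcts):.2f}")
```

Under these assumed percentages the three-point team averages 105 points per game against 100 for the two-point team, but with more game-to-game variance. Changing p_three or p_two changes the outcome, so the sketch shows the mechanics rather than a conclusion.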


Inferential Statistics

This section comprises my writings on inferential statistics, focusing on the fundamentals and essential ideas that I have found critical for understanding the field.


Standard Normal Curve

In this section, I provide a brief Python analysis of the Standard Normal Curve. While the analysis may not be overly complex, it sheds light on a few aspects that might not be immediately evident to all readers.

Click here to view my analysis.


Central Limit Theorem

In this section, I explore the Central Limit Theorem and why it matters. Understanding the CLT together with the standard error of the mean goes a long way toward understanding inferential statistics as a whole.

Click here to view my analysis.
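
To make the link between the CLT and the standard error of the mean concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the code from the linked analysis. The population is assumed to be Exponential with rate 1 (a skewed distribution with mean 1 and standard deviation 1), and the sample sizes and repetition count are arbitrary choices. It compares the observed spread of many sample means with the predicted standard error, sigma / sqrt(n).

```python
import math
import random
import statistics


def sample_means(rng, n, reps=5000):
    """Draw `reps` samples of size n from Exponential(1); return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
population_sd = 1.0     # Exponential with rate 1 has mean 1 and sd 1

results = {}
for n in (5, 30, 100):
    means = sample_means(rng, n)
    observed = statistics.stdev(means)        # spread of the sample means
    predicted = population_sd / math.sqrt(n)  # standard error of the mean
    results[n] = (observed, predicted)
    print(f"n={n:>3}  sd of sample means={observed:.4f}  sigma/sqrt(n)={predicted:.4f}")
```

The two columns should track each other closely at every sample size, even though the underlying population is far from normal. Plotting a histogram of the means for each n would also show the shape becoming more symmetric as n grows.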


Confidence Intervals

In this section, I calculate confidence intervals across a range of scenarios to demonstrate their practical application in statistical analysis.

Click here to view my analysis.


Hypothesis Testing

In this section, I delve into the realm of hypothesis testing using Python. I explore various scenarios, including single samples, unmatched samples, matched samples, ANOVA (Analysis of Variance), and chi-square tests.

Click here to view my analysis.


Happy coding!