Mathematical-preliminaries-TDSM
Mathematical Preliminaries
Probability
2-1.
Suppose 80% of people like peanut butter, 89% like jelly, and 78% like both. Given that a randomly sampled person likes peanut butter, what is the probability that she also likes jelly?
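As a quick check of the arithmetic, the definition of conditional probability, P(jelly | peanut butter) = P(both) / P(peanut butter), can be evaluated directly. This is a minimal sketch; the variable names are illustrative only.

```python
from fractions import Fraction

# Probabilities stated in the problem, kept as exact fractions.
p_peanut_butter = Fraction(80, 100)
p_both = Fraction(78, 100)

# Conditional probability: P(jelly | peanut butter) = P(both) / P(peanut butter).
p_jelly_given_peanut_butter = p_both / p_peanut_butter

print(p_jelly_given_peanut_butter, float(p_jelly_given_peanut_butter))
```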
2-3.
Consider a game where your score is the maximum value from two dice. Compute the probability of each event from [math]\{1, \ldots, 6\}[/math].
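One way to approach this is to enumerate all 36 equally likely outcomes and tally the maximum of each. The sketch below assumes two fair six-sided dice; the function name `max_of_two_dice_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def max_of_two_dice_distribution(sides=6):
    """Return {score: probability} for the maximum of two fair dice."""
    faces = range(1, sides + 1)
    # Count how many of the sides**2 equally likely rolls give each maximum.
    counts = Counter(max(a, b) for a, b in product(faces, repeat=2))
    total = sides ** 2
    return {score: Fraction(counts[score], total) for score in faces}


distribution = max_of_two_dice_distribution()
for score, probability in distribution.items():
    print(score, probability)
```

The enumeration agrees with the counting argument that exactly 2k - 1 of the 36 rolls have maximum k.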
2-5.
If two binary random variables X and Y are independent, are [math]\bar{X}[/math] (the complement of X) and Y also independent? Give a proof or a counterexample.
Statistics
2-7.
Construct a probability distribution where none of the mass lies within one [math]\sigma[/math] of the mean.
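One candidate construction, assuming "within" means strictly closer than one standard deviation, is a fair coin taking the values -1 and +1. The sketch below only checks that candidate numerically.

```python
from math import sqrt

# Candidate distribution: a fair coin on the values -1 and +1.
distribution = {-1: 0.5, 1: 0.5}

mean = sum(value * prob for value, prob in distribution.items())
variance = sum(prob * (value - mean) ** 2 for value, prob in distribution.items())
sigma = sqrt(variance)

# Total probability mass lying strictly closer than sigma to the mean.
mass_strictly_within = sum(
    prob for value, prob in distribution.items() if abs(value - mean) < sigma
)

print(mean, sigma, mass_strictly_within)
```

Every point sits exactly one standard deviation from the mean, so the strict reading of "within" matters here.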
2-9.
Show that the arithmetic mean equals the geometric mean when all terms are the same.
Correlation Analysis
2-11.
What would be the correlation coefficient between the annual salaries of college and high school graduates at a given company, if for each possible job title the college graduates always made:
- 5,000 dollars more than high school grads?
- 25% more than high school grads?
- 15% less than high school grads?
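The three scenarios can be explored numerically. The sketch below assumes one salary pair per job title and uses made-up high school salaries; `pearson` is a hand-written helper for the Pearson correlation coefficient, not a library call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Hypothetical high school graduate salaries, one per job title (made-up numbers).
high_school = [30_000, 42_000, 55_000, 61_000, 78_000]

# College salaries under each of the three scenarios in the exercise.
scenarios = {
    "5,000 dollars more": [s + 5_000 for s in high_school],
    "25% more": [s * 1.25 for s in high_school],
    "15% less": [s * 0.85 for s in high_school],
}

correlations = {
    name: pearson(high_school, college) for name, college in scenarios.items()
}
for name, r in correlations.items():
    print(f"{name}: r = {r:.6f}")
```

Each scenario is a linear transformation with a positive slope, which is why the sign of the salary difference does not change the coefficient.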
2-13.
Use data or literature found in a Google search to estimate/measure the strength of the correlation between:
- Hits and walks scored for hitters in baseball.
- Hits and walks allowed by pitchers in baseball.
Logarithms
2-15.
Show that the logarithm of any number less than 1 is negative.
2-17.
Prove that
[math]x \cdot y = b^{(\log_b x + \log_b y)}[/math]
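One possible line of argument, assuming x and y are positive and the base b is positive and not equal to 1, applies the rule for adding exponents and then the definition of the logarithm as the inverse of exponentiation:
[math]b^{(\log_b x + \log_b y)} = b^{\log_b x} \cdot b^{\log_b y} = x \cdot y[/math]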
Implementation Projects
2-19.
Find some interesting data sets, and compare how similar their means and medians are. For which distributions do the mean and median differ the most?
Interview Questions
2-21.
What is the probability of getting exactly k heads on n tosses, where the coin has probability p of coming up heads on each toss? What about k or more heads?
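Assuming the tosses are independent, the number of heads follows a binomial distribution. The sketch below evaluates both quantities; the function names are hypothetical.

```python
from math import comb


def prob_exactly_k_heads(n, k, p):
    """Binomial probability of exactly k heads in n independent tosses."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least_k_heads(n, k, p):
    """Probability of k or more heads: sum the exact probabilities from k to n."""
    return sum(prob_exactly_k_heads(n, i, p) for i in range(k, n + 1))


print(prob_exactly_k_heads(10, 3, 0.5))
print(prob_at_least_k_heads(10, 3, 0.5))
```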
2-23.
At halftime of a basketball game you are offered two possible challenges:
- Take three shots, and make at least two of them.
- Take eight shots, and make at least five of them.
Which challenge should you pick to have the better chance of succeeding?
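The problem leaves the shooter's skill unspecified, so a natural way to compare the two challenges is as a function of the per-shot success probability p. The sketch below assumes independent shots with the same p; `prob_at_least` and `better_challenge` are hypothetical helper names.

```python
from math import comb


def prob_at_least(shots, makes_needed, p):
    """Probability of making at least makes_needed of shots independent attempts."""
    return sum(
        comb(shots, k) * p ** k * (1 - p) ** (shots - k)
        for k in range(makes_needed, shots + 1)
    )


def better_challenge(p):
    """Return the name of the easier challenge and both success probabilities."""
    three = prob_at_least(3, 2, p)
    eight = prob_at_least(8, 5, p)
    name = "three shots" if three >= eight else "eight shots"
    return name, three, eight


for p in (0.3, 0.5, 0.7, 0.8):
    name, three, eight = better_challenge(p)
    print(f"p={p}: three={three:.3f}, eight={eight:.3f}, better: {name}")
```

Under these assumptions the preferred challenge changes with p, so an answer should state the shooter's accuracy it assumes.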
2-25.
Given a stream of n numbers, show how to select one uniformly at random using only constant storage. What if you don't know n in advance?
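A standard technique that fits this question is reservoir sampling with a reservoir of size one: keep a single candidate and replace it with the i-th item with probability 1/i. The sketch below is one way to write it; the function name `pick_uniform_from_stream` and the `rng` parameter are illustrative choices.

```python
import random


def pick_uniform_from_stream(stream, rng=random):
    """Return one item chosen uniformly at random, storing only one candidate.

    The length of the stream does not need to be known in advance.
    Returns None for an empty stream.
    """
    chosen = None
    for count, item in enumerate(stream, start=1):
        # Replace the current candidate with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    return chosen


print(pick_uniform_from_stream(iter(range(100))))
```

By induction, after i items each one seen so far is the current candidate with probability 1/i, so the final choice is uniform.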
2-27.
A person randomly types an 8-digit number into a pocket calculator. What is the probability that the number looks the same when the calculator is turned upside down?
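The answer depends on modeling choices that the question leaves open. The sketch below assumes a seven-segment display on which 0, 1, 2, 5, and 8 read as themselves when rotated, 6 and 9 swap, and 3, 4, and 7 become unreadable; it also assumes leading zeros are allowed and all digit strings are equally likely. The `ROTATED` table and both function names are assumptions made for illustration.

```python
from itertools import product

# Assumed effect of a 180-degree rotation on seven-segment digits.
ROTATED = {"0": "0", "1": "1", "2": "2", "5": "5", "6": "9", "8": "8", "9": "6"}


def looks_same_upside_down(digits):
    """True if the digit string reads identically after a 180-degree rotation."""
    if any(d not in ROTATED for d in digits):
        return False
    # Rotation reverses the order of the digits and rotates each one.
    return "".join(ROTATED[d] for d in reversed(digits)) == digits


def count_by_brute_force(length):
    """Count invariant strings by enumeration (practical only for small lengths)."""
    return sum(
        looks_same_upside_down("".join(t)) for t in product("0123456789", repeat=length)
    )


def count_by_formula(length):
    """The first half is free (7 choices each); a middle digit must map to itself."""
    count = 7 ** (length // 2)
    if length % 2 == 1:
        count *= 5  # 0, 1, 2, 5, 8
    return count


print(count_by_formula(8), "out of", 10 ** 8)
```

Under a different digit model, for example one that excludes 2 and 5 or forbids leading zeros, the count changes accordingly.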
2-29.
What is A/B testing and how does it work?
2-31.
We often say that correlation does not imply causation. What does this mean?
Kaggle Challenges
2-33.
Cause-effect pairs: correlation vs causation.
https://www.kaggle.com/c/cause-effect-pairs
2-35.
Predict the fate of animals at a pet shelter.
https://www.kaggle.com/c/shelter-animal-outcomes