Introduction to Bayesian Statistics: A Different Way of Thinking
Bayesian statistics offers a fundamentally different approach to statistical inference. Instead of treating parameters as fixed but unknown quantities, as frequentist methods do, the Bayesian approach treats them as random variables with probability distributions.
The Bayesian Framework
At the heart of Bayesian statistics is Bayes' theorem:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$

Where:
- $P(\theta \mid D)$ is the posterior - what we believe about $\theta$ after seeing the data $D$
- $P(D \mid \theta)$ is the likelihood - how probable the data is given $\theta$
- $P(\theta)$ is the prior - what we believed about $\theta$ before seeing data
- $P(D)$ is the evidence - the probability of observing the data
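To make the theorem concrete, here is a minimal sketch (my addition, not part of the original post) that applies Bayes' theorem to two discrete hypotheses about a coin's heads probability; the prior weights and the biased value 0.8 are illustrative assumptions:

# A minimal sketch of Bayes' theorem for two discrete hypotheses
# about a coin's heads probability. The numbers are illustrative.

# Hypotheses: the coin is fair (theta = 0.5) or biased (theta = 0.8)
prior = {0.5: 0.9, 0.8: 0.1}       # P(theta): prior beliefs
likelihood = {0.5: 0.5, 0.8: 0.8}  # P(heads | theta)

# Evidence: P(heads) = sum over hypotheses of likelihood * prior
evidence = sum(likelihood[t] * prior[t] for t in prior)

# Posterior: P(theta | heads) = P(heads | theta) * P(theta) / P(heads)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}
print(posterior)  # {0.5: 0.849..., 0.8: 0.150...}

A single heads shifts only a little belief toward the biased hypothesis; the same update rule scales to continuous $\theta$, which is what the rest of this post does.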
A Simple Example: Coin Flipping
Suppose we want to estimate the probability that a coin lands heads. In the Bayesian approach:
Prior Distribution
We start with a prior belief. Let's use a Beta distribution:

$$p(\theta) = \mathrm{Beta}(\theta;\, \alpha, \beta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}$$

For a fair coin, we might choose $\alpha = \beta = 1$ (uniform prior).
Likelihood
For $n$ flips with $k$ heads, the likelihood is:

$$P(D \mid \theta) = \binom{n}{k}\, \theta^{k} (1 - \theta)^{n - k}$$
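As a quick check (a sketch I've added using scipy.stats.binom, with illustrative values of $\theta$), we can evaluate this likelihood for the 7-heads-in-10-flips data used in the implementation below:

from scipy import stats

# Binomial likelihood P(D | theta) for n = 10 flips, k = 7 heads,
# evaluated at a few candidate values of theta
n, k = 10, 7
for theta in (0.3, 0.5, 0.7):
    print(f"theta = {theta}: likelihood = {stats.binom.pmf(k, n, theta):.4f}")

As expected, the likelihood peaks near $\theta = k/n = 0.7$.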
Posterior Distribution
Because the Beta prior is conjugate to the binomial likelihood, the posterior is also a Beta distribution:

$$p(\theta \mid D) = \mathrm{Beta}(\theta;\, \alpha + k,\ \beta + n - k)$$

With a uniform prior ($\alpha = \beta = 1$) and the 7 heads in 10 flips observed below, the posterior is $\mathrm{Beta}(8, 4)$.
Implementation in Python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def bayesian_coin_inference(flips, alpha_prior=1, beta_prior=1):
    """
    Bayesian inference for coin bias.

    Args:
        flips: list of 0s and 1s (0=tails, 1=heads)
        alpha_prior: prior Beta parameter
        beta_prior: prior Beta parameter

    Returns:
        posterior Beta distribution parameters
    """
    heads = sum(flips)
    tails = len(flips) - heads

    # Update parameters
    alpha_post = alpha_prior + heads
    beta_post = beta_prior + tails

    return alpha_post, beta_post


# Example usage
observed_flips = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # 7 heads, 3 tails
alpha_post, beta_post = bayesian_coin_inference(observed_flips)

# Plot results
x = np.linspace(0, 1, 1000)
prior = stats.beta(1, 1)
posterior = stats.beta(alpha_post, beta_post)

plt.figure(figsize=(10, 6))
plt.plot(x, prior.pdf(x), label='Prior', linestyle='--')
plt.plot(x, posterior.pdf(x), label='Posterior', linewidth=2)
plt.axvline(0.5, color='red', linestyle=':', label='Fair coin')
plt.xlabel('Probability of Heads')
plt.ylabel('Density')
plt.legend()
plt.title('Bayesian Coin Inference')
plt.show()
Credible Intervals
Unlike confidence intervals, Bayesian credible intervals have an intuitive interpretation:
$$P(a \le \theta \le b \mid D) = 0.95$$

This means there's a 95% probability that $\theta$ lies between $a$ and $b$, given our data.
def credible_interval(alpha, beta, confidence=0.95):
    """Calculate Bayesian credible interval."""
    lower = (1 - confidence) / 2
    upper = 1 - lower
    rv = stats.beta(alpha, beta)
    return rv.ppf(lower), rv.ppf(upper)


# 95% credible interval for our coin
lower, upper = credible_interval(alpha_post, beta_post)
print(f"95% credible interval: [{lower:.3f}, {upper:.3f}]")
Advantages of Bayesian Approach
- Intuitive interpretation: Probabilities represent degrees of belief
- Incorporates prior knowledge: Can include expert opinion or historical data
- Handles uncertainty naturally: Full probability distributions, not just point estimates
- Sequential updating: Easy to incorporate new data as it arrives (see the sketch after this list)
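To illustrate that last point, here is a minimal sketch (my addition, reusing the conjugate update rule from bayesian_coin_inference above) in which the posterior from each batch of flips becomes the prior for the next; the batch split is an illustrative assumption:

# Sequential Bayesian updating: the posterior after each batch of flips
# serves as the prior for the next batch. The batches are illustrative.
from scipy import stats

alpha, beta = 1, 1  # start from a uniform Beta(1, 1) prior
batches = [[1, 0, 1], [1, 1, 0, 1], [1, 0, 1]]  # data arriving over time

for i, batch in enumerate(batches, start=1):
    heads = sum(batch)
    alpha += heads              # add observed heads
    beta += len(batch) - heads  # add observed tails
    mean = stats.beta(alpha, beta).mean()
    print(f"After batch {i}: Beta({alpha}, {beta}), posterior mean = {mean:.3f}")

The final posterior, Beta(8, 4), matches the one-shot result from the implementation above, since the batches again total 7 heads and 3 tails: updating in pieces or all at once gives the same answer.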
Challenges
- Computational complexity: Often requires numerical methods (MCMC)
- Prior sensitivity: Results can depend on the choice of prior (see the sketch after this list)
- Philosophical differences: Different interpretation of probability
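To make prior sensitivity concrete, here is a brief sketch (my addition, using the same 7-heads-in-10-flips data) comparing posteriors under the uniform prior and under a stronger prior concentrated on fairness; the Beta(20, 20) prior is an illustrative choice:

# Prior sensitivity: the same data yield different posteriors under
# different priors. The Beta(20, 20) prior is an illustrative choice.
from scipy import stats

n, k = 10, 7  # 10 flips, 7 heads (same data as above)

for name, (a, b) in {"uniform Beta(1, 1)": (1, 1),
                     "strong Beta(20, 20)": (20, 20)}.items():
    post = stats.beta(a + k, b + n - k)
    print(f"{name:>20} prior -> posterior mean = {post.mean():.3f}")

With only ten flips, the strong prior pulls the posterior mean well toward 0.5; as data accumulate, the likelihood dominates and the two posteriors converge.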
Conclusion
Bayesian statistics provides a powerful framework for reasoning under uncertainty. While it requires a shift in thinking from frequentist methods, it offers intuitive interpretations and natural ways to incorporate prior knowledge.
In future posts, we'll explore more advanced topics like hierarchical Bayesian models and Markov Chain Monte Carlo methods.