ML4LM — MLE vs Bayesian: Intuitive Insights, No Math!

Hoyath
4 min read · Mar 5, 2024


Maximum Likelihood Estimation (MLE): Relying on Observed Data

Imagine you’re given a bag of coins, and your task is to determine the probability of getting heads when flipping one of these coins. In the world of MLE, your approach would be straightforward: you start flipping the coin and recording the outcomes. If you observe heads 10 out of 10 times, you might conclude that the probability of getting heads is 1 (since you’ve only seen heads).
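To make this concrete, here is a minimal sketch of the MLE computation for the coin, assuming heads are recorded as 1 and tails as 0 (the flips below are hypothetical):

```python
# MLE for a coin's heads probability: simply the fraction of heads observed.
# Hypothetical data: 10 flips, all heads (1 = heads, 0 = tails).
flips = [1] * 10

mle_estimate = sum(flips) / len(flips)
print(mle_estimate)  # 1.0 -- MLE says heads is certain, because that's all we saw
```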

However, there’s a problem with this approach. It doesn’t take into account any prior beliefs or knowledge you might have about the coin. What if the coin is biased towards heads, but you simply haven’t flipped it enough times to observe tails? This is where Bayesian learning comes into play.

Bayesian Learning: Incorporating Prior Beliefs

Let’s revisit our coin example. Suppose, before even flipping the coin, you have a prior belief that the coin is fair, meaning it has an equal chance of landing on heads or tails. This prior belief reflects your initial assumption about the coin’s behavior based on your general knowledge of coins.

Now, as you start flipping the coin and observing only heads, your prior belief is challenged. However, instead of completely discarding your prior belief and solely relying on the observed data (like in MLE), Bayesian learning allows you to update your prior belief with the observed evidence.

Updating Beliefs with Bayes’ Rule

Bayes’ rule provides a systematic way of updating prior beliefs with observed data. It considers both the likelihood of observing the data given a particular hypothesis (in our case, the probability of the observed sequence of flips given the coin’s bias) and the prior probability of that hypothesis.

So, even though you’ve only observed heads so far, Bayes’ rule allows you to update your prior belief with this evidence and arrive at a posterior probability distribution that reflects your updated understanding of the coin’s bias. This posterior distribution incorporates both your prior belief and the observed data, providing a more comprehensive and nuanced estimate of the coin’s behavior.
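In code, the whole update boils down to “multiply the prior by the likelihood, then renormalize.” A minimal sketch, assuming we represent beliefs as arrays over a set of candidate hypotheses (the helper name bayes_update is mine, purely for illustration):

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Combine a prior and a likelihood (both arrays over the same
    candidate hypotheses) into a posterior via Bayes' rule."""
    unnormalized = prior * likelihood          # prior belief x evidence
    return unnormalized / unnormalized.sum()   # normalize so it sums to 1
```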

Let’s see this in detail.

Prior Belief: Before flipping the coin, you have a belief about the fairness of the coin, represented as a probability distribution. Let’s say you believe that any bias of the coin is equally likely, which can be represented by a uniform distribution between 0 and 1.

This means you think any probability of getting heads (represented by p) is equally probable, from completely biased towards tails (0) to completely biased towards heads (1).
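One simple way to sketch this (a discretization choice made here purely for illustration) is to place the prior on a grid of candidate biases:

```python
import numpy as np

# Candidate values for p, the probability of heads, from 0 to 1.
p_grid = np.linspace(0, 1, 101)

# Uniform prior: every candidate bias is considered equally plausible up front.
prior = np.ones_like(p_grid)
prior /= prior.sum()
```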

Likelihood: After flipping the coin and observing a sequence of heads (let’s say n heads out of N flips), you want to calculate how likely it is to observe this sequence given different biases of the coin. This is represented by the binomial distribution, which tells us the probability of getting a certain number of heads (or tails) out of a total number of flips given a bias p of the coin.

This captures the probability of getting the observed data (the sequence of heads) for different possible biases of the coin.
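Continuing the same grid sketch, the binomial likelihood of the hypothetical “10 heads out of 10 flips” data can be evaluated at every candidate bias, here using scipy:

```python
import numpy as np
from scipy.stats import binom

p_grid = np.linspace(0, 1, 101)

# Hypothetical observations: 10 heads out of 10 flips.
n_heads, n_flips = 10, 10

# Likelihood of the observed data under each candidate bias p.
likelihood = binom.pmf(n_heads, n_flips, p_grid)
```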

Posterior Belief: Now, using Bayes’ rule, you combine your prior belief with the likelihood of observing the data given different biases of the coin to get the posterior belief. This is the updated belief about the bias of the coin given the observed data.

Here, you’re essentially multiplying the likelihood with your prior belief and normalizing it by the probability of observing the data (which acts as a normalization factor). This gives you the updated probability distribution of different biases of the coin after considering the observed data.
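Putting the pieces together, here is a rough end-to-end sketch of the update (same hypothetical data; the grid and the printed number are illustrative, not exact):

```python
import numpy as np
from scipy.stats import binom

p_grid = np.linspace(0, 1, 101)
prior = np.ones_like(p_grid)
prior /= prior.sum()

n_heads, n_flips = 10, 10
likelihood = binom.pmf(n_heads, n_flips, p_grid)

# Bayes' rule: posterior is proportional to likelihood x prior.
posterior = likelihood * prior
posterior /= posterior.sum()

# Posterior mean: a point summary of the updated belief about the bias.
print((p_grid * posterior).sum())   # roughly 0.92, not 1.0 like the MLE
```

Notice that the posterior mean comes out around 0.92 rather than the MLE’s 1.0: the uniform prior keeps a little probability on the coin sometimes landing tails, even though we never observed it.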

Conclusion

In essence, Bayesian learning offers a more holistic approach to inference by integrating prior beliefs with observed data. Unlike MLE, which relies solely on the observed data, Bayesian learning allows us to incorporate our prior knowledge and beliefs into the modeling process, leading to more robust and informed decisions. By understanding and leveraging the power of Bayesian inference, we can develop models that better capture the complexities of the real world and make more accurate predictions in uncertain environments.

