Two statistical approaches to inference: Frequentist and Bayesian statistics

A conceptual introduction to the frequentist and Bayesian approaches to statistical inference
Author: Stefano Coretta

Published: September 19, 2024

1 Frequentist vs Bayesian probability

The two major approaches to statistical inference are the frequentist and the Bayesian approaches. The difference between these approaches is how probabilities are conceived, from a philosophical perspective if you will.

Probabilities in a frequentist framework are about average occurrences of events in a hypothetical series of repetitions of those events. Imagine you observe a volcano for a long period of time. The number of times the volcano erupts within that time tells us the frequency of occurrence of the event of volcanic eruption. In other words, it tells us its (frequentist) probability.
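To make the idea of long-run frequency concrete, here is a minimal sketch in R. The eruption probability of 0.15 is made up purely for illustration:

```r
# A minimal sketch: the long-run relative frequency of a (hypothetical) event
# approaches its probability as the number of repetitions grows.
set.seed(42)
true_p <- 0.15  # made-up probability that the volcano erupts in a given year

# Simulate 10,000 hypothetical observation years (1 = eruption, 0 = no eruption).
eruptions <- rbinom(10000, size = 1, prob = true_p)

# Relative frequency of eruptions after 10, 100, 1,000 and 10,000 years.
n <- c(10, 100, 1000, 10000)
sapply(n, function(i) mean(eruptions[1:i]))
# The values get closer and closer to 0.15: the frequentist probability of the event.
```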

In the Bayesian framework, probabilities are about the level of (un)certainty that an event will occur at any specific time given certain conditions. This is probably the way we normally think about probabilities: in a weather forecast, if somebody tells you that tomorrow it will rain with a probability of 85%, you intuitively know that rain is very likely tomorrow, although not certain.

In the context of research, a frequentist probability tells you the probability of obtaining the same result again across an imaginary series of replications of the study that generated it. On the other hand, a Bayesian probability tells you how plausible it is, given the data you collected, that the result of the study reflects the true value.

2 Frequentist inference

Most current research is carried out with frequentist methods. This is a historical accident, based both on a misunderstanding of Bayesian statistics (which is, by the way, older than frequentist statistics) and on the fact that the maths behind frequentist methods was much easier to carry out by hand (and personal computers did not exist).

The commonly accepted approach to frequentist inference is the so-called Null Hypothesis Significance Testing, or NHST. As practised by researchers, the NHST approach is a (sometimes incoherent) mix of the frequentist work of Fisher on one hand, and Neyman and Pearson on the other.

The main tenet of NHST is that you set a null hypothesis and you try to reject it. A null hypothesis is, in practice, a nil hypothesis: in other words, it is the hypothesis that there is no difference between two values (these usually being the means of two or more groups of interest). Using a variety of techniques, one obtains a p-value, i.e. a frequentist probability. p-values are very commonly mistaken for Bayesian probabilities (Cassidy et al. 2019), and this results in various misinterpretations of reported results. To learn the meaning (and dangers) of p-values, check this post.
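As a concrete (if simplified) illustration of the NHST procedure, here is a sketch in R with made-up data:

```r
# A minimal sketch of NHST in practice: compare the means of two groups with
# a t-test. The null (nil) hypothesis is that the group means do not differ.
set.seed(42)
group_a <- rnorm(30, mean = 100, sd = 15)  # made-up data
group_b <- rnorm(30, mean = 110, sd = 15)

t.test(group_a, group_b)
# The output includes a p-value: the (frequentist) probability of obtaining a
# difference at least this large, across imaginary replications of the study,
# if the null hypothesis were true. It is NOT the probability that the null
# hypothesis is true.
```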

The NHST framework has been criticised for its incoherence (by both frequentist and non-frequentist folk). A must-read is Mindless statistics by Gerd Gigerenzer.

3 Bayesian inference

Bayesian inference approaches are now gaining momentum in many fields, including linguistics. The main advantage of Bayesian inference is that it allows researchers to answer research questions in a more straightforward way, using a more intuitive take on uncertainty and probability than what frequentist methods can offer.

Bayesian inference is based on the concept of updating prior beliefs in light of new data. Given a set of prior probabilities and observations, Bayesian inference allows us to revise those prior probabilities and produce posterior probabilities. This is possible through the Bayesian interpretation of probabilities in the context of Bayes’ Theorem.

In simple conceptual terms, the Bayesian interpretation of Bayes’ Theorem states that the probability of a hypothesis \(h\) given the observed data \(d\) is proportional to the product of the prior probability of \(h\) and the probability of \(d\) given \(h\).

\[ P(h|d) \propto P(h) \cdot P(d|h) \]

The prior probability \(P(h)\) represents the researcher’s beliefs about \(h\). These beliefs can be based on expert knowledge, previous studies or mathematical principles. For a more hands-on overview, I recommend Chapter 2 of the Statistical Rethinking textbook.
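A toy example can make the proportionality concrete. The following R sketch uses grid approximation, in the spirit of Chapter 2 of Statistical Rethinking; the data (7 “successes” out of 10 trials) are made up for illustration:

```r
# A minimal sketch of Bayes' Theorem in action: grid approximation of the
# posterior probability of a proportion.
p_grid <- seq(0, 1, length.out = 100)  # candidate values of the hypothesis h
prior <- rep(1, 100)                   # a flat prior: P(h)
likelihood <- dbinom(7, size = 10, prob = p_grid)  # P(d | h)

# The posterior is proportional to prior times likelihood;
# normalise it so that it sums to 1.
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

# The candidate value with the highest posterior probability.
p_grid[which.max(posterior)]
```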

The following section goes through a few reasons to use Bayesian inference for research. Note that some of these reasons presuppose understanding of regression modelling, both frequentist and Bayesian, so don’t worry if they are not clear yet.

4 Why Bayesian inference?

Here are a few practical and conceptual reasons for why you should consider switching to Bayesian statistics for your research.

Practical reasons
  • Fitting frequentist models can lead to anti-conservative p-values (i.e. inflated false positive or Type-I error rates: there is no effect, and yet you get a significant p-value). An interesting example of this for the non-technically inclined reader can be found here. Frequentist mixed-effects models (such as those fitted with lme4) tend to be more sensitive to small sample sizes than Bayesian models: with small sample sizes, Bayesian models return estimates with greater uncertainty, which is the more conservative behaviour.

  • While very simple models will return very similar estimates in frequentist and Bayesian statistics, more complex models often fail to converge when fitted with frequentist packages like lme4, especially with less-than-adequate sample sizes. Bayesian regression models always converge, while frequentist ones do not always do so.

  • Frequentist regression models require as much work as Bayesian ones, although it is common practice to skip necessary steps when fitting the former, which gives the impression that it is a quicker process. Even setting aside the time needed to run Markov Chain Monte Carlo (MCMC) chains in Bayesian regressions, with frequentist regressions you still have to perform robust prospective power analyses and post-hoc model checks.

  • With Bayesian models, you can reuse posterior distributions from previous work and include that knowledge as priors in your Bayesian analysis (see the sketch after this list). This feature effectively speeds up the discovery process (you converge on the true value of interest faster). You can embed previous knowledge in Bayesian models, while you can’t in frequentist ones.
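Here is a minimal sketch of encoding previous knowledge as a prior with the brms package. The data set and the numbers (a previous study estimating the effect of interest at 25 ms with a standard error of 10 ms) are entirely hypothetical:

```r
library(brms)

# Hypothetical data: reaction times (ms) for two groups (numbers are made up).
set.seed(42)
d <- data.frame(
  rt = c(rnorm(30, 300, 50), rnorm(30, 325, 50)),
  group = rep(c(0, 1), each = 30)
)

# Encode the previous study's posterior (25 ms, SE 10 ms) as a prior on the
# population-level slope(s).
priors <- set_prior("normal(25, 10)", class = "b")

fit <- brm(rt ~ group, data = d, family = gaussian(), prior = priors)
summary(fit)
```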

Conceptual reasons
  • Frequentist regression models cannot provide evidence for a difference between groups, only evidence to reject the null (i.e. nil) hypothesis.

  • A frequentist Confidence Interval (CI) can only tell us that, if we ran the same study many times, n percent of the CIs would include the real value (but we don’t know whether the CI we got in our study is one of the 100 − n percent of CIs that DO NOT CONTAIN the real value; see the simulation sketch after this list). On the other hand, a Bayesian Credible Interval (CrI) ALWAYS tells us that the real value lies within a certain range with n percent probability. (Of course, both statements are conditional on the model and the data, for frequentist and Bayesian models alike.) So frequentist models really just give you a point estimate, while Bayesian models give you a range of values.

  • With Bayesian regressions you can compare any hypotheses, not just the null vs the alternative. (Although you can use information criteria with frequentist models too.)

  • Frequentist regression models are based on an imaginary set of experiments that you never actually carry out.

  • Bayesian regression models will converge towards the true value in the long run. Frequentist models do not.
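To illustrate the point about Confidence Intervals made above, here is a small simulation sketch in R (the true mean of 100 and the sampling set-up are made up for illustration):

```r
# A sketch of what a frequentist 95% CI actually guarantees: across many
# imaginary replications of a study, about 95% of the CIs contain the true
# value, but any single CI either does or does not contain it.
set.seed(42)
true_mean <- 100
n_reps <- 1000

contains_true <- replicate(n_reps, {
  x <- rnorm(30, mean = true_mean, sd = 15)  # one imaginary replication
  ci <- t.test(x)$conf.int                   # its 95% CI
  ci[1] < true_mean && true_mean < ci[2]
})

mean(contains_true)  # close to 0.95: the long-run coverage of the procedure
```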

Of course, there are merits to fitting frequentist models, for example in contexts like corporate decision-making, but you will still have to do a lot of work. The main conceptual difference, then, is that frequentist and Bayesian regression models answer very different questions, and as (basic) researchers we are generally interested in questions that the latter can answer and the former cannot.