content – QML

1 Schedule overview

Week	Topic	Slides	Tutorial	Assessment
1	Quantitative methods and uncertainty	Slides	Tutorial
2	Data wrangling	Slides	Tutorial
3	Data visualisation	Slides	Tutorial
4	Statistical modeling basics	Slides	Tutorial
5	Categorical predictors	Slides	Tutorial	F1
6	Catch up	No classes
7	Binary outcomes	Slides	Tutorial
8	Multiple predictors and interactions	Slides	Tutorial	S1
9	Continuous predictors	Slides	Tutorial
10	Research process: an overview	Slides	Tutorial	F2
11	Obtaining p-values (optional)	Slides	Tutorial
12				S2

2 Weekly schedule

2.1 Week 1: Quantitative methods and uncertainty

Learning Objectives

Questions

What is quantitative data analysis?
What is the inference process?
How can we talk about uncertainty and variability?
What are the limits of quantitative methods?

Skills

Think critically about statistics, uncertainty and variability.
Use R to perform simple calculations.
Master the basics of the programming language R.
Use RStudio.
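
As a taste of the "simple calculations" skill, here is a minimal sketch of arithmetic in the R console. All values and variable names (n_words, durations_ms, and so on) are made up for illustration and are not taken from the course materials.

```r
# Basic arithmetic: R follows the usual order of operations.
3 + 2 * 4        # multiplication happens before addition
(3 + 2) * 4      # parentheses change the order
2^3              # exponentiation
10 / 4           # division returns a decimal number

# Store a result in a variable with the assignment arrow.
n_words <- 12         # hypothetical value
n_sentences <- 3      # hypothetical value
words_per_sentence <- n_words / n_sentences
words_per_sentence

# Vectors hold several values; arithmetic is applied element by element.
durations_ms <- c(120, 95, 143)   # made-up durations in milliseconds
durations_s <- durations_ms / 1000
durations_s
```

You can type these lines one at a time in the RStudio console, or save them in a script and run them from there.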

Homework

Course website

Carefully read the homepage.
Familiarise yourself with this Course content page (note that the materials will be updated throughout the course).

Intake form

You must complete the intake form before coming to the Tuesday lecture.
The link to the form can be found on the Learn website.

Install R and RStudio

For this course, you need to install both R and RStudio.
NOTE: If you have installed either R or RStudio prior to January 2023, please make sure you delete both R and RStudio from your laptop.
Please follow the instructions on the Setup page.

Materials

Suggested readings

Main textbooks

Statistics for Linguists with R, by Bodo Winter (S4LR) Ch. 1. [via library]
R for Data Science (R4DS) Ch. 1, Ch. 2. [online book]
Statistical Rethinking, by Richard McElreath (SReT), Ch. 1. [via library]

From the lecture

Ellis and Levy 2008. Framework of Problem-Based Research: A Guide for Novice Researchers on the Development of a Research-Worthy Problem
Silberzahn et al. 2018. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results
Coretta et al. 2023. Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human speech analyses
Cumming 2014. The New Statistics: Why and How
Kruschke and Liddell 2018. The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective

Replication

Other

Methods as theory.
Molnar 2022. Modeling Mindsets: The many cultures of learning from data.
Darwin Holmes 2020. Researcher Positionality - A Consideration of Its Influence and Place in Qualitative Research - A New Researcher Guide
Jafar 2018. What is positionality and should it be expressed in quantitative studies?

2.2 Week 2: Data wrangling

Learning Objectives

Questions

What are the types of statistical variables?
Which summary measures are appropriate for which types of variables?
What are common measures of central tendency?
What are common measures of dispersion?

Skills

Organise files efficiently.
Import tabular data in R.
Obtain mean, median, mode, range and standard deviation.
Use R scripts to save and reuse code.
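
The summary measures listed above can be sketched in base R as follows. The reaction times are invented for illustration, and stat_mode() is a hypothetical helper written for this example: base R has no built-in function for the statistical mode.

```r
# Made-up reaction times in milliseconds (illustration only).
rt <- c(310, 290, 450, 310, 380, 520, 310)

mean(rt)      # arithmetic mean
median(rt)    # middle value of the sorted data
range(rt)     # minimum and maximum
sd(rt)        # sample standard deviation

# Base R's mode() returns the storage type of an object, not the
# most frequent value, so we define a small hypothetical helper.
stat_mode <- function(x) {
  counts <- table(x)
  # Return every value tied for the highest count.
  as.numeric(names(counts)[counts == max(counts)])
}
stat_mode(rt)
```

Saving these lines in an R script lets you rerun the same summaries later, for example after importing a different data set.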

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- shallow.csv

Suggested readings

Main textbooks

S4LR Ch. 3. [via library]
R4DS Ch. 3 and Ch. 4. [online book]

2.3 Week 3: Data visualisation

Learning Objectives

Questions

What are the principles of good data visualisation?
What are the main components of a plot?
Which plots are appropriate for different types of data?
How can we visualise uncertainty?

Skills

Create common types of plots with ggplot2.
Use colour and shape to effectively convey meaning.
Describe a plot in writing and comment on observable patterns.
Create styled HTML reports.

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- polite.csv
- glot_status.rds

Suggested readings

Main textbooks

R4DS Ch. 2. [online book]
ggplot2 documentation.

From the lecture

Spiegelhalter 2020. The Art of Statistics: Learning from Data.

Other

Gabry et al 2019. Visualization in Bayesian workflow.
Politzer-Ahles and Piccini. On visualizing phonetic data from repeated measures experiments with multiple random effects.
Fundamentals of Data Visualisation.
Data viz catalogues
Tutorials
Colour
Caveats

2.4 Week 4: Statistical modeling basics

Learning Objectives

Questions

What are probability distributions?
How can we describe probability distributions with statistical parameters?
What are the frequentist and Bayesian views of statistical parameters?
How can we estimate parameters using statistical models?

Skills

Transform data by creating new columns (mutate) and filtering based on specific values (filter).
Use logical operators to transform data.
Fit a statistical model to estimate the mean and standard deviation of a Gaussian variable with brm().
Interpret the summary of the model and understand the meaning of the reported estimates.

Materials

[optional] The Golem of Prague (video lecture of SReT Ch 1).
Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- alb_vot.csv

Suggested readings

Main textbooks

R4DS Ch. 2. [online book]
ggplot2 documentation.
S4LR Ch 3. [via library]
SReT Ch 2, sparingly (we have not covered everything in the chapter yet). [via library]

Other

The following resources will be helpful throughout the course. Note that they cover aspects we have not yet discussed (some will come up in the following weeks, others won’t due to time constraints), but do bookmark them because they will be valuable when you work on your dissertation.

Linear Models and Mixed Models with R tutorials (1 and 2) by Bodo Winter (author of S4LR) for a general overview of the type of models we focus on in this course.
One Thousand and One names: table with naming conventions for different types of linear models.
Linear Models: A cheat-sheet: use this to find out which building blocks you need for your linear model.

2.5 Week 5: Categorical predictors

Formative assessment 1

DUE on Thu 19 October at noon.
Formative assessment 1 requires you to complete a few guided exercises of the type that will be included in Summative 1.
Find instructions and data here: https://github.com/uoelel/qml-f1

Learning Objectives

Questions

How do we model variables using categorical predictors?
What are the most common coding systems for categorical predictors?
How do we interpret the model output when there are categorical predictors?
How can we quickly check model goodness?

Skills

Master contrast coding in R for categorical predictors.
Understand treatment coding.
Fit, interpret and plot models with a categorical predictor.
Report model specification and results.
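
To make the idea of coding systems concrete, the sketch below inspects and changes the contrasts of a factor in base R. The factor modality and its levels are invented for illustration and are not the columns of the workshop data.

```r
# Made-up categorical predictor with three levels (illustration only).
modality <- factor(c("sight", "smell", "taste", "sight", "taste"))

levels(modality)       # levels are ordered alphabetically by default
contrasts(modality)    # default for unordered factors: treatment coding

# Change the reference level: "taste" becomes the baseline.
modality_taste <- relevel(modality, ref = "taste")
contrasts(modality_taste)

# Switch a copy of the same factor to sum coding.
modality_sum <- modality
contrasts(modality_sum) <- contr.sum(3)
contrasts(modality_sum)
```

With treatment coding, the reference level is the row of zeros in the contrast matrix, and each column compares one other level against it.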

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- senses_valence.csv

Suggested readings

Main textbooks

R4DS Ch. 17. [online book]
S4LR Ch 7. [via library]
SReT Sec 5.3. [via library]

Other

Factors, coding and contrasts: blog post on factors in linear models. It also discusses interactions, which we will cover in Weeks 8-9.

2.6 Week 6: Catch-up Week

Homework

There is no homework as such, so take the time to revise the materials and/or catch up with the previous weeks’ materials.

There will be no classes.

2.7 Week 7: Binary outcomes

Learning Objectives

Questions

How can we visualise proportions of binary outcomes (yes/no, correct/incorrect, …)?
Which distribution do binary outcomes follow?
What is the relationship between probabilities and log-odds?
How do we interpret log-odds and odds?

Skills

Plot binary data as proportions in ggplot2.
Pivot data from wide to long with tidyr.
Fit, interpret and plot linear models with binary outcome variables, using the Bernoulli distribution family.
Convert between log-odds, odds and probabilities.
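
The conversions between probabilities, odds and log-odds can be sketched in base R as follows. The probability 0.8 is a made-up value; qlogis() and plogis() are the base R logit and inverse-logit functions.

```r
p <- 0.8                        # made-up probability of a "yes" response

odds <- p / (1 - p)             # odds = p / (1 - p)
log_odds <- log(odds)           # log-odds (logit) = log of the odds

# Base R provides the same conversions directly.
qlogis(p)                       # probability -> log-odds
plogis(log_odds)                # log-odds -> probability

# Back-transform by hand: p = odds / (1 + odds).
exp(log_odds) / (1 + exp(log_odds))

# A log-odds value of 0 corresponds to a probability of 0.5.
plogis(0)
```

Positive log-odds correspond to probabilities above 0.5 and negative log-odds to probabilities below 0.5.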

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- takete_maluma.txt

Suggested readings

Main textbooks

R4DS Ch. 6. [online book]
S4LR Ch 12. [via library]
SReT Ch 11. [via library]

2.8 Week 8: Multiple predictors and interactions

Summative 1: Week 8 (Thu 9 November at noon)

Due on Thursday 9 November at noon

The first summative contains a series of guided exercises that cover things done in Weeks 1 to 7.

You can find the instructions and data for the first summative here: https://github.com/uoelel/qml-s1/.

Learning Objectives

Questions

What is a factorial design?
How do we estimate and interpret the effects of multiple predictors?
How do we deal with situations in which one predictor’s effect differs depending on the value of the other predictor?
How can such interactions between predictors be built into our models?
How do we interpret model estimates of interactions?

Skills

Run and interpret models with multiple predictors.
Interpret interactions between two predictors.
Plot posterior and conditional probabilities from models with interactions.
Practice transforming and back-transforming variables.
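
One way to see how an interaction is built into a model is to look at the design matrix that R's formula syntax produces. The sketch below uses a hypothetical 2 x 2 factorial design; the predictor names group and condition and their levels are invented for illustration.

```r
# Hypothetical 2 x 2 factorial design (all names are made up).
design <- expand.grid(
  group = c("L1", "L2"),
  condition = c("control", "treatment"),
  stringsAsFactors = TRUE
)

# In a formula, group * condition expands to
# group + condition + group:condition.
X <- model.matrix(~ group * condition, data = design)

colnames(X)   # intercept, two main-effect columns, one interaction column
X             # the interaction column is 1 only for L2 + treatment
```

Under treatment coding, the interaction column is the product of the two main-effect columns, so its coefficient adjusts the prediction only for the cell where both predictors are at their non-reference level.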

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- shallow.csv
- dur-ita-pol.csv

2.9 Week 9: Continuous predictors and interactions

Learning Objectives

Questions

How do we model predictors that aren’t categorical, but continuous?
How do we interpret model estimates for continuous predictors?
How do we fit and interpret interactions involving continuous predictors?

Skills

Centre continuous predictors.
Run and interpret models with continuous predictors.
Interpret interactions that are categorical * continuous (in the lecture) and continuous * continuous (in the tutorial).
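
Centring a continuous predictor means subtracting its mean from every value. A minimal base R sketch, with made-up speech rates and hypothetical variable names:

```r
# Made-up speech rates in syllables per second (illustration only).
speech_rate <- c(4.2, 5.1, 6.3, 5.6, 4.8)

# Centre by subtracting the mean: the centred variable has mean 0.
speech_rate_c <- speech_rate - mean(speech_rate)
mean(speech_rate_c)

# scale() does the same; scale = FALSE skips division by the
# standard deviation, so the values are centred but not standardised.
speech_rate_c2 <- as.numeric(scale(speech_rate, center = TRUE, scale = FALSE))
all.equal(speech_rate_c, speech_rate_c2)
```

After centring, the model intercept refers to the outcome at the mean of the predictor rather than at a predictor value of zero, which is often easier to interpret.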

Materials

Lecture slides.
Workshop tutorial.
Workshop files (right-click and download):
- si.csv

2.10 Week 10: Research process - An overview

Formative assessment 2

DUE on Thursday 23 November at noon.
F2 requires you to read, plot and model data. Summative 2 will have the same format.

Homework

Please read the following before coming to class on Wednesday (there will be no lecture on Tuesday).

Stroop Effect in Language
Half a Century of Research on the Stroop Effect: An Integrative Review, pp 163–165 plus one extra section of your choice.

Materials

Lecture slides.
Workshop tutorial.
Workshop files: https://github.com/uoelel/qml-stroop.

Useful resources

The following resources are useful summaries of conceptual, practical and terminological aspects of linear models in general.

Linear Models Illustrated: a Shiny web app that illustrates linear models. Especially helpful to understand interactions.
One Thousand and One names: table with naming conventions for different types of linear models.
Linear Models: A cheat-sheet: use this to find out which building blocks you need for your linear model.

2.11 Week 11: Obtaining p-values (Optional)

Materials

Suggested readings

Motulsky 2014, Common misconceptions about data analysis and statistics.
Tressoldi et al 2015, The pervasive avoidance of prospective statistical power: Major consequences and practical solutions.
Cassidy et al 2019, Failing grade: 89 per-cent of introduction to psychology textbooks that define/explain statistical significance do so incorrectly.
Gigerenzer 2004, Mindless statistics.
Wagenmakers 2007, A practical solution to the pervasive problems of p-values.
Interpreting (frequentist) Confidence Intervals.

2.12 Week 12

Summative 2: Week 12 (Thu 7 December at noon)

Due on Thursday 7 December at noon

In the second summative assessment, you will:

Select a dataset from a list and its associated research questions.
Analyse the data using one linear model.
Write a report about the data, the model, and your findings.

You can find the instructions and data for the second summative here: https://github.com/uoelel/qml-s2/.