1 Schedule overview
| Week | Topic | Slides | Tutorial | Assessment |
|---|---|---|---|---|
| 1 | Quantitative methods and uncertainty | Slides | Tutorial | |
| 2 | Data wrangling | Slides | Tutorial | |
| 3 | Data visualisation | Slides | Tutorial | |
| 4 | Statistical modeling basics | Slides | Tutorial | |
| 5 | Categorical predictors | Slides | Tutorial | F1 |
| 6 | Catch-up week (no classes) | | | |
| 7 | Binary outcomes | Slides | Tutorial | |
| 8 | Multiple predictors and interactions | Slides | Tutorial | S1 |
| 9 | Continuous predictors | Slides | Tutorial | |
| 10 | Research process: an overview | Slides | Tutorial | F2 |
| 11 | Obtaining p-values (optional) | Slides | Tutorial | |
| 12 | | | | S2 |
2 Weekly schedule
2.1 Week 1: Quantitative methods and uncertainty
Questions
- What is quantitative data analysis?
- What is the inference process?
- How can we talk about uncertainty and variability?
- What are the limits of quantitative methods?
Skills
- Think critically about statistics, uncertainty and variability.
- Use R to perform simple calculations.
- Master the basics of the programming language R.
- Use RStudio.
Course website
Carefully read the homepage.
Familiarise yourself with this Course content page (note that the materials will be updated throughout the course).
Intake form
- You must complete the intake form before coming to the Tuesday lecture.
- The link to the form can be found on the Learn website.
Install R and RStudio
- For this course, you need to install both R and RStudio.
- NOTE: If you installed either R or RStudio before January 2023, please make sure you delete both R and RStudio from your laptop.
- Please follow the instructions on the Setup page.
Main textbooks
- Statistics for Linguists with R, by Bodo Winter (S4LR) Ch. 1. [via library]
- R for Data Science (R4DS) Ch. 1, Ch. 2. [online book]
- Statistical (Re)thinking, by Richard McElreath (SReT), Ch. 1. [via library]
From the lecture
- Ellis and Levy 2008. Framework of Problem-Based Research: A Guide for Novice Researchers on the Development of a Research-Worthy Problem
- Silberzahn et al. 2018. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results
- Coretta et al. 2023. Multidimensional signals and analytic flexibility: Estimating degrees of freedom in human speech analyses
- Cumming 2014. The New Statistics: Why and How
- Kruschke and Liddell 2018. The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective
Replication
- 👉 Assessing the replication landscape in experimental linguistics.
- The Stark realities of reproducible statistically orientated sociological research: Some newer rules of the sociological method.
Other
- Methods as theory.
- Molnar 2022. Modeling Mindsets: The many cultures of learning from data.
- Darwin Holmes 2020. Researcher Positionality - A Consideration of Its Influence and Place in Qualitative Research - A New Researcher Guide
- Jafar 2018. What is positionality and should it be expressed in quantitative studies?
2.2 Week 2: Data wrangling
Questions
- What are the types of statistical variables?
- Which summary measures are appropriate for which types of variables?
- What are common measures of central tendency?
- What are common measures of dispersion?
Skills
- Organise files efficiently.
- Import tabular data in R.
- Obtain mean, median, mode, range and standard deviation.
- Use R scripts to save and reuse code.
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
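The summary measures named in the Skills list for this week can be tried out in base R. The sketch below is only an illustration: the vector `rt` and its values are made up and are not part of the workshop files.

```r
# Hypothetical reaction times in milliseconds (made-up values for illustration).
rt <- c(420, 515, 515, 480, 610, 390, 515)

rt_mean   <- mean(rt)    # arithmetic mean
rt_median <- median(rt)  # middle value of the sorted data
rt_range  <- range(rt)   # smallest and largest value
rt_sd     <- sd(rt)      # sample standard deviation (n - 1 denominator)

# Base R has no function for the statistical mode (mode() returns the storage
# type of an object), so count the values and pick the most frequent one.
counts  <- table(rt)
rt_mode <- as.numeric(names(counts)[which.max(counts)])

print(c(mean = rt_mean, median = rt_median, mode = rt_mode, sd = rt_sd))
print(rt_range)
```

Note that `which.max()` returns only the first maximum, so this approach reports a single mode even when several values are tied.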
2.3 Week 3: Data visualisation
Questions
- What are the principles of good data visualisation?
- What are the main components of a plot?
- What are the appropriate plots for different types of data?
- How can we visualise uncertainty?
Skills
- Create common types of plots with ggplot2.
- Use colour and shape to effectively convey meaning.
- Describe a plot in writing and comment on observable patterns.
- Create styled HTML reports.
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
Main textbooks
From the lecture
- Spiegelhalter 2020. The Art of Statistics: Learning from Data.
Other
- Gabry et al 2019. Visualization in Bayesian workflow.
- Politzer-Ahles and Piccinini. On visualizing phonetic data from repeated measures experiments with multiple random effects.
- Fundamentals of Data Visualisation.
- Data viz catalogues
- Tutorials
- Colour
- Caveats
2.4 Week 4: Statistical modeling basics
Questions
- What are probability distributions?
- How can we describe probability distributions with statistical parameters?
- What are the frequentist and Bayesian views of statistical parameters?
- How can we estimate parameters using statistical models?
Skills
- Transform data by creating new columns (mutate) and filtering based on specific values (filter).
- Use logical operators to transform data.
- Fit a statistical model to estimate the mean and standard deviation of a Gaussian variable with brm().
- Interpret the summary of the model and understand the meaning of the reported estimates.
- [optional] The Golem of Prague (video lecture of SReT Ch 1).
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
Main textbooks
- R4DS Ch. 2. [online book]
- ggplot2 documentation.
- S4LR Ch 3. [via library]
- SReT Ch 2, sparingly (we have not covered everything in the chapter yet). [via library]
Other
The following resources will be helpful throughout the course. Note that they cover aspects we have not yet discussed (some will come up in the following weeks, others will not due to time), but do bookmark them because they will be valuable when you are working on your dissertation.
- Linear Models and Mixed Models with R tutorials (1 and 2) by Bodo Winter (author of S4LR) for a general overview of the type of models we focus on in this course.
- One Thousand and One names: table with naming conventions for different types of linear models.
- Linear Models: A cheat-sheet: use this to find out which building blocks you need for your linear model.
2.5 Week 5: Categorical predictors
Due on Thursday 19 October at noon.
Formative assessment 1 requires you to complete a few guided exercises of the type that will be included in Summative 1.
Find instructions and data here: https://github.com/uoelel/qml-f1
Questions
- How do we model variables using categorical predictors?
- What are the most common coding systems for categorical predictors?
- How do we interpret the model output when there are categorical predictors?
- How can we quickly check model goodness?
Skills
- Master contrast coding in R for categorical predictors.
- Understand treatment coding.
- Fit, interpret and plot models with a categorical predictor.
- Report model specification and results.
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
Main textbooks
- R4DS Ch. 17. [online book]
- S4LR Ch 7. [via library]
- SReT Sec 5.3. [via library]
Other
- Factors, coding and contrasts: blog post on factors in linear models. It also discusses interactions, which we will cover in Weeks 8-9.
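As a small illustration of the treatment coding mentioned in the skills for this week, the base R sketch below inspects the contrast matrix of a factor. The factor `condition` and its level names are hypothetical and chosen only for the example.

```r
# Hypothetical three-level factor (made-up condition labels).
condition <- factor(
  c("control", "low", "high", "control", "high"),
  levels = c("control", "low", "high")
)

# For unordered factors R uses treatment coding by default (unless
# options("contrasts") has been changed): the first level is the reference
# level and every other level gets its own 0/1 indicator column.
coding <- contrasts(condition)
print(coding)

# relevel() changes the reference level, and with it what the model
# intercept stands for.
condition_high_ref <- relevel(condition, ref = "high")
print(levels(condition_high_ref))
```

With treatment coding, the intercept of a model corresponds to the reference level, and each remaining coefficient is the difference between one level and the reference.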
2.6 Week 6: Catch-up Week
There is no homework as such, so take the time to revise the materials and/or catch up with the previous weeks’ materials.
There will be no classes.
2.7 Week 7: Binary outcomes
Questions
- How can we visualise proportions of binary outcomes (yes/no, correct/incorrect, …)?
- Which distribution do binary outcomes follow?
- What is the relationship between probabilities and log-odds?
- How do we interpret log-odds and odds?
Skills
- Plot binary data as proportions in ggplot2.
- Pivot data from wide to long with tidyr.
- Fit, interpret and plot linear models with binary outcome variables, using the Bernoulli distribution family.
- Convert between log-odds, odds and probabilities.
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
Main textbooks
- R4DS Ch. 6. [online book]
- S4LR Ch 12. [via library]
- SReT Ch 11. [via library]
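The conversion between probabilities, odds and log-odds listed in the skills for this week can be sketched in base R as follows. The probability `p` is an arbitrary example value; `qlogis()` and `plogis()` are the base R logit and inverse-logit functions.

```r
# An arbitrary example probability (not taken from the course data).
p <- 0.8

# Probability -> odds -> log-odds.
odds     <- p / (1 - p)   # about 4, i.e. 4 to 1
log_odds <- log(odds)     # the same value as qlogis(p)

# Log-odds -> odds -> probability.
odds_back <- exp(log_odds)
p_back    <- odds_back / (1 + odds_back)  # the same value as plogis(log_odds)

print(c(p = p, odds = odds, log_odds = log_odds, p_back = p_back))

# A log-odds of 0 corresponds to a probability of 0.5 (odds of 1).
print(plogis(0))
```

Positive log-odds therefore correspond to probabilities above 0.5 and negative log-odds to probabilities below 0.5.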
2.8 Week 8: Multiple predictors and interactions
Due on Thursday 9 November at noon
The first summative contains a series of guided exercises that cover things done in Weeks 1 to 7.
You can find the instructions and data for the first summative here: https://github.com/uoelel/qml-s1/.
Questions
- What is a factorial design?
- How do we estimate and interpret the effects of multiple predictors?
- How do we deal with situations where one predictor’s effect differs depending on the value of the other predictor?
- How can such interactions between predictors be built into our models?
- How do we interpret model estimates of interactions?
Skills
- Run and interpret models with multiple predictors.
- Interpret interactions between two predictors.
- Plot posterior and conditional probabilities from models with interactions.
- Practise transforming and back-transforming variables.
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
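To make the idea of building an interaction into a model more concrete, the base R sketch below shows which columns a formula with an interaction creates for a 2 × 2 factorial design. The predictors `group` and `condition` and their levels are hypothetical, and no model is fitted here.

```r
# Hypothetical 2 x 2 factorial design: every combination of two
# two-level predictors (made-up names and levels).
design <- expand.grid(
  group     = factor(c("A", "B")),
  condition = factor(c("x", "y"))
)

# `group * condition` expands to both main effects plus their interaction:
# group + condition + group:condition.
X <- model.matrix(~ group * condition, data = design)
print(colnames(X))
print(X)
```

The interaction column is 1 only for the combination of the two non-reference levels, so its coefficient captures how much the effect of one predictor changes across the levels of the other.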
2.9 Week 9: Continuous predictors and interactions
Questions
- How do we model predictors that aren’t categorical, but continuous?
- How do we interpret model estimates for continuous predictors?
- How do we fit and interpret interactions involving continuous predictors?
Skills
- Centre continuous predictors.
- Run and interpret models with continuous predictors.
- Interpret interactions that are categorical * continuous (in the lecture) and continuous * continuous (in the tutorial).
- Lecture slides.
- Workshop tutorial.
- Workshop files (right-click and download):
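Centring a continuous predictor, the first skill listed for this week, means subtracting its mean so that a value of 0 corresponds to the average observation. A minimal base R sketch, assuming a made-up predictor `freq`:

```r
# Hypothetical continuous predictor (made-up values for illustration).
freq <- c(2.1, 3.4, 5.0, 4.2, 3.3)

# Centre by subtracting the mean; the centred variable has mean 0.
freq_c <- freq - mean(freq)
print(mean(freq_c))

# scale() does the same; scale = FALSE centres without dividing by the
# standard deviation. as.numeric() drops the matrix structure it returns.
freq_c2 <- as.numeric(scale(freq, center = TRUE, scale = FALSE))
print(all.equal(freq_c, freq_c2))
```

After centring, the intercept of a linear model refers to the expected outcome at the mean of the predictor rather than at a predictor value of 0.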
2.10 Week 10: Research process - An overview
Due on Thursday 23 November at noon.
F2 requires you to read, plot and model data. Summative 2 will have the same format.
Please read the following before coming to class on Wednesday (there will be no lecture on Tuesday).
- Stroop Effect in Language
- Half a Century of Research on the Stroop Effect: An Integrative Review, pp 163–165 plus one extra section of your choice.
- Lecture slides.
- Workshop tutorial.
- Workshop files: https://github.com/uoelel/qml-stroop.
The following resources are useful summaries of conceptual, practical and terminological aspects of linear models in general.
- Linear Models Illustrated: a Shiny web app that illustrates linear models. Especially helpful to understand interactions.
- One Thousand and One names: table with naming conventions for different types of linear models.
- Linear Models: A cheat-sheet: use this to find out which building blocks you need for your linear model.
2.11 Week 11: Obtaining p-values (Optional)
- Motulsky 2014, Common misconceptions about data analysis and statistics.
- Tressoldi et al 2015, The pervasive avoidance of prospective statistical power: Major consequences and practical solutions.
- Cassidy et al 2019, Failing grade: 89 per-cent of introduction to psychology textbooks that define/explain statistical significance do so incorrectly.
- Gigerenzer 2004, Mindless statistics.
- Wagenmakers 2007. A practical solution to the pervasive problems of p-values.
- Interpreting (frequentist) Confidence Intervals.
2.12 Week 12
Due on Thursday 7 December at noon
In the second summative assessment, you will:
- Select a dataset from a list and its associated research questions.
- Analyse the data using one linear model.
- Write a report about the data, the model, and your findings.
You can find the instructions and data for the second summative here: https://github.com/uoelel/qml-s2/.