QML - Week 9

Open Research

Stefano Coretta

What’s next?

Case studies:

  1. Recently published paper: Amoruso et al. (2025).
  2. Planned study: Intrinsic Vowel Duration in English.

Case study 1: Amoruso et al. (2025)

  • Amoruso et al. (2025), published in Nature Aging in November: “Multilingualism protects against accelerated aging”

  • They argue that being multilingual reduces the effects of ageing.

Specifically, [people] speaking one foreign language were 1.3 times less likely [to show accelerated ageing] (OR = 0.77, 95% CI: 0.71–0.83), two foreign languages 1.96 times less likely (OR = 0.51, 95% CI: 0.47–0.55), and three or more foreign languages 1.56 times less likely (OR = 0.64, 95% CI: 0.59–0.69). These findings underscore the protective effect of multilingualism on aging trajectories.
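
The “times less likely” figures in the passage above are the reciprocals of the reported odds ratios. A quick check in R, using only the numbers quoted above (the object names are illustrative):

```r
# Odds ratios reported for one, two, and three or more foreign languages.
or <- c(one = 0.77, two = 0.51, three_plus = 0.64)

# "Times less likely" corresponds to the reciprocal of each odds ratio.
times_less_likely <- 1 / or
round(times_less_likely, 2)
```

Strictly speaking, an odds ratio compares odds rather than probabilities, so “times less likely” is a loose reading of these numbers.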

Causal narrative but no causal inference

While this approach provides evidence of temporal precedence, it does not establish causality. Proper causal inference would require experimental, quasi-experimental or intervention-based designs.

Problematic for two reasons:

  • “protects” is causal.

  • Causal inference is possible with observational data when using causal inference approaches, such as Directed Acyclic Graphs.

Reactions on social media

Doing non-causal inference (and being explicit about it), yet using a causal word as second word in the title. If you pay Nature € 10.690, they will publish this in Nature Ageing. I can tell you what I think of that for free. www.nature.com/articles/s43...

— Casper Albers 🟥 (@casperalbers.nl) 11 November 2025 at 07:58

Reactions on social media

This slide unfortunately generalizes well 🥲

— Julia M. Rohrer (@dingdingpeng.the100.ci) 11 November 2025 at 09:25

Reactions on social media

I personally think this is a "harder problem" than we care to admit. So, are you a (social) scientist struggling with this situation? A break between what you _want_ to study (a causal process) and what you feel you _can_ credibly study (a correlation)? Here are some readings that might help. 👇

— Dan de Kadt (@dandekadt.bsky.social) 11 November 2025 at 11:24

Missing variables

  • Childhood socio-economic background (parental education, early-life deprivation)
  • Lifetime SES (income, wealth, occupational class)
  • Individual migration history (country of birth, age at migration, duration of residence)
  • Detailed multilingualism measures (age of acquisition, proficiency, frequency of use, context of use)
  • Genetic predispositions (e.g., APOE, family history of dementia)
  • Early-life health factors (childhood illnesses, nutrition, developmental conditions)
  • Additional health behaviours (smoking, diet, exercise, alcohol misuse, social networks)
  • Survey/measurement-related factors (selection bias, attrition, exposure misclassification)
  • Time-varying confounders (changes in SES, language use, and health over the life course)

Case study 2: intrinsic vowel duration

Coretta (2025): Directed Acyclic Graphs

Figure 1: DAGs representing three explanatory models of intrinsic vowel duration.

  • Coretta (2025) suggests C: articulatory distance + (possibly) vowel-specific target.

Limitations

Future study

  • Direct articulatory data.

  • Revised Directed Acyclic Graphs.

  • Framing in light of alternative explanations.

Explanatory models

  1. Duration as articulatory distance.

  2. Duration as something other than articulatory distance (possible processes: categorical specification or an optimisation process).

  3. Both articulatory distance and something other than articulatory distance (possible processes: categorical specification or an optimisation process).

  • Differentiate between categorical specification and optimisation.

Revised DAG

DAGitty demo.

Adjustment sets

To estimate the direct causal effect of vowel and speech rate on distance:

  • No adjustment needed.

To estimate the direct causal effect of distance on peak velocity:

  • vowel, speech rate.

To estimate the direct causal effect of distance on duration:

  • vowel, speech rate, peak velocity.
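
Adjustment sets like these can be derived in R with the dagitty package. The sketch below uses a hypothetical DAG reconstructed only from the adjustment sets listed above; the node names and arrows are assumptions, and the actual revised DAG may differ.

```r
library(dagitty)

# Hypothetical DAG, reconstructed only from the adjustment sets listed above;
# the actual revised DAG may contain other nodes or arrows.
ivd_dag <- dagitty("dag {
  vowel -> dist
  speech_rate -> dist
  vowel -> pvel
  speech_rate -> pvel
  dist -> pvel
  vowel -> dur
  speech_rate -> dur
  dist -> dur
  pvel -> dur
}")

# Minimal adjustment sets for the direct effect of each exposure on each outcome.
adjustmentSets(ivd_dag, exposure = "vowel", outcome = "dist", effect = "direct")
adjustmentSets(ivd_dag, exposure = "dist", outcome = "pvel", effect = "direct")
adjustmentSets(ivd_dag, exposure = "dist", outcome = "dur", effect = "direct")
```

Under this assumed DAG, peak velocity has to be adjusted for when estimating the direct effect of distance on duration because it lies on the path from distance to duration.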

Power law and log-log form

\[ \begin{aligned} V_p & = k \cdot D^\alpha\\ \ln(V_p) & = \ln(k) + \alpha \cdot \ln(D) \end{aligned} \]

  • Peak velocity \(V_p\) is equal to the product of a constant \(k\) and distance \(D\) raised to the power of \(\alpha\).

  • Equivalently, logged peak velocity is equal to logged \(k\) plus the product of \(\alpha\) and logged distance.

\[ y = \beta_0 + \beta_1 \cdot x \]

In R: log(pvel) ~ 1 + log(dist)
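
To see how the log-log form of the power law maps onto a linear regression, here is a minimal simulation sketch. The values of `k` and `alpha`, the noise level, and the range of distances are made up for illustration and are not estimates from any study.

```r
set.seed(9)

# Hypothetical "true" values for the constant and the exponent.
k <- 2
alpha <- 0.8

# Simulate distances and peak velocities that follow the power law,
# with multiplicative noise (additive on the log scale).
n <- 200
dist <- runif(n, min = 1, max = 20)
pvel <- k * dist^alpha * exp(rnorm(n, mean = 0, sd = 0.05))

# Linear regression on the log-log scale:
# the intercept estimates ln(k) and the slope estimates alpha.
fit <- lm(log(pvel) ~ 1 + log(dist))
coef(fit)
```

The intercept should come out close to `log(k)` and the slope close to `alpha`, which mirrors the correspondence between \(\beta_0\), \(\beta_1\) and \(\ln(k)\), \(\alpha\).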

Multivariate regression model

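library(brms)

# One sub-model per outcome in the DAG, combined with `+` into a
# multivariate formula (bf() on its own takes a single main formula).
ivd_bf <-
  bf(dist_log ~ vowel + speech_rate_log + (vowel | speaker)) +
  bf(pvel_log ~ dist_log * vowel + speech_rate_log + (dist_log * vowel | speaker)) +
  bf(
    dur_log ~ dist_log * vowel + pvel_log * vowel + speech_rate_log +
      (dist_log * vowel + pvel_log * vowel | speaker)
  ) +
  # Assumption: no residual correlations between the outcomes.
  set_rescor(FALSE)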

Then

ivd_bm <- brm(
  ivd_bf,
  family = gaussian,
  data = ...
)

Good luck!

You learned a lot, but there is a lot more to learn!

References

Amoruso, Lucia, Hernan Hernandez, Hernando Santamaria-Garcia, Sebastian Moguilner, Agustina Legaz, Pavel Prado, Jhosmary Cuadros, et al. 2025. “Multilingualism Protects Against Accelerated Aging in Cross-Sectional and Longitudinal Analyses of 27 European Countries.” Nature Aging 5 (11): 2340–54. https://doi.org/10.1038/s43587-025-01000-2.
Bermúdez-Otero, Ricardo. 2010. “Morphologically Conditioned Phonetics? Not Proven.” On Linguistic Interfaces II, Belfast 2.
Coretta, Stefano. 2025. “Is ‘Intrinsic Vowel Duration’ Bio-Mechanical or More? Preliminary Results from Northwestern Italian.” Language and Speech.
Elie, Benjamin, David N. Lee, and Alice Turk. 2023. “Modeling Trajectories of Human Speech Articulators Using General Tau Theory.” Speech Communication 151: 24–38. https://doi.org/10.1016/j.specom.2023.04.004.
Elie, Benjamin, Juraj Šimko, and Alice Turk. 2024. “Interspeech 2024.” In, 3610–14. ISCA. https://doi.org/10.21437/Interspeech.2024-822.
Elie, Benjamin, Juraj Šimko, and Alice Turk. 2024. “Optimization-Based Planning of Speech Articulation Using General Tau Theory.” Speech Communication 160: 103083. https://doi.org/10.1016/j.specom.2024.103083.
Lieberman, Philip, and Cathy Kubaska. 1979. “Intrinsic Vowel Duration and Formant Frequencies: Data from Speech Acquisition.” The Journal of the Acoustical Society of America 65 (S1): S34–S34. https://doi.org/10.1121/1.2017216.
Toivonen, Ida, Lev Blumenfeld, Andrea Gormley, Leah Hoiting, John Logan, Nalini Ramlakhan, and Adam Stone. 2015. “Vowel Height and Duration.” In, edited by Ulrike Steindl, Thomas Borer, Huilin Fang, Alfredo García Pardo, Peter Guekguezian, Brian Hsu, Charlie O’Hara, and Iris Chuoying Ouyang, 32:64–71. Somerville, MA: Cascadilla Proceedings Project.
Turk, Alice, Melanie Matthies, Joseph Perkell, and Mario Svirsky. 1994. “On Explaining Intrinsic Vowel Duration Differences: An Electromagnetic Midsagittal Articulometer (EMMA) Study.” The Journal of the Acoustical Society of America 96: 3326–3326. https://doi.org/10.1121/1.410723.
Turk, Alice, and Stefanie Shattuck-Hufnagel. 2020a. Speech Timing: Implications for Theories of Phonology, Phonetics, and Speech Motor Control. Oxford Studies in Phonology and Phonetics. Oxford: Oxford University Press.
———. 2020b. “Timing Evidence for Symbolic Phonological Representations and Phonology-Extrinsic Timing in Speech Production.” Frontiers in Psychology 10 (January): 2952. https://doi.org/10.3389/fpsyg.2019.02952.