Language and big data

Author

Stefano Coretta

In Week 3 we discussed minoritised and endangered languages and how small data has important consequences for how we approach quantitative methods.

This week’s discussion is still about sample size, but at the other extreme: the use of big data for language technology.

Prompts
  • What are the ethical considerations when working in language technology?

  • What limitations does big data have?

  • Is it possible to develop language technology without a deep descriptive understanding of the language and language community?

  • What role does uncertainty play in the context of developing language technology?