Level of study
Master 2
ECTS
3 credits
Contact hours
18 hours
Period of the year
Semester 3
Description
This course helps students choose and apply statistical methods suited to their data. After revising the basics of descriptive and inferential statistics, we cover the distribution families encountered in linguistic research, corresponding to numeric, binary categorical, count, or Likert-scale response variables. We explore fixed and random effects, paying attention to when the latter should be used. We then work with various datasets, describing and exploring them and analysing each with the tool most appropriate for testing specific hypotheses, following a confirmatory data analysis approach. An extension to exploratory data analysis introduces data reduction techniques and machine learning methods suitable for various distribution families. Finally, we move to corpus data analysis. We explore and manipulate written corpora by identifying types, tokens, and stop words, identifying compound words and n-grams, applying PoS-tagging (part-of-speech tagging), and building document and feature co-occurrence matrices (FCM). We then apply statistical analyses adapted to textual data: simple and relative frequency analyses, lexical diversity, collocations, word clouds, network analysis, and finally Poisson regression.
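As a rough illustration of the corpus steps named above (tokens, types, stop words, n-grams, relative frequency, and lexical diversity), here is a minimal Python sketch. The toy text, the stop-word list, and the helper names `tokenize` and `ngrams` are assumptions made for this example only; they are not course material, and the type-token ratio is just one simple measure of lexical diversity.

```python
import re
from collections import Counter

# Toy text and stop-word list: illustrative assumptions, not course data.
TEXT = "The cat sat on the mat. The dog sat on the log."
STOP_WORDS = {"the", "on"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize(TEXT)          # every running word (tokens)
types = set(tokens)              # distinct word forms (types)

# Tokens that remain once stop words are removed.
content_tokens = [t for t in tokens if t not in STOP_WORDS]

# Simple (absolute) and relative frequencies per word form.
freq = Counter(tokens)
rel_freq = {word: count / len(tokens) for word, count in freq.items()}

# Type-token ratio: number of types divided by number of tokens.
ttr = len(types) / len(tokens)

# Bigram counts, a starting point for collocation analysis.
bigram_counts = Counter(ngrams(tokens, 2))

print("tokens:", len(tokens), "types:", len(types))
print("type-token ratio:", round(ttr, 3))
print("most frequent words:", freq.most_common(3))
print("most frequent bigrams:", bigram_counts.most_common(2))
```

The type-token ratio is sensitive to text length, so comparisons between texts of different sizes would normally call for a length-corrected measure.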
Objectives
On successful completion of this course, students should be able to:
- Identify the appropriate type of statistical design for a range of datasets
- Explore their dataset and carry out suitable descriptive and inferential statistical analyses
- Understand the difference between confirmatory and exploratory approaches to data analysis
- Critically evaluate the approaches used in statistical designs
- Explore written corpora and quantify patterns in the data using the techniques covered in class
Teaching hours
- Advanced Statistics: Lectures (Cours Magistral), 9h
- Advanced Statistics: Tutorials (Travaux Dirigés), 9h
Prerequisites
Students need a basic knowledge of statistical methods.
Last updated 4 September 2025