From the latest issue of Language Learning & Technology (Volume 25, Number 3, October 2021), I would like to highlight two contributions that may well be of special interest to those colleagues among us who are into data-driven language learning and/or attended one of the three MOOC editions that the CATAPULT project offered over the past two years.
They are:
‘Thirty years of data-driven learning: Taking stock and charting new directions over time’ by Alex Boulton, ATILF – CNRS & University of Lorraine and Nina Vyatkina, University of Kansas
Abstract
The tools and techniques of corpus linguistics have many uses in language pedagogy, most directly with language teachers and learners searching and using corpora themselves. This is often associated with work by Tim Johns who used the term Data-Driven Learning (DDL) back in 1990. This paper examines the growing body of empirical research in DDL over three decades (1989-2019), with rigorous trawls uncovering 489 separate publications, including 117 in internationally ranked journals, all divided into five time periods. Following a brief overview of previous syntheses, the study introduces our collection, outlining the coding procedures and conversion into a corpus of over 2.5 million words. The main part of the analysis focuses on the concluding sections of the papers to see what recommendations and future avenues of research are proposed in each time period. We use manual coding and semi-automated corpus keyword analysis to explore whether those points are in fact addressed in later publications as an indication of the evolution of the field.
https://www.lltjournal.org/item/3221
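For readers less familiar with the "semi-automated corpus keyword analysis" the abstract mentions, the snippet below is a minimal sketch (not the authors' actual pipeline) of how keywords are typically identified in corpus linguistics: word frequencies in a study corpus are compared against a reference corpus using the log-likelihood (G2) statistic, and the most overrepresented words surface as keywords.

```python
# Illustrative sketch only: the paper does not publish its code. This mimics the
# general idea of corpus keyword analysis by scoring words with log-likelihood (G2).
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def keywords(study_text, reference_text, top_n=10):
    """Return the top_n words most overrepresented in study_text vs. the reference."""
    study = Counter(tokenize(study_text))
    ref = Counter(tokenize(reference_text))
    n_study, n_ref = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Expected frequencies if the word were used at the same rate in both corpora
        e1 = n_study * (a + b) / (n_study + n_ref)
        e2 = n_ref * (a + b) / (n_study + n_ref)
        g2 = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0))
        scored.append((g2, word))
    return [w for _, w in sorted(scored, reverse=True)[:top_n]]

if __name__ == "__main__":
    study = "learners consult the corpus and study concordance lines in class"
    reference = "students read the textbook and complete the exercises in class"
    print(keywords(study, reference, top_n=5))
```

In practice a real study corpus would be the full text of the collected publications and the reference corpus a large general-language corpus, but the scoring logic is the same.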
and
‘Review of Voyant Tools: See through your text’ by Ella Alhudithi, Iowa State University
Introduction
Voyant Tools is an open-source online-based platform for the analysis of digitally recorded texts developed by two humanities computing professors, Stefan Sinclair and Geoffrey Rockwell. Using computational algorithms, the platform extracts linguistic and statistical information from texts of different sizes, types, and languages within seconds. All extractions are available in visual formats (e.g., grids, graphs, and animations) to offer a window for a macroscopic view of texts. This input-output process allows for turning complex metadata into easily interpretable visuals. The platform is freely accessible today, requiring an internet connection and a text collection (i.e., corpus). Users of varying expertise and technical ability can use it to uncover insights that characterize their texts.
https://www.lltjournal.org/item/3217
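Voyant Tools itself runs entirely in the browser and requires no programming. Purely as an illustration of the kind of summary figures its default dashboard reports (document count, total and unique words, vocabulary density, most frequent terms), here is a short Python sketch; it uses nothing from the Voyant codebase and the function name is my own.

```python
# Illustrative sketch only: rough equivalents of a few statistics Voyant displays
# for a corpus, computed with plain Python.
import re
from collections import Counter

def corpus_summary(texts, top_n=5):
    """Compute simple corpus statistics similar to Voyant's summary panel."""
    tokens = [t for text in texts for t in re.findall(r"\w+", text.lower())]
    types = set(tokens)
    return {
        "documents": len(texts),
        "total_words": len(tokens),
        "unique_words": len(types),
        "vocabulary_density": round(len(types) / len(tokens), 3) if tokens else 0.0,
        "most_frequent": Counter(tokens).most_common(top_n),
    }

if __name__ == "__main__":
    docs = ["The corpus is small.", "The analysis is fast and visual."]
    print(corpus_summary(docs))
```

The difference, of course, is that Voyant turns these numbers into interactive grids, graphs, and animations within seconds of uploading a corpus.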