Corpus Linguistics Archives

New issue of Language Learning and Technology, October 2021

Posted on 12 October 2021 by Ton Koenraad

From the latest issue of Language Learning & Technology (Volume 25, Number 3, October 2021), I would like to highlight two contributions that may be of special interest to colleagues who are into data-driven language learning and/or attended one of the three MOOC editions that the CATAPULT project offered over the past two years.

They are:

‘Thirty years of data-driven learning: Taking stock and charting new directions over time’ by Alex Boulton, ATILF – CNRS & University of Lorraine and Nina Vyatkina, University of Kansas

Abstract
The tools and techniques of corpus linguistics have many uses in language pedagogy, most directly with language teachers and learners searching and using corpora themselves. This is often associated with work by Tim Johns who used the term Data-Driven Learning (DDL) back in 1990. This paper examines the growing body of empirical research in DDL over three decades (1989-2019), with rigorous trawls uncovering 489 separate publications, including 117 in internationally ranked journals, all divided into five time periods. Following a brief overview of previous syntheses, the study introduces our collection, outlining the coding procedures and conversion into a corpus of over 2.5 million words. The main part of the analysis focuses on the concluding sections of the papers to see what recommendations and future avenues of research are proposed in each time period. We use manual coding and semi-automated corpus keyword analysis to explore whether those points are in fact addressed in later publications as an indication of the evolution of the field.
https://www.lltjournal.org/item/3221

and

‘Review of Voyant Tools: See through your text’ by Ella Alhudithi, Iowa State University

Introduction
Voyant Tools is an open-source online-based platform for the analysis of digitally recorded texts developed by two humanities computing professors, Stefan Sinclair and Geoffrey Rockwell. Using computational algorithms, the platform extracts linguistic and statistical information from texts of different sizes, types, and languages within seconds. All extractions are available in visual formats (e.g., grids, graphs, and animations) to offer a window for a macroscopic view of texts. This input-output process allows for turning complex metadata into easily interpretable visuals. The platform is freely accessible today, requiring an internet connection and a text collection (i.e., corpus). Users of varying expertise and technical ability can use it to uncover insights that characterize their texts.
https://www.lltjournal.org/item/3217

New Journal about Computer-Assisted Language Learning

Posted on 2 October 2021 by Ton Koenraad

The new (open-access) Journal of China Computer-Assisted Language Learning has published its first issue.
The Journal, an initiative of Beijing Foreign Studies University, is the official journal of ChinaCALL, an affiliate of China Association for Comparative Studies of English and Chinese. The journal seeks to provide an international platform for exchanging ideas, innovations and findings regarding computer-assisted language learning. Research articles and critical reviews are especially welcome in related promising fields. The journal is peer-reviewed, published in English, open-access and issued twice a year.

My recommended articles for LinguaCoP members include:
Ma, Qing and Mei, Fang. “Review of corpus tools for vocabulary teaching and learning” Journal of China Computer-Assisted Language Learning, vol. 1, no. 1, 2021, pp. 177-190. https://doi.org/10.1515/jccall-2021-2008
and
Shrestha, Prithvi N. “Designing an online business communication course in English by responding to student needs through an evidence-based approach” Journal of China Computer-Assisted Language Learning, vol. 1, no. 1, 2021, pp. 47-79. https://doi.org/10.1515/jccall-2021-2003

A corpus-based quiz for Christmas Day 🎄

Posted on 24 December 2020 by Ton Koenraad

Wishing all community members a happy Christmas and, hopefully, a good and healthy 2021. And to whet the appetites of those of you who have not yet attended one of our annual MOOC editions, here is a corpus-based quiz offered by the Corpus for Schools project at Lancaster University.

Which of the 🎄expressions above is more frequent in current spoken British English?

Search and find answers here.

[The quiz and the answers are based on the findings from the British National Corpus 2014]

Recently added inventory resources

Posted on 27 October 2020 (updated 28 October 2020) by Ton Koenraad

I would like to share information about two resources I recently added to our inventory.
They are both Padlet pages.
The first offers an overview of paper abstracts and video recordings of online presentations at (the online version of) the 2020 EUROCALL conference.
One of the contributions of specific interest for our community (and teachers of Italian in particular) is ‘Data-driven learning for languages other than English: Charting the territory’ (Forti et al., 2020), which presents MALT-IT2. This stands for “Measuring automatically the level of texts for second or foreign language learners of Italian”. It is a freely accessible online tool that assigns an input text to a specific CEFR level. This can be a great tool for producing customised teaching and learning materials.

The second is a Padlet page on using corpus tools in LSP and CLIL.

In addition to links to several mega corpora, this resource about the use of big data in language teaching and learning lists a number of other resources, some accompanied by introductory materials such as tutorials, exploratory tasks and sample activities.
Especially for LSP professionals, the column ‘LSP Corpora & DDL’ has recently been added to facilitate knowledge sharing about specialized corpora and related resources in a range of languages.
Its current first entry is ‘Check your Smile’, a digital game-based collaborative language learning website to learn Languages for Specific Purposes and Specific Vocabulary in particular.

Furthermore, in the column ‘More Tools & DDL Resources’ we aim (again collectively) to curate tools, materials and teaching practices that are useful for introducing data-driven teaching and learning approaches in the classroom.

As the Padlet page is editable (no registration required), we hope that contributions from our LinguaCoP members to the ‘LSP Corpora & DDL’ and other columns will enhance its usefulness for the LSP community.