Presentation to master's degree students in the Education Faculty on using corpus resources and methods in language teaching and learning. Delivered November 2018.
Corpus Linguistics for Language Teaching and Learning
1. Corpus Linguistics for
Language Learning and Teaching
Martin Wynne martin.wynne@bodleian.ox.ac.uk
Bodleian Libraries
Faculty of Linguistics, Philology and Phonetics
2. The 'aftermath' of the seminar
Subject: Les Francais des Corpus – Aftermath
Dear colleagues,
First, many thanks for presenting at/attending
the Francais des Corpus Workshop and for making
it such a success.
I promised I would keep you in touch with one
another and hope that the full list of your
e-mail addresses above makes that possible.
…
4. What is a corpus?
“…a collection of pieces of language, selected and ordered according
to explicit linguistic criteria in order to be used as a sample of the
language.”
(Sinclair 1996)
5. What is Corpus Linguistics?
(1) Focus on linguistic performance, rather than competence
(2) Focus on linguistic description, rather than linguistic
universals
(3) Focus on quantitative, as well as qualitative models of
language
(4) Focus on a more empiricist, rather than rationalist view of
scientific inquiry.
(Leech 1992)
6. How do you know things about
language? Where do we get our
knowledge from?
7. What does your knowledge and
experience tell you about the use of
‘try to’ & ‘try and’?
8. Fill in the blanks
1. Did you try … talk her out of swimming?
2. Mr. Kissinger, try … explain to us what might happen
3. He did it to try … score points
4. They both wanted to try ... have a family
5. They try … treat you like machines
6. Sometimes, people try … make fun of you by
imitating you.
7. Now the government will try … sell all of this.
8. Did you try … get out of it?
9. I will try … understand this.
9. Fill in the blanks
1. Did you try and talk her out of swimming?
2. Mr. Kissinger, try and explain to us what might happen
3. He did it to try and score points
4. They both wanted to try and have a family
5. They try to treat you like machines
6. Sometimes, people try to make fun of you by imitating you.
7. Now the government will try to sell all of this.
8. Did you try to get out of it?
9. I will try ? understand this. [This one was made up!]
10.
• “Try and do something is incorrect for try to do…” [Partridge and Greet 1947]
• “Try and is well established in conversational use … Try to is to be preferred in serious writing” [Plain Words 1986]
• “… try and has been socially acceptable for these two centuries … is not used in an elevated style” [Webster’s Dictionary 1989]
11. What are the factors governing the choice
and distribution of try to vs. try and ?
How would you investigate this question?
13. [Chart: frequency per million words (pmw) of 'try and' and 'try to' in BNC, COCA, Hansard, GloWbE, COHA, soap operas and Wikipedia; axis scale 0 to 350]
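The pmw (per million words) figures are normalised frequencies: the raw number of hits divided by the size of the corpus in tokens, multiplied by one million, so that corpora of different sizes can be compared. A minimal sketch of that calculation follows. The function name `per_million` and the crude tokeniser are assumptions made for illustration, and the sample text is simply three of the sentences from the fill-in-the-blanks slide, far too small to say anything about real usage. Only the base form "try" is matched here, not "tried", "trying" or "tries".

```python
import re

def per_million(text, phrase):
    """Return (raw hits, hits per million words) for a phrase in a text."""
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n tokens over the text and count exact matches.
    hits = sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)
    pmw = hits / len(tokens) * 1_000_000 if tokens else 0.0
    return hits, pmw

# Toy "corpus": three sentences taken from the fill-in-the-blanks slide.
sample = (
    "Did you try and talk her out of swimming? "
    "They try to treat you like machines. "
    "Did you try to get out of it?"
)

for phrase in ("try and", "try to"):
    hits, pmw = per_million(sample, phrase)
    print(f"{phrase}: {hits} hits, {pmw:.0f} pmw")
```

With a real corpus the same arithmetic applies, but the token count should come from the corpus documentation or query tool, since tokenisation conventions differ between corpora.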
14. Try to or try and? Verb
complementation in British and
American English
Hommerberg & Tottie (2007)
ICAME Journal 31:45-64
http://icame.uib.no/ij31/ij31-page45-64.pdf
15. Uses of corpus linguistics in
language pedagogy
● Developing new theories (e.g. differences between regional varieties, identifying new varieties such as 'English as a Lingua Franca')
● A source of primary data for developing e.g.:
  ➢ dictionaries
  ➢ grammars
  ➢ textbooks (and other teaching materials)
● Preparing materials for classes (e.g. as a source of examples)
● Studying learner language (in a learner corpus)
● Data-driven learning in the classroom
16. “Why not just Google it?”
Linguistic
● Biased distribution of text-types and genres
● Repeated and reused text
● Unknown provenance (“Who wrote this, when, and why?”)
● Mixture of native and non-native producers of language
● Mixture of varieties
Technical
● Unclear separation of elements of the webpage (body text, sidebars, adverts, etc.)
● Accessing the ‘hidden web’ (content which is not visible to search)
● Accessing language embedded in audio and video streams
● Lack of persistent locations and identifiers
Methodological
● Difficult to compare frequencies of occurrence
● Unknown (or undesirable) sampling and ranking strategies of search engines (e.g. promoting commercial products and services, prioritizing words in titles and headers, user-specific settings)
17. Problems with language in the corpus
● Limited by copyright (and other legal and ethical barriers)
● Expensive, time-consuming and slow to make
● Limited size
● Not up to date
● Incomplete information about provenance and context
● Design decisions were made by someone else
● Not easily comparable to other corpora
● Access restrictions
● Limited functions available for analysis and exploration
● Not connected to other resources or tools
● Difficult to deploy in the classroom
18. Exercise
Find evidence in one or more corpora to help explain the sources of irony and humour in Homer’s utterance:
“I'm just going out to commit certain deeds”
http://kisscartoon.eu/watch/the-simpsons-season-9-episode-16-dumbbell-indemnity/ (9:27-9:52)
19. Links for Practical Work
● http://bncweb.lancs.ac.uk/ (register using ac.uk email address)
● http://corpus.byu.edu
● All links can be found via: https://ota.ox.ac.uk/oxonly/oxford.xml
20. Data-driven language learning in the
classroom: some reflections
● Can you use a corpus to reveal 'real language'?
● Do we want to teach ‘real language’? Should teachers prefer to control the rate and order of exposure to linguistic features?
● Can teachers easily deal with unrestricted language in the classroom?
● Effective reading and interpretation of concordance lines and collocation lists require practice, and the acquisition of skills.
● There are often difficult technical issues in effective deployment of corpora in the classroom.
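For readers who have not met concordance lines before, the sketch below shows roughly what a concordancer produces: every hit for the search phrase, centred, with a fixed amount of context on either side (a keyword-in-context or KWIC display). The function name `kwic` and the `width` parameter are assumptions made for illustration, not the behaviour of any particular tool, and the example text reuses two sentences from the fill-in-the-blanks slide.

```python
import re

def kwic(text, phrase, width=30):
    """Return keyword-in-context lines for each occurrence of phrase."""
    # Collapse line breaks and repeated spaces so the context stays on one line.
    text = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every hit starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

example = "He did it to try and score points. I will try to understand this."
for line in kwic(example, "try"):
    print(line)
```

Reading down the aligned column of hits, and then scanning the words immediately to the right and left, is the basic skill that the reflections above say needs practice.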
21. AntConc
● Download for free from http://www.antlab.sci.waseda.ac.jp/software.html
● Use with any 'plain' text (txt, html, xml)
● Multilingual capabilities
● Does not interpret mark-up or metadata
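Because a tool that does not interpret mark-up treats tags as ordinary text, one option is to strip the tags from a file before loading it. The following is a minimal sketch using Python's standard `html.parser` module. The names `TextOnly` and `strip_markup` are hypothetical, and the tagged sentence is an invented illustration, not the format of any particular corpus. Note that any annotation held in the tags (for example part of speech) is simply discarded, and text inside script or style elements of a web page would be kept.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect only the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_markup(marked_up):
    """Return the plain text of an HTML/XML-like string, whitespace normalised."""
    parser = TextOnly()
    parser.feed(marked_up)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())

# Invented example of a tagged sentence (the tag names are assumptions).
tagged = '<s n="1">They <w pos="VERB">try</w> to treat you like machines.</s>'
print(strip_markup(tagged))
```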
22. CQPweb: an online interface for many corpora
http://cqpweb.lancs.ac.uk
24. References
● Chambers, A. and M. Wynne (2008). Sharing corpus resources in language learning. In F. Zhang and B. Barber (eds.) Handbook of Research on Computer-Enhanced Language Acquisition and Learning (pp. 438-451). Hershey, PA: IGI Global.
● Hommerberg, C. and G. Tottie (2007). Try to or Try and? Verb complementation in British and American English. ICAME Journal 31: 45-64. http://icame.uib.no/ij31/ij31-page45-64.pdf
● Leech, G. (1992). Corpora and theories of linguistic performance. In J. Svartvik (ed.) Directions in Corpus Linguistics (pp. 105-122). Berlin: Mouton de Gruyter.
● McEnery, A. and Z. Xiao (2010). What corpora can offer in language teaching and learning. In E. Hinkel (ed.) Handbook of Research in Second Language Teaching and Learning (Vol. 2). London / New York: Routledge. http://www.lancaster.ac.uk/fass/projects/corpus/ZJU/xpapers/McEnery_Xiao_teaching.PDF
● McEnery, A., R. Xiao and Y. Tono (2006). Corpus-based Language Studies: An Advanced Resource Book. Routledge.
● Sinclair, J.McH. (1996). Preliminary recommendations on corpus typology. EAGLES Document TCWG-CTYP/P (available from http://www.ilc.cnr.it/EAGLES/corpustyp/corpustyp.html).
Online resources
● Corpora for users in the University of Oxford https://ota.ox.ac.uk/oxonly/oxford.xml
● Brigham Young Corpora http://corpus.byu.edu/ (also via Solo)
● British National Corpus http://bncweb.lancs.ac.uk/ (free registration required here)
● Linguee (bilingual translations) https://www.linguee.com/
● VOICE. 2013. The Vienna-Oxford International Corpus of English (version 2.0 Online) http://voice.univie.ac.at, also available for download from the Oxford Text Archive (http://purl.ox.ac.uk/ota/2542).