by Locky, Law
PhD Candidate
The Hong Kong Polytechnic University
Email: Lx3h@yahoo.com
The 9th International Conference of ASIALEX
Words, Dictionaries and Corpora: Innovation in reference science
25-27 June 2015 | Hong Kong
• Television drama, despite its enormous popularity across the globe, has rarely received
attention from the field of linguistics. The dearth of research into television drama dialogue is
all the more apparent given the thriving contributions from other fields such as philosophy,
psychology, cultural studies and media studies.
• This paper seeks to promote research interest in this unique mediated text by taking the
renowned medical dramedy House M.D. as its research subject and comparing the 927,922-word
House M.D. pure dialogue corpus (HMDC) with both the 450-million-word Corpus of
Contemporary American English (COCA) and its 95-million-word spoken subcorpus (COCA
Spoken), using an adaptation of Bednarek’s (2011) ranked frequency list method.
• With n-grams (n = 1, 2, 3) calculated in WordSmith Tools at the word/cluster level,
the findings indicate that HMDC is more interpersonal than COCA and resembles COCA Spoken
more closely than it resembles COCA as a whole. HMDC also contains 3.4 times more
negativity than COCA Spoken and 2.8 times more than COCA. As such, viewers are
presented with English that is far more interpersonal, and that involves significantly more
disagreement, than what one encounters in the real world. This study not only shows
similarities and differences between House M.D. and contemporary American English, but
also offers a preview of the huge potential of television drama-related research.
• A spin-off from my PhD project, titled “[…] and Creativity: a Corpus Linguistic Systemic
Functional Multimodal Discourse Analysis Approach”
• There is no lack of interest or research in literary texts, and films are
gaining popularity, but
• TV drama has not attracted much attention from linguistics, to the point that it is “marked
equally as popular” as it is “devalued” (Bignell and Lacey 2005:3).
• There is an “urgent need … for a treatment of fictional cinema
and television from various linguistic perspectives” (Piazza, Bednarek and Rossi 2011:2).
• Interested in how language in drama differs
from or resembles language in “reality”
• The “reality” represented by any corpus is always limited by the scope of its data,
whether by genre, region, gender, race or time.
Therefore, there can never be a complete
representation of any actual reality.
• FOX: 8 years, 8 seasons, 177 episodes
• David Shore -- winner of the Primetime Emmy Award for
Outstanding Writing for a Drama Series
• Bryan Singer -- executive producer (director of the films
Valkyrie and X-Men)
• Hugh Laurie -- twice winner of the Golden Globe for
Best Performance by an Actor in a Television
Series – Drama
• Rated 8.9/10 by 237,068
users on IMDb.com as of November 2014
• In 2008, it was one of the top-ten rated shows in
the United States and the most watched television
program in the world
• By 2011, it had been viewed by a spectacular 81.8
million viewers in 66 countries
• Since 2011, Hugh Laurie has been listed in the
Guinness Book of Records as the world’s
Most-Watched (Leading) Man on Television
• In 2012, he ranked 2nd on Forbes’s list of the Highest-Paid TV Actors,
at $400,000 (£247,230) per episode
1. How do dialogues in House M.D. differ
from or resemble contemporary American
English?
2. What differences and similarities emerge
when COCA and its spoken subcorpus (COCA
Spoken) are each compared with a House
M.D. dialogue corpus?
3. What can this corpus linguistic approach
reveal about House M.D.?
• Construct a House M.D. Corpus (927,922 words)
from fan scripts
• Remove all non-dialogue elements such as fade-ins,
scene headings, action sequences, scene
transitions, mood brackets, parentheticals,
commercial tags and character name tags
• Check the result manually and repeatedly against internet sources
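The cleaning steps above could be sketched in Python roughly as follows. The slides do not show the layout of the fan scripts, so the conventions assumed here (uppercase `NAME:` speaker tags, square-bracketed action lines, round-bracketed mood notes, and headings such as `INT.` or `FADE IN:`) and the sample lines are hypothetical, and `extract_dialogue` is an illustrative name rather than a tool used in the study.

```python
import re

# Assumed script conventions (hypothetical, not taken from the slides).
SCENE_LINE = re.compile(r"^\s*(?:INT\.|EXT\.|FADE (?:IN|OUT)\b|CUT TO\b|DISSOLVE TO\b)")
BRACKETED = re.compile(r"\[[^\]]*\]|\([^)]*\)")          # action lines, mood notes
SPEAKER_TAG = re.compile(r"^\s*[A-Z][A-Z .'-]{0,30}:\s*")  # e.g. "DR. SMITH: "


def extract_dialogue(script):
    """Return the spoken lines of a script, stripped of non-dialogue elements."""
    kept = []
    for line in script.splitlines():
        if SCENE_LINE.match(line):
            continue                        # scene headings and transitions
        line = BRACKETED.sub(" ", line)     # bracketed actions and parentheticals
        line = SPEAKER_TAG.sub("", line)    # character name tags
        line = " ".join(line.split())       # collapse leftover whitespace
        if line:
            kept.append(line)
    return kept


# Invented sample, only to show the behaviour.
SAMPLE = """FADE IN:
INT. CLINIC - DAY
[A doctor walks in, holding a file]
HOUSE: (flatly) You are late.
CUDDY: I'm not late. [She hands him the file]
CUT TO:
HOUSE: Then I'm early.
"""

for spoken in extract_dialogue(SAMPLE):
    print(spoken)
```

Rules of this kind will miss irregular formatting, which is presumably why the repeated manual check remains part of the procedure.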
• COCA contains more than 450 million words in
189,431 texts, divided equally among 5 genres:
spoken, fiction, popular magazines, newspapers
and academic journals
• It is “the largest freely-available corpus of English,
and the only large and balanced corpus of
American English” (Davies, 2008)
• The spoken part of COCA (hereafter
COCA Spoken) contains 95 million words
[95,385,672] of transcripts of unscripted
conversation from more than 150 different TV
and radio programs, such as All Things Considered
(NPR), Newshour (PBS), Good Morning America
(ABC), Today Show (NBC), 60 Minutes (CBS),
Hannity and Colmes (Fox) and Jerry Springer
(Davies, 2008).
• The PhD project on creative language requires
a large, balanced and up-to-date corpus of
American English
• Spoken corpora larger than 1 million words are rare; e.g.
SBCSAE has 249k words
• The “reality” concerned is not that of any
specialized purpose, but has to include casual
conversation as well as some medical topics
(i.e. a “reality” which includes a doctor’s daily
spoken English, both social and medical discourse)
• N-gram comparison and ranked frequency lists
1. HMDC 1-grams, 2-grams and 3-grams (hereafter
1-to-3-grams) with respect to
COCA’s 1-to-3-grams, and vice versa
2. HMDC 1-to-3-grams with respect to COCA
Spoken’s 1-to-3-grams, and vice versa
3. Negativity/positivity of HMDC 3-grams compared with
COCA and COCA Spoken, and vice versa
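A minimal Python sketch of a ranked frequency list comparison, offered as an illustration only: the study itself used WordSmith Tools, and the slides do not give the details of the adapted Bednarek (2011) method. The function names, the tokenizer and the toy texts are assumptions. Here the rank difference is taken as the reference rank minus the target rank, and an n-gram that is missing from the reference list is reported as `None`.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens, keeping contractions such as "you're" whole."""
    return re.findall(r"[a-z0-9]+(?:['’][a-z]+)?", text.lower())


def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ranked_list(tokens, n):
    """Map each n-gram to its rank (1 = most frequent); ties broken alphabetically."""
    counts = Counter(ngrams(tokens, n))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered, start=1)}


def rank_differences(target_ranks, reference_ranks, top=20):
    """For the top n-grams of the target, look up the rank in the reference list."""
    rows = []
    for gram, rank in sorted(target_ranks.items(), key=lambda kv: kv[1])[:top]:
        ref = reference_ranks.get(gram)            # None if not found
        diff = None if ref is None else ref - rank
        rows.append((gram, rank, ref, diff))
    return rows


# Toy texts standing in for HMDC and a reference corpus.
target = ranked_list(tokenize("you are not you are"), 2)
reference = ranked_list(tokenize("are not are not you are"), 2)
for row in rank_differences(target, reference):
    print(row)
```

Running the same comparison in the opposite direction (reference as target) gives the "vice versa" view listed above.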
1. Rank sums/differences form the basis of the
Mann-Whitney U test
2. Does not assume a normal distribution
3. Works well with small observed frequencies
as well as large ones
4. Does not exaggerate differences at low frequency counts
5. Simplifies large numbers and reveals underlying
patterns
6. Indicates how different two sets of data are
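To illustrate the first point, the Mann-Whitney U statistic can be computed directly from rank sums. The sketch below is a generic textbook computation in plain Python with made-up numbers; the slides do not state that the test itself was run on HMDC, and `mann_whitney_u` is an illustrative name.

```python
def rank_with_ties(values):
    """Return 1-based ranks for values, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b), computed from the rank sum of sample a."""
    ranks = rank_with_ties(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Made-up samples: every value in the first is below every value in the second.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

Because only the ordering of the values enters the calculation, no assumption about their distribution is needed, which is the second point in the list.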
Figure callouts: “among the shared thirteen 1-grams”; “double-digit rank difference”
House M.D. appears to be
• more interpersonal
• focused more on 1st and 2nd person singular
than the norm in general written and spoken American English, but
n-grams with more tokens (2-grams and 3-grams) must also be considered.
Issue 1: Contraction alternatives:
I am (rank 170), You are, is not, etc.
Issue 2: The 2-grams There—’s and That—’s
are not found, but the
2-grams There’s there and That’s
that/i/you/Mr/it are found.
Bug in algorithm?
Divergence from COCA
The issue of contraction alternatives continues:
• It isn’t (rank 532)
• I am not (rank 638)
• It is a (rank 64)
• You are not (rank 2,663)
• You aren’t (rank 6,009)
• There is no (rank 52)
Figure labels: 17 negativity vs. 6 negativity; 18 contractions vs. 6 contractions
• Contraction alternatives and a (possible) bug affect the
results
• HMDC is not well reflected by COCA
• A comparison with COCA Spoken should be tried next
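One possible way to handle the contraction alternatives noted above is to map contracted forms to their full forms before counting, so that "I'm" and "I am" contribute to the same n-gram. The slides report the issue but not a fix, so this is only a sketch: the mapping is deliberately tiny and hypothetical, and forms such as "there's" are ambiguous between "is" and "has", which a real study would have to resolve.

```python
import re
from collections import Counter

# Hypothetical, incomplete mapping for illustration only.
CONTRACTIONS = {
    "i'm": "i am",
    "you're": "you are",
    "isn't": "is not",
    "aren't": "are not",
    "there's": "there is",   # could also be "there has"
    "that's": "that is",     # could also be "that has"
}


def tokens(text, expand=False):
    """Lowercase word tokens; optionally expand known contractions."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower().replace("’", "'"))
    if not expand:
        return words
    expanded = []
    for word in words:
        expanded.extend(CONTRACTIONS.get(word, word).split())
    return expanded


def bigram_counts(text, expand=False):
    """Count adjacent token pairs (sentence boundaries are ignored here)."""
    toks = tokens(text, expand)
    return Counter(zip(toks, toks[1:]))


sample = "I'm fine. I am fine. You're not. You are not."
raw = bigram_counts(sample)
merged = bigram_counts(sample, expand=True)
print(raw[("i", "am")], merged[("i", "am")])
print(raw[("are", "not")], merged[("are", "not")])
```

The same normalisation would have to be applied to both corpora being compared, otherwise the ranks would not be comparable.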
Figure callout: lower rank difference
Figure callout: lower rank difference, fewer unfound clusters,
more shared 2-grams
Figure callout: lower rank difference, lower negativity in COCA
Spoken; issue of contraction alternatives and
acronyms
• The decrease in the frequency of negativity in COCA
Spoken relative to COCA results from an increase
in positivity in spoken American English.
• Therefore, considering the top twenty 3-grams, COCA
Spoken contains 5% more positivity than COCA.
• HMDC contains 3.4 times more negativity than COCA
Spoken and 2.8 times more than COCA.
• In a way, House M.D. has brutally intervened in viewers’
perception of the norm of American English.
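The arithmetic behind these figures can be reconstructed as a rough check, under stated assumptions. The slide labels give 17 and 6 negative items; reading these as the HMDC and COCA counts among the top twenty 3-grams is an assumption, and the COCA Spoken count of 5 does not appear on the slides at all: it is inferred here because it is the value that reproduces the reported 3.4.

```python
TOP_N = 20  # the comparison considers the top twenty 3-grams

negative_3grams = {
    "HMDC": 17,         # slide label; attribution to HMDC is an assumption
    "COCA": 6,          # slide label; attribution to COCA is an assumption
    "COCA Spoken": 5,   # NOT on the slides: inferred from the reported 3.4
}

ratio_vs_coca = negative_3grams["HMDC"] / negative_3grams["COCA"]
ratio_vs_spoken = negative_3grams["HMDC"] / negative_3grams["COCA Spoken"]

# One fewer negative item out of twenty is a 5-point shift.
shift_points = (negative_3grams["COCA"] - negative_3grams["COCA Spoken"]) * 100 / TOP_N

print(round(ratio_vs_coca, 1), round(ratio_vs_spoken, 1), shift_points)
```

If the inferred count is right, the "5% more positivity" corresponds to one 3-gram out of twenty.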
• This study has discussed how the language used in House M.D.
relates to contemporary spoken American English
• It has listed differences and similarities drawn from a comparison
of COCA and COCA Spoken with HMDC
• It has shown how House M.D. can be identified as a dramedy that is far more
interpersonal, more 1st- and 2nd-person-addressed and more disagreeing than what one
would encounter in “reality”
• In addition to addressing the original research questions, it has demonstrated the
strengths and weaknesses of using 1-to-3-gram rank differences to
compare HMDC with COCA and COCA Spoken
• It has addressed the potential methodological issue of contraction
alternatives and acronyms affecting n-gram ranking and rank difference
• Judging by the results of this simple
analysis, further studies of
television drama are worthy of researchers’
attention, interest and devotion.
• Allen, R. C. (2004). Frequently asked questions. A general introduction to the reader. In R. C. Allen & A. Hill (Eds.), The Television Studies Reader (pp. 1-26). New York: Routledge.
• Androutsopoulos, J. (2012). Introduction: Language and society in cinematic discourse. Multilingua, 31, 139-154.
• Bednarek, M. (2011). The language of fictional television: a case study of the ‘dramedy’ Gilmore Girls. English Text Construction, 4(1), 54-83.
• Bednarek, M. (2010). The Language of Fictional Television: Drama and Identity. London: Continuum International Publishing Group.
• Biber, D. (2009). A corpus-driven approach to formulaic language in English. International Journal of Corpus Linguistics, 14(3), 275-311.
• Bignell, J., & Lacey, S. (Eds.). (2005). Popular television drama: critical perspectives. Manchester: Manchester University Press.
• Brock, A. (2004). Analyzing scripts in humorous communication. Humor: International Journal of Humor Research, 17(4), 353-360.
• Bubel, C. (2006). The linguistic construction of character relations in TV drama: Doing friendship in Sex and the City. Retrieved April 4, 2013, from SciDok-Datenbank: http://scidok.sulb.uni-saarland.de/volltexte/2006/598/pdf/Diss_Bubel_publ.pdf
• Chamber, S. A. (2003). Language and Structure in The West Wing. In P. C. Rollins & J. E. O'Connor (Eds.), The West Wing: The American Presidency As Television Drama (pp. 83-100). New York: Syracuse University Press.
• Chua, B. H. (2008). Structure of identification and distancing in watching East Asian television drama. In B. H. Chua & K. Iwabuchi, East Asian Pop Culture: Analysing the Korean Wave (pp. 73-90). Hong Kong: Hong Kong University Press.
• Cover, R. (2004). From Butler to Buffy: Notes towards a strategy for identity analysis in contemporary television narrative. Reconstruction: Studies in Contemporary Culture, 4(2).
• Davies, M. (2014, December 1). CoRD | The Corpus of Contemporary American English (COCA). Retrieved March 21, 2011, from VARIENG: CoRD | The Corpus of Contemporary American English (COCA)
• Davies, M. (2011). N-grams data from the Corpus of Contemporary American English (COCA). Retrieved November 20, 2014, from http://www.ngrams.info
• Goodier, B. C., & Arrington, M. I. (2007). Physicians, patients, and medical dialogue in the NYPD Blue prostate cancer story. Journal of Medical Humanities, 28(1), 45-58.
• IMDb. (n.d.).
• Jacoby, H., & Irwin, W. (Eds.). (2008). House and Philosophy: Everybody Lies. John Wiley & Sons.
• Jamieson, D. (2011, September). Does TV accurately portray psychology? Retrieved April 20, 2013, from American Psychological Association: http://www.apa.org/gradpsych/2011/09/psychology-shows.aspx
• Munt, S. R. (2006). A queer undertaking: Anxiety and reparation in the HBO television drama series Six Feet Under. Feminist Media Studies, 263-279.
• O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge: CUP.
• Piazza, R., Bednarek, M., & Rossi, F. (Eds.). (2011). Telecinematic Discourse: Approaches to the Language of Films and Television Series. Philadelphia: John Benjamins B.V.
• Quaglio, P. (2008). Television dialogue and natural conversation: Linguistic similarities and functional differences. In A. Ädel & R. Reppen (Eds.), Corpora and Discourse: The challenges of different settings (Vol. vi, pp. 189-210). John Benjamins.
• Richardson, K. (2010). Television Dramatic Dialogue: A Sociolinguistic Study. Oxford: Oxford University Press.
• Scott, M. (2014). WordSmith Tools Manual. Retrieved May 20, 2014, from Lexical Analysis Software: http://www.lexically.net/downloads/version6/HTML/index.html
• Wild, D. K. (2005a, October 24). Constructing House: An Interview with House, M.D. writer Lawrence Kaplow. Retrieved September 4, 2014, from Blogcritics: http://blogcritics.org/constructing-house-an-interview-with-house/
• Woznicki, K. (2005, September 27). A doctor/writer in the 'House'. Retrieved September 4, 2014, from CNN.com: http://edition.cnn.com/2005/HEALTH/09/27/profile.writer.foster/index.html
32
by Locky, Law
PhD Candidate
The Hong Kong Polytechnic University
Email: Lx3h@yahoo.com
The 9th International Conference of ASIALEX
Words, Dictionaries and Corpora: Innovation in reference science
25-27 June 2015 | Hong Kong

Contenu connexe

Similaire à Locky's Asialex 2015

What role do expanding circle country users play in the spread of english
What role do expanding circle country users play in the spread of english What role do expanding circle country users play in the spread of english
What role do expanding circle country users play in the spread of english Víctor Elías Lugo Vásquez
 
Psychology of Language 5th Edition Carroll Test Bank
Psychology of Language 5th Edition Carroll Test BankPsychology of Language 5th Edition Carroll Test Bank
Psychology of Language 5th Edition Carroll Test BankKiayadare
 
What can a corpus tell us about registers and genres douglas biber
What can a corpus tell us about registers and genres douglas biberWhat can a corpus tell us about registers and genres douglas biber
What can a corpus tell us about registers and genres douglas biberPascual Pérez-Paredes
 
The Usage of Because of-Words in British National Corpus
 The Usage of Because of-Words in British National Corpus The Usage of Because of-Words in British National Corpus
The Usage of Because of-Words in British National CorpusResearch Journal of Education
 
Ch 6 corpus linguistics
Ch 6   corpus linguisticsCh 6   corpus linguistics
Ch 6 corpus linguisticsNaveed Khokher
 
1 discourse analysis.ppt
1 discourse analysis.ppt1 discourse analysis.ppt
1 discourse analysis.pptUtamitri67
 
language censorship on network television and radioPrompt You.docx
language censorship on network television and radioPrompt You.docxlanguage censorship on network television and radioPrompt You.docx
language censorship on network television and radioPrompt You.docxsmile790243
 
Language structure is partly by social structure
Language structure is partly by social structureLanguage structure is partly by social structure
Language structure is partly by social structureSpanishinBuenosAires
 
Corpus linguistics in language learning
Corpus linguistics in language learningCorpus linguistics in language learning
Corpus linguistics in language learningnfuadah123
 
Tesol2011 pc
Tesol2011 pcTesol2011 pc
Tesol2011 pcvacurves
 
Doing Identity in Ethnographic Interviews
Doing Identity in Ethnographic InterviewsDoing Identity in Ethnographic Interviews
Doing Identity in Ethnographic InterviewsKatherine Morales
 
Creative writing: a lesson plan
Creative writing: a lesson planCreative writing: a lesson plan
Creative writing: a lesson planFatima Gul
 
code switching - code mixing
code switching - code mixingcode switching - code mixing
code switching - code mixingHameel Khan
 
code mixing and code switching
code mixing and code switchingcode mixing and code switching
code mixing and code switchingFatima Gul
 

Similaire à Locky's Asialex 2015 (20)

What role do expanding circle country users play in the spread of english
What role do expanding circle country users play in the spread of english What role do expanding circle country users play in the spread of english
What role do expanding circle country users play in the spread of english
 
Locky's RIDCH Conference 2016
Locky's RIDCH Conference 2016Locky's RIDCH Conference 2016
Locky's RIDCH Conference 2016
 
Vot presentation Feb 6
Vot presentation Feb 6Vot presentation Feb 6
Vot presentation Feb 6
 
Psychology of Language 5th Edition Carroll Test Bank
Psychology of Language 5th Edition Carroll Test BankPsychology of Language 5th Edition Carroll Test Bank
Psychology of Language 5th Edition Carroll Test Bank
 
LANE422ch2.ppt
LANE422ch2.pptLANE422ch2.ppt
LANE422ch2.ppt
 
What can a corpus tell us about registers and genres douglas biber
What can a corpus tell us about registers and genres douglas biberWhat can a corpus tell us about registers and genres douglas biber
What can a corpus tell us about registers and genres douglas biber
 
The Usage of Because of-Words in British National Corpus
 The Usage of Because of-Words in British National Corpus The Usage of Because of-Words in British National Corpus
The Usage of Because of-Words in British National Corpus
 
Ch 6 corpus linguistics
Ch 6   corpus linguisticsCh 6   corpus linguistics
Ch 6 corpus linguistics
 
1 discourse analysis.ppt
1 discourse analysis.ppt1 discourse analysis.ppt
1 discourse analysis.ppt
 
language censorship on network television and radioPrompt You.docx
language censorship on network television and radioPrompt You.docxlanguage censorship on network television and radioPrompt You.docx
language censorship on network television and radioPrompt You.docx
 
ISB
ISBISB
ISB
 
Language structure is partly by social structure
Language structure is partly by social structureLanguage structure is partly by social structure
Language structure is partly by social structure
 
494
494494
494
 
Corpus linguistics in language learning
Corpus linguistics in language learningCorpus linguistics in language learning
Corpus linguistics in language learning
 
Intercultural awareness
Intercultural awarenessIntercultural awareness
Intercultural awareness
 
Tesol2011 pc
Tesol2011 pcTesol2011 pc
Tesol2011 pc
 
Doing Identity in Ethnographic Interviews
Doing Identity in Ethnographic InterviewsDoing Identity in Ethnographic Interviews
Doing Identity in Ethnographic Interviews
 
Creative writing: a lesson plan
Creative writing: a lesson planCreative writing: a lesson plan
Creative writing: a lesson plan
 
code switching - code mixing
code switching - code mixingcode switching - code mixing
code switching - code mixing
 
code mixing and code switching
code mixing and code switchingcode mixing and code switching
code mixing and code switching
 

Dernier

Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportDenish Jangid
 
How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17Celine George
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
An overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismAn overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismDabee Kamal
 
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...MysoreMuleSoftMeetup
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdf
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdfContoh Aksi Nyata Refleksi Diri ( NUR ).pdf
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdfcupulin
 
SURVEY I created for uni project research
SURVEY I created for uni project researchSURVEY I created for uni project research
SURVEY I created for uni project researchCaitlinCummins3
 
Improved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppImproved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppCeline George
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................MirzaAbrarBaig5
 
Rich Dad Poor Dad ( PDFDrive.com )--.pdf
Rich Dad Poor Dad ( PDFDrive.com )--.pdfRich Dad Poor Dad ( PDFDrive.com )--.pdf
Rich Dad Poor Dad ( PDFDrive.com )--.pdfJerry Chew
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi RajagopalEADTU
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...EduSkills OECD
 
Major project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesMajor project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesAmanpreetKaur157993
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxAdelaideRefugio
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxLimon Prince
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMELOISARIVERA8
 

Dernier (20)

Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of TransportBasic Civil Engineering notes on Transportation Engineering & Modes of Transport
Basic Civil Engineering notes on Transportation Engineering & Modes of Transport
 
How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17How To Create Editable Tree View in Odoo 17
How To Create Editable Tree View in Odoo 17
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
An overview of the various scriptures in Hinduism
An overview of the various scriptures in HinduismAn overview of the various scriptures in Hinduism
An overview of the various scriptures in Hinduism
 
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdf
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdfContoh Aksi Nyata Refleksi Diri ( NUR ).pdf
Contoh Aksi Nyata Refleksi Diri ( NUR ).pdf
 
SURVEY I created for uni project research
SURVEY I created for uni project researchSURVEY I created for uni project research
SURVEY I created for uni project research
 
Improved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppImproved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio App
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
Rich Dad Poor Dad ( PDFDrive.com )--.pdf
Rich Dad Poor Dad ( PDFDrive.com )--.pdfRich Dad Poor Dad ( PDFDrive.com )--.pdf
Rich Dad Poor Dad ( PDFDrive.com )--.pdf
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
Major project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesMajor project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategies
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
 

Locky's Asialex 2015

  • 1. 1 by Locky, Law PhD Candidate The Hong Kong Polytechnic University Email: Lx3h@yahoo.com The 9th International Conference of ASIALEX Words, Dictionaries and Corpora: Innovation in reference science 25-27 June 2015 | Hong Kong
  • 2.  Television drama, despite its enormous popularity across the globe, has rarely received attentions from the linguistics field. The dearth of research into television drama dialogue is further exposed by the thriving contributions from various other fields such as philosophy, psychology, cultural studies and media studies.  This paper seeks to promote research interest in this unique mediated text by selecting renowned medical dramedy House M.D. as research subject and comparing its 927,922-word House M.D. pure dialogue corpus (HMDC) to both the 450-million-word Corpus of Contemporary American English (COCA) and its 95-million-word spoken subcorpus (COCA Spoken) using an adaptation of Bednerak’s (2011) ranked frequency list method.  Using WordSmith Tools in the calculation of n-gram (n = 1, 2, 3) at the words/clusters level, the findings indicate that HMDC is more interpersonal than COCA and has a closer resemblance to COCA Spoken than to COCA. HMDC also contains 3.4 times more negativity than COCA Spoken and 2.8 times more than COCA. As such, viewers are presented with English far more interpersonal, as well as involving significantly more disagreement than one will encounter in the real world. This study not only shows similarities and differences between House M.D. and contemporary American English, but also provides a preview of the huge potential in television drama-related research. 2
  • 3. 3
  • 4.  A spin-off from my PhD project, titled “ and Creativity: a Corpus Linguistic Systemic Functional Multimodal Discourse Analysis Approach”  No lack of interest and research in literary texts; films are gaining popularity, but  TV drama has not attracted considerable attention from the linguistic field, so much to a point that it is “marked equally as popular” as it is “devalued” (Bignell and Lacey 2005:3).  an “urgent need … for a treatment of fictional cinema and television from various linguistic perspectives.” Piazza, Bednarek and Rossi (2011, p. 2) 4
  • 5.  Interested in how language in drama differs from/resembles language in “reality”  Bearing in mind that “reality” represented by any corpora is always limited by the scope of its data, no matter it is by genre, region, gender, race or time. Therefore, there can never be a complete representation of any actual realities. 5
  • 6.  FOX: 8 years, 8 seasons, 177 episodes  David Shore -- Primetime Emmy Awards Outstanding Writing for a Drama Series winner  Bryan Singer -- executive producer (film director of Valkyrie and X-Men)  Hugh Laurie -- twice winner of the Golden Globe Best Performance by an Actor in a Television Series – Drama  it has received an 8.9 / 10 rating from 237,068 users on IMDb.com as of November 2014 6
  • 7.  In 2008, it was one of the top-ten rated shows in the United States & the most watched television program in the world  By 2011, it had been viewed by a spectacular 81.8 million in 66 countries  since 2011, Hugh Laurie has been the world’s Most-Watched (Leading) Man On Television on the Guinness Book of Records  2nd on Forbes’s list of the Highest-Paid TV Actors in 2012 at $400,000 (£247,230) per episode 7
  • 8. 1. How do dialogues in House M.D. differ from/resemble contemporary American English? 2. What differences/similarities can be drawn from a comparison between COCA and COCA spoken corpus with respect to a House M.D. dialogue corpus? 3. What can be unveiled about House M.D. through this corpus linguistic approach? 8
  • 9. 9
  • 10.  Construct a House M.D. Corpus (927,922 words) from fan scripts  Remove all non-dialogue elements such as fade- ins, scene headings, action sequences, scene transitions, mood brackets, parentheticals, commercial tags and character name tags  Repeated manual check against internet sources 10
  • 11. 11
  • 12.  COCA contains more than 450 million words in 189,431 texts equally divided in 5 genres: spoken, fiction, popular magazines, newspapers and academic journals  “the largest freely-available corpus of English, and the only large and balanced corpus of American English” (Davies, 2008) 12
  • 13.  The spoken part of COCA (hereafter referred to as COCA Spoken) contains 95 million words [95,385,672] of transcripts of unscripted conversation from more than 150 different TV and radio programs such as All Things Considered (NPR), Newshour (PBS), Good Morning America (ABC), Today Show (NBC), 60 Minutes (CBS), Hannity and Colmes (Fox), Jerry Springer, etc (Davies, 2008). 13
  • 14.  PhD project on creative language requires the use of large, balanced and up-to-date corpus of American English  Spoken corpus larger than 1 million is rare, eg. SBCSAE has 249k words  The “reality” concerned is not of any specialized purposes, but has to include casual conversations as well as some medical topics (i.e. a “reality” which includes a doctor’s daily spoken English – social and medical discourse) 14
  • 15.  Ngram comparison & Rank Frequency Lists 1. HMDC 1-gram, 2-gram and 3-gram (hereafter referred to as 1-to-3-grams) with respect to COCA’s 1-to-3-grams and vice versa, 2. HMDC 1-to-3-grams with respect to COCA Spoken’s 1-to-3-grams and vice versa, 3. Negativity / positivity of HMDC 3-grams compared with COCA and COCA Spoken, and vice versa. 15
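To make the idea of a ranked frequency list concrete, here is a small sketch in plain Python. The study itself used WordSmith Tools; the tokenizer, the alphabetical tie-breaking rule and the function names below are assumptions introduced only for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; contractions such as "isn't" stay in one piece."""
    return re.findall(r"[a-z0-9]+(?:['’][a-z]+)?", text.lower())


def ngrams(tokens: list[str], n: int) -> list[str]:
    """All contiguous n-grams, each joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ranked_list(tokens: list[str], n: int) -> dict[str, int]:
    """Map each n-gram to its frequency rank (1 = most frequent).

    Ties are broken alphabetically, which is an arbitrary choice made here.
    """
    counts = Counter(ngrams(tokens, n))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered, start=1)}
```

Calling `ranked_list(tokens, n)` for n = 1, 2, 3 on each corpus gives the lists that are then compared in both directions.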
  • 16. 1. Rank sum / difference forms basis of Mann-Whitney U test 2. Does not assume normal distribution 3. Works well with small observed frequencies as well as large ones 4. Does not exaggerate at low frequency count 5. Simplifies large numbers, reveals underlying patterns 6. Indicates how different two sets of data are 16
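A rank-difference comparison of two ranked lists might look like the sketch below. It covers only the rank-difference step, not a full Mann-Whitney U test, and the sign convention (comparison rank minus reference rank), the `top_n` cut-off and the function name are assumptions rather than details taken from the slides.

```python
def rank_differences(ref_ranks: dict[str, int],
                     cmp_ranks: dict[str, int],
                     top_n: int = 20) -> tuple[dict[str, int], list[str]]:
    """Compare the top_n n-grams of a reference list against another list.

    Returns (shared, unfound): shared maps each n-gram present in both lists
    to cmp_rank - ref_rank; unfound lists top n-grams missing from cmp_ranks.
    """
    top = sorted(ref_ranks, key=ref_ranks.get)[:top_n]
    shared = {g: cmp_ranks[g] - ref_ranks[g] for g in top if g in cmp_ranks}
    unfound = [g for g in top if g not in cmp_ranks]
    return shared, unfound
```

Running the function once with HMDC as the reference and once with COCA (or COCA Spoken) as the reference gives the "and vice versa" comparison; a large absolute difference or an unfound n-gram marks a point of divergence.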
  • 17. 17
  • 18. Among the shared thirteen 1-grams with double-digit rank difference, House M.D. appears to be • more interpersonal • focused more on 1st and 2nd person singular than the norm in general written and spoken American English, but higher token ngrams must be considered. 18
  • 19. 19
  • 20. Issue 1: Contraction alternatives: I am (rank 170), You are, is not, etc. Issue 2: 2-grams There—’s and That—’s not found, but 2-grams There’s there and That’s that/i/you/Mr/it are found. Bug in algorithm? Divergence from COCA 20
  • 21. 21 Issue of Contraction alternatives continues: •It isn’t (rank 532), •I am not (rank 638), •It is a (rank 64), •You are not (rank 2,663), •You aren’t (rank 6,009), •There is no (rank 52) 17 Negativity 6 Negativity 18 contractions 6 contractions
  • 22.  Contraction alternatives & (possible) bug affect results  HMDC is not well-reflected by COCA  Should try comparing with COCA Spoken 22
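One possible way to reduce the contraction-alternatives problem is to map contracted and full forms onto a single form before counting. The sketch below is an illustration only: it was not part of the study, the mapping is a deliberately tiny hypothetical sample, and ambiguous forms such as 's (is / has / possessive) are left untouched.

```python
# Hypothetical, incomplete mapping used only for illustration.
EXPANSIONS = {
    "i'm": "i am",
    "you're": "you are",
    "isn't": "is not",
    "aren't": "are not",
}


def expand_contractions(tokens: list[str]) -> list[str]:
    """Replace known contractions in lowercased tokens by their full forms."""
    out = []
    for tok in tokens:
        key = tok.replace("’", "'")  # treat curly and straight apostrophes alike
        out.extend(EXPANSIONS.get(key, tok).split())
    return out
```

Expansion changes n-gram boundaries (the 2-gram "I'm not" becomes the 3-gram "I am not"), so both corpora would have to be normalised in the same way before their ranked lists are compared.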
  • 24. 24 Lower rank difference, Fewer unfound clusters, more shared 2-grams
  • 25. 25 Lower rank difference, lower negativity in COCA Spoken Issue of contraction alternatives & acronyms
  • 26.  Such decrease in the frequency of negativity in COCA Spoken with respect to COCA is a result of an increase in the positivity in spoken American English.  Therefore, considering the top twenty 3-grams, COCA Spoken contains 5% more positivity than COCA  HMDC contains 3.4 times more negativity than COCA Spoken and 2.8 times more than COCA.  In a way, House M.D. has brutally intervened in viewers’ perception of the norm of American English. 26
  • 27.  This study has discussed how the language used in House M.D. is related to contemporary spoken American English  Has listed differences and similarities drawn from a comparison between COCA and COCA Spoken with respect to HMDC  Has shown how House M.D. can be identified as a dramedy far more interpersonal, 1st and 2nd person-addressed and disagreeing than one would encounter in “reality”.  In addition to the original research questions, it has demonstrated the strengths and weaknesses of using 1-to-3-gram rank difference in comparing HMDC with COCA and COCA Spoken  Has addressed the potential methodological issue of contraction alternatives and acronyms affecting ngram ranking and rank difference 27
  • 28. 28
  • 29.  Judging by the results obtained from this simple analysis, further studies along the line of television drama are worthy of researchers’ attention, interest and devotion. 29
  • 30. 30
  • 31.  Allen, R. C. (2004). Frequently asked questions. A general introduction to the reader. In R. C. Allen, & A. Hill (Eds.), The Television Studies Reader (pp. 1-26). New York: Routledge.  Androutsopoulos, J. (2012). Introduction: Language and society in cinematic discourse. Multilingua, 31, 139-154.  Bednarek, M. (2011). The language of fictional television: a case study of the ‘dramedy’ Gilmore Girls. English Text Construction, 4(1), 54-83.  Bednarek, M. (2010). The Language of Fictional Television: Drama and Identity. London: Continuum International Publishing Group.  Biber, D. (2009). A corpus-driven approach to formulaic language in English. International Journal of Corpus Linguistics, 14(3), 275-311.  Bignell, J., & Lacey, S. (2005). Popular television drama: critical perspectives. (J. Bignell, & S. Lacey, Eds.) Manchester: Manchester University Press.  Brock, A. (2004). Analyzing scripts in humorous communication. Humor: International Journal of Humor Research, 17(4), 353-360.  Bubel, C. (2006). The linguistic construction of character relations in TV drama: Doing friendship in Sex and the City. Retrieved April 4, 2013, from SciDok-Datenbank: http://scidok.sulb.uni-saarland.de/volltexte/2006/598/pdf/Diss_Bubel_publ.pdf  Chamber, S. A. (2003). Language and Structure in The West Wing. In P. C. Rollins, & J. E. O'Connor (Eds.), The West Wing: The American Presidency As Television Drama (pp. 83-100). New York: Syracuse University Press.  Chua, B. H. (2008). Structure of identification and distancing in watching East Asian television drama. In B. H. Chua, & K. Iwabuchi, East Asian Pop Culture: Analysing the Korean Wave (pp. 73-90). Hong Kong: Hong Kong University Press.  Cover, R. (2004). From Butler to Buffy: Notes towards a strategy for identity analysis in contemporary television narrative. Reconstruction: Studies in Contemporary Culture, 4(2).  Davies, M. (2014, December 1). CoRD | The Corpus of Contemporary American English (COCA). Retrieved March 21, 2011, from VARIENG: CoRD | The Corpus of Contemporary American English (COCA)  Davies, M. (2011). N-grams data from the Corpus of Contemporary American English (COCA). Retrieved November 20, 2014, from http://www.ngrams.info  Goodier, B. C., & Arrington, M. I. (2007). Physicians, patients, and medical dialogue in the NYPD Blue prostate cancer story. Journal of Medical Humanities, 28(1), 45-58.  IMDb. (n.d.).  Jacoby, H., & Irwin, W. (Eds.). (2008). House and Philosophy: Everybody Lies. John Wiley & Sons.  Jamieson, D. (2011, September). Does TV accurately portray psychology? Retrieved April 20, 2013, from American Psychological Association: http://www.apa.org/gradpsych/2011/09/psychology-shows.aspx  Munt, S. R. (2006). A queer undertaking: Anxiety and reparation in the HBO television drama series Six Feet Under. Feminist Media Studies, 263-279.  O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge: CUP.  Piazza, R., Bednarek, M., & Rossi, F. (Eds.). (2011). Telecinematic Discourse: Approaches to the Language of Films and Television Series. Philadelphia: John Benjamins B.V.  Quaglio, P. (2008). Television dialogue and natural conversation: Linguistic similarities and functional differences. In A. Ädel, & R. Reppen (Eds.), Corpora and Discourse: The challenges of different settings (Vol. vi, pp. 189-210). John Benjamins.  Richardson, K. (2010). Television Dramatic Dialogue: A Sociolinguistic Study. Oxford: Oxford University Press.  Scott, M. (2014). WordSmith Tools Manual. Retrieved May 20, 2014, from Lexical Analysis Software: http://www.lexically.net/downloads/version6/HTML/index.html  Wild, D. K. (2005a, October 24). Constructing House: An Interview with House, M.D. writer Lawrence Kaplow. Retrieved September 4, 2014, from Blogcritics: http://blogcritics.org/constructing-house-an-interview-with-house/  Woznicki, K. (2005, September 27). A doctor/writer in the 'House'. Retrieved September 4, 2014, from CNN.com: http://edition.cnn.com/2005/HEALTH/09/27/profile.writer.foster/index.html 31