(Jan 2011) Scholars, their Blogs, and Digital Preservation: Reflections on Research Design
1. SCHOLARS &
THEIR BLOGS
REFLECTIONS ON RESEARCH DESIGN
Dr. Carolyn Hank
carolyn.hank@mcgill.ca
School of Information Studies
McGill University
SPEAKER SERIES
28 January 2011
2. 03 Background
04 Research Design
23 Questionnaires
30 Interviews
32 Blog Analysis
36 Findings
58 Discussion
60 Next Steps
02 | 63 agenda
3. LITERATURE
Blogger Perceptions on Digital Preservation (Hank, Sheble, & Choemprayong, 2007-2010)
Blogs & Blogging
Scholarly Communication
Blog Archiving
Digital Preservation
03 | 63 background
4. RESEARCH QUESTIONS
How do scholars who
blog perceive their blog
in relation to their cumulative
scholarly record?
04 | 63 research design
5. RESEARCH QUESTIONS
How do scholars who
blog perceive their blog
in relation to long-term
stewardship?
Who do they perceive
as responsible as well
as capable for blog
preservation?
05 | 63 research design
6. RESEARCH QUESTIONS
What blog characteristics
impact preservation?
What blogger behaviours
impact preservation?
06 | 63 research design
9. DATA SOURCES
BLOGGERS: Questionnaires, Interviews
BLOGS: Blog Analysis
09 | 63 research design
10. POPULATION
NEEDLE IN A
HAYSTACK
10 | 63 research design
11. POPULATION
CHAMELEON IN
A HAYSTACK
11 | 63 research design
12. POPULATION
Purposive Sampling
Academic Blog
Portal
<http://www.academicblogs.org>
12 | 63 research design
13. BLOGS BY CLUSTER
Domain                     Cluster       Blogs Listed at Source  Duplicates  Total Blogs
Humanities                 History       190                     1           189
Social Sciences            Economics     192                     0           192
Professions & Useful Arts  Law           120                     1           119
Sciences                   BioChemPhys   147                     3           144
All Domains                All Clusters  649                     5           644
Note. For total blogs within the Sciences cluster, BioChemPhys (N=144), sub-fields were
represented as follows: Biology blogs, 39% (n=56); Chemistry blogs, 15% (n=21); and
Physics blogs, 47% (n=67).
BioChemPhys is also abbreviated in select tables as “Sciences.”
13 | 63 research design
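The arithmetic behind the cluster table can be re-checked with a few lines of Python. This is only a sketch using the figures shown on the slide; the variable names are illustrative and not part of the original study materials.

```python
# Blogs listed at source and duplicate listings per cluster (figures from slide 13).
listed = {"History": 190, "Economics": 192, "Law": 120, "BioChemPhys": 147}
duplicates = {"History": 1, "Economics": 0, "Law": 1, "BioChemPhys": 3}

# Total blogs per cluster = blogs listed at source minus duplicate listings.
totals = {cluster: listed[cluster] - duplicates[cluster] for cluster in listed}
grand_total = sum(totals.values())

# Sub-field shares within the BioChemPhys cluster, rounded to whole percentages.
subfields = {"Biology": 56, "Chemistry": 21, "Physics": 67}
shares = {name: round(100 * n / totals["BioChemPhys"]) for name, n in subfields.items()}

print(totals)       # per-cluster totals after removing duplicates
print(grand_total)  # 644
print(shares)       # rounded shares sum to 101 because of rounding
```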
14. BLOG ELIGIBILITY …
PUBLICLY AVAILABLE
PUBLISHED IN ENGLISH
KNOWLEDGE OR PERSONAL BLOG
TIME-STAMPED POSTS
ACTIVELY PUBLISHED TO
AT LEAST 1 YEAR OLD
PERSONAL IDENTIFIERS (RE: AUTHORSHIP)
14 | 63 research design
15. CONTINUED
AUTHORED BY 1 OR MORE SCHOLARS
SCHOLAR CRITERIA
a) 1+ descriptor: Ph.D., Dr., Professor, Reader, Lecturer,
Doctoral Student, or Doctoral Candidate
b) 1+ descriptor (Scholar, Academic, Researcher, Research
Director, Fellow, Biologist) and institutional affiliation
c) Link to blogger’s CV or the like with 1+ citation to a
journal article
d) Graduate student and explicit reference to area of
study or pursuant degree
15 | 63 research design
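The scholar criteria above read like a screening rule, so they can be sketched as a small predicate. This sketch assumes that meeting any one of criteria (a) through (d) is sufficient, which the slide does not state explicitly; the `BloggerProfile` class and its field names are hypothetical, and only the descriptor lists come from the slide.

```python
from dataclasses import dataclass, field

# Descriptor lists copied from the slide.
TITLE_DESCRIPTORS = {"Ph.D.", "Dr.", "Professor", "Reader", "Lecturer",
                     "Doctoral Student", "Doctoral Candidate"}
ROLE_DESCRIPTORS = {"Scholar", "Academic", "Researcher", "Research Director",
                    "Fellow", "Biologist"}


@dataclass
class BloggerProfile:
    """Hypothetical record of what a coder observed about one blogger."""
    descriptors: set = field(default_factory=set)
    institutional_affiliation: bool = False
    cv_cites_journal_article: bool = False
    graduate_student: bool = False
    states_area_of_study: bool = False


def meets_scholar_criteria(p: BloggerProfile) -> bool:
    """Return True if any of criteria (a)-(d) holds (assumed to be alternatives)."""
    a = bool(p.descriptors & TITLE_DESCRIPTORS)
    b = bool(p.descriptors & ROLE_DESCRIPTORS) and p.institutional_affiliation
    c = p.cv_cites_journal_article
    d = p.graduate_student and p.states_area_of_study
    return a or b or c or d


# A "Researcher" descriptor alone is not enough under (b); an affiliation is also needed.
print(meets_scholar_criteria(BloggerProfile(descriptors={"Researcher"})))
print(meets_scholar_criteria(BloggerProfile(descriptors={"Researcher"},
                                            institutional_affiliation=True)))
```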
17. ASSESSMENT
Criterion                                   History    Econ       Law        Sciences
                                            Freq (%)   Freq (%)   Freq (%)   Freq (%)
Publicly available                          168 (90%)  163 (85%)  113 (95%)  126 (88%)
Published in English                        159 (84%)  151 (79%)  111 (93%)  123 (85%)
Knowledge or personal blog                  146 (77%)  140 (73%)  93 (78%)   119 (83%)
Time-stamped posts                          145 (77%)  140 (73%)  93 (78%)   118 (82%)
Actively published to                       68 (36%)   83 (43%)   58 (49%)   62 (43%)
At least 1 year old                         58 (31%)   66 (34%)   53 (45%)   54 (38%)
Personal identifiers in regard to           53 (28%)   59 (31%)   48 (40%)   48 (33%)
authorship
Authored by 1 or more bloggers meeting      46 (24%)   51 (27%)   47 (40%)   44 (31%)
scholar parameters
17 | 63 research design
18. SAMPLING FRAME (1)
Clusters      Single-Blogs  Co-Blogs  Total Blogs
History       32            14        46 (24%)
Economics     34            17        51 (27%)
Law           22            25        47 (40%)
BioChemPhys   37            7         44 (31%)
All Clusters  125           63        188 (29%)
Note. For blogs in the BioChemPhys cluster, disciplines were represented as follows:
Single-blogs: biology 43% (n=16), chemistry 14% (n=5), and physics 43% (n=16); and
Co-blogs: biology 0%, chemistry 29% (n=2), and physics 71% (n=5).
18 | 63 research design
19. BLOGGER ELIGIBILITY
CO-BLOGS: POSTED W/IN 1 MONTH
CO-BLOGS: MEETS SCHOLAR CRITERIA
ALL BLOGS: BLOGGER CONTACT INFO
19 | 63 research design
20. CO-BLOGGERS
Criterion                                   History    Econ       Law        Sciences
                                            (N=151)    (N=155)    (N=228)    (N=49)
                                            Freq (%)   Freq (%)   Freq (%)   Freq (%)
(Special Condition): Blogger published      43 (29%)   65 (42%)   114 (50%)  19 (39%)
within previous month
Blogger meets scholar parameters            31 (21%)   58 (37%)   107 (47%)  16 (33%)
Blogger contact information available       27 (18%)   56 (36%)   102 (45%)  15 (31%)
Protocol: Revised count and percentage      23 (15%)   53 (34%)   99 (43%)   15 (31%)
after removal of duplicate listings
20 | 63 research design
21. SINGLE-BLOGGERS
Criterion                                   History    Econ       Law        Sciences
                                            (N=32)     (N=34)     (N=22)     (N=37)
                                            Freq (%)   Freq (%)   Freq (%)   Freq (%)
Blogger contact information available       27 (84%)   32 (94%)   21 (96%)   28 (76%)
21 | 63 research design
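The final rows of the co-blogger and single-blogger tables can be added up to see how many bloggers remained eligible. The sketch below uses only counts shown on slides 20 and 21; the variable names are illustrative. The resulting total agrees with the N=298 reported for invited bloggers on the administration slide.

```python
# Eligible co-bloggers after removal of duplicate listings (slide 20, last row).
co_bloggers = {"History": 23, "Economics": 53, "Law": 99, "Sciences": 15}

# Single-bloggers with contact information available (slide 21).
single_bloggers = {"History": 27, "Economics": 32, "Law": 21, "Sciences": 28}

eligible_co = sum(co_bloggers.values())
eligible_single = sum(single_bloggers.values())
eligible_total = eligible_co + eligible_single

print(eligible_co, eligible_single, eligible_total)  # 190 108 298
```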
22. APPENDIX A
SAMPLE CODING SYSTEM
Criteria 1-9
Data Management
48 Categories/Attributes
Specific Instructions
22 | 63 research design
23. INSTRUMENT DESIGN
QUESTIONNAIRES
Q1 (single-bloggers): 41 to 58 questions
Q2 (co-bloggers): 41 to 62 questions
Qualtrics
QUESTIONNAIRES AVAILABLE IN APPENDICES C & D
23 | 63 questionnaires
24. INSTRUMENT DESIGN
Do not reinvent the wheel
Lenhart & Fox (2006)
Herring et al. (2005)
Morton and Price (1999)
Olsen et al. (2009)
Rainie (2005)
White & Winn (2009)
Hank et al. (2007)
24 | 63 questionnaires
27. ADMINISTRATION
All eligible bloggers invited (N=298)
Personalized Email
Salutation | Blog Title | Blog URL | PIN
Manual
Invite and 2 reminders
Available for 3 weeks
No inducements (except final report)
27 | 63 questionnaires
28. COMPLETED SAMPLE
RR 1: QI: 63% | QII: 46% | QI/II: 52%
Completed sample:
153 respondents
Outcome rates derived from Internet surveys of specifically named persons from
the American Association for Public Opinion Research (AAPOR, 2009)
28 | 63 questionnaires
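For readers unfamiliar with the "RR 1" label, AAPOR's Response Rate 1 divides complete responses by all eligible and possibly eligible cases. The sketch below implements that general formula; the disposition counts passed to it are hypothetical placeholders chosen only to make the arithmetic easy to follow, since the slides do not report the study's case dispositions.

```python
def response_rate_1(complete, partial, refusal, non_contact, other,
                    unknown_household, unknown_other):
    """AAPOR RR1: complete responses over all eligible and unknown-eligibility cases."""
    denominator = ((complete + partial)
                   + (refusal + non_contact + other)
                   + (unknown_household + unknown_other))
    if denominator == 0:
        raise ValueError("no cases to compute a response rate from")
    return complete / denominator


# Hypothetical disposition counts, NOT taken from the study.
hypothetical = dict(complete=50, partial=10, refusal=20, non_contact=15,
                    other=5, unknown_household=0, unknown_other=0)
rate = response_rate_1(**hypothetical)
print(f"{rate:.0%}")  # 50%
```

Because partial responses sit in the denominator but not the numerator, RR1 is the most conservative of the AAPOR response rates.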
30. DESIGN & ADMIN
11 to 14 questions
Protocol | Debriefing Sheet | Pre-Test
72 (47%) expressed interest
Concurrent to other data collection
24 phone interviews (semi-structured)
15 to 25+ minutes
30 | 63 interviews
31. ANALYSIS
Interviews / Digital Recordings
Notes
Partial Transcripts
3+ listening sessions
CONSENT SCRIPT, SCHEDULE, & DEBRIEFING SHEET
AVAILABLE IN APPENDICES G & D
31 | 63 interviews
32. SAMPLE
Coded 93 blogs (49.5% sampling ratio)
Clusters      Single-Blogs  Co-Blogs  Total Blogs
              Count         Count     Count
History       16            7         23
Economics     17            8         25
Law           11            13        24
BioChemPhys   17            4         21
All Clusters  61            32        93
32 | 63 Blog analysis
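The 49.5% sampling ratio follows from the 93 coded blogs and the 188 blogs in the sampling frame. A minimal sketch of that check in Python, using only counts from slides 18 and 32 (variable names are illustrative):

```python
# Blogs in the sampling frame (slide 18, "All Clusters" total).
sampling_frame_size = 188

# Coded blogs per cluster as (single-blogs, co-blogs), from slide 32.
coded = {
    "History": (16, 7),
    "Economics": (17, 8),
    "Law": (11, 13),
    "BioChemPhys": (17, 4),
}

coded_single = sum(single for single, _ in coded.values())
coded_co = sum(co for _, co in coded.values())
coded_total = coded_single + coded_co

sampling_ratio = coded_total / sampling_frame_size
print(coded_single, coded_co, coded_total)  # 61 32 93
print(f"{sampling_ratio:.1%}")              # 49.5%
```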
33. CODE BOOKS
CODING SYSTEMS
CB1 (single-blogs): 63 Indicators (on/off blog)
CB2 (co-blogs): 57 Indicators (on/off blog)
Authorship
Blog Elements & Features
Rights & Disclaimers
Authority & Audience
Blog Publishing Activity
Post Features
Archiving
SINGLE- & CO-BLOG CODING SYSTEM AVAILABLE IN APPENDIX J
33 | 63 Blog analysis
34. TESTING/COLLECTING
Time in Minutes  Single-Blog Frequency (%)  Co-Blog Frequency (%)
≤ 9              17 (28%)                   5 (15%)
10 to 19         32 (52%)                   24 (73%)
20 to 29         9 (15%)                    2 (6%)
30 to 39         2 (3%)                     1 (6%)
≥ 40             1 (2%)                     -
34 | 63 Blog analysis
35. ANALYSIS
Excel
SPSS
Excel
35 | 63 Blog analysis
44. BLOG PROFILE
Avg. blog age
is 4.5 years old
(range 1 to 8)
44 | 63 findings
45. QUESTION (1)
public 100%
allows use and exchange 94%
part of the scholarly record 80%
subject to critical review 68%
Association of Research Libraries (1986). Braxton, J.M., Luckey, W., & Helland, P. (2002).
45 | 63 findings
46. QUESTION (2) Preservation Preferences
Personal access/use 16%
Short-term future
Personal access/use 19%
Short-term future
Personal access/use 76%
Indefinite future
Public access/use 80%
Indefinite future
46 | 63 findings
49. QUESTION (2) Doomsday Scenario
SADNESS
“Pretty bad;” “Very bad;” “Sad;” “Pretty sad;” “Panicked;”
“Devastated, both emotionally and professionally.”
ANGER
“Mad as hell;” “Pretty peeved;” “Pretty angry;” “Angry and upset;” “Frustrated;”
I’d do something drastic [in response] (i.e., legal action).
RELIEF
“I don’t have to do it anymore;” “I get half an hour of my life back.”
C’EST LA VIE
“Pour another cup of coffee and get back to work;” “Probably have a drink and
forget about it;” “Not welcomed but not tragic … I’d get over it;” “Drop out of the
blogosphere until something else comes along.”
DOUBT
“How would that happen?;” “It would take an extreme catastrophe;”
“Hard to believe lost and unrecoverable.”
49 | 63 findings
51. QUESTION (3)
Dynamic, changing
Co-producer dependencies
Understandability
Versioning
Rights and Use
Some Archiving Activity
51 | 63 findings
52. QUESTION (3)
BLOGS: 55% of most recent posts published ≤ 3 days
BLOGGERS: 55% update their blog several times a week
52 | 63 findings
53. QUESTION (3)
95% edit posts after publication
Spelling & grammatical errors
Rephrasing
Remove incorrect info
Published before ready
29% delete posts after publication
Duplicate post
“Post regret”
Too sensitive or revealing
53 | 63 findings
54. QUESTION (3) Most Recent Post
Text 99%
544 total words
79 quoted words
465 original words
Links 82% (avg. 5)
Photos 16%
Other image elements 16%
Comments 57%
54 | 63 findings
55. QUESTION (3)
50% check for permissions before
publishing content at least half the
time.
55 | 63 findings
56. QUESTION (3)
Rights
51% none
37% text statement
14% Creative Commons
Other Policies
56 | 63 findings
57. QUESTION (3)
80% of blogs in sample archived
to Internet Archive Wayback
Machine
50% of law cluster blogs (n=12)
archived at Library of Congress’
Legal Blawgs Web Archive
57 | 63 findings
58. CONCLUSIONS (2007)
Findings Future
Bloggers are interested Methodology
Save some but not all Responsibility
New content added Access scenarios
Old content altered Versioning
Personal responsibility Intellectual Property
Defining roles of others Access scenarios
Process in time
58 | 63 discussion
59. CONCLUSIONS (2010)
Blogs in support of service,
teaching, and research
First line of defense
Last line of defense
Service Providers and Networks
Tools, Resources, Engagement
59 | 63 discussion
60. FUTURE WORK
Continued analysis
Personal and Programmatic
Approaches
BlogForever
Twitter and the Library of Congress
Terms of Service Agreements
60 | 63 next steps
61. WWTD what would Tufte do?
(see handout for references)
Paul Jones
61 | 63 references
62. Thanks to ....
Dr. Helen R. Tibbo
Dr. Lynn Silipigni Connaway
Dr. Jeffrey Pomerantz
Paul Jones
Dr. Richard Marciano
Thanks for ....
Beta Phi Mu 2010 Eugene Garfield Doctoral
Dissertation Fellowship
62 | 63 acknowledgements
63. And thank you.
CAROLYN HANK
Email: carolyn.hank@mcgill.ca
Phone: 514.398.4684
Web: http://ils.unc.edu/~hcarolyn
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United
States License: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
63 | 63 questions