Letting Users Lead:
Analyzing Search Queries & Relevancy
in USC’s Web-Scale Discovery Tool
California Association of Research Libraries, 2014
April 5, 2014
Beth Namei, University of Southern California
Christal Young, University of Southern California
Web Scale Discovery Services in a nutshell
image: http://www.colinburnett.com/wp-content/uploads/2014/01/livingunderrock.jpg
The hype - Have we chosen wisely?
image: http://jessicaknauss.blogspot.com/2012/09/the-grail-knight-as-inspiration.html
Why USC got Summon
#1: To provide better discoverability of our subscription
and purchased content (via a unified access point)
#2: To provide relevant results to our users (most
urgently, relevance-ranked results for items in
our SIRSI OPAC)
#3: To provide a better user experience with the
library’s website
Our study
We re-executed a sample of Summon search
queries to see how successful our users were in
retrieving relevant results.
April 2010 - Pre-Summon homepage
July 2010 - Summon is launched as the default tab
July 2012 - Our current Summon-centric homepage
(Catalog tab is removed)
Motivation for our study
• There were a lot of anecdotal complaints.
• To get evidence about how successful Summon
was in leading users to relevant sources.
• To learn more about user search behavior with this
single search box
Our methodology
image: http://www.rsc.org/chemistryworld/issues/2007/december/thechemistrysetgeneration.asp
Transaction Log Analysis (TLA)
A transaction log is a history of actions executed in a system.
TLA involves looking at the data captured in a transaction log
to investigate interactions between users and a search tool.
Advantages of TLA
• Unobtrusive
• Large quantities of data that can reveal large-scale
patterns
• Low cost
Limitations of TLA
• Does not capture contextual information such as the
user’s emotions, motivations, intentions and needs.
• Does not capture demographic information.
• Does not capture user satisfaction with the
overall search experience.
Summon searches in Fall 2013: 1,243,250
# of unique searches: 184,076
Our sample: 384 searches
Margin of error: +/- 5%
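The slides do not say how the sample of 384 was derived. One common route to that figure is Cochran's sample-size formula with a finite population correction. The sketch below assumes a 95% confidence level (z = 1.96) and maximum variability (p = 0.5), neither of which is stated in the slides, and it takes the 184,076 unique searches as the population. The function name and parameters are illustrative only.

```python
import math


def sample_size(population: int, margin: float = 0.05,
                z: float = 1.96, p: float = 0.5) -> int:
    """Cochran's formula with a finite population correction.

    Assumptions (not from the slides): z = 1.96 for 95% confidence,
    p = 0.5 as the most conservative proportion.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Shrink it slightly because the population is finite.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    print(sample_size(184_076))  # unique Summon searches, Fall 2013
```

Under these assumptions the function returns 384 for a population of 184,076, which matches the sample size reported above.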
Defining success (and failure)
image: http://abovethelaw.com/wp-content/uploads/2013/12/thumbs-up.jpg
Success = Relevant results
"We are used to hearing people talk about the 'simple
search box' as the goal" of Discovery services.
"But a simple search box has only been one part of the
Google Formula. PageRank has been very important in
providing a good user experience, and Google has
progressively added functionality as it included more
resources in the search results" (Dempsey, 2012, 4).
Information overload
“Surveys of...users
reveal a consistent
theme: most are
overwhelmed and
confused by the
disorganized flood
of information”
(Riley 24).
image: https://www.flickr.com/photos/youreyestellies/7984526358
“Surveys of Internet users reveal a consistent theme:
most are overwhelmed and confused by the
disorganized flood of information. As a result,
librarians have an opportunity to carve their niche as
the de facto information navigators of the Digital
Age”
Riley, Margaret. "Riley's Guided Tour: Job Searching On The Net." Library
Journal 121 (1996): 24-27.
Information used to
be scarce….
Image: http://allaboutalpha.com/blog/wp-content/uploads/2011/04/iStock_000014587595XSmall.jpg
But today our attention
span and time are scarce
while information is
abundant and easily
accessible.
“There are so very many
search engines now and so
much information; I preferred
the ‘good old days’ when there
weren’t so many ways to
access information”
- USC Journalism faculty
member.
image: http://gcn.com/articles/2013/09/30/smarter-cities-cloud.aspx
Summon’s Relevance Ranking
Looks at:
• term proximity - how close the query terms are to one
another.
• term frequency - how often the query terms appear
in the record.
• field weighting - where the query term is found.
Finding query terms in some fields is more important
than finding them in others.
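To make the three signals concrete, here is a toy scoring function. It is not Summon's algorithm, which is proprietary and not described in these slides. The field names, the weights, and the way the signals are combined are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical field weights: a hit in the title counts more than in the abstract.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "abstract": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def smallest_window(tokens, terms):
    """Length of the shortest token span containing every term, or None."""
    needed = set(terms)
    if not needed:
        return None
    best, counts, left = None, {}, 0
    for right, tok in enumerate(tokens):
        if tok in needed:
            counts[tok] = counts.get(tok, 0) + 1
        # While the window covers every term, record it and shrink from the left.
        while len(counts) == len(needed):
            width = right - left + 1
            if best is None or width < best:
                best = width
            ltok = tokens[left]
            if ltok in counts:
                counts[ltok] -= 1
                if counts[ltok] == 0:
                    del counts[ltok]
            left += 1
    return best


def score(record, query):
    """Toy relevance score: field weight * (term frequency + proximity bonus)."""
    terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(record.get(field, ""))
        term_frequency = sum(tokens.count(t) for t in terms)
        window = smallest_window(tokens, terms)
        # Proximity is 1.0 when the terms are adjacent and falls as they spread out.
        proximity = len(terms) / window if window else 0.0
        total += weight * (term_frequency + proximity)
    return total
```

With these made-up weights, a record whose title contains the query terms side by side scores higher than one that mentions them far apart in the abstract only.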
Also considers:
• Some content types are boosted over others
o Books and journal articles over newspaper articles and book reviews.
o The Journal over the articles in that journal
• Publication Date - Generally, items with newer publication dates are
favored over older items.
• Citation Counts - Items cited more are boosted
• Local Collections - Content from institutional catalog(s) and repositories
are boosted.
• Content Size - Longer works aren’t necessarily more relevant even though
search terms might appear in them more often
A “Success Engine”
Relevancy Enhancements: “to ensure users don’t miss out on the most
relevant content.”
This includes automatically searching for synonyms, term-stemming and smart
searching of stop-words depending on their importance to the search phrase.
image: http://blog.dappersnappers.com/wp-content/uploads/2012/02/baby-laptop-computer.jpg
Measuring Relevancy in our Study
We used “systems-oriented relevance” to rate the
success of search queries.
We rated the relevancy of Summon’s results without
knowing the context of the original query. We looked at
how well the topic of the search was represented in the
topics of the results retrieved in Summon, Google and
Google Scholar.
- Maglaughlin & Sonnenwald 2002, 328-9.
Inter-rater reliability
image: http://www.artofmanliness.com/2008/08/21/manly-feats-of-strength/
Types of searches
Known item searches
Had to be specific enough for
us to recognize a definitive
match. If the search got
numerous matches and was
general, we would likely not
categorize it as a known
item.
Examples:
• marketing alterity moore
• 978-0078026676
• An Empirical Analysis of Cigarette
Addiction
• "happy days are here again" garland
streisand
Keyword/topic/exploratory
searches
Encompassed broad, general or
ambiguous searches. Named
persons were put into this category
as well.
Examples:
• Catherine the Great,
• vulnerability and happiness
• PTSD and Substance use in african
american women
• supersitions
Our Relevancy Rubric
Known Item Searches
Relevant:
1st item in the
list of results
Partially
Relevant:
2nd-10th item in
the list of results
Not relevant:
Not listed in the
first 10 results or
no results
retrieved.
(could mean we don’t
own the item or there
is a user input error)
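The known item rubric above reduces to a rule on the position of the sought item in the result list. A minimal sketch, assuming the rater records a 1-based rank, or None when the item is missing or no results are returned (the function name is illustrative):

```python
from typing import Optional


def rate_known_item(rank: Optional[int]) -> str:
    """Apply the known item rubric to the rank of the sought item.

    rank is the 1-based position in the result list, or None when the
    item was not found or the search returned no results.
    """
    if rank is not None and rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank is None or rank > 10:
        return "Not relevant"
    if rank == 1:
        return "Relevant"
    return "Partially relevant"
```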
Our Relevancy Rubric
Keyword Searches
Relevant: ALL search
terms appear in item's title or
record.
All search terms appear to
have a relationship to one
another, they are not just
randomly placed throughout
the title/record.
First 5 items look like a
"perfect"/solid match or
clearly seem to be ABOUT
the topic as identified
through the search terms.
Partially Relevant
Not all search terms are
visible in the title or record of
the items.
At least 3 of the first 5 results
appear to be somewhat
related to the topic as
entered, even if broadly or
tangentially.
Would a user easily recognize
a connection between the
results and the topic as it was
entered?
Not Relevant: At least
four of the first 5 items appear
to be false hits.
Even if one or more of the
search terms (or synonyms of
those terms) appear in the
title, abstract or record,
results appear to be only
about a portion of the search
terms entered and not about
all of them combined.
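The keyword rubric depends on human judgment, but its counting rules can be sketched once a rater has labeled each of the first five results. The labels "solid", "related" and "false_hit" are hypothetical names for the three judgments described above. The rubric does not say what to do when a search fits none of the three descriptions (for example, three false hits and two related results), so this sketch flags that case instead of guessing. Treating an empty result list as not relevant is also an assumption.

```python
def rate_keyword_search(judgments):
    """Rate a keyword search from per-result labels for the top results.

    judgments: iterable of "solid", "related" or "false_hit", in rank order.
    Only the first five labels are considered.
    """
    top = list(judgments)[:5]
    false_hits = sum(1 for j in top if j == "false_hit")
    solid = sum(1 for j in top if j == "solid")
    related = sum(1 for j in top if j in ("solid", "related"))

    # Assumption: no results at all counts as not relevant.
    if not top or false_hits >= 4:
        return "Not relevant"
    if len(top) == 5 and solid == 5:
        return "Relevant"
    if related >= 3:
        return "Partially relevant"
    # The rubric leaves this combination undefined.
    return "Needs rater judgment"
```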
Summon vs. Google vs. Google Scholar
Known Item
Searches:
For Google & Google
Scholar, relevancy
was determined by a
match; it did not have to
be a full-text match.
Links to Amazon,
Google Books,
WorldCat, imdb.com,
and YouTube counted as
matches.
Image: http://www.geekwire.com/2013/ibm-takes-watson-cloud/
our findings...
Top 100 most frequent queries in Fall 2013
Consisted of 23,813 individual searches
Long Tail of unique searches
87% (160,263) of total searches
make up the long tail
Top 100 unique
searches make up
13% of total queries
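The head-versus-tail split above can be recomputed from any query log. The sketch below assumes the log is available as a plain list of query strings and that queries are compared after trimming whitespace and lowercasing. The slides do not say how queries were normalized, so that step is an assumption.

```python
from collections import Counter


def head_share(queries, head_size=100):
    """Fraction of all searches accounted for by the most frequent queries."""
    # Assumption: case and surrounding whitespace do not distinguish queries.
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    head = sum(n for _, n in counts.most_common(head_size))
    return head / total if total else 0.0


if __name__ == "__main__":
    # Figures from the slides: searches in the top 100 queries and in the long tail.
    top_100, long_tail = 23_813, 160_263
    print(round(100 * top_100 / (top_100 + long_tail)))  # share of the top 100, in percent
```

The two counts on the slides sum to 184,076, and the top 100 queries account for about 13% of that total, consistent with the percentages shown.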
Types of Searches
(chart values: 36%, 62%, 2%)
Image: http://seanarchy.files.wordpress.com/2013/12/educated-poor.jpg
Summon vs. Google vs. Google Scholar
Which do you think did better?
http://bit.ly/summon-faceoff
Successful Searches:
54% of all our sample searches
(204) retrieved relevant results.
Summon’s Overall Relevancy Report Card = F
After the curve = C
Moderately Successful +
Successful Searches:
73% (273) retrieved partially
relevant - relevant results
image: http://bleedingedge.pynchonwiki.com/wiki/images/b/b8/Dunce-cap.jpg
Successful Searches:
85% (318) of the sample
searches retrieved
relevant results
Google’s Relevancy Report Card = B
After the curve: A
Moderately Successful
+ Successful Searches:
95% (356) of the searches
retrieved partially relevant
- relevant results.
Successful Searches:
54% of the sample
searches (203) retrieved
relevant results.
Google Scholar’s Relevancy Report Card = F
After the curve: C
Moderately Successful +
Successful Searches:
75% (281) retrieved
partially relevant -
relevant results.
Relevancy of all Keyword Searches
n=236
Relevancy of all known item searches
n=139
Failed searches in Summon
32% (45) of all known item searches did not locate the
item being searched for
• 33% (15) of these failed searches are for items USC
does not own
Of the items we do own:
• 57% (17) did not show up due to user error
• 40% (12) did not show up due to a Summon problem
(bad metadata, poor relevancy, not indexed in Summon)
• 3% (1) had irregular characters and found no results
Relevancy of “academic” known item searches that USC
owns
Summon
improved 13%
Google Scholar
improved 15%
Google improved 1%
n=109
Revised Relevancy Report Cards
Google Scholar = F
57% (192) retrieved
relevant results
(up from 53%)
After Curve = C+
79% (272) retrieved
partially relevant -
relevant results
(up from 73%)
Summon = F
59% (202) retrieved
relevant results
(up from 53%)
After Curve = C+
79% (271) retrieved
partially relevant -
relevant results
(up from 73%)
Google = B
84% (291) retrieved
relevant results
(down from 85%)
After Curve = A
95% (328) retrieved
partially relevant -
relevant results
(no change)
User errors
18% (66) of the
searches in our
sample had a
user input error
Image: http://human-error.sarkisozlerik.com/human-error/a-lifetime-by-design.html
“Did you mean?”
Showed up 24 times
• 83% (20) triggered by user input errors.
• 42% (10) of the time the “Did you mean” links took
users to relevant results.
Google automatically redirects searches with errors
Summon lets the user
decide whether to follow
a “corrected” path
Median: 3
Mean: 3.182
Mode: 2
# of words used in Keyword Searches
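Summary statistics like these can be reproduced with the Python standard library. The sketch below assumes words are separated by whitespace, which the slides do not specify, and the sample queries are the keyword examples listed earlier in the deck.

```python
import statistics


def word_count_summary(queries):
    """Median, mean and mode of the number of words per query."""
    # Assumption: a "word" is any whitespace-separated token.
    lengths = [len(q.split()) for q in queries if q.strip()]
    return {
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "mode": statistics.mode(lengths),
    }


if __name__ == "__main__":
    sample = [
        "Catherine the Great",
        "vulnerability and happiness",
        "supersitions",
        "PTSD and Substance use in african american women",
    ]
    print(word_count_summary(sample))
```

The function raises statistics.StatisticsError when given no non-empty queries, so a real log should be checked for that case first.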
Impact of # of search terms entered on relevancy
Type of Keyword Searches Executed
n=236
Impact of search type on relevancy
Duplicates
image: http://amarkedman.com/wp-content/uploads/2011/07/Matrix-Clones.jpg
Known item searches:
15 searches had 2 or
more duplicates (11%)
Keyword searches:
51 searches had 2 or
more duplicates (22%)
Linking to full-text
"Linking users to full text as
quickly as possible after
discovery results are available
is a paramount concern"
(NISO ODI Report, 2013, 7).
There were only 3 bad links
(out of 55 known item article
searches)
image: https://flic.kr/p/mqjHgR
What we learned:
Summon has some work to do to improve the
relevancy of its results.
But, it is doing better in other areas….
Summon added
as default search
tab (July 2010)
Catalog search
tab removed
from homepage
(July 2012)
Winning! (sort of…)
image: http://beautelicious.com/2011/09/charlie-sheen-warner-bros-close-settling-wrongful-termination-suit/
Final Report Card:
Relevancy = C+
Intuitive starting place =
Fast =
Bringing users back to the library =
Maximizing usage of collections =
Moving forward - Following our users’ lead
image: http://blog.cityspoon.com/2012/02/08/gathering-followers/
Leadership Strategy #1: Learn and use your
library’s discovery tool
• 3 million searches in 2013!
• Will give you insight into what
users are experiencing, both
good and bad.
• Discovery tools are taking up
prime real estate on many of
our websites.
image: http://www.creativity4us.com/wp-content/uploads/2012/02/blinders-crop.jpg
Leadership Strategy #2: Change what and how
we teach
“Librarians must reconsider training students to use
advanced search features or Boolean logic if students
purposefully choose not to use them or fail to use them
correctly. Rather than teaching students more effective
search syntax, more attention should be placed on
developing critical thinking and evaluative skills"
(Holman, 2011, 24).
Leadership Strategy #3: Be a squeaky wheel
• Many users will get frustrated and abandon the library
without ever letting us know about problems.
• We cannot depend on other people to report problems.
Leadership Strategy #4: Improvise and fall in
front of students
• Show students how to troubleshoot a failed search.
• Show searches with typos, or how to revise a search
that gets no results.
"If we think like users (instead of as librarians) it is easy
to understand the frustration. Our tools must seem
broken or outdated to them….Are we in the business of
promoting library databases or the business of helping
users accomplish their tasks?” (Matthews, 2013)
Leadership
Strategy #5:
“The trouble with Summon is that students don’t need to be
taught how to use it, but librarians do”
-Matt Borg, Sheffield Hallam University, 2012
English Faculty Member:
“I want an easy way to find a book with
author and title, and an easy way to move
from that to journal articles if that’s what
I want….”
Religion Faculty Member: “If I need to
search something I don’t go to any USC
search engine, which is totally a waste of
time. I go to Google where I can get
things ten times as fast.”
image: http://www.morvimmer.com/blog/free-download-staples-easy-button/
Leadership strategy #6: Empathize with users
AND colleagues
• Try not to criticize or judge
• Invite skeptical librarians into your classes to watch you
teach w/the discovery service
• Showing vs telling - talking can only get you so far
Common complaints
• “I think it's a cheat. Too many students don't learn basic searching skills that
would make any search better - like planning before typing. They just throw in
anything and take what comes up first” - 3/1/14 Survey of USC Instruction
Librarians
• “It is so imprecise" (Buck & Steffy, 2013, 76).
• Perpetuates "the homogenization of information" - when "everything looks and
feels the same" (Bawden & Robinson, 2008, 181).
• “pandering to the 'principle of least effort'” (Richardson 2013; Meadow &
Meadow 163-4).
• They “impinge upon the development of critical research skills” (Rose-Wiles &
Hofmann, 2013, 156).
• “these systems...reinforce unreflective research habits” (Asher, 2013, 6)
Complaining is not a (constructive) strategy
“When you invent something new, if customers come to
the party, it’s disruptive to the old way…. The internet is
disrupting every media industry... people complain
about that but complaining is not a strategy. Amazon
is not happening to bookselling, the future is
happening to bookselling.”
-Jeff Bezos, 60 Minutes, [8:55]. 12/1/2013
Leadership Strategy #7: Solicit feedback AND
then listen
• Look for positive AND negative feedback
• Re-frame negatives as positives or as opportunities for
dialogue
Leadership Strategy #8: Turn negativity to
your advantage
“Engage and transform the
most negative person in your
library system into a productive
team member.” “[C]onverting
the most negative person has a
huge impact on the rest of the
staff” (Cuillier, 2012, 439).
Image: http://www.salon.com/2013/03/12/why_is_francis_underwood_a_democrat/
Leadership Strategy #9: Gather evidence
• Test your assumptions
• Test your colleagues’ assumptions
• Study user behavior, formally and informally
• Assess the tool and then assess it again
Leadership Strategy #10: Redefining ourselves
References
Asher, Andrew D., Lynda M. Duke, and Suzanne Wilson. “Paths of Discovery: Comparing the
Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and
Conventional Library Resources.” College & Research Libraries, 74.5 (2013): 464-488.
Bawden, D., and L. Robinson. “The Dark Side of Information: Overload, Anxiety and Other
Paradoxes and Pathologies.” Journal of Information Science 35.2 (November 21, 2008):
180-91. doi:10.1177/0165551508095781.
Buck, Stefanie, and Christina Steffy. “Promising Practices in Instruction of Discovery Tools.”
Communications in Information Literacy 7.1 (2013).
Cuillier, Cheryl. “Choosing Our Futures … Still!” Journal of Library Administration 52.5 (July 2012):
436–51. doi:10.1080/01930826.2012.700806.
Dempsey, Lorcan. “Thirteen Ways of Looking at Libraries, Discovery, and the Catalog: Scale,
Workflow, Attention.” Educause, December 10, 2012.
Holman, Lucy. “Millennial Students’ Mental Models of Search: Implications for Academic Librarians
and Database Developers.” The Journal of Academic Librarianship 37.1 (January 2011):
19–27. doi:10.1016/j.acalib.2010.10.003.
Maglaughlin, K. L., and D. H. Sonnenwald. “User Perspectives on Relevance Criteria: A
Comparison among Relevant, Partially Relevant, and Not-Relevant Judgments.” Journal of
the American Society for Information Science and Technology, 2002.
Matthews, Brian. “Database vs. Database vs. Web-Scale Discovery Service: Further Thoughts on
Search Failure (or: More Clicks than Necessary?) (or: Info-Pushers vs. Pedagogical
Partners).” Chronicle of Higher Education. Ubiquitous Librarian, August 21, 2013.
Meadow, Kelly, and James Meadow. “Search Query Quality and Web-Scale Discovery: A
Qualitative and Quantitative Analysis.” College & Undergraduate Libraries 19.2-4 (2012):
163–75. doi:10.1080/10691316.2012.693434.
NISO ODI Working Group. National Information Standards Organization ODI Survey Report:
Reflections and Perspectives on Discovery Services, January 2013.
Pan, Bing, Helene Hembrooke, Thorsten Joachims, Lori Lorigo, Geri Gay, and Laura Granka. “In
Google We Trust: Users’ Decisions on Rank, Position, and Relevance.” Journal of
Computer-Mediated Communication 12.3 (April 2007): 801–23.
doi:10.1111/j.1083-6101.2007.00351.x.
Richardson, Hillary A. H. “Revelations From the Literature: How Web-Scale Discovery Has Already
Changed Us.” Information Today, May 2013.
Rose-Wiles, Lisa M., and Melissa A. Hofmann. “Still Desperately Seeking Citations: Undergraduate
Research in the Age of Web-Scale Discovery.” Journal of Library Administration 53.2–3
(February 2013): 147–66. doi:10.1080/01930826.2013.853493.

Carl 2014 slides_gotime

  • 1. Letting Users Lead: Analyzing Search Queries & Relevancy in USC’s Web-Scale Discovery Tool California Association of Research Libraries, 2014 April 5, 2014 Beth Namei, University of Southern California Christal Young, University of Southern California
  • 2. Web Scale Discovery Services in a nutshell image: http://www.colinburnett.com/wp-content/uploads/2014/01/livingunderrock.jpg
  • 3. The hype - Have we chosen wisely? image: http://jessicaknauss.blogspot.com/2012/09/the-grail-knight-as-inspiration.html
  • 4. Why USC got Summon #1: To provide better discoverability of our subscription and purchased content (via a unified access point) #2: Provide relevant results to our users (most urgently to provide relevance ranked results for items in our SIRSI OPAC) #3: To provide a better user experience with the library’s website
  • 5. Our study We re-executed a sample of Summon search queries to see how successful our users were in retrieving relevant results.
  • 6. April 2010 - Pre-Summon homepage
  • 7. July 2010 - Summon is launched as the default tab
  • 8. July 2012 - Our current Summon-centric homepage (Catalog tab is removed)
  • 9. Motivation for our study • There were a lot of anecdotal complaints. • To get evidence about how successful Summon was in leading users to relevant sources. • To learn more about user search behavior with this single search box
  • 11. Transaction Log Analysis (TLA) A transaction log is a history of actions executed in a system. TLA involves looking at the data captured in a transaction log to investigate interactions between users and a search tool.
  • 12. Advantages of TLA • Unobtrusive • Large quantities of data that can reveal large-scale patterns • Low cost
  • 13. Limitations of TLA • Does not capture contextual information such as the user’s emotions, motivations, intentions and needs. • Does not capture demographic information. • Does not capture user satisfaction with the overall search experience.
  • 14. Summon searches in Fall 2013: 1,243,250 # of unique searches: 184,076 Our sample: 384 searches Margin of error: +/- 5%
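The sample size above can be checked with the standard sample-size formula for a proportion. This is only a sketch: it assumes a 95% confidence level (z = 1.96) and maximum variability (p = 0.5), neither of which is stated on the slide, and the function names are hypothetical.

```python
import math

def cochran_sample_size(margin, z=1.96, p=0.5):
    """Sample size for an effectively infinite population (Cochran's formula)."""
    return (z ** 2) * p * (1 - p) / margin ** 2

def finite_population_correction(n0, population):
    """Shrink the infinite-population sample size for a finite population."""
    return n0 / (1 + (n0 - 1) / population)

def margin_of_error(sample_size, z=1.96, p=0.5):
    """Margin of error achieved by a simple random sample of the given size."""
    return z * math.sqrt(p * (1 - p) / sample_size)

UNIQUE_SEARCHES = 184_076  # figure taken from the slide

n0 = cochran_sample_size(0.05)  # assumed: 95% confidence, p = 0.5
n_adjusted = finite_population_correction(n0, UNIQUE_SEARCHES)

print(f"uncorrected sample size: {n0:.2f}")
print(f"corrected for {UNIQUE_SEARCHES:,} unique searches: {math.ceil(n_adjusted)}")
print(f"margin of error at n=384: {margin_of_error(384):.4f}")
```

Under those assumptions, a sample of 384 searches corresponds to a margin of error of roughly +/- 5%, which matches the slide.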
  • 15. Defining success (and failure) image: http://abovethelaw.com/wp-content/uploads/2013/12/thumbs-up.jpg
  • 16. Success = Relevant results "We are used to hearing people talk about the 'simple search box' as the goal [of Discovery services]. But a simple search box has only been one part of the Google Formula. PageRank has been very important in providing a good user experience, and Google has progressively added functionality as it included more resources in the search results" (Dempsey, 2012, 4).
  • 17. Information overload “Surveys of...users reveal a consistent theme: most are overwhelmed and confused by the disorganized flood of information” (Riley 24). image: https://www.flickr.com/photos/youreyestellies/7984526358
  • 18. “Surveys of Internet users reveal a consistent theme: most are overwhelmed and confused by the disorganized flood of information. As a result, librarians have an opportunity to carve their niche as the de facto information navigators of the Digital Age” Riley, Margaret. "Riley's Guided Tour: Job Searching On The Net." Library Journal 121 (1996): 24-27.
  • 19. Information used to be scarce…. Image: http://allaboutalpha.com/blog/wp-content/uploads/2011/04/iStock_000014587595XSmall.jpg
  • 20. But today our attention span and time are scarce while information is abundant and easily accessible. “There are so very many search engines now and so much information; I preferred the ‘good old days’ when there weren’t so many ways to access information” - USC Journalism faculty member. image: http://gcn.com/articles/2013/09/30/smarter-cities-cloud.aspx
  • 21. Summon’s Relevance Ranking Looks at: • term proximity - how close the query terms are to one another. • term frequency - how often the query terms appear in the record. • field weighting - where the query term is found. Matches in some fields count for more than matches in others.
  • 22. Also considers: • Some content types are boosted over others: o Books and journal articles over newspaper articles and book reviews. o The journal over the articles in that journal. • Publication Date - Generally, items with newer publication dates are favored over older items. • Citation Counts - Items cited more often are boosted. • Local Collections - Content from institutional catalog(s) and repositories is boosted. • Content Size - Longer works aren’t necessarily more relevant, even though search terms might appear in them more often.
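To make the three core signals (term proximity, term frequency, field weighting) concrete, here is a toy scoring sketch. It is not Summon's algorithm, which the slides do not specify: the field names, the weights, and the way the signals are combined are all hypothetical choices made for illustration.

```python
# Hypothetical field weights; Summon's real weights are not given in the slides.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "abstract": 1.0}

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

def field_score(query_terms, text, weight):
    tokens = tokenize(text)
    # Term frequency: how often the query terms occur in this field.
    frequency = sum(tokens.count(term) for term in query_terms)
    # Term proximity: a bonus only when every query term is present,
    # larger when the first occurrences sit close together.
    positions = [tokens.index(term) for term in query_terms if term in tokens]
    proximity = 0.0
    if len(positions) == len(query_terms) and len(positions) > 1:
        span = max(positions) - min(positions) + 1
        proximity = len(query_terms) / span
    # Field weighting: the same match counts for more in some fields.
    return weight * (frequency + proximity)

def score(query, record):
    """Sum the weighted field scores for one record (a dict of field -> text)."""
    terms = tokenize(query)
    return sum(
        field_score(terms, record.get(field, ""), weight)
        for field, weight in FIELD_WEIGHTS.items()
    )

title_match = {"title": "vulnerability and happiness", "abstract": "a study"}
abstract_match = {"title": "a study", "abstract": "vulnerability and happiness"}
for record in (title_match, abstract_match):
    print(record["title"], "->", round(score("vulnerability happiness", record), 2))
```

In this sketch the record with both terms in its title outranks the record where the same terms appear only in the abstract, which is the effect field weighting is meant to produce.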
  • 23. A “Success Engine” Relevancy Enhancements: “to ensure users don’t miss out on the most relevant content.” This includes automatically searching for synonyms, term-stemming and smart searching of stop-words depending on their importance to the search phrase. image: http://blog.dappersnappers.com/wp-content/uploads/2012/02/baby-laptop-computer.jpg
  • 24. Measuring Relevancy in our Study We used “systems-oriented relevance” to rate the success of search queries. We rated the relevancy of Summon’s results without knowing the context of the original query. We looked at how well the topic of the search was represented in the topics of the results retrieved in Summon, Google and Google Scholar. - Maglaughlin & Sonnenwald 2002, 328-9.
  • 25. Inter-rater reliability image: http://www.artofmanliness.com/2008/08/21/manly-feats-of-strength/
  • 26. Types of searches Known item searches Had to be specific enough for us to recognize a definitive match. If the search got numerous matches and was general, we would likely not categorize it as a known item. Examples: • marketing alterity moore • 978-0078026676 • An Empirical Analysis of Cigarette Addiction • "happy days are here again" garland streisand Keyword/topic/exploratory searches Encompassed broad, general or ambiguous searches. Named persons were put into this category as well. Examples: • Catherine the Great, • vulnerability and happiness • PTSD and Substance use in african american women • supersitions
  • 27. Our Relevancy Rubric Known Item Searches Relevant: 1st item in the list of results Partially Relevant: 2nd-10th item in the list of results Not relevant: Not listed in the first 10 results or no results retrieved. (could mean we don’t own the item or there is a user input error)
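The known-item rubric is mechanical enough to express as a small function. This sketch assumes the rater records the 1-based rank of the sought item in the result list, or `None` when it is absent or no results came back; the function name is hypothetical.

```python
def rate_known_item(rank):
    """Apply the known-item rubric to the rank of the sought item.

    rank: 1-based position in the result list, or None if the item was
    not found (including searches that returned no results).
    """
    if rank == 1:
        return "Relevant"
    if rank is not None and 2 <= rank <= 10:
        return "Partially Relevant"
    return "Not Relevant"

for observed_rank in (1, 4, 10, 11, None):
    print(observed_rank, "->", rate_known_item(observed_rank))
```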
  • 28. Our Relevancy Rubric Keyword Searches Relevant: ALL search terms appear in the item's title or record. All search terms appear to have a relationship to one another; they are not just randomly placed throughout the title/record. First 5 items look like a "perfect"/solid match or clearly seem to be ABOUT the topic as identified through the search terms. Partially Relevant: Not all search terms are visible in the title or record of the items. At least 3 of the first 5 results appear to be somewhat related to the topic as entered, even if broadly or tangentially. Would a user easily recognize a connection between the results and the topic as it was entered? Not Relevant: At least 4 of the first 5 items appear to be false hits. Even if one or more of the search terms (or synonyms of those terms) appear in the title, abstract or record, results appear to be only about a portion of the search terms entered and not about all of them combined.
  • 29. Summon vs. Google vs. Google Scholar Known Item Searches: For Google & Google Scholar, relevancy was determined by a match; it did not have to be a full-text match. Links to Amazon, Google Books, WorldCat, imdb.com, and YouTube counted as matches. Image: http://www.geekwire.com/2013/ibm-takes-watson-cloud/
  • 31. Top 100 most frequent queries in Fall 2013 Consisted of 23,813 individual searches
  • 32. Long Tail of unique searches 87% (160,263) of total searches make up the long tail Top 100 unique searches make up 13% of total queries
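The head/tail split on these two slides can be checked with a few lines of arithmetic. The counts below are taken from the slides; the variable and function names are hypothetical.

```python
TOTAL = 184_076      # "# of unique searches" from slide 14
TOP_100 = 23_813     # searches accounted for by the top 100 queries
LONG_TAIL = 160_263  # searches in the long tail

def share(part, whole):
    """Percentage of the whole, rounded to the nearest integer."""
    return round(100 * part / whole)

print("head + tail equals total:", TOP_100 + LONG_TAIL == TOTAL)
print("top 100 share:", share(TOP_100, TOTAL), "%")
print("long tail share:", share(LONG_TAIL, TOTAL), "%")
```

The two counts add up to 184,076 exactly, and the rounded shares reproduce the 13% and 87% reported on the slide.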
  • 35. Summon vs. Google vs. Google Scholar Which do you think did better? http://bit.ly/summon-faceoff
  • 36.
  • 37. Successful Searches: 54% of all our sample searches (204) retrieved relevant results. Summon’s Overall Relevancy Report Card = F After the curve = C Moderately Successful + Successful Searches: 73% (273) retrieved partially relevant - relevant results image: http://bleedingedge.pynchonwiki.com/wiki/images/b/b8/Dunce-cap.jpg
  • 38. Successful Searches: 85% (318) of the sample searches retrieved relevant results Google’s Relevancy Report Card = B After the curve: A Moderately Successful + Successful Searches: 95% (356) of the searches retrieved partially relevant - relevant results.
  • 39. Successful Searches: 54% of the sample searches (203) retrieved relevant results. Google Scholar’s Relevancy Report Card = F After the curve: C Moderately Successful + Successful Searches: 75% (281) retrieved partially relevant - relevant results.
  • 40. Relevancy of all Keyword Searches n=236
  • 41. Relevancy of all known item searches n=139
  • 42. Failed searches in Summon 32% (45) of all known item searches did not locate the item being searched for • 33% (15) of these failed searches are for items USC does not own Of the items we do own: • 57% (17) did not show up due to user error • 40% (12) did not show up due to a Summon problem (bad metadata, poor relevancy, not indexed in Summon) • 3% (1) had irregular characters and found no results
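The percentages on this slide use three different denominators (all known item searches, all failed searches, and failed searches for items USC owns), which is easy to lose track of. A short sketch that recomputes them from the slide's counts, with hypothetical names:

```python
KNOWN_ITEM_SEARCHES = 139  # n from slide 41
FAILED = 45                # known item searches that did not locate the item
NOT_OWNED = 15             # failed searches for items USC does not own
OWNED_FAILURE_CAUSES = {
    "user error": 17,
    "Summon problem": 12,
    "irregular characters": 1,
}

def pct(part, whole):
    """Percentage of the whole, rounded to the nearest integer."""
    return round(100 * part / whole)

owned_failures = FAILED - NOT_OWNED

print("failed / known item searches:", pct(FAILED, KNOWN_ITEM_SEARCHES), "%")
print("not owned / failed:", pct(NOT_OWNED, FAILED), "%")
for cause, count in OWNED_FAILURE_CAUSES.items():
    print(f"{cause} / owned failures:", pct(count, owned_failures), "%")
```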
  • 43. Relevancy of “academic” known item searches that USC owns Summon improved 13% Google Scholar improved 15% Google improved 1% n=109
  • 44. Revised Relevancy Report Cards Google Scholar = F 57% (192) retrieved relevant results (up from 53%) After Curve = C+ 79% (272) retrieved partially relevant - relevant results (up from 73%) Summon = F 59% (202) retrieved relevant results (up from 53%) After Curve = C+ 79% (271) retrieved partially relevant - relevant results (up from 73%) Google = B 84% (291) retrieved relevant results (down from 85%) After Curve = A 95% (328) retrieved partially relevant - relevant results (no change)
  • 45. User errors 18% (66) of the searches in our sample had a user input error Image: http://human-error.sarkisozlerik.com/human-error/a-lifetime-by-design.html
  • 46. “Did you mean?” Showed up 24 times • 83% (20) triggered by user input errors. • 42% (10) of the time the “Did you mean” links took users to relevant results.
  • 47. Google automatically redirects searches with errors Summon lets the user decide whether to follow a “corrected” path
  • 48. Median: 3 Mean: 3.182 Mode: 2 # of words used in Keyword Searches
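As an illustration of how such figures are derived, the sketch below computes the same three statistics for the four example keyword queries listed on slide 26. It assumes a "word" is any whitespace-separated token, which may not match how the study counted; the results describe only these four examples, not the 236-query sample.

```python
import statistics

# Example keyword queries from slide 26, kept exactly as users typed them.
keyword_queries = [
    "Catherine the Great",
    "vulnerability and happiness",
    "PTSD and Substance use in african american women",
    "supersitions",
]

# Assumption: one word per whitespace-separated token.
word_counts = [len(query.split()) for query in keyword_queries]

summary = {
    "median": statistics.median(word_counts),
    "mean": statistics.mean(word_counts),
    "mode": statistics.mode(word_counts),
}
print(word_counts, summary)
```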
  • 49. Impact of # of search terms entered on relevancy
  • 50. Type of Keyword Searches Executed n=236
  • 51. Impact of search type on relevancy
  • 52. Duplicates image: http://amarkedman.com/wp-content/uploads/2011/07/Matrix-Clones.jpg Known item searches: 15 searches had 2 or more duplicates (11%) Keyword searches: 51 searches had 2 or more duplicates (22%)
  • 53. Linking to full-text "Linking users to full text as quickly as possible after discovery results are available is a paramount concern" (NISO ODI Report, 2013, 7). There were only 3 bad links (out of 55 known item article searches) image: https://flic.kr/p/mqjHgR
  • 54. What we learned: Summon has some work to do to improve the relevancy of its results. But, it is doing better in other areas….
  • 55. Summon added as default search tab (July 2010) Catalog search tab removed from homepage (July 2012)
  • 56. Winning! (sort of…) image: http://beautelicious.com/2011/09/charlie-sheen-warner-bros-close-settling-wrongful-termination-suit/
  • 57. Final Report Card: Relevancy = C+ Intuitive starting place = Fast = Bringing users back to the library = Maximizing usage of collections =
  • 58. Moving forward - Following our users’ lead image: http://blog.cityspoon.com/2012/02/08/gathering-followers/
  • 59. Leadership Strategy #1: Learn and use your library’s discovery tool • 3 million searches in 2013! • Will give you insight into what users are experiencing, both good and bad. • Discovery tools are taking up prime real estate on many of our websites. image: http://www.creativity4us.com/wp-content/uploads/2012/02/blinders-crop.jpg
  • 60. Leadership Strategy #2: Change what and how we teach "Librarians must reconsider training students to use advanced search features or Boolean logic if students purposefully choose not to use them or fail to use them correctly. Rather than teaching students more effective search syntax, more attention should be placed on developing critical thinking and evaluative skills" (Holman, 2011, 24).
  • 61. Leadership Strategy #3: Be a squeaky wheel • Many users will get frustrated and abandon the library without ever letting us know about problems. • We cannot depend on other people to report problems.
  • 62. Leadership Strategy #4: Improvise and fail in front of students • Show students how to troubleshoot a failed search. • Show searches with typos, or how to revise a search that gets no results.
  • 63. "If we think like users (instead of as librarians) it is easy to understand the frustration. Our tools must seem broken or outdated to them….Are we in the business of promoting library databases or the business of helping users accomplish their tasks?" (Matthews, 2013) Leadership Strategy #5:
  • 64. “The trouble with Summon is that students don’t need to be taught how to use it, but librarians do” -Matt Borg, Sheffield Hallam University, 2012
  • 65. English Faculty Member: “I want an easy way to find a book with author and title, and an easy way to move from that to journal articles if that’s what I want….” Religion Faculty Member: “If I need to search something I don’t go to any USC search engine, which is totally a waste of time. I go to Google where I can get things ten times as fast.” image: http://www.morvimmer.com/blog/free-download-staples-easy-button/
  • 66. Leadership strategy #6: Empathize with users AND colleagues • Try not to criticize or judge • Invite skeptical librarians into your classes to watch you teach w/the discovery service • Showing vs telling - talking can only get you so far
  • 67. Common complaints • “I think it's a cheat. Too many students don't learn basic searching skills that would make any search better - like planning before typing. They just throw in anything and take what comes up first” - 3/1/14 Survey of USC Instruction Librarians • “It is so imprecise" (Buck & Steffy, 2013, 76). • Perpetuates "the homogenization of information" - when "everything looks and feels the same" (Bawden & Robinson, 2008, 181). • “pandering to the 'principle of least effort'” (Richardson 2013; Meadow & Meadow 163-4). • They “impinge upon the development of critical research skills” (Rose-Wiles & Hofmann, 2013, 156). • “these systems...reinforce unreflective research habits” (Asher, 2013, 6)
  • 68. Complaining is not a (constructive) strategy “When you invent something new, if customers come to the party, it's disruptive to the old way….The internet is disrupting every media industry...people complain about that but complaining is not a strategy. Amazon is not happening to bookselling, the future is happening to bookselling.” -Jeff Bezos, 60 Minutes, [8:55]. 12/1/2013
  • 69. Leadership Strategy #7: Solicit feedback AND then listen • Look for positive AND negative feedback • Re-frame negatives as positives or as opportunities for dialogue
  • 70. Leadership Strategy #8: Turn negativity to your advantage “Engage and transform the most negative person in your library system into a productive team member.” “[C]onverting the most negative person has a huge impact on the rest of the staff” (Cuillier, 2012, 439). Image: http://www.salon.com/2013/03/12/why_is_francis_underwood_a_democrat/
  • 71. Leadership Strategy #9: Gather evidence • Test your assumptions • Test your colleagues’ assumptions • Study user behavior, formally and informally • Assess the tool and then assess it again
  • 72. Leadership Strategy #10: Redefining ourselves
  • 73. References Asher, Andrew D., Lynda M. Duke, and Suzanne Wilson. “Paths of Discovery: Comparing the Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and Conventional Library Resources.” College & Research Libraries 74.5 (2013): 464-488. Bawden, D., and L. Robinson. “The Dark Side of Information: Overload, Anxiety and Other Paradoxes and Pathologies.” Journal of Information Science 35.2 (November 21, 2008): 180-91. doi:10.1177/0165551508095781. Buck, Stefanie, and Christina Steffy. “Promising Practices in Instruction of Discovery Tools.” Communications in Information Literacy 7.1 (2013). Cuillier, Cheryl. “Choosing Our Futures … Still!” Journal of Library Administration 52.5 (July 2012): 436–51. doi:10.1080/01930826.2012.700806. Dempsey, Lorcan. “Thirteen Ways of Looking at Libraries, Discovery, and the Catalog: Scale, Workflow, Attention.” Educause, December 10, 2012.
  • 74. Holman, Lucy. “Millennial Students’ Mental Models of Search: Implications for Academic Librarians and Database Developers.” The Journal of Academic Librarianship 37.1 (January 2011): 19–27. doi:10.1016/j.acalib.2010.10.003. Maglaughlin, K. L., and D. H. Sonnenwald. “User Perspectives on Relevance Criteria: A Comparison among Relevant, Partially Relevant, and Not-Relevant Judgments.” Journal of the American Society for Information Science and Technology, 2002. Matthews, Brian. “Database vs. Database vs. Web-Scale Discovery Service: Further Thoughts on Search Failure (or: More Clicks than Necessary?) (or: Info-Pushers vs. Pedagogical Partners).” Chronicle of Higher Education. Ubiquitous Librarian, August 21, 2013. Meadow, Kelly, and James Meadow. “Search Query Quality and Web-Scale Discovery: A Qualitative and Quantitative Analysis.” College & Undergraduate Libraries 19.2-4 (2012): 163–75. doi:10.1080/10691316.2012.693434. NISO ODI Working Group. National Information Standards Organization ODI Survey Report: Reflections and Perspectives on Discovery Services, January 2013.
  • 75. Pan, Bing, Helene Hembrooke, Thorsten Joachims, Lori Lorigo, Geri Gay, and Laura Granka. “In Google We Trust: Users’ Decisions on Rank, Position, and Relevance.” Journal of Computer-Mediated Communication 12.3 (April 2007): 801–23. doi:10.1111/j.1083-6101.2007.00351.x. Richardson, Hillary A. H. “Revelations From the Literature: How Web-Scale Discovery Has Already Changed Us.” Information Today, May 2013. Rose-Wiles, Lisa M., and Melissa A. Hofmann. “Still Desperately Seeking Citations: Undergraduate Research in the Age of Web-Scale Discovery.” Journal of Library Administration 53.2–3 (February 2013): 147–66. doi:10.1080/01930826.2013.853493.