Researcher KnowHow session presented by Judith Carr, Research Data Manager and Gordon Sandison, Licensing and Copyright Manager from the University of Liverpool Library on 1st December 2020.
1. Researcher KnowHow - Copyright
and text and data mining
Judith Carr – Research Data Manager
Gordon Sandison – Licensing and Copyright Manager
2. Learning Outcomes
This session will raise awareness of:
• Copyright law and how it relates to performing TDM analysis.
• How researchers can take advantage of permitted acts in
copyright law to legitimately use TDM in their research.
• The tools publishers make available to enable TDM analysis.
3. Disclaimer
The following slides are intended to give an overview of the key
concepts of UK copyright legislation for those in higher
education institutions.
They are not comprehensive, nor do they provide full details of
the provisions within the relevant legislation (most notably the
Copyright, Designs and Patents Act).
The slides are for information purposes only and do not
constitute formal legal advice.
5. Does copyright protect ideas?
No.
There are two tests a work must pass for copyright to exist in it.
Firstly, it must be ‘original’ and secondly, it must be recorded or ‘fixed’ i.e. be
something tangible.
So, copyright does not protect ideas which remain solely as ideas. Rather
copyright protects the way these ideas are expressed.
Copyright covers different types of content (text, images, sound, moving
images etc.)
6. Do copyright works need to be
registered to be protected?
No.
Copyright protection is automatic as soon as the work is ‘fixed’ or
recorded in some format.
8. You always need permission to use
copyright works.
A. Depends on what you’re using it for.
Permission is not required if the work is out of copyright, is under a
Creative Commons licence, or if you are using the work for reasons
permitted under a copyright exception.
In the UK there are copyright exceptions which permit the use of
copyright material under certain circumstances. Usually educational
institutions also pay for specific licences which enable their lecturers
and students to use copyright material.
9. There is a specific amount of someone else’s work
that you can use without asking permission and
without infringing their copyright.
A. False
Although you may use a copyright-protected work under a
copyright exception, the law does not specify a fixed amount.
Copying infringes only when it takes the whole or a ‘substantial
part’ of a work. The courts define ‘substantial part’ on a
case-by-case basis, usually focusing on the quality of the parts
taken rather than the amount.
10. What is Intellectual Property?
Intellectual property (IP) refers to unique, creative works which can be
treated as an asset or physical property i.e.
• ‘Intellectual’ because it is creative output of the mind, and
• ‘Property’ because it is viewed as a tradable commodity.
Intellectual property is something original which is subsequently ‘fixed’ in
some format, such as written or drawn on paper, in an audio recording, on
film, or recorded electronically.
An idea alone is not intellectual property. For example, an idea for a book
doesn’t qualify, but the words you’ve written do.
As such, IP is, essentially, the tangible expression of ideas.
11. Intellectual Property Rights (IPRs)
Intellectual property is protected in law by Intellectual Property Rights or
IPRs.
Intellectual Property Rights:
• Are specific legal rights which exist to protect the owners of IP;
• Give the owners of IP specific exclusive rights in regard to the use of their
work;
• Prohibit unauthorised use of protected works;
• Make it easier for the owners of IP to take legal action against anyone who
uses or copies their work illegally;
• Enable people to earn recognition or financial benefit from what they
invent or create.
12. Intellectual Property Rights (IPRs)
Intellectual Property Rights fall, principally, into four main areas:
• Trademarks;
• Designs;
• Patents;
• Copyright.
13. Copyright
Copyright isn’t a single right as such, but a set of exclusive rights
which originators/copyright owners of cultural, creative and artistic
works have over the use of their work.
This set of rights legally gives the copyright holder the exclusive right
to determine:
• Who can use or make copies of their works;
• Under what circumstances;
• In what media;
• For what charge;
Essentially, owning copyright is owning the ‘right to copy’.
15. Copyright Law – Restricted Acts
In the UK, the Copyright, Designs and Patents Act 1988 (as
amended 2014) is the legislation which governs copyright.
This law sets out the types of work protected by copyright, and
the uses of those works which are the exclusive right of the
copyright holder.
The uses of the work, which are the exclusive right of the rights
holder, are called ‘Restricted Acts’ i.e. acts/uses restricted
solely to the copyright holder.
16. Uses Protected by Copyright – Restricted
Acts
• Copying
• Issuing copies to the public
• Rental or Lending
• Public Performance
• Communication to the public
• Adaptation
17. So, what about TDM?
Text and data mining usually requires copying of the work to be
analysed.
Researchers using text and data mining in their research risked
infringing copyright unless they had specific permission from the
copyright owner.
However, copyright was never meant to restrict the use of the facts
and information that exist in a work.
In 2014, the law was changed.
18. Permitted Acts/Copyright Exceptions
Though copyright restricts how others may use works, the
legislation also includes ‘Acts Permitted in relation to Copyright Works’.
These ‘permitted acts’ allow limited use of copyrighted material
without having to gain permission and without infringing
copyright law.
These are often referred to as ‘Copyright Exceptions’ i.e.
exceptions to copyright law.
19. Section 29A: Copies for text and data analysis
for non-commercial research
• Allows researchers to make copies of any copyright material for the purpose of
computational analysis if they already have the right to read the work (that is,
work that they have “lawful access” to).
• They will be able to do this without having to obtain additional permission to
make these copies from the rights holder.
• This exception only permits the making of copies for the purpose of text and data
mining for non-commercial research.
20. Section 29A: Copies for text and data analysis
for non-commercial research
• Publishers and content providers are able to apply reasonable measures to maintain
their network security or stability, so long as these measures do not prevent or
unreasonably restrict a researcher’s ability to make the copies they need to make for
their text and data mining.
• Contract terms that stop researchers making copies of works to which they have lawful
access in order to carry out a text and data mining analysis will be unenforceable.
21. Database Rights
Other legal or technical restrictions may limit the access to collections
of works, such as databases of scientific publishers. Examples of such
databases are JSTOR, ScienceDirect and LexisNexis.
In the UK and in the EU, any collection of data, information or works
which required substantial investment in obtaining, verifying or
presenting its contents, is protected by a ‘database right’.
22. Database Rights
A database right is comparable to, but distinct from, copyright. It
exists to recognise the investment that is made in compiling a
database, even when this does not involve the ‘creative’ and
original aspect that copyright reflects.
The database right is an exclusive right that prevents substantial
extraction or re-utilisation of the content of the database, as well as
systematic insubstantial extraction of the said content (where what is
‘substantial’ and ‘systematic’ depends on the context).
23. Database Rights
Moreover, the use of a database can also be regulated by
contract. In some cases, access to a database may require
acceptance of ‘terms and conditions’ that restrict certain
activities, including text and data analysis. But, as with the
copyright exception discussed above, engaging in permissible
activities on a database for the purpose of text and data analysis
cannot be ruled out by contract.
24. Database Rights
Databases are also usually sheltered by technological measures
which impede systematic access to their contents and ‘bulk’ copying.
So, researchers may need not only permission, but also technical
support from the database owner before engaging in large-scale
computational analysis of the contents of a database.
For this reason, despite the fact that researchers can rely on the
exception for text and data analysis, collaboration between database
owners and researchers remains a fundamental component of text
and data mining research.
25. What can researchers do with the copies
they make as part of their research?
The copies can only be used by those who have lawful access to
the original material for text and data mining for non-commercial
purposes. They can’t be shared, sold, or made publicly available
in any way and anyone doing so could be sued for copyright
infringement.
26. Do researchers have to acknowledge
every work they analyse in this way?
The law requires that there is sufficient acknowledgment of
copied works, but recognises that it may be impractical to
acknowledge every work in a large-scale analysis. A researcher
could, for example, refer to the databases from which the works
were obtained.
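As a minimal sketch of the database-level acknowledgement suggested above, the following Python function builds a short statement from a list of mined works. The record structure (a `source` field per work), the function name and the wording of the statement are all hypothetical, chosen only for illustration; they are not a prescribed legal form.

```python
from collections import Counter


def acknowledgement(records):
    """Build a one-line acknowledgement naming each source database.

    `records` is a hypothetical list of dicts, each with a "source" key
    giving the database the work was obtained from.
    """
    # Count how many mined works came from each database.
    counts = Counter(record["source"] for record in records)
    parts = []
    for name, n in sorted(counts.items()):
        noun = "work" if n == 1 else "works"
        parts.append(f"{name} ({n} {noun})")
    return "Works analysed were obtained from: " + "; ".join(parts) + "."


# Illustrative data only; titles and counts are made up.
corpus = [
    {"title": "A", "source": "JSTOR"},
    {"title": "B", "source": "ScienceDirect"},
    {"title": "C", "source": "JSTOR"},
]
print(acknowledgement(corpus))
```

Whether a statement of this kind counts as sufficient acknowledgement in a particular case is a legal question, so it is worth checking with the Library.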
27. Can a researcher doing contract research for
an outside company text and data mine
copyright material?
Contract research of this kind is unlikely to fall within the
definition of non-commercial. You should check before carrying out
the analysis, but the likelihood is that you will have to agree with
the copyright owner that you can make copies for your research.
28. My research is part-funded by a company. I
choose my own research topics and am free
to publish my work without interference from
the company. Can I text and data mine?
This is likely to be fine, so long as the purpose of your research is
non-commercial, but you should check.
29. Are the results of my text and data
mining analysis covered by copyright?
Copyright covers the artistic expression of an original idea or fact,
not the fact or idea itself. So, if your results are simply facts they
are not covered by copyright.
30. Is this compatible with Open Access?
Absolutely. You can text and data mine any work that has been
made available under an open access route. You may publish
your research in an open access journal. You should
acknowledge the works you have mined, unless this is
impossible for reasons of practicality.
31. Can the results of my non-commercial
research be used for commercial purposes?
There are no restrictions on how or where outputs of text and data
mining can be published, including journals published for profit by
academic publishers and under licences that permit commercial
research, such as CC-BY. Other commercialisation of the research
outputs is not restricted either. But it is important to be scrupulous in
assessing whether the original purpose of carrying out the text and
data mining analysis is solely non-commercial; if it isn’t, then
researchers are very likely to be infringing copyright.
32. Key messages
• If you have legal access to a resource, then you may make a
copy for TDM analysis.
• Be aware of database rights which may restrict copying.
• Look to see if the owner of the material offers ‘in-house’ TDM
solutions.
33. RESOURCES
• The Library – subscribes to databases
• The Library – ask your Liaison Librarian
• English Dept UoL – video of demo English Corpus, SketchEngine, Wmatrix
https://stream.liv.ac.uk/e3677y55
• OpenMinTeD - an open, service-oriented e-Infrastructure for Text and Data
Mining (TDM) of scientific and scholarly content. Researchers can collaboratively
create, discover, share and re-use knowledge from a wide range of text-based
scientific related sources in a seamless way. http://openminted.eu/
• YouTube - Text and Data Mining in the Humanities and Social Sciences:
Strategies and Tools https://www.youtube.com/watch?v=vrX7cM1FC_A
• YouTube - Text Mining for Social Scientists
https://www.youtube.com/watch?v=71FqpwsPNpU&t=2052s
34. Where on the Library webpages?
Gale Digital Scholar Lab
Build: ‘Search your institution's Gale Primary Sources, find relevant texts, and add them to a content set.’
Clean: ‘Prepare documents for analysis by stripping the text of unnecessary words, punctuation, and other characters.’
Analyse: ‘Use analysis tools to explore your content set in new ways with visualizations to help create new insights into your texts.’
https://liverpool.idm.oclc.org/login?url=https://infotrac.gale.com/itweb/livuni?db=DSLAB
Getting started guide -
https://www.lib.cam.ac.uk/files/getting_started_gdsl.pdf
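To give a feel for what a ‘Clean’ step involves, here is a minimal Python sketch that lower-cases text, strips punctuation and drops common words. It is not how Gale Digital Scholar Lab works internally; the function name and the tiny stop-word list are assumptions made purely for illustration.

```python
import string

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}


def clean_text(text, stop_words=STOP_WORDS):
    """Return the words of `text`, lower-cased, without punctuation or stop words."""
    lowered = text.lower()
    # Delete every ASCII punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace and filter out the stop words.
    return [token for token in no_punct.split() if token not in stop_words]


print(clean_text("The history of Liverpool, and the docks!"))
```

Real cleaning pipelines usually need more care (for example with hyphenation, OCR errors and non-ASCII punctuation), which is one reason to look at the in-house tools a provider offers.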
35. Check the policies of both publishers and databases
A tale of 3 databases
IBM Micromedex
You may only use a crawler to crawl this Web site as permitted by this Web site’s robots.txt
protocol, and IBM may block any crawlers in its sole discretion. The use authorized under this
agreement is non-commercial in nature.
NICE
The Open Licence (UK) referred to on their web page does not have any express prohibition. NICE has
a short questionnaire you need to submit, which will advise what consent/use you have.
Medscape
DO NOT attempt to access or search any Medscape Network properties or any content
contained therein through the use of any engine, software, tool, agent, device or mechanism
(including scripts, bots, spiders, scraper, crawlers, data mining tools or the like) other than
through software generally available through web browsers
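Where a site's terms defer to robots.txt, as in the first example above, a script can check the rules before requesting a page. The sketch below uses Python's standard `urllib.robotparser` module; the robots.txt content, the user-agent name `research-bot` and the `example.org` URLs are hypothetical. Passing this check does not override terms that prohibit automated access altogether, as in the Medscape example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def may_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # Parse the rules from text rather than fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_crawl(EXAMPLE_ROBOTS_TXT, "research-bot", "https://example.org/articles/1"))
print(may_crawl(EXAMPLE_ROBOTS_TXT, "research-bot", "https://example.org/private/data"))
```

In practice the parser would be pointed at the site's live robots.txt with `set_url()` and `read()`, and the site's terms and conditions would still need to be read alongside it.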
36. Further Information
• Library Open Research Team https://www.liverpool.ac.uk/open-research/
• Library Licensing and Copyright Manager
https://libguides.liverpool.ac.uk/copyright
• Intellectual Property Office
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/375954/Research.pdf