Location of Desirable Open Educational Resources in the Commonwealth Connect Portal Directory of Open Educational Resources (DOER) using the OERScout Artificial Intelligence Based Search Engine.
AI Based Search Engine for Locating Desirable Open Educational Resources
1. OERScout
AI Based Search Engine for Locating Desirable Open
Educational Resources
Research Seminar (23rd May 2012, COL, Vancouver)
Ishan Abeywardena MSc, MSc (Brunel), BSc (Bangalore), MIEEE,
MBCS, MACM
Senior Lecturer, School of Science and Technology
Wawasan Open University
Penang, Malaysia
Currently pursuing PhD in Computer Science at University of Malaya,
Malaysia
2. Acknowledgement
This research project is funded by:
The Grant (# 102791) generously made by the
International Development Research Centre
(IDRC) of Canada through an umbrella study on
Openness and Quality in Asian Distance Education
Commonwealth of Learning (COL), Canada
through an Executive Secondment (4th – 25th May
2012)
Wawasan Open University, Penang, Malaysia
3. Talking Points…
Open Educational Resources (OER)
Searching for OER
Usefulness of OER
Desirability of OER
Search mechanisms
OERScout
Moving Forward
5. What are Open Educational Resources (OER)?
“web-based materials, offered freely and openly for
use and re-use in teaching, learning and research”
(Joyce, 2007)
“Just as the Linux operating system and other open
source software has become a pervasive computer
technology around the world, so too might OER
materials become the basis for training the global
masses” (Farber, 2009)
Joyce, A. (2007). OECD Study of OER: Forum Report, OECD. Retrieved December 12, 2011 from
http://www.unesco.org/iiep/virtualuniversity/forumsfiche.php?queryforumspages_id=33.
Farber, R. (2009). Probing OER’s huge potential [Electronic Version]. Scientific Computing 26(1), 29-29.
Mello, J. (2012). OER Global Logo. Retrieved April 5, 2012 from
http://www.unesco.org/new/en/communication-and-information/access-to-knowledge/open-educational-resources/global-oer-logo/
7. Openness of OER
The four R’s (Hilton, Wiley, Stein and Johnson, 2010)
◦ Reuse: the ability to use all or part of a work for one’s
own purposes;
◦ Redistribute: the ability to share one’s work with
others;
◦ Revise: the ability to adapt, modify, translate or
change the form of a work;
◦ Remix: the ability to combine resources to make new
resources.
Hilton, J., Wiley, D., Stein, J., & Johnson, A. (2010). The four R’s of openness and ALMS Analysis: Frameworks for
open educational resources. Open Learning: The Journal of Open and Distance Learning, 25(1), 37-44.
8. Openness of OER
The four R’s are governed by the license*
http://creativecommons.org/
Attribution (CC BY)
Attribution-ShareAlike (CC BY-SA)
Attribution-NoDerivs (CC BY-ND)
Attribution-NonCommercial (CC BY-NC)
Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)
Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
*There are licensing schemes other than the Creative Commons used to govern the
four R’s
9. Access to OER
ALMS analysis (Hilton, Wiley, Stein and Johnson, 2010)
◦ Access to editing tools
◦ Level of expertise required to revise or remix
◦ Meaningfully editable
◦ Source-file access
Hilton, J., Wiley, D., Stein, J., & Johnson, A. (2010). The four R’s of openness and ALMS Analysis: Frameworks for
open educational resources. Open Learning: The Journal of Open and Distance Learning, 25(1), 37-44.
10. Relevance of OER
Metadata Standards
◦ GLOBE (Global Learning Objects
Brokered Exchange)
◦ IEEE-LOM (Learning Object Metadata)
◦ Other Specific Standards
11. The Current Situation of OER
With increased funding and advocacy by governmental/non-
governmental organisations and generous philanthropy, many
OER repositories have mushroomed over the years
boasting a large volume of quality resources.
(Abeywardena, Raviraja and Tham, 2012)
These repositories have started to grow exponentially
rich in knowledge. However, this has in turn given rise to the
new challenge of locating resources suitable for use
and reuse from the large number of disconnected
and disparate repositories available across the globe (Geser,
2007).
Abeywardena, I.S., Raviraja, R. and Tham, C.Y. (2012). Conceptual Framework for Parametrically Measuring the
Desirability of Open Educational Resources using D-index. International Review of Research in Open and Distance
Learning, 13(2), 104-121
Geser, G. (2007). Open Educational Practices and Resources - OLCOS Roadmap 2012. Open Learning Content
Observatory Services. Salzburg, Austria. 2007. Retrieved December 27, 2011 from
13. Searching for Suitable OER
“…the problem with open content is not the lack of
available resources on the Internet but the inability to
locate suitable resources for academic use” (Unwin,
2005).
“...The problem is in finding the resources, and more
correctly finding the “right” resources. Using a regular
search engine like Google to find content is not
always a viable option as it will generate too many
answers. There is, hence, a need to easily find
relevant content...” (Hatakka, 2009).
Unwin, T. (2005). Towards a Framework for the Use of ICT in Teacher Training in Africa. Open Learning 20, 113-130.
Hatakka, M. (2009). Build It and They Will Come? – Inhibiting Factors for Reuse of Open Content in Developing Countries,
14. How Users Find OER
Step 1: Identify which material to look for (e.g. integration, C++ programming)
Step 2: Identify the search queries (e.g. “undergraduate mathematics”)
Step 3: Locate a repository (word of mouth, some link somewhere, go to the more popular repositories)
Step 4: Run multiple queries to find resources
Step 5: Read each resource to identify the usefulness (openness, access, relevance)
Step 6: Identify useful resources
Step 7: Repeat steps 3-6 on multiple repositories (hundreds to thousands…)
16. How “They” Find OER
Metadata
◦ Title
◦ Description
◦ Keywords
◦ Others (license, author etc.)
To facilitate accurate searching of OER,
each resource must be tagged using a
specific set of metadata defined by a
global standard.
17. How “They” Find OER
Who will tag the resources?
The authors of the content (Humans) =
Inconsistent, Irrelevant, Non-uniform
metadata
19. The Problem
There is no generic methodology available at
present to enable search mechanisms to
autonomously gauge the usefulness of an OER for
one’s teaching and learning needs.
The use of diverse and disparate technology
platforms in these projects further entails the
inability to effectively trawl and locate OER using
generic search methodologies.
…if one of the technological barriers towards wider
adoption of OER is the inability of existing
searching methods and techniques to effectively
locate specific, relevant and quality OER then how
can existing search techniques be improved or
augmented to facilitate more effective location of
specific, relevant and quality OER over the
21. What is OERScout?
An Artificial Intelligence (AI) based text
mining algorithm which performs
unsupervised autonomous learning.
Reads the actual content of a text based
OER (webpage, .pdf, .doc, .docx) and
understands what it is about.
◦ i.e. If it reads the content of an article on
calculus, it will understand that the OER belongs
to the domain of mathematics and sub-domain
calculus.
Creates a dynamic map of the resources
called the Keyword-Document Matrix (KDM)
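To make the idea of a Keyword-Document Matrix concrete, here is a minimal Python sketch. The slides do not describe OERScout's actual text-mining algorithm, so this sketch swaps in plain term-frequency counting; the names `extract_keywords` and `build_kdm`, the stop-word list and the sample texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a far larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}

def extract_keywords(text, top_n=2):
    """Return the top_n most frequent non-stop-word terms with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return dict(counts.most_common(top_n))

def build_kdm(documents):
    """Build a matrix with one row per resource and one column per keyword."""
    per_doc = {name: extract_keywords(text) for name, text in documents.items()}
    keywords = sorted({k for found in per_doc.values() for k in found})
    matrix = {name: [found.get(k, 0) for k in keywords]
              for name, found in per_doc.items()}
    return keywords, matrix

# Illustrative sample texts (assumed, not taken from DOER).
documents = {
    "resource_1": "Calculus covers integration. Integration in calculus "
                  "reverses differentiation. Calculus!",
    "resource_2": "Programming loops: loops repeat code. "
                  "Programming in C++ uses loops.",
}

keywords, matrix = build_kdm(documents)
print(keywords)
for name, row in matrix.items():
    print(name, row)
```

Each row of the resulting matrix describes one resource by the weight of each keyword, which is the structure the KDM slide depicts.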
24. The Usefulness of an OER
The usefulness of an OER for a particular teaching or
learning need can only be accurately assessed by
reading through the content.
However, there are other aspects that can be
parametrically measured by a software based
mechanism:
• Whether a resource is relevant to a user’s needs;
• Whether the resource is open enough for using,
reusing, remixing and redistributing;
• Whether the resource is accessible with respect to
technology.
25. The Desirability of an OER
Within the requirement of being able to use and reuse a
particular OER, the three parameters of
Desirability (Abeywardena, Raviraja and Tham,
2012) can be defined as:
◦ level of openness: the permission to use and
reuse the resource;
◦ level of access: the technical keys required to
unlock the resource;
◦ relevance: the level of match between the
resource and the needs of the user.
Less useful resources are less desirable for
teaching and learning needs….
26. Measuring the Desirability
D-index = (level of access × level of openness × relevance) / 256
Abeywardena, I.S., Raviraja, R. and Tham, C.Y. (2012). Conceptual Framework for Parametrically
Measuring the Desirability of Open Educational Resources using D-index. International Review
of Research in Open and Distance Learning, 13(2), 104-121 (ISI-cited journal)
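As a worked illustration of the formula, the following Python sketch computes the D-index from the three parameter values. The function name `d_index` and the range checks are assumptions for illustration; in particular, the upper bound of 16 for the level of access is inferred from the divisor 256 (16 × 4 × 4) rather than stated on this slide.

```python
def d_index(access, openness, relevance):
    """Return (access * openness * relevance) / 256.

    Assumed ranges: access 1..16, openness 1..4, relevance 1..4,
    so the result lies between 1/256 and 1.0.
    """
    if not 1 <= access <= 16:
        raise ValueError("access must be between 1 and 16")
    if not 1 <= openness <= 4:
        raise ValueError("openness must be between 1 and 4")
    if not 1 <= relevance <= 4:
        raise ValueError("relevance must be between 1 and 4")
    return access * openness * relevance / 256

# Highest access (16), openness 3, relevance 4.
print(d_index(16, 3, 4))  # 192 / 256 = 0.75
# Every parameter at its maximum.
print(d_index(16, 4, 4))  # 256 / 256 = 1.0
```

Because the three values are multiplied, a low score on any single parameter pulls the whole index down.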
27. Openness
Permission     Value
Reuse          1
Redistribute   2
Revise         3
Remix          4
The level of openness based on the four R’s of openness
Mapping the CC licenses to the 4 R’s
Permission     Creative Commons (CC) licence                         Value
Reuse          None                                                  1
Redistribute   Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)      2
Redistribute   Attribution-NoDerivs (CC BY-ND)                       2
Revise         Attribution-NonCommercial-ShareAlike (CC BY-NC-SA)    3
Revise         Attribution-ShareAlike (CC BY-SA)                     3
Remix          Attribution-NonCommercial (CC BY-NC)                  4
Remix          Attribution (CC BY)                                   4
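The licence-to-value mapping can be expressed as a simple lookup. This is a sketch only: the names `OPENNESS_BY_LICENCE` and `openness_value` are hypothetical, and treating any licence outside the table as level 1 (Reuse) is an assumption based on the "None" row.

```python
# Openness values for the six Creative Commons licences in the mapping.
OPENNESS_BY_LICENCE = {
    "CC BY-NC-ND": 2,  # Redistribute
    "CC BY-ND": 2,     # Redistribute
    "CC BY-NC-SA": 3,  # Revise
    "CC BY-SA": 3,     # Revise
    "CC BY-NC": 4,     # Remix
    "CC BY": 4,        # Remix
}

def openness_value(licence):
    """Return the openness level for a licence label.

    Assumption: anything not in the mapping is treated as level 1 (Reuse).
    """
    return OPENNESS_BY_LICENCE.get(licence, 1)

print(openness_value("CC BY"))
print(openness_value("CC BY-NC-SA"))
print(openness_value("All rights reserved"))
```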
28. Access
Access to editing tools   Level of expertise required to revise or remix   Meaningfully editable   Source-file access   Value
LOW                       HIGH                                             NO                      NO                   01
LOW                       HIGH                                             NO                      YES                  02
LOW                       HIGH                                             YES                     NO                   03
LOW                       HIGH                                             YES                     YES                  04
LOW                       LOW                                              NO                      NO                   05
LOW                       LOW                                              NO                      YES                  06
LOW                       LOW                                              YES                     NO                   07
LOW                       LOW                                              YES                     YES                  08
HIGH                      HIGH                                             NO                      NO                   09
HIGH                      HIGH                                             NO                      YES                  10
HIGH                      HIGH                                             YES                     NO                   11
HIGH                      HIGH                                             YES                     YES                  12
HIGH                      LOW                                              NO                      NO                   13
HIGH                      LOW                                              NO                      YES                  14
HIGH                      LOW                                              YES                     NO                   15
HIGH                      LOW                                              YES                     YES                  16
The level of access based on the ALMS analysis
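The rows of the access table follow a binary counting pattern, which the Python sketch below reproduces. This encoding is an observation about the table, not a formula given in the slides; the function name `access_value` and its boolean parameters are hypothetical. Under this pattern the best case (HIGH, LOW, YES, YES) evaluates to 16.

```python
def access_value(tools_access_high, expertise_low,
                 meaningfully_editable, source_file_access):
    """Map the four ALMS answers (booleans) to an access value.

    tools_access_high:     access to editing tools is HIGH
    expertise_low:         expertise required to revise or remix is LOW
    meaningfully_editable: the resource is meaningfully editable (YES)
    source_file_access:    the source file is available (YES)
    """
    return (1
            + 8 * int(tools_access_high)
            + 4 * int(expertise_low)
            + 2 * int(meaningfully_editable)
            + 1 * int(source_file_access))

# LOW, HIGH, NO, NO is the first row of the table.
print(access_value(False, False, False, False))
# HIGH, LOW, YES, NO is the row with value 15.
print(access_value(True, True, True, False))
```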
29. Relevance
Search rank                                        Value
Below the top 30 ranks of the search results       1
Within the top 21-30 ranks of the search results   2
Within the top 11-20 ranks of the search results   3
Within the top 10 ranks of the search results      4
The level of relevance based on search rank (Vaughan, 2004)
• Users will only consider the top ten ranked results for a particular search as
the most relevant;
• Users will ignore the results below the top 30 ranks.
Vaughan, L. (2004). New measurements for search engine evaluation proposed and tested. Information Processing and
Management 40, 677–691.
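The rank bands in the relevance table translate directly into a small function. The name `relevance_value` is hypothetical, and rejecting ranks below 1 is an assumption added for safety.

```python
def relevance_value(rank):
    """Convert a 1-based search rank into the relevance value 1..4."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank <= 10:
        return 4  # within the top 10
    if rank <= 20:
        return 3  # ranks 11-20
    if rank <= 30:
        return 2  # ranks 21-30
    return 1      # below the top 30

for rank in (1, 10, 11, 25, 31):
    print(rank, relevance_value(rank))
```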
31. Sample Search
Search Rank   Title                                 CC Licence    File Type
1             18.01 Single Variable Calculus        CC BY-NC-SA   PDF
2             Calculus for Beginners and Artists    CC BY-NC-SA   HTML/Text
3             18.01 Single Variable Calculus        CC BY-NC-SA   PDF
4             18.013A Calculus with Applications    CC BY-NC-SA   HTML/Text
5             18.02 Multivariable Calculus          CC BY-NC-SA   PDF
6             Single Variable Calculus              CC BY-NC-SA   PDF
7             Calculus Online Textbook              CC BY-NC-SA   PDF
…
Top 10 search results returned by MERLOT for the keyword “calculus”
32. Original Search Results
The original top ten search results only
contain resources which are released
under the CC BY-NC-SA license.
6/10 resources returned are in PDF
format, which makes them difficult to reuse
and remix.
The resource ranked tenth is a
protected resource which requires a
specific username and password to
access.
33. Application of D-index
Rank After Applying D-index   Original Search Rank   Title                                            CC Licence    File Type   D-index
1                             2                      Calculus for Beginners and Artists               CC BY-NC-SA   HTML/Text   0.75
2                             4                      18.013A Calculus with Applications               CC BY-NC-SA   HTML/Text   0.75
3                             8                      Calculus for Beginners and Artists               CC BY-NC-SA   HTML/Text   0.75
4                             14                     Multivariable Calculus                           CC BY         HTML/Text   0.75
5                             19                     MATH 10250 - Elements of Calculus I, Fall 2008   CC BY-NC-SA   HTML/Text   0.56
6                             20                     18.022 Calculus                                  CC BY-NC-SA   PDF         0.56
7                             22                     Single-Variable Calculus I                       CC BY         HTML/Text   0.50
8                             25                     Single-Variable Calculus II                      CC BY         HTML/Text   0.50
Top 10 results when D-index is applied to the results returned by
MERLOT
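The re-ranking step can be sketched as a sort on the D-index. The tuples below copy the original rank, title and D-index from the table; the function name `rerank` and the tie-breaking rule (equal D-index values keep their original search order) are assumptions, although that rule is consistent with the ordering shown.

```python
# (original search rank, title, D-index), listed in arbitrary order.
results = [
    (25, "Single-Variable Calculus II", 0.50),
    (2, "Calculus for Beginners and Artists", 0.75),
    (20, "18.022 Calculus", 0.56),
    (14, "Multivariable Calculus", 0.75),
    (8, "Calculus for Beginners and Artists", 0.75),
    (22, "Single-Variable Calculus I", 0.50),
    (4, "18.013A Calculus with Applications", 0.75),
    (19, "MATH 10250 - Elements of Calculus I, Fall 2008", 0.56),
]

def rerank(results):
    """Sort by D-index, highest first; ties fall back to the original rank."""
    return sorted(results, key=lambda r: (-r[2], r[0]))

for new_rank, (old_rank, title, d) in enumerate(rerank(results), start=1):
    print(new_rank, old_rank, title, d)
```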
34. Results After Applying D-index
8/10 resources are in HTML/Text
formats, which are the most accessible
in terms of reuse.
4/10 resources are available under the
CC BY licence, which makes them the
most open resources in the list.
35. Benefits of the D-index
The application of the D-index
would greatly improve the
effectiveness of the search
with respect to locating the
most suitable resources for
use and reuse
37. The KDM
Keyword 1 Keyword 2 ……… Keyword n
Resource 1
Resource 2
………..
Resource n
The system needs a large organised CONTENT AND PORTAL
REPOSITORY for the initial learning.
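A lookup against a matrix of this shape could work as in the Python sketch below. The keyword weights, the resource names and the function `search_kdm` are hypothetical; the slides do not specify how OERScout scores or queries the KDM.

```python
# Hypothetical KDM: one row per resource, one weight per keyword column.
keywords = ["calculus", "integration", "programming"]
kdm = {
    "Resource 1": [3, 2, 0],
    "Resource 2": [0, 0, 4],
    "Resource 3": [1, 0, 0],
}

def search_kdm(query, keywords, kdm):
    """Return (resource, weight) pairs with a non-zero weight, heaviest first."""
    if query not in keywords:
        return []
    col = keywords.index(query)
    hits = [(name, row[col]) for name, row in kdm.items() if row[col] > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

print(search_kdm("calculus", keywords, kdm))
print(search_kdm("chemistry", keywords, kdm))
```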
39. Work Performed at COL
Process 2598 resources from DOER
and create the KDM
Optimise the KDM creation
Autonomously identify the CC license
and File type of resource
Apply the D-index to the results
Build OERScout Client
Build searching mechanism for the
KDM
Setup OERScout.org for KDM hosting
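One plausible way to identify the CC license and file type automatically is to look for a Creative Commons license URL in the page and inspect the file extension. The Python sketch below assumes that approach; it is not confirmed as OERScout's method, and the function names, the regular expression and the default of "HTML/Text" for unrecognised extensions are all hypothetical.

```python
import posixpath
import re
from urllib.parse import urlparse

# Matches license URLs such as creativecommons.org/licenses/by-nc-sa/3.0/
CC_URL = re.compile(
    r"creativecommons\.org/licenses/(by(?:-nc)?(?:-nd|-sa)?)/",
    re.IGNORECASE,
)

# File types named in the slides; everything else is treated as a web page.
FILE_TYPES = {".pdf": "PDF", ".doc": "DOC", ".docx": "DOCX"}

def detect_cc_licence(page_text):
    """Return a label such as 'CC BY-NC-SA', or None if no license URL is found."""
    match = CC_URL.search(page_text)
    if match is None:
        return None
    return "CC " + match.group(1).upper()

def detect_file_type(url):
    """Guess the file type from the extension in the URL path."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return FILE_TYPES.get(extension, "HTML/Text")

html = '<a href="http://creativecommons.org/licenses/by-nc-sa/3.0/">License</a>'
print(detect_cc_licence(html))
print(detect_file_type("http://example.org/notes/calculus.pdf"))
```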
40. How OERScout Works
[Diagram components: Server Tools creating KDM; Web Server hosting OERScout KDM; OERScout Web Service; OERScout Client]
44. Benefits of OERScout
Content Creators:
No need for metadata
No need for manually defining content domains
for categorisation (e.g. mathematics: calculus:
integration)
No need for publicising the availability of a
repository
No need for building custom search mechanisms
More visibility of material to a wider audience
45. Benefits of OERScout
Users:
Provides a central location for finding resources
scattered across the globe hidden in high volume
repositories.
Brings only the most Desirable resources using
the D-index and omits the rest.
Creates a complicated map of resources called
the Keyword-Document Matrix (KDM) for easier and
more accurate searching.
46. Benefits of OERScout to DOER
Provides a more accurate search mechanism
Eliminates the need for manually
categorising, cataloguing and tagging of
resources
Facilitates rapid expansion
Promotes wider access
47. Ultimate Benefit of OERScout
By using OERScout, both Content Creators
and Users only need to concentrate
on the actual content and not
the searching and location of
specific, relevant and
desirable OER from the disconnected
and disparate repositories scattered across
the globe.
48. Future Work
An OERScout web service will be made
available via oerscout.org which will
allow repositories to submit their
resources for processing and also use
the OERScout algorithm to process
search queries.
49. Further Acknowledgements
I acknowledge the supervision provided by Dr Chan Chee
Seng and Dr S. Raviraja of FCSIT, University of Malaya,
Malaysia.
A special vote of thanks to Tan Sri Dato’ Prof Gajaraj (Raj)
Dhanarajan, Prof Dato’ Wong Tat Meng and Prof Tham
Choy Yoong for allowing me to take time off of work at WOU
and be here.
I express my gratitude to Sir John Daniel and Professor
Asha Kanwar for supporting this secondment and also to all
the colleagues at COL for welcoming me into the COL family.
Last but not least, I must express my gratitude to Dr
Balaji for the support provided both professionally as a great
mentor and personally as a gracious host.
50. References
Abeywardena, I.S., Raviraja, R. and Tham, C.Y. (2012). Conceptual Framework
for Parametrically Measuring the Desirability of Open Educational Resources
using D-index. International Review of Research in Open and Distance Learning,
13(2), 104-121.
Geser, G. (2007). Open Educational Practices and Resources - OLCOS Roadmap
2012. Open Learning Content Observatory Services. Salzburg, Austria. 2007.
Retrieved December 27, 2011 from
http://www.olcos.org/cms/upload/docs/olcos_roadmap.pdf.
Hatakka, M. (2009). Build It and They Will Come? – Inhibiting Factors for Reuse of
Open Content in Developing Countries, EJISDC 37(5), 1-16.
Hilton, J., Wiley, D., Stein, J., & Johnson, A. (2010). The four R’s of openness and
ALMS Analysis: Frameworks for open educational resources. Open Learning: The
Journal of Open and Distance Learning, 25(1), 37-44.
Farber, R. (2009). Probing OER’s huge potential [Electronic Version]. Scientific
Computing 26(1), 29-29.
Joyce, A. (2007). OECD Study of OER: Forum Report, OECD. Retrieved
December 12, 2011 from
http://www.unesco.org/iiep/virtualuniversity/forumsfiche.php?queryforumspages_id=33.
Mello, J. (2012). OER Global Logo. Retrieved April 5, 2012 from
http://www.unesco.org/new/en/communication-and-information/access-to-knowledge/open-educational-resources/global-oer-logo/.
Unwin, T. (2005). Towards a Framework for the Use of ICT in Teacher Training in
Africa. Open Learning 20, 113-130.