
JudaicaLink: Linked Data in the Jewish Studies FID

Presentation held at EVA/MINERVA 2017 conference.

Published in: Science

  1. WISS Research Group | JudaicaLink: Linked Data in the Jewish Studies FID - EVA/MINERVA 2017 - Nov 14th, 2017 - Jerusalem
     JudaicaLink: Linked Data in the Jewish Studies FID
     Kai Eckert
     http://www.judaicalink.org
  2. FID Jewish Studies / Israel Studies
     Creation of a specialized information service (Fach-Informations-Dienst, FID) for the domains of Jewish Studies and Israel Studies.
     Our part:
     ● Metadata integration and enrichment.
     ● Multilingual data matching.
     Funding by Consortium
  3. Portal of Jewish Studies
     Goals:
     ● Create a central access point
     ● Offer a high-performance information infrastructure
     And also:
     ● Contextualize the digital Judaica collections
     ● Enrich the metadata
     ● Connect different data sources as Linked Open Data
  4. The Portal
     http://umber.ub.uni-frankfurt.de/judaica/ (beta version, not yet officially launched)
  5. [Slide without text]
  6. Portal of Jewish Studies
     Goals:
     ● Create a central access point
     ● Offer a high-performance information infrastructure
     And also:
     ● Contextualize the digital Judaica collections
     ● Enrich the metadata
     ● Connect different data sources as Linked Open Data
  7. Re-Transliteration
     ● Automatic retro-conversion of romanized Hebrew text
     ● Improve search facilities for Hebrew speakers
     ● Needed to match data cross-lingually
     Aaron Christianson
  8. Retro-Conversion of Romanized Hebrew Text
     lĕqaḥat ṭeqsṭ ʿivrî bĕ-taʿătîq lāṭînî
     ולהפוך אותו לאותיות עבריות
     ("Take a Hebrew text in Latin transliteration and turn it into Hebrew letters.")
  9. הסיפור המלא
     הַסִּפּוּר הַמָּלֵא
     ("The full story", shown unpointed and pointed.)
  10. Problem Statements
      ● No Hebrew script until 2011
      ● Multiple standards of romanization
      ● Ambiguities: the same romanized character can refer to several Hebrew letters.
      ● Data imported from other catalogs
      ● The transliterations contain errors (yes, even librarians make mistakes, though rarely)
  11. The Plan
      1. Generate all possible original forms in a “stupid” way.
      2. Match the output against known Hebrew names / titles.
      3. Use the verified matches to train a statistical model on the word/phrase level.
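Steps 1 and 2 of the plan can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `CANDIDATES` table is a toy, hypothetical mapping (a real one would follow a specific romanization standard and handle vowels and final letters properly), and `KNOWN` stands in for a list of verified Hebrew names or titles. Step 3, the statistical model, is not covered.

```python
from itertools import product

# Toy, hypothetical mapping from romanized characters to candidate Hebrew
# letters. Vowels are simply dropped here, which is an assumption.
CANDIDATES = {
    "k": ["כ", "ק"],
    "t": ["ת", "ט"],
    "v": ["ב", "ו"],
    "a": [""],
}


def generate_forms(romanized):
    """Step 1: brute-force every combination of candidate letters."""
    # Raises KeyError for characters missing from the toy table.
    options = [CANDIDATES[ch] for ch in romanized]
    return {"".join(combo) for combo in product(*options)}


def verified_matches(romanized, known_forms):
    """Step 2: keep only generated forms that occur in a known word list."""
    return sorted(generate_forms(romanized) & known_forms)


# Hypothetical stand-in for a list of known Hebrew names / titles.
KNOWN = {"כתב", "שלום"}

forms = generate_forms("katav")
matches = verified_matches("katav", KNOWN)
```

The number of generated forms grows multiplicatively with every ambiguous character, which is why the match against known forms is needed before anything is used as training data.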
  12. Portal of Jewish Studies
      Goals:
      ● Create a central access point
      ● Offer a high-performance information infrastructure
      And also:
      ● Contextualize the digital Judaica collections
      ● Enrich the metadata
      ● Connect different data sources as Linked Open Data
  13. Contextualization of Digital Resources
      ● Find relevant data sources
      ● Find matching resources
      ● Extract information
      ● Add information to library collection
      Maral Dadvar
  14. It’s all about Labels!
      ● Labels (strings) are the first thing we search to generate matching candidates.
      ● Every additional label for a resource is a possible new entry point to create a connection.
      ● Caveat: More labels also create more false positives. Further evidence is needed to establish a link.
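Label-based candidate generation could look like the sketch below. It is only an illustration under stated assumptions: the `example.org` URIs and the label lists are made up, and the normalization (case folding plus stripping diacritics) is one plausible choice, not necessarily what JudaicaLink does.

```python
import unicodedata
from collections import defaultdict


def normalize(label):
    """Case-fold, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def build_index(resources):
    """Map every normalized label to the set of resources carrying it."""
    index = defaultdict(set)
    for uri, labels in resources.items():
        for label in labels:
            index[normalize(label)].add(uri)
    return index


def candidates(index, query):
    """Return all resources sharing a label with the query string."""
    return sorted(index.get(normalize(query), set()))


# Hypothetical resources; URIs and labels are illustrative only.
RESOURCES = {
    "http://example.org/person/1": ["Löb Aach", "Aach, Löb"],
    "http://example.org/place/1": ["Minsk"],
    "http://example.org/place/2": ["Minsk", "Mińsk Mazowiecki"],
}

index = build_index(RESOURCES)
person_hits = candidates(index, "LOB AACH")
place_hits = candidates(index, "minsk")
```

The query "minsk" returns two candidates, which shows the caveat from the slide: extra labels add entry points but also false positives, so a label match is only a candidate and not yet a link.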
  15. [Slide without text]
  16. ● Make unstructured data sources like online encyclopedias available as structured data
      ● Identify and collect relevant subsets of general-purpose knowledge bases like DBpedia
      ● Function as a single hub for the contextualization process
  17. Main Tasks
      1. Find new resource descriptions - with labels!
      2. Find new labels (and other data) for known resources.
      3. Find connections and duplicates within known resources.
      4. Make the data available for others to contextualize.
  18. Example 1: YIVO Encyclopedia
  19. What Data can we find?
      ● A title
      ● Describing text
      ● Links in texts: "surface form" => concept
      ● Pictures
      ● Description of pictures
  20. Making use of the Surface Forms
      The Minsk article links to "Poland before 1795", calling it "Polish-Lithuanian Commonwealth".
      "Polish-Lithuanian Commonwealth" is a subsection of "Poland before 1795" in the main article "Poland". So is "Demography"…
      Surface forms are evidence for labels.
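Collecting surface forms from article links can be sketched with the standard-library HTML parser. This is a sketch under assumptions: the HTML snippet and its `href` path are invented for illustration and do not reproduce the actual YIVO Encyclopedia markup.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SurfaceFormParser(HTMLParser):
    """Collect link text (surface form) per link target (concept)."""

    def __init__(self):
        super().__init__()
        self.forms = defaultdict(set)  # target -> set of surface forms
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside a link that has a target.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            surface = " ".join("".join(self._text).split())
            if surface:
                self.forms[self._href].add(surface)
            self._href = None


# Hypothetical snippet; the real article markup and URL scheme may differ.
SNIPPET = (
    "<p>Minsk was part of the "
    '<a href="/article/Poland/Poland_before_1795">'
    "Polish-Lithuanian Commonwealth</a>.</p>"
)

parser = SurfaceFormParser()
parser.feed(SNIPPET)
parser.close()
surface_forms = {target: sorted(forms) for target, forms in parser.forms.items()}
```

Each collected surface form is only evidence for a label of the link target. As the "Demography" case on the slide suggests, a form that is really a section heading would still need to be filtered out.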
  21. Example 2: Biographisches Handbuch der Rabbiner
      ● The Biographisches Handbuch der Rabbiner is an online encyclopedia provided by the Salomon L. Steinheim-Institute for German-Jewish history at the University of Duisburg-Essen, edited by Michael Brocke and Julius Carlebach.
      ● The goal of this encyclopedia is to be a complete directory of all rabbis who lived and worked in or originated from German-speaking areas since the age of enlightenment.
      ● http://www.steinheim-institut.de/wiki/index.php/Biographisches_Handbuch_der_Rabbiner_%28BHR%29
  22. Available as PDF.
  23. Some notes about PDF Sources
      ● PDF is great to keep the visual layout of a text across systems.
      ● In all other aspects, in particular regarding the access to the content, it is horrible.
      ● Digital-born PDFs (as in this case) are PDFs that have been created directly by the authoring software.
      ● Even worse: PDFs created from scans (with OCR).
  24. [Slide without text]
  25. Biographisches Portal der Rabbiner
      Fortunately, some people at the Steinheim Institute created a database from the handbook: http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001
      The URL above is shown in the browser when you view the entry on Aach, Löb.
      Great stuff:
      ● Semi-structured form of the entry
      ● “Link” to the PDF by means of volume and page number
      ● Reference to the number of the entry as it is used in the PDF.
      ● A GND number!!!
      Not so great:
      ● We cannot link to the database, as the link above does not resolve to the article.
  26. Solution
      The solution was actually already implemented: there is an undocumented way to address an entry:
      http://steinheim-institut.de:50580/cgi-bin/bhr?id=1
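Turning the URL shown in the browser into an addressable entry link might be done as in the sketch below. It assumes that the number in the fragment (`#i0001`) is the same number expected by the `id` parameter. The slides show only this one pair, so that correspondence is an assumption. The function name `entry_url` is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Base address of the undocumented entry lookup, as given on the slide.
BASE = "http://steinheim-institut.de:50580/cgi-bin/bhr"


def entry_url(browser_url):
    """Rewrite a '#i0001'-style browser URL into a '?id=1'-style link.

    Assumes the fragment number equals the entry id (not confirmed).
    """
    fragment = urlsplit(browser_url).fragment
    match = re.fullmatch(r"i(\d+)", fragment)
    if match is None:
        raise ValueError(f"no entry number in fragment: {fragment!r}")
    return f"{BASE}?id={int(match.group(1))}"


link = entry_url("http://www.steinheim-institut.de:50580/cgi-bin/bhr#i0001")
```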
  27. Example 3: DBpedia
      Generation of a DBpedia subgraph.
      1. Focused crawling of data sources
         a. Identify “relevant” resources
         b. Extract “relevant” information
      2. Find matches in the whole dataset
         a. Extract “relevant” information
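The focused-crawling step can be illustrated on a toy link graph. The slides do not define "relevant", so the category-overlap test below is an assumption, and `TOY_GRAPH` is invented data, not an excerpt from DBpedia.

```python
from collections import deque

# Invented toy graph: resource -> categories and outgoing links.
TOY_GRAPH = {
    "Rabbi": {"categories": {"Judaism"}, "links": ["Synagogue", "Clergy"]},
    "Synagogue": {"categories": {"Judaism", "Buildings"}, "links": ["Architecture"]},
    "Clergy": {"categories": {"Religion"}, "links": ["Bishop"]},
    "Architecture": {"categories": {"Buildings"}, "links": []},
    "Bishop": {"categories": {"Christianity"}, "links": []},
}


def is_relevant(resource, graph, wanted):
    """Assumed relevance test: the resource shares a wanted category."""
    return bool(graph[resource]["categories"] & wanted)


def focused_crawl(seeds, graph, wanted):
    """Breadth-first crawl that only expands relevant resources."""
    seen = set(seeds)
    queue = deque(seeds)
    relevant = []
    while queue:
        current = queue.popleft()
        if not is_relevant(current, graph, wanted):
            continue  # do not follow links out of irrelevant resources
        relevant.append(current)
        for target in graph[current]["links"]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return relevant


subgraph = focused_crawl(["Rabbi"], TOY_GRAPH, {"Judaism"})
```

Because irrelevant resources are not expanded, the crawl stays close to the seeds. Resources reachable only through irrelevant pages are missed, which is one reason for the separate second step that looks for matches in the whole dataset.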
  28. Interlinking
      The more data sources we have, the better we can use them to support the linking process.
  29. Architecture and Deployment
      Triple store and SPARQL endpoint: Apache Jena Fuseki
      Linked Data frontend (URI dereferencing, HTML views): Pubby (DM2E version)
      Static HTML pages of the website: Hugo
      Versioning and management: GitHub
      Search access: Elasticsearch (planned)
  30. Dataset Description in Markdown with Metadata Frontmatter
      +++
      author = "Kai Eckert"
      title = "Yivo Encyclopedia"
      website = "http://www.yivoencyclopedia.org"
      example = "http://data.judaicalink.org/data/yivo/Moscow"
      graph = "http://data.judaicalink.org/data/yivo"
      loaded = true
      [[files]]
      url = "http://data.judaicalink.org/dumps/yivo/current/yivo.n3.gz"
      description = "Extraction from YIVO Encyclopediae"
      +++
      The YIVO Encyclopedia of Jews in Eastern Europe, courtesy of the YIVO Institute of Jewish Research, NY.
      <!--more-->
      ...
  31. The whole website is maintained via GitHub
  32. Every new commit gets pushed to the web server
      Via the static site generator Hugo, all HTML pages are generated.
      +++
      author = "Kai Eckert"
      title = "Yivo Encyclopedia"
      website = "http://www.yivoencyclopedia.org"
      example = "http://data.judaicalink.org/data/yivo/Moscow"
      graph = "http://data.judaicalink.org/data/yivo"
      loaded = true
      [[files]]
      url = "http://data.judaicalink.org/dumps/yivo/current/yivo.n3.gz"
      description = "Extraction from YIVO Encyclopediae"
      +++
      The YIVO Encyclopedia of Jews in Eastern Europe, courtesy of the YIVO Institute of Jewish Research, NY.
      <!--more-->
      ...
  33. Every new commit gets pushed to the web server
      A Python script parses the metadata of the pages and loads and unloads the datasets automatically.
      Advantages:
      ● No one needs access to the server.
      ● Write access to the data is easily managed via GitHub.
      ● Data dumps of all datasets are always available.
      ● Description, dumps and loaded data are always in sync.
      ● The history of the datasets (and the dumps) is maintained.
      ● Mistakes can easily be reverted by going back to an earlier commit.
  34. Thank you.
      http://slideshare.net/kaiec
      http://www.wisslab.org