Discovery-ing Symphony MARC records
1. Discovery-ing Symphony MARC records
Who
• Gary Steele, Library Systems Manager, Glasgow Caledonian University
• Bernard Scaife, Head of Technical Services, UCL Institute of Education
• Kathy Sadler, Systems Librarian, Cranfield University
What
• Each uses a different discovery layer: Summon, Primo, EDS
• Get bib records from Symphony to the discovery layer
• Enable real time availability in the discovery layer
Why
• All discovery layers rely on MARC records and holdings details but
there’s more than one way to get this data out of Symphony…
3. Getting bib records from Symphony to Summon
• Summon 2.0
• Getting bib records from Symphony to Summon
• New/Updated records report
• Deleted records report
5. Getting bib records from Symphony to Summon
• Content type, field and location mapping for Summon import
• Update and correct LDR, 007, and 008 fields in MARC records
• Do not export equipment records (e.g. laptops) or eJournal records
• Do export all books, ebooks, journals, ejournals, statutes, law reports,
theses, etheses, dissertations, archive items, DVDs, software, and
databases.
• Schedule
– Deleted records report: Every night at 23:00
– New/Updated records report: Every night at 23:10
– Full Catalogue dump: Every four months
6. New/Updated records report
• Use a customised report, gcalsummon.pl, created by Anne
• Selection:
– Item type: Exclude (~) item types (equipment and ejournals)
– Date modified: Day report is run (D0:D0)
• Process:
– Select catkeys of records based on criteria above
– Extract MARC records for selected catkeys
– Rename file to gcal-catalog-updates-YYYY-MM-DD-HH-MM-SS.mrc
– FTP file to Summon “Updates” folder
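The rename step can be sketched in Python as follows. This only illustrates the naming pattern shown above; the function name and the use of the local clock are assumptions, and gcalsummon.pl itself may build the name differently.

```python
from datetime import datetime


def updates_filename(now=None):
    """Return a name of the form gcal-catalog-updates-YYYY-MM-DD-HH-MM-SS.mrc.

    `now` is optional so the function can be checked with a fixed time;
    by default the local clock is used (an assumption).
    """
    now = now or datetime.now()
    return now.strftime("gcal-catalog-updates-%Y-%m-%d-%H-%M-%S.mrc")


if __name__ == "__main__":
    print(updates_filename())
```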
8. Deleted records report
• Use a customised report I created under Anne's guidance
• Selection:
– Records deleted on the day the report is run
• Process:
– Parse history logs
– Lookup catalogue keys
– Compare files to find deleted keys
– FTP flat file of deleted keys to Summon
9. Step 1: Parse history logs
cat log.hist | seltrans -cFV -otJ | sort | uniq >allkeys
• Select all occurrences of the “FV” (Remove Item Part B) command code
in the history log and output “tJ” datacode (Catalogue key number)
• Sort results and remove duplicates
• Write keys returned to new file (allkeys)
E201507081255540077R ^S55FVFFDCR3^FEGCAL^FcNONE^FWDCR3^
NQ506021-1001^IQT610^IS1^NOY^ILENQUIRIES^INENQUIRIES^
NSGCAL^IGT^tJ506021^IUa506021^IKTHESIS^IF2008^Fv3000000^^O
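To make the selection concrete, here is a rough Python sketch of what the seltrans step extracts from the sample line above. It is not how seltrans works internally. The field layout is an assumption inferred from that one sample: fields are separated by "^", the second field starts with "S" plus a two-digit sequence number followed by the two-character command code, and every later field starts with a two-character data code. The function name is hypothetical.

```python
def catkey_if_command(line, command="FV", datacode="tJ"):
    """Return the value of `datacode` if the log line carries `command`.

    Assumed layout (inferred from a single sample line, not from
    SirsiDynix documentation): "<header>^S<2-digit seq><command>...^<code><value>^..."
    """
    fields = line.strip().split("^")
    if len(fields) < 2:
        return None
    head = fields[1]  # e.g. "S55FVFFDCR3"
    if head[3:5] != command:
        return None
    for field in fields[2:]:
        if field.startswith(datacode):  # data codes are case-sensitive
            return field[len(datacode):]
    return None


# The sample line from the slide, joined back onto one line.
SAMPLE = (
    "E201507081255540077R ^S55FVFFDCR3^FEGCAL^FcNONE^FWDCR3^"
    "NQ506021-1001^IQT610^IS1^NOY^ILENQUIRIES^INENQUIRIES^"
    "NSGCAL^IGT^tJ506021^IUa506021^IKTHESIS^IF2008^Fv3000000^^O"
)

if __name__ == "__main__":
    print(catkey_if_command(SAMPLE))
```

In production the seltrans pipeline remains the tool to use; the sketch is only meant to show which part of the log line ends up in allkeys.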
10. Step 2: Look up catalogue keys
cat allkeys | selcatalog -iC -oC | sort | uniq >presentkeys
• Look up the catalogue keys extracted from the history logs
• Catalogue keys for deleted records are returned as errors, so they do not appear in presentkeys
• Sort results and remove duplicates
• Write keys returned to new file (presentkeys)
**error number 111 on catalog start, key=18622 flex=i9780406237002
**error number 111 on catalog start, key=198847 flex=i9780406237002
**error number 111 on catalog start, key=237053 flex=i9780406237002
**error number 111 on catalog start, key=237143 flex=i9780406237002
**error number 111 on catalog start, key=237145 flex=i9780406237002
**error number 111 on catalog start, key=237147 flex=i9780406237002
**error number 111 on catalog start, key=237148 flex=i9780406237002
**error number 111 on catalog start, key=505999 flex=a505863
**error number 111 on catalog start, key=506002 flex=a506000
11. Step 3: Compare files to find deleted keys
• comm command compares two files
• -3 suppresses keys in both files
• -1 suppresses keys unique to presentkeys
• Leaves keys that are unique to allkeys, and these keys are the deleted keys
comm -31 presentkeys allkeys >deletedkeys
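The effect of comm -31 here is a set difference, which can be sketched in Python as below. The function name is hypothetical, and unlike comm the sketch does not require sorted input.

```python
def deleted_keys(present_keys, all_keys):
    """Return keys found in all_keys but not in present_keys.

    Mirrors `comm -31 presentkeys allkeys`: lines common to both files
    and lines unique to presentkeys are dropped, leaving only the lines
    unique to allkeys.
    """
    present = set(present_keys)
    return sorted(key for key in set(all_keys) if key not in present)


if __name__ == "__main__":
    all_keys = ["18622|", "198847|", "506021|"]
    present_keys = ["506021|"]
    print(deleted_keys(present_keys, all_keys))
```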
12. Step 4: FTP flat file of deleted keys to Summon
tr -d '|' < deletedkeys >summonkeys
• File is “translated” with tr to remove the “|” characters
• Renamed to “gcal-catalog-YYYY-MM-DD-HH-MM-SS.deletes”
• FTP file to Summon “Deletes” folder
18622
198847
237053
237143
237145
237147
237148
505999
506002
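As a minimal sketch, the clean-up and naming for the deletes file could look like this in Python. The function names are hypothetical and the timestamp source is an assumption; the report itself uses tr and a rename.

```python
from datetime import datetime


def strip_pipes(lines):
    """Remove every '|' character from each line, as `tr -d '|'` does."""
    return [line.replace("|", "") for line in lines]


def deletes_filename(now=None):
    """Return a name of the form gcal-catalog-YYYY-MM-DD-HH-MM-SS.deletes."""
    now = now or datetime.now()
    return now.strftime("gcal-catalog-%Y-%m-%d-%H-%M-%S.deletes")


if __name__ == "__main__":
    print(strip_pipes(["18622|", "198847|"]))
    print(deletes_filename())
```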
16. Timing…
Symphony (7 days per week):
0:45 – Find deleted records for Primo
1:00 – Export “New” MARC records
1:10 – Export “Updated” MARC records
1:15 – Export Primo catkeys
Primo (7 days per week):
4:00 – 44IOE_LMS_ONGOING_ADD_PIPE
4:15 – 44IOE_LMS_ONGOING_UPDATE_PIPE
4:30 – 44IOE_LMS_ONGOING_DEL_PIPE
17. Prime…
Special initialisation process
Symphony: Run same export.pl report – no date filter
Primo: Special “Prime” pipe (drops/recreates index)
When do we need to do this?
Timing: Symphony export (10 minutes)
18. File management…
Reports are saved to the /Xfer folder in Symphony
FTP is run as part of the Primo import routine, so we
had to configure a Sirsi Unix account and tell Primo the
file's name and where to pick it up
19. File format and processing
Primo consumes MARC exchange format (.mrc)
Classmark and location information is exported in the 926
field for further processing
Primo runs the following functions on the imported file:
Normalisation, Enrichment, FRBR, Deduplication
20. What about deletions?
More tricky…
getdeleteds.pl custom report compares today’s catkeys with yesterday’s:
Step A
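# Inner double quotes are escaped so the Perl string stays intact
system("selcatalog -f\"~ILLLIB,EQUIP,MEMO,ROOM,SELECT\" -6\"0\" -oC "
  . "$Directives{'selcatalogoptions'} 1>$xferdir/$exportedbibkeys");
Step B
# The command is built by concatenation so it reaches the shell as one line
system("comm -3 $xferdir/$exportedbibkeys $xferdir/$bibkeysyesterday "
  . "1>$xferdir/$deletedreckeys");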
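The comparison in Step B can be illustrated in Python. comm -3 prints two columns: keys found only in the first file (today's) and, tab-indented, keys found only in the second file (yesterday's). The keys found only in yesterday's file are the candidates for deletion. How getdeleteds.pl separates the two columns afterwards is not shown in the slides, so the function below is a hypothetical sketch of the comparison only.

```python
def compare_catkeys(today_keys, yesterday_keys):
    """Return (only_today, only_yesterday), like the two columns of comm -3.

    only_today     -> keys new since yesterday
    only_yesterday -> keys that have disappeared, i.e. deleted records
    """
    today, yesterday = set(today_keys), set(yesterday_keys)
    only_today = sorted(today - yesterday)
    only_yesterday = sorted(yesterday - today)
    return only_today, only_yesterday


if __name__ == "__main__":
    new, deleted = compare_catkeys(["100", "200", "300"], ["100", "200", "250"])
    print("new:", new)
    print("deleted:", deleted)
```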
22. Config
Lots of work to set up!
Mapping of MARC fields to Primo normalised fields
Complex logic rules which can draw upon multiple MARC
fields in order to create new fields.
Flagged the source as “IOE Library Catalogue”
Created a backlink to the full record using catkey
23. Symphony and EBSCO Discovery
Kathy Sadler, Systems Librarian
@tatielane
Getting bib records from Symphony to EDS
24. Getting bib records from Symphony to EDS
Before my time at Cranfield:
• Started with Summon
• Commissioned a SirsiDynix custom report to export to Summon
• Never really implemented Summon fully
• Did a market review, moved to EDS
25. Full data extraction custom report
I copied the Summon custom report and tweaked it for EDS
• It catalogdumps selected records and makes an export file in /tmp
• It FTPs the file to EBSCO into the “full” directory
We run it weekly on a Sunday at 05:00
• It takes about 23 minutes (292,000 records)
• The finished report is emailed to my libsysadmin inbox
28. When indexing is complete
• EBSCO sends a confirmation email – usually same day, else next day
29. Daily adds/updates
Custom report ebscoftp-upd is a copy of the full data extraction report
• Only difference in script is that it FTPs into the “update” directory
diff ebscoftp-full.pl ebscoftp-upd.pl
< my $ftp_dir = "full";
---
> my $ftp_dir = "update";
• Runs daily except Sunday, after midnight
• Takes 6 minutes
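Since the two reports differ only in the target directory, the same idea can be sketched as one parameterised upload routine. This is a Python illustration using the standard ftplib module, not the Perl report itself; the function names, host, credentials and file names are all placeholders.

```python
from ftplib import FTP


def ftp_dir_for(mode):
    """Map the run mode to the FTP directory named in the slides."""
    if mode not in ("full", "update"):
        raise ValueError("mode must be 'full' or 'update'")
    return mode


def upload_export(host, user, password, local_path, remote_name, mode):
    """Upload one export file into the 'full' or 'update' directory.

    host, user and password are placeholders for the supplier's FTP details.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(ftp_dir_for(mode))
        with open(local_path, "rb") as handle:
            ftp.storbinary(f"STOR {remote_name}", handle)


if __name__ == "__main__":
    print(ftp_dir_for("update"))
```

Keeping two near-identical scripts, as described above, is simpler to schedule; a single parameterised routine trades that for one place to maintain.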
30. When the update has been indexed
• EBSCO sends a confirmation email, usually later same day
31. What about deletions?
• We keep it simple and don’t do anything special for EDS
• Items for deletion get checked out to one of three shadowed locations
[Auditors required us to adopt this practice many years ago]
– WITHDRAWN [deliberately decided to withdraw]
– GONE4GOOD [lost or missing, gave up searching]
– DUFFITEM [test items and catalogue errors]
• Cvtdiscard and Remdiscard run weekly before EBSCO Full Extract
• This works fine for us and we haven’t experienced any issues
32. Our other catalogue does things differently
• Koha catalogue at Cranfield Defence and Security in Shrivenham
• EBSCO harvests the catalogue directly via OAI-PMH
• Weekly full harvest, daily updates
• No data extraction required, it’s all done for us!
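For readers unfamiliar with OAI-PMH, a harvest is driven by plain HTTP requests. The sketch below builds a ListRecords request URL with an optional "from" date for incremental updates, as defined by the OAI-PMH protocol. The host name is a placeholder, the oai_dc metadata prefix is just the format every OAI-PMH repository must support, and the slides do not say which settings EBSCO actually uses.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords URL.

    from_date (YYYY-MM-DD) limits the harvest to records changed since
    that date; omit it for a full harvest.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host; the path is the usual Koha OAI script location.
    base = "https://koha.example.org/cgi-bin/koha/oai.pl"
    print(oai_list_records_url(base))
    print(oai_list_records_url(base, from_date="2015-07-08"))
```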
34. Real time availability in Summon
• Availability in OPAC
• Availability in Summon
• Web Services
• Availability in Web Services
• Web services xml
• Current location mapping
• Xml mapping
37. Web Services
• Provide simplified remote access to features of Symphony
• Download available from Support Center
• Free to install and use for “standard” services
• Setup guide explains how to install Web Services
• SDK explains how to use Web Services
• https://support.sirsidynix.com/zh-hans/node/55009
38. Web Services XML
Base URL: http://193.62.224.60:8080/symws/
Protocol: rest/
Service: standard/
Request: lookupTitleInfo?
Client: clientID=SummonTiree
Item ID: &titleID=201554
Parameter: &includeItemInfo=true
http://193.62.224.60:8080/symws/rest/standard/lookupTitleInfo?clientID=SummonTiree&titleID=201554&includeItemInfo=true
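The way the request URL is assembled from these parts can be sketched in Python. The function name is hypothetical; the base URL, client ID, title ID and parameter names are the ones shown above.

```python
from urllib.parse import urlencode


def lookup_title_info_url(base_url, client_id, title_id, include_item_info=True):
    """Assemble a lookupTitleInfo request from the parts listed on the slide."""
    params = {
        "clientID": client_id,
        "titleID": str(title_id),
        "includeItemInfo": "true" if include_item_info else "false",
    }
    return f"{base_url}rest/standard/lookupTitleInfo?{urlencode(params)}"


if __name__ == "__main__":
    print(lookup_title_info_url(
        "http://193.62.224.60:8080/symws/", "SummonTiree", 201554))
```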
53. RTA (2): Find environment id
Run listpol report to get envn[nn].env
File is envn13.env in /WebCat/Config/
54. RTA (2): create env file
Add a line
CUSTOMCSS|primo.css|
Points to file
Create primo.css and put it in the css folder on the server (Webcat/Config/Css/.)
56. RTA (2): How do we hide the redundant areas? #3
In Primo, the calling URL has &user_id=PRIMO
44IOE_LMS_holdings
http://ioe.sirsidynix.net.uk/uhtbin/cgisirsi/x/0/0/57/5/3?searchdata1={{control/addsrcrecordid}}{CKEY}&searchfield1=GENERAL^SUBJECT^GENERAL^^&user_id=PRIMO
57. Symphony and EBSCO Discovery
Kathy Sadler, Systems Librarian
@tatielane
Real time availability
58. We use Z39.50 for both catalogues
• May need to open port if not open already
SYSTEMCONFIG/MODULE/DISPLAY Catalog Zserver 19/06/2015
(To return to previous level, press the ESCAPE key <ESC>.)
-------------------------------------------------------------------------------
* Name : ZSERVER
* Description : ZSERVER
* Max Clients : 30
* Port Number : 2200
* Environment : Z3950
* User Access : Z3950
* Thesaurus : GENERAL
* Allow Browsing : YES
* Session Logging : YES
* Text Logging : NO
* Timeout : 300
* Initialization Timeout : 60
* User Library : CRANFIELD
* User Clearance : NONE
* Truncation Limit : 100
* Local Search : NO
Press ENTER or RETURN to continue (CTL-F TO SCROLL)[]
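One quick way to check whether the Z39.50 port is reachable from outside is a plain TCP connection test, sketched below in Python. The host name in the usage line is a placeholder; 2200 is the port shown in the ZSERVER settings above. A successful connection only shows the port is open, not that Z39.50 searches work.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable host.
        return False


if __name__ == "__main__":
    # Placeholder host name; replace with the Symphony server's address.
    print(port_is_open("symphony.example.ac.uk", 2200))
```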
59. Workflows configuration
• Workflows FAQ: “Entering item-level holdings for Z39.50 displays”
– 926 Item level holdings
– 927 MARC holdings
• Opac Configuration > Z39.50 format
– Add 926 & 927 to BIBMARC and SIRSICAT formats
• Cataloguing Configuration > Entry ID
• Cataloguing Configuration > Catalog format
– Add 926 & 927 to every format you use
60. Opac Configuration
• Opac Configuration > Z39.50 Format > BIBMARC > Modify
• Add 926 + 927 Entry IDs and entries to Z39.50 Format for BIBMARC, SIRSICAT
65. Questions, comments, queries…
Summon
Gary Steele
Library Systems Manager, Glasgow Caledonian University
gary.steele@gcu.ac.uk
Primo
Bernard Scaife
Technical Services Manager, UCL Institute of Education
b.scaife@ioe.ac.uk
EDS
Kathy Sadler
Systems Librarian, Cranfield University
k.e.sadler@cranfield.ac.uk
@tatielane
Editor's notes
The catkey from the previous file is used for matching, with the dummy title IOE DELETED CATALOGUE RECORD. On import, Primo treats “d” in character position 5 of the LDR as meaning deleted.