Another Way to Attack the BLOB: Server-side Access via PL/SQL and Perl
Why Server-side?
Syllabus
MARC?
A MARC record’s three pieces: the leader, the directory, and the data
Partial view of a MARC record

01551nam  22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996  dcuab  b  f000 0 eng    a  84600065   a(GPO)97054409  dGPOdDLCdMvI  an-us-az  awdoc,sudci3114100999573400aQE611.5.U6bF84 1996  a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.

The leader is the first 24 characters: 01551nam  22003738a 4500
The directory is the run of digits that follows, from 001001300000 through 830005001127.
The data is everything after the directory, starting with ocm10726696.
Dissection of MARC record leader (pertinent details)

01551 nam  22 00373 8a 4500

01551 is the record length.
00373 is the base address, the offset at which the data starts.
Dissection of MARC record directory

How to parse it: each 12-character “triplet” is associated with one field and holds a 3-character tag, a 4-character length, and a 5-character offset.

01551nam  22003738a 4500  (header)

tag  len   offset
001  0013  00000
003  0006  00013
005  0017  00019
008  0041  00036
010  0017  00077
035  0018  00094
040  0018  00112
etc.

Where in the record does a field’s data start?

Where a field’s data starts is determined by adding its offset to the base address (00373 in this leader). Data for the first field, tag 001, begins at position 373 + 0 = 373; tag 003 begins at 373 + 13 = 386; tag 005 begins at 373 + 19 = 392; and so on.
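The leader and directory arithmetic above can be sketched in a few lines. This is a minimal Python illustration, not the handout's PL/SQL or Perl; the function names `parse_leader` and `parse_directory` are hypothetical, and the sample uses only the leader and the first seven directory entries shown on the slide.

```python
LEADER_LEN = 24   # a MARC leader is 24 characters long
ENTRY_LEN = 12    # directory entry: 3-char tag, 4-char length, 5-char offset


def parse_leader(leader):
    """Return (record_length, base_address) taken from the leader."""
    record_length = int(leader[0:5])    # first five characters
    base_address = int(leader[12:17])   # zero-based positions 12..16
    return record_length, base_address


def parse_directory(directory):
    """Split a directory string into (tag, length, offset) triplets."""
    entries = []
    for pos in range(0, len(directory) - ENTRY_LEN + 1, ENTRY_LEN):
        entry = directory[pos:pos + ENTRY_LEN]
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return entries


leader = "01551nam  22003738a 4500"
directory = ("001001300000" "003000600013" "005001700019" "008004100036"
             "010001700077" "035001800094" "040001800112")

record_length, base_address = parse_leader(leader)
entries = parse_directory(directory)

# Absolute start of each field's data = base address + offset.
starts = [(tag, base_address + offset) for tag, _length, offset in entries]

for tag, start in starts[:3]:
    print(tag, start)
```

The three printed positions should agree with the 373, 386, and 392 worked out on the slide.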
Partial view of a raw MARC record, data section

In the raw record, the MARC format's binary separation characters display as “box characters”. Below they are written as <TAG>, <SUB>, and <EOR>.

<TAG> ocm10726696  <TAG> OCoLC <TAG> 19961223115432.0 <TAG> 840406s1996  dcuab  b  f000 0 eng  <TAG>   <SUB> a  84600065   <TAG>   <SUB> a(GPO)97054409 <TAG>   <SUB> dGPO <SUB> dDLC <SUB> dMvI <TAG>   <SUB> an-us-az <TAG>   <SUB> awdoc,sudc <SUB> i31141009995734 <TAG> 00 <SUB> aQE611.5.U6 <SUB> bF84 1996 <TAG>   <SUB> a06 /skipping part of record here /tural <SUB> zArizona <SUB> zYavapai County. <TAG> 0 <SUB> aGeology, Structural <SUB> zArizona <SUB> zMohave County. <TAG> 2  <SUB> aGeological Survey (U.S.) <TAG>  0 <SUB> aGeological Survey professional paper ; <SUB> v1266. <TAG> <EOR>

The MARC format uses the following characters:
<TAG>  hex 1e  tag delimiter
<SUB>  hex 1f  subfield delimiter
<EOR>  hex 1d  end of record indicator
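To make the three delimiter characters concrete, here is a small Python sketch that splits one field into its leading characters (the indicators) and its subfields. The helper name `split_field` is hypothetical, and the sample field is rebuilt by hand from the 040 data shown on the slide.

```python
FIELD_END = "\x1e"    # <TAG>: hex 1e, ends each field
SUBFIELD = "\x1f"     # <SUB>: hex 1f, starts each subfield
RECORD_END = "\x1d"   # <EOR>: hex 1d, ends the record


def split_field(raw):
    """Split one field into (indicators, [(code, value), ...])."""
    raw = raw.rstrip(FIELD_END)
    head, *parts = raw.split(SUBFIELD)
    # The first character after each delimiter is the subfield code.
    subfields = [(part[0], part[1:]) for part in parts if part]
    return head, subfields


# Field 040 from the sample record: two blank indicators, three "d" subfields.
raw_040 = ("  " + SUBFIELD + "dGPO" + SUBFIELD + "dDLC"
           + SUBFIELD + "dMvI" + FIELD_END)

indicators, subfields = split_field(raw_040)
print(repr(indicators), subfields)
print(len(raw_040))  # compare with the length in the 040 directory entry
```

The hand-built field is 18 characters long including its terminator, which matches the length 0018 recorded for tag 040 in the directory.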
Programmer’s MARC format review
Beware of the common “off by 1” error. Depending on the language you’re using, you could be off by 1 in either direction regarding your position within the record.
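The off-by-1 risk comes from how each language counts string positions: PL/SQL's SUBSTR counts from 1, while Perl's substr and Python slices count from 0. The sketch below, in Python, extracts the base address from the sample leader both ways; `substr_1based` is a hypothetical helper that only imitates 1-based behaviour.

```python
def substr_1based(text, start, length):
    """Imitate a 1-based SUBSTR(text, start, length)."""
    return text[start - 1:start - 1 + length]


leader = "01551nam  22003738a 4500"

# Zero-based: the base address occupies positions 12..16.
base_zero = leader[12:17]

# One-based: the same five characters start at position 13.
base_one = substr_1based(leader, 13, 5)

# Using the 1-based start in a 0-based language shifts the window by one.
shifted = leader[13:18]

print(base_zero, base_one, shifted)
```

Both correct extractions give 00373, while the shifted window picks up the following character instead and yields a wrong base address.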
The BLOB Plan of Attack

The table: table_data (where “table” is auth, bib, or mfhd)
The columns: table_id, record_segment, seqnum

A MARC record is typically stored entirely in one row of the table. A record longer than the record_segment size has to be stored in more than one row.

Each table_id is unique to an item’s record. However, if more than one row makes up a record, those rows share the same table_id. In that case, we’ll have seqnum = 1, 2, 3, etc., for that record.
An example of a record contained completely in one row. This record is ready to be processed after extraction from the record_segment.

auth_id  record_segment  seqnum
635406   MARC data       1

This longer record is spread across 3 rows.

auth_id  record_segment  seqnum
635406   MARC data       1
635406   MARC data       2
635406   MARC data       3

Assemble the MARC record by concatenating the MARC data in seqnum order:
MARC-record = record_segment<-seqnum1 + record_segment<-seqnum2 + record_segment<-seqnum3
This record is then ready to be processed.
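The concatenation rule can be sketched as follows. This is an illustrative Python version that works on rows already fetched into memory; the function name `assemble` and the placeholder segment strings are assumptions, not part of the presentation.

```python
from collections import defaultdict


def assemble(rows):
    """Join record segments in seqnum order.

    rows: iterable of (record_id, record_segment, seqnum), in any order.
    Returns a dict mapping record_id to the complete record.
    """
    segments = defaultdict(list)
    for record_id, segment, seqnum in rows:
        segments[record_id].append((seqnum, segment))
    return {
        record_id: "".join(seg for _seq, seg in sorted(parts))
        for record_id, parts in segments.items()
    }


# Placeholder segments stand in for real MARC data.
rows = [
    (635406, "CCC", 3),
    (635406, "AAA", 1),
    (635406, "BBB", 2),
    (635407, "single", 1),
]

records = assemble(rows)
print(records)
```

Sorting by seqnum before joining means the order in which the rows arrive does not matter.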
PL/SQL Example

The example code retrieves a few MARC records and displays them on the screen in human-readable format, along with some diagnostics. (The code examined in the following slides starts on Page 2 of the handout.)

Use a cursor to retrieve the data. Also declare the necessary variables in this section.

Open the cursor and start looping through the rows.

Get a row from the cursor into the program variables.

Assemble the MARC record. The typical record fits into one row, so seqnum = 1 and we skip the loop.

For a longer, multi-segment record (from the earlier example), we first have seqnum = 3 and put that segment into marc. Then we have seqnum = 2 and PREPEND that segment to marc. Finally we exit the loop, since seqnum is now 1, and the last statement here takes care of that segment.

Why go “backwards” in assembling a MARC record?

If we predicate the segment-to-MARC-record assembly on when the auth_id changes in our loop structure, then once it changes we've gone too far and can't go back to get the last segment needed to complete the now-previous record. It’s simpler to predicate looping on seqnum in reverse order, because there will always be a seqnum of 1. If there are multiple segments, we'll always end with a seqnum of 1 and still be on the same auth_id, and can go on processing the record. This reasoning is not for PL/SQL only, although that is where the idea came from.
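The reverse-order assembly can be sketched outside PL/SQL as well. The Python generator below assumes the rows arrive sorted by id ascending and seqnum descending, as the text describes; the name `records_from_rows` and the placeholder segments are hypothetical.

```python
def records_from_rows(rows):
    """Yield (record_id, marc) from rows ordered by id ASC, seqnum DESC."""
    marc = ""
    for record_id, segment, seqnum in rows:
        marc = segment + marc   # prepend: higher seqnums arrive first
        if seqnum == 1:         # seqnum 1 is always a record's final row
            yield record_id, marc
            marc = ""


rows = [
    (635405, "one-row record", 1),
    (635406, "CCC", 3),
    (635406, "BBB", 2),
    (635406, "AAA", 1),
]

result = list(records_from_rows(rows))
print(result)
```

Because the record is complete exactly when seqnum reaches 1, the loop never needs to look ahead to the next id or go back for a missed segment.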
Now that we have a MARC record, let’s get the record length and the data base-address. We set our pointer to the start of the directory and start looping through the directory.

As we loop through the directory, we read the tag id, its length, and its offset in the data part. The actual tag address where we get the data is the data base-address plus the offset.

In the last line here, the subfield delimiters (hex 1f = dec 31) are replaced by the vertical bar character “|” for better readability.

Along with the subfield delimiter substitution, we add some space formatting to further increase readability. Thus, instead of
0aPetroleumxDrilling fluids
we get
0|a Petroleum |x Drilling fluids
for tag data.
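One way to get the spacing shown in that example is sketched below in Python. It is not the handout's code; `format_field` is a hypothetical helper that turns each hex 1f delimiter into a vertical bar, puts a space after the subfield code, and separates subfields with a space.

```python
SUBFIELD = "\x1f"   # hex 1f = decimal 31


def format_field(data):
    """Make subfield boundaries visible with '|' and spacing."""
    head, *parts = data.split(SUBFIELD)
    pieces = ["|" + part[0] + " " + part[1:] for part in parts if part]
    return head + " ".join(pieces)


raw = "0" + SUBFIELD + "aPetroleum" + SUBFIELD + "xDrilling fluids"
formatted = format_field(raw)
print(formatted)   # 0|a Petroleum |x Drilling fluids
```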
Now we can output the tag’s data. Output is broken into 80-character chunks to get around the 255-character limit of dbms_output and for better readability.
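The chunking step is simple enough to show in a few lines. This Python sketch only illustrates the idea of cutting a long string into fixed-width pieces; the helper name `chunks` is an assumption.

```python
def chunks(text, width=80):
    """Cut text into consecutive pieces of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]


line = "x" * 200
pieces = chunks(line)
for piece in pieces:
    print(piece)
```

A 200-character value comes out as pieces of 80, 80, and 40 characters, each well under a 255-character line limit.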
We’re done with this tag, so we move on to the next tag in the directory. At the end, close the loops and clean up: end the loop for directory traversal, then end the loop for the cursor. Don’t forget that this ending character is required for your PL/SQL code to run!
Additional tools required for Perl to talk to Oracle: DBI and DBD
Getting and installing DBI and DBD
Perl Example

The following real-world example lets you retrieve an arbitrary range of MARC records from your choice of Auth, Bib, or Mfhd. Output goes to <stdout>, and can be raw MARC data or formatted for human readability. (The code examined in the following slides starts on Page 5 of the handout.)

Pull in the DBI module. Handle the program arguments and show how to use the program if necessary.

Here we create the database connection and assign its context to a database handle. We need to specify the type of database (Oracle), the name of the machine to which we’re connecting, the SID, and the username and password.

We saw this query in the PL/SQL example. Here we build the query statement, inserting the program arguments where needed. This allows the query to work with any MARC table type and an arbitrary table_id range.
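Building one query for any of the three table types might look like the sketch below. It is written in Python rather than Perl, and several details are assumptions: the function name `build_query`, the bind names `:first_id` and `:last_id`, the inclusive id range, and the use of bind placeholders for the range instead of inserting the values directly into the statement text.

```python
VALID_TABLES = {"auth", "bib", "mfhd"}


def build_query(table, first_id, last_id):
    """Return (sql, binds) for one MARC table type and an id range."""
    # Only the table name is interpolated, so restrict it to known values.
    if table not in VALID_TABLES:
        raise ValueError(f"table must be one of {sorted(VALID_TABLES)}")
    sql = (
        f"select {table}_id, record_segment, seqnum "
        f"from {table}_data "
        f"where {table}_id >= :first_id and {table}_id <= :last_id "
        f"order by {table}_id asc, seqnum desc"
    )
    return sql, {"first_id": int(first_id), "last_id": int(last_id)}


sql, binds = build_query("bib", 1000, 1999)
print(sql)
print(binds)
```

Checking the table name against a fixed set keeps a mistyped or hostile argument out of the statement text, while the id range travels separately as bind values.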
Create the query context and assign it to a statement handle. Execute the statement and receive a return code.

This is how we get rows from the result set of the query, via the statement handle. The three columns in the row fall into the list of three variables.

Raw output: on record transition, output the MARC record we just built, reset the ID variable, and store the MARC data for the record we just started reading. If we are still on the same record, keep on storing MARC data. A separate statement outputs the last record.

Formatted (not raw) output: on record transition, store the accumulated MARC record and start building a new one; otherwise just prepend to the present MARC record. A separate statement stores the last record. (We’re effectively building a MARC file in memory, a virtual file, in the $marcstuff variable.)

Release the resources associated with the statement handle and the database handle.

The following part executes for formatted, readable output. MARC data contains no CR-LFs; instead it uses the hex 1d character to delimit the end of a MARC record. Create the array of MARC records here.
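Turning the in-memory “virtual file” into an array of records amounts to splitting on the hex 1d character. A minimal Python sketch, with a hypothetical helper name and placeholder record contents:

```python
RECORD_END = "\x1d"   # hex 1d marks the end of each MARC record


def split_records(marcstuff):
    """Split concatenated MARC data into a list of individual records."""
    # Re-attach the terminator so each element is still a complete record.
    return [chunk + RECORD_END
            for chunk in marcstuff.split(RECORD_END) if chunk]


# Placeholder text stands in for real MARC records.
marcstuff = "first record" + RECORD_END + "second record" + RECORD_END
records = split_records(marcstuff)
print(len(records))
```

The empty string that a plain split leaves after the final terminator is filtered out, so the list length equals the number of records.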
Start looping through the array of MARC records.

We get and output the leader, and then get the record length and the data base-address. Then we position ourselves at the start of the directory.

Loop through the directory.

Get the tag id, its length, and its offset. Then read the tag’s data. The actual tag address where we get the data is the data base-address plus the offset.

Now do some formatting for readability. We substitute the vertical bar character “|” for the subfield delimiter, and remove the other delimiters.

Output the tag’s parameters and the data. Then go to the next tag in the directory.
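The whole directory walk can be put together in one short sketch. This is Python, not the handout's Perl, and it makes some assumptions: `build_record` is a hypothetical helper that constructs a tiny record purely so the parser has something to read, and all data is plain ASCII so that character counts equal byte counts.

```python
FT, SF, RT = "\x1e", "\x1f", "\x1d"   # field, subfield, record delimiters


def build_record(fields):
    """Build a small MARC-style record from (tag, data) pairs (test aid)."""
    directory, data, offset = "", "", 0
    for tag, value in fields:
        body = value + FT                      # length includes terminator
        directory += f"{tag}{len(body):04d}{offset:05d}"
        data += body
        offset += len(body)
    base = 24 + len(directory) + 1             # leader + directory + FT
    total = base + len(data) + 1               # plus record terminator
    leader = f"{total:05d}nam  22{base:05d}8a 4500"
    return leader + directory + FT + data + RT


def parse_record(record):
    """Walk the directory and return [(tag, data), ...]."""
    base = int(record[12:17])                  # data base-address
    fields, pos = [], 24                       # directory follows the leader
    while record[pos] != FT:                   # FT ends the directory
        tag = record[pos:pos + 3]
        length = int(record[pos + 3:pos + 7])
        offset = int(record[pos + 7:pos + 12])
        start = base + offset                  # actual address of the data
        fields.append((tag, record[start:start + length].rstrip(FT)))
        pos += 12                              # next directory entry
    return fields


fields = [
    ("001", "ocm10726696 "),
    ("003", "OCoLC"),
    ("040", "  " + SF + "dGPO" + SF + "dDLC" + SF + "dMvI"),
]
record = build_record(fields)

for tag, data in parse_record(record):
    print(tag, data.replace(SF, "|"))
```

Parsing the record that was just built returns the same fields that went in, which is a quick check that the length and offset bookkeeping agree.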
End of program: close the loops and show the count of records output.

Demo… example.pl
Perl Large Table Solution in a Nutshell

This outer loop goes through the entire table:

while beginning_bib_id <= max_bib_id
    call chunkthrudb
    set beginning_bib_id to (ending_bib_id + 1)
    increment ending_bib_id by db_increment
end while

This inner loop goes through db_increment-sized chunks:

sub chunkthrudb
    select bib_id, record_segment, seqnum
    from bib_data
    where bib_id >= beginning_bib_id and bib_id <= ending_bib_id
    order by bib_id asc, seqnum desc
    build the MARC record and call processrec
end sub

sub processrec
    process the MARC record as needed
end sub

Page 8 of the handout has a diagram illustrating this process.
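The chunked traversal can also be sketched in Python. The names `chunk_ranges`, `process_table`, `fetch_chunk`, and `process_record` are hypothetical stand-ins for the pseudocode's loop, `chunkthrudb`, and `processrec`. This sketch treats every range as inclusive at both ends and clamps the last range to the maximum id, so that no id falls between two chunks; the fake table exists only to exercise the loop.

```python
def chunk_ranges(first_id, max_id, increment):
    """Yield inclusive (begin, end) id ranges covering first_id..max_id."""
    begin = first_id
    while begin <= max_id:
        end = min(begin + increment - 1, max_id)
        yield begin, end
        begin = end + 1


def process_table(first_id, max_id, increment, fetch_chunk, process_record):
    """Fetch one chunk at a time and hand each record to process_record."""
    count = 0
    for begin, end in chunk_ranges(first_id, max_id, increment):
        for marc in fetch_chunk(begin, end):
            process_record(marc)
            count += 1
    return count


# A fake table keyed by id, standing in for a database query.
fake_table = {i: f"record {i}" for i in (1, 999, 1000, 1001, 2500)}


def fetch_chunk(begin, end):
    return [fake_table[i] for i in sorted(fake_table) if begin <= i <= end]


seen = []
count = process_table(1, 2500, 1000, fetch_chunk, seen.append)
print(count, seen)
```

Ids that sit exactly on a chunk boundary, such as 1000 and 2500 here, are the ones worth testing, since those are where a range comparison is most likely to drop a record.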
Questions? Email: zimmer@wmich.edu Phone: 616.387.3885 Thanks for listening.

Boost Fertility New Invention Ups Success Rates.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 

Another Way to Attack the BLOB: Server-side Access via PL/SQL and Perl

  • 11. Partial view of a MARC record; this is the leader (the leading "01551nam 22003738a 4500"): 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996 dcuab b f000 0 eng a 84600065 a(GPO)97054409 dGPOdDLCdMvI an-us-az awdoc,sudci3114100999573400aQE611.5.U6bF84 1996 a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.
  • 12. Partial view of a MARC record; this is the directory (the run of digits from 001001300000 through 830005001127): 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996 dcuab b f000 0 eng a 84600065 a(GPO)97054409 dGPOdDLCdMvI an-us-az awdoc,sudci3114100999573400aQE611.5.U6bF84 1996 a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.
  • 13. Partial view of a MARC record; this is the data (everything from ocm10726696 onward): 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996 dcuab b f000 0 eng a 84600065 a(GPO)97054409 dGPOdDLCdMvI an-us-az awdoc,sudci3114100999573400aQE611.5.U6bF84 1996 a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.
  • 14. Dissection of MARC record leader (pertinent details). In the leader 01551 nam 22 00373 8a 4500, the first five characters (01551) are the record length, and 00373 is the base address, the offset at which the data starts. The record: 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996 dcuab b f000 0 eng a 84600065 a(GPO)97054409 dGPOdDLCdMvI an-us-az awdoc,sudci3114100999573400aQE611.5.U6bF84 1996 a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.
  • 15. Dissection of MARC record directory: 01551nam 22003738a 4500 001001300000 003000600013 005001700019 008004100036 010001700077035001800094040001800112043001200130049003000142050002500172. How to parse it: after the header (01551nam 22003738a 4500), read each entry as tag, len, offset: 001 0013 00000; 003 0006 00013; 005 0017 00019; 008 0041 00036; 010 0017 00077; 035 0018 00094; 040 0018 00112; etc. Each 12-character “triplet” is associated with one field.
  • 16. Where in the record does a field’s data start? 01551nam 22 00373 8a 4500 001001300000 003000600013 005001700019 008004100036 010001700077035001800094040001800112043001200130049003000142050002500172. After the header (01551nam 22003738a 4500), each entry reads as tag, len, offset: 001 0013 00000; 003 0006 00013; 005 0017 00019; 008 0041 00036; 010 0017 00077; 035 0018 00094; 040 0018 00112; etc. Where a field’s data starts is determined by adding its offset to the base address (here 00373). Data for the first field, tag 001, begins at position 373 (373 + 0), tag 003 begins at 386 (373 + 13), tag 005 begins at 392 (373 + 19), etc.
  • 17. Partial view of a raw MARC record, data section. The “box characters” below are the MARC format binary separation characters. 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 ocm10726696 OCoLC19961223115432.0840406s1996 dcuab b f000 0 eng a 84600065 a(GPO)97054409 dGPOdDLCdMvI an-us-az awdoc,sudci3114100999573400aQE611.5.U6bF84 1996 a06 /skipping part of record here /turalzArizonazMohave County.2 aGeological Survey (U.S.) 0aGeological Survey professional paper ;v1266.
  • 18. Partial view of a raw MARC record, data section. The MARC format uses the following characters: <TAG> = hex 1e, tag delimiter; <SUB> = hex 1f, subfield delimiter; <EOR> = hex 1d, end of record indicator. 01551nam 22003738a 4500 001001300000003000600013005001700019008004100036 010001700077035001800094040001800112043001200130049003000142050002500172 074000900197082001600206086001700222099001700239100001800256245011000274 260011200384300003800496490005400534500016500588500007500753500003400828 500003900862504005200901650004600953650005000999650004901049710002901098 830005001127 <TAG> ocm10726696 <TAG> OCoLC <TAG> 19961223115432.0 <TAG> 840406s1996 dcuab b f000 0 eng <TAG> <SUB> a 84600065 <TAG> <SUB> a(GPO)97054409 <TAG> <SUB> dGPO <SUB> dDLC <SUB> dMvI <TAG> <SUB> an-us-az <TAG> <SUB> awdoc,sudc <SUB> i31141009995734 <TAG> 00 <SUB> aQE611.5.U6 <SUB> bF84 1996 <TAG> <SUB> a06 /skipping part of record here /tural <SUB> zArizona <SUB> zYavapai County. <TAG> 0 <SUB> aGeology, Structural <SUB> zArizona <SUB> zMohave County. <TAG> 2 <SUB> aGeological Survey (U.S.) <TAG> 0 <SUB> aGeological Survey professional paper ; <SUB> v1266. <TAG> <EOR>
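The leader, directory, and delimiter layout walked through in slides 14 to 18 can be sketched in a few lines. Python is used here purely for illustration; this is not the presenter's PL/SQL or Perl code, which lives in the handout. The function names `parse_leader`, `parse_directory`, and `read_fields` are hypothetical, and the sketch assumes a complete, well-formed record held in memory as bytes.

```python
FIELD_TERMINATOR = b"\x1e"    # shown as <TAG> on the slides
SUBFIELD_DELIMITER = b"\x1f"  # shown as <SUB> on the slides
RECORD_TERMINATOR = b"\x1d"   # shown as <EOR> on the slides
LEADER_LEN = 24               # the leader is always 24 characters
ENTRY_LEN = 12                # 3-char tag + 4-char length + 5-char offset


def parse_leader(record: bytes) -> tuple[int, int]:
    """Return (record_length, base_address) read from the leader."""
    record_length = int(record[0:5])    # e.g. b"01551" -> 1551
    base_address = int(record[12:17])   # e.g. b"00373" -> 373
    return record_length, base_address


def parse_directory(record: bytes) -> list[tuple[str, int, int]]:
    """Return one (tag, length, offset) triplet per directory entry."""
    _, base_address = parse_leader(record)
    # The directory sits between the leader and the field terminator
    # that immediately precedes the base address.
    directory = record[LEADER_LEN:base_address - 1]
    entries = []
    for pos in range(0, len(directory), ENTRY_LEN):
        entry = directory[pos:pos + ENTRY_LEN]
        tag = entry[0:3].decode("ascii")
        entries.append((tag, int(entry[3:7]), int(entry[7:12])))
    return entries


def read_fields(record: bytes) -> list[tuple[str, bytes]]:
    """Return (tag, data) pairs; data starts at base address + offset."""
    _, base_address = parse_leader(record)
    fields = []
    for tag, length, offset in parse_directory(record):
        start = base_address + offset
        data = record[start:start + length]
        fields.append((tag, data.rstrip(FIELD_TERMINATOR)))
    return fields
```

Applied to the leader on the slides, `parse_leader` yields a record length of 1551 and a base address of 373, which is where the data for tag 001 begins.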
  • 27. The BLOB Plan of Attack: the table_data table (where “table” is auth, bib, or mfhd) has the columns table_id, record_segment, and seqnum. A MARC record is typically stored entirely in one row of the table. Records longer than the record_segment size have to be stored in more than one row.
  • 28. The BLOB Plan of Attack: the table_data table (where “table” is auth, bib, or mfhd) has the columns table_id, record_segment, and seqnum. Each table_id is unique to an item’s record. However, if more than one row makes up a record, those rows share the same table_id. In that case, we’ll have seqnum = 1, 2, 3, etc., for that record.
  • 29. The BLOB Plan of Attack: An example of a record contained completely in one row. This record is ready to be processed after extraction from the record_segment. Row (auth_id, record_segment, seqnum): (635406, MARC data, 1).
  • 30. The BLOB Plan of Attack: This longer record is spread across 3 rows. Assemble the MARC record by concatenating MARC data in seqnum order: MARC-record = record_segment<-seqnum1 + record_segment<-seqnum2 + record_segment<-seqnum3. This record is then ready to be processed. Rows (auth_id, record_segment, seqnum): (635406, MARC data, 1); (635406, MARC data, 2); (635406, MARC data, 3).
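As a rough illustration of the assembly rule above, the following Python sketch joins the segments of one record in seqnum order. It is not taken from the handout; `assemble_record` is a hypothetical helper, and the rows are assumed to be (table_id, record_segment, seqnum) tuples that have already been fetched for a single record.

```python
from typing import Iterable


def assemble_record(rows: Iterable[tuple[int, bytes, int]]) -> bytes:
    """Concatenate the record_segment values of one record by seqnum."""
    ordered = sorted(rows, key=lambda row: row[2])  # seqnum 1, 2, 3, ...
    if len({table_id for table_id, _, _ in ordered}) > 1:
        raise ValueError("rows belong to more than one record")
    return b"".join(segment for _, segment, _ in ordered)
```

A one-row record passes through unchanged, since sorting and joining a single segment returns that segment.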
  • 32. PL/SQL Example The example code retrieves a few MARC records, and displays them on the screen in human-readable format, along with some diagnostics. (The code examined in the following slides starts on Page 2 of the handout.)
  • 33. PL/SQL Example Use a cursor to retrieve data. Also declare necessary variables in this section.
  • 34. PL/SQL Example Open the cursor and start looping through the rows
  • 35. PL/SQL Example Get a row from the cursor into the program variables
  • 36. PL/SQL Example Assemble the marc record. The typical record fits into one row, thus seqnum = 1 and we skip the loop.
  • 37. PL/SQL Example For a longer, multi-segment record (from an earlier example), we first have seqnum=3 and put it into marc. Then we have seqnum=2 and PREPEND that to marc. Last, we exit the loop since now seqnum=1, and the last statement here takes care of that final segment.
  • 38. PL/SQL Example Why go “backwards” in assembling a MARC record? If we predicate the segment-to-marc-record assembly on when the auth_id changes in our loop structure, then once it changes we've gone too far and can't go back to get the last segment to completely assemble the now previous record. It’s simpler to predicate looping on seqnum in reverse order because there will always be a seqnum of 1. If there are multiple segments, we'll always end with a seqnum of 1 and still be on the same auth_id and can go on processing the record. This reasoning is not for PL/SQL only, although that is “where” the idea came from.
  • 39. PL/SQL Example Now that we have a MARC record, let’s get the record length and data base-address. We set our pointer to the start of the directory and start looping through the directory.
  • 40. PL/SQL Example As we loop through the directory, we read the tag id, its length, and its offset in the data part. The actual tag address where we get the data is the data base-address plus the offset.
  • 41. PL/SQL Example In the last line here, the subfield delimiters (hex 1f = dec 31) are replaced by the vertical bar character “ | ” for better readability.
  • 42. PL/SQL Example Along with the subfield delimiter character substitution, we add some space formatting to further increase readability. Thus, instead of 0aPetroleumxDrilling fluids we get 0|a Petroleum |x Drilling fluids for tag data.
  • 44. PL/SQL Example Now we can output the tag’s data. Output is broken into 80-character chunks to get around the 255-character limit of dbms_output and for better readability.
  • 45. PL/SQL Example We’re done with this tag, so we move on to the next tag in the directory. At the end, close loops and clean up: end looping for directory traversal, then end looping for the cursor. Don’t forget that this ending character is required for your PL/SQL code to run!
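The three ideas from the PL/SQL walkthrough that are easiest to get wrong (assembling segments backwards, formatting subfields, and chunking output) can be sketched as follows. This is Python for illustration only, not the handout's PL/SQL; the names `assemble_backwards`, `format_tag_data`, and `chunk_lines` are hypothetical, and the rows are assumed to arrive sorted by auth_id ascending and seqnum descending.

```python
from typing import Iterable, Iterator

SUBFIELD_DELIMITER = "\x1f"   # hex 1f = dec 31
Row = tuple[int, bytes, int]  # (auth_id, record_segment, seqnum)


def assemble_backwards(rows: Iterable[Row]) -> Iterator[tuple[int, bytes]]:
    """Yield (auth_id, marc) pairs from rows sorted by seqnum descending.

    Every record ends on seqnum 1, so that row marks completion and
    there is no need to look ahead for a change of auth_id.
    """
    marc = b""
    for auth_id, segment, seqnum in rows:
        marc = segment + marc  # prepend: later segments were read first
        if seqnum == 1:
            yield auth_id, marc
            marc = b""


def format_tag_data(data: str) -> str:
    """Show each subfield delimiter as a vertical bar with spacing."""
    indicators, *subfields = data.split(SUBFIELD_DELIMITER)
    shown = " ".join(f"|{sub[0]} {sub[1:]}" for sub in subfields if sub)
    return indicators + shown


def chunk_lines(text: str, width: int = 80) -> list[str]:
    """Break text into pieces of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]
```

With the tag data from slide 42 (a 0 indicator followed by subfields a and x), `format_tag_data` produces the slide's readable form, `0|a Petroleum |x Drilling fluids`.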
  • 52. Perl Example The following real-world example lets you retrieve an arbitrary range of MARC records from your choice of Auth, Bib, or Mfhd. Output goes to <stdout>, and can be raw MARC data, or formatted for human readability. (The code examined in the following slides starts on Page 5 of the handout.)
  • 53. Perl Example Must pull in DBI stuff. Handle program arguments and show how to use the program if necessary.
  • 54. Perl Example Here we create the database connection and assign its context to a database handle. We need to specify what type of database (Oracle), the name of the machine to which we’re connecting, the SID, and the username and password.
  • 55. Perl Example We saw this query in the PL/SQL example. Here we build the query statement, inserting the program arguments where needed. This allows this query to work with any MARC table type and an arbitrary table_id range.
  • 56. Perl Example Create the query context and assign it to a statement handle. Execute the statement and receive a return code.
  • 57. Perl Example This is how we get rows from the result set of the query, via the statement handle. The three columns in the row fall into the list of three variables.
  • 58. Perl Example Raw output: On record transition, output the MARC record we just built, reset the ID variable, and store the MARC data for the record we just started reading. If on the same record, keep on storing MARC data. Output last record here.
  • 59. Perl Example Formatted (not raw) output: On record transition, store the accumulated MARC record and start building a new one, else just prepend to the present MARC record. Store last record here. (We’re effectively building a MARC file in memory, a virtual file, in the $marcstuff variable.)
  • 60. Perl Example Release the resources associated with the statement handle and the database handle.
  • 61. Perl Example Executing this part for formatted, readable output: MARC data contains no CR-LFs; instead it uses the hex 1d character to delimit the end of a MARC record. Create the array of MARC records here.
  • 62. Perl Example Executing this part for formatted, readable output: Start looping through the array of MARC records.
  • 63. Perl Example Executing this part for formatted, readable output: We get and output the leader, and then get the record length and the data base-address. Then we position ourselves at the start of the directory.
  • 64. Perl Example Executing this part for formatted, readable output: Loop through the directory.
  • 65. Perl Example Executing this part for formatted, readable output: Get the tag id, its length, and its offset. Then read the tag’s data. The actual tag address where we get the data is the data base-address plus the offset.
  • 66. Perl Example Executing this part for formatted, readable output: Now do some formatting for readability. We substitute the vertical bar character “ | ” for the subfield delimiter, and remove the other delimiters.
  • 67. Perl Example Executing this part for formatted, readable output: Output the tag’s parameters, and the data. Then go to the next tag in the directory.
  • 68. Perl Example Executing this part for formatted, readable output: End of program stuff. Close loops and show count of records output.
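The formatted-output path described in slides 59 and 61 (accumulate records into an in-memory MARC file on each record transition, then split that file on hex 1d) might look roughly like this. The sketch is in Python rather than the handout's Perl, the names `build_marc_file` and `split_records` are hypothetical, and the rows are assumed to be (table_id, record_segment, seqnum) tuples sorted by table_id ascending and seqnum descending.

```python
from typing import Iterable

RECORD_TERMINATOR = b"\x1d"  # MARC end-of-record character


def build_marc_file(rows: Iterable[tuple[int, bytes, int]]) -> bytes:
    """Accumulate complete records into one in-memory MARC 'file'."""
    marcstuff = b""    # plays the role of the slides' $marcstuff
    current_id = None
    marc = b""
    for table_id, segment, _seqnum in rows:
        if table_id != current_id:
            # Record transition: store what we built, start a new record.
            marcstuff += marc
            current_id = table_id
            marc = segment
        else:
            # Same record: rows arrive seqnum-descending, so prepend.
            marc = segment + marc
    marcstuff += marc  # store the last record after the loop ends
    return marcstuff


def split_records(marcstuff: bytes) -> list[bytes]:
    """Split the in-memory file into records on the hex 1d character."""
    return [rec for rec in marcstuff.split(RECORD_TERMINATOR) if rec]
```

Unlike the seqnum-based loop from the PL/SQL section, this variant detects the end of a record only when the next id appears, which is why the final record has to be stored after the loop.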
  • 72. Perl Large Table Solution in a Nutshell This outer loop goes through the entire table:
        while beginning_bib_id <= max_bib_id
            call chunkthrudb
            set beginning_bib_id to (ending_bib_id + 1)
            increment ending_bib_id by db_increment
        end while
  • 73. Perl Large Table Solution in a Nutshell This inner loop goes through db_increment-sized chunks:
        sub chunkthrudb
            select bib_id, record_segment, seqnum
              from bib_data
             where bib_id >= beginning_bib_id and bib_id <= ending_bib_id
             order by bib_id asc, seqnum desc
            build the MARC record and call processrec
        end sub
  • 74. Perl Large Table Solution in a Nutshell
        sub processrec
            process the MARC record as needed
        end sub
  • 75. Perl Large Table Solution in a Nutshell Page 8 of the handout has a diagram illustrating this process.
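The chunked walk through a large table can be tried end to end with a small stand-in database. The sketch below uses Python's built-in sqlite3 module in place of Oracle and Perl DBI, purely so that it runs anywhere; `walk_table` and `chunk_through_db` are hypothetical names loosely following the pseudocode, and the sketch uses inclusive bounds on both ends of each chunk so that consecutive chunks neither overlap nor skip an id.

```python
import sqlite3
from typing import Callable

ProcessRec = Callable[[int, bytes], None]


def chunk_through_db(conn: sqlite3.Connection, begin_id: int,
                     end_id: int, process_rec: ProcessRec) -> None:
    """Fetch one id range, rebuild each MARC record, and process it."""
    cursor = conn.execute(
        "SELECT bib_id, record_segment, seqnum FROM bib_data "
        "WHERE bib_id >= ? AND bib_id <= ? "
        "ORDER BY bib_id ASC, seqnum DESC",
        (begin_id, end_id),
    )
    marc = b""
    for bib_id, segment, seqnum in cursor:
        marc = segment + marc      # prepend, since seqnum is descending
        if seqnum == 1:            # seqnum 1 always completes a record
            process_rec(bib_id, marc)
            marc = b""


def walk_table(conn: sqlite3.Connection, db_increment: int,
               process_rec: ProcessRec) -> None:
    """Walk the whole table in db_increment-sized id ranges."""
    min_id, max_id = conn.execute(
        "SELECT MIN(bib_id), MAX(bib_id) FROM bib_data").fetchone()
    if max_id is None:             # empty table: nothing to do
        return
    begin_id = min_id
    end_id = begin_id + db_increment - 1
    while begin_id <= max_id:
        chunk_through_db(conn, begin_id, end_id, process_rec)
        begin_id = end_id + 1
        end_id += db_increment
```

Keeping each query to a bounded id range means only one chunk's rows are held by the client at a time, which is presumably the motivation for the chunked design on these slides.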
  • 76. Questions? Email: zimmer@wmich.edu Phone: 616.387.3885 Thanks for listening.
