RIPPING YOUR PDF FILES APART
What you need to know about what goes on inside your PDF files

Mark Stephens

Thursday, 29 March 12

Mark’s Bio

Working with Java and PDF since 1997
Founded IDRsolutions 1999
Speaker at Seybold, JavaOne, Business of Software
MA degree in Mediaeval History from St Andrews (how useless is that?)
Ask me about Java, PDF, business or anything which happened before 1500 AD
BUT FIRST SOME KITTENS...

The support team at IDRsolutions are waiting for your call (maybe)
The PDF reference guide

Loading page 1124 of a file
Word
Read pages 1-1123 (time passes, scroll bar shrinks)
Found it (eventually)
PDF
Read the cross-reference table(s): where do I find all the objects?
Skip to page 1124
PDF (in detail)
Read the cross-reference table(s): where do I find all the objects?
Read the Root object, which points to the Pages object
Read the object for page 1124 (it tells me the linked font, image and content objects)
Draw it
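
The "in detail" steps can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the file is a toy PDF built in memory by the hypothetical helper `build_pdf`, it uses one classic cross-reference table (no xref streams, no compression), and the page tree is flat. The function names (`read_xref`, `read_object`, `load_page`) are invented for this example and do not come from any real library.

```python
import re

def stream(data):
    """Wrap bytes as the body of a stream object."""
    return b"<< /Length %d >>\nstream\n%s\nendstream" % (len(data), data)

def build_pdf(bodies):
    """Assemble a toy PDF: header, numbered objects, xref table, trailer."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))              # byte offset of "N 0 obj"
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(bodies) + 1)
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

def read_xref(pdf):
    """Step 1: start at the END of the file and read the xref table."""
    pos = int(re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf[-1024:]).group(1))
    lines = pdf[pos:].split(b"\n")
    assert lines[0] == b"xref"
    first, count = map(int, lines[1].split())
    table = {}
    for i in range(count):
        offset, _gen, kind = lines[2 + i].split()
        if kind == b"n":                      # "n" = in use, "f" = free
            table[first + i] = int(offset)
    trailer = pdf[pdf.index(b"trailer", pos):]
    root = int(re.search(rb"/Root\s+(\d+)\s+\d+\s+R", trailer).group(1))
    return table, root

def read_object(pdf, table, num):
    """Jump straight to an object's offset; nothing before it is read."""
    start = table[num]
    return pdf[start:pdf.index(b"endobj", start)]

def ref(obj, key):
    """Follow the single 'N 0 R' reference stored under /key."""
    return int(re.search(rb"/%s\s+(\d+)\s+\d+\s+R" % key, obj).group(1))

def kids(obj):
    """List the object numbers in a (flat) /Kids array."""
    inside = re.search(rb"/Kids\s*\[(.*?)\]", obj, re.S).group(1)
    return [int(n) for n in re.findall(rb"(\d+)\s+\d+\s+R", inside)]

def load_page(pdf, page_number):
    table, root = read_xref(pdf)                            # refs table
    catalog = read_object(pdf, table, root)                 # Root object
    pages = read_object(pdf, table, ref(catalog, b"Pages")) # Pages object
    page = read_object(pdf, table, kids(pages)[page_number - 1])
    return read_object(pdf, table, ref(page, b"Contents"))  # ready to draw

DEMO = build_pdf([
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Count 2 /Kids [3 0 R 4 0 R] >>",
    b"<< /Type /Page /Parent 2 0 R /Contents 5 0 R >>",
    b"<< /Type /Page /Parent 2 0 R /Contents 6 0 R >>",
    stream(b"BT (page one) Tj ET"),
    stream(b"BT (page two) Tj ET"),
])
```

Calling `load_page(DEMO, 2)` touches only the end of the file, the Root, the Pages object, the second page and its content object. The first page's content is never read, which is the point of the slide.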
Your PDF file is a Tree
A root linked to all the branches

The PDF reference guide
Like you have never seen it before...

You can use vi or emacs if you prefer
The PDF reference guide
End of the file

The PDF root object
Like you have never seen it before...

PDF files on the web
Isn’t having the marker at the end a problem?

Not if you create it properly

Key takeaways from the PDF structure
We do not need to load the whole file
It is equally fast to load any part of it
It is very easy to replace objects with new versions
There are certain key locations - like at the end of a file
You should not edit it in a text editor
If you want to use PDF files across the Internet, there is a special mode (linearization, also known as Fast Web View) that makes the most important parts load first.
Lots of features need you to set up the PDF file correctly.
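
The "easy to replace objects" takeaway comes from the way a save can append a new copy of an object plus a new cross-reference section to the end of the file, leaving the old bytes untouched. The sketch below models that idea only. Each cross-reference section is reduced to a plain dictionary, and the byte offsets and the name `resolve` are made up for illustration.

```python
def resolve(sections, obj_num):
    """Return the byte offset of the newest version of an object.

    `sections` is ordered oldest first: the original xref section,
    then one section per later save that appended to the file.
    """
    for section in reversed(sections):        # newest section wins
        if obj_num in section:
            return section[obj_num]
    raise KeyError(obj_num)

original = {1: 15, 2: 64, 3: 120}   # object number -> byte offset (invented)
update = {3: 900}                   # a later save appended a new object 3
sections = [original, update]
```

Here `resolve(sections, 3)` gives the appended copy at offset 900, while objects 1 and 2 still resolve to their original offsets. In a real file the reader finds the older sections by following the `/Prev` entry in each trailer.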

Those PDF objects in more detail
All PDF objects have:
1. An ID number
2. (Optional) A set of dictionary key/value pairs
3. (Optional) A block of binary data.
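
As a rough illustration of those three parts, the following sketch splits one raw object into its ID, its dictionary and its binary data. It is a simplification under stated assumptions: the regular expression only copes with a flat dictionary (no nested `<< >>`), it trusts the `endstream` keyword rather than `/Length`, and `split_object` is a hypothetical name, not a library call.

```python
import re

OBJ = re.compile(
    rb"(\d+)\s+(\d+)\s+obj\s*"                      # 1. ID and generation
    rb"(<<.*?>>)?\s*"                               # 2. optional dictionary
    rb"(?:stream\r?\n(.*?)\r?\nendstream)?\s*"      # 3. optional binary data
    rb"endobj",
    re.S,
)

def split_object(raw):
    """Return (id, generation, dictionary bytes or None, data bytes or None)."""
    num, gen, dictionary, data = OBJ.search(raw).groups()
    return int(num), int(gen), dictionary, data

WITH_DATA = b"7 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj"
DICT_ONLY = b"8 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj"
```

`split_object(WITH_DATA)` yields all three parts, while `split_object(DICT_ONLY)` returns `None` for the data, matching the "(Optional)" labels on the slide.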

PDF images are not TIFF, PNG or JPEG

A word on colour
DeviceRGB
CalRGB
DeviceCMYK
ICCBased
Separation
DeviceN
DeviceGray
CalGray
Lab
Pattern
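
One practical consequence of that list is that a colour is not always three numbers. The sketch below records how many components each colour space takes and includes a deliberately naive DeviceCMYK to RGB conversion. The table name, the function name and the conversion formula are assumptions for illustration; the formula is the common textbook approximation, not what a colour-managed viewer does.

```python
# Components needed to specify one colour in each colour space.
COMPONENTS = {
    "DeviceGray": 1, "CalGray": 1,
    "DeviceRGB": 3, "CalRGB": 3, "Lab": 3,
    "DeviceCMYK": 4,
    "Separation": 1,      # a single tint value
    # ICCBased: 1, 3 or 4, stated by /N in the profile stream's dictionary
    # DeviceN:  one per named colorant
    # Pattern:  no fixed count; a pattern is painted instead of a flat colour
}

def cmyk_to_rgb(c, m, y, k):
    """Naive conversion of 0..1 CMYK values to 0..255 RGB (no colour profile)."""
    return tuple(round(255 * (1 - v) * (1 - k)) for v in (c, m, y))
```

With this approximation, full black ink (`cmyk_to_rgb(0, 0, 0, 1)`) comes out as RGB black and no ink at all comes out as white. Anything in between is where real viewers and this sketch start to disagree.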

PDF pages are ‘drawn’
0 0 0 1 k                                  set the CMYK fill colour (used for the text) to black
BT                                         start of some text
/T1_0 1 Tf                                 use the font defined as T1_0 elsewhere, at size 1
0 Tc 0 Tw 0 Ts 100 Tz 0 Tr                 set other text properties
7.5003 0 0 7.5003 272.1643 540.2979 Tm     set the text matrix: scale and position on the page
(L*) Tj                                    draw the text L*
/T1_1 1 Tf                                 change font
0.856 0 Td                                 move to a different location on the page
( = 100) Tj                                draw the text " = 100"
-0.324 -1.133 Td                           move to a different location on the page
[(whit)6(e)] TJ                            draw the text white (adjusting the spacing between t and e)
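
To make the "drawn, not stored" idea concrete, here is a small sketch that walks through the same commands and collects the strings they paint. It is a toy under several assumptions: the tokenizer ignores nested parentheses, escapes and hex strings; bytes are decoded as Latin-1 rather than through the font's encoding; the closing `ET` is added to complete the text block; and `shown_text` is a hypothetical helper name.

```python
import re

# Strings, array brackets, names, numbers, then operators.
TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)|\[|\]|/[^\s/\[\]()<>]+|[-+]?\d*\.?\d+|[A-Za-z*]+"
)

def shown_text(content):
    """Collect the strings painted by Tj and TJ, in drawing order."""
    operands, pieces, array = [], [], None
    for tok in TOKEN.findall(content):
        if tok == b"[":
            array = []                         # start collecting a TJ array
        elif tok == b"]":
            operands.append(array)
            array = None
        elif tok[:1] in b"(/+-.0123456789":    # string, name or number
            (operands if array is None else array).append(tok)
        else:                                  # operator: uses the operands so far
            if tok == b"Tj":
                pieces.append(operands[-1][1:-1])
            elif tok == b"TJ":
                strings = [t[1:-1] for t in operands[-1] if t.startswith(b"(")]
                pieces.append(b"".join(strings))
            operands = []
    return [p.decode("latin-1") for p in pieces]

SAMPLE = b"""0 0 0 1 k
BT
/T1_0 1 Tf
0 Tc 0 Tw 0 Ts 100 Tz 0 Tr
7.5003 0 0 7.5003 272.1643 540.2979 Tm
(L*) Tj
/T1_1 1 Tf
0.856 0 Td
( = 100) Tj
-0.324 -1.133 Td
[(whit)6(e)] TJ
ET
"""
```

`shown_text(SAMPLE)` returns three separate fragments rather than lines or paragraphs. The page holds no notion of "a sentence", only positioned runs of glyphs, which is why text extraction from PDF is harder than it looks.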

PDF myth - files are cross platform
Only if you create them properly...

Obfuscation for idiots!
No-one will be able to guess the secret password

20 seconds later...
And the password is....

Lastly a plea

Not all PDF creation tools are equal

In summary
