SlideShare une entreprise Scribd logo
1  sur  188
Télécharger pour lire hors ligne
Let’s write a PDF file
A simple walk-through to learn
the basics of the PDF format
(at your rhythm)
PDF = Portable Document Format
r2
Ange Albertini
reverse engineering &
visual documentation
@angealbertini
ange@corkami.com
http://www.corkami.com
Goal:
write a “Hello World” in PDF
PDF is text-based,
with some binary in specific cases.
But not in this example,
so just open a text editor.
Statements are separated
by white space.
(any extra white space is ignored)
Any of these:
0x00 Null 0x0C Form Feed
0x09 Tab 0x0D Carriage Return
0x0A Line feed 0x20 Space
(yes, you can mix EOL style :( )
Delimiters don’t require
white space before.
( ) < > [ ] { } /
_
Let’s start!
%PDF-_
A PDF starts with a %PDF-? signature
followed by a version number.
1.0 <= version number <= 1.7
(it doesn’t really matter here)
%PDF-1.3
_
Ok, we have a valid signature ☺
%PDF-1.3
%_
A comment starts with %
until the end of the line.
%PDF-1.3
%file body
_
After the signature,
comes the file body.
(we’ll see about it later)
%PDF-1.3
%file body
xref
_ After the file body,
comes the cross reference table.
It starts with the xref keyword, on a separated line.
%PDF-1.3
%file body
xref
%xref table here
_
After the xref keyword,
comes the actual table.
(we’ll see about it later)
%PDF-1.3
%file body
xref
%xref table here
trailer_
After the table,
comes the trailer...
It starts with a trailer keyword.
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
_
(we’ll see that later too…)
...and its contents.
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
_
(with startxref)
Then, a pointer
to the xref table...
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
_
(later, too...)
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
%%EOF_
...an %%EOF marker.
Lastly, to mark
the end of the file...
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
%%EOF
Easy ;)
That’s the overall layout
of a PDF document!
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
%%EOF
Now, we just need
to fill in the rest :)
Study time
Def: name objects
A.k.a. “strings starting with a slash”
/Name
A slash, then an alphanumeric string
(no whitespace)
Case sensitive
/Name != /name
Names with incorrect case are just ignored
(no error is triggered)
Def: dictionary object
Sequence of keys and values
(no delimiter in between)
enclosed in << and >>
sets each key to value
Syntax
<<
key value key value
[key value]*…
>>
Keys are always name objects
<< /Index 1>> sets /Index to 1
<< Index 1 >> is invalid
(the key is not a name)
Dictionaries can have any length
<< /Index 1
/Count /Whatever >>
sets /Index to 1
and /Count to /Whatever
Extra white space is ignored(as usual)
<< /Index 1
/Count
/Whatever >>
is equivalent to
<< /Index 1 /Count /Whatever >>
Dictionaries can be nested.
<< /MyDict << >> >>
sets /MyDict to << >> (empty dictionary)
White space before delimiters
is not required.
<< /Index 1 /MyDict << >> >>
equivalent to
<</Index 1/MyDict<<>>>>
Def: indirect object
an object number (>0), a generation number (0*)
the obj keyword
the object content
the endobj keyword
* 99% of the time
Example
1 0 obj
3
endobj
is object #1, generation 0, containing “3”
Def: object reference
object number, object generation, R
number number R
ex: 1 0 R
Object reference
Refers to an indirect object as a value
ex: << /Root 1 0 R >> refers to
object number 1 generation 0
as the /Root
Used only as values
in a dictionary
<< /Root 1 0 R >> is OK.
<< 1 0 R /Catalog>> isn’t.
Be careful with the syntax!
“1 0 3” is a sequence of 3 numbers 1 0 3
“1 0 R” is a single reference to an object
number 1 generation 0
Def: file body
sequence of indirect objects
object order doesn’t matter
Example
1 0 obj 3 endobj
2 0 obj << /Index 1 >> endobj
defines 2 objects with different contents
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
%%EOF
Remember this?
A PDF document is defined
by a tree of objects.
%PDF-1.3
%file body
xref
%xref table here
trailer
%trailer contents
startxref
%xref pointer
%%EOF
Now, let’s start!
%PDF-1.3
%file body
xref
%xref table here
trailer
<< _ >>
startxref
%xref pointer
%%EOF
The trailer is a dictionary.
%PDF-1.3
%file body
xref
%xref table here
trailer
<< /Root_ >>
startxref
%xref pointer
%%EOF
It defines a /Root name...
%PDF-1.3
%file body
xref
%xref table here
trailer
<< /Root 1 0 R_>>
startxref
%xref pointer
%%EOF
...that refers to an object...
%PDF-1.3
%file body
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
(like all the the other objects)
...that will be in
the file body.
Recap:
the trailer is a dictionary
that refers to a root object.
%PDF-1.3
_
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Let’s create our
first object...
%PDF-1.3
1 0 obj
_
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…(with the standard
object declaration)...
%PDF-1.3
1 0 obj
<< _ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
(like most objects)
...that contains a
dictionary.
%PDF-1.3
1 0 obj
<< /Type_ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...and its /Type is...
%PDF-1.3
1 0 obj
<< /Type /Catalog_ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...defined as /Catalog...
%PDF-1.3
1 0 obj
<< /Type /Catalog _ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
the /Root object also
refers to the page tree...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages_ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...via a /Pages name...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R_>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...that refers to
another object...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
_
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...which we’ll create.
Recap:
object 1 is a catalog, and
refers to a Pages object.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
_
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Let’s create object 2.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
_
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
The usual declaration.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< _
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
It’s a dictionary too.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages_
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
The pages’ object
/Type has to be
defined as … /Pages ☺
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids_
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
This object defines
its children via /Kids...
Def: array
enclosed in [ ]
values separated by whitespace
ex: [1 2 3 4] is an array of 4 integers 1 2 3 4
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ _ ]
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...which is an array...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R_]
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
… of references
to each page object.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
_ >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
One last step...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1_>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...the number of kids
has to be set in /Count...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...and now
object 2 is complete!
Recap:
object 2 is /Pages;
it defines Kids + Count
(pages of the document).
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
_
We can add our only Kid...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
_
endobj
…(a single page)...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< _ >>
endobj
… a dictionary...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type_ >>
endobj
… defining a /Type...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page_ >>
endobj
… as /Page.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent_ >>
endobj
This grateful kid
properly recognizes
its own parent...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R_>>
endobj
… as you would
expect ☺
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
_
>>
endobj
Our page requires
resources.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources_
>>
endobj
Let’s add them...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << _ >>
>>
endobj
...as a dictionary:
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font_ >>
>>
endobj
In this case, fonts...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << _ >> >>
>>
endobj
...as a dictionary.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
_
>> >>
>>
endobj
We define one font...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1_
>> >>
>>
endobj
...by giving it a name...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << _ >>
>> >>
>>
endobj
...and setting its
parameters:
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type_ >>
>> >>
>>
endobj
its type is ...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type /Font_ >>
>> >>
>>
endobj
… font ☺
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type /Font /Subtype_ >>
>> >>
>>
endobj
Its font type is...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type /Font /Subtype /Type1_
>> >> >>
>>
endobj
…(Adobe) Type1...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type /Font /Subtype /Type1
/BaseFont_>> >> >>
>>
endobj
...and its name is...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font <<
/F1 << /Type /Font /Subtype /Type1
/BaseFont /Arial_>> >> >>
>>
endobj
.../Arial.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 << /Type /Font
/Subtype /Type1 /BaseFont /Arial >> >> >>
_
>>
endobj
One thing is missing
in our page...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 << /Type /Font
/Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents_
>>
endobj
The actual page
contents...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 << /Type /Font
/Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R_
>>
endobj
… as a reference
to another object.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 << /Type /Font
/Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
That’s all for
our page object.
Recap:
object 3 defines a /Page,
its /Parent, /Resources (fonts)
and its /Contents is
in another object.(thank you Mario!)
Study time
Def: stream objects
So far, everything is text.
How do you store binary data (images,...) ?
1 0 obj
…
endobj
Stream objects are objects.
They start and they end like any other object:
Ex: .
Stream objects contain a stream.
between stream and endstream keywords
1 0 obj
stream
<stream content>
endstream
endobj
Streams can contain anything
Yes, really!
Even binary, other file formats...
(except the endstream keyword)
Stream parameters
are stored before the stream.
a dictionary
after obj, before stream
required: stream length
optional: compression algorithm, etc…
1 0 obj
<< /Length 10 >>
stream
0123456789
endstream
endobj
Example
_
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
_
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
We create
a /Content object...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
stream
_
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
...that is a stream
object...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Study time
Page contents syntax
parameters sequence then operator
ex: param1 param2 operator
4 0 obj
stream
_
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
Text objects are delimited
by BT and ET...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
stream
BT
_
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
...(BeginText & EndText).
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
stream
BT
Tf_
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
We need to set a font,
with Tf.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
stream
BT
_ Tf
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
It takes 2 parameters:
a font name...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
stream
BT
/F1_ Tf
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
...(from the page’s
resources)...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100_Tf
ET
endstream
endobj
...and a font size.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
_
ET
endstream
endobj
We move the cursor...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
Td_
ET
endstream
endobj
...with the Td operator...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
_ Td
ET
endstream
endobj
...that takes 2 parameters...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400_Td
ET
endstream
endobj
...x and y coordinates.
(default page size: 612x792)
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Study time
Def: literal strings
enclosed in parentheses
Ex: (Hi Mum)
Can contain parentheses
(Hello() World((()
Can contain white space
( Hello
World !
)
Standard escaping is
supported
(Hello 
World rn)
Escaping is in octal
(Hell157 World)
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
_
ET
endstream
endobj
Showing a text string...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
Tj_
ET
endstream
endobj
...is done with the Tj
operator...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
_ Tj
ET
endstream
endobj
...that takes a single
parameter...
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
(_) Tj
ET
endstream
endobj
...a literal string.
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
(Hello World_) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Our contents stream
is complete...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
_
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
One last thing...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< _ >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...we need to set
its parameters...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length_ >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
… the stream length...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length 44_>>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…including white space
(new lines characters…).
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Our stream parameters
are finished...
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...so our page contents
object is finished.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
Recap:
obj 4 is a stream object with a set length,
defining the page’s contents:
declare text, set a font and size,
move cursor, display text.
The whole document is defined.
We need to polish the structure.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Our PDF defines 4 objects,
starting at index 1...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...but PDFs always have an
object 0, that is null...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
%xref table here
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...so 5 objects, starting at 0.
Warning: offsets & EOLs
We have to define offsets,
which are affected by the EOL conventions:
1 char under Linux/Mac, 2 under Windows.
(I use 1 char newlines character here)
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Let’s edit the XREF table!
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
The next line defines the
starting index...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...and the number of objects.
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Then, one line per object...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...following the
xxxxxxxxxx yyyyy a format
(10 digits, 5 digits, 1 letter).
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
The first parameter is the offset
(in decimal) of the object...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...(for the null object, it’s 0).
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 _
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Then, the generation number
(that is almost always 0)...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...but for object 0, it’s 65535.
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Then, a letter, to tell if this entry
is free (f) or in use (n).
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Lastly, each line should take 20
bytes, including EOL...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f _
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...so add a trailing space.
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Next line (the first real object)...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…object offset, in decimal...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…generation number...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…and declare the object index
in use (n)...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n _
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…and the trailing space
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
_
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
Do the same with the other
objects...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
00000 n
00000 n
00000 n _
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
…knowing that all lines
will end with “ 00000 n ”,...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n _
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
...set all offsets.
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
The cross-reference table
is finished.
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
%xref pointer
%%EOF
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
_
%%EOF
We set the startxref
pointer...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
364_
%%EOF
...as xref’s offset, in decimal
(no prepending 0s).
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R >>
startxref
364
%%EOF
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R _ >>
startxref
364
%%EOF
We also need to update the
trailer dictionary...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R /Size_ >>
startxref
364
%%EOF
...with the number of
objects...
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R /Size 5_>>
startxref
364
%%EOF
… in the PDF
(including object 0).
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R /Size 5 >>
startxref
364
%%EOF
Our PDF is now complete.
%PDF-1.3
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R
/Resources << /Font << /F1 <<
/Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >>
/Contents 4 0 R
>>
endobj
4 0 obj
<< /Length 44 >>
stream
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
endstream
endobj
xref
0 5
0000000000 65535 f
0000000010 00000 n
0000000060 00000 n
0000000120 00000 n
0000000269 00000 n
trailer
<< /Root 1 0 R /Size 5 >>
startxref
364
%%EOF
Disclaimer:
this is a minimal PDF.
Most PDF documents are much bigger,
and contain many more elements.
Our PDF:
528 bytes
4 objects
text only
A standard generated “Hello World”:
15 kiloBytes
20 objects
text and binary (embedded fonts…)
No need to type them yourself!
Hint: use “mutool clean”
to fix offsets and lengths.
http://www.mupdf.com/
⇒ mutool version
Slightly different content,
but same rendering.
%PDF-1.3
%%μῦ
1 0 obj
<</Type/Catalog/Pages 2 0 R>>
endobj
2 0 obj
<</Type/Pages/Kids[3 0 R]/Count 1>>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources 5 0 R/Contents 4 0 R>>
endobj
4 0 obj
<</Length 49>>
stream
q
BT
/F1 100 Tf
10 400 Td
(Hello World!) Tj
ET
Q
endstream
endobj
5 0 obj
<</Font<</F1<</Type/Font/Subtype/Type1/BaseFont/Arial>>>>>>
endobj
xref
0 6
0000000000 65536 f
0000000018 00000 n
0000000064 00000 n
0000000116 00000 n
0000000191 00000 n
0000000288 00000 n
trailer
<</Size 6/Root 1 0 R>>
startxref
364
%%EOF
Hint: you can directly extract
the PDF sources.
use “pdftotext --layout” on the slide deck
http://www.foolabs.com/xpdf/home.html
One more thing...
This one is important for self study.
Def: stream filters
streams can be encoded and/or compressed
algorithms can be cascaded
ex: compression, then ASCII encoding
New stream parameter:
/Filter
ex: encode the stream in ASCII
1 0 obj
<< /Length 12 >>
stream
Hello World!
endstream
endobj
1 0 obj
<< /Length 24 /Filter /ASCIIHexDecode>>
stream
48656C6C6F20576F726C6421
endstream
endobj
⇔
Ex: compression
(deflate = ZIP compression)
1 0 obj
<< /Length 12 >>
stream
Hello World!
endstream
endobj
1 0 obj
<< /Length 20 /Filter /FlateDecode>>
stream
x£¾H═╔╔¤/╩IQ♦ ∟I♦>
endstream
endobj
⇔
Filters can be cascaded.
Ex: compressed, then encoded in ASCII
1 0 obj
<< /Length 12 >>
stream
Hello World!
endstream
endobj
1 0 obj
<< /Length 40 /Filter [/ASCIIHexDecode /FlateDecode] >>
stream
789CF348CDC9C95708CF2FCA495104001C49043E
endstream
endobj
⇔
Hint: “mutool clean -d”
to remove any stream filter.
(if you want to explore PDFs by yourself)
http://www.mupdf.com/
Want more?
pdf101.corkami.com
Questions?
(you can download this poster at http://pics.corkami.com)
ACK
@Doegox @ChrisJohnRiley
@PDFKungFoo
To be
continued...?
https://leanpub.com/binaryisbeautiful
Let’s write
a PDF file
corkami.com
@angealbertini
Hail to the king, baby!
r2

Contenu connexe

Tendances

C,c++ interview q&a
C,c++ interview q&aC,c++ interview q&a
C,c++ interview q&aKumaran K
 
Caring for file formats
Caring for file formatsCaring for file formats
Caring for file formatsAnge Albertini
 
Introduction to Python
Introduction to Python Introduction to Python
Introduction to Python C. ASWINI
 
Understand unicode & utf8 in perl (2)
Understand unicode & utf8 in perl (2)Understand unicode & utf8 in perl (2)
Understand unicode & utf8 in perl (2)Jerome Eteve
 
introduction to python
 introduction to python introduction to python
introduction to pythonJincy Nelson
 
Stream Based Input Output
Stream Based Input OutputStream Based Input Output
Stream Based Input OutputBharat17485
 
Learn Python The Hard Way Presentation
Learn Python The Hard Way PresentationLearn Python The Hard Way Presentation
Learn Python The Hard Way PresentationAmira ElSharkawy
 
Programming in C
Programming in CProgramming in C
Programming in Csujathavvv
 
File Handling in C Programming
File Handling in C ProgrammingFile Handling in C Programming
File Handling in C ProgrammingRavindraSalunke3
 
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...Kamiya Toshihiro
 
Improving file formats
Improving file formatsImproving file formats
Improving file formatsAnge Albertini
 
Understanding F# Workflows
Understanding F# WorkflowsUnderstanding F# Workflows
Understanding F# Workflowsmdlm
 
C# Language Overview Part I
C# Language Overview Part IC# Language Overview Part I
C# Language Overview Part IDoncho Minkov
 
Clone detection in Python
Clone detection in PythonClone detection in Python
Clone detection in PythonValerio Maggio
 
Python: an introduction for PHP webdevelopers
Python: an introduction for PHP webdevelopersPython: an introduction for PHP webdevelopers
Python: an introduction for PHP webdevelopersGlenn De Backer
 

Tendances (19)

C,c++ interview q&a
C,c++ interview q&aC,c++ interview q&a
C,c++ interview q&a
 
Caring for file formats
Caring for file formatsCaring for file formats
Caring for file formats
 
Introduction to Python
Introduction to Python Introduction to Python
Introduction to Python
 
Understand unicode & utf8 in perl (2)
Understand unicode & utf8 in perl (2)Understand unicode & utf8 in perl (2)
Understand unicode & utf8 in perl (2)
 
introduction to python
 introduction to python introduction to python
introduction to python
 
Stream Based Input Output
Stream Based Input OutputStream Based Input Output
Stream Based Input Output
 
Learn Python The Hard Way Presentation
Learn Python The Hard Way PresentationLearn Python The Hard Way Presentation
Learn Python The Hard Way Presentation
 
Programming in C
Programming in CProgramming in C
Programming in C
 
File Handling in C Programming
File Handling in C ProgrammingFile Handling in C Programming
File Handling in C Programming
 
Programming with Python
Programming with PythonProgramming with Python
Programming with Python
 
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...
An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and ...
 
Improving file formats
Improving file formatsImproving file formats
Improving file formats
 
C tutorial
C tutorialC tutorial
C tutorial
 
Understanding F# Workflows
Understanding F# WorkflowsUnderstanding F# Workflows
Understanding F# Workflows
 
C
CC
C
 
C# Language Overview Part I
C# Language Overview Part IC# Language Overview Part I
C# Language Overview Part I
 
Clone detection in Python
Clone detection in PythonClone detection in Python
Clone detection in Python
 
PHP Basics
PHP BasicsPHP Basics
PHP Basics
 
Python: an introduction for PHP webdevelopers
Python: an introduction for PHP webdevelopersPython: an introduction for PHP webdevelopers
Python: an introduction for PHP webdevelopers
 

Similaire à Let's write a PDF file

Making Sense of Twig
Making Sense of TwigMaking Sense of Twig
Making Sense of TwigBrandon Kelly
 
03 HTML #burningkeyboards
03 HTML #burningkeyboards03 HTML #burningkeyboards
03 HTML #burningkeyboardsDenis Ristic
 
Apache Velocity
Apache Velocity Apache Velocity
Apache Velocity yesprakash
 
Mixed Effects Models - Descriptive Statistics
Mixed Effects Models - Descriptive StatisticsMixed Effects Models - Descriptive Statistics
Mixed Effects Models - Descriptive StatisticsScott Fraundorf
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchAndrew Lowe
 
You're Doing It Wrong
You're Doing It WrongYou're Doing It Wrong
You're Doing It Wrongbostonrb
 
You're Doing It Wrong
You're Doing It WrongYou're Doing It Wrong
You're Doing It Wrongbostonrb
 
Milot Shala - C++ (OSCAL2014)
Milot Shala - C++ (OSCAL2014)Milot Shala - C++ (OSCAL2014)
Milot Shala - C++ (OSCAL2014)Open Labs Albania
 
Html,javascript & css
Html,javascript & cssHtml,javascript & css
Html,javascript & cssPredhin Sapru
 
Gráficas en python
Gráficas en python Gráficas en python
Gráficas en python Jhon Valle
 

Similaire à Let's write a PDF file (20)

Making Sense of Twig
Making Sense of TwigMaking Sense of Twig
Making Sense of Twig
 
03 HTML #burningkeyboards
03 HTML #burningkeyboards03 HTML #burningkeyboards
03 HTML #burningkeyboards
 
Apache Velocity
Apache VelocityApache Velocity
Apache Velocity
 
Apache Velocity
Apache Velocity Apache Velocity
Apache Velocity
 
Os Bubna
Os BubnaOs Bubna
Os Bubna
 
Design Patterns in Ruby
Design Patterns in RubyDesign Patterns in Ruby
Design Patterns in Ruby
 
Mixed Effects Models - Descriptive Statistics
Mixed Effects Models - Descriptive StatisticsMixed Effects Models - Descriptive Statistics
Mixed Effects Models - Descriptive Statistics
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible research
 
You're Doing It Wrong
You're Doing It WrongYou're Doing It Wrong
You're Doing It Wrong
 
You're Doing It Wrong
You're Doing It WrongYou're Doing It Wrong
You're Doing It Wrong
 
Xpath
XpathXpath
Xpath
 
Conf orm - explain
Conf orm - explainConf orm - explain
Conf orm - explain
 
Bioinformatica 27-10-2011-p4-files
Bioinformatica 27-10-2011-p4-filesBioinformatica 27-10-2011-p4-files
Bioinformatica 27-10-2011-p4-files
 
Milot Shala - C++ (OSCAL2014)
Milot Shala - C++ (OSCAL2014)Milot Shala - C++ (OSCAL2014)
Milot Shala - C++ (OSCAL2014)
 
Html,javascript & css
Html,javascript & cssHtml,javascript & css
Html,javascript & css
 
Gráficas en python
Gráficas en python Gráficas en python
Gráficas en python
 
Html ppt
Html pptHtml ppt
Html ppt
 
WDD
WDDWDD
WDD
 
Words in Code
Words in CodeWords in Code
Words in Code
 
Rails vu d'un Javaiste
Rails vu d'un JavaisteRails vu d'un Javaiste
Rails vu d'un Javaiste
 

Plus de Ange Albertini

Technical challenges with file formats
Technical challenges with file formatsTechnical challenges with file formats
Technical challenges with file formatsAnge Albertini
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formatsAnge Albertini
 
Abusing archive file formats
Abusing archive file formatsAbusing archive file formats
Abusing archive file formatsAnge Albertini
 
You are *not* an idiot
You are *not* an idiotYou are *not* an idiot
You are *not* an idiotAnge Albertini
 
An introduction to inkscape
An introduction to inkscapeAn introduction to inkscape
An introduction to inkscapeAnge Albertini
 
The challenges of file formats
The challenges of file formatsThe challenges of file formats
The challenges of file formatsAnge Albertini
 
Exploiting hash collisions
Exploiting hash collisionsExploiting hash collisions
Exploiting hash collisionsAnge Albertini
 
Connecting communities
Connecting communitiesConnecting communities
Connecting communitiesAnge Albertini
 
TASBot - the perfectionist
TASBot - the perfectionistTASBot - the perfectionist
TASBot - the perfectionistAnge Albertini
 
Funky file formats - 31c3
Funky file formats - 31c3Funky file formats - 31c3
Funky file formats - 31c3Ange Albertini
 
Preserving arcade games - 31c3
Preserving arcade games -  31c3Preserving arcade games -  31c3
Preserving arcade games - 31c3Ange Albertini
 
Preserving arcade games
Preserving arcade gamesPreserving arcade games
Preserving arcade gamesAnge Albertini
 
Hide Android applications in images
Hide Android applications in imagesHide Android applications in images
Hide Android applications in imagesAnge Albertini
 

Plus de Ange Albertini (20)

Technical challenges with file formats
Technical challenges with file formatsTechnical challenges with file formats
Technical challenges with file formats
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formats
 
Abusing archive file formats
Abusing archive file formatsAbusing archive file formats
Abusing archive file formats
 
TimeCryption
TimeCryptionTimeCryption
TimeCryption
 
You are *not* an idiot
You are *not* an idiotYou are *not* an idiot
You are *not* an idiot
 
KILL MD5
KILL MD5KILL MD5
KILL MD5
 
No more dumb hex!
No more dumb hex!No more dumb hex!
No more dumb hex!
 
Beyond your studies
Beyond your studiesBeyond your studies
Beyond your studies
 
An introduction to inkscape
An introduction to inkscapeAn introduction to inkscape
An introduction to inkscape
 
The challenges of file formats
The challenges of file formatsThe challenges of file formats
The challenges of file formats
 
Exploiting hash collisions
Exploiting hash collisionsExploiting hash collisions
Exploiting hash collisions
 
Infosec & failures
Infosec & failuresInfosec & failures
Infosec & failures
 
Connecting communities
Connecting communitiesConnecting communities
Connecting communities
 
TASBot - the perfectionist
TASBot - the perfectionistTASBot - the perfectionist
TASBot - the perfectionist
 
Hacks in video games
Hacks in video gamesHacks in video games
Hacks in video games
 
Funky file formats - 31c3
Funky file formats - 31c3Funky file formats - 31c3
Funky file formats - 31c3
 
Preserving arcade games - 31c3
Preserving arcade games -  31c3Preserving arcade games -  31c3
Preserving arcade games - 31c3
 
Preserving arcade games
Preserving arcade gamesPreserving arcade games
Preserving arcade games
 
Let's talk about...
Let's talk about...Let's talk about...
Let's talk about...
 
Hide Android applications in images
Hide Android applications in imagesHide Android applications in images
Hide Android applications in images
 

Dernier

AI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human BeautyAI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human BeautyRaymond Okyere-Forson
 
Top Software Development Trends in 2024
Top Software Development Trends in  2024Top Software Development Trends in  2024
Top Software Development Trends in 2024Mind IT Systems
 
Webinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptWebinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptkinjal48
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdfMeon Technology
 
Watermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesWatermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesShyamsundar Das
 
Enterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze IncEnterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze Incrobinwilliams8624
 
Introduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntroduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntelliSource Technologies
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native BuildpacksVish Abrams
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorShane Coughlan
 
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software TeamsYour Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software TeamsJaydeep Chhasatia
 
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine HarmonyLeveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmonyelliciumsolutionspun
 
Why Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfWhy Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfBrain Inventory
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageDista
 
Growing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesGrowing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesSoftwareMill
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...OnePlan Solutions
 
Fields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxFields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxJoão Esperancinha
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
Deep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampDeep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampVICTOR MAESTRE RAMIREZ
 

Dernier (20)

AI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human BeautyAI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human Beauty
 
Top Software Development Trends in 2024
Top Software Development Trends in  2024Top Software Development Trends in  2024
Top Software Development Trends in 2024
 
Webinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.pptWebinar_050417_LeClair12345666777889.ppt
Webinar_050417_LeClair12345666777889.ppt
 
Salesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptxSalesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptx
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdf
 
Watermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesWatermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security Challenges
 
Enterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze IncEnterprise Document Management System - Qualityze Inc
Enterprise Document Management System - Qualityze Inc
 
Introduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptxIntroduction-to-Software-Development-Outsourcing.pptx
Introduction-to-Software-Development-Outsourcing.pptx
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native Buildpacks
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS Calculator
 
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software TeamsYour Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
 
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine HarmonyLeveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
 
Why Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfWhy Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdf
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
 
Growing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesGrowing Oxen: channel operators and retries
Growing Oxen: channel operators and retries
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
 
Fields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxFields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptx
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
Deep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - DatacampDeep Learning for Images with PyTorch - Datacamp
Deep Learning for Images with PyTorch - Datacamp
 

Let's write a PDF file

  • 1. Let’s write a PDF file A simple walk-through to learn the basics of the PDF format (at your rhythm) PDF = Portable Document Format r2
  • 2. Ange Albertini reverse engineering & visual documentation @angealbertini ange@corkami.com http://www.corkami.com
  • 3. Goal: write a “Hello World” in PDF
  • 4. PDF is text-based, with some binary in specific cases. But not in this example, so just open a text editor.
  • 5. Statements are separated by white space. (any extra white space is ignored) Any of these: 0x00 Null 0x0C Form Feed 0x09 Tab 0x0D Carriage Return 0x0A Line feed 0x20 Space (yes, you can mix EOL style :( )
  • 6. Delimiters don’t require white space before. ( ) < > [ ] { } /
  • 8. %PDF-_ A PDF starts with a %PDF-? signature followed by a version number. 1.0 <= version number <= 1.7 (it doesn’t really matter here)
  • 9. %PDF-1.3 _ Ok, we have a valid signature ☺
  • 10. %PDF-1.3 %_ A comment starts with % until the end of the line.
  • 11. %PDF-1.3 %file body _ After the signature, comes the file body. (we’ll see about it later)
  • 12. %PDF-1.3 %file body xref _ After the file body, comes the cross reference table. It starts with the xref keyword, on a separated line.
  • 13. %PDF-1.3 %file body xref %xref table here _ After the xref keyword, comes the actual table. (we’ll see about it later)
  • 14. %PDF-1.3 %file body xref %xref table here trailer_ After the table, comes the trailer... It starts with a trailer keyword.
  • 15. %PDF-1.3 %file body xref %xref table here trailer %trailer contents _ (we’ll see that later too…) ...and its contents.
  • 16. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref _ (with startxref) Then, a pointer to the xref table...
  • 17. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer _ (later, too...)
  • 18. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer %%EOF_ ...an %%EOF marker. Lastly, to mark the end of the file...
  • 19. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer %%EOF Easy ;) That’s the overall layout of a PDF document!
  • 20. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer %%EOF Now, we just need to fill in the rest :)
  • 22. Def: name objects A.k.a. “strings starting with a slash”
  • 23. /Name A slash, then an alphanumeric string (no whitespace)
  • 24. Case sensitive /Name != /name Names with incorrect case are just ignored (no error is triggered)
  • 25. Def: dictionary object Sequence of keys and values (no delimiter in between) enclosed in << and >> sets each key to value
  • 26. Syntax << key value key value [key value]*… >>
  • 27. Keys are always name objects << /Index 1>> sets /Index to 1 << Index 1 >> is invalid (the key is not a name)
  • 28. Dictionaries can have any length << /Index 1 /Count /Whatever >> sets /Index to 1 and /Count to /Whatever
  • 29. Extra white space is ignored(as usual) << /Index 1 /Count /Whatever >> is equivalent to << /Index 1 /Count /Whatever >>
  • 30. Dictionaries can be nested. << /MyDict << >> >> sets /MyDict to << >> (empty dictionary)
  • 31. White space before delimiters is not required. << /Index 1 /MyDict << >> >> equivalent to <</Index 1/MyDict<<>>>>
  • 32. Def: indirect object an object number (>0), a generation number (0*) the obj keyword the object content the endobj keyword * 99% of the time
  • 33. Example 1 0 obj 3 endobj is object #1, generation 0, containing “3”
  • 34. Def: object reference object number, object generation, R number number R ex: 1 0 R
  • 35. Object reference Refers to an indirect object as a value ex: << /Root 1 0 R >> refers to object number 1 generation 0 as the /Root
  • 36. Used only as values in a dictionary << /Root 1 0 R >> is OK. << 1 0 R /Catalog>> isn’t.
  • 37. Be careful with the syntax! “1 0 3” is a sequence of 3 numbers 1 0 3 “1 0 R” is a single reference to an object number 1 generation 0
  • 38. Def: file body sequence of indirect objects object order doesn’t matter
  • 39. Example 1 0 obj 3 endobj 2 0 obj << /Index 1 >> endobj defines 2 objects with different contents
  • 40. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer %%EOF Remember this?
  • 41. A PDF document is defined by a tree of objects.
  • 42. %PDF-1.3 %file body xref %xref table here trailer %trailer contents startxref %xref pointer %%EOF Now, let’s start!
  • 43. %PDF-1.3 %file body xref %xref table here trailer << _ >> startxref %xref pointer %%EOF The trailer is a dictionary.
  • 44. %PDF-1.3 %file body xref %xref table here trailer << /Root_ >> startxref %xref pointer %%EOF It defines a /Root name...
  • 45. %PDF-1.3 %file body xref %xref table here trailer << /Root 1 0 R_>> startxref %xref pointer %%EOF ...that refers to an object...
  • 46. %PDF-1.3 %file body xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF (like all the the other objects) ...that will be in the file body.
  • 47. Recap: the trailer is a dictionary that refers to a root object.
  • 48. %PDF-1.3 _ xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Let’s create our first object...
  • 49. %PDF-1.3 1 0 obj _ endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …(with the standard object declaration)...
  • 50. %PDF-1.3 1 0 obj << _ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF (like most objects) ...that contains a dictionary.
  • 51. %PDF-1.3 1 0 obj << /Type_ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...and its /Type is...
  • 52. %PDF-1.3 1 0 obj << /Type /Catalog_ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...defined as /Catalog...
  • 53. %PDF-1.3 1 0 obj << /Type /Catalog _ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF the /Root object also refers to the page tree...
  • 54. %PDF-1.3 1 0 obj << /Type /Catalog /Pages_ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...via a /Pages name...
  • 55. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R_>> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...that refers to another object...
  • 56. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj _ xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...which we’ll create.
  • 57. Recap: object 1 is a catalog, and refers to a Pages object.
  • 58. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj _ xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Let’s create object 2.
  • 59. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj _ endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF The usual declaration.
  • 60. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << _ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF It’s a dictionary too.
  • 61. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages_ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF The pages’ object /Type has to be defined as … /Pages ☺
  • 62. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids_ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF This object defines its children via /Kids...
  • 63. Def: array enclosed in [ ] values separated by whitespace ex: [1 2 3 4] is an array of 4 integers 1 2 3 4
  • 64. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ _ ] >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...which is an array...
  • 65. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R_] >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF … of references to each page object.
  • 66. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] _ >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF One last step...
  • 67. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1_>> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...the number of kids has to be set in /Count...
  • 68. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...and now object 2 is complete!
  • 69. Recap: object 2 is /Pages; it defines Kids + Count (pages of the document).
  • 70. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj _ We can add our only Kid...
  • 71. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj _ endobj …(a single page)...
  • 72. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << _ >> endobj … a dictionary...
  • 73. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type_ >> endobj … defining a /Type...
  • 74. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page_ >> endobj … as /Page.
  • 75. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent_ >> endobj This grateful kid properly recognizes its own parent...
  • 76. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R_>> endobj … as you would expect ☺
  • 77. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R _ >> endobj Our page requires resources.
  • 78. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources_ >> endobj Let’s add them...
  • 79. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << _ >> >> endobj ...as a dictionary:
  • 80. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font_ >> >> endobj In this case, fonts...
  • 81. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << _ >> >> >> endobj ...as a dictionary.
  • 82. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << _ >> >> >> endobj We define one font...
  • 83. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1_ >> >> >> endobj ...by giving it a name...
  • 84. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << _ >> >> >> >> endobj ...and setting its parameters:
  • 85. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type_ >> >> >> >> endobj its type is ...
  • 86. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font_ >> >> >> >> endobj … font ☺
  • 87. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype_ >> >> >> >> endobj Its font type is...
  • 88. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1_ >> >> >> >> endobj …(Adobe) Type1...
  • 89. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont_>> >> >> >> endobj ...and its name is...
  • 90. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial_>> >> >> >> endobj .../Arial.
  • 91. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> _ >> endobj One thing is missing in our page...
  • 92. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents_ >> endobj The actual page contents...
  • 93. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R_ >> endobj … as a reference to another object.
  • 94. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj That’s all for our page object.
  • 95. Recap: object 3 defines a /Page, its /Parent, /Resources (fonts) and its /Contents is in another object.(thank you Mario!)
  • 97. Def: stream objects So far, everything is text. How do you store binary data (images,...) ?
  • 98. 1 0 obj … endobj Stream objects are objects. They start and they end like any other object: Ex: .
  • 99. Stream objects contain a stream. between stream and endstream keywords 1 0 obj stream <stream content> endstream endobj
  • 100. Streams can contain anything Yes, really! Even binary, other file formats... (except the endstream keyword)
  • 101. Stream parameters are stored before the stream. a dictionary after obj, before stream required: stream length optional: compression algorithm, etc…
  • 102. 1 0 obj << /Length 10 >> stream 0123456789 endstream endobj Example
  • 103. _ %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 104. 4 0 obj _ endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj We create a /Content object... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 105. 4 0 obj stream _ endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj ...that is a stream object... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 107. Page contents syntax parameters sequence then operator ex: param1 param2 operator
  • 108. 4 0 obj stream _ endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj Text objects are delimited by BT and ET... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 109. 4 0 obj stream BT _ ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj ...(BeginText & EndText). xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 110. 4 0 obj stream BT Tf_ ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj We need to set a font, with Tf. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 111. 4 0 obj stream BT _ Tf ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj It takes 2 parameters: a font name... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 112. 4 0 obj stream BT /F1_ Tf ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj ...(from the page’s resources)... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 113. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj stream BT /F1 100_Tf ET endstream endobj ...and a font size. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 114. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj stream BT /F1 100 Tf _ ET endstream endobj We move the cursor... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 115. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj stream BT /F1 100 Tf Td_ ET endstream endobj ...with the Td operator... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 116. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj stream BT /F1 100 Tf _ Td ET endstream endobj ...that takes 2 parameters... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 117. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj stream BT /F1 100 Tf 10 400_Td ET endstream endobj ...x and y coordinates. (default page size: 612x792) xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 119. Def: literal strings enclosed in parentheses Ex: (Hi Mum)
  • 121. Can contain white space ( Hello World ! )
  • 123. Escaping is in octal (Hell157 World)
  • 124. 4 0 obj stream BT /F1 100 Tf 10 400 Td _ ET endstream endobj Showing a text string... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 125. 4 0 obj stream BT /F1 100 Tf 10 400 Td Tj_ ET endstream endobj ...is done with the Tj operator... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 126. 4 0 obj stream BT /F1 100 Tf 10 400 Td _ Tj ET endstream endobj ...that takes a single parameter... xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 127. 4 0 obj stream BT /F1 100 Tf 10 400 Td (_) Tj ET endstream endobj ...a literal string. xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 128. 4 0 obj stream BT /F1 100 Tf 10 400 Td (Hello World_) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 129. 4 0 obj stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Our contents stream is complete... %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 130. 4 0 obj stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 131. 4 0 obj _ stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF One last thing... %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 132. 4 0 obj << _ >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...we need to set its parameters... %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 133. 4 0 obj << /Length_ >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF … the stream length... %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 134. 4 0 obj << /Length 44_>> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …including white space (new lines characters…). %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 135. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Our stream parameters are finished... %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 136. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...so our page contents object is finished. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj
  • 137. Recap: obj 4 is a stream object with a set length, defining the page’s contents: declare text, set a font and size, move cursor, display text.
  • 138. The whole document is defined. We need to polish the structure.
  • 139. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Our PDF defines 4 objects, starting at index 1...
  • 140. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...but PDFs always have an object 0, that is null...
  • 141. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref %xref table here trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...so 5 objects, starting at 0.
  • 142. Warning: offsets & EOLs We have to define offsets, which are affected by the EOL conventions: 1 char under Linux/Mac, 2 under Windows. (I use 1 char newlines character here)
  • 143. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Let’s edit the XREF table!
  • 144. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF The next line defines the starting index...
  • 145. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...and the number of objects.
  • 146. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Then, one line per object...
  • 147. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...following the xxxxxxxxxx yyyyy a format (10 digits, 5 digits, 1 letter).
  • 148. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF The first parameter is the offset (in decimal) of the object...
  • 149. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...(for the null object, it’s 0).
  • 150. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Then, the generation number (that is almost always 0)...
  • 151. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...but for object 0, it’s 65535.
  • 152. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Then, a letter, to tell if this entry is free (f) or in use (n).
  • 153. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Lastly, each line should take 20 bytes, including EOL...
  • 154. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...so add a trailing space.
  • 155. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Next line (the first real object)...
  • 156. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …object offset, in decimal...
  • 157. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …generation number...
  • 158. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n_ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …and declare the object index in use (n)...
  • 159. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …and the trailing space
  • 160. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF Do the same with the other objects...
  • 161. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 00000 n 00000 n 00000 n _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF …knowing that all lines will end with “ 00000 n ”,...
  • 162. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n _ trailer << /Root 1 0 R >> startxref %xref pointer %%EOF ...set all offsets.
  • 163. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref %xref pointer %%EOF The cross-reference table is finished.
  • 164. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 165. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref %xref pointer %%EOF
  • 166. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref _ %%EOF We set the startxref pointer...
  • 167. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref 364_ %%EOF ...as xref’s offset, in decimal (no prepending 0s).
  • 168. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R >> startxref 364 %%EOF
  • 169. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R _ >> startxref 364 %%EOF We also need to update the trailer dictionary...
  • 170. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R /Size_ >> startxref 364 %%EOF ...with the number of objects...
  • 171. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R /Size 5_>> startxref 364 %%EOF … in the PDF (including object 0).
  • 172. 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R /Size 5 >> startxref 364 %%EOF Our PDF is now complete.
  • 173. %PDF-1.3 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj 3 0 obj << /Type /Page /Parent 2 0 R /Resources << /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Arial >> >> >> /Contents 4 0 R >> endobj 4 0 obj << /Length 44 >> stream BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET endstream endobj xref 0 5 0000000000 65535 f 0000000010 00000 n 0000000060 00000 n 0000000120 00000 n 0000000269 00000 n trailer << /Root 1 0 R /Size 5 >> startxref 364 %%EOF
  • 174. Disclaimer: this is a minimal PDF. Most PDF documents are much bigger, and contain many more elements. Our PDF: 528 bytes 4 objects text only A standard generated “Hello World”: 15 kiloBytes 20 objects text and binary (embedded fonts…)
  • 175. No need to type them yourself! Hint: use “mutool clean” to fix offsets and lengths. http://www.mupdf.com/
  • 176. ⇒ mutool version Slightly different content, but same rendering. %PDF-1.3 %%μῦ 1 0 obj <</Type/Catalog/Pages 2 0 R>> endobj 2 0 obj <</Type/Pages/Kids[3 0 R]/Count 1>> endobj 3 0 obj <</Type/Page/Parent 2 0 R/Resources 5 0 R/Contents 4 0 R>> endobj 4 0 obj <</Length 49>> stream q BT /F1 100 Tf 10 400 Td (Hello World!) Tj ET Q endstream endobj 5 0 obj <</Font<</F1<</Type/Font/Subtype/Type1/BaseFont/Arial>>>>>> endobj xref 0 6 0000000000 65536 f 0000000018 00000 n 0000000064 00000 n 0000000116 00000 n 0000000191 00000 n 0000000288 00000 n trailer <</Size 6/Root 1 0 R>> startxref 364 %%EOF
  • 177. Hint: you can directly extract the PDF sources. use “pdftotext --layout” on the slide deck http://www.foolabs.com/xpdf/home.html
  • 178. One more thing... This one is important for self study.
  • 179. Def: stream filters streams can be encoded and/or compressed algorithms can be cascaded ex: compression, then ASCII encoding
  • 180. New stream parameter: /Filter ex: encode the stream in ASCII 1 0 obj << /Length 12 >> stream Hello World! endstream endobj 1 0 obj << /Length 24 /Filter /ASCIIHexDecode>> stream 48656C6C6F20576F726C6421 endstream endobj ⇔
  • 181. Ex: compression (deflate = ZIP compression) 1 0 obj << /Length 12 >> stream Hello World! endstream endobj 1 0 obj << /Length 20 /Filter /FlateDecode>> stream x£¾H═╔╔¤/╩IQ♦ ∟I♦> endstream endobj ⇔
  • 182. Filters can be cascaded. Ex: compressed, then encoded in ASCII 1 0 obj << /Length 12 >> stream Hello World! endstream endobj 1 0 obj << /Length 40 /Filter [/ASCIIHexDecode /FlateDecode] >> stream 789CF348CDC9C95708CF2FCA495104001C49043E endstream endobj ⇔
  • 183. Hint: “mutool clean -d” to remove any stream filter. (if you want to explore PDFs by yourself) http://www.mupdf.com/
  • 185. Questions? (you can download this poster at http://pics.corkami.com)
  • 188. Let’s write a PDF file corkami.com @angealbertini Hail to the king, baby! r2