First Encounters With Office Open Xml Matt Turner 12 4 2007
- 1. Unlock Content
First Encounters with Office Open XML
Matt Turner
Principal Consultant
December 3rd, 2007
Copyright © 2007 Mark Logic Corporation. All rights reserved. Slide 1
- 2. Agenda
Office Open XML basics
Office Open XML and XML tools
Some examples
Parting Thoughts
- 3. Office Open XML
Native format of MS Office 2007
Complete rework of the entire
productivity suite
Word, PowerPoint, Excel, etc.
All have native format of XML!
OOXML = Office Open XML
Standard through Ecma International
Formally known as ECMA-376
Approved in December 2006
- 4. Let's Have a Look
There is a lot of it . . .
A requirement was 100% compatibility
Layout based
6500+ page specification
Thousands of elements + attributes
And speed and space
Single-character QNames
Single-character namespace prefixes
No spare whitespace
But the core element set is manageable . . .
. . . for simple documents ☺
- 5. It’s Not XML, It’s Zipped XML
Zipped container with content, formatting info and
manifest
Payload varies by application, but it's all XML
New extensions: .docx, .pptx, .xlsx
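The zipped-container idea can be sketched with Python's standard zipfile module. This is a minimal illustration, not the tooling used in the talk: the package below is built in memory and holds only a manifest, one relationship file, and one content part, whereas a real .docx saved by Word carries many more parts. The names build_package and list_parts are hypothetical helpers.

```python
import io
import zipfile

# Stand-in parts for a tiny WordprocessingML package (assumption: a real
# .docx produced by Word contains many more parts than these three).
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w='
    '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>'
)


def build_package() -> bytes:
    """Zip the manifest, relationships, and payload into one container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)  # the manifest
        zf.writestr("_rels/.rels", RELS)                   # package relationships
        zf.writestr("word/document.xml", DOCUMENT)         # the content payload
    return buf.getvalue()


def list_parts(package: bytes) -> list[str]:
    """Return the names of the parts stored in the zipped container."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()


package = build_package()
print(list_parts(package))
```

Running list_parts against a file saved by Word (read in as bytes) should show the same three kinds of entries alongside styles, settings, and theme parts.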
- 6. Office Open XML Sample
- 7. Runs Present Some Issues
The core of a Word file comprises text “runs”
New styles require new runs; sometimes runs just show up
Sometimes runs split text (!!)
Needs some special handling, which we can do with XQuery
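The talk handles runs with XQuery; the sketch below shows the same idea in Python, assuming only the standard library. It reads a paragraph's text across run boundaries and merges adjacent runs whose run properties are identical. The sample paragraph and the helper names (paragraph_text, merge_runs) are made up for illustration, and only simple runs (an optional w:rPr plus a single w:t) are merged.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
NS = {"w": W}

# Hypothetical paragraph: the first two runs split one phrase for no
# visible reason; the third run differs because it is bold.
PARAGRAPH = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t>Office Op</w:t></w:r>'
    '<w:r><w:t>en XML</w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t xml:space="preserve"> rocks</w:t></w:r>'
    '</w:p>'
)


def paragraph_text(p):
    """Concatenate the text of every run, ignoring run boundaries."""
    return "".join(t.text or "" for t in p.findall("w:r/w:t", NS))


def run_signature(run):
    """Serialize the run properties so two runs can be compared."""
    rpr = run.find("w:rPr", NS)
    return ET.tostring(rpr) if rpr is not None else b""


def is_simple(run):
    """True for runs made of an optional w:rPr plus exactly one w:t."""
    tags = [child.tag for child in run if child.tag != f"{{{W}}}rPr"]
    return tags == [f"{{{W}}}t"]


def merge_runs(p):
    """Merge adjacent simple runs that share identical run properties."""
    previous = None
    for child in list(p):
        if child.tag == f"{{{W}}}r" and is_simple(child):
            if previous is not None and run_signature(previous) == run_signature(child):
                target = previous.find("w:t", NS)
                target.text = (target.text or "") + (child.find("w:t", NS).text or "")
                # Keep leading and trailing spaces significant after merging.
                target.set(XML_SPACE, "preserve")
                p.remove(child)
                continue
            previous = child
        else:
            previous = None  # anything else breaks the chain of mergeable runs
    return p


p = ET.fromstring(PARAGRAPH)
print(paragraph_text(p))
merge_runs(p)
print(len(p.findall("w:r", NS)))
```

Reading text with paragraph_text is enough for search and query; merge_runs matters when the document is written back and a phrase needs to live in one run.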
- 8. Hooks to Add Structure
Spec allows for customXml tag that you can use to add
structure
Word also has support to let you add structure
Support for schemas to control editing
Controls to let you add arbitrary tags
And flow content from external (XML) sources
Available from the Developer Tab
Enable the Developer tab from Word Options under the ‘big button’
Word itself can be configured with XML to jump-start
custom editing and XML interactions
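As a rough illustration of the customXml hook, the Python sketch below parses a small document body in which two paragraphs are wrapped in w:customXml elements and pulls out the text for a given tag name. The markup is a hand-written assumption, not taken from the slides; the urn:example:play namespace and the SPEAKER and LINE names are hypothetical, and custom_elements is an illustrative helper.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical body: two paragraphs carry custom structure, one does not.
DOC = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:customXml w:uri="urn:example:play" w:element="SPEAKER">'
    '<w:p><w:r><w:t>HAMLET</w:t></w:r></w:p>'
    '</w:customXml>'
    '<w:customXml w:uri="urn:example:play" w:element="LINE">'
    '<w:p><w:r><w:t>To be, or </w:t></w:r><w:r><w:t>not to be</w:t></w:r></w:p>'
    '</w:customXml>'
    '<w:p><w:r><w:t>Untagged paragraph</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def custom_elements(root, name):
    """Return the text inside every w:customXml whose w:element is `name`."""
    attr = f"{{{W}}}element"
    return [
        "".join(t.text or "" for t in node.iter(f"{{{W}}}t"))
        for node in root.iter(f"{{{W}}}customXml")
        if node.get(attr) == name
    ]


root = ET.fromstring(DOC)
print(custom_elements(root, "SPEAKER"))
print(custom_elements(root, "LINE"))
```

The custom tag name sits in the w:element attribute rather than in the element name itself, so queries filter on that attribute.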
- 9. Agenda
Office Open XML basics
Office Open XML and XML tools
Some examples
Parting Thoughts
- 10. What can we do with it?
It’s XML – anything!!
You can query it, transform it . . . the whole enchilada.
Create it
MS Word is now (just) an OOXML editor (!!)
There are lots of other ways to edit and create OOXML
Make the desktop connection
Drive application context direct from end-user documents
Output "first-draft" of end-user documents that work on (real)
desktops
Create content apps that work directly on collections of
Office documents, without conversion
Simplify the XML-ifying of business processes
- 11. XQuery Makes It Happen
XQuery - much more than a query language
W3C standard
Query, manipulate and render XML
XML Content Servers (like MarkLogic Server)
Application ready extensions provide complete
platform for content applications
Such as
Update features to load / maintain content
HTTP / REST interfaces
Zip tools to handle the packaging*
*MarkLogic Server only
- 12. Agenda
Office Open XML basics
Office Open XML and XML tools
Some examples
Parting Thoughts
- 13. Examples
1. Exploring Office Open XML
• Open up .zip package
• Update XML
• Repackage
• Load into Content Server
• Create CustomXml + Controls
• Query and Update
• Repackage into .docx
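The open, update, and repackage steps listed above can be sketched end to end in Python with the standard library. This is an assumption-laden stand-in for the manual zip-and-edit workflow and for the MarkLogic Server zip tools shown in the talk: the package is built in memory, the manifest and relationship parts are placeholders, and make_package, read_package, and replace_text are hypothetical helper names. It needs Python 3.9 or later.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the conventional "w" prefix on output


def make_package(parts: dict[str, bytes]) -> bytes:
    """Repackage: zip a mapping of part name to bytes into one container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in parts.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_package(package: bytes) -> dict[str, bytes]:
    """Open up the package: return every part keyed by its name."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def replace_text(package: bytes, old: str, new: str) -> bytes:
    """Update the XML of word/document.xml, then zip everything back up."""
    parts = read_package(package)
    root = ET.fromstring(parts["word/document.xml"])
    for t in root.iter(f"{{{W}}}t"):
        if t.text:
            t.text = t.text.replace(old, new)
    parts["word/document.xml"] = ET.tostring(
        root, encoding="UTF-8", xml_declaration=True
    )
    return make_package(parts)


# Placeholder parts (assumption): a real package needs a complete manifest
# and relationships before Word will open it.
original = make_package({
    "[Content_Types].xml": b"<Types/>",
    "_rels/.rels": b"<Relationships/>",
    "word/document.xml": (
        f'<w:document xmlns:w="{W}"><w:body><w:p><w:r>'
        "<w:t>First draft</w:t></w:r></w:p></w:body></w:document>"
    ).encode("utf-8"),
})
updated = replace_text(original, "First", "Second")
print(read_package(updated)["word/document.xml"].decode("utf-8"))
```

To try this on a real document, read the .docx file as bytes, pass it to replace_text, and write the result to a new file with the .docx extension.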
- 14. Examples
Unzip package and edit document.xml to add Custom Structure in
XML editor
- 15. Examples
Zip back up, rename to .docx and open in Word
Use the Developer tab to view CustomXml
- 16. Examples
• Load into MarkLogic Server, unzip and expand to load individual
XML files
- 17. Examples
• Query the XML: this XQuery
• Returns:
- 18. Examples
• Package the files back up to create a new .docx
- 19. Content Server Examples
Office Open XML = accessible format
Gives you the building blocks to create purpose-built
applications that leverage desktop apps
Content Servers put it together
MarkLogic Server combines the XML tools (XQuery)
to process it and the extensions to seamlessly
round-trip content
- 20. Content Server Examples
Generate, Query and Mash-up MS Word
1. Use XQuery to transform XML into Office Open XML
2. Use XQuery to access granular content elements in
Word documents and create new Office Open XML
3. Customize Word Ribbons to query MarkLogic Server
to get content and save new content back
- 21. Content Server Example 1
From source XML (Shakespeare plays)
- 22. Content Server Example 1
Generate Office Open XML
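The talk does this transformation in XQuery and the slide shows it only as an image, so the following is a Python sketch of the general idea under stated assumptions. It assumes the source play uses SPEECH, SPEAKER, and LINE elements (as in the widely circulated XML editions of Shakespeare) and wraps each generated paragraph in a w:customXml element so the original structure survives. The helper names w, tagged_paragraph, and speech_to_document are hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # serialize with the conventional "w" prefix

# Assumed source markup; the actual source XML in the talk is not shown.
SOURCE = (
    "<SPEECH><SPEAKER>HAMLET</SPEAKER>"
    "<LINE>To be, or not to be: that is the question:</LINE></SPEECH>"
)


def w(tag):
    """Build a namespace-qualified WordprocessingML name."""
    return f"{{{W}}}{tag}"


def tagged_paragraph(parent, name, text):
    """Append a customXml wrapper holding one paragraph with one run."""
    custom = ET.SubElement(parent, w("customXml"), {w("element"): name})
    p = ET.SubElement(custom, w("p"))
    r = ET.SubElement(p, w("r"))
    t = ET.SubElement(r, w("t"))
    t.text = text
    return custom


def speech_to_document(speech_xml):
    """Turn one SPEECH element into a document.xml string."""
    speech = ET.fromstring(speech_xml)
    document = ET.Element(w("document"))
    body = ET.SubElement(document, w("body"))
    # The outer wrapper keeps the speech as a unit; children keep their tags.
    outer = ET.SubElement(body, w("customXml"), {w("element"): speech.tag})
    for child in speech:
        tagged_paragraph(outer, child.tag, child.text or "")
    return ET.tostring(document, encoding="unicode")


result = speech_to_document(SOURCE)
print(result)
```

The output is only the main document part; it still has to be packaged with a manifest and relationships before Word can open it.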
- 23. Content Server Example 1
Package and open in Word – with structure in customXML elements
- 24. Content Server Example 2
Access granular elements of Office Open XML and create new content
- 25. Content Server Example 3
Add content using custom Mark Logic Ribbon
Insert new content into
Shakespeare play
Query MarkLogic
Server for content from
tech support content
base
- 26. Content Server Example 3
Add content using custom Mark Logic Ribbon
Insert new content into
Shakespeare play
Query MarkLogic
Server for content from
tech support content
base
Insert content into play
Actions contained in
MarkLogic Ribbon
- 27. Content Server Example 3
Save content back to contentbase
Select any content
Create a new
document with the
snippet in MarkLogic
Server contentbase
- 28. Agenda
Office Open XML basics
OOXML and XML tools
Some examples
Parting Thoughts
- 29. Conclusions
Office 2007’s native file format is XML! For real!
XQuery provides powerful tools to ingest, query,
manipulate and generate the format (it’s XML after all)
OOXML provides the building blocks for integrated
content apps based on desktop content
XML Content Servers enable these applications so . . .
- 30. Content Application Resources
OOXML Standard
http://www.ecma-international.org/publications/standards/Ecma-376.htm
Small changes – featuring OOXML
http://developer.marklogic.com/columns/smallchanges
OOXML Developers
http://openxmldeveloper.org/
Discovering XQuery (my blog)
http://xquery.typepad.com
MarkMail (XML Lists)
http://markmail.org
Mark Logic CEO Blog
http://marklogic.blogspot.com
XQuery site / developers group
http://x-query.com
Querying XML (book) Melton and Buxton
- 31. Unlock Content™
Thank You
Matt Turner
Principal Consultant
matt.turner@marklogic.com
http://xquery.typepad.com