2. John Holliday
CTO, SharePoint Architects, Inc.
www.SharePointArchitects.us
john@johnholliday.net
5-year SharePoint Server MVP
SharePoint Author, Instructor, Developer
Information Architecture Consultant
Records Management Specialist
3. Open XML
“Document Parts”
•Most parts are XML
•Can access/modify parts without using an Office product
Common Uses
•Interrogate document data
• Search for arbitrary data
•Validate against a schema
•Manipulate existing documents
• Update content
• Remove all comments
• Scan for prohibited text
•Generate new documents
• Combine parts from library
• Generate slides, etc.
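As an illustration of the "Scan for prohibited text" use case, here is a minimal sketch that reads a Word document without Office installed. It assumes the main document part lives at the conventional /word/document.xml location (a robust version would locate it through the package relationships); the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Xml.Linq;

static class DocumentScanner
{
    // WordprocessingML main namespace.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns true if any paragraph contains the given text (case-insensitive).
    public static bool ContainsText(string path, string prohibited)
    {
        using (Package pkg = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            // Assumption: the main part is at the location Word normally uses.
            Uri mainUri = new Uri("/word/document.xml", UriKind.Relative);
            if (!pkg.PartExists(mainUri)) return false;

            using (Stream s = pkg.GetPart(mainUri).GetStream(FileMode.Open, FileAccess.Read))
            {
                XDocument doc = XDocument.Load(s, LoadOptions.PreserveWhitespace);

                // Text is split across runs, so join the <w:t> values per paragraph.
                return doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
                    .Any(text => text.IndexOf(prohibited, StringComparison.OrdinalIgnoreCase) >= 0);
            }
        }
    }
}
```

Joining the runs per paragraph matters because Word often splits a single phrase across several runs; a search over individual text elements would miss it.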
5. Parts and Relationships
_rels/.rels
<Relationship Id="unique_id" Type="relationshipType"
Target="targetPart" />
Id
Any string that is unique in the .rels file
Type
A URI that identifies the kind of relationship, for example:
http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles
Target
The location of the target of the relationship (a document part)
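To make the three attributes concrete, the following sketch lists the package-level relationships (the contents of _rels/.rels) using System.IO.Packaging. The class and method names are hypothetical; the calls themselves are standard members of the Package and PackUriHelper classes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging;

static class RelationshipLister
{
    // Returns one line per package-level relationship: Id, Type, and resolved Target.
    public static List<string> Describe(string path)
    {
        var lines = new List<string>();
        using (Package pkg = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            foreach (PackageRelationship rel in pkg.GetRelationships())
            {
                // Internal targets are relative to the source; external ones are used as-is.
                Uri target = rel.TargetMode == TargetMode.Internal
                    ? PackUriHelper.ResolvePartUri(rel.SourceUri, rel.TargetUri)
                    : rel.TargetUri;

                lines.Add($"Id={rel.Id} Type={rel.RelationshipType} Target={target}");
            }
        }
        return lines;
    }
}
```

Each part can also carry its own relationships; calling GetRelationships() on a PackagePart instead of the Package walks those in the same way.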
6. Open Packaging API
System.IO.Packaging
Advantages
– Applies to all Office applications
Disadvantages
– LOTS of steps involved to create real documents
Package pkg = Package.Open("MyDocument.docx", FileMode.Create);
…LOTS OF CODE…
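For a sense of the steps involved, here is a minimal sketch (not the presenter's code) that writes a one-paragraph Word document with the raw packaging API. The class name, method name, and sample text are hypothetical; a real document would also need styles, settings, and other parts, which is where the extra code comes from.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Text;

static class RawPackagingDemo
{
    const string MainContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // The markup has to be written by hand; the packaging API knows nothing about it.
    const string MainXml =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>" +
        "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
        "<w:body><w:p><w:r><w:t>Hello, Open XML</w:t></w:r></w:p></w:body></w:document>";

    public static void CreateMinimalDocx(string path)
    {
        using (Package pkg = Package.Open(path, FileMode.Create))
        {
            // 1. Create the main document part with the right content type.
            Uri partUri = PackUriHelper.CreatePartUri(new Uri("word/document.xml", UriKind.Relative));
            PackagePart part = pkg.CreatePart(partUri, MainContentType);

            // 2. Write the WordprocessingML into the part (UTF-8, no byte-order mark).
            using (Stream s = part.GetStream(FileMode.Create, FileAccess.Write))
            {
                byte[] bytes = new UTF8Encoding(false).GetBytes(MainXml);
                s.Write(bytes, 0, bytes.Length);
            }

            // 3. Add the package-level relationship that marks it as the main document.
            pkg.CreateRelationship(partUri, TargetMode.Internal, OfficeDocumentRelType);
        }
    }
}
```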
8. Building Open XML Solutions
Observations
•The markup languages are unique to each document type.
•Must anticipate the need to repurpose data across all types.
•The rendering mechanisms are unique to each type.
Requirements
•Define a consistent architecture
•Enable declarative templating
•Enable code reuse
Available Tools
•Open Packaging API
•Microsoft Open XML SDK
•Open XML SDK Productivity Tool
9. Solution Strategies
Use the Open Packaging API
Advantages
– Can use a code generation tool to jumpstart the project
Disadvantages
– Generated code is difficult to understand and modify
– Must generate the entire document
– Relatively slow
Use the Microsoft Open XML SDK
Advantages
– Code is easier to work with
– SDK provides wrappers
Disadvantages
– Steep learning curve
– Still slow
Use XSLT
Advantages
– Very Fast
Disadvantages
– Requires XSLT Skills
10. Microsoft Open XML SDK
Built on top of System.IO.Packaging
Easier than the raw packaging API
Wrapper classes for all Office doc types
Wrapper classes for all known doc parts
Markup              Class
WordprocessingML    WordprocessingDocument
SpreadsheetML       SpreadsheetDocument
PresentationML      PresentationDocument
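A minimal sketch of the SDK wrappers in use, assuming the Open XML SDK (the DocumentFormat.OpenXml assembly) is referenced. It creates a one-paragraph Word document; the class and method names are hypothetical.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class SdkDemo
{
    public static void CreateDocx(string path, string text)
    {
        // WordprocessingDocument wraps the package; parts and relationships are handled for us.
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(path, WordprocessingDocumentType.Document))
        {
            MainDocumentPart main = doc.AddMainDocumentPart();

            // Strongly typed element classes replace hand-written WordprocessingML.
            main.Document = new Document(
                new Body(
                    new Paragraph(
                        new Run(
                            new Text(text)))));
            main.Document.Save();
        }
    }
}
```

Content types, part URIs, and relationship types are all filled in by the wrapper classes, so the code deals only with document structure.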
12. Open XML Productivity Tool
Part of the Open XML SDK
Easy to see the structure of a document
Great for figuring out the correct elements and attributes
Powerful code generation built in
– Useful for one-off projects
14. Using XSLT
Basic Idea:
•“Flatten” the Open XML document to a non-hierarchical XML data stream.
•Transform the XML using an XSLT style sheet.
•“Un-flatten” the XML back to Open XML.
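The first two steps might be sketched as follows. This assumes that "flatten" means something like the Flat OPC format, where every part becomes a pkg:part element inside a single XML document; that reading, the class and method names, and the style sheet path are assumptions. The sketch is simplified (real Flat OPC also breaks the base64 data into lines) and leaves out the un-flatten step, which reverses Flatten by writing each pkg:part back into a new package.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

static class FlattenAndTransform
{
    static readonly XNamespace Pkg = "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Step 1: copy every part of the package into one <pkg:package> document.
    public static XDocument Flatten(string path)
    {
        var root = new XElement(Pkg + "package",
            new XAttribute(XNamespace.Xmlns + "pkg", Pkg.NamespaceName));

        using (Package package = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            foreach (PackagePart part in package.GetParts())
            {
                var partElement = new XElement(Pkg + "part",
                    new XAttribute(Pkg + "name", part.Uri.ToString()),
                    new XAttribute(Pkg + "contentType", part.ContentType));

                using (Stream s = part.GetStream(FileMode.Open, FileAccess.Read))
                {
                    if (part.ContentType.EndsWith("xml", StringComparison.OrdinalIgnoreCase))
                    {
                        // XML parts are embedded as markup.
                        partElement.Add(new XElement(Pkg + "xmlData", XDocument.Load(s).Root));
                    }
                    else
                    {
                        // Everything else (images, etc.) is embedded as base64 text.
                        using (var buffer = new MemoryStream())
                        {
                            s.CopyTo(buffer);
                            partElement.Add(new XElement(Pkg + "binaryData",
                                Convert.ToBase64String(buffer.ToArray())));
                        }
                    }
                }
                root.Add(partElement);
            }
        }
        return new XDocument(root);
    }

    // Step 2: run the flattened document through an XSLT style sheet.
    public static XDocument Transform(XDocument flat, string xsltPath)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(xsltPath);

        var result = new XDocument();
        using (XmlReader reader = flat.CreateReader())
        using (XmlWriter writer = result.CreateWriter())
        {
            xslt.Transform(reader, writer);
        }
        return result;
    }
}
```

Because the style sheet sees the whole document as one XML tree, a single template can touch content, styles, and relationships together.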
16. Additional Resources
Microsoft Office Open XML SDK 2.5
Open XML SDK 2.5 Productivity Tool
OpenXML Developer
www.OpenXmlDeveloper.org
Eric White’s Blog (formerly at Microsoft)
www.EricWhite.com