Microsoft SharePoint is the natural target for PDF storage, with 125 million Microsoft SharePoint Server licenses sold. An overview of PDF and SharePoint: PDF as a SharePoint "First Class Citizen".
PDF and Microsoft SharePoint: Hurdles to Overcome
1. PDF Association Technical Conference June 18-19 2013
PDF and Microsoft SharePoint
Hurdles to Overcome
Neil Pitman
Aquaforest Limited
Version 1.120613
3. Agenda
Objectives
SharePoint Overview
PDF Capture
PDF Search
iFilters
Handling Image and Mixed Mode PDFs
PDF Metadata
Dictionary, XMP and Entity Extraction
Configuration
SharePoint 2010, 2013
Summary
4. SharePoint Overview
Microsoft SharePoint Server - 125 million licenses sold
SharePoint to be a natural target for PDF storage
What is SharePoint?
On-Premise and Cloud-based Collaboration & Document Management Platform
Origin - 2001
Usage
Focus on MS Office Documents
Typically distributed capture
6. SharePoint Architecture Overview
MS Web-based (IIS)
MS Office Integration
SQL Server Storage
List or library data in a site collection is stored in a SQL Server database table, which uses queries, indexes and locks to maintain overall performance, sharing, and accuracy.
Filtered views with column indexes (and other operations) create database queries that identify a subset of columns and rows and return this subset to your computer.
Thresholds and limits help throttle operations and balance resources for many simultaneous users.
Privileged developers can use object model overrides to temporarily increase thresholds and limits for custom applications.
Administrators can specify dedicated time windows for all users to do unlimited operations during off-peak hours.
Information workers can use appropriate views, styles, and page limits to speed up the display of data on the page.
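The points above about filtered queries, page limits and object model overrides can be sketched with the server-side object model. This is a minimal sketch, not a definitive implementation: it assumes code running on a SharePoint 2010 or 2013 server with a reference to Microsoft.SharePoint.dll, the class and method names, site URL, library title and page size are hypothetical, and the override is only honoured when the web application allows it and the caller has sufficient privileges.

```csharp
using System;
using Microsoft.SharePoint;

public static class PdfLibraryQuery
{
    // Writes the name of every PDF in a library, one page of rows at a time.
    // siteUrl and libraryTitle are hypothetical values supplied by the caller.
    public static void ListPdfNames(string siteUrl, string libraryTitle)
    {
        using (SPSite site = new SPSite(siteUrl))
        using (SPWeb web = site.OpenWeb())
        {
            SPList library = web.Lists[libraryTitle];

            SPQuery query = new SPQuery();
            // Filter on the built-in "File Type" column so only PDFs come back.
            query.Query =
                "<Where><Eq><FieldRef Name='File_x0020_Type'/>" +
                "<Value Type='Text'>pdf</Value></Eq></Where>";
            // Return only the file name column, not every column in the list.
            query.ViewFields = "<FieldRef Name='FileLeafRef'/>";
            // Page size (an arbitrary illustrative value): each request
            // returns a bounded subset of rows.
            query.RowLimit = 500;
            // Request the object model override for this query.
            query.QueryThrottleMode = SPQueryThrottleOption.Override;

            do
            {
                SPListItemCollection page = library.GetItems(query);
                foreach (SPListItem item in page)
                {
                    Console.WriteLine(item[SPBuiltInFieldId.FileLeafRef]);
                }
                // Null once the last page has been read.
                query.ListItemCollectionPosition = page.ListItemCollectionPosition;
            } while (query.ListItemCollectionPosition != null);
        }
    }
}
```

Paging with RowLimit keeps each database query small, which is usually preferable to relying on the override for routine operations.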
Microsoft Technology Stack
Windows Server 2008/2012
Internet Information Services (IIS)
.Net Framework
SQL Server
MS Office
7. PDF Capture for SharePoint
Options
SharePoint UI
Acrobat XI
Load Tools
Custom Code
Workflow & Event Receivers
// Upload a PDF to a SharePoint document library with an HTTP PUT.
// Requires: using System.IO; using System.Net;
// destUrl is the full target URL of the file; fileBytes holds the PDF content.
WebRequest request = WebRequest.Create(destUrl);
request.Credentials = CredentialCache.DefaultCredentials;
request.Method = "PUT";
request.ContentLength = fileBytes.Length;
// The file is already in memory, so write it to the request stream in one call.
using (Stream stream = request.GetRequestStream())
{
    stream.Write(fileBytes, 0, fileBytes.Length);
}
// GetResponse throws a WebException if the server rejects the upload.
using (WebResponse response = request.GetResponse())
{
    Logging.Log("Upload successful");
}
17. Dealing with Image and Mixed-Mode PDFs
Objectives:
Ensure Full Searchability
Avoid Text to Image Processing
Process:
Capture Time?
Scheduled In-Place?
18. PDF Metadata in SharePoint
Text Search vs Metadata Search
Crawled vs Managed Properties
Review Requirements
Dictionary Metadata
XMP Metadata
Entity Extraction
Consider Automation
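As an illustration of automating XMP metadata extraction, the following is a minimal sketch that locates the XMP packet in the raw bytes of a PDF and reads the Dublin Core title from it. It rests on several assumptions: the metadata stream is stored uncompressed and unencrypted, the packet is wrapped in an x:xmpmeta element, and the last packet in the file is the one of interest. The class and method names are hypothetical, and a production tool would normally use a PDF library instead of scanning bytes.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class XmpSketch
{
    // Returns the last <x:xmpmeta> element found in the raw PDF bytes, or null.
    // Assumes the metadata stream is neither compressed nor encrypted.
    public static string FindXmpPacket(byte[] pdfBytes)
    {
        // ISO-8859-1 maps each byte to one char, so string offsets equal byte offsets.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        const string endTag = "</x:xmpmeta>";
        int start = raw.LastIndexOf("<x:xmpmeta", StringComparison.Ordinal);
        if (start < 0) return null;
        int end = raw.IndexOf(endTag, start, StringComparison.Ordinal);
        if (end < 0) return null;
        int length = end + endTag.Length - start;
        // Decode only the packet itself, assuming it is UTF-8 encoded.
        return Encoding.UTF8.GetString(pdfBytes, start, length);
    }

    // Returns the first dc:title value in the packet, or null if there is none.
    public static string ReadTitle(string xmpPacket)
    {
        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        XNamespace rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
        XElement title = XDocument.Parse(xmpPacket)
            .Descendants(dc + "title")
            .FirstOrDefault();
        if (title == null) return null;
        // dc:title is a language alternative: rdf:Alt containing rdf:li entries.
        XElement item = title.Descendants(rdf + "li").FirstOrDefault();
        return item != null ? item.Value : null;
    }
}
```

A value read this way could then be written to a library column at capture time, so that it is available to metadata search as well as text search.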
25. SharePoint PDF Configuration
Default for PDF: 'X-Download-Options: noopen' added to HTTP Response Header
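The noopen header is tied to the web application's browser file handling setting. One commonly used adjustment in SharePoint 2010 is to add the PDF MIME type to the web application's list of types allowed to open inline. The following is a minimal configuration sketch under stated assumptions: it must run on a farm server with farm administrator rights and a reference to Microsoft.SharePoint.dll, and the class name, method name and web application URL are hypothetical.

```csharp
using System;
using Microsoft.SharePoint.Administration;

public static class PdfInlineConfig
{
    // Allows PDFs to open in the browser for one web application.
    // webAppUrl is a hypothetical value supplied by the caller.
    public static void AllowInlinePdf(string webAppUrl)
    {
        SPWebApplication webApp = SPWebApplication.Lookup(new Uri(webAppUrl));
        if (!webApp.AllowedInlineDownloadedMimeTypes.Contains("application/pdf"))
        {
            webApp.AllowedInlineDownloadedMimeTypes.Add("application/pdf");
            // Persist the change to the farm configuration database.
            webApp.Update();
        }
    }
}
```

Adding a single MIME type is narrower than relaxing browser file handling for every file type, which is why it is often preferred.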
26. SharePoint 2013 and PDF Configuration
PDF Format Handler Support
Currently no iFilter Support for PDF !?!?!!
27. SharePoint 2013 and PDF Configuration
Inline Viewing PDF in SharePoint 2013
http://stevemannspath.blogspot.co.uk/2012/10/sharepoint-2013-pdf-preview-in-search.html
http://stevemannspath.blogspot.co.uk/2013/04/sharepoint-2013-pdf-support-and.html
28. Summary
Microsoft SharePoint Server - 125 million licenses sold
SharePoint to be a natural target for PDF storage
PDF as a SharePoint “First Class Citizen”
Contact: neil.pitman@aquaforest.com