SlideShare une entreprise Scribd logo
1  sur  39
Metadata Extraction and Content Transformations 2 Nick Burch Software Engineer, Alfresco twitter: @gagravarr
Introduction – 3 Content Related Services 3 Covering Uses Interfaces Calling the Services Java & JavaScript APIs Demos Extensions Apache Tika Metadata Extractor Content Transformer Renditions
The Metadata Extractor Service 4 What, How, Why? ,[object Object]
 Document Metadata is converted into the content model
 Typically used with uploaded binary files
 Upload a PDF, extract out the Title and Description, save these as the properties on the Alfresco Node
 Powered internally by a number of different extractors
 Service picks the appropriate extractor for you
 Since Alfresco 3.4, makes heavy use of Apache Tika,[object Object]
 Driven by mime types, source and destination
 Used to generate plain text versions for indexing
 Used to generate SWF versions for preview
 Used to generate PDF versions for web download
 Powered by a large number of different transformers
 Transformers can be linked together, eg .doc -> .pdf via Open Office, then .pdf -> .swf via pdf2swf
 Since Alfresco 3.4, makes heavy use of Apache Tika,[object Object]
 Or can just alter some content as-is
 Used to manipulate images, eg crop and resize
 Used to generate HTML .docx previews in Web Quick Start
 Often uses the Content Transformation Service
 Replaced the Thumbnail Service
 Renditions are actions,[object Object]
 Grew out of the Lucene community, now widely used
 Provides detection of files – eg this binary blob is really a word file
 Plain text, HTML and XHTML versions of a wide range of different file formats
 Consistent Metadata from different files
Tika hides the complexity of the different formats, and presents a simple, powerful API
 Easy to use and extend,[object Object]
Alfresco 3.3 - Supported Formats 9 File Formats supported out of the box ,[object Object]
 Word, PowerPoint, Excel
 HTML
 Open Document Formats (OpenOffice)
 RFC822 Email
 Outlook .msg Email,[object Object]
 DWG (CAD)
Epub
 RSS and ATOM Feeds
 True Type Fonts
 HTML

Contenu connexe

Tendances

Moving Gigantic Files Into and Out of the Alfresco Repository
Moving Gigantic Files Into and Out of the Alfresco RepositoryMoving Gigantic Files Into and Out of the Alfresco Repository
Moving Gigantic Files Into and Out of the Alfresco RepositoryJeff Potts
 
Bulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoBulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoRichard McKnight
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus OverviewBrian Brazil
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and ThenAngel Borroy López
 
Intro to the Alfresco Public API
Intro to the Alfresco Public APIIntro to the Alfresco Public API
Intro to the Alfresco Public APIJeff Potts
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeMartin Schütte
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes ArchitectureKnoldus Inc.
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 
Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMartin Etmajer
 
Configuration management with puppet
Configuration management with puppetConfiguration management with puppet
Configuration management with puppetJakub Stransky
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practicesconfluent
 
OpenShift Meetup - Tokyo - Service Mesh and Serverless Overview
OpenShift Meetup - Tokyo - Service Mesh and Serverless OverviewOpenShift Meetup - Tokyo - Service Mesh and Serverless Overview
OpenShift Meetup - Tokyo - Service Mesh and Serverless OverviewMaría Angélica Bracho
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Animesh Singh
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELKGeert Pante
 

Tendances (20)

Moving Gigantic Files Into and Out of the Alfresco Repository
Moving Gigantic Files Into and Out of the Alfresco RepositoryMoving Gigantic Files Into and Out of the Alfresco Repository
Moving Gigantic Files Into and Out of the Alfresco Repository
 
Bulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoBulk Export Tool for Alfresco
Bulk Export Tool for Alfresco
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and Then
 
Intro to the Alfresco Public API
Intro to the Alfresco Public APIIntro to the Alfresco Public API
Intro to the Alfresco Public API
 
Webscripts
WebscriptsWebscripts
Webscripts
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as Code
 
Alfresco tuning part1
Alfresco tuning part1Alfresco tuning part1
Alfresco tuning part1
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes Architecture
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Alfresco tuning part1
Alfresco tuning part1Alfresco tuning part1
Alfresco tuning part1
 
Monitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on KubernetesMonitoring, Logging and Tracing on Kubernetes
Monitoring, Logging and Tracing on Kubernetes
 
Configuration management with puppet
Configuration management with puppetConfiguration management with puppet
Configuration management with puppet
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practices
 
Alfresco tuning part2
Alfresco tuning part2Alfresco tuning part2
Alfresco tuning part2
 
OpenShift Meetup - Tokyo - Service Mesh and Serverless Overview
OpenShift Meetup - Tokyo - Service Mesh and Serverless OverviewOpenShift Meetup - Tokyo - Service Mesh and Serverless Overview
OpenShift Meetup - Tokyo - Service Mesh and Serverless Overview
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
 

En vedette

Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsJulien Nioche
 
Open source enterprise search and retrieval platform
Open source enterprise search and retrieval platformOpen source enterprise search and retrieval platform
Open source enterprise search and retrieval platformmteutelink
 
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...hannonhill
 
Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01David Smiley
 
Apache Tika end-to-end
Apache Tika end-to-endApache Tika end-to-end
Apache Tika end-to-endgagravarr
 
Content Analysis with Apache Tika
Content Analysis with Apache TikaContent Analysis with Apache Tika
Content Analysis with Apache TikaPaolo Mottadelli
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friendslucenerevolution
 
Search Engine Capabilities - Apache Solr(Lucene)
Search Engine Capabilities - Apache Solr(Lucene)Search Engine Capabilities - Apache Solr(Lucene)
Search Engine Capabilities - Apache Solr(Lucene)Manish kumar
 
Web Crawling with Apache Nutch
Web Crawling with Apache NutchWeb Crawling with Apache Nutch
Web Crawling with Apache Nutchsebastian_nagel
 
An introduction to Storm Crawler
An introduction to Storm CrawlerAn introduction to Storm Crawler
An introduction to Storm CrawlerJulien Nioche
 
Actions rules and workflow in alfresco
Actions rules and workflow in alfrescoActions rules and workflow in alfresco
Actions rules and workflow in alfrescoAlfresco Software
 
Large scale crawling with Apache Nutch
Large scale crawling with Apache NutchLarge scale crawling with Apache Nutch
Large scale crawling with Apache NutchJulien Nioche
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)dnaber
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrAndy Jackson
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrLucidworks (Archived)
 
Alfresco In An Hour - Document Management, Web Content Management, and Collab...
Alfresco In An Hour - Document Management, Web Content Management, and Collab...Alfresco In An Hour - Document Management, Web Content Management, and Collab...
Alfresco In An Hour - Document Management, Web Content Management, and Collab...Alfresco Software
 

En vedette (20)

Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
 
Open source enterprise search and retrieval platform
Open source enterprise search and retrieval platformOpen source enterprise search and retrieval platform
Open source enterprise search and retrieval platform
 
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...
Faster! Optimize Your Cascade Server Experience, by Justin Klingman, Beacon T...
 
Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01
 
Apache Tika end-to-end
Apache Tika end-to-endApache Tika end-to-end
Apache Tika end-to-end
 
Content Analysis with Apache Tika
Content Analysis with Apache TikaContent Analysis with Apache Tika
Content Analysis with Apache Tika
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
 
ProjectHub
ProjectHubProjectHub
ProjectHub
 
Search Engine Capabilities - Apache Solr(Lucene)
Search Engine Capabilities - Apache Solr(Lucene)Search Engine Capabilities - Apache Solr(Lucene)
Search Engine Capabilities - Apache Solr(Lucene)
 
Web Crawling with Apache Nutch
Web Crawling with Apache NutchWeb Crawling with Apache Nutch
Web Crawling with Apache Nutch
 
Search engine
Search engineSearch engine
Search engine
 
An introduction to Storm Crawler
An introduction to Storm CrawlerAn introduction to Storm Crawler
An introduction to Storm Crawler
 
Actions rules and workflow in alfresco
Actions rules and workflow in alfrescoActions rules and workflow in alfresco
Actions rules and workflow in alfresco
 
Alfresco content model
Alfresco content modelAlfresco content model
Alfresco content model
 
Large scale crawling with Apache Nutch
Large scale crawling with Apache NutchLarge scale crawling with Apache Nutch
Large scale crawling with Apache Nutch
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with Solr
 
Alfresco In An Hour - Document Management, Web Content Management, and Collab...
Alfresco In An Hour - Document Management, Web Content Management, and Collab...Alfresco In An Hour - Document Management, Web Content Management, and Collab...
Alfresco In An Hour - Document Management, Web Content Management, and Collab...
 

Similaire à Metadata Extraction and Content Transformation

Content analysis for ECM with Apache Tika
Content analysis for ECM with Apache TikaContent analysis for ECM with Apache Tika
Content analysis for ECM with Apache TikaPaolo Mottadelli
 
Understanding information content with apache tika
Understanding information content with apache tikaUnderstanding information content with apache tika
Understanding information content with apache tikaSutthipong Kuruhongsa
 
Understanding information content with apache tika
Understanding information content with apache tikaUnderstanding information content with apache tika
Understanding information content with apache tikaSutthipong Kuruhongsa
 
Hatkit Project - Datafiddler
Hatkit Project - DatafiddlerHatkit Project - Datafiddler
Hatkit Project - Datafiddlerholiman
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinPietro Michiardi
 
Mime Magic With Apache Tika
Mime Magic With Apache TikaMime Magic With Apache Tika
Mime Magic With Apache TikaJukka Zitting
 
CustomizingStyleSheetsForHTMLOutputs
CustomizingStyleSheetsForHTMLOutputsCustomizingStyleSheetsForHTMLOutputs
CustomizingStyleSheetsForHTMLOutputsSuite Solutions
 
Lecture14Slides.ppt
Lecture14Slides.pptLecture14Slides.ppt
Lecture14Slides.pptVideoguy
 
Developing web apps using Erlang-Web
Developing web apps using Erlang-WebDeveloping web apps using Erlang-Web
Developing web apps using Erlang-Webfanqstefan
 
Twig internals - Maksym MoskvychevTwig internals maksym moskvychev
Twig internals - Maksym MoskvychevTwig internals   maksym moskvychevTwig internals - Maksym MoskvychevTwig internals   maksym moskvychev
Twig internals - Maksym MoskvychevTwig internals maksym moskvychevDrupalCampDN
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R StudioRupak Roy
 
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic Communication
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic CommunicationIQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic Communication
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic CommunicationTed Leung
 
Rspec API Documentation
Rspec API DocumentationRspec API Documentation
Rspec API DocumentationSmartLogic
 

Similaire à Metadata Extraction and Content Transformation (20)

Content analysis for ECM with Apache Tika
Content analysis for ECM with Apache TikaContent analysis for ECM with Apache Tika
Content analysis for ECM with Apache Tika
 
Understanding information content with apache tika
Understanding information content with apache tikaUnderstanding information content with apache tika
Understanding information content with apache tika
 
Understanding information content with apache tika
Understanding information content with apache tikaUnderstanding information content with apache tika
Understanding information content with apache tika
 
Hatkit Project - Datafiddler
Hatkit Project - DatafiddlerHatkit Project - Datafiddler
Hatkit Project - Datafiddler
 
Power tools in Java
Power tools in JavaPower tools in Java
Power tools in Java
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
 
Tibco business works
Tibco business worksTibco business works
Tibco business works
 
Mime Magic With Apache Tika
Mime Magic With Apache TikaMime Magic With Apache Tika
Mime Magic With Apache Tika
 
CustomizingStyleSheetsForHTMLOutputs
CustomizingStyleSheetsForHTMLOutputsCustomizingStyleSheetsForHTMLOutputs
CustomizingStyleSheetsForHTMLOutputs
 
Ajax
AjaxAjax
Ajax
 
Lecture14Slides.ppt
Lecture14Slides.pptLecture14Slides.ppt
Lecture14Slides.ppt
 
Developing web apps using Erlang-Web
Developing web apps using Erlang-WebDeveloping web apps using Erlang-Web
Developing web apps using Erlang-Web
 
SCDJWS 6. REST JAX-P
SCDJWS 6. REST  JAX-PSCDJWS 6. REST  JAX-P
SCDJWS 6. REST JAX-P
 
5 xml parsing
5   xml parsing5   xml parsing
5 xml parsing
 
Apache tika
Apache tikaApache tika
Apache tika
 
Twig internals - Maksym MoskvychevTwig internals maksym moskvychev
Twig internals - Maksym MoskvychevTwig internals   maksym moskvychevTwig internals - Maksym MoskvychevTwig internals   maksym moskvychev
Twig internals - Maksym MoskvychevTwig internals maksym moskvychev
 
Processing XML with Java
Processing XML with JavaProcessing XML with Java
Processing XML with Java
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R Studio
 
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic Communication
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic CommunicationIQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic Communication
IQPC Canada XML 2001: How to Use XML Parsing to Enhance Electronic Communication
 
Rspec API Documentation
Rspec API DocumentationRspec API Documentation
Rspec API Documentation
 

Plus de Alfresco Software

Alfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Software
 
Alfresco Day Benelux Hogeschool Inholland Records Management application
Alfresco Day Benelux Hogeschool Inholland Records Management applicationAlfresco Day Benelux Hogeschool Inholland Records Management application
Alfresco Day Benelux Hogeschool Inholland Records Management applicationAlfresco Software
 
Alfresco Day BeNelux: Customer Success Showcase - Saxion Hogescholen
Alfresco Day BeNelux: Customer Success Showcase - Saxion HogescholenAlfresco Day BeNelux: Customer Success Showcase - Saxion Hogescholen
Alfresco Day BeNelux: Customer Success Showcase - Saxion HogescholenAlfresco Software
 
Alfresco Day BeNelux: Customer Success Showcase - Gemeente Amsterdam
Alfresco Day BeNelux: Customer Success Showcase - Gemeente AmsterdamAlfresco Day BeNelux: Customer Success Showcase - Gemeente Amsterdam
Alfresco Day BeNelux: Customer Success Showcase - Gemeente AmsterdamAlfresco Software
 
Alfresco Day BeNelux: The success of Alfresco
Alfresco Day BeNelux: The success of AlfrescoAlfresco Day BeNelux: The success of Alfresco
Alfresco Day BeNelux: The success of AlfrescoAlfresco Software
 
Alfresco Day BeNelux: Customer Success Showcase - Credendo Group
Alfresco Day BeNelux: Customer Success Showcase - Credendo GroupAlfresco Day BeNelux: Customer Success Showcase - Credendo Group
Alfresco Day BeNelux: Customer Success Showcase - Credendo GroupAlfresco Software
 
Alfresco Day BeNelux: Digital Transformation - It's All About Flow
Alfresco Day BeNelux: Digital Transformation - It's All About FlowAlfresco Day BeNelux: Digital Transformation - It's All About Flow
Alfresco Day BeNelux: Digital Transformation - It's All About FlowAlfresco Software
 
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...Alfresco Software
 
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...Alfresco Software
 
Alfresco Day Vienna 2016: Alfrescos neue Rest API
Alfresco Day Vienna 2016: Alfrescos neue Rest APIAlfresco Day Vienna 2016: Alfrescos neue Rest API
Alfresco Day Vienna 2016: Alfrescos neue Rest APIAlfresco Software
 
Alfresco Day Vienna 2016: Support Tools für die Admin-Konsole
Alfresco Day Vienna 2016: Support Tools für die Admin-KonsoleAlfresco Day Vienna 2016: Support Tools für die Admin-Konsole
Alfresco Day Vienna 2016: Support Tools für die Admin-KonsoleAlfresco Software
 
Alfresco Day Vienna 2016: Entwickeln mit Alfresco
Alfresco Day Vienna 2016: Entwickeln mit AlfrescoAlfresco Day Vienna 2016: Entwickeln mit Alfresco
Alfresco Day Vienna 2016: Entwickeln mit AlfrescoAlfresco Software
 
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...Alfresco Software
 
Alfresco Day Vienna 2016: Partner Lightning Talk: Westernacher
Alfresco Day Vienna 2016: Partner Lightning Talk: WesternacherAlfresco Day Vienna 2016: Partner Lightning Talk: Westernacher
Alfresco Day Vienna 2016: Partner Lightning Talk: WesternacherAlfresco Software
 
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...Alfresco Software
 
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novum
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novumAlfresco Day Vienna 2016: Partner Lightning Talk - it-novum
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novumAlfresco Software
 
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...Alfresco Software
 
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...Alfresco Software
 
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - Safran
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - SafranAlfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - Safran
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - SafranAlfresco Software
 
Alfresco Day Warsaw 2016: Advancing the Flow of Digital Business
Alfresco Day Warsaw 2016: Advancing the Flow of Digital BusinessAlfresco Day Warsaw 2016: Advancing the Flow of Digital Business
Alfresco Day Warsaw 2016: Advancing the Flow of Digital BusinessAlfresco Software
 

Plus de Alfresco Software (20)

Alfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossier
 
Alfresco Day Benelux Hogeschool Inholland Records Management application
Alfresco Day Benelux Hogeschool Inholland Records Management applicationAlfresco Day Benelux Hogeschool Inholland Records Management application
Alfresco Day Benelux Hogeschool Inholland Records Management application
 
Alfresco Day BeNelux: Customer Success Showcase - Saxion Hogescholen
Alfresco Day BeNelux: Customer Success Showcase - Saxion HogescholenAlfresco Day BeNelux: Customer Success Showcase - Saxion Hogescholen
Alfresco Day BeNelux: Customer Success Showcase - Saxion Hogescholen
 
Alfresco Day BeNelux: Customer Success Showcase - Gemeente Amsterdam
Alfresco Day BeNelux: Customer Success Showcase - Gemeente AmsterdamAlfresco Day BeNelux: Customer Success Showcase - Gemeente Amsterdam
Alfresco Day BeNelux: Customer Success Showcase - Gemeente Amsterdam
 
Alfresco Day BeNelux: The success of Alfresco
Alfresco Day BeNelux: The success of AlfrescoAlfresco Day BeNelux: The success of Alfresco
Alfresco Day BeNelux: The success of Alfresco
 
Alfresco Day BeNelux: Customer Success Showcase - Credendo Group
Alfresco Day BeNelux: Customer Success Showcase - Credendo GroupAlfresco Day BeNelux: Customer Success Showcase - Credendo Group
Alfresco Day BeNelux: Customer Success Showcase - Credendo Group
 
Alfresco Day BeNelux: Digital Transformation - It's All About Flow
Alfresco Day BeNelux: Digital Transformation - It's All About FlowAlfresco Day BeNelux: Digital Transformation - It's All About Flow
Alfresco Day BeNelux: Digital Transformation - It's All About Flow
 
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...
Alfresco Day Vienna 2016: Activiti – ein Katalysator für die DMS-Strategie be...
 
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...
Alfresco Day Vienna 2016: Elektronische Geschäftsprozesse auf Basis von Alfre...
 
Alfresco Day Vienna 2016: Alfrescos neue Rest API
Alfresco Day Vienna 2016: Alfrescos neue Rest APIAlfresco Day Vienna 2016: Alfrescos neue Rest API
Alfresco Day Vienna 2016: Alfrescos neue Rest API
 
Alfresco Day Vienna 2016: Support Tools für die Admin-Konsole
Alfresco Day Vienna 2016: Support Tools für die Admin-KonsoleAlfresco Day Vienna 2016: Support Tools für die Admin-Konsole
Alfresco Day Vienna 2016: Support Tools für die Admin-Konsole
 
Alfresco Day Vienna 2016: Entwickeln mit Alfresco
Alfresco Day Vienna 2016: Entwickeln mit AlfrescoAlfresco Day Vienna 2016: Entwickeln mit Alfresco
Alfresco Day Vienna 2016: Entwickeln mit Alfresco
 
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...
Alfresco Day Vienna 2016: Activiti goes enterprise: Die Evolution der BPM Sui...
 
Alfresco Day Vienna 2016: Partner Lightning Talk: Westernacher
Alfresco Day Vienna 2016: Partner Lightning Talk: WesternacherAlfresco Day Vienna 2016: Partner Lightning Talk: Westernacher
Alfresco Day Vienna 2016: Partner Lightning Talk: Westernacher
 
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...
Alfresco Day Vienna 2016: Bringing Content & Process together with the App De...
 
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novum
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novumAlfresco Day Vienna 2016: Partner Lightning Talk - it-novum
Alfresco Day Vienna 2016: Partner Lightning Talk - it-novum
 
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...
Alfresco Day Vienna 2016: How to Achieve Digital Flow in the Enterprise - Joh...
 
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...
Alfresco Day Warsaw 2016 - Czy możliwe jest spełnienie wszystkich regulacji p...
 
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - Safran
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - SafranAlfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - Safran
Alfresco Day Warsaw 2016: Identyfikacja i podpiselektroniczny - Safran
 
Alfresco Day Warsaw 2016: Advancing the Flow of Digital Business
Alfresco Day Warsaw 2016: Advancing the Flow of Digital BusinessAlfresco Day Warsaw 2016: Advancing the Flow of Digital Business
Alfresco Day Warsaw 2016: Advancing the Flow of Digital Business
 

Dernier

Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsLeah Henrickson
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераMark Opanasiuk
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfSrushith Repakula
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...panagenda
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...ScyllaDB
 
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfUK Journal
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Skynet Technologies
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfFIDO Alliance
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...marcuskenyatta275
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...FIDO Alliance
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Hiroshi SHIBATA
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 

Dernier (20)

Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 

Metadata Extraction and Content Transformation

  • 1. Metadata Extraction and Content Transformations 2 Nick Burch Software Engineer, Alfresco twitter: @gagravarr
  • 2. Introduction – 3 Content Related Services 3 Covering Uses Interfaces Calling the Services Java & JavaScript APIs Demos Extensions Apache Tika Metadata Extractor Content Transformer Renditions
  • 3.
  • 4. Document Metadata is converted into the content model
  • 5. Typically used with uploaded binary files
  • 6. Upload a PDF, extract out the Title and Description, save these as the properties on the Alfresco Node
  • 7. Powered internally by a number of different extractors
  • 8. Service picks the appropriate extractor for you
  • 9.
  • 10. Driven by mime types, source and destination
  • 11. Used to generate plain text versions for indexing
  • 12. Used to generate SWF versions for preview
  • 13. Used to generate PDF versions for web download
  • 14. Powered by a large number of different transformers
  • 15. Transformers can be linked together, eg .doc -> .pdf via Open Office, then .pdf -> .swf via pdf2swf
  • 16.
  • 17. Or can just alter some content as-is
  • 18. Used to manipulate images, eg crop and resize
  • 19. Used to generate HTML .docx previews in Web Quick Start
  • 20. Often uses the Content Transformation Service
  • 21. Replaced the Thumbnail Service
  • 22.
  • 23. Grew out of the Lucene community, now widely used
  • 24. Provides detection of files – eg this binary blob is really a word file
  • 25. Plain text, HTML and XHTML versions of a wide range of different file formats
  • 26. Consistent Metadata from different files
  • 27. Tika hides the complexity of the different formats, and presents a simple, powerful API
  • 28.
  • 29.
  • 32. Open Document Formats (OpenOffice)
  • 34.
  • 36. Epub
  • 37. RSS and ATOM Feeds
  • 38. True Type Fonts
  • 40. Images – JPEG, GIF, PNG, TIFF, Bitmap (including EXIF where found)
  • 42.
  • 43. Microsoft Office (Binary) – Word, PowerPoint, Excel, Visio, Publisher, Works
  • 44. Microsoft Office (OOXML) – Word, PowerPoint, Excel
  • 45. MP3 (id3 v1 and v2)
  • 47. Open Document Format (Open Office)
  • 48. Old-style Open Office (.sxw etc)
  • 49.
  • 54. Java class filesAnd I probably forgot one...!
  • 55. Calling Apache Tika 13 // Get a content detector, and an auto-selecting Parser TikaConfigconfig = TikaConfig.getDefaultConfig(); ContainerAwareDetector detector = new ContainerAwareDetector( config.getMimeRepository() ); Parser parser = new AutoDetectParser(detector); // We’ll only want the plain text contents ContentHandler handler = new BodyContentHandler(); // Tell the parser what we have Metadata metadata = new Metadata(); metadata.set(Metadata.RESOURCE_NAME_KEY, filename); // Have it processed parser.parse(input, handler, metadata, new ParseContext());
  • 56. Metadata Extractor – Java Use 14 MetadataExtractorRegistry registry = (MetadataExtractorRegistry)context.getBean(“metadataExtracterRegistry”); MetadataExtracter extractor = registry.getExtracter(“application/vnd.ms-excel”); Map<QName, Serializable> properties = new HashMap<QName, Serializable>(); ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT); extractor.extract(reader, properties); System.err.println(properties);
  • 57. Metadata Extractor – JavaScript Use 15 JavaScript var action = actions.create("extract-metadata"); action.execute(document); Full access is not directly available You can’t get at the raw properties You can, however, trigger extraction and saving to the node easily Available via an action
  • 58. Metadata Extractor – Geo Content Model 16 <aspect name="cm:geographic"> <title>Geographic</title> <properties> <property name="cm:latitude"> <title>Latitude</title> <type>d:double</type> </property> <property name="cm:longitude"> <title>Longitude</title> <type>d:double</type> </property> </properties> </aspect>
  • 59. Metadata Extractor – Geo Mapping 17 # Namespaces namespace.prefix.cm=http://www.alfresco.org/model/content/1.0 # Geo Mappings geolat=cm:latitude geolong=cm:longitude # Normal Mappings author=cm:author title=cm:title description=cm:description created=cm:created
  • 60. Demo:Geo Tagged Image in Share 18
  • 62.
  • 63. PDF to Image
  • 64. PDF to SWF (for preview)
  • 65. Office File Formats to PDF (via Open Office, using JODConverter in Enterprise)
  • 66. Plain Text and XML to PDF
  • 67. Zip listing to Text
  • 68.
  • 69. Content Transformer – Java Use 22 ContentTransformerRegistry registry = (ContentTransformerRegistry)context.getBean(“contentTransformerRegistry”); ContentTransformer transformer = registry.getTransformer(“application/vnd.ms-excel”,”text/csv”, new TransformationOptions()); ContentReader reader = contentService.getReader(sourceNodeRef, ContentModel.PROP_CONTENT); ContentWriter writer = contentService.getReader(destNodeRef, ContentModel.PROP_CONTENT); transformer.transform(reader, writer);
  • 70. Content Transformer – JavaScript Use 23 JavaScript var action = actions.create("transform"); // Transform into the same folder action.parameters["destination-folder"] = document.parent; action.parameters["assoc-type"] = "{http://www.alfresco.org/model/content/1.0}contains"; action.parameters["assoc-name"] = document.name + "transformed"; action.parameters["mime-type"] = "text/html"; // Execute action.execute(document); Full access is not directly available You can’t control which property is transformed, it’s always Content You can control where the transformed version goes Triggering the transformation is easier than in Java Available via an action
  • 71. Custom Tika Parsers - Interface 24 Interface public interface Parser { Set<MediaType> getSupportedTypes(ParseContext context); void parse( InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException; } The Tika Parser interface is quite simple Need to provide a list of supported mime types, so that auto-detection can work Accept an input stream, populate the Metadata object, and fire SAX events to the supplied handler That’s it!
  • 72. Custom Tika Parser – Hello World Parser 25 public class HelloWorldParser implements Parser { public Set<MediaType> getSupportedTypes(ParseContext context) { Set<MediaType> types = new HashSet<MediaType>(); types.add(MediaType.parse("hello/world")); return types; } public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws SAXException { XHTMLContentHandlerxhtml = new XHTMLContentHandler(handler, metadata); xhtml.startDocument(); xhtml.startElement("h1"); xhtml.characters("Hello, World!"); xhtml.endElement("h1"); xhtml.endDocument(); metadata.set("hello","world"); metadata.set("title","Hello World!"); } }
  • 73. Custom Command Line Transformer 26 <bean id="transformer.worker.helloWorldCMD" class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker"> <property name="mimetypeService“><ref bean="mimetypeService"/></property> <property name="transformCommand"> <bean class="org.alfresco.util.exec.RuntimeExec"> <property name="commandsAndArguments“><map> <entry key=".*“><list> <value>/bin/bash</value> <value>-c</value> <value>/bin/echo 'Hello World - ${source}' &gt; ${target}</value> </list></entry> </map></property> <property name="errorCodes“><value>1,127</value></property> </bean> </property <property name="explicitTransformations"> <list><bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails"> <property name="sourceMimetype“><value>text/plain</value></property> <property name="targetMimetype“><value>hello/world</value></property> </bean></list> </property> </bean> <bean id="transformer.helloWorldCMD" class="org.alfresco.repo.content.transform.ProxyContentTransformer" parent="baseContentTransformer"> <property name="worker"><ref bean="transformer.worker.helloWorldCMD"/></property> </bean>
  • 74.
  • 75. Use our Tikatransfomer to turn this back into plain text
  • 76.
  • 78.
  • 79. image – crop, resize, etc
  • 80. freemarker – runs a Freemarker Template against the content of the node
  • 81. html – turns .docx files into clean HTML + images
  • 82. xslt – runs a XSLT Transformation against the content of the node, XML content nodes only!
  • 83.
  • 84. Then set all the parameters against it
  • 85. Finally execute it against a node
  • 86. For very complicated / common renditions, you can save the definition to the data dictionary
  • 87. It can then be retrieved and run
  • 88.
  • 89. QNamerenditionName = QName.createQName( NamespaceService.CONTENT_MODEL_1_0_URI, "myRendDefn");
  • 91. // Make some changes.
  • 94. // Persist the changes.
  • 96. // Run the Rendition
  • 97.
  • 103.
  • 104. They won’t show up in Share when defining Rules, or in Explorer for running a Custom Action
  • 105. Solution – create a JS Script, or some custom Java
  • 106. Use this from your Rule / to run as an Action
  • 107. No dedicated REST API, but Renditions are available through CMIS
  • 108.
  • 109. This delivers lots of flexibility, and means anyone who can write Custom Actions already knows enough to write Custom Rendition Engines!
  • 111.
  • 113. Demo 3:Word .docx -> HTML & Images(Using Web Quick Start) 38
  • 115. Learn More 40 wiki.alfresco.com forums.alfresco.com blogs.alfresco.com/wp/nickb/ twitter: @AlfrescoECM @Gagravarr

Notes de l'éditeur

  1. Use devconf-files/geotagged.jpgAKA quickGEO.jpg from projects/repository/source/test-resources/quick/Upload it into ShareGo to the details page, and show the EXIF and GEO bits
  2. Create a script with the code shown, also available as devconf-js/contentTransformHellowWorld.jsCreate a simple .txt entry in explorerRun the script against itShow that we get a .bin with content type of hello/worldMaybe view the contents of thisRun the script againShow that we get a new plain text file, with the simple contents
  3. Use quick.xls or similar from projects/repository/source/test-resources/quick/Upload it into ExplorerRun a custom action which is a scriptvarnameBase = document.name.substring(0, document.name.lastIndexOf(&quot;.&quot;));var action = actions.create(&quot;transform&quot;);action.parameters[&quot;destination-folder&quot;] = document.parent;action.parameters[&quot;assoc-type&quot;] = &quot;{http://www.alfresco.org/model/content/1.0}contains&quot;;action.parameters[&quot;assoc-name&quot;] = nameBase + &quot;.txt&quot;;action.parameters[&quot;mime-type&quot;] = &quot;text/plain&quot;;action.execute(document);action.parameters[&quot;assoc-name&quot;] = nameBase + &quot;.csv&quot;;action.parameters[&quot;mime-type&quot;] = &quot;text/csv&quot;;action.execute(document);action.parameters[&quot;assoc-name&quot;] = nameBase + &quot;.html&quot;;action.parameters[&quot;mime-type&quot;] = &quot;text/xml&quot;;action.execute(document);
  4. In Share, create a rule on a folder to execute the script belowEnsure the folder that the script writes to isn’t a child!Upload an image, at least 600x400 big into the folderIn share, use the repo browser to get to the special folder, and show the smaller cropped versionvarrenditionDef = renditionService.createRenditionDefinition(&quot;Test&quot;, &quot;htmlRenderingEngine&quot;);renditionDef.parameters[&quot;destination-path-template&quot;] = &quot;/Company Home/Test/${name}.html&quot;;renditionDef.execute(document);renditionDef.parameters[&quot;destination-path-template&quot;] = &quot;/Company Home/Test/BodyOnly/${name}.html&quot;;renditionDef.parameters[&quot;bodyContentsOnly&quot;] = true;
  5. In share, create a Video folder, and a Video Drop subfolderOn the subfolder, create a ruleRule is copy + transformTarget mimetype is flash videoTarget directory is the parent (Videos)Don’t run in the backgroundUpload devconf-files/Video.mp4WaitHopefully show the thumbnail of the .mp4 (refresh if needed)Go to the videos directoryShow the large preview image of the new .flv image
  6. Upload a .docx (including images) to a Web Quick Start folder with .docx extraction enabled via ShareView in Share the HTML, to show what it looks likeFlip to the WCM site, and show it with imagesBACKUP – Create a script based on devconf-files/renderTest.jsIn share, create a folder with a rule of execute scriptPick the scriptUpload devconf-files/word3imgs.docxGo to Company Home/TestShow the htmlShow the images(You can’t show the two together without WCM, sorry....)