Apache ManifoldCF
Alfresco WebScript Repository Connector


                             Alfresco Meetup
                                Rome 2013
About me
● Open Source ECM Specialist at Sourcesense

● Author and Technical Reviewer at Packt Publishing
   ○ Alfresco 3 Web Services (2010)
   ○ GateIn Cookbook (2012)

● Alfresco Community (nickname OpenPj)
   ○ Alfresco Community Star
   ○ Alfresco Wiki Gardener
   ○ Top 10 supporter (English and Italian)
   ○ Moderator of the Italian forum

● PMC Member and Committer at the Apache Software Foundation

● JBoss Community
   ○ Content editor for jboss.org
   ○ Project Leader and Committer for PortletSwap / Blog / Wiki
Overview
● Introducing Apache ManifoldCF
    ○ What is ManifoldCF?
    ○ Why ManifoldCF?
    ○ Architecture
    ○ Who is using ManifoldCF?
    ○ The book
● How ManifoldCF supports Alfresco
● The goal of the new connector
    ○ Architecture
    ○ Roadmap
    ○ The team
● Resources
The story
The original ManifoldCF code base was granted by MetaCarta to the
Apache Software Foundation in December 2009.

The MetaCarta effort represented more than five years of successful
development and testing in multiple, challenging enterprise
environments.

The project graduated to an Apache Top-Level Project in July 2012.
What is ManifoldCF?
Open Source crawler
 ● crawling model (add, change, delete)
 ● schedule jobs to create indexes
    ○ get contents from repositories
    ○ push contents to search servers
   Repository 1                           Search Server 1


   Repository 2       Apache ManifoldCF   Search Server 2


   Repository 3                           Search Server 3
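
A minimal sketch of the crawling model above: add, change and delete operations coming from repositories are pushed to every configured search server. The names `SearchServer` and `run_job`, and the tuple layout, are assumptions made for this example; they are not ManifoldCF classes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SearchServer:
    """Hypothetical in-memory stand-in for a search server index."""
    name: str
    index: dict = field(default_factory=dict)

    def apply(self, op: str, doc_id: str, content: Optional[str] = None) -> None:
        if op in ("add", "change"):
            self.index[doc_id] = content      # both end up as an upsert
        elif op == "delete":
            self.index.pop(doc_id, None)      # deleting twice is harmless
        else:
            raise ValueError(f"unknown operation: {op}")


def run_job(operations, servers):
    """Push every repository operation to every search server."""
    for op, doc_id, content in operations:
        for server in servers:
            server.apply(op, doc_id, content)


# Hypothetical operations observed on two repositories.
operations = [
    ("add", "repo1:/a.txt", "first draft"),
    ("add", "repo2:/b.txt", "notes"),
    ("change", "repo1:/a.txt", "second draft"),
    ("delete", "repo2:/b.txt", None),
]
servers = [SearchServer("Search Server 1"), SearchServer("Search Server 2")]
run_job(operations, servers)
print(servers[0].index)
```

After the run, both indexes hold only the latest version of `repo1:/a.txt`, because the deleted document has been removed everywhere.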
What is ManifoldCF?

● Out of the box, it is distributed as a web application
  ○ REST API
  ○ Authority Service
  ○ Crawler UI


● can be embedded in any Java application
Why ManifoldCF?
● Reliability

● Incremental

● Flexible

● Multi repositories

● Security model

● Monitoring
Why ManifoldCF? - Reliability
Job scheduling and configuration are stored in the database to
maintain the state of all executions

      Repository         Pull Agent Daemon             Search Server
                        configuration and scheduling




                               Database
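
To illustrate why keeping job state in a database gives reliability, here is a small sketch using SQLite. The table name `job_queue`, its columns and the helper functions are assumptions for this example only; the real ManifoldCF schema is different.

```python
import sqlite3


def open_state(path=":memory:"):
    """Create (if needed) the table that records per-document crawl state."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS job_queue ("
        " job_id TEXT, doc_id TEXT, status TEXT,"
        " PRIMARY KEY (job_id, doc_id))"
    )
    return db


def enqueue(db, job_id, doc_ids):
    # INSERT OR IGNORE keeps the status of documents that are already tracked.
    db.executemany(
        "INSERT OR IGNORE INTO job_queue VALUES (?, ?, 'pending')",
        [(job_id, d) for d in doc_ids],
    )
    db.commit()


def process_pending(db, job_id, ingest, limit=None):
    """Ingest pending documents, committing after each one so a crash loses no progress."""
    rows = db.execute(
        "SELECT doc_id FROM job_queue"
        " WHERE job_id = ? AND status = 'pending' ORDER BY doc_id",
        (job_id,),
    ).fetchall()
    if limit is not None:
        rows = rows[:limit]
    for (doc_id,) in rows:
        ingest(doc_id)
        db.execute(
            "UPDATE job_queue SET status = 'done' WHERE job_id = ? AND doc_id = ?",
            (job_id, doc_id),
        )
        db.commit()


db = open_state()
enqueue(db, "job-1", ["a", "b", "c"])
ingested = []
process_pending(db, "job-1", ingested.append, limit=2)   # simulate an interrupted run
process_pending(db, "job-1", ingested.append)            # a restart resumes with "c" only
print(ingested)
```

Because the status of each document is committed as it is processed, the second call picks up exactly where the first one stopped and nothing is ingested twice.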
Why ManifoldCF? - Incremental
Content changesets are obtained from the repository API


           Repository



                  complete
                 changesets       Apache ManifoldCF
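
A sketch of incremental crawling when the repository can report complete changesets. The `Repository` class and its integer change token are hypothetical; real repositories expose this in their own way (for example, a change log token).

```python
class Repository:
    """Hypothetical repository exposing an ordered change log."""

    def __init__(self):
        self._log = []            # list of (sequence_number, doc_id, action)

    def record(self, doc_id, action):
        self._log.append((len(self._log) + 1, doc_id, action))

    def changes_since(self, token):
        """Return changes with a sequence number greater than token."""
        return [entry for entry in self._log if entry[0] > token]


def incremental_run(repo, last_token):
    """Fetch only what changed since the previous run and return the new token."""
    changes = repo.changes_since(last_token)
    new_token = changes[-1][0] if changes else last_token
    return changes, new_token


repo = Repository()
repo.record("doc-1", "add")
repo.record("doc-2", "add")
first, token = incremental_run(repo, 0)       # first run sees both additions

repo.record("doc-1", "change")
second, token = incremental_run(repo, token)  # second run sees only the change
print(len(first), second, token)
```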
Why ManifoldCF? - Flexible
If the repository can't supply all the changes, ManifoldCF can
discover them through crawling

           Repository

                 incomplete
                 changesets           Apache ManifoldCF
                                       Change
                                      Discovery



            N1
                        N2
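
When changesets are incomplete, changes can be discovered by walking the repository and comparing what is found with what was seen last time. The sketch below assumes each document exposes a version string and a list of child references; the functions `discover` and `diff` and the sample data are made up for illustration.

```python
from collections import deque


def discover(seeds, fetch):
    """Breadth-first walk from the seeds; returns {doc_id: version} for reachable documents."""
    seen = {}
    queue = deque(seeds)
    while queue:
        doc_id = queue.popleft()
        if doc_id in seen:
            continue
        version, children = fetch(doc_id)
        seen[doc_id] = version
        queue.extend(children)
    return seen


def diff(previous, current):
    """Classify documents by comparing stored versions with freshly crawled ones."""
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    changed = sorted(
        d for d in current.keys() & previous.keys() if current[d] != previous[d]
    )
    return added, changed, deleted


# Hypothetical repository content: doc_id -> (version string, child references).
graph = {
    "root": ("v1", ["N1", "N2"]),
    "N1": ("v2", []),
    "N2": ("v1", ["N3"]),
    "N3": ("v1", []),
}
# Versions recorded by the previous crawl.
previous = {"root": "v1", "N1": "v1", "N2": "v1", "old": "v1"}

current = discover(["root"], lambda d: graph[d])
print(diff(previous, current))
```

Documents that were known before but are no longer reachable are treated as deletions, which is how a crawler can cover the delete case without help from the repository.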
Why ManifoldCF? - Multi repositories
Jobs can retrieve contents from the following repositories:
 ● CMIS-compliant
 ● Alfresco
 ● IBM FileNet
 ● EMC Documentum
 ● Microsoft SharePoint
 ● OpenText LiveLink
 ● Autonomy Meridio
 ● Memex Patriarch
 ● Windows Share/DFS
 ● Generic JDBC
 ● Generic Filesystem
 ● Generic RSS and Web
Why ManifoldCF? - Multi repositories
Jobs can ingest contents into the following search servers:
● Apache Solr
● ElasticSearch
● OpenSearchServer
● MetaCarta GTS
Why ManifoldCF? - Security model
Retrieve per-content ACLs                      Authority 1

                        Authority Service      Authority 2

                                               Authority 3


       Repository 1

       Repository 2    Pull Agent Daemon
                                            user access
       Repository 3                           tokens
                      doc access
                        tokens
                                                  user specific
                         Search Server              search
                                                    results
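
The security model can be sketched as a comparison of tokens: document access tokens are stored in the index at crawl time, user access tokens come from the authority service at search time. The token values, the `allow`/`deny` field names and the function below are assumptions for this example, not the format used by any particular search server plugin.

```python
def visible_results(hits, user_tokens):
    """Keep hits whose allow tokens match the user's tokens and whose deny tokens do not."""
    user_tokens = set(user_tokens)
    return [
        hit["id"]
        for hit in hits
        if user_tokens & set(hit["allow"]) and not user_tokens & set(hit["deny"])
    ]


# Hypothetical indexed documents with the access tokens stored at crawl time.
hits = [
    {"id": "budget.xls", "allow": ["GROUP_finance"], "deny": []},
    {"id": "handbook.pdf", "allow": ["GROUP_everyone"], "deny": []},
    {"id": "review.doc", "allow": ["GROUP_everyone"], "deny": ["user_bob"]},
]

# Hypothetical answers from the authority service for two users.
alice = ["GROUP_everyone", "GROUP_finance", "user_alice"]
bob = ["GROUP_everyone", "user_bob"]

print(visible_results(hits, alice))
print(visible_results(hits, bob))
```

The same query therefore returns user-specific results: Alice sees all three documents, Bob sees only the handbook.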
Why ManifoldCF? - Monitoring
The Crawler UI allows you to:
 ● configure jobs and connectors
 ● monitor job execution
 ● monitor content ingestion
   ○ status reports
      ■ document status
      ■ queue status
   ○ history reports
      ■ simple history
      ■ maximum activity
      ■ maximum bandwidth
      ■ result histogram
Architecture - Job
                                                        Authority
                                                        Connector
                                       ACLs
       Repository
       Connector
                           retrieve                        Output
                         content ACL                      Connector



      Repository                    Job                  Search Server

query to retrieve contents                            - metadata mapping
                               - verbal description   - content ingestion
                               - crawling model
                               - scheduling
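
A rough sketch of how a job ties the pieces together: a repository connector answers a query, the job applies a metadata mapping, and an output connector receives the result. All class names and fields here are hypothetical and much simpler than the real ManifoldCF connector interfaces; scheduling and the authority connector are left out.

```python
from dataclasses import dataclass, field


class ListRepositoryConnector:
    """Hypothetical repository connector backed by a list of documents."""

    def __init__(self, documents):
        self.documents = documents

    def fetch(self, query):
        # The "query" here is just a substring match on the document path.
        return [d for d in self.documents if query in d["path"]]


class ListOutputConnector:
    """Hypothetical output connector collecting what would go to a search server."""

    def __init__(self):
        self.ingested = []

    def ingest(self, document):
        self.ingested.append(document)


@dataclass
class Job:
    description: str
    repository: ListRepositoryConnector
    output: ListOutputConnector
    query: str
    metadata_mapping: dict = field(default_factory=dict)

    def run(self):
        for doc in self.repository.fetch(self.query):
            # Rename mapped fields, pass the others (content, ACL) through unchanged.
            mapped = {self.metadata_mapping.get(k, k): v for k, v in doc.items()}
            self.output.ingest(mapped)


repo = ListRepositoryConnector([
    {"path": "/sites/hr/policy.pdf", "cm:title": "Policy", "acl": ["GROUP_hr"]},
    {"path": "/sites/it/setup.txt", "cm:title": "Setup", "acl": ["GROUP_it"]},
])
out = ListOutputConnector()
job = Job("Index HR site", repo, out, query="/sites/hr/",
          metadata_mapping={"cm:title": "title"})
job.run()
print(out.ingested)
```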
Who is using ManifoldCF?
The book: ManifoldCF in Action

ManifoldCF in Action
by Karl Wright
published by Manning


Karl is the original developer and the
principal committer of Apache ManifoldCF


The book is available at http://www.manning.com/wright
How ManifoldCF supports Alfresco
● CMIS Repository Connector based on OpenCMIS


● The current Alfresco Repository Connector only supports CML
   ○ works on any version of Alfresco 2.x, 3.x and 4.x
   ○ no support for querying Solr from Alfresco
   ○ it will die at the end of the year
   ○ Please see the Alfresco Roadmap
Alfresco Solr search subsystem
● Remote crawling of contents and ACLs into Solr
  ○ REST API for retrieving changesets from Alfresco db
● Solr server provided by Alfresco
  ○ based on Apache Solr 1.4.1 (uhm...really!!!???)
● hardcoded
● can't be used with your own Solr instance
  ○ customers have newer versions of Solr
      ■ interested in new features (SolrCloud, sharding...)
      ■ hundreds of improvements available in 3.x and 4.x
Alfresco Solr search subsystem

 Alfresco         Transactions and ACL         Solr 1.4.1
                                           (provided by Alfresco)



                                             Alfresco REST Client




  alf_transaction
      alf_acl_*
     alf_node_*
                                                 Indexes
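
The tracking idea behind this subsystem can be sketched as follows: the index side remembers the last transaction it has applied and asks for newer ones. `FakeAlfresco`, its methods and the node status values are stand-ins invented for this example; they do not reproduce the actual Alfresco Solr API or its payloads.

```python
class FakeAlfresco:
    """Hypothetical stand-in for the REST API exposing transactions and their nodes."""

    def __init__(self):
        self.transactions = []   # list of (tx_id, [(node_id, status, text)])

    def commit(self, nodes):
        self.transactions.append((len(self.transactions) + 1, nodes))

    def transactions_after(self, tx_id):
        return [tx for tx in self.transactions if tx[0] > tx_id]


def track(alfresco, index, last_tx_id):
    """Replay transactions newer than last_tx_id into the index; return the new high-water mark."""
    for tx_id, nodes in alfresco.transactions_after(last_tx_id):
        for node_id, status, text in nodes:
            if status == "deleted":
                index.pop(node_id, None)
            else:
                index[node_id] = text
        last_tx_id = tx_id
    return last_tx_id


alfresco = FakeAlfresco()
alfresco.commit([(101, "updated", "contract v1"), (102, "updated", "invoice")])
alfresco.commit([(101, "updated", "contract v2"), (102, "deleted", None)])

index = {}
last = track(alfresco, index, 0)
print(index, last)
```

ACL changesets would be tracked the same way with their own high-water mark; they are omitted here to keep the sketch short.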
Roadmap
Goal - 1
Create a new connector using the Alfresco REST Client
● provided and supported by Alfresco
   ○ for us it is a Maven dependency :)


● invokes the Alfresco Solr API
Goal - 2 - check feasibility
Create a real Enterprise alternative for managing indexes


● compatibility with the SearchService of Alfresco
● repository takes care only of contents
● indexes are managed externally
● no redundancy for indexes


effort needed to redirect query execution
Goal - 3 - Security
 Implement an Alfresco authority connector
  ○ manages ACL indexing
Goal - 4
Manage indexes using ManifoldCF against any supported search server

● Apache Solr 3.x / 4.x

● ElasticSearch

● Open Search Server

● MetaCarta
Architecture

                    ManifoldCF
                                           Search
  Alfresco           Alfresco WebScript    Server
                    Repository Connector



                       Alfresco REST
                            Client




  alf_transaction    Output Connector
      alf_acl_*                             Indexes
     alf_node_*
The team of the new connector
● Piergiorgio Lucidi (Sourcesense + ASF)

● Maurizio Pillitu (Alfresco)

● Aingaran Pillai (Zaizi) [new entry]

● Fran Alvarez (Zaizi) [new entry]

● Abraham Ayala (Zaizi) [new entry]
Join us!

● We are looking for developers

● this is a work in progress

● don't fork the project; feel free to join us

                     ^__^
Resources

● Apache ManifoldCF
  http://manifoldcf.apache.org/

● The connector hosted on github:
  https://github.com/maoo/alfresco-webscript-manifold-connector



● it will be included in Apache ManifoldCF
Thank you for your
       attention!




http://www.open4dev.com
