This talk, delivered by Florent Guillaume, Director of R&D at Nuxeo, gives the audience a global understanding of what Apricot is and a general overview of what a Content Repository is from a functional standpoint. It explores the services a repository offers, identifies the main standards and technologies integrated in a framework of this caliber, such as Content Management Interoperability Services (CMIS), and explains the main technical challenges to be resolved, in particular high scalability and high performance.
4. What Is Content?
• Everything is content!
• Unstructured
• Files, Images, Assets, XML, Binary streams
• Structured
• Schema, Metadata, Business Data, Tables
• Semi-Structured
• Files + Metadata, Web pages (assemblies, relations), Emails (attachments), Record Management
5. A Content Repository Is
• Middleware
• Between Application and Storage Subsystem
• Does not replace either
• Persistence Service
• Stores structured and unstructured content
• High-Level Abstraction
• Stop caring about storage details
• Focus on your domain model and its objects
6. What a Content Repository Is Not
• Not a CMS (WCM, ECM, ...)
• A CMS is one application on top of a Content Repository
• Not a filesystem
• If all you have is a filesystem, everything looks like a file
• Not an ORM
• Not that granular, don’t think in SQL terms
• Not just for storage
• Provides Services, Domain Model / Business Model
8. Eclipse Apricot
• OSGi framework
• Under the Eclipse Runtime project
• Currently in the Incubation phase
• Mentored by Gary Xue (Actuate) and Cédric Brun (Obeo)
• Contributed by Nuxeo, from Nuxeo Core
• http://www.eclipse.org/apricot
9. What Is Apricot?
• Content Repository
• Web Support
• Content Automation
• CMIS
14. Architecture (diagram; box labels listed roughly top to bottom)
• HTTP (CMIS), HTTP
• OpenCMIS (shown twice), Java API, Automation
• Native Java API, Core Services
• VCS
• Binary Store, SQL Backend, Other Backend
• Filesystem, SQL Database, Cloud
15. When to Use Apricot?
• Need to store Objects with Properties
• And also Files
• Don’t want to write SQL
• But be able to fall back to SQL if really needed
• Need Access Control
• Need Versioning, Queries, ...
• Don’t want to reinvent the wheel
16. Why Is Apricot Good?
• Modular
• Fast
• Safe
• Scales
• Full-featured services
• Battle-tested
17. Apricot: Modular
• Uses OSGi deployment
• OSGi bundles
• Dynamic lifecycle
• Extension points
• XML
• Inspired by those from Eclipse
18. Apricot: Fast
• Efficient use of storage backends
• Use their native features and strengths
• Don’t reinvent transactions, relations
• Caching
• Batching
20. Apricot: Scaling
• Clusterable on top of a standard SQL database
• Architecture ready for NoSQL backends
• Pluggable Binary Store
• Filesystem, SQL, Amazon S3, etc.
• Lockless
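The "Pluggable Binary Store" bullet can be pictured with a minimal sketch. The `BinaryStore` interface, the in-memory implementation, and the choice of a SHA-256 digest as the storage key are all assumptions made for illustration; the slides do not show Apricot's actual interface or key scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BinaryStoreSketch {

    /** Hypothetical contract; a filesystem, SQL or S3 variant would implement the same methods. */
    interface BinaryStore {
        String put(byte[] data);   // returns the key under which the bytes are stored
        byte[] get(String key);    // returns null when the key is unknown
    }

    /** In-memory stand-in that keys each blob by its SHA-256 digest (an assumption). */
    static class InMemoryBinaryStore implements BinaryStore {
        private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

        @Override
        public String put(byte[] data) {
            String key = digest(data);
            // Same content always maps to the same key, so a repeated write is a no-op.
            blobs.putIfAbsent(key, data.clone());
            return key;
        }

        @Override
        public byte[] get(String key) {
            byte[] data = blobs.get(key);
            return data == null ? null : data.clone();
        }
    }

    /** Hex-encoded SHA-256 digest of the given bytes. */
    static String digest(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        BinaryStore store = new InMemoryBinaryStore();
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        String key = store.put(content);
        System.out.println(key.length() + " hex chars, round trip ok: "
                + Arrays.equals(content, store.get(key)));
    }
}
```

Because callers depend only on the interface, swapping the in-memory map for another backend would not change the calling code, which is the point of making the store pluggable.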
22. Apricot: Battle-Tested
• Originates from Nuxeo Core
• Used in Nuxeo DM, Nuxeo DAM, Nuxeo CMF
• In production for 4 years
• Thousands of deployments
24. Choosing a Modularity Framework
• Java SE
• No bundle life cycle, no modularity, no extension system
• Java EE
• Everything is packaged as one big application (EAR or WAR); a feature cannot be updated or added without recompiling the entire application
• OSGi — yes, but...
25. Additions Needed to OSGi
• To achieve a plugin model
• Eclipse already had the answer: extension points
• To provide enterprise features
• No real OSGi Enterprise Framework implementations yet
26. Extension Points
• Eclipse used EXSD, which is tightly tied to PDE
• (remember it was 2006 and we had to use Maven)
• Nuxeo redefined something similar but more flexible
• No intermediate object model between services and contributed extensions
• Write an extension class and map it to XML using Java annotations
• Easy to write for developers, no specific IDE needed
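The idea of writing an extension class and mapping it to XML with Java annotations can be sketched as follows. The annotation names `XmlElementName` and `XmlAttr`, the `ServletDescriptor` class, and the `load` helper are hypothetical and defined here only for illustration; they are not the actual Nuxeo or Apricot annotations.

```java
import java.io.ByteArrayInputStream;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class ExtensionMappingSketch {

    /** Hypothetical annotation: names the XML element a descriptor class maps to. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface XmlElementName {
        String value();
    }

    /** Hypothetical annotation: binds a String field to an XML attribute. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface XmlAttr {
        String value();
    }

    /** Example extension descriptor, written as a plain annotated class. */
    @XmlElementName("servlet")
    static class ServletDescriptor {
        @XmlAttr("name")
        String name;
        @XmlAttr("class")
        String className;
    }

    /** Parses one XML fragment and fills the annotated fields of a new descriptor. */
    static <T> T load(Class<T> type, String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        String expected = type.getAnnotation(XmlElementName.class).value();
        if (!expected.equals(root.getTagName())) {
            throw new IllegalArgumentException("expected <" + expected + ">");
        }
        T instance = type.getDeclaredConstructor().newInstance();
        for (Field field : type.getDeclaredFields()) {
            XmlAttr attr = field.getAnnotation(XmlAttr.class);
            if (attr != null) {
                field.setAccessible(true);
                field.set(instance, root.getAttribute(attr.value()));
            }
        }
        return instance;
    }

    public static void main(String[] args) throws Exception {
        ServletDescriptor d = load(ServletDescriptor.class,
                "<servlet name=\"hello\" class=\"org.example.HelloServlet\"/>");
        System.out.println(d.name + " -> " + d.className);
    }
}
```

The service receives the populated descriptor object directly, so there is no intermediate object model to traverse, and nothing in the sketch depends on a particular IDE.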
28. Integrating with Java EE
• Apricot should be able to run in an Application Server (as a WAR)
• Java EE configuration is monolithic
• To declare servlets (web.xml) one must know in advance the servlets provided by all the different bundles, same for application.xml
• Apricot is dynamic: bundles may be installed at runtime
• Java EE components declared by bundles must be installed at runtime
29. Nuxeo Configuration Fragments
• Using templates for Java EE configuration files
• Dynamically generate web.xml and application.xml at application startup from the configuration contributed by each bundle
• Ability to package WAR applications that can adapt themselves to the configuration provided by new application bundles
• No need to have different product packagings for different configurations
• Needs a server-specific bridge to do this processing
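Generating web.xml at startup from a template and per-bundle fragments can be sketched as below. The `%SERVLETS%` placeholder, the `ServletFragment` class, and the servlet names are assumptions made for illustration; the slides do not show the actual template syntax. The sketch also skips XML escaping and the server-specific bridge.

```java
import java.util.ArrayList;
import java.util.List;

public class WebXmlTemplateSketch {

    /** Hypothetical template; %SERVLETS% marks where contributed fragments are placed. */
    static final String TEMPLATE = "<web-app>\n%SERVLETS%</web-app>\n";

    /** One servlet declaration contributed by a bundle (illustrative shape). */
    static class ServletFragment {
        final String name;
        final String className;

        ServletFragment(String name, String className) {
            this.name = name;
            this.className = className;
        }

        String toXml() {
            return "  <servlet>\n"
                    + "    <servlet-name>" + name + "</servlet-name>\n"
                    + "    <servlet-class>" + className + "</servlet-class>\n"
                    + "  </servlet>\n";
        }
    }

    /** Fills the template with every fragment, in contribution order. */
    static String generate(List<ServletFragment> fragments) {
        StringBuilder servlets = new StringBuilder();
        for (ServletFragment fragment : fragments) {
            servlets.append(fragment.toXml());
        }
        return TEMPLATE.replace("%SERVLETS%", servlets);
    }

    public static void main(String[] args) {
        List<ServletFragment> fragments = new ArrayList<>();
        fragments.add(new ServletFragment("cmis", "org.example.CmisServlet"));
        fragments.add(new ServletFragment("automation", "org.example.AutomationServlet"));
        System.out.print(generate(fragments));
    }
}
```

Adding a bundle only adds an entry to the fragment list, so the same packaged WAR can serve different configurations.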
31. Apricot Configuration Fragments
• Convert all configuration fragments into runtime extension points
• Declaring a servlet is done by contributing to an extension point
• Servlets can be installed at runtime when the bundle declaring them is activated
• Provide a bridge to interact with the host application server
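Installing servlets at runtime through an extension point can be sketched with a small registry. `Handler` stands in for a real servlet so the example needs no servlet API on the classpath, and `ServletRegistry`, `contribute`, and `withdrawAll` are hypothetical names, not Apricot's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ServletExtensionPointSketch {

    /** Stand-in for a servlet (assumption: a real system would use the servlet API). */
    interface Handler {
        String handle(String path);
    }

    /** Hypothetical registry playing the role of a "servlets" extension point. */
    static class ServletRegistry {
        private final Map<String, Handler> handlersByPath = new ConcurrentHashMap<>();
        private final Map<String, String> owningBundleByPath = new ConcurrentHashMap<>();

        /** Called when a bundle is activated: its contribution becomes live immediately. */
        void contribute(String bundle, String path, Handler handler) {
            handlersByPath.put(path, handler);
            owningBundleByPath.put(path, bundle);
        }

        /** Called when a bundle is stopped: everything it contributed is withdrawn. */
        void withdrawAll(String bundle) {
            owningBundleByPath.entrySet().removeIf(entry -> {
                if (entry.getValue().equals(bundle)) {
                    handlersByPath.remove(entry.getKey());
                    return true;
                }
                return false;
            });
        }

        /** Routes a request to the contributed handler, or reports that none is installed. */
        String dispatch(String path) {
            Handler handler = handlersByPath.get(path);
            return handler == null ? "404 " + path : handler.handle(path);
        }
    }

    public static void main(String[] args) {
        ServletRegistry registry = new ServletRegistry();
        System.out.println(registry.dispatch("/hello"));   // no bundle active yet
        registry.contribute("org.example.hello", "/hello", path -> "hello from " + path);
        System.out.println(registry.dispatch("/hello"));   // served by the contribution
        registry.withdrawAll("org.example.hello");
        System.out.println(registry.dispatch("/hello"));   // withdrawn again
    }
}
```

In this picture the host application server only needs one bridge entry point that forwards requests to `dispatch`; the set of servlets behind it can change without touching web.xml.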
32. Java EE Features
• Full OSGi integration of JAAS (authentication system)
• Full JTA support through Apache Geronimo (transactions)
• Full JCA support through Apache Geronimo (resource adapters and pooling)
• In-memory JNDI server
• Future plans to integrate the work done in the Gemini project (and also support Virgo)
34. Where Are We Going?
• Finish the first Apricot release
• Cleanup, testing framework
• Replace Nuxeo Core with Apricot
• Nuxeo Core running under a full OSGi container
• Bridge for non-OSGi application servers
• Update JSF and Seam to Java EE 6
• Use CDI for easier access to services
• Better IDE support
Notes (slide 5):
• Middleware: Defines the domain model and services, used by the application, persisted in the storage
• Persistence Service: Is to semi-structured content what Hibernate is to objects with properties
• High-Level Abstraction: Abstract operations; let the Content Repository do its job
Notes (slide 6):
• Not a Filesystem: A Content Repository offers much richer semantics: metadata, versioning, relationships, non-path-based access
• Not an ORM: Content comes before relations and optimizations; don’t limit yourself to SQL