ACM BPM and elasticsearch AMIS25

Getting value from IoT, Integration and Data Analytics
14 Jun 2016


  1. Search and Find Luc Gorissen ACM/BPM and Elastic Search
  2. Luc Gorissen Previous employers: - KPN Research - CMG Wireless Data Solutions - OraVision - Oracle Focus: - BPM and SOA Suite luc.gorissen@amis.nl +31 6 3622 4226 @LucGorissen No, no, no LinkedIn
  3. The Challenge. Starting point: our ACM/BPM implementation successfully supports our core business processes. Requirement: we need to be able to search through case/process data of the last 7 years. We need: an ACM/BPM archive where we can search through data of cases/processes of up to 7 years old.
  4. The Technology. Company: Elastic. Product: Elasticsearch. Promise: "... a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time." Can it be done?
  5. Topics: 1. Elastic Product Stack; 2. Basic Concepts; 3. Use Case; 4. Use Case Data; 5. Use Case Evaluation; 6. Recommendation
  6. Elastic Product Stack: "... a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time." Features: • Full-Text Search • Document-Oriented • Near-Real-Time • Horizontally Scalable • Multi Tenant • Schema-Free • REST-API • Open Source – Apache 2 license • On top of Apache Lucene • REST/JSON
  7. Elastic Product Stack (Product: Description)
      • Elasticsearch: Search engine
      • Elastic Cloud: Elasticsearch Cloud offering
      • Logstash: Data collection engine
      • Kibana: Analytics and visualization platform
      • Beats: Collect data (network, infra, file, winlog) and ship
      • Shield: Protect access to your data
      • Watcher: Alerts/notifications from changes in your data
      • Marvel: Monitor your Elasticsearch cluster
  8. Elastic Product Stack Maturity • Complete product stack • Cloud offering • Modern technology around solid Apache Lucene core (1999) • Clients: Ruby, Python, PHP, Perl, .NET, Java, JavaScript, etc. • Apache Lucene release 6.0.1, May 27, 2016 • Elasticsearch release 2.3.3, May 18, 2016 • Oracle plans to replace Secure Enterprise Search with Elasticsearch in WebCenter products (OOW 2015) • Support / community group / meet-ups / training
  9. Basic concepts. Supports: availability, scalability, distribution. [Diagram: a cluster stores JSON documents in index ABC; the index is split into Shard 1 and Shard 2, each with a replica shard (Replica Shard 1, Replica Shard 2), distributed over nodes.]
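The index, shard and replica layout of slide 9 can be sketched as a create-index request. This is a minimal sketch, not taken from the deck: the index name `casearchive` and the counts (2 primary shards, 1 replica each, mirroring the diagram) are assumptions, as are the helper function names. The request itself needs a running node, so it is wrapped in a function that is defined but not called.

```shell
# Assumed values: index name and shard counts are illustrative only.
INDEX="casearchive"
PRIMARY_SHARDS=2
REPLICAS=1

# Each primary shard gets REPLICAS copies, so the cluster holds
# primaries * (1 + replicas) shards in total.
total_shard_copies() {
  echo $(( $1 * (1 + $2) ))
}

# JSON body for the create-index request.
index_settings() {
  printf '{"settings":{"number_of_shards":%d,"number_of_replicas":%d}}\n' "$1" "$2"
}

# Sends the request to a local node; defined here but not called.
create_index() {
  curl -XPUT "localhost:9200/$INDEX?pretty" \
       -d "$(index_settings "$PRIMARY_SHARDS" "$REPLICAS")"
}

echo "shard copies: $(total_shard_copies "$PRIMARY_SHARDS" "$REPLICAS")"
index_settings "$PRIMARY_SHARDS" "$REPLICAS"
```

On a single-node development set-up the replica shards have no second node to live on and stay unassigned, which Elasticsearch reports as `yellow` health.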
  10. Installation development set-up
      Installation of Elasticsearch:
        [developer@localhost bin]$ tar -xvf elasticsearch-2.3.2.tar.gz
        [developer@localhost bin]$ pwd
        /home/developer/elasticsearch/elasticsearch-2.3.2/bin
        [developer@localhost bin]$ ./elasticsearch
      Installation of Kibana (‘Analytics and visualization platform’):
        [developer@localhost kibana]$ tar -xvf kibana-4.5.1-linux-x64.tar.gz
        [developer@localhost config]$ vi kibana.yml
        [developer@localhost bin]$ pwd
        /home/developer/kibana/kibana-4.5.1-linux-x64/bin
        [developer@localhost bin]$ ./kibana
  11. Use Case: tweets AMISnl. Screen all tweets of AMISnl to see if action is required for the conference. [Diagram: AMISnl tweets feed the TwitterSupport case, with activities ScreenTweet (Office Management), CtoScreening (CTO), TweeterContacted (telemarketer) and MarketingScreening (marketing).]
  12. Use case tweets (in total: 3212 tweets):
      733666488083750912  2016-05-20 14:31:36  RT @robbrecht: Orcas - Automatic deployment for the database https://t.co/4U6QSuROjf @amisnl @OC_WIRE
      733652455523811328  2016-05-20 13:35:50  RT @sai_penumuru: Learn something new from my session. #AMIS25 @oracleotn @oracleace https://t.co/1gBagwgotD
      733652388272312322  2016-05-20 13:35:34  RT @sai_penumuru: Join me on 2nd-3rd June 2016 for BEYOND THE HORIZON conference in Netherlands. #AMIS25 @oracleace @oracleotn https://t.co…
      733621946290671620  2016-05-20 11:34:36  NEWSFLASH! The official #AMIS25 app is now available. Search for 'AMIS 25' in your app store and enjoy! https://t.co/iYOEGG6l90
  13. Use Case result: data in JSON format
      Retrieve data for the 3212 tweets from the ACM system with the platform API. Retrieved data: • CaseActivities • CaseMileStones • Comments • CaseData
      XML returned by the platform API (excerpt):
        <caseActivityDefinition>
          <applicationName>default</applicationName>
          <completedDate>2016-05-19T06:29:13.910+02:00</completedDate>
          <componentName>TwitterSupport</componentName>
          <compositeDn>default/TwitterSupport!1.0*soa_33331876-7da2-4ba6-b28d-fec89397281e</compositeDn>
          <compositeName>TwitterSupport</compositeName>
          <compositeVersion>1.0</compositeVersion>
          <definitionId>default/TwitterSupport!1.0/CtoScreeningProcess</definitionId>
          <displayName>CtoScreeningProcess</displayName>
      Transform to JSON:
        {
          "caseActivityDefinition": {
            "caseId": "100036",
            "completedDate": "2016-05-23T09:39:03.111+02:00",
            "definitionId": "default/TwitterSupport!1.0/CtoScreeningProcess",
            "displayName": "CtoScreeningProcess",
            "instanceId": "116187",
            "name": "CtoScreeningProcess",
            "nameSpace": "http://xmlns.amis.nl/TwitterSupport/CtoScreeningProcess",
            "startDate": "2016-05-23T09:19:08.111+02:00"
          }
        }
  14. Insert data into Elasticsearch
      Insert milestone data into the Elasticsearch archive (in the URL, casemilestones is the index; the request body is the milestone data in JSON):
        curl -XPUT 'localhost:9200/casemilestones/external/1?pretty' -d '
        {
          "caseMilestone": {
            "caseId": "103242",
            "state": "ATTAINED",
            "name": "TweetScreenedMilestone",
            "updatedDate": "2016-05-25T10:27:34.111+02:00"
          }
        }
        '
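The slide inserts one milestone per request. For larger archives, Elasticsearch also offers a `_bulk` endpoint that takes an action line followed by a source line for each document. The sketch below is an assumption about how the same milestone data could be batched, not the method used in the deck: the function names, the file location and the second sample row are made up for illustration, and the upload function is defined but not called because it needs a running node.

```shell
# Emit one action line plus one source line per milestone,
# the newline-delimited layout the _bulk endpoint expects.
# Arguments: document id, caseId, state, milestone name, updatedDate.
bulk_milestone() {
  printf '{"index":{"_index":"casemilestones","_type":"external","_id":"%s"}}\n' "$1"
  printf '{"caseMilestone":{"caseId":"%s","state":"%s","name":"%s","updatedDate":"%s"}}\n' \
         "$2" "$3" "$4" "$5"
}

# Hypothetical location for the generated bulk file.
BULK_FILE="${TMPDIR:-/tmp}/casemilestones.bulk"

{
  # First row: the milestone shown on the slide.
  bulk_milestone 1 103242 ATTAINED TweetScreenedMilestone 2016-05-25T10:27:34.111+02:00
  # Second row: made-up values, for illustration only.
  bulk_milestone 2 103243 ATTAINED TweetScreenedMilestone 2016-05-25T10:28:02.111+02:00
} > "$BULK_FILE"

# Upload step; --data-binary keeps the newlines intact. Not called here.
upload_bulk() {
  curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary "@$BULK_FILE"
}

# Two documents produce four lines.
wc -l < "$BULK_FILE"
```

Calling `upload_bulk` against a running node would send both documents in a single request.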
  15. Results use case: data into Elasticsearch
      Totals - start:
        [developer@localhost elasticsearch-2.3.2]$ curl 'localhost:9200/_cat/indices?v'
        health status index          pri rep docs.count docs.deleted store.size pri.store.size
        yellow open   caseactivities   5   1          0            0       650b           650b
        yellow open   casemilestones   5   1          0            0       260b           260b
        yellow open   casecomments     5   1          0            0       650b           650b
        yellow open   casedata         5   1          0            0       650b           650b
      Totals - end:
        [developer@localhost elasticsearch-2.3.2]$ curl 'localhost:9200/_cat/indices?v'
        health status index          pri rep docs.count docs.deleted store.size pri.store.size
        yellow open   caseactivities   5   1       3693            0    929.4kb        929.4kb
        yellow open   casemilestones   5   1      16060            0      1.5mb          1.5mb
        yellow open   casecomments     5   1       7207            0    685.1kb        685.1kb
        yellow open   casedata         5   1      16060            0      2.2mb          2.2mb
        [developer@localhost elasticsearch-2.3.2]$
      Timing: # documents: 43020; upload time: 9:57 min; upload speed: ~72 docs / sec
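The timing figures on slide 15 follow from the per-index document counts and the upload time. As a quick check, using only numbers from the slide (the variable names are illustrative):

```shell
# Document counts per index, copied from the "Totals - end" listing.
ACTIVITIES=3693
MILESTONES=16060
COMMENTS=7207
CASEDATA=16060

TOTAL_DOCS=$(( ACTIVITIES + MILESTONES + COMMENTS + CASEDATA ))

# Upload time of 9:57 min expressed in seconds.
UPLOAD_SECONDS=$(( 9 * 60 + 57 ))

# Integer division, so the result is rounded down.
DOCS_PER_SECOND=$(( TOTAL_DOCS / UPLOAD_SECONDS ))

echo "documents: $TOTAL_DOCS"
echo "seconds:   $UPLOAD_SECONDS"
echo "docs/sec:  $DOCS_PER_SECOND"
```

The script prints 43020 documents, 597 seconds and 72 docs/sec, matching the totals reported on the slide.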
  16. Results use case: sample search
        [developer@localhost json]$
        [developer@localhost ~]$ curl -XPOST 'localhost:9200/casedata/_search?pretty' -d '
        > {
        >   "query": { "match": { "caseData.value": "Lucas"}},
        >   "_source": ["caseData.caseId", "caseData.value"]
        > }
        > '
        {
          "took" : 96,
          "timed_out" : false,
          "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
          "hits" : {
            "total" : 215,
            "max_score" : 1.3943143,
            "hits" : [ {
              "_index" : "casedata",
              "_type" : "external",
              "_id" : "AVTZQ0eNjRs4lNcko-Qb",
              "_score" : 1.3943143,
              "_source" : {
                "caseData" : {
                  "caseId" : "102701",
                  "value" : "Blog by Lucas Jellema: UX Expo 18th of March – OTN ArchBeat YouTube Video Interview: Jeremy Ashley &amp; Lucas J... http://t.co/9GlLzTJ3U0"
                }
              }
            }, {
              "_index" : "casedata",
  17. Kibana. Let’s start looking at the data with Kibana: what can it add to the archive?
  18. Kibana: timeline for case activities
  19. Searching with Kibana
  20. Kibana dashboard
  21. Office Documents. Especially for case management, ‘Office Documents’ are important.
      Installation of the plugin for indexing Office and PDF documents (Apache Tika):
        [developer@localhost bin]$ pwd
        /home/developer/elasticsearch/elasticsearch-2.3.2/bin
        [developer@localhost bin]$ ./plugin install mapper-attachments
  22. ‘Office Documents’: supported document formats • HyperText Markup Language • XML and derived formats • Microsoft Office document formats • OpenDocument Format • Portable Document Format • Electronic Publication Format • Rich Text Format • Compression and packaging formats • Text formats • Audio formats • Image formats • Video formats • Java class files and archives • The mbox format
  23. Results use case: searching office documents
      Insert documents base64 encoded … and search:
        [developer@localhost ~]$ curl -XPOST 'localhost:9200/casedocuments/document/_search?pretty' -d '
        > {
        >   "query": {
        >     "query_string": {
        >       "query": "+bonnetje +teeven" }},
        >   "_source": ["docName"]
        > }
        > '
        {
          "took" : 64,
          "timed_out" : false,
          "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
          "hits" : {
            "total" : 1,
            "max_score" : 0.43479362,
            "hits" : [ {
              "_index" : "casedocuments",
              "_type" : "document",
              "_id" : "AVT4Siu7Ia99IOtnY-TF",
              "_score" : 0.43479362,
              "_source" : {
                "docName" : "/doc/factuur.docx"
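Slide 23 mentions inserting documents base64 encoded but does not show that step. A minimal sketch of what it could look like with the mapper-attachments plugin follows. The attachment field name `file`, the helper function names and the sample file are assumptions; only the index `casedocuments`, the type `document` and the `docName` field come from the slide. The two curl calls need a running node with the plugin installed, so they are defined but not run.

```shell
# Base64-encode a file onto a single line; tr removes the line wraps
# that some base64 builds insert.
encode_file() {
  base64 < "$1" | tr -d '\n'
}

# Build the JSON document: docName as on the slide, plus an assumed
# "file" field holding the encoded content.
# Arguments: docName value, path of the file to encode.
attachment_doc() {
  printf '{"docName":"%s","file":"%s"}\n' "$1" "$(encode_file "$2")"
}

# Mapping that marks "file" as an attachment (a field type provided
# by the mapper-attachments plugin).
MAPPING='{"mappings":{"document":{"properties":{"file":{"type":"attachment"}}}}}'

# Create the index with the mapping. Not called here.
create_documents_index() {
  curl -XPUT 'localhost:9200/casedocuments?pretty' -d "$MAPPING"
}

# Index one document. Not called here.
index_document() {
  curl -XPOST 'localhost:9200/casedocuments/document?pretty' \
       -d "$(attachment_doc "$1" "$2")"
}

# Offline demonstration with a throwaway text file instead of a real .docx.
SAMPLE="${TMPDIR:-/tmp}/sample-attachment.txt"
printf 'hello' > "$SAMPLE"
attachment_doc /doc/sample.txt "$SAMPLE"
```

For large files it is probably safer to write the JSON to a file and pass it with `-d @file`, since the payload here travels as a command-line argument.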
  24. Use Case Results • Mature, enterprise-grade product • Easy search, even in ‘Office Documents’ • Basic analysis, more investigation required • Carefully determine what info to put into Elasticsearch – Audit trail? TaskQueryService? Other info? • It is schema-free: easy transitions between Oracle releases • You will find the caseIdentifier and anything related to the caseIdentifier • Not an easy overview of case history
  25. Recommendation. Back to ‘the challenge’: an ACM/BPM archive where we can search through data of cases/processes of up to 7 years old.
      Aspects:
      - TCO: license costs
      - TCO: yet another technology
      - DB versus Elasticsearch: schema-less JSON data store; no transactions; near-real-time
      - Document Management System / doc types
      - Logstash jdbc plugin