SlideShare a Scribd company logo
1 of 19
sqrrl and Accumulo
Which NoSQL solution?

There are a lot of
places to fit
sqrrl, and Accumulo.

© sqrrl, Inc.
What is sqrrl?


Based on Accumulo



A proven, secure, multi tenant, data platform for building real-time
applications



Scales elastically to tens of petabytes of data and enables
organizations to eliminate their internal data silos



Seamless integration with Hadoop, and most of its variants



Providing a supply for a much needed security demand (groundup security)



Already deployed and utilized by defense and government
industries
A history of sqrrl

© sqrrl, Inc. - Accumulo Meetup Presentation
A sqrrl’s architecture

© sqrrl, Inc.
What is Accumulo?


Development began at the NSA in 2008



Base foundation for sqrrl



Cell-level security reduces the cost of app development,
circumnavigating complex, sometimes impossible, legal or
policy restrictions



Provides the ability to scale to >PB levels



Highly adaptive schema and sorted key/value paradigm



Stores key/value pairs in parsed, sorted, secure controls
Where does Accumulo fit?

© sqrrl, Inc.
How does Accumulo provide security?



Security Labels are applied to keys



Cell-level security is implemented to allow for security policy
enforcement, using data labeler tags



These policies are applied when data is ingested



Tablets contain data, are controlled using security policies



Stores key/value pairs in parsed, sorted, secure controls, a 5-tuple
key system
Accumulo Security (cont.)

Why Cell-Level Security Is Important:
Many databases insufficiently implement security through row-and columnlevel restrictions. Column-level security is only sufficient when the data
schema is static, well known, and aligned with security concerns. Row-level
security breaks down when a single record conveys multiple levels of
information. The flexible, fine-grained cell-level security within Sqrrl
Enterprise (or its root Accumulo) supports flexible schemas, new indexing
patterns, and greater analytic adaptability at scale.
Accumulo Security (cont.)

An Accumulo key is a 5-tuple key, consisting of:


Row: Controls Atomicity



Column Family: Controls Locality



Column Qualifier: Controls Uniqueness



Visibility Label: Controls Access



Timestamp: Controls Versioning

(Values are byte arrays)

Keys are sorted:


Hierarchically: Row first, then column family, and so on



Lexicographically: Compare first byte, then second, and so on
Accumulo Security (cont.)

An example of column usage
Accumulo Architecture
Accumulo servers (tablets) utilize a multitude of big data
technologies, but their layout is different than Map/Reduce, HDFS,
MongoDB, Cassandra, etc. used alone.
 Data is stored in HDFS
 Zookeeper is utilized for configuration management
 SSH, password-less, node configuration
 An emphasis, more of an imperative, on data model and data
model design
Accumulo Architecture (cont.)

Tablets

 Partitions of tables, collections of sorted
key/value pairs
 Held and managed by Tablet Servers
Accumulo Architecture (cont.)
Tablet Servers

 Receive writes, responds to reads, from
clients

 Writes to a write-ahead log, sorting new
key/value pairs in memory, while
periodically flushing sorted key/value
pairs to new files in HDFS
 Managed by Master

 Responsible for detecting and
responding to Tablet Server failure,
load balancing
 Coordinates startup, graceful
shutdown, and recovery of writeahead logs
 Zookeeper
 An apache project, open source
 Utilized for distributed locking
mechanism, with no single point of
failure
Integration with users/access
The visibility labels are a feature that is unique to Accumulo. No other
database can apply access controls at such a fine-grained level.
Labels are generated
by translating an
organization’s existing
data security and
information sharing
policies into Boolean
expressions
1.

Gather an organization’s information security policies and dissecting them into data--‐centric
and user--‐centric components

2.

As data is ingested into Accumulo, a data labeler tags individual key/value pairs with the
appropriate data--‐centric visibility labels based on these policies.

3.

Data is then stored in Accumulo where it is available for real--‐time queries by operational
applications. End users are authenticated through these applications and authorized to
access underlying data

4.

As an end user performs an operation via the app (e.g., performs a search request), the
visibility label on each candidate key/value pair is checked against his or her attributes, and
only the data that he or she is authorized to see is returned.
Making sqrrl work
sqrrl’s extensibility of Accumulo allows it to process
millions of records per second, as either static or
streaming objects
These records are converted into hierarchical JSON
documents, giving document store capabilities

Passing this data to the analytics layer is designed to
make integration and development of real-time
analytics possible, and accessible

Combining access at the cell level, with Accumulo,
sqrrl integrates Identity and Access Management
(IAM) systems (LDAP, RADIUS, etc.)
Making sqrrl work

Apache thrift:
(cont.)

Sqrrl process
Apache Lucene:

Enables development in
diverse language choices

Custom iterators, providing developers with real-time
capabilities, such as full-text search, graph analysis, and
statistics, for analytical applications and dashboards.

HDFS:
File Storage system,
compatible with
both open source
(OSS) and
commercial
versions

Data Ingest:
JSON, or graph
format

Apache Accumulo:
The core of transactional and online
analytical data processing in sqrrl
Who is sqrrl for?
CTOs/CIOs: Unlock the value in fractured and unstructured datasets
across your organization

Developers: More easily create apps on top of Big Data and
distributed databases
Infrastructure Managers: Simplify administration of Big Data through
highly scalable and multitenant distributed systems
Data Analysts: Dig deeper into your data using advanced analytical
techniques, such as graph analysis
Business Users: Use Big Data seamlessly via apps developed on top
of sqrrl enterprise
sqrrl/Accumulo wrap-up
Accumulo
sqrrl

bridges the gap for security perspectives that restrict
a large swath of industries
combines the best of available technologies,
develops and contributes their own, and designs
big apps for big data.

Accumulo Setup:
1.
2.
3.

Installation of HDFS and ZooKeeper must installed and configured
Password-less SSH should be configured between all nodes (emphasized master <> tablet)
Installation of Accumulo (from http://accumulo.apache.org/downloads/ using
http://accumulo.apache.org/1.4/user_manual/Administration.html#Installation
Or get started using their AMI (http://www.sqrrl.com/downloads#getting-started)

More Related Content

What's hot

Running Cognos on Hadoop
Running Cognos on HadoopRunning Cognos on Hadoop
Running Cognos on HadoopSenturus
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoopnvvrajesh
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopCloudera, Inc.
 
Building a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemBuilding a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemGregg Barrett
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesOReillyStrata
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impalahuguk
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/KuduChris George
 
Taming Big Data with Big SQL 3.0
Taming Big Data with Big SQL 3.0Taming Big Data with Big SQL 3.0
Taming Big Data with Big SQL 3.0Nicolas Morales
 
The Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkThe Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkCloudera, Inc.
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataOfir Manor
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Cloudera, Inc.
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyRohit Kulkarni
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemInSemble
 
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...Cloudera, Inc.
 
Impala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopImpala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopCloudera, Inc.
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopCloudera, Inc.
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceeakasit_dpu
 

What's hot (20)

Running Cognos on Hadoop
Running Cognos on HadoopRunning Cognos on Hadoop
Running Cognos on Hadoop
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
Cloudera Impala
Cloudera ImpalaCloudera Impala
Cloudera Impala
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on Hadoop
 
Building a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystemBuilding a Big Data platform with the Hadoop ecosystem
Building a Big Data platform with the Hadoop ecosystem
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL DatabasesSQL on Hadoop: Defining the New Generation of Analytic SQL Databases
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/Kudu
 
Apache Hadoop 3
Apache Hadoop 3Apache Hadoop 3
Apache Hadoop 3
 
Taming Big Data with Big SQL 3.0
Taming Big Data with Big SQL 3.0Taming Big Data with Big SQL 3.0
Taming Big Data with Big SQL 3.0
 
The Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache SparkThe Future of Hadoop: A deeper look at Apache Spark
The Future of Hadoop: A deeper look at Apache Spark
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroData
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
 
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
 
Impala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopImpala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for Hadoop
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in Hadoop
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
 

Viewers also liked

Sqrrl June Webinar: An Accumulo Love Story
Sqrrl June Webinar: An Accumulo Love StorySqrrl June Webinar: An Accumulo Love Story
Sqrrl June Webinar: An Accumulo Love StorySqrrl
 
Accumulo14 15
Accumulo14 15Accumulo14 15
Accumulo14 15Sqrrl
 
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big Data
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big DataOct 2012 HUG: Apache Accumulo: Unlocking the Power of Big Data
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big DataYahoo Developer Network
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDFNarni Rajesh
 
Introduction to RDF & SPARQL
Introduction to RDF & SPARQLIntroduction to RDF & SPARQL
Introduction to RDF & SPARQLOpen Data Support
 
Introduction to Apache Accumulo
Introduction to Apache AccumuloIntroduction to Apache Accumulo
Introduction to Apache AccumuloJared Winick
 
Qlitan wid my cousins
Qlitan wid my cousinsQlitan wid my cousins
Qlitan wid my cousinsbhabyjekang
 
Top 150 global design firms
Top 150 global design firmsTop 150 global design firms
Top 150 global design firmsSamar Momin
 
Jiit 2013 14 project presentation aniket mishra
Jiit 2013 14 project presentation aniket mishraJiit 2013 14 project presentation aniket mishra
Jiit 2013 14 project presentation aniket mishraAniket Mishra
 
Organizaciones tradicionales vs organizaciones actuales
Organizaciones tradicionales vs organizaciones actuales Organizaciones tradicionales vs organizaciones actuales
Organizaciones tradicionales vs organizaciones actuales Juan David Plazas Florian
 
Semantic Web (Web 3.0)
Semantic Web (Web 3.0)Semantic Web (Web 3.0)
Semantic Web (Web 3.0)John Dougherty
 
Evaluation Question 4
Evaluation Question 4Evaluation Question 4
Evaluation Question 4annaskelding
 
페차쿠차
페차쿠차페차쿠차
페차쿠차ekwjd4793
 

Viewers also liked (20)

Sqrrl June Webinar: An Accumulo Love Story
Sqrrl June Webinar: An Accumulo Love StorySqrrl June Webinar: An Accumulo Love Story
Sqrrl June Webinar: An Accumulo Love Story
 
Accumulo14 15
Accumulo14 15Accumulo14 15
Accumulo14 15
 
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big Data
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big DataOct 2012 HUG: Apache Accumulo: Unlocking the Power of Big Data
Oct 2012 HUG: Apache Accumulo: Unlocking the Power of Big Data
 
Introduction to Accumulo
Introduction to AccumuloIntroduction to Accumulo
Introduction to Accumulo
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDF
 
Introduction to RDF & SPARQL
Introduction to RDF & SPARQLIntroduction to RDF & SPARQL
Introduction to RDF & SPARQL
 
Introduction to Apache Accumulo
Introduction to Apache AccumuloIntroduction to Apache Accumulo
Introduction to Apache Accumulo
 
SPARQL Cheat Sheet
SPARQL Cheat SheetSPARQL Cheat Sheet
SPARQL Cheat Sheet
 
RDF and OWL
RDF and OWLRDF and OWL
RDF and OWL
 
Qlitan wid my cousins
Qlitan wid my cousinsQlitan wid my cousins
Qlitan wid my cousins
 
Big Data ROI
Big Data ROIBig Data ROI
Big Data ROI
 
Top 150 global design firms
Top 150 global design firmsTop 150 global design firms
Top 150 global design firms
 
Diapositiva asesores
Diapositiva asesoresDiapositiva asesores
Diapositiva asesores
 
Jiit 2013 14 project presentation aniket mishra
Jiit 2013 14 project presentation aniket mishraJiit 2013 14 project presentation aniket mishra
Jiit 2013 14 project presentation aniket mishra
 
Enc 3241 color
Enc 3241 colorEnc 3241 color
Enc 3241 color
 
Organizaciones tradicionales vs organizaciones actuales
Organizaciones tradicionales vs organizaciones actuales Organizaciones tradicionales vs organizaciones actuales
Organizaciones tradicionales vs organizaciones actuales
 
Semantic Web (Web 3.0)
Semantic Web (Web 3.0)Semantic Web (Web 3.0)
Semantic Web (Web 3.0)
 
Evaluation Question 4
Evaluation Question 4Evaluation Question 4
Evaluation Question 4
 
Michigan Therapy Institute
Michigan Therapy InstituteMichigan Therapy Institute
Michigan Therapy Institute
 
페차쿠차
페차쿠차페차쿠차
페차쿠차
 

Similar to Sqrrl and Accumulo

Analytics and Lakehouse Integration Options for Oracle Applications
Analytics and Lakehouse Integration Options for Oracle ApplicationsAnalytics and Lakehouse Integration Options for Oracle Applications
Analytics and Lakehouse Integration Options for Oracle ApplicationsRay Février
 
Imperative Induced Innovation - Patrick W. Dowd, Ph. D
Imperative Induced Innovation - Patrick W. Dowd, Ph. DImperative Induced Innovation - Patrick W. Dowd, Ph. D
Imperative Induced Innovation - Patrick W. Dowd, Ph. Dscoopnewsgroup
 
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...Maginatics
 
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...IJMER
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET Journal
 
Cloudian HyperStore Features and Benefits
Cloudian HyperStore Features and BenefitsCloudian HyperStore Features and Benefits
Cloudian HyperStore Features and BenefitsCloudian
 
Two Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In CloudTwo Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In Cloudtheijes
 
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARING
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARINGSURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARING
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARINGEditor IJMTER
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive
 
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...Editor IJMTER
 
Cloud Computing_ICT Concepts & Trends.pptx
Cloud Computing_ICT Concepts & Trends.pptxCloud Computing_ICT Concepts & Trends.pptx
Cloud Computing_ICT Concepts & Trends.pptxssuser6063b0
 
The Open Data Lake Platform Brief - Data Sheets | Whitepaper
The Open Data Lake Platform Brief - Data Sheets | WhitepaperThe Open Data Lake Platform Brief - Data Sheets | Whitepaper
The Open Data Lake Platform Brief - Data Sheets | WhitepaperVasu S
 
Scalable POSIX File Systems in the Cloud
Scalable POSIX File Systems in the CloudScalable POSIX File Systems in the Cloud
Scalable POSIX File Systems in the CloudRed_Hat_Storage
 
An Auditing Protocol for Protected Data Storage in Cloud Computing
An Auditing Protocol for Protected Data Storage in Cloud ComputingAn Auditing Protocol for Protected Data Storage in Cloud Computing
An Auditing Protocol for Protected Data Storage in Cloud Computingijceronline
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...Editor IJCATR
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfan
 
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...IRJET Journal
 
trisulnsm_6.5_datasheet
trisulnsm_6.5_datasheettrisulnsm_6.5_datasheet
trisulnsm_6.5_datasheettrisulnsm
 

Similar to Sqrrl and Accumulo (20)

Analytics and Lakehouse Integration Options for Oracle Applications
Analytics and Lakehouse Integration Options for Oracle ApplicationsAnalytics and Lakehouse Integration Options for Oracle Applications
Analytics and Lakehouse Integration Options for Oracle Applications
 
Imperative Induced Innovation - Patrick W. Dowd, Ph. D
Imperative Induced Innovation - Patrick W. Dowd, Ph. DImperative Induced Innovation - Patrick W. Dowd, Ph. D
Imperative Induced Innovation - Patrick W. Dowd, Ph. D
 
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...
Maginatics @ SDC 2013: Architecting An Enterprise Storage Platform Using Obje...
 
As34269277
As34269277As34269277
As34269277
 
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...
Cooperative Schedule Data Possession for Integrity Verification in Multi-Clou...
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
 
Cloudian HyperStore Features and Benefits
Cloudian HyperStore Features and BenefitsCloudian HyperStore Features and Benefits
Cloudian HyperStore Features and Benefits
 
Two Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In CloudTwo Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In Cloud
 
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARING
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARINGSURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARING
SURVEY ON KEY AGGREGATE CRYPTOSYSTEM FOR SCALABLE DATA SHARING
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
 
Cloud Computing_ICT Concepts & Trends.pptx
Cloud Computing_ICT Concepts & Trends.pptxCloud Computing_ICT Concepts & Trends.pptx
Cloud Computing_ICT Concepts & Trends.pptx
 
The Open Data Lake Platform Brief - Data Sheets | Whitepaper
The Open Data Lake Platform Brief - Data Sheets | WhitepaperThe Open Data Lake Platform Brief - Data Sheets | Whitepaper
The Open Data Lake Platform Brief - Data Sheets | Whitepaper
 
Scalable POSIX File Systems in the Cloud
Scalable POSIX File Systems in the CloudScalable POSIX File Systems in the Cloud
Scalable POSIX File Systems in the Cloud
 
An Auditing Protocol for Protected Data Storage in Cloud Computing
An Auditing Protocol for Protected Data Storage in Cloud ComputingAn Auditing Protocol for Protected Data Storage in Cloud Computing
An Auditing Protocol for Protected Data Storage in Cloud Computing
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -
 
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...
IRJET- An EFficiency and Privacy-Preserving Biometric Identification Scheme i...
 
trisulnsm_6.5_datasheet
trisulnsm_6.5_datasheettrisulnsm_6.5_datasheet
trisulnsm_6.5_datasheet
 
Real time analytics
Real time analyticsReal time analytics
Real time analytics
 

Recently uploaded

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 

Recently uploaded (20)

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 

Sqrrl and Accumulo

  • 2. Which NoSQL solution? There are a lot of places to fit sqrrl, and Accumulo. © sqrrl, Inc.
  • 3. What is sqrrl?  Based on Accumulo  A proven, secure, multi tenant, data platform for building real-time applications  Scales elastically to tens of petabytes of data and enables organizations to eliminate their internal data silos  Seamless integration with Hadoop, and most of its variants  Providing a supply for a much needed security demand (groundup security)  Already deployed and utilized by defense and government industries
  • 4. A history of sqrrl © sqrrl, Inc. - Accumulo Meetup Presentation
  • 6. What is Accumulo?  Development began at the NSA in 2008  Base foundation for sqrrl  Cell-level security reduces the cost of app development, circumnavigating complex, sometimes impossible, legal or policy restrictions  Provides the ability to scale to >PB levels  Highly adaptive schema and sorted key/value paradigm  Stores key/value pairs in parsed, sorted, secure controls
  • 7. Where does Accumulo fit? © sqrrl, Inc.
  • 8. How does Accumulo provide security?  Security Labels are applied to keys  Cell-level security is implemented to allow for security policy enforcement, using data labeler tags  These policies are applied when data is ingested  Tablets contain data, are controlled using security policies  Stores key/value pairs in parsed, sorted, secure controls, a 5-tuple key system
  • 9. Accumulo Security (cont.) Why Cell-Level Security Is Important: Many databases insufficiently implement security through row-and columnlevel restrictions. Column-level security is only sufficient when the data schema is static, well known, and aligned with security concerns. Row-level security breaks down when a single record conveys multiple levels of information. The flexible, fine-grained cell-level security within Sqrrl Enterprise (or its root Accumulo) supports flexible schemas, new indexing patterns, and greater analytic adaptability at scale.
  • 10. Accumulo Security (cont.) An Accumulo key is a 5-tuple key, consisting of:  Row: Controls Atomicity  Column Family: Controls Locality  Column Qualifier: Controls Uniqueness  Visibility Label: Controls Access  Timestamp: Controls Versioning (Values are byte arrays) Keys are sorted:  Hierarchically: Row first, then column family, and so on  Lexicographically: Compare first byte, then second, and so on
  • 11. Accumulo Security (cont.) An example of column usage
  • 12. Accumulo Architecture Accumulo servers (tablets) utilize a multitude of big data technologies, but their layout is different than Map/Reduce, HDFS, MongoDB, Cassandra, etc. used alone.  Data is stored in HDFS  Zookeeper is utilized for configuration management  SSH, password-less, node configuration  An emphasis, more of an imperative, on data model and data model design
  • 13. Accumulo Architecture (cont.) Tablets  Partitions of tables, collections of sorted key/value pairs  Held and managed by Tablet Servers
  • 14. Accumulo Architecture (cont.) Tablet Servers  Receive writes, responds to reads, from clients  Writes to a write-ahead log, sorting new key/value pairs in memory, while periodically flushing sorted key/value pairs to new files in HDFS  Managed by Master  Responsible for detecting and responding to Tablet Server failure, load balancing  Coordinates startup, graceful shutdown, and recovery of writeahead logs  Zookeeper  An apache project, open source  Utilized for distributed locking mechanism, with no single point of failure
  • 15. Integration with users/access The visibility labels are a feature that is unique to Accumulo. No other database can apply access controls at such a fine-grained level. Labels are generated by translating an organization’s existing data security and information sharing policies into Boolean expressions 1. Gather an organization’s information security policies and dissecting them into data--‐centric and user--‐centric components 2. As data is ingested into Accumulo, a data labeler tags individual key/value pairs with the appropriate data--‐centric visibility labels based on these policies. 3. Data is then stored in Accumulo where it is available for real--‐time queries by operational applications. End users are authenticated through these applications and authorized to access underlying data 4. As an end user performs an operation via the app (e.g., performs a search request), the visibility label on each candidate key/value pair is checked against his or her attributes, and only the data that he or she is authorized to see is returned.
  • 16. Making sqrrl work sqrrl’s extensibility of Accumulo allows it to process millions of records per second, as either static or streaming objects These records are converted into hierarchical JSON documents, giving document store capabilities Passing this data to the analytics layer is designed to make integration and development of real-time analytics possible, and accessible Combining access at the cell level, with Accumulo, sqrrl integrates Identity and Access Management (IAM) systems (LDAP, RADIUS, etc.)
  • 17. Making sqrrl work Apache thrift: (cont.) Sqrrl process Apache Lucene: Enables development in diverse language choices Custom iterators, providing developers with real-time capabilities, such as full-text search, graph analysis, and statistics, for analytical applications and dashboards. HDFS: File Storage system, compatible with both open source (OSS) and commercial versions Data Ingest: JSON, or graph format Apache Accumulo: The core of transactional and online analytical data processing in sqrrl
  • 18. Who is sqrrl for? CTOs/CIOs: Unlock the value in fractured and unstructured datasets across your organization Developers: More easily create apps on top of Big Data and distributed databases Infrastructure Managers: Simplify administration of Big Data through highly scalable and multitenant distributed systems Data Analysts: Dig deeper into your data using advanced analytical techniques, such as graph analysis Business Users: Use Big Data seamlessly via apps developed on top of sqrrl enterprise
  • 19. sqrrl/Accumulo wrap-up Accumulo sqrrl bridges the gap for security perspectives that restrict a large swath of industries combines the best of available technologies, develops and contributes their own, and designs big apps for big data. Accumulo Setup: 1. 2. 3. Installation of HDFS and ZooKeeper must installed and configured Password-less SSH should be configured between all nodes (emphasized master <> tablet) Installation of Accumulo (from http://accumulo.apache.org/downloads/ using http://accumulo.apache.org/1.4/user_manual/Administration.html#Installation Or get started using their AMI (http://www.sqrrl.com/downloads#getting-started)

Editor's Notes

  1. Welcome everyone to the meetup.Thank Michael Walker for being such a generous and kind host.Give a brief overview of the topic: Security is now paramount
  2. We’ve all seen decision trees, this one is a little different.Notice how Accumulo has positioning in almost every aspect of big data analysis.We’ll get into the greater explanation of the interactions and operations of both sqrrl and Accumulo.
  3. Big data has been brewing for a while, and even though Accumulo hasn’t been widely known; it’s had a rich history.Most of you will know about Google’s Bigtable white paper, which Accumulo is based on.In 2008, the NSA began developmentIn 2011 NSA open sources the solution and Accumulo becomes an incubation project at ApacheIn 2012 Accumulo becomes a top-level projectAnd this year the first release of sqrrl was unleashed.
  4. The bulk of sqrrl’s design architecture is held within Accumulo, hence why we’ll be diving so deep into its inner-workings.The majority of value from sqrrl is extracted by augmenting and adapting other technologies to provide real-time analytics, security, SQL interactions and analysis, such as graph search.
  5. As the history described, development began in 2008.Sqrrl uses Accumulo as a basis for most data workflow and storage access.Cell-level security becomes the key element to differentiate Accumulo and allow for lowering of costs and implementing key security measures and compliances.Scalability is, of course, accommodated with Accumulo.Its schema/tablet/table design complements an adaptive approach.Accomplishes the cell-level security by parsing and sorting specific parts of key/value pairs.
  6. This is an overview that we’ll return to in a bit.Notice the “keystone” is Accumulo, but also the diverse technologies associated.
  7. Labels are used to apply security to keysPolicy driven enforcement allows for cell-level securityData is marked and labeled at ingest, requiring a well thought-out approachTablets contain tables, or data, and are controlled by those, hopefully, well thought-out security policiesKey/Value pairs contain inherent data at ingestion, using a 5-tuple system
  8. I’d like to take a minute, just sit right there, and explain how cell security became the most sought-after feature for real-time analysis.And this is why sqrrl and Accumulo have, natively, risen to the challenge, and why it’s important.
  9. An Accumulo key is a pretty complicated column, and is actually a set of nested columns.Explain each individually:Row, provides controlled atomicityCrucial for large data setsColumn Family, controlling locality, fast access systemsColumn Qualifier, sorts third and provides uniqueness to the keyVisibility Label, runs fourth and controls user access to individual key/value pairs. They are Boolean, and require parenthesis to determine orderingVisibility Label, runs fourth and controls user access to individual key/value pairs. They are Boolean, and require parenthesis to determine orderingTimestamp, controls which version of the key is which.Sorting of keys:Hierarchically, row first, column family second, qualifier third, etc.Lexicographically, compare the first by first, second, second, etc.
  10. This is an example of how one might setup a key configuration.Row = NameCol. Fam. = Sorting Locality used as a diverse descriptorCol. Qual. = Additional descriptor for granularityVisibility = User access (the syntax is described in documentation, but parenthesis can be used to “nest” and order access restrictions)Timestamp = The calendar dateValue = Diverse result
  11. Explain above.Make sure to spend some extra time on data modeling and design.
  12. Mention and focus that the sqrrl enterprise white paper is much more verbose and granular when explaining the inner-workings of a Tablet server, etc.
  13. Mention and focus that the sqrrl enterprise white paper is much more verbose and granular when explaining the inner-workings of a Tablet server, etc.
  14. The visibility labels are a feature that is unique to Accumulo. No other database can apply access controls at this fine-­‐grained level. The labels are generated by translating an organization’s existing data security and information sharing policies into Boolean expressions over data attributes and user attributes.The process for applying these security labels to the data and connecting the labels to user’s designated authorizations is depicted in this figure. The first step is gathering an organization’s information security policies and dissecting or dissemenatingthem into data-­‐centric and user-­‐centric components. As data is ingested into Accumulo, a data labeler tags individual key/value pairs with the appropriate data-­‐centric visibility labels based on these policies. Data is then stored in Accumulo where it is available for real-­‐time queries by operational applications. End users are authenticated through these applications and authorized to access underlying data based on their defined attributes (e.g., sworn law enforcement officer, current employee, etc.). As an end user performs an operation via the app (e.g., performs a search request), the visibility label on each candidate key/value pair is checked against his or her attributes, and only the data that he or she is authorized to see is returned.
  15. Real-time processing of millions of records per second, give sqrrl an edge.The workflow processes the ingestion through conversion to hierarchical JSON documents, or can, to utilize a document store’s capabilities.Egress out of Accumulo is where sqrrl shines, and passing this data to the analytics layer gives integration and real-time analytics development a lot more accessibility.Integration with existing Identity and Access Management systems is crucial to a direct, seamless, implementation.
  16. Remember this graphic? There’s quite a lot more to sqrrl than just AccumuloStarting with ingest, sqrrl facilitates delineationPushed to Accumulo, processing and results are utilized, with HDFS as a storage systemThrift enables development from almost any developers favorite flavor of languageAnd finally Lucene, one of many post processing analytics platforms, takes the churned data and presents it to a number of layers.
  17. Explain each usage case
  18. Explain the above.I’d like to thank everyone for their time, and would love to answer any questions.