Tamir Dresher
Senior Software Architect
May 19, 2014
Where is my Data? (In the Cloud)
About Me
• Software architect, consultant and instructor
• Software Engineering Lecturer @ Ruppin Academic Center
• Technology addict
• 10 years of experience
• .NET and Native Windows Programming
@tamir_dresher
tamirdr@codevalue.net
http://www.TamirDresher.com.
Agenda
• Storage
• Blob
• Azure SQL Server
• Azure Tables
• HDInsight
Storage
Storage Prices
Types of information
Windows Azure Growing Global Presence
• Datacenters: North America, Europe, Asia Pacific
Storage SLA – 99.99%
• Allows up to 52.56 minutes of unavailability per year
http://azure.microsoft.com/en-us/support/legal/sla
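The 52.56-minute figure follows directly from the 99.99% availability number. A minimal sketch of the arithmetic, assuming a 365-day year; the class and method names are hypothetical and used only for illustration:

```csharp
using System;

public static class SlaMath
{
    // Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
    private const decimal MinutesPerYear = 365m * 24m * 60m;

    // Returns the yearly unavailability (in minutes) that an availability
    // percentage such as 99.99 still permits.
    public static decimal AllowedDowntimeMinutesPerYear(decimal availabilityPercent)
    {
        if (availabilityPercent < 0m || availabilityPercent > 100m)
            throw new ArgumentOutOfRangeException("availabilityPercent");
        return MinutesPerYear * (100m - availabilityPercent) / 100m;
    }

    public static void Main()
    {
        // 525,600 minutes * 0.01% = 52.56 minutes
        Console.WriteLine(AllowedDowntimeMinutesPerYear(99.99m));
    }
}
```

The same helper shows how quickly the budget shrinks: each extra "nine" divides the allowed unavailability by ten.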
AZURE BLOBS
What is a BLOB
• BLOB – Binary Large OBject
• Storage for any type of entity such as binary files and text
documents
• Distributed File Service (DFS)
– Scalability and High availability
• A BLOB is distributed across multiple servers and replicated
at least 3 times
Blob Storage Concepts
Blob Operations
REST
DEMO
Creating a Blob
BLOBS
• Block blobs – up to 200 GB in size
• Page blobs – up to 1 TB in size
• Total Account Capacity - 500 TB
• Pricing
– Storage capacity used
– Replication option (LRS, GRS, RA-GRS)
– Number of requests
– Data egress
– http://azure.microsoft.com/en-us/pricing/details/storage/
SQL AZURE
SQL Azure
• SQL Server in the cloud
• No administrative overheads
• High Availability
• Pay-as-you-grow pricing
• Familiar Development Model*
* Despite missing features and some limitations - http://msdn.microsoft.com/en-us/library/ff394115.aspx
DEMO
Creating and Using SQL Azure
SQL Azure – Pricing
Case Study - https://haveibeenpwned.com/
• http://www.troyhunt.com/2013/12/working-with-154-million-records-on.html
• How do I make querying 154 million email addresses as fast as
possible?
• If I want 100 GB of SQL Server and I want to hit it 10 million
times, it'll cost me $176 a month (now it's ~$20)
AZURE TABLES
Table Storage Concepts
Table Storage
• Not RDBMS
– No relationships between entities
– NoSQL
• An entity can have up to 255 properties and be up to 1 MB in size
• Mandatory properties for every entity
– PartitionKey & RowKey (the only indexed properties)
• Together they uniquely identify an entity
• The same RowKey can be used in different partitions
• They define the sort order
– Timestamp - Optimistic Concurrency
No Fixed Schema
Table Object Model
• ITableEntity interface – PartitionKey, RowKey, Timestamp, and ETag properties
– Implemented by TableEntity and DynamicTableEntity
// This class defines one additional property of integer type.
// Since it derives from TableEntity, it will be automatically
// serialized and deserialized.
public class SampleEntity : TableEntity
{
    public int SampleProperty { get; set; }
}
Sample – Inserting an Entity into a Table
// You will need the following using statements
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;
// Create the table client (storageAccount is a CloudStorageAccount created elsewhere).
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable peopleTable = tableClient.GetTableReference("people");
peopleTable.CreateIfNotExists();
// Create a new customer entity.
CustomerEntity customer1 = new CustomerEntity("Harp", "Walter");
customer1.Email = "Walter@contoso.com";
customer1.PhoneNumber = "425-555-0101";
// Create an operation to add the new customer to the people table.
TableOperation insertCustomer1 = TableOperation.Insert(customer1);
// Submit the operation to the table service.
peopleTable.Execute(insertCustomer1);
Retrieve
// Create the table client.
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable peopleTable = tableClient.GetTableReference("people");
// Retrieve the entity with partition key of "Smith" and row key of "Jeff"
TableOperation retrieveJeffSmith =
    TableOperation.Retrieve<CustomerEntity>("Smith", "Jeff");
// Execute the operation; Result is null when no matching entity exists.
CustomerEntity specificEntity =
    (CustomerEntity)peopleTable.Execute(retrieveJeffSmith).Result;
Table Storage – Important Points
• Azure Tables can store TBs of data
• Table operations are fast
• Tables are distributed – the PartitionKey defines the partition
– A table might be stored in different partitions on different storage
devices.
Pricing
Case Study - https://haveibeenpwned.com/
• How do I make querying 154 million email addresses as fast as
possible?
• foo@bar.com – the domain is the partition key and the alias is
the row key
• If I want 100 GB of storage and I want to hit it 10 million times,
it'll cost me $8 a month
• SQL Server would cost $176 a month, which is 22 times more expensive
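The key design from the case study (domain as PartitionKey, alias as RowKey) can be sketched as a small helper. This is an illustrative sketch only, not code from the case study: the class and method names are hypothetical, and lower-casing both parts is an assumption made so that lookups ignore case.

```csharp
using System;

public static class EmailKeys
{
    // Splits an address at the last '@':
    // Item1 = domain (PartitionKey), Item2 = alias (RowKey).
    public static Tuple<string, string> ToKeys(string email)
    {
        if (email == null) throw new ArgumentNullException("email");
        int at = email.LastIndexOf('@');
        if (at <= 0 || at == email.Length - 1)
            throw new FormatException("Expected alias@domain: " + email);

        string alias = email.Substring(0, at);
        string domain = email.Substring(at + 1);

        // Assumption: normalize case so the same address always maps to the same keys.
        return Tuple.Create(domain.ToLowerInvariant(), alias.ToLowerInvariant());
    }

    public static void Main()
    {
        var keys = ToKeys("foo@bar.com");
        // prints "PartitionKey=bar.com, RowKey=foo"
        Console.WriteLine("PartitionKey=" + keys.Item1 + ", RowKey=" + keys.Item2);
    }
}
```

With keys built this way, checking a single address becomes a point lookup on PartitionKey and RowKey, the only indexed properties of a table entity.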
HDINSIGHT
Hadoop in the cloud
• Hadoop on Azure Cloud
• Some Facts:
– Bing ingests > 7 petabytes a month
– The Twitter community generates over 1 terabyte of tweets every day
– Cisco predicts that by 2013 annual Internet traffic will reach 667 exabytes
Sources: The Economist, Feb ‘10; DBMS2; Microsoft Corp
MapReduce – The BigData Power
• Map – takes an input and outputs key/value pairs
(Key1,Value1)
(Key2,Value2)
:
:
(Keyn,Valuen)
MapReduce – The BigData Power
• Reduce – takes the group of values for each key and produces a new
group of values
Key1: [value1-1, value1-2…] -> [new_value1-1, new_value1-2…]
Key2: [value2-1, value2-2…] -> [new_value2-1, new_value2-2…]
:
KeyN: [valueN-1, valueN-2…] -> [new_valueN-1, new_valueN-2…]
MapReduce - How Does It Work?
So How Does It Work?
Finding common friends
• Facebook shows you how many common friends you have with
someone
• Facebook had 1,310,000,000 active users
with 130 friends on average (as of 01.01.2014)
• Calculating the mutual friends
Finding common friends
• We can represent the friend relationship as:
Someone -> [list of his/her friends]
• Note that the friend relationship is symmetrical
– If A is a friend of B, then B is a friend of A
Example of Friends file
• U1 -> U2 U3 U4
• U2 -> U1 U3 U4 U5
• U3 -> U1 U2 U4 U5
• U4 -> U1 U2 U3 U5
• U5 -> U2 U3 U4
Designing our MapReduce job - Mapper
• Each line from the file is an input line to the Mapper
• The Mapper outputs key-value pairs
• Key: (user, friend)
– Sorted, so friend might come before user
• Value: list of friends
• Sorting the key helps the reducer: identical pairs are delivered
together
Mapper Example – final result
Given the line: U1 -> U2 U3 U4
Mapper output:
(U1 U2) -> U2 U3 U4
(U1 U3) -> U2 U3 U4
(U1 U4) -> U2 U3 U4
Given the line: U2 -> U1 U3 U4 U5
Mapper output:
(U1 U2) -> U1 U3 U4 U5
(U2 U3) -> U1 U3 U4 U5
(U2 U4) -> U1 U3 U4 U5
(U2 U5) -> U1 U3 U4 U5
Given the line: U3 -> U1 U2 U4 U5
Mapper output:
(U1 U3) -> U1 U2 U4 U5
(U2 U3) -> U1 U2 U4 U5
(U3 U4) -> U1 U2 U4 U5
(U3 U5) -> U1 U2 U4 U5
Given the line: U4 -> U1 U2 U3 U5
Mapper output:
(U1 U4) -> U1 U2 U3 U5
(U2 U4) -> U1 U2 U3 U5
(U3 U4) -> U1 U2 U3 U5
(U4 U5) -> U1 U2 U3 U5
Given the line: U5 -> U2 U3 U4
Mapper output:
(U2 U5) -> U2 U3 U4
(U3 U5) -> U2 U3 U4
(U4 U5) -> U2 U3 U4
Designing our MapReduce job - Reducer
• The input for the reducer will be structured as:
(friend1, friend2) -> (friend1 friends) (friend2 friends)
• The reducer will find the intersection between the lists
• Output:
(friend1, friend2) -> (intersection of friend1 and friend2 friends)
Reducer Example
• Given the line: (U1 U2) -> (U1 U3 U4 U5) (U2 U3 U4)
– Reducer output: (U1 U2) -> (U3 U4)
• Given the line: (U1 U3) -> (U1 U2 U4 U5) (U2 U3 U4)
– Reducer output: (U1 U3) -> (U2 U4)
• Given the line: (U1 U4) -> (U1 U2 U3 U5) (U2 U3 U4)
– Reducer output: (U1 U4) -> (U2 U3)
• Given the line: (U2 U3) -> (U1 U2 U4 U5) (U1 U3 U4 U5)
– Reducer output: (U2 U3) -> (U1 U4 U5)
• Given the line: (U2 U4) -> (U1 U2 U3 U5) (U1 U3 U4 U5)
– Reducer output: (U2 U4) -> (U1 U3 U5)
• Given the line: (U2 U5) -> (U1 U3 U4 U5) (U2 U3 U4)
– Reducer output: (U2 U5) -> (U3 U4)
• Given the line: (U3 U4) -> (U1 U2 U3 U5) (U1 U2 U4 U5)
– Reducer output: (U3 U4) -> (U1 U2 U5)
• Given the line: (U3 U5) -> (U1 U2 U4 U5) (U2 U3 U4)
– Reducer output: (U3 U5) -> (U2 U4)
• Given the line: (U4 U5) -> (U1 U2 U3 U5) (U2 U3 U4)
– Reducer output: (U4 U5) -> (U2 U3)
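The mapper and reducer design can be tried locally before involving a cluster. The following is a minimal in-memory sketch, assuming the friends file holds one space-separated line per user with no arrow. It does not use the Hadoop SDK: the class name and the Run method are hypothetical, and a LINQ GroupBy stands in for the shuffle phase that Hadoop performs between map and reduce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonFriendsLocal
{
    // Map step: for "U1 U2 U3 U4" emit one (sorted pair, friend list) per friend.
    public static IEnumerable<KeyValuePair<string, string>> Map(string inputLine)
    {
        var parts = inputLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2) yield break;

        var user = parts[0];
        var friends = parts.Skip(1).ToArray();
        var friendList = string.Join(" ", friends);
        foreach (var friend in friends)
        {
            // Sort the pair so (U1 U2) and (U2 U1) produce the same key.
            var pair = new[] { user, friend };
            Array.Sort(pair, StringComparer.Ordinal);
            yield return new KeyValuePair<string, string>(string.Join(" ", pair), friendList);
        }
    }

    // Reduce step: intersect all friend lists that arrived under the same key.
    public static string Reduce(IEnumerable<string> friendLists)
    {
        var lists = friendLists.Select(l => l.Split(' ')).ToList();
        IEnumerable<string> common = lists[0];
        foreach (var list in lists.Skip(1))
            common = common.Intersect(list);
        return string.Join(" ", common);
    }

    // Stand-in for the shuffle phase: group the mapper output by key, then reduce each group.
    public static SortedDictionary<string, string> Run(IEnumerable<string> lines)
    {
        var result = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (var group in lines.SelectMany(Map).GroupBy(kv => kv.Key, kv => kv.Value))
            result[group.Key] = Reduce(group);
        return result;
    }

    public static void Main()
    {
        var lines = new[]
        {
            "U1 U2 U3 U4", "U2 U1 U3 U4 U5", "U3 U1 U2 U4 U5",
            "U4 U1 U2 U3 U5", "U5 U2 U3 U4"
        };
        foreach (var kv in Run(lines))
            Console.WriteLine("(" + kv.Key + ") -> (" + kv.Value + ")");
    }
}
```

Run on the five-line sample file, this yields nine pairs, for example (U1 U2) -> (U3 U4) and (U2 U3) -> (U1 U4 U5). If the input is not symmetrical, a pair may receive a single list, in which case this sketch returns that list unchanged.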
Creating C# MapReduce - Mapper
using System;
using System.Linq;
using Microsoft.Hadoop.MapReduce;

public class CommonFriendsMapper : MapperBase
{
    public override void Map(string inputLine, MapperContext context)
    {
        // Input line format: "<user> <friend1> <friend2> ..."
        var strings = inputLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (strings.Any())
        {
            var currentUser = strings[0];
            var friends = strings.Skip(1);
            foreach (var friend in friends)
            {
                // Sort the pair so (U1 U2) and (U2 U1) produce the same key.
                var keyArr = new[] { currentUser, friend };
                Array.Sort(keyArr);
                var key = String.Join(" ", keyArr);
                context.EmitKeyValue(key, string.Join(" ", friends));
            }
        }
    }
}
Creating C# MapReduce - Reducer
using System.Collections.Generic;
using System.Linq;
using Microsoft.Hadoop.MapReduce;

public class CommonFriendsReducer : ReducerCombinerBase
{
    public override void Reduce(string key,
                                IEnumerable<string> strings,
                                ReducerCombinerContext context)
    {
        // Each value is one user's space-separated friend list.
        var friendsLists = strings
            .Select(friendList => friendList.Split(' '))
            .ToList();
        // The friend relationship is symmetrical, so each pair key is
        // expected to arrive with exactly two lists.
        var intersection = friendsLists[0].Intersect(friendsLists[1]);
        context.EmitKeyValue(key, string.Join(" ", intersection));
    }
}
Creating C# MapReduce – Hadoop Job
HadoopJobConfiguration myConfig = new HadoopJobConfiguration();
myConfig.InputPath = "wasb:///example/data/friends/friends";
myConfig.OutputFolder = "wasb:///example/data/friends/output";
Environment.SetEnvironmentVariable("HADOOP_HOME", @"c:\hadoop");
Environment.SetEnvironmentVariable("Java_HOME", @"c:\hadoop\jvm");
var hadoop = Hadoop.Connect(clusterUri,
                            clusterUserName,
                            hadoopUserName,
                            clusterPassword,
                            azureStorageAccount,
                            azureStorageKey,
                            azureStorageContainer,
                            createContinerIfNotExist);
var jobResult =
    hadoop.MapReduceJob.Execute<CommonFriendsMapper, CommonFriendsReducer>(myConfig);
int exitCode = jobResult.Info.ExitCode; // (0 – success, otherwise – failure)
Pricing
A 10-node cluster that exists for 24 hours:
• Secure gateway node - free
• Head node - 15.36 USD per 24-hour day
• 1 data node - 7.68 USD per 24-hour day
• 10 data nodes - 76.80 USD per 24-hour day
• Total: 92.16 USD
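The total is the head node price plus the per-node price times the number of data nodes. A small sketch of that calculation, using the 2014 prices from this slide as inputs; the class and method names are hypothetical, and current Azure rates will differ:

```csharp
using System;

public static class ClusterCost
{
    // Daily cost = head node + (data node count * per-node rate).
    // The secure gateway node is free, so it does not appear in the sum.
    public static decimal DailyCost(int dataNodes, decimal headNodePerDay, decimal dataNodePerDay)
    {
        if (dataNodes < 0) throw new ArgumentOutOfRangeException("dataNodes");
        return headNodePerDay + dataNodes * dataNodePerDay;
    }

    public static void Main()
    {
        // 15.36 + 10 * 7.68 = 92.16 USD per 24-hour day
        Console.WriteLine(DailyCost(10, 15.36m, 7.68m));
    }
}
```

Because billing follows the lifetime of the cluster, deleting it once a job finishes is the main lever for keeping this number down.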
WRAP UP
Comparing the alternatives
• BLOB
– When to use: unstructured data, files
– Implications: application logic responsibility; consider using HDInsight (Hadoop)
• SQL Server
– When to use: structured relational data, ACID transactions, max 150 GB (500 GB in preview)
– Implications: SQL DML+DDL; could affect scalability; BI abilities; reporting
• Azure Tables
– When to use: structured data, loose schema, geo-replication (high DR), auto sharding
– Implications: OData, REST; application logic responsibility (multiple schemas)
What have we seen
• Azure Blobs
• Azure Tables
• Azure SQL Server
• HDInsight
What’s Next
• NoSQL – MongoDB, Cassandra, CouchDB, RavenDB
• Hadoop ecosystem – Hive, Pig, Sqoop, Mahout
• http://blogs.msdn.com/b/windowsazure/
• http://blogs.msdn.com/b/windowsazurestorage/
• http://blogs.msdn.com/b/bigdatasupport/
Presenter contact details
c: +972-52-4772946
t: @tamir_dresher
e: tamirdr@codevalue.net
b: TamirDresher.com
w: www.codevalue.net

 
.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir Dresher.Net december 2017 updates - Tamir Dresher
.Net december 2017 updates - Tamir Dresher
 
Testing time and concurrency Rx
Testing time and concurrency RxTesting time and concurrency Rx
Testing time and concurrency Rx
 
Building responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresherBuilding responsive application with Rx - confoo - tamir dresher
Building responsive application with Rx - confoo - tamir dresher
 
.NET Debugging tricks you wish you knew tamir dresher
.NET Debugging tricks you wish you knew   tamir dresher.NET Debugging tricks you wish you knew   tamir dresher
.NET Debugging tricks you wish you knew tamir dresher
 
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir DresherFrom Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
From Zero to the Actor Model (With Akka.Net) - CodeMash2017 - Tamir Dresher
 
Building responsive applications with Rx - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx  - CodeMash2017 - Tamir DresherBuilding responsive applications with Rx  - CodeMash2017 - Tamir Dresher
Building responsive applications with Rx - CodeMash2017 - Tamir Dresher
 
Debugging tricks you wish you knew - Tamir Dresher
Debugging tricks you wish you knew  - Tamir DresherDebugging tricks you wish you knew  - Tamir Dresher
Debugging tricks you wish you knew - Tamir Dresher
 
Rx 101 - Tamir Dresher - Copenhagen .NET User Group
Rx 101  - Tamir Dresher - Copenhagen .NET User GroupRx 101  - Tamir Dresher - Copenhagen .NET User Group
Rx 101 - Tamir Dresher - Copenhagen .NET User Group
 
Cloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir DresherCloud patterns - NDC Oslo 2016 - Tamir Dresher
Cloud patterns - NDC Oslo 2016 - Tamir Dresher
 
Reactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 ConferenceReactiveness All The Way - SW Architecture 2015 Conference
Reactiveness All The Way - SW Architecture 2015 Conference
 

Dernier

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 

Dernier (20)

GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 

Where is my Data? (In the Cloud) - Tamir Dresher

  • 1. Tamir Dresher Senior Software Architect May 19, 2014 Where is my Data? (In the Cloud)
  • 2. About Me • Software architect, consultant and instructor • Software Engineering Lecturer @ Ruppin Academic Center • Technology addict • 10 years of experience • .NET and Native Windows Programming @tamir_dresher tamirdr@codevalue.net http://www.TamirDresher.com.
  • 3. Agenda • Storage • Blob • Azure SQL Server • Azure Tables • HDInsight
  • 5. Storage
  • 7. Types of information
  • 8. Windows Azure Growing Global Presence • Datacenters: North America, Europe, Asia Pacific • Storage SLA – 99.99% (at most 52.56 minutes of downtime per year) • http://azure.microsoft.com/en-us/support/legal/sla
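The 52.56-minute figure on slide 8 follows directly from the SLA percentage. A minimal sketch of the arithmetic, assuming a 365-day year; the class and method names are illustrative and not part of any Azure SDK:

```csharp
using System;
using System.Globalization;

public static class SlaMath
{
    // Maximum downtime (in minutes) per 365-day year that still meets
    // the given availability percentage.
    public static double AllowedDowntimeMinutesPerYear(double availabilityPercent)
    {
        if (availabilityPercent < 0 || availabilityPercent > 100)
            throw new ArgumentOutOfRangeException(nameof(availabilityPercent));

        const double minutesPerYear = 365 * 24 * 60; // 525,600 minutes
        return minutesPerYear * (100.0 - availabilityPercent) / 100.0;
    }

    public static void Main()
    {
        double minutes = AllowedDowntimeMinutesPerYear(99.99);
        // Invariant culture keeps the decimal point independent of the machine locale.
        Console.WriteLine(minutes.ToString("F2", CultureInfo.InvariantCulture) + " minutes per year");
    }
}
```

The same function shows how quickly the budget grows with a weaker SLA: 99.9% allows ten times as much downtime as 99.99%.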
  • 10. What is a BLOB • BLOB – Binary Large OBject • Storage for any type of entity, such as binary files and text documents • Distributed File Service (DFS) – scalability and high availability • A blob is distributed across multiple servers and replicated at least 3 times
  • 14. BLOBS • Block blobs – up to 200 GB in size • Page blobs – up to 1 TB in size • Total account capacity – 500 TB • Pricing – Storage capacity used – Replication option (LRS, GRS, RA-GRS) – Number of requests – Data egress – http://azure.microsoft.com/en-us/pricing/details/storage/
  • 16. SQL Azure • SQL Server in the cloud • No administrative overhead • High availability • Pay-as-you-grow pricing • Familiar development model* * Despite missing features and some limitations – http://msdn.microsoft.com/en-us/library/ff394115.aspx
  • 17. DEMO Creating and Using SQL Azure
  • 18. SQL Azure – Pricing
  • 19. Case Study - https://haveibeenpwned.com/
  • 20. Case Study - https://haveibeenpwned.com/ • http://www.troyhunt.com/2013/12/working-with-154-million-records-on.html • How do I make querying 154 million email addresses as fast as possible? • If I want 100GB of SQL Server and I want to hit it 10 million times, it’ll cost me $176 a month (now it's ~$20)
  • 22. Table Storage Concepts
  • 23. Table Storage • Not an RDBMS – no relationships between entities – NoSQL • An entity can have up to 255 properties and be up to 1 MB in size • Mandatory properties for every entity – PartitionKey & RowKey (the only indexed properties) • Together they uniquely identify an entity • The same RowKey can be used in different partitions • They define the sort order – Timestamp – optimistic concurrency
  • 24. No Fixed Schema
  • 25. Table Object Model • ITableEntity interface – PartitionKey, RowKey, Timestamp, and ETag properties – Implemented by TableEntity and DynamicTableEntity
        // This class defines one additional property of integer type.
        // Because it derives from TableEntity, it is serialized and
        // deserialized automatically.
        public class SampleEntity : TableEntity
        {
            public int SampleProperty { get; set; }
        }
  • 26. Sample – Inserting an Entity into a Table
        // You will need the following using statements
        using Microsoft.WindowsAzure.Storage;
        using Microsoft.WindowsAzure.Storage.Table;

        // Create the table client.
        // storageAccount is a CloudStorageAccount created elsewhere (not shown on the slide).
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        CloudTable peopleTable = tableClient.GetTableReference("people");
        peopleTable.CreateIfNotExists();

        // Create a new customer entity.
        // CustomerEntity is not shown on the slide; it is assumed to derive from
        // TableEntity, with the last name as PartitionKey and the first name as RowKey.
        CustomerEntity customer1 = new CustomerEntity("Harp", "Walter");
        customer1.Email = "Walter@contoso.com";
        customer1.PhoneNumber = "425-555-0101";

        // Create an operation to add the new customer to the people table.
        TableOperation insertCustomer1 = TableOperation.Insert(customer1);

        // Submit the operation to the table service.
        peopleTable.Execute(insertCustomer1);
  • 27. Retrieve
        // Create the table client.
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        CloudTable peopleTable = tableClient.GetTableReference("people");

        // Retrieve the entity with partition key of "Smith" and row key of "Jeff"
        TableOperation retrieveJeffSmith = TableOperation.Retrieve<CustomerEntity>("Smith", "Jeff");

        // Retrieve entity (Result is null when no matching entity exists).
        CustomerEntity specificEntity = (CustomerEntity)peopleTable.Execute(retrieveJeffSmith).Result;
  • 28. Table Storage – Important Points • Azure Tables can store terabytes of data • Table operations are fast • Tables are distributed – PartitionKey defines the partition – A table might be stored in different partitions on different storage devices
  • 29. Pricing
  • 30. Case Study - https://haveibeenpwned.com/
  • 31. Case Study - https://haveibeenpwned.com/ • How do I make querying 154 million email addresses as fast as possible? • For foo@bar.com, the domain (bar.com) is the partition key and the alias (foo) is the row key • If I want 100GB of storage and I want to hit it 10 million times, it’ll cost me $8 a month • SQL Server will cost $176 a month – 22 times more expensive
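The key design on slide 31 can be sketched as a small helper that turns an email address into a (PartitionKey, RowKey) pair. This is an illustrative sketch, not code from the case study: the `BreachKeys` class and `FromEmail` method are hypothetical names, and lower-casing both parts is an added assumption so that lookups are case-insensitive.

```csharp
using System;

public static class BreachKeys
{
    // Splits an email address as described on the slide:
    // the domain becomes the PartitionKey and the alias becomes the RowKey.
    public static (string PartitionKey, string RowKey) FromEmail(string email)
    {
        if (email == null) throw new ArgumentNullException(nameof(email));

        int at = email.LastIndexOf('@');
        if (at <= 0 || at == email.Length - 1)
            throw new ArgumentException("Not a valid email address.", nameof(email));

        string alias = email.Substring(0, at);
        string domain = email.Substring(at + 1);

        // Assumption of this sketch: normalize case before using the values as keys.
        return (domain.ToLowerInvariant(), alias.ToLowerInvariant());
    }

    public static void Main()
    {
        var keys = FromEmail("foo@bar.com");
        Console.WriteLine($"PartitionKey={keys.PartitionKey}, RowKey={keys.RowKey}");
        // prints: PartitionKey=bar.com, RowKey=foo
    }
}
```

With both keys known, a lookup becomes a point query of the kind shown on slide 27 (`TableOperation.Retrieve<T>(partitionKey, rowKey)`), which hits the only two indexed properties of the table.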
  • 33. Hadoop in the cloud • Hadoop on Azure Cloud • Some facts: – Bing ingests > 7 petabytes a month – The Twitter community generates over 1 terabyte of tweets every day – Cisco predicts that by 2013 annual internet traffic will reach 667 exabytes • Sources: The Economist, Feb ‘10; DBMS2; Microsoft Corp
  • 34. MapReduce – The BigData Power • Map – takes input and outputs key/value pairs: (Key1, Value1), (Key2, Value2), …, (KeyN, ValueN)
  • 35. MapReduce – The BigData Power • Reduce – takes the group of values for each key and produces a new group of values
        Key1: [value1-1, value1-2, …] -> [new_value1-1, new_value1-2, …]
        Key2: [value2-1, value2-2, …] -> [new_value2-1, new_value2-2, …]
        …
        KeyN: [valueN-1, valueN-2, …] -> [new_valueN-1, new_valueN-2, …]
  • 36. MapReduce - How Does It Work?
  • 37. So How Does It Work?
  • 38. Finding common friends • Facebook shows you how many friends you have in common with someone • There were 1,310,000,000 active users on Facebook, with 130 friends on average (01.01.2014) • Calculating the mutual friends
  • 39. Finding common friends • We can represent a friend relationship as: Someone -> [list of his/her friends] • Note that a friend relationship is symmetrical – if A is a friend of B, then B is a friend of A
  • 40. Example of Friends file • U1 -> U2 U3 U4 • U2 -> U1 U3 U4 U5 • U3 -> U1 U2 U4 U5 • U4 -> U1 U2 U3 U5 • U5 -> U2 U3 U4
  • 41. Designing our MapReduce job • Each line of the file is an input line to the Mapper • The Mapper outputs key-value pairs • Key: (user, friend) – sorted, so friend might come before user • Value: the user's list of friends
  • 42. Designing our MapReduce job - Mapper • Each line of the file is an input line to the Mapper • The Mapper outputs key-value pairs • Key: (user, friend) – sorted, so friend might come before user • Value: the user's list of friends • Sorting the key helps the reducer: both entries for the same pair are provided to it together
  • 43. Mapper Example
        Given the line: U1 -> U2 U3 U4
        Mapper output:
          (U1 U2) -> U2 U3 U4
          (U1 U3) -> U2 U3 U4
          (U1 U4) -> U2 U3 U4
  • 44. Mapper Example
        Given the line: U1 -> U2 U3 U4
        Mapper output:
          (U1 U2) -> U2 U3 U4
          (U1 U3) -> U2 U3 U4
          (U1 U4) -> U2 U3 U4
        Given the line: U2 -> U1 U3 U4 U5
        Mapper output:
          (U1 U2) -> U1 U3 U4 U5
          (U2 U3) -> U1 U3 U4 U5
          (U2 U4) -> U1 U3 U4 U5
          (U2 U5) -> U1 U3 U4 U5
  • 45. Mapper Example – final result
        Given the line: U1 -> U2 U3 U4
        Mapper output:
          (U1 U2) -> U2 U3 U4
          (U1 U3) -> U2 U3 U4
          (U1 U4) -> U2 U3 U4
        Given the line: U2 -> U1 U3 U4 U5
        Mapper output:
          (U1 U2) -> U1 U3 U4 U5
          (U2 U3) -> U1 U3 U4 U5
          (U2 U4) -> U1 U3 U4 U5
          (U2 U5) -> U1 U3 U4 U5
        Given the line: U3 -> U1 U2 U4 U5
        Mapper output:
          (U1 U3) -> U1 U2 U4 U5
          (U2 U3) -> U1 U2 U4 U5
          (U3 U4) -> U1 U2 U4 U5
          (U3 U5) -> U1 U2 U4 U5
        Given the line: U4 -> U1 U2 U3 U5
        Mapper output:
          (U1 U4) -> U1 U2 U3 U5
          (U2 U4) -> U1 U2 U3 U5
          (U3 U4) -> U1 U2 U3 U5
          (U4 U5) -> U1 U2 U3 U5
        Given the line: U5 -> U2 U3 U4
        Mapper output:
          (U2 U5) -> U2 U3 U4
          (U3 U5) -> U2 U3 U4
          (U4 U5) -> U2 U3 U4
  • 46. Designing our MapReduce job - Reducer • The input for the reducer will be structured as: (friend1, friend2) -> (friend1 friends) (friend2 friends) • The reducer will find the intersection between the lists • Output: (friend1, friend2) -> (intersection of friend1 and friend2 friends)
  • 47. Reducer Example
        Given the line:                              Reducer output:
        (U1 U2) -> (U1 U3 U4 U5) (U2 U3 U4)          (U1 U2) -> (U3 U4)
        (U1 U3) -> (U1 U2 U4 U5) (U2 U3 U4)          (U1 U3) -> (U2 U4)
        (U1 U4) -> (U1 U2 U3 U5) (U2 U3 U4)          (U1 U4) -> (U2 U3)
        (U2 U3) -> (U1 U2 U4 U5) (U1 U3 U4 U5)       (U2 U3) -> (U1 U4 U5)
        (U2 U4) -> (U1 U2 U3 U5) (U1 U3 U4 U5)       (U2 U4) -> (U1 U3 U5)
        (U2 U5) -> (U1 U3 U4 U5) (U2 U3 U4)          (U2 U5) -> (U3 U4)
        (U3 U4) -> (U1 U2 U3 U5) (U1 U2 U4 U5)       (U3 U4) -> (U1 U2 U5)
        (U3 U5) -> (U1 U2 U4 U5) (U2 U3 U4)          (U3 U5) -> (U2 U4)
        (U4 U5) -> (U1 U2 U3 U5) (U2 U3 U4)          (U4 U5) -> (U2 U3)
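The map, group-by-key, and reduce steps worked through on slides 41 to 47 can be reproduced locally without a Hadoop cluster. The sketch below is an in-memory illustration only: `CommonFriendsLocal`, `Map`, `Reduce`, and `Run` are hypothetical names, LINQ's `GroupBy` stands in for Hadoop's shuffle phase, and the input lines are assumed to be plain space-separated user IDs (the user first, then the friends, with no `->` token).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CommonFriendsLocal
{
    // Map: one "user friend1 friend2 ..." line produces one
    // (sorted pair, user's friend list) entry per friend.
    public static IEnumerable<KeyValuePair<string, string>> Map(string line)
    {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2) yield break;

        string user = parts[0];
        string[] friends = parts.Skip(1).ToArray();
        foreach (string friend in friends)
        {
            // Sorting makes (U1 U2) and (U2 U1) the same key.
            string[] pair = { user, friend };
            Array.Sort(pair, StringComparer.Ordinal);
            yield return new KeyValuePair<string, string>(
                string.Join(" ", pair), string.Join(" ", friends));
        }
    }

    // Reduce: intersect the two friend lists that arrive for the same pair.
    public static string Reduce(IEnumerable<string> friendLists)
    {
        List<string[]> lists = friendLists.Select(l => l.Split(' ')).ToList();
        if (lists.Count < 2) return string.Empty; // only one side listed the friendship
        return string.Join(" ", lists[0].Intersect(lists[1]));
    }

    // Stand-in for the shuffle phase: group mapper output by key, then reduce each group.
    public static SortedDictionary<string, string> Run(IEnumerable<string> lines)
    {
        var result = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (var group in lines.SelectMany(Map).GroupBy(kv => kv.Key, kv => kv.Value))
            result[group.Key] = Reduce(group);
        return result;
    }

    public static void Main()
    {
        string[] friendsFile =
        {
            "U1 U2 U3 U4",
            "U2 U1 U3 U4 U5",
            "U3 U1 U2 U4 U5",
            "U4 U1 U2 U3 U5",
            "U5 U2 U3 U4",
        };

        foreach (var entry in Run(friendsFile))
            Console.WriteLine($"({entry.Key}) -> ({entry.Value})");
    }
}
```

For the five-user file from slide 40, this yields nine pairs, matching the reducer output table on slide 47, for example `(U1 U2) -> (U3 U4)`.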
  • 48. Creating C# MapReduce
  • 49. Creating C# MapReduce - Mapper
        using System;
        using System.Linq;
        using Microsoft.Hadoop.MapReduce;

        public class CommonFriendsMapper : MapperBase
        {
            public override void Map(string inputLine, MapperContext context)
            {
                // Input line: the user followed by his/her friends, separated by spaces.
                var strings = inputLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                if (strings.Any())
                {
                    var currentUser = strings[0];
                    var friends = strings.Skip(1);
                    foreach (var friend in friends)
                    {
                        // Sort the pair so that (U1 U2) and (U2 U1) produce the same key.
                        var keyArr = new[] { currentUser, friend };
                        Array.Sort(keyArr);
                        var key = String.Join(" ", keyArr);
                        context.EmitKeyValue(key, string.Join(" ", friends));
                    }
                }
            }
        }
  • 50. Creating C# MapReduce - Reduce
        using System.Collections.Generic;
        using System.Linq;
        using Microsoft.Hadoop.MapReduce;

        public class CommonFriendsReducer : ReducerCombinerBase
        {
            public override void Reduce(string key, IEnumerable<string> strings, ReducerCombinerContext context)
            {
                // Each value is one friend list; a pair key is expected to receive two of them.
                var friendsLists = strings
                    .Select(friendList => friendList.Split(' '))
                    .ToList();
                var intersection = friendsLists[0].Intersect(friendsLists[1]);
                context.EmitKeyValue(key, string.Join(" ", intersection));
            }
        }
  • 51. Creating C# MapReduce – Hadoop Job
        HadoopJobConfiguration myConfig = new HadoopJobConfiguration();
        myConfig.InputPath = "wasb:///example/data/friends/friends";
        myConfig.OutputFolder = "wasb:///example/data/friends/output";

        Environment.SetEnvironmentVariable("HADOOP_HOME", @"c:\hadoop");
        Environment.SetEnvironmentVariable("Java_HOME", @"c:\hadoop\jvm");

        // The cluster and storage settings below are defined elsewhere (not shown on the slide).
        var hadoop = Hadoop.Connect(clusterUri, clusterUserName, hadoopUserName, clusterPassword,
            azureStorageAccount, azureStorageKey, azureStorageContainer, createContinerIfNotExist);

        var jobResult = hadoop.MapReduceJob.Execute<CommonFriendsMapper, CommonFriendsReducer>(myConfig);
        int exitCode = jobResult.Info.ExitCode; // (0 – success, otherwise – failure)
  • 52. Pricing – 10-node cluster that exists for 24 hours: • Secure gateway node – free • Head node – 15.36 USD per 24-hour day • 1 data node – 7.68 USD per 24-hour day • 10 data nodes – 76.80 USD per 24-hour day • Total: 92.16 USD
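The total on slide 52 is the head node rate plus the data node rate times the number of data nodes. A minimal sketch of that arithmetic, using only the 2014 rates quoted on the slide; `ClusterCost` and `Cost` are illustrative names, and current HDInsight pricing is likely to differ:

```csharp
using System;
using System.Globalization;

public static class ClusterCost
{
    // Rates from the slide, in USD per node per 24-hour day. The secure gateway node is free.
    private const decimal HeadNodePerDay = 15.36m;
    private const decimal DataNodePerDay = 7.68m;

    // Cost of one head node plus the given number of data nodes for the given number of days.
    public static decimal Cost(int dataNodes, int days = 1)
    {
        if (dataNodes < 0) throw new ArgumentOutOfRangeException(nameof(dataNodes));
        if (days < 0) throw new ArgumentOutOfRangeException(nameof(days));
        return (HeadNodePerDay + DataNodePerDay * dataNodes) * days;
    }

    public static void Main()
    {
        decimal total = Cost(dataNodes: 10);
        Console.WriteLine(total.ToString("F2", CultureInfo.InvariantCulture) + " USD per day");
    }
}
```

Because the cluster is billed only while it exists, the `days` parameter is the main lever: a cluster that is created for a job and deleted afterwards costs a fraction of one left running all month.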
  • 54. Comparing the alternatives
        Storage type | When should you use it | Implications
        BLOB | Unstructured data; files | Application logic responsibility; consider using HDInsight (Hadoop)
        SQL Server | Structured relational data; ACID transactions; max 150GB (500GB in preview) | SQL DML + DDL; could affect scalability; BI abilities; reporting
        Azure Tables | Structured data; loose schema; geo-replication (high DR); auto sharding | OData, REST; application logic responsibility (multiple schemas)
  • 55. What have we seen • Azure Blobs • Azure Tables • Azure SQL Server • HDInsight
  • 56. What’s Next • NoSQL – MongoDB, Cassandra, CouchDB, RavenDB • Hadoop ecosystem – Hive, Pig, Sqoop, Mahout • http://blogs.msdn.com/b/windowsazure/ • http://blogs.msdn.com/b/windowsazurestorage/ • http://blogs.msdn.com/b/bigdatasupport/
  • 57. Presenter contact details c: +972-52-4772946 t: @tamir_dresher e: tamirdr@codevalue.net b: TamirDresher.com w: www.codevalue.net

Editor's Notes

  1. Slide objectives: understand the hierarchy of Blob storage.
     Speaker notes: The Blob service provides storage for entities, such as binary files and text files. The REST API for the Blob service exposes two resources: containers and blobs. A container is a set of blobs; every blob must belong to a container. The Blob service defines two types of blobs: block blobs, which are optimized for streaming, and page blobs, which are optimized for random read/write operations and which provide the ability to write to a range of bytes in a blob.
     Blobs can be read by calling the Get Blob operation. A client may read the entire blob, or an arbitrary range of bytes. Block blobs less than or equal to 64 MB in size can be uploaded by calling the Put Blob operation. Block blobs larger than 64 MB must be uploaded as a set of blocks, each of which must be less than or equal to 4 MB in size. Page blobs are created and initialized with a maximum size with a call to Put Blob. To write content to a page blob, you call the Put Page operation. The maximum size currently supported for a page blob is 1 TB.
     Notes: http://msdn.microsoft.com/en-us/library/dd573356.aspx
     Using the REST API for the Blob service, developers can create a hierarchical namespace similar to a file system. Blob names may encode a hierarchy by using a configurable path separator. For example, the blob names MyGroup/MyBlob1 and MyGroup/MyBlob2 imply a virtual level of organization for blobs. The enumeration operation for blobs supports traversing the virtual hierarchy in a manner similar to that of a file system, so that you can return a set of blobs that are organized beneath a group. For example, you can enumerate all blobs organized under MyGroup/.
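The virtual hierarchy described in this note is purely a naming convention, which a short in-memory sketch can make concrete. The code below does not call the Blob service; it only mimics, over a local list of names, what enumerating blobs with a prefix and a delimiter returns. `VirtualHierarchy` and `ListUnder` are hypothetical names, and the `Sub/` and `Other/` blob names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VirtualHierarchy
{
    // Returns the immediate children of a virtual folder: blobs directly under the
    // prefix, plus one entry per deeper "sub-folder" (kept with its trailing delimiter).
    public static List<string> ListUnder(IEnumerable<string> blobNames, string prefix, char delimiter = '/')
    {
        return blobNames
            .Where(name => name.StartsWith(prefix, StringComparison.Ordinal))
            .Select(name =>
            {
                // Look for the next delimiter after the prefix; if there is one,
                // the blob lives in a deeper level and is collapsed to that level.
                int next = name.IndexOf(delimiter, prefix.Length);
                return next < 0 ? name : name.Substring(0, next + 1);
            })
            .Distinct()
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        var names = new[]
        {
            "MyGroup/MyBlob1",
            "MyGroup/MyBlob2",
            "MyGroup/Sub/MyBlob3", // hypothetical deeper level
            "Other/MyBlob4",       // hypothetical sibling group
        };

        foreach (string entry in ListUnder(names, "MyGroup/"))
            Console.WriteLine(entry);
        // prints:
        // MyGroup/MyBlob1
        // MyGroup/MyBlob2
        // MyGroup/Sub/
    }
}
```

The container itself stays flat; only the choice of prefix and delimiter at listing time makes the names behave like folders.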
  2. Blob operations:
       Put Blob - Creates a new blob or replaces an existing blob within a container.
       Get Blob - Reads or downloads a blob from the system, including its metadata and properties.
       Delete Blob - Deletes a blob.
       Copy Blob - Copies a source blob to a destination blob within the same storage account.
       Snapshot Blob - Creates a read-only snapshot of a blob.
       Lease Blob - Establishes an exclusive one-minute write lock on a blob. To write to a locked blob, a client must provide a lease ID.
     Notes: The Blob service provides storage for entities, such as binary files and text files. The REST API for the Blob service exposes two resources: containers and blobs. A container is a set of blobs; every blob must belong to a container. The Blob service defines two types of blobs: block blobs, which are optimized for streaming (this type of blob is the only blob type available with versions prior to 2009-09-19), and page blobs, which are optimized for random read/write operations and which provide the ability to write to a range of bytes in a blob. Page blobs are available only with version 2009-09-19. Containers and blobs support user-defined metadata in the form of name-value pairs specified as headers on a request operation.
     Using the REST API for the Blob service, developers can create a hierarchical namespace similar to a file system. Blob names may encode a hierarchy by using a configurable path separator. For example, the blob names MyGroup/MyBlob1 and MyGroup/MyBlob2 imply a virtual level of organization for blobs. The enumeration operation for blobs supports traversing the virtual hierarchy in a manner similar to that of a file system, so that you can return a set of blobs that are organized beneath a group. For example, you can enumerate all blobs organized under MyGroup/.
     A block blob may be created in one of two ways. Block blobs less than or equal to 64 MB in size can be uploaded by calling the Put Blob operation. Block blobs larger than 64 MB must be uploaded as a set of blocks, each of which must be less than or equal to 4 MB in size. A set of successfully uploaded blocks can be assembled in a specified order into a single contiguous blob by calling Put Block List. The maximum size currently supported for a block blob is 200 GB.
     Page blobs are created and initialized with a maximum size with a call to Put Blob. To write content to a page blob, you call the Put Page operation. The maximum size currently supported for a page blob is 1 TB.
     Blobs support conditional update operations that may be useful for concurrency control and efficient uploading. Blobs can be read by calling the Get Blob operation. A client may read the entire blob, or an arbitrary range of bytes. For the Blob service API reference, see Blob Service API.
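The two upload paths for block blobs described in this note (a single Put Blob up to 64 MB, otherwise blocks of at most 4 MB) can be sketched as a planning function. This is an illustration of the size rules only and makes no service calls; `BlockPlanner` and `Plan` are hypothetical names, the limits are the ones quoted in the note (later service versions may differ), and 1 MB is assumed to mean 1024 × 1024 bytes.

```csharp
using System;
using System.Collections.Generic;

public static class BlockPlanner
{
    public const long MaxSinglePutBytes = 64L * 1024 * 1024; // 64 MB, per the note
    public const long MaxBlockBytes = 4L * 1024 * 1024;      // 4 MB, per the note

    // Returns the (offset, length) byte ranges to upload for a blob of the given size.
    // A blob that fits in one Put Blob call yields a single range; a larger blob is
    // split into blocks that would later be committed with Put Block List.
    public static List<(long Offset, long Length)> Plan(long blobSize)
    {
        if (blobSize < 0) throw new ArgumentOutOfRangeException(nameof(blobSize));

        var ranges = new List<(long Offset, long Length)>();
        if (blobSize <= MaxSinglePutBytes)
        {
            ranges.Add((0, blobSize));
            return ranges;
        }

        for (long offset = 0; offset < blobSize; offset += MaxBlockBytes)
            ranges.Add((offset, Math.Min(MaxBlockBytes, blobSize - offset)));
        return ranges;
    }

    public static void Main()
    {
        long size = 100L * 1024 * 1024; // a 100 MB file
        var plan = Plan(size);
        Console.WriteLine($"{plan.Count} blocks, last block is {plan[plan.Count - 1].Length} bytes");
        // prints: 25 blocks, last block is 4194304 bytes
    }
}
```

Uploading in blocks also means a failed transfer only needs to retry the affected block rather than the whole file, since nothing is visible until the block list is committed.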
  3. Replication options: Locally Redundant Storage (LRS), Geographically Redundant Storage (GRS), Read-Access Geographically Redundant Storage (RA-GRS)
  4. moshe@gmail.com, eli@gmail.com, me@gmail.com – were affected; yossi@walla.co.il – not affected
  6. Notes http://msdn.microsoft.com/en-us/library/dd573356.aspx