TOPIC
You will have
KUSTO and
nothing else!
A new way to interpret the term "Data
Engineer" is emerging. Is a frictionless
approach to this new mantra possible?
Thanks for collaboration
Who I am
@RiccardoZamana (personal)
@ZamanaRiccardo (work)
zama202
https://www.linkedin.com/in/riccardozamana/
RICCARDO ZAMANA
Summary
1. Understand KUSTO Engine (ADX Pro and Cons, Query
Processing and Concurrency)
2. How to use ADX to Kusto-mize data pipeline (Trigger2fill
& Rewrite Patterns)
3. The real role of the Data Engineer (CD/CI for the Data
Engineer, Git-ize Kusto statements, External Data integration)
“No more sessions
starting from 2022
Why this session?
1. Understand KUSTO Engine
ADX Pro and Cons, Query Processing and Concurrency
Kusto driven Customer base
The problem is CONFIDENCE!
Data engineers today want SQL because that is all they know!
But… after some KUSTO TUTORING…:
Customers start with historical analysis and then move to more and more real-time analysis
as their teams get comfortable with the service.
• LOG ANALYSIS: use Kusto to analyze unified logs, i.e. logs from on-premises systems and
different clouds
• IOT TELEMETRY ANALYSIS: mine telemetry data to find anomalies in asset utilization
• SALES INSIGHTS: understand customer behaviours, predict trends or spikes, and optimize
the go-to-market strategy
Azure Data Explorer overview
1. Capability for many data types,
formats, and sources
Structured (numbers), semi-structured
(JSON, XML), and free text
2. Batch or streaming ingestion
Use managed ingestion pipeline or
queue a request for pull ingestion
3. Compute and storage isolation
• Independent scale out / scale in
• Persistent data in Azure Blob Storage
• Caching for low-latency on compute
4. Multiple options to support
data consumption
Use out-of-the box tools and connectors
or use APIs/SDKs for custom solution
Data Lake
/ Blob
IoT
Ingested Data
Engine
Data
Management
Azure Data Explorer
Azure Storage
Event Hub
IoT Hub
Customer Data
Lake
Kafka Sink
Logstash Plugin
Event Grid
Azure Portal
Power BI
ADX Web UI
ODBC / JDBC Apps
Apps (Via API)
Logstash Plugin
Apps (Via API)
Create,
Manage
Stream
Batch
Grafana
Query,
Control Commands
Azure OSS Applications
Active Data
Connections
The role of ADX
Raw data DWH
Refined data
Real time
derived data
Data
comparison
and fast kpi
ADX
THREE KEY USERS IN ONE TOOL:
• IoT Developer (data check, rule engine for insights)
• Data engineer (data exploration/enrichment/manipulation.. Like
‘’Smandruppation’’?)
• Data scientist (data selection and … what else?)
How ADX is Organized
INSTANCE DATABASE SOURCES
DB Users/Apps
Ingestion URL
Querying URL
Cache storage
Blob storage
EXTERNAL
SOURCES
EXTERNAL
DESTINATIONS
IotHUB
EventHub
Storage
ADLS
SQL Server
many more…
Cluster
|___database 1
|   |___table 1
|   |   |___extent data
|   |   |   |___column 0
|   |   |   |   |___data blocks
|   |   |   |   |___policy: authorization; data retention...
|   |   |   |___column 1
|   |   |___schema, ordered list of fields
|   |   |___policy objects: authorization; data retention...
|   |___table 2
|   |___policy objects: authorization; data retention...
|___database 2
Why Kusto Is Fast, in a Nutshell
WHY IS KUSTO SO FAST?
• distributed structure
• stores the data in columnar form
• node cluster
• designed for data that is read-only,
rarely deleted, and never updated
Compared with SQL Server, Kusto's high query speed does not come from magic: it is a
tradeoff in data processing, gaining some features by giving up others.
Remember the old (but good) pricing calculator… and
now?
How is it composed inside?
1) Admin Node
2) Query Head
3) Data Node
4) Gateway Node
The four elements of a Kusto Table
1. Table Metadata
2. Extent Directory
3. Extent
4. Column Index
Data extent & Kusto Index
Data Extent (aka Data Shard)
• A Kusto data extent is kind of like a "mini Kusto table"
• columnar data subdivided into segments
• a Kusto query only needs to parse the columns
listed in its project section
• A project section is a must
Kusto Index
Three kinds of indexes:
 String column index: inverted term index stored as a
B-tree. This index grants Kusto a powerful
text-processing capability (similar to
ElasticSearch); the "contains" operator is way
faster than "like" in T-SQL.
 Numeric column (including DateTime and
TimeSpan) index: range-based forward index.
 Dynamic column index: inverted term index as a
B-tree; during data ingestion, the engine
enumerates all elements within the dynamic value
and forwards them to the index builder.
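As a small illustration of the string index at work (the Logs table and its columns here are hypothetical), term operators such as has hit the inverted index directly, while contains does a broader substring match yet still outruns T-SQL's like:

```kusto
// Hypothetical Logs table, for illustration only.
Logs
| where Message has "timeout"   // whole-term lookup via the inverted index
| summarize Timeouts = count() by Level
```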
Data Shards (Extents) and Column Store
When you ingest a small amount of data into a table twice, you will see the following 2
extents after ingestion.
.show table StormEvents extents
After a while, these extents will be merged into a single extent.
Merge Policy
This merge policy (settings) can be seen by
running the following command.
.show database db01 policy merge
{
  "PolicyName": "ExtentsMergePolicy",
  "EntityName": "[db01]",
  "Policy": {
    "RowCountUpperBoundForMerge": 16000000,
    "OriginalSizeMBUpperBoundForMerge": 0,
    "MaxExtentsToMerge": 100,
    "LoopPeriod": "01:00:00",
    "MaxRangeInHours": 24,
    "AllowRebuild": true,
    "AllowMerge": true,
    "Lookback": {
      "Kind": "Default",
      "CustomPeriod": null
    }
  },
  "ChildEntities": [
    "StormEvents"
  ]
}
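These settings can also be changed. As an illustrative sketch (the values simply restate the defaults shown above), a subset of the merge policy can be overridden with an alter command:

```kusto
// Illustrative: override part of db01's extents-merge policy.
.alter database db01 policy merge @'{"MaxExtentsToMerge": 100, "LoopPeriod": "01:00:00"}'
```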
A journey of Data Ingestion
Imagine you have a CSV log
file in hand and want to load
it to Kusto.
1. The ingest command arrives at the ADMIN NODE.
2. The admin node finds an available Data node and forwards the command.
3. The extent is created, and the new info is sent to the admin node.
4. The admin node adds the shard reference to the metadata and commits a new snapshot to the db data.
Data deletion
1. What happens when a data shard is deleted?
2. What if I am querying the data just before the
deletion command is executed?
3. Can I recover the deleted data by reverting metadata to a
previous version?
The only exception is the “data purge” command.
Remember: “With no regrets”.
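These questions hint at the answer: deletion is a metadata operation on the extent directory. A hedged sketch (the table name and retention window are hypothetical):

```kusto
// Detach all extents of MyTable older than 30 days.
// Queries that already hold a snapshot of the extent
// directory keep seeing the old data until they finish.
.drop extents <| .show table MyTable extents | where CreatedOn < ago(30d)
```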
Query processing
When you submit a query written in the
Kusto Query Language (KQL), the query
analyzer parses it into an Abstract
Syntax Tree (AST) and builds an initial
Relational Operators tree (RelOp tree).
It then finally builds a query plan as
follows.
The generated plan is eventually
translated into the distributed query
plan, which is a shard-level access
tree.
Kusto query execution
Script A, put where before the aggregation
UsageDaily
| where DateKey > 20190101
| summarize DailyUsage_sum = sum(DailyUsage) by DateKey
| order by DateKey desc
| take 10
Script B, put where after the aggregation
UsageDaily
| summarize DailyUsage_sum = sum(DailyUsage) by DateKey
| where DateKey > 20190101
| order by DateKey desc
| take 10
Which script will return its result first? Will script A take less time than script B? The result is almost the same. How
can that be?! Let's go deep and find out why.
Kusto query execution
Abstract Syntax Tree(AST) and Relational Operators
Tree(RelOp Tree)
• Parse the incoming script into an Abstract Syntax Tree (AST), and
perform a semantic pass over the AST.
• Check names: see if the referenced tables, functions, and pre-defined
variables exist in the database and query context.
• Verify the user has the permissions to access the relevant
entities.
• Check data types and references, e.g. is an int function dealing
with a string?
• After the semantic pass, the query engine will build an initial
Relational Operators tree (RelOp tree) based on the AST.
• Next, the Kusto engine will further attempt to optimize the
query by applying one or more predefined rewriting rules.
PAY ATTENTION:
• Aggregation ops are split down to the "leaf" extents.
• Top-n operators are replicated to each data extent.
After optimization, both Script A and Script B will
share a common RelOp tree like this:
Join or summarize internal strategy
What does ADX do when we ask for a join or a summarize?
Broadcast join strategy :
[If one of join sides is
significantly smaller than the
other]
Shuffled join strategy :
[If both join sides are large, it will
apply same partitioning scheme for
both sides]
Other :
[Both join sides are not so large]
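You can also steer the strategy yourself with join hints; a sketch against hypothetical tables:

```kusto
// Small right side: replicate it to every node.
FactSales
| join hint.strategy=broadcast (DimCity) on CityId

// Both sides large: shuffle both on the join key.
// FactSales | join hint.strategy=shuffle (FactReturns) on SaleId
```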
Partition
By default (when partitioning policy is not assigned),
extents are partitioned by ingest-time based
partitioning.
When you change the partitioning policy
for an existing table, clear the data and re-
ingest it all under the new partitioning
policy.
.alter table SalesLogs policy partitioning ```{
"PartitionKeys": [
{
"ColumnName": "City",
"Kind":"Hash",
"Properties": {
"Function": "XxHash64",
"MaxPartitionCount":128,
"Seed": 1,
"PartitionAssignmentMode":"Default"
}
}
]
}```
By setting this custom policy, the extents in this table will be re-
partitioned by the hash of City. This will be run in the
background process after data ingestion.
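To verify that the policy was assigned (table name as in the example above):

```kusto
.show table SalesLogs policy partitioning
```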
Other topics for data sharing and distributions
Querying a materialized view is more performant
than querying the source table, where the
aggregation would be performed on every query.
The result of a materialized view is always up-to-
date.
After a while, the background process will process
the "delta" and merge it into the "materialized part".
MATERIALIZED VIEW
MV is made of two components:
• A materialized part - an Azure Data Explorer table
holding aggregated records from the source table,
which have already been processed. This table
always holds a single record per the aggregation's
group-by combination.
• A delta - the newly ingested records in the source
table that haven't yet been processed.
.show materialized-view MaterializedViewName
.show materialized-view MaterializedViewName failures
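A minimal sketch of creating such a view, assuming a hypothetical Telemetry source table:

```kusto
// One row per DeviceId, always the latest record:
// the "materialized part" plus the unprocessed "delta".
.create materialized-view LastTelemetry on table Telemetry
{
    Telemetry
    | summarize arg_max(Timestamp, *) by DeviceId
}
```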
Other topics for data sharing and distributions
In Data Explorer, you can also use the leader-follower
pattern to distribute query workloads across
multiple clusters.
When a follower database in a different cluster is
attached to the original, "leader",
database, the follower database synchronizes
changes from the leader database. With a read-only
follower database, you can view the data of the leader
database from a different cluster. (The followers must
be in the same region as the leader.)
You can use this pattern for scale-out purposes in
large systems.
You can also specify different SKUs and caching
policies on follower clusters. You can distribute the
read query workload across multiple clusters, especially
when heavy ingestion workloads occur in the leader
database.
LEADER AND FOLLOWER
Kusto Limitations
1) Limit on query concurrency
You can estimate the max concurrent
number by
[Cores per node] x 10
You can also view the actual number by
running this Kusto command if you have
permission to run it.
.show cluster policy querythrottling
2) Limit on the node memory
Your Kusto administrator may cap the maximum
memory usage; you can override it per query with a set option.
set max_memory_consumption_per_query_per_node=68719476736;
MyTable | ...
.show queries
| where StartedOn > ago(1d)
| extend MemoryPeak = tolong(ResourcesUtilization.MemoryPeak)
| project StartedOn, CommandType, ClientActivityId, TotalCpu,
MemoryPeak
| top 10 by MemoryPeak
Kusto Limitations
3) Limit on memory per iterator
Whenever there is a join or summarize, the
Kusto engine uses a pull iterator to fulfill the
request (the limit is set to 5 GB).
You can increase this value up to half of the
physical memory of the node.
set maxmemoryconsumptionperiterator=68719476736;
MyTable | ...
If your query hits this limitation, you may see an
error message “…exceeded memory budget…”.
4) Limit on result set size
You will hit this limitation when your query's result
exceeds 500,000 rows or 64 MB of data.
If your script hits this limitation, you will see an error
message containing "partial query failure".
To solve or avoid this limitation, you can:
• summarize the data to output only interesting
results
• use a take operator to see a small sample of the
result
• use the project operator to output only the columns
you need
What is MILLIBYTE?
BUT… if you insist on outputting the data and copying it
to Excel, you can use these commands to raise the limits:
set truncationmaxsize=1048576;
set truncationmaxrecords=100000;
MyTable | where User=="UserId1"
Kusto Limitations
5) Limit on query complexity
Usually, you won't hit this limitation unless your Kusto query is extremely complex: for example, you have 5,000
conditions in the where clause.
T
| where Column == "value1" or
Column == "value2" or
.... or
Column == "valueN"
Each query is transformed into a RelOp tree; if the tree depth exceeds the threshold, you hit the limitation. You
can rewrite the script logic to solve it.
T
| where Column in ("value1", "value2".... "valueN")
What ADX isn’t optimal for / stretch scenarios
Since we do not own the hardware the workloads are running on, we do not have to get married to one technology and
run everything on it to amortise the cost of said hardware / licence. We can use the best tool for the job.
Scenario: Data warehouse
Why: It isn't transactional, doesn't have log journals, etc. This is part of the reason it is so fast, but also part of the reason it is a poor fit for a data warehouse.
Azure PaaS alternatives: Azure Synapse & Power BI Premium

Scenario: Application back end
Why: ADX isn't built as a transactional workload.
Azure PaaS alternatives: Cosmos DB, Azure SQL DB, Azure PostgreSQL, Azure MySQL, Azure MariaDB

Scenario: Machine Learning (ML) training
Why: Even if ADX supports some built-in ML algorithms, it isn't an ML training platform.
Azure PaaS alternatives: Azure ML, Spark (Azure Databricks or Azure HDInsight), Azure Batch & Data Science Virtual Machine (DSVM)

Scenario: Sub-second streaming
Why: ADX can go as low as seconds of latency in ingesting data and still do analytics. Most "near real time" scenarios fall comfortably within that window.
Azure PaaS alternatives: Structured Streaming in Continuous Mode in Spark (Azure Databricks or Azure HDInsight), Kafka Streams on Azure HDInsight, Flink on Azure HDInsight
Some ADX/SDX consideration
The way they imagine our data-world
1. How many languages do I need?
SQL, PYTHON, KUSTO?
2. How many services are using
KUSTO?
3. How can you use Kusto to
manage /troubleshoot caveats
within Azure Solutions?
Key Differences with SYNAPSE DATA EXPLORER POOL
Category | Capability | Azure Data Explorer | Synapse Data Explorer | Winner
Security | VNET | Supports VNet injection and Azure Private Link | Azure Private Link support automatically integrated as part of the Synapse managed VNet | TIE
Security | CMK | ✓ | Automatically inherited from the Synapse workspace configuration | TIE
Security | Firewall | ✗ | Automatically inherited from the Synapse workspace configuration | TIE
Business continuity | Availability Zones | Optional | Enabled by default where Availability Zones are available | ADX
SKU | Compute options | 22+ Azure VM SKUs to choose from | Simplified to Synapse workload-type SKUs | ADX
Integrations | Built-in ingestion pipelines | Event Hub, Event Grid, IoT Hub | Event Hub, Event Grid, and IoT Hub supported via the Azure portal for non-managed VNet | ADX
Integrations | Spark integration | Azure Data Explorer linked service: built-in Kusto Spark integration with support for Azure Active Directory pass-through authentication, Synapse Workspace MSI, and Service Principal | Built-in Kusto Spark connector integration with support for Azure Active Directory pass-through authentication, Synapse Workspace MSI, and Service Principal | TIE
Integrations | KQL artifacts management | ✗ | Save KQL queries and integrate with Git | SYN?
Integrations | Metadata sync | ✗ | ✗ | TIE
Features | KQL queries | ✓ | ✓ | TIE
Features | API and SDKs | ✓ | ✓ | TIE
Features | Connectors | ✓ | ✓ | TIE
Features | Query tools | ✓ | ✓ | TIE
Pricing | Business model | Cost-plus billing model | VCore billing model with two meters: VCore and Storage | ADX
Delta Kusto - CI/CD for Azure Data Explorer (ADX)
WHAT IS DELTA KUSTO?
A command-line interface (CLI) enabling CI/CD automation with Kusto objects (e.g.
tables, functions, policies, security roles, etc.)
It can work on a single database, multiple databases, or an entire cluster. It also
supports multi-tenant scenarios.
• single-file executable available on both Windows and Linux
• accepts the path to a parameter YAML file instructing Delta
Kusto on what job to perform.
• A single call to Delta Kusto can run multiple jobs.
• enables change management on multi-tenant solutions
within Azure Data Explorer.
Delta Kusto - CI/CD for Azure Data Explorer (ADX)
HOW DOES DELTA KUSTO WORK?
Delta Kusto parses scripts and/or loads a database configuration
into a database model.
It can then compare two models to compute a delta.
This approach might seem overkill when considering functions,
for instance, where a simple create-or-alter can overwrite a function.
It does offer some advantages though:
• Computes a minimal set of delta commands, since it
doesn't need to create-or-alter everything just in case
• Detects drops (e.g. table columns) and can treat them as such
• Can do an offline delta, i.e. compare two scripts without any
Kusto runtime involved.
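A minimal sketch of a Delta Kusto parameter file, assuming one job whose current state is a live ADX database and whose target is a KQL script (the cluster URI, database, and file paths are hypothetical):

```yaml
# Illustrative parameter file: compute the delta needed to bring
# db01 in line with the scripted target state.
jobs:
  sync-db01:
    current:
      adx:
        clusterUri: https://mycluster.westeurope.kusto.windows.net
        database: db01
    target:
      scripts:
        - filePath: kql/target-state.kql
    action:
      filePath: delta.kql   # generated delta commands are written here
```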
GIT-IZE KUSTO Statements
REQUIREMENT
We hit issues where a developer would make a mistake directly
editing a function and mess up our production
assets.
SOLUTION
• Sync Kusto lets the user pick either the local file system or a
Kusto database as either the source or the target.
• The Compare button checks both schemas and determines
the delta between the source and the target.
• After viewing the differences, the user can put a checkmark
next to the ones they want to publish and then press the
Update button.
• Visualize the differences between the source and the
target before updating the target.
This tool is now available for everyone on GitHub: https://github.com/microsoft/synckusto.
How to use ADX to Kusto-
mize data pipeline
Trigger2fill & Rewrite Patterns
Trigger2Fill and ReWrite Pattern
You can:
 Send daily reports containing tables and charts.
 Set notifications based on query results.
 Schedule control commands on clusters.
 Export and import data between Azure Data Explorer and other databases.
[Diagram] Trigger2Fill pattern: new data stream, stream ingestion, RawTables, Logic App running Kusto queries, batch ingestion, Blob Storage.
[Diagram] ReWrite pattern: Refined Tables, continuous export, Blob Storage, batch ingestion.
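The ReWrite pattern's export leg can be sketched with a continuous-export definition (the external table, table names, and interval are hypothetical; continuous export requires the external table to exist first):

```kusto
// Illustrative: periodically export refined rows to an
// external table backed by Blob Storage.
.create-or-alter continuous-export RefinedExport
over (RefinedTable)
to table ExternalBlobTable
with (intervalBetweenRuns=1h)
<| RefinedTable
```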
DEMO
NO EXCUSES… NOW IT’S FREE!
• Microsoft account or an
Azure Active Directory
user
• No Azure subscription or
a credit card needed!
Setting | Suggested value | Description
Cluster display name | MyFreeCluster | The display name for your cluster. A unique cluster name is generated as part of the deployment, and the domain name [region].kusto.windows.net is appended to it.
Database name | MyDatabase | The name of the database to create. The name must be unique within the cluster.
Select location | Europe | The location where the cluster will be created.
FREE CLUSTER FEATURES
With FREE
Item Value
Storage (uncompressed) ~100 GB
Databases Up to 10
Tables per database Up to 100
Columns per table Up to 200
Materialized views per database Up to 5
Only with FULL
• External tables
• Continuous export
• Workload groups
• Purge
• Follower clusters
• Partitioning policy
• Streaming ingestion
• Python and R plugins
• Enterprise readiness (Customer managed keys, VNet,
disk encryption, managed identities)
• Autoscale
• Azure Monitor and Insights
• Event Hub and Event Grid connectors
The real role of the Data
Engineer
… and some fun @work
The real Role of Data Engineer
What is Data Engineering?
Data engineering is the practice of designing and building systems for collecting, storing, and analyzing data at scale.
What does a data engineer do?
Data engineers work in a variety of settings to build systems that collect, manage, and convert raw data into usable information for data scientists
and business analysts to interpret. Their goal is to make data accessible so that organizations can use it to evaluate and optimize their performance.
What are some common tasks of the Data Engineer?
• Acquire datasets that align with business needs
• Develop algorithms to transform data into useful, actionable information
• Build, test, and maintain database pipeline architectures
• Collaborate with management to understand company objectives
• Create new data validation methods and data analysis tools
• Ensure compliance with data governance and security policies
What's the difference between a data analyst and a data engineer?
Data scientists and data analysts analyze data sets to glean knowledge and insights; data engineers build the systems that collect and prepare that data.
Data Engineer career path – 1 of 4
Learn the fundamentals of cloud computing, coding skills, and database design as a starting
point for a career in data science.
 Coding: Proficiency in coding languages is essential to this role
 Relational and non-relational databases
 ETL (extract, transform, and load) systems
 Data storage: data lake or DWH?
 Automation and scripting: You should be able to write scripts to automate repetitive tasks.
 Machine learning: it can be helpful to have a grasp of the basic concepts to better
understand the needs of data scientists on your team.
 Big data tools: Data engineers are often tasked with managing big data (Hadoop,
MongoDB, and Kafka).
 Cloud computing. You’ll need to understand cloud storage and cloud computing as
companies increasingly trade physical servers for cloud services.
 Data security: many data engineers are still tasked with securely managing and storing data
to protect it from loss.
1. Develop your data engineering skills
Data Engineer career path – 2 of 4
2. Get certified.
A certification can validate your skills to potential employers
and preparing for a certification exam is an excellent way to
develop your skills and knowledge.
If you notice a particular certification is frequently listed as
required or recommended, that might be a good place to
start.
Data Engineer career path – 3 of 4
3. Build a portfolio of data engineering projects.
You can add data engineering projects you've completed independently or as part of
coursework to a portfolio website.
Alternatively, post your work to the Projects section of your LinkedIn profile or to a site
like GitHub.
Brush up on your big data skills with a portfolio-ready Guided Project that you can
complete in under two hours.
Data Engineer career path – 4 of 4
4. Start with an entry-level position.
Many data engineers start off in entry-level roles, such as business intelligence
analyst or database administrator.
As you gain experience, you can pick up new skills and qualify for more
advanced roles.
ADX ‘’WOW’’ PLUGINS – COSMOS DB CALLOUT
Enrich Telemetry with Cosmos DB
cosmosdb_sql_request plugin
Why does this plugin exist?
The cosmosdb_sql_request plugin sends a SQL query
to a Cosmos DB SQL network endpoint and returns the
results of the query. This plugin is primarily designed
for querying small datasets, for example, enriching
data with reference data stored in Azure Cosmos DB.
The plugin is invoked with the evaluate operator.
Syntax
evaluate cosmosdb_sql_request ( ConnectionString ,
SqlQuery [, SqlParameters [, Options]] )
Argument name | Description | Required/optional
ConnectionString | A string literal indicating the connection string that points to the Cosmos DB collection to query. It must include AccountEndpoint, Database, and Collection. It may include AccountKey if a master key is used for authentication. Example: 'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;' | Required
SqlQuery | A string literal indicating the query to execute. | Required
SqlParameters | A constant value of type dynamic that holds key-value pairs to pass as parameters along with the query. Parameter names must begin with @. | Optional
Options | A constant value of type dynamic that holds more advanced settings as key-value pairs. | Optional

Supported options:
armResourceId | Retrieve the API key from the Azure Resource Manager. Example: /subscriptions/a0cd6542-7eaf-43d2-bbdd-b678a869aad1/resourceGroups/cosmoddbresourcegrouput/providers/Microsoft.DocumentDb/databaseAccounts/cosmosdbacc
token | Provide the Azure AD access token used to authenticate with the Azure Resource Manager.
preferredLocations | Control which region the data is queried from. Example: ['East US']
IMPORTANT: Set the callout policy !!
[
{
"CalloutType": "CosmosDB",
"CalloutUriRegex":
"my_endpoint1.documents.azure.com",
"CanCall": true
},
{
"CalloutType": "CosmosDB",
"CalloutUriRegex":
"my_endpoint2.documents.azure.com",
"CanCall": true
}
]
.alter cluster policy callout @'[{"CalloutType": "cosmosdb",
"CalloutUriRegex": ".documents.azure.com", "CanCall":
true}]'
Example: Query Cosmos DB
The following example uses the cosmosdb_sql_request plugin to send a SQL query to
fetch data from Cosmos DB using its SQL API.
evaluate cosmosdb_sql_request(
'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;',
'SELECT * from c')
Example: Query Cosmos DB with parameters
The following example uses SQL query parameters and queries the data from an
alternate region. For more information, see preferredLocations.
evaluate cosmosdb_sql_request(
'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;',
"SELECT c.id, c.lastName, @param0 as Column0 FROM c WHERE c.dob >= '1970-01-01T00:00:00Z'",
dynamic({'@param0': datetime(2019-04-16 16:47:26.7423305)}),
dynamic({'preferredLocations': ['East US']}))
| where lastName == 'Smith'
ADX ‘’WOW’’ PLUGINS – HTTPS CALL
MAKE INFERENCES WITH HTTPS
PLUGIN
http_request plugin / http_request_post plugin
Why do these plugins exist?
The http_request (GET) and http_request_post
(POST) plugins send an HTTP request and convert
the response into a table, so you can retrieve an
external elaboration and merge it with your dataset.
Syntax
evaluate http_request ( Uri [, RequestHeaders [,
Options]] )
evaluate http_request_post ( Uri [, RequestHeaders
[, Options [, Content]]] )
Name | Type | Required | Description
Uri | string | ✓ | The destination URI for the HTTP or HTTPS request.
RequestHeaders | dynamic | | A property bag containing HTTP headers to send with the request.
Options | dynamic | | A property bag containing additional properties of the request.
Content | string | | The body content to send with the request. The content is encoded in UTF-8, and the media type for the Content-Type attribute is application/json.
WHY IS … SO DIFFICULT?
Returns
Both plugins return a table that has a single record with the following dynamic columns:
• ResponseHeaders: A property bag with the response header.
• ResponseBody: The response body parsed as a value of type dynamic.
Prerequisites
1. CALLOUT POLICY
2. USE HTTPS
Authentication
Argument Description
Uri The URI to authenticate with.
RequestHeaders Using the HTTP standard Authorization header or any custom header supported by the web service.
Options Using the HTTP standard Authorization header.
If you want to use Azure Active Directory (Azure AD) authentication, you must use an HTTPS URI for the request and set the following
values:
* azure_active_directory to Active Directory Integrated
* AadResourceId to the Azure AD ResourceId value of the target web service.
WARNING, WARNING, WARNING !!!!
SECRET INFORMATION MUST BE REALLY SECRET!!!
• Be extra careful not to send secret information, such as authentication tokens, over HTTP connections.
• If the query includes confidential information, make sure that the relevant parts of the query text are obfuscated so that
they'll be omitted from any tracing.
• Use obfuscated string literals !!!
HEADERS vs HEADACHE
The RequestHeaders argument can be used to add custom headers to the outgoing HTTP request. In addition to the standard
HTTP request headers and the user-provided custom headers, the plugin also adds the following custom headers:
Name Description
x-ms-client-request-id A correlation ID that identifies the request.
x-ms-readonly A flag indicating that the processor of this request shouldn't make any persistent changes.
READ <> READWRITE PERMISSION
The x-ms-readonly flag is set for every HTTP request sent by the plugin that was triggered by a query and not a control
command.
HTTPS PLUGIN: An Example
EXAMPLE NO.1
evaluate
http_request('http://services.groupkt.com/country/get/all')
| project CC=ResponseBody.RestResponse.result
| mv-expand CC limit 10000
| project
name = tostring(CC.name),
alpha2_code = tostring(CC.alpha2_code),
alpha3_code = tostring(CC.alpha3_code)
| where name startswith 'b'
EXAMPLE NO.2
let uri='https://example.com/node/js/on/eniac';
let headers=dynamic({'x-ms-correlation-vector':'abc.0.1.0'});
let options=dynamic({'Authentication':'Active Directory Integrated',
'AadResourceId':'https://eniac.to.the.max.example.com/'});
evaluate http_request_post(uri, headers, options)
Etc etc etc
evaluate http_request_post ( Uri [, RequestHeaders [, Options [, Content]]] )
RESULT
name alpha2_code alpha3_code
Bahamas BS BHS
Bahrain BH BHR
Bangladesh BD BGD
WHERE IS ADX.. IN THIS TYPICAL USE CASE?
“Let your data drive.
But.. Sir… Data driven or data informed?
Thanks
Questions?
zama202 @RiccardoZamana
@ZamanaRiccardo
https://www.linkedin.
com/in/riccardozama
na/

Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMark Kromer
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...Dave Stokes
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersDave Stokes
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksGrega Kespret
 
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...Codemotion
 
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the CloudFSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the CloudAmazon Web Services
 
cPanel now supports MySQL 8.0 - My Top Seven Features
cPanel now supports MySQL 8.0 - My Top Seven FeaturescPanel now supports MySQL 8.0 - My Top Seven Features
cPanel now supports MySQL 8.0 - My Top Seven FeaturesDave Stokes
 
MongoDB What's new in 3.2 version
MongoDB What's new in 3.2 versionMongoDB What's new in 3.2 version
MongoDB What's new in 3.2 versionHéliot PERROQUIN
 
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)Sudhir Mallem
 
Steps towards business intelligence
Steps towards business intelligenceSteps towards business intelligence
Steps towards business intelligenceAhsan Kabir
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptxAndrew Lamb
 
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax
 
Data Warehousing AWS 12345
Data Warehousing AWS 12345Data Warehousing AWS 12345
Data Warehousing AWS 12345AkhilSinghal21
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Rajesh Kumar
 
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...Karthik K Iyengar
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarinn5712036
 

Similaire à KUSTO and the New Role of Data Engineer (20)

Msbi Architecture
Msbi ArchitectureMsbi Architecture
Msbi Architecture
 
Cassandra data modelling best practices
Cassandra data modelling best practicesCassandra data modelling best practices
Cassandra data modelling best practices
 
Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for Developers
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
 
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...
Search on the fly: how to lighten your Big Data - Simona Russo, Auro Rolle - ...
 
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the CloudFSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
 
cPanel now supports MySQL 8.0 - My Top Seven Features
cPanel now supports MySQL 8.0 - My Top Seven FeaturescPanel now supports MySQL 8.0 - My Top Seven Features
cPanel now supports MySQL 8.0 - My Top Seven Features
 
MongoDB What's new in 3.2 version
MongoDB What's new in 3.2 versionMongoDB What's new in 3.2 version
MongoDB What's new in 3.2 version
 
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
Interactive SQL POC on Hadoop (Hive, Presto and Hive-on-Tez)
 
Steps towards business intelligence
Steps towards business intelligenceSteps towards business intelligence
Steps towards business intelligence
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptx
 
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
DataStax | DSE Search 5.0 and Beyond (Nick Panahi & Ariel Weisberg) | Cassand...
 
MCT Virtual Summit 2021
MCT Virtual Summit 2021MCT Virtual Summit 2021
MCT Virtual Summit 2021
 
Data Warehousing AWS 12345
Data Warehousing AWS 12345Data Warehousing AWS 12345
Data Warehousing AWS 12345
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture
 
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarin
 

Plus de Riccardo Zamana

Copilot Prompting Toolkit_All Resources.pdf
Copilot Prompting Toolkit_All Resources.pdfCopilot Prompting Toolkit_All Resources.pdf
Copilot Prompting Toolkit_All Resources.pdfRiccardo Zamana
 
Data saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewData saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewRiccardo Zamana
 
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...Riccardo Zamana
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Riccardo Zamana
 
Azure Industrial Iot Edge
Azure Industrial Iot EdgeAzure Industrial Iot Edge
Azure Industrial Iot EdgeRiccardo Zamana
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADXRiccardo Zamana
 
Azure satpn19 time series analytics with azure adx
Azure satpn19   time series analytics with azure adxAzure satpn19   time series analytics with azure adx
Azure satpn19 time series analytics with azure adxRiccardo Zamana
 
Industrial iot: dalle parole ai fatti
Industrial iot: dalle parole ai fatti Industrial iot: dalle parole ai fatti
Industrial iot: dalle parole ai fatti Riccardo Zamana
 
Azure dayroma java, il lato oscuro del cloud
Azure dayroma   java, il lato oscuro del cloudAzure dayroma   java, il lato oscuro del cloud
Azure dayroma java, il lato oscuro del cloudRiccardo Zamana
 
Industrial Iot - IotSaturday
Industrial Iot - IotSaturday Industrial Iot - IotSaturday
Industrial Iot - IotSaturday Riccardo Zamana
 

Plus de Riccardo Zamana (12)

Copilot Prompting Toolkit_All Resources.pdf
Copilot Prompting Toolkit_All Resources.pdfCopilot Prompting Toolkit_All Resources.pdf
Copilot Prompting Toolkit_All Resources.pdf
 
Data saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewData saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overview
 
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...
Time series Analytics - a deep dive into ADX Azure Data Explorer @Data Saturd...
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020
 
Azure Industrial Iot Edge
Azure Industrial Iot EdgeAzure Industrial Iot Edge
Azure Industrial Iot Edge
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADX
 
Azure satpn19 time series analytics with azure adx
Azure satpn19   time series analytics with azure adxAzure satpn19   time series analytics with azure adx
Azure satpn19 time series analytics with azure adx
 
Industrial iot: dalle parole ai fatti
Industrial iot: dalle parole ai fatti Industrial iot: dalle parole ai fatti
Industrial iot: dalle parole ai fatti
 
Azure dayroma java, il lato oscuro del cloud
Azure dayroma   java, il lato oscuro del cloudAzure dayroma   java, il lato oscuro del cloud
Azure dayroma java, il lato oscuro del cloud
 
Industrial Iot - IotSaturday
Industrial Iot - IotSaturday Industrial Iot - IotSaturday
Industrial Iot - IotSaturday
 
Azure reactive systems
Azure reactive systemsAzure reactive systems
Azure reactive systems
 
Industrial IoT on azure
Industrial IoT on azureIndustrial IoT on azure
Industrial IoT on azure
 

Dernier

NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGIThomas Poetter
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 

Dernier (20)

NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 

KUSTO and the New Role of Data Engineer

  • 1. 1 TOPIC You will have KUSTO and nothing else! A new way to interpret the word "Data Engineer" is rising up. Is a frictionless approach to this new mantra possible?
  • 3. Who I am @RiccardoZamana (personal) @ZamanaRiccardo (work) zama202 https://www.linkedin.com/in/riccardozamana/ RICCARDO ZAMANA
  • 4. Summary 1. Understand KUSTO Engine (ADX Pro and Cons, Query Processing and Concurrency) 2. How to use ADX to Kusto-mize data pipeline (Trigger2fill & Rewrite Patterns) 3. The real role of the Data Engineer (CD/CI for the Data Engineer, Git-ize Kusto statements, External Data integration)
  • 5. “No more sessions starting from 2022.” So why is there this session?
  • 6. 1. Understand KUSTO Engine ADX Pro and Cons, Query Processing and Concurrency
  • 7. Kusto driven Customer base The problem is CONFIDENCE! Data Engineers today want SQL because that is all they know! But… after some KUSTO TUTORING… customers start with historical analysis and then move to more and more real-time analysis as their teams get comfortable with the service. • LOG ANALYSIS: use Kusto to analyze unified logs, i.e. logs from on-premises systems and different clouds • IOT TELEMETRY ANALYSIS: mine telemetry data to find anomalies in asset utilization • SALES INSIGHTS: understand customer behaviours, predict trends or spikes, and optimize go-to-market strategy
  • 8. Azure Data Explorer overview 1. Capability for many data types, formats, and sources Structured (numbers), semi-structured (JSON/XML), and free text 2. Batch or streaming ingestion Use the managed ingestion pipeline or queue a request for pull ingestion 3. Compute and storage isolation • Independent scale out / scale in • Persistent data in Azure Blob Storage • Caching for low latency on compute 4. Multiple options to support data consumption Use out-of-the-box tools and connectors or use APIs/SDKs for custom solutions Data Lake / Blob IoT Ingested Data Engine Data Management Azure Data Explorer Azure Storage Event Hub IoT Hub Customer Data Lake Kafka Sink Logstash Plugin Event Grid Azure Portal Power BI ADX Web UI ODBC / JDBC Apps Apps (Via API) Logstash Plugin Apps (Via API) Create, Manage Stream Batch Grafana Query, Control Commands Azure OSS Applications Active Data Connections
  • 9. 9 The role of ADX Raw data DWH Refined data Real-time derived data Data comparison and fast KPIs ADX THREE KEY USERS IN ONE TOOL: • IoT Developer (data check, rule engine for insights) • Data engineer (data exploration/enrichment/manipulation… like ‘’Smandruppation’’?) • Data scientist (data selection and … what else?)
  • 10. 10 How ADX is Organized INSTANCE DATABASE SOURCES DB Users/Apps Ingestion URL Querying URL Cache storage Blob storage EXTERNAL SOURCES EXTERNAL DESTINATIONS IotHUB EventHub Storage ADLS Sql Server MANY.. Cluster |___database 1 | |___table 1 | | |___extent data | | | |___column 0 | | | | |___data blocks | | | | |___policy:authorization;data retention... | | | |___column 1 | | |___schema,ordered list of fields | | |___policy objects:authorization;data retention... | |___table 2 | |___policy objects:authorization;data retention... |___database 2
  • 11. Why Kusto is Fast in a Nutshell WHY IS KUSTO SO FAST? • distributed structure • data stored in columnar form • node cluster • designed for data that is read-only, rarely deleted, and never updated. Compared with SQL Server, Kusto’s high query speed is not magic: it is a tradeoff in data processing, gaining some features by giving up others. Remember the old (but good) pricing calculator… and now?
  • 12. How is it composed inside? 1) Admin Node 2) Query Head 3) Data Node 4) Gateway Node
  • 13. The four elements of a Kusto Table 1. Table Metadata 2. Extent Directory 3. Extent 4. Column Index
  • 14. Data extent & Kusto Index Data Extent (aka Data Shard) • a Kusto data extent is somewhat like a ‘’mini Kusto table’’ • columnar data subdivided into segments • a query only needs to parse the columns referenced in its project section • a project section is a must Kusto Index Three kinds of indexes:  String column index: inverted term index stored as a B-tree. This index gives Kusto powerful text-processing capability (similar to Elasticsearch); the “contains” operator is way faster than “like” in T-SQL.  Numeric column (including DateTime and TimeSpan) index: range-based forward index.  Dynamic column index: inverted term index stored as a B-tree; during data ingestion, the engine enumerates all elements within the dynamic value and forwards them to the index builder.
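To make the string-index point concrete, here is a hedged sketch. It assumes the StormEvents sample table (with its EventNarrative column) already mentioned in these slides; adjust the names to your own schema:

```kusto
// 'has' matches whole terms, so it can fully exploit the inverted term index
StormEvents
| where EventNarrative has "tornado"
| count

// 'contains' matches arbitrary substrings; still index-assisted and far
// cheaper than a T-SQL LIKE '%...%' table scan, but slower than 'has'
StormEvents
| where EventNarrative contains "torn"
| count
```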
  • 15. Data Shards (Extents) and Column Store When you ingest some small data twice into a table, you will see the following 2 extents after ingestion. .show table StormEvents extents After a while, these extents will be merged into a single extent. Merge Policy The merge policy settings can be inspected by running the following command. .show database db01 policy merge "PolicyName": ExtentsMergePolicy, "EntityName": [db01], "Policy": { "RowCountUpperBoundForMerge": 16000000, "OriginalSizeMBUpperBoundForMerge": 0, "MaxExtentsToMerge": 100, "LoopPeriod": "01:00:00", "MaxRangeInHours": 24, "AllowRebuild": true, "AllowMerge": true, "Lookback": { "Kind": "Default", "CustomPeriod": null } }, "ChildEntities": [ "StormEvents" ],
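The merge policy can also be changed, not just shown. A minimal sketch, assuming the db01 database from the slide and reusing the default values above (the exact values you pick are workload-dependent):

```kusto
// override the database-level extents merge policy
.alter database db01 policy merge '{"RowCountUpperBoundForMerge": 16000000, "MaxExtentsToMerge": 100, "LoopPeriod": "01:00:00", "AllowRebuild": true, "AllowMerge": true}'
```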
  • 16. A journey of Data Ingestion Imagine you have a CSV log file in hand and want to load it into Kusto. 1. The ingest command arrives at the ADMIN NODE 2. The admin finds an available Data node and forwards the command 3. An extent is created, and the new info is sent to the admin 4. The admin adds the shard reference to metadata and commits a new snapshot to the db data
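The journey above can be triggered by hand with a one-shot pull-ingestion command. A sketch only: the table name reuses StormEvents from these slides, and the blob URL and SAS token are placeholders:

```kusto
// queue a pull ingestion of a CSV blob (URL and SAS token are hypothetical)
.ingest into table StormEvents (
    'https://mystorage.blob.core.windows.net/logs/file.csv;<SAS-token>'
) with (format = 'csv', ignoreFirstRecord = true)
```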
  • 17. Data deletion 1. What happens when a data shard is deleted? 2. What if I am querying the data being deleted just before the deletion command executes? 3. Can I recover the deleted data by reverting metadata to a previous version? The only exception is the “data purge” command. Remember: “With no regrets”.
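Because deletion is a metadata operation, dropping whole extents is cheap. A hedged example that ages out extents older than 30 days from the StormEvents sample table (the 30-day cutoff is illustrative):

```kusto
// preview which extents would be affected
.show table StormEvents extents
| where MaxCreatedOn < ago(30d)

// drop them: a metadata update; the underlying blobs are collected later
.drop extents <| .show table StormEvents extents | where MaxCreatedOn < ago(30d)
```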
  • 18. Query processing When you submit a query written in the Kusto Query Language (KQL), the query analyzer parses it into an Abstract Syntax Tree (AST) and builds an initial Relational Operators Tree (RelOp tree). It then finally builds a query plan as follows. The generated plan is eventually translated into the distributed query plan, which is a shard-level access tree.
  • 19. Kusto query execution Script A, put where before the aggregation (summarize): UsageDaily | where DateKey > 20190101 | summarize DailyUsage_sum = sum(DailyUsage) by DateKey | order by DateKey desc | take 10 Script B, put where after the aggregation: UsageDaily | summarize DailyUsage_sum = sum(DailyUsage) by DateKey | where DateKey > 20190101 | order by DateKey desc | take 10 Which script returns first? Will script A take less time than script B? The result is almost the same. How can that be?! Let’s go deep and find out why.
  • 20. Kusto query execution Abstract Syntax Tree (AST) and Relational Operators Tree (RelOp Tree) • Parse the incoming script into an Abstract Syntax Tree (AST) and perform a semantic pass over it. • Check names: do the referenced tables, functions, and pre-defined variables exist in the database and query context? • Verify the user has permission to access the relevant entities. • Check data types and references, e.g. is an int function being applied to a string? • After the semantic pass, the query engine builds an initial Relational Operators Tree (RelOp Tree) based on the AST. • Next, the Kusto engine further attempts to optimize the query by applying one or more predefined rewriting rules. PAY ATTENTION: • Aggregation ops are pushed down to the “leaf”. • Top-n operators are replicated to each data extent. After optimization, both Script A and Script B share a common RelOp tree like this:
• 21. Join or summarize internal strategy What does ADX do when we ask for a Join or a Summarize? Broadcast join strategy: [if one of the join sides is significantly smaller than the other]. Shuffled join strategy: [if both join sides are large, the same partitioning scheme is applied to both sides]. Other: [both join sides are not so large].
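When the optimizer’s automatic choice is not ideal, KQL lets you force a strategy with query hints. A minimal sketch, assuming hypothetical table and column names (FactSales, DimCity, etc.):

```kusto
// Force a broadcast join when the right side is known to be small
FactSales
| join hint.strategy=broadcast (DimCity) on City

// Force a shuffle join when both sides are large;
// the engine repartitions both sides on the join key
FactSales
| join hint.strategy=shuffle (FactReturns) on OrderId

// summarize can also be shuffled on a high-cardinality key
FactSales
| summarize hint.shufflekey=City Total=sum(Amount) by City
```

These hints exist in KQL; measure before and after, since forcing the wrong strategy can make the query slower.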
• 22. Partition By default (when no partitioning policy is assigned), extents are partitioned by ingest-time-based partitioning. When you change the partitioning policy of an existing table, please clear the data and re-ingest all data under the new partitioning policy. .alter table SalesLogs policy partitioning ```{ "PartitionKeys": [ { "ColumnName": "City", "Kind": "Hash", "Properties": { "Function": "XxHash64", "MaxPartitionCount": 128, "Seed": 1, "PartitionAssignmentMode": "Default" } } ] }``` By setting this custom policy, the extents in this table will be re-partitioned by the hash of City. This runs as a background process after data ingestion.
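To check which policy is currently in effect, or to revert to the default ingest-time partitioning, you can use the corresponding control commands (the table name is the hypothetical one from the slide):

```kusto
// Read back the partitioning policy of a table
.show table SalesLogs policy partitioning

// Remove the custom policy and fall back to the default
// ingest-time-based partitioning for new extents
.delete table SalesLogs policy partitioning
```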
• 23. Other topics for data sharing and distributions MATERIALIZED VIEW Querying a materialized view is more performant than querying the source table, where the aggregation would be performed on each query. The result of a materialized view is always up-to-date. After a while, the background process processes the “delta” and merges it into the “materialized part”. An MV is made of two components: • A materialized part — an Azure Data Explorer table holding aggregated records from the source table which have already been processed. This table always holds a single record per the aggregation’s group-by combination. • A delta — the newly ingested records in the source table that haven’t yet been processed. .show materialized-view MaterializedViewName .show materialized-view MaterializedViewName failures
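The two components above are created for you when you define the view. A minimal sketch, assuming a hypothetical Telemetry source table with a DeviceId key:

```kusto
// Define a materialized view that keeps the latest record per device.
// The engine maintains the materialized part in the background.
.create materialized-view DeviceLastKnown on table Telemetry
{
    Telemetry
    | summarize arg_max(Timestamp, *) by DeviceId
}

// Query it like a table; at query time Kusto combines the
// materialized part with the not-yet-processed delta
DeviceLastKnown
| where Timestamp > ago(1h)
```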
• 24. Other topics for data sharing and distributions LEADER AND FOLLOWER In Data Explorer, you can also use the leader-follower pattern to distribute query workloads across multiple clusters. When a follower database in a different cluster is attached to the original database, called the “leader” database, the follower database synchronizes changes from the leader. With a read-only follower database, you can view data of the leader database from a different cluster. (Followers must be in the same region as the leader.) You can use this pattern for scale-out purposes in large systems. You can also specify different SKUs and caching policies on follower clusters. You can distribute read query workloads across multiple clusters, especially when heavy ingestion workloads hit the leader database.
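The attachment itself is done through ARM (portal, CLI, or templates), but once it is in place you can inspect the followed database from the follower cluster with control commands. A sketch, assuming a hypothetical database name:

```kusto
// On the follower cluster: show the state of a followed database,
// including its caching-policy override if one is set
.show follower database MyDatabase
```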
• 25. Kusto Limitations 1) Limit on query concurrency You can estimate the max concurrent number as [Cores per node] x 10. You can also view the actual number by running this Kusto command, if you have permission to run it: .show cluster policy querythrottling 2) Limit on the node memory Your Kusto administrator may set the maximum memory usage; you can override it with an option: set max_memory_consumption_per_query_per_node=68719476736; MyTable | ... .show queries | where StartedOn > ago(1d) | extend MemoryPeak = tolong(ResourcesUtilization.MemoryPeak) | project StartedOn, CommandType, ClientActivityId, TotalCpu, MemoryPeak | top 10 by MemoryPeak
• 26. Kusto Limitations 3) Limit on memory per iterator Whenever there is a join or summarize, the Kusto engine uses a pull iterator to fulfill the request (the limit is set to 5 GB). You can increase this value up to half of the physical memory of the node: set maxmemoryconsumptionperiterator=68719476736; MyTable | ... If your query hits this limit, you may see an error message like “…exceeded memory budget…”. 4) Limit on result set size You will hit this limit when your query’s result exceeds 500,000 rows or 64 MB of data. If your script hits this limit, you will see an error message containing “partial query failure”. To solve or avoid this limit you can: • summarize the data to output only interesting results • use a take operator to see a small sample of the result • use the project operator to output only the columns you need. What is MILLIBYTE? BUT… if you insist that you want to output all the data and copy it to Excel, you can use these options to relax the limit: set truncationmaxsize=1048576; set truncationmaxrecords=100000; MyTable | where User=="UserId1"
• 27. Kusto Limitations 5) Limit on query complexity Usually, you won’t hit this limit unless your Kusto query is extremely complex, for example 5,000 conditions in the where clause: T | where Column == "value1" or Column == "value2" or .... or Column == "valueN" Each query is transformed into a RelOp tree; if the tree depth exceeds the threshold, you hit the limit. You can rewrite the script logic to solve it: T | where Column in ("value1", "value2", .... "valueN")
• 28. What ADX isn’t optimal for / stretch scenarios Since we do not own the hardware the workloads are running on, we do not have to get married to one technology and run everything on it to amortise the cost of said hardware / licence. We can use the best tool for the job.
• Data warehouse — Why: it isn’t transactional, doesn’t have log journals, etc. This is part of the reason it is so fast, but also part of the reason it is a poor fit for a data warehouse. — Azure PaaS alternatives: Azure Synapse & Power BI Premium.
• Application back end — Why: ADX isn’t built for transactional workloads. — Alternatives: Cosmos DB, Azure SQL DB, Azure PostgreSQL, Azure MySQL, Azure MariaDB.
• Machine Learning (ML) training — Why: even if ADX supports some built-in ML algorithms, it isn’t an ML training platform. — Alternatives: Azure ML, Spark (Azure Databricks or Azure HDInsight), Azure Batch & Data Science Virtual Machine (DSVM).
• Sub-second streaming — Why: ADX can go as low as seconds of latency in ingesting data and still do analytics; most “near real time” scenarios fall comfortably within that window. — Alternatives: Structured Streaming in Continuous Mode in Spark (Azure Databricks or Azure HDInsight), Kafka Streams on Azure HDInsight, Flink on Azure HDInsight.
• 31. The way they imagine our data-world 1. How many languages do I need: SQL, Python, Kusto? 2. How many services use Kusto? 3. How can you use Kusto to manage / troubleshoot caveats within Azure solutions?
• 32. Key Differences with SYNAPSE DATA EXPLORER POOL (Category / Capability — Azure Data Explorer vs Synapse Data Explorer):
• Security / VNET — ADX: supports VNet injection and Azure Private Link; Synapse: support for Azure Private Link automatically integrated as part of the Synapse managed VNET. [PARI]
• Security / CMK — ADX: ✓; Synapse: automatically inherited from the Synapse workspace configuration. [PARI]
• Security / Firewall — ADX: ✗; Synapse: automatically inherited from the Synapse workspace configuration. [PARI]
• Business Continuity / Availability Zones — ADX: optional; Synapse: enabled by default where Availability Zones are available. [ADX]
• SKU / Compute options — ADX: 22+ Azure VM SKUs to choose from; Synapse: simplified to Synapse workload-type SKUs. [ADX]
• Integrations / Built-in ingestion pipelines — ADX: Event Hub, Event Grid, IoT Hub; Synapse: Event Hub, Event Grid, and IoT Hub supported via the Azure portal for non-managed VNet. [ADX]
• Integrations / Spark integration — ADX: Azure Data Explorer linked service, built-in Kusto Spark integration with support for Azure Active Directory pass-through authentication, Synapse Workspace MSI, and Service Principal; Synapse: built-in Kusto Spark connector integration with the same authentication support. [PARI]
• Integrations / KQL artifacts management — ADX: ✗; Synapse: save KQL queries and integrate with Git. [SYN?]
• Integrations / Metadata sync — ADX: ✗; Synapse: ✗. [PARI]
• Features / KQL queries, API and SDKs, connectors, query tools — ✓ on both. [PARI]
• Pricing / Business model — ADX: cost-plus billing model; Synapse: vCore billing model with two meters, vCore and Storage. [ADX]
• 33. Delta Kusto - CI/CD for Azure Data Explorer (ADX) WHAT IS DELTA KUSTO? A command-line interface (CLI) enabling CI/CD automation with Kusto objects (e.g. tables, functions, policies, security roles, etc.). It can work on a single database, multiple databases, or an entire cluster. It also supports multi-tenant scenarios. • Single-file executable available on both Windows and Linux • Accepts the path to a parameter YAML file instructing Delta Kusto on what job to perform • A single call to Delta Kusto can run multiple jobs • Enables change management on multi-tenant solutions within Azure Data Explorer.
• 34. Delta Kusto - CI/CD for Azure Data Explorer (ADX) HOW DOES DELTA KUSTO WORK? Delta Kusto parses scripts and / or loads a database configuration into a database model. It can then compare two models to compute a delta. This approach might seem overkill when considering functions, for instance, where a simple create-or-alter can overwrite a function. It does offer some advantages though: • Computes a minimal set of delta commands, since it doesn’t need to create-or-alter everything just in case • Detects drops (e.g. table columns) and can treat them as such • Can do an offline delta, i.e. compare two scripts without any Kusto runtime involved.
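To make the YAML-driven workflow concrete, here is a hypothetical sketch of a parameter file comparing a live database (the “current” model) against scripts in source control (the “target” model) and writing the delta to a file. The exact key names and schema are an assumption — check the Delta Kusto documentation for the authoritative format; the cluster URI, database, and paths are placeholders:

```yaml
# Hypothetical Delta Kusto parameter file (schema keys are assumptions)
jobs:
  sync-dev-db:
    current:            # state as it exists in the cluster
      adx:
        clusterUri: https://mycluster.kusto.windows.net
        database: MyDatabase
    target:             # desired state, kept in Git
      scripts:
        - filePath: kql/schema.kql
    action:             # where to put the computed delta commands
      filePath: out/delta.kql
```

A CI pipeline can then review or apply the generated delta script as a deployment step.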
• 35. GIT-IZE KUSTO Statements REQUIREMENT We hit issues where a developer would make a mistake directly editing a function, and it would mess up our production assets. SOLUTION • Sync Kusto lets the user pick either the local file system or a Kusto database as either the source or the target. • The Compare button checks both schemas and determines the delta between the source and the target. • After viewing the differences, the user can put a checkmark next to the ones they want to publish and then press the Update button. • Visualize the differences between the source and the target before updating the target. This tool is now available for everyone on GitHub: https://github.com/microsoft/synckusto.
• 36. How to use ADX to Kusto-mize data pipelines Trigger2fill & Rewrite Patterns
• 37. Trigger2Fill and ReWrite Pattern You can:  Send daily reports containing tables and charts.  Set notifications based on query results.  Schedule control commands on clusters.  Export and import data between Azure Data Explorer and other databases. [Diagram: Trigger2Fill pattern — a new data stream is stream-ingested into raw tables, a Logic App runs Kusto queries, and the results fill refined tables. ReWrite pattern — continuous export of refined tables to Blob Storage, followed by batch ingestion.]
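The ReWrite half of the pattern rests on the continuous-export command. A minimal sketch, assuming a hypothetical refined table and an external table already defined over Blob Storage:

```kusto
// Continuously export "good" refined rows to an external table
// backed by Blob Storage; the export job runs on a schedule.
.create-or-alter continuous-export RefinedExport
over (RefinedTable)
to table ExternalBlobTable
with (intervalBetweenRuns=1h)
<| RefinedTable
   | where Quality == "good"
```

The exported blobs can then be batch-ingested elsewhere, closing the ReWrite loop shown in the diagram.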
  • 38. DEMO
• 39. NO EXCUSES… NOW IT’S FREE! • Microsoft account or an Azure Active Directory user • No Azure subscription or credit card needed!
Setting — Suggested value — Description:
• Cluster display name — MyFreeCluster — the display name for your cluster. A unique cluster name is generated as part of the deployment, and the domain name [region].kusto.windows.net is appended to it.
• Database name — MyDatabase — the name of the database to create. The name must be unique within the cluster.
• Select location — Europe — the location where the cluster will be created.
• 40. FREE CLUSTER FEATURES With FREE:
• Storage (uncompressed): ~100 GB
• Databases: up to 10
• Tables per database: up to 100
• Columns per table: up to 200
• Materialized views per database: up to 5
Only with FULL: • External tables • Continuous export • Workload groups • Purge • Follower clusters • Partitioning policy • Streaming ingestion • Python and R plugins • Enterprise readiness (customer-managed keys, VNet, disk encryption, managed identities) • Autoscale • Azure Monitor and Insights • Event Hub and Event Grid connectors
  • 41. The real role of the Data Engineer … and some fun @work
• 42. The real Role of Data Engineer What is Data Engineering? Data engineering is the practice of designing and building systems for collecting, storing, and analyzing data at scale. What does a data engineer do? Data engineers work in a variety of settings to build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret. Their goal is to make data accessible so that organizations can use it to evaluate and optimize their performance. What are some common tasks of the Data Engineer? • Acquire datasets that align with business needs • Develop algorithms to transform data into useful, actionable information • Build, test, and maintain database pipeline architectures • Collaborate with management to understand company objectives • Create new data validation methods and data analysis tools • Ensure compliance with data governance and security policies What’s the difference between a data analyst and a data engineer? Data scientists and data analysts analyze data sets to glean knowledge and insights; data engineers build and maintain the systems that make that data available to them.
• 43. Data Engineer career path – 1 of 4 Learn the fundamentals of cloud computing, coding skills, and database design as a starting point for a career in data engineering.  Coding: proficiency in coding languages is essential to this role  Relational and non-relational databases  ETL (extract, transform, and load) systems  Data storage: data lake or DWH?  Automation and scripting: you should be able to write scripts to automate repetitive tasks.  Machine learning: it can be helpful to have a grasp of the basic concepts to better understand the needs of the data scientists on your team.  Big data tools: data engineers are often tasked with managing big data (Hadoop, MongoDB, and Kafka).  Cloud computing: you’ll need to understand cloud storage and cloud computing as companies increasingly trade physical servers for cloud services.  Data security: many data engineers are still tasked with securely managing and storing data to protect it from loss. 1. Develop your data engineering skills
• 44. Data Engineer career path – 2 of 4 2. Get certified. A certification can validate your skills to potential employers, and preparing for a certification exam is an excellent way to develop your skills and knowledge. If you notice a particular certification is frequently listed as required or recommended, that might be a good place to start.
• 45. Data Engineer career path – 3 of 4 3. Build a portfolio of data engineering projects. You can add data engineering projects you've completed independently or as part of coursework to a portfolio website. Alternatively, post your work to the Projects section of your LinkedIn profile or to a site like GitHub. Brush up on your big data skills with a portfolio-ready Guided Project that you can complete in under two hours.
• 46. Data Engineer career path – 4 of 4 4. Start with an entry-level position. Many data engineers start off in entry-level roles, such as business intelligence analyst or database administrator. As you gain experience, you can pick up new skills and qualify for more advanced roles.
• 47. ADX ‘’WOW’’ PLUGINS – COSMOS DB CALLOUT Enrich telemetry with Cosmos DB: the cosmosdb_sql_request plugin. Why does this plugin exist? The cosmosdb_sql_request plugin sends a SQL query to a Cosmos DB SQL network endpoint and returns the results of the query. This plugin is primarily designed for querying small datasets, for example enriching data with reference data stored in Azure Cosmos DB. The plugin is invoked with the evaluate operator. Syntax: evaluate cosmosdb_sql_request ( ConnectionString , SqlQuery [, SqlParameters [, Options]] )
Arguments:
• ConnectionString (required) — a string literal with the connection string that points to the Cosmos DB collection to query. It must include AccountEndpoint, Database, and Collection. It may include AccountKey if a master key is used for authentication. Example: 'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;'
• SqlQuery (required) — a string literal with the query to execute.
• SqlParameters (optional) — a constant value of type dynamic that holds key-value pairs to pass as parameters along with the query. Parameter names must begin with @.
• Options (optional) — a constant value of type dynamic that holds more advanced settings as key-value pairs:
  • armResourceId — retrieve the API key from Azure Resource Manager. Example: /subscriptions/a0cd6542-7eaf-43d2-bbdd-b678a869aad1/resourceGroups/cosmoddbresourcegrouput/providers/Microsoft.DocumentDb/databaseAccounts/cosmosdbacc
  • token — provide the Azure AD access token used to authenticate with Azure Resource Manager.
  • preferredLocations — control which region the data is queried from. Example: ['East US']
• 48. IMPORTANT: Set the callout policy!! [ { "CalloutType": "CosmosDB", "CalloutUriRegex": "my_endpoint1.documents.azure.com", "CanCall": true }, { "CalloutType": "CosmosDB", "CalloutUriRegex": "my_endpoint2.documents.azure.com", "CanCall": true } ] .alter cluster policy callout @'[{"CalloutType": "cosmosdb", "CalloutUriRegex": ".documents.azure.com", "CanCall": true}]' Example: Query Cosmos DB The following example uses the cosmosdb_sql_request plugin to send a SQL query to fetch data from Cosmos DB using its SQL API. evaluate cosmosdb_sql_request( 'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;', 'SELECT * from c') Example: Query Cosmos DB with parameters The following example uses SQL query parameters and queries the data from an alternate region. For more information, see preferredLocations. evaluate cosmosdb_sql_request( 'AccountEndpoint=https://cosmosdbacc.documents.azure.com/;Database=MyDatabase;Collection=MyCollection;AccountKey=' h'R8PM...;', "SELECT c.id, c.lastName, @param0 as Column0 FROM c WHERE c.dob >= '1970-01-01T00:00:00Z'", dynamic({'@param0': datetime(2019-04-16 16:47:26.7423305)}), dynamic({'preferredLocations': ['East US']})) | where lastName == 'Smith'
• 49. ADX ‘’WOW’’ PLUGINS – HTTPS CALL MAKE INFERENCES WITH HTTPS PLUGIN: http_request plugin / http_request_post plugin. Why do these plugins exist? The http_request (GET) and http_request_post (POST) plugins send an HTTP request and convert the response into a table, so you can retrieve the result of an external computation and merge it with your dataset. Syntax: evaluate http_request ( Uri [, RequestHeaders [, Options]] ) evaluate http_request_post ( Uri [, RequestHeaders [, Options [, Content]]] )
Arguments:
• Uri (string, required) — the destination URI for the HTTP or HTTPS request.
• RequestHeaders (dynamic) — a property bag containing HTTP headers to send with the request.
• Options (dynamic) — a property bag containing additional properties of the request.
• Content (string) — the body content to send with the request. The content is encoded in UTF-8 and the media type for the Content-Type attribute is application/json.
• 50. WHY IS … SO DIFFICULT? Returns Both plugins return a table that has a single record with the following dynamic columns: • ResponseHeaders: a property bag with the response headers. • ResponseBody: the response body parsed as a value of type dynamic. Prerequisites 1. CALLOUT POLICY 2. USE HTTPS Authentication • Uri — the URI to authenticate with. • RequestHeaders — use the HTTP standard Authorization header or any custom header supported by the web service. • Options — use the HTTP standard Authorization header. If you want to use Azure Active Directory (Azure AD) authentication, you must use an HTTPS URI for the request and set the following values: * azure_active_directory to Active Directory Integrated * AadResourceId to the Azure AD ResourceId value of the target web service.
• 51. WARNING, WARNING, WARNING !!!! SECRET INFORMATION MUST BE REALLY SECRET!!! • Be extra careful not to send secret information, such as authentication tokens, over HTTP connections. • If the query includes confidential information, make sure that the relevant parts of the query text are obfuscated so that they'll be omitted from any tracing. • Use obfuscated string literals!!! HEADERS vs HEADACHE The RequestHeaders argument can be used to add custom headers to the outgoing HTTP request. In addition to the standard HTTP request headers and the user-provided custom headers, the plugin also adds the following custom headers: • x-ms-client-request-id — a correlation ID that identifies the request. • x-ms-readonly — a flag indicating that the processor of this request shouldn't make any persistent changes. READ <> READWRITE PERMISSION The x-ms-readonly flag is set for every HTTP request sent by the plugin that was triggered by a query and not a control command.
• 52. HTTPS PLUGIN: An Example EXAMPLE NO.1 evaluate http_request('http://services.groupkt.com/country/get/all') | project CC=ResponseBody.RestResponse.result | mv-expand CC limit 10000 | project name = tostring(CC.name), alpha2_code = tostring(CC.alpha2_code), alpha3_code = tostring(CC.alpha3_code) | where name startswith 'b' RESULT (name / alpha2_code / alpha3_code): Bahamas BS BHS • Bahrain BH BHR • Bangladesh BD BGD EXAMPLE NO.2 let uri='https://example.com/node/js/on/eniac'; let headers=dynamic({'x-ms-correlation-vector':'abc.0.1.0'}); let options=dynamic({'Authentication':'Active Directory Integrated', 'AadResourceId':'https://eniac.to.the.max.example.com/'}); evaluate http_request_post(uri, headers, options) Etc etc etc: evaluate http_request_post ( Uri [, RequestHeaders [, Options [, Content]]] ) WHERE IS ADX.. IN THIS TYPICAL USE CASE?
  • 53. “Let your data drive. But.. Sir… Data driven or data informed?