This presentation explains some key concepts of the DataWeave (DW) data transformation language from MuleSoft.
The presentation features a case study that shows how DW achieves a non-trivial structural transformation from CSV to JSON.
2. The data mapping challenge
Source formats: JSON, XML, CSV, Fixed Width, POJO
Target formats: JSON, XML, CSV, Fixed Width, POJO
Mapping concerns: structural transformation, value transformation, conditional mapping, filtering, grouping
Best practice: always define the mapping in terms of the desired target data structure
3. The old programmatic approach
❖ Map the target message from the source message
programmatically (e.g., via a script or Java method)
❖ Sequence of procedural steps that incrementally build the
target message from the source message
❖ Typical example: loop on elements of a source sequence
and for each element instantiate a target sub-structure, then
attach it to the overall target structure
❖ This approach is neither concise nor expressive; if
implemented incorrectly, it is also inefficient
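The loop described above can be sketched as follows. This is a minimal illustration in Python, not code from the presentation; the record fields and the `customers`/`lines` target shape are hypothetical.

```python
# Hypothetical source records; the field names are illustrative only.
source_items = [
    {"customer": "Customer1", "product": "ProductA", "qty": 120},
    {"customer": "Customer1", "product": "ProductB", "qty": 88},
    {"customer": "Customer2", "product": "ProductC", "qty": 60},
]


def build_target(items):
    """Incrementally build the target structure, one source element at a time."""
    target = {"customers": []}
    index = {}  # customer id -> sub-structure already attached to the target
    for item in items:
        entry = index.get(item["customer"])
        if entry is None:
            # Instantiate a new target sub-structure and attach it.
            entry = {"customer": item["customer"], "lines": []}
            index[item["customer"]] = entry
            target["customers"].append(entry)
        entry["lines"].append({"product": item["product"], "qty": item["qty"]})
    return target


target = build_target(source_items)
```

The `index` dictionary avoids rescanning the target for every source element. Without it the loop becomes quadratic, which is one way this style turns out inefficient.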
4. The templating approach
❖ Template engines can be used as
data mapping engines:
❖ We define the target structure
(template)
❖ We define how each part of the
template is generated dynamically
from source data
❖ The template consists of a
semi-literal expression with
placeholders, e.g. $() in this
example
❖ More constructs are necessary to
instantiate repetitive structures
(looping), for conditional
mapping, etc.
JSON template:
{"user":
{"id": "$(sourceData.userID)",
"firstName": "$(sourceData.givenName)",
"lastName": "$(sourceData.lastName)",
"contacts": {
"phone": "$(sourceData.phoneNumber)",
"email": "$(sourceData.emailAddress)"
}}}
XML template:
<?xml version="1.0"?>
<user>
<id> $(sourceData.userID) </id>
<firstName> $(sourceData.givenName) </firstName>
<lastName> $(sourceData.lastName) </lastName>
<contacts>
<phone> $(sourceData.phoneNumber) </phone>
<email> $(sourceData.emailAddress) </email>
</contacts>
</user>
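As a rough illustration of the templating idea, the sketch below fills a JSON template using Python's standard `string.Template`. It uses `${name}` placeholders rather than the `$()` syntax of the slide, and the sample values are invented for the example.

```python
import json
from string import Template

# Hypothetical source data; the field names mirror the slide's placeholders.
source_data = {
    "userID": "u-001",
    "givenName": "Ada",
    "lastName": "Lovelace",
    "phoneNumber": "555-0100",
    "emailAddress": "ada@example.com",
}

# The template is the literal target structure with ${...} placeholders.
json_template = Template(
    '{"user": {"id": "${userID}", "firstName": "${givenName}", '
    '"lastName": "${lastName}", '
    '"contacts": {"phone": "${phoneNumber}", "email": "${emailAddress}"}}}'
)

# Each placeholder is replaced by the matching source field.
rendered = json_template.substitute(source_data)
user = json.loads(rendered)["user"]
```

Note that the values are substituted as raw text: a quote character inside a source value would produce invalid JSON, and an XML target would need a second, separate template.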
5. Issues with standard templating
❖ Template depends on the concrete syntax of the target message (separate
templates for XML, JSON etc.)
❖ Placeholder syntax depends on the type of source message (e.g., XPath for
XML, JSONPath for JSON, non-standard syntax for other media types)
❖ Placeholder syntax may clash with target message syntax (cannot use for
example <> as placeholder markers with XML)
❖ Looping constructs of traditional template engines mix engine syntax with
generated content (“PHP-like”)
❖ XSLT is a very powerful templating and transformation language, but it
does have drawbacks (verbose XML syntax, cannot operate on non-tree-
structured source message that cannot be rendered into XML, etc.)
6. DataWeave (DW)
❖ Data mapping and
transformation tool from
MuleSoft
❖ Tightly integrated with
Anypoint Studio IDE
❖ Non-procedural expression
language
❖ Applies functional
programming constructs
(lambdas)
❖ Uses internal, canonical data
format (application/dw)
7. Canonical data representation
1. DW parses the source message into the canonical application/dw format using supplied metadata / DataSense capability
2. A DW expression is used to transform the source message (result still in canonical application/dw format)
3. DW renders the canonical target message into the target MIME type specified as a “header” to the DW expression (e.g. %output application/json)
This decouples the transformation from the concrete syntax of source and target messages!
Diagram: source message (<source MIME type>) -> parser -> source message (canonical, application/dw) -> DW expression -> target message (canonical, application/dw) -> renderer -> target message (<target MIME type>)
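The parse / transform / render separation can be imitated in a few lines of Python, with plain lists and dictionaries standing in for the canonical format. This is only an analogy to show the decoupling, not how DW is implemented; the function names and the second order number are hypothetical.

```python
import csv
import io
import json


def parse_json(text):
    """Parser: source syntax -> canonical form (plain lists and dicts)."""
    return json.loads(text)


def transform(orders):
    """Transformation: canonical -> canonical, unaware of any concrete syntax."""
    return [{"order": o["order_nr"], "qty": int(o["qty"])} for o in orders]


def render_json(data):
    """Renderer for a JSON target."""
    return json.dumps(data)


def render_csv(data):
    """Renderer for a CSV target: same canonical data, different syntax."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["order", "qty"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(data)
    return out.getvalue()


source = '[{"order_nr": "DO1234", "qty": "20"}, {"order_nr": "DO1235", "qty": "50"}]'
canonical_target = transform(parse_json(source))
as_json = render_json(canonical_target)
as_csv = render_csv(canonical_target)
```

Switching the target syntax only swaps the renderer; `transform` is untouched, which is the role the %output header plays for a DW expression.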
8. The DW canonical format
❖ Only 3 kinds of data in DW:
• Simple (String, Number,
Boolean, Date types)
• Array
• Objects (key:value pairs)
❖ The canonical application/dw format
is shown in a JSON-like concrete
syntax in Anypoint Studio
❖ Parsing and rendering between
application/json and application/dw
is straightforward
[
{
"order_nr": "DO1234",
"order_date": "2016-03-12T13:30:23+8.00",
sku: "1233244",
"sku_description": "Product A",
qty: "20"
},
{
"order_nr": "DO1234",
"order_date": "2016-03-12T13:30:23+8.00",
sku: "1233255",
"sku_description": "Product B",
qty: "50"
}
]
9. XML Parsing
❖ repeated XML elements —> repeated object keys
❖ XML attributes —> special @() object
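To see why repeated keys matter, the Python sketch below converts a small XML document into nested (key, attributes, value) triples. The triple layout is an assumption made for this illustration and is not DW's actual internal representation; it only shows that an ordered list of keyed entries can keep repeated elements and attributes that a plain dictionary would lose.

```python
import xml.etree.ElementTree as ET


def to_canonical(element):
    """Convert an element into a (key, attributes, value) triple.

    Child elements become an ordered list of triples, so repeated element
    names survive as repeated keys; a Python dict could not hold them.
    """
    children = [to_canonical(child) for child in element]
    value = children if children else (element.text or "").strip()
    return (element.tag, dict(element.attrib), value)


doc = ET.fromstring(
    '<order id="DO1234"><item sku="1233244">Product A</item>'
    '<item sku="1233255">Product B</item></order>'
)
canonical = to_canonical(doc)

# Both <item> elements are kept, in document order, under the same key.
item_keys = [key for key, _attrs, _value in canonical[2]]
```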
10. CSV parsing
❖ Array of records (lines)
❖ Record (line) —> array
element of type Object
❖ Field in record: object
field (key is taken from
CSV header line or
configured metadata)
❖ Reader configuration to
set field separator, etc.
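The same record model can be reproduced with Python's `csv.DictReader`, shown here as an analogy rather than as DW behaviour. The three-column sample is a shortened, hypothetical extract.

```python
import csv
import io

csv_text = (
    "OrderId;City;Quantity\n"
    "000001;London;120\n"
    "000002;Paris;60\n"
)

# The delimiter argument plays the role of the reader's field-separator setting.
reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")

# One object per line; keys come from the header line.
records = [dict(row) for row in reader]
```

Every field value arrives as a string here, so numeric or date fields still need an explicit value transformation.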
12. Case study: introduction
Transforming a list of order items into a corresponding list of delivery routes.
The source payload is an unsorted list of items in CSV format:
OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductA;100
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductD;15
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductD;30
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductE;30
000005;2016-09-16;Customer4;2016-09-20;London;ProductB;20
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductD;7
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductE;30
000007;2016-09-16;Customer5;2016-09-22;Berlin;ProductB;12
The target structure (described in the following slide) is a multi-level JSON structure.
This case study focuses on the structural transformation capabilities of DW, but DW also offers a
wide range of value transformation and formatting capabilities, conditional mapping, and much more!
13. Case study: target format
[
{
city: "<City>",
deliveryDate: "<DeliveryDate>",
stops: [
{
customer: "<CustomerId>",
orderitems: [
{
ordernr: "<OrderId>",
orderdate: "<OrderDate>",
product: "<ProductId>",
qty: "<Quantity>"
}
]
}
]
}
]
JSON document with a sequence of delivery routes by delivery date and city:
❖ Sort CSV order lines by city and delivery date
❖ Within each delivery date and city, group order lines by customer
❖ Render the structure as JSON
Nesting levels of the target structure (outermost first): by city / delivery date, by customer, by order item
14. Case study: step 1
Source message parsed as application/dw:
The DW expression payload evaluates the entire message payload (see the earlier slide “CSV parsing”).
NOTE: the DW transformer Preview functionality in MuleSoft Anypoint Studio maps the sample
source in real time as you type the transformation!
15. Case study: step 2
Sorting and grouping by combination of city and delivery date:
A composite key is used for sorting and grouping via the string concatenation operator (++).
The groupBy operator creates an object with the group values as keys.
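The effect of this step can be sketched in Python (not DW syntax) on a subset of the case-study rows. The `|` separator inside the composite key is an assumption; the slide text does not say which separator the original expression used.

```python
# A subset of the case-study rows, already parsed into records.
rows = [
    {"OrderId": "000003", "CustomerId": "Customer3", "DeliveryDate": "2016-09-23", "City": "Berlin"},
    {"OrderId": "000001", "CustomerId": "Customer1", "DeliveryDate": "2016-09-20", "City": "London"},
    {"OrderId": "000002", "CustomerId": "Customer2", "DeliveryDate": "2016-09-20", "City": "Paris"},
    {"OrderId": "000004", "CustomerId": "Customer4", "DeliveryDate": "2016-09-20", "City": "London"},
]


def route_key(row):
    """Composite key built by string concatenation; '|' is an assumed separator."""
    return row["City"] + "|" + row["DeliveryDate"]


# Sort by the composite key, then collect rows into an object keyed by it.
groups = {}
for row in sorted(rows, key=route_key):
    groups.setdefault(route_key(row), []).append(row)
```

The result is an object whose keys are the composite values and whose values are the arrays of matching order lines.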
16. Case study: step 3
Iterating over the group values (city/delivery date combination) to
generate the 1st level of the target structure:
The pluck operator maps an object into an array. $$ is the key in the current iteration, $ is the
value.
City and delivery date are mapped from the composite key by String manipulation.
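A Python stand-in for this step is sketched below. The `pluck` helper is a hypothetical re-creation of the operator's behaviour (value and key passed to a lambda), the `|` separator is assumed, and `lineCount` is a placeholder field used only to show that the group value is available.

```python
# Assumed output of the grouping step: composite "City|DeliveryDate" keys.
groups = {
    "Berlin|2016-09-23": [{"OrderId": "000003", "CustomerId": "Customer3"}],
    "London|2016-09-20": [
        {"OrderId": "000001", "CustomerId": "Customer1"},
        {"OrderId": "000004", "CustomerId": "Customer4"},
    ],
}


def pluck(obj, fn):
    """Map an object to an array by calling fn(value, key) on each entry."""
    return [fn(value, key) for key, value in obj.items()]


routes = pluck(
    groups,
    lambda value, key: {
        # Recover city and delivery date from the composite key.
        "city": key.split("|")[0],
        "deliveryDate": key.split("|")[1],
        "lineCount": len(value),  # placeholder until the inner levels are built
    },
)
```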
17. Case study: step 4
Within each route group, group by customer and generate 2nd (inner) level of target
structure:
In the inner pluck the context for $ and $$ changes (e.g., $$ is now the CustomerID key).
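For one route group, the inner grouping can be sketched in Python as follows. `group_by` and `pluck` are hypothetical helpers that imitate the DW operators; the rows are the London / 2016-09-20 lines from the sample, reduced to the fields needed here.

```python
def group_by(items, key_fn):
    """Collect items into an object keyed by key_fn(item)."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


def pluck(obj, fn):
    """Map an object to an array by calling fn(value, key) on each entry."""
    return [fn(value, key) for key, value in obj.items()]


# Order lines of a single route group (London, 2016-09-20).
route_rows = [
    {"CustomerId": "Customer1", "ProductId": "ProductA"},
    {"CustomerId": "Customer1", "ProductId": "ProductB"},
    {"CustomerId": "Customer4", "ProductId": "ProductC"},
    {"CustomerId": "Customer4", "ProductId": "ProductE"},
    {"CustomerId": "Customer4", "ProductId": "ProductB"},
]

# Inside this pluck the key is the customer id, not the route key.
stops = pluck(
    group_by(route_rows, lambda row: row["CustomerId"]),
    lambda rows, customer: {"customer": customer, "orderitems": rows},
)
```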
18. Case study: (final) step 5
Within each customer group, generate the 3rd (innermost) level of the target
structure via the map operator:
The JSON rendering is obtained by changing the %output directive (e.g. %output application/json).
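Putting the five steps together, the whole transformation can be approximated in Python on a subset of the sample rows. This is a sketch of the logic, not the DW expression from the slides: it groups on a (city, delivery date) tuple instead of a concatenated string, and `group_by` and `order_item` are hypothetical helpers.

```python
import csv
import io
import json

csv_text = """OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
"""


def group_by(items, key_fn):
    """Collect items into an object keyed by key_fn(item)."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


def order_item(row):
    """Innermost level: map one CSV record to one order item."""
    return {
        "ordernr": row["OrderId"],
        "orderdate": row["OrderDate"],
        "product": row["ProductId"],
        "qty": row["Quantity"],
    }


# Parse, then sort by city and delivery date.
rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=";"))
rows.sort(key=lambda r: (r["City"], r["DeliveryDate"]))

# Level 1: routes; level 2: stops per customer; level 3: order items.
routes = [
    {
        "city": city,
        "deliveryDate": delivery_date,
        "stops": [
            {"customer": customer, "orderitems": [order_item(r) for r in customer_rows]}
            for customer, customer_rows in group_by(
                route_rows, lambda r: r["CustomerId"]
            ).items()
        ],
    }
    for (city, delivery_date), route_rows in group_by(
        rows, lambda r: (r["City"], r["DeliveryDate"])
    ).items()
]

# Rendering to JSON is a separate, final step.
target_json = json.dumps(routes, indent=2)
```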
19. Thanks!
This is just a “taste” of the innovative DataWeave
transformation language.
Find out more at:
https://docs.mulesoft.com/mule-user-guide/v/3.8/dataweave