In this talk, I discussed the different forms of complexity that can arise when integrating with APIs, and how DSLs can be used to tackle them. I demonstrated that F# can be a very effective tool for creating both internal and external DSLs, using FParsec and active patterns.
31. @theburningmonk
select GameTitle, UserId, TopScore
from GameScores
where GameTitle = "Starship X"
and TopScore >= 1000
order desc
limit 3
with (NoConsistentRead, Index(GameTitleIndex, true))
SELECT * FROM GameScore
pTableName
let isTableName = isLetter <||> isDigit
let pTableName =
    many1Satisfy isTableName
parses a sequence of one or more chars that satisfies the predicate function
SELECT * FROM GameScore
pAsterisk
*
pAttributeName
UserId, GameTitle, TopScore, …
let pAsterisk = stringCIReturn "*" Asterisk
matches the specified string and returns the given value
let isAttributeName = isLetter <||> isDigit
let pAttributeName =
    many1Satisfy isAttributeName
let pComma = skipStringCI ","
let pAttributeNames =
    sepBy1 pAttributeName pComma
parses one or more occurrences of pAttributeName separated by pComma
type MetricTerm = Namespace | Name
type Unit = | Unit
type StatsTerm =
    | Average | Min | Max | Sum | SampleCount
type Filter =
    | MetricFilter of MetricTerm * (string -> bool)
    | UnitFilter of Unit * (string -> bool)
    | StatsFilter of StatsTerm * (float -> bool)
    | CompositeFilter of Filter * Filter
let (|Float|) input =
    match Double.TryParse input with
    | true, n -> n
    | _ ->
        failwithf "not a float [%s]" input
Float : string -> float
match someString with
| Float 42.0 -> "ftw"
| Float 11.0 -> "palprime"
| Float x -> sprintf "just %f" x
good afternoon, and welcome to this talk on taming complex APIs with DSLs, and how F# can help you build these DSLs easily.
My name is Yan Cui and I often go by the online alias of ‘theburningmonk’
I work for a company called Gamesys, we're based in central London and are one of the market leaders in the real-money gaming business.
My team and I focus on freemium games for a more social audience, and as a backend developer, I have built the backend for a number of our social games on Facebook and mobile.
Across our social games, we have around 1 million DAU and 250 million requests per day.
Pretty much every user action in our games is recorded and analyzed; we capture around 2TB of data a month for analytics purposes alone, which doesn't take into account the significant amount of data we generate and store to facilitate the actual gameplay.
All our core game services are deployed to Amazon Web Services, and we make extensive use of its many services…
which has given us first-hand experience of the different kinds of…
complexities that can arise from some of these APIs.
Some complexities are visible, they tend to be the result of the inherent complexities with the operations that the API allows you to perform
Whilst other APIs might appear simple at what they do, they push the complexities to you instead. These complexities tend to surface only when you start working with the API.
And sometimes it's merely a case of an impedance mismatch between what the API designer thinks you'll do and what you actually need from the API
We’ll use three AWS APIs that we use regularly as case studies to illustrate the different types of complexities and how…
..F# can be an effective weapon in tackling these complexities by simplifying the task of creating both internal and external DSLs.
We’ll start with DynamoDB…
which is to this day still the fastest growing service that Amazon has and the DB of choice for our small team…
It is a managed key-value store…
with built-in redundancy and 9 9s guarantee…
it has great performance thanks to the fact that it runs on SSD drives…
It differs from other Amazon services in that it doesn't operate with the usual pay-as-you-go model where you pay for the actual amount of usage you have for any particular service…
Instead, when you create a table, you specify the throughput you require; Amazon reserves enough capacity to meet your requirements, and you pay for the amount of resources Amazon has to reserve even if you don't end up using all that throughput…
Once created, you can still change the throughput of a table on the fly without impacting performance or needing downtime…
and the amount you pay will be adjusted when you change the throughput settings.
You can create a DB that can handle a million transactions/s with a few clicks but it will cost you dearly if you need that level of throughput 24/7.
It does enable an interesting usage pattern though, I know of an advertising firm which sees little traffic all year round but gets 4 million hits/s during the Superbowl, they were able to bump their throughput requirements all the way up during the Superbowl and then change them back afterwards.
They managed to get through the Superbowl for a few thousand bucks and didn’t need to create and maintain a super-scalable, but expensive infrastructure that goes unused 99% of the time.
It is semi-schema’d which means the only schema info you need to provide is the shape of the data you are going to use as the key in your table…
For every hash key, you can specify an additional range key which you can filter on when making a query against your data.
Whilst you'll most likely be doing simple CRUD operations against specific keys - like you do in a key-value store - DynamoDB supports a limited form of querying.
A query in DynamoDB must be provided with a hash key, accompanied by a set of filters against the range key.
If you created a local secondary index on your table then you can also filter against the local secondary index instead. Speaking of indices…
It supports two types of indices, local secondary index which gives you an alternative range key to query with.
Whereas global secondary index effectively allows you to specify an alternative set of hash and range key for the whole table.
In our example here, most of the time we'll be accessing data for individual players, so we use the UserId as the hash key, which allows us to query a player's scores by the game Title.
The local index allows us to also query, for a given user, his top score.
The global index allows us to use Game Title as the hash key and TopScore as the range key, which makes it easy for us to do queries such as - give me the top 10 players for Donkey Kong by score.
DynamoDB also supports full table scans, but these tend to eat up your provisioned throughput, and take a long time to complete.
Whilst it’s running you’re also more likely to have your normal queries throttled by DynamoDB if the Scan takes up too much throughput.
If you consult the documentation for the Query API, you quickly get the sense that it's anything but straightforward
As an example, take the table we saw earlier, if we were to ask for…
The top 3 players in Starship X by their Top Score, we might write…
this… which is a) a lot of code, and b) not easily comprehensible, because there is a lot of noise.
wouldn't it be nice if you could write something like…
this instead? Which is far less code and more expressive of our intent.
Since data access is such a common task, it makes sense for us to create a DAL layer that provides a nicer abstraction for our game developers to work with.
It’s also a great place to implement Timeout, Circuit Breaker, Caching, and other useful patterns for better performance, resilience and fault tolerance
However, it’s hard to abstract over a complex API like Query, so the game developers end up having to work with the request objects from the AWSSDK directly whenever they want to Query/Scan the table.
Which means the low-level abstractions provided by the AWSSDK itself are now leaking into the abstraction above…
which is why we created…
DynamoDB.SQL
The goals of the project include…
Providing a DSL that hides the complexity of the query/scan API and to…
prevent AWSSDK from leaking through to downstream code
We wanted to create an external DSL that is SQL-like, so that it's familiar to most developers, especially someone new joining the team. To do so…
we need to parse the query into an AST and then translate that AST into a request and execute it against DynamoDB
To create this solution, we used FSharp with FParsec, which is a …
Parser combinator library. It basically means that you create basic, simple parsers and then combine them using functions to create more complex parsers. The functions that combine parsers together are called combinators.
Take this simple query for instance…
We have SELECT and FROM as keywords, GameScore is the name of the table, and for attributes we can either use a wildcard to fetch all the attributes or provide the name of the attributes we want in a comma separated list.
In the AST, we can represent the attributes as an F#…
DU, which you can think of as an Enum where each case can be associated with an arbitrary data type.
Here, we say that attributes can be either an Asterisk, or an array of strings
As for the whole query, we need to know the attributes and the table name, and we can store them inside an F#…
Record type, which is a lightweight data container where the fields are…
immutable by default.
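Putting those two ideas together, a minimal sketch of the AST for this simple query, based on the shapes described above, might look like this:

```fsharp
// a DU: the attributes are either the wildcard * or a list of names
type Attributes =
    | Asterisk
    | AttributeNames of string list

// a record: a lightweight container for the parsed query;
// fields are immutable by default
type Query =
    { Attributes : Attributes
      Table      : string }
```

For example, `{ Attributes = Asterisk; Table = "GameScore" }` would represent `SELECT * FROM GameScore`.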
To build a parser for this simple query, we first need a parser for the select keyword.
in FParsec, you can use the skipStringCI function which…
matches against the string “select” in a case insensitive way, and ignores it
we can do the same with the FROM keyword
for the table name..
DynamoDB requires a table name to consist of only letters and digits, so…
we create a predicate function by combining the isLetter and isDigit predicates built into FParsec, using…
a custom combinator that combines two predicates using OR.
The isLetter and isDigit predicates work against individual characters in a string, so…
to match a string that is NOT EMPTY, and consists of only letters and digits we use the FParsec combinator many1Satisfy and provide it with our isTableName predicate function
This creates a parser for our table name
For the attributes, we first need a parser for …
asterisk, and the stringCIReturn function here differs from the skipStringCI function we saw earlier in that..
when matched, it will return the specified value, in our case, when we match the * we will return the union case Asterisk
to parse a comma separated list of attribute names, we need a parser for individual attribute names, similar to the table name parser we saw previously
we will also need a parser for commas, which is another use of the skipStringCI function
we can then combine them together using the sepBy1 combinator function to create a parser that…
parses one or more attribute names separated by commas
now that we have both parsers, time to combine them…
with the choice combinator, which can be denoted as…
this special operator, that says…
an attribute can be either an asterisk or a comma separated list of attribute names
we now have all…
the parsers that we need, to put everything together…
we use the tuple4 combinator which takes in..
four parsers that need to match in the specified order; this creates a parser which outputs a tuple of 4 items.
The output can then be forwarded using this…
combinator operator to a function that maps the tuple into…
the query type we defined earlier, hence creating a parser that returns an instance of the Query type
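Assembled from the pieces in the last few slides, the whole parser for the simple query might look like this. It's a sketch, not the library's actual source: it assumes the FParsec NuGet package is referenced, and it re-declares the AST types to stay self-contained.

```fsharp
open FParsec

type Attributes =
    | Asterisk
    | AttributeNames of string list

type Query = { Attributes : Attributes; Table : string }

// custom combinator: OR two char predicates together
let (<||>) f g = fun c -> f c || g c

// keywords are matched case-insensitively and skipped
let pSelect = skipStringCI "select" .>> spaces1
let pFrom   = skipStringCI "from"   .>> spaces1

// table and attribute names: one or more letters/digits
let pTableName     = many1Satisfy (isLetter <||> isDigit)
let pAttributeName = many1Satisfy (isLetter <||> isDigit)
let pComma         = skipStringCI "," .>> spaces

let pAsterisk       = stringCIReturn "*" Asterisk
let pAttributeNames = sepBy1 pAttributeName pComma |>> AttributeNames

// attributes are either * or a comma-separated list of names
let pAttributes = pAsterisk <|> pAttributeNames

// match the four parts in order, then map the 4-tuple into the Query record
let pQuery =
    tuple4 pSelect (pAttributes .>> spaces1) pFrom pTableName
    |>> fun (_, attrs, _, table) -> { Attributes = attrs; Table = table }
```

Running `run pQuery "SELECT * FROM GameScore"` then yields a `Success` carrying `{ Attributes = Asterisk; Table = "GameScore" }`.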
You can then incrementally add on the WHERE, ORDER, LIMIT, and WITH clauses to the parser; all in all…
the query syntax took less than…
50 LOC, which just goes to show how…
much productivity you get from using F# and FParsec.
So to recap, we looked at…
DynamoDB’s query operation as an example of a complex API that you can simplify with an external DSL…
and how such a DSL can be easily implemented with F# and FParsec
Next, let’s look at Amazon SimpleWorkflow, which presented a whole different kind of complexity.
Simple Workflow is an orchestration service that manages the state of your workflow and provides the reporting, heartbeat and retry capabilities.
To build an application on top of SWF, you need to implement a decision worker, which is responsible for…
polling the service for tasks…
the decision task it gets back contains information about the current state of the workflow and the events that have occurred so far. Based on this, the worker needs to…
decide what happens next in the workflow, whether it should be completed, cancelled, or some more work needs to be scheduled.
And when you need to do some work, you need an…
activity worker, which also…
polls for tasks…
and the activity tasks it receives contain payloads for the activity that it needs to perform, be it to process some credit card payment or to encode some video, and when it's done it…
needs to report back to the service that the activity was completed. This triggers another decision task to be made available to the decision worker, to decide what to do next.
Your decision and activity workers don't have to run from EC2; they can run from anywhere you can access the SWF service.
Everything that happens in a workflow execution is recorded and can be viewed from the management console, the input and result of every action, when it started and when it ended.
You can retry failed executions, and workflows can be nested.
As a hello world example, imagine we have a workflow with only one activity - which takes an input string and returns a string as result - here is what the implementation looks like…
which is pretty insane how much work was required, and we haven’t even implemented common concerns such as error handling and heartbeats.
And the only thing in this craziness that is even relevant to what I wanted to get out of the service is…
this line at the top…
there's gotta be a better way. Writing a truckload of code should never be your first solution to a problem; every line of code you write has a cost - from the time it takes to write it, to the time it takes to read it, comprehend it, and maintain it.
And consider that by making it easier for you to write more code, you've also made it easier to create cost for your company, which has a non-zero probability of outweighing the value that line of code brings.
Before we look at what our interpretation of a better way is, let's look at the problem more closely.
For any application that is built on top of Amazon simple workflow…
first, there’s the service API…
then there’re the recurring concerns such as polling, heartbeats, health checks and error handling…
Then finally you have the things that are specific to your workflow, the activity and decision workers.
That’s a lot of things you need to implement just to get going, but really, the only thing that is unique to your workflow are the activities it needs to perform, and everything else is…
just glorified boilerplate. So whilst Simple Workflow’s APIs are pretty simple, as a consumer of the API, there are a lot of complexities that have been pushed down to you.
Which reminds me of a post that Kris Jordan wrote a while back called ‘the complexity of simplicity’…
where he said that…(read)…he went on to expand on this point by talking about how Google hides all the complexities around internet search and gives its users a simple UI with which they can do more with less.
The guys at Stripe have done a very similar thing with taking credit card payments.
It’s a really good read, here’s a bit.ly link to the post for you to have a look later on.
in the decision worker implementation earlier, you know, the thing that orchestrates the workflow, plumbing aside, the highlighted block of code is responsible for scheduling the activity, and completing or failing the workflow after the activity either completes or fails.
Looking at this block of code, I can't easily reconstruct the mental model I had…
of the workflow I set out to implement. The problem here is that…
the workflow itself is never made explicit, but rather, implied by the logic that’s coded up in the decision worker, which is…
far from ideal. Instead, the mental model you have of a workflow should be…
driving the decision worker logic, or better, it should…
automate the decision worker logic…
and that’s why we created…
a simple workflow extensions library which aims to…
remove the boilerplate code you need to write, and…
allow you to write code that actually matches your mental model of workflows
and with that, let’s have a look how we’d implement our hello world example using our DSL
The DSL allows you to configure the Workflow and attach activities to it using a simple…
arrow-like operator, and when you read this code, you just follow the arrow to see…
what your workflow does
with each activity you add, you need to provide some configuration values and a delegate function which will be called when an activity task is received.
It also automatically registers the workflow and activities with the service, something that you’d have to do manually otherwise
Simple workflow allows you to nest workflows, so you can start and wait for another workflow to complete before continuing on in the parent workflow, to do this in the DSL…
you might write something along the lines of…
notice how we created a child workflow here, with its own activities, and…
simply added it to a parent workflow as if it’s another activity
the DSL takes care of the propagation of input and results so that the result of the previous activity or child workflow is passed as the input for the next activity or workflow
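The chaining idea can be sketched in a few lines of F#. Everything below (Stage, Workflow, the ==> operator, the toy runner) is illustrative only - it is not the actual API of the extensions library - but it shows how an infix operator plus result-to-input threading gives you code you read by following the arrows:

```fsharp
// illustrative types only, not the library's actual API
type Stage =
    { Name : string
      Work : string -> string }   // an activity: input in, result out

type Workflow =
    { Name   : string
      Stages : Stage list }

// the "arrow": append a stage, so the code reads in execution order
let (==>) (workflow : Workflow) (stage : Stage) =
    { workflow with Stages = workflow.Stages @ [stage] }

// toy runner: thread each stage's result into the next stage's input
let execute input (workflow : Workflow) =
    workflow.Stages |> List.fold (fun acc stage -> stage.Work acc) input

let greet = { Name = "greet"; Work = sprintf "hello, %s" }
let shout = { Name = "shout"; Work = fun (s : string) -> s.ToUpper() }

let helloWorld =
    { Name = "hello_world"; Stages = [] }
    ==> greet
    ==> shout
```

With these toy definitions, `execute "world" helloWorld` runs the stages in the order the arrows read, feeding each result into the next stage.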
In case of exceptions, it also takes care of capturing exception details and reporting back to the service so that you can see what went wrong in the management console
you can optionally specify how many times an activity or child workflow should be retried
Now, to recap, we looked at…
Simple Workflow and how its simple API ends up pushing a lot of the complexities to you, the consumer, and how we addressed these complexities with…
a super-simple internal DSL in FSharp that consists of an operator, and two types, that…
…removes the need for boilerplate, and lets you create workflows with code that visually informs of what the workflow is going to do.
It takes care of the heavy lifting involved so that you can…
focus on building things that actually add value to your customers
Finally, let’s have a look at Amazon CloudWatch..
CloudWatch is a monitoring service that lets you collect and monitor metrics, and set alarms on those metrics.
You can monitor anything from CPU usage, to database latencies, as well as any custom metrics you track from your application.
It also comes with a nice web UI for browsing your metrics and charting them.
It’s an immensely important and useful service, but, there are shortcomings…
Presumably for performance reasons, the web UI limits you to 200 metrics when browsing, beyond that you have to know what you’re looking for to find it, so discovery is out of the question, but more importantly…
to find correlations between events such as latency spikes, you have to manually search for, and inspect every latency metric yourself…
it’s DevOps where you do all the monkey work, and you can’t even easily answer simple questions such as…
…(read)… driven by pain and anger we decided to automate with…
Amazon.CloudWatch.Selector, which gives us the ability to express queries using both internal and external DSLs
For instance, if we want to find latency metrics that, at some point in the last 12 hours, exceeded a 1s average…
we can express our question using an internal DSL that is very human-readable, or as an…
external DSL that is very similar in syntax and identical in capability
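The core idea behind such an internal DSL is just predicate combinators: each primitive is a function from a metric to a bool, and an operator combines them. The sketch below is hypothetical - the type and function names are illustrative, not the library's actual API:

```fsharp
open System.Text.RegularExpressions

// a metric with the fields the example queries filter on (illustrative)
type Metric = { Namespace : string; Name : string; Average : float }

// primitive filters: each is just a Metric -> bool predicate
let namespaceLike pattern (m : Metric) = Regex.IsMatch(m.Namespace, pattern)
let nameLike pattern (m : Metric)      = Regex.IsMatch(m.Name, pattern)
let averageExceeds threshold (m : Metric) = m.Average > threshold

// combinator: AND two filters together
let (<&>) f g = fun m -> f m && g m

// "CPU metrics under the ElastiCache namespace whose average exceeded 1000"
let query =
    namespaceLike "(?i)elasticache" <&> nameLike "(?i)cpu" <&> averageExceeds 1000.0
```

Filtering a list of metrics is then just `metrics |> List.filter query`.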
or, suppose you want to find out if any of your cache nodes had a CPU spike in the last 24 hours…
you might try something that targets specific metrics, in our case, CPU metrics under the Amazon ElastiCache namespace, and…
whenever you see a function name that ends with 'like', it means it supports regex
overall the syntax for the DSL is quite small…
as you can see, the internal DSL is implemented in a handful of lines…
the external DSL is slightly more involved, and this time around I chose to implement it using active patterns, which is a…
language feature in FSharp that allows you to abstract away, and give names to patterns so that they can be easily reused in pattern matching…
there are single-case patterns, which are similar to normal functions; this pattern here…
matches against a string parameter called input, and either returns a float, or errors…
and we’ll give this pattern a name, enclosed in what we call the banana clip operators…
this pattern has a type signature of taking in a string, and returning a float, to use this pattern you could apply it…
in a match…with clause where you can stack them up, or even compose different patterns together…
and notice that the return value of the pattern can be bound and used in the pattern matching too…
but, if someone passes in a string that can't be parsed to a float…
then this code will error.
But sometimes you need a pattern that doesn’t always match, which is when you can use a partial active pattern instead…
So here we can rewrite the earlier pattern to return an option…
so that if we're able to parse the string as a float, we'll…
return the value of the float as Some, which is equivalent to Haskell's Just.
And when we can’t parse it as a float, then we return None, which is again, equivalent to Haskell’s Nothing.
So now we can rewrite our pattern matching code earlier with an additional clause that catches cases where the input string…
cannot be parsed as a float.
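Putting the partial pattern and the extended match together, a runnable version of what the slides describe looks like this (the `describe` wrapper is added here just to make the example self-contained):

```fsharp
open System

// partial active pattern: Some when the input parses as a float,
// None otherwise, so a non-numeric string falls through to the next clause
let (|Float|_|) (input : string) =
    match Double.TryParse input with
    | true, n -> Some n
    | _       -> None

let describe someString =
    match someString with
    | Float 42.0 -> "ftw"
    | Float 11.0 -> "palprime"
    | Float x    -> sprintf "just %f" x
    | _          -> "not a float"
```

Unlike the single-case version, `describe "hello"` no longer throws - it simply falls through to the final clause.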
and finally, if the input value cannot be classified into just something or nothing, then you can use the multi-case active pattern…
here we’ll try to parse the input string as a float and if the resulting float is a prime number, then return it as a Prime…
alternatively, return it as a NotPrime…
and if we couldn’t even parse the input then return NaN instead.
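A runnable sketch of that multi-case pattern follows; the naive trial-division primality check and the `classify` wrapper are my additions to keep the example self-contained:

```fsharp
open System

// multi-case active pattern: classify the input as a prime float,
// a non-prime float, or not a number at all
let (|Prime|NotPrime|NaN|) (input : string) =
    match Double.TryParse input with
    | true, n when n > 1.0 && n = Math.Floor n ->
        let i = int n
        // naive trial division; fine for a sketch
        if seq { 2 .. i - 1 } |> Seq.forall (fun d -> i % d <> 0)
        then Prime n
        else NotPrime n
    | true, n -> NotPrime n
    | _       -> NaN

let classify input =
    match input with
    | Prime n    -> sprintf "%.1f is prime" n
    | NotPrime n -> sprintf "%.1f is not prime" n
    | NaN        -> "not a number"
```

Note that a multi-case pattern must be total: every input is classified into exactly one of the named cases, so the match needs no wildcard clause.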
and from here you can use these patterns in pattern matching as you would with any other. You can even compose them together to make your pattern matching code even more expressive…
You can use active patterns to really easily create parsers, and compose them to make more interesting parsers like we did with FParsec…
one of the most pleasing things for me was that I was able to rely on the type system to guide me along the way, and after I wrote both DSLs in a 2-hour sprint, everything compiled and worked as expected at the first time of asking, which gave me a beautiful feeling - just what I needed at 4am!
so even on its own, using only vanilla functions and active patterns, F# is still awesome for building DSLs
for the internal DSL, it can be used from anywhere you can run F# code, including the REPL, or executables…
the external DSL is mainly useful for building tools with, such as a CLI…
which is available via Chocolatey
with the CLI tool, you can write your queries using the external DSL syntax, and if any matching metrics are found, you can plot them on graphs for visual inspection
To recap, we looked at CloudWatch and how its impedance mismatch turns routine post-mortem investigations into an exercise…
of finding a needle in the haystack, and how simple internal and external DSLs make that needle…
that much easier to find
and that concludes our case studies, but F#’s awesomeness doesn’t end here
in fact, we have used F# to build a number of tools and frameworks for working with specific Amazon services, including…
a type provider for S3…
Which allows you to interactively navigate and browse objects in your S3 account without even leaving Visual Studio, and you get full IntelliSense support on the buckets and objects you find
you can also search for items in S3, which is useful for buckets with a large number of objects; our buckets for player states have millions of objects, each with up to thousands of versions.
we also have Reacto-Kinesix, a framework…
for building real-time stream processing applications with Amazon Kinesis service
and Metricano, a library for…
collecting and publishing custom metrics to services such as Amazon CloudWatch, whereby you can…
either track metrics manually from your code or use PostSharp aspects to auto-instrument your methods to record execution count and time metrics.
These metrics can then be hooked up to publishers such as the built-in CloudWatch publisher or…
one of your own.
You can use the PostSharp aspects to target individual methods, or multi-cast to all methods in a class or assembly to apply tracking to all your service entry points, or database calls…
and to publish the metrics to CloudWatch, you just need to call Publish.With with an instance of CloudWatchPublisher
and then you’ll be able to see your metrics on the CloudWatch management console, simple!
and with that, thank you very much for listening
We have the first ever FSharp Exchange in London next April, so hope to see you there.
If you have any questions for me after today, feel free to get in touch