Speaker: Raphael Londner, Developer Advocate, MongoDB
Speaker: Paul Sears, Partner Solutions Architect, Amazon Web Services
Level: 200 (Intermediate)
Track: Atlas
In this session, AWS Solutions Architect Paul Sears will provide an overview of AWS Lambda functions, including some key integration use cases with MongoDB Atlas. Developer Advocate Raphael Londner will walk you through how to code a Lambda function connected to MongoDB Atlas, with a specific focus on performance optimization. Raphael will then demonstrate how to orchestrate multiple Lambda functions inside a state machine built on top of AWS Step Functions.
What You Will Learn:
- Common use cases for which MongoDB Atlas + AWS Lambda help you boost developer productivity and minimize operational costs.
- How to write a performance-optimized Lambda function that re-uses MongoDB Atlas database connections across multiple calls in order to speed up queries.
- How AWS Step Functions can help you easily build application workflows to coordinate your Lambda functions.
5. #MDBW17
SERVERLESS COMPUTING - AWS LAMBDA
Run code without provisioning or managing servers – pay only for the
compute time you consume.
6. #MDBW17
BENEFITS OF AWS LAMBDA
No Servers to Manage
AWS Lambda handles:
• Operations and management
• Provisioning and utilization
• Scaling
• Availability and fault tolerance
Continuous Scaling
Automatically scales your application, running code in response to each trigger
Your code runs in parallel and processes each trigger individually, scaling precisely with the size of the workload
Subsecond Metering
Pricing
• CPU and network scaled based on RAM (128 MB to 1500 MB)
• $0.20 per 1M requests
• Price per 100ms
7. #MDBW17
AWS LAMBDA USE CASES
• Web applications
• Data processing (real-time streaming analytics)
• Scalable back ends (mobile apps, IoT devices)
• Amazon Alexa
• Chatbots
8. #MDBW17
Web Applications
• Static websites
• Complex web apps
• Packages for Flask and Express
Data Processing
• Real time
• MapReduce
• Batch
Chatbots
• Powering chatbot logic
Backends
• Apps & services
• Mobile
• IoT
Amazon Alexa
• Powering voice-enabled apps
• Alexa Skills Kit
Autonomous IT
• Policy engines
• Extending AWS services
• Infrastructure management
Data processing: Lambda + S3
COMMON USE CASES
13. #MDBW17
COMPONENTS OF A SERVERLESS APP
EVENT SOURCE
Requests to
endpoints
Changes in
resource state
FUNCTION SERVICES
14. #MDBW17
ANATOMY OF A LAMBDA FUNCTION
• Handler function (or method)
• An input parameter
‒ Integer, double, string, JSON…
• A context parameter
‒ Provides information such as
o Remaining time until timeout, log group and stream, request ID, etc…
‒ Exposes public properties
• A return value (optional)
15. #MDBW17
ANATOMY OF A LAMBDA FUNCTION
(NODE.JS)
• Handler function in index.js
exports.myHandler = function(event, context, callback) {…}
• Handler format in AWS Lambda: [moduleName].[functionName]
index.myHandler
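As a minimal sketch of the anatomy described above (not code shown in the session), the handler below reads a few properties from the context parameter and returns a value through the callback. The `event.name` field and the shape of the response are hypothetical and only there for illustration.

```javascript
'use strict';

// Referenced in the Lambda configuration as index.myHandler
// (assuming this file is saved as index.js).
function myHandler(event, context, callback) {
  // The context parameter describes the current invocation.
  const info = {
    requestId: context.awsRequestId,
    logGroup: context.logGroupName,
    logStream: context.logStreamName,
    msRemaining: context.getRemainingTimeInMillis(),
  };

  // `event.name` is a hypothetical input field.
  const greeting = 'Hello, ' + (event.name || 'world');

  // First argument: an error (null on success). Second: the return value.
  callback(null, { greeting: greeting, info: info });
}

exports.myHandler = myHandler;
```

Because the handler is a plain function, it can be called locally with a hand-built context object before it is ever deployed.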
16. #MDBW17
ANATOMY OF A LAMBDA FUNCTION (JAVA)
• Handler function in example.Hello class
package example;
public class Hello {
    public String myHandler(type inputVal, Context context) {...}
}
• Handler format in Lambda: [package].[className]::[functionName]
17. #MDBW17
ANATOMY OF A LAMBDA FUNCTION (PYTHON)
• Handler function in hello_python.py file
def my_handler(event, context):
    …
    return some_value
• Handler format in Lambda: [fileName].[functionName]
18. #MDBW17
ANATOMY OF A LAMBDA FUNCTION (C#)
• Handler function in Example.Hello
namespace Example {
    public class Hello {
        public Stream MyHandler(type inputVal, ILambdaContext context) {...}
    }
}
• Handler format in Lambda:
[assemblyName]::[namespace].[className]::[methodName]
19. #MDBW17
HOW TO DEVELOP FOR LAMBDA & ATLAS
IN NODE.JS
• Sign up for Atlas
• Create a cluster
• Create a DB user
• Copy cluster URI to a safe location
• Init a Node project
• Import MongoDB Node driver
• Write code!
20. #MDBW17
HOW TO TEST AND DEPLOY YOUR LAMBDA
• Use lambda-local to test locally
• Zip your Node.js code
• Create your Lambda function
• Upload your package to AWS Lambda
• Set env. variables, memory, security…
• Test in AWS (Console, API Gateway)
24. #MDBW17
BEST PRACTICES: LAMBDA FUNCTION
Use the right “timeout”
Use the function’s local storage: about 500 MB is available in /tmp
Lower costs and improve performance by minimizing startup code that is not directly related to processing the current event
Use the built-in CloudWatch monitoring of your Lambda functions to view and optimize request latencies
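One possible way to combine the /tmp and startup-code advice above is to build an expensive artifact once and keep it in local storage for later invocations on the same warm container. This is only a sketch: `loadCached`, its `name` argument and the `produce` callback are hypothetical helpers, not part of any AWS API.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Returns the cached copy of an expensive-to-build text artifact, building
// it only when no copy exists yet in the scratch directory.
// `dir` defaults to /tmp, the local storage mentioned on the slide.
function loadCached(name, produce, dir = '/tmp') {
  const file = path.join(dir, name);
  if (fs.existsSync(file)) {
    // Warm container: reuse the file written by an earlier invocation.
    return fs.readFileSync(file, 'utf8');
  }
  // Cold container: do the work once, then keep the result for later.
  const content = produce();
  fs.writeFileSync(file, content, 'utf8');
  return content;
}
```

Files in /tmp are not shared between containers and disappear when a container is recycled, so anything stored there should be safe to rebuild.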
25. #MDBW17
THINGS TO REMEMBER: LAMBDA FUNCTION
Memory = “Power level”
Higher levels offer more memory and more CPU power
Functions don’t have a notion of state
Use MongoDB Atlas, S3, ElastiCache, or AWS Step Functions
Wrap your config in a function and call it from your published code
Use the right access control for downstream services
IAM roles and permissions for AWS services
VPC for private endpoints
KMS for storing credentials for downstream endpoints
26. #MDBW17
THINGS TO REMEMBER: LAMBDA
APPLICATION
Lambda scales by events/requests
Plan for concurrent request rate on downstream services
Shared scaling responsibility for VPC enabled functions
Sufficient IPs to match your expected concurrency
at least one subnet in each availability zone
Retries are built in for asynchronous and Stream invokes
Plan for retries for synchronous applications
27. #MDBW17
WHERE NOT TO CONSIDER LAMBDA
(TODAY)
• Large software dependencies: Custom software applications with
licensing agreements such as MS-Office document processing, EDA
tools, Oracle databases, etc.
• OS dependencies: Software packages or applications which rely on
calling underlying Windows RPCs
• Custom hardware: GPU acceleration, hardware affinity
29. #MDBW17
AWS LAMBDA WITH MONGODB ATLAS
• Store database connection string in an environment variable
‒ Use –E parameter with lambda-local
‒ Reference it with process.env['MONGODB_ATLAS_URI']
• Encrypt database connection string in AWS Lambda
‒ Use AWS.KMS() to decrypt() the connection string
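The two bullets above could be combined roughly as follows. This is a sketch under stated assumptions: it relies on the AWS SDK for JavaScript (v2) `KMS.decrypt` call, and the "starts with mongodb" test is a hypothetical convention for telling a plaintext URI apart from a base64-encoded ciphertext. `resolveAtlasUri` is an illustrative name, not an API of either product.

```javascript
'use strict';

// Assumption: aws-sdk (v2) is available to the function. It is required
// lazily so that plaintext setups never need to load it.
function defaultKms() {
  const AWS = require('aws-sdk');
  return new AWS.KMS();
}

// Resolves the Atlas connection string from an environment object.
// `kms` is optional and exists so a stub can be passed in during tests.
function resolveAtlasUri(env, callback, kms) {
  const value = env['MONGODB_ATLAS_URI'];
  if (!value) {
    return callback(new Error('MONGODB_ATLAS_URI is not set'));
  }
  // Hypothetical convention: a plaintext URI starts with "mongodb".
  if (value.startsWith('mongodb')) {
    return callback(null, value);
  }
  // Otherwise treat the value as a base64-encoded KMS ciphertext.
  const client = kms || defaultKms();
  client.decrypt({ CiphertextBlob: Buffer.from(value, 'base64') }, (err, data) => {
    if (err) return callback(err);
    callback(null, data.Plaintext.toString('ascii'));
  });
}
```

In a handler this would be called as `resolveAtlasUri(process.env, ...)`, ideally once per container, with the decrypted string kept in a variable outside the handler.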
30. #MDBW17
PERFORMANCE BEST PRACTICES WITH
NODE.JS
• Declare the db object outside the handler method
• Do NOT close the db object!
• Set context.callbackWaitsForEmptyEventLoop to false
• Try to re-use the db object if db.serverConfig.isConnected() returns true
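The four practices above might look like this when put together. It is a minimal sketch, not the code demonstrated in the session: it assumes the callback-style API of the 2.x MongoDB Node.js driver that the slide refers to (where `MongoClient.connect` yields a db object exposing `serverConfig.isConnected()`), and the `items` collection name and the `getDb` helper are hypothetical.

```javascript
'use strict';

// Declared outside the handler so it survives between invocations for as
// long as Lambda keeps this container warm.
let cachedDb = null;

// Assumption: the 'mongodb' driver (2.x API) is bundled with the function.
function defaultConnect(uri, callback) {
  require('mongodb').MongoClient.connect(uri, callback);
}

// `connect` is optional and exists so a stub can be passed in during tests.
function getDb(uri, callback, connect) {
  // Re-use the cached object when the driver still reports it as connected.
  if (cachedDb && cachedDb.serverConfig && cachedDb.serverConfig.isConnected()) {
    return callback(null, cachedDb);
  }
  (connect || defaultConnect)(uri, (err, db) => {
    if (err) return callback(err);
    cachedDb = db;
    callback(null, db);
  });
}

function handler(event, context, callback) {
  // Return as soon as callback() fires instead of waiting for the open
  // connection to drain from the event loop.
  context.callbackWaitsForEmptyEventLoop = false;
  getDb(process.env['MONGODB_ATLAS_URI'], (err, db) => {
    if (err) return callback(err);
    // `items` is a hypothetical collection name.
    db.collection('items').insertOne(event, (insertErr, result) => {
      if (insertErr) return callback(insertErr);
      // No db.close() here: the connection stays open for the next call.
      callback(null, { insertedId: String(result.insertedId) });
    });
  });
}

exports.handler = handler;
```

The first invocation on a cold container pays the connection cost; later invocations on the same container skip it, which is where the speed-up comes from.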
35. #MDBW17
“I WANT TO SEQUENCE FUNCTIONS”
“I want to select functions based on data”
“I want to retry functions”
“I want try/catch/finally”
Turning functions into apps
“I have code that runs for hours”
“I want to run functions in parallel”
36. #MDBW17
BENEFITS OF AWS STEP FUNCTIONS
Productivity
Easy to connect and coordinate distributed components and microservices to quickly create apps
Agility
Diagnose and debug problems faster
Adapt to change
Resilience
Manages the operations and infrastructure of service coordination to ensure availability at scale, and under failure
41. DEFINE IN JSON AND THEN VISUALIZE IN THE CONSOLE
{
  "Comment": "Hello World Example",
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME",
      "End": true
    }
  }
}
42. #MDBW17
EXECUTE ONE OR ONE MILLION
[Figure: eight copies of the HelloWorld state machine graph (Start, HelloWorld, End), illustrating many concurrent executions]
44. #MDBW17
SEVEN STATE TYPES
Task: A single unit of work
Choice: Adds branching logic
Parallel: Fork and join the data across tasks
Wait: Delay for a specified time
Fail: Stops an execution and marks it as a failure
Succeed: Stops an execution successfully
Pass: Passes its input to its output
47. #MDBW17
INTEGRATE WITH OTHER AWS SERVICES
• Create state machines and Activities with AWS CloudFormation
• Call Step Functions with Amazon API Gateway
• Start state machines in response to events or on a schedule with
CloudWatch Events
• Monitor state machine executions with CloudWatch
• Log API calls with CloudTrail
51. #MDBW17
FOLLOW-ON SESSION
• What: MongoDB Atlas and Serverless Architectures
• When: Wednesday, June 21 at 4.30PM
• Where: Crystal A
• Who: Tosin Ajayi, Sr. Solutions Architect
52. #MDBW17
REFERENCE LINKS
• Serverless development with Node.js, Lambda and MongoDB Atlas
• Optimizing AWS Lambda performance with MongoDB Atlas and
Node.js
• Step Functions with MongoDB Atlas Part 1
• Step Functions with MongoDB Atlas Part 2
• Lambda with MongoDB GitHub repository
• Step Functions with MongoDB GitHub repository
56. #MDBW17
AWS LAMBDA = SERVERLESS COMPUTE
COMPUTE
SERVICE
Run arbitrary code
without managing
servers
EVENT
DRIVEN
Code only runs when it
needs to run
57. #MDBW17
LANGUAGE GUIDANCE
• Best overall choice: Python
‒ Lowest startup latency (~50ms)
‒ Supports most popular version (2.7)
• Fastest warm option: Java
‒ Downside is cold start latency: Up to 30s for 128MB size
‒ Use larger sizes to improve latency
‒ Avoid loading unnecessary classes on the critical path
59. #MDBW17
“I WANT TO SELECT FUNCTIONS BASED ON DATA”
With AWS Step Functions, it was easy to build a multi-step product
updating system to ensure our database and website always have the
latest price and availability information.
AWS Step Functions let us replace a manual updating process with an
automated series of steps, including built-in retry conditions and error
handling, so we can reliably scale before a big show, and keep pace
with rapidly changing fashions.
Jared Browarnik, CTO, TheTake
63. #MDBW17
AWS LAMBDA VPC ESSENTIALS
• All Lambda functions run in a VPC, all the time
• You can also grant Lambda functions access to resources in
your own VPC (optional)
• Functions configured for VPC access lose internet access by
default
• The ENIs used by Lambda’s VPC feature hit your quota
• Ensure your subnets have enough IPs for those ENIs.
• Specify at least one subnet in each Availability Zone
64. #MDBW17
“I WANT TO SEQUENCE FUNCTIONS”
With AWS Step Functions, we can easily change and iterate on
the application workflow of our food delivery service in
order to optimize operations and continually improve
delivery times. AWS Step Functions lets us dynamically
scale the steps in our food delivery algorithm so we can
manage spikes in customer orders and meet demand.
Mathias Nitzsche, CTO, foodpanda
Hello, my name is Raphael Londner with MongoDB, and I am here with Paul Sears from Amazon Web Services.
Today we are going to discuss building serverless applications with MongoDB Atlas, AWS Lambda and AWS Step Functions.
During this talk we will be covering the following topics
An overview of AWS Lambda
A deep dive into the code
A demo of Lambda and MongoDB Atlas
Some best practices for using Lambda and MongoDB Atlas
And finally an introduction to AWS Step Functions
Thank you Raphael
So what is Serverless computing? Serverless computing allows you to build and run applications and services without thinking of servers. With serverless computing, your application still runs on servers, but all the server management is done by a service provider. You no longer have to provision, scale, and maintain servers; install and operate databases and storage systems; or manage connections and messages from mobile and IoT devices with serverless architectures. You also no longer need to worry about application fault tolerance and availability. Instead, a service provider does all of this for you.
And AWS Lambda lets you use serverless computing.
Some benefits of AWS Lambda are that there are no servers to manage: no worrying about provisioning, availability or fault tolerance of the infrastructure. AWS Lambda provides continuous scaling as well as subsecond metering, so you only pay for the resources you actually use.
So how do you use AWS Lambda? Some common patterns are web applications, data processing, scalable back ends, Amazon Alexa integration, and chatbots.
Let's talk about Data Processing. For example, a common use case is to automatically generate various image sizes from an image uploaded into S3. When the upload is completed, you can have S3 trigger AWS Lambda to process the image and generate the desired formats
Another example of processing data is to use AWS Lambda to act upon data from a stream, such as from Kinesis. Kinesis can trigger a Lambda function and the Lambda function can operate on the data, such as creating metadata, and then store it somewhere, such as MongoDB Atlas.
Another popular use case for AWS Lambda is Alexa skills. Alexa calls a Lambda function to do something; in this example, Alexa triggers Lambda to post a message in a Slack channel, and the Lambda function can then poll for a response that Alexa can say back to the user.
AWS Lambda can also be used for Autonomous IT such as policy engines, or managing your infrastructure.
So now I will hand the discussion over to Raphael to explain how he is using AWS Lambda and MongoDB Atlas.
Sign up for Atlas at www.mongodb.com/atlas
Create an M0 free cluster
Grab your Atlas connection string
Initialize a Node.js project
Import MongoDB Node.js driver (or Mongoose)
AWS SDK for JavaScript if necessary
Write code! (using your Atlas connection string)
Use the lambda-local NPM package to test locally
Zip your Node.js code with ALL modules
Except aws-sdk if used
Create your Lambda function in the AWS Console
Use the AWS CLI, or packaging tools such as ClaudiaJS or Serverless.com
Upload your package to AWS Lambda
Create your environment variables (if necessary)
Test your function in the AWS Console
Now we will share some best practices for AWS Lambda
Here are some things to remember when using AWS Lambda.
First - Memory is your performance dial. When running within Lambda, you control how much CPU and memory is available to your function by configuring its memory. If your function is CPU bound, higher settings mean faster runtimes!
Second – functions don’t have a built-in notion of state. If you need configuration persisted across invocations, or need to persist a session across multiple visits to your web app, you should store it in an external store such as MongoDB Atlas or S3. Another important aspect to remember is that you have granular controls to secure the interactions between each of these components independently. You control which functions can be invoked by which event source, using resource policies on the Lambda function, and you control exactly which AWS services your function has access to using an IAM role or an appropriate VPC configuration. You can also leverage KMS to store secrets for downstream endpoints.
One of the key benefits of the serverless approach to application development is that scalability and reliability are built in. Lambda will automatically spin up enough copies of your function to handle the incoming event rate; however, this can cause an impedance mismatch with downstream services that don’t have this kind of scaling behavior. You can either ensure sufficient write capacity for downstream databases, or appropriately configured API call limits for any services being invoked; or, introduce a buffer between the Lambda function and the downstream service using a Kinesis Stream.
Scaling also becomes an important consideration when leveraging the recently announced Lambda feature that allows functions to access resources within a VPC. To use this feature, you specify the VPC subnets that a function can access, ideally one within each availability zone. When the function executes, it leverages an Elastic Network Interface or ENI to connect to resources within the specified subnets. Each concurrent request could require its own ENI, which in turn could require an IP, so ensure you size your subnet to match your expected concurrency. This makes scaling a shared responsibility for VPC based functions.
And again, a reminder on Lambda’s retry behavior – asynchronous invokes and invokes from stream event sources have retries built in. For synchronous invocations, such as those originating from API Gateway, you control the retries in the client as required.
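For the synchronous case, client-side retries are often implemented with exponential backoff. The helper below is a generic sketch of that idea, not something provided by AWS: `withRetries`, the attempt count and the base delay are all illustrative choices.

```javascript
'use strict';

// Retries an async operation with exponential backoff.
// `attempts` and `baseDelayMs` are illustrative defaults, not AWS values.
async function withRetries(operation, attempts = 3, baseDelayMs = 100) {
  let lastError;
  for (let i = 0; i < attempts; i += 1) {
    try {
      // The attempt index is passed along in case the caller wants to log it.
      return await operation(i);
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait baseDelayMs, then twice that, then four times that, and so on.
        const delay = baseDelayMs * Math.pow(2, i);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

A client would wrap its synchronous call (for example an HTTPS request to an API Gateway endpoint) in `operation`. Retried requests can reach the function more than once, so the function itself should be safe to run repeatedly for the same input.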
There are some patterns where Lambda isn’t the optimal choice. For example, enterprise applications that require licensing agreements typically don’t work well in Lambda functions.
Also, if your application stack relies on Windows RPC calls, these won’t work well with Lambda.
And if your application relies on specialized or custom hardware, such as GPU acceleration, then it isn’t a good candidate as a lambda function.
As discussed, Lambdas are stateless, event driven functions that exist in the cloud.
But there aren’t that many apps with only one function, one entry point, one module, one component. So there’ll be more than one function. NEXT
In fact, it’ll be common to have LOTS of functions, lots of them talking to each other. NEXT
And in fact applications, serverless or not, tend to have databases. NEXT
And in the cloud, I notice that a lot of them have queues of one kind or another. NEXT
Some of the Lambda functions connect to servers. NEXT
This is more like what an actual modern serverless app might really look like.
You’ll notice that those arrows are in different colors –because there are lots of ways for functions and other serverless components to talk with each other and control each other and orchestrate each other. Before we dive into the “how”, let’s talk about WHAT we want to do with functions when we have more than one.
These are the things that people are trying to accomplish with those different-colored arrows, when coordinating a set of functions into an application.
Productivity: Spend more time thinking about innovating the business logic that makes your application unique, and your applications are easier to operate and maintain.
Agility: AWS Step Functions records a history of each execution, so you can review all the events, in sequence, in one place. Scale instantly from one execution to hundreds of thousands of concurrent executions, especially when used with other serverless resources such as AWS Lambda, Amazon S3, and MongoDB Atlas. With AWS Step Functions, you pay only for what you use, when you use it.
Resilience: AWS Step Functions supports automatic error handling for graceful exits, and operates at scale without you needing to configure or manage its underlying resources.
So one example is you are planning to travel to the Grand Canyon. You can break this down into a series of activities, such as booking a flight, and booking a hotel, and booking a rental car. You can run each step in sequence.
But if your plans fail, for example you cannot book a car, you want to go back and cancel your hotel and your flight. You will need some kind of decision or coordination to complete all the steps. You don't want to waste money on your flight if you don't have any way to drive to the Grand Canyon.
In other words, you can use a state machine, such as AWS Step Functions.
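The trip example could be written down as a state machine along these lines. This is a sketch only: the state names, the Lambda function names and the `lambdaArn` helper are hypothetical, and the definition is built as a JavaScript object so that it can be serialized to the JSON that Step Functions expects.

```javascript
'use strict';

// Hypothetical ARN builder; REGION and ACCOUNT_ID are placeholders.
const lambdaArn = (name) => 'arn:aws:lambda:REGION:ACCOUNT_ID:function:' + name;

// If a later booking fails, the Catch rules route the execution through the
// cancel steps for everything booked so far. ResultPath keeps the original
// input and stores the error details next to it under $.error.
const onError = (next) => [{ ErrorEquals: ['States.ALL'], ResultPath: '$.error', Next: next }];

const tripDefinition = {
  Comment: 'Book a trip and undo earlier bookings if a later one fails',
  StartAt: 'BookFlight',
  States: {
    BookFlight: { Type: 'Task', Resource: lambdaArn('BookFlight'), Next: 'BookHotel' },
    BookHotel: {
      Type: 'Task',
      Resource: lambdaArn('BookHotel'),
      Catch: onError('CancelFlight'),
      Next: 'BookCar',
    },
    BookCar: {
      Type: 'Task',
      Resource: lambdaArn('BookCar'),
      Catch: onError('CancelHotel'),
      End: true,
    },
    CancelHotel: { Type: 'Task', Resource: lambdaArn('CancelHotel'), Next: 'CancelFlight' },
    CancelFlight: { Type: 'Task', Resource: lambdaArn('CancelFlight'), Next: 'TripFailed' },
    TripFailed: { Type: 'Fail', Error: 'BookingFailed', Cause: 'A booking step failed' },
  },
};

// JSON text of the definition, ready to paste into the console.
const tripJson = JSON.stringify(tripDefinition, null, 2);
```

The happy path runs the three bookings in sequence, while a failure in a later step walks back through the cancellations before the execution is marked as failed.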
A coordination solution must have several characteristics
It needs to scale out as demand grows. You should be able to run one execution, or run thousands.
You can never lose state.
It deals with errors and timeouts, and implements things like try/catch/finally
It is easy to use and easy to manage
It keeps a record of its operations and is completely auditable
With AWS Step Functions, you define your workflow as a series of steps and transitions between each step, also known as a state machine.
In Step Functions, we call these “states” and “state transitions”. A simple state machine looks like this, and you define it using simple commands in JSON.
When you start a state machine, you pass it input in the form of JSON, and each state changes or adds to this JSON blob as output, which becomes input to the next state. The console provides this visualization and uses it to provide near-real-time information on your state machine execution.
This is a classic Hello World example. The JSON code on the left generates the graph on the right.
We specify where we start, we define each state, and we define each state transition. And then we get back to working on the code that makes our apps unique.
The power of Step Functions is that you can define a state machine once, and then run one execution or many thousands of concurrent executions.
This allows you to break down big tasks into a set of smaller, simpler tasks.
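Starting many executions of one state machine could be scripted as sketched below. It assumes the AWS SDK for JavaScript (v2) `StepFunctions.startExecution` call; `startExecutions` is a hypothetical helper, and the optional `client` argument exists only so a stub can be substituted in tests.

```javascript
'use strict';

// Assumption: aws-sdk (v2) is installed and credentials are configured.
function defaultClient() {
  const AWS = require('aws-sdk');
  return new AWS.StepFunctions();
}

// Starts one execution per input object and resolves to the execution ARNs.
function startExecutions(stateMachineArn, inputs, client) {
  const sfn = client || defaultClient();
  return Promise.all(inputs.map((input) =>
    sfn
      .startExecution({
        stateMachineArn: stateMachineArn,
        // Execution input is passed to Step Functions as a JSON string.
        input: JSON.stringify(input),
      })
      .promise()
      .then((data) => data.executionArn)
  ));
}
```

Each execution runs independently of the others, so one definition can serve a single request or a large batch.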
The console provides all kinds of information. Anything you see in the console, with the exception of the graph, is also accessible through the API. Let’s step through the elements. In the upper left, I can tab between my graph and my JSON code. This graph is color coded to show me successfully executed states, my current state in progress, failed states, or states not passed during execution.
In the upper right panel, I get details about my execution. I can see general info, the input I gave to the execution, in the form of JSON key/value pairs, and the final output of the execution. The Step Details toggle at the bottom, allows me to inspect individual states. I can see the input and the output state by state. This is really useful in debugging when something unexpected occurs.
Finally, at the bottom is the complete history of my execution, step by step with timestamps. Again, this is accessible from the API as well as the console.
AWS Step Functions supports seven kinds of states today: Task, Choice, Parallel, Wait, Pass, Fail, and Succeed, and we will have more states in the future. Let’s look at a few.
Task states do your work. These call on your application components and microservices. There are two kinds of Task states today: one pushes a call to AWS Lambda functions, and the other dispatches tasks to applications, which we call “activity workers”, when these activity workers long-poll for work.
Choice states allow you to introduce branching logic to your state machines.
Parallel states allow you to fork the same input across multiple states, and then join the results into a combined output. This is really useful when you want to apply several independent manipulations to your data, such as image processing or data reduction.
When you combine these states, you can build really interesting state machines. Let’s look at a few.
For example, with Step Functions, you can build visual workflows. This workflow, which is available as a reference architecture, takes a photo and processes it using Rekognition. First it extracts the image metadata, then it checks the file type. It exits with a failure if the file type is not supported. If it is supported, it stores the metadata and then, in parallel, creates a thumbnail and uses Rekognition to tag objects in the picture.
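A workflow of that shape combines Task, Choice, Parallel and Fail states. The definition below is a rough sketch rather than the actual reference architecture: the state names, the function names, the `$.format` input field and the accepted formats are all assumptions made for illustration.

```javascript
'use strict';

// Hypothetical ARN builder; REGION and ACCOUNT_ID are placeholders.
const lambdaArn = (name) => 'arn:aws:lambda:REGION:ACCOUNT_ID:function:' + name;

const photoDefinition = {
  Comment: 'Sketch of an image processing workflow',
  StartAt: 'ExtractMetadata',
  States: {
    ExtractMetadata: { Type: 'Task', Resource: lambdaArn('ExtractMetadata'), Next: 'CheckFileType' },
    // Choice state: branch on a field of the JSON passed between states.
    CheckFileType: {
      Type: 'Choice',
      Choices: [
        { Variable: '$.format', StringEquals: 'JPEG', Next: 'StoreMetadata' },
        { Variable: '$.format', StringEquals: 'PNG', Next: 'StoreMetadata' },
      ],
      Default: 'NotSupported',
    },
    NotSupported: { Type: 'Fail', Error: 'UnsupportedFileType', Cause: 'File type is not supported' },
    StoreMetadata: { Type: 'Task', Resource: lambdaArn('StoreMetadata'), Next: 'ProcessImage' },
    // Parallel state: both branches receive the same input, and their
    // results are joined into a single array.
    ProcessImage: {
      Type: 'Parallel',
      Branches: [
        {
          StartAt: 'Thumbnail',
          States: { Thumbnail: { Type: 'Task', Resource: lambdaArn('Thumbnail'), End: true } },
        },
        {
          StartAt: 'TagObjects',
          States: { TagObjects: { Type: 'Task', Resource: lambdaArn('TagObjects'), End: true } },
        },
      ],
      End: true,
    },
  },
};

// JSON text of the definition, ready to paste into the console.
const photoJson = JSON.stringify(photoDefinition, null, 2);
```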
Here are some common patterns for AWS Step Functions
Order management
Batch processing
Replacing shell scripts
Enterprise application workflows
Data collection and processing
With AWS Step Functions you can easily integrate with other AWS Services such as creating your state machine and activities with AWS CloudFormation.
You can use Amazon API Gateway to call Step Functions, or you can start state machines on a schedule or in response to events with CloudWatch Events. And of course you can monitor the execution of your state machine with CloudWatch.
And for logging, all API calls are captured in CloudTrail
Now for some best practices. For Lambda, the best overall choice is Python because it has the lowest startup latency.
For warmed functions, however, Java can sometimes be faster. Its main potential downside is a high cold start latency. If you are using Java, only include classes that are absolutely necessary.
This is another example that is available in a blog on the link on the slide.
This is a simple employee promotion process which involves a single task: getting a manager’s approval through email. When an employee is nominated for promotion, a new execution starts. The name of the employee and the email address of the employee’s manager are provided to the execution.
We use the design pattern shown to implement the manual approval step, and SES to send the email to the manager. After acquiring the task token, the Lambda function generates and sends an email to the manager with embedded hyperlinks to URIs hosted by API Gateway.
When the manager selects the appropriate URL through API Gateway, the token is passed to Step Functions and the execution continues.
Keep in mind that all Lambda functions run in a VPC, all the time
You never need to “turn on” security – it’s always on
You can also grant Lambda functions access to resources in your own VPC
Functions configured for VPC access lose internet access…
unless you have managed NAT or a NAT instance in the VPC
foodpanda is a food delivery service that is making excellent use of cloud infrastructure. They have been in the Step Functions beta. They take food orders and have delivery people who bring the food to hungry customers. They solve this delivery problem with Lambda functions.