AWS Lambda and Cassandra
Paris AWS User Group | 5th Sep 2018
Lyuben Todorov, Director of Consulting, EMEA
Paris AWS User Group
AWS User Groups let AWS users connect and exchange with one another to answer questions, share ideas, and learn all about new services and best practices.
2000+ users
Me
• Lyuben Todorov
• Consulting Director Instaclustr EMEA
• Univ. of Dundee
• Distributed Programming / OSS
Social Media: /in/lyubent
Talk Overview
Cassandra + λ Scale POC
• λ and C* (Cassandra) introduction
• Why use λ and Instaclustr’s managed service
• High Level Setup of λ and C* in Instaclustr
• Technical Challenges of using λ
• Lessons Learned
What is λ
• Serverless
• Pay for execution time (1M requests free, 400k GB-seconds free)
• Auto-scale
• Always Available
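As a rough illustration of the pay-for-execution-time model, the sketch below estimates compute usage in GB-seconds and compares it with the free-tier figures quoted on the slide. The workload numbers (memory size, duration, invocation count) are hypothetical, and the sketch ignores billing details such as duration rounding.

```java
// Sketch: estimate Lambda compute usage in GB-seconds and compare it with
// the free-tier figures quoted on the slide. Workload numbers are hypothetical.
final class LambdaUsageEstimate {
    static final long FREE_REQUESTS = 1_000_000L;     // from the slide
    static final double FREE_GB_SECONDS = 400_000.0;  // from the slide

    /** GB-seconds = allocated memory (GB) x average duration (s) x invocations. */
    static double gbSeconds(int memoryMb, double avgDurationMs, long invocations) {
        return (memoryMb / 1024.0) * (avgDurationMs / 1000.0) * invocations;
    }

    /** True when both the request count and the compute usage stay within the free tier. */
    static boolean fitsFreeTier(int memoryMb, double avgDurationMs, long invocations) {
        return invocations <= FREE_REQUESTS
                && gbSeconds(memoryMb, avgDurationMs, invocations) <= FREE_GB_SECONDS;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 1M invocations averaging 200 ms each.
        System.out.printf("512 MB: %.0f GB-s%n", gbSeconds(512, 200, 1_000_000));
        System.out.printf("128 MB: %.0f GB-s%n", gbSeconds(128, 200, 1_000_000));
    }
}
```

Because usage scales linearly with allocated memory, the same workload consumes four times as many GB-seconds at 512 MB as at 128 MB.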
[Diagram: Server, App, and Database, each with Operations]
What is λ
• Serverless (no need to share)
• Pay for execution time (1M requests free, 400k GB-seconds free)
• Auto-scale
• Always Available
[Diagram: λ Operation, λ Operation]
[Diagram: User Event, Container Creation, Container Teardown]
What is C*
• Highly Available Distributed Database
• No SPOF (p2p architecture)
• Open Source
• Tunable Consistency
[Diagram: CAP triangle with corners Available, Partition Tolerant, Consistent]
C* Client
Relevant to λ:
• Gossip: used by the client to discover nodes
• Create a λ per DC and use a DC-aware client
• Query with LOCAL consistency levels
• Be careful with client-side timestamps (due to cold starts)
Instaclustr Hosted Service
• Simple
• Auto scaling service
• 24/7 Support
• Access to Analytics
• Dashboard for Monitoring
• Security Plugins
How to set up λ
• Connect λ to Backend
• Deploy and test web app
Create λ VPC
Create subnet for VPC
Provision C* Cluster
VPC Peering Request – C* to λ
Update Route Table
Create & Deploy λ
Create VPC
VPC Subnet
Add Some Instant Awesome
Pick Your Cloud
Choose Node Capacity & Type
Scalable backend
Peering λ and C*’s VPCs
Lambda’s VPC needs to be peered with Instaclustr’s Cassandra VPC via the Instaclustr console.
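AWS VPC peering requires that the two VPCs use non-overlapping CIDR ranges, so it is worth checking the λ VPC's range against the cluster's range before sending the peering request. The sketch below is a minimal IPv4 check with no input validation; the CIDR values in the usage note are hypothetical, not the ones used in the talk.

```java
// Sketch: check that two IPv4 CIDR blocks do not overlap before peering VPCs.
// No input validation; intended as an illustration only.
final class CidrOverlap {
    /** Returns {firstAddress, lastAddress} of an IPv4 CIDR block as unsigned values held in longs. */
    static long[] range(String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        long addr = 0;
        for (String octet : parts[0].split("\\.")) {
            addr = (addr << 8) | Integer.parseInt(octet);
        }
        long size = 1L << (32 - prefix);   // number of addresses in the block
        long first = addr & ~(size - 1);   // clear the host bits
        return new long[] {first, first + size - 1};
    }

    /** Two ranges overlap when each one starts before the other ends. */
    static boolean overlaps(String a, String b) {
        long[] ra = range(a);
        long[] rb = range(b);
        return ra[0] <= rb[1] && rb[0] <= ra[1];
    }
}
```

For example, `CidrOverlap.overlaps("10.0.0.0/16", "10.224.0.0/16")` reports no overlap, whereas `"10.0.0.0/16"` and `"10.0.1.0/24"` do overlap and could not be peered.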
Route Tables
• Add rule for the API Gateway
• Add rule for Instaclustr VPC
Create λ in AWS
Deploy λ
The App
• Processes web requests
• POST used for inserting an event
• GET used for fetching an event
• Cassandra Table (Model):
CREATE TABLE event ( id uuid, source text, type text, recorded timestamp, PRIMARY KEY(id) )
The App - API Gateway
• POST /event/ writes an event to C*:
session.execute("INSERT INTO ic.event (id, source, type, recorded) " + "VALUES (now(), '10.1.13.77', 'Auth', toTimestamp(now()))");
• GET /event/{id} retrieves an event from C* by id:
session.execute("SELECT * FROM ic.event WHERE id = ?", id);
The App - Code
• Java Application
• Request processed as stream
• Output as JSON
public void handler(InputStream inputStream, OutputStream outputStream, Context context) {
// IMPL.
// Pass request to either GET or POST depending on context.
}
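A minimal sketch of how such a stream handler might dispatch between GET and POST is shown below. It assumes the event is an API Gateway proxy-style JSON document carrying an `httpMethod` field, uses a regular expression as a stand-in for a real JSON parser, and omits the Lambda `Context` parameter so that it compiles with only the JDK. The class name and the `handleGet` / `handlePost` helpers are hypothetical placeholders for the C* calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EventStreamHandler {
    // Assumption: the event JSON contains an "httpMethod" field.
    private static final Pattern METHOD =
            Pattern.compile("\"httpMethod\"\\s*:\\s*\"([A-Z]+)\"");

    /** Reads the request stream, routes on the HTTP method, writes JSON to the output. */
    public void handler(InputStream in, OutputStream out) {
        try {
            String request = readAll(in);
            Matcher m = METHOD.matcher(request);
            String method = m.find() ? m.group(1) : "";
            String body;
            switch (method) {
                case "GET":  body = handleGet(request);  break;
                case "POST": body = handlePost(request); break;
                default:     body = "{\"error\":\"unsupported method\"}";
            }
            out.write(body.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholders: real versions would select from or insert into C*.
    String handleGet(String request)  { return "{\"action\":\"fetch\"}"; }
    String handlePost(String request) { return "{\"action\":\"insert\"}"; }

    private static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
}
```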
Challenges
• Application Scalability
• λ Warmup Time
• Reducing Memory Usage
• Connection Pooling
• Dependency Management
• Execution Environment Limits
Scaling Requests
• Load balancer can distribute requests
• Adds Complexity
• What if a backend changes
Function Warmup Time
• A cold start is when λ has to initialise resources before it can execute a function
• Resources include the container, NIC, and others
• Containers are torn down after 15 min of inactivity, so the next request hits a cold start
• A λ function avoids cold starts if it is invoked constantly
Function Warmup Time
[Chart: average request response time (sec) and parallel requests (hundreds) over time (min), from 1 to 7 min]
Function Warmup Time
• Cheat: ping the λ every 5-10 min
• Create a scheduled Rule in AWS (an Event) and set it to run every 10 min
• Monitor container changes
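One way to combine both ideas is sketched below: static fields tag each container with an ID and record whether the current invocation is the first one in that container, and scheduled pings are recognised so they can return early. The sketch assumes the ping arrives as a scheduled event whose JSON carries `"source": "aws.events"`; the class name and the String-based handler signature are illustrative only.

```java
import java.util.UUID;
import java.util.regex.Pattern;

final class WarmupAwareHandler {
    // Static state lives as long as the container, so it survives warm invocations.
    private static final String CONTAINER_ID = UUID.randomUUID().toString();
    private static boolean coldStart = true;

    // Assumption: scheduled pings carry "source": "aws.events" in the event JSON.
    private static final Pattern SCHEDULED =
            Pattern.compile("\"source\"\\s*:\\s*\"aws\\.events\"");

    static boolean isWarmupPing(String eventJson) {
        return SCHEDULED.matcher(eventJson).find();
    }

    public String handleRequest(String eventJson) {
        boolean wasCold = coldStart;
        coldStart = false;
        // Logging this line makes container changes visible in the function's logs.
        String status = "container=" + CONTAINER_ID + " coldStart=" + wasCold;
        if (isWarmupPing(eventJson)) {
            return "warmup " + status;   // skip the C* work for pings
        }
        return "handled " + status;
    }
}
```

A new container ID or `coldStart=true` in the logs indicates that the previous container was torn down despite the pings.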
Reduce Memory Usage
• 512 MB by default
• Way too much for a simple C* client
• CPU is proportional to memory allocated to app
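To judge whether the allocation is oversized, one option is to log the JVM's heap usage from inside the handler and compare it with the configured memory. The sketch below uses only the standard `Runtime` API; the class and method names are hypothetical, and heap usage is only part of a function's total memory footprint.

```java
// Sketch: report JVM heap usage so the configured λ memory can be right-sized.
final class MemoryUsage {
    private static final long MB = 1024L * 1024L;

    /** Heap currently in use by this JVM, in MB. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    /** Maximum heap the JVM will attempt to use, in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("heap used (MB): " + usedHeapMb() + " of max " + maxHeapMb());
    }
}
```

Since CPU is proportional to memory, lowering the allocation also lowers the CPU share, so response times should be re-checked after any reduction.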
Connection Pool Management
• Creating connections is expensive
• Connection pooling allows reuse
• λ is stateless and asynchronous in nature
Connection Pool Management
• Store session state outside of handler function’s scope
• Variables outside of handler remain initialised across λ calls
// Keep the client wrapper outside of the handleRequest function;
// it stays initialised across λ invocations that reuse the same container.
// (The class name is illustrative.)
public class Handler {
    private final CassandraClient client = new CassandraClient();

    public String handleRequest(Map<String, Object> input, Context context) {
        return "C* Version: " + client.getVersion();
    }
}
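A variation on the same idea is to create the client lazily in a static field, so the connection cost is paid on the first real request rather than at class load. The sketch below uses a hypothetical `FakeClient` stand-in (not the talk's `CassandraClient`) that counts how many times it is constructed, which makes the reuse observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ReuseDemo {
    /** Hypothetical stand-in for an expensive client such as a C* session wrapper. */
    static final class FakeClient {
        static final AtomicInteger created = new AtomicInteger();
        FakeClient() { created.incrementAndGet(); }
        String getVersion() { return "stub"; }
    }

    // Created on first use, then reused by later invocations in the same container.
    private static FakeClient client;

    static synchronized FakeClient client() {
        if (client == null) {
            client = new FakeClient();
        }
        return client;
    }

    public String handleRequest() {
        return "C* Version: " + client().getVersion();
    }
}
```

Calling `handleRequest()` repeatedly on the same JVM constructs the client only once; a new container starts with a fresh static field and pays the cost again.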
Dependency Management (Java ftw)
• Lean dependencies
• Smaller App
• Faster Deployment
• Less Downtime
pom.xml:
<dependency>
  <groupId>io.symphonia</groupId>
  <artifactId>lambda-logging</artifactId>
  <version>1.0.1</version>
</dependency>
Jar size by logger:
• Log4J: 8.6 MB
• Symphonia: 8.1 MB
• No logger: 7.3 MB
Execution Environment Limits
• Limited to 512 MB of disk
• 3008 MB Memory Limit
• Max timeout – 5 mins.
• Max response payload – 6MB
• Event payload – 128 KB
Per λ invocation
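These limits can be encoded as constants so a handler can check a payload before returning it. The sketch below uses the figures from the slide, which reflect the limits at the time of the talk; the class and method names are hypothetical.

```java
// Sketch: the per-invocation limits from the slide as constants, with simple checks.
final class LambdaLimits {
    static final int MAX_RESPONSE_BYTES = 6 * 1024 * 1024; // 6 MB response payload
    static final int MAX_EVENT_BYTES = 128 * 1024;         // 128 KB event payload
    static final int MAX_TIMEOUT_SECONDS = 5 * 60;         // 5 min timeout
    static final int MAX_MEMORY_MB = 3008;                 // memory limit
    static final int MAX_DISK_MB = 512;                    // disk limit

    static boolean responseFits(byte[] payload) {
        return payload.length <= MAX_RESPONSE_BYTES;
    }

    static boolean eventFits(byte[] payload) {
        return payload.length <= MAX_EVENT_BYTES;
    }
}
```

A GET handler that could return many rows might call `responseFits` on the serialised JSON and fall back to paging when the check fails.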
POC Benchmark
• Create a client that sends a periodically increasing number of requests
• Run for 7 min 30 sec
• Review Cassandra latency metric
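The ramp used by such a client can be sketched as a simple schedule: one entry per step, each sending more requests than the last. Only the 7 min 30 sec run length comes from the slide; the step length, starting load, and increment below are hypothetical parameters, and the actual benchmark client is not shown.

```java
import java.util.ArrayList;
import java.util.List;

final class RampSchedule {
    /**
     * Number of requests to send in each step: start, start + increment, ...
     * One step begins every stepSeconds until totalSeconds is reached.
     */
    static List<Integer> build(int totalSeconds, int stepSeconds, int start, int increment) {
        List<Integer> steps = new ArrayList<>();
        int requests = start;
        for (int t = 0; t < totalSeconds; t += stepSeconds) {
            steps.add(requests);
            requests += increment;
        }
        return steps;
    }

    public static void main(String[] args) {
        // 7 min 30 sec = 450 s; hypothetical 30 s steps, starting at 100 requests.
        List<Integer> schedule = build(450, 30, 100, 100);
        System.out.println(schedule.size() + " steps: " + schedule);
    }
}
```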