Serverless Computation with OpenLambda · Serverless Computation with OpenLambda Scott Hendrickson,...

Serverless Computationwith OpenLambda

Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani†, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

Unaffiliated

Web development in the cloud

CDN: static content (e.g., JavaScript)

Compute: dynamic logic (e.g., Python)

Storage: application data

RPCs Queries

compute is evolving

RPCs Queries

compute is evolving

RPCs Queries

(Containers)

RPCs Queries

compute is evolving

(Lambdas)

prior to the Lambda model, cloud compute was neither elastic nor pay-as-you-go

RPCs Queries

(Lambdas)

Outline

Evolution of compute

Non-conventional virtualization

Lambda model

Why OpenLambda?

Conclusion

How to virtualize compute?

Classic web stack

Hardware

Server

ApplicationRPCs

Classic web stack

Hardware

Server

ApplicationRPCs

weak virtualization

Hardware

Server

ApplicationRPCs

1st generation: virtual machines

virtual H/W

Hardware

Server

ApplicationRPCs

virtual H/WOS

Server

Application

Hardware

Server

ApplicationRPCs

virtual H/WOS

Server

Application

advantages: • very flexible • use any OS

problems: • interposition • is RAM used? (ballooning) • redundancy (e.g., FS journal)

2nd generation: containers

Hardware

Server

ApplicationRPCs

virtual OSServer

Application

advantages: • centralized view • init H/W once

Are containers good enough?

Container case studies

Literature: Google Borg • Internal container platform [1] • 25 second median startup • 80% of time spent on package installation • matters for flash crowds, load balance, interactive development, etc

[1] Large-scale cluster management at Google with Borg. http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf

Container case studies

Literature: Google Borg • Internal container platform [1] • 25 second median startup • 80% of time spent on package installation • matters for flash crowds, load balance, interactive development, etc

Experimental: Amazon Elastic Beanstalk • Autoscaling cloud service • Build applications as containerized servers, service RPCs • Rules dictate when to start/stop (various factors)

[1] Large-scale cluster management at Google with Borg. http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf

Interesting “autoscaling” rule

Experiment

Simulate a small short burst • Maintain 100 concurrent requests • Use 200 ms of compute per request • Run for 1 minute

Container Case Study: Elastic Beanstalk

Conclusion: Elastic Beanstalk does not scale quickly enough to handle load bursts.

Container Case Study: Elastic Beanstalk

Conclusion: Elastic Beanstalk does not scale quickly enough to handle load bursts.

Elastic BS

Why should it take minutes (or even seconds) to execute scripts that are <1000s of LOC?

Hardware

Server

ApplicationRPCs

virtual OSServer

Application

advantages: • centralized view • init H/W once

problems: • large deployment bundle • server spinup

Hardware

Server

ApplicationRPCs

virtual OSServer

Application

3rd generation: Lambdas

Hardware

Server and Runtime

ApplicationRPCs virtual servers

Application

Hardware

Server and Runtime

ApplicationRPCs virtual servers

Application

serverless computing

Hardware

Server and Runtime

RPCs virtual servers

decompose application

FnA FnZ… Fn0 Fn9…

Hardware

Server and Runtime

RPCs virtual servers

advantages: • very fast startup • agile deployment • share memory

problems: • not flexible

FnA FnZ… Fn0 Fn9…

already running

already in mem

small deployment

bundle

Lambda elasticity

Repeat ElasticBS experiment • Maintain 100 concurrent requests • Spin 200 ms per request • Run for 1 minute

Lambda elasticity

Conclusion: Lambdas are highly elastic

Lambda elasticity

Conclusion: Lambdas are highly elastic (though a little slow)

Outline

Lambda model

Why OpenLambda?

Conclusion

Lambda model

Run user handlers in response to events • web requests (RPC handlers) • database updates (triggers) • scheduled events (cron jobs)

Pay per function invocation • actually pay-as-you-go • no charge for idle time between calls • e.g., charge actual_time * memory_cap

Share everything

Share server pool between customers • Any worker can execute any handler • No spinup time • Less switching

Encourage specific runtime (C#, Node.JS, Python) • Minimize network copying • Code will be in resident in memory

Multi-node architecture

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

developerupload code

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

userRPC

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

userRPC

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

userRPC

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

userRPC

H2 sandbox

Load Balancer

Server

Python

Server

Python

workers

Load Balancer

load balancers

handler store

userRPC

H2 sandbox

Outline

Lambda model

Why OpenLambda?

Conclusion

Need for open source serverless

Many research areas • Applications, tools, distributed systems, execution engines • Evaluate ideas by building, not just simulating

AWS Lambda Google Cloud Functions

First implementations are proprietary

OpenLambda: explore further-reaching techniques• Goal: enable academic research on Lambdas • Storage awareness, kernel support, RPC inspection • …

Other recent open-source implementations

Azure Functions

IBM OpenWhisk

OpenLambda: explore further-reaching techniques• Goal: enable academic research on Lambdas • Storage awareness, kernel support, RPC inspection • …

OpenLambda research topics

Workloads • Workload studies • Benchmarks • Versioning+dependencies • Code characteristics • Package management

Distributed systems • Databases • Load balancing • Scatter gather patterns • Sessions and streams

Tools • Debugging • Monetary cost optimization • Porting legacy applications

Execution engines • Sandboxing • Containers • Just-in-time interpreters

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

H1 H2developerupload code

userRPC

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Understanding Lambda workloadsCollaborate with industry, measurement studies

• e.g., Azure Functions

Build LambdaBench • Everybody joining builds an application • Ticketing, calendar, autocomplete, OCR, flash card, stock alert,

blog, and scientific compute applications

Trace RPC calls (e.g., AJAX) of existing apps

Gmail:

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Developer toolsPortability

• E.g., can Django apps run on Lambdas?

Debugging • Understand Lambda flows, may be a complex graph

Optimizing expense • Hard with containers: how to share 1-hour server time across requests? • With Lambdas: know cost of every RPC and query • Show where money is going

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Building locality-aware Lambdas

Use deep inspection of RPCs for routing • Working with gRPC group • GSOC project (Stephen Sturdevant)

Locality factors • code locality • data locality • session locality

Load Balancer

Lambda workers

call B ?

Load Balancer

Lambda workers

numpy …

A numpy

call B(uses numpy)

Load Balancer

Lambda workers

DB shard: A-M (keys)

DB shard: N-Z (keys)

call predict XBBlocal

Load Balancer

Lambda workers

browser sessionservice

sessionservice

web socket

LambdaEngine

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Load Balancer

Server

Python

workers

Load Balancer

load balancers

handler store

H2 sandbox

userRPC

workloads

distributed systems

execution engine

Minimizing latency

reduce

How can we reduce base latency?

Execution engine

Sandboxing • Process VMs (e.g., JVM): how to mostly initialize? • Containers: how to speed up restart and optimize pausing?

Execution engine

Language runtimes • Challenge: code warms up over time • How to share dynamic optimizations?

code code

worker A worker B

interpretedcompiled

Sandboxing • Process VMs (e.g., JVM): how to mostly initialize? • Containers: how to speed up restart and optimize pausing?

tracing enables inline optimization

Outline

Lambda model

Why OpenLambda?

Conclusion

ConclusionLambdas finally deliver on promises of the cloud

• finally pay-as-you-go • finally elastic • will fundamentally change how people build scalable applications

New challenges in every area of systems • scheduling, isolation, languages, debugging, tools, storage, …

Getting involved • contribute at https://github.com/open-lambda • site: https://open-lambda.org/

Serverless Computation with OpenLambda · Serverless Computation with OpenLambda Scott Hendrickson,...

Documents

Review on Deep-Learning-based Cognitive Computingesnl.hnu.edu.cn/phd2015/chenweihong_Review_on_Deep... · 2018. 6. 3. · and cognitive computation are presented at the view of three

POUR PDF - copiebibli/Rapports-internes/2010/RR1537.pdf · 2010-12-16 · Self-Stabilizing Computation and preservation of Knowledge of neighbor clusters ColetteJohnen1 andFouziMekhaldi2

-NIC: Interactive Serverless Compute on Programmable SmartNICs · within the NIC cores without involving the host CPU. In summary, we make the following contributions: We introduce

Quantum Processes and Computation: It's the final lecture · 2019. 6. 6. · Magic State Distillation Most Fault-toleration computation schemes can only do Cli ord computation. As

Hybridation of Thiazolidinone with Hydroxamate Scaffold ... · Hybridation of Thiazolidinone with Hydroxamate Scaffold for Developing Novel HDAC Inhibitors with Antitumor Activities

Application Model AWS Serverless...to build production-ready applications. Next Step Getting Started with AWS SAM (p. 3) 2 AWS Serverless Application Model Developer Guide Installing

Regionalized Pathology Correlates with Augmentation of ... · Regionalized Pathology Correlates with Augmentation of mtDNA Copy Numbers in a Patient with Myoclonic Epilepsy with Ragged-Red

Asymptotic behaviour of cellular automata: computation and … · Abstract Cellular automata are discrete dynamical systems as well as a massively parallel model of computation. Empirical

ﻪﺒﺳﺎﺤﻣﻪﯾﺮﻈﻧﺮﺑﯼﺍﻪﻣﺪﻘﻣphysics.sharif.edu/~vahid/teaching/Quantum Computation 1393/13-QCI-Theory of...Introduction to the theory of Computation,

with - Margres

Fast computation of all pairs of geodesic distances

archive.numdam.orgarchive.numdam.org › article › JTNB_1996__8_2_283_0.pdf · 283 On the computation of quadratic 2-class groups par WIEB BOSMA ET PETER STEVENHAGEN RÉSUMÉ. Nous

Serverless Architecture Hochskalierbare Anwendungen ohne ...aws-de-media.s3.amazonaws.com/images/_Munich_Loft... · Serverless Architecture –Hochskalierbare Anwendungen ohne Server

Quantum Computation - Lecture 08 - Quantum Error Correction IImdeoliv/QuantumCourse-Slides... · Quantum Computation - Lecture 08 - Quantum Error Correction II ... s.

D serverless AWS Lambda : le SLA by design · AWS Lambda : une solution pensée pour l’IOT 9 ★Adresse nativement des volumétries colossales, mise à l’échelle automatique

Superconducting circuits and quantum computation · 2019. 11. 20. · 21 - Quantum Computation and Communication – Superconducting Circuits and Quantum Computation – 21 RLE Progress

A survey of multiple precision computation using ﬂoating ...bt.pa.msu.edu/TM/BocaRaton2006/talks/lauter.pdf · Multiple precision using ﬂoating-point - Lauter - TMW 2006 1. Project

LIVING WITH LIONSlionconservation.org/ScientificPapers/Living-with-lions,Frank.pdf · LIVING WITH LIONS: CARNIVORE CONSERVATION AND LIVESTOCK IN LAIKIPIA DISTRICT, KENYA Laurence

Télémac, un système hydroinformatique · - dry areas within the computation domain: tidal flats and flood plains; - advection and diffusion of a pollutant, with source or ... the

GShard: Scaling Giant Models with Conditional Computation ... · zhifengc@google.com Abstract Neural network scaling has been critical for improving the model quality in many real-world