Programmable Packet Scheduling at Line Rate (SIGCOMM 2016)


Programmable Packet Scheduling at Line Rate

Anirudh Sivaraman, Suvinay Subramanian, Mohammad Alizadeh, Sharad Chole, Shang-Tse Chuang, Anurag Agrawal, Hari Balakrishnan, Tom Edsall, Sachin Katti, Nick McKeown

Programmable scheduling at line rate

• Motivation: can't deploy new schedulers in production networks
• The status quo in line-rate switches

[Diagram: switch pipeline: In -> Parser -> Ingress pipeline -> Queues/Scheduler -> Egress pipeline -> Deparser -> Out. The parser and the pipelines are programmable (RMT; RMT, Domino); the queues/scheduler are not (???).]

The scheduler is still fixed

Why is programmable scheduling hard?

• Many algorithms, yet no consensus on abstractions, cf.
  • Parse graphs for parsing
  • Match-action tables for forwarding
  • Packet transactions for data-plane algorithms

• Scheduler has tight timing requirements
  • Can't simply use an FPGA/CPU

Need expressive abstraction that can run at line rate

What does the scheduler do?

It decides:
• In what order packets are sent
  • e.g., FCFS, priorities, weighted fair queueing

• At what time packets are sent
  • e.g., token bucket shaping

A strawman programmable scheduler

• Very little time on the dequeue side => limited programmability
• Can we move programmability to the enqueue side instead?

[Diagram: strawman scheduler: packets pass through classification into queues, and programmable logic at dequeue decides their order or departure time.]

The Push-In First-Out Queue

Key observation:
• In many cases, the relative order of buffered packets does not change
• i.e., a packet's place in the scheduling order is known at enqueue

The Push-In First-Out Queue (PIFO): Packets are pushed into an arbitrary location based on a rank, and dequeued from the head

[Diagram: a PIFO holding packets sorted by rank; a new packet is pushed into its rank position, and dequeues always take the head.]
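The abstraction can be sketched in a few lines of Python. This is a software model only (the hardware realization comes later in the deck), and the names are illustrative:

```python
import bisect

class PIFO:
    """Software model of a Push-In First-Out Queue: entries are
    pushed into a position determined by their rank and always
    dequeued from the head (lowest rank first)."""
    def __init__(self):
        self._entries = []  # kept sorted by (rank, arrival seq)
        self._seq = 0       # tie-breaker: FIFO among equal ranks

    def push(self, rank, pkt):
        bisect.insort(self._entries, (rank, self._seq, pkt))
        self._seq += 1

    def pop(self):
        return self._entries.pop(0)[2]
```

Pushing packets with ranks 9, 2, 7, 5 dequeues them in rank order 2, 5, 7, 9; equal ranks come out in arrival order.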

A programmable scheduler

To program the scheduler, program the rank computation

[Diagram: a rank computation block (programmable, e.g. p.rank = T[f] + p.len) in the ingress pipeline stamps each packet with a rank; the PIFO scheduler (fixed logic) replaces the queues/scheduler block and dequeues packets in increasing rank order.]

A programmable scheduler

Rank computation is a packet transaction (Domino, SIGCOMM '16)

Fair queuing

[Diagram: the rank computation runs in the ingress pipeline, ahead of the PIFO scheduler.]

Rank computation:
1. f = flow(p)
2. p.start = max(T[f].finish, virtual_time)
3. T[f].finish = p.start + p.len
4. p.rank = p.start
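As a sketch, the Start-Time Fair Queueing transaction above might look like this in Python (dicts stand in for packet headers and the per-flow state table T; the virtual_time update at dequeue is omitted here, as on the slide):

```python
from collections import defaultdict

T = defaultdict(lambda: {"finish": 0})  # per-flow state table
virtual_time = 0  # advanced on dequeue; update omitted as on the slide

def stfq_rank(p):
    """Start-Time Fair Queueing rank computation from the slide."""
    f = p["flow"]
    p_start = max(T[f]["finish"], virtual_time)  # step 2
    T[f]["finish"] = p_start + p["len"]          # step 3
    return p_start                               # step 4: p.rank
```

Back-to-back packets of one flow get ranks 0, 100, ..., while a competing flow's first packet starts at the virtual time, so the PIFO interleaves flows fairly.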

Token bucket shaping

Rank computation:
1. tokens = min(tokens + rate * (now - last), burst)
2. p.send = now + max((p.len - tokens) / rate, 0)
3. tokens = tokens - p.len
4. last = now
5. p.rank = p.send
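A Python sketch of the same transaction, with per-bucket state held in a closure (names are illustrative). Because the rank is a departure time, the PIFO must also hold each packet until the wall clock reaches its rank, which is what makes this non-work-conserving:

```python
def make_token_bucket(rate, burst):
    """Token-bucket rank computation from the slide: the rank is
    the time at which the packet becomes eligible to be sent."""
    state = {"tokens": burst, "last": 0}

    def rank(p, now):
        # refill tokens for the elapsed interval, capped at burst
        state["tokens"] = min(state["tokens"] + rate * (now - state["last"]), burst)
        send = now + max((p["len"] - state["tokens"]) / rate, 0)
        state["tokens"] -= p["len"]  # may go negative (debt)
        state["last"] = now
        return send  # p.rank

    return rank
```

With rate 1 and burst 100, a 50-byte packet at time 0 is eligible immediately; a following 100-byte packet must wait until time 50.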

Shortest remaining flow size

Rank computation:
1. f = flow(p)
2. p.rank = f.rem_size

Beyond a single PIFO

Hierarchical scheduling algorithms need a hierarchy of PIFOs.

[Diagram: Hierarchical Packet Fair Queuing (HPFQ) example: the root splits bandwidth 0.5/0.5 between classes Red and Blue; Red splits 0.99/0.01 between flows a and b; Blue splits 0.5/0.5 between flows x and y.]

Tree of PIFOs

[Diagram: the same HPFQ hierarchy as a tree of PIFOs. PIFO-root runs WFQ on Red and Blue and holds class pointers; PIFO-Red runs WFQ on flows a and b; PIFO-Blue runs WFQ on flows x and y.]
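The tree-of-PIFOs mechanism can be sketched as follows: each node is a PIFO with a programmable rank function; enqueue pushes the packet into its class's PIFO and a class pointer into the root; dequeue pops a class from the root, then the packet from that class's PIFO. The static rank functions below are toy placeholders for the WFQ computations shown earlier:

```python
import bisect

class PifoNode:
    """One PIFO in the tree: holds packets (leaf) or class names
    whose packets live in a child PIFO (interior node)."""
    def __init__(self, rank_fn):
        self.rank_fn = rank_fn
        self._entries = []
        self._seq = 0

    def push(self, item):
        bisect.insort(self._entries, (self.rank_fn(item), self._seq, item))
        self._seq += 1

    def pop(self):
        return self._entries.pop(0)[2]

def enqueue(root, children, cls, pkt):
    # push the packet into its class PIFO and a class pointer into the root
    children[cls].push(pkt)
    root.push(cls)

def dequeue(root, children):
    # recursive dequeue: class first, then the packet within that class
    return children[root.pop()].pop()
```

With toy ranks that favor Blue over Red, a Blue packet is dequeued before an earlier-arriving Red one, even though both sit in separate leaf PIFOs.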

Expressiveness of PIFOs

• Fine-grained priorities: shortest-flow first, earliest deadline first, service-curve EDF
• Hierarchical scheduling: HPFQ, Class-Based Queuing
• Non-work-conserving algorithms: token buckets, Stop-and-Go, Rate-Controlled Service Disciplines
• Least Slack Time First
• Service Curve Earliest Deadline First
• Minimum and maximum rate limits on a flow
• Cannot express some scheduling algorithms, e.g., output shaping

PIFO in hardware

• Performance targets for a shared-memory switch
  • 1 GHz pipeline (64 ports * 10 Gbit/s)
  • 1K flows / physical queues
  • 60K packets (12 MB packet buffer, 200-byte cells)
  • Scheduler is shared across ports

• Naive solution: a flat, sorted array is infeasible

• Exploit the observation that ranks increase within a flow

A single PIFO block

[Diagram: a PIFO block split into a Flow Scheduler (flip-flops) holding one sorted entry per flow, and a Rank Store (SRAM) holding a FIFO of ranks per flow. On dequeue, the winning flow's next rank is fetched from the rank store and reinserted into the flow scheduler.]

• 1 enqueue + 1 dequeue per clock cycle
• Can be shared among multiple logical PIFOs

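The flow-scheduler/rank-store split can be simulated in Python: a small sorted structure holds one entry per flow, while per-flow FIFOs hold the remaining ranks. This sketch is correct only under the deck's observation that ranks within a flow increase monotonically; names are illustrative:

```python
import heapq
from collections import deque, defaultdict

class PifoBlock:
    """Models the hardware split: the flow scheduler sorts only one
    entry per flow; the rank store is a per-flow FIFO of ranks."""
    def __init__(self):
        self.rank_store = defaultdict(deque)  # flow -> FIFO of (rank, pkt)
        self.flow_sched = []                  # heap of (head rank, seq, flow)
        self._seq = 0

    def enqueue(self, flow, rank, pkt):
        if not self.rank_store[flow]:
            # flow becomes active: its head entry enters the flow scheduler
            heapq.heappush(self.flow_sched, (rank, self._seq, flow))
            self._seq += 1
        self.rank_store[flow].append((rank, pkt))

    def dequeue(self):
        _, _, flow = heapq.heappop(self.flow_sched)
        rank, pkt = self.rank_store[flow].popleft()
        if self.rank_store[flow]:
            # reinsert the flow with its next head rank
            nxt_rank = self.rank_store[flow][0][0]
            heapq.heappush(self.flow_sched, (nxt_rank, self._seq, flow))
            self._seq += 1
        return pkt
```

Only one entry per flow ever needs sorting, which is why the hardware flow scheduler stays small while the bulk of the ranks sit in cheap SRAM FIFOs.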

Hardware feasibility

• The rank store is just a bank of FIFOs (well-understood design)

• Flow scheduler for 1K flows meets timing at 1 GHz in a 16-nm transistor library
  • Continues to meet timing up to 2048 flows; fails timing at 4096

• 7 mm² area for a 5-level programmable hierarchical scheduler
  • < 4% of the area of a typical switching chip


Related work

• PIFO: used in theoretical work by Chuang et al. in the 90s

• Universal Packet Scheduling (UPS): uses LSTF to replay all schedules; endpoints set slack
  • Assumes fixed switches => cannot express fair queueing or shaping
  • Assumes a single priority queue => cannot express hierarchies

Conclusion

• Programmable scheduling at line rate is within reach

• Two benefits:
  • Express new schedulers for different performance objectives
  • Express existing schedulers as software, not hardware

• Code: http://web.mit.edu/pifo

Backup slides

Limitations of PIFOs

• Output shaping: PIFOs rate limit input to a queue, not output

• Shaping and scheduling are coupled.

PIFO mesh

Proposal: scheduling in P4

• Currently not modeled at all; a black box left to the vendor

• Only part of the switch that isn’t programmable

• PIFOs present a candidate

• Concurrent work on Universal Packet Scheduling also requires a priority queue that is identical to a PIFO

Hardware implementation

[Diagram: the flow scheduler datapath. Each entry holds a (logical PIFO ID, rank) pair; == and > comparators feed priority encoders that locate the push and pop indices, and entries shift accordingly. The array supports one pop (DEQ), one push (ENQ), and one reinsertion push per cycle.]

• Meets timing (1 GHz) for up to 2048 flows at 16 nm
• Less than 4% area overhead (~7 mm²) for a 5-level scheduler

A PIFO block

[Diagram: a PIFO block exposes two operations: enqueue(logical PIFO, rank, flow) and dequeue(logical PIFO); an adjacent ALU computes ranks.]

A PIFO mesh

[Diagram: multiple PIFO blocks, each paired with an ALU, connected in a mesh; a dequeue from one block can trigger, via next-hop lookup, an enqueue into another block.]

Proposal: scheduling in P4

• Need to model a PIFO (or priority queue) in P4

• Requires an extern instance to model a PIFO
  • Can start by including it in a target-specific library
  • Later migrate to the standard library if there's sufficient interest
  • See Section 16 of the P4 v1.1 spec

• Transactions themselves can be compiled down to P4 code using the Domino DSL for stateful algorithms.

Hardware feasibility of PIFOs

• Number of flows handled by a PIFO affects timing.

• The number of logical PIFOs within a block, the priority and metadata widths, and the number of PIFO blocks only increase area.

Composing PIFOs: min. rate guarantees

Minimum rate guarantees: provide each flow a guaranteed rate, provided the sum of these guarantees is below link capacity.

[Diagram: PIFO-root prioritizes flows that are under their minimum rate; PIFO-A and PIFO-B are FIFOs for flows A and B.]
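One way the root's rank computation might track "under min rate" is with a token bucket per flow; the slide does not spell out the mechanism, so this is a sketch under that assumption. Packets within the guarantee get rank 0 and everything else rank 1, so guaranteed traffic is always dequeued first (FIFO within each class via the PIFO's tie-breaking):

```python
def make_min_rate_rank(rate, burst):
    """Hypothetical root rank computation for one flow's min-rate
    guarantee: rank 0 while the flow is under its guaranteed rate,
    rank 1 once it exceeds it."""
    state = {"tokens": burst, "last": 0}

    def rank(p, now):
        state["tokens"] = min(state["tokens"] + rate * (now - state["last"]), burst)
        state["last"] = now
        under = state["tokens"] >= p["len"]
        if under:
            state["tokens"] -= p["len"]  # spend tokens only when under rate
        return 0 if under else 1

    return rank
```

With rate 1 and burst 100, a 60-byte packet at time 0 is within the guarantee (rank 0); a second 60-byte packet at the same instant exceeds it (rank 1).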

Composing PIFOs

Traffic Shaping

Rank computation:
1. update tokens
2. p.send = now + (p.len - tokens) / rate
3. p.prio = p.send

[Diagram: ingress pipeline -> Push-In First-Out (PIFO) queue -> scheduler]

LSTF

[Diagram: endpoints initialize slack values; the ingress pipeline adds transmission delay to the slack and decrements the time spent waiting in queues from it; the PIFO queue schedules by remaining slack.]

The PIFO abstraction in one slide

• PIFO: a sorted array that lets us insert an entry (packet or PIFO pointer) based on a programmable priority
  • Entries are always dequeued from the head
  • If an entry is a packet, dequeue and transmit it
  • If an entry is a PIFO, dequeue it, and continue recursively
