lec_17_IOInterface

Embed Size (px)

Citation preview

  • 8/3/2019 lec_17_IOInterface

    1/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 1

    Lecture 17

    I/O Interfaces and

    I/O Busses

  • 8/3/2019 lec_17_IOInterface

    2/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 2

    Storage System Issues

    Historical Context of Storage I/O

    Storage I/O Performance Measures

    Secondary and Tertiary Storage Devices Processor Interface Issues

    I/O Buses

    Redundant Arrays of Inexpensive Disks (RAID)

  • 8/3/2019 lec_17_IOInterface

    3/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 3

    Disk Time Example

    Disk Parameters: Transfer size is 8K bytes

    Advertised average seek is 12 ms

    Disk spins at 7200 RPM

    Transfer rate is 4 MB/sec

    Controller overhead is 2 ms

    Assume that disk is idle so no queuing delay

    What is Average Disk Access Time for a Sector?

    Ave seek + ave rot delay + transfer time + controller overhead

    12 ms + 0.5/(7200 RPM/60) + 8 KB/4 MB/s + 2 ms 12 + 4.15 + 2 + 2 = 20 ms

    Advertised seek time assumes no locality: typically 1/4 to 1/3advertised seek time

    Average Disk Access Time for a Sector: 20 ms => 12 ms

  • 8/3/2019 lec_17_IOInterface

    4/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 4

    Processor Interface Issues

    Interconnections

    Busses

    Processor interface

    Interrupts Memory mapped I/O

    I/O Control Structures

    Polling

    Interrupts

    DMA I/O Controllers

    I/O Processors

    Capacity, Access Time, Bandwidth

  • 8/3/2019 lec_17_IOInterface

    5/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 5

    I/O Interface

    Seperate I/O instructions (in,out)

    Lines distinguish betweenI/O and memory transfers

    VME busMultibus-IINubus

    40 Mbytes/secoptimistically

    10 MIPS processorcompletelysaturates the bus!

    CPU

    Interface Interface

    Peripheral Peripheral

    Memory

    memorybus

    Independent I/O Bus

    Common Memory

    & I/O bus

    CPU

    Interface Interface

    Peripheral Peripheral

    Memory

  • 8/3/2019 lec_17_IOInterface

    6/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 6

    Memory Mapped I/O

    Single Memory & I/O BusNo Separate I/O Instructions

    CPU

    Interface Interface

    Peripheral Peripheral

    Memory

    L2 $

    Memory Bus

    Memory Bus Adapter

    I/O bus

    $

    CPU

    ROM

    RAM

    I/O

  • 8/3/2019 lec_17_IOInterface

    7/31I/O Interface CS510 Computer Architectures Lecture 17 - 7

    Programmed I/O (Polling)

    CPU

    IOC

    device

    Memory

    not an efficient way touse the CPU, unlessthe device is very fast!

    but checks for I/Ocompletion can bedispersed among

    computationallyintensive code

    busy on wait loopIs dataready?

    readdata

    storedata

    yesno

    done?no

    yes

  • 8/3/2019 lec_17_IOInterface

    8/31I/O Interface CS510 Computer Architectures Lecture 17 - 8

    Interrupt Driven Data Transfer

    (2) save PC

    (3) interruptservice addr

    (4)

    1000 transfers/second:1000 interrupts @ 2 msec per interrupt=> 2 msec1000 interrupt service @ 98 msec each=> 98 msec

    100 msec = 0.1 CPU seconds

    CPU

    IOC

    device

    Memory

    User program halts onlyduring actual transfer

    memory

    userprogram

    interruptserviceroutine

    readstore

    ...rti

    addsubandornop

    (1) I/Ointerrupt

  • 8/3/2019 lec_17_IOInterface

    9/31I/O Interface CS510 Computer Architectures Lecture 17 - 9

    Direct Memory Access

    0ROM

    RAM

    Peripherals

    DMACn

    MemoryMapped I/O

    Time to do 1000 xfers in 1 msec:

    1 DMA set-up sequence: @ 50 msec1 interrupt: @ 2 msec1 interrupt service sequence: @ 48 msec

    100msec.0001 second of CPU time

    CPU sends a starting address,direction(R/W), and word countto DMAC. Then issues "start".

    DMAC provides;Peripheral controller Handshake signalsMemory Addresses

    Handshake signals

    CPU

    IOC

    device

    Memory DMAC

    I/O

  • 8/3/2019 lec_17_IOInterface

    10/31I/O Interface CS510 Computer Architectures Lecture 17 - 10

    Input/Output Processors

    (2)

    (3)

    Device to/from memory transfers arecontrolled by the IOP directly.

    IOP steals memory cycles from CPU

    OP Device Address

    target device

    where cmnds are

    looks in memory for commands

    CPU IOP

    Mem

    D1

    D2

    Dn

    . . .

    main memorybus

    I/Obus

    OP Addr Cnt Other

    whatto do

    where

    to putdata

    how

    much

    specialrequests

    Commandmemory

    CPU

    IOP

    (1)issuesinstructionto IOP

    interruptswhen done(4)

  • 8/3/2019 lec_17_IOInterface

    11/31I/O Interface CS510 Computer Architectures Lecture 17 - 11

    Relationship

    to Processor Architecture I/O instructions and busses have largely disappeared

    Interrupt vectors have been replaced by jump tablesPC

  • 8/3/2019 lec_17_IOInterface

    12/31I/O Interface CS510 Computer Architectures Lecture 17 - 12

    Relationshipto Processor Architecture

    Caches required for processor performance causeproblems for I/O

    Flushing is expensive, I/O pollutes cache

    Solution is borrowed from shared memory multiprocessors"snooping"

    Virtual memory frustrates DMA

    Stateful processors are hard to switch context

  • 8/3/2019 lec_17_IOInterface

    13/31I/O Interface CS510 Computer Architectures Lecture 17 - 13

    Interconnect Trends

    >1000 m 10 - 100 m 1 mDistance

    10 - 100 Mb/s 40 - 1000 Mb/s 320 - 1000+ Mb/sBandwidth

    high (>ms) medium low (

  • 8/3/2019 lec_17_IOInterface

    14/31I/O Interface CS510 Computer Architectures Lecture 17 - 14

    Backplane Architectures

    Distinctions begin to blur:

    SCSI channel is like a busFutureBus is like a channel (disconnect/reconnect)HIPPI forms links in high speed switching fabrics

    128No

    16 - 32Single/Multiple

    MultipleNo

    Async25

    12.927.913.621

    .5 mIEEE 1014

    96Yes32

    Single/MultipleMultiple

    OptionalAsync37

    15.595.220.820

    .5 mIEEE 896

    Metric VME FutureBus96Yes32

    Single/MultipleMultipleOptional

    Sync201040

    13.321

    .5 mANSI/IEEE 1296

    MultiBus IIBus Width (signals)Address data multiplexed?Data WidthXfer Size# of Bus MastersSplit TransactionsClockingBandwidth, Single Word (0 ns mem)Bandwidth, Single Word (150ns mem)Bandwidth Multiple Word (0 ns mem)Bandwidth Multiple Word (150 ns mem)Max # of devicesMax Bus LengthStandard

    25na8

    Single/MultipleMultiple

    OptionalEither5, 1.55, 1.55, 1.55, 1.5

    725 m

    ANSI X3.131

    SCSI-I

  • 8/3/2019 lec_17_IOInterface

    15/31I/O Interface CS510 Computer Architectures Lecture 17 - 15

    Bus-Based Interconnect

    Bus: a shared communication link betweensubsystems

    Low cost: a single set of wires is shared multiple ways

    Versatility: Easy to add new devices & peripherals mayeven be ported between computers using common bus

    Disadvantage

    A communication bottleneck, possibly limiting themaximum I/O throughput

    Bus speed is limited by physical factors the bus length

    the number of devices (and, hence, bus loading).

    these physical limits prevent arbitrary bus speedup.

  • 8/3/2019 lec_17_IOInterface

    16/31I/O Interface CS510 Computer Architectures Lecture 17 - 16

    Bus-Based Interconnect

    Two generic types of busses:

    I/O busses (sometimes called a channel)

    lengthy

    many types of devices connected

    wide range in the data bandwidth

    follow a bus standard

    CPU-Memory buses (sometimes called a backplane)

    short

    high speed, matched to the memory system to maximizememory-CPU bandwidth

    To lower costs, low cost (older) systems combine together

    Bus transaction

    Sending address & receiving or sending data

  • 8/3/2019 lec_17_IOInterface

    17/31I/O Interface CS510 Computer Architectures Lecture 17 - 17

    Bus Protocols

    Multibus: 20 address, 16 data, 5 control, 50ns Pause

    Bus Master: has ability to control the bus, initiates a transaction

    Bus Slave: module activated by the transaction

    Bus Communication Protocol: specification of sequence ofevents and timing requirements in transferring information.

    Asynchronous Bus Transfers: control lines (req., ack.) serve toorchestrate sequencing

    Synchronous Bus Transfers: sequence relative to the commonclock

    Master Slave

    Control Lines

    Address LinesData Lines

  • 8/3/2019 lec_17_IOInterface

    18/31I/O Interface CS510 Computer Architectures Lecture 17 - 18

    Synchronous Bus Protocols

    Pipelined/Split transaction Bus Protocol

    Address

    Data

    Wait

    addr 1

    data 0

    addr 2

    wait 1

    data 1

    addr 3

    OK 1

    begin read Read complete

    Address

    Data

    Read

    Wait

    Clock

    Read transaction

  • 8/3/2019 lec_17_IOInterface

    19/31I/O Interface CS510 Computer Architectures Lecture 17 - 19

    Asynchronous Handshake

    Write Transaction

    t3: Master releases req

    Address

    Data

    Read

    Req.

    Ack.4 Cycle Handshake

    t0 t1 t2 t3 t4 t5

    Master Asserts Address Next Address

    Master Asserts Data

    t0: Master has obtained control and asserts address, direction, data;Waits a specified amount of time for slaves to decode target

    t2: Slave asserts ack, indicating data receivedt1: Master asserts request line

    t4: Slave releases ack

  • 8/3/2019 lec_17_IOInterface

    20/31I/O Interface CS510 Computer Architectures Lecture 17 - 20

    Asynchronous Handshake

    Time Multiplexed Bus: address and data share lines

    Read TransactionAddress

    Data

    Read

    Req

    Ack

    Master Asserts Address Next Address

    t0 t1 t2 t3 t4 t5

    4 Cycle Handshake

    t0: Master has obtained control and asserts address, direction;

    Waits a specified amount of time for slaves to decode

    t2: Slave asserts ack, indicating ready to transmit data;Slave asserts data

    t3: Master releases req, data received

    t1: Master asserts request line

    t4: Slave releases ack

  • 8/3/2019 lec_17_IOInterface

    21/31I/O Interface CS510 Computer Architectures Lecture 17 - 21

    Bus ArbitrationParallel (Centralized) Arbitration

    BR BG

    M

    BR BG

    M

    BR BG

    M

    Bus Arbiter

    M

    BGi BGo

    BRM

    BGi BGo

    BRM

    BGi BGo

    BR

    BG

    BR

    A.U.

    BR A

    M

    BR A

    M

    BR A

    M

    BRA

    A.U.

    Serial Arbitration (daisy chaining)

    Polling

    BR: Bus RequestBG: Bus Grant

    Bus sequence:BR

    BG

    Busy

    data

  • 8/3/2019 lec_17_IOInterface

    22/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 22

    Bus Options

    Bus width Separate address Multiplex address& data lines & data lines

    Data width Wider is faster Narrower is cheaper(e.g., 32 bits) (e.g., 8 bits)

    Transfer size Multiple words has Single-word transferless bus overhead is simpler

    Bus masters Multiple Single master(requires arbitration) (no arbitration)

    Split Yes - separate No - continuous transaction?Request and Reply connection is cheaperpackets gets higher and has lower latencybandwidth(needs multiple masters)

    Clocking Synchronous Asynchronous

    Option High performance Low cost

  • 8/3/2019 lec_17_IOInterface

    23/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 23

    1990 Bus Survey

    Signals 128 96 96 16 8

    Addr/Data mux no yes yes n/a n/a

    Data width 16 - 32 32 32 16 8

    Masters multi multi multi single multiClocking Async Async Sync Async either

    MB/s 25 37 20 25 1.5 (asyn)

    (0ns, word) 5 (sync)

    150ns word 12.9 15.5 10 = =

    0ns block 27.9 95.2 40 = =

    150ns block 13.6 20.8 13.3 = =Max devices 21 20 21 8 7

    Max meters 0.5 0.5 0.5 50 25

    Standard IEEE 1014 IEEE 896.1 ANSI/IEEE ANSI X3.129 ANSI X3.1311296

    VME FutureBus Multibus II IPI SCSI

  • 8/3/2019 lec_17_IOInterface

    24/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 24

    VME

    3 96-pin connectors

    128 defined as standard, rest customer defined

    32 address

    32 data

    64 command & power/ground lines

  • 8/3/2019 lec_17_IOInterface

    25/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 25

    SCSI: Small ComputerSystem Interface

    Up to 8 devices to communicate on a bus or string at

    sustained speeds of 4-5 MBytes/sec

    SCSI-2 up to 20 MB/sec

    Devices can be slave (target) or master(initiator)

    SCSI protocol: a series of phases, during which specificactions are taken by the controller and the SCSI disks

    Bus Free: No device is currently accessing the bus

    Arbitration: When the SCSI bus goes free, multiple devices mayrequest (arbitrate for) the bus; fixed priority by address

    Selection: informs the target that it will participate (Reselection if

    disconnected) Command: the initiator reads the SCSI command bytes from host

    memory and sends them to the target

    Data Transfer: data in or out, initiator: target

    Message Phase: message in or out, initiator: target (identify,save/restore data pointer, disconnect, command complete)

    Status Phase: target, just before command complete

  • 8/3/2019 lec_17_IOInterface

    26/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 26

    SCSI Bus: Channel

    ArchitectureCommand Setup

    Arbitration

    Selection

    Message Out (Identify)

    CommandDisconnect to seek/ ll buf fer

    Message In (Disconnect)

    - - Bus Free - -

    Arbitration

    Reselection

    Message In (Identify)

    Data Transfer

    Data In

    Disconnect to ll buf fer

    Message In (Save Data Ptr)

    Message In (Disconnect)

    - - Bus Free - -

    Arbitration

    Reselection

    Message In (Identify)

    Command Completion

    Status

    Message In (Command Complete)

    If no disconnect is needed

    Completion

    Message In (Restore Data Ptr)

    peer-to-peer protocolsinitiator/targetlinear byte streamsdisconnect/reconnect

  • 8/3/2019 lec_17_IOInterface

    27/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 27

    1993 I/O Bus Survey

    Originator Sun DEC IBM Intel

    Clock Rate (MHz) 16-25 12.5-25 async 33Addressing Virtual Physical Physical Physical

    Data Sizes (bits) 8,16,32 8,16,24,32 8,16,24,32,64 8,16,24,32,64

    Master Multi Single Multi Multi

    Arbitration Central Central Central Central

    32 bit read (MB/s) 33 25 20 33Peak (MB/s) 89 84 75 111 (222)

    Max Power (W) 16 26 13 25

    Bus SBus TurboChannel MicroChannel PCI

  • 8/3/2019 lec_17_IOInterface

    28/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 28

    1993 MP Server Memory Bus

    SurveyOriginator HP SGI Sun

    Clock Rate (MHz) 60 48 66

    Split transaction? Yes Yes Yes?Address lines 48 40 ??

    Data lines 128 256 144 (parity)Data Sizes (bits) 512 1024 512

    Clocks/transfer 4 5 4?

    Peak (MB/s) 960 1200 1056

    Master Multi Multi Multi

    Arbitration Central Central Central

    Addressing Physical Physical PhysicalSlots 16 9 10

    Busses/system 1 1 2

    Length 13 inches 12? inches 17 inches

    Bus Summit Challenge XDBus

  • 8/3/2019 lec_17_IOInterface

    29/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 29

    Communications Networks

    Performance limiter is memory system, OS overhead, not protocols

    Send/receive queues in processor memories Network controller copies back and forth via DMA No host intervention needed Interrupt host when message sent or received

    NodeProcessor

    ControlReg. I/F

    NetI/F Memory

    RequestBlock

    ReceiveBlock

    Media

    Network Controller

    Peripheral Backplane Bus

    DMA

    . . .

    Processor MemoryList of request blocks

    Data to be transmitted

    . . .

    List of receive blocks

    Data receivedDMA

    . . .

    List of free blocks

  • 8/3/2019 lec_17_IOInterface

    30/31

    I/O Interface CS510 Computer Architectures Lecture 17 - 30

    I/O Controller Architecture

    Request/Response block interface

    Backdoor access to host memory

    Peripheral Bus (VME, FutureBus, etc.)

    Host

    Memory

    Processor

    Cache

    Host

    Processor

    Peripheral Bus Interface/DMA

    I/O Channel Interface

    BufferMemory

    ROM

    Proc

    I/O Controller

  • 8/3/2019 lec_17_IOInterface

    31/31

    I/O Data Flow

    I/O Device Head/DiskAssembly

    DMAover Peripheral BusOS Buffers (>10 MByte)

    Xfer over Serial InterfaceTrack Buffers (32K - 256KBytes)Embedded Controller

    Memory-to-Memory CopyApplication Address Space

    Host Processor

    Xfer over Disk ChannelHBA Buffers (1 M - 4 MBytes)I/O Controller

    Impediment to high performance: multiple copies,complex hierarchy