Upload
ah-chong
View
218
Download
0
Embed Size (px)
Citation preview
8/3/2019 lec_17_IOInterface
1/31
I/O Interface CS510 Computer Architectures Lecture 17 - 1
Lecture 17
I/O Interfaces and
I/O Busses
8/3/2019 lec_17_IOInterface
2/31
I/O Interface CS510 Computer Architectures Lecture 17 - 2
Storage System Issues
Historical Context of Storage I/O
Storage I/O Performance Measures
Secondary and Tertiary Storage Devices Processor Interface Issues
I/O Buses
Redundant Arrays of Inexpensive Disks (RAID)
8/3/2019 lec_17_IOInterface
3/31
I/O Interface CS510 Computer Architectures Lecture 17 - 3
Disk Time Example
Disk Parameters: Transfer size is 8K bytes
Advertised average seek is 12 ms
Disk spins at 7200 RPM
Transfer rate is 4 MB/sec
Controller overhead is 2 ms
Assume that disk is idle so no queuing delay
What is Average Disk Access Time for a Sector?
Ave seek + ave rot delay + transfer time + controller overhead
12 ms + 0.5/(7200 RPM/60) + 8 KB/4 MB/s + 2 ms 12 + 4.15 + 2 + 2 = 20 ms
Advertised seek time assumes no locality: typically 1/4 to 1/3advertised seek time
Average Disk Access Time for a Sector: 20 ms => 12 ms
8/3/2019 lec_17_IOInterface
4/31
I/O Interface CS510 Computer Architectures Lecture 17 - 4
Processor Interface Issues
Interconnections
Busses
Processor interface
Interrupts Memory mapped I/O
I/O Control Structures
Polling
Interrupts
DMA I/O Controllers
I/O Processors
Capacity, Access Time, Bandwidth
8/3/2019 lec_17_IOInterface
5/31
I/O Interface CS510 Computer Architectures Lecture 17 - 5
I/O Interface
Seperate I/O instructions (in,out)
Lines distinguish betweenI/O and memory transfers
VME busMultibus-IINubus
40 Mbytes/secoptimistically
10 MIPS processorcompletelysaturates the bus!
CPU
Interface Interface
Peripheral Peripheral
Memory
memorybus
Independent I/O Bus
Common Memory
& I/O bus
CPU
Interface Interface
Peripheral Peripheral
Memory
8/3/2019 lec_17_IOInterface
6/31
I/O Interface CS510 Computer Architectures Lecture 17 - 6
Memory Mapped I/O
Single Memory & I/O BusNo Separate I/O Instructions
CPU
Interface Interface
Peripheral Peripheral
Memory
L2 $
Memory Bus
Memory Bus Adapter
I/O bus
$
CPU
ROM
RAM
I/O
8/3/2019 lec_17_IOInterface
7/31I/O Interface CS510 Computer Architectures Lecture 17 - 7
Programmed I/O (Polling)
CPU
IOC
device
Memory
not an efficient way touse the CPU, unlessthe device is very fast!
but checks for I/Ocompletion can bedispersed among
computationallyintensive code
busy on wait loopIs dataready?
readdata
storedata
yesno
done?no
yes
8/3/2019 lec_17_IOInterface
8/31I/O Interface CS510 Computer Architectures Lecture 17 - 8
Interrupt Driven Data Transfer
(2) save PC
(3) interruptservice addr
(4)
1000 transfers/second:1000 interrupts @ 2 msec per interrupt=> 2 msec1000 interrupt service @ 98 msec each=> 98 msec
100 msec = 0.1 CPU seconds
CPU
IOC
device
Memory
User program halts onlyduring actual transfer
memory
userprogram
interruptserviceroutine
readstore
...rti
addsubandornop
(1) I/Ointerrupt
8/3/2019 lec_17_IOInterface
9/31I/O Interface CS510 Computer Architectures Lecture 17 - 9
Direct Memory Access
0ROM
RAM
Peripherals
DMACn
MemoryMapped I/O
Time to do 1000 xfers in 1 msec:
1 DMA set-up sequence: @ 50 msec1 interrupt: @ 2 msec1 interrupt service sequence: @ 48 msec
100msec.0001 second of CPU time
CPU sends a starting address,direction(R/W), and word countto DMAC. Then issues "start".
DMAC provides;Peripheral controller Handshake signalsMemory Addresses
Handshake signals
CPU
IOC
device
Memory DMAC
I/O
8/3/2019 lec_17_IOInterface
10/31I/O Interface CS510 Computer Architectures Lecture 17 - 10
Input/Output Processors
(2)
(3)
Device to/from memory transfers arecontrolled by the IOP directly.
IOP steals memory cycles from CPU
OP Device Address
target device
where cmnds are
looks in memory for commands
CPU IOP
Mem
D1
D2
Dn
. . .
main memorybus
I/Obus
OP Addr Cnt Other
whatto do
where
to putdata
how
much
specialrequests
Commandmemory
CPU
IOP
(1)issuesinstructionto IOP
interruptswhen done(4)
8/3/2019 lec_17_IOInterface
11/31I/O Interface CS510 Computer Architectures Lecture 17 - 11
Relationship
to Processor Architecture I/O instructions and busses have largely disappeared
Interrupt vectors have been replaced by jump tablesPC
8/3/2019 lec_17_IOInterface
12/31I/O Interface CS510 Computer Architectures Lecture 17 - 12
Relationshipto Processor Architecture
Caches required for processor performance causeproblems for I/O
Flushing is expensive, I/O pollutes cache
Solution is borrowed from shared memory multiprocessors"snooping"
Virtual memory frustrates DMA
Stateful processors are hard to switch context
8/3/2019 lec_17_IOInterface
13/31I/O Interface CS510 Computer Architectures Lecture 17 - 13
Interconnect Trends
>1000 m 10 - 100 m 1 mDistance
10 - 100 Mb/s 40 - 1000 Mb/s 320 - 1000+ Mb/sBandwidth
high (>ms) medium low (
8/3/2019 lec_17_IOInterface
14/31I/O Interface CS510 Computer Architectures Lecture 17 - 14
Backplane Architectures
Distinctions begin to blur:
SCSI channel is like a busFutureBus is like a channel (disconnect/reconnect)HIPPI forms links in high speed switching fabrics
128No
16 - 32Single/Multiple
MultipleNo
Async25
12.927.913.621
.5 mIEEE 1014
96Yes32
Single/MultipleMultiple
OptionalAsync37
15.595.220.820
.5 mIEEE 896
Metric VME FutureBus96Yes32
Single/MultipleMultipleOptional
Sync201040
13.321
.5 mANSI/IEEE 1296
MultiBus IIBus Width (signals)Address data multiplexed?Data WidthXfer Size# of Bus MastersSplit TransactionsClockingBandwidth, Single Word (0 ns mem)Bandwidth, Single Word (150ns mem)Bandwidth Multiple Word (0 ns mem)Bandwidth Multiple Word (150 ns mem)Max # of devicesMax Bus LengthStandard
25na8
Single/MultipleMultiple
OptionalEither5, 1.55, 1.55, 1.55, 1.5
725 m
ANSI X3.131
SCSI-I
8/3/2019 lec_17_IOInterface
15/31I/O Interface CS510 Computer Architectures Lecture 17 - 15
Bus-Based Interconnect
Bus: a shared communication link betweensubsystems
Low cost: a single set of wires is shared multiple ways
Versatility: Easy to add new devices & peripherals mayeven be ported between computers using common bus
Disadvantage
A communication bottleneck, possibly limiting themaximum I/O throughput
Bus speed is limited by physical factors the bus length
the number of devices (and, hence, bus loading).
these physical limits prevent arbitrary bus speedup.
8/3/2019 lec_17_IOInterface
16/31I/O Interface CS510 Computer Architectures Lecture 17 - 16
Bus-Based Interconnect
Two generic types of busses:
I/O busses (sometimes called a channel)
lengthy
many types of devices connected
wide range in the data bandwidth
follow a bus standard
CPU-Memory buses (sometimes called a backplane)
short
high speed, matched to the memory system to maximizememory-CPU bandwidth
To lower costs, low cost (older) systems combine together
Bus transaction
Sending address & receiving or sending data
8/3/2019 lec_17_IOInterface
17/31I/O Interface CS510 Computer Architectures Lecture 17 - 17
Bus Protocols
Multibus: 20 address, 16 data, 5 control, 50ns Pause
Bus Master: has ability to control the bus, initiates a transaction
Bus Slave: module activated by the transaction
Bus Communication Protocol: specification of sequence ofevents and timing requirements in transferring information.
Asynchronous Bus Transfers: control lines (req., ack.) serve toorchestrate sequencing
Synchronous Bus Transfers: sequence relative to the commonclock
Master Slave
Control Lines
Address LinesData Lines
8/3/2019 lec_17_IOInterface
18/31I/O Interface CS510 Computer Architectures Lecture 17 - 18
Synchronous Bus Protocols
Pipelined/Split transaction Bus Protocol
Address
Data
Wait
addr 1
data 0
addr 2
wait 1
data 1
addr 3
OK 1
begin read Read complete
Address
Data
Read
Wait
Clock
Read transaction
8/3/2019 lec_17_IOInterface
19/31I/O Interface CS510 Computer Architectures Lecture 17 - 19
Asynchronous Handshake
Write Transaction
t3: Master releases req
Address
Data
Read
Req.
Ack.4 Cycle Handshake
t0 t1 t2 t3 t4 t5
Master Asserts Address Next Address
Master Asserts Data
t0: Master has obtained control and asserts address, direction, data;Waits a specified amount of time for slaves to decode target
t2: Slave asserts ack, indicating data receivedt1: Master asserts request line
t4: Slave releases ack
8/3/2019 lec_17_IOInterface
20/31I/O Interface CS510 Computer Architectures Lecture 17 - 20
Asynchronous Handshake
Time Multiplexed Bus: address and data share lines
Read TransactionAddress
Data
Read
Req
Ack
Master Asserts Address Next Address
t0 t1 t2 t3 t4 t5
4 Cycle Handshake
t0: Master has obtained control and asserts address, direction;
Waits a specified amount of time for slaves to decode
t2: Slave asserts ack, indicating ready to transmit data;Slave asserts data
t3: Master releases req, data received
t1: Master asserts request line
t4: Slave releases ack
8/3/2019 lec_17_IOInterface
21/31I/O Interface CS510 Computer Architectures Lecture 17 - 21
Bus ArbitrationParallel (Centralized) Arbitration
BR BG
M
BR BG
M
BR BG
M
Bus Arbiter
M
BGi BGo
BRM
BGi BGo
BRM
BGi BGo
BR
BG
BR
A.U.
BR A
M
BR A
M
BR A
M
BRA
A.U.
Serial Arbitration (daisy chaining)
Polling
BR: Bus RequestBG: Bus Grant
Bus sequence:BR
BG
Busy
data
8/3/2019 lec_17_IOInterface
22/31
I/O Interface CS510 Computer Architectures Lecture 17 - 22
Bus Options
Bus width Separate address Multiplex address& data lines & data lines
Data width Wider is faster Narrower is cheaper(e.g., 32 bits) (e.g., 8 bits)
Transfer size Multiple words has Single-word transferless bus overhead is simpler
Bus masters Multiple Single master(requires arbitration) (no arbitration)
Split Yes - separate No - continuous transaction?Request and Reply connection is cheaperpackets gets higher and has lower latencybandwidth(needs multiple masters)
Clocking Synchronous Asynchronous
Option High performance Low cost
8/3/2019 lec_17_IOInterface
23/31
I/O Interface CS510 Computer Architectures Lecture 17 - 23
1990 Bus Survey
Signals 128 96 96 16 8
Addr/Data mux no yes yes n/a n/a
Data width 16 - 32 32 32 16 8
Masters multi multi multi single multiClocking Async Async Sync Async either
MB/s 25 37 20 25 1.5 (asyn)
(0ns, word) 5 (sync)
150ns word 12.9 15.5 10 = =
0ns block 27.9 95.2 40 = =
150ns block 13.6 20.8 13.3 = =Max devices 21 20 21 8 7
Max meters 0.5 0.5 0.5 50 25
Standard IEEE 1014 IEEE 896.1 ANSI/IEEE ANSI X3.129 ANSI X3.1311296
VME FutureBus Multibus II IPI SCSI
8/3/2019 lec_17_IOInterface
24/31
I/O Interface CS510 Computer Architectures Lecture 17 - 24
VME
3 96-pin connectors
128 defined as standard, rest customer defined
32 address
32 data
64 command & power/ground lines
8/3/2019 lec_17_IOInterface
25/31
I/O Interface CS510 Computer Architectures Lecture 17 - 25
SCSI: Small ComputerSystem Interface
Up to 8 devices to communicate on a bus or string at
sustained speeds of 4-5 MBytes/sec
SCSI-2 up to 20 MB/sec
Devices can be slave (target) or master(initiator)
SCSI protocol: a series of phases, during which specificactions are taken by the controller and the SCSI disks
Bus Free: No device is currently accessing the bus
Arbitration: When the SCSI bus goes free, multiple devices mayrequest (arbitrate for) the bus; fixed priority by address
Selection: informs the target that it will participate (Reselection if
disconnected) Command: the initiator reads the SCSI command bytes from host
memory and sends them to the target
Data Transfer: data in or out, initiator: target
Message Phase: message in or out, initiator: target (identify,save/restore data pointer, disconnect, command complete)
Status Phase: target, just before command complete
8/3/2019 lec_17_IOInterface
26/31
I/O Interface CS510 Computer Architectures Lecture 17 - 26
SCSI Bus: Channel
ArchitectureCommand Setup
Arbitration
Selection
Message Out (Identify)
CommandDisconnect to seek/ ll buf fer
Message In (Disconnect)
- - Bus Free - -
Arbitration
Reselection
Message In (Identify)
Data Transfer
Data In
Disconnect to ll buf fer
Message In (Save Data Ptr)
Message In (Disconnect)
- - Bus Free - -
Arbitration
Reselection
Message In (Identify)
Command Completion
Status
Message In (Command Complete)
If no disconnect is needed
Completion
Message In (Restore Data Ptr)
peer-to-peer protocolsinitiator/targetlinear byte streamsdisconnect/reconnect
8/3/2019 lec_17_IOInterface
27/31
I/O Interface CS510 Computer Architectures Lecture 17 - 27
1993 I/O Bus Survey
Originator Sun DEC IBM Intel
Clock Rate (MHz) 16-25 12.5-25 async 33Addressing Virtual Physical Physical Physical
Data Sizes (bits) 8,16,32 8,16,24,32 8,16,24,32,64 8,16,24,32,64
Master Multi Single Multi Multi
Arbitration Central Central Central Central
32 bit read (MB/s) 33 25 20 33Peak (MB/s) 89 84 75 111 (222)
Max Power (W) 16 26 13 25
Bus SBus TurboChannel MicroChannel PCI
8/3/2019 lec_17_IOInterface
28/31
I/O Interface CS510 Computer Architectures Lecture 17 - 28
1993 MP Server Memory Bus
SurveyOriginator HP SGI Sun
Clock Rate (MHz) 60 48 66
Split transaction? Yes Yes Yes?Address lines 48 40 ??
Data lines 128 256 144 (parity)Data Sizes (bits) 512 1024 512
Clocks/transfer 4 5 4?
Peak (MB/s) 960 1200 1056
Master Multi Multi Multi
Arbitration Central Central Central
Addressing Physical Physical PhysicalSlots 16 9 10
Busses/system 1 1 2
Length 13 inches 12? inches 17 inches
Bus Summit Challenge XDBus
8/3/2019 lec_17_IOInterface
29/31
I/O Interface CS510 Computer Architectures Lecture 17 - 29
Communications Networks
Performance limiter is memory system, OS overhead, not protocols
Send/receive queues in processor memories Network controller copies back and forth via DMA No host intervention needed Interrupt host when message sent or received
NodeProcessor
ControlReg. I/F
NetI/F Memory
RequestBlock
ReceiveBlock
Media
Network Controller
Peripheral Backplane Bus
DMA
. . .
Processor MemoryList of request blocks
Data to be transmitted
. . .
List of receive blocks
Data receivedDMA
. . .
List of free blocks
8/3/2019 lec_17_IOInterface
30/31
I/O Interface CS510 Computer Architectures Lecture 17 - 30
I/O Controller Architecture
Request/Response block interface
Backdoor access to host memory
Peripheral Bus (VME, FutureBus, etc.)
Host
Memory
Processor
Cache
Host
Processor
Peripheral Bus Interface/DMA
I/O Channel Interface
BufferMemory
ROM
Proc
I/O Controller
8/3/2019 lec_17_IOInterface
31/31
I/O Data Flow
I/O Device Head/DiskAssembly
DMAover Peripheral BusOS Buffers (>10 MByte)
Xfer over Serial InterfaceTrack Buffers (32K - 256KBytes)Embedded Controller
Memory-to-Memory CopyApplication Address Space
Host Processor
Xfer over Disk ChannelHBA Buffers (1 M - 4 MBytes)I/O Controller
Impediment to high performance: multiple copies,complex hierarchy