8/2/2019 Trans Met A http://slidepdf.com/reader/full/trans-met-a 1/19 Transmeta Crusoe Microprocessor Dr. Doug L. Hoffman Computer Science 330 Spring 2002


Transmeta Crusoe Microprocessor

Dr. Doug L. Hoffman

Computer Science 330

Spring 2002


Transmeta Crusoe

"Today [in RISC] we have large design teams and long design cycles. The performance story is also much less clear now. The die sizes are no longer small. It just doesn't seem to make as much sense. Superscalar and out-of-order execution are the biggest problem areas that have impeded performance [leaps]. The MIPS R10,000 and HP PA-8000 seem much more complex to me than today's standard CISC architecture, which is the Pentium II. So where is the advantage of RISC, if the chips aren't as simple anymore?"

-- David Ditzel, Transmeta CEO


Transmeta’s 80x86 Architecture?

Crusoe microprocessors can run the same software that runs on IBM PC-compatible personal computers.

Smaller, simpler logic: only about half the logic transistors of an x86 processor.

Consumes between one-third and one-thirtieth the power.

Implements none of the x86 instructions in hardware.


x86 vs. Crusoe

In the figure, the blue area is silicon and the yellow area is software. Crusoe's blue part is smaller because the branch prediction and out-of-order execution (OOO) hardware has moved off the die and into software. Those functions are now performed at run time by a special program while the application code is executing.


Transmeta’s Crusoe 

The highest-performance Crusoe chip, the TM5400


Crusoe Features

Dynamic binary translation gives programs the impression that they are running on an x86 machine.

VLIW processor executes up to four instructions in parallel.

LongRun power control adjusts CPU power to the tasks being performed.


Transmeta-ese

Individual instructions are called atoms.

VLIW instruction groups are called molecules.

Commit and rollback allow instructions to be “undone”.

Code Morphing®
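
The commit and rollback idea can be illustrated with a small sketch. It assumes a working register set backed by a shadow copy of the last committed state; the class name, method names, and register names are hypothetical and chosen only for illustration, and real hardware does this in silicon rather than in a dictionary.

```python
class ShadowedRegisters:
    """Working registers backed by a shadow copy of the last committed state."""

    def __init__(self, names):
        self.working = {name: 0 for name in names}  # speculative state
        self.shadow = dict(self.working)            # last committed state

    def commit(self):
        # Make the current working values the new committed state.
        self.shadow = dict(self.working)

    def rollback(self):
        # Discard every change made since the last commit.
        self.working = dict(self.shadow)


# Hypothetical register names, used only for this example.
regs = ShadowedRegisters(["eax", "ebx"])
regs.working["eax"] = 5
regs.commit()               # eax = 5 is now committed
regs.working["eax"] = 99    # speculative update
regs.rollback()             # the update is undone; eax is 5 again
```

Copying the whole register set on every commit is the simplest possible model; it is meant to show the semantics, not how the hardware is built.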


Transmeta-ese


VLIW vs. Superscalar

A "traditional" VLIW machine does reordering and parallelism hunting in software. For a straight-ahead VLIW design like Intel's IA-64, the piece of software that does all this is the compiler. The compiler extracts the parallelism from the code, looks for dependencies, etc., and produces optimized code that the VLIW core can run as fast as possible, in order.
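
A rough sketch of the dependency check a VLIW compiler performs is shown below. It is a deliberately simplified, greedy packer that keeps program order and does no reordering. It assumes each instruction is a (destination, sources) pair, that all operands in a group are read before any result is written, and a group width of four; the function name and the register names are hypothetical.

```python
def pack_into_groups(instructions, width=4):
    """Greedily pack instructions, in program order, into parallel groups.

    Each instruction is a (dest, sources) pair. A new group is started when
    the current one is full, or when the next instruction reads or rewrites
    a register written earlier in the same group.
    """
    groups = []
    current = []
    written = set()  # registers written by instructions in the current group
    for dest, sources in instructions:
        depends = dest in written or any(src in written for src in sources)
        if depends or len(current) == width:
            groups.append(current)
            current, written = [], set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        groups.append(current)
    return groups


program = [
    ("r1", ("r2", "r3")),   # independent
    ("r4", ("r5",)),        # independent, can share a group with the first
    ("r6", ("r1", "r4")),   # needs r1 and r4, so it starts a new group
    ("r7", ("r6",)),        # needs r6, so it starts another group
]
groups = pack_into_groups(program)
group_sizes = [len(group) for group in groups]
```

A production compiler would also reorder independent instructions to fill empty slots; this sketch only shows where dependencies force a group boundary.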


Code Morphing 

The x86 architecture is an ill-defined amoeba containing such features as segmentation, ASCII arithmetic, and variable-length instructions; the square inside the blob is the VLIW processor and its functions.


Code Morphing 

Since Crusoe is a VLIW machine that's made to run code compiled for a superscalar machine, its compilation and scheduling scheme is sort of a hybrid of both approaches. Crusoe's Code Morphing software actually takes a compiled x86 program and recompiles it, on the fly, to Crusoe's native VLIW instruction format. This recompilation uses sophisticated compiler algorithms to extract parallelism from the code, look for dependencies, and do all those things that a state-of-the-art VLIW compiler does.


Code Morphing Details

Takes x86 instructions and recompiles them on the fly into VLIW instructions (atoms).

As it recompiles them, it optimizes them, making them run, in many cases, more efficiently than the original x86 code.

Finally, a scheduler reorders the atoms and groups them into molecules.

Once translated, the VLIW code is stored in a special part of memory, accessible only by the Code Morphing software, so that particular program need not be translated again.
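
The "translate once, reuse afterwards" step can be sketched as a cache keyed by the address of the x86 code. This is a minimal illustration under stated assumptions: the class, the `fake_translate` stand-in, and the address `0x1000` are all hypothetical, and nothing here reflects how the actual Code Morphing software is organized.

```python
class TranslationCache:
    """Stores translated code so each x86 block is translated only once."""

    def __init__(self, translate):
        self._translate = translate  # hypothetical x86-block -> native-code function
        self._cache = {}             # x86 block address -> translated code
        self.translations = 0        # how many times the translator actually ran

    def lookup(self, address, x86_block):
        if address not in self._cache:
            self._cache[address] = self._translate(x86_block)
            self.translations += 1
        return self._cache[address]


def fake_translate(x86_block):
    # Stand-in for a real translator: it only tags each instruction.
    return [f"native({insn})" for insn in x86_block]


cache = TranslationCache(fake_translate)
block = ["mov eax, 1", "add eax, ebx"]
first = cache.lookup(0x1000, block)    # translated on first use
second = cache.lookup(0x1000, block)   # served from the cache
```

After both lookups the translator has run only once, which is the point of keeping the translated code in memory.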

But that’s not all... 


Code Morphing Details

Software continues to monitor how an application is being used.

If it finds that a process is spending a lot of time in one part of the code, it turns on more levels of optimization to make that part of the program run faster.

It only optimizes the parts of the code being used. Things that are executed infrequently are not optimized.
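
One common way to implement this kind of monitoring is an execution counter per block of code, with a threshold that marks a block as "hot". The sketch below assumes that approach; the slides do not say how Crusoe detects hot code, and the class name, threshold value, and addresses are hypothetical.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value, chosen only for illustration


class HotSpotMonitor:
    """Counts block executions and flags blocks that run often enough."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()   # block address -> times executed
        self.optimized = set()    # addresses selected for extra optimization

    def record_execution(self, address):
        self.counts[address] += 1
        if self.counts[address] >= self.threshold:
            # A real system would re-translate the block here with more
            # optimization enabled; this sketch only records the decision.
            self.optimized.add(address)
        return address in self.optimized


monitor = HotSpotMonitor()
for address in [0x1000, 0x2000, 0x1000, 0x1000]:
    monitor.record_execution(address)
```

In this run only the block at `0x1000` reaches the threshold; the block at `0x2000` ran once and is left alone, matching the idea that rarely executed code is not worth optimizing.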


Code Morphing 

One of the challenges of creating the Code Morphing software was to make the Crusoe processor, in many cases, bug-compatible with the x86 so that it would generate the so-called Blue Screen of Death at many of the same times an x86 processor would.


Processor Features

Five execution units: two arithmetic-logic, a load/store, a branch, and a floating-point.

Can execute four instructions in a cycle.

Sixty-four general-purpose and 32 floating-point working registers, shadowed by 48 general-purpose and 16 floating-point registers.

64KB level one (L1) caches and a 256KB level two (L2) cache.
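
The unit counts and the four-per-cycle limit above imply a simple resource check when instructions are grouped. The sketch below assumes each instruction occupies exactly one execution unit of a matching kind, which is a simplification; the kind labels and the function name are hypothetical, and the real instruction formats are not described in these slides.

```python
from collections import Counter

# Unit counts taken from the list above; the kind labels are made up here.
UNIT_COUNTS = {"alu": 2, "load_store": 1, "branch": 1, "fp": 1}
MAX_PER_CYCLE = 4


def fits_in_one_cycle(kinds):
    """Return True if these instruction kinds could all issue in one cycle."""
    if len(kinds) > MAX_PER_CYCLE:
        return False
    needed = Counter(kinds)
    return all(count <= UNIT_COUNTS.get(kind, 0) for kind, count in needed.items())


ok = fits_in_one_cycle(["alu", "alu", "load_store", "branch"])  # one unit each
too_many_fp = fits_in_one_cycle(["fp", "fp"])                   # only one FP unit
too_wide = fits_in_one_cycle(["alu", "alu", "load_store", "branch", "fp"])
```

The first group fits, the second fails because there is a single floating-point unit, and the third fails because it has five instructions even though a unit exists for each.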

Even more important...


What it doesn’t have 

No superscalar decode, grouping, or issue logic.

No register renaming.

No segmentation hardware.

No floating-point stack hardware.

Less interlock and bypassing logic than a traditional central processing unit.


Low Power Features

If you have fewer transistors, you burn less power.

Only those functional units that are absolutely needed to execute an instruction are turned on.

LongRun hardware adjusts both the supply voltage and the clock frequency so that each application runs only as fast as it must to get the job done.
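
Why adjusting both voltage and frequency matters can be seen from the usual first-order model for CMOS dynamic power, P ≈ C·V²·f. That model is a textbook approximation, not something stated in these slides, and the scale factors below are made-up illustrative numbers rather than LongRun operating points.

```python
def relative_dynamic_power(voltage_scale, frequency_scale):
    """Dynamic power relative to full speed, using P proportional to V^2 * f."""
    return voltage_scale ** 2 * frequency_scale


# Halving the clock alone halves dynamic power.
frequency_only = relative_dynamic_power(voltage_scale=1.0, frequency_scale=0.5)

# Halving the clock and also dropping the supply voltage to 80% saves more.
voltage_and_frequency = relative_dynamic_power(voltage_scale=0.8, frequency_scale=0.5)
```

Under this model the second case draws roughly a third of full-speed dynamic power, compared with half when only the clock is lowered, because the voltage term is squared.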


Hardware and Software Architecture

Processor upgrades are simplified because the layer of software between the applications and the chip frees the designers to change the chip architecture without causing x86 software developers to have to recompile their code.

Code Morphing software can be updated independently of hardware by loading a software upgrade into Flash memory.


The Last Word

"Considering the complexity of the project, it is amazing how well it works, how fast it works, and how low-power it is. For the end-user, this is just a normal PC, but under the hood, it is a technological marvel."

-- Marc Fleischmann, Transmeta

"Revolutionary may be an overstatement, but they are definitely different..."

-- Cahners Microprocessor Report