
CERN

les.robertson@cern.ch december-00

The All-PC Policy at CERN

EDF Seminar

Clamart, 15 December 2000

Les Robertson

CERN/IT, Geneva


Outline

The problem

The strategy

The difficulties


The problem


Architectures & operating systems supported at end 1999

Operating systems: AIX, Windows NT, Irix, Solaris, Digital Unix, HP-UX, MAC-OS, Linux, Windows 95, Windows 2000

Architectures: SPARC, MIPS, Intel IA-32, PA-RISC, PowerPC, Alpha

The legacy of ten years of RISC computing


How many architectures and operating systems are really necessary?

How much does support cost? How much is diversity worth?

How can limits on choice be imposed in a scientific research environment?

CERN: the European Organization for Nuclear Research

20 European countries

2,700 employees

6,000 users


The Large Hadron Collider - LHC

LHC accelerator under construction
• Proton-proton collider
• 27 km of super-conducting magnets
• Target date for first beams: 2005
• Four experiments

Example: CMS, with 2000 physicists from 150 universities

The LHC Detectors

CMS, ATLAS, LHCb

3.5 PetaBytes/year

~10^8 events/year


HEP Computing Characteristics

Large numbers of independent events: trivial parallelism

Large data sets: smallish records, mostly read-only

Modest I/O rates: a few MB/sec per fast processor

Modest floating point requirement: SPECint performance is the relevant measure

High Throughput Computing

Very large aggregate requirements (computation, data)

• Scaling up is not just big, it is also complex

• ... and once you exceed the capabilities of a single geographical installation ...?
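The "trivial parallelism" of independent events can be illustrated with a minimal sketch. Everything here is hypothetical: `reconstruct`, `process_on_farm` and the toy events are stand-ins, and a thread pool merely plays the role of the farm's application servers (CPU-bound work on a real farm would run in separate processes or on separate machines).

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event):
    """Hypothetical per-event work: it depends only on this one event."""
    return sum(hit * hit for hit in event)


def process_serial(events):
    """Reference result: process the events one after another."""
    return [reconstruct(event) for event in events]


def process_on_farm(events, n_workers=4):
    """Process events concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(reconstruct, events))


# Toy data: each "event" is a small, read-only record of detector hits.
events = [[i, i + 1, i + 2] for i in range(1000)]
results = process_on_farm(events)
```

Because no event needs information from any other, the concurrent result is identical to the serial one, and adding workers requires no change to the per-event code.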

Figure: Generic computing farm (network servers, tape servers, disk servers, application servers)

Figure: Estimated CPU Capacity at CERN, 1998-2006 (x-axis: year; y-axis: K SI95, from 0 to 2,500). Series: LHC and Non-LHC. Annotations: "~10K SI95, 1200 processors"; "10-20K CPUs?"; "Moore's Law: estimate of the capacity for a fixed level of investment and a fixed number of processors".
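The Moore's Law curve on the chart can be approximated with a short calculation. This is a sketch under stated assumptions: the 18-month doubling period and the choice of base year are illustrative and are not taken from the slide; only the ~10K SI95 starting figure comes from the chart.

```python
def projected_capacity(base_si95, years_ahead, doubling_months=18.0):
    """Capacity after `years_ahead` years at a fixed level of investment.

    ASSUMPTION: performance per unit cost doubles every `doubling_months`
    months. The 18-month default is illustrative, not read off the chart.
    """
    doublings = (years_ahead * 12.0) / doubling_months
    return base_si95 * 2.0 ** doublings


# ~10K SI95 is the figure on the chart; the base year is a hypothetical choice.
base = 10_000
for years in range(0, 7):
    print(years, round(projected_capacity(base, years)))
```

Under this assumption six years bring four doublings, so 10K SI95 grows to about 160K SI95. A requirement in the millions of SI95 therefore cannot be met by Moore's Law alone at a fixed processor count.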

LHC physics facility (4 experiments)

Figure: Generic computing farm (network servers, tape servers, disk servers, application servers), at LHC scale:

2 M SPECint95: 10-20K processors

2 PByte disk: >20 K disks
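The per-unit figures implied by these totals can be worked out directly. A minimal sketch: the totals come from the slide, while the conversion 1 PByte = 10^6 GB and the helper name `per_unit` are assumptions made for illustration.

```python
TOTAL_SI95 = 2_000_000          # 2 M SPECint95 (from the slide)
PROCESSORS = (10_000, 20_000)   # 10-20K processors (from the slide)
TOTAL_DISK_GB = 2_000_000       # 2 PByte, ASSUMING 1 PByte = 10**6 GB
DISKS = 20_000                  # ">20 K disks": a lower bound on the count


def per_unit(total, count):
    """Average share of `total` carried by each of `count` units."""
    return total / count


# SI95 per processor at the low and high ends of the processor estimate.
si95_per_cpu = [per_unit(TOTAL_SI95, n) for n in PROCESSORS]

# Because the disk count is a lower bound, this is an upper bound per disk.
gb_per_disk = per_unit(TOTAL_DISK_GB, DISKS)

print(si95_per_cpu)
print(gb_per_disk)
```

The totals imply roughly 100 to 200 SI95 per processor and at most about 100 GB per disk on average.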


Summary of the problem

HEP is using far too many operating systems, in many cases with only slightly different functionality or hardware cost benefits, and at a high cost for users and support teams.

The scale of LHC computing
• massive numbers of processors/boxes
• integration of regional computing centres and CERN
• the problem is how to manage on this scale while limiting the costs of equipment, management & support

We must reduce the diversity while retaining the flexibility to use low-cost, mass market components and adapt rapidly to changing physics needs.


The strategy


Opportunity

PCs + { Linux | Windows } offer an historic opportunity to reduce the solution set

Costs and performance
• PCs will consistently be among the very best price/performers for HEP codes
• They may not be the fastest, but they are fast enough

Linux: a non-proprietary operating system compatible with the recent Unix history

Windows: a mass market alternative, widely used on the desktop


Policy

Restrict ourselves to PC hardware with Linux or Windows 2000

Develop a migration plan
• progressively freeze support for other Unixes, announcing end-dates which are reasonable for old experiments
• strongly discourage further investments in RISC systems by current and future experiments
• install a large Linux public facility as a testbed for future experiments

Concentrate investment in Linux and Windows
• bring support up to the standards of proprietary Unixes
• tackle the problems of scaling the management and performance of physics farms and desktops
• seek HEP-wide consensus


But do not be unrealistic ----

This is a convergence policy which looks realistic now and will provide a single starting point for LHC computing, but we can be sure that the industry will not stand still, and we shall sooner or later have to expand the systems and architectures supported.

Figure: the end-1999 diagram of operating systems (AIX, WNT, Irix, Solaris, Digital Unix, HP-UX, MAC-OS, Linux, Windows 95) and architectures (SPARC, MIPS, Intel IA-32, PA-RISC, PowerPC, Alpha), now labelled: Linux, Windows 2000, Intel IA-64, - - - ?


The difficulties and the state of the migration


Difficulties - I

Physics: (almost) entirely Unix based

Linux is not quite ready
• (Too) wide a choice of kernels, compilers, debuggers
• Different versions supported by different applications
• Some applications not supported on Linux
• Complex packages (Oracle, AFS): better go with the standard platform
• Stability problems under load
• Who provides in-depth, on-site Linux systems support?

Solution:
• Standard Linux Package certified for all CERN applications
• Solaris/SPARC for special purposes
• Open posts for Linux experts


Difficulties - II

In a research environment
• Easy to estimate the costs of systems support
• Hard to estimate the cost of application migration
• The application experts have already moved on
• The developers have other (more interesting) problems to solve

The problem is not only to port the code but (more important) to acquire confidence in the physics results
• Compiler, architecture, old bugs
• But there are signs that Linux+Intel are as good as any!
• In the past, the production use of multiple architectures was an important factor in finding bugs
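One way to build confidence in ported code is to run the same job on two platforms and compare the outputs within a tolerance. The sketch below is illustrative only: `compare_results`, the quantity names and the numbers are all made up, and it does not describe the procedure actually used at CERN.

```python
import math


def compare_results(reference, candidate, rel_tol=1e-9):
    """Return the names of quantities that differ by more than `rel_tol`.

    `reference` and `candidate` map a quantity name to a float, e.g. the
    output of the same job run on two platforms. All names are hypothetical.
    """
    mismatches = []
    for name in sorted(reference.keys() | candidate.keys()):
        if name not in reference or name not in candidate:
            mismatches.append(name)  # quantity missing on one side
        elif not math.isclose(reference[name], candidate[name], rel_tol=rel_tol):
            mismatches.append(name)
    return mismatches


# Made-up numbers for illustration only.
risc_run = {"mean_energy": 45.1200000, "n_tracks": 1234.0, "chi2": 1.07}
linux_run = {"mean_energy": 45.1200001, "n_tracks": 1234.0, "chi2": 1.12}

print(compare_results(risc_run, linux_run, rel_tol=1e-6))
```

With a relative tolerance of 1e-6 only `chi2` is flagged. A discrepancy of that size is the kind that compiler, architecture or old-bug differences could produce, and that running on a second platform can expose.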


Current Status – Physics

For older experiments
• Strong resistance to the aggressive migration proposal
• Agreement for a complete freeze on all proprietary Unixes during 2003

For future experiments (not yet collecting data)
• General agreement on Linux/Intel for production, but they require a second (limited) development platform for validation

For new experiments collecting data now
• Easy to calculate the benefits
• Have already completed migration


Current Status – other applications

Desktop applications
• Web, Office, ...: Windows 2000

Engineering applications
• Aggressive migration plan to Windows NT/2000 & Linux, with some residual SUN
• Major exception is mechanical CAE (Euclid + Digital Unix)

Administration
• Database (Oracle) on SUN
• Clients Web-based (Outlook/Netscape)
• Strong pockets of MAC resistance, led by the Directorate


Conclusions

The enormous needs of the LHC demand standardisation and the use of low-cost components

Opportunity: Linux + Windows 2000 with Intel IA32/64 & Ethernet

Great inertia (resistance?) on the part of the "old" experiments: it will take 4 years to complete the migration

But already more than three quarters of the installed systems and 90% of the capacity are Linux/Intel
