Réseau des Professionnels de la Business Intelligence en Suisse Romande
4th Round Table
Lausanne, June 1, 2012
Dario Mangano, Head of Knowledge Management, Nestlé Nespresso S.A. HQ

Linked in 4eme table ronde 20120601



Page 1

Réseau des Professionnels de la Business Intelligence en Suisse Romande

4th Round Table

Lausanne, June 1, 2012

Dario Mangano, Head of Knowledge Management, Nestlé Nespresso S.A. HQ

Page 2

AGENDA

14h00 Welcome
14h30 The LinkedIn group
14h45 Load metadata
15h30 Coffee break
15h45 Loading by time zone
16h30 Future round tables
16h45 Coffee
17h30 End

Lausanne, June 1, 2012

Page 3

The LinkedIn group

AGENDA

Dario Mangano, Head of Knowledge Management, Nestlé Nespresso S.A.

Pages 4–12: The LinkedIn group (image slides)
Page 13

Load metadata management

AGENDA

Anthony Brouard, Business Intelligence Expert, Capital International

Page 14

Load metadata

Page 15

Load metadata

Anthony’s slides

Page 16

Data Management Services

Data Integration Framework
Overview

Page 17


Oriented toward Data Integration Best Practices

Data integration is a family of techniques, most commonly ETL (extract, transform, and load), but also many related techniques that are unavoidable when dealing with data integration: metadata, change data capture, file loading, publication, data quality, and more. It also always involves several technologies: database servers, database scripting, shell scripting, ...

All these techniques and technologies require developing and supporting a wide range of interfaces, using solutions that can be hand-coded, based on a vendor's tool, or a mix of both.

With such complexity in data integration systems, developing and supporting these solutions becomes very challenging.

Best practices and standards ensure that all systems are developed in a way that is much easier to support, and safer and more scalable in the face of future needs and data volumes.

The Data Integration Framework is a metadata-driven development environment that provides turnkey solutions for all of these data integration tasks:

- Metadata
- Change Data Capture
- File loading
- Data quality
- Publication
- Archiving, ...

It ensures that all of these tasks are performed in an efficient, standard way, keeping the development team focused on the real added value of the data integration system: making the data coming from the different source systems available to business users, and applying the required business rules.
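To make the metadata-driven idea concrete, here is a minimal sketch of such a wrapper, with sqlite3 standing in for the metadata repository; the step_run schema, the step definitions, and the two step functions are invented for illustration and are not the actual DIF code.

```python
# Minimal sketch of a metadata-driven wrapper: step definitions live in
# metadata, and the wrapper executes them while recording operational
# metadata (start, end, status, row counts) in a repository table.
import sqlite3
import time

repo = sqlite3.connect(":memory:")  # stand-in for the metadata repository
repo.execute("""CREATE TABLE step_run (
    interface TEXT, step TEXT, started REAL, ended REAL,
    status TEXT, rows_processed INTEGER)""")

def load_file(path):    # hypothetical file-loading step
    return 42           # pretend 42 rows were loaded

def publish(target):    # hypothetical publication step
    return 42

# In the real framework these definitions would be read from the repository.
steps = [
    {"interface": "SALES_DAILY", "step": "file_load", "fn": load_file,
     "args": ["/data/sales.csv"]},
    {"interface": "SALES_DAILY", "step": "publish", "fn": publish,
     "args": ["APP1"]},
]

for s in steps:
    started, status, rows = time.time(), "OK", 0
    try:
        rows = s["fn"](*s["args"])  # the project-specific code runs here
    except Exception:
        status = "FAILED"
    repo.execute("INSERT INTO step_run VALUES (?, ?, ?, ?, ?, ?)",
                 (s["interface"], s["step"], started, time.time(), status, rows))
repo.commit()

for row in repo.execute("SELECT interface, step, status, rows_processed FROM step_run"):
    print(row)  # the raw material for the monitoring reports shown later
```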

Page 18


Oriented toward Metadata Management

Metadata is a key feature of data integration and data warehousing.

It is the only way to get answers to questions such as:
- Which column did this data come from?
- When was this data populated in the system?
- How is this result on my report calculated?
- Is my report up to date?
- Is my system scalable?

Having these answers increases trust in the data, enables proactive monitoring of the data integration processes, ensures that data is loaded effectively, and, in the end, keeps the system from losing value over time by reducing the costs of understanding, maintenance, and repair.

The Data Integration Framework provides a metadata management solution with no development effort required from the project team:

- collecting operational metadata in real time;
- capturing business and technical metadata related to data integration processes;
- integrating all of this metadata in a metadata repository;
- providing reports to access the integrated metadata, with user-friendly navigation capabilities (drill down, drill through, direct access to log files from monitoring reports, impact analysis, ...).
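As a small illustration of how an integrated repository answers the questions above, the sketch below queries two hypothetical tables, step_run and lineage; their names and columns are assumptions, not the real DIF data model.

```python
# Sketch: answering "where does this come from?" and "is my report up to
# date?" from an integrated metadata repository (tables are invented).
import sqlite3

repo = sqlite3.connect(":memory:")
repo.executescript("""
    CREATE TABLE step_run (interface TEXT, step TEXT, ended REAL, status TEXT);
    CREATE TABLE lineage (target_column TEXT, source_column TEXT);
    INSERT INTO step_run VALUES ('SALES_DAILY', 'publish', 1338537600.0, 'OK');
    INSERT INTO lineage VALUES ('report.revenue', 'sales.amount');
""")

# "Which column did this data come from?"
print(repo.execute("SELECT source_column FROM lineage "
                   "WHERE target_column = 'report.revenue'").fetchone())

# "Is my report up to date?": last successful run of the feeding interface.
print(repo.execute("SELECT MAX(ended) FROM step_run "
                   "WHERE interface = 'SALES_DAILY' AND status = 'OK'").fetchone())
```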

Page 19


Data Integration Framework (DIF): Data Management Framework

[Diagram: the DIF components sit between the source systems and the downstream systems: file loading, notification, change data capture, archiving, data publication, business rules, metadata, monitoring, and a graphical user interface. Development teams develop against the components; support teams use them to monitor. (See slide notes for comments.)]

Page 20


Designs methods and tools to perform data integration services. Reference architecture, methods, and DIF components:

- ETL: Event-Triggered ETL, Batch ETL; ETL Development Methodology; Wrapper, File Loader
- Integration: Parsing, Matching & Merging, Consolidation, ...; Standard Integration Methods by Subject Area; Change Data Capture, Reject Management
- Pub/Sub publication: Event and Bulk Pub/Sub patterns; Publisher Module
- Metadata Management (Operational & Technical): Operational & Technical Metadata Collection Standards; metadata data model; metadata collection daemon
- Quality Auditing: Data Quality Control Methods; DQ module

Page 21


DIF: back-end modular architecture

DIF minimal installation: Metadata Repository, Scheduler, Wrapper module.

Available DIF modules, services, and reusable components: FileLoader module, Notification module, Purge module, Archiver module, Publisher module, Data Quality module, HP OV metrics collection services, PowerCenter metrics collection services, and reusable includes (logging routines, mail-sending routines, ...).

Around the framework sits the project-specific code that applies the business rules and requirements (PowerCenter, shell scripts, SQL scripts, stored procedures, ...).
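One way to picture this modularity is a registry of optional modules behind one common interface on top of the minimal core (repository, scheduler, wrapper). The Module protocol, the registry, and the step format below are illustrative assumptions, not the framework's actual API.

```python
# Sketch of the modular back end: optional modules share one interface and
# are resolved by name, so the wrapper stays generic and projects install
# only the modules they need.
from typing import Protocol

class Module(Protocol):
    name: str
    def run(self, context: dict) -> None: ...

class FileLoader:
    name = "file_loader"
    def run(self, context: dict) -> None:
        print(f"loading {context['file']}")

class Notifier:
    name = "notification"
    def run(self, context: dict) -> None:
        print(f"notifying {context['recipients']}")

REGISTRY: dict[str, Module] = {m.name: m for m in (FileLoader(), Notifier())}

def wrapper(step: dict) -> None:
    """Core wrapper: runs whichever module the step's metadata names."""
    REGISTRY[step["module"]].run(step["context"])

# The scheduler would feed steps like these, read from the repository:
wrapper({"module": "file_loader", "context": {"file": "/data/in.csv"}})
wrapper({"module": "notification", "context": {"recipients": "support@example.com"}})
```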

Page 22


Potential for Global Monitoring

[Diagram: several application environments (APP1, APP2, APP3), each running the DIF components (Autosys wrapper, file loader, CDC, rejects recycling, archiver, publisher, ...), feed a shared metadata repository. The repository also retrieves key metadata from infrastructure and middleware components: the PowerCenter scheduler, Unix servers, Oracle DBs, and HP OpenView. On top of the repository, a data access layer, reusable components, engines, and a web server expose reports, publications, and extracts to the project, support, and middleware teams and to external systems.]

Page 23


Reporting services (Cognos/BO reports), example 1. Using the reporting layer, we can access the integrated metadata repository for any kind of report or ad hoc query:

- monitoring reports
- capacity planning
- impact analysis

Example of a monitoring report with embedded navigation capabilities: a drill-down button, and a link to open the log file for more details.

Page 24


Reporting services (Cognos/BO reports), example 2. The added value of integrated metadata is that reports can show correlated metadata in a single view.

For example, this Gantt-style execution report shows whether there is a correlation between a given interface's execution and the server workload. This is very useful for understanding performance issues, but also for capacity planning purposes.

[Callouts: drill through to this interface's details report; drill down to the interface-step Gantt view for this interface.]
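The correlation this view makes visible can be sketched as a simple join between interface run intervals and server load samples; all the numbers below are invented for illustration.

```python
# Sketch: average server load observed during each interface run, the same
# correlation the Gantt view shows graphically.
runs = [                      # (interface, start, end), epoch seconds
    ("SALES_DAILY", 100, 160),
    ("STOCK_HOURLY", 140, 200),
]
load_samples = [              # (timestamp, load between 0 and 1)
    (100, 0.20), (120, 0.50), (140, 0.90),
    (160, 0.95), (180, 0.60), (200, 0.30),
]

for name, start, end in runs:
    during = [load for t, load in load_samples if start <= t <= end]
    print(f"{name}: average server load while running = {sum(during) / len(during):.2f}")
```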

Page 25


Reporting services (Cognos/BO reports), example 3. Another example of the details we can get from the reports.

Using the publication module, the metadata tells you which XML files were produced, how many rows were extracted from the database, and also to which downstream applications the package was pushed.
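A sketch of what such a publication record could look like; every field name here is an assumption made for illustration.

```python
# Sketch: the operational metadata a publication run might leave behind,
# enough to answer "what was produced and where did it go?".
import json

publication_log = {
    "interface": "SALES_DAILY",
    "files": [{"name": "sales_2012-06-01.xml", "rows_extracted": 12450}],
    "pushed_to": ["APP1", "APP2"],  # downstream applications
}

print(json.dumps(publication_log, indent=2))
for target in publication_log["pushed_to"]:
    print(f"package pushed to downstream application {target}")
```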

Page 26


Appendix

Page 27


Data Integration Application Architecture

[Diagram: data flows from the source system (flat files and DB tables) through a Staging Layer, an Integration Layer (DWH), and a Data Publication Layer to the Publication Layer. Each hop is implemented with ETL, the FileLoader, the Publisher, shell scripts, Oracle stored procedures, and SQL scripts, and is orchestrated by a scheduler as interfaces composed of data flows, steps, and tasks. Data Integration Framework services (modules and monitoring services: archiving, exception management, rejection recycling, auditing, notification, data movement workflow, data quality, publication, change data capture) cut across all layers and feed an Operational Metadata Repository holding metrics, metadata, exception logs, and configuration metadata. Reporting services (Cognos, Business Objects) expose the repository to Level 2 support, the DEV team (L3 support), and business users.]

Page 28


Reports & Dashboards

- 1st Level Support: an application/service-level view enables the Service Desk to rapidly intercept fatal alerts and communicate service outages to affected users.
- 2nd Level Support: identify the root cause of an issue and take effective action; data is available for analysis to anticipate issues and bottlenecks.
- 3rd Level Support: perform complex analysis, troubleshooting, and storage capacity planning, and improve efficiency (identify weak points and alarming trends).

Monitoring implementation: PowerCenter, the DB servers, the application and web servers, the scheduler, and the DIF modules and services all feed the metadata repository, which provides a service-level view of the data integration application.

Page 29

Loading by time zone

AGENDA

Dario Mangano, Head of Knowledge Management, Nestlé Nespresso S.A.

Cedric Zbinden, BI Architect, Nestlé Nespresso S.A.

Page 30

Loading by time zone

Question: how do we manage the consistency and integrity of the DWH data when it is loaded by geographic zone and by time zone?

Pages 31–39: Loading by time zone (image slides)

Page 40

Loading by time zone

Discussion: proposals?

Page 41

Loading by time zone

Summary of the discussions:
- HQ and the markets do not have the same needs in terms of data refresh; review whether HQ can be satisfied with D-2 data.
- Load into separate schemas, so that queries never hit the schema currently being loaded, then do a drop partition at the end (see the sketch below).
- Use the Cognos master cubes.
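As a hedged sketch of the second proposal, here is the kind of SQL an orchestration script might emit for one time-zone window: load a shadow schema while readers keep querying the live one, then make the switch a metadata-only operation. Schema, table, and partition names are invented, and EXCHANGE PARTITION is one Oracle-flavoured way to do the swap.

```python
# Sketch: shadow-schema loading per time-zone window, ending with an atomic
# partition exchange so queries never see a half-loaded schema.
LIVE, SHADOW = "dwh_live", "dwh_shadow"   # invented schema names

def swap_load_statements(table: str, region: str) -> list[str]:
    return [
        f"TRUNCATE TABLE {SHADOW}.{table}",
        f"INSERT INTO {SHADOW}.{table} "
        f"SELECT * FROM staging.{table} WHERE region = '{region}'",
        # The switch is a dictionary operation, not a data copy:
        f"ALTER TABLE {LIVE}.{table} "
        f"EXCHANGE PARTITION p_{region} WITH TABLE {SHADOW}.{table}",
    ]

for stmt in swap_load_statements("sales", "apac"):
    print(stmt + ";")
```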

Page 42

Future round tables

AGENDA

Page 43

Future Round Tables

- Topics
- Venues
- Communication

Page 44

Future Round Tables

Proposals:
- Big Data
- Demonstration of a QlikView POC (Groupe Mutuelle)
- Examples of governance that better frame demands from the businesses (business case, demand management committee, etc.)
- In-memory appliances
- Mobile BI
- Column-based DB / NoSQL
- Data virtualization
- BI SaaS

Page 45

Thank you! Coffee!