Python postgre sql a wonderful wedding

Preview:

DESCRIPTION

Python and PostgreSQL, A wonderful Wedding. But are you sure that you know everything ? Not sure. This presentation is for you.

Citation preview

Python & Postgresql A Wonderful Wedding

Stéphane Wirtel @PyConFR 2014 - Lyon

http://goo.gl/UsPLwh

Stéphane Wirtel

#fellow @ThePSF #member EPS

#member AFPy @PythonFOSDEM

@matrixise stephane@wirtel.be

http://wirtel.be

AgendaQu’allons-nous apprendre ?

Python

Python, c’est quoi ?

Langage de Programmation

Syntaxe simple & lisible

Interprété

Multi-plateformes

Typage Dynamique Fort

Multi-paradigmes

Garbage Collector

Postgresql

PostgreSQL - StoryDéveloppée à Berkeley

Réécriture de Ingres

1985 -> Postgres (vient de Post-Ingres)

1995 -> Postgres95 (0.01) (Ajout d’un interpréter SQL)

1997 -> PostgreSQL 6.0

2013 -> PostgreSQL 9.3

2014 -> PostgreSQL 9.4 (b4)

PostgreSQL - TL;DRORDBMS, ACID

SQL:2011

DataTypes

Transactional DDL

Concurrent Index

Extensions

Common Table Expression

MultiVersion Concurrency Control

Cross-Platform

PostgreSQL - TL;DRReplication

Foreign Data Wrappers

Procedural Languages

Triggers

Full text Search

Views (Materialized)

Table Inheritance

Listen/Notify

PostgreSQL, FDW ?Utiliser une source externe à PostgreSQL

Twitter, RSS, CSV, XML

FileSystem

Processes

PostgreSQL, PL ?Étend la base de données

Fonctions stockées

Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C)

PL/Python, PL/V8, PL/PERL

DB-API 2.0Commençons par le commencement…

http://python.org/peps/pep-0249

DB-API 2.0 aka PEP-249API pour les connecteurs de Base de Données

Facile à utiliser, et à comprendre

Utilise deux concepts:

• Connection

• Curseur

http://legacy.python.org/dev/peps/pep-0249/

ConnectionEmballe une connection vers la base de données

Gère les transactions et leur état (Commit/Rollback)

N’exécute pas de requêtes SQL.

Connection - API• connect(parameters=None)

• close()

• commit()

• rollback()

• cursor([name=None])

Connection - Exempleimport driver

conn = driver.connect(database='database', host='localhost', port=5432)try: # cree le curseur # utilise le curseurexcept Exception: conn.rollback()else: conn.commit()

conn.close()

CurseurCréé via une instance d’une connection

Utilisé pour la manipulation et interrogation de la Base de Données

cursor = conn.cursor()cursor.execute(""" SELECT column1, column2 FROM tableA""")for column1, column2 in cursor.fetchall(): print(column1, column2)

Curseur - API• callproc(procname[, parameters])

• execute(query[, parameters])

• fetchone(), fetchmany([size=cursor.arraysize]), fetchall()

• close()

Curseur - executeexecute(query[, parameters])

Performance et Sécurité

JAMAIS utiliser l’interpolation (%) et la concaténation (+)

=> SQL Injection

Curseur - execute (2)Accepte un formatage de la requête.

qmark Question mark WHERE field = ?

numeric Numeric positional WHERE field = :1

named Named WHERE field = :code

format ANSI C print format WHERE field = %s

pyformat Python format WHERE field = %(name)s

Exemple plus complet ;-)import driver

conn = driver.connect(database='database', host='localhost', port=5432)cursor = conn.cursor()try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2)

except Exception: conn.rollback()else: conn.commit()finally: cursor.close() conn.close()

psycopg2La Base…

pip install psycopg2

IntroductionPostgreSQL Adaptor, basé sur libpq

DB-API 2.0 Compliant

Multi-thread

Pool de Connections

Full Asynchronous, Coroutines

Supporte TOUS les types de données de PostgreSQL (Json, Hstore, …)

Python 2.5+, 3.1+, PostgreSQL 7.4+

Curseurscursor = conn.cursor()

cursor = conn.cursor(name=“pycon_cursor”)

• Curseur coté PostgreSQL

• Interrogation, manipulation des données de la base.

• Nommé => Utilisé pour les gros ensembles de données

Requêtes SQL

cursor.execute("SELECT * FROM table")cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)", (field1, field2))cursor.execute("DELETE FROM table")cursor.execute("UPDATE table SET field1=%s", ('value',))cursor.execute("CREATE DATABASE database")cursor.execute("DROP DATABASE database")

Exempleimport psycopg2

conn = psycopg2.connect(host='localhost', port=5432, user='username', password='password', database='database')cursor = conn.cursor()try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2)except Exception: conn.rollback()else: conn.commit()finally: cursor.close() conn.close()

Exemple - Context Managerimport psycopg2

DSN = dict(host='localhost', port=5432, user='username', password='password', database='database')

with psycopg2.connect(**DSN) as conn: with conn.cursor() as cursor: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2) conn.commit()

Object Relational Mapping

Introduction“Technique de Programmation créant l’illusion d’une base de données orientée objet à partir

d’une base de données relationnelle en définissant des correspondances entre cette base de données et les objets du langage utilisé.”

Wikipedia

En bref• Mapper une classe Business sur une table ! • Méthodes pour ajouter, modifier, supprimer et

rechercher

Peewee-ORMObject Relational Mapping

pip install peewee

PeeweeORM

Simple, facile à comprendre et extensible

Python 2.6+ et 3.2+

Extensions

3200 lignes de code (MySQL, Sqlite3 et PostgreSQL)

Très bonne documentation

Connection

import peewee

database = peewee.PostgresqlDatabase('demo')

Modèlesclass BaseModel(peewee.Model): id = peewee.PrimaryKeyField()

class Meta: database = database

class Address(BaseModel): street = peewee.CharField() zipcode = peewee.CharField() country = peewee.CharField()

class Contact(BaseModel): firstname = peewee.CharField(null=False) lastname = peewee.CharField(null=False)

email = peewee.CharField(unique=True, null=False, index=True) address = peewee.ForeignKeyField(Address, null=False, related_name=‘contacts')

Création des tables

database.create_tables([Address, Contact])

Peewee - Transactions

tx = database.transaction()tx.commit()tx.rollback()

with database.transaction(): pass

Createwith database.transaction(): address = Address.create( street='Rue de Lyon', zipcode='90001', country='France')

contact = Contact.create( firstname='Stephane', lastname='Wirtel', email='stephane@wirtel.be', address=address ) print(contact.id)

Read

contact = Contact.get(Contact.email == 'stephane@wirtel.be')print(contact.firstname)

contacts = Contact.select()for contact in contacts: print(contact.firstname)

for contact in contacts.where(Contact.email == 'stephane@wirtel.be'): print(contact.firstname)

Update

with database.transaction(): contact = Contact.get(Contact.email == 'stephane@wirtel.be') contact.firstname = 'Speaker' contact.save()

Delete

# Suppression pour une collectioncontacts = Contact.delete().where(Contact.email == 'stephane@wirtel.be')contacts.execute()

# Suppression d'un seul enregistrementcontact = Contact.get(Contact.email == 'stephane@wirtel.be')contact.delete_instance()

SQLAlchemyObject Relational Mapping

pip install sqlalchemy

SQLAlchemy• Gestionnaire de connections

• Abstraction des connecteurs SQL

• Langage d’Expression SQL

• ORM

Connection

from sqlalchemy import create_engine

engine = create_engine('postgresql:///demo', echo=True)

Modèlesfrom sqlalchemy import Column, Integer, String, ForeignKeyfrom sqlalchemy.ext.declarative import declarative_basefrom sqlalchemy.orm import relationship

Base = declarative_base()

class Address(Base): __tablename__ = 'address'

id = Column(Integer, primary_key=True) street = Column(String) zipcode = Column(String) country = Column(String, nullable=False)

class Contact(Base): __tablename__ = 'contact'

id = Column(Integer, primary_key=True) firstname = Column(String, nullable=False) lastname = Column(String, nullable=False) email = Column(String, nullable=False)

address_id = Column(Integer, ForeignKey(Address.id), nullable=False) address = relationship('Address')

Création des tables

Base.metadata.create_all(engine)

SessionsWraps les transactions

session = Session(engine)session.commit()

Createsession = Session(engine)

address = Address(street='SQLAlchemy Street', zipcode='5432', country='Belgique')

contact = Contact( firstname='Stephane', lastname='Wirtel', email='stephane@wirtel.be', address=address)

session.add(contact)session.commit()

Readcontact = session.query(Contact).filter_by(email='stephane@wirtel.be').first()print(contact.firstname)

contacts = session.query(Contact).all()for contact in contacts: print(contact.firstname)

contacts = session.query(Contact).filter_by(email='stephane@wirtel.be').all()for contact in contacts: print(contact.firstname)

Update

contact = session.query(Contact) \ .filter_by(email='stephane@wirtel.be').first()contact.email = 'sw@mgx.io'session.add(contact)session.commit()

Delete

contact = session.query(Contact) \ .filter_by(email='stephane@wirtel.be').first()session.delete(contact)session.commit()

AlembicLa Migration Facile

pip install alembic

AlembicBasé sur SQLAlchemy

Utilise un environnement de Migration

Fichier de configuration INI

Scripts de migration

http://alembic.readthedocs.org/

Alembiccd votreproject

alembic init migrations

> treemigrations/!"" README!"" alembic.ini!"" env.py!"" script.py.mako#"" versions

Alembicrevision = '22630db6f519'down_revision = None

from alembic import opimport sqlalchemy as sa

def upgrade(): op.create_table('user', sa.Column('id', sa.Integer(), nullable=False), sa.Column('created_at', sa.DateTime(), nullable=False), sa.Column('name', sa.String(length=255), nullable=False), sa.Column('email', sa.String(length=255), nullable=True), sa.Column('password', sa.String(length=255), nullable=True), sa.Column('active', sa.Boolean(), nullable=True), sa.Column('confirmed_at', sa.DateTime(), nullable=True), sa.PrimaryKeyConstraint('id'), sa.UniqueConstraint('email') )

def downgrade(): op.drop_table('user')

alembic revision -m “bootstrap"

alembic upgrade head

Alembic

revision = '7e4b6a43e6c'down_revision = '22630db6f519'

from alembic import opimport sqlalchemy as sa

def upgrade(): op.add_column('user', sa.Column('slug', sa.Unicode(length=255), nullable=False, server_default=''))

def downgrade(): op.drop_column('user', 'slug')

alembic revision -m “add_slug”

alembic upgrade head

Alembic

> treemigrations/!"" README!"" alembic.ini!"" env.py!"" script.py.mako#"" versions !"" 22630db6f519_bootstrap.py #"" 7e4b6a43e6c_add_slug.py

Alembicfrom alembic import opfrom sqlalchemy.orm import Sessionfrom youproject import models

def upgrade(): engine = op.get_bind() session = Session(bind=engine) for contact in session.query(models.Contact).all(): session.add( models.WifiUser( login=contact.name.lower(), password=random_string() ) ) session.commit()

alembic upgrade head

Alembicalembic revisionalembic upgrade headalembic upgrade +2alembic downgrade -1alembic downgrade basealembic currentalembic history

Zone Dangereuse !!!

MulticornOù comment se connecter au monde !

http://multicorn.org

Rappel: Foreign Data Wrappers

Utiliser une source externe à PostgreSQL

Twitter, RSS, CSV, XML

FileSystem

Processes

PostgreSQL, Oracle, MongoDB

/* decla

rations

for dyna

mic load

ing */

PG_MODUL

E_MAGIC;

PG_FUNCT

ION_INFO

_V1(mong

o_fdw_ha

ndler);

PG_FUNCT

ION_INFO

_V1(mong

o_fdw_va

lidator)

;

/* * m

ongo_fdw

_handler

creates

and ret

urns a s

truct wi

th point

ers to f

oreign t

able

* callb

ack func

tions.

*/

Datum

mongo_fd

w_handle

r(PG_FUN

CTION_AR

GS)

{

FdwRouti

ne *fdwR

outine =

makeNod

e(FdwRou

tine);

fdwR

outine->

GetForei

gnRelSiz

e = Mong

oGetFore

ignRelSi

ze;

fdwR

outine->

GetForei

gnPaths

= MongoG

etForeig

nPaths;

fdwR

outine->

GetForei

gnPlan =

MongoGe

tForeign

Plan;

fdwR

outine->

ExplainF

oreignSc

an = Mon

goExplai

nForeign

Scan;

fdwR

outine->

BeginFor

eignScan

= Mongo

BeginFor

eignScan

;

fdwR

outine->

IterateF

oreignSc

an = Mon

goIterat

eForeign

Scan;

fdwR

outine->

ReScanFo

reignSca

n = Mong

oReScanF

oreignSc

an;

fdwR

outine->

EndForei

gnScan =

MongoEn

dForeign

Scan;

fdwR

outine->

AnalyzeF

oreignTa

ble = Mo

ngoAnaly

zeForeig

nTable;

PG_R

ETURN_PO

INTER(fd

wRoutine

);

}

Multicorn

• Extension PostgreSQL

• Permet d’écrire un Foreign Data Wrapper

• Wrapper Python de l’API C de PostgreSQL

• Tous les outils utilisant SQL peuvent utiliser un FDW

• Support FULL SQL ;-)

FDW de base• RSS, CSV, XML

• LDAP

• Gmail, IMAP

• Google Search

• SQLAlchemy (mysql, sqlite, oracle, …)

Simple exemple

CREATE EXTENSION multicorn;

Simple exemple (2)CREATE SERVER wirtel_be_srv FOREIGN DATA WRAPPER multicorn OPTIONS ( wrapper 'multicorn.rssfdw.RssFdw');

CREATE FOREIGN TABLE wirtel_be_rss ( "pubDate" TIMESTAMP, description CHARACTER VARYING, title CHARACTER VARYING, link CHARACTER VARYING) SERVER wirtel_be_srv OPTIONS( url 'http://wirtel.be/feeds/python.rss.xml');

Simple exemple (3)

SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1;

-[ RECORD 1 ]------------------------------------------------------------------------ pubDate | 2014-10-19 00:00:00 title | Python @ FOSDEM 2015 - Call For Proposals link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/

avec OpenERPfrom multicorn import ForeignDataWrapperimport erppeek

class OpenERPForeignDataWrapper(ForeignDataWrapper): def __init__(self, options, columns): super(OpenERPForeignDataWrapper, self).__init__(options, columns) url = 'http://{hostname}:{password}'.format(**options)

self.client = erppeek.Client(url, options['database'], options['username'], options['password'])

self.object_name = options['object']

def execute(self, quals, columns): proxy = self.client.model(self.object_name)

item = {} for record in proxy.browse([]): for column in columns: item[column] = record[column] yield item

avec OpenERP (2)CREATE EXTENSION multicorn;

CREATE SERVER multicorn_openerp FOREIGN DATA WRAPPER multicorn OPTIONS (wrapper 'multicorn.openerpfdw.OpenERPForeignDataWrapper'); CREATE FOREIGN TABLE oe_users ( login character varying, name character varying) SERVER multicorn_openerp OPTIONS ( hostname 'localhost', port '8069', database 'openerp', username 'admin', password 'admin', object 'res.users' ); SELECT id, login, name, active FROM oe_users WHERE login = 'admin';

PL/PythonEtendre PostgreSQL

PL/PythonPython 2 ou Python 3

Utilise toutes les librairies Python (PyPI)

Apprentissage plus rapide que PL/pgSQL

PL/Pythonapt-get install postgresql-plpython-9.3

CREATE EXTENSION IF NOT EXISTS plpythonu;

PL/PythonCREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$ return s.title()$$ LANGUAGE plpythonu;

demo=# select str_title('hello world');-[ RECORD 1 ]----------str_title | Hello World

PL/Python - DataTypes

PostgreSQL Python

integer, bigint int, long

boolean bool

text types str

SQL Array list

Custom Types dict

PL/Python - DebugFonctions de base pour afficher les messages

• plpy.notice(“<msg>”)

• plpy.debug(“<msg>”)

• plpy.error(“<msg>”)

• plpy.fatal(“<msg>”)

Oublier le ”print”

PL/PythonCREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT) RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$ import psutil

process = psutil.Process(pid) return [ { 'pid': pid, 'cpu_perc': process.get_cpu_percent(interval=0), 'mem_perc': process.get_memory_percent() } ]

$$ LANGUAGE plpythonu;

PL/Python

-[ RECORD 1 ]----+---------------------------------datid | 16416datname | demopid | 14680usesysid | 16384usename | stephaneapplication_name | psqlclient_addr |client_hostname |client_port | -1backend_start | 2014-10-25 04:22:08.235532+02xact_start | 2014-10-25 04:25:28.712812+02query_start | 2014-10-25 04:25:28.712812+02state_change | 2014-10-25 04:25:28.712815+02waiting | fstate | activequery | select * from pg_stat_activity ;

SELECT * FROM pg_stat_activity;

PL/Python

pid | cpu_perc | mem_perc-------+----------+-------------- 14680 | 0 | 1.4454081372(1 row)

SELECT * FROM get_pid_cpu_mem(14680);

PL/Python

pid | cpu_perc | mem_perc | application_name-------+----------+---------------+------------------ 14680 | 0 | 1.50435626678 | psql(1 row)

WITH stats AS ( SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs from pg_stat_activity psa) SELECT (stats.attrs).pid, (stats.attrs).cpu_perc, (stats.attrs).mem_perc, stats.application_nameFROM stats;

PL/Python - AttentionUnsafe -> pas de sandbox

Difficile à maintenir et à debugger

Comment rendre fou votre DBA ;-)

Pas de virtualenv et demande les privilèges superuser.

PL/Python - Pistes ;-)Traitement sur grosses masses de données

Ajouter des contraintes fortes et logique dans le système.

Ajouter de nouvelles fonctionnalités dans PostgreSQL “str_title”

Triggers ?

Utilisation des libs de PyPI (requests, redis, zmq, smtp…)

Accès complet à la base de données !!!

Questions ?@matrixise

https://speakerdeck.com/matrixise

Recommended