View
499
Download
10
Category
Preview:
DESCRIPTION
Python and PostgreSQL, A wonderful Wedding. But are you sure that you know everything ? Not sure. This presentation is for you.
Citation preview
Python & Postgresql A Wonderful Wedding
Stéphane Wirtel @PyConFR 2014 - Lyon
http://goo.gl/UsPLwh
Stéphane Wirtel
#fellow @ThePSF #member EPS
#member AFPy @PythonFOSDEM
@matrixise stephane@wirtel.be
http://wirtel.be
AgendaQu’allons-nous apprendre ?
Python
Python, c’est quoi ?
Langage de Programmation
Syntaxe simple & lisible
Interprété
Multi-plateformes
Typage Dynamique Fort
Multi-paradigmes
Garbage Collector
Postgresql
PostgreSQL - StoryDéveloppée à Berkeley
Réécriture de Ingres
1985 -> Postgres (vient de Post-Ingres)
1995 -> Postgres95 (0.01) (Ajout d’un interpréter SQL)
1997 -> PostgreSQL 6.0
2013 -> PostgreSQL 9.3
2014 -> PostgreSQL 9.4 (b4)
PostgreSQL - TL;DRORDBMS, ACID
SQL:2011
DataTypes
Transactional DDL
Concurrent Index
Extensions
Common Table Expression
MultiVersion Concurrency Control
Cross-Platform
PostgreSQL - TL;DRReplication
Foreign Data Wrappers
Procedural Languages
Triggers
Full text Search
Views (Materialized)
Table Inheritance
Listen/Notify
PostgreSQL, FDW ?Utiliser une source externe à PostgreSQL
Twitter, RSS, CSV, XML
FileSystem
Processes
PostgreSQL, PL ?Étend la base de données
Fonctions stockées
Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C)
PL/Python, PL/V8, PL/PERL
DB-API 2.0Commençons par le commencement…
http://python.org/peps/pep-0249
DB-API 2.0 aka PEP-249API pour les connecteurs de Base de Données
Facile à utiliser, et à comprendre
Utilise deux concepts:
• Connection
• Curseur
http://legacy.python.org/dev/peps/pep-0249/
ConnectionEmballe une connection vers la base de données
Gère les transactions et leur état (Commit/Rollback)
N’exécute pas de requêtes SQL.
Connection - API• connect(parameters=None)
• close()
• commit()
• rollback()
• cursor([name=None])
Connection - Exempleimport driver
conn = driver.connect(database='database', host='localhost', port=5432)try: # cree le curseur # utilise le curseurexcept Exception: conn.rollback()else: conn.commit()
conn.close()
CurseurCréé via une instance d’une connection
Utilisé pour la manipulation et interrogation de la Base de Données
cursor = conn.cursor()cursor.execute(""" SELECT column1, column2 FROM tableA""")for column1, column2 in cursor.fetchall(): print(column1, column2)
Curseur - API• callproc(procname[, parameters])
• execute(query[, parameters])
• fetchone(), fetchmany([size=cursor.arraysize]), fetchall()
• close()
Curseur - executeexecute(query[, parameters])
Performance et Sécurité
JAMAIS utiliser l’interpolation (%) et la concaténation (+)
=> SQL Injection
Curseur - execute (2)Accepte un formatage de la requête.
qmark Question mark WHERE field = ?
numeric Numeric positional WHERE field = :1
named Named WHERE field = :code
format ANSI C print format WHERE field = %s
pyformat Python format WHERE field = %(name)s
Exemple plus complet ;-)import driver
conn = driver.connect(database='database', host='localhost', port=5432)cursor = conn.cursor()try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2)
except Exception: conn.rollback()else: conn.commit()finally: cursor.close() conn.close()
psycopg2La Base…
pip install psycopg2
IntroductionPostgreSQL Adaptor, basé sur libpq
DB-API 2.0 Compliant
Multi-thread
Pool de Connections
Full Asynchronous, Coroutines
Supporte TOUS les types de données de PostgreSQL (Json, Hstore, …)
Python 2.5+, 3.1+, PostgreSQL 7.4+
Curseurscursor = conn.cursor()
cursor = conn.cursor(name=“pycon_cursor”)
• Curseur coté PostgreSQL
• Interrogation, manipulation des données de la base.
• Nommé => Utilisé pour les gros ensembles de données
Requêtes SQL
cursor.execute("SELECT * FROM table")cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)", (field1, field2))cursor.execute("DELETE FROM table")cursor.execute("UPDATE table SET field1=%s", ('value',))cursor.execute("CREATE DATABASE database")cursor.execute("DROP DATABASE database")
Exempleimport psycopg2
conn = psycopg2.connect(host='localhost', port=5432, user='username', password='password', database='database')cursor = conn.cursor()try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2)except Exception: conn.rollback()else: conn.commit()finally: cursor.close() conn.close()
Exemple - Context Managerimport psycopg2
DSN = dict(host='localhost', port=5432, user='username', password='password', database='database')
with psycopg2.connect(**DSN) as conn: with conn.cursor() as cursor: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2) conn.commit()
Object Relational Mapping
Introduction“Technique de Programmation créant l’illusion d’une base de données orientée objet à partir
d’une base de données relationnelle en définissant des correspondances entre cette base de données et les objets du langage utilisé.”
Wikipedia
En bref• Mapper une classe Business sur une table ! • Méthodes pour ajouter, modifier, supprimer et
rechercher
Peewee-ORMObject Relational Mapping
pip install peewee
PeeweeORM
Simple, facile à comprendre et extensible
Python 2.6+ et 3.2+
Extensions
3200 lignes de code (MySQL, Sqlite3 et PostgreSQL)
Très bonne documentation
Connection
import peewee
database = peewee.PostgresqlDatabase('demo')
Modèlesclass BaseModel(peewee.Model): id = peewee.PrimaryKeyField()
class Meta: database = database
class Address(BaseModel): street = peewee.CharField() zipcode = peewee.CharField() country = peewee.CharField()
class Contact(BaseModel): firstname = peewee.CharField(null=False) lastname = peewee.CharField(null=False)
email = peewee.CharField(unique=True, null=False, index=True) address = peewee.ForeignKeyField(Address, null=False, related_name=‘contacts')
Création des tables
database.create_tables([Address, Contact])
Peewee - Transactions
tx = database.transaction()tx.commit()tx.rollback()
with database.transaction(): pass
Createwith database.transaction(): address = Address.create( street='Rue de Lyon', zipcode='90001', country='France')
contact = Contact.create( firstname='Stephane', lastname='Wirtel', email='stephane@wirtel.be', address=address ) print(contact.id)
Read
contact = Contact.get(Contact.email == 'stephane@wirtel.be')print(contact.firstname)
contacts = Contact.select()for contact in contacts: print(contact.firstname)
for contact in contacts.where(Contact.email == 'stephane@wirtel.be'): print(contact.firstname)
Update
with database.transaction(): contact = Contact.get(Contact.email == 'stephane@wirtel.be') contact.firstname = 'Speaker' contact.save()
Delete
# Suppression pour une collectioncontacts = Contact.delete().where(Contact.email == 'stephane@wirtel.be')contacts.execute()
# Suppression d'un seul enregistrementcontact = Contact.get(Contact.email == 'stephane@wirtel.be')contact.delete_instance()
SQLAlchemyObject Relational Mapping
pip install sqlalchemy
SQLAlchemy• Gestionnaire de connections
• Abstraction des connecteurs SQL
• Langage d’Expression SQL
• ORM
Connection
from sqlalchemy import create_engine
engine = create_engine('postgresql:///demo', echo=True)
Modèlesfrom sqlalchemy import Column, Integer, String, ForeignKeyfrom sqlalchemy.ext.declarative import declarative_basefrom sqlalchemy.orm import relationship
Base = declarative_base()
class Address(Base): __tablename__ = 'address'
id = Column(Integer, primary_key=True) street = Column(String) zipcode = Column(String) country = Column(String, nullable=False)
class Contact(Base): __tablename__ = 'contact'
id = Column(Integer, primary_key=True) firstname = Column(String, nullable=False) lastname = Column(String, nullable=False) email = Column(String, nullable=False)
address_id = Column(Integer, ForeignKey(Address.id), nullable=False) address = relationship('Address')
Création des tables
Base.metadata.create_all(engine)
SessionsWraps les transactions
session = Session(engine)session.commit()
Createsession = Session(engine)
address = Address(street='SQLAlchemy Street', zipcode='5432', country='Belgique')
contact = Contact( firstname='Stephane', lastname='Wirtel', email='stephane@wirtel.be', address=address)
session.add(contact)session.commit()
Readcontact = session.query(Contact).filter_by(email='stephane@wirtel.be').first()print(contact.firstname)
contacts = session.query(Contact).all()for contact in contacts: print(contact.firstname)
contacts = session.query(Contact).filter_by(email='stephane@wirtel.be').all()for contact in contacts: print(contact.firstname)
Update
contact = session.query(Contact) \ .filter_by(email='stephane@wirtel.be').first()contact.email = 'sw@mgx.io'session.add(contact)session.commit()
Delete
contact = session.query(Contact) \ .filter_by(email='stephane@wirtel.be').first()session.delete(contact)session.commit()
AlembicLa Migration Facile
pip install alembic
AlembicBasé sur SQLAlchemy
Utilise un environnement de Migration
Fichier de configuration INI
Scripts de migration
http://alembic.readthedocs.org/
Alembiccd votreproject
alembic init migrations
> treemigrations/!"" README!"" alembic.ini!"" env.py!"" script.py.mako#"" versions
Alembicrevision = '22630db6f519'down_revision = None
from alembic import opimport sqlalchemy as sa
def upgrade(): op.create_table('user', sa.Column('id', sa.Integer(), nullable=False), sa.Column('created_at', sa.DateTime(), nullable=False), sa.Column('name', sa.String(length=255), nullable=False), sa.Column('email', sa.String(length=255), nullable=True), sa.Column('password', sa.String(length=255), nullable=True), sa.Column('active', sa.Boolean(), nullable=True), sa.Column('confirmed_at', sa.DateTime(), nullable=True), sa.PrimaryKeyConstraint('id'), sa.UniqueConstraint('email') )
def downgrade(): op.drop_table('user')
alembic revision -m “bootstrap"
alembic upgrade head
Alembic
revision = '7e4b6a43e6c'down_revision = '22630db6f519'
from alembic import opimport sqlalchemy as sa
def upgrade(): op.add_column('user', sa.Column('slug', sa.Unicode(length=255), nullable=False, server_default=''))
def downgrade(): op.drop_column('user', 'slug')
alembic revision -m “add_slug”
alembic upgrade head
Alembic
> treemigrations/!"" README!"" alembic.ini!"" env.py!"" script.py.mako#"" versions !"" 22630db6f519_bootstrap.py #"" 7e4b6a43e6c_add_slug.py
Alembicfrom alembic import opfrom sqlalchemy.orm import Sessionfrom youproject import models
def upgrade(): engine = op.get_bind() session = Session(bind=engine) for contact in session.query(models.Contact).all(): session.add( models.WifiUser( login=contact.name.lower(), password=random_string() ) ) session.commit()
alembic upgrade head
Alembicalembic revisionalembic upgrade headalembic upgrade +2alembic downgrade -1alembic downgrade basealembic currentalembic history
Zone Dangereuse !!!
MulticornOù comment se connecter au monde !
http://multicorn.org
Rappel: Foreign Data Wrappers
Utiliser une source externe à PostgreSQL
Twitter, RSS, CSV, XML
FileSystem
Processes
PostgreSQL, Oracle, MongoDB
/* decla
rations
for dyna
mic load
ing */
PG_MODUL
E_MAGIC;
PG_FUNCT
ION_INFO
_V1(mong
o_fdw_ha
ndler);
PG_FUNCT
ION_INFO
_V1(mong
o_fdw_va
lidator)
;
/* * m
ongo_fdw
_handler
creates
and ret
urns a s
truct wi
th point
ers to f
oreign t
able
* callb
ack func
tions.
*/
Datum
mongo_fd
w_handle
r(PG_FUN
CTION_AR
GS)
{
FdwRouti
ne *fdwR
outine =
makeNod
e(FdwRou
tine);
fdwR
outine->
GetForei
gnRelSiz
e = Mong
oGetFore
ignRelSi
ze;
fdwR
outine->
GetForei
gnPaths
= MongoG
etForeig
nPaths;
fdwR
outine->
GetForei
gnPlan =
MongoGe
tForeign
Plan;
fdwR
outine->
ExplainF
oreignSc
an = Mon
goExplai
nForeign
Scan;
fdwR
outine->
BeginFor
eignScan
= Mongo
BeginFor
eignScan
;
fdwR
outine->
IterateF
oreignSc
an = Mon
goIterat
eForeign
Scan;
fdwR
outine->
ReScanFo
reignSca
n = Mong
oReScanF
oreignSc
an;
fdwR
outine->
EndForei
gnScan =
MongoEn
dForeign
Scan;
fdwR
outine->
AnalyzeF
oreignTa
ble = Mo
ngoAnaly
zeForeig
nTable;
PG_R
ETURN_PO
INTER(fd
wRoutine
);
}
Multicorn
• Extension PostgreSQL
• Permet d’écrire un Foreign Data Wrapper
• Wrapper Python de l’API C de PostgreSQL
• Tous les outils utilisant SQL peuvent utiliser un FDW
• Support FULL SQL ;-)
FDW de base• RSS, CSV, XML
• LDAP
• Gmail, IMAP
• Google Search
• SQLAlchemy (mysql, sqlite, oracle, …)
Simple exemple
CREATE EXTENSION multicorn;
Simple exemple (2)CREATE SERVER wirtel_be_srv FOREIGN DATA WRAPPER multicorn OPTIONS ( wrapper 'multicorn.rssfdw.RssFdw');
CREATE FOREIGN TABLE wirtel_be_rss ( "pubDate" TIMESTAMP, description CHARACTER VARYING, title CHARACTER VARYING, link CHARACTER VARYING) SERVER wirtel_be_srv OPTIONS( url 'http://wirtel.be/feeds/python.rss.xml');
Simple exemple (3)
SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1;
-[ RECORD 1 ]------------------------------------------------------------------------ pubDate | 2014-10-19 00:00:00 title | Python @ FOSDEM 2015 - Call For Proposals link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/
avec OpenERPfrom multicorn import ForeignDataWrapperimport erppeek
class OpenERPForeignDataWrapper(ForeignDataWrapper): def __init__(self, options, columns): super(OpenERPForeignDataWrapper, self).__init__(options, columns) url = 'http://{hostname}:{password}'.format(**options)
self.client = erppeek.Client(url, options['database'], options['username'], options['password'])
self.object_name = options['object']
def execute(self, quals, columns): proxy = self.client.model(self.object_name)
item = {} for record in proxy.browse([]): for column in columns: item[column] = record[column] yield item
avec OpenERP (2)CREATE EXTENSION multicorn;
CREATE SERVER multicorn_openerp FOREIGN DATA WRAPPER multicorn OPTIONS (wrapper 'multicorn.openerpfdw.OpenERPForeignDataWrapper'); CREATE FOREIGN TABLE oe_users ( login character varying, name character varying) SERVER multicorn_openerp OPTIONS ( hostname 'localhost', port '8069', database 'openerp', username 'admin', password 'admin', object 'res.users' ); SELECT id, login, name, active FROM oe_users WHERE login = 'admin';
PL/PythonEtendre PostgreSQL
PL/PythonPython 2 ou Python 3
Utilise toutes les librairies Python (PyPI)
Apprentissage plus rapide que PL/pgSQL
PL/Pythonapt-get install postgresql-plpython-9.3
CREATE EXTENSION IF NOT EXISTS plpythonu;
PL/PythonCREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$ return s.title()$$ LANGUAGE plpythonu;
demo=# select str_title('hello world');-[ RECORD 1 ]----------str_title | Hello World
PL/Python - DataTypes
PostgreSQL Python
integer, bigint int, long
boolean bool
text types str
SQL Array list
Custom Types dict
PL/Python - DebugFonctions de base pour afficher les messages
• plpy.notice(“<msg>”)
• plpy.debug(“<msg>”)
• plpy.error(“<msg>”)
• plpy.fatal(“<msg>”)
Oublier le ”print”
PL/PythonCREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT) RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$ import psutil
process = psutil.Process(pid) return [ { 'pid': pid, 'cpu_perc': process.get_cpu_percent(interval=0), 'mem_perc': process.get_memory_percent() } ]
$$ LANGUAGE plpythonu;
PL/Python
-[ RECORD 1 ]----+---------------------------------datid | 16416datname | demopid | 14680usesysid | 16384usename | stephaneapplication_name | psqlclient_addr |client_hostname |client_port | -1backend_start | 2014-10-25 04:22:08.235532+02xact_start | 2014-10-25 04:25:28.712812+02query_start | 2014-10-25 04:25:28.712812+02state_change | 2014-10-25 04:25:28.712815+02waiting | fstate | activequery | select * from pg_stat_activity ;
SELECT * FROM pg_stat_activity;
PL/Python
pid | cpu_perc | mem_perc-------+----------+-------------- 14680 | 0 | 1.4454081372(1 row)
SELECT * FROM get_pid_cpu_mem(14680);
PL/Python
pid | cpu_perc | mem_perc | application_name-------+----------+---------------+------------------ 14680 | 0 | 1.50435626678 | psql(1 row)
WITH stats AS ( SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs from pg_stat_activity psa) SELECT (stats.attrs).pid, (stats.attrs).cpu_perc, (stats.attrs).mem_perc, stats.application_nameFROM stats;
PL/Python - AttentionUnsafe -> pas de sandbox
Difficile à maintenir et à debugger
Comment rendre fou votre DBA ;-)
Pas de virtualenv et demande les privilèges superuser.
PL/Python - Pistes ;-)Traitement sur grosses masses de données
Ajouter des contraintes fortes et logique dans le système.
Ajouter de nouvelles fonctionnalités dans PostgreSQL “str_title”
Triggers ?
Utilisation des libs de PyPI (requests, redis, zmq, smtp…)
Accès complet à la base de données !!!
Recommended