NoSQL: the end of relational?
Michael Bailly
RMLL 2011
Reason #1
RDBMS are hard to scale
But also...
Denormalization
Caching
Indexing engines (Solr, Sphinx)
Queues (Gearman, ActiveMQ)
Reason #2
Schemas smell
Reason #3
Ephemeral data
In summary
RDBMS
Small records with well-defined, normalized relationships
Queries that may evolve
Long-lived data (low update frequency)
No need for exceptional read performance
Data integrity > performance | scalability
NoSQL
Very large data set
Data that can be modeled as trees accessed through the root node
Highly dynamic data structure
Complexity of object-relational mapping -> loss of productivity
Trade-off #1
ACID vs. BASE
ACID = Atomicity, Consistency, Isolation, Durability
BASE = Basically Available, Soft state, Eventual consistency
Trade-off #2
CAP
Consistency, Availability, Partition tolerance
Pick two!
Eric A. Brewer (2000)
Key-value stores
http://www.flickr.com/photos/nshepard/2309015
Key-value stores
// creating a user
INCR global:next_user_id => 1234
SET uid:1234:username jdoe
SET uid:1234:password p4s5w0rD
SET username:jdoe:uid 1234

// authenticating the user
GET username:jdoe:uid => 1234
GET uid:1234:password => p4s5w0rD
SET uid:1234:auth fea5e81ac8ca77622bed1c2132a021f9
SET auth:fea5e81ac8ca77622bed1c2132a021f9 1234

// fetching the user's docs (the doc ids live in a LIST, hence LRANGE rather than GET)
LRANGE uid:1234:docs 0 -1 => [7890, 4567, 2345]
GET doc:7890 => "<owner_id>|<ts>|Awesome !"

// adding a doc
INCR global:next_doc_id => 7891
SET doc:7891 "<owner_id>|<ts>|Enorme !"
LPUSH uid:1234:docs 7891
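The key-naming pattern above can be tried without a Redis server. The following is a minimal sketch in Python, assuming a hypothetical in-memory class `TinyKV` that imitates only the few commands used on the slide (INCR, SET, GET, LPUSH, LRANGE). It is not a Redis client, and the helper names `create_user` and `add_doc` are made up for illustration.

```python
class TinyKV:
    """Hypothetical in-memory stand-in for a few Redis commands."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        # Missing keys start at 0, as with Redis INCR.
        self.data[key] = int(self.data.get(key, 0)) + 1
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def lpush(self, key, value):
        # Push at the head of the list, creating it if needed.
        self.data.setdefault(key, []).insert(0, value)

    def lrange(self, key, start, stop):
        items = self.data.get(key, [])
        # The stop index is inclusive; only -1 ("to the end") is handled here.
        end = len(items) if stop == -1 else stop + 1
        return items[start:end]


def create_user(kv, username, password):
    uid = kv.incr("global:next_user_id")
    kv.set("uid:%d:username" % uid, username)
    kv.set("uid:%d:password" % uid, password)
    kv.set("username:%s:uid" % username, uid)  # reverse lookup by name
    return uid


def add_doc(kv, uid, body):
    doc_id = kv.incr("global:next_doc_id")
    kv.set("doc:%d" % doc_id, body)
    kv.lpush("uid:%d:docs" % uid, doc_id)
    return doc_id


kv = TinyKV()
uid = create_user(kv, "jdoe", "p4s5w0rD")
add_doc(kv, uid, "Awesome !")
add_doc(kv, uid, "Enorme !")
docs = kv.lrange("uid:%d:docs" % uid, 0, -1)  # most recent doc first
```

Every lookup goes through a key built from a known id or name, which is why the reverse mapping `username:<name>:uid` has to be written explicitly.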
Redis
Written in C, BSD license
In-RAM storage
Persistence via asynchronous snapshotting or AOF
STRING, LIST, SET, ZSET
Not distributed
No fault tolerance
But very fast!
Redis: use cases
"Memcached on steroids"
Tag clouds
Statistics, logs
Sessions
Indexing engines
Job queues (resque, Celery)
Redis: example
Tags: using a SET
// tagging doc 1234
SADD article:1234 ruby
SADD article:1234 python
SADD article:1234 php

// tagging doc 6789
SADD article:6789 django
SADD article:6789 python

// fetching the common tags
SINTER article:1234 article:6789 => python
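The SADD / SINTER example maps directly onto set operations. Here is a small sketch using plain Python sets, with hypothetical helpers `sadd` and `sinter` that mimic the two commands in memory (no Redis server involved):

```python
tags = {}  # key -> set of members


def sadd(key, member):
    # Add a member to the set stored at key, creating the set if needed.
    tags.setdefault(key, set()).add(member)


def sinter(*keys):
    # Intersection of all the named sets; a missing key counts as an empty set.
    sets = [tags.get(key, set()) for key in keys]
    return set.intersection(*sets) if sets else set()


for tag in ("ruby", "python", "php"):
    sadd("article:1234", tag)
for tag in ("django", "python"):
    sadd("article:6789", tag)

common = sinter("article:1234", "article:6789")
```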
Redis: libraries
PHP: Predis, Rediska, PHP-Redis
Python: redis-py, txredis
Ruby: redis-rb, em-ruby, ohm
Ohm
class Event < Ohm::Model
  attribute :name
  reference :venue, Venue
  set :participants, Person
  counter :votes

  index :name

  def validate
    assert_present :name
  end
end

class Venue < Ohm::Model
  attribute :name
  collection :events, Event
end

class Person < Ohm::Model
  attribute :name
end
Rediska
$options = array(
    'namespace' => 'Application_',
    'servers'   => array(
        array('host' => '127.0.0.1', 'port' => 6379)
    )
);
$writer = new Rediska_Zend_Log_Writer_Redis( 'keyName' , $options);
$log = new Zend_Log($writer);
Project Voldemort
Written in Java, Apache 2.0 license
Distributed!
Automatic data replication across multiple servers
Automatic data partitioning
Fault tolerant
Data versioning
Pluggable storage backends (BDB, MySQL)
Pluggable serialization (Thrift, Protocol Buffers, Java)
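A common technique behind this kind of automatic partitioning and replication is consistent hashing. The sketch below is generic and not taken from Voldemort's code; the class `HashRing`, the node names and the `vnodes` parameter are assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Illustrative consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed at `vnodes` pseudo-random ring positions.
        self.ring = sorted(
            (self._hash("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def preference_list(self, key, replicas=2):
        """First `replicas` distinct nodes found walking clockwise from the key."""
        start = bisect.bisect(self.hashes, self._hash(key)) % len(self.ring)
        owners = []
        for offset in range(len(self.ring)):
            node = self.ring[(start + offset) % len(self.ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == replicas:
                break
        return owners


ring = HashRing(["node-a", "node-b", "node-c"])
owners = ring.preference_list("uid:1234:username", replicas=2)
```

With this layout, removing a node only reassigns the keys that node owned; keys whose primary owner is still present stay where they were.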
Column-oriented stores
http://www.flickr.com/photos/stuckincustoms/536710395
Cassandra
Written in Java, Apache 2.0 license
Distributed!
Automatic data replication across multiple servers (and even datacenters)
Automatic data partitioning
Fault tolerant, decentralized (no SPOF)
Availability tunable via the ConsistencyLevel
Excellent write performance (in-RAM storage + commit log, regular flush to disk)
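One way to reason about a tunable consistency level is the quorum rule: with N replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N. The sketch below only does this arithmetic; the function names are hypothetical, and the mapping of ONE, QUORUM and ALL to replica counts is a simplification of how such levels behave.

```python
def quorum(n):
    # Smallest majority of n replicas.
    return n // 2 + 1


def reads_see_latest_write(n, w, r):
    # Read and write replica sets must share at least one replica.
    return r + w > n


n = 3  # replication factor, chosen for the example
levels = {"ONE": 1, "QUORUM": quorum(n), "ALL": n}

# Every (write level, read level) combination and whether the sets overlap.
matrix = {
    (w_name, r_name): reads_see_latest_write(n, w, r)
    for w_name, w in levels.items()
    for r_name, r in levels.items()
}
```

Lower levels favor availability and latency; combinations that satisfy R + W > N favor consistency.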
Data model
Column:
{
  name: "emailAddress",
  value: "[email protected]",
  timestamp: 123456789
}
SuperColumn:
{
  name: "physicalAddress",
  value: {
    street: { name: "street", value: "xxx", ts: 123 },
    city: { name: "city", value: "Paris", ts: 123 },
    zip: { name: "zip", value: "75017", ts: 123 }
  }
}
Data model
ColumnFamily:
Users = {
  jdoe: {
    username: { name: "username", value: "jdoe", ts: 123 },
    email: { name: "email", value: "[email protected]", ts: 123 }
  },
  jane: {
    username: { name: "username", value: "jane", ts: 123 },
    email: { name: "email", value: "[email protected]", ts: 123 },
    gender: { name: "gender", value: "female", ts: 123 },
    age: { name: "age", value: "25", ts: 123 }
  }
}
Data model
SuperColumnFamily:
AddressBooks = {
  jdoe: {
    bob: {
      name: "physicalAddress",
      value: {
        street: { name: "street", value: "1 rue de la paix", ts: 123 },
        city: { name: "city", value: "Paris", ts: 123 },
        zip: { name: "zip", value: "75017", ts: 123 }
      }
    },
    karen: {
      name: "physicalAddress",
      value: {
        street: { name: "street", value: "2 rue de la paix", ts: 123 },
        city: { name: "city", value: "Paris", ts: 123 },
        zip: { name: "zip", value: "75017", ts: 123 }
      }
    }
  }
}
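The column family structure can be mimicked with nested dictionaries, which also shows what the per-column timestamp is for: when two writes target the same column, the one with the newer timestamp wins. This is a simplified sketch; the `insert` helper is hypothetical, and on equal timestamps it simply keeps the existing value.

```python
def insert(column_family, row_key, name, value, ts):
    """Write one column into a row, keeping the value with the newest timestamp."""
    row = column_family.setdefault(row_key, {})
    current = row.get(name)
    if current is None or ts > current["ts"]:
        row[name] = {"name": name, "value": value, "ts": ts}


users = {}
insert(users, "jdoe", "username", "jdoe", 123)
insert(users, "jane", "username", "jane", 123)
insert(users, "jane", "age", "25", 123)
insert(users, "jane", "age", "26", 200)  # newer write replaces the column
insert(users, "jane", "age", "24", 150)  # stale write is ignored
```

Rows in the same column family do not need the same columns: here `jane` has an `age` column and `jdoe` does not.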
Cassandra Object
class Customer < CassandraObject::Base
  attribute :first_name, :type => :string
  attribute :last_name, :type => :string
  attribute :date_of_birth, :type => :date
  attribute :signed_up_at, :type => :time_with_zone

  validate :should_be_cool

  key :uuid

  index :date_of_birth

  association :invoices, :unique=>false, :inverse_of=>:customer
end
Document stores
http://www.flickr.com/photos/60849961@N00/2386584804
CouchDB
Written in Erlang, Apache 2.0 license
Document = JSON structure
REST API: HTTP + JSON
Easy conflict resolution (ID + revision number)
Incremental replication (scalable +++!)
Robust
Incremental MapReduce©
Comet service for change notifications
Automatic partitioning with CouchDB Lounge
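The "ID + revision number" idea can be illustrated with an optimistic update check: a write is accepted only if it names the revision it was based on. The sketch below is an in-memory toy, not CouchDB's HTTP API; `TinyDocStore` and `ConflictError` are hypothetical names, and revisions are plain integers here for simplicity.

```python
class ConflictError(Exception):
    """Raised when a write is based on an outdated revision."""


class TinyDocStore:
    def __init__(self):
        self.docs = {}

    def put(self, doc_id, body, rev=None):
        current = self.docs.get(doc_id)
        current_rev = current["_rev"] if current else None
        # The caller must present the revision it last read (None for a new doc).
        if rev != current_rev:
            raise ConflictError("%s: expected rev %r, got %r" % (doc_id, current_rev, rev))
        new_rev = (current_rev or 0) + 1
        self.docs[doc_id] = dict(body, _id=doc_id, _rev=new_rev)
        return new_rev


store = TinyDocStore()
rev1 = store.put("john", {"firstname": "John"})
rev2 = store.put("john", {"firstname": "John", "lastname": "Doe"}, rev=rev1)

try:
    # A second writer still holding rev1 is rejected instead of silently overwriting.
    store.put("john", {"firstname": "Johnny"}, rev=rev1)
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```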
MongoDB
Written in C++, AGPL v3.0 license
Document = JSON++ structure, stored as BSON
Fairly complete query API
Indexes
Partitioning (sharding)
MapReduce©
Replication
Paid support available
In-place updates
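For readers new to MapReduce, the idea can be sketched in a few lines: a map function emits (key, value) pairs for each document, and a reduce function folds the values collected for each key. The version below runs in plain Python on a list of dictionaries. It is a stand-in for the concept, not the database's own MapReduce interface, and the function names are hypothetical.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    # Map phase: collect every emitted value under its key.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    # Reduce phase: fold each group of values into a single result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_tags(doc):
    # Emit (tag, 1) once per tag found on the document.
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_count(key, values):
    return sum(values)


posts = [
    {"title": "La guerre des frameworks", "tags": ["rails", "django", "symfony"]},
    {"title": "Django + MongoDB = Mango", "tags": ["python", "django", "mongodb"]},
]
tag_counts = map_reduce(posts, map_tags, reduce_count)
```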
MongoDB: libraries
PHP Python Ruby
Mongo Pymongo (driver)
Mango MongoMapper
Mongoid
MongoRecord
MongoModel
MongoDoc
MongoDB: interactive shell
root@lenny:/opt/mongodb1.3.2/bin# ./mongo
MongoDB shell version: 1.3.2
url: test
connecting to: test
type "help" for help
> show dbs
admin
local
test
> db.people.save({"firstname":"John"})
ObjectId("4b8cccc622131491059056cd")
> person = db.people.findOne({ firstname : "John"})
{ "_id" : ObjectId("4b8cccc622131491059056cd"), "firstname" : "John" }
> person.lastname = "Doe"
Doe
> db.people.save(person)
> db.people.findOne({ firstname : "John"})
{ "_id" : ObjectId("4b8cccc622131491059056cd"), "firstname" : "John", "lastname" : "Doe"}
Pymongo
>>> import pymongo
>>> from pymongo import Connection
>>> conn = Connection('localhost', 27017)
>>> db = conn.blog
>>> posts = db.posts
>>> import datetime
>>> post = {"author": "Raphael",
...         "title": "La guerre des frameworks",
...         "tags": ["rails", "django", "symfony"],
...         "date": datetime.datetime.utcnow()}
>>> posts.insert(post)
ObjectId('4b8ce378a5835f0dda000000')
>>> post = {"author": "Raphael",
...         "title": "Django + MongoDB = Mango",
...         "tags": ["python", "django", "mongodb"],
...         "date": datetime.datetime.utcnow()}
>>> posts.insert(post)
ObjectId('4b8ce531a5835f0dda000001')
>>> for post in posts.find({"author": "Raphael"}):
...     print post
...
>>> posts.count()
2
>>> d = datetime.datetime(2010, 3, 1)
>>> for post in posts.find({"date": {"$gt": d}}):
...     print post
Pymongo
>>> contacts = db.contacts
>>> contact = {"firstname": "John",
...            "lastname": "Doe",
...            "address": {"street": "1 rue de la paix",
...                        "city": "Paris"},
...            "phones": [{"type": "home", "number": "+331234567"},
...                       {"type": "mobile", "number": "+33678901234"}]}
>>> contacts.insert(contact)
ObjectId('4b8ce9f8a5835f0dda000002')
>>> contact = {"firstname": "Jane",
...            "lastname": "Doe",
...            "address": {"street": "1 rue de la paix",
...                        "city": "Paris"},
...            "phones": [{"type": "home", "number": "+331234567"},
...                       {"type": "mobile", "number": "+3361234567"}]}
>>> contacts.insert(contact)
ObjectId('4b8ceb90a5835f0dda000003')
>>> for contact in contacts.find({"address.city": "Paris"}).sort("lastname"):
...     print contact
...
Pymongo
>>> contacts.find({"address.city": "Paris"}).sort("lastname").explain()["cursor"]
u'BasicCursor'
>>> contacts.find({"address.city": "Paris"}).sort("lastname").explain()["nscanned"]
3.0
>>> from pymongo import ASCENDING, DESCENDING
>>> contacts.create_index([("address.city", ASCENDING), ("lastname", ASCENDING)])
u'address.city_1_lastname_1'
>>> contacts.find({"address.city": "Paris"}).sort("lastname").explain()["cursor"]
u'BtreeCursor address.city_1_lastname_1'
>>> contacts.find({"address.city": "Paris"}).sort("lastname").explain()["nscanned"]
2.0
MongooseJS
var Schema = mongoose.Schema ;
var Phone = new Schema({
  type: {type: String},
  number: {type: String}
});

var Person = new Schema({
  firstname: {type: String},
  lastname: {type: String, required: true},
  address: {
    street: {type: String},
    city: {type: String}
  },
  phones: [Phone]
});
mongoose.model("Person", Person) ;
MongooseJS
// mongoose.model("Person") returns the model constructor; instantiate it to get a document
var PersonModel = mongoose.model("Person");
var newperson = new PersonModel();

newperson.firstname = "John";
newperson.lastname = "Doe";
newperson.address = {street: "1 rue de la Paix", city: "Paris"};
newperson.phones.push({type: "home", number: "+331234567"});
newperson.phones.push({type: "mobile", number: "+33067891234"});
newperson.save(function (err) {
  if (!err) console.log("Saved !");
});

mongoose.model("Person").find(
  {"address.city": "Paris"},
  function (err, docs) {
    if (err) return console.log("Aie !", err);
    console.log("Found " + docs.length + " persons living in Paris");
  }
);
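The "address.city" queries in the Pymongo and MongooseJS examples rely on dot notation to reach into embedded documents. The sketch below imitates that lookup on plain Python dictionaries. The `get_path` and `find` helpers are hypothetical; they only handle equality on nested dictionaries, not arrays or operators such as "$gt". The third contact is invented for the example.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, query, sort_key=None):
    # Keep documents whose value at each dotted path equals the queried value.
    matches = [
        doc for doc in collection
        if all(get_path(doc, path) == value for path, value in query.items())
    ]
    if sort_key is not None:
        matches.sort(key=lambda doc: get_path(doc, sort_key))
    return matches


contacts = [
    {"firstname": "John", "lastname": "Doe",
     "address": {"street": "1 rue de la paix", "city": "Paris"}},
    {"firstname": "Jane", "lastname": "Doe",
     "address": {"street": "1 rue de la paix", "city": "Paris"}},
    # Hypothetical extra contact so that the filter has something to exclude.
    {"firstname": "Bob", "lastname": "Martin",
     "address": {"street": "3 place Bellecour", "city": "Lyon"}},
]
parisians = find(contacts, {"address.city": "Paris"}, sort_key="firstname")
```

This version scans every document, which is the situation the BasicCursor output describes; an index on the queried path avoids the full scan.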
A "planet NoSQL"
http://nosql.mypopescu.com/
Thank you for your attention
Questions?