Sunday, December 25, 2011

Getting started with CouchDB: a beginner’s guide


Have you ever dreamt about a powerful database that you can access easily, without using the SQL language? That's what Apache CouchDB is all about. In this tutorial, I’m going to show you how to get started with this document-oriented database and how you can use it with PHP.

Getting started with CouchDB

Apache CouchDB is one of a new breed of database management systems known as NoSQL. NoSQL is a buzzword first popularized in early 2009 for a loosely defined class of non-relational data stores that break with the long history of relational databases and often relax their ACID guarantees. Data stores that fall under this term may not require fixed table schemas.
The first reason I am quickly growing to love CouchDB, and the reason I decided to write this post, is that it is a document-oriented database: rather than storing content in fixed tables, it lets us store information in a manner that is as flexible as an array.
For example, here’s a sample document:
FirstName="Bob", Address="5 Oak St.", Hobby="sailing".
However, another document could have this data:
FirstName="Jonathan", Address="15 Wanamassa Point Road", Children=("Michael,10", "Jennifer,8", "Samantha,5", "Elena,2").
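To make the two documents above concrete, here is a minimal sketch that writes them as plain PHP arrays and prints them as JSON, which is the format CouchDB stores documents in. The variable names are mine, chosen only for illustration.

```php
<?php
// Two documents with different shapes, written as plain PHP arrays.
$bob = array(
    'FirstName' => 'Bob',
    'Address'   => '5 Oak St.',
    'Hobby'     => 'sailing',
);

$jonathan = array(
    'FirstName' => 'Jonathan',
    'Address'   => '15 Wanamassa Point Road',
    'Children'  => array('Michael,10', 'Jennifer,8', 'Samantha,5', 'Elena,2'),
);

// Neither document carries a placeholder for the fields it does not use:
// there is no empty "Hobby" for Jonathan and no empty "Children" for Bob.
echo json_encode($bob), "\n";
echo json_encode($jonathan), "\n";
```

Each document only contains the keys it actually needs, which is the point made above about not storing empty or null fields.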
This is great because, first of all, we are not wasting storage on empty or null fields.
The second reason this is nice is that we no longer worry about tables and columns! When we need to store information, we set just the fields we need. This CAN cause issues if you do not plan correctly, but we will get into that a little later on.
Another big reason I like CouchDB is that access is through a REST API. For those who know what that means, this is big! For those who don’t, it means CouchDB speaks plain HTTP, so data can be read or written directly from the browser via JavaScript, without the need to write extra PHP code on the server side!
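Because the API is plain HTTP, any client that can issue a GET request can read a document. To stay in PHP, here is a rough sketch that builds the URL of a single document and fetches it without any CouchDB library. The helper names, the database name blog, and the document ID are assumptions for illustration, and the fetch relies on allow_url_fopen being enabled.

```php
<?php
// Build the URL of a single document: http://host:port/database/document-id
// Host, port, database and ID below are example values, not fixed names.
function couchDocumentUrl($host, $port, $database, $id)
{
    return sprintf(
        'http://%s:%d/%s/%s',
        $host,
        $port,
        rawurlencode($database),
        rawurlencode($id)
    );
}

// Fetch a document with a plain HTTP GET and decode the JSON body.
// Returns null if the request fails (for example, when no server is running).
function couchGetDocument($url)
{
    $body = @file_get_contents($url);
    return $body === false ? null : json_decode($body, true);
}

$url = couchDocumentUrl('localhost', 5984, 'blog', 'blog_entry-new_blog_post');
echo $url, "\n";
```

Calling couchGetDocument($url) against a running server would return the document as an associative array.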

Using CouchDB

Now that I have you all hyped about it, let’s get to using it. The first thing you need to know is that PHP does not have any built-in functions to access a CouchDB database.
For this I recommend PHPillow, a class library written by Kore Nordmann. In my opinion it is the best way to access CouchDB from PHP that I have seen so far, so that is what I will be using in this example. The second thing you need to know is that storing and querying data in CouchDB is not the same as running a MySQL query.
Database connection
To connect to your CouchDB instance, simply use the phpillowConnection class as shown here:
phpillowConnection::createInstance('localhost', 5984, 'user', 'password');
Once created, this connection will be used in your document and view classes automatically.
Define a custom document
All documents extend the abstract base class phpillowDocument. A complete model defining a blog entry could look like:
class myBlogDocument extends phpillowDocument
{
    protected static $type = 'blog_entry';

    protected $requiredProperties = array(
        'title',
        'text',
    );

    public function __construct()
    {
        $this->properties = array(
            'title'     => new phpillowStringValidator(),
            'text'      => new phpillowTextValidator(),
            'comments'  => new phpillowDocumentArrayValidator(
                'myBlogComments'
            ),
        );

        parent::__construct();
    }

    protected function generateId()
    {
        return $this->stringToId( $this->storage->title );
    }

    protected function getType()
    {
        return self::$type;
    }
}
The static property $type defines the type of the stored document and should be unique for each document class in your application. If you are implementing a module, prefix this type with the name of the module, like “blog” in this example. If you happen to use a PHP version prior to 5.3, you have to return the document type in each of your document classes as shown above. Users of 5.3 and above can take a more generic approach by returning static::$type in a base document class.
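The PHP 5.3+ approach mentioned above relies on late static binding. Here is a small standalone sketch of the idea; the class names are hypothetical and deliberately do not extend phpillowDocument, so the snippet runs on its own.

```php
<?php
// Hypothetical base class: getType() is written once and uses late static binding.
abstract class sketchBaseDocument
{
    protected static $type = 'base';

    public function getType()
    {
        // static:: resolves to the class the method was called on (PHP 5.3+),
        // whereas self:: would always refer to sketchBaseDocument.
        return static::$type;
    }
}

class sketchBlogDocument extends sketchBaseDocument
{
    protected static $type = 'blog_entry';
}

class sketchCommentDocument extends sketchBaseDocument
{
    protected static $type = 'blog_comment';
}

$blog    = new sketchBlogDocument();
$comment = new sketchCommentDocument();

echo $blog->getType(), "\n";    // blog_entry
echo $comment->getType(), "\n"; // blog_comment
```

With this pattern each document class only has to declare its own $type, and the getType() method no longer needs to be repeated.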
The $requiredProperties array defines the properties that must be set. The properties themselves are defined in the $properties property, which is initialized in the constructor of the document. We associate a validator with each property, which validates the input set on the document. Some validators are quite complex, like the phpillowDocumentArrayValidator shown here, which will be described later. All validators are documented in the generated API documentation.
The last thing you need to define is the generation of the document ID. An ID in CouchDB needs to fulfill some requirements, which are ensured by using the protected method stringToId(). Normally you use one reasonably unique property of the document. If this is not entirely unique, the document handler will append something so that the ID becomes unique. Just return null if you want CouchDB to give you a unique ID for the document.
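To give a feel for what turning a title into an ID involves, here is an illustrative sketch. It is not PHPillow's stringToId() implementation: the function names are hypothetical, and the type-dash-slug format is only inferred from the example ID 'blog_entry-new_blog_post' used in this post.

```php
<?php
// Hypothetical stand-in for an ID sanitizer (NOT PHPillow's actual code).
function sketchStringToId($string)
{
    $id = strtolower($string);
    // Replace every run of characters other than a-z and 0-9 with one underscore.
    $id = preg_replace('/[^a-z0-9]+/', '_', $id);
    // Drop underscores left over at either end.
    return trim($id, '_');
}

// Assumed ID layout: document type, a dash, then the sanitized title.
function sketchDocumentId($type, $title)
{
    return $type . '-' . sketchStringToId($title);
}

echo sketchDocumentId('blog_entry', 'New blog post'), "\n"; // blog_entry-new_blog_post
```

Two posts with the same title would produce the same ID here, which is why the real document handler appends something to keep IDs unique.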
Using a document
To save data using the document class defined above, we can simply call:
$doc = new myBlogDocument();
$doc->title = 'New blog post';
$doc->text  = 'Hello world.';
$doc->save();
With the call to the save() method, the document is generated and stored in the database. After this, a new magic property is available on the document:
$doc->_id;
When you work with documents directly, this ID is the way to fetch the document back from the database:
$doc = new myBlogDocument();
$doc->fetchById('blog_entry-new_blog_post');
This call retrieves the above document from the database. The magic CouchDB properties _id and _rev (for revision) are set for the document. Besides the defined properties, the wrapper has created another property called revisions, which contains all old revisions of the document as well as the current one:
echo $doc->revisions[0]['title'];
If you now change a property on the object and store it again in the database, the old revision will also be stored in the database, so that no information is lost on change. This behavior may be deactivated by setting the $versioned property to false.
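The behavior described above can be pictured with a small standalone sketch. This is only an illustration of the idea, not PHPillow's implementation: the class is hypothetical, nothing is sent to a database, and each save() simply appends a snapshot of the current state to a revisions array.

```php
<?php
// Hypothetical illustration of a versioned document (NOT PHPillow's code).
class sketchVersionedDocument
{
    public $title;
    public $text;
    public $revisions = array();

    // Set to false to stop collecting snapshots.
    public $versioned = true;

    public function save()
    {
        if ($this->versioned) {
            // Keep a snapshot of the state being saved, so the list holds
            // every old revision plus the current one.
            $this->revisions[] = array(
                'title' => $this->title,
                'text'  => $this->text,
            );
        }
    }
}

$doc = new sketchVersionedDocument();
$doc->title = 'New blog post';
$doc->text  = 'Hello world.';
$doc->save();

$doc->title = 'Renamed blog post';
$doc->save();

echo $doc->revisions[0]['title'], "\n"; // New blog post
echo $doc->revisions[1]['title'], "\n"; // Renamed blog post
```

The first entry still holds the original title after the rename, which is what makes rolling back possible.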
Did you say revisions?
Why yes, I did! Thanks for noticing! My Steve Jobs “One more thing!” moment is that if you alter a document through PHPillow, it saves the previous version as a revision automagically! No need for multiple database entries to make sure your application can roll back!
So that’s about it for this tutorial. Next time we will get into how to run more advanced queries using PHPillow.
