LevelDB

Overview
Install on Ubuntu
Example
Deeper into LevelDB
Performance
- Comparison to others
- Comparison to TokyoCabinet
More
cc

Overview

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

Durability

Sanjay commented on HN about the durability of LevelDB. And what exactly it means that writes may be lost (for asynchronous writes), namely that the database is not corrupted following a crash, but merely result in incomplete data files:

Merely an incomplete one. Leveldb never writes in place: it always appends to a log file, or merges existing files together to produce new ones. So an OS crash will cause a partially written log record (or a few partially written log records). Leveldb recovery code uses checksums to detect this and will skip the incomplete records.

LSM-Tree

An LSM-Tree is used to increase write-throughput. This makes LevelDB more suitable for random writes-heavy workloads. In contrast KyotoCabinet implements ordered keys using a B-Tree, which is more suitable for read-heavy workloads.

Features

Keys and values are arbitrary byte arrays.
Data is stored sorted by key.
Callers can provide a custom comparison function to override the sort order.
The basic operations are Put(key,value), Get(key), Delete(key).
Multiple changes can be made in one atomic batch.
Users can create a transient snapshot to get a consistent view of data.
Forward and backward iteration is supported over the data.
Data is automatically compressed using the Snappy compression library.
External activity (file system operations etc.) is relayed through a virtual interface so users can customize the operating system interactions.

Install on Ubuntu

Install git and snappy (don’t neccessarily need snappy as LevelDB will work without it but you would need to recompile if you don’t install it before compiling)
```
sudo apt-get install git-core libsnappy-dev
```

Clone

git clone  https://github.com/google/leveldb.git

Compile
```
cd leveldb
make
```

Install

cd out-shared
sudo cp --preserve=links libleveldb.* /usr/local/lib
cd ../include
sudo cp -R leveldb /usr/local/include/
sudo ldconfig

Example

Write and Read back.

#include <cassert>
#include <leveldb/db.h>
#include <iostream>
#include <string>
using namespace std;

int main() {
  leveldb::DB* db;
  leveldb::Options options;
  options.create_if_missing = true;
  leveldb::Status status = leveldb::DB::Open(options, "/tmp/testdb", &db);
  assert(status.ok());
  string key = "foo";
  string value = "bar";
  cout << "write Key:" << key << " and value:" << value << endl;
  status = db->Put(leveldb::WriteOptions(), key, value);
  assert(status.ok());
  string value_back;
  status = db->Get(leveldb::ReadOptions(), key, &value_back);
  cout << "Read back:" << value_back << endl;
  assert(status.ok());
  assert(value == value_back);
  delete db;
  return 0;
}

g++ -o test test.cpp  -lleveldb

Deeper into LevelDB

http://blog.csdn.net/houzengjiang/article/details/7718548

http://www.cnblogs.com/haippy/archive/2011/12/04/2276064.html

http://www.cppblog.com/sandy/archive/2012/03/28/leveldb-trick1.html

Performance

leveldb 性能、使用场景评估

Comparison to others

Sanjay Ghemawat talked on YCombinator/HackerNews about the difference between Leveldb and BitCask:

LevelDB is a persistent ordered map; bitcask is a persistent hash table (no ordered iteration).
Bitcask stores a fixed size record in memory for every key. So for databases with large number of keys, it may use too much memory for some applications. bitcask can guarantee at most one disk seek per lookup I think. leveldb may have to do a small handful of disk seeks.
To clarify, leveldb stores data in a sequence of levels. Each level stores approximately ten times as much data as the level before it. A read needs one disk seek per level. So if 10% of the db fits in memory, leveldb will need to do one seek (for the last level since all of the earlier levels should end up cached in the OS buffer cache). If 1% fits in memory, leveldb will need two seeks.

Comparison to TokyoCabinet

LevelDB is optimized for writes, and TokyoCabinet for reads. Comments by Sanjay:

TokyoCabinet is something we seriously considered using instead of writing leveldb. TokyoCabinet has great performance usually. I haven’t done a careful head-to-head comparison, but it wouldn’t surprise me if it was somewhat faster than leveldb for many workloads. Plus TokyoCabinet is more mature, has matching server code etc. and may therefore be a better fit for many projects.

However because of a fundamental difference in data structures (TokyoCabinet uses btrees for ordered storage; leveldb uses log structured merge trees), random write performance (which is important for our needs) is significantly better in leveldb. This part we did measure. IIRC, we could fill TokyoCabinet with a million 100-byte writes in less than two seconds if writing sequentially, but the time ballooned to ~2000 seconds if we wrote randomly. The corresponding slowdown for leveldb is from ~1.5 seconds (sequential) to ~2.5 seconds (random).

SSDB A fast NoSQL database, an alternative to Redis, the Redis protocol wrapper of LevelDB.
Cache-oblivious B-trees are LSM-trees with faster searches.