Archive for November, 2006

hamsterdb: next release comes soon…

November 24th, 2006

I’m working hard on the next release. It will feature fixes for a couple of minor bugs and memory leaks, as well as significant performance improvements.

Most important changes:

  • replaced linear search in pages with binary search
  • completely rewrote the blob data management, which led to performance improvements of up to 300% for big blobs!
  • completely rewrote the freelist – entries are now merged, which results in smaller files
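To make the first and third items concrete, here is a minimal sketch in C. It is not hamsterdb's actual code: the `page_t` and `free_entry_t` layouts and the function names `page_find` and `freelist_merge` are assumptions made up for illustration, with fixed-size integer keys standing in for real btree keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page layout: a sorted array of fixed-size integer keys. */
typedef struct {
    size_t   count;
    uint64_t keys[64];
} page_t;

/* Binary search: returns the slot of `key`, or -1 if it is not in the page.
 * Needs O(log n) comparisons instead of the O(n) of a linear scan. */
long page_find(const page_t *page, uint64_t key)
{
    size_t lo = 0, hi;

    assert(page != NULL);
    hi = page->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (page->keys[mid] < key)
            lo = mid + 1;   /* key can only be to the right of mid */
        else
            hi = mid;       /* key is at mid or to the left of it */
    }
    if (lo < page->count && page->keys[lo] == key)
        return (long)lo;
    return -1;
}

/* Hypothetical freelist entry: a free region [offset, offset + size). */
typedef struct {
    uint64_t offset;
    uint64_t size;
} free_entry_t;

/* Merges directly adjacent free regions in place. The entries must be
 * sorted by offset. Returns the number of entries left after merging. */
size_t freelist_merge(free_entry_t *entries, size_t count)
{
    size_t out = 0, i;

    if (count == 0)
        return 0;
    assert(entries != NULL);
    for (i = 1; i < count; i++) {
        if (entries[out].offset + entries[out].size == entries[i].offset)
            entries[out].size += entries[i].size;  /* grow previous region */
        else
            entries[++out] = entries[i];           /* gap: keep as new entry */
    }
    return out + 1;
}
```

With larger merged regions, a later allocation has a better chance of reusing free space instead of growing the file, which is the effect the third item describes.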

I still have one big item on my todo list: making extended keys faster. Currently, if a key does not fit in the btree index, an overflow area is allocated (this is what I call an “extended key”). I use the normal blob routines to store and load the overflow areas. I will add a cache for extended keys so that they don’t have to be fetched from disk every time they are needed.
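One way such a cache could look is sketched below in C. The post does not say how the cache will be built, so everything here is an assumption: a small direct-mapped table indexed by a blob id, with hypothetical names (`extkey_cache_t`, `extkey_fetch`) and a stand-in loader, `demo_load`, that fabricates bytes in place of a real blob read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXTKEY_SLOTS 8   /* hypothetical cache size */

typedef struct {
    uint64_t       blob_id;  /* id of the overflow area held in this slot */
    size_t         size;
    unsigned char *data;     /* NULL while the slot is empty */
} extkey_entry_t;

typedef struct {
    extkey_entry_t slots[EXTKEY_SLOTS];
    unsigned long  hits, misses;
} extkey_cache_t;

/* Loader callback: reads the overflow area from disk; returns 0 on success. */
typedef int (*extkey_loader_t)(uint64_t blob_id,
                               unsigned char **data, size_t *size);

/* Returns the extended key for `blob_id`, loading it only on a cache miss.
 * The returned pointer stays valid until its slot is overwritten. */
const unsigned char *extkey_fetch(extkey_cache_t *cache, uint64_t blob_id,
                                  extkey_loader_t load, size_t *size)
{
    extkey_entry_t *e;
    unsigned char  *data;
    size_t          n;

    assert(cache && load && size);
    e = &cache->slots[blob_id % EXTKEY_SLOTS];
    if (e->data && e->blob_id == blob_id) {   /* hit: no disk access */
        cache->hits++;
        *size = e->size;
        return e->data;
    }
    cache->misses++;
    if (load(blob_id, &data, &n) != 0)
        return NULL;
    free(e->data);                            /* evict the previous occupant */
    e->blob_id = blob_id;
    e->data    = data;
    e->size    = n;
    *size = n;
    return e->data;
}

/* Stand-in for the real blob read: fabricates 16 bytes from the id. */
int demo_load(uint64_t blob_id, unsigned char **data, size_t *size)
{
    unsigned char *buf = malloc(16);
    if (!buf)
        return -1;
    memset(buf, (int)(blob_id & 0xff), 16);
    *data = buf;
    *size = 16;
    return 0;
}
```

A direct-mapped table keeps lookups cheap, at the cost of evicting a key whenever two blob ids map to the same slot. A real implementation would also have to drop an entry when its key is erased or overwritten.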

I have done a lot of profiling lately, and so far the improvements are awesome. Some of my unit tests, which used to run for up to 30 minutes, now finish in 2 or 3 minutes.

I have also rewritten my test scripts. I have nearly 1 GB of test scripts, and they run for about 30 to 60 minutes, depending on the configuration. One full test run of all scripts in all configurations (with/without mmap, as in-memory db, with different cache sizes, page sizes and key sizes, with overwriting keys, etc.) takes nearly a day.

I expect to finally release the new version in three or four weeks.

chris Coding, hamsterdb

hamsterdb: release 0.1pre2

November 4th, 2006

After releasing the first version, and after weeks of testing (I have more than 700 MB of test scripts, which run for days), I found a couple of bugs, which are now fixed.

Changes

  • enabled support for O_LARGEFILE
  • added the doxygen build script to generate documentation
  • rewrote all SCons-related files
  • rewrote the test environment
  • fixed a bug when allocating big blobs
  • fixed a bug when overwriting keys whose blob was empty or tiny (<= sizeof(offset_t))
  • fixed a major bug in the freelist which created one new freelist page for every deleted item – therefore, database files became HUGE
  • fixed minor issues on 32bit
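For readers unfamiliar with O_LARGEFILE: on 32-bit Linux it lets a file grow past 2 GB. The following C sketch shows one way to open a database file with the flag. It is an illustration only, not the code from the tarball; the function name `open_db_file` and the file mode are assumptions.

```c
/* Must come before any #include so that glibc exposes O_LARGEFILE. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>   /* close(), for callers */

/* Fallback for platforms (or include orders) that do not define the flag. */
#ifndef O_LARGEFILE
#define O_LARGEFILE 0
#endif

/* Opens (and creates, if necessary) a database file that may exceed 2 GB.
 * Returns the file descriptor, or -1 on error (errno is set by open). */
int open_db_file(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT | O_LARGEFILE, 0644);
}
```

An alternative is to compile with `-D_FILE_OFFSET_BITS=64`, which makes `off_t` 64 bits wide on 32-bit systems so that the plain `open` call handles large files without the explicit flag.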

Known Issues

  • Iterators are not yet available, and therefore duplicate items are not yet supported
  • No concurrency, no transactions, no SQL support…
  • Only tested on Linux; should compile on most Unices, but definitely not yet on Microsoft Windows.
  • Endian-independent features were not tested
  • older gcc versions (reproducible with 3.4.5) break hamsterdb if it’s compiled with optimization enabled (-O); newer gcc versions (tested with 4.1.1) work. If you have an older gcc version, compile without -O

Download

You can download hamsterdb sources here: http://www.crupp.de/dl/hamsterdb-0.1pre2.tar.gz

Sample

Example code is provided in the tarball, but also available here: http://www.crupp.de/dl/simple.c

Roadmap to version 0.1

The roadmap was modified. The new targets are improved performance, a port to Microsoft Windows, and support for iterators and duplicate items. Also, the decision whether to use SCons or automake/autoconf is not yet final.

chris Coding, hamsterdb, Libraries