hamsterdb: my first release!
I started developing my own database engine more than two years ago. Databases have always been a fascinating topic for me – a cool mixture of performance-critical system programming and theoretical algorithms. Far more interesting than computer games!
My database engine, hamsterdb, uses a B+Tree as its index structure (it will support hash tables and maybe other structures in the future). My plan is to write a database which is similar in performance and features to Sleepycat’s BerkeleyDB. Just like BerkeleyDB, it creates one file per database, in which the index and the data records are stored. Unlike BerkeleyDB, hamsterdb supports creating in-memory databases, which never write to disk and are therefore even faster.
Features currently supported are:
- B+Tree index with variable length keys
- Configurable page size and cache size
- ANSI-C implementation, should be portable to all platforms, including embedded ones
- Uses memory mapped I/O for fast disk access (but falls back to read/write if mmap is not available)
- Uses 64-bit file pointers
- Endian-independent (not tested, though)
- Support for in-memory databases
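To illustrate the memory-mapped I/O item above, here is a minimal sketch of "try mmap first, fall back to read". It is not hamsterdb's actual code: the names page_t, page_load and page_release are made up for this example, and it assumes a POSIX system and a C99 compiler.

```c
#define _XOPEN_SOURCE 700   /* ask for the pread() and mmap() declarations */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical page handle; hamsterdb's real structures may look different. */
typedef struct {
    unsigned char *data;   /* page contents */
    size_t size;           /* length in bytes */
    int is_mapped;         /* 1: obtained from mmap(), 0: obtained from malloc() */
} page_t;

/* Load `size` bytes starting at `offset` from `fd`.
 * Returns 0 on success, -1 on failure.
 * For the mmap path to succeed, `offset` has to be a multiple of the
 * system page size; otherwise the read-based fallback is used. */
int page_load(int fd, off_t offset, size_t size, page_t *out)
{
    void *p;
    unsigned char *buf;
    size_t done = 0;

    assert(out != NULL);

    /* Fast path: map the file region read-only into memory. */
    p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, offset);
    if (p != MAP_FAILED) {
        out->data = p;
        out->size = size;
        out->is_mapped = 1;
        return 0;
    }

    /* Fallback: allocate a buffer and read the region into it. */
    buf = malloc(size);
    if (buf == NULL)
        return -1;
    while (done < size) {
        ssize_t n = pread(fd, buf + done, size - done, offset + (off_t)done);
        if (n <= 0) {          /* read error or unexpected end of file */
            free(buf);
            return -1;
        }
        done += (size_t)n;
    }
    out->data = buf;
    out->size = size;
    out->is_mapped = 0;
    return 0;
}

/* Release a page with whichever mechanism produced it. */
void page_release(page_t *page)
{
    assert(page != NULL);
    if (page->is_mapped)
        munmap(page->data, page->size);
    else
        free(page->data);
    page->data = NULL;
    page->size = 0;
}
```

The caller never needs to know which path was taken; the is_mapped flag only matters when the page is released.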
Known Issues
- Iterators are not yet available, and therefore duplicate items are not yet supported
- No concurrency, no transactions, no SQL support…
- Only tested on Linux; should compile on most Unices, but definitely not on Microsoft Windows.
- Endian-independent features were not tested
- Older gcc versions (reproducible with 3.4.5) break hamsterdb if it’s compiled with optimization enabled (-O); newer gcc versions (tested with 4.1.1) work
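Since the endian-independent code is listed above as untested, here is a small sketch of the usual technique: 64-bit values such as file offsets are written byte by byte in a fixed order, so the file looks the same no matter which CPU wrote it. The function names are hypothetical, and the choice of little-endian is just an assumption for this example; it is not a statement about hamsterdb's file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store `v` into 8 bytes at `dst`, least significant byte first.
 * Shifts operate on the value, not on memory, so the result does not
 * depend on the byte order of the host CPU. */
void put_u64_le(unsigned char *dst, uint64_t v)
{
    int i;
    assert(dst != NULL);
    for (i = 0; i < 8; i++)
        dst[i] = (unsigned char)(v >> (8 * i));
}

/* Read back a value that was stored with put_u64_le(). */
uint64_t get_u64_le(const unsigned char *src)
{
    uint64_t v = 0;
    int i;
    assert(src != NULL);
    for (i = 0; i < 8; i++)
        v |= (uint64_t)src[i] << (8 * i);
    return v;
}
```

A round-trip check like put_u64_le followed by get_u64_le is easy to run on any machine, but it only proves something about endianness when a file written on a little-endian machine is read on a big-endian one (or the other way round).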
Download
You can download hamsterdb sources here: http://www.crupp.de/dl/hamsterdb-0.1pre1.tar.gz
It’s a pre-pre-release, but it’s stable and I don’t know of any bugs.
Roadmap to Version 0.1
- Port to Microsoft Windows
- Make it as fast as BerkeleyDB, or at least try to (BerkeleyDB is really fast, I take my hat off to them)
- Write tools to dump and repair databases
Roadmap to Version 0.2
- Add iterators
- Add support for duplicate items
Roadmap to Version 0.3
- Support hash tables
- Add support for generic “filters”, e.g. to encrypt/decrypt or compress database pages
- Add bindings for other languages (C++, Python, Perl, Java…)
- Documentation…
Feel free to leave a comment!