TidesDB Java API Reference


Getting Started

Prerequisites

You must have the TidesDB shared C library installed on your system. Installation instructions are in the TidesDB documentation.

Requirements

Java 11 or higher
Maven 3.6+
TidesDB native library installed on the system

Building the JNI Library

cd src/main/c
cmake -S . -B build
cmake --build build
sudo cmake --install build

Adding to Your Project

Maven

<dependency>
    <groupId>com.tidesdb</groupId>
    <artifactId>tidesdb-java</artifactId>
    <version>0.6.5</version>
</dependency>

Usage

Opening and Closing a Database

import com.tidesdb.*;

public class Example {
    public static void main(String[] args) throws TidesDBException {
        Config config = Config.builder("./mydb")
            .numFlushThreads(2)
            .numCompactionThreads(2)
            .logLevel(LogLevel.INFO)
            .blockCacheSize(64 * 1024 * 1024)
            .maxOpenSSTables(256)
            .maxMemoryUsage(0)
            .build();

        try (TidesDB db = TidesDB.open(config)) {
            System.out.println("Database opened successfully");
        }
    }
}
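
One thing to watch with byte-size options such as blockCacheSize: an expression like 64 * 1024 * 1024 is evaluated in int arithmetic. That is fine at 64 MB, but the same pattern silently wraps around once the product reaches 2 GiB. The sketch below uses a hypothetical helper class, Sizes (not part of TidesDB), to show the overflow and a long-based alternative.

```java
// Hypothetical helper, not part of the TidesDB API.
final class Sizes {
    private Sizes() {}

    /** Returns n mebibytes as a byte count, computed in long arithmetic. */
    static long mib(long n) {
        return n * 1024L * 1024L;
    }

    /** Returns n gibibytes as a byte count, computed in long arithmetic. */
    static long gib(long n) {
        return n * 1024L * 1024L * 1024L;
    }

    /** The 4 GiB product written with int literals; it wraps around to 0. */
    static long fourGibWithIntLiterals() {
        return 4 * 1024 * 1024 * 1024;
    }
}
```

With such a helper, a larger cache could be written as .blockCacheSize(Sizes.gib(4)) without relying on an L suffix somewhere in the expression.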

Creating and Dropping Column Families

Column families are isolated key-value stores with independent configuration.

// Create with default configuration
ColumnFamilyConfig cfConfig = ColumnFamilyConfig.defaultConfig();
db.createColumnFamily("my_cf", cfConfig);

// Create with custom configuration
ColumnFamilyConfig customConfig = ColumnFamilyConfig.builder()
    .writeBufferSize(128 * 1024 * 1024)
    .levelSizeRatio(10)
    .minLevels(5)
    .compressionAlgorithm(CompressionAlgorithm.LZ4_COMPRESSION)
    .enableBloomFilter(true)
    .bloomFPR(0.01)
    .enableBlockIndexes(true)
    .syncMode(SyncMode.SYNC_INTERVAL)
    .syncIntervalUs(128000)
    .defaultIsolationLevel(IsolationLevel.READ_COMMITTED)
    .build();

db.createColumnFamily("custom_cf", customConfig);

db.dropColumnFamily("my_cf");

String[] cfNames = db.listColumnFamilies();
for (String name : cfNames) {
    System.out.println("Column family: " + name);
}

Working with Transactions

Writing Data

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key".getBytes(), "value".getBytes());
    txn.commit();
}

Writing with TTL

import java.time.Instant;

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    // The TTL is an absolute Unix timestamp in seconds: here, 10 seconds from now
    long ttl = Instant.now().getEpochSecond() + 10;

    txn.put(cf, "temp_key".getBytes(), "temp_value".getBytes(), ttl);
    txn.commit();
}

TTL Examples

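// No expiration
long noExpiry = -1;

// Expire 5 minutes from now
long inFiveMinutes = Instant.now().getEpochSecond() + 5 * 60;

// Expire 1 hour from now
long inOneHour = Instant.now().getEpochSecond() + 60 * 60;

// Expire at a fixed UTC time (needs java.time.LocalDateTime and java.time.ZoneOffset)
long endOf2026 = LocalDateTime.of(2026, 12, 31, 23, 59, 59)
        .toEpochSecond(ZoneOffset.UTC);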
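
Because the TTL argument is an absolute timestamp rather than a duration, it can help to centralize the arithmetic. The following is a minimal sketch of such a helper; the class name Ttl and its methods are hypothetical and not part of TidesDB. Taking a Clock makes the calculation testable.

```java
import java.time.Clock;
import java.time.Duration;

// Hypothetical helper, not part of the TidesDB API.
final class Ttl {
    /** Sentinel meaning "no expiry" (-1). */
    static final long NO_EXPIRY = -1L;

    private Ttl() {}

    /** Absolute expiry in Unix seconds for a lifetime measured from the given clock. */
    static long expiresIn(Duration lifetime, Clock clock) {
        if (lifetime.isZero() || lifetime.isNegative()) {
            throw new IllegalArgumentException("lifetime must be positive");
        }
        return Math.addExact(clock.instant().getEpochSecond(), lifetime.getSeconds());
    }

    /** Same calculation against the system UTC clock. */
    static long expiresIn(Duration lifetime) {
        return expiresIn(lifetime, Clock.systemUTC());
    }
}
```

A write with a five-minute lifetime would then pass Ttl.expiresIn(Duration.ofMinutes(5)) as the last argument to put.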

Reading Data

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    byte[] value = txn.get(cf, "key".getBytes());
    System.out.println("Value: " + new String(value));
}
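
The examples in this document call getBytes() and new String(byte[]) without naming a charset. On Java versions before 18 those calls use the platform default charset, so the same key string can encode to different bytes on different machines. A minimal sketch of an explicit UTF-8 helper follows; the class name Utf8 is hypothetical and not part of TidesDB.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the TidesDB API.
final class Utf8 {
    private Utf8() {}

    /** Encodes a string key or value as UTF-8 bytes, independent of platform defaults. */
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back to a string; passes null through unchanged. */
    static String decode(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }
}
```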

Deleting Data

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    txn.delete(cf, "key".getBytes());
    txn.commit();
}

Transaction Rollback

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key".getBytes(), "value".getBytes());

    txn.rollback();
}

Multi-Operation Transactions

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key1".getBytes(), "value1".getBytes());
    txn.put(cf, "key2".getBytes(), "value2".getBytes());
    txn.delete(cf, "old_key".getBytes());

    txn.commit();
} catch (TidesDBException e) {
    // try-with-resources has already closed txn by the time this block runs
    throw e;
}

Iterating Over Data

Iterators provide efficient bidirectional traversal over key-value pairs.

Forward Iteration

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    try (TidesDBIterator iter = txn.newIterator(cf)) {
        iter.seekToFirst();

        while (iter.isValid()) {
            byte[] key = iter.key();
            byte[] value = iter.value();

            System.out.printf("Key: %s, Value: %s%n",
                new String(key), new String(value));

            iter.next();
        }
    }
}

Backward Iteration

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    try (TidesDBIterator iter = txn.newIterator(cf)) {
        iter.seekToLast();

        while (iter.isValid()) {
            byte[] key = iter.key();
            byte[] value = iter.value();

            System.out.printf("Key: %s, Value: %s%n",
                new String(key), new String(value));

            iter.prev();
        }
    }
}

Seeking to a Specific Key

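// Assumes an open Transaction txn and a ColumnFamily cf, as in the examples above
try (TidesDBIterator iter = txn.newIterator(cf)) {
    // Position at the first key >= "prefix"
    iter.seek("prefix".getBytes());

    // Position at the last key <= "prefix"
    iter.seekForPrev("prefix".getBytes());
}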
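
To make the two positioning rules concrete, the sketch below models them on an in-memory sorted set. It is only an analogy: SeekModel is a hypothetical class, the keys are ASCII strings rather than byte arrays, and the seekForPrev rule (last key <= target) is assumed as the mirror image of seek.

```java
import java.util.TreeSet;

// Hypothetical model of iterator positioning; not TidesDB code.
final class SeekModel {
    private SeekModel() {}

    /** Models seek: the first key >= target, or null when no such key exists. */
    static String seek(TreeSet<String> keys, String target) {
        return keys.ceiling(target);
    }

    /** Models seekForPrev (assumed semantics): the last key <= target, or null. */
    static String seekForPrev(TreeSet<String> keys, String target) {
        return keys.floor(target);
    }
}
```

With keys a, c and e, seeking to b lands on c, while seekForPrev on b lands on a. Seeking past the last key yields no position, which corresponds to isValid() returning false.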

Prefix Seeking

Since seek positions the iterator at the first key >= target, you can use a prefix as the seek target to efficiently scan all keys sharing that prefix:

ColumnFamily cf = db.getColumnFamily("my_cf");

try (Transaction txn = db.beginTransaction()) {
    try (TidesDBIterator iter = txn.newIterator(cf)) {
        byte[] prefix = "user:".getBytes();
        iter.seek(prefix);

        while (iter.isValid()) {
            byte[] key = iter.key();
            String keyStr = new String(key);

            if (!keyStr.startsWith("user:")) break;

            byte[] value = iter.value();
            System.out.printf("Key: %s, Value: %s%n", keyStr, new String(value));

            iter.next();
        }
    }
}

This pattern works across both memtables and SSTables. When block indexes are enabled, the seek operation uses binary search to jump directly to the relevant block, making prefix scans efficient even on large datasets.
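
The loop above decodes every key to a String just to test the prefix. For binary keys, or to avoid the per-key allocation, the check can be done on the bytes directly. This is a minimal sketch; the class name Prefix is hypothetical and not part of TidesDB.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the TidesDB API.
final class Prefix {
    private Prefix() {}

    /** Returns true when key starts with every byte of prefix, compared byte for byte. */
    static boolean hasPrefix(byte[] key, byte[] prefix) {
        return key.length >= prefix.length
            && Arrays.equals(key, 0, prefix.length, prefix, 0, prefix.length);
    }
}
```

In the scan loop, the termination test would become if (!Prefix.hasPrefix(iter.key(), prefix)) break;.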

Getting Column Family Statistics

ColumnFamily cf = db.getColumnFamily("my_cf");
Stats stats = cf.getStats();

System.out.println("Number of levels: " + stats.getNumLevels());
System.out.println("Memtable size: " + stats.getMemtableSize());
System.out.println("Total keys: " + stats.getTotalKeys());
System.out.println("Total data size: " + stats.getTotalDataSize());
System.out.println("Average key size: " + stats.getAvgKeySize());
System.out.println("Average value size: " + stats.getAvgValueSize());
System.out.println("Read amplification: " + stats.getReadAmp());
System.out.println("Hit rate: " + stats.getHitRate());

if (stats.isUseBtree()) {
    System.out.println("B+tree total nodes: " + stats.getBtreeTotalNodes());
    System.out.println("B+tree max height: " + stats.getBtreeMaxHeight());
    System.out.println("B+tree avg height: " + stats.getBtreeAvgHeight());
}

long[] levelSizes = stats.getLevelSizes();
int[] levelSSTables = stats.getLevelNumSSTables();
long[] levelKeyCounts = stats.getLevelKeyCounts();

Stats Fields

Field	Type	Description
`numLevels`	int	Number of LSM levels
`memtableSize`	long	Current memtable size in bytes
`levelSizes`	long[]	Size of each level in bytes
`levelNumSSTables`	int[]	Number of SSTables at each level
`levelKeyCounts`	long[]	Number of keys per level
`totalKeys`	long	Total keys across memtable and all SSTables
`totalDataSize`	long	Total data size (klog + vlog) in bytes
`avgKeySize`	double	Average key size in bytes
`avgValueSize`	double	Average value size in bytes
`readAmp`	double	Read amplification (point lookup cost multiplier)
`hitRate`	double	Cache hit rate (0.0 if cache disabled)
`useBtree`	boolean	Whether column family uses B+tree klog format
`btreeTotalNodes`	long	Total B+tree nodes (only if useBtree=true)
`btreeMaxHeight`	int	Maximum B+tree height (only if useBtree=true)
`btreeAvgHeight`	double	Average B+tree height (only if useBtree=true)
`config`	ColumnFamilyConfig	Column family configuration

Getting Cache Statistics

CacheStats cacheStats = db.getCacheStats();

System.out.println("Cache enabled: " + cacheStats.isEnabled());
System.out.println("Total entries: " + cacheStats.getTotalEntries());
System.out.println("Hit rate: " + cacheStats.getHitRate());

Manual Compaction and Flush

ColumnFamily cf = db.getColumnFamily("my_cf");

cf.compact();

cf.flushMemtable();

boolean flushing = cf.isFlushing();
boolean compacting = cf.isCompacting();

Range Cost Estimation

Estimate the computational cost of iterating between two keys in a column family. The returned value is an opaque double, meaningful only for comparison with other values from the same method. The estimate uses only in-memory metadata and performs no disk I/O.

ColumnFamily cf = db.getColumnFamily("my_cf");

double costA = cf.rangeCost("user:0000".getBytes(), "user:0999".getBytes());
double costB = cf.rangeCost("user:1000".getBytes(), "user:1099".getBytes());

if (costA < costB) {
    System.out.println("Range A is cheaper to iterate");
}

Key order does not matter: the method normalizes the range, so passing the two keys in either order produces the same result. A cost of 0.0 means no overlapping SSTables or memtable entries were found for the range.

Use cases

Query planning · Compare candidate key ranges to find the cheapest one to scan
Load balancing · Distribute range scan work across threads by estimating per-range cost
Adaptive prefetching · Decide how aggressively to prefetch based on range size
Monitoring · Track how data distribution changes across key ranges over time
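
As an illustration of the load-balancing use case, the sketch below spreads ranges over a fixed number of workers with a greedy rule: most expensive range first, always onto the least-loaded worker. RangePlanner and CostedRange are hypothetical names, the costs are plain doubles that would come from rangeCost in practice, and the approach assumes costs are roughly additive, which the API does not promise.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical planner; not part of the TidesDB API.
final class RangePlanner {
    private RangePlanner() {}

    /** A named key range paired with its estimated cost. */
    static final class CostedRange {
        final String name;
        final double cost;

        CostedRange(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    /** Assigns ranges to workers, most expensive first, each to the least-loaded worker. */
    static List<List<CostedRange>> assign(List<CostedRange> ranges, int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        List<List<CostedRange>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        double[] load = new double[workers];

        List<CostedRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingDouble((CostedRange r) -> r.cost).reversed());

        for (CostedRange r : sorted) {
            int best = 0;
            for (int i = 1; i < workers; i++) {
                if (load[i] < load[best]) {
                    best = i;
                }
            }
            buckets.get(best).add(r);
            load[best] += r.cost;
        }
        return buckets;
    }
}
```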

Updating Runtime Configuration

Update runtime-safe configuration settings for a column family:

ColumnFamily cf = db.getColumnFamily("my_cf");

ColumnFamilyConfig newConfig = ColumnFamilyConfig.builder()
    .writeBufferSize(256 * 1024 * 1024)
    .skipListMaxLevel(16)
    .bloomFPR(0.001)
    .syncMode(SyncMode.SYNC_INTERVAL)
    .syncIntervalUs(100000)
    .build();

cf.updateRuntimeConfig(newConfig, true);

Updatable settings (safe to change at runtime):

writeBufferSize · Memtable flush threshold
skipListMaxLevel · Skip list level for new memtables
skipListProbability · Skip list probability for new memtables
bloomFPR · False positive rate for new SSTables
indexSampleRatio · Index sampling ratio for new SSTables
syncMode · Durability mode
syncIntervalUs · Sync interval in microseconds
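
Lowering bloomFPR trades memory for fewer false positives. As a rough guide, the textbook formula for an optimally configured Bloom filter gives the bits needed per key as -ln(p) / (ln 2)^2. The sketch below applies that formula; BloomSizing is a hypothetical class, and TidesDB's actual filter sizing may differ from the textbook value.

```java
// Textbook Bloom filter sizing; not TidesDB's implementation.
final class BloomSizing {
    private BloomSizing() {}

    /** Bits per key for an optimal Bloom filter with false positive rate fpr. */
    static double bitsPerKey(double fpr) {
        if (!(fpr > 0.0 && fpr < 1.0)) {
            throw new IllegalArgumentException("fpr must be between 0 and 1, exclusive");
        }
        double ln2 = Math.log(2.0);
        return -Math.log(fpr) / (ln2 * ln2);
    }
}
```

Under this formula, moving from 0.01 to 0.001 raises the cost from about 9.6 to about 14.4 bits per key.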

Commit Hook (Change Data Capture)

Register a callback that fires synchronously after every transaction commit on a column family. The hook receives the full batch of committed operations atomically, enabling real-time change data capture without WAL parsing.

ColumnFamily cf = db.getColumnFamily("my_cf");

cf.setCommitHook((ops, commitSeq) -> {
    for (CommitOp op : ops) {
        if (op.isDelete()) {
            System.out.println("DELETE key=" + new String(op.getKey()));
        } else {
            System.out.println("PUT key=" + new String(op.getKey())
                + " value=" + new String(op.getValue()));
        }
    }
    System.out.println("Commit seq: " + commitSeq);
    return 0;
});

// Normal writes now trigger the hook automatically
try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key1".getBytes(), "value1".getBytes());
    txn.commit();  // hook fires here
}

// Detach hook
cf.clearCommitHook();

The CommitHook functional interface receives a CommitOp[] array and a monotonic commitSeq number. Each CommitOp contains:

getKey() · Key bytes
getValue() · Value bytes (null for deletes)
getTtl() · Time-to-live (-1 for no expiry)
isDelete() · True if this is a delete operation

Behavior

The hook fires after WAL write, memtable apply, and commit status marking, so data is fully durable before the callback runs
Hook failure (non-zero return) is logged but does not roll back the commit
Each column family has its own independent hook; a multi-CF transaction fires the hook once per CF with only that CF’s operations
commitSeq is monotonically increasing and can be used as a replication cursor
The hook executes synchronously on the committing thread, so keep the callback fast to avoid stalling writers
Hooks are runtime-only and not persisted. After a database restart, hooks must be re-registered by the application
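
Since the hook runs on the committing thread, one way to keep it fast is to hand each batch to a bounded queue and do the slow work elsewhere. The sketch below shows that handoff with stand-in types: CommitForwarder and Batch are hypothetical, and Batch carries only a sequence number and an operation count to stay self-contained.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical handoff between a commit hook and a consumer thread; not TidesDB code.
final class CommitForwarder {
    /** Stand-in for one committed batch. */
    static final class Batch {
        final long commitSeq;
        final int opCount;

        Batch(long commitSeq, int opCount) {
            this.commitSeq = commitSeq;
            this.opCount = opCount;
        }
    }

    private final BlockingQueue<Batch> queue;

    CommitForwarder(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Called from the hook; never blocks. Returns 0 if queued, 1 if the queue was full. */
    int onCommit(long commitSeq, int opCount) {
        return queue.offer(new Batch(commitSeq, opCount)) ? 0 : 1;
    }

    /** Called by the consumer; returns the next batch, or null when none is waiting. */
    Batch poll() {
        return queue.poll();
    }
}
```

Inside the hook lambda, the body would then reduce to return forwarder.onCommit(commitSeq, ops.length);. Whether the CommitOp array may be retained after the hook returns is not stated here, so a real handoff should copy the bytes it needs before returning.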

Use cases

Replication · Ship committed batches to replicas in commit order
Event streaming · Publish mutations to Kafka, NATS, or any message broker
Secondary indexing · Maintain a reverse index or materialized view
Audit logging · Record every mutation with key, value, TTL, and sequence number
Debugging · Attach a temporary hook in production to inspect live writes

Multi-Column-Family Transactions

TidesDB supports atomic transactions across multiple column families with true all-or-nothing semantics.

ColumnFamily usersCf = db.getColumnFamily("users");
ColumnFamily ordersCf = db.getColumnFamily("orders");

try (Transaction txn = db.beginTransaction()) {
    txn.put(usersCf, "user:1000".getBytes(), "John Doe".getBytes());
    txn.put(ordersCf, "order:5000".getBytes(), "user:1000|product:A".getBytes());

    txn.commit();
}

Multi-CF guarantees

Either all CFs commit or none do (atomic)
Automatically detected when operations span multiple CFs
Uses global sequence numbers for atomic ordering
Each CF’s WAL receives operations with the same commit sequence number
No two-phase commit or coordinator overhead

Custom Comparators

TidesDB uses comparators to determine the sort order of keys. Once a comparator is set for a column family, it cannot be changed without corrupting data.
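
Under a byte-by-byte ordering such as the default memcmp comparator, the key encoding decides the sort order. Numeric ids written as decimal text do not sort numerically ("10" sorts before "9"), while fixed-width big-endian encoding does preserve unsigned numeric order. The sketch below demonstrates this in plain Java; KeyEncoding is a hypothetical class, and its memcmp method only mirrors the byte ordering rule rather than calling into TidesDB.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of the TidesDB API.
final class KeyEncoding {
    private KeyEncoding() {}

    /** Encodes a 64-bit id as 8 big-endian bytes (ByteBuffer's default byte order). */
    static byte[] bigEndian(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    /** Unsigned byte-by-byte comparison; a shared prefix sorts the shorter array first. */
    static int memcmp(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```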

Built-in Comparators

"memcmp" (default) · Binary byte-by-byte comparison
"lexicographic" · Null-terminated string comparison
"uint64" · Unsigned 64-bit integer comparison
"int64" · Signed 64-bit integer comparison
"reverse" · Reverse binary comparison (descending order)
"case_insensitive" · Case-insensitive ASCII comparison

Registering a Comparator

db.registerComparator("reverse", null);

ColumnFamilyConfig cfConfig = ColumnFamilyConfig.builder()
    .comparatorName("reverse")
    .build();

db.createColumnFamily("sorted_cf", cfConfig);

Database Backup

Create an on-disk snapshot without blocking normal reads/writes:

db.backup("./mydb_backup");

Database Checkpoint

Create a lightweight, near-instant snapshot of an open database using hard links instead of copying SSTable data:

db.checkpoint("./mydb_checkpoint");

Checkpoint vs Backup

	`backup()`	`checkpoint()`
Speed	Copies every SSTable byte-by-byte	Near-instant (hard links, O(1) per file)
Disk usage	Full independent copy	No extra disk until compaction removes old SSTables
Portability	Can be moved to another filesystem or machine	Same filesystem only (hard link requirement)
Use case	Archival, disaster recovery, remote shipping	Fast local snapshots, point-in-time reads, streaming backups

Behavior

Requires the directory to be non-existent or empty
For each column family: flushes the active memtable, halts compactions, hard links all SSTable files, copies small metadata files, then resumes compactions
Falls back to file copy if hard linking fails (e.g., cross-filesystem)
Database stays open and usable during checkpoint
The checkpoint can be opened as a normal TidesDB database with TidesDB.open()
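
The link-then-copy fallback described above can be illustrated with standard Java NIO. This is only a sketch of the general technique, not TidesDB's checkpoint code; LinkOrCopy and the file names in demo are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Sketch of "hard link, else copy"; not TidesDB's implementation.
final class LinkOrCopy {
    private LinkOrCopy() {}

    /** Hard-links target to source; copies instead if linking fails. Returns true if linked. */
    static boolean linkOrCopy(Path source, Path target) {
        try {
            Files.createLink(target, source);
            return true;
        } catch (UnsupportedOperationException | IOException linkFailure) {
            try {
                Files.copy(source, target);
                return false;
            } catch (IOException copyFailure) {
                throw new UncheckedIOException(copyFailure);
            }
        }
    }

    /** Small self-check: links or copies a temp file and compares the contents. */
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("link-or-copy");
            Path source = Files.write(dir.resolve("data.bin"), new byte[] {1, 2, 3});
            Path target = dir.resolve("data-snapshot.bin");
            linkOrCopy(source, target);
            return Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Either path leaves a readable file at the target; only the linked case avoids duplicating the data on disk.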

Renaming Column Families

Atomically rename a column family:

db.renameColumnFamily("old_name", "new_name");

Cloning Column Families

Create a complete copy of an existing column family with a new name. The clone is completely independent; modifications to one do not affect the other.

db.cloneColumnFamily("source_cf", "cloned_cf");

ColumnFamily original = db.getColumnFamily("source_cf");
ColumnFamily clone = db.getColumnFamily("cloned_cf");

Use cases

Testing · Create a copy of production data for testing without affecting the original
Branching · Create a snapshot of data before making experimental changes
Migration · Clone data before schema or configuration changes
Backup verification · Clone and verify data integrity without modifying the source

B+tree KLog Format (Optional)

Column families can optionally use a B+tree structure for the key log instead of the default block-based format. The B+tree klog format offers faster point lookups through O(log N) tree traversal.

ColumnFamilyConfig btreeConfig = ColumnFamilyConfig.builder()
    .writeBufferSize(128 * 1024 * 1024)
    .compressionAlgorithm(CompressionAlgorithm.LZ4_COMPRESSION)
    .enableBloomFilter(true)
    .useBtree(true)
    .build();

db.createColumnFamily("btree_cf", btreeConfig);

ColumnFamily cf = db.getColumnFamily("btree_cf");

try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key".getBytes(), "value".getBytes());
    txn.commit();
}

Stats stats = cf.getStats();
if (stats.isUseBtree()) {
    System.out.println("B+tree nodes: " + stats.getBtreeTotalNodes());
    System.out.println("B+tree max height: " + stats.getBtreeMaxHeight());
    System.out.println("B+tree avg height: " + stats.getBtreeAvgHeight());
}

When to use B+tree klog format

Read-heavy workloads with frequent point lookups
Workloads where read latency is more important than write throughput
Large SSTables where block scanning becomes expensive

Tradeoffs

Slightly higher write amplification during flush
Larger metadata overhead per node
Block-based format may be faster for sequential scans

Transaction Isolation Levels

try (Transaction txn = db.beginTransaction(IsolationLevel.SERIALIZABLE)) {
    txn.commit();
}

Available Isolation Levels

READ_UNCOMMITTED · Sees all data including uncommitted changes
READ_COMMITTED · Sees only committed data (default)
REPEATABLE_READ · Consistent snapshot, phantom reads possible
SNAPSHOT · Write-write conflict detection
SERIALIZABLE · Full read-write conflict detection (SSI)
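
At the levels that detect conflicts, a commit can fail with a conflict error (ERR_CONFLICT, value -7, in the error code table), and the usual response is to rerun the whole transaction. The sketch below shows a bounded retry loop over plain integer codes. ConflictRetry is a hypothetical class, and how the Java binding exposes the code on TidesDBException is not covered here, so the attempt is modeled as an IntSupplier.

```java
import java.util.function.IntSupplier;

// Hypothetical retry helper; not part of the TidesDB API.
final class ConflictRetry {
    static final int ERR_SUCCESS = 0;
    static final int ERR_CONFLICT = -7;

    private ConflictRetry() {}

    /**
     * Runs attempt until it returns something other than ERR_CONFLICT or
     * maxAttempts is reached, and returns the last code seen.
     */
    static int run(IntSupplier attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        int code = ERR_CONFLICT;
        for (int i = 0; i < maxAttempts && code == ERR_CONFLICT; i++) {
            code = attempt.getAsInt();
        }
        return code;
    }
}
```

Each attempt should begin a fresh transaction, redo its reads and writes, and commit, so that the retry sees the state that caused the conflict.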

Savepoints

Savepoints allow partial rollback within a transaction:

try (Transaction txn = db.beginTransaction()) {
    txn.put(cf, "key1".getBytes(), "value1".getBytes());

    txn.savepoint("sp1");
    txn.put(cf, "key2".getBytes(), "value2".getBytes());

    txn.rollbackToSavepoint("sp1");

    txn.commit();
}

Savepoint API

savepoint(name) · Create a savepoint
rollbackToSavepoint(name) · Rollback to savepoint
releaseSavepoint(name) · Release savepoint without rolling back
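
One way to picture these three calls is as marks on a list of pending writes: a savepoint records the current length, rollback truncates back to it, and release forgets the mark. The toy model below follows that picture. SavepointModel is hypothetical and says nothing about how TidesDB implements savepoints internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of savepoints over a pending-write list; not TidesDB code.
final class SavepointModel {
    private final List<String> pending = new ArrayList<>();
    private final Map<String, Integer> marks = new HashMap<>();

    /** Buffers a write. */
    void put(String key) {
        pending.add(key);
    }

    /** Remembers how many writes were buffered when the savepoint was taken. */
    void savepoint(String name) {
        marks.put(name, pending.size());
    }

    /** Discards every write buffered after the named savepoint. */
    void rollbackToSavepoint(String name) {
        Integer mark = marks.get(name);
        if (mark == null) {
            throw new IllegalArgumentException("unknown savepoint: " + name);
        }
        pending.subList(mark, pending.size()).clear();
    }

    /** Forgets the savepoint while keeping all buffered writes. */
    void releaseSavepoint(String name) {
        marks.remove(name);
    }

    /** Returns a copy of the writes that a commit would apply. */
    List<String> pendingKeys() {
        return new ArrayList<>(pending);
    }
}
```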

Transaction Reset

The reset method returns a committed or aborted transaction to a usable state under the isolation level passed to it. This avoids the overhead of freeing and reallocating transaction resources in hot loops.

ColumnFamily cf = db.getColumnFamily("my_cf");

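Transaction txn = db.beginTransaction();
txn.put(cf, "key1".getBytes(), "value1".getBytes());
txn.commit();

// Reuse the committed transaction instead of freeing it and beginning a new one
txn.reset(IsolationLevel.READ_COMMITTED);

txn.put(cf, "key2".getBytes(), "value2".getBytes());
txn.commit();

// Release the transaction's resources once it is no longer needed
txn.free();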

Behavior

The transaction must be committed or aborted before reset; resetting an active transaction throws TidesDBException
Internal buffers are retained to avoid reallocation
A fresh transaction ID and snapshot sequence are assigned based on the new isolation level
The isolation level can be changed on each reset (e.g., READ_COMMITTED to REPEATABLE_READ)

When to use

Batch processing · Reuse a single transaction across many commit cycles in a loop
Connection pooling · Reset a transaction for a new request without reallocation
High-throughput ingestion · Reduce allocation overhead in tight write loops

Reset vs Free + Begin

For a single transaction, reset is functionally equivalent to calling free followed by beginTransaction. The difference is performance: reset retains allocated buffers and avoids repeated allocation overhead. This matters most in loops that commit and restart thousands of transactions.

Configuration Options

Database Configuration

Option	Type	Default	Description
`dbPath`	String	-	Path to the database directory
`numFlushThreads`	int	2	Number of flush threads
`numCompactionThreads`	int	2	Number of compaction threads
`logLevel`	LogLevel	INFO	Logging level
`blockCacheSize`	long	64MB	Block cache size in bytes
`maxOpenSSTables`	long	256	Maximum open SSTable files
`logToFile`	boolean	false	Write logs to file instead of stderr
`logTruncationAt`	long	24MB	Log file truncation size (0 = no truncation)
`maxMemoryUsage`	long	0	Global memory limit in bytes (0 = auto, 50% of system RAM)

Column Family Configuration

Option	Type	Default	Description
`writeBufferSize`	long	128MB	Memtable flush threshold
`levelSizeRatio`	long	10	Level size multiplier
`minLevels`	int	5	Minimum LSM levels
`dividingLevelOffset`	int	2	Compaction dividing level offset
`klogValueThreshold`	long	512	Values > threshold go to vlog
`compressionAlgorithm`	CompressionAlgorithm	LZ4_COMPRESSION	Compression algorithm
`enableBloomFilter`	boolean	true	Enable bloom filters
`bloomFPR`	double	0.01	Bloom filter false positive rate (1%)
`enableBlockIndexes`	boolean	true	Enable compact block indexes
`indexSampleRatio`	int	1	Sample every block for index
`blockIndexPrefixLen`	int	16	Block index prefix length
`syncMode`	SyncMode	SYNC_FULL	Sync mode for durability
`syncIntervalUs`	long	1000000	Sync interval (1 second, for SYNC_INTERVAL)
`comparatorName`	String	""	Custom comparator name (empty = memcmp)
`skipListMaxLevel`	int	12	Skip list max level
`skipListProbability`	float	0.25	Skip list probability
`defaultIsolationLevel`	IsolationLevel	READ_COMMITTED	Default transaction isolation
`minDiskSpace`	long	100MB	Minimum disk space required
`l1FileCountTrigger`	int	4	L1 file count trigger for compaction
`l0QueueStallThreshold`	int	20	L0 queue stall threshold
`useBtree`	boolean	false	Use B+tree format for klog (faster point lookups)

Compression Algorithms

Algorithm	Value	Description
`NO_COMPRESSION`	0	No compression
`SNAPPY_COMPRESSION`	1	Snappy compression
`LZ4_COMPRESSION`	2	LZ4 standard compression (default)
`ZSTD_COMPRESSION`	3	Zstandard compression (best ratio)
`LZ4_FAST_COMPRESSION`	4	LZ4 fast mode (higher throughput)

Sync Modes

Mode	Description
`SYNC_NONE`	No explicit sync, relies on OS page cache (fastest)
`SYNC_FULL`	Fsync on every write (most durable)
`SYNC_INTERVAL`	Periodic background syncing at configurable intervals

Error Codes

Code	Value	Description
`ERR_SUCCESS`	0	Operation completed successfully
`ERR_MEMORY`	-1	Memory allocation failed
`ERR_INVALID_ARGS`	-2	Invalid arguments passed
`ERR_NOT_FOUND`	-3	Key not found
`ERR_IO`	-4	I/O operation failed
`ERR_CORRUPTION`	-5	Data corruption detected
`ERR_EXISTS`	-6	Resource already exists
`ERR_CONFLICT`	-7	Transaction conflict detected
`ERR_TOO_LARGE`	-8	Key or value size exceeds maximum
`ERR_MEMORY_LIMIT`	-9	Memory limit exceeded
`ERR_INVALID_DB`	-10	Database handle is invalid
`ERR_UNKNOWN`	-11	Unknown error
`ERR_LOCKED`	-12	Database is locked
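
When logging or branching on these values, a small lookup keeps the numbers out of application code. The enum below simply transcribes the table above; the name TidesErrorCode and the fromValue method are hypothetical and not part of the TidesDB Java binding.

```java
// Hypothetical mirror of the error code table; not part of the TidesDB API.
enum TidesErrorCode {
    SUCCESS(0),
    MEMORY(-1),
    INVALID_ARGS(-2),
    NOT_FOUND(-3),
    IO(-4),
    CORRUPTION(-5),
    EXISTS(-6),
    CONFLICT(-7),
    TOO_LARGE(-8),
    MEMORY_LIMIT(-9),
    INVALID_DB(-10),
    UNKNOWN(-11),
    LOCKED(-12);

    final int value;

    TidesErrorCode(int value) {
        this.value = value;
    }

    /** Maps a raw code to its constant; unrecognized values map to UNKNOWN. */
    static TidesErrorCode fromValue(int value) {
        for (TidesErrorCode code : values()) {
            if (code.value == value) {
                return code;
            }
        }
        return UNKNOWN;
    }
}
```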

Testing

# Run all tests
mvn test

# Run specific test
mvn test -Dtest=TidesDBTest#testOpenClose

# Run with verbose output
mvn test -X

Building from Source

# Clone the repository
git clone https://github.com/tidesdb/tidesdb-java.git
cd tidesdb-java

# Build the JNI library
cd src/main/c
cmake -S . -B build
cmake --build build
sudo cmake --install build
cd ../../..

# Build the Java package
mvn clean package

# Install to local Maven repository
mvn install