A useful starting template (check that it matches your code version): https://github.com/cobble-project/cobble/blob/main/template/config.yaml
Storage Feature Flags (OpenDAL)
Cobble exposes a focused set of optional OpenDAL backend features.
Local file:// is always enabled (no feature required)
Enable all optional backends: storage-all
Workspace crates that depend on cobble (cobble-cli, cobble-web-monitor, cobble-cluster, cobble-bench, cobble-data-structure, cobble-java) re-expose and forward the same storage-* features.
Optional feature mapping:
| Cobble Feature | OpenDAL Service |
| --- | --- |
| storage-alluxio | services-alluxio |
| storage-cos | services-cos |
| storage-ftp | services-ftp |
| storage-hdfs | services-hdfs |
| storage-oss | services-oss |
| storage-s3 | services-s3 |
| storage-sftp | services-sftp |
Windows note: storage-hdfs and storage-sftp are currently unsupported.
Max immutable+L0 files before stall. Auto: min(l0+4, l0×2)
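| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| l1_base_bytes | Size | 64MiB | Target size for level 1 |
| level_size_multiplier | usize | 10 | Size multiplier per level |
| max_level | u8 | 6 | Maximum number of LSM levels |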
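As a rough illustration of how these level parameters interact, the sketch below computes a per-level target size and the automatic stall limit. It assumes the conventional leveled-LSM rule that level n targets l1_base_bytes × level_size_multiplier^(n-1), and that l0 in the Auto formula refers to a configured L0 file limit. Both function names are hypothetical and are not part of Cobble's API.

```rust
/// Hypothetical helper: target size in bytes for `level` (1-based),
/// assuming each level is `multiplier` times larger than the previous one.
/// Returns None for level 0 (this sketch only defines targets for level 1
/// and above) or when the result would overflow u64.
fn level_target_bytes(l1_base_bytes: u64, multiplier: u64, level: u8) -> Option<u64> {
    if level == 0 {
        return None;
    }
    let factor = multiplier.checked_pow(u32::from(level) - 1)?;
    l1_base_bytes.checked_mul(factor)
}

/// Hypothetical helper mirroring the documented "Auto" rule: min(l0 + 4, l0 * 2).
/// `l0_file_limit` is assumed to be the configured L0 file limit.
fn auto_stall_limit(l0_file_limit: usize) -> usize {
    (l0_file_limit + 4).min(l0_file_limit * 2)
}
```

Under these assumptions and the defaults above (64MiB base, multiplier 10), level 2 would target 640MiB and level 3 would target 6400MiB.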
SST Options
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_file_size | Size | 64MiB | Target output file size |
| sst_bloom_filter_enabled | bool | false | Enable bloom filter per SST file |
| sst_bloom_bits_per_key | u32 | 10 | Bits per key for bloom filter |
| sst_partitioned_index | bool | false | Enable two-level partitioned index |
| sst_data_block_restart_interval | usize | 16 | Restart interval in SST data blocks (>1 enables prefix compression, 1 disables; range 1..=65535) |
| sst_compression_by_level | Vec<SstCompressionAlgorithm> | [None, None, Lz4] | Compression per level |
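To make the per-level compression list concrete, here is a minimal sketch of how a level index could be resolved against it. The Compression enum is a stand-in for SstCompressionAlgorithm, the function name is hypothetical, and the fallback to the last entry for deeper levels is an assumption of this sketch, not behavior confirmed by this document.

```rust
/// Stand-in for the SstCompressionAlgorithm variants shown in the default value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    None,
    Lz4,
}

/// Hypothetical lookup: pick the algorithm configured for `level`.
/// Levels past the end of the list reuse the last entry (an assumption),
/// and an empty list means no compression.
fn compression_for_level(by_level: &[Compression], level: usize) -> Compression {
    by_level
        .get(level)
        .or_else(|| by_level.last())
        .copied()
        .unwrap_or(Compression::None)
}
```

With the default list, levels 0 and 1 would be written uncompressed and level 2 with Lz4 under this reading.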
Parquet
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| parquet_row_group_size_bytes | Size | 256KiB | Row group size |
Block Cache
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| block_cache_size | Size | 64MiB | In-memory cache size (0 also disables cache) |
| block_cache_hybrid_enabled | bool | false | Enable memory + disk hybrid cache |
| block_cache_hybrid_disk_size | Option<Size> | None | Disk tier capacity (defaults to memory size) |
Compaction
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| compaction_policy | CompactionPolicyKind | RoundRobin | Policy: RoundRobin, MinOverlap, or ScorePriority |
| compaction_read_ahead_enabled | bool | true | Buffered reads during compaction |
| compaction_remote_addr | Option<String> | None | Remote compaction server address (host:port) |
| compaction_threads | usize | 4 | Compaction thread pool size |
| compaction_remote_timeout_ms | u64 | 300,000 | Remote compaction timeout (milliseconds) |
| compaction_server_max_concurrent | usize | 4 | Max concurrent tasks on remote server |
| compaction_server_max_queued | usize | 64 | Max queued tasks before rejecting |
compaction_policy accepts:
round_robin - rotate through files in oversized non-L0 levels
min_overlap - choose the next file with the smallest overlap in the next level
score_priority - prefer the highest-scored level first, then pick files in a RocksDB-style min-overlap order with a per-level cursor and RocksDB-style trivial-move gating
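The min_overlap idea can be sketched as follows. This is an illustration only: keys are simplified to u64, overlap is measured in bytes of overlapping next-level files (an assumption; a real policy could count files instead), and all type and function names are hypothetical rather than Cobble's.

```rust
/// Inclusive key range of an SST file, with keys simplified to u64.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileRange {
    min_key: u64,
    max_key: u64,
    size_bytes: u64,
}

/// True when two inclusive key ranges share at least one key.
fn overlaps(a: &FileRange, b: &FileRange) -> bool {
    a.min_key <= b.max_key && b.min_key <= a.max_key
}

/// Total bytes of the files in `next_level` whose key ranges overlap `file`.
fn overlap_bytes(file: &FileRange, next_level: &[FileRange]) -> u64 {
    next_level
        .iter()
        .filter(|f| overlaps(file, f))
        .map(|f| f.size_bytes)
        .sum()
}

/// Returns the index of the candidate with the smallest overlap in the next
/// level, or None if there are no candidates. Ties go to the earliest candidate.
fn pick_min_overlap(candidates: &[FileRange], next_level: &[FileRange]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .min_by_key(|(_, c)| overlap_bytes(c, next_level))
        .map(|(i, _)| i)
}
```

Picking the file with the least overlap keeps the amount of next-level data rewritten per compaction small, which is the usual motivation for this kind of policy.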
Value Separation
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| value_separation_threshold | Option<Size> | None | Byte threshold for VLOG separation (None = disabled) |
Ratio for incremental memtable snapshots (0 = disabled)
Governance
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| governance_mode | GovernanceMode | Filesystem | Writable DB governance mode: Filesystem uses the manifest-backed ownership registry, Noop disables Cobble-side registration |
governance_mode accepts:
filesystem - default mode. Writable Db opens register their bucket ranges into the governance manifest stored in the Meta volume and reject overlaps with other registered shards.
noop - skip governance registration and unregistration entirely. Choose this only when exclusive bucket ownership is already enforced by the embedding runtime or deployment orchestration.
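The overlap check that filesystem mode performs can be pictured with a small in-memory model. This is only a sketch: the real registry lives in the governance manifest on the Meta volume, and the BucketRegistry type, its methods, and the owner strings below are hypothetical.

```rust
use std::ops::RangeInclusive;

/// In-memory stand-in for the governance manifest: each entry is an owner
/// name plus the bucket range that owner registered.
#[derive(Default)]
struct BucketRegistry {
    entries: Vec<(String, RangeInclusive<u32>)>,
}

impl BucketRegistry {
    /// Registers `range` for `owner`, rejecting any overlap with a range
    /// that is already registered.
    fn register(&mut self, owner: &str, range: RangeInclusive<u32>) -> Result<(), String> {
        for (other, taken) in &self.entries {
            if range.start() <= taken.end() && taken.start() <= range.end() {
                return Err(format!(
                    "buckets {:?} overlap {:?} owned by {}",
                    range, taken, other
                ));
            }
        }
        self.entries.push((owner.to_string(), range));
        Ok(())
    }

    /// Removes every range registered by `owner`.
    fn unregister(&mut self, owner: &str) {
        self.entries.retain(|(name, _)| name.as_str() != owner);
    }
}
```

In noop mode neither step runs, so a check like this has to be enforced by the embedding runtime or the deployment orchestration instead.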
Schema
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| num_columns | usize | 1 | Initial number of columns in the default column family when creating a new DB |
| total_buckets | u32 | 1 | Total buckets for sharding (1–65536) |
Named column families are added later through schema evolution. Reopen, restore, read-only, and compaction paths use the persisted schema rather than reapplying num_columns.
Volume Offload
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| primary_volume_write_stop_watermark | f64 | 0.95 | Usage ratio to stop writes |
| primary_volume_offload_trigger_watermark | f64 | 0.85 | Usage ratio to trigger offload |
| primary_volume_offload_policy | PrimaryVolumeOffloadPolicyKind | Priority | Policy: LargestFile or Priority |
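The relationship between the two watermarks can be sketched as a simple classification of the primary volume's usage ratio. The enum, the function name, and the choice of inclusive (>=) comparisons are assumptions made for illustration; this document does not specify how Cobble compares against the thresholds.

```rust
/// Action suggested by the current usage ratio of the primary volume.
#[derive(Debug, PartialEq, Eq)]
enum VolumeAction {
    Normal,
    TriggerOffload,
    StopWrites,
}

/// Hypothetical classifier: compares used / capacity against the offload
/// trigger watermark and the write stop watermark.
fn classify_usage(used: u64, capacity: u64, offload_trigger: f64, write_stop: f64) -> VolumeAction {
    if capacity == 0 {
        // A volume with no capacity cannot accept writes in this sketch.
        return VolumeAction::StopWrites;
    }
    let ratio = used as f64 / capacity as f64;
    if ratio >= write_stop {
        VolumeAction::StopWrites
    } else if ratio >= offload_trigger {
        VolumeAction::TriggerOffload
    } else {
        VolumeAction::Normal
    }
}
```

Keeping the offload trigger below the write stop watermark, as the defaults do (0.85 versus 0.95), leaves headroom for offload to free space before writes are blocked.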
LSM Splitting
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lsm_split_trigger_level | Option<u8> | None | Level that triggers LSM tree splitting |
Logging
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| log_path | Option<String> | None | Log file path (must be local) |
| log_max_file_size | Size | 10MiB | Maximum size of the active log file before rollover |
| log_keep_files | usize | 3 | Total number of log files retained, including the active file |
| log_console | bool | false | Enable console logging |
| log_level | log::LevelFilter | Info | Trace, Debug, Info, Warn, Error, Off |
Java JNI Direct Buffer
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| jni_direct_buffer_size | Size | 2KiB | Capacity of each pooled direct ByteBuffer used by Java direct get/scan APIs and structured direct APIs (Db.getDirect*, Db.scanDirect*, io.cobble.structured.Db.getDirect*, io.cobble.structured.Db.scanDirect*) |
| jni_direct_buffer_pool_size | usize | 64 | Maximum number of pooled direct buffers kept per Java process for raw + structured Java direct APIs |
CoordinatorConfig
Configuration for DbCoordinator.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| volumes | Vec<VolumeDescriptor> | Single local volume | Storage volumes for global manifests |
| snapshot_retention | Option<usize> | None | Auto-expire old global snapshots |
VolumeDescriptor
Describes a storage volume.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| base_dir | String | (required) | Base directory URL (file://, s3://, etc.); S3 supports URL-encoded endpoint/root hints (for example s3://127.0.0.1:9000/bucket/prefix?endpoint_scheme=http&region=us-east-1) |
| access_id | Option<String> | None | Access ID for remote storage |
| secret_key | Option<String> | None | Secret key for remote storage |
| size_limit | Option<Size> | None | Maximum volume size |
| custom_options | Option<HashMap<String, String>> | None | Backend-specific initialization options passed to OpenDAL |
| kinds | u8 | 0 | Bitmask of VolumeUsageKind values |
To pass custom options to OpenDAL for a specific backend, use the custom_options field. The supported options for each backend are listed in the OpenDAL documentation. For S3, for example, you can set endpoint, region, and so on.
VolumeUsageKind
| Kind | Value | Description |
| --- | --- | --- |
| Meta | 0 | Metadata files (manifests, schemas) |
| PrimaryDataPriorityHigh | 1 | High-priority data (SST, Parquet, VLOG) |
| PrimaryDataPriorityMedium | 2 | Medium-priority data |
| PrimaryDataPriorityLow | 3 | Low-priority data |
| Snapshot | 4 | Snapshot materialization |
| Cache | 5 | Block cache disk tier |
| Readonly | 6 | Read-only data source |
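Since kinds is a u8 bitmask and the kind values run from 0 to 6, one plausible encoding is one bit per kind. The sketch below assumes each kind sets bit 1 << value; that encoding, the local mirror of the enum, and both helper functions are assumptions for illustration and should be checked against the Cobble source before relying on raw mask values.

```rust
/// Local mirror of the documented kinds and their values.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum VolumeUsageKind {
    Meta = 0,
    PrimaryDataPriorityHigh = 1,
    PrimaryDataPriorityMedium = 2,
    PrimaryDataPriorityLow = 3,
    Snapshot = 4,
    Cache = 5,
    Readonly = 6,
}

/// Assumed encoding: each kind sets bit `1 << value` in the u8 mask.
fn kinds_mask(kinds: &[VolumeUsageKind]) -> u8 {
    kinds
        .iter()
        .fold(0u8, |mask, k| mask | (1u8 << (*k as u8)))
}

/// Tests whether `kind` is present in `mask` under the same assumed encoding.
fn has_kind(mask: u8, kind: VolumeUsageKind) -> bool {
    (mask & (1u8 << (kind as u8))) != 0
}
```

Under this assumption, a volume serving both Meta and PrimaryDataPriorityHigh would carry the mask 0b0000_0011.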
Helper Methods
| Method | Description |
| --- | --- |
| VolumeDescriptor::single_volume(url) | Create single volume with PrimaryDataPriorityHigh + Meta |
| VolumeDescriptor::new(url, kinds) | Create volume with specified usage kinds list (Vec<VolumeUsageKind>) |