A useful starting template (check that it matches the code version you are running): https://github.com/cobble-project/cobble/blob/main/template/config.yaml
Storage Feature Flags (OpenDAL)
Cobble exposes a focused set of optional OpenDAL backend features.
Local file:// is always enabled (no feature required)
Enable all optional backends: storage-all
Workspace crates that depend on cobble (cobble-cli, cobble-web-monitor, cobble-cluster, cobble-bench, cobble-data-structure, cobble-java) re-expose and forward the same storage-* features.
Optional feature mapping:

| Cobble Feature | OpenDAL Service |
| --- | --- |
| storage-alluxio | services-alluxio |
| storage-cos | services-cos |
| storage-ftp | services-ftp |
| storage-hdfs | services-hdfs |
| storage-oss | services-oss |
| storage-s3 | services-s3 |
| storage-sftp | services-sftp |
Windows note: storage-hdfs and storage-sftp are currently unsupported.
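| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| | | | Max immutable+L0 files before stall. Auto: min(l0+4, l0×2) |
| l1_base_bytes | Size | 64MiB | Target size for level 1 |
| level_size_multiplier | usize | 10 | Size multiplier per level |
| max_level | u8 | 6 | Maximum number of LSM levels |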
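
The sizing parameters above can be illustrated with a small sketch. It assumes the usual leveled-LSM convention that each level's target is `level_size_multiplier` times the previous one, starting from `l1_base_bytes` at level 1; Cobble's actual computation may differ. Both function names are hypothetical, and `l0` stands for whatever L0 file limit the "Auto" rule refers to.

```rust
/// Hypothetical helper: stall limit derived from an L0 file limit `l0`,
/// following the "Auto: min(l0+4, l0×2)" rule quoted in the table.
fn auto_stall_limit(l0: usize) -> usize {
    (l0 + 4).min(l0 * 2)
}

/// Assumed geometric sizing: level 1 targets `l1_base_bytes`, and every
/// deeper level is `multiplier` times larger than the level above it.
fn level_target_bytes(l1_base_bytes: u64, multiplier: u64, level: u8) -> u64 {
    // This sketch only sizes levels 1 and deeper.
    assert!(level >= 1, "level must be at least 1");
    let factor = multiplier.saturating_pow(u32::from(level) - 1);
    l1_base_bytes.saturating_mul(factor)
}

fn main() {
    const MIB: u64 = 1024 * 1024;
    // Defaults from the table: 64MiB base, multiplier 10, max_level 6.
    for level in 1..=6u8 {
        let bytes = level_target_bytes(64 * MIB, 10, level);
        println!("L{level}: {} MiB", bytes / MIB);
    }
    println!("stall limit for l0 = 4: {}", auto_stall_limit(4));
}
```

Under these assumptions the defaults grow tenfold per level, so the deepest level dominates total capacity.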
SST Options
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| base_file_size | Size | 64MiB | Target output file size |
| sst_bloom_filter_enabled | bool | false | Enable bloom filter per SST file |
| sst_bloom_bits_per_key | u32 | 10 | Bits per key for bloom filter |
| sst_partitioned_index | bool | false | Enable two-level partitioned index |
| sst_compression_by_level | Vec<SstCompressionAlgorithm> | [None, None, Lz4] | Compression per level |
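
To get a feel for `sst_bloom_bits_per_key`, the sketch below evaluates the textbook approximation for a Bloom filter that uses the optimal number of hash functions, fpr ≈ exp(-(ln 2)² · bits_per_key). This is a general property of Bloom filters, not a statement about Cobble's implementation, which may use a different hash count and therefore a somewhat different rate. The function name is hypothetical.

```rust
/// Approximate false-positive rate of a Bloom filter with an optimal number
/// of hash functions: fpr ≈ exp(-(ln 2)^2 * bits_per_key).
fn bloom_false_positive_rate(bits_per_key: u32) -> f64 {
    let ln2 = std::f64::consts::LN_2;
    (-(ln2 * ln2) * f64::from(bits_per_key)).exp()
}

fn main() {
    // Compare a few bits-per-key budgets around the default of 10.
    for bits in [6u32, 8, 10, 12, 16] {
        let percent = bloom_false_positive_rate(bits) * 100.0;
        println!("{bits} bits/key -> about {percent:.3}% false positives");
    }
}
```

With the default of 10 bits per key the approximation gives a rate a little under 1%; each extra bit per key costs memory and filter size in exchange for fewer wasted SST reads.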
Parquet
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| parquet_row_group_size_bytes | Size | 256KiB | Row group size |
Block Cache
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| block_cache_size | Size | 64MiB | In-memory cache size (0 also disables cache) |
| block_cache_hybrid_enabled | bool | false | Enable memory + disk hybrid cache |
| block_cache_hybrid_disk_size | Option<Size> | None | Disk tier capacity (defaults to memory size) |
Compaction
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| compaction_policy | CompactionPolicyKind | RoundRobin | Policy: RoundRobin or MinOverlap |
| compaction_read_ahead_enabled | bool | true | Buffered reads during compaction |
| compaction_remote_addr | Option<String> | None | Remote compaction server address (host:port) |
| compaction_threads | usize | 4 | Compaction thread pool size |
| compaction_remote_timeout_ms | u64 | 300,000 | Remote compaction timeout (milliseconds) |
| compaction_server_max_concurrent | usize | 4 | Max concurrent tasks on remote server |
| compaction_server_max_queued | usize | 64 | Max queued tasks before rejecting |
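
The interaction between `compaction_server_max_concurrent` and `compaction_server_max_queued` can be pictured as a simple admission check. The sketch below is only an illustration of that reading of the table: the `Admission`, `ServerLimits`, and `admit` names are hypothetical, and the real server may apply additional rules.

```rust
/// Outcome of offering one compaction task to the server.
#[derive(Debug, PartialEq)]
enum Admission {
    Run,
    Queue,
    Reject,
}

/// Hypothetical mirror of the two server-side limits from the table.
struct ServerLimits {
    max_concurrent: usize,
    max_queued: usize,
}

/// Assumed policy: run while a slot is free, queue while the queue has
/// room, and reject once both limits are reached.
fn admit(limits: &ServerLimits, running: usize, queued: usize) -> Admission {
    if running < limits.max_concurrent {
        Admission::Run
    } else if queued < limits.max_queued {
        Admission::Queue
    } else {
        Admission::Reject
    }
}

fn main() {
    // Defaults from the table: 4 concurrent tasks, 64 queued tasks.
    let limits = ServerLimits { max_concurrent: 4, max_queued: 64 };
    for (running, queued) in [(2, 0), (4, 10), (4, 64)] {
        println!("running={running} queued={queued} -> {:?}", admit(&limits, running, queued));
    }
}
```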
Value Separation
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| value_separation_threshold | Option<Size> | None | Byte threshold for VLOG separation (None = disabled) |
TTL
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ttl_enabled | bool | false | Enable TTL metadata processing |
| default_ttl_seconds | Option<u32> | None | Default TTL for entries (None = no expiration) |
| time_provider | TimeProviderKind | System | Time source: System or Manual |
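
The following sketch shows how a TTL check might combine `default_ttl_seconds` with a pluggable time source. It is not Cobble's API: the `TimeProvider` trait, `SystemClock`, `ManualClock`, and `is_expired` are hypothetical stand-ins for the System and Manual options, and treating an entry as expired exactly at `written_at + ttl` is an assumption.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical abstraction over the System / Manual time sources.
trait TimeProvider {
    fn now_seconds(&self) -> u64;
}

/// Reads the wall clock, like the System option.
struct SystemClock;

impl TimeProvider for SystemClock {
    fn now_seconds(&self) -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0)
    }
}

/// Returns a fixed, caller-controlled time, like the Manual option.
struct ManualClock(u64);

impl TimeProvider for ManualClock {
    fn now_seconds(&self) -> u64 {
        self.0
    }
}

/// `None` means the entry never expires; otherwise it expires once the
/// clock reaches `written_at + ttl` (boundary handling is an assumption).
fn is_expired(written_at: u64, ttl_seconds: Option<u32>, clock: &dyn TimeProvider) -> bool {
    match ttl_seconds {
        None => false,
        Some(ttl) => clock.now_seconds() >= written_at.saturating_add(u64::from(ttl)),
    }
}

fn main() {
    let manual = ManualClock(1_000);
    println!("ttl=60, written at 900: expired = {}", is_expired(900, Some(60), &manual));
    println!("no ttl, system clock: expired = {}", is_expired(900, None, &SystemClock));
}
```

A manually controlled clock like this is mainly useful in tests, where expiry has to be deterministic.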
Snapshots
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| snapshot_on_flush | bool | false | Auto-snapshot after each memtable flush |
| snapshot_retention | Option<usize> | None | Keep only N most recent snapshots |
| active_memtable_incremental_snapshot_ratio | f64 | 0.0 | Ratio for incremental memtable snapshots (0 = disabled) |
Schema
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| num_columns | usize | 1 | Number of value columns per key |
| total_buckets | u32 | 1 | Total buckets for sharding (1–65536) |
Volume Offload
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| primary_volume_write_stop_watermark | f64 | 0.95 | Usage ratio to stop writes |
| primary_volume_offload_trigger_watermark | f64 | 0.85 | Usage ratio to trigger offload |
| primary_volume_offload_policy | PrimaryVolumeOffloadPolicyKind | Priority | Policy: LargestFile or Priority |
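
The two watermarks describe three operating zones for the primary volume. The sketch below classifies a usage ratio into those zones; the `VolumePressure` enum and `classify` function are hypothetical, and the choice of inclusive thresholds is an assumption.

```rust
/// Zone the primary volume is in, given its usage ratio.
#[derive(Debug, PartialEq)]
enum VolumePressure {
    Normal,
    Offload,
    StopWrites,
}

/// Compare `used / limit` against the offload and write-stop watermarks.
/// A zero `limit` is treated as "no limit" in this sketch.
fn classify(used: u64, limit: u64, offload_watermark: f64, stop_watermark: f64) -> VolumePressure {
    if limit == 0 {
        return VolumePressure::Normal;
    }
    let ratio = used as f64 / limit as f64;
    if ratio >= stop_watermark {
        VolumePressure::StopWrites
    } else if ratio >= offload_watermark {
        VolumePressure::Offload
    } else {
        VolumePressure::Normal
    }
}

fn main() {
    // Defaults from the table: offload at 0.85, stop writes at 0.95.
    for used in [50u64, 90, 97] {
        println!("{used}% used -> {:?}", classify(used, 100, 0.85, 0.95));
    }
}
```

Keeping the offload trigger below the write-stop watermark leaves headroom for offloading to free space before writes are blocked.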
LSM Splitting
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lsm_split_trigger_level | Option<u8> | None | Level that triggers LSM tree splitting |
Logging
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| log_path | Option<String> | None | Log file path (must be local) |
| log_console | bool | false | Enable console logging |
| log_level | log::LevelFilter | Info | Trace, Debug, Info, Warn, Error, Off |
CoordinatorConfig
Configuration for DbCoordinator.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| volumes | Vec<VolumeDescriptor> | Single local volume | Storage volumes for global manifests |
| snapshot_retention | Option<usize> | None | Auto-expire old global snapshots |
VolumeDescriptor
Describes a storage volume.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| base_dir | String | (required) | Base directory URL (file://, s3://, etc.); S3 supports URL-encoded endpoint/root hints (for example s3://127.0.0.1:9000/bucket/prefix?endpoint_scheme=http&region=us-east-1) |
| access_id | Option<String> | None | Access ID for remote storage |
| secret_key | Option<String> | None | Secret key for remote storage |
| size_limit | Option<Size> | None | Maximum volume size |
| custom_options | Option<HashMap<String, String>> | None | Backend-specific initialization options passed to OpenDAL |
| kinds | u8 | 0 | Bitmask of VolumeUsageKind values |
To pass custom options to OpenDAL for a specific backend, use the custom_options field. The options each backend supports are listed in the OpenDAL documentation; for S3, for example, you can set endpoint, region, and so on.
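
As a minimal sketch, a `custom_options` value for S3 could be assembled as a plain string map. Only the `endpoint` and `region` keys mentioned above are used; the helper name `s3_custom_options` and the example values are hypothetical, and how the map is attached to a `VolumeDescriptor` is not shown here.

```rust
use std::collections::HashMap;

/// Build the string map that would go into `custom_options` for an S3
/// volume. The keys are the S3 options named in the text above.
fn s3_custom_options(endpoint: &str, region: &str) -> HashMap<String, String> {
    let mut options = HashMap::new();
    options.insert("endpoint".to_string(), endpoint.to_string());
    options.insert("region".to_string(), region.to_string());
    options
}

fn main() {
    // Example values only; substitute the endpoint and region of your store.
    let custom_options: Option<HashMap<String, String>> =
        Some(s3_custom_options("http://127.0.0.1:9000", "us-east-1"));
    println!("{custom_options:?}");
}
```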
VolumeUsageKind
| Kind | Value | Description |
| --- | --- | --- |
| Meta | 0 | Metadata files (manifests, schemas) |
| PrimaryDataPriorityHigh | 1 | High-priority data (SST, Parquet, VLOG) |
| PrimaryDataPriorityMedium | 2 | Medium-priority data |
| PrimaryDataPriorityLow | 3 | Low-priority data |
| Snapshot | 4 | Snapshot materialization |
| Cache | 5 | Block cache disk tier |
| Readonly | 6 | Read-only data source |
Helper Methods
| Method | Description |
| --- | --- |
| VolumeDescriptor::single_volume(url) | Create single volume with PrimaryDataPriorityHigh + Meta |
| VolumeDescriptor::new(url, kinds) | Create volume with specified usage kinds list (Vec<VolumeUsageKind>) |
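
To make the `kinds` bitmask concrete, the sketch below folds a list of usage kinds into a `u8`. It assumes that each kind's Value is its bit position (so Meta is bit 0, PrimaryDataPriorityHigh is bit 1, and so on); the source only says that `kinds` is a bitmask, so check the encoding against the code before relying on it. The enum is a local re-declaration for illustration, and `kinds_mask` and `has_kind` are hypothetical helpers.

```rust
/// Local copy of the kinds table; the discriminant is the table's Value.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum VolumeUsageKind {
    Meta = 0,
    PrimaryDataPriorityHigh = 1,
    PrimaryDataPriorityMedium = 2,
    PrimaryDataPriorityLow = 3,
    Snapshot = 4,
    Cache = 5,
    Readonly = 6,
}

/// Assumed encoding: bit `n` is set when the kind with Value `n` is present.
fn kinds_mask(kinds: &[VolumeUsageKind]) -> u8 {
    kinds
        .iter()
        .fold(0u8, |mask, kind| mask | (1u8 << (*kind as u8)))
}

/// Test whether a mask built by `kinds_mask` includes the given kind.
fn has_kind(mask: u8, kind: VolumeUsageKind) -> bool {
    mask & (1u8 << (kind as u8)) != 0
}

fn main() {
    // The combination that single_volume(url) is described as using.
    let mask = kinds_mask(&[
        VolumeUsageKind::PrimaryDataPriorityHigh,
        VolumeUsageKind::Meta,
    ]);
    println!("mask = {mask:#010b}");
    println!("has Cache: {}", has_kind(mask, VolumeUsageKind::Cache));
}
```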