25.3-PRE
Simplyblock is happy to announce the pre-release version of the upcoming Simplyblock 25.3.
Warning
This is a pre-release and may contain issues. It is not recommended for production use.
New Features
Simplyblock strives to provide a strong product. The following is a list of the enhancements and features that made it into this release.
- High availability has been significantly hardened for production. The main improvements concern support for safe, interruption-free fail-over and fail-back in different types of outage scenarios, including partial network outages, full network outages, host failures, container failures, reboots, and graceful and ungraceful node shutdowns. Tested for single and dual node outages.
- Multiple journal compression bugs have been identified and fixed.
- Multiple journal fail-over bugs have been identified and fixed.
- Logical volume creation, deletion, snapshotting, and resizing can now be performed via a secondary storage node (when the primary storage node is offline).
- The system has been hardened against high-load scenarios, determined by the number of parallel NVMe-oF volumes per node, the amount of storage, and the degree of parallel I/O. Tested with up to 400 concurrent, fully active logical volumes per node and up to 20 concurrent I/O processes per logical volume.
- Erasure coding schemes 2+1, 2+2, 4+2, and 4+4 have been made power-fail-safe with high availability enabled.
- The system has been extensively tested outside of AWS with KVM-based virtualization and on bare-metal deployments.
- Significant rework of the command line tool sbcli to simplify commands and parameters and make them more consistent. For more information, see Important Changes.
- Support for Linux Core Isolation to improve performance and system stability (a brief sketch follows this list).
- Added support for Proxmox via the Simplyblock Proxmox Integration.
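Linux core isolation dedicates a set of CPU cores to the storage data path and keeps the kernel scheduler from placing other work on them. The sketch below illustrates only the general Linux mechanism; the exact interaction with the new --cpu-mask and --isolate-cores parameters of sbcli storage-node deploy is an assumption, so consult the deployment documentation for the authoritative procedure.

```bash
# Illustrative sketch, not the authoritative simplyblock procedure:
# reserve CPU cores 2-5 for the storage data path via the kernel command line
# (edit /etc/default/grub, regenerate the bootloader configuration, reboot).
GRUB_CMDLINE_LINUX="isolcpus=2-5 nohz_full=2-5 rcu_nocbs=2-5"

# Hypothetical invocation: hand the isolated cores to simplyblock at deployment.
# 0x3C is the CPU mask for cores 2-5; further required arguments are omitted.
sbcli storage-node deploy --isolate-cores --cpu-mask 0x3C
```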
Important Changes
Simplyblock made significant changes to the command line tool sbcli to simplify working with it. Many parameters and commands were meant for internal testing only and were confusing to users. Hence, simplyblock decided to make those private.
Parameters and commands that were made private should not affect most users. If making a parameter or command private does affect your deployment, please reach out.
Most changes are backward-compatible; however, some are not. The following is a complete list of changes. A brief migration sketch follows the list.
- Command: storage-node
    - Renamed command sn to storage-node (sn still works as an alias)
    - Changed subcommand device-testing-mode to private
    - Changed subcommand info-spdk to private
    - Changed subcommand remove-jm-device to private
    - Changed subcommand send-cluster-map to private
    - Changed subcommand get-cluster-map to private
    - Changed subcommand dump-lvstore to private
    - Changed subcommand set to private
    - Subcommand: deploy
        - Added parameter --cpu-mask
        - Added parameter --isolate-cores
    - Subcommand: add-node
        - Renamed parameter --partitions to --journal-partition
        - Renamed parameter --storage-nics to --data-nics
        - Renamed parameter --number-of-vcpus to --vcpu-count
        - Added parameter --max-snap (private)
        - Changed parameter --jm-percent to private
        - Changed parameter --number-of-devices to private
        - Changed parameter --size-of-device to private
        - Changed parameter --cpu-mask to private
        - Changed parameter --spdk-image to private
        - Changed parameter --spdk-debug to private
        - Changed parameter --iobuf_small_bufsize to private
        - Changed parameter --iobuf_large_bufsize to private
        - Changed parameter --enable-test-device to private
        - Changed parameter --disable-ha-jm to private
        - Changed parameter --id-device-by-nqn to private
        - Changed parameter --max-snap to private
    - Subcommand: restart
        - Renamed parameter --node-ip to --node-addr (--node-ip still works but is deprecated and should be replaced)
        - Changed parameter --max-snap to private
        - Changed parameter --max-size to private
        - Changed parameter --spdk-image to private
        - Changed parameter --spdk-debug to private
        - Changed parameter --iobuf_small_bufsize to private
        - Changed parameter --iobuf_large_bufsize to private
    - Subcommand: list-devices
        - Removed parameter --sort / -s
- Command: cluster
    - Changed subcommand graceful-shutdown to private
    - Changed subcommand graceful-startup to private
    - Subcommand: deploy
        - Renamed parameter --separate-journal-device to --journal-partition
        - Renamed parameter --storage-nics to --data-nics
        - Renamed parameter --number-of-vcpus to --vcpu-count
        - Changed parameter --ha-jm-count to private
        - Changed parameter --enable-qos to private
        - Changed parameter --blk-size to private
        - Changed parameter --page_size to private
        - Changed parameter --CLI_PASS to private
        - Changed parameter --grafana-endpoint to private
        - Changed parameter --distr-bs to private
        - Changed parameter --max-queue-size to private
        - Changed parameter --inflight-io-threshold to private
        - Changed parameter --jm-percent to private
        - Changed parameter --max-snap to private
        - Changed parameter --number-of-distribs to private
        - Changed parameter --size-of-device to private
        - Changed parameter --cpu-mask to private
        - Changed parameter --spdk-image to private
        - Changed parameter --spdk-debug to private
        - Changed parameter --iobuf_small_bufsize to private
        - Changed parameter --iobuf_large_bufsize to private
        - Changed parameter --enable-test-device to private
        - Changed parameter --disable-ha-jm to private
        - Changed parameter --lvol-name to private
        - Changed parameter --lvol-size to private
        - Changed parameter --pool-name to private
        - Changed parameter --pool-max to private
        - Changed parameter --snapshot / -s to private
        - Changed parameter --max-volume-size to private
        - Changed parameter --encrypt to private
        - Changed parameter --crypto-key1 to private
        - Changed parameter --crypto-key2 to private
        - Changed parameter --max-rw-iops to private
        - Changed parameter --max-rw-mbytes to private
        - Changed parameter --max-r-mbytes to private
        - Changed parameter --max-w-mbytes to private
        - Changed parameter --distr-vuid to private
        - Changed parameter --lvol-ha-type to private
        - Changed parameter --lvol-priority-class to private
        - Changed parameter --fstype to private
    - Subcommand: create
        - Changed parameter --page_size to private
        - Changed parameter --CLI_PASS to private
        - Changed parameter --distr-bs to private
        - Changed parameter --distr-chunk-bs to private
        - Changed parameter --ha-type to private
        - Changed parameter --max-queue-size to private
        - Changed parameter --inflight-io-threshold to private
        - Changed parameter --enable-qos to private
    - Subcommand: add
        - Changed parameter --page_size to private
        - Changed parameter --distr-bs to private
        - Changed parameter --distr-chunk-bs to private
        - Changed parameter --max-queue-size to private
        - Changed parameter --inflight-io-threshold to private
        - Changed parameter --enable-qos to private
- Command: storage-pool
    - Removed subcommand get-secret
    - Removed subcommand update-secret
    - Subcommand: add
        - Changed parameter --has-secret to private
- Command: caching-node
    - Subcommand: add-node
        - Renamed parameter --number-of-vcpus to --vcpu-count
        - Changed parameter --cpu-mask to private
        - Changed parameter --memory to private
        - Changed parameter --spdk-image to private
- Command: volume
    - Changed subcommand list-mem to private
    - Changed subcommand move to private
    - Subcommand: add
        - Renamed parameter --pvc_name to --pvc-name (--pvc_name still works but is deprecated and should be replaced)
        - Changed parameter --distr-vuid to private
        - Changed parameter --uid to private
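The sketch below shows how typical invocations change with the renames above. It is for orientation only: it uses only names listed in this section, required arguments unrelated to the renames are omitted, and the exact invocation for a given deployment may differ.

```bash
# Before 25.3 (the old names still work where noted above as aliases or deprecated)
sbcli sn restart --node-ip 192.168.10.11                    # further arguments omitted
sbcli sn add-node --storage-nics eth1 --number-of-vcpus 8   # further arguments omitted

# With 25.3
sbcli storage-node restart --node-addr 192.168.10.11
sbcli storage-node add-node --data-nics eth1 --vcpu-count 8
```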
Known Issues
Simplyblock always seeks to provide a stable and strong release. However, smaller known issues do happen. The following is a list of known issues in the current simplyblock release.
Info
This is a pre-release, and many of these known issues are expected to be resolved in the final release.
- The control plane reaches a limit at around 2,200 logical volumes.
- If a storage node goes offline while a logical volume is being deleted, the storage cluster may retain some garbage data.
- In rare cases, resizing a logical volume under high I/O load may cause a storage node restart.
- If a storage cluster reaches its capacity limit and runs full, file systems on logical volumes may return I/O errors.
- The fail-back time after a fail-over may increase to more than 10 seconds (with I/O freezes) when a storage node hosts a larger number of logical volumes (>100).
- The fail-over time may increase to more than 5 seconds (with I/O freezes) for large logical volumes (>5 TB).
- During a node outage, I/O performance may drop significantly with certain I/O patterns due to a performance issue in the journal compression.
- Journal compression may cause significant I/O performance drops (10-20 seconds) at periodic intervals under certain I/O load patterns, especially when the logical volume capacity reaches its limit for the first time.
- A peak read IOPS performance regression has been observed.
- In rare cases, a primary-secondary storage node combination may get into a flip-flop situation with multiple fail-over/fail-back iterations due to network or configuration issues affecting particular logical volumes or clients.
- A secondary node may get stuck when trying to restart under high load (>100 logical volumes).
- Node affinity rules are not considered after a storage node migration to a new host.
- The return code of sbcli commands is always 0, regardless of success or failure (see the workaround sketch below).
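Because of the last item, scripts cannot rely on the return code to detect sbcli failures. Below is a minimal workaround sketch; it assumes that failures are reported in the command output, and the matched pattern, like the example arguments, is an assumption that may need adjusting for your sbcli version.

```bash
# Workaround sketch: wrap sbcli calls and inspect their output instead of the
# (always 0) exit code. The matched pattern is an assumption; adjust it to the
# error messages actually printed by your sbcli version.
run_sbcli() {
    local out
    out=$(sbcli "$@" 2>&1)
    echo "$out"
    if echo "$out" | grep -qiE 'error|failed'; then
        return 1
    fi
}

# Example usage (arguments are illustrative only):
run_sbcli storage-node restart --node-addr 192.168.10.11 || echo "restart failed" >&2
```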