docs: Adding async data cache configs and ssd cache documentation #11429

71 changes: 71 additions & 0 deletions velox/docs/configs.rst
@@ -760,3 +760,74 @@ Tracing
     - integer
     - 0
     - The max trace bytes limit. Tracing is disabled if zero.

Cache
-----
.. list-table::
   :widths: 30 10 10 70
   :header-rows: 1

   * - Property Name
     - Type
     - Default Value
     - Description
   * - async-data-cache-enabled
     - bool
     - true
     - If true, enable async data cache.
   * - async-cache-ssd-gb
     - integer
     - 0
     - The size of the SSD cache, in GB.
   * - async-cache-ssd-path
     - string
     - /mnt/flash/async_cache.
     - The directory that is mounted onto SSD.
   * - async-cache-max-ssd-write-ratio
     - double
     - 0.7
     - The max ratio of the number of in-memory cache entries being written to the SSD cache to the total number of cache entries. This controls the SSD cache write rate: once the ratio exceeds this threshold, we stop writing to the SSD cache.
   * - async-cache-max-ssd-savable-ratio
     - double
     - 0.125
     - The min ratio of SSD savable (in-memory) cache space to the total cache space. Once the ratio exceeds this limit, we start writing SSD savable cache entries into the SSD cache.
   * - async-cache-max-ssd-savable-bytes
     - integer
     - 1 << 24 (16 MB)
     - The min SSD savable (in-memory) cache space required to start writing SSD savable cache entries into the SSD cache. NOTE: We only write to the SSD cache when both of the above conditions (savable ratio and savable bytes) are satisfied.
   * - async-cache-persistence-interval
     - string
     - 0s
Collaborator

In the description, do we need to specify the allowed units? Is this mentioned in other configs that use duration units?

Contributor Author

Will add that the units have to be in seconds here.

Collaborator

The unit is part of the string, and we can see it is in seconds. But we allow other time-based units, such as hours and days, not just seconds, and we want the description to say what they are.

Contributor Author

Added.
The following time units are supported: ns, us, ms, s, m, h, d.

Reference: https://github.com/facebookincubator/velox/blob/main/velox/common/config/Config.cpp#L82

     - The interval for persisting in-memory cache to SSD. Setting this config to a non-zero value activates periodic cache persistence.
   * - async-cache-ssd-disable-file-cow
     - bool
     - false
     - On file systems that support copy-on-write (COW), such as btrfs, the SSD cache can use up all SSD space and stop working. To prevent that, use this option to disable COW for cache files.
   * - ssd-cache-checksum-enabled
     - bool
     - false
     - When enabled, a CRC-based checksum is calculated for each cache entry written to SSD. The checksum is stored in the next checkpoint file.
   * - ssd-cache-read-verification-enabled
     - bool
     - false
     - When enabled, the checksum is recalculated and verified against the stored value when cache data is loaded from the SSD.
   * - enable-serialized-page-checksum
     - bool
     - true
     -
   * - cache.velox.ttl-enabled
     - bool
     - false
     - Enable TTL for AsyncDataCache and SSD cache.
   * - cache.velox.ttl-threshold
     - string
     - 2d
     - TTL duration for AsyncDataCache and SSD cache entries.
   * - cache.velox.ttl-check-interval
     - string
     - 1h
     - The interval at which the cache TTL is applied and expired AsyncDataCache and SSD cache entries are evicted.
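
As an illustration of how the savable-ratio and savable-bytes thresholds in the table combine, the sketch below reads the note as: writes to the SSD cache start only when the SSD savable space exceeds both the ratio threshold and the byte threshold. ``should_write_to_ssd`` is a hypothetical helper written for this example, not a Velox function, and the actual check inside Velox may differ in detail.

.. code-block:: bash

   # Hypothetical helper (not part of Velox): succeeds when both SSD write
   # conditions described in the table hold.
   #   $1: SSD savable (in-memory) cache bytes
   #   $2: total cache bytes
   #   $3: savable-ratio threshold (defaults to the documented 0.125)
   #   $4: savable-bytes threshold (defaults to the documented 1 << 24)
   should_write_to_ssd() {
     local savable_bytes=$1 total_bytes=$2
     local ratio=${3:-0.125} min_bytes=${4:-16777216}
     # awk does the floating-point comparison; exit status 0 means "write".
     awk -v s="$savable_bytes" -v t="$total_bytes" -v r="$ratio" -v b="$min_bytes" \
       'BEGIN { exit !(t > 0 && s / t > r && s > b) }'
   }

   # 32 MB savable out of 128 MB: ratio 0.25 and above 16 MB, so both hold.
   should_write_to_ssd 33554432 134217728 && echo "write to SSD"
   # 8 MB savable out of 32 MB: ratio 0.25, but below 16 MB, so no write yet.
   should_write_to_ssd 8388608 33554432 || echo "keep in memory"

With the default thresholds, a small cache can satisfy the ratio condition long before it holds 16 MB of savable data, which is why the byte threshold is checked as well.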
152 changes: 152 additions & 0 deletions velox/docs/develop/cache.rst
@@ -0,0 +1,152 @@
===========================
AsyncDataCache (File Cache)
===========================

Background
----------
Velox provides a transparent file cache (AsyncDataCache) to accelerate table scans through hot data reuse and prefetching. The file cache is integrated with the memory system to achieve dynamic memory sharing between the file cache and query memory. When a query fails to allocate memory, we retry the allocation by shrinking the file cache. Therefore, the file cache size is automatically adjusted in response to changes in query memory usage. See `Memory Management - Velox Documentation <https://facebookincubator.github.io/velox/develop/memory.html>`_ for more information about Velox's file cache.

Configuration Properties
------------------------
The AsyncDataCache can be enabled by setting the following config:

.. code-block:: bash

   async-data-cache-enabled=true


Other Properties
----------------
The ``cache.no_retention`` session property in Velox controls whether a query's cached data is retained after its execution.

.. list-table::
   :widths: 30 10 10 70
   :header-rows: 1

   * - Property Name
     - Type
     - Default Value
     - Description
   * - cache.no_retention
     - bool
     - false
     - If true, evicts a query's scanned data from the in-memory cache right after access, and also skips staging it to the SSD cache.

Set the ``hive.node_selection_strategy`` session property accordingly to turn ``cache.no_retention`` ON or OFF.

.. code-block:: sql

   SET SESSION hive.node_selection_strategy='NO_PREFERENCE'; -- To turn cache.no_retention ON.
   SET SESSION hive.node_selection_strategy='SOFT_AFFINITY'; -- To turn cache.no_retention OFF.
   SET SESSION hive.node_selection_strategy='HARD_AFFINITY'; -- To turn cache.no_retention OFF.


=========
SSD Cache
=========

Background
----------
The AsyncDataCache uses an SSD cache when one is configured.
The SSD serves as an extension of the async data cache (file cache).
This helps reduce the number of reads from slower storage.

Configuration Properties
------------------------
The SSD cache can be used by setting the following configs:

.. code-block:: bash

   async-data-cache-enabled=true
   async-cache-ssd-gb=<the size of your SSD>
   async-cache-ssd-path=<path to directory that is mounted onto SSD>

.. list-table::
   :widths: 30 10 10 70
   :header-rows: 1

   * - Property Name
     - Type
     - Default Value
     - Description
   * - async-data-cache-enabled
     - bool
     - true
     - If true, enable async data cache.
   * - async-cache-ssd-gb
     - integer
     - 0
     - The size of the SSD cache, in GB.
   * - async-cache-ssd-path
     - string
     - /mnt/flash/async_cache.
     - The directory that is mounted onto SSD.


Other configuration properties can also be set to control how often the async data cache writes to SSD.
See `Configuration Properties <../configs.rst>`_ for more SSD Cache related configuration properties.
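
As a rough illustration, a configuration that combines the basic SSD cache settings with some of the additional properties might look like the fragment below. The property names come from the configuration tables in this change; the values (a 100 GB cache, a 5 minute persistence interval, checksums enabled) are assumptions chosen for the example, not recommendations.

.. code-block:: bash

   async-data-cache-enabled=true
   # Illustrative size; pick a value that fits your SSD.
   async-cache-ssd-gb=100
   async-cache-ssd-path=/mnt/flash/async_cache.
   # Durations take a unit suffix: ns, us, ms, s, m, h, or d.
   async-cache-persistence-interval=5m
   # Relevant on copy-on-write file systems such as btrfs.
   async-cache-ssd-disable-file-cow=true
   ssd-cache-checksum-enabled=true
   ssd-cache-read-verification-enabled=true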

Metrics
-------
Velox emits SSD cache metrics during query execution and at runtime.
See `Debugging Metrics <./debugging/metrics.rst>`_ and `Monitoring Metrics <../monitoring/metrics.rst>`_ for more details.


Setup with btrfs filesystem on worker machines (Linux only)
-----------------------------------------------------------
NOTE: The commands below were run successfully on Amazon EC2 r6 worker instances running CentOS.


.. code-block:: bash

   # Installs the centos-release-hyperscale-experimental module and other necessary packages.
   # https://sigs.centos.org/hyperscale/content/repositories/experimental/
   # It will also upgrade the kernel to the supported version for btrfs installation.
   hostnamectl
   sudo dnf -y install centos-release-hyperscale-experimental
   sudo dnf --disablerepo=* --enablerepo=centos-hyperscale,centos-hyperscale-experimental -y update --allowerasing
   sudo dnf -y install kernel-modules-extra
   # Restart the worker machine so that the new kernel version takes effect.
   sudo shutdown -r now || true


.. code-block:: bash

   # Only needed if your worker machine is part of a Docker swarm and needs to connect back to it.
   # The systemd packages need to be updated to match the updated kernel.
   sudo dnf -y install systemd-networkd systemd-boot


.. code-block:: bash

   # Install the btrfs packages.
   hostnamectl
   sudo yum -y install btrfs-progs
   echo "Checking /proc/filesystems for btrfs support..."
   if ! grep -q btrfs /proc/filesystems; then
     echo "Btrfs is not supported by the kernel."
     exit 1
   fi
   echo "Btrfs is supported by the kernel."


.. code-block:: bash

   # If the kernel supports btrfs, format a disk with btrfs and mount it at the cache directory.
   sudo lsblk -d -o NAME | tail -n +2
   cache_dir=/home/centos/presto/async_data_cache
   # Create the mount point if it does not already exist.
   sudo mkdir -p "$cache_dir"
   # Only install btrfs onto a disk that is not EBS (EBS holds the OS).
   mapfile -t disk_names < <(sudo lsblk -d -n -o NAME)
   for disk in "${disk_names[@]}"; do
     echo "Checking disk: $disk"
     # If the disk is an Amazon EC2 NVMe Instance Storage volume, then install btrfs onto that disk.
     if sudo fdisk -l "/dev/$disk" | grep -q "Amazon EC2 NVMe Instance Storage"; then
       echo "Disk $disk is an Amazon EC2 NVMe Instance Storage volume"
       sudo mkfs.btrfs "/dev/$disk"
       sudo mount -t btrfs "/dev/$disk" "$cache_dir"
       # Persist the mount across reboots.
       echo "/dev/$disk $cache_dir auto noatime 0 0" | sudo tee -a /etc/fstab
       sudo lsblk -f
       break
     else
       echo "Disk $disk is not an Amazon EC2 NVMe Instance Storage volume"
     fi
   done
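
After mounting, a quick check along the following lines can confirm that the cache directory is on btrfs. This is a sketch rather than part of the setup that was tested above: ``fs_type_of`` and ``is_btrfs`` are hypothetical helper names, and the check assumes GNU coreutils ``stat`` (as shipped with CentOS) and the same cache directory path used in the mount commands.

.. code-block:: bash

   # Hypothetical helpers (not part of the original setup commands).
   # Prints the file system type of the given path, or nothing if it cannot be read.
   fs_type_of() {
     stat -f -c %T "$1" 2>/dev/null
   }

   # Succeeds only if the given path is on a btrfs file system.
   is_btrfs() {
     [ "$(fs_type_of "$1")" = "btrfs" ]
   }

   cache_dir=/home/centos/presto/async_data_cache
   if is_btrfs "$cache_dir"; then
     echo "$cache_dir is on btrfs"
   else
     echo "$cache_dir is not on btrfs (or does not exist)"
   fi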