System Information

CrateDB provides the sys schema, which contains virtual tables. These tables are read-only and can be queried to get real-time statistical information about the cluster, its nodes, and their shards.

Cluster

Basic information about the CrateDB cluster can be retrieved from the sys.cluster table:

Name Description Return Type
id A unique ID generated by the system. String
name The cluster name. String
master_node Node ID of the node which currently operates as master. String
settings The cluster settings. Object

The result has at most 1 row:

cr> select name from sys.cluster;
+--------------+
| name         |
+--------------+
| Testing...   |
+--------------+
SELECT 1 row in set (... sec)

Cluster Settings

The sys.cluster.settings expression returns information about the currently applied cluster settings.

cr> select settings from sys.cluster;
+-----------------------------------------------------------------------------------------------------------------------------------------------------...-+
| settings                                                                                                                                                |
+-----------------------------------------------------------------------------------------------------------------------------------------------------...-+
| {"bulk": {...}, "cluster": {...}, "discovery": {...}, "gateway": {...}, "indices": {...}, "license": {...}, "logger": [], "stats": {...}, "udc": {...}} |
+-----------------------------------------------------------------------------------------------------------------------------------------------------...-+
SELECT 1 row in set (... sec)
cr> select column_name, data_type from information_schema.columns
... where column_name like 'settings%'
... and table_name = 'cluster';
+-----------------------------------------------------------------------------------+--------------+
| column_name                                                                       | data_type    |
+-----------------------------------------------------------------------------------+--------------+
| settings                                                                          | object       |
| settings['bulk']                                                                  | object       |
| settings['bulk']['request_timeout']                                               | string       |
| settings['cluster']                                                               | object       |
| settings['cluster']['graceful_stop']                                              | object       |
| settings['cluster']['graceful_stop']['force']                                     | boolean      |
| settings['cluster']['graceful_stop']['min_availability']                          | string       |
| settings['cluster']['graceful_stop']['reallocate']                                | boolean      |
| settings['cluster']['graceful_stop']['timeout']                                   | string       |
| settings['cluster']['info']                                                       | object       |
| settings['cluster']['info']['update']                                             | object       |
| settings['cluster']['info']['update']['interval']                                 | string       |
| settings['cluster']['routing']                                                    | object       |
| settings['cluster']['routing']['allocation']                                      | object       |
| settings['cluster']['routing']['allocation']['allow_rebalance']                   | string       |
| settings['cluster']['routing']['allocation']['balance']                           | object       |
| settings['cluster']['routing']['allocation']['balance']['index']                  | float        |
| settings['cluster']['routing']['allocation']['balance']['shard']                  | float        |
| settings['cluster']['routing']['allocation']['balance']['threshold']              | float        |
| settings['cluster']['routing']['allocation']['cluster_concurrent_rebalance']      | integer      |
| settings['cluster']['routing']['allocation']['disk']                              | object       |
| settings['cluster']['routing']['allocation']['disk']['threshold_enabled']         | boolean      |
| settings['cluster']['routing']['allocation']['disk']['watermark']                 | object       |
| settings['cluster']['routing']['allocation']['disk']['watermark']['high']         | string       |
| settings['cluster']['routing']['allocation']['disk']['watermark']['low']          | string       |
| settings['cluster']['routing']['allocation']['enable']                            | string       |
| settings['cluster']['routing']['allocation']['exclude']                           | object       |
| settings['cluster']['routing']['allocation']['exclude']['_host']                  | string       |
| settings['cluster']['routing']['allocation']['exclude']['_id']                    | string       |
| settings['cluster']['routing']['allocation']['exclude']['_ip']                    | string       |
| settings['cluster']['routing']['allocation']['exclude']['_name']                  | string       |
| settings['cluster']['routing']['allocation']['include']                           | object       |
| settings['cluster']['routing']['allocation']['include']['_host']                  | string       |
| settings['cluster']['routing']['allocation']['include']['_id']                    | string       |
| settings['cluster']['routing']['allocation']['include']['_ip']                    | string       |
| settings['cluster']['routing']['allocation']['include']['_name']                  | string       |
| settings['cluster']['routing']['allocation']['node_concurrent_recoveries']        | integer      |
| settings['cluster']['routing']['allocation']['node_initial_primaries_recoveries'] | integer      |
| settings['cluster']['routing']['allocation']['require']                           | object       |
| settings['cluster']['routing']['allocation']['require']['_host']                  | string       |
| settings['cluster']['routing']['allocation']['require']['_id']                    | string       |
| settings['cluster']['routing']['allocation']['require']['_ip']                    | string       |
| settings['cluster']['routing']['allocation']['require']['_name']                  | string       |
| settings['discovery']                                                             | object       |
| settings['discovery']['zen']                                                      | object       |
| settings['discovery']['zen']['minimum_master_nodes']                              | integer      |
| settings['discovery']['zen']['ping_timeout']                                      | string       |
| settings['discovery']['zen']['publish_timeout']                                   | string       |
| settings['gateway']                                                               | object       |
| settings['gateway']['expected_nodes']                                             | integer      |
| settings['gateway']['recover_after_nodes']                                        | integer      |
| settings['gateway']['recover_after_time']                                         | string       |
| settings['indices']                                                               | object       |
| settings['indices']['breaker']                                                    | object       |
| settings['indices']['breaker']['fielddata']                                       | object       |
| settings['indices']['breaker']['fielddata']['limit']                              | string       |
| settings['indices']['breaker']['fielddata']['overhead']                           | double       |
| settings['indices']['breaker']['query']                                           | object       |
| settings['indices']['breaker']['query']['limit']                                  | string       |
| settings['indices']['breaker']['query']['overhead']                               | double       |
| settings['indices']['breaker']['request']                                         | object       |
| settings['indices']['breaker']['request']['limit']                                | string       |
| settings['indices']['breaker']['request']['overhead']                             | double       |
| settings['indices']['recovery']                                                   | object       |
| settings['indices']['recovery']['internal_action_long_timeout']                   | string       |
| settings['indices']['recovery']['internal_action_timeout']                        | string       |
| settings['indices']['recovery']['max_bytes_per_sec']                              | string       |
| settings['indices']['recovery']['recovery_activity_timeout']                      | string       |
| settings['indices']['recovery']['retry_delay_network']                            | string       |
| settings['indices']['recovery']['retry_delay_state_sync']                         | string       |
| settings['indices']['store']                                                      | object       |
| settings['indices']['store']['throttle']                                          | object       |
| settings['indices']['store']['throttle']['max_bytes_per_sec']                     | string       |
| settings['indices']['store']['throttle']['type']                                  | string       |
| settings['license']                                                               | object       |
| settings['license']['enterprise']                                                 | boolean      |
| settings['license']['ident']                                                      | string       |
| settings['logger']                                                                | object_array |
| settings['logger']['level']                                                       | string       |
| settings['logger']['name']                                                        | string       |
| settings['stats']                                                                 | object       |
| settings['stats']['breaker']                                                      | object       |
| settings['stats']['breaker']['log']                                               | object       |
| settings['stats']['breaker']['log']['jobs']                                       | object       |
| settings['stats']['breaker']['log']['jobs']['limit']                              | string       |
| settings['stats']['breaker']['log']['jobs']['overhead']                           | double       |
| settings['stats']['breaker']['log']['operations']                                 | object       |
| settings['stats']['breaker']['log']['operations']['limit']                        | string       |
| settings['stats']['breaker']['log']['operations']['overhead']                     | double       |
| settings['stats']['enabled']                                                      | boolean      |
| settings['stats']['jobs_log_expiration']                                          | string       |
| settings['stats']['jobs_log_size']                                                | integer      |
| settings['stats']['operations_log_expiration']                                    | string       |
| settings['stats']['operations_log_size']                                          | integer      |
| settings['stats']['service']                                                      | object       |
| settings['stats']['service']['interval']                                          | string       |
| settings['udc']                                                                   | object       |
| settings['udc']['enabled']                                                        | boolean      |
| settings['udc']['initial_delay']                                                  | string       |
| settings['udc']['interval']                                                       | string       |
| settings['udc']['url']                                                            | string       |
+-----------------------------------------------------------------------------------+--------------+
SELECT ... rows in set (... sec)

For further details, see the Cluster Settings configuration section.
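Individual nested settings can also be read directly using subscript notation instead of fetching the whole object. For example (a sketch; the returned value depends on your cluster configuration):

cr> select settings['stats']['enabled'] from sys.cluster;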

Nodes

To get information about the nodes, query the sys.nodes table.

This table can be queried for one, multiple, or all nodes within a cluster.

The table schema is as follows:

id

Column Name Description Return Type
id A unique ID within the cluster generated by the system. String

name

Column Name Description Return Type
name The node name within a cluster. The system will choose a random name. You can specify the node name via your own custom configuration. String

hostname

Column Name Description Return Type
hostname The specified host name of the machine the node is running on. String

rest_url

Column Name Description Return Type
rest_url Full http(s) address where the REST API of the node is exposed, including schema, hostname (or IP) and port. String

port

Column Name Description Return Type
port The specified ports for both HTTP and binary transport interfaces. You can specify the ports via your own custom configuration. Object
port['http'] CrateDB’s HTTP port. Integer
port['transport'] CrateDB’s binary transport port. Integer
port['psql'] The PostgreSQL wire protocol port. Integer

load

Column Name Description Return Type
load System load statistics Object
load['1'] Average load over the last 1 minute. Double
load['5'] Average load over the last 5 minutes. Double
load['15'] Average load over the last 15 minutes. Double
load['probe_timestamp'] Unix timestamp at the time of collection of the load probe. Long
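For example, the load averages of all nodes can be listed in a single query (the values naturally depend on the cluster, so the output is omitted here):

cr> select name, load['1'], load['5'], load['15'] from sys.nodes;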

mem

Column Name Description Return Type
mem Memory utilization statistics of the host. Object
mem['used'] Currently used memory in bytes. Long
mem['used_percent'] Currently used memory in percent of total. Short
mem['free'] Currently available memory in bytes. Long
mem['free_percent'] Currently available memory in percent of total. Short
mem['probe_timestamp'] Unix timestamp at the time of collection of the memory probe. Long

heap

Column Name Description Return Type
heap Heap memory utilization statistics. Object
heap['used'] Currently used heap memory in bytes. Long
heap['max'] Maximum available heap memory. You can specify the max heap memory CrateDB should use in the configuration. Long
heap['free'] Currently available heap memory in bytes. Long
heap['probe_timestamp'] Unix timestamp at the time of collection of the heap probe. Long
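Combining these columns, the current heap utilization per node can be derived. A sketch (multiplying by 100.0 forces floating point division):

cr> select name, heap['used'] * 100.0 / heap['max'] as heap_used_pct from sys.nodes;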

version

Column Name Description Return Type
version CrateDB version information. Object
version['number'] Version string in the format "major.minor.hotfix". String
version['build_hash'] SHA hash of the GitHub commit from which this build was built. String
version['build_snapshot'] Indicates whether this build is a snapshot build. Boolean

fs

Column Name Description Return Type
fs Utilization statistics about the file system. Object
fs['total'] Aggregated usage statistic of all disks on the host. Object
fs['total']['size'] Total size of all disks in bytes. Long
fs['total']['used'] Total used space of all disks in bytes. Long
fs['total']['available'] Total available space of all disks in bytes. Long
fs['total']['reads'] Total number of reads on all disks. Long
fs['total']['bytes_read'] Total size of reads on all disks in bytes. Long
fs['total']['writes'] Total number of writes on all disks. Long
fs['total']['bytes_written'] Total size of writes on all disks in bytes. Long
fs['disks'] Usage statistics of individual disks on the host. Array
fs['disks']['dev'] Device name String
fs['disks']['size'] Total size of the disk in bytes. Long
fs['disks']['used'] Used space of the disk in bytes. Long
fs['disks']['available'] Available space of the disk in bytes. Long
fs['disks']['reads'] Number of reads on the disk. Long
fs['disks']['bytes_read'] Total size of reads on the disk in bytes. Long
fs['disks']['writes'] Number of writes on the disk. Long
fs['disks']['bytes_written'] Total size of writes on the disk in bytes. Long
fs['data'] Information about data paths used by the node. Array
fs['data']['dev'] Device name String
fs['data']['path'] File path where the data of the node resides. String
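For example, per-disk usage can be inspected by selecting from the fs['disks'] array (a sketch; each selected subscript returns an array with one element per disk):

cr> select name, fs['disks']['dev'], fs['disks']['used'], fs['disks']['available'] from sys.nodes;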

thread_pools

Column Name Description Return Type
thread_pools Usage statistics of Java thread pools. Array
thread_pools['name'] Name of the pool. String
thread_pools['active'] Number of currently running threads in the thread pool. Integer
thread_pools['rejected'] Total number of rejected threads in the thread pool. Long
thread_pools['largest'] Largest number of threads that have ever simultaneously been in the pool. Integer
thread_pools['completed'] Total number of completed threads in the thread pool. Long
thread_pools['threads'] Size of the thread pool. Integer
thread_pools['queue'] Number of threads currently in the queue. Integer

os

Column Name Description Return Type
os Operating system stats Object
os['uptime'] System uptime in milliseconds Long
os['timestamp'] UNIX timestamp in millisecond resolution Long
os['cpu'] Information about CPU utilization Object
os['cpu']['system'] CPU time used by the system Short
os['cpu']['user'] CPU time used by applications Short
os['cpu']['idle'] Idle CPU time Short
os['cpu']['used'] Used CPU (system + user) Short
os['cpu']['stolen'] The amount of CPU ‘stolen’ from this virtual machine by the hypervisor for other tasks. Short
os['probe_timestamp'] Unix timestamp at the time of collection of the OS probe. Long

The cpu information values are cached for 1s. They might differ from the actual values at query time. Use the probe timestamp to get the time of collection. When analyzing the cpu usage over time, always use os['probe_timestamp'] to calculate the time difference between 2 probes.
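For example, to sample the CPU usage together with its probe timestamp so that two samples can be compared over time:

cr> select name, os['cpu']['used'], os['probe_timestamp'] from sys.nodes;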

os_info

Column Name Description Return Type
os_info Operating system information Object
os_info['available_processors'] Number of processors that are available in the JVM. This is usually equal to the number of cores of the CPU. Integer
os_info['name'] Name of the operating system (ex: Linux, Windows, Mac OS X) String
os_info['arch'] Name of the JVM architecture (ex: amd64, x86) String
os_info['version'] Version of the operating system String
os_info['jvm'] Information about the JVM (Java Virtual Machine) Object
os_info['jvm']['version'] The JVM version String
os_info['jvm']['vm_name'] The name of the JVM (eg. OpenJDK, Java Hotspot(TM) ) String
os_info['jvm']['vm_vendor'] The vendor name of the JVM String
os_info['jvm']['vm_version'] The version of the JVM String

network

Column Name Description Return Type
network Statistics about network activity on the host. Object
network['probe_timestamp'] Unix timestamp at the time of collection of the network probe. Long
network['tcp'] TCP network activity on the host. Object
network['tcp']['connections'] Information about TCP network connections. Object
network['tcp']['connections']['initiated'] Total number of initiated TCP connections. Long
network['tcp']['connections']['accepted'] Total number of accepted TCP connections. Long
network['tcp']['connections']['curr_established'] Total number of currently established TCP connections. Long
network['tcp']['connections']['dropped'] Total number of dropped TCP connections. Long
network['tcp']['connections']['embryonic_dropped'] Total number of TCP connections that have been dropped before they were accepted. Long
network['tcp']['packets'] Information about TCP packets. Object
network['tcp']['packets']['sent'] Total number of TCP packets sent. Long
network['tcp']['packets']['received'] Total number of TCP packets received. Long
network['tcp']['packets']['retransmitted'] Total number of TCP packets retransmitted due to an error. Long
network['tcp']['packets']['errors_received'] Total number of TCP packets that contained checksum errors, had a bad offset, were dropped because of a lack of memory, or were too short. Long
network['tcp']['packets']['rst_sent'] Total number of RST packets sent due to unread data left in the queue when the socket is closed. See tools.ietf.org. Long

process

Column Name Description Return Type
process Statistics about the CrateDB process. Object
process['open_file_descriptors'] Number of currently open file descriptors used by the CrateDB process. Long
process['max_open_file_descriptors'] The maximum number of open file descriptors CrateDB can use. Long
process['probe_timestamp'] The system UNIX timestamp at the moment of the probe collection. Long
process['cpu'] Information about the CPU usage of the CrateDB process. Object
process['cpu']['percent'] The CPU usage of the CrateDB JVM process given in percent. Short
process['cpu']['user'] The process CPU user time in milliseconds. Long
process['cpu']['system'] The process CPU kernel time in milliseconds. Long

The cpu information values are cached for 1s. They might differ from the actual values at query time. Use the probe timestamp to get the time of collection. When analyzing the cpu usage over time, always use process['probe_timestamp'] to calculate the time difference between 2 probes.

Note

If one of the queried nodes does not respond within three seconds, it returns null for every column except id and name. This behaviour can be used to detect hanging nodes.
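Building on this behaviour, a sketch of a query that lists unresponsive nodes (for which all columns other than id and name are null):

cr> select id, name from sys.nodes where hostname is null;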

Node Checks

The table sys.node_checks exposes a list of internal node checks and results of their validation.

The table schema is the following:

Name Description Return Type
id The unique check ID. Integer
node_id The unique node ID. String
severity The level of severity. The higher the value of this field, the higher the severity. Integer
description The description message for the setting check. String
passed Indicates whether the check for the setting has passed. Boolean
acknowledged Indicates whether the check for this setting has been acknowledged by the user in order to ignore the value of the passed column. This column can be updated. Boolean

Example query:

cr> select id, node_id, description from sys.node_checks order by id, node_id;
+-...-+---------...-+--------------------------------------------------------------...-+
|  id | node_id     | description                                                      |
+-...-+---------...-+--------------------------------------------------------------...-+
|   1 | ...         | The value of the cluster setting 'gateway.expected_nodes' mus... |
|   2 | ...         | The value of the cluster setting 'gateway.recover_after_nodes... |
|   3 | ...         | If any of the "expected nodes" recovery settings are set, the... |
|   5 | ...         | The high disk watermark is exceeded on the node. The cluster ... |
|   6 | ...         | The low disk watermark is exceeded on the node. The cluster w... |
...
+-...-+---------...-+--------------------------------------------------------------...-+
SELECT ... rows in set (... sec)
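To surface only the checks that currently fail and have not been acknowledged yet, filter on the passed and acknowledged columns:

cr> select id, node_id, severity, description from sys.node_checks
... where passed = false and acknowledged = false
... order by severity desc;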

Acknowledge Failed Checks

Every check can be acknowledged by updating its acknowledged column. Acknowledged checks are no longer reported as failing, in particular by CrateDB’s built-in Admin UI.

Imagine we’ve added a new node to our cluster. Because the gateway.expected_nodes setting can only be set via the configuration file or a command-line argument, the check for this setting will not pass on the already running nodes until the setting is updated on those nodes and they are restarted, which is not what we want on a healthy, well-running cluster.

In order to make the Admin UI accept a failing check (so the checks label turns green again), we must acknowledge the check by updating its acknowledged flag:

cr> update sys.node_checks set acknowledged = true where id = 1;
UPDATE OK, 1 row affected (... sec)

Note

Updates on this column are transient, so changed values are lost after the affected node is restarted.

Description of Checked Node Settings

Recovery Expected Nodes

The check for the gateway.expected_nodes setting verifies that the number of nodes to wait for before immediate cluster state recovery equals the maximum number of data and master nodes in the cluster.

Recovery After Nodes

The check for gateway.recover_after_nodes verifies that the number of nodes that must have started before cluster state recovery begins is greater than half of the expected number of nodes and at most the number of expected nodes:

(E / 2) < R <= E

where R is the number of recovery nodes and E is the number of expected nodes. For example, with E = 8 expected nodes, R must be at least 5 and at most 8.

Recovery After Time

If gateway.recover_after_nodes is set, then gateway.recover_after_time must not be set to 0s, otherwise the gateway.recover_after_nodes setting wouldn’t have any effect.

Routing Allocation Disk Watermark High

The check for the cluster.routing.allocation.disk.watermark.high setting verifies that the high watermark is not exceeded on the current node. The usage of each disk for configured CrateDB data paths is verified against the threshold setting. If the verification fails for one or more disks, the check is marked as not passed.

Routing Allocation Disk Watermark Low

The cluster.routing.allocation.disk.watermark.low setting controls the low watermark for node disk usage. The check verifies that the low watermark is not exceeded on the current node. The verification is done against each disk of the configured CrateDB data paths. The check does not pass if the verification fails for one or more disks.

Shards

The table sys.shards contains real-time statistics for all shards of all (non-system) tables.

Table Schema

Column Name Description Return Type
id

The shard ID.

This shard ID is managed by the system, ranging from 0 up to the specified number of shards of a table (by default, the number of shards is 5).

Integer
schema_name

The schema name.

This will be “blob” for shards of blob tables and “doc” for shards of common tables without a defined schema.

String
table_name The table name. String
partition_ident

The partition ident of a partitioned table.

Empty string on non-partitioned tables.

String
num_docs The total amount of docs within a shard. Long
min_lucene_version Shows the oldest lucene segment version used in this shard. String
primary Describes if the shard is the primary shard. Boolean
relocating_node The ID of the node to which the shard is currently being relocated. String
size

Current size in bytes.

This value is cached for max. 10 seconds to reduce file system access.

Long
state

The current state of the shard.

Possible states are:

  • CREATED
  • RECOVERING
  • POST_RECOVERY
  • STARTED
  • RELOCATED
  • CLOSED
  • INITIALIZING
  • UNASSIGNED

String
routing_state

The current state of a shard as defined by the routing.

Possible states of the shard routing are:

  • UNASSIGNED
  • INITIALIZING
  • STARTED
  • RELOCATING
String
orphan_partition

True if the partition has no table associated with it; in rare situations the table can be missing.

False on non-partitioned tables.

Boolean
path

Path to the shard directory on the filesystem.

This directory contains state and index files.

String
blob_path Path to the directory which contains the blob files of the shard, or null if the shard is not a blob shard. String
_node

Information about the node the shard is located at.

Contains the same information as the sys.nodes table.

Object
recovery

Represents recovery statistic of the particular shard.

Recovery is the process of moving a table shard to a different node or loading it from disk, e.g. during node startup (local gateway recovery), replication, shard rebalancing or snapshot recovery.

Object
recovery['stage']

Recovery stage:

  • init: Recovery has not started
  • index: Reading the Lucene index meta-data and copying bytes from source to destination
  • start: Starting the engine, opening the index for use
  • translog: Replaying transaction log
  • finalize: Cleanup
  • done: Complete
String
recovery['type']

Recovery type:

  • gateway
  • snapshot
  • replica
  • relocating
String
recovery['size'] Shard recovery statistics in bytes. Object
recovery['size']['used'] Total number of bytes in the shard. Long
recovery['size']['reused'] Number of bytes reused from a local copy while recovering the shard. Long
recovery['size']['recovered'] Number of actual bytes recovered in the shard. Includes both existing and reused bytes. Long
recovery['size']['percent'] Percentage of bytes already recovered. Float
recovery['files'] Shard recovery statistics in files. Object
recovery['files']['used'] Total number of files in the shard. Integer
recovery['files']['reused'] Total number of files reused from a local copy while recovering the shard. Integer
recovery['files']['recovered'] Number of actual files recovered in the shard. Includes both existing and reused files. Integer
recovery['files']['percent'] Percentage of files already recovered. Float
recovery['total_time'] Elapsed time since the start of the shard recovery. Long
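These recovery columns make it possible to watch ongoing recoveries. A sketch that lists shards which are not yet fully recovered:

cr> select table_name, id, recovery['stage'], recovery['size']['percent']
... from sys.shards where recovery['stage'] != 'done';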

Notes

Accessing the sys.shards table is subject to the same privilege constraints as the other tables. Namely, in order to query it, the connected user needs the DQL privilege on the sys.shards table, either directly or inherited from the SCHEMA or CLUSTER level (for more information on privilege inheritance see Hierarchical Inheritance of Privileges).

However, being able to query sys.shards will not allow the user to retrieve all the rows in the table, as it can contain information related to tables over which the connected user does not have any privileges. The only rows that will be returned will be the ones the user is allowed to access.

For example, if the user john has any privilege on the doc.books table but no privilege at all on doc.locations, when john issues a SELECT * FROM sys.shards statement, the shards information related to the doc.locations table will not be returned.

Example

For example, you can query shards like this:

cr> select schema_name as schema,
...   table_name as t,
...   id,
...   partition_ident as p_i,
...   num_docs as docs,
...   primary,
...   relocating_node as r_n,
...   routing_state as r_state,
...   state,
...   orphan_partition as o_p
... from sys.shards where table_name = 'locations' and id = 1;
+--------+-----------+----+-----+------+---------+------+---------+---------+-------+
| schema | t         | id | p_i | docs | primary | r_n  | r_state |  state  | o_p   |
+--------+-----------+----+-----+------+---------+------+---------+---------+-------+
| doc    | locations |  1 |     |    8 | TRUE    | NULL | STARTED | STARTED | FALSE |
+--------+-----------+----+-----+------+---------+------+---------+---------+-------+
SELECT 1 row in set (... sec)

Jobs, Operations, and Logs

To let you inspect the activities currently taking place in a cluster, CrateDB provides system tables that let you track current cluster jobs and operations. CrateDB also logs both of these tables.

By default, these tables are not populated, because tracking jobs and operations adds a slight performance overhead.

If you want to use the jobs and operations tables, you must enable the collection of CrateDB statistics with SET and RESET, like so:

cr> set global stats.enabled = true;
SET OK, 1 row affected (... sec)
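Statistics collection can be switched off again with RESET, which restores the default value of the setting:

cr> reset global stats.enabled;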

Jobs

The sys.jobs table is a constantly updated view of all jobs that are currently being executed in the cluster. The field username corresponds to the SESSION_USER that is performing the query:

cr> select stmt, username, started from sys.jobs where stmt like 'sel% from %jobs%';
+---------------------------------------------------------------------------------+----------+-...-----+
| stmt                                                                            | username | started |
+---------------------------------------------------------------------------------+----------+-...-----+
| select stmt, username, started from sys.jobs where stmt like 'sel% from %jobs%' | crate    | ...     |
+---------------------------------------------------------------------------------+----------+-...-----+
SELECT 1 row in set (... sec)

Note

If the enterprise edition is disabled or the user management module is not available, the username is represented as NULL.

Every request that queries or manipulates data is considered a “job” if it is a valid query. Requests that are not valid queries (for example, a request that tries to query a non-existent table) will not show up as jobs.

Operations

The sys.operations table is a constantly updated view of all operations that are currently being executed in the cluster:

cr> select _node['name'], job_id, name, used_bytes from sys.operations
... order by name limit 1;
+---------------+--------...-+-----...-+------------+
| _node['name'] | job_id     | name    | used_bytes |
+---------------+--------...-+-----...-+------------+
| crate         | ...        | ...     | ...        |
+---------------+--------...-+-----...-+------------+
SELECT 1 row in set (... sec)

An operation is a node-specific sub-component of a job, created when a job involves multi-node processing. Jobs that do not require multi-node processing will not produce any operations.

Note

In some cases, operations are generated for internal CrateDB work that does not directly correspond to a user request. These entries do not have corresponding entries in sys.jobs.
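You can spot such internal operations yourself. As a sketch (assuming support for uncorrelated IN subqueries), look for operations whose job_id has no matching entry in sys.jobs:

cr> select name, job_id from sys.operations
... where job_id not in (select id from sys.jobs);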

Logs

The sys.jobs and sys.operations tables have corresponding log tables: sys.jobs_log and sys.operations_log.

After a job or operation finishes, the corresponding entry will be moved into the corresponding log table:

cr> select id, stmt, username, started, ended, error
... from sys.jobs_log order by ended desc limit 2;
+-...+----------------------------------------------...-+----------+-...-----+-...---+-------+
| id | stmt                                             | username | started | ended | error |
+-...+----------------------------------------------...-+----------+-...-----+-...---+-------+
| ...| select _node['name'], ...                        | crate    | ...     | ...   |  NULL |
| ...| select stmt, username, started from sys.jobs ... | crate    | ...     | ...   |  NULL |
+-...+----------------------------------------------...-+----------+-...-----+-...---+-------+
SELECT 2 rows in set (... sec)

Invalid queries are also logged in the sys.jobs_log table, i.e. queries that never make it to the sys.jobs table because they could not be executed.

The log tables are bounded either by a fixed size (stats.jobs_log_size) or by an expiration time (stats.jobs_log_expiration).
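Both limits can be adjusted at runtime with SET. For example (the value here is illustrative):

cr> set global stats.jobs_log_size = 2000;
SET OK, 1 row affected (... sec)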

See Collecting Stats for information on how to configure logs.

Caution

If you deactivate statistics tracking, the log tables will be truncated.
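For example, resetting the setting to its default (false) disables tracking and truncates the log tables in the process:

cr> reset global stats.enabled;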

Cluster Checks

The table sys.checks exposes a list of internal cluster checks and results of their validation.

The sys.checks table looks like this:

Name Description Return Type
id The unique check id. Integer
severity The level of severity. The higher the value of the field, the higher the severity.  
description The description message for the setting check. String
passed A flag indicating whether the check passed. Boolean

Here’s an example query:

cr> select id, description from sys.checks order by id;
+----+--------------------------------------------------------------...-+
| id | description                                                      |
+----+--------------------------------------------------------------...-+
|  1 | The setting 'discovery.zen.minimum_master_nodes' must not be ... |
|  2 | The total number of partitions of one or more partitioned tab... |
|  3 | The following tables need to be upgraded for compatibility wi... |
|  4 | CrateDB Enterprise features are active. Please request a lice... |
+----+--------------------------------------------------------------...-+
SELECT 4 rows in set (... sec)

Cluster checks are also indicated in the CrateDB admin console. When all cluster checks (and all Node Checks) pass, the Checks icon will be green. Here’s what it looks like when some checks are failing at the CRITICAL severity level:

../_images/cluster-checks-critical.png

Current Checks

Minimum Master Nodes

The check for the discovery.zen.minimum_master_nodes setting verifies that the setting is configured to at least half the number of nodes in the cluster plus one (a quorum):

(N / 2) + 1 <= M

where N is the number of nodes in the cluster, and M is the value of the setting discovery.zen.minimum_master_nodes.
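For example, in a four-node cluster (N = 4):

(4 / 2) + 1 = 3

so discovery.zen.minimum_master_nodes should be set to at least 3, ensuring that a majority of nodes is required to elect a master.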

You can change the value (via SET and RESET) permanently by issuing the following SQL statement:

SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = M;

Number of Partitions

This check warns if any partitioned table has more than 1000 partitions, in order to detect the use of a high-cardinality column for partitioning.
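You can inspect the current partition counts yourself. A sketch (note that in some CrateDB versions the schema column of information_schema.table_partitions is named schema_name rather than table_schema):

cr> select table_schema, table_name, count(*) as partitions
... from information_schema.table_partitions
... group by table_schema, table_name
... order by partitions desc;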

Tables Need to Be Upgraded

Warning

Do not attempt to upgrade your cluster if this cluster check is failing. Follow the instructions below to get this cluster check passing.

This check warns you if there are tables that need to be upgraded for compatibility with future versions of CrateDB.

For tables that need upgrading, use the OPTIMIZE command to perform a Segments Upgrade.

For each table, run a command like so:

OPTIMIZE TABLE table_ident WITH (upgrade_segments=true);

Here, replace table_ident with the name of the table you are upgrading.

When all tables that needed upgrading have been upgraded, this cluster check should pass.

Note

Snapshots of your tables created prior to them being upgraded will not work with future versions of CrateDB. For this reason, you should create a new snapshot for each of your tables. (See Backup With Snapshot and Restore.)

Tables Need to Be Recreated

Warning

Do not attempt to upgrade your cluster if this cluster check is failing. Follow the instructions below to get this cluster check passing.

This check warns you if there are tables that need to be recreated for compatibility with future versions of CrateDB.

For tables that need recreating, use SHOW CREATE TABLE to get the SQL statement needed to restore the table, like so:

SHOW CREATE TABLE table_ident;

Here, table_ident is the name of the table you want to recreate.

Copy the output of this command to somewhere safe. You will need it to recreate the table.

Next, use COPY TO to export the table data to a file.

One way of doing that is like so:

COPY table_ident TO DIRECTORY '/data/backups';

This will create several JSON files with names like table_ident_0_.json in the /data/backups directory on your local node. (The directory you use for data export is up to you, but it must be writable by CrateDB.)

COPY TO also includes support for Amazon S3, gzip output, and more. See the command documentation for the full range of usage options.

Once you have both the table schema and table data export, you can use DROP TABLE to drop the table:

DROP TABLE table_ident;

Now, recreate the table by executing the CREATE TABLE command you previously copied.

Once created, re-import your data using COPY FROM.

If you exported your table like in the previous example, you could then import your data one file at a time, like so:

COPY table_ident FROM '/data/backups/table_ident_0_.json';

See the COPY FROM command documentation for more information.

When all tables that needed recreating have been exported, dropped, recreated, and reimported, this cluster check should pass.

Note

Snapshots of your tables created prior to them being recreated will not work with future versions of CrateDB. For this reason, you should create a new snapshot for each of your tables. (See Backup With Snapshot and Restore.)

Repositories

The table sys.repositories lists all configured repositories that can be used to create, manage and restore snapshots (see Backup With Snapshot and Restore).

Name Description Return Type
name The repository name String
type The type of the repository determining how and where the repository stores its snapshots. String
settings The configuration settings the repository has been created with. The specific settings depend on the repository type, see CREATE REPOSITORY. Object
cr> SELECT name, type, settings FROM sys.repositories
... ORDER BY name;
+---------+------+---------------------------------------------------...--+
| name    | type | settings                                               |
+---------+------+---------------------------------------------------...--+
| my_repo | fs   | {"compress": "true", "location": "repo_location", ...} |
+---------+------+---------------------------------------------------...--+
SELECT 1 row in set (... sec)

Snapshots

The table sys.snapshots lists all existing snapshots in all configured repositories (see Backup With Snapshot and Restore).

Name Description Return Type
name The name of the snapshot String
repository The name of the repository that contains this snapshot. String
concrete_indices The names of all tables and partitions contained in this snapshot, as represented by their Elasticsearch (ES) index names. Array
started The point in time when the creation of the snapshot started. Changes made after that are not stored in this snapshot. Timestamp
finished The point in time when the snapshot creation finished. Timestamp
state The current state of the snapshot. One of: IN_PROGRESS, SUCCESS, PARTIAL, or FAILED. String
version An internal version this snapshot was created with. String

Snapshot/Restore operates on a per-shard basis. Hence, the state column indicates whether all (SUCCESS), some (PARTIAL), or no shards (FAILED) have been backed up. PARTIAL snapshots are the result of some primary shards becoming unavailable while the snapshot is taken and no replicas being at hand (cluster state is RED). If there are replicas of the (now unavailable) primaries (cluster state is YELLOW), the snapshot succeeds and all shards are included (state SUCCESS). A subsequent snapshot building on a PARTIAL snapshot will include all primaries again.

Warning

In case of a PARTIAL state another snapshot should be created in order to guarantee a full backup! Only SUCCESS includes all shards.
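To find snapshots that should be retaken, you can filter on the state column, for example:

cr> select "repository", name, state from sys.snapshots
... where state <> 'SUCCESS';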

The concrete_indices column contains the names of all Elasticsearch indices that were stored in the snapshot. A normal CrateDB table maps to one Elasticsearch index, while a partitioned table maps to one Elasticsearch index per partition. The mapping follows this pattern:

CrateDB table / partition name concrete_indices entry
doc.my_table my_table
my_schema.my_table my_schema.my_table
doc.parted_table (value=null) .partitioned.my_table.0400
my_schema.parted_table (value=null) my_schema..partitioned.my_table.0400
cr> SELECT "repository", name, state, concrete_indices
... FROM sys.snapshots order by "repository", name;
+------------+-------------+---------+-----------------...-+
| repository | name        | state   | concrete_indices    |
+------------+-------------+---------+-----------------...-+
| my_repo    | my_snapshot | SUCCESS | [...]               |
+------------+-------------+---------+-----------------...-+
SELECT 1 row in set (... sec)

Summits

The sys.summits table contains information about the mountains in the Alps that are higher than 2000 m. The mountain names from the table are also used to generate random node names.
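For example, assuming the mountain and height columns, you could list the three highest summits like this:

cr> select mountain, height from sys.summits order by height desc limit 3;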

Users

The sys.users table contains all existing database users in the cluster. The table is only available in the CrateDB Enterprise Edition.

Name Description Return Type
name The name of the database user. String
superuser Boolean flag to indicate whether the user is a superuser. Boolean
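For example, to list all database users and whether each one is a superuser:

cr> select name, superuser from sys.users order by name;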