Insert Performance

Tip

If ingestion performance is an important metric for you, and you can do your inserts in batches, you should switch from Single Inserts to Bulk Inserts.

Single Inserts

Single inserts (see Inserting Data) are typically very fast with CrateDB. A small cluster can easily handle several thousand single inserts per second.

However, single inserts generate a lot of internal network traffic, because every insert is applied to the primary shard and then communicated individually, in parallel, to every configured replica shard. In addition, CrateDB does not return a response to an insert request until all replica shards have been updated.
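To make the per-row cost concrete, here is a minimal sketch of single inserts sent over the SQL HTTP endpoint, where each row becomes its own request body with an `args` list. The node address `http://localhost:4200/_sql`, the table `sensor_readings`, its columns, and the helper names are assumptions chosen for illustration only.

```python
import json
import urllib.request

# Assumption: a local CrateDB node listening on the default HTTP port.
CRATE_URL = "http://localhost:4200/_sql"


def single_insert_payloads(stmt, rows):
    """Build one request body per row, each carrying its own "args" list."""
    return [{"stmt": stmt, "args": list(row)} for row in rows]


def post_sql(payload, url=CRATE_URL):
    """POST one JSON body to the HTTP endpoint and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical table and columns, used only for this example.
stmt = "INSERT INTO sensor_readings (sensor_id, reading) VALUES (?, ?)"
rows = [(1, 20.5), (2, 21.0), (3, 19.8)]

payloads = single_insert_payloads(stmt, rows)

# Sending them costs one round trip, and one replication step, per row:
# for payload in payloads:
#     post_sql(payload)
```

With three rows, this produces three separate requests, and CrateDB replicates and acknowledges each of them individually.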

If you can batch up your inserts, switching to Bulk Inserts will dramatically improve ingestion performance.

Bulk Inserts

If you use the SQL HTTP Endpoint, you can insert multiple records at once by making use of Bulk Operations. Our benchmarking indicates that you can expect to see at least a twofold increase in ingestion performance.

Bulk inserts still generate internal network traffic, and CrateDB still waits until all replicas have been updated before returning a response. But because inserts are batched up, less internal communication and synchronisation is needed. In addition, the bulk query only needs to be parsed, planned, and executed once.
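As a rough sketch of what batching looks like in practice, the following builds bulk requests for the SQL HTTP endpoint: one statement with a `bulk_args` list that holds one parameter list per row. The node address, the table `sensor_readings`, its columns, the helper names, and the batch size are assumptions for illustration; a suitable batch size depends on your data and cluster and should be measured.

```python
import json
import urllib.request

# Assumption: a local CrateDB node listening on the default HTTP port.
CRATE_URL = "http://localhost:4200/_sql"


def chunked(rows, batch_size):
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def bulk_insert_payload(stmt, rows):
    """Build a single request body that inserts every row in the batch."""
    return {"stmt": stmt, "bulk_args": [list(row) for row in rows]}


def post_sql(payload, url=CRATE_URL):
    """POST one JSON body to the HTTP endpoint and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical table and columns, used only for this example.
stmt = "INSERT INTO sensor_readings (sensor_id, reading) VALUES (?, ?)"
rows = [(i, 20.0 + i) for i in range(5)]

# Batch size of 2 is only to keep the example small.
payloads = [bulk_insert_payload(stmt, batch) for batch in chunked(rows, 2)]

# One round trip per batch instead of one per row:
# for payload in payloads:
#     post_sql(payload)
```

Here five rows turn into three requests instead of five, and the statement text is sent once per batch rather than once per row.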