Backup and restore

Tip: For a full step-by-step working example, see this well-written article, which also explains in more depth how CassKop backup and restore works behind the scenes.

To provide backup and restore abilities, we use Instaclustr's cassandra-sidecar project and add it to each Cassandra node we spawn. We want to thank Instaclustr for the modifications they made to make it work with CassKop!

Backup

It is possible to back up keyspaces or tables from a cluster managed by CassKop. To start or schedule a backup, you create an object of type CassandraBackup:

apiVersion: db.orange.com/v2
kind: CassandraBackup
metadata:
  name: nightly-cassandra-backup
  labels:
    app: cassandra
spec:
  cassandraCluster: test-cluster
  datacenter: dc1
  storageLocation: s3://cassie
  snapshotTag: SnapshotTag2
  secret: cloud-backup-secrets
  schedule: "@midnight"
  entities: k1.t1,k2.t3

If no schedule is defined, the backup starts as soon as the object is created and will not run again with that object. You can always delete the object and recreate it to run the backup again.
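As a sketch of the one-shot case, the manifest below is the example above with the schedule field left out. The object name is hypothetical; every other value is copied from the example.

```yaml
apiVersion: db.orange.com/v2
kind: CassandraBackup
metadata:
  name: one-shot-cassandra-backup   # hypothetical name, chosen for illustration
  labels:
    app: cassandra
spec:
  cassandraCluster: test-cluster
  datacenter: dc1
  storageLocation: s3://cassie
  snapshotTag: SnapshotTag2
  secret: cloud-backup-secrets
  # No schedule field: the backup starts once, as soon as the object is created.
  entities: k1.t1,k2.t3
```

To run the same backup a second time, delete this object and create it again.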

Supported storage

The following storage backends are supported for storing backups:

  • s3 (as in the example above)
  • gcp
  • azure
  • oracle cloud

More details can be found on Instaclustr's Cassandra backup page.

Life cycle of the CassandraBackup object

When this object gets created, CassKop does a few checks to ensure:

  • The specified Cassandra cluster exists
  • The secret, if one is specified, has the parameters expected by the chosen backend
  • The schedule, if one is specified, has a valid format (cron expression, predefined schedule, or interval)

If all those checks pass, CassKop either triggers the backup right away (when there is no schedule) or creates a cron task with the specified schedule.
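The three schedule formats named above can be illustrated with the fragment below. This is a sketch under an assumption: that the schedule string follows the syntax of the robfig/cron library for Go, whose documentation uses the same three category names. Only "@midnight" appears in the example earlier on this page. The interval and cron values are illustrative, and whether a cron expression takes five fields or a leading seconds field as well depends on the parser in use.

```yaml
spec:
  # Predefined schedule, as in the example earlier on this page.
  schedule: "@midnight"

  # Interval (assumed syntax): run every 1 hour 30 minutes.
  # schedule: "@every 1h30m"

  # Cron expression (assumed 5-field syntax): run every day at 02:30.
  # schedule: "30 2 * * *"
```

Only one schedule can be set per CassandraBackup object, which is why the alternatives are commented out.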

When this object gets deleted, if there is a scheduled task, it is unscheduled.

When this object gets updated, and the change is located in the spec section, CassKop unschedules the existing task and schedules a new one with the new parameters provided.

Restore

Following the same logic, a CassandraRestore object must be created to trigger a restore, and it must refer to an existing CassandraBackup object in K8S:

apiVersion: db.orange.com/v2
kind: CassandraRestore
metadata:
  name: nightly-cassandra-backup
  labels:
    app: cassandra
spec:
  cassandraBackup: nightly-cassandra-backup
  cassandraCluster: test-cluster
  entities: k1.t1

Rename

It is possible to restore the content of tables into other existing tables. Here is an example:

apiVersion: db.orange.com/v2
kind: CassandraRestore
metadata:
  name: nightly-cassandra-backup
  labels:
    app: cassandra
spec:
  cassandraBackup: nightly-cassandra-backup
  cassandraCluster: test-cluster
  entities: k1.t1
  rename:
    k1.t1: k1.t2

With the object above, the content of table k1.t1 is restored into k1.t2 from the backup nightly-cassandra-backup.

Entities

In the restore phase, you can specify a subset of the entities specified in the backup. For instance, you can back up two tables and restore only one of them.
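To illustrate, the backup nightly-cassandra-backup from the Backup section covers k1.t1 and k2.t3. A restore limited to k2.t3 might look like the sketch below; the object name is hypothetical.

```yaml
apiVersion: db.orange.com/v2
kind: CassandraRestore
metadata:
  name: restore-k2-t3-only   # hypothetical name, chosen for illustration
  labels:
    app: cassandra
spec:
  cassandraBackup: nightly-cassandra-backup   # backup that holds k1.t1 and k2.t3
  cassandraCluster: test-cluster
  entities: k2.t3   # subset of the backed-up entities; k1.t1 is left untouched
```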

Datacenter

A datacenter can be specified in a backup or a restore to declare where data must be backed up or restored. If it is not specified, the operation runs in every datacenter, and any entities specified must exist there. Specifying a datacenter in a restore declares where data will be restored, but because Icarus truncates the entities it restores, it does not prevent the truncate from wiping data in the datacenters that were not chosen.
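A restore limited to one datacenter could be sketched as follows. This assumes the CassandraRestore spec accepts the same datacenter field shown in the CassandraBackup example; the field name on the restore side and the object name are assumptions, not something this page confirms.

```yaml
apiVersion: db.orange.com/v2
kind: CassandraRestore
metadata:
  name: restore-into-dc1   # hypothetical name, chosen for illustration
  labels:
    app: cassandra
spec:
  cassandraBackup: nightly-cassandra-backup
  cassandraCluster: test-cluster
  datacenter: dc1   # assumed field name, mirroring the CassandraBackup spec
  entities: k1.t1   # still truncated before the restore, including outside dc1
```

Because of the truncate, a manifest like this should be treated as destructive for k1.t1 across the whole cluster, not only in dc1.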