Re: [che-dev] Che operator: backup / restore

After reviewing the doc and discussing with David F., we agree with what Sergii K. is proposing. Good practice is to have
a separate CRD to trigger the backup or restore (CheClusterBackup/CheClusterRestore, for example).

Examples of such mechanisms can be found in the etcd [1] and infinispan [2] operators.

[1] https://operatorhub.io/operator/etcd
[2] https://operatorhub.io/operator/infinispan
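
To make the idea of a separate backup CR concrete, here is a minimal sketch in Go. It is not the operator's actual API: the field names (cheClusterName, repository, credentialsSecret), the helper NewBackupRequest and the apiVersion value are assumptions for illustration, and plain structs stand in for the Kubernetes metadata types so the sketch stays self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CheClusterBackupSpec is a hypothetical spec for a dedicated backup CR.
// None of these fields come from the thread; they only illustrate the shape.
type CheClusterBackupSpec struct {
	// CheClusterName names the CheCluster CR to back up (assumed field).
	CheClusterName string `json:"cheClusterName"`
	// Repository is the destination storage location (assumed field).
	Repository string `json:"repository"`
	// CredentialsSecret names the secret with storage credentials (assumed field).
	CredentialsSecret string `json:"credentialsSecret"`
}

// CheClusterBackup is a simplified stand-in for a real custom resource type.
type CheClusterBackup struct {
	APIVersion string               `json:"apiVersion"`
	Kind       string               `json:"kind"`
	Metadata   map[string]string    `json:"metadata"`
	Spec       CheClusterBackupSpec `json:"spec"`
}

// NewBackupRequest builds a backup CR; creating such an object would be the trigger.
func NewBackupRequest(name, cluster, repo, secret string) CheClusterBackup {
	return CheClusterBackup{
		APIVersion: "org.eclipse.che/v1", // assumed group/version
		Kind:       "CheClusterBackup",
		Metadata:   map[string]string{"name": name},
		Spec: CheClusterBackupSpec{
			CheClusterName:    cluster,
			Repository:        repo,
			CredentialsSecret: secret,
		},
	}
}

func main() {
	b := NewBackupRequest("nightly", "eclipse-che", "sftp:backup@storage.example.com:/srv/che", "backup-creds")
	out, err := json.MarshalIndent(b, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With this shape, each backup or restore is its own object with its own status, and the CheCluster CR stays free of one-shot trigger fields.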

On Mon, Mar 1, 2021 at 2:10 PM Mykola Morhun <mmorhun@xxxxxxxxxx> wrote:
@Sergii Kabashniuk
It doesn't have to be in the same CR, but since we have a single CR for the Che server, Che Keycloak, Postgres and other components, it is consistent to keep everything in one place.

@Mario Loriedo thank you for your feedback. To me, the requirement to automatically roll back a failed update sounds like a different task.
The issue [1] is part of epic [2], whose goal is to support level 3 of operator capabilities [3], which says:

- Operator provides the ability to create backups of the Operand
- Operator is able to restore a backup of an Operand

So, should we stop the backup / restore issue for now and switch to the update fail problem?



On Wed, Feb 24, 2021 at 3:03 PM Mario Loriedo <mario.loriedo@xxxxxxxxx> wrote:


On Tue, Feb 23, 2021 at 11:32 AM Mykola Morhun <mmorhun@xxxxxxxxxx> wrote:
Hello all.
The Deploy team is working on a backup/restore feature for the Che operator.
Backup/restore should be triggered by setting a CR field to "backup" or "restore". The destination storage server (another field) and the credentials for it (a secret) also have to be configured.

The requirement is to automatically roll back an update that failed for any reason.

There is no requirement to trigger a backup/restore. This is out of scope.
 

The order of the backup procedure is planned to be:
 1. Ensure Che is up and running and backup storage with credentials is configured
 2. Gather all resources to back up:
  - Postgres databases
  - Che cluster CR and CRD
  - Che-related secrets and CA configmaps
 3. Send the resources to an external backup/storage server
 
The operator can get the YAML of all needed resources directly from the cluster, except for the databases. To back up the databases, we plan to exec into the Postgres pod and dump them.
To send and retrieve the collected data, we plan to use the restic [1] CLI tool:
 - open source under the BSD 2-Clause license
 - written in Go, no external dependencies
 - supports many destinations: AWS, Azure, GCS, OpenStack Swift, NFS, SSH, SFTP and many more
 - used as an extension in Velero to deliver backups to storage servers.
The only downside we see is that we have to include the binary in the operator image, which will increase its size by ~20 MB.
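
As a rough illustration of the commands involved, the Go sketch below only builds and prints the command lines; it does not run them. The helper names, database name, user, paths and repository URL are assumptions. The pg_dump and restic subcommands and flags shown (-U, -r, backup, restore, --target) are standard ones, and restic additionally expects an initialized repository (restic init) and a password, for example via the RESTIC_PASSWORD environment variable.

```go
package main

import (
	"fmt"
	"strings"
)

// PgDumpArgs returns the command that would run inside the Postgres pod.
func PgDumpArgs(user, database string) []string {
	return []string{"pg_dump", "-U", user, database}
}

// ResticBackupArgs returns the command that uploads a local path to a repository.
func ResticBackupArgs(repo, path string) []string {
	return []string{"restic", "-r", repo, "backup", path}
}

// ResticRestoreArgs returns the command that restores a snapshot into target.
// The snapshot may be an ID or "latest".
func ResticRestoreArgs(repo, snapshot, target string) []string {
	return []string{"restic", "-r", repo, "restore", snapshot, "--target", target}
}

func main() {
	// All values below are placeholders for illustration only.
	repo := "sftp:backup@storage.example.com:/srv/che-backups"
	fmt.Println(strings.Join(PgDumpArgs("postgres", "dbche"), " "))
	fmt.Println(strings.Join(ResticBackupArgs(repo, "/tmp/che-backup"), " "))
	fmt.Println(strings.Join(ResticRestoreArgs(repo, "latest", "/tmp/che-restore"), " "))
}
```

In the operator these argument lists would be passed to a pod exec call and to os/exec respectively; keeping them as plain slices avoids shell quoting problems.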

If anyone has concerns / suggestions regarding this topic, please let us know.

[1] https://restic.net/

--

Mykola Morhun

Software engineer

Red Hat

_______________________________________________
che-dev mailing list
che-dev@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/che-dev
