Create Linux application consistent disk snapshots


You can create application consistent snapshots of disks attached to Linux virtual machine (VM) instances. In general, the quality of your disk snapshot depends on how well your applications can recover from snapshots that you create during heavy write workloads. Application consistent snapshots capture the state of application data at the time of backup with all application transactions completed and all pending writes flushed to the disk.

To create snapshots that are application consistent, pause apps or operating system processes that write data to the disk, flush the disk buffers, and sync the file system before you create the snapshot. Depending on your application, these and other steps might be required to ensure that all application transactions are complete and captured in the backup.

To create an application consistent snapshot of your disks, use the following process:

  1. To prepare the guest environment for application consistency, create custom shell scripts to run before and after the snapshot is captured.
  2. Configure snapshot settings on your virtual machine (VM) instance.
  3. Create a snapshot with the guest-flush option enabled. The guest-flush option starts your pre and post snapshot scripts.

Before you begin

  • Create a Linux VM.
  • Update the guest environment.
  • If you haven't already, then set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init
    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI, then initialize it by running the following command:

      gcloud init

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles and permissions

To get the permissions that you need to manage standard snapshots, ask your administrator to grant you the following IAM roles on the project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to manage standard snapshots. The exact permissions that are required are listed in the following Required permissions section:

Required permissions

The following permissions are required to manage standard snapshots:

  • To create a snapshot of a zonal disk:
    • compute.snapshots.create on the project
    • compute.disks.createSnapshot on the disk
  • To create a snapshot of a regional disk using the data on the disk:
    • compute.snapshots.create on the project
    • compute.instances.useReadOnly on the source VM
    • compute.disks.createSnapshot on the disk
  • To create a snapshot of a regional disk from a replica recovery checkpoint:
    • compute.snapshots.create on the project
    • compute.disks.createSnapshot on the disk
  • To create a snapshot schedule: compute.resourcePolicies.create on the project or organization
  • To attach a snapshot schedule to a disk:
    • compute.disks.addResourcePolicies on the disk
    • compute.resourcePolicies.use on the resource policy
  • To delete a snapshot:
    • compute.snapshots.delete on the snapshot
    • compute.snapshots.list on the project

You might also be able to get these permissions with custom roles or other predefined roles.

Limitations

Creating application consistent snapshots on Linux has the following limitations:

  • Application consistency is guaranteed only by the behavior of your custom pre and post snapshot scripts, not by the snapshot operation itself.
  • When using the guest-flush option in your snapshot creation request, the snapshot isn't created if the script returns an error or reaches the timeout limit.

Create pre and post snapshot scripts

Before you proceed, update the guest environment so that you are running the latest software on your Linux VM.

To facilitate application consistency, create pre and post snapshot shell scripts to run before and after the snapshot is captured. Use the pre and post scripts for operations such as:

  • Pause apps or operating system processes running on the VM that write data to the disk.
  • Flush the disk buffers. For example, MySQL has a FLUSH statement. Use whichever tool is available for your application.
  • Sync your file system.

The following code example shows a pre snapshot script. Note the leading #! characters.

#!/bin/bash
# Freeze the file system at the given location so that no new writes reach the disk.
sudo fsfreeze -f [example-disk-location]

The following code example shows a post snapshot script. Note the leading #! characters.

#!/bin/bash
# Unfreeze the file system at the given location so that writes can resume.
sudo fsfreeze -u [example-disk-location]

You must save your scripts on your VM in the directory /etc/google/snapshots/. The full path of your pre script must be /etc/google/snapshots/pre.sh and the full path of your post script must be /etc/google/snapshots/post.sh.
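
As a rough sketch of how a pre snapshot script might combine these operations, the following script flushes file system buffers with sync and then freezes one file system, exiting with a non-zero status if either step fails. The mount point /mnt/disks/data and the function name prepare_for_snapshot are hypothetical and are not part of the guest environment.

```shell
#!/bin/bash
# Sketch of /etc/google/snapshots/pre.sh. The mount point is a
# hypothetical example; replace it with your own.
MOUNT_POINT="${MOUNT_POINT:-/mnt/disks/data}"

# Flush pending writes, then freeze the file system.
# Returns non-zero as soon as one step fails.
prepare_for_snapshot() {
  local mount_point="$1"
  sync || return 1
  sudo fsfreeze -f "$mount_point" || return 1
}

# The disk list is passed as the first argument, so run the steps
# only when the script is called with at least one argument.
if [ "$#" -gt 0 ]; then
  prepare_for_snapshot "$MOUNT_POINT" || exit 1
fi
```

A matching post script would call fsfreeze -u on the same mount point. Because the snapshot isn't created when a script returns an error, exiting with a non-zero status on failure keeps a partially prepared disk from being captured.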

Referencing specific disks in your scripts

The first argument passed to your pre and post snapshot scripts is a list of disks for which you are creating snapshots. You can use this argument in your scripts for various checks. For example, if your VM has multiple disks attached but you only specified one disk in your snapshot request, you can check which disk the snapshot is being created for.

The argument is formatted as follows:

  • SCSI-attached disks: a comma-separated list of <target/lun> pairs.
  • NVMe-attached disks: a comma-separated list of <nvme:namespace> pairs.

For example, your SCSI-attached boot disk might appear as 1/0 while an additional disk attached to the VM might appear as 2/0.
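
The following sketch shows one way a script might check whether a given disk is part of the snapshot request. The function name snapshot_includes_disk and the identifier 2/0 for a data disk are assumptions made for illustration; use the identifiers that your own VM reports.

```shell
#!/bin/bash
# Return 0 if the disk identifier in $2 (for example "2/0") appears in
# the comma-separated disk list in $1 (for example "1/0,2/0").
snapshot_includes_disk() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# "2/0" is a hypothetical identifier for a data disk.
DISK_LIST="${1:-}"
if snapshot_includes_disk "$DISK_LIST" "2/0"; then
  echo "Data disk 2/0 is part of this snapshot request."
fi
```

Wrapping both the list and the identifier in commas before matching avoids false positives, such as 2/0 matching inside 12/0.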

Edit your guest environment configuration file

Configure your application consistent snapshot settings by updating a specific configuration file on your VM.

  1. Open or create your guest environment configuration file.

    edit /etc/default/instance_configs.cfg
    
  2. Add the following section to the configuration file, then save your changes and exit the editor.

    [Snapshots]
    enabled = ENABLED
    timeout_in_seconds = TIMEOUT_SECONDS
    

    Replace the following:

    • ENABLED: Set to true to enable the application consistent snapshot feature. The default value is false.
    • TIMEOUT_SECONDS: The number of seconds the pre or post snapshot script can take to finish running before timing out. The integer value must be between 0 and 300. The default value is 60.

  3. Restart the Guest Agent to use the new configuration settings.

    $ sudo systemctl restart google-guest-agent.service
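
To double-check the values you saved, you could read them back from the configuration file. The following is a minimal sketch that assumes simple key = value lines inside the [Snapshots] section; the helper name snapshot_setting is hypothetical and is not part of the guest environment.

```shell
#!/bin/bash
# Print the value of a key from the [Snapshots] section of an INI-style
# configuration file. Prints nothing if the key is not set.
snapshot_setting() {
  local config_file="$1" key="$2"
  awk -F '=' -v key="$key" '
    /^\[/ { in_section = ($0 ~ /^\[Snapshots\]/); next }
    in_section {
      name = $1; value = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (name == key) print value
    }
  ' "$config_file"
}

CONFIG_FILE="/etc/default/instance_configs.cfg"
if [ -r "$CONFIG_FILE" ]; then
  echo "enabled = $(snapshot_setting "$CONFIG_FILE" enabled)"
  echo "timeout_in_seconds = $(snapshot_setting "$CONFIG_FILE" timeout_in_seconds)"
fi
```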
    

Create a snapshot with guest-flush enabled

Using the Google Cloud console, the Google Cloud CLI, or REST, create a snapshot with the guest-flush option enabled. With this option enabled, the pre and post snapshot scripts run before and after the snapshot is captured.

Console

  1. Go to the Create a Snapshot page in the Google Cloud console.

    Go to the Create a Snapshot page
  2. Enter a snapshot Name.
  3. Select a Snapshot type. The default is a STANDARD snapshot, which is the best option for long-term backup and disaster recovery.

    Choose Archive snapshot for more cost-efficient data retention.

  4. Optional: Enter a Description of the snapshot.
  5. Under Source disk, select the existing disk that you want to create a snapshot of.
  6. In the Location section, choose your snapshot storage location.

    The predefined or customized default location defined in your snapshot settings is automatically selected. Optionally, you can override the snapshot settings and store your snapshots in a custom storage location by doing the following:

    1. Choose the type of storage location that you want for your snapshot.

      • Choose Multi-regional for higher availability at a higher cost.
      • Choose Regional snapshots for more control over the physical location of your data at a lower cost.
    2. In the Select location field, select the specific region or multi-region that you want to use. To use the region or multi-region that is closest to your source disk, select Based on disk's location.
  7. Check the Enable application consistent snapshot option.
  8. Click Create to create the snapshot.

gcloud

You can create your snapshot in the storage location defined by your snapshot settings or in an alternative storage location of your choice. For more information, see Choose your snapshot storage location.

  • To create a snapshot in the predefined or customized default location configured in your snapshot settings, use the gcloud compute snapshots create command.

    gcloud compute snapshots create SNAPSHOT_NAME \
        --source-disk-zone=SOURCE_ZONE \
        --source-disk=SOURCE_DISK_NAME \
        --snapshot-type=SNAPSHOT_TYPE \
        --guest-flush
    
  • Alternatively, to override the snapshot settings and create a snapshot in a custom storage location, include the --storage-location flag to indicate where to store your snapshot.

    gcloud compute snapshots create SNAPSHOT_NAME \
        --source-disk-zone=SOURCE_ZONE \
        --source-disk=SOURCE_DISK_NAME \
        --snapshot-type=SNAPSHOT_TYPE \
        --storage-location=STORAGE_LOCATION \
        --guest-flush
    

    Replace the following:

    • SNAPSHOT_NAME: A name for the snapshot.
    • SOURCE_ZONE: The zone of the source disk.
    • SOURCE_DISK_NAME: The name of the disk volume from which you want to create a snapshot.
    • SNAPSHOT_TYPE: The snapshot type, either STANDARD or ARCHIVE. If a snapshot type is not specified, a STANDARD snapshot is created.
    • STORAGE_LOCATION: Optional: The Cloud Storage multi-region or the Cloud Storage region where you want to store your snapshot. You can specify only one storage location.

      Use the --storage-location parameter only when you want to override the predefined or customized default storage location configured in your snapshot settings.
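
If you create snapshots from a script, a small wrapper can fill in the placeholders for you. The following sketch assumes a timestamp-based naming scheme and STANDARD snapshots; the function names and the example disk and zone values are hypothetical.

```shell
#!/bin/bash
# Build a snapshot name such as "data-disk-20240131-220000" from a disk
# name and the current UTC time. The naming scheme is only an example.
snapshot_name_for() {
  printf '%s-%s\n' "$1" "$(date -u +%Y%m%d-%H%M%S)"
}

# Create a STANDARD snapshot with the guest-flush option enabled.
# Arguments: source disk name, source disk zone.
create_consistent_snapshot() {
  local disk="$1" zone="$2"
  gcloud compute snapshots create "$(snapshot_name_for "$disk")" \
      --source-disk-zone="$zone" \
      --source-disk="$disk" \
      --snapshot-type=STANDARD \
      --guest-flush
}

# Example call with hypothetical values:
# create_consistent_snapshot data-disk us-central1-a
```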

REST

You can create your snapshot in the storage location defined by your snapshot settings or in an alternative storage location of your choice. For more information, see Choose your snapshot storage location.

  • To create a snapshot in the predefined or customized default location configured in your snapshot settings, make a POST request to the snapshots.insert method:

    POST https://compute.googleapis.com/compute/beta/projects/DESTINATION_PROJECT_ID/global/snapshots
    {
      "name": "SNAPSHOT_NAME",
      "sourceDisk": "projects/SOURCE_PROJECT_ID/zones/SOURCE_ZONE/disks/SOURCE_DISK_NAME",
      "snapshotType": "SNAPSHOT_TYPE",
      "guestFlush": true
    }
    
  • Alternatively, to override the snapshot settings and create a snapshot in a custom storage location, make a POST request to the snapshots.insert method and include the storageLocations property in your request:

    POST https://compute.googleapis.com/compute/beta/projects/DESTINATION_PROJECT_ID/global/snapshots
    {
      "name": "SNAPSHOT_NAME",
      "sourceDisk": "projects/SOURCE_PROJECT_ID/zones/SOURCE_ZONE/disks/SOURCE_DISK_NAME",
      "snapshotType": "SNAPSHOT_TYPE",
      "storageLocations": [
          "STORAGE_LOCATION"
      ],
      "guestFlush": true
    }
    

Replace the following:

  • DESTINATION_PROJECT_ID: The ID of the project in which you want to create the snapshot.
  • SNAPSHOT_NAME: A name for the snapshot.
  • SOURCE_PROJECT_ID: The ID of the source disk project.
  • SOURCE_ZONE: The zone of the source disk.
  • SOURCE_DISK_NAME: The name of the disk from which you want to create a snapshot.
  • SNAPSHOT_TYPE: The snapshot type, either STANDARD or ARCHIVE. If a snapshot type is not specified, a STANDARD snapshot is created.
  • STORAGE_LOCATION: Optional: The Cloud Storage multi-region or the Cloud Storage region where you want to store your snapshot. You can specify only one storage location.

    Use the storageLocations parameter only when you want to override the predefined or customized default storage location configured in your snapshot settings.
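
One way to send this request from a shell is with curl, using an access token from the gcloud CLI. The following is a sketch under those assumptions: the function names are hypothetical, the snapshot type is fixed to STANDARD, and the default storage location from your snapshot settings is used.

```shell
#!/bin/bash
# Print the JSON request body for snapshots.insert with guestFlush set.
# Arguments: snapshot name, source project ID, source zone, source disk name.
snapshot_request_body() {
  printf '{"name": "%s", "sourceDisk": "projects/%s/zones/%s/disks/%s", "snapshotType": "STANDARD", "guestFlush": true}\n' \
    "$1" "$2" "$3" "$4"
}

# Send the request. Arguments: destination project ID, followed by the
# same four arguments as snapshot_request_body.
insert_snapshot() {
  local project="$1"
  shift
  curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      -d "$(snapshot_request_body "$@")" \
      "https://compute.googleapis.com/compute/beta/projects/${project}/global/snapshots"
}

# Example call with hypothetical values:
# insert_snapshot my-project snap-1 my-project us-central1-a data-disk
```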

Create a snapshot schedule with guest-flush enabled

Use scheduled snapshots to regularly and automatically back up your zonal and regional Persistent Disk and Google Cloud Hyperdisk volumes. If you want to schedule application consistent snapshots for your backup, use the --guest-flush option when you create the snapshot schedule so that the pre and post snapshot scripts run before and after each scheduled snapshot.

For example, after configuring your guest environment configuration file and creating custom scripts, the following command creates a schedule that takes an application consistent snapshot every 4 hours, starting at 22:00 (UTC):

gcloud compute resource-policies create snapshot-schedule SCHEDULE_NAME \
  --description "MY HOURLY SNAPSHOT SCHEDULE" \
  --start-time 22:00 \
  --hourly-schedule 4 \
  --guest-flush \
  --max-retention-days SNAPSHOT_RETENTION_AGE

To learn more, see About snapshot schedules for disks.
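
A schedule applies to a disk only after it is attached to that disk. As a minimal sketch, assuming a zonal disk and a schedule created in the same region as the disk, the following wrapper calls the gcloud compute disks add-resource-policies command. The function name and the example values are hypothetical.

```shell
#!/bin/bash
# Attach an existing snapshot schedule to a zonal disk.
# Arguments: disk name, schedule name, disk zone.
attach_snapshot_schedule() {
  gcloud compute disks add-resource-policies "$1" \
      --resource-policies="$2" \
      --zone="$3"
}

# Example call with hypothetical values:
# attach_snapshot_schedule data-disk my-schedule us-central1-a
```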

Troubleshooting

Troubleshoot the snapshot schedule and creation process by reviewing logs and checking configurations.

Review the logs

  1. Go to the Logs Explorer page in the Google Cloud console:

    Go to Logs Explorer

  2. Paste the following query in the Log query pane:

    resource.type="gce_disk"
    jsonPayload.event_subtype="compute.disks.createSnapshot" OR
    protoPayload.methodName="ScheduledSnapshots"
    
  3. Run the query and investigate the logs.
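
If you prefer the command line, the same query can be run with the gcloud logging read command. The following sketch writes the filter on one line with explicit AND and parentheses; the function name, the default limit of 20 entries, and the JSON output format are arbitrary choices made for illustration.

```shell
#!/bin/bash
# Log filter from the query above, written on one line.
SNAPSHOT_LOG_FILTER='resource.type="gce_disk" AND (jsonPayload.event_subtype="compute.disks.createSnapshot" OR protoPayload.methodName="ScheduledSnapshots")'

# Print the most recent matching log entries.
# Argument: maximum number of entries (defaults to 20).
read_snapshot_logs() {
  gcloud logging read "$SNAPSHOT_LOG_FILTER" --limit="${1:-20}" --format=json
}

# Example call:
# read_snapshot_logs 5
```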

Check configurations

What's next