YouTube Content Owner transfers

The BigQuery Data Transfer Service for YouTube Content Owner connector lets you automatically schedule and manage recurring load jobs for YouTube Content Owner reports.

Supported reports

The BigQuery Data Transfer Service supports the following reporting options for YouTube Content Owner reports:

  • Supported API version: June 18, 2018
  • Repeat frequency: daily, around 14:45 UTC. You can configure the time of day.
  • Refresh window: last 1 day. Not configurable.
  • Maximum backfill duration: 30 days. As of July 2018, YouTube reports containing historical data are available for 30 days from the time that they are generated. Reports that contain non-historical data are available for 60 days from the time that they are generated. For more information, see Historical data in the YouTube Reporting API documentation.

For information on how YouTube Content Owner reports are transformed into BigQuery tables and views, see YouTube Content Owner report transformations.

Data ingestion from YouTube Content Owner transfers

When you transfer data from YouTube Content Owner reports into BigQuery, the data is loaded into BigQuery tables that are partitioned by date. The table partition that the data is loaded into corresponds to the date from the data source. If you schedule multiple transfers for the same date, BigQuery Data Transfer Service overwrites the partition for that specific date with the latest data. Multiple transfers in the same day or running backfills don't result in duplicate data, and partitions for other dates are not affected.
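
To verify this behavior, you can inspect the destination table's partitions. The following is a minimal sketch using the BigQuery Java client library and the INFORMATION_SCHEMA.PARTITIONS view; the project, dataset, and table names are placeholders.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

// Sketch: list the partitions of a transfer-created table to confirm that
// rerunning a transfer for a date updates that date's partition only.
public class ListReportTablePartitions {

  public static void main(String[] args) throws InterruptedException {
    // Placeholder names; substitute your own project, dataset, and
    // transfer-created table.
    String query =
        "SELECT partition_id, total_rows, last_modified_time\n"
            + "FROM `MY_PROJECT_ID.mydataset.INFORMATION_SCHEMA.PARTITIONS`\n"
            + "WHERE table_name = 'REPORT_TABLE_NAME'\n"
            + "ORDER BY partition_id";
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    TableResult results = bigquery.query(QueryJobConfiguration.newBuilder(query).build());
    results
        .iterateAll()
        .forEach(
            row ->
                System.out.println(
                    row.get("partition_id").getStringValue()
                        + ": "
                        + row.get("total_rows").getLongValue()
                        + " rows"));
  }
}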

Refresh windows

A refresh window is the number of days of data that a data transfer retrieves each time it runs. For example, if the refresh window is three days and a daily transfer occurs, the BigQuery Data Transfer Service retrieves all data from your source table from the past three days. When the daily transfer occurs, the BigQuery Data Transfer Service creates a new BigQuery destination table partition with a copy of your source table data from the current day, then automatically triggers backfill runs to update the BigQuery destination table partitions with your source table data from the past two days. The automatically triggered backfill runs either overwrite or incrementally update your BigQuery destination table, depending on whether the connector supports incremental updates.

When you run a data transfer for the first time, the data transfer retrieves all source data available within the refresh window. For example, if the refresh window is three days and you run the data transfer for the first time, the BigQuery Data Transfer Service retrieves all available source data from the past three days.

Refresh windows are mapped to the TransferConfig.data_refresh_window_days API field.
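
For illustration, the following minimal sketch sets that field through the Java client library. Note that the YouTube Content Owner connector's refresh window is fixed at one day, so this setting only takes effect for connectors with a configurable window; the dataset and display names are placeholders.

import com.google.cloud.bigquery.datatransfer.v1.TransferConfig;

// Sketch: the refresh window maps to the data_refresh_window_days field
// on TransferConfig.
public class RefreshWindowExample {

  public static void main(String[] args) {
    TransferConfig config =
        TransferConfig.newBuilder()
            .setDestinationDatasetId("MY_DATASET_ID") // placeholder
            .setDisplayName("My Transfer") // placeholder
            // Illustrative only: the YouTube Content Owner connector's
            // window is fixed at one day and cannot be changed.
            .setDataRefreshWindowDays(3)
            .build();
    System.out.println("data_refresh_window_days = " + config.getDataRefreshWindowDays());
  }
}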

To retrieve data outside the refresh window, such as historical data, or to recover data from any transfer outages or gaps, you can initiate or schedule a backfill run.
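
As a rough sketch, you can also start a backfill programmatically with the startManualTransferRuns method of the Java client library; the transfer configuration resource name below is a placeholder for an existing configuration.

import com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient;
import com.google.cloud.bigquery.datatransfer.v1.StartManualTransferRunsRequest;
import com.google.cloud.bigquery.datatransfer.v1.StartManualTransferRunsResponse;
import com.google.protobuf.Timestamp;
import java.io.IOException;
import java.time.Instant;

// Sketch: schedule backfill runs over a time range for an existing
// transfer configuration.
public class StartBackfillRuns {

  public static void main(String[] args) throws IOException {
    // Placeholder: the full resource name of an existing transfer configuration.
    String configName = "projects/MY_PROJECT_ID/transferConfigs/MY_CONFIG_ID";
    // Backfill the past five days, within the 30-day maximum backfill duration.
    Instant end = Instant.now();
    Instant start = end.minusSeconds(5L * 24 * 60 * 60);
    try (DataTransferServiceClient client = DataTransferServiceClient.create()) {
      StartManualTransferRunsRequest request =
          StartManualTransferRunsRequest.newBuilder()
              .setParent(configName)
              .setRequestedTimeRange(
                  StartManualTransferRunsRequest.TimeRange.newBuilder()
                      .setStartTime(Timestamp.newBuilder().setSeconds(start.getEpochSecond()))
                      .setEndTime(Timestamp.newBuilder().setSeconds(end.getEpochSecond()))
                      .build())
              .build();
      StartManualTransferRunsResponse response = client.startManualTransferRuns(request);
      System.out.println("Scheduled " + response.getRunsCount() + " backfill run(s).");
    }
  }
}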

Limitations

  • The maximum supported file size for each report is 1710 GB.
  • The minimum frequency that you can schedule a data transfer for is once every 24 hours. By default, a data transfer starts at the time that you create the data transfer. However, you can configure the transfer start time when you set up your transfer.
  • The BigQuery Data Transfer Service does not support incremental data transfers during a YouTube Content Owner transfer. When you specify a date for a data transfer, all of the data that is available for that date is transferred.

Before you begin

Before you create a YouTube Content Owner data transfer, ensure that you meet the requirements described in this section.

Required permissions

Ensure that the person creating the data transfer has the following required permissions:

  • BigQuery:

    • bigquery.transfers.update permission to create the data transfer
    • Both bigquery.datasets.get and bigquery.datasets.update permissions on the target dataset

    The bigquery.admin predefined IAM role includes the bigquery.transfers.update, bigquery.datasets.update, and bigquery.datasets.get permissions. For more information on IAM roles in the BigQuery Data Transfer Service, see Access control.

  • YouTube:

    • YouTube Content Manager or YouTube Content Owner.

    A Content Manager is granted rights to administer YouTube content for a Content Owner. A Content Owner is an umbrella account that owns one or more YouTube channels and the videos on those channels.

    • Hide revenue data is unchecked in YouTube Content Owner report settings.

    For revenue-related reports to transfer, the YouTube reports permission setting Hide revenue data must be unchecked for the user who creates the transfer.

Set up a YouTube Content Owner transfer

Setting up a YouTube Content Owner data transfer requires the following:

  • Content Owner ID: Provided by YouTube. When you log in to YouTube as a Content Owner or Manager, your ID appears in the URL after o=. For example, if the URL is https://studio.youtube.com/owner/AbCDE_8FghIjK?o=AbCDE_8FghIjK, the Content Owner ID is AbCDE_8FghIjK. To select a different Content Manager account, see Sign in to a Content Manager account or YouTube Channel Switcher. For more information on creating and managing your Content Manager account, see Configure Content Manager account settings.
  • Table Suffix: A user-friendly name for the channel provided by you when you set up the transfer. The suffix is appended to the job ID to create the table name, for example reportTypeId_suffix. The suffix is used to prevent separate data transfers from writing to the same tables. The table suffix must be unique across all transfers that load data into the same dataset, and the suffix should be short to minimize the length of the resulting table name.

If you use the YouTube Reporting API and have existing reporting jobs, the BigQuery Data Transfer Service loads your report data. If you don't have existing reporting jobs, setting up the data transfer automatically enables YouTube reporting jobs.

To set up a YouTube Content Owner data transfer:

Console

  1. Go to the BigQuery page in the Google Cloud console. Ensure that you are signed in to the account as either the Content Owner or Content Manager.

    Go to the BigQuery page

  2. Click Transfers.

  3. Click Create Transfer.

  4. On the Create Transfer page:

    • In the Source type section, for Source, choose YouTube Content Owner.

    • In the Transfer config name section, for Display name, enter a name for the data transfer such as My Transfer. The transfer name can be any value that lets you identify the transfer if you need to modify it later.

    • In the Schedule options section:

      • For Repeat frequency, choose an option for how often to run the data transfer. If you select Days, provide a valid time in UTC.

        • Hours
        • Days
        • On-demand
      • If applicable, select either Start now or Start at set time and provide a start date and run time.

    • In the Destination settings section, for Destination dataset, choose the dataset you created to store your data.

    • In the Data source details section:

      • For Content owner ID, enter your Content Owner ID.
      • For Table suffix, enter a suffix such as MT.

    • In the Service Account menu, select a service account from the service accounts associated with your Google Cloud project. You can associate a service account with your data transfer instead of using your user credentials. For more information about using service accounts with data transfers, see Use service accounts.

      • If you signed in with a federated identity, then a service account is required to create a data transfer. If you signed in with a Google Account, then a service account for the data transfer is optional.
      • The service account must have the required permissions.
    • (Optional) In the Notification options section:

      • Click the toggle to enable email notifications. When you enable this option, the transfer administrator receives an email notification when a data transfer run fails.
      • For Select a Pub/Sub topic, choose your topic name or click Create a topic. This option configures Pub/Sub run notifications for your transfer.
  5. Click Save.

  6. If this is your first time signing in to the account, select an account and then click Allow. Select the same account where you are the Content Owner or Content Manager.

bq

Enter the bq mk command and supply the transfer creation flag --transfer_config. The following flags are also required:

  • --data_source
  • --target_dataset
  • --display_name
  • --params

Optional flags:

  • --service_account_name - Specifies a service account to use for Content Owner transfer authentication instead of your user account.

bq mk \
--transfer_config \
--project_id=project_id \
--target_dataset=dataset \
--display_name=name \
--params='parameters' \
--data_source=data_source \
--service_account_name=service_account_name

Where:

  • project_id is your project ID.
  • dataset is the target dataset for the transfer configuration.
  • name is the display name for the transfer configuration. The data transfer name can be any value that lets you identify the transfer if you need to modify it later.
  • parameters contains the parameters for the created transfer configuration in JSON format. For example: --params='{"param":"param_value"}'. For YouTube Content Owner data transfers, you must supply the content_owner_id and table_suffix parameters. You may optionally set the configure_jobs parameter to true to allow the BigQuery Data Transfer Service to manage YouTube reporting jobs for you. If there are YouTube reports that don't exist for your account, new reporting jobs are created to enable them.
  • data_source is the data source: youtube_content_owner.
  • service_account_name is the service account name used to authenticate your data transfer. The service account should be owned by the same project_id used to create the transfer and it should have all of the required permissions.

You can also supply the --project_id flag to specify a particular project. If --project_id isn't specified, the default project is used.

For example, the following command creates a YouTube Content Owner data transfer named My Transfer using content owner ID AbCDE_8FghIjK, table suffix MT, and target dataset mydataset. The data transfer is created in the default project:

bq mk \
--transfer_config \
--target_dataset=mydataset \
--display_name='My Transfer' \
--params='{"content_owner_id":"abCDE_8FghIjK","table_suffix":"MT","configure_jobs":"true"}' \
--data_source=youtube_content_owner

API

Use the projects.locations.transferConfigs.create method and supply an instance of the TransferConfig resource.

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.bigquery.datatransfer.v1.CreateTransferConfigRequest;
import com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient;
import com.google.cloud.bigquery.datatransfer.v1.ProjectName;
import com.google.cloud.bigquery.datatransfer.v1.TransferConfig;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sample to create a YouTube Content Owner transfer config
public class CreateYoutubeContentOwnerTransfer {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    final String projectId = "MY_PROJECT_ID";
    String datasetId = "MY_DATASET_ID";
    String contentOwnerId = "MY_CONTENT_OWNER_ID";
    String tableSuffix = "_test";
    Map<String, Value> params = new HashMap<>();
    params.put("content_owner_id", Value.newBuilder().setStringValue(contentOwnerId).build());
    params.put("table_suffix", Value.newBuilder().setStringValue(tableSuffix).build());
    TransferConfig transferConfig =
        TransferConfig.newBuilder()
            .setDestinationDatasetId(datasetId)
            .setDisplayName("Your Youtube Owner Channel Config Name")
            .setDataSourceId("youtube_content_owner")
            .setParams(Struct.newBuilder().putAllFields(params).build())
            .build();
    createYoutubeContentOwnerTransfer(projectId, transferConfig);
  }

  public static void createYoutubeContentOwnerTransfer(
      String projectId, TransferConfig transferConfig) throws IOException {
    try (DataTransferServiceClient client = DataTransferServiceClient.create()) {
      ProjectName parent = ProjectName.of(projectId);
      CreateTransferConfigRequest request =
          CreateTransferConfigRequest.newBuilder()
              .setParent(parent.toString())
              .setTransferConfig(transferConfig)
              .build();
      TransferConfig config = client.createTransferConfig(request);
      System.out.println(
          "YouTube Content Owner transfer created successfully: " + config.getName());
    } catch (ApiException ex) {
      System.out.println("YouTube Content Owner transfer was not created: " + ex.toString());
    }
  }
}

Query your data

When your data is transferred to BigQuery, the data is written to ingestion-time partitioned tables. For more information, see Partitioned tables.

If you query your tables directly instead of using the auto-generated views, you must use the _PARTITIONTIME pseudocolumn in your query. For more information, see Querying partitioned tables.
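
For example, the following minimal sketch queries a transfer-created table with the BigQuery Java client library; the project, dataset, and table names are placeholders for the reportTypeId_suffix table created by your transfer.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

// Sketch: count rows per date partition, using the _PARTITIONTIME
// pseudocolumn to limit the scan to recent partitions.
public class QueryReportTable {

  public static void main(String[] args) throws InterruptedException {
    String query =
        "SELECT DATE(_PARTITIONTIME) AS partition_date, COUNT(*) AS row_count\n"
            + "FROM `MY_PROJECT_ID.mydataset.REPORT_TABLE_NAME`\n"
            + "WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)\n"
            + "GROUP BY partition_date\n"
            + "ORDER BY partition_date";
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    TableResult results = bigquery.query(QueryJobConfiguration.newBuilder(query).build());
    results
        .iterateAll()
        .forEach(
            row ->
                System.out.println(
                    row.get("partition_date").getStringValue()
                        + ": "
                        + row.get("row_count").getLongValue()));
  }
}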

Troubleshoot YouTube Content Owner transfer setup

If you are having issues setting up your data transfer, see YouTube transfer issues in Troubleshooting transfer configurations.