PARTITIONS view

The INFORMATION_SCHEMA.PARTITIONS view contains one row for each partition.

Querying the INFORMATION_SCHEMA.PARTITIONS view is limited to 1000 tables. To get the data about partitions at the project level, you can split the query into multiple queries and then join the results. If you exceed the limit, you can encounter an error similar to the following:

INFORMATION_SCHEMA.PARTITIONS query attempted to read too many tables. Please add more restrictive filters.

Required permissions

To query the INFORMATION_SCHEMA.PARTITIONS view, you need the following Identity and Access Management (IAM) permissions:

  • bigquery.tables.get
  • bigquery.tables.getData
  • bigquery.tables.list

Each of the following predefined IAM roles includes the preceding permissions:

  • roles/bigquery.admin
  • roles/bigquery.dataEditor
  • roles/bigquery.dataViewer

For more information about BigQuery permissions, see Access control with IAM.

Schema

When you query the INFORMATION_SCHEMA.PARTITIONS view, the query results typically contain one row for each partition. The exception is when there is a combination of long-term and active storage tier data in the __UNPARTITIONED__ partition. In that case, the view returns two rows for the __UNPARTITIONED__ partition, one for each storage tier.

The INFORMATION_SCHEMA.PARTITIONS view has the following schema:

Column name Data type Value
TABLE_CATALOG STRING The project ID of the project that contains the table
TABLE_SCHEMA STRING The name of the dataset that contains the table, also referred to as the datasetId
TABLE_NAME STRING The name of the table, also referred to as the tableId
PARTITION_ID STRING A single partition's ID. For unpartitioned tables, the value is NULL. For partitioned tables that contain rows with NULL values in the partitioning column, the value is __NULL__.
TOTAL_ROWS INTEGER The total number of rows in the partition
TOTAL_LOGICAL_BYTES INTEGER The total number of logical bytes in the partition
LAST_MODIFIED_TIME TIMESTAMP The time when the data was most recently written to the partition
STORAGE_TIER STRING The partition's storage tier:

Scope and syntax

Queries against this view must include a dataset qualifier. For queries with a dataset qualifier, you must have permissions for the dataset. For more information see Syntax. The following table explains the region and resource scopes for this view:

View name Resource scope Region scope
[PROJECT_ID.]DATASET_ID.INFORMATION_SCHEMA.PARTITIONS Dataset level Dataset location
Replace the following:

  • Optional: PROJECT_ID: the ID of your Google Cloud project. If not specified, the default project is used.

    Examples

    Example 1

    The following example calculates the number of logical bytes used by each storage tier in all of the tables in a dataset named mydataset:

    SELECT
      storage_tier,
      SUM(total_logical_bytes) AS logical_bytes
    FROM
      `mydataset.INFORMATION_SCHEMA.PARTITIONS`
    GROUP BY
      storage_tier;

    The results look similar to the following:

    +--------------+----------------+
    | storage_tier | logical_bytes  |
    +--------------+----------------+
    | LONG_TERM    |  1311495144879 |
    | ACTIVE       |    66757629240 |
    +--------------+----------------+
    

    Example 2

    The following example creates a column that extracts the partition type from the partition_id field and aggregates partition information at the table level for the public bigquery-public-data.covid19_usafacts dataset:

    SELECT
      table_name,
      CASE
        WHEN regexp_contains(partition_id, '^[0-9]{4}$') THEN 'YEAR'
        WHEN regexp_contains(partition_id, '^[0-9]{6}$') THEN 'MONTH'
        WHEN regexp_contains(partition_id, '^[0-9]{8}$') THEN 'DAY'
        WHEN regexp_contains(partition_id, '^[0-9]{10}$') THEN 'HOUR'
        END AS partition_type,
      min(partition_id) AS earliest_partition,
      max(partition_id) AS latest_partition_id,
      COUNT(partition_id) AS partition_count,
      sum(total_logical_bytes) AS sum_total_logical_bytes,
      max(last_modified_time) AS max_last_updated_time
    FROM `bigquery-public-data.covid19_usafacts.INFORMATION_SCHEMA.PARTITIONS`
    GROUP BY 1, 2;

    The results look similar to the following:

    +-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+
    | table_name      | partition_type | earliest_partition | latest_partition_id | partition_count | sum_total_logical_bytes | max_last_updated_time          |
    +--------------+-------------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+
    | confirmed_cases | DAY            | 20221204           | 20221213            | 10              | 26847302                | 2022-12-13 00:09:25.604000 UTC |
    | deaths          | DAY            | 20221204           | 20221213            | 10              | 26847302                | 2022-12-13 00:09:24.709000 UTC |
    | summary         | DAY            | 20221204           | 20221213            | 10              | 241285338               | 2022-12-13 00:09:27.496000 UTC |
    +-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+