PARTITIONS view

The INFORMATION_SCHEMA.PARTITIONS view contains one row for each partition.

Querying the INFORMATION_SCHEMA.PARTITIONS view is limited to 1000 tables. To get the data about partitions at the project level, you can split the query into multiple queries and then join the results. If you exceed the limit, you can encounter an error similar to the following:

INFORMATION_SCHEMA.PARTITIONS query attempted to read too many tables. Please add more restrictive filters.

Required permissions

To query the INFORMATION_SCHEMA.PARTITIONS view, you need the following Identity and Access Management (IAM) permissions:

bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list

Each of the following predefined IAM roles includes the preceding permissions:

roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer

For more information about BigQuery permissions, see Access control with IAM.

Schema

When you query the INFORMATION_SCHEMA.PARTITIONS view, the query results typically contain one row for each partition. The exception is when there is a combination of long-term and active storage tier data in the __UNPARTITIONED__ partition. In that case, the view returns two rows for the __UNPARTITIONED__ partition, one for each storage tier.

The INFORMATION_SCHEMA.PARTITIONS view has the following schema:

Column name	Data type	Value
`TABLE_CATALOG`	`STRING`	The project ID of the project that contains the table
`TABLE_SCHEMA`	`STRING`	The name of the dataset that contains the table, also referred to as the `datasetId`
`TABLE_NAME`	`STRING`	The name of the table, also referred to as the `tableId`
`PARTITION_ID`	`STRING`	A single partition's ID. For unpartitioned tables, the value is `NULL`. For partitioned tables that contain rows with `NULL` values in the partitioning column, the value is `__NULL__`.
`TOTAL_ROWS`	`INTEGER`	The total number of rows in the partition
`TOTAL_LOGICAL_BYTES`	`INTEGER`	The total number of logical bytes in the partition
`LAST_MODIFIED_TIME`	`TIMESTAMP`	The time when the data was most recently written to the partition
`STORAGE_TIER`	`STRING`	The partition's storage tier: `ACTIVE`: the partition is billed as active storage `LONG_TERM`: the partition is billed as long-term storage

Scope and syntax

Queries against this view must include a dataset qualifier. For queries with a dataset qualifier, you must have permissions for the dataset. For more information see Syntax. The following table explains the region and resource scopes for this view:

View name	Resource scope	Region scope
`[PROJECT_ID.]DATASET_ID.INFORMATION_SCHEMA.PARTITIONS`	Dataset level	Dataset location

Replace the following:

Optional: PROJECT_ID: the ID of your Google Cloud project. If not specified, the default project is used.

DATASET_ID: the ID of your dataset. For more information, see Dataset qualifier.

Examples

Example 1

The following example calculates the number of logical bytes used by each storage tier in all of the tables in a dataset named mydataset:

SELECT
  storage_tier,
  SUM(total_logical_bytes) AS logical_bytes
FROM
  `mydataset.INFORMATION_SCHEMA.PARTITIONS`
GROUP BY
  storage_tier;

The results look similar to the following:

+--------------+----------------+
| storage_tier | logical_bytes  |
+--------------+----------------+
| LONG_TERM    |  1311495144879 |
| ACTIVE       |    66757629240 |
+--------------+----------------+

Example 2

The following example creates a column that extracts the partition type from the partition_id field and aggregates partition information at the table level for the public bigquery-public-data.covid19_usafacts dataset:

SELECT
  table_name,
  CASE
    WHEN regexp_contains(partition_id, '^[0-9]{4}$') THEN 'YEAR'
    WHEN regexp_contains(partition_id, '^[0-9]{6}$') THEN 'MONTH'
    WHEN regexp_contains(partition_id, '^[0-9]{8}$') THEN 'DAY'
    WHEN regexp_contains(partition_id, '^[0-9]{10}$') THEN 'HOUR'
    END AS partition_type,
  min(partition_id) AS earliest_partition,
  max(partition_id) AS latest_partition_id,
  COUNT(partition_id) AS partition_count,
  sum(total_logical_bytes) AS sum_total_logical_bytes,
  max(last_modified_time) AS max_last_updated_time
FROM `bigquery-public-data.covid19_usafacts.INFORMATION_SCHEMA.PARTITIONS`
GROUP BY 1, 2;

The results look similar to the following:

+-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+
| table_name      | partition_type | earliest_partition | latest_partition_id | partition_count | sum_total_logical_bytes | max_last_updated_time          |
+--------------+-------------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+
| confirmed_cases | DAY            | 20221204           | 20221213            | 10              | 26847302                | 2022-12-13 00:09:25.604000 UTC |
| deaths          | DAY            | 20221204           | 20221213            | 10              | 26847302                | 2022-12-13 00:09:24.709000 UTC |
| summary         | DAY            | 20221204           | 20221213            | 10              | 241285338               | 2022-12-13 00:09:27.496000 UTC |
+-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+