PARTITIONS view
The INFORMATION_SCHEMA.PARTITIONS
view contains one row for each partition.
Querying the INFORMATION_SCHEMA.PARTITIONS
view is limited to 1000
tables. To get the data about partitions at the project level, you can split the
query into multiple queries and then join the results. If you exceed the limit,
you can encounter an error similar to the following:
INFORMATION_SCHEMA.PARTITIONS query attempted to read too many tables. Please add more restrictive filters.
Required permissions
To query the INFORMATION_SCHEMA.PARTITIONS
view, you need the following
Identity and Access Management (IAM) permissions:
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
Each of the following predefined IAM roles includes the preceding permissions:
roles/bigquery.admin
roles/bigquery.dataEditor
roles/bigquery.dataViewer
For more information about BigQuery permissions, see Access control with IAM.
Schema
When you query the INFORMATION_SCHEMA.PARTITIONS
view, the query results
typically contain one row for each partition. The exception is when there is
a combination of long-term and active storage tier data in the
__UNPARTITIONED__
partition. In that case,
the view returns two rows for the __UNPARTITIONED__
partition, one for each
storage tier.
The INFORMATION_SCHEMA.PARTITIONS
view has the following schema:
Column name | Data type | Value |
---|---|---|
TABLE_CATALOG |
STRING |
The project ID of the project that contains the table |
TABLE_SCHEMA |
STRING |
The name of the dataset that contains the table, also referred to as
the datasetId |
TABLE_NAME |
STRING |
The name of the table, also referred to as the tableId |
PARTITION_ID |
STRING |
A single partition's ID. For unpartitioned tables, the value is
NULL . For partitioned tables that contain rows with
NULL values in the partitioning column, the value is
__NULL__ . |
TOTAL_ROWS |
INTEGER |
The total number of rows in the partition |
TOTAL_LOGICAL_BYTES |
INTEGER |
The total number of logical bytes in the partition |
LAST_MODIFIED_TIME |
TIMESTAMP |
The time when the data was most recently written to the partition |
STORAGE_TIER |
STRING |
The partition's storage tier:
|
Scope and syntax
Queries against this view must include a dataset qualifier. For queries with a dataset qualifier, you must have permissions for the dataset. For more information see Syntax. The following table explains the region and resource scopes for this view:
View name | Resource scope | Region scope |
---|---|---|
[PROJECT_ID.]DATASET_ID.INFORMATION_SCHEMA.PARTITIONS |
Dataset level | Dataset location |
Optional: PROJECT_ID
: the ID of your
Google Cloud project. If not specified, the default project is used.
DATASET_ID
: the ID of your dataset. For more information, see Dataset qualifier.
Examples
Example 1
The following example calculates the number of logical bytes used by each
storage tier in all of the tables in a dataset named mydataset
:
SELECT storage_tier, SUM(total_logical_bytes) AS logical_bytes FROM `mydataset.INFORMATION_SCHEMA.PARTITIONS` GROUP BY storage_tier;
The results look similar to the following:
+--------------+----------------+ | storage_tier | logical_bytes | +--------------+----------------+ | LONG_TERM | 1311495144879 | | ACTIVE | 66757629240 | +--------------+----------------+
Example 2
The following example creates a column that extracts the partition type from the
partition_id
field and aggregates partition information at the table level
for the public bigquery-public-data.covid19_usafacts
dataset:
SELECT table_name, CASE WHEN regexp_contains(partition_id, '^[0-9]{4}$') THEN 'YEAR' WHEN regexp_contains(partition_id, '^[0-9]{6}$') THEN 'MONTH' WHEN regexp_contains(partition_id, '^[0-9]{8}$') THEN 'DAY' WHEN regexp_contains(partition_id, '^[0-9]{10}$') THEN 'HOUR' END AS partition_type, min(partition_id) AS earliest_partition, max(partition_id) AS latest_partition_id, COUNT(partition_id) AS partition_count, sum(total_logical_bytes) AS sum_total_logical_bytes, max(last_modified_time) AS max_last_updated_time FROM `bigquery-public-data.covid19_usafacts.INFORMATION_SCHEMA.PARTITIONS` GROUP BY 1, 2;
The results look similar to the following:
+-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+ | table_name | partition_type | earliest_partition | latest_partition_id | partition_count | sum_total_logical_bytes | max_last_updated_time | +--------------+-------------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+ | confirmed_cases | DAY | 20221204 | 20221213 | 10 | 26847302 | 2022-12-13 00:09:25.604000 UTC | | deaths | DAY | 20221204 | 20221213 | 10 | 26847302 | 2022-12-13 00:09:24.709000 UTC | | summary | DAY | 20221204 | 20221213 | 10 | 241285338 | 2022-12-13 00:09:27.496000 UTC | +-----------------+----------------+--------------------+---------------------+-----------------+-------------------------+--------------------------------+