JOBS_BY_USER view
The INFORMATION_SCHEMA.JOBS_BY_USER
view contains near real-time
metadata about the BigQuery jobs submitted by
the current user in the current project.
Required role
To get the permission that you need to query the INFORMATION_SCHEMA.JOBS_BY_USER
view,
ask your administrator to grant you the
BigQuery User (roles/bigquery.user
) IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the
bigquery.jobs.list
permission,
which is required to
query the INFORMATION_SCHEMA.JOBS_BY_USER
view.
You might also be able to get this permission with custom roles or other predefined roles.
For more information about BigQuery permissions, see Access control with IAM.
Schema
The underlying data is partitioned by the creation_time
column and
clustered by project_id
and user_email
.
The INFORMATION_SCHEMA.JOBS_BY_USER
view has the following schema:
Column name | Data type | Value |
---|---|---|
bi_engine_statistics |
RECORD |
If the project is configured to use the BI
Engine SQL Interface, then this field contains BiEngineStatistics.
Otherwise NULL .
|
cache_hit |
BOOLEAN |
Whether the query results of this job were from a cache.
If you have a multi-query statement
job, cache_hit for your parent query is
NULL .
|
creation_time |
TIMESTAMP |
(Partitioning column) Creation time of this job. Partitioning is based on the UTC time of this timestamp. |
destination_table |
RECORD |
Destination table for results, if any. |
dml_statistics |
RECORD |
If the job is a query with a DML statement, the value is a record with the
following fields:
NULL .This column is present in the INFORMATION_SCHEMA.JOBS_BY_USER and
INFORMATION_SCHEMA.JOBS_BY_PROJECT views.
|
end_time |
TIMESTAMP |
The end time of this job, in milliseconds since the epoch. This field represents the
time when the job enters the DONE state. |
error_result |
RECORD |
Details of any errors as ErrorProto objects. |
job_creation_reason.code |
STRING |
Specifies the high level reason why a job was created. Possible values are:
|
job_id |
STRING |
The ID of the job if a job was created. Otherwise, the query ID of a query using short
query mode. For example, bquxjob_1234 . |
job_stages |
RECORD |
Query
stages of the job.
Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery. |
job_type |
STRING |
The type of the job. Can be QUERY , LOAD , EXTRACT ,
COPY , or NULL . A NULL value
indicates an internal job, such as a script job statement evaluation
or a materialized view refresh.
|
labels |
RECORD |
Array of labels applied to the job as key-value pairs. |
parent_job_id |
STRING |
ID of the parent job, if any. |
priority |
STRING |
The priority of this job. Valid values include INTERACTIVE and
BATCH . |
project_id |
STRING |
(Clustering column) The ID of the project. |
project_number |
INTEGER |
The number of the project. |
query |
STRING |
SQL query text. Only the JOBS_BY_PROJECT view has the query
column. |
referenced_tables |
RECORD |
Array of tables referenced by the job. Only populated for query jobs that are not cache hits. |
reservation_id |
STRING |
Name of the primary reservation assigned to this job,
in the format
RESERVATION_ADMIN_PROJECT:RESERVATION_LOCATION.RESERVATION_NAME .In this output:
|
edition |
STRING |
The edition associated with the reservation assigned to this job. For more information about editions, see Introduction to BigQuery editions. |
session_info |
RECORD |
Details about the session in which this job ran, if any. |
start_time |
TIMESTAMP |
The start time of this job, in milliseconds since the epoch. This field represents the
time when the job transitions from the PENDING state to either
RUNNING or DONE . |
state |
STRING |
Running state of the job. Valid states include PENDING , RUNNING , and
DONE .
|
statement_type |
STRING |
The type of query statement. For example, DELETE , INSERT ,
SCRIPT , SELECT , or UPDATE . See QueryStatementType
for list of valid values.
|
timeline |
RECORD |
Query timeline of the job. Contains snapshots of query execution. |
total_bytes_billed |
INTEGER |
If the project is configured to use on-demand
pricing, then this field contains the total bytes billed for the
job. If the project is configured to use flat-rate
pricing, then you are not billed for bytes and this field is
informational only.
Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery. |
total_bytes_processed |
INTEGER |
Total bytes processed by the job. Note: This column's values are empty for queries that read from tables with row-level access policies. For more information, see best practices for row-level security in BigQuery. |
total_modified_partitions |
INTEGER |
The total number of partitions the job modified. This field is
populated for LOAD and QUERY jobs.
|
total_slot_ms |
INTEGER |
Slot milliseconds for the job over its entire duration in the RUNNING state,
including retries. |
transaction_id |
STRING |
ID of the transaction in which this job ran, if any. (Preview) |
user_email |
STRING |
(Clustering column) Email address or service account of the user who ran the job. |
query_info.resource_warning |
STRING |
The warning message that appears if the resource usage during query processing is above the internal threshold of the system. A successful query job can have the resource_warning field populated. With resource_warning , you get additional data points to optimize your queries and to set up monitoring for performance trends of an equivalent set of queries by using query_hashes .
|
query_info.query_hashes.normalized_literals |
STRING |
Contains the hashes of the query. normalized_literals is a hexadecimal
STRING hash that ignores comments, parameter values, UDFs, and literals.
The hash value will differ when underlying views change, or if the query implicitly
references columns, such as SELECT * , and the table schema changes.
This field appears for successful GoogleSQL queries that are not cache hits. |
query_info.performance_insights |
RECORD |
Performance insights for the job. |
query_info.optimization_details |
STRUCT |
The history-based optimizations for the job. |
transferred_bytes |
INTEGER |
Total bytes transferred for cross-cloud queries, such as BigQuery Omni cross-cloud transfer jobs. |
materialized_view_statistics |
RECORD |
Statistics of materialized views considered in a query job. (Preview) |
Data retention
This view contains currently running jobs and the job history of the past 180 days.
Scope and syntax
Queries against this view must include a region qualifier. The following table explains the region scope for this view:
View name | Resource scope | Region scope |
---|---|---|
[PROJECT_ID.]`region-REGION`.INFORMATION_SCHEMA.JOBS_BY_USER |
Jobs submitted by the current user in the specified project. | REGION |
Optional: PROJECT_ID
: the ID of your
Google Cloud project. If not specified, the default project is used.
REGION
: any dataset region name.
For example, `region-us`
.
Example
To run the query against a project other than your default project, add the project ID in the following format:
`PROJECT_ID`.`region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER
PROJECT_ID
: the ID of the projectREGION_NAME
: the region for your project
For example, `myproject`.`region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
.
View pending or running jobs
SELECT job_id, creation_time, query FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER WHERE state != 'DONE';
The result is similar to the following:
+--------------+---------------------------+---------------------------------+ | job_id | creation_time | query | +--------------+---------------------------+---------------------------------+ | bquxjob_1 | 2019-10-10 00:00:00 UTC | SELECT ... FROM dataset.table1 | | bquxjob_2 | 2019-10-10 00:00:01 UTC | SELECT ... FROM dataset.table2 | | bquxjob_3 | 2019-10-10 00:00:02 UTC | SELECT ... FROM dataset.table3 | | bquxjob_4 | 2019-10-10 00:00:03 UTC | SELECT ... FROM dataset.table4 | | bquxjob_5 | 2019-10-10 00:00:04 UTC | SELECT ... FROM dataset.table5 | +--------------+---------------------------+---------------------------------+
View performance insights for queries
The following example returns all your query jobs that have performance insights in the last 30 days, along with a URL that links to the query execution graph in the Google Cloud console.
SELECT `bigquery-public-data`.persistent_udfs.job_url( project_id || ':us.' || job_id) AS job_url, query_info.performance_insights FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_USER WHERE DATE(creation_time) >= CURRENT_DATE - 30 -- scan 30 days of query history AND job_type = 'QUERY' AND state = 'DONE' AND error_result IS NULL AND statement_type != 'SCRIPT' AND EXISTS ( -- Only include queries which had performance insights SELECT 1 FROM UNNEST( query_info.performance_insights.stage_performance_standalone_insights ) WHERE slot_contention OR insufficient_shuffle_quota UNION ALL SELECT 1 FROM UNNEST( query_info.performance_insights.stage_performance_change_insights ) WHERE input_data_change.records_read_diff_percentage IS NOT NULL );