Command line (CLI)
config
CLI config commands.
Usage:
config [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
init
Initialize configuration
Usage:
config init [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--overwrite | boolean | Whether to overwrite current config | False |
--help | boolean | Show this message and exit. | False |
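For instance, a first-time setup could look like this (a sketch; running it rewrites your local configuration files):

```shell
# Generate the default configuration files; --overwrite discards any
# existing config, so use it only on a fresh setup or a deliberate reset
basedosdados config init --overwrite
```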
refresh_template
Overwrite current templates
Usage:
config refresh_template [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset
Command to manage datasets.
Usage:
dataset [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
create
Create dataset on BigQuery
Usage:
dataset create [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to create [prod \| staging \| all] | |
--if_exists | text | [raise \| update \| replace \| pass] | |
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--location | text | Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations | None |
--help | boolean | Show this message and exit. | False |
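As a usage sketch (the dataset id below is a placeholder, and `--mode all` assumes the [prod \| staging \| all] mode choices):

```shell
# Create the prod and staging datasets for a hypothetical dataset id,
# storing the data in the US multi-region
basedosdados dataset create br_example_dataset --mode all --location US
```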
delete
Delete dataset
Usage:
dataset delete [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to delete [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
init
Initialize metadata files of dataset
Usage:
dataset init [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--replace | boolean | Whether to replace current metadata files | False |
--help | boolean | Show this message and exit. | False |
publicize
Make a dataset public
Usage:
dataset publicize [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--help | boolean | Show this message and exit. | False |
update
Update dataset on BigQuery
Usage:
dataset update [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to update [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
download
Downloads data to SAVEPATH. SAVEPATH must point to a .csv file.

Example:

```shell
basedosdados download data.csv \
    --query="select * from basedosdados.br_ibge_pib.municipio limit 10" \
    --billing_project_id=basedosdados-dev
```
Usage:
download [OPTIONS] SAVEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--query | text | A Standard SQL query to download data from BigQuery | None |
--dataset_id | text | Dataset_id, enter with table_id to download a table | None |
--table_id | text | Table_id, enter with dataset_id to download a table | None |
--query_project_id | text | Which project the table lives in. You can change this if you want to query different projects. | None |
--billing_project_id | text | Project that will be billed. Find your Project ID here: https://console.cloud.google.com/projectselector2/home/dashboard | None |
--limit | text | Number of rows returned | None |
--help | boolean | Show this message and exit. | False |
get
Get commands.
Usage:
get [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset_description
Get the full description for given dataset
Usage:
get dataset_description [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
table_columns
Get field names, types, and descriptions for the columns of a given table
Usage:
get table_columns [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
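For example, using the public table that also appears in the download example:

```shell
# Print name, type and description for each column of a public table
basedosdados get table_columns br_ibge_pib municipio
```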
table_description
Get the full description for given table
Usage:
get table_description [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
list
CLI list commands.
Usage:
list [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset_tables
List tables available at given dataset
Usage:
list dataset_tables [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--filter_by | text | Filter your search, must be a string | None |
--with_description | boolean | Fetch short description for each table | False |
--help | boolean | Show this message and exit. | False |
datasets
List datasets available at given project_id
Usage:
list datasets [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--filter_by | text | Filter your search, must be a string | None |
--with_description | boolean | Fetch short description for each dataset | False |
--help | boolean | Show this message and exit. | False |
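Two typical invocations (the filter string is illustrative):

```shell
# List every dataset in the default basedosdados project, with short descriptions
basedosdados list datasets --with_description
# Narrow the listing to dataset ids matching a substring
basedosdados list datasets --filter_by ibge
```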
metadata
CLI metadata commands.
Usage:
metadata [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
create
Creates new metadata config file
Usage:
metadata create [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--if_exists | text | [raise \| replace \| pass] | |
--columns | text | Data columns. Example: --columns=col1,col2 | [] |
--partition_columns | text | Columns that partition the data. Example: --partition_columns=col1,col2 | [] |
--force_columns | boolean | Overwrite columns with local columns. | False |
--table_only | boolean | Force the creation of the table_config.yaml file only if dataset_config.yaml doesn't exist. | True |
--help | boolean | Show this message and exit. | False |
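A sketch of bootstrapping metadata for a new table (dataset, table, and column names are placeholders):

```shell
# Create the dataset-level metadata file (dataset_config.yaml)
basedosdados metadata create br_example_dataset
# Create the table-level metadata, declaring data and partition columns up front
basedosdados metadata create br_example_dataset municipio \
    --columns=ano,sigla_uf,valor --partition_columns=ano
```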
is_updated
Check if user's local metadata is updated
Usage:
metadata is_updated [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
publish
Publish user's local metadata
Usage:
metadata publish [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--all | boolean | Force publishing of the metadata specified in both dataset_config.yaml and table_config.yaml at once. | False |
--if_exists | text | Define what to do in case metadata already exists in CKAN. | raise |
--update_locally | boolean | Update local metadata with the new CKAN metadata on publish. | False |
--help | boolean | Show this message and exit. | False |
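A typical publish flow validates first, then pushes both config files (ids are placeholders):

```shell
# Check the local metadata against the schema before publishing
basedosdados metadata validate br_example_dataset municipio
# Publish dataset and table metadata together, syncing the local
# files with what CKAN stored
basedosdados metadata publish br_example_dataset municipio --all --update_locally
```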
validate
Validate user's local metadata
Usage:
metadata validate [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
reauth
Reauthorize credentials.
Usage:
reauth [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
storage
Commands for Google Cloud Storage.
Usage:
storage [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
copy_table
Copy table to your bucket
Usage:
storage copy_table [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--source_bucket_name | text | Bucket to copy the table from | basedosdados |
--dst_bucket_name | text | Bucket where data will be copied to; defaults to your bucket | None |
--mode, -m | text | Which bucket folder to get the table from [raw \| staging] | |
--help | boolean | Show this message and exit. | False |
delete_table
Delete table from bucket
Usage:
storage delete_table [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | Where to delete the file from [raw \| staging] | |
--bucket_name | text | Bucket from which to delete data; you can change it to delete from a bucket other than yours | None |
--not_found_ok | boolean | What to do if the table is not found | False |
--help | boolean | Show this message and exit. | False |
download
Download file from bucket
Usage:
storage download [OPTIONS] DATASET_ID TABLE_ID SAVEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--filename, -f | text | Filename to download a single file. If *, downloads all files from the bucket folder | * |
--mode, -m | text | Where to download data from [raw \| staging] | |
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_not_exists | text | [raise \| pass] if file not found at bucket folder | |
--help | boolean | Show this message and exit. | False |
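For example (ids and filename are placeholders):

```shell
# Download every file from the staging folder of a table into ./data
basedosdados storage download br_example_dataset municipio ./data --mode staging
# Or fetch a single named file instead of the whole folder
basedosdados storage download br_example_dataset municipio ./data -f chunk_01.csv
```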
init
Create bucket and initial folders
Usage:
storage init [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--bucket_name | text | Bucket name | basedosdados |
--replace | boolean | Whether to replace current bucket files | False |
--very-sure / --not-sure | boolean | Are you sure that you want to replace current bucket files? | False |
--help | boolean | Show this message and exit. | False |
upload
Upload file to bucket
Usage:
storage upload [OPTIONS] DATASET_ID TABLE_ID FILEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | Where to save the file [raw \| staging] | |
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_exists | text | [raise \| replace \| pass] | |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
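A sketch of uploading a partitioned CSV (ids, path, and partition value are placeholders):

```shell
# Send a local file to the staging folder, overwriting a previous upload
basedosdados storage upload br_example_dataset municipio ./data/municipio.csv \
    --mode staging --if_exists replace --partitions "ano=2020"
```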
table
Command to manage tables.
Usage:
table [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
append
Append new data to existing table
Usage:
table append [OPTIONS] DATASET_ID TABLE_ID FILEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_exists | text | [raise \| replace \| pass] | |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
create
Create staging table in BigQuery
Usage:
table create [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--path, -p | path | Path of data folder or file. | None |
--if_table_exists | text | [raise \| replace \| pass] | |
--force_dataset | boolean | Whether to automatically create the dataset folders and the dataset in BigQuery | True |
--if_storage_data_exists | text | [raise \| replace \| pass] | |
--if_table_config_exists | text | [raise \| replace \| pass] | |
--source_format | text | Data source format. Only 'csv' is supported. Defaults to 'csv'. | csv |
--force_columns | boolean | Overwrite columns with local columns. | False |
--columns_config_url_or_path | text | Path to the local architecture file or a public Google Sheets URL. Paths only support csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats. Google Sheets URL must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>` | None |
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--location | text | Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations | None |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
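Putting the main options together (placeholder ids and path; `replace` is one of the documented `if_*` actions):

```shell
# Create the staging table from a local data folder, replacing any
# previously uploaded data and any existing staging table
basedosdados table create br_example_dataset municipio \
    --path ./data \
    --if_table_exists replace \
    --if_storage_data_exists replace
```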
delete
Delete BigQuery table
Usage:
table delete [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode | text | Which table to delete [prod \| staging] | |
--help | boolean | Show this message and exit. | False |
init
Create metadata files
Usage:
table init [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--data_sample_path | path | Sample data used to pre-fill metadata | None |
--if_folder_exists | text | [raise \| replace \| pass] | |
--if_table_config_exists | text | [raise \| replace \| pass] | |
--source_format | text | Data source format. Only 'csv' is supported. Defaults to 'csv'. | csv |
--force_columns | boolean | Overwrite columns with local columns. | False |
--columns_config_url_or_path | text | Google Sheets URL. Must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>` | None |
--help | boolean | Show this message and exit. | False |
publish
Publish staging table to prod
Usage:
table publish [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--if_exists | text | [raise \| replace] actions if table exists | |
--help | boolean | Show this message and exit. | False |
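For example, after reviewing the generated publish.sql (placeholder ids):

```shell
# Materialize the treated staging table as the prod table
basedosdados table publish br_example_dataset municipio --if_exists replace
```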
update
Update tables in BigQuery
Usage:
table update [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode | text | Choose a table from a dataset to update [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
update_columns
Update column fields in table_config.yaml
Usage:
table update_columns [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--columns_config_url_or_path | text | Fills columns in table_config.yaml automatically using a public Google Sheets URL or a local architecture file, regenerates publish.sql, and autofills types using bigquery_type. A local path only supports csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats; a Google Sheets URL must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>`. | None |
--help | boolean | Show this message and exit. | False |

The sheet (or local architecture file) must contain the columns:

- name: column name
- description: column description
- bigquery_type: column BigQuery type
- measurement_unit: column measurement unit
- covered_by_dictionary: column related dictionary
- directory_column: column related directory in the format `<dataset_id>.<table_id>:<column_name>`
- temporal_coverage: column temporal coverage
- has_sensitive_data: whether the column has sensitive data
- observations: column observations
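For example, with a local architecture spreadsheet (path and ids are placeholders):

```shell
# Refresh table_config.yaml columns from an architecture file and
# regenerate publish.sql with the declared BigQuery types
basedosdados table update_columns br_example_dataset municipio \
    --columns_config_url_or_path ./architecture/municipio.xlsx
```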