Command line (CLI)

config

CLI config commands.

Usage:

config [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

init

Initialize configuration

Usage:

config init [OPTIONS]

Options:

Name Type Description Default
--overwrite boolean Whether to overwrite the current config False
--help boolean Show this message and exit. False

refresh_template

Overwrite current templates

Usage:

config refresh_template [OPTIONS]

Options:

Name Type Description Default
--help boolean Show this message and exit. False

dataset

Command to manage datasets.

Usage:

dataset [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

create

Create dataset on BigQuery

Usage:

dataset create [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--mode, -m text What datasets to create [prod staging
--if_exists text [raise update
--dataset_is_public boolean Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. True
--location text Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations None
--help boolean Show this message and exit. False

delete

Delete dataset

Usage:

dataset delete [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--mode, -m text What datasets to create [prod staging
--help boolean Show this message and exit. False

init

Initialize metadata files of dataset

Usage:

dataset init [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--replace boolean Whether to replace current metadata files False
--help boolean Show this message and exit. False

publicize

Make a dataset public

Usage:

dataset publicize [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--dataset_is_public boolean Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. True
--help boolean Show this message and exit. False

update

Update dataset on BigQuery

Usage:

dataset update [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--mode, -m text What datasets to create [prod staging
--help boolean Show this message and exit. False

download

Downloads data to SAVEPATH. SAVEPATH must point to a .csv file.

Example:

basedosdados download data.csv --query="select * from basedosdados.br_ibge_pib.municipio limit 10" \
    --billing_project_id=basedosdados-dev

Usage:

download [OPTIONS] SAVEPATH

Options:

Name Type Description Default
--query text A Standard SQL query to download data from BigQuery None
--dataset_id text Dataset ID; pass it together with table_id to download a table None
--table_id text Table ID; pass it together with dataset_id to download a table None
--query_project_id text The project where the table lives. You can change this if you want to query a different project. None
--billing_project_id text Project that will be billed. Find your Project ID here https://console.cloud.google.com/projectselector2/home/dashboard None
--limit text Number of rows returned None
--help boolean Show this message and exit. False
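
The rules above (SAVEPATH must be a .csv file, and a table is selected either with --query or with --dataset_id plus --table_id) can be checked before the command is run. The following is a minimal sketch, not part of the CLI itself: the helper name build_download_command is hypothetical, and it only assembles the argument list from the options documented in this section.

```python
import shlex


def build_download_command(savepath, billing_project_id, query=None,
                           dataset_id=None, table_id=None, limit=None):
    """Assemble the argv list for `basedosdados download` (hypothetical helper)."""
    # The help text says SAVEPATH must point to a .csv file.
    if not savepath.endswith(".csv"):
        raise ValueError("SAVEPATH must point to a .csv file")
    has_table = dataset_id is not None and table_id is not None
    # dataset_id and table_id are documented as a pair; a query is the alternative.
    if query is None and not has_table:
        raise ValueError("pass --query, or --dataset_id together with --table_id")
    argv = ["basedosdados", "download", savepath]
    if query is not None:
        argv.append(f"--query={query}")
    if has_table:
        argv += [f"--dataset_id={dataset_id}", f"--table_id={table_id}"]
    if limit is not None:
        argv.append(f"--limit={limit}")
    argv.append(f"--billing_project_id={billing_project_id}")
    return argv


cmd = build_download_command(
    "data.csv",
    billing_project_id="basedosdados-dev",
    query="select * from basedosdados.br_ibge_pib.municipio limit 10",
)
# shlex.join quotes the query so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting problems with the SQL text.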

get

Get commands.

Usage:

get [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

dataset_description

Get the full description for given dataset

Usage:

get dataset_description [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--project_id text The project which will be queried. You should have list/read permissions basedosdados
--help boolean Show this message and exit. False

table_columns

Get field names, types, and descriptions for the columns of a given table

Usage:

get table_columns [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--project_id text The project which will be queried. You should have list/read permissions basedosdados
--help boolean Show this message and exit. False

table_description

Get the full description for given table

Usage:

get table_description [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--project_id text The project which will be queried. You should have list/read permissions basedosdados
--help boolean Show this message and exit. False

list

CLI list commands.

Usage:

list [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

dataset_tables

List tables available at given dataset

Usage:

list dataset_tables [OPTIONS] DATASET_ID

Options:

Name Type Description Default
--project_id text The project which will be queried. You should have list/read permissions basedosdados
--filter_by text Filter your search, must be a string None
--with_description boolean [bool] Fetch a short description for each table False
--help boolean Show this message and exit. False

datasets

List datasets available at given project_id

Usage:

list datasets [OPTIONS]

Options:

Name Type Description Default
--project_id text The project which will be queried. You should have list/read permissions basedosdados
--filter_by text Filter your search, must be a string None
--with_description boolean [bool] Fetch a short description for each dataset False
--help boolean Show this message and exit. False

metadata

CLI metadata commands.

Usage:

metadata [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

create

Creates new metadata config file

Usage:

metadata create [OPTIONS] DATASET_ID [TABLE_ID]

Options:

Name Type Description Default
--if_exists text [raise replace
--columns text Data columns. Example: --columns=col1,col2 []
--partition_columns text Columns that partition the data. Example: --partition_columns=col1,col2 []
--force_columns boolean Overwrite columns with local columns. False
--table_only boolean Force the creation of table_config.yaml file only if dataset_config.yaml doesn't exist. True
--help boolean Show this message and exit. False
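
The --columns and --partition_columns options take comma-separated names, as in --columns=col1,col2. A small sketch of how those values might be assembled from Python lists follows. The helper name is hypothetical, the dataset and table IDs are placeholders, and it assumes the command is invoked through the basedosdados executable shown in the download example.

```python
def build_metadata_create(dataset_id, table_id=None, columns=(),
                          partition_columns=()):
    """Assemble the argv list for `metadata create` (hypothetical helper)."""
    for name in list(columns) + list(partition_columns):
        # A comma inside a name would be read as a separator.
        if "," in name or not name:
            raise ValueError(f"invalid column name: {name!r}")
    argv = ["basedosdados", "metadata", "create", dataset_id]
    if table_id is not None:  # TABLE_ID is optional in the usage line
        argv.append(table_id)
    if columns:
        argv.append("--columns=" + ",".join(columns))
    if partition_columns:
        argv.append("--partition_columns=" + ",".join(partition_columns))
    return argv


# "my_dataset" and "my_table" are placeholder IDs for illustration only.
print(build_metadata_create("my_dataset", "my_table",
                            columns=["col1", "col2"],
                            partition_columns=["col1"]))
```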

is_updated

Check if user's local metadata is updated

Usage:

metadata is_updated [OPTIONS] DATASET_ID [TABLE_ID]

Options:

Name Type Description Default
--help boolean Show this message and exit. False

publish

Publish user's local metadata

Usage:

metadata publish [OPTIONS] DATASET_ID [TABLE_ID]

Options:

Name Type Description Default
--all boolean Force publishing of the metadata specified in both dataset_config.yaml and table_config.yaml at once. False
--if_exists text Define what to do in case metadata already exists in CKAN. raise
--update_locally boolean Update local metadata with the new CKAN metadata on publish. False
--help boolean Show this message and exit. False

validate

Validate user's local metadata

Usage:

metadata validate [OPTIONS] DATASET_ID [TABLE_ID]

Options:

Name Type Description Default
--help boolean Show this message and exit. False

reauth

Reauthorize credentials.

Usage:

reauth [OPTIONS]

Options:

Name Type Description Default
--help boolean Show this message and exit. False

storage

Commands for Google Cloud Storage.

Usage:

storage [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

copy_table

Copy table to your bucket

Usage:

storage copy_table [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--source_bucket_name text N/A basedosdados
--dst_bucket_name text Bucket where data will be copied to, defaults to your bucket None
--mode, -m text which bucket folder to get the table [raw staging
--help boolean Show this message and exit. False

delete_table

Delete table from bucket

Usage:

storage delete_table [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--mode, -m text Where to delete the file from [raw staging
--bucket_name text Bucket from which to delete data, you can change it to delete from a bucket other than yours None
--not_found_ok boolean Whether to ignore the error if the table is not found False
--help boolean Show this message and exit. False

download

Download file from bucket

Usage:

storage download [OPTIONS] DATASET_ID TABLE_ID SAVEPATH

Options:

Name Type Description Default
--filename, -f text Name of a single file to download. If *, downloads all files from the bucket folder *
--mode, -m text Where to download data from [raw staging
--partitions text Data partition as value=key/value2=key2 None
--if_not_exists text [raise pass] What to do if the file is not found in the bucket folder
--help boolean Show this message and exit. False
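
The --partitions value is a single string of slash-separated pairs. The sketch below builds that string from a dictionary. It assumes the left side of each pair is the partition column name and the right side is its value, which is an interpretation: the help text labels the two sides "value" and "key". The function name and the example column names are hypothetical.

```python
def format_partitions(partitions):
    """Join {column: value} pairs into 'col=value/col2=value2' (hypothetical helper)."""
    parts = []
    for name, value in partitions.items():
        name, value = str(name), str(value)
        # '/' and '=' are the separators, so they cannot appear inside a pair.
        if not name or not value or any(c in name + value for c in "/="):
            raise ValueError(f"invalid partition pair: {name!r}={value!r}")
        parts.append(f"{name}={value}")
    return "/".join(parts)


# Column names here are illustrative placeholders.
print(format_partitions({"ano": 2020, "sigla_uf": "SP"}))  # ano=2020/sigla_uf=SP
```

Dictionaries keep insertion order, so the pairs appear in the order they were given, which matters when the folder hierarchy is nested.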

init

Create bucket and initial folders

Usage:

storage init [OPTIONS]

Options:

Name Type Description Default
--bucket_name text Bucket name basedosdados
--replace boolean Whether to replace current bucket files False
--very-sure / --not-sure boolean Are you sure that you want to replace current bucket files? False
--help boolean Show this message and exit. False
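
Because --replace overwrites bucket files, a calling script may want to refuse to build the command unless the confirmation flag is also set. The sketch below illustrates that guard. It is a design choice of this example, not documented CLI behavior, and it assumes --replace is passed as a bare flag. The helper name is hypothetical.

```python
def build_storage_init(bucket_name=None, replace=False, very_sure=False):
    """Assemble the argv list for `storage init` (hypothetical helper)."""
    argv = ["basedosdados", "storage", "init"]
    if bucket_name is not None:
        argv.append(f"--bucket_name={bucket_name}")
    if replace:
        # Guard added by this sketch: never emit --replace without confirmation.
        if not very_sure:
            raise ValueError("replace=True requires very_sure=True")
        argv += ["--replace", "--very-sure"]
    return argv


print(build_storage_init(bucket_name="basedosdados"))
```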

upload

Upload file to bucket

Usage:

storage upload [OPTIONS] DATASET_ID TABLE_ID FILEPATH

Options:

Name Type Description Default
--mode, -m text Where to save the file [raw staging
--partitions text Data partition as value=key/value2=key2 None
--if_exists text [raise replace
--chunk_size text The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. None
--help boolean Show this message and exit. False
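
Since --chunk_size must be a multiple of 256 KB, it can help to round a desired size before passing it. This is a minimal sketch that assumes 256 KB means 256 * 1024 bytes and rounds up to the next multiple. The function name is hypothetical.

```python
CHUNK_UNIT = 256 * 1024  # 256 KB expressed in bytes (assumed to mean KiB)


def round_chunk_size(requested_bytes):
    """Round a byte count up to the nearest multiple of 256 KB."""
    if requested_bytes <= 0:
        raise ValueError("chunk size must be positive")
    # Ceiling division using integers only, to avoid float rounding.
    units = -(-requested_bytes // CHUNK_UNIT)
    return units * CHUNK_UNIT


print(round_chunk_size(1_000_000))  # 1048576
```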

table

Command to manage tables.

Usage:

table [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

append

Append new data to existing table

Usage:

table append [OPTIONS] DATASET_ID TABLE_ID FILEPATH

Options:

Name Type Description Default
--partitions text Data partition as value=key/value2=key2 None
--if_exists text [raise replace
--chunk_size text The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. None
--help boolean Show this message and exit. False

create

Create staging table in BigQuery

Usage:

table create [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--path, -p path Path of data folder or file. None
--if_table_exists text [raise replace
--force_dataset boolean Whether to automatically create the dataset folders and the dataset in BigQuery True
--if_storage_data_exists text [raise replace
--if_table_config_exists text [raise replace
--source_format text Data source format. Only 'csv' is supported. Defaults to 'csv'. csv
--force_columns boolean Overwrite columns with local columns. False
--columns_config_url_or_path text Path to the local architecture file or a public Google Sheets URL. Path only supports csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats. Google Sheets URL must be in the format https://docs.google.com/spreadsheets/d//edit#gid=. None
--dataset_is_public boolean Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. True
--location text Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations None
--chunk_size text The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. None
--help boolean Show this message and exit. False

delete

Delete BigQuery table

Usage:

table delete [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--mode text Which table to delete [prod staging]
--help boolean Show this message and exit. False

init

Create metadata files

Usage:

table init [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--data_sample_path path Sample data used to pre-fill metadata None
--if_folder_exists text [raise replace
--if_table_config_exists text [raise replace
--source_format text Data source format. Only 'csv' is supported. Defaults to 'csv'. csv
--force_columns boolean Overwrite columns with local columns. False
--columns_config_url_or_path text Google Sheets URL. Must be in the format https://docs.google.com/spreadsheets/d//edit#gid=. The sheet must contain the columns 'coluna' (column name) and 'descricao' (column description). None
--help boolean Show this message and exit. False

publish

Publish staging table to prod

Usage:

table publish [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--if_exists text [raise replace] actions if table exists
--help boolean Show this message and exit. False

update

Update tables in BigQuery

Usage:

table update [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--mode text Choose a table from a dataset to update [prod staging
--help boolean Show this message and exit. False

update_columns

Update column fields in table_config.yaml

Usage:

table update_columns [OPTIONS] DATASET_ID TABLE_ID

Options:

Name Type Description Default
--columns_config_url_or_path text Fills columns in table_config.yaml automatically using a public Google Sheets URL or a local file. Also regenerates publish.sql and autofills the type using bigquery_type.

The sheet must contain the columns:

    - name: column name
    - description: column description
    - bigquery_type: column BigQuery type
    - measurement_unit: column measurement unit
    - covered_by_dictionary: column related dictionary
    - directory_column: column related directory in the format <dataset_id>.<table_id>:<column_name>
    - temporal_coverage: column temporal coverage
    - has_sensitive_data: whether the column has sensitive data
    - observations: column observations

Args:

columns_config_url_or_path (str): Path to the local architecture file or a public Google Sheets URL.

    Path only supports csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats.

    Google Sheets URL must be in the format https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>.

Default: None

--help boolean Show this message and exit. False
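
The URL format and the required sheet columns listed above can be checked locally before calling table update_columns. The following is a rough sketch under stated assumptions: the function names and the regular expression are this example's own, not part of the CLI, and the pattern only checks the documented URL shape, not whether the sheet exists or is public.

```python
import re

# Matches https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>
SHEET_URL = re.compile(
    r"^https://docs\.google\.com/spreadsheets/d/"
    r"(?P<table_key>[^/]+)/edit#gid=(?P<table_gid>[^/&#]+)$"
)

# Column names taken from the list in this section.
REQUIRED_COLUMNS = [
    "name", "description", "bigquery_type", "measurement_unit",
    "covered_by_dictionary", "directory_column", "temporal_coverage",
    "has_sensitive_data", "observations",
]


def parse_sheet_url(url):
    """Return (table_key, table_gid) or raise ValueError (hypothetical helper)."""
    match = SHEET_URL.match(url)
    if match is None:
        raise ValueError("URL does not match the documented Google Sheets format")
    return match.group("table_key"), match.group("table_gid")


def missing_columns(header):
    """List the required columns that are absent from a sheet header."""
    present = set(header)
    return [col for col in REQUIRED_COLUMNS if col not in present]


# "abc123" is a placeholder key for illustration only.
print(parse_sheet_url("https://docs.google.com/spreadsheets/d/abc123/edit#gid=0"))
print(missing_columns(["name", "description", "bigquery_type"]))
```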