Command line (CLI)
config
CLI config commands.
Usage:
config [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
init
Initialize configuration
Usage:
config init [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--overwrite | boolean | Whether to overwrite current config | False |
--help | boolean | Show this message and exit. | False |
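For instance, a first-time setup could look like this (a sketch; running it rewrites your local configuration files):

```shell
# Generate the default configuration files; --overwrite discards any
# existing config, so use it only on a fresh setup or a deliberate reset
basedosdados config init --overwrite
```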
refresh_template
Overwrite current templates
Usage:
config refresh_template [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset
Command to manage datasets.
Usage:
dataset [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
create
Create dataset on BigQuery
Usage:
dataset create [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to create [prod \| staging \| all] | |
--if_exists | text | [raise \| update \| replace \| pass] | |
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--location | text | Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations | None |
--help | boolean | Show this message and exit. | False |
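As a usage sketch (the dataset id below is a placeholder, and `--mode all` assumes the [prod \| staging \| all] mode choices):

```shell
# Create the prod and staging datasets for a hypothetical dataset id,
# storing the data in the US multi-region
basedosdados dataset create br_example_dataset --mode all --location US
```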
delete
Delete dataset
Usage:
dataset delete [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to delete [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
init
Initialize metadata files of dataset
Usage:
dataset init [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--replace | boolean | Whether to replace current metadata files | False |
--help | boolean | Show this message and exit. | False |
publicize
Make a dataset public
Usage:
dataset publicize [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--help | boolean | Show this message and exit. | False |
update
Update dataset on BigQuery
Usage:
dataset update [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | What datasets to update [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
download
Downloads data to SAVEPATH. SAVEPATH must point to a .csv file.

Example:

```shell
basedosdados download data.csv \
    --query="select * from basedosdados.br_ibge_pib.municipio limit 10" \
    --billing_project_id=basedosdados-dev
```
Usage:
download [OPTIONS] SAVEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--query | text | A Standard SQL query to download data from BigQuery | None |
--dataset_id | text | Dataset_id, enter with table_id to download a table | None |
--table_id | text | Table_id, enter with dataset_id to download a table | None |
--query_project_id | text | Which project the table lives in. You can change this if you want to query different projects. | None |
--billing_project_id | text | Project that will be billed. Find your Project ID here: https://console.cloud.google.com/projectselector2/home/dashboard | None |
--limit | text | Number of rows returned | None |
--help | boolean | Show this message and exit. | False |
get
Get commands.
Usage:
get [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset_description
Get the full description for given dataset
Usage:
get dataset_description [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
table_columns
Get field names, types, and descriptions for the columns of a given table
Usage:
get table_columns [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
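For example, using the public table that also appears in the download example:

```shell
# Print name, type and description for each column of a public table
basedosdados get table_columns br_ibge_pib municipio
```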
table_description
Get the full description for given table
Usage:
get table_description [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--help | boolean | Show this message and exit. | False |
list
CLI list commands.
Usage:
list [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
dataset_tables
List tables available at given dataset
Usage:
list dataset_tables [OPTIONS] DATASET_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--filter_by | text | Filter your search, must be a string | None |
--with_description | boolean | Fetch short description for each table | False |
--help | boolean | Show this message and exit. | False |
datasets
List datasets available at given project_id
Usage:
list datasets [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--project_id | text | The project which will be queried. You should have list/read permissions | basedosdados |
--filter_by | text | Filter your search, must be a string | None |
--with_description | boolean | Fetch short description for each dataset | False |
--help | boolean | Show this message and exit. | False |
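Two typical invocations (the filter string is illustrative):

```shell
# List every dataset in the default basedosdados project, with short descriptions
basedosdados list datasets --with_description
# Narrow the listing to dataset ids matching a substring
basedosdados list datasets --filter_by ibge
```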
metadata
CLI metadata commands.
Usage:
metadata [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
create
Creates new metadata config file
Usage:
metadata create [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--if_exists | text | [raise \| replace \| pass] | |
--columns | text | Data columns. Example: --columns=col1,col2 | [] |
--partition_columns | text | Columns that partition the data. Example: --partition_columns=col1,col2 | [] |
--force_columns | boolean | Overwrite columns with local columns. | False |
--table_only | boolean | Force the creation of the table_config.yaml file only if dataset_config.yaml doesn't exist. | True |
--help | boolean | Show this message and exit. | False |
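A sketch of bootstrapping metadata for a new table (dataset, table, and column names are placeholders):

```shell
# Create the dataset-level metadata file (dataset_config.yaml)
basedosdados metadata create br_example_dataset
# Create the table-level metadata, declaring data and partition columns up front
basedosdados metadata create br_example_dataset municipio \
    --columns=ano,sigla_uf,valor --partition_columns=ano
```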
is_updated
Check if user's local metadata is updated
Usage:
metadata is_updated [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
publish
Publish user's local metadata
Usage:
metadata publish [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--all | boolean | Force publishing of the metadata specified in both dataset_config.yaml and table_config.yaml at once. | False |
--if_exists | text | Define what to do in case metadata already exists in CKAN. | raise |
--update_locally | boolean | Update local metadata with the new CKAN metadata on publish. | False |
--help | boolean | Show this message and exit. | False |
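A typical publish flow validates first, then pushes both config files (ids are placeholders):

```shell
# Check the local metadata against the schema before publishing
basedosdados metadata validate br_example_dataset municipio
# Publish dataset and table metadata together, syncing the local
# files with what CKAN stored
basedosdados metadata publish br_example_dataset municipio --all --update_locally
```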
validate
Validate user's local metadata
Usage:
metadata validate [OPTIONS] DATASET_ID [TABLE_ID]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
reauth
Reauthorize credentials.
Usage:
reauth [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
storage
Commands for Google Cloud Storage.
Usage:
storage [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
copy_table
Copy table to your bucket
Usage:
storage copy_table [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--source_bucket_name | text | Bucket to copy the table from | basedosdados |
--dst_bucket_name | text | Bucket where data will be copied to; defaults to your bucket | None |
--mode, -m | text | Which bucket folder to get the table from [raw \| staging] | |
--help | boolean | Show this message and exit. | False |
delete_table
Delete table from bucket
Usage:
storage delete_table [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | Where to delete the file from [raw \| staging] | |
--bucket_name | text | Bucket from which to delete data; you can change it to delete from a bucket other than yours | None |
--not_found_ok | boolean | What to do if the table is not found | False |
--help | boolean | Show this message and exit. | False |
download
Download file from bucket
Usage:
storage download [OPTIONS] DATASET_ID TABLE_ID SAVEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--filename, -f | text | Filename to download a single file. If *, downloads all files from the bucket folder | * |
--mode, -m | text | Where to download data from [raw \| staging] | |
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_not_exists | text | [raise \| pass] if file not found at bucket folder | |
--help | boolean | Show this message and exit. | False |
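For example (ids and filename are placeholders):

```shell
# Download every file from the staging folder of a table into ./data
basedosdados storage download br_example_dataset municipio ./data --mode staging
# Or fetch a single named file instead of the whole folder
basedosdados storage download br_example_dataset municipio ./data -f chunk_01.csv
```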
init
Create bucket and initial folders
Usage:
storage init [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--bucket_name | text | Bucket name | basedosdados |
--replace | boolean | Whether to replace current bucket files | False |
--very-sure / --not-sure | boolean | Are you sure that you want to replace current bucket files? | False |
--help | boolean | Show this message and exit. | False |
upload
Upload file to bucket
Usage:
storage upload [OPTIONS] DATASET_ID TABLE_ID FILEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode, -m | text | Where to save the file [raw \| staging] | |
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_exists | text | [raise \| replace \| pass] | |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
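A sketch of uploading a partitioned CSV (ids, path, and partition value are placeholders):

```shell
# Send a local file to the staging folder, overwriting a previous upload
basedosdados storage upload br_example_dataset municipio ./data/municipio.csv \
    --mode staging --if_exists replace --partitions "ano=2020"
```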
table
Command to manage tables.
Usage:
table [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
append
Append new data to existing table
Usage:
table append [OPTIONS] DATASET_ID TABLE_ID FILEPATH
Options:
Name | Type | Description | Default |
---|---|---|---|
--partitions | text | Data partition as key=value/key2=value2 | None |
--if_exists | text | [raise \| replace \| pass] | |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
create
Create staging table in BigQuery
Usage:
table create [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--path, -p | path | Path of data folder or file. | None |
--if_table_exists | text | [raise \| replace \| pass] | |
--force_dataset | boolean | Whether to automatically create the dataset folders and the dataset in BigQuery | True |
--if_storage_data_exists | text | [raise \| replace \| pass] | |
--if_table_config_exists | text | [raise \| replace \| pass] | |
--source_format | text | Data source format. Only 'csv' is supported. Defaults to 'csv'. | csv |
--force_columns | boolean | Overwrite columns with local columns. | False |
--columns_config_url_or_path | text | Path to the local architecture file or a public Google Sheets URL. Paths only support csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats. Google Sheets URL must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>` | None |
--dataset_is_public | boolean | Control if prod dataset is public or not. By default staging datasets like dataset_id_staging are not public. | True |
--location | text | Location of dataset data. List of possible region names: https://cloud.google.com/bigquery/docs/locations | None |
--chunk_size | text | The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. | None |
--help | boolean | Show this message and exit. | False |
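Putting the main options together (placeholder ids and path; `replace` is one of the documented `if_*` actions):

```shell
# Create the staging table from a local data folder, replacing any
# previously uploaded data and any existing staging table
basedosdados table create br_example_dataset municipio \
    --path ./data \
    --if_table_exists replace \
    --if_storage_data_exists replace
```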
delete
Delete BigQuery table
Usage:
table delete [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode | text | Which table to delete [prod \| staging] | |
--help | boolean | Show this message and exit. | False |
init
Create metadata files
Usage:
table init [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--data_sample_path | path | Sample data used to pre-fill metadata | None |
--if_folder_exists | text | [raise \| replace \| pass] | |
--if_table_config_exists | text | [raise \| replace \| pass] | |
--source_format | text | Data source format. Only 'csv' is supported. Defaults to 'csv'. | csv |
--force_columns | boolean | Overwrite columns with local columns. | False |
--columns_config_url_or_path | text | Google Sheets URL. Must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>` | None |
--help | boolean | Show this message and exit. | False |
publish
Publish staging table to prod
Usage:
table publish [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--if_exists | text | [raise \| replace] actions if table exists | |
--help | boolean | Show this message and exit. | False |
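For example, after reviewing the generated publish.sql (placeholder ids):

```shell
# Materialize the treated staging table as the prod table
basedosdados table publish br_example_dataset municipio --if_exists replace
```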
update
Update tables in BigQuery
Usage:
table update [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--mode | text | Choose a table from a dataset to update [prod \| staging \| all] | |
--help | boolean | Show this message and exit. | False |
update_columns
Update column fields in table_config.yaml
Usage:
table update_columns [OPTIONS] DATASET_ID TABLE_ID
Options:
Name | Type | Description | Default |
---|---|---|---|
--columns_config_url_or_path | text | Fills columns in table_config.yaml automatically using a public Google Sheets URL or a local architecture file, regenerates publish.sql, and autofills types using bigquery_type. A local path only supports csv, xls, xlsx, xlsm, xlsb, odf, ods, odt formats; a Google Sheets URL must be in the format `https://docs.google.com/spreadsheets/d/<table_key>/edit#gid=<table_gid>`. | None |
--help | boolean | Show this message and exit. | False |

The sheet (or local architecture file) must contain the columns:

- name: column name
- description: column description
- bigquery_type: column BigQuery type
- measurement_unit: column measurement unit
- covered_by_dictionary: column related dictionary
- directory_column: column related directory in the format `<dataset_id>.<table_id>:<column_name>`
- temporal_coverage: column temporal coverage
- has_sensitive_data: whether the column has sensitive data
- observations: column observations
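For example, with a local architecture spreadsheet (path and ids are placeholders):

```shell
# Refresh table_config.yaml columns from an architecture file and
# regenerate publish.sql with the declared BigQuery types
basedosdados table update_columns br_example_dataset municipio \
    --columns_config_url_or_path ./architecture/municipio.xlsx
```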