duckdb module

Module with DuckDBData.


DuckDBData class

DuckDBData(
    wrapper,
    data,
    single_key=True,
    classes=None,
    level_name=None,
    fetch_kwargs=None,
    returned_kwargs=None,
    last_index=None,
    delisted=None,
    tz_localize=None,
    tz_convert=None,
    missing_index=None,
    missing_columns=None,
    **kwargs
)

Data class for fetching data using DuckDB.

See DuckDBData.pull() and DuckDBData.fetch_key() for arguments.

Usage

  • Set up the connection settings globally (optional):
>>> from vectorbtpro import *

>>> vbt.DuckDBData.set_custom_settings(connection="database.duckdb")
  • Pull tables:
>>> data = vbt.DuckDBData.pull(["TABLE1", "TABLE2"])
  • Rename tables:
>>> data = vbt.DuckDBData.pull(
...     ["SYMBOL1", "SYMBOL2"],
...     table=vbt.key_dict({
...         "SYMBOL1": "TABLE1",
...         "SYMBOL2": "TABLE2"
...     })
... )
  • Pull queries:
>>> data = vbt.DuckDBData.pull(
...     ["SYMBOL1", "SYMBOL2"],
...     query=vbt.key_dict({
...         "SYMBOL1": "SELECT * FROM TABLE1",
...         "SYMBOL2": "SELECT * FROM TABLE2"
...     })
... )
  • Pull Parquet files:
>>> data = vbt.DuckDBData.pull(
...     ["SYMBOL1", "SYMBOL2"],
...     read_path=vbt.key_dict({
...         "SYMBOL1": "s1.parquet",
...         "SYMBOL2": "s2.parquet"
...     })
... )


fetch_feature class method

DuckDBData.fetch_feature(
    feature,
    **kwargs
)

Fetch the table of a feature.

Uses DuckDBData.fetch_key().


fetch_key class method

DuckDBData.fetch_key(
    key,
    table=None,
    schema=None,
    catalog=None,
    read_path=None,
    read_format=None,
    read_options=None,
    query=None,
    connection=None,
    connection_config=None,
    start=None,
    end=None,
    align_dates=None,
    parse_dates=None,
    to_utc=None,
    tz=None,
    index_col=None,
    squeeze=None,
    df_kwargs=None,
    **sql_kwargs
)

Fetch a feature or symbol from a DuckDB database.

Can use a table name (which defaults to the key) or a custom query.

Args

key : str

Feature or symbol.

If table and query are both None, becomes the table name.

The key can be in the SCHEMA:TABLE format, in which case the schema argument is ignored.

table : str

Table name.

Cannot be used together with read_path or query.

schema : str

Schema name.

Cannot be used together with read_path or query.

catalog : str

Catalog name.

Cannot be used together with read_path or query.

read_path : path_like

Path to a file to read.

Cannot be used together with table, schema, catalog, or query.

read_format : str

Format of the file to read.

Allowed values are "csv", "parquet", and "json".

Requires read_path to be set.

read_options : str or dict

Options used to read the file.

Requires read_path and read_format to be set.

Uses DuckDBData.format_read_options() to transform a dictionary to a string.

query : str or DuckDBPyRelation

Custom query.

Cannot be used together with catalog, schema, table, or read_path.

connection : str or object

See DuckDBData.resolve_connection().

connection_config : dict

See DuckDBData.resolve_connection().

start : any

Start datetime (if datetime index) or any other start value.

Parsed with to_timestamp() if align_dates is True and the index is a datetime index. Otherwise, you must ensure that a value of the correct type is provided.

Cannot be used together with query. Include the condition in the query instead.

end : any

End datetime (if datetime index) or any other end value.

Parsed with to_timestamp() if align_dates is True and the index is a datetime index. Otherwise, you must ensure that a value of the correct type is provided.

Cannot be used together with query. Include the condition in the query instead.

align_dates : bool

Whether to align start and end to the timezone of the index.

Will pull one row (using LIMIT 1) and use SQLData.prepare_dt to get the index.

parse_dates : bool or sequence of str

See Data.prepare_dt().

to_utc : bool, str, or sequence of str

See Data.prepare_dt().

tz : any

Timezone.

See to_timezone().

index_col : int, str, or list

One or more columns that should become the index.

squeeze : bool

Whether to squeeze a DataFrame with one column into a Series.

df_kwargs : dict

Keyword arguments passed to relation.df to convert a relation to a DataFrame.

**sql_kwargs

Other keyword arguments passed to connection.execute to run a SQL query.

For defaults, see custom.duckdb in data.
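
The constraints between key, table, schema, catalog, read_path, and query described above can be sketched in plain Python. This is only an illustration of the documented rules, not the library's implementation: the helper name resolve_source_sketch and the returned dictionary layout are hypothetical, and it assumes that the SCHEMA:TABLE form is parsed only when no table is given.

```python
def resolve_source_sketch(key, table=None, schema=None, catalog=None,
                          read_path=None, query=None):
    """Hypothetical helper: decide what a key should be read from."""
    if query is not None:
        # A custom query excludes every other source argument.
        if any(v is not None for v in (table, schema, catalog, read_path)):
            raise ValueError(
                "query cannot be combined with table, schema, catalog, or read_path"
            )
        return {"kind": "query", "query": query}

    if read_path is not None:
        # A file path excludes table-based arguments.
        if any(v is not None for v in (table, schema, catalog)):
            raise ValueError(
                "read_path cannot be combined with table, schema, or catalog"
            )
        return {"kind": "file", "read_path": str(read_path)}

    if table is None:
        # With neither table nor query, the key becomes the table name.
        if ":" in key:
            # SCHEMA:TABLE form: the schema argument is ignored.
            schema, table = key.split(":", 1)
        else:
            table = key
    return {"kind": "table", "catalog": catalog, "schema": schema, "table": table}


plain = resolve_source_sketch("TABLE1")
qualified = resolve_source_sketch("main:TABLE1", schema="other")
custom = resolve_source_sketch("SYMBOL1", query="SELECT * FROM TABLE1")
```

Here plain resolves to the table TABLE1 with no schema, qualified resolves to the schema main (the schema argument is overridden by the key), and custom keeps the query untouched.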


fetch_symbol class method

DuckDBData.fetch_symbol(
    symbol,
    **kwargs
)

Fetch the table for a symbol.

Uses DuckDBData.fetch_key().


format_read_option class method

DuckDBData.format_read_option(
    option
)

Format a read option.


format_read_options class method

DuckDBData.format_read_options(
    options
)

Format read options.
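
As a rough illustration of what turning a dictionary of read options into a string can involve, the sketch below produces the name=value argument list accepted by DuckDB's read_csv, read_parquet, and read_json table functions. All helper names are hypothetical and the exact output format is an assumption; the actual result of DuckDBData.format_read_options() may differ.

```python
def format_option_sketch(value):
    """Hypothetical helper: render one Python value as a DuckDB SQL literal."""
    if isinstance(value, bool):  # must be checked before numbers: bool is an int
        return "true" if value else "false"
    if isinstance(value, str):
        # Single-quote the string and escape embedded quotes by doubling them.
        return "'" + value.replace("'", "''") + "'"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(format_option_sketch(v) for v in value) + "]"
    if isinstance(value, dict):
        items = (
            f"{format_option_sketch(str(k))}: {format_option_sketch(v)}"
            for k, v in value.items()
        )
        return "{" + ", ".join(items) + "}"
    return str(value)


def format_options_sketch(options):
    """Hypothetical helper: render a dict as 'name=value, name=value'."""
    return ", ".join(
        f"{name}={format_option_sketch(value)}" for name, value in options.items()
    )


def build_read_call_sketch(read_path, read_format, options=None):
    """Hypothetical helper: build a call such as read_csv('s1.csv', header=true)."""
    if read_format not in ("csv", "parquet", "json"):
        raise ValueError(f"Unsupported read_format: {read_format!r}")
    args = [format_option_sketch(str(read_path))]
    if options:
        args.append(format_options_sketch(options))
    return f"read_{read_format}({', '.join(args)})"


formatted = format_options_sketch({"header": True, "delim": ",", "skip": 2})
call = build_read_call_sketch("s1.csv", "csv", {"header": True})
```

With these inputs, formatted is the string header=true, delim=',', skip=2 and call is read_csv('s1.csv', header=true).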


format_write_option class method

DuckDBData.format_write_option(
    option
)

Format a write option.


format_write_options class method

DuckDBData.format_write_options(
    options
)

Format write options.


get_current_schema class method

DuckDBData.get_current_schema(
    connection=None,
    connection_config=None
)

Get the current schema.


list_catalogs class method

DuckDBData.list_catalogs(
    pattern=None,
    use_regex=False,
    sort=True,
    incl_system=False,
    connection=None,
    connection_config=None
)

List all catalogs.

Catalogs "system" and "temp" are skipped if incl_system is False.

Uses CustomData.key_match() to check each catalog against pattern.
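
The filtering described above can be sketched as follows. The sketch assumes that a pattern is treated as a shell-style glob unless use_regex is True, in which case it is treated as a regular expression; this is an assumption about CustomData.key_match(), and the helper names are hypothetical.

```python
import fnmatch
import re


def matches_sketch(name, pattern, use_regex=False):
    """Hypothetical stand-in for key matching: glob by default, regex on request."""
    if use_regex:
        return re.match(pattern, name) is not None
    return fnmatch.fnmatchcase(name, pattern)


def filter_catalogs_sketch(catalogs, pattern=None, use_regex=False,
                           sort=True, incl_system=False):
    """Hypothetical helper: apply the documented list_catalogs filters to names."""
    selected = []
    for name in catalogs:
        # "system" and "temp" are skipped unless incl_system is True.
        if not incl_system and name in ("system", "temp"):
            continue
        if pattern is not None and not matches_sketch(name, pattern, use_regex):
            continue
        selected.append(name)
    return sorted(selected) if sort else selected


names = ["temp", "system", "memory", "database"]
user_catalogs = filter_catalogs_sketch(names)
glob_hits = filter_catalogs_sketch(names, pattern="data*")
regex_hits = filter_catalogs_sketch(names, pattern=r"mem.*", use_regex=True)
```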


list_schemas class method

DuckDBData.list_schemas(
    catalog_pattern=None,
    schema_pattern=None,
    use_regex=False,
    sort=True,
    catalog=None,
    incl_system=False,
    connection=None,
    connection_config=None
)

List all schemas.

If catalog is None, searches for all catalog names in the database and prefixes each schema with the respective catalog name. If catalog is provided, returns the schemas corresponding to this catalog without a prefix. Schemas "information_schema" and "pg_catalog" are skipped if incl_system is False.

Uses CustomData.key_match() to check each catalog against catalog_pattern and each schema against schema_pattern.
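
The prefixing rule can be illustrated with a small sketch that works on a plain mapping of catalog names to schema names. The helper name, the input layout, and the ":" separator between catalog and schema are assumptions made for illustration (the separator mirrors the SCHEMA:TABLE key format); the real method queries the database instead.

```python
SYSTEM_SCHEMAS = {"information_schema", "pg_catalog"}


def label_schemas_sketch(schemas_by_catalog, catalog=None,
                         incl_system=False, sort=True):
    """Hypothetical helper: name schemas with or without a catalog prefix."""
    if catalog is None:
        # No catalog given: cover every catalog and prefix each schema with it.
        pairs = [
            (cat, schema)
            for cat, schemas in schemas_by_catalog.items()
            for schema in schemas
        ]
        prefixed = True
    else:
        # Catalog given: return only its schemas, without a prefix.
        pairs = [(catalog, schema) for schema in schemas_by_catalog[catalog]]
        prefixed = False

    labels = []
    for cat, schema in pairs:
        if not incl_system and schema in SYSTEM_SCHEMAS:
            continue
        labels.append(f"{cat}:{schema}" if prefixed else schema)
    return sorted(labels) if sort else labels


layout = {"db": ["main", "information_schema"], "other": ["raw"]}
all_schemas = label_schemas_sketch(layout)
db_schemas = label_schemas_sketch(layout, catalog="db")
```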


list_tables class method

DuckDBData.list_tables(
    *,
    catalog_pattern=None,
    schema_pattern=None,
    table_pattern=None,
    use_regex=False,
    sort=True,
    catalog=None,
    schema=None,
    incl_system=False,
    incl_temporary=False,
    incl_views=True,
    connection=None,
    connection_config=None
)

List all tables and views.

If schema is None, searches for all schema names in the database and prefixes each table with the respective catalog and schema name (unless there is only one schema and it is the current schema, or schema is current_schema). If schema is provided, returns the tables corresponding to this schema without a prefix.

Uses CustomData.key_match() to check each schema against schema_pattern and each table against table_pattern.


pull class method

DuckDBData.pull(
    keys=None,
    *,
    keys_are_features=None,
    features=None,
    symbols=None,
    catalog=None,
    schema=None,
    list_tables_kwargs=None,
    read_path=None,
    read_format=None,
    connection=None,
    connection_config=None,
    share_connection=None,
    **kwargs
)

Override Data.pull() to resolve and share the connection among the keys and use the table names available in the database in case no keys were provided.
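
The two behaviors mentioned above, sharing one connection among all keys and falling back to the tables found in the database when no keys are given, can be sketched with the standard-library sqlite3 module standing in for DuckDB so that the example runs without extra dependencies. The function names are hypothetical and this is not how DuckDBData.pull() is implemented.

```python
import sqlite3


def list_tables_sketch(conn):
    """Return the table names in the database (SQLite stand-in)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def fetch_key_sketch(key, conn):
    """Fetch all rows of the table named after the key."""
    quoted = '"' + key.replace('"', '""') + '"'  # quote the identifier
    return conn.execute(f"SELECT * FROM {quoted}").fetchall()


def pull_sketch(keys=None, connection=":memory:"):
    """Resolve the connection once and reuse it for every key."""
    opened_here = isinstance(connection, str)
    conn = sqlite3.connect(connection) if opened_here else connection
    try:
        if keys is None:
            # No keys provided: use the tables available in the database.
            keys = list_tables_sketch(conn)
        return {key: fetch_key_sketch(key, conn) for key in keys}
    finally:
        if opened_here:
            conn.close()  # only close connections opened by this function


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (ts TEXT, close REAL)")
conn.execute("CREATE TABLE TABLE2 (ts TEXT, close REAL)")
conn.execute("INSERT INTO TABLE1 VALUES ('2024-01-01', 1.0)")
conn.execute("INSERT INTO TABLE2 VALUES ('2024-01-01', 2.0)")

everything = pull_sketch(connection=conn)
only_second = pull_sketch(["TABLE2"], connection=conn)
```

Passing an existing connection object keeps it open after the call, which is what allows the same in-memory database to be queried twice here.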


resolve_connection class method

DuckDBData.resolve_connection(
    connection=None,
    read_only=True,
    return_meta=False,
    **connection_config
)

Resolve the connection.


update_feature method

DuckDBData.update_feature(
    feature,
    **kwargs
)

Update data of a feature.

Uses DuckDBData.update_key().


update_key method

DuckDBData.update_key(
    key,
    from_last_index=None,
    **kwargs
)

Update data of a feature or symbol.


update_symbol method

DuckDBData.update_symbol(
    symbol,
    **kwargs
)

Update data for a symbol.

Uses DuckDBData.update_key().