base module

Base class for working with records.

vectorbt works with two different representations of data: matrices and records.

A matrix, in this context, is just an array of one-dimensional arrays, each corresponding to a separate feature. The matrix itself holds only one kind of information (one attribute). For example, one can create a matrix for entry signals, with columns being different strategy configurations. But what if the matrix is huge and sparse? What if there is more information we would like to represent by each element? Creating multiple matrices would be a waste of memory.

Records make it possible to represent complex, sparse information in a dense format. A record array is a one-dimensional structured array with a fixed schema, where each field holds a different kind of information. You can imagine records being a DataFrame, where each row represents a record and each column represents a specific attribute. Read more on structured arrays in the NumPy documentation.

For example, let's represent two DataFrames as a single record array:

               a     b
         0   1.0   5.0
attr1 =  1   2.0   NaN
         2   NaN   7.0
         3   4.0   8.0
               a     b
         0   9.0  13.0
attr2 =  1  10.0   NaN
         2   NaN  15.0
         3  12.0  16.0
            |
            v
      id  col  idx  attr1  attr2
0      0    0    0      1      9
1      1    0    1      2     10
2      2    0    3      4     12
3      0    1    0      5     13
4      1    1    2      7     15
5      2    1    3      8     16
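The transformation above can be sketched with plain NumPy and pandas (no vectorbt required); the loop and variable names here are purely illustrative, not vectorbt's actual implementation:

```python
import numpy as np
import pandas as pd

# The two attribute DataFrames from the diagram above
attr1 = pd.DataFrame({'a': [1.0, 2.0, np.nan, 4.0], 'b': [5.0, np.nan, 7.0, 8.0]})
attr2 = pd.DataFrame({'a': [9.0, 10.0, np.nan, 12.0], 'b': [13.0, np.nan, 15.0, 16.0]})

record_dt = np.dtype([
    ('id', np.int_), ('col', np.int_), ('idx', np.int_),
    ('attr1', np.float64), ('attr2', np.float64)
])

# One record per element that is set; empty (NaN) elements are skipped entirely
records = []
for col in range(attr1.shape[1]):
    id_ = 0  # id is a per-column counter
    for idx in range(attr1.shape[0]):
        v1, v2 = attr1.iloc[idx, col], attr2.iloc[idx, col]
        if not np.isnan(v1):
            records.append((id_, col, idx, v1, v2))
            id_ += 1
records_arr = np.array(records, dtype=record_dt)
```

This yields the six records pictured above; the NaN elements never consume memory in the record array.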

Another advantage of records is that they are not constrained by size. Multiple records can map to a single element in a matrix. For example, one can define multiple orders at the same timestamp, which is impossible to represent in a matrix form without duplicating index entries or using complex data types.
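A minimal sketch of this idea with plain NumPy (the order fields below are hypothetical, for illustration only): two records share the same column and timestamp, something a one-value-per-cell matrix cannot express:

```python
import numpy as np

# Hypothetical order records; field names are illustrative only
order_dt = np.dtype([
    ('id', np.int_), ('col', np.int_), ('idx', np.int_), ('size', np.float64)
])
orders = np.array([
    (0, 0, 0, 1.0),
    (1, 0, 1, 2.0),   # first order at timestamp 1
    (2, 0, 1, -1.0),  # second order at the same timestamp and column
], dtype=order_dt)

# Both of the last two records map to the single matrix element (col=0, idx=1)
n_at_idx1 = int(np.sum((orders['col'] == 0) & (orders['idx'] == 1)))
```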

Consider the following example:

>>> from vectorbtpro import *

>>> example_dt = np.dtype([
...     ('id', np.int_),
...     ('col', np.int_),
...     ('idx', np.int_),
...     ('some_field', np.float64)
... ])
>>> records_arr = np.array([
...     (0, 0, 0, 10.),
...     (1, 0, 1, 11.),
...     (2, 0, 2, 12.),
...     (0, 1, 0, 13.),
...     (1, 1, 1, 14.),
...     (2, 1, 2, 15.),
...     (0, 2, 0, 16.),
...     (1, 2, 1, 17.),
...     (2, 2, 2, 18.)
... ], dtype=example_dt)
>>> wrapper = vbt.ArrayWrapper(index=['x', 'y', 'z'],
...     columns=['a', 'b', 'c'], ndim=2, freq='1 day')
>>> records = vbt.Records(wrapper, records_arr)

Printing

There are two ways to print records:

  • Raw dataframe that preserves field names and data types:
>>> records.records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0
3   0    1    0        13.0
4   1    1    1        14.0
5   2    1    2        15.0
6   0    2    0        16.0
7   1    2    1        17.0
8   2    2    2        18.0
  • Readable dataframe that uses field titles and substitutes column and index labels:
>>> records.readable
   Id Column Timestamp  some_field
0   0      a         x        10.0
1   1      a         y        11.0
2   2      a         z        12.0
3   0      b         x        13.0
4   1      b         y        14.0
5   2      b         z        15.0
6   0      c         x        16.0
7   1      c         y        17.0
8   2      c         z        18.0

Mapping

Records are just structured arrays with a bunch of methods and properties for processing them. Their main feature is the ability to map the records array and reduce it by column (similar to the MapReduce paradigm). The main advantage is that all of this happens without conversion to the matrix form, avoiding wasted memory.
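As an illustration of the idea with plain NumPy only (a conceptual sketch, not vectorbt's actual implementation): map each record to a scalar, then reduce per column using the col field, without ever building a dense matrix:

```python
import numpy as np

example_dt = np.dtype([
    ('id', np.int_), ('col', np.int_), ('idx', np.int_), ('some_field', np.float64)
])
records_arr = np.array([
    (0, 0, 0, 10.), (1, 0, 1, 11.), (2, 0, 2, 12.),
    (0, 1, 0, 13.), (1, 1, 1, 14.), (2, 1, 2, 15.),
    (0, 2, 0, 16.), (1, 2, 1, 17.), (2, 2, 2, 18.),
], dtype=example_dt)

# Map step: one scalar per record
mapped = records_arr['some_field'] ** 2

# Reduce step: aggregate per column via the 'col' field
col_sums = np.bincount(records_arr['col'], weights=mapped)
```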

Records can be mapped to MappedArray in several ways:

  • Use Records.map_field() to map a record field:
>>> records.map_field('some_field')
<vectorbtpro.records.mapped_array.MappedArray at 0x7ff49bd31a58>

>>> records.map_field('some_field').values
array([10., 11., 12., 13., 14., 15., 16., 17., 18.])
  • Use Records.map() to map records using a custom function:
>>> @njit
... def power_map_nb(record, pow):
...     return record.some_field ** pow

>>> records.map(power_map_nb, 2)
<vectorbtpro.records.mapped_array.MappedArray at 0x7ff49c990cf8>

>>> records.map(power_map_nb, 2).values
array([100., 121., 144., 169., 196., 225., 256., 289., 324.])

>>> # Map using a meta function

>>> @njit
... def power_map_meta_nb(ridx, records, pow):
...     return records[ridx].some_field ** pow

>>> vbt.Records.map(power_map_meta_nb, records.values, 2, col_mapper=records.col_mapper).values
array([100., 121., 144., 169., 196., 225., 256., 289., 324.])
  • Use Records.map_array() to convert an array to a mapped array:
>>> records.map_array(records_arr['some_field'] ** 2)
<vectorbtpro.records.mapped_array.MappedArray object at 0x7fe9bccf2978>

>>> records.map_array(records_arr['some_field'] ** 2).values
array([100., 121., 144., 169., 196., 225., 256., 289., 324.])
  • Use Records.apply() to apply a function on records per column/group:
>>> @njit
... def cumsum_apply_nb(records):
...     return np.cumsum(records.some_field)

>>> records.apply(cumsum_apply_nb)
<vectorbtpro.records.mapped_array.MappedArray at 0x7ff49c990cf8>

>>> records.apply(cumsum_apply_nb).values
array([10., 21., 33., 13., 27., 42., 16., 33., 51.])

>>> group_by = np.array(['first', 'first', 'second'])
>>> records.apply(cumsum_apply_nb, group_by=group_by, apply_per_group=True).values
array([10., 21., 33., 46., 60., 75., 16., 33., 51.])

>>> # Apply using a meta function

>>> @njit
... def cumsum_apply_meta_nb(idxs, col, records):
...     return np.cumsum(records[idxs].some_field)

>>> vbt.Records.apply(cumsum_apply_meta_nb, records.values, col_mapper=records.col_mapper).values
array([10., 21., 33., 13., 27., 42., 16., 33., 51.])

Notice how cumsum resets at each column in the first example and at each group in the second example.
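The reset behavior can be emulated with plain NumPy to see exactly what apply_per_group changes (a conceptual sketch, not the actual Numba implementation):

```python
import numpy as np

col = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
some_field = np.array([10., 11., 12., 13., 14., 15., 16., 17., 18.])

def cumsum_per(keys, values):
    # Restart the cumulative sum whenever the key changes
    out = np.empty_like(values)
    for k in np.unique(keys):
        mask = keys == k
        out[mask] = np.cumsum(values[mask])
    return out

per_column = cumsum_per(col, some_field)  # resets at each column
group_of_col = np.array([0, 0, 1])        # columns a, b -> group 0; c -> group 1
per_group = cumsum_per(group_of_col[col], some_field)  # resets at each group
```

The results match the two apply() outputs above: the first resets at every column, the second only at every group boundary.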

Filtering

Use Records.apply_mask() to filter elements per column/group:

>>> mask = [True, False, True, False, True, False, True, False, True]
>>> filtered_records = records.apply_mask(mask)
>>> filtered_records.records
   id  col  idx  some_field
0   0    0    0        10.0
1   2    0    2        12.0
2   1    1    1        14.0
3   0    2    0        16.0
4   2    2    2        18.0
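The mask can also be derived from a record field. A plain-NumPy sketch of what such a filter keeps (on a Records instance, you would pass the same boolean array to Records.apply_mask()):

```python
import numpy as np

example_dt = np.dtype([
    ('id', np.int_), ('col', np.int_), ('idx', np.int_), ('some_field', np.float64)
])
records_arr = np.array([
    (0, 0, 0, 10.), (1, 0, 1, 11.), (2, 0, 2, 12.),
    (0, 1, 0, 13.), (1, 1, 1, 14.), (2, 1, 2, 15.),
    (0, 2, 0, 16.), (1, 2, 1, 17.), (2, 2, 2, 18.),
], dtype=example_dt)

# Keep only records whose field value is at least 14
mask = records_arr['some_field'] >= 14.0
filtered_arr = records_arr[mask]
```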

Grouping

One of the key features of Records is that you can perform reducing operations on a group of columns as if they were a single column. Groups can be specified by group_by, which can be anything from positions or names of column levels, to a NumPy array with actual groups.

There are multiple ways to define grouping:

  • Use a grouped wrapper:
>>> group_by = np.array(['first', 'first', 'second'])
>>> grouped_wrapper = wrapper.replace(group_by=group_by)
>>> grouped_records = vbt.Records(grouped_wrapper, records_arr)

>>> grouped_records.map_field('some_field').mean()
first     12.5
second    17.0
dtype: float64
  • Use Records.regroup():
>>> records.regroup(group_by).map_field('some_field').mean()
first     12.5
second    17.0
dtype: float64
  • Pass group_by directly to the mapping method:
>>> records.map_field('some_field', group_by=group_by).mean()
first     12.5
second    17.0
dtype: float64
  • Pass group_by directly to the reducing method:
>>> records.map_field('some_field').mean(group_by=group_by)
a    11.0
b    14.0
c    17.0
dtype: float64
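The grouped means from the examples above (12.5 for 'first' and 17.0 for 'second') can be reproduced with plain pandas by mapping each record's column to its group label (a conceptual sketch only):

```python
import numpy as np
import pandas as pd

col = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
some_field = np.array([10., 11., 12., 13., 14., 15., 16., 17., 18.])
group_by = np.array(['first', 'first', 'second'])  # one label per column

# Reduce per group instead of per column
means = pd.Series(some_field).groupby(group_by[col]).mean()
```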

Note

Grouping applies only to reducing operations; the underlying arrays remain unchanged.

Indexing

Like any other class subclassing Wrapping, we can use pandas indexing on a Records instance, which forwards the indexing operation to each object that holds columns:

>>> records['a'].records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0

>>> grouped_records['first'].records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0
3   0    1    0        13.0
4   1    1    1        14.0
5   2    1    2        15.0

Note

Changing index (time axis) is not supported. The object should be treated as a Series rather than a DataFrame; for example, use some_field.iloc[0] instead of some_field.iloc[:, 0] to get the first column.

Indexing behavior depends solely upon ArrayWrapper. For example, if group_select is enabled, indexing will be performed on groups when grouped; otherwise, on single columns.

Caching

Records supports caching. If a method or a property requires heavy computation, it's wrapped with cached_method() and cached_property respectively. Caching can be disabled globally via the caching settings.

Note

Because of caching, the class is meant to be immutable and all properties are read-only. To change any attribute, use the Records.replace() method and pass changes as keyword arguments.

Saving and loading

Like any other class subclassing Pickleable, we can save a Records instance to the disk with Pickleable.save() and load it with Pickleable.load().

Stats

>>> records.stats(column='a')
Start                          x
End                            z
Period           3 days 00:00:00
Total Records                  3
Name: a, dtype: object

StatsBuilderMixin.stats() also supports (re-)grouping:

>>> grouped_records.stats(column='first')
Start                          x
End                            z
Period           3 days 00:00:00
Total Records                  6
Name: first, dtype: object

Plots

This class is too generic to have any subplots, but feel free to add custom subplots to your subclass.

Extending

Records class can be extended by subclassing.

In case some of our fields have the same meaning but different naming (such as the base field idx) or other properties, we can override field_config using override_field_config(). It will look for configs of all base classes and merge our config on top of them. This preserves any base class property that is not explicitly listed in our config.

>>> from vectorbtpro.records.decorators import override_field_config

>>> my_dt = np.dtype([
...     ('my_id', np.int_),
...     ('my_col', np.int_),
...     ('my_idx', np.int_)
... ])

>>> my_fields_config = dict(
...     dtype=my_dt,
...     settings=dict(
...         id=dict(name='my_id'),
...         col=dict(name='my_col'),
...         idx=dict(name='my_idx')
...     )
... )
>>> @override_field_config(my_fields_config)
... class MyRecords(vbt.Records):
...     pass

>>> records_arr = np.array([
...     (0, 0, 0),
...     (1, 0, 1),
...     (0, 1, 0),
...     (1, 1, 1)
... ], dtype=my_dt)
>>> wrapper = vbt.ArrayWrapper(index=['x', 'y'],
...     columns=['a', 'b'], ndim=2, freq='1 day')
>>> my_records = MyRecords(wrapper, records_arr)

>>> my_records.id_arr
array([0, 1, 0, 1])

>>> my_records.col_arr
array([0, 0, 1, 1])

>>> my_records.idx_arr
array([0, 1, 0, 1])

Alternatively, we can override the _field_config class attribute.

>>> @override_field_config
... class MyRecords(vbt.Records):
...     _field_config = dict(
...         dtype=my_dt,
...         settings=dict(
...             id=dict(name='my_id'),
...             idx=dict(name='my_idx'),
...             col=dict(name='my_col')
...         )
...     )

Note

Don't forget to decorate the class with @override_field_config to inherit configs from base classes.

You can stop inheritance by not decorating or passing merge_configs=False to the decorator.


MetaFields class

MetaFields(
    *args,
    **kwargs
)

Meta class that exposes a read-only class property MetaFields.field_config.

Superclasses

  • builtins.type

Subclasses


field_config property

Field config.


MetaRecords class

MetaRecords(
    *args,
    **kwargs
)

Meta class that exposes a read-only class property StatsBuilderMixin.metrics.

Superclasses

Inherited members


Records class

Records(
    wrapper,
    records_arr,
    col_mapper=None,
    **kwargs
)

Wraps the actual records array (such as trades) and exposes methods for mapping it to some array of values (such as PnL of each trade).

Args

wrapper : ArrayWrapper

Array wrapper.

See ArrayWrapper.

records_arr : array_like

A structured NumPy array of records.

Must have the fields id (record index) and col (column index).

col_mapper : ColumnMapper

Column mapper if already known.

Note

It depends on records_arr, so make sure to invalidate col_mapper upon creating a Records instance with a modified records_arr.

Records.replace() does it automatically.

**kwargs

Custom keyword arguments passed to the config.

Useful if any subclass wants to extend the config.

Superclasses

Inherited members

Subclasses


apply class method

Records.apply(
    apply_func_nb,
    *args,
    group_by=None,
    apply_per_group=False,
    dtype=None,
    jitted=None,
    chunked=None,
    col_mapper=None,
    **kwargs
)

Apply function on records per column/group. Returns mapped array.

Applies per group if apply_per_group is True.

See apply_nb().

For details on the meta version, see apply_meta_nb().

**kwargs are passed to Records.map_array().


apply_mask method

Records.apply_mask(
    mask,
    group_by=None,
    **kwargs
)

Return a new class instance, filtered by mask.


build_field_config_doc class method

Records.build_field_config_doc(
    source_cls=None
)

Build field config documentation.


col_arr property

Get column array.


col_mapper property

Column mapper.

See ColumnMapper.


column_stack class method

Records.column_stack(
    *objs,
    wrapper_kwargs=None,
    get_indexer_kwargs=None,
    **kwargs
)

Stack multiple Records instances along columns.

Uses ArrayWrapper.column_stack() to stack the wrappers and Records.column_stack_records_arrs() to stack the record arrays.

get_indexer_kwargs are passed to pandas.Index.get_indexer to translate old indices to new ones after the reindexing operation.

Note

Will produce a column-sorted array.


column_stack_records_arrs class method

Records.column_stack_records_arrs(
    *objs,
    get_indexer_kwargs=None,
    **kwargs
)

Stack multiple record arrays along columns.


count method

Records.count(
    group_by=None,
    wrap_kwargs=None
)

Get count by column.


coverage_map method

Records.coverage_map(
    **kwargs
)

See MappedArray.coverage_map().


field_config class variable

Field config of Records.

HybridConfig(
    dtype=None,
    settings=dict(
        id=dict(
            name='id',
            title='Id',
            mapping='ids'
        ),
        col=dict(
            name='col',
            title='Column',
            mapping='columns',
            as_customdata=False
        ),
        idx=dict(
            name='idx',
            title='Index',
            mapping='index'
        )
    )
)

Returns Records._field_config, which gets (hybrid-) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change fields, you can either change the config in-place, override this property, or overwrite the instance variable Records._field_config.


field_names property

Field names.


first_n method

Records.first_n(
    n,
    jitted=None,
    chunked=None,
    **kwargs
)

Return the first N records in each column.


get_apply_mapping_arr method

Records.get_apply_mapping_arr(
    field,
    mapping_kwargs=None,
    **kwargs
)

Get the mapped array on the field, with mapping applied. Uses Records.field_config.


get_apply_mapping_str_arr method

Records.get_apply_mapping_str_arr(
    field,
    mapping_kwargs=None,
    **kwargs
)

Get the mapped array on the field, with mapping applied and stringified. Uses Records.field_config.


get_column_stack_record_indices class method

Records.get_column_stack_record_indices(
    *objs,
    **kwargs
)

Get the indices that map concatenated record arrays into the column-stacked record array.


get_field_arr method

Records.get_field_arr(
    field,
    copy=False
)

Get the array of the field. Uses Records.field_config.


get_field_mapping method

Records.get_field_mapping(
    field
)

Get the mapping of the field. Uses Records.field_config.


get_field_name method

Records.get_field_name(
    field
)

Get the name of the field. Uses Records.field_config.


get_field_setting method

Records.get_field_setting(
    field,
    setting,
    default=None
)

Get any setting of the field. Uses Records.field_config.


get_field_title method

Records.get_field_title(
    field
)

Get the title of the field. Uses Records.field_config.


get_map_field method

Records.get_map_field(
    field,
    **kwargs
)

Get the mapped array of the field. Uses Records.field_config.


get_map_field_to_columns method

Records.get_map_field_to_columns(
    field,
    **kwargs
)

Get the mapped array on the field, with columns applied. Uses Records.field_config.


get_map_field_to_index method

Records.get_map_field_to_index(
    field,
    minus_one_to_zero=False,
    **kwargs
)

Get the mapped array on the field, with index applied. Uses Records.field_config.


get_pd_mask method

Records.get_pd_mask(
    idx_arr=None,
    group_by=None,
    wrap_kwargs=None
)

Get mask in form of a Series/DataFrame from row and column indices.


get_row_stack_record_indices class method

Records.get_row_stack_record_indices(
    *objs,
    **kwargs
)

Get the indices that map concatenated record arrays into the row-stacked record array.


has_conflicts method

Records.has_conflicts(
    **kwargs
)

See MappedArray.has_conflicts().


id_arr property

Get id array.


idx_arr property

Get index array.


indexing_func method

Records.indexing_func(
    *args,
    records_meta=None,
    **kwargs
)

Perform indexing on Records.


indexing_func_meta method

Records.indexing_func_meta(
    *args,
    wrapper_meta=None,
    **kwargs
)

Perform indexing on Records and return metadata.

By default, all fields that are mapped to index are indexed. To avoid indexing on some fields, set their setting noindex to True.


is_sorted method

Records.is_sorted(
    incl_id=False,
    jitted=None
)

Check whether records are sorted.


last_n method

Records.last_n(
    n,
    jitted=None,
    chunked=None,
    **kwargs
)

Return the last N records in each column.


map class method

Records.map(
    map_func_nb,
    *args,
    dtype=None,
    jitted=None,
    chunked=None,
    col_mapper=None,
    **kwargs
)

Map each record to a scalar value. Returns mapped array.

See map_records_nb().

For details on the meta version, see map_records_meta_nb().

**kwargs are passed to Records.map_array().


map_array method

Records.map_array(
    a,
    idx_arr=None,
    mapping=None,
    group_by=None,
    **kwargs
)

Convert array to mapped array.

The length of the array must match that of the records.


map_field method

Records.map_field(
    field,
    **kwargs
)

Convert field to mapped array.

**kwargs are passed to Records.map_array().


metrics class variable

Metrics supported by Records.

HybridConfig(
    start_index=dict(
        title='Start Index',
        calc_func=<function Records.<lambda> at 0x132587c40>,
        agg_func=None,
        tags='wrapper'
    ),
    end_index=dict(
        title='End Index',
        calc_func=<function Records.<lambda> at 0x132587ce0>,
        agg_func=None,
        tags='wrapper'
    ),
    total_duration=dict(
        title='Total Duration',
        calc_func=<function Records.<lambda> at 0x132587d80>,
        apply_to_timedelta=True,
        agg_func=None,
        tags='wrapper'
    ),
    count=dict(
        title='Count',
        calc_func='count',
        tags='records'
    )
)

Returns Records._metrics, which gets (hybrid-) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change metrics, you can either change the config in-place, override this property, or overwrite the instance variable Records._metrics.


override_field_config_doc class method

Records.override_field_config_doc(
    __pdoc__,
    source_cls=None
)

Call this method on each subclass that overrides Records.field_config.


pd_mask property

MappedArray.get_pd_mask with default arguments.


plots_defaults property

Defaults for PlotsBuilderMixin.plots().

Merges PlotsBuilderMixin.plots_defaults and plots from records.


prepare_customdata method

Records.prepare_customdata(
    incl_fields=None,
    excl_fields=None,
    append_info=None,
    mask=None
)

Prepare customdata and hoverinfo for Plotly.

Will display all fields in the data type, or only those in incl_fields, unless a field has the field config setting as_customdata disabled or is listed in excl_fields. Additionally, you can define a hovertemplate in the field config, for example by using Sub, where $title is substituted by the field's title and $index by the field's (final) index in the customdata. If provided as a string, it will be wrapped with Sub. Defaults to "$title: %{{customdata[$index]}}". Mapped fields will be stringified automatically.

To append one or more custom arrays, provide append_info as a list of tuples, each consisting of a 1-dim NumPy array, title, and optionally hoverinfo. If the array's data type is object, will treat it as strings, otherwise as numbers.


random_n method

Records.random_n(
    n,
    seed=None,
    jitted=None,
    chunked=None,
    **kwargs
)

Return random N records in each column.


readable property

Records.to_readable() with default arguments.


recarray property


records property

Records.


records_arr property

Records array.


records_readable property

Records.to_readable() with default arguments.


replace method

Records.replace(
    **kwargs
)

See Configured.replace().

Also, makes sure that Records.col_mapper is not passed to the new instance.


resample method

Records.resample(
    *args,
    records_meta=None,
    **kwargs
)

Perform resampling on Records.


resample_meta method

Records.resample_meta(
    *args,
    wrapper_meta=None,
    **kwargs
)

Perform resampling on Records and return metadata.


resample_records_arr method

Records.resample_records_arr(
    resampler
)

Perform resampling on the record array.


row_stack class method

Records.row_stack(
    *objs,
    wrapper_kwargs=None,
    **kwargs
)

Stack multiple Records instances along rows.

Uses ArrayWrapper.row_stack() to stack the wrappers and Records.row_stack_records_arrs() to stack the record arrays.

Note

Will produce a column-sorted array.


row_stack_records_arrs class method

Records.row_stack_records_arrs(
    *objs,
    **kwargs
)

Stack multiple record arrays along rows.


select_cols method

Records.select_cols(
    col_idxs,
    jitted=None
)

Select columns.

Returns indices and new record array. Automatically decides whether to use column lengths or column map.


sort method

Records.sort(
    incl_id=False,
    group_by=None,
    **kwargs
)

Sort records by columns (primary) and ids (secondary, optional).

Note

Sorting is expensive. A better approach is to append records already in the correct order.


stats_defaults property

Defaults for StatsBuilderMixin.stats().

Merges StatsBuilderMixin.stats_defaults and stats from records.


subplots class variable

Subplots supported by Records.

HybridConfig()

Returns Records._subplots, which gets (hybrid-) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change subplots, you can either change the config in-place, override this property, or overwrite the instance variable Records._subplots.


to_readable method

Records.to_readable(
    expand_columns=False
)

Get records in a human-readable format.


values property

Records array.


RecordsWithFields class

RecordsWithFields()

Class that exposes a read-only class property RecordsWithFields.field_config.

Subclasses


field_config function

Field config of ${cls_name}.

${field_config}