Release notes for 2021
All notable changes in reverse chronological order.
Version 1.0.2 (31 Dec, 2021)
- Added Alpaca data source (#31). In contrast to the open-source version, it additionally allows passing a pre-configured REST object to the AlpacaData.pull method.
- Changed the default index field of EntryTrades from exit_idx to entry_idx
- Dropped JSON and implemented a custom formatting engine that represents objects in Python format, which perfectly aligns with the switch to dataclasses VBT has made. Here's a comparison of a wrapper printed out by the open-source version using JSON, and by VBT with the new engine:
```
ArrayWrapper(**Config({
    "index": "<RangeIndex at 0x1045815e8> of shape (3,)",
    "columns": "<Int64Index at 0x1045815e8> of shape (1,)",
    "ndim": 1,
    "freq": null,
    "column_only_select": null,
    "group_select": null,
    "grouped_ndim": null,
    "group_by": null,
    "allow_enable": true,
    "allow_disable": true,
    "allow_modify": true
}))
```
```
ArrayWrapper(
    index=<RangeIndex at 0x1045815e8 of shape (3,)>,
    columns=<Int64Index at 0x1045815e8 of shape (1,)>,
    ndim=1,
    freq=None,
    column_only_select=None,
    group_select=None,
    grouped_ndim=None,
    grouper=Grouper(
        index=<Int64Index at 0x1045815e8 of shape (1,)>,
        group_by=None,
        allow_enable=True,
        allow_disable=True,
        allow_modify=True
    )
)
```
Version 1.0.1 (21 Dec, 2021)
- Adapted the codebase to the new documentation format
- Upgraded the documentation website generator from pdoc3 to MkDocs (Material Insiders). API is being automatically converted to Markdown files by a modified version of pdoc3 that resides in a private repository of @polakowo.
Version 1.0.0 (13 Dec, 2021)
Info
This section briefly describes major changes made to the open-source version. For more details, see commits.
Execution
- Parallelized most functions that take 2-dimensional arrays using Explicit Parallel Loops
- Built an infrastructure for chunking. Any Python function can be wrapped with the @chunked decorator, which returns a new function with an identical signature but capable of 1) splitting the passed positional and keyword arguments into multiple chunks, 2) executing each chunk of arguments using the wrapped function, and 3) merging back the results. The rules by which the arguments are split and the results are merged must be explicitly provided using arg_take_spec and merge_func respectively. The chunk taking and merging specification is provided to most of the Numba-compiled functions that take 2-dimensional arrays. To only chunk functions by request, the decorator register_chunkable was created, which leaves the Python function unwrapped and registers a so-called "setup" with all specifications at the global registry ChunkableRegistry. Additionally, there are multiple preset engines for executing chunks: SerialEngine (a simple queue), DaskEngine (mainly for multithreading), and RayEngine (mainly for multiprocessing).
- Built an infrastructure for wrapping and running JIT-able functions. At the heart of it is the register_jitted decorator, which registers a Python function and the instructions on how to JIT-compile it at the global registry JITRegistry. The registry, once instructed, finds the function's setup and passes the function to a jitting class (aka "jitter") for wrapping. Preset jitters include NumPyJitter for NumPy implementations and NumbaJitter for Numba-compiled functions. The registry can also register tasks (by task id) and capture multiple jitter candidates for the same task. The user can then switch between different implementations by specifying jitter.
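To illustrate the chunking workflow described above (not vectorbt's actual implementation), here's a minimal sketch: the hypothetical arg_take_spec is just a list marking which positional arguments to split, and merge_func combines the per-chunk results.

```python
# Minimal sketch of the chunking idea: split arguments into chunks, run the
# wrapped function on each chunk, and merge the results back.
from functools import wraps

def chunked(arg_take_spec, merge_func, n_chunks=2):
    """Return a wrapper that splits, executes, and merges (toy version)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args):
            chunk_args = []
            for i in range(n_chunks):
                taken = []
                for arg, take in zip(args, arg_take_spec):
                    if take == "split":
                        # split the sequence into roughly equal parts
                        size = -(-len(arg) // n_chunks)
                        taken.append(arg[i * size:(i + 1) * size])
                    else:
                        # pass the argument to every chunk as-is
                        taken.append(arg)
                chunk_args.append(tuple(taken))
            # a serial "engine": execute chunks one after another
            results = [func(*ca) for ca in chunk_args]
            return merge_func(results)
        return wrapper
    return decorator

@chunked(arg_take_spec=["split", None], merge_func=sum)
def weighted_total(values, weight):
    return sum(v * weight for v in values)

print(weighted_total([1, 2, 3, 4], 10))  # → 100
```

A real engine would submit the chunks to a thread pool, Dask, or Ray instead of the serial list comprehension.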
Generic
- Refactored many methods that take UDFs (such as GenericAccessor.rolling_apply) by converting each into both a class (meta) and an instance (regular) method using class_or_instancemethod. If the method is called on an instance, its UDFs don't have to take any metadata apart from (a part of) the array, such as apply_func_nb(window, *args). If the method is called on the class, it iterates over an abstract shape, and its UDFs must take the metadata of each iteration, which can be used to select a part of any custom array passed as a variable argument, such as apply_func_nb(from_i, to_i, col, *args). Previously, UDFs had to accept both the metadata and the array, even if the metadata was not used.
- Most of the functions that take custom UDFs and variable arguments, such as GenericAccessor.rolling_apply, received support for utils.templates. The same goes for broadcasting named arguments - a practice initially introduced in Portfolio.
- Made crossovers more resilient to NaN and moved them to GenericAccessor.crossed_above and GenericAccessor.crossed_below
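The NaN-resilience of crossovers can be sketched roughly like this (a simplified illustration, not the actual implementation): instead of comparing against the immediately preceding bar, the check falls back to the last bar where both series had valid values.

```python
# Sketch of a NaN-resilient "crossed above" check: missing bars are skipped
# rather than producing bogus signals.
import numpy as np

def crossed_above(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.zeros(a.shape, dtype=bool)
    last_valid = None  # index of the last bar where both values were present
    for i in range(len(a)):
        if np.isnan(a[i]) or np.isnan(b[i]):
            continue
        if last_valid is not None and a[last_valid] <= b[last_valid] and a[i] > b[i]:
            out[i] = True
        last_valid = i
    return out

fast = [1.0, np.nan, 3.0]
slow = [2.0, 2.0, 2.0]
print(crossed_above(fast, slow))  # → [False False  True]
```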
- Added BaseAccessor.eval, which is similar to pd.eval but broadcasts inputs prior to evaluation and can switch between NumPy and NumExpr
- Improved conflict control in MappedArray. Multiple elements pointing to the same timestamp can be reduced using MappedArray.reduce_segments. They can also be safely converted to Pandas by repeating the index.
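The broadcast-then-evaluate idea behind such an eval helper can be sketched with plain NumPy (a simplified stand-in; the eval_expr function below is hypothetical, not the actual API):

```python
# Sketch: align all operands to a common shape first, then run the expression.
import numpy as np

def eval_expr(expr, operands):
    # broadcast all operands to a common shape before evaluating
    broadcasted = dict(zip(operands, np.broadcast_arrays(*operands.values())))
    return eval(expr, {"__builtins__": {}}, broadcasted)

price = np.array([[10.0], [11.0], [12.0]])  # shape (3, 1): one column
factor = np.array([1.0, 2.0])               # shape (2,): two "columns"
out = eval_expr("price * factor", {"price": price, "factor": factor})
print(out.shape)  # → (3, 2)
```

Swapping the final `eval` call for NumExpr's evaluator would be the analogue of switching engines.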
- Made tolerance checks and values for Numba math functions such as is_less_nb globally adjustable. Disabling tolerance checks increases performance but can lead to round-off errors.
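A tolerance-aware comparison can be sketched as follows (illustrative only; REL_TOL and ABS_TOL stand in for the globally adjustable settings):

```python
# Sketch: values closer than the tolerances are treated as equal, so
# floating-point round-off doesn't flip the comparison result.
import math

REL_TOL = 1e-9   # stand-ins for globally adjustable settings
ABS_TOL = 1e-12

def is_less(a, b, use_tol=True):
    if use_tol and math.isclose(a, b, rel_tol=REL_TOL, abs_tol=ABS_TOL):
        return False  # within tolerance: treat as equal, not less
    return a < b

x = 0.1 + 0.2  # 0.30000000000000004 due to round-off
print(0.3 < x)                         # → True (round-off artifact)
print(is_less(0.3, x))                 # → False (tolerance absorbs it)
print(is_less(0.3, x, use_tol=False))  # → True (faster, but unguarded)
```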
- Implemented context managers for profiling time (Timer) and memory (MemTracer)
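A minimal timer along these lines might look like this (a sketch, not the actual implementation):

```python
# Sketch of a Timer context manager for profiling a code block.
import time

class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self.start
        return False  # don't suppress exceptions

with Timer() as timer:
    total = sum(range(1_000_000))

print(timer.elapsed >= 0)  # → True
```

A memory tracer would follow the same pattern, sampling process memory on enter and exit.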
- Added support for Scattergl for plotting big datasets with increased speed. Used by default on more than 10,000 points.
Broadcasting
- Refactored the broadcasting mechanism inside broadcast. Added the BCO dataclass, whose instances can be passed to change the broadcasting behavior of individual objects. Introduced the possibility to build a Cartesian product of scalar-like parameters and other broadcastable objects (both using BCO and pd.Index as a shortcut) using operation trees and generate_param_combs. Additionally, a random subset of parameter combinations can be automatically selected to emulate random search. Default and Ref and the surrounding logic were moved to this module.
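The parameter-combination idea can be sketched with itertools (the generate_combs helper below is hypothetical and far simpler than the actual operation trees):

```python
# Sketch: build a Cartesian product of parameter values, optionally sampling a
# random subset to emulate random search.
import itertools
import random

def generate_combs(params, random_subset=None, seed=None):
    combs = list(itertools.product(*params.values()))
    if random_subset is not None:
        random.seed(seed)
        combs = random.sample(combs, random_subset)
    return [dict(zip(params, c)) for c in combs]

params = {"window": [10, 20, 30], "alpha": [0.1, 0.5]}
all_combs = generate_combs(params)
print(len(all_combs))  # → 6
some_combs = generate_combs(params, random_subset=2, seed=42)
print(len(some_combs))  # → 2
```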
Data
- Implemented data classes for working with local files: CSVData for CSV files and HDFData for HDF files and keys. Both support efficient updates without having to read the entire file. To make this possible, symbol fetching methods can return a state, which is preserved for use in data updates.
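The state-returning idea can be sketched like this (a toy stand-in for the actual fetching methods; fetch_csv is hypothetical):

```python
# Sketch: the fetcher returns the data plus a state (here, the last row read),
# so a later update can skip everything that was already read.
import csv
import io

def fetch_csv(buffer, last_row=0):
    buffer.seek(0)
    rows = list(csv.reader(buffer))[last_row:]
    return rows, {"last_row": last_row + len(rows)}  # data + state

f = io.StringIO("a,1\nb,2\n")
rows, state = fetch_csv(f)
print(len(rows))  # → 2

f.seek(0, io.SEEK_END)
f.write("c,3\n")  # new data arrives
new_rows, state = fetch_csv(f, **state)
print(new_rows)   # → [['c', '3']]
```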
- Refactored RandomData and GBMData
- Better handling of missing data. Made BinanceData and CCXTData more error-resilient: in case of connectivity issues, the data fetched so far won't be lost but returned, so it can be updated later.
- Moved progress bar logic into a separate module to standardize handling of all progress bars across vectorbt. Added progress bar for symbols in Data.
- Renamed download to fetch everywhere, since not all data sources reside online
Portfolio
- Added a new simulation method Portfolio.from_def_order_func that combines Portfolio.from_orders and Portfolio.from_order_func. It uses a custom order function to transform array-like objects into orders and allows attaching and overriding user-defined callbacks to change and monitor simulation.
- Added support for in-output simulation objects. Instead of creating various arrays during the simulation, they can be manually created by the user (or automatically created and broadcasted by utilizing templates) outside the simulation, passed as regular arguments, and modified in-place. They are then conveniently stored in Portfolio.in_outputs for further analysis. In addition, Portfolio can detect when an in-output array shadows a regular portfolio attribute and takes this array instead of reconstructing the attribute, which is the new way to efficiently precompute various artifacts such as returns.
- Implemented shortcut properties for Portfolio and Records, which are cacheable properties that call their respective getter methods with default arguments. This enables dot notation such as pf.trades.winning.pnl.count(), where trades and winning are cached properties that call the Portfolio.get_trades and Trades.get_winning methods respectively. In Portfolio, shortcut properties can also utilize in-outputs.
- Made various portfolio attributes (such as Portfolio.get_returns) flexible by converting each into both a class and an instance method using class_or_instancemethod. If the method is called on the class, the operation is run using the passed arguments only. If the method is called on an instance, the operation is run on the data from the instance, which can be overridden by setting any of the arguments.
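A shortcut property along the lines described above can be sketched as follows (simplified; the actual implementation also plugs into the caching machinery):

```python
# Sketch: a cached property that calls the full getter with default arguments,
# enabling terse dot notation.
from functools import cached_property

class Portfolio:
    def __init__(self, returns):
        self._returns = returns

    def get_returns(self, scale=1.0):
        # the full getter accepts arguments...
        return [r * scale for r in self._returns]

    @cached_property
    def returns(self):
        # ...while the shortcut calls it with defaults and caches the result
        return self.get_returns()

pf = Portfolio([0.01, -0.02])
print(pf.returns)                # → [0.01, -0.02]
print(pf.returns is pf.returns)  # → True (cached)
```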
- Introduced extra validation of arguments passed to simulation. For instance, passing arrays that look boolean but have object data type raises an (informative) error.
- Not only Portfolio.from_signals but all the simulation functions accept open, high, and low (all optional). This enables various interesting automatisms: an order price of -np.inf gets automatically replaced by the opening price, and np.inf (the default everywhere) by the closing price. The highest and lowest prices are used for bar boundary checks.
- Added the following arguments:
  - cash_deposits: cash deposits/withdrawals at the beginning of each time step
  - cash_earnings: cash earnings (independent of position) at the end of each time step
  - cash_dividends: dividends (relative to position) at the end of each time step
  - init_position: the initial position
  - stop_signal_priority: which signal to prioritize: stop or user?
- Allowed a price of 0. This allows for P&L-effective insertion and removal of cash and assets. For instance, to periodically charge a fee, one can create a range of orders with zero price and non-zero fees. They are visible as regular trades and appear in records.
- Allowed max_order_records=0 and max_log_records=0 to disable filling records - for example, if the performance is assessed during the simulation and there is no need to save this data for post-simulation analysis. Also, for parallelization reasons, both of the numbers refer to the maximum number of records per column rather than per entire input.
- Allowed negative fees (-0.05 means that you earn 0.05% per trade instead of paying a fee)
- Converted simulation outputs to named tuples of type SimulationOutput
Returns
- Updated metrics based on returns to take into account datetime-like properties. For instance, two data points with the timestamps "2020-01-01" and "2021-01-01" are now considered a full year rather than 2 days, as previously. See ArrayWrapper.dt_period.
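The difference can be sketched with pandas (illustrative numbers only, not the actual dt_period implementation):

```python
# Sketch: derive the number of periods from the datetime span covered by the
# index rather than from the row count.
import pandas as pd

index = pd.DatetimeIndex(["2020-01-01", "2021-01-01"])

# naive approach: count the rows -> 2 "days" of data
naive_period = len(index)

# datetime-aware approach: measure the actual span covered by the index
span = index[-1] - index[0]               # 366 days (2020 is a leap year)
years = span / pd.Timedelta(days=365.25)  # ~1 year

print(naive_period)     # → 2
print(round(years, 2))  # → 1.0
```

Annualization factors built on the datetime-aware period avoid wildly inflated metrics for sparse indexes.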
- Rolling metrics such as ReturnsAccessor.rolling_sortino_ratio were made much faster by refactoring their Numba-compiled functions
Caching
- Completely refactored caching. Previously, caching was managed by specialized property and method decorators. Once the user invoked such a property or method, it checked the global settings to see whether it was blacklisted, and stored the cache on the instance it was bound to. Cached attributes weren't easily discoverable, which led to less transparency. In the new approach, caching is managed by a global registry CacheableRegistry, which registers so-called "setups" for all cacheable objects, such as functions, properties, methods, instances, and even classes. They all build a well-connected hierarchy that can propagate actions. For instance, disabling caching in a class setup of type CAClassSetup will disable caching across all of its child setups, down to CARunSetup, which takes care of the actual caching. Cacheable decorators such as cacheable communicate with the registry and perform all actions on their particular setup. The user can easily find the setup for any (cacheable) object to, for example, display various caching statistics.
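The propagation idea can be sketched like this (a toy model; the actual hierarchy has more setup types and actions):

```python
# Sketch: disabling a parent setup propagates to all of its children, which
# hold the actual cache.
class Setup:
    def __init__(self, name, parent=None):
        self.name = name
        self.enabled = True
        self.children = []
        self.cache = {}
        if parent is not None:
            parent.children.append(self)

    def disable(self):
        """Disable caching here and in all child setups."""
        self.enabled = False
        for child in self.children:
            child.disable()

    def run(self, key, func):
        """Return the cached result, or compute it (caching only if enabled)."""
        if key in self.cache:
            return self.cache[key]
        result = func()
        if self.enabled:
            self.cache[key] = result
        return result

class_setup = Setup("Portfolio")
run_setup = Setup("Portfolio.get_returns", parent=class_setup)

run_setup.run("a", lambda: 42)
class_setup.disable()     # propagates down to run_setup
print(run_setup.enabled)  # → False
```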
- Removed caching of attributes that return DataFrames (apart from a few exceptions) to avoid wasting memory
Design
- Restructured the project and reformatted the codebase. Most notably, Numba-compiled simulation functions were distributed across multiple modules.
- Some previously required packages such as Plotly and Dill were made optional to make the core of vectorbt even more lightweight. Optional packages are tracked in utils.opt_packages, and whenever code that requires a missing package is accessed, an error is raised with instructions on how to install it.
- Converted minimalistic classes to dataclasses using attrs
- Refactored Config, which shrank the initialization time of various vectorbt objects by 25%. Config respects the Liskov substitution principle and, similar to a dict, can be initialized using both positional and keyword arguments. Also, created read-only and hybrid preset classes to unify configs created across vectorbt.
- Removed expected key checks, which makes subclassing vectorbt classes easier but removes dynamic checks of keyword arguments passed to the initializer (which is an overkill anyway)
- Accessors were made cached by default (which can be changed in the settings) to avoid repeated initialization, and all options for changing data in-place were removed
- Made _settings more modular and better embeddable into documentation. Additionally, upon import, vectorbt looks for an environment variable that contains the path to a settings file and updates/replaces the current settings in-place.
- Created and set up a private repository