
base module

Base class for splitting.


FixRange class

FixRange(
    *args,
    **kwargs
)

Class that represents a fixed range.


range_ class variable

Range.


RelRange class

RelRange(
    *args,
    **kwargs
)

Class that represents a relative range.


is_gap class variable

Whether the range acts as a gap.


length class variable

Length.

Floating values between 0 and 1 are considered relative.

Can be negative.


length_space class variable

Length space.

Supported are

  • 'all': All space
  • 'free': Remaining space after the offset
  • 'free_or_prev': Remaining space after the offset or the start/end of the previous range, depending on what comes first in the direction of RelRange.length

Applied only when RelRange.length is a relative number.


offset class variable

Offset.

Floating values between 0 and 1 are considered relative.

Can be negative.


offset_anchor class variable

Offset anchor.

Supported are

  • 'start': Start of the range
  • 'end': End of the range
  • 'prev_start': Start of the previous range
  • 'prev_end': End of the previous range

offset_space class variable

Offset space.

Supported are

  • 'all': All space
  • 'free': Remaining space after the offset anchor
  • 'prev': Length of the previous range

Applied only when RelRange.offset is a relative number.


out_of_bounds class variable

Check if start and stop are within bounds.

Supported are

  • 'keep': Keep out-of-bounds values
  • 'ignore': Ignore if out-of-bounds
  • 'warn': Emit a warning if out-of-bounds
  • 'raise': Raise an error if out-of-bounds

to_slice method

RelRange.to_slice(
    total_len,
    prev_start=0,
    prev_end=0,
    index=None,
    freq=None
)

Convert the relative range into a slice.
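
As a rough illustration of how the fields above could combine, the following sketch resolves a relative offset and length into an integer slice. It is not the library's implementation: the helper names resolve_amount and rel_range_to_slice are hypothetical, only the 'start' and 'prev_end' anchors and the 'free' spaces are modeled, negative values and out-of-bounds handling are left out, and truncation toward zero is an assumed rounding rule.

```python
def resolve_amount(value, space):
    """Return an absolute element count for a relative or absolute value."""
    # Assumption: floats in [0, 1] are relative to `space`; anything else is absolute.
    if isinstance(value, float) and 0.0 <= value <= 1.0:
        return int(value * space)
    return int(value)


def rel_range_to_slice(total_len, offset=0, length=1.0, prev_end=0, offset_anchor="prev_end"):
    """Convert a simplified relative range into an integer slice (hypothetical helper)."""
    anchors = {"start": 0, "prev_end": prev_end}
    anchor = anchors[offset_anchor]
    # 'free' offset space: whatever remains after the anchor
    start = anchor + resolve_amount(offset, total_len - anchor)
    # 'free' length space: whatever remains after the offset
    stop = start + resolve_amount(length, total_len - start)
    return slice(start, stop)


print(rel_range_to_slice(100, length=0.5))
print(rel_range_to_slice(100, offset=10, length=0.5, prev_end=20))
```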


Splitter class

Splitter(
    wrapper,
    index,
    splits_arr,
    **kwargs
)

Base class for splitting.


apply method

Splitter.apply(
    apply_func,
    *apply_args,
    split=None,
    set_=None,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    remap_to_obj=True,
    obj_index=None,
    obj_freq=None,
    range_format='slice_or_any',
    point_wise=False,
    attach_bounds=False,
    right_inclusive=False,
    template_context=None,
    silence_warnings=False,
    index_combine_kwargs=None,
    freq=None,
    iteration='split_wise',
    execute_kwargs=None,
    filter_results=True,
    raise_no_results=True,
    merge_func=None,
    merge_kwargs=None,
    merge_all=True,
    wrap_results=True,
    eval_id=None,
    **apply_kwargs
)

Apply a function on each range.

Uses Splitter.select_indices() to get the indices for selected splits and sets. Arguments split_group_by and set_group_by can be used to group splits and sets respectively. Ranges belonging to the same split and set group will be merged.

For each index pair, lazily resolves the source range using Splitter.select_range() and Splitter.get_ready_range(). Then, takes each argument from args and kwargs wrapped with Takeable, remaps the range into each object's index using Splitter.get_ready_obj_range(), and takes the slice from that object using Splitter.take_range(). The original object is substituted by this slice. Finally, substitutes any templates in the prepared args and kwargs and saves the function and arguments for execution.

For substitution, the following information is available:

  • split/set_group_indices: Indices corresponding to the selected row/column groups
  • split/set_indices: Indices corresponding to the selected rows/columns
  • n_splits/sets: Number of the selected rows/columns
  • split/set_labels: Labels corresponding to the selected row/column groups
  • split/set_idx: Index of the selected row/column
  • split/set_label: Label of the selected row/column
  • range_: Selected range ready for indexing (see Splitter.get_ready_range())
  • range_meta: Various information on the selected range
  • obj_range_meta: Various information on the range taken from each takeable argument. Positional arguments are denoted by position, keyword arguments are denoted by keys.
  • args: Positional arguments with ranges already selected
  • kwargs: Keyword arguments with ranges already selected
  • bounds: A tuple of either integer or index bounds. Can be source or target depending on attach_bounds.
  • template_context: Passed template context

Since each range is processed lazily (that is, upon request), there are multiple iteration modes controlled by the argument iteration:

  • 'split_major': Flatten all ranges in split-major order and iterate over them
  • 'set_major': Flatten all ranges in set-major order and iterate over them
  • 'split_wise': Iterate over splits, while ranges in each split are processed sequentially
  • 'set_wise': Iterate over sets, while ranges in each set are processed sequentially
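
The four iteration modes listed above differ only in the order and grouping of (split, set) pairs. The plain-Python sketch below shows those orders for a hypothetical layout of 2 splits and 3 sets; it illustrates the ordering only and does not reproduce how the library schedules or executes the calls.

```python
n_splits, n_sets = 2, 3

# Flat orders: every (split, set) pair is visited exactly once
split_major = [(i, j) for i in range(n_splits) for j in range(n_sets)]
set_major = [(i, j) for j in range(n_sets) for i in range(n_splits)]

# Grouped orders: the outer loop runs over splits (or sets), and the
# ranges inside each group are handled one after another
split_wise = [[(i, j) for j in range(n_sets)] for i in range(n_splits)]
set_wise = [[(i, j) for i in range(n_splits)] for j in range(n_sets)]

print(split_major)
print(set_major)
```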

The execution is done using execute() with execute_kwargs. Once all results have been obtained, attempts to merge them using merge_func with merge_kwargs (all templates in it will be substituted as well), which can also be a string or a tuple of strings resolved using resolve_merge_func(). If wrap_results is enabled, packs the results into a Pandas object. If apply_func returns something complex, the resulting Pandas object will be of object data type. If apply_func returns a tuple (detected by the first returned result), a Pandas object is built for each element of that tuple.

If merge_all is True, will merge all results in a flattened manner irrespective of the iteration mode. Otherwise, will merge by split/set.

If NoResult is returned, will skip the current iteration and remove it from the final index.

Usage

  • Get the return of each data range:
>>> from vectorbtpro import *

>>> data = vbt.YFData.pull(
...     "BTC-USD",
...     start="2020-01-01 UTC",
...     end="2021-01-01 UTC"
... )
>>> splitter = vbt.Splitter.from_n_rolling(data.wrapper.index, 5)

>>> def apply_func(data):
...     return data.close.iloc[-1] - data.close.iloc[0]

>>> splitter.apply(apply_func, vbt.Takeable(data))
split
0    -1636.467285
1     3706.568359
2     2944.720703
3     -118.113281
4    17098.916016
dtype: float64
  • The same but by indexing manually:
>>> def apply_func(range_, data):
...     data = data.iloc[range_]
...     return data.close.iloc[-1] - data.close.iloc[0]

>>> splitter.apply(apply_func, vbt.Rep("range_"), data)
split
0    -1636.467285
1     3706.568359
2     2944.720703
3     -118.113281
4    17098.916016
dtype: float64
  • Divide into two windows, each consisting of 50% train and 50% test, compute SMA for each range, and row-stack the outputs of each set upon merging:
>>> splitter = vbt.Splitter.from_n_rolling(data.wrapper.index, 2, split=0.5)

>>> def apply_func(data):
...     return data.run("SMA", 10).real

>>> splitter.apply(
...     apply_func,
...     vbt.Takeable(data),
...     merge_func="row_stack"
... ).unstack("set").vbt.drop_levels("split", axis=0).vbt.plot().show()


bounds property

Splitter.get_bounds() with default arguments.


bounds_arr property

Splitter.get_bounds_arr() with default arguments.


break_up_splits method

Splitter.break_up_splits(
    new_split,
    sort=False,
    template_context=None,
    wrapper_kwargs=None,
    init_kwargs=None,
    **split_range_kwargs
)

Split each split into multiple splits.

If there are multiple sets, make sure to merge them into one beforehand.

Arguments new_split and **split_range_kwargs are passed to Splitter.split_range().


column_stack class method

Splitter.column_stack(
    *objs,
    wrapper_kwargs=None,
    **kwargs
)

Stack multiple Splitter instances along columns.

Uses ArrayWrapper.column_stack() to stack the wrappers.


coverage property

Splitter.get_coverage() with default arguments.


duration property

Splitter.get_duration() with default arguments.


from_expanding class method

Splitter.from_expanding(
    index,
    min_length,
    offset,
    split=None,
    split_range_kwargs=None,
    range_bounds_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from an expanding range.

Argument min_length is the minimum length of the expanding range. Provide it as a float between 0 and 1 to make it relative to the length of the index. Argument offset is an offset after the right bound of the previous range from which the next range should start. It can also be a float relative to the index length. For other arguments, see Splitter.from_rolling().
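
A minimal sketch of the expanding scheme, assuming integer min_length and offset and assuming every window starts at the first element of the index. The function name expanding_bounds is hypothetical, and the library may treat edge cases (relative floats, a trailing partial window) differently.

```python
def expanding_bounds(total_len, min_length, offset):
    """List (start, stop) pairs of an expanding window; stop is exclusive."""
    if offset <= 0:
        raise ValueError("this sketch requires a positive offset")
    bounds = []
    stop = min_length
    while stop <= total_len:
        bounds.append((0, stop))
        stop += offset  # the next window extends past the previous right bound
    return bounds


print(expanding_bounds(35, 10, 10))
```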

Usage

  • Roll an expanding range with a length of 10 and an offset of 10, and split it into 3/4:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_expanding(
...     index,
...     10,
...     10,
...     split=3/4,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_grouper class method

Splitter.from_grouper(
    index,
    by,
    groupby_kwargs=None,
    grouper_kwargs=None,
    split=None,
    split_range_kwargs=None,
    template_context=None,
    split_labels=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a grouper.

See BaseIDXAccessor.get_grouper().

Uses Splitter.from_splits() to prepare the splits array and labels, and to build the instance.

Usage

  • Map each month into a range:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> def is_month_end(index, split):
...     last_range = split[-1]
...     return index[last_range][-1].is_month_end

>>> splitter = vbt.Splitter.from_grouper(
...     index,
...     "M",
...     split_check_template=vbt.RepFunc(is_month_end)
... )
>>> splitter.plot().show()


from_n_expanding class method

Splitter.from_n_expanding(
    index,
    n,
    min_length=None,
    split=None,
    split_range_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a number of expanding ranges.

Picks n evenly-spaced, expanding ranges. Argument min_length defines the minimum length for each range. For other arguments, see Splitter.from_rolling().
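
One way to picture "evenly-spaced, expanding" is to place the right bounds at equal steps between min_length and the end of the index. The sketch below does that with plain integers; n_expanding_bounds is a hypothetical name, and both the rounding rule and the handling of n == 1 are assumptions rather than documented behavior.

```python
def n_expanding_bounds(total_len, n, min_length):
    """Return n (start, stop) pairs that all start at 0 and grow evenly."""
    if n == 1:
        return [(0, total_len)]
    step = (total_len - min_length) / (n - 1)
    # Right bounds run from min_length up to total_len in equal steps
    return [(0, min_length + round(i * step)) for i in range(n)]


print(n_expanding_bounds(100, 4, 10))
```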

Usage

  • Roll 10 expanding ranges with a minimum length of 100, while reserving 50 elements for test:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_n_expanding(
...     index,
...     10,
...     min_length=100,
...     split=-50,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_n_random class method

Splitter.from_n_random(
    index,
    n,
    min_length,
    max_length=None,
    min_start=None,
    max_end=None,
    length_choice_func=None,
    start_choice_func=None,
    length_p_func=None,
    start_p_func=None,
    seed=None,
    split=None,
    split_range_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a number of random ranges.

Randomly picks the length of a range between min_length and max_length (inclusive) using length_choice_func, which receives an array of possible values and selects one. It defaults to numpy.random.Generator.choice. Optional function length_p_func takes the same arguments as length_choice_func and must return either None or probabilities.

Randomly picks the start position of a range starting at min_start and ending at max_end (exclusive) minus the chosen length using start_choice_func, which receives an array of possible values and selects one. It defaults to numpy.random.Generator.choice. Optional function start_p_func takes the same arguments as start_choice_func and must return either None or probabilities.

Note

Each function must take two arguments: the iteration index and the array with possible values.

For other arguments, see Splitter.from_rolling().
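
The two-step draw described above (first a length, then a start that keeps the range inside the index) can be sketched with the standard library. This swaps numpy.random.Generator.choice for Python's random module to stay dependency-free. The name n_random_bounds is hypothetical, min_start and max_end are fixed to the full index, and treating a missing max_length as equal to min_length is an assumption.

```python
import random


def n_random_bounds(total_len, n, min_length, max_length=None, seed=None):
    """Draw n random (start, stop) pairs that fit inside [0, total_len)."""
    rng = random.Random(seed)
    if max_length is None:
        max_length = min_length  # assumption made for this sketch only
    bounds = []
    for _ in range(n):
        length = rng.randint(min_length, max_length)  # both ends inclusive
        # Latest valid start is total_len - length, so the stop stays in bounds
        start = rng.randrange(0, total_len - length + 1)
        bounds.append((start, start + length))
    return bounds


bounds = n_random_bounds(366, 20, 40, 100, seed=42)
print(bounds[:3])
```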

Usage

  • Generate 20 random ranges with a length from [40, 100], and split each into 3/4:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_n_random(
...     index,
...     20,
...     min_length=40,
...     max_length=100,
...     split=3/4,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_n_rolling class method

Splitter.from_n_rolling(
    index,
    n,
    length=None,
    optimize_anchor_set=1,
    split=None,
    split_range_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a number of rolling ranges of the same length.

If length is None, splits the index evenly into n non-overlapping ranges using Splitter.from_rolling(). Otherwise, picks n evenly-spaced, potentially overlapping ranges of a fixed length. For other arguments, see Splitter.from_rolling().

If length is "optimize", searches for a length to cover the most of the index. Use optimize_anchor_set to provide the index of a set that should become non-overlapping.
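
For the fixed-length case, evenly spaced and potentially overlapping ranges can be pictured as equal steps between the first and the last valid start position. The sketch below is a hypothetical illustration (n_rolling_bounds is not a library function); it ignores the "optimize" mode, and the rounding rule is an assumption.

```python
def n_rolling_bounds(total_len, n, length):
    """Return n (start, stop) pairs of equal length, spread evenly over the index."""
    if n == 1:
        return [(0, length)]
    step = (total_len - length) / (n - 1)
    starts = [round(i * step) for i in range(n)]
    return [(start, start + length) for start in starts]


# Three windows of 50 elements over 100 elements overlap by 25 elements each
print(n_rolling_bounds(100, 3, 50))
```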

Usage

  • Roll 10 ranges with 100 elements each, and split each into 3/4:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_n_rolling(
...     index,
...     10,
...     length=100,
...     split=3/4,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_purged class method

Splitter.from_purged(
    index,
    purged_splitter,
    pred_times=None,
    eval_times=None,
    split_labels=None,
    set_labels=None,
    **kwargs
)

Create a Splitter instance from a purged splitter.

The splitter must be an instance of BasePurgedCV.

Uses Splitter.from_splits() to prepare the splits array and labels, and to build the instance.


from_purged_kfold class method

Splitter.from_purged_kfold(
    index,
    n_folds=10,
    n_test_folds=2,
    purge_td=0,
    embargo_td=0,
    pred_times=None,
    eval_times=None,
    **kwargs
)

Create a Splitter instance from PurgedKFoldCV.

Keyword arguments are passed to Splitter.from_purged().


from_purged_walkforward class method

Splitter.from_purged_walkforward(
    index,
    n_folds=10,
    n_test_folds=1,
    min_train_folds=2,
    max_train_folds=None,
    split_by_time=False,
    purge_td=0,
    pred_times=None,
    eval_times=None,
    **kwargs
)

Create a Splitter instance from PurgedWalkForwardCV.

Keyword arguments are passed to Splitter.from_purged().


from_ranges class method

Splitter.from_ranges(
    index,
    split=None,
    split_range_kwargs=None,
    template_context=None,
    **kwargs
)

Create a Splitter instance from ranges.

Uses get_index_ranges() to generate start and end indices. Passes only related keyword arguments found in kwargs.

Other keyword arguments will be passed to Splitter.from_splits(). For details on split and split_range_kwargs, see Splitter.from_rolling().

Usage

  • Translate each quarter into a range:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_ranges(index, every="QS")
>>> splitter.plot().show()

  • In addition to the above, reserve the last month for testing purposes:
>>> splitter = vbt.Splitter.from_ranges(
...     index,
...     every="QS",
...     split=(1.0, lambda index: index.month == index.month[-1]),
...     split_range_kwargs=dict(backwards=True)
... )
>>> splitter.plot().show()


from_rolling class method

Splitter.from_rolling(
    index,
    length,
    offset=0,
    offset_anchor='prev_end',
    offset_anchor_set=0,
    offset_space='prev',
    backwards=False,
    split=None,
    split_range_kwargs=None,
    range_bounds_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a rolling range of a fixed length.

Uses Splitter.from_splits() to prepare the splits array and labels, and to build the instance.

Args

index : index_like
Index.
length : int, float, or timedelta_like
See RelRange.length.
offset : int, float, or timedelta_like
See RelRange.offset.
offset_anchor : str
See RelRange.offset_anchor.
offset_anchor_set : int

Offset anchor set.

Selects the set from the previous range to be used as an offset anchor. If None, the whole previous split is considered as a single range. By default, it's the first set.

offset_space : str
See RelRange.offset_space.
backwards : bool or str

Whether to roll backwards.

If 'sorted', will roll backwards and sort the resulting splits by the start index.

split : any

Ranges to split the range into.

If None, will produce the entire range as a single range. Otherwise, will use Splitter.split_range() to split the range into multiple ranges.

split_range_kwargs : dict
Keyword arguments passed to Splitter.split_range().
range_bounds_kwargs : dict
Keyword arguments passed to Splitter.get_range_bounds().
template_context : dict
Context used to substitute templates in ranges.
freq : any

Index frequency in case it cannot be parsed from index.

If None, will be parsed using BaseIDXAccessor.get_freq().

**kwargs
Keyword arguments passed to the constructor of Splitter.
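
To make the interplay of length, offset, and the 'prev_end' anchor concrete, here is a minimal sketch with integer arguments only. The function name rolling_bounds is hypothetical; sets, offset_anchor_set, relative values, and backwards rolling are all left out, so it should be read as an approximation of the idea rather than the method's actual logic.

```python
def rolling_bounds(total_len, length, offset=0):
    """Fixed-length windows; each starts `offset` elements after the previous right bound."""
    if length <= 0 or length + offset <= 0:
        raise ValueError("window must have positive length and must move forward")
    bounds = []
    start = 0
    while start + length <= total_len:
        bounds.append((start, start + length))
        start = start + length + offset  # anchor 'prev_end' plus the offset
    return bounds


print(rolling_bounds(100, 30))       # back-to-back windows
print(rolling_bounds(50, 20, -10))   # a negative offset makes windows overlap
```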

Usage

  • Divide a range into a set of non-overlapping ranges:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> splitter = vbt.Splitter.from_rolling(index, 30)
>>> splitter.plot().show()

  • Divide a range into ranges, each split into 1/2:
>>> splitter = vbt.Splitter.from_rolling(
...     index,
...     60,
...     split=1/2,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()

  • Make the ranges above non-overlapping by using the right bound of the last set as an offset anchor:
>>> splitter = vbt.Splitter.from_rolling(
...     index,
...     60,
...     offset_anchor_set=-1,
...     split=1/2,
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_single class method

Splitter.from_single(
    index,
    split,
    split_range_kwargs=None,
    template_context=None,
    **kwargs
)

Create a Splitter instance from a single split.


from_sklearn class method

Splitter.from_sklearn(
    index,
    skl_splitter,
    groups=None,
    split_labels=None,
    set_labels=None,
    **kwargs
)

Create a Splitter instance from a scikit-learn splitter.

The splitter must be an instance of sklearn.model_selection.BaseCrossValidator.

Uses Splitter.from_splits() to prepare the splits array and labels, and to build the instance.


from_split_func class method

Splitter.from_split_func(
    index,
    split_func,
    split_args=None,
    split_kwargs=None,
    fix_ranges=True,
    split=None,
    split_range_kwargs=None,
    range_bounds_kwargs=None,
    template_context=None,
    freq=None,
    **kwargs
)

Create a Splitter instance from a custom split function.

In a while-loop, substitutes templates in split_args and split_kwargs and passes them to split_func, which should return either a split (see new_split in Splitter.split_range(), also supports a single range if it's not an iterable) or None to stop the while-loop. If fix_ranges is True, the returned split is then converted into a fixed split using Splitter.split_range() and the bounds of its sets are measured using Splitter.get_range_bounds().

Each template substitution has the following information:

  • split_idx: Current split index, starting at 0
  • splits: Nested list of splits appended up to this point
  • bounds: Nested list of bounds appended up to this point
  • prev_start: Left bound of the previous split
  • prev_end: Right bound of the previous split
  • Arguments and keyword arguments passed to Splitter.from_split_func()
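
The loop described above can be approximated in plain Python. In this sketch, the hypothetical driver run_split_func passes splits, bounds, and index directly instead of substituting templates, and it measures bounds by reading slice.start and slice.stop. The real method does more (template substitution, fixing relative ranges, building the instance).

```python
def run_split_func(split_func, index):
    """Call `split_func` until it returns None, collecting splits and bounds."""
    splits, bounds = [], []
    while True:
        new_split = split_func(splits, bounds, index)
        if new_split is None:
            break
        splits.append(new_split)
        # Measure each set as a (left, right) pair; the right bound is exclusive
        bounds.append([(s.start, s.stop) for s in new_split])
    return splits, bounds


def split_func(splits, bounds, index):
    """Roll 20 train + 10 test elements, anchored at the previous train end."""
    train_start = 0 if not splits else bounds[-1][0][1]
    new_split = (
        slice(train_start, train_start + 20),
        slice(train_start + 20, train_start + 30),
    )
    return None if new_split[-1].stop > len(index) else new_split


splits, bounds = run_split_func(split_func, range(100))
print(len(splits), splits[-1])
```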

Usage

  • Rolling window of 30 elements, 20 for train and 10 for test:
>>> from vectorbtpro import *

>>> index = pd.date_range("2020", "2021", freq="D")

>>> def split_func(splits, bounds, index):
...     if len(splits) == 0:
...         new_split = (slice(0, 20), slice(20, 30))
...     else:
...         # Previous split, first set, right bound
...         prev_end = bounds[-1][0][1]
...         new_split = (
...             slice(prev_end, prev_end + 20),
...             slice(prev_end + 20, prev_end + 30)
...         )
...     if new_split[-1].stop > len(index):
...         return None
...     return new_split

>>> splitter = vbt.Splitter.from_split_func(
...     index,
...     split_func,
...     split_args=(
...         vbt.Rep("splits"),
...         vbt.Rep("bounds"),
...         vbt.Rep("index"),
...     ),
...     set_labels=["train", "test"]
... )
>>> splitter.plot().show()


from_splits class method

Splitter.from_splits(
    index,
    splits,
    squeeze=False,
    fix_ranges=True,
    wrap_with_fixrange=False,
    split_range_kwargs=None,
    split_check_template=None,
    template_context=None,
    split_labels=None,
    set_labels=None,
    wrapper_kwargs=None,
    **kwargs
)

Create a Splitter instance from an iterable of splits.

Argument splits supports both absolute and relative ranges. To transform relative ranges into the absolute format, enable fix_ranges. Arguments split_range_kwargs are then passed to Splitter.split_range().
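
As an illustration of turning a relative split into absolute ranges, the sketch below cuts one range into two adjacent slices. The interpretation of a single number is inferred from the usage examples on this page (split=3/4 for a 75/25 cut, split=-50 to reserve the last 50 elements) and is an assumption; split_two_way is a hypothetical helper, floats are expected to lie within [-1, 1], and truncation is an assumed rounding rule.

```python
def split_two_way(start, stop, split):
    """Split [start, stop) into two adjacent slices."""
    total = stop - start
    # Relative floats scale with the range length; integers count elements
    size = int(abs(split) * total) if isinstance(split, float) else abs(split)
    if split < 0:
        cut = stop - size   # negative: measure the second set from the right
    else:
        cut = start + size  # positive: measure the first set from the left
    return slice(start, cut), slice(cut, stop)


print(split_two_way(0, 100, 3 / 4))
print(split_two_way(0, 100, -50))
```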

Enable wrap_with_fixrange to wrap any fixed range with FixRange. If the range is an array, it will be wrapped regardless of this argument to avoid building a 3d array.

Pass a template via split_check_template to discard splits that do not fulfill certain criteria. The current split will be available as split. Should return a boolean (False to discard).

Labels for splits and sets can be provided via split_labels and set_labels respectively. Both arguments can be provided as templates. The split array will be available as splits.


get_bounds method

Splitter.get_bounds(
    index_bounds=False,
    right_inclusive=False,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    index_combine_kwargs=None,
    **kwargs
)

Boolean Series/DataFrame where index are bounds and columns are splits stacked together.

Keyword arguments **kwargs are passed to Splitter.get_bounds_arr().


get_bounds_arr method

Splitter.get_bounds_arr(
    index_bounds=False,
    right_inclusive=False,
    split_group_by=None,
    set_group_by=None,
    template_context=None,
    **range_bounds_kwargs
)

Three-dimensional integer array with bounds.

First axis represents splits. Second axis represents sets. Third axis represents bounds.

Each range is selected using Splitter.select_range() and then measured using Splitter.get_range_bounds(). Keyword arguments **range_bounds_kwargs are passed to Splitter.get_range_bounds().


get_coverage method

Splitter.get_coverage(
    overlapping=False,
    normalize=True,
    split_group_by=None,
    set_group_by=None,
    **kwargs
)

Get the coverage of the entire mask.

If overlapping is True, returns the number of overlapping True values. If normalize is True, returns the number of True values relative to the length of the index. If overlapping and normalize are True, returns the number of overlapping True values relative to the total number of True values.
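
The combinations of overlapping and normalize can be reproduced on plain boolean lists. This is a sketch under the assumption that "overlapping" means positions covered by more than one range; the function name coverage and the list-of-lists mask layout are illustrative, and at least one True value is assumed.

```python
def coverage(masks, overlapping=False, normalize=True):
    """masks: list of equal-length boolean lists, one per range."""
    hits = [sum(column) for column in zip(*masks)]  # ranges covering each position
    covered = sum(1 for h in hits if h > 0)
    overlapped = sum(1 for h in hits if h > 1)
    if overlapping:
        return overlapped / covered if normalize else overlapped
    return covered / len(hits) if normalize else covered


a = [i < 6 for i in range(10)]        # covers positions 0..5
b = [4 <= i < 8 for i in range(10)]   # covers positions 4..7

print(coverage([a, b]))
print(coverage([a, b], overlapping=True))
```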

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().


get_duration method

Splitter.get_duration(
    **kwargs
)

Get duration.


get_iter_set_mask_arrs method

Splitter.get_iter_set_mask_arrs(
    split_group_by=None,
    set_group_by=None,
    template_context=None,
    **kwargs
)

Generator of two-dimensional boolean arrays, one per set.

First axis represents splits. Second axis represents index.

Keyword arguments **kwargs are passed to Splitter.get_range_mask().


get_iter_set_masks method

Splitter.get_iter_set_masks(
    split_group_by=None,
    set_group_by=None,
    **kwargs
)

Generator of boolean DataFrames, one per set.

Keyword arguments **kwargs are passed to Splitter.get_iter_set_mask_arrs().


get_iter_split_mask_arrs method

Splitter.get_iter_split_mask_arrs(
    split_group_by=None,
    set_group_by=None,
    template_context=None,
    **kwargs
)

Generator of two-dimensional boolean arrays, one per split.

First axis represents sets. Second axis represents index.

Keyword arguments **kwargs are passed to Splitter.get_range_mask().


get_iter_split_masks method

Splitter.get_iter_split_masks(
    split_group_by=None,
    set_group_by=None,
    **kwargs
)

Generator of boolean DataFrames, one per split.

Keyword arguments **kwargs are passed to Splitter.get_iter_split_mask_arrs().


get_mask method

Splitter.get_mask(
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    index_combine_kwargs=None,
    **kwargs
)

Boolean Series/DataFrame where index is Splitter.index and columns are splits stacked together.

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().

Warning

Boolean arrays for a big number of splits may take a considerable amount of memory.


get_mask_arr method

Splitter.get_mask_arr(
    split_group_by=None,
    set_group_by=None,
    template_context=None,
    **kwargs
)

Three-dimensional boolean array with splits.

First axis represents splits. Second axis represents sets. Third axis represents index.

Keyword arguments **kwargs are passed to Splitter.get_iter_split_mask_arrs().


get_n_sets method

Splitter.get_n_sets(
    set_group_by=None
)

Get number of sets while considering the grouper.


get_n_splits method

Splitter.get_n_splits(
    split_group_by=None
)

Get number of splits while considering the grouper.


get_obj_index class method

Splitter.get_obj_index(
    obj
)

Get index from an object.


get_overlap_matrix method

Splitter.get_overlap_matrix(
    by='split',
    normalize=True,
    split_group_by=None,
    set_group_by=None,
    jitted=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    index_combine_kwargs=None,
    **kwargs
)

Get the overlap between each pair of ranges.

The argument by can be one of 'split', 'set', and 'range'.

If normalize is True, returns the number of True values in each overlap relative to the total number of True values in both ranges.
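
A small sketch of a pairwise overlap matrix on boolean lists. It reads "the total number of True values in both ranges" as the size of their union, which is an assumption, and it expects every mask to contain at least one True value. The name overlap_matrix is hypothetical.

```python
def overlap_matrix(masks, normalize=True):
    """Return an n x n nested list with the overlap of every pair of masks."""
    n = len(masks)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            both = sum(x and y for x, y in zip(masks[i], masks[j]))
            either = sum(x or y for x, y in zip(masks[i], masks[j]))
            matrix[i][j] = both / either if normalize else both
    return matrix


a = [i < 6 for i in range(10)]        # covers positions 0..5
b = [4 <= i < 8 for i in range(10)]   # covers positions 4..7

print(overlap_matrix([a, b]))
```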

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().


get_range_bounds class method

Splitter.get_range_bounds(
    range_,
    index_bounds=False,
    right_inclusive=False,
    check_constant=True,
    template_context=None,
    index=None,
    freq=None
)

Get the left (inclusive) and right (exclusive) bound of a range.

Note

Even when mapped to the index, the right bound is always exclusive.


get_range_coverage method

Splitter.get_range_coverage(
    normalize=True,
    relative=False,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    index_combine_kwargs=None,
    **kwargs
)

Get the coverage of each range mask.

If normalize is True, returns the number of True values in each range relative to the length of the index. If normalize and relative are True, returns the number of True values in each range relative to the total number of True values in its split.

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().


get_range_mask class method

Splitter.get_range_mask(
    range_,
    template_context=None,
    index=None
)

Get the mask of a range.


get_ready_obj_range class method

Splitter.get_ready_obj_range(
    obj,
    range_,
    remap_to_obj=True,
    obj_index=None,
    obj_freq=None,
    template_context=None,
    jitted=None,
    silence_warnings=False,
    index=None,
    freq=None,
    return_obj_meta=False,
    **ready_range_kwargs
)

Get a range that is ready to be mapped into an array-like object.

If the object is Pandas-like and obj_index is None, searches for an index in the object using Splitter.get_obj_index(). Once found, uses Splitter.remap_range() to get the range that maps to the object index. Finally, uses Splitter.get_ready_range() to convert the range into the one that can be used directly in indexing.


get_ready_range class method

Splitter.get_ready_range(
    range_,
    allow_relative=False,
    allow_zero_len=False,
    range_format='slice_or_any',
    template_context=None,
    index=None,
    return_meta=False
)

Get a range that can be directly used in array indexing.

Such a range is either an integer or datetime-like slice (right bound is always exclusive!), a one-dimensional NumPy array with integer indices or datetime-like objects, or a one-dimensional NumPy mask of the same length as the index.

Argument range_format accepts the following options:

  • 'any': Return any format
  • 'indices': Return indices
  • 'mask': Return mask of the same length as index
  • 'slice': Return slice
  • 'slice_or_indices': If slice fails, return indices
  • 'slice_or_mask': If slice fails, return mask
  • 'slice_or_any': If slice fails, return any format
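
The relationship between the slice, indices, and mask formats can be shown with plain Python containers. The helpers below are hypothetical and only mimic the idea behind 'indices', 'mask', and 'slice_or_indices' for integer positions; datetime-like ranges and the library's validation are not covered.

```python
def to_indices(range_, n):
    """Return integer positions for a slice, mask, or index list over n elements."""
    if isinstance(range_, slice):
        return list(range(*range_.indices(n)))
    if len(range_) == n and all(isinstance(x, bool) for x in range_):
        return [i for i, flag in enumerate(range_) if flag]
    return list(range_)


def to_mask(range_, n):
    """Return a boolean list of length n."""
    chosen = set(to_indices(range_, n))
    return [i in chosen for i in range(n)]


def to_slice_or_indices(range_, n):
    """Prefer a slice; fall back to indices when positions are not contiguous."""
    idx = to_indices(range_, n)
    if idx and idx == list(range(idx[0], idx[-1] + 1)):
        return slice(idx[0], idx[-1] + 1)  # right bound stays exclusive
    return idx


print(to_slice_or_indices([2, 3, 4], 10))
print(to_slice_or_indices([2, 4], 10))
```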

get_set_coverage method

Splitter.get_set_coverage(
    overlapping=False,
    normalize=True,
    relative=False,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_set=True,
    **kwargs
)

Get the coverage of each set mask.

If overlapping is True, returns the number of overlapping True values between splits in each set. If normalize is True, returns the number of True values in each set relative to the length of the index. If normalize and relative are True, returns the number of True values in each set relative to the total number of True values across all sets.

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().


get_set_grouper method

Splitter.get_set_grouper(
    set_group_by=None
)

Get set grouper.


get_set_labels method

Splitter.get_set_labels(
    set_group_by=None
)

Get set labels while considering the grouper.


get_split_coverage method

Splitter.get_split_coverage(
    overlapping=False,
    normalize=True,
    relative=False,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    **kwargs
)

Get the coverage of each split mask.

If overlapping is True, returns the number of overlapping True values between sets in each split. If normalize is True, returns the number of True values in each split relative to the length of the index. If normalize and relative are True, returns the number of True values in each split relative to the total number of True values across all splits.

Keyword arguments **kwargs are passed to Splitter.get_mask_arr().


get_split_grouper method

Splitter.get_split_grouper(
    split_group_by=None
)

Get split grouper.


get_split_labels method

Splitter.get_split_labels(
    split_group_by=None
)

Get split labels while considering the grouper.


guess_method class method

Splitter.guess_method(
    **kwargs
)

Guess the factory method based on keyword arguments.

Returns None if the method cannot be guessed.


index property

Index.


index_bounds property

Splitter.get_bounds() with index_bounds=True.


index_duration property

Splitter.get_duration() with index_bounds=True.


indexing_func method

Splitter.indexing_func(
    *args,
    splitter_meta=None,
    **kwargs
)

Perform indexing on Splitter.


indexing_func_meta method

Splitter.indexing_func_meta(
    *args,
    wrapper_meta=None,
    **kwargs
)

Perform indexing on Splitter and return metadata.


is_range_relative class method

Splitter.is_range_relative(
    range_
)

Return whether a range is relative.


iter_set_mask_arrs property

Splitter.get_iter_set_mask_arrs() with default arguments.


iter_set_masks property

Splitter.get_iter_set_masks() with default arguments.


iter_split_mask_arrs property

Splitter.get_iter_split_mask_arrs() with default arguments.


iter_split_masks property

Splitter.get_iter_split_masks() with default arguments.


map_bounds_to_index class method

Splitter.map_bounds_to_index(
    start,
    stop,
    right_inclusive=False,
    index=None,
    freq=None
)

Map bounds to index.


mask property

Splitter.get_mask() with default arguments.


mask_arr property

Splitter.get_mask_arr() with default arguments.


merge_sets method

Splitter.merge_sets(
    columns=None,
    new_set_label=None,
    insert_at_last=False,
    wrapper_kwargs=None,
    init_kwargs=None,
    **merge_split_kwargs
)

Merge multiple sets (columns) into a set (column).

Arguments **merge_split_kwargs are passed to Splitter.merge_split().

If columns are not provided, merges all columns. If provided and insert_at_last is True, a new column is inserted at the position of the last column.

Use new_set_label to specify the label of the new set. To provide final labels, define columns in wrapper_kwargs.


merge_split class method

Splitter.merge_split(
    split,
    range_format=None,
    wrap_with_template=False,
    wrap_with_fixrange=False,
    wrap_with_hslice=False,
    template_context=None,
    index=None
)

Merge a split of multiple fixed ranges into a fixed range.

Creates one mask and sets True for each range. If all input ranges are masks, returns that mask. If all input ranges are slices, returns a slice if possible. Otherwise, returns integer indices.

For range_format, see Splitter.get_ready_range(). Enable wrap_with_template to wrap the resulting range with a template of the type Rep.
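
The merging rules above can be sketched in plain Python as follows. This is a simplified stand-in rather than the library's code: the function name `merge_ranges_sketch` is hypothetical, ranges are limited to slices, lists of integer positions, and lists of booleans, and templates and wrapping options are ignored.

```python
def merge_ranges_sketch(ranges, index_len):
    """Merge fixed ranges into one range (hypothetical helper)."""

    def is_mask(r):
        # A mask is a full-length sequence of booleans
        return (
            not isinstance(r, slice)
            and len(r) == index_len
            and all(isinstance(v, bool) for v in r)
        )

    # Create one mask and set True for each range
    mask = [False] * index_len
    for r in ranges:
        if isinstance(r, slice):
            positions = range(*r.indices(index_len))
        elif is_mask(r):
            positions = [i for i, flag in enumerate(r) if flag]
        else:
            positions = r
        for i in positions:
            mask[i] = True

    # All inputs are masks: return the merged mask
    if all(is_mask(r) for r in ranges):
        return mask

    covered = [i for i, flag in enumerate(mask) if flag]
    # All inputs are slices: return a slice if the result is contiguous
    if covered and all(isinstance(r, slice) for r in ranges):
        if covered[-1] - covered[0] + 1 == len(covered):
            return slice(covered[0], covered[-1] + 1)
    # Otherwise, return integer positions
    return covered


adjacent = merge_ranges_sketch([slice(0, 3), slice(3, 5)], 10)
separated = merge_ranges_sketch([slice(0, 2), slice(4, 6)], 10)
masks = merge_ranges_sketch([[True, False, False], [False, False, True]], 3)
```

Adjacent slices collapse into `slice(0, 5)`, slices with a gap fall back to the positions `[0, 1, 4, 5]`, and two masks yield the combined mask.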


metrics class variable

Metrics supported by Splitter.

HybridConfig(
    start=dict(
        title='Index Start',
        calc_func=<function Splitter.<lambda> at 0x162cf6020>,
        agg_func=None,
        tags=[
            'splitter',
            'index'
        ]
    ),
    end=dict(
        title='Index End',
        calc_func=<function Splitter.<lambda> at 0x162cf60c0>,
        agg_func=None,
        tags=[
            'splitter',
            'index'
        ]
    ),
    period=dict(
        title='Index Length',
        calc_func=<function Splitter.<lambda> at 0x162cf6160>,
        agg_func=None,
        tags=[
            'splitter',
            'index'
        ]
    ),
    split_count=dict(
        title='Splits',
        calc_func='n_splits',
        agg_func=None,
        tags=[
            'splitter',
            'splits'
        ]
    ),
    set_count=dict(
        title='Sets',
        calc_func='n_sets',
        agg_func=None,
        tags=[
            'splitter',
            'splits'
        ]
    ),
    coverage=dict(
        title=RepFunc(
            template=<function Splitter.<lambda> at 0x162cf6200>,
            context=None,
            strict=None,
            context_merge_kwargs=None,
            eval_id=None
        ),
        calc_func='coverage',
        overlapping=False,
        post_calc_func=<function Splitter.<lambda> at 0x162cf62a0>,
        agg_func=None,
        tags=[
            'splitter',
            'splits',
            'coverage'
        ]
    ),
    set_coverage=dict(
        title=RepFunc(
            template=<function Splitter.<lambda> at 0x162cf6340>,
            context=None,
            strict=None,
            context_merge_kwargs=None,
            eval_id=None
        ),
        check_has_multiple_sets=True,
        calc_func='set_coverage',
        overlapping=False,
        relative=False,
        post_calc_func=<function Splitter.<lambda> at 0x162cf63e0>,
        agg_func=None,
        tags=[
            'splitter',
            'splits',
            'coverage'
        ]
    ),
    set_mean_rel_coverage=dict(
        title='Mean Rel Coverage [%]',
        check_has_multiple_sets=True,
        check_normalize=True,
        calc_func='range_coverage',
        relative=True,
        post_calc_func=<function Splitter.<lambda> at 0x162cf6480>,
        agg_func=None,
        tags=[
            'splitter',
            'splits',
            'coverage'
        ]
    ),
    overlap_coverage=dict(
        title=RepFunc(
            template=<function Splitter.<lambda> at 0x162cf6520>,
            context=None,
            strict=None,
            context_merge_kwargs=None,
            eval_id=None
        ),
        calc_func='coverage',
        overlapping=True,
        post_calc_func=<function Splitter.<lambda> at 0x162cf65c0>,
        agg_func=None,
        tags=[
            'splitter',
            'splits',
            'coverage'
        ]
    ),
    set_overlap_coverage=dict(
        title=RepFunc(
            template=<function Splitter.<lambda> at 0x162cf6660>,
            context=None,
            strict=None,
            context_merge_kwargs=None,
            eval_id=None
        ),
        check_has_multiple_sets=True,
        calc_func='set_coverage',
        overlapping=True,
        post_calc_func=<function Splitter.<lambda> at 0x162cf6700>,
        agg_func=None,
        tags=[
            'splitter',
            'splits',
            'coverage'
        ]
    )
)

Returns Splitter._metrics, which gets (hybrid-) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change metrics, you can either change the config in-place, override this property, or overwrite the instance variable Splitter._metrics.
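
The copy-on-creation behavior can be pictured with a small stand-in class. This is only a sketch: `SplitterLike` is hypothetical, and `copy.deepcopy` is used in place of the library's hybrid copy, whose exact semantics are not reproduced here.

```python
import copy


class SplitterLike:
    """Stand-in class for illustration, not the library's Splitter."""

    # Class-level config shared as a template
    _metrics = {"split_count": {"title": "Splits", "calc_func": "n_splits"}}

    def __init__(self):
        # Each instance receives its own copy of the class-level config
        self._metrics = copy.deepcopy(type(self)._metrics)

    @property
    def metrics(self):
        return self._metrics


a = SplitterLike()
b = SplitterLike()

# Changing the config of one instance in-place...
a.metrics["split_count"]["title"] = "Number of Splits"
# ...leaves the class and other instances untouched
class_title = SplitterLike._metrics["split_count"]["title"]
other_title = b.metrics["split_count"]["title"]
```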


n_sets property

Number of sets.


n_splits property

Number of splits.


parse_and_inject_takeables class method

Splitter.parse_and_inject_takeables(
    flat_ann_args,
    eval_id=None
)

Parse Takeable instances from function annotations and inject them into flattened annotated arguments.


plot method

Splitter.plot(
    split_group_by=None,
    set_group_by=None,
    mask_kwargs=None,
    trace_kwargs=None,
    add_trace_kwargs=None,
    fig=None,
    **layout_kwargs
)

Plot splits as rows and sets as colors.

Args

split_group_by : any
Split groups. See BaseIDXAccessor.get_grouper().
set_group_by : any
Set groups. See BaseIDXAccessor.get_grouper().
mask_kwargs : dict
Keyword arguments passed to Splitter.get_iter_set_masks().
trace_kwargs : dict

Keyword arguments passed to plotly.graph_objects.Heatmap.

Can be a sequence, one per set.

add_trace_kwargs : dict
Keyword arguments passed to add_trace.
fig : Figure or FigureWidget
Figure to add traces to.
**layout_kwargs
Keyword arguments for layout.

Usage

  • Plot a scikit-learn splitter:
>>> from vectorbtpro import *
>>> from sklearn.model_selection import TimeSeriesSplit

>>> index = pd.date_range("2020", "2021", freq="D")
>>> splitter = vbt.Splitter.from_sklearn(index, TimeSeriesSplit())
>>> splitter.plot().show()


plot_coverage method

Splitter.plot_coverage(
    stacked=True,
    split_group_by=None,
    set_group_by=None,
    mask_kwargs=None,
    trace_kwargs=None,
    add_trace_kwargs=None,
    fig=None,
    **layout_kwargs
)

Plot index as rows and sets as lines.

Args

stacked : bool
Whether to plot as an area plot.
split_group_by : any
Split groups. See BaseIDXAccessor.get_grouper().
set_group_by : any
Set groups. See BaseIDXAccessor.get_grouper().
mask_kwargs : dict
Keyword arguments passed to Splitter.get_iter_set_masks().
trace_kwargs : dict

Keyword arguments passed to plotly.graph_objects.Scatter.

Can be a sequence, one per set.

add_trace_kwargs : dict
Keyword arguments passed to add_trace.
fig : Figure or FigureWidget
Figure to add traces to.
**layout_kwargs
Keyword arguments for layout.

Usage

  • Area plot:
>>> from vectorbtpro import *
>>> from sklearn.model_selection import TimeSeriesSplit

>>> index = pd.date_range("2020", "2021", freq="D")
>>> splitter = vbt.Splitter.from_sklearn(index, TimeSeriesSplit())
>>> splitter.plot_coverage().show()

  • Line plot:
>>> splitter.plot_coverage(stacked=False).show()


plots_defaults property

Defaults for PlotsBuilderMixin.plots().

Merges PlotsBuilderMixin.plots_defaults and plots from splitter.


range_coverage property

Splitter.get_range_coverage() with default arguments.


range_overlap_matrix property

Splitter.get_overlap_matrix() with by="range".


remap_range class method

Splitter.remap_range(
    range_,
    target_index,
    target_freq=None,
    template_context=None,
    jitted=None,
    silence_warnings=False,
    index=None,
    freq=None
)

Remap a range to a target index.

If index and target_index are the same, returns the range. Otherwise, uses Resampler.resample_source_mask() to resample the range into the target index. In such a case, freq and target_freq must be provided.


resolve_column_stack_kwargs class method

Splitter.resolve_column_stack_kwargs(
    *objs,
    reindex_kwargs=None,
    **kwargs
)

Resolve keyword arguments for initializing Splitter after stacking along columns.


resolve_row_stack_kwargs class method

Splitter.resolve_row_stack_kwargs(
    *objs,
    **kwargs
)

Resolve keyword arguments for initializing Splitter after stacking along rows.


row_stack class method

Splitter.row_stack(
    *objs,
    wrapper_kwargs=None,
    **kwargs
)

Stack multiple Splitter instances along rows.

Uses ArrayWrapper.row_stack() to stack the wrappers.


select_indices method

Splitter.select_indices(
    split=None,
    set_=None,
    split_group_by=None,
    set_group_by=None
)

Get indices corresponding to selected splits and sets.

Arguments split and set_ can be either integers or labels. Multiple values are also accepted; in such a case, the corresponding ranges are merged. If split/set labels are of an integer data type, treats the provided values as labels rather than indices, unless the split/set index is not of an integer data type or the values are wrapped with PosSel.

If split_group_by and/or set_group_by are provided, their groupers get created using BaseIDXAccessor.get_grouper() and arguments split and set_ become relative to the groups.

If split/set_ is not provided, selects all indices.

Returns four arrays: split group indices, set group indices, split indices, and set indices.


select_range method

Splitter.select_range(
    merge_split_kwargs=None,
    **select_indices_kwargs
)

Select a range.

Passes **select_indices_kwargs to Splitter.select_indices() to get the indices for selected splits and sets. If multiple ranges correspond to those indices, merges them using Splitter.merge_split().


set_coverage property

Splitter.get_set_coverage() with default arguments.


set_labels property

Set labels.


set_overlap_matrix property

Splitter.get_overlap_matrix() with by="set".


shuffle_splits method

Splitter.shuffle_splits(
    size=None,
    replace=False,
    p=None,
    seed=None,
    wrapper_kwargs=None,
    **init_kwargs
)

Shuffle splits.


split_and_apply class method

Splitter.split_and_apply(
    index,
    apply_func,
    *apply_args,
    splitter=None,
    splitter_kwargs=None,
    apply_kwargs=None,
    template_context=None,
    **var_kwargs
)

Split an index and apply a function.

Argument splitter can be an actual Splitter instance, the name of a factory method (such as "from_n_rolling"), or the factory method itself. If splitter is None, the right method will be guessed based on the supplied arguments using Splitter.guess_method().

Keyword arguments splitter_kwargs are passed to the factory method. Keyword arguments apply_kwargs are passed to Splitter.apply(). If variable keyword arguments are provided, they will be used as apply_kwargs if a splitter instance has been built; otherwise, they will be distributed based on the signatures of the factory method and Splitter.apply().
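
Distributing variable keyword arguments by signature can be sketched with the standard `inspect` module. The helper `distribute_kwargs` and the two stub functions are hypothetical and only mimic the idea. How the library treats a name accepted by both functions is not covered here; this sketch gives priority to the first function.

```python
import inspect


def distribute_kwargs(first_func, second_func, **var_kwargs):
    """Assign each keyword argument to the function that accepts it."""
    first_params = inspect.signature(first_func).parameters
    second_params = inspect.signature(second_func).parameters
    first_kwargs, second_kwargs = {}, {}
    for name, value in var_kwargs.items():
        if name in first_params:
            # Assumption: the first function wins if both accept the name
            first_kwargs[name] = value
        elif name in second_params:
            second_kwargs[name] = value
        else:
            raise TypeError(f"Unexpected keyword argument: {name!r}")
    return first_kwargs, second_kwargs


def factory_stub(index, n, length=None):
    """Hypothetical stand-in for a factory method."""


def apply_stub(apply_func, set_=None):
    """Hypothetical stand-in for an apply method."""


factory_kwargs, apply_kwargs = distribute_kwargs(
    factory_stub, apply_stub, n=3, length=5, set_="train"
)
```

With these stubs, `n` and `length` end up in the factory arguments and `set_` in the apply arguments.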


split_and_take class method

Splitter.split_and_take(
    index,
    obj,
    splitter=None,
    splitter_kwargs=None,
    take_kwargs=None,
    template_context=None,
    **var_kwargs
)

Split an index and take from an object.

Argument splitter can be an actual Splitter instance, the name of a factory method (such as "from_n_rolling"), or the factory method itself. If splitter is None, the right method will be guessed based on the supplied arguments using Splitter.guess_method().

Keyword arguments splitter_kwargs are passed to the factory method. Keyword arguments take_kwargs are passed to Splitter.take(). If variable keyword arguments are provided, they will be used as take_kwargs if a splitter instance has been built; otherwise, they will be distributed based on the signatures of the factory method and Splitter.take().


split_coverage property

Splitter.get_split_coverage() with default arguments.


split_labels property

Split labels.


split_overlap_matrix property

Splitter.get_overlap_matrix() with by="split".


split_range class method

Splitter.split_range(
    range_,
    new_split,
    backwards=False,
    allow_zero_len=False,
    range_format=None,
    wrap_with_template=False,
    wrap_with_fixrange=False,
    template_context=None,
    index=None,
    freq=None
)

Split a fixed range into a split of multiple fixed ranges.

Range must be either a template, a callable, a tuple (start and stop), a slice, a sequence of indices, or a mask. This range will then be re-mapped into the index.

Each sub-range in new_split can be either a fixed or relative range, that is, an instance of RelRange or a number that will be used as a length to create a RelRange. Each sub-range will then be re-mapped into the main range. Argument new_split can also be provided as an integer or a float indicating the length; in such a case the second part (or the first one depending on backwards) will stretch. If new_split is a string, the following options are supported:

New ranges are returned relative to the index and in the same order as passed.

For range_format, see Splitter.get_ready_range(). Enable wrap_with_template to wrap the resulting ranges with a template of the type Rep.
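
The single-number case of new_split can be illustrated with a minimal sketch that works on integer positions only. The function `split_in_two` is hypothetical, and its rounding of relative lengths (truncation via `int`) is an assumption that may differ from the library.

```python
def split_in_two(start, stop, length, backwards=False):
    """Split [start, stop) into a fixed-length part and a stretching part."""
    total = stop - start
    # Floats between 0 and 1 are treated as a fraction of the range
    if isinstance(length, float) and 0 < length < 1:
        fixed_len = int(length * total)
    else:
        fixed_len = int(length)
    if not 0 < fixed_len < total:
        raise ValueError("Both parts must have a positive length")
    if backwards:
        # The length applies to the last part; the first part stretches
        cut = stop - fixed_len
    else:
        # The length applies to the first part; the second part stretches
        cut = start + fixed_len
    # Ranges are returned in index order
    return slice(start, cut), slice(cut, stop)


forward = split_in_two(0, 100, 0.75)
backward = split_in_two(0, 100, 0.75, backwards=True)
absolute = split_in_two(10, 30, 5)
```

With a range of 100 points, a length of 0.75 gives parts of 75 and 25 points, and backwards=True gives 25 and 75.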


split_set method

Splitter.split_set(
    new_split,
    column=None,
    new_set_labels=None,
    wrapper_kwargs=None,
    init_kwargs=None,
    **split_range_kwargs
)

Split a set (column) into multiple sets (columns).

Arguments new_split and **split_range_kwargs are passed to Splitter.split_range().

Column must be provided if there are two or more sets.

Use new_set_labels to specify the labels of the new sets; it must contain as many labels as there are new ranges in the new split. To provide final labels, define columns in wrapper_kwargs.


splits property

Splitter.splits_arr as a DataFrame.


splits_arr property

Two-dimensional, object-dtype array with splits.

First axis represents splits. Second axis represents sets. Elements represent ranges. Range must be either a slice, a sequence of indices, a mask, or a callable that returns such.


stats_defaults property

Defaults for StatsBuilderMixin.stats().

Merges StatsBuilderMixin.stats_defaults and stats from splitter.


subplots class variable

Subplots supported by Splitter.

HybridConfig(
    plot=dict(
        title='Splits',
        yaxis_kwargs=dict(
            title='Split'
        ),
        plot_func='plot',
        tags='splitter'
    ),
    plot_coverage=dict(
        title='Coverage',
        yaxis_kwargs=dict(
            title='Count'
        ),
        plot_func='plot_coverage',
        tags='splitter'
    )
)

Returns Splitter._subplots, which gets (hybrid-) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change subplots, you can either change the config in-place, override this property, or overwrite the instance variable Splitter._subplots.


take method

Splitter.take(
    obj,
    split=None,
    set_=None,
    split_group_by=None,
    set_group_by=None,
    squeeze_one_split=True,
    squeeze_one_set=True,
    into=None,
    remap_to_obj=True,
    obj_index=None,
    obj_freq=None,
    range_format='slice_or_any',
    point_wise=False,
    attach_bounds=False,
    right_inclusive=False,
    template_context=None,
    silence_warnings=False,
    index_combine_kwargs=None,
    stack_axis=1,
    stack_kwargs=None,
    freq=None
)

Take all ranges from an array-like object and optionally column-stack them.

Uses Splitter.select_indices() to get the indices for selected splits and sets. Arguments split_group_by and set_group_by can be used to group splits and sets respectively. Ranges belonging to the same split and set group will be merged.

For each index pair, resolves the source range using Splitter.select_range() and Splitter.get_ready_range(). Then, remaps this range into the object index using Splitter.get_ready_obj_range() and takes the slice from the object using Splitter.take_range(). If the object is a custom template, substitutes it instead of calling Splitter.take_range(). Finally, uses column_stack_merge() (stack_axis=1) or row_stack_merge() (stack_axis=0) with stack_kwargs to merge the taken slices.

If attach_bounds is enabled, measures the bounds of each range and makes it an additional level in the final index hierarchy. The argument supports the following options:

  • True, 'index', 'source', or 'source_index': Attach source (index) bounds
  • 'target' or 'target_index': Attach target (index) bounds
  • False: Do not attach

Argument into supports the following options:

  • None: Series of range slices
  • 'stacked': Stack all slices into a single object
  • 'stacked_by_split': Stack set slices in each split and return a Series of objects
  • 'stacked_by_set': Stack split slices in each set and return a Series of objects
  • 'split_major_meta': Generator with ranges processed lazily in split-major order. Returns meta with indices and labels, and the generator.
  • 'set_major_meta': Generator with ranges processed lazily in set-major order. Returns meta with indices and labels, and the generator.

Prepend any stacked option with "from_start_" (also "reset_") or "from_end_" to reset the index from start and from end respectively.

Usage

  • Roll a window and stack it along columns by keeping the index:
>>> from vectorbtpro import *

>>> data = vbt.YFData.pull(
...     "BTC-USD",
...     start="2020-01-01 UTC",
...     end="2021-01-01 UTC"
... )
>>> splitter = vbt.Splitter.from_n_rolling(
...     data.wrapper.index,
...     3,
...     length=5
... )
>>> splitter.take(data.close, into="stacked")
split                                0            1             2
Date
2020-01-01 00:00:00+00:00  7200.174316          NaN           NaN
2020-01-02 00:00:00+00:00  6985.470215          NaN           NaN
2020-01-03 00:00:00+00:00  7344.884277          NaN           NaN
2020-01-04 00:00:00+00:00  7410.656738          NaN           NaN
2020-01-05 00:00:00+00:00  7411.317383          NaN           NaN
2020-06-29 00:00:00+00:00          NaN  9190.854492           NaN
2020-06-30 00:00:00+00:00          NaN  9137.993164           NaN
2020-07-01 00:00:00+00:00          NaN  9228.325195           NaN
2020-07-02 00:00:00+00:00          NaN  9123.410156           NaN
2020-07-03 00:00:00+00:00          NaN  9087.303711           NaN
2020-12-27 00:00:00+00:00          NaN          NaN  26272.294922
2020-12-28 00:00:00+00:00          NaN          NaN  27084.808594
2020-12-29 00:00:00+00:00          NaN          NaN  27362.437500
2020-12-30 00:00:00+00:00          NaN          NaN  28840.953125
2020-12-31 00:00:00+00:00          NaN          NaN  29001.720703
  • Discard the index and attach index bounds to the column hierarchy:
>>> splitter.take(
...     data.close,
...     into="reset_stacked",
...     attach_bounds="index"
... )
split                         0                         1  \
start 2020-01-01 00:00:00+00:00 2020-06-29 00:00:00+00:00
end   2020-01-06 00:00:00+00:00 2020-07-04 00:00:00+00:00
0                   7200.174316               9190.854492
1                   6985.470215               9137.993164
2                   7344.884277               9228.325195
3                   7410.656738               9123.410156
4                   7411.317383               9087.303711

split                         2
start 2020-12-27 00:00:00+00:00
end   2021-01-01 00:00:00+00:00
0                  26272.294922
1                  27084.808594
2                  27362.437500
3                  28840.953125
4                  29001.720703

take_range class method

Splitter.take_range(
    obj,
    ready_range,
    point_wise=False
)

Take a ready range from an array-like object.

Set point_wise to True to select one range point at a time and return a tuple.
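
The difference between the two modes can be sketched on a plain list. The function `take_range_sketch` is a hypothetical stand-in and assumes that the ready range is a slice; the library also accepts other range formats and array-like objects.

```python
def take_range_sketch(obj, ready_range, point_wise=False):
    """Take a slice from a sequence, optionally one point at a time."""
    if point_wise:
        # Select each position separately and collect the points in a tuple
        positions = range(*ready_range.indices(len(obj)))
        return tuple(obj[i] for i in positions)
    # Select the whole range at once
    return obj[ready_range]


prices = [10.0, 11.0, 12.0, 13.0, 14.0]
whole = take_range_sketch(prices, slice(1, 4))
points = take_range_sketch(prices, slice(1, 4), point_wise=True)
```

Here `whole` is the list `[11.0, 12.0, 13.0]`, while `points` is the tuple `(11.0, 12.0, 13.0)`.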


take_range_from_takeable class method

Splitter.take_range_from_takeable(
    takeable,
    range_,
    remap_to_obj=True,
    obj_index=None,
    obj_freq=None,
    point_wise=False,
    template_context=None,
    return_obj_meta=False,
    return_meta=False,
    **ready_obj_range_kwargs
)

Take a range from a takeable object.


to_fixed method

Splitter.to_fixed(
    split_range_kwargs=None,
    **kwargs
)

Convert relative ranges into fixed ones and return a new Splitter instance.

Keyword arguments split_range_kwargs are passed to Splitter.split_range().


to_grouped method

Splitter.to_grouped(
    split=None,
    set_=None,
    split_group_by=None,
    set_group_by=None,
    merge_split_kwargs=None,
    **kwargs
)

Merge all ranges within the same group and return a new Splitter instance.


Takeable class

Takeable(
    *args,
    **kwargs
)

Class that represents an object from which a range can be taken.

Superclasses

Inherited members


eval_id class variable

One or more identifiers at which to evaluate this instance.


freq class variable

Frequency of Takeable.index.


index class variable

Index of the object.

If not present, will be accessed using Splitter.get_obj_index().


obj class variable

Takeable object.


point_wise class variable

Whether to select one range point at a time and return a tuple.


remap_to_obj class variable

Whether to remap Splitter.index to the index of Takeable.obj.

Otherwise, will assume that the object has the same index.


ZeroLengthError class

ZeroLengthError(
    *args,
    **kwargs
)

Raised whenever a range has a length of zero.

Superclasses

  • builtins.BaseException
  • builtins.Exception
  • builtins.ValueError