decorators module
Decorators for splitting.
cv_split function
cv_split(
*args,
parameterized_kwargs=None,
selection='max',
return_grid=False,
skip_errored=False,
raise_no_results=True,
template_context=None,
**split_kwargs
)
Decorator that combines split() and parameterized() for cross-validation.
Creates a new apply function that is decorated with split() and therefore applied to each range using Splitter.apply(). Inside this apply function, a check determines whether the current range belongs to the first (training) set. If it does, the underlying function is parameterized and run on the entire grid of parameters, and the returned results are stored in a global list. The other (testing) sets in the same split then read these results. If selection is a template, it can evaluate the grid results (available as grid_results) and return the best parameter combination. This parameter combination is then executed by each set (including training).
Argument selection also accepts "min" for np.argmin and "max" for np.argmax.
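The flow described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: `cv_select`, `pick_kth`, and the toy `splits` data are hypothetical names, each split is assumed to be a tuple whose first element is the training set, and `selection` is limited to "max" and "min".

```python
def cv_select(splits, param_grid, func, selection="max"):
    """Hypothetical helper: choose a parameter on the training set, then reuse it on every set."""
    pick = max if selection == "max" else min
    results = {}
    for split_idx, sets in enumerate(splits):
        train = sets[0]  # assumption: the first set of each split is the training set
        grid_results = [func(train, param) for param in param_grid]
        # Position of the best grid result; ties resolve to the first one,
        # which mirrors the behavior of np.argmax / np.argmin
        best_pos = pick(range(len(grid_results)), key=grid_results.__getitem__)
        best_param = param_grid[best_pos]
        # The selected parameter is executed on every set, training included
        for set_idx, data in enumerate(sets):
            results[(split_idx, set_idx, best_param)] = func(data, best_param)
    return results


def pick_kth(data, k):
    """Toy objective: return the k-th element of the set."""
    return data[k]


# Two splits, each with a training set and a testing set (toy data)
splits = [([1, 5, 2], [4, 4, 1]), ([3, 0, 6], [2, 2, 2])]
selected_max = cv_select(splits, [0, 1, 2], pick_kth, selection="max")
selected_min = cv_select(splits, [0, 1, 2], pick_kth, selection="min")
```

The keys of the returned dictionary play the role of the (split, set, parameter) index levels seen in the usage output: within a split, every set carries the parameter chosen on the training set.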
Keyword arguments parameterized_kwargs will be passed to parameterized() and will have their templates substituted with a context that will also include the split-related context (including split_idx, set_idx, etc., see Splitter.apply()).
If return_grid is True or 'first', returns both the grid and the selection. If return_grid is 'all', executes the grid on each set and returns it along with the selection. Otherwise, returns only the selection.
If NoResultsException is raised, or if skip_errored is True and any exception is raised, skips the current iteration and removes it from the final index.
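One reading of this skipping rule can be sketched as follows. The `NoResultsException` class below is a local stand-in defined only for the example, and `run_iterations` and `ten_over` are hypothetical names; the sketch only illustrates which iterations end up in the final result under that reading.

```python
class NoResultsException(Exception):
    """Local stand-in for the library's exception of the same name."""


def run_iterations(func, inputs, skip_errored=False):
    """Hypothetical runner: apply func to each input, dropping skipped iterations."""
    results = {}
    for key, value in inputs.items():
        try:
            results[key] = func(value)
        except NoResultsException:
            continue  # skipped regardless of skip_errored
        except Exception:
            if skip_errored:
                continue  # any other error is skipped only on request
            raise
    return results


def ten_over(x):
    """Toy function: no results for negative inputs, ZeroDivisionError for zero."""
    if x < 0:
        raise NoResultsException
    return 10 // x


kept = run_iterations(ten_over, {0: 5, 1: 0, 2: 2}, skip_errored=True)
no_results_dropped = run_iterations(ten_over, {0: 5, 1: -1})
```

In both calls the key of the failed iteration is absent from the result, which corresponds to the iteration being removed from the final index. Calling `run_iterations(ten_over, {0: 0})` without `skip_errored=True` lets the ZeroDivisionError propagate.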
Usage
- Permute a series and pick the first value. Make the seed parameterizable. Cross-validate based on the highest picked value:
>>> from vectorbtpro import *
>>> @vbt.cv_split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=3, split=0.5),
...     takeable_args=["sr"],
...     merge_func="concat",
... )
... def f(sr, seed):
...     np.random.seed(seed)
...     return np.random.permutation(sr)[0]
>>> index = pd.date_range("2020-01-01", "2020-02-01")
>>> np.random.seed(0)
>>> sr = pd.Series(np.random.permutation(np.arange(len(index))), index=index)
>>> f(sr, vbt.Param([41, 42, 43]))
split  set    seed
0      set_0  41      22
       set_1  41      28
1      set_0  43       8
       set_1  43      31
2      set_0  43      19
       set_1  43       0
dtype: int64
- Extend the example above to also return the grid results of each set:
>>> f(sr, vbt.Param([41, 42, 43]), _return_grid="all")
(split  set    seed
 0      set_0  41      22
               42      22
               43       2
        set_1  41      28
               42      28
               43      20
 1      set_0  41       5
               42       5
               43       8
        set_1  41      23
               42      23
               43      31
 2      set_0  41      18
               42      18
               43      19
        set_1  41      27
               42      27
               43       0
 dtype: int64,
 split  set    seed
 0      set_0  41      22
        set_1  41      28
 1      set_0  43       8
        set_1  43      31
 2      set_0  43      19
        set_1  43       0
 dtype: int64)
split function
split(
*args,
splitter=None,
splitter_cls=None,
splitter_kwargs=None,
index=None,
index_from=None,
takeable_args=None,
template_context=None,
forward_kwargs_as=None,
return_splitter=False,
apply_kwargs=None,
**var_kwargs
)
Decorator that splits the inputs of a function.
Does the following:
- Resolves the splitter of the type Splitter using the argument splitter. It can be either an already provided splitter instance, the name of a factory method (such as "from_n_rolling"), or the factory method itself. If splitter is None, the right method will be guessed based on the supplied arguments using Splitter.guess_method(). To construct a splitter, it will pass index and **splitter_kwargs. The index is resolved either using an already provided index, by parsing the argument under a name/position provided in index_from, or by parsing the first argument from takeable_args (in this order).
- Wraps arguments in takeable_args with Takeable
- Runs Splitter.apply() with arguments passed to the function as args and kwargs, but also apply_kwargs (the ones passed to the decorator)
Keyword arguments splitter_kwargs are passed to the factory method. Keyword arguments apply_kwargs are passed to Splitter.apply(). If variable keyword arguments (**var_kwargs) are provided, they are used as splitter_kwargs when apply_kwargs is already set, and as apply_kwargs when splitter_kwargs is already set. If neither splitter_kwargs nor apply_kwargs is set, they are used as splitter_kwargs if a splitter instance hasn't been built yet, otherwise as apply_kwargs. If both arguments are set, an error is raised.
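The routing rule for variable keyword arguments can be written out as a small decision function. This is a sketch of the rule as stated in the paragraph above, not the library's code: `resolve_var_kwargs` and its `splitter_built` flag are hypothetical, and the error type is an assumption.

```python
def resolve_var_kwargs(var_kwargs, splitter_kwargs=None, apply_kwargs=None, splitter_built=False):
    """Hypothetical helper: decide where variable keyword arguments end up.

    Returns a (splitter_kwargs, apply_kwargs) tuple.
    """
    if not var_kwargs:
        return splitter_kwargs, apply_kwargs
    if splitter_kwargs is not None and apply_kwargs is not None:
        # Both slots are taken, so there is nowhere unambiguous to put var_kwargs
        raise ValueError("Cannot place variable keyword arguments: both slots are set")
    if apply_kwargs is not None:
        return dict(var_kwargs), apply_kwargs
    if splitter_kwargs is not None:
        return splitter_kwargs, dict(var_kwargs)
    # Neither is set: the choice depends on whether a splitter instance already exists
    if splitter_built:
        return None, dict(var_kwargs)
    return dict(var_kwargs), None


to_factory = resolve_var_kwargs({"n": 2})
to_apply = resolve_var_kwargs({"n": 2}, splitter_built=True)
```

Here `to_factory` places `n=2` in the first slot (the splitter has not been built, so the arguments go to the factory method), while `to_apply` places it in the second slot because a splitter instance is assumed to exist already.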
Usage
- Split a Series and return its sum:
>>> from vectorbtpro import *
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     takeable_args=["sr"]
... )
... def f(sr):
...     return sr.sum()
>>> index = pd.date_range("2020-01-01", "2020-01-06")
>>> sr = pd.Series(np.arange(len(index)), index=index)
>>> f(sr)
split
0     3
1    12
dtype: int64
- Perform a split manually:
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     takeable_args=["index"]
... )
... def f(index, sr):
...     return sr[index].sum()
>>> f(index, sr)
split
0     3
1    12
dtype: int64
- Construct splitter and mark arguments as "takeable" manually:
>>> splitter = vbt.Splitter.from_n_rolling(index, n=2)
>>> @vbt.split(splitter=splitter)
... def f(sr):
...     return sr.sum()
>>> f(vbt.Takeable(sr))
split
0     3
1    12
dtype: int64
- Split multiple timeframes using a custom index:
>>> @vbt.split(
...     splitter="from_n_rolling",
...     splitter_kwargs=dict(n=2),
...     index=index,
...     takeable_args=["h12_sr", "d2_sr"]
... )
... def f(h12_sr, d2_sr):
...     return h12_sr.sum() + d2_sr.sum()
>>> h12_index = pd.date_range("2020-01-01", "2020-01-06", freq="12H")
>>> d2_index = pd.date_range("2020-01-01", "2020-01-06", freq="2D")
>>> h12_sr = pd.Series(np.arange(len(h12_index)), index=h12_index)
>>> d2_sr = pd.Series(np.arange(len(d2_index)), index=d2_index)
>>> f(h12_sr, d2_sr)
split
0    15
1    42
dtype: int64
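To make the numbers in the first usage example of split() (the per-split sum of a six-row series) easier to verify by hand, the sketch below reproduces them without the library. It assumes the rows are cut into n equal, contiguous windows; that assumption matches the printed sums of 3 and 12 but is not a description of how from_n_rolling works in general. `equal_windows` is a hypothetical helper.

```python
def equal_windows(values, n):
    """Hypothetical helper: cut a sequence into n equal, contiguous windows."""
    length = len(values) // n
    return [values[i * length:(i + 1) * length] for i in range(n)]


values = list(range(6))  # same values as np.arange(6) in the first example
windows = equal_windows(values, 2)
sums = [sum(window) for window in windows]
```

With these inputs the windows are `[0, 1, 2]` and `[3, 4, 5]`, so the per-window sums agree with the Series returned by `f(sr)` in that example.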