jit_registry module¶
Global registry for jittables.
Jitting is the process of just-in-time compiling functions to make them run faster. A jitter is a decorator that wraps a regular Python function and returns the decorated function. Depending on the jitter, the decorated function has the same or at least a similar signature to the original function. Jitters take various jitter-specific options that change the behavior of execution; that is, a single regular Python function can be decorated by multiple jitter instances (for example, one jitter that decorates a function with numba.jit and another that does the same with the parallel=True flag).
In addition to jitters, vectorbt introduces the concept of tasks. One task can be executed by multiple jitter types (such as NumPy, Numba, and JAX). For example, one can create a task that converts price into returns and implement it using both NumPy and Numba. Those implementations are registered by JITRegistry as JitableSetup instances, are stored in JITRegistry.jitable_setups, and can be uniquely identified by the task id and jitter type. Note that JitableSetup instances contain only information on how to decorate a function.
The decorated function itself and the jitter that has been used are registered as a JittedSetup instance and stored in JITRegistry.jitted_setups. It acts as a cache to quickly retrieve an already decorated function and to avoid recompilation.
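The relationship between the two dictionaries can be modeled with a small, self-contained sketch. It is a simplified illustration, not vectorbt's implementation: ToyRegistry, its methods, the "py" jitter id, and the fast option are all hypothetical names, and the "decoration" step is a plain wrapper that stands in for compilation.

```python
from typing import Any, Callable, Dict, Tuple


class ToyRegistry:
    """Hypothetical, simplified model of a two-level jit registry."""

    def __init__(self) -> None:
        # task id -> jitter id -> (Python function, default options)
        self.jitable_setups: Dict[str, Dict[str, Tuple[Callable, dict]]] = {}
        # (task id, jitter id) -> options key -> decorated function (the cache)
        self.jitted_setups: Dict[Tuple[str, str], Dict[tuple, Callable]] = {}
        self.decorate_calls = 0  # counts how often a function was (re)decorated

    def register(self, task_id: str, jitter_id: str, py_func: Callable, **options: Any) -> None:
        self.jitable_setups.setdefault(task_id, {})[jitter_id] = (py_func, options)

    def _decorate(self, py_func: Callable, options: dict) -> Callable:
        # Stand-in for an expensive compilation step
        self.decorate_calls += 1

        def decorated(*args, **kwargs):
            return py_func(*args, **kwargs)

        decorated.options = dict(options)
        return decorated

    def resolve(self, task_id: str, jitter_id: str, **options: Any) -> Callable:
        py_func, default_options = self.jitable_setups[task_id][jitter_id]
        merged = {**default_options, **options}
        key = tuple(sorted(merged.items()))  # order-independent cache key
        cache = self.jitted_setups.setdefault((task_id, jitter_id), {})
        if key not in cache:
            cache[key] = self._decorate(py_func, merged)
        return cache[key]


registry = ToyRegistry()
registry.register("sum", "py", lambda a: sum(a), fast=True)

f1 = registry.resolve("sum", "py")
f2 = registry.resolve("sum", "py")              # same options: cache hit
f3 = registry.resolve("sum", "py", fast=False)  # new options: decorated again
print(f1 is f2, f1 is f3, registry.decorate_calls)  # True False 2
```

The first dictionary only describes how to decorate, while the second one holds the decorated results, so repeated resolution with identical options never triggers another decoration.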
Let's implement a task that takes a sum over an array using both NumPy and Numba:
>>> from vectorbtpro import *
>>> @vbt.register_jitted(task_id_or_func='sum')
... def sum_np(a):
...     return a.sum()
>>> @vbt.register_jitted(task_id_or_func='sum')
... def sum_nb(a):
...     out = 0.
...     for i in range(a.shape[0]):
...         out += a[i]
...     return out
We can see that two new jitable setups were registered:
>>> vbt.jit_reg.jitable_setups['sum']
{'np': JitableSetup(task_id='sum', jitter_id='np', py_func=<function sum_np at 0x7fea215b1e18>, jitter_kwargs={}, tags=None),
'nb': JitableSetup(task_id='sum', jitter_id='nb', py_func=<function sum_nb at 0x7fea273d41e0>, jitter_kwargs={}, tags=None)}
Moreover, two jitted setups were registered for our decorated functions:
>>> from vectorbtpro.registries.jit_registry import JitableSetup
>>> hash_np = JitableSetup.get_hash('sum', 'np')
>>> vbt.jit_reg.jitted_setups[hash_np]
{3527539: JittedSetup(jitter=<vectorbtpro.utils.jitting.NumPyJitter object at 0x7fea21506080>, jitted_func=<function sum_np at 0x7fea215b1e18>)}
>>> hash_nb = JitableSetup.get_hash('sum', 'nb')
>>> vbt.jit_reg.jitted_setups[hash_nb]
{6326224984503844995: JittedSetup(jitter=<vectorbtpro.utils.jitting.NumbaJitter object at 0x7fea214d0ba8>, jitted_func=CPUDispatcher(<function sum_nb at 0x7fea273d41e0>))}
These setups contain decorated functions with the options passed during the registration. When we call JITRegistry.resolve() without any additional keyword arguments, JITRegistry returns exactly these functions:
>>> jitted_func = vbt.jit_reg.resolve('sum', jitter='nb')
>>> jitted_func
CPUDispatcher(<function sum_nb at 0x7fea273d41e0>)
>>> jitted_func.targetoptions
{'nopython': True, 'nogil': True, 'parallel': False, 'boundscheck': False}
Once we pass any other option, the Python function will be redecorated, and another JittedSetup instance will be registered:
>>> jitted_func = vbt.jit_reg.resolve('sum', jitter='nb', nopython=False)
>>> jitted_func
CPUDispatcher(<function sum_nb at 0x7fea273d41e0>)
>>> jitted_func.targetoptions
{'nopython': False, 'nogil': True, 'parallel': False, 'boundscheck': False}
>>> vbt.jit_reg.jitted_setups[hash_nb]
{6326224984503844995: JittedSetup(jitter=<vectorbtpro.utils.jitting.NumbaJitter object at 0x7fea214d0ba8>, jitted_func=CPUDispatcher(<function sum_nb at 0x7fea273d41e0>)),
-2979374923679407948: JittedSetup(jitter=<vectorbtpro.utils.jitting.NumbaJitter object at 0x7fea00bf94e0>, jitted_func=CPUDispatcher(<function sum_nb at 0x7fea273d41e0>))}
Templates¶
Templates can be used to dynamically select the jitter or the keyword arguments for jitting based on the current context. For example, let's pick the Numba jitter whenever it is registered for the given task, and fall back to the default behavior otherwise:
>>> vbt.jit_reg.resolve('sum', jitter=vbt.RepEval("'nb' if 'nb' in task_setups else None"))
CPUDispatcher(<function sum_nb at 0x7fea273d41e0>)
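How such an expression picks a jitter from its context can be illustrated without vectorbt. The sketch below is only a conceptual stand-in for an expression template: eval_template is a hypothetical helper, and the contents of task_setups are placeholder objects rather than real setups.

```python
def eval_template(expression: str, context: dict):
    """Evaluate an expression string with `context` as its local namespace.

    Hypothetical stand-in for an expression template; `eval` should only be
    used on expressions you trust.
    """
    return eval(expression, {"__builtins__": {}}, dict(context))


expression = "'nb' if 'nb' in task_setups else None"

# Context of a task implemented with both NumPy and Numba
both = {"task_setups": {"np": object(), "nb": object()}}
# Context of a task implemented with NumPy only
numpy_only = {"task_setups": {"np": object()}}

print(eval_template(expression, both))        # nb
print(eval_template(expression, numpy_only))  # None
```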
Disabling¶
In the case we want to disable jitting, we can simply pass disable=True to JITRegistry.resolve():
>>> py_func = vbt.jit_reg.resolve('sum', jitter='nb', disable=True)
>>> py_func
<function __main__.sum_nb(a)>
We can also disable jitting globally:
>>> vbt.settings.jitting['disable'] = True
>>> vbt.jit_reg.resolve('sum', jitter='nb')
<function __main__.sum_nb(a)>
Hint
If we don't plan to use any additional options and we have only one jitter registered per task, we can also disable resolution (see disable_resolution in JITRegistry.resolve()) to increase performance.
Warning
Disabling jitting globally only applies to functions resolved using JITRegistry.resolve(). Any decorated function that is being called directly will be executed as usual.
Jitted option¶
Since most functions that call other jitted functions in vectorbt have a jitted argument, you can pass jitted as a dictionary with options, as a string denoting the jitter, or False to disable jitting (see resolve_jitted_option()):
>>> def sum_arr(arr, jitted=None):
...     func = vbt.jit_reg.resolve_option('sum', jitted)
...     return func(arr)
>>> arr = np.random.uniform(size=1000000)
>>> %timeit sum_arr(arr, jitted='np')
319 µs ± 3.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit sum_arr(arr, jitted='nb')
1.09 ms ± 4.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit sum_arr(arr, jitted=dict(jitter='nb', disable=True))
133 ms ± 2.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Hint
A good rule of thumb is: whenever a caller function accepts a jitted argument, the jitted functions it calls are most probably resolved using JITRegistry.resolve_option().
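The three accepted forms of jitted can be pictured as being normalized into keyword arguments for the resolver. The helper below is a hypothetical sketch, not resolve_jitted_option() itself; in particular, how None and True are treated is an assumption made for illustration.

```python
from typing import Union


def normalize_jitted(jitted: Union[None, bool, str, dict]) -> dict:
    """Turn a `jitted` argument into keyword arguments for a resolver.

    Hypothetical helper; the handling of None and True is an assumption.
    """
    if jitted is None:
        return {}                        # assumption: fall back to the defaults
    if isinstance(jitted, bool):
        return {"disable": not jitted}   # False disables jitting
    if isinstance(jitted, str):
        return {"jitter": jitted}        # a string denotes the jitter
    if isinstance(jitted, dict):
        return dict(jitted)              # a dictionary with options, copied
    raise TypeError(f"Unsupported jitted option: {jitted!r}")


print(normalize_jitted("nb"))                             # {'jitter': 'nb'}
print(normalize_jitted(False))                            # {'disable': True}
print(normalize_jitted(dict(jitter="nb", disable=True)))  # {'jitter': 'nb', 'disable': True}
```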
Changing options upon registration¶
Options are usually specified upon registration using register_jitted():
>>> from numba import prange
>>> @vbt.register_jitted(parallel=True, tags={'can_parallel'})
... def sum_parallel_nb(a):
...     out = np.empty(a.shape[1])
...     for col in prange(a.shape[1]):
...         total = 0.
...         for i in range(a.shape[0]):
...             total += a[i, col]
...         out[col] = total
...     return out
>>> sum_parallel_nb.targetoptions
{'nopython': True, 'nogil': True, 'parallel': True, 'boundscheck': False}
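But what if we want to change the registration options of vectorbt's own jitable functions, such as diff_nb()? For example, suppose we want to disable caching for all Numba functions by setting cache=False under jitters.nb.override_options in the jitting settings.
Since all functions have already been registered at import time, changing this setting in the running session has no effect: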
>>> vbt.jit_reg.jitable_setups['vectorbtpro.generic.nb.base.diff_nb']['nb'].jitter_kwargs
{'cache': True}
For the new settings to be applied, we need to save them to a file (for example, with vbt.settings.save('my_settings')) and load them before any functions are imported.
Let's restart the runtime and instruct vectorbt to load the file with settings before anything else:
>>> import os
>>> os.environ['VBT_SETTINGS_PATH'] = "my_settings"
>>> from vectorbtpro import *
>>> vbt.jit_reg.jitable_setups['vectorbtpro.generic.nb.base.diff_nb']['nb'].jitter_kwargs
{'cache': False}
We can also change the registration options for some specific tasks, and even replace Python functions. For example, we can change the implementation in the deepest places of the core. Let's change the default ddof from 0 to 1 in nanstd_1d_nb() and disable caching with Numba:
>>> vbt.nb.nanstd_1d_nb(np.array([1, 2, 3]))
0.816496580927726
>>> def new_nanstd_1d_nb(arr, ddof=1):
...     return np.sqrt(vbt.nb.nanvar_1d_nb(arr, ddof=ddof))
>>> vbt.settings.jitting.jitters['nb']['tasks']['vectorbtpro.generic.nb.base.nanstd_1d_nb'] = dict(
...     replace_py_func=new_nanstd_1d_nb,
...     override_options=dict(
...         cache=False
...     )
... )
>>> vbt.settings.save('my_settings')
After restarting the runtime:
>>> import os
>>> os.environ['VBT_SETTINGS_PATH'] = "my_settings"
>>> from vectorbtpro import *
>>> vbt.nb.nanstd_1d_nb(np.array([1, 2, 3]))
1.0
Note
All of the above examples require saving the settings to a file, restarting the runtime, assigning the file path to the VBT_SETTINGS_PATH environment variable, and only then importing vectorbtpro.
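The reason a restart is needed can be shown with a toy decorator that reads its options at decoration time. This is a generic Python sketch under stated assumptions, not vectorbt code: settings, registered, and register are hypothetical names.

```python
settings = {"cache": True}  # hypothetical global settings
registered = {}             # function name -> options captured at registration


def register(func):
    # Options are read once, at decoration (import) time
    registered[func.__name__] = dict(settings)
    return func


@register
def diff(values):
    return [b - a for a, b in zip(values, values[1:])]


settings["cache"] = False  # too late: `diff` has already been registered
print(registered["diff"])  # {'cache': True}


@register
def cumsum(values):
    total, out = 0, []
    for value in values:
        total += value
        out.append(total)
    return out


print(registered["cumsum"])  # {'cache': False}
```

Only functions decorated after the change see the new value, which is why the settings must be in place before the functions are imported.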
Changing options upon resolution¶
Another approach, which doesn't require restarting the runtime, is to change the options upon resolution, that is, whenever a function is retrieved with JITRegistry.resolve() or JITRegistry.resolve_option():
>>> # On specific Numba function
>>> vbt.settings.jitting.jitters['nb']['tasks']['vectorbtpro.generic.nb.base.diff_nb'] = dict(
...     resolve_kwargs=dict(
...         nogil=False
...     )
... )
>>> # disabled
>>> vbt.jit_reg.resolve('vectorbtpro.generic.nb.base.diff_nb', jitter='nb').targetoptions
{'nopython': True, 'nogil': False, 'parallel': False, 'boundscheck': False}
>>> # still enabled
>>> vbt.jit_reg.resolve('sum', jitter='nb').targetoptions
{'nopython': True, 'nogil': True, 'parallel': False, 'boundscheck': False}
>>> # On each Numba function
>>> vbt.settings.jitting.jitters['nb']['resolve_kwargs'] = dict(nogil=False)
>>> # disabled
>>> vbt.jit_reg.resolve('vectorbtpro.generic.nb.base.diff_nb', jitter='nb').targetoptions
{'nopython': True, 'nogil': False, 'parallel': False, 'boundscheck': False}
>>> # disabled
>>> vbt.jit_reg.resolve('sum', jitter='nb').targetoptions
{'nopython': True, 'nogil': False, 'parallel': False, 'boundscheck': False}
Building custom jitters¶
Let's build a custom jitter on top of NumbaJitter that converts any argument that is a Pandas object to a two-dimensional NumPy array before passing it to the decorated function:
>>> from functools import wraps
>>> from vectorbtpro.utils.jitting import NumbaJitter
>>> class SafeNumbaJitter(NumbaJitter):
...     def decorate(self, py_func, tags=None):
...         if self.wrapping_disabled:
...             return py_func
...
...         @wraps(py_func)
...         def wrapper(*args, **kwargs):
...             new_args = ()
...             for arg in args:
...                 if isinstance(arg, pd.Series):
...                     arg = np.expand_dims(arg.values, 1)
...                 elif isinstance(arg, pd.DataFrame):
...                     arg = arg.values
...                 new_args += (arg,)
...             new_kwargs = dict()
...             for k, v in kwargs.items():
...                 if isinstance(v, pd.Series):
...                     v = np.expand_dims(v.values, 1)
...                 elif isinstance(v, pd.DataFrame):
...                     v = v.values
...                 new_kwargs[k] = v
...             return NumbaJitter.decorate(self, py_func, tags=tags)(*new_args, **new_kwargs)
...         return wrapper
After we have defined our jitter class, we need to register it globally under an identifier such as safe_nb.
Finally, we can execute any Numba function by specifying our new jitter:
>>> func = vbt.jit_reg.resolve(
...     task_id_or_func=vbt.generic.nb.diff_nb,
...     jitter='safe_nb',
...     allow_new=True
... )
>>> func(pd.DataFrame([[1, 2], [3, 4]]))
array([[nan, nan],
       [ 2.,  2.]])
Whereas executing the same func using the vanilla Numba jitter causes an error:
>>> func = vbt.jit_reg.resolve(task_id_or_func=vbt.generic.nb.diff_nb)
>>> func(pd.DataFrame([[1, 2], [3, 4]]))
Failed in nopython mode pipeline (step: nopython frontend)
non-precise type pyobject
Note
Make sure to pass a function as task_id_or_func if the jitted function hasn't been registered yet.
This jitter cannot be used for decorating Numba functions that should be called from other Numba functions, since the conversion is performed in Python.
jit_reg JITRegistry¶
Default registry of type JITRegistry.
get_func_full_name function¶
Get full name of the func to be used as task id.
register_jitted function¶
register_jitted(
    py_func=None,
    task_id_or_func=None,
    registry=<vectorbtpro.registries.jit_registry.JITRegistry object>,
    tags=None,
    **options
)
Decorate and register a jitable function using JITRegistry.decorate_and_register().
If task_id_or_func is a callable, it gets replaced by the callable's module name and function name. Additionally, the function name may contain a suffix pointing at the jitter (such as _nb).
Options are merged in the following order:
- jitters.{jitter_id}.options in jitting
- jitters.{jitter_id}.tasks.{task_id}.options in jitting
- options
- jitters.{jitter_id}.override_options in jitting
- jitters.{jitter_id}.tasks.{task_id}.override_options in jitting
py_func can also be overridden using jitters.your_jitter.tasks.your_task.replace_py_func in jitting.
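The effect of this merge order can be sketched with plain dictionaries. The example assumes a shallow merge in which each later layer overrides the earlier ones (the text above only states the order); merge_options and all option values are hypothetical.

```python
def merge_options(*layers):
    """Merge dictionaries left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer or {})  # an unset layer (None) contributes nothing
    return merged


# Hypothetical values for the five layers, listed in merge order
jitter_options = {"nopython": True, "cache": True}  # jitters.{jitter_id}.options
task_options = {"nogil": True}                      # jitters.{jitter_id}.tasks.{task_id}.options
decorator_options = {"parallel": True}              # options passed to the decorator
jitter_override = {"cache": False}                  # jitters.{jitter_id}.override_options
task_override = None                                # ...tasks.{task_id}.override_options (unset)

final = merge_options(
    jitter_options, task_options, decorator_options, jitter_override, task_override
)
print(final)
# {'nopython': True, 'cache': False, 'nogil': True, 'parallel': True}
```

Under this assumption, the override layers come last precisely so that they can win over options hard-coded in the decorator call.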
JITRegistry class¶
Class for registering jitted functions.
decorate_and_register method¶
Decorate a jitable function and register both jitable and jitted setups.
jitable_setups property¶
Dict of registered JitableSetup instances by task_id and jitter_id.
jitted_setups property¶
Nested dict of registered JittedSetup instances by hash of their JitableSetup instance.
match_jitable_setups method¶
Match jitable setups against an expression with each setup being a context.
match_jitted_setups method¶
Match jitted setups of a jitable setup against an expression with each setup being a context.
register_jitable_setup method¶
Register a jitable setup.
register_jitted_setup method¶
Register a jitted setup.
resolve method¶
JITRegistry.resolve(
    task_id_or_func,
    jitter=None,
    disable=None,
    disable_resolution=None,
    allow_new=None,
    register_new=None,
    return_missing_task=False,
    template_context=None,
    tags=None,
    **jitter_kwargs
)
Resolve jitted function for the given task id.
For details on the format of task_id_or_func, see register_jitted().
Jitter keyword arguments are merged in the following order:
- jitable_setup.jitter_kwargs
- jitters.your_jitter.resolve_kwargs in jitting
- jitters.your_jitter.tasks.your_task.resolve_kwargs in jitting
- jitter_kwargs
Templates are substituted in jitter, disable, and jitter_kwargs.
Set disable to True to return the Python function without decoration. If disable_resolution is enabled globally, task_id_or_func is returned unchanged.
Note
disable is only used by JITRegistry, not by vectorbtpro.utils.jitting.
Note
If more than one jitted setup is registered for a single task id, make sure to provide a jitter.
If no jitted setup of type JittedSetup was found and allow_new is True, decorates and returns the function supplied as task_id_or_func (otherwise throws an error).
Set return_missing_task to True to return task_id_or_func if it cannot be found in JITRegistry.jitable_setups.
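The interplay of disable, allow_new, and return_missing_task can be pictured with a toy resolver over a plain {task_id: {jitter_id: func}} dictionary. This is a hypothetical sketch of the documented behavior, not the actual method: toy_resolve, mark, and the order in which the flags are checked are assumptions, and caching, templates, and settings are left out.

```python
def toy_resolve(setups, task_id_or_func, jitter=None, disable=False,
                allow_new=False, return_missing_task=False,
                decorate=lambda func: func):
    """Hypothetical, simplified resolution flow over {task_id: {jitter_id: func}}."""
    if callable(task_id_or_func):
        task_id = f"{task_id_or_func.__module__}.{task_id_or_func.__name__}"
    else:
        task_id = task_id_or_func

    if task_id not in setups:
        if return_missing_task:
            return task_id_or_func  # hand the input back untouched
        if allow_new and callable(task_id_or_func):
            # Unregistered function: decorate it on the fly unless disabled
            return task_id_or_func if disable else decorate(task_id_or_func)
        raise KeyError(f"Task '{task_id}' is not registered")

    task_setups = setups[task_id]
    if jitter is None:
        if len(task_setups) > 1:
            raise ValueError(f"Task '{task_id}' has several jitters; provide one")
        jitter = next(iter(task_setups))
    py_func = task_setups[jitter]
    return py_func if disable else decorate(py_func)


def sum_py(a):
    return sum(a)


def mark(func):
    """Stand-in decorator that tags the function instead of compiling it."""
    def decorated(*args, **kwargs):
        return func(*args, **kwargs)
    decorated.jitted = True
    return decorated


setups = {"sum": {"py": sum_py}}
print(toy_resolve(setups, "sum", decorate=mark).jitted)          # True
print(toy_resolve(setups, "sum", disable=True) is sum_py)        # True
print(toy_resolve(setups, "missing", return_missing_task=True))  # missing
```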
resolve_option method¶
Resolve option using resolve_jitted_option() and call JITRegistry.resolve().
JitableSetup class¶
Class that represents a jitable setup.
Note
Hashed solely by task_id and jitter_id.
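Hashing by a subset of fields can be illustrated with a standard-library dataclass. ToyJitableSetup below is a hypothetical stand-in and does not use DefineMixin or Hashable; it only shows how two setups that differ in function and keyword arguments can still share a hash.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class ToyJitableSetup:
    """Hypothetical setup whose hash and equality use only two fields."""
    task_id: str
    jitter_id: str
    # compare=False excludes a field from both __eq__ and __hash__
    py_func: Optional[Callable] = field(default=None, compare=False)
    jitter_kwargs: Any = field(default=None, compare=False)


a = ToyJitableSetup("sum", "nb", py_func=len, jitter_kwargs={"cache": True})
b = ToyJitableSetup("sum", "nb", py_func=max, jitter_kwargs={"cache": False})
c = ToyJitableSetup("sum", "np")

print(hash(a) == hash(b), a == b)  # True True
print(a == c)                      # False
print(len({a, b, c}))              # 2
```

Because the identifying fields are the only ones hashed, registering a second implementation under the same task id and jitter id replaces the first one in a dictionary keyed by the setup.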
Superclasses
Inherited members
- DefineMixin.asdict()
- DefineMixin.assert_field_not_missing()
- DefineMixin.get_field()
- DefineMixin.hash
- DefineMixin.hash_key
- DefineMixin.is_field_missing()
- DefineMixin.is_field_optional()
- DefineMixin.is_field_required()
- DefineMixin.merge_over()
- DefineMixin.merge_with()
- DefineMixin.replace()
- DefineMixin.resolve()
- DefineMixin.resolve_field()
- Hashable.get_hash()
jitter_id class variable¶
Jitter id.
jitter_kwargs class variable¶
Keyword arguments passed to resolve_jitter().
py_func class variable¶
Python function to be jitted.
tags class variable¶
Set of tags.
task_id class variable¶
Task id.
JittedSetup class¶
Class that represents a jitted setup.
Note
Hashed solely by sorted config of jitter. That is, two jitters with the same config will yield the same hash and the function won't be re-decorated.
Superclasses
Inherited members
- DefineMixin.asdict()
- DefineMixin.assert_field_not_missing()
- DefineMixin.get_field()
- DefineMixin.hash
- DefineMixin.hash_key
- DefineMixin.is_field_missing()
- DefineMixin.is_field_optional()
- DefineMixin.is_field_required()
- DefineMixin.merge_over()
- DefineMixin.merge_with()
- DefineMixin.replace()
- DefineMixin.resolve()
- DefineMixin.resolve_field()
- Hashable.get_hash()
jitted_func class variable¶
Decorated function.
jitter class variable¶
Jitter that decorated the function.