Device

Collection of Ivy device functions.

class ivy.DevClonedItem(data: Dict[ivy.Device, Any], axis=0)[source]

Bases: ivy.core.device.MultiDevItem

class ivy.DevClonedIter(data: Iterable, dev_strs)[source]

Bases: ivy.core.device.MultiDevIter

class ivy.DevClonedNest(data: Iterable, dev_strs, max_depth=1)[source]

Bases: ivy.core.device.MultiDevNest

class ivy.DevDistItem(data: Dict[ivy.Device, Any], axis=0)[source]

Bases: ivy.core.device.MultiDevItem

class ivy.DevDistIter(data: Iterable, dev_strs)[source]

Bases: ivy.core.device.MultiDevIter

class ivy.DevDistNest(data: Iterable, dev_strs, max_depth=1)[source]

Bases: ivy.core.device.MultiDevNest

class ivy.DevManager(dev_mapper=None, dev_strs: Optional[Union[Iterable[str], Dict[str, int]]] = None, da_dim_size=None, safety_factor=1.1, min_dev_dim_size=0, max_dev_dim_step_ratio=0.1, min_unit_dev_tune_steps=10, min_sf_tune_steps=10, starting_split_factor=0.0, max_split_factor_step_size=0.05, tune_dev_alloc=True, tune_dev_splits=True)[source]

Bases: object

__init__(dev_mapper=None, dev_strs: Optional[Union[Iterable[str], Dict[str, int]]] = None, da_dim_size=None, safety_factor=1.1, min_dev_dim_size=0, max_dev_dim_step_ratio=0.1, min_unit_dev_tune_steps=10, min_sf_tune_steps=10, starting_split_factor=0.0, max_split_factor_step_size=0.05, tune_dev_alloc=True, tune_dev_splits=True)[source]

Create a device manager which, unlike the device mapper, handles all argument cloning and distribution internally. The device manager only receives a specification of the ratio of the batch each device should consume.

Parameters
  • dev_mapper (DevMapper) – The pre-built device mapper used by the manager internally.

  • dev_strs (sequence of strs or dict of split sizes) – The devices to distribute and clone the arguments across.

  • da_dim_size (int) – The size of the dimension along which the device allocation splitting is performed.

  • safety_factor (float, optional) – The factor by which to be safe in the avoidance of OOM GPU errors. Default is 1.1.

  • min_dev_dim_size (int, optional) – The minimum dimension size to pass to a device. Default is 0.

  • max_dev_dim_step_ratio (float, optional) – The maximum step ratio for changing the dimension size for a device. Default is 0.1.

  • min_unit_dev_tune_steps (int, optional) – The minimum number of tune steps to make when optimizing with unit step size. Default is 10.

  • min_sf_tune_steps (int, optional) – Minimum number of split factor tune steps. Default is 10.

  • starting_split_factor (float, optional) – The initial device-specific split factor. Default is 0.

  • max_split_factor_step_size (float, optional) – The maximum step size for changing the split factor for a device. Default is 0.05.

  • tune_dev_alloc (bool, optional) – Whether to tune the device split sizes internally based on device utilization tracking, and use the provided values for initialization. Default is True.

  • tune_dev_splits (bool, optional) – Whether to tune the per-device split sizes internally. Default is True.

da_tune_step()[source]
property dim_size
ds_tune_step()[source]
map(cloned=None, to_clone=None, distributed=None, to_distribute=None)[source]

Map the function fn to each of the MultiDevice args and kwargs, running each function in parallel with CUDA-safe multiprocessing.

Parameters
  • cloned (dict of any, optional) – The MultiDevice keyword arguments which are already cloned. Default is None.

  • to_clone (dict of any, optional) – The MultiDevice keyword arguments to clone and map to the function. Default is None.

  • distributed (dict of any, optional) – The MultiDevice keyword arguments which are already distributed. Default is None.

  • to_distribute (dict of any, optional) – The MultiDevice keyword arguments to distribute and map to the function. Default is None.

Returns

The results of the function, returned as a MultiDevice instance.

repeated_config_check()[source]
property tune_step
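
The tuning behaviour can be pictured with a small pure-Python sketch (an illustration of the idea only, not ivy's actual implementation): per-device batch shares are re-balanced in inverse proportion to each device's observed step time, so faster devices receive a larger share of the split dimension. The function name and logic below are assumptions for illustration.

```python
# Illustrative sketch of device-allocation tuning: reallocate the batch
# dimension in proportion to each device's throughput (1 / step time).
# Not the ivy implementation; names are hypothetical.

def rebalance(step_times, total):
    # Throughput of each device is proportional to 1 / step_time.
    speeds = {d: 1.0 / t for d, t in step_times.items()}
    speed_sum = sum(speeds.values())
    # Allocate the split dimension proportionally to relative speed.
    new_sizes = {d: round(total * s / speed_sum) for d, s in speeds.items()}
    # Fix rounding drift so the sizes still sum to the full dimension.
    drift = total - sum(new_sizes.values())
    first = next(iter(new_sizes))
    new_sizes[first] += drift
    return new_sizes

sizes = rebalance({"cuda:0": 0.1, "cuda:1": 0.3}, total=100)
# cuda:0 is 3x faster, so it ends up with 75 of the 100 items.
```
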
class ivy.DevMapper(fn, ret_fn, queue_class, worker_class, dev_strs, timeout=None, constant=None, unique=None)[source]

Bases: abc.ABC

__init__(fn, ret_fn, queue_class, worker_class, dev_strs, timeout=None, constant=None, unique=None)[source]

Device Mapper base class.

Parameters
  • fn (callable) – The function which the device mapper parallelises across devices.

  • ret_fn (callable) – The function which receives the ivy.MultiDevIter as input, and produces a single device output.

  • queue_class (class) – The class to use for creating queues.

  • worker_class (class) – The class to use for creating parallel workers.

  • dev_strs (sequence of str) – A list of devices on which to parallelise the function.

  • timeout (float, optional) – The timeout for getting items from the queues. Default is the global timeout.

  • constant (dict of any, optional) – A dict of keyword arguments which are the same for each process. Default is None.

  • unique (dict of iterables of any, optional) – A dict of keyword argument sequences which are unique for each process. Default is None.

map(used_dev_strs=None, split_factors=None, **kwargs)[source]

Map the function fn to each of the MultiDevice args and kwargs, running each function in parallel with CUDA-safe multiprocessing.

Parameters
  • used_dev_strs (sequence of str, optional) – The devices used in the current mapping pass. Default is all dev_strs.

  • split_factors (dict of floats, optional) – The updated split factors 0 < sf < 1 for each device. Default is None.

  • kwargs (dict of any) – The MultiDevice keyword arguments to map the function to.

Returns

The results of the function, returned as a MultiDevice instance.
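
The fn / ret_fn contract can be sketched serially in plain Python (illustrative only: the real DevMapper runs each call in a parallel worker with CUDA-safe multiprocessing, and the helper name below is hypothetical):

```python
# Conceptual sketch of the DevMapper.map contract, run serially: call fn
# once per device with that device's slice of each keyword argument, then
# reduce the per-device results with ret_fn. Illustrative names only.

def map_across_devices(fn, ret_fn, dev_strs, **multi_dev_kwargs):
    per_dev_results = []
    for i, dev_str in enumerate(dev_strs):
        # Each kwarg is a per-device sequence; pick this device's item.
        kwargs = {k: v[i] for k, v in multi_dev_kwargs.items()}
        per_dev_results.append(fn(dev_str, **kwargs))
    return ret_fn(per_dev_results)

out = map_across_devices(
    lambda dev, x: sum(x),          # fn: per-device computation
    lambda results: sum(results),   # ret_fn: unify to a single output
    ["cuda:0", "cuda:1"],
    x=[[1, 2], [3, 4]])
# out == 10
```
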

class ivy.DevMapperMultiProc(fn, ret_fn, dev_strs, timeout=None, constant=None, unique=None)[source]

Bases: ivy.core.device.DevMapper

__init__(fn, ret_fn, dev_strs, timeout=None, constant=None, unique=None)[source]

Multiprocessing device mapper.

Parameters
  • fn (callable) – The function which the device mapper parallelises across devices.

  • ret_fn (callable) – The function which receives the ivy.MultiDevIter as input, and produces a single device output.

  • dev_strs (sequence of str) – A list of devices on which to parallelise the function.

  • timeout (float, optional) – The timeout for getting items from the queues. Default is the global timeout.

  • constant (dict of any, optional) – A dict of keyword arguments which are the same for each process. Default is None.

  • unique (dict of iterables of any, optional) – A dict of keyword argument sequences which are unique for each process. Default is None.

class ivy.MultiDev(data: Iterable, axis=0)[source]

Bases: object

__init__(data: Iterable, axis=0)[source]

Initialize self. See help(type(self)) for accurate signature.

class ivy.MultiDevItem(data: Dict[ivy.Device, Any], axis=0)[source]

Bases: ivy.core.device.MultiDev

__init__(data: Dict[ivy.Device, Any], axis=0)[source]

Initialize self. See help(type(self)) for accurate signature.

items()[source]
keys()[source]
property shape
values()[source]
class ivy.MultiDevIter(data: Iterable, dev_strs)[source]

Bases: ivy.core.device.MultiDev

__init__(data: Iterable, dev_strs)[source]

Initialize self. See help(type(self)) for accurate signature.

at_dev(dev_str)[source]
at_devs()[source]
class ivy.MultiDevNest(data: Iterable, dev_strs, max_depth=1)[source]

Bases: ivy.core.device.MultiDevIter

__init__(data: Iterable, dev_strs, max_depth=1)[source]

Initialize self. See help(type(self)) for accurate signature.

at_dev(dev_str)[source]
class ivy.Profiler(save_dir)[source]

Bases: abc.ABC

__init__(save_dir)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract start()[source]
abstract stop()[source]
ivy.default_device()[source]

Return the default device.

ivy.dev(x: Union[ivy.Array, ivy.NativeArray], f: Optional[ivy.Framework] = None) → ivy.Device[source]

Get the native device handle for input array x.

Parameters
  • x (array) – Tensor for which to get the device handle.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Device handle for the array, in native framework format.

ivy.dev_clone(x, dev_strs)[source]

Clone the input item to each of the specified devices, returning a list of cloned items, each on a different device.

Parameters
  • x (array or container) – The input array or container to clone to each device.

  • dev_strs (sequence of strs) – The devices to clone the input to.

Returns

array or container distributed across the target devices

ivy.dev_clone_array(x, dev_strs)[source]

Clone an array across the specified devices, returning a list of cloned arrays, each on a different device.

Parameters
  • x (array) – The array to clone across devices.

  • dev_strs (sequence of strs) – The devices to clone the array to.

Returns

array cloned to each of the target devices
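
The cloning semantics can be sketched in plain Python: the same item is copied once per device and keyed by device string, a stand-in for a DevClonedItem. This is illustrative only; real ivy additionally moves each copy onto the named device.

```python
import copy

# Sketch of the cloning semantics behind dev_clone_array: one
# independent copy of the input per device, keyed by device string.
# Illustrative only; the helper name is hypothetical.

def clone_to_devices(x, dev_strs):
    return {dev_str: copy.deepcopy(x) for dev_str in dev_strs}

cloned = clone_to_devices([1, 2, 3], ["cuda:0", "cuda:1"])
# Every device sees an identical but independent copy.
```
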

ivy.dev_clone_iter(xs, dev_strs)[source]

Clone elements of the iterable xs to each of the specified devices.

Parameters
  • xs (iterable of any) – The iterable of items to clone.

  • dev_strs (sequence of strs) – The devices to clone each of the iterable elements to.

Returns

iterable with each element cloned to each of the target devices

ivy.dev_clone_nest(args, kwargs, dev_strs, max_depth=1)[source]

Clone the input arguments across the specified devices.

Parameters
  • args (list of any) – The positional arguments to clone.

  • kwargs (dict of any) – The keyword arguments to clone.

  • dev_strs (sequence of strs) – The devices to clone the arguments to.

  • max_depth (int, optional) – The maximum nested depth to reach. Default is 1. Increase this if the nest is deeper.

Returns

arguments cloned to each of the target devices
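
The role of max_depth can be sketched with a small recursion over plain Python containers (illustrative only; the helper name is hypothetical and real ivy operates on arrays and containers):

```python
# Sketch of nest cloning with max_depth: recurse into dicts/lists until
# the depth limit, then clone each leaf to every device.

def clone_nest(nest, dev_strs, max_depth=1, _depth=0):
    if _depth < max_depth and isinstance(nest, dict):
        return {k: clone_nest(v, dev_strs, max_depth, _depth + 1)
                for k, v in nest.items()}
    if _depth < max_depth and isinstance(nest, (list, tuple)):
        return [clone_nest(v, dev_strs, max_depth, _depth + 1) for v in nest]
    # Leaf (or depth limit reached): clone to each device.
    return {d: nest for d in dev_strs}

kwargs = {"x": [1, 2], "y": [3, 4]}
cloned = clone_nest(kwargs, ["cpu", "cuda:0"], max_depth=1)
# {'x': {'cpu': [1, 2], 'cuda:0': [1, 2]}, 'y': {'cpu': [3, 4], ...}}
```

A deeper nest (e.g. dicts of dicts of arrays) needs a correspondingly larger max_depth, otherwise the inner containers are cloned wholesale as leaves.
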

ivy.dev_dist(x, dev_strs: Union[Iterable[str], Dict[str, int]], axis=0)[source]

Distribute the input item across the specified devices, returning a list of sub-items, each on a different device.

Parameters
  • x (array or container) – The input array or container to distribute across devices.

  • dev_strs (sequence of strs or dict of split sizes) – The devices to distribute the input across.

  • axis (int, optional) – The axis along which to split the input. Default is 0.

Returns

array or container distributed across the target devices

ivy.dev_dist_array(x, dev_strs: Union[Iterable[str], Dict[str, int]], axis=0)[source]

Distribute an array across the specified devices, returning a list of sub-arrays, each on a different device.

Parameters
  • x (array) – The array to distribute across devices.

  • dev_strs (sequence of strs or dict of split sizes) – The devices to distribute the array across.

  • axis (int, optional) – The axis along which to split the array. Default is 0.

Returns

array distributed across the target devices
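
The splitting rule can be sketched over plain Python lists (a stand-in for arrays; the helper name is hypothetical): a sequence of devices produces a near-even split, while a dict gives explicit per-device split sizes.

```python
# Sketch of the axis-0 split performed by dev_dist_array: contiguous
# chunks, either evenly sized or taken from a dict of split sizes.

def dist_along_axis0(x, dev_strs):
    if isinstance(dev_strs, dict):
        sizes = dev_strs
    else:
        n, k = len(x), len(dev_strs)
        base, rem = divmod(n, k)
        # Spread any remainder across the first devices.
        sizes = {d: base + (1 if i < rem else 0)
                 for i, d in enumerate(dev_strs)}
    out, start = {}, 0
    for dev_str, size in sizes.items():
        out[dev_str] = x[start:start + size]
        start += size
    return out

chunks = dist_along_axis0(list(range(6)), {"cuda:0": 4, "cuda:1": 2})
# {'cuda:0': [0, 1, 2, 3], 'cuda:1': [4, 5]}
```
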

ivy.dev_dist_iter(xs, dev_strs: Union[Iterable[str], Dict[str, int]], axis=0)[source]

Distribute elements of the iterable xs across the specified devices.

Parameters
  • xs (iterable of any) – The iterable of items to distribute.

  • dev_strs (sequence of strs or dict of split sizes) – The devices to distribute the iterable elements across.

  • axis (int, optional) – The axis along which to split the arrays in the iterable xs. Default is 0.

Returns

iterable with each element distributed to the target devices

ivy.dev_dist_nest(args, kwargs, dev_strs: Union[Iterable[str], Dict[str, int]], axis=0, max_depth=1)[source]

Distribute the nested input arguments across the specified devices.

Parameters
  • args (list of any) – The positional nested arguments to distribute.

  • kwargs (dict of any) – The keyword nested arguments to distribute.

  • dev_strs (sequence of strs or dict of split sizes) – The devices to distribute the nested arguments across.

  • axis (int, optional) – The axis along which to split the arrays in the arguments. Default is 0.

  • max_depth (int, optional) – The maximum nested depth to reach. Default is 1. Increase this if the nest is deeper.

Returns

nested arguments distributed to the target devices

ivy.dev_str(x: Union[ivy.Array, ivy.NativeArray], f: Optional[ivy.Framework] = None) → str[source]

Get the device string for input array x.

Parameters
  • x (array) – Tensor for which to get the device string.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Device string for the array, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’ etc.
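
Device strings follow a ‘type:index’ convention (‘cuda:0’, ‘cpu’, …). A minimal parsing sketch, assuming that convention (this helper is illustrative, not an ivy API):

```python
# Split a device string into its type and optional index.

def parse_dev_str(dev_str):
    dev_type, _, idx = dev_str.partition(":")
    return dev_type, int(idx) if idx else None

assert parse_dev_str("cuda:1") == ("cuda", 1)
assert parse_dev_str("cpu") == ("cpu", None)
```
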

ivy.dev_to_str(dev_in: ivy.Device, f: Optional[ivy.Framework] = None) → str[source]

Convert native device handle to string representation.

Parameters
  • dev_in (device handle) – The device handle to convert to string.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Device string e.g. ‘cuda:0’.

ivy.dev_unify(xs, dev_str, mode, axis=0)[source]

Unify a list of sub-arrays, on arbitrary devices, to a single concatenated array on the specified device.

Parameters
  • xs (sequence of arrays) – The list of sub-arrays to unify onto the specified device.

  • dev_str (str) – The device to unify the sub-arrays to.

  • mode (str) – The mode by which to unify, must be one of [ concat | mean | sum ]

  • axis (int, optional) – The axis along which to concatenate the arrays, if concat mode is set. Default is 0.

Returns

array unified to the target device

ivy.dev_unify_array(xs, dev_str, mode, axis=0)[source]

Unify a list of sub-arrays, on arbitrary devices, to a single array on the specified device.

Parameters
  • xs (sequence of arrays) – The list of arrays to unify onto the specified device.

  • dev_str (str) – The device to unify the arrays to.

  • mode (str) – The mode by which to unify, must be one of [ concat | mean | sum ]

  • axis (int, optional) – The axis along which to concatenate the arrays, if concat mode is set. Default is 0.

Returns

array unified to the target device
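
The three unify modes can be sketched over plain Python lists (a stand-in for arrays; real ivy also moves each sub-array onto the target device first, and the helper name is hypothetical):

```python
# Sketch of the unify modes: 'concat' joins the sub-arrays along axis 0,
# while 'sum' and 'mean' reduce them elementwise.

def unify(xs, mode):
    if mode == "concat":
        return [item for x in xs for item in x]  # axis-0 concatenation
    sums = [sum(col) for col in zip(*xs)]        # elementwise sum
    if mode == "sum":
        return sums
    if mode == "mean":
        return [s / len(xs) for s in sums]
    raise ValueError("mode must be one of: concat, mean, sum")

sub_arrays = [[1.0, 2.0], [3.0, 4.0]]
# concat -> [1.0, 2.0, 3.0, 4.0]; sum -> [4.0, 6.0]; mean -> [2.0, 3.0]
```
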

ivy.dev_unify_iter(xs, dev_str, mode, axis=0, transpose=False)[source]

Unify elements of the iterable xs to a single target device.

Parameters
  • xs (iterable of any) – The iterable of items to unify.

  • dev_str (str) – The device to unify the elements of the iterable to.

  • mode (str) – The mode by which to unify, must be one of [ concat | mean | sum ]

  • axis (int, optional) – The axis along which to concatenate the sub-arrays. Default is 0.

  • transpose (bool, optional) – Whether to transpose the first and second dimensions of the iterator. Default is False.

Returns

iterable with each element unified to a single target device

ivy.dev_unify_nest(args: Type[ivy.core.device.MultiDev], kwargs: Type[ivy.core.device.MultiDev], dev_str, mode, axis=0, max_depth=1)[source]

Unify the input nested arguments, which consist of sub-arrays spread across arbitrary devices, to unified arrays on the single target device.

Parameters
  • args (MultiDev) – The nested positional arguments to unify.

  • kwargs (MultiDev) – The nested keyword arguments to unify.

  • dev_str (str) – The device to unify the nested arguments to.

  • mode (str) – The mode by which to unify, must be one of [ concat | mean | sum ]

  • axis (int, optional) – The axis along which to concatenate the sub-arrays. Default is 0.

  • max_depth (int, optional) – The maximum nested depth to reach. Default is 1. Increase this if the nest is deeper.

Returns

nested arguments unified to the target device

ivy.dev_util(dev_str: str) → float[source]

Get the current utilization (%) for a given device.

Parameters

dev_str (str) – The device string of the device to query utilization for.

Returns

The device utilization (%)

ivy.gpu_is_available(f: Optional[ivy.Framework] = None) → bool[source]

Determine whether a GPU is available to use, with the backend framework.

Parameters

f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Boolean, as to whether a GPU is available.

ivy.num_gpus(f: Optional[ivy.Framework] = None) → int[source]

Determine the number of available GPUs, with the backend framework.

Parameters

f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Number of available GPUs.

ivy.percent_used_mem_on_dev(dev_str: str, process_specific=False) → float[source]

Get the percentage used memory for a given device string. In case of CPU, the used RAM is returned.

Parameters
  • dev_str (str) – The device string of the device to query memory for.

  • process_specific (bool, optional) – Whether to check the memory used by this Python process alone. Default is False.

Returns

The percentage used memory on the device.

ivy.set_default_device(device)[source]
ivy.set_split_factor(factor, dev_str=None)[source]

Set the global split factor for a given device, which can be used to scale batch splitting chunk sizes for the device across the codebase.

Parameters
  • factor (float) – The factor to set the device-specific split factor to.

  • dev_str (str, optional) – The device to set the split factor for. Uses the default device if None.

ivy.split_factor(dev_str=None)[source]

Get the global split factor for a given device, which can be used to scale batch splitting chunk sizes for the device across the codebase. Default global value for each device is 1.

Parameters

dev_str (str, optional) – The device to query the split factor for. Uses the default device if None.

Returns

The split factor for the specified device.
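
The getter/setter pair can be pictured as a per-device registry with a default of 1, roughly as follows (an illustrative sketch, not ivy's implementation; the module-level dict is an assumption):

```python
# Sketch of a global split-factor registry: one factor per device string,
# defaulting to 1 for devices that have never been set.

_split_factors = {}

def set_split_factor(factor, dev_str="cpu"):
    assert factor >= 0  # factors scale chunk sizes, so must be non-negative
    _split_factors[dev_str] = factor

def split_factor(dev_str="cpu"):
    # Default global value for each device is 1.
    return _split_factors.get(dev_str, 1.0)

set_split_factor(0.5, "cuda:0")
# split_factor("cuda:0") -> 0.5; split_factor("cuda:1") -> 1.0
```
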

ivy.split_func_call(func: Callable, inputs: Iterable[Union[ivy.Array, ivy.NativeArray, ivy.Container]], mode: str, max_chunk_size: Optional[int] = None, chunk_size: Optional[int] = None, input_axes: Union[int, Iterable[int]] = 0, output_axes: Optional[Union[int, Iterable[int]]] = None) → Iterable[Union[ivy.Array, ivy.NativeArray, ivy.Container]][source]

Call a function by splitting its inputs along a given axis and calling the function in chunks, rather than feeding the entire input array at once. This can be useful to reduce the memory usage of the device the arrays are on.

Parameters
  • func (callable) – The function to be called.

  • inputs (sequence of arrays) – A list of inputs to pass into the function.

  • mode (str) – The mode by which to unify the return values, must be one of [ concat | mean | sum ]

  • max_chunk_size (int, optional) – The maximum size of each of the chunks to be fed into the function. Default is None.

  • chunk_size (int, optional) – The size of each of the chunks to be fed into the function. Specifying this arg overwrites the global split factor. Default is None.

  • input_axes (int or sequence of ints, optional) – The axes along which to split each of the inputs, before passing to the function. Default is 0.

  • output_axes (int or sequence of ints, optional) – The axes along which to concat each of the returned outputs. Default is the same as the first input axis.

Returns

The return from the function, following input splitting and re-concatenation.
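
The chunking loop can be sketched over plain Python lists (a simplified, single-axis stand-in for the real function, which handles arrays, containers and per-input axes):

```python
# Sketch of split_func_call's chunking loop: split each input into chunks
# of at most max_chunk_size along axis 0, call func once per chunk, then
# re-join the per-chunk outputs with the chosen mode.

def split_func_call(func, inputs, mode, max_chunk_size):
    n = len(inputs[0])
    outputs = []
    for start in range(0, n, max_chunk_size):
        chunk = [x[start:start + max_chunk_size] for x in inputs]
        outputs.append(func(*chunk))
    if mode == "concat":
        return [item for out in outputs for item in out]
    if mode == "sum":
        return sum(outputs)
    if mode == "mean":
        return sum(outputs) / len(outputs)
    raise ValueError("mode must be one of: concat, mean, sum")

# Double each element, 2 items per chunk, then re-concatenate:
doubled = split_func_call(lambda x: [2 * v for v in x],
                          [[1, 2, 3, 4, 5]], "concat", max_chunk_size=2)
# doubled == [2, 4, 6, 8, 10]
```
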

ivy.str_to_dev(dev_str: str, f: Optional[ivy.Framework] = None) → ivy.Device[source]

Convert device string representation to native device type.

Parameters
  • dev_str (str) – The device string to convert to a native device handle.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Native device handle.

ivy.to_dev(x: Union[ivy.Array, ivy.NativeArray], dev_str: Optional[str] = None, f: Optional[ivy.Framework] = None) → Union[ivy.Array, ivy.NativeArray][source]

Move the input array x to the desired device, specified by device string.

Parameters
  • x (array) – Array to move onto the device.

  • dev_str (str, optional) – The device to move the array to, e.g. ‘cuda:0’, ‘cuda:1’, ‘cpu’ etc. Keeps the same device if None.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The array x, but now placed on the target device.

ivy.total_mem_on_dev(dev_str: str) → float[source]

Get the total amount of memory (in GB) for a given device string. In case of CPU, the total RAM is returned.

Parameters

dev_str (str) – The device string of the device to query memory for.

Returns

The total memory on the device in GB.

ivy.tpu_is_available(f: Optional[ivy.Framework] = None) → bool[source]

Determine whether a TPU is available to use, with the backend framework.

Parameters

f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Boolean, as to whether a TPU is available.

ivy.used_mem_on_dev(dev_str: str, process_specific=False) → float[source]

Get the used memory (in GB) for a given device string. In case of CPU, the used RAM is returned.

Parameters
  • dev_str (str) – The device string of the device to query memory for.

  • process_specific (bool, optional) – Whether to check the memory used by this Python process alone. Default is False.

Returns

The used memory on the device in GB.