Gradients

Collection of gradient Ivy functions.

ivy.adam_step(dcdws, mw, vw, step, beta1=0.9, beta2=0.999, epsilon=1e-07)[source]

Compute the Adam step delta, given the derivatives of some cost c with respect to the weights ws, using the Adam update. [reference]

Parameters
  • dcdws (container of arrays) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • mw (container of arrays) – running average of the gradients

  • vw (container of arrays) – running average of second moments of the gradients

  • step (int) – training step

  • beta1 (float) – gradient forgetting factor

  • beta2 (float) – second moment of gradient forgetting factor

  • epsilon (float) – divisor during the Adam update, preventing division by zero

Returns

The Adam step delta.
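
A minimal usage sketch (assuming ivy.array and ivy.Container are available and a backend has already been set; note that some Ivy versions return the updated mw and vw alongside the delta):

    import ivy

    # hypothetical gradients and zero-initialised moment estimates for one weight
    dcdws = ivy.Container({'w': ivy.array([0.1, -0.2])})
    mw = ivy.Container({'w': ivy.array([0., 0.])})
    vw = ivy.Container({'w': ivy.array([0., 0.])})
    delta = ivy.adam_step(dcdws, mw, vw, step=1)  # bias-corrected Adam step delta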

ivy.adam_update(ws, dcdws, lr, mw_tm1, vw_tm1, step, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, using the Adam update. [reference]

Parameters
  • ws (container of variables) – Weights of the function to be updated.

  • dcdws (container of arrays) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float or container of layer-wise rates) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • mw_tm1 (container of arrays) – running average of the gradients, from the previous time-step.

  • vw_tm1 (container of arrays) – running average of second moments of the gradients, from the previous time-step.

  • step (int) – training step

  • beta1 (float) – gradient forgetting factor

  • beta2 (float) – second moment of gradient forgetting factor

  • epsilon (float) – divisor during the Adam update, preventing division by zero

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, along with the new mw and vw, following the Adam update.
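
A short sketch of one Adam training step (assuming ivy.variable, ivy.array and ivy.Container behave as documented above and a backend has been set):

    import ivy

    ws = ivy.Container({'w': ivy.variable(ivy.array([1., 2.]))})
    dcdws = ivy.Container({'w': ivy.array([0.1, -0.2])})  # gradients computed elsewhere
    mw = ivy.Container({'w': ivy.array([0., 0.])})        # first-moment running average
    vw = ivy.Container({'w': ivy.array([0., 0.])})        # second-moment running average
    ws, mw, vw = ivy.adam_update(ws, dcdws, 1e-3, mw, vw, step=1)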

ivy.execute_with_gradients(func, xs, retain_grads=False, f=None)[source]

Call the function func with the variables xs as input, and return the function's first output y, the gradients [dy/dx for x in xs], and any further function outputs after the returned y value.

Parameters
  • func (function) – Function for which we compute the gradients of the output with respect to xs input.

  • xs (sequence of variables) – Variables with respect to which the function gradients are computed.

  • retain_grads (bool) – Whether to retain the gradients of the returned values.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The function's first output y, the gradients [dy/dx for x in xs], and any further function outputs.
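
A minimal sketch (ivy.reduce_sum is assumed here purely for the scalar cost reduction; since func returns only the cost, just y and the gradients come back):

    import ivy

    xs = ivy.Container({'x': ivy.variable(ivy.array([3.]))})
    cost_fn = lambda v: ivy.reduce_sum(v['x'] ** 2)  # scalar cost c = x^2
    y, dydxs = ivy.execute_with_gradients(cost_fn, xs)
    # y -> 9.0, dydxs['x'] -> [6.0]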

ivy.gradient_descent_update(ws, dcdws, lr, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws].

Parameters
  • ws (Ivy container) – Weights of the function to be updated.

  • dcdws (Ivy container) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float or container of layer-wise rates) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, following the gradient descent updates.
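
A minimal sketch of a vanilla gradient-descent step (assuming ivy.variable, ivy.array and ivy.Container as above):

    import ivy

    ws = ivy.Container({'w': ivy.variable(ivy.array([1., 2.]))})
    dcdws = ivy.Container({'w': ivy.array([0.5, -0.5])})
    ws = ivy.gradient_descent_update(ws, dcdws, lr=0.1)  # w <- w - lr * dc/dw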

ivy.inplace_decrement(x, val, f=None)[source]

Perform in-place decrement for the input variable.

Parameters
  • x (variable) – The variable to decrement.

  • val (array) – The array to decrement the variable with.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The variable following the in-place decrement.
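
A minimal sketch (assuming ivy.variable and ivy.array as above):

    import ivy

    w = ivy.variable(ivy.array([1., 2., 3.]))
    w = ivy.inplace_decrement(w, ivy.array([0.1, 0.1, 0.1]))  # w -= val, in place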

ivy.inplace_increment(x, val, f=None)[source]

Perform in-place increment for the input variable.

Parameters
  • x (variable) – The variable to increment.

  • val (array) – The array to increment the variable with.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The variable following the in-place increment.
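
A minimal sketch, analogous to the decrement above:

    import ivy

    w = ivy.variable(ivy.array([1., 2., 3.]))
    w = ivy.inplace_increment(w, ivy.array([0.1, 0.1, 0.1]))  # w += val, in place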

ivy.inplace_update(x, val, f=None)[source]

Perform in-place update for the input variable.

Parameters
  • x (variable) – The variable to update.

  • val (array) – The array to update the variable with.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The variable following the in-place update.
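
A minimal sketch (assuming ivy.variable and ivy.array as above):

    import ivy

    w = ivy.variable(ivy.array([1., 2., 3.]))
    w = ivy.inplace_update(w, ivy.array([0., 0., 0.]))  # w now holds the new values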

ivy.is_variable(x, exclusive=False, f=None)[source]

Determine whether the input is a trainable variable.

Parameters
  • x (array) – An ivy array.

  • exclusive (bool, optional) – Whether to check if the data type is exclusively a variable, rather than an array. For frameworks like JAX that do not have exclusive variable types, the function will always return False if this flag is set, otherwise the check is the same for general arrays. Default is False.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

Boolean, true if x is a trainable variable, false otherwise.
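
A minimal sketch (the exclusive result is backend-dependent, as noted above):

    import ivy

    v = ivy.variable(ivy.array([1., 2.]))
    print(ivy.is_variable(v))                  # True for trainable variables
    print(ivy.is_variable(v, exclusive=True))  # always False on backends like JAX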

ivy.lamb_update(ws, dcdws, lr, mw_tm1, vw_tm1, step, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the LAMB method.

Parameters
  • ws (container of variables) – Weights of the function to be updated.

  • dcdws (container of arrays) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float or container of layer-wise rates) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • mw_tm1 (container of arrays) – running average of the gradients, from the previous time-step.

  • vw_tm1 (container of arrays) – running average of second moments of the gradients, from the previous time-step.

  • step (int) – training step

  • beta1 (float) – gradient forgetting factor

  • beta2 (float) – second moment of gradient forgetting factor

  • epsilon (float) – divisor during the Adam update, preventing division by zero

  • max_trust_ratio (float, optional) – The maximum value for the trust ratio. Default is 10.

  • decay_lambda (float) – The factor used for weight decay. Default is zero.

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, following the LAMB updates.
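
A minimal sketch of one LAMB step (assuming ivy.variable, ivy.array and ivy.Container as above; some Ivy versions also return the updated mw and vw alongside ws_new):

    import ivy

    ws = ivy.Container({'w': ivy.variable(ivy.array([1., 2.]))})
    dcdws = ivy.Container({'w': ivy.array([0.1, -0.2])})
    mw = ivy.Container({'w': ivy.array([0., 0.])})
    vw = ivy.Container({'w': ivy.array([0., 0.])})
    ws = ivy.lamb_update(ws, dcdws, 1e-3, mw, vw, step=1)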

ivy.lars_update(ws, dcdws, lr, decay_lambda=0, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the Layer-wise Adaptive Rate Scaling (LARS) method.

Parameters
  • ws (Ivy container) – Weights of the function to be updated.

  • dcdws (Ivy container) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float) – Learning rate, the rate at which the weights should be updated relative to the gradient.

  • decay_lambda (float) – The factor used for weight decay. Default is zero.

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, following the LARS updates.
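
A minimal sketch of one LARS step (assuming ivy.variable, ivy.array and ivy.Container as above):

    import ivy

    ws = ivy.Container({'w': ivy.variable(ivy.array([1., 2.]))})
    dcdws = ivy.Container({'w': ivy.array([0.1, -0.2])})
    ws = ivy.lars_update(ws, dcdws, lr=0.1, decay_lambda=0.0)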

ivy.optimizer_update(ws, effective_grads, lr, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].

Parameters
  • ws (Ivy container) – Weights of the function to be updated.

  • effective_grads (Ivy container) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float or container of layer-wise rates) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, following the optimizer updates.
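
A minimal sketch, with the effective gradients written out as a plain container for illustration (in practice they might come from ivy.adam_step):

    import ivy

    ws = ivy.Container({'w': ivy.variable(ivy.array([1., 2.]))})
    effective_grads = ivy.Container({'w': ivy.array([0.05, -0.05])})
    ws = ivy.optimizer_update(ws, effective_grads, lr=1e-3)  # w <- w - lr * effective_grad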

ivy.stop_gradient(x, preserve_type=True, f=None)[source]

Stop gradient computation.

Parameters
  • x (array) – Array for which to stop the gradient.

  • preserve_type (bool, optional) – Whether to preserve the input type (ivy.Variable or ivy.Array), otherwise an array is always returned. Default is True.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

The same array x, but with no gradient information.
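
A minimal sketch (assuming ivy.variable and ivy.array as above):

    import ivy

    v = ivy.variable(ivy.array([1., 2.]))
    detached = ivy.stop_gradient(v)                  # same values, no gradient tracking
    arr = ivy.stop_gradient(v, preserve_type=False)  # forced back to a plain array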

ivy.variable(x, f=None)[source]

Create a variable, which supports gradient computation.

Parameters
  • x (array) – An ivy array.

  • f (ml_framework, optional) – Machine learning framework. Inferred from inputs if None.

Returns

An ivy variable, supporting gradient computation.
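
A minimal sketch (assuming ivy.array is available and a backend has been set):

    import ivy

    v = ivy.variable(ivy.array([1., 2., 3.]))  # trainable; gradients can flow through v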