lamb_update

ivy.lamb_update(ws, dcdws, lr, mw_tm1, vw_tm1, step, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the LAMB method.
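
For reference, the standard LAMB update combines an Adam-style step with a layer-wise trust ratio. The following is a sketch in LaTeX of the conventional formulation; the exact handling of bias correction, zero norms and trust-ratio clipping in this implementation may differ slightly:

    m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t
    v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2
    \hat{m}_t = m_t / (1 - \beta_1^t), \quad \hat{v}_t = v_t / (1 - \beta_2^t)
    u_t = \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon) + \lambda w_{t-1}
    w_t = w_{t-1} - lr \cdot \min\!\left( \lVert w_{t-1} \rVert / \lVert u_t \rVert, \; \text{max\_trust\_ratio} \right) \cdot u_t

Here \lambda corresponds to decay_lambda, and the norm ratio is computed per layer (per weight array in the container).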

Parameters
  • ws (container of variables) – Weights of the function to be updated.

  • dcdws (container of arrays) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (float or container of layer-wise rates) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • mw_tm1 (container of arrays) – Running average of the gradients, from the previous time-step.

  • vw_tm1 (container of arrays) – Running average of the second moments of the gradients, from the previous time-step.

  • step (int) – Training step.

  • beta1 (float) – Forgetting factor for the running average of the gradients.

  • beta2 (float) – Forgetting factor for the running average of the second moments of the gradients.

  • epsilon (float) – Small constant added to the denominator during the Adam update step, preventing division by zero.

  • max_trust_ratio (float, optional) – The maximum value for the trust ratio. Default is 10.

  • decay_lambda (float, optional) – The factor used for weight decay. Default is zero.

  • inplace (bool, optional) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True.

  • stop_gradients (bool, optional) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Returns

The new function weights ws_new, following the LAMB updates.

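A minimal usage sketch follows (not taken from the official examples). The backend selection call and variable construction helpers shown are assumptions that may vary between Ivy versions, and a single weight array is used here where a full ivy.Container of variables would normally be passed:

    import ivy

    ivy.set_framework('torch')  # backend selection; the exact call may differ by Ivy version (assumption)

    # A single weight tensor and its gradient; in practice these would usually be
    # ivy.Container instances holding all of a model's variables.
    w = ivy.variable(ivy.array([1., 2., 3.]))
    dcdw = ivy.array([0.1, -0.2, 0.05])

    # Zero-initialised first and second moment estimates for the first step.
    mw = ivy.zeros_like(w)
    vw = ivy.zeros_like(w)

    # Per the Returns section above, the new weights are returned; some versions
    # may also return the updated moment estimates alongside them.
    w_new = ivy.lamb_update(w, dcdw, 1e-3, mw, vw, 1,
                            beta1=0.9, beta2=0.999, epsilon=1e-7,
                            max_trust_ratio=10, decay_lambda=0.,
                            inplace=False, stop_gradients=True)

Setting inplace=False keeps the update step out of any in-place variable mutation, which is the option to choose if the step should form part of a larger computation graph.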

Supported Frameworks:

JAX, TensorFlow, PyTorch, MXNet, NumPy