Definition

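Adaptive moment estimation (Adam) combines the ideas of the momentum and RMSProp optimizers: it keeps an exponentially decaying average of past gradients, as momentum does, and of past squared gradients, as RMSProp does.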
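At step $t$, given the gradient $g_t$, the parameters $\theta_t$, a learning rate $\alpha$, and a small constant $\epsilon$ that prevents division by zero, the update is:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$$

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$$

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

$$\theta_{t+1} = \theta_t - \frac{\alpha}{\sqrt{\hat{v}_t} + \epsilon}\, \hat{m}_t$$

Where:
- $m_t$ is the estimate of the first moment (mean) of the gradients
- $v_t$ is the estimate of the second moment (uncentered variance) of the gradients
- $\beta_1$ and $\beta_2$ are decay rates for the moment estimates
- $\hat{m}_t$ and $\hat{v}_t$ are bias-corrected estimates, which compensate for $m_t$ and $v_t$ being initialized at zero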
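
The moment estimates and bias correction described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `adam_step` and `minimize_quadratic` are hypothetical, the defaults (`lr=0.001`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are commonly used values assumed here rather than taken from this text, and the toy objective f(x) = (x - 3)^2 is chosen only to have something to differentiate.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of float parameters.

    theta, grad, m, v are lists of equal length; t is the 1-based step count.
    Returns the updated (theta, m, v) as new lists.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # First moment: exponentially decaying average of gradients.
        m_i = beta1 * m_i + (1 - beta1) * g
        # Second moment: exponentially decaying average of squared gradients.
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction: m and v start at zero, so early values are too small.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Parameter update, scaled per coordinate by the second-moment estimate.
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01):
    """Toy usage (assumed example): minimize f(x) = (x - 3)^2 starting from x = 0."""
    theta, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (theta[0] - 3.0)]  # derivative of (x - 3)^2
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta[0]


if __name__ == "__main__":
    print(minimize_quadratic())
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient's magnitude. That is one way to see why the correction matters early in training.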