Module juice::layers::activation::tanh


Applies the nonlinear TanH function.

Non-linearity activation function: y = sinh(x) / cosh(x)
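For reference, a minimal sketch of the element-wise mapping on plain `f32` slices; the `tanh_forward` helper is illustrative only and not part of the Juice API:

```rust
/// Element-wise TanH: y = sinh(x) / cosh(x), i.e. `x.tanh()`.
fn tanh_forward(input: &[f32], output: &mut [f32]) {
    for (y, &x) in output.iter_mut().zip(input) {
        *y = x.tanh(); // equivalent to x.sinh() / x.cosh()
    }
}

fn main() {
    let input = [-2.0_f32, 0.0, 2.0];
    let mut output = [0.0_f32; 3];
    tanh_forward(&input, &mut output);
    println!("{:?}", output); // approximately [-0.964, 0.0, 0.964]
}
```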

You might consider using ReLU as an alternative.

ReLU, compared to TanH:

  • reduces the likelihood of vanishing gradients (see the sketch after this list)
  • increases the likelihood of a more beneficial sparse representation
  • can be computed faster
  • is therefore the most popular activation function in DNNs as of this writing (2016).
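A short sketch of the vanishing-gradient point above, under the standard identities tanh'(x) = 1 - tanh(x)^2 and ReLU'(x) = 1 for x > 0 (helper names are illustrative, not part of the Juice API):

```rust
/// Gradient of TanH: 1 - tanh(x)^2, which shrinks toward 0 as |x| grows.
fn tanh_grad(x: f32) -> f32 {
    1.0 - x.tanh().powi(2)
}

/// Gradient of ReLU: 1 for positive inputs, 0 otherwise.
fn relu_grad(x: f32) -> f32 {
    if x > 0.0 { 1.0 } else { 0.0 }
}

fn main() {
    for &x in &[0.5_f32, 2.0, 5.0] {
        println!("x = {x}: tanh' = {:.4}, relu' = {}", tanh_grad(x), relu_grad(x));
    }
    // tanh' drops from ~0.79 at x = 0.5 to ~0.0002 at x = 5, while relu' stays at 1,
    // which is why deep stacks of TanH layers are more prone to vanishing gradients.
}
```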

Structs

  • TanH Activation Layer