The Leaky Rectified Linear Unit (Leaky ReLU) is an activation function used in artificial neural networks, and it is a variant of the ReLU activation function. The primary purpose of an activation function is to introduce non-linearity into the neural network, allowing it to learn more complex patterns and representations from the input data.
The Leaky ReLU function is defined as:
f(x) = max(αx, x)
where α is a small positive constant (usually set to a value like 0.01).
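As a minimal sketch of this definition (assuming NumPy and the common default of α = 0.01), it can be implemented in a couple of lines:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: f(x) = max(alpha * x, x), assuming 0 < alpha < 1."""
    return np.maximum(alpha * x, x)
```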
Here’s how the Leaky ReLU function works:
- If the input (x) is greater than 0, the function returns the input value itself, just like the standard ReLU function.
- If the input (x) is less than or equal to 0, instead of returning 0 (as in the case of the standard ReLU), Leaky ReLU returns a small negative value proportional to the input, obtained by multiplying the input by the small constant α, as shown in the example below.
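Here is a small self-contained example of that behavior (again assuming NumPy and α = 0.01), with the standard ReLU shown for comparison:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
alpha = 0.01

# Leaky ReLU: negative inputs are scaled by alpha instead of zeroed out.
print(np.maximum(alpha * x, x))  # [-0.02, -0.005, 0.0, 0.5, 2.0]

# Standard ReLU: all negative inputs are mapped to 0.
print(np.maximum(0.0, x))        # [0.0, 0.0, 0.0, 0.5, 2.0]
```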
The key difference between Leaky ReLU and the standard ReLU is that Leaky ReLU allows a small gradient to flow for negative input values, whereas ReLU sets the output to 0 for negative inputs. This helps alleviate the “dying ReLU” problem, where certain neurons in the network can permanently switch off and stop learning during training if they consistently receive negative inputs.
By allowing a small gradient to flow for negative inputs, Leaky ReLU helps prevent neurons from getting stuck and enables them to keep learning and updating their weights throughout training. This can lead to better performance and faster convergence of the neural network.
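To make the gradient argument concrete, here is a hedged sketch of the two derivatives (the value at exactly x = 0 is a matter of convention; the negative-side slope is used here):

```python
import numpy as np

def leaky_relu_grad(x, alpha=0.01):
    """Derivative of Leaky ReLU: 1 for x > 0, alpha otherwise (convention at x = 0)."""
    return np.where(x > 0, 1.0, alpha)

def relu_grad(x):
    """Derivative of standard ReLU: 1 for x > 0, 0 otherwise."""
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(leaky_relu_grad(x))  # [0.01, 0.01, 1.0, 1.0] -- never exactly zero
print(relu_grad(x))        # [0.0,  0.0,  1.0, 1.0] -- zero for negative inputs
```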
However, it’s important to note that the choice of activation function depends on the specific problem and the architecture of the neural network. Other activation functions, such as ReLU, ELU (Exponential Linear Unit), and Swish, are also commonly used in various scenarios.
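For reference, here are minimal sketches of ELU and Swish under their usual definitions (ELU uses α(eˣ − 1) for negative inputs; Swish is x · sigmoid(x)); exact variants and default parameters differ between libraries:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return np.where(x > 0, x, alpha * np.expm1(x))

def swish(x):
    """Swish (also called SiLU): x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(elu(x))    # approximately [-0.865, 0.0, 2.0]
print(swish(x))  # approximately [-0.238, 0.0, 1.762]
```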