"in practice it is often the case that 3-layer neural networks will outperform 2-layer nets, but going even deeper (4,5,6-layer) rarely helps much more. This is in stark contrast to Convolutional Networks, where depth has been found to be an extremely important component for a good recognition system (e.g. on order of 10 learnable layers)." ~ Andrej Karpathy inhttps://cs231n.github.io/neural-networks-1/
Summary
- If overfitting occurs, reduce the number of hidden units, or use Dropout or L2 regularization.
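
As a rough illustration of the two regularizers named above, here is a minimal pure-Python sketch. The function names (`l2_penalty`, `dropout`), the `lam` and `rate` parameters, and the sample values are assumptions made for this example, not part of any particular library. In practice a framework's built-in dropout layer and weight-decay option would normally be used instead.

```python
import random

def l2_penalty(weights, lam):
    """L2 regularization term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate, rng, training=True):
    """Inverted dropout (hypothetical helper).

    During training, each unit is zeroed with probability `rate` and the
    surviving units are scaled by 1 / (1 - rate), so the expected activation
    is unchanged. At inference time the activations pass through untouched.
    Assumes 0 <= rate < 1.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# Illustrative values only.
rng = random.Random(0)
hidden = [0.5, 1.0, -2.0, 3.0]
weights = [0.1, -0.2, 0.3]

dropped = dropout(hidden, rate=0.5, rng=rng)        # training-time activations
unchanged = dropout(hidden, rate=0.5, rng=rng, training=False)
penalty = l2_penalty(weights, lam=0.01)             # extra term for the loss
```

A larger `rate` or `lam` regularizes more strongly. Both are hyperparameters that would typically be tuned against validation loss.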
"in practice it is often the case that 3-layer neural networks will outperform 2-layer nets, but going even deeper (4,5,6-layer) rarely helps much more. This is in stark contrast to Convolutional Networks, where depth has been found to be an extremely important component for a good recognition system (e.g. on order of 10 learnable layers)." ~ Andrej Karpathy inhttps://cs231n.github.io/neural-networks-1/
More on Capacity
A more detailed discussion of a model's capacity appears in the Deep Learning book, chapter 5.2 (pages 110-120).