*by **Computer Vision Department of NTRLab***

Suppose we are given a set of distinct points *P* = {(*x*_{i}, *y*_{i}) ∈ ℝ^{m} × ℝ}_{i=1,…,n}, which we regard as a set of test samples *x*_{i} ∈ ℝ^{m} with known answers *y*_{i} ∈ ℝ. To avoid non-compactness we may assume that the samples *x*_{i} lie in some compact set *K* ⊂ ℝ^{m}; for example, *K* may be some polytope. Does there exist a continuous function in the space *C(K)* of all continuous functions on *K* such that its graph is a good approximation for our set *P* in some sense?

From the point of view of approximation theory, a neural network is a family of functions {*F*_{θ}, θ ∈ Θ} from some functional class. Each particular neural network defines its own family of functions, and some of these families may be equivalent in some sense. If we restrict ourselves to a multilayer perceptron (MLP) for the above problem, with only one intermediate layer consisting of *N* elements, then the corresponding family of functions will be
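As a minimal sketch of such a family, assume the common single-hidden-layer form *F*_{θ}(*x*) = Σ_{k=1,…,N} *c*_{k} σ(⟨*w*_{k}, *x*⟩ + *b*_{k}), with θ = (*w*_{k}, *b*_{k}, *c*_{k})_{k=1,…,N}. This form, the choice σ = tanh, and the names `init_theta` and `F` are illustrative assumptions, not the authors' definitions.

```python
import math
import random


def init_theta(m, N, seed=0):
    """Draw hypothetical random parameters theta = (W, b, c) for N hidden units.

    W holds N weight vectors of length m, b holds N biases,
    and c holds N output coefficients.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(N)]
    b = [rng.gauss(0.0, 1.0) for _ in range(N)]
    c = [rng.gauss(0.0, 1.0) for _ in range(N)]
    return W, b, c


def F(theta, x, activation=math.tanh):
    """Evaluate F_theta(x) = sum_k c_k * activation(<w_k, x> + b_k)."""
    W, b, c = theta
    total = 0.0
    for w_k, b_k, c_k in zip(W, b, c):
        # Pre-activation of hidden unit k: inner product plus bias.
        pre = sum(w_kj * x_j for w_kj, x_j in zip(w_k, x)) + b_k
        total += c_k * activation(pre)
    return total


# Each theta picks one member of the family {F_theta}.
theta = init_theta(m=3, N=5)
value = F(theta, [0.1, -0.2, 0.3])
```

Varying `theta` over all admissible parameters traces out the whole family; the question posed above is whether some member passes close to the points of *P*.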
