Monotone Approximation by Quadratic Neural Network of Functions in L_p Spaces

Quadratic functions have flexible and applicable properties that make them attractive as activation functions for feedforward neural networks (FNNs). We study the essential rate of approximation of any Lebesgue-integrable monotone function by a neural network with quadratic activation functions. The degree of simultaneous essential approximation is also studied. Both estimates are shown to be of the order of the second modulus of smoothness.


Introduction
In the field of artificial neural networks, a neural network can be used to approximate a given function to within an acceptable degree of accuracy.
In 1989, Cybenko [1] introduced the universal approximation theorem, which states that for any continuous function f defined on a compact space X and any ε > 0, there exists a neural network G such that ‖G − f‖∞ < ε on X. Other researchers obtained similar results in that period [2,3].
In spite of the importance of the above theorem, it has a number of limitations concerning the function space used, the degree of approximation, and the nature of the neural network itself. Cybenko took the function from a very narrow space; other spaces were studied later, such as Banach spaces [4], Sobolev spaces [5], Hilbert spaces [6], and other wider spaces [7,8].

In addition, the activation function of the neural network considered by Cybenko is not specified; he proved the result for an arbitrary activation function. The importance of the activation function is not only to create a relationship between inputs and outputs, but also to give the network the ability to learn any type of data. To build a more powerful network, it is essential to choose a suitable activation function depending on various issues, such as the type of data, the number of hidden layers, and the network's model. Sigmoidal, threshold, binary, identity, tanh, and arctan functions are examples of activation functions used in neural networks.
Cybenko's theorem holds for an arbitrary activation function. Later, neural networks with specific activation functions were introduced to achieve good approximation [9][10][11][12][13][14][15][16][17][18][19][20]. For deep learning applications, squared activation functions have been used efficiently in different areas to achieve favourable properties, generating expressive networks with good learning abilities [21]. Here, we define the quadratic activation function in (1). The underlying interval is divided into subintervals, each of length at most a prescribed bound, and for any function f in the space under consideration a neural network operator built from this activation is defined on the resulting partition. The set of all neural networks of this type is the approximating class used throughout. We now turn to the criterion of approximation by which the accuracy of the approximation is measured.
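The precise activation and operator are the ones fixed in (1) and in the construction above. As a rough, non-authoritative illustration of the kind of object involved, the following Python sketch uses the squared ReLU σ(t) = max(t, 0)² as a stand-in quadratic-type activation, places one hidden unit per subinterval of a uniform partition of [0, 1], and chooses the output weights by least squares; the names quadratic_network and fit_network, the stand-in activation, and the least-squares fitting are all assumptions made for illustration, not the construction of this paper.

import numpy as np

def sigma(t):
    # squared ReLU, a stand-in quadratic-type activation;
    # the activation actually defined in (1) may differ
    return np.maximum(t, 0.0) ** 2

def quadratic_network(x, knots, c):
    # one-hidden-layer network N(x) = sum_i c_i * sigma(x - knot_i)
    x = np.asarray(x, dtype=float)
    return sigma(x[:, None] - knots[None, :]) @ c

def fit_network(f, n_units, grid):
    # one unit per subinterval of a uniform partition of [0, 1];
    # output weights chosen by least squares (an illustrative choice)
    knots = np.linspace(0.0, 1.0, n_units, endpoint=False)
    hidden = sigma(grid[:, None] - knots[None, :])
    c, *_ = np.linalg.lstsq(hidden, f(grid), rcond=None)
    return knots, c

if __name__ == "__main__":
    f = np.sqrt                                # an increasing target on [0, 1]
    grid = np.linspace(0.0, 1.0, 401)
    for n in (4, 8, 16):
        knots, c = fit_network(f, n, grid)
        err = np.max(np.abs(quadratic_network(grid, knots, c) - f(grid)))
        print(f"n = {n:2d} units, max grid error = {err:.5f}")

The only point of the sketch is that a fixed quadratic-type activation attached to the subintervals of a partition already yields a usable one-hidden-layer approximant; the operator studied in this paper is, by contrast, defined explicitly from f rather than fitted.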
We estimate the rate of monotone approximation of a function f by elements of the class of quadratic networks defined above. The modulus of smoothness is the natural measure of the rate of approximation, since the error of the best approximation of a function is estimated in terms of it. Smoothness of functions can be measured by the following modulus [22]:

\omega_r(f, \delta)_p = \sup_{0 < h \le \delta} \| \Delta_h^r f \|_p ,

which is called the rth modulus of smoothness, where the rth symmetric difference of f is

\Delta_h^r f(x) = \sum_{k=0}^{r} (-1)^k \binom{r}{k} f\left(x + \left(\tfrac{r}{2} - k\right) h\right).
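Since every estimate below is stated in terms of this modulus, a small numerical illustration may help fix ideas. The following Python sketch computes a discretized version of the rth modulus of smoothness on [0, 1], taking both the supremum over h and the L_p norm over uniform grids; the function name modulus_of_smoothness and the grid sizes are illustrative choices, not part of the paper.

import numpy as np
from math import comb

def modulus_of_smoothness(f, delta, r=2, p=2, n=2000):
    # discretized r-th modulus of smoothness of f on [0, 1]:
    # the supremum over 0 < h <= delta of the discrete L_p norm
    # of the r-th symmetric difference of f
    best = 0.0
    for h in np.linspace(delta / 50, delta, 50):
        # keep only x for which every point x + (r/2 - k) h stays inside [0, 1]
        x = np.linspace(r * h / 2, 1 - r * h / 2, n)
        diff = sum((-1) ** k * comb(r, k) * f(x + (r / 2 - k) * h)
                   for k in range(r + 1))
        best = max(best, np.mean(np.abs(diff) ** p) ** (1 / p))
    return best

if __name__ == "__main__":
    # second modulus of an increasing target at steps delta = 1/n
    for n_sub in (4, 16, 64):
        print(n_sub, round(modulus_of_smoothness(np.sqrt, 1.0 / n_sub), 5))

The printed values decrease as delta shrinks, which is exactly the behaviour that makes the modulus a sensible yardstick for the approximation rates proved below.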

Auxiliary Lemmas
We first need some useful facts about the relationships between moduli of smoothness and the rate of approximation. These facts were proved in detail by other authors [23][24][25].

Lemma 1
The first inequality relates the error of approximation to the modulus of smoothness, while the second is a more general one.

From the definition of the operator, one easily obtains the expression for its derivative, which is used in the proof of Theorem 3 below.

Main Results
Let f be an increasing function in the space under consideration. Since the activation function is increasing on its domain as well, so is the network built from it for any such f. We therefore obtain a monotone approximation of the function whose error is controlled by the second modulus of smoothness, from above and from below. We begin with the following main result for the upper bound.

Theorem 1
If f belongs to the space under consideration, then there exists a network in the class defined above whose error of approximation to f is bounded by the second modulus of smoothness of f.

The proof follows by combining the relations established in the previous sections. We can also prove a lower bound, which yields an essential approximation of the function by networks from this class.
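Before turning to the lower bound, the monotone-preservation argument can be made concrete numerically. The sketch below builds a quasi-interpolant-type network N(f)(x) = f(x_0) + Σ_i [f(x_{i+1}) − f(x_i)] σ((x − x_i)/h) from a quadratic sigmoidal ramp σ on a uniform partition of [0, 1] with step h. Both the ramp and the quasi-interpolant form are assumptions chosen for illustration and are not the operator defined in this paper; they only make visible why an increasing f produces an increasing network, namely that the increments f(x_{i+1}) − f(x_i) are nonnegative and σ is nondecreasing.

import numpy as np

def ramp(t):
    # a quadratic sigmoidal ramp: 0 for t <= 0, 1 for t >= 1,
    # and a C^1 quadratic spline in between (an illustrative choice)
    t = np.clip(t, 0.0, 1.0)
    return np.where(t < 0.5, 2.0 * t ** 2, 1.0 - 2.0 * (1.0 - t) ** 2)

def quasi_interpolant(f, n, x):
    # N(f)(x) = f(x_0) + sum_i [f(x_{i+1}) - f(x_i)] * ramp((x - x_i) / h)
    # on a uniform partition of [0, 1] with n subintervals of length h = 1/n
    nodes = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    out = np.full_like(x, f(nodes[0]))
    for i in range(n):
        out += (f(nodes[i + 1]) - f(nodes[i])) * ramp((x - nodes[i]) / h)
    return out

if __name__ == "__main__":
    f = np.sqrt                         # an increasing target on [0, 1]
    x = np.linspace(0.0, 1.0, 801)
    for n in (4, 16, 64):
        approx = quasi_interpolant(f, n, x)
        err = np.max(np.abs(approx - f(x)))
        nondecreasing = bool(np.all(np.diff(approx) >= -1e-12))
        print(f"n = {n:3d}  max error = {err:.5f}  nondecreasing = {nondecreasing}")

Every computed approximant stays nondecreasing while the error shrinks as the partition is refined; this is the qualitative behaviour that Theorems 1 and 2 quantify through the second modulus of smoothness.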

Theorem 2
If f belongs to the space under consideration, then there exists a network in the class defined above such that the error of approximation is also bounded from below in terms of the second modulus of smoothness of f, which gives the essential approximation announced above.

Proof
With the index set chosen as above, we expand f through the neural network operator defined in the introduction, which yields the required lower estimate. Turning to the derivatives of the function and of its operator, we obtain the following simultaneous result.

Theorem 3
If f belongs to the space under consideration, then there exists a network in the class defined above that satisfies the following simultaneous estimate, with the error of approximating both f and its derivative controlled by the second modulus of smoothness.

Proof
Using the derivatives of the function and of the neural network operator, obtained earlier, we reach the desired estimate. It is easy to conclude that the essential approximation also holds for the derivatives of any function in the space under consideration.