An intelligence system basically consists of three layers (i.e., input, hidden, and output), which contain built-in neurons that connect one layer to the next. In a neural network, basis functions are inferred from the data, which allows the network to capture complex interactions between predictor variables (Gonzalez-Camacho et al., 2012; Hastie, Tibshirani, & Friedman, 2009). Artificial neural networks (ANNs) are computational models based on parallel distributed processing that can be used to model highly complex, non-linear, and stochastic problems, with the ability to learn, generalize, classify, and organize data (Gomes & Awruch, 2004; Sharaf, Noureldin, Osman, & El-Sheimy, 2005). ANNs can also be used to analyze complex data structures and large data sets, and these so-called intelligence systems are capable of generalizing the findings of scientific studies (Santos, Rupp, Bonzi, & Filed, 2013). Inspired by the thinking patterns of the human brain, ANNs can learn data structures and carry out numerous statistical processes such as parameter estimation, classification, and optimization. In other words, learning in ANNs is accomplished through algorithms that mimic the learning mechanisms of biological systems (Yilmaz & Ozer, 2009). Therefore, the present study investigates the factors that affect the success of university students by employing two artificial neural network methods (i.e., the multilayer perceptron and the radial basis function) and compares these methods in terms of their predictive ability on educational data.

Multilayer Perceptron Artificial Neural Network

The multilayer perceptron artificial neural network (MLPANN) and the radial basis function artificial neural network (RBFANN) are both widely used supervised training methods. Although their structures are somewhat similar, the RBFANN is used to solve scientific problems, whereas the MLPANN is applied to pattern recognition and classification problems using the error back propagation algorithm. The main purpose of this algorithm is to minimize the estimation error by recomputing all of the weights in the network, and it systematically updates these weights in order to achieve the best neural network configuration. Essentially, the algorithm consists of two steps: propagation and weight update. The propagation step involves a forward pass, which produces the output activations, and a backward pass, which uses those activations to compute the delta, i.e., the difference between the target output and the actual output. In the weight-update step, the delta is multiplied by the input activation of each synaptic weight to obtain the gradient of that weight. A ratio (percentage) of the gradient is then subtracted from the weight, and this ratio acts as the learning rate: if the percentage is low, the accuracy of the training is high, whereas if the percentage is high, the neurons train faster. The two steps (propagation and weight update) are repeated until the performance of the network architecture is satisfactory.

Back propagation needs to compute the derivative of the squared error function with respect to the weights in the network. Assuming one output neuron, the squared error function can be written as

E = \frac{1}{2}(t - y)^2, (1)

where t is the target output, y is the actual output, and E is the squared error. The output o_j of each neuron can be expressed as

o_j = \varphi(\mathrm{net}_j) = \varphi\left(\sum_{k=1}^{n} w_{kj}\, o_k\right), (2)

where the net input \mathrm{net}_j to neuron j is the weighted sum of the outputs o_k of the preceding neurons, and w_{kj} is the weight between neurons k and j.
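The gradient described in the weight-update step follows from Equations 1 and 2 by the chain rule. The expansion below is the standard chain-rule form for the output neuron, shown here for concreteness; \eta, a symbol introduced here for illustration, denotes the learning rate, i.e., the ratio subtracted from each weight:

\frac{\partial E}{\partial w_{kj}} = \frac{\partial E}{\partial y} \cdot \frac{\partial y}{\partial \mathrm{net}_j} \cdot \frac{\partial \mathrm{net}_j}{\partial w_{kj}} = (y - t)\,\varphi'(\mathrm{net}_j)\,o_k,

w_{kj} \leftarrow w_{kj} - \eta\,(y - t)\,\varphi'(\mathrm{net}_j)\,o_k.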
In general, the activation function \varphi of the hidden and output layers is non-linear and differentiable. A commonly used choice is the logistic function:

\varphi(z) = \frac{1}{1 + e^{-z}}. (3)

This process of propagation and weight update continues until the error is minimized. …
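To make the two-step procedure concrete, the following is a minimal sketch in Python using NumPy. It is an illustration under stated assumptions rather than the study's implementation: a single hidden layer, the logistic activation of Equation 3, a fixed learning rate, and synthetic data; the network size and all variable names are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(z):
    # Equation 3: logistic activation
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data: 100 samples, 4 predictor variables, one target in (0, 1)
X = rng.normal(size=(100, 4))
t = logistic(X @ rng.normal(size=(4, 1)))  # illustrative targets

# One hidden layer with 5 neurons and one output neuron
W1 = rng.normal(scale=0.5, size=(4, 5))
W2 = rng.normal(scale=0.5, size=(5, 1))
eta = 0.05  # learning rate: lower -> more accurate training, higher -> faster

for epoch in range(1000):
    # Step 1: forward propagation (produce the output activations)
    h = logistic(X @ W1)        # hidden-layer outputs o_k
    y = logistic(h @ W2)        # network output

    # Backward pass: delta terms from the squared error E = 1/2 (t - y)^2,
    # using the logistic derivative phi'(z) = phi(z) * (1 - phi(z))
    delta_out = (y - t) * y * (1 - y)              # output-layer delta
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # hidden-layer delta

    # Step 2: weight update -- subtract a fraction (eta) of each gradient
    W2 -= eta * h.T @ delta_out
    W1 -= eta * X.T @ delta_hid

print("final mean squared error:", float(np.mean((t - y) ** 2)))
```

Each pass through the loop performs the propagation and weight-update steps described above; lowering eta makes training slower but more accurate, mirroring the percentage trade-off noted earlier.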
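For contrast, the RBFANN that the study compares against can be sketched in the same style. The version below assumes Gaussian basis functions, centers drawn at random from the training points, a fixed width, and output weights fit by linear least squares, which is one common way to train an RBF network; these details are illustrative, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same synthetic setup: 100 samples, 4 predictors, one target
X = rng.normal(size=(100, 4))
t = 1.0 / (1.0 + np.exp(-(X @ rng.normal(size=(4, 1)))))

# Choose basis-function centers as a random subset of the training points
# (a simple, common alternative to k-means clustering)
centers = X[rng.choice(len(X), size=10, replace=False)]
sigma = 1.0  # basis-function width; illustrative value

def rbf_design(X, centers, sigma):
    # Gaussian basis: phi_j(x) = exp(-||x - c_j||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Hidden-to-output weights are fit in one step by linear least squares
Phi = rbf_design(X, centers, sigma)
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)

y = Phi @ w
print("RBF mean squared error:", float(np.mean((t - y) ** 2)))
```

The design choice worth noting is that the hidden layer is fixed and only the linear output weights are fit, so training requires no iterative error back propagation as in the MLPANN.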