A Convolutional Neural Network Approach Using TensorFlow For Image Classification


Akkaya A. E.

International Conference on STEM and Educational Sciences, Muş, Türkiye, 3-5 May 2018, pp. 197

  • Publication Type: Conference Paper / Abstract
  • City of Publication: Muş
  • Country of Publication: Türkiye
  • Page Numbers: pp. 197
  • Affiliated with İnönü University: Yes

Abstract

Nowadays, deep learning has proved its success in many different research fields. Convolutional neural networks (CNNs) are a special type of deep learning model widely used in areas such as image classification (Ciregan, Meier & Schmidhuber, 2012) and natural language processing (Collobert & Weston, 2008). Imagine that a simple neural network is used to determine what features an image contains. To achieve this, the image is converted into a column or row vector and fed to the input of the system. In this type of structure, the features consist of side-by-side pixel values. However, the human visual system perceives an image by looking at corners, lines, and rounded shapes. When an image is flattened into a row or column vector, all of these fine details disappear. Solving the problem in this form, which is difficult to perceive even for perfectly functioning human intelligence, is almost impossible with classical machine learning techniques. CNNs were developed as a solution to this problem. CNNs use a feed-forward structure. Unlike plain artificial neural networks, they have convolution and pooling layers for feature extraction and for reducing the size of the input image, respectively. Using both kinds of layer, the important features in the image can be extracted.

The proposed CNN consists of seven layers. The first layer is a convolution layer in which the basic features of the image are detected. The second layer is a pooling layer, which reduces the image size by half. The third and fourth layers are convolution layers; in these layers, more detailed information about the image is extracted using 64 filters of size 3x3. In the fifth layer, the image is again reduced to half its size by pooling. The sixth layer is a fully connected layer in which all neurons are connected to each other. Immediately after this layer, neurons are randomly dropped with a dropout rate of 0.5 in order to prevent overfitting. The seventh and last layer determines, from the outputs of the previous layers, whether an image belongs to the bird or airplane class. The Rectified Linear Unit (ReLU) activation function is used in the first, third, and fourth (convolution) layers.
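For concreteness, the seven-layer architecture described above can be sketched in TensorFlow/Keras roughly as follows. The input image size, the number of filters in the first convolution layer, and the width of the fully connected layer are not specified in the abstract and are only illustrative assumptions; the layer ordering, the 3x3 kernels with 64 filters in the third and fourth layers, the 0.5 dropout rate, the ReLU activations, and the two-class output follow the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal sketch of the described seven-layer CNN.
# Input size (64x64x3), 32 filters in the first layer, and 128 dense units
# are assumptions; they are not stated in the abstract.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(64, 64, 3)),        # layer 1: basic features, ReLU
    layers.MaxPooling2D((2, 2)),                   # layer 2: halve spatial size
    layers.Conv2D(64, (3, 3), activation="relu"),  # layer 3: 64 filters of 3x3, ReLU
    layers.Conv2D(64, (3, 3), activation="relu"),  # layer 4: 64 filters of 3x3, ReLU
    layers.MaxPooling2D((2, 2)),                   # layer 5: halve spatial size
    layers.Flatten(),
    layers.Dense(128, activation="relu"),          # layer 6: fully connected
    layers.Dropout(0.5),                           # drop half the units to curb overfitting
    layers.Dense(2, activation="softmax"),         # layer 7: bird vs. airplane
])
model.summary()
```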

In this study, the CNN was implemented with TensorFlow in the Python language and run on a GPU. The calculations on the GPU were performed via the Nvidia CUDA library. The proposed network was trained on an Nvidia GeForce GTX 1070 graphics card with 8 GB of memory, a 256-bit memory interface, and 1920 CUDA cores. The Caltech-UCSD Birds-200-2011 (Wah et al., 2011) and Caltech 101 (Fei-Fei, Fergus & Perona, 2004) datasets were used to train and validate the network, with the goal of correctly classifying airplane and bird images. To achieve this, a total of 1600 images of birds and airplanes was used. The data were split with a ratio of 33%: 67% of the images (1072 images) were used to train the network, and 33% of the images (528 images) were used to validate it. The training phase lasted only 20 epochs to reach 100% accuracy on the training data, and the test data were classified with 99% accuracy. Although they have similar structures, aircraft and bird images were successfully distinguished from each other. This work presented a classification process using the deep learning framework TensorFlow. In future studies, it is planned to develop deep learning methods for object detection on real-time images.
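A minimal sketch of the corresponding training setup, reusing the `model` from the previous snippet, might look like the code below. Only the 1600 images, the 67%/33% train-validation split, and the 20 training epochs come from the text; the data-loading step, the optimizer, the batch size, and the `images`/`labels` arrays are hypothetical placeholders, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the 1600 bird/airplane images; the actual
# loading from Caltech-UCSD Birds-200-2011 and Caltech 101 is not shown here.
images = np.random.rand(1600, 64, 64, 3).astype("float32")
labels = np.random.randint(0, 2, size=1600)  # 0 = bird, 1 = airplane (assumed encoding)

# 67% / 33% train-validation split, as described above (1072 / 528 images).
x_train, x_val, y_train, y_val = train_test_split(
    images, labels, test_size=0.33, random_state=42)

# Optimizer and loss are assumptions; the abstract does not specify them.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 20 epochs as reported in the abstract; the batch size is an assumption.
model.fit(x_train, y_train,
          epochs=20, batch_size=32,
          validation_data=(x_val, y_val))
```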