Spam Mail Detection using Naive Bayes method with Apache Spark


Aydogan M., KARCI A.

International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Türkiye, 28 - 30 September 2018

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume number:
  • City of Publication: Malatya
  • Country of Publication: Türkiye
  • Affiliated with İnönü University: Yes

Abstract

Internet technologies have advanced significantly alongside major improvements in information infrastructure, and in parallel the amount of data produced has grown enormously. Today, storing and processing this data is the most important big data problem. In recent years, new technologies have been developed in this field, and the Apache Spark project is considered one of the most important of them. In this study, a classification application was developed on Apache Spark using the Naive Bayes method, which is provided by Apache Spark's machine learning libraries. A data set of e-mails labeled as Spam and Not Spam was analyzed using Apache Spark, and a classification application with a high accuracy ratio was built. The performance of Apache Spark differs considerably from that of other platforms commonly used in data analysis.
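
To illustrate the classification method named in the abstract, the following is a minimal single-machine sketch of a multinomial Naive Bayes spam classifier in plain Python. It is not the authors' Spark application: the study used Apache Spark's machine learning libraries, whereas this sketch only shows the underlying idea. The toy messages, the whitespace tokenizer, the Laplace smoothing constant `alpha`, and all function names are assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the data set used in the study is not reproduced here.
TRAINING_DATA = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not_spam"),
    ("project report attached for review", "not_spam"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(samples, alpha=1.0):
    """Fit a multinomial Naive Bayes model from (text, label) pairs.

    alpha is the Laplace smoothing constant (assumed value, not from the paper).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)

    total_docs = sum(label_counts.values())
    return {
        "alpha": alpha,
        "vocabulary": vocabulary,
        "word_counts": word_counts,
        # log P(label), estimated from label frequencies
        "log_prior": {
            label: math.log(count / total_docs)
            for label, count in label_counts.items()
        },
        # total number of word tokens seen per label
        "total_words": {
            label: sum(word_counts[label].values()) for label in label_counts
        },
    }


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    alpha = model["alpha"]
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        denominator = model["total_words"][label] + alpha * vocab_size
        for word in tokenize(text):
            if word not in model["vocabulary"]:
                continue  # skip words never seen during training
            count = model["word_counts"][label][word]
            score += math.log((count + alpha) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAINING_DATA)
print(predict(model, "claim your free money"))      # spam
print(predict(model, "agenda for project review"))  # not_spam
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from forcing that class's probability to zero. A Spark-based version would distribute the counting step across a cluster, which this sketch does not attempt.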