Improving The Performance of Big Data Databases



  • Nzar Abdulqadir Ali Computer Science Department, Cihan University - Sulaimani, Sulaimani, Iraq
  • Dashne Raouf Arif big data


Real-time monitoring systems use two types of database: relational databases such as MySQL and non-relational databases such as MongoDB. A relational database management system (RDBMS) stores data in a structured format using rows and columns; it is called relational because the values in its tables are connected. A non-relational database does not adopt the relational structure of a traditional RDBMS, and in recent years this class of databases has also been referred to as Not only SQL (NoSQL). This paper discusses comparisons of the execution-time performance of the two types of database (SQL and NoSQL). In SQL (Structured Query Language) databases, different algorithms are used for inserting and updating data, such as indexing, bulk insert and multiple updating. In NoSQL databases, the algorithms used for inserting and updating are default indexing, batch insert, multiple updating and pipeline aggregation. The results show, first, that compared with related papers the performance of both SQL and NoSQL can be improved, and second, that insert and update operations can be performed dramatically faster in the NoSQL database than in the SQL database. To demonstrate the performance of the different algorithms for inserting and updating data, the experiments use data sets of different sizes: the SQL experiments cover 50,000 to 3,000,000 records, while the NoSQL experiments cover 50,000 to 16,000,000 documents (2 GB). In SQL, three million records are inserted in 606.53 seconds, while in NoSQL the same number of documents is inserted in 67.87 seconds (roughly 8.9 times faster). For updates, 300,000 records are updated in SQL in 271.17 seconds, while the same number of documents is updated in NoSQL in just 46.02 seconds (roughly 5.9 times faster).
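As a rough illustration of the SQL-side techniques named in the abstract (indexing, bulk insert and multiple updating), the following is a minimal sketch and not the authors' implementation. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `readings`, its columns, and the helper function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one table and a secondary index.

    SQLite stands in for MySQL here; the schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)"
    )
    # Indexing: lets the UPDATE below find matching rows without a full scan.
    conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor)")
    return conn


def insert_one_by_one(conn, rows):
    """Baseline: one INSERT statement per row, committed at the end."""
    for row in rows:
        conn.execute(
            "INSERT INTO readings (id, sensor, value) VALUES (?, ?, ?)", row
        )
    conn.commit()


def insert_bulk(conn, rows):
    """Bulk insert: hand all rows to the driver in one call, in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO readings (id, sensor, value) VALUES (?, ?, ?)", rows
        )


def update_many(conn, sensor, delta):
    """Multiple updating: a single statement that changes every matching row."""
    with conn:
        cur = conn.execute(
            "UPDATE readings SET value = value + ? WHERE sensor = ?",
            (delta, sensor),
        )
    return cur.rowcount  # number of rows changed


# Hypothetical data: 1,000 readings spread over 10 sensors.
rows = [(i, f"s{i % 10}", float(i)) for i in range(1000)]

conn = make_db()
insert_bulk(conn, rows)
updated = update_many(conn, "s3", 1.0)
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

Wrapping each function in `time.perf_counter()` calls would allow `insert_one_by_one` and `insert_bulk` to be compared on the same data, although timings from an in-memory SQLite database are not comparable to the MySQL and MongoDB figures reported above.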



Keywords: Big data, Real-time, SQL, NoSQL




How to Cite

N. Abdulqadir Ali and D. R. Arif, “Improving The Performance of Big Data Databases”, KJAR, vol. 4, no. 2, pp. 206–220, Dec. 2019, doi: 10.24017/science.2019.2.20.

Pure and Applied Science