Published at: 27 Oct 2015
Volume: IJtech Vol 6, No 4 (2015)
DOI: https://doi.org/10.14716/ijtech.v6i4.1557
Gupta, S., Narsimha, 2015. Performance Evaluation of Nosql-Cassandra over Relational Data Store-Mysql for Bigdata. International Journal of Technology. Volume 6(4), pp. 640-649
Sangeeta Gupta | Jawaharlal Nehru Technological University, Kakinada, Andhra Pradesh 533003, India
Narsimha | Jawaharlal Nehru Technological University Hyderabad College of Engineering Jagtial, Kondagattu, Karimnagar, Telangana 505501, India
The massive amounts of data collected from numerous sources, such as social media and e-commerce websites, are challenging to analyse with the available storage technologies. Relational databases are the traditional approach to data storage; they are best suited to structured data formats and are constrained by the Atomicity, Consistency, Isolation, and Durability (ACID) properties. In the modern world, much data, in the form of word documents, PDF files, and audio and video formats, is unstructured. For such data, tables and schema definitions are not a major concern, so relational databases such as Mysql may not be suitable for serving Bigdata. An alternative approach is to use the emerging Nosql databases. In this work, a comprehensive performance and scalability evaluation of a large web data collection in two data stores, Nosql-Cassandra and relational Mysql, is presented. These systems are evaluated with data and workloads typical of Bigdata applications, with attention to application scalability. The insights presented here concern not only performance and scalability but also lessons learned and experiences relating to configuration complexity, and to resolving the complex question of which data store suits which use case for large data sets. The results show that Bigdata collected across the Web, with billions of records generated continuously, is handled poorly by Mysql in terms of ‘write’ operations but handled well by Nosql-Cassandra. The paper also presents a new approach that brings out Nosql-Cassandra’s poor performance in record retrieval and disk utilisation under ever-increasing loads. The results further show an improvement in ‘read’ performance over Mysql with the proposed architecture and configuration, yielding cost savings for any organisation willing to use Nosql-Cassandra to manage Bigdata under heavy loads.
Bigdata, Cassandra, Crawler, Mysql, Nosql