Advanced visualization approaches for statistical data analysis

Authors

  • Md Wahiduzzaman Suva, Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh
  • Md. Imtiaj Alam Sajin, Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh
  • Esm E Moula Chowdhury Abha, Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh
  • Mushfiqur Rahman Abir, Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh
  • Asif Zaman*, Department of Computer Science, American International University-Bangladesh, Dhaka, Bangladesh (ORCID: https://orcid.org/0009-0000-6412-4051)

https://doi.org/10.22105/metaverse.v1i1.36

Abstract

In the era of big data, effective data visualization plays a crucial role in presenting and understanding complex datasets. This work investigates sophisticated visualization methods that help researchers spot patterns, comprehend complex relationships within data, and convey findings effectively. R offers several benefits for statistical research and data processing, yet many researchers hesitate to use it because of its perceived coding complexity and their reliance on proprietary software. This study demonstrates R's data visualization features using the HCV data dataset from the UCI Machine Learning Repository, which includes a range of laboratory and demographic characteristics. In doing so, it aims to address the challenges associated with data visualization.

Keywords:

Data visualization, R language, ggplot2, smplot, visreg

Published

2024-12-16

How to Cite

Suva, M. W., Alam Sajin, M. I., Chowdhury Abha, E. E. M., Abir, M. R., & Zaman, A. (2024). Advanced visualization approaches for statistical data analysis. Metaversalize, 1(1), 55-69. https://doi.org/10.22105/metaverse.v1i1.36
