Advanced visualization approaches for statistical data analysis
Abstract
In the era of big data, effective data visualization plays a crucial role in presenting and understanding complex datasets. This work investigates advanced visualization methods that help researchers spot patterns, comprehend complex relationships within data, and convey findings effectively. R offers several benefits for statistical research and data processing, yet many researchers hesitate to use it because of its perceived coding complexity and their reliance on proprietary software. This study demonstrates R's data visualization features using the HCV data dataset from the UCI Machine Learning Repository, which includes a range of laboratory and demographic characteristics. In doing so, it aims to address the challenges researchers face in visualizing data.
Keywords:
Data visualisation, R language, Ggplot2, Smplot, Visreg