International Journal of Applied Science and Engineering
Published by Chaoyang University of Technology

Nagham A. Sultan *, Dhuha B. Abdullah

Department of Computer Science, University of Mosul, Mosul, Iraq


 



ABSTRACT


The academic performance of Iraqi authors and institutions is an important subject that warrants investigation. A researcher's standing is established by scientific criteria such as citation counts and the number of published articles. This study analyzes the academic performance patterns of Iraqi authors and institutions in Google Scholar, and explores the collaboration patterns between Iraqi scientists and universities. A crawler built with SerpApi queried Google Scholar in parallel and collected large volumes of scientist profile data, which were stored in MongoDB Atlas. The data were processed using Spark on the Amazon Web Services (AWS) cloud. Complex network analysis and the Cytoscape software were then used to investigate the co-authorship networks of Iraqi scientists, and a set of complex network measures was applied to reveal patterns of cooperation. After visualizing the network diagram and analyzing the relationships between the nodes, the virtual nodes and communities in the network were identified, revealing the cooperation patterns of Iraqi scientists and universities. The results have implications for research and development policies in Iraq and demonstrate that complex network analysis can yield valuable insights into academic research collaboration.


Keywords: Big data, Parallel scraping, MongoDB, Apache Spark, Collaboration network.
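
As a rough illustration of the co-authorship analysis summarized in the abstract, the following minimal sketch builds a small collaboration network and computes a few standard complex network measures. It is not the authors' implementation: the toy publication list, the function names, and the choice of measures (degree centrality, local clustering coefficient) are assumptions made for illustration, and connected components are used here only as a crude stand-in for community detection. The study itself reports using Spark and Cytoscape for these steps.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each entry is the author list of one publication.
publications = [
    ["A", "B", "C"],
    ["A", "B"],
    ["C", "D"],
    ["E", "F"],
]


def build_coauthor_graph(pubs):
    """Return an undirected weighted graph {author: {coauthor: joint_papers}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for authors in pubs:
        unique = sorted(set(authors))
        for a in unique:
            graph[a]  # make sure single-author papers still create a node
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    # Convert to plain dicts so later lookups cannot create nodes by accident.
    return {a: dict(nbrs) for a, nbrs in graph.items()}


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def clustering_coefficient(graph, v):
    """Share of pairs of v's co-authors who have also co-authored together."""
    nbrs = list(graph[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


def connected_components(graph):
    """Groups of authors linked by some chain of co-authorship."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(u for u in graph[v] if u not in comp)
        seen |= comp
        components.append(comp)
    return components


graph = build_coauthor_graph(publications)
centrality = degree_centrality(graph)
components = connected_components(graph)
```

In this toy network, author C has the highest degree centrality because C links the A-B-C group to D, which is the kind of bridging role that centrality measures are meant to surface in a real co-authorship network.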





ARTICLE INFORMATION


Received: 2023-06-18
Revised: 2023-07-11
Accepted: 2023-07-21
Available Online: 2023-10-16


Cite this article:

Sultan, N.A., Abdullah, D.B. 2023. Investigating scientific collaboration networks in Iraq using cloud computing and data mining. International Journal of Applied Science and Engineering, 20, 2023241. https://doi.org/10.6703/IJASE.202312_20(4).003

  Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.