Centrality and scalability analysis on distributed graph of large-scale e-mail dataset for digital forensics

Title: Centrality and scalability analysis on distributed graph of large-scale e-mail dataset for digital forensics
Authors: Ozcan, S.; Astekin, Merve; Shashidhar, N. K.; Zhou, B.
Publication date: 2020-12-10
Place of publication: IEEE
Subject: Big data, Centrality measurement, Digital forensics, Distributed systems, E-mail forensics, Graph analysis, Runtime comparison, Scalability
Type: Document
Language: English
Digital: Yes
Manuscript: No
Library: Özyeğin Üniversitesi
Inventory number: 2-s2.0-85103819648
Record number: 55f1d4e8-0f08-48c4-8a0f-3182d2e86b6b
Date: 2020-12-10
Sample text: Today's digital forensics software tools mostly do not offer automatic analysis methods for revealing evidence among the huge numbers of digital files within hard disk images. In digital and cyber forensics investigations, it is important to find evidence as quickly as possible by examining hard disk images. E-mails constitute a rich source of information in hard disk images, and they are the most likely data source from which to obtain evidence. Analysts search e-mail files manually or with traditional methods in order to find evidence. However, this operation can take a long time because e-mail data can comprise a huge number of files and a huge volume of data. This study introduces an end-to-end distributed graph analysis framework for large-scale digital forensic datasets, and evaluates the accuracy of the centrality algorithms and the scalability of the proposed framework in terms of running time. The framework comprises specific processes for pre-processing, graph building, and algorithm execution. An architecture based on distributed big data techniques is introduced. Three different centrality algorithms are implemented to analyze the accuracy of the framework. Further, three implementations are provided to demonstrate its running time performance. Experiments are performed on the Enron e-mail dataset to analyze the centrality algorithms, to evaluate the performance of the framework, and to compare the running times of the traditional approach and the proposed approach. Moreover, the running time performance of the framework is evaluated under various levels of parallelization. The accuracy of the results is also evaluated and compared across the centrality algorithms. The comparison shows that certain algorithms provide more accurate results and that the end-to-end distributed graph analysis approach can improve the running time by orders of magnitude.
DOI: 10.1109/BigData50022.2020.9378152