Crawling and Mining the Dark Web: A Survey on Existing and New Approaches
DOI:
https://doi.org/10.24996/ijs.2022.63.3.36
Keywords:
Dark Web, Deep Web, Data Mining, Crawling, TOR (The Onion Routing)
Abstract
The last two decades have seen a marked increase in illegal activities on the Dark Web. The rapid evolution and use of sophisticated protocols make it difficult for security agencies to identify and investigate these activities by conventional methods. Moreover, tracing criminals and terrorists poses a great challenge, bearing in mind that cybercrimes are no less serious than real-life crimes. At the same time, computer security communities and law enforcement pay a great deal of attention to detecting and monitoring illegal sites on the Dark Web. Retrieving relevant information is not an easy task because of the vastness and ever-changing nature of the Dark Web; as a result, web crawlers play a vital role in achieving this task. Thereafter, data mining techniques are applied to extract useful patterns that would help security agencies to limit and eliminate cybercrimes. The aim of this paper is to present a survey for researchers who are interested in this topic. We start by discussing the internet layers and the properties of the Deep Web, then explain the technical characteristics of The Onion Routing (TOR) network, and finally describe the approaches to accessing, extracting and processing Dark Web data. Understanding the Dark Web, its properties and its threats is vital for internet servers; we hope this paper will be of help toward that goal.