Abstract

In today's information age, the growing use of websites, mobile apps and other forms of information sharing has given rise to malicious URLs. These URLs are forwarded to users and divert their attention from what they were searching for to unnecessary and harmful content, wasting time and money. Malicious URLs have led to theft of authentication credentials, theft of money and bullying of users who fall into traps set by hackers by accessing them. To address this menace, there is a need to detect such URLs and prevent users from accessing them. While studying techniques proposed by various authors in different research papers, we found a few particularly interesting and useful. The first detects malicious URLs using a convolutional neural network (CNN) and a gated recurrent unit (GRU). The second proposes a text mining technique based on Natural Language Processing (NLP) that can be used for classification. The third combines CNN and NLP. From this study we concluded that a successful malicious URL detection system should combine NLP and CNN. In this paper we therefore propose a fusion of R-CNN, NLP and cloud. The main work is to collect malicious and benign URLs from the internet and multiple other sources and combine them into one dataset. We will use Google Cloud to create our own blacklisted URL database rather than depend on multiple internet sources. In our system, we first create the blacklist database on the cloud and then apply classification to it using NLP and the support vector machine (SVM) algorithm. In the second step, we use the same URL dataset to train an R-CNN model, whose output is the set of URLs identified as malicious. In the final phase, we compare the results from SVM and R-CNN and analyse which is more efficient, along with the strengths and weaknesses of each technique.
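
The abstract does not specify how URLs are turned into text features or how the two classifiers are scored, so the following is only a minimal sketch under stated assumptions. It illustrates three pieces of the described pipeline: a blacklist lookup, NLP-style tokenization of a URL into a bag of tokens (the kind of input an SVM text classifier could consume), and a precision/recall helper for comparing the SVM and R-CNN outputs. All function names (`tokenize_url`, `bag_of_tokens`, `is_blacklisted`, `precision_recall`) are hypothetical, and the in-memory set merely stands in for the cloud-hosted blacklist database.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: tokens are maximal runs of letters and digits.
TOKEN_SPLIT = re.compile(r"[^A-Za-z0-9]+")


def _split(url):
    """Parse a URL, adding a scheme if missing so the host is recognised."""
    return urlsplit(url if "://" in url else "http://" + url)


def tokenize_url(url):
    """Split host, path and query into lowercase word-like tokens."""
    parts = _split(url)
    text = " ".join([parts.netloc, parts.path, parts.query])
    return [t.lower() for t in TOKEN_SPLIT.split(text) if t]


def bag_of_tokens(url):
    """Token counts for one URL (hypothetical feature representation)."""
    return Counter(tokenize_url(url))


def is_blacklisted(url, blacklist):
    """Check the URL's host against a set of known-bad hosts.

    The set is a stand-in for the cloud-hosted blacklist database.
    """
    return _split(url).hostname in blacklist


def precision_recall(y_true, y_pred):
    """Precision and recall for the malicious class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In such a setup, the token counts would be vectorised and passed to the SVM, while `precision_recall` would be called once per model on the same held-out labels so that the two sets of predictions are compared on equal terms.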
