Phishing attacks cause damage by targeting users' personal information. Phishing involves sending a user an email or directing the user to a phishing page in order to steal personal information. Blacklist-based detection techniques can detect this form of attack; however, these approaches have certain limitations, and the number of people affected has continued to grow. The aim of the proposed machine-learning technique for phishing detection is to classify each URL as either legitimate or phishing. Data availability is key to the proposed solution: any problem with data availability can cost the project accuracy. The data used for model testing must be reliable and representative enough to identify almost all of the websites that a user wants to check. Model consistency is another factor that could cause the project to fail, so the model must accurately determine the true identity of each URL. The technique relies on features of the uniform resource locator (URL). Features that characterize phishing-site URLs have been defined, and the proposed approach uses these features to detect phishing. The approach was evaluated on a dataset of 3,000 phishing-site URLs and 3,000 legitimate-site URLs. The results show that the proposed technique identifies more than 90 percent of phishing sites.
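The abstract does not list the URL features it uses, so the following is only a minimal sketch of what URL-based feature extraction might look like. The function name `extract_url_features` and every feature in it (URL length, dots in the host, an "@" symbol, an IP-address host, a hyphen in the host, use of HTTPS) are illustrative assumptions, not the authors' feature set. The resulting dictionary could be turned into a feature vector for any standard classifier.

```python
import ipaddress
from urllib.parse import urlparse


def _is_ip_address(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Map a URL to a few lexical features.

    The feature set is hypothetical and chosen for illustration only;
    the paper's actual features are not given in the abstract.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None for malformed URLs
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": "@" in url,
        "host_is_ip": _is_ip_address(host),
        "has_hyphen_in_host": "-" in host,
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    # Example URLs are made up for demonstration.
    for example in ("https://www.example.com/",
                    "http://192.168.10.5/secure-login/update"):
        print(example, extract_url_features(example))
```

Keeping the features purely lexical means each URL can be scored without fetching the page, which is one plausible reason to classify on the URL alone.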