Abstract

Criminal activity detection in social networks by text mining is the process of identifying criminal activity and helping law enforcement agencies keep prevailing crimes under control. It relies on the ability of text mining techniques to detect hidden information in a corpus of documents. Text mining is the process of transforming unstructured text into structured data. Unstructured text is easily perceived and processed by humans, but it is hard for machines to understand without algorithms, tools and methods designed to process it effectively. Text mining derives high-quality information from raw data by devising patterns and through statistical pattern learning. It is a multidisciplinary field that relies on data mining, information retrieval, statistics, machine learning, and computational linguistics. The main components of the text mining process of analysis and exploration are natural language processing, information retrieval, information extraction, content analysis, text clustering, and text classification. All of these are applied only after a preprocessing step has been completed. The purpose of preprocessing is to reduce the volume of the textual corpus; the tasks involved are text boundary determination, language-specific stemming, stop-word elimination, and tokenization, together with removing unwanted data and handling missing data. Among these, tokenization is the most important. Tokenization divides the text into individual words, and open-source tools such as spaCy, NLTK (with Python), Gensim and many others are available for it. After preprocessing, a model architecture is defined, the model is fitted on the training data, and it is evaluated on test sample data in order to predict values.
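
The preprocessing tasks named above (tokenization, stop-word elimination, stemming, and handling missing data) can be illustrated with a minimal sketch. It uses only the Python standard library so that it stays self-contained; it is not the authors' pipeline, and in practice spaCy or NLTK would supply the tokenizer, stop-word list and stemmer. The stop-word list, the suffix list and every function name here are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real projects would use
# the lists shipped with NLTK or spaCy.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "on", "was"}

# Illustrative suffixes for a naive stemmer (this is NOT the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem; missing text yields no tokens."""
    if not text:  # handles None and empty strings
        return []
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The dealer was selling phones"))
```

The suffix stripping is crude and will over-stem some words, which is one reason established stemmers are usually preferred; the sketch only shows how the steps chain together and how each one shrinks the corpus.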
