SOME TECHNOLOGIES USED IN DATA SCIENCE

Dr Vimmi Pandey, Mr Prashant Kumar Koshta

doi:10.58532/nbennurch290

Abstract

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It involves the collection, cleaning, processing, and analysis of large and complex data sets to uncover patterns, relationships, and trends that can inform decision-making. The significance of data science lies in its ability to help organizations and individuals make informed decisions based on data. By turning raw data into actionable insights, data science helps organizations optimize their operations, improve customer experiences, develop new products and services, and gain a competitive edge in their respective markets. Data science also plays a critical role in many industries, including finance, healthcare, marketing, retail, and transportation. In all these sectors, a huge amount of data is generated and must be processed and analyzed. Doing this efficiently requires machine learning frameworks such as TensorFlow, Keras, PyTorch, and scikit-learn; programming languages such as Python, R, and SQL; data visualization tools such as Tableau, Power BI, and Matplotlib; and big data tools such as Apache Hadoop, Apache Spark, and Apache Storm. Collaboration and project management tools such as Jupyter Notebook and GitHub, as well as cloud computing platforms, are also required. The main objective of this chapter is to discuss the most common and frequently used technologies, frameworks, and tools, together with their features, applications, strengths, weaknesses, and future scope. These technologies play a crucial role in the field of data science, enabling data scientists to collect, process, analyze, and communicate insights from large amounts of data.

Full Text