ARTIFICIAL INTELLIGENCE-GENERATED ART AND THE QUESTION OF AUTHORSHIP

Abstract

Artificial Intelligence (AI) has rapidly changed artistic production by allowing machines to produce images, music, literature, and multimedia works that resemble human creative work. Recent developments in machine learning, especially deep neural networks, Generative Adversarial Networks (GANs), and diffusion-based models, have expanded what computational systems can do, enabling them to create complex artistic patterns from massive datasets. These developments have also raised critical theoretical, legal, and philosophical questions about authorship, originality, and creative ownership of AI-generated artworks. This paper examines the underlying technical principles of AI-generated art and discusses the processes by which algorithms learn stylistic tropes, generate visual forms, and respond to human intervention through user prompts and parameter adjustment. The paper also discusses the evolving debate over authorship in AI-generated art, which takes into account the roles of programmers, dataset curators, artists, and end users in the creative pipeline. Ethical and cultural considerations are also outlined, such as intellectual property issues, cultural bias in training data, and the possible repercussions for conventional artistic careers. By combining perspectives from computational creativity, the digital humanities, and cultural policy, the study points to a transformative paradigm of creative collaboration between humans and intelligent systems.

Similar Papers
  • Research Article
  • Cited by 20
  • 10.2144/fsoa-2022-0010
Artificial intelligence in interdisciplinary life science and drug discovery research.
  • Mar 8, 2022
  • Future science OA
  • Jürgen Bajorath

  • Discussion
  • Cited by 8
  • 10.1016/j.ejmp.2021.05.008
Focus issue: Artificial intelligence in medical physics.
  • Mar 1, 2021
  • Physica Medica
  • F Zanca + 11 more

  • Research Article
  • Cited by 50
  • 10.1016/j.fertnstert.2020.10.040
Predictive modeling in reproductive medicine: Where will the future of artificial intelligence research take us?
  • Nov 1, 2020
  • Fertility and Sterility
  • Carol Lynn Curchoe + 18 more

  • Book Chapter
  • 10.1108/s1548-643520230000020016
Index
  • Mar 13, 2023

Citation (2023), "Index", Sudhir, K. and Toubia, O. (Ed.) Artificial Intelligence in Marketing (Review of Marketing Research, Vol. 20), Emerald Publishing Limited, Bingley, pp. 309-318. https://doi.org/10.1108/S1548-643520230000020016 Publisher: Emerald Publishing Limited. Copyright © 2023 K. Sudhir and Olivier Toubia. Published under exclusive licence by Emerald Publishing Limited.

Book Chapters
  • Prelims
  • The State of AI Research in Marketing: Active, Fertile, and Ready for Explosive Growth
  • The Economics of Artificial Intelligence: A Marketing Perspective
  • AI and Personalization
  • Artificial Intelligence and Pricing
  • Leveraging AI for Content Generation: A Customer Equity Perspective
  • Artificial Intelligence and User-Generated Data Are Transforming How Firms Come to Understand Customer Needs
  • Artificial Intelligence Applications to Customer Feedback Research: A Review
  • Natural Language Processing in Marketing
  • Marketing Through the Machine's Eyes: Image Analytics and Interpretability
  • Deep Learning in Marketing: A Review and Research Agenda
  • Anthropomorphism in Artificial Intelligence: A Review of Empirical Work Across Domains and Insights for Future Research
  • Index

  • Book Chapter
  • Cited by 9
  • 10.1007/978-3-030-91356-4_18
Detect and Remove Watermark in Deep Neural Networks via Generative Adversarial Networks
  • Jan 1, 2021
  • Shichang Sun + 5 more

Deep neural networks (DNNs) have achieved remarkable performance in various fields. However, training a DNN model from scratch requires substantial computing resources and training data, which are difficult for most individual users to obtain. Model copyright infringement has emerged as a problem in recent years. For instance, pre-trained models may be stolen or abused by illegal users without the authorization of the model owner. Recently, many works on protecting the intellectual property of DNN models have been proposed. In these works, embedding backdoor-based watermarks into DNNs is one of the most widely used methods. However, when the DNN model is stolen, the backdoor-based watermark may face the risk of being detected and removed by an adversary. In this paper, we propose a scheme to detect and remove watermarks in deep neural networks via generative adversarial networks (GANs). We demonstrate that backdoor-based DNN watermarks are vulnerable to the proposed GAN-based watermark removal attack. The proposed attack method includes two phases. In the first phase, we use the GAN and a few clean images to detect and reverse the watermark in the DNN model. In the second phase, we fine-tune the watermarked DNN based on the reversed backdoor images. Experimental evaluations on the MNIST and CIFAR10 datasets demonstrate that the proposed method can effectively remove about 98% of the watermark in DNN models, as the watermark retention rate drops from 100% to less than 2% after applying the proposed attack. Meanwhile, the proposed attack hardly affects the model's performance: the test accuracy of the watermarked DNN on the MNIST and CIFAR10 datasets drops by less than 1% and 3%, respectively.
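
The abstract reports a "watermark retention rate" without giving its formula. The sketch below shows one plausible way to compute it, under the assumption that retention means the fraction of backdoor trigger inputs the model still maps to the watermark's target label. The function name, the toy models, and the trigger data are all hypothetical illustrations and are not taken from the paper.

```python
from typing import Any, Callable, Sequence


def watermark_retention_rate(
    predict: Callable[[Any], int],
    trigger_inputs: Sequence[Any],
    target_labels: Sequence[int],
) -> float:
    """Fraction of trigger inputs still classified as their watermark label.

    Assumption: a watermark is "retained" on an input when the model still
    outputs the label the backdoor was designed to force.
    """
    if len(trigger_inputs) != len(target_labels):
        raise ValueError("trigger_inputs and target_labels must have equal length")
    if not trigger_inputs:
        raise ValueError("at least one trigger input is required")
    hits = sum(1 for x, y in zip(trigger_inputs, target_labels) if predict(x) == y)
    return hits / len(trigger_inputs)


# Hypothetical stand-ins for a model before and after a removal attack.
triggers = list(range(100))       # 100 made-up trigger input ids
targets = [7] * len(triggers)     # every trigger is meant to yield label 7


def watermarked_model(x: int) -> int:
    return 7                      # every trigger still fires


def attacked_model(x: int) -> int:
    return 7 if x == 0 else 3     # only one trigger still fires


before = watermark_retention_rate(watermarked_model, triggers, targets)
after = watermark_retention_rate(attacked_model, triggers, targets)
removed = 1.0 - after / before    # share of the original watermark removed
print(f"retention before={before:.2%}, after={after:.2%}, removed={removed:.2%}")
```

With a real classifier, `predict` would wrap the model's forward pass and argmax. The same function could be run on clean test data with true labels to check that ordinary accuracy is largely preserved.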

  • Research Article
  • 10.48175/ijarsct-25257
A Study on the Impact of AI on Artistic Creation
  • Apr 15, 2025
  • International Journal of Advanced Research in Science, Communication and Technology
  • Aiman Ume + 1 more

This research examines the transformative effect of Artificial Intelligence (AI) on creative work, with a focus on how new technologies are shifting conventional definitions of creativity, authorship, and artistic worth. As machine learning, deep learning, and neural networks continue to advance, AI systems such as Generative Adversarial Networks (GANs) and other deep neural networks can now create original works of visual art, music, literature, and digital media. These advancements challenge the centuries-old assumption that creativity is a uniquely human attribute, as machines begin producing content that is as complex and valuable as human-created art. From AI-painted works of art selling for millions to AI-written music and literature receiving mainstream recognition, this technological advancement poses important questions regarding the future of artistic careers, intellectual property rights, and the ethical limits of machine-generated creativity. The report seeks to explore these changes, providing an inside look at how AI is remaking the art world and how this is affecting artists, audiences, and the wider cultural environment.

  • Book Chapter
  • Cited by 1
  • 10.1016/b978-0-443-15452-2.00011-x
Chapter 11 - Computational intelligence on medical imaging with artificial neural networks
  • Jan 1, 2025
  • Mining Biomedical Text, Images and Visual Features for Information Retrieval
  • Oznur Ozaltin + 1 more

  • Discussion
  • Cited by 1
  • 10.1002/acm2.14456
Embracing Real AI: A call to action for medical physicists in healthcare.
  • Jul 18, 2024
  • Journal of applied clinical medical physics
  • Dee H Wu + 5 more

The article "Embracing Real AI: A Call to Action for Medical Physicists in Healthcare" urges medical physicists to prepare for the integration of artificial intelligence (AI) into healthcare practices, emphasizing their pivotal role in adapting to technological advancements. The authors advocate for embracing AI through advocacy, broadening perspectives, and enhancing coordination and communication. They propose an ABC strategy focusing on increasing educational initiatives, fostering interdisciplinary collaboration, and creating team collaboration to facilitate AI integration. The commentary highlights AI's potential in enhancing diagnostics, personalizing medicine, and automating routine tasks while addressing challenges such as data sharing and the role of federated learning. The article calls for medical physicists to lead in embracing AI, emphasizing continuous learning and collaboration to leverage its potential for improving healthcare and patient care.

Medical physicists have consistently demonstrated strong interest in developing proficiency in the adoption of new technological advancements. The roots of the profession come from the radiation sciences, including radiation protection, radiation therapy, diagnostic imaging, and nuclear medicine [1]. As science and technology continued to evolve, medical physicists' roles have extended into other non-radiation domains, such as non-ionizing-radiation-based imaging (ultrasound and magnetic resonance), molecular imaging, computer aided diagnosis (CAD), information technologies, and data science [2]. In addition, medical physicists have gradually adopted increasingly active roles in ensuring the professional education of other radiology/radiation oncology team members and in maintaining high quality standards via quality assurance (QA) methods. They also play a major role in advising the hospital management on medical devices and software acquisition.
The continuing expansion of these roles and responsibilities has put medical physicists at the forefront of embracing emerging technologies, making the profession one of the most technical and versatile in healthcare settings. Currently, as our field grows in importance, we medical physicists seek to continue to engage in significant ways for increased contributions and roles in human health. This commentary/opinion urges medical physicists to prepare for their expanding roles in the field of AI and its implementation and oversight in clinical practice. Medical physicists must embrace "Real AI" to help integrate AI into healthcare practices. Conceptually we advocate for a strategy that involves Real AI through advocacy, broadening, and enhancing coordination/communication (an ABC strategy). In our current and future work medical physicists will use AI to automate routine tasks, allowing medical physicists to focus on more complex tasks. Furthermore, Medical Physics will use AI to enhance efficiency, safety, diagnostic and therapeutic applications, and for personalized medicine. However, as we have done in the past with other complex concepts (such as radiation), medical physicists need to be prepared for the potential risks and ethical dilemmas associated with AI, such as bias and lack of transparency. It will be important that Medical Physicists prepare for the rapidly changing AI landscape, and continue learning, gain hands-on experience, and collaborate with other AI experts in the healthcare environment. This paper aligns with the already approved guidance document developed by the AAPM in conjunction with the International Atomic Energy Agency (IAEA) [3] that discusses how medical physicists can ensure the effective implementation and management of AI systems. It is crucial for the Clinical Quality Management Program (CQMP) personnel to receive regular training and updates on relevant guidelines and legislation.
Clear communication channels should be established with IT experts, vendors, and other stakeholders for smooth coordination.4 Comprehensive documentation should be developed to ensure compliance with contractual obligations and guidelines. The clinical team should be involved in acceptance testing and discussions, depending on the clinical purpose of the AI system.4 Protocols for data collection and curation should be established, along with the development of standardized validation datasets for performance evaluation.4 A system for monitoring updates to AI systems and models should be implemented, with the CQMP leading new acceptance/commissioning rounds for any updates. Lastly, mechanisms for continuous evaluation and improvement of the CQMP processes should be established, which could involve regular audits, feedback mechanisms from end-users, and incorporating lessons learned from previous rounds.4 Nowadays, major healthcare systems in the US consider their data as immensely valuable assets that require rigorous protection to ensure Health Insurance Portability and Accountability Act (HIPAA) compliance, as well as intellectual property considerations. It can be very difficult for researchers to share clinical data with vendors for development purposes without a significant return being specified to the institution, such as joint intellectual property or substantial grant funding. Instead, these healthcare systems encourage their researchers to commercialize their findings independently, allowing the institution to retain full rights to intellectual property. That said, the realization of federated learning would be a significant advancement. To achieve this, a powerful pre-trained model that would be adaptable to operation on different scales and in various clinical scenarios is necessary. It is plausible that local adaptation may not require substantial computing power or AI expertise. 
This concept is particularly intriguing and could be beneficial to smaller centers and clinics in underserved areas. However, the primary challenge is the cost. As we become more reliant on AI systems like OpenAI's ChatGPT or Google Gemini, we often overlook the fact that these conveniences come with a hefty price tag, costing billions of dollars to develop and maintain.5 As medical physicists, we and other healthcare professionals can anticipate that AI will significantly transform healthcare, improving efficiency, accuracy, the level of detail that can be extracted from imaging, and methods of therapy. These technological advancements are expected to bring immense value to the field, offering a new horizon in diagnostic and therapeutic capabilities. Yet we must also recognize that AI introduces potentially significant risks and ethical dilemmas. One of the primary concerns is the possibility of bias in AI, which can stem from the training data, the algorithms, or their application, leading to potentially detrimental effects on patient care. As medical physicists, we should acknowledge that the complexity and lack of transparency in AI decision-making processes present obstacles to accountability and to rectifying errors, and that they require greater oversight and responsibility. The integration of AI also has great capacity to redefine the role of medical physicists, impacting education and employment within the field. Addressing these issues necessitates the creation of ethical standards for AI in healthcare, emphasizing transparency, responsibility, and equity, with contributions from diverse stakeholders, including patients, medical professionals, and ethicists.6 Such measures are crucial to ensure the responsible utilization of AI in healthcare, and ultimately serve the best interests of patients and society. 
We anticipate that continued guidance from our professional societies will be helpful as our collective communities develop methods and approaches that help us learn, adopt, and employ AI responsibly. Advocacy: increase educational initiatives and public awareness, and recommend processes at all levels of the clinical workforce, as well as patient engagement. Broadening Perspectives: encourage interdisciplinary collaborations that allow medical physicists to work with professionals from other disciplines, such as computer science, data science, and biomedical engineering, to gain insights into different perspectives on AI applications in healthcare. This enables medical physicists to provide continuing education and connect the community with research opportunities. Improving Coordination and Communication through creating team collaboration: enhance communication with healthcare professionals, administrators, and patients by clearly defining and articulating the role of medical physicists in AI applications. Promote the sharing of knowledge, for example by contributing to data repositories, to further build the foundation of our understanding and application of AI in the field. We consider the concept of Real AI in our context to be aimed at providing and/or qualifying a ready AI product that has undergone a rigorous QA process, that is free of false additives and biases, with data carefully curated to represent the demographics and be attuned to the needs of the clinic, sourced with proper ingredients, and abiding by laws and regulations that can ensure the product serves the common health needs of patients and benefits the public's interest. 
What AI 'is' and what it 'is not' is a complex topic that warrants further exploration and understanding, but one vital for comprehension of what utility AI can fulfill in the clinical process, what its advantages and limitations are, and how it can be curated to perform in the clinical scenarios relevant to a particular radiology/radiation oncology practice. Multiple data-analysis algorithms have been created over the course of years, and not all of them qualify as AI.7 What distinction(s) lie in what constitutes AI? One possible interpretation is that AI is a system that can adapt to new data, or a system that generates insights driven by data. AI systems are designed to "learn" and adapt to new data and be stable over the course of introducing data perturbations or employ model adaptation mechanisms. AI systems can adjust the underlying data-processing mechanisms based on the input they receive, which allows them to improve their performance and make more accurate predictions or decisions over time. This is often achieved through techniques such as machine learning, where algorithms are trained on a dataset and then used to make predictions or decisions without being explicitly programed to perform the task.8 Understanding how such datasets are selected, what data needs to be fed into AI model to achieve desired results, and how to prevent common pitfalls and ethical conundrums associated with the use of AI models requires additional training that might yet be lacking in the traditional training of the radiology/radiation oncology adjacent specialists. The scope of involvement of each member of the team when it comes to AI integration into the clinic continues to be determined as the field rapidly evolves. When it comes to the role of medical physicists in conjunction with AI, an open discussion of the exact responsibilities is still ongoing, and feedback is encouraged from all the members of the community. So, what can medical physicists do? 
They can use AI to enhance quality improvement and safety by analyzing medical data to identify trends, patterns, and outliers.9 This can lead to the identification of areas for improvement or potential safety hazards and help them enter the realm of Responsible AI. AI can also improve diagnostic and therapeutic techniques by enhancing the quality of medical imaging and automating image interpretation.10 Furthermore, AI can help in integrating diagnostics, personalized medicine, and theragnostics by analyzing large datasets to tailor treatment plans to individual patients.11 This can lead to more effective and personalized care. AI can also automate routine tasks in medical physics, such as treatment planning and QA processes, leading to increased efficiency.12 Lastly, AI techniques like machine learning and deep learning can be leveraged for research and development to analyze complex datasets, discover patterns, and develop innovative techniques for disease detection, treatment, and monitoring.13 Whether it involves developing AI-driven solutions such as automated segmentation and dose calculation, addressing intricate problems in the clinic, or potentially even contributing to open-source AI initiatives, such activities will empower medical physicists to enhance their skills and make tangible contributions to the advancement of healthcare. Embracing AI not only fosters a sense of accomplishment but also opens doors to the world of 'automation' and scaling that will pervade all technologies of the future. The AHAIBC committee is at the center of bringing the medical physicist forward by developing curriculum concepts, bootcamps, and engendering engagement for our society. Integration of AI into the realm of medical physics education is critical, especially considering the potential significance of incorrect AI usage or misapplication. 
The physicist is responsible for installing and commissioning the AI software, ensuring the modeling is not biased, performing continuing QA on the hospital data and processes, and establishing efficient resource management. Embracing education in AI offers new benefits for medical physicists as it is already revolutionizing various industries and professional practices and we need to be equally prepared. One way to engage and prepare healthcare professionals for the upcoming AI wave is to start with the roots of quality safety and assurance. To do this, we should enable a comprehensive QA program that encompasses all clinical operations related to medical fields including radiology, nuclear medicine, and radiation oncology. Ensuring the safe operation of hardware, software, clinical operation processes and machinery is of utmost importance and one of the most crucial responsibilities of a medical physicist. A Real AI approach can be highly beneficial in achieving the goal of safe clinical implementation. Understanding the potential and limitations of AI serves as a cornerstone for fostering engagement not only within our profession but with other healthcare providers. Continuous learning and participation in hands-on experience are essential components for navigating the complexities of AI applications within healthcare. Collaboration, networking, and exploring AI's purpose and impact are equally vital in this journey. Additionally, some physicists may choose personal projects, embracing challenges in small groups, and actively contributing to AI-focused teams to amplify the motivation and expertise of our field. Insights through personal and collaborative opportunities ultimately provide for and encourage professional growth and innovation within our medical physics field. 
Some medical physicists may be able to attend specialty meetings and conferences dedicated to AI, which further enrich their knowledge base and provide them avenues for fruitful collaboration. There are successful educational programs such as the Radiological Society of North America Artificial Intelligence (RSNA AI)-certificate program.14 Interdisciplinary cooperation and inter-institutional collaboration among AI experts are of paramount importance for integrating AI into medical physicists' practice on a larger scale, and mechanisms enabling this collaboration should be provided to the community. In summary, the authors believe that being prepared for and embracing the changes that AI is already bringing will benefit our community, healthcare, patient care, and society at large, both immediately and in the future. We are at a critical juncture, which can be considered a fourth industrial revolution, where AI and automation are applied more broadly. Medical physicists have a pivotal role to play in this revolution. We need to position ourselves at the forefront of 'Real AI' and lead the charge in this exciting new era. It is time for action, and we can take the first steps with potentially just a few ABCs. All authors contributed their efforts in writing and editing this call for action. The ChatGPT search engine was used to provide additional background on the subject matter for illustrative purposes. The authors appreciate members of the Ad. The authors declare no conflicts of interest. The content for this call for action has been edited with the help of the large language models ChatGPT and Google NotebookLM.

  • Research Article
  • 10.52783/jisem.v10i31s.5123
Artificial Intelligence Driven Gender Based Text-to-Speech System (TTS) Using Deep Learning Algorithms
  • Apr 2, 2025
  • Journal of Information Systems Engineering and Management
  • Bechoo Lal

Introduction: In this research article, the researcher proposes a gender-based Text-to-Speech (TTS) system that uses specific voices to simulate either male or female speech, offering a variety of voices and accents to match the user's needs. These systems can be integrated into various applications to create more natural-sounding and customized speech outputs. The main goal is to make synthetic speech more plausible and natural sounding, closer to human speech in terms of suitability for the context and sound quality. Objectives: This research study aims to extend the limits of TTS technology by using AI, taking into account both the ethical issues and the technological constraints that must be overcome for general deployment. The ultimate objective is to create TTS systems that are widely accessible, morally acceptable, and able to be mistaken for human speech. Methods: The researcher implemented deep neural networks (DNNs), such as recurrent and convolutional neural networks (RNNs and CNNs), along with more recent designs like Transformers and Generative Adversarial Networks (GANs). This research article investigates how AI technologies, deep learning in particular, can overcome these obstacles in a revolutionary way. Implementing and assessing cutting-edge AI approaches to enhance TTS systems is the basis of this study. Essential to this is the use of deep neural networks (DNNs), such as recurrent and convolutional neural networks (RNNs and CNNs), along with more recent designs like Transformers and Generative Adversarial Networks (GANs). Results: The AI-driven predictive model, based on a CNN, achieved an overall accuracy of 85%. Performance can most likely be improved by fine-tuning, adding more training data, and possibly balancing the target classes 50/50 instead of 60/40. The study explores the shortcomings of conventional TTS systems, especially with regard to their ability to generate speech that is both natural and understandable. 
Conclusions: The researcher concluded that improving the naturalness and intelligibility of text-to-speech (TTS) systems using AI has been an extensive and fruitful endeavor, with several goals set out to do just that. Incorporating contextual and situational awareness, improving speech naturalness and intelligibility, prioritizing user-centric assessment and optimization, and ensuring the ethical and responsible deployment of AI in TTS systems were all stated goals of this research. Extensive testing and analysis, however, revealed that prosody modeling is essential for improving speech realism. The researcher achieved an overall accuracy of 85%. Performance can most likely be improved by fine-tuning, adding more training data, and possibly balancing the target classes 50/50 instead of 60/40.

  • Research Article
  • Cite Count Icon 6
  • 10.2118/1019-0065-jpt
Artificial Intelligence Improves Seismic-Image Reconstruction
  • Oct 1, 2019
  • Journal of Petroleum Technology
  • Chris Carpenter

This article, written by JPT Technology Editor Chris Carpenter, contains highlights of the open-submission paper “Artificial Intelligence for Seismic-Image Reconstruction,” by Yogendra Narayan Pandey, SPE, and Govind Chada, Prabuddha, and Tejas Karmarkar, Oracle Cloud Infrastructure. The paper was not presented at an SPE conference and has not been peer-reviewed. Seismic imaging provides vital tools for the exploration of potential hydrocarbon reserves and subsequent production-planning activities. The acquisition of high-resolution, regularly sampled seismic data may be hindered by physical or financial constraints, which lead to undersampled, sparse seismic data. However, if seismic data are available at a higher resolution and sampled evenly throughout the region of interest, the generated 3D models of petrophysical properties could be improved. Such improvements would show potential benefits through the successive steps of reservoir modeling and production planning. Traditional Approaches Traditional methods used to overcome the previously mentioned data-quality issues can be divided broadly into three categories. Wave-equation-based methods. These methods use physics-based wave-propagation equations, using velocity models to reconstruct missing seismic traces. Domain-transform methods. These are data-driven methods that involve transformation of data between different domains, such as time and frequency. Prediction-error filters. These methods use a filter that learns from the known seismic data and constructs missing seismic data. Given recent advances in the field of artificial intelligence (AI), it is worth examining whether AI methods also can be useful in the task of seismic-data reconstruction. Generative Adversarial Networks (GANs) GANs are a recent addition to the field of solution techniques. A variant of GANs, conditional generative adversarial networks, was used to test the efficacy of GANs for seismic-data reconstruction. Fig. 1 shows a schematic of a GAN model for seismic-data reconstruction. To effectively reconstruct a seismic image, a large number of 2D seismic images are fed into the GAN during the training process. A GAN model consists of two main components: Generator. This is a deep convolutional neural network (CNN) that uses random noise as an input and generates an image by expanding the input through a series of deconvolutions. As shown in Fig. 1, the generator network is fed 2D seismic images, with a portion of these masked to replicate missing traces. The generator network tries to reconstruct these masked portions of the 2D seismic images. Discriminator. This is also a deep CNN that is shown the images reconstructed by the generator network and the original seismic images from which some parts were masked. The discriminator’s job is to distinguish the images reconstructed by the generator from the original seismic images. As training progresses, the generator network tries to create images that the discriminator will not be able to differentiate from the original seismic images. At the same time, the discriminator’s ability to distinguish the generated images from the original images continuously improves. As a result, the generator begins producing reconstructed seismic images that look similar to the original ones.
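The masking step described above (hiding a portion of each 2D seismic image to imitate missing traces) can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: the function names `mask_traces` and `reconstruction_error`, the choice to treat each column as one trace, the zero fill value, and the mean-absolute-error score are all assumptions made here.

```python
import random


def mask_traces(section, fraction, seed=0):
    """Zero out a random subset of columns (assumed to be traces) in a 2D section.

    Returns the masked copy and the sorted list of masked column indices.
    """
    n_traces = len(section[0])
    rng = random.Random(seed)
    n_masked = int(round(fraction * n_traces))
    masked_cols = sorted(rng.sample(range(n_traces), n_masked))
    hidden = set(masked_cols)
    masked = [
        [0.0 if j in hidden else value for j, value in enumerate(row)]
        for row in section
    ]
    return masked, masked_cols


def reconstruction_error(original, reconstructed, masked_cols):
    """Mean absolute error, measured only on the masked traces."""
    total, count = 0.0, 0
    for o_row, r_row in zip(original, reconstructed):
        for j in masked_cols:
            total += abs(o_row[j] - r_row[j])
            count += 1
    return total / count if count else 0.0


# Toy 2x4 "section": rows are time samples, columns are traces (hypothetical data).
section = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
masked, cols = mask_traces(section, fraction=0.5)
```

In a GAN setup of the kind described, pairs such as `(masked, section)` would serve as generator input and discriminator reference; scoring only the masked columns keeps the error focused on the traces the generator actually had to fill in.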

  • Research Article
  • Cite Count Icon 7
  • 10.9734/ajrcos/2024/v17i12533
Exploring Generative AI: Models, Applications, and Challenges in Data Synthesis
  • Dec 13, 2024
  • Asian Journal of Research in Computer Science
  • S Ramalakshmi + 1 more

Generative AI has emerged as a transformative field within artificial intelligence, enabling the creation of new data that mimics real-world information and expands the boundaries of what machines can autonomously generate. This study discusses the various models of generative AI, focusing on Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Auto-Regressive models, each offering distinct approaches and strengths in data generation. VAEs excel in learning latent representations, making them ideal for applications like anomaly detection and data imputation. GANs, renowned for their high-quality image synthesis, have found extensive use in tasks ranging from text-to-image conversion to super-resolution. Auto-Regressive models, on the other hand, are particularly effective in sequential data generation, such as text generation, music composition, and time series prediction. The paper highlights key applications of these models across diverse domains, including image synthesis, text generation, drug discovery, and simulation tasks in fields like healthcare, finance, and entertainment. Additionally, the study emphasizes the evaluation metrics (also called comparative parameters) that are crucial for assessing the performance of generative models, such as perceptual quality metrics, Inception Score (IS), and Fréchet Inception Distance (FID), which provide quantitative insights into the quality and diversity of generated data. This study employs a systematic methodology comprising a comprehensive literature review, strategic search queries, and thematic data synthesis to explore generative AI. Key areas of focus include models (VAE, GAN, auto-regressive, flow-based), applications, evaluation techniques, challenges, and recent advances. The analysis identifies emerging trends, novel methods, and critical gaps in the field. 
This study also compares the performance of three generative AI models across comparative parameters such as Data Type, Applications, Training Complexity, Output Quality, Interpretability, Limitations, Advantages, Computational Cost, and Scalability. Generative AI raises ethical concerns, including biases in training data that perpetuate stereotypes and marginalization. It can be misused for harmful purposes like creating deepfakes or spreading misinformation, impacting trust and privacy. Questions of accountability and ownership arise when AI-generated content infringes on intellectual property or causes harm. Addressing these issues is essential for responsible AI deployment.
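To make the Fréchet-style metric mentioned in this abstract more concrete, the sketch below computes the squared Fréchet distance between two univariate Gaussians fitted to sample lists. It is a deliberately simplified illustration: the real FID is computed on multivariate Inception-network features with full covariance matrices, which this example does not attempt. The function name `frechet_distance_1d` and the toy numbers are assumptions introduced here. For one-dimensional Gaussians the formula reduces to

$$d^2 = (\mu_r - \mu_g)^2 + (\sigma_r - \sigma_g)^2 .$$

```python
import statistics


def frechet_distance_1d(real, generated):
    """Squared Frechet distance between 1D Gaussians fitted to two samples.

    Uses population statistics; lower values mean the two samples have
    closer means and spreads.
    """
    mu_r, mu_g = statistics.fmean(real), statistics.fmean(generated)
    sd_r, sd_g = statistics.pstdev(real), statistics.pstdev(generated)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2


# Hypothetical feature values for "real" and "generated" samples.
real_features = [0.0, 2.0]
generated_features = [1.0, 3.0]
distance = frechet_distance_1d(real_features, generated_features)
```

Here both samples have the same spread and their means differ by one, so the distance is driven entirely by the mean term.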

  • Research Article
  • Cite Count Icon 47
  • 10.3970/cmc.2018.03950
Embedding Image Through Generated Intermediate Medium Using Deep Convolutional Generative Adversarial Network
  • Sep 13, 2018
  • Cmc-computers Materials & Continua
  • Chuanlong Li

Deep neural networks have proven to be very effective in computer vision. Deep convolutional networks can learn the most suitable features of certain images without specific measure functions and outperform many traditional image processing methods. The generative adversarial network (GAN) is becoming one of the highlights among these deep neural networks. A GAN is capable of generating realistic images whose synthetic origin is imperceptible to the human visual system, so the generated images can be used directly as an intermediate medium for many tasks. One promising application of GAN-generated images is image concealing, which requires that the image carrying the embedded content appear untampered to the human visual system and remain undetectable to most analyzers. Texture synthesis has drawn considerable attention in computer vision and is used for image concealing in steganography and watermarking. Traditional methods that use synthesized textures for information hiding mainly select features and mathematical functions by human-defined metrics and usually have a low embedding rate. This paper takes advantage of the generative network and proposes an approach for synthesizing complex texture-like images of arbitrary size using a modified deep convolutional generative adversarial network (DCGAN), and then demonstrates the feasibility of embedding another image inside the generated texture while the difference between the two images is nearly invisible to the human eye.

  • Book Chapter
  • 10.1007/978-3-031-15175-0_33
Training Generative Adversarial Networks (GANs) Over Parameter Server and Worker Node Architecture
  • Jan 1, 2023
  • Amit Ranjan + 1 more

The latest technological development in the field of artificial intelligence (AI) is the learning and widespread use of different Generative Adversarial Network (GAN) applications. GANs have made progress in numerous applications such as image editing, style transfer, scene generation, and so on. However, these generative models demand heavy computation because a GAN is made up of two deep neural networks and trains on huge datasets. As with other AI models, GANs also face the problem of insufficient data when training for some real-world situations. In numerous situations, the available data may be restricted and distributed over various worker nodes (i.e., end users), where the local datasets are intrinsically private and the workers ultimately do not want to share them. In this chapter, we address the issue of training GANs in a distributed way so that they can train over datasets that are distributed across various worker nodes. We have developed a training framework for GANs under the parameter server and worker node setting. Under this framework, the workers can produce results similar to real data while keeping training fully distributed and their information confidential. Test results obtained with the CIFAR-10 dataset indicate that our architecture can produce high-quality data samples that look similar to real data and can be used in various real-life applications.
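The parameter server and worker node setting mentioned in this abstract can be illustrated with a toy example. The sketch below is not the chapter's GAN framework: it assumes a synchronous scheme in which each worker computes a gradient on its private data, and the server averages those gradients to update one shared parameter. The model (a single weight `w` fitted so that `y ≈ w * x`), the function names, and the toy datasets are all hypothetical.

```python
def local_gradient(w, data):
    """Gradient of the mean squared error for y ~ w * x on one worker's private data."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def server_step(w, worker_datasets, lr):
    """Server collects one gradient per worker and applies their average.

    Only gradients cross the worker boundary; raw data stays local.
    """
    grads = [local_gradient(w, data) for data in worker_datasets]
    return w - lr * sum(grads) / len(grads)


def train(worker_datasets, lr=0.05, steps=200):
    """Run synchronous rounds of gradient averaging, starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        w = server_step(w, worker_datasets, lr)
    return w


# Two workers with private samples drawn from y = 2x (hypothetical data).
workers = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w_final = train(workers)
```

In a GAN setting the shared state would be the generator and discriminator weights rather than a single scalar, but the exchange pattern (local computation, central aggregation) is the same idea.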

  • Research Article
  • Cite Count Icon 1
  • 10.54097/hset.v62i.10427
Comparative Study of Deep Learning Neural Networks for Image Classification
  • Jul 27, 2023
  • Highlights in Science, Engineering and Technology
  • Lanqin Huang

Image classification plays a crucial role in image recognition in present times. Researchers have developed numerous methods for achieving image classification through long-term research. Deep learning, as one of the most important methodologies, is used to classify the contents of an image. This paper provides an overview of four typical deep learning neural networks used for image classification, including convolutional neural networks (CNN), recurrent neural networks (RNN), generative adversarial networks (GAN), and deep neural networks (DNN). The architecture, function, and applications of each network are discussed and compared. CNN is commonly used for image recognition, while RNN is suitable for speech recognition, natural language processing, and time series forecasting. DNN is ideal for handling high-dimensional data, and GAN can generate new data samples. StyleGAN is introduced as an application of GAN that can produce high-quality images. Finally, future work and challenges in image recognition are discussed.

  • PDF Download Icon
  • Research Article
  • Cite Count Icon 9
  • 10.1038/s41598-024-63285-4
Imbalanced spectral data analysis using data augmentation based on the generative adversarial network
  • Jun 9, 2024
  • Scientific Reports
  • Jihoon Chung + 5 more

Spectroscopic techniques generate one-dimensional spectra with distinct peaks and specific widths in the frequency domain. These features act as unique identities for material characteristics. Deep neural networks (DNNs) have recently been considered a powerful tool for automatically categorizing experimental spectral data by supervised classification to evaluate material characteristics. However, most existing work assumes balanced spectral data among the various classes in the training data, contrary to actual experiments, where the spectral data are usually imbalanced. Imbalanced training data degrade supervised classification performance, hindering understanding of the phase behavior, specifically the sol-gel transition (gelation), of soft materials and glycomaterials. To address this issue, this paper applies a novel data augmentation method based on a generative adversarial network (GAN) proposed by the authors in their prior work. To demonstrate the effectiveness of the proposed method, actual imbalanced spectral data from Pluronic F-127 hydrogel and Alpha-Cyclodextrin hydrogel are used to classify the phases of the data. Our approach improves on existing data augmentation methods by 8.8%, 6.4%, and 6.2% on average in the classifier’s F-score, Precision, and Recall, respectively. The method consists of three DNNs: the generator, discriminator, and classifier. It generates samples that are not only authentic but also emphasize the differentiation between material characteristics to provide balanced training data, improving the classification results. Based on these validated results, we expect the method to find broader application in addressing imbalanced measurement data across diverse domains in materials science and chemical engineering.
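The three metrics reported in this abstract (F-score, Precision, and Recall) follow standard definitions, sketched below for a single positive class. The function name `precision_recall_f1` and the toy "sol"/"gel" labels are illustrative assumptions; the paper's own evaluation code and averaging scheme are not given in the abstract.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical phase labels; "gel" is the minority class.
y_true = ["gel", "gel", "sol", "sol", "sol"]
y_pred = ["gel", "sol", "sol", "sol", "gel"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="gel")
```

With imbalanced classes, reporting these per-class scores is more informative than plain accuracy, because a classifier that always predicts the majority class can still score a high accuracy while having zero recall on the minority class.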
