AltCanvas: A Tile-Based Editor for Visual Content Creation with Generative AI for Blind or Visually Impaired People
People with visual impairments often struggle to create content that relies heavily on visual elements, particularly when conveying spatial and structural information. Existing accessible drawing tools, which construct images line by line, are suitable for simple tasks like math but not for more expressive artwork. On the other hand, emerging generative AI-based text-to-image tools can produce expressive illustrations from descriptions in natural language, but they lack precise control over image composition and properties. To address this gap, our work integrates generative AI with a constructive approach that provides users with enhanced control and editing capabilities. Our system, AltCanvas, features a tile-based interface enabling users to construct visual scenes incrementally, with each tile representing an object within the scene. Users can add, edit, move, and arrange objects while receiving speech and audio feedback. Once completed, the scene can be rendered as a color illustration or as a vector for tactile graphic generation. Involving 14 blind or low-vision users in design and evaluation, we found that participants effectively used the AltCanvas workflow to create illustrations.
- Research Article
- 10.30857/2786-5371.2025.2.8
- Jul 31, 2025
- Technologies and Engineering
In the context of the modern information war, the development of effective methods to counter disinformation and hostile propaganda has become critically important. The amount of false information and the speed of its dissemination necessitated the implementation of automated systems involving artificial intelligence to optimise the processes of creating visual counter-propaganda content. This research aimed to develop a methodology for the integration of artificial intelligence technologies into the processes of creating effective visual tools for countering disinformation, taking into account the principles of graphic design and the psychology of visual information perception. The research was based on a comprehensive approach that combined theoretical analysis of scientific literature, comparative analysis of the neural networks Midjourney, Stable Diffusion, and DALL-E, semiotic analysis of visual materials, as well as experimental implementation of the developed system for the automated creation of counter-propaganda visual content. A comprehensive approach has been developed for the creation of visual tools to counter disinformation, which combines the capabilities of automated information collection systems, artificial intelligence algorithms for generating graphic content, and the principles of effective graphic design. It has been found that the use of artificial intelligence in graphic design has optimised up to 20% of routine tasks in the creation of visual content, allowing designers to focus on the strategic and creative aspects of development. The developed recommendations for the use of artificial intelligence in graphic design may be implemented by state institutions, media, and public organisations to respond promptly to information threats.
- Research Article
1
- 10.7256/2454-0625.2024.2.69753
- Feb 1, 2024
- Культура и искусство
This article is part of a larger study of design as a cultural phenomenon. In this part of the study, the author examines the process that is currently taking place in the semiotic structure of design, associated with the active introduction of neural networks into the creation of visual content. Artificial intelligence products stylistically take the design away from the fourth-order simulacrum (from the flat design style) and return the design to using the third-order simulacrum as the main iconic form. The object of the research is the transformation of the semiotic design system. The subject of the study is a return to the third-order simulacrum in modern design. The purpose of the study is to show and try to explain how in modern design there is a return to the use of a third-order simulacrum. The research method is a semiotic analysis of modern design based on the methodology of R. Barthes's semiotic analysis. The study is also based on J. Baudrillard's theory of semiosis in hyperreality and three orders of simulacra. The author sees the philosophical justification of the art of neural networks in the concept of flat ontologies. The study of the semiotic structure of design allows us to see that the logic of design development carries this phenomenon through successive stages of semiosis associated with a decrease in meaning and a diminution of being. In its development, the design consistently uses first index signs, a second-order simulacrum, then a third-order simulacrum as the main sign form. Nowadays, the main iconic form in design has become a fourth-order simulacrum. Next, design had to either end as a profession and phenomenon, or move into a new cultural paradigm that was not related to simulation. However, unexpected transformations have begun to occur in design, due to the active involvement of non-human agents (neural networks) in the creation of visual content. Neural network products are a typical example of a third-order simulacrum.
Thanks to the use of neural networks, modern design finally acquires the vector of transhumanism and closes in simulation.
- Conference Article
31
- 10.1145/3379337.3415845
- Oct 20, 2020
Despite the ubiquity of direct manipulation techniques available in computer-aided design applications, creating digital content remains a tedious and indirect task. This is because applications require users to perform numerous low-level editing operations rather than allowing them to directly indicate high-level design goals. Yet, the creation of graphic content, such as videos, animations, and presentations, often begins with a description of design goals in natural language, such as screenplays, scripts, and outlines. Therefore, there is an opportunity for language-oriented authoring, i.e., leveraging the information found in the structure of a language to facilitate the creation of graphic content. We present a systematic exploration of the identification, graphic description, and interaction with various linguistic structures to assist in the creation of visual content. The prototype system, Crosspower, and its proposed interaction techniques enable content creators to indicate and customize their desired visual content in a flexible and direct manner.
- Research Article
- 10.31891/2307-5732-2024-333-2-34
- Apr 25, 2024
- Herald of Khmelnytskyi National University. Technical sciences
To date, the use of applications for the generation of illustrative material with the help of artificial intelligence (AI) is one of the most progressive approaches to creating content in the field of visual occupations (designers, architects, artists), for marketers, students and ordinary people. The main reason for using neural networks is to save time and create inspiring examples in any field of human activity. Currently, there are more than 20 independent programs that generate visual content, and many companies such as Adobe and Canva use neural network tools. The use of artificial intelligence is irreversible and requires practice and some experience in working with it to create visual content. The development and updating of the main programs in this direction, such as Midjourney and Leonardo, is constant and needs to be studied. The photo-realism and image detailing of the latest versions of Midjourney allow you to create visual content that can be used to generate new ideas and to create quality visual content for advertising. The article compares the quality parameters of the visual content created on the basis of artificial intelligence by the Midjourney program using its different versions. The updates from the initial version to the latest V6.0 were phased in less than a year and a half. Certain deviations in the images that make their further use impossible are considered. The parameters of writing prompts that affect the final quality of the generated image are analyzed, along with the possibility of writing complex prompts with one and several images, as well as advanced prompts. The possible use of neural networks that work with text for targeted writing of appropriate prompts for better automation of the generation process is considered.
The article considers the possible use of the Midjourney program for various spheres of human activity that use various images, and also raises the question of the role of a person as the main creator and generator of creative ideas.
- Research Article
182
- 10.9781/ijimai.2023.07.006
- Dec 1, 2023
- International Journal of Interactive Multimedia and Artificial Intelligence
Artificial Intelligence has become a focal point of interest across various sectors due to its ability to generate creative and realistic outputs. A specific subset, generative artificial intelligence, has seen significant growth, particularly in late 2022. Tools like ChatGPT, Dall-E, or Midjourney have democratized access to Large Language Models, enabling the creation of human-like content. However, the concept of "Generative Artificial Intelligence" lacks a universally accepted definition, leading to potential misunderstandings. While a model that produces any output can be technically seen as generative, the Artificial Intelligence research community often reserves the term for complex models that generate high-quality, human-like material. This paper presents a literature mapping of AI-driven content generation, analyzing 631 solutions published over the last five years to better understand and characterize the Generative Artificial Intelligence landscape. Our findings suggest a dichotomy in the understanding and application of the term "Generative AI". While the broader public often interprets "Generative AI" as AI-driven creation of tangible content, the AI research community mainly discusses generative implementations with an emphasis on the models in use, without explicitly categorizing their work under the term "Generative AI".
- Research Article
- 10.36893/iej.2025.v54i3.007
- Jan 1, 2025
- Industrial Engineering Journal
The rapid advancements in generative AI have led to the development of dedicated models for content, image, music, and video creation. However, customers often face difficulties in switching between tools to meet multi-modal content generation needs. ZumbleBot bridges this gap by combining content, image, music, and video creation into one integrated platform. Using cutting-edge Hugging Face pre-trained AI models like Qwen for content, Stable Diffusion for images, MusicGen for music, and text-to-video models, ZumbleBot uncouples creative workflows and enhances openness. The platform constitutes a literary insight and creates returns over unique groups of media while ensuring proper coherence. This article analyzes the engineering, model integration, and application of ZumbleBot, as well as its uses in content creation, education, and advertising. We also examine the challenges of multimodal AI generation and suggest solutions to optimize performance and maintain output quality. ZumbleBot addresses a step toward steady, expert, and astutely AI-powered imagination. With the use of cutting-edge generative AI, ZumbleBot redefines multi-modal creativity, making content generation with AI more accessible and efficient.
- Research Article
- 10.55041/isjem04361
- Jun 8, 2025
- International Scientific Journal of Engineering and Management
This study investigates how visual content influences marketing effectiveness on LinkedIn, focusing on its role in enhancing engagement, improving brand perception, and amplifying message clarity. As LinkedIn evolves from a professional networking site to a comprehensive platform for B2B communication and digital branding, organizations are increasingly leveraging it to share updates, build authority, and foster professional relationships. Visual content—such as infographics, short videos, graphics, carousels, and animations—has emerged as a key tool in capturing user attention and improving the delivery of complex information. The research combines both primary and secondary data sources. A survey was conducted among professionals from sectors including IT, marketing, education, and human resources, aiming to understand how visual elements affect content preference, engagement behaviour, and memory retention. Findings suggest that visual content significantly outperforms text-based posts in driving interactions such as likes, comments, shares, and click-through rates. Infographics are particularly effective for simplifying data, while short-form videos are favored for their ability to communicate brand value quickly and persuasively. Supporting secondary literature and LinkedIn’s algorithmic patterns confirm that posts enriched with visual media tend to receive broader organic reach. Studies also highlight that visual storytelling increases brand recall and strengthens trust and credibility. Furthermore, neuroscience supports the idea that visuals are processed faster than text, making them a powerful asset for marketers aiming to deliver impactful messages in a short time. The paper also explores strategic applications of visual content on LinkedIn, recommending the use of consistent branding, high-quality design, and goal-oriented visuals tailored to specific marketing objectives—such as employer branding, lead generation, or thought leadership. 
It encourages investment in design resources and training to maximize the effectiveness of visual marketing. However, the study recognizes certain limitations, including a geographically narrow sample (primarily Indian professionals), a relatively small data pool, and reliance on self-reported user behaviour. The absence of experimental methods like A/B testing also limits the ability to establish direct causation. In conclusion, the research calls for future studies on the integration of emerging technologies like AI in visual content creation, the development of platform-specific visual strategies, and the importance of accessibility and inclusivity in design. As digital engagement continues to shift towards visual-first experiences, LinkedIn marketers must adapt to stay competitive and relevant in the professional content ecosystem. Keywords: Visual Content, LinkedIn Marketing, User Engagement, Brand Visibility, Content Strategy, Infographics, Short-form Videos, B2B Communication, Professional Networking, Social Media Marketing, Brand Recall, Visual Storytelling, Digital Branding, Click-through Rate, Organic Reach, Marketing Analytics, Content Performance, Thought Leadership, Employer Branding, Visual Design
- Research Article
- 10.55041/ijsrem48372
- May 20, 2025
- INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
This study focuses on turning written descriptions into high-quality pictures using powerful AI diffusion models. These models use iterative denoising, which begins with a noisy image and gradually refines it to produce realistic and coherent outputs that match the user-provided text. Pre-trained models, such as Stable Diffusion, are used for their efficiency in text-to-image generation. Fine-tuning on specialized datasets improves adaptability, allowing the system to handle a wide range of textual inputs, from straightforward descriptions to complicated prompts. Techniques such as latent space processing maximize computing efficiency while maintaining output quality. With a reported accuracy of 90 percent, a U-Net architecture that incorporates attention mechanisms enhances the model's capacity to generate detailed and accurate pictures. Index Terms: AI Diffusion Models, Text-to-Image Generation, Stable Diffusion, Latent Space Processing, U-Net Architecture, Attention Mechanisms, Image Synthesis, Generative AI, Fréchet Inception Distance (FID), Visual Content Creation
- Book Chapter
13
- 10.1007/978-1-4471-0563-3_9
- Jan 1, 1999
There is a widening gap between the creation of visual content and its analysis and interpretation by machine, an increasingly essential requirement for correct indexing and filtering. In the case of the WWW, for instance, although there are efficient methods to process the encoded (e.g. ASCII) text, there are no such methods for the (significant) visual content. This paper focuses on the methods developed by the authors to address the problem of extracting the characters from WWW images containing text.
- Research Article
1
- 10.22492/ije.13.2.10
- Jun 3, 2025
- IAFOR Journal of Education
The study explored the educational potential of the application of student-generated digital visual content for learning English as a second language (ESL) by undergraduate students enrolled in the course Foreign Language which is actually Introduction to Legal English. This study used a mixed-methods approach. The researchers designed a quasi-experimental design to examine whether the students’ creation of visual content, supported by structured use of artificial intelligence (AI), could improve second language learning outcomes, increase motivation, and promote critical engagement with digital tools. The experimental group was tasked with creating personalized visual learning materials. The applied approach was structured in several steps, from creating simple forms including infographics and comparative charts to poster presentations and digital video passion projects. The algorithm for collaboration with AI and the work with specific features of AI-generated materials was applied aimed at making a student a critical consumer of this content and mitigating potential drawbacks of using AI. To assess the learning outcomes after the intervention, the post-test was administered, which revealed that the studied instructional design had a positive impact on language development across all aspects checked. The questionnaire, which included both open-ended and closed-ended questions, investigated students’ perceptions of the applied methodology and faced challenges. The findings showed that students perceived integrating visual creation and structured AI-supported activities into English language learning as beneficial for language skills development, boosting motivation and interest, and the advancement of digital literacy.
- Research Article
- 10.22492/ije.13.1.10
- Jun 3, 2025
- IAFOR Journal of Education
The study explored the educational potential of the application of student-generated digital visual content for learning English as a second language (ESL) by undergraduate students enrolled in the course Foreign Language which is actually Introduction to Legal English. This study used a mixed-methods approach. The researchers designed a quasi-experimental design to examine whether the students’ creation of visual content, supported by structured use of artificial intelligence (AI), could improve second language learning outcomes, increase motivation, and promote critical engagement with digital tools. The experimental group was tasked with creating personalized visual learning materials. The applied approach was structured in several steps, from creating simple forms including infographics and comparative charts to poster presentations and digital video passion projects. The algorithm for collaboration with AI and the work with specific features of AI-generated materials was applied aimed at making a student a critical consumer of this content and mitigating potential drawbacks of using AI. To assess the learning outcomes after the intervention, the post-test was administered, which revealed that the studied instructional design had a positive impact on language development across all aspects checked. The questionnaire, which included both open-ended and closed-ended questions, investigated students’ perceptions of the applied methodology and faced challenges. The findings showed that students perceived integrating visual creation and structured AI-supported activities into English language learning as beneficial for language skills development, boosting motivation and interest, and the advancement of digital literacy.
- Research Article
2
- 10.1177/02734753251326457
- Mar 25, 2025
- Journal of Marketing Education
This study aims to examine how to integrate generative AI (GenAI) into marketing education. We used the transformation mechanism within boundary crossing theory to explore how marketing professional insights can be utilized to prepare students for industry demands in the GenAI era. We analyze industry content and GenAI courses alongside 26 interviews with industry practitioners to identify essential knowledge, skillsets, and optimal strategies for implementing GenAI in marketing curricula. Findings underscore the necessity of equipping students with GenAI skills for marketing research, strategy development, content creation, creativity, and ideation across use cases. Practitioners emphasized that marketing theory and ethics should be centralized in any GenAI-related subject matter. For educators, the study highlights the importance of involving industry partners, integrating external materials, and offering master classes to ensure students develop practical skills alongside theoretical knowledge. This research contributes to the discourse on GenAI in marketing education by providing use-cases and actionable insights into subject design, ensuring alignment with industry expectations and equipping students with necessary competencies for a GenAI-driven marketing environment. We extend the application of Boundary Crossing theory into marketing education literature by theorizing how transformation deepens and operates bidirectionally in the context of disruptive technologies, such as GenAI.
- Research Article
- 10.12688/mep.21403.1
- Dec 8, 2025
- MedEdPublish (2016)
Generative AI (GenAI) tools are transforming health professions education, offering opportunities to enhance faculty development (FD). Faculty developers are uniquely positioned to integrate GenAI into practice to address resource constraints, improve accessibility, and foster equity across diverse educational contexts. This Applied Insights article offers a perspective on how GenAI can be leveraged as a co-developer in FD by drawing on emerging literature and discussion points from a workshop at the 8th International Faculty Development Conference in the Health Professions. The applied insights are structured around key phases of FD: planning, content creation, delivery, and evaluation. They include actionable strategies for using GenAI in needs assessment, multilingual and culturally relevant resource creation, personalized learning plans, and when providing feedback and mentorship. Each insight is rooted in pedagogical rationale, evidence, and strategies to address ethical and practical challenges, with an emphasis on human oversight, contextual relevance, and continuous evaluation of GenAI's impact. By considering these insights, faculty developers can harness GenAI to co-design educational materials, extend their reach through innovative formats, and maintain ethical and equity-driven educational practices. This article highlights the transformative potential of GenAI in FD when thoughtfully integrated. GenAI can empower faculty developers to enhance the quality and inclusivity of HPE while safeguarding educational standards.
- Research Article
- 10.36253/me-16303
- Dec 30, 2024
- Media Education
The study examines the transformative potential impact of Generative AI (GAI) on society, media, and media education, focusing on the challenges and opportunities these advancements bring. GAI technologies, particularly large language models (LLMs) like GPT-4, are revolutionizing content creation, platforms, and interaction within the media landscape. This radical shift is generating both innovative educational methodologies and challenges in maintaining academic integrity and the quality of learning. The study aims to provide a comprehensive understanding of how GAI impacts media education by reshaping the content and traditional practices of media-related higher education. The research delves into three main questions: the nature of GAI as an innovation, its effect on media research and knowledge acquisition, and its implications for media education. It introduces critical concepts such as radical uncertainty, which refers to the unpredictable outcomes and impacts of GAI, making traditional forecasting and planning challenging. The paper utilizes McLuhan’s tetrad to analyze GAI’s role in media, questioning what it enhances or obsoletes, retrieves, or reverses when pushed to extremes. This theoretical approach helps in understanding the multifaceted influence of GAI on media practices and education. Overall, the research underscores the dual-edged nature of GAI in media education, where it presents significant enhancements in learning and content creation while simultaneously posing risks related to misinformation, academic integrity, and the dilution of human-centered educational practices. The study calls for a balanced approach to integrating GAI in media education, advocating for preparedness against its potential drawbacks while leveraging its capabilities to revolutionize educational paradigms.
- Book Chapter
3
- 10.4018/979-8-3693-6577-9.ch009
- Dec 2, 2024
The significance of ongoing study and development of AI technology is emphasized as this chapter explores the role and utility of generative AI in medical education and training, examines the difficulties it encounters, and outlines future development patterns in the medical field. We can have a better understanding of how generative AI is impacting medical education going forward and offering fresh methods for training healthcare workers by reading this thorough review. A branch of artificial intelligence called "generative AI" is concerned with creating systems that can produce original and creative works, including text, music, images, and more. These systems may produce content that mimics human-generated content on their own by using deep learning techniques, particularly generative models. The interesting field of generative artificial intelligence focuses on creating systems that can independently produce original, creative content. It makes it possible for machines to perform creative and imaginative tasks in addition to more conventional ones. By using generative models and deep learning approaches, these systems may generate innovative works that closely mimic human-generated content, including literature, music, images, and more. This approach makes it possible to produce creative and original content, making it an effective tool for various uses. Generative models are central to the idea of generative AI. Generative AI enables machines to autonomously generate creative content, such as images, music, text, and more. This addresses the need for new and different content in various domains, including art, entertainment, design, and marketing. Generative AI opens new possibilities for creative expression and expands the boundaries of human imagination. The possibilities for creative expression are increased, and the limits of human imagination are pushed by generative AI.
Medical training serves as a means of guaranteeing that the performance of the human workforce is observed in a realistic and secure setting. Generative AI is used to produce virtual patients to instruct medical students. These realistic clinical scenarios in the simulations were designed to help medical professionals and students make better diagnoses and treatment-related decisions.