FontCLIP: A Semantic Typography Visual‐Language Model for Multilingual Font Applications

Yuki Tatsukawa, Takeo Igarashi, Yuki Koyama, I‐Chao Shen, Ariel Shamir, Anran Qi

doi:10.1111/cgf.15043

Abstract

Acquiring the desired font for various design tasks can be challenging and requires professional typographic knowledge. While previous font retrieval or generation works have alleviated some of these difficulties, they often lack support for multiple languages and for semantic attributes beyond the training data domains. To solve this problem, we present FontCLIP, a model that connects the semantic understanding of a large vision‐language model with typographical knowledge. We integrate typography‐specific knowledge into the comprehensive vision‐language knowledge of a pretrained CLIP model through a novel finetuning approach. We propose to use a compound descriptive prompt that encapsulates adaptively sampled attributes from a font attribute dataset focusing on Roman alphabet characters. FontCLIP's semantic typographic latent space demonstrates two unprecedented generalization abilities. First, FontCLIP generalizes to different languages, including Chinese, Japanese, and Korean (CJK), capturing the typographical features of fonts across languages even though it was finetuned only on fonts of Roman characters. Second, FontCLIP can recognize semantic attributes that are not present in the training data. FontCLIP's dual‐modality and generalization abilities enable multilingual and cross‐lingual font retrieval and letter shape optimization, reducing the burden of obtaining desired fonts.
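As a rough illustration of the two ideas named in the abstract (a compound descriptive prompt built from sampled attributes, and retrieval in a shared text-image latent space), the following is a minimal sketch and not the authors' implementation. The function names, the prompt template, the score-proportional sampling rule, and the toy 2-D embeddings are all assumptions made for this example; a real system would obtain embeddings from the finetuned CLIP text and image encoders.

```python
import math
import random


def build_compound_prompt(scores, k=2, rng=None):
    """Join k sampled attribute names into one descriptive prompt.

    Assumption: attributes are drawn without replacement, with probability
    proportional to their score. The paper's sampling scheme may differ.
    """
    rng = rng or random.Random(0)
    pool = dict(scores)
    chosen = []
    for _ in range(min(k, len(pool))):
        candidates = list(pool)
        # Small epsilon keeps the weights valid even if a score is zero.
        weights = [pool[name] + 1e-9 for name in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]
    # Hypothetical template; not taken from the paper.
    return "This is a " + " and ".join(chosen) + " font."


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_fonts(query_vec, font_vecs):
    """Return font names ordered from most to least similar to the query."""
    return sorted(
        font_vecs,
        key=lambda name: cosine(query_vec, font_vecs[name]),
        reverse=True,
    )


# Toy stand-ins for encoder outputs (hypothetical values).
font_embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.7, 0.7]}
query_embedding = [1.0, 0.1]  # would come from encoding a text prompt or an image

prompt = build_compound_prompt({"bold": 90.0, "formal": 10.0, "playful": 60.0})
ranking = rank_fonts(query_embedding, font_embeddings)
```

Because the query vector can come from either a text prompt or a glyph image, the same ranking step covers both modalities, which is the sense in which a shared latent space supports cross-lingual retrieval.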

Journal: Computer Graphics Forum
Publication Date: Apr 30, 2024
License type: CC BY 4.0

Similar Papers

Explainable and Generalizable Blind Image Quality Assessment via Semantic Attribute Reasoning
Yipo Huang ... Yuzhe Yang
IEEE Transactions on Multimedia | VOL. 25
01 Jan 2023

Knowledge-based dynamic prompt learning for multi-label disease diagnosis
Jing Xie ... Xin Peng
Knowledge-Based Systems | VOL. 286
11 Jan 2024

Generalized Zero-Shot Learning Via Multi-Modal Aggregated Posterior Aligning Neural Network
Xingyu Chen ... Xuguang Lan
IEEE Transactions on Multimedia | VOL. 24
25 Dec 2020

3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow
Xin Wen ... Yu-Shen Liu
01 Jun 2022
