Author: "Lin, Haonan" / Publication Type: Electronic Resources - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lin, Haonan"' showing total 3 results

1. DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Author: Lin, Haonan; Wang, Mengmeng; Chen, Yan; An, Wenbin; Yao, Yuzhe; Dai, Guang; Wang, Qianying; Liu, Yong; and Wang, Jingdong
Abstract: While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centered images, novel challenges arise with the nuanced task of "identity fine editing": precisely modifying specific features of a subject while maintaining its inherent identity and context. Existing personalization methods, which either require time-consuming optimization or learn additional encoders, are adept at "identity re-contextualization". However, they often struggle with detailed and sensitive tasks like human face editing. To address these challenges, we introduce DreamSalon, a noise-guided, staged-editing framework that focuses on detailed image manipulations and identity-context preservation. By discerning editing and boosting stages via the frequency and gradient of predicted noises, DreamSalon first performs detailed manipulations on specific features in the editing stage, guided by high-frequency information, and then employs stochastic denoising in the boosting stage to improve image quality. For more precise editing, DreamSalon semantically mixes source and target textual prompts, guided by differences in their embedding covariances, to direct the model's focus on specific manipulation areas. Our experiments demonstrate DreamSalon's ability to efficiently and faithfully edit fine details on human faces, outperforming existing methods both qualitatively and quantitatively.
Published: 2024

2. OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

Author: Wang, Jiahao; Yan, Caixia; Lin, Haonan; and Zhang, Weizhan
Abstract: Text-to-image diffusion models benefit artists with high-quality image generation. Yet their stochastic nature prevents artists from creating consistent images of the same character. Existing methods try to tackle this challenge and generate consistent content in various ways. However, they either depend on external data or require expensive tuning of the diffusion model. To address this issue, we argue that lightweight but intricate guidance is sufficient. To this end, we lead the way in formalizing the objective of consistent generation, derive a clustering-based score function, and propose a novel paradigm, OneActor. We design a cluster-conditioned model which incorporates posterior samples to guide the denoising trajectories towards the target cluster. To overcome the overfitting challenge shared by one-shot tuning pipelines, we devise auxiliary components to simultaneously augment the tuning and regulate the inference. This technique is later verified to significantly enhance the content diversity of generated images. Comprehensive experiments show that our method outperforms a variety of baselines with satisfactory character consistency, superior prompt conformity, and high image quality. Our method is also at least 4 times faster than tuning-based baselines. Furthermore, to the best of our knowledge, we are the first to prove that the semantic space has the same interpolation property as the latent space does. This property can serve as another promising tool for fine generation control.
Published: 2024

3. Generalized Category Discovery with Large Language Models in the Loop

Author: An, Wenbin; Shi, Wenkai; Tian, Feng; Lin, Haonan; Wang, QianYing; Wu, Yaqiang; Cai, Mingxiang; Wang, Luyan; Chen, Yan; Zhu, Haiping; and Chen, Ping
Abstract: Generalized Category Discovery (GCD) is a crucial task that aims to recognize both known and novel categories from a set of unlabeled data by utilizing a few labeled data with only known categories. Due to the lack of supervision and category information, current methods usually perform poorly on novel categories and struggle to reveal semantic meanings of the discovered clusters, which limits their applications in the real world. To mitigate the above issues, we propose Loop, an end-to-end active-learning framework that introduces Large Language Models (LLMs) into the training loop, which can boost model performance and generate category names without relying on any human effort. Specifically, we first propose Local Inconsistent Sampling (LIS) to select samples that have a higher probability of falling into wrong clusters, based on neighborhood prediction consistency and entropy of cluster assignment probabilities. Then we propose a Scalable Query strategy to allow LLMs to choose true neighbors of the selected samples from multiple candidate samples. Based on the feedback from LLMs, we perform Refined Neighborhood Contrastive Learning (RNCL) to pull samples and their neighbors closer to learn clustering-friendly representations. Finally, we select representative samples from clusters corresponding to novel categories to allow LLMs to generate category names for them. Extensive experiments on three benchmark datasets show that Loop outperforms SOTA models by a large margin and generates accurate category names for the discovered clusters. Code and data are available at https://github.com/Lackel/LOOP. Comment: Accepted by ACL 2024 Findings.
Published: 2023
