12 results for "Kun Lai"
Search Results
2. Geometry-Guided Dense Perspective Network for Speech-Driven Facial Animation
- Author
-
Jingying Liu, Jingyu Yang, Yebin Liu, Yunke Liu, Yu-Kun Lai, Kun Li, Binyuan Hui, and Yuxiang Zhang
- Subjects
Computer and information sciences, Generalization, Computer science, Computer Science - Graphics, Imaging, Three-Dimensional, Robustness (computer science), Computer Graphics, Humans, Speech, Computer vision, Representation (mathematics), Computer facial animation, Perspective (graphical), Computer Graphics and Computer-Aided Design, Feature (computer vision), Face (geometry), Signal Processing, Computer Vision and Pattern Recognition, Artificial intelligence, Encoder, Algorithms, Software - Abstract
Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and the face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to strengthen feature propagation and encourage the re-use of audio features, and the decoder is integrated with an attention mechanism to adaptively recalibrate point-wise feature responses by explicitly modeling interdependencies between different neuron units. We also introduce a non-linear face reconstruction representation as guidance for the latent space to obtain more accurate deformation, which helps solve the geometry-related deformation and aids generalization across subjects. Huber and HSIC (Hilbert-Schmidt Independence Criterion) constraints are adopted to promote the robustness of our model and to better exploit the non-linear and high-order correlations (an illustrative HSIC sketch follows this entry). Experimental results on a public dataset and a real scanned dataset validate the superiority of our proposed GDPnet compared with state-of-the-art models. The code is available for research purposes at http://cic.tju.edu.cn/faculty/likun/projects/GDPnet.
- Published
- 2022
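The HSIC constraint named in the abstract above has a standard closed form. Below is a minimal NumPy sketch of the common biased HSIC estimator; it illustrates the criterion in general, not GDPnet's actual loss implementation, and the RBF bandwidth is an arbitrary choice.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Pairwise squared Euclidean distances -> Gaussian (RBF) kernel matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC estimate between samples X (n x dx) and Y (n x dy).

    For characteristic kernels, HSIC is zero iff X and Y are independent,
    so it can serve as a loss term that exposes non-linear dependence.
    """
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

Because HSIC is sensitive to non-linear, high-order correlations, it is a natural fit for the role the abstract describes; how it is weighted against the Huber term is specified in the paper itself.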
3. GAN-Based Multi-Style Photo Cartoonization
- Author
-
Wang Zhao, Yu-Kun Lai, Ran Yi, Yong-Jin Liu, Zipeng Ye, Yezhi Shu, Mengfei Xia, and Yang Chen
- Subjects
Network architecture, Computer science, Semantics, Computer Graphics and Computer-Aided Design, Image (mathematics), Style (sociolinguistics), Signal Processing, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Architecture, Encoder, Software, Generator (mathematics) - Abstract
Cartoons are a common art form in our daily life, and automatic generation of cartoon images from photos is highly desirable. However, state-of-the-art single-style methods can only generate one style of cartoon images from photos, and existing multi-style image style transfer methods still struggle to produce high-quality cartoon images due to the highly simplified and abstract nature of cartoons. In this article, we propose a novel multi-style generative adversarial network (GAN) architecture, called MS-CartoonGAN, which can transform photos into multiple cartoon styles. MS-CartoonGAN uses only unpaired photos and cartoon images of multiple styles for training. To achieve this, we propose to use (1) a hierarchical semantic loss with sparse regularization to retain semantic content and recover flat shading at different abstraction levels, (2) a new edge-promoting adversarial loss for producing fine edges, and (3) a style loss to enhance the difference between output cartoon styles and make the training process more stable. We also develop a multi-domain architecture, where the generator consists of a shared encoder and multiple decoders for different cartoon styles, along with multiple discriminators for individual styles (a skeleton of this generator follows this entry). Observing that cartoon images drawn by different artists have their unique styles while sharing some common characteristics, our shared network architecture exploits the common characteristics of cartoon styles, achieving better cartoonization and being more efficient than single-style cartoonization. We show that our multi-domain architecture is theoretically guaranteed to output the desired multiple cartoon styles. Through extensive experiments, including a user study, we demonstrate the superiority of the proposed method, outperforming state-of-the-art single-style and multi-style image style transfer methods.
- Published
- 2022
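The multi-domain generator described in the abstract above — a shared encoder feeding one decoder per cartoon style — has a simple skeleton. The PyTorch sketch below is illustrative only: the layer counts, channel widths, and class names are assumptions, not MS-CartoonGAN's published architecture.

```python
import torch.nn as nn

class MultiStyleGenerator(nn.Module):
    """Shared encoder + one decoder per cartoon style (illustrative skeleton)."""
    def __init__(self, num_styles=3, ch=64):
        super().__init__()
        # The shared encoder captures characteristics common to all styles.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # One decoder per style models each artist's unique characteristics.
        self.decoders = nn.ModuleList([
            nn.Sequential(
                nn.ConvTranspose2d(ch * 4, ch * 2, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(ch, 3, 7, padding=3), nn.Tanh(),
            )
            for _ in range(num_styles)
        ])

    def forward(self, photo, style_idx):
        return self.decoders[style_idx](self.encoder(photo))
```

Training would pair such a generator with one discriminator per style and the unpaired losses the abstract lists.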
4. PRS-Net: Planar Reflective Symmetry Detection Net for 3D Models
- Author
-
Leif Kobbelt, Yu-Kun Lai, Lin Gao, Ling-Xiao Zhang, Hsien-Yu Meng, and Yi-Hui Ren
- Subjects
Computer and information sciences, Computer Science - Machine Learning, Artificial neural network, Computer science, Deep learning, Geometry processing, Global symmetry, Computer Graphics and Computer-Aided Design, Convolutional neural network, Computer Science - Graphics, Reflection symmetry, Signal Processing, Computer Vision and Pattern Recognition, Artificial intelligence, Symmetry (geometry), Algorithm, Rotation (mathematics), Software - Abstract
In geometry processing, symmetry is a universal type of high-level structural information of 3D models that benefits many tasks, including shape segmentation, alignment, matching, and completion. Analyzing the various symmetry forms of 3D shapes is thus an important problem, and planar reflective symmetry is the most fundamental one. Traditional methods based on spatial sampling can be time-consuming and may not be able to identify all the symmetry planes. In this article, we present a novel learning framework to automatically discover global planar reflective symmetry of a 3D shape. Our framework trains an unsupervised 3D convolutional neural network to extract global model features and then outputs possible global symmetry parameters, where input shapes are represented using voxels. We introduce a dedicated symmetry distance loss along with a regularization loss to avoid generating duplicated symmetry planes (one natural reading of the symmetry distance is sketched after this entry). Our network can also identify generalized cylinders by predicting their rotation axes. We further provide a method to remove invalid and duplicated planes and axes. We demonstrate that our method produces reliable and accurate results, is hundreds of times faster than state-of-the-art sampling-based methods, and remains robust even with noisy or incomplete input surfaces.
- Published
- 2021
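One natural reading of the "symmetry distance loss" above: reflect surface samples across the candidate plane and measure how far the reflections land from the original samples. The NumPy sketch below follows that reading; PRS-Net's exact formulation (on voxelized input, with its regularization term) may differ.

```python
import numpy as np

def reflect(points, normal, d):
    """Reflect points across the plane n.x + d = 0 (n must be unit-length)."""
    signed = points @ normal + d            # signed distance of each point
    return points - 2.0 * signed[:, None] * normal[None, :]

def symmetry_distance(points, normal, d):
    """Mean distance from reflected samples to their nearest original sample.

    Near zero for a true symmetry plane of a symmetric sampling; larger
    values penalize candidate planes that are not symmetry planes.
    """
    scale = np.linalg.norm(normal)
    normal, d = normal / scale, d / scale   # normalize the plane equation
    mirrored = reflect(points, normal, d)
    # Brute-force nearest neighbour; a KD-tree would be used at scale.
    d2 = ((mirrored[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return np.sqrt(d2.min(axis=1)).mean()
```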
5. Learning to Infer Inner-Body Under Clothing From Monocular Video
- Author
-
Xiongzheng Li, Jing Huang, Jinsong Zhang, Xiaokun Sun, Haibiao Xuan, Yu-Kun Lai, Yingdi Xie, Jingyu Yang, and Kun Li
- Subjects
Signal Processing, Computer Vision and Pattern Recognition, Computer Graphics and Computer-Aided Design, Software - Abstract
Accurately estimating the human inner-body under clothing is very important for body measurement, virtual try-on, and VR/AR applications. In this paper, we propose the first method that allows everyone to easily reconstruct their own 3D inner-body under daily clothing from a self-captured video, with a mean reconstruction error of 0.73 cm, within 15 s. This avoids privacy concerns arising from nudity or minimal clothing. Specifically, we propose a novel two-stage framework with a Semantic-guided Undressing Network (SUNet) and an Intra-Inter Transformer Network (IITNet). SUNet learns semantically related body features to alleviate the complexity and uncertainty of directly estimating 3D inner-bodies under clothing. IITNet reconstructs the 3D inner-body model by making full use of intra-frame and inter-frame information (one plausible attention pattern is sketched after this entry), which addresses the misalignment of inconsistent poses in different frames. Experimental results on both public datasets and our collected dataset demonstrate the effectiveness of the proposed method. The code and dataset are available for research purposes at http://cic.tju.edu.cn/faculty/likun/projects/Inner-Body.
- Published
- 2022
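The abstract describes IITNet as fusing intra-frame and inter-frame information. One plausible pattern is self-attention over tokens within each frame followed by attention over the same token across frames; the PyTorch sketch below is a guess at that pattern, with all module names and dimensions invented for illustration, not the paper's network.

```python
import torch.nn as nn

class IntraInterBlock(nn.Module):
    """Attention within each frame, then across frames (illustrative only)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (frames, tokens, dim) -- per-frame feature tokens for one video.
        x, _ = self.intra(x, x, x)     # relate tokens within each frame
        x = x.transpose(0, 1)          # (tokens, frames, dim)
        x, _ = self.inter(x, x, x)     # relate the same token across frames
        return x.transpose(0, 1)       # back to (frames, tokens, dim)
```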
6. Deep Line Art Video Colorization with a Few References
- Author
-
Min Shi, Jia-Qi Zhang, Shu-Yu Chen, Lin Gao, Yu-Kun Lai, and Fang-Lue Zhang
- Abstract
Coloring line art images based on the colors of reference images is an important stage in animation production, but it is time-consuming and tedious. In this paper, we propose a deep architecture to automatically color line art videos in the same color style as given reference images. Our framework consists of a color transform network and a temporal refinement network based on 3U-net. The color transform network takes the target line art images, as well as the line art and color images of the references, as input, and generates the corresponding target color images. To cope with the large differences between each target line art image and the reference color images, we propose a distance attention layer that uses non-local similarity matching to determine region correspondences between the target image and the reference images and transfers local color information from the references to the target. To ensure global color style consistency, we further incorporate Adaptive Instance Normalization (AdaIN), with transformation parameters obtained from a multi-layer style embedding that describes the global color style of the references, extracted by an embedder network (a minimal AdaIN sketch follows this entry). The temporal refinement network learns spatiotemporal features through 3D convolutions to ensure the temporal color consistency of the results. When dealing with an animation of a new style, our model can achieve even better coloring results by fine-tuning its parameters with only a small number of samples. To evaluate our method, we build a line art coloring dataset.
- Published
- 2022
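Adaptive Instance Normalization, mentioned above, has a well-known closed form: replace the per-channel statistics of the content features with statistics supplied by the style. Below is a minimal PyTorch sketch of vanilla AdaIN; in the paper the parameters come from an embedder network rather than from raw style feature statistics.

```python
import torch

def adain(content, style_mean, style_std, eps=1e-5):
    """AdaIN: renormalize content features to the given style statistics.

    content: (N, C, H, W); style_mean, style_std: (N, C), e.g. predicted
    by an embedder network from the reference images.
    """
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    normalized = (content - c_mean) / c_std
    return normalized * style_std[..., None, None] + style_mean[..., None, None]
```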
7. 3D-CariGAN: An End-to-End Solution to 3D Caricature Generation from Normal Face Photos
- Author
-
Yanan Sun, Mengfei Xia, Zipeng Ye, Yu-Kun Lai, Minjing Yu, Ran Yi, Juyong Zhang, and Yong-Jin Liu
- Subjects
Computer and information sciences, Computer science, Computer Science - Computer Vision and Pattern Recognition, Entertainment industry, Representation (arts), Computer Graphics and Computer-Aided Design, Sketch, Domain (software engineering), Character (mathematics), Face (geometry), Signal Processing, Polygon mesh, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Software - Abstract
Caricature is an artistic style of human faces that attracts considerable attention in the entertainment industry. So far, only a few 3D caricature generation methods exist, and all of them require some caricature information (e.g., a caricature sketch or 2D caricature) as input. Such input, however, is difficult for non-professional users to provide. In this paper, we propose an end-to-end deep neural network model that generates high-quality 3D caricatures directly from a normal 2D face photo. The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures). To address this challenge, we: (1) build a large dataset of 5,343 3D caricature meshes and use it to establish a PCA model in the 3D caricature shape space (a generic PCA sketch follows this entry); (2) reconstruct a normal full 3D head from the input face photo and use its PCA representation in the 3D caricature shape space to establish correspondences between the input photo and the 3D caricature shape; and (3) propose a novel character loss and a novel caricature loss based on previous psychological studies of caricatures. Experiments, including a novel two-level user study, show that our system can generate high-quality 3D caricatures directly from normal face photos.
- Published
- 2021
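Step (1) above fits a PCA model to 5,343 caricature meshes. If the meshes are in dense correspondence (same vertex count and ordering), this reduces to PCA on flattened vertex coordinates; the NumPy sketch below assumes exactly that and is generic, not the paper's code.

```python
import numpy as np

def build_pca_shape_space(meshes, num_components=64):
    """meshes: (num_meshes, num_vertices, 3) arrays in full correspondence."""
    X = meshes.reshape(len(meshes), -1)        # flatten each mesh to a vector
    mean = X.mean(axis=0)
    # SVD of the centered data matrix gives the principal shape directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:num_components]                # (k, 3 * num_vertices)
    return mean, basis

def project(mesh, mean, basis):
    """PCA coefficients of a mesh -- its coordinates in the shape space."""
    return basis @ (mesh.reshape(-1) - mean)
```

Establishing correspondences then amounts to comparing the PCA coefficient vectors of the reconstructed head and of the caricature shapes.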
8. E-ffective: A Visual Analytic System for Exploring the Emotion and Effectiveness of Inspirational Speeches
- Author
-
Hao Wang, Yong-Jin Liu, Jian-Cheng Song, Hongan Wang, Xiaoming Deng, Ze-Yuan Huang, Yu-Kun Lai, Cuixia Ma, and Kevin T. Maher
- Subjects
Visualization methods, Computer and information sciences, Computer science, Emotions, Computer Science - Human-Computer Interaction, Computer Science - Computer Vision and Pattern Recognition, Domain (software engineering), Computer Graphics, Speech, Critical factors, Subject (documents), Usability, Computer Graphics and Computer-Aided Design, Public speaking, Signal Processing, Domain knowledge, Computer Vision and Pattern Recognition, Software, Computer Science - Multimedia, Cognitive psychology - Abstract
What makes speeches effective has long been a subject of debate, and to this day there is broad controversy among public speaking experts about which factors make a speech effective and what roles these factors play. Moreover, there is a lack of quantitative analysis methods to help understand effective speaking strategies. In this paper, we propose E-ffective, a visual analytic system that allows both speaking experts and novices to analyze the role of speech factors and their contribution to effective speeches. From interviews with domain experts and a review of existing literature, we identified important factors to consider in inspirational speeches. We extracted these factors from multi-modal data and related them to speech effectiveness data. Our system supports rapid understanding of critical factors in inspirational speeches, including the influence of emotions, by means of novel visualization methods and interaction. Two novel visualizations are E-spiral (which shows the emotional shifts in speeches in a visually compact way) and E-script (which connects speech content with key speech delivery information). In our evaluation, we studied the influence of our system on experts' domain knowledge about speech factors, and further studied its usability with both novices and experts in assisting the analysis of inspirational speech effectiveness.
- Published
- 2021
9. Learning on 3D Meshes With Laplacian Encoding and Pooling
- Author
-
Paul L. Rosin, Yi-Ling Qiao, Xilin Chen, Jie Yang, Yu-Kun Lai, and Lin Gao
- Subjects
Computer science, Pooling, Computer Graphics and Computer-Aided Design, Matrix multiplication, Vertex (geometry), Signal Processing, Segmentation, Polygon mesh, Computer Vision and Pattern Recognition, Graphics, Algorithm, Laplace operator, Software - Abstract
3D models are commonly used in computer vision and graphics. With the wider availability of mesh data, an efficient and intrinsic deep learning approach to processing 3D meshes is greatly needed. Unlike images, 3D meshes have irregular connectivity, requiring careful design to capture relations in the data. To utilize the topology information while staying robust under different triangulations, we propose to encode mesh connectivity using Laplacian spectral analysis, along with mesh feature aggregation blocks (MFABs) that can split the surface domain into local pooling patches and aggregate global information among them. We build a mesh hierarchy from fine to coarse using Laplacian spectral clustering, which is flexible under isometric transformations (a generic sketch of this recipe follows this entry). Inside the MFABs there are pooling layers to collect local information and multi-layer perceptrons to compute vertex features of increasing complexity. To obtain the relationships among different clusters, we introduce a Correlation Net that computes a correlation matrix, which can aggregate features globally through matrix multiplication with the cluster features. Our network architecture is flexible enough to be used on meshes with different numbers of vertices. We conduct several experiments, including shape segmentation and classification, and our method outperforms state-of-the-art algorithms for these tasks on the ShapeNet and COSEG datasets.
- Published
- 2020
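The fine-to-coarse hierarchy above is built with Laplacian spectral clustering. The SciPy sketch below shows the generic recipe — a graph Laplacian from mesh connectivity, a few low-frequency eigenvectors, then k-means — with the particular Laplacian, edge weights, and cluster counts chosen arbitrarily rather than taken from the paper.

```python
import numpy as np
from scipy.sparse import coo_matrix, csgraph
from scipy.cluster.vq import kmeans2

def spectral_pooling_patches(num_vertices, edges, num_patches=32, num_eigs=16):
    """Cluster mesh vertices into pooling patches via Laplacian eigenvectors.

    edges: (E, 2) integer array of vertex index pairs from the triangulation.
    """
    i, j = edges[:, 0], edges[:, 1]
    w = np.ones(len(edges))
    A = coo_matrix((np.r_[w, w], (np.r_[i, j], np.r_[j, i])),
                   shape=(num_vertices, num_vertices))
    L = csgraph.laplacian(A.tocsr())
    # Low-frequency eigenvectors embed vertices so that vertices close on the
    # surface are close in the embedding, largely independent of triangulation.
    # Dense solve for clarity; a sparse eigensolver would be used at scale.
    _, vecs = np.linalg.eigh(L.toarray())
    embedding = vecs[:, 1:num_eigs + 1]        # skip the constant eigenvector
    _, labels = kmeans2(embedding, num_patches, minit='++')
    return labels                              # patch id per vertex
```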
10. Content-Preserving Image Stitching With Piecewise Rectangular Boundary Constraints
- Author
-
Yu-Kun Lai, Fang-Lue Zhang, and Yun Zhang
- Subjects
Computer science, Boundary (topology), Computer Graphics and Computer-Aided Design, Image stitching, Signal Processing, Line (geometry), Polygon, Piecewise, Polygon mesh, Computer Vision and Pattern Recognition, Rectangle, Image warping, Algorithm, Software - Abstract
This article proposes an approach to content-preserving image stitching with regular boundary constraints, which aims to stitch multiple images into a panoramic image with piecewise rectangular boundaries. Existing methods treat image stitching and rectangling as two separate steps, which may lead to suboptimal results because the stitching process is unaware of the subsequent warping needed for rectangling. We address these limitations by formulating image stitching with regular boundaries in a unified optimization framework (sketched in generic form after this entry). Starting from the initial stitching result produced by traditional warping-based optimization, we obtain the irregular boundary from the warped meshes by polygon Boolean operations, which robustly handle arbitrary mesh compositions. By analyzing the irregular boundary, we construct a piecewise rectangular boundary. Based on this, we further incorporate line and regular-boundary preservation constraints into the image stitching framework and conduct iterative optimizations to obtain an optimal piecewise rectangular boundary. Thus we can make the boundary of the stitching result as close as possible to a rectangle while reducing unwanted distortions. We further extend our method to video stitching by integrating temporal coherence into the optimization. Experiments show that our method efficiently produces visually pleasing panoramas with regular boundaries and unnoticeable distortions.
- Published
- 2020
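The unified framework above optimizes warped mesh vertices under several competing constraints. In generic form (term names and weights are illustrative, not the paper's notation), the energy being minimized looks like

```latex
E(V) \;=\; E_{\mathrm{align}}(V) \;+\; \lambda_{l}\,E_{\mathrm{line}}(V) \;+\; \lambda_{b}\,E_{\mathrm{boundary}}(V)
```

where E_align keeps matched features coincident across images, E_line keeps detected lines straight, and E_boundary pulls boundary vertices onto the current piecewise rectangular target; the iterations alternate between updating that target boundary and re-solving for the vertices V.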
11. Generalized anisotropic stratified surface sampling
- Author
-
Jonathan Alexander Quinn, Yu-Kun Lai, Ralph R. Martin, and Frank C. Langbein
- Subjects
Surface (mathematics), Nonuniform sampling, Sampling (statistics), Geometry, Computer Graphics and Computer-Aided Design, Tensor field, Stratified sampling, Mesh generation, Signal Processing, Oversampling, Computer Vision and Pattern Recognition, Software, Mathematics - Abstract
We introduce a novel stratified sampling technique for mesh surfaces that gives the user control over sampling density and anisotropy via a tensor field. Our approach is based on sampling space-filling curves mapped onto mesh segments via parametrizations aligned with the tensor field (the core stratification idea is sketched after this entry). After a short preprocessing step, samples can be generated in real time. Along with visual examples, we provide rigorous spectral analysis and differential domain analysis of our sampling. The sample distributions are of high quality: they fulfil the blue noise criterion and so have minimal artifacts due to regularity of sampling patterns, and they accurately represent isotropic and anisotropic densities on the plane and on mesh surfaces. They also have low discrepancy, ensuring that the surface is evenly covered.
- Published
- 2013
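The core trick above — stratified samples along a space-filling curve mapped onto the surface — can be illustrated in the plane with a Hilbert curve, whose locality makes 1D stratification translate into well-distributed 2D points. The Python sketch below uses the standard Hilbert-curve decoding; the paper's tensor-aligned, anisotropic mapping onto mesh segments is not reproduced here.

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map curve index d on a 2**order x 2**order Hilbert curve to cell (x, y)."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def stratified_curve_samples(order, num_samples, seed=0):
    """One jittered sample per equal-length stratum of the 1D curve parameter."""
    rng = np.random.default_rng(seed)
    cells = (1 << order) ** 2
    u = (np.arange(num_samples) + rng.random(num_samples)) / num_samples
    pts = np.array([hilbert_d2xy(order, int(ui * cells)) for ui in u])
    return (pts + 0.5) / (1 << order)    # cell centres mapped into [0, 1]^2
```

Warping the stratum lengths by a density derived from the tensor field is what would make such a distribution adaptive and anisotropic.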
12. Metric-driven RoSy field design and remeshing
- Author
-
Eugene Zhang, Jonathan Palacios, Xianfeng Gu, Shi-Min Hu, Xuexiang Xie, Yu-Kun Lai, Miao Jin, and Ying He
- Subjects
Computer science, Finite Element Analysis, Rotational symmetry, Topology, User-Computer Interface, Singularity, Imaging, Three-Dimensional, Local symmetry, Computer graphics (images), Computer Graphics, Computer Simulation, Parallel transport, Holonomy, Global symmetry, Models, Theoretical, Computational geometry, Computer Graphics and Computer-Aided Design, Signal Processing, Curve fitting, Computer Vision and Pattern Recognition, Parametrization, Software - Abstract
Designing rotational symmetry fields on surfaces is an important task for a wide range of graphics applications. This work introduces a rigorous and practical approach to automatic N-RoSy field design on arbitrary surfaces with user-defined field topologies. The user has full control of the number, positions, and indexes of the singularities (as long as they are compatible with the necessary global constraints; see the note after this entry) and the turning numbers of the loops, and is able to edit the field interactively. We formulate N-RoSy field construction as designing a Riemannian metric such that the holonomy along any loop is compatible with the local symmetry of N-RoSy fields, and we prove the compatibility condition using discrete parallel transport. The complexity of N-RoSy field design is caused by curvature; we therefore propose to simplify the Riemannian metric to make it flat almost everywhere. This greatly simplifies the process and improves flexibility, allowing the design of N-RoSy fields with a single singularity as well as mixed-RoSy fields. The approach can also be generalized to construct regular remeshings of surfaces. To demonstrate the effectiveness of our approach, we apply our design system to pen-and-ink sketching and geometry remeshing. Furthermore, based on our remeshing results with high global symmetry, we generate Celtic knots on surfaces directly.
- Published
- 2009
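The "necessary global constraints" on singularities mentioned above are, for an N-RoSy field on a closed surface M, the Poincaré-Hopf condition: singularity indices are multiples of 1/N and must sum to the Euler characteristic. A well-known fact, stated here for context:

```latex
\sum_i \operatorname{ind}(p_i) \;=\; \chi(M), \qquad \operatorname{ind}(p_i) \in \tfrac{1}{N}\mathbb{Z}
```

For example, a 4-RoSy field on a sphere (χ = 2) can place eight singularities of index 1/4.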