Search Results (82 total)
2. A Generalized Framework for Edge-Preserving and Structure-Preserving Image Smoothing.
- Author
- Liu, Wei, Zhang, Pingping, Lei, Yinjie, Huang, Xiaolin, Yang, Jie, and Ng, Michael
- Subjects
- COMPUTER vision, COMPUTER graphics, IMAGE intensifiers, APPLICATION software
- Abstract
Image smoothing is a fundamental procedure in both computer vision and graphics applications. The required smoothing properties can differ or even contradict one another across tasks. Nevertheless, the inherent smoothing nature of a given smoothing operator is usually fixed and thus cannot meet the varied requirements of different applications. In this paper, we first introduce the truncated Huber penalty function, which is highly flexible under different parameter settings. A generalized framework is then proposed with the introduced truncated Huber penalty function. Owing to this flexibility, our framework can achieve diverse, even contradictory, smoothing behaviors. It can also yield smoothing behavior that previous methods seldom achieve, and thus performs well in challenging cases. Together, these properties make our framework applicable to a range of tasks and able to outperform state-of-the-art approaches in several of them, such as image detail enhancement, clip-art compression artifact removal, guided depth map restoration, and image texture removal. In addition, an efficient numerical solution is provided, and its convergence is theoretically guaranteed even though the optimization framework is non-convex and non-smooth. A simple yet effective approach is further proposed to reduce the computational cost of our method while maintaining its performance. The effectiveness and superior performance of our approach are validated through comprehensive experiments across a range of applications. Our code is available at https://github.com/wliusjtu/Generalized-Smoothing-Framework. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
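A quick illustration of the truncated Huber idea in entry 2: the penalty behaves like the classic Huber loss for small inputs but is capped at a constant, so sufficiently large differences (e.g., across strong edges) incur no further cost. This is a minimal numpy sketch; the parameter names delta and tau and the exact truncation rule are assumptions of this sketch, not the paper's precise definition.

```python
import numpy as np

def huber(x, delta):
    """Classic Huber penalty: quadratic near zero, linear in the tails."""
    ax = np.abs(x)
    return np.where(ax <= delta, 0.5 * x ** 2, delta * (ax - 0.5 * delta))

def truncated_huber(x, delta, tau):
    """Illustrative truncation: cap the Huber penalty at tau so that
    large residuals (e.g., across a strong edge) stop being penalized."""
    return np.minimum(huber(x, delta), tau)
```

Sweeping delta and tau moves the penalty between edge-preserving and edge-flattening regimes, which is the kind of parameter-driven flexibility the abstract describes.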
3. Learning and Meshing From Deep Implicit Surface Networks Using an Efficient Implementation of Analytic Marching.
- Author
- Lei, Jiabao, Jia, Kui, and Ma, Yi
- Subjects
- DEEP learning, COMPUTER vision, MULTILAYER perceptrons, COMPUTER graphics, SURFACE reconstruction, PARALLEL programming, IMPLICIT functions
- Abstract
Reconstruction of object or scene surfaces has tremendous applications in computer vision, computer graphics, and robotics. The topic has attracted increased attention with the emerging pipeline of deep learning surface reconstruction, where implicit field functions constructed from deep networks (e.g., multi-layer perceptrons, or MLPs) are proposed for generative shape modeling. In this paper, we study a fundamental problem in this context: recovering a surface mesh from an implicit field function whose zero-level set captures the underlying surface. To achieve this goal, existing methods rely on traditional meshing algorithms (e.g., the de facto standard marching cubes); while promising, they suffer from loss of the precision learned in the implicit surface networks, due to the use of discrete space sampling in marching cubes. Given that an MLP with Rectified Linear Unit (ReLU) activations partitions its input space into a number of linear regions, we are motivated to connect this local linearity with the same property possessed by the desired polygon mesh output. More specifically, we identify, from the linear regions partitioned by an MLP-based implicit function, the analytic cells and analytic faces that are associated with the function's zero-level isosurface. We prove that under mild conditions, the identified analytic faces are guaranteed to connect and form a closed, piecewise planar surface. Based on this theorem, we propose an algorithm of analytic marching, which marches among analytic cells to exactly recover the mesh captured by an implicit surface network. We also show that our theory and algorithm are equally applicable to advanced MLPs with shortcut connections and max pooling. Given the parallel nature of analytic marching, we contribute AnalyticMesh, a software package that supports efficient meshing of implicit surface networks via CUDA parallel computing, and mesh simplification for efficient downstream processing. We apply our method to different settings of generative shape modeling using implicit surface networks. Extensive experiments demonstrate our advantages over existing methods in terms of both meshing accuracy and efficiency. Code is available at https://github.com/Karbo123/AnalyticMesh. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
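The local-linearity fact that entry 3 builds on is easy to check numerically: within one ReLU activation pattern, an MLP is exactly affine. The sketch below recovers that affine map for the region containing a given input; it is a toy illustration of the property (plain fully connected layers assumed), not the authors' AnalyticMesh implementation.

```python
import numpy as np

def region_affine(weights, biases, x):
    """Return (A, c) with f(z) = A @ z + c on the linear region containing x,
    for a ReLU MLP given as lists of layer weights/biases (last layer affine)."""
    A, c = np.eye(x.size), np.zeros(x.size)
    for W, b in zip(weights[:-1], biases[:-1]):
        A, c = W @ A, W @ c + b              # pre-activation as an affine map of x
        m = (A @ x + c > 0).astype(float)    # ReLU activation pattern at x
        A, c = m[:, None] * A, m * c         # fold the fixed pattern into the map
    return weights[-1] @ A, weights[-1] @ c + biases[-1]

# Sanity check on a random two-layer ReLU net: f(x) equals A @ x + c.
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(8, 3)), rng.normal(size=(1, 8))]
bs = [rng.normal(size=8), rng.normal(size=1)]
x = rng.normal(size=3)
A, c = region_affine(Ws, bs, x)
assert np.allclose(A @ x + c, Ws[1] @ np.maximum(Ws[0] @ x + bs[0], 0) + bs[1])
```

Analytic marching exploits exactly this structure: where a region's affine map crosses zero, the isosurface is a planar polygon, so the mesh can be recovered without discrete sampling.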
4. Coresets for Triangulation.
- Author
- Zhang, Qianggong and Chin, Tat-Jun
- Subjects
- VISUAL perception, MULTIPLE correspondence analysis (Statistics), IMAGE processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
Multiple-view triangulation by $\ell_\infty$ minimisation has become established in computer vision. State-of-the-art $\ell_\infty$ triangulation algorithms exploit the quasiconvexity of the cost function to derive iterative update rules that deliver the global minimum. Such algorithms, however, can be computationally costly for large problem instances that contain many image measurements, e.g., from web-based photo sharing sites or long-term video recordings. In this paper, we prove that $\ell_\infty$ triangulation admits a coreset approximation scheme, which seeks small representative subsets of the input data called coresets. A coreset possesses the special property that the error of the $\ell_\infty$ solution on the coreset is within known bounds of the global minimum. We establish the necessary mathematical underpinnings of the coreset algorithm, specifically by establishing the stopping criterion of the algorithm and proving that the resulting coreset gives the desired approximation accuracy. On large-scale triangulation problems, our method provides theoretically sound approximate solutions. Iterated until convergence, our coreset algorithm is also guaranteed to reach the true optimum. On practical datasets, we show that our technique can in fact attain the global minimiser much faster than current methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
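For context on entry 4, the quasiconvex cost it refers to is conventionally the maximum, over all measurements, of a reprojection residual that is a ratio of an affine norm to an affine (positive) depth. A schematic form, in my notation rather than the paper's:

```latex
e_i(\mathbf{X}) = \frac{\lVert \mathbf{A}_i \mathbf{X} + \mathbf{b}_i \rVert_2}{\mathbf{c}_i^{\top}\mathbf{X} + d_i},
\qquad
\min_{\mathbf{X}} \max_{i=1,\dots,N} e_i(\mathbf{X})
\quad \text{s.t.} \quad \mathbf{c}_i^{\top}\mathbf{X} + d_i > 0.
```

Each $e_i$ is quasiconvex on the region of positive depths, so the pointwise maximum is quasiconvex too; a coreset keeps only the measurements that pin down this maximum to within known bounds.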
5. Survey and Evaluation of Neural 3D Shape Classification Approaches.
- Author
- Mirbauer, Martin, Krabec, Miroslav, Krivanek, Jaroslav, and Sikudova, Elena
- Subjects
- ARTIFICIAL neural networks, MACHINE learning, OBJECT recognition (Computer vision), CLASSIFICATION, GEOMETRIC shapes, COMPUTER graphics, CLASSIFICATION algorithms
- Abstract
Classification of 3D objects – the selection of the category to which each object belongs – is of great interest in the field of machine learning. Numerous researchers use deep neural networks to address this problem, altering the network architecture and the representation of the 3D shape used as input. To investigate the effectiveness of their approaches, we first conduct an extensive survey of existing methods and identify common ideas by which we categorize them into a taxonomy. Second, we evaluate 11 selected classification networks on two 3D object datasets, extending the evaluation to a larger dataset on which most of the selected approaches have not yet been tested. For this, we provide a framework for converting shapes from common 3D mesh formats into formats native to each network, and for training and evaluating different classification approaches on this data. Although we were not always able to reach the accuracies reported in the original papers, we compare the relative performance of the approaches, as well as their performance when the dataset is the only variable changed, to provide valuable insights into performance on different kinds of data. We make our code available to simplify running training experiments with multiple neural networks with different prerequisites. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. Large-Scale Urban Reconstruction with Tensor Clustering and Global Boundary Refinement.
- Author
- Poullis, Charalambos
- Subjects
- COMPUTER vision, WEIBULL distribution, COMPUTER graphics, ALGORITHMS, GLOBAL optimization
- Abstract
Accurate and efficient methods for large-scale urban reconstruction are of significant importance to the computer vision and computer graphics communities. Although rapid acquisition techniques such as airborne LiDAR have been around for many years, creating a useful and functional virtual environment from such data remains difficult and labor intensive. This is largely due to present solutions' reliance on data-dependent, user-defined parameters. In this paper we present a new solution for automatically converting large LiDAR point clouds into simplified polygonal 3D models. The data is first divided into smaller components, which are processed independently and concurrently to extract various metrics about the points. Next, the extracted information is converted into tensors. A robust agglomerative clustering algorithm is proposed to segment the tensors into clusters representing geospatial objects, e.g., roads, buildings, etc. Unlike previous methods, the proposed tensor clustering process has no data dependencies and does not require any user-defined parameters. The required parameters are adaptively computed assuming a Weibull distribution for similarity distances. Lastly, to extract boundaries from the clusters, a new multi-stage boundary refinement process is developed by reformulating this extraction as a global optimization problem. We have extensively tested our methods on several point-cloud datasets of different resolutions which exhibit significant variability in geospatial characteristics, e.g., ground surface inclination, building density, etc., and the results are reported. The source code for both tensor clustering and global boundary refinement will be made publicly available with the publication on the author's website. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
7. Mutually Guided Image Filtering.
- Author
- Guo, Xiaojie, Li, Yu, Ma, Jiayi, and Ling, Haibin
- Subjects
- COMPUTER vision, FILTERS & filtration, IMAGE color analysis, COMPUTER graphics, MULTIMEDIA systems
- Abstract
Filtering images is required by numerous multimedia, computer vision, and graphics tasks. Despite the diverse goals of different tasks, making effective filtering rules is key to performance. Linear translation-invariant filters with manually designed kernels have been widely used. However, their performance suffers from content-blindness. To mitigate this content-blindness, a family of filters, called joint/guided filters, has attracted a great amount of attention from the community. The main drawback of most joint/guided filters is that they ignore the structural inconsistency between the reference and target signals, such as color, infrared, and depth images captured under different conditions. Simply adopting such guidance very likely leads to unsatisfactory results. To address these issues, this paper designs a simple yet effective filter, named the mutually guided image filter (muGIF), which jointly preserves mutual structures, avoids being misled by inconsistent structures, and smooths flat regions. The proposed muGIF is very flexible and can work in various modes, including dynamic-only (self-guided), static/dynamic (reference-guided), and dynamic/dynamic (mutually guided) modes. Although the objective of muGIF is non-convex in nature, by subtly decomposing the objective we can solve it effectively and efficiently. The advantages of muGIF in effectiveness and flexibility are demonstrated over other state-of-the-art alternatives on a variety of applications. Our code is publicly available at https://sites.google.com/view/xjguo/mugif. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
8. Generalized Canonical Time Warping.
- Author
- Zhou, Feng and Torre, Fernando De la
- Subjects
- ARTIFICIAL intelligence, HUMAN mechanics, MOTION capture (Human mechanics), MOTION capture (Cinematography), ANIMATION (Cinematography), COMPUTER graphics
- Abstract
Temporal alignment of human motion has been of recent interest due to its applications in animation, tele-rehabilitation and activity recognition. This paper presents generalized canonical time warping (GCTW), an extension of dynamic time warping (DTW) and canonical correlation analysis (CCA) for temporally aligning multi-modal sequences from multiple subjects performing similar activities. GCTW extends previous work on DTW and CCA in several ways: (1) it combines CCA with DTW to align multi-modal data (e.g., video and motion capture data); (2) it extends DTW by using a linear combination of monotonic functions to represent the warping path, providing a more flexible temporal warp; unlike exact DTW, which has quadratic complexity, we propose a linear-time algorithm to minimize GCTW; (3) GCTW allows simultaneous alignment of multiple sequences. Experimental results on aligning multi-modal data, facial expressions, motion capture data and video illustrate the benefits of GCTW. The code is available at http://humansensing.cs.cmu.edu/ctw. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
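As background for entry 8, here is the exact dynamic-programming DTW that GCTW generalizes; its O(nm) table is the quadratic complexity the abstract contrasts with GCTW's linear-time parametric warp. A minimal sketch for 1D sequences:

```python
import numpy as np

def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Classic O(nm) dynamic-time-warping cost between two 1D sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = dist(x[i - 1], y[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])  # insert/delete/match
    return D[n, m]
```

GCTW sidesteps the full table by expressing the warping path as a linear combination of monotonic basis functions and optimizing the coefficients.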
9. Sharable and Individual Multi-View Metric Learning.
- Author
- Hu, Junlin, Lu, Jiwen, and Tan, Yap-Peng
- Subjects
- IMAGE processing, MULTIPLE correspondence analysis (Statistics), DIGITAL image processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
This paper presents a sharable and individual multi-view metric learning (MvML) approach for visual recognition. Unlike conventional metric learning methods, which learn a distance metric on either a single type of feature representation or a concatenated representation of multiple types of features, the proposed MvML jointly learns an optimal combination of multiple distance metrics on multi-view representations: it not only learns an individual distance metric for each view to retain its specific property, but also learns a shared representation for different views in a unified latent subspace to preserve the common properties. The objective function of MvML is formulated in the large-margin learning framework via pairwise constraints, under which the distance of each similar pair is smaller than that of each dissimilar pair by a margin. Moreover, to exploit the nonlinear structure of data points, we extend MvML to a sharable and individual multi-view deep metric learning (MvDML) method by utilizing a neural network architecture to seek multiple nonlinear transformations. Experimental results on face verification, kinship verification, and person re-identification show the effectiveness of the proposed sharable and individual multi-view metric learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
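The pairwise large-margin condition stated in entry 9's abstract can be written schematically as follows (notation mine): for a similar pair (i, j) and a dissimilar pair (i, k),

```latex
d^2_{\mathbf{M}}(\mathbf{x}_i, \mathbf{x}_j) + m \;\le\; d^2_{\mathbf{M}}(\mathbf{x}_i, \mathbf{x}_k),
```

where $d_{\mathbf{M}}$ is the learned metric and $m > 0$ the margin; MvML imposes constraints of this kind per view while coupling the views through the shared latent representation.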
10. Artificial Neural Networks for Document Analysis and Recognition.
- Author
- Marinai, Simone, Gori, Marco, and Soda, Giovanni
- Subjects
- ARTIFICIAL intelligence, NEURAL circuitry, ARTIFICIAL neural networks, IMAGE processing, DOCUMENT imaging systems, COMPUTER graphics
- Abstract
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have also been effectively addressed, with very promising results. This paper surveys the most significant problems in the area of offline document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is placed on the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment. [ABSTRACT FROM AUTHOR]
- Published
- 2005
11. Colour Constancy Beyond the Classical Receptive Field.
- Author
- Akbarinia, Arash and Parraga, C. Alejandro
- Subjects
- VISUAL perception, COMPUTER graphics, IMAGE processing, IMAGE recognition (Computer vision), MULTIPLE correspondence analysis (Statistics)
- Abstract
The problem of removing illuminant variations to preserve the colours of objects (colour constancy) has already been solved by the human brain using mechanisms that rely largely on centre-surround computations of local contrast. In this paper we adapt some of these biological solutions, described by long-known physiological findings, into a simple, fully automatic, functional model (termed Adaptive Surround Modulation, or ASM). In ASM, the size of a visual neuron's receptive field (RF) as well as the relationship with its surround varies according to the local contrast within the stimulus, which in turn determines the nature of the centre-surround normalisation of cortical neurons higher up in the processing chain. We modelled colour constancy by means of two overlapping asymmetric Gaussian kernels whose sizes are adapted based on the contrast of the surround pixels, resembling the change of RF size. We simulated the contrast-dependent surround modulation by weighting the contribution of each Gaussian according to the centre-surround contrast. In the end, we obtained an estimation of the illuminant from the set of the most activated RFs' outputs. Our results on three single-illuminant and one multi-illuminant benchmark datasets show that ASM is highly competitive against the state-of-the-art and even outperforms learning-based algorithms in one case. Moreover, the robustness of our model is more tangible if we consider that our results were obtained using the same parameters for all datasets, that is, mimicking how the human visual system operates. These results suggest that dynamical adaptation mechanisms contribute to achieving higher accuracy in computational colour constancy. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
12. Force-Based Representation for Non-Rigid Shape and Elastic Model Estimation.
- Author
- Agudo, Antonio and Moreno-Noguer, Francesc
- Subjects
- VISUAL perception, IMAGE processing, COMPUTER graphics, IMAGE recognition (Computer vision), MULTIPLE correspondence analysis (Statistics)
- Abstract
This paper addresses the problem of simultaneously recovering 3D shape, pose, and the elastic model of a deformable object from only 2D point tracks in a monocular video. This is a severely under-constrained problem that has typically been addressed by enforcing the shape or the point trajectories to lie on low-rank dimensional spaces. We show that formulating the problem in terms of a low-rank force space that induces the deformation, and introducing the elastic model as an additional unknown, allows for a better physical interpretation of the resulting priors and a more accurate representation of the actual object's behavior. In order to simultaneously estimate force, pose, and the elastic model of the object, we use an expectation maximization strategy, where each of these parameters is successively learned by partial M-steps. Once the elastic model is learned, it can be transferred to similar objects to encode their 3D deformation. Moreover, our approach can robustly deal with missing data, and encodes both rigid and non-rigid points under the same formalism. We thoroughly validate the approach on mocap and real sequences, showing more accurate 3D reconstructions than the state of the art, and additionally providing an estimate of the full elastic model with no a priori information. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
13. Bayesian Helmholtz Stereopsis with Integrability Prior.
- Author
- Roubtsova, Nadejda and Guillemaut, Jean-Yves
- Subjects
- VISUAL perception, COMPUTER graphics, IMAGE processing, IMAGE recognition (Computer vision), MULTIPLE correspondence analysis (Statistics)
- Abstract
Helmholtz Stereopsis is a 3D reconstruction method uniquely independent of surface reflectance. Yet its sub-optimal maximum likelihood formulation, with drift-prone normal integration, limits performance. Through three contributions, this paper presents a complete novel pipeline for Helmholtz Stereopsis. First, we propose a Bayesian formulation replacing the maximum likelihood problem with a maximum a posteriori one. Second, we propose a tailored prior enforcing consistency between depth and normal estimates via a novel metric related to optimal surface integrability. Third, explicit surface integration is eliminated by taking advantage of the accuracy of the prior and the high resolution of the coarse-to-fine approach. The pipeline is validated quantitatively and qualitatively against alternative formulations, reaching sub-millimetre accuracy and coping with complex geometry and reflectance. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
14. SimiNet: A Novel Method for Quantifying Brain Network Similarity.
- Author
- Mheich, Ahmad, Hassan, Mahmoud, Khalil, Mohamad, Gripon, Vincent, Dufor, Olivier, and Wendling, Fabrice
- Subjects
- MULTIPLE correspondence analysis (Statistics), IMAGE processing, DIGITAL image processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
Quantifying the similarity between two networks is critical in many applications. A number of algorithms have been proposed to compute graph similarity, mainly based on the properties of nodes and edges. Interestingly, most of these algorithms ignore the physical location of the nodes, which is a key factor in the context of brain networks involving spatially defined functional areas. In this paper, we present a novel algorithm called “SimiNet” for measuring similarity between two graphs whose nodes are defined a priori within a 3D coordinate system. SimiNet provides a quantified index (ranging from 0 to 1) that accounts for node, edge, and spatiality features. Complex graphs were simulated to evaluate the performance of SimiNet, which is compared with eight state-of-the-art methods. Results show that SimiNet is able to detect weak spatial variations in compared graphs, in addition to computing similarity using both nodes and edges. SimiNet was also applied to real brain networks obtained during a visual recognition task. The algorithm shows high performance in detecting spatial variations of brain networks obtained during a naming task with two categories of visual stimuli: animals and tools. A perspective for this work is a better understanding of object categorization in the human brain. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
15. Highly Articulated Kinematic Structure Estimation Combining Motion and Skeleton Information.
- Author
- Chang, Hyung Jin and Demiris, Yiannis
- Subjects
- IMAGE processing, MULTIPLE correspondence analysis (Statistics), DIGITAL image processing, IMAGE recognition (Computer vision), COMPUTER graphics
- Abstract
In this paper, we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view 2D image sequence. In contrast to prior motion-based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology via a successive iterative merging strategy. The iterative merging process is guided by a density-weighted skeleton map, which is generated by a novel object-boundary generation method from sparse 2D feature points. Our main contributions can be summarised as follows: (i) an unsupervised complex articulated kinematic structure estimation method that combines motion segments with skeleton information; (ii) an iterative fine-to-coarse merging strategy for adaptive motion segmentation and structural topology embedding; (iii) a skeleton estimation method based on novel silhouette boundary generation from sparse feature points using an adaptive model selection method; (iv) a new highly articulated object dataset with ground truth annotation. We have verified the effectiveness of our proposed method in terms of computational time and estimation accuracy through rigorous experiments with multiple datasets. Our experiments show that the proposed method outperforms state-of-the-art methods both quantitatively and qualitatively. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
16. Direct Least Square Fitting of Hyperellipsoids.
- Author
- Kesaniemi, Martti and Virtanen, Kai
- Subjects
- LEAST squares, ELLIPSES (Geometry), GEOMETRIC surfaces, ELLIPSOIDS, COMPUTER graphics
- Abstract
This paper presents two new computationally efficient direct methods for fitting n-dimensional ellipsoids to noisy data. They conduct the fitting by minimizing the algebraic distance subject to suitable quadratic constraints. The hyperellipsoid-specific (HES) method is an elaboration of existing ellipse- and 3D-ellipsoid-specific fitting methods. It is shown that HES is ellipsoid-specific in n-dimensional space. A limitation of HES is that it may produce biased fits when the data originate from an ellipsoid with a large ratio between the longest and shortest main axes. The sum-of-discriminants (SOD) method does not have this limitation. The constraint used by SOD rejects a subset of non-ellipsoidal quadrics, which gives it a strong tendency to produce ellipsoidal solutions. Moreover, a regularization technique is presented to steer SOD's solutions towards ellipsoids. The regularization technique is also compatible with several existing 2D and 3D fitting methods. The new methods are compared through extensive numerical experiments with n-dimensional variants of three commonly used direct fitting approaches for quadratic surfaces. The results of the experiments imply that, in addition to their superior capability to produce ellipsoidal solutions, the estimation accuracy of the new methods is better than or equal to that of the reference approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
17. Multi-View Multi-Instance Learning Based on Joint Sparse Representation and Multi-View Dictionary Learning.
- Author
- Li, Bing, Yuan, Chunfeng, Xiong, Weihua, Hu, Weiming, Peng, Houwen, Ding, Xinmiao, and Maybank, Steve
- Subjects
- CLASSIFICATION algorithms, MACHINE learning, CLASSIFICATION, SUPPORT vector machines, COMPUTER graphics
- Abstract
In multi-instance learning (MIL), the relations among instances in a bag convey important contextual information in many applications. Previous studies on MIL either ignore such relations or simply model them with a fixed graph structure, so that the overall performance inevitably degrades in complex environments. To address this problem, this paper proposes a novel multi-view multi-instance learning algorithm (M$^2$IL) that combines multiple context structures in a bag into a unified framework. The novel aspects are: (i) we propose a sparse $\varepsilon$ [...]. Experiments and analyses in many practical applications prove the effectiveness of M$^2$IL. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
18. Table Detection in Online Ink Notes.
- Author
- Zhouchen Lin, Junfeng He, Zhicheng Zhong, Rongrong Wang, and Heung-Yeung Shum
- Subjects
- PEN-based computers, PERSONAL computers, COMPUTER graphics, ARTIFICIAL intelligence, GRAPHOLOGY
- Abstract
In documents, tables are important structured objects that present statistical and relational information. In this paper, we present a robust system capable of detecting tables in free-style online ink notes and extracting their structure so that they can be further edited in multiple ways. First, the primitive structure of tables, i.e., candidates for ruling lines and table bounding boxes, is detected among drawing strokes. Second, the logical structure of tables is determined by normalizing the table skeletons, identifying the skeleton structure, and extracting the cell contents. The detection process resembles a decision tree, so invalid candidates can be ruled out quickly. Experimental results suggest that our system is robust and accurate in dealing with tables that have complex structure or are drawn in complex situations. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
19. Morphological Image Compositing.
- Author
- Soille, Pierre
- Subjects
- IMAGE processing, COMPUTER graphics, DIGITAL images, REMOTE-sensing images, IMAGING systems, INFORMATION processing, AERIAL photographs
- Abstract
Image mosaicking can be defined as the registration of two or more images that are then combined into a single image. Once the images have been registered to a common coordinate system, the problem amounts to defining a selection rule that outputs a unique value for all those pixels that are present in more than one image. This process is known as image compositing. In this paper, we propose a compositing procedure based on mathematical morphology and its marker-controlled segmentation paradigm. Its aim is to position seams along salient image structures so as to diminish their visibility in the output mosaic, even in the absence of radiometric corrections or blending procedures. We also show that it is suited to the seamless minimization of undesirable transient objects occurring in regions where two or more images overlap. The proposed methodology and algorithms are illustrated for the composition of satellite images minimizing cloud cover. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
20. Guided Image Filtering.
- Author
- He, Kaiming, Sun, Jian, and Tang, Xiaoou
- Subjects
- EDGE detection (Image processing), COMPUTER graphics, HISTOGRAMS, LINEAR statistical models, JACOBIAN matrices, COMPUTER vision, LAPLACE'S equation
- Abstract
In this paper, we propose a novel explicit image filter called the guided filter. Derived from a local linear model, the guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another, different image. The guided filter can be used as an edge-preserving smoothing operator like the popular bilateral filter [1], but it has better behavior near edges. The guided filter is also a more generic concept beyond smoothing: it can transfer the structures of the guidance image to the filtering output, enabling new filtering applications like dehazing and guided feathering. Moreover, the guided filter naturally has a fast and non-approximate linear-time algorithm, regardless of the kernel size and the intensity range. Currently, it is one of the fastest edge-preserving filters. Experiments show that the guided filter is both effective and efficient in a great variety of computer vision and computer graphics applications, including edge-aware smoothing, detail enhancement, HDR compression, image matting/feathering, dehazing, and joint upsampling. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
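Entry 20's guided filter has a compact closed form built entirely from box filters, which is why its cost is independent of the kernel size. Below is a gray-scale sketch of the local linear model q = a*I + b; using scipy's uniform_filter as the box filter and its default boundary handling are choices of this sketch, not requirements of the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r, eps):
    """Filter p using guidance I (both float arrays) with window radius r."""
    mean = lambda x: uniform_filter(x, size=2 * r + 1)   # box filter
    mI, mp = mean(I), mean(p)
    cov_Ip = mean(I * p) - mI * mp       # per-window covariance of (I, p)
    var_I = mean(I * I) - mI * mI        # per-window variance of I
    a = cov_Ip / (var_I + eps)           # linear coefficient per window
    b = mp - a * mI                      # offset per window
    return mean(a) * I + mean(b)         # average overlapping window estimates
```

Calling guided_filter(I, I, r=8, eps=1e-3) self-guides for edge-preserving smoothing; passing a different guidance image enables joint applications such as guided feathering.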
21. Bidirectional Texture Function Modeling: A State of the Art Survey.
- Author
- Filip, Jiři and Haindl, Michal
- Subjects
- COMPUTER vision, COMPUTER graphics, VISUAL texture recognition, REFLECTANCE, VISUAL programming (Computer science)
- Abstract
An ever-growing number of real-world computer vision applications require classification, segmentation, retrieval, or realistic rendering of genuine materials. However, the appearance of real materials changes dramatically with variations in illumination and viewing direction. Thus, the only reliable representation of material visual properties requires capturing its reflectance in as wide a range of light and camera position combinations as possible. This is the principle of the most advanced recent texture representation, the Bidirectional Texture Function (BTF). Multispectral BTF is a seven-dimensional function that depends on view and illumination directions as well as on planar texture coordinates. BTF is typically obtained by measuring thousands of images covering many combinations of illumination and viewing angles. However, the large size of such measurements has prohibited their practical exploitation in any sensible application until recently. During the last few years, the first BTF measurement, compression, modeling, and rendering methods have emerged. In this paper, we categorize, critically survey, and psychophysically compare such approaches, which were published in this newly arising and important computer vision and graphics area. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
22. Graph Cuts via ℓ1 Norm Minimization.
- Author
- Bhusnurmath, Arvind and Taylor, Camillo J.
- Subjects
- ARTIFICIAL intelligence, COMPUTER vision, LINEAR systems, IMAGE processing, COMPUTER graphics, IMAGING systems, PATTERN recognition systems
- Abstract
Graph cuts have become an increasingly important tool for solving a number of energy minimization problems in computer vision and other fields. In this paper, the graph cut problem is reformulated as an unconstrained ℓ1 norm minimization that can be solved effectively using interior point methods. This reformulation exposes connections between graph cuts and other related continuous optimization problems. Ultimately, the problem is reduced to solving a sequence of sparse linear systems involving the Laplacian of the underlying graph. The proposed procedure exploits the structure of these linear systems in a manner that is easily amenable to parallel implementations. Experimental results obtained by applying the procedure to graphs derived from image processing problems are provided. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
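To make entry 22's reformulation concrete, a binary s-t cut objective can be posed as an unconstrained ℓ1 problem of roughly the following shape (my schematic notation; the paper's exact construction may differ):

```latex
\min_{\mathbf{x}} \;\; \sum_{(i,j)\in\mathcal{E}} w_{ij}\,\lvert x_i - x_j \rvert
\;+\; \sum_{i\in\mathcal{V}} \big( s_i\,\lvert x_i - 1 \rvert + t_i\,\lvert x_i \rvert \big),
```

where $s_i$ and $t_i$ play the role of the terminal (source/sink) capacities. Interior point steps on such an objective reduce to sparse linear systems in the graph Laplacian, matching the abstract's description.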
23. A Closed-Form Solution to Natural Image Matting.
- Author
- Levin, Anat, Lischinski, Dani, and Weiss, Yair
- Subjects
- PATTERN recognition systems, IMAGE processing, ARTIFICIAL intelligence, COMPUTER vision, COMPUTER graphics, VIDEO editing, MATRICES (Mathematics), LINEAR systems, ALGORITHMS
- Abstract
Interactive digital matting, the process of extracting a foreground object from an image based on limited user input, is an important task in image and video editing. From a computer vision perspective, this task is extremely challenging because it is massively ill-posed—at each pixel we must estimate the foreground and the background colors, as well as the foreground opacity (‘alpha matte’) from a single color measurement. Current approaches either restrict the estimation to a small part of the image, estimating foreground and background colors based on nearby pixels where they are known, or perform iterative nonlinear estimation by alternating foreground and background color estimation with alpha estimation. In this paper, we present a closed-form solution to natural image matting. We derive a cost function from local smoothness assumptions on foreground and background colors and show that in the resulting expression, it is possible to analytically eliminate the foreground and background colors to obtain a quadratic cost function in alpha. This allows us to find the globally optimal alpha matte by solving a sparse linear system of equations. Furthermore, the closed-form formula allows us to predict the properties of the solution by analyzing the eigenvectors of a sparse matrix, closely related to matrices used in spectral image segmentation algorithms. We show that high-quality mattes for natural images may be obtained from a small amount of user input. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
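Entry 23's closed form is worth spelling out: once the foreground and background colors are eliminated analytically, the matte solves a single sparse linear system. Schematically (notation mine),

```latex
\alpha^{*} = \arg\min_{\alpha}\; \alpha^{\top} L\, \alpha + \lambda\,(\alpha - \beta)^{\top} D\,(\alpha - \beta)
\;\;\Longrightarrow\;\;
(L + \lambda D)\,\alpha^{*} = \lambda D\,\beta,
```

where $L$ is the matting Laplacian obtained from the local smoothness assumptions, $D$ is a diagonal indicator of user-constrained pixels, and $\beta$ holds the scribbled values; the eigenvectors of $L$ are what the abstract's spectral analysis examines.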
24. Local or Global Minima: Flexible Dual-Front Active Contours.
- Author
- Hua Li and Yezzi, Anthony
- Subjects
- COMPUTER algorithms, THREE-dimensional imaging, COMPUTER graphics, LEVEL set methods, MAXIMA & minima, MATHEMATICAL functions, ALGORITHMS
- Abstract
Most variational active contour models are designed to find local minima of data-dependent energy functionals, with the hope that reasonable initial placement of the active contour will drive it toward a ‘desirable’ local minimum rather than an undesirable configuration due to noise or complex image structure. As such, there has been much research into the design of complex region-based energy functionals that are less likely to yield undesirable local minima than simpler edge-based energy functionals, whose sensitivity to noise and texture is significantly worse. Unfortunately, most of these more ‘robust’ region-based energy functionals are applicable to a much narrower class of imagery than typical edge-based energies, due to stronger global assumptions about the underlying image data. Devising new implementation algorithms for active contours that attempt to capture more global minimizers of already proposed image-based energies would allow us to choose an energy that makes sense for a particular class of imagery without concern over its sensitivity to local minima. Such implementations have been proposed for capturing global minima. However, sometimes the completely global minimum is just as undesirable as a minimum that is too local. In this paper, we propose a novel, fast, and flexible dual-front implementation of active contours, motivated by minimal path techniques and utilizing fast sweeping algorithms, which is easily manipulated to yield minima with variable ‘degrees’ of localness and globalness. By simply adjusting the size of the active regions, the ability to gracefully move from capturing minima that are more local (according to the initial placement of the active contour/surface) to minima that are more global allows this model to more easily obtain ‘desirable’ minimizers (which often are neither the most local nor the most global). Experiments on various 2D and 3D images, and comparisons with some active contour models and region-growing methods, are also given to illustrate the properties of this model and its performance in a variety of segmentation applications. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
25. Minimum Reliable Scale Selection in 3D.
- Author
- Wyatt, Christopher, Bayram, Ersin, and Yaorong Ge
- Subjects
- IMAGE processing, IMAGING systems, COMPUTER graphics, INFORMATION processing, THREE-dimensional imaging, VOLUMETRIC analysis
- Abstract
Multiscale analysis is often required in image processing applications because image features are optimally detected at different levels of resolution. With the advance of high-resolution 3D imaging, the extension of multiscale analysis to higher dimensions is necessary. This paper extends an existing 2D scale selection method, known as the minimum reliable scale, to 3D volumetric images. The method is applied to 3D boundary detection and is illustrated with examples from biomedical imaging. The experimental results show that 3D scale selection improves the detection of edges over single-scale operators using as few as three different scales. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
26. Frequency Domain Formulation of Active Parametric Deformable Models.
- Author
- Weruaga, Luis, Verdil, Rafael, and Morales, Juan
- Subjects
- SNAKES, COMPUTER vision, COMPUTER graphics, ELASTICITY, PATTERN recognition systems, PATTERN perception
- Abstract
Active deformable models are simple tools, very popular in computer vision and computer graphics, for solving ill-posed problems or mimicking real physical systems. The classical formulation is given in the spatial domain, the motor of the procedure is a second-order linear system, and rigidity and elasticity are the basic parameters for its characterization. This paper proposes a novel formulation based on frequency-domain analysis: the internal energy functional and the Lagrange minimization are performed entirely in the frequency domain, which leads to a simple formulation and design. The frequency-based implementation offers important computational savings in comparison with the original one, a feature that is further improved by the efficient hardware and software computation of the FFT algorithm. This new formulation focuses on the stiffness spectrum, allowing the construction of deformable models beyond the original elasticity- and rigidity-based formulation. Simulation examples validate the theoretical results. [ABSTRACT FROM AUTHOR]
- Published
- 2004
- Full Text
- View/download PDF
27. Numerical Inversion of SRNF Maps for Elastic Shape Analysis of Genus-Zero Surfaces.
- Author
- Laga, Hamid, Xie, Qian, Jermyn, Ian H., and Srivastava, Anuj
- Subjects
- SHAPE analysis (Computational geometry), ALGORITHMS, COMPUTER graphics, EUCLIDEAN algorithm, DEFORMATIONS (Mechanics)
- Abstract
Recent developments in elastic shape analysis (ESA) are motivated by the fact that it provides a comprehensive framework for simultaneous registration, deformation, and comparison of shapes. These methods achieve computational efficiency using certain square-root representations that transform invariant elastic metrics into Euclidean metrics, allowing for the application of standard algorithms and statistical tools. For analyzing shapes of embeddings of $\mathbf{S}^2$, Jermyn et al. [1] introduced square-root normal fields (SRNFs), which transform an elastic metric, with desirable invariant properties, into the $\mathbb{L}^2$ metric. [...]
- Published
- 2017
- Full Text
- View/download PDF
28. Symbol Recognition with Kernel Density Matching.
- Author
- Wan Zhang, Liu Wenyin, and Kun Zhang
- Subjects
- GRAPHIC methods, ENGINEERING graphics, COMPUTER graphics, ENGINEERING, COMBINATORICS, STATISTICAL matching
- Abstract
We propose a novel approach to similarity assessment for graphic symbols. Symbols are represented as 2D kernel densities and their similarity is measured by the Kullback-Leibler divergence. Symbol orientation is found by gradient-based angle searching or independent component analysis. Experimental results show the outstanding performance of this approach in various situations. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
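A rough illustration of entry 28's pipeline: represent each symbol's points as a 2D kernel density and compare the densities with a discretized Kullback-Leibler divergence. This sketch uses SciPy's gaussian_kde on a shared evaluation grid; bandwidth selection, grid resolution, and the orientation search are deliberately omitted.

```python
import numpy as np
from scipy.stats import gaussian_kde

def symbol_kl(points_p, points_q, grid_n=64):
    """Estimate KL(P || Q) between 2D point sets (each of shape (n, 2))."""
    p_kde, q_kde = gaussian_kde(points_p.T), gaussian_kde(points_q.T)
    pts = np.vstack([points_p, points_q])
    xs = np.linspace(pts[:, 0].min(), pts[:, 0].max(), grid_n)
    ys = np.linspace(pts[:, 1].min(), pts[:, 1].max(), grid_n)
    X, Y = np.meshgrid(xs, ys)
    grid = np.vstack([X.ravel(), Y.ravel()])     # evaluation points, shape (2, N)
    p, q = p_kde(grid), q_kde(grid)
    p, q = p / p.sum(), q / q.sum()              # normalize over the grid
    eps = 1e-12                                  # guard against log(0)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Since KL divergence is asymmetric, a practical matcher might average symbol_kl(a, b) and symbol_kl(b, a), or fix a canonical direction.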
29. Alignment of Continuous Video onto 3D Point Clouds.
- Author
- Zhao, Wenyi, Nister, David, and Hsu, Steve
- Subjects
- DIGITAL video, COMPUTER graphics, DETECTORS, ALGORITHMS, ESTIMATION theory, TECHNOLOGY
- Abstract
We propose a general framework for aligning continuous (oblique) video onto 3D sensor data. We align a point cloud computed from the video onto the point cloud directly obtained from a 3D sensor. This is in contrast to existing techniques where the 2D images are aligned to a 3D model derived from the 3D sensor data. Using point clouds enables the alignment for scenes full of objects that are difficult to model; for example, trees. To compute 3D point clouds from video, motion stereo is used along with a state-of-the-art algorithm for camera pose estimation. Our experiments with real data demonstrate the advantages of the proposed registration algorithm for texturing models in large-scale semiurban environments. The capability to align video before a 3D model is built from the 3D sensor data offers new practical opportunities for 3D modeling. We introduce a novel modeling-through-registration approach that fuses 3D information from both the 3D sensor and the video. Initial experiments with real data illustrate the potential of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
30. A General Differentiable Mesh Renderer for Image-Based 3D Reasoning.
- Author
- Liu, Shichen, Li, Tianye, Chen, Weikai, and Li, Hao
- Subjects
- IMAGE representation, DIFFERENTIABLE functions, COMPUTER graphics, TASK analysis, PIXELS
- Abstract
Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental step called rasterization, which prevents rendering from being differentiable. Unlike the state-of-the-art differentiable renderers (Kato et al. 2018 and Loper 2018), which only approximate the rendering gradient in the backpropagation, we propose a naturally differentiable rendering framework that is able to (1) directly render colorized meshes using differentiable functions and (2) back-propagate efficient supervision to mesh vertices and their attributes from various forms of image representations. The key to our framework is a novel formulation that views rendering as an aggregation function that fuses the probabilistic contributions of all mesh triangles with respect to the rendered pixels. Such a formulation enables our framework to flow gradients to occluded and distant vertices, which cannot be achieved by previous state-of-the-art methods. We show that by using the proposed renderer, one can achieve significant improvements in 3D unsupervised single-view reconstruction, both qualitatively and quantitatively. Experiments also demonstrate that our approach can handle challenging tasks in image-based shape fitting, which remain nontrivial for existing differentiable renderers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
31. A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes.
- Author
- Zhang, Yang, David, Philip, Foroosh, Hassan, and Gong, Boqing
- Subjects
- CONVOLUTIONAL neural networks, DRIVERLESS cars, COMPUTER graphics, COMPUTER-generated imagery, AUGMENTED reality, CURRICULUM
- Abstract
During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is one of the core tasks in many applications such as autonomous driving and augmented reality. However, training CNNs requires a considerable amount of data, which is difficult to collect and laborious to annotate. Recent advances in computer graphics make it possible to train CNNs on photo-realistic synthetic imagery with computer-generated annotations. Despite this, the domain mismatch between real images and synthetic data hinders the models' performance. Hence, we propose a curriculum-style learning approach to minimizing the domain gap in urban scene semantic segmentation. The curriculum domain adaptation solves easy tasks first to infer necessary properties about the target domain; in particular, the first task is to learn global label distributions over images and local distributions over landmark superpixels. These are easy to estimate because images of urban scenes have strong idiosyncrasies (e.g., the size and spatial relations of buildings, streets, cars, etc.). We then train a segmentation network while regularizing its predictions in the target domain to follow those inferred properties. In experiments, our method outperforms the baselines on two datasets and three backbone networks. We also report extensive ablation studies of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
32. Light-Efficient Photography.
- Author
- Hasinoff, Samuel W. and Kutulakos, Kiriakos N.
- Subjects
- COMPUTATIONAL photography, INTEGER programming, LENSES, COMPUTER vision, COMPUTER graphics, CAMERAS, PROBLEM solving
- Abstract
In this paper, we consider the problem of imaging a scene with a given depth of field at a given exposure level in the shortest amount of time possible. We show that by 1) collecting a sequence of photos and 2) controlling the aperture, focus, and exposure time of each photo individually, we can span the given depth of field in less total time than it takes to expose a single narrower-aperture photo. Using this as a starting point, we obtain two key results. First, for lenses with continuously variable apertures, we derive a closed-form solution for the globally optimal capture sequence, i.e., that collects light from the specified depth of field in the most efficient way possible. Second, for lenses with discrete apertures, we derive an integer programming problem whose solution is the optimal sequence. Our results are applicable to off-the-shelf cameras and typical photography conditions, and advocate the use of dense, wide-aperture photo sequences as a light-efficient alternative to single-shot, narrow-aperture photography. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
33. A New 3D-Matching Method of Nonrigid and Partially Similar Models Using Curve Analysis.
- Author
- Tabia, Hedi, Daoudi, Mohamed, Vandeborre, Jean-Philippe, and Colot, Olivier
- Subjects
- GEOMETRIC shapes, IMAGE databases, THREE-dimensional imaging, COMPUTER vision, COMPUTER graphics, MATHEMATICAL symmetry, GEODESICS, EUCLIDEAN algorithm
- Abstract
The 3D-shape matching problem plays a crucial role in many applications, such as indexing or modeling by example. Here, we present a novel approach to matching 3D objects in the presence of nonrigid transformations and partially similar models. In this paper, we use a representation of surfaces by 3D curves extracted around feature points. Surfaces are represented with a collection of closed curves, and tools from the shape analysis of curves are applied to analyze and compare them. Belief functions are used to define a global distance between 3D objects. The experimental results obtained on the TOSCA and SHREC07 datasets show that the system performs efficiently in retrieving similar 3D models. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
34. MRF Energy Minimization and Beyond via Dual Decomposition.
- Author
- Komodakis, Nikos, Paragios, Nikos, and Tziritas, Georgios
- Subjects
- COMPUTER vision, MARKOV random fields, COMPUTER algorithms, COMPUTER science, MATHEMATICAL optimization, COMPUTER graphics, LINEAR programming
- Abstract
This paper introduces a new rigorous theoretical framework to address discrete MRF-based optimization in computer vision. Such a framework exploits the powerful technique of Dual Decomposition. It is based on a projected subgradient scheme that attempts to solve an MRF optimization problem by first decomposing it into a set of appropriately chosen subproblems, and then combining their solutions in a principled way. In order to determine the limits of this method, we analyze the conditions that these subproblems have to satisfy and demonstrate the extreme generality and flexibility of such an approach. We thus show that by appropriately choosing what subproblems to use, one can design novel and very powerful MRF optimization algorithms. For instance, in this manner we are able to derive algorithms that: 1) generalize and extend state-of-the-art message-passing methods, 2) optimize very tight LP-relaxations to MRF optimization, and 3) take full advantage of the special structure that may exist in particular MRFs, allowing the use of efficient inference techniques such as, e.g., graph-cut-based methods. Theoretical analysis of the bounds related to the different algorithms derived from our framework, and experimental results/comparisons using synthetic and real data for a variety of tasks in computer vision, demonstrate the extreme potential of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
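The projected-subgradient scheme in entry 34 has a standard skeleton: duplicate the variables per subproblem, enforce agreement through Lagrange multipliers, and ascend the dual. In schematic notation (mine, not necessarily the paper's), with slave problems $f_k$ (e.g., trees of the MRF):

```latex
\max_{\{\boldsymbol{\lambda}^k\}:\;\sum_k \boldsymbol{\lambda}^k = \mathbf{0}}
\;\sum_k\; \min_{\mathbf{x}^k} \Big( f_k(\mathbf{x}^k) + \langle \boldsymbol{\lambda}^k, \mathbf{x}^k \rangle \Big),
\qquad
\boldsymbol{\lambda}^k \leftarrow \boldsymbol{\lambda}^k + \eta\,\big(\mathbf{x}^{k*} - \bar{\mathbf{x}}^{*}\big),
```

where $\mathbf{x}^{k*}$ are the slave minimizers and $\bar{\mathbf{x}}^{*}$ their average; projecting the multipliers back onto the zero-sum set keeps the dual well defined.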
35. Fixed Points of Belief Propagation—An Analysis via Polynomial Homotopy Continuation.
- Author
- Knoll, Christian, Mehta, Dhagash, Chen, Tianran, and Pernkopf, Franz
- Subjects
- IMAGE processing, MULTIPLE correspondence analysis (Statistics), DIGITAL image processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
Belief propagation (BP) is an iterative method for performing approximate inference on arbitrary graphical models. Whether BP converges, and whether the solution is a unique fixed point, depends on both the structure and the parametrization of the model. To understand this dependence it is interesting to find all fixed points. In this work, we formulate a set of polynomial equations whose solutions correspond to BP fixed points. To solve such a nonlinear system we present the numerical polynomial-homotopy-continuation (NPHC) method. Experiments on binary Ising models and on error-correcting codes show how our method is capable of obtaining all BP fixed points. On Ising models with fixed parameters we show how the structure influences both the number of fixed points and the convergence properties. We further assess the accuracy of the marginals and weighted combinations thereof. Weighting marginals with their respective partition function increases the accuracy in all experiments. Contrary to the conjecture that uniqueness of BP fixed points implies convergence, we find graphs for which BP fails to converge even though a unique fixed point exists. Moreover, we show that this fixed point gives a good approximation, and the NPHC method is able to obtain it. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
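For entry 35, the fixed points in question are the solutions of the standard sum-product message equations, which become a polynomial system in the (finite-state) messages; that is what makes homotopy continuation applicable. On a pairwise model the update reads:

```latex
\mu_{i \to j}(x_j) \;\propto\; \sum_{x_i} \phi_i(x_i)\, \psi_{ij}(x_i, x_j) \prod_{k \in N(i) \setminus j} \mu_{k \to i}(x_i),
```

and a fixed point is any set of messages left unchanged by applying this update on every edge simultaneously.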
36. Viewpoint-Consistent 3D Face Alignment.
- Author
- Tulyakov, Sergey, Jeni, Laszlo A., Cohn, Jeffrey F., and Sebe, Nicu
- Subjects
- FACE perception, VISUAL perception, MULTIPLE correspondence analysis (Statistics), IMAGE processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
Most approaches to face alignment treat the face as a 2D object, which fails to represent depth variation and is vulnerable to loss of shape consistency when the face rotates along a 3D axis. Because faces commonly rotate three-dimensionally, 2D approaches are vulnerable to significant error. 3D morphable models, employed as a second step in 2D+3D approaches, are robust to face rotation but computationally too expensive for many applications, and their ability to maintain viewpoint consistency is unknown. We present an alternative approach that estimates 3D face landmarks in a single face image. The method uses a regression-forest-based algorithm that adds a third dimension to the common cascade pipeline. 3D face landmarks are estimated directly, which avoids fitting a 3D morphable model. The proposed method achieves viewpoint consistency in a computationally efficient manner that is robust to 3D face rotation. To train and test our approach, we introduce the Multi-PIE Viewpoint Consistent database. In empirical tests, the proposed method achieved simple yet effective head pose estimation and viewpoint consistency on multiple measures relative to alternative approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
37. Learning from Narrated Instruction Videos.
- Author
- Alayrac, Jean-Baptiste, Bojanowski, Piotr, Agrawal, Nishant, Sivic, Josef, Laptev, Ivan, and Lacoste-Julien, Simon
- Subjects
- IMAGE processing, MULTIPLE correspondence analysis (Statistics), DIGITAL image processing, COMPUTER graphics, IMAGE recognition (Computer vision)
- Abstract
Automatic assistants could guide a person or a robot in performing new tasks, such as changing a car tire or repotting a plant. Creating such assistants, however, is non-trivial and requires understanding of visual and verbal content of a video. Towards this goal, we here address the problem of automatically learning the main steps of a task from a set of narrated instruction videos. We develop a new unsupervised learning approach that takes advantage of the complementary nature of the input video and the associated narration. The method sequentially clusters textual and visual representations of a task, where the two clustering problems are linked by joint constraints to obtain a single coherent sequence of steps in both modalities. To evaluate our method, we collect and annotate a new challenging dataset of real-world instruction videos from the Internet. The dataset contains videos for five different tasks with complex interactions between people and objects, captured in a variety of indoor and outdoor settings. We experimentally demonstrate that the proposed method can automatically discover, learn and localize the main steps of a task in input videos. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
38. Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications.
- Author
- Shang, Fanhua, Cheng, James, Liu, Yuanyuan, Luo, Zhi-Quan, and Lin, Zhouchen
- Subjects
- IMAGE processing, MULTIPLE correspondence analysis (Statistics), DIGITAL image processing, IMAGE recognition (Computer vision), COMPUTER graphics
- Abstract
The heavy-tailed distributions of corrupted outliers and of the singular values of all channels in low-level vision have proven to be effective priors for many applications, such as background modeling, photometric stereo, and image alignment, and they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth, and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Exploiting the analytic solutions to $\ell_p$-norm minimization for two specific values of $p$, namely $p=1/2$ and $p=2/3$, we propose two novel bilinear factor matrix norm minimization models for robust principal component analysis. We first define the double nuclear norm and Frobenius/nuclear hybrid norm penalties, and then prove that they are in essence the Schatten-$1/2$ and Schatten-$2/3$ quasi-norms, respectively, which lead to much more tractable and scalable Lipschitz optimization problems. Our experimental analysis shows that both our methods yield more accurate solutions than original Schatten quasi-norm minimization, even when the number of observations is very limited. Finally, we apply our penalties to various low-level vision problems, e.g., text removal, moving object detection, image alignment, and inpainting, and show that our methods usually outperform the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
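The bilinear factor norms in this record generalize a classical variational identity for the nuclear norm, $\|X\|_* = \min_{X=UV^T} \frac{1}{2}(\|U\|_F^2 + \|V\|_F^2)$. The sketch below only checks this well-known $p=1$ base case numerically; the paper's actual penalties are the more involved Schatten-$1/2$ and $2/3$ analogues, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))

# Nuclear norm via SVD.
A, s, Bt = np.linalg.svd(X, full_matrices=False)
nuclear = s.sum()

# The balanced factorization U = A*sqrt(S), V^T = sqrt(S)*B^T attains the minimum.
U = A * np.sqrt(s)                 # scale columns of A
Vt = np.sqrt(s)[:, None] * Bt      # scale rows of B^T
assert np.allclose(U @ Vt, X)
balanced = 0.5 * (np.linalg.norm(U)**2 + np.linalg.norm(Vt)**2)
print(nuclear, balanced)           # equal up to rounding

# Any other exact factorization X = (U M)(M^{-1} V^T) can only score higher.
M = rng.standard_normal((4, 4)) + 2 * np.eye(4)   # a generic invertible matrix
other = 0.5 * (np.linalg.norm(U @ M)**2 + np.linalg.norm(np.linalg.inv(M) @ Vt)**2)
print(other >= nuclear - 1e-9)     # True
```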
- Published
- 2018
- Full Text
- View/download PDF
39. Image Visual Realism: From Human Perception to Machine Computation.
- Author
-
Fan, Shaojing, Ng, Tian-Tsong, Koenig, Bryan Lee, Herberg, Jonathan Samuel, Jiang, Ming, Shen, Zhiqi, and Zhao, Qi
- Subjects
IMAGE processing ,VISUAL perception ,FEATURE extraction ,COMPUTER graphics ,DIGITAL image processing - Abstract
Visual realism is defined as the extent to which an image appears to people as a photo rather than computer generated. Assessing visual realism is important in applications like computer graphics rendering and photo retouching. However, current realism evaluation approaches use either labor-intensive human judgments or automated algorithms largely dependent on comparing renderings to reference images. We develop a reference-free computational framework for visual realism prediction to overcome these constraints. First, we construct a benchmark dataset of 2,520 images with comprehensive human annotated attributes. From statistical modeling on this data, we identify image attributes most relevant for visual realism. We propose both empirically-based (guided by our statistical modeling of human data) and deep convolutional neural network models to predict visual realism of images. Our framework has the following advantages: (1) it creates an interpretable and concise empirical model that characterizes human perception of visual realism; (2) it links computational features to latent factors of human image perception. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
40. Newton-Type Greedy Selection Methods for $\ell_0$-Constrained Minimization.
- Author
-
Yuan, Xiao-Tong and Liu, Qingshan
- Subjects
COMPUTER graphics ,NEWTON-Raphson method ,QUADRATIC programming ,SPARSE approximations ,ESTIMATION theory - Abstract
We introduce a family of Newton-type greedy selection methods for $\ell_0$-constrained minimization problems. The basic idea is to construct a quadratic function that approximates the original objective around the current iterate and to solve the constructed quadratic program over the cardinality constraint. The next iterate is then estimated via a line search between the current iterate and the solution of the sparse quadratic program. This iterative procedure can be interpreted as an extension of constrained Newton methods from convex minimization to non-convex $\ell_0$-constrained minimization. We show that the proposed algorithms converge asymptotically and that the rate of local convergence is superlinear up to a certain estimation error. Our methods compare favorably against several state-of-the-art greedy selection methods when applied to sparse logistic regression and sparse support vector machines. [ABSTRACT FROM PUBLISHER]
(A simplified code sketch of this iterate structure follows this record.)
- Published
- 2017
- Full Text
- View/download PDF
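As a rough illustration of the iterate structure described in record 40 (quadratic model, cardinality-constrained solve, line search), here is a hard-thresholding-style sketch for sparse least squares. It is a simplified relative of the paper's methods, not their implementation: the problem data, step sizes, and stopping rule are all arbitrary.

```python
import numpy as np

def newton_greedy_sparse_ls(A, b, k, iters=50):
    """Sketch: minimize 0.5*||Ax - b||^2 subject to ||x||_0 <= k.
    For a quadratic objective the model construction is exact, so each
    step reduces to: a conservative gradient step, greedy selection of
    the k largest entries, a restricted ('Newton') solve on that support,
    and a backtracking line search toward the solution. Note: in this
    simplified sketch the line-searched iterate can temporarily carry
    up to 2k nonzeros."""
    n = A.shape[1]
    x = np.zeros(n)
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    for _ in range(iters):
        g = A.T @ (A @ x - b)                     # gradient at the current iterate
        S = np.argsort(np.abs(x - g / L))[-k:]    # greedy support selection
        z = np.zeros(n)
        z[S] = np.linalg.lstsq(A[:, S], b, rcond=None)[0]   # restricted solve
        fx = 0.5 * np.linalg.norm(A @ x - b) ** 2
        t = 1.0
        while t > 1e-4:                           # backtracking line search x -> z
            cand = x + t * (z - x)
            if 0.5 * np.linalg.norm(A @ cand - b) ** 2 <= fx:
                x = cand
                break
            t *= 0.5
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [1.0, -2.0, 1.5]
x_hat = newton_greedy_sparse_ls(A, A @ x_true, k=3)
print(np.nonzero(np.abs(x_hat) > 1e-6)[0])        # ideally recovers {3, 17, 42}
```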
41. Compositional Model Based Fisher Vector Coding for Image Classification.
- Author
-
Liu, Lingqiao, Wang, Peng, Shen, Chunhua, Wang, Lei, Hengel, Anton van den, Wang, Chao, and Shen, Heng Tao
- Subjects
CLASSIFICATION ,INFORMATION organization ,COMPUTER graphics ,DIGITAL image processing ,IMAGE segmentation - Abstract
Derived from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To alleviate this limitation, in this work we break the convention that assumes a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as a linear combination of multiple key components, with the combination weight a latent random variable. In doing so, we greatly enhance the representative power of the generative model underlying FVC. To implement our idea, we design two particular generative models following this compositional approach. In our first model, the mean vector is sampled from the subspace spanned by a set of bases and the combination weight is drawn from a Laplace distribution. In our second model, we further assume that a local feature is composed of a discriminative part and a residual part. As a result, a local feature is generated by the linear combination of discriminative part bases and residual part bases. The decomposition into discriminative and residual parts is achieved via the guidance of a pre-trained supervised coding method. By calculating the gradient vector of the proposed models, we derive two new Fisher vector coding strategies. The first is termed Sparse Coding-based Fisher Vector Coding (SCFVC) and can be used as a substitute for traditional GMM-based FVC. The second is termed Hybrid Sparse Coding-based Fisher Vector Coding (HSCFVC), since it combines the merits of both pre-trained supervised coding methods and FVC. Using pre-trained Convolutional Neural Network (CNN) activations as local features, we experimentally demonstrate that the proposed methods are superior to traditional GMM-based FVC and achieve state-of-the-art performance in various image classification tasks. [ABSTRACT FROM PUBLISHER]
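For reference, the conventional GMM-based Fisher vector that SCFVC and HSCFVC are positioned against can be sketched as follows, using only the gradients with respect to the Gaussian means and diagonal covariances; the GMM parameters and data here are toy stand-ins, and the paper's compositional models are not shown.

```python
import numpy as np

def fisher_vector_means(X, weights, means, sigmas):
    """Toy GMM Fisher vector using only the mean gradients:
    G_k = 1/(N*sqrt(w_k)) * sum_n gamma_nk * (x_n - mu_k) / sigma_k,
    where gamma_nk are the posterior assignment probabilities."""
    N, D = X.shape
    K = weights.shape[0]
    # Posteriors gamma_nk under diagonal-covariance Gaussians.
    log_p = np.stack([
        -0.5 * np.sum(((X - means[k]) / sigmas[k])**2
                      + np.log(2 * np.pi * sigmas[k]**2), axis=1)
        + np.log(weights[k])
        for k in range(K)], axis=1)                       # shape (N, K)
    log_p -= log_p.max(axis=1, keepdims=True)             # stabilize the softmax
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    fv = [(gamma[:, k, None] * (X - means[k]) / sigmas[k]).sum(0)
          / (N * np.sqrt(weights[k])) for k in range(K)]
    return np.concatenate(fv)                             # length K*D

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8))                         # stand-in local features
K, D = 4, 8
w = np.full(K, 1.0 / K)
mu = rng.standard_normal((K, D))
sg = np.ones((K, D))
print(fisher_vector_means(X, w, mu, sg).shape)            # (32,)
```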
- Published
- 2017
- Full Text
- View/download PDF
42. Hyperbolic Harmonic Mapping for Surface Registration.
- Author
-
Shi, Rui, Zeng, Wei, Su, Zhengyu, Jiang, Jian, Damasio, Hanna, Lu, Zhonglin, Wang, Yalin, Yau, Shing-Tung, and Gu, Xianfeng
- Subjects
HARMONIC maps ,HYPERBOLIC geometry ,GEOMETRIC surfaces ,COMPUTER vision ,COMPUTER graphics ,COMPUTATIONAL geometry - Abstract
Automatic computation of surface correspondence via harmonic maps is an active research field in computer vision, computer graphics, and computational geometry. It may help document and understand physical and biological phenomena, and it has broad applications in biometrics, medical imaging, and the motion capture industry. Although numerous studies have been devoted to harmonic map research, limited progress has been made in computing a diffeomorphic harmonic map on general topology surfaces with landmark constraints. This work addresses the problem by changing the Riemannian metric on the target surface to a hyperbolic metric, so that the harmonic map is guaranteed to be a diffeomorphism under landmark constraints. The computational algorithms are based on Ricci flow and nonlinear heat diffusion methods. The approach is general and robust. We employ our algorithm to study the constrained surface registration problem, which applies to both computer vision and medical imaging applications. Experimental results demonstrate that, by changing the Riemannian metric, the registrations are always diffeomorphic and achieve relatively high performance when evaluated with popular surface registration evaluation standards. [ABSTRACT FROM AUTHOR]
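For readers unfamiliar with the objects involved, the continuous problem can be summarized in two textbook formulas (this is the standard formulation, not the paper's discretization):

```latex
E(f) \;=\; \frac{1}{2}\int_M \lVert df \rVert^2 \, dA_g,
\qquad
\frac{\partial f_t}{\partial t} \;=\; \tau(f_t),
\qquad
\tau(f) = 0 \;\iff\; f \text{ is harmonic,}
```

where $f:(M,g)\to(N,h)$ is a map between surfaces, $\tau$ is its tension field, and the heat flow monotonically decreases the energy $E$. The point of switching to a hyperbolic target metric is the classical guarantee, commonly attributed to Schoen and Yau, that a degree-one harmonic map into a negatively curved closed surface is a diffeomorphism, which is exactly the property the registration needs.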
- Published
- 2017
- Full Text
- View/download PDF
43. Fast Exact Euclidean Distance (FEED): A New Class of Adaptable Distance Transforms.
- Author
-
Schouten, Theo E. and Broek, Egon L. van den
- Subjects
EUCLIDEAN distance ,IMAGE representation ,ADAPTIVE computing systems ,COMPUTATIONAL geometry ,APPROXIMATION algorithms ,IMAGE processing ,COMPUTER vision - Abstract
A new, unique class of foldable distance transforms of digital images (DT) is introduced, baptized: fast exact Euclidean distance (FEED) transforms. FEED class algorithms calculate the DT starting directly from the definition, or rather its inverse. The principle of FEED class algorithms is introduced, followed by strategies for their efficient implementation. It is shown that FEED class algorithms unite properties of ordered propagation, raster scanning, and independent scanning DT. Moreover, FEED class algorithms are shown to have a unique property: they can be tailored to the images under investigation. Benchmarks are conducted on both the Fabbri et al. data set and on a newly developed data set. Three baseline, three approximate, and three state-of-the-art DT algorithms were included, in addition to two implementations of FEED class algorithms. The benchmarks illustrate that FEED class algorithms i) provide truly exact Euclidean DT; ii) do not suffer from disconnected Voronoi tiles, which is a unique feature for non-parallel but fast DT; iii) outperform any other approximate and exact Euclidean DT with their time complexity $O(N)$, even after optimization of the latter; and iv) are unequaled in that they can be adapted to the characteristics of the image class at hand. [ABSTRACT FROM PUBLISHER]
(A brute-force rendering of the underlying definition follows this record.)
- Published
- 2014
- Full Text
- View/download PDF
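FEED's starting point is the definition of the Euclidean distance transform itself. A direct, deliberately naive implementation of that definition looks like the sketch below; it is exact but O(N·M) for N pixels and M object pixels, i.e., the baseline that FEED class algorithms accelerate, not FEED itself.

```python
import numpy as np

def edt_bruteforce(mask):
    """Exact Euclidean distance transform straight from the definition:
    for every pixel p, the distance to the nearest object pixel q
    (mask[q] == True). Exact, but quadratic-cost; only for tiny images."""
    obj = np.argwhere(mask)                      # (M, 2) object coordinates
    H, W = mask.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pts = np.stack([ys.ravel(), xs.ravel()], 1)  # (N, 2) all pixel coordinates
    d2 = ((pts[:, None, :] - obj[None, :, :])**2).sum(-1)  # squared distances
    return np.sqrt(d2.min(axis=1)).reshape(H, W)

mask = np.zeros((8, 8), dtype=bool)
mask[2, 3] = mask[6, 6] = True                   # two object pixels
print(np.round(edt_bruteforce(mask), 2))
```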
44. Direct Method for Video-Based Navigation Using a Digital Terrain Map.
- Author
-
Lerner, Ronen and Rivlin, Ehud
- Subjects
VIDEOS ,DIGITAL mapping ,COMPUTER graphics ,FEASIBILITY studies ,ALGORITHMS ,DATA analysis ,SIMULATION methods & models ,ESTIMATION theory - Abstract
A novel vision-based navigation algorithm is proposed. The gray levels of two images, together with a Digital Terrain Map (DTM), are directly utilized to define constraints on the navigation parameters. The feasibility of the algorithm is examined both under a simulated environment and using real flight data. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
45. Design and Evaluation of More Accurate Gradient Operators on Hexagonal Lattices.
- Author
-
Shima, Tetsuo, Saito, Suguru, and Nakajima, Masayuki
- Subjects
IMAGING systems ,IMAGE processing ,LATTICE theory ,COMPUTER graphics - Abstract
Digital two-dimensional images are usually sampled on square lattices, while the receptors of the human eye follow a hexagonal structure; this is the main motivation for adopting hexagonal lattices. The fundamental operation in many image processing algorithms is extracting gradient information, and various gradient operators have been proposed and thoroughly optimized for square lattices. Accurate gradient operators for hexagonal lattices, where the distance between neighboring pixels is constant, have however not been researched thoroughly. We therefore derive consistent gradient operators on hexagonal lattices and compare them with the existing optimized filters on square lattices. The results show that the derived filters on hexagonal lattices achieve a better signal-to-noise ratio than those on square lattices. Results on artificial images also show that the derived filters on hexagonal lattices outperform the square ones with respect to the accuracy of gradient intensity and orientation detection. [ABSTRACT FROM AUTHOR]
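A minimal consistent first-order gradient estimate on a hexagonal lattice can be written in closed form: with six unit vectors $e_i$ to the neighbors at spacing $d$, one has $\sum_i e_i e_i^T = 3I$, hence $\nabla f \approx \frac{1}{3d}\sum_i f(x + d\,e_i)\,e_i$, which is exact for linear images. The sketch below checks this on a ramp; the paper's operators are larger, noise-optimized filters, which this is not.

```python
import numpy as np

d = 1.0
angles = np.deg2rad(np.arange(0, 360, 60))
E = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # six unit neighbor directions

def hex_gradient(f, x):
    """Consistent first-order gradient on a hex lattice:
    grad f(x) ~= (1/(3d)) * sum_i f(x + d*e_i) * e_i,
    using sum_i e_i e_i^T = 3*I for evenly spaced unit vectors."""
    return sum(f(x + d * e) * e for e in E) / (3.0 * d)

f = lambda p: 2.0 * p[0] - 0.5 * p[1] + 1.0             # linear ramp, grad = (2, -0.5)
print(np.round(hex_gradient(f, np.array([0.3, -1.2])), 10))  # -> [ 2.  -0.5]
```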
- Published
- 2010
- Full Text
- View/download PDF
46. A Fast Sweeping Method for Computing Geodesics on Triangular Manifolds.
- Author
-
Song-Gang Xu, Yun-Xiang Zhang, and Jun-Hai Yong
- Subjects
COMPUTER graphics ,GEODESY ,MANIFOLDS (Mathematics) ,ESTIMATION theory ,WAVES (Physics) - Abstract
A wide range of applications in computational intelligence and computer graphics require computing geodesics accurately and efficiently. The fast marching method (FMM) is widely used to solve this problem; its complexity is O(N log N), where N is the total number of nodes on the manifold. A fast sweeping method (FSM) is proposed and applied on arbitrary triangular manifolds, reducing the complexity to O(N). By traversing the undirected graph, four orderings are built to produce two groups of interfering waves, which cover all directions of characteristics. The correctness of this method is proved by analyzing the coverage of characteristics. The convergence and error estimation are also presented. [ABSTRACT FROM AUTHOR]
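The sweeping idea is easiest to see on a regular grid for the eikonal equation $|\nabla u| = 1$: Gauss-Seidel updates under a few alternating sweep orderings, each covering one family of characteristic directions. The sketch below is the standard grid variant, not the paper's triangular-manifold algorithm; grid size, sweep count, and the point source are illustrative.

```python
import numpy as np

def fast_sweep_eikonal(seed_mask, h=1.0, sweeps=4):
    """Fast sweeping for |grad u| = 1 on a grid: Gauss-Seidel updates in
    four alternating sweep orderings; total work is O(N)."""
    H, W = seed_mask.shape
    u = np.where(seed_mask, 0.0, 1e10)
    orders = [(range(H), range(W)), (range(H), range(W - 1, -1, -1)),
              (range(H - 1, -1, -1), range(W)),
              (range(H - 1, -1, -1), range(W - 1, -1, -1))]
    for _ in range(sweeps):
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if seed_mask[i, j]:
                        continue
                    a = min(u[i - 1, j] if i > 0 else 1e10,
                            u[i + 1, j] if i < H - 1 else 1e10)
                    b = min(u[i, j - 1] if j > 0 else 1e10,
                            u[i, j + 1] if j < W - 1 else 1e10)
                    if abs(a - b) >= h:          # update comes from one axis only
                        new = min(a, b) + h
                    else:                        # two-sided quadratic update
                        new = 0.5 * (a + b + np.sqrt(2 * h * h - (a - b)**2))
                    u[i, j] = min(u[i, j], new)
    return u

seeds = np.zeros((32, 32), dtype=bool)
seeds[16, 16] = True
u = fast_sweep_eikonal(seeds)
print(round(u[16, 31], 3))   # ~15 along the axis for this point source
```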
- Published
- 2010
- Full Text
- View/download PDF
47. Texture Analysis and Segmentation Using Modulation Features, Generative Models, and Weighted Curve Evolution.
- Author
-
Kokkinos, Iasonas, Evangelopoulos, Georgios, and Maragos, Petros
- Subjects
TEXTURE mapping ,THREE-dimensional imaging ,COMPUTER graphics ,COMPUTER vision ,IMAGE processing ,PATTERN recognition systems - Abstract
In this work, we approach the analysis and segmentation of natural textured images by combining ideas from image analysis and probabilistic modeling. We rely on AM-FM texture models and, specifically, on the Dominant Component Analysis (DCA) paradigm for feature extraction. This method provides a low-dimensional, dense, and smooth descriptor, capturing the essential aspects of texture, namely, scale, orientation, and contrast. Our contributions are at three levels of the texture analysis and segmentation problems: First, at the feature extraction stage, we propose a Regularized Demodulation Algorithm that provides more robust texture features and we explore the merits of modifying the channel selection criterion of DCA. Second, we propose a probabilistic interpretation of DCA and Gabor filtering in general, in terms of Local Generative Models. Extending this point of view to edge detection facilitates the estimation of posterior probabilities for the edge and texture classes. Third, we propose the Weighted Curve Evolution scheme that enhances curve evolution-based segmentation methods by allowing for the locally adaptive combination of heterogeneous cues. Our segmentation results are evaluated on the Berkeley Segmentation Benchmark and compare favorably to current state-of-the-art methods. [ABSTRACT FROM AUTHOR]
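The dominant component selection at the heart of DCA can be caricatured with a small Gabor filterbank: filter the image, take amplitude envelopes, and keep, per pixel, the channel with the largest response. The filterbank parameters and toy image below are made up, and the paper's regularized demodulation and probabilistic interpretation are refinements on top of this baseline.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, sigma=4.0, size=25):
    """Complex Gabor: Gaussian envelope times a complex carrier."""
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    xr = X * np.cos(theta) + Y * np.sin(theta)
    env = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
    return env * np.exp(2j * np.pi * freq * xr)

def dominant_component(img, freqs=(0.1, 0.2),
                       thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
    """Per pixel, keep the channel whose amplitude envelope is largest;
    returns that amplitude and the winning channel index."""
    responses = np.stack([np.abs(fftconvolve(img, gabor_kernel(f, t), mode='same'))
                          for f in freqs for t in thetas])
    return responses.max(axis=0), responses.argmax(axis=0)

# Toy two-texture image: horizontal stripes left, vertical stripes right.
y, x = np.mgrid[0:64, 0:64]
img = np.where(x < 32, np.sin(2*np.pi*0.2*y), np.sin(2*np.pi*0.2*x)).astype(float)
amp, idx = dominant_component(img)
# Away from borders, the dominant channel differs between the two textures.
print(np.unique(idx[10:54, :20]), np.unique(idx[10:54, 44:]))
```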
- Published
- 2009
- Full Text
- View/download PDF
48. Locally Rotation, Contrast, and Scale Invariant Descriptors for Texture Analysis.
- Author
-
Mellor, Matthew, Byung-Woo Hong, and Brady, Michael
- Subjects
TEXTURE (Art) ,DIGITAL images ,TEXTURE mapping ,COMPUTER graphics ,COMPUTERS ,DIGITAL image processing ,MATHEMATICAL symmetry ,GROUP theory ,COMBINATORICS - Abstract
Textures within real images vary in brightness, contrast, scale, and skew as imaging conditions change. To enable recognition of textures in real images, it is necessary to employ a similarity measure that is invariant to these properties. Furthermore, since textures often appear on undulating surfaces, such invariances must necessarily be local rather than global. Despite these requirements, it is only relatively recently that texture recognition algorithms with local scale and affine invariance properties have begun to be reported. Typically, they comprise detecting feature points followed by geometric normalization prior to description. We describe a method based on invariant combinations of linear filters. Unlike previous methods, we introduce a novel family of filters which provides scale invariance, resulting in a texture description invariant to local changes in orientation, contrast, and scale and robust to local skew. Significantly, the family of filters enables local scale invariants to be defined without using a scale selection principle or a large number of filters. A texture discrimination method based on the χ² similarity measure, applied to histograms derived from our filter responses, outperforms existing methods in retrieval and classification on both the Brodatz textures and the University of Illinois, Urbana-Champaign (UIUC) database, which has been designed to require local invariance. [ABSTRACT FROM AUTHOR]
(A sketch of the χ² histogram comparison follows this record.)
- Published
- 2008
- Full Text
- View/download PDF
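The χ² comparison in record 48 is a bin-wise distance between normalized histograms; one common convention (factor-of-two variants exist) is $\chi^2(h,g) = \tfrac{1}{2}\sum_b (h_b - g_b)^2/(h_b + g_b)$. A sketch with stand-in filter-response histograms:

```python
import numpy as np

def chi2_distance(h, g, eps=1e-12):
    """Chi-squared distance between two normalized histograms
    (one common convention; a factor-of-two variant also appears)."""
    h = h / h.sum()
    g = g / g.sum()
    return 0.5 * np.sum((h - g)**2 / (h + g + eps))

rng = np.random.default_rng(0)
# Stand-ins for histograms of invariant filter responses of two textures.
a = rng.gamma(2.0, size=10000)
b = rng.gamma(2.5, size=10000)
bins = np.linspace(0, 15, 33)
ha, _ = np.histogram(a, bins)
hb, _ = np.histogram(b, bins)
print(chi2_distance(ha.astype(float), hb.astype(float)))
```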
49. Surface Dependent Representations for Illumination Insensitive Image Comparison.
- Author
-
Osadchy, Margarita, Jacobs, David W., and Lindenbaum, Michael
- Subjects
VIDEO lighting ,DIGITAL images ,GAUSSIAN processes ,COMPUTER graphics ,PIXELS ,DIGITAL image processing ,IMAGE analysis - Abstract
We consider the problem of matching images to tell whether they come from the same scene viewed under different lighting conditions. We show that the surface characteristics determine the type of image comparison method that should be used. Previous work has shown the effectiveness of comparing the image gradient direction for surfaces with material properties that change rapidly in one direction. We show analytically that two other widely used methods, normalized correlation of small windows and comparison of multiscale oriented filters, essentially compute the same thing. Then, we show that for surfaces whose properties change more slowly, comparison of the output of whitening filters is most effective. This suggests that a combination of these strategies should be employed to compare general objects. We discuss indications that Gabor jets use such a mixed strategy effectively, and we propose a new mixed strategy. We validate our results on synthetic and real images. [ABSTRACT FROM AUTHOR]
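Two of the comparison methods analyzed in this record are easy to state in code: normalized correlation of small windows and comparison of image gradient directions. The sketch below implements simple variants of both (the robust, magnitude-weighted versions discussed in the literature are not shown) and checks them on a crude synthetic lighting change.

```python
import numpy as np

def normalized_correlation(w1, w2, eps=1e-12):
    """Normalized correlation of two image windows: subtract means and
    divide by norms; invariant to local affine intensity changes."""
    a = w1 - w1.mean()
    b = w2 - w2.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def gradient_direction_distance(img1, img2):
    """Mean angular difference between gradient directions (a simple,
    unweighted variant; robust versions weight by gradient magnitude)."""
    g1y, g1x = np.gradient(img1)
    g2y, g2x = np.gradient(img2)
    d1 = np.arctan2(g1y, g1x)
    d2 = np.arctan2(g2y, g2x)
    diff = np.angle(np.exp(1j * (d1 - d2)))      # wrap to (-pi, pi]
    return float(np.abs(diff).mean())

rng = np.random.default_rng(0)
scene = rng.random((32, 32))
relit = 0.6 * scene + 0.2                         # crude global lighting change
print(normalized_correlation(scene, relit))       # ~1.0: same scene
print(gradient_direction_distance(scene, relit))  # ~0.0: directions preserved
```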
- Published
- 2007
- Full Text
- View/download PDF
50. Pores and Ridges: High-Resolution Fingerprint Matching Using Level 3 Features.
- Author
-
Jain, Anil K., Yi Chen, and Demirkus, Meltem
- Subjects
HUMAN fingerprints ,SCANNING systems ,DETECTORS ,NATIONAL security ,ANTHROPOMETRY ,COMPUTER algorithms ,COMPUTER graphics - Abstract
Fingerprint friction ridge details are generally described in a hierarchical order at three different levels, namely, Level 1 (pattern), Level 2 (minutia points), and Level 3 (pores and ridge contours). Although latent print examiners frequently take advantage of Level 3 features to assist in identification, Automated Fingerprint Identification Systems (AFIS) currently rely only on Level 1 and Level 2 features. In fact, the Federal Bureau of Investigation's (FBI) standard of fingerprint resolution for AFIS is 500 pixels per inch (ppi), which is inadequate for capturing Level 3 features, such as pores. With the advances in fingerprint sensing technology, many sensors are now equipped with dual resolution (500 ppi/1,000 ppi) scanning capability. However, increasing the scan resolution alone does not necessarily provide any performance improvement in fingerprint matching, unless an extended feature set is utilized. As a result, a systematic study to determine how much performance gain one can achieve by introducing Level 3 features in AFIS is highly desired. We propose a hierarchical matching system that utilizes features at all the three levels extracted from 1,000 ppi fingerprint scans. Level 3 features, including pores and ridge contours, are automatically extracted using Gabor filters and wavelet transform and are locally matched using the Iterative Closest Point (ICP) algorithm. Our experiments show that Level 3 features carry significant discriminatory information. There is a relative reduction of 20 percent in the equal error rate (EER) of the matching system when Level 3 features are employed in combination with Level 1 and 2 features. This significant performance gain is consistently observed across various quality fingerprint images. [ABSTRACT FROM AUTHOR]
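The local pore matching step relies on ICP; a compact 2D rigid ICP (nearest-neighbor matching followed by a closed-form SVD alignment, iterated) looks roughly like the sketch below. Everything around it in the paper (pore extraction with Gabor filters and wavelets, local patch handling, score fusion) is not shown, and the point sets here are synthetic.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_2d(src, dst, iters=20):
    """Minimal rigid 2D ICP: match each source point to its nearest
    destination point, solve for the best rotation+translation in
    closed form (SVD / Procrustes), and repeat."""
    R, t = np.eye(2), np.zeros(2)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                  # nearest-neighbor matches
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)     # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # proper rotation
        R_step = Vt.T @ D @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step    # accumulate the transform
    return R, t

rng = np.random.default_rng(0)
pores = rng.random((60, 2)) * 100                 # stand-in pore coordinates
ang = np.deg2rad(7.0)
R_true = np.array([[np.cos(ang), -np.sin(ang)], [np.sin(ang), np.cos(ang)]])
probe = pores @ R_true.T + np.array([3.0, -2.0]) + rng.normal(0, 0.1, (60, 2))
R_est, t_est = icp_2d(pores, probe)
print(np.round(R_est, 3))                         # ~R_true
```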
- Published
- 2007
- Full Text
- View/download PDF