Author: "Jordi Pont-Tuset" / Topic: business - Searchworks@Jio Institute Digital Library Search Results

Your search for "Jordi Pont-Tuset" returned 22 results.

1. Connecting Vision and Language with Localized Narratives

Author: Jordi Pont-Tuset, Vittorio Ferrari, Jasper Uijlings, Radu Soricut, and Soravit Changpinyo
Subjects: Closed captioning, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, 010501 environmental sciences, computer.software_genre, 01 natural sciences, Image (mathematics), Mouseover, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Narrative, Artificial intelligence, business, computer, Word (computer architecture), Natural language processing, 0105 earth and related environmental sciences, TRACE (psycholinguistics)
Abstract: We propose Localized Narratives, a new form of multimodal image annotations connecting vision and language. We ask annotators to describe an image with their voice while simultaneously hovering their mouse over the region they are describing. Since the voice and the mouse pointer are synchronized, we can localize every single word in the description. This dense visual grounding takes the form of a mouse trace segment per word and is unique to our data. We annotated 849k images with Localized Narratives: the whole COCO, Flickr30k, and ADE20K datasets, and 671k images of Open Images, all of which we make publicly available. We provide an extensive analysis of these annotations showing they are diverse, accurate, and efficient to produce. We also demonstrate their utility on the application of controlled image captioning.
Published: 2020
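The word-level grounding described in the abstract above can be illustrated with a minimal sketch. It assumes that each spoken word already carries start and end timestamps and that the mouse trace is a list of timestamped points; the function name `trace_segment_per_word` and both data layouts are hypothetical and are not taken from the paper or its released data format.

```python
def trace_segment_per_word(words, trace):
    """Assign to each word the mouse-trace points recorded while it was spoken.

    words: list of (word, start_time, end_time) tuples (assumed layout).
    trace: list of (time, x, y) tuples (assumed layout).
    Returns a list of (word, [(x, y), ...]) pairs, one per word.
    """
    segments = []
    for word, start, end in words:
        # Half-open interval so a point on a word boundary belongs to one word only.
        points = [(x, y) for t, x, y in trace if start <= t < end]
        segments.append((word, points))
    return segments
```

Because voice and pointer share one clock, slicing the trace by each word's time interval is enough to obtain one trace segment per word in this simplified setting.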

2. Video Object Segmentation without Temporal Information

Author: Jordi Pont-Tuset, Kevis-Kokitsi Maninis, L. Van Gool, Daniel Cremers, Laura Leal-Taixé, Yuhua Chen, and Sergi Caelles
Subjects: FOS: Computer and information sciences, Technology, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Computer Science, Artificial Intelligence, Redundancy (information theory), Engineering, Artificial Intelligence, Video object segmentation, convolutional neural networks, 0202 electrical engineering, electronic engineering, information engineering, Segmentation, Computer vision, Temporal information, Science & Technology, IMAGE SEGMENTATION, business.industry, Applied Mathematics, Engineering, Electrical & Electronic, Video processing, Image segmentation, semantic segmentation, Computational Theory and Mathematics, Computer Science, instance segmentation, Task analysis, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, Software
Abstract: Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly or they may not even produce any result at all. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS-S), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent video segmentation databases, which show that OSVOS-S is both the fastest and most accurate method in the state of the art.
Comment: Accepted to T-PAMI. Extended version of "One-Shot Video Object Segmentation", CVPR 2017 (arXiv:1611.05198). Project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos/
Published: 2019

3. Deep Extreme Cut: From Extreme Points to Object Segmentation

Author: L. Van Gool, Sergi Caelles, Jordi Pont-Tuset, and Kevis-Kokitsi Maninis
Subjects: FOS: Computer and information sciences, Channel (digital image), Pixel, Computer science, business.industry, Computer Vision and Pattern Recognition (cs.CV), Gaussian, Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Pattern recognition, 02 engineering and technology, Image segmentation, Object (computer science), Convolutional neural network, 030218 nuclear medicine & medical imaging, 03 medical and health sciences, symbols.namesake, 0302 clinical medicine, 0202 electrical engineering, electronic engineering, information engineering, symbols, 020201 artificial intelligence & image processing, Segmentation, Artificial intelligence, Extreme point, business
Abstract: This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, also with less user input, in an extensive and varied selection of benchmarks and datasets. All our models and code are publicly available on http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/.
Comment: CVPR 2018 camera ready. Project webpage and code: http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/
Published: 2018
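The extra input channel described in the abstract above (a Gaussian centered on each extreme point) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the default `sigma`, and the choice to combine the four Gaussians with a pixel-wise maximum are all hypothetical.

```python
import math


def extreme_points(mask):
    """Return the [left, right, top, bottom] pixels of a binary mask as (row, col)."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    left = min(pixels, key=lambda p: p[1])
    right = max(pixels, key=lambda p: p[1])
    top = min(pixels, key=lambda p: p[0])
    bottom = max(pixels, key=lambda p: p[0])
    return [left, right, top, bottom]


def gaussian_channel(height, width, points, sigma=2.0):
    """Build a heatmap with a 2D Gaussian centered on each point.

    Assumption: overlapping Gaussians are merged with a pixel-wise maximum,
    and sigma is an arbitrary illustrative value.
    """
    channel = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            channel[r][c] = max(
                math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2 * sigma ** 2))
                for pr, pc in points
            )
    return channel
```

In such a setup the resulting heatmap would be stacked with the RGB image as a fourth channel before being passed to the network.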

4. The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

Author: Jordi Pont-Tuset, Tom Duerig, Matteo Malloci, Shahab Kamali, Alina Kuznetsova, Hassan Rom, Neil Alldrin, Vittorio Ferrari, Ivan Krasin, Stefan Popov, Alexander Kolesnikov, and Jasper Uijlings
Subjects: FOS: Computer and information sciences, Class (computer programming), Contextual image classification, business.industry, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Pattern recognition, 02 engineering and technology, Object (computer science), Object detection, Image (mathematics), Artificial Intelligence, Bounding overwatch, Pattern recognition (psychology), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Scale (map), business, Software
Abstract: We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection and visual relationship detection. The images have a Creative Commons Attribution license that allows sharing and adapting the material, and they have been collected from Flickr without a predefined list of class names or tags, leading to natural class statistics and avoiding an initial design bias. Open Images V4 offers large scale across several dimensions: 30.1M image-level labels for 19.8k concepts, 15.4M bounding boxes for 600 object classes, and 375k visual relationship annotations involving 57 classes. For object detection in particular, we provide 15x more bounding boxes than the next largest datasets (15.4M boxes on 1.9M images). The images often show complex scenes with several objects (8 annotated objects per image on average). We annotated visual relationships between them, which support visual relationship detection, an emerging task that requires structured reasoning. We provide in-depth comprehensive statistics about the dataset, we validate the quality of the annotations, we study how the performance of several modern models evolves with increasing amounts of training data, and we demonstrate two applications made possible by having unified annotations of multiple types coexisting in the same images. We hope that the scale, quality, and variety of Open Images V4 will foster further research and innovation even beyond the areas of image classification, object detection, and visual relationship detection.
Comment: Accepted to International Journal of Computer Vision, 2020
Published: 2018

5. Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning

Author: Alberto Montes, Yuhua Chen, Jordi Pont-Tuset, and Luc Van Gool
Subjects: FOS: Computer and information sciences, Pixel, business.industry, Computer science, Computer Vision and Pattern Recognition (cs.CV), Feature extraction, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, 02 engineering and technology, 010501 environmental sciences, Object (computer science), 01 natural sciences, Set (abstract data type), Metric (mathematics), 0202 electrical engineering, electronic engineering, information engineering, Embedding, 020201 artificial intelligence & image processing, Segmentation, Computer vision, Artificial intelligence, business, 0105 earth and related environmental sciences
Abstract: This paper tackles the problem of video object segmentation, given some user annotation which indicates the object of interest. The problem is formulated as pixel-wise retrieval in a learned embedding space: we embed pixels of the same object instance into the vicinity of each other, using a fully convolutional network trained by a modified triplet loss as the embedding model. Then the annotated pixels are set as reference and the rest of the pixels are classified using a nearest-neighbor approach. The proposed method supports different kinds of user input such as a segmentation mask in the first frame (semi-supervised scenario), or a sparse set of clicked points (interactive scenario). In the semi-supervised scenario, we achieve results competitive with the state of the art but at a fraction of the computation cost (275 milliseconds per frame). In the interactive scenario where the user is able to refine their input iteratively, the proposed method provides instant response to each input, and reaches comparable quality to competing methods with much less interaction.
Comment: Accepted to CVPR 2018
Published: 2018
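The nearest-neighbor step described in the abstract above can be sketched in a few lines. The sketch assumes the per-pixel embeddings have already been computed by some embedding model (the network and its triplet loss are not reproduced here); the function name `classify_pixels`, the tuple layout, and the use of squared Euclidean distance are hypothetical choices for illustration.

```python
def classify_pixels(reference, queries):
    """Label each query embedding with the label of its nearest reference embedding.

    reference: list of (embedding, label) pairs for the annotated pixels.
    queries: list of embeddings for the remaining pixels.
    Distance is squared Euclidean (an assumption; any metric could be swapped in).
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(reference, key=lambda ref: squared_distance(ref[0], query))[1]
        for query in queries
    ]
```

Since the reference set is just a list of labeled embeddings, extra user clicks can be appended to it without retraining anything, which is consistent with the fast interactive response the abstract reports.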

6. Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks

Author: Kevis-Kokitsi Maninis, Luc Van Gool, Jordi Pont-Tuset, and Pablo Arbeláez
Subjects: Boundary detection, FOS: Computer and information sciences, Computer science, Computer Vision and Pattern Recognition (cs.CV), Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, Scale-space segmentation, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Segmentation, Computer vision, 0105 earth and related environmental sciences, computer.programming_language, Contextual image classification, business.industry, Applied Mathematics, Pattern recognition, Pascal (programming language), Image segmentation, Object detection, Computational Theory and Mathematics, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Software
Abstract: We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for multi-scale contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state of the art, and it generalizes very well to unseen categories and datasets. In particular, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments for low-level applications on BSDS, PASCAL Context, PASCAL Segmentation, and NYUD to evaluate boundary detection performance, showing that COB provides state-of-the-art contours and region hierarchies in all datasets. We also evaluate COB on high-level tasks when coupled with multiple pipelines for object proposals, semantic contours, semantic segmentation, and object detection on MS-COCO, SBD, and PASCAL, showing that COB also improves the results for all tasks.
Comment: Accepted by T-PAMI. Extended version of "Convolutional Oriented Boundaries", ECCV 2016 (arXiv:1608.02755). Project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/cob/
Published: 2017

7. Scale-Aware Alignment of Hierarchical Image Segmentation

Author: Dengxin Dai, Jordi Pont-Tuset, Yuhua Chen, and Luc Van Gool
Subjects: Scale (ratio), Hierarchy (mathematics), Computer science, Segmentation-based object categorization, business.industry, Scale-space segmentation, 020207 software engineering, Pattern recognition, 02 engineering and technology, Image segmentation, PSI_VISICS, Tree (data structure), Minimum spanning tree-based segmentation, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Segmentation, Artificial intelligence, business
Abstract: Image segmentation is a key component in many computer vision systems, and it is recovering a prominent spot in the literature as methods improve and overcome their limitations. The outputs of most recent algorithms are in the form of a hierarchical segmentation, which provides segmentation at different scales in a single tree-like structure. Commonly, these hierarchical methods start from some low-level features, and are not aware of the scale information of the different regions in them. As such, one might need to work on many different levels of the hierarchy to find the objects in the scene. This work tries to modify the existing hierarchical algorithms by improving their alignment, that is, by trying to modify the depth of the regions in the tree to better couple depth and scale. To do so, we first train a regressor to predict the scale of regions using mid-level features. We then define the anchor slice as the set of regions that better balance over-segmentation and under-segmentation. The output of our method is an improved hierarchy, re-aligned by the anchor slice. To demonstrate the power of our method, we perform comprehensive experiments, which show that our method, as a post-processing step, can significantly improve the quality of the hierarchical segmentation representations, and ease the usage of hierarchical image segmentation in high-level vision tasks such as object segmentation. We also prove that the improvement generalizes well across different algorithms and datasets, with a low computational cost.
Comment: © 2016 IEEE. Chen Y., Dai D., Pont-Tuset J., Van Gool L., "Scale-aware alignment of hierarchical image segmentation", 29th IEEE Conference on Computer Vision and Pattern Recognition - CVPR 2016, 9 pp., June 26 - July 1, 2016, Las Vegas, Nevada, USA.
Is part of: Proceedings CVPR 2016, vol. 2016-December, pages 364-372; IEEE Conference on Computer Vision and Pattern Recognition - CVPR 2016, Las Vegas, Nevada, USA, 27 Jun - 30 Jun 2016. Status: published
Published: 2016

8. One-Shot Video Object Segmentation

Author: L. Van Gool, Jordi Pont-Tuset, Laura Leal-Taixé, Sergi Caelles, Kevis-Kokitsi Maninis, and Daniel Cremers
Subjects: FOS: Computer and information sciences, business.industry, Computer science, Deep learning, Computer Vision and Pattern Recognition (cs.CV), Frame (networking), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, 02 engineering and technology, Image segmentation, Object (computer science), Task (project management), ComputingMethodologies_PATTERNRECOGNITION, Margin (machine learning), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Segmentation, Artificial intelligence, business, ComputingMilieux_MISCELLANEOUS
Abstract: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), ISBN:978-1-5386-0457-1, ISBN:978-1-5386-0458-8
Published: 2016

9. Deep Retinal Image Understanding

Author: Pablo Arbeláez, Jordi Pont-Tuset, Kevis-Kokitsi Maninis, Luc Van Gool, Sébastien Ourselin, Leo Joskowicz, Mert R. Sabuncu, Gözde B. Ünal, and William Wells
Subjects: FOS: Computer and information sciences, Contextual image classification, genetic structures, Computer science, business.industry, Deep learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 02 engineering and technology, PSI_VISICS, Fundus (eye), Convolutional neural network, Object detection, Retinal image, 030218 nuclear medicine & medical imaging, Set (abstract data type), 03 medical and health sciences, 0302 clinical medicine, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business
Abstract: This paper presents Deep Retinal Image Understanding (DRIU), a unified framework of retinal image analysis that provides both retinal vessel and optic disc segmentation. We make use of deep Convolutional Neural Networks (CNNs), which have proven revolutionary in other fields of computer vision such as object detection and image classification, and we bring their power to the study of eye fundus images. DRIU uses a base network architecture on which two sets of specialized layers are trained to solve both the retinal vessel and optic disc segmentation. We present experimental validation, both qualitative and quantitative, in four public datasets for these tasks. In all of them, DRIU presents super-human performance, that is, it shows results more consistent with a gold standard than a second human annotator used as control.
Comment: MICCAI 2016 Camera Ready
Published: 2016

10. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016

Author: Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Luc Van Gool, and Pablo Arbeláez
Subjects: Network architecture, genetic structures, Contextual image classification, business.industry, Computer science, 02 engineering and technology, Experimental validation, Convolutional neural network, Object detection, Retinal image, 030218 nuclear medicine & medical imaging, Set (abstract data type), 03 medical and health sciences, 0302 clinical medicine, 0202 electrical engineering, electronic engineering, information engineering, Optic disc segmentation, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business
Abstract: This paper presents Deep Retinal Image Understanding (DRIU), a unified framework of retinal image analysis that provides both retinal vessel and optic disc segmentation. We make use of deep Convolutional Neural Networks (CNNs), which have proven revolutionary in other fields of computer vision such as object detection and image classification, and we bring their power to the study of eye fundus images. DRIU uses a base network architecture on which two sets of specialized layers are trained to solve both the retinal vessel and optic disc segmentation. We present experimental validation, both qualitative and quantitative, in four public datasets for these tasks. In all of them, DRIU presents super-human performance, that is, it shows results more consistent with a gold standard than a second human annotator used as control.
Published: 2016

11. Supervised evaluation of image segmentation and object proposal techniques

Author: Ferran Marques, Jordi Pont-Tuset, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Computer science, Scale-space segmentation, Supervised evaluation, Context (language use), 02 engineering and technology, Imatges -- Processament, Artificial Intelligence, Object proposals, Meta-measures, 0202 electrical engineering, electronic engineering, information engineering, Segmentation, Reconeixement de formes (Informàtica), Measure (data warehouse), Ground truth, Image segmentation, business.industry, Segmentation-based object categorization, Applied Mathematics, Enginyeria de la telecomunicació::Processament del senyal::Reconeixement de formes [Àrees temàtiques de la UPC], 020207 software engineering, Pattern recognition, Pattern recognition systems, Object detection image segmentation, Enginyeria de la telecomunicació::Processament del senyal::Processament de la imatge i del senyal vídeo [Àrees temàtiques de la UPC], Object (computer science), Computational Theory and Mathematics, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, Software, Algorithms
Abstract: This paper tackles the supervised evaluation of image segmentation and object proposal algorithms. It surveys, structures, and deduplicates the measures used to compare both segmentation results and object proposals with a ground truth database, and proposes a new measure: the precision-recall for objects and parts. To compare the quality of these measures, eight state-of-the-art object proposal techniques are analyzed and two quantitative meta-measures involving nine state-of-the-art segmentation methods are presented. The meta-measures consist of assuming some plausible hypotheses about the results and assessing how well each measure reflects these hypotheses. As a conclusion of the performed experiments, this paper proposes the tandem of precision-recall curves for boundaries and for objects-and-parts as the tool of choice for the supervised evaluation of image segmentation. We make the datasets and code of all the measures publicly available.
Published: 2016

12. Boosting Object Proposals: From Pascal to COCO

Author: Luc Van Gool and Jordi Pont-Tuset
Subjects: Boosting (machine learning), business.industry, Computer science, Artificial intelligence, Image segmentation, Pascal (programming language), business, Machine learning, computer.software_genre, computer, computer.programming_language, Visualization
Abstract: Computer vision in general, and object proposals in particular, are nowadays strongly influenced by the databases on which researchers evaluate the performance of their algorithms. This paper studies the transition from the Pascal Visual Object Challenge dataset, which has been the benchmark of reference in recent years, to the updated, bigger, and more challenging Microsoft Common Objects in Context. We first review and deeply analyze the new challenges, and opportunities, that this database presents. We then survey the current state of the art in object proposals and evaluate it focusing on how it generalizes to the new dataset. In light of these results, we propose various lines of research to take advantage of the new benchmark and improve the techniques. We explore one of these lines, which leads to an improvement over the state of the art of +5.2%.
Published: 2015

13. Video content and structure description based on keyframes, clusters and storyboards

Author: Jordi Pont-Tuset, Aljoscha Smolic, Marc Junyent, Alexandre Chapiro, Pablo Beltran, and Miquel A. Farre
Subjects: Structure (mathematical logic), Computer science, business.industry, Shot (filmmaking), Search engine indexing, Automatic summarization, Visualization, Set (abstract data type), ComputerApplications_MISCELLANEOUS, Computer vision, Artificial intelligence, Storyboard, business, Face detection
Abstract: In this paper we present a novel system to extract keyframes, shot clusters and structural storyboards for video content description, which can be used for a variety of summarization, visualization, classification, indexing and retrieval applications. The system automatically selects an appealing set of keyframes and creates meaningful clusters of shots. It further identifies sections that appear recurrently, which are called anchors, and typically divide television shows into different parts. This information about anchors can then be used to browse video content in a new fashion. Finally, our system creates a new type of interactive storyboard suitable to visualize and analyze the structure of the video in a novel way.
Published: 2015

14. Semi-automatic video object segmentation by advanced manipulation of segmentation hierarchies

Author: Jordi Pont-Tuset, Aljoscha Smolic, and Miquel A. Farre
Subjects: Minimum spanning tree-based segmentation, Segmentation-based object categorization, business.industry, Computer science, Interface (computing), Video tracking, Scale-space segmentation, Context (language use), Computer vision, Segmentation, Image segmentation, Artificial intelligence, business
Abstract: For applications that require very accurate video object segmentations, semi-automatic algorithms are typically used, which help operators to minimize the annotation time, as off-the-shelf automatic segmentation techniques are still far from being precise enough in this context. This paper presents a novel interface based on a click-and-drag interaction that allows operators to rapidly select regions from state-of-the-art segmentation hierarchies. The interface is very responsive, yields very accurate segmentations, and is designed to minimize the human interaction. To evaluate the results, we provide a new set of object video ground truth data.
Published: 2015

15. Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation

Author: Ferran Marques, Jordi Pont-Tuset, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Object detection, Computer science, Scale-space segmentation, Context (language use), Machine learning, computer.software_genre, Databases, Imatges -- Processament -- Tècniques digitals, Image texture, Segmentation, Image processing -- Digital techniques, Image segmentation, Benchmark testing, Ground truth, Segmentation-based object categorization, business.industry, Context, Enginyeria de la telecomunicació::Processament del senyal::Processament de la imatge i del senyal vídeo [Àrees temàtiques de la UPC], Current measurement, Partitioning algorithms, Vídeo digital, Data mining, Artificial intelligence, business, computer
Abstract: This paper tackles the supervised evaluation of image segmentation algorithms. First, it surveys and structures the measures used to compare the segmentation results with a ground truth database, and proposes a new measure: the precision-recall for objects and parts. To compare the goodness of these measures, it defines three quantitative meta-measures involving six state-of-the-art segmentation methods. The meta-measures consist of assuming some plausible hypotheses about the results and assessing how well each measure reflects these hypotheses. As a conclusion, this paper proposes the precision-recall curves for boundaries and for objects-and-parts as the tool of choice for the supervised evaluation of image segmentation. We make the datasets and code of all the measures publicly available.
Published: 2013

16. Upper-bound assessment of the spatial accuracy of hierarchical region-based image representations

Author: Jordi Pont-Tuset and Ferran Marques
Subjects: Image texture, Minimum spanning tree-based segmentation, Region growing, business.industry, Segmentation-based object categorization, Scale-space segmentation, Pattern recognition, Artificial intelligence, Image segmentation, business, Object detection, Feature detection (computer vision), Mathematics
Abstract: Hierarchical region-based image representations are versatile tools for segmentation, filtering, object detection, etc. The evaluation of their spatial accuracy has usually been performed by assessing the final result of an algorithm based on this representation. Given its wide applicability, however, a direct supervised assessment, independent of any application, would be desirable and fair. A brute-force assessment of all the partitions represented in the hierarchical structure would be a correct approach, but as we prove formally, it is computationally infeasible. This paper presents an efficient algorithm to find the upper-bound performance of the representation and shows that the previous approximations in the literature can fail to find this bound.
Published: 2012

17. Supervised Assessment of Segmentation Hierarchies

Author: Jordi Pont-Tuset, Ferran Marques, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Image segmentation, business.industry, Computer science, Enginyeria de la telecomunicació::Processament del senyal::Processament de la imatge i del senyal vídeo [Àrees temàtiques de la UPC], computer.software_genre, Machine learning, Partition (database), Imatges -- Segmentació, Partition refinement, Random tree, Combinatorial optimization, Partition (number theory), Quadtree, Segmentation, Data mining, Artificial intelligence, business, computer
Abstract: This paper addresses the problem of the supervised assessment of hierarchical region-based image representations. Given the large number of partitions represented in such structures, the supervised assessment approaches in the literature are based on selecting a reduced set of representative partitions and evaluating their quality. Assessment results, therefore, depend on the partition selection strategy used. Instead, we propose to find the partition in the tree that best matches the ground-truth partition, that is, the upper-bound partition selection. We show that different partition selection algorithms can lead to different conclusions regarding the quality of the assessed trees and that the upper-bound partition selection provides the following advantages: 1) it does not limit the assessment to a reduced set of partitions, and 2) it better discriminates the random trees from actual ones, which reflects a better qualitative behavior. We model the problem as a Linear Fractional Combinatorial Optimization (LFCO) problem, which makes the upper-bound selection feasible and efficient.
Published: 2012

18. Contour detection using Binary Partition Trees

Author: Jordi Pont-Tuset and Ferran Marques
Subjects: Active contour model, Pixel, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Image processing, Pattern recognition, Image segmentation, Object detection, Edge detection, Tree (data structure), Computer vision, Artificial intelligence, business, Feature detection (computer vision), Mathematics
Abstract: Contour detection is a hard and challenging problem of paramount importance in image processing. State-of-the-art algorithms are approaching human performance but usually entail complex and tailored image models and arduous training. The Binary Partition Tree is a hierarchical region-based image model that has been proven to have a wide range of applications in image filtering, information retrieval, object detection, etc. In this paper we propose a contour detection technique based on this versatile image model, extracting the contour information available in the tree yet outperforming one of the most widely used contour detectors.
Published: 2010
Full Text: View/download PDF

19. System architecture of a web service for Content-Based Image Retrieval

Author: Carles Ventura, Silvia Cortes, Xavier Giro-i-Nieto, Ferran Marques, Jordi Pont-Tuset, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Information retrieval, Database, business.industry, Computer science, Software as a service, Software architecture, Programari -- Arquitectura, computer.file_format, computer.software_genre, Content-based image retrieval, Arquitectura de computadors, Metadata, Web semàntica, Systems architecture, Computer architecture, User interface, Web service, RDF, business, Image retrieval, computer, Enginyeria de la telecomunicació::Telemàtica i xarxes d'ordinadors::Internet [Àrees temàtiques de la UPC], Semantic Web
Abstract: This paper presents the system architecture of a Content-Based Image Retrieval system implemented as a web service. The proposed solution is composed of two parts: a client running a graphical user interface for query formulation and a server where the search engine explores an image repository. The separation of the user interface and the search engine follows a Software as a Service (SaaS) model, a type of cloud computing design where a single core system is online and available to authorized clients. The proposed architecture follows the REST software architecture and the HTTP protocol for communications, two solutions that, combined with metadata coded in RDF, make the proposed system ready for its integration in the semantic web. User queries are formulated by visual examples through a graphical interface, and content is also accessed remotely through HTTP communication. Visual descriptors and similarity measures implemented in this work are mostly defined in the MPEG-7 standard, while textual metadata is coded according to the Dublin Core specifications.
Published: 2010

20. ONN the Use of Neural Networks for Data Privacy

Author: Jordi Pont-Tuset, Pau Medrano-Gracia, Jordi Nin, Victor Muntés-Mulero, and Josep-L. Larriba-Pey
Subjects: Information privacy, Artificial neural network, Computer science, business.industry, Machine learning, computer.software_genre, Physics::History of Physics, Original data, Data set, Set (abstract data type), Preprocessor, Data pre-processing, Data mining, Artificial intelligence, business, computer
Abstract: The need for data privacy motivates the development of new methods that protect data by minimizing the disclosure risk without losing valuable statistical information. In this paper, we propose a new protection method for numerical data called Ordered Neural Networks (ONN). ONN presents a new way to protect data based on the use of Artificial Neural Networks (ANNs). The main contribution of ONN is a new strategy for preprocessing data so that the ANNs are not capable of accurately learning the original data set. Using the results obtained by the ANNs, ONN generates a new data set similar to the original one without disclosing the real sensitive values. We compare our method to the best methods presented in the literature, using data provided by the US Census Bureau. Our experiments show that ONN outperforms the previous methods proposed in the literature, proving that ANNs are a convenient way to protect the data efficiently without losing the statistical properties of the set.
Published: 2008
Full Text: View/download PDF

21. Convolutional Oriented Boundaries

Author: Kevis-Kokitsi Maninis, Pablo Arbeláez, Luc Van Gool, Jordi Pont-Tuset, Leibe, B, Matas, J, Sebe, N, and Welling, M
Subjects: FOS: Computer and information sciences, Contextual image classification, Computer science, business.industry, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, Pattern recognition, 02 engineering and technology, Pascal (programming language), PSI_VISICS, Convolutional neural network, Computer Science::Computer Vision and Pattern Recognition, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Segmentation, Artificial intelligence, business, computer, computer.programming_language
Abstract: We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments on BSDS, PASCAL Context, PASCAL Segmentation, and MS-COCO, showing that COB provides state-of-the-art contours, region hierarchies, and object proposals in all datasets. Comment: ECCV 2016 Camera Ready
Full Text: View/download PDF

22. Multiscale combinatorial grouping

Author: Jon Barron, Pablo Arbeláez, Jitendra Malik, Jordi Pont-Tuset, Ferran Marques, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, and Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo
Subjects: Normalization (statistics), Segmentation-based object categorization, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Scale-space segmentation, Object Candidates, Pattern recognition, Pascal (programming language), Image segmentation, Enginyeria de la telecomunicació::Processament del senyal::Processament de la imatge i del senyal vídeo [Àrees temàtiques de la UPC], Imatges -- Processament, Image Segmentation, Image processing, Computer Science::Computer Vision and Pattern Recognition, Segmentation, Artificial intelligence, business, computer, Image resolution, computer.programming_language
Abstract: We propose a unified approach for bottom-up hierarchical image segmentation and object candidate generation for recognition, called Multiscale Combinatorial Grouping (MCG). For this purpose, we first develop a fast normalized cuts algorithm. We then propose a high-performance hierarchical segmenter that makes effective use of multiscale information. Finally, we propose a grouping strategy that combines our multiscale regions into highly-accurate object candidates by exploring efficiently their combinatorial space. We conduct extensive experiments on both the BSDS500 and on the PASCAL 2012 segmentation datasets, showing that MCG produces state-of-the-art contours, hierarchical regions and object candidates.


