LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images
- Authors
Bharath Hariharan, Wei-Lun Chao, Brian H. Wang, Yan Wang, Mark Campbell, and Kilian Q. Weinberger
- Subjects
Computer Vision and Pattern Recognition (cs.CV), Robotics (cs.RO), Image and Video Processing (eess.IV), Computer science, Artificial Intelligence, Computer Science Applications, Control and Optimization, Control and Systems Engineering, Human-Computer Interaction, Mechanical Engineering, Biomedical Engineering, Computer vision, Deep learning, Point cloud, Image segmentation, Segmentation, Pixel, Mobile robot, Object (computer science), Graph (abstract data type)
- Abstract
Object segmentation in three-dimensional (3-D) point clouds is a critical task for robots capable of 3-D perception. Despite the impressive performance of deep learning-based approaches on object segmentation in 2-D images, deep learning has not been applied nearly as successfully to 3-D point cloud segmentation. Deep networks generally require large amounts of labeled training data, which are readily available for 2-D images but difficult to produce for 3-D point clouds. In this letter, we present Label Diffusion Lidar Segmentation (LDLS), a novel approach to 3-D point cloud segmentation that leverages 2-D segmentation of an RGB image from an aligned camera to avoid the need for training on annotated 3-D data. We obtain 2-D segmentation predictions by applying Mask-RCNN to the RGB image, and then link this image to the 3-D lidar point cloud by building a graph of connections among 3-D points and 2-D pixels. This graph directs a semi-supervised label diffusion process in which the 2-D pixels act as source nodes, diffusing object label information through the 3-D point cloud and yielding a complete 3-D point cloud segmentation. We conduct empirical studies on the KITTI benchmark dataset and on a mobile robot, demonstrating the wide applicability and superior performance of LDLS compared with the previous state of the art in 3-D point cloud segmentation, without any need for 3-D training data or fine-tuning of the 2-D image segmentation model.
- Comment
Accepted for publication in IEEE Robotics and Automation Letters, with presentation at IROS 2019.
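The sketch below illustrates the label-diffusion idea summarized in the abstract; it is not the authors' implementation. It assumes a pinhole projection matrix `P` that maps lidar points into the image and an integer class mask from a 2-D segmentation model such as Mask-RCNN (0 = background). As a simplification, lidar points struck by labeled pixels are clamped as source nodes rather than adding the pixels themselves to the graph, and the neighborhood size `k` and iteration count are illustrative choices.

```python
# Minimal, illustrative sketch of graph-based label diffusion from 2-D pixel
# masks to a 3-D lidar point cloud. Function names, parameters, and the
# clamped-source simplification are assumptions, not the paper's exact method.
import numpy as np

def diffuse_labels(points, P, pixel_labels, n_classes, k=10, n_iters=50):
    """points: (N, 3) lidar points; P: (3, 4) projection matrix;
    pixel_labels: (H, W) integer class mask; returns (N,) point labels."""
    H, W = pixel_labels.shape
    N = points.shape[0]

    # Project each 3-D point into the image plane.
    homog = np.hstack([points, np.ones((N, 1))])
    proj = (P @ homog.T).T
    z = proj[:, 2]
    safe_z = np.where(z > 0, z, 1.0)      # avoid divide-by-zero behind the camera
    u = np.round(proj[:, 0] / safe_z).astype(int)
    v = np.round(proj[:, 1] / safe_z).astype(int)
    in_view = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    # Initial label distribution: one-hot from the pixel each point lands on.
    labels = np.zeros((N, n_classes))
    labels[np.flatnonzero(in_view), pixel_labels[v[in_view], u[in_view]]] = 1.0

    # Point-to-point graph: brute-force k-nearest-neighbor adjacency.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]        # skip self at column 0
    A = np.zeros((N, N))
    A[np.repeat(np.arange(N), k), nn.ravel()] = 1.0
    A = np.maximum(A, A.T)                         # symmetrize
    A /= A.sum(axis=1, keepdims=True) + 1e-9       # row-normalize

    # Semi-supervised diffusion: repeatedly average neighbor label
    # distributions while clamping the pixel-derived source nodes.
    diffused = labels.copy()
    for _ in range(n_iters):
        diffused = A @ diffused
        diffused[in_view] = labels[in_view]
    return diffused.argmax(axis=1)
```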
- Published
- 2019