Author: "Joshi, Ameya" / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. A Curious Case of Remarkable Resilience to Gradient Attacks via Fully Convolutional and Differentiable Front End with a Skip Connection

Author: Boytsov, Leonid, Joshi, Ameya, and Condessa, Filipe
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: We tested front-end enhanced neural models where a frozen classifier was prepended by a differentiable and fully convolutional model with a skip connection. By training them using a small learning rate for about one epoch, we obtained models that retained the accuracy of the backbone classifier while being unusually resistant to gradient attacks including APGD and FAB-T attacks from the AutoAttack package, which we attributed to gradient masking. The gradient masking phenomenon is not new, but the degree of masking was quite remarkable for fully differentiable models that did not have gradient-shattering components such as JPEG compression or components that are expected to cause diminishing gradients. Though black box attacks can be partially effective against gradient masking, they are easily defeated by combining models into randomized ensembles. We estimate that such ensembles achieve near-SOTA AutoAttack accuracy on CIFAR10, CIFAR100, and ImageNet despite having virtually zero accuracy under adaptive attacks. Adversarial training of the backbone classifier can further increase resistance of the front-end enhanced model to gradient attacks. On CIFAR10, the respective randomized ensemble achieved 90.8$\pm 2.5$% (99% CI) accuracy under AutoAttack while having only 18.2$\pm 3.6$% accuracy under the adaptive attack. We do not establish SOTA in adversarial robustness. Instead, we make methodological contributions and further support the thesis that adaptive attacks designed with the complete knowledge of model architecture are crucial in demonstrating model robustness and that even the so-called white-box gradient attacks can have limited applicability. Although gradient attacks can be complemented with black-box attacks such as the SQUARE attack or the zero-order PGD, black-box attacks can be weak against randomized ensembles, e.g., when ensemble models mask gradients.
Published: 2024

2. PriViT: Vision Transformers for Fast Private Inference

Author: Dhyani, Naren, Mo, Jianqiao, Cho, Minsu, Joshi, Ameya, Garg, Siddharth, Reagen, Brandon, and Hegde, Chinmay
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications. However, ViTs are ill-suited for private inference using secure multi-party computation (MPC) protocols, due to the large number of non-polynomial operations (self-attention, feed-forward rectifiers, layer normalization). We propose PriViT, a gradient based algorithm to selectively "Taylorize" nonlinearities in ViTs while maintaining their prediction accuracy. Our algorithm is conceptually simple, easy to implement, and achieves improved performance over existing approaches for designing MPC-friendly transformer architectures in terms of achieving the Pareto frontier in latency-accuracy. We confirm these improvements via experiments on several standard image classification tasks. Public code is available at https://github.com/NYU-DICE-Lab/privit., Comment: 18 pages, 14 figures
Published: 2023

3. Distributionally Robust Classification on a Data Budget

Author: Feuer, Benjamin, Joshi, Ameya, Pham, Minh, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Real world uses of deep learning require predictable model behavior under distribution shifts. Models such as CLIP show emergent natural distributional robustness comparable to humans, but may require hundreds of millions of training samples. Can we train robust learners in a domain where data is limited? To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions, and perform a series of carefully controlled investigations of factors contributing to robustness in image classification, then compare those results to findings derived from a large-scale meta-analysis. Using this approach, we show that standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples. To our knowledge, this is the first result showing (near) state-of-the-art distributional robustness on limited data budgets. Our dataset is available at \url{https://huggingface.co/datasets/penfever/JANuS_dataset}, and the code used to reproduce our experiments can be found at \url{https://github.com/penfever/vlhub/}., Comment: TMLR 2023; openreview link: https://openreview.net/forum?id=D5Z2E8CNsD
Published: 2023

4. Identity-Preserving Aging of Face Images via Latent Diffusion Models

Author: Banerjee, Sudipta, Mittal, Govind, Joshi, Ameya, Hegde, Chinmay, and Memon, Nasir
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The performance of automated face recognition systems is inevitably impacted by the facial aging process. However, high quality datasets of individuals collected over several years are typically small in scale. In this work, we propose, train, and validate the use of latent text-to-image diffusion models for synthetically aging and de-aging face images. Our models succeed with few-shot training, and have the added benefit of being controllable via intuitive textual prompting. We observe high degrees of visual realism in the generated images while maintaining biometric fidelity measured by commonly used metrics. We evaluate our method on two benchmark datasets (CelebA and AgeDB) and observe a significant reduction (~44%) in the False Non-Match Rate compared to existing state-of-the-art baselines., Comment: Accepted to appear in International Joint Conference in Biometrics (IJCB) 2023
Published: 2023

5. Vision-Language Models can Identify Distracted Driver Behavior from Naturalistic Videos

Author: Hasan, Md Zahid, Chen, Jiajing, Wang, Jiyang, Rahman, Mohammed Shaiqur, Joshi, Ameya, Velipasalar, Senem, Hegde, Chinmay, Sharma, Anuj, and Sarkar, Soumik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recognizing the activities causing distraction in real-world driving scenarios is critical for ensuring the safety and reliability of both drivers and pedestrians on the roadways. Conventional computer vision techniques are typically data-intensive and require a large volume of annotated training data to detect and classify various distracted driving behaviors, thereby limiting their efficiency and scalability. We aim to develop a generalized framework that showcases robust performance with access to limited or no annotated training data. Recently, vision-language models have offered large-scale visual-textual pretraining that can be adapted to task-specific learning like distracted driving activity recognition. Vision-language pretraining models, such as CLIP, have shown significant promise in learning natural language-guided visual representations. This paper proposes a CLIP-based driver activity recognition approach that identifies driver distraction from naturalistic driving images and videos. CLIP's vision embedding offers zero-shot transfer and task-based finetuning, which can classify distracted activities from driving video data. Our results show that this framework offers state-of-the-art performance on zero-shot transfer and video-based CLIP for predicting the driver's state on two public datasets. We propose both frame-based and video-based frameworks developed on top of the CLIP's visual representation for distracted driving detection and classification tasks and report the results., Comment: 15 pages, 7 figures
Published: 2023

6. ZeroForge: Feedforward Text-to-Shape Without 3D Supervision

Author: Marshall, Kelly O., Pham, Minh, Joshi, Ameya, Jignasu, Anushrut, Balu, Aditya, Krishnamurthy, Adarsh, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Current state-of-the-art methods for text-to-shape generation either require supervised training using a labeled dataset of pre-defined 3D shapes, or perform expensive inference-time optimization of implicit neural representations. In this work, we present ZeroForge, an approach for zero-shot text-to-shape generation that avoids both pitfalls. To achieve open-vocabulary shape generation, we require careful architectural adaptation of existing feed-forward approaches, as well as a combination of data-free CLIP-loss and contrastive losses to avoid mode collapse. Using these techniques, we are able to considerably expand the generative ability of existing feed-forward text-to-shape models such as CLIP-Forge. We support our method via extensive qualitative and quantitative evaluations., Comment: 19 pages, High resolution figures needed to demonstrate 3D results
Published: 2023

7. Caption supervision enables robust learners

Author: Feuer, Benjamin, Joshi, Ameya, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition, I.4.9
Abstract: Vision language (VL) models like CLIP are robust to natural distribution shifts, in part because CLIP learns on unstructured data using a technique called caption supervision; the model interprets image-linked texts as ground-truth labels. In a carefully controlled comparison study, we show that caption-supervised CNNs trained on a standard cross-entropy loss (with image labels assigned by scanning captions for class names) can exhibit greater distributional robustness than VL models trained on the same data. To facilitate future experiments with high-accuracy caption-supervised models, we introduce CaptionNet (https://github.com/penfever/CaptionNet/), which includes a class-balanced, fully supervised dataset with over 50,000 new human-labeled ImageNet-compliant samples which includes web-scraped captions. In a series of experiments on CaptionNet, we show how the choice of loss function, data filtration and supervision strategy enable robust computer vision. We also provide the codebase necessary to reproduce our experiments at VL Hub (https://github.com/penfever/vlhub/).
Published: 2022

8. Revisiting Self-Distillation

Author: Pham, Minh, Cho, Minsu, Joshi, Ameya, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning
Abstract: Knowledge distillation is the procedure of transferring "knowledge" from a large model (the teacher) to a more compact one (the student), often being used in the context of model compression. When both models have the same architecture, this procedure is called self-distillation. Several works have anecdotally shown that a self-distilled student can outperform the teacher on held-out data. In this work, we systematically study self-distillation in a number of settings. We first show that even with a highly accurate teacher, self-distillation allows a student to surpass the teacher in all cases. Secondly, we revisit existing theoretical explanations of (self) distillation and identify contradicting examples, revealing possible drawbacks of these explanations. Finally, we provide an alternative explanation for the dynamics of self-distillation through the lens of loss landscape geometry. We conduct extensive experiments to show that self-distillation leads to flatter minima, thereby resulting in better generalization.
Published: 2022

9. A Meta-Analysis of Distributionally-Robust Models

Author: Feuer, Benjamin, Joshi, Ameya, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: State-of-the-art image classifiers trained on massive datasets (such as ImageNet) have been shown to be vulnerable to a range of both intentional and incidental distribution shifts. On the other hand, several recent classifiers with favorable out-of-distribution (OOD) robustness properties have emerged, achieving high accuracy on their target tasks while maintaining their in-distribution accuracy on challenging benchmarks. We present a meta-analysis on a wide range of publicly released models, most of which have been published over the last twelve months. Through this meta-analysis, we empirically identify four main commonalities for all the best-performing OOD-robust models, all of which illuminate the considerable promise of vision-language pre-training., Comment: To be presented at ICML Workshop on Principles of Distribution Shift 2022. Copyright 2022 by the author(s)
Published: 2022

10. Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

Author: Joshi, Ameya, Pham, Minh, Cho, Minsu, Boytsov, Leonid, Condessa, Filipe, Kolter, J. Zico, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers. However, methods based on RS require augmenting data with large amounts of noise, which leads to significant drops in accuracy. We propose a training-free, modified smoothing approach, Smooth-Reduce, that leverages patching and aggregation to provide improved classifier certificates. Our algorithm classifies overlapping patches extracted from an input image, and aggregates the predicted logits to certify a larger radius around the input. We study two aggregation schemes -- max and mean -- and show that both approaches provide better certificates in terms of certified accuracy, average certified radii and abstention rates as compared to concurrent approaches. We also provide theoretical guarantees for such certificates, and empirically show significant improvements over other randomized smoothing methods that require expensive retraining. Further, we extend our approach to videos and provide meaningful certificates for video classifiers. A project page can be found at https://nyu-dice-lab.github.io/SmoothReduce/
Published: 2022

11. Selective Network Linearization for Efficient Private Inference

Author: Cho, Minsu, Joshi, Ameya, Garg, Siddharth, Reagen, Brandon, and Hegde, Chinmay
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Private inference (PI) enables inference directly on cryptographically secure data. While promising to address many privacy issues, it has seen limited use due to extreme runtimes. Unlike plaintext inference, where latency is dominated by FLOPs, in PI non-linear functions (namely ReLU) are the bottleneck. Thus, practical PI demands novel ReLU-aware optimizations. To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy. We evaluate our algorithm on several standard PI benchmarks. The results demonstrate up to $4.25\%$ more accuracy (iso-ReLU count at 50K) or $2.2\times$ less latency (iso-accuracy at 70\%) than the current state of the art and advance the Pareto frontier across the latency-accuracy space. To complement empirical results, we present a "no free lunch" theorem that sheds light on how and when network linearization is possible while maintaining prediction accuracy. Public code is available at \url{https://github.com/NYU-DICE-Lab/selective_network_linearization}., Comment: Published in ICML 2022
Published: 2022

12. Adversarial Token Attacks on Vision Transformers

Author: Joshi, Ameya, Jagatap, Gauri, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Vision transformers rely on a patch token based self attention mechanism, in contrast to convolutional networks. We investigate fundamental differences between these two families of models, by designing a block sparsity based adversarial token attack. We probe and analyze transformer as well as convolutional models with token attacks of varying patch sizes. We infer that transformer models are more sensitive to token attacks than convolutional models, with ResNets outperforming Transformer models by up to $\sim30\%$ in robust accuracy for single token attacks.
Published: 2021

13. NeuFENet: Neural Finite Element Solutions with Theoretical Bounds for Parametric PDEs

Author: Khara, Biswajit, Balu, Aditya, Joshi, Ameya, Sarkar, Soumik, Hegde, Chinmay, Krishnamurthy, Adarsh, and Ganapathysubramanian, Baskar
Subjects: Computer Science - Machine Learning, Mathematics - Numerical Analysis
Abstract: We consider a mesh-based approach for training a neural network to produce field predictions of solutions to parametric partial differential equations (PDEs). This approach contrasts current approaches for "neural PDE solvers" that employ collocation-based methods to make point-wise predictions of solutions to PDEs. This approach has the advantage of naturally enforcing different boundary conditions as well as ease of invoking well-developed PDE theory -- including analysis of numerical stability and convergence -- to obtain capacity bounds for our proposed neural networks in discretized domains. We explore our mesh-based strategy, called NeuFENet, using a weighted Galerkin loss function based on the Finite Element Method (FEM) on a parametric elliptic PDE. The weighted Galerkin loss (FEM loss) is similar to an energy functional that produces improved solutions, satisfies a priori mesh convergence, and can model Dirichlet and Neumann boundary conditions. We prove theoretically, and illustrate with experiments, convergence results analogous to mesh convergence analysis deployed in finite element solutions to PDEs. These results suggest that a mesh-based neural network approach serves as a promising approach for solving parametric PDEs with theoretical bounds.
Published: 2021

14. Differentiable Spline Approximations

Author: Cho, Minsu, Balu, Aditya, Joshi, Ameya, Prasad, Anjana Deva, Khara, Biswajit, Sarkar, Soumik, Ganapathysubramanian, Baskar, Krishnamurthy, Adarsh, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: The paradigm of differentiable programming has significantly enhanced the scope of machine learning via the judicious use of gradient-based optimization. However, standard differentiable programming methods (such as autodiff) typically require that the machine learning models be differentiable, limiting their applicability. Our goal in this paper is to use a new, principled approach to extend gradient-based optimization to functions well modeled by splines, which encompass a large family of piecewise polynomial models. We derive the form of the (weak) Jacobian of such functions and show that it exhibits a block-sparse structure that can be computed implicitly and efficiently. Overall, we show that leveraging this redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications such as image segmentation, 3D point cloud reconstruction, and finite element analysis., Comment: 9 pages, accepted in Neurips 2021
Published: 2021

15. Adversarially Robust Learning via Entropic Regularization

Author: Jagatap, Gauri, Joshi, Ameya, Chowdhury, Animesh Basak, Garg, Siddharth, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: In this paper we propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks. We formulate a new loss function that is equipped with an additional entropic regularization. Our loss function considers the contribution of adversarial samples that are drawn from a specially designed distribution in the data space that assigns high probability to points with high loss and in the immediate neighborhood of training samples. Our proposed algorithms optimize this loss to seek adversarially robust valleys of the loss landscape. Our approach achieves competitive (or better) performance in terms of robust classification accuracy as compared to several state-of-the-art robust learning approaches on benchmark datasets such as MNIST and CIFAR-10.
Published: 2020

16. Deep Generative Models that Solve PDEs: Distributed Computing for Training Large Data-Free Models

Author: Botelho, Sergio, Joshi, Ameya, Khara, Biswajit, Sarkar, Soumik, Hegde, Chinmay, Adavani, Santi, and Ganapathysubramanian, Baskar
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Recent progress in scientific machine learning (SciML) has opened up the possibility of training novel neural network architectures that solve complex partial differential equations (PDEs). Several (nearly data free) approaches have been recently reported that successfully solve PDEs, with examples including deep feed forward networks, generative networks, and deep encoder-decoder networks. However, practical adoption of these approaches is limited by the difficulty in training these models, especially to make predictions at large output resolutions ($\geq 1024 \times 1024$). Here we report on a software framework for data parallel distributed deep learning that resolves the twin challenges of training these large SciML models - training in reasonable time as well as distributing the storage requirements. Our framework provides several out-of-the-box functionalities including (a) loss integrity independent of number of processes, (b) synchronized batch normalization, and (c) distributed higher-order optimization methods. We show excellent scalability of this framework on both cloud as well as HPC clusters, and report on the interplay between bandwidth, network topology and bare metal vs cloud. We deploy this approach to train generative models of sizes hitherto not possible, showing that neural PDE solvers can be viably trained for practical applications. We also demonstrate that distributed higher-order optimization methods are $2-3\times$ faster than stochastic gradient-based methods and provide minimal convergence drift with higher batch-size., Comment: 10 pages, 18 figures
Published: 2020

17. ESPN: Extremely Sparse Pruned Networks

Author: Cho, Minsu, Joshi, Ameya, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Deep neural networks are often highly overparameterized, prohibiting their use in compute-limited systems. However, a line of recent works has shown that the size of deep networks can be considerably reduced by identifying a subset of neuron indicators (or mask) that correspond to significant weights prior to training. We demonstrate that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks. Our algorithm represents a hybrid approach between single shot network pruning methods (such as SNIP) with Lottery-Ticket type approaches. We validate our approach on several datasets and outperform several existing pruning approaches in both test accuracy and compression ratio.
Published: 2020

18. Encoding Invariances in Deep Generative Models

Author: Shah, Viraj, Joshi, Ameya, Ghosal, Sambuddha, Pokuri, Balaji, Sarkar, Soumik, Ganapathysubramanian, Baskar, and Hegde, Chinmay
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing, Statistics - Machine Learning
Abstract: Reliable training of generative adversarial networks (GANs) typically requires massive datasets in order to model complicated distributions. However, in several applications, training samples obey invariances that are \textit{a priori} known; for example, in complex physics simulations, the training data obey universal laws encoded as well-defined mathematical equations. In this paper, we propose a new generative modeling approach, InvNet, that can efficiently model data spaces with known invariances. We devise an adversarial training algorithm to encode them into the data distribution. We validate our framework in three experimental settings: generating images with fixed motifs; solving nonlinear partial differential equations (PDEs); and reconstructing two-phase microstructures with desired statistical properties. We complement our experiments with several theoretical results.
Published: 2019

19. Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers

Author: Joshi, Ameya, Mukherjee, Amitangshu, Sarkar, Soumik, and Hegde, Chinmay
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep neural networks have been shown to exhibit an intriguing vulnerability to adversarial input images corrupted with imperceptible perturbations. However, the majority of adversarial attacks assume global, fine-grained control over the image pixel space. In this paper, we consider a different setting: what happens if the adversary could only alter specific attributes of the input image? These would generate inputs that might be perceptibly different, but still natural-looking and enough to fool a classifier. We propose a novel approach to generate such `semantic' adversarial examples by optimizing a particular adversarial loss over the range-space of a parametric conditional generative model. We demonstrate implementations of our attacks on binary classifiers trained on face images, and show that such natural-looking semantic adversarial examples exist. We evaluate the effectiveness of our attack on synthetic and real data, and present detailed comparisons with existing attack methods. We supplement our empirical results with theoretical bounds that demonstrate the existence of such parametric adversarial examples., Comment: Accepted to International Conference on Computer Vision, (ICCV) 2019
Published: 2019

19 results on '"Joshi, Ameya"'
