Author: "Giryes, Raja" / Topic: computer science - machine learning - Searchworks@Jio Institute Digital Library Search Results

1. Provable Benefits of Complex Parameterizations for Structured State Space Models

Author: Ran-Milo, Yuval, Lumbroso, Eden, Cohen-Karlik, Edo, Giryes, Raja, Globerson, Amir, and Cohen, Nadav
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing
Abstract: Structured state space models (SSMs), the core engine behind prominent neural networks such as S4 and Mamba, are linear dynamical systems adhering to a specified structure, most notably diagonal. In contrast to typical neural network modules, whose parameterizations are real, SSMs often use complex parameterizations. Theoretically explaining the benefits of complex parameterizations for SSMs is an open problem. The current paper takes a step towards its resolution, by establishing formal gaps between real and complex diagonal SSMs. Firstly, we prove that while a moderate dimension suffices in order for a complex SSM to express all mappings of a real SSM, a much higher dimension is needed for a real SSM to express mappings of a complex SSM. Secondly, we prove that even if the dimension of a real SSM is high enough to express a given mapping, typically, doing so requires the parameters of the real SSM to hold exponentially large values, which cannot be learned in practice. In contrast, a complex SSM can express any given mapping with moderate parameter values. Experiments corroborate our theory, and suggest a potential extension of the theory that accounts for selectivity, a new architectural feature yielding state of the art performance., Comment: 12 pages. Accepted to NeurIPS 2024
Published: 2024

2. On the Relation Between Linear Diffusion and Power Iteration

Author: Weitzner, Dana, Delbracio, Mauricio, Milanfar, Peyman, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Recently, diffusion models have gained popularity due to their impressive generative abilities. These models learn the implicit distribution given by the training dataset, and sample new data by transforming random noise through the reverse process, which can be thought of as gradual denoising. In this work, we examine the generation process as a ``correlation machine'', where random noise is repeatedly enhanced in correlation with the implicit given distribution. To this end, we explore the linear case, where the optimal denoiser in the MSE sense is known to be the PCA projection. This enables us to connect the theory of diffusion models to the spiked covariance model, where the dependence of the denoiser on the noise level and the amount of training data can be expressed analytically, in the rank-1 case. In a series of numerical experiments, we extend this result to general low rank data, and show that low frequencies emerge earlier in the generation process, where the denoising basis vectors are more aligned to the true data with a rate depending on their eigenvalues. This model allows us to show that the linear diffusion model converges in mean to the leading eigenvector of the underlying data, similarly to the prevalent power iteration method. Finally, we empirically demonstrate the applicability of our findings beyond the linear case, in the Jacobians of a deep, non-linear denoiser, used in general image generation tasks.
Published: 2024

3. DeciMamba: Exploring the Length Extrapolation Potential of Mamba

Author: Ben-Kish, Assaf, Zimerman, Itamar, Abu-Hussein, Shady, Cohen, Nadav, Globerson, Amir, Wolf, Lior, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length. A promising alternative is Mamba, which demonstrates high performance and achieves Transformer-level capabilities while requiring substantially fewer computational resources. In this paper we explore the length-generalization capabilities of Mamba, which we find to be relatively limited. Through a series of visualizations and analyses we identify that the limitations arise from a restricted effective receptive field, dictated by the sequence length used during training. To address this constraint, we introduce DeciMamba, a context-extension method specifically designed for Mamba. This mechanism, built on top of a hidden filtering mechanism embedded within the S6 layer, enables the trained model to extrapolate well even without additional training. Empirical experiments over real-world long-range NLP tasks show that DeciMamba can extrapolate to context lengths that are 25x times longer than the ones seen during training, and does so without utilizing additional computational resources. We will release our code and models., Comment: Link To Official Implementation: https://github.com/assafbk/DeciMamba
Published: 2024

4. Effective Subset Selection Through The Lens of Neural Network Pruning

Author: Bar, Noga and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Having large amounts of annotated data significantly impacts the effectiveness of deep neural networks. However, the annotation task can be very expensive in some domains, such as medical data. Thus, it is important to select the data to be annotated wisely, which is known as the subset selection problem. We investigate the relationship between subset selection and neural network pruning, which is more widely studied, and establish a correspondence between them. Leveraging insights from network pruning, we propose utilizing the norm criterion of neural network features to improve subset selection methods. We empirically validate our proposed strategy on various networks and datasets, demonstrating enhanced accuracy. This shows the potential of employing pruning tools for subset selection.
Published: 2024

5. DINOv2 based Self Supervised Learning For Few Shot Medical Image Segmentation

Author: Ayzenberg, Lev, Giryes, Raja, and Greenspan, Hayit
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep learning models have emerged as the cornerstone of medical image segmentation, but their efficacy hinges on the availability of extensive manually labeled datasets and their adaptability to unforeseen categories remains a challenge. Few-shot segmentation (FSS) offers a promising solution by endowing models with the capacity to learn novel classes from limited labeled examples. A leading method for FSS is ALPNet, which compares features between the query image and the few available support segmented images. A key question about using ALPNet is how to design its features. In this work, we delve into the potential of using features from DINOv2, which is a foundational self-supervised learning model in computer vision. Leveraging the strengths of ALPNet and harnessing the feature extraction capabilities of DINOv2, we present a novel approach to few-shot segmentation that not only enhances performance but also paves the way for more robust and adaptable medical image analysis.
Published: 2024

6. ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation

Author: Yanuka, Moran, Alper, Morris, Averbuch-Elor, Hadar, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Web-scale training on paired text-image data is becoming increasingly central to multimodal learning, but is challenged by the highly noisy nature of datasets in the wild. Standard data filtering approaches succeed in removing mismatched text-image pairs, but permit semantically related but highly abstract or subjective text. These approaches lack the fine-grained ability to isolate the most concrete samples that provide the strongest signal for learning in a noisy dataset. In this work, we propose a new metric, image caption concreteness, that evaluates caption text without an image reference to measure its concreteness and relevancy for use in multimodal learning. Our approach leverages strong foundation models for measuring visual-semantic information loss in multimodal representations. We demonstrate that this strongly correlates with human evaluation of concreteness in both single-word and sentence-level texts. Moreover, we show that curation using ICC complements existing approaches: It succeeds in selecting the highest quality samples from multimodal web-scale datasets to allow for efficient training in resource-constrained settings., Comment: Accepted to ACL 2024 (Finding). For Project webpage, see https://moranyanuka.github.io/icc/
Published: 2024

7. Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States

Author: Razin, Noam, Alexander, Yotam, Cohen-Karlik, Edo, Giryes, Raja, Globerson, Amir, and Cohen, Nadav
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Systems and Control, Statistics - Machine Learning
Abstract: In modern machine learning, models can often fit training data in numerous ways, some of which perform well on unseen (test) data, while others do not. Remarkably, in such cases gradient descent frequently exhibits an implicit bias that leads to excellent performance on unseen data. This implicit bias was extensively studied in supervised learning, but is far less understood in optimal control (reinforcement learning). There, learning a controller applied to a system via gradient descent is known as policy gradient, and a question of prime importance is the extent to which a learned controller extrapolates to unseen initial states. This paper theoretically studies the implicit bias of policy gradient in terms of extrapolation to unseen initial states. Focusing on the fundamental Linear Quadratic Regulator (LQR) problem, we establish that the extent of extrapolation depends on the degree of exploration induced by the system when commencing from initial states included in training. Experiments corroborate our theory, and demonstrate its conclusions on problems beyond LQR, where systems are non-linear and controllers are neural networks. We hypothesize that real-world optimal control may be greatly improved by developing methods for informed selection of initial states to train on., Comment: Accepted to ICML 2024
Published: 2024

8. How do Transformers perform In-Context Autoregressive Learning?

Author: Sander, Michael E., Giryes, Raja, Suzuki, Taiji, Blondel, Mathieu, and Peyré, Gabriel
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Transformers have achieved state-of-the-art performance in language modeling tasks. However, the reasons behind their tremendous success are still unclear. In this paper, towards a better understanding, we train a Transformer model on a simple next token prediction task, where sequences are generated as a first-order autoregressive process $s_{t+1} = W s_t$. We show how a trained Transformer predicts the next token by first learning $W$ in-context, then applying a prediction mapping. We call the resulting procedure in-context autoregressive learning. More precisely, focusing on commuting orthogonal matrices $W$, we first show that a trained one-layer linear Transformer implements one step of gradient descent for the minimization of an inner objective function, when considering augmented tokens. When the tokens are not augmented, we characterize the global minima of a one-layer diagonal linear multi-head Transformer. Importantly, we exhibit orthogonality between heads and show that positional encoding captures trigonometric relations in the data. On the experimental side, we consider the general case of non-commuting orthogonal matrices and generalize our theoretical findings., Comment: 20 pages ICML 2024
Published: 2024

9. A Self Supervised StyleGAN for Image Annotation and Classification with Extremely Limited Labels

Author: Hochberg, Dana Cohen, Greenspan, Hayit, and Giryes, Raja
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, 92C55, J.3, I.5.3
Abstract: The recent success of learning-based algorithms can be greatly attributed to the immense amount of annotated data used for training. Yet, many datasets lack annotations due to the high costs associated with labeling, resulting in degraded performances of deep learning methods. Self-supervised learning is frequently adopted to mitigate the reliance on massive labeled datasets since it exploits unlabeled data to learn relevant feature representations. In this work, we propose SS-StyleGAN, a self-supervised approach for image annotation and classification suitable for extremely small annotated datasets. This novel framework adds self-supervision to the StyleGAN architecture by integrating an encoder that learns the embedding to the StyleGAN latent space, which is well-known for its disentangled properties. The learned latent space enables the smart selection of representatives from the data to be labeled for improved classification performance. We show that the proposed method attains strong classification results using small labeled datasets of sizes 50 and even 10. We demonstrate the superiority of our approach for the tasks of COVID-19 and liver tumor pathology identification., Comment: Accepted to IEEE Transactions on Medical Imaging
Published: 2023
Full Text: View/download PDF

10. Deep Internal Learning: Deep Learning from a Single Input

Author: Tirer, Tom, Giryes, Raja, Chun, Se Young, and Eldar, Yonina C.
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Electrical Engineering and Systems Science - Signal Processing
Abstract: Deep learning, in general, focuses on training a neural network from large labeled datasets. Yet, in many cases there is value in training a network just from the input at hand. This is particularly relevant in many signal and image processing problems where training data is scarce and diversity is large on the one hand, and on the other, there is a lot of structure in the data that can be exploited. Using this information is the key to deep internal-learning strategies, which may involve training a network from scratch using a single input or adapting an already trained network to a provided input example at inference time. This survey paper aims at covering deep internal-learning techniques that have been proposed in the past few years for these two important directions. While our main focus will be on image processing problems, most of the approaches that we survey are derived for general signals (vectors with recurring patterns that can be distinguished from noise) and are therefore applicable to other modalities., Comment: Accepted to IEEE Signal Processing Magazine
Published: 2023

11. Trees versus Neural Networks for enhancing tau lepton real-time selection in proton-proton collisions

Author: Yaary, Maayan, Barron, Uriel, Domínguez, Luis Pascual, Chen, Boping, Barak, Liron, Etzion, Erez, and Giryes, Raja
Subjects: High Energy Physics - Experiment, Computer Science - Machine Learning, Physics - Instrumentation and Detectors
Abstract: This paper introduces supervised learning techniques for real-time selection (triggering) of hadronically decaying tau leptons in proton-proton colliders. By implementing classic machine learning decision trees and advanced deep learning models, such as Multi-Layer Perceptron or residual neural networks, visible improvements in performance compared to standard threshold tau triggers are observed. We show how such an implementation may lower selection energy thresholds, thus contributing to increasing the sensitivity of searches for new phenomena in proton-proton collisions classified by low-energy tau leptons. Moreover, we analyze when it is better to use neural networks versus decision trees for tau triggers with conclusions relevant to other problems in physics.
Published: 2023

12. An information-Theoretic Approach to Semi-supervised Transfer Learning

Author: Jakubovitz, Daniel, Uliel, David, Rodrigues, Miguel, and Giryes, Raja
Subjects: Computer Science - Machine Learning
Abstract: Transfer learning is a valuable tool in deep learning as it allows propagating information from one "source dataset" to another "target dataset", especially in the case of a small number of training examples in the latter. Yet, discrepancies between the underlying distributions of the source and target data are commonplace and are known to have a substantial impact on algorithm performance. In this work we suggest novel information-theoretic approaches for the analysis of the performance of deep neural networks in the context of transfer learning. We focus on the task of semi-supervised transfer learning, in which unlabeled samples from the target dataset are available during network training on the source dataset. Our theory suggests that one may improve the transferability of a deep neural network by incorporating regularization terms on the target data based on information-theoretic quantities, namely the Mutual Information and the Lautum Information. We demonstrate the effectiveness of the proposed approaches in various semi-supervised transfer learning experiments., Comment: arXiv admin note: substantial text overlap with arXiv:1904.01670
Published: 2023

13. SENS: Part-Aware Sketch-based Implicit Neural Shape Modeling

Author: Binninger, Alexandre, Hertz, Amir, Sorkine-Hornung, Olga, Cohen-Or, Daniel, and Giryes, Raja
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: We present SENS, a novel method for generating and editing 3D models from hand-drawn sketches, including those of abstract nature. Our method allows users to quickly and easily sketch a shape, and then maps the sketch into the latent space of a part-aware neural implicit shape architecture. SENS analyzes the sketch and encodes its parts into ViT patch encoding, subsequently feeding them into a transformer decoder that converts them to shape embeddings suitable for editing 3D neural implicit shapes. SENS provides intuitive sketch-based generation and editing, and also succeeds in capturing the intent of the user's sketch to generate a variety of novel and expressive 3D shapes, even from abstract and imprecise sketches. Additionally, SENS supports refinement via part reconstruction, allowing for nuanced adjustments and artifact removal. It also offers part-based modeling capabilities, enabling the combination of features from multiple sketches to create more complex and customized 3D shapes. We demonstrate the effectiveness of our model compared to the state-of-the-art using objective metric evaluation criteria and a user study, both indicating strong performance on sketches with a medium level of abstraction. Furthermore, we showcase our method's intuitive sketch-based shape editing capabilities, and validate it through a usability study., Comment: 25 pages, 24 figures
Published: 2023

14. Pruning at Initialization -- A Sketching Perspective

Author: Bar, Noga and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: The lottery ticket hypothesis (LTH) has increased attention to pruning neural networks at initialization. We study this problem in the linear setting. We show that finding a sparse mask at initialization is equivalent to the sketching problem introduced for efficient matrix multiplication. This gives us tools to analyze the LTH problem and gain insights into it. Specifically, using the mask found at initialization, we bound the approximation error of the pruned linear model at the end of training. We theoretically justify previous empirical evidence that the search for sparse networks may be data independent. By using the sketching perspective, we suggest a generic improvement to existing algorithms for pruning at initialization, which we show to be beneficial in the data-independent case., Comment: 20 pages
Published: 2023

15. Neural (Tangent Kernel) Collapse

Author: Seleznova, Mariia, Weitzner, Dana, Giryes, Raja, Kutyniok, Gitta, and Chou, Hung-Hsu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: This work bridges two important concepts: the Neural Tangent Kernel (NTK), which captures the evolution of deep neural networks (DNNs) during training, and the Neural Collapse (NC) phenomenon, which refers to the emergence of symmetry and structure in the last-layer features of well-trained classification DNNs. We adopt the natural assumption that the empirical NTK develops a block structure aligned with the class labels, i.e., samples within the same class have stronger correlations than samples from different classes. Under this assumption, we derive the dynamics of DNNs trained with mean squared (MSE) loss and break them into interpretable phases. Moreover, we identify an invariant that captures the essence of the dynamics, and use it to prove the emergence of NC in DNNs with block-structured NTK. We provide large-scale numerical experiments on three common DNN architectures and three benchmark datasets to support our theory.
Published: 2023

16. UDPM: Upsampling Diffusion Probabilistic Models

Author: Abu-Hussein, Shady and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Denoising Diffusion Probabilistic Models (DDPM) have recently gained significant attention. DDPMs compose a Markovian process that begins in the data domain and gradually adds noise until reaching pure white noise. DDPMs generate high-quality samples from complex data distributions by defining an inverse process and training a deep neural network to learn this mapping. However, these models are inefficient because they require many diffusion steps to produce aesthetically pleasing samples. Additionally, unlike generative adversarial networks (GANs), the latent space of diffusion models is less interpretable. In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM). In the forward process, we reduce the latent variable dimension through downsampling, followed by the traditional noise perturbation. As a result, the reverse process gradually denoises and upsamples the latent variable to produce a sample from the data distribution. We formalize the Markovian diffusion processes of UDPM and demonstrate its generation capabilities on the popular FFHQ, AFHQv2, and CIFAR10 datasets. UDPM generates images with as few as three network evaluations, whose overall computational cost is less than a single DDPM or EDM step, while achieving an FID score of 6.86. This surpasses current state-of-the-art efficient diffusion models that use a single denoising step for sampling. Additionally, UDPM offers an interpretable and interpolable latent space, which gives it an advantage over traditional DDPMs. Our code is available online: \url{https://github.com/shadyabh/UDPM/}
Published: 2023

17. Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes

Author: Cohen-Bar, Dana, Richardson, Elad, Metzer, Gal, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning
Abstract: Recent breakthroughs in text-guided image generation have led to remarkable progress in the field of 3D synthesis from text. By optimizing neural radiance fields (NeRF) directly from text, recent methods are able to produce remarkable results. Yet, these methods are limited in their control of each object's placement or appearance, as they represent the scene as a whole. This can be a major issue in scenarios that require refining or manipulating objects in the scene. To remedy this deficit, we propose a novel GlobalLocal training framework for synthesizing a 3D scene using object proxies. A proxy represents the object's placement in the generated scene and optionally defines its coarse geometry. The key to our approach is to represent each object as an independent NeRF. We alternate between optimizing each NeRF on its own and as part of the full scene. Thus, a complete representation of each object can be learned, while also creating a harmonious scene with style and lighting match. We show that using proxies allows a wide variety of editing options, such as adjusting the placement of each independent object, removing objects from a scene, or refining an object. Our results show that Set-the-Scene offers a powerful solution for scene synthesis and manipulation, filling a crucial gap in controllable text-to-3D synthesis., Comment: project page at https://danacohen95.github.io/Set-the-Scene/
Published: 2023

18. Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets

Author: Cohen-Karlik, Edo, Menuhin-Gruman, Itamar, Giryes, Raja, Cohen, Nadav, and Globerson, Amir
Subjects: Computer Science - Machine Learning
Abstract: Overparameterization in deep learning typically refers to settings where a trained neural network (NN) has representational capacity to fit the training data in many ways, some of which generalize well, while others do not. In the case of Recurrent Neural Networks (RNNs), there exists an additional layer of overparameterization, in the sense that a model may exhibit many solutions that generalize well for sequence lengths seen in training, some of which extrapolate to longer sequences, while others do not. Numerous works have studied the tendency of Gradient Descent (GD) to fit overparameterized NNs with solutions that generalize well. On the other hand, its tendency to fit overparameterized RNNs with solutions that extrapolate has been discovered only recently and is far less understood. In this paper, we analyze the extrapolation properties of GD when applied to overparameterized linear RNNs. In contrast to recent arguments suggesting an implicit bias towards short-term memory, we provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory. Our result relies on a dynamical characterization which shows that GD (with small step size and near-zero initialization) strives to maintain a certain form of balancedness, as well as on tools developed in the context of the moment problem from statistics (recovery of a probability distribution from its moments). Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs., Comment: Accepted to ICLR 2023, 9 pages, 2 figures plus supplementary
Published: 2022

19. A Diffusion Model Predicts 3D Shapes from 2D Microscopy Images

Author: Waibel, Dominik J. E., Röell, Ernst, Rieck, Bastian, Giryes, Raja, and Marr, Carsten
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Diffusion models are a special type of generative model, capable of synthesising new data from a learnt distribution. We introduce DISPR, a diffusion-based model for solving the inverse problem of three-dimensional (3D) cell shape prediction from two-dimensional (2D) single cell microscopy images. Using the 2D microscopy image as a prior, DISPR is conditioned to predict realistic 3D shape reconstructions. To showcase the applicability of DISPR as a data augmentation tool in a feature-based single cell classification task, we extract morphological features from the red blood cells grouped into six highly imbalanced classes. Adding features from the DISPR predictions to the three minority classes improved the macro F1 score from $F1_\text{macro} = 55.2 \pm 4.6\%$ to $F1_\text{macro} = 72.2 \pm 4.9\%$. We thus demonstrate that diffusion models can be successfully applied to inverse biomedical problems, and that they learn to reconstruct 3D shapes with realistic morphological features from 2D microscopy images.
Published: 2022

20. Utilizing Excess Resources in Training Neural Networks

Author: Henig, Amit and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work, we suggest Kernel Filtering Linear Overparameterization (KFLO), where a linear cascade of filtering layers is used during training to improve network performance in test time. We implement this cascade in a kernel filtering fashion, which prevents the trained architecture from becoming unnecessarily deeper. This also allows using our approach with almost any network architecture and let combining the filtering layers into a single layer in test time. Thus, our approach does not add computational complexity during inference. We demonstrate the advantage of KFLO on various network models and datasets in supervised learning., Comment: Accepted to ICIP 2022. Code available at https://github.com/AmitHenig/KFLO
Published: 2022

21. Membership Inference Attack Using Self Influence Functions

Author: Cohen, Gilad and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Member inference (MI) attacks aim to determine if a specific data sample was used to train a machine learning model. Thus, MI is a major privacy threat to models trained on private sensitive data, such as medical records. In MI attacks one may consider the black-box settings, where the model's parameters and activations are hidden from the adversary, or the white-box case where they are available to the attacker. In this work, we focus on the latter and present a novel MI attack for it that employs influence functions, or more specifically the samples' self-influence scores, to perform the MI prediction. We evaluate our attack on CIFAR-10, CIFAR-100, and Tiny ImageNet datasets, using versatile architectures such as AlexNet, ResNet, and DenseNet. Our attack method achieves new state-of-the-art results for both training with and without data augmentations. Code is available at https://github.com/giladcohen/sif_mi_attack.
Published: 2022

22. ProCST: Boosting Semantic Segmentation Using Progressive Cyclic Style-Transfer

Author: Ettedgui, Shahaf, Abu-Hussein, Shady, and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Using synthetic data for training neural networks that achieve good performance on real-world data is an important task as it can reduce the need for costly data annotation. Yet, synthetic and real world data have a domain gap. Reducing this gap, also known as domain adaptation, has been widely studied in recent years. Closing the domain gap between the source (synthetic) and target (real) data by directly performing the adaptation between the two is challenging. In this work, we propose a novel two-stage framework for improving domain adaptation techniques on image data. In the first stage, we progressively train a multi-scale neural network to perform image translation from the source domain to the target domain. We denote the new transformed data as "Source in Target" (SiT). Then, we insert the generated SiT data as the input to any standard UDA approach. This new data has a reduced domain gap from the desired target domain, which facilitates the applied UDA approach to close the gap further. We emphasize the effectiveness of our method via a comparison to other leading UDA and image-to-image translation techniques when used as SiT generators. Moreover, we demonstrate the improvement of our framework with three state-of-the-art UDA methods for semantic segmentation, HRDA, DAFormer and ProDA, on two UDA tasks, GTA5 to Cityscapes and Synthia to Cityscapes., Comment: Code available at https://github.com/shahaf1313/ProCST
Published: 2022

23. Stress-Testing Point Cloud Registration on Automotive LiDAR

Author: Drory, Amnon, Avidan, Shai, and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Rigid Point Cloud Registration (PCR) algorithms aim to estimate the 6-DOF relative motion between two point clouds, which is important in various fields, including autonomous driving. Recent years have seen a significant improvement in global PCR algorithms, i.e. algorithms that can handle a large relative motion. This has been demonstrated in various scenarios, including indoor scenes, but has only been minimally tested in the Automotive setting, where point clouds are produced by vehicle-mounted LiDAR sensors. In this work, we aim to answer questions that are important for automotive applications, including: which of the new algorithms is the most accurate, and which is fastest? How transferable are deep-learning approaches, e.g. what happens when you train a network with data from Boston, and run it in a vehicle in Singapore? How small can the overlap between point clouds be before the algorithms start to deteriorate? To what extent are the algorithms rotation invariant? Our results are at times surprising. When comparing robust parameter estimation methods for registration, we find that the fastest and most accurate is not one of the newest approaches. Instead, it is a modern variant of the well known RANSAC technique. We also suggest a new outlier filtering method, Grid-Prioritized Filtering (GPF), to further improve it. An additional contribution of this work is an algorithm for selecting challenging sets of frame-pairs from automotive LiDAR datasets. This enables meaningful benchmarking in the Automotive LiDAR setting, and can also improve training for learning algorithms., Comment: Accepted to the NeurIPS 2022 workshop on Machine Learning for Autonomous Driving. Project Page: https://github.com/AmnonDrory/LidarRegistration
Published: 2022

24. Generative Adversarial Networks

Author: Cohen, Gilad and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Generative Adversarial Networks (GANs) are very popular frameworks for generating high-quality data, and are immensely used in both the academia and industry in many domains. Arguably, their most substantial impact has been in the area of computer vision, where they achieve state-of-the-art image generation. This chapter gives an introduction to GANs, by discussing their principle mechanism and presenting some of their inherent problems during training and evaluation. We focus on these three issues: (1) mode collapse, (2) vanishing gradients, and (3) generation of low-quality images. We then list some architecture-variant and loss-variant GANs that remedy the above challenges. Lastly, we present two utilization examples of GANs for real-world applications: Data augmentation and face images generation.
Published: 2022

25. SPAGHETTI: Editing Implicit Shapes Through Part Aware Generation

Author: Hertz, Amir, Perel, Or, Giryes, Raja, Sorkine-Hornung, Olga, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Neural implicit fields are quickly emerging as an attractive representation for learning based techniques. However, adopting them for 3D shape modeling and editing is challenging. We introduce a method for $\mathbf{E}$diting $\mathbf{I}$mplicit $\mathbf{S}$hapes $\mathbf{T}$hrough $\mathbf{P}$art $\mathbf{A}$ware $\mathbf{G}$enera$\mathbf{T}$ion, permuted in short as SPAGHETTI. Our architecture allows for manipulation of implicit shapes by means of transforming, interpolating and combining shape segments together, without requiring explicit part supervision. SPAGHETTI disentangles shape part representation into extrinsic and intrinsic geometric information. This characteristic enables a generative framework with part-level control. The modeling capabilities of SPAGHETTI are demonstrated using an interactive graphical interface, where users can directly edit neural implicit shapes.
Published: 2022

26. NeuralMLS: Geometry-Aware Control Point Deformation

Author: Shechter, Meitar, Hanocka, Rana, Metzer, Gal, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: We introduce NeuralMLS, a space-based deformation technique, guided by a set of displaced control points. We leverage the power of neural networks to inject the underlying shape geometry into the deformation parameters. The goal of our technique is to enable a realistic and intuitive shape deformation. Our method is built upon moving least-squares (MLS), since it minimizes a weighted sum of the given control point displacements. Traditionally, the influence of each control point on every point in space (i.e., the weighting function) is defined using inverse distance heuristics. In this work, we opt to learn the weighting function, by training a neural network on the control points from a single input shape, and exploit the innate smoothness of neural networks. Our geometry-aware control point deformation is agnostic to the surface representation and quality; it can be applied to point clouds or meshes, including non-manifold and disconnected surface soups. We show that our technique facilitates intuitive piecewise smooth deformations, which are well suited for manufactured objects. We show the advantages of our approach compared to existing surface and space-based deformation techniques, both quantitatively and qualitatively., Comment: Eurographics 2022 Short Papers
Published: 2022

27. Mesh Draping: Parametrization-Free Neural Mesh Transfer

Author: Hertz, Amir, Perel, Or, Giryes, Raja, Sorkine-Hornung, Olga, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Despite recent advances in geometric modeling, 3D mesh modeling still involves a considerable amount of manual labor by experts. In this paper, we introduce Mesh Draping: a neural method for transferring existing mesh structure from one shape to another. The method drapes the source mesh over the target geometry and at the same time seeks to preserve the carefully designed characteristics of the source mesh. At its core, our method deforms the source mesh using progressive positional encoding. We show that by leveraging gradually increasing frequencies to guide the neural optimization, we are able to achieve stable and high quality mesh transfer. Our approach is simple and requires little user guidance, compared to contemporary surface mapping techniques which rely on parametrization or careful manual tuning. Most importantly, Mesh Draping is a parameterization-free method, and thus applicable to a variety of target shape representations, including point clouds, polygon soups, and non-manifold meshes. We demonstrate that the transferred meshing remains faithful to the source mesh design characteristics, and at the same time fits the target geometry well., Comment: 12 pages. Portions of this work previously appeared as arXiv:2104.09125v1 which has been split into two works: arXiv:2104.09125v2+ and this work
Published: 2021

28. Simple Post-Training Robustness Using Test Time Augmentations and Random Forest

Author: Cohen, Gilad and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Although Deep Neural Networks (DNNs) achieve excellent performance on many real-world tasks, they are highly vulnerable to adversarial attacks. A leading defense against such attacks is adversarial training, a technique in which a DNN is trained to be robust to adversarial attacks by introducing adversarial noise to its input. This procedure is effective but must be done during the training phase. In this work, we propose Augmented Random Forest (ARF), a simple and easy-to-use strategy for robustifying an existing pretrained DNN without modifying its weights. For every image, we generate randomized test time augmentations by applying diverse color, blur, noise, and geometric transforms. Then we use the DNN's logits output to train a simple random forest to predict the real class label. Our method achieves state-of-the-art adversarial robustness on a diversity of white and black box attacks with minimal compromise on the natural images' classification. We test ARF also against numerous adaptive white-box attacks and it shows excellent results when combined with adversarial training. Code is available at https://github.com/giladcohen/ARF.
Published: 2021

29. MISS GAN: A Multi-IlluStrator Style Generative Adversarial Network for image to illustration translation

Author: Barzilay, Noa, Shalev, Tal Berkovitz, and Giryes, Raja
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Unsupervised style transfer that supports diverse input styles using only one trained generator is a challenging and interesting task in computer vision. This paper proposes a Multi-IlluStrator Style Generative Adversarial Network (MISS GAN) that is a multi-style framework for unsupervised image-to-illustration translation, which can generate styled yet content preserving images. The illustrations dataset is a challenging one since it is comprised of illustrations of seven different illustrators, hence contains diverse styles. Existing methods require to train several generators (as the number of illustrators) to handle the different illustrators' styles, which limits their practical usage, or require to train an image specific network, which ignores the style information provided in other images of the illustrator. MISS GAN is both input image specific and uses the information of other images using only one trained model., Comment: Accepted to Pattern Recognition Letters
Published: 2021
Full Text: View/download PDF

30. Z2P: Instant Visualization of Point Clouds

Author: Metzer, Gal, Hanocka, Rana, Giryes, Raja, Mitra, Niloy J., and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: We present a technique for visualizing point clouds using a neural network. Our technique allows for an instant preview of any point cloud, and bypasses the notoriously difficult surface reconstruction problem or the need to estimate oriented normals for splat-based rendering. We cast the preview problem as a conditional image-to-image translation task, and design a neural network that translates point depth-map directly into an image, where the point cloud is visualized as though a surface was reconstructed from it. Furthermore, the resulting appearance of the visualized point cloud can be, optionally, conditioned on simple control variables (e.g., color and light). We demonstrate that our technique instantly produces plausible images, and can, on-the-fly effectively handle noise, non-uniform sampling, and thin surfaces sheets., Comment: Eurographics 2022
Published: 2021

31. FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction

Author: Gordon, Brian, Raab, Sigal, Azov, Guy, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning, I.4.5
Abstract: The increasing availability of video recordings made by multiple cameras has offered new means for mitigating occlusion and depth ambiguities in pose and motion reconstruction methods. Yet, multi-view algorithms strongly depend on camera parameters; particularly, the relative transformations between the cameras. Such a dependency becomes a hurdle once shifting to dynamic capture in uncontrolled settings. We introduce FLEX (Free muLti-view rEconstruXion), an end-to-end extrinsic parameter-free multi-view model. FLEX is extrinsic parameter-free (dubbed ep-free) in the sense that it does not require extrinsic camera parameters. Our key idea is that the 3D angles between skeletal parts, as well as bone lengths, are invariant to the camera position. Hence, learning 3D rotations and bone lengths rather than locations allows predicting common values for all camera views. Our network takes multiple video streams, learns fused deep features through a novel multi-view fusion layer, and reconstructs a single consistent skeleton with temporally coherent joint rotations. We demonstrate quantitative and qualitative results on three public datasets, and on synthetic multi-person video streams captured by dynamic cameras. We compare our model to state-of-the-art methods that are not ep-free and show that in the absence of camera parameters, we outperform them by a large margin while obtaining comparable results when camera parameters are available. Code, trained models, and other materials are available on our project page., Comment: Accepted to ECCV22 Project page: https://briang13.github.io/FLEX/ Video: https://www.youtube.com/watch?v=hXc97BDJJTg
Published: 2021

32. Orienting Point Clouds with Dipole Propagation

Author: Metzer, Gal, Hanocka, Rana, Zorin, Denis, Giryes, Raja, Panozzo, Daniele, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Establishing a consistent normal orientation for point clouds is a notoriously difficult problem in geometry processing, requiring attention to both local and global shape characteristics. The normal direction of a point is a function of the local surface neighborhood; yet, point clouds do not disclose the full underlying surface structure. Even assuming known geodesic proximity, calculating a consistent normal orientation requires the global context. In this work, we introduce a novel approach for establishing a globally consistent normal orientation for point clouds. Our solution separates the local and global components into two different sub-problems. In the local phase, we train a neural network to learn a coherent normal direction per patch (i.e., consistently oriented normals within a single patch). In the global phase, we propagate the orientation across all coherent patches using a dipole propagation. Our dipole propagation decides to orient each patch using the electric field defined by all previously orientated patches. This gives rise to a global propagation that is stable, as well as being robust to nearby surfaces, holes, sharp features and noise., Comment: SIGGRAPH 2021
Published: 2021
Full Text: View/download PDF

33. SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Author: Hertz, Amir, Perel, Or, Giryes, Raja, Sorkine-Hornung, Olga, and Cohen-Or, Daniel
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
Abstract: Multilayer-perceptrons (MLP) are known to struggle with learning functions of high-frequencies, and in particular cases with wide frequency bands. We present a spatially adaptive progressive encoding (SAPE) scheme for input signals of MLP networks, which enables them to better fit a wide range of frequencies without sacrificing training stability or requiring any domain specific preprocessing. SAPE gradually unmasks signal components with increasing frequencies as a function of time and space. The progressive exposure of frequencies is monitored by a feedback loop throughout the neural optimization process, allowing changes to propagate at different rates among local spatial portions of the signal space. We demonstrate the advantage of SAPE on a variety of domains and applications, including regression of low dimensional signals and images, representation learning of occupancy networks, and a geometric task of mesh transfer between 3D shapes.
Published: 2021

34. Multiplicative Reweighting for Robust Neural Network Optimization

Author: Bar, Noga, Koren, Tomer, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Neural networks are widespread due to their powerful performance. However, they degrade in the presence of noisy labels at training time. Inspired by the setting of learning with expert advice, where multiplicative weight (MW) updates were recently shown to be robust to moderate data corruptions in expert advice, we propose to use MW for reweighting examples during neural networks optimization. We theoretically establish the convergence of our method when used with gradient descent and prove its advantages in 1d cases. We then validate our findings empirically for the general case by showing that MW improves the accuracy of neural networks in the presence of label noise on CIFAR-10, CIFAR-100 and Clothing1M. We also show the impact of our approach on adversarial robustness., Comment: Our code is publicly available in https://github.com/NogaBar/mr_robust_optim
Published: 2021

35. GMM-Based Generative Adversarial Encoder Learning

Author: Feigin, Yuri, Spitzer, Hedva, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: While GAN is a powerful model for generating images, its inability to infer a latent space directly limits its use in applications requiring an encoder. Our paper presents a simple architectural setup that combines the generative capabilities of GAN with an encoder. We accomplish this by combining the encoder with the discriminator using shared weights, then training them simultaneously using a new loss term. We model the output of the encoder latent space via a GMM, which leads to both good clustering using this latent space and improved image generation by the GAN. Our framework is generic and can be easily plugged into any GAN strategy. In particular, we demonstrate it both with Vanilla GAN and Wasserstein GAN, where in both it leads to an improvement in the generated images in terms of both the IS and FID scores. Moreover, we show that our encoder learns a meaningful representation as its clustering results are competitive with the current GAN-based state-of-the-art in clustering.
Published: 2020

36. Kernel-Based Smoothness Analysis of Residual Networks

Author: Tirer, Tom, Bruna, Joan, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: A major factor in the success of deep neural networks is the use of sophisticated architectures rather than the classical multilayer perceptron (MLP). Residual networks (ResNets) stand out among these powerful modern architectures. Previous works focused on the optimization advantages of deep ResNets over deep MLPs. In this paper, we show another distinction between the two models, namely, a tendency of ResNets to promote smoother interpolations than MLPs. We analyze this phenomenon via the neural tangent kernel (NTK) approach. First, we compute the NTK for a considered ResNet model and prove its stability during gradient descent training. Then, we show by various evaluation methodologies that for ReLU activations the NTK of ResNet, and its kernel regression results, are smoother than the ones of MLP. The better smoothness observed in our analysis may explain the better generalization ability of ResNets and the practice of moderately attenuating the residual blocks., Comment: Accepted to MSML 2021
Published: 2020

37. Self-Sampling for Neural Point Cloud Consolidation

Author: Metzer, Gal, Hanocka, Rana, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: We introduce a novel technique for neural point cloud consolidation which learns from only the input point cloud. Unlike other point upsampling methods which analyze shapes via local patches, in this work, we learn from global subsets. We repeatedly self-sample the input point cloud with global subsets that are used to train a deep neural network. Specifically, we define source and target subsets according to the desired consolidation criteria (e.g., generating sharp points or points in sparse regions). The network learns a mapping from source to target subsets, and implicitly learns to consolidate the point cloud. During inference, the network is fed with random subsets of points from the input, which it displaces to synthesize a consolidated point set. We leverage the inductive bias of neural networks to eliminate noise and outliers, a notoriously difficult problem in point cloud consolidation. The shared weights of the network are optimized over the entire shape, learning non-local statistics and exploiting the recurrence of local-scale geometries. Specifically, the network encodes the distribution of the underlying shape surface within a fixed set of local kernels, which results in the best explanation of the underlying shape surface. We demonstrate the ability to consolidate point sets from a variety of shapes, while eliminating outliers and noise., Comment: TOG 2021
Published: 2020

38. Self-supervised Neural Architecture Search

Author: Kaplan, Sapir and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Neural Architecture Search (NAS) has been used recently to achieve improved performance in various tasks and most prominently in image classification. Yet, current search strategies rely on large labeled datasets, which limit their usage in the case where only a smaller fraction of the data is annotated. Self-supervised learning has shown great promise in training neural networks using unlabeled data. In this work, we propose a self-supervised neural architecture search (SSNAS) that allows finding novel network models without the need for labeled data. We show that such a search leads to comparable results to supervised training with a "fully labeled" NAS and that it can improve the performance of self-supervised learning. Moreover, we demonstrate the advantage of the proposed approach when the number of labels in the search is relatively small.
Published: 2020

39. Deep Geometric Texture Synthesis

Author: Hertz, Amir, Hanocka, Rana, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Recently, deep generative adversarial networks for image generation have advanced rapidly; yet, only a small amount of research has focused on generative models for irregular structures, particularly meshes. Nonetheless, mesh generation and synthesis remains a fundamental topic in computer graphics. In this work, we propose a novel framework for synthesizing geometric textures. It learns geometric texture statistics from local neighborhoods (i.e., local triangular patches) of a single reference 3D model. It learns deep features on the faces of the input triangulation, which is used to subdivide and generate offsets across multiple scales, without parameterization of the reference or target mesh. Our network displaces mesh vertices in any direction (i.e., in the normal and tangential direction), enabling synthesis of geometric textures, which cannot be expressed by a simple 2D displacement map. Learning and synthesizing on local geometric patches enables a genus-oblivious framework, facilitating texture transfer between shapes of different genus., Comment: SIGGRAPH 2020
Published: 2020
Full Text: View/download PDF

40. Point2Mesh: A Self-Prior for Deformable Meshes

Author: Hanocka, Rana, Metzer, Gal, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Graphics, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: In this paper, we introduce Point2Mesh, a technique for reconstructing a surface mesh from an input point cloud. Instead of explicitly specifying a prior that encodes the expected shape properties, the prior is defined automatically using the input point cloud, which we refer to as a self-prior. The self-prior encapsulates reoccurring geometric repetitions from a single shape within the weights of a deep neural network. We optimize the network weights to deform an initial mesh to shrink-wrap a single input point cloud. This explicitly considers the entire reconstructed shape, since shared local kernels are calculated to fit the overall object. The convolutional kernels are optimized globally across the entire shape, which inherently encourages local-scale geometric self-similarity across the shape surface. We show that shrink-wrapping a point cloud with a self-prior converges to a desirable solution; compared to a prescribed smoothness prior, which often becomes trapped in undesirable local minima. While the performance of traditional reconstruction approaches degrades in non-ideal conditions that are often present in real world scanning, i.e., unoriented normals, noise and missing (low density) parts, Point2Mesh is robust to non-ideal conditions. We demonstrate the performance of Point2Mesh on a large variety of shapes with varying complexity., Comment: SIGGRAPH 2020; Project page: https://ranahanocka.github.io/point2mesh/
Published: 2020
Full Text: View/download PDF

41. On the Convergence Rate of Projected Gradient Descent for a Back-Projection based Objective

Author: Tirer, Tom and Giryes, Raja
Subjects: Mathematics - Optimization and Control, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Mathematics - Numerical Analysis, Statistics - Machine Learning
Abstract: Ill-posed linear inverse problems appear in many scientific setups, and are typically addressed by solving optimization problems, which are composed of data fidelity and prior terms. Recently, several works have considered a back-projection (BP) based fidelity term as an alternative to the common least squares (LS), and demonstrated excellent results for popular inverse problems. These works have also empirically shown that using the BP term, rather than the LS term, requires fewer iterations of optimization algorithms. In this paper, we examine the convergence rate of the projected gradient descent (PGD) algorithm for the BP objective. Our analysis allows to identify an inherent source for its faster convergence compared to using the LS objective, while making only mild assumptions. We also analyze the more general proximal gradient method under a relaxed contraction condition on the proximal mapping of the prior. This analysis further highlights the advantage of BP when the linear measurement operator is badly conditioned. Numerical experiments with both $\ell_1$-norm and GAN-based priors corroborate our theoretical results., Comment: Accepted to SIAM Journal on Imaging Sciences (SIIMS)
Published: 2020

42. A function space analysis of finite neural networks with insights from sampling theory

Author: Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Mathematics - Functional Analysis, Statistics - Machine Learning, 62D05, 68Q32, I.2.6
Abstract: This work suggests using sampling theory to analyze the function space represented by neural networks. First, it shows, under the assumption of a finite input domain, which is the common case in training neural networks, that the function space generated by multi-layer networks with non-expansive activation functions is smooth. This extends over previous works that show results for the case of infinite width ReLU networks. Then, under the assumption that the input is band-limited, we provide novel error bounds for univariate neural networks. We analyze both deterministic uniform and random sampling showing the advantage of the former., Comment: Accepted to Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Published: 2020

43. PointGMM: a Neural GMM Network for Point Clouds

Author: Hertz, Amir, Hanocka, Rana, Giryes, Raja, and Cohen-Or, Daniel
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Statistics - Machine Learning
Abstract: Point clouds are a popular representation for 3D shapes. However, they encode a particular sampling without accounting for shape priors or non-local information. We advocate for the use of a hierarchical Gaussian mixture model (hGMM), which is a compact, adaptive and lightweight representation that probabilistically defines the underlying 3D surface. We present PointGMM, a neural network that learns to generate hGMMs which are characteristic of the shape class, and also coincide with the input point cloud. PointGMM is trained over a collection of shapes to learn a class-specific prior. The hierarchical representation has two main advantages: (i) coarse-to-fine learning, which avoids converging to poor local-minima; and (ii) (an unsupervised) consistent partitioning of the input shape. We show that as a generative model, PointGMM learns a meaningful latent space which enables generating consistent interpolations between existing shapes, as well as synthesizing novel shapes. We also present a novel framework for rigid registration using PointGMM, that learns to disentangle orientation from structure of an input shape., Comment: CVPR 2020 -- final version
Published: 2020

44. Autoencoders

Author: Bank, Dor, Koenigstein, Noam, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: An autoencoder is a specific type of a neural network, which is mainly designed to encode the input into a compressed and meaningful representation, and then decode it back such that the reconstructed input is similar as possible to the original one. This chapter surveys the different types of autoencoders that are mainly used today. It also describes various applications and use-cases of autoencoders., Comment: Book chapter
Published: 2020

45. BP-DIP: A Backprojection based Deep Image Prior

Author: Zukerman, Jenny, Tirer, Tom, and Giryes, Raja
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Deep neural networks are a very powerful tool for many computer vision tasks, including image restoration, exhibiting state-of-the-art results. However, the performance of deep learning methods tends to drop once the observation model used in training mismatches the one in test time. In addition, most deep learning methods require vast amounts of training data, which are not accessible in many applications. To mitigate these disadvantages, we propose to combine two image restoration approaches: (i) Deep Image Prior (DIP), which trains a convolutional neural network (CNN) from scratch in test time using the given degraded image. It does not require any training data and builds on the implicit prior imposed by the CNN architecture; and (ii) a backprojection (BP) fidelity term, which is an alternative to the standard least squares loss that is usually used in previous DIP works. We demonstrate the performance of the proposed method, termed BP-DIP, on the deblurring task and show its advantages over the plain DIP, with both higher PSNR values and better inference run-time., Comment: Accepted to EUSIPCO 2020. Link to code: https://github.com/jennyzu/BP-DIP-deblurring. 5 pages, 5 figures, 1 table
Published: 2020

46. Introduction to deep learning

Author: Shiloh-Perl, Lihi and Giryes, Raja
Subjects: Computer Science - Machine Learning
Abstract: Deep Learning (DL) has made a major impact on data science in the last decade. This chapter introduces the basic concepts of this field. It includes both the basic structures used to design deep neural networks and a brief survey of some of its popular use cases.
Published: 2020

47. DEGAS: Differentiable Efficient Generator Search

Author: Doveh, Sivan and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Network architecture search (NAS) achieves state-of-the-art results in various tasks such as classification and semantic segmentation. Recently, a reinforcement learning-based approach has been proposed for Generative Adversarial Networks (GANs) search. In this work, we propose an alternative strategy for GAN search by using a method called DEGAS (Differentiable Efficient GenerAtor Search), which focuses on efficiently finding the generator in the GAN. Our search algorithm is inspired by the differential architecture search strategy and the Global Latent Optimization (GLO) procedure. This leads to both an efficient and stable GAN search. After the generator architecture is found, it can be plugged into any existing framework for GAN training. For CTGAN, which we use in this work, the new model outperforms the original inception score results by 0.25 for CIFAR-10 and 0.77 for STL. It also gets better results than the RL based GAN search methods in shorter search time., Comment: appeared at NeurIPS 2019, Meta-Learning Workshop
Published: 2019

48. MetAdapt: Meta-Learned Task-Adaptive Architecture for Few-Shot Classification

Author: Doveh, Sivan, Schwartz, Eli, Xue, Chao, Feris, Rogerio, Bronstein, Alex, Giryes, Raja, and Karlinsky, Leonid
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. Another topic closely related to meta-learning with a lot of interest in the community is Neural Architecture Search (NAS), automatically finding optimal architecture instead of engineering it manually. In this work, we combine these two aspects of meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (where they start to degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools inspired by the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of `MetAdapt Controller' modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art results on two popular few-shot benchmarks: miniImageNet and FC100.
Published: 2019

49. Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers

Author: Hussein, Shady Abu, Tirer, Tom, and Giryes, Raja
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: The single image super-resolution task is one of the most examined inverse problems in the past decade. In the recent years, Deep Neural Networks (DNNs) have shown superior performance over alternative methods when the acquisition process uses a fixed known downsampling kernel-typically a bicubic kernel. However, several recent works have shown that in practical scenarios, where the test data mismatch the training data (e.g. when the downsampling kernel is not the bicubic kernel or is not available at training), the leading DNN methods suffer from a huge performance drop. Inspired by the literature on generalized sampling, in this work we propose a method for improving the performance of DNNs that have been trained with a fixed kernel on observations acquired by other kernels. For a known kernel, we design a closed-form correction filter that modifies the low-resolution image to match one which is obtained by another kernel (e.g. bicubic), and thus improves the results of existing pre-trained DNNs. For an unknown kernel, we extend this idea and propose an algorithm for blind estimation of the required correction filter. We show that our approach outperforms other super-resolution methods, which are designed for general downsampling kernels., Comment: Accepted to CVPR 2020 (Oral). Code is available at https://github.com/shadyabh/Correction-Filter
Published: 2019

50. Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors

Author: Cohen, Gilad, Sapiro, Guillermo, and Giryes, Raja
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Deep neural networks (DNNs) are notorious for their vulnerability to adversarial attacks, which are small perturbations added to their input images to mislead their prediction. Detection of adversarial examples is, therefore, a fundamental requirement for robust classification frameworks. In this work, we present a method for detecting such adversarial attacks, which is suitable for any pre-trained neural network classifier. We use influence functions to measure the impact of every training sample on the validation set data. From the influence scores, we find the most supportive training samples for any given validation example. A k-nearest neighbor (k-NN) model fitted on the DNN's activation layers is employed to search for the ranking of these supporting training samples. We observe that these samples are highly correlated with the nearest neighbors of the normal inputs, while this correlation is much weaker for adversarial inputs. We train an adversarial detector using the k-NN ranks and distances and show that it successfully distinguishes adversarial examples, getting state-of-the-art results on six attack methods with three datasets. Code is available at https://github.com/giladcohen/NNIF_adv_defense., Comment: Paper accepted to CVPR 2020
Published: 2019

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Database

Publisher

100 results on '"Giryes, Raja"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources