17 results on '"Paudice, P"'
Search Results
2. General Tail Bounds for Non-Smooth Stochastic Mirror Descent
- Author
-
Eldowa, Khaled and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this paper, we provide novel tail bounds on the optimization error of Stochastic Mirror Descent for convex and Lipschitz objectives. Our analysis extends the existing tail bounds from the classical light-tailed Sub-Gaussian noise case to heavier-tailed noise regimes. We study the optimization error of the last iterate as well as the average of the iterates. We instantiate our results in two important cases: a class of noise with exponential tails and one with polynomial tails. A remarkable feature of our results is that they do not require an upper bound on the diameter of the domain. Finally, we support our theory with illustrative experiments that compare the behavior of the average of the iterates with that of the last iterate in heavy-tailed noise regimes.
- Published
- 2023
3. An Improved Uniform Convergence Bound with Fat-Shattering Dimension
- Author
-
Colomboni, Roberto, Esposito, Emmanuel, and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
The fat-shattering dimension characterizes the uniform convergence property of real-valued functions. The state-of-the-art upper bounds feature a multiplicative squared logarithmic factor on the sample complexity, leaving an open gap with the existing lower bound. We provide an improved uniform convergence bound that closes this gap.
- Published
- 2023
4. Active Learning of Classifiers with Label and Seed Queries
- Author
-
Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, Paudice, Andrea, and Thiessen, Maximilian
- Subjects
Computer Science - Machine Learning - Abstract
We study exact active learning of binary and multiclass classifiers with margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any unknown classifier on $X$ whose classes have finite strong convex hull margin, a new notion extending the SVM margin. In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin $\gamma$ requires in the worst case $\Omega\big(1+\frac{1}{\gamma}\big)^{(m-1)/2}$ queries. On the other hand, using the more powerful seed queries (a variant of equivalence queries), the target classifier could be learned in $O(m \log n)$ queries via Littlestone's Halving algorithm; however, Halving is computationally inefficient. In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k!k^2$ multiplicative overhead. Similar results hold when the input points have bounded bit complexity, or when only one class has strong convex hull margin against the rest. We complement the upper bounds by showing that in the worst case any algorithm needs $\Omega\big(k m \log \frac{1}{\gamma}\big)$ seed and label queries to learn a $k$-class classifier with strong convex hull margin $\gamma$.
- Published
- 2022
5. Regret Analysis of Dyadic Search
- Author
-
Bachoc, François, Cesari, Tommaso, Colomboni, Roberto, and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Mathematics - Optimization and Control - Abstract
We analyze the cumulative regret of the Dyadic Search algorithm of Bachoc et al. [2022]., Comment: arXiv admin note: substantial text overlap with arXiv:2208.06720
- Published
- 2022
6. High Probability Bounds for Stochastic Subgradient Schemes with Heavy Tailed Noise
- Author
-
Parletta, Daniela A., Paudice, Andrea, Pontil, Massimiliano, and Salzo, Saverio
- Subjects
Mathematics - Optimization and Control ,Statistics - Machine Learning ,90C25, 62L20 - Abstract
In this work we study high probability bounds for stochastic subgradient methods under heavy tailed noise. In this setting the noise is only assumed to have finite variance as opposed to a sub-Gaussian distribution for which it is known that standard subgradient methods enjoys high probability bounds. We analyzed a clipped version of the projected stochastic subgradient method, where subgradient estimates are truncated whenever they have large norms. We show that this clipping strategy leads both to near optimal any-time and finite horizon bounds for many classical averaging schemes. Preliminary experiments are shown to support the validity of the method., Comment: 39 pages
- Published
- 2022
7. A Near-Optimal Algorithm for Univariate Zeroth-Order Budget Convex Optimization
- Author
-
Bachoc, François, Cesari, Tommaso, Colomboni, Roberto, and Paudice, Andrea
- Subjects
Mathematics - Optimization and Control ,Computer Science - Machine Learning - Abstract
This paper studies a natural generalization of the problem of minimizing a univariate convex function $f$ by querying its values sequentially. At each time-step $t$, the optimizer can invest a budget $b_t$ in a query point $X_t$ of their choice to obtain a fuzzy evaluation of $f$ at $X_t$ whose accuracy depends on the amount of budget invested in $X_t$ across times. This setting is motivated by the minimization of objectives whose values can only be determined approximately through lengthy or expensive computations. We design an any-time parameter-free algorithm called Dyadic Search, for which we prove near-optimal optimization error guarantees. As a byproduct of our analysis, we show that the classical dependence on the global Lipschitz constant in the error bounds is an artifact of the granularity of the budget. Finally, we illustrate our theoretical findings with numerical simulations.
- Published
- 2022
8. On Margin-Based Cluster Recovery with Oracle Queries
- Author
-
Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?", the task is to recover exactly all clusters using as few queries as possible. We begin by introducing a simple but general notion of margin between clusters that captures, as special cases, the margins used in previous work, the classic SVM margin, and standard notions of stability for center-based clusterings. Then, under our margin assumptions we design algorithms that, in a variety of settings, recover all clusters exactly using only $O(\log n)$ queries. For the Euclidean case, $\mathbb{R}^m$, we give an algorithm that recovers arbitrary convex clusters, in polynomial time, and with a number of queries that is lower than the best existing algorithm by $\Theta(m^m)$ factors. For general pseudometric spaces, where clusters might not be convex or might not have any notion of shape, we give an algorithm that achieves the $O(\log n)$ query bound, and is provably near-optimal as a function of the packing number of the space. Finally, for clusterings realized by binary concept classes, we give a combinatorial characterization of recoverability with $O(\log n)$ queries, and we show that, for many concept classes in Euclidean spaces, this characterization is equivalent to our margin condition. Our results show a deep connection between cluster margins and active cluster recoverability.
- Published
- 2021
9. Multitask Online Mirror Descent
- Author
-
Cesa-Bianchi, Nicolò, Laforgue, Pierre, Paudice, Andrea, and Pontil, Massimiliano
- Subjects
Computer Science - Machine Learning - Abstract
We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks. We prove that the regret of MT-OMD is of order $\sqrt{1 + \sigma^2(N-1)}\sqrt{T}$, where $\sigma^2$ is the task variance according to the geometry induced by the regularizer, $N$ is the number of tasks, and $T$ is the time horizon. Whenever tasks are similar, that is $\sigma^2 \le 1$, our method improves upon the $\sqrt{NT}$ bound obtained by running independent OMDs on each task. We further provide a matching lower bound, and show that our multitask extensions of Online Gradient Descent and Exponentiated Gradient, two major instances of OMD, enjoy closed-form updates, making them easy to use in practice. Finally, we present experiments which support our theoretical findings.
- Published
- 2021
10. Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries
- Author
-
Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural characterization of clusters, called $(\beta,\gamma)$-convexity, that can be applied to any finite set of points equipped with a metric (or even a semimetric, as the triangle inequality is not needed). Using $(\beta,\gamma)$-convexity, we can translate natural density properties of clusters (which include, for instance, clusters that are strongly non-convex in $\mathbb{R}^d$) into a graph-theoretic notion of convexity. By exploiting this convexity notion, we design a deterministic algorithm that recovers $(\beta,\gamma)$-convex clusters using $O(k^2 \log n + k^2 (6/\beta\gamma)^{dens(X)})$ same-cluster queries, where $k$ is the number of clusters and $dens(X)$ is the density dimension of the semimetric. We show that an exponential dependence on the density dimension is necessary, and we also show that, if we are allowed to make $O(k^2 + k\log n)$ additional queries to a "cluster separation" oracle, then we can recover clusters that have different and arbitrary scales, even when the scale of each cluster is unknown., Comment: Accepted for presentation at the Conference on Learning Theory (COLT) 2021
- Published
- 2021
11. Robust Unsupervised Learning via L-Statistic Minimization
- Author
-
Maurer, Andreas, Parletta, Daniela A., Paudice, Andrea, and Pontil, Massimiliano
- Subjects
Computer Science - Machine Learning - Abstract
Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by a general descent algorithm which minimizes an $L$-statistic criterion over the model class, weighting small losses more. Our analysis characterizes the robustness of the method in terms of bounds on the reconstruction error relative to the underlying unperturbed distribution. As a byproduct, we prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning, a result which may be of independent interest.Numerical experiments with kmeans clustering and principal subspace analysis demonstrate the effectiveness of our approach., Comment: We have just uploaded a new version of the paper with a more relavant title " Robust Unsupervised Learning via L-statistic Minimization"
- Published
- 2020
12. Exact Recovery of Mangled Clusters with Same-Cluster Queries
- Author
-
Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, and Paudice, Andrea
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical $k$-means cluster assumption of Ashtiani et al.\ to allow for arbitrary ellipsoidal clusters with margin. This removes the assumption that the clustering is center-based (i.e., defined through an optimization problem), and includes all those cases where spherical clusters are individually transformed by any combination of rotations, axis scalings, and point deletions. We show that, even in this much more general setting, it is still possible to recover the latent clustering exactly using a number of queries that scales only logarithmically with the number of input points. More precisely, we design an algorithm that, given $n$ points to be partitioned into $k$ clusters, uses $O(k^3 \ln k \ln n)$ oracle queries and $\tilde{O}(kn + k^3)$ time to recover the clustering with zero misclassification error. The $O(\cdot)$ notation hides an exponential dependence on the dimensionality of the clusters, which we show to be necessary thus characterizing the query complexity of the problem. Our algorithm is simple, easy to implement, and can also learn the clusters using low-stretch separators, a class of ellipsoids with additional theoretical guarantees. Experiments on large synthetic datasets confirm that we can reconstruct clusterings exactly and efficiently., Comment: To appear at NeurIPS 2020 (oral)
- Published
- 2020
13. Correlation Clustering with Adaptive Similarity Queries
- Author
-
Bressan, Marco, Cesa-Bianchi, Nicolò, Paudice, Andrea, and Vitale, Fabio
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error.
- Published
- 2019
14. Label Sanitization against Label Flipping Poisoning Attacks
- Author
-
Paudice, Andrea, Muñoz-González, Luis, and Lupu, Emil C.
- Subjects
Statistics - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Many machine learning systems rely on data collected in the wild from untrusted sources, exposing the learning algorithms to data poisoning. Attackers can inject malicious data in the training dataset to subvert the learning process, compromising the performance of the algorithm producing errors in a targeted or an indiscriminate way. Label flipping attacks are a special case of data poisoning, where the attacker can control the labels assigned to a fraction of the training points. Even if the capabilities of the attacker are constrained, these attacks have been shown to be effective to significantly degrade the performance of the system. In this paper we propose an efficient algorithm to perform optimal label flipping poisoning attacks and a mechanism to detect and relabel suspicious data points, mitigating the effect of such poisoning attacks.
- Published
- 2018
15. Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection
- Author
-
Paudice, Andrea, Muñoz-González, Luis, Gyorgy, Andras, and Lupu, Emil C.
- Subjects
Statistics - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Learning - Abstract
Machine learning has become an important component for many systems and applications including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks. Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constrains are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset., Comment: 10 pages, 3 figures
- Published
- 2018
16. Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization
- Author
-
Muñoz-González, Luis, Biggio, Battista, Demontis, Ambra, Paudice, Andrea, Wongrassamee, Vasin, Lupu, Emil C., and Roli, Fabio
- Subjects
Computer Science - Learning - Abstract
A number of online services nowadays rely upon machine learning to extract valuable information from data collected in the wild. This exposes learning algorithms to the threat of data poisoning, i.e., a coordinate attack in which a fraction of the training data is controlled by the attacker and manipulated to subvert the learning process. To date, these attacks have been devised only against a limited class of binary learning algorithms, due to the inherent complexity of the gradient-based procedure used to optimize the poisoning points (a.k.a. adversarial training examples). In this work, we rst extend the de nition of poisoning attacks to multiclass problems. We then propose a novel poisoning algorithm based on the idea of back-gradient optimization, i.e., to compute the gradient of interest through automatic di erentiation, while also reversing the learning procedure to drastically reduce the attack complexity. Compared to current poisoning strategies, our approach is able to target a wider class of learning algorithms, trained with gradient- based procedures, including neural networks and deep learning architectures. We empirically evaluate its e ectiveness on several application examples, including spam ltering, malware detection, and handwritten digit recognition. We nally show that, similarly to adversarial test examples, adversarial training examples can also be transferred across di erent learning algorithms.
- Published
- 2017
17. Efficient Attack Graph Analysis through Approximate Inference
- Author
-
Muñoz-González, Luis, Sgandurra, Daniele, Paudice, Andrea, and Lupu, Emil C.
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
Attack graphs provide compact representations of the attack paths that an attacker can follow to compromise network resources by analysing network vulnerabilities and topology. These representations are a powerful tool for security risk assessment. Bayesian inference on attack graphs enables the estimation of the risk of compromise to the system's components given their vulnerabilities and interconnections, and accounts for multi-step attacks spreading through the system. Whilst static analysis considers the risk posture at rest, dynamic analysis also accounts for evidence of compromise, e.g. from SIEM software or forensic investigation. However, in this context, exact Bayesian inference techniques do not scale well. In this paper we show how Loopy Belief Propagation - an approximate inference technique - can be applied to attack graphs, and that it scales linearly in the number of nodes for both static and dynamic analysis, making such analyses viable for larger networks. We experiment with different topologies and network clustering on synthetic Bayesian attack graphs with thousands of nodes to show that the algorithm's accuracy is acceptable and converge to a stable solution. We compare sequential and parallel versions of Loopy Belief Propagation with exact inference techniques for both static and dynamic analysis, showing the advantages of approximate inference techniques to scale to larger attack graphs., Comment: 30 pages, 14 figures
- Published
- 2016
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.