Author: "Ordentlich A" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Ordentlich A" returned 1,398 results

1. Optimal Quantization for Matrix Multiplication

Author: Ordentlich, Or and Polyanskiy, Yury
Subjects: Computer Science - Information Theory, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Recent work in the machine learning community has proposed multiple methods for performing lossy compression (quantization) of large matrices. This quantization is important for accelerating matrix multiplication (the main component of large language models), which is often bottlenecked by the speed of loading these matrices from memory. Unlike classical vector quantization and rate-distortion theory, the goal of these new compression algorithms is to be able to approximate not the matrices themselves, but their matrix product. Specifically, given a pair of real matrices $A,B$, an encoder (compressor) is applied to each of them independently, producing descriptions with $R$ bits per entry. These representations are subsequently used by the decoder to estimate the matrix product $A^\top B$. In this work, we provide a non-asymptotic lower bound on the mean squared error of this approximation (as a function of rate $R$) for the case of matrices $A,B$ with iid Gaussian entries. Algorithmically, we construct a universal quantizer based on nested lattices with an explicit guarantee of approximation error for any (non-random) pair of matrices $A$, $B$ in terms of only Frobenius norms $\|A\|_F, \|B\|_F$ and $\|A^\top B\|_F$. For iid Gaussian matrices our quantizer achieves the lower bound and is, thus, asymptotically optimal. A practical low-complexity version of our quantizer achieves performance quite close to optimal. In information-theoretic terms, we derive the rate-distortion function for matrix multiplication of iid Gaussian matrices.
Published: 2024

2. Memory Complexity of Estimating Entropy and Mutual Information

Author: Berg, Tomer, Ordentlich, Or, and Shayevitz, Ofer
Subjects: Computer Science - Information Theory
Abstract: We observe an infinite sequence of independent identically distributed random variables $X_1,X_2,\ldots$ drawn from an unknown distribution $p$ over $[n]$, and our goal is to estimate the entropy $H(p)=-\mathbb{E}[\log p(X)]$ within an $\varepsilon$-additive error. To that end, at each time point we are allowed to update a finite-state machine with $S$ states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity $S^*$ of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least $1-\delta$ asymptotically, uniformly in $p$. Specifically, we show that there exist universal constants $C_1$ and $C_2$ such that $ S^* \leq C_1\cdot\frac{n (\log n)^4}{\varepsilon^2\delta}$ for $\varepsilon$ not too small, and $S^* \geq C_2 \cdot \max \{n, \frac{\log n}{\varepsilon}\}$ for $\varepsilon$ not too large. The upper bound is proved using approximate counting to estimate the logarithm of $p$, and a finite memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.
Published: 2024

3. The strong data processing inequality under the heat flow

Author: Klartag, Bo'az and Ordentlich, Or
Subjects: Computer Science - Information Theory, Mathematics - Functional Analysis
Abstract: Let $\nu$ and $\mu$ be probability distributions on $\mathbb{R}^n$, and $\nu_s,\mu_s$ be their evolution under the heat flow, that is, the probability distributions resulting from convolving their density with the density of an isotropic Gaussian random vector with variance $s$ in each entry. This paper studies the rate of decay of $s\mapsto D(\nu_s\|\mu_s)$ for various divergences, including the $\chi^2$ and Kullback-Leibler (KL) divergences. We prove upper and lower bounds on the strong data-processing inequality (SDPI) coefficients corresponding to the source $\mu$ and the Gaussian channel. We also prove generalizations of de Bruijn's identity, and Costa's result on the concavity in $s$ of the differential entropy of $\nu_s$. As a byproduct of our analysis, we obtain new lower bounds on the mutual information between $X$ and $Y=X+\sqrt{s} Z$, where $Z$ is a standard Gaussian vector in $\mathbb{R}^n$, independent of $X$, and on the minimum mean-square error (MMSE) in estimating $X$ from $Y$, in terms of the Poincar\'e constant of $X$.
Published: 2024

4. Lower Bounds on Mutual Information for Linear Codes Transmitted over Binary Input Channels, and for Information Combining

Author: Erez, Uri, Ordentlich, Or, and Shamai, Shlomo
Subjects: Computer Science - Information Theory
Abstract: It has been known for a long time that the mutual information between the input sequence and output of a binary symmetric channel (BSC) is upper bounded by the mutual information between the same input sequence and the output of a binary erasure channel (BEC) with the same capacity. Recently, Samorodnitsky discovered that one may also lower bound the BSC mutual information in terms of the mutual information between the same input sequence and a more capable BEC. In this paper, we strengthen Samorodnitsky's bound for the special case where the input to the channel is distributed uniformly over a linear code. Furthermore, for a general (not necessarily binary) input distribution $P_X$ and channel $W_{Y|X}$, we derive a new lower bound on the mutual information $I(X;Y^n)$ for $n$ transmissions of $X\sim P_X$ through the channel $W_{Y|X}$.
Published: 2024

5. Statistical Inference with Limited Memory: A Survey

Author: Berg, Tomer, Ordentlich, Or, and Shayevitz, Ofer
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: The problem of statistical inference in its various forms has been the subject of decades-long extensive research. Most of the effort has been focused on characterizing the behavior as a function of the number of available samples, with far less attention given to the effect of memory limitations on performance. Recently, this latter topic has drawn much interest in the engineering and computer science literature. In this survey paper, we attempt to review the state-of-the-art of statistical inference under memory constraints in several canonical problems, including hypothesis testing, parameter estimation, and distribution property testing/estimation. We discuss the main results in this developing field, and by identifying recurrent themes, we extract some fundamental building blocks for algorithmic construction, as well as useful techniques for lower bound derivations.
Comment: Accepted to JSAIT Special Issue
Published: 2023

6. Bounds on the density of smooth lattice coverings

Author: Ordentlich, Or, Regev, Oded, and Weiss, Barak
Subjects: Mathematics - Number Theory, Computer Science - Information Theory
Abstract: Let $K$ be a convex body in $\mathbb{R}^n$, let $L$ be a lattice with covolume one, and let $\eta>0$. We say that $K$ and $L$ form an $\eta$-smooth cover if each point $x \in \mathbb{R}^n$ is covered by $(1 \pm \eta) vol(K)$ translates of $K$ by $L$. We prove that for any positive $\sigma, \eta$, asymptotically as $n \to \infty$, for any $K$ of volume $n^{3+\sigma}$, one can find a lattice $L$ for which $L, K$ form an $\eta$-smooth cover. Moreover, this property is satisfied with high probability for a lattice chosen randomly, according to the Haar-Siegel measure on the space of lattices. Similar results hold for random construction A lattices, albeit with a worse power law, provided the ratio between the covering and packing radii of $\mathbb{Z}^n$ with respect to $K$ is at most polynomial in $n$. Our proofs rely on a recent breakthrough by Dhar and Dvir on the discrete Kakeya problem.
Published: 2023

7. Lower Bounds on Mutual Information for Linear Codes Transmitted over Binary Input Channels, and for Information Combining.

Author: Uri Erez, Or Ordentlich, and Shlomo Shamai
Published: 2024

8. On The Memory Complexity of Uniformity Testing

Author: Berg, Tomer, Ordentlich, Or, and Shayevitz, Ofer
Subjects: Computer Science - Information Theory
Abstract: In this paper we consider the problem of uniformity testing with limited memory. We observe a sequence of independent identically distributed random variables drawn from a distribution $p$ over $[n]$, which is either uniform or is $\varepsilon$-far from uniform under the total variation distance, and our goal is to determine the correct hypothesis. At each time point we are allowed to update the state of a finite-memory machine with $S$ states, where each state of the machine is assigned one of the hypotheses, and we are interested in obtaining an asymptotic probability of error at most $0<\delta<1/2$ uniformly under both hypotheses. The main contribution of this paper is deriving upper and lower bounds on the number of states $S$ needed in order to achieve a constant error probability $\delta$, as a function of $n$ and $\varepsilon$, where our upper bound is $O(\frac{n\log n}{\varepsilon})$ and our lower bound is $\Omega (n+\frac{1}{\varepsilon})$. Prior works in the field have almost exclusively used collision counting for upper bounds, and the Paninski mixture for lower bounds. Somewhat surprisingly, in the limited memory with unlimited samples setup, the optimal solution does not involve counting collisions, and the Paninski prior is not hard. Thus, different proof techniques are needed in order to attain our bounds.
Comment: To be presented in COLT 2022
Published: 2022

9. Deterministic Finite-Memory Bias Estimation

Author: Berg, Tomer, Ordentlich, Or, and Shayevitz, Ofer
Subjects: Computer Science - Information Theory, Mathematics - Statistics Theory
Abstract: In this paper we consider the problem of estimating a Bernoulli parameter using finite memory. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables with expectation $\theta$, where $\theta \in [0,1]$. Consider a finite-memory deterministic machine with $S$ states, that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that the machine outputs an estimate at each time point according to some fixed mapping from the state space to the unit interval. The quality of the estimation procedure is measured by the asymptotic risk, which is the long-term average of the instantaneous quadratic risk. The main contribution of this paper is an upper bound on the smallest worst-case asymptotic risk any such machine can attain. This bound coincides with a lower bound derived by Leighton and Rivest, to imply that $\Theta(1/S)$ is the minimax asymptotic risk for deterministic $S$-state machines. In particular, our result disproves a longstanding $\Theta(\log S/S)$ conjecture for this quantity, also posed by Leighton and Rivest.
Comment: Presented in COLT 2021
Published: 2022

10. On the Role of Channel Capacity in Learning Gaussian Mixture Models

Author: Romanov, Elad, Bendory, Tamir, and Ordentlich, Or
Subjects: Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: This paper studies the sample complexity of learning the $k$ unknown centers of a balanced Gaussian mixture model (GMM) in $\mathbb{R}^d$ with spherical covariance matrix $\sigma^2\mathbf{I}$. In particular, we are interested in the following question: what is the maximal noise level $\sigma^2$, for which the sample complexity is essentially the same as when estimating the centers from labeled measurements? To that end, we restrict attention to a Bayesian formulation of the problem, where the centers are uniformly distributed on the sphere $\sqrt{d}\mathcal{S}^{d-1}$. Our main results characterize the exact noise threshold $\sigma^2$ below which the GMM learning problem, in the large system limit $d,k\to\infty$, is as easy as learning from labeled observations, and above which it is substantially harder. The threshold occurs at $\frac{\log k}{d} = \frac12\log\left( 1+\frac{1}{\sigma^2} \right)$, which is the capacity of the additive white Gaussian noise (AWGN) channel. Thinking of the set of $k$ centers as a code, this noise threshold can be interpreted as the largest noise level for which the error probability of the code over the AWGN channel is small. Previous works on the GMM learning problem have identified the minimum distance between the centers as a key parameter in determining the statistical difficulty of learning the corresponding GMM. While our results are only proved for GMMs whose centers are uniformly distributed over the sphere, they hint that perhaps it is the decoding error probability associated with the center constellation as a channel code that determines the statistical difficulty of learning the corresponding GMM, rather than just the minimum distance.
Comment: COLT 2022
Published: 2022
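
Illustrative note on entry 10 (not part of the catalog record): the threshold condition quoted in the abstract, $\frac{\log k}{d} = \frac12\log\left(1+\frac{1}{\sigma^2}\right)$, can be solved for the noise level, giving $\sigma^2 = 1/(k^{2/d}-1)$; the base of the logarithm cancels as long as both sides use the same one. The sketch below is a minimal numerical check of that algebra. The function names are hypothetical and are not taken from the paper.

```python
import math


def awgn_noise_threshold(k: int, d: int) -> float:
    """Noise level sigma^2 at which log(k)/d equals the AWGN capacity.

    Solves log(k)/d = 0.5 * log(1 + 1/sigma^2) for sigma^2.
    Hypothetical helper for illustration only.
    """
    if k < 2 or d < 1:
        raise ValueError("need k >= 2 centers and dimension d >= 1")
    # 1 + 1/sigma^2 = k**(2/d)  =>  sigma^2 = 1 / (k**(2/d) - 1)
    return 1.0 / math.expm1(2.0 * math.log(k) / d)


def rate_and_capacity(k: int, d: int, sigma2: float) -> tuple[float, float]:
    """Return (log(k)/d, 0.5*log(1 + 1/sigma2)), both in nats."""
    rate = math.log(k) / d
    capacity = 0.5 * math.log1p(1.0 / sigma2)
    return rate, capacity


if __name__ == "__main__":
    k, d = 1024, 20  # assumed example values
    sigma2 = awgn_noise_threshold(k, d)
    print(sigma2, rate_and_capacity(k, d, sigma2))
```

With the assumed values $k=1024$ and $d=20$, $k^{2/d}=2$, so the threshold sits at $\sigma^2=1$ up to floating-point rounding.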

11. Blind Modulo Analog-to-Digital Conversion of Vector Processes

Author: Weiss, Amir, Huang, Everest, Ordentlich, Or, and Wornell, Gregory W.
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: In a growing number of applications, there is a need to digitize a (possibly high) number of correlated signals whose spectral characteristics are challenging for traditional analog-to-digital converters (ADCs). Examples, among others, include multiple-input multiple-output systems where the ADCs must acquire at once several signals at a very wide but sparsely and dynamically occupied bandwidth supporting diverse services. In such scenarios, the resolution requirements can be prohibitively high. As an alternative, the recently proposed modulo-ADC architecture can in principle require dramatically fewer bits in the conversion to obtain the target fidelity, but requires that spatiotemporal information be known and explicitly taken into account by the analog and digital processing in the converter, which is frequently impractical. Building on our recent work, we address this limitation and develop a blind version of the architecture that requires no such knowledge in the converter. In particular, it features an automatic modulo-level adjustment and a fully adaptive modulo-decoding mechanism, allowing it to asymptotically match the characteristics of the unknown input signal. Simulation results demonstrate the successful operation of the proposed algorithm.
Comment: arXiv admin note: substantial text overlap with arXiv:2108.08937
Published: 2021

12. Spiked Covariance Estimation from Modulo-Reduced Measurements

Author: Romanov, Elad and Ordentlich, Or
Subjects: Computer Science - Information Theory, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: Consider the rank-1 spiked model: $\bf{X}=\sqrt{\nu}\xi \bf{u}+ \bf{Z}$, where $\nu$ is the spike intensity, $\bf{u}\in\mathbb{S}^{k-1}$ is an unknown direction and $\xi\sim \mathcal{N}(0,1),\bf{Z}\sim \mathcal{N}(\bf{0},\bf{I})$. Motivated by recent advances in analog-to-digital conversion, we study the problem of recovering $\bf{u}\in \mathbb{S}^{k-1}$ from $n$ i.i.d. modulo-reduced measurements $\bf{Y}=[\bf{X}]\mod \Delta$, focusing on the high-dimensional regime ($k\gg 1$). We develop and analyze an algorithm that, for most directions $\bf{u}$ and $\nu=\mathrm{poly}(k)$, estimates $\bf{u}$ to high accuracy using $n=\mathrm{poly}(k)$ measurements, provided that $\Delta\gtrsim \sqrt{\log k}$. Up to constants, our algorithm accurately estimates $\bf{u}$ at the smallest possible $\Delta$ that allows (in an information-theoretic sense) to recover $\bf{X}$ from $\bf{Y}$. A key step in our analysis involves estimating the probability that a line segment of length $\approx\sqrt{\nu}$ in a random direction $\bf{u}$ passes near a point in the lattice $\Delta \mathbb{Z}^k$. Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.
Comment: AISTATS, 2022
Published: 2021

13. Blind Modulo Analog-to-Digital Conversion

Author: Weiss, Amir, Huang, Everest, Ordentlich, Or, and Wornell, Gregory W.
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: In a growing number of applications, there is a need to digitize signals whose spectral characteristics are challenging for traditional Analog-to-Digital Converters (ADCs). Examples, among others, include systems where the ADC must acquire at once a very wide but sparsely and dynamically occupied bandwidth supporting diverse services, as well as systems where the signal of interest is subject to strong narrowband co-channel interference. In such scenarios, the resolution requirements can be prohibitively high. As an alternative, the recently proposed modulo-ADC architecture can in principle require dramatically fewer bits in the conversion to obtain the target fidelity, but requires that information about the spectrum be known and explicitly taken into account by the analog and digital processing in the converter, which is frequently impractical. To address this limitation, we develop a blind version of the architecture that requires no such knowledge in the converter, without sacrificing performance. In particular, it features an automatic modulo-level adjustment and a fully adaptive modulo unwrapping mechanism, allowing it to asymptotically match the characteristics of the unknown input signal. In addition to detailed analysis, simulations demonstrate the attractive performance characteristics in representative settings.
Published: 2021

14. On Compressed Sensing of Binary Signals for the Unsourced Random Access Channel

Author: Romanov, Elad and Ordentlich, Or
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: Motivated by applications in unsourced random access, this paper develops a novel scheme for the problem of compressed sensing of binary signals. In this problem, the goal is to design a sensing matrix $A$ and a recovery algorithm, such that the sparse binary vector $\mathbf{x}$ can be recovered reliably from the measurements $\mathbf{y}=A\mathbf{x}+\sigma\mathbf{z}$, where $\mathbf{z}$ is additive white Gaussian noise. We propose to design $A$ as a parity check matrix of a low-density parity-check code (LDPC), and to recover $\mathbf{x}$ from the measurements $\mathbf{y}$ using a Markov chain Monte Carlo algorithm, which runs relatively fast due to the sparse structure of $A$. The performance of our scheme is comparable to state-of-the-art schemes, which use dense sensing matrices, while enjoying the advantages of using a sparse sensing matrix.
Comment: Accepted to Entropy Special Issue on "Information-Theoretic Aspects of Non-Orthogonal and Massive Access for Future Wireless Networks"
Published: 2021

15. Critical Slowing Down Near Topological Transitions in Rate-Distortion Problems

Author: Agmon, Shlomi, Benger, Etam, Ordentlich, Or, and Tishby, Naftali
Subjects: Computer Science - Information Theory
Abstract: In rate-distortion (RD) problems one seeks reduced representations of a source that meet a target distortion constraint. Such optimal representations undergo topological transitions at some critical rate values, when their cardinality or dimensionality change. We study the convergence time of the Arimoto-Blahut alternating projection algorithms, used to solve such problems, near those critical points, both for the rate-distortion and information bottleneck settings. We argue that they suffer from critical slowing down -- a diverging number of iterations for convergence -- near the critical points. This phenomenon can have theoretical and practical implications for both machine learning and data compression problems.
Comment: 10 pages, 2 figures, ISIT 2021 submission
Published: 2021

16. Hyperfast Second-Order Local Solvers for Efficient Statistically Preconditioned Distributed Optimization

Author: Dvurechensky, Pavel, Kamzolov, Dmitry, Lukashevich, Aleksandr, Lee, Soomin, Ordentlich, Erik, Uribe, César A., and Gasnikov, Alexander
Subjects: Mathematics - Optimization and Control
Abstract: Statistical preconditioning enables fast methods for distributed large-scale empirical risk minimization problems. In this approach, multiple worker nodes compute gradients in parallel, which are then used by the central node to update the parameter by solving an auxiliary (preconditioned) smaller-scale optimization problem. The recently proposed Statistically Preconditioned Accelerated Gradient (SPAG) method has complexity bounds superior to other such algorithms but requires an exact solution for computationally intensive auxiliary optimization problems at every iteration. In this paper, we propose an Inexact SPAG (InSPAG) and explicitly characterize the accuracy by which the corresponding auxiliary subproblem needs to be solved to guarantee the same convergence rate as the exact method. We build our results by first developing an inexact adaptive accelerated Bregman proximal gradient method for general optimization problems under relative smoothness and strong convexity assumptions, which may be of independent interest. Moreover, we explore the properties of the auxiliary problem in the InSPAG algorithm assuming Lipschitz third-order derivatives and strong convexity. For such a problem class, we develop a linearly convergent Hyperfast second-order method and estimate the total complexity of the InSPAG method with hyperfast auxiliary problem solver. Finally, we illustrate the proposed method's practical efficiency by performing large-scale numerical experiments on logistic regression models. To the best of our knowledge, these are the first empirical results on implementing high-order methods on large-scale problems: we work with data where the dimension is of the order of $3$ million and the number of samples is $700$ million.
Published: 2021

17. Constructing Multiclass Classifiers using Binary Classifiers Under Log-Loss

Author: Ben-Yishai, Assaf and Ordentlich, Or
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: The construction of multiclass classifiers from binary elements is studied in this paper, and performance is quantified by the regret, defined with respect to the Bayes optimal log-loss. We discuss two known methods. The first is one vs. all (OVA), for which we prove that the multiclass regret is upper bounded by the sum of binary regrets of the constituent classifiers. The second is hierarchical classification, based on a binary tree. For this method we prove that the multiclass regret is exactly a weighted sum of constituent binary regrets where the weighting is determined by the tree structure. We also introduce a leverage-hierarchical classification method, which potentially yields smaller log-loss and regret. The advantages of these classification methods are demonstrated by simulation on both synthetic and real-life datasets.
Comment: A shorter version of this contribution was presented in ISIT 2021
Published: 2021

18. Minimax Risk Upper Bounds Based on Shell Analysis of a Quantized Maximum Likelihood Estimator.

Author: Noam Gavish and Or Ordentlich
Published: 2023

19. The menin inhibitor revumenib in KMT2A-rearranged or NPM1-mutant leukaemia

Author: Issa, Ghayas C., Aldoss, Ibrahim, DiPersio, John, Cuglievan, Branko, Stone, Richard, Arellano, Martha, Thirman, Michael J., Patel, Manish R., Dickens, David S., Shenoy, Shalini, Shukla, Neerav, Kantarjian, Hagop, Armstrong, Scott A., Perner, Florian, Perry, Jennifer A., Rosen, Galit, Bagley, Rebecca G., Meyers, Michael L., Ordentlich, Peter, Gu, Yu, Kumar, Vinit, Smith, Steven, McGeehan, Gerard M., and Stein, Eytan M.
Published: 2023

20. Strong data processing constant is achieved by binary inputs

Author: Ordentlich, Or and Polyanskiy, Yury
Subjects: Computer Science - Information Theory
Abstract: For any channel $P_{Y|X}$ the strong data processing constant is defined as the smallest number $\eta_{KL}\in[0,1]$ such that $I(U;Y)\le \eta_{KL} I(U;X)$ holds for any Markov chain $U-X-Y$. It is shown that the value of $\eta_{KL}$ is given by that of the best binary-input subchannel of $P_{Y|X}$. The same result holds for any $f$-divergence, verifying a conjecture of Cohen, Kemperman and Zbaganu (1998).
Comment: 1 page
Published: 2020

21. Multi-reference alignment in high dimensions: sample complexity and phase transition

Author: Romanov, Elad, Bendory, Tamir, and Ordentlich, Or
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: Multi-reference alignment entails estimating a signal in $\mathbb{R}^L$ from its circularly-shifted and noisy copies. This problem has been studied thoroughly in recent years, focusing on the finite-dimensional setting (fixed $L$). Motivated by single-particle cryo-electron microscopy, we analyze the sample complexity of the problem in the high-dimensional regime $L\to\infty$. Our analysis uncovers a phase transition phenomenon governed by the parameter $\alpha = L/(\sigma^2\log L)$, where $\sigma^2$ is the variance of the noise. When $\alpha>2$, the impact of the unknown circular shifts on the sample complexity is minor. Namely, the number of measurements required to achieve a desired accuracy $\varepsilon$ approaches $\sigma^2/\varepsilon$ for small $\varepsilon$; this is the sample complexity of estimating a signal in additive white Gaussian noise, which does not involve shifts. In sharp contrast, when $\alpha\leq 2$, the problem is significantly harder and the sample complexity grows substantially quicker with $\sigma^2$.
Published: 2020
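
Illustrative note on entry 21 (not part of the catalog record): the abstract describes a phase transition governed by $\alpha = L/(\sigma^2\log L)$, with the shifts having minor impact when $\alpha>2$. The sketch below simply evaluates that parameter and reports which side of the threshold a given pair $(L,\sigma^2)$ falls on. It assumes the natural logarithm, which the abstract does not state explicitly, and the function names and regime labels are hypothetical.

```python
import math


def mra_alpha(L: int, sigma2: float) -> float:
    """alpha = L / (sigma^2 * log L), natural log assumed (not stated in the abstract)."""
    if L < 2 or sigma2 <= 0:
        raise ValueError("need L >= 2 and sigma2 > 0")
    return L / (sigma2 * math.log(L))


def mra_regime(L: int, sigma2: float) -> str:
    """Label the regime using the alpha > 2 threshold quoted in the abstract."""
    # "low-noise"/"high-noise" are illustrative labels, not the paper's terminology.
    return "low-noise" if mra_alpha(L, sigma2) > 2 else "high-noise"


if __name__ == "__main__":
    for sigma2 in (10.0, 100.0):  # assumed example noise variances
        print(sigma2, mra_alpha(1024, sigma2), mra_regime(1024, sigma2))
```

The abstract's limits are asymptotic ($L\to\infty$), so for finite $L$ this classification is only a rough guide.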

22. Denoising as well as the best of any two denoisers

Author: Ordentlich, Erik
Subjects: Computer Science - Information Theory, Mathematics - Statistics Theory
Abstract: Given two arbitrary sequences of denoisers for block lengths tending to infinity we ask if it is possible to construct a third sequence of denoisers with an asymptotically vanishing (in block length) excess expected loss relative to the best expected loss of the two given denoisers for all clean channel input sequences. As in the setting of DUDE [1], which solves this problem when the given denoisers are sliding block denoisers, the construction is allowed to depend on the two given denoisers and the channel transition probabilities. We show that under certain restrictions on the two given denoisers the problem can be solved using a straightforward application of a known loss estimation paradigm. We then show by way of a counter-example that the loss estimation approach fails in the general case. Finally, we show that for the binary symmetric channel, combining the loss estimation with a randomization step leads to a solution to the stated problem under no restrictions on the given denoisers.
Comment: 19 pages. Appeared, in part, in Proceedings of 2013 IEEE Intl. Symp. on Info. Theory. This version has full proofs (e.g., of Proposition 2)
Published: 2020

23. Learning to Bid Optimally and Efficiently in Adversarial First-price Auctions

Author: Han, Yanjun, Zhou, Zhengyuan, Flores, Aaron, Ordentlich, Erik, and Weissman, Tsachy
Subjects: Computer Science - Machine Learning, Computer Science - Computer Science and Game Theory, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: First-price auctions have very recently swept the online advertising industry, replacing second-price auctions as the predominant auction mechanism on many platforms. This shift has brought forth important challenges for a bidder: how should one bid in a first-price auction, where unlike in second-price auctions, it is no longer optimal to bid one's private value truthfully and hard to know the others' bidding behaviors? In this paper, we take an online learning angle and address the fundamental problem of learning to bid in repeated first-price auctions, where both the bidder's private valuations and other bidders' bids can be arbitrary. We develop the first minimax optimal online bidding algorithm that achieves an $\widetilde{O}(\sqrt{T})$ regret when competing with the set of all Lipschitz bidding policies, a strong oracle that contains a rich set of bidding strategies. This novel algorithm is built on the insight that the presence of a good expert can be leveraged to improve performance, as well as an original hierarchical expert-chaining structure, both of which could be of independent interest in online learning. Further, by exploiting the product structure that exists in the problem, we modify this algorithm--in its vanilla form statistically optimal but computationally infeasible--to a computationally efficient and space efficient algorithm that also retains the same $\widetilde{O}(\sqrt{T})$ minimax optimal regret guarantee. Additionally, through an impossibility result, we highlight that one is unlikely to compete this favorably with a stronger oracle (than the considered Lipschitz bidding policies). Finally, we test our algorithm on three real-world first-price auction datasets obtained from Verizon Media and demonstrate our algorithm's superior performance compared to several existing bidding algorithms.
Published: 2020

24. New bounds on the density of lattice coverings

Author: Ordentlich, Or, Regev, Oded, and Weiss, Barak
Subjects: Mathematics - Number Theory, Computer Science - Information Theory, Mathematics - Metric Geometry, 11H31, 94B75, 11T30
Abstract: We obtain new upper bounds on the minimal density of lattice coverings of Euclidean space by dilates of a convex body K. We also obtain bounds on the probability (with respect to the natural Haar-Siegel measure on the space of lattices) that a randomly chosen lattice L satisfies that L+K is all of space. As a step in the proof, we utilize and strengthen results on the discrete Kakeya problem.
Published: 2020

25. Binary Hypothesis Testing with Deterministic Finite-Memory Decision Rules

Author: Berg, Tomer, Shayevitz, Ofer, and Ordentlich, Or
Subjects: Computer Science - Information Theory
Abstract: In this paper we consider the problem of binary hypothesis testing with finite memory systems. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed Bernoulli random variables, with expectation $p$ under $\mathcal{H}_0$ and $q$ under $\mathcal{H}_1$. Consider a finite-memory deterministic machine with $S$ states that updates its state $M_n \in \{1,2,\ldots,S\}$ at each time according to the rule $M_n = f(M_{n-1},X_n)$, where $f$ is a deterministic time-invariant function. Assume that we let the process run for a very long time ($n\rightarrow \infty$), and then make our decision according to some mapping from the state space to the hypothesis space. The main contribution of this paper is a lower bound on the Bayes error probability $P_e$ of any such machine. In particular, our findings show that the ratio between the maximal exponential decay rate of $P_e$ with $S$ for a deterministic machine and for a randomized one, can become unbounded, complementing a result by Hellman.
Comment: To be presented at ISIT 2020
Published: 2020
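
The state-update rule $M_n = f(M_{n-1},X_n)$ in the abstract above can be illustrated with a minimal sketch. The saturating counter used for $f$, the threshold decision map, and all function names are assumptions chosen for illustration; they are not the construction analyzed in the paper.

```python
def saturating_counter(state, x, num_states):
    """Hypothetical update rule f(M_{n-1}, X_n): step up on a 1, down on a 0,
    clipped so the state always stays in {1, ..., num_states}."""
    step = 1 if x == 1 else -1
    return min(num_states, max(1, state + step))


def run_machine(bits, num_states, init_state=1):
    """Feed a sequence of bits through the time-invariant machine and
    return the final state."""
    state = init_state
    for x in bits:
        state = saturating_counter(state, x, num_states)
    return state


def decide(state, num_states):
    """Hypothetical decision map from state space to hypothesis space:
    declare H1 (return 1) if the state is in the upper half, else H0 (return 0).
    This assumes q > p, so that ones are more likely under H1."""
    return 1 if state > num_states // 2 else 0


if __name__ == "__main__":
    final_state = run_machine([1, 1, 0, 1, 1, 1], num_states=4)
    print("final state:", final_state, "decision:", decide(final_state, 4))
```

The machine keeps only its current state, so its memory is $\log_2 S$ bits regardless of how many samples it has seen.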

26. An Information-Theoretic Proof of the Streaming Switching Lemma for Symmetric Encryption

Author: Shahaf, Ido, Ordentlich, Or, and Segev, Gil
Subjects: Computer Science - Information Theory, Computer Science - Cryptography and Security
Abstract: Motivated by a fundamental paradigm in cryptography, we consider a recent variant of the classic problem of bounding the distinguishing advantage between a random function and a random permutation. Specifically, we consider the problem of deciding whether a sequence of $q$ values was sampled uniformly with or without replacement from $[N]$, where the decision is made by a streaming algorithm restricted to using at most $s$ bits of internal memory. In this work, the distinguishing advantage of such an algorithm is measured by the KL divergence between the distributions of its output as induced under the two cases. We show that for any $s=\Omega(\log N)$ the distinguishing advantage is upper bounded by $O(q \cdot s / N)$, and even by $O(q \cdot s / N \log N)$ when $q \leq N^{1 - \epsilon}$ for any constant $\epsilon > 0$ where it is nearly tight with respect to the KL divergence.
Published: 2020

27. An Upgrading Algorithm with Optimal Power Law

Author: Ordentlich, Or and Tal, Ido
Subjects: Computer Science - Information Theory
Abstract: Consider a channel $W$ along with a given input distribution $P_X$. In certain settings, such as in the construction of polar codes, the output alphabet of $W$ is `too large', and hence we replace $W$ by a channel $Q$ having a smaller output alphabet. We say that $Q$ is upgraded with respect to $W$ if $W$ is obtained from $Q$ by processing its output. In this case, the mutual information $I(P_X,W)$ between the input and output of $W$ is upper-bounded by the mutual information $I(P_X,Q)$ between the input and output of $Q$. In this paper, we present an algorithm that produces an upgraded channel $Q$ from $W$, as a function of $P_X$ and the required output alphabet size of $Q$, denoted $L$. We show that the difference in mutual informations is `small'. Namely, it is $O(L^{-2/(|\mathcal{X}|-1)})$, where $|\mathcal{X}|$ is the size of the input alphabet. This power law of $L$ is optimal. We complement our analysis with numerical experiments which show that the developed algorithm improves upon the existing state-of-the-art algorithms also in non-asymptotic setups.
Published: 2020

28. A Note on the Probability of Rectangles for Correlated Binary Strings

Author: Ordentlich, Or, Polyanskiy, Yury, and Shayevitz, Ofer
Subjects: Computer Science - Information Theory, Mathematics - Combinatorics
Abstract: Consider two sequences of $n$ independent and identically distributed fair coin tosses, $X=(X_1,\ldots,X_n)$ and $Y=(Y_1,\ldots,Y_n)$, which are $\rho$-correlated for each $j$, i.e. $\mathbb{P}[X_j=Y_j] = {1+\rho\over 2}$. We study the question of how large (small) the probability $\mathbb{P}[X \in A, Y\in B]$ can be among all sets $A,B\subset\{0,1\}^n$ of a given cardinality. For sets $|A|,|B| = \Theta(2^n)$ it is well known that the largest (smallest) probability is approximately attained by concentric (anti-concentric) Hamming balls, and this can be proved via the hypercontractive inequality (reverse hypercontractivity). Here we consider the case of $|A|,|B| = 2^{\Theta(n)}$. By applying a recent extension of the hypercontractive inequality of Polyanskiy-Samorodnitsky (J. Functional Analysis, 2019), we show that Hamming balls of the same size approximately maximize $\mathbb{P}[X \in A, Y\in B]$ in the regime of $\rho \to 1$. We also prove a similar tight lower bound, i.e. show that for $\rho\to 0$ the pair of opposite Hamming balls approximately minimizes the probability $\mathbb{P}[X \in A, Y\in B]$.
Published: 2019

29. A Lower Bound on the Essential Interactive Capacity of Binary Memoryless Symmetric Channels

Author: Ben-Yishai, Assaf, Kim, Young-Han, Ordentlich, Or, and Shayevitz, Ofer
Subjects: Computer Science - Information Theory
Abstract: The essential interactive capacity of a discrete memoryless channel is defined in this paper as the maximal rate at which the transcript of any interactive protocol can be reliably simulated over the channel, using a deterministic coding scheme. In contrast to other interactive capacity definitions in the literature, this definition makes no assumptions on the order of speakers (which can be adaptive) and does not allow any use of private / public randomness; hence, the essential interactive capacity is a function of the channel model only. It is shown that the essential interactive capacity of any binary memoryless symmetric (BMS) channel is at least $0.0302$ its Shannon capacity. To that end, we present a simple coding scheme, based on extended-Hamming codes combined with error detection, that achieves the lower bound in the special case of the binary symmetric channel (BSC). We then adapt the scheme to the entire family of BMS channels, and show that it achieves the same lower bound using extremes of the Bhattacharyya parameter.
Published: 2019

30. Above the Nyquist Rate, Modulo Folding Does Not Hurt

Author: Romanov, Elad and Ordentlich, Or
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: We consider the problem of recovering a continuous-time bandlimited signal from the discrete-time signal obtained from sampling it every $T_s$ seconds and reducing the result modulo $\Delta$, for some $\Delta>0$. For $\Delta=\infty$ the celebrated Shannon-Nyquist sampling theorem guarantees that perfect recovery is possible provided that the sampling rate $1/T_s$ exceeds the so-called Nyquist rate. Recent work by Bhandari et al. has shown that for any $\Delta>0$ perfect reconstruction is still possible if the sampling rate exceeds the Nyquist rate by a factor of $\pi e$. In this letter we improve upon this result and show that for finite energy signals, perfect recovery is possible for any $\Delta>0$ and any sampling rate above the Nyquist rate. Thus, modulo folding does not degrade the signal, provided that the sampling rate exceeds the Nyquist rate. This claim is proved by establishing a connection between the recovery problem of a discrete-time signal from its modulo reduced version and the problem of predicting the next sample of a discrete-time signal from its past, and leveraging the fact that for a bandlimited signal the prediction error can be made arbitrarily small.
Published: 2019
Full Text: View/download PDF
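
The sampling-then-folding measurement model described in the abstract above can be sketched as follows. The centered fold into $[-\Delta/2, \Delta/2)$, the helper names, and the first-difference unwrapping are assumptions made for illustration. In particular, the naive unwrapping only works when consecutive samples differ by less than $\Delta/2$ and the first sample is not folded; it is a simpler stand-in, not the prediction-based argument of the paper.

```python
import math


def fold(x, delta):
    """Reduce x modulo delta into the centered interval [-delta/2, delta/2)."""
    return (x + delta / 2) % delta - delta / 2


def sample_and_fold(signal, t_s, num_samples, delta):
    """Sample signal(t) every t_s seconds and fold each sample modulo delta."""
    return [fold(signal(k * t_s), delta) for k in range(num_samples)]


def naive_unwrap(folded, delta):
    """Recover the unfolded samples from first differences.

    Assumption: the first sample lies in [-delta/2, delta/2) and consecutive
    true samples differ by less than delta/2 in absolute value.
    """
    out = [folded[0]]
    for prev_f, cur_f in zip(folded, folded[1:]):
        # The folded difference equals the true difference up to a multiple
        # of delta, so folding it again recovers the true difference.
        out.append(out[-1] + fold(cur_f - prev_f, delta))
    return out


if __name__ == "__main__":
    def signal(t):
        return 3.0 * math.sin(2 * math.pi * 0.05 * t)

    folded = sample_and_fold(signal, t_s=1.0, num_samples=40, delta=2.0)
    recovered = naive_unwrap(folded, delta=2.0)
    worst = max(abs(r - signal(k * 1.0)) for k, r in enumerate(recovered))
    print("max reconstruction error:", worst)
```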

31. A Lower Bound on the Expected Distortion of Joint Source-Channel Coding

Author: Kochman, Yuval, Ordentlich, Or, and Polyanskiy, Yury
Subjects: Computer Science - Information Theory
Abstract: We consider the classic joint source-channel coding problem of transmitting a memoryless source over a memoryless channel. The focus of this work is on the long-standing open problem of finding the rate of convergence of the smallest attainable expected distortion to its asymptotic value, as a function of blocklength $n$. Our main result is that in general the convergence rate is not faster than $n^{-1/2}$. In particular, we show that for the problem of transmitting i.i.d. uniform bits over a binary symmetric channel with Hamming distortion, the smallest attainable distortion (bit error rate) is at least $\Omega(n^{-1/2})$ above the asymptotic value, if the ``bandwidth expansion ratio'' is above $1$.
Published: 2019

32. Blind Unwrapping of Modulo Reduced Gaussian Vectors: Recovering MSBs from LSBs

Author: Romanov, Elad and Ordentlich, Or
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: We consider the problem of recovering $n$ i.i.d. samples from a zero mean multivariate Gaussian distribution with an unknown covariance matrix, from their modulo wrapped measurements, i.e., measurements in which each coordinate is reduced modulo $\Delta$, for some $\Delta>0$. For this setup, which is motivated by quantization and analog-to-digital conversion, we develop a low-complexity iterative decoding algorithm. We show that if a benchmark informed decoder that knows the covariance matrix can recover each sample with small error probability, and $n$ is large enough, the performance of the proposed blind recovery algorithm closely follows that of the informed one. We complement the analysis with numeric results that show that the algorithm performs well even in non-asymptotic conditions.
Published: 2019
Full Text: View/download PDF

33. Information-Distilling Quantizers

Author: Bhatt, Alankrita, Nazer, Bobak, Ordentlich, Or, and Polyanskiy, Yury
Subjects: Computer Science - Information Theory
Abstract: Let $X$ and $Y$ be dependent random variables. This paper considers the problem of designing a scalar quantizer for $Y$ to maximize the mutual information between the quantizer's output and $X$, and develops fundamental properties and bounds for this form of quantization, which is connected to the log-loss distortion criterion. The main focus is the regime of low $I(X;Y)$, where it is shown that, if $X$ is binary, a constant fraction of the mutual information can always be preserved using $\mathcal{O}(\log(1/I(X;Y)))$ quantization levels, and there exist distributions for which this many quantization levels are necessary. Furthermore, for larger finite alphabets $2 < |\mathcal{X}| < \infty$, it is established that an $\eta$-fraction of the mutual information can be preserved using roughly $(\log(| \mathcal{X} | /I(X;Y)))^{\eta\cdot(|\mathcal{X}| - 1)}$ quantization levels.
Published: 2018

34. On The Memory Complexity of Uniformity Testing.

Author: Tomer Berg, Or Ordentlich, and Ofer Shayevitz
Published: 2022

35. On the Role of Channel Capacity in Learning Gaussian Mixture Models.

Author: Elad Romanov, Tamir Bendory, and Or Ordentlich
Published: 2022

36. Spiked Covariance Estimation from Modulo-Reduced Measurements.

Author: Elad Romanov and Or Ordentlich
Published: 2022

37. Blind Modulo Analog-to-Digital Conversion of Vector Processes.

Author: Amir Weiss, Everest W. Huang, Or Ordentlich, and Gregory W. Wornell
Published: 2022
Full Text: View/download PDF

38. A Modulo-Based Architecture for Analog-to-Digital Conversion

Author: Ordentlich, Or, Tabak, Gizem, Hanumolu, Pavan Kumar, Singer, Andrew C., and Wornell, Gregory W.
Subjects: Computer Science - Information Theory
Abstract: Systems that capture and process analog signals must first acquire them through an analog-to-digital converter. While subsequent digital processing can remove statistical correlations present in the acquired data, the dynamic range of the converter is typically scaled to match that of the input analog signal. The present paper develops an approach for analog-to-digital conversion that aims at minimizing the number of bits per sample at the output of the converter. This is attained by reducing the dynamic range of the analog signal by performing a modulo operation on its amplitude, and then quantizing the result. While the converter itself is universal and agnostic of the statistics of the signal, the decoder operation on the output of the quantizer can exploit the statistical structure in order to unwrap the modulo folding. The performance of this method is shown to approach information theoretical limits, as captured by the rate-distortion function, in various settings. An architecture for modulo analog-to-digital conversion via ring oscillators is suggested, and its merits are numerically demonstrated.
Published: 2018
Full Text: View/download PDF

39. Almost Optimal Scaling of Reed-Muller Codes on BEC and BSC Channels

Author: Hassani, Hamed, Kudekar, Shrinivas, Ordentlich, Or, Polyanskiy, Yury, and Urbanke, Rüdiger
Subjects: Computer Science - Information Theory
Abstract: Consider a binary linear code of length $N$, minimum distance $d_{\text{min}}$, transmission over the binary erasure channel with parameter $0 < \epsilon < 1$ or the binary symmetric channel with parameter $0 < \epsilon < \frac12$, and block-MAP decoding. It was shown by Tillich and Zemor that in this case the error probability of the block-MAP decoder transitions "quickly" from $\delta$ to $1-\delta$ for any $\delta>0$ if the minimum distance is large. In particular the width of the transition is of order $O(1/\sqrt{d_{\text{min}}})$. We strengthen this result by showing that under suitable conditions on the weight distribution of the code, the transition width can be as small as $\Theta(1/N^{\frac12-\kappa})$, for any $\kappa>0$, even if the minimum distance of the code is not linear. This condition applies, e.g., to Reed-Muller codes. Since $\Theta(1/N^{\frac12})$ is the smallest transition possible for any code, we speak of "almost" optimal scaling. We emphasize that the width of the transition says nothing about the location of the transition. Therefore this result has no bearing on whether a code is capacity-achieving or not. As a second contribution, we present a new estimate on the derivative of the EXIT function, the proof of which is based on the Blowing-Up Lemma.
Comment: Submitted to ISIT 2018
Published: 2018

40. How to Quantize $n$ Outputs of a Binary Symmetric Channel to $n-1$ Bits?

Author: Huleihel, Wasim and Ordentlich, Or
Subjects: Computer Science - Information Theory
Abstract: Suppose that $Y^n$ is obtained by observing a uniform Bernoulli random vector $X^n$ through a binary symmetric channel with crossover probability $\alpha$. The "most informative Boolean function" conjecture postulates that the maximal mutual information between $Y^n$ and any Boolean function $\mathrm{b}(X^n)$ is attained by a dictator function. In this paper, we consider the "complementary" case in which the Boolean function is replaced by $f:\left\{0,1\right\}^n\to\left\{0,1\right\}^{n-1}$, namely, an $n-1$ bit quantizer, and show that $I(f(X^n);Y^n)\leq (n-1)\cdot\left(1-h(\alpha)\right)$ for any such $f$. Thus, in this case, the optimal function is of the form $f(x^n)=(x_1,\ldots,x_{n-1})$.
Comment: 5 pages, accepted ISIT 2017
Published: 2017
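
The bound $I(f(X^n);Y^n)\leq (n-1)\cdot(1-h(\alpha))$ stated in the abstract above can be checked numerically for small $n$. The following brute-force sketch is only an illustration: the function names and the two example quantizers are assumptions, and the enumeration is exponential in $n$, so it is only practical for very short blocks.

```python
import itertools
import math


def binary_entropy(a):
    """Binary entropy h(a) in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)


def mutual_information_quantizer(f, n, alpha):
    """Compute I(f(X^n); Y^n) in bits by exhaustive enumeration, where X^n is
    uniform on {0,1}^n and Y^n is X^n passed through a BSC(alpha)."""
    joint = {}
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            flips = sum(a != b for a, b in zip(x, y))
            p = (0.5 ** n) * (alpha ** flips) * ((1 - alpha) ** (n - flips))
            key = (f(x), y)
            joint[key] = joint.get(key, 0.0) + p
    p_u, p_y = {}, {}
    for (u, y), p in joint.items():
        p_u[u] = p_u.get(u, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (p_u[u] * p_y[y]))
        for (u, y), p in joint.items()
        if p > 0
    )


def drop_last(x):
    """The quantizer f(x^n) = (x_1, ..., x_{n-1}) named in the abstract."""
    return x[:-1]


def xor_first_two(x):
    """A hypothetical alternative (n-1)-bit quantizer for n = 3, for comparison."""
    return (x[0] ^ x[1], x[2])


if __name__ == "__main__":
    n, alpha = 3, 0.1
    bound = (n - 1) * (1 - binary_entropy(alpha))
    print("bound:        ", bound)
    print("drop_last:    ", mutual_information_quantizer(drop_last, n, alpha))
    print("xor_first_two:", mutual_information_quantizer(xor_first_two, n, alpha))
```

Dropping the last coordinate should meet the bound up to floating-point error, while the XOR-based quantizer should fall below it.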

41. Deterministic Finite-Memory Bias Estimation.

Author: Tomer Berg, Or Ordentlich, and Ofer Shayevitz
Published: 2021

42. The Double-Sided Information-Bottleneck Function.

Author: Michael Dikshtein, Or Ordentlich, and Shlomo Shamai Shitz
Published: 2021
Full Text: View/download PDF

43. Constructing Multiclass Classifiers using Binary Classifiers Under Log-Loss.

Author: Assaf Ben-Yishai and Or Ordentlich
Published: 2021
Full Text: View/download PDF

44. Binary Maximal Correlation Bounds and Isoperimetric Inequalities via Anti-Concentration.

Author: Dror Drach, Or Ordentlich, and Ofer Shayevitz
Published: 2021
Full Text: View/download PDF

45. Critical Slowing Down Near Topological Transitions in Rate-Distortion Problems.

Author: Shlomi Agmon, Etam Benger, Or Ordentlich, and Naftali Tishby
Published: 2021
Full Text: View/download PDF

46. Hyperfast second-order local solvers for efficient statistically preconditioned distributed optimization

Author: Dvurechensky, Pavel, Kamzolov, Dmitry, Lukashevich, Aleksandr, Lee, Soomin, Ordentlich, Erik, Uribe, César A., and Gasnikov, Alexander
Published: 2022
Full Text: View/download PDF

47. Inhibition of menin, BCL-2, and FLT3 combined with a hypomethylating agent cures NPM1/FLT3-ITD/-TKD mutant acute myeloid leukemia in a patient-derived xenograft model

Author: Bing Z. Carter, Po Yee Mak, Wenjing Tao, Lauren B. Ostermann, Duncan H. Mak, Baozhen Ke, Peter Ordentlich, Gerard M. McGeehan, and Michael Andreeff
Subjects: Diseases of the blood and blood-forming organs, RC633-647.5
Published: 2023
Full Text: View/download PDF

48. An Information-Theoretic Proof of the Streaming Switching Lemma for Symmetric Encryption.

Author: Ido Shahaf, Or Ordentlich, and Gil Segev 0001
Published: 2020
Full Text: View/download PDF

49. Binary Hypothesis Testing with Deterministic Finite-Memory Decision Rules.

Author: Tomer Berg, Or Ordentlich, and Ofer Shayevitz
Published: 2020
Full Text: View/download PDF

50. Characterizing the Performance of Wireless Communication Architectures via Basic Diophantine Approximation Bounds

Author: Nazer, Bobak and Ordentlich, Or
Series Editors: Schröder, Jörg and Weigand, Bernhard
Editors: Beresnevich, Victor, Burr, Alister, Nazer, Bobak, and Velani, Sanju
Published: 2020
Full Text: View/download PDF
