Author: "Vesely P" / Database: arXiv - Searchworks@Jio Institute Digital Library Search Results

1. Discrete Dynamical Systems with Random Impulses

Author: Kováč, J., Veselý, J., and Janková, K.
Subjects: Mathematics - Dynamical Systems
Abstract: We study the behaviour of discrete dynamical systems generated by a continuous map $f$ of a compact real interval into itself where, at randomly chosen times, a function different from $f$ (a so-called impulse function) is applied. We show that both the splitting property and the average contraction property guarantee the stability of the system. We give a number of examples where the verification of these properties is simple.
Published: 2024
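
As a rough illustration of the kind of system this abstract describes, the sketch below iterates a continuous map on [0, 1] and, at randomly chosen steps, applies a different impulse function instead. The logistic map used for f, the contraction used for the impulse g, and the fixed impulse probability p are all assumptions chosen for illustration; none of them is taken from the paper.

```python
import random

def f(x):
    # Assumed base map: the logistic map, which sends [0, 1] into itself.
    # The min() guards against floating-point overshoot above 1.0.
    return min(1.0, 4.0 * x * (1.0 - x))

def g(x):
    # Assumed impulse function: a contraction toward 0.5 (hypothetical choice).
    return 0.5 + 0.25 * (x - 0.5)

def simulate(x0, steps, p, seed=0):
    """Iterate f, replacing it by the impulse g with probability p at each step."""
    rng = random.Random(seed)
    x = x0
    orbit = [x]
    for _ in range(steps):
        x = g(x) if rng.random() < p else f(x)
        orbit.append(x)
    return orbit

# Two orbits driven by the same random impulse times but different starting points.
orbit_a = simulate(0.2, 200, p=0.3, seed=1)
orbit_b = simulate(0.7, 200, p=0.3, seed=1)
print(abs(orbit_a[-1] - orbit_b[-1]))
```

Driving both orbits with the same seed makes it possible to watch whether trajectories from different starting points approach each other, which is the informal picture behind stability; the sketch itself proves nothing about the splitting or average contraction properties.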

2. Phase Composition of AlTiNbMoV, AlTiNbTaZr and AlTiNbMoCr Refractory Complex Concentrated Alloys: A Correlation of Predictions and Experiment

Author: Kozlík, Jiří, Lukáč, František, Luna, Mariano Casas, Šalata, Kristián, Stráský, Josef, Veselý, Jozef, Jača, Eliška, and Chráska, Tomáš
Subjects: Condensed Matter - Materials Science
Abstract: Designing complex concentrated alloys (CCA), also known as high entropy alloys (HEA), requires reliable and accessible thermodynamic predictions due to the vast space of possible compositions. Numerous semiempirical parameters have been developed for phase predictions over the years. However, in this paper we show that none of these parameters is a robust indicator of phase content in various refractory CCA. CALPHAD proved to be a more powerful tool for phase predictions; however, the predictions face several limitations. AlTiNbMoV, AlTiNbTaZr and AlTiNbMoCr alloys were prepared using blended elemental powder metallurgy. Their phase and chemical composition were investigated by means of scanning electron microscopy, energy-dispersive X-ray spectroscopy and X-ray diffraction. Apart from the minor contamination phases (Al2O3 and Ti(C,N,O)), AlTiNbMoV and AlTiNbMoCr exhibited single-phase solid solution microstructure at the homogenization temperature of 1400 °C, while Al3Zr5 based intermetallics were present in the AlTiNbTaZr alloy. None of the simple semiempirical parameters was able to predict phase content correctly in all three alloys. Predictions by CALPHAD (TCHEA4 database) were able to predict the phases with limited accuracy only. A critical limitation of the TCHEA4 database is that only binary and ternary phase diagrams are assessed and some more complex phases cannot be predicted., Comment: 17 pages, 15 figures, 32 references. Postprint version published in Metallurgical and Materials Transactions A (2024)
Published: 2024

3. Nuclear shape / phase transitions in the N = 40, 60, 90 regions

Author: Petrellis, Dimitrios, Prášek, Adam, Alexa, Petr, Bonatsos, Dennis, Thiamová, Gabriela, and Veselý, Petr
Subjects: Nuclear Theory
Abstract: We investigate the isotopes of Se, Zr, Mo and Nd in the regions with N = 40, 60 and 90, where a first-order shape / phase transition, from spherical to deformed, can be observed. The signs of phase transitional behavior become evident by examining structure indicators, such as certain energy ratios and B(E2) transition rates and, in particular, how they evolve with neutron number. Microscopic mean-field calculations using the Skyrme-Hartree-Fock + Bardeen-Cooper-Schrieffer framework also reveal structural changes when considering the evolution of the resulting potential energy curves as functions of deformation. Finally, macroscopic calculations, using the Algebraic Collective Model, specifically for $^{74}$Se, $^{102}$Mo and $^{150}$Nd, after fitting its parameters to experimental spectra, result in potentials that resemble some of the potentials proposed in the framework of the Bohr Hamiltonian to describe shape transitions in nuclei., Comment: 5 pages, 4 figures, 7th Workshop of the Hellenic Institute of Nuclear Physics (HINPw7)
Published: 2024

4. Phase transitions in N = 40, 60 and 90 nuclei

Author: Prášek, A., Alexa, P., Bonatsos, D., Thiamová, G., Petrellis, D., and Veselý, P.
Subjects: Nuclear Theory
Abstract: In this paper we focus on three mass regions where first-order phase transitions occur, namely for $N=40$, 60 and 90. We investigate four isotopic chains (Se, Zr, Mo and Nd) in the framework of microscopic Skyrme-Hartree-Fock + Bardeen-Cooper-Schrieffer calculations for 15 different parametrizations. The microscopic calculations show the typical behavior expected for first-order phase transitions. To find the best candidate for the critical point phase transition we propose new microscopic position and occupation indices calculated for positive-parity and negative-parity proton and neutron single-quasiparticle states around the Fermi level. The microscopic calculations are completed by macroscopic calculations within the Algebraic Collective Model (ACM), and compared to the experimental data for $^{74}$Se, $^{102}$Mo and $^{150}$Nd, considered to be the best candidates for the critical point nuclei., Comment: accepted in Phys. Rev. C
Published: 2024

5. Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking

Author: Beňová, Ivana, Košecká, Jana, Gregor, Michal, Tamajka, Martin, Veselý, Marcel, and Šimko, Marián
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: The dominant probing approaches rely on the zero-shot performance of image-text matching tasks to gain a finer-grained understanding of the representations learned by recent multimodal image-language transformer models. The evaluation is carried out on carefully curated datasets focusing on counting, relations, attributes, and others. This work introduces an alternative probing strategy called guided masking. The proposed approach ablates different modalities using masking and assesses the model's ability to predict the masked word with high accuracy. We focus on studying multimodal models that consider regions of interest (ROI) features obtained by object detectors as input tokens. We probe the understanding of verbs using guided masking on ViLBERT, LXMERT, UNITER, and VisualBERT and show that these models can predict the correct verb with high accuracy. This contrasts with previous conclusions drawn from image-text matching probing techniques that frequently fail in situations requiring verb understanding. The code for all experiments will be publicly available https://github.com/ivana-13/guided_masking., Comment: 9 pages of text, 11 pages total, 7 figures, 3 tables, preprint
Published: 2024

6. A Wild Bootstrap Procedure for the Identification of Optimal Groups in Singular Spectrum Analysis

Author: Movahedifar, Maryam, Preusse, Friederike, Vesely, Anna, Ochieng, Daniel, and Dickhaus, Thorsten
Subjects: Statistics - Methodology, Statistics - Applications, 94A12, 62F40, 62J15
Abstract: A key step in separating signal from noise using Singular Spectrum Analysis (SSA) is grouping, which is often done subjectively. In this article a method which enables the identification of statistically significant groups for the grouping step in SSA is presented. The proposed procedure provides a more objective and reliable approach for separating noise from the main signal in SSA. We utilize the w-correlation and test whether it is close or equal to zero. A wild bootstrap approach is used to determine the distribution of the w-correlation. To identify an ideal number of groupings which leads to almost perfect separation of the noise and signal, a given number of groups are tested, necessitating accounting for multiplicity. The effectiveness of our method in identifying the best group is demonstrated through a simulation study; furthermore, we have applied the approach to real world data in the context of neuroimaging. This research provides a valuable contribution to the field of SSA and offers important insights into the statistical properties of the w-correlation distribution. The results obtained from the simulation studies and analysis of real-world data demonstrate the effectiveness of the proposed approach in identifying the best groupings for SSA., Comment: We have 22 pages and 5 figures
Published: 2024
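
For readers unfamiliar with the w-correlation mentioned in this abstract, the following minimal sketch computes it for two reconstructed series using the standard SSA definition, in which each time index is weighted by the number of times it appears in the L x K trajectory matrix (K = N - L + 1). The function name and the toy inputs are assumptions for illustration; the wild bootstrap test and the multiplicity correction proposed in the paper are not reproduced here.

```python
def w_correlation(x, y, L):
    """Weighted correlation of two equal-length series for SSA window length L."""
    N = len(x)
    K = N - L + 1
    # Weight of index i (0-based): how often it occurs in the L x K trajectory matrix.
    w = [min(i + 1, L, K, N - i) for i in range(N)]

    def inner(a, b):
        return sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))

    return inner(x, y) / (inner(x, x) * inner(y, y)) ** 0.5

# Toy series (hypothetical data): values near 0 suggest well-separated components.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
wiggle = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(w_correlation(trend, wiggle, L=3))
```

A w-correlation near zero is conventionally read as good separability of two components, which is why testing it against zero is a natural basis for a grouping rule.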

7. Structural Properties of Search Trees with 2-way Comparisons

Author: Atalig, Sunny, Chrobak, Marek, Mousavian, Erfan, Sgall, Jiri, and Vesely, Pavel
Subjects: Computer Science - Data Structures and Algorithms
Abstract: Optimal 3-way comparison search trees (3WCST's) can be computed using standard dynamic programming in time O(n^3), and this can be further improved to O(n^2) by taking advantage of the Monge property. In contrast, the fastest algorithm in the literature for computing optimal 2-way comparison search trees (2WCST's) runs in time O(n^4). To shed light on this discrepancy, we study structure properties of 2WCST's. On one hand, we show some new threshold bounds involving key weights that can be helpful in deciding which type of comparison should be at the root of the optimal tree. On the other hand, we also show that the standard techniques for speeding up dynamic programming (the Monge property / quadrangle inequality) do not apply to 2WCST's.
Published: 2023
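
The O(n^3) dynamic program that this abstract refers to for 3-way comparison search trees can be sketched as the classic optimal binary search tree recurrence. The version below is a simplified illustration that assumes only successful queries (key weights, no gap weights) and counts the root as one comparison; the function name is hypothetical, and neither the Monge-based O(n^2) speed-up nor anything specific to 2WCSTs is shown.

```python
def optimal_3wcst_cost(weights):
    """Minimum weighted number of 3-way comparisons over all search trees.

    weights[k] is the access weight of the k-th key in sorted order.
    """
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    # cost[i][j]: optimal cost for keys i..j-1 (half-open range); empty ranges cost 0.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            total = prefix[j] - prefix[i]  # every key in the range pays for the root comparison
            cost[i][j] = total + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)  # r is the root key
            )
    return cost[0][n]

print(optimal_3wcst_cost([34, 8, 50]))
```

The three nested loops (range length, left endpoint, root choice) are what give the cubic running time; the quadrangle inequality mentioned in the abstract is the property that lets the root search be narrowed in the faster variant.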

8. BUT CHiME-7 system description

Author: Karafiát, Martin, Veselý, Karel, Szöke, Igor, Mošner, Ladislav, Beneš, Karel, Witkowski, Marcin, Barchi, Germán, and Pepino, Leonardo
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krakow and University of Buenos Aires on the development of Automatic Speech Recognition systems for the CHiME-7 Challenge. We train and evaluate various end-to-end models with several toolkits. We heavily relied on Guided Source Separation (GSS) to convert multi-channel audio to single channel. The ASR is leveraging speech representations from models pre-trained by self-supervised learning, and we do a fusion of several ASR systems. In addition, we modified external data from the LibriSpeech corpus to become a close domain and added it to the training. Our efforts were focused on the far-field acoustic robustness sub-track of Task 1 - Distant Automatic Speech Recognition (DASR), our systems use oracle segmentation., Comment: 6 pages, Chime-7 challenge 2023
Published: 2023

9. Confidence bounds for the true discovery proportion based on the exact distribution of the number of rejections

Author: Preusse, Friederike, Vesely, Anna, and Dickhaus, Thorsten
Subjects: Statistics - Methodology
Abstract: In multiple hypotheses testing it has become widely popular to make inference on the true discovery proportion (TDP) of a set $\mathcal{M}$ of null hypotheses. This approach is useful for several application fields, such as neuroimaging and genomics. Several procedures to compute simultaneous lower confidence bounds for the TDP have been suggested in prior literature. Simultaneity allows for post-hoc selection of $\mathcal{M}$. If sets of interest are specified a priori, it is possible to gain power by removing the simultaneity requirement. We present an approach to compute lower confidence bounds for the TDP if the set of null hypotheses is defined a priori. The proposed method determines the bounds using the exact distribution of the number of rejections based on a step-up multiple testing procedure under independence assumptions. We assess robustness properties of our procedure and apply it to real data from the field of functional magnetic resonance imaging.
Published: 2023

10. Fully Scalable MPC Algorithms for Clustering in High Dimension

Author: Czumaj, Artur, Gao, Guichen, Jiang, Shaofeng H. -C., Krauthgamer, Robert, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: We design new parallel algorithms for clustering in high-dimensional Euclidean spaces. These algorithms run in the Massively Parallel Computation (MPC) model, and are fully scalable, meaning that the local memory in each machine may be $n^{\sigma}$ for arbitrarily small fixed $\sigma>0$. Importantly, the local memory may be substantially smaller than the number of clusters $k$, yet all our algorithms are fast, i.e., run in $O(1)$ rounds. We first devise a fast MPC algorithm for $O(1)$-approximation of uniform facility location. This is the first fully-scalable MPC algorithm that achieves $O(1)$-approximation for any clustering problem in general geometric setting; previous algorithms only provide $\mathrm{poly}(\log n)$-approximation or apply to restricted inputs, like low dimension or small number of clusters $k$; e.g. [Bhaskara and Wijewardena, ICML'18; Cohen-Addad et al., NeurIPS'21; Cohen-Addad et al., ICML'22]. We then build on this facility location result and devise a fast MPC algorithm that achieves $O(1)$-bicriteria approximation for $k$-Median and for $k$-Means, namely, it computes $(1+\varepsilon)k$ clusters of cost within $O(1/\varepsilon^2)$-factor of the optimum for $k$ clusters. A primary technical tool that we introduce, and may be of independent interest, is a new MPC primitive for geometric aggregation, namely, computing for every data point a statistic of its approximate neighborhood, for statistics like range counting and nearest-neighbor search. Our implementation of this primitive works in high dimension, and is based on consistent hashing (aka sparse partition), a technique that was recently used for streaming algorithms [Czumaj et al., FOCS'22].
Published: 2023

11. Selective inference for fMRI cluster-wise analysis, issues, and recommendations for critical vector selection: A comment on Blain et al

Author: Andreella, Angela, Vesely, Anna, Weeda, Wouter, and Goeman, Jelle
Subjects: Statistics - Applications
Abstract: Two permutation-based methods for simultaneous inference on the proportion of active voxels in cluster-wise brain imaging analysis have recently been published: Notip (Blain et al. 2022) and pARI (Andreella et al. 2023). Both rely on the definition of a critical vector of ordered p-values, chosen from a family of candidate vectors, but differ in how the family is defined: computed from randomization of external data for Notip and determined a priori for pARI. These procedures were compared to other proposals in the literature, but an extensive comparison between the two methods is missing due to their parallel publication. We provide such a comparison and find that pARI outperforms Notip if both methods are applied under their recommended settings. However, each method carries different advantages and drawbacks.
Published: 2023

12. Self-consistent many-body approach to the electroproduction of hypernuclei

Author: Bydžovský, P., Denisova, D., Petrellis, D., Skoupil, D., Veselý, P., De Gregorio, G., Knapp, F., and Iudice, N. Lo
Subjects: Nuclear Theory
Abstract: The electroproduction of selected $p$- and $sd$-shell hypernuclei was studied within a many-body approach using realistic interactions between the constituent baryons. The cross sections were computed in distorted-wave impulse approximation using two elementary amplitudes for the electroproduction of the $\Lambda$ hyperon. The structure of the hypernuclei was investigated within the framework of the self-consistent $\Lambda$-nucleon Tamm-Dancoff approach and its extension known as the $\Lambda$-nucleon equation of motion phonon method. Use was made of the NNLOsat chiral potential plus the effective Nijmegen-F YN interaction. The method was first implemented on light nuclei for studying the available experimental data and establishing a relation to other approaches. After this proof test, it was adopted for predicting the electroproduction cross section of the hypernuclei $^{40}_{~\Lambda}$K and $^{48}_{~\Lambda}$K in view of the E12-15-008 experiment in preparation at JLab. On the ground of these predictions, appreciable effects on the spectra are expected to be induced by the YN interaction., Comment: 10 pages, 7 figures (version 2); 11 pages, 9 figures (v1)
Published: 2023

13. Finding the Optimal Currency Composition of Foreign Exchange Reserves with a Quantum Computer

Author: Vesely, Martin
Subjects: Economics - General Economics, Quantitative Finance - Computational Finance, Quantum Physics
Abstract: Portfolio optimization is an inseparable part of strategic asset allocation at the Czech National Bank. Quantum computing is a new technology offering algorithms for that problem. The capabilities and limitations of quantum computers with regard to portfolio optimization should therefore be investigated. In this paper, we focus on applications of quantum algorithms to dynamic portfolio optimization based on the Markowitz model. In particular, we compare algorithms for universal gate-based quantum computers (the QAOA, the VQE and Grover adaptive search), single-purpose quantum annealers, the classical exact branch and bound solver and classical heuristic algorithms (simulated annealing and genetic optimization). To run the quantum algorithms we use the IBM Quantum™ gate-based quantum computer. We also employ the quantum annealer offered by D-Wave. We demonstrate portfolio optimization on finding the optimal currency composition of the CNB's FX reserves. A secondary goal of the paper is to provide staff of central banks and other financial market regulators with literature on quantum optimization algorithms, because financial firms are active in finding possible applications of quantum computing.
Published: 2023

14. Procrustes-based distances for exploring between-matrices similarity

Author: Andreella, Angela, De Santis, Riccardo, Vesely, Anna, and Finos, Livio
Subjects: Statistics - Applications
Abstract: The statistical shape analysis called Procrustes analysis minimizes the distance between matrices by similarity transformations. The method returns a set of optimal orthogonal matrices, which project each matrix into a common space. This manuscript presents two types of distances derived from Procrustes analysis for exploring between-matrices similarity. The first one focuses on the residuals from the Procrustes analysis, i.e., the residual-based distance metric. In contrast, the second one exploits the fitted orthogonal matrices, i.e., the rotational-based distance metric. Thanks to these distances, similarity-based techniques such as the multidimensional scaling method can be applied to visualize and explore patterns and similarities among observations. The proposed distances result in being helpful in functional magnetic resonance imaging (fMRI) data analysis. The brain activation measured over space and time can be represented by a matrix. The proposed distances applied to a sample of subjects -- i.e., matrices -- revealed groups of individuals sharing patterns of neural brain activation.
Published: 2023

15. Extendability of continuous quasiconvex functions from subspaces

Author: De Bernardi, Carlo Alberto and Veselý, Libor
Subjects: Mathematics - Functional Analysis
Abstract: Let $Y$ be a subspace of a topological vector space $X$, and $A\subset X$ an open convex set that intersects $Y$. We say that the property $(QE)$ [property $(CE)$] holds if every continuous quasiconvex [continuous convex] function on $A\cap Y$ admits a continuous quasiconvex [continuous convex] extension defined on $A$. We study relations between $(QE)$ and $(CE)$ properties, proving that $(QE)$ always implies $(CE)$ and that, under suitable hypotheses (satisfied for example if $X$ is a normed space and $Y$ is a closed subspace of $X$), the two properties are equivalent. By combining the previous implications between $(QE)$ and $(CE)$ properties with known results about the property $(CE)$, we obtain some new positive results about the extension of quasiconvex continuous functions. In particular, we generalize the results contained in \cite{DEQEX} to the infinite-dimensional separable case. Moreover, we also immediately obtain existence of examples in which $(QE)$ does not hold.
Published: 2022

16. Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator

Author: Prasad, Amrutha, Zuluaga-Gomez, Juan, Motlicek, Petr, Sarfjoo, Saeed, Nigmatulina, Iuliia, and Vesely, Karel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper describes a simple yet efficient repetition-based modular system for speeding up air-traffic controllers (ATCos) training. E.g., a human pilot is still required in EUROCONTROL's ESCAPE lite simulator (see https://www.eurocontrol.int/simulator/escape) during ATCo training. However, this need can be substituted by an automatic system that could act as a pilot. In this paper, we aim to develop and integrate a pseudo-pilot agent into the ATCo training pipeline by merging diverse artificial intelligence (AI) powered modules. The system understands the voice communications issued by the ATCo, and, in turn, it generates a spoken prompt that follows the pilot's phraseology to the initial communication. Our system mainly relies on open-source AI tools and air traffic control (ATC) databases, thus, proving its simplicity and ease of replicability. The overall pipeline is composed of the following: (1) a submodule that receives and pre-processes the input stream of raw audio, (2) an automatic speech recognition (ASR) system that transforms audio into a sequence of words; (3) a high-level ATC-related entity parser, which extracts relevant information from the communication, i.e., callsigns and commands, and finally, (4) a speech synthesizer submodule that generates responses based on the high-level ATC entities previously extracted. Overall, we show that this system could pave the way toward developing a real proof-of-concept pseudo-pilot system. Hence, speeding up the training of ATCos while drastically reducing its overall cost., Comment: Presented at Sesar Innovation Days 2022. https://www.sesarju.eu/sesarinnovationdays
Published: 2022

17. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

Author: Zuluaga-Gomez, Juan, Veselý, Karel, Szöke, Igor, Blatt, Alexander, Motlicek, Petr, Kocour, Martin, Rigault, Mickael, Choukri, Khalid, Prasad, Amrutha, Sarfjoo, Seyyed Saeed, Nigmatulina, Iuliia, Cevenini, Claudia, Kolčárek, Pavel, Tart, Allan, Černocký, Jan, and Klakow, Dietrich
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried between an air traffic controller (ATCO) and pilots via very-high frequency radio channels. In order to incorporate these novel technologies into ATC (low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging ATC field, which has lagged behind due to lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, signal-to-noise ratio estimate and English language detection score per sample. Both available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset from the original test set corpus, that we are offering for free at https://www.atco2.org/data. 
We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community., Comment: Manuscript under review; The code is available at: https://github.com/idiap/atco2-corpus
Published: 2022

18. Comparative analysis of formalisms and performances of three different beyond mean-field approaches

Author: Knapp, František, Papakonstantinou, Panagiota, Veselý, Petr, De Gregorio, Giovanni, Herko, Jakub, and Iudice, Nicola Lo
Subjects: Nuclear Theory
Abstract: We investigate the differences and analogies between the equation of motion phonon method (EMPM) and second Tamm-Dancoff and random-phase approximations (STDA and SRPA) paying special attention to the problem of spurious center-of-mass (c.m.) admixtures. In order to compare them on an equal footing, we perform self-consistent calculations of the multipole strength distributions in selected doubly magic nuclei within a space including up to two-particle-two-hole (2p-2h) basis states using the UCOM two-body intrinsic Hamiltonian and we explore the tools each approach supplies for removing the spurious c.m. admixtures. We find that the EMPM and STDA yield exactly the same results when the same intrinsic Hamiltonian is used and the coupling of the Hartree-Fock state with the 2p-2h space is neglected, but, unlike STDA and SRPA, the EMPM offers the possibility to completely remove c.m. admixtures., Comment: 13 pages, 17 figures
Published: 2022

19. Post-selection Inference in Multiverse Analysis (PIMA): an inferential framework based on the sign flipping score test

Author: Girardi, Paolo, Vesely, Anna, Lakens, Daniël, Altoè, Gianmarco, Pastore, Massimiliano, Calcagnì, Antonio, and Finos, Livio
Subjects: Statistics - Methodology, Statistics - Applications, 62F03, G.3
Abstract: When analyzing data researchers make some decisions that are either arbitrary, based on subjective beliefs about the data generating process, or for which equally justifiable alternative choices could have been made. This wide range of data-analytic choices can be abused, and has been one of the underlying causes of the replication crisis in several fields. Recently, the introduction of multiverse analysis provides researchers with a method to evaluate the stability of the results across reasonable choices that could be made when analyzing data. Multiverse analysis is confined to a descriptive role, lacking a proper and comprehensive inferential procedure. Recently, specification curve analysis adds an inferential procedure to multiverse analysis, but this approach is limited to simple cases related to the linear model, and only allows researchers to infer whether at least one specification rejects the null hypothesis, but not which specifications should be selected. In this paper we present a Post-selection Inference approach to Multiverse Analysis (PIMA) which is a flexible and general inferential approach that accounts for all possible models, i.e., the multiverse of reasonable analyses. The approach allows for a wide range of data specifications (i.e. pre-processing) and any generalized linear model; it allows testing the null hypothesis of a given predictor not being associated with the outcome, by merging information from all reasonable models of multiverse analysis, and provides strong control of the family-wise error rate such that it allows researchers to claim that the null-hypothesis can be rejected for each specification that shows a significant effect. The inferential proposal is based on a conditional resampling procedure. To be continued..., Comment: 37 pages, 2 figures
Published: 2022

20. Fermi motion effects in electroproduction of hypernuclei

Author: Bydžovský, P., Denisova, D., Skoupil, D., and Veselý, P.
Subjects: Nuclear Theory
Abstract: In a previous analysis of electroproduction of hypernuclei the cross sections were calculated in distorted-wave impulse approximation where the momentum of the initial proton in the nucleus was set to zero (the frozen-proton approximation). In this paper we go beyond this approximation, assuming a nonzero effective proton momentum due to proton Fermi motion inside the target nucleus, and we also discuss other kinematical effects. To this end we have derived a more general form of the two-component elementary electroproduction amplitude (Chew-Goldberger-Low-Nambu like) which allows its use in a general reference frame moving with respect to the nucleus-rest frame. The effects of Fermi motion were found to depend on kinematics and elementary amplitudes. The largest effects were observed in the contributions from the longitudinal and interference parts of the cross sections. The extension of the calculations beyond the frozen-proton approximation improved the agreement of predicted theoretical cross sections with experimental data and once we assumed the optimum on-shell approximation, we were able to remove an inconsistency which was previously present in the calculations., Comment: 20 pages, 7 figures, 5 tables
Published: 2022

21. Resampling-Based Multisplit Inference for High-Dimensional Regression

Author: Vesely, Anna, Goeman, Jelle J., and Finos, Livio
Subjects: Statistics - Methodology
Abstract: We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make confidence statements on relevant predictor variables. The method constructs permutation test statistics for any individual hypothesis by means of repeated splits of the data and a variable selection technique; then it defines a test for any subset by suitably aggregating its variables' test statistics. The resulting procedure is extremely flexible, as it allows different selection techniques and several combining functions. We present it in two ways: an exact method and an approximate one, that requires less memory usage and shorter computation time, and can be scaled up to higher dimensions. We illustrate the performance of the method with simulations and the analysis of real gene expression data., Comment: 31 pages (16 pages main, 15 pages appendix), 12 figures
Published: 2022

22. Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information

Author: Blatt, Alexander, Kocour, Martin, Veselý, Karel, Szöke, Igor, and Klakow, Dietrich
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Air traffic control (ATC) relies on communication via speech between pilot and air-traffic controller (ATCO). The call-sign, as unique identifier for each flight, is used to address a specific pilot by the ATCO. Extracting the call-sign from the communication is a challenge because of the noisy ATC voice channel and the additional noise introduced by the receiver. A low signal-to-noise ratio (SNR) in the speech leads to high word error rate (WER) transcripts. We propose a new call-sign recognition and understanding (CRU) system that addresses this issue. The recognizer is trained to identify call-signs in noisy ATC transcripts and convert them into the standard International Civil Aviation Organization (ICAO) format. By incorporating surveillance information, we can multiply the call-sign accuracy (CSA) up to a factor of four. The introduced data augmentation adds additional performance on high WER transcripts and allows the adaptation of the model to unseen airspaces., Comment: Accepted by ICASSP 2022
Published: 2022

23. Streaming Facility Location in High Dimension via Geometric Hashing

Author: Czumaj, Artur, Filtser, Arnold, Jiang, Shaofeng H.-C., Krauthgamer, Robert, Veselý, Pavel, and Yang, Mingwei
Subjects: Computer Science - Data Structures and Algorithms
Abstract: In Euclidean Uniform Facility Location (UFL), the input is a set of clients in $\mathbb{R}^d$ and the goal is to place facilities to serve them, so as to minimize the total cost of opening facilities plus connecting the clients. We study the setting of dynamic geometric streams, where the clients are presented as a sequence of insertions and deletions of points in the grid $\{1,\ldots,\Delta\}^d$, and we focus on the \emph{high-dimensional regime}, where the algorithm must use space polynomial in $d\cdot\log\Delta$. We present a new algorithmic framework, based on importance sampling, for $O(1)$-approximation of UFL using only $\mathrm{poly}(d\cdot\log\Delta)$ space. This framework is easy to implement in two passes, one for sampling points and the other for estimating their contribution. Over random-order streams, we can extend this to one pass by using the two halves of the stream separately. Our main result, for arbitrary-order streams, computes $O(d / \log d)$-approximation in one pass by combining the two passes differently. This improves upon previous algorithms that either need space $\exp(d)$ or only guarantee $O(d\cdot\log^2\Delta)$-approximation, and therefore our algorithms for high dimension are the first to avoid the $O(\log\Delta)$-factor in approximation that is inherent to the widely-used quadtree decomposition. Our improvement is achieved by employing a geometric hashing scheme that maps points in $\mathbb{R}^d$ into buckets of bounded diameter, with the key property that every point set of small-enough diameter is hashed into few buckets. By applying an alternative bound for this hashing, we also obtain an $O(1 / \epsilon)$-approximation in one pass, using larger but still sublinear space $O(n^{\epsilon})$ where $n$ is the number of clients. We complement our results by showing $1.085$-approximation requires space exponential in $\mathrm{poly}(d\cdot\log\Delta)$., Comment: The abstract is shortened to meet the length constraint of arXiv
Published: 2022

24. Application of Quantum Computers in Foreign Exchange Reserves Management

Author: Veselý, Martin
Subjects: Economics - General Economics, Quantitative Finance - Computational Finance, Quantum Physics
Abstract: The main purpose of this article is to evaluate possible applications of quantum computers in foreign exchange reserves management. The capabilities of quantum computers are demonstrated by means of risk measurement using the quantum Monte Carlo method and portfolio optimization using a linear equations system solver (the Harrow-Hassidim-Lloyd algorithm) and quadratic unconstrained binary optimization (the quantum approximate optimization algorithm). All demonstrations are carried out on the cloud-based IBM Quantum(TM) platform. Despite the fact that real-world applications are impossible under the current state of development of quantum computers, it is proven that in principle it will be possible to apply such computers in FX reserves management in the future. In addition, the article serves as an introduction to quantum computing for the staff of central banks and financial market supervisory authorities.
Published: 2022

25. Spectroscopic properties of 4He within a multiphonon approach

Author: De Gregorio, G., Knapp, F., Lo Iudice, N., and Veselý, P.
Subjects: Nuclear Theory
Abstract: Bulk and spectroscopic properties of 4He are studied within an equation of motion phonon method. Such a method generates a basis of n-phonon (n = 0, 1, 2, 3, ...) states composed of tensor products of particle-hole Tamm-Dancoff phonons and then solves the full eigenvalue problem in such a basis. The method does not rely on any approximation and is free of any contamination induced by the center of mass, by virtue of a procedure exploiting the singular value decomposition of rectangular matrices. Two potentials, both derived from the chiral effective field theory, are adopted in a self-consistent calculation performed within a space including up to three phonons. The latter basis states are treated under a simplifying assumption. A comparative analysis with the experimental data points out the different performances of the two potentials. It also shows that the calculation succeeds only partially in the description of the spectroscopic properties and suggests a recipe for further improvements., Comment: Accepted for publication in Phys. Rev. C
Published: 2022

26. Improved Approximation Guarantees for Shortest Superstrings using Cycle Classification by Overlap to Length Ratios

Author: Englert, Matthias, Matsakis, Nicolaos, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms
Abstract: In the Shortest Superstring problem, we are given a set of strings and we are asking for a common superstring, which has the minimum number of characters. The Shortest Superstring problem is NP-hard and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. The GREEDY algorithm, being simpler than other well-performing approximation algorithms for this problem, has attracted attention since the 1980s and is commonly used in practical applications. Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and this upper bound was improved to 3.5 by Kaplan and Shafrir (IPL 2005). We show that the approximation guarantee of GREEDY is at most $(13+\sqrt{57})/6 \approx 3.425$, making the first progress on this question since 2005. Furthermore, we prove that the Shortest Superstring can be approximated within a factor of $(37+\sqrt{57})/18\approx 2.475$, improving slightly upon the currently best $2\frac{11}{23}$-approximation algorithm by Mucha (SODA 2013).
Published: 2021

27. Distill: Domain-Specific Compilation for Cognitive Models

Author: Vesely, Jan, Pothukuchi, Raghavendra Pradyumna, Joshi, Ketaki, Gupta, Samyak, Cohen, Jonathan D., and Bhattacharjee, Abhishek
Subjects: Computer Science - Programming Languages
Abstract: This paper discusses our proposal and implementation of Distill, a domain-specific compilation tool based on LLVM to accelerate cognitive models. Cognitive models explain the process of cognitive function and offer a path to human-like artificial intelligence. However, cognitive modeling is laborious, requiring composition of many types of computational tasks, and suffers from poor performance as it relies on high-level languages like Python. In order to continue enjoying the flexibility of Python while achieving high performance, Distill uses domain-specific knowledge to compile Python-based cognitive models into LLVM IR, carefully stripping away features like dynamic typing and memory management that add overheads to the actual model. As we show, this permits significantly faster model execution. We also show that the code so generated enables using classical compiler data flow analysis passes to reveal properties about data flow in cognitive models that are useful to cognitive scientists. Distill is publicly available, is being used by researchers in cognitive science, and has led to patches that are currently being evaluated for integration into mainline LLVM., Comment: 11 pages, 7 figures
Published: 2021

28. Quality Control Methodology for Simulation Models of Computer Network Protocols

Author: Veselý, Vladimír and Zavřel, Jan
Subjects: Computer Science - Networking and Internet Architecture, Computer Science - Performance
Abstract: This paper summarizes know-how about modeling and simulation of computer networking protocols we contributed to the OMNeT++ community. We propose a methodology aiming to set a reliable ground truth for the quality of simulation models of networking protocols. We demonstrate the application of this methodology on our EIGRP source code pull-requested to the INET framework., Comment: Published in: M. Marek, G. Nardini, V. Vesely (Eds.), Proceedings of the 8th OMNeT++ Community Summit, Virtual Summit, September 8-10, 2021
Published: 2021

29. Proceedings of the 8th OMNeT++ Community Summit, Virtual Summit, September 8-10, 2021

Author: Marek, Marcel, Nardini, Giovanni, and Veselý, Vladimír
Subjects: Computer Science - Networking and Internet Architecture, Computer Science - Performance
Abstract: These are the Proceedings of the 8th OMNeT++ Community Summit, which was held virtually on September 8-10, 2021.
Published: 2021

30. Improved Analysis of Online Balanced Clustering

Author: Bienkowski, Marcin, Böhm, Martin, Koutecký, Martin, Rothvoß, Thomas, Sgall, Jiří, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms
Abstract: In the online balanced graph repartitioning problem, one has to maintain a clustering of $n$ nodes into $\ell$ clusters, each having $k = n / \ell$ nodes. During runtime, an online algorithm is given a stream of communication requests between pairs of nodes: an inter-cluster communication costs one unit, while the intra-cluster communication is free. An algorithm can change the clustering, paying unit cost for each moved node. This natural problem admits a simple $O(\ell^2 \cdot k^2)$-competitive algorithm COMP, whose performance is far apart from the best known lower bound of $\Omega(\ell \cdot k)$. One of the open questions is whether the dependency on $\ell$ can be made linear; this question is of practical importance as in the typical datacenter application where virtual machines are clustered on physical servers, $\ell$ is several orders of magnitude larger than $k$. We answer this question affirmatively, proving that a simple modification of COMP is $(\ell \cdot 2^{O(k)})$-competitive. On the technical level, we achieve our bound by translating the problem to a system of linear integer equations and using Graver bases to show the existence of a ``small'' solution.
Published: 2021

31. Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems

Author: Zuluaga-Gomez, Juan, Nigmatulina, Iuliia, Prasad, Amrutha, Motlicek, Petr, Veselý, Karel, Kocour, Martin, and Szöke, Igor
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Air traffic management and specifically air-traffic control (ATC) rely mostly on voice communications between Air Traffic Controllers (ATCos) and pilots. In most cases, these voice communications follow a well-defined grammar that could be leveraged in Automatic Speech Recognition (ASR) technologies. The callsign used to address an airplane is an essential part of all ATCo-pilot communications. We propose a two-step approach to add contextual knowledge during semi-supervised training to reduce the ASR system error rates at recognizing the part of the utterance that contains the callsign. Initially, we represent in a WFST the contextual knowledge (i.e. air-surveillance data) of an ATCo-pilot communication. Then, during Semi-Supervised Learning (SSL) the contextual knowledge is added by second-pass decoding (i.e. lattice re-scoring). Results show that `unseen domains' (e.g. data from airports not present in the supervised training data) are further aided by contextual SSL when compared to standalone SSL. For this task, we introduce the Callsign Word Error Rate (CA-WER) as an evaluation metric, which only assesses ASR performance of the spoken callsign in an utterance. We obtained a 32.1% CA-WER relative improvement applying SSL with an additional 17.5% CA-WER improvement by adding contextual knowledge during SSL on a challenging ATC-based test set gathered from LiveATC., Comment: Presented at: Interspeech conference 2021 (Brno, Czechia, August 30 - September 3)
Published: 2021

32. Detecting English Speech in the Air Traffic Control Voice Communication

Author: Szoke, Igor, Kesiraju, Santosh, Novotny, Ondrej, Kocour, Martin, Vesely, Karel, and Cernocky, Jan "Honza"
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We launched a community platform for collecting ATC speech worldwide in the ATCO2 project. Filtering out unseen non-English speech is one of the main components in the data processing pipeline. The proposed English Language Detection (ELD) system is based on the embeddings from a Bayesian subspace multinomial model. It is trained on the word confusion network from an ASR system. It is robust, easy to train, and lightweight. We achieved an equal-error-rate (EER) of 0.0439, a 50% relative reduction as compared to the state-of-the-art acoustic ELD system based on x-vectors, in the in-domain scenario. Further, we achieved an EER of 0.1352, a 33% relative reduction as compared to the acoustic ELD, in the unseen language (out-of-domain) condition. We plan to publish the evaluation dataset from the ATCO2 project.
Published: 2021

33. Cosmological magnetic field---the boost-symmetric case

Author: Veselý, Jiří and Žofka, Martin
Subjects: General Relativity and Quantum Cosmology
Abstract: We find a class of cylindrically symmetric, static electrovacuum spacetimes generated by a non-homogeneous magnetic field and involving the cosmological constant and one additional parameter, which determine uniquely the strength of the magnetic field. We provide a simple model of a source producing the field., Comment: 7 pages, no figures, a reference added
Published: 2021

34. Cylindrical spacetimes due to radial magnetic fields

Author: Veselý, Jiří and Žofka, Martin
Subjects: General Relativity and Quantum Cosmology
Abstract: We continue our previous study of cylindrically symmetric, static electrovacuum spacetimes generated by a magnetic field, involving optionally the cosmological constant, and investigate several classes of exact solutions. These spacetimes are due to magnetic fields that are perpendicular to the axis of symmetry., Comment: 8 pages, 6 figures
Published: 2021

35. Permutation-Based True Discovery Guarantee by Sum Tests

Author: Vesely, Anna, Finos, Livio, and Goeman, Jelle J.
Subjects: Statistics - Methodology
Abstract: Sum-based global tests are highly popular in multiple hypothesis testing. In this paper we propose a general closed testing procedure for sum tests, which provides lower confidence bounds for the proportion of true discoveries (TDP), simultaneously over all subsets of hypotheses. These simultaneous inferences come for free, i.e., without any adjustment of the alpha-level, whenever a global test is used. Our method allows for an exploratory approach, as simultaneity ensures control of the TDP even when the subset of interest is selected post hoc. It adapts to the unknown joint distribution of the data through permutation testing. Any sum test may be employed, depending on the desired power properties. We present an iterative shortcut for the closed testing procedure, based on the branch and bound algorithm, which converges to the full closed testing results, often after few iterations; even if it is stopped early, it controls the TDP. We compare the properties of different choices for the sum test through simulations, then we illustrate the feasibility of the method for high dimensional data on brain imaging and genomics data., Comment: Main: 27 pages, 3 figures. Appendices: 19 pages, 7 figures
Published: 2021

36. Theory meets Practice at the Median: a worst case comparison of relative error quantile algorithms

Author: Cormode, Graham, Mishra, Abhinav, Ross, Joseph, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, Statistics - Computation, F.2.2
Abstract: Estimating the distribution and quantiles of data is a foundational task in data mining and data science. We study algorithms which provide accurate results for extreme quantile queries using a small amount of space, thus helping to understand the tails of the input distribution. Namely, we focus on two recent state-of-the-art solutions: $t$-digest and ReqSketch. While $t$-digest is a popular compact summary which works well in a variety of settings, ReqSketch comes with formal accuracy guarantees at the cost of its size growing as new observations are inserted. In this work, we provide insight into which conditions make one preferable to the other. Namely, we show how to construct inputs for $t$-digest that induce an almost arbitrarily large error and demonstrate that it fails to provide accurate results even on i.i.d. samples from a highly non-uniform distribution. We propose practical improvements to ReqSketch, making it faster than $t$-digest, while its error stays bounded on any instance. Still, our results confirm that $t$-digest remains more accurate on the ``non-adversarial'' data encountered in practice., Comment: Updated experiments, improved presentation. To appear in KDD 2021
Published: 2021

37. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge

Author: Kocour, Martin, Cámbara, Guillermo, Luque, Jordi, Bonet, David, Farrús, Mireia, Karafiát, Martin, Veselý, Karel, and Černocký, Jan "Honza"
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language
Abstract: This paper describes the joint effort of BUT and Telefónica Research on the development of Automatic Speech Recognition systems for the Albayzin 2020 Challenge. We compare approaches based on either hybrid or end-to-end models. In hybrid modelling, we explore the impact of a SpecAugment layer on performance. For end-to-end modelling, we used a convolutional neural network with gated linear units (GLUs). The performance of this model is also evaluated with an additional n-gram language model to improve word error rates. We further inspect source separation methods to extract speech from noisy environments (i.e. TV shows). More precisely, we assess the effect of using a neural-based music separator named Demucs. A fusion of our best systems achieved 23.33% WER in official Albayzin 2020 evaluations. Aside from techniques used in our final submitted systems, we also describe our efforts in retrieving high quality transcripts for training., Comment: fusion, end-to-end model, hybrid model, semisupervised, automatic speech recognition, convolutional neural network
Published: 2021

38. Breaking the Barrier of 2 for the Competitiveness of Longest Queue Drop

Author: Antoniadis, Antonios, Englert, Matthias, Matsakis, Nicolaos, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, F.2.2
Abstract: We consider the problem of managing the buffer of a shared-memory switch that transmits packets of unit value. A shared-memory switch consists of an input port, a number of output ports, and a buffer with a specific capacity. In each time step, an arbitrary number of packets arrive at the input port, each packet designated for one output port. Each packet is added to the queue of the respective output port. If the total number of packets exceeds the capacity of the buffer, some packets have to be irrevocably evicted. At the end of each time step, each output port transmits a packet in its queue and the goal is to maximize the number of transmitted packets. The Longest Queue Drop (LQD) online algorithm accepts any arriving packet to the buffer. However, if this results in the buffer exceeding its memory capacity, then LQD drops a packet from whichever queue is currently the longest, breaking ties arbitrarily. The LQD algorithm was first introduced in 1991, and is known to be $2$-competitive since 2001. Although LQD remains the best known online algorithm for the problem and is of practical interest, determining its true competitiveness is a long-standing open problem. We show that LQD is 1.6918-competitive, establishing the first $(2-\varepsilon)$ upper bound for the competitive ratio of LQD, for a constant $\varepsilon>0$., Comment: A preliminary version appeared at ICALP 2021. This version contains an improved analysis which yields a slightly better upper bound. 30 pages
Published: 2020

39. Streaming Algorithms for Geometric Steiner Forest

Author: Czumaj, Artur, Jiang, Shaofeng H.-C., Krauthgamer, Robert, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms
Abstract: We consider an important generalization of the Steiner tree problem, the \emph{Steiner forest problem}, in the Euclidean plane: the input is a multiset $X \subseteq \mathbb{R}^2$, partitioned into $k$ color classes $C_1, C_2, \ldots, C_k \subseteq X$. The goal is to find a minimum-cost Euclidean graph $G$ such that every color class $C_i$ is connected in $G$. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to $X$. Each input point $x\in X$ arrives with its color $\textsf{color}(x) \in [k]$, and as usual for dynamic geometric streams, the input points are restricted to the discrete grid $\{0, \ldots, \Delta\}^2$. We design a single-pass streaming algorithm that uses $\mathrm{poly}(k \cdot \log\Delta)$ space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio $\alpha_2$ (currently $1.1547 \le \alpha_2 \le 1.214$). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when $k=1$, and it is a major open question to improve the ratio to $1 + \epsilon$ even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires $\Omega(k)$ bits of space.
Published: 2020

40. Automatic Speech Recognition Benchmark for Air-Traffic Communications

Author: Zuluaga-Gomez, Juan, Motlicek, Petr, Zhan, Qingran, Vesely, Karel, and Braun, Rudolf
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas of speech-based automation, such as in Air-Traffic Control (ATC) environments. Currently, voice communication and data link communications are the only means of contact between pilots and Air-Traffic Controllers (ATCo), where the former is the most widely used and the latter is a non-spoken method mandatory for oceanic messages and limited for some domestic issues. ASR systems in ATCo environments inherit increasing complexity due to accents from non-English speakers, cockpit noise, speaker-dependent biases, and small in-domain ATC databases for training. Hereby, we introduce CleanSky EC-H2020 ATCO2, a project that aims to develop an ASR-based platform to collect, organize and automatically pre-process ATCo speech-data from air space. This paper conveys an exploratory benchmark of several state-of-the-art ASR models trained on more than 170 hours of ATCo speech-data. We demonstrate that the cross-accent flaws due to speakers' accents are minimized by the amount of data, making the system feasible for ATC environments. The developed ASR system achieves an averaged word error rate (WER) of 7.75% across four databases. An additional 35% relative improvement in WER is achieved on one test set when training a TDNNF system with byte-pair encoding., Comment: Accepted to: 21st INTERSPEECH conference (Shanghai, October 25-29)
Published: 2020

41. Relative Error Streaming Quantiles

Author: Cormode, Graham, Karnin, Zohar, Liberty, Edo, Thaler, Justin, and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, F.2.2
Abstract: Estimating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of $n$ items from a data universe equipped with a total order, the task is to compute a sketch (data structure) of size polylogarithmic in $n$. Given the sketch and a query item $y$, one should be able to approximate its rank in the stream, i.e., the number of stream elements smaller than or equal to $y$. Most works to date focused on additive $\varepsilon n$ error approximation, culminating in the KLL sketch that achieved optimal asymptotic behavior. This paper investigates multiplicative $(1\pm\varepsilon)$-error approximations to the rank. Practical motivation for multiplicative error stems from demands to understand the tails of distributions, and hence for sketches to be more accurate near extreme values. The most space-efficient algorithms due to prior work store either $O(\log(\varepsilon^2 n)/\varepsilon^2)$ or $O(\log^3(\varepsilon n)/\varepsilon)$ universe items. We present a randomized sketch storing $O(\log^{1.5}(\varepsilon n)/\varepsilon)$ items that can $(1\pm\varepsilon)$-approximate the rank of each universe item with high constant probability; this space bound is within an $O(\sqrt{\log(\varepsilon n)})$ factor of optimal. Our algorithm does not require prior knowledge of the stream length and is fully mergeable, rendering it suitable for parallel and distributed computing environments., Comment: Final version of the paper to appear in Journal of the ACM. Compared to the previous version, we removed any restrictions on the accuracy parameters in the main result and thoroughly revised the paper. 48 pages, 2 figures
Published: 2020

42. Star-finite coverings of Banach spaces

Author: De Bernardi, Carlo Alberto, Somaglia, Jacopo, and Vesely, Libor
Subjects: Mathematics - Functional Analysis
Abstract: We study star-finite coverings of infinite-dimensional normed spaces. A family of sets is called star-finite if each of its members intersects only finitely many other members of the family. It follows by our results that an LUR or a uniformly Fréchet smooth infinite-dimensional Banach space does not admit star-finite coverings by closed balls. On the other hand, we present a quite involved construction proving existence of a star-finite covering of $c_0(\Gamma)$ by Fréchet smooth centrally symmetric bounded convex bodies. A similar but simpler construction shows that every normed space of countable dimension (and hence incomplete) has a star-finite covering by closed balls.
Published: 2020

43. BUT Opensat 2019 Speech Recognition System

Author: Karafiát, Martin, Baskar, Murali Karthick, Szöke, Igor, Vydana, Hari Krishna, Veselý, Karel, and Černocký, Jan "Honza"
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
Abstract: The paper describes the BUT Automatic Speech Recognition (ASR) systems submitted for OpenSAT evaluations under two domain categories: low-resourced languages and public safety communications. The first was challenging due to a lack of training data; therefore, various architectures and multilingual approaches were employed. The combination led to superior performance. The second domain was challenging due to recording in extreme conditions, such as a specific channel, speakers under stress and high levels of noise. Data augmentation was essential to get reasonably good performance., Comment: REJECTED in ICASSP 2020
Published: 2020

44. Streaming Algorithms for Bin Packing and Vector Scheduling

Author: Cormode, Graham and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, F.2.2
Abstract: Problems involving the efficient arrangement of simple objects, as captured by bin packing and makespan scheduling, are fundamental tasks in combinatorial optimization. These are well understood in the traditional online and offline cases, but have been less well-studied when the volume of the input is truly massive, and cannot even be read into memory. This is captured by the streaming model of computation, where the aim is to approximate the cost of the solution in one pass over the data, using small space. As a result, streaming algorithms produce concise input summaries that approximately preserve the optimum value. We design the first efficient streaming algorithms for these fundamental problems in combinatorial optimization. For Bin Packing, we provide a streaming asymptotic $1+\varepsilon$-approximation with $\widetilde{O}\left(\frac{1}{\varepsilon}\right)$ memory, where $\widetilde{O}$ hides logarithmic factors. Moreover, such a space bound is essentially optimal. Our algorithm implies a streaming $d+\varepsilon$-approximation for Vector Bin Packing in $d$ dimensions, running in space $\widetilde{O}\left(\frac{d}{\varepsilon}\right)$. For the related Vector Scheduling problem, we show how to construct an input summary in space $\widetilde{O}(d^2\cdot m / \varepsilon^2)$ that preserves the optimum value up to a factor of $2 - \frac{1}{m} +\varepsilon$, where $m$ is the number of identical machines., Comment: 19 pages, 1 figure, submitted
Published: 2019

45. Tight Lower Bound for Comparison-Based Quantile Summaries

Author: Cormode, Graham and Veselý, Pavel
Subjects: Computer Science - Data Structures and Algorithms, F.2.2
Abstract: Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe. We study data structures, called quantile summaries, which keep track of all quantiles, up to an error of at most $\varepsilon$. That is, an $\varepsilon$-approximate quantile summary first processes a stream of items and then, given any quantile query $0\le \phi\le 1$, returns an item from the stream, which is a $\phi'$-quantile for some $\phi' = \phi \pm \varepsilon$. We focus on comparison-based quantile summaries that can only compare two items and are otherwise completely oblivious of the universe. The best such deterministic quantile summary to date, due to Greenwald and Khanna (SIGMOD '01), stores at most $O(\frac{1}{\varepsilon}\cdot \log \varepsilon N)$ items, where $N$ is the number of items in the stream. We prove that this space bound is optimal by showing a matching lower bound. Our result thus rules out the possibility of constructing a deterministic comparison-based quantile summary in space $f(\varepsilon)\cdot o(\log N)$, for any function $f$ that does not depend on $N$. As a corollary, we improve the lower bound for biased quantiles, which provide a stronger, relative-error guarantee of $(1\pm \varepsilon)\cdot \phi$, and for other related computational tasks., Comment: 20 pages, 2 figures, major revision of the construction (Sec. 3) and some other parts of the paper
Published: 2019

46. How to glide in Schwarzschild spacetime

Author: Veselý, Vítek and Žofka, Martin
Subjects: General Relativity and Quantum Cosmology, 83C10, 83C55
Abstract: We investigate the motion of extended test objects in the Schwarzschild spacetime, particularly the radial fall of two point masses connected by a massless rod of a length given as a fixed, periodic function of time. We argue that such a model is inappropriate in the most interesting regimes of high and low oscillation frequencies., Comment: 15 pages, 10 figures
Published: 2019
Full Text: View/download PDF

47. Microscopic multiphonon approach to nuclei with a valence hole in the oxygen region

Author: De Gregorio, Giovanni, Knapp, Frantisek, Iudice, Nicola Lo, and Vesely, Petr
Subjects: Nuclear Theory
Abstract: An equation of motion phonon method, developed for even nuclei and recently extended to odd systems with a valence particle, is formulated in the hole-phonon coupling scheme and applied to A=15 and A=21 isobars with a valence hole. The method derives a set of equations which yield an orthonormal basis of states composed of a hole coupled to an orthonormal basis of correlated n-phonon states (n = 0, 1, 2, ...), built of constituent Tamm-Dancoff phonons, describing the excitations of a doubly magic core. The basis is then adopted to solve the full eigenvalue problem. The method is formally exact but lends itself naturally to simplifying approximations. Self-consistent calculations using a chiral Hamiltonian in a space encompassing up to two-phonon and three-phonon basis states in A=21 and A=15 nuclei, respectively, yield full spectra, moments, electromagnetic and beta-decay transition strengths, and electric dipole cross sections. The analysis of the hole-phonon composition of the eigenfunctions helps to clarify the mechanism of excitation of levels and resonances and to understand the reasons for the deviations of the theory from the experiments. Prescriptions for reducing these discrepancies are suggested., Comment: 11 pages, 7 figures, accepted in Phys. Rev. C
Published: 2019

48. Effect of a realistic three-body force on the spectra of medium-mass hypernuclei

Author: Vesely, Petr, De Gregorio, Giovanni, and Pokorny, Jan
Subjects: Nuclear Theory
Abstract: We adopt the Hartree-Fock (HF) method in the proton-neutron-$\Lambda$ (p-n-$\Lambda$) formalism and the nucleon-$\Lambda$ Tamm-Dancoff Approximation (N$\Lambda$ TDA) to study the energy spectra of medium-mass hypernuclei. The formalism is developed for a potential derived from effective field theories which includes explicitly the 3-body $NNN$ forces plus the $YN$ LO potential. The energy spectra of selected medium-mass hypernuclei are presented and their properties discussed. The present calculation is the first step of a project devoted to ab initio studies of hypernuclei in medium and heavy mass regions. This may provide a guide for a better understanding of the $YN$ interactions at momentum scales not accessible in few-body hypernuclei., Comment: 7 pages, 9 figures, accepted in Physica Scripta
Published: 2018
Full Text: View/download PDF

49. Introducing SPAIN (SParse Audio INpainter)

Author: Mokrý, Ondřej, Záviška, Pavel, Rajmic, Pavel, and Veselý, Vítězslav
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Mathematics - Optimization and Control
Abstract: A novel sparsity-based algorithm for audio inpainting is proposed. It is an adaptation of the SPADE algorithm by Kitić et al., originally developed for audio declipping, to the task of audio inpainting. The new SPAIN (SParse Audio INpainter) comes in synthesis and analysis variants. Experiments show that both A-SPAIN and S-SPAIN outperform other sparsity-based inpainting algorithms. Moreover, A-SPAIN performs on a par with the state-of-the-art method based on linear prediction in terms of the SNR, and, for larger gaps, SPAIN is even slightly better in terms of the PEMO-Q psychoacoustic criterion.
Published: 2018
Full Text: View/download PDF

50. Residual Memory Networks: Feed-forward approach to learn long temporal dependencies

Author: Baskar, Murali Karthick, Karafiat, Martin, Burget, Lukas, Vesely, Karel, Grezl, Frantisek, and Cernocky, Jan Honza
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Training deep recurrent neural network (RNN) architectures is complicated by the increased network complexity, which disrupts the learning of higher-order abstractions using deep RNNs. In feed-forward networks, training deep structures is simpler and faster, but learning long-term temporal information is not possible. In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time-delayed connections. The residual connection paves the way to constructing deeper networks by enabling unhindered flow of gradients, and the time-delay units capture temporal information with shared weights. The number of layers in RMN signifies both the hierarchical processing depth and temporal depth. The computational complexity in training RMN is significantly less when compared to deep recurrent networks. RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information. Experimental analysis is done on the AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information. Recognition performance of RMN trained with 300 hours of the Switchboard corpus is compared with various state-of-the-art LVCSR systems. The results indicate that RMN and BRMN gain 6 % and 3.8 % relative improvement over LSTM and BLSTM networks.
Published: 2018

100 results on '"Vesely P"'
