100 results on '"Vesely P"'
Search Results
2. Phase Composition of AlTiNbMoV, AlTiNbTaZr and AlTiNbMoCr Refractory Complex Concentrated Alloys: A Correlation of Predictions and Experiment
- Author
Kozlík, Jiří, Lukáč, František, Luna, Mariano Casas, Šalata, Kristián, Stráský, Josef, Veselý, Jozef, Jača, Eliška, and Chráska, Tomáš
- Subjects
Condensed Matter - Materials Science
- Abstract
Designing complex concentrated alloys (CCA), also known as high entropy alloys (HEA), requires reliable and accessible thermodynamic predictions due to the vast space of possible compositions. Numerous semiempirical parameters have been developed for phase predictions over the years. However, in this paper we show that none of these parameters is a robust indicator of phase content in various refractory CCA. CALPHAD proved to be a more powerful tool for phase predictions; however, the predictions face several limitations. AlTiNbMoV, AlTiNbTaZr and AlTiNbMoCr alloys were prepared using blended elemental powder metallurgy. Their phase and chemical composition were investigated by means of scanning electron microscopy, energy-dispersive X-ray spectroscopy and X-ray diffraction. Apart from the minor contamination phases (Al2O3 and Ti(C,N,O)), AlTiNbMoV and AlTiNbMoCr exhibited single-phase solid solution microstructure at the homogenization temperature of 1400 °C, while Al3Zr5 based intermetallics were present in the AlTiNbTaZr alloy. None of the simple semiempirical parameters was able to predict phase content correctly in all three alloys. Predictions by CALPHAD (TCHEA4 database) were able to predict the phases with limited accuracy only. A critical limitation of the TCHEA4 database is that only binary and ternary phase diagrams are assessed and some more complex phases cannot be predicted., Comment: 17 pages, 15 figures, 32 references. Postprint version published in Metallurgical and Materials Transactions A (2024)
- Published
- 2024
3. Nuclear shape / phase transitions in the N = 40, 60, 90 regions
- Author
Petrellis, Dimitrios, Prášek, Adam, Alexa, Petr, Bonatsos, Dennis, Thiamová, Gabriela, and Veselý, Petr
- Subjects
Nuclear Theory
- Abstract
We investigate the isotopes of Se, Zr, Mo and Nd in the regions with N = 40, 60 and 90, where a first-order shape / phase transition, from spherical to deformed, can be observed. The signs of phase transitional behavior become evident by examining structure indicators, such as certain energy ratios and B(E2) transition rates and, in particular, how they evolve with neutron number. Microscopic mean-field calculations using the Skyrme-Hartree-Fock + Bardeen-Cooper-Schrieffer framework also reveal structural changes when considering the evolution of the resulting potential energy curves as functions of deformation. Finally, macroscopic calculations, using the Algebraic Collective Model, specifically for $^{74}$Se, $^{102}$Mo and $^{150}$Nd, after fitting its parameters to experimental spectra, result in potentials that resemble some of the potentials proposed in the framework of the Bohr Hamiltonian to describe shape transitions in nuclei., Comment: 5 pages, 4 figures, 7th Workshop of the Hellenic Institute of Nuclear Physics (HINPw7)
- Published
- 2024
4. Phase transitions in N = 40, 60 and 90 nuclei
- Author
Prášek, A., Alexa, P., Bonatsos, D., Thiamová, G., Petrellis, D., and Veselý, P.
- Subjects
Nuclear Theory
- Abstract
In this paper we focus on three mass regions where first-order phase transitions occur, namely for $N=40$, 60 and 90. We investigate four isotopic chains (Se, Zr, Mo and Nd) in the framework of microscopic Skyrme-Hartree-Fock + Bardeen-Cooper-Schrieffer calculations for 15 different parametrizations. The microscopic calculations show the typical behavior expected for first-order phase transitions. To find the best candidate for the critical point phase transition we propose new microscopic position and occupation indices calculated for positive-parity and negative-parity proton and neutron single-quasiparticle states around the Fermi level. The microscopic calculations are completed by macroscopic calculations within the Algebraic Collective Model (ACM), and compared to the experimental data for $^{74}$Se, $^{102}$Mo and $^{150}$Nd, considered to be the best candidates for the critical point nuclei., Comment: accepted in Phys. Rev. C
- Published
- 2024
5. Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking
- Author
Beňová, Ivana, Košecká, Jana, Gregor, Michal, Tamajka, Martin, Veselý, Marcel, and Šimko, Marián
- Subjects
Computer Science - Computation and Language; Computer Science - Computer Vision and Pattern Recognition
- Abstract
The dominant probing approaches rely on the zero-shot performance of image-text matching tasks to gain a finer-grained understanding of the representations learned by recent multimodal image-language transformer models. The evaluation is carried out on carefully curated datasets focusing on counting, relations, attributes, and others. This work introduces an alternative probing strategy called guided masking. The proposed approach ablates different modalities using masking and assesses the model's ability to predict the masked word with high accuracy. We focus on studying multimodal models that consider regions of interest (ROI) features obtained by object detectors as input tokens. We probe the understanding of verbs using guided masking on ViLBERT, LXMERT, UNITER, and VisualBERT and show that these models can predict the correct verb with high accuracy. This contrasts with previous conclusions drawn from image-text matching probing techniques that frequently fail in situations requiring verb understanding. The code for all experiments will be publicly available https://github.com/ivana-13/guided_masking., Comment: 9 pages of text, 11 pages total, 7 figures, 3 tables, preprint
- Published
- 2024
6. A Wild Bootstrap Procedure for the Identification of Optimal Groups in Singular Spectrum Analysis
- Author
Movahedifar, Maryam, Preusse, Friederike, Vesely, Anna, Ochieng, Daniel, and Dickhaus, Thorsten
- Subjects
Statistics - Methodology; Statistics - Applications; 94A12, 62F40, 62J15
- Abstract
A key step in separating signal from noise using Singular Spectrum Analysis (SSA) is grouping, which is often done subjectively. In this article, a method that enables the identification of statistically significant groups for the grouping step in SSA is presented. The proposed procedure provides a more objective and reliable approach for separating noise from the main signal in SSA. We utilize the w-correlation and test whether it is close or equal to zero. A wild bootstrap approach is used to determine the distribution of the w-correlation. To identify an ideal number of groupings which leads to almost perfect separation of the noise and signal, a given number of groups are tested, necessitating accounting for multiplicity. The effectiveness of our method in identifying the best group is demonstrated through a simulation study. Furthermore, we have applied the approach to real-world data in the context of neuroimaging. This research provides a valuable contribution to the field of SSA and offers important insights into the statistical properties of the w-correlation distribution. The results obtained from the simulation studies and analysis of real-world data demonstrate the effectiveness of the proposed approach in identifying the best groupings for SSA., Comment: We have 22 pages and 5 figures
- Published
- 2024
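Illustrative note on result 6: the w-correlation named in the abstract is a standard SSA quantity, a correlation between two reconstructed series in which each time point is weighted by the number of times it appears in the trajectory matrix. The sketch below shows only that textbook definition. It is not the authors' code and does not implement their wild bootstrap test; the function names `w_weights` and `w_correlation` and the parameter `window` are hypothetical names chosen for this example.

```python
import math

def w_weights(n, window):
    """Weights w_i = min(i, L*, n - i + 1) for i = 1..n, with L* = min(L, K)."""
    k = n - window + 1                 # number of lagged vectors
    l_star = min(window, k)
    return [min(i + 1, l_star, n - i) for i in range(n)]

def w_correlation(x, y, window):
    """Weighted correlation between two series of equal length."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    w = w_weights(len(x), window)

    def inner(a, b):
        # Weighted inner product (a, b)_w = sum_i w_i * a_i * b_i
        return sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))

    return inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))
```

A value near zero suggests that the two components are well separated; a value near one in absolute terms suggests they belong in the same group.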
7. Structural Properties of Search Trees with 2-way Comparisons
- Author
Atalig, Sunny, Chrobak, Marek, Mousavian, Erfan, Sgall, Jiri, and Vesely, Pavel
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
Optimal 3-way comparison search trees (3WCST's) can be computed using standard dynamic programming in time O(n^3), and this can be further improved to O(n^2) by taking advantage of the Monge property. In contrast, the fastest algorithm in the literature for computing optimal 2-way comparison search trees (2WCST's) runs in time O(n^4). To shed light on this discrepancy, we study structure properties of 2WCST's. On one hand, we show some new threshold bounds involving key weights that can be helpful in deciding which type of comparison should be at the root of the optimal tree. On the other hand, we also show that the standard techniques for speeding up dynamic programming (the Monge property / quadrangle inequality) do not apply to 2WCST's.
- Published
- 2023
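Illustrative note on result 7: the "standard dynamic programming in time O(n^3)" for 3-way comparison search trees can be sketched as the classic optimal binary search tree recurrence. The version below is a simplification that assumes only successful queries (every query matches a key) and counts the root as one comparison. The function name `optimal_bst_cost` is hypothetical, and the sketch covers neither the O(n^2) Monge-property speed-up nor the 2-way comparison model studied in the paper.

```python
def optimal_bst_cost(weights):
    """Minimum weighted number of comparisons for keys with the given weights.

    cost[i][j] is the optimal cost for the keys i..j-1 (sorted order).
    Choosing key r as the root adds one comparison for every key in the
    range, hence the `total` term.
    """
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):            # O(n) range lengths
        for i in range(0, n - length + 1):    # O(n) start positions
            j = i + length
            total = prefix[j] - prefix[i]
            cost[i][j] = total + min(         # O(n) candidate roots
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )
    return cost[0][n]
```

The three nested loops over range length, start position, and root give the cubic running time mentioned in the abstract.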
8. BUT CHiME-7 system description
- Author
Karafiát, Martin, Veselý, Karel, Szöke, Igor, Mošner, Ladislav, Beneš, Karel, Witkowski, Marcin, Barchi, Germán, and Pepino, Leonardo
- Subjects
Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krakow and University of Buenos Aires on the development of Automatic Speech Recognition systems for the CHiME-7 Challenge. We train and evaluate various end-to-end models with several toolkits. We heavily relied on Guided Source Separation (GSS) to convert multi-channel audio to single channel. The ASR is leveraging speech representations from models pre-trained by self-supervised learning, and we do a fusion of several ASR systems. In addition, we modified external data from the LibriSpeech corpus to become a close domain and added it to the training. Our efforts were focused on the far-field acoustic robustness sub-track of Task 1 - Distant Automatic Speech Recognition (DASR), our systems use oracle segmentation., Comment: 6 pages, Chime-7 challenge 2023
- Published
- 2023
9. Confidence bounds for the true discovery proportion based on the exact distribution of the number of rejections
- Author
Preusse, Friederike, Vesely, Anna, and Dickhaus, Thorsten
- Subjects
Statistics - Methodology
- Abstract
In multiple hypotheses testing it has become widely popular to make inference on the true discovery proportion (TDP) of a set $\mathcal{M}$ of null hypotheses. This approach is useful for several application fields, such as neuroimaging and genomics. Several procedures to compute simultaneous lower confidence bounds for the TDP have been suggested in prior literature. Simultaneity allows for post-hoc selection of $\mathcal{M}$. If sets of interest are specified a priori, it is possible to gain power by removing the simultaneity requirement. We present an approach to compute lower confidence bounds for the TDP if the set of null hypotheses is defined a priori. The proposed method determines the bounds using the exact distribution of the number of rejections based on a step-up multiple testing procedure under independence assumptions. We assess robustness properties of our procedure and apply it to real data from the field of functional magnetic resonance imaging.
- Published
- 2023
10. Fully Scalable MPC Algorithms for Clustering in High Dimension
- Author
Czumaj, Artur, Gao, Guichen, Jiang, Shaofeng H. -C., Krauthgamer, Robert, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
We design new parallel algorithms for clustering in high-dimensional Euclidean spaces. These algorithms run in the Massively Parallel Computation (MPC) model, and are fully scalable, meaning that the local memory in each machine may be $n^{\sigma}$ for arbitrarily small fixed $\sigma>0$. Importantly, the local memory may be substantially smaller than the number of clusters $k$, yet all our algorithms are fast, i.e., run in $O(1)$ rounds. We first devise a fast MPC algorithm for $O(1)$-approximation of uniform facility location. This is the first fully-scalable MPC algorithm that achieves $O(1)$-approximation for any clustering problem in general geometric setting; previous algorithms only provide $\mathrm{poly}(\log n)$-approximation or apply to restricted inputs, like low dimension or small number of clusters $k$; e.g. [Bhaskara and Wijewardena, ICML'18; Cohen-Addad et al., NeurIPS'21; Cohen-Addad et al., ICML'22]. We then build on this facility location result and devise a fast MPC algorithm that achieves $O(1)$-bicriteria approximation for $k$-Median and for $k$-Means, namely, it computes $(1+\varepsilon)k$ clusters of cost within $O(1/\varepsilon^2)$-factor of the optimum for $k$ clusters. A primary technical tool that we introduce, and may be of independent interest, is a new MPC primitive for geometric aggregation, namely, computing for every data point a statistic of its approximate neighborhood, for statistics like range counting and nearest-neighbor search. Our implementation of this primitive works in high dimension, and is based on consistent hashing (aka sparse partition), a technique that was recently used for streaming algorithms [Czumaj et al., FOCS'22].
- Published
- 2023
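Illustrative note on result 10: the uniform facility location objective that the abstract approximates is, in its standard form, a uniform opening cost per facility plus the distance from each client to its nearest open facility. The sketch below only evaluates that objective for a given solution in Euclidean space. It is not an MPC algorithm and not the authors' method; the names `ufl_cost` and `opening_cost` are hypothetical.

```python
import math

def ufl_cost(clients, facilities, opening_cost):
    """Objective value of a uniform facility location solution.

    clients, facilities: sequences of points (tuples of equal dimension).
    opening_cost: the same cost is charged for every open facility.
    """
    if not facilities:
        raise ValueError("at least one facility must be open")
    # Each client connects to its nearest open facility.
    connection = sum(
        min(math.dist(c, f) for f in facilities) for c in clients
    )
    return opening_cost * len(facilities) + connection
```

For example, two clients at (0, 0) and (3, 4) served by a single facility at (0, 0) with opening cost 2 incur a total cost of 2 + 0 + 5 = 7.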
11. Selective inference for fMRI cluster-wise analysis, issues, and recommendations for critical vector selection: A comment on Blain et al
- Author
Andreella, Angela, Vesely, Anna, Weeda, Wouter, and Goeman, Jelle
- Subjects
Statistics - Applications
- Abstract
Two permutation-based methods for simultaneous inference on the proportion of active voxels in cluster-wise brain imaging analysis have recently been published: Notip (Blain et al. 2022) and pARI (Andreella et al. 2023). Both rely on the definition of a critical vector of ordered p-values, chosen from a family of candidate vectors, but differ in how the family is defined: computed from randomization of external data for Notip and determined a priori for pARI. These procedures were compared to other proposals in the literature, but an extensive comparison between the two methods is missing due to their parallel publication. We provide such a comparison and find that pARI outperforms Notip if both methods are applied under their recommended settings. However, each method carries different advantages and drawbacks.
- Published
- 2023
12. Self-consistent many-body approach to the electroproduction of hypernuclei
- Author
Bydžovský, P., Denisova, D., Petrellis, D., Skoupil, D., Veselý, P., De Gregorio, G., Knapp, F., and Iudice, N. Lo
- Subjects
Nuclear Theory
- Abstract
The electroproduction of selected $p$- and $sd$-shell hypernuclei was studied within a many-body approach using realistic interactions between the constituent baryons. The cross sections were computed in distorted-wave impulse approximation using two elementary amplitudes for the electroproduction of the $\Lambda$ hyperon. The structure of the hypernuclei was investigated within the framework of the self-consistent $\Lambda$-nucleon Tamm-Dancoff approach and its extension known as the $\Lambda$-nucleon equation of motion phonon method. Use was made of the NNLOsat chiral potential plus the effective Nijmegen-F YN interaction. The method was first implemented on light nuclei for studying the available experimental data and establishing a relation to other approaches. After this proof test, it was adopted for predicting the electroproduction cross section of the hypernuclei $^{40}_{~\Lambda}$K and $^{48}_{~\Lambda}$K in view of the E12-15-008 experiment in preparation at JLab. On the ground of these predictions, appreciable effects on the spectra are expected to be induced by the YN interaction., Comment: 10 pages, 7 figures (version 2); 11 pages, 9 figures (v1)
- Published
- 2023
13. Finding the Optimal Currency Composition of Foreign Exchange Reserves with a Quantum Computer
- Author
Vesely, Martin
- Subjects
Economics - General Economics; Quantitative Finance - Computational Finance; Quantum Physics
- Abstract
Portfolio optimization is an inseparable part of strategic asset allocation at the Czech National Bank. Quantum computing is a new technology offering algorithms for that problem. The capabilities and limitations of quantum computers with regard to portfolio optimization should therefore be investigated. In this paper, we focus on applications of quantum algorithms to dynamic portfolio optimization based on the Markowitz model. In particular, we compare algorithms for universal gate-based quantum computers (the QAOA, the VQE and Grover adaptive search), single-purpose quantum annealers, the classical exact branch and bound solver and classical heuristic algorithms (simulated annealing and genetic optimization). To run the quantum algorithms we use the IBM Quantum™ gate-based quantum computer. We also employ the quantum annealer offered by D-Wave. We demonstrate portfolio optimization on finding the optimal currency composition of the CNB's FX reserves. A secondary goal of the paper is to provide staff of central banks and other financial market regulators with literature on quantum optimization algorithms, because financial firms are active in finding possible applications of quantum computing.
- Published
- 2023
14. Procrustes-based distances for exploring between-matrices similarity
- Author
Andreella, Angela, De Santis, Riccardo, Vesely, Anna, and Finos, Livio
- Subjects
Statistics - Applications
- Abstract
The statistical shape analysis called Procrustes analysis minimizes the distance between matrices by similarity transformations. The method returns a set of optimal orthogonal matrices, which project each matrix into a common space. This manuscript presents two types of distances derived from Procrustes analysis for exploring between-matrices similarity. The first one focuses on the residuals from the Procrustes analysis, i.e., the residual-based distance metric. In contrast, the second one exploits the fitted orthogonal matrices, i.e., the rotational-based distance metric. Thanks to these distances, similarity-based techniques such as the multidimensional scaling method can be applied to visualize and explore patterns and similarities among observations. The proposed distances result in being helpful in functional magnetic resonance imaging (fMRI) data analysis. The brain activation measured over space and time can be represented by a matrix. The proposed distances applied to a sample of subjects -- i.e., matrices -- revealed groups of individuals sharing patterns of neural brain activation.
- Published
- 2023
15. Extendability of continuous quasiconvex functions from subspaces
- Author
De Bernardi, Carlo Alberto and Veselý, Libor
- Subjects
Mathematics - Functional Analysis
- Abstract
Let $Y$ be a subspace of a topological vector space $X$, and $A\subset X$ an open convex set that intersects $Y$. We say that the property $(QE)$ [property $(CE)$] holds if every continuous quasiconvex [continuous convex] function on $A\cap Y$ admits a continuous quasiconvex [continuous convex] extension defined on $A$. We study relations between $(QE)$ and $(CE)$ properties, proving that $(QE)$ always implies $(CE)$ and that, under suitable hypotheses (satisfied for example if $X$ is a normed space and $Y$ is a closed subspace of $X$), the two properties are equivalent. By combining the previous implications between $(QE)$ and $(CE)$ properties with known results about the property $(CE)$, we obtain some new positive results about the extension of quasiconvex continuous functions. In particular, we generalize the results contained in \cite{DEQEX} to the infinite-dimensional separable case. Moreover, we also immediately obtain existence of examples in which $(QE)$ does not hold.
- Published
- 2022
16. Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator
- Author
Prasad, Amrutha, Zuluaga-Gomez, Juan, Motlicek, Petr, Sarfjoo, Saeed, Nigmatulina, Iuliia, and Vesely, Karel
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
This paper describes a simple yet efficient repetition-based modular system for speeding up the training of air-traffic controllers (ATCos). For example, a human pilot is still required in EUROCONTROL's ESCAPE lite simulator (see https://www.eurocontrol.int/simulator/escape) during ATCo training. However, this need can be met by an automatic system that acts as a pilot. In this paper, we aim to develop and integrate a pseudo-pilot agent into the ATCo training pipeline by merging diverse artificial intelligence (AI) powered modules. The system understands the voice communications issued by the ATCo and, in turn, generates a spoken prompt that follows the pilot's phraseology in response to the initial communication. Our system mainly relies on open-source AI tools and air traffic control (ATC) databases, which demonstrates its simplicity and ease of replicability. The overall pipeline is composed of the following: (1) a submodule that receives and pre-processes the input stream of raw audio; (2) an automatic speech recognition (ASR) system that transforms audio into a sequence of words; (3) a high-level ATC-related entity parser, which extracts relevant information from the communication, i.e., callsigns and commands; and finally, (4) a speech synthesizer submodule that generates responses based on the high-level ATC entities previously extracted. Overall, we show that this system could pave the way toward developing a real proof-of-concept pseudo-pilot system, speeding up the training of ATCos while drastically reducing its overall cost., Comment: Presented at Sesar Innovation Days 2022. https://www.sesarju.eu/sesarinnovationdays
- Published
- 2022
17. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications
- Author
Zuluaga-Gomez, Juan, Veselý, Karel, Szöke, Igor, Blatt, Alexander, Motlicek, Petr, Kocour, Martin, Rigault, Mickael, Choukri, Khalid, Prasad, Amrutha, Sarfjoo, Seyyed Saeed, Nigmatulina, Iuliia, Cevenini, Claudia, Kolčárek, Pavel, Tart, Allan, Černocký, Jan, and Klakow, Dietrich
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried between an air traffic controller (ATCO) and pilots via very-high frequency radio channels. In order to incorporate these novel technologies into ATC (low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging ATC field, which has lagged behind due to lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, signal-to-noise ratio estimate and English language detection score per sample. Both available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset from the original test set corpus, that we are offering for free at https://www.atco2.org/data. 
We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community., Comment: Manuscript under review; The code is available at: https://github.com/idiap/atco2-corpus
- Published
- 2022
18. Comparative analysis of formalisms and performances of three different beyond mean-field approaches
- Author
Knapp, František, Papakonstantinou, Panagiota, Veselý, Petr, De Gregorio, Giovanni, Herko, Jakub, and Iudice, Nicola Lo
- Subjects
Nuclear Theory
- Abstract
We investigate the differences and analogies between the equation of motion phonon method (EMPM) and second Tamm-Dancoff and random-phase approximations (STDA and SRPA) paying special attention to the problem of spurious center-of-mass (c.m.) admixtures. In order to compare them on an equal footing, we perform self-consistent calculations of the multipole strength distributions in selected doubly magic nuclei within a space including up to two-particle-two-hole (2p-2h) basis states using the UCOM two-body intrinsic Hamiltonian and we explore the tools each approach supplies for removing the spurious c.m. admixtures. We find that the EMPM and STDA yield exactly the same results when the same intrinsic Hamiltonian is used and the coupling of the Hartree-Fock state with the 2p-2h space is neglected, but, unlike STDA and SRPA, the EMPM offers the possibility to completely remove c.m. admixtures., Comment: 13 pages, 17 figures
- Published
- 2022
19. Post-selection Inference in Multiverse Analysis (PIMA): an inferential framework based on the sign flipping score test
- Author
Girardi, Paolo, Vesely, Anna, Lakens, Daniël, Altoè, Gianmarco, Pastore, Massimiliano, Calcagnì, Antonio, and Finos, Livio
- Subjects
Statistics - Methodology; Statistics - Applications; 62F03; G.3
- Abstract
When analyzing data researchers make some decisions that are either arbitrary, based on subjective beliefs about the data generating process, or for which equally justifiable alternative choices could have been made. This wide range of data-analytic choices can be abused, and has been one of the underlying causes of the replication crisis in several fields. Recently, the introduction of multiverse analysis provides researchers with a method to evaluate the stability of the results across reasonable choices that could be made when analyzing data. Multiverse analysis is confined to a descriptive role, lacking a proper and comprehensive inferential procedure. Recently, specification curve analysis adds an inferential procedure to multiverse analysis, but this approach is limited to simple cases related to the linear model, and only allows researchers to infer whether at least one specification rejects the null hypothesis, but not which specifications should be selected. In this paper we present a Post-selection Inference approach to Multiverse Analysis (PIMA) which is a flexible and general inferential approach that accounts for all possible models, i.e., the multiverse of reasonable analyses. The approach allows for a wide range of data specifications (i.e. pre-processing) and any generalized linear model; it allows testing the null hypothesis of a given predictor not being associated with the outcome, by merging information from all reasonable models of multiverse analysis, and provides strong control of the family-wise error rate such that it allows researchers to claim that the null-hypothesis can be rejected for each specification that shows a significant effect. The inferential proposal is based on a conditional resampling procedure. To be continued..., Comment: 37 pages, 2 figures
- Published
- 2022
20. Fermi motion effects in electroproduction of hypernuclei
- Author
Bydžovský, P., Denisova, D., Skoupil, D., and Veselý, P.
- Subjects
Nuclear Theory
- Abstract
In a previous analysis of electroproduction of hypernuclei the cross sections were calculated in distorted-wave impulse approximation where the momentum of the initial proton in the nucleus was set to zero (the frozen-proton approximation). In this paper we go beyond this approximation assuming a non zero effective proton momentum due to proton Fermi motion inside of the target nucleus discussing also other kinematical effects. To this end we have derived a more general form of the two-component elementary electroproduction amplitude (Chew-Goldberger-Low-Nambu like) which allows its use in a general reference frame moving with respect to the nucleus-rest frame. The effects of Fermi motion were found to depend on kinematics and elementary amplitudes. The largest effects were observed in the contributions from the longitudinal and interference parts of the cross sections. The extension of the calculations beyond the frozen-proton approximation improved the agreement of predicted theoretical cross sections with experimental data and once we assumed the optimum on-shell approximation, we were able to remove an inconsistency which was previously present in the calculations., Comment: 20 pages, 7 figures, 5 tables
- Published
- 2022
21. Resampling-Based Multisplit Inference for High-Dimensional Regression
- Author
Vesely, Anna, Goeman, Jelle J., and Finos, Livio
- Subjects
Statistics - Methodology
- Abstract
We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make confidence statements on relevant predictor variables. The method constructs permutation test statistics for any individual hypothesis by means of repeated splits of the data and a variable selection technique; then it defines a test for any subset by suitably aggregating its variables' test statistics. The resulting procedure is extremely flexible, as it allows different selection techniques and several combining functions. We present it in two ways: an exact method and an approximate one, that requires less memory usage and shorter computation time, and can be scaled up to higher dimensions. We illustrate the performance of the method with simulations and the analysis of real gene expression data., Comment: 31 pages (16 pages main, 15 pages appendix), 12 figures
- Published
- 2022
22. Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information
- Author
Blatt, Alexander, Kocour, Martin, Veselý, Karel, Szöke, Igor, and Klakow, Dietrich
- Subjects
Computer Science - Computation and Language; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Air traffic control (ATC) relies on communication via speech between pilot and air-traffic controller (ATCO). The call-sign, as a unique identifier for each flight, is used by the ATCO to address a specific pilot. Extracting the call-sign from the communication is a challenge because of the noisy ATC voice channel and the additional noise introduced by the receiver. A low signal-to-noise ratio (SNR) in the speech leads to high word error rate (WER) transcripts. We propose a new call-sign recognition and understanding (CRU) system that addresses this issue. The recognizer is trained to identify call-signs in noisy ATC transcripts and convert them into the standard International Civil Aviation Organization (ICAO) format. By incorporating surveillance information, we can multiply the call-sign accuracy (CSA) by up to a factor of four. The introduced data augmentation further improves performance on high WER transcripts and allows the adaptation of the model to unseen airspaces., Comment: Accepted by ICASSP 2022
- Published
- 2022
23. Streaming Facility Location in High Dimension via Geometric Hashing
- Author
Czumaj, Artur, Filtser, Arnold, Jiang, Shaofeng H. -C., Krauthgamer, Robert, Veselý, Pavel, and Yang, Mingwei
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In Euclidean Uniform Facility Location (UFL), the input is a set of clients in $\mathbb{R}^d$ and the goal is to place facilities to serve them, so as to minimize the total cost of opening facilities plus connecting the clients. We study the setting of dynamic geometric streams, where the clients are presented as a sequence of insertions and deletions of points in the grid $\{1,\ldots,\Delta\}^d$, and we focus on the \emph{high-dimensional regime}, where the algorithm must use space polynomial in $d\cdot\log\Delta$. We present a new algorithmic framework, based on importance sampling, for $O(1)$-approximation of UFL using only $\mathrm{poly}(d\cdot\log\Delta)$ space. This framework is easy to implement in two passes, one for sampling points and the other for estimating their contribution. Over random-order streams, we can extend this to one pass by using the two halves of the stream separately. Our main result, for arbitrary-order streams, computes $O(d / \log d)$-approximation in one pass by combining the two passes differently. This improves upon previous algorithms that either need space $\exp(d)$ or only guarantee $O(d\cdot\log^2\Delta)$-approximation, and therefore our algorithms for high dimension are the first to avoid the $O(\log\Delta)$-factor in approximation that is inherent to the widely-used quadtree decomposition. Our improvement is achieved by employing a geometric hashing scheme that maps points in $\mathbb{R}^d$ into buckets of bounded diameter, with the key property that every point set of small-enough diameter is hashed into few buckets. By applying an alternative bound for this hashing, we also obtain an $O(1 / \epsilon)$-approximation in one pass, using larger but still sublinear space $O(n^{\epsilon})$ where $n$ is the number of clients. We complement our results by showing $1.085$-approximation requires space exponential in $\mathrm{poly}(d\cdot\log\Delta)$., Comment: The abstract is shortened to meet the length constraint of arXiv
- Published
- 2022
24. Application of Quantum Computers in Foreign Exchange Reserves Management
- Author
-
Veselý, Martin
- Subjects
Economics - General Economics, Quantitative Finance - Computational Finance, Quantum Physics
- Abstract
The main purpose of this article is to evaluate possible applications of quantum computers in foreign exchange reserves management. The capabilities of quantum computers are demonstrated by means of risk measurement using the quantum Monte Carlo method and portfolio optimization using a linear equations system solver (the Harrow-Hassidim-Lloyd algorithm) and quadratic unconstrained binary optimization (the quantum approximate optimization algorithm). All demonstrations are carried out on the cloud-based IBM Quantum(TM) platform. Despite the fact that real-world applications are impossible under the current state of development of quantum computers, it is proven that in principle it will be possible to apply such computers in FX reserves management in the future. In addition, the article serves as an introduction to quantum computing for the staff of central banks and financial market supervisory authorities.
- Published
- 2022
25. Spectroscopic properties of 4He within a multiphonon approach
- Author
-
De Gregorio, G., Knapp, F., Iudice, N. Lo, and Veselý, P.
- Subjects
Nuclear Theory
- Abstract
Bulk and spectroscopic properties of 4He are studied within an equation of motion phonon method. Such a method generates a basis of n-phonon (n = 0, 1, 2, 3...) states composed of tensor products of particle-hole Tamm-Dancoff phonons and then solves the full eigenvalue problem in such a basis. The method does not rely on any approximation and is free of any contamination induced by the center of mass, by virtue of a procedure exploiting the singular value decomposition of rectangular matrices. Two potentials, both derived from the chiral effective field theory, are adopted in a self-consistent calculation performed within a space including up to three phonons. The latter basis states are treated under a simplifying assumption. A comparative analysis with the experimental data points out the different performances of the two potentials. It also shows that the calculation succeeds only partially in the description of the spectroscopic properties and suggests a recipe for further improvements., Comment: Accepted for publication in Phys. Rev. C
- Published
- 2022
26. Improved Approximation Guarantees for Shortest Superstrings using Cycle Classification by Overlap to Length Ratios
- Author
-
Englert, Matthias, Matsakis, Nicolaos, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In the Shortest Superstring problem, we are given a set of strings and we are asking for a common superstring, which has the minimum number of characters. The Shortest Superstring problem is NP-hard and several constant-factor approximation algorithms are known for it. Of particular interest is the GREEDY algorithm, which repeatedly merges two strings of maximum overlap until a single string remains. The GREEDY algorithm, being simpler than other well-performing approximation algorithms for this problem, has attracted attention since the 1980s and is commonly used in practical applications. Tarhio and Ukkonen (TCS 1988) conjectured that GREEDY gives a 2-approximation. In a seminal work, Blum, Jiang, Li, Tromp, and Yannakakis (STOC 1991) proved that the superstring computed by GREEDY is a 4-approximation, and this upper bound was improved to 3.5 by Kaplan and Shafrir (IPL 2005). We show that the approximation guarantee of GREEDY is at most $(13+\sqrt{57})/6 \approx 3.425$, making the first progress on this question since 2005. Furthermore, we prove that the Shortest Superstring can be approximated within a factor of $(37+\sqrt{57})/18\approx 2.475$, improving slightly upon the currently best $2\frac{11}{23}$-approximation algorithm by Mucha (SODA 2013).
- Published
- 2021
27. Distill: Domain-Specific Compilation for Cognitive Models
- Author
-
Vesely, Jan, Pothukuchi, Raghavendra Pradyumna, Joshi, Ketaki, Gupta, Samyak, Cohen, Jonathan D., and Bhattacharjee, Abhishek
- Subjects
Computer Science - Programming Languages
- Abstract
This paper discusses our proposal and implementation of Distill, a domain-specific compilation tool based on LLVM to accelerate cognitive models. Cognitive models explain the process of cognitive function and offer a path to human-like artificial intelligence. However, cognitive modeling is laborious, requiring composition of many types of computational tasks, and suffers from poor performance as it relies on high-level languages like Python. In order to continue enjoying the flexibility of Python while achieving high performance, Distill uses domain-specific knowledge to compile Python-based cognitive models into LLVM IR, carefully stripping away features like dynamic typing and memory management that add overheads to the actual model. As we show, this permits significantly faster model execution. We also show that the code so generated enables using classical compiler data flow analysis passes to reveal properties about data flow in cognitive models that are useful to cognitive scientists. Distill is publicly available, is being used by researchers in cognitive science, and has led to patches that are currently being evaluated for integration into mainline LLVM., Comment: 11 pages, 7 figures
- Published
- 2021
28. Quality Control Methodology for Simulation Models of Computer Network Protocols
- Author
-
Veselý, Vladimír and Zavřel, Jan
- Subjects
Computer Science - Networking and Internet Architecture, Computer Science - Performance
- Abstract
This paper summarizes know-how about modeling and simulation of computer networking protocols we contributed to the OMNeT++ community. We propose a methodology aiming to set a reliable ground truth for the quality of simulation models of networking protocols. We demonstrate the application of this methodology on our EIGRP source code pull-requested to the INET framework., Comment: Published in: M. Marek, G. Nardini, V. Vesely (Eds.), Proceedings of the 8th OMNeT++ Community Summit, Virtual Summit, September 8-10, 2021
- Published
- 2021
29. Proceedings of the 8th OMNeT++ Community Summit, Virtual Summit, September 8-10, 2021
- Author
-
Marek, Marcel, Nardini, Giovanni, and Veselý, Vladimír
- Subjects
Computer Science - Networking and Internet Architecture, Computer Science - Performance
- Abstract
These are the Proceedings of the 8th OMNeT++ Community Summit, which was held virtually on September 8-10, 2021.
- Published
- 2021
30. Improved Analysis of Online Balanced Clustering
- Author
-
Bienkowski, Marcin, Böhm, Martin, Koutecký, Martin, Rothvoß, Thomas, Sgall, Jiří, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In the online balanced graph repartitioning problem, one has to maintain a clustering of $n$ nodes into $\ell$ clusters, each having $k = n / \ell$ nodes. During runtime, an online algorithm is given a stream of communication requests between pairs of nodes: an inter-cluster communication costs one unit, while the intra-cluster communication is free. An algorithm can change the clustering, paying unit cost for each moved node. This natural problem admits a simple $O(\ell^2 \cdot k^2)$-competitive algorithm COMP, whose performance is far apart from the best known lower bound of $\Omega(\ell \cdot k)$. One of the open questions is whether the dependency on $\ell$ can be made linear; this question is of practical importance as in the typical datacenter application where virtual machines are clustered on physical servers, $\ell$ is of several orders of magnitude larger than $k$. We answer this question affirmatively, proving that a simple modification of COMP is $(\ell \cdot 2^{O(k)})$-competitive. On the technical level, we achieve our bound by translating the problem to a system of linear integer equations and using Graver bases to show the existence of a ``small'' solution.
- Published
- 2021
31. Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems
- Author
-
Zuluaga-Gomez, Juan, Nigmatulina, Iuliia, Prasad, Amrutha, Motlicek, Petr, Veselý, Karel, Kocour, Martin, and Szöke, Igor
- Subjects
Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Air traffic management and specifically air-traffic control (ATC) rely mostly on voice communications between Air Traffic Controllers (ATCos) and pilots. In most cases, these voice communications follow a well-defined grammar that could be leveraged in Automatic Speech Recognition (ASR) technologies. The callsign used to address an airplane is an essential part of all ATCo-pilot communications. We propose a two-step approach to add contextual knowledge during semi-supervised training to reduce the ASR system error rates at recognizing the part of the utterance that contains the callsign. Initially, we represent in a WFST the contextual knowledge (i.e. air-surveillance data) of an ATCo-pilot communication. Then, during Semi-Supervised Learning (SSL) the contextual knowledge is added by second-pass decoding (i.e. lattice re-scoring). Results show that `unseen domains' (e.g. data from airports not present in the supervised training data) are further aided by contextual SSL when compared to standalone SSL. For this task, we introduce the Callsign Word Error Rate (CA-WER) as an evaluation metric, which only assesses ASR performance of the spoken callsign in an utterance. We obtained a 32.1% CA-WER relative improvement applying SSL with an additional 17.5% CA-WER improvement by adding contextual knowledge during SSL on a challenging ATC-based test set gathered from LiveATC., Comment: Presented at: Interspeech conference 2021 (Brno, Czechia, August 30 - September 3)
- Published
- 2021
32. Detecting English Speech in the Air Traffic Control Voice Communication
- Author
-
Szoke, Igor, Kesiraju, Santosh, Novotny, Ondrej, Kocour, Martin, Vesely, Karel, and Cernocky, Jan "Honza"
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
We launched a community platform for collecting the ATC speech world-wide in the ATCO2 project. Filtering out unseen non-English speech is one of the main components in the data processing pipeline. The proposed English Language Detection (ELD) system is based on the embeddings from a Bayesian subspace multinomial model. It is trained on the word confusion network from an ASR system. It is robust, easy to train, and lightweight. We achieved 0.0439 equal-error-rate (EER), a 50% relative reduction as compared to the state-of-the-art acoustic ELD system based on x-vectors, in the in-domain scenario. Further, we achieved an EER of 0.1352, a 33% relative reduction as compared to the acoustic ELD, in the unseen language (out-of-domain) condition. We plan to publish the evaluation dataset from the ATCO2 project.
- Published
- 2021
33. Cosmological magnetic field---the boost-symmetric case
- Author
-
Veselý, Jiří and Žofka, Martin
- Subjects
General Relativity and Quantum Cosmology
- Abstract
We find a class of cylindrically symmetric, static electrovacuum spacetimes generated by a non-homogeneous magnetic field and involving the cosmological constant and one additional parameter, which determine uniquely the strength of the magnetic field. We provide a simple model of a source producing the field., Comment: 7 pages, no figures, a reference added
- Published
- 2021
34. Cylindrical spacetimes due to radial magnetic fields
- Author
-
Veselý, Jiří and Žofka, Martin
- Subjects
General Relativity and Quantum Cosmology
- Abstract
We continue our previous study of cylindrically symmetric, static electrovacuum spacetimes generated by a magnetic field, involving optionally the cosmological constant, and investigate several classes of exact solutions. These spacetimes are due to magnetic fields that are perpendicular to the axis of symmetry., Comment: 8 pages, 6 figures
- Published
- 2021
35. Permutation-Based True Discovery Guarantee by Sum Tests
- Author
-
Vesely, Anna, Finos, Livio, and Goeman, Jelle J.
- Subjects
Statistics - Methodology
- Abstract
Sum-based global tests are highly popular in multiple hypothesis testing. In this paper we propose a general closed testing procedure for sum tests, which provides lower confidence bounds for the proportion of true discoveries (TDP), simultaneously over all subsets of hypotheses. These simultaneous inferences come for free, i.e., without any adjustment of the alpha-level, whenever a global test is used. Our method allows for an exploratory approach, as simultaneity ensures control of the TDP even when the subset of interest is selected post hoc. It adapts to the unknown joint distribution of the data through permutation testing. Any sum test may be employed, depending on the desired power properties. We present an iterative shortcut for the closed testing procedure, based on the branch and bound algorithm, which converges to the full closed testing results, often after few iterations; even if it is stopped early, it controls the TDP. We compare the properties of different choices for the sum test through simulations, then we illustrate the feasibility of the method for high dimensional data on brain imaging and genomics data., Comment: Main: 27 pages, 3 figures. Appendices: 19 pages, 7 figures
- Published
- 2021
36. Theory meets Practice at the Median: a worst case comparison of relative error quantile algorithms
- Author
-
Cormode, Graham, Mishra, Abhinav, Ross, Joseph, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms, Statistics - Computation, F.2.2
- Abstract
Estimating the distribution and quantiles of data is a foundational task in data mining and data science. We study algorithms which provide accurate results for extreme quantile queries using a small amount of space, thus helping to understand the tails of the input distribution. Namely, we focus on two recent state-of-the-art solutions: $t$-digest and ReqSketch. While $t$-digest is a popular compact summary which works well in a variety of settings, ReqSketch comes with formal accuracy guarantees at the cost of its size growing as new observations are inserted. In this work, we provide insight into which conditions make one preferable to the other. Namely, we show how to construct inputs for $t$-digest that induce an almost arbitrarily large error and demonstrate that it fails to provide accurate results even on i.i.d. samples from a highly non-uniform distribution. We propose practical improvements to ReqSketch, making it faster than $t$-digest, while its error stays bounded on any instance. Still, our results confirm that $t$-digest remains more accurate on the ``non-adversarial'' data encountered in practice., Comment: Updated experiments, improved presentation. To appear in KDD 2021
- Published
- 2021
37. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge
- Author
-
Kocour, Martin, Cámbara, Guillermo, Luque, Jordi, Bonet, David, Farrús, Mireia, Karafiát, Martin, Veselý, Karel, and Černocký, Jan "Honza"
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language
- Abstract
This paper describes the joint effort of BUT and Telefónica Research on the development of Automatic Speech Recognition systems for the Albayzin 2020 Challenge. We compare approaches based on either hybrid or end-to-end models. In hybrid modelling, we explore the impact of a SpecAugment layer on performance. For end-to-end modelling, we used a convolutional neural network with gated linear units (GLUs). The performance of such a model is also evaluated with an additional n-gram language model to improve word error rates. We further inspect source separation methods to extract speech from noisy environments (i.e. TV shows). More precisely, we assess the effect of using a neural-based music separator named Demucs. A fusion of our best systems achieved 23.33% WER in official Albayzin 2020 evaluations. Aside from techniques used in our final submitted systems, we also describe our efforts in retrieving high quality transcripts for training., Comment: fusion, end-to-end model, hybrid model, semisupervised, automatic speech recognition, convolutional neural network
- Published
- 2021
38. Breaking the Barrier of 2 for the Competitiveness of Longest Queue Drop
- Author
-
Antoniadis, Antonios, Englert, Matthias, Matsakis, Nicolaos, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms, F.2.2
- Abstract
We consider the problem of managing the buffer of a shared-memory switch that transmits packets of unit value. A shared-memory switch consists of an input port, a number of output ports, and a buffer with a specific capacity. In each time step, an arbitrary number of packets arrive at the input port, each packet designated for one output port. Each packet is added to the queue of the respective output port. If the total number of packets exceeds the capacity of the buffer, some packets have to be irrevocably evicted. At the end of each time step, each output port transmits a packet in its queue and the goal is to maximize the number of transmitted packets. The Longest Queue Drop (LQD) online algorithm accepts any arriving packet to the buffer. However, if this results in the buffer exceeding its memory capacity, then LQD drops a packet from whichever queue is currently the longest, breaking ties arbitrarily. The LQD algorithm was first introduced in 1991, and is known to be $2$-competitive since 2001. Although LQD remains the best known online algorithm for the problem and is of practical interest, determining its true competitiveness is a long-standing open problem. We show that LQD is 1.6918-competitive, establishing the first $(2-\varepsilon)$ upper bound for the competitive ratio of LQD, for a constant $\varepsilon>0$., Comment: A preliminary version appeared at ICALP 2021. This version contains an improved analysis which yields a slightly better upper bound. 30 pages
- Published
- 2020
39. Streaming Algorithms for Geometric Steiner Forest
- Author
-
Czumaj, Artur, Jiang, Shaofeng H. -C., Krauthgamer, Robert, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
We consider an important generalization of the Steiner tree problem, the \emph{Steiner forest problem}, in the Euclidean plane: the input is a multiset $X \subseteq \mathbb{R}^2$, partitioned into $k$ color classes $C_1, C_2, \ldots, C_k \subseteq X$. The goal is to find a minimum-cost Euclidean graph $G$ such that every color class $C_i$ is connected in $G$. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to $X$. Each input point $x\in X$ arrives with its color $\textsf{color}(x) \in [k]$, and as usual for dynamic geometric streams, the input points are restricted to the discrete grid $\{0, \ldots, \Delta\}^2$. We design a single-pass streaming algorithm that uses $\mathrm{poly}(k \cdot \log\Delta)$ space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio $\alpha_2$ (currently $1.1547 \le \alpha_2 \le 1.214$). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when $k=1$, and it is a major open question to improve the ratio to $1 + \epsilon$ even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires $\Omega(k)$ bits of space.
- Published
- 2020
40. Automatic Speech Recognition Benchmark for Air-Traffic Communications
- Author
-
Zuluaga-Gomez, Juan, Motlicek, Petr, Zhan, Qingran, Vesely, Karel, and Braun, Rudolf
- Subjects
Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas of speech-based automation such as in Air-Traffic Control (ATC) environment. Currently, voice communication and data links communications are the only way of contact between pilots and Air-Traffic Controllers (ATCo), where the former is the most widely used and the latter is a non-spoken method mandatory for oceanic messages and limited for some domestic issues. ASR systems on ATCo environments inherit increasing complexity due to accents from non-English speakers, cockpit noise, speaker-dependent biases, and small in-domain ATC databases for training. Hereby, we introduce CleanSky EC-H2020 ATCO2, a project that aims to develop an ASR-based platform to collect, organize and automatically pre-process ATCo speech-data from air space. This paper conveys an exploratory benchmark of several state-of-the-art ASR models trained on more than 170 hours of ATCo speech-data. We demonstrate that the cross-accent flaws due to speakers' accents are minimized due to the amount of data, making the system feasible for ATC environments. The developed ASR system achieves an averaged word error rate (WER) of 7.75% across four databases. An additional 35% relative improvement in WER is achieved on one test set when training a TDNNF system with byte-pair encoding., Comment: Accepted to: 21st INTERSPEECH conference (Shanghai, October 25-29)
- Published
- 2020
41. Relative Error Streaming Quantiles
- Author
-
Cormode, Graham, Karnin, Zohar, Liberty, Edo, Thaler, Justin, and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms, F.2.2
- Abstract
Estimating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of $n$ items from a data universe equipped with a total order, the task is to compute a sketch (data structure) of size polylogarithmic in $n$. Given the sketch and a query item $y$, one should be able to approximate its rank in the stream, i.e., the number of stream elements smaller than or equal to $y$. Most works to date focused on additive $\varepsilon n$ error approximation, culminating in the KLL sketch that achieved optimal asymptotic behavior. This paper investigates multiplicative $(1\pm\varepsilon)$-error approximations to the rank. Practical motivation for multiplicative error stems from demands to understand the tails of distributions, and hence for sketches to be more accurate near extreme values. The most space-efficient algorithms due to prior work store either $O(\log(\varepsilon^2 n)/\varepsilon^2)$ or $O(\log^3(\varepsilon n)/\varepsilon)$ universe items. We present a randomized sketch storing $O(\log^{1.5}(\varepsilon n)/\varepsilon)$ items that can $(1\pm\varepsilon)$-approximate the rank of each universe item with high constant probability; this space bound is within an $O(\sqrt{\log(\varepsilon n)})$ factor of optimal. Our algorithm does not require prior knowledge of the stream length and is fully mergeable, rendering it suitable for parallel and distributed computing environments., Comment: Final version of the paper to appear in Journal of the ACM. Compared to the previous version, we removed any restrictions on the accuracy parameters in the main result and thoroughly revised the paper. 48 pages, 2 figures
- Published
- 2020
42. Star-finite coverings of Banach spaces
- Author
-
De Bernardi, Carlo Alberto, Somaglia, Jacopo, and Vesely, Libor
- Subjects
Mathematics - Functional Analysis
- Abstract
We study star-finite coverings of infinite-dimensional normed spaces. A family of sets is called star-finite if each of its members intersects only finitely many other members of the family. It follows by our results that an LUR or a uniformly Fréchet smooth infinite-dimensional Banach space does not admit star-finite coverings by closed balls. On the other hand, we present a quite involved construction proving existence of a star-finite covering of $c_0(\Gamma)$ by Fréchet smooth centrally symmetric bounded convex bodies. A similar but simpler construction shows that every normed space of countable dimension (and hence incomplete) has a star-finite covering by closed balls.
- Published
- 2020
43. BUT Opensat 2019 Speech Recognition System
- Author
-
Karafiát, Martin, Baskar, Murali Karthick, Szöke, Igor, Vydana, Hari Krishna, Veselý, Karel, and Černocký, Jan "Honza"
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
- Abstract
The paper describes the BUT Automatic Speech Recognition (ASR) systems submitted for OpenSAT evaluations under two domain categories such as low resourced languages and public safety communications. The first was challenging due to lack of training data, therefore various architectures and multilingual approaches were employed. The combination led to superior performance. The second domain was challenging due to recording in extreme conditions such as specific channel, speaker under stress and high levels of noise. Data augmentation process was inevitable to get reasonably good performance., Comment: REJECTED in ICASSP 2020
- Published
- 2020
44. Streaming Algorithms for Bin Packing and Vector Scheduling
- Author
-
Cormode, Graham and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms, F.2.2
- Abstract
Problems involving the efficient arrangement of simple objects, as captured by bin packing and makespan scheduling, are fundamental tasks in combinatorial optimization. These are well understood in the traditional online and offline cases, but have been less well-studied when the volume of the input is truly massive, and cannot even be read into memory. This is captured by the streaming model of computation, where the aim is to approximate the cost of the solution in one pass over the data, using small space. As a result, streaming algorithms produce concise input summaries that approximately preserve the optimum value. We design the first efficient streaming algorithms for these fundamental problems in combinatorial optimization. For Bin Packing, we provide a streaming asymptotic $1+\varepsilon$-approximation with $\widetilde{O}\left(\frac{1}{\varepsilon}\right)$ memory, where $\widetilde{O}$ hides logarithmic factors. Moreover, such a space bound is essentially optimal. Our algorithm implies a streaming $d+\varepsilon$-approximation for Vector Bin Packing in $d$ dimensions, running in space $\widetilde{O}\left(\frac{d}{\varepsilon}\right)$. For the related Vector Scheduling problem, we show how to construct an input summary in space $\widetilde{O}(d^2\cdot m / \varepsilon^2)$ that preserves the optimum value up to a factor of $2 - \frac{1}{m} +\varepsilon$, where $m$ is the number of identical machines., Comment: 19 pages, 1 figure, submitted
- Published
- 2019
45. Tight Lower Bound for Comparison-Based Quantile Summaries
- Author
-
Cormode, Graham and Veselý, Pavel
- Subjects
Computer Science - Data Structures and Algorithms, F.2.2
- Abstract
Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items, drawn from a totally ordered universe. We study data structures, called quantile summaries, which keep track of all quantiles, up to an error of at most $\varepsilon$. That is, an $\varepsilon$-approximate quantile summary first processes a stream of items and then, given any quantile query $0\le \phi\le 1$, returns an item from the stream, which is a $\phi'$-quantile for some $\phi' = \phi \pm \varepsilon$. We focus on comparison-based quantile summaries that can only compare two items and are otherwise completely oblivious of the universe. The best such deterministic quantile summary to date, due to Greenwald and Khanna (SIGMOD '01), stores at most $O(\frac{1}{\varepsilon}\cdot \log \varepsilon N)$ items, where $N$ is the number of items in the stream. We prove that this space bound is optimal by showing a matching lower bound. Our result thus rules out the possibility of constructing a deterministic comparison-based quantile summary in space $f(\varepsilon)\cdot o(\log N)$, for any function $f$ that does not depend on $N$. As a corollary, we improve the lower bound for biased quantiles, which provide a stronger, relative-error guarantee of $(1\pm \varepsilon)\cdot \phi$, and for other related computational tasks., Comment: 20 pages, 2 figures, major revision of the construction (Sec. 3) and some other parts of the paper
- Published
- 2019
46. How to glide in Schwarzschild spacetime
- Author
-
Veselý, Vítek and Zofka, Martin
- Subjects
General Relativity and Quantum Cosmology, 83C10, 83C55
- Abstract
We investigate the motion of extended test objects in the Schwarzschild spacetime, particularly the radial fall of two point masses connected by a massless rod of a length given as a fixed, periodic function of time. We argue that such a model is inappropriate in the most interesting regimes of high and low oscillation frequencies., Comment: 15 pages, 10 figures
- Published
- 2019
- Full Text
- View/download PDF
47. Microscopic multiphonon approach to nuclei with a valence hole in the oxygen region
- Author
-
De Gregorio, Giovanni, Knapp, Frantisek, Iudice, Nicola Lo, and Vesely, Petr
- Subjects
Nuclear Theory - Abstract
An equation of motion phonon method, developed for even nuclei and recently extended to odd systems with a valence particle, is formulated in the hole-phonon coupling scheme and applied to A=15 and A=21 isobars with a valence hole. The method derives a set of equations which yield an orthonormal basis of states composed of a hole coupled to an orthonormal basis of correlated n-phonon states (n = 0, 1, 2, ...), built of constituent Tamm-Dancoff phonons, describing the excitations of a doubly magic core. The basis is then adopted to solve the full eigenvalue problem. The method is formally exact but lends itself naturally to simplifying approximations. Self-consistent calculations using a chiral Hamiltonian in a space encompassing up to two-phonon and three-phonon basis states in A=21 and A=15 nuclei, respectively, yield full spectra, moments, electromagnetic and beta-decay transition strengths, and electric dipole cross sections. The analysis of the hole-phonon composition of the eigenfunctions helps to clarify the mechanism of excitation of levels and resonances and to understand the reasons for the deviations of the theory from the experiments. Prescriptions for reducing these discrepancies are suggested., Comment: 11 pages, 7 figures, accepted in Phys. Rev. C
- Published
- 2019
48. Effect of a realistic three-body force on the spectra of medium-mass hypernuclei
- Author
-
Vesely, Petr, De Gregorio, Giovanni, and Pokorny, Jan
- Subjects
Nuclear Theory - Abstract
We adopt the Hartree-Fock (HF) method in the proton-neutron-$\Lambda$ (p-n-$\Lambda$) formalism and the nucleon-$\Lambda$ Tamm-Dancoff Approximation (N$\Lambda$ TDA) to study the energy spectra of medium-mass hypernuclei. The formalism is developed for a potential derived from effective field theories which includes explicitly the 3-body $NNN$ forces plus the $YN$ LO potential. The energy spectra of selected medium-mass hypernuclei are presented and their properties discussed. The present calculation is the first step of a project devoted to ab initio studies of hypernuclei in medium and heavy mass regions. This may provide a guide for a better understanding of the $YN$ interactions at momentum scales not accessible in few-body hypernuclei., Comment: 7 pages, 9 figures, accepted in Physica Scripta
- Published
- 2018
- Full Text
- View/download PDF
49. Introducing SPAIN (SParse Audio INpainter)
- Author
-
Mokrý, Ondřej, Záviška, Pavel, Rajmic, Pavel, and Veselý, Vítězslav
- Subjects
Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing ,Mathematics - Optimization and Control - Abstract
A novel sparsity-based algorithm for audio inpainting is proposed. It is an adaptation of the SPADE algorithm by Kitić et al., originally developed for audio declipping, to the task of audio inpainting. The new SPAIN (SParse Audio INpainter) comes in synthesis and analysis variants. Experiments show that both A-SPAIN and S-SPAIN outperform other sparsity-based inpainting algorithms. Moreover, A-SPAIN performs on a par with the state-of-the-art method based on linear prediction in terms of the SNR, and, for larger gaps, SPAIN is even slightly better in terms of the PEMO-Q psychoacoustic criterion.
- Published
- 2018
- Full Text
- View/download PDF
50. Residual Memory Networks: Feed-forward approach to learn long temporal dependencies
- Author
-
Baskar, Murali Karthick, Karafiat, Martin, Burget, Lukas, Vesely, Karel, Grezl, Frantisek, and Cernocky, Jan Honza
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Training deep recurrent neural network (RNN) architectures is complicated due to the increased network complexity. This disrupts the learning of higher order abstracts using deep RNN. In the case of feed-forward networks, training deep structures is simpler and faster, but learning long-term temporal information is not possible. In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time delayed connections. The residual connection paves the way to construct deeper networks by enabling unhindered flow of gradients, and the time delay units capture temporal information with shared weights. The number of layers in RMN signifies both the hierarchical processing depth and temporal depth. The computational complexity in training RMN is significantly less when compared to deep recurrent networks. RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information. Experimental analysis is done on the AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information. Recognition performance of RMN trained with 300 hours of the Switchboard corpus is compared with various state-of-the-art LVCSR systems. The results indicate that RMN and BRMN gain 6 % and 3.8 % relative improvement over LSTM and BLSTM networks.
- Published
- 2018