17 results for "János M. Szalai-Gindl"
Search Results
2. Tiling Strategies for Distributed Point Cloud Databases.
- Author
-
János M. Szalai-Gindl, László Dobos 0001, and István Csabai
- Published
- 2017
3. GPU-accelerated hierarchical Bayesian estimation of luminosity functions using flux-limited observations with photometric noise.
- Author
-
János M. Szalai-Gindl, T. J. Loredo, B. C. Kelly, István Csabai, Tamas Budavari, and László Dobos 0001
- Published
- 2018
4. Point cloud databases.
- Author
-
László Dobos 0001, István Csabai, János M. Szalai-Gindl, Tamás Budavári, and Alexander S. Szalay
- Published
- 2014
5. The COMPARE Data Hubs.
- Author
-
Clara Amid, Nima Pakseresht, Nicole Silvester, Suran Jayathilaka, Ole Lund, Lukasz D. Dynovski, Bálint Á. Pataki, Dávid Visontai, Basil Britto Xavier, Blaise T. F. Alako, Ariane Belka, Jose L. B. Cisneros, Matthew Cotten, George B. Haringhuizen, Peter W. Harrison, Dirk Höper, Sam Holt, Camilla Hundahl, Abdulrahman Hussein, Rolf S. Kaas, Xin Liu 0006, Rasko Leinonen, Surbhi Malhotra-Kumar, David F. Nieuwenhuijse, Nadim Rahman, Carolina dos S. Ribeiro, Jeffrey E. Skiby, Dennis Schmitz, József Stéger, János M. Szalai-Gindl, Martin Christen Frølund Thomsen, Simone M. Cacciò, István Csabai, Annelies Kroneman, Marion Koopmans, Frank Aarestrup, and Guy Cochrane
- Published
- 2019
6. Template Matching for 3D Objects in Large Point Clouds Using DBMS
- Author
-
Peter Vaderna, László Dobos, Dániel Varga, Bence Formanek, János M. Szalai-Gindl, and Sándor Laki
- Subjects
General Computer Science ,Computer science ,Pipeline (computing) ,Feature extraction ,Point cloud ,02 engineering and technology ,External Data Representation ,computer.software_genre ,registration ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,database ,template matching ,Data processing ,PCA ,Database ,Template matching ,Search engine indexing ,General Engineering ,020207 software engineering ,3D point cloud ,TK1-9971 ,Data set ,020201 artificial intelligence & image processing ,Electrical engineering. Electronics. Nuclear engineering ,computer
- Abstract
LIDAR and depth cameras have gone through a profound technological evolution, making large-scale recording of 3D point cloud data possible, which raises new challenges for data processing. Most of the existing 3D point cloud processing methods were developed to work properly when the entire data set fits into the memory of a single server. When point clouds are significantly larger than the main memory and data are only available on slow storage, new approaches are necessary. In this paper, we propose a DBMS-based point cloud processing pipeline that solves the template matching problem, i.e., finding the – potentially multiple – occurrences of a small query point cloud in an extensive scene data set that is preprocessed and stored in a database. The storage layer uses a compact and novel data representation to exploit the benefits of efficient indexing structures, whereas the query algorithm consists of a novel combination of existing point cloud processing and matching methods. To the best of our knowledge, this is the first template matching proposal in the literature that exploits the benefits of databases.
- Published
- 2021
7. TOWARDS ON EXPERIMENTAL COMPARISON OF THE M-TREE INDEX STRUCTURE WITH BK-TREE AND VP-TREE
- Author
-
Attila Kiss, Gergo Gombos, János M. Szalai-Gindl, and István Donkó
- Subjects
M-tree ,postgresql ,Index (economics) ,databases ,Polymers and Plastics ,Computer science ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,metric space ,Industrial and Manufacturing Engineering ,m-tree ,Combinatorics ,indexes ,Data_FILES ,BK tree ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Business and International Management ,lcsh:TK1-9971 ,Vantage-point tree
- Abstract
In our previous paper, we showed the M-tree index [7] using GiST in the PostgreSQL database. In this paper, we present that result and extend it with some preliminary experimental results with other indexes. We compare the M-tree index with the BK-tree and the VP-tree indexes. These can work in metric spaces with edit distance, which can be used to compare DNA sequences or the melodies of songs. In this paper, we compare the indexes in PostgreSQL. We use range-based queries to analyze the performance of the indexes. The results show that the M-tree index is faster than the other two indexes.
- Published
- 2020
8. The Descriptiveness of Feature Descriptors with Reduced Dimensionality
- Author
-
Dániel Varga, Sándor Laki, and János M. Szalai-Gindl
- Subjects
Full table scan ,business.industry ,Computer science ,Feature (computer vision) ,Nearest neighbor search ,Dimensionality reduction ,Feature vector ,Spatial database ,Point cloud ,Pattern recognition ,Artificial intelligence ,business ,Curse of dimensionality
- Abstract
Nowadays, depth data has an important role in many applications. Sensors that can capture depth data have become essential parts of autonomous vehicles. These sensors record a huge amount of 3D data (point clouds with x, y, and z coordinates). Furthermore, for many point cloud processing applications, it is important to calculate feature vectors that aim at describing the neighborhood of each point. Usually, a feature vector has high dimensionality, and storing it in a database is a difficult task. One of the most common operations on feature descriptors is the nearest neighbor search. However, earlier works show that nearest neighbor search with spatial index structures in high dimensions could be outperformed by a sequential scan. In this work, we investigate how dimensionality reduction on 3D feature descriptors affects the descriptiveness.
- Published
- 2021
9. On Fast Point Cloud Matching with Key Points and Parameter Tuning
- Author
-
László Dobos, Bence Formanek, Sándor Laki, Dániel Varga, János M. Szalai-Gindl, and Peter Vaderna
- Subjects
Computational complexity theory ,Matching (graph theory) ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Point cloud ,02 engineering and technology ,Pipeline (software) ,Data point ,Feature (computer vision) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Point (geometry) ,Computer vision ,Noise (video) ,Artificial intelligence ,business
- Abstract
Nowadays, three-dimensional point cloud processing plays a very important role in a wide range of areas: autonomous driving, robotics, cartography, etc. Three-dimensional point cloud registration pipelines have high computational complexity, mainly because of the cost of point feature signature calculation. By selecting keypoints, i.e., data points that are interesting in some way, and using only them for registration, one can significantly reduce the number of points for which feature signatures are needed, and hence the running time of registration pipelines. Consequently, keypoint detectors have a prominent role in an efficient processing pipeline. In this paper, we propose to analyze the usefulness of various keypoint detection algorithms and investigate whether and when it is worth using a keypoint detector for registration. We define the goodness of a keypoint detection algorithm based on the success and quality of registration. Most keypoint detection methods require manual tuning of their parameters for best results. Here we revisit the most popular methods for keypoint detection in 3D point clouds and perform automatic parameter tuning with goodness of registration and run time as primary objectives. We compare keypoint-based registration to registration with randomly selected points and using all data points as a baseline. In contrast to former work, we use point clouds of different sizes, with and without noise, and register objects with different sizes.
- Published
- 2020
10. An implementation of the M-tree index structure for PostgreSQL using GiST
- Author
-
Gergo Gombos, Attila Kiss, István Donkó, and János M. Szalai-Gindl
- Subjects
M-tree ,Full table scan ,Structure (mathematical logic) ,Metric space ,Range (mathematics) ,Forcing (recursion theory) ,Theoretical computer science ,Computer science ,Heuristics ,Data type
- Abstract
Several index structures are competing for efficient operations on various types of data, but there are some forms of information that cannot fit into the existing models, because by forcing them into one of the currently available representations they either lose some of their value or they do not have the necessary properties which would allow them to be used in the first place. In light of this, we implemented the M-tree [1] [2] index structure under the PostgreSQL database management system through a GiST [3] extension to facilitate efficient range-based queries and k-nearest neighbor searches on data that resides only in a metric space. The M-tree structure has parts that require heuristics, of which the implementations are not entirely specified, only suggestions are given. This allows flexibility for adapting to different situations and leaves space for further improvements. We implemented several of the strategies proposed in the original paper alongside some of our own ideas and saw significant speed gains when compared to a sequential scan, our main reference, as the other built-in index structures are not applicable to metric spaces.
- Published
- 2019
11. The COMPARE Data Hubs
- Author
-
Nicole Silvester, Frank Møller Aarestrup, Sam Holt, Martin Christen Frølund Thomsen, Nadim Rahman, Nima Pakseresht, Marion Koopmans, Dávid Visontai, Peter W. Harrison, José Cisneros, Suran Jayathilaka, Clara Amid, Lukasz D. Dynovski, Jeffrey Edward Skiby, Rolf Sommer Kaas, Abdulrahman Hussein, Dirk Höper, Carolina dos S. Ribeiro, George B. Haringhuizen, Ole Lund, Guy Cochrane, Matthew Cotten, Camilla Hundahl, Rasko Leinonen, Blaise T. F. Alako, Surbhi Malhotra-Kumar, Ariane Belka, János M. Szalai-Gindl, István Csabai, David F. Nieuwenhuijse, Basil Britto Xavier, József Stéger, and Bálint Ármin Pataki
- Subjects
0303 health sciences ,Animal health ,business.industry ,Computer science ,Food safety ,Data science ,Data sharing ,03 medical and health sciences ,0302 clinical medicine ,Workflow ,Lead (geology) ,Data model ,Informatics ,business ,030217 neurology & neurosurgery ,030304 developmental biology
- Abstract
Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory, and institutional challenges, as well as a lack of suitable platforms that provide an infrastructure for data sharing in structured formats, often lead to data not being shared, or at most shared in the form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.
- Published
- 2019
12. Worldwide human mitochondrial haplogroup distribution from urban sewage
- Author
-
Marion Koopmans, Dávid Visontai, Frank Møller Aarestrup, Rene S. Hendriksen, János M. Szalai-Gindl, Anna Medgyes-Horváth, Orsolya Pipek, József Stéger, István Csabai, László Dobos, and Rolf Sommer Kaas
- Subjects
0301 basic medicine ,Mitochondrial DNA ,Genetic testing ,Urban Population ,Distribution (economics) ,Sewage ,lcsh:Medicine ,Human mitochondrial genetics ,DNA, Mitochondrial ,Article ,Evolution, Molecular ,03 medical and health sciences ,Population screening ,0302 clinical medicine ,SDG 3 - Good Health and Well-being ,Humans ,Location ,lcsh:Science ,Phylogeny ,Principal Component Analysis ,Stochastic Processes ,Multidisciplinary ,business.industry ,Haplotype ,lcsh:R ,Reproducibility of Results ,Data acquisition ,3. Good health ,030104 developmental biology ,Geography ,Haplotypes ,Evolutionary biology ,lcsh:Q ,Sample collection ,business ,030217 neurology & neurosurgery ,Human mitochondrial DNA haplogroup
- Abstract
Community level genetic information can be essential to direct health measures and study demographic tendencies but is subject to considerable ethical and legal challenges. These concerns become less pronounced when analyzing urban sewage samples, which are ab ovo anonymous by their pooled nature. We were able to detect traces of the human mitochondrial DNA (mtDNA) in urban sewage samples and to estimate the distribution of human mtDNA haplogroups. An expectation maximization approach was used to determine mtDNA haplogroup mixture proportions for samples collected at each different geographic location. Our results show reasonable agreement with both previous studies of ancient evolution or migration and current US census data; and are also readily reproducible and highly robust. Our approach presents a promising alternative for sample collection in studies focusing on the ethnic and genetic composition of populations or diseases associated with different mtDNA haplogroups and genotypes.
- Published
- 2019
13. The COMPARE Data Hubs
- Author
-
Martin Christen Frølund Thomsen, Matthew Cotten, Peter W. Harrison, David F. Nieuwenhuijse, Simone M. Cacciò, Ariane Belka, Clara Amid, Surbhi Malhotra-Kumar, Annelies Kroneman, Frank Møller Aarestrup, Sam Holt, Rasko Leinonen, István Csabai, Xin Liu, Lukasz D. Dynovski, Basil Britto Xavier, Nadim Rahman, José Cisneros, József Stéger, Nicole Silvester, Ole Lund, Abdulrahman Hussein, János M. Szalai-Gindl, Camilla Hundahl, Carolina dos S. Ribeiro, Rolf Sommer Kaas, George B. Haringhuizen, Blaise T. F. Alako, Nima Pakseresht, Jeffrey Edward Skiby, Dirk Höper, Guy Cochrane, Dennis Schmitz, Bálint Ármin Pataki, Marion Koopmans, Dávid Visontai, and Suran Jayathilaka
- Subjects
Databases, Factual ,Computer science ,Sequencing data ,pathogen sequencing data ,pathogen portal ,General Biochemistry, Genetics and Molecular Biology ,03 medical and health sciences ,User-Computer Interface ,0302 clinical medicine ,Lead (geology) ,Phylogeny ,030304 developmental biology ,Computer. Automation ,0303 health sciences ,Animal health ,Bacteria ,FAIR principles ,business.industry ,Information Dissemination ,data hubs ,Food safety ,Data science ,Data sharing ,data sharing platform ,Workflow ,Informatics ,Original Article ,Human medicine ,Metagenomics ,General Agricultural and Biological Sciences ,business ,030217 neurology & neurosurgery ,Information Systems
- Abstract
Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as a lack of suitable platforms that provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in the form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.
- Published
- 2019
14. ND-GiST: A Novel Method for Disk-Resident k-mer Indexing
- Author
-
Attila Kiss, Gábor Halász, István Csabai, László Dobos, and János M. Szalai-Gindl
- Subjects
0301 basic medicine ,Sequence ,Focus (computing) ,Theoretical computer science ,Query string ,Computer science ,Search engine indexing ,02 engineering and technology ,Full table scan ,03 medical and health sciences ,030104 developmental biology ,Index (publishing) ,k-mer ,020204 information systems ,Subsequence ,0202 electrical engineering, electronic engineering, information engineering
- Abstract
Several challenges are related to metagenomics, one of which is data management. A related central concept is the k-mer, which means a possible subsequence of length k from a DNA (sub)sequence. In this work, the focus is on indexing k-mers and supporting box queries, where a query string of length k might have multiple allowed nucleobases per position. A novel index structure, ND-GiST, is introduced, which is capable of handling box queries. Compared with a full table scan and the traditional B-tree, the performance results of ND-GiST are encouraging.
- Published
- 2019
15. Tiling Strategies for Distributed Point Cloud Databases
- Author
-
László Dobos, János M. Szalai-Gindl, and István Csabai
- Subjects
Spectral clustering algorithm ,Theoretical computer science ,Database ,Hierarchy (mathematics) ,Computer science ,Point cloud ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,Spectral clustering ,Transformation (function) ,Data extraction ,Shared nothing architecture ,Histogram ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,010306 general physics ,computer
- Abstract
Many large point clouds, such as cosmological N-body simulations, intersections of road networks, etc., are strongly clustered on a hierarchy of scales. In shared-nothing distributed environments, optimized tiling of data is crucial to minimize cross-server communication and balance IO and processing load. We propose histogram-based tiling algorithms, a hierarchical tiling and a spectral clustering algorithm, that can be incorporated into the data extraction or transformation phase of a typical Extraction-Transformation-Loading (ETL) procedure. We define measures to characterize the performance of these tiling techniques with respect to typical spatial search operations, and evaluate the algorithms based on these measures using hierarchically clustered data sets.
- Published
- 2017
16. Loss of BRCA1 or BRCA2 markedly increases the rate of base substitution mutagenesis and has distinct effects on genomic deletions
- Author
-
Marcin Krzystanek, Judit Zámborszky, Judit Z. Gervai, Charles Swanton, János M. Szalai-Gindl, Andrea L. Richardson, Bernadett Szikriszt, Orsolya Pipek, Zoltan Szallasi, Dezső Ribli, Dávid Szüts, István Csabai, and Ádám Póti
- Subjects
0301 basic medicine ,Male ,Cancer Research ,endocrine system diseases ,Mutant ,Mutagenesis (molecular biology technique) ,Biology ,medicine.disease_cause ,03 medical and health sciences ,chemistry.chemical_compound ,SDG 3 - Good Health and Well-being ,Genetics ,medicine ,Animals ,Humans ,skin and connective tissue diseases ,Molecular Biology ,BRCA2 Protein ,Mutation ,Mutation Spectra ,BRCA1 Protein ,Point mutation ,Genomics ,Methyl methanesulfonate ,030104 developmental biology ,chemistry ,Mutagenesis ,Original Article ,Female ,Erratum ,Homologous recombination ,Chickens ,DNA
- Abstract
Loss-of-function mutations in the BRCA1 and BRCA2 genes increase the risk of cancer. Owing to their function in homologous recombination repair, much research has focused on the unstable genomic phenotype of BRCA1/2 mutant cells manifest mainly as large-scale rearrangements. We used whole-genome sequencing of multiple isogenic chicken DT40 cell clones to precisely determine the consequences of BRCA1/2 loss on all types of genomic mutagenesis. Spontaneous base substitution mutation rates increased sevenfold upon the disruption of either BRCA1 or BRCA2, and the arising mutation spectra showed strong and specific correlation with a mutation signature associated with BRCA1/2 mutant tumours. To model endogenous alkylating damage, we determined the mutation spectrum caused by methyl methanesulfonate (MMS), and showed that MMS also induces more base substitution mutations in BRCA1/2-deficient cells. Spontaneously arising and MMS-induced insertion/deletion mutations and large rearrangements were also more common in BRCA1/2 mutant cells compared with the wild-type control. A difference in the short deletion phenotypes of BRCA1 and BRCA2 suggested distinct roles for the two proteins in the processing of DNA lesions, as BRCA2 mutants contained more short deletions, with a wider size distribution, which frequently showed microhomology near the breakpoints resembling repair by non-homologous end joining. An increased and prolonged gamma-H2AX signal in MMS-treated BRCA1/2 cells suggested an aberrant processing of stalled replication forks as the cause of increased mutagenesis. The high rate of base substitution mutagenesis demonstrated by our experiments is likely to significantly contribute to the oncogenic effect of the inactivation of BRCA1 or BRCA2.
- Published
- 2016
17. Point cloud databases
- Author
-
János M. Szalai-Gindl, Alexander S. Szalay, Tamás Budavári, László Dobos, and István Csabai
- Subjects
Theoretical computer science ,Database ,Computer science ,Dimensionality reduction ,Search engine indexing ,Point cloud ,computer.software_genre ,Query language ,Database design ,Data point ,Relational model ,Data mining ,computer ,Database model
- Abstract
We introduce the concept of the point cloud database, a new kind of database system aimed primarily towards scientific applications. Many scientific observations, experiments, feature extraction algorithms and large-scale simulations produce enormous amounts of data that are better represented as sparse (but often highly clustered) points in a k-dimensional (k ≲ 10) metric space than on a multi-dimensional grid. Dimensionality reduction techniques, such as principal components, are also widely used to project high dimensional data into similarly low dimensional spaces. Analysis techniques developed to work on multi-dimensional data points are usually implemented as in-memory algorithms and need to be modified to work in distributed cluster environments and on large amounts of disk-resident data. We conclude that the relational model, with certain additions, is appropriate for point clouds, but point cloud databases must also provide a unique set of spatial search and proximity join operators, indexing schemes, and query language constructs that make them a distinct class of database systems.
- Published
- 2014