749 results for "Valduriez, Patrick"
Search Results
702. Data Warehouse Design Methods Review: Trends, Challenges and Future Directions for the Healthcare Domain
- Author
-
Khnaisser, Christina, Lavoie, Luc, Diab, Hassan, Ethier, Jean-Francois, Liu, Ting, Series editor, Morzy, Tadeusz, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
703. Best-Match Time Series Subsequence Search on the Intel Many Integrated Core Architecture
- Author
-
Zymbler, Mikhail, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
704. Ontological Commitments, DL-Lite Logics and Reasoning Tractability
- Author
-
Espil, Mauricio Minuto, Ojea, Maria Gabriela, Ojea, Maria Alejandra, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
705. Distributed Sequence Pattern Detection Over Multiple Data Streams
- Author
-
Leghari, Ahmed Khan, Cao, Jianneng, Zhou, Yongluan, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
706. Efficient Computation of Parsimonious Temporal Aggregation
- Author
-
Mahlknecht, Giovanni, Dignös, Anton, Gamper, Johann, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
707. Optimizing Sort in Hadoop Using Replacement Selection
- Author
-
Dusso, Pedro Martins, Sauer, Caetano, Härder, Theo, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
708. A Self-tuning Framework for Cloud Storage Clusters
- Author
-
Mohammad, Siba, Schallehn, Eike, Saake, Gunter, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
709. TDQMed: Managing Collections of Complex Test Data
- Author
-
Held, Johannes, Lenz, Richard, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
710. Partitioning Templates for RDF
- Author
-
Schroeder, Rebeca, Hara, Carmem S., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
711. Feedback Based Continuous Skyline Queries Over a Distributed Framework
- Author
-
Leghari, Ahmed Khan, Cao, Jianneng, Zhou, Yongluan, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
712. SeeCOnt: A New Seeding-Based Clustering Approach for Ontology Matching
- Author
-
Algergawy, Alsayed, Babalou, Samira, Kargar, Mohammad J., Davarpanah, S. Hashem, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
713. ForCE: Is Estimation of Data Completeness Through Time Series Forecasts Feasible?
- Author
-
Endler, Gregor, Baumgärtel, Philipp, Wahl, Andreas M., Lenz, Richard, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
714. Evidence-Based Languages for Conceptual Data Modelling Profiles
- Author
-
Fillottrani, Pablo Rubén, Keet, C. Maria, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
715. Analysis of the Blocking Behaviour of Schema Transformations in Relational Database Systems
- Author
-
Wevers, Lesley, Hofstra, Matthijs, Tammens, Menno, Huisman, Marieke, van Keulen, Maurice, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
716. Web Content Management Systems Archivability
- Author
-
Banos, Vangelis, Manolopoulos, Yannis, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
717. A Benchmark for Relation Extraction Kernels
- Author
-
Pereira, João L. M., Galhardas, Helena, Martins, Bruno, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
718. HBelt: Integrating an Incremental ETL Pipeline with a Big Data Store for Real-Time Analytics
- Author
-
Qu, Weiping, Shankar, Sahana, Ganza, Sandy, Dessloch, Stefan, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
719. Two-ETL Phases for Data Warehouse Creation: Design and Implementation
- Author
-
Nabli, Ahlem, Bouaziz, Senda, Yangui, Rania, Gargouri, Faiez, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
720. A Generic Data Warehouse Architecture for Analyzing Workflow Logs
- Author
-
Koncilia, Christian, Pichler, Horst, Wrembel, Robert, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
721. A Framework for Building OLAP Cubes on Graphs
- Author
-
Ghrab, Amine, Romero, Oscar, Skhiri, Sabri, Vaisman, Alejandro, Zimányi, Esteban, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
722. Direct Transformation Techniques for Compressed Data: General Approach and Application Scenarios
- Author
-
Damme, Patrick, Habich, Dirk, Lehner, Wolfgang, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
723. Implementation of Multidimensional Databases in Column-Oriented NoSQL Systems
- Author
-
Chevalier, Max, Malki, Mohammed El, Kopliku, Arlind, Teste, Olivier, Tournier, Ronan, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
724. CoDEL – A Relationally Complete Language for Database Evolution
- Author
-
Herrmann, Kai, Voigt, Hannes, Behrend, Andreas, Lehner, Wolfgang, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
725. Two Phase User Driven Schema Matching
- Author
-
Bozovic, Nick, Vassalos, Vasilis, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
726. Sybil Tolerance and Probabilistic Databases to Compute Web Services Trust
- Author
-
Saoud, Zohra, Faci, Noura, Maamar, Zakaria, Benslimane, Djamal, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
727. A General Trust Management Framework for Provider Selection in Cloud Environment
- Author
-
Filali, Fatima Zohra, Yagoubi, Belabbas, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
728. Hybrid Web Service Discovery Based on Fuzzy Condorcet Aggregation
- Author
-
Fethallah, Hadjila, Amine, Belabed, Amel, Halfaoui, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
729. Space-Bounded Query Approximation
- Author
-
Cule, Boris, Geerts, Floris, Ndindi, Reuben, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
730. Conditional Differential Dependencies (CDDs)
- Author
-
Kwashie, Selasi, Liu, Jixue, Li, Jiuyong, Ye, Feiyue, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
731. The Structure of Preference Orders
- Author
-
Endres, Markus, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
732. Confidentiality Preserving Evaluation of Open Relational Queries
- Author
-
Biskup, Joachim, Bring, Martin, Bulinski, Michael, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
733. Improving the Pruning Ability of Dynamic Metric Access Methods with Local Additional Pivots and Anticipation of Information
- Author
-
Oliveira, Paulo H., Traina, Caetano, Jr., Kaster, Daniel S., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
734. DZI: An air index for spatial queries in one-dimensional channels.
- Author
-
Park, Kwangjin, Joly, Alexis, and Valduriez, Patrick
- Subjects
- *K-nearest neighbor classification, *SEARCH algorithms, *AIR, *CLOAKING devices
- Abstract
The characteristics of the wireless data broadcast environment cause data to be delivered sequentially via one-dimensional channels. A space-filling curve has been proposed for recent wireless data broadcast environments. However, air indexing introduces various problems, including an increase in the size of the index, conversion costs, and an increase in the search space because of an inefficient structure. In this paper, we propose a distribution-based Z-order air index and query processing algorithms suitable for a wireless data broadcast environment. The proposed index organizes the object identification (hereafter called ID) hierarchically only in terms of objects that are present. We compare the proposed technique with the well-known spatial indexing technique DSI by creating equations that represent the access time and tuning time, followed by conducting a simulation-based performance evaluation. The experimental results show that our proposed index and algorithms support efficient query processing in both range queries and K-nearest neighbor queries. [ABSTRACT FROM AUTHOR]
- Published
- 2019
735. Efficient Scheduling of Scientific Workflows Using Hot Metadata in a Multisite Cloud.
- Author
-
Liu, Ji, Pineda, Luis, Pacitti, Esther, Costan, Alexandru, Valduriez, Patrick, Antoniu, Gabriel, and Mattoso, Marta
- Subjects
- *METADATA, *WORKFLOW management, *WORKFLOW management systems, *SCHEDULING, *SERVER farms (Computer network management), *SOVEREIGN wealth funds
- Abstract
Large-scale, data-intensive scientific applications are often expressed as scientific workflows (SWfs). In this paper, we consider the problem of efficient scheduling of a large SWf in a multisite cloud, i.e., a cloud with geo-distributed cloud data centers (sites). The reasons for using multiple cloud sites to run a SWf are that data is already distributed, the necessary resources exceed the limits at a single site, or the monetary cost is lower. In a multisite cloud, metadata management has a critical impact on the efficiency of SWf scheduling as it provides a global view of data location and enables task tracking during execution. Thus, it should be readily available to the system at any given time. While it has been shown that efficient metadata handling plays a key role in performance, little research has targeted this issue in multisite cloud. In this paper, we propose to identify and exploit hot metadata (frequently accessed metadata) for efficient SWf scheduling in a multisite cloud, using a distributed approach. We implemented our approach within a scientific workflow management system, which shows that our approach reduces the execution time of highly parallel jobs up to 64 percent and that of the whole SWfs up to 55 percent. [ABSTRACT FROM AUTHOR]
- Published
- 2019
736. StreamCloud: An Elastic Parallel-Distributed Stream Processing Engine
- Author
-
Gulisano, Vincenzo, Jiménez Peris, Ricardo, and Valduriez, Patrick (Patrick.Valduriez@inria.fr); Distributed Systems Laboratory (DSL), Universidad Politécnica de Madrid (UPM); Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM), Centre National de la Recherche Scientifique (CNRS), Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)
- Subjects
Computer science, Telecommunications, Scalability, Fault Tolerance, Stream Processing Engine, Data Streaming, Elasticity, Load Balancing, [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
- Abstract
In recent years, applications in domains such as telecommunications, network security, and large-scale sensor networks have shown the limits of the traditional store-then-process paradigm. In this context, Stream Processing Engines emerged as a candidate solution for applications demanding high processing capacity with low processing latency guarantees. With Stream Processing Engines, data streams are not persisted but rather processed on the fly, producing results continuously. Current Stream Processing Engines, whether centralized or distributed, do not scale with the input load because of single-node bottlenecks. Moreover, they are based on static configurations that lead to either under- or over-provisioning. This Ph.D. thesis discusses StreamCloud, an elastic parallel-distributed stream processing engine that enables the processing of large data stream volumes. StreamCloud minimizes the distribution and parallelization overhead by introducing novel techniques that split queries into parallel subqueries and allocate them to independent sets of nodes. Moreover, StreamCloud's elastic and dynamic load balancing protocols enable effective adjustment of resources depending on the incoming load. Together with the parallelization and elasticity techniques, StreamCloud defines a novel fault tolerance protocol that introduces minimal overhead while providing fast recovery. StreamCloud has been fully implemented and evaluated using several real-world applications, such as fraud detection and network analysis applications. The evaluation, conducted on a cluster with more than 300 cores, demonstrates the high scalability of StreamCloud and the effectiveness of its elasticity and fault tolerance.
- Published
- 2022
737. Corrigendum to “Best position algorithms for efficient top-k query processing” [Inf. Syst. 36(6) (2011) 973–989].
- Author
-
Akbarinia, Reza, Pacitti, Esther, and Valduriez, Patrick
- Subjects
- *PERIODICAL articles, *ALGORITHMS, *SEARCH algorithms, *COMPUTER networks, *COMPUTER research
- Published
- 2015
738. Query Optimization for Database Programming Languages
- Author
-
Valduriez, Patrick and Danforth, Scott
- Published
- 1990
739. FP-Hadoop: Efficient processing of skewed MapReduce jobs.
- Author
-
Liroz-Gistau, Miguel, Akbarinia, Reza, Agrawal, Divyakant, and Valduriez, Patrick
- Subjects
- *ELECTRONIC data processing, *BIG data, *SPARK (Computer program language), *COMPUTER software execution, *COMPUTER network resources
- Abstract
Nowadays, we are witnessing the fast production of very large amounts of data, particularly by the users of online systems on the Web. However, processing this big data is very challenging since both space and computational requirements are hard to satisfy. One solution for dealing with such requirements is to take advantage of parallel frameworks, such as MapReduce or Spark, that make it possible to build powerful computing and storage units on top of ordinary machines. Although these key-based frameworks have been praised for their high scalability and fault tolerance, they show poor performance in the case of data skew. There are important cases where a high percentage of the processing on the reduce side ends up being done by only one node. In this paper, we present FP-Hadoop, a Hadoop-based system that renders the reduce side of MapReduce more parallel by efficiently tackling the problem of reduce data skew. FP-Hadoop introduces a new phase, denoted intermediate reduce (IR), where blocks of intermediate values are processed by intermediate reduce workers in parallel. With this approach, even when all intermediate values are associated with the same key, the main part of the reducing work can be performed in parallel, taking advantage of the computing power of all available workers. We implemented a prototype of FP-Hadoop and conducted extensive experiments over synthetic and real datasets. We achieved excellent performance gains compared to native Hadoop, e.g. more than 10 times in reduce time and 5 times in total execution time. [ABSTRACT FROM AUTHOR]
- Published
- 2016
740. Erratum to: ForCE: Is Estimation of Data Completeness Through Time Series Forecasts Feasible?
- Author
-
Endler, Gregor, Baumgärtel, Philipp, Wahl, Andreas M., Lenz, Richard, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Tadeusz, Morzy, editor, Valduriez, Patrick, editor, and Bellatreche, Ladjel, editor
- Published
- 2015
741. Elastic scalable transaction processing in LeanXcale.
- Author
-
Jimenez-Peris, Ricardo, Burgos-Sancho, Diego, Ballesteros, Francisco, Patiño-Martinez, Marta, and Valduriez, Patrick
- Subjects
*DATABASES, *QUALITY of service, *WEB services, *SCALABILITY, *ELASTICITY
- Abstract
Scaling ACID transactions in a cloud database is hard, and providing elastic scalability even harder. In this paper, we present our solution for elastic scalable transaction processing in LeanXcale, an industrial-strength NewSQL database system. Unlike previous solutions, it does not require any hardware assistance. Yet, it scales linearly to 100s of servers. LeanXcale supports non-intrusive elasticity and can move data partitions without hurting the quality of service of transaction management. We show the correctness of LeanXcale transaction management. Finally, we provide a thorough performance evaluation of our solution on Amazon Web Services (AWS) shared cloud instances. The results show linear scalability, e.g., 5 million TPC-C NewOrder TPM with 200 nodes, which is greater than the TPC-C throughput obtained by the 9th highest result in history, which used dedicated hardware reserved exclusively for the benchmark (not shared, as in our evaluation). Furthermore, the efficiency in terms of TPM per core is double that of the two top TPC-C results (also the only results in a cloud). • A complete solution for elastic scalable transaction processing in LeanXcale. • Linear scalability to 100s of servers, without requiring any hardware assistance. • Non-intrusive elasticity, without hurting the quality of service of TP. • Thorough performance evaluation on AWS showing linear scalability. [ABSTRACT FROM AUTHOR]
- Published
- 2022
742. The leganet system: Freshness-aware transaction routing in a database cluster
- Author
-
Gançarski, Stéphane, Naacke, Hubert, Pacitti, Esther, and Valduriez, Patrick
- Subjects
*DATABASES, *APPLICATION service providers, *INFORMATION technology, *INTERNET industry
- Abstract
We consider the use of a database cluster for an Application Service Provider (ASP). In the ASP context, applications and databases can be update-intensive and must remain autonomous. In this paper, we describe the Leganet system, which performs freshness-aware transaction routing in a database cluster. We use multi-master replication and relaxed replica freshness to improve load balancing. Our transaction routing takes into account the freshness requirements of queries at the relation level and uses a cost function that considers the cluster load and the cost of refreshing replicas to the required level. We implemented the Leganet prototype on an 11-node Linux cluster running Oracle8i. Using experimentation and emulation up to 128 nodes, our validation based on the TPC-C benchmark demonstrates the performance benefits of our approach. [Copyright Elsevier]
- Published
- 2007
743. Análise de dados científicos sobre múltiplas fontes de dados ao longo da execução de simulações computacionais
- Author
-
Sousa, Vitor Silva, Oliveira, Daniel Cardoso Moraes de, Valduriez, Patrick, Lima, Alexandre de Assis Bento, Boeres, Maria Cristina Silva, Azevedo, Leonardo Guerreiro, and Mattoso, Marta Lima de Queirós
- Subjects
Computational simulations, Data analysis, Data mining, EXACT AND EARTH SCIENCES::COMPUTER SCIENCE::MATHEMATICS OF COMPUTING::ANALYTICAL AND SIMULATION MODELS [CNPQ]
- Abstract
Large-scale computational simulations are characterized by the chaining of programs that execute increasingly complex computational models. Much of the data produced by these programs needs to be analyzed by scientific domain users to validate their scientific hypotheses. However, this is not a trivial task, since other programs must be developed to access and capture these scientific data. In many cases, users also need to relate data produced by different simulation programs. This thesis proposes an approach that monitors, debugs, and analyzes the flow of data elements produced by different simulation programs. We also propose a component-based architecture, named ARMFUL, to extract and relate the scientific data generated in these simulation steps, using a dataflow abstraction and techniques for scientific data capture. ARMFUL's components can be instantiated in a scientific workflow system (e.g., A-Chiron) or in a library of components (e.g., DfAnalyzer). We evaluate these instances using simulations in high performance computing environments. Our experimental results show that the approach introduces negligible overhead relative to the simulation execution time, while also allowing complex queries over the scientific data.
- Published
- 2018
744. Distributed management of scientific workflows for high-throughput plant phenotyping
- Author
-
Cohen-Boulakia, Sarah, Heidsieck, Gaetan, Pacitti, Esther, Tardieu, François, Pradal, Christophe, and Valduriez, Patrick
- Subjects
Vegetal Biology
- Abstract
High-throughput phenotyping platforms allow acquisition of quantitative data on thousands of plants required for genetic analyses in well-controlled environmental conditions. However, analysing these massive datasets and reproducing computational experiments require the use of new computational infrastructure and algorithms to scale.
- Published
- 2018
745. Optimisation des requêtes skyline multidimensionnelles
- Author
-
Kamnang Wanko, Patrick, Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), Université de Bordeaux, Nicolas Hanusse, Sofian Maabout, STAR, ABES, Hanusse, Nicolas, Maabout, Sofian, Valduriez, Patrick, Amann, Bernd, Hacid, Mohand Saïd, and Milani, Alessia
- Subjects
[INFO.INFO-OH] Computer Science [cs]/Other [cs.OH], Optimization, Size, Skycuboid, Functional Dependency, Skyline, Cardinality, Skycube
- Abstract
As part of the selection of the best items in a multidimensional database, several kinds of queries have been defined. The skyline operator has the advantage of not requiring the definition of a scoring function to rank tuples. However, this operator does not satisfy the monotonicity property, which (i) makes its queries difficult to optimize in a multidimensional context and (ii) makes the size of the query result hard to estimate. This work first addresses the question of estimating the size of the result of a given skyline query, formulating estimators with good statistical properties (unbiased or convergent). It then provides two different approaches to optimizing multidimensional skyline queries. The first relies on a well-known database concept: functional dependencies. The second resembles a data compression method. The experimental results confirm the value of both algorithms. Finally, we address the issue of skyline queries over dynamic data by adapting one of our previous solutions to this setting.
- Published
- 2017
746. Interactive Execution for Large Scale Computational Experiments
- Author
-
Dias, Jonas, Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE-UFRJ), Universidade Federal do Rio de Janeiro (UFRJ), Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Universidade Federal de Rio de Janeiro, Marta Mattoso, Patrick Valduriez(Patrick.Valduriez@inria.fr), CNPq-INRIA HOSCAR (2012-2015), INRIA associated team Sarava (2009-2011), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Inria Sophia Antipolis - Méditerranée (CRISAM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), and Valduriez, Patrick
- Subjects
[INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB], [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], optimization, parallel execution, parallelization, Data-centric scientific workflows, workflow algebra, HPC, cluster
- Abstract
To tackle the exploratory nature of science and the dynamic process involved in scientific analysis, dynamic workflows have been identified as an open challenge, as they are subject to continuous adaptation and improvement. In particular, they require the ability to adapt a scientific workflow at runtime, based on external events such as human interaction. Supporting dynamic iteration is an important step towards dynamic workflows, since user interaction with a workflow is iterative. However, current support for iteration in scientific workflows is static and does not allow for runtime changes in data such as filter criteria or error thresholds. In this thesis, we propose an algebraic approach to support data-centric iteration in dynamic workflows and a dynamic execution model for these operators. We introduce the concept of iteration lineage so that provenance data management is consistent with dynamic changes in the workflow. Lineage also enables scientists to interact with workflow data and configuration at runtime through two steering algorithms implemented in Chiron. We evaluate our approach using real large-scale workflows in a large-scale environment. The results show execution time savings of up to 24 days when compared to a traditional non-iterative workflow execution. We also perform complex queries for partial result analysis along the iterations, and we assess the maximum overhead introduced by our iterative model as 3.63% of execution time. Our proposed steering algorithms run in less than 1 millisecond in the worst-case scenario we measured.
- Published
- 2013
747. Recommandation Pair-à-Pair pour Communautés en Ligne à Grande Echelle
- Author
-
Draidi, Fady, Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Montpellier II - Sciences et Techniques du Languedoc, Esther Pacitti(Esther.Pacitti@lirmm.fr), Projet ANR DataRing, programme VERSO 2009, Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), and Valduriez, Patrick
- Subjects
[INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB], P2P system, social networks, large-scale data management, recommendation system (RS), online communities, information retrieval
- Abstract
Recommendation systems (RS) and P2P are complementary in easing large-scale data sharing: RS to filter and personalize users' demands, and P2P to build decentralized large-scale data sharing systems. However, many challenges need to be overcome when building scalable, reliable and efficient RS atop P2P. In this work, we focus on large-scale communities, where users rate the contents they explore and store in their local workspace high-quality content related to their topics of interest. Our goal is to provide a novel and efficient P2P-RS for this context. We exploit users' topics of interest (automatically extracted from users' contents and ratings) and social data (friendship and trust) as parameters to construct and maintain a social P2P overlay and to generate recommendations. The thesis addresses several related issues. First, we focus on the design of a scalable P2P-RS, called P2Prec, by leveraging collaborative and content-based filtering recommendation approaches. We then propose the construction and maintenance of a dynamic P2P overlay using different gossip protocols. Our performance experimentation results show that P2Prec achieves good recall with acceptable query processing load and network traffic. Second, we consider a more complex infrastructure in order to build and maintain a social P2P overlay, called F2Frec, which exploits social relationships between users. In this new infrastructure, we leverage content-based and social-based filtering in order to get a scalable P2P-RS that yields high-quality and reliable recommendation results. Based on our extensive performance evaluation, we show that F2Frec increases recall, as well as the trust and confidence in the results, with acceptable overhead. Finally, we describe our prototype of P2P-RS, which we developed to validate our proposal based on P2Prec and F2Frec.
- Published
- 2012
748. Efficient XML query processing
- Author
-
Ioana Manolescu, Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Integration of data and knowledge distributed over the web (GEMO), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Paris Sud - Paris XI, and Valduriez Patrick(patrick.valduriez@inria.fr)
- Subjects
P2P, query processing, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, [INFO.INFO-HC] Computer Science [cs]/Human-Computer Interaction [cs.HC], XML, query optimization
- Abstract
The increase in popularity of the XML format for representing data brought multiple challenges to the data management community, in particular concerning efficient query evaluation. At the same time, XML technologies are highly connected to distribution and data exchange, leading to the need for new distributed data management techniques. We present some results of our work on issues related to XML query processing. We first introduce a new language for describing materialized views, called XML Access Modules or XAMs, and a complete approach for XQuery evaluation based on XAM-specified views. We then discuss optimization techniques for ActiveXML, a language for the integration of distributed XML streams. Finally, we discuss architectures for handling large volumes of XML data on distributed hash tables, in the KadoP and ViP2P platforms developed in the Gemo group. In summary, the first part of the work concerns optimizing access to XML data in centralized databases, and the second part considers large-scale distributed architectures for sharing XML data.
749. Distributed in-memory data management for workflow executions.
- Author
-
Souza R, Silva V, Lima AAB, de Oliveira D, Valduriez P, and Mattoso M
- Abstract
Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a Parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs provide user steering support, i.e., they allow users to run data analyses and, depending on the results, adapt the workflows at runtime. A challenge in the parallel execution control design is to manage workflow data for efficient executions while enabling user steering support. Data access for high scalability is typically transaction-oriented, while for data analysis, it is online analytical-oriented so that managing such hybrid workloads makes the challenge even harder. In this work, we present SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering. We propose a distributed data design for scalable workflow task scheduling and high availability driven by a parallel and distributed in-memory DBMS. To evaluate our proposal, we develop d-Chiron, a WMS designed according to SchalaDB's principles. We carry out an extensive experimental evaluation on an HPC cluster with up to 960 computing cores. Among other analyses, we show that even when running data analyses for user steering, SchalaDB's overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data. Our results encourage workflow engine developers to follow a parallel and distributed data-oriented approach not only for scheduling and monitoring but also for user steering. Competing Interests: Daniel Oliveira and Marta Mattoso are Academic Editors of PeerJ CS. Renan Souza is employed by IBM Research and Vitor Sousa is employed by Snap Research. (© 2021 Souza et al.)
- Published
- 2021
Catalog
Discovery Service for Jio Institute Digital Library