142 results for "Sascha Hunold"
Search Results
2. Analysis and prediction of performance variability in large-scale computing systems.
3. pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations.
4. Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI.
5. Algorithm Selection of MPI Collectives Considering System Utilization.
6. Exploring Mapping Strategies for Co-allocated HPC Applications.
7. Synchronizing MPI Processes in Space and Time.
8. Verifying Performance Guidelines for MPI Collectives at Scale.
9. Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures.
10. mpisee: MPI Profiling for Communication and Communicator Structure.
11. An Overhead Analysis of MPI Profiling and Tracing Tools.
12. A Quantitative Analysis of OpenMP Task Runtime Systems.
13. OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning.
14. Teaching Complex Scheduling Algorithms.
15. MicroBench Maker: Reproduce, Reuse, Improve.
16. Predicting MPI Collective Communication Performance Using Machine Learning.
17. Decomposing MPI Collectives for Exploiting Multi-lane Communication.
18. Efficient Process-to-Node Mapping Algorithms for Stencil Computations.
19. Collectives and Communicators: A Case for Orthogonality: (Or: How to get rid of MPI neighbor and enhance Cartesian collectives).
20. Benchmarking Julia's Communication Performance: Is Julia HPC Ready or Full HPC?
21. Cartesian Collective Communication.
22. Hierarchical Clock Synchronization in MPI.
23. Autotuning MPI Collectives using Performance Guidelines.
24. Algorithm Selection of MPI Collectives Using Machine Learning Techniques.
25. LigandScout Remote: A New User-Friendly Interface for HPC and Cloud Resources.
26. Predicting the Energy-Consumption of MPI Applications at Scale Using Only a Single Node.
27. Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia.
28. Efficient Process-to-Node Mapping Algorithms for Stencil Computations.
29. Automatic Verification of Self-consistent MPI Performance Guidelines.
30. On the Expected and Observed Communication Performance with MPI Derived Datatypes.
31. Scheduling Independent Moldable Tasks on Multi-Cores with GPUs.
32. On Expected and Observed Communication Performance with MPI Derived Datatypes.
33. Isomorphic, Sparse MPI-like Collective Communication Operations for Parallel Stencil Computations.
34. On the Impact of Synchronizing Clocks and Processes on Benchmarking MPI Collectives.
35. MPI collective communication through a single set of interfaces: A case for orthogonality.
36. A Quantitative Analysis of OpenMP Task Runtime Systems.
37. Reproducible MPI Benchmarking is Still Not as Easy as You Think.
38. Implementing a Classic: Zero-Copy All-to-All Communication with MPI Datatypes.
39. One Step Toward Bridging the Gap Between Theory and Practice in Moldable Task Scheduling with Precedence Constraints.
40. Scheduling Moldable Tasks with Precedence Constraints and Arbitrary Speedup Functions on Multiprocessors.
41. Evolutionary Scheduling of Parallel Tasks Graphs onto Homogeneous Clusters.
42. From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling.
43. BPEL Remote Objects: Integrating BPEL Processes into Object-Oriented Applications.
44. Low-Cost Tuning of Two-Step Algorithms for Scheduling Mixed-Parallel Applications onto Homogeneous Clusters.
45. Combining Object-Oriented Design and SOA with Remote Objects over Web Services.
46. Jedule: A Tool for Visualizing Schedules of Parallel Applications.
47. Tuning MPI Collectives by Verifying Performance Guidelines.
48. Reducing the Class Coupling of Legacy Code by a Metrics-Based Relocation of Class Members.
49. Pattern-Based Refactoring of Legacy Software Systems.
50. Load Balancing Concurrent BPEL Processes by Dynamic Selection of Web Service Endpoints.