Search for "Hiroyuki Takizawa": 47 results in total

Search Constraints

Author: "Hiroyuki Takizawa"; Topic: parallel computing
Search Results

1. Evaluating I/O Acceleration Mechanisms of SX-Aurora TSUBASA

2. Potential of a modern vector supercomputer for practical applications: performance evaluation of SX-ACE

3. Toward Dynamic Load Balancing across OpenMP Thread Teams for Irregular Workloads

4. An Automatic MPI Process Mapping Method Considering Locality and Memory Congestion on NUMA Systems

5. Scaling Performance for N-Body Stream Computation with a Ring of FPGAs

6. Performance Evaluation of Different Implementation Schemes of an Iterative Flow Solver on Modern Vector Machines

7. Translation of Large-Scale Simulation Codes for an OpenACC Platform Using the Xevolver Framework

10. Performance and Power Analysis of SX-ACE Using HP-X Benchmark Programs

11. Vectorization-Aware Loop Optimization with User-Defined Code Transformations

12. Optimizing Energy Consumption on HPC Systems with a Multi-Level Checkpointing Mechanism

13. A Capacity-Aware Thread Scheduling Method Combined with Cache Partitioning to Reduce Inter-Thread Cache Conflicts

14. The Importance of Dynamic Load Balancing among OpenMP Thread Teams for Irregular Workloads

15. A User-Defined Code Transformation Approach to Overlapping MPI Communication with Computation

16. A cache partitioning mechanism to protect shared data for CMPs

17. Parallel processing of the Building-Cube Method on a GPU platform

18. Performance of SOR methods on modern vector and scalar processors

19. Characteristics of an On-Chip Cache on NEC SX Vector Architecture

20. A Case Study of User-Defined Code Transformations for Data Layout Optimizations

21. A Verification Framework for Streamlining Empirical Auto-Tuning

22. Performance Evaluation of Compiler-Assisted OpenMP Codes on Various HPC Systems

23. Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing

24. Efficient parallel processing of competitive learning algorithms

25. Performance Evaluation of an OpenMP Parallelization by Using Automatic Parallelization Information

26. An energy optimization method for vector processing mechanisms

27. A Compiler-Assisted OpenMP Migration Method Based on Automatic Parallelizing Information

28. A Comparison of Performance Tunabilities between OpenCL and OpenACC

29. A flexible insertion policy for dynamic cache resizing mechanisms

30. clMPI: An OpenCL extension for interoperation with the message passing interface

31. Analysing the Performance Improvements of Optimizations on Modern HPC Systems

32. Improving the scalability of transparent checkpointing for GPU computing systems

33. Performance Evaluation of a Next-Generation CFD on Various Supercomputing Systems

34. Performance and Scalability Analysis of a Chip Multi Vector Processor

35. Power-Aware Dynamic Cache Partitioning for CMPs

36. Cache partitioning strategies for 3-D stacked vector processors

37. Automatic Tuning of CUDA Execution Parameters for Stencil Processing

38. CheCUDA: A Checkpoint/Restart Tool for CUDA Applications

39. Performance evaluation of NEC SX-9 using real science and engineering applications

40. Performance tuning and analysis of future vector processors based on the roofline model

41. 3D on-chip memory for the vector architecture

42. Effects of MSHR and Prefetch Mechanisms on an On-Chip Cache of the Vector Architecture

43. Modeling of cache access behavior based on Zipf's law

44. First Experiences with NEC SX-9

45. An on-chip cache design for vector processors

46. Implications of Memory Performance for Highly Efficient Supercomputing of Scientific Applications

47. A Stream Programming Language for GPU Computing
