TAZeR: Hiding the Cost of Remote I/O in Distributed Scientific Workflows
- Source :
- IEEE BigData, Web of Science
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- Many scientific workflows access data derived from specialized instruments. When the data is analyzed, it is accessed over wide area networks, creating bottlenecks from long access latencies. We ask the question: assuming that data must be accessed remotely, can latencies be hidden without application change? We present TAZeR, a remote I/O framework that reduces effective data access latency. TAZeR transparently converts POSIX I/O into operations that interleave application work with data transfer, i.e., read prefetching and write stage-out. TAZeR ensures read data moves directly to application memory without synchronous intervention (soft zero-copy). TAZeR uses distributed bandwidth-aware staging to exploit data reuse across application tasks and to manage the capacity constraints of fast hierarchical storage. We evaluate TAZeR on a High Energy Physics workflow where two 1 Gb/s WAN links request remote data at 48 Gb/s using non-streaming access patterns. TAZeR is $12\times$ and $22\times$ faster than XRootD (state-of-the-art) and file copies (current approach), respectively, and within 7% of optimal. We explore conditions under which TAZeR can hide I/O accesses by showing performance as effective staging sizes change.
- Subjects :
- Input/output
Computer science
020206 networking & telecommunications
02 engineering and technology
computer.software_genre
Data access
Workflow
POSIX
0202 electrical engineering, electronic engineering, information engineering
Operating system
020201 artificial intelligence & image processing
Latency (engineering)
computer
Data transmission
Details
- Database :
- OpenAIRE
- Journal :
- 2019 IEEE International Conference on Big Data (Big Data)
- Accession number :
- edsair.doi.dedup.....c67c42c35b52cb5a0a63ca92863ea42b
- Full Text :
- https://doi.org/10.1109/bigdata47090.2019.9006418