768 results for "Translation lookaside buffer"
Search Results
2. Pinning Page Structure Entries to Last-Level Cache for Fast Address Translation
- Author
-
Osang Kwon, Yongho Lee, and Seokin Hong
- Subjects
Address translation, page walk, translation lookaside buffer, virtual memory
- Abstract
As the memory footprint of emerging applications continues to grow, address translation becomes a critical performance bottleneck owing to frequent misses in the Translation Lookaside Buffer (TLB). The TLB miss penalty is also becoming more severe because the hierarchical (radix) page table is gaining levels to extend the address space. To reduce TLB misses, modern high-performance processors employ a multi-level TLB structure with a large last-level TLB. A large last-level TLB reduces misses, but its capacity is still limited and it incurs chip-area overhead. In this paper, we propose a Page Structure Entry (PSE) pinning mechanism that provides a large PSE store by dedicating some space in the last-level cache to page structure entries only. PSE Pinning is based on three key observations. First, memory-intensive applications suffer frequent misses in the last-level cache, so most of its space is poorly utilized. Second, most PSEs are fetched from main memory during the page table walk, meaning the cache lines holding PSEs are frequently evicted from on-chip caches. Finally, a small number of PSEs are accessed frequently while the rest are not. Exploiting these observations, PSE Pinning pins the frequently accessed page structure entries in the last-level cache so that they stay resident. Experimental results show that PSE Pinning improves the performance of memory-intensive workloads suffering frequent L2 TLB misses by 7.8% on average. A minimal sketch of the pinning policy follows this entry.
- Published
- 2022
- Full Text
- View/download PDF
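The abstract above reduces to a policy decision made during page walks: count accesses to page structure entries (PSEs) and keep the hot ones resident in the last-level cache. The sketch below is a minimal software model of that decision only; the slot table, hash, and threshold are invented for illustration and are not the paper's parameters.

```c
/* Toy model of PSE pinning: count page-walk accesses to page-structure
 * entries and pin the hot ones so a modeled LLC never evicts them.
 * All names and thresholds here are illustrative. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define N_PSE         64      /* tracked page-structure entries        */
#define PIN_THRESHOLD 8       /* accesses before an entry gets pinned  */

struct pse_slot {
    uint64_t pa;              /* physical address of the PSE           */
    unsigned hits;            /* access counter from page walks        */
    bool     pinned;          /* excluded from LLC eviction if set     */
};

static struct pse_slot table[N_PSE];

/* Called on every simulated page-walk access to a PSE. */
static void pse_touch(uint64_t pa)
{
    struct pse_slot *s = &table[(pa >> 3) % N_PSE];  /* trivial hash */
    if (s->pa != pa) { s->pa = pa; s->hits = 0; s->pinned = false; }
    if (++s->hits >= PIN_THRESHOLD && !s->pinned) {
        s->pinned = true;     /* a real design would mark the LLC line */
        printf("pinning PSE at %#llx after %u walks\n",
               (unsigned long long)pa, s->hits);
    }
}

int main(void)
{
    /* A hot PSE accessed repeatedly crosses the threshold and is pinned. */
    for (int i = 0; i < 10; i++)
        pse_touch(0x1000);
    return 0;
}
```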
3. Accelerating Address Translation for Virtualization by Leveraging Hardware Mode.
- Author
-
Sha, Sai, Zhang, Yi, Luo, Yingwei, Wang, Xiaolin, and Wang, Zhenlin
- Subjects
- *WALKING speed, *HARDWARE, *VIRTUAL machine systems
- Abstract
The overhead of memory virtualization remains nontrivial. Traditional shadow paging (TSP) uses a shadow page table (SPT) to achieve native page walk speed, but page table updates require hypervisor intervention. Alternatively, nested paging enables low-overhead page table updates but relies on the hardware MMU to perform a long-latency two-dimensional page walk. This paper proposes new memory virtualization solutions based on hardware (machine) mode, the highest CPU privilege level in architectures such as Sunway and RISC-V. A programming interface running in hardware mode enables software implementation of hardware support functions. We first propose Software-based Nested Paging (SNP), which extends the software MMU to perform a two-dimensional page walk in hardware mode. Second, we present Swift Shadow Paging (SSP), which accomplishes page table synchronization by intercepting TLB flushes in hardware mode. Finally, we propose Accelerated Shadow Paging (ASP), which combines SSP and SNP. ASP handles last-level SPT page faults by walking the two-dimensional page tables in hardware mode, eliminating most hypervisor interventions. This paper systematically compares multiple memory virtualization models by analyzing their designs and evaluating their performance on both a real system and a simulator. The experiments show that the virtualization overhead of ASP is less than 4.5% for all workloads. [ABSTRACT FROM AUTHOR] A toy model of the two-dimensional walk follows this entry.
- Published
- 2022
- Full Text
- View/download PDF
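The "two-dimensional page walk" the abstract contrasts with shadow paging is easiest to see in code: every guest page-table access must itself be translated by the host before it can be read. Below is a deliberately tiny model, assuming two guest levels, a host dimension flattened to a single lookup, and "addresses" that are just array indices; all structures are invented for illustration.

```c
/* Toy nested (two-dimensional) page walk. Real hardware walks a
 * multi-level host table per step; here the host dimension is one
 * lookup so the structure stays visible. */
#include <stdint.h>
#include <stdio.h>

static uint64_t pmem[64 * 8];     /* host memory: 64 frames x 8 entries */
static uint64_t host_frame[64];   /* host map: guest frame -> host frame */

/* One step of the second dimension: guest-physical -> host-physical. */
static uint64_t g2h(uint64_t gpa)
{
    return host_frame[gpa / 8] * 8 + gpa % 8;
}

/* Two guest levels, 3 index bits each; every guest PTE read needs a
 * host translation first, and the final guest PA needs one more. */
static uint64_t nested_walk(uint64_t gva, uint64_t guest_root_gpa)
{
    uint64_t gpa = guest_root_gpa;
    for (int level = 0; level < 2; level++) {
        uint64_t idx = (gva >> (3 * (1 - level))) & 7;
        gpa = pmem[g2h(gpa) + idx];          /* guest PTE fetch */
    }
    return g2h(gpa);                         /* final translation */
}

int main(void)
{
    for (uint64_t f = 0; f < 64; f++) host_frame[f] = f; /* identity host map */
    pmem[0] = 8;        /* guest root, index 0 -> level-1 table at GPA 8 */
    pmem[8 + 5] = 21;   /* level-1 table, index 5 -> data at GPA 21      */
    printf("gva 5 -> hpa %llu\n",
           (unsigned long long)nested_walk(5, 0));       /* prints 21 */
    return 0;
}
```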
4. Concurrency Control Algorithms for Translation Lookaside Buffer
- Author
-
Agarwal, Manisha, Jailia, Manisha, Kacprzyk, Janusz, Series Editor, Fong, Simon, editor, Akashe, Shyam, editor, and Mahalle, Parikshit N., editor
- Published
- 2019
- Full Text
- View/download PDF
5. MemCAM: A Hybrid Memristor-CMOS CAM Cell for On-Chip Caches
- Author
-
Zareen Sadiq and Shehzad Hasan
- Subjects
Memristor content-addressable memory, memristor crossbar, translation lookaside buffer, miss rate reduction
- Abstract
Non-volatile nanoscale memory devices such as memristors promise to overcome the scalability and leakage-current challenges of CMOS-based memory devices. These novel memories can be fabricated in the back end of line of any CMOS process. Currently, much research focuses on the benefits of memristors for associative memories, i.e., Content-Addressable Memories (CAM), in which data access is search-based. Searching for a particular bit in a memristor is time-consuming, while search in the CMOS CAM zone is efficient. To combine the speed and ease of search of CMOS memory with the scalability of memristor memory, we present a novel multibit hybrid CMOS-memristor associative memory cell. The benefits of such memory cells manifest in on-chip caches: the instruction and data caches, the Branch Target Buffer, and the Translation Lookaside Buffer. To further exemplify the benefit of the cell, we also simulate MemCAM as the TLB of an ARM processor and obtained up to a 50% decrease in Data TLB miss rates and up to 93% in Instruction TLB miss rates. An average speedup of 1.16 was also achieved on various benchmark applications from the PARSEC and MiBench suites.
- Published
- 2021
- Full Text
- View/download PDF
6. Page Table Compaction for TLB Coalescing
- Author
-
Jae Young Hur and Joonho Kong
- Subjects
Architecture, memory management, page table, performance, translation lookaside buffer
- Abstract
In the traditional page-based memory management scheme, frequent page-table walks degrade performance and memory bandwidth utilization. A translation lookaside buffer (TLB) coalescing scheme mitigates these problems by using the TLB efficiently and exploiting contiguity in physical memory. In modern hardware, a memory transaction usually accesses multiple data concurrently, yet state-of-the-art TLB coalescing schemes do not fully utilize this data-level parallelism, so performance and memory bandwidth utilization can still be degraded by page-table walk overheads. To alleviate these overheads, we propose compaction of allocated memory blocks (CAMB) in the page table. The proposed scheme significantly reduces page-table walks by exploiting the data-level parallelism in hardware and block-level allocation in the operating system. We present a design, an analysis, a case study, an implementation, and an evaluation, with experiments on image processing workloads. The results indicate the presented scheme improves performance and memory bandwidth utilization at modest cost. A small contiguity-check sketch follows this entry.
- Published
- 2020
- Full Text
- View/download PDF
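TLB coalescing, which the entry above builds on, hinges on a simple contiguity test: a group of adjacent virtual pages whose PTEs map adjacent physical frames can share one coalesced TLB entry. A minimal sketch, assuming a fixed aligned group size; the paper's actual compaction scheme is more involved.

```c
/* Contiguity test behind TLB coalescing: K adjacent virtual pages
 * mapping K adjacent physical frames can share one coalesced entry. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define COALESCE 4   /* pages per coalesced entry (aligned group) */

/* pte[vpn] holds the physical frame number for virtual page vpn. */
static bool coalescable(const uint64_t *pte, uint64_t vpn)
{
    uint64_t base = vpn & ~(uint64_t)(COALESCE - 1);  /* group start */
    for (unsigned i = 1; i < COALESCE; i++)
        if (pte[base + i] != pte[base] + i)  /* frames must be contiguous */
            return false;
    return true;
}

int main(void)
{
    uint64_t pte[8] = { 100, 101, 102, 103, 200, 999, 202, 203 };
    printf("group of vpn 1 coalescable: %d\n", coalescable(pte, 1)); /* 1 */
    printf("group of vpn 5 coalescable: %d\n", coalescable(pte, 5)); /* 0 */
    return 0;
}
```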
7. ReconOS
- Author
-
Agne, Andreas, Platzner, Marco, Plessl, Christian, Happe, Markus, Lübbers, Enno, Koch, Dirk, editor, Hannig, Frank, editor, and Ziener, Daniel, editor
- Published
- 2016
- Full Text
- View/download PDF
8. Building Code Randomization Defenses
- Author
-
Davi, Lucas, Sadeghi, Ahmad-Reza, Zdonik, Stan, Series editor, Shekhar, Shashi, Series editor, Katz, Jonathan, Series editor, Wu, Xindong, Series editor, Jain, Lakhmi C., Series editor, Padua, David, Series editor, Shen, Xuemin Sherman, Series editor, Furht, Borko, Series editor, Subrahmanian, V.S., Series editor, Hebert, Martial, Series editor, Ikeuchi, Katsushi, Series editor, Siciliano, Bruno, Series editor, Jajodia, Sushil, Series editor, Lee, Newton, Series editor, Davi, Lucas, and Sadeghi, Ahmad-Reza
- Published
- 2015
- Full Text
- View/download PDF
9. Improved Tool Support for Machine-Code Decompilation in HOL4
- Author
-
Fox, Anthony, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Urban, Christian, editor, and Zhang, Xingyuan, editor
- Published
- 2015
- Full Text
- View/download PDF
10. BIOS and Management Firmware
- Author
-
Gough, Corey, Steiner, Ian, Saunders, Winston, Gough, Corey, Steiner, Ian, and Saunders, Winston
- Published
- 2015
- Full Text
- View/download PDF
11. Making Information Hiding Effective Again
- Author
-
Yueqiang Cheng, Chenggang Wu, Kang Yan, Yinqian Zhang, Bowen Tang, Zhe Wang, Mengyao Xie, Yuanming Lai, Zhiping Shi, and Pen-Chung Yew
- Subjects
Computer science, Information hiding, Code reuse, Translation lookaside buffer, Code (cryptography), Overhead (computing), Cache, Side channel attack, Electrical and Electronic Engineering, Computer security, Block (data storage)
- Abstract
Information hiding (IH) is an important building block for many defenses against code reuse attacks, such as code-pointer integrity (CPI), control-flow integrity (CFI), and fine-grained code (re-)randomization, because of its effectiveness and performance. It employs randomization to probabilistically "hide" sensitive memory areas, called safe areas, from attackers and ensures their addresses are not leaked by any pointer directly. These defenses use safe areas to protect critical data such as jump targets and randomization secrets. However, recent work has shown that IH is vulnerable to various attacks. In this paper, we propose a new IH technique called SafeHidden. It continuously re-randomizes the locations of safe areas, preventing attackers from probing and inferring the memory layout to find them. A new thread-private memory mechanism isolates thread-local safe areas and prevents adversaries from reducing the randomization entropy. SafeHidden also re-randomizes the safe areas after TLB misses to prevent attackers from inferring their addresses through cache side channels. Existing IH-based defenses can adopt SafeHidden directly without any change. Our experiments show that SafeHidden not only prevents existing attacks effectively but also incurs low performance overhead. A user-level sketch of the relocation step follows this entry.
- Published
- 2022
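The relocation step referenced in the abstract can be illustrated at user level with ordinary Linux primitives: allocate a fresh anonymous mapping at a kernel-chosen (randomized) address, copy the safe area over, and unmap the old one. This shows only the visible effect; SafeHidden's actual triggers (TLB-miss monitoring, thread-private memory) are hardware/kernel mechanisms omitted here.

```c
/* Relocating a "safe area" to a fresh random address, sketched with
 * plain mmap/munmap. Linux-specific; error handling kept minimal. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define AREA_SZ 4096

static void *rerandomize(void *old)
{
    /* mmap with addr=NULL lets the kernel pick a randomized location. */
    void *fresh = mmap(NULL, AREA_SZ, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (fresh == MAP_FAILED)
        return old;
    memcpy(fresh, old, AREA_SZ);   /* preserve the secrets it holds */
    munmap(old, AREA_SZ);          /* old location is now unmapped  */
    return fresh;
}

int main(void)
{
    void *area = mmap(NULL, AREA_SZ, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return 1;
    strcpy((char *)area, "randomization secret");
    printf("safe area at %p\n", area);
    area = rerandomize(area);
    printf("safe area moved to %p: %s\n", area, (char *)area);
    return 0;
}
```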
12. Data Layout in Main Memory
- Author
-
Plattner, Hasso and Plattner, Hasso
- Published
- 2014
- Full Text
- View/download PDF
13. Enhancing Instruction TLB Resilience to Soft Errors.
- Author
-
Sanchez-Macian, Alfonso, Aranda, Luis Alberto, Reviriego, Pedro, Kiani, Vahdaneh, and Maestro, Juan Antonio
- Subjects
- *CACHE memory, *SOFT errors, *DATA corruption, *ERROR correction (Information theory), *VIRTUAL private networks
- Abstract
A translation lookaside buffer (TLB) is a type of cache used to speed up virtual-to-physical memory translation. Instruction TLBs store virtual page numbers and their corresponding physical page numbers for the most recently accessed pages of instruction memory. TLBs, like other memories, suffer soft errors that can corrupt their contents. A false positive caused by an error in a virtual page number stored in the TLB may lead to a wrong translation and, consequently, execution of a wrong instruction, which can cause a program hard fault or data corruption. Parity and error correction codes have been proposed to protect the TLB, but they require additional storage. This paper presents schemes that increase instruction TLB resilience to such errors without any extra storage, by taking advantage of the spatial locality that arises when executing a program. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
14. Improving Instruction TLB Reliability with Efficient Multi-bit Soft Error Protection.
- Author
-
Kiani, Vahdaneh and Reviriego, Pedro
- Subjects
- *SOFT errors, *CACHE memory, *VIRTUAL storage (Computer science), *FAULT-tolerant computing, *STATISTICAL reliability
- Abstract
A Translation Lookaside Buffer (TLB) is a memory cache that stores recent virtual-to-physical translations to reduce access latency. Every access to virtual memory must be translated to the corresponding physical address, so the TLB is accessed very frequently. Consequently, soft errors that corrupt TLB contents can lead to hard faults, silent data corruption, and system freezes. Many studies have proposed protection for the Content Addressable Memory (CAM), the part of a TLB that stores the virtual page numbers (VPNs), but these techniques mostly do not cover multiple errors. This paper presents an efficient, fast, high-coverage approach to improving TLB reliability against Multiple Bit Upsets (MBUs) that also improves performance at low cost. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
15. Diversifying the Software Stack Using Randomized NOP Insertion
- Author
-
Jackson, Todd, Homescu, Andrei, Crane, Stephen, Larsen, Per, Brunthaler, Stefan, Franz, Michael, Jajodia, Sushil, editor, Ghosh, Anup K., editor, Subrahmanian, V.S., editor, Swarup, Vipin, editor, Wang, Cliff, editor, and Wang, X. Sean, editor
- Published
- 2013
- Full Text
- View/download PDF
16. Xeon Phi Core Microarchitecture
- Author
-
Rahman, Rezaur and Rahman, Rezaur
- Published
- 2013
- Full Text
- View/download PDF
17. Superprocessors and Supercomputers
- Author
-
Roth, Peter Hans, Jacobi, Christian, Weber, Kai, and Hoefflinger, Bernd, editor
- Published
- 2012
- Full Text
- View/download PDF
18. Efficient classification of private memory blocks
- Author
-
Bhargavi R. Upadhyay, Alberto Ros, and Jalpa Shah
- Subjects
Multi-core processor, Computer Networks and Communications, Computer science, CPU cache, Translation lookaside buffer, Directory, Theoretical Computer Science, Computer architecture, Shared memory, Artificial Intelligence, Hardware and Architecture, Granularity, Latency (engineering), Software
- Abstract
Shared memory architectures are pervasive in the multicore era, yet sequential and parallel applications use most of their data as private. Recent proposals exploiting this observation, driven by a private/shared classification of memory data, can reduce the coherence directory area or the memory access latency. The effectiveness of these proposals depends on the accuracy of the classification. Existing proposals classify at page granularity, which misclassifies data and reduces the number of detected private memory blocks. We propose a mechanism that accurately classifies memory blocks using the existing translation lookaside buffers (TLBs), increasing the effectiveness of proposals that rely on a private/shared classification. Our experimental results show that the proposed scheme reduces L1 cache misses by 25% compared to a page-grain classification approach, which translates into an 8.0% improvement in system performance over the page-grain approach. A toy block-granularity classifier follows this entry.
- Published
- 2021
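At its simplest, the block-granularity classification referenced above tracks the first core that touches a block and promotes the block to shared when a different core appears. A toy model with invented table sizes; a real design derives this state from TLB contents rather than a standalone table.

```c
/* Toy private/shared classifier at block granularity. */
#include <stdio.h>

#define N_BLOCKS 16
enum state { UNTOUCHED, PRIVATE, SHARED };

static enum state st[N_BLOCKS];
static int owner[N_BLOCKS];

static void access_block(int block, int core)
{
    switch (st[block]) {
    case UNTOUCHED: st[block] = PRIVATE; owner[block] = core; break;
    case PRIVATE:   if (owner[block] != core) st[block] = SHARED; break;
    case SHARED:    break;                 /* stays shared forever */
    }
}

int main(void)
{
    access_block(3, 0); access_block(3, 0);   /* stays private   */
    access_block(7, 0); access_block(7, 1);   /* becomes shared  */
    printf("block 3: %s\n", st[3] == PRIVATE ? "private" : "shared");
    printf("block 7: %s\n", st[7] == PRIVATE ? "private" : "shared");
    return 0;
}
```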
19. Adjusting Switching Granularity of Load Balancing for Heterogeneous Datacenter Traffic
- Author
-
Weihe Li, Lyu Wenjun, Wenchao Jiang, Tian He, Jianxin Wang, Jiawei Huang, Zhaoyi Li, and Jinbin Hu
- Subjects
Computer Networks and Communications, Computer science, Network packet, Distributed computing, Translation lookaside buffer, Bisection bandwidth, Throughput, Load balancing (computing), Computer Science Applications, Load management, Bandwidth (computing), Granularity, Electrical and Electronic Engineering, Software
- Abstract
State-of-the-art datacenter load balancing designs commonly optimize bisection bandwidth with a homogeneous switching granularity. Their performance surprisingly degrades under mixed traffic containing both short and long flows: short flows suffer long-tailed delay, while the throughput of long flows degrades dramatically due to low link utilization and packet reordering. To solve these problems, we design a traffic-aware load balancing (TLB) scheme that adaptively adjusts the switching granularity of long flows according to the load of short ones. Under heavy short-flow load, long flows use a large switching granularity to give short flows more opportunities to choose short queues and complete quickly. Conversely, long flows reroute flexibly with a small switching granularity to achieve high throughput. Furthermore, under extremely bursty scenarios, we apply packet slicing to long flows to release bandwidth for short ones. NS2 simulations and a testbed implementation show that TLB reduces the average flow completion time of short flows by 16%-67% over state-of-the-art load balancers while maintaining high throughput for long flows. For the extreme bursty case, at an acceptable throughput degradation for long flows, TLB with packet slicing reduces the deadline miss ratio of bursty short flows by up to 80%.
- Published
- 2021
20. One Size Fits all, Again! The Architecture of the Hybrid OLTP&OLAP Database Management System HyPer
- Author
-
Kemper, Alfons, Neumann, Thomas, van der Aalst, Wil, Series editor, Mylopoulos, John, Series editor, Rosemann, Michael, Series editor, Shaw, Michael J., Series editor, Szyperski, Clemens, Series editor, Castellanos, Malu, editor, Dayal, Umeshwar, editor, and Markl, Volker, editor
- Published
- 2011
- Full Text
- View/download PDF
21. WCET-Aware Assembly Level Optimizations
- Author
-
Lokuciejewski, Paul, Marwedel, Peter, Lokuciejewski, Paul, and Marwedel, Peter
- Published
- 2011
- Full Text
- View/download PDF
22. The Power Processing Element (PPE)
- Author
-
Koranne, Sandeep and Koranne, Sandeep
- Published
- 2009
- Full Text
- View/download PDF
23. Algorithm Optimizations: Low Computational Complexity
- Author
-
Novak, Miroslav, Singh, Sameer, editor, Tan, Zheng-Hua, and Lindberg, Børge
- Published
- 2008
- Full Text
- View/download PDF
24. TPE: A Hardware-Based TLB Profiling Expert for Workload Reconstruction
- Author
-
Liwei Zhou, Yunjie Zhang, and Yiorgos Makris
- Subjects
Profiling (computer programming), Computer science, Translation lookaside buffer, Hypervisor, Workload, Simics, Software, Benchmark (computing), Instrumentation (computer programming), Electrical and Electronic Engineering, Computer hardware
- Abstract
We propose TPE, a hardware-based framework for workload execution forensics in microprocessors. TPE leverages custom hardware instrumentation to capture the operational profile of the Translation Lookaside Buffer (TLB) and processes this information off-line through machine learning and/or deep learning to identify the executed processes and reconstruct the workload. Unlike software-based forensics implemented at the operating system (OS) or hypervisor level, whose data logging and monitoring mechanisms may be compromised through software attacks, TPE is implemented directly in hardware and therefore provides innate immunity to software tampering. A prototype of TPE is demonstrated in Linux on two representative architectures, 32-bit x86 and 64-bit RISC-V, implemented in the Simics and Spike simulation environments respectively. Experimental results using the MiBench workload suite show favorable process identification accuracy at a low logging rate, corroborating the effectiveness and generalizability of TPE. A minimal event-logging sketch follows this entry.
- Published
- 2021
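The logging half of such a profiler can be modeled as a ring buffer of TLB-miss events that is drained offline for classification. A minimal sketch with an invented event format and a simulated hardware hook; TPE's real instrumentation lives in hardware.

```c
/* Ring buffer of (VPN, cycle) TLB-miss events; oldest entries are
 * overwritten, and the buffer is drained offline. */
#include <stdint.h>
#include <stdio.h>

#define RING 8
struct tlb_event { uint64_t vpn, cycle; };

static struct tlb_event ring[RING];
static unsigned head;

static void on_tlb_miss(uint64_t vpn, uint64_t cycle)
{
    ring[head % RING] = (struct tlb_event){ vpn, cycle };
    head++;
}

int main(void)
{
    for (uint64_t c = 0; c < 12; c++)
        on_tlb_miss(0x400000 + (c % 3) * 0x1000, c);  /* 3 hot pages */
    unsigned n = head < RING ? head : RING;
    for (unsigned i = 0; i < n; i++)                  /* offline drain */
        printf("vpn=%#llx cycle=%llu\n",
               (unsigned long long)ring[i].vpn,
               (unsigned long long)ring[i].cycle);
    return 0;
}
```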
25. Pioneer: Verifying Code Integrity and Enforcing Untampered Code Execution on Legacy Systems
- Author
-
Seshadri, Arvind, Luk, Mark, Perrig, Adrian, van Doorn, Leendert, Khosla, Pradeep, Jajodia, Sushil, editor, Christodorescu, Mihai, editor, Jha, Somesh, editor, Maughan, Douglas, editor, Song, Dawn, editor, and Wang, Cliff, editor
- Published
- 2007
- Full Text
- View/download PDF
26. Energy-Effective Instruction Fetch Unit for Wide Issue Processors
- Author
-
Aragón, Juan L., Veidenbaum, Alexander V., Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Srikanthan, Thambipillai, editor, Xue, Jingling, editor, and Chang, Chip-Hong, editor
- Published
- 2005
- Full Text
- View/download PDF
27. A Fetch Policy Maximizing Throughput and Fairness for Two-Context SMT Processors
- Author
-
Sun, Caixia, Tang, Hongwei, Zhang, Minxuan, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Cao, Jiannong, editor, Nejdl, Wolfgang, editor, and Xu, Ming, editor
- Published
- 2005
- Full Text
- View/download PDF
28. BabelFish: Fusing Address Translations for Containers
- Author
-
Umur Darbaz, Dimitrios Skarlatos, Bhargava Gopireddy, Nam Sung Kim, and Josep Torrellas
- Subjects
Computer science, Overhead (engineering), Cloud computing, Execution time, Latency (engineering), Electrical and Electronic Engineering, Translation lookaside buffer, Hardware and Architecture, Virtual machine, Virtual memory, Container (abstract data type), Operating system, Page table, Software
- Abstract
Cloud computing has begun a transformation from virtual machines to containers. Containers are attractive because many of them can share a single kernel while adding minimal performance overhead, and cloud providers leverage their lean nature to run hundreds on a few cores. Containers also enable the serverless paradigm, which creates many short-lived processes. In this work, we identify that containerized environments create page translations that are extensively replicated across containers in the TLB and in page tables. The result is high TLB pressure and redundant kernel work during page table management. To remedy this situation, we propose BabelFish, a novel architecture that shares page translations across containers in the TLB and in page tables. We evaluate BabelFish with simulations of an 8-core processor running a set of Docker containers under conservative container co-location. On average, under BabelFish, 53% of the translations in containerized workloads and 93% of the translations in serverless workloads are shared. As a result, BabelFish reduces the mean and tail latency of containerized data-serving workloads by 11% and 18%, respectively, lowers the execution time of containerized compute workloads by 11%, and reduces serverless function bring-up time by 8% and execution time by 10%-55%. A sketch of the shared-entry lookup rule follows this entry.
- Published
- 2021
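The sharing the abstract describes implies a lookup rule along these lines: a translation entry either belongs to one container or is flagged as shared and matches any container, so identical translations replicated across containers collapse into one entry. The entry layout and IDs below are our own illustration, not BabelFish's actual design.

```c
/* TLB entry with a private owner ID or a "shared" flag. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct tlb_entry {
    uint64_t vpn, pfn;
    int      owner;      /* container ID, ignored when shared        */
    bool     shared;     /* translation identical across containers  */
    bool     valid;
};

static bool tlb_hit(const struct tlb_entry *e, uint64_t vpn, int container)
{
    return e->valid && e->vpn == vpn &&
           (e->shared || e->owner == container);
}

int main(void)
{
    struct tlb_entry lib  = { 0x7f00, 0x1234, 0, true,  true };
    struct tlb_entry heap = { 0x5000, 0x2222, 1, false, true };
    printf("%d %d %d\n",
           tlb_hit(&lib,  0x7f00, 0),   /* 1: shared, any container  */
           tlb_hit(&lib,  0x7f00, 9),   /* 1: shared, any container  */
           tlb_hit(&heap, 0x5000, 2));  /* 0: private to container 1 */
    return 0;
}
```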
29. Introduction to High-Performance Memory Systems
- Author
-
Hadimioglu, Haldun, Kaeli, David, Kuskin, Jeffrey, Nanda, Ashwini, Torrellas, Josep, Hadimioglu, Haldun, editor, Kuskin, Jeffrey, editor, Torrellas, Josep, editor, Kaeli, David, editor, and Nanda, Ashwini, editor
- Published
- 2004
- Full Text
- View/download PDF
30. Towards an Asynchronous MIPS Processor
- Author
-
Zhang, Qianyi, Theodoropoulos, Georgios, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Omondi, Amos, editor, and Sedukhin, Stanislav, editor
- Published
- 2003
- Full Text
- View/download PDF
31. T
- Author
-
Kajan, Ejub and Kajan, Ejub
- Published
- 2002
- Full Text
- View/download PDF
32. Estimation of tea leaf blight severity in natural scene images
- Author
-
Yan Zhang, Kang Wei, Dong Liang, Gensheng Hu, and Wenxia Bao
- Subjects
Conditional random field, Spots, Translation lookaside buffer, food and beverages, Pattern recognition, Convolutional neural network, Robustness (computer science), Metric (mathematics), Blight, Segmentation, Artificial intelligence, General Agricultural and Biological Sciences, Mathematics
- Abstract
Tea leaf blight (TLB) is a common tea disease that seriously affects the quality and yield of tea. Accurate estimation of TLB severity can guide tea farmers in spraying pesticides appropriately. This study proposes a method for estimating TLB severity in natural scene images, consisting of four main steps: segmentation of the diseased leaves, area fitting of the diseased leaves, segmentation of the disease spots, and estimation of disease severity. Target leaves with TLB are segmented by combining a U-Net network with a fully connected conditional random field to reduce the influence of complex backgrounds. An ellipse restoration method generates an elliptic mask to fit the full size of occluded or damaged TLB leaves. Disease spot regions are segmented from the TLB leaves by a support vector machine classifier to calculate an Initial Disease Severity (IDS) index. The IDS index, color features, and texture features of the TLB leaves are fed into a metric learning model to estimate the final disease severity. Experimental results show that the proposed method achieves higher estimation accuracy and stronger robustness to occluded and damaged TLB leaves than conventional convolutional neural network methods and classical machine learning techniques.
- Published
- 2021
34. Modeling and Analysis of the Page Sizing Problem for NVM Storage in Virtualized Systems
- Author
-
Yunjoo Park and Hyokyung Bahn
- Subjects
General Computer Science, Page fault, Computer science, Overhead (computing), General Materials Science, address translation, Electrical and Electronic Engineering, memory performance, Translation lookaside buffer, General Engineering, Virtualization, Non-volatile memory, Memory management, Page size, Operating system, NVM, Access time
- Abstract
Recently, NVM (non-volatile memory) has emerged as a fast storage medium, and traditional memory management systems designed for HDD storage should be reconsidered. In this article, we revisit the page sizing problem for NVM storage, focusing on virtualized systems. Page sizing has not attracted attention in traditional systems for two reasons. First, memory performance is insensitive to the page size when HDD is the storage medium; we show this is not the case for NVM storage by analyzing the TLB miss rate and the page fault rate, which trade off against each other as the page size varies. Second, changing the page size in traditional systems is difficult because it incurs significant overhead. With the widespread adoption of virtualization, however, page sizing becomes feasible for virtual machines, which are created to execute specific workloads with fixed hardware resources. In this article, we design a page size model that accurately estimates the TLB miss rate and the page fault rate for NVM storage. We then present a method that estimates the memory access time as the page size varies, which can guide the choice of a suitable page size for a given environment. By considering workload characteristics together with the given memory and storage resources, we show that the memory performance of virtualized systems can be improved by 38.4% when our model is adopted. A toy version of this trade-off model follows this entry.
- Published
- 2021
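The trade-off the abstract models can be made concrete with a toy cost function: TLB misses fall as pages grow (greater TLB reach) while page-fault cost rises, so expected access time is minimized at some intermediate page size. All constants below are made up for illustration and are not the paper's model.

```c
/* Toy expected-access-time model: U-shaped in page size. */
#include <stdio.h>

static double expected_access_ns(double page_kb)
{
    double tlb_miss   = 0.05 / page_kb;      /* misses drop with reach */
    double page_fault = 1e-7 * page_kb;      /* faults grow with size  */
    double walk_ns = 80.0, fault_ns = 2e4;   /* NVM-backed fault cost  */
    return 1.0 + tlb_miss * walk_ns + page_fault * fault_ns;
}

int main(void)
{
    /* Sweep 4 KB .. 1 MB; the minimum lands at an intermediate size. */
    for (double kb = 4; kb <= 1024; kb *= 4)
        printf("%6.0f KB pages -> %.4f ns/access\n",
               kb, expected_access_ns(kb));
    return 0;
}
```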
35. Detecting Hardware-Assisted Virtualization With Inconspicuous Features
- Author
-
Yueqiang Cheng, Yi Zou, Zhi Zhang, Dongxi Liu, Yansong Gao, and Surya Nepal
- Subjects
Computer Networks and Communications, Computer science, Translation lookaside buffer, Hardware-assisted virtualization, Cloud computing, Transparency (human–computer interaction), Virtualization, Virtual machine, Operating system, Malware, Cache, Safety, Risk, Reliability and Quality
- Abstract
Recent years have witnessed the proliferation of virtualization techniques. Virtualization is designed to be transparent: unprivileged users should not be able to detect whether a system is virtualized. Such detection can pose serious security threats, such as evading virtual machine (VM)-based dynamic malware analysis and exploiting vulnerabilities for cross-VM attacks. Traditional software-based virtualization leaves numerous artifacts/fingerprints that can be exploited without much effort to detect it. In contrast, today's mainstream hardware-assisted virtualization significantly enhances transparency, making it much harder to detect. Nonetheless, we showcase three newly identified, low-level, inconspicuous features that an unprivileged adversary can leverage to detect hardware-assisted virtualization effectively and stealthily. All three features come from chipset fingerprints rather than traces of software-based virtualization implementations (e.g., Xen or KVM): i) the Translation Lookaside Buffer (TLB) stores an extra layer of address translations; ii) the Last-Level Cache (LLC) caches one more layer of page-table entries; and iii) the Level-1 Data (L1D) cache is unstable. Based on these features, we develop three corresponding virtualization detection techniques, which we evaluate comprehensively on three native environments and three popular cloud providers: Amazon Elastic Compute Cloud, Google Compute Engine, and Microsoft Azure. Experimental results validate that these three adversarial detection techniques are effective (with no false positives) and stealthy (without triggering suspicious system events, e.g., VM-exit) in detecting the above commodity virtualized environments.
- Published
- 2021
36. Monolithic 3D-Based SRAM/MRAM Hybrid Memory for an Energy-Efficient Unified L2 TLB-Cache Architecture
- Author
-
Young-Ho Gong
- Subjects
General Computer Science, CPU cache, Computer science, General Materials Science, Static random-access memory, energy efficiency, Monolithic 3D, Magnetoresistive random-access memory, Random access memory, Translation lookaside buffer, General Engineering, Cache-only memory architecture, cache memory, SRAM, MRAM, Memory management, Embedded system, Efficient energy use
- Abstract
Monolithic 3D (M3D) integration has emerged as a promising technology for fine-grained 3D stacking. Because M3D integration offers vias with nanometer-scale dimensions, it is well suited to small microarchitectural blocks such as caches, register files, and translation look-aside buffers (TLBs). However, M3D integration requires a low-temperature process for the stacked layers, which lowers the performance of stacked transistors relative to a conventional 2D process. In contrast, non-volatile memory (NVM) such as magnetic RAM (MRAM) is natively fabricated at low temperature, enabling M3D integration without performance degradation. In this paper, we propose an energy-efficient unified L2 TLB-cache architecture exploiting an M3D-based SRAM/MRAM hybrid memory. Since this hybrid memory consumes much less energy than conventional 2D SRAM-only memory and 2D SRAM/MRAM hybrid memory while providing comparable performance, the proposed architecture significantly improves energy efficiency. In particular, because it repartitions the unified L2 TLB-cache according to the L2 cache miss rate, it maximizes energy efficiency for parallel workloads with extremely high L2 cache miss rates. In our analysis with PARSEC benchmark applications, the proposed architecture reduces the energy consumption of the L2 TLB plus L2 cache by up to 97.7% (53.6% on average) compared to a baseline with 2D SRAM-only memory, with negligible impact on performance. It further reduces memory access energy by up to 32.8% (10.9% on average) by cutting memory accesses caused by TLB misses. A sketch of the repartitioning policy follows this entry.
- Published
- 2021
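The repartitioning referenced in the abstract can be sketched as a simple feedback rule: when the L2 cache miss rate is high, cache capacity is paying off poorly, so capacity of the unified structure is shifted to the TLB side, and back again when caching is effective. Thresholds and way counts below are invented for illustration.

```c
/* Miss-rate-driven repartitioning of a unified TLB-cache structure. */
#include <stdio.h>

#define TOTAL_WAYS 16

static int tlb_ways = 4;   /* remaining ways hold cache data */

static void repartition(double l2_miss_rate)
{
    if (l2_miss_rate > 0.60 && tlb_ways < TOTAL_WAYS - 2)
        tlb_ways += 2;     /* cache barely helps: grow TLB coverage */
    else if (l2_miss_rate < 0.20 && tlb_ways > 2)
        tlb_ways -= 2;     /* cache is effective: give ways back    */
}

int main(void)
{
    double samples[] = { 0.75, 0.80, 0.15, 0.10 };
    for (int i = 0; i < 4; i++) {
        repartition(samples[i]);
        printf("miss rate %.2f -> TLB %d / cache %d ways\n",
               samples[i], tlb_ways, TOTAL_WAYS - tlb_ways);
    }
    return 0;
}
```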
37. Improving the Precise Interrupt Mechanism of Software- Managed TLB Miss Handlers
- Author
-
Jaleel, Aamer, Jacob, Bruce, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Monien, Burkhard, editor, Prasanna, Viktor K., editor, and Vajapeyam, Sriram, editor
- Published
- 2001
- Full Text
- View/download PDF
38. Content-Based Prefetching: Initial Results
- Author
-
Cooksey, Robert, Colarelli, Dennis, Grunwald, Dirk, Goos, G., editor, Hartmanis, J., editor, van Leeuwen, J., editor, Chong, Frederic T., editor, Kozyrakis, Christoforos, editor, and Oskin, Mark, editor
- Published
- 2001
- Full Text
- View/download PDF
39. An Architectural and Circuit-Level Approach to Improving the Energy Efficiency of Microprocessor Memory Structures
- Author
-
Albonesi, David H., Silveira, Luis Miguel, editor, Devadas, Srinivas, editor, and Reis, Ricardo, editor
- Published
- 2000
- Full Text
- View/download PDF
40. High-Resolution Weather Forecasting: A Teraflop Sustained on RISC/cache or Vector Processors
- Author
-
Thomas, S. J., Desgagné, M., Valin, M., Pollard, Andrew, editor, Mewhort, Douglas J. K., editor, and Weaver, Donald F., editor
- Published
- 2000
- Full Text
- View/download PDF
41. NWCache: Optimizing disk accesses via an optical network/write cache hybrid
- Author
-
Carrera, Enrique V., Bianchini, Ricardo, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Rolim, José, editor, Mueller, Frank, editor, Zomaya, Albert Y., editor, Ercal, Fikret, editor, Olariu, Stephan, editor, Ravindran, Binoy, editor, Gustafsson, Jan, editor, Takada, Hiroaki, editor, Olsson, Ron, editor, Kale, Laxmikant V., editor, Beckman, Pete, editor, Haines, Matthew, editor, ElGindy, Hossam, editor, Caromel, Denis, editor, Chaumette, Serge, editor, Fox, Geoffrey, editor, Pan, Yi, editor, Li, Keqin, editor, Yang, Tao, editor, Chiola, G., editor, Conte, G., editor, Mancini, L. V., editor, Méry, Domenique, editor, Sanders, Beverly, editor, Bhatt, Devesh, editor, and Prasanna, Viktor, editor
- Published
- 1999
- Full Text
- View/download PDF
42. Summary
- Author
-
Wieferink, Andreas, Meyr, Heinrich, Leupers, Rainer, Wieferink, Andreas, Meyr, Heinrich, and Leupers, Rainer
- Published
- 2008
- Full Text
- View/download PDF
43. When Storage Response Time Catches Up With Overall Context Switch Overhead, What Is Next?
- Author
-
Tei-Wei Kuo, Ming-Chang Yang, Chun-Feng Wu, and Yuan-Hao Chang
- Subjects
Random access memory, Page fault, CPU cache, Computer science, Translation lookaside buffer, Response time, Computer Graphics and Computer-Aided Design, Memory management, Embedded system, Virtual memory, Central processing unit, Electrical and Electronic Engineering, Software, Context switch
- Abstract
The virtual memory technique provides a large and cheap memory space by extending memory with storage devices. It uses context switches to swap pages asynchronously between memory and storage, hiding the long response time of storage devices when a page fault occurs. However, the overall context switch overhead is high: the context switch itself is a complex function, and it further incurs TLB shootdowns/flushes and compulsory CPU cache misses afterwards. Meanwhile, as high-end storage devices rapidly become more responsive, we observe that their response time is catching up with, and gradually dropping below, the overall context switch overhead. At this turning point, to further enhance system responsiveness, we advocate swapping synchronously rather than context switching in response to page faults. We also propose a strategy, called shadow huge page management, that further improves overall system performance by minimizing the time overheads caused by page faults and page swapping. Evaluation results show that the proposed system efficiently reduces total wasted CPU time. A sketch of the turning-point decision follows this entry.
- Published
- 2020
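The turning-point argument above reduces to a one-line comparison a kernel could make per device: if the storage response time is below the overall context-switch overhead (the switch itself plus subsequent TLB shootdowns and cold caches), block synchronously. The two cost figures in this sketch are made-up inputs a kernel would measure.

```c
/* Choose synchronous swapping vs. context switching on a page fault. */
#include <stdbool.h>
#include <stdio.h>

static bool swap_synchronously(double storage_ns, double ctx_switch_ns)
{
    return storage_ns < ctx_switch_ns;   /* blocking beats rescheduling */
}

int main(void)
{
    double ctx = 20000;                  /* overall switch overhead, ns */
    printf("HDD    (10 ms): %s\n",
           swap_synchronously(1e7, ctx) ? "sync" : "context switch");
    printf("fast SSD (8 us): %s\n",
           swap_synchronously(8e3, ctx) ? "sync" : "context switch");
    return 0;
}
```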
44. ecoTLB
- Author
-
Tushar Krishna, Steffen Maass, Taesoo Kim, Mohan Kumar, and Abhishek Bhattacharjee
- Subjects
Computer science, Address space, Translation lookaside buffer, Linux kernel, Asynchrony (computer programming), Hardware and Architecture, Asynchronous communication, Operating system, Isolation (database systems), Interrupt, Page table, Software, Information Systems
- Abstract
We propose ecoTLB, software-based eventual translation lookaside buffer (TLB) coherence, which eliminates the overhead of the synchronous TLB shootdown mechanism in operating systems that use address space identifiers (ASIDs). With eventual TLB coherence, ecoTLB improves the performance of free and page swap operations by removing the inter-processor interrupt (IPI) overhead incurred to invalidate TLB entries. We show that TLB shootdown has particular implications for page swapping in emerging, disaggregated data centers, and demonstrate that ecoTLB's asynchronous mechanism can improve both performance and swapping policy decisions. ecoTLB improves the performance of real-world applications that perform page swapping, such as Memcached and Make, using Infiniswap, a solution for next-generation data centers with disaggregated memory, by up to 17.2%, and improves the 99th percentile tail latency of Memcached by up to 70.8% thanks to its asynchronous scheme and improved policy decisions. Furthermore, recent security features in the Linux kernel, such as kernel page table isolation (KPTI), can impose significant overhead on architectures that lack instructions to clear single entries in tagged TLBs and must fall back to full TLB flushes; in this scenario, ecoTLB recovers the performance lost to supporting KPTI through its asynchronous shootdown scheme and its support for tagged TLBs. Finally, ecoTLB improves the performance of free operations by up to 59.1% on a 120-core machine, and improves the performance of Apache on a 16-core machine by up to 13.7% over baseline Linux and by up to 48.2% over ABIS, a recent state-of-the-art research prototype that reduces the number of IPIs. A minimal rendering of the deferred-shootdown idea follows this entry.
- Published
- 2020
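The deferred-shootdown idea in the abstract can be rendered minimally with a per-address-space generation counter: unmapping bumps the generation instead of sending IPIs, and each CPU flushes lazily when it next switches in and notices a stale generation. This is our own simplification of eventual TLB coherence, not ecoTLB's actual implementation.

```c
/* Lazy TLB coherence via a generation counter instead of IPIs. */
#include <stdio.h>

#define NCPU 4

static unsigned long asid_gen = 1;    /* bumped on unmap/free        */
static unsigned long seen_gen[NCPU];  /* per-CPU last-flushed gen    */

static void unmap_pages(void)
{
    asid_gen++;                       /* no IPI: coherence deferred  */
}

static void context_switch_in(int cpu)
{
    if (seen_gen[cpu] != asid_gen) {
        printf("cpu%d: lazy TLB flush (gen %lu)\n", cpu, asid_gen);
        seen_gen[cpu] = asid_gen;
    }
}

int main(void)
{
    unmap_pages();            /* free() path: just bump the generation */
    context_switch_in(0);     /* flushes lazily                        */
    context_switch_in(0);     /* already coherent: no flush            */
    context_switch_in(2);     /* flushes lazily                        */
    return 0;
}
```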
45. Intel Microprocessor Hardware Virtualization Technologies
- Author
-
Yu. Povstiana, N. Khrystynets, M. Dovgonyuk, N. Cherniashchuk, and O. Miskevych
- Subjects
Computer Networks and Communications, Computer science, Translation lookaside buffer, Memory controller, Hardware and Architecture, Memory virtualization, Virtual memory, Operating system, Cache, Cache hierarchy, Software
- Abstract
This paper examines features of Intel's Nehalem microprocessor architecture: the memory controller, cache hierarchy, TLB, and memory access organization. It presents the technical characteristics of LGA1156-socket processors within one model range, reports test results for this architecture, and investigates its methods of virtual memory organization.
- Published
- 2020
46. Object-Level Memory Allocation and Migration in Hybrid Memory Systems
- Author
-
Bingsheng He, Hai Jin, Liu Renshan, Haikun Liu, Yu Zhang, and Xiaofei Liao
- Subjects
Random access memory, Source code, Computer science, Translation lookaside buffer, Static memory allocation, Theoretical Computer Science, Non-volatile memory, Memory management, Computational Theory and Mathematics, Hardware and Architecture, Embedded system, Overhead (computing), Cache, Software, DRAM, Data migration
- Abstract
Hybrid memory systems composed of emerging non-volatile memory (NVM) and DRAM have drawn increasing attention in recent years. To fully exploit the advantages of both NVM and DRAM, a primary goal is to place application data properly across the hybrid memories. Previous studies have focused on page migration schemes to achieve higher performance and energy efficiency. However, those schemes all rely on costly online page access monitoring, and data migration at page granularity can cause additional overhead due to DRAM bandwidth contention and the maintenance of cache/TLB consistency. In this article, we present Object-level memory Allocation and Migration (OAM) mechanisms for hybrid memory systems. OAM uses a profiling tool to characterize objects' memory access patterns at different execution phases of an application, and applies a performance/energy model to direct the initial static memory allocation and runtime dynamic object migration between NVM and DRAM. Based on our newly developed programming interfaces for hybrid memory systems, application source code can be transformed automatically via static code instrumentation. We evaluate OAM on an emulated hybrid memory system; experimental results show that OAM reduces the system energy-delay product by 61 percent on average compared to a page-interleaving data placement scheme. It also reduces data migration overhead by 83 and 69 percent compared to the state-of-the-art page migration schemes CLOCK-DWF and 2PP, respectively, while improving application performance by up to 22 and 10 percent.
- Published
- 2020
47. Formal Reasoning Under Cached Address Translation
- Author
-
Hira Taqdees Syeda and Gerwin Klein
- Subjects
Programming language, Computer science, Address space, Translation lookaside buffer, Classical logic, Memory management unit, Computational Theory and Mathematics, Artificial Intelligence, Cache, Page table, Software, Context switch, Abstraction (linguistics)
- Abstract
Operating system (OS) kernels achieve isolation between user-level processes using hardware features such as multi-level page tables and translation lookaside buffers (TLBs). The TLB caches address translations, so correctly controlling the TLB is a fundamental security property of OS kernels; yet all large-scale formal OS verification projects we are aware of leave the correct functionality of the TLB as an assumption. In this paper, we present a verified sound abstraction of a detailed concrete model of the memory management unit (MMU) of the ARMv7-A architecture. This MMU abstraction revamps our previous address-space-specific MMU abstraction to include new software-visible TLB features, such as caching of globally-mapped and partial translation entries in a two-stage TLB. We use this abstraction as the underlying model to develop a logic for reasoning about low-level programs in the presence of cached address translation. We extract invariants and necessary conditions for correct TLB operation that mirror the informal reasoning of OS engineers, and we systematically show how these invariants adapt to global and partial translation entries. We show that our program logic reduces to a standard logic for user-level reasoning, reduces to side-condition checks for kernel-level reasoning, and can handle typical OS kernel tasks such as context switching.
- Published
- 2020
48. TLB Coalescing for Multi-Grained Page Migration in Hybrid Memory Systems
- Author
-
Xiaoyuan Wang, Haikun Liu, Xiaofei Liao, Yu Zhang, and Hai Jin
- Subjects
TLB coalescing, General Computer Science, Computer science, Virtual memory, Parallel computing, Memory systems, hybrid memory system, Overhead (computing), General Materials Science, Electrical and Electronic Engineering, page migration, multiple page size, Translation lookaside buffer, General Engineering, Memory management, Cache, DRAM
- Abstract
Superpages have long been proposed to enlarge the coverage of the translation lookaside buffer (TLB). They are extremely beneficial for reducing address translation overhead in big-memory systems, such as hybrid memory systems composed of DRAM and non-volatile memories (NVMs). However, superpages conflict with fine-grained memory migration, one of the key techniques hybrid memory systems use to improve performance and energy efficiency: fine-grained page migration usually requires splintering superpages, negating the benefit of TLB hardware support for them. In this paper, we present Tamp, an efficient memory management mechanism that supports multiple page sizes in hybrid memory systems. We manage large-capacity NVM with superpages and use a relatively small DRAM to cache hot base pages within those superpages. We find that hot base pages within superpages exhibit remarkable contiguity; in response, we bind contiguous hot pages together and migrate them to DRAM as a group. We also propose multi-grained TLBs that coalesce multiple page address translations into a single TLB entry. Our experimental results show that Tamp reduces TLB misses by 62.4% on average and improves application performance (IPC) by 16.2% compared to a page migration policy without TLB coalescing support.
- Published
- 2020
49. System Level Representation
- Author
-
Geuskens, Bibiche, Rose, Kenneth, Geuskens, Bibiche, and Rose, Kenneth
- Published
- 1998
- Full Text
- View/download PDF
50. Translation lookaside buffer management
- Author
-
Y. I. Klimiankou
- Subjects
OS kernel, TLB management, Computer science, Translation lookaside buffer, Information technology, Associative cache, physical memory, virtual memory, Memory management, Physical address, Operating system, Overhead (computing)
- Abstract
This paper focuses on Translation Lookaside Buffer (TLB) management as part of memory management. The TLB is an associative cache in modern processors that reduces the overhead of virtual-to-physical address translation. We consider the challenges of designing the TLB management subsystem of an OS kernel, using the IA-32 platform as an example, and propose a simple model of a complete and consistent TLB management policy. This model can serve as a foundation for the design and verification of memory management subsystems. A kernel-style sketch of the basic consistency rule follows this entry.
- Published
- 2019
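The central consistency rule such a model must capture is that IA-32 hardware does not keep the TLB coherent with the in-memory page tables: after software writes a PTE, it must invalidate the stale entry with invlpg (or reload CR3 for a full flush). The kernel-context sketch below uses the real invlpg instruction; it compiles with GCC on x86 but cannot execute as a user program, and the set_pte wrapper is a hypothetical update path, not any particular kernel's API.

```c
/* IA-32 TLB consistency: write the PTE, then invalidate the stale
 * cached translation. Privileged; kernel context only. */
#include <stdint.h>

static inline void flush_one(void *va)
{
    __asm__ volatile("invlpg (%0)" :: "r"(va) : "memory");
}

/* Hypothetical update path illustrating the required ordering. */
static void set_pte(volatile uint32_t *pte, uint32_t val, void *va)
{
    *pte = val;        /* in-memory page table is now current         */
    flush_one(va);     /* drop any stale TLB entry for this page      */
}
```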