434 results for "Weichen Liu"
Search Results
202. Communication optimization for thermal reliable many-core systems: work-in-progress.
- Author
-
Weichen Liu, Lei Yang 0018, Weiwen Jiang, and Nan Guan
- Published
- 2017
- Full Text
- View/download PDF
203. Fixed priority scheduling of real-time flows with arbitrary deadlines on smart NoCs: work-in-progress.
- Author
-
Weichen Liu, Peng Chen 0027, Lei Yang 0018, Mengquan Li, and Nan Guan
- Published
- 2017
- Full Text
- View/download PDF
204. On-chip sensor networks for soft-error tolerant real-time multiprocessor systems-on-chip.
- Author
-
Weichen Liu, Xuan Wang 0001, Jiang Xu 0001, Wei Zhang 0012, Yaoyao Ye, Xiaowen Wu, Mahdi Nikdast, and Zhehui Wang
- Published
- 2014
- Full Text
- View/download PDF
205. UNION: A Unified Inter/Intrachip Optical Network for Chip Multiprocessors.
- Author
-
Xiaowen Wu, Yaoyao Ye, Jiang Xu 0001, Wei Zhang 0012, Weichen Liu, Mahdi Nikdast, and Xuan Wang 0001
- Published
- 2014
- Full Text
- View/download PDF
206. Communication optimization for thermal reliable optical network-on-chip: work-in-progress.
- Author
-
Mengquan Li, Weichen Liu, Lei Yang 0018, Yiyuan Xie, Yaoyao Ye, and Nan Guan
- Published
- 2018
- Full Text
- View/download PDF
207. On the Analysis of Parallel Real-Time Tasks With Spin Locks
- Author
-
He Du, Wang Yi, Weichen Liu, Nan Guan, and Xu Jiang
- Subjects
FOS: Computer and information sciences ,Multi-core processor ,Computer science ,Distributed computing ,Workload ,02 engineering and technology ,Blocking (computing) ,020202 computer hardware & architecture ,Theoretical Computer Science ,Task (computing) ,Computer Science - Distributed, Parallel, and Cluster Computing ,Computational Theory and Mathematics ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Task analysis ,Distributed, Parallel, and Cluster Computing (cs.DC) ,Resource management (computing) ,Protocol (object-oriented programming) ,Software
- Abstract
The locking protocol is an essential component of resource management in real-time systems: it coordinates mutually exclusive accesses to shared resources from different tasks. Although the design and analysis of locking protocols have been intensively studied for sequential real-time tasks, there has been little work on this topic for parallel real-time tasks. In this article, we study the analysis of parallel real-time tasks using spin locks to protect accesses to shared resources in three commonly used request serving orders (unordered, FIFO-order, and priority-order). The key feature that makes our analysis method more accurate is that it systematically analyzes the blocking time that may delay a task's finishing time, jointly considering the impact on the total workload and the longest path length, rather than analyzing them separately and counting all blocking time as workload that delays a task's finishing time, as is commonly assumed in the state of the art.
- Published
- 2021
- Full Text
- View/download PDF
208. 3-D Mesh-Based Optical Network-on-Chip for Multiprocessor System-on-Chip.
- Author
-
Yaoyao Ye, Jiang Xu 0001, Baihan Huang, Xiaowen Wu, Wei Zhang 0012, Xuan Wang 0001, Mahdi Nikdast, Zhehui Wang, Weichen Liu, and Zhe Wang 0003
- Published
- 2013
- Full Text
- View/download PDF
209. Formal Worst-Case Analysis of Crosstalk Noise in Mesh-Based Optical Networks-on-Chip.
- Author
-
Yiyuan Xie, Mahdi Nikdast, Jiang Xu 0001, Xiaowen Wu, Wei Zhang 0012, Yaoyao Ye, Xuan Wang 0001, Zhehui Wang, and Weichen Liu
- Published
- 2013
- Full Text
- View/download PDF
210. System-Level Modeling and Analysis of Thermal Effects in Optical Networks-on-Chip.
- Author
-
Yaoyao Ye, Jiang Xu 0001, Xiaowen Wu, Wei Zhang 0012, Xuan Wang 0001, Mahdi Nikdast, Zhehui Wang, and Weichen Liu
- Published
- 2013
- Full Text
- View/download PDF
211. On-Chip Sensor Network for Efficient Management of Power Gating-Induced Power/Ground Noise in Multiprocessor System on Chip.
- Author
-
Weichen Liu, Yu Wang 0002, Xuan Wang 0001, Jiang Xu 0001, and Huazhong Yang
- Published
- 2013
- Full Text
- View/download PDF
212. A Torus-Based Hierarchical Optical-Electronic Network-on-Chip for Multiprocessor System-on-Chip.
- Author
-
Yaoyao Ye, Jiang Xu 0001, Xiaowen Wu, Wei Zhang 0012, Weichen Liu, and Mahdi Nikdast
- Published
- 2012
- Full Text
- View/download PDF
213. Coroutine-Based Synthesis of Efficient Embedded Software From SystemC Models.
- Author
-
Weichen Liu, Jiang Xu 0001, Jogesh K. Muppala, Wei Zhang 0012, Xiaowen Wu, and Yaoyao Ye
- Published
- 2011
- Full Text
- View/download PDF
214. Power Gating Aware Task Scheduling in MPSoC.
- Author
-
Yu Wang 0002, Jiang Xu 0001, Yan Xu, Weichen Liu, and Huazhong Yang
- Published
- 2011
- Full Text
- View/download PDF
215. Satisfiability Modulo Graph Theory for Task Mapping and Scheduling on Multiprocessor Systems.
- Author
-
Weichen Liu, Zonghua Gu 0001, Jiang Xu 0001, Xiaowen Wu, and Yaoyao Ye
- Published
- 2011
- Full Text
- View/download PDF
216. Hybridizing Different Local Search Algorithms with Each Other and Evolutionary Computation: Better Performance on the Traveling Salesman Problem.
- Author
-
Yuezhong Wu, Thomas Weise 0001, and Weichen Liu
- Published
- 2016
- Full Text
- View/download PDF
217. Effects of high-speed rail on the spatial agglomeration of producer services: A case study of the Yangtze River Delta urban agglomeration
- Author
-
Weichen Liu, Wei Wu, Xiaoli Li, and Zhaopei Tang
- Subjects
Delta ,Ecology ,Urban agglomeration ,Environmental protection ,Economies of agglomeration ,Geography, Planning and Development ,Earth and Planetary Sciences (miscellaneous) ,Yangtze river ,Environmental science ,Nature and Landscape Conservation
- Published
- 2021
- Full Text
- View/download PDF
218. Efficient Software Synthesis for Dynamic Single Appearance Scheduling of Synchronous Dataflow.
- Author
-
Weichen Liu, Zonghua Gu 0001, and Jiang Xu 0001
- Published
- 2009
- Full Text
- View/download PDF
219. Efficient algorithms for 2D area management and online task placement on runtime reconfigurable FPGAs.
- Author
-
Zonghua Gu 0001, Weichen Liu, Jiang Xu 0001, Jin Cui, Xiuqiang He 0001, and Qingxu Deng
- Published
- 2009
- Full Text
- View/download PDF
220. Scope-Aware Useful Cache Block Calculation for Cache-Related Pre-Emption Delay Analysis With Set-Associative Data Caches
- Author
-
Zhiping Jia, Yue Tang, Wei Zhang, Weichen Liu, Nan Guan, and Lei Ju
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer science ,Delay analysis ,Static timing analysis ,02 engineering and technology ,Parallel computing ,Calculation technique ,Computer Graphics and Computer-Aided Design ,020202 computer hardware & architecture ,Scheduling (computing) ,0202 electrical engineering, electronic engineering, information engineering ,Cache ,Electrical and Electronic Engineering ,Software ,Associative property
- Abstract
Timing analysis of real-time systems must consider cache-related pre-emption delay (CRPD) costs when pre-emptive scheduling is used. While most previous work on CRPD analysis only considers instruction caches, the CRPD incurred on data caches is actually more significant. The state-of-the-art CRPD analysis methods are based on useful cache block (UCB) calculation. Unfortunately, as shown in this article, directly extending the existing UCB calculation techniques from instruction caches to data caches leads to both unsoundness and significant imprecision. To solve these problems, we develop a new UCB calculation technique for data caches, which redefines the analysis unit (to address the unsoundness in the existing method) and precisely captures the dynamic cache access behavior by taking the temporal scopes of memory blocks into consideration. The experimental results show that our new technique yields substantially tighter CRPD estimations compared with the state of the art.
- Published
- 2020
- Full Text
- View/download PDF
221. Thermal-Aware Design and Simulation Approach for Optical NoCs
- Author
-
Wenfei Zhang, Yaoyao Ye, and Weichen Liu
- Subjects
Interconnection ,Silicon photonics ,Computer science ,Bandwidth (signal processing) ,Optical power ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,Chip ,Computer Graphics and Computer-Aided Design ,Optical switch ,020202 computer hardware & architecture ,Computer Science::Hardware Architecture ,Hardware_INTEGRATEDCIRCUITS ,0202 electrical engineering, electronic engineering, information engineering ,Electronic engineering ,System on a chip ,Electrical and Electronic Engineering ,Electrical efficiency ,Software
- Abstract
For chip multiprocessors, one major challenge is to bridge the growing gap between processor speed and global on-chip interconnect delay. By integrating optical interconnects in network-on-chip (NoC) architectures, optical NoCs can overcome the power and bandwidth bottleneck of traditional electrical on-chip networks. However, once the thermal sensitivity of silicon photonic devices used in optical NoCs is considered, optical interconnects may not have advantages in power efficiency compared with their electrical counterparts. To tackle this problem, in this article, we propose a thermal-aware design and simulation approach for optical NoCs. Key techniques include thermal-sensitive optical power loss models from device level to network level, a thermal-aware adaptive routing mechanism, and a thermal-aware simulation platform. The thermal-aware simulation platform enables optical NoC simulation together with on-chip temperature simulation as well as optical thermal effect modeling. With the proposed thermal-aware simulation platform, we conducted a case study of an $8\times 8$ mesh-based optical NoC under a set of synthetic traffic patterns as well as real applications at typical temperature scenarios. By comparing and analyzing different temperature distributions, we conclude that the approach achieves a better optimization effect for temperature distributions where the hot spots are scattered across the chip.
- Published
- 2020
- Full Text
- View/download PDF
222. Local deformation behaviour of saturated silica sand during undrained cyclic torsional shear tests using image analysis
- Author
-
Chuang Zhao, Weichen Liu, and Junichi Koseki
- Subjects
021110 strategic, defence & security studies ,Materials science ,Shear (geology) ,0211 other engineering and technologies ,Earth and Planetary Sciences (miscellaneous) ,Liquefaction ,Geotechnical engineering ,02 engineering and technology ,Geotechnical Engineering and Engineering Geology ,021101 geological & geomatics engineering
- Abstract
A series of undrained cyclic torsional shear tests was conducted to investigate the development of local deformations of silica sand specimens directly using an image-based technique and a transparent membrane. The results for saturated sand specimens with varying densities showed that the vertical slippage of sand particles relative to the membrane was initiated at the liquefaction stage. Additionally, the quantity of vertical slippage of dense specimens was lower than that of loose specimens tested under the same conditions. Moreover, measured on the surface of the specimen, the large accumulated movement during cyclic shearing significantly increased the potential for inducing vertical slippage. A comparison of local deformation among the different tests revealed that the threshold of shear strain to generate non-uniform local deformations decreased with an increase in the relative density of the specimen. A lower liquefaction resistance layer, which formed at the upper part of the specimen after initial liquefaction, would accelerate the concentration of local shear strain; this was especially distinct when the specimen was completely liquefied.
- Published
- 2020
- Full Text
- View/download PDF
223. A procedure to reach high liquefaction resistance in laboratory testing on reconstituted sand specimens using hollow cylindrical torsional shear apparatus
- Author
-
Weichen Liu and Junichi Koseki
- Subjects
Shear (sheet metal) ,Materials science ,Geotechnical engineering ,Laboratory testing ,Liquefaction resistance
- Published
- 2020
- Full Text
- View/download PDF
224. Fault-Tolerant Routing Mechanism in 3D Optical Network-on-Chip Based on Node Reuse
- Author
-
Lei Guo, Sun Wei, Pengxing Guo, Luan H. K. Duong, Hainan Bao, Weigang Hou, Weichen Liu, Chuang Liu, and School of Computer Science and Engineering
- Subjects
Interconnection ,business.industry ,Network packet ,Computer science ,Fault-Tolerant Routing Mechanism ,Throughput ,Fault tolerance ,Energy consumption ,Reuse ,Network on a chip ,Computational Theory and Mathematics ,Hardware and Architecture ,3D Optical Network-On-Chip ,Signal Processing ,Redundancy (engineering) ,Computer science and engineering [Engineering] ,Photonics ,business ,Computer network
- Abstract
Three-dimensional Networks-on-Chip (3D NoCs) have become a mature multi-core interconnection architecture in recent years. However, traditional electrical lines have very limited bandwidth and high energy consumption, making photonic interconnection promising for future 3D Optical NoCs (ONoCs). Since existing solutions cannot adequately guarantee the fault tolerance of 3D ONoCs, in this paper, we propose a reliable optical router (OR) structure which sacrifices less redundancy to obtain more restore paths. Moreover, by using our fault-tolerant routing algorithm, the restore path can be found inside the disabled OR under the deadlock-free condition, i.e., fault-node reuse. Experimental results show that the proposed approach outperforms previous related works by up to 81.1 percent, and by 33.0 percent on average, in throughput under different synthetic and real traffic patterns. It improves the system average optical signal-to-noise ratio (OSNR) by up to 26.92 percent, and by 12.57 percent on average, and it improves average energy consumption by 0.3 percent to 15.2 percent under different topology types/sizes, failure rates, OR structures, and payload packet sizes.
- Published
- 2020
- Full Text
- View/download PDF
225. Organization of river-sea container transportation in the Yangtze River: Processes and mechanisms
- Author
-
Weichen Liu, Youhui Cao, Jianglong Chen, Jiaying Guo, and Shuangbo Liang
- Subjects
Geography, Planning and Development ,Transportation ,General Environmental Science
- Published
- 2023
- Full Text
- View/download PDF
226. Aggravated chemical production of aerosols by regional transport and basin terrain in a heavy PM2.5 pollution episode over central China
- Author
-
Weiyang Hu, Yu Zhao, Tianliang Zhao, Yongqing Bai, Chun Zhao, Shaofei Kong, Lei Chen, Qiuyan Du, Huang Zheng, Wen Lu, Weichen Liu, and Xiaoyun Sun
- Subjects
Atmospheric Science ,General Environmental Science
- Published
- 2023
- Full Text
- View/download PDF
227. A Cryptocurrency Price Prediction Model Based on Twitter Sentiment Indicators
- Author
-
Zi Ye, Weichen Liu, Qiang Qu, Qingshan Jiang, and Yi Pan
- Published
- 2022
- Full Text
- View/download PDF
228. Can the Brain Drain of Elite Scientists be Reversed? Evidence from China’s Young Thousand Talents Program
- Author
-
Dongbo Shi, Weichen Liu, and Yanbo Wang
- Published
- 2022
- Full Text
- View/download PDF
229. LAMP: load-balanced multipath parallel transmission in point-to-point NoCs
- Author
-
Hui Chen, Peng Chen, Xiangzhong Luo, Shuo Huai, Weichen Liu, and School of Computer Science and Engineering
- Subjects
Computer science and engineering::Computer systems organization [Engineering] ,Load-Balancing ,Electrical and Electronic Engineering ,Network-on-Chip ,Computer Graphics and Computer-Aided Design ,Software
- Abstract
Network-on-Chip (NoC) is an emerging paradigm that is able to connect a significant amount of processing elements (PEs). However, as a distributed sub-system, NoC resources have not been exploited to the fullest. Multipath parallel transmission, which splits one message into multiple parts and sends them simultaneously, shows its efficiency in utilizing NoC resources and further reducing the transmission latency. However, this method is not fully optimized in previous works, especially for emerging point-to-point NoCs due to the following reasons: (1) only limited shortest paths are chosen; (2) static message splitting strategy without considering NoC utilization state increases contentions; (3) the optimization of hardware that supports multipath parallel transmission is missing, resulting in additional overheads. Thus, we propose LAMP, a software and hardware collaborated design to efficiently utilize resources and reduce latency in point-to-point NoCs through the load-balanced multipath parallel transmission. Specifically, we propose a reinforcement learning-based algorithm to decide when and how to split messages, and which path should be used according to traffic loads. Also, the temporal and spatial load-balancing algorithms are proposed so that the message size is adjusted properly to utilize NoC resources. Moreover, we revise the hardware design to support multipath parallel transmission efficiently. Extensive experiments show that our algorithm achieves a remarkable performance improvement (+18.0% ∼ +29.9%) when compared with the state-of-the-art dual-path algorithm. Our hardware design decreases power and area consumption by 23.2% and 10.3% over the dual-path hardware. 
Ministry of Education (MOE) Nanyang Technological University Submitted/Accepted version This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MoE2019-T2-1-071) and Tier 1 (MoE2019-T1-001-072), and Nanyang Technological University, Singapore, under its NAP (M4082282) and SUG (M4082087).
- Published
- 2022
230. Contention minimization in emerging SMART NoC via direct and indirect routes
- Author
-
Peng Chen, Yiyuan Xie, Mengquan Li, Jun Zhou, Nan Guan, Weichen Liu, Hui Chen, Chunhua Xiao, and School of Computer Science and Engineering
- Subjects
Network Routing ,Computer science ,business.industry ,Theoretical Computer Science ,Spread spectrum ,Direct route ,Computational Theory and Mathematics ,Hardware and Architecture ,End to end latency ,Task analysis ,Minimisation ,Minification ,Routing (electronic design automation) ,business ,Resource management (computing) ,Computer science and engineering::Hardware::Input/output and data communications [Engineering] ,Software ,Computer network
- Abstract
SMART (Single-cycle Multi-hop Asynchronous Repeated Traversal) Network-on-Chip (NoC), a recently proposed dynamically reconfigurable NoC, enables single-cycle long-distance communication by building single-bypass paths directly between distant communication pairs. However, such a single-cycle single-bypass path will be readily broken when contention occurs. Thus, packets will be buffered at intermediate routers with blocking latency from other contending packets, and extra router-stage latency to rebuild the remaining path when available, reducing the bypassing benefits that SMART NoC offers. In this article, we for the first time propose an effective contention-minimized routing algorithm to achieve maximal bypassing in SMART NoCs. Specifically, we identify two potential routes for packets: direct route, with which packets can reach the destination in a single bypass; and indirect route, with which packets can reach the destination in multiple bypasses via a (multiple) intermediate router(s). The novel feature of the proposed routing strategy is that, contrary to an intuitive approach, not the routes with minimal distance but the indirect routes via the arbitrary intermediate routers (even if they may be non-minimal) that avoid contentions yield the minimized end-to-end latency. Our new routing strategy can greatly enrich the path diversity, effectively minimize the conflicts between communication pairs, greatly balance the workloads and fully utilize bypass paths. Evaluation on realistic benchmarks demonstrates the effectiveness of the proposed routing strategy, which achieves average performance improvement by 35.48 percent in communication latency, 28.31 percent in application schedule length, and 37.59 percent in network throughput, compared with the current routing in SMART NoCs. 
Ministry of Education (MOE) Nanyang Technological University Submitted/Accepted version This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MoE2019-T2-1-071) and Tier 1 (MoE2019-T1-001-072), and NTU, Singapore, under its NAP (M4082282) and SUG (M4082087).
- Published
- 2022
231. ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems
- Author
-
Lei Zhang, Ravi Subramaniam, Shuo Huai, Weichen Liu, Di Liu, School of Computer Science and Engineering, 2021 58th ACM/IEEE Design Automation Conference (DAC), and HP-NTU Digital Manufacturing Corporate Lab
- Subjects
Artificial neural network ,Edge device ,Computer science ,business.industry ,Computer science and engineering::Computing methodologies::Artificial intelligence [Engineering] ,Deep learning ,Process (computing) ,Compact Learning ,ZeroBN ,Constraint (information theory) ,Computer engineering ,Electronic design automation ,Artificial intelligence ,Enhanced Data Rates for GSM Evolution ,Latency (engineering) ,business
- Abstract
Edge devices have been widely adopted to bring deep learning applications onto low-power embedded systems, mitigating the privacy and latency issues of accessing cloud servers. The increasing computational demand of complex neural network models leads to large latency on edge devices with limited resources. Many application scenarios are real-time and have a strict latency constraint, while conventional neural network compression methods are not latency-oriented. In this work, we propose a novel compact neural network training method to reduce model latency on latency-critical edge systems. A latency predictor is also introduced to guide and optimize this procedure. Coupled with the latency predictor, our method can guarantee the latency of a compact model with only one training process. The experimental results show that, compared to state-of-the-art model compression methods, our approach can fit the 'hard' latency constraint well by significantly reducing the latency with a mild accuracy drop. To satisfy a 34ms latency constraint, we compact ResNet-50 with a 0.82% accuracy drop. For GoogLeNet, we can even increase the accuracy by 0.3%.
National Research Foundation (NRF) Submitted/Accepted version This research was conducted in collaboration with HP Inc. and supported by National Research Foundation (NRF) Singapore and the Singapore Government through the Industry Alignment Fund - Industry Collaboration Projects Grant (I1801E0028).
- Published
- 2021
- Full Text
- View/download PDF
232. Face Generation using DCGAN for Low Computing Resources
- Author
-
Weichen Liu, Yuxuan Gu, and Kenan Zhang
- Published
- 2021
- Full Text
- View/download PDF
233. The Influence of Annealing Atmosphere, Blending Ratio, and Molecular Weight on the Phase Behavior of Blend Materials
- Author
-
Weichen Liu, Libin Zhang, and Yayi Wei
- Subjects
Materials science ,Process Chemistry and Technology ,Chemical technology ,Bioengineering ,molecular weight ,self-assembly ,TP1-1185 ,blend ratio ,Annealing (glass) ,Process conditions ,chemistry.chemical_compound ,Chemistry ,Chemical engineering ,chemistry ,Phase (matter) ,Copolymer ,Chemical Engineering (miscellaneous) ,Polystyrene ,Self-assembly ,Snowflake ,Annealing atmosphere ,phase separation ,annealing atmosphere ,QD1-999
- Abstract
In the study of block copolymers, many parameters need to be adjusted to obtain good phase separation results. Based on block copolymer polystyrene-b-polycarbonate and homopolymer polystyrene, the effects of the annealing atmosphere, blending ratio, and molecular weight on phase separation were studied. The results show that annealing in air can inhibit the occurrence of phase separation. In addition, snowflake patterns are formed during phase separation. The blending ratio affects the quality of the pattern. The molecular weight affects the size of the pattern, and the size increases as the molecular weight increases. In this article, the influence of process conditions and materials on phase separation was discussed, which has laid a solid foundation for the development of block copolymer self-assembly in the future.
- Published
- 2021
234. Real-Time Scheduling of DAG Tasks with Arbitrary Deadlines
- Author
-
Qingxu Deng, Nan Guan, Kankan Wang, Xu Jiang, Di Liu, and Weichen Liu
- Subjects
Earliest deadline first scheduling ,020203 distributed computing ,business.industry ,Computer science ,Computation ,02 engineering and technology ,Parallel computing ,Computer Graphics and Computer-Aided Design ,020202 computer hardware & architecture ,Computer Science Applications ,Scheduling (computing) ,Software ,Single task ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,business
- Abstract
Real-time and embedded systems are shifting from single-core to multi-core processors, on which the software must be parallelized to fully utilize the computation capacity of the hardware. Recently, much work has been done on real-time scheduling of parallel tasks modeled as directed acyclic graphs (DAGs). However, most of these studies assume tasks to have implicit or constrained deadlines. Much less work has considered the general case of arbitrary deadlines (i.e., the relative deadline is allowed to be larger than the period), which is more difficult to analyze due to intra-task interference among jobs. In this article, we study the analysis of Global Earliest Deadline First (GEDF) scheduling for DAG parallel tasks with arbitrary deadlines. We develop new analysis techniques for GEDF scheduling of a single DAG task; these techniques guarantee a better capacity augmentation bound of 2.41 (the best known result is 2.5) in the case of a single task. Furthermore, the proposed analysis techniques are extended to the case of multiple DAG tasks under GEDF and federated scheduling. Finally, through empirical evaluation, we show that our schedulability tests generally outperform the state of the art.
- Published
- 2019
- Full Text
- View/download PDF
235. Timing-Anomaly Free Dynamic Scheduling of Conditional DAG Tasks on Multi-Core Systems
- Author
-
Qingqiang He, Nan Guan, Xu Jiang, Weichen Liu, Peng Chen, and School of Computer Science and Engineering
- Subjects
Multi-core processor ,Schedule ,Theoretical computer science ,Computer science ,0206 medical engineering ,Response time ,Timing Anomaly ,02 engineering and technology ,Dynamic priority scheduling ,020601 biomedical engineering ,Upper and lower bounds ,020202 computer hardware & architecture ,Scheduling (computing) ,Hardware and Architecture ,Dynamic Scheduling ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,Computer science and engineering [Engineering] ,Time complexity ,Software
- Abstract
In this paper, we propose a novel approach to schedule conditional DAG parallel tasks, with which we can derive safe response time upper bounds significantly better than the state-of-the-art counterparts. The main idea is to eliminate the notorious timing anomaly in scheduling parallel tasks by enforcing certain order constraints among the vertices, so that the response time bound can be accurately predicted off-line by “simulating” the runtime scheduling. A key challenge in applying the timing-anomaly free scheduling approach to conditional DAG parallel tasks is that at runtime it may generate exponentially many instances from a conditional DAG structure. To deal with this problem, we develop effective abstractions, based on which a safe response time upper bound is computed in polynomial time. We also develop algorithms to explore the vertex orders to shorten the response time bound. The effectiveness of the proposed approach is evaluated by experiments with randomly generated DAG tasks with different parameter configurations.
Accepted version This work is supported by the Research Grants Council of Hong Kong (GRF 15204917 and 15213818) and National Natural Science Foundation of China (Grant No. 61672140), and Nanyang Assistant Professorship (NAP) M4082282 and Start-Up Grant (SUG) M4082087 from Nanyang Technological University, Singapore.
- Published
- 2019
- Full Text
- View/download PDF
236. NV-eCryptfs: Accelerating Enterprise-Level Cryptographic File System with Non-Volatile Memory
- Author
-
Lei Zhang, Pengda Li, Linfeng Cheng, Yanyue Pan, Neil W. Bergmann, Weichen Liu, Chunhua Xiao, and School of Computer Science and Engineering
- Subjects
File system ,Address space ,business.industry ,Computer science ,ext4 ,Non-volatile Memory ,Cloud computing ,Cryptography ,02 engineering and technology ,computer.software_genre ,Encryption ,020202 computer hardware & architecture ,Theoretical Computer Science ,eCryptfs ,Computational Theory and Mathematics ,Hardware and Architecture ,Backup ,0202 electrical engineering, electronic engineering, information engineering ,Operating system ,Computer science and engineering [Engineering] ,Hardware acceleration ,business ,computer ,Software ,Block (data storage)
- Abstract
The development of cloud computing and big data results in large amounts of data being transmitted and stored. To protect sensitive data from leakage and unauthorized access, many cryptographic file systems, such as eCryptfs, have been proposed to transparently encrypt file contents before storing them on storage devices. However, the time-consuming encryption operations cause serious performance degradation. We found that, compared with the non-crypto file system EXT4, the performance slowdown with eCryptfs could be up to 58.53 and 86.89 percent for read and write, respectively. Although prior work has proposed techniques to improve the efficiency of cryptographic file systems through computation acceleration, no solution has focused on the inefficient workflow, which is demonstrated to be a major factor affecting system performance. To address this open problem, we present NV-eCryptfs, an asynchronous software stack for eCryptfs, which utilizes NVM as a fast storage tier on top of slower block devices to fully parallelize encryption and data I/O. We design an efficient NVM management scheme to support fast parallel cryptographic operations. Besides providing an address space that can be directly accessed by the hardware accelerators, our mechanism records the memory allocation states and supplies a backup plan to deal with NVM shortage. An additional index structure is built to accelerate lookup operations that determine whether a given data block resides in NVM. Moreover, we integrate adaptive scheduling in NV-eCryptfs to process I/O requests dynamically according to access pattern and request size, which makes full use of both software and hardware acceleration to boost crypto performance. Our evaluation shows that the proposed NV-eCryptfs outperforms the original eCryptfs with software routine by 23.41× and 5.82× for read and write, respectively.
This work is supported by National Natural Science Foundation of China: No.61502061, Chongqing application foundation and research in cutting-edge technologies: No. cstc2015jcyjA40016, the fundamental research funds for the central universities: 106112017CDJXY180004, and also the financial support from the program of China Scholarships Council No.201706055029.
- Published
- 2019
- Full Text
- View/download PDF
237. Energy-efficient crypto acceleration with HW/SW co-design for HTTPS
- Author
-
Neil W. Bergmann, Chunhua Xiao, Xie Yuhua, Lei Zhang, Weichen Liu, and School of Computer Science and Engineering
- Subjects
Web server ,Energy Efficiency ,Computer Networks and Communications ,Computer science ,business.industry ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Encryption ,Instruction set ,Secure communication ,Hardware and Architecture ,Embedded system ,Cipher suite ,0202 electrical engineering, electronic engineering, information engineering ,Computer science and engineering [Engineering] ,Hardware acceleration ,020201 artificial intelligence & image processing ,HW/SW Co-design ,business ,computer ,Software ,Efficient energy use - Abstract
Entering the Big Data era has led to the rapid development of web applications which provide high-performance sensitive access on large cloud data centers. HTTPS has been widely deployed as an extension of HTTP that adds an SSL/TLS encryption layer for secure communication over the Internet. To accelerate the complex crypto computation, dedicated acceleration instruction sets and hardware accelerators have been adopted. However, energy consumption has been ignored in the rush for performance. In fact, energy efficiency has become a challenge given the increasing demands for performance and energy saving in data centers. In this paper, we present EECA, an Energy-Efficient Crypto Acceleration system for HTTPS with OpenSSL. It provides highly energy-efficient encryption through HW/SW co-design. The essential idea is to make full use of system resources so as to exploit the strengths of different crypto acceleration approaches for an energy-efficient design. Experimental results show that, when performing only crypto computations with the typical encryption algorithm AES-256-CBC, the proposed EECA achieves up to 1637.13%, 84.82%, and 966.23% PPW (Performance per Watt) improvement compared with the original software encryption, instruction set acceleration, and hardware accelerator, respectively. When considering the whole workflow for end-to-end secure HTTPS based on OpenSSL with cipher suite ECDHE-RSA-AES256-SHA384, EECA also improves the energy efficiency by up to 422.26%, 40.14%, and 96.05% compared with the original Web server using software, instruction set, and hardware accelerators, respectively.
- Published
- 2019
- Full Text
- View/download PDF
238. A Branch-and-Bound-Based Crossover Operator for the Traveling Salesman Problem
- Author
-
Qi Qi, Yan Jiang, Weichen Liu, and Thomas Weise
- Subjects
Mathematical optimization ,education.field_of_study ,021103 operations research ,Branch and bound ,Computer science ,Crossover ,Population ,0211 other engineering and technologies ,Evolutionary algorithm ,02 engineering and technology ,Travelling salesman problem ,Human-Computer Interaction ,Set (abstract data type) ,Operator (computer programming) ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,education ,Software - Abstract
In this article, the new crossover operator BBX for Evolutionary Algorithms (EAs) for traveling salesman problems (TSPs) is introduced. It uses branch-and-bound to find the optimal combination of the (directed) edges present in the parent solutions. The offspring solutions created are at least as good as their parents and are only composed of parental building blocks. The operator is closer to the ideal concept of crossover in EAs than existing operators. This article provides the most extensive study on crossover operators on the TSP, comparing BBX to ten other operators on the 110 instances of the TSPLib benchmark set in EAs with four different population sizes. BBX, with its better ability to reuse and combine building blocks, surprisingly does not generally outperform the other operators. However, it performs well in certain scenarios. Besides presenting a novel approach to crossover on the TSP, the study significantly extends and refines the body of knowledge on the field with new conclusions and comparison results.
- Published
- 2019
- Full Text
- View/download PDF
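The abstract of entry 238 describes BBX as a branch-and-bound search for the best combination of the directed edges found in two parent tours. The following is a minimal sketch of that idea, not the authors' implementation: the function names `tour_length` and `bbx_sketch`, the depth-first search order, and the simple partial-length bound are all assumptions made for illustration, and the bound is only valid for non-negative distances.

```python
import math


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def bbx_sketch(p1, p2, dist):
    """Best tour that uses only directed edges present in p1 or p2.

    Hypothetical illustration: depth-first branch-and-bound whose bound is
    the partial tour length (assumes non-negative distances).
    """
    n = len(p1)
    # Allowed successors of each city: its successor in either parent.
    succ = {c: set() for c in p1}
    for parent in (p1, p2):
        for i, c in enumerate(parent):
            succ[c].add(parent[(i + 1) % n])

    # The better parent is the incumbent, so the result is never worse.
    best_tour = list(min((p1, p2), key=lambda t: tour_length(t, dist)))
    best_len = tour_length(best_tour, dist)

    start = p1[0]
    path = [start]
    visited = {start}

    def search(length):
        nonlocal best_tour, best_len
        current = path[-1]
        if len(path) == n:
            # Close the tour only if the closing edge is parental too.
            if start in succ[current]:
                total = length + dist[current][start]
                if total < best_len:
                    best_len = total
                    best_tour = list(path)
            return
        # Try the shorter outgoing edge first to tighten the bound early.
        for nxt in sorted(succ[current], key=lambda c: dist[current][c]):
            if nxt in visited:
                continue
            new_len = length + dist[current][nxt]
            if new_len >= best_len:
                continue  # bound: this branch cannot beat the incumbent
            path.append(nxt)
            visited.add(nxt)
            search(new_len)
            path.pop()
            visited.remove(nxt)

    search(0.0)
    return best_tour, best_len


# Usage with made-up coordinates and parents.
cities = [(0, 0), (2, 0), (3, 1), (2, 2), (0, 2), (-1, 1)]
dist = [[math.dist(a, b) for b in cities] for a in cities]
child, child_len = bbx_sketch([0, 1, 2, 5, 4, 3], [0, 2, 1, 3, 4, 5], dist)
```

Because the search starts from the better parent as incumbent and only ever follows parental edges, the returned tour is at least as good as both parents and consists solely of parental building blocks, which matches the two properties stated in the abstract.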
239. Implementation issues in optimization algorithms: do they matter?
- Author
-
Yuezhong Wu, Weichen Liu, Raymond Chiong, and Thomas Weise
- Subjects
021103 operations research ,Theoretical computer science ,Optimization algorithm ,Computer science ,0211 other engineering and technologies ,Lin–Kernighan heuristic ,02 engineering and technology ,Travelling salesman problem ,Theoretical Computer Science ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Implementation ,Software - Abstract
Two factors that have a major impact on the performance of an optimization method are (1) formal algorithm specifications and (2) practical implementations. The impact of the latter is typically ig...
- Published
- 2019
- Full Text
- View/download PDF
240. Using Virtual Reality to Induce and Assess Objective Correlates of Nicotine Craving: Paradigm Development Study
- Author
-
Weichen Liu, Gianna Andrade, Jurgen Schulze, Neal Doran, and Kelly E Courtney
- Subjects
eye-tracking ,Tobacco Smoke and Health ,genetic structures ,craving ,pupillometry ,Rehabilitation ,Biomedical Engineering ,Physical Therapy, Sports Therapy and Rehabilitation ,attentional bias ,smoking ,Brain Disorders ,Computer Science Applications ,Substance Misuse ,Psychiatry and Mental health ,Good Health and Well Being ,Clinical Research ,cue-exposure ,Tobacco ,virtual reality ,addiction ,Drug Abuse (NIDA only) ,development ,nicotine - Abstract
Background: Craving is a clinically important phenotype for the development and maintenance of nicotine addiction. Virtual reality (VR) paradigms are successful in eliciting cue-induced subjective craving and may even elicit stronger craving than traditional picture-cue methods. However, few studies have leveraged the advances of this technology to improve the assessment of craving. Objective: This report details the development of a novel, translatable VR paradigm designed to both elicit nicotine craving and assess multiple eye-related characteristics as potential objective correlates of craving. Methods: A VR paradigm was developed, which includes three Active scenes with nicotine and tobacco product (NTP) cues present, and three Neutral scenes devoid of NTP cues. A pilot sample (N=31) of NTP users underwent the paradigm and completed subjective measures of nicotine craving, sense of presence in the VR paradigm, and VR-related sickness. Eye-gaze fixation time (“attentional bias”) and pupil diameter toward Active versus Neutral cues, as well as spontaneous blink rate during the Active and Neutral scenes, were recorded. Results: The NTP Cue VR paradigm was found to elicit a moderate sense of presence (mean Igroup Presence Questionnaire score 60.05, SD 9.66) and low VR-related sickness (mean Virtual Reality Sickness Questionnaire score 16.25, SD 13.94). Scene-specific effects on attentional bias and pupil diameter were observed, with two of the three Active scenes eliciting greater NTP versus control cue attentional bias and pupil diameter (Cohen d=0.30-0.92). The spontaneous blink rate metrics did not differ across Active and Neutral scenes. Conclusions: This report outlines the development of the NTP Cue VR paradigm. Our results support the potential of this paradigm as an effective laboratory-based cue-exposure task and provide early evidence of the utility of attentional bias and pupillometry, as measured during VR, as useful markers for nicotine addiction.
- Published
- 2021
241. Using Virtual Reality to Induce and Assess Objective Correlates of Nicotine Craving: Paradigm Development Study (Preprint)
- Author
-
Weichen Liu, Gianna Andrade, Jurgen Schulze, Neal Doran, and Kelly E Courtney
- Subjects
genetic structures - Abstract
BACKGROUND: Craving is a clinically important phenotype for the development and maintenance of nicotine addiction. Virtual reality (VR) paradigms are successful in eliciting cue-induced subjective craving and may even elicit stronger craving than traditional picture-cue methods. However, few studies have leveraged the advances of this technology to improve the assessment of craving. OBJECTIVE: This report details the development of a novel, translatable VR paradigm designed to both elicit nicotine craving and assess multiple eye-related characteristics as potential objective correlates of craving. METHODS: A VR paradigm was developed, which includes three Active scenes with nicotine and tobacco product (NTP) cues present, and three Neutral scenes devoid of NTP cues. A pilot sample (N=31) of NTP users underwent the paradigm and completed subjective measures of nicotine craving, sense of presence in the VR paradigm, and VR-related sickness. Eye-gaze fixation time (“attentional bias”) and pupil diameter toward Active versus Neutral cues, as well as spontaneous blink rate during the Active and Neutral scenes, were recorded. RESULTS: The NTP Cue VR paradigm was found to elicit a moderate sense of presence (mean Igroup Presence Questionnaire score 60.05, SD 9.66) and low VR-related sickness (mean Virtual Reality Sickness Questionnaire score 16.25, SD 13.94). Scene-specific effects on attentional bias and pupil diameter were observed, with two of the three Active scenes eliciting greater NTP versus control cue attentional bias and pupil diameter (Cohen d=0.30-0.92). The spontaneous blink rate metrics did not differ across Active and Neutral scenes. CONCLUSIONS: This report outlines the development of the NTP Cue VR paradigm. Our results support the potential of this paradigm as an effective laboratory-based cue-exposure task and provide early evidence of the utility of attentional bias and pupillometry, as measured during VR, as useful markers for nicotine addiction.
- Published
- 2021
- Full Text
- View/download PDF
242. The effect of public health awareness and behaviors on the transmission dynamics of syphilis in Northwest China, 2006-2018, based on a multiple-stages mathematical model
- Author
-
Yu Zhao, Weichen Liu, Wenjun Jing, and Ning Ma
- Subjects
medicine.medical_specialty ,Syphilis model ,Primary Syphilis ,Epidemiology of syphilis ,Infectious and parasitic diseases ,RC109-216 ,37H10 ,92B05 ,Control strategy ,Environmental health ,medicine ,Transmission (medicine) ,Applied Mathematics ,Health Policy ,Public health ,medicine.disease ,Basic reproduction number ,Infectious Diseases ,34F05 ,Data fitting ,Health education ,Syphilis ,Psychology ,Epidemic model ,Sensitivity analysis ,60J70 ,Research Paper - Abstract
Syphilis, a sexually transmitted infectious disease caused by the bacterium Treponema pallidum, has re-emerged as a global public health issue with an estimated 12 million people infected each year. Understanding the impacts of health awareness and behaviors on the transmission dynamics of syphilis can help to establish optimal control strategies in different regions. In this paper, we develop a multiple-stage SIRS epidemic model taking into account public health awareness and behaviors regarding syphilis. First, the basic reproduction number R0 is obtained, which determines the global dynamical behavior of the model. We derive the necessary conditions for implementing optimal control and the corresponding optimal solution for mitigating syphilis by using Pontryagin's Maximum Principle. Based on the data on syphilis in Ningxia from 2006 to 2018, parameterization and model calibration are carried out. The fitting results are in good agreement with the data. Moreover, sensitivity analysis shows that the public-awareness-induced protective behaviors Ce, the compliance of condom-induced preventability e, and the treatment for primary syphilis m1 play an important role in mitigating the risk of syphilis outbreaks. These results can help us gain insights into the epidemiology of syphilis and provide guidance for the public health authorities in implementing health education programs.
- Published
- 2021
243. Parallel multipath transmission for burst traffic optimization in point-to-point NoCs
- Author
-
Zihao Zhang, Hui Chen, Peng Chen, Shien Zhu, Weichen Liu, School of Computer Science and Engineering, and 2021 Great Lakes Symposium on VLSI (GLSVLSI '21)
- Subjects
Point-to-point ,Speedup ,Network-on-Chips ,Computer science ,Path (graph theory) ,Traffic optimization ,Overhead (computing) ,Reinforcement learning ,Computer science and engineering [Engineering] ,Parallel computing ,Performance improvement ,Latency (engineering) ,Burst Traffic Optimization - Abstract
Network-on-chip (NoC) is a promising solution for connecting hundreds of processing elements (PEs). As the number of PEs increases, the high communication latency caused by burst traffic hampers the speedup gained from computation acceleration. Although parallel multipath transmission is an effective method to reduce transmission latency, its advantages have not been fully exploited in previous works, especially for emerging point-to-point NoCs, for three reasons: (1) Previous static message splitting strategies increase contention when traffic loads are heavy, degrading NoC performance. (2) Only a limited set of shortest paths is chosen, ignoring other possible contention-free paths. (3) Optimization of the hardware that supports parallel multipath transmission is missing, resulting in additional overhead. Thus, we propose a collaborative software and hardware design to reduce latency in point-to-point NoCs through parallel multipath transmission. Specifically, we revise the hardware design to support parallel multipath transmission efficiently. Moreover, we propose a reinforcement learning-based algorithm to decide when and how to split messages, and which path should be used according to traffic loads. Experiments show that our algorithm achieves a remarkable performance improvement (+12.1% to +21.0%) when compared with the state-of-the-art dual-path algorithm. Also, our hardware decreases power and area consumption by 23.2% and 10.3% over the dual-path hardware. Ministry of Education (MOE); Nanyang Technological University. This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MoE2019-T2-1-071) and Tier 1 (MoE2019-T1-001-072), and Nanyang Technological University, Singapore, under its NAP (M4082282) and SUG (M4082087).
- Published
- 2021
244. Attack mitigation of hardware trojans for thermal sensing via micro-ring resonator in optical NoCs
- Author
-
Weichen Liu, Jun Zhou, Pengxing Guo, Mengquan Li, and School of Computer Science and Engineering
- Subjects
Hardware security module ,Artificial neural network ,Computer science ,business.industry ,Electrical engineering ,Thermal Sensing ,02 engineering and technology ,01 natural sciences ,Network-on-chip ,010309 optics ,Thermal sensing ,020210 optoelectronics & photonics ,Network on a chip ,Hardware and Architecture ,Trojan ,0103 physical sciences ,Micro ring resonator ,0202 electrical engineering, electronic engineering, information engineering ,Bandwidth (computing) ,Hardware Security ,Electrical and Electronic Engineering ,Latency (engineering) ,business ,Computer science and engineering::Hardware::Input/output and data communications [Engineering] ,Software - Abstract
As an emerging technology for new-generation on-chip communication, optical networks-on-chip (ONoCs) provide ultra-high bandwidth, low latency and low power dissipation for data transfers. However, the thermo-optic effects of the photonic devices have a great impact on the operating performance and reliability of ONoCs, and thermal-aware control with accurate measurements, e.g., thermal sensing, is typically applied to alleviate it. In addition, temperature-sensitive ONoCs are prone to attack by hardware Trojans (HTs) covertly embedded in counterfeit integrated circuits (ICs) from malicious third-party vendors, leading to performance degradation, denial-of-service (DoS), or even permanent damage. In this paper, we focus on tampering and snooping attacks during thermal sensing via micro-ring resonator (MR) in ONoCs. Based on the provided workflow and attack model, a new structure for the anti-HT module is proposed to verify the data obtained from the thermal sensor and protect it against attacks in its optical sampling and electronic transmission processes. In addition, we present a detection scheme based on spiking neural networks (SNNs) to accurately classify the network security statuses for further high-level control. Evaluation results indicate that, with less than 1% extra area of a tile, our approach can significantly enhance the hardware security of thermal sensing for ONoCs with trivial costs of up to 8.73%, 5.32% and 6.14% in average latency, execution time and energy consumption, respectively. Ministry of Education (MOE); Nanyang Technological University. Accepted version. This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (Grant No. MOE2019-T2-001-071) and Tier 1 (Grant No. MOE2019-T1-001-072), and Nanyang Technological University, Singapore, under its NAP (Grant No. M4082282) and SUG (Grant No. M4082087).
- Published
- 2021
245. HSCoNAS: Hardware-Software Co-Design of Efficient DNNs via Neural Architecture Search
- Author
-
Shuo Huai, Xiangzhong Luo, Di Liu, Weichen Liu, School of Computer Science and Engineering, 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), and HP-NTU Digital Manufacturing Corporate Lab
- Subjects
FOS: Computer and information sciences ,Performance Evaluation ,Computer Science - Machine Learning ,Artificial neural network ,Edge device ,Computer science ,Evolutionary algorithm ,Evolutionary computation ,Hardware software ,Machine Learning (cs.LG) ,Runtime ,Computer architecture ,Computer science and engineering [Engineering] ,Latency (engineering) ,Architecture - Abstract
In this paper, we present a novel multi-objective hardware-aware neural architecture search (NAS) framework, namely HSCoNAS, to automate the design of deep neural networks (DNNs) with high accuracy but low latency upon target hardware. To accomplish this goal, we first propose an effective hardware performance modeling method to approximate the runtime latency of DNNs on target hardware, which will be integrated into HSCoNAS to avoid the tedious on-device measurements. Besides, we propose two novel techniques, i.e., dynamic channel scaling to maximize the accuracy under the specified latency and progressive space shrinking to refine the search space towards target hardware as well as alleviate the search overheads. These two techniques jointly work to allow HSCoNAS to perform fine-grained and efficient explorations. Finally, an evolutionary algorithm (EA) is incorporated to conduct the architecture search. Extensive experiments on ImageNet are conducted upon diverse target hardware, i.e., GPU, CPU, and edge device to demonstrate the superiority of HSCoNAS over recent state-of-the-art approaches. Comment: DATE2021
- Published
- 2021
- Full Text
- View/download PDF
246. EDLAB: a benchmark for edge deep learning accelerators
- Author
-
Manu Rastogi, Weichen Liu, Athreya Madhu Sudan, Hao Kong, M. Anthony Lewis, Di Liu, Shiqing Li, Shuo Huai, Ravi Subramaniam, Lei Zhang, Hui Chen, Shien Zhu, School of Computer Science and Engineering, and HP-NTU Digital Manufacturing Corporate Lab
- Subjects
Edge Accelerator ,Computer science ,business.industry ,Deep learning ,Deployment ,Edge (geometry) ,Benchmark ,Computational science ,Deep Learning ,Hardware and Architecture ,Benchmark (computing) ,Computer science and engineering [Engineering] ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software - Abstract
A growing trend is to deploy deep learning algorithms in edge environments to mitigate the privacy and latency issues of cloud computing. Diverse edge deep learning accelerators have been devised to speed up the inference of deep learning algorithms on edge devices. These accelerators differ in their power and performance characteristics, which makes it very challenging to compare them efficiently and uniformly. In this paper, we introduce EDLAB, an end-to-end benchmark, to evaluate the overall performance of edge deep learning accelerators. EDLAB consists of state-of-the-art deep learning models, a unified workload preprocessing and deployment framework, as well as a collection of comprehensive metrics. In addition, we propose parameterized models of the hardware performance bound so that EDLAB can identify the hardware potential and the hardware utilization of different deep learning applications. Finally, we employ EDLAB to benchmark three edge deep learning accelerators and analyze the benchmarking results. From the analysis we obtain some insightful observations that can guide the design of efficient deep learning applications. Nanyang Technological University; National Research Foundation (NRF). Submitted/Accepted version. This research was conducted in collaboration with HP Inc. and supported by National Research Foundation (NRF) Singapore and the Singapore Government through the Industry Alignment Fund-Industry Collaboration Projects Grant (I1801E0028). This work is also partially supported by NTU NAP M4082282 and SUG M4082087, Singapore.
- Published
- 2021
247. Partial order based non-preemptive communication scheduling towards real-time networks-on-chip
- Author
-
Nan Guan, Di Liu, Wanli Chang, Jun Zhou, Hui Chen, Shiqing Li, Weichen Liu, Peng Chen, School of Computer Science and Engineering, and The 36th ACM/SIGAPP Symposium On Applied Computing
- Subjects
Schedule ,Computer science ,Distributed computing ,Process (computing) ,Response time ,020207 software engineering ,02 engineering and technology ,Dynamic priority scheduling ,Directed acyclic graph ,Upper and lower bounds ,Directed Acyclic Graph (DAG) ,Scheduling (computing) ,Tree traversal ,Network-on-Chip (NoC) ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Computer science and engineering [Engineering] - Abstract
Due to the increasing performance requirements of cyber-physical systems, many-core processors with high parallelism are gaining wide use, and network-on-chip (NoC) is a prevalent choice for inter-core communication. Unfortunately, contention on NoCs introduces large timing uncertainties, which complicates response time estimation. To address this problem, for real-time applications modeled as a directed acyclic graph (DAG), we introduce DAG-Order, a partial order based time-predictable scheduling paradigm, resulting in real-time NoCs. Specifically, DAG-Order is built upon an existing single-cycle long-range traversal (SLT) NoC, which simplifies the process of validation and verification. Then, DAG-Order is proposed based on a dynamic scheduling approach, which jointly considers communication as well as computation workloads, and matches the SLT NoC. DAG-Order achieves a provably safe bound by enforcing certain partial order constraints among edges/vertices that eliminate the execution-timing anomaly during the runtime phase. Finally, an effective algorithm that searches for a proper schedule order is deployed to tighten the upper bound. Experimental results demonstrate that DAG-Order performs better than state-of-the-art scheduling approaches. Ministry of Education (MOE). Accepted version. This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MoE2019-T2-1-071) and Tier 1 (MoE2019-T1-001-072), and Nanyang Technological University, Singapore, under its NAP (M4082282) and SUG (M4082087).
- Published
- 2021
248. Load-aware Adaptive Cache Management Scheme for Enterprise-level Stackable Cryptographic File System
- Author
-
Shi Qiu, Yanyue Pan, Weichen Liu, Dandan Xu, Chunhua Xiao, and Shuting Sun
- Subjects
File system ,Computer science ,business.industry ,Data security ,020206 networking & telecommunications ,Cryptography ,02 engineering and technology ,Dynamic priority scheduling ,computer.software_genre ,020202 computer hardware & architecture ,Adaptive system ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,Redundancy (engineering) ,Cache ,Performance improvement ,business ,computer - Abstract
Stackable Cryptographic File Systems have been adopted as one of the dominant solutions to transparently improve data security. However, how to reduce the performance degradation induced by encryption operations is still an open problem. To solve this problem, a Load-aware Adaptive Cache Management (LACM) scheme is proposed, which utilizes a load-aware dynamic scheduling strategy to improve cache efficiency in two respects: load-aware redundancy elimination and non-redundant cache conversion. Experimental results show that the proposed solution can provide a 20.03%~34.39% latency decrease and a 12.82%~34.77% performance improvement for application-level workloads. LACM can also be easily adapted to other stackable cryptographic or compressed file systems, such as NCryptfs and stackable compressed file systems.
- Published
- 2020
- Full Text
- View/download PDF
249. Solving Dynamic Multiobjective Problem via Autoencoding Evolutionary Search
- Author
-
Wei Zhou, Yew-Soon Ong, Liang Feng, Weichen Liu, and Kay Chen Tan
- Subjects
Mathematical optimization ,Computer science ,Process (computing) ,Evolutionary algorithm ,Contrast (statistics) ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Autoencoder ,Biological Evolution ,Computer Science Applications ,Human-Computer Interaction ,010201 computation theory & mathematics ,Control and Systems Engineering ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Electrical and Electronic Engineering ,Software ,Algorithms ,Information Systems - Abstract
A dynamic multiobjective optimization problem (DMOP) is a multiobjective optimization problem whose objectives may vary over time. Owing to their widespread real-world applications, DMOPs have attracted much research attention in the last decade. In this article, we propose to solve DMOPs via an autoencoding evolutionary search. In particular, to track the dynamic changes of a given DMOP, an autoencoder is derived to predict the movement of the Pareto-optimal solutions based on the nondominated solutions obtained before the change occurs. This autoencoder can be easily integrated into existing multiobjective evolutionary algorithms (EAs), for example, NSGA-II, MOEA/D, etc., for solving DMOPs. In contrast to existing approaches, the proposed prediction method has a closed-form solution, and thus adds little computational burden to the iterative evolutionary search process. Furthermore, the proposed prediction of dynamic change is automatically learned from the nondominated solutions found along the dynamic optimization process, which could provide more accurate Pareto-optimal solution prediction. To investigate the performance of the proposed autoencoding evolutionary search for solving DMOPs, comprehensive empirical studies have been conducted, comparing against three state-of-the-art prediction-based dynamic multiobjective EAs. The results obtained on the commonly used DMOP benchmarks confirm the efficacy of the proposed method.
- Published
- 2020
250. COSMA: An Efficient Concurrency-Oriented Space Management Scheme for In-memory File Systems
- Author
-
Ting Wu, Fu Xiaoxiang, Chunhua Xiao, Feng Zipei, Zhang Lin, and Weichen Liu
- Subjects
Scheme (programming language) ,File system ,Computer science ,business.industry ,Concurrency ,Reliability (computer networking) ,Distributed computing ,Big data ,020206 networking & telecommunications ,02 engineering and technology ,Thread (computing) ,computer.software_genre ,Data structure ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,business ,Throughput (business) ,computer ,Wear leveling ,computer.programming_language - Abstract
Emerging file systems have been designed to fully exploit NVM's advanced features. However, with the development of big data, these file systems suffer from performance degradation in highly concurrent environments, which is caused by serious access conflicts in space management. To solve this problem, we propose an efficient concurrency-oriented space management scheme named COSMA. Along with a novel data structure design, COSMA is able to greatly reduce request congestion among multiple threads through a hierarchical space allocation scheme. Furthermore, COSMA provides 3 reclamation strategies to improve space utilization, and can also adapt to different systems that vary in NVM capacity. To ensure system reliability, COSMA is capable of maintaining wear leveling among multiple NVM slots. We implement COSMA in a representative persistent file system, PMFS. Experimental results show that COSMA can improve the IOPS of PMFS by 15%, the write throughput of PMFS by 7.6%, and the concurrent processing performance of PMFS by 50%. Besides, it can also achieve wear leveling among multiple NVMs.
- Published
- 2020
- Full Text
- View/download PDF