Author: "Zheng, Size" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zheng, Size"' showing total 103 results

1. Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts

Author: Zhang, Shulai, Zheng, Ningxin, Lin, Haibin, Jiang, Ziheng, Bao, Wenlei, Jiang, Chengquan, Hou, Qi, Cui, Weihao, Zheng, Size, Chang, Li-Wen, Chen, Quan, and Liu, Xin
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Mixture-of-experts (MoE) has been extensively employed to scale large language models to trillion-plus parameters while maintaining a fixed computational cost. The development of large MoE models in the distributed scenario encounters the problem of large communication overhead. The inter-device communication of a MoE layer can occupy 47% time of the entire model execution with popular models and frameworks. Therefore, existing methods suggest the communication in a MoE layer to be pipelined with the computation for overlapping. However, these coarse grained overlapping schemes introduce a notable impairment of computational efficiency and the latency concealing is sub-optimal. To this end, we present COMET, an optimized MoE system with fine-grained communication-computation overlapping. Leveraging data dependency analysis and task rescheduling, COMET achieves precise fine-grained overlapping of communication and computation. Through adaptive workload assignment, COMET effectively eliminates fine-grained communication bottlenecks and enhances its adaptability across various scenarios. Our evaluation shows that COMET accelerates the execution of a single MoE layer by $1.96\times$ and for end-to-end execution, COMET delivers a $1.71\times$ speedup on average. COMET has been adopted in the production environment of clusters with ten-thousand-scale of GPUs, achieving savings of millions of GPU hours. more...
Published: 2025
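
Illustrative note (not from the paper): the abstract's 47% communication share can be plugged into a toy two-stage pipeline model to see why finer-grained overlap helps. The function name `pipelined_time`, the normalization of total time to 1.0, and the assumption of zero per-chunk overhead are all hypothetical simplifications; this is not COMET's scheduling algorithm.

```python
def pipelined_time(comm: float, comp: float, chunks: int) -> float:
    """Toy model: split communication and computation into `chunks` equal
    pieces and run them as a two-stage pipeline (communicate chunk i+1
    while computing chunk i). Per-chunk overhead is assumed to be zero."""
    if chunks < 1:
        raise ValueError("chunks must be >= 1")
    # The first chunk's communication cannot be hidden.
    fill = comm / chunks
    # The middle of the pipeline is paced by the slower stage.
    steady = (chunks - 1) * max(comm, comp) / chunks
    # The last chunk's computation cannot be hidden.
    drain = comp / chunks
    return fill + steady + drain


# Assumed split taken from the abstract: 47% communication, 53% computation.
COMM, COMP = 0.47, 0.53
for k in (1, 2, 8, 64):
    t = pipelined_time(COMM, COMP, k)
    print(f"chunks={k:3d}  time={t:.3f}  speedup={1.0 / t:.2f}x")
```

Under these assumptions the time approaches max(comm, comp) as the chunk count grows. Real systems lose efficiency when work is split too finely, and this sketch deliberately ignores that cost.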

2. ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Author: Sun, Hanshi, Chang, Li-Wen, Bao, Wenlei, Zheng, Size, Zheng, Ningxin, Liu, Xin, Dong, Harry, Chi, Yuejie, and Chen, Beidi
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: With the widespread deployment of long-context large language models (LLMs), there has been a growing demand for efficient support of high-throughput inference. However, as the key-value (KV) cache expands with the sequence length, the increasing memory footprint and the need to access it for each token generation both result in low throughput when serving long-context LLMs. While various dynamic sparse attention methods have been proposed to speed up inference while maintaining generation quality, they either fail to sufficiently reduce GPU memory consumption or introduce significant decoding latency by offloading the KV cache to the CPU. We present ShadowKV, a high-throughput long-context LLM inference system that stores the low-rank key cache and offloads the value cache to reduce the memory footprint for larger batch sizes and longer sequences. To minimize decoding latency, ShadowKV employs an accurate KV selection strategy that reconstructs minimal sparse KV pairs on-the-fly. By evaluating ShadowKV on a broad range of benchmarks, including RULER, LongBench, and Needle In A Haystack, and models like Llama-3.1-8B, Llama-3-8B-1M, GLM-4-9B-1M, Yi-9B-200K, Phi-3-Mini-128K, and Qwen2-7B-128K, we demonstrate that it can support up to 6$\times$ larger batch sizes and boost throughput by up to 3.04$\times$ on an A100 GPU without sacrificing accuracy, even surpassing the performance achievable with infinite batch size under the assumption of infinite GPU memory. The code is available at https://github.com/bytedance/ShadowKV. more...
Published: 2024

3. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Author: DeepSeek-AI, Liu, Aixin, Feng, Bei, Wang, Bin, Wang, Bingxuan, Liu, Bo, Zhao, Chenggang, Dengr, Chengqi, Ruan, Chong, Dai, Damai, Guo, Daya, Yang, Dejian, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Xu, Hanwei, Yang, Hao, Zhang, Haowei, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Li, Hui, Qu, Hui, Cai, J. L., Liang, Jian, Guo, Jianzhong, Ni, Jiaqi, Li, Jiashi, Chen, Jin, Yuan, Jingyang, Qiu, Junjie, Song, Junxiao, Dong, Kai, Gao, Kaige, Guan, Kang, Wang, Lean, Zhang, Lecong, Xu, Lei, Xia, Leyi, Zhao, Liang, Zhang, Liyue, Li, Meng, Wang, Miaojun, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Mingming, Tian, Ning, Huang, Panpan, Wang, Peiyi, Zhang, Peng, Zhu, Qihao, Chen, Qinyu, Du, Qiushi, Chen, R. J., Jin, R. L., Ge, Ruiqi, Pan, Ruizhe, Xu, Runxin, Chen, Ruyi, Li, S. S., Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Wu, Shaoqing, Ye, Shengfeng, Ma, Shirong, Wang, Shiyu, Zhou, Shuang, Yu, Shuiping, Zhou, Shunfeng, Zheng, Size, Wang, T., Pei, Tian, Yuan, Tian, Sun, Tianyu, Xiao, W. L., Zeng, Wangding, An, Wei, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Zhang, Wentao, Li, X. Q., Jin, Xiangyue, Wang, Xianzu, Bi, Xiao, Liu, Xiaodong, Wang, Xiaohan, Shen, Xiaojin, Chen, Xiaokang, Chen, Xiaosha, Nie, Xiaotao, Sun, Xiaowen, Wang, Xiaoxiang, Liu, Xin, Xie, Xin, Yu, Xingkai, Song, Xinnan, Zhou, Xinyi, Yang, Xinyu, Lu, Xuan, Su, Xuecheng, Wu, Y., Li, Y. K., Wei, Y. X., Zhu, Y. X., Xu, Yanhong, Huang, Yanping, Li, Yao, Zhao, Yao, Sun, Yaofeng, Li, Yaohui, Wang, Yaohui, Zheng, Yi, Zhang, Yichao, Xiong, Yiliang, Zhao, Yilong, He, Ying, Tang, Ying, Piao, Yishi, Dong, Yixin, Tan, Yixuan, Liu, Yiyuan, Wang, Yongji, Guo, Yongqiang, Zhu, Yuchen, Wang, Yuduan, Zou, Yuheng, Zha, Yukun, Ma, Yunxian, Yan, Yuting, You, Yuxiang, Liu, Yuxuan, Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Huang, Zhen, Zhang, Zhen, Xie, Zhenda, Hao, Zhewen, Shao, Zhihong, Wen, Zhiniu, Xu, Zhipeng, Zhang, Zhongyu, Li, Zhuoshu, Wang, Zihan, Gu, Zihui, Li, Zilin, and Xie, Ziwei
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models. more...
Published: 2024

4. vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs

Author: Zheng, Size, Chen, Renze, Li, Meng, Ye, Zihao, Ceze, Luis, and Liang, Yun
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning
Abstract: IoT devices based on microcontroller units (MCUs) provide ultra-low power consumption and ubiquitous computation for near-sensor deep neural network (DNN) models. However, the memory of an MCU is usually 2-3 orders of magnitude smaller than that of mobile devices, which makes it challenging to map DNNs onto MCUs. Previous work separates memory management and kernel implementation for MCUs and relies on coarse-grained memory management techniques such as in-place update to reduce memory consumption. In this paper, we propose to coordinate memory management and kernel optimization for DNN inference on MCUs to enable fine-grained memory management. The key idea is to virtualize the limited memory of the MCU as a large memory pool. Each kernel divides the memory pool into kernel-specific segments and handles segment load and store while computing DNN layers. Memory consumption can be reduced because, with fine-grained segment-level memory control, we can overlap the memory footprint of different tensors without the need to materialize them at the same time. Following this idea, we implement vMCU for DNN inference on MCUs. Evaluation for single layers on ARM Cortex-M4 and Cortex-M7 processors shows that vMCU can reduce RAM usage by $12.0\%$ to $49.5\%$ and energy consumption by $20.6\%$ to $53.0\%$ compared to state-of-the-art work. For full DNN evaluation, vMCU can reduce the memory bottleneck by $61.5\%$, enabling more models to be deployed on low-end MCUs.
Published: 2024

5. Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Author: Zhao, Yilong, Lin, Chien-Yu, Zhu, Kan, Ye, Zihao, Chen, Lequn, Zheng, Size, Ceze, Luis, Krishnamurthy, Arvind, Chen, Tianqi, and Kasikci, Baris
Subjects: Computer Science - Machine Learning
Abstract: The growing demand for Large Language Models (LLMs) in applications such as content generation, intelligent chatbots, and sentiment analysis poses considerable challenges for LLM service providers. To efficiently use GPU resources and boost throughput, batching multiple requests has emerged as a popular paradigm; to further speed up batching, LLM quantization techniques reduce memory consumption and increase computing capacity. However, prevalent quantization schemes (e.g., 8-bit weight-activation quantization) cannot fully leverage the capabilities of modern GPUs, such as 4-bit integer operators, resulting in sub-optimal performance. To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss. Atom significantly boosts serving throughput by using low-bit operators and considerably reduces memory consumption via low-bit quantization. It attains high accuracy by applying a novel mixed-precision and fine-grained quantization process. We evaluate Atom on 4-bit weight-activation quantization in the serving context. Atom improves end-to-end throughput (token/s) by up to $7.7\times$ compared to the FP16 and by $2.5\times$ compared to INT8 quantization, while maintaining the same latency target. more...
Published: 2023
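
Illustrative note (not from the paper): the abstract credits part of Atom's accuracy to fine-grained quantization. A minimal sketch of that general idea, assuming symmetric signed 4-bit codes with one scale per group of values, is given below. The function names, the group size, and the sample values are hypothetical; Atom's actual kernels and its mixed-precision handling are not reproduced here.

```python
from typing import List, Tuple

Group = Tuple[List[int], float]  # (integer codes, scale)


def quantize_group(values: List[float], bits: int = 4) -> Group:
    """Symmetric quantization of one group to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1       # 7 for 4-bit
    qmin = -(2 ** (bits - 1))        # -8 for 4-bit
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale


def quantize_fine_grained(values: List[float], group_size: int = 4,
                          bits: int = 4) -> List[Group]:
    """Quantize consecutive groups independently, each with its own scale."""
    return [quantize_group(values[i:i + group_size], bits)
            for i in range(0, len(values), group_size)]


def dequantize(groups: List[Group]) -> List[float]:
    """Map integer codes back to approximate real values."""
    return [c * scale for codes, scale in groups for c in codes]


# Hypothetical data: small values followed by a group with large values.
values = [0.1, -0.2, 0.05, 0.3, 10.0, -8.0, 4.0, 1.0]
per_tensor = dequantize([quantize_group(values)])   # one scale for everything
per_group = dequantize(quantize_fine_grained(values))
print("per-tensor:", [round(v, 3) for v in per_tensor])
print("per-group: ", [round(v, 3) for v in per_group])
```

With a single scale, the large values dominate and the small ones collapse toward zero; per-group scales keep the rounding error of each value within half of its own group's scale.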

6. Molecular insights into the composition distribution and phase behavior of hydrocarbon mixtures in a multiscale system with mixed wettability

Author: Qiu, Xingdong, Liu, Yisheng, Zheng, Size, and Yang, Huan
Published: 2025
Full Text: View/download PDF

7. Exceeding the theoretical limit of interfacial evaporation efficiency by using the carbon-based aerogel evaporator with cold evaporation surface

Author: Yang, Rui, Li, Xiaoke, Xie, Wei, Zheng, Size, Shi, Hao, Shi, Jinwen, and Jing, Dengwei
Published: 2024
Full Text: View/download PDF

8. Achieving highly interfacial evaporation rate and continuous salt resistance simultaneously via multi-dimensional composite biomimetic evaporator

Author: Zou, Lie, Zhang, He, Chen, Qian, Zheng, Size, Chen, Ning, Wu, Xiaohu, and Li, Xiaoke
Published: 2024
Full Text: View/download PDF

9. HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation

Author: Xiao, Qingcheng, Zheng, Size, Wu, Bingzhe, Xu, Pengcheng, Qian, Xuehai, and Liang, Yun
Subjects: Computer Science - Hardware Architecture, Computer Science - Artificial Intelligence
Abstract: Tensor computations overwhelm traditional general-purpose computing devices due to the large amounts of data and operations of the computations. They call for a holistic solution composed of both hardware acceleration and software mapping. Hardware/software (HW/SW) co-design optimizes the hardware and software in concert and produces high-quality solutions. There are two main challenges in the co-design flow. First, multiple methods exist to partition tensor computation and have different impacts on performance and energy efficiency. Besides, the hardware part must be implemented by the intrinsic functions of spatial accelerators. It is hard for programmers to identify and analyze the partitioning methods manually. Second, the overall design space composed of HW/SW partitioning, hardware optimization, and software optimization is huge. The design space needs to be efficiently explored. To this end, we propose an agile co-design approach HASCO that provides an efficient HW/SW solution to dense tensor computation. We use tensor syntax trees as the unified IR, based on which we develop a two-step approach to identify partitioning methods. For each method, HASCO explores the hardware and software design spaces. We propose different algorithms for the explorations, as they have distinct objectives and evaluation costs. Concretely, we develop a multi-objective Bayesian optimization algorithm to explore hardware optimization. For software optimization, we use heuristic and Q-learning algorithms. Experiments demonstrate that HASCO achieves a 1.25X to 1.44X latency reduction through HW/SW co-design compared with developing the hardware and software separately. more...
Published: 2021

10. Biomimetic hydrogel with directional heat regulation for efficient solar desalination

Author: Zhang, He, Li, Xiaoke, Liu, Xiyuan, Du, Yuping, Xie, Wei, Zheng, Size, Yang, Liu, Shi, Jinwen, and Jing, Dengwei
Published: 2023
Full Text: View/download PDF

11. PXLink: A simulation program of polymer crosslinking to study of polyamide membrane

Author: Zhang, Chi, Bu, Guangle, Sajib, Md Symon Jahan, Meng, Lida, Xu, Shiying, Zheng, Size, Zhang, Lin, and Wei, Tao
Published: 2023
Full Text: View/download PDF

12. Architecting Janus hydrogel evaporator with polydopamine-TiO2 photocatalyst for high-efficient solar desalination and purification

Author: Wen, Jin, Li, Xiaoke, Zhang, He, Zheng, Size, Yi, Caini, Yang, Liu, and Shi, Jinwen
Published: 2023
Full Text: View/download PDF

13. Constructing composite membranes from functionalized metal organic frameworks integrated MXene intended for ultrafast oil/water emulsion separation

Author: Zeng, Guangyong, Liu, Yongcong, Lin, Qingquan, Pu, Shengyan, Zheng, Size, Ang, Micah Belle Marie Yap, and Chiao, Yu-Hsuan
Published: 2022
Full Text: View/download PDF

14. Grazing Incidence Wide-Angle X-ray Scattering of Water Adsorption in Polyamide Barrier Layers of Reverse Osmosis Membranes

Author: Fu, Qinyi, Zheng, Size, Verma, Nisha, Gambarini, Roberto, Wei, Tao, Ocko, Benjamin M., and Hsiao, Benjamin S.
Abstract: To understand the relationship between the intermolecular structure of aromatic polyamide (PA) scaffold and the water molecules in the barrier layers of reverse osmosis (RO) membranes, a grazing incidence wide-angle X-ray scattering (GIWAXS) study was carried out on freestanding PA thin films at varying relative humidity (RH) conditions. The scattering results were analyzed by an interference scattering model, containing a phase factor between a PA chain and an adsorbed water molecule. This model yielded good fits to the GIWAXS profiles where the water adsorption was found to vary linearly with RH. Atomistic molecular dynamics (MD) simulations were also performed to complement the experimental study. The simulations revealed that a rapid condensation layer initially formed on the PA film surface, followed by the slow water molecule diffusion inside the PA membrane. Sparse adsorbed water, isolated in subnanopores of the PA film adjacent to the polar atoms, even in very low quantities, modifies the X-ray scattering. Atomistic simulations at the microscopic scale provide partial support for several X-ray scattering findings. more...
Published: 2025
Full Text: View/download PDF

15. Molecular Dynamics Study of Structure, Folding, and Aggregation of Poly-PR and Poly-GR Proteins

Author: Zheng, Size, Sahimi, Ali, Shing, Katherine S., and Sahimi, Muhammad
Published: 2021
Full Text: View/download PDF

16. Interfacial Polymerization of Aromatic Polyamide Reverse Osmosis Membranes.

Author: Zheng, Size, Gissinger, Jacob, Hsiao, Benjamin S., and Wei, Tao
Published: 2024
Full Text: View/download PDF

17. SpecPIM: Accelerating Speculative Inference on PIM-Enabled System via Architecture-Dataflow Co-Exploration

Author: Li, Cong, primary, Zhou, Zhe, additional, Zheng, Size, additional, Zhang, Jiaxi, additional, Liang, Yun, additional, and Sun, Guangyu, additional
Published: 2024
Full Text: View/download PDF

18. MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN

Author: Chen, Renze, primary, Ding, Zijian, additional, Zheng, Size, additional, Zhang, Chengrui, additional, Leng, Jingwen, additional, Liu, Xuanzhe, additional, and Liang, Yun, additional
Published: 2024
Full Text: View/download PDF

19. High-performing composite membrane based on dopamine-functionalized graphene oxide incorporated two-dimensional MXene nanosheets for water purification

Author: Zeng, Guangyong, Lin, Qingquan, Wei, Ke, Liu, Yongcong, Zheng, Size, Zhan, Yingqing, He, Shuangjiang, Patra, Tanmoy, and Chiao, Yu-Hsuan
Published: 2021
Full Text: View/download PDF

20. sDMD: An open source program for discontinuous molecular dynamics simulation of protein folding and aggregation

Author: Zheng, Size, Javidpour, Leili, Sahimi, Muhammad, Shing, Katherine S., and Nakano, Aiichiro
Published: 2020
Full Text: View/download PDF

21. TileFlow: A Framework for Modeling Fusion Dataflow via Tree-based Analysis

Author: Zheng, Size, primary, Chen, Siyuan, additional, Gao, Siyuan, additional, Jia, Liancheng, additional, Sun, Guangyu, additional, Wang, Runsheng, additional, and Liang, Yun, additional
Published: 2023
Full Text: View/download PDF

22. Preparation, Characterization and Application of Novel Photocatalytic Two-Dimensional Material Membrane: A Reform of Comprehensive Experimental Teaching

Author: Zeng, Guangyong, primary, Zheng, Hu, additional, Zhou, Kun, additional, Shi, Hao, additional, Zheng, Size, additional, Ma, Hui, additional, Wang, Peng, additional, and Pu, Shengyan, additional
Published: 2023
Full Text: View/download PDF

23. Rubick: A Synthesis Framework for Spatial Architectures via Dataflow Decomposition

Author: Luo, Zizhang, primary, Lu, Liqiang, additional, Zheng, Size, additional, Yin, Jieming, additional, Cong, Jason, additional, Yin, Jianwei, additional, and Liang, Yun, additional
Published: 2023
Full Text: View/download PDF

24. Memory and Computation Coordinated Mapping of DNNs onto Complex Heterogeneous SoC

Author: Zheng, Size, primary, Chen, Siyuan, additional, and Liang, Yun, additional
Published: 2023
Full Text: View/download PDF

25. Graphic contrastive learning analyses of discontinuous molecular dynamics simulations: Study of protein folding upon adsorption

Author: Zheng, Size, primary, Wei, Yong, additional, Lin, Yuewei, additional, and Wei, Tao, additional
Published: 2023
Full Text: View/download PDF

26. Multilevel Comparison of Ionic Liquid Separation of a Methanol/Methyl Acetate/Water Mixture

Author: Guo, Chao, primary, Du, Long, additional, Liu, Xiaoyan, additional, Cao, Yuqing, additional, Zheng, Size, additional, and He, Ge, additional
Published: 2023
Full Text: View/download PDF

27. The coral‐inspired steam evaporator for efficient solar desalination via porous and thermal insulation bionic design

Author: Zhang, He, primary, Li, Xiaoke, additional, Zheng, Size, additional, Wen, Jin, additional, Zhou, Jiaying, additional, Yang, Rui, additional, Luo, Wenmei, additional, Yang, Liu, additional, and Wu, Xiaohu, additional
Published: 2023
Full Text: View/download PDF

28. Rubick: A Unified Infrastructure for Analyzing, Exploring, and Implementing Spatial Architectures via Dataflow Decomposition

Author: Lu, Liqiang, primary, Luo, Zizhang, additional, Zheng, Size, additional, Yin, Jieming, additional, Cong, Jason, additional, Liang, Yun, additional, and Yin, Jianwei, additional
Published: 2023
Full Text: View/download PDF

29. Rubick: A Unified Infrastructure for Analyzing, Exploring, and Implementing Spatial Architectures via Dataflow Decomposition

Author: Lu, Liqiang, Luo, Zizhang, Zheng, Size, Yin, Jieming, Cong, Jason, Liang, Yun, and Yin, Jianwei
Abstract: The fast-growing tensor applications expose tremendous dataflow alternatives when implemented on spatial architectures that feature large PE arrays and abundant interconnection resources. Prior works develop various notations and performance models for dataflows. Though these notations are very useful for understanding the reuse, bandwidth, and performance of dataflows, they do not define the underlying hardware implementation. Due to the semantic gap, analysis based on these notations cannot capture the detailed architectural features between different dataflows, leading to inefficient design space exploration and suboptimal designs. To address these issues, we propose Rubick, a unified infrastructure for analyzing, exploring, and implementing spatial architectures. The main innovation of Rubick is that it decomposes the dataflow into two low-level intermediate representations: 1) access entry and 2) data layout. Access entry specifies how data enter the PE arrays from memory, while data layout specifies how data are arranged and accessed. These two representations allow us to infer the hardware implementation details, such as PE interconnection and memory structure, which are amenable to structural analysis and systematic exploration. Based on this decomposition analysis, Rubick provides opportunities for micro-architecture optimization and efficient design space exploration. Our experiments demonstrate that Rubick can reduce 82.4% of wire resources with only a 2.7% latency increase by optimizing access entry IR, and achieve 70.8% memory overhead reduction by optimizing data layout IR. Rubick also accelerates the DSE time of dataflows by up to $1.1\times 10^{5}\text{X}$, reducing the time from several days to minutes. The source code of Rubick is publicly available on (https://link-omitted-for-blind-review).
Published: 2024
Full Text: View/download PDF

30. Multi-phase optimisation model predicts manual lifting motions with less reliance on experiment-based posture data

Author: Zheng, Size, primary, Li, Qingguo, additional, and Liu, Tao, additional
Published: 2022
Full Text: View/download PDF

31. Effect of different landing actions on knee joint biomechanics of female college athletes: Based on opensim simulation

Author: Chen, Liang, primary, Jiang, Ziang, additional, Yang, Chen, additional, Cheng, Rongshan, additional, Zheng, Size, additional, and Qian, Jingguang, additional
Published: 2022
Full Text: View/download PDF

32. Multi-phase optimisation model predicts manual lifting motions with less reliance on experiment-based posture data.

Author: Zheng, Size, Li, Qingguo, and Liu, Tao
Subjects: WORK measurement, LIFTING & carrying (Human mechanics), TASK performance, ROBOTICS, POSTURE, BODY movement, RESEARCH funding, ASSISTIVE technology, PREDICTION models, WEIGHT lifting, STATISTICAL correlation, PREDICTIVE validity, NEW product development, BIOMECHANICS
Abstract: Optimisation-based predictive models are widely-used to explore the lifting strategies. Existing models incorporated empirical subject-specific posture constraints to improve the prediction accuracy. However, over-reliance on these constraints limits the application of predictive models. This paper proposed a multi-phase optimisation method (MPOM) for two-dimensional sagittally symmetric semi-squat lifting prediction, which decomposes the complete lifting task into three phases—the initial posture, the final posture, and the dynamic lifting phase. The first two phases are predicted with force- and stability-related strategies, and the last phase is predicted with a smoothing-related objective. Box-lifting motions of different box initial heights were collected for validation. The results show that MPOM has better or similar accuracy than the traditional single-phase optimisation (SPOM) of minimum muscular utilisation ratio, and MPOM reduces the reliance on experimental data. MPOM offers the opportunity to improve accuracy at the expense of efforts to determine appropriate weightings in the posture prediction phases. Practitioner summary: Lifting optimisation models are useful to predict and explore the human motion strategies. Existing models rely on empirical subject-specific posture constraints, which limit their applications. A multi-phase model for lifting motion prediction was constructed. This model could accurately predict 2D lifting motions with less reliance on these constraints. [ABSTRACT FROM AUTHOR] more...
Published: 2023
Full Text: View/download PDF

33. Economic Analysis and Life Cycle Environmental Assessment of Imidazolium-Based Ionic Liquids for Separation of the Methanol/Dimethyl Carbonate Azeotrope.

Author: Guo, Chao, Liu, Xiaoyan, Wang, Fuqiang, Cao, Yuqing, Zheng, Size, and He, Ge
Published: 2023
Full Text: View/download PDF

34. Graph Clustering Analyses of Discontinuous Molecular Dynamics Simulations: Study of Lysozyme Adsorption on a Graphene Surface

Author: Chen, Jing, primary, Xu, Enze, additional, Wei, Yong, additional, Chen, Minghan, additional, Wei, Tao, additional, and Zheng, Size, additional
Published: 2022
Full Text: View/download PDF

35. Molecular dynamics study of structure, folding, and aggregation of poly-glycine-alanine (Poly-GA).

Author: Zheng, Size, Sahimi, Ali, Shing, Katherine S., and Sahimi, Muhammad
Subjects: *MOLECULAR dynamics, *GLYCINE, *AMYOTROPHIC lateral sclerosis, *HELICAL structure, *FRONTOTEMPORAL dementia, *PROTEIN models
Abstract: Poly-glycine-alanine (poly-GA) proteins are widely believed to be one of the main toxic dipeptide repeat molecules associated with amyotrophic lateral sclerosis (ALS) and frontotemporal dementia diseases. Using discontinuous molecular dynamics simulation and an all-atom model of the proteins, we study folding, stability, and aggregation of poly-GA. The results demonstrate that poly-GA is an aggregation-prone protein that, after a long enough time, forms β-sheet-rich aggregates that match recent experiment data and that two unique helical structures are formed very frequently, namely, β-helix and double-helix. The details of the two structures are analyzed. The analysis indicates that such helical structures are stable and share the characteristics of both α-helices and β-sheets. Molecular simulations indicate that identical phenomena also occur in the aggregation of poly-glycine-arginine (poly-GR). Therefore, we hypothesize that proteins of type (GX)n in which X may be any non-glycine amino acid and n is the repeat length may share the same folding structures of β-helix and double-helix and that it is the glycine in the repeat that contributes the most to this characteristic. Molecular dynamics simulation with continuous interaction potentials and explicit water molecules as the solvent supports the hypothesis. To our knowledge, this is the first molecular dynamics simulation of the phenomena involving poly-GA and poly-GR proteins. [ABSTRACT FROM AUTHOR] more...
Published: 2019
Full Text: View/download PDF

36. AMOS

Author: Zheng, Size, primary, Chen, Renze, additional, Wei, Anjiang, additional, Jin, Yicheng, additional, Han, Qin, additional, Lu, Liqiang, additional, Wu, Bingyang, additional, Li, Xiuhong, additional, Yan, Shengen, additional, and Liang, Yun, additional
Published: 2022
Full Text: View/download PDF

37. A Biomass‐Based Hydrogel Evaporator Modified Through Dynamic Regulation of Water Molecules: Highly Efficient and Cost‐Effective

Author: Luo, Boqiu, primary, Wen, Jin, additional, Wang, Hao, additional, Zheng, Size, additional, Liao, Rui, additional, Chen, Wenjing, additional, Mahian, Omid, additional, and Li, Xiaoke, additional
Published: 2022
Full Text: View/download PDF

38. A Biomass‐Based Hydrogel Evaporator Modified Through Dynamic Regulation of Water Molecules: Highly Efficient and Cost‐Effective.

Author: Luo, Boqiu, Wen, Jin, Wang, Hao, Zheng, Size, Liao, Rui, Chen, Wenjing, Mahian, Omid, and Li, Xiaoke
Abstract: Solar‐driven hydrogel evaporator used for water purification demonstrates great potential in seawater desalination and domestic sewage treatment. However, much uncertainty still exists about the most efficient design to obtain cost‐effective drinkable water. In this paper, a natural rich biomass Nicandra physalodes (Linn.) Gaertn. polysaccharide was introduced into the polyvinyl alcohol network to control the water distribution during evaporation and build a low‐cost hybrid hydrogel solar evaporator with a total material cost of $7.95 m−2. The mixed evaporator works stably in a long‐span acid–base range (pH 1–14) and salinity range (0–320 g kg−1). Its daily water purification capacity can reach 24.4 kg m−2 with a water purification capacity of 3.51 kg m−2 h−1 under sunlight. This paper provides a new possibility for a highly efficient and cost‐effective water desalination system with guaranteed water quality by focusing on the dynamic regulation of water molecules at the evaporation interface. [ABSTRACT FROM AUTHOR] more...
Published: 2023
Full Text: View/download PDF

39. Different Phases in Manual Materials Handling Have Different Performance Criteria: Evidence From Multi-Objective Optimization

Author: Zheng, Size, primary, Li, Tong, additional, Li, Qingguo, additional, and Liu, Tao, additional
Published: 2022
Full Text: View/download PDF

40. Solubility, Density, and Metastable Zone Width of Acidic Potassium Phosphate in Water and Phosphoric Acid Solvent Mixtures from 303.15 to 333.15 K

Author: Zhou, Kun, primary, Yin, Guoliang, additional, Zhu, Xunmei, additional, He, Jiaxiu, additional, Yang, Xiaojun, additional, Zheng, Size, additional, and Zhao, Ruiting, additional
Published: 2022
Full Text: View/download PDF

41. NeoFlow: A Flexible Framework for Enabling Efficient Compilation for High Performance DNN Training.

Author: Zheng, Size, Chen, Renze, Jin, Yicheng, Wei, Anjiang, Wu, Bingyang, Li, Xiuhong, Yan, Shengen, and Liang, Yun
Subjects: *COMPILERS (Computer programs), *AUTOMATIC differentiation, *NATURAL language processing, *REPRESENTATIONS of graphs, *IMAGE recognition (Computer vision), *DEEP learning
Abstract: Deep neural networks (DNNs) are increasingly deployed in various image recognition and natural language processing applications. The continuous demand for accuracy and high performance has led to innovations in DNN design and a proliferation of new operators. However, existing DNN training frameworks such as PyTorch and TensorFlow only support a limited range of operators and rely on hand-optimized libraries to provide efficient implementations for these operators. To evaluate novel neural networks with new operators, the programmers have to either replace the holistic new operators with existing operators or provide low-level implementations manually. Therefore, a critical requirement for DNN training frameworks is to provide high-performance implementations for the neural networks containing new operators automatically in the absence of efficient library support. In this article, we introduce NeoFlow, which is a flexible framework for enabling efficient compilation for high-performance DNN training. NeoFlow allows the programmers to directly write customized expressions as new operators to be mapped to graph representation and low-level implementations automatically, providing both high programming productivity and high performance. First, NeoFlow provides expression-based automatic differentiation to support customized model definitions with new operators. Then, NeoFlow proposes an efficient compilation system that partitions the neural network graph into subgraphs, explores optimized schedules, and generates high-performance libraries for subgraphs automatically. Finally, NeoFlow develops an efficient runtime system to combine the compilation and training as a whole by overlapping their execution. In the experiments, we examine the numerical accuracy and performance of NeoFlow. The results show that NeoFlow can achieve similar or even better performance at the operator and whole graph level for DNNs compared to deep learning frameworks. 
Especially, for novel networks training, the geometric mean speedups of NeoFlow to PyTorch, TensorFlow, and CuDNN are 3.16X, 2.43X, and 1.92X, respectively. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

42. HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation

Author: Xiao, Qingcheng, primary, Zheng, Size, additional, Wu, Bingzhe, additional, Xu, Pengcheng, additional, Qian, Xuehai, additional, and Liang, Yun, additional
Published: 2021
Full Text: View/download PDF

43. Discontinuous Molecular Dynamics Simulations of Biomolecule Interfacial Behavior: Study of Ovispirin-1 Adsorption on a Graphene Surface

Author: Zheng, Size, primary, Sajib, Md Symon Jahan, additional, Wei, Yong, additional, and Wei, Tao, additional
Published: 2021
Full Text: View/download PDF

44. NeoFlow: A Flexible Framework for Enabling Efficient Compilation for High Performance DNN Training

Author: Zheng, Size, primary, Chen, Renze, additional, Jin, Yicheng, additional, Wei, Anjiang, additional, Wu, Bingyang, additional, Li, Xiuhong, additional, Yan, Shengen, additional, and Liang, Yun, additional
Published: 2021
Full Text: View/download PDF

45. Highly Efficient and Cost-Effective Water Desalination via the Hybrid Hydrogel-Based Solar-Driven Interfacial Evaporation

Author: Luo, Boqiu, primary, Wang, Hao, additional, Zheng, Size, additional, Liao, Rui, additional, Chen, Wenjing, additional, Mahian, Omid, additional, and Li, Xiaoke, additional
Published: 2021
Full Text: View/download PDF

46. A Passive Lifting Assist Exoskeleton with Multiple Working Modes: Theoretical Evaluation and Design Concepts

Author: Zheng, Size, primary, Yuan, Beizhe, additional, Ferreira, Joao Paulo, additional, Liu, Tao, additional, Li, Tong, additional, He, Long, additional, and Wang, Xinrui, additional
Published: 2020
Full Text: View/download PDF

47. SuSy

Author: Lai, Yi-Hsiang, primary, Rong, Hongbo, additional, Zheng, Size, additional, Zhang, Weihao, additional, Cui, Xiuping, additional, Jia, Yunshan, additional, Wang, Jie, additional, Sullivan, Brendan, additional, Zhang, Zhiru, additional, Liang, Yun, additional, Zhang, Youhui, additional, Cong, Jason, additional, George, Nithin, additional, Alvarez, Jose, additional, Hughes, Christopher, additional, and Dubey, Pradeep, additional
Published: 2020
Full Text: View/download PDF

48. Insight into the Mechanism of Internalization of the Cell-Penetrating Carrier Peptide Pep-1 by Conformational Analysis

Author: Wang, Ting, primary, Wang, Chu, additional, Zheng, Size, additional, Qu, Guanwen, additional, Feng, Zhangqi, additional, Shang, Jing, additional, Cheng, Yaozhong, additional, and He, Nongyue, additional
Published: 2020
Full Text: View/download PDF

49. FlexTensor

Author: Zheng, Size, primary, Liang, Yun, additional, Wang, Shuo, additional, Chen, Renze, additional, and Sheng, Kaiwen, additional
Published: 2020
Full Text: View/download PDF

50. Accelerating convolutional neural networks on FPGAs

Author: Lu, Liqiang, primary, Zheng, Size, additional, Xiao, Qingcheng, additional, Chen, Deming, additional, and Liang, Yun, additional
Published: 2019
Full Text: View/download PDF
