Author: "Chen, Yiran" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen, Yiran"' showing total 2,129 results

151. PENNI: Pruned Kernel Sharing for Efficient CNN Inference

Author: Li, Shiyu, Hanson, Edward, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although state-of-the-art (SOTA) CNNs achieve outstanding performance on various tasks, their high computation demand and massive number of parameters make it difficult to deploy these SOTA CNNs onto resource-constrained devices. Previous works on CNN acceleration utilize low-rank approximation of the original convolution layers to reduce computation cost. However, these methods are very difficult to conduct upon sparse models, which limits execution speedup since redundancies within the CNN model are not fully exploited. We argue that kernel granularity decomposition can be conducted with low-rank assumption while exploiting the redundancy within the remaining compact coefficients. Based on this observation, we propose PENNI, a CNN model compression framework that is able to achieve model compactness and hardware efficiency simultaneously by (1) implementing kernel sharing in convolution layers via a small number of basis kernels and (2) alternately adjusting bases and coefficients with sparse constraints. Experiments show that we can prune 97% of parameters and 92% of FLOPs for ResNet18 on CIFAR10 with no accuracy loss, and achieve a 44% reduction in run-time memory consumption and a 53% reduction in inference latency.
Comment: 9 pages, 5 figures, to appear at ICML 2020
Published: 2020

152. TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Author: Xu, Yuhui, Li, Yuxi, Zhang, Shuai, Wen, Wei, Wang, Botao, Qi, Yingyong, Chen, Yiran, Lin, Weiyao, and Xiong, Hongkai
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: To enable DNNs on edge devices like mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pretrained model by low-rank decomposition; however, small approximation errors in parameters can ripple into a large prediction loss. As a result, performance usually drops significantly and a sophisticated fine-tuning effort is required to recover accuracy. Evidently, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear regularization optimized by stochastic sub-gradient descent is utilized to further promote low rank in TRP. The TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, thus eliminating the fine-tuning process after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods using low-rank approximation.
Comment: Accepted by IJCAI 2020; an extended version of arXiv:1812.02402
Published: 2020

153. Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability

Author: Inkawhich, Nathan, Liang, Kevin J, Wang, Binghui, Inkawhich, Matthew, Carin, Lawrence, and Chen, Yiran
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers. Rather than focusing on crossing decision boundaries at the output layer of the source model, our method perturbs representations throughout the extracted feature hierarchy to resemble other classes. We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance between ImageNet DNNs. We also show the superiority of our feature space methods under a relaxation of the common assumption that the source and target models are trained on the same dataset and label space, in some instances achieving a $10\times$ increase in targeted success rate relative to other blackbox transfer methods. Finally, we analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
Published: 2020

154. Transferable Perturbations of Deep Feature Distributions

Author: Inkawhich, Nathan, Liang, Kevin J, Carin, Lawrence, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Almost all current adversarial attacks of CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models. Further, we place a priority on explainability and interpretability of the attacking process. Our methodology affords an analysis of how adversarial attacks change the intermediate feature distributions of CNNs, as well as a measure of layer-wise and class-wise feature distributional separability/entanglement. We also conceptualize a transition from task/data-specific to model-specific features within a CNN architecture that directly impacts the transferability of adversarial examples.
Comment: Published as a conference paper at ICLR 2020
Published: 2020

155. Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification

Author: Yang, Huanrui, Tang, Minxue, Wen, Wei, Yan, Feng, Hu, Daniel, Li, Ang, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Modern deep neural networks (DNNs) often require high memory consumption and large computational loads. In order to deploy DNN algorithms efficiently on edge or mobile devices, a series of DNN compression algorithms have been explored, including factorization methods. Factorization methods approximate the weight matrix of a DNN layer with the product of two or more low-rank matrices. However, it is hard to measure the ranks of DNN layers during the training process. Previous works mainly induce low rank through implicit approximations or via a costly singular value decomposition (SVD) process on every training step. The former approach usually induces a high accuracy loss while the latter has low efficiency. In this work, we propose SVD training, the first method to explicitly achieve low-rank DNNs during training without applying SVD on every step. SVD training first decomposes each layer into the form of its full-rank SVD, then performs training directly on the decomposed weights. We add orthogonality regularization to the singular vectors, which ensures the valid form of SVD and avoids vanishing/exploding gradients. Low rank is encouraged by applying sparsity-inducing regularizers on the singular values of each layer. Singular value pruning is applied at the end to explicitly reach a low-rank model. We empirically show that SVD training can significantly reduce the rank of DNN layers and achieve a higher reduction in computation load under the same accuracy, compared to not only previous factorization methods but also state-of-the-art filter pruning methods.
Comment: In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). To be presented at the EDLCV 2020 workshop co-located with CVPR 2020
Published: 2020

156. Extractive Summarization as Text Matching

Author: Zhong, Ming, Liu, Pengfei, Chen, Yiran, Wang, Danqing, Qiu, Xipeng, and Huang, Xuanjing
Subjects: Computer Science - Computation and Language
Abstract: This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. Instead of following the commonly used framework of extracting sentences individually and modeling the relationship between sentences, we formulate the extractive summarization task as a semantic text matching problem, in which a source document and candidate summaries (extracted from the original text) will be matched in a semantic space. Notably, this paradigm shift to a semantic matching framework is well-grounded in our comprehensive analysis of the inherent gap between sentence-level and summary-level extractors based on the property of the dataset. Moreover, even when instantiating the framework with a simple form of a matching model, we have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1). Experiments on the other five datasets also show the effectiveness of the matching framework. We believe the power of this matching-based summarization framework has not been fully exploited. To encourage more instantiations in the future, we have released our code, processed dataset, and generated summaries at https://github.com/maszhongming/MatchSum
Comment: Accepted by ACL 2020
Published: 2020

157. Neural Predictor for Neural Architecture Search

Author: Wen, Wei, Liu, Hanxiao, Li, Hai, Chen, Yiran, Bender, Gabriel, and Kindermans, Pieter-Jan
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Neural Architecture Search methods are effective but often use complex algorithms to come up with the best architecture. We propose an approach with three basic steps that is conceptually much simpler. First we train N random architectures to generate N (architecture, validation accuracy) pairs and use them to train a regression model that predicts accuracy based on the architecture. Next, we use this regression model to predict the validation accuracies of a large number of random architectures. Finally, we train the top-K predicted architectures and deploy the model with the best validation result. While this approach seems simple, it is more than 20 times as sample efficient as Regularized Evolution on the NASBench-101 benchmark and can compete on ImageNet with more complex approaches based on weight sharing, such as ProxylessNAS.
Published: 2019
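
The three steps described in the abstract above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the architecture encoding `(depth, width)`, the `true_accuracy` function (a synthetic stand-in for actually training a network), the feature map, and the linear regressor fitted by gradient descent are all assumptions made only to keep the example self-contained.

```python
import random


def true_accuracy(arch):
    """Hypothetical stand-in for training `arch` and measuring validation accuracy."""
    depth, width = arch
    return 0.5 + 0.03 * depth + 0.002 * width - 0.002 * depth ** 2


def sample_architecture(rng):
    """Draw a random (depth, width) pair from an assumed toy search space."""
    return (rng.randint(1, 12), rng.randint(8, 64))


def features(arch):
    """Scale the architecture description into a small feature vector."""
    depth, width = arch
    d, w = depth / 12, width / 64
    return [1.0, d, w, d * d]


def fit_linear(xs, ys, lr=0.1, epochs=2000):
    """Fit linear regression weights by gradient descent on mean squared error."""
    weights = [0.0] * len(xs[0])
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(weights, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n
        weights = [wi - lr * g for wi, g in zip(weights, grad)]
    return weights


def predict(weights, arch):
    return sum(wi * xi for wi, xi in zip(weights, features(arch)))


def neural_predictor_search(n_train=20, n_candidates=500, k=5, seed=0):
    rng = random.Random(seed)
    # Step 1: "train" N random architectures and fit the accuracy predictor.
    train_archs = [sample_architecture(rng) for _ in range(n_train)]
    train_accs = [true_accuracy(a) for a in train_archs]
    weights = fit_linear([features(a) for a in train_archs], train_accs)
    # Step 2: rank a large pool of random architectures by predicted accuracy.
    candidates = [sample_architecture(rng) for _ in range(n_candidates)]
    ranked = sorted(candidates, key=lambda a: predict(weights, a), reverse=True)
    top_k = ranked[:k]
    # Step 3: "train" only the top-K and keep the best by measured accuracy.
    best = max(top_k, key=true_accuracy)
    return best, top_k


best, top_k = neural_predictor_search()
print("selected architecture:", best)
```

Only `n_train + k` architectures are ever evaluated with the expensive function, which is where the sample efficiency claimed in the abstract would come from in a real setting.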

158. Qinzhu Liangxue inhibits IL-6-induced hyperproliferation and inflammation in HaCaT cells by regulating METTL14/SOCS3/STAT3 axis

Author: Chen, Yiran, Miao, Xiao, Xiang, Yanwei, Kuai, Le, Ding, Xiaojie, Ma, Tian, Li, Bin, and Fan, Bin
Published: 2023

159. Dual camera system for synchronous imaging in visible and non-visible spectral regions

Author: Chen, Yiran, Yin, Hujun, and Van Silfhout, Roelof
Abstract: A dual-camera system comprised of a visible camera and a long wavelength infrared camera system is presented. This solution for synchronous imaging in visible and non-visible spectral regions has important applications in localising and identifying objects with different thermal signatures. A detailed description of the design process and implementation of the dual-camera system is presented, highlighting all aspects of the image sensors used, their control and interfacing requirements and the necessary image processing to combine the acquired images into a single multi-spectrum image. The developed system consists of an acrylic box containing both sensors, an electronic interface board for conditioning and readout of the infrared sensor, and an image processing system that also handles network connection and communication with client programs by acting as a server. A field programmable gate array (FPGA) device is used to implement the readout protocols for the two sensor modules. Apart from the readout function, the FPGA is also a hardware accelerator which can be combined with a host central processing unit (CPU) to form an Open Computing Language (OpenCL) framework for a heterogeneous platform to perform complex image processing functions. As part of the research, geometric calibration of this multi-sensor system is conducted to obtain the intrinsic and extrinsic parameters of the cameras, for the purpose of removing lens distortion effects. Stereo camera calibration and stereo image rectification are performed to obtain the relative position and orientation of the sensors. A method is proposed to match the two views from separate cameras using triangulation such that a combined image can be formed, which consists of information from the visible and infrared bands. With the acceleration of the OpenCL framework, this system can deliver a continuous stream of image pairs at 10.15 images/s.
For the radiometric calibration of the uncooled infrared camera, a method that creates a temperature-controlled environment has been used; this method does not require expensive blackbody equipment. From the image data analysis, a temperature model has been built to represent the relationship between the target surface temperature and the pixel value on the infrared image. A novel colour map is employed on the infrared image that can achieve better performance in the imaging modality than the conventional colour maps.
Published: 2021

160. Targeting the TCA cycle through cuproptosis confers synthetic lethality on ARID1A-deficient hepatocellular carcinoma

Author: Xing, Tao, Li, Li, Chen, Yiran, Ju, Gaoda, Li, Guilan, Zhu, Xiaoyun, Ren, Yubo, Zhao, Jing, Cheng, Zhilei, Li, Yan, Xu, Da, and Liang, Jun
Published: 2023

161. High-degree polymerizate IMOs of dextranase hydrolysates enhance Lactobacillus acid metabolism: Based on growth, and metabolomic and transcriptomic analyses

Author: Lin, Qianru, Liu, Mingwang, Ni, Hao, Hao, Yue, Yu, Yiqun, Chen, Yiran, Wu, Qing, Shen, Yi, Zhang, Lei, Lyu, Mingsheng, and Wang, Shujun
Published: 2023

162. Construction of Z-type heterojunction BiVO4/Sm/α-Fe2O3 photoanode for selective degradation: Efficient removal of bisphenol A based on multifunctional Sm-doped modification

Author: Chen, Yiran, Liu, Lu, Zhang, Lu, Li, Shunlin, Zhang, Xinyu, Yu, Wenchao, Wang, Feng, Xue, Wanlai, Wang, Hui, and Bian, Zhaoyong
Published: 2023

163. Enhanced electrocatalytic cathodic degradation of 2,4-dichlorophenoxyacetic acid based on a synergistic effect obtained from Co single atoms and Cu nanoclusters

Author: Liu, Lu, Chen, Yiran, Li, Shunlin, Yu, Wenchao, Zhang, Xinyu, Wang, Hui, Ren, Jianan, and Bian, Zhaoyong
Published: 2023

164. Explainability Metrics of Deep Convolutional Networks for Photoplethysmography Quality Assessment.

Author: Zhang, Oliver, Ding, Cheng, Pereira, Tania, Xiao, Ran, Gadhoumi, Kais, Meisel, Karl, Lee, Randall J, Chen, Yiran, and Hu, Xiao
Subjects: Deep neural network, PPG signal quality, biomedical informatics, Measurement, Deep learning, Annotations, Convolution, Biological system modeling, Neural networks, Data models, Information and Computing Sciences, Engineering, Technology
Abstract: Photoplethysmography (PPG) is a noninvasive way to monitor various aspects of the circulatory system, and is becoming more and more widespread in biomedical processing. Recently, deep learning methods for analyzing PPG have also become prevalent, achieving state-of-the-art results on heart rate estimation, atrial fibrillation detection, and motion artifact identification. Consequently, a need for interpretable deep learning has arisen within the field of biomedical signal processing. In this paper, we pioneer novel explanatory metrics which leverage domain-expert knowledge to validate a deep learning model. We visualize model attention over a whole test set using saliency methods and compare it to human expert annotations. Congruence, our first metric, measures the proportion of model attention within expert-annotated regions. Our second metric, Annotation Classification, measures how much of the expert annotations our deep learning model pays attention to. Finally, we apply our metrics to compare a signal-based model and an image-based model for PPG signal quality classification. Both models are deep convolutional networks based on the ResNet architectures. We show that our signal-based one-dimensional model acts in a more explainable manner than our image-based model; on average 50.78% of the one-dimensional model's attention is within expert annotations, whereas 36.03% of the two-dimensional model's attention is within expert annotations. Similarly, when thresholding the one-dimensional model's attention, one can more accurately predict if each pixel of the PPG is annotated as artifactual by an expert. Through this test case, we demonstrate how our metrics can provide a quantitative and dataset-wide analysis of how explainable the model is.
Published: 2021
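
The two metrics named in the abstract above can be illustrated with a small sketch. The abstract does not give formulas, so both functions below are assumptions: `congruence` is taken to be the share of total attention mass falling on expert-annotated samples, and `thresholded_agreement` is one plausible reading of predicting annotations by thresholding attention. The function names, the list-based inputs, and the threshold value are hypothetical.

```python
def congruence(attention, annotated):
    """Fraction of total attention mass that falls on expert-annotated samples.

    attention: non-negative saliency value per sample (assumed input format).
    annotated: True where an expert marked the sample (assumed input format).
    """
    total = sum(attention)
    if total == 0:
        return 0.0
    inside = sum(a for a, is_annotated in zip(attention, annotated) if is_annotated)
    return inside / total


def thresholded_agreement(attention, annotated, threshold):
    """Share of samples where (attention > threshold) matches the expert label."""
    hits = sum((a > threshold) == bool(m) for a, m in zip(attention, annotated))
    return hits / len(attention)


# Toy signal of four samples; the expert annotated the two middle samples.
attention = [1, 4, 3, 2]
annotated = [False, True, True, False]
print(congruence(attention, annotated))                  # 7 of 10 units of attention
print(thresholded_agreement(attention, annotated, 2.5))  # all four samples agree
```

Averaging `congruence` over every signal in a test set would give a dataset-wide figure comparable in spirit to the percentages quoted in the abstract.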

165. Are There Age Disparities in Community College Completion? Evidence from Ohio's Community Colleges

Author: Bahr, Peter Riley, Columbus, Rooney, and Chen, Yiran
Abstract: Research points to an age disparity in college completion, with adult community college students (ages 25 years and older) being less likely to complete postsecondary credentials than their younger peers. However, research also demonstrates that adult community college students are more likely to report educational goals that do not culminate in a postsecondary credential, especially goals related to updating job skills or changing careers. Hence, it is unclear whether the age gap in college completion is a result of differences in goals or a result of obstacles to persisting in college for adult students. Here, we use multilevel models to analyze longitudinal data from the Ohio community college system on over 300,000 first-time students in order to measure the age gaps in the completion of three types of postsecondary credentials--certificates, associate degrees, and baccalaureate degrees--after accounting for differences in the distribution of students' goals and other potentially relevant characteristics. We find that older students--both male and female--are more likely to complete certificates than their younger peers. Older women are more likely to complete associate degrees than younger women, while older men do not differ significantly from younger men in the likelihood of completing an associate degree. Conversely, older students are markedly less likely to complete baccalaureate degrees than younger students. Our results point to the potential value of increasing flexible access to baccalaureate degrees. Expanding community college baccalaureate offerings is a promising avenue for helping baccalaureate-seeking adult students achieve their goals.
Published: 2022

166. AutoShrink: A Topology-aware NAS for Discovering Efficient Neural Architecture

Author: Zhang, Tunhou, Cheng, Hsin-Pai, Li, Zhenwen, Yan, Feng, Huang, Chengyu, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Resource is an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, due to the topology-agnostic nature of existing works, including both cell-based and node-based approaches, the search process is time-consuming and the performance of the found architecture may be sub-optimal. To address these problems, we propose AutoShrink, a topology-aware Neural Architecture Search (NAS) for searching efficient building blocks of neural architectures. Our method is node-based and thus can learn flexible network patterns in cell structures within a topological search space. Directed Acyclic Graphs (DAGs) are used to abstract DNN architectures and progressively optimize the cell structure through edge shrinking. As the search space intrinsically reduces as the edges are progressively shrunk, AutoShrink explores a more flexible search space with even less search time. We evaluate AutoShrink on image classification and language tasks by crafting ShrinkCNN and ShrinkRNN models. ShrinkCNN is able to achieve up to 48% parameter reduction and save 34% of Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to state-of-the-art (SOTA) models. Specifically, both ShrinkCNN and ShrinkRNN are crafted within 1.5 GPU hours, which is 7.2x and 6.7x faster than the crafting time of SOTA CNN and RNN models, respectively.
Published: 2019

167. Structural sparsification for Far-field Speaker Recognition with GNA

Author: Zhang, Jingchi, Huang, Jonathan, Deisher, Michael, Li, Hai, and Chen, Yiran
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning
Abstract: Recently, deep neural networks (DNNs) have been widely used in the speaker recognition area. In order to achieve fast response time and high accuracy, the requirements for hardware resources increase rapidly. However, as the speaker recognition application is often implemented on mobile devices, it is necessary to maintain a low computational cost while keeping high accuracy in far-field conditions. In this paper, we apply structural sparsification on time-delay neural networks (TDNN) to remove redundant structures and accelerate the execution. On our targeted hardware, our model can remove 60% of parameters while only slightly increasing the equal error rate (EER) by 0.18%, and our structural sparse model can achieve more than 1.5x speedup.
Comment: submitted to ICASSP 2020
Published: 2019

168. Trained Rank Pruning for Efficient Deep Neural Networks

Author: Xu, Yuhui, Li, Yuxi, Zhang, Shuai, Wen, Wei, Wang, Botao, Dai, Wenrui, Qi, Yingyong, Chen, Yiran, Lin, Weiyao, and Xiong, Hongkai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in parameters can ripple into a large prediction loss. Evidently, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear regularization optimized by stochastic sub-gradient descent is utilized to further promote low rank in TRP. Networks trained with TRP have an inherently low-rank structure and can be approximated with negligible performance loss, thus eliminating fine-tuning after low-rank approximation. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression counterparts using low-rank approximation. Our code is available at: https://github.com/yuhuixu1993/Trained-Rank-Pruning
Comment: overlaps with arXiv:1812.02402; this version is withdrawn in order to merge the two submissions
Published: 2019

169. Conditional Transferring Features: Scaling GANs to Thousands of Classes with 30% Less High-quality Data for Training

Author: Wu, Chunpeng, Wen, Wei, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Generative adversarial networks (GANs) have greatly improved the quality of unsupervised image generation. Previous GAN-based methods often require a large amount of high-quality training data while producing a small number (e.g., tens) of classes. This work aims to scale up GANs to thousands of classes while reducing the use of high-quality data in training. We propose an image generation method based on conditional transferring features, which can capture pixel-level semantic changes when transforming low-quality images into high-quality ones. Moreover, self-supervised learning is integrated into our GAN architecture to provide more label-free semantic supervisory information observed from the training data. As such, training our GAN architecture requires far fewer high-quality images, with a small number of additional low-quality images. The experiments on CIFAR-10 and STL-10 show that even after removing 30% of the high-quality images from the training set, our method can still outperform previous ones. The scalability on object classes has been experimentally validated: our method with 30% fewer high-quality images obtains the best quality in generating 1,000 ImageNet classes, as well as generating all 3,755 classes of CASIA-HWDB1.0 Chinese handwriting characters.
Published: 2019

170. Towards Efficient and Secure Delivery of Data for Deep Learning with Privacy-Preserving

Author: Shen, Juncheng, Liu, Juzheng, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Privacy has recently emerged as a severe concern in deep learning; that is, sensitive data must not be shared with third parties during deep neural network development. In this paper, we propose Morphed Learning (MoLe), an efficient and secure scheme to deliver deep learning data. MoLe has two main components: data morphing and the Augmented Convolutional (Aug-Conv) layer. Data morphing allows data providers to send morphed data without privacy information, while the Aug-Conv layer helps deep learning developers apply their networks to the morphed data without performance penalty. MoLe provides stronger security while introducing lower overhead compared to GAZELLE (USENIX Security 2018), which is another method with no performance penalty on the neural network. When using MoLe for the VGG-16 network on the CIFAR dataset, the computational overhead is only 9% and the data transmission overhead is 5.12%. As a comparison, GAZELLE has a computational overhead of 10,000 times and a data transmission overhead of 421,000 times. In this setting, the attack success rate of the adversary is 7.9 x 10^{-90} for MoLe and 2.9 x 10^{-30} for GAZELLE.
Published: 2019

171. DeepObfuscator: Obfuscating Intermediate Representations with Privacy-Preserving Adversarial Learning on Smartphones

Author: Li, Ang, Guo, Jiayi, Yang, Huanrui, Salim, Flora D., and Chen, Yiran
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep learning has been widely applied in many computer vision applications, with remarkable success. However, running deep learning models on mobile devices is generally challenging due to the limitation of computing resources. A popular alternative is to use cloud services to run deep learning models to process raw data. This, however, imposes privacy risks. Some prior works proposed sending the features extracted from raw data to the cloud. Unfortunately, these extracted features can still be exploited by attackers to recover raw images and to infer embedded private attributes. In this paper, we propose an adversarial training framework, DeepObfuscator, which prevents the usage of the features for reconstruction of the raw images and inference of private attributes. This is done while retaining useful information for the intended cloud service. DeepObfuscator includes a learnable obfuscator that is designed to hide privacy-related sensitive information from the features by performing our proposed adversarial training algorithm. The proposed algorithm is designed by simulating the game between an attacker who tries to reconstruct the raw image and infer private attributes from the extracted features and a defender who aims to protect user privacy. By deploying the trained obfuscator on the smartphone, features can be locally extracted and then sent to the cloud. Our experiments on the CelebA and LFW datasets show that the quality of the images reconstructed from the obfuscated features of the raw image is dramatically decreased from 0.9458 to 0.3175 in terms of multi-scale structural similarity. The person in the reconstructed image, hence, can hardly be re-identified. The classification accuracy of the inferred private attributes that can be achieved by the attacker is significantly reduced to a random-guessing level.
Comment: This paper is to be published in IoTDI'21
Published: 2019

172. Accelerating CNN Training by Pruning Activation Gradients

Author: Ye, Xucheng, Dai, Pengcheng, Luo, Junyu, Guo, Xin, Qi, Yingjie, Yang, Jianlei, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Sparsification is an efficient approach to accelerate CNN inference, but it is challenging to take advantage of sparsity in the training procedure because the involved gradients change dynamically. An important observation shows that most of the activation gradients in back-propagation are very close to zero and only have a tiny impact on weight updating. Hence, we consider pruning these very small gradients randomly to accelerate CNN training according to the statistical distribution of activation gradients. Meanwhile, we theoretically analyze the impact of the pruning algorithm on convergence. The proposed approach is evaluated on AlexNet and ResNet-\{18, 34, 50, 101, 152\} with the CIFAR-\{10, 100\} and ImageNet datasets. Experimental results show that our training approach can achieve up to $5.92 \times$ speedup at the back-propagation stage with negligible accuracy loss.
Comment: accepted by ECCV 2020
Published: 2019
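
The idea of randomly pruning very small gradients, as described in the abstract above, can be sketched with a simple unbiased rounding rule. The abstract does not state the exact rule or how the threshold is derived from the gradient distribution, so everything here is an assumption: a gradient with magnitude below a fixed threshold `tau` is either rounded up to `±tau` with probability `|g| / tau` or set to zero, which keeps its expected value unchanged. The function name and the fixed `tau` are hypothetical.

```python
import random


def stochastic_prune(grads, tau, rng):
    """Randomly zero out gradients smaller than tau while keeping the expectation.

    Assumed rule (not taken from the abstract): for |g| < tau, output
    sign(g) * tau with probability |g| / tau and 0 otherwise, so E[output] == g.
    Gradients with |g| >= tau pass through unchanged. Requires tau > 0.
    """
    pruned = []
    for g in grads:
        if abs(g) >= tau:
            pruned.append(g)
        elif rng.random() < abs(g) / tau:
            pruned.append(tau if g > 0 else -tau)
        else:
            pruned.append(0.0)
    return pruned


rng = random.Random(0)
grads = [0.5, -0.01, 0.02, 0.0]
print(stochastic_prune(grads, tau=0.1, rng=rng))
```

Most small entries become exact zeros, which is the sparsity a hardware or software back-propagation kernel could exploit, while the occasional `±tau` entries compensate so that the pruned gradient remains an unbiased estimate of the original under this assumed rule.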

173. RED: A ReRAM-based Deconvolution Accelerator

Author: Fan, Zichen, Li, Ziru, Li, Bing, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Emerging Technologies, Computer Science - Machine Learning
Abstract: Deconvolution has been widely used in neural networks. For example, it is essential for performing unsupervised learning in generative adversarial networks or constructing fully convolutional networks for semantic segmentation. Resistive RAM (ReRAM)-based processing-in-memory architecture has been widely explored in accelerating convolutional computation and demonstrates good performance. Performing deconvolution on existing ReRAM-based accelerator designs, however, suffers from long latency and high energy consumption because deconvolutional computation includes not only convolution but also extra add-on operations. To realize more efficient execution of deconvolution, we analyze its computation requirement and propose a ReRAM-based accelerator design, namely, RED. More specifically, RED integrates two orthogonal methods: the pixel-wise mapping scheme for reducing redundancy caused by zero-inserting operations, and the zero-skipping data flow for increasing the computation parallelism and therefore improving performance. Experimental evaluations show that, compared to the state-of-the-art ReRAM-based accelerator, RED can speed up operation by 1.15x~3.69x and reduce energy consumption by 8%~88.36%.
Comment: 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Published: 2019

174. SwiftNet: Using Graph Propagation as Meta-knowledge to Search Highly Representative Neural Architectures

Author: Cheng, Hsin-Pai, Zhang, Tunhou, Yang, Yukun, Yan, Feng, Li, Shiyu, Teague, Harris, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Designing neural architectures for edge devices is subject to constraints of accuracy, inference latency, and computational cost. Traditionally, researchers manually craft deep neural networks to meet the needs of mobile devices. Neural Architecture Search (NAS) was proposed to automate neural architecture design without requiring extensive domain expertise and significant manual effort. Recent works utilized NAS to design mobile models by taking into account hardware constraints and achieved state-of-the-art accuracy with fewer parameters and less computational cost measured in Multiply-accumulates (MACs). To find highly compact neural architectures, existing works rely on predefined cells and directly apply a width multiplier, which may limit model flexibility, reduce useful feature map information, and cause an accuracy drop. To address this issue, we propose GRAM (GRAph propagation as Meta-knowledge), which adopts a fine-grained (node-wise) search method and accumulates the knowledge learned in updates into a meta-graph. As a result, GRAM enables a more flexible search space and achieves higher search efficiency. Without the constraints of predefined cells or blocks, we propose a new structure-level pruning method to remove redundant operations in neural architectures. SwiftNet, which is a set of models discovered by GRAM, outperforms MobileNet-V2 with 2.15x higher accuracy density and 2.42x faster inference at similar accuracy. Compared with FBNet, SwiftNet reduces the search cost by 26x and achieves 2.35x higher accuracy density and 1.47x speedup while preserving similar accuracy. SwiftNet can obtain 63.28% top-1 accuracy on ImageNet-1K with only 53M MACs and 2.07M parameters. The corresponding inference latency is only 19.09 ms on Google Pixel 1.
Published: 2019

175. Thread Batching for High-performance Energy-efficient GPU Memory Design

Author: Li, Bing, Mao, Mengjie, Liu, Xiaoxiao, Liu, Tao, Liu, Zihao, Wen, Wujie, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Hardware Architecture, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Massive multi-threading in GPUs imposes tremendous pressure on memory subsystems. Due to the rapid growth in thread-level parallelism of GPUs and slowly improving peak memory bandwidth, memory becomes a bottleneck for GPU performance and energy efficiency. In this work, we propose an integrated architectural scheme to optimize memory accesses and therefore boost the performance and energy efficiency of GPUs. Firstly, we propose thread batch enabled memory partitioning (TEMP) to improve GPU memory access parallelism. In particular, TEMP groups multiple thread blocks that share the same set of pages into a thread batch and applies a page coloring mechanism to bind each streaming multiprocessor (SM) to dedicated memory banks. After that, TEMP dispatches the thread batch to an SM to ensure highly parallel memory-access streaming from the different thread blocks. Secondly, a thread batch-aware scheduling (TBAS) scheme is introduced to improve GPU memory access locality and to reduce contention on memory controllers and interconnection networks. Experimental results show that the integration of TEMP and TBAS can achieve up to 10.3% performance improvement and 11.3% DRAM energy reduction across diverse GPU applications. We also evaluate the performance interference of mixed CPU+GPU workloads when they are run on a heterogeneous system that employs our proposed schemes. Our results show that a simple solution can effectively ensure the efficient execution of both GPU and CPU applications.
Published: 2019

176. AutoGrow: Automatic Layer Growing in Deep Convolutional Networks

Author: Wen, Wei, Yan, Feng, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, Statistics - Machine Learning, I.2.6
Abstract: Depth is a key component of Deep Neural Networks (DNNs); however, designing depth is heuristic and requires much human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if the growth improves the accuracy; otherwise, it stops growing and thus discovers the depth. We propose robust growing and stopping policies to generalize to different network architectures and datasets. Our experiments show that by applying the same policy to different network architectures, AutoGrow can always discover near-optimal depth on the MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100 and ImageNet datasets. For example, in terms of accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is efficient: it discovers depth in a time similar to that of training a single DNN. Our code is available at https://github.com/wenwei202/autogrow. Comment: KDD 2020
Published: 2019

177. eSLAM: An Energy-Efficient Accelerator for Real-Time ORB-SLAM on FPGA Platform

Author: Liu, Runze, Yang, Jianlei, Chen, Yiran, and Zhao, Weisheng
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Simultaneous Localization and Mapping (SLAM) is a critical task for autonomous navigation. However, due to the computational complexity of SLAM algorithms, it is very difficult to achieve real-time implementation on low-power platforms. We propose an energy-efficient architecture for a real-time ORB (Oriented-FAST and Rotated-BRIEF) based visual SLAM system by accelerating the most time-consuming stages of feature extraction and matching on an FPGA platform. Moreover, the original ORB descriptor pattern is reformed in a rotationally symmetric manner, which is much more hardware friendly. Optimizations including rescheduling and parallelizing are further utilized to improve the throughput and reduce the memory footprint. Compared with Intel i7 and ARM Cortex-A9 CPUs on the TUM dataset, our FPGA realization achieves up to 3X and 31X frame rate improvement, as well as up to 71X and 25X energy efficiency improvement, respectively. Comment: to appear in DAC 2019
Published: 2019

178. Snooping Attacks on Deep Reinforcement Learning

Author: Inkawhich, Matthew, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Adversarial attacks have exposed a significant security vulnerability in state-of-the-art machine learning models. These models include deep reinforcement learning agents. The existing methods for attacking reinforcement learning agents assume the adversary has access either to the target agent's learned parameters or to the environment that the agent interacts with. In this work, we propose a new class of threat models, called snooping threat models, that are unique to reinforcement learning. In these snooping threat models, the adversary does not have the ability to interact with the target agent's environment, and can only eavesdrop on the action and reward signals being exchanged between agent and environment. We show that adversaries operating in these highly constrained threat models can still launch devastating attacks against the target agent by training proxy models on related tasks and leveraging the transferability of adversarial examples. Comment: 13 pages, 12 figures
Published: 2019

179. Low-Power Computer Vision: Status, Challenges, Opportunities

Author: Alyamkin, Sergei, Ardi, Matthew, Berg, Alexander C., Brighton, Achille, Chen, Bo, Chen, Yiran, Cheng, Hsin-Pai, Fan, Zichen, Feng, Chen, Fu, Bo, Gauen, Kent, Goel, Abhinav, Goncharenko, Alexander, Guo, Xuyang, Ha, Soonhoi, Howard, Andrew, Hu, Xiao, Huang, Yuanjun, Kang, Donghyun, Kim, Jaeyoun, Ko, Jong Gook, Kondratyev, Alexander, Lee, Junhyeok, Lee, Seungjae, Lee, Suwoong, Li, Zichao, Liang, Zhiyu, Liu, Juzheng, Liu, Xin, Lu, Yang, Lu, Yung-Hsiang, Malik, Deeptanshu, Nguyen, Hong Hanh, Park, Eunbyung, Repin, Denis, Shen, Liang, Sheng, Tao, Sun, Fei, Svitov, David, Thiruvathukal, George K., Zhang, Baiwu, Zhang, Jingchi, Zhang, Xiaopeng, and Zhuo, Shaojie
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Performance
Abstract: Computer vision has achieved impressive progress in recent years. Meanwhile, mobile phones have become the primary computing platforms for millions of people. In addition to mobile phones, many autonomous systems rely on visual data for making decisions, and some of these systems have limited energy (such as unmanned aerial vehicles, also called drones, and mobile robots). These systems rely on batteries, and energy efficiency is critical. This article serves two main purposes: (1) Examine the state of the art for low-power solutions to detect objects in images. Since 2015, the IEEE Annual International Low-Power Image Recognition Challenge (LPIRC) has been held to identify the most energy-efficient computer vision solutions. This article summarizes the 2018 winners' solutions. (2) Suggest directions for research as well as opportunities for low-power computer vision. Comment: Preprint, Accepted by IEEE Journal on Emerging and Selected Topics in Circuits and Systems. arXiv admin note: substantial text overlap with arXiv:1810.01732
Published: 2019

180. Low Power Inference for On-Device Visual Recognition with a Quantization-Friendly Solution

Author: Feng, Chen, Sheng, Tao, Liang, Zhiyu, Zhuo, Shaojie, Zhang, Xiaopeng, Shen, Liang, Ardi, Matthew, Berg, Alexander C., Chen, Yiran, Chen, Bo, Gauen, Kent, and Lu, Yung-Hsiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The IEEE Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015 that encourages joint hardware and software solutions for computer vision systems with low latency and power. Track 1 of the competition in 2018 focused on the innovation of software solutions with a fixed inference engine and hardware. This decision allowed participants to submit models online and not worry about building and bringing custom hardware on-site, which attracted a historically large number of submissions. Among the diverse solutions, the winning solution proposed a quantization-friendly framework for MobileNets that achieves an accuracy of 72.67% on the holdout dataset with an average latency of 27 ms on a single CPU core of a Google Pixel 2 phone, which is superior to the best real-time MobileNet models at the time. Comment: Accepted at the 2nd Workshop on Machine Learning on the Phone and other Consumer Devices (MLPCD 2)
Published: 2019

181. SPINBIS: Spintronics based Bayesian Inference System with Stochastic Computing

Author: Jia, Xiaotao, Yang, Jianlei, Dai, Pengcheng, Liu, Runze, Chen, Yiran, and Zhao, Weisheng
Subjects: Computer Science - Emerging Technologies, Computer Science - Hardware Architecture
Abstract: Bayesian inference is an effective approach for solving statistical learning problems, especially with uncertainty and incompleteness. However, Bayesian inference is a computing-intensive task whose efficiency is physically limited by the bottlenecks of conventional computing platforms. In this work, a spintronics-based stochastic computing approach is proposed for efficient Bayesian inference. The inherent stochastic switching behaviors of spintronic devices are exploited to build a stochastic bitstream generator (SBG) for stochastic computing with a hybrid CMOS/MTJ circuit design. To improve the inference efficiency, an SBG sharing strategy is leveraged to reduce the required SBG array scale by integrating a switch network between the SBG array and the stochastic computing logic. A device-to-architecture level framework is proposed to evaluate the performance of the spintronics-based Bayesian inference system (SPINBIS). Experimental results on data fusion applications show that SPINBIS could improve energy efficiency by about 12X over an MTJ-based approach with 45% design area overhead, and by about 26X over an FPGA-based approach. Comment: 14 pages, 26 figures, accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Published: 2019
Full Text: View/download PDF

182. HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

Author: Song, Linghao, Mao, Jiachen, Zhuo, Youwei, Qian, Xuehai, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Machine Learning
Abstract: With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is intensively studied both in academia and industry. However, we still face two challenges: large DNN models and datasets, which incur frequent off-chip memory accesses; and the training of DNNs, which is not well explored in recent accelerator designs. To truly provide high-throughput and energy-efficient acceleration for the training of deep and large models, we inevitably need to use multiple accelerators to explore coarse-grain parallelism, compared to the fine-grain parallelism inside a layer considered in most existing architectures. This poses the key research question of finding the best organization of computation and dataflow among accelerators. In this paper, inspired by recent work in machine learning systems, we propose a solution, HyPar, to determine layer-wise parallelism for deep neural network training with an array of DNN accelerators. HyPar partitions the feature map tensors (input and output), the kernel tensors, the gradient tensors, and the error tensors for the DNN accelerators. A partition constitutes the choice of parallelism for weighted layers. The optimization target is to search for a partition that minimizes the total communication during training of a complete DNN. To solve this problem, we propose a communication model to explain the source and amount of communications. Then, we use a hierarchical layer-wise dynamic programming method to search for the partition for each layer. Comment: To appear in the 2019 25th International Symposium on High-Performance Computer Architecture (HPCA 2019)
Published: 2019

183. Carrot and stick: Does dual-credit policy promote green innovation in auto firms?

Author: Li, Bo, Chen, Yiran, and Cao, Shaopeng
Published: 2023
Full Text: View/download PDF

184. A Metabolomics Study of Hypoxia Ischemia during Mouse Brain Development Using Hyperpolarized 13C

Author: Mikrogeorgiou, Alkisti, Chen, Yiran, Lee, Byong Sop, Bok, Robert, Sheldon, R Ann, Barkovich, A James, Xu, Duan, and Ferriero, Donna M
Subjects: Medical Biochemistry and Metabolomics, Biomedical and Clinical Sciences, Biomedical Imaging, Physical Injury - Accidents and Adverse Effects, Animals, Brain, Carbon Isotopes, Hypoxia, Lactic Acid, Magnetic Resonance Imaging, Magnetic Resonance Spectroscopy, Metabolomics, Mice, Pyruvic Acid, Developing brain, Hyperpolarized(13)C, Metabolism, Magnetic resonance spectroscopy, Neonatal brain injury, Pyruvate, Lactate, Hyperpolarized 13C, Neurosciences, Paediatrics and Reproductive Medicine, Cognitive Sciences, Neurology & Neurosurgery, Clinical sciences, Biological psychology
Abstract: Background: Hyperpolarized 13C spectroscopic magnetic resonance spectroscopy (MRS) is an advanced imaging tool that may provide important real-time information about brain metabolism. Methods: Mice underwent unilateral hypoxia-ischemia (HI) on postnatal day (P)10. Injured and sham mice were scanned at P10, P17, and P31. We used hyperpolarized 13C MRS to investigate the metabolic exchange of pyruvate to lactate in real time during brain development following HI. 13C-1-labeled pyruvate was hyperpolarized and injected into the tail vein through a tail-vein catheter. Chemical-shift imaging was performed to acquire spectral-spatial information of the metabolites in the brain. A voxel placed on each of the injured and contralateral hemispheres was chosen for comparison. The difference in pyruvate delivery and lactate to pyruvate ratio was calculated for each of the voxels at each time point. The normalized lactate level of the injured hemisphere was also calculated for each mouse at each of the scanning time points. Results: There was a significant reduction in pyruvate delivery and a higher lactate to pyruvate ratio in the ipsilateral (HI) hemisphere at P10. The differences decreased at P17 and disappeared at P31. The normalized lactate level in the injured hemisphere increased from P10 to P31 in both sham and HI mice without brain injury. Conclusion: We describe a method for detecting and monitoring the evolution of HI injury during brain maturation which could prove to be an excellent biomarker of injury.
Published: 2020

185. Towards Leveraging the Information of Gradients in Optimization-based Adversarial Attack

Author: Zhang, Jingyang, Cheng, Hsin-Pai, Wu, Chunpeng, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In recent years, deep neural networks demonstrated state-of-the-art performance in a large variety of tasks and therefore have been adopted in many applications. On the other hand, the latest studies revealed that neural networks are vulnerable to adversarial examples obtained by carefully adding small perturbation to legitimate samples. Based upon the observation, many attack methods were proposed. Among them, the optimization-based CW attack is the most powerful as the produced adversarial samples present much less distortion compared to other methods. The better attacking effect, however, comes at the cost of running more iterations and thus longer computation time to reach desirable results. In this work, we propose to leverage the information of gradients as a guidance during the search of adversaries. More specifically, directly incorporating the gradients into the perturbation can be regarded as a constraint added to the optimization process. We intuitively and empirically prove the rationality of our method in reducing the search space. Our experiments show that compared to the original CW attack, the proposed method requires fewer iterations towards adversarial samples, obtaining a higher success rate and resulting in smaller $\ell_2$ distortion.
Published: 2018

186. Trained Rank Pruning for Efficient Deep Neural Networks

Author: Xu, Yuhui, Li, Yuxi, Zhang, Shuai, Wen, Wei, Wang, Botao, Qi, Yingyong, Chen, Yiran, Lin, Weiyao, and Xiong, Hongkai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The performance of Deep Neural Networks (DNNs) has kept improving in recent years with increasing network depth and width. To enable DNNs on edge devices like mobile phones, researchers have proposed several network compression methods including pruning, quantization and factorization. Among the factorization-based approaches, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in parameters can ripple into a large prediction loss. As a result, performance usually drops significantly and sophisticated fine-tuning is required to recover accuracy. We argue that it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear regularization optimized by stochastic sub-gradient descent is utilized to further encourage low rank in TRP. The TRP-trained network has a low-rank structure by nature, and can be approximated with negligible performance loss, eliminating fine-tuning after low-rank approximation. The methods are comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods using low-rank approximation. Code is available at https://github.com/yuhuixu1993/Trained-Rank-Pruning. Comment: Accepted by NIPS2019 EMC2 workshop, the same version as the withdrawn arXiv:1910.04576
Published: 2018

187. Metabolome-wide association study of four groups of persistent organic pollutants and abnormal blood lipids

Author: Chen, Yiran, Lv, Jiayun, Fu, Lei, Wu, Yan, Zhou, Si, Liu, Shiwei, Zheng, Linjie, Feng, Wenru, and Zhang, Lin
Published: 2023
Full Text: View/download PDF

188. A goodness-of-fit test for copulas based on the collision test

Author: Chen, Yiran and Ökten, Giray
Published: 2022
Full Text: View/download PDF

189. Deep Learning for Routability

Author: Xie, Zhiyao, Pan, Jingyu, Chang, Chen-Chia, Liang, Rongjian, Barboza, Erick Carvajal, Chen, Yiran, Ren, Haoxing, editor, and Hu, Jiang, editor
Published: 2022
Full Text: View/download PDF

190. Recent advances in photoelectrocatalytic advanced oxidation processes: From mechanism understanding to catalyst design and actual applications

Author: Zhang, Xinyu, Yu, Wenchao, Guo, Yajie, Li, Shunlin, Chen, Yiran, Wang, Hui, and Bian, Zhaoyong
Published: 2023
Full Text: View/download PDF

191. Dopamine-mimetic-coated polyamidoamine-functionalized Fe3O4 nanoparticles for safe and efficient gene delivery

Author: Liu, Liang, Liu, Chaobing, Yang, Zhaojun, Chen, Yiran, Chen, Xin, and Guan, Jintao
Published: 2023
Full Text: View/download PDF

192. The impact of basketball on the social adjustment of Chinese middle school students: the chain mediating role of interpersonal relationships and self-identity

Author: Sui Haoran, Lu Tianci, Chen Hanwen, Tao Baole, Chen Yiran, and Jun Yan
Subjects: basketball intervention, middle school students, social adjustment, interpersonal relationships, self-identity, chain mediation, Psychology, BF1-990
Abstract: Background: This study examines the effects of 12 weeks of basketball on interpersonal relationships, self-identity and social adjustment of middle school students, as well as exploring the mediating role of interpersonal relationships and self-identity in basketball’s influence on social adjustment. Methods: A total of 87 students from a middle school in Jiangsu Province, China, were selected to participate in this study. A 12-week basketball intervention experiment was conducted, and questionnaires were administered to measure the study variables. Common method bias test, normality test, ANOVA and Pearson correlation analysis were used to analyze the study variables. The theoretical model of this study was also validated using the Process plug-in developed by Hayes, setting p < 0.05 (two-tail) as statistically significant. Results: After a 12-week basketball intervention experiment, the interpersonal relationships, self-identity and social adjustment of the middle school students in the experimental and control groups showed improvement, with the experimental group showing significantly greater improvement than the control group. A 12-week basketball intervention can positively impact the social adjustment of middle school students, with interpersonal relationships and self-identity acting as a chain mediator in the impact process.
Published: 2023
Full Text: View/download PDF

193. Processing-in-Memory Designs Based on Emerging Technology for Efficient Machine Learning Acceleration

Author: Kim, Bokyung, primary, Li, Hai, additional, and Chen, Yiran, additional
Published: 2024
Full Text: View/download PDF

194. Game-Based Mechanisms for Speaking English Enhancement: Exploring the Potential of Electronic Games in English Teaching in Chinese Universities

Author: Yin Hao and Chen Yiran
Subjects: Social Sciences
Abstract: With the continuous development of science and technology, the popularity of electronic games in China is getting higher and higher. As a long popular form of entertainment, electronic games are not only the favorite choice of young people but also influence oral English teaching in Chinese universities to a certain extent. This paper will analyze in depth the main mechanisms by which electronic games enhance students’ English speaking ability, and further explore the positive impact and possibility of introducing electronic games into the English speaking teaching system in Chinese universities.
Published: 2024
Full Text: View/download PDF

195. Research on the Application of Gamified Teaching in Primary School English Oral Teaching

Author: Chen Yiran and Yin Hao
Subjects: Social Sciences
Abstract: This study aims to explore the design and integration of educationally meaningful video games to enhance students’ problem-solving and planning skills in English oral instruction. By amalgamating educational and technological elements, we endeavor to create an engaging and effective learning milieu that stimulates students’ motivation and augments their proficiency in English oral expression. The research employs an experimental methodology, encompassing the design and development of an educational video game prototype, alongside field observations and survey analysis of students. The findings indicate that the use of educationally significant video game prototypes markedly increases students’ interest and engagement in problem-solving and planning, while also achieving substantial improvement in their English oral communication skills.
Published: 2024
Full Text: View/download PDF

196. Adversarial Attacks for Optical Flow-Based Action Recognition Classifiers

Author: Inkawhich, Nathan, Inkawhich, Matthew, Chen, Yiran, and Li, Hai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The success of deep learning research has catapulted deep models into production systems that our society is becoming increasingly dependent on, especially in the image and video domains. However, recent work has shown that these largely uninterpretable models exhibit glaring security vulnerabilities in the presence of an adversary. In this work, we develop a powerful untargeted adversarial attack for action recognition systems in both white-box and black-box settings. Action recognition models differ from image-classification models in that their inputs contain a temporal dimension, which we explicitly target in the attack. Drawing inspiration from image classifier attacks, we create new attacks which achieve state-of-the-art success rates on a two-stream classifier trained on the UCF-101 dataset. We find that our attacks can significantly degrade a model's performance with sparsely and imperceptibly perturbed examples. We also demonstrate the transferability of our attacks to black-box action recognition systems.
Published: 2018

197. LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning

Author: Cheng, Hsin-Pai, Yu, Patrick, Hu, Haojing, Yan, Feng, Li, Shiyu, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Distributed learning systems have enabled training large-scale models over large amounts of data in significantly shorter time. In this paper, we focus on decentralized distributed deep learning systems and aim to achieve differential privacy with a good convergence rate and low communication cost. To achieve this goal, we propose a new learning algorithm, LEASGD (Leader-Follower Elastic Averaging Stochastic Gradient Descent), which is driven by a novel Leader-Follower topology and a differential privacy model. We provide a theoretical analysis of the convergence rate and the trade-off between performance and privacy in the private setting. The experimental results show that LEASGD outperforms the state-of-the-art decentralized learning algorithm DPSGD by achieving steadily lower loss within the same number of iterations and by reducing the communication cost by 30%. In addition, LEASGD spends less differential privacy budget and achieves higher final accuracy than DPSGD in the private setting.
Published: 2018

198. A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform

Author: Fu, Wenzhi, Yang, Jianlei, Dai, Pengcheng, Chen, Yiran, and Zhao, Weisheng
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Region proposal is critical for object detection, but it usually poses a bottleneck in improving computation efficiency on traditional control-flow architectures. We have observed that region proposal tasks are potentially suitable for pipelined parallelism by exploiting dataflow-driven acceleration. In this paper, a scalable pipelined dataflow accelerator is proposed for efficient region proposals on an FPGA platform. The accelerator processes image data in a streaming manner with three sequential stages: resizing, kernel computing and sorting. First, a Ping-Pong cache strategy is adopted for rotation loading in the resize module to guarantee continuous output streaming. Then, a multiple-pipeline architecture with tiered memory is utilized in the kernel computing module to complete the main computation tasks. Finally, a bubble-pushing heap sort method is exploited in the sorting module to find the top-k largest candidates efficiently. Our design is implemented with high-level synthesis on FPGA platforms, and experimental results on the VOC2007 dataset show that it could achieve about 3.67X speedup over a traditional desktop CPU platform and >250X energy efficiency improvement over an embedded ARM platform. Comment: accepted by FPT 2018 Conference
Published: 2018

199. Differentiable Fine-grained Quantization for Deep Neural Network Compression

Author: Cheng, Hsin-Pai, Huang, Yuanjun, Guo, Xuyang, Huang, Yifei, Yan, Feng, Li, Hai, and Chen, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Neural networks have shown great performance in cognitive tasks. When deploying network models on mobile devices with limited resources, weight quantization has been widely adopted. Binary quantization obtains the highest compression but usually results in a large accuracy drop. In practice, 8-bit or 16-bit quantization is often used, aiming at maintaining the same accuracy as the original 32-bit precision. We observe that different layers have different accuracy sensitivity to quantization. Thus, judiciously selecting different precision for different layers/structures can potentially produce more efficient models compared to traditional quantization methods by striking a better balance between accuracy and compression rate. In this work, we propose a fine-grained quantization approach for deep neural network compression by relaxing the search space of quantization bitwidth from a discrete to a continuous domain. The proposed approach applies gradient descent based optimization to generate a mixed-precision quantization scheme that outperforms the accuracy of traditional quantization methods under the same compression rate. Comment: Hsin-Pai Cheng, Yuanjun Huang and Xuyang Guo contributed equally and are co-first authors for this paper. This work has been accepted by NIPS 2018 Workshop on Compact Deep Neural Network Representation with Industrial Applications, Montreal, Canada
Published: 2018

200. Generalized Inverse Optimization through Online Learning

Author: Dong, Chaosheng, Chen, Yiran, and Zeng, Bo
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Inverse optimization is a powerful paradigm for learning preferences and restrictions that explain the behavior of a decision maker, based on a set of pairs of external signals and corresponding decisions. However, most inverse optimization algorithms are designed specifically for the batch setting, where all the data is available in advance. As a consequence, these methods have rarely been used in an online setting suitable for real-time applications. In this paper, we propose a general framework for inverse optimization through online learning. Specifically, we develop an online learning algorithm that uses an implicit update rule which can handle noisy data. Moreover, under additional regularity assumptions in terms of the data and the model, we prove that our algorithm converges at a rate of $\mathcal{O}(1/\sqrt{T})$ and is statistically consistent. In our experiments, we show that the online learning approach can learn the parameters with great accuracy, is very robust to noise, and achieves a dramatic improvement in computational efficacy over the batch learning approach., Comment: 14 pages, 10 figures, Accepted at NIPS 2018
Published: 2018
