1. Residual Quantization for Low Bit-Width Neural Networks
- Author
- Wenjun Zhang, Bingbing Ni, Zefan Li, Wen Gao, Teng Li, and Xiaokang Yang
- Subjects
- Artificial neural network, Computer science, Quantization (signal processing), Binary number, Maximization, Residual, Computer Science Applications, Acceleration, Compression (functional analysis), Signal Processing, Media Technology, Electrical and Electronic Engineering, Representation (mathematics), Algorithm
- Abstract
Neural network quantization has been shown to be an effective technique for network compression and acceleration. However, existing binary or ternary quantization methods suffer from two major issues. First, low bit-width input/activation quantization easily results in severe prediction accuracy degradation. Second, network training and quantization are usually treated as two unrelated tasks, leading to accumulated parameter training error and quantization error. In this work, we introduce a novel scheme, named Residual Quantization, to train a neural network with both weights and inputs constrained to low bit-width, e.g., binary or ternary values. On one hand, by recursively performing residual quantization, the resulting binary/ternary network is guaranteed to approximate the full-precision network with much smaller errors. On the other hand, we mathematically re-formulate the network training scheme in an EM-like manner, which iteratively performs network quantization and parameter optimization. During expectation, the low bit-width network is encouraged to approximate the full-precision network. During maximization, the low bit-width network is further tuned to gain better representation capability. Extensive experiments demonstrate that the proposed quantization scheme outperforms previous low bit-width methods and achieves performance much closer to that of the full-precision counterpart.
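To make the recursion in the abstract concrete, here is a minimal NumPy sketch of residual binary quantization of a weight tensor; it is an illustration of the general idea, not the authors' exact algorithm. The stage count and the per-stage scale `alpha_k = mean(|r_k|)` (the L2-optimal scale for a sign basis) are assumptions for this sketch.

```python
import numpy as np

def residual_binary_quantize(w, num_stages=2):
    """Approximate w as a sum of scaled binary tensors:
    w ~ sum_k alpha_k * b_k, with b_k in {-1, +1}.
    Each stage quantizes the residual left by the previous stages,
    so the approximation error shrinks as num_stages grows."""
    residual = np.asarray(w, dtype=np.float64).copy()
    alphas, bases = [], []
    for _ in range(num_stages):
        b = np.sign(residual)
        b[b == 0] = 1.0                    # break ties at exactly zero
        alpha = np.mean(np.abs(residual))  # L2-optimal scale for sign(residual)
        alphas.append(alpha)
        bases.append(b)
        residual = residual - alpha * b    # quantize this residual next
    approx = sum(a * b for a, b in zip(alphas, bases))
    return approx, alphas, bases
```

With one stage this reduces to classic scaled binarization; each extra stage binarizes the remaining error, which is why a two- or three-stage residual code tracks the full-precision weights far more closely than a single binary code.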
- Published
- 2023