2,637 results
Search Results
2. CAVLCU: an efficient GPU-based implementation of CAVLC
- Author
-
Nicolás Guil, Antonio Fuentes-Alventosa, Juan Gómez-Luna, José María González-Linares, and Rafael Medina-Carnicer
- Subjects
Computer science, CAVLC, GPU, CUDA, H.264, Parallel implementations, Data compression, Variable-length encoding, Frame (networking), Memory bandwidth, Parallel computing, Encryption, Theoretical Computer Science, Hardware and Architecture, Encoding (memory), Encoder, Software, Information Systems, Image compression, Block (data storage), Context-adaptive variable-length coding
- Abstract
CAVLC (Context-Adaptive Variable Length Coding) is a high-performance entropy method for video and image compression. It is the most commonly used entropy method in the video standard H.264. In recent years, several hardware accelerators for CAVLC have been designed. In contrast, high-performance software implementations of CAVLC (e.g., GPU-based) are scarce. A high-performance GPU-based implementation of CAVLC is desirable in several scenarios. On the one hand, it can be exploited as the entropy component in GPU-based H.264 encoders, which are a very suitable solution when GPU built-in H.264 hardware encoders lack certain necessary functionality, such as data encryption and information hiding. On the other hand, a GPU-based implementation of CAVLC can be reused in a wide variety of GPU-based compression systems for encoding images and videos in formats other than H.264, such as medical images. This is not possible with hardware implementations of CAVLC, as they are non-separable components of hardware H.264 encoders. In this paper, we present CAVLCU, an efficient implementation of CAVLC on GPU, which is based on four key ideas. First, we use only one kernel to avoid the long-latency global memory accesses required to transmit intermediate results among different kernels, and the costly launches and terminations of additional kernels. Second, we apply an efficient synchronization mechanism for thread-blocks (in this paper, to prevent confusion, a block of pixels of a frame is referred to simply as a block and a GPU thread block as a thread-block) that process adjacent frame regions (in the horizontal and vertical dimensions) to share results in global memory space. Third, we fully exploit the available global memory bandwidth by using vectorized loads to move the quantized transform coefficients directly to registers. Fourth, we use register tiling to implement the zigzag sorting, thus obtaining high instruction-level parallelism. An exhaustive experimental evaluation showed that our approach is between 2.5x and 5.4x faster than the only state-of-the-art GPU-based implementation of CAVLC. (The Journal of Supercomputing, 78(6), ISSN 0920-8542, ISSN 1573-0484.)
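As a side note on the zigzag ordering mentioned in this abstract, the following minimal Python sketch (not the authors' CUDA register-tiling code) generates the scan order for an n × n block of coefficients; the 4 × 4 case reproduces the standard H.264 zigzag sequence.

```python
def zigzag_order(n):
    """Raster indices of an n x n block in zigzag scan order (anti-diagonal traversal)."""
    order = []
    for s in range(2 * n - 1):                          # s indexes the anti-diagonal r + c = s
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:                                  # alternate the traversal direction per diagonal
            diag.reverse()
        order.extend(r * n + c for r, c in diag)
    return order

# 4x4 blocks, as processed by CAVLC in H.264:
print(zigzag_order(4))  # [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```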
- Published
- 2021
3. The g-extra diagnosability of the balanced hypercube under the PMC and MM* model
- Author
-
Yuehong Chen, Qiao Sun, Lijuan Huang, Xin-Yang Wang, Naqin Zhou, Weiwei Lin, and Keqin Li
- Subjects
Discrete mathematics, Current (mathematics), Cover (topology), Hardware and Architecture, Computer science, Value (computer science), Fault tolerance, Hypercube, Upper and lower bounds, Software, Information Systems, Theoretical Computer Science
- Abstract
Fault diagnosis plays an important role in measuring the fault tolerance of an interconnection network, which is of great value in the design and maintenance of large-scale multiprocessor systems. As a classical variant of the hypercube, the balanced hypercube, denoted by $BH_n$ ($n \ge 1$), has drawn a lot of research attention, and its $g$-extra diagnosability has been studied to improve the network diagnostic ability. However, the current literature on the $g$-extra diagnosability of $BH_n$ under the PMC model only covers the cases of $g < 6$ and, what is more, seldom involves its $g$-extra diagnosability under the MM* model, which is a great limitation on the research of $BH_n$ diagnosability. In this paper, the upper and lower bounds of the $g$-extra diagnosability of the balanced hypercube are proved, respectively, based on the $g$-extra connectivity by the contradiction method, and finally, the $g$-extra diagnosability of $BH_n$ for $2 \le g \le 2n-1$ under the PMC and MM* models is obtained, i.e., $2\left[(n-2)\left\lceil\frac{g-1}{2}\right\rceil + n\right] + g$. In addition, as a special case, the $g$-extra diagnosability of the balanced hypercube for $g = 2n$ is proved to be $2^{2n-1} - 1$ under the PMC and MM* models. In the end, simulation experiments are conducted to verify the effectiveness of the proposed theories. The conclusions of this paper have both theoretical and practical value for the research of $BH_n$ fault diagnosis.
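As a quick numeric illustration of the closed-form result quoted in this abstract, the small Python snippet below evaluates the stated formula for sample values of $n$ and $g$; the choice $n = 4$ is arbitrary.

```python
from math import ceil

def g_extra_diagnosability(n, g):
    """2[(n-2)*ceil((g-1)/2) + n] + g, valid for 2 <= g <= 2n - 1 (as stated in the abstract)."""
    assert 2 <= g <= 2 * n - 1
    return 2 * ((n - 2) * ceil((g - 1) / 2) + n) + g

# Example values for BH_4:
for g in range(2, 8):
    print(g, g_extra_diagnosability(4, g))
# The abstract treats g = 2n separately, giving 2**(2*n - 1) - 1 in that case.
```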
- Published
- 2021
4. Understanding human emotions through speech spectrograms using deep neural network
- Author
-
Yu-Chen Hu, Stuti Juyal, and Vedika Gupta
- Subjects
Artificial neural network, Computer science, Speech recognition, Feature extraction, Perceptron, Theoretical Computer Science, Support vector machine, Hardware and Architecture, Bag-of-words model in computer vision, Classifier (linguistics), Cepstrum, Mel-frequency cepstrum, Software, Information Systems
- Abstract
This paper presents the analysis and classification of speech spectrograms for recognizing emotions in the RAVDESS dataset. Feature extraction from speech utterances is performed using Mel-Frequency Cepstral Coefficients (MFCC). Thereafter, deep neural networks are employed to classify speech into six emotions (happy, sad, neutral, calm, disgust, and fear). Firstly, this paper presents a comprehensive comparative study of DNNs on prosodic features. The outcomes of all DNNs are presented in the paper. Secondly, the paper puts forward an analysis of Bag of Visual Words (BoVW) that uses speeded-up robust features (SURF), clusters them using K-means, and further classifies them into the aforementioned emotions using a support vector machine (SVM). Out of the five DNNs deployed, (i) Long Short-Term Memory (LSTM) on MFCC and (ii) the Multi-Layer Perceptron (MLP) classifier on MFCC outperform the others, giving an accuracy score of 0.70 in both cases. Further, the BoVW technique achieved 53% correct classification. Therefore, the proposed methodology constructs a Hybrid of Acoustic Features (HAF) and feeds them into an ensemble of bagged multi-layer perceptron classifiers, yielding an accuracy of 85%. It also achieves a precision score between 0.77 and 0.88 for the classification of the six emotions.
- Published
- 2021
5. An effective SPMV based on block strategy and hybrid compression on GPU
- Author
-
Qilong Han, Nianbin Wang, Yuhua Wang, Huanyu Cui, and Yuezhu Xu
- Subjects
Computer science, Sparse matrix-vector multiplication, Serial code, Load balancing (computing), Theoretical Computer Science, Matrix (mathematics), Acceleration, Hardware and Architecture, Redundancy (engineering), Algorithm, Software, Information Systems, Sparse matrix, Block (data storage)
- Abstract
Due to the non-uniformity of sparse matrices, the calculation of SPMV (sparse matrix vector multiplication) suffers from redundant computation, redundant storage, unbalanced load and low GPU utilization. In this study, a new matrix compression method based on CSR and COO, the PBC algorithm, is proposed to address these problems. The method takes load balancing into account during the SPMV computation: blocks are formed in row-major order so that the standard deviation of the number of nonzero elements across blocks is minimized, i.e., the blocks are as similar as possible in their nonzero counts. The original matrix is preprocessed with this block-splitting algorithm so that the load-balancing condition holds for each block, stored in CSR or COO form. Finally, the experimental results show that the SPMV preprocessing time is within an acceptable range for the algorithm. Compared with the serial code without CSR optimization, the parallel method in this paper achieves a speedup of 178x. In addition, compared with the serial code with CSR optimization, the parallel method in this paper achieves a speedup of 6x. A representative matrix compression method is also selected for comparative analysis. The experimental results show that the PBC algorithm offers a clear efficiency improvement over the comparison algorithm.
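To make the block-splitting idea concrete, here is a simplified Python sketch that groups consecutive rows of a CSR matrix so that each block holds roughly the same number of nonzeros; this is only an approximation of the PBC strategy described above, not the authors' GPU implementation.

```python
import numpy as np
from scipy.sparse import random as sparse_random

def split_rows_balanced(csr, n_blocks):
    """Greedily group consecutive rows so each block holds roughly nnz / n_blocks nonzeros."""
    nnz_per_row = np.diff(csr.indptr)
    target = csr.nnz / n_blocks
    blocks, start, acc = [], 0, 0
    for i, count in enumerate(nnz_per_row):
        acc += count
        if acc >= target and len(blocks) < n_blocks - 1:
            blocks.append((start, i + 1))
            start, acc = i + 1, 0
    blocks.append((start, csr.shape[0]))
    return blocks

A = sparse_random(1000, 1000, density=0.01, format="csr", random_state=0)
for lo, hi in split_rows_balanced(A, 4):
    print(f"rows [{lo}, {hi}): nnz = {A.indptr[hi] - A.indptr[lo]}")
```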
- Published
- 2021
6. Software-defect prediction within and across projects based on improved self-organizing data mining
- Author
-
Junhua Ren and Qing Zhang
- Subjects
Correlation coefficient, Computer science, Data imbalance, Software metric, Theoretical Computer Science, Software bug, Ranking, Hardware and Architecture, Data pre-processing, Data mining, Software, Information Systems
- Abstract
This paper proposes a new method for software-defect prediction based on self-organizing data mining; the method can establish a causal relationship between software metrics and defects. Defect-prediction models were established for intra-project and cross-project scenarios. For intra-project forecasting, this article establishes a self-organizing data mining model and adds a smooth data-preprocessing method to solve the problem of data imbalance. For cross-project forecasting, this article establishes a self-organizing data mining model, reduces the difference between the source and target projects by selecting source-project instances with a higher correlation coefficient with the target project, and builds a defect-prediction model on the selected source-project instances. This paper aims to achieve both classification and ranking prediction. The proposed method is tested on public defect datasets. In the classification-prediction experiment, the method is evaluated with the precision, F-measure and AUC indicators. In the ranking-prediction experiment, the AAE and ARE indicators of the method are improved. The algorithm is found to be an efficient and feasible method for software-defect prediction.
- Published
- 2021
7. Dynamic weighted selective ensemble learning algorithm for imbalanced data streams
- Author
-
Du Hongle, Ke Gang, Zhang Lin, Yeh-Cheng Chen, and Zhang Yan
- Subjects
Data stream, Concept drift, Computer science, Data stream mining, Sample (statistics), Ensemble learning, Theoretical Computer Science, Hardware and Architecture, Resampling, Classifier (linguistics), Oversampling, Algorithm, Software, Information Systems
- Abstract
Data stream mining is one of the hot topics in data mining. Most existing algorithms assume that a data stream with concept drift is balanced. However, real-world data streams are imbalanced and exhibit concept drift. Learning becomes more complex for an imbalanced data stream with concept drift. In online learning algorithms, oversampling is used to select a small number of samples from the previous data block through a certain strategy and add them to the current data block to amplify the current minority class. However, in this method, the number of stored samples, the oversampling method and the weight calculation of the base classifiers all affect the classification performance of the ensemble classifier. This paper proposes a dynamic weighted selective ensemble (DWSE) learning algorithm for imbalanced data streams with concept drift. On the one hand, by resampling the minority samples in the previous data block, the minority samples of the current data block can be amplified, and the information in the previous data block can be absorbed into building a classifier to reduce the impact of concept drift. The calculation of the information content of every sample is defined, and the resampling and updating methods for the minority samples are given in this paper. On the other hand, because of concept drift, the performance of a base classifier degrades over time, and a decay factor is usually used to describe this degradation. However, a static decay factor cannot accurately describe the performance degradation of a base classifier under concept drift. The DWSE algorithm therefore defines a dynamic decay factor for each base classifier and uses it to select sub-classifiers for elimination according to their degradation, which enables the algorithm to deal better with concept drift. Compared with other algorithms, the results show that the DWSE algorithm has better classification performance for both majority-class and minority-class samples.
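As a toy illustration of the dynamic-decay idea, the sketch below re-weights scikit-learn-style base classifiers by their accuracy on the newest data block and eliminates those whose weight falls below a floor; the exact weighting rule is an assumption, not the DWSE formula.

```python
import numpy as np

def update_weights(classifiers, weights, X_block, y_block, floor=0.05):
    """Re-weight each base classifier by its accuracy on the newest data block
    and eliminate those whose weight decays below a floor."""
    acc = np.array([clf.score(X_block, y_block) for clf in classifiers])
    new_w = np.asarray(weights) * acc           # dynamic decay: old weight scaled by current accuracy
    keep = new_w >= floor
    kept = [clf for clf, k in zip(classifiers, keep) if k]
    new_w = new_w[keep]
    return kept, new_w / new_w.sum()            # renormalize the surviving ensemble
```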
- Published
- 2021
8. A new software cache structure on Sunway TaihuLight
- Author
-
Zhaochu Deng, Panpan Du, Jie Lin, and Jianjiang Li
- Subjects
Computer science, Program optimization, Supercomputer, Theoretical Computer Science, Data access, Hardware and Architecture, Hit rate, Operating system, Overhead (computing), Cache, Software, Information Systems, Sunway TaihuLight, Data transmission
- Abstract
The Sunway TaihuLight is the first supercomputer built entirely with domestic processors in China. On the Sunway TaihuLight, the local data memory (LDM) of a slave core is limited, so data transmission with the main memory is frequent during calculation and the memory access efficiency is low. On the other hand, for many scientific computing programs, solving the storage problem of irregularly accessed data is the key to program optimization. A software cache (SWC) is one of the effective means to solve these problems. Based on the characteristics of the Sunway TaihuLight architecture and irregular accesses, this paper designs and implements a new software cache structure that uses part of the LDM space to emulate the cache function, with a new cache address mapping and conflict-resolution scheme to reduce the high data access and storage overheads of a traditional cache. At the same time, the SWC uses register communication between the slave cores to share data across the LDMs of different slave cores, increasing the capacity of the software cache and improving the hit rate. In addition, we adopt a double-buffering strategy to access regular data in batches, which hides the communication overhead between the slave cores and the main memory. The test results on the Sunway TaihuLight platform show that the proposed software cache structure can effectively reduce the program running time, improve the software cache hit rate, and achieve a good optimization effect.
- Published
- 2021
9. Comparative evaluation of task priorities for processing and bandwidth capacities-based workflow scheduling for cloud environment
- Author
-
Zheng Wei, Zhang De-Fu, and Emmanuel Bugingo
- Subjects
Flexibility (engineering), Job shop scheduling, Computer science, Heuristic, Distributed computing, Cloud computing, Theoretical Computer Science, Task (computing), Hardware and Architecture, Virtual machine, Bandwidth (computing), Software, Information Systems, Data transmission
- Abstract
With the development of the cloud computing market, cloud computing providers offer their users the flexibility to choose the desired capacity of both processing and data transfer (bandwidth) to use during the execution of their applications. The selected processing capacity can be shared among a number of virtual machines (VMs) that are capable of completing the user’s application within optimal time. During the execution of the user’s application, each task is scheduled on the VM that can minimize its execution time. However, the tasks are interdependent, which means that the execution delay of a parent task will delay its dependent tasks. Bandwidth can be used to reduce the waiting time and avoid this delay of the dependent tasks. The execution time of the user’s application depends on factors such as task priority, the application’s structure and size, and the number of VMs selected to share the chosen processing capacity. Determining the number of VMs to share the user’s selected capacities under the user’s specified quality of service remains a big challenge. Determining the number of VMs to share the user’s selected processing capacity was studied in our previous paper, where makespan and idle time were given the same weight. In this paper, we extend our previous work by adding bandwidth capacity to the user’s selection and use CRITIC, a multi-criteria decision-making technique, to determine the weight of each criterion. The evaluation results show that the proposed heuristic works well under different parameter settings.
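CRITIC derives criterion weights from each criterion's contrast (standard deviation) and its conflict (correlation) with the other criteria; a compact NumPy sketch under the usual min-max normalization assumption is shown below, with made-up values for makespan, idle time and bandwidth cost.

```python
import numpy as np

def critic_weights(decision_matrix):
    """decision_matrix: alternatives x criteria (assumed benefit criteria, larger is better)."""
    X = np.asarray(decision_matrix, dtype=float)
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))   # min-max normalize per criterion
    std = X.std(axis=0, ddof=1)                                  # contrast intensity
    corr = np.corrcoef(X, rowvar=False)                          # criterion-to-criterion correlation
    info = std * (1.0 - corr).sum(axis=0)                        # information carried by each criterion
    return info / info.sum()

# Columns: makespan, idle time, bandwidth cost (illustrative values only).
print(critic_weights([[10, 4, 0.2], [12, 3, 0.4], [9, 5, 0.3], [11, 2, 0.5]]))
```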
- Published
- 2021
10. Automatic lane marking prediction using convolutional neural network and S-Shaped Binary Butterfly Optimization
- Author
-
Abrar M. Alajlan and Marwah Almasri
- Subjects
Hyperparameter, Receiver operating characteristic, Computer science, Stability (learning theory), Process (computing), Binary number, Pattern recognition, Convolutional neural network, Measure (mathematics), Theoretical Computer Science, Hardware and Architecture, Robustness (computer science), Artificial intelligence, Software, Information Systems
- Abstract
Lane detection is a technique that uses geometric features as input to the autonomous vehicle to automatically distinguish lane markings. To process the intricate features present in lane images, traditional computer vision (CV) techniques are typically time-consuming, need more computing resources, and use complex algorithms. To address this problem, this paper presents a deep convolutional neural network (CNN) architecture that avoids the complexities of traditional CV techniques. CNN is regarded as a reasonable method for lane marking prediction, but improved performance requires hyperparameter tuning. To enhance the initial parameter setting of the CNN, an S-Shaped Binary Butterfly Optimization Algorithm (SBBOA) is utilized in this paper. In this way, the relevant parameters of the CNN are selected for accurate lane marking. To evaluate the performance of the proposed SBBOA-CNN model, extensive experiments are conducted using the TUSimple and CULane datasets. The experimental results show that the proposed approach outperforms other state-of-the-art techniques in terms of classification accuracy, precision, F1-score, and recall. The proposed model also considerably outperforms the plain CNN in terms of classification accuracy, average elapsed time, and the receiver operating characteristic (ROC) curve measure. This result demonstrates that the SBBOA-optimized CNN exhibits higher robustness and stability than the CNN.
- Published
- 2021
11. Data balancing-based intermediate data partitioning and check point-based cache recovery in Spark environment
- Author
-
Youlong Luo, Qianqian Cai, and Chunlin Li
- Subjects
Shuffling, Computer science, Distributed computing, Skew, Theoretical Computer Science, Data recovery, Task (computing), Hardware and Architecture, Spark, Overhead (computing), Cache, Reservoir sampling, Software, Information Systems
- Abstract
Both data shuffling and cache recovery are essential parts of the Spark system, and they directly affect Spark parallel computing performance. Existing dynamic partitioning schemes for the data skew problem in the shuffle phase suffer from poor dynamic adaptability and insufficient granularity. To address these problems, this paper proposes a dynamic balanced partitioning method for the shuffle phase based on reservoir sampling. The method mitigates the impact of data skew on Spark performance by sampling and preprocessing intermediate data, predicting the overall data skew, and giving the overall partitioning strategy executed by the application. In addition, an inappropriate failure recovery strategy increases the recovery overhead and leads to an inefficient data recovery mechanism. To address this issue, this paper proposes a checkpoint-based fast recovery strategy for the RDD cache. The strategy analyzes the task execution mechanism of the in-memory computing framework and forms a new failure recovery strategy that combines the failure recovery model with weight information derived from a semantic analysis of the code, which provides detailed information about the task and improves the efficiency of the data recovery mechanism. The experimental results show that the proposed dynamic balanced partitioning approach can effectively optimize the total completion time of the application and improve Spark parallel computing performance. The proposed fast cache recovery strategy can effectively improve the speed of data recovery and the computation rate of Spark.
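A small Python sketch of the sampling step follows: reservoir sampling over intermediate (key, value) records to estimate per-key frequencies, from which a balanced partition map could then be derived; it illustrates the idea only and is not Spark code.

```python
import random
from collections import Counter

def reservoir_sample(records, k, seed=0):
    """Keep a uniform sample of k records from a stream of unknown length (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)
            if j < k:
                sample[j] = rec
    return sample

def estimate_key_skew(records, k=1000):
    keys = [key for key, _ in reservoir_sample(records, k)]
    return Counter(keys)   # heavily skewed keys show up with disproportionately high counts

stream = [("user_%d" % (i % 7 if i % 3 else 0), i) for i in range(100_000)]
print(estimate_key_skew(stream).most_common(3))
```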
- Published
- 2021
12. Optimal channel estimation and interference cancellation in MIMO-OFDM system using MN-based improved AMO model
- Author
-
Chittetti Venkateswarlu and Nandanavanam Venkateswara Rao
- Subjects
Speedup, Computer science, MIMO-OFDM, Theoretical Computer Science, Single antenna interference cancellation, Rate of convergence, Interference (communication), Hardware and Architecture, Bit error rate, Minification, Algorithm, Software, Information Systems, Communication channel
- Abstract
In recent years, MIMO-OFDM has played a significant role due to its high-speed transmission rate. Various research studies have been carried out on channel estimation to obtain optimal output without affecting system performance. However, due to the increased bit error rate, achieving optimal channel estimation is a challenging task. Therefore, this paper proposes a modified Newton's (MN)-based Improved Animal Migration Optimization (IAMO) algorithm for the MIMO-OFDM system. The main objectives of the proposed approach are to minimize the bit error rate and to enhance system performance. In this paper, a modified Newton's method is utilized to improve the discovery capability and to speed up the convergence rate, thereby obtaining the optimal positions in the search space. In addition, the proposed method is utilized to restrict interference in MIMO-OFDM systems. Finally, the performance of the proposed method is compared with other channel estimation methods to determine the effectiveness of the system. The experimental and comparative analyses are carried out, and the results demonstrate that the proposed approach performs better on frequency-selective channels than other state-of-the-art methods.
- Published
- 2021
13. Kubernetes in IT administration and serverless computing: An empirical study and research challenges
- Author
-
Hong-Ning Dai, Rui Pan, Subrota K. Mondal, Tan Tian, and H M Dipu Kabir
- Subjects
Computer science, Attack tree, Cloud computing, Computer security, Virtualization, Theoretical Computer Science, Debugging, Hardware and Architecture, Virtual machine, Container (abstract data type), Software design, Orchestration (computing), Software, Information Systems
- Abstract
Today’s industry has gradually realized the importance of improving efficiency and saving costs during the life-cycle of an application. In particular, most cloud-based applications and services consist of hundreds of micro-services, and the traditional monolithic pattern is no longer suitable for today’s development life-cycle, due to the difficulties of maintenance, scaling, load balancing, and many other associated factors. Consequently, the focus has shifted to containerization, a lightweight virtualization technology. Its saving grace is that it can use machine resources more efficiently than a virtual machine (VM). In a VM, a guest OS must be emulated on the host machine, whereas containerization enables applications to share a common OS. Furthermore, containerization allows users to create, delete, or deploy containers effortlessly. To manipulate and manage multiple containers, the leading cloud providers introduced container orchestration platforms, such as Kubernetes, Docker Swarm, Nomad, and many others. In this paper, a rigorous study of Kubernetes from an administrator’s perspective is conducted. In a later stage, the serverless computing paradigm is redefined and integrated with Kubernetes to accelerate the development of software applications. Theoretical knowledge and experimental evaluation show that this novel approach can be adopted by developers to design software architecture and development more efficiently and effectively by minimizing the cost charged by public cloud providers (such as AWS, GCP, Azure). However, serverless functions come with several issues, such as security threats, the cold start problem, inadequate function debugging, and many others. Consequently, the challenge is to find ways to address these issues, although it is difficult to address all of them at once. Accordingly, in this paper, we narrow our analysis to the security aspects of serverless. In particular, we quantitatively measure the success probability of attacks in serverless (using Attack Trees and Attack–Defense Trees) with possible attack scenarios and the related countermeasures. Thereafter, we show how this quantification can be reflected in end-to-end security enhancement. Finally, this study concludes with research challenges, such as the burdensome and error-prone steps of setting up the platform and investigating the existing security vulnerabilities of serverless computing, and with possible future directions.
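To illustrate how attack-success probability can be quantified over an attack tree (AND nodes require all children to succeed, OR nodes any child), here is a tiny Python sketch; the tree and probabilities are illustrative, not the paper's serverless scenarios.

```python
def attack_probability(node):
    """node is either a leaf probability, ('AND', children) or ('OR', children)."""
    if isinstance(node, (int, float)):
        return float(node)
    op, children = node
    probs = [attack_probability(c) for c in children]
    if op == "AND":                       # attacker must succeed in every child step
        p = 1.0
        for q in probs:
            p *= q
        return p
    p_fail = 1.0                          # OR: succeed in at least one child step
    for q in probs:
        p_fail *= (1.0 - q)
    return 1.0 - p_fail

# e.g. steal credentials AND (exploit cold start OR abuse a debug endpoint)
tree = ("AND", [0.3, ("OR", [0.2, 0.1])])
print(attack_probability(tree))  # 0.3 * (1 - 0.8 * 0.9) = 0.084
```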
- Published
- 2021
14. Design and analysis of SRAM cell using reversible logic gates towards smart computing
- Author
-
M. Siva Kumar and O. Mohana chandrika
- Subjects
Computer science, Transistor, Theoretical Computer Science, Power (physics), CMOS, Hardware and Architecture, Video tracking, Logic gate, Electronic engineering, Electronics, Static random-access memory, Software, Information Systems, Voltage
- Abstract
With the enhancement of technology, the usage of electronics in various applications involving large memories for storing and processing data has increased. In this sort of application, SRAM is mainly used because of its high speed. Moreover, with the heavy usage of memory cells, power consumption has increased to a great extent. The current literature shows that various parameters of SRAM, such as speed and power, need to be improved for memory cells used in object tracking applications. To improve these parameters, SRAM architectures must be combined with new techniques. In recent years, reversible circuits have gained extensive attention because of their low-power characteristics. In this paper, a low-power, high-speed reversible static RAM is proposed. The proposed SRAM combines data processing with low power dissipation and high speed. The proposed SRAM architecture yields better performance and is similar to the traditional SRAM architecture in terms of delay. This paper also implements a 32 × 64 memory block for object tracking applications. This work is carried out in 45 nm CMOS technology. In the proposed design, transistors are made to operate in the weak inversion region through the use of the EKV model. The design proposed in this paper reduces garbage outputs by 60%, the quantum cost by 70%, and the quantum delay by 70% compared to current architectures. The proposed design is simulated at different supply voltages to ensure that the power dissipation and delay of the SRAM are proportional to the supply voltage.
- Published
- 2021
15. NodeRank: Finding influential nodes in social networks based on interests
- Author
-
Ibrahim Kamel, Mohammed Bahutair, and Zaher Al Aghbari
- Subjects
Structure (mathematical logic), Theoretical computer science, Social network, Computer science, Theoretical Computer Science, Set (abstract data type), Hardware and Architecture, Scalability, Spark, Recursive algorithms, Software, Information Systems
- Abstract
Finding influential members in social networks has received a lot of interest in the recent literature. Several algorithms have been proposed that provide techniques for extracting a set of the most influential people in a certain social network. However, most of these algorithms find influential nodes based solely on the topological structure of the network. In this paper, a new algorithm, namely NodeRank, is proposed that ranks every user in a given social network based on the topological structure as well as the interests of the users (nodes). Higher ranks are given to people with great influence on other members of the network. Furthermore, the paper investigates a MapReduce version of the algorithm that enables it to run on multiple machines simultaneously. Experiments showed that the MapReduce model is not suitable for the NodeRank algorithm, since MapReduce is only applicable to batch processes and NodeRank is highly iterative. For that reason, a parallel version of the algorithm is proposed that utilizes Hadoop Spark, a framework for parallel processing that supports batch operations as well as iterative and recursive algorithms. Several experiments have been carried out to test the accuracy as well as the scalability of the algorithm.
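The following simplified Python sketch shows a PageRank-style iteration whose transition weights are scaled by the interest similarity (here, Jaccard similarity) of neighboring users; it follows the spirit of NodeRank as described above, not its exact formulation.

```python
def node_rank(adj, interests, iters=50, d=0.85):
    """adj: {node: set(out-neighbors)}; interests: {node: set(topics)}.
    Rank ~ PageRank where each edge is weighted by the Jaccard similarity of the endpoints' interests."""
    def sim(u, v):
        a, b = interests[u], interests[v]
        return len(a & b) / len(a | b) if a | b else 0.0

    nodes = list(adj)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        new = {}
        for v in nodes:
            incoming = 0.0
            for u in nodes:
                if v in adj[u]:
                    total = sum(sim(u, w) for w in adj[u]) or 1.0
                    incoming += rank[u] * sim(u, v) / total
            new[v] = (1 - d) / len(nodes) + d * incoming
        rank = new
    return rank

adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "b"}}
interests = {"a": {"gpu", "ml"}, "b": {"ml"}, "c": {"gpu", "hpc"}}
print(node_rank(adj, interests))
```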
- Published
- 2021
16. TRAM: Technique for resource allocation and management in fog computing environment
- Author
-
Rajni Aron and Heena Wadhwa
- Subjects
Flexibility (engineering), Computer science, Process (engineering), Distributed computing, Cloud computing, Energy consumption, Theoretical Computer Science, Resource (project management), Hardware and Architecture, Computer data storage, Wireless, Resource allocation, Software, Information Systems
- Abstract
Traditional cloud computing technology provides services to a plethora of applications by provisioning resources. These services support numerous industries for computation and data storage. However, the drawback of the cloud computing framework is its limited flexibility and its difficulty in accommodating the diverse requirements generated by an IoT-based environment. Cloud computing is evolving with new paradigms to ensure that connected heterogeneous systems can achieve high-performance computing (HPC). Furthermore, many of today’s requirements call for resources that are geographically distributed and close to the end devices. Hence, the new fog computing paradigm provides innovative solutions for real-time applications. The fog computing framework’s prime agenda is to support latency-sensitive applications by utilizing all available resources. In this paper, a novel approach is designed for resource allocation and management. TRAM, a technique for resource allocation and management, is proposed to ensure resource utilization at the fog layer. The approach tracks the intensity level of existing tasks using the expectation-maximization (EM) algorithm and calculates the current status of resources. All available resources are managed using a wireless system. This paper provides a scheduling algorithm for the resource grading process in the fog computing environment. The performance of this approach is tested on the iFogSim simulator, and the results are compared with SJF, FCFS and MPSO. The experimental results demonstrate that TRAM effectively minimizes the execution time, network consumption, energy consumption and average loop delay of tasks.
- Published
- 2021
17. Token-based approach in distributed mutual exclusion algorithms: a review and direction to future research
- Author
-
Ashish Singh Parihar and Swarnendu Kumar Chakraborty
- Subjects
Computer science, Wireless ad hoc network, Distributed computing, Security token, Adaptability, Theoretical Computer Science, Shared resource, Domain (software engineering), Hardware and Architecture, The Internet, Mutual exclusion, Mobile telephony, Software, Information Systems
- Abstract
The problem of mutual exclusion is a heavily studied area in distributed architectures. To avoid inconsistency in data, mutual exclusion ensures that no two processes running on different processors are allowed to enter the same shared resource simultaneously in the system. In recent years, with the continuing development of internet and mobile communication technologies, the devices, infrastructure and resources in networking systems such as ad hoc networks have become more complex and heterogeneous. Various algorithms have been introduced as solutions to the mutual exclusion problem in the domain of distributed architecture over the past years. The performance and adaptability of these solutions depend on the different strategies they use in the system. Various classifications of these strategies have been proposed, such as token-based and non-token-based (also called permission-based). This paper presents a survey of existing token-based distributed mutual exclusion algorithms (TBDMEA), focusing on their performance measures and fault-tolerance capabilities, together with the associated open challenges and directions for future research. In addition to traditional and recently proposed TBDMEA, token-based distributed group mutual exclusion algorithms (TBDGMEA) and token-based self-stabilizing distributed mutual exclusion algorithms (TBStDMEA) are also surveyed in this paper as new variants of the token-based scheme.
- Published
- 2021
18. Smart home security: challenges, issues and solutions at different IoT layers
- Author
-
Mudassar Hussain, Muhammad Bilal, Fadi Al-Turjman, Shakir Zaman, Rashid Amin, and Haseeb Touqeer
- Subjects
Temperature monitoring, Computer science, Control (management), Computer security, Theoretical Computer Science, Layered structure, Hardware and Architecture, Light control, Home automation, Key (cryptography), DECIPHER, Internet of Things, Software, Information Systems
- Abstract
The Internet of Things is a rapidly evolving technology in which interconnected computing devices and sensors share data over the network to solve different problems and deliver new services. For example, IoT is the key enabling technology for smart homes. Smart home technology provides many facilities to users, such as temperature monitoring, smoke detection, automatic light control, smart locks, etc. However, it also opens the door to a new set of security and privacy issues; for example, the private data of users can be accessed by taking control of surveillance devices or activating false fire alarms. These challenges make smart homes vulnerable to various types of security attacks, and people are reluctant to adopt this technology due to the security issues. In this survey paper, we shed light on IoT, how IoT is growing, objects and their specifications, the layered structure of the IoT environment, and the security challenges that occur at each layer in the smart home. This paper not only presents the challenges and issues that emerge in IoT-based smart homes but also presents some solutions that would help to overcome these security challenges.
- Published
- 2021
19. AIEMLA: artificial intelligence enabled machine learning approach for routing attacks on internet of things
- Author
-
Saurabh Sharma and Vinod Kumar Verma
- Subjects
Hyperparameter, Routing protocol, Focus (computing), Artificial neural network, Computer science, Lossy compression, Machine learning, Prime (order theory), Theoretical Computer Science, Hardware and Architecture, The Internet, Artificial intelligence, Routing (electronic design automation), Software, Information Systems
- Abstract
The Internet of things (IoT) is emerging as a prime area of research in the modern era. The significance of IoT in daily life is increasing due to the growing number of objects or things connected to the internet. In this paper, the routing protocol for low-power and lossy networks (RPL) is examined on the Contiki operating system. This paper uses an RPL attack framework to simulate three RPL attacks, namely hello-flood, decreased-rank and increased-version. These attacks are simulated both separately and simultaneously. The focus is on the detection of these attacks through an artificial neural network (ANN)-based supervised machine learning approach. Accurate detection of the malicious nodes prevents the network from suffering the severe effects of an attack. The accuracy of the proposed model is computed with the hold-out approach and the tenfold cross-validation technique. The hyperparameters have been optimized through parameter tuning. The model presented in this paper detected the aforesaid attacks both simultaneously and individually with 100% accuracy. This work also investigated other performance measures, namely precision, recall, F1-score and the Matthews correlation coefficient (MCC).
- Published
- 2021
20. Design and testing of a reversible ALU by quantum cells automata electro-spin technology
- Author
-
Rupsa Roy, Swarup Sarkar, and Sourav Dhar
- Subjects
Computer science, Transistor, Fault tolerance, Dissipation, Theoretical Computer Science, Automaton, Arithmetic logic unit, Software, CMOS, Hardware and Architecture, Electronic engineering, Information Systems, Quantum cellular automaton
- Abstract
The arithmetic logic unit (ALU), a core component of a processor, is one of the thrust areas of current research. Presently, the ALU is designed with transistor-based CMOS techniques, and its individual components are placed in different layers. The current design is affected by the limitations of Moore's law and by design complexity. At present, ‘quantum cellular automata electro-spin (QCA-ES)’ technology is widely accepted as an alternative to CMOS that minimizes the problems discussed above. In this research paper, the design of a novel multilayer, portable, dynamic, fault-tolerant, power-efficient, thermally stable reversible ALU is proposed and explored through QCA-ES. All the arithmetic and logical components of the ALU are placed separately in different layers. Area density, delay, fault tolerance and thermal stability are investigated. A specific type of gate, known as a reversible gate (a modified 3:3 ‘TSG’ gate), is used in the proposed design with QCA technology to obtain an optimized ALU with low occupied area, complexity, delay and power dissipation. The fault-free behaviour of the design and the change in the saturated output amplitude level with increasing temperature are also discussed in this paper. Not only the thermal stability (up to a temperature of 6 K) but also the cell complexity of the 100% fault-free (against multiple cell omission, cell displacement, cell orientation change and extra cell deposition) multilayer nano-device is presented in this work. The ‘QCA-Designer’ software is used in this research work to design and lay out the proposed components in the quantum field and to find the occupied area, delay and complexity of the proposed design. The ‘QCA-Pro’ software is used to obtain the dissipated power.
- Published
- 2021
21. Performance evaluation and optimization of a task offloading strategy on the mobile edge computing with edge heterogeneity
- Author
-
Shunfu Jin and Wei Li
- Subjects
Karush–Kuhn–Tucker conditions, Mobile edge computing, Computer science, Distributed computing, Cloud computing, Energy consumption, Theoretical Computer Science, System model, Task (computing), Hardware and Architecture, Computation offloading, Software, Information Systems, Efficient energy use
- Abstract
With the development of mobile edge computing (MEC) technology and the grave shortage of global energy, the problem of computation offloading in a cloud computing framework is getting more attention from network managers. In order to improve the experience quality of users and increase the energy efficiency of the system, we focus on the task offloading strategy in the MEC system. In this paper, we propose a task offloading strategy for an MEC system with a heterogeneous edge. By considering the execution and transmission of tasks under the task offloading strategy, we present an architecture for the MEC system. We establish a system model composed of M/M/1, M/M/c and M/M/$\infty$ queues to capture the execution process of tasks in the local mobile device (MD), the MEC server and the remote cloud servers, respectively. Moreover, by trading off the average delay of tasks, the energy consumption level of the MD and the offloading expense of the system, we construct a cost function for serving one task and formulate a joint optimization problem for the task offloading strategy accordingly. Furthermore, under the constraints of steady state and proportion scope, we use the Lagrangian function and the corresponding Karush–Kuhn–Tucker (KKT) conditions to obtain the optimal task offloading strategy with the minimum system cost. Finally, we carry out numerical experiments on the MEC system to investigate the influence of the system parameters on the task offloading strategy and to obtain the optimal results. The experiment results show that the task offloading strategy proposed in this paper can balance the average delay, the energy consumption level and the offloading expense with the optimal allocation ratio.
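To make the queueing-based cost concrete, the sketch below combines M/M/1 sojourn times with a weighted sum of delay, device energy and offloading expense, and sweeps the offloading ratio; all weights and parameters are illustrative assumptions (the paper additionally models the edge as M/M/c and the cloud as M/M/$\infty$).

```python
def mm1_sojourn(lam, mu):
    """Mean time spent in an M/M/1 queue (requires lam < mu for stability)."""
    assert lam < mu, "queue must be stable"
    return 1.0 / (mu - lam)

def offload_cost(lam, p_offload, mu_local, mu_edge, t_tx,
                 w_delay=1.0, w_energy=0.5, w_expense=0.2,
                 e_local=2.0, e_idle=0.5, price=1.0):
    """Weighted cost per task when a fraction p_offload of arrivals is sent to the edge."""
    lam_local, lam_edge = lam * (1 - p_offload), lam * p_offload
    delay = (1 - p_offload) * mm1_sojourn(lam_local, mu_local) \
            + p_offload * (t_tx + mm1_sojourn(lam_edge, mu_edge))
    energy = (1 - p_offload) * e_local + p_offload * e_idle    # MD energy per task
    expense = p_offload * price                                # payment for offloaded work
    return w_delay * delay + w_energy * energy + w_expense * expense

# Sweep the offloading ratio to find the cheapest strategy for these made-up parameters.
best = min((offload_cost(lam=8, p_offload=p / 10, mu_local=10, mu_edge=20, t_tx=0.05), p / 10)
           for p in range(0, 10))
print(best)
```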
- Published
- 2021
22. A novel approach for multilevel multi-secret image sharing scheme
- Author
-
Kanchan Bisht and Maroti Deshmukh
- Subjects
Scheme (programming language), Theoretical computer science, Distribution (number theory), Computer science, Image sharing, Structure (category theory), Theoretical Computer Science, Image (mathematics), Hardware and Architecture, Multimedia data transmission, Software, Information Systems
- Abstract
Multi-secret sharing (MSS) is an effective technique that securely encodes multiple secrets to generate shares and distributes them among the participants in such a way that these shares can be used later to reconstruct the secrets. MSS schemes have a considerable advantage over single-secret sharing schemes for secure multimedia data transmission. This paper presents a novel secret image sharing approach, namely the ‘(n, m, l)-Multilevel Multi-Secret Image Sharing (MMSIS) scheme.’ The proposed MMSIS scheme encodes ‘n’ distinct secret images to generate ‘m’ shares and distributes them among ‘m’ participants allocated to ‘l’ distinct levels. The paper proposes two variants of the MMSIS scheme. The first variant is an $(n, n+1, l)$-MMSIS scheme which encodes ‘n’ secret images, each having a unique level id $L_k$, into $(n+1)$ shares. The image shares are then distributed among $(n+1)$ participants assigned to $l = n$ different levels. With the increase in level id, the number of shares required to reconstruct the secret image also increases. To reconstruct a secret image of a particular level $L_k$, all the shares at level $L_k$ and its preceding levels need to be acquired, which requires the consensus of all participants holding the shares up to level $L_k$. The second variant, namely the extended-MMSIS (EMMSIS) scheme, is a generalized (n, m, l) version of the former scheme that allows more shares to be generated for a specific secret image at a particular level in accordance with the consensus requirements for its reconstruction. The multilevel structure of the scheme makes it useful for multi-secret distribution in a multilevel organizational structure.
- Published
- 2021
23. Simple method of selecting totalistic rules for pseudorandom number generator based on nonuniform cellular automaton
- Author
-
Miroslaw Szaban
- Subjects
Pseudorandom number generator, Keyspace, Selection (relational algebra), Computer science, Cryptography, Cellular automaton, Theoretical Computer Science, Set (abstract data type), Hardware and Architecture, Entropy (information theory), Algorithm, Software, Information Systems, Generator (mathematics)
- Abstract
This paper is devoted to selecting rules for a one-dimensional (1D) totalistic cellular automaton (TCA). These rules are used for the generation of pseudorandom sequences, which can be useful in cryptography. The power of a pseudorandom number generator (PRNG) based on a nonuniform TCA can be improved by using not just one rule but a large set of rules. For this purpose, each subset of rules, together with its assignment to cellular automaton (CA) cells, should be analyzed. We examine each of the subsets of totalistic rules consisting of rules with neighborhood radius equal to 1 and 2. The entropy of the bitstreams generated by the nonuniform TCA points out the best set of rules for the TCA-based generator. The paper also presents a simple method of selecting CA rules based on a cryptographic criterion known as balance. The proposed method selects a maximal set of available CA rules for a given neighborhood radius that is suitable for a PRNG. The method avoids conflicting rule assignments that would create unwanted stable bit sequences, and it provides high-quality pseudorandom sequences. This technique is used to verify the experimentally selected subsets of rules. The verified rules are proposed as a new subset of the best nonuniform TCA rules for a 1D TCA-based PRNG. The newly picked, examined, and verified subset of rules can be used in a TCA-based PRNG and provides cryptographically strong bit sequences and a huge keyspace.
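A short Python sketch of one step of a 1D binary totalistic CA, and of a simple reading of the balance criterion (the rule outputs 0 and 1 for an equal share of the possible neighborhood sums), is given below; encoding a rule as a lookup table over neighborhood sums is an illustrative assumption.

```python
import numpy as np

def tca_step(state, rule_bits, r=1):
    """One synchronous step of a 1D binary totalistic CA with neighborhood radius r.
    rule_bits[s] is the next state of a cell whose neighborhood sum equals s (0 <= s <= 2r+1)."""
    sums = sum(np.roll(state, k) for k in range(-r, r + 1))   # periodic boundary conditions
    return np.asarray(rule_bits)[sums]

def is_balanced(rule_bits):
    """Balance: the rule outputs 0 and 1 equally often over all possible neighborhood sums."""
    return sum(rule_bits) * 2 == len(rule_bits)

rng = np.random.default_rng(1)
state = rng.integers(0, 2, size=64)
rule = [0, 1, 1, 0]            # radius-1 totalistic rule: sums 0..3 -> next cell state
print(is_balanced(rule))       # True
for _ in range(5):
    state = tca_step(state, rule, r=1)
print(state[:16])
```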
- Published
- 2021
24. E2LG: a multiscale ensemble of LSTM/GAN deep learning architecture for multistep-ahead cloud workload prediction
- Author
-
Saeed Sharifian and Peyman Yazdanian
- Subjects
Discriminator, Computer science, Deep learning, Chaotic, Cloud computing, Workload, Standard deviation, Autoscaling, Hilbert–Huang transform, Theoretical Computer Science, Hardware and Architecture, Artificial intelligence, Data mining, Software, Information Systems
- Abstract
Efficient resource demand prediction and management are two main challenges for cloud service providers in controlling dynamic autoscaling and power consumption. The behavior of cloud workload time-series at the subminute scale is highly chaotic and volatile; therefore, traditional machine learning-based time-series analysis approaches fail to obtain accurate predictions. In recent years, deep learning-based schemes have been suggested to predict highly nonlinear cloud workloads, but they sometimes fail to obtain excellent prediction results. Hence, there is demand for more accurate prediction algorithms. In this paper, we address this issue by proposing the hybrid E2LG algorithm, which decomposes the cloud workload time-series into its constituent components in different frequency bands using the empirical mode decomposition (EMD) method, reducing the complexity and nonlinearity of the prediction model in each frequency band. In addition, a new state-of-the-art ensemble GAN/LSTM deep learning architecture is proposed to predict each sub-band workload time-series individually, based on its degree of complexity and volatility. Our ensemble GAN/LSTM architecture, which employs stacked LSTM blocks as its generator and 1D ConvNets as its discriminator, can effectively exploit the long-term nonlinear dependencies of cloud workload time-series, especially in the high-frequency, noise-like components. By validating our approach using an extensive set of experiments with standard real cloud workload traces, we confirm that E2LG provides significant improvements in cloud workload prediction accuracy with respect to the mean absolute error and the standard deviation of the prediction error, outperforming traditional and state-of-the-art deep learning approaches. It improves the prediction accuracy by at least 5%, and by 12% on average, compared to the main contemporary approaches in recent papers, such as hybrid methods that employ CNN, LSTM or SVR.
- Published
- 2021
25. Research on GPU parallel algorithm for direct numerical solution of two-dimensional compressible flows
- Author
-
Jun’an Zhang, Yongzhen Wang, and Xuefeng Yan
- Subjects
Speedup, Computer science, Computation, Parallel algorithm, Graphics processing unit, Direct numerical simulation, Finite difference, Upwind scheme, Theoretical Computer Science, Computational science, Hardware and Architecture, Central processing unit, Software, Information Systems
- Abstract
In this paper, a novel parallel algorithm is proposed to address the heavy computation and long simulation times in the field of compressible flows. In this algorithm, a third-order upwind scheme and a fourth-order central difference scheme are employed, with a third-order Runge-Kutta method for time stepping. Considering the powerful floating-point computing ability of the Graphics Processing Unit (GPU), this paper implements the algorithm on the GPU. Moreover, the direct numerical simulation method is adopted in this algorithm to improve the accuracy of the simulation results. To further enhance the efficiency of the algorithm, several optimization strategies are also explored in its design. Both the accuracy and the feasibility of the algorithm are verified on a classical two-dimensional example. The experimental results demonstrate that the maximum speedup achieved by our approach is 18.03x compared with solving this example on the Central Processing Unit (CPU) platform.
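For reference, a third-order Runge-Kutta step of the kind commonly paired with high-order upwind/central schemes is sketched below in the strong-stability-preserving (Shu-Osher) form; the abstract does not state which third-order variant is used, so this form is an assumption, and the spatial operator L is left abstract.

```python
import numpy as np

def rk3_step(u, dt, L):
    """Third-order SSP Runge-Kutta step (Shu-Osher form).
    L(u) is the spatial discretization, e.g. upwind/central finite differences."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * L(u2))

# Toy example: linear advection u_t + a u_x = 0 with a first-order periodic upwind operator.
a, dx, dt = 1.0, 0.01, 0.005
x = np.arange(0.0, 1.0, dx)
u = np.exp(-200 * (x - 0.3) ** 2)
L = lambda u: -a * (u - np.roll(u, 1)) / dx
for _ in range(40):
    u = rk3_step(u, dt, L)
```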
- Published
- 2021
26. Parallel optimization of the ray-tracing algorithm based on the HPM model
- Author
-
Wang Yi-Ou, Ding Gangyi, Zhang Fu-quan, Li Yu-Gang, and Wang Jun-Feng
- Subjects
Basis (linear algebra), Image quality, Computer science, Node (networking), Parallel optimization, Division (mathematics), Theoretical Computer Science, Hardware and Architecture, Parallelism (grammar), Ray tracing (graphics), Algorithm, Time complexity, Software, Information Systems
- Abstract
This paper proposes a parallel computing analysis model, HPM, and analyzes the CPU–GPU parallel architecture based on this model. On this basis, we study the parallel optimization of the ray-tracing algorithm on the CPU–GPU parallel architecture and exploit the parallelism between nodes, the parallelism of the multi-core CPU inside a node, and the parallelism of the GPU, which improves the calculation speed of the ray-tracing algorithm. This paper uses space-division technology to divide the ground data, constructs a KD-tree organization structure, and improves the KD-tree construction method to reduce the time complexity of the algorithm. The ground data are evenly distributed to the computing nodes, and the computing nodes use a combination of CPU and GPU for parallel optimization. This method dramatically improves the drawing speed while ensuring image quality and provides an effective means of quickly generating photorealistic images.
- Published
- 2021
27. Optimal multilevel media stream caching in cloud-edge environment
- Author
-
Chunlin Li, Yihan Zhang, Youlong Luo, and Hengliang Tang
- Subjects
Network architecture, Computer science, Multitier architecture, Cloud computing, Theoretical Computer Science, Hardware and Architecture, Knapsack problem, Server, Enhanced Data Rates for GSM Evolution, Cache, Greedy algorithm, Software, Information Systems, Computer network
- Abstract
Due to the high link load of edge caching and the small storage space of edge servers, a caching architecture based on the collaboration of edge nodes and the cloud server is proposed. The content cache location, which can be the content provider, the cloud server (CS), or an edge node (EN), is designed and optimized. In the proposed system, cloud servers collaborate with edge servers, and the performance of content caching can be improved by coordinating caching on the cloud server and caching on the edge servers. In this paper, a cloud-edge collaborative caching model based on the greedy algorithm is proposed, which includes a content caching model and a collaborative caching model. Network architecture, file popularity estimation, link capacity, and other factors are considered in the model. Correspondingly, a cloud-edge collaborative cache algorithm based on the greedy algorithm is proposed. The related optimization problem is decomposed into a knapsack problem of cache layout in each layer, and the greedy algorithm is then used to solve the cache placement and cooperative caching knapsack problems proposed in this paper. The affiliation between the CS cache and the EN caches in the layered architecture is identified and exploited. The experimental results show that the proposed edge caching method reduces the link load and improves the cache hit rate, and it also has obvious advantages in average end-to-end service delay.
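A minimal sketch of the per-layer greedy knapsack step described above: candidate files are items with an estimated benefit and a size, the cache capacity is the knapsack budget, and items are taken in order of benefit density; the benefit model is an illustrative assumption.

```python
def greedy_cache_placement(files, capacity):
    """files: list of (name, size, benefit); fill the cache by descending benefit-per-size."""
    chosen, used = [], 0
    for name, size, benefit in sorted(files, key=lambda f: f[2] / f[1], reverse=True):
        if used + size <= capacity:
            chosen.append(name)
            used += size
    return chosen

# Benefit could be popularity times bytes saved on the upstream link (illustrative values).
catalog = [("a.mp4", 40, 90), ("b.mp4", 25, 70), ("c.mp4", 30, 40), ("d.mp4", 10, 35)]
print(greedy_cache_placement(catalog, capacity=60))  # ['d.mp4', 'b.mp4']
```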
- Published
- 2021
28. An intelligent IoT-based positioning system for theme parks
- Author
-
Sina Einavipour and Reza Javidan
- Subjects
Focus (computing), Positioning system, Computer science, Real-time computing, Theoretical Computer Science, Hardware and Architecture, Video tracking, Frequency-hopping spread spectrum, Radio-frequency identification, Theme (computing), Software, Information Systems
- Abstract
With the advent of the Internet of Things (IoT) and the ubiquitous presence of sensor nodes, positioning technologies have become a topic of interest among researchers. While the applications of positioning systems are vast, determining the position of moving sensor nodes, finding missing people in large areas, and object tracking are among the most popular. The focus of this paper is to propose a positioning system to locate missing people in theme parks. Currently, radio frequency identification (RFID) systems are used in modern theme parks to locate lost visitors. In these systems, a wristband with an active RFID tag is given to each visitor, and RFID readers are deployed at predetermined locations. When a visitor is in the communication range of a reader, the visitor’s location can be estimated from the location of the reader. Therefore, the accuracy of these systems depends on the communication range of the readers. Another limitation of RFID-based systems is that readers cannot be placed within communication range of each other, as they can interfere with one another. It is clear that the only way to increase the accuracy of such systems is to increase the number of readers and decrease the communication range of each reader. In this paper, a Bluetooth Low Energy (BLE)-based system is proposed for locating lost visitors in theme parks. The advantage of using BLE is that it uses frequency hopping spread spectrum (FHSS); thus, readers can be placed within communication range of each other without severe interference. In the proposed method, the optimal places for deploying readers are first obtained using ant colony optimization (ACO). Then, a fuzzy approach is used to increase the accuracy of the system. Three different signal levels are defined for use in our fuzzy system, based on which the location of visitors can be estimated. By using three levels of signal strength, the accuracy of the system is increased compared with a similar system with the same number of readers. The simulation results show that the accuracy of the system is improved using this method, and the cost of the system is decreased, as BLE readers are much less expensive than their RFID counterparts.
- Published
- 2021
29. Revisiting non-tree routing for maximum lifetime data gathering in wireless sensor networks
- Author
-
Xiaojun Zhu
- Subjects
020203 distributed computing ,Computer science ,Node (networking) ,Maximum flow problem ,02 engineering and technology ,Topology ,Theoretical Computer Science ,Tree structure ,Hardware and Architecture ,Path (graph theory) ,0202 electrical engineering, electronic engineering, information engineering ,Routing (electronic design automation) ,Time complexity ,Wireless sensor network ,Software ,Information Systems - Abstract
Wireless sensor networks usually adopt a tree structure for routing, where each node sends and forwards messages to its parent. However, lifetime maximization with a tree routing structure is NP-hard, and all algorithms attempting to find the optimal solution run in exponential time unless $$P=\mathrm{NP}$$ . This paper revisits the non-tree routing structure, where a node can send different messages to different neighbors. Although lifetime maximization with non-tree routing can be solved in polynomial time, the existing method transforms it into a series of maximum flow problems, which are either complicated or have high running time. This paper proposes an algorithm with O(mn) running time, where m is the number of edges and n is the number of nodes. The heart of the algorithm is a method that finds a routing path from any node to the sink in O(m) time without disconnecting existing routing paths. The proposed algorithm is also suitable for distributed implementation. When a node fails, each affected node can establish a new routing path in O(m) time. Simulations are conducted to compare the optimal lifetimes of the tree structure and the non-tree structure on random networks. The results verify the effectiveness of the proposed algorithm.
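To illustrate the flavor of an O(m)-time path search to the sink, the following Python sketch runs a breadth-first search over an adjacency list. It only shows the generic linear-time path finding; the paper's algorithm additionally keeps existing routing paths intact and accounts for remaining node energy, which is not modeled here.

```python
# Minimal sketch: find a route from a source node to the sink in O(m) time
# with breadth-first search over an adjacency list. This only illustrates the
# O(m) path search; the paper's method also preserves existing routing paths.
from collections import deque

def route_to_sink(adj, source, sink):
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == sink:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # sink unreachable

if __name__ == "__main__":
    adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(route_to_sink(adj, 0, 3))  # [0, 1, 3]
```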
- Published
- 2021
30. High performance of brain emotional intelligent controller for DTC-SVM based sensorless induction motor drive
- Author
-
Sridhar Savarapu and Yadaiah Narri
- Subjects
020203 distributed computing ,Electronic speed control ,Computer science ,Rotor (electric) ,Stator ,02 engineering and technology ,Theoretical Computer Science ,law.invention ,Support vector machine ,Stator voltage ,Direct torque control ,Hardware and Architecture ,Control theory ,law ,Adaptive system ,0202 electrical engineering, electronic engineering, information engineering ,Software ,Induction motor ,Information Systems - Abstract
This paper introduces the application of a brain emotional intelligent controller (BEIC) to an induction motor (IM) drive. This intelligent regulator, modelled on the human brain, is capable of generating control impulses and is used as the controller. A Model Reference Adaptive System is developed using the stator currents and stator voltages and is extended with the BEIC to estimate the rotor speed. This paper proposes speed estimation using the BEIC for direct torque control (DTC) of the IM drive. The experimental work is conducted on a hardware-in-the-loop setup using a real-time digital simulator (Op-RTDS-OP5600). The simulation and test results are discussed, and the proposed method is compared with existing controllers for DTC-SVM-based IM drive speed control.
- Published
- 2021
31. VGL: a high-performance graph processing framework for the NEC SX-Aurora TSUBASA vector architecture
- Author
-
Vladimir V. Voevodin, Kazuhiko Komatsu, Ilya V. Afanasyev, and Hiroaki Kobayashi
- Subjects
Structure (mathematical logic) ,Connected component ,Speedup ,Computer science ,Parallel computing ,Supercomputer ,Graph ,Theoretical Computer Science ,Vector processor ,Vector graphics ,Hardware and Architecture ,Programming paradigm ,Software ,Information Systems - Abstract
Developing efficient implementations of graph algorithms is an extremely important problem in modern computer science, since graphs are frequently used in various real-world applications. Graph algorithms typically belong to the data-intensive class, and thus architectures with high-bandwidth memory potentially allow many graph problems to be solved significantly faster than on modern multicore CPUs. Among supercomputer architectures, vector systems, such as the SX family of NEC vector supercomputers, are equipped with high-bandwidth memory. However, the highly irregular structure of many real-world graphs makes it extremely challenging to implement graph algorithms on vector systems, since such implementations are usually bulky and complicated and require a deep understanding of vector architecture hardware features. This paper presents the world's first attempt to develop an efficient and at the same time simple graph processing framework for modern vector systems. Our vector graph library (VGL) framework targets NEC SX-Aurora TSUBASA as a primary vector architecture and provides relatively simple computational and data abstractions. These abstractions incorporate many vector-oriented optimization strategies into a high-level programming model, allowing new graph algorithms to be implemented quickly with a small amount of code and minimal knowledge of vector system features. In this paper, we evaluate the VGL performance on four widely used graph processing problems: breadth-first search, single-source shortest paths, connected components, and PageRank. The comparative performance analysis demonstrates that the VGL-based implementations achieve significant acceleration over existing high-performance frameworks and libraries: up to 14 times speedup over multicore CPUs (Ligra, Galois, GAPBS) and up to 3 times speedup over NVIDIA GPU (Gunrock, NVGRAPH) implementations.
- Published
- 2021
32. GPU-based embedded edge server configuration and offloading for a neural network service
- Author
-
Joo-Hwan Kim, Shan Ullah, and Deok-Hwan Kim
- Subjects
020203 distributed computing ,Artificial neural network ,business.industry ,Computer science ,Graphics processing unit ,Cloud computing ,02 engineering and technology ,Theoretical Computer Science ,Edge server ,Computer architecture ,Hardware and Architecture ,Server ,0202 electrical engineering, electronic engineering, information engineering ,The Internet ,Enhanced Data Rates for GSM Evolution ,Latency (engineering) ,business ,Software ,Edge computing ,Information Systems - Abstract
Recently, emerging edge computing technology has been proposed as a new paradigm that compensates for the disadvantages of current cloud computing. In particular, edge computing is used for service applications that require low latency while using local data. For this emerging technology, a neural network approach is required to run large-scale machine learning on edge servers. In this paper, we propose a pod allocation method that adds various graphics processing unit (GPU) resources to increase the efficiency of a Kubernetes-based edge server configuration, using a GPU-based embedded board and a TensorFlow-based neural network service application. From experiments performed on the proposed edge server, the following are observed: 1) the bandwidth available to service applications varies with time and data size, ranging over 20.4–42.4 Mbps in the local environment and 6.31–25.5 Mbps in the Internet environment; 2) when two neural network applications run on an edge server consisting of Xavier, TX2, and Nano boards, the network times of the object detection application range from 112.2 ms (Xavier) to 515.8 ms (Nano), and the network times of the driver profiling application range from 321.8 ms (Xavier) to 495.7 ms (Nano); 3) the proposed pod allocation method demonstrates better performance than the default pod allocation method. We observe that the number of allocatable pods on three worker nodes increases from five to seven, and compared with other papers, the proposed offloading shows similar or better response times in environments where multiple deep learning applications are deployed.
- Published
- 2021
33. HSAC-ALADMM: an asynchronous lazy ADMM algorithm based on hierarchical sparse allreduce communication
- Author
-
Yongmei Lei, Dongxia Wang, Jinyang Xie, and Guozheng Wang
- Subjects
Optimization problem ,Computer science ,Node (networking) ,Payload (computing) ,Filter (signal processing) ,Theoretical Computer Science ,Hardware and Architecture ,Asynchronous communication ,Multithreading ,Scalability ,Algorithm ,Software ,Information Systems ,Sparse matrix - Abstract
The distributed alternating direction method of multipliers (ADMM) is an effective algorithm for solving large-scale optimization problems. However, its high communication cost limits its scalability. An asynchronous lazy ADMM algorithm based on a hierarchical sparse allreduce communication mode (HSAC-ALADMM) is proposed to reduce the communication cost of the distributed ADMM. Firstly, this paper proposes a lazy parameter aggregation strategy to filter the transmitted parameters of the distributed ADMM, which reduces the payload of each node per iteration. Secondly, a hierarchical sparse allreduce communication mode is tailored to sparse data to aggregate the filtered parameters effectively. Finally, a Calculator-Communicator-Manager framework is designed to implement the proposed algorithm, which combines the asynchronous communication protocol and the allreduce communication mode effectively. It separates calculation and communication via multithreading, thus improving the efficiency of both. Experimental results for the L1-regularized logistic regression problem with public datasets show that the HSAC-ALADMM algorithm is faster than existing asynchronous ADMM algorithms. Compared with existing sparse allreduce algorithms, the hierarchical sparse allreduce algorithm proposed in this paper makes better use of the characteristics of sparse data to reduce system time on multi-core clusters.
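A hedged sketch of the lazy parameter aggregation idea mentioned in this abstract: a worker only transmits coordinates whose change since the last synchronization exceeds a threshold, producing a sparse update. The threshold rule and data layout below are illustrative assumptions, not the paper's exact filter.

```python
# Minimal sketch of a lazy parameter-aggregation filter, assuming a worker only
# transmits coordinates whose change since the last synchronization exceeds a
# threshold. The threshold rule and data layout are illustrative assumptions.
import numpy as np

def lazy_update(current, last_sent, threshold=1e-3):
    """Return a sparse update {index: value} and the new 'last sent' vector."""
    delta = current - last_sent
    idx = np.flatnonzero(np.abs(delta) > threshold)
    update = {int(i): float(current[i]) for i in idx}
    new_last = last_sent.copy()
    new_last[idx] = current[idx]
    return update, new_last

if __name__ == "__main__":
    prev = np.zeros(5)
    curr = np.array([0.0005, 0.2, -0.3, 0.0, 0.0009])
    update, prev = lazy_update(curr, prev)
    print(update)  # {1: 0.2, 2: -0.3}: only large changes are transmitted
```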
- Published
- 2021
34. Efficient design and implementation of a robust coplanar crossover and multilayer hybrid full adder–subtractor using QCA technology
- Author
-
Mukesh Patidar and Namit Gupta
- Subjects
020203 distributed computing ,Adder ,Computer science ,Circuit design ,Crossover ,Quantum dot cellular automaton ,Hardware_PERFORMANCEANDRELIABILITY ,02 engineering and technology ,Theoretical Computer Science ,Euler method ,symbols.namesake ,Hardware and Architecture ,Subtractor ,Hardware_INTEGRATEDCIRCUITS ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,Electronic engineering ,Software ,Hardware_LOGICDESIGN ,Information Systems ,Electronic circuit - Abstract
Quantum-dot cellular automata (QCA) is a novel, emerging nanometer-scale circuit design approach based on nanocomputing technology, which overcomes the limitations of complementary MOS technology in terms of circuit area, power, and latency/delay. This paper presents an efficient design of crossover single-layer (coplanar) and multilayer novel hybrid full adder–subtractor circuits, implemented with a majority-gate minimization functional J-map technique. The proposed circuits are more efficient in terms of number of QCA cells, latency, required area in µm², and quantum cost than existing QCA adder–subtractor designs, and they also avoid the thermodynamic problems caused by long QCA wires through the applied synchronization clocking method. In this paper, we introduce 14 nm × 14 nm and 16 nm × 16 nm cell-size QCA circuits and compare them with existing and proposed novel 18 nm × 18 nm single-layer and multilayer designs. Both designs are implemented in the QCADesigner-E tool with the bistable vector and coherence vector energy setups, using the Euler method and the Runge–Kutta method.
- Published
- 2021
35. Design and implementation of an academic expert system through big data analysis
- Author
-
Jaesoo Yoo, Hyeonbyeong Lee, Kyoungsoo Bok, and Dojin Choi
- Subjects
Influence factor ,Impact factor ,Computer science ,business.industry ,media_common.quotation_subject ,Big data ,computer.software_genre ,Data science ,Expert system ,Field (computer science) ,Theoretical Computer Science ,Hardware and Architecture ,Factor (programming language) ,Quality (business) ,business ,computer ,Software ,Information Systems ,media_common ,computer.programming_language - Abstract
Most researchers establish research directions when studying new fields by drawing on expert advice or experts' published papers. Existing academic search services display papers by field but do not identify experts by field. Therefore, researchers are left to judge who the experts in each field are by analyzing the papers themselves. In this paper, we design and implement an expert search system based on papers published by academic societies. The academic expert search system is built on a big data processing system to handle the large amount of data in academic fields. It calculates an expert score using quality and influence factors. The quality factor is calculated from the citations, impact factor, and recentness of a paper. The influence factor is measured by the sparsity of a field and the degree of contribution of an author. The proposed system provides various services such as expert search, keyword search, hot topics, expert relationships, and academic society statistics. By finding experts in a specific field, our system can support researchers' research activities.
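As a hedged illustration of how such an expert score could be assembled, the sketch below combines a quality factor (citations, impact factor, recentness) and an influence factor (field sparsity, author contribution). The ingredients follow the abstract, but the exact weights and formulas are illustrative assumptions, not the paper's scoring model.

```python
# Minimal sketch of an expert score built from a quality factor and an
# influence factor. The ingredients follow the abstract; the exact weights
# and formulas here are illustrative assumptions.
import math
from datetime import date

def quality_factor(citations, impact_factor, year, now=None):
    now = now or date.today().year
    recentness = math.exp(-0.2 * max(now - year, 0))  # newer papers weigh more
    return (1 + math.log1p(citations)) * impact_factor * recentness

def influence_factor(field_paper_count, author_position, author_count):
    sparsity = 1.0 / math.log1p(field_paper_count)           # rarer fields weigh more
    contribution = (author_count - author_position + 1) / author_count
    return sparsity * contribution

def expert_score(papers):
    """papers: list of dicts with the fields used below."""
    return sum(
        quality_factor(p["citations"], p["impact_factor"], p["year"])
        * influence_factor(p["field_papers"], p["author_position"], p["author_count"])
        for p in papers
    )

if __name__ == "__main__":
    papers = [{"citations": 12, "impact_factor": 2.1, "year": 2018,
               "field_papers": 500, "author_position": 1, "author_count": 3}]
    print(round(expert_score(papers), 3))
```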
- Published
- 2021
36. ST-CAC: a low-cost crosstalk avoidance coding mechanism based on three-valued numerical system
- Author
-
Zahra Shirmohammadi, Martin Omana, and Ata Khorami
- Subjects
020203 distributed computing ,Crosstalk avoidance codes (CACs) ,Computer science ,Reliability (computer networking) ,Code word ,Binary number ,Crosstalk fault ,02 engineering and technology ,Reliability ,Network-on-chip ,Theoretical Computer Science ,Hardware and Architecture ,Tri-valued numerical system coding mechanism ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,Algorithm ,Software ,Information Systems ,Degradation (telecommunications) ,Coding (social sciences) ,Data transmission - Abstract
The appearance of specific transition patterns during data transfer on the bus lines of modern high-performance computing systems, such as the communication structures of accelerators for deep convolutional neural networks, commercial networks-on-chip, and memories, can lead to crosstalk faults. As technology sizes shrink, the occurrence of crosstalk faults increases, degrading reliability and performance and raising the power consumption of the lines. One effective way to alleviate crosstalk faults is to avoid the appearance of these specific transition patterns by using numerical-based crosstalk avoidance codes (CACs). However, a serious problem with numerical-based CACs is their overhead in terms of the additional bus lines required to represent code words. To solve this problem, in this paper we present a novel CAC that uses three symbols (three values) to represent the code words on the bus lines, rather than the classical binary CACs based on the 0 and 1 symbols. Our proposed CAC, named summation-based tri-value crosstalk avoidance code (ST-CAC), reduces the worst-case delay on bus lines with respect to binary CACs, and it can be applied efficiently to any channel width. The use of three symbols to represent code words in ST-CAC makes it possible to increase the number of code words of a numerical system without significantly increasing the number of required bus lines. The experimental results show that CACs based on three symbols can reduce the number of additional lines by 33% compared with binary CACs. Moreover, we show that the wire delay in the presence of our ST-CAC can be reduced by 33% with respect to state-of-the-art binary-value CACs.
- Published
- 2021
37. Data congestion in VANETs: research directions and new trends through a bibliometric analysis
- Author
-
Tarandeep Kaur Bhatia, Ramkumar Ketti Ramachandran, Robin Doss, and Lei Pan
- Subjects
Bibliometric analysis ,Hardware and Architecture ,Wireless ad hoc network ,Computer science ,Scale (social sciences) ,ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS ,Scopus ,Data science ,Software ,Field (computer science) ,Information Systems ,Theoretical Computer Science ,Domain (software engineering) - Abstract
Vehicular Ad hoc Networks (VANETs) have become increasingly popular in academia and in manufacturing businesses, and the VANETs domain attracts massive attention from authors all over the world. However, substantial research effort is still needed in the VANETs field to solve the data congestion problem. For this, it is vital to state the current status of research in the domain. As research publications have increased substantially since 2009, a bibliometric analysis is necessary for researchers to understand the actual results and findings in this area. This paper examines and analyzes research trends between 2010 and 2019 for the domain “Data congestion in VANETs” by applying various parameters. As extracted from the Scopus database up to December 31, 2019, a total of 11,109 publications are associated with the VANETs domain, and 434 of these publications are related to data congestion in VANETs. Finally, a software tool named VOSviewer is used to create and visualize the bibliometric networks of the selected field. This analysis will help researchers follow the research trends of data congestion in VANETs.
- Published
- 2021
38. Efficient covering of target areas using a location prediction-based algorithm
- Author
-
Seok-Woo Jang
- Subjects
020203 distributed computing ,genetic structures ,Computer science ,business.industry ,Deep learning ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Object (computer science) ,Tracking (particle physics) ,Theoretical Computer Science ,Image (mathematics) ,Location prediction ,Hardware and Architecture ,Pattern recognition (psychology) ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,business ,Face detection ,Image retrieval ,Algorithm ,Software ,Information Systems - Abstract
Due to the rapid development of high-speed wired and wireless Internet, image content that includes objects exposing personal information is being distributed freely, which has become a social problem. In this paper, we introduce a method for robustly detecting a target object with an exposed facial region in rapidly incoming images, using skin color and a deep learning algorithm, and for effectively covering the detected target object through location prediction. The proposed method accurately detects the target object containing an exposed facial region by applying an image-adaptive skin color model and a CNN-based deep learning algorithm. Subsequently, a location prediction algorithm is used to track the detected object quickly, and a mosaic is overlaid on the target object area to protect the region where the facial area is exposed. The experimental results show that the proposed approach accurately detects target objects with exposed facial regions in continuously captured video and efficiently covers the detected objects with mosaic processing while tracking them quickly using the prediction-based tracking algorithm. The tracking-based target covering method proposed in this study is expected to be useful in various practical applications related to pattern recognition and image security, such as content-based image retrieval, real-time surveillance, human–computer interaction, and face detection.
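The covering step itself can be illustrated with a small mosaic routine: given a bounding box already produced by a detector, each tile in the box is replaced by its mean color. The box, tile size, and random frame below are hypothetical; the detection and prediction-based tracking described in the abstract are out of scope here.

```python
# Minimal sketch of covering a detected face region with a mosaic, assuming the
# detector has already returned a bounding box (x, y, w, h). Detection and
# tracking (skin-color model, CNN, location prediction) are not shown.
import numpy as np

def mosaic_region(image, box, block=16):
    """Pixelate image[y:y+h, x:x+w] in place with block-sized tiles."""
    x, y, w, h = box
    region = image[y:y + h, x:x + w]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = region[by:by + block, bx:bx + block]
            tile[...] = tile.mean(axis=(0, 1), keepdims=True).astype(image.dtype)
    return image

if __name__ == "__main__":
    frame = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)
    mosaic_region(frame, box=(100, 60, 64, 64))
```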
- Published
- 2020
39. K-means tree: an optimal clustering tree for unsupervised learning
- Author
-
Pooya Tavallali, Mukesh Singhal, and Peyman Tavallali
- Subjects
Computational complexity theory ,Computer science ,business.industry ,Random projection ,k-means clustering ,Pattern recognition ,Manifold ,Theoretical Computer Science ,Hardware and Architecture ,Principal component analysis ,Unsupervised learning ,Artificial intelligence ,Cluster analysis ,business ,Time complexity ,Software ,Information Systems ,Curse of dimensionality - Abstract
Tree construction is one of the popular methods for tackling supervised tasks in machine learning. However, there has been little effort in applying trees to unsupervised tasks. Traditional unsupervised trees are based on recursively partitioning the space such that the resulting partitions contain similar samples, where the sense of similarity depends on the model and the application. Optimizing the performance of infrastructures and energy consumption in the Internet of Things can be mentioned as applications of trees and clustering, respectively. This paper tackles the problem of learning optimal oblique clustering trees for the first time and proposes a linear-time algorithm for training them. The motivation of unsupervised tree models is to preserve the data manifold while keeping the query time fast. Popular unsupervised models include k-d trees, random projection trees (RP trees), principal component analysis trees (PCA trees), and clustering trees. However, all existing methods for unsupervised trees are sub-optimal, existing clustering trees are limited to axis-aligned splits, and some of these methods, such as k-d trees, suffer from the curse of dimensionality. Despite these challenges, trees are fast at query time. On the other hand, a non-hierarchical clustering method such as k-means performs well in high-dimensional problems, is locally optimal, and has an efficient learning algorithm, but it is not fast at query time. To address these issues, this paper proposes a novel k-means tree, a tree that outputs the centroids of clusters. The advantages of such a tree are fast query time and cluster centroids as good as those of k-means. The problem of learning such trees is therefore to learn both the centroids and the tree parameters optimally and jointly. In this paper, this problem is first cast as a constrained minimization problem and then solved using the quadratic penalty method. The method consists of learning clusters from k-means and gradually adapting the centroids to the outputs of an optimal oblique tree. Alternating optimization is used, with alternation steps consisting of weighted k-means clustering and tree optimization. The training complexity of the proposed algorithm is efficient, and the algorithm is optimal in the sense of the jointly learned clusters and tree. The trees used in the k-means tree are oblique, and to our knowledge, this is the first time oblique trees have been applied to the task of clustering. As a side product of the proposed method, sample reduction is explored and its merits are shown: the training complexity of the k-means tree (KMT) as a sample reduction method is logarithmic in the size of the reduced training set, whereas the training complexity of k-means is linear in the size of the reduced dataset. Finally, the proposed method is compared with other tree-based clustering algorithms, its superiority in terms of reconstruction error is shown, and its query complexity is compared with that of k-means.
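To make the query-time advantage concrete, the sketch below walks an oblique clustering tree whose leaves store centroids, applying the standard oblique test w·x + b at each internal node. The tree here is hand-built for illustration; the paper's training procedure (alternating weighted k-means and tree optimization under a quadratic penalty) is not reproduced.

```python
# Minimal sketch of querying an oblique clustering tree whose leaves store
# centroids, as the abstract describes. Each internal node applies an oblique
# test w.x + b >= 0. The tree below is a hand-built example; training is not shown.
import numpy as np

class Leaf:
    def __init__(self, centroid):
        self.centroid = np.asarray(centroid, dtype=float)

class Node:
    def __init__(self, w, b, left, right):
        self.w, self.b = np.asarray(w, dtype=float), float(b)
        self.left, self.right = left, right

def query(tree, x):
    """Walk the tree and return the centroid assigned to sample x."""
    node = tree
    while isinstance(node, Node):
        node = node.right if node.w @ x + node.b >= 0 else node.left
    return node.centroid

if __name__ == "__main__":
    # Two clusters separated by the oblique hyperplane x0 + x1 = 1.
    tree = Node(w=[1.0, 1.0], b=-1.0, left=Leaf([0.2, 0.2]), right=Leaf([1.5, 1.8]))
    print(query(tree, np.array([1.4, 1.7])))  # -> [1.5 1.8]
```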
- Published
- 2020
40. Anti-negation method for handling negation words in question answering system
- Author
-
K. Sundarakantham, S. Mercy Shalinie, and J. Felicia Lilian
- Subjects
020203 distributed computing ,Word embedding ,Phrase ,Computer science ,business.industry ,media_common.quotation_subject ,02 engineering and technology ,Ambiguity ,Semantics ,computer.software_genre ,Theoretical Computer Science ,Negation ,Reading comprehension ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Question answering ,Artificial intelligence ,business ,computer ,Software ,Sentence ,Natural language processing ,Information Systems ,media_common - Abstract
A question answering (QA) system for a reading comprehension task tries to answer a question by retrieving the needed phrase from the given content. Precise answering is the key role of a QA system. Ambiguity arises when a negative question must be answered with a positive reply. Negation words change the polarity of a sentence, and hence the scope of negation words is notable. This has paved the way for studying the role of ‘negation’ in natural language processing (NLP) tasks. The handling of these words is a major part of our proposed methodology. In this paper, we propose an algorithm to retrieve and replace the negation words present in the content and the query. A comparative study of word embeddings over these words is performed using various state-of-the-art methods. In earlier works, handling negation changed the semantics of the sentences; hence, in this paper we try to maintain the semantics through the proposed methodology. The updated content is embedded with a bi-directional long short-term memory (Bi-LSTM) network, which makes retrieving an answer in the question answering system easier. The proposed work has been carried out on the Stanford Negation and SQuAD datasets, and a precision of 96.2% has been achieved in retrieving the answers given in the dataset.
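A hedged sketch of the retrieve-and-replace step for negation words follows. The negation list and the antonym table are small illustrative examples, the fallback marker is an assumption, and the Bi-LSTM answer extraction is not shown.

```python
# Minimal sketch of retrieving and replacing negation words before encoding,
# as the abstract describes. The negation list, antonym table, and NEG_ marker
# are illustrative assumptions; the Bi-LSTM answer extraction is not shown.
NEGATION_WORDS = {"not", "n't", "never", "no"}
ANTONYMS = {"unable": "able", "impossible": "possible", "unhappy": "happy"}

def handle_negation(sentence):
    tokens = sentence.lower().split()
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in NEGATION_WORDS and i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Replace "not X" with a known antonym of X; otherwise keep an
            # explicit NEG_ marker so the polarity is not silently lost.
            out.append(ANTONYMS.get(nxt, "NEG_" + nxt))
            i += 2
        else:
            out.append(tok)
            i += 1
    return " ".join(out)

if __name__ == "__main__":
    print(handle_negation("He was not unable to answer"))   # he was able to answer
    print(handle_negation("She did not attend the class"))  # she did NEG_attend the class
```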
- Published
- 2020
41. Forecasting air passenger traffic flow based on the two-phase learning model
- Author
-
Xiang Yong, Xinzhi Zhou, Mingqian Du, Xinfang Wu, Xiuqing Yang, and Gang Mao
- Subjects
020203 distributed computing ,Operations research ,Hardware and Architecture ,Computer science ,0202 electrical engineering, electronic engineering, information engineering ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,02 engineering and technology ,Volatility (finance) ,Traffic flow ,Software ,Information Systems ,Theoretical Computer Science - Abstract
Future airports will move in a highly intelligent direction, for example toward unmanned check-in services, while the scale and resource allocation of ground services are tightly related to air passenger flow. Therefore, forecasting passenger flow accurately will significantly affect the development of future airports and the optimization of civil airline services. As a kind of time series, air passenger flow is influenced by multiple factors, in particular the stochastic components of seasonality, trend, and volatility, which ultimately affect the accuracy of the prediction. Therefore, this paper introduces a prediction model based on a two-phase learning framework: in phase one, various predictors cope with different features of the time series in parallel, and the prediction results are integrated in phase two. Furthermore, this paper compares the principal error indicators against actual data, and the results show that the two-phase learning model performs better than current fusion models and exhibits stable performance.
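A hedged sketch of the two-phase structure described here: phase one fits several predictors in parallel, and phase two learns how to combine their outputs on held-out data. The base models below (ridge regression and a small random forest) and the linear combiner are stand-ins, not the predictors used in the paper.

```python
# Minimal sketch of a two-phase learning scheme: phase one fits several
# predictors in parallel, phase two learns how to combine their outputs.
# The base models and the linear combiner are illustrative stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge

def two_phase_fit(X_train, y_train, X_hold, y_hold):
    phase_one = [Ridge(alpha=1.0), RandomForestRegressor(n_estimators=50, random_state=0)]
    for model in phase_one:
        model.fit(X_train, y_train)
    # Phase two: integrate the individual predictions on held-out data.
    stacked = np.column_stack([m.predict(X_hold) for m in phase_one])
    combiner = LinearRegression().fit(stacked, y_hold)
    return phase_one, combiner

def two_phase_predict(phase_one, combiner, X):
    stacked = np.column_stack([m.predict(X) for m in phase_one])
    return combiner.predict(stacked)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 4))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(200)
    models, combiner = two_phase_fit(X[:120], y[:120], X[120:160], y[120:160])
    print(two_phase_predict(models, combiner, X[160:165]).round(2))
```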
- Published
- 2020
42. SINGLETON: A lightweight and secure end-to-end encryption protocol for the sensor networks in the Internet of Things based on cryptographic ratchets
- Author
-
Siyamak Shahpasand and Amir Hassani Karbasi
- Subjects
Cryptographic primitive ,business.industry ,Computer science ,Double Ratchet Algorithm ,Cryptography ,Cryptographic protocol ,Encryption ,Theoretical Computer Science ,End-to-end encryption ,Hardware and Architecture ,Forward secrecy ,business ,Communications protocol ,Software ,Information Systems ,Computer network - Abstract
For many systems, secure connectivity is an important requirement, even when the transmitting devices are resource-constrained. The advent of the Internet of Things (IoT) has also increased the demand for low-power devices capable of communicating with each other or sending data to a central processing site. The IoT enables many applications in a smart environment, such as outdoor activity control, smart energy, infrastructure management, environmental sensing, and cyber-security. Security in such situations remains an open challenge because of the resource-constrained design of sensors and objects, and because multi-purpose adversaries may target the process at any point in the life cycle of a smart sensor. This paper discusses widely used protocols that provide secure communications for various IoT applications and also describes different attacks. To protect IoT objects and sensors, we propose a comprehensive and lightweight security protocol based on cryptographic ratchets: an encrypted messaging protocol using the Double Ratchet Algorithm, which we call Singleton, is defined, and its implementation is tested and compared with implementations of standard IoT protocols and with a post-quantum version of the protocol. Various cryptographic primitives are also evaluated, and their suitability for use in the protocol is tested. The results show that the protocol, as a building block, not only enables efficient resource-wise protocols and architectures but also supports advanced and scalable IoT sensors. Our design and analysis demonstrate that the Singleton security architecture can be easily integrated into existing network protocols such as IEEE 802.15.4 or OMA LWM2M, and that it offers several benefits that existing approaches cannot provide in terms of both performance and important security services. For chat applications such as WhatsApp, Skype, Facebook Private Messenger, Google Allo, and Signal, a cryptographic ratchet-based protocol provides end-to-end encryption, forward secrecy, backward secrecy, authentication, and deniability.
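To illustrate the ratcheting idea behind such protocols, the sketch below shows only the symmetric-key half of a ratchet: each step derives a fresh message key and immediately advances the chain key, so a compromised chain key does not reveal earlier message keys. This is a minimal sketch, not the paper's Singleton protocol; the Double Ratchet additionally runs a Diffie–Hellman ratchet, message headers, and authentication, all omitted here, and the seed value is illustrative.

```python
# Minimal sketch of the symmetric-key half of a ratchet: each step derives a
# fresh message key and advances the chain key, so compromising the current
# chain key does not reveal earlier message keys (forward secrecy). The full
# Double Ratchet additionally runs a Diffie-Hellman ratchet, omitted here.
import hmac
import hashlib

def kdf_chain(chain_key: bytes):
    """Derive (next_chain_key, message_key) from the current chain key."""
    next_ck = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    msg_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_ck, msg_key

if __name__ == "__main__":
    chain_key = b"\x00" * 32  # shared root secret; illustrative only
    for i in range(3):
        chain_key, message_key = kdf_chain(chain_key)
        print(f"message {i}: key = {message_key.hex()[:16]}...")
```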
- Published
- 2020
43. A compensation textures dehazing method for water alike area
- Author
-
Jian Zhang, Wanjuan Song, and Feihu Feng
- Subjects
020203 distributed computing ,Haze ,Channel (digital image) ,Computer science ,business.industry ,Deep learning ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,02 engineering and technology ,Theoretical Computer Science ,Compensation (engineering) ,Aerial photography ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,Computer vision ,The Internet ,Artificial intelligence ,business ,Software ,Water vapor ,ComputingMethodologies_COMPUTERGRAPHICS ,Information Systems - Abstract
With the continual development of deep learning, image processing has become a key technology in the Internet of Things. Nevertheless, many deep learning methods cannot meet the special needs of the Internet of Things, for example the needs of the Internet of vehicles and ships for traffic haze images. In particular, haze removal in water areas is more difficult than in ordinary scenes because of the influence of water vapor, and dehazing of water areas has practical value in shipping and aerial photography; a good dehazing result can even improve the safety of navigation. In this paper, a compensation textures dehazing method is presented for water-alike scenes. The motivation of this paper comes from the following observations: the dark channel haze removal method produces very realistic dehazing results for ordinary scenes, but, because of the principle of the dark channel method, it deviates strongly in water-alike areas. Therefore, based on the classical dark channel method, this paper proposes three innovations. First, a dynamic priority method is designed, which calculates the priority order of patches according to the characteristics of the processed subject. Second, a compensation textures method is designed, which compensates the special areas according to the proposed priority method. Third, a new haze removal method is designed, which effectively removes haze from water areas according to the proposed compensation textures method. The results of visual and quantitative experiments show that the proposed method achieves state-of-the-art dehazing results in water-alike areas.
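For reference, the classical dark channel computation that this abstract builds on is the per-pixel minimum over color channels followed by a local minimum filter, as in the sketch below. The 15×15 patch size is a common illustrative choice; the paper's dynamic priority and texture compensation steps are not shown.

```python
# Minimal sketch of the classical dark channel computation: per-pixel minimum
# over the color channels followed by a local minimum filter. The patch size
# is an illustrative choice; the paper's compensation steps are not shown.
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image, patch=15):
    """image: HxWx3 array in [0, 1]; returns the HxW dark channel."""
    per_pixel_min = image.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch)

if __name__ == "__main__":
    img = np.random.rand(120, 160, 3)
    dc = dark_channel(img)
    print(dc.shape, float(dc.min()), float(dc.max()))
```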
- Published
- 2020
44. NP-completeness of chromatic orthogonal art gallery problem
- Author
-
Hamid Hoorfar and Alireza Bagheri
- Subjects
Art gallery problem ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Approximation algorithm ,Computer Science::Computational Geometry ,Computational geometry ,Theoretical Computer Science ,Combinatorics ,Exact algorithm ,Monotone polygon ,Hardware and Architecture ,Polygon ,Graph coloring ,Rectangle ,Simple polygon ,Time complexity ,Software ,ComputingMethodologies_COMPUTERGRAPHICS ,Information Systems - Abstract
The chromatic orthogonal art gallery problem is a well-known problem in computational geometry. Two points in an orthogonal polygon P see each other if there is an axis-aligned rectangle inside P that contains them. An orthogonal guarding of P is k-colorable if there is an assignment of k colors to the guards such that the visibility regions of every two guards of the same color have no intersection. The purposes of this paper are to discuss the time complexity of k-colorability of orthogonal guarding and to provide algorithms for the chromatic orthogonal art gallery problem. The correctness of the presented solutions is proved mathematically. Herein, a heuristic method is used that leads to an innovative reduction, several optimal algorithms, and one approximation algorithm. The paper shows that deciding k-colorability of orthogonal guarding for P is NP-complete. First, we prove that deciding 2-colorability of P is NP-complete, by a reduction from the planar monotone rectilinear 3-SAT problem. After that, a reduction from graph coloring implies that this holds for every fixed integer $$k\ge 2$$ . In the third step, we present a 6-approximation algorithm for every orthogonal simple polygon. Also, an exact algorithm is provided for histogram polygons that finds the minimum chromatic number.
- Published
- 2020
45. A review on diagnostic autism spectrum disorder approaches based on the Internet of Things and Machine Learning
- Author
-
Ali Mazaherinezhad, Mehdi Hosseinzadeh, Aziz Rezapour, Alireza Souri, Ahmed Omar Bali, Mahdi Bohlouli, Jalil Koohpayehzadeh, and Farnoosh Afshin Rad
- Subjects
020203 distributed computing ,Computer science ,business.industry ,Deep learning ,Context (language use) ,02 engineering and technology ,medicine.disease ,Machine learning ,computer.software_genre ,Field (computer science) ,Theoretical Computer Science ,Nonverbal communication ,Hardware and Architecture ,Autism spectrum disorder ,Taxonomy (general) ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,Autism ,Artificial intelligence ,business ,computer ,Software ,Information Systems ,Gesture - Abstract
Children with autism spectrum disorders (ASDs) show disturbances in their activities. Usually, they cannot speak fluently; instead, they use gestures and pointing words to communicate. Hence, understanding their needs is one of the most challenging tasks for caregivers, but early diagnosis of the disorder can make it much easier. The lack of verbal and nonverbal communication can be compensated for by assistive technologies and the Internet of Things (IoT). IoT-based systems help to diagnose the disorder and improve patients' lives by applying Deep Learning (DL) and Machine Learning (ML) algorithms. This paper provides a systematic review of ASD approaches in the context of IoT devices. The main goal of this review is to recognize significant research trends in the field of IoT-based healthcare. Also, a technical taxonomy is presented to classify the existing papers on ASD methods and algorithms. A statistical and functional analysis of the reviewed ASD approaches is provided based on evaluation metrics such as accuracy and sensitivity.
- Published
- 2020
46. Comparative studies on machine learning for paralinguistic signal compression and classification
- Author
-
Kyomin Jung, Seokhyun Byun, and Seung-Hyun Yoon
- Subjects
020203 distributed computing ,business.industry ,Computer science ,Feature extraction ,Signal compression ,02 engineering and technology ,Machine learning ,computer.software_genre ,Signal ,Theoretical Computer Science ,Statistical classification ,Redundancy (information theory) ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,Artificial intelligence ,business ,computer ,Software ,Information Systems ,Curse of dimensionality ,Data compression - Abstract
In this paper, we focus on various compression and classification algorithms for three different paralinguistic signal classification tasks. These tasks are quite difficult for humans because the sound information from such signals is difficult to distinguish. Therefore, when machine learning techniques are applied to analyze paralinguistic signals, several different aspects of speech-related information, such as prosody, energy, and cepstral information, are usually considered for feature extraction. However, when the size of the training corpus is not sufficiently large, it is extremely difficult to directly apply machine learning to classify such signals due to their high feature dimensions; this problem is also known as the curse of dimensionality. This paper proposes to address this limitation by means of feature compression. First, we present experimental results obtained by using various compression algorithms to compress signals to eliminate redundancy of the signal features. We observe that compared with the original features, the compressed signal features still provide a comparable ability to distinguish the signals, especially when using a fully connected neural network classifier. Second, we calculate the output distribution of the F1-score for each emotion in the speech emotion recognition problem and show that the fully connected neural network classifier performs more stably than other classical methods.
- Published
- 2020
47. A Riccati-type algorithm for solving generalized Hermitian eigenvalue problems
- Author
-
Takafumi Miyata
- Subjects
Computer science ,Krylov subspace ,Hermitian matrix ,Theoretical Computer Science ,Hardware and Architecture ,Iterated function ,ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION ,Convergence (routing) ,Riccati equation ,MATLAB ,computer ,Algorithm ,Software ,Subspace topology ,Eigenvalues and eigenvectors ,Information Systems ,computer.programming_language - Abstract
This paper describes a heuristic algorithm for quickly solving generalized Hermitian eigenvalue problems. The algorithm searches a subspace for an approximate solution of the problem; if the approximate solution is unacceptable, the subspace is expanded to a larger one, and a possibly better approximate solution is then computed in the expanded subspace. The algorithm iterates these two steps alternately, so the speed of convergence depends on how the subspace is generated. In this paper, we derive a Riccati equation whose solution can correct the approximate solution of a generalized Hermitian eigenvalue problem to the exact one. In other words, the solution of the eigenvalue problem can be found if the subspace is expanded by the solution of the Riccati equation. This is a feature that existing algorithms, such as the Krylov subspace algorithm implemented in MATLAB and the Jacobi–Davidson algorithm, do not have. However, similarly to solving the eigenvalue problem, solving the Riccati equation is time-consuming. We therefore consider solving the Riccati equation with low accuracy and use its approximate solution to expand the subspace. The implementation of this heuristic algorithm is discussed so that its computational cost can be kept low. Experimental results show that the heuristic algorithm converges within fewer iterations and thus requires less computational time than the existing algorithms.
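The generic expand-and-project loop referred to in this abstract can be sketched as follows for Ax = λBx with Hermitian A and positive-definite B: project onto the current basis, solve the small problem by Rayleigh–Ritz, then enlarge the basis with a correction vector. In this sketch the correction is a plain residual vector; the paper instead expands the subspace with an approximate solution of its Riccati equation, which is not reproduced here.

```python
# Minimal sketch of the expand-and-project loop for A x = lambda B x with
# Hermitian A and positive-definite B. The projected problem is solved with
# scipy.linalg.eigh; here the subspace is expanded with a plain residual
# vector, whereas the paper expands it with an approximate Riccati solution.
import numpy as np
from scipy.linalg import eigh, qr

def subspace_iteration(A, B, v0, steps=10):
    V = v0.reshape(-1, 1) / np.linalg.norm(v0)
    for _ in range(steps):
        # Rayleigh-Ritz: solve the projected generalized eigenproblem.
        theta, s = eigh(V.conj().T @ A @ V, V.conj().T @ B @ V)
        x = V @ s[:, 0]                      # Ritz vector for the smallest Ritz value
        r = A @ x - theta[0] * (B @ x)       # residual of the approximation
        if np.linalg.norm(r) < 1e-10:
            break
        V, _ = qr(np.column_stack([V, r]), mode="economic")  # expand the subspace
    return theta[0], x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((8, 8))
    A = (M + M.T) / 2                              # Hermitian (real symmetric) test matrix
    B = np.eye(8) + 0.1 * np.diag(rng.random(8))   # positive definite
    lam, _ = subspace_iteration(A, B, rng.standard_normal(8))
    print(round(lam, 6), round(eigh(A, B, eigvals_only=True)[0], 6))
```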
- Published
- 2020
48. Priority-based joint EDF–RM scheduling algorithm for individual real-time task on distributed systems
- Author
-
Deepak Dahiya, Mohammed Alshehri, Rashmi Sharma, and Nitin Nitin
- Subjects
Rate-monotonic scheduling ,Earliest deadline first scheduling ,020203 distributed computing ,Basis (linear algebra) ,Computer science ,Distributed computing ,CPU time ,02 engineering and technology ,Upper and lower bounds ,Turnaround time ,Theoretical Computer Science ,Task (project management) ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Joint (audio engineering) ,Software ,Information Systems - Abstract
Multiple tasks arrive in distributed systems and can be executed in either a parallel or a sequential manner. Before execution, tasks are scheduled to their respective processors by priority with the help of scheduling algorithms. For task assignment, every scheduling algorithm follows different protocols, such as an upper bound on CPU utilization and a rule for assigning priorities. In this paper, the author works on such scheduling algorithms. Previously, the author evaluated the performance of the algorithms on the basis of transactions (groups of tasks); in this paper, the joint EDF–RM scheduling algorithm is re-evaluated, with its performance calculated on the basis of individual task execution. For comparative analysis, similar algorithms are considered, i.e., joint EDF–RM, earliest deadline first (EDF), and rate monotonic scheduling (RMS). These algorithms are simulated and examined with the help of statistical analysis, and the turnaround time of periodic tasks is evaluated. Additionally, the migration distribution and the CPU utilization with respect to the scheduling algorithms' upper bounds are also calculated.
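For orientation, the two base policies the joint algorithm builds on order tasks as in the sketch below: rate monotonic (RM) assigns static priorities by period (shorter period means higher priority), while earliest deadline first (EDF) picks at run time the ready job with the nearest absolute deadline. The task sets are hypothetical and the paper's joint EDF–RM combination itself is not reproduced.

```python
# Minimal sketch of the two base policies behind the joint algorithm:
# RM = static priority by period; EDF = pick the nearest absolute deadline.
def rm_priorities(tasks):
    """tasks: list of (name, period). Returns names ordered by RM priority."""
    return [name for name, period in sorted(tasks, key=lambda t: t[1])]

def edf_pick(ready_jobs, now):
    """ready_jobs: list of (name, absolute_deadline). Returns the job to run."""
    pending = [j for j in ready_jobs if j[1] >= now]
    return min(pending, key=lambda j: j[1])[0] if pending else None

if __name__ == "__main__":
    print(rm_priorities([("T1", 50), ("T2", 20), ("T3", 100)]))   # ['T2', 'T1', 'T3']
    print(edf_pick([("J1", 35), ("J2", 12), ("J3", 80)], now=5))  # 'J2'
```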
- Published
- 2020
49. Reduce energy consumption in sensors using a smartphone, smartwatch, and the use of SFLA algorithms (REC-SSS)
- Author
-
Sara Najafzadeh, Mohammad Reza Mohammadhosseini, and Ebrahim Mahdipour
- Subjects
020203 distributed computing ,Computer science ,business.industry ,Gyroscope ,02 engineering and technology ,Energy consumption ,Accelerometer ,Theoretical Computer Science ,law.invention ,Smartwatch ,Hardware and Architecture ,law ,0202 electrical engineering, electronic engineering, information engineering ,Wireless ,Sink (computing) ,business ,Algorithm ,Software ,Energy (signal processing) ,Information Systems ,Efficient energy use - Abstract
Wireless body area networks are a technology for remote medical care. Because of the limited energy of the sensors, one of the problems of remote medical care is the high energy consumption incurred when sending sensor information to the sink. Choosing a proper route when sending information to the sink reduces energy consumption and increases network lifetime. This paper uses the shuffled frog leaping algorithm (SFLA) to find a route that can deliver sensor information to the sink or coordinator with the lowest energy. Also, to prevent data traffic congestion at the sink, the sensors are divided into two groups of four. With the advent of the Internet of Things and its increasing use among people, this technology has attracted attention for remote medical care. Smartphones and smartwatches can measure information from different parts of the body, such as heart rate, walking activity, and glucose, using their built-in sensors such as accelerometers, gyroscopes, and advanced cameras. These two smart devices are used in three different roles (coordinator, sink, and sensor) to improve energy efficiency in remote medical care. The paper presents an appropriate path selection scheme to reduce energy consumption in sensors using a smartphone, a smartwatch, and the SFLA algorithm (REC-SSS). The simulation results show that network stability increases by 12.5%, 132%, and 3.5% compared with SIMPLE, M-ATTEMPT, and EERP, respectively. Also, in the proposed scheme, the network lifetime increases by 26% over SIMPLE, M-ATTEMPT, and EERP.
- Published
- 2020
50. Improving learning ability of learning automata using chaos theory
- Author
-
Mohammad Reza Meybodi and Bagher Zarei
- Subjects
Theoretical computer science ,Learning automata ,Computer science ,Chaotic ,Tent map ,Chaos theory ,Theoretical Computer Science ,Automaton ,Set (abstract data type) ,Rate of convergence ,Hardware and Architecture ,Finite set ,Software ,Information Systems - Abstract
A learning automaton (LA) can be considered an abstract system with a finite set of actions. The LA operates by choosing an action from this set and applying it to a stochastic environment. The environment evaluates the chosen action, and the automaton uses the environment's response to update its decision-making method for selecting the next action. This process is repeated until the optimal action is found. The learning algorithm (learning scheme) determines how the environment's response is used to update the decision-making method. In this paper, chaos theory is incorporated into the LA and a new type of LA, namely the chaotic LA (cLA), is introduced. In the cLA, chaotic numbers are used instead of random numbers when choosing an action. The experimental results show that in most cases, the use of chaotic numbers leads to a significant improvement in the learning ability of the LA. Among the chaotic maps investigated in this paper, the Tent map performs better than the other maps: using it increases the convergence rate by between 91.4% and 264.4% and decreases the convergence time by between 29.6% and 69.1%, on average. Furthermore, the chaotic LA is more scalable than the standard LA, and its performance does not decrease significantly as the problem size (number of actions) increases.
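To illustrate the idea of driving action selection with a chaotic sequence, the sketch below runs a simple linear reward-inaction automaton on a two-action stochastic environment, drawing the selection number from a Tent map instead of a uniform pseudo-random generator. The learning rate, reward probabilities, and Tent-map parameter (slightly below 2 to avoid floating-point collapse of the exact map) are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch of a learning automaton whose action selection is driven by a
# Tent-map sequence instead of uniform pseudo-random numbers. The linear
# reward-inaction update and the toy two-action environment are illustrative.
import random

def tent(x, mu=1.99):
    # mu slightly below 2 keeps the map chaotic while avoiding the
    # floating-point collapse of the exact mu = 2 tent map.
    return mu * x if x < 0.5 else mu * (1.0 - x)

def run_chaotic_la(steps=5000, alpha=0.05, reward_probs=(0.2, 0.8)):
    p = [0.5, 0.5]          # action probabilities
    x = 0.37                # Tent-map seed (avoid exact fixed points)
    for _ in range(steps):
        x = tent(x)
        action = 0 if x < p[0] else 1                 # chaotic number replaces the random draw
        if random.random() < reward_probs[action]:    # stochastic environment rewards
            # Linear reward-inaction update toward the rewarded action.
            other = 1 - action
            p[action] += alpha * (1 - p[action])
            p[other] *= (1 - alpha)
    return p

if __name__ == "__main__":
    random.seed(1)
    print(run_chaotic_la())  # p should concentrate on action 1 (higher reward probability)
```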
- Published
- 2020