Author: "Lijuan Zhou" / Topic: computer science - Searchworks@Jio Institute Digital Library Search Results

1. Optimized Artificial Bee Colony Algorithm for Web Service Composition Problem

Author: Shudong Zhang, Lijuan Zhou, and Shao Yaru
Subjects: Artificial bee colony algorithm, Information Systems and Management, Artificial Intelligence, business.industry, Computer science, Web service composition, Artificial intelligence, business, Computer Science Applications
Published: 2021

2. Study on the Personalized Learning Model of Learner-Learning Resource Matching

Author: Lijuan Zhou, Shudong Zhang, Min Xu, and Feifei Zhang
Subjects: Matching (statistics), Learning resource, Computer science, business.industry, Personalized learning, Artificial intelligence, Machine learning, computer.software_genre, business, computer, Computer Science Applications, Education
Abstract: With the development of service integration technology, online learning platforms have gathered a large number of learning resources, causing learners to get lost in a variety of course information and making it difficult to obtain learning resources that match their own needs. Personalized learning offers a direction for solving this problem. However, current personalized learning resource recommendation services face problems such as excessive candidate resources, sparse history and cold starts. In addition, the learning resources provided also show problems of "difficult or easy, uneven quality". To address this, this article studies a personalized learning recommendation model based on learner-learning resource matching. The main content includes three parts: First, build a demand model based on learner registration information, learning behavior and other data. Second, analyze the access behavior of learning resources and assess their quality. Third, calculate the matching degree between learners and learning resources based on the demand model and the quality information of the learning resources, and recommend resources accordingly.
Published: 2021

3. Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition

Author: Wanqing Li, Philip Ogunbona, Zhengyou Zhang, and Lijuan Zhou
Subjects: Computer science, business.industry, Probabilistic logic, 02 engineering and technology, computer.software_genre, Lexicon, Visualization, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, Frame (artificial intelligence), 020201 artificial intelligence & image processing, Artificial intelligence, Electrical and Electronic Engineering, Hidden Markov model, business, computer, Natural language processing
Abstract: A novel method for semantic action recognition through learning a pose lexicon is presented in this paper. A pose lexicon comprises a set of semantic poses, a set of visual poses, and a probabilistic mapping between the visual and semantic poses. This paper assumes that both the visual poses and mapping are hidden and proposes a method to simultaneously learn a visual pose model that estimates the likelihood of an observed video frame being generated from hidden visual poses, and a pose lexicon model that establishes the probabilistic mapping between the hidden visual poses and the semantic poses parsed from textual instructions. Specifically, the proposed method consists of two-level hidden Markov models. One level represents the alignment between the visual poses and semantic poses. The other level represents a visual pose sequence, and each visual pose is modeled as a Gaussian mixture. An expectation-maximization algorithm is developed to train a pose lexicon. With the learned lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of video frames that follows a given sequence of semantic poses, constrained by the most likely visual pose and the alignment sequences. The proposed method was evaluated on MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15, Combined-17, and Combined-50 action datasets using cross-subject, cross-dataset, zero-shot, and seen/unseen protocols.
Published: 2020

4. Detection of small objects in complex long-distance scenes based on Yolov3

Author: Weidong Feng, Bin Lu, Lijuan Zhou, Shudong Zhang, and Xin Chen
Subjects: Data set, Set (abstract data type), business.industry, Computer science, Feature extraction, Experimental data, Computer vision, Scale (descriptive set theory), Artificial intelligence, Function (mathematics), Construct (python library), business, Tower
Abstract: In view of the difficulty in detecting and managing illegal buildings, this article uses high-definition cameras mounted on a tower to regularly capture images to construct the “Suspected Illegal Building Information” dataset. Because the tower is far above the ground, most of the collected images contain many small targets, there are many types of target objects to be detected, and target sizes differ widely, resulting in low detection accuracy. To address these problems, this paper makes the following improvements to the Yolov3 algorithm: (1) using K-means to re-cluster the anchor boxes, (2) introducing CIoU to optimize the loss function, (3) adding a 104×104 scale for feature extraction, and (4) using Soft-NMS instead of the traditional NMS algorithm. In addition, the data set is enhanced and optimized. The experimental results show that, compared with the original Yolov3, the improved algorithm raises the detection accuracy on this experimental data set by 19.1%, which meets the needs of the project well.
Published: 2021
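
Note: the abstract above names Soft-NMS as a replacement for traditional NMS. The following is a minimal Python sketch of the generic Gaussian Soft-NMS idea, not the authors' implementation. The function names, the `sigma` and `score_threshold` defaults, and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    discarding them outright. Returns (original_index, decayed_score) pairs
    in the order they were selected. sigma and score_threshold are
    illustrative defaults, not values from the paper."""
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the candidate with the highest current score.
        best = max(candidates, key=lambda c: c[1])
        candidates.remove(best)
        kept.append(best)
        rescored = []
        for idx, score in candidates:
            overlap = iou(boxes[best[0]], boxes[idx])
            # Gaussian penalty: the larger the overlap, the stronger the decay.
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                rescored.append((idx, new_score))
        candidates = rescored
    return kept
```

Unlike hard NMS, a heavily overlapping box is kept with a reduced score rather than removed, which is the property that can help when small targets sit close together.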

5. Video captioning based on multi-feature fusion with object awareness

Author: Changyong Niu, Tao Liu, and Lijuan Zhou
Subjects: Closed captioning, business.industry, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Representation (systemics), Object (computer science), Multi feature fusion, Component (UML), Fuse (electrical), Computer vision, Artificial intelligence, business, Joint (audio engineering), Block (data storage)
Abstract: This paper proposes a novel method to utilize three source features for video captioning. It fuses global video features with local object and regional features to model the relationships among objects and their motions and applies object tags instead of visual features to guide the generation of descriptions. Specifically, multiple features are first extracted by pretrained models and treated as separate inputs alongside video frames. Second, an object awareness attention block is designed to fuse the information from the different features and to learn a joint video representation which has both visual and linguistic semantics. Experiments on MSVD and MSR-VTT datasets have shown the effectiveness of the proposed method, and the ablation studies have verified the contribution of each component.
Published: 2021

6. Data Cache Optimization Model Based on HBase and Redis

Author: Lijuan Zhou, Bin Lu, Laijun Qi, and Shudong Zhang
Subjects: business.industry, Computer science, Distributed computing, Distributed data store, Key (cryptography), Cloud computing, Focus (optics), business, Cloud storage, Storage model, Field (computer science), Image (mathematics)
Abstract: The computing-based cloud storage model realizes the secure and reliable storage of massive image data. Nowadays, how to quickly obtain high-quality image data is the focus of attention in the research field. This paper discusses the cloud storage model and proposes a combined data caching strategy based on HBase and Redis technology. In addition, the model addresses the technical shortcomings of Memcached by setting key indexes, which are stored in HBase, thereby achieving the mapping between index and DataNode. The experimental results show that the proposed optimization model can improve the ability to rapidly acquire and analyze image data and achieve higher retrieval efficiency for image data.
Published: 2020

7. A New Majority Weighted Minority Oversampling Technique for Classification of Imbalanced Datasets

Author: Yixuan Zhao, Chen Tian, Lijuan Zhou, and Shudong Zhang
Subjects: business.industry, Computer science, Pattern recognition, 02 engineering and technology, Hierarchical clustering, Data set, Statistical classification, ComputingMethodologies_PATTERNRECOGNITION, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Oversampling, 020201 artificial intelligence & image processing, Artificial intelligence, Cluster analysis, business, Classifier (UML)
Abstract: Classification is one of the essential tasks in data mining. Traditional classification strategies are predominantly designed for cost-insensitive, balanced data. They tend to concentrate on the overall accuracy of a model, and such classifiers are unsuitable for imbalanced sample data. Hence, optimizing imbalanced samples to improve classifier performance is an issue worthy of discussion. Based on the information-rich minority samples that are difficult to learn, the Majority Weighted Minority Oversampling Technique (MWMOTE) uses a clustering method to generate synthetic samples from the weighted informative samples. However, the accuracy of the clustering should be optimized. To this end, a method called NC_Link_MWMOTE is presented for efficiently handling imbalanced learning problems. We propose a solution that uses an NC_Link-based hierarchical clustering method to synthesize different samples from a small number of samples, thus optimizing the clustering effect. NC_Link_MWMOTE was evaluated on six data sets with different levels of balance. The simulation results show that our method is effective and outperforms competitive baseline methods in terms of various assessment metrics, such as F1-score and Area Under Curve (AUC).
Published: 2020

8. Statistical Analysis and Automatic Recognition of Grammatical Errors in Teaching Chinese as a Second Language

Author: Lijuan Zhou, Mengjie Zhong, Hongying Zan, and Yingjie Han
Subjects: Computer science, business.industry, First language, Negative transfer, computer.software_genre, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Second language, Statistical analysis, Artificial intelligence, Second language learners, business, computer, Word (computer architecture), Natural language processing
Abstract: Foreigners make various grammatical errors when learning Chinese due to the negative transfer of their mother tongue, learning strategies, etc. At present, the research on grammatical errors mainly focuses on a certain word or a certain kind of error, resulting in a lack of comprehensive understanding. In this paper, a statistical analysis of large-scale data sets of grammatical errors made by second language learners is conducted, including words with grammatical errors and their quantities. The statistical analysis gives people a more comprehensive understanding of grammatical errors and has certain guiding significance for teaching Chinese as a second language (TCSL). Because of the large proportion of grammatical errors involving “的[de](of)”, the usages of “的[de](of)” are integrated into automatic recognition of Chinese grammatical errors. Experimental results show that the performance is improved overall.
Published: 2020

9. BERT with Enhanced Layer for Assistant Diagnosis Based on Chinese Obstetric EMRs

Author: Chuang Liu, Kunli Zhang, Xuemin Duan, Lijuan Zhou, Hongying Zan, and Yueshu Zhao
Subjects: Language representation, Computer science, Data mining, computer.software_genre, computer, Encoder
Abstract: This paper proposes a novel method based on the language representation model called BERT (Bidirectional Encoder Representations from Transformers) for obstetric assistant diagnosis on Chinese obstetric EMRs (Electronic Medical Records). To aggregate more information for the final output, an enhanced layer is added to the BERT model. In particular, the enhanced layer in this paper is constructed based on strategy 1 (A strategy) and/or strategy 2 (A-AP strategy). The proposed method is evaluated on two datasets, the Chinese Obstetric EMRs dataset and the Arxiv Academic Paper Dataset (AAPD). The experimental results show that the proposed method based on BERT improves the F1 value by 19.58% and 2.71% over the state-of-the-art methods, and the proposed method based on BERT and the enhanced layer by strategy 2 improves the F1 value by 0.7% and 0.3% (strategy 1 improves the F1 value by 0.68% and 0.1%) over the method without the enhanced layer, on the Obstetric EMRs dataset and the AAPD dataset respectively.
Published: 2019

10. Imbalanced Data Processing Model for Software Defect Prediction

Author: Hua Wang, Shudong Zhang, Lijuan Zhou, and Ran Li
Subjects: Computer science, business.industry, Decision tree, Software development, Sampling (statistics), 020207 software engineering, Feature selection, 02 engineering and technology, computer.software_genre, Computer Science Applications, ComputingMethodologies_PATTERNRECOGNITION, C4.5 algorithm, Software bug, 0202 electrical engineering, electronic engineering, information engineering, Chi-square test, 020201 artificial intelligence & image processing, AdaBoost, Data mining, Electrical and Electronic Engineering, business, Classifier (UML), computer
Abstract: In the field of software engineering, software defect prediction is a research hotspot that can effectively help guarantee quality during software development. However, class-imbalanced datasets affect the overall classification accuracy of software defect prediction, which is a key issue that urgently needs to be solved. To better solve this problem, this paper proposes a model named ASRA, which combines attribute selection, sampling technologies and an ensemble algorithm. The model adopts the chi-square test for attribute selection to remove redundant attributes, and then uses a combined sampling technique, which includes SMOTE over-sampling and under-sampling, to balance the datasets. Afterwards, the ASRA model is established with the AdaBoost ensemble algorithm, using the J48 decision tree as the base classifier. The data used in the experiments comes from UCI datasets. Comparing the precision P, F-measure and AUC values from the experimental results leads to the conclusion that software defect prediction classification using this model is improved.
Published: 2017
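
Note: the abstract above relies on SMOTE over-sampling. The sketch below shows the core interpolation step of generic SMOTE in plain Python. It is not the ASRA model, and the function name, the `k` and `seed` parameters, and the tuple-based sample layout are assumptions made for illustration.

```python
import random

def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between x and the chosen neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic
```

Because each synthetic point is a convex combination of two real minority samples, it always falls inside the bounding box of the minority class.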

11. Semantic action recognition by learning a pose lexicon

Author: Zhengyou Zhang, Lijuan Zhou, Philip Ogunbona, and Wanqing Li
Subjects: Sequence, Computer science, business.industry, Posterior probability, 020207 software engineering, 02 engineering and technology, Mixture model, Lexicon, computer.software_genre, Action (philosophy), Artificial Intelligence, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Frame (artificial intelligence), 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Set (psychology), Hidden Markov model, business, computer, Software, Natural language processing
Abstract: This paper proposes a semantic representation, pose lexicon, for action recognition. The lexicon is composed of a set of semantic poses, a set of visual poses and a probabilistic mapping between the visual and semantic poses. Specifically, an action can be represented by a sequence of semantic poses extracted from an associated textual instruction. Visual frames of the action are considered to be generated from a sequence of hidden visual poses. To learn the lexicon, a visual pose model is learned from training samples by a Gaussian Mixture model to characterize the likelihood of an observed visual frame being generated by a visual pose. A pose lexicon model is also learned by an extended hidden Markov alignment model to encode the probabilistic mapping between hidden visual poses and semantic pose sequences. With the lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of visual frames that fits a given sequence of semantic poses through the most likely visual pose and alignment sequences. The efficacy of the proposed method was evaluated on MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15 and Combined-17 action datasets using cross-subject, cross-dataset and zero-shot protocols.
Published: 2017

12. Software Defect Prediction Based on Ensemble Learning

Author: Hui Liu, Zhong Sun, Xiangyang Huang, Lijuan Zhou, Ran Li, and Shudong Zhang
Subjects: Computer science, business.industry, media_common.quotation_subject, 020207 software engineering, 02 engineering and technology, Construct (python library), Machine learning, computer.software_genre, Ensemble learning, Random forest, Software, Software bug, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Quality (business), Software system, Artificial intelligence, business, Focus (optics), computer, media_common
Abstract: Software defect prediction is one of the important ways to guarantee the quality of software systems. Combining various machine learning algorithms to predict software defects has become a hot topic in current research. The paper uses the MDP datasets as the experimental research objects and takes ensemble learning as the research focus to construct a software defect prediction model. By experimenting with five different types of ensemble algorithms and analyzing their features and procedures, this paper identifies Random Forest as the best ensemble algorithm through experimental comparison. We then use the SMOTE over-sampling and Resample methods to improve the quality of the datasets and build a complete new software defect prediction model. The results show that the model can improve defect classification performance effectively.
Published: 2019

13. Efficiency Optimization of Capsule Network Model Based on Vector Element

Author: Kai Feng, Hui Li, Lijuan Zhou, Xiangyang Huang, and Shudong Zhang
Subjects: Computer science, business.industry, Deep learning, Pattern recognition, 02 engineering and technology, Convolutional neural network, Field (computer science), Vector element, 03 medical and health sciences, 0302 clinical medicine, Artificial Intelligence, Image identification, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, 030212 general & internal medicine, Computer Vision and Pattern Recognition, Artificial intelligence, business, Software, Network model
Abstract: Currently, Deep Learning and Convolutional Neural Network (CNN) have been widely used in many fields and have generated very high value in these fields, especially in the field of image recognition. But there are some deficiencies in certain issues of image recognition. For example, CNN’s recognition performance is poor for objects at different angles and for overlapping objects. Also, CNN is sometimes very sensitive to slight perturbations: modifying one pixel of a recognized image may cause recognition errors. The capsule network (CapsNet) proposed by Geoffrey Hinton can address these problems of traditional convolutional networks. When CapsNet was first proposed, its model structure was relatively simple, and many aspects could be explored for improvement. This paper optimizes CapsNet in two respects, “optimization of routing mechanism” and “increase Dropout operation,” and carries out experiments and result analysis on these optimizations.
Published: 2020

14. Improved clustering algorithm with adaptive opposition-based learning

Author: Qianqian Meng and Lijuan Zhou
Subjects: Clustering high-dimensional data, DBSCAN, Fuzzy clustering, Computer science, Population-based incremental learning, Correlation clustering, 02 engineering and technology, computer.software_genre, Machine learning, Biclustering, CURE data clustering algorithm, Consensus clustering, 0202 electrical engineering, electronic engineering, information engineering, Cluster analysis, k-medians clustering, FSA-Red Algorithm, k-medoids, business.industry, Constrained clustering, k-means clustering, Determining the number of clusters in a data set, Data stream clustering, Canopy clustering algorithm, FLAME clustering, Affinity propagation, 020201 artificial intelligence & image processing, Algorithm design, Data mining, Artificial intelligence, business, computer
Abstract: In recent years, clustering has become a hotspot in the field of data mining, as one of the key technologies for obtaining data distributions and observing the characteristics of classes. However, some clustering algorithms depend on the selection of initial cluster centers, and the clustering results easily fall into local optima. To solve this problem, the paper integrates a differential evolution algorithm with adaptive opposition-based learning. The algorithm uses a reverse factor to guide the search space toward the global optimal solution in each generation. The improved algorithm is then combined with the classical K-means algorithm. Verification on three UCI data sets demonstrates that the improved clustering algorithm not only clusters better and converges faster, but also effectively suppresses premature convergence.
Published: 2017
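
Note: the abstract above builds on opposition-based learning. The sketch below shows only the basic, non-adaptive form of the idea in Python: each candidate `x` in the box `[lower, upper]` has an opposite point `lower + upper - x`, and the fitter of the two populations survives. The function names and the "lower fitness is better" convention are assumptions. The paper's adaptive reverse factor and its differential evolution step are not reproduced here.

```python
def opposite_point(x, lower, upper):
    """Opposite of x inside the search box: each coordinate maps to lo + hi - xi."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def opposition_step(population, lower, upper, fitness):
    """Evaluate every candidate together with its opposite and keep the
    len(population) best ones (lower fitness value is treated as better)."""
    merged = population + [opposite_point(x, lower, upper) for x in population]
    merged.sort(key=fitness)
    return merged[:len(population)]
```

For clustering, a candidate would typically encode a set of initial cluster centers and the fitness would be a within-cluster distance measure, so the opposite candidates offer a cheap way to probe the far side of the search space.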

15. Swarm-Based Spreading Points

Author: Shudong Zhang, LiGuo Huang, Xiangyang Huang, and Lijuan Zhou
Subjects: 021103 operations research, Computer science, Minimum distance, 0211 other engineering and technologies, Process (computing), Swarm behaviour, Particle swarm optimization, 0102 computer and information sciences, 02 engineering and technology, 01 natural sciences, Set (abstract data type), Packing problems, 010201 computation theory & mathematics, Point (geometry), Pairwise comparison, Algorithm
Abstract: In this paper we propose a Swarm-based Spreading Points algorithm (SSP) for improving the solutions for packing problems. The SSP repositions the initial set of points and evolves it to improve the minimum distance between points. During the evolving process, for each point, a feasible direction of movement is computed according to its nearest neighbors so that the shortest pairwise distance between the point and other points can be increased along this direction (if any). Our experiments showed that the SSP algorithm can improve certain best-known solutions for some problems previously reported in the literature.
Published: 2017

16. Research and Implementation of Data Mining Algorithms Based on Cloud Computing

Author: Hui Wang, Xiang Wang, and Lijuan Zhou
Subjects: General Computer Science, Database, business.industry, Computer science, Data stream mining, General Mathematics, Cloud computing, computer.software_genre, Data mining algorithm, Utility computing, Cloud testing, Data mining, business, computer
Published: 2013

17. Indexing of Large Data Based on Cloud Computing Platform

Author: Wenbo Wang, Hui Wang, and Lijuan Zhou
Subjects: Information retrieval, Computer Networks and Communications, Hardware and Architecture, Computer science, Search engine indexing
Published: 2013

18. Research on Parallel Classification Algorithms for Large-scale Data

Author: Hui Wang, Wenbo Wang, and Lijuan Zhou
Subjects: Computer Networks and Communications, Computer science, business.industry, Improved algorithm, Cloud computing, Large scale data, computer.software_genre, Naive Bayes classifier, Statistical classification, Hardware and Architecture, Scalability, Computer data storage, Programming paradigm, Data mining, business, computer
Abstract: Because of the growing mass of data and the demand for individualized data mining, the traditional centralized data mining method cannot keep up. Cloud computing provides a cheap solution for massive data storage, analysis and handling. To achieve parallel data mining in a cloud environment, this paper proposes an improved algorithm based on traditional Naive Bayes. First, the design of the improved algorithm in the MapReduce programming model is presented. Then the algorithm is tested on real data. The experimental results show that the new algorithm has higher performance and better scalability.
Published: 2012
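
Note: the abstract above describes Naive Bayes expressed in the MapReduce programming model. The sketch below simulates that pattern in a single Python process: the map step emits count pairs per record and the reduce step sums them. It is a generic categorical Naive Bayes with Laplace smoothing, not the paper's improved algorithm, and all function names, key layouts, and parameters are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import chain

def map_record(record):
    """Map phase: emit (key, 1) for the class and for each (class, feature, value)."""
    features, label = record
    yield ("class", label), 1
    for j, value in enumerate(features):
        yield ("feat", label, j, value), 1

def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts per key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals

def train(records):
    """Run map over all records, then reduce; in a real cluster the map calls
    would run in parallel on separate data splits."""
    return reduce_counts(chain.from_iterable(map_record(r) for r in records))

def predict(counts, features, labels, values_per_feature):
    """Return the label with the highest log-posterior under Laplace smoothing.
    values_per_feature[j] is the number of distinct values of feature j."""
    n = sum(counts[("class", c)] for c in labels)
    best_label, best_score = None, -math.inf
    for c in labels:
        n_c = counts[("class", c)]
        score = math.log((n_c + 1) / (n + len(labels)))
        for j, value in enumerate(features):
            numerator = counts[("feat", c, j, value)] + 1
            score += math.log(numerator / (n_c + values_per_feature[j]))
        if score > best_score:
            best_label, best_score = c, score
    return best_label
```

Training reduces to counting, and counts from different data splits can simply be added, which is why this classifier maps naturally onto MapReduce.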

19. Learning a Pose Lexicon for Semantic Action Recognition

Author: Philip Ogunbona, Lijuan Zhou, and Wanqing Li
Subjects: FOS: Computer and information sciences, Sequence, Computer science, business.industry, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 020207 software engineering, 02 engineering and technology, Construct (python library), computer.software_genre, Lexicon, Translation (geometry), Semantics, Visualization, Action (philosophy), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Natural language processing, Gesture
Abstract: This paper presents a novel method for learning a pose lexicon comprising semantic poses defined by textual instructions and their associated visual poses defined by visual features. The proposed method simultaneously takes two input streams, semantic poses and visual pose candidates, and statistically learns a mapping between them to construct the lexicon. With the learned lexicon, action recognition can be cast as the problem of finding the maximum translation probability of a sequence of semantic poses given a stream of visual pose candidates. Experiments evaluating pre-trained and zero-shot action recognition conducted on MSRC-12 gesture and WorkoutSu-10 exercise datasets were used to verify the efficacy of the proposed method. Accepted by the 2016 IEEE International Conference on Multimedia and Expo (ICME 2016). 6-page paper and 4 pages of supplementary material.
Published: 2016

20. The Comprehensive Electronic Identity Security System of the Internet

Author: Xiang Zou, Huafeng Kong, Bing Chen, Bo Jin, Lijuan Zhou, and Yuebo Dai
Subjects: Cloud computing security, Computer Networks and Communications, Computer science, business.industry, Internet privacy, Computer security, computer.software_genre, Internet Architecture Board, Identity management, Security association, Hardware and Architecture, Electronic identity, The Internet, business, computer, Security system
Published: 2011

21. Decision fusion rules based on multi-bit knowledge of local sensors in wireless sensor networks

Author: Lijuan Zhou, Guangzhu Chen, Zhencai Zhu, and Gongbo Zhou
Subjects: business.industry, Computer science, Monte Carlo method, Pattern recognition, symbols.namesake, Hardware and Architecture, Gaussian noise, Likelihood-ratio test, Signal Processing, symbols, Fusion rules, Artificial intelligence, False alarm, business, Wireless sensor network, Algorithm, Software, Fusion center, Information Systems, Rayleigh fading
Abstract: For Wireless Sensor Networks (WSNs) with a small number of sensors and very low SNR, distributed detection and decision fusion rules based on multi-bit knowledge of local sensors are proposed. At local sensors, observations are quantized to multi-bit local decisions. Three quantization algorithms are investigated, which are based on weight, statistics and redundancy, respectively. Corresponding suboptimal fusion rules at the fusion center are also discussed by approximating the optimal likelihood ratio test. System level detection performance measures, namely probabilities of detection and false alarm, are derived analytically by employing probability theory. Finally, Monte Carlo methods are employed to study the performance of the proposed decision fusion rules under conditions such as a Rayleigh fading channel and Gaussian noise. Numerical results show that, under a non-ideal channel, commonly used schemes based on weight cannot improve the system performance even with a large number of sensors and high SNR. By contrast, schemes based on statistics and redundancy can enhance the system capability when nodes are few and SNR is low. Furthermore, schemes based on statistics have the best stability among the three schemes, and schemes based on redundancy have the best performance among the three when the quantization degree is high.
Published: 2011

22. Discriminative Key Pose Extraction Using Extended LC-KSVD for Action Recognition

Author: Philip Ogunbona, Hanling Zhang, Duc Thanh Nguyen, Lijuan Zhou, Yuyao Zhang, and Wanqing Li
Subjects: business.industry, Computer science, Feature extraction, Pattern recognition, Svm classifier, ComputingMethodologies_PATTERNRECOGNITION, Discriminative model, Key (cryptography), Action recognition, Pyramid (image processing), Artificial intelligence, business, Max pooling, Gesture
Abstract: This paper presents a method for extracting discriminative key poses for skeleton-based action recognition. Poses are represented by normalized joint locations, velocities and accelerations of skeleton joints. An extended label consistent K-SVD (ELC-KSVD) algorithm is proposed for learning the common and action-specific dictionaries. Discriminative key poses are represented by the atoms of the action-specific dictionaries. With the specific dictionaries, sparse codes are obtained for representing action instances through max pooling and temporal pyramid. An SVM classifier is trained for action recognition. The proposed method was evaluated on the MSRC-12 gesture and MSR-Action 3D datasets. Experimental results have shown that the proposed method is effective in extracting discriminative key poses.
Published: 2014

23. Research of the FP-Growth Algorithm Based on Cloud Environments

Author: Lijuan Zhou and Xiang Wang
Subjects: Flexibility (engineering), Computer science, business.industry, Cloud computing, Linked list, computer.software_genre, Human-Computer Interaction, Data set, Artificial Intelligence, Scalability, Programming paradigm, Data mining, business, Algorithm, computer, Software, FSA-Red Algorithm
Abstract: The emergence of cloud computing addresses the problems that traditional data mining algorithms encounter when dealing with large data. This paper studies the FP-Growth algorithm and proposes a parallel linked list-based FPG algorithm, named PLFPG, built on the MapReduce programming model. It then describes the main idea of the algorithm. Finally, tests on different data sets show that the PLFPG algorithm has higher efficiency and better flexibility and scalability.
Published: 2014

24. Improved Data Mining Algorithms Based on an Early Warning System of College Students

Author: Shuang Li, Yuyan Chen, and Lijuan Zhou
Subjects: Human-Computer Interaction, Artificial neural network, Warning system, Association rule learning, Artificial Intelligence, Computer science, Early warning system, Data mining, computer.software_genre, computer, Software, Data mining algorithm
Abstract: To solve the problem of early warning of college students’ achievement, this paper proposes two improved algorithms for data pre-processing and for mining warning factors. First, we put forward an improved K-Means algorithm which not only preserves the accuracy of the original algorithm, but also improves its stability. Then we put forward an improved Apriori algorithm, New_Apriori, and analyze the experimental results. The results show that the amount of data access is reduced significantly and efficiency is improved. Finally, we build an early warning model of students’ achievement based on a neural network. The experimental results show that the new algorithms improve the efficiency and accuracy of the early warning.
Published: 2013

25. Integrated positioning for coal mining machinery in enclosed underground mine based on SINS/WSN

Author: Lijuan Zhou, Jing Hui, Wenxu Yan, Zhenzhong Yu, Qigao Fan, Wei Li, and Lei Wu
Subjects: Article Subject, business.industry, Computer science, lcsh:T, Real-time computing, lcsh:R, Coal mining, lcsh:Medicine, General Medicine, Models, Theoretical, lcsh:Technology, Coal Mining, General Biochemistry, Genetics and Molecular Biology, Working condition, Global Positioning System, Dynamic positioning, lcsh:Q, business, lcsh:Science, Simulation, General Environmental Science, Research Article
Abstract: To realize dynamic positioning of the shearer, a new method based on SINS/WSN is studied in this paper. First, a movement model of the shearer is built and the regularity of its running along the coal mining face is characterized. Second, because external calibration of SINS using GPS is infeasible in an enclosed underground mine, a WSN positioning strategy is proposed to eliminate the accumulated error produced by SINS, and the corresponding coupling model is established. Finally, positioning performance is analyzed by simulation and experiment. Results show that the attitude angle and position of the shearer can be tracked in real time by the integrated positioning strategy based on SINS/WSN, and that the positioning precision meets the demands of actual working conditions.
Published: 2013

26. Studies on a Hybrid Way of Rules and Statistics for Chinese Conjunction Usages Recognition

Author: Lijuan Zhou and Hongying Zan
Subjects: Measure (data warehouse), Combining rules, Computer science, business.industry, Statistics, Artificial intelligence, business, Machine learning, computer.software_genre, Participle, computer, Test (assessment), Conjunction (grammar)
Abstract: Conjunctions are a kind of function word. Different conjunctions may have different usages, and the same conjunction may have different usages in different contexts. Studies on conjunction usage recognition are helpful for the automatic understanding of modern Chinese texts. This paper adopts a hybrid approach of rules and statistics to identify conjunction usages. Experimental results show that methods combining rules and statistics are helpful for the automatic recognition of conjunction usages. Among them, the F-measure on the participle and part-of-speech tagging corpus of the April, May, and June 2000 People’s Daily reaches 91.42%, 90.88%, and 90.92%, respectively, in the open test.
Published: 2013

27. Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment

Author: Wenbo Wang, Lijuan Zhou, and Hui Wang
Subjects: Analysis of parallel algorithms, Cost efficiency, business.industry, Computer science, Process (computing), Cloud computing, Machine learning, computer.software_genre, Task (project management), Naive Bayes classifier, Statistical classification, ComputingMethodologies_PATTERNRECOGNITION, Artificial intelligence, Data mining, business, Scale (map), computer
Abstract: As an important task of data mining, classification has received considerable attention in many applications, such as information retrieval and web searching. The growing volumes of information produced by the progress of technology, together with the growing individual needs of data mining, make classifying very large-scale data a challenging task. To deal with this problem, many researchers try to design efficient parallel classification algorithms. This paper briefly introduces classification algorithms and cloud computing, analyzes on that basis the weaknesses of present parallel classification algorithms, and then proposes a new model of parallel classification algorithms. It mainly introduces a parallel Naive Bayes classification algorithm based on MapReduce, a simple yet powerful parallel programming technique. The experimental results demonstrate that the proposed algorithm improves on the performance of the original algorithm and can process large datasets efficiently on commodity hardware.
Published: 2012
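
Illustrative note on record 27: the abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of how Naive Bayes training can be phrased as map and reduce steps. The function names (`map_record`, `reduce_counts`, `train`, `classify`), the add-one smoothing, and the toy records are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from math import log

def map_record(record):
    """Map step: emit count keys for one (features, label) training record."""
    features, label = record
    yield ("class", label), 1
    for index, value in enumerate(features):
        yield ("feature", label, index, value), 1

def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return totals

def train(records):
    """Run the map step over every record, then reduce (here in one process)."""
    pairs = (pair for record in records for pair in map_record(record))
    return reduce_counts(pairs)

def classify(counts, features, labels, values_per_feature):
    """Pick the label with the highest log-probability.

    Assumes every label in `labels` was seen in training and that each
    feature takes `values_per_feature` distinct values (add-one smoothing).
    """
    total = sum(counts[("class", label)] for label in labels)
    best_label, best_score = None, float("-inf")
    for label in labels:
        class_count = counts[("class", label)]
        score = log(class_count / total)
        for index, value in enumerate(features):
            hits = counts[("feature", label, index, value)]
            score += log((hits + 1) / (class_count + values_per_feature))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (outlook, temperature) -> play?
RECORDS = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("overcast", "hot"), "yes"),
]
```

Because the reduce step only sums counts per key, the map outputs of separate data partitions can be combined in any order, which is what makes this formulation easy to distribute.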

28. An Improved Approach for Materialized View Selection Based on Genetic Algorithm

Author: Lijuan Zhou, Xiaoxu He, and Kang Li
Subjects: Mathematical optimization, General Computer Science, Computer science, Population-based incremental learning, Crossover, Materialized view, computer.software_genre, Data warehouse, Convergence (routing), Genetic algorithm, Data mining, Genetic representation, computer, Selection (genetic algorithm)
Abstract: This paper presents an improved genetic algorithm to solve the materialized view selection problem under query cost constraints. The algorithm dynamically changes the crossover probability and mutation probability during the genetic process. In this way, it not only maintains the population diversity but also ensures the convergence of the genetic algorithm. It thus effectively improves the optimization ability of the genetic algorithm and avoids the "evolutionary stagnation" problem. Meanwhile, the improved genetic algorithm adds processing of invalid solutions to avoid the "evolutionary stagnation" caused by invalid cycles, so that the efficiency of materialized view selection is greatly improved.
Published: 2012

29. Study and Application of an Improved Clustering Algorithm

Author: Yuyan Chen, Shuang Li, and Lijuan Zhou
Subjects: k-medoids, Computer science, Population-based incremental learning, computer.software_genre, Human-Computer Interaction, Data stream clustering, Artificial Intelligence, CURE data clustering algorithm, Canopy clustering algorithm, Data mining, Cluster analysis, computer, Software, k-medians clustering, FSA-Red Algorithm
Abstract: Drawing on the characteristics of early warning for students' grades, this paper presents an optimized algorithm that addresses the instability caused by random selection of the initial cluster centers, which strongly influences the results. The algorithm has been integrated into the open-source WEKA platform. The optimized algorithm not only preserves the accuracy of the original algorithm but also improves its stability.
Published: 2012

30. Efficient Mining Algorithms of Finding Frequent Datasets

Author: Zhang Zhang and Lijuan Zhou
Subjects: Structure (mathematical logic), Computer science, Relational database, InformationSystems_DATABASEMANAGEMENT, Subset and superset, computer.software_genre, Human-Computer Interaction, ComputingMethodologies_PATTERNRECOGNITION, Artificial Intelligence, ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION, Data mining, computer, Algorithm, Software
Abstract: This work proposes an efficient mining algorithm to find maximal frequent itemsets in a relational database. It adapts to large datasets. Itemsets are stored in a list with a special structure. Two main lists, called the itemset list and the frequent itemset list, are created by scanning the database once, dividing maximal itemsets into two categories depending on whether they reach the minimum support number. Sub-itemsets whose superset is in the itemset list are generated by recursion so that each sub-itemset appears before its superset. As the current sub-itemset is joined to the frequent itemset list, its own sub-itemsets are pruned from the itemset list. Finally, all sub-itemsets whose nearest superset is in the frequent itemset list are pruned from the frequent itemset list, so that it holds all maximal frequent itemsets. We compare our algorithm with FP-Growth in two sets of timing experiments, which show the superiority of our algorithm both with increasing dataset size and with changing minimum support.
Published: 2012

31. EMIR: a novel music retrieval system for mobile devices incorporating analysis of user emotion

Author: Hongfei Lin, Cathal Gurrin, and Lijuan Zhou
Subjects: Music Information Retrieval, Emotion Detection, Machine Learning, Emotion Space, Information retrieval, Computer science, InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI), Speech recognition, Emotion detection, Similarity (psychology), InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Music information retrieval, Space (commercial competition), Lyrics, Mobile device, Text retrieval
Abstract: We present an Emotional Music Information Retrieval system for mobile devices that utilizes a machine learning approach to detect latent emotion from within both user queries (non-descriptive queries) and the lyrics of songs and uses both elements to develop an effective Music Information Retrieval system. Emotion is extracted from the songs and queries and mapped into a high-dimensional emotion space, which allows for the employment of conventional text retrieval techniques to calculate the similarity between a user query and the latent emotion in song lyrics, thereby producing a ranked list of songs for playback.
Published: 2012

32. The data processing based on factor analysis

Author: Lijuan Zhou, Ning-ning Chen, Yuan Zhen, and Hua Wang
Subjects: Data processing, Assessment data, Covariance matrix, Computer science, Factor (programming language), education, Principal component analysis, Data mining, computer.software_genre, computer, computer.programming_language
Abstract: Starting from teacher assessment data, this paper analyzes whether the evaluation indicators are suitable for factor analysis, then finds the two factors that play a decisive role among the indicators, which are interactive and correlated, so as to simplify the calculation. Finally, it obtains the teachers' rankings by factor analysis.
Published: 2011

33. A Performance Measurement System Based on BSC

Author: Lijuan Zhou and Yan Peng
Subjects: Process management, Balanced scorecard, Performance management, Shareholder, Computer science, business.industry, Business process, Bayesian network, Performance measurement, Human resources, business, Organizational performance
Abstract: Balanced scorecard (BSC) provides an integrated view of overall organizational performance and strategic objectives. Using financial and non-financial measures, the BSC approach appraises four dimensions of firm performance: customers, financial (or shareholders), learning and growth, and internal business processes. This research first summarizes the evaluation indexes synthesized from the literature relating to HR (Human Resource) performance measurement. Then, indexes fit for performance evaluation can be selected from the system easily based on the API indicator. Finally, a BSC map is created to explore a new kind of performance management model.
Published: 2011

34. Studies on the Automatic Recognition of Modern Chinese Conjunction Usages

Author: Hongying Zan, Kunli Zhang, and Lijuan Zhou
Subjects: Subjectivity, Computer science, business.industry, Rule-based system, Artificial intelligence, computer.software_genre, business, computer, Natural language processing, Connection (mathematics), Conjunction (grammar)
Abstract: Conjunctions can connect words, sentences, and even paragraphs. They have special connection functions, and their usages are complex and diverse. At present, studies on conjunctions are mostly human-oriented. These descriptions cannot avoid such limitations as subjectivity and illegibility, and are not easy to apply directly to natural language processing (NLP). This paper studies the automatic recognition of conjunction usages in the context of NLP. It designs a rule-based method and several statistical methods for conjunction usage recognition. The results are compared and analyzed, and it turns out that the rule-based method and the statistical methods each have advantages and disadvantages.
Published: 2011

35. An Improved Algorithm for Materialized View Selection

Author: Lijuan Zhou, Ming Sheng Xu, and Haijun Geng
Subjects: Set (abstract data type), General Computer Science, Computer science, Online analytical processing, Genetic algorithm, Materialized view, Key (cryptography), Dimensional modeling, Data mining, computer.software_genre, computer, Selection (genetic algorithm), Data warehouse
Abstract: A data warehouse is a subject-oriented, integrated, nonvolatile, and time-varying collection of data used to support management decision-making. A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. Materializing all views is not possible because of the space constraint and the maintenance cost constraint. Selecting a suitable set of views that minimizes the total cost associated with the materialized views is the key objective of data warehousing. In this paper, first, the query cost view selection problem model is proposed. Second, the methods for selecting materialized views are presented. The genetic algorithm is applied to the materialized view selection problem. But as the genetic process develops, producing legal solutions becomes more and more difficult. Therefore, an improved algorithm is presented in this paper. Finally, experimental simulation is adopted to test the function and efficiency of our algorithms. The experiments show that the given methods can provide near-optimal solutions in limited time and work well in practical cases. Randomized algorithms will become invaluable tools for data warehouse evolution.
Published: 2011

36. Research on algorithm of association rules in Distributed Database System

Author: Mingsheng Xu, Shuang Li, and Lijuan Zhou
Subjects: Apriori algorithm, Distributed database, Association rule learning, Computer science, Node (networking), InformationSystems_DATABASEMANAGEMENT, Crunode, computer.software_genre, ComputingMethodologies_PATTERNRECOGNITION, Key (cryptography), Algorithm design, Data mining, Algorithm, computer, FSA-Red Algorithm
Abstract: This dissertation proposes a new algorithm for distributed mining of association rules using an improved Apriori algorithm, based on an analysis and introduction of the basic concepts and algorithms of association rule mining, both in general and in distributed databases. The improved Apriori algorithm directly produces all local frequent itemsets at each crunode, rather than iteratively selecting candidate itemsets. All the local multifarious itemsets are then gathered and broadcast to the general node, which produces the global frequent itemsets of the association rules. In the process, the data is no longer saved with the affair ID as the key; the item ID is taken as the new key. The performance of the improved Apriori algorithm is improved by cutting down the storage space. When the general node gathers all local frequent itemsets to select the global frequent itemsets, it probably needs only one broadcast, and three broadcasts in the worst case. This raises the efficiency of the new association rule algorithm in distributed database systems.
Published: 2010
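
Illustrative note on record 36: the abstract says the data is keyed by item ID rather than by "affair ID", which is read here (an assumption) as the transaction ID. The sketch below shows that re-keying and how per-node supports can be summed at a central node. All function names and the toy data are hypothetical, and brute-force enumeration of small itemsets stands in for the improved Apriori procedure, which the abstract does not specify.

```python
from itertools import combinations

def to_item_index(transactions):
    """Re-key one node's data: item ID -> set of transaction IDs containing it."""
    index = {}
    for tid, items in transactions.items():
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index

def local_support(index, itemset):
    """Count a node's transactions containing every item, by set intersection."""
    tid_sets = [index.get(item, set()) for item in itemset]
    return len(set.intersection(*tid_sets)) if tid_sets else 0

def global_frequent(node_indexes, min_support, max_size=2):
    """Sum local supports across nodes and keep itemsets meeting min_support.

    Enumerates every itemset up to max_size; a real miner would prune
    candidates instead of enumerating them all.
    """
    items = sorted({item for index in node_indexes for item in index})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            support = sum(local_support(index, itemset) for index in node_indexes)
            if support >= min_support:
                result[itemset] = support
    return result

# Hypothetical data held by two nodes, keyed by transaction ID.
NODE_A = to_item_index({1: {"bread", "milk"}, 2: {"bread"}})
NODE_B = to_item_index({3: {"bread", "milk"}, 4: {"milk", "eggs"}})
```

With the item-keyed layout, the support of an itemset at a node is just the size of an intersection of transaction-ID sets, so no rescan of the node's transactions is needed.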

37. Massive data mining based on item sequence set grid space

Author: Mingsheng Xu, Zhang Zhang, and Lijuan Zhou
Subjects: Set (abstract data type), Apriori algorithm, Association rule learning, Computer science, Relational database, Data stream mining, Data mining, Linked list, computer.software_genre, Grid, Data structure, computer
Abstract: According to the storage mode of massive data in relational databases, this paper proposes a fast mining algorithm to find maximum frequent item sets based on item sequence set grid space. Traditional methods for mining association rules generate frequent item sets from small to large. These approaches are either time consuming or computationally expensive, and often generate a large number of redundant candidates or frequent item sets, which is fatal for mining speed as data grows to massive scale. The goal of this paper is first to use a self-defined linked-list structure to store item sequences and then to find the frequent item sets from large to small. Several applications of association rule mining using item sequence set grid space have performed well but have proved inefficient in massive data mining; the problem lies in the time spent finding sub item sets. Experimental results are presented to show that the fast mining algorithm ISSDL-DM proposed in this paper uses much less time than the similar existing algorithm ISS-DM to achieve the same outcomes.
Published: 2010

38. A clustering-Based KNN improved algorithm CLKNN for text classification

Author: Lijuan Zhou, Qian Shi, Lin-shuang Wang, and Xue-bin Ge
Subjects: Training set, Computer science, business.industry, Document classification, Pattern recognition, Boundary testing, computer.software_genre, Knn classifier, k-nearest neighbors algorithm, Statistical classification, ComputingMethodologies_PATTERNRECOGNITION, Algorithm design, Artificial intelligence, Data mining, business, Cluster analysis, computer
Abstract: As a simple, effective, and nonparametric classification method, k Nearest Neighbor (KNN) is widely used in document classification to deal with harder problems such as large-scale data or many categories. But the KNN classifier may lose classification precision when the density of the training data is uneven. To solve this problem, a new clustering-based KNN method is presented in this paper. It preprocesses the training data by clustering, then classifies with a new KNN algorithm that dynamically adjusts the neighborhood number K in each iteration. This method avoids the uneven classification phenomenon and reduces the misjudgment of boundary testing samples. An experiment in text classification shows that it has good performance.
Published: 2010

39. The minimum incremental maintenance of materialized views in data warehouse

Author: Haijun Geng, Lijuan Zhou, and Qian Shi
Subjects: Consistency (database systems), Information engineering, Automatic control, Distributed database, Database, Computer science, Online analytical processing, Materialized view, computer.software_genre, Maintenance engineering, computer, Data warehouse
Abstract: A large number of materialized views are stored in a data warehouse so that users can quickly get search results for OLAP analysis. But when the remote base data sources change, the materialized views in the data warehouse must be updated correspondingly to stay consistent with the base relations, which raises the issue of materialized view maintenance. There are two methods for materialized view maintenance. One is to re-compute the views, which can lead to very large storage and maintenance costs and is sometimes unachievable due to storage limitations. So the incremental maintenance technique has been preferred in recent years. Its principle is that the data source reports its changes to the integrator, which then calculates the corresponding changes and informs the database of the results. The incremental maintenance technique is adopted in this paper. The amount of incremental data for the same view differs depending on the method adopted, which results in different maintenance costs. The idea and strategy of minimum incremental maintenance are presented. The materialized view definitions and maintenance expressions, as well as algorithms, are given. The experiment shows that the maintenance cost of materialized views is decreased and data warehouse processing efficiency is improved.
Published: 2010
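
Illustrative note on record 39: the contrast between recomputing a view and maintaining it incrementally can be sketched for the simplest case, a per-key SUM view. This is a generic illustration under stated assumptions, not the paper's minimum incremental maintenance method: the view shape, the function names `build_view` and `apply_delta`, and the sample rows are all hypothetical, and every deleted row is assumed to exist in the base data.

```python
def build_view(rows):
    """Full recomputation: per key, [sum of amounts, number of rows]."""
    view = {}
    for key, amount in rows:
        entry = view.setdefault(key, [0, 0])
        entry[0] += amount
        entry[1] += 1
    return view

def apply_delta(view, inserted=(), deleted=()):
    """Incremental maintenance: fold only the reported changes into the view."""
    for key, amount in inserted:
        entry = view.setdefault(key, [0, 0])
        entry[0] += amount
        entry[1] += 1
    for key, amount in deleted:
        entry = view[key]  # assumes the deleted row was part of the base data
        entry[0] -= amount
        entry[1] -= 1
        if entry[1] == 0:
            # No base rows remain for this key, so the group leaves the view.
            del view[key]
    return view

# Hypothetical base data: (region, amount) rows.
BASE_ROWS = [("north", 10), ("south", 5), ("north", 7)]
```

The row count is stored next to the sum so that a group is dropped only when its last base row is deleted, not merely when its sum happens to reach zero. The cost of `apply_delta` depends on the size of the reported change rather than on the size of the base data.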

40. A Brief Review of Machine Learning and Its Application

Author: Hua Wang, Cuiqin Ma, and Lijuan Zhou
Subjects: Structure (mathematical logic), Artificial neural network, Computer science, business.industry, Analogy, Rote learning, Machine learning, computer.software_genre, Variety (cybernetics), Knowledge extraction, Algorithm design, Learning based, Artificial intelligence, business, computer
Abstract: With the popularization of information and the establishment of databases in great number, how to extract useful information from the data is an urgent problem to be solved. Machine learning is the core issue of artificial intelligence research. This paper introduces the definition of machine learning and its basic structure, and describes a variety of machine learning methods, including rote learning, inductive learning, analogy learning, explained learning, learning based on neural networks, and knowledge discovery. This paper also puts forward the objectives of machine learning and points out its development trend. Keywords: machine learning; intelligence; methods; application
Published: 2009

41. Classification data mining method based on dynamic RBF neural networks

Author: Lijuan Zhou, Zhang Zhang, Luping Duan, and Min Xu
Subjects: Artificial neural network, business.industry, Computer science, Machine learning, computer.software_genre, Data warehouse, Bottleneck, ComputingMethodologies_PATTERNRECOGNITION, Rate of convergence, Robustness (computer science), Search algorithm, Incremental learning, Radial basis function, Artificial intelligence, Data mining, business, computer
Abstract: With the wide application of databases and the rapid development of the Internet, the capacity to use information technology to produce and collect data has improved greatly. Mining useful information or knowledge from large databases or data warehouses has become an urgent problem, and data mining technology has developed rapidly to meet this need. But DM (data mining) often faces large amounts of data that are noisy, disordered, and nonlinear. Fortunately, ANN (Artificial Neural Network) is suitable for solving these problems because it has such merits as good robustness, adaptability, parallel processing, distributed memory, and high error tolerance. Based on an analysis of the various data mining technologies, this paper gives a detailed discussion of the application of ANN methods in DM, with particular emphasis on classification data mining based on RBF neural networks. Pattern classification is an important part of RBF neural network applications. In an on-line environment the training dataset is variable, so a batch learning algorithm (e.g. OLS), which generates plenty of unnecessary retraining, has lower efficiency. This paper derives an incremental learning algorithm (ILA) from the gradient descent algorithm to relieve this bottleneck. ILA adaptively adjusts the parameters of RBF networks by minimizing the error cost, without any redundant retraining. Using the proposed method, an on-line classification system was constructed to solve the IRIS classification problem. Experimental results show that the algorithm has a fast convergence rate and excellent on-line classification performance.
Published: 2009

42. Design of data warehouse in teaching state based on OLAP and data mining

Author: Shuang Li, Lijuan Zhou, and Minhua Wu
Subjects: Decision support system, Association rule learning, Computer science, business.industry, Online analytical processing, Data management, Information technology, computer.software_genre, Data structure, Data science, Data warehouse, Data modeling, Data extraction, Data quality, Data mining, business, computer
Abstract: Data warehouse and data mining technology is a hot topic in information technology research. At present, these technologies are widely applied in commerce, finance, enterprise production, and marketing, but relatively little in education. Over the years, colleges and universities have accumulated large amounts of teaching and management data, yet the data cannot be used effectively. In light of the social needs of university development and the current status of data management, it is particularly important to establish a data warehouse of university teaching state, make better use of existing data, and on that basis carry out higher-level processing, namely data mining. Starting from decision-making needs, this paper designs the data warehouse structure of university teaching state, creates a data warehouse model through structural design and data extraction, loading, and conversion, and finally applies an association rule mining algorithm to obtain effective results that are applied in practice. The data analysis and mining yield a lot of valuable information that can be used to guide teaching management, thereby improving the quality of teaching, promoting teaching devotion in universities, and enhancing teaching infrastructure. It can also provide detailed, multi-dimensional information for university assessment and higher education research.
Published: 2009

43. Research of Data Mining Approach Based on Radial Basis Function Neural Networks

Author: Luping Duan, Minhua Wu, Ming Sheng Xu, Haijun Geng, and Lijuan Zhou
Subjects: Statistical classification, Radial basis function network, Artificial neural network, Computer science, Population-based incremental learning, Process (computing), Algorithm design, Radial basis function, Data mining, computer.software_genre, computer, Selection (genetic algorithm)
Abstract: This paper studies classification in data mining based on radial basis function neural networks. After intensive analysis, the training algorithm of radial basis function neural networks is improved in optimum structure, learning speed, and approximation accuracy. For learning speed, a two-stage learning strategy is used to accelerate the learning process. For approximation accuracy, an error-correction algorithm is presented to improve the output accuracy of the radial basis function. For optimum structure, the paper focuses on the number and center selection of the hidden layer units and proposes an adaptive center selection algorithm that combines dynamic and static methods. Finally, the algorithms are tested experimentally and compared. The experimental results show that the performance of the algorithm is significantly improved and confirm the validity of the improved algorithm.
Published: 2009

44. Research on Materialized Views Technology in Data Warehouse

Author: Min Xu, Lijuan Zhou, Zhongxiao Hao, and Qian Shi
Subjects: Decision support system, Database, Distributed database, Computer science, Node (networking), Materialized view, InformationSystems_DATABASEMANAGEMENT, Dimensional modeling, computer.software_genre, Data warehouse, Set (abstract data type), Algorithm design, Data mining, computer
Abstract: Data warehouse technology emerged from enterprises' need for decision-support information and the fast development of computer technologies. The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views, referred to as materialized views. The design of the data warehouse is one of the core research problems in the study and evolution of data warehouses. One of the most important decisions in the design of a data warehouse is the selection of views to materialize, which affects the efficiency as well as the total cost of establishing and running a data warehouse. So, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query cost view selection problem (QC_VSP). In this paper, first, we propose a model of the query cost view selection problem. Second, we give three algorithms for QC_VSP, using a view_node_matrix to solve it. Third, experimental simulation is adopted. The results show that our algorithm works better in practical cases. We implemented our algorithms, and a performance study shows that the proposed algorithm delivers an optimal solution. Finally, we discuss the observed behavior of the algorithms. We also identify some important issues for future investigation.
Published: 2008

45. The Model and Realization of Materialized Views Selection in Data Warehouse

Author: Lijuan Zhou, Xuebin Ge, and Minhua Wu
Subjects: Set (abstract data type), Information retrieval, Selection (relational algebra), Database, Distributed database, Computer science, Materialized view, InformationSystems_DATABASEMANAGEMENT, Dimensional modeling, Algorithm design, computer.software_genre, computer, Data warehouse
Abstract: The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views, referred to as materialized views. One of the most important decisions in designing a data warehouse is the selection of the right views to be materialized. So, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query cost view_selection problem (QC_VSP). In this paper, first, we propose the cost model of QC_VSP. Second, we design algorithms for QC_VSP. Third, we use experiments to demonstrate the power of our approach.
Published: 2008

46. Research on parallel algorithm for sequential pattern mining

Author: Lijuan Zhou, Zhongxiao Hao, Yu Wang, and Bai Qin
Subjects: GSP Algorithm, Theoretical computer science, Speedup, Sequence database, Computer science, Test data generation, Distributed algorithm, Search algorithm, Parallel algorithm, Data mining, computer.software_genre, computer, FSA-Red Algorithm
Abstract: Sequential pattern mining is the mining of frequent sequences related to time or other orders from a sequence database. Its initial motivation was to discover the laws of customer purchasing over a period of time by finding the frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and to advanced science fields such as DNA analysis. The data of sequential pattern mining has two characteristics: massive data volume and distributed storage. Most existing sequential pattern mining algorithms have not considered these characteristics together. Based on these characteristics and on parallel theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm abides by the principle of pattern reduction and uses the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets by applying the frequent concept and search space partition theory, and the second task is to build frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and does not generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm has an excellent speedup factor and efficiency.
Published: 2008

47. Selecting materialized views using random algorithm

Author: Zhongxiao Hao, Lijuan Zhou, and Chi Liu
Subjects: Distributed database, Computer science, View, Online analytical processing, Genetic algorithm, Materialized view, Simulated annealing, Data mining, computer.software_genre, computer, Data warehouse, Randomized algorithm
Abstract: The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous distributed databases. The information stored at the data warehouse is in the form of views referred to as materialized views. The selection of the materialized views is one of the most important decisions in designing a data warehouse. Materialized views are stored in the data warehouse for the purpose of efficiently implementing on-line analytical processing queries. The first issue for the user to consider is query response time. So in this paper, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query_cost view_selection problem. First, the cost graph and cost model of the query_cost view_selection problem are presented. Second, methods for selecting materialized views by using random algorithms are presented. The genetic algorithm is applied to the materialized view selection problem. However, as the genetic process proceeds, producing legal solutions becomes more and more difficult, so many solutions are eliminated and the time needed to produce solutions is lengthened. Therefore, an improved algorithm is presented in this paper, which combines the simulated annealing algorithm and the genetic algorithm to solve the query_cost view_selection problem. Finally, experimental simulation is adopted to test the function and efficiency of our algorithms. The experiments show that the given methods can provide near-optimal solutions in limited time and work better in practical cases. Randomized algorithms will become invaluable tools for data warehouse evolution.
Published: 2007
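
Note: the abstract above describes minimizing total view maintenance cost subject to a query response time limit, using randomized search. The sketch below illustrates only the simulated annealing part on a toy model; it is not the authors' combined genetic/annealing method. The additive cost model, the function name `select_views`, and all parameters (`maint_cost`, `time_saved`, `base_time`, `time_limit`, the cooling schedule) are assumptions made for illustration.

```python
import math
import random

def select_views(maint_cost, time_saved, base_time, time_limit,
                 steps=2000, start_temp=10.0, cooling=0.995, seed=0):
    """Pick views to materialize under a toy additive cost model.

    maint_cost[i]: maintenance cost of view i (assumed).
    time_saved[i]: non-negative query time saved by view i (assumed additive).
    Returns (selection as a list of bools, total maintenance cost).
    """
    rng = random.Random(seed)
    n = len(maint_cost)

    def query_time(sel):
        return base_time - sum(t for t, s in zip(time_saved, sel) if s)

    def cost(sel):
        return sum(c for c, s in zip(maint_cost, sel) if s)

    # Start from "materialize everything", the fastest selection in this model.
    current = [True] * n
    if query_time(current) > time_limit:
        raise ValueError("no feasible selection under this toy model")
    best = current[:]
    temp = start_temp
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(n)
        candidate[i] = not candidate[i]  # flip one view in or out
        # Infeasible neighbours (response time too high) are simply rejected.
        if query_time(candidate) <= time_limit:
            delta = cost(candidate) - cost(current)
            # Metropolis rule: always accept improvements, sometimes accept
            # worse moves while the temperature is still high.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current = candidate
                if cost(current) < cost(best):
                    best = current[:]
        temp *= cooling
    return best, cost(best)

selection, total = select_views([5, 3, 8, 2], [4, 1, 6, 2],
                                base_time=20, time_limit=12)
```

Rejecting infeasible neighbours outright keeps every visited state legal, which sidesteps the difficulty with illegal solutions that the abstract reports for the plain genetic algorithm; a penalty term would be an alternative design.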

48. Division of Imaging Intervals and Selection of Optimum Imaging Time for Ship ISAR Imaging Based on Measured Data

Author: Haiping Sun, Lijuan Zhou, and Mengdao Xing
Subjects: Synthetic aperture radar, Computer science, Pulse-Doppler radar, Acoustics, Doppler radar, Side looking airborne radar, Space-based radar, law.invention, Inverse synthetic aperture radar, Continuous-wave radar, law, Computer Science::Computer Vision and Pattern Recognition, Radar imaging, Remote sensing
Abstract: In this paper, an Inverse Synthetic Aperture Radar (ISAR) imaging algorithm for ship targets, based on the division of imaging intervals and the selection of the optimum imaging time, is proposed. The relative motion of a ship on ocean waves can be broken into three components, namely pitch, roll and yaw, which make the Doppler frequency vary with slow time. Over the whole slow-time span, therefore, the echoes are no longer linear frequency modulation (LFM) signals. Under this condition, the division of the imaging intervals and the selection of the optimum imaging time within the considered imaging interval are of great importance for ISAR imaging of ship targets. The imaging results on measured data demonstrate the effectiveness of the proposed approach.
Published: 2006

49. Synthetic Bandwidth Method Integrated with Characteristics of SAR

Author: Mengdao Xing, Haiping Sun, and Lijuan Zhou
Subjects: Synthetic aperture radar, Motion compensation, symbols.namesake, Computer science, Radar imaging, Bandwidth (signal processing), Echo signal, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, symbols, Electronic engineering, High bandwidth, Time domain, Doppler effect
Abstract: Stepped-frequency subpulse signals are widely used to obtain ultra-high range resolution. The stepped-frequency subpulse signals can be combined into a single high-bandwidth signal by using synthetic bandwidth methods. In practical SAR imaging, motion error and the time delay of the echo signal must be considered before applying the available synthetic bandwidth methods. This paper presents a method in which motion compensation and compensation in the Doppler domain are integrated with a time-domain synthetic bandwidth method in order to obtain high-quality SAR images.
Published: 2006

50. P2P Traffic Identification by TCP Flow Analysis

Author: ZhiTong Li, LiJuan Zhou, and Bin Liu
Subjects: Network packet, Computer science, business.industry, ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS, Traffic policing, Network interface, Traffic shaping, business, Traffic generation model, Host (network), Network traffic control, Network traffic simulation, Computer network
Abstract: In this paper, we first propose some new and universal features of all kinds of P2P traffic, derived from packet header information at the transport and network layers, and then present a novel approach for gathering P2P traffic from a single-host perspective, namely by capturing packets from the NIC (network interface card) of one host.
Published: 2006

55 results on '"Lijuan Zhou"'