55 results on '"Lijuan Zhou"'
Search Results
2. Study on the Personalized Learning Model of Learner-Learning Resource Matching
- Author
-
Lijuan Zhou, Shudong Zhang, Min Xu, and Feifei Zhang
- Subjects
Matching (statistics) ,Learning resource ,Computer science ,business.industry ,Personalized learning ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer ,Computer Science Applications ,Education - Abstract
With the development of service integration technology, online learning platforms have gathered a large number of learning resources, causing learners to get lost in a variety of course information and making it difficult to obtain learning resources that match their own needs. Personalized learning offers a direction for solving this problem. However, current personalized learning resource recommendation services face problems such as excessive candidate resources, sparse history and cold starts. In addition, the learning resources provided vary widely in difficulty and quality. To this end, this article studies a personalized learning recommendation model based on learner-learning resource matching. The main content includes three parts: First, build a demand model based on learner registration information, learning behavior and other data. Second, analyze the access behavior of learning resources and assess their quality. Third, calculate the matching degree between learners and learning resources based on the demand model and the quality information of the learning resources, and recommend resources accordingly.
- Published
- 2021
3. Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition
- Author
-
Wanqing Li, Philip Ogunbona, Zhengyou Zhang, and Lijuan Zhou
- Subjects
Computer science ,business.industry ,Probabilistic logic ,02 engineering and technology ,computer.software_genre ,Lexicon ,Visualization ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Frame (artificial intelligence) ,020201 artificial intelligence & image processing ,Artificial intelligence ,Electrical and Electronic Engineering ,Hidden Markov model ,business ,computer ,Natural language processing - Abstract
A novel method for semantic action recognition through learning a pose lexicon is presented in this paper. A pose lexicon comprises a set of semantic poses, a set of visual poses, and a probabilistic mapping between the visual and semantic poses. This paper assumes that both the visual poses and the mapping are hidden and proposes a method to simultaneously learn a visual pose model, which estimates the likelihood of an observed video frame being generated from hidden visual poses, and a pose lexicon model, which establishes the probabilistic mapping between the hidden visual poses and the semantic poses parsed from textual instructions. Specifically, the proposed method consists of two-level hidden Markov models. One level represents the alignment between the visual poses and semantic poses. The other level represents a visual pose sequence, and each visual pose is modeled as a Gaussian mixture. An expectation-maximization algorithm is developed to train a pose lexicon. With the learned lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of video frames that follows a given sequence of semantic poses, constrained by the most likely visual pose and the alignment sequences. The proposed method was evaluated on MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15, Combined-17, and Combined-50 action datasets using cross-subject, cross-dataset, zero-shot, and seen/unseen protocols.
- Published
- 2020
4. Detection of small objects in complex long-distance scenes based on Yolov3
- Author
-
Weidong Feng, Bin Lu, Lijuan Zhou, Shudong Zhang, and Xin Chen
- Subjects
Data set ,Set (abstract data type) ,business.industry ,Computer science ,Feature extraction ,Experimental data ,Computer vision ,Scale (descriptive set theory) ,Artificial intelligence ,Function (mathematics) ,Construct (python library) ,business ,Tower - Abstract
In view of the difficulty in detecting and managing illegal buildings, this article uses high-definition cameras mounted on a tower to regularly capture images to construct the “Suspected Illegal Building Information” dataset. Because the tower is high above the ground, most of the collected images contain many small targets, and there are many types of target objects to be detected with widely varying sizes, resulting in low detection accuracy. To address these problems, this paper makes the following improvements to the Yolov3 algorithm: (1) re-clustering anchor boxes with K-means, (2) introducing CIoU to optimize the loss function, (3) adding a 104×104 scale for feature extraction, and (4) using Soft-NMS instead of the traditional NMS algorithm. In addition, the data set is enhanced and optimized. The experimental results show that, compared with the original Yolov3, the improved algorithm improves the detection accuracy on this experimental data set by 19.1%, which meets the needs of the project.
- Published
- 2021
5. Video captioning based on multi-feature fusion with object awareness
- Author
-
Changyong Niu, Tao Liu, and Lijuan Zhou
- Subjects
Closed captioning ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Representation (systemics) ,Object (computer science) ,Multi feature fusion ,Component (UML) ,Fuse (electrical) ,Computer vision ,Artificial intelligence ,business ,Joint (audio engineering) ,Block (data storage) - Abstract
This paper proposes a novel method that uses three source features for video captioning. It fuses global video features with local object and regional features to model the relationships among objects and their motions, and applies object tags instead of visual features to guide the generation of descriptions. Specifically, multiple features are first extracted by pretrained models and treated as separate inputs alongside video frames. Second, an object awareness attention block is designed to fuse the information from the different features and to learn a joint video representation that has both visual and linguistic semantics. Experiments on the MSVD and MSR-VTT datasets have shown the effectiveness of the proposed method, and the ablation studies have verified the contribution of each component.
- Published
- 2021
6. Data Cache Optimization Model Based on HBase and Redis
- Author
-
Lijuan Zhou, Bin Lu, Laijun Qi, and Shudong Zhang
- Subjects
business.industry ,Computer science ,Distributed computing ,Distributed data store ,Key (cryptography) ,Cloud computing ,Focus (optics) ,business ,Cloud storage ,Storage model ,Field (computer science) ,Image (mathematics) - Abstract
The computing-based cloud storage model realizes secure and reliable storage of massive image data. Nowadays, how to quickly obtain high-quality image data is a focus of research attention. This paper discusses the cloud storage model and proposes a combined data caching strategy based on HBase and Redis. In addition, the model addresses the technical shortcomings of Memcached by setting key indexes, which are stored in HBase, to map each index to a DataNode. The experimental results show that the proposed optimization model improves the ability to rapidly acquire and analyze image data and achieves higher retrieval efficiency for image data.
- Published
- 2020
7. A New Majority Weighted Minority Oversampling Technique for Classification of Imbalanced Datasets
- Author
-
Yixuan Zhao, Chen Tian, Lijuan Zhou, and Shudong Zhang
- Subjects
business.industry ,Computer science ,Pattern recognition ,02 engineering and technology ,Hierarchical clustering ,Data set ,Statistical classification ,ComputingMethodologies_PATTERNRECOGNITION ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Oversampling ,020201 artificial intelligence & image processing ,Artificial intelligence ,Cluster analysis ,business ,Classifier (UML) - Abstract
Classification is one of the essential tasks in data mining. Traditional classification strategies predominantly assume cost-insensitive, balanced data. They tend to concentrate on the overall accuracy of a model, and such classifiers are unsuitable for unbalanced sample data. Hence, optimizing unbalanced samples to improve classifier performance is an issue worthy of discussion. Based on the information-rich minority samples that are difficult to learn, the Majority Weighted Minority Oversampling Technique (MWMOTE) uses a clustering method to generate synthetic samples from the weighted informative samples. However, the accuracy of the clustering should be optimized. To this end, a method called NC_Link_MWMOTE is presented for efficiently handling imbalanced learning problems. We propose a solution that uses an NC_Link-based hierarchical clustering method to synthesize different samples from a small number of samples, thus optimizing the clustering effect. NC_Link_MWMOTE was evaluated on six data sets with different levels of balance. The simulation results show that our method is effective and outperforms competitive baseline methods in terms of various assessment metrics, such as F1-score and Area Under Curve (AUC).
- Published
- 2020
8. Statistical Analysis and Automatic Recognition of Grammatical Errors in Teaching Chinese as a Second Language
- Author
-
Lijuan Zhou, Mengjie Zhong, Hongying Zan, and Yingjie Han
- Subjects
Computer science ,business.industry ,First language ,Negative transfer ,computer.software_genre ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Second language ,Statistical analysis ,Artificial intelligence ,Second language learners ,business ,computer ,Word (computer architecture) ,Natural language processing - Abstract
Foreigners make various grammatical errors when learning Chinese due to the negative transfer of their mother tongue, learning strategies, etc. At present, research on grammatical errors mainly focuses on a certain word or a certain kind of error, resulting in a lack of comprehensive understanding. In this paper, a statistical analysis of large-scale data sets of grammatical errors made by second language learners is conducted, including words with grammatical errors and their quantities. The statistical analysis gives a more comprehensive understanding of grammatical errors and has guiding significance for teaching Chinese as a second language (TCSL). Because of the large proportion of grammatical errors involving “的[de](of)”, the usages of “的[de](of)” are integrated into automatic recognition of Chinese grammatical errors. Experimental results show that the performance is improved overall.
- Published
- 2020
9. BERT with Enhanced Layer for Assistant Diagnosis Based on Chinese Obstetric EMRs
- Author
-
Chuang Liu, Kunli Zhang, Xuemin Duan, Lijuan Zhou, Hongying Zan, and Yueshu Zhao
- Subjects
Language representation ,Computer science ,Data mining ,computer.software_genre ,computer ,Encoder - Abstract
This paper proposes a novel method based on the language representation model called BERT (Bidirectional Encoder Representations from Transformers) for obstetric assistant diagnosis on Chinese obstetric EMRs (Electronic Medical Records). To aggregate more information for the final output, an enhanced layer is added to the BERT model. In particular, the enhanced layer in this paper is constructed based on strategy 1 (A strategy) and/or strategy 2 (A-AP strategy). The proposed method is evaluated on two datasets: the Chinese Obstetric EMRs dataset and the Arxiv Academic Paper Dataset (AAPD). The experimental results show that the proposed method based on BERT improves the F1 value by 19.58% and 2.71% over the state-of-the-art methods, and that the proposed method based on BERT and the enhanced layer with strategy 2 improves the F1 value by 0.7% and 0.3% (strategy 1 improves the F1 value by 0.68% and 0.1%) over the method without the enhanced layer, on the Obstetric EMRs dataset and the AAPD dataset respectively.
- Published
- 2019
10. Imbalanced Data Processing Model for Software Defect Prediction
- Author
-
Hua Wang, Shudong Zhang, Lijuan Zhou, and Ran Li
- Subjects
Computer science ,business.industry ,Decision tree ,Software development ,Sampling (statistics) ,020207 software engineering ,Feature selection ,02 engineering and technology ,computer.software_genre ,Computer Science Applications ,ComputingMethodologies_PATTERNRECOGNITION ,C4.5 algorithm ,Software bug ,0202 electrical engineering, electronic engineering, information engineering ,Chi-square test ,020201 artificial intelligence & image processing ,AdaBoost ,Data mining ,Electrical and Electronic Engineering ,business ,Classifier (UML) ,computer - Abstract
In the field of software engineering, software defect prediction is a research hotspot that can help guarantee quality during software development. However, class-imbalanced datasets reduce the overall classification accuracy of software defect prediction, which is a key issue that urgently needs to be solved. To better solve this problem, this paper proposes a model named ASRA, which combines attribute selection, sampling technologies and an ensemble algorithm. The model applies the Chi-square test for attribute selection to remove redundant attributes, and then uses a combined sampling technique, which includes SMOTE over-sampling and under-sampling, to balance the datasets. Afterwards, the ASRA model is built with the AdaBoost ensemble algorithm, using the J48 decision tree as the base classifier. The data used in the experiments comes from UCI datasets. Comparing the precision P, F-measure and AUC values from the experimental results shows that the model improves software defect prediction classification.
- Published
- 2017
11. Semantic action recognition by learning a pose lexicon
- Author
-
Zhengyou Zhang, Lijuan Zhou, Philip Ogunbona, and Wanqing Li
- Subjects
Sequence ,Computer science ,business.industry ,Posterior probability ,020207 software engineering ,02 engineering and technology ,Mixture model ,Lexicon ,computer.software_genre ,Action (philosophy) ,Artificial Intelligence ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Frame (artificial intelligence) ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Set (psychology) ,Hidden Markov model ,business ,computer ,Software ,Natural language processing - Abstract
This paper proposes a semantic representation, a pose lexicon, for action recognition. The lexicon is composed of a set of semantic poses, a set of visual poses and a probabilistic mapping between the visual and semantic poses. Specifically, an action can be represented by a sequence of semantic poses extracted from an associated textual instruction. Visual frames of the action are considered to be generated from a sequence of hidden visual poses. To learn the lexicon, a visual pose model is learned from training samples by a Gaussian mixture model to characterize the likelihood of an observed visual frame being generated by a visual pose. A pose lexicon model is also learned by an extended hidden Markov alignment model to encode the probabilistic mapping between hidden visual poses and semantic pose sequences. With the lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of visual frames that fits a given sequence of semantic poses through the most likely visual pose and alignment sequences. The efficacy of the proposed method was evaluated on MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15 and Combined-17 action datasets using cross-subject, cross-dataset and zero-shot protocols.
- Published
- 2017
12. Software Defect Prediction Based on Ensemble Learning
- Author
-
Hui Liu, Zhong Sun, Xiangyang Huang, Lijuan Zhou, Ran Li, and Shudong Zhang
- Subjects
Computer science ,business.industry ,media_common.quotation_subject ,020207 software engineering ,02 engineering and technology ,Construct (python library) ,Machine learning ,computer.software_genre ,Ensemble learning ,Random forest ,Software ,Software bug ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Quality (business) ,Software system ,Artificial intelligence ,business ,Focus (optics) ,computer ,media_common - Abstract
Software defect prediction is one of the important ways to guarantee the quality of software systems. Combining various machine learning algorithms to predict software defects has become a hot topic in current research. The paper uses the MDP datasets as the experimental research objects and takes ensemble learning as the research focus to construct a software defect prediction model. By experimenting with five different types of ensemble algorithms and analyzing their features and procedures, this paper identifies Random Forest as the best ensemble algorithm through experimental comparison. Then we use the SMOTE over-sampling and Resample methods to improve the quality of the datasets and build a complete new software defect prediction model. The results show that the model can improve defect classification performance effectively.
- Published
- 2019
13. Efficiency Optimization of Capsule Network Model Based on Vector Element
- Author
-
Kai Feng, Hui Li, Lijuan Zhou, Xiangyang Huang, and Shudong Zhang
- Subjects
Computer science ,business.industry ,Deep learning ,Pattern recognition ,02 engineering and technology ,Convolutional neural network ,Field (computer science) ,Vector element ,03 medical and health sciences ,0302 clinical medicine ,Artificial Intelligence ,Image identification ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,030212 general & internal medicine ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Network model - Abstract
Currently, Deep Learning and Convolutional Neural Networks (CNNs) have been widely used in many fields and have generated very high value in these fields, especially in the field of image recognition. But CNNs have some deficiencies in certain image recognition problems. For example, CNNs do not perform well at recognizing objects viewed from different angles or overlapping objects. Also, CNNs are sometimes very sensitive to slight perturbations: modifying one pixel of a recognized image may cause recognition errors. The capsule network (CapsNet) proposed by Geoffrey Hinton can solve these problems of traditional convolutional networks. When CapsNet was first proposed, its model structure was relatively simple, and many aspects could be explored for improvement. This paper optimizes CapsNet in two respects, “optimization of the routing mechanism” and “adding a Dropout operation,” and carries out experiments and result analysis on these optimizations.
- Published
- 2020
14. Improved clustering algorithm with adaptive opposition-based learning
- Author
-
Qianqian Meng and Lijuan Zhou
- Subjects
Clustering high-dimensional data ,DBSCAN ,Fuzzy clustering ,Computer science ,Population-based incremental learning ,Correlation clustering ,02 engineering and technology ,computer.software_genre ,Machine learning ,Biclustering ,CURE data clustering algorithm ,Consensus clustering ,0202 electrical engineering, electronic engineering, information engineering ,Cluster analysis ,k-medians clustering ,FSA-Red Algorithm ,k-medoids ,business.industry ,Constrained clustering ,k-means clustering ,Determining the number of clusters in a data set ,Data stream clustering ,Canopy clustering algorithm ,FLAME clustering ,Affinity propagation ,020201 artificial intelligence & image processing ,Algorithm design ,Data mining ,Artificial intelligence ,business ,computer - Abstract
In recent years, clustering has become a hotspot in the field of data mining, as one of the key technologies for obtaining data distributions and observing the characteristics of classes. However, some clustering algorithms depend on the selection of initial clustering centers, and the clustering results easily fall into local optima. To solve this problem, the paper integrates a differential evolution algorithm with adaptive opposition-based learning. The algorithm uses a reverse factor to guide the search space toward the global optimal solution in each generation. In this paper, the improved algorithm is combined with the classical K-means algorithm. Verification on three UCI data sets demonstrates that the improved clustering algorithm not only clusters better and converges faster, but also effectively suppresses premature convergence.
- Published
- 2017
15. Swarm-Based Spreading Points
- Author
-
Shudong Zhang, LiGuo Huang, Xiangyang Huang, and Lijuan Zhou
- Subjects
021103 operations research ,Computer science ,Minimum distance ,0211 other engineering and technologies ,Process (computing) ,Swarm behaviour ,Particle swarm optimization ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Set (abstract data type) ,Packing problems ,010201 computation theory & mathematics ,Point (geometry) ,Pairwise comparison ,Algorithm - Abstract
In this paper we propose a Swarm-based Spreading Points algorithm (SSP) for improving the solutions for packing problems. The SSP repositions the initial set of points and evolves it to improve the minimum distance between points. During the evolving process, for each point, a feasible direction of movement is computed according to its nearest neighbors so that the shortest pairwise distance between the point and other points can be increased along this direction (if any). Our experiments showed that the SSP algorithm can improve certain best-known solutions for some problems previously reported in the literature.
- Published
- 2017
16. Research and Implementation of Data Mining Algorithms Based on Cloud Computing
- Author
-
Hui Wang, Xiang Wang, and Lijuan Zhou
- Subjects
General Computer Science ,Database ,business.industry ,Computer science ,Data stream mining ,General Mathematics ,Cloud computing ,computer.software_genre ,Data mining algorithm ,Utility computing ,Cloud testing ,Data mining ,business ,computer - Published
- 2013
17. Indexing of Large Data Based on Cloud Computing Platform
- Author
-
Wenbo Wang, Hui Wang, and Lijuan Zhou
- Subjects
Information retrieval ,Computer Networks and Communications ,Hardware and Architecture ,Computer science ,Search engine indexing - Published
- 2013
18. Research on Parallel Classification Algorithms for Large-scale Data
- Author
-
Hui Wang, Wenbo Wang, and Lijuan Zhou
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Improved algorithm ,Cloud computing ,Large scale data ,computer.software_genre ,Naive Bayes classifier ,Statistical classification ,Hardware and Architecture ,Scalability ,Computer data storage ,Programming paradigm ,Data mining ,business ,computer - Abstract
Because of the growing mass of data and the demand for individualized data mining, the traditional centralized data mining method can no longer meet requirements. Cloud computing provides a cheap solution for massive data storage, analysis and handling. To achieve parallel data mining in a cloud environment, this paper proposes an improved algorithm based on traditional Naive Bayes. First, the design ideas of the improved algorithm in the MapReduce programming model are presented. Then the algorithm is tested on actual data. The experimental results show that the new algorithm has higher performance and better scalability.
- Published
- 2012
19. Learning a Pose Lexicon for Semantic Action Recognition
- Author
-
Philip Ogunbona, Lijuan Zhou, and Wanqing Li
- Subjects
FOS: Computer and information sciences ,Sequence ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,020207 software engineering ,02 engineering and technology ,Construct (python library) ,computer.software_genre ,Lexicon ,Translation (geometry) ,Semantics ,Visualization ,Action (philosophy) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Natural language processing ,Gesture - Abstract
This paper presents a novel method for learning a pose lexicon comprising semantic poses defined by textual instructions and their associated visual poses defined by visual features. The proposed method simultaneously takes two input streams, semantic poses and visual pose candidates, and statistically learns a mapping between them to construct the lexicon. With the learned lexicon, action recognition can be cast as the problem of finding the maximum translation probability of a sequence of semantic poses given a stream of visual pose candidates. Experiments evaluating pre-trained and zero-shot action recognition conducted on MSRC-12 gesture and WorkoutSu-10 exercise datasets were used to verify the efficacy of the proposed method. Accepted by the 2016 IEEE International Conference on Multimedia and Expo (ICME 2016). 6-page paper and 4 pages of supplementary material.
- Published
- 2016
20. The Comprehensive Electronic Identity Security System of the Internet
- Author
-
Xiang Zou, Huafeng Kong, Bing Chen, Bo Jin, Lijuan Zhou, and Yuebo Dai
- Subjects
Cloud computing security ,Computer Networks and Communications ,Computer science ,business.industry ,Internet privacy ,Computer security ,computer.software_genre ,Internet Architecture Board ,Identity management ,Security association ,Hardware and Architecture ,Electronic identity ,The Internet ,business ,computer ,Security system - Published
- 2011
21. Decision fusion rules based on multi-bit knowledge of local sensors in wireless sensor networks
- Author
-
Lijuan Zhou, Guangzhu Chen, Zhencai Zhu, and Gongbo Zhou
- Subjects
business.industry ,Computer science ,Monte Carlo method ,Pattern recognition ,symbols.namesake ,Hardware and Architecture ,Gaussian noise ,Likelihood-ratio test ,Signal Processing ,symbols ,Fusion rules ,Artificial intelligence ,False alarm ,business ,Wireless sensor network ,Algorithm ,Software ,Fusion center ,Information Systems ,Rayleigh fading - Abstract
For Wireless Sensor Networks (WSNs) with a small quantity of sensors and very low SNR, distributed detection and decision fusion rules based on multi-bit knowledge of local sensors are proposed. At local sensors, observations are quantized to multi-bit local decisions. Three quantification algorithms are investigated, which are based on weight, statistics and redundancy, respectively. Corresponding suboptimal fusion rules at the fusion center are also discussed by approximating the optimal likelihood ratio test. System level detection performance measures, namely probabilities of detection and false alarm, are derived analytically by employing probability theory. Finally, Monte Carlo methods are employed to study the performance of proposed decision fusion rules with parameters such as Rayleigh fading channel and Gaussian noise. Numerical results show that, under non-ideal channel, commonly used schemes based on weight cannot improve the system performance even with a large number and high SNR. Fortunately, schemes based on statistics and redundancy can enhance the system capability when the node is deficient and SNR is low. Furthermore, schemes based on statistics have the best stability among the three schemes, and schemes based on redundancy have the best performance among the three when quantization degree is high.
- Published
- 2011
22. Discriminative Key Pose Extraction Using Extended LC-KSVD for Action Recognition
- Author
-
Philip Ogunbona, Hanling Zhang, Duc Thanh Nguyen, Lijuan Zhou, Yuyao Zhang, and Wanqing Li
- Subjects
business.industry ,Computer science ,Feature extraction ,Pattern recognition ,Svm classifier ,ComputingMethodologies_PATTERNRECOGNITION ,Discriminative model ,Key (cryptography) ,Action recognition ,Pyramid (image processing) ,Artificial intelligence ,business ,Max pooling ,Gesture - Abstract
This paper presents a method for extracting discriminative key poses for skeleton-based action recognition. Poses are represented by normalized joint locations, velocities and accelerations of skeleton joints. An extended label consistent K-SVD (ELC-KSVD) algorithm is proposed for learning the common and action-specific dictionaries. Discriminative key poses are represented by the atoms of the action-specific dictionaries. With the specific dictionaries, sparse codes are obtained for representing action instances through max pooling and temporal pyramid. A SVM classifier is trained for action recognition. The proposed method was evaluated on the MSRC-12 gesture and MSR-Action 3D datasets. Experimental results have shown that the proposed method is effective in extracting discriminative key poses.
- Published
- 2014
23. Research of the FP-Growth Algorithm Based on Cloud Environments
- Author
-
Lijuan Zhou and Xiang Wang
- Subjects
Flexibility (engineering) ,Computer science ,business.industry ,Cloud computing ,Linked list ,computer.software_genre ,Human-Computer Interaction ,Data set ,Artificial Intelligence ,Scalability ,Programming paradigm ,Data mining ,business ,Algorithm ,computer ,Software ,FSA-Red Algorithm - Abstract
The emergence of cloud computing solves the problems that traditional data mining algorithms encounter when dealing with large data. This paper studies the FP-Growth algorithm and proposes a parallel linked list-based FPG algorithm built on the MapReduce programming model, named the PLFPG algorithm. It then describes the main idea of the algorithm. Finally, tests on different data sets show that the PLFPG algorithm has higher efficiency and better flexibility and scalability.
- Published
- 2014
24. Improved Data Mining Algorithms Based on an Early Warning System of College Students
- Author
-
Shuang Li, Yuyan Chen, and Lijuan Zhou
- Subjects
Human-Computer Interaction ,Artificial neural network ,Warning system ,Association rule learning ,Artificial Intelligence ,Computer science ,Early warning system ,Data mining ,computer.software_genre ,computer ,Software ,Data mining algorithm - Abstract
In order to solve the problem of early warning of college students’ achievement, this paper proposes two improved algorithms for data pre-processing and for mining warning factors. First, we put forward an improved K-Means algorithm that not only preserves the accuracy of the original algorithm but also improves its stability. Then we put forward an improved Apriori algorithm, New_Apriori, and analyze the experimental results. The results show that the amount of data access is reduced significantly and efficiency is improved. Finally, we build an early warning model of students’ achievement based on a neural network. The experimental results show that the new algorithms improve the efficiency and accuracy of the early warning.
- Published
- 2013
25. Integrated positioning for coal mining machinery in enclosed underground mine based on SINS/WSN
- Author
-
Lijuan Zhou, Jing Hui, Wenxu Yan, Zhenzhong Yu, Qigao Fan, Wei Li, and Lei Wu
- Subjects
Article Subject ,business.industry ,Computer science ,lcsh:T ,Real-time computing ,lcsh:R ,Coal mining ,lcsh:Medicine ,General Medicine ,Models, Theoretical ,lcsh:Technology ,Coal Mining ,General Biochemistry, Genetics and Molecular Biology ,Working condition ,Global Positioning System ,Dynamic positioning ,lcsh:Q ,business ,lcsh:Science ,Simulation ,General Environmental Science ,Research Article - Abstract
To realize dynamic positioning of the shearer, a new method based on SINS/WSN is studied in this paper. First, the shearer movement model is built and the running regularity of the shearer at the coal mining face is established. Second, as external calibration of SINS using GPS is infeasible in an enclosed underground mine, a WSN positioning strategy is proposed to eliminate the accumulated error produced by SINS; the corresponding coupling model is then established. Finally, positioning performance is analyzed by simulation and experiment. Results show that the attitude angle and position of the shearer can be tracked in real time by the integrated positioning strategy based on SINS/WSN, and that the positioning precision meets the demands of actual working conditions.
- Published
- 2013
26. Studies on a Hybrid Way of Rules and Statistics for Chinese Conjunction Usages Recognition
- Author
-
Lijuan Zhou and Hongying Zan
- Subjects
Measure (data warehouse) ,Combining rules ,Computer science ,business.industry ,Statistics ,Artificial intelligence ,business ,Machine learning ,computer.software_genre ,Participle ,computer ,Test (assessment) ,Conjunction (grammar) - Abstract
Conjunctions are a kind of function word. Different conjunctions may have different usages, and the same conjunction may have different usages in different contexts. Studies on conjunction usage recognition are helpful for automatic understanding of modern Chinese texts. This paper adopts a hybrid of rules and statistics to identify conjunction usages. Experimental results show that methods combining rules and statistics are helpful for automatic recognition of conjunction usages. Among them, the F measure on the participle and part-of-speech tagging corpus of the April, May and June 2000 People’s Daily reaches 91.42%, 90.88% and 90.92% respectively in the open test.
- Published
- 2013
27. Parallel Implementation of Classification Algorithms Based on Cloud Computing Environment
- Author
-
Wenbo Wang, Lijuan Zhou, and Hui Wang
- Subjects
Analysis of parallel algorithms ,Cost efficiency ,business.industry ,Computer science ,Process (computing) ,Cloud computing ,Machine learning ,computer.software_genre ,Task (project management) ,Naive Bayes classifier ,Statistical classification ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial intelligence ,Data mining ,business ,Scale (map) ,computer - Abstract
As an important task of data mining, classification has received considerable attention in many applications, such as information retrieval and web searching. The growing volumes of information produced by the progress of technology, together with growing individual needs for data mining, make the classification of very large-scale data a challenging task. To deal with this problem, many researchers try to design efficient parallel classification algorithms. This paper briefly introduces classification algorithms and cloud computing, analyzes on that basis the shortcomings of present parallel classification algorithms, and then proposes a new model of parallel classification algorithms. It mainly introduces a parallel Naive Bayes classification algorithm based on MapReduce, which is a simple yet powerful parallel programming technique. The experimental results demonstrate that the proposed algorithm improves on the performance of the original algorithm and can process large datasets efficiently on commodity hardware.
- Published
- 2012
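As a minimal sketch of the idea in entry 27, Naive Bayes training reduces to counting, which fits the map and reduce steps of MapReduce. The code below simulates both steps in a single process; the function names, the key layout, and the add-one smoothing are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict

def map_record(label, features):
    """Map step: emit a count of 1 for the class and for each (class, position, value)."""
    yield ("class", label), 1
    for i, value in enumerate(features):
        yield ("feat", label, i, value), 1

def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, n in pairs:
        totals[key] += n
    return totals

def train(records):
    """Run the map step over every record, then reduce all emitted pairs."""
    pairs = (kv for label, feats in records for kv in map_record(label, feats))
    return reduce_counts(pairs)

def predict(counts, features, labels, n_values=2):
    """Return the label with the highest log-posterior (add-one smoothing, assumed).

    n_values is the assumed number of distinct values each feature can take.
    Every label in `labels` must occur at least once in the training data.
    """
    total = sum(counts[("class", c)] for c in labels)
    best, best_score = None, -math.inf
    for c in labels:
        nc = counts[("class", c)]
        score = math.log(nc / total)
        for i, value in enumerate(features):
            score += math.log((counts[("feat", c, i, value)] + 1) / (nc + n_values))
        if score > best_score:
            best, best_score = c, score
    return best
```

On a real cluster the map calls would run on separate data splits and the framework would group keys before the reduce step; only the counting logic is shown here.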
28. An Improved Approach for Materialized View Selection Based on Genetic Algorithm
- Author
-
Lijuan Zhou, Xiaoxu He, and Kang Li
- Subjects
Mathematical optimization ,General Computer Science ,Computer science ,Population-based incremental learning ,Crossover ,Materialized view ,computer.software_genre ,Data warehouse ,Convergence (routing) ,Genetic algorithm ,Data mining ,Genetic representation ,computer ,Selection (genetic algorithm) - Abstract
This paper presents an improved genetic algorithm to solve the materialized view selection problem under query cost constraints. The algorithm dynamically changes the crossover probability and mutation probability during the genetic process. In this way, it both maintains population diversity and ensures convergence of the genetic algorithm, which effectively improves its optimization ability and avoids the "evolutionary stagnation" problem. Meanwhile, the improved genetic algorithm adds handling of invalid solutions to avoid the "evolutionary stagnation" caused by invalid cycles, thereby greatly improving the efficiency of materialized view selection.
- Published
- 2012
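The abstract of entry 28 does not give the rule used to change the crossover and mutation probabilities. One common adaptive scheme, shown here purely as an assumed illustration, lowers the rate for individuals whose fitness is above the population average so that good solutions are disrupted less while weak ones keep exploring. The function name and the rate bounds are hypothetical.

```python
def adaptive_rate(fitness, f_avg, f_max, high, low):
    """Return a crossover or mutation rate that depends on an individual's fitness.

    Individuals at or below the average fitness get the `high` rate.
    Above the average, the rate falls linearly, reaching `low` at f_max.
    """
    if f_max == f_avg or fitness <= f_avg:
        return high
    return high - (high - low) * (fitness - f_avg) / (f_max - f_avg)


def rates_for_population(fitnesses, high=0.9, low=0.6):
    """Compute one adaptive rate per individual from the current generation."""
    f_avg = sum(fitnesses) / len(fitnesses)
    f_max = max(fitnesses)
    return [adaptive_rate(f, f_avg, f_max, high, low) for f in fitnesses]
```

Because the average and maximum are recomputed every generation, the rates shift as the population converges, which is the sense in which the probabilities are dynamic in this sketch.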
29. Study and Application of an Improved Clustering Algorithm
- Author
-
Yuyan Chen, Shuang Li, and Lijuan Zhou
- Subjects
k-medoids ,Computer science ,Population-based incremental learning ,computer.software_genre ,Human-Computer Interaction ,Data stream clustering ,Artificial Intelligence ,CURE data clustering algorithm ,Canopy clustering algorithm ,Data mining ,Cluster analysis ,computer ,Software ,k-medians clustering ,FSA-Red Algorithm - Abstract
This paper, drawing on the characteristics of early warning for students' grades, presents an optimized algorithm to address the instability caused by random selection of the initial cluster centers, which strongly influences the results. The algorithm has been integrated into the open-source WEKA platform. The optimized algorithm not only preserves the accuracy of the original algorithm but also improves its stability.
- Published
- 2012
30. Efficient Mining Algorithms of Finding Frequent Datasets
- Author
-
Zhang Zhang and Lijuan Zhou
- Subjects
Structure (mathematical logic) ,Computer science ,Relational database ,InformationSystems_DATABASEMANAGEMENT ,Subset and superset ,computer.software_genre ,Human-Computer Interaction ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION ,Data mining ,computer ,Algorithm ,Software - Abstract
This work proposes an efficient mining algorithm to find maximal frequent itemsets from a relational database. It adapts to large datasets. Itemsets are stored in lists with a special structure. Two main lists, called the itemset list and the frequent itemset list, are created by scanning the database once, dividing maximal itemsets into two categories depending on whether they reach the minimum support number. Sub-itemsets whose superset is in the itemset list are generated by recursion so that each sub-itemset appears before its superset. As the current sub-itemset is joined to the frequent itemset list, its own sub-itemsets are pruned from the itemset list. Finally, all sub-itemsets whose nearest superset is in the frequent itemset list are pruned from the frequent itemset list, so that it holds all maximal frequent itemsets. We compare our algorithm with FP-Growth in two sets of timing experiments to show the superiority of our algorithm both with increasing dataset size and with changing minimum support.
- Published
- 2012
31. EMIR: a novel music retrieval system for mobile devices incorporating analysis of user emotion
- Author
-
Hongfei Lin, Cathal Gurrin, and Lijuan Zhou
- Subjects
Music Information Retrieval ,Emotion Detection ,Machine Learning ,Emotion Space ,Information retrieval ,Computer science ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Speech recognition ,Emotion detection ,Similarity (psychology) ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,Music information retrieval ,Space (commercial competition) ,Lyrics ,Mobile device ,Text retrieval - Abstract
We present an Emotional Music Information Retrieval system for mobile devices that utilizes a machine learning approach to detect latent emotion from within both user queries (non-descriptive queries) and the lyrics of songs and uses both elements to develop an effective Music Information Retrieval system. Emotion is extracted from the songs and queries and mapped into a high-dimensional emotion space, which allows for the employment of conventional text retrieval techniques to calculate the similarity between a user query and the latent emotion in song lyrics, thereby producing a ranked list of songs for playback.
- Published
- 2012
32. The data processing based on factor analysis
- Author
-
Lijuan Zhou, Ning-ning Chen, Yuan Zhen, and Hua Wang
- Subjects
Data processing ,Assessment data ,Covariance matrix ,Computer science ,Factor (programming language) ,education ,Principal component analysis ,Data mining ,computer.software_genre ,computer ,computer.programming_language - Abstract
Starting from teacher assessment data, this paper analyzes whether the evaluation indicators are suitable for factor analysis, then finds the two factors that play a decisive role among the evaluation indicators, which are interactive and related, so as to simplify the calculation. Finally, the teachers' rankings are obtained by factor analysis.
- Published
- 2011
33. A Performance Measurement System Based on BSC
- Author
-
Lijuan Zhou and Yan Peng
- Subjects
Process management ,Balanced scorecard ,Performance management ,Shareholder ,Computer science ,business.industry ,Business process ,Bayesian network ,Performance measurement ,Human resources ,business ,Organizational performance - Abstract
Balanced scorecard (BSC) provides an integrated view of overall organizational performance and strategic objectives. Using financial and non-financial measures, the Balanced Scorecard (BSC) approach appraises four dimensions of firm performance: customers, financial (or shareholders), learning and growth, and internal business processes. This research first summarizes the evaluation indexes synthesized from the literature relating to HR (Human Resource) performance measurement. Then, indexes fit for performance evaluation can be selected from the system easily based on the API indicator. Finally, a BSC map is created to explore a new kind of performance management model.
- Published
- 2011
34. Studies on the Automatic Recognition of Modern Chinese Conjunction Usages
- Author
-
Hongying Zan, Kunli Zhang, and Lijuan Zhou
- Subjects
Subjectivity ,Computer science ,business.industry ,Rule-based system ,Artificial intelligence ,computer.software_genre ,business ,computer ,Natural language processing ,Connection (mathematics) ,Conjunction (grammar) - Abstract
Conjunctions can connect words, sentences and even paragraphs. They have special connection functions and their usages are complex and diverse. At present, studies on conjunctions are mostly human-oriented. Such descriptions cannot avoid limitations such as subjectivity and illegibility, and are not easy to apply directly to natural language processing (NLP). This paper studies the automatic recognition of conjunction usages in the context of NLP. It designs a rule-based method and several statistical methods for conjunction usage recognition. The results are compared and analyzed, and it turns out that the rule-based method and the statistical methods each have advantages and disadvantages.
- Published
- 2011
35. An Improved Algorithm for Materialized View Selection
- Author
-
Lijuan Zhou, Ming Sheng Xu, and Haijun Geng
- Subjects
Set (abstract data type) ,General Computer Science ,Computer science ,Online analytical processing ,Genetic algorithm ,Materialized view ,Key (cryptography) ,Dimensional modeling ,Data mining ,computer.software_genre ,computer ,Selection (genetic algorithm) ,Data warehouse - Abstract
A data warehouse is a subject-oriented, integrated, nonvolatile and time-varying collection of data used to support management decision-making. A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. Materializing all views is not possible because of the space constraint and the maintenance cost constraint. Selecting a suitable set of views that minimizes the total cost associated with the materialized views is the key objective of data warehousing. In this paper, first the query cost view selection problem model is proposed. Second, the methods for selecting materialized views are presented. The genetic algorithm is applied to the materialized view selection problem, but as the genetic process develops, producing legal solutions becomes more and more difficult. Therefore, an improved algorithm is presented in this paper. Finally, experimental simulation is adopted to test the function and efficiency of our algorithms. The experiments show that the given methods can provide near-optimal solutions in limited time and work well in practical cases. Randomized algorithms will become invaluable tools for data warehouse evolution.
- Published
- 2011
36. Research on algorithm of association rules in Distributed Database System
- Author
-
Mingsheng Xu, Shuang Li, and Lijuan Zhou
- Subjects
Apriori algorithm ,Distributed database ,Association rule learning ,Computer science ,Node (networking) ,InformationSystems_DATABASEMANAGEMENT ,Crunode ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,Key (cryptography) ,Algorithm design ,Data mining ,Algorithm ,computer ,FSA-Red Algorithm - Abstract
This dissertation proposes a new algorithm for mining association rules in distributed databases using an improved Apriori algorithm, based on an analysis and introduction of the basic concepts and algorithms of association rule mining, both in general and in distributed databases. The improved Apriori algorithm directly produces all local frequent itemsets at each crunode, rather than iteratively selecting candidate itemsets. All local itemsets are then gathered and broadcast to the general node, which produces the global frequent itemsets of the association rules. In the process, the data is no longer stored with the transaction ID as the key; the item ID is taken as the new key. The performance of the improved Apriori algorithm is improved by cutting down the storage space. When the general node gathers all local frequent itemsets to select the global frequent itemsets, it usually needs only one broadcast, and three broadcasts in the worst case. This raises the efficiency of the new association rule algorithm in the distributed database system.
- Published
- 2010
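Entry 36 describes storing data keyed by item ID rather than by transaction ID. A minimal sketch of that layout is given below, assuming the usual "vertical" representation in which each item maps to the set of transaction IDs that contain it, so the support of an itemset is the size of an intersection. The function names are hypothetical, and the enumeration is a plain brute-force loop rather than the paper's improved Apriori procedure.

```python
from itertools import combinations

def to_vertical(transactions):
    """Re-key the data: item ID -> set of transaction IDs containing that item."""
    index = {}
    for tid, items in transactions.items():
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index

def frequent_itemsets(index, min_support, max_size=3):
    """Return {itemset: support} for every itemset meeting min_support.

    Support is computed by intersecting transaction-ID sets, so the
    original transactions are never rescanned.
    """
    items = sorted(i for i, tids in index.items() if len(tids) >= min_support)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            tids = set.intersection(*(index[i] for i in combo))
            if len(tids) >= min_support:
                result[combo] = len(tids)
    return result
```

For example, with transactions `{1: {"a", "b"}, 2: {"a", "b", "c"}, 3: {"a", "c"}, 4: {"b"}}` and a minimum support of 2, the pair `("a", "b")` is frequent because items `a` and `b` share transactions 1 and 2.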
37. Massive data mining based on item sequence set grid space
- Author
-
Mingsheng Xu, Zhang Zhang, and Lijuan Zhou
- Subjects
Set (abstract data type) ,Apriori algorithm ,Association rule learning ,Computer science ,Relational database ,Data stream mining ,Data mining ,Linked list ,computer.software_genre ,Grid ,Data structure ,computer - Abstract
Based on the way massive data is stored in a relational database, this paper proposes a fast mining algorithm to find maximum frequent itemsets based on item sequence set grid space. Traditional methods for mining association rules generate frequent itemsets from small to large. These approaches are either time consuming or computationally expensive, and often generate a large number of redundant candidates or frequent itemsets, which is fatal for mining speed as data grows to massive scale. The goal of this paper is first to use a self-defined linked-list structure to store item sequences and then to find the frequent itemsets from large to small. Several applications of association rule mining using item sequence set grid space perform well, but the approach has proved inefficient for massive data mining because of the time spent finding sub-itemsets. Experimental results are presented to show that the fast mining algorithm ISSDL-DM proposed in this paper uses much less time than the similar existing algorithm ISS-DM to achieve the same outcomes.
- Published
- 2010
38. A clustering-Based KNN improved algorithm CLKNN for text classification
- Author
-
Lijuan Zhou, Qian Shi, Lin-shuang Wang, and Xue-bin Ge
- Subjects
Training set ,Computer science ,business.industry ,Document classification ,Pattern recognition ,Boundary testing ,computer.software_genre ,Knn classifier ,k-nearest neighbors algorithm ,Statistical classification ,ComputingMethodologies_PATTERNRECOGNITION ,Algorithm design ,Artificial intelligence ,Data mining ,business ,Cluster analysis ,computer - Abstract
As a simple, effective and nonparametric classification method, k Nearest Neighbor (KNN) is widely used in document classification to deal with more difficult problems such as large-scale data or many categories. But the KNN classifier may have a problem when training samples are uneven: the uneven density of training data may decrease the precision of classification. To solve this problem, a new clustering-based KNN method is presented in this paper. It preprocesses the training data by clustering, then classifies with a new KNN algorithm that dynamically adjusts the neighborhood number K in each iteration. This method avoids the uneven classification phenomenon and reduces the misjudgment of boundary testing samples. We conduct an experiment in text classification, and the result shows that the method has good performance.
- Published
- 2010
39. The minimum incremental maintenance of materialized views in data warehouse
- Author
-
Haijun Geng, Lijuan Zhou, and Qian Shi
- Subjects
Consistency (database systems) ,Information engineering ,Automatic control ,Distributed database ,Database ,Computer science ,Online analytical processing ,Materialized view ,computer.software_genre ,Maintenance engineering ,computer ,Data warehouse - Abstract
A large number of materialized views are stored in a data warehouse to enable users to quickly get search results for OLAP analysis. But when the remote basic data source changes, the materialized views in the data warehouse must also be updated correspondingly in order to maintain consistency with the basic relations, which causes materialized view maintenance issues. There are two methods for materialized view maintenance. One way is to re-compute the views, which can lead to extra large storage and maintenance costs and is sometimes unachievable due to storage limitations. So the incremental maintenance technique has been preferred in recent years. Its principle is that the data source reports its changes to the integrator, which then calculates the corresponding changes and informs the database of the results. The incremental maintenance technique is adopted in this paper. The amount of incremental data for the same view differs when different methods are adopted, which results in different maintenance costs. The idea and strategy of minimum incremental maintenance are presented. The materialized view definitions and maintenance expressions, as well as algorithms, are given. The experiment shows that the maintenance cost of materialized views is decreased and data warehouse processing efficiency is improved.
- Published
- 2010
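The contrast in entry 39 between re-computing a view and maintaining it incrementally can be sketched with a single aggregate view. The example below assumes a view that holds the total amount per key and applies reported inserts and deletes as deltas; the class and function names are hypothetical and the paper's minimum-increment strategy is not reproduced.

```python
from collections import defaultdict

class SumView:
    """Materialized view of total amount per key, updated from change reports."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)  # row count per key, needed to handle deletes

    def apply(self, inserts=(), deletes=()):
        """Apply only the reported changes instead of rescanning the base rows."""
        for key, amount in inserts:
            self.total[key] += amount
            self.count[key] += 1
        for key, amount in deletes:
            self.total[key] -= amount
            self.count[key] -= 1
            if self.count[key] == 0:
                # No base rows remain for this key, so it leaves the view.
                del self.total[key]
                del self.count[key]

    def rows(self):
        return dict(self.total)


def recompute(base_rows):
    """Full re-computation from the base rows, used here only as a reference."""
    totals = defaultdict(float)
    for key, amount in base_rows:
        totals[key] += amount
    return dict(totals)
```

The work done by `apply` is proportional to the number of changed rows, whereas `recompute` is proportional to the size of the base relation, which is the cost difference the abstract refers to.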
40. A Brief Review of Machine Learning and Its Application
- Author
-
Hua Wang, Cuiqin Ma, and Lijuan Zhou
- Subjects
Structure (mathematical logic) ,Artificial neural network ,Computer science ,business.industry ,Analogy ,Rote learning ,Machine learning ,computer.software_genre ,Variety (cybernetics) ,Knowledge extraction ,Algorithm design ,Learning based ,Artificial intelligence ,business ,computer - Abstract
With the popularization of information and the establishment of databases in great number, how to extract useful information from the data is an urgent problem to be solved. Machine learning is a core issue of artificial intelligence research. This paper introduces the definition of machine learning and its basic structure, and describes a variety of machine learning methods, including rote learning, inductive learning, analogy learning, explanation-based learning, learning based on neural networks, and knowledge discovery. This paper also puts forward the objectives of machine learning and points out its development trend. Keywords: machine learning; intelligence; methods; application
- Published
- 2009
41. Classification data mining method based on dynamic RBF neural networks
- Author
-
Lijuan Zhou, Zhang Zhang, Luping Duan, and Min Xu
- Subjects
Artificial neural network ,business.industry ,Computer science ,Machine learning ,computer.software_genre ,Data warehouse ,Bottleneck ,ComputingMethodologies_PATTERNRECOGNITION ,Rate of convergence ,Robustness (computer science) ,Search algorithm ,Incremental learning ,Radial basis function ,Artificial intelligence ,Data mining ,business ,computer - Abstract
With the wide application of databases and the rapid development of the Internet, the capacity to use information technology to produce and collect data has improved greatly. Mining useful information or knowledge from large databases or data warehouses is an urgent problem, and data mining technology has developed rapidly to meet this need. But DM (data mining) often faces large amounts of data that are noisy, disordered and nonlinear. Fortunately, ANN (Artificial Neural Network) is suitable for solving these problems of DM because ANN has such merits as good robustness, adaptability, parallel processing, distributed memory and high error tolerance. Based on an analysis of various data mining technologies, this paper gives a detailed discussion of the application of ANN methods in DM, with particular emphasis on classification data mining based on RBF neural networks. Pattern classification is an important part of RBF neural network applications. In an on-line environment the training dataset is variable, so a batch learning algorithm (e.g. OLS), which generates plenty of unnecessary retraining, has lower efficiency. This paper derives an incremental learning algorithm (ILA) from the gradient descent algorithm to relieve this bottleneck. ILA can adaptively adjust the parameters of RBF networks by minimizing the error cost, without any redundant retraining. Using the method proposed in this paper, an on-line classification system was constructed to solve the IRIS classification problem. Experimental results show that the algorithm has a fast convergence rate and excellent on-line classification performance.
- Published
- 2009
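The incremental learning idea in entry 41 can be illustrated with a small RBF network that is updated one sample at a time by gradient descent on the squared error, with no retraining on earlier samples. This is only a sketch under stated assumptions: the centers and width are fixed, only the output weights are adapted, and the class name and learning rate are hypothetical. The paper's ILA may adjust other parameters as well.

```python
import math

class OnlineRBF:
    """RBF network with fixed Gaussian centers and incrementally trained output weights."""

    def __init__(self, centers, width=1.0, lr=0.5):
        self.centers = centers
        self.width = width
        self.lr = lr
        self.weights = [0.0] * len(centers)

    def _phi(self, x):
        """Gaussian activation of each hidden unit for input vector x."""
        return [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * self.width ** 2))
            for c in self.centers
        ]

    def predict(self, x):
        return sum(w * p for w, p in zip(self.weights, self._phi(x)))

    def update(self, x, target):
        """One gradient-descent step on 0.5 * error**2 for a single new sample."""
        phi = self._phi(x)
        error = target - sum(w * p for w, p in zip(self.weights, phi))
        self.weights = [w + self.lr * error * p for w, p in zip(self.weights, phi)]
        return error
```

Each call to `update` touches only the newest sample, so a stream of training data can be absorbed as it arrives, in contrast to a batch method that refits on the whole dataset.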
42. Design of data warehouse in teaching state based on OLAP and data mining
- Author
-
Shuang Li, Lijuan Zhou, and Minhua Wu
- Subjects
Decision support system ,Association rule learning ,Computer science ,business.industry ,Online analytical processing ,Data management ,Information technology ,computer.software_genre ,Data structure ,Data science ,Data warehouse ,Data modeling ,Data extraction ,Data quality ,Data mining ,business ,computer - Abstract
Data warehouse and data mining technology is one of the hot topics of information technology research. At present, data warehouse and data mining technology is widely applied in commerce, the financial industry, enterprise production and marketing, but relatively little in education. Over the years, teaching and management in colleges and universities have accumulated large amounts of data that cannot be used effectively. In light of the social needs of university development and the current status of data management, it is particularly important to establish a data warehouse of university teaching state, make better use of existing data, and on that basis carry out a higher level of processing, namely data mining. Starting from decision-making needs, this paper designs the data warehouse structure for university teaching state, then creates a data warehouse model through structure design and data extraction, loading and conversion, and finally uses an association rule mining algorithm for data mining to obtain effective results applied in practice. The data analysis and mining yield a lot of valuable information, which can be used to guide teaching management, thereby improving the quality of teaching, promoting teaching devotion in universities and enhancing teaching infrastructure. At the same time it can provide detailed, multi-dimensional information for university assessment and higher education research.
- Published
- 2009
43. Research of Data Mining Approach Based on Radial Basis Function Neural Networks
- Author
-
Luping Duan, Minhua Wu, Ming Sheng Xu, Haijun Geng, and Lijuan Zhou
- Subjects
Statistical classification ,Radial basis function network ,Artificial neural network ,Computer science ,Population-based incremental learning ,Process (computing) ,Algorithm design ,Radial basis function ,Data mining ,computer.software_genre ,computer ,Selection (genetic algorithm) - Abstract
In this paper, classification in data mining based on radial basis function neural networks is studied. After intensive analysis, the training algorithm of radial basis function neural networks is improved in optimum structure, learning speed and approximation accuracy. For learning speed, a two-stage learning strategy is used to accelerate the learning process. For approximation accuracy, an error-correction algorithm is presented to improve the output accuracy of the radial basis function. For optimum structure, the paper focuses on the number and center selection of the hidden layer units and proposes an adaptive center selection algorithm that combines dynamic and static methods. Finally, the algorithms are tested and compared. The experimental results show that the performance of the algorithm is significantly improved, and also prove the validity of the improved algorithm.
- Published
- 2009
44. Research on Materialized Views Technology in Data Warehouse
- Author
-
Min Xu, Lijuan Zhou, Zhongxiao Hao, and Qian Shi
- Subjects
Decision support system ,Database ,Distributed database ,Computer science ,Node (networking) ,Materialized view ,InformationSystems_DATABASEMANAGEMENT ,Dimensional modeling ,computer.software_genre ,Data warehouse ,Set (abstract data type) ,Algorithm design ,Data mining ,computer - Abstract
Data warehouse technology has emerged from enterprises' need for decision-support information and the fast development of computer technologies. The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views referred to as materialized views. The design of the data warehouse is one of the core research problems in the study and evolution of data warehouses. One of the most important decisions in the design of a data warehouse is the data warehouse selection. Selecting views to materialize affects the efficiency as well as the total cost of establishing and running a data warehouse. So, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query cost view selection problem (QC_VSP). In this paper, first, we propose a model of the query cost view selection problem. Second, we give three algorithms for QC_VSP, and we give a view_node_matrix in order to solve it. Third, experimental simulation is adopted. The results show that our algorithm works better in practical cases. We implemented our algorithms, and a performance study of the algorithms shows that the proposed algorithm delivers an optimal solution. Finally, we discuss the observed behavior of the algorithms. We also identify some important issues for future investigation.
- Published
- 2008
45. The Model and Realization of Materialized Views Selection in Data Warehouse
- Author
-
Lijuan Zhou, Xuebin Ge, and Minhua Wu
- Subjects
Set (abstract data type) ,Information retrieval ,Selection (relational algebra) ,Database ,Distributed database ,Computer science ,Materialized view ,InformationSystems_DATABASEMANAGEMENT ,Dimensional modeling ,Algorithm design ,computer.software_genre ,computer ,Data warehouse - Abstract
The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views, referred to as materialized views. One of the most important decisions in designing a data warehouse is the selection of the right views to be materialized. So, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query cost view selection problem (QC_VSP). In this paper, first, we propose the cost model of QC_VSP. Second, we design algorithms for QC_VSP. Third, we use experiments to demonstrate the power of our approach.
- Published
- 2008
46. Research on parallel algorithm for sequential pattern mining
- Author
-
Lijuan Zhou, Zhongxiao Hao, Yu Wang, and Bai Qin
- Subjects
GSP Algorithm ,Theoretical computer science ,Speedup ,Sequence database ,Computer science ,Test data generation ,Distributed algorithm ,Search algorithm ,Parallel algorithm ,Data mining ,computer.software_genre ,computer ,FSA-Red Algorithm - Abstract
Sequential pattern mining is the mining of frequent sequences related to time or other orders from a sequence database. Its initial motivation was to discover the laws of customer purchasing over a period of time by finding frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and advanced science fields such as DNA analysis. The data of sequential pattern mining has the following characteristics: massive data volume and distributed storage. Most existing sequential pattern mining algorithms have not considered these characteristics together. Based on these traits and on parallel theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm follows the principle of pattern reduction and uses the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent itemsets by applying the frequent concept and search space partition theory, and the second task is to build frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and does not generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and different information structures, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm has an excellent speedup factor and efficiency.
- Published
- 2008
47. Selecting materialized views using random algorithm
- Author
-
Zhongxiao Hao, Lijuan Zhou, and Chi Liu
- Subjects
Distributed database ,Computer science ,View ,Online analytical processing ,Genetic algorithm ,Materialized view ,Simulated annealing ,Data mining ,computer.software_genre ,computer ,Data warehouse ,Randomized algorithm - Abstract
The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views referred to as materialized views. The selection of the materialized views is one of the most important decisions in designing a data warehouse. Materialized views are stored in the data warehouse for the purpose of efficiently implementing on-line analytical processing queries. The first issue for the user to consider is query response time. So in this paper, we develop algorithms to select a set of views to materialize in the data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query_cost view_selection problem. First, the cost graph and cost model of the query_cost view_selection problem are presented. Second, methods for selecting materialized views by using random algorithms are presented. The genetic algorithm is applied to the materialized view selection problem, but as the genetic process develops, producing legal solutions becomes more and more difficult, so many solutions are eliminated and the time needed to produce solutions is lengthened. Therefore, an improved algorithm is presented in this paper, which combines the simulated annealing algorithm and the genetic algorithm to solve the query cost view selection problem. Finally, experimental simulation is adopted to test the function and efficiency of our algorithms. The experiments show that the given methods can provide near-optimal solutions in limited time and work better in practical cases. Randomized algorithms will become invaluable tools for data warehouse evolution.
- Published
- 2007
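The abstract above describes combining a genetic algorithm with simulated annealing to pick views that minimize maintenance cost under a query response time bound. The following is only a minimal illustrative sketch of that general idea, not the authors' algorithm: the toy cost figures, the penalty weight, and all function names (`select_views`, `fitness`, and so on) are assumptions made for the example, and the paper's cost graph and cost model are not reproduced.

```python
import math
import random

# Hypothetical toy data: (maintenance_cost, query_time_saved) per candidate view.
VIEWS = [(4, 10), (3, 6), (6, 14), (2, 3), (5, 9), (7, 15)]
BASE_QUERY_TIME = 60  # assumed total query time with nothing materialized
TIME_LIMIT = 30       # assumed bound on total query response time

def maintenance_cost(sel):
    """Total maintenance cost of the selected views (sel is a 0/1 tuple)."""
    return sum(c for (c, _), s in zip(VIEWS, sel) if s)

def query_time(sel):
    """Total query time after materializing the selected views."""
    return BASE_QUERY_TIME - sum(t for (_, t), s in zip(VIEWS, sel) if s)

def feasible(sel):
    return query_time(sel) <= TIME_LIMIT

def fitness(sel):
    # Penalize constraint violations so infeasible candidates stay comparable.
    over = max(0, query_time(sel) - TIME_LIMIT)
    return maintenance_cost(sel) + 10 * over

def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(sel, rng):
    i = rng.randrange(len(sel))
    child = list(sel)
    child[i] = 1 - child[i]
    return tuple(child)

def select_views(pop_size=20, generations=200, t0=10.0, cooling=0.98, seed=0):
    rng = random.Random(seed)
    n = len(VIEWS)
    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    best = tuple([1] * n)  # materializing everything is feasible in this toy case
    temp = t0
    for _ in range(generations):
        new_pop = []
        for parent in pop:
            child = mutate(crossover(parent, rng.choice(pop), rng), rng)
            delta = fitness(child) - fitness(parent)
            # Annealing-style acceptance: keep improvements, and occasionally
            # keep a worse child while the temperature is still high.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                survivor = child
            else:
                survivor = parent
            new_pop.append(survivor)
            if feasible(survivor) and fitness(survivor) < fitness(best):
                best = survivor
        pop = new_pop
        temp = max(temp * cooling, 1e-3)
    return best

if __name__ == "__main__":
    chosen = select_views()
    print(chosen, maintenance_cost(chosen), query_time(chosen))
```

Because the search starts from a known feasible selection and only replaces it with a cheaper feasible one, the returned selection always respects the assumed response time bound, even when most offspring are infeasible.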
48. Division of Imaging Intervals and Selection of Optimum Imaging Time for Ship ISAR Imaging Based on Measured Data
- Author
-
Haiping Sun, Lijuan Zhou, and Mengdao Xing
- Subjects
Synthetic aperture radar ,Computer science ,Pulse-Doppler radar ,Acoustics ,Doppler radar ,Side looking airborne radar ,Space-based radar ,law.invention ,Inverse synthetic aperture radar ,Continuous-wave radar ,law ,Computer Science::Computer Vision and Pattern Recognition ,Radar imaging ,Remote sensing - Abstract
In this paper, an Inverse Synthetic Aperture Radar (ISAR) imaging algorithm for ship targets, based on the division of imaging intervals and the selection of the optimum imaging time, is proposed. The relative motion of a ship on ocean waves can be broken into three components, namely pitch, roll and yaw, which make the Doppler frequency vary with slow time. As a result, over the whole slow time the echoes are no longer linear frequency modulation (LFM) signals. Under this condition, the division of the imaging intervals and the selection of the optimum imaging time within the considered imaging interval are of great importance for ISAR imaging of ship targets. The imaging results from measured data demonstrate the effectiveness of the proposed approach.
- Published
- 2006
49. Synthetic Bandwidth Method Integrated with Characteristics of SAR
- Author
-
Mengdao Xing, Haiping Sun, and Lijuan Zhou
- Subjects
Synthetic aperture radar ,Motion compensation ,symbols.namesake ,Computer science ,Radar imaging ,Bandwidth (signal processing) ,Echo signal ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,symbols ,Electronic engineering ,High bandwidth ,Time domain ,Doppler effect - Abstract
Stepped-frequency subpulse signals are widely used to obtain ultra-high range resolution. The stepped-frequency subpulse signals can be combined into a single high-bandwidth signal by using synthetic bandwidth methods. In practical SAR imaging, motion error and the time delay of the echo signal must be considered before applying the available synthetic bandwidth methods. This paper presents a method in which motion compensation and compensation in the Doppler domain are integrated with the time-domain synthetic bandwidth method in order to obtain a high-quality SAR image.
- Published
- 2006
50. P2P Traffic Identification by TCP Flow Analysis
- Author
-
ZhiTong Li, LiJuan Zhou, and Bin Liu
- Subjects
Network packet ,Computer science ,business.industry ,ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS ,Traffic policing ,Network interface ,Traffic shaping ,business ,Traffic generation model ,Host (network) ,Network traffic control ,Network traffic simulation ,Computer network - Abstract
In this paper, we first propose some new and universal features of all kinds of P2P traffic, derived from the packet header information of the transport/network layer, and then present a novel approach for gathering P2P traffic from a single-host perspective, namely by capturing packets from the NIC (network interface card) of one host.
- Published
- 2006