2,474 results on '"Source data"'
Search Results
2. 农产品溯源区块链的源头数据验证机制研究 [Research on the source-data verification mechanism of a blockchain for agricultural product traceability].
- Author
-
赵龙海, 赵金辉, and 邹惠
- Published
- 2024
- Full Text
- View/download PDF
3. A Source Data Verification-Based Data Quality Analysis Within the Network of a German Comprehensive Cancer Center
- Author
-
Borner, Martina, Schweizer, Diana, Fey, Theres, Nasseh, Daniel, Dengler, Robert, FOM Hochschule für Oekonomie & Management, Cassens, Manfred, editor, Kollányi, Zsófia, editor, and Tsenov, Aleksandar, editor
- Published
- 2022
- Full Text
- View/download PDF
4. Evaluation of the clinical application effect of eSource record tools for clinical research
- Author
-
Bin Wang, Xinbao Hao, Xiaoyan Yan, Junkai Lai, Feifei Jin, Xiwen Liao, Hongju Xie, and Chen Yao
- Subjects
Electronic medical record, eSource, Source data, Real-world study, Interoperability, Data collection, Computer applications to medicine. Medical informatics, R858-859.7
- Abstract
Background: Electronic sources (eSources) can improve data quality and reduce clinical trial costs. Our team has developed an innovative eSource record (ESR) system in China. This study aims to evaluate the efficiency, quality, and system performance of the ESR system in data collection and data transcription. Methods: The study used time efficiency and data transcription accuracy indicators to compare the eSource and non-eSource data collection workflows in a real-world study (RWS). The two processes are traditional data collection with manual transcription (the non-eSource method) and ESR-based source data collection with electronic transmission (the eSource method). The participants' experience of using the ESR was evaluated through the system usability scale (SUS) and other characteristic evaluation scales (system security, system compatibility, record quality). Results: For source data collection (the total time required for writing electronic medical records (EMRs)), the ESR system reduces the time required by 39% on average compared to the EMR system. For data transcription (electronic case report form (eCRF) filling and verification), the ESR reduces the time required by 80% compared to the non-eSource method (difference: 223 ± 21 s). The ESR accuracy in filling the eCRF fields is 96.92%. The SUS score of the ESR is 66.9 ± 16.7, which is at the D level and thus very close to the acceptable margin, indicating that optimization work is needed. Conclusions: This preliminary evaluation shows that, in the clinical medical environment, the ESR-based eSource method can improve the efficiency of source data collection and reduce the workload required to complete data transcription. (A back-of-envelope reading of the reported time figures follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
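One way to read the time figures reported above, assuming the 223 s difference is the absolute transcription time saved and that it corresponds to the stated 80% reduction (an interpretation for illustration, not a figure given in the paper):

```python
# Rough consistency check of the eSource vs. non-eSource transcription times
# reported in entry 4. Assumption: the 223 s difference is the absolute time
# saved, and it equals 80% of the non-eSource transcription time.
reduction = 0.80        # reported relative reduction in transcription time
time_saved_s = 223      # reported mean difference in seconds

non_esource_s = time_saved_s / reduction      # implied non-eSource time
esource_s = non_esource_s - time_saved_s      # implied eSource time

print(f"Implied non-eSource transcription time: {non_esource_s:.0f} s")  # ~279 s
print(f"Implied eSource transcription time:     {esource_s:.0f} s")      # ~56 s
```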
6. Research on Data Quality of Open Source Code Data
- Author
-
BAO Panpan, TAO Chuanqi, HUANG Zhiqiu
- Subjects
intelligent programming, open source big data, source data, data quality, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Code generation and bug prediction based on open source code data are typical application fields in current intelligent software development. However, existing research mainly focuses on the diverse intelligent algorithms applied in different applications, such as recommendation and prediction; the quality of the data used in the research is seldom evaluated or analyzed. Most of the data used in intelligent technologies come from open source code. Because software developers and programmers vary widely, quality issues arise frequently and, following the principle of garbage in, garbage out, they degrade the quality of the final results. The quality of source data thus has an important impact on related research but has not received sufficient attention. To address this problem, this paper proposes an approach to data quality evaluation and analysis for open source code. The paper first studies how to define and evaluate the quality of source code extracted from GitHub, and then evaluates the quality along different dimensions. The approach can help researchers construct higher-quality data sets and further improve intelligent applications such as code generation and bug prediction.
- Published
- 2020
- Full Text
- View/download PDF
7. REPRESENTATION OF SOCIOLOGICAL DATA IN THE CONTEXT OF LEGAL PROTECTION OF COPYRIGHT PROPERTY IN THE DIGITAL ECONOMY
- Author
-
O. O. Medvedeva, I. D. Katolik, A. O. Zhuk, R. A. Baryshev, and K. N. Zakharyin
- Subjects
digital objects of copyright, digital sociological data, IPUniversity platform, patent law, identifier, source data, Bibliography. Library science. Information resources
- Abstract
This article presents modern aspects of the formation of digital objects of copyright (DOC) in the digital economy, as well as their function in modern scientific communication. The peculiarities of copyright protection are analyzed within the framework of traditional protection institutions, as well as with digital technologies based on distributed registries. Using the example of the copyright object "sociological data" presented in digital form, its structure and description are presented for placement on the IPUniversity platform, created by leading Russian universities to test models for registering and sharing the results of intellectual activity.
- Published
- 2019
- Full Text
- View/download PDF
8. The Late Bronze Age settlement site of Březnice: Magnetometer survey data
- Author
-
Martin Kuna, Roman Křivánek, Ondřej Chvojka, and Tereza Šálková
- Subjects
Magnetometry, Source data, Bronze Age settlement site, Intra-site patterning, House burning, Prehistoric homestead clusters, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
- Abstract
The archaeological site of Březnice (Czechia) represents one of the large settlements of the Late Bronze Age (Ha A2/B1, 14C: 1124–976 BC) in Bohemia. The site became known mainly for a high number of so-called 'trenches', oblong pit features (breadth around 1 m, length 4–7 m), remarkable not only for their specific shape but also for their contents (unusual amounts of pottery, daub, loom weights and other finds, often with traces of a strong fire). In 2018–20, a research project focusing on the study of the site was carried out. Magnetometer survey became an integral part of the project since it represented a way to obtain an overall image of the site. A 5-channel fluxgate gradiometer from Sensys (Germany) was used; the vertical gradient of the Z component of the Earth's magnetic field was measured. In total, the survey covered an area of over 17 hectares and included over 1.8 million measurements (see the consistency check after this entry). Profiles were orientated from east to west and data taken bidirectionally (alternate lines in opposite directions), in a 0.5 × 0.2 m grid. The site is extraordinary in that all archaeological features discovered so far belong to a single archaeological period (Late Bronze Age). This makes the acquired data set exceptional. It can be further used by archaeologists and geophysicists, both to create alternative models of the dynamics of prehistoric settlements and to better understand the nature and interpretive possibilities of magnetometer data in archaeology as such.
- Published
- 2021
- Full Text
- View/download PDF
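A quick plausibility check of the survey figures quoted in entry 8 (roughly 17 hectares sampled on a 0.5 × 0.2 m grid, yielding over 1.8 million measurements); the arithmetic below is illustrative only:

```python
# Does a 0.5 m x 0.2 m sampling grid over ~17 ha yield ~1.8 million measurements,
# as entry 8 reports? Back-of-envelope only; the surveyed area is quoted as
# "over 17 hectares", so the true count is somewhat higher.
area_m2 = 17 * 10_000            # 17 ha in square metres
cell_m2 = 0.5 * 0.2              # ground area represented by one measurement
measurements = area_m2 / cell_m2
print(f"{measurements:,.0f} measurements")   # 1,700,000 -- consistent with ">1.8 million"
```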
9. Learning From a Complementary-Label Source Domain: Theory and Algorithms
- Author
-
Zhen Fang, Bo Yuan, Guangquan Zhang, Yiyang Zhang, Feng Liu, and Jie Lu
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Domain adaptation, Source data, Adversarial network, Classifier, Domain theory, Pattern recognition, Artificial intelligence, Computer Networks and Communications, Artificial Intelligence & Image Processing, Computer Science Applications, Software
- Abstract
In unsupervised domain adaptation (UDA), a classifier for the target domain is trained with massive true-label data from the source domain and unlabeled data from the target domain. However, collecting fully true-label data in the source domain is high-cost and sometimes impossible. Compared to a true label, a complementary label specifies a class that a pattern does not belong to, so collecting complementary labels is less laborious than collecting true labels. Thus, in this paper, we propose a novel setting in which the source domain is composed of complementary-label data, and a theoretical bound for it is first proved. We consider two cases of this setting: one in which the source domain only contains complementary-label data (completely complementary unsupervised domain adaptation, CC-UDA), and one in which the source domain has plenty of complementary-label data and a small amount of true-label data (partly complementary unsupervised domain adaptation, PC-UDA). To this end, a complementary label adversarial network (CLARINET) is proposed to solve CC-UDA and PC-UDA problems. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines on handwritten-digits-recognition and objects-recognition tasks.
- Published
- 2022
10. Object-based tracking of precipitation systems in western Canada: the importance of temporal resolution of source data.
- Author
-
Li, Lintao, Li, Yanping, and Li, Zhenhua
- Subjects
ATMOSPHERIC models, ALGORITHMS, VALLEYS, COASTS, MAXIMUM power point trackers, PRAIRIES
- Abstract
Object-based algorithms provide additional spatiotemporal information about precipitation beyond traditional aspects such as amount and intensity. Using the Method for Object-based Diagnostic Evaluation with Time Dimension (MODE-TD, or MTD), precipitation features in western Canada have been analyzed comprehensively based on the Canadian Precipitation Analysis, the North American Regional Reanalysis, Multi-Source Weighted-Ensemble Precipitation, and a convection-permitting climate model. We found that light precipitation occurs frequently in the interior valleys of western Canada, while moderate to heavy precipitation is rare there. The size of maritime precipitation systems near the coast is similar to that of continental precipitation systems on the Prairies for moderate to heavy precipitation, while light precipitation on the Prairies is larger in size than that occurring near the coast. For temporal features, moderate to heavy precipitation lasts longer than light precipitation over the Pacific coast, and precipitation systems on the Prairies generally move faster than coastal precipitation. Over the annual cycle, the west coast has more precipitation events in cold seasons, while more precipitation events are identified in warm seasons on the Prairies due to vigorous convection activity. Using two control experiments, the way the spatiotemporal resolution of the source data influences the MTD results has been examined. Overall, the spatial resolution of the source data has little influence on MTD results. However, MTD driven by datasets with coarse temporal resolution tends to identify precipitation systems with relatively large size and slow propagation speed; such systems normally have short track lengths and relatively long lifetimes. For a typical precipitation system (0.7–2 × 10⁴ km² in size) in western Canada, the maximum propagation speed that can be identified by 6-h data is approximately 25 km h⁻¹, versus 33 km h⁻¹ for 3-h and 100 km h⁻¹ for hourly data (see the rough estimate after this entry). Since the propagation speed of precipitation systems in North America is basically between 0 and 80 km h⁻¹, we argue that precipitation features can be identified properly by MTD only when a dataset with hourly or higher temporal resolution is used.
- Published
- 2020
- Full Text
- View/download PDF
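The resolution argument in entry 10 can be illustrated with a simple overlap estimate: a tracker can only link an object between consecutive fields if it moves by less than roughly its own characteristic length, so the maximum identifiable propagation speed scales as object size divided by the time step. The sketch below assumes a characteristic length derived from the quoted 0.7–2 × 10⁴ km² size range; the exact matching criterion used by MTD may differ.

```python
import math

# Illustrative estimate of the fastest propagation speed a tracker can follow
# at a given temporal resolution: between two consecutive fields the object
# must not move much farther than its own characteristic length.
area_km2 = 1.5e4                       # mid-range size of a typical system (0.7-2 x 10^4 km^2)
char_length_km = math.sqrt(area_km2)   # ~120 km characteristic length (assumption)

for dt_h in (6, 3, 1):
    v_max = char_length_km / dt_h      # km/h the object can move and still overlap itself
    print(f"dt = {dt_h} h -> max trackable speed ~ {v_max:.0f} km/h")

# Entry 10 reports ~25, ~33 and ~100 km/h for 6-h, 3-h and hourly data -- the
# same order of magnitude as this crude overlap argument.
```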
11. Improvement of the bootstrapping part of the ODIN system
- Author
-
Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació, Romero Moral, Óscar, Queralt Calafat, Anna, and Villanueva Baxarias, Rubén
- Abstract
ODIN is a system that exposes an ontology, which is a conceptualization of the source data and the domain of interest, with the aim of offering a uniform query interface. Queries over the ontology are rewritten over the sources via schema mappings. The part of the system that involves the automatic construction of the ontology from the data sources and their respective mappings is known as bootstrapping. The objective of this TFG is to improve the current bootstrapping of the ODIN system and to implement all the missing use cases and functionalities so that it works for any JSON data source.
- Published
- 2023
12. Transferable Feature Selection for Unsupervised Domain Adaptation
- Author
-
Qingyao Wu, Yuzhong Ye, Hanrui Wu, Liu Dapeng, Min Lu, Yuguang Yan, Bi Chaoyang, and Michael K. Ng
- Subjects
Domain adaptation, Source data, Feature selection, Machine learning, Discriminative model, Feature (machine learning), Classifier, Artificial intelligence, Computer Science Applications, Computational Theory and Mathematics, Information Systems
- Abstract
Domain adaptation aims at extracting knowledge from auxiliary source domains to assist the learning task in a target domain. In classification problems, since the distributions of the source and target domains are different, directly using source data to build a classifier for the target domain may hamper the classification performance on the target data. Fortunately, in many tasks, there can be some features that are transferable, i.e., the source and target domains share similar properties. On the other hand, it is common that the source data contain noisy features which may degrade the learning performance in the target domain. This issue, however, is barely studied in existing works. In this paper, we propose to find a feature subset that is transferable across the source and target domains. As a result, the domain discrepancy measured on the selected features can be reduced. Moreover, we seek to find the most discriminative features for classification. To achieve the above goals, we formulate a new sparse learning model that is able to jointly reduce the domain discrepancy and select informative features for classification. We develop two optimization algorithms to address the derived learning problem. Extensive experiments on real-world data sets demonstrate the effectiveness of the proposed method.
- Published
- 2022
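Entry 12 above measures domain discrepancy on a selected feature subset. A minimal sketch of that idea, using a standard RBF-kernel maximum mean discrepancy (MMD) as the discrepancy measure and scoring features one at a time; the paper's actual sparse learning formulation and joint optimization are not reproduced here.

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Squared MMD between samples X and Y with an RBF kernel (biased estimate)."""
    def k(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 5))
target = rng.normal(0.0, 1.0, size=(200, 5))
target[:, 0] += 2.0          # feature 0 is shifted across domains (a noisy, untransferable feature)

# Greedy illustration: score each feature by the discrepancy it induces.
# Transferable features have low MMD, so they would be preferred for selection.
for j in range(source.shape[1]):
    mmd = rbf_mmd2(source[:, [j]], target[:, [j]])
    print(f"feature {j}: MMD^2 = {mmd:.3f}")
```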
13. Data Collection and Quality Control
- Author
-
Senior, Hugh, Nikles, Jane, editor, and Mitchell, Geoffrey, editor
- Published
- 2015
- Full Text
- View/download PDF
14. GNN-RE: Graph Neural Networks for Reverse Engineering of Gate-Level Netlists
- Author
-
Satwik Patnaik, Abhrajit Sengupta, Hani Saleh, Mahmoud Al-Qutayri, Lilas Alrahis, Ozgur Sinanoglu, Johann Knechtel, and Baker Mohammad
- Subjects
Reverse engineering, Hardware security, Source data, Feature vector, Feature extraction, Logic gate, Netlist, Node (circuits), Computer Graphics and Computer-Aided Design, Electrical and Electronic Engineering, Software
- Abstract
This work introduces a generic, machine learning (ML)-based platform for functional reverse engineering (RE) of circuits. Our proposed platform, GNN-RE, leverages graph neural networks (GNNs) to (i) represent and analyze flattened/unstructured gate-level netlists, (ii) automatically identify the boundaries between the modules or sub-circuits implemented in such netlists, and (iii) classify the sub-circuits based on their functionalities. For GNNs in general, each graph node is tailored to learn about its own features and its neighboring nodes, which is a powerful approach for the detection of any kind of sub-graph of interest. For GNN-RE in particular, each node represents a gate and is initialized with a feature vector that reflects the functional and structural properties of its neighboring gates (see the sketch after this entry). GNN-RE also learns the global structure of the circuit, which facilitates identifying the boundaries between sub-circuits in a flattened netlist. Initially, to provide high-quality data for training GNN-RE, we deploy a comprehensive dataset of foundational designs/components with differing functionalities, implementation styles, bit-widths, and interconnections. GNN-RE is then tested on the unseen shares of this custom dataset, as well as the EPFL benchmarks, the ISCAS-85 benchmarks, and the 74X series benchmarks. GNN-RE achieves an average accuracy of 98.82% in terms of mapping individual gates to modules, all without any manual intervention or post-processing. We also release our code and source data.
- Published
- 2022
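Entry 14 initializes each gate node with a feature vector and lets it aggregate information from neighboring gates. A minimal NumPy sketch of that message-passing idea on a toy gate-level graph (mean aggregation over neighbors followed by a linear projection); GNN-RE's actual architecture, gate features, and training procedure are not reproduced here, and the graph and weights below are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy netlist graph: 6 gates, symmetric adjacency (1 = wired together).
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
X = rng.normal(size=(6, 4))   # per-gate feature vectors (e.g. gate type, fan-in/out)

def gnn_layer(A, H, W):
    """One message-passing layer: average self + neighbor features, then project (ReLU)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))       # row-normalize the aggregation
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)

W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))                       # 3 hypothetical sub-circuit classes
H = gnn_layer(A, X, W1)
logits = gnn_layer(A, H, W2)
print(logits.argmax(axis=1))   # per-gate module assignment (untrained, illustration only)
```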
15. Adaptive Fusion CNN Features for RGBT Object Tracking
- Author
-
Huanlong Zhang, Xian Wei, Yong Wang, Xuan Tang, and Hao Shen
- Subjects
Source data, Modality, Convolutional neural network, Minimum bounding box, Video tracking, RGB color model, Computer vision, Intelligent transportation system, Mechanical Engineering, Automotive Engineering, Computer Science Applications
- Abstract
Thermal sensors play an important role in intelligent transportation systems. This paper studies the problem of RGB and thermal (RGBT) tracking in challenging situations by leveraging multimodal data. An RGBT object tracking method is proposed within a correlation filter tracking framework based on short-term historical information. Given the initial object bounding box, a hierarchical convolutional neural network (CNN) is employed to extract features. The target is tracked in the RGB and thermal modalities separately, and backward tracking is then implemented in the two modalities. The difference between each forward-backward pair is computed as an indicator of the tracking quality in each modality. Considering the temporal continuity of sequence frames, we also incorporate the history data into the weight computation to achieve a robust fusion of the different source data. Experiments on three RGBT datasets show that the proposed method achieves results comparable to state-of-the-art methods.
- Published
- 2022
16. MM-UrbanFAC: Urban Functional Area Classification Model Based on Multimodal Machine Learning
- Author
-
Xiujuan Xu, Xiaowei Zhao, Yu Liu, Yulin Bai, and Yuzhi Sun
- Subjects
Feature engineering, Source data, Artificial neural network, Machine learning, Tree (data structure), Probability distribution, Classifier, Mechanical Engineering, Automotive Engineering, Computer Science Applications
- Abstract
Most current methods for classifying urban functional areas are based on single-source data analysis and modeling, and cannot make full use of the multi-scale, multi-source data that is now easy to obtain. This paper therefore proposes MM-UrbanFAC, a classification model for urban functional areas based on multimodal machine learning. By analyzing regional remote sensing images and visitor behavior data for an area, supervised methods are combined to extract deep features and relationships from each kind of data, and the overall and local features of the data are filtered and merged. The model uses a dual-branch neural network combining SE-ResNeXt and a Dual Path Network (DPN) to automatically mine and fuse the overall characteristics of the multi-source data, and uses designed feature engineering to deeply mine user behavior data for additional association information. A Gradient Boosting Decision Tree-based algorithm then learns features at different levels and produces classification probabilities for each level, and a further Gradient Boosting Decision Tree learns the probability distributions of the different levels to obtain the final prediction of the urban functional area class. Analysis and experimental verification on real data sets show that the MM-UrbanFAC model can effectively integrate the features of multimodal data. Compared with a single classifier, the integration framework based on gradient boosting trees improves prediction performance; the method can effectively integrate the results of multiple models and accurately classify urban functional areas, and it can inform tourism recommendation, urban land planning, and urban construction.
- Published
- 2022
17. Cross-Corpus Speech Emotion Recognition Based on Joint Transfer Subspace Learning and Regression
- Author
-
Wenjing Zhang, Weijian Zhang, Chao Sheng, Dongliang Chen, and Peng Song
- Subjects
Source data, Discriminative model, Speech recognition, Metric (mathematics), Graph (abstract data type), Generalizability theory, Transfer of learning, Test data, Artificial Intelligence, Software
- Abstract
Speech emotion recognition has become an attractive research topic due to various emotional states of speech signals in real-life scenarios. Most current speech emotion recognition methods are carried out on a single corpus. However, in practice, the training and testing data often come from different domains, e.g., different corpora. In this case, the model generalizability and recognition performance would decrease greatly due to the domain mismatch. To address this challenging problem, we present a transfer learning method, called joint transfer subspace learning and regression (JTSLR), for cross-corpus speech emotion recognition. Specifically, JTSLR performs transfer subspace learning and regression in a joint framework. First, we learn a latent subspace by introducing a discriminative maximum mean discrepancy (MMD) as the discrepancy metric. Then, we put forward a regression function in this latent subspace to describe the relationships between features and corresponding labels. Moreover, we present a label graph to help transfer knowledge from relevant source data to target data. Finally, we conduct extensive experiments on three popular emotional datasets. The results show that our method can outperform traditional methods and some state-of-the-art transfer learning algorithms for cross-corpus speech emotion recognition tasks.
- Published
- 2022
18. AdaDC: Adaptive Deep Clustering for Unsupervised Domain Adaptation in Person Re-Identification
- Author
-
Zhilan Hu, Shihua Li, Jie Chen, and Mingkuan Yuan
- Subjects
Domain adaptation, Source data, Generalization, Overfitting, Machine learning, Cluster analysis, Media Technology, Electrical and Electronic Engineering
- Abstract
Unsupervised domain adaptation (UDA) in person re-identification (re-ID) is a challenging task, aiming to learn a model from labeled source data and unlabeled target data so as to recognize the same person in the target domain across different cameras. Recently, many popular and promising clustering-based methods have been proposed for this task and have achieved sizable progress. However, without target labels, the clustering algorithms in those methods inevitably produce noisy pseudo-labels, and overfitting these noisy labels is severely harmful to the performance and generalization of the models. To address these issues, we propose a novel framework, Adaptive Deep Clustering (AdaDC), to reduce the negative impact of noisy pseudo-labels. On one hand, the proposed approach employs different clustering methods adaptively and alternately to fully exploit their complementary information and avoid overfitting noisy pseudo-labels. On the other hand, a progressive sample selection strategy reduces the noisy-label ratio in the pseudo-labels by integrating different clustering results. Experiments show that the proposed approach achieves state-of-the-art performance compared to other recent UDA person re-ID methods on widely used datasets, and additional analysis experiments verify the effectiveness of the proposed approach.
- Published
- 2022
19. Stackelberg game for price and power control in secure cooperative relay network
- Author
-
Khyati Chopra, Ranjan Bose, and Anupam Joshi
- Subjects
probability, game theory, relay networks (telecommunication), cooperative communication, diversity reception, power control, telecommunication security, Stackelberg game, price, threshold-based relay network, source information, source profits, source data, threshold-based relaying, allocated maximum power constraints, relay node, Stackelberg security game scheme, leader–follower security equilibrium game model, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
In this study, the authors propose a scheme based on Stackelberg game for price and power control in threshold-based relay network, where the source transmits message to destination with the cooperation of a relay, in the presence of an eavesdropper. The relay gets revenue for transmitting the source information and the source profits from the cooperation secrecy rate. The source needs to pay for the amount of power allocated by the relay for transmission of source data to destination. Threshold-based relaying is considered, where the relay can correctly decode the message, only if the received signal satisfies the predetermined threshold. Maximal ratio combining scheme is employed at destination and eavesdropper to maximise the probability of secure transmission and successful eavesdropping, respectively. Unlike other works to date, the authors have maximised the utility of relay and source, with allocated maximum power constraints at the relay node, using Stackelberg security game scheme. The closed-form optimal solution for price and power control is obtained, for this leader–follower security equilibrium game model, with the help of convex optimisation.
- Published
- 2019
- Full Text
- View/download PDF
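Entry 19 derives a closed-form leader-follower solution for the secrecy-rate utilities of its system model. The sketch below only illustrates the Stackelberg game structure numerically, with made-up utility functions (relay revenue = price × power sold; source utility = an assumed concave benefit from purchased relay power minus the payment, under a maximum power constraint); it is not the paper's model or its closed-form result.

```python
import numpy as np

# Illustrative Stackelberg (leader-follower) pricing game, NOT the paper's model:
# the relay (leader) announces a unit price; the source (follower) buys relay
# power to maximize an assumed concave benefit minus its payment.
P_MAX = 10.0                                   # relay's maximum power (assumption)
powers = np.linspace(0.0, P_MAX, 1001)

def source_utility(power, price, a=4.0):
    return a * np.log1p(power) - price * power  # assumed benefit minus payment

def best_response(price):
    """Follower's optimal purchased power for a given announced price."""
    return powers[np.argmax(source_utility(powers, price))]

# Leader solves by backward induction: anticipate the follower's best response.
prices = np.linspace(0.01, 5.0, 500)
revenues = [p * best_response(p) for p in prices]
best = int(np.argmax(revenues))
print(f"leader price ~ {prices[best]:.2f}, follower power ~ {best_response(prices[best]):.2f}")
```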
20. Défis méthodologiques de la programmation des cours en langues sur objectifs spécifiques [Methodological challenges in planning courses in languages for specific purposes].
- Author
-
SOWA, MAGDALENA
- Abstract
The aim of the text is to discuss issues related to planning LSP courses. The key stages of course design are understandable and clear to most LSP teachers. However, their practical implementation can raise certain doubts or questions. The paper attempts to analyse such problematic aspects of LSP course planning in detail. We will situate these problems at particular stages during course design and show to what extent they can affect the success of LSP education. In our opinion, the diagnosis and analysis of these sensitive areas can be helpful not only in the effective planning of LSP training programmes but also in language teachers' education for professional purposes.
- Published
- 2019
- Full Text
- View/download PDF
21. A data-centric review of deep transfer learning with applications to text data
- Author
-
Kalifa Manjang, Frank Emmert Streib, Samar Bashath, Matthias Dehmer, Shailesh Tripathi, Nadeesha Perera, Tampere University, and Computing Sciences
- Subjects
Information Systems and Management, Source data, Deep learning, Supervised learning, Machine learning, Transfer of learning, Terminology, Consistency (database systems), Feature (machine learning), Test data, Computer Science Applications, Theoretical Computer Science, Artificial Intelligence, Control and Systems Engineering, Software
- Abstract
In recent years, many applications have used various forms of deep learning models. Such methods are usually based on traditional learning paradigms that require consistency between the feature spaces of the training and test data, as well as the availability of large amounts of training data, e.g., for supervised learning tasks. However, many real-world data do not adhere to these assumptions. In such situations transfer learning can provide feasible solutions, e.g., by simultaneously learning from data-rich source data and data-sparse target data to transfer information for learning a target task. In this paper, we survey deep transfer learning models with a focus on applications to text data. First, we review the terminology used in the literature and introduce a new nomenclature allowing the unequivocal description of a transfer learning model. Second, we introduce a visual taxonomy of deep learning approaches that provides a systematic structure for the many diverse models introduced to date. Furthermore, we provide comprehensive information about the text data that have been used for studying such models, because only by applying methods to data can performance measures be estimated and models assessed.
- Published
- 2022
22. The completeness and accuracy of DataDerm: The database of the American Academy of Dermatology
- Author
-
Margo J. Reeder, Barbara Mathes, Matthew Fitzgerald, Arik Aninos, Toni Kaye, Jeffrey P. Jacobs, Robert A. Swerlick, Marta VanBeek, and Caryn D. Etkin
- Subjects
Source data, Databases, Factual, Database, Data Collection, Data field, Dermatology, Audit, External auditor, Clinical registry, Registries, Medical diagnosis, Completeness (statistics), Academies and Institutes, United States, Humans
- Abstract
The utility of any database or registry depends on the completeness and accuracy of the data it contains. This report documents the validity of data elements within DataDerm, the clinical registry database of the American Academy of Dermatology. An external audit of DataDerm, performed by a third-party vendor, involved the manual review of 1098 individual patient charts from calendar year 2018 from 8 different dermatology practices that used 4 different electronic health records. At each site, 142 discrete data fields were assessed, comparing the data within DataDerm to the source data within the electronic health record. Audited data included 3 domains of data elements (diagnoses, medications, and procedures) and a performance measure ("Biopsy Reporting Time-Clinician to Patient"), which is 1 of several measures used by DataDerm as a Qualified Clinical Data Registry. Overall completeness of data was 95.3%, with a range among practices of 90.6% to 98.5%. Overall accuracy of data was 89.8%, with a range of accuracy among practices of 81.2% to 94.1%. These levels of completeness and accuracy exceed the rates in the literature for registries that are based on data that is extracted from electronic health records; and therefore, this audit validates the excellent quality of data in DataDerm.
- Published
- 2022
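The two audit metrics in entry 22 above reduce to simple ratios over the audited data fields. A minimal sketch follows; the records and field names are invented, and the denominators used here are one plausible definition since the abstract does not spell out the audit's exact formulas.

```python
# Illustration of the two audit metrics in entry 22: completeness (field populated
# in the registry) and accuracy (registry value matches the EHR source value).
# The toy records below are invented; the real audit compared 142 fields per site.
registry = {"diagnosis": "psoriasis", "medication": "methotrexate", "biopsy_days": None}
ehr      = {"diagnosis": "psoriasis", "medication": "adalimumab",  "biopsy_days": 12}

fields = list(ehr)
populated = [f for f in fields if registry[f] is not None]
matching  = [f for f in populated if registry[f] == ehr[f]]

completeness = len(populated) / len(fields)      # entry 22 reports 95.3% overall completeness
accuracy     = len(matching) / len(populated)    # entry 22 reports 89.8% overall accuracy
print(f"completeness = {completeness:.1%}, accuracy = {accuracy:.1%}")
```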
23. Multi-source unsupervised domain adaptation for object detection
- Author
-
Lin Xiong, Lihua Zhou, Yiguang Liu, Dan Zhang, and Mao Ye
- Subjects
Source data, Feature extraction, Pattern recognition, Object detection, Feature (computer vision), Domain knowledge, Multi-source, Artificial intelligence, Hardware and Architecture, Signal Processing, Software, Information Systems
- Abstract
Domain adaptation for object detection has been extensively studied in recent years. Most existing approaches focus on single-source unsupervised domain adaptive object detection. However, a more practical scenario is that the labeled source data is collected from multiple domains with different feature distributions. The conventional approaches do not work very well since multiple domain gaps exist. We propose a Multi-source domain Knowledge Transfer (MKT) method to handle this situation. First, the low-level features from multiple domains are aligned by learning a shallow feature extraction network. Then, the high-level features from each pair of source and target domains are aligned by the followed multi-branch network. After that, we perform two parts of information fusion: (1) We train a detection network shared by all branches based on the transferability of each source sample feature. The transferability of a source sample feature means the indistinguishable degree to the target domain sample features. (2) For using our model, the target sample features output by the multi-branch network are fused based on the average transferability of each domain. Moreover, we leverage both image-level and instance-level attention to promote positive cross-domain transfer and suppress negative transfer. Our main contributions are the two-stage feature alignments and information fusion. Extensive experimental results on various transfer scenarios show that our method achieves the state-of-the-art performance.
- Published
- 2022
24. Context-Based Multiscale Unified Network for Missing Data Reconstruction in Remote Sensing Images
- Author
-
Jiancheng Luo, Mingwen Shao, Tianjun Wu, Chao Wang, and Deyu Meng
- Subjects
Source data, Remote sensing, Missing data, Context based, Convolutional neural network, Electrical and Electronic Engineering, Geotechnical Engineering and Engineering Geology
- Abstract
Missing data reconstruction is a classical yet challenging problem in remote sensing images. Most current methods based on traditional convolutional neural network require supplementary data and can only handle one specific task. To address these limitations, we propose a novel generative adversarial network-based missing data reconstruction method in this letter, which is capable of various reconstruction tasks given only single source data as input. Two auxiliary patch-based discriminators are deployed to impose additional constraints on the local and global regions, respectively. In order to better fit the nature of remote sensing images, we introduce special convolutions and attention mechanism in a two-stage generator, thereby benefiting the tradeoff between accuracy and efficiency. Combining with perceptual and multiscale adversarial losses, the proposed model can produce coherent structure with better details. Qualitative and quantitative experiments demonstrate the uncompromising performance of the proposed model against multisource methods in generating visually plausible reconstruction results. Moreover, further exploration shows a promising way for the proposed model to utilize spatio-spectral-temporal information. The codes and models are available at https://github.com/Oliiveralien/Inpainting-on-RSI.
- Published
- 2022
25. M5L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
- Author
-
Zhengzheng Tu, Jin Tang, Chun Lin, Wei Zhao, and Chenglong Li
- Subjects
Ground truth, Source data, Pattern recognition, Margin (machine learning), Robustness, Metric (mathematics), Feature (machine learning), Artificial intelligence, Computer Graphics and Computer-Aided Design, Software
- Abstract
Classifying hard samples in the course of RGBT tracking is a quite challenging problem. Existing methods only focus on enlarging the boundary between positive and negative samples but ignore the relations among multilevel hard samples, which are crucial for the robustness of hard-sample classification. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M5L, for RGBT tracking. In particular, we divide all samples into four groups (normal positive, normal negative, hard positive, and hard negative) and aim to leverage their relations to improve the robustness of feature embeddings; e.g., normal positive samples are closer to the ground truth than hard positive ones. To this end, we design a multi-modal multi-margin structural loss to preserve the relations of multilevel hard samples in the training stage. In addition, we introduce an attention-based fusion module to achieve quality-aware integration of the different source data. Extensive experiments on large-scale datasets show that our framework clearly improves tracking performance and performs favorably against state-of-the-art RGBT trackers.
- Published
- 2022
26. Attention-Based Polarimetric Feature Selection Convolutional Network for PolSAR Image Classification
- Author
-
Hongwei Dong, Lamei Zhang, Bin Zou, and Da Lu
- Subjects
Source data, Contextual image classification, Data classification, Polarimetry, Pattern recognition, Feature selection, Convolutional neural network, Feature (computer vision), Classifier, Artificial intelligence, Geotechnical Engineering and Engineering Geology, Electrical and Electronic Engineering
- Abstract
Noting the fact that the high-dimensional data composed of various polarimetric features has better decouplability than polarimetric synthetic aperture radar (PolSAR) image source data, in this letter, multiple polarimetric features are extracted and stacked to form a high-dimensional feature cube as the input of convolutional neural networks (CNNs) to improve the performance of PolSAR image classification. Directly utilizing the polarimetric features will produce a performance degradation and the recalibration of them is indispensable. However, classical feature selection methods are independent of the classifier, which means that the stimulated features may not be the classification-friendly ones. To avoid separated procedures and improve the performance, attention-based polarimetric feature selection convolutional network, called AFS-CNN, is proposed to implement end-to-end feature selection and classification. The relationship between input polarimetric features can be captured and embedded through attention-based architecture to ensure the validity of high-dimensional data classification. Experiments on two PolSAR benchmark data sets verify the performance of the proposed method. Furthermore, this work is quite flexible, which is reflected in that the proposal can be used as a plug-and-play component of any CNN-based PolSAR classifier.
- Published
- 2022
27. Dual-Pathway Change Detection Network Based on the Adaptive Fusion Module
- Author
-
Xiaofan Jiang, Peng Tang, Shao Xiang, and Mi Wang
- Subjects
Fusion, Source data, Deep learning, Feature (computer vision), Data mining, Change detection, Artificial intelligence, Geotechnical Engineering and Engineering Geology, Electrical and Electronic Engineering
- Abstract
In recent years, with the development of high-resolution remote sensing (RS) images and deep learning technology, high-quality source data and state-of-the-art methods have become increasingly available, and great progress has been made in change detection (CD) in RS fields. However, existing methods still suffer from weak network feature representation and poor CD performance. To address these problems, we propose a novel CD network, called dual-pathway CD network (DP-CD-Net), which can help enhance feature representation and achieve a more accurate difference map. The proposed method contains a dual-pathway feature difference network (FDN), an adaptive fusion module (AFM), and an auxiliary supervision strategy. Dual-pathway FDNs can effectively enhance feature representation by supplementing the detailed information from the encoding layers. Then, we use the AFM method to fuse the difference maps. To solve the problem of training difficulty, we use the auxiliary supervision strategy to improve the performance of DP-CD-Net. We conduct extensive experiments to validate the performance of the proposed method on the LEVIR-CD dataset. The results demonstrate that the proposed method performs better than existing methods.
- Published
- 2022
28. Deblending of Seismic Data Based on Neural Network Trained in the CSG
- Author
-
Kunxi Wang and Tianyue Hu
- Subjects
Source data, Artificial neural network, Generalization, Noise, Data acquisition, Convergence, Test phase, General Earth and Planetary Sciences, Electrical and Electronic Engineering
- Abstract
The simultaneous source acquisition method, which excites multiple sources in a narrow time interval, can greatly improve the efficiency of seismic data acquisition and provide good illumination. However, the simultaneous source data, also known as the blended data, contain the crosstalk noise from other sources, which brings trouble to the subsequent processing flow. Therefore, an effective deblending method for the simultaneous source data is needed. In order to suppress crosstalk noise, an iterative deblending method using a deep neural network (DNN) trained in the common shot gather (CSG) is developed in this paper with the double-blended simultaneous source (DBSS) data being the input data and the blended CSG data being the label data. The proposed training method can not only solve the problem of difficult acquisition of the label data, but also make the DNN applicable to any complex formation conditions without considering whether the DNN has the generalization ability of deblending in different work areas. In the test phase, the trained DNN is embedded into the iterative separation framework to deblend the data in the common receiver gather (CRG), which can achieve convergence in a few iterations and achieve a better separation effect. The synthetic and field data examples are tested to verify that the proposed method can effectively suppress the crosstalk noise when deblending the simultaneous source data.
- Published
- 2022
29. ASH Research Collaborative: a real-world data infrastructure to support real-world evidence development and learning healthcare systems in hematology
- Author
-
Peter W. Marks, Ann T. Farrell, Charles S. Abrams, Kenneth C. Anderson, Brendan K. Dolan, William A. Wood, Sam Walters, Matthew Gertzog, Donna Neuberg, Alexis A. Thompson, Emily Tucker, Kathleen Hewitt, Paul G. Kluetz, Gregory Pappas, Robert M. Plovnick, and Donna Rivera
- Subjects
Knowledge management, Source data, Electronic data capture, Interoperability, Stakeholder, Hematology, Learning Health System, Data modeling, Clinical trial, Health care, Electronic Health Records, Humans, Delivery of Health Care, Data hub
- Abstract
The ASH Research Collaborative is a nonprofit organization established through the American Society of Hematology’s commitment to patients with hematologic conditions and the science that informs clinical care and future therapies. The ASH Research Collaborative houses 2 major initiatives: (1) the Data Hub and (2) the Clinical Trials Network (CTN). The Data Hub is a program for hematologic diseases in which networks of clinical care delivery sites are developed in specific disease areas, with individual patient data contributed through electronic health record (EHR) integration, direct data entry through electronic data capture, and external data sources. Disease-specific data models are constructed so that data can be assembled into analytic datasets and used to enhance clinical care through dashboards and other mechanisms. Initial models have been built in multiple myeloma (MM) and sickle cell disease (SCD) using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and Fast Healthcare Interoperability Resources (FHIR) standards. The Data Hub also provides a framework for development of disease-specific learning communities (LC) and testing of health care delivery strategies. The ASH Research Collaborative SCD CTN is a clinical trials accelerator that creates efficiencies in the execution of multicenter clinical trials and has been initially developed for SCD. Both components are operational, with the Data Hub actively aggregating source data and the SCD CTN reviewing study candidates. This manuscript describes processes involved in developing core features of the ASH Research Collaborative to inform the stakeholder community in preparation for expansion to additional disease areas.
- Published
- 2021
30. Von Neumann Entropy Controlled Reduction of Quantum Representations for Weather Data Fusion and Decision-Making
- Author
-
Weimin Peng and Aihong Chen
- Subjects
Source data, Von Neumann entropy, Quantum state, Sensor fusion, Reliability, Reduction, Algorithm, Computer Networks and Communications, Computer Science Applications, Control and Systems Engineering, Electrical and Electronic Engineering, Information Systems
- Abstract
To avoid the risk of information loss caused by quantum measurement and to obtain concise and reliable fusion results from a given source weather dataset for decision-making, this article transforms the source (weather) dataset into a collection of quantum states and proposes a weather data detection and fusion method based on the von Neumann entropy-controlled reduction of quantum representations. The reduction of the quantum representations within a data vector depends on the von Neumann entropy of that data vector (see the sketch after this entry). All source data samples (units) are then classified into different subsets according to the gaps between the data samples recombined from the quantum elements after reduction. Depending on the impact factors of the source data samples, the source data samples in a subset are fused into a new object data sample. After that, the reliabilities of all object data samples are evaluated through a predefined reliability evaluation model, and the object data samples with high reliabilities are useful for decision-making-oriented applications (such as weather prediction). The experimental results demonstrate that the proposed data fusion method is more effective at obtaining concise and reliable fusion results for decision-making-oriented applications than the other data fusion methods considered.
- Published
- 2021
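Entry 30 gates the reduction of a quantum representation on the von Neumann entropy of the corresponding data vector. As a purely illustrative sketch, the entropy S(ρ) = -Tr(ρ log ρ) can be computed from the eigenvalues of a density matrix; here a diagonal density matrix is built from a normalized data vector, which is an assumption of this sketch and not the authors' actual encoding of weather data into quantum states.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log rho), computed from the eigenvalues of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # 0 * log 0 is taken as 0
    return float(-np.sum(evals * np.log(evals)))

# Assumption for illustration only: turn a weather data vector into a diagonal
# (mixed-state) density matrix whose probabilities are the normalized magnitudes.
x = np.array([3.0, 1.0, 0.5, 0.5])        # e.g. four readings from one sample
p = np.abs(x) / np.abs(x).sum()
rho = np.diag(p)

print(f"von Neumann entropy = {von_neumann_entropy(rho):.3f} nats")
# Low entropy: the representation is dominated by few components and can be
# reduced more aggressively; high entropy: reduction risks information loss.
```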
31. Cross-Domain Scene Classification by Integrating Multiple Incomplete Sources
- Author
-
Xiaoqiang Lu, Tengfei Gong, and Xiangtao Zheng
- Subjects
Domain adaptation, Source data, Pattern recognition, Data set, Binary classification, Labeled data, Artificial intelligence, General Earth and Planetary Sciences, Electrical and Electronic Engineering
- Abstract
Cross-domain scene classification identifies scene categories by transferring knowledge from a labeled data set (source domain) to an unlabeled data set (target domain), where the source data and the target data are sampled from different distributions. Many domain adaptation methods are used to reduce the distribution shift across domains, and most existing methods assume that the source domain shares the same categories as the target domain. It is usually hard to find a single source domain that covers all categories in the target domain, so some works exploit multiple incomplete source domains to cover the target domain. In such a setting, however, the categories of each source domain are a subset of the target-domain categories, and the target domain contains "unknown" categories for each source domain. The existence of unknown categories makes conventional domain adaptation unsuitable: known and unknown categories should be treated separately. Therefore, a separation mechanism is proposed in this article to separate the known and unknown categories. First, multiple source classifiers trained on the multiple source domains are used to coarsely separate the known and unknown categories in the target domain: target images with high similarity to source images are selected as known categories, and target images with low similarity are selected as unknown categories. Then, a binary classifier trained on the selected images is used to finely separate all target-domain images. Finally, only the known categories are used in cross-domain alignment and classification, and the target images receive labels by integrating the hypotheses of the multiple source classifiers on the known categories. Experiments are conducted on three cross-domain data sets to demonstrate the effectiveness of the proposed method.
- Published
- 2021
32. Revisions to the Siraya lexicon based on the original Utrecht Manuscript
- Author
-
Christopher Joby
- Subjects
Linguistics and Language, History, Source data, Historiography, Lexicon, Variety (linguistics), Language and Linguistics, Linguistics
- Abstract
Summary Linguistic historiography analyzes how linguistic knowledge has been acquired, stored, used and diffused. This article examines what can happen if linguists rely on copies of source data rather than the source data itself. It takes as a case study linguistic data from Siraya, a now-extinct Formosan language. Documents compiled in the seventeenth century by Dutch missionaries in Taiwan form a significant source of data for Siraya. One such document, a wordlist known as the Utrecht Manuscript (UM), is the principal source for the lexicon of one variety of Siraya, “Siraya Proper”. It has been published three times. Each edition, however, contains many errors. These editions, rather than the manuscript, have been used by scholars investigating Siraya. This article aims to correct errors in the editions and secondary literature on the UM with my readings of the manuscript itself. It therefore presents a more accurate record of the lexicon of “Siraya Proper” as well as illustrating the importance of using primary rather than secondary sources of linguistic data. Finally, it introduces an online edition of the UM, which will provide scholars and language revivalists with a useful resource for this lexicon.
- Published
- 2021
33. Improving suicide risk prediction via targeted data fusion: proof of concept using medical claims data
- Author
-
Yan Li, Robert H. Aseltine, Chang Su, Steven C. Rogers, Fei Wang, Kun Chen, and Wanwan Xu
- Subjects
Source data, Health Informatics, Machine learning, Sensor fusion, Suicidal Ideation, Suicide, Predictive Value of Tests, Risk Factors, Proof of concept, Diagnosis code, Transfer of learning, Humans, Child, Delivery of Health Care, Research and Applications
- Abstract
Objective Reducing suicidal behavior among patients in the healthcare system requires accurate and explainable predictive models of suicide risk across diverse healthcare settings. Materials and Methods We proposed a general targeted fusion learning framework that can be used to build a tailored risk prediction model for any specific healthcare setting, drawing on information fusion from a separate more comprehensive dataset with indirect sample linkage through patient similarities. As a proof of concept, we predicted suicide-related hospitalizations for pediatric patients in a limited statewide Hospital Inpatient Discharge Dataset (HIDD) fused with a more comprehensive medical All-Payer Claims Database (APCD) from Connecticut. Results We built a suicide risk prediction model for the source data (APCD) and calculated patient risk scores. Patient similarity scores between patients in the source and target (HIDD) datasets using their demographic characteristics and diagnosis codes were assessed. A fused risk score was generated for each patient in the target dataset using our proposed targeted fusion framework. With this model, the averaged sensitivities at 90% and 95% specificity improved by 67% and 171%, and the positive predictive values for the combined fusion model improved 64% and 135% compared to the conventional model. Discussion and Conclusions We proposed a general targeted fusion learning framework that can be used to build a tailored predictive model for any specific healthcare setting. Results from this study suggest we can improve the performance of predictive models in specific target settings without complete integration of the raw records from external data sources.
- Published
- 2021
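As a rough illustration of the fusion step described in the abstract above, the sketch below blends a target patient's own risk estimate with a similarity-weighted average of risk scores borrowed from the most similar source patients. This is a minimal sketch under assumed interfaces (cosine similarity over demographic/diagnosis feature vectors, a fixed blending weight alpha, hypothetical function names); it is not the authors' exact targeted-fusion algorithm.

```python
# Illustrative sketch only -- not the published targeted-fusion method.
import numpy as np

def similarity(target_patient, source_patients):
    """Cosine similarity between one target patient's feature vector and
    every source patient's feature vector (demographics + diagnosis codes)."""
    num = source_patients @ target_patient
    den = (np.linalg.norm(source_patients, axis=1) *
           np.linalg.norm(target_patient) + 1e-12)
    return num / den

def fused_risk(target_patient, target_risk, source_patients, source_risks,
               alpha=0.5, k=50):
    """Blend the target model's own risk estimate with a similarity-weighted
    average of risk scores borrowed from the k most similar source patients."""
    w = similarity(target_patient, source_patients)
    top = np.argsort(w)[-k:]                      # indices of nearest source patients
    weights = np.clip(w[top], 0.0, None) + 1e-12  # keep weights non-negative
    borrowed = np.average(np.asarray(source_risks)[top], weights=weights)
    return alpha * target_risk + (1 - alpha) * borrowed
```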
34. Broadband seismic source data acquisition and processing to delineate iron oxide deposits in the Blötberget mine, central Sweden
- Author
-
Jordan Bos, Richard de Kunder, Tatiana Pertuz, Alireza Malehmir, Ding Yinshuai, Paul Marsden, and Bojan Brodic
- Subjects
Source data, Geophysics (Geofysik), Geochemistry, Iron oxide, Broadband, Processing, Mineral exploration, Geochemistry and Petrology, Seismic source, Geology - Abstract
A prototype electromagnetic vibrator, referred to here as E-Vib, was upgraded and developed for broadband hardrock and mineral exploration seismic surveys. We selected the iron oxide mine in Blötberget, central Sweden, as a test site for the newly developed E-Vib in 2019 because earlier seismic datasets (from 2015 and 2016) were available to verify its performance for hardrock imaging. The two-dimensional data acquisition consisted of a fixed geometry with 550 receiver locations spaced every 5 m, employing both cabled and wireless seismic recorders, along an approximately 2.7 km long profile. The E-Vib operated at every second receiver station (i.e. 10 m spacing) with a linear sweep of 2–180 Hz and a peak force of 7 kN. The processing workflow took advantage of the broadband signal generated by the E-Vib in this challenging hardrock environment with varying ground conditions. The processed seismic section shows a set of reflections associated with the known iron oxide mineralization and a major crosscutting reflection interpreted as a fault system that likely crosscuts the mineralization. The broadband source data acquisition and subsequent processing improved signal quality and resolution compared with the earlier workflows and data, where a drop-hammer was used as the seismic source. These results suggest new possibilities for the E-Vib source for improved targeting in hardrock geological settings.
- Published
- 2021
35. Accuracy Comparison of Various Supercomputer Job Management System Models
- Author
-
D. S. Lyakhovets and A. V. Baranov
- Subjects
Job scheduler, Measure (data warehouse), Source data, General Mathematics, Job management, Supercomputer, Euclidean distance, Software, Computer engineering, Key (cryptography), Mathematics - Abstract
Supercomputer job management systems (JMS) are complex software systems with many parameters and settings. Various simulation methods have been used to explore the impact of these parameters on JMS efficiency metrics. At the same time, evaluating the accuracy (adequacy) of the applied JMS models is a key issue. This paper presents the results of adequacy-measurement experiments for various JMS models, including simulation with virtual supercomputing nodes and with the Alea job scheduling simulator. The JMS SUPPZ, operating at the Joint Supercomputer Center of the Russian Academy of Sciences (JSCC RAS), was used for the experiments. The source data for the simulations were derived from the statistics of the MVS-10P OP supercomputer installed at JSCC RAS. The normalized Euclidean distance between the job residence (turnaround) time vectors, obtained from the job streams of the real supercomputer and of each JMS model, was used as the measure of adequacy. The experimental results confirmed intuitive expectations about the accuracy of the studied simulation methods, which allows the normalized Euclidean distance between job turnaround time vectors to be used as a measure of the adequacy of various JMS models.
- Published
- 2021
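The adequacy measure named in the abstract above reduces to a straightforward vector computation. The sketch below is a minimal illustration that assumes a range (min-max) normalization over the real job stream; the exact normalization used by the authors is not specified here.

```python
# Minimal sketch of a normalized Euclidean distance between the turnaround-time
# vectors of the real system and of a JMS model (normalization is an assumption).
import numpy as np

def normalized_euclidean_distance(real_turnaround, model_turnaround):
    real = np.asarray(real_turnaround, dtype=float)
    model = np.asarray(model_turnaround, dtype=float)
    # Scale both vectors by the range observed on the real system so the
    # distance is comparable across job streams of different durations.
    span = float(real.max() - real.min())
    if span == 0.0:
        span = 1.0
    diff = (real - model) / span
    return np.sqrt(np.mean(diff ** 2))

# Example: the smaller the distance, the more adequate the model.
d = normalized_euclidean_distance([3600, 120, 900], [3500, 150, 1000])
```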
36. A data-driven intelligent planning model for UAVs routing networks in mobile Internet of Things
- Author
-
Dian Meng, Xinting Lu, Lanxia Qin, Qiao Xiang, Zhiwei Guo, Alireza Jolfaei, and Yang Xiao
- Subjects
Source data, Operations research, Situation awareness, Computer Networks and Communications, Computer science, Wireless, Radio repeater, Routing (electronic design automation), Bridge (nautical), Drone, Data-driven - Abstract
Owing to the constant progress of wireless communications, Unmanned Aerial Vehicle routing networks (UAVs-RN) under the mobile Internet of Things (MIoT) have become prevalent tools for dealing with natural emergencies. However, achieving effective response and proper utility remains a challenging task: the multi-source data of UAVs-RN must be analyzed so that optimal planning schemes under MIoT can be found. To bridge this gap, this work considers three factors: rapid response, finite budget, and uncertain signal fading. Accordingly, a data-driven intelligent planning model for UAVs-RN under MIoT is put forward in this paper. Data on wildfires in local areas of Australia are selected to build the experimental scenarios, and two kinds of UAVs, Surveillance and Situational Awareness drones and Radio Repeater drones, are considered in this study. First, the source data are visualized and the internal trends are analyzed to verify their validity. Then, a multi-objective planning model is established to aggregate the multi-source data. Finally, a case study on real-world data is investigated in depth to assess the proposed approach and suggest feasible planning schemes.
- Published
- 2021
37. Influence of Job Runtime Prediction on Scheduling Quality
- Author
-
G. I. Savin, Dmitriy Lyakhovets, and A. V. Baranov
- Subjects
Source data, Computer engineering, General Mathematics, Quality (business), Supercomputer, Job management, Queue, Wait time, Mathematics, Scheduling (computing), Runtime prediction - Abstract
A common problem in multiuser supercomputer systems is inaccurate user walltime requests for jobs; usually, this time is significantly overestimated by the user. The scheduling algorithms implemented in modern supercomputer job management systems (JMS) are quite complicated and have many settings, so the influence of walltime overestimation on scheduling efficiency is not obvious. Simulating job walltime prediction makes it possible to estimate this influence. This article presents simulation results for the case when the prediction accuracy reaches 100%. The JMS referred to as SUPPZ, operated at the Joint Supercomputer Center of the RAS (JSCC RAS), was used for our simulations. Actual statistics from the MVS-10P OP supercomputer installed at JSCC RAS were used as the source data for the simulation. In this paper we used the SUPPZ simulation with virtual supercomputing nodes because it offers the highest accuracy. The simulation results showed a noticeable improvement in metrics such as average job wait time in the queue, average queue length, and average slowdown.
- Published
- 2021
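For reference, the scheduling-quality metrics mentioned in the abstract above can be computed directly from per-job records, as in the following sketch. The field names and record format are assumptions for illustration, not the SUPPZ/JSCC RAS log format.

```python
# Hedged sketch of two of the metrics named in the abstract: average wait time
# and average slowdown, computed from hypothetical per-job records.
from dataclasses import dataclass

@dataclass
class Job:
    submit: float   # submission time, seconds
    start: float    # start of execution
    finish: float   # end of execution

def average_wait_time(jobs):
    return sum(j.start - j.submit for j in jobs) / len(jobs)

def average_slowdown(jobs):
    # slowdown = turnaround time / run time (>= 1; lower is better)
    return sum((j.finish - j.submit) / max(j.finish - j.start, 1.0)
               for j in jobs) / len(jobs)
```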
38. Multisource Heterogeneous Unsupervised Domain Adaptation via Fuzzy Relation Neural Networks
- Author
-
Feng Liu, Jie Lu, and Guangquan Zhang
- Subjects
Domain adaptation, Source data, Artificial neural network, Computer science, Applied Mathematics, Feature extraction, Fuzzy set, Cognitive neuroscience of visual object recognition, Fuzzy logic, Computational Theory and Mathematics, Artificial Intelligence, Control and Systems Engineering, Data mining, Classifier (UML) - Abstract
In unsupervised domain adaptation (UDA), a classifier for a target domain is trained with labeled source data and unlabeled target data. Existing UDA methods assume that the source data come from a single source domain (the single-source scenario) or from multiple source domains whose feature spaces have the same dimension (homogeneous) but different distributions (the multi-homogeneous-source scenario). In the real world, however, a specific target domain may have multiple source domains of different dimensions (heterogeneous), which do not satisfy the assumption of existing UDA methods. To remove this assumption and move toward a realistic UDA problem, this article presents a shared-fuzzy-equivalence-relation neural network (SFERNN) for the multisource heterogeneous UDA problem. The SFERNN is a five-layer neural network containing c source branches and one target branch. The network structure of the SFERNN is first determined by a novel fuzzy relation called the multisource shared fuzzy equivalence relation. We then optimize the parameters of the SFERNN by minimizing the cross-entropy loss on the c source branches and the distributional discrepancy between each source branch and the target branch. Experiments across eight real-world datasets validate the SFERNN and demonstrate that it outperforms existing single-source heterogeneous UDA methods, especially when the target domain contains few data.
- Published
- 2021
39. Towards Cross-Environment Human Activity Recognition Based on Radar Without Source Data
- Author
-
Zhongping Cao, Guoli Wang, Xuemei Guo, and Zhenchang Li
- Subjects
Source data, Computer Networks and Communications, Computer science, Feature extraction, Aerospace Engineering, Machine learning, Activity recognition, Automotive Engineering, Feature (machine learning), Artificial intelligence, Electrical and Electronic Engineering, Radar, Transfer of learning, Adaptation (computer science), Cluster analysis - Abstract
Radar-based human activity recognition (HAR) finds various applications, such as assisted living and driver behavior monitoring. Because radar data are heavily environment-dependent, it is becoming increasingly important to develop a transfer learning mechanism that gives a radar-based HAR system the desired cross-environment adaptability. This paper concerns how a radar-based HAR system can adapt to a new environment without source data. To this end, we adopt the source hypothesis transfer learning architecture to build such an environment adaptation mechanism for cross-environment radar-based HAR. A challenging task in doing so is to develop a reliable self-supervised labeling strategy that generates pseudo labels for the unlabeled target data, which is crucial for learning a target-specific feature extractor responsible for environment adaptation. This paper presents a neighbor-aggregating-based labeling method and combines it with the existing clustering-based labeling method to perform the self-supervised labeling task. The logic behind our approach is that the two labeling methods are complementary, making use of both the local and the global structure of the adaptation data to supervise the labeling task. The two labeling methods are coordinated through a weighted combination, which improves the reliability of the generated labels. Experimental results on a public HAR dataset based on frequency modulated continuous wave (FMCW) radar demonstrate the effectiveness of our approach.
- Published
- 2021
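A minimal sketch of the weighted combination idea from the abstract above is given below: soft cluster assignments supply a global labeling signal, neighbor aggregation supplies a local one, and the two are blended with a fixed weight. The feature extractor, the weighting scheme, and the function names are assumptions for illustration rather than the paper's exact formulation.

```python
# Illustrative pseudo-labeling sketch combining a clustering-based (global)
# signal with a neighbor-aggregating (local) signal.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_labels(features, n_classes, k=10, beta=0.5):
    # L2-normalize features so dot products act like cosine similarities.
    features = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)

    # Global view: soft assignment from distances to cluster centroids.
    km = KMeans(n_clusters=n_classes, n_init=10).fit(features)
    d = np.linalg.norm(features[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    p_cluster = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    # Local view: aggregate the soft assignments of each sample's k nearest neighbors.
    sim = features @ features.T
    np.fill_diagonal(sim, -np.inf)                        # exclude self
    nn = np.argsort(sim, axis=1)[:, -k:]
    p_neighbor = p_cluster[nn].mean(axis=1)

    # Weighted combination of the two views yields the final pseudo labels.
    p = beta * p_cluster + (1 - beta) * p_neighbor
    return p.argmax(axis=1)
```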
40. Structure preservation adversarial network for visual domain adaptation
- Author
-
Min Meng, Jigang Wu, and Qiguang Chen
- Subjects
Structure (mathematical logic), Information Systems and Management, Source data, Computer science, Process (engineering), Feature vector, Sample (statistics), Space (commercial competition), Machine learning, Computer Science Applications, Theoretical Computer Science, Domain (software engineering), Artificial Intelligence, Control and Systems Engineering, Artificial intelligence, Transfer of learning, Software - Abstract
Domain adaptation has attracted attention by leveraging knowledge from well-labeled source data to facilitate unlabeled target learning tasks. Numerous research efforts have been devoted to extracting effective features by incorporating the pseudolabels of target data. However, the transferable knowledge reflected by intradomain structure, interdomain correlation and label supervision has scarcely been considered simultaneously. In this paper, we propose a novel structure preservation adversarial network with target reweighting (SPTR) for unsupervised domain adaptation, in which local structure consistencies and category-level semantic alignment are simultaneously considered in the adversarial learning framework. Based on the labeled and pseudolabeled samples, we attempt to align both global and category-level domain statistics from different domains and simultaneously enforce structural consistency from feature space to label space in the source and target domains. Furthermore, to suppress the influence of falsely labeled target samples, a novel and generalized sample reweighting strategy is developed to assign target samples with different levels of confidence, which fully explores the knowledge of the target distribution to benefit the semantic transfer process. The experimental results in three transfer learning scenarios demonstrate the superiority of our proposed method over other state-of-the-art domain adaptation algorithms.
- Published
- 2021
41. Online voltage consistency prediction of proton exchange membrane fuel cells using a machine learning method
- Author
-
Hongyang Liao, Chenghao Deng, Wanchao Shan, Jinrui Chen, Yuxiang He, Huicui Chen, Pucheng Pei, and Tong Zhang
- Subjects
Source data, Renewable Energy, Sustainability and the Environment, Computer science, Process (computing), Energy Engineering and Power Technology, Proton exchange membrane fuel cell, Condensed Matter Physics, Machine learning, Power (physics), Fuel Technology, Stack (abstract data type), Consistency (statistics), Sensitivity (control systems), Artificial intelligence, Voltage - Abstract
It is widely acknowledged that inconsistency among the cells of a proton exchange membrane fuel cell stack during operation is an important cause of fuel cell life decay. Existing studies mainly focus on qualitative analysis of the effects of operating parameters on fuel cell stack consistency; there is currently almost no quantitative research on predicting voltage consistency from operating parameters with machine learning methods. To address this, a three-dimensional model of a proton exchange membrane fuel cell stack with five single cells is established in this paper. The Computational Fluid Dynamics (CFD) method is used to provide the source data for the prediction model. After predicting the voltage consistency with several machine learning methods and comparing their accuracy on the simulation data, the integrated regression method based on the Gradient Boosting Decision Tree (GBDT) achieves the highest score (0.896) and is proposed for quickly predicting cell voltage consistency from operating parameters. The GBDT method was then verified against experimental data from a SUNRISE POWER fuel cell stack, where it achieved an accuracy score of 0.910, confirming the universality and accuracy of the method. The sensitivity of each operating parameter is evaluated; current density has the greatest influence on the predicted value, accounting for 0.40. The prediction of voltage consistency under different combinations of operating parameters can guide the optimization of structural parameters during fuel cell design and of operating parameters during fuel cell control.
- Published
- 2021
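The regression step described in the abstract above maps operating parameters to a voltage-consistency index. The sketch below is a minimal stand-in using scikit-learn's GradientBoostingRegressor on synthetic placeholder data; the real feature set, target definition, and ensemble configuration come from the CFD model and are not reproduced here.

```python
# Minimal sketch, assuming CFD-generated samples as a table of operating
# parameters with a voltage-consistency index as the target (placeholders here).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))                              # placeholder operating parameters
y = 0.6 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * rng.standard_normal(500)  # placeholder target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X_tr, y_tr)
print("R^2:", r2_score(y_te, model.predict(X_te)))
print("feature importances:", model.feature_importances_)  # rough sensitivity ranking
```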
42. Condition-Based Monitoring in Variable Machine Running Conditions Using Low-Level Knowledge Transfer With DNN
- Author
-
Seetaram Maurya, Chris K. Mechefske, Nishchal K. Verma, and Vikas Singh
- Subjects
Source data, Artificial neural network, Computer science, Feature extraction, Condition monitoring, Data set, Variable (computer science), Control and Systems Engineering, Intelligent maintenance system, Data mining, Electrical and Electronic Engineering, Test data - Abstract
Traditional machine learning methods assume that training and testing data come from the same machine running condition (MRC) and are drawn from the same distribution. In several real-time industrial applications, however, this assumption does not hold: traditional methods work satisfactorily in steady-state conditions but fail in time-varying conditions. To utilize time-varying data across variable MRCs, this article proposes a novel low-level knowledge transfer framework using a deep neural network (DNN) model for condition monitoring of machines in variable running conditions. Low-level features are extracted in the time, frequency, and time-frequency domains. These features are extracted from the source data to train the DNN, and the trained DNN parameters are then transferred to another DNN, which is modified according to the low-level features extracted from the target data. The proposed approach is validated through three case studies: 1) the air compressor acoustic dataset; 2) the Case Western Reserve University bearing dataset; and 3) the intelligent maintenance system bearing dataset. The prediction accuracies obtained for these case studies are as high as 100%, 93.07%, and 100%, respectively, with fivefold cross-validation, showing considerable improvement in prediction performance. Note to Practitioners: Condition-based monitoring (CBM) schemes are widely applicable to rotating machines in various industries, since such machines operate in tough working conditions and consequently suffer unpredicted failures. These unpredicted failures may cause serious accidents. CBM systems prevent such failures, which reduces equipment damage and hence increases machinery lifetime. Modern industrial systems are complex and generate huge amounts of data; these data can be collected using sensors, but deploying a large number of sensors to cover different yet similar kinds of faults is difficult and expensive, and additional sensors and circuits further increase the cost. In this article, the authors propose a novel low-level knowledge transfer framework using a DNN-based method for condition monitoring of machines in variable running conditions. Low-level features are extracted to reduce the computations of the DNN drastically while improving performance. The article also considers additional faults in the target domain, which is more practical in real-time applications. The proposed scheme is validated with three case studies on acoustic and vibration signatures.
- Published
- 2021
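A hedged sketch of the general idea (hand-crafted low-level features feeding a small DNN whose early layers are reused for the target running condition) is given below in PyTorch. The feature set, architecture, and choice of frozen layers are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative low-level knowledge transfer: reuse early-layer weights trained
# on source-condition features; only the new output layer is fine-tuned.
import torch
import torch.nn as nn

def low_level_features(signal: torch.Tensor) -> torch.Tensor:
    """A few simple time- and frequency-domain statistics per signal (batch, samples)."""
    spec = torch.fft.rfft(signal).abs()
    return torch.stack([signal.mean(-1), signal.std(-1),
                        signal.abs().max(-1).values, spec.mean(-1)], dim=-1)

def make_dnn(n_features=4, n_classes=3):
    return nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                         nn.Linear(32, 16), nn.ReLU(),
                         nn.Linear(16, n_classes))

source_dnn = make_dnn()
# ... train source_dnn on low_level_features of source-condition data ...

target_dnn = make_dnn(n_classes=4)                            # target may add fault classes
target_dnn[0].load_state_dict(source_dnn[0].state_dict())     # transfer hidden layer 1
target_dnn[2].load_state_dict(source_dnn[2].state_dict())     # transfer hidden layer 2
for p in list(target_dnn[0].parameters()) + list(target_dnn[2].parameters()):
    p.requires_grad = False                                   # fine-tune only the output layer
```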
43. Distance constraint between features for unsupervised domain adaptive person re-identification
- Author
-
Zhihao Li, Xinbo Gao, Zongyuan Liu, Bing Han, and Biao Hou
- Subjects
Source data, Computer science, Cognitive Neuroscience, Pattern recognition, Computer Science Applications, Domain (software engineering), Constraint (information theory), Identification (information), Artificial Intelligence, Face (geometry), Feature (machine learning), Artificial intelligence, Cluster analysis, Encoder - Abstract
Many high-performing person re-identification (re-ID) approaches face a common challenge: their performance degrades drastically when supervised models are generalized to a new domain. Researchers have recently proposed domain adaptive person re-ID methods to address this difficulty, but these methods directly leverage target-domain data in a way that generates many noisy pseudo labels. Hence, this paper proposes a distance constraint between features (DCF) method, which clusters a feature distribution fitted to the real target-domain data. We assemble different parts of one person for multi-scale self-supervised learning. After introducing domain invariance and designing inter-image and inter-class distance constraints to regulate the distances between target samples, the feature distribution extracted from the encoder trained on the source data can fit the real target data distribution, which gives our domain adaptive model more reliable clustering results and thus strong identification performance in the target domain. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods on three large-scale released datasets.
- Published
- 2021
44. Forager Mobility and Lithic Discard Probability Similarly Affect the Distance of Raw Material Discard from Source
- Author
-
Sam C. Lin and L. S. Premo
- Subjects
Stone tool, Archeology, History, Source data, Computer science, Museology, Equifinality, Raw material, Spatial distribution, Outcome (probability), Lithic technology, Arts and Humanities (miscellaneous), Statistics, Neutral model - Abstract
The neutral model of stone procurement developed by Brantingham (2003, 2006) provides a formal means to investigate the formation of lithic discard patterning under changing forager mobility conditions. This study modifies Brantingham's (2006) Lévy walk model to examine the influence of discard probability on the spatial distribution of raw material abundance. The model outcome shows that forager movement and tool discard probability have similar effects on the simulated patterns of raw material transport, so it is difficult, if not impossible, to differentiate the respective influence of the two factors from distance-to-source distributions alone. This finding of equifinality complicates the task of interpreting hominin mobility from archaeological distance-to-source data, particularly in settings such as the Middle-Upper Paleolithic transition, which is marked by an important reorganization of hominin lithic technology that may have affected stone tool discard probability.
- Published
- 2021
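To make the equifinality argument concrete, the toy sketch below simulates a forager on a Lévy walk who discards an item with a fixed per-step probability and records the discard distance from the source. The parameter values and step-length distribution are illustrative assumptions, not Brantingham's or the authors' exact model.

```python
# Toy Levy-walk discard simulation: similar distance-to-source distributions can
# arise from changing either mobility (mu) or discard probability (p_discard).
import numpy as np

def discard_distances(n_items=1000, mu=2.0, p_discard=0.05, seed=0):
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_items):
        pos = np.zeros(2)                        # start at the raw-material source
        while rng.random() > p_discard:          # carry the item until discard
            step = rng.pareto(mu - 1) + 1.0      # heavy-tailed (Levy-like) step length
            angle = rng.uniform(0, 2 * np.pi)
            pos += step * np.array([np.cos(angle), np.sin(angle)])
        out.append(np.linalg.norm(pos))          # discard distance from source
    return np.array(out)
```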
45. High-Resolution Population Exposure to PM2.5 in Nanchang Urban Region Using Multi-Source Data
- Author
-
Qingming Leng, Haiou Yang, and Zixie Guo
- Subjects
Urban region, Source data, Environmental Chemistry, Environmental science, High resolution, Physical geography, Population exposure, General Environmental Science - Published
- 2021
46. Kualitas Layanan Bidang Penempatan Kerja dalam Meningkatkan Kepuasan Masyarakat pada Dinas Tenaga Kerja Kota Banjarmasin
- Author
-
Kharunnisa Khairunnisa
- Subjects
Source data, Social satisfaction, Population, Applied psychology, Field research, Tenaga, Survey research, Sample (statistics), Psychology - Abstract
The problem formulated in this research is how the service quality of the job placement division (bidang Penempatan Tenaga Kerja) at Dinas Tenaga Kerja Kota Banjarmasin, consisting of tangibles, reliability, responsiveness, assurance, and empathy, improves public satisfaction. The data comprise quantitative and qualitative data, and the source data consist of primary and secondary data. The research design follows the survey research method, in which data are collected from only part of the population; the population and sample of this research amount to 150 respondents. Data collection techniques include field research (questionnaire, interview, observation, and documentation) and library research. The data analysis technique is descriptive and quantitative, using statistical methods to describe the collected data.
- Published
- 2021
47. Deblending Method of Multisource Seismic Data Based on a Periodically Varying Cosine Code
- Author
-
Shaohuan Zu, Mengyao Jiao, Tianyue Hu, Weikang Kuang, and Yang Liu
- Subjects
Source data, Signal-to-noise ratio, Computer science, Noise reduction, Code (cryptography), Curvelet, Boundary (topology), Enhanced Data Rates for GSM Evolution, Electrical and Electronic Engineering, Geotechnical Engineering and Engineering Geology, Algorithm, Synthetic data - Abstract
Multisource data acquisition technology has outstanding advantages in enhancing collection efficiency and reducing cost. However, traditional seismic data processing cannot be applied directly to multisource blended data, so deblending of multisource data is the key problem addressed in this research. In this letter, we propose a deblending method for multisource seismic data based on a periodically varying cosine code (PVCC). First, we design a PVCC to blend the seismic data. Next, the blending model is transformed into a minimization problem for an objective function. Then, the blended data are decomposed in the curvelet domain. Finally, the main source data are separated by sparse inversion. Furthermore, we use edge processing to eliminate the boundary effect in the processed seismic data. Examples with synthetic and field data demonstrate that the proposed method has great potential for the deblending of multisource data, and that the edge processing can effectively suppress the boundary effect.
- Published
- 2021
48. Deep Learning Based End-to-End Wireless Communication Systems Without Pilots
- Author
-
Geoffrey Ye Li, Hao Ye, and Biing-Hwang Juang
- Subjects
Source data, Computer Networks and Communications, Computer science, Deep learning, Real-time computing, Transmitter, MIMO, Communications system, Data recovery, Artificial Intelligence, Hardware and Architecture, Wireless, Artificial intelligence, Communication channel - Abstract
The recent development in machine learning, especially in deep neural networks (DNNs), has enabled learning-based end-to-end communication systems, where DNNs are employed to substitute all modules at the transmitter and receiver. In this article, two end-to-end frameworks for frequency-selective channels and multi-input multi-output (MIMO) channels are developed, where the wireless channel effects are modeled with an untrainable stochastic convolutional layer. The end-to-end framework is trained with mini-batches of input data and channel samples. Instead of using pilot information to implicitly or explicitly estimate the unknown channel parameters, as in current communication systems, the transmitter DNN learns to transform the input data in a way that is robust to various channel conditions. The receiver consists of two DNN modules used for channel information extraction and data recovery, respectively. A bilinear production operation is employed to combine the features extracted from the channel information extraction module with the received signals, and the combined features are further utilized in the data recovery module to recover the transmitted data. Compared with conventional communication systems, performance improvements are shown for frequency-selective channels and MIMO channels. Furthermore, the end-to-end system can automatically leverage the correlation in the channels and in the source data to improve the overall performance.
- Published
- 2021
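A compact sketch of the pilot-free end-to-end setup described above follows: a transmitter network, an untrainable stochastic channel layer that draws fresh multipath taps and noise for every batch, and a receiver network trained jointly with the transmitter. The block length, architecture, channel model, and SNR are illustrative assumptions; the paper's bilinear receiver and MIMO variant are not reproduced here.

```python
# Minimal pilot-free end-to-end sketch: the channel layer is random but
# differentiable, so gradients flow from the receiver back to the transmitter.
import torch
import torch.nn as nn
import torch.nn.functional as F

K, N, TAPS = 4, 16, 3          # bits per message, channel uses, multipath taps

tx = nn.Sequential(nn.Linear(K, 64), nn.ReLU(), nn.Linear(64, N))
rx = nn.Sequential(nn.Linear(N + TAPS - 1, 64), nn.ReLU(), nn.Linear(64, K))

def channel(x, snr_db=10.0):
    """Untrainable stochastic layer: random taps convolved with each block, plus AWGN."""
    b = x.size(0)
    h = torch.randn(b, 1, TAPS) / TAPS ** 0.5
    y = F.conv1d(x.unsqueeze(0), h.flip(-1), padding=TAPS - 1, groups=b).squeeze(0)
    noise = torch.randn_like(y) * (10 ** (-snr_db / 20))
    return y + noise

opt = torch.optim.Adam(list(tx.parameters()) + list(rx.parameters()), lr=1e-3)
for step in range(2000):
    bits = torch.randint(0, 2, (256, K)).float()
    x = tx(bits)
    x = x / x.pow(2).mean(dim=1, keepdim=True).sqrt()   # average power constraint
    logits = rx(channel(x))
    loss = F.binary_cross_entropy_with_logits(logits, bits)
    opt.zero_grad()
    loss.backward()
    opt.step()
```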
49. Cross-Project Defect Prediction via Landmark Selection-Based Kernelized Discriminant Subspace Alignment
- Author
-
Xiao-Yuan Jing, Wangyang Yu, Zhiqiang Li, Jingwen Niu, and Chao Qi
- Subjects
Source data, Computer science, Data modeling, Kernel (linear algebra), Redundancy (information theory), Discriminant, Benchmark (computing), Data mining, Electrical and Electronic Engineering, Safety, Risk, Reliability and Quality, Subspace topology, Linear separability - Abstract
Cross-project defect prediction (CPDP) refers to identifying defect-prone software modules in one project (the target) using historical data collected from other projects (the sources), which can help developers find bugs and prioritize their testing efforts. CPDP has recently attracted great research interest. However, the source and target data usually exhibit redundancy and nonlinearity, and most CPDP methods do not exploit source label information to uncover the underlying knowledge for label propagation. These factors usually lead to unsatisfactory CPDP performance. To address these limitations, we propose a landmark selection-based kernelized discriminant subspace alignment (LSKDSA) approach for CPDP. LSKDSA not only reduces the discrepancy between the data distributions of the source and target projects, but also characterizes the complex data structures and increases the probability of linear separability of the data. Moreover, LSKDSA encodes label information of the source data into the domain adaptation learning process, giving it good discriminant ability. Extensive experiments on 13 public projects from three benchmark datasets demonstrate that LSKDSA performs better than a range of competing CPDP methods, with improvements of 3.44%–11.23% in g-measure, 5.75%–11.76% in AUC, and 9.34%–33.63% in MCC, respectively.
- Published
- 2021
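For orientation, the sketch below shows only the plain (non-kernelized) subspace-alignment step that approaches like LSKDSA build on: project source and target defect data onto their own PCA subspaces, align the source basis to the target basis, and train a classifier in the aligned space. Landmark selection and the kernelized discriminant terms of the paper are omitted, and the function names and dimensions are assumptions.

```python
# Baseline subspace alignment for cross-project defect prediction (sketch only).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def subspace_align_predict(X_src, y_src, X_tgt, dim=10):
    Ps = PCA(n_components=dim).fit(X_src).components_.T   # (d, dim) source basis
    Pt = PCA(n_components=dim).fit(X_tgt).components_.T   # (d, dim) target basis
    M = Ps.T @ Pt                                          # alignment matrix
    Zs = X_src @ Ps @ M                                    # source projected and aligned
    Zt = X_tgt @ Pt                                        # target in its own subspace
    clf = LogisticRegression(max_iter=1000).fit(Zs, y_src)
    return clf.predict_proba(Zt)[:, 1]                     # defect-proneness scores
```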
50. Assessment of Level of Risk in Decision-Making in Terms of Career Exploitation
- Author
-
Aleksandr Sergeevich Semenov and Vladimir Sergeevich Kuznetcov
- Subjects
open-pit mine, a working platform, risk, design, reliability, probability, source data, Business, HF5001-6182, Economics as a science, HB71-74
When designing open-pit mine sites, the raw data are stochastic in nature. Not only the final result of the design or evaluation, but also the feasibility of developing the deposit, depends on how these initial data are determined. At the same time, there are significant errors associated with the probabilistic nature of the source data, measurement errors, and calculation errors. Risk assessment is an integral part of project documentation, and project decision-making occurs under conditions of uncertainty and risk. To minimize uncertainty, it is first necessary to identify the area of potential risk and to determine the probability of its occurrence and the potential consequences. Even if adverse effects cannot be excluded, a more complete understanding of the problem contributes to a more mindful response to the potential risk. Analysis of traditional approaches to designing open pits under uncertainty of the input data revealed that the design methods in use do not account for risk, which leads to the adoption and implementation of inefficient design solutions. Risk assessment is performed during the design process and includes qualitative and quantitative analysis. If, after evaluation, the project is accepted for implementation, the mining company then faces the problems of risk management. Based on the results of the project, statistics accumulate, which allows risks to be identified more accurately and managed. When the uncertainty of the project is too high, it can be sent back for revision, after which the risk assessment must be repeated.
- Published
- 2015