Search Results
450 results for "benchmark dataset"
2. Benchmark dataset on feeding intensity of the pearl gentian grouper (Epinephelus fuscoguttatus ♀ × E. lanceolatus ♂)
- Author: Qin, Haijing; Tian, Yunchen; Quan, Jianing; Cong, Xueqi; Li, Qingfei; Sui, Jinzhu
- Published: 2025
3. Full-body virtual try-on using top and bottom garments with wearing style control
- Author: Park, Soonchan; Park, Jinah
- Published: 2025
4. EarthObsNet: A comprehensive benchmark dataset for data-driven earth observation image synthesis
- Author: Li, Zhouyayan; Sermet, Yusuf; Demir, Ibrahim
- Published: 2025
5. Are LLMs good at structured outputs? A benchmark for evaluating structured output capabilities in LLMs
- Author: Liu, Yu; Li, Duantengchuan; Wang, Kaili; Xiong, Zhuoran; Shi, Fobo; Wang, Jian; Li, Bing; Hang, Bo
- Published: 2024
6. UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios
- Author: Nihal, Ragib Amin; Yen, Benjamin; Itoyama, Katsutoshi; Nakadai, Kazuhiro
- Published: 2025
7. MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
- Author: van Engelenburg, Casper; Mostafavi, Fatemeh; Kuhn, Emanuel; Jeon, Yuntae; Franzen, Michael; Standfest, Matthias; van Gemert, Jan; Khademi, Seyran
- Published: 2025
8. WiMANS: A Benchmark Dataset for WiFi-Based Multi-user Activity Sensing
- Author: Huang, Shuokang; Li, Kaihan; You, Di; Chen, Yichong; Lin, Arvin; Liu, Siying; Li, Xiaohui; McCann, Julie A.
- Published: 2025
9. A vessel bifurcation liver CT landmark pair dataset for evaluating deformable image registration algorithms
- Author: Zhang, Zhendong; Criscuolo, Edward Robert; Hao, Yao; McKeown, Trevor; Yang, Deshan
- Subjects: IMAGE processing; INSTITUTIONAL review boards; COMPUTED tomography; IMAGE registration; RESEARCH personnel; LIVER
- Abstract
Purpose: Evaluating deformable image registration (DIR) algorithms is vital for enhancing algorithm performance and gaining clinical acceptance. However, there is a notable lack of dependable DIR benchmark datasets for assessing DIR performance except for lung images. To address this gap, we aim to introduce our comprehensive liver computed tomography (CT) DIR landmark dataset library. This dataset is designed for efficient and quantitative evaluation of various DIR methods for liver CTs, paving the way for more accurate and reliable image registration techniques.
Acquisition and validation methods: Forty CT liver image pairs were acquired from several publicly available image archives and the authors' institutions under institutional review board (IRB) approval. The images were processed with a semi-automatic procedure to generate landmark pairs: (1) for each case, liver vessels were automatically segmented on one image; (2) landmarks were automatically detected at vessel bifurcations; (3) corresponding landmarks in the second image were placed using two deformable image registration methods to avoid algorithm-specific biases; (4) a comprehensive validation process based on quantitative evaluation and manual assessment was applied to reject outliers and ensure the landmarks' positional accuracy. This workflow resulted in an average of ~56 landmark pairs per image pair, for a total of 2220 landmarks across 40 cases. The general landmarking accuracy of this procedure was evaluated using digital phantoms and manual landmark placement. The landmark pair target registration errors (TRE) on digital phantoms were 0.37 ± 0.26 mm and 0.55 ± 0.34 mm, respectively, for the two selected DIR algorithms used in our workflow, with 97% of landmark pairs having TREs below 1.5 mm. The distances from the calculated landmarks to the averaged manual placement were 1.27 ± 0.79 mm.
Data format and usage notes: All data, including image files and landmark information, are publicly available at Zenodo (https://zenodo.org/records/13738577). Instructions for using our data can be found on our GitHub page at https://github.com/deshanyang/Liver-DIR-QA.
Potential applications: The landmark dataset generated in this work is the first collection of large-scale liver CT DIR landmarks prepared on real patient images. This dataset can provide researchers with a dense set of ground-truth benchmarks for the quantitative evaluation of DIR algorithms within the liver.
- Published: 2025
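The target registration error (TRE) reported in entry 9 is simply the Euclidean distance between corresponding landmarks after registration, summarized as mean ± standard deviation. A minimal sketch, assuming a hypothetical list-of-coordinate-tuples representation rather than the dataset's actual file layout:

```python
import math

def tre(moving_landmarks, fixed_landmarks):
    """Per-pair Euclidean distances (target registration errors), in mm."""
    return [math.dist(p, q) for p, q in zip(moving_landmarks, fixed_landmarks)]

def summarize(errors):
    """Mean and population standard deviation, matching the 'mean ± std' style above."""
    mean = sum(errors) / len(errors)
    var = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, math.sqrt(var)
```

A reported value such as 0.37 ± 0.26 mm corresponds to `summarize(tre(...))` taken over all validated landmark pairs.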
10. Generating domain models from natural language text using NLP: a benchmark dataset and experimental comparison of tools
- Author: Bozyigit, Fatma; Bardakci, Tolgahan; Khalilipour, Alireza; Challenger, Moharram; Ramackers, Guus; Babur, Önder; Chaudron, Michel R. V.
- Subjects: SYSTEMS software; SOFTWARE requirements specifications; SOFTWARE architecture; COMPUTER science; COMPUTER software development
- Abstract
A software requirements specification describes users' needs and expectations of a target system. Requirements documents are typically unstructured natural language text. Such texts are the basis for various subsequent activities in software development, such as software analysis and design. As part of software analysis, domain models are made that describe the key concepts and the relations between them. Since the analysis process is performed manually by business analysts, it is time-consuming and may introduce mistakes. Recently, researchers have worked toward automating the synthesis of domain models from textual software requirements. Current studies on this topic are limited in the volume and heterogeneity of their experimental datasets. To remedy this, we provide a curated dataset of software requirements to be utilized as a benchmark by algorithms that transform textual requirements documents into domain models. We present a detailed evaluation of two text-to-model approaches: one based on a large language model (ChatGPT) and one building on grammatical rules (txt2Model). Our evaluation reveals that both tools yield promising results with relatively high F-scores for modeling classes, attributes, methods, and relationships, with txt2Model performing better than ChatGPT on average. Both tools show lower performance and higher variance on relation types. We believe our dataset and experimental evaluation pave the way to advance the field of automated model generation from requirements.
- Published: 2024
11. OMAD-6: Advancing Offshore Mariculture Monitoring with a Comprehensive Six-Type Dataset and Performance Benchmark
- Author: Mo, Zewen; Liang, Yinyu; Chen, Yulin; Shen, Yanyun; Xu, Minduan; Wang, Zhipan; Zhang, Qingling
- Subjects: ECONOMIC security; DEEP learning; MARICULTURE; REMOTE sensing; FOOD security
- Abstract
Offshore mariculture is critical for global food security and economic development. Advances in deep learning and data-driven approaches enable the rapid and effective monitoring of offshore mariculture distribution and changes. However, detector performance depends heavily on training data quality, and the lack of standardized classifications and public datasets for offshore mariculture facilities currently hampers effective monitoring. Here, we propose to categorize offshore mariculture facilities into six types: TCC, DWCC, FRC, LC, RC, and BC. Based on these categories, we introduce a benchmark dataset called OMAD-6. This dataset includes over 130,000 instances and more than 16,000 high-resolution remote sensing images. The images, with a spatial resolution of 0.6 m, were sourced from the Google Earth platform for key regions in China, Chile, Norway, and Egypt. All instances in OMAD-6 were meticulously annotated manually with horizontal bounding boxes and polygons. Compared to existing remote sensing datasets, OMAD-6 has three notable characteristics: (1) it is comparable to large, published datasets in instances per category, image quantity, and sample coverage; (2) it exhibits high inter-class similarity; (3) it shows significant intra-class diversity in facility sizes and arrangements. Based on the OMAD-6 dataset, we evaluated eight state-of-the-art methods to establish baselines for future research. The experimental results demonstrate that the OMAD-6 dataset effectively represents various real-world scenarios, which have posed considerable challenges for current instance segmentation algorithms. Our evaluation confirms that the OMAD-6 dataset has the potential to improve offshore mariculture identification; notably, the QueryInst and PointRend algorithms distinguished themselves as top performers, robustly identifying offshore mariculture facilities even against complex environmental backgrounds. Its ongoing development and application will play a pivotal role in future offshore mariculture identification and management.
- Published: 2024
12. CadastreVision: A benchmark dataset for cadastral boundary delineation from multi-resolution earth observation images
- Author: Grift, Jeroen; Persello, Claudio; Koeva, Mila
- Subjects: ARTIFICIAL intelligence; PROPERTY rights; REMOTE sensing; KNOWLEDGE transfer; SPATIAL resolution; DEEP learning
- Abstract
Approximately 70%–75% of people worldwide have no formally registered land rights. Fit-For-Purpose Land Administration was introduced to address this problem and focuses on delineating visible cadastral boundaries from earth observation imagery. Recent studies have shown the potential of deep learning models to extract these visible cadastral boundaries automatically. However, studies are limited by the small size and geographical coverage of available datasets and by the lack of information about which cadastral boundaries are visible, i.e., associated with a physical object boundary. To overcome these problems, we present CadastreVision, a benchmark dataset containing cadastral reference data and corresponding multi-resolution earth observation imagery from The Netherlands, with spatial resolutions ranging from 0.1 m to 10 m. The ratio between visible and non-visible cadastral boundaries is essential to evaluate the potential automation level in cadastral boundary extraction from earth observation images and to interpret results obtained by deep learning models. We investigate this ratio using a novel analysis pipeline that overlays cadastral reference data with visible topographic object boundaries. Our results show that approximately 72% of the total length of cadastral boundaries in The Netherlands is visible. CadastreVision will enable new developments in cadastral boundary delineation and future endeavours to investigate knowledge transfer to data-scarce areas. Our data and code are available at https://github.com/jeroengrift/cadastrevision.
- Published: 2024
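The 72% figure in entry 12 is a length-weighted ratio: the length of cadastral boundary coinciding with a visible topographic object boundary, divided by the total boundary length. A minimal sketch, where the `(length, is_visible)` pair representation is a hypothetical simplification rather than the dataset's actual format:

```python
def visible_ratio(segments):
    """Length-weighted share of visible boundary segments.

    segments: iterable of (length_in_m, is_visible) pairs, one per
    boundary segment after overlaying cadastral and topographic data.
    """
    total = sum(length for length, _ in segments)
    visible = sum(length for length, vis in segments if vis)
    return visible / total
```

For example, `visible_ratio([(70.0, True), (30.0, False)])` evaluates to 0.7, i.e., 70% of the boundary length is visible.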
13. Recommendations for the creation of benchmark datasets for reproducible artificial intelligence in radiology
- Author: Sourlos, Nikos; Vliegenthart, Rozemarijn; Santinha, Joao; Klontzas, Michail E.; Cuocolo, Renato; Huisman, Merel; van Ooijen, Peter
- Subjects: Benchmark dataset; Validation; Bias; Artificial intelligence (AI) software; Radiology
- Abstract
Various healthcare domains, including radiology, have witnessed successful preliminary implementation of artificial intelligence (AI) solutions, though limited generalizability hinders their widespread adoption. Currently, most research groups and industry have limited access to the data needed for external validation studies. The creation and accessibility of benchmark datasets to validate such solutions represents a critical step towards generalizability, for which an array of aspects ranging from preprocessing to regulatory issues and biostatistical principles come into play. In this article, the authors provide recommendations for the creation of benchmark datasets in radiology, explain current limitations in this realm, and explore potential new approaches. Clinical relevance statement: Benchmark datasets, by facilitating validation of AI software performance, can contribute to the adoption of AI in clinical practice. Key points: Benchmark datasets are essential for the validation of AI software performance. Factors like image quality and representativeness of cases should be considered. Benchmark datasets can help adoption by increasing the trustworthiness and robustness of AI.
- Published: 2024
14. Mining impactful discoveries from the biomedical literature
- Author: Moreau, Erwan; Hardiman, Orla; Heverin, Mark; O'Sullivan, Declan
- Subjects: Literature-based discovery; Evaluation; Benchmark dataset; Time-sliced method
- Abstract
Background: Literature-based discovery (LBD) aims to help researchers identify relations between concepts that are worthy of further investigation by text-mining the biomedical literature. While the LBD literature is rich and the field is considered mature, standard practice in the evaluation of LBD methods is methodologically poor and has not progressed on par with the domain. The lack of properly designed and decently sized benchmark datasets hinders the progress of the field and its development into applications usable by biomedical experts. Results: This work presents a method for mining past discoveries from the biomedical literature. It leverages the impact made by a discovery, using descriptive statistics to detect surges in the prevalence of a relation across time. The validity of the method is tested against a baseline representing the state-of-the-art "time-sliced" method. Conclusions: This method allows the collection of a large number of time-stamped discoveries. These can be used for LBD evaluation, alleviating the long-standing issue of inadequate evaluation. It might also pave the way for more fine-grained LBD methods, which could exploit the diversity of these past discoveries to train supervised models. Finally, the dataset (or some future version of it inspired by our method) could be used as a methodological tool for systematic reviews. We provide an online exploration tool in this perspective, available at https://brainmend.adaptcentre.ie/.
- Published: 2024
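The core idea of entry 14, detecting a surge in the yearly prevalence of a concept relation, can be sketched as a crude threshold rule over time-stamped co-occurrence counts. This is an illustrative assumption, not the authors' actual statistic:

```python
def surge_year(yearly_counts, window=5, factor=3.0):
    """Return the first year whose count exceeds the mean of the
    preceding `window` years by `factor` standard deviations.

    yearly_counts: dict mapping year -> co-occurrence count of a
    concept pair in that year's literature.
    """
    years = sorted(yearly_counts)
    for i in range(window, len(years)):
        baseline = [yearly_counts[y] for y in years[i - window:i]]
        mean = sum(baseline) / window
        std = (sum((c - mean) ** 2 for c in baseline) / window) ** 0.5
        if yearly_counts[years[i]] > mean + factor * std:
            return years[i]
    return None  # no surge detected
```

A sudden jump from a few mentions per year to dozens would be flagged as the "discovery" year; a real pipeline would normalize for overall literature growth.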
17. SpatialCTD: A Large-Scale Tumor Microenvironment Spatial Transcriptomic Dataset to Evaluate Cell Type Deconvolution for Immuno-Oncology
- Author: Ding, Jiayuan; Li, Lingxiao; Lu, Qiaolin; Venegas, Julian; Wang, Yixin; Wu, Lidan; Jin, Wei; Wen, Hongzhi; Liu, Renming; Tang, Wenzhuo; Dai, Xinnan; Li, Zhaoheng; Zuo, Wangyang; Chang, Yi; Lei, Yu Leo; Shang, Lulu; Danaher, Patrick; Xie, Yuying; Tang, Jiliang
- Subjects: GRAPH neural networks; TECHNOLOGICAL innovations; TUMOR microenvironment; RNA sequencing; TRANSCRIPTOMES; LUNGS
- Abstract
Recent technological advancements have enabled spatially resolved transcriptomic profiling, albeit at a more cost-effective multicellular resolution. The task of cell type deconvolution has been introduced to disentangle discrete cell types from such multicellular spots. However, existing benchmark datasets for cell type deconvolution are either generated from simulation or limited in scale, predominantly encompass data on mice, and are not designed for human immuno-oncology. To overcome these limitations and promote comprehensive investigation of cell type deconvolution for human immuno-oncology, we introduce a large-scale spatial transcriptomic deconvolution benchmark dataset named SpatialCTD, encompassing 1.8 million cells and 12,900 pseudo spots from the human tumor microenvironment across the lung, kidney, and liver. In addition, SpatialCTD provides a more realistic reference than those generated from single-cell RNA sequencing (scRNA-seq) data for most reference-based deconvolution methods. To utilize the location-aware SpatialCTD reference, we propose a graph neural network-based deconvolution method (i.e., GNNDeconvolver). Extensive experiments show that GNNDeconvolver often outperforms existing state-of-the-art methods by a substantial margin, without requiring scRNA-seq data. To enable comprehensive evaluations of spatial transcriptomics data from flexible protocols, we provide an online tool capable of converting spatial transcriptomic data from various platforms (e.g., 10× Visium, MERFISH, and sci-Space) into pseudo spots, featuring adjustable spot size. The SpatialCTD dataset and GNNDeconvolver implementation are available at https://github.com/OmicsML/SpatialCTD, and the online converter tool can be accessed at https://omicsml.github.io/SpatialCTD/.
- Published: 2024
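The pseudo-spot conversion described in entry 17 amounts to grouping single cells into grid cells of an adjustable spot size. A minimal sketch with a hypothetical `(x, y, cell_type)` input format, not the OmicsML converter's actual implementation:

```python
from collections import defaultdict

def pseudo_spots(cells, spot_size=100.0):
    """Bin (x, y, cell_type) records into square spots of side `spot_size`.

    Returns {(col, row): {cell_type: count}}, i.e., per-spot cell-type
    composition, which is what deconvolution methods try to recover.
    """
    spots = defaultdict(lambda: defaultdict(int))
    for x, y, cell_type in cells:
        key = (int(x // spot_size), int(y // spot_size))
        spots[key][cell_type] += 1
    return {k: dict(v) for k, v in spots.items()}
```

Shrinking `spot_size` approaches single-cell resolution; enlarging it mixes more cell types per spot, making deconvolution harder.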
18. An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification
- Author: Zhang, Lei; Fu, Xiaowei; Huang, Fuxiang; Yang, Yi; Gao, Xinbo
- Subjects: PERSONAL belongings; WEATHER; TRUST; SHOPPING malls; GENERALIZATION
- Abstract
Person re-identification (ReID) has made great strides thanks to the data-driven deep learning techniques. However, the existing benchmark datasets lack diversity, and models trained on these data cannot generalize well to dynamic wild scenarios. To meet the goal of improving the explicit generalization of ReID models, we develop a new Open-World, Diverse, Cross-Spatial-Temporal dataset named OWD with several distinct features. (1) Diverse collection scenes: multiple independent open-world and highly dynamic collecting scenes, including streets, intersections, shopping malls, etc. (2) Diverse lighting variations: long time spans from daytime to nighttime with abundant illumination changes. (3) Diverse person status: multiple camera networks in all seasons with normal/adverse weather conditions and diverse pedestrian appearances (e.g., clothes, personal belongings, poses, etc.). (4) Protected privacy: invisible faces for privacy critical applications. To improve the implicit generalization of ReID, we further propose a Latent Domain Expansion (LDE) method to develop the potential of source data, which decouples discriminative identity-relevant and trustworthy domain-relevant features and implicitly enforces domain-randomized identity feature space expansion with richer domain diversity to facilitate domain-invariant representations. Our comprehensive evaluations with most benchmark datasets in the community are crucial for progress, although this work is far from the grand goal toward open-world and dynamic wild applications. The project page is https://github.com/fxw13/OWD.
- Published: 2024
19. HRVQA: A Visual Question Answering benchmark for high-resolution aerial images
- Author: Li, Kun; Vosselman, George; Yang, Michael Ying
- Subjects: COMPUTER vision; URBAN planning; SOURCE code; QUESTION answering systems; SCARCITY; PIXELS
- Abstract
Visual question answering (VQA) is an important and challenging multimodal task in computer vision and photogrammetry. Recently, efforts have been made to bring the VQA task to aerial images, due to its potential real-world applications in disaster monitoring, urban planning, and digital earth product generation. However, the development of VQA in this domain is restricted by the huge variation in the appearance, scale, and orientation of the concepts in aerial images, along with the scarcity of well-annotated datasets. In this paper, we introduce a new dataset, HRVQA, which provides a collection of 53,512 aerial images of 1024 × 1024 pixels and 1,070,240 semi-automatically generated QA pairs. To benchmark the understanding capability of VQA models for aerial images, we evaluate recent methods on the HRVQA dataset. Moreover, we propose a novel model, GFTransformer, with gated attention modules and a mutual fusion module. The experiments show that the proposed dataset is quite challenging, especially for the attribute-related questions. Our method achieves superior performance in comparison to the previous state-of-the-art approaches. The dataset and the source code are released at https://hrvqa.nl/.
- Published: 2024
20. Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks
- Author: Lu, Andong; Li, Chenglong; Zhao, Jiacong; Tang, Jin; Luo, Bin
- Published: 2024
21. Deep multimodal-based finger spelling recognition for Thai sign language: a new benchmark and model composition
- Author: Vijitkunsawat, Wuttichai; Racharak, Teeradaj; Le Nguyen, Minh
- Abstract
Video-based sign language recognition is vital for improving communication for the deaf and hard of hearing. Creating and maintaining high-quality Thai sign language video datasets is challenging due to a lack of resources. Tackling this issue, we rigorously investigate the design and development of a deep learning-based system for Thai Finger Spelling recognition, assessing various models with a new dataset of 90 standard letters performed by 43 diverse signers. We investigate seven deep learning models spanning three distinct modalities: video-only methods (including RGB-sequencing-based CNN-LSTM and VGG-LSTM), human body joint coordinate sequences (processed by LSTM, BiLSTM, GRU, and Transformer models), and skeleton analysis (using TGCN with a graph-structured skeleton representation). A thorough assessment of these models is conducted across seven circumstances, encompassing single-hand postures, single-hand motions with one, two, and three strokes, and two-hand postures with both static and dynamic point-on-hand interactions. The research highlights that the TGCN model is the optimal lightweight model in all scenarios. In single-hand pose cases, a two-modality combination of the Transformer and TGCN models delivers outstanding performance, excelling in four particular conditions: single-hand poses, and single-hand poses requiring one, two, and three strokes. In contrast, two-hand poses with static or dynamic point-on-hand interactions present substantial challenges, as the data from joint coordinates is inadequate due to hand obstructions, stemming from insufficient coordinate sequence data and the lack of a detailed skeletal graph structure. The study recommends integrating RGB-sequencing with the visual modality to enhance the accuracy of two-handed sign language gestures.
- Published: 2024
22. Constrained Spectral–Spatial Attention Residual Network and New Cross-Scene Dataset for Hyperspectral Classification
- Author: Li, Siyuan; Chen, Baocheng; Wang, Nan; Shi, Yuetian; Zhang, Geng; Liu, Jia
- Subjects: IMAGE recognition (Computer vision); LAND cover; SPATIAL variation; CLASSIFICATION
- Abstract
Hyperspectral image classification is widely applied in several fields. Since existing datasets focus on a single scene, current deep learning-based methods typically draw training and testing patches randomly from the same image. This can result in similar spatial distributions of samples, which may incline the network to learn specific spatial distributions in pursuit of falsely high accuracy. In addition, the large variation between single-scene datasets has led research in cross-scene hyperspectral classification to focus on domain adaptation and domain generalization, while neglecting the exploration of the generalizability of models to specific variables. This paper proposes two approaches to address these issues. The first is to train the model on the original image and then test it on a rotated dataset to simulate cross-scene evaluation. The second is to construct a new cross-scene dataset for spatial distribution variations, named GF14-C17&C16, to avoid the problems arising from existing single-scene datasets. The image conditions in this dataset are basically the same; only the land cover distribution differs. In response to the spatial distribution variations, this paper proposes a constrained spectral attention mechanism and a constrained spatial attention mechanism to limit the fitting of the model to specific feature distributions. Based on these, this paper also constructs a constrained spectral–spatial attention residual network (CSSARN). Extensive experimental results on two public hyperspectral datasets and the GF14-C17&C16 dataset demonstrate that CSSARN is more effective than other methods in extracting cross-scene spectral and spatial features.
- Published: 2024
23. Dataset Generation for Gujarati Language Using Handwritten Character Images
- Author: Suthar, Sanket B.; Thakkar, Amit R.
- Subjects: CONVOLUTIONAL neural networks; GRAYSCALE model
- Abstract
In pattern recognition, handwritten character recognition (HCR) is considered a classic challenge, and benchmark datasets for HCR in the Gujarati language in particular are limited. To overcome this, a proper dataset is required for experimentation. Hence, this work introduces dataset generation for the Gujarati language using pre-processing and classification techniques. The handwritten data is collected from various native Gujarati writers, and three processes are carried out to generate the dataset. First, pre-processing stages such as image selection, noise removal, normalization, conversion of integer values to double, conversion of the grayscale image into a binary image, dimensionality reduction, and vector conversion are performed. Then, the pre-processed image is segmented using line segmentation, character segmentation, and word segmentation. Finally, the data are classified using a convolutional neural network (CNN). The kappa and FPR (false positive rate) values achieved by the CNN are 0.981 and 0.189, respectively.
- Published: 2024
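Entry 23 reports its classifier's quality as kappa and FPR; both follow directly from a confusion matrix. A minimal binary-case sketch for reference (the paper's task is multi-class Gujarati character recognition, so this is illustrative only):

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa from a binary confusion matrix: observed agreement
    between predicted and true labels, corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

def false_positive_rate(fp, tn):
    """Fraction of actual negatives misclassified as positive."""
    return fp / (fp + tn)
```

Kappa of 1.0 means perfect agreement and 0.0 means chance-level agreement, so the reported 0.981 indicates near-perfect classification.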
24. BAMFORESTS: Bamberg Benchmark Forest Dataset of Individual Tree Crowns in Very-High-Resolution UAV Images
- Author: Troles, Jonas; Schmid, Ute; Fan, Wen; Tian, Jiaojiao
- Subjects: CROWNS (Botany); ARTIFICIAL neural networks; CLIMATE change; DIGITAL elevation models; COMPUTER vision; DRONE aircraft
- Abstract
The anthropogenic climate crisis results in the gradual loss of tree species in locations where they were previously able to grow. This leads to increasing workloads and requirements for foresters and arborists as they are forced to restructure their forests and city parks. The advancements in computer vision (CV), especially in supervised deep learning (DL), can help cope with these new tasks. However, they rely on large, carefully annotated datasets to produce good and generalizable models. This paper presents BAMFORESTS: a dataset with 27,160 individually delineated tree crowns in 105 ha of very-high-resolution UAV imagery gathered with two different sensors from two drones. BAMFORESTS covers four areas of coniferous, mixed, and deciduous forests and city parks. The labels contain instance segmentations of individual trees, and the proposed splits are balanced by tree species and vitality. Furthermore, the dataset contains the corrected digital surface model (DSM), representing tree heights. BAMFORESTS is annotated in the COCO format and is especially suited for training deep neural networks (DNNs) to solve instance segmentation tasks. BAMFORESTS was created in the BaKIM project and is freely available under the CC BY 4.0 license.
- Published
- 2024
- Full Text
- View/download PDF
25. An Underwater Organism Image Dataset and a Lightweight Module Designed for Object Detection Networks.
- Author
-
Huang, Jiafeng, Zhang, Tianjun, Zhao, Shengjie, Zhang, Lin, and Zhou, Yicong
- Subjects
OBJECT recognition (Computer vision) ,DEEP learning ,ACOUSTIC imaging ,COMPUTER vision ,EVIDENCE gaps ,COMPUTER performance ,ECHO ,MARINE sciences - Abstract
Long-term monitoring and recognition of underwater organism objects are of great significance in marine ecology, fisheries science, and many other disciplines. Traditional techniques in this field, including manual fishing-based and sonar-based ones, are usually flawed. Specifically, the method based on manual fishing is time-consuming and unsuitable for scientific research, while the sonar-based one suffers from low acoustic image accuracy and large echo errors. In recent years, the rapid development of deep learning and its excellent performance in computer vision tasks have made vision-based solutions feasible. However, research in this area is still insufficient in two main respects. First, to our knowledge, there is still a lack of large-scale datasets of underwater organism images with accurate annotations. Second, given the limited hardware resources of underwater devices, an underwater organism detection algorithm that is both accurate and lightweight enough to infer in real time is still lacking. As an attempt to fill these research gaps, we established the Multiple Kinds of Underwater Organisms (MKUO) dataset with accurate bounding box annotations of taxonomic information, consisting of 10,043 annotated images covering eighty-four underwater organism categories. Based on our benchmark dataset, we evaluated a series of existing object detection algorithms to obtain their accuracy and complexity indicators as a baseline for future reference. In addition, we propose a novel lightweight module, namely the Sparse Ghost Module, designed especially for object detection networks. By substituting the standard convolution with our proposed one, network complexity can be significantly reduced and inference speed greatly improved without obvious loss of detection accuracy.
To make our results reproducible, the dataset and the source code are available online at https://cslinzhang.github.io/MKUO-and-Sparse-Ghost-Module/. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. A New Benchmark In Vivo Paired Dataset for Laparoscopic Image De-smoking
- Author
-
Xia, Wenyao, Fan, Victoria, Peters, Terry, Chen, Elvis C. S., Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Linguraru, Marius George, editor, Dou, Qi, editor, Feragen, Aasa, editor, Giannarou, Stamatia, editor, Glocker, Ben, editor, Lekadir, Karim, editor, and Schnabel, Julia A., editor
- Published
- 2024
- Full Text
- View/download PDF
27. Towards Highly Effective Moving Tiny Ball Tracking via Vision Transformer
- Author
-
Yu, Jizhe, Liu, Yu, Wei, Hongkui, Xu, Kaiping, Cao, Yifei, Li, Jiangquan, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Si, Zhanjun, editor, and Pan, Yijie, editor
- Published
- 2024
- Full Text
- View/download PDF
28. Evaluating Baselines for Type Inference: Static Code Analysis Versus Large Language Model
- Author
-
Vagin, Andrey, Romanov, Vitaly, Ivanov, Vladimir, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Bajaj, Anu, editor, Hanne, Thomas, editor, and Siarry, Patrick, editor
- Published
- 2024
- Full Text
- View/download PDF
29. CSEPrompts: A Benchmark of Introductory Computer Science Prompts
- Author
-
Raihan, Nishat, Goswami, Dhiman, Puspo, Sadiya Sayara Chowdhury, Newman, Christian, Ranasinghe, Tharindu, Zampieri, Marcos, Appice, Annalisa, editor, Azzag, Hanane, editor, Hacid, Mohand-Said, editor, Hadjali, Allel, editor, and Ras, Zbigniew, editor
- Published
- 2024
- Full Text
- View/download PDF
30. CODD: A benchmark dataset for the automated sorting of construction and demolition waste.
- Author
-
Demetriou, Demetris, Mavromatidis, Pavlos, Petrou, Michael F., and Nicolaides, Demetris
- Subjects
- *
CONSTRUCTION & demolition debris , *OBJECT recognition (Computer vision) , *WASTE products , *SCIENTIFIC community - Abstract
• A benchmark dataset, CODD, is developed for the task of automated sorting of CDW.
• The CODD is designed for both bounding box and instance segmentation detection.
• A baseline model is developed on the YOLOV8 architecture for future benchmarking.
• An open invitation is extended to the scientific community to collaborate on the CODD.
This study presents the Construction and Demolition Waste Object Detection Dataset (CODD), a benchmark dataset specifically curated for the training of object detection models and the full-scale implementation of automated sorting of Construction and Demolition Waste (CDW). The CODD encompasses a comprehensive range of CDW scenarios, capturing a diverse array of debris and waste materials frequently encountered in real-world construction and demolition sites. A noteworthy feature of the presented study is the ongoing collaborative nature of the dataset, which invites contributions from the scientific community, ensuring its perpetual improvement and adaptability to emerging research and practical requirements. Building upon the benchmark dataset, an advanced object detection model based on the latest bounding box and instance segmentation YOLOV8 architecture is developed to establish a baseline performance for future comparisons. The CODD benchmark dataset, along with the baseline model, provides a reliable reference for comprehensive comparisons and objective assessments of future models, contributing to progressive advancements and collaborative research in the field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Enhancing antigenic peptide discovery: Improved MHC-I binding prediction and methodology.
- Author
-
Giziński, Stanisław, Preibisch, Grzegorz, Kucharski, Piotr, Tyrolski, Michał, Rembalski, Michał, Grzegorczyk, Piotr, and Gambin, Anna
- Subjects
- *
T cell receptors , *PEPTIDES , *HLA histocompatibility antigens , *MAJOR histocompatibility complex , *LANGUAGE models , *VACCINE development - Abstract
The Major Histocompatibility Complex (MHC) is a critical element of the vertebrate cellular immune system, responsible for presenting peptides derived from intracellular proteins. MHC-I presentation is pivotal in the immune response and holds considerable potential in the realms of vaccine development and cancer immunotherapy. This study delves into the limitations of current methods and benchmarks for MHC-I presentation. We introduce a novel benchmark designed to assess generalization properties and the reliability of models on unseen MHC molecules and peptides, with a focus on the Human Leukocyte Antigen (HLA), a specific subset of MHC genes present in humans. Finally, we introduce HLABERT, a pretrained language model that significantly outperforms previous methods on our benchmark and establishes a new state-of-the-art on existing benchmarks.
• The study reviews limitations in current methods and benchmarks for pan-specific MHC-I presentation prediction.
• A novel pan-specific MHC-I prediction model is rigorously benchmarked against existing approaches.
• HLABERT excels on benchmarks, with enhanced generalization.
Key contributions: identification of limitations, a new benchmark, and a high-performing model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. A benchmark dataset in chemical apparatus: recognition and detection.
- Author
-
Zou, Le, Ding, Ze-Sheng, Ran, Shuo-Yi, Wu, Zhi-Ze, Wei, Yun-Sheng, He, Zhi-Huang, and Wang, Xiao-Feng
- Abstract
Robots that perform chemical experiments autonomously have been implemented; they use the same chemical apparatus as human chemists and can carry out complex chemical experiments unattended. However, most robots in chemistry are still pre-programmed and cannot adapt to diverse environments or to changes in the displacement and angle of objects. To resolve this issue, we have developed a computer vision method for identifying and detecting chemical apparatus automatically; accurately identifying and localizing such apparatus in chemistry lab images is the most important task. We acquired 2246 images from real chemistry laboratories, with a total of 33,108 apparatus instances covering 21 classes, and present the Chemical Apparatus Benchmark Dataset (CABD), comprising a chemical apparatus image recognition dataset and a chemical apparatus object detection dataset. We evaluated five well-established image recognition models (AlexNet, VGG16, GoogLeNet, ResNet50, and MobileNetV2) and four state-of-the-art object detection methods (Faster R-CNN with three backbones, Single Shot MultiBox Detector (SSD), YOLOv3-SPP, and YOLOv5) on the CABD dataset. The results can serve as a baseline for future research. Experiments show that ResNet50 achieves the highest accuracy (99.9%) on the chemical apparatus image recognition dataset, while Faster R-CNN (ResNet50-fpn) and YOLOv5 perform best in terms of mAP (99.0%) and AR (94.5%), respectively, on the chemical apparatus object detection dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. CUS3D: A New Comprehensive Urban-Scale Semantic-Segmentation 3D Benchmark Dataset.
- Author
-
Gao, Lin, Liu, Yu, Chen, Xi, Liu, Yuxiang, Yan, Shen, and Zhang, Maojun
- Subjects
- *
GEOMETRIC topology , *POINT cloud , *DATA conversion , *SMART cities , *SPACE , *ANNOTATIONS - Abstract
With the continuous advancement of the construction of smart cities, the availability of large-scale and semantically enriched datasets is essential for enhancing the machine's ability to understand urban scenes. Mesh data have a distinct advantage over point cloud data for large-scale scenes, as they provide inherent geometric topology information and consume less memory space. However, existing publicly available large-scale scene mesh datasets are limited in scale and semantic richness and do not cover a wide range of urban semantic information. The development of 3D semantic segmentation algorithms depends on the availability of datasets. Moreover, existing large-scale 3D datasets lack various types of official annotation data, which hinders the widespread applicability of benchmark applications and may cause label errors during data conversion. To address these issues, we present a comprehensive urban-scale semantic segmentation benchmark dataset suitable for various research pursuits on semantic segmentation methodologies. This dataset contains finely annotated point cloud and mesh data types for 3D, as well as high-resolution original 2D images with detailed 2D semantic annotations. It is constructed from a 3D reconstruction of 10,840 UAV aerial images and spans a vast area of approximately 2.85 square kilometers covering both urban and rural scenes. The dataset is composed of 152,298,756 3D points and 289,404,088 triangles. Each 3D point, triangular mesh, and original 2D image in the dataset is carefully labeled with one of ten semantic categories. Six typical 3D semantic segmentation methods were compared on the CUS3D dataset, with KPConv demonstrating the highest overall performance: the mIoU is 59.72%, OA is 89.42%, and mAcc is 97.88%.
Furthermore, the experimental results on the impact of color information on semantic segmentation suggest that incorporating both coordinate and color features can enhance the performance of semantic segmentation. The current limitations of the CUS3D dataset, particularly in class imbalance, will be the primary target for future dataset enhancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
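The mIoU, OA, and mAcc figures quoted for CUS3D above are standard semantic-segmentation metrics computed from a per-class confusion matrix. A minimal sketch of how such metrics are derived (the helper names are illustrative, not the authors' code):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    # m[t][p] counts points with ground-truth class t predicted as p
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def overall_accuracy(m):
    # OA: fraction of all points whose predicted class is correct
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total

def mean_iou(m):
    # mIoU: average over classes of TP / (TP + FP + FN)
    n = len(m)
    ious = []
    for c in range(n):
        tp = m[c][c]
        fp = sum(m[r][c] for r in range(n)) - tp
        fn = sum(m[c]) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)
```

On real point clouds these counts would be accumulated over millions of labeled points, but the formulas are unchanged.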
34. Hyperspectral Image Classification on Large-Scale Agricultural Crops: The Heilongjiang Benchmark Dataset, Validation Procedure, and Baseline Results.
- Author
-
Zhang, Hongzhe, Feng, Shou, Wu, Di, Zhao, Chunhui, Liu, Xi, Zhou, Yuan, Wang, Shengnan, Deng, Hongtao, and Zheng, Shuang
- Subjects
- *
IMAGE recognition (Computer vision) , *CROPS , *AGRICULTURE , *DEEP learning , *RESEARCH personnel , *INTERCROPPING - Abstract
Over the past few decades, researchers have shown sustained and robust investment in exploring methods for hyperspectral image classification (HSIC). The utilization of hyperspectral imagery (HSI) for crop classification in agricultural areas has been widely demonstrated for its feasibility, flexibility, and cost-effectiveness. However, numerous coexisting issues in agricultural scenarios, such as limited annotated samples, uneven distribution of crops, and mixed cropping, cannot be explored in depth using mainstream datasets. The limitations of these impractical datasets have severely restricted the widespread application of HSIC methods in agricultural scenarios. A benchmark dataset named Heilongjiang (HLJ) for HSIC is introduced in this paper, designed for large-scale crop classification. For practical applications, the HLJ dataset covers a wide range of genuine agricultural regions in Heilongjiang Province; it provides rich spectral diversity through two images from diverse time periods and vast geographical areas with intercropped multiple crops. Meanwhile, to meet the data demands of deep learning models, the two images in the HLJ dataset have 319,685 and 318,942 annotated samples, along with 151 and 149 spectral bands, respectively. To validate the suitability of the HLJ dataset as a baseline dataset for HSIC, we employed eight classical classification models in fundamental experiments on the HLJ dataset. Most of the methods achieved an overall accuracy of more than 80% with 10% of the labeled samples used for training. Furthermore, the advantages of the HLJ dataset and the impact of real-world factors on experimental results are comprehensively elucidated. The comprehensive baseline experimental evaluation and analysis affirm the research potential of the HLJ dataset as a large-scale crop classification dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Demand response-based cost mitigation strategy in renewable energy connected microgrid using intelligent energy management system.
- Author
-
Vaikund, Harini and Srivani, S. G.
- Subjects
- *
MICROGRIDS , *ENERGY management , *RENEWABLE energy sources , *POWER resources , *ELECTRIC power consumption , *INTELLIGENT control systems - Abstract
A microgrid is a mixed system of distributed energy resources, comprising renewable energy sources, power storage devices, and loads, that can operate locally as a single controllable entity. However, rising electricity costs and growing consumer electricity demand are major problems worldwide. An energy management system (EMS) is integrated into the system to address these problems; yet balancing load against sources, together with the associated economic problems, remains a challenging task for the power system industry. Several approaches have been developed to manage the EMS, but they consume too much time in energy reporting and struggle to solve the energy challenges. A novel energy management system is therefore proposed to manage power flows and reduce electricity cost. A standard microgrid is designed as an IEEE 6-bus system following the guidelines of IEEE Standard 1547-2018. PV, the grid, and a battery are chosen as sources, and household appliances constitute the load. A real-time benchmark dataset is constructed from the behaviour of individual users and the activation power requirements of the associated appliances. Using this dataset, an intelligent controller is built to anticipate when the sources will be turned ON and OFF. The EMS forecasts the load demand and checks the trained value of the intelligent model to produce a command signal for each source's circuit breaker (CB). The performance of the proposed intelligent EMS is analysed in both islanded and grid-connected modes. The proposed model provides 97% accuracy, 0.059% FPR, and 99.8% specificity. The results show that the proposed intelligent controller provides better prediction performance in both conditions and is therefore more suitable for real-time estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
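The accuracy, FPR, and specificity figures reported above (and the kappa/FPR figures in several other entries) all derive from the same 2x2 confusion table. A minimal sketch of those definitions (illustrative helper, not the paper's code; the counts below are invented for the example):

```python
def binary_metrics(tp, fp, tn, fn):
    # accuracy: fraction of all predictions that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # FPR: fraction of true negatives wrongly predicted positive
    fpr = fp / (fp + tn)
    # specificity: fraction of true negatives correctly rejected (= 1 - FPR)
    specificity = tn / (tn + fp)
    return accuracy, fpr, specificity
```

Note that a very low FPR and a very high specificity are two views of the same quantity, which is why the two reported values sum to (almost exactly) 100%.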
36. ICG: A Machine Learning Benchmark Dataset and Baselines for Inline Code Comments Generation Task.
- Author
-
Zhang, Xiaowei, Chen, Lin, Zou, Weiqin, Cao, Yulu, Ren, Hao, Wang, Zhi, Li, Yanhui, and Zhou, Yuming
- Subjects
MACHINE learning ,DATA scrubbing ,RESEARCH personnel ,STATISTICS ,TAXONOMY ,ONLINE comments - Abstract
As a fundamental component of software documentation, code comments help developers comprehend and maintain programs. Several datasets of method header comments have been proposed in previous studies for machine-learning-based code comment generation. As part of code comments, inline code comments are also crucial for code understanding activities. However, unlike method header comments, which are written in a standard format and describe the whole method, inline comments are often written in arbitrary formats by developers under time pressure and describe different aspects of the code snippets within a method. Currently, there is no large-scale dataset for inline comment generation that takes these characteristics into account. This naturally inspires us to explore whether we can construct a dataset that fosters machine learning research by not only performing fine-grained noise cleaning but also providing a taxonomy of inline comments. To this end, we first collect inline comments and code snippets from 8000 Java projects on GitHub. Then, we conduct a manual review to obtain heuristic rules, which are used to clean the data noise in a fine-grained manner. As a result, we construct a large-scale benchmark dataset named ICG with 5,740,770 pairs of inline comments and code snippets. We then build a comprehensive taxonomy and conduct statistical and manual analyses to explore the performance of different categories of inline comments, such as their helpfulness in code understanding. After that, we provide and compare several baseline models, such as CodeBERT, to automatically generate inline comments, enhancing the usability of the benchmark for researchers. The availability of our benchmark and baselines can help develop and validate new inline comment generation methods, which would further facilitate code understanding activities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. CEDAnet: Individual Tree Segmentation in Dense Orchard via Context Enhancement and Density Prior
- Author
-
Fangjie Zhu, Zhenhao Chen, Haoyang Li, Qian Shi, and Xiaoping Liu
- Subjects
Benchmark dataset ,deep learning (DL) ,individual tree segmentation (ITS) ,instance segmentation ,unmanned aerial vehicle (UAV) ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Individual tree segmentation (ITS) is a pivotal technique in orchard research, estimating tree counts and delineating crown contours. This method provides foundational data for assessing orchard health and nutritional composition and for predicting yield. Unmanned aerial vehicles (UAVs) have become an essential data source for ITS due to their capability to capture ultra-fine details. However, current deep-learning-based ITS methods struggle to accurately handle densely overlapping fruit tree distributions with similar characteristics in UAV images, primarily due to the intricate spatial arrangements in such scenarios. In this article, we propose CEDAnet, a context enhancement and density adjustment network, to address the challenge of dense fruit tree segmentation. Specifically, a transformer-based contextual aggregation module is designed to distinguish different instances and refine instance boundaries. We further propose a density-guided non-maximum suppression method that adaptively generates sufficient candidate bounding boxes, aiming to retain more potential instances among dense trees. To evaluate the effectiveness and robustness of our proposal, we curated two ITS datasets constructed from imagery captured by UAVs, namely the instance segmentation in Conghua images dataset (iSCHID) and the instance segmentation in Maoming images dataset (iSMMID), based on their respective spatial characteristics. Experimental results on both datasets demonstrate that CEDAnet yields competitive results in ITS tasks, with a bounding-box AP of 0.498 and a segmentation AP of 0.493 on iSCHID, and a bounding-box AP of 0.706 and a segmentation AP of 0.703 on iSMMID.
- Published
- 2024
- Full Text
- View/download PDF
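CEDAnet's density-guided non-maximum suppression adapts how many candidate boxes are retained based on local tree density; the abstract does not give the exact rule, but the greedy IoU-based NMS it builds on can be sketched as follows (a baseline illustration only, not the paper's method):

```python
def iou(a, b):
    # boxes as (x1, y1, x2, y2); intersection-over-union of two boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, thresh=0.5):
    # greedy NMS: visit boxes by descending score, keep a box only if it
    # does not overlap an already-kept box by more than `thresh`
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep
```

A density-guided variant would, roughly speaking, raise the overlap threshold (or the candidate budget) in image regions estimated to be dense, so that genuinely adjacent crowns are not suppressed as duplicates.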
38. Recursive Elimination of 'Outliers' to Get Benchmark Dataset
- Author
-
Langsha Liu, Chunhui Xie, Wensheng Hu, and Yunqi Li
- Subjects
Benchmark dataset ,recursive data elimination ,polyurethane elastomer ,mechanical properties ,regression ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Benchmark datasets normally exhibit relatively conserved relationships and a low fraction of outliers, indicated by a higher coefficient of determination (R2) and a lower mean absolute error (MAE) in a regression model. Here, inspired by the process of peeling an onion, we introduce a recursive data elimination (RDE) strategy for removing "outliers" to obtain benchmark datasets. Outliers are labeled using the Williams plot of residual versus leverage (recorded as RDE_W), and its performance is compared with that of using residuals alone (recorded as RDE). Validation was performed in single-target and multiple-target settings through the prediction of mechanical properties, including Young's modulus, tensile strength, and elongation at break, for 643 polyurethane elastomers (the first time this dataset has been released), and of compressive strength for 1030 concrete samples. In the single-target setting, the RDE_W strategy achieved an 8.06% increase in R2 and a 19.87% reduction in MAE compared with RDE; in the multiple-target setting the improvement was approximately 3%. SVM outperformed the XGB, NN, RF, Lasso, and DT algorithms under the RDE_W strategy. Additional tests also validated the advantages of RDE_W over RDE for generating high-quality benchmark datasets. We release the data and code to facilitate the construction of high-quality benchmark datasets and the development of new approaches to better understand, explore, and design advanced materials.
- Published
- 2024
- Full Text
- View/download PDF
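The Williams-plot labeling described above flags a point by its leverage (x-axis) and standardized residual (y-axis), then refits after removal. A minimal single-feature linear-regression sketch under commonly used cutoffs (leverage > 3p/n, |standardized residual| > 3; the paper's exact thresholds and multi-feature models are not given here, so this is illustrative only):

```python
import math

def fit_line(xs, ys):
    # ordinary least squares for y = a + b*x
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b

def williams_outliers(xs, ys, h_factor=3.0, r_cut=3.0):
    # flag points by leverage and standardized residual (Williams plot axes)
    n, p = len(xs), 2                      # p = fitted parameters (a, b)
    a, b = fit_line(xs, ys)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - p)   # residual variance
    if s2 < 1e-12:                             # perfect fit: nothing to flag
        return []
    h_cut = h_factor * p / n                   # leverage warning limit
    flagged = []
    for i, (x, e) in enumerate(zip(xs, resid)):
        h = 1.0 / n + (x - xbar) ** 2 / sxx    # leverage of point i
        r = e / math.sqrt(s2 * (1.0 - h))      # standardized residual
        if h > h_cut or abs(r) > r_cut:
            flagged.append(i)
    return flagged

def recursive_elimination(xs, ys):
    # "peel the onion": refit and remove flagged points until none remain
    xs, ys = list(xs), list(ys)
    while True:
        out = set(williams_outliers(xs, ys))
        if not out:
            return xs, ys
        xs = [x for i, x in enumerate(xs) if i not in out]
        ys = [y for i, y in enumerate(ys) if i not in out]
```

The residual-only RDE variant would drop the leverage test and keep just the |r| cutoff.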
39. Sen4Map: Advancing Mapping With Sentinel-2 by Providing Detailed Semantic Descriptions and Customizable Land-Use and Land-Cover Data
- Author
-
Surbhi Sharma, Rocco Sedona, Morris Riedel, Gabriele Cavallaro, and Claudia Paris
- Subjects
Benchmark dataset ,land-use and land-cover mapping ,machine learning ,Sentinel-2 ,supervised classification ,Land Use and Coverage Area frame Survey (LUCAS) ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
This article presents Sen4Map, a large-scale benchmark dataset designed to enhance the capability of generating land-cover maps using Sentinel-2 data. Comprising nonoverlapping 64 × 64 patches extracted from Sentinel-2 time series images, the dataset spans 335,125 geotagged locations across the European Union. These locations are associated with detailed land-cover and land-use information gathered by expert surveyors in 2018. Unlike most existing large datasets in the literature, the presented database provides: first, a detailed description of the land-cover and land-use properties of each sampled area; second, independence of scale, as it is associated with reference data collected in situ by expert surveyors; third, the ability to test both temporal and spatial classification approaches, thanks to the availability of a time series of 64 × 64 patches for each labeled sample; and fourth, samples collected following a stratified random sampling design to obtain a statistically representative spatial distribution of land-cover classes throughout the European Union. To showcase the properties and challenges offered by Sen4Map, we benchmarked current state-of-the-art land-cover classification approaches.
- Published
- 2024
- Full Text
- View/download PDF
40. A Comprehensive Benchmark and Evaluation of Thai Finger Spelling in Multi-Modal Deep Learning Models
- Author
-
Wuttichai Vijitkunsawat and Teeradaj Racharak
- Subjects
Thai finger spelling ,sign language recognition ,multi-modality ,benchmark dataset ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Sign Language Recognition (SLR) is an intricate and demanding area within computer vision that requires advanced models for accurate interpretation. This research presents a comprehensive analysis and evaluation of the newly benchmarked Thai Finger Spelling (TFS) dataset through seven main experiments. It utilizes both RGB-based (evaluated by CNN-LSTM, VGG-LSTM, I3D, Fusion-3, MEMP, DeepSign-CNN, and ChatGPT4) and pose-based input modalities (assessed by Pose-GRU, Pose-TGCN, SPOTER, Bi-RNN, and FNN-LSTM) across one-hand and two-hand poses, covering 90 standard letters. Findings from the one-handed experiments show that models employing pose-based input modalities substantially outperform those using RGB-based modalities for TFS. Indeed, the pose-based models achieve scores higher than 95% in in-sample testing and 66% in out-of-sample testing. The pose-based models show strong resilience to environmental factors like lighting, background, and clothing which often affect the performance of RGB-based models. This robustness enhances the effectiveness of pose-based systems in diverse settings, improving sign language interpretation’s accuracy and expanding the applicability of SLR technologies in various contexts. However, scenarios involving two-hand poses add complexity, challenging both RGB-based and pose-based modalities in accurately tracking and distinguishing interactions between two hands, particularly during rapid or overlapping movements. These challenges can lead to occlusions in RGB-based systems and difficulties in mapping spatial relationships in pose-based systems. As a result, the performance of out-of-sample tests significantly decreases to below 50% for both static-point-on-hand and total two-hand poses. This benchmark research offers comprehensive insights into TFS and guides the development of state-of-the-art models for TFS.
- Published
- 2024
- Full Text
- View/download PDF
41. OHSCR: Benchmarks Dataset for Offline Handwritten Sindhi Character Recognition.
- Author
-
Naveed, Jakhro Abdul, Soomro, Mudasar Ahmed, Saleem, Leezna, and Shaikh, Muhammad Khalid
- Subjects
PATTERN recognition systems ,HANDWRITING recognition (Computer science) ,SYSTEM identification ,SCIENTIFIC community ,MACHINE learning - Abstract
This research work presents a unique dataset for offline handwritten Sindhi character recognition. It contains 7800 character images in total, written by 150 writers of various ages, genders, and professional backgrounds; each writer wrote the 52 Sindhi characters on the designed form. All written samples were scanned with a high-quality scanner, after which each handwritten Sindhi character was cropped from the collected forms and the cropped images were saved in '.png' format. For the benefit of the Sindhi research community, this work offers an image dataset for handwritten Sindhi character recognition, which will be made publicly available. The dataset can be used to create and test handwritten character recognition systems for the Sindhi language and to provide helpful insights through writer identification. It has been divided into a training set and a test set, with 80% for training and 20% for testing, and various preprocessing techniques were used to remove noise from the scanned images to create a clean dataset. The resulting dataset is the first openly accessible dataset for handwritten Sindhi research and can be useful for writer identification and handwriting recognition systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Solving the Distributed Permutation Flow-Shop Scheduling Problem Using Constrained Programming.
- Author
-
Gogos, Christos
- Subjects
FLOW shop scheduling ,FLOW shops ,CONSTRAINT programming ,PERMUTATIONS ,PRODUCTION scheduling - Abstract
The permutation flow-shop scheduling problem is a classical scheduling problem that seeks the optimal sequence of jobs to be processed on a number of machines so as to minimize makespan or some other performance criterion. The distributed permutation flow-shop scheduling problem adds multiple factories in which copies of the machines exist, and asks for minimizing the makespan of the longest-running location. In this paper, the problem is approached using Constraint Programming and its specialized scheduling features, such as interval variables and non-overlap constraints, while a novel heuristic is proposed for computing lower bounds. Two constraint programming models are proposed: one that solves the distributed permutation flow-shop scheduling problem, and another that drops the constraint of processing jobs in the same order on all machines of each factory. The experiments use an extended public dataset of problem instances to validate the approach's effectiveness. In the process, optimality is proved for many problem instances known in the literature but not previously proven optimal. Moreover, optimal solutions are reached quickly for many problems, even at moderately large sizes (e.g., seven factories, 20 machines, and 20 jobs). The critical role that the number of jobs plays in the complexity of the problem is identified and discussed. In conclusion, this paper demonstrates the great benefits for scheduling problems that stem from using state-of-the-art constraint programming solvers and models that capture the problem tightly. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
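The makespan objective described above follows the classical flow-shop recurrence C(j, k) = max(C(j, k-1), C(j-1, k)) + p(j, k): a job starts on machine k only when it has finished on machine k-1 and the machine is free. A brute-force pure-Python sketch for tiny instances (purely illustrative; the paper itself uses constraint programming models, not enumeration):

```python
from itertools import permutations

def makespan(seq, proc):
    # proc[j][k]: processing time of job j on machine k
    m = len(proc[0])
    comp = [0.0] * m  # completion time of the latest job on each machine
    for j in seq:
        prev = 0.0    # completion of job j on the previous machine
        for k in range(m):
            # start when both machine k and job j (on machine k-1) are ready
            prev = max(prev, comp[k]) + proc[j][k]
            comp[k] = prev
    return comp[-1]

def best_permutation(proc):
    # exhaustive search over job orders; only viable for tiny instances
    jobs = range(len(proc))
    return min(permutations(jobs), key=lambda s: makespan(s, proc))

def distributed_makespan(factory_sequences, proc):
    # distributed variant: the objective is the worst factory's makespan
    return max(makespan(seq, proc) for seq in factory_sequences)
```

Since the permutation count grows factorially with the number of jobs, exact enumeration collapses quickly, which is exactly where CP solvers with interval variables and non-overlap constraints earn their keep.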
43. A survey of deep learning methods and datasets for hand pose estimation from hand-object interaction images.
- Author
-
Woo, Taeyun, Park, Wonjung, Jeong, Woohyun, and Park, Jinah
- Subjects
- *
POSE estimation (Computer vision) , *DEEP learning , *JOINTS (Anatomy) , *IMPLICIT functions , *VIRTUAL reality , *COMPUTER vision - Abstract
The research topic of estimating hand pose from images of hand-object interaction has the potential for replicating natural hand behavior in many practical applications of virtual reality and robotics. However, the intricacy of hand-object interaction, combined with mutual occlusion and the need for physical plausibility, brings many challenges to the problem. This paper provides a comprehensive survey of the state-of-the-art deep learning-based approaches for estimating hand pose (joint and shape) in the context of hand-object interaction. We discuss various deep learning-based approaches to image-based hand tracking, including hand joint and shape estimation. In addition, we review the hand-object interaction dataset benchmarks that are well utilized in hand joint and shape estimation methods. Deep learning has emerged as a powerful technique for solving many problems, including hand pose estimation. While we cover extensive research in the field, we discuss the remaining challenges leading to future research directions.
• Deep learning is effectively used for estimating hand pose from images.
• The correlation between a hand and an object helps in estimating hand-object pose.
• A hand model helps estimate hand shape, but it restricts results to the model's prior.
• Implicit function methods have emerged in hand-object pose estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Comprehensive Analysis of Freebase and Dataset Creation for Robust Evaluation of Knowledge Graph Link Prediction Models
- Author
-
Shirvani-Mahdavi, Nasim, Akrami, Farahnaz, Saeef, Mohammed Samiul, Shi, Xiao, Li, Chengkai, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Payne, Terry R., editor, Presutti, Valentina, editor, Qi, Guilin, editor, Poveda-Villalón, María, editor, Stoilos, Giorgos, editor, Hollink, Laura, editor, Kaoudi, Zoi, editor, Cheng, Gong, editor, and Li, Juanzi, editor
- Published
- 2023
- Full Text
- View/download PDF
45. DisGait: A Prior Work of Gait Recognition Concerning Disguised Appearance and Pose
- Author
-
Huang, Shouwang, Fan, Ruiqi, Wu, Shichao, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Premaratne, Prashan, editor, Jin, Baohua, editor, Qu, Boyang, editor, Jo, Kang-Hyun, editor, and Hussain, Abir, editor
- Published
- 2023
- Full Text
- View/download PDF
46. The Causal Strength Bank: A New Benchmark for Causal Strength Classification
- Author
-
Yuan, Xiaosong, Guan, Renchu, Zuo, Wanli, Zhang, Yijia, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kashima, Hisashi, editor, Ide, Tsuyoshi, editor, and Peng, Wen-Chih, editor
- Published
- 2023
- Full Text
- View/download PDF
47. NICO Challenge: Out-of-Distribution Generalization for Image Recognition Challenges
- Author
-
Zhang, Xingxuan, He, Yue, Wang, Tan, Qi, Jiaxin, Yu, Han, Wang, Zimu, Peng, Jie, Xu, Renzhe, Shen, Zheyan, Niu, Yulei, Zhang, Hanwang, Cui, Peng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Karlinsky, Leonid, editor, Michaeli, Tomer, editor, and Nishino, Ko, editor
- Published
- 2023
- Full Text
- View/download PDF
48. Pol-InSAR-Island - A benchmark dataset for multi-frequency Pol-InSAR data land cover classification
- Author
-
Sylvia Hochstuhl, Niklas Pfeffer, Antje Thiele, Stefan Hinz, Joel Amao-Oliva, Rolf Scheiber, Andreas Reigber, and Holger Dirks
- Subjects
Pol-InSAR ,Multi-frequency ,Benchmark dataset ,Land cover classification ,Machine learning ,Wishart classifier ,Geography (General) ,G1-922 ,Surveying ,TA501-625 - Abstract
This paper presents Pol-InSAR-Island, the first publicly available multi-frequency Polarimetric Interferometric Synthetic Aperture Radar (Pol-InSAR) dataset labeled with detailed land cover classes, which serves as a challenging benchmark dataset for land cover classification. In recent years, machine learning has become a powerful tool for remote sensing image analysis. While there are numerous large-scale benchmark datasets for training and evaluating machine learning models for the analysis of optical data, the availability of labeled SAR or, more specifically, Pol-InSAR data is very limited. The lack of labeled data for training, as well as for testing and comparing different approaches, hinders the rapid development of machine learning algorithms for Pol-InSAR image analysis. The Pol-InSAR-Island benchmark dataset presented in this paper aims to fill this gap. The dataset consists of Pol-InSAR data acquired in S- and L-band by DLR's airborne F-SAR system over the East Frisian island of Baltrum. The interferometric image pairs are the result of a repeat-pass measurement with a time offset of several minutes. The image data are given as 6 × 6 coherency matrices in ground range on a 1 m × 1 m grid. Pixel-accurate class labels, covering 12 different land cover classes, are generated in a semi-automatic process based on an existing biotope type map and visual interpretation of SAR and optical images. Fixed training and test subsets are defined to ensure the comparability of approaches trained and tested on the Pol-InSAR-Island dataset in the future. In addition to the dataset, results of supervised Wishart and Random Forest classifiers that achieve mean Intersection-over-Union scores between 24% and 67% are provided to serve as a baseline for future work. The dataset is provided via KITopenData: https://doi.org/10.35097/1700.
- Published
- 2023
- Full Text
- View/download PDF
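The mean Intersection-over-Union score used for the baseline classifiers above is a standard segmentation metric and can be sketched in a few lines of pure Python; the tiny label maps here are made up for illustration and are not taken from the dataset.

```python
def mean_iou(pred, truth, n_classes):
    """Mean Intersection-over-Union over the classes that occur:
    per class c, |pred==c AND truth==c| / |pred==c OR truth==c|,
    averaged over classes with a non-empty union."""
    scores = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:
            scores.append(inter / union)
    return sum(scores) / len(scores) if scores else 0.0

# Flattened toy label maps: 5 pixels, 3 land cover classes.
pred  = [0, 0, 1, 1, 2]
truth = [0, 1, 1, 1, 2]
print(round(mean_iou(pred, truth, 3), 4))
```

Averaging per-class IoU rather than overall pixel accuracy is what makes the 24%–67% baseline range informative for a dataset with 12 imbalanced land cover classes.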
49. A benchmark dataset and evaluation methodology for Chinese zero pronoun translation.
- Author
-
Xu, Mingzhou, Wang, Longyue, Liu, Siyou, Wong, Derek F., Shi, Shuming, and Tu, Zhaopeng
- Subjects
- *
CHINESE language , *MACHINE translating , *EVALUATION methodology , *PRONOUNS (Grammar) , *TRANSLATING & interpreting - Abstract
The phenomenon of the zero pronoun (ZP) has attracted increasing interest in the machine translation community due to its importance and difficulty. However, previous studies generally evaluate the quality of ZP translation with BLEU scores on MT test sets, which are neither expressive nor sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark test set and an evaluation metric for targeted evaluation of Chinese ZP translation. The human-annotated test set covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit advanced models on ZP translation and identify current challenges for future exploration. We release data, code, and trained models, which we hope can significantly promote research in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
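The abstract's point that corpus-level BLEU is too blunt for zero pronouns can be illustrated with a targeted metric: score each test item only on whether the translation realizes a pronoun the annotator expects for the dropped argument. This is a simplified stand-in sketch, not the paper's actual metric; the example sentences and expected-pronoun sets are invented.

```python
def zp_accuracy(hypotheses, expected_pronouns):
    """Targeted ZP score: fraction of items whose translation contains
    one of the pronouns annotated as acceptable for the dropped
    subject/object (whole-word match on a lowercased token split)."""
    hits = sum(1 for hyp, prons in zip(hypotheses, expected_pronouns)
               if any(p in hyp.lower().split() for p in prons))
    return hits / len(hypotheses)

# Two toy outputs for Chinese sources with a dropped subject:
hyps = ["he said it was fine",   # pronoun recovered
        "said it was fine"]      # pronoun still missing
expected = [{"he", "she"}, {"he", "she"}]
print(zp_accuracy(hyps, expected))  # → 0.5
```

Unlike BLEU, this score moves only when the pronoun itself changes, which is the kind of sensitivity the benchmark is built to provide.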
50. The Wildfire Dataset: Enhancing Deep Learning-Based Forest Fire Detection with a Diverse Evolving Open-Source Dataset Focused on Data Representativeness and a Novel Multi-Task Learning Approach.
- Author
-
El-Madafri, Ismail, Peña, Marta, and Olmedo-Torre, Noelia
- Subjects
FOREST fires ,FOREST fire prevention & control ,WILDFIRE prevention ,WILDFIRES ,DEEP learning ,FIRE alarms ,FALSE alarms - Abstract
This study explores the potential of RGB image data for forest fire detection using deep learning models, evaluating their advantages and limitations, and discussing potential integration within a multi-modal data context. The research introduces a uniquely comprehensive wildfire dataset, capturing a broad array of environmental conditions, forest types, geographical regions, and confounding elements, aiming to reduce high false alarm rates in fire detection systems. To ensure integrity, only public domain images were included, and a detailed description of the dataset's attributes, URL sources, and image resolutions is provided. The study also introduces a novel multi-task learning approach, integrating multi-class confounding elements within the framework. A pioneering strategy in the field of forest fire detection, this method aims to enhance the model's discriminatory ability and decrease false positives. When tested against the wildfire dataset, the multi-task learning approach demonstrated significantly superior performance in key metrics and lower false alarm rates compared to traditional binary classification methods. This emphasizes the effectiveness of the proposed methodology and the potential to address confounding elements. Recognizing the need for practical solutions, the study stresses the importance of future work to increase the representativeness of training and testing datasets. The evolving and publicly available wildfire dataset is anticipated to inspire innovative solutions, marking a substantial contribution to the field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
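The multi-task idea in the abstract above, where a fire/no-fire head is trained jointly with a head that classifies confounding elements, comes down to a weighted sum of two losses over a shared backbone. The sketch below shows only that loss combination in pure Python; the confounder classes, probabilities, and weighting are illustrative assumptions, not the paper's architecture or values.

```python
import math

EPS = 1e-12  # guard against log(0)

def binary_ce(p_fire, y_fire):
    """Binary cross-entropy for the main fire / no-fire task."""
    return -(y_fire * math.log(p_fire + EPS)
             + (1 - y_fire) * math.log(1 - p_fire + EPS))

def multiclass_ce(probs, y_class):
    """Cross-entropy for the auxiliary confounding-element task
    (e.g. fog, sunset glow, smoke-like clouds; classes are made up)."""
    return -math.log(probs[y_class] + EPS)

def multitask_loss(p_fire, y_fire, probs, y_class, weight=0.5):
    """Joint objective: the auxiliary term pushes the shared features to
    separate real fire from fire-like distractors, which is the mechanism
    credited with lowering false alarm rates."""
    return binary_ce(p_fire, y_fire) + weight * multiclass_ce(probs, y_class)

# A confident, correct prediction on both heads yields a near-zero loss.
print(multitask_loss(0.9, 1, [0.8, 0.1, 0.1], 0))
```

In a real training loop the two heads would share a convolutional backbone and the weight would be tuned on validation data; the scalar form above is just the objective being minimized.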