21,090 results on "Information retrieval"
Search Results
2. Efficient prediction of music genre using support vector machine and decision tree.
- Author
- Pavan, V., Vickram, A. S., and Dhanalakshmi, R.
- Subjects
- SUPPORT vector machines, POPULAR music genres, DECISION trees, MACHINE performance, MUSICAL performance, INFORMATION retrieval
- Abstract
The study aims to evaluate the performance of a music genre prediction system using a novel Support Vector Machine (SVM) and a Decision Tree. The GTZAN dataset used in this investigation, obtained from the MARSYAS website, comprises one thousand .au-formatted music files and is intended for use in Music Information Retrieval; it is currently considered the gold-standard dataset. Mel-frequency cepstral coefficients (MFCCs) are extracted from the music files and used to predict the genre. All steps, from analysing the data to training and testing a model, take place within Jupyter. SPSS is used to compare the mean accuracies of the two algorithms side by side. Accuracy, precision, and related figures varied with the size of the input dataset. SVM and Decision Tree are used for classification, with N = 20 for each of the two groups (proposed and comparative). The pre-test value achieved is 0.08. This research shows that, compared to the Decision Tree method, the novel SVM is more effective: SVM, with 81.10% accuracy, is found to be more effective, efficient, and faster than the Decision Tree, with 59.88% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
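The MFCC feature-extraction step described in the abstract above can be sketched with NumPy alone; the frame size, hop length, and filter counts below are illustrative defaults, not the paper's settings (production pipelines typically call librosa.feature.mfcc instead):

```python
import numpy as np

def mfcc(signal, sr=22050, n_fft=512, hop=256, n_mels=20, n_mfcc=13):
    """Return an (n_frames x n_mfcc) MFCC matrix for a mono signal."""
    # 1. Frame the signal and apply a Hann window
    window = np.hanning(n_fft)
    frames = np.array([signal[s:s + n_fft] * window
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # 2. Per-frame power spectrum
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # 3. Triangular mel filterbank, filters spaced evenly on the mel scale
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, c):
            fbank[i - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fbank[i - 1, k] = (hi - k) / max(hi - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # 4. DCT-II decorrelates the log-mel energies; keep the first n_mfcc
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T
```

Each row of the returned matrix is one frame's feature vector; a genre classifier such as the paper's SVM would then be trained on these (or on per-file summary statistics of them).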
3. Accuracy of fast information retrieval of data for novel vehicle parking system using k-nearest neighbor algorithm compared over fuzzy algorithm.
- Author
- Prakash, S. Bhanu and Anithaashri, T. P.
- Subjects
- K-nearest neighbor classification, FUZZY algorithms, INFORMATION retrieval, AUTOMOBILE parking, ACCURACY of information, INDEPENDENT variables, DATABASES
- Abstract
The K-Nearest Neighbor approach outperforms the Fuzzy Algorithm in terms of accuracy during quick data retrieval for an innovative car parking system. Materials and methods: the K-Nearest Neighbor Algorithm was applied to a 20-sample vehicle parking database, improving the innovative parking system's data retrieval accuracy and speed. The implementation was accomplished by developing a Python web application with the aid of Anaconda Navigator. As a result, a car parking database constructed using the K-Nearest Neighbor Algorithm improves efficiency in terms of secure accessibility to 87 percent, compared with only 75 percent for the Fuzzy Approach. The time taken to log in and access the database was analysed using SPSS (Statistical Package for the Social Sciences) with time and size as independent variables and a significance level of p < 0.05. The K-Nearest Neighbor Algorithm, which is significantly more accurate and efficient than the Fuzzy Approach, has been employed in a protected system to improve the car parking database. With data as the dependent variable and sample size as the independent variable, SPSS analysis can show how reliable the data is. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
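The k-NN classification step the abstract describes reduces to a few lines of plain Python; the parking-lot features and zone labels below are invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k nearest neighbours (Euclidean distance).
    `train` is a list of (feature_vector, label) pairs."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Invented parking records: (hour of day, occupancy ratio) -> zone
train = [((8, 0.2), "A"), ((9, 0.3), "A"), ((12, 0.5), "A"),
         ((18, 0.9), "B"), ((19, 0.8), "B"), ((20, 0.95), "B")]
print(knn_predict(train, (18.5, 0.85)))  # → B
```

For a 20-sample database like the one in the study, this brute-force scan is entirely adequate; spatial indexes only pay off at much larger scales.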
4. Comparison of fully anonymous novel HMAC encryption algorithm with IDEA encryption algorithm for secured data retrieval with reduced time.
- Author
- Kumaran, G. Jitvan and Logu, K.
- Subjects
- INFORMATION retrieval, DATABASE management, DATA recovery, ALGORITHMS, CLOUD storage, TIME management
- Abstract
This paper provides a comprehensive analysis of secure data retrieval from a novel HMAC cloud administration scheme with respect to low time consumption in database management. To predict the time consumption of database management in cloud administration, we tested the HMAC encryption algorithm with a sample size of N = 20 against the IDEA encryption algorithm (IEA), also with a sample size of N = 20. HMAC encryption was used to reduce the time needed to recover data from more secure distributed storage. Comparing the two algorithms, HMAC has a higher utilisation rate (66.80 percent) than IDEA (62.97 percent). In general, the HMAC encryption algorithm outperforms the IDEA encryption algorithm (p < 0.05, 2-tailed). In database management frameworks, a novel HMAC encryption computation helps reduce the possibility of unauthorised data recovery. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
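The HMAC side of the comparison can be shown with Python's standard library. Note that, strictly speaking, HMAC authenticates data rather than encrypting it; this sketch shows the integrity-check pattern only, with an invented record and key:

```python
import hmac
import hashlib

def protect(record: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so tampering is detectable on retrieval."""
    return record + hmac.new(key, record, hashlib.sha256).digest()

def retrieve(blob: bytes, key: bytes) -> bytes:
    """Verify the tag before returning the record; raise on mismatch."""
    record, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, record, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("integrity check failed")
    return record

key = b"shared-secret"
blob = protect(b"record-42", key)
assert retrieve(blob, key) == b"record-42"
```

`hmac.compare_digest` is used instead of `==` to avoid timing side channels during verification.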
5. Experience in evaluating legal scientific information.
- Author
- Normatov, Sherbek, Rakhmatullaev, Marat, Urinkulov, Odil, and Khalmuratov, Omonboy
- Subjects
- LEGAL literature, INFORMATION retrieval, INFORMATION resources, SCIENCE education, RESEARCH & development
- Abstract
Scientific information resources in the legal field are an important source of information for training qualified personnel, developing scientific research, and regulating legal relations. At the same time, the sheer number of such resources makes finding the most relevant sources of legal information in specific legal situations highly challenging. It is necessary to take into account not only the relevance of information but also its classical quality characteristics, such as accuracy, completeness, and ease of use. This creates the need to evaluate sources of scientific information in the legal field. The purpose of this article is to develop a method for classifying and evaluating such sources. Evaluating information sources related to science and education is an important foundation of intelligent information retrieval, improving the efficiency of providing users with the necessary quality literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
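A weighted-criteria score is one simple way to operationalise the kind of source evaluation the abstract calls for; the criteria names, weights, and ratings below are illustrative assumptions, not the authors' method:

```python
def score_source(ratings, weights):
    """Weighted mean of per-criterion ratings in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total

# Hypothetical criteria and weights for one legal information source
weights = {"relevance": 4, "accuracy": 3, "completeness": 2, "ease_of_use": 1}
ratings = {"relevance": 0.9, "accuracy": 0.8, "completeness": 0.6, "ease_of_use": 1.0}
print(round(score_source(ratings, weights), 2))  # → 0.82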
6. Improvements on the damage calculations using evaluated nuclear data and NJOY.
- Author
- Chen, Shengli and Bernard, David
- Subjects
- NUCLEAR energy, ELECTRONIC data processing, NEUTRONS, NUCLEAR cross sections, INFORMATION retrieval
- Abstract
NJOY is the only open-source nuclear data processing code that can calculate neutron-induced displacement damage cross sections from evaluated nuclear data. However, some issues exist in NJOY's calculation of damage cross sections, including inconsistency for the neutron capture reaction between photon data given in MF6 and in MF12-MF15, questionable or even incorrect recoil nuclear data in MF6, discrepant damage cross sections obtained with different approaches, and potential underestimation above 20 MeV due to the storage of nuclear data in MT5. These issues should be addressed through improvements to both the evaluated nuclear data and the NJOY code. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Intelligent system for detection and classification of diabetic retinopathy.
- Author
- Ramanan, Anila Vengathanath and Kumar, Kala Krishna
- Subjects
- DIABETIC retinopathy, MACHINE learning, FEATURE extraction, INFORMATION retrieval, PEOPLE with diabetes
- Abstract
Diabetic retinopathy is a common medical condition in diabetic patients, caused by prolonged elevated blood sugar levels. A severe case of diabetic retinopathy can lead to complete blindness. Diabetic retinopathy is especially dangerous because it shows no or very few symptoms until it becomes severe and incurable, so early-stage detection plays a vital role in preserving eye vision and enabling proper treatment. Detecting and classifying diabetic retinopathy from retina images manually is costly, time-consuming, and, because it is done by a doctor, prone to errors. An automated model for early-stage detection of DR would help doctors identify diabetic retinopathy and give proper treatment to those who need it. A few attempts have been made to automate the detection and classification of DR from retina images. This work proposes a machine learning model that detects diabetic retinopathy from a coloured fundus image dataset and classifies it into various stages. The model uses a pre-trained ResNet152 model for feature extraction and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. A Computational Inflection for Scientific Discovery.
- Author
- HOPE, TOM, DOWNEY, DOUG, ETZIONI, OREN, WELD, DANIEL S., and HORVITZ, ERIC
- Subjects
- SCIENTIFIC knowledge, LANGUAGE models, SCIENTIFIC method, ARTIFICIAL intelligence, INFORMATION retrieval, NATURAL language processing, COGNITION, HUMAN-artificial intelligence interaction
- Abstract
This article presents an overview of task-guided scientific knowledge retrieval as a way for researchers to overcome the cognitive bottleneck created by the limits of human cognitive capacity in an age of explosive growth in digital information. Topics include prototypes of task-guided scientific knowledge retrieval, a look at novel representations, tools, and services, and a review of systems that aid researchers in all aspects of scientific inquiry and discovery.
- Published
- 2023
- Full Text
- View/download PDF
9. A Tale of Two Tools: Comparing LibKey Discovery to Quicklinks in Primo VE.
- Author
- Locascio, Jill K. and Rubel, Dejah
- Subjects
- WEB browsers, MEDICAL libraries, ACADEMIC libraries, INTERNET searching, ONLINE library catalogs, INFORMATION retrieval, ACCESS to information, OPEN access publishing, DESCRIPTIVE statistics, DOCUMENT markup languages
- Abstract
The article reviews the native Quicklinks direct linking tool from Ex Libris and the LibKey Discovery linking tool from Third Iron.
- Published
- 2023
- Full Text
- View/download PDF
10. A hierarchical shape description approach and its application in similarity measurement of polygon entities.
- Author
- Ma, Jingzhen, Sun, Qun, Ma, Chao, Lyu, Zheng, Sun, Shijie, and Wen, Bowei
- Subjects
- SHAPE measurement, MULTISENSOR data fusion, INFORMATION retrieval, POLYGONS, INFORMATION processing, MEASUREMENT
- Abstract
Spatial similarity provides an important basis for geographic information processing and is widely applied in multi-source data fusion and updating, data retrieval and query, and cartographic generalization. To address the shape description and similarity measurement of polygon entities, this study presents a new hierarchical shape description approach and examines its application in similarity measurement of polygon entities. Using rotation and segmentation methods, we first constructed a hierarchical shape description model for target polygon entities, then measured the global and hierarchical shape descriptions of polygon entities using the farthest-point-distance and geometric feature description methods, respectively. Finally, we constructed a comprehensive similarity measurement model through a weighted integration of position, size, direction, and shape. The proposed hierarchical shape description approach can be applied to shape similarity measurement of polygon elements, similarity measurement after spatial object simplification, and multi-scale polygon entity matching. The experimental results showed that the hierarchical shape description approach and similarity measurement model effectively measure spatial similarity between different polygon entities and achieve good results in application. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
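A toy version of the weighted similarity idea (position, size, and a farthest-point-distance shape descriptor) might look like the sketch below; the sub-measures and the weights are simplified stand-ins for the paper's model, not its actual formulas:

```python
import math

def area(poly):
    """Polygon area via the shoelace formula."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2

def centroid(poly):
    xs, ys = zip(*poly)
    return (sum(xs) / len(poly), sum(ys) / len(poly))

def fpd(poly):
    """Farthest-point distance per vertex, scaled to [0, 1], as a crude
    stand-in for the paper's global shape descriptor."""
    d = [max(math.dist(p, q) for q in poly) for p in poly]
    m = max(d)
    return [x / m for x in d]

def similarity(a, b, w=(0.3, 0.3, 0.4)):
    """Weighted combination of position, size and shape similarity."""
    pos = 1 / (1 + math.dist(centroid(a), centroid(b)))
    size = min(area(a), area(b)) / max(area(a), area(b))
    fa, fb = fpd(a), fpd(b)
    n = min(len(fa), len(fb))
    shape = 1 - sum(abs(x - y) for x, y in zip(fa, fb)) / n
    return w[0] * pos + w[1] * size + w[2] * shape

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
assert abs(similarity(square, square) - 1.0) < 1e-9
```

Identical polygons score 1.0, and each component degrades independently as position, area, or outline shape diverges, which is the basic property a matching pipeline needs.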
11. The thesan project: public data release of radiation-hydrodynamic simulations matching reionization-era JWST observations.
- Author
- Garaldi, Enrico, Kannan, Rahul, Smith, Aaron, Borrow, Josh, Vogelsberger, Mark, Pakmor, Rüdiger, Springel, Volker, Hernquist, Lars, Galárraga-Espinosa, Daniela, Yeh, Jessica Y.-C., Shen, Xuejian, Xu, Clara, Neyer, Meredith, Spina, Benedetta, Almualla, Mouza, and Zhao, Yu
- Subjects
- DATA release, COSMIC dust, SPACE telescopes, COMPUTATIONAL complexity, N-body simulations (Astronomy), GALAXY formation, INFORMATION retrieval
- Abstract
Cosmological simulations serve as invaluable tools for understanding the Universe. However, the technical complexity and substantial computational resources required to generate such simulations often limit their accessibility within the broader research community. Notable exceptions exist, but most are not suited for simultaneously studying the physics of galaxy formation and cosmic reionization during the first billion years of cosmic history. This is especially relevant now that a fleet of advanced observatories (e.g. the James Webb Space Telescope, Nancy Grace Roman Space Telescope, SPHEREx, ELT, SKA) will soon provide a holistic picture of this defining epoch. To bridge this gap, we publicly release all simulation outputs and post-processing products generated within the thesan simulation project at www.thesan-project.com. This project focuses on the z ≥ 5.5 Universe, combining a radiation-hydrodynamics solver (arepo-rt), a well-tested galaxy formation model (IllustrisTNG) and cosmic dust physics to provide a comprehensive view of the Epoch of Reionization. The thesan suite includes 16 distinct simulations, each varying in volume, resolution, and underlying physical models. This paper outlines the unique features of these new simulations, the production and detailed format of the wide range of derived data products, and the process for data retrieval. Finally, as a case study, we compare our simulation data with a number of recent observations from the James Webb Space Telescope, affirming the accuracy and applicability of thesan. The examples also serve as prototypes for how to utilize the released data set to perform comparisons between predictions and observations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Systems for electronic documentation and sharing of advance care planning preferences: a scoping review.
- Author
- Çevik, Hüsna Sarıca, Muente, Catharina, Muehlensiepen, Felix, Birtwistle, Jacqueline, Pachanov, Alexander, Pieper, Dawid, and Allsop, Matthew J.
- Subjects
- DOCUMENTATION, DIGITAL technology, MEDICAL information storage & retrieval systems, MEDICAL quality control, DO-not-resuscitate orders, CINAHL database, DESCRIPTIVE statistics, DECISION making, ELECTRONIC data interchange, SYSTEMATIC reviews, MEDLINE, ELECTRONIC health records, LITERATURE reviews, ELIGIBILITY (Social aspects), INFORMATION retrieval, ADVANCE directives (Medical care), PSYCHOLOGY information storage & retrieval systems, ACCESS to information
- Abstract
Digital approaches to support advance care planning (ACP) documentation and sharing are increasingly being used, with a lack of research to characterise their design, content, and use. This study aimed to characterise how digital approaches are being used to support ACP documentation and sharing internationally. A scoping review was performed in accordance with the JBI (formerly Joanna Briggs Institute) guidelines and the PRISMA 2020 checklist, prospectively registered on Open Science Framework (). MEDLINE, EMBASE, PsycINFO, ACM Digital, IEEE Xplore and CINAHL were searched in February 2023. Only publications in English, published from 2008 onwards were considered. Eligibility criteria included a focus on ACP and electronic systems. Out of 2,393 records, 34 reports were included, predominantly from the USA (76.5%). ACP documentation is typically stored in electronic health records (EHRs) (67.6%), with a third (32.4%) enabling limited patient access. Non-standard approaches (n = 15; 44.1%) were the commonest study design of included reports, with outcome measures focusing on the influence of systems on the documentation (i.e. creation, quantity, quality, frequency or timing) of ACP information (n = 23; 67.6%). Digital approaches to support ACP are being implemented and researched internationally with an evidence base dominated by non-standard study designs. Future research is needed to extend outcome measurement to consider aspects of care quality and explore whether the content of existing systems aligns with aspects of care that are valued by patients. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Development and validation of search filters to retrieve medication discontinuation articles in Medline and Embase.
- Author
- Morel, Thomas, Nguyen‐Soenen, Jérôme, Thompson, Wade, and Fournier, Jean‐Pascal
- Subjects
- MEDICAL information storage & retrieval systems, DATABASE searching, BIBLIOGRAPHIC databases, COMPUTER software, DATA analysis, TERMINATION of treatment, HEALTH, STATISTICAL sampling, MEDLINE, PUBLISHING, INFORMATION retrieval, STATISTICS, CONFIDENCE intervals, VOCABULARY, TEXT messages, SENSITIVITY & specificity (Statistics)
- Abstract
Background: Medication discontinuation studies explore the outcomes of stopping a medication compared to continuing it. Comprehensively identifying medication discontinuation articles in bibliographic databases remains challenging due to variability in terminology. Objectives: To develop and validate search filters to retrieve medication discontinuation articles in Medline and Embase. Methods: We identified medication discontinuation articles in a convenience sample of systematic reviews. We used the primary articles to create two reference sets, for Medline and Embase respectively. The reference sets were divided equally, by randomization, into development and validation sets. Terms relevant to discontinuation were identified by term frequency analysis in the development sets and combined to develop two search filters that maximized relative recall. The filters were validated against the validation sets. Relative recalls were calculated with their 95% confidence intervals (95% CI). Results: We included 316 articles for Medline and 407 articles for Embase, from 15 systematic reviews. The optimized Medline search filter combined 7 terms; the optimized Embase filter combined 8 terms. The relative recalls were, respectively, 92% (95% CI: 87–96) and 91% (95% CI: 86–94). Conclusions: We developed two search filters for retrieving medication discontinuation articles in Medline and Embase. Further research is needed to estimate the precision and specificity of the filters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
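Relative recall and a 95% confidence interval of the kind reported above can be computed directly; the Wilson score interval is used here as one common choice (the authors' exact method may differ), and the counts are hypothetical:

```python
import math

def relative_recall(retrieved_ids, reference_ids):
    """Fraction of the reference set that the filter retrieved."""
    return len(set(retrieved_ids) & set(reference_ids)) / len(reference_ids)

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical validation set: 158 reference articles, 145 retrieved
lo, hi = wilson_ci(145, 158)
print(f"relative recall {145 / 158:.0%}, 95% CI {lo:.0%}-{hi:.0%}")
```

The Wilson interval behaves better than the naive normal approximation when the proportion is near 1, which is exactly the regime a high-recall filter operates in.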
14. Expert searchers identified time, team, technology and tension as challenges when carrying out supplementary searches for systematic reviews: A thematic network analysis.
- Author
- Briscoe, Simon, Abbott, Rebecca, and Melendez‐Torres, G. J.
- Subjects
- TEAMS in the workplace, GREY literature, QUALITATIVE research, MEDICAL librarians, INTERVIEWING, INFORMATION storage & retrieval systems, INFORMATION resources, SOCIAL work research, SYSTEMATIC reviews, THEMATIC analysis, INFORMATION retrieval, RESEARCH methodology, MEDICAL coding, TIME, PSYCHOSOCIAL factors, ACCESS to information
- Abstract
Background: Systematic reviews require detailed planning of complex processes, which can present logistical challenges. Understanding these challenges can help with the planning and execution of tasks. Objectives: To describe the perspectives of expert searchers on the main logistical challenges when carrying out supplementary searches for systematic reviews, in particular forward citation searching and web searching. Methods: Qualitative interviews were undertaken with 15 experts on searching for studies for systematic reviews (e.g. information specialists) working in health and social care research settings. Interviews were undertaken by video call between September 2020 and June 2021. Data analysis used thematic network analysis. Results: We identified three logistical challenges of using forward citation searching and web searching, organised under the global theme of 'tension': time, team and technology. Several subthemes supported the organising themes, including allocating time, justifying time and keeping to time; reviewer expectations and contact with review teams; and access to resources and reference management. Conclusion: Forward citation searching and web searching are logistically challenging search methods for a systematic review. An understanding of these challenges should encourage expert searchers and review teams to maintain open channels of communication, which should also facilitate improved working relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Bridging Qualitative Data Silos: The Potential of Reusing Codings Through Machine Learning Based Cross-Study Code Linking.
- Author
- Wildemann, Sergej, Niederée, Claudia, and Elejalde, Erick
- Subjects
- RESEARCH questions, INFORMATION retrieval, RESEARCH personnel, MACHINE learning, AMBIGUITY, DATA analysis, SEMANTICS
- Abstract
For qualitative data analysis (QDA), researchers assign codes to text segments to arrange the information into topics or concepts. These annotations facilitate information retrieval and the identification of emerging patterns in unstructured data. However, this metadata is typically not published or reused after the research. Subsequent studies with similar research questions require a new definition of codes and do not benefit from other analysts' experience. Machine learning (ML) based classification seeded with such data remains a challenging task due to the ambiguity of code definitions and the inherent subjectivity of the exercise. Previous attempts to support QDA using ML relied on linear models and examined only individual datasets that were either smaller or coded specifically for that purpose. However, we show that modern approaches effectively capture at least part of the codes' semantics and may generalize to multiple studies. We analyze the performance of multiple classifiers across three large real-world datasets. Furthermore, we propose an ML-based approach to identify semantic relations of codes in different studies to show thematic faceting, enhance retrieval of related content, or bootstrap the coding process. These encouraging results suggest how analysts might benefit from prior interpretation efforts, potentially yielding new insights into qualitative data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. How much freedom does an effectiveness metric really have?
- Author
- Moffat, Alistair and Mackenzie, Joel
- Subjects
- DATABASE management, COMPUTER software, T-test (Statistics), DATA analysis, RESEARCH funding, INFORMATION storage & retrieval systems, INFORMATION technology, DESCRIPTIVE statistics, INFORMATION needs, SEARCH engines, INFORMATION retrieval, INFORMATION science, MEDICAL coding, STATISTICS, CLOUD computing
- Abstract
It is tempting to assume that because effectiveness metrics have free choice to assign scores to search engine result pages (SERPs), there must be a similar degree of freedom in the relative order that SERP pairs can be put into. In fact, that second freedom is, to a considerable degree, illusory. That is because if one SERP in a pair has been given a certain score by a metric, fundamental ordering constraints in many cases dictate that the score for the second SERP must be either not less than, or not greater than, the score assigned to the first SERP. We refer to these fixed relationships as innate pairwise SERP orderings. Our first goal in this work is to describe and defend those pairwise SERP relationship constraints, and to tabulate their relative occurrence via both exhaustive and empirical experimentation. We then consider how to employ such innate pairwise relationships in IR experiments, leading to a proposal for a new measurement paradigm. Specifically, we argue that tables of results in which many different metrics are listed for champion-versus-challenger system comparisons should be avoided; instead, a single metric should be argued for in principled terms, with any relationships identified by that metric then reinforced via an assessment of the innate relationship as to whether other metrics are likely to yield the same system-versus-system outcome. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
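A minimal illustration of an innate pairwise ordering: when one SERP is at least as relevant as another at every rank position, any reasonable metric must order the pair the same way. The dominance condition below is a simple sufficient case, not the paper's full constraint analysis:

```python
def precision_at_k(rels, k=5):
    """Fraction of the top-k results judged relevant."""
    return sum(rels[:k]) / k

def reciprocal_rank(rels):
    """1/rank of the first relevant result (0 if none)."""
    for rank, rel in enumerate(rels, start=1):
        if rel:
            return 1 / rank
    return 0.0

def dominates(better, worse):
    """True if `better` is at least as relevant as `worse` at every rank,
    a simple sufficient condition for an innate pairwise ordering."""
    return all(x >= y for x, y in zip(better, worse))

a = [0, 1, 0, 0, 1]   # binary relevance judgements by rank position
b = [1, 1, 0, 0, 1]
if dominates(b, a):
    # Any sensible effectiveness metric must agree with this ordering.
    assert precision_at_k(b) >= precision_at_k(a)
    assert reciprocal_rank(b) >= reciprocal_rank(a)
```

When such a constraint holds, reporting a second metric for the pair adds no ordering information, which is the intuition behind the paper's single-metric proposal.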
17. Cross-Lingual Information Retrieval from Multilingual Construction Documents Using Pretrained Language Models.
- Author
- Kim, Jungyeon, Chung, Sehwan, and Chi, Seokho
- Subjects
- CROSS-language information retrieval, LANGUAGE models, MACHINE translating, INFORMATION retrieval, AUTOMATIC train control
- Abstract
The growth of the global construction market has attracted international companies to participate in overseas projects. Overseas projects are extremely dynamic, with numerous uncertainties, raising the need to collect information about construction in host countries. Due to the vast amount of text data in the construction industry, an automated method, specifically information retrieval, is required to find the necessary information. Previous studies have suggested automated methods to review various construction documents. However, these studies required substantial manual effort and mainly focused on a single language, resulting in the loss of vital information buried in documents written in the host country's language. To address these limitations, this study proposes a cross-lingual information retrieval (CLIR) framework using pretrained Bidirectional Encoder Representations from Transformers (BERT) models to retrieve information from multilingual construction documents. The proposed framework employs language models (i.e., monolingual, multilingual, and cross-lingual) and trains them on a construction data set to enhance their handling of construction-specific text. The framework achieved reliable retrieval performance even with minimal additional training on domain-specific data. The results indicate that training on the domain data set raises the level of retrieval, increasing the mean reciprocal rank of a specific task by up to 0.2128. By employing a monolingual model with machine translation, CLIR in a specific domain can be performed effectively without the need for a labeled data set. The suggested CLIR framework offers a practical alternative for dealing with construction documents in overseas projects, reducing time and cost while improving risk identification and mitigation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
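Mean reciprocal rank (MRR), the retrieval measure cited above, is straightforward to compute; the query and document ids here are invented:

```python
def mean_reciprocal_rank(results, relevant):
    """Average 1/rank of the first relevant document over all queries.
    `results` maps query -> ranked doc ids; `relevant` maps query -> the
    known correct doc id."""
    total = 0.0
    for query, ranking in results.items():
        if relevant[query] in ranking:
            total += 1 / (ranking.index(relevant[query]) + 1)
    return total / len(results)

results  = {"q1": ["d3", "d1", "d7"], "q2": ["d2", "d9"], "q3": ["d5", "d4"]}
relevant = {"q1": "d1", "q2": "d2", "q3": "d8"}
print(mean_reciprocal_rank(results, relevant))  # → 0.5
```

An MRR gain of 0.2128, as reported in the abstract, corresponds roughly to the correct document moving several positions closer to the top of the ranking on average.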
18. Survey on Recommender Systems for Biomedical Items in Life and Health Sciences.
- Author
- Pato, Matilde, Barros, Márcia, and Couto, Francisco M.
- Published
- 2024
- Full Text
- View/download PDF
19. An efficient labeled memory system for learned indexes.
- Author
- Yuxuan Mo, Jingnan Jia, Pengfei Li, and Yu Hua
- Subjects
- COMPUTER storage devices, CACHE memory, DATA warehousing, INFORMATION retrieval, BANDWIDTH allocation
- Abstract
The appearance and wide use of new memory hardware bring significant changes to the conventional vertical memory hierarchy, which fails to handle contention for shared hardware resources and expensive data movements. To deal with these problems, existing schemes have to rely on inefficient scheduling strategies that also incur extra temporal, spatial and bandwidth overheads. Based on the insight that shared hardware resources tend to be offered uniformly and hierarchically to the requests of co-located applications in memory systems, we present an efficient abstraction of memory hierarchies, called Label, which establishes the connection between the application layer and the underlying hardware layer. Based on labels, our paper proposes LaMem, a labeled, resource-isolated and cross-tiered memory system that leverages the way-based partitioning technique for shared resources to guarantee the QoS demands of applications, while supporting a fast and low-overhead cache repartitioning technique. Besides, we customize LaMem for the learned index, which fundamentally replaces storage structures with computation models, as a case study to verify the applicability of LaMem. Experimental results demonstrate the efficiency and efficacy of LaMem. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
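The way-based partitioning idea behind LaMem can be illustrated with a label-partitioned LRU cache, where entries under one label can only evict entries under the same label; this is a toy sketch of the isolation property, not the paper's implementation:

```python
from collections import OrderedDict

class PartitionedCache:
    """Each label (e.g. a tenant or application class) gets a fixed share
    of capacity, so co-located workloads cannot evict each other."""
    def __init__(self, ways_per_label):
        self.ways = ways_per_label                       # label -> capacity
        self.parts = {lbl: OrderedDict() for lbl in ways_per_label}

    def get(self, label, key):
        part = self.parts[label]
        if key in part:
            part.move_to_end(key)                        # mark recently used
            return part[key]
        return None

    def put(self, label, key, value):
        part = self.parts[label]
        part[key] = value
        part.move_to_end(key)
        if len(part) > self.ways[label]:                 # evict LRU entry,
            part.popitem(last=False)                     # within this label only

cache = PartitionedCache({"latency-critical": 2, "batch": 1})
cache.put("batch", "k1", 1)
cache.put("latency-critical", "a", 10)
cache.put("latency-critical", "b", 11)
cache.put("batch", "k2", 2)      # evicts k1, but never touches "a" or "b"
assert cache.get("latency-critical", "a") == 10
assert cache.get("batch", "k1") is None
```

Repartitioning in this model is just changing the per-label capacities, which hints at why a label abstraction makes fast, low-overhead repartitioning possible.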
20. A globally shared resource paradigm for encoded storage systems in the public cloud.
- Author
- Zhiyue Li and Guangyan Zhang
- Subjects
- DATA warehousing, FAULT tolerance (Engineering), INFORMATION retrieval, CLOUD computing, INFORMATION sharing
- Abstract
Public clouds favor sharing of storage resources, in which many tenants acquire bandwidth and storage capacity from a shared storage pool. To provide high availability, data are often encoded to provide fault tolerance at low storage cost. Accordingly, efficiently organizing an encoded storage system for shared I/Os is critical for application performance. This is usually hard to achieve, as different applications have different stripe configurations and fault tolerance levels. In this paper, we first study the block trace from the Alibaba cloud and find that the I/O patterns of modern applications favor the resource sharing scheme. Based on this, we propose a globally shared resource paradigm for encoded storage systems in the public cloud. The paradigm can provide balanced load and fault tolerance for numerous disk pool sizes and arbitrary application stripe configurations. Furthermore, we demonstrate with two case studies that our theory can help address the device-specific problems of HDD and SSD RAID arrays with slight modifications: compared to the existing resource partition and resource sharing methods, our theory can improve the rebuild speed of HDD RAID arrays by 2.5× and reduce the P99 tail latency of SSD arrays by up to two orders of magnitude. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
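Encoding for fault tolerance can be illustrated with single-parity (RAID-5-style) XOR striping, the simplest member of the family of codes such systems use; real deployments typically use Reed-Solomon codes to tolerate more than one failure:

```python
def encode_stripe(blocks):
    """Single-parity encoding: parity = XOR of all data blocks.
    With one parity block the stripe survives the loss of any one device."""
    parity = bytes(len(blocks[0]))
    for blk in blocks:
        parity = bytes(a ^ b for a, b in zip(parity, blk))
    return parity

def rebuild(surviving, parity):
    """Recover the single lost block by XOR-ing parity with the survivors."""
    lost = parity
    for blk in surviving:
        lost = bytes(a ^ b for a, b in zip(lost, blk))
    return lost

data = [b"AAAA", b"BBBB", b"CCCC"]
p = encode_stripe(data)
assert rebuild([data[0], data[2]], p) == b"BBBB"   # device 1 failed
```

The rebuild step reads every surviving block in the stripe, which is why rebuild speed depends so strongly on how stripes are laid out across the shared disk pool.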
21. Telehealth-Based Information Retrieval and Extraction for Analysis of Clinical Characteristics and Symptom Patterns in Mild COVID-19 Patients.
- Author
- Jahaj, Edison, Gallos, Parisis, Tziomaka, Melina, Kallipolitis, Athanasios, Pasias, Apostolos, Panagopoulos, Christos, Menychtas, Andreas, Dimopoulou, Ioanna, Kotanidou, Anastasia, Maglogiannis, Ilias, and Vassiliou, Alice Georgia
- Subjects
- COVID-19, MEDICAL care, FEVER, DATA mining, INFORMATION retrieval, SYMPTOMS
- Abstract
Clinical characteristics of COVID-19 patients have mostly been described in hospitalised patients, yet most patients are managed in an outpatient setting. The COVID-19 pandemic transformed healthcare delivery models and accelerated the implementation and adoption of telemedicine solutions. We employed a modular remote monitoring system with multi-modal data collection, aggregation, and analytics features to monitor mild COVID-19 patients and report their characteristics and symptoms. At enrolment, the patients were equipped with wearables, which were associated with their accounts, provided the respective in-system consents, and, in parallel, reported the demographics and patient characteristics. The patients monitored their vitals and symptoms daily during a 14-day monitoring period. Vital signs were entered either manually or automatically through wearables. We enrolled 162 patients from February to May 2022. The median age was 51 (42–60) years; 44% were male, 22% had at least one comorbidity, and 73.5% were fully vaccinated. The vitals of the patients were within the normal range throughout the monitoring period. Thirteen patients were asymptomatic, while the rest had at least one symptom for a median of 11 (7–16) days. Fatigue was the most common symptom, followed by fever and cough. Loss of taste and smell was the longest-lasting symptom. Age correlated positively with the duration of fatigue, anorexia, and low-grade fever. Comorbidities, the number of administered doses, the days since the last dose, and the days since the positive test did not appear to affect the number of sick days or symptomatology. The i-COVID platform allowed us to provide remote monitoring and reporting of COVID-19 outpatients. We were able to report their clinical characteristics while simultaneously helping reduce the spread of the virus through hospitals by minimising hospital visits. The monitoring platform also offered advanced knowledge extraction and analytic capabilities to detect health condition deterioration and automatically trigger personalised support workflows. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. A Question and Answering Service of Typhoon Disasters Based on the T5 Large Language Model.
- Author
-
Xia, Yongqi, Huang, Yi, Qiu, Qianqian, Zhang, Xueying, Miao, Lizhi, and Chen, Yixiang
- Abstract
A typhoon disaster is a common meteorological disaster that seriously impacts natural ecology, the social economy, and even sustainable human development. Access to typhoon disaster information, and to the corresponding disaster prevention and mitigation strategies, is therefore crucial. However, traditional question answering (Q&A) methods exhibit shortcomings such as low information retrieval efficiency and poor interactivity, making it difficult to satisfy users' demands for accurate information. Consequently, this work proposes a typhoon disaster knowledge Q&A approach based on a large language model (T5). The method integrates two technical paradigms, domain fine-tuning and retrieval-augmented generation (RAG), to optimize the user interaction experience and improve the precision of disaster information retrieval. The process includes the following steps. First, this study selects information about typhoon disasters from open-source databases, such as Baidu Encyclopedia and Wikipedia. Using techniques such as slicing and masked language modeling, we generate a training set and 2204 Q&A pairs specifically focused on typhoon disaster knowledge. Second, we continuously pretrain the T5 model on the training set, encoding typhoon knowledge in the network's weights, and fine-tune the pretrained model on the Q&A pairs to adapt it to the downstream Q&A task. Third, when responding to user queries, we retrieve passages from external knowledge bases that are semantically similar to the queries to enrich the prompts, further improving the response quality of the fine-tuned model. Finally, we evaluate the constructed typhoon agent (Typhoon-T5) using different similarity-matching approaches. The method proposed in this work lays a foundation for the cross-integration of large language models with disaster information and is expected to promote the further development of GeoAI. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
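The retrieval-augmented step described in the abstract above (fetch passages semantically similar to the query, then enrich the prompt before the fine-tuned model answers) can be sketched in miniature. This is an illustrative stand-in only: it uses bag-of-words cosine similarity in place of the paper's semantic matching, and every function name is an assumption, not the authors' code:

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    """Tokenize into a lowercase word-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = bag_of_words(query)
    return sorted(passages, key=lambda p: cosine(q, bag_of_words(p)), reverse=True)[:k]

def build_prompt(query, passages, k=2):
    """Assemble a RAG-style prompt: retrieved context followed by the question."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In the paper's pipeline the assembled prompt would then be passed to the fine-tuned Typhoon-T5 model; the sketch only illustrates the retrieval and prompt-enrichment step.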
23. Detecting common features from point patterns for similarity measurement using matrix decomposition.
- Author
-
Zhang, Yifan and Yu, Wenhao
- Subjects
- *
MATRIX decomposition , *VECTOR spaces , *INFORMATION retrieval , *DATABASES , *IMAGE analysis - Abstract
Similarity of point patterns is critical to geographic information retrieval. Many methods depend on measuring the similarity between point patterns within a spatial database. However, previous research has focused mainly on point density, which is only one aspect of a point pattern. A point distribution can be complex, with numerous alignments of point groups that usually carry different geographical meanings in reality. In this paper, we propose a new method that uses image analysis techniques to consider the characteristics of a point pattern comprehensively. Specifically, given a set of point datasets falling in the same region, our method first generates point intensity surfaces to transform the original vector space into a raster space; it then constructs a matrix describing all the pattern-related information. Finally, point pattern similarity is calculated by decomposing this matrix into a lower-order representation and factorized basis features. Owing to the matrix decomposition, the proposed method can eliminate noise from the original data and assess the similarity of two patterns with emphasis on their major features. As a case study, our method proves effective in discovering regularity in taxi pick-up/drop-off point datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
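The pipeline in the abstract above (rasterize each point set to an intensity surface, stack the surfaces into a matrix, then compare patterns through a low-rank decomposition) can be sketched roughly as below. This is an illustrative stand-in, not the paper's algorithm: simple grid counts replace a smoothed intensity surface, a rank-1 power iteration replaces a full factorization, and all names are assumptions:

```python
from math import sqrt

def intensity_grid(points, n=4):
    """Rasterize points in the unit square into an n x n count grid,
    returned as a single L2-normalized flat vector."""
    grid = [[0.0] * n for _ in range(n)]
    for x, y in points:
        i = min(int(y * n), n - 1)
        j = min(int(x * n), n - 1)
        grid[i][j] += 1.0
    flat = [v for row in grid for v in row]
    norm = sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]

def leading_component(columns, iters=50):
    """Power iteration for the dominant basis feature of the matrix whose
    columns are the flattened intensity grids (a rank-1 stand-in for a
    full matrix decomposition)."""
    d = len(columns[0])
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        w = [0.0] * d
        for c in columns:
            proj = sum(ci * vi for ci, vi in zip(c, v))
            for k in range(d):
                w[k] += proj * c[k]
        norm = sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return v

def similarity(p1, p2, basis):
    """Compare two patterns by their loadings on the shared basis feature;
    1.0 means identical loadings on the major feature."""
    a = sum(x * y for x, y in zip(p1, basis))
    b = sum(x * y for x, y in zip(p2, basis))
    return 1.0 - abs(a - b)
```

Because the comparison happens in the factorized space rather than cell-by-cell, small perturbations of individual points (noise) barely move a pattern's loading on the dominant feature, which mirrors the noise-elimination merit claimed in the abstract.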
24. Transfer Learning-Based Hyperspectral Image Classification Using Residual Dense Connection Networks.
- Author
-
Zhou, Hao, Wang, Xianwang, Xia, Kunming, Ma, Yi, and Yuan, Guowu
- Subjects
- *
IMAGE recognition (Computer vision) , *FEATURE extraction , *HYPERSPECTRAL imaging systems , *CLASSIFICATION algorithms , *SIGNAL classification , *SPECTRAL imaging , *MULTICASTING (Computer networks) , *INFORMATION retrieval - Abstract
The extraction of effective classification features from high-dimensional hyperspectral images, impeded by the scarcity of labeled samples and uneven sample distribution, represents a formidable challenge in hyperspectral image classification. Traditional few-shot learning methods confront the dual dilemma of limited annotated samples and the need for deeper, more effective features from complex hyperspectral data, often yielding suboptimal outcomes. The prohibitive cost of sample annotation further exacerbates the challenge, making it difficult to rely on a scant number of annotated samples for effective feature extraction. Prevailing high-accuracy algorithms require abundant annotated samples and falter in deriving deep, discriminative features from limited data, compromising classification performance for complex substances. This paper advocates integrating advanced spectral–spatial feature extraction with meta-transfer learning to classify hyperspectral signals when labeled samples are insufficient. Initially trained on a source-domain dataset with ample labels, the model is transferred to a target domain with minimal samples, using dense connection blocks and three-dimensional convolutional residual connections to enhance feature extraction and maximize the retrieval of spatial and spectral information. The approach, validated on three diverse hyperspectral datasets (IP, UP, and Salinas), significantly surpasses existing classification algorithms and small-sample techniques in accuracy, demonstrating its applicability to high-dimensional signal classification under label constraints. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
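The "dense connection blocks" and residual connections named in the abstract above are two generic wiring patterns rather than specific layers. A framework-free sketch of both, with flat Python lists standing in for feature tensors and all names hypothetical, might look like:

```python
def dense_block(features, layers):
    """Dense connectivity: each layer receives the concatenation of the
    input and all earlier layers' outputs; the block returns everything
    concatenated (as DenseNet-style blocks do)."""
    collected = [features]
    for layer in layers:
        x = [v for feat in collected for v in feat]  # concatenate all so far
        collected.append(layer(x))
    return [v for feat in collected for v in feat]

def residual(layer, x):
    """Residual connection: output = layer(x) + x, elementwise, so gradients
    and low-level features pass straight through."""
    y = layer(x)
    return [a + b for a, b in zip(y, x)]
```

In the paper's setting the `layer` callables would be 3D convolutions over spectral–spatial cubes; the wiring, which is what lets deep features form from few samples, is the same.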
25. Updates to the Alliance of Genome Resources central infrastructure.
- Author
-
Consortium, The Alliance of Genome Resources
- Subjects
- *
BIOLOGICAL models , *DATABASES , *COMPUTER software , *DATA mining , *DATABASE management , *DATA curation , *ARTIFICIAL intelligence , *INFORMATION resources , *FISHES , *PROFESSIONS , *MICE , *RATS , *INFORMATION services , *INFORMATION retrieval , *CAENORHABDITIS elegans , *INSECTS , *ONTOLOGIES (Information retrieval) , *MACHINE learning , *GENOMES , *GENETICS , *YEAST , *ANURA - Abstract
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, Caenorhabditis elegans, Drosophila, zebrafish, frog, laboratory mouse, laboratory rat, and the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and application programming interfaces (APIs). Here, we focus on developments over the last 2 years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress toward a central persistent database to support curation, the data modeling that underpins harmonization, and progress toward a state-of-the-art literature curation system with integrated artificial intelligence and machine learning (AI/ML). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. What is new in FungiDB: a web-based bioinformatics platform for omics-scale data analysis for fungal and oomycete species.
- Author
-
Basenko, Evelina Y, Shanmugasundram, Achchuthan, Böhme, Ulrike, Starns, David, Wilkinson, Paul A, Davison, Helen R, Crouch, Kathryn, Maslen, Gareth, Harb, Omar S, Amos, Beatrice, McDowell, Mary Ann, Kissinger, Jessica C, Roos, David S, and Jones, Andrew
- Subjects
- *
DATA mining , *GENOMICS , *DATABASE management , *FUNGI , *BIOINFORMATICS , *GENE expression , *DATABASE design , *INFORMATION retrieval , *WEB development , *GENE expression profiling , *GENETICS , *PHENOTYPES , *USER interfaces , *ACCESS to information - Abstract
FungiDB (https://fungidb.org) serves as a valuable online resource that seamlessly integrates genomic and related large-scale data for a wide range of fungal and oomycete species. As an integral part of the VEuPathDB Bioinformatics Resource Center (https://veupathdb.org), FungiDB continually integrates both published and unpublished data addressing various aspects of fungal biology. Established in early 2011, the database has evolved to support 674 datasets. The datasets include over 300 genomes spanning various taxa (e.g. Ascomycota, Basidiomycota, Blastocladiomycota, Chytridiomycota, Mucoromycota, as well as Albuginales, Peronosporales, Pythiales, and Saprolegniales). In addition to genomic assemblies and annotation, over 300 extra datasets encompassing diverse information, such as expression and variation data, are also available. The resource also provides an intuitive web-based interface, facilitating comprehensive approaches to data mining and visualization. Users can test their hypotheses and navigate through omics-scale datasets using a built-in search strategy system. Moreover, FungiDB offers capabilities for private data analysis via the integrated VEuPathDB Galaxy platform. FungiDB also permits genome improvements by capturing expert knowledge through the User Comments system and the Apollo genome annotation editor for structural and functional gene curation. FungiDB facilitates data exploration and analysis and contributes to advancing research efforts by capturing expert knowledge for fungal and oomycete species. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. The Arabidopsis Information Resource in 2024.
- Author
-
Reiser, Leonore, Bakker, Erica, Subramaniam, Sabarinath, Chen, Xingguo, Sawant, Swapnil, Khosa, Kartik, Prithvi, Trilok, and Berardini, Tanya Z
- Subjects
- *
BRASSICACEAE , *DATABASES , *COMMUNITY health services , *GENOMICS , *PHENOMENOLOGICAL biology , *DATA curation , *GENETIC markers , *INFORMATION services , *GENETIC variation , *GENE expression , *INFORMATION retrieval , *PROTEOMICS , *PLANT physiology , *SEQUENCE analysis , *PLANT proteins - Abstract
Since 1999, The Arabidopsis Information Resource (www.arabidopsis.org) has been curating data about the Arabidopsis thaliana genome. Its primary focus is integrating experimental gene function information from the peer-reviewed literature and codifying it as controlled vocabulary annotations. Our goal is to produce a "gold standard" functional annotation set that reflects the current state of knowledge about the Arabidopsis genome. At the same time, the resource serves as a nexus for community-based collaborations aimed at improving data quality, access, and reuse. For the past decade, our work has been made possible by subscriptions from our global user base. This update covers our ongoing biocuration work, some of our modernization efforts that contribute to the first major infrastructure overhaul since 2011, the introduction of JBrowse2, and the resource's role in community activities such as organizing the structural reannotation of the genome. For gene function assessment, we used gene ontology annotations as a metric to evaluate: (1) what is currently known about Arabidopsis gene function and (2) the set of "unknown" genes. Currently, 74% of the proteome has been annotated to at least one gene ontology term. Of those loci, half have experimental support for at least one of the following aspects: molecular function, biological process, or cellular component. Our work sheds light on the genes for which we have not yet identified any published experimental data and have no functional annotation. Drawing attention to these unknown genes highlights knowledge gaps and potential sources of novel discoveries. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Multimodal learning with only image data: A deep unsupervised model for street view image retrieval by fusing visual and scene text features of images.
- Author
-
Wu, Shangyou, Yu, Wenhao, Zhang, Yifan, and Huang, Mengqiu
- Subjects
- *
IMAGE retrieval , *MULTIMODAL user interfaces , *TEXT recognition , *MACHINE learning , *CONVOLUTIONAL neural networks , *STREETS , *NAUTICAL charts , *INFORMATION retrieval - Abstract
As one of the classic tasks in information retrieval, image retrieval aims to identify the images sharing similar features with a query image, enabling users to conveniently find the information they need among a large number of images. Street view image retrieval, in particular, finds extensive applications in many fields, such as improving navigation and mapping services, formulating urban development plans, and analysing the historical evolution of buildings. However, the intricate foreground and background details of street view images, coupled with the lack of attribute annotations, make it one of the most challenging problems in practical applications. Current image retrieval research mainly relies either on visual models that depend entirely on an image's visual features, or on multimodal learning models that require additional data sources (e.g., annotated text). Yet creating annotated datasets is expensive, and street view images, which themselves contain a large amount of scene text, are often unannotated. This paper therefore proposes a deep unsupervised learning algorithm that combines visual and text features extracted from the image data alone to improve the accuracy of street view image retrieval. Specifically, we employ text detection algorithms to identify scene text, use a Pyramidal Histogram of Characters encoding predictor model to extract text information from images, deploy deep convolutional neural networks for visual feature extraction, and incorporate a contrastive learning module for image retrieval. Tested on three street view image datasets, our model holds certain advantages over state‐of‐the‐art multimodal models pre‐trained on extensive datasets, with fewer parameters and fewer floating point operations. Code and data are available at https://github.com/nwuSY/svtRetrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
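The fuse-and-rank idea in the abstract above (combine a visual embedding with a scene-text descriptor, then rank the gallery by similarity to the fused query) can be sketched in toy form. The character histogram below is a crude stand-in for the paper's Pyramidal Histogram of Characters (PHOC) predictor, and the function names and the `alpha` weighting are assumptions:

```python
from collections import Counter
from math import sqrt
import string

def phoc_like(text):
    """Crude stand-in for a PHOC descriptor: L2-normalized letter histogram."""
    counts = Counter(c for c in text.lower() if c in string.ascii_lowercase)
    vec = [counts.get(c, 0) for c in string.ascii_lowercase]
    norm = sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def fuse(visual, scene_text, alpha=0.5):
    """Weighted concatenation of the normalized visual embedding and
    the scene-text descriptor into one multimodal feature vector."""
    norm = sqrt(sum(v * v for v in visual)) or 1.0
    vis = [alpha * v / norm for v in visual]
    txt = [(1 - alpha) * v for v in phoc_like(scene_text)]
    return vis + txt

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def retrieve(query_feature, gallery, k=3):
    """Rank (name, feature) gallery entries by similarity to the query."""
    ranked = sorted(gallery, key=lambda item: cosine(query_feature, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

In the paper the visual vector would come from a deep CNN and the pairing of a query with its gallery match would be learned contrastively; the sketch shows only the fusion and ranking stage.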
29. Contextualizing and Expanding Conversational Queries without Supervision.
- Author
-
Krasakis, Antonios Minas, Yates, Andrew, and Kanoulas, Evangelos
- Abstract
The article focuses on contextualizing and expanding conversational queries without supervision, addressing conversational dependencies and query resolution challenges. Topics include zero-shot unified resolution-retrieval approaches, contextualization of query embeddings using conversation history, and query expansion techniques for improving conversational retrieval.
- Published
- 2024
- Full Text
- View/download PDF
30. Information Retrieval Evaluation Measures Defined on Some Axiomatic Models of Preferences.
- Author
-
Giner, Fernando
- Abstract
The article focuses on Information Retrieval Evaluation Measures defined on Axiomatic Models of Preferences (AMPs). Topics include the formal exploration of numeric, metric, and scale properties of effectiveness measures, the identification of join-irreducible elements in AMPs, and the characterization of retrieval measures such as precision, recall, and AP in terms of these elements.
- Published
- 2024
- Full Text
- View/download PDF
31. Understanding Feeling-of-Knowing in Information Search: An EEG Study.
- Author
-
Michalkova, Dominika, Rodriguez, Mario Parra, and Moshfeghi, Yashar
- Abstract
The article focuses on understanding the relationship between Feeling-of-Knowing (FOK) and the Anomalous State of Knowledge (ASK) in information search processes, exploring neurophysiological drivers associated with FOK. Topics include the distinct neurophysiological signatures in response to different states of knowledge, the role of metamemory and metacognition in information retrieval, and the implications of FOK as a distinctive state in IN realization.
- Published
- 2024
- Full Text
- View/download PDF
32. Shifting feedback agency to students by having them write their own feedback comments.
- Author
-
Nicol, David and Kushwah, Lovleen
- Subjects
- *
PSYCHOLOGICAL feedback , *HIGHER education , *TEACHER attitudes , *STUDENT attitudes , *INFORMATION retrieval - Abstract
In higher education, there is a tension between teachers providing comments to students about their work and students developing agency in producing that work. Most proposals to address this tension assume a dialogic conception of feedback where students take more agency in eliciting and responding to others' advice, recently framed as developing their feedback literacy. This conception does not however acknowledge the feedback agency students exercise implicitly during learning, through interactions with resources (e.g. textbooks, videos). This study therefore adopted a different framing - that all feedback is internally generated by students through comparing their work against different sources of reference information, human and material; and that agency is increased when these comparisons are made explicit. Students produced a literature review, compared it against information in two published reviews, and wrote their own self-feedback comments. The small sample size enabled detailed analysis of these comments and of students' experiences in producing them. Results show students can generate significant self-feedback by making resource comparisons, that this feedback can replace or complement teacher feedback, be activated when required and help students fine-tune feedback requests to teachers. This widely applicable methodology strengthens students' natural capacity for agency and makes dialogic feedback more effective. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Measuring what learners do in feedback: the feedback literacy behaviour scale.
- Author
-
Dawson, Phillip, Yan, Zi, Lipnevich, Anastasiya, Tai, Joanna, Boud, David, and Mahoney, Paige
- Subjects
- *
PSYCHOLOGICAL feedback , *STUDENT attitudes , *EDUCATIONAL evaluation , *EDUCATORS' attitudes , *INFORMATION retrieval - Abstract
Feedback can be powerful, but its effects are dependent on what students do. There has been intensive research in recent years under the banner of 'feedback literacy' to understand how to help students make the most of feedback. Although there are instruments to measure feedback literacy, they largely measure perceptions and orientations rather than what learners actually do. This paper documents the development and validation of the Feedback Literacy Behaviour Scale (FLBS), which is a self-report instrument intended to measure students' feedback behaviours. A framework for feedback literacy was constructed with five factors: Seek Feedback information (SF), Make Sense of information (MS), Use Feedback information (UF), Provide Feedback information (PF), and Manage Affect (MA). An initial set of 45 questions were reviewed in an iterative process by feedback experts, resulting in 39 questions that were trialled with 350 student participants from four countries. Our final survey of 24 questions was generally supported by confirmatory factor and Rasch analyses, and has acceptable test-retest reliability. The FLBS provides a more robust way for educators and researchers to capture behavioural indicators of feedback literacy and the impact of interventions to improve it. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Shaping information and knowledge on climate change technologies: A cross‐country qualitative analysis of carbon capture and storage results on Google search.
- Author
-
Rowland, Jussara, López‐Asensio, Sergi, Bagci, Ataberk, Delicado, Ana, and Prades, Ana
- Subjects
- *
ENVIRONMENTAL health , *CROSS-sectional method , *QUALITATIVE research , *CLIMATE change , *INFORMATION technology , *PUBLIC opinion , *SEARCH engines , *INFORMATION retrieval , *CARBON dioxide , *HEALTH promotion , *COMPARATIVE studies , *ACCESS to information - Abstract
Commercial search engines play a central role in shaping, defining, and promoting the information people have access to in contemporary societies. This is particularly true when it comes to emergent technologies, for which there is often limited available information in legacy media and other sources, thus having a strong bearing on public perceptions. In this article, we focus on how the Google search engine promotes information on carbon capture and storage (CCS). We explore how Google's ranking parameters and interface shape the information people access when searching for CCS through a qualitative analysis comparing the results in three countries (France, Spain, and Portugal). We focus on the content of the first search engine result pages (SERP) and consider both Google's ranking criteria and the content and format of promoted sources. The study reveals Google's influence in highlighting Wikipedia pages, Q&A‐formatted sources, and prioritizing online specialized media and private corporations. Additionally, we observe country‐specific variations in terms of actors and types of content, reflecting the level of interest and investment in the topic at the national level. These findings underscore the significant role of search engine mediations in shaping public perceptions and knowledge about emergent climate change technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Towards improving user awareness of search engine biases: A participatory design approach.
- Author
-
Paramita, Monica Lestari, Kasinidou, Maria, Kleanthous, Styliani, Rosso, Paolo, Kuflik, Tsvi, and Hopfgartner, Frank
- Subjects
- *
COMPUTERS , *RESEARCH funding , *INFORMATION technology , *INFORMATION retrieval , *INFORMATION-seeking behavior , *COGNITION - Abstract
Bias in news search engines has been shown to influence users' perceptions of a news topic and contribute to the polarisation of society. As a result, there is a need for news search engines that increase user awareness of biases in the search results. While technical approaches have been developed to mitigate biases in search, very few studies have investigated user preferences in interface designs for potentially raising their awareness of biases in news search engines. In this study, we utilized a participatory design methodology to develop eight prototypes with different features that could potentially be used to raise user awareness of biases in news search engines. We conducted three user studies, involving 132 participants with Computer Science backgrounds, to evaluate these prototypes. Our findings indicate the importance of news search engines that (a) inform users of possible biases in the results (bias visualization approach) and (b) allow users to access alternative search results (results‐reranking approach). Our study provides further insights into the strengths and possible risks of each approach, which are important for future research on designing interfaces for raising user awareness of biases in news search engines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Impact and development of an Open Web Index for open web search.
- Author
-
Granitzer, Michael, Voigt, Stefan, Fathima, Noor Afshan, Golasowski, Martin, Guetl, Christian, Hecking, Tobias, Hendriksen, Gijs, Hiemstra, Djoerd, Martinovič, Jan, Mitrović, Jelena, Mlakar, Izidor, Moiras, Stavros, Nussbaumer, Alexander, Öster, Per, Potthast, Martin, Srdič, Marjana Senčar, Megi, Sharikadze, Slaninová, Kateřina, Stein, Benno, and de Vries, Arjen P.
- Subjects
- *
COST control , *WORLD Wide Web , *DATABASE searching , *ECOSYSTEMS , *SEARCH engines , *INFORMATION retrieval , *WEB development , *APPLICATION software , *ACCESS to information - Abstract
Web search is a crucial technology for the digital economy. It is dominated, however, by a few gatekeepers focused on commercial success; web publishers have to optimize their content for these gatekeepers, resulting in a closed ecosystem of search engines as well as the risk of publishers sacrificing quality. To encourage an open search ecosystem and offer users genuine choice among alternative search engines, we propose the development of an Open Web Index (OWI). We outline six core principles for developing and maintaining an open index, based on open data principles, legal compliance, and collaborative technology development. The combination of an open index with what we call declarative search engines will facilitate the development of vertical search engines and innovative web data products (including, e.g., large language models), enabling a fair and open information space. This framework underpins the EU‐funded project OpenWebSearch.EU, marking the first step towards realizing an Open Web Index. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Is googling risky? A study on risk perception and experiences of adverse consequences in web search.
- Author
-
Häußler, Helena, Schultheiß, Sebastian, and Lewandowski, Dirk
- Subjects
- *
SOCIAL media , *INFORMATION resources , *MISINFORMATION , *DESCRIPTIVE statistics , *ATTITUDE (Psychology) , *SEARCH engines , *INFORMATION retrieval , *RESEARCH methodology , *RISK perception , *DATA analysis software - Abstract
Search engines, such as Google, have a considerable impact on society. Therefore, undesirable consequences, such as retrieving incorrect search results, pose a risk to users. Although previous research has reported the adverse outcomes of web search, little is known about how search engine users evaluate those outcomes. In this study, we show which aspects of web search are perceived as risky using a sample (N = 3884) representative of the German Internet population. We found that many participants are often concerned with adverse consequences immediately appearing on the search engine result page. For example, 45.2% of respondents are concerned about retrieving incorrect information. In contrast, consequences with a delayed impact are rarely perceived as a risk. Moreover, participants' experiences with adverse consequences are directly related to their risk perception. Our results demonstrate that people perceive risks related to web search. In addition to our study, there is a need for more independent research on the possible detrimental outcomes of web search to monitor and mitigate risks. Apart from risks for individuals, search engines with a massive number of users have an extraordinary impact on society; therefore, the acceptable risks of web search should be discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Perceptual (but not acoustic) features predict singing voice preferences.
- Author
-
Bruder, Camila, Poeppel, David, and Larrouy-Maestri, Pauline
- Subjects
- *
SINGING , *HUMAN voice , *INDIVIDUAL differences , *INFORMATION retrieval , *AUTOMATIC speech recognition - Abstract
Why do we prefer some singers to others? We investigated how far singing voice preferences can be traced back to objective features of the stimuli. To do so, we asked participants to rate short excerpts of singing performances in terms of how much they liked them, as well as on 10 perceptual attributes (e.g., pitch accuracy, tempo, breathiness). We modeled liking ratings based on these perceptual ratings, as well as on acoustic features and low-level features derived from Music Information Retrieval (MIR). Mean liking ratings for each stimulus were highly correlated between Experiments 1 (online, US-based participants) and 2 (in the lab, German participants), suggesting a role for attributes of the stimuli in grounding average preferences. We show that acoustic and MIR features explain barely any variance in liking ratings; in contrast, perceptual features of the voices achieved around 43% of prediction. Inter-rater agreement in liking and perceptual ratings was low, indicating substantial (and unsurprising) individual differences in participants' preferences and perception of the stimuli. Our results indicate that singing voice preferences are grounded not in acoustic attributes of the voices per se, but in how these features are perceptually interpreted by listeners. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
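The modeling step described in the abstract above (predicting liking ratings from perceptual ratings and reporting explained variance) reduces, in its simplest one-predictor form, to ordinary least squares with an R² score. A minimal sketch with hypothetical names, standing in for whatever model the authors actually fit:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for one perceptual predictor: returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Proportion of variance in the liking ratings explained by the fit."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

The paper's "around 43% of prediction" for perceptual features versus near-zero for acoustic/MIR features corresponds to comparing such explained-variance scores across feature sets.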
39. Impact of Channel Selection with Different Bandwidths on Retrieval at 50–60 GHz.
- Author
-
Zhang, Minjie, Ma, Gang, He, Jieying, and Zhang, Chao
- Subjects
- *
BANDWIDTHS , *METEOROLOGICAL satellites , *RADIATIVE transfer , *ENTROPY (Information theory) , *TRACE gases , *INFORMATION retrieval , *RADIATIVE transfer equation - Abstract
Microwave hyperspectral instruments are among the main atmospheric sounders of China's next-generation Fengyun meteorological satellites. To better apply microwave hyperspectral observations to atmospheric parameter retrieval and data assimilation, this paper analyzes the sensitivity of trace gases across five selected channel bandwidths using a radiative transfer model, based on simulated microwave hyperspectral radiances at 50–60 GHz. The method uses information entropy and weighting functions to select channels, and evaluates the impact of channel selection on the retrieval accuracy of atmospheric profiles before and after selection. The experimental results show that channel selection can reduce the number of channels by approximately 74.05% while retaining most of the information content, and that the resulting retrieval is significantly better than that of MWTS-III. After channel selection, the 10 MHz, 30 MHz, and 50 MHz bandwidths give the best retrieval results in the stratosphere, the whole atmosphere, and the troposphere, respectively. Considering the number of channels, computational cost, and retrieval results together, the channel selection method is effective. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
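The selection principle named in the abstract above (rank channels by information content using entropy and weighting functions, then keep a small subset) is commonly implemented as a greedy entropy-reduction loop. The sketch below uses scalar Gaussian assumptions and a crude prior-deflation step; it is illustrative only, not the paper's algorithm, and every name is hypothetical:

```python
from math import log2

def entropy_reduction(weight, prior_var, noise_var):
    """Information content of one channel under scalar Gaussian assumptions:
    H = 0.5 * log2(1 + signal_variance / noise_variance), where the signal
    variance is the prior profile variance seen through the channel's
    weighting function."""
    signal = sum(w * w * s for w, s in zip(weight, prior_var))
    return 0.5 * log2(1.0 + signal / noise_var)

def select_channels(weights, prior_var, noise_var, n_keep):
    """Greedy selection: repeatedly take the channel with the largest
    entropy reduction, then deflate the prior variance at the atmospheric
    levels that channel has already constrained."""
    prior = list(prior_var)
    chosen = []
    remaining = set(range(len(weights)))
    while remaining and len(chosen) < n_keep:
        best = max(remaining,
                   key=lambda i: entropy_reduction(weights[i], prior, noise_var))
        chosen.append(best)
        remaining.discard(best)
        # crude deflation: shrink prior variance in proportion to the
        # selected channel's (normalized) weighting function
        wmax = max(weights[best]) or 1.0
        prior = [s * (1.0 - 0.5 * (w / wmax)) for s, w in zip(prior, weights[best])]
    return chosen
```

Because already-constrained levels contribute less on later iterations, the loop naturally favors channels whose weighting functions peak at different heights, which is what lets ~26% of the channels retain most of the information content.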
40. An advanced spatial co-registration of cloud properties for the atmospheric Sentinel missions: Application to TROPOMI.
- Author
-
Argyrouli, Athina, Loyola, Diego, Romahn, Fabian, Lutz, Ronny, García, Víctor Molina, Hedelt, Pascal, Heue, Klaus-Peter, and Siddans, Richard
- Subjects
- *
REFLECTANCE measurement , *ALBEDO , *METEOROLOGICAL satellites , *INFRARED imaging , *SPECTRAL imaging , *INFORMATION retrieval , *ORBITS (Astronomy) - Abstract
The retrieval of cloud parameters from the atmospheric Sentinel missions require Earth reflectance measurements from a set of spectral bands. Frequently, the ground pixel footprints of the involved spectral bands are not fully aligned and therefore, special treatment is required within the operational algorithms. This so-called inter-band spatial mis-registration of passive spectrometers is present when the Earth reflectance measurements in different spectral bands are captured by different spectrometers. The cloud retrieval algorithm requires reflectance measurements in the UV (ultraviolet)/VIS (visible) band, where the first cloud parameter (i.e., radiometric cloud fraction) is retrieved from the OCRA (Optical Cloud Recognition Algorithm) algorithm. In addition, Earth reflectances in the NIR (near-infrared) band are needed for the retrieval of two additional cloud parameters (i.e., cloud height and cloud albedo or cloud-top height and optical thickness) from the ROCINN (Retrieval of Cloud Information using Neural Networks) algorithm. In the former TROPOMI (TROPOspheric Monitoring Instrument)/S5P (Sentinel-5 Precursor) retrieval, a co-registration scheme of the derived cloud parameters from the source band to the target band based on pre-calculated mapping weights from UV/VIS to NIR, and vice versa, is applied. In this paper we present a new scheme for the co-registration of the TROPOMI cloud parameters using collocated VIIRS (Visible Infrared Imaging Radiometer Suite)/SNPP (Suomi National Polar-orbiting Partnership) information. A great benefit of the new co-registration scheme based on the VIIRS data is that it improves the overall quality of the TROPOMI cloud products and, in addition, it allows the re-construction of the cloud parameters on the first UV/VIS detector pixel, which was impossible with the former scheme based on the static mapping tables. 
The latter practically means that a significant number of valid data points have been added to the TROPOMI cloud, total ozone, SO2, and HCHO products since November 26th, 2023 (orbit 31705), when UPAS version 2.6 with the new co-registration scheme was activated operationally. From a comparison analysis between the two techniques, we found that the largest differences appear mainly for inhomogeneous scenes. From a validation exercise of TROPOMI against VIIRS in the across-track flight direction, we found that the old co-registration scheme tends to smooth out cloud structures along the scanline, whereas such structures are maintained with the new scheme. The need to implement a similar inter-band spatial co-registration scheme is foreseen for the Sentinel-4/MTG-S (Meteosat Third Generation - Sounder) and Sentinel-5/MetOp-SG (Meteorological Operational Satellite - Second Generation) missions. In the case of the Sentinel-4 instrument, the external cloud information will originate from collocated FCI (Flexible Combined Imager) data on board the MTG-I (Meteosat Third Generation - Imager) satellite. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
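The former co-registration scheme described in the abstract above maps cloud parameters between bands using pre-calculated spatial overlap weights. A minimal sketch of that idea (illustrative only, not the operational UPAS code; the function name and weights are assumptions):

```python
# Illustrative sketch: map a cloud parameter retrieved on source-band ground
# pixels (e.g. UV/VIS) onto one target-band pixel (e.g. NIR) using
# pre-calculated spatial overlap weights, as in the former static scheme.
def coregister(values, weights):
    """Weighted average of source-pixel values onto one target pixel.

    values  -- cloud parameter (e.g. radiometric cloud fraction) per source pixel
    weights -- pre-computed spatial overlap weights, same length as values
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("target pixel has no overlapping source pixels")
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical NIR pixel overlapped by three UV/VIS pixels
frac_nir = coregister([0.2, 0.5, 0.9], [0.25, 0.5, 0.25])
```

The new VIIRS-based scheme replaces these static tables with collocated imager information, which is why it preserves cloud structure in inhomogeneous scenes that a fixed weighted average smooths out.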
41. Unveiling a century of Taraxacum officinale G.H. Weber ex Wiggers research: a scientometric analysis and thematically-based narrative review.
- Author
-
Taha, Manal Mohamed Elhassan and Abdelwahab, Siddig Ibrahim
- Subjects
- *
COMMON dandelion , *THEMATIC maps , *EVIDENCE gaps , *INFORMATION retrieval , *HEAVY metals , *KNOWLEDGE gap theory - Abstract
Background: This study aims to conduct a scientometric analysis and thematically-based narrative review of a century of Taraxacum officinale research (TOR), uncovering patterns, trends, themes, and advancements in the field to provide insights for future investigations. The study followed PRISMA guidelines and utilized the Scopus database with MeSH terms for bibliographic data retrieval. Scientometric mapping employed VOSviewer and the R-package-based Bibliometrix, while the extracted themes were reviewed narratively. A detailed analysis of TOR was achieved by including only original studies. Results: The findings show that TOR extends back to 1908 and has grown significantly, particularly in the last two decades. China emerges as the most productive country, but the United States leads in recognizable and collaborative TOR. The thematic map displays dynamic and diverse themes, with a rich knowledge structure revealed through the analysis of term co-occurrence. The year 2016 represents a turning point in the thematic map, marked by numerical growth and thematic bifurcation. The study extracted several main research topics within the field of TOR, including germination, antioxidant activity, bioherbicide, oxidative stress, Taraxacum kok-saghyz, and heavy metals. These topics represent key areas of investigation and provide insights into the diverse aspects of research surrounding T. officinale. Additionally, emerging topics in TOR encompass toxicity, metabolomics, dandelion extract, and diabetes mellitus. Conclusions: The study consolidated knowledge, highlighted research gaps, and provided directions for future investigations on TOR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Extraction of fractures in shale CT images using improved U-Net.
- Author
-
Xiang Wu, Fei Wang, Xiaoqiu Zhang, Bohua Han, Qianru Liu, and Yonghao Zhang
- Subjects
- *
DEEP learning , *COMPUTED tomography , *IMAGE segmentation , *INFORMATION retrieval , *COMPUTER algorithms - Abstract
Accurate extraction of pores and fractures is a prerequisite for constructing digital rocks for physical property simulation and microstructural response analysis. However, fractures in CT images are similar in grayscale to the rock matrix, and traditional algorithms have difficulty achieving accurate segmentation results. In this study, a dataset containing multiscale fracture information was constructed, and a U-Net semantic segmentation model with an scSE attention mechanism was used to classify shale CT images at the pixel level and compare the results with traditional methods. The results showed that the CLAHE algorithm effectively removed noise and enhanced the fracture information in the dark parts, which is beneficial for further fracture extraction. The Canny edge detection algorithm produced significant false positives and failed to recognize the internal information of the fractures. The Otsu algorithm only extracted fractures with a significant difference from the background and was not sensitive enough for fine fractures. The MEF algorithm enhanced the edge information of the fractures and was also sensitive to fine fractures, but it overestimated the aperture of the fractures. The U-Net was able to identify almost all fractures with good continuity, with an MIoU and Recall of 0.80 and 0.82, respectively. As the image resolution increases, more fine fracture information can be extracted. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
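The MIoU and Recall figures quoted in the abstract above are standard segmentation metrics. A minimal sketch of how such values are computed for a binary fracture/background mask (illustrative only; the masks below are made up and this is not the paper's code):

```python
# Illustrative sketch: mean IoU over the two classes (fracture, background)
# and recall of the fracture class, from flat binary masks.
def binary_metrics(pred, truth):
    """Return (mean IoU, fracture recall) for two binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # true fracture
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false fracture
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # missed fracture
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    iou_fracture = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    miou = (iou_fracture + iou_background) / 2
    recall = tp / (tp + fn)
    return miou, recall

# Hypothetical 8-pixel masks: 1 = fracture, 0 = rock matrix
pred  = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 1, 1, 0, 0, 0, 0, 0]
miou, recall = binary_metrics(pred, truth)
```

In practice these counts are accumulated over all test images before the ratios are taken, so one image with no fractures does not produce a division by zero.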
43. Ontology-Based Generalized Zero-Shot Learning with Generative Networks.
- Author
-
Akdemir, Emre and Barışçı, Necattin
- Subjects
- *
NATURAL language processing , *MACHINE learning , *GENERATIVE adversarial networks , *ONTOLOGY , *INFORMATION retrieval - Abstract
Zero-Shot Learning (ZSL) aims to classify images of new categories in the testing phase without labeled images during training, using examples from categories with labeled images and some auxiliary information. The auxiliary information includes semantic attributes, textual descriptions, word embeddings, etc., for both labeled and unlabeled classes, utilizing Natural Language Processing (NLP) approaches. The resulting word embeddings are limited by semantic attributes and textual descriptions in which the semantics of categories are insufficiently captured. In this paper, we introduce a study of Generalized Zero-Shot Learning (GZSL), a type of ZSL, that integrates the rich semantics offered by an ontology. The semantic attributes used for semantic representation are supported by the ontology. Variational Autoencoder (VAE) and Generative Adversarial Network (GAN) architectures are used together to synthesize visual features. Our work was evaluated on the AWA2 dataset, and an improvement in GZSL performance was achieved. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
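The abstract above builds on the classic ZSL setup: unseen classes are described only by attribute vectors. A minimal attribute-based nearest-neighbor baseline illustrates the idea (this is the simplest ZSL recipe, not the paper's VAE+GAN pipeline; all attribute values are invented for illustration):

```python
# Illustrative ZSL baseline: classify by projecting an image into attribute
# space and picking the nearest class attribute vector, which can belong to
# a class with no labeled training images.
def nearest_class(projected, class_attributes):
    """Return the class whose attribute vector is closest (Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(class_attributes, key=lambda c: dist(projected, class_attributes[c]))

# Hypothetical attribute vectors: (stripes, hooves, aquatic)
class_attributes = {
    "zebra": (1.0, 1.0, 0.0),   # seen during training
    "whale": (0.0, 0.0, 1.0),   # unseen: no labeled images
}
print(nearest_class((0.1, 0.2, 0.9), class_attributes))  # prints "whale"
```

Generative GZSL approaches such as the VAE+GAN combination described above instead synthesize visual features for the unseen classes from their (here, ontology-enriched) attributes, so an ordinary supervised classifier can be trained over both seen and unseen classes.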
44. Contributions of the National Center for Manuscripts in Adrar to the Preservation and Digitization of Manuscript Patrimony.
- Author
-
Ouarda, Mecibah
- Subjects
- *
DIGITAL preservation , *PRESERVATION of cultural property , *HISTORICAL research , *INFORMATION retrieval , *DIGITIZATION - Published
- 2024
45. MEDLINE citation tool accuracy: an analysis in two platforms.
- Author
-
Scheinfeld, Laurel and Chung, Sunny
- Subjects
- *
DATABASES , *RESEARCH methodology evaluation , *ARTIFICIAL intelligence , *LIBRARIANS , *STATISTICAL sampling , *CITATION analysis , *AUTHORSHIP , *MEDLINE , *BIBLIOGRAPHICAL citations , *PUBLISHING , *INFORMATION literacy , *INFORMATION retrieval , *ADULT education workshops , *BIBLIOGRAPHY , *BIBLIOMETRICS , *RESEARCH , *ELECTRONIC publications , *ONLINE information services - Abstract
Background: Libraries provide access to databases with auto-cite features embedded into the services; however, the accuracy of these auto-cite buttons is not very high in humanities and social sciences databases. Case Presentation: This case compares two biomedical databases, Ovid MEDLINE and PubMed, to see if either is reliable enough to confidently recommend to students for use when writing papers. A total of 60 citations were assessed, 30 citations from each citation generator, based on the top 30 articles in PubMed from 2010 to 2020. Conclusions: Error rates were higher in Ovid MEDLINE than PubMed but neither database platform provided error-free references. The auto-cite tools were not reliable. Zero of the 60 citations examined were 100% correct. Librarians should continue to advise students not to rely solely upon citation generators in these biomedical databases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. Using Online Tutorials to Teach Citation: A Work in Progress.
- Author
-
Haren, Shonn and Hershman, Sarah
- Subjects
- *
CURRICULUM , *HUMAN services programs , *EDUCATIONAL outcomes , *LIBRARIANS , *PILOT projects , *HEALTH occupations students , *ACADEMIC libraries , *TEACHING aids , *CITATION analysis , *BIBLIOMETRICS , *INFORMATION retrieval , *ONLINE education , *TEST-taking skills , *VIDEO recording , *COVID-19 pandemic - Abstract
After receiving troubling data about graduating students’ abilities to cite their work, the librarians at Cal Poly Pomona sought to respond to these findings by creating a suite of online video tutorials to aid student learning. These tutorials included brief assessment quizzes to aid in gauging student understanding. The onset of the COVID-19 pandemic in spring 2020 greatly accelerated the process of tutorial creation and implementation, as they became a key tool for instructional outreach during pandemic lockdowns. This article will discuss the process of planning, implementation and piloting of these citation tutorials during the early months of COVID-19. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Making the Most of SUSHI in Alma: Tips for Smooth Maintenance and Reporting.
- Author
-
Day, Marie and Beane, Laura
- Subjects
- *
MEDICAL protocols , *DOCUMENTATION , *DASHBOARDS (Management information systems) , *DATABASE management , *DATA analysis , *ACADEMIC libraries , *LIBRARY automation , *LIBRARIANS , *CONTRACTING out , *PROBLEM solving , *INFORMATION retrieval , *STATISTICS , *ELECTRONIC publications , *BUSINESS intelligence , *ACCOUNTING - Abstract
SUSHI is a widely used protocol in academic libraries. This article discusses maintaining SUSHI vendor accounts in Alma, troubleshooting irregularities, and finding technical support. This article also includes sample analyses for the ACRL survey 60B Digital/Electronic Circulation or Usage and 63 E-serials Usage using COUNTER 5 reports. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
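The SUSHI maintenance described in the article above revolves around COUNTER_SUSHI Release 5 harvest requests, which Alma issues from each vendor account's base URL and credentials. A sketch of the request shape (the base URL and credentials below are placeholders, not a real vendor):

```python
# Illustrative sketch of a COUNTER_SUSHI R5 report request URL, of the kind
# an Alma SUSHI vendor account issues when harvesting usage reports.
from urllib.parse import urlencode

def sushi_report_url(base_url, report, customer_id, begin_date, end_date,
                     requestor_id=None, api_key=None):
    """Build a COUNTER_SUSHI R5 GET request URL for one report."""
    params = {"customer_id": customer_id,
              "begin_date": begin_date,
              "end_date": end_date}
    if requestor_id:                      # some vendors require these,
        params["requestor_id"] = requestor_id
    if api_key:                           # others do not
        params["api_key"] = api_key
    return f"{base_url.rstrip('/')}/reports/{report}?{urlencode(params)}"

# Hypothetical Title Master Report (journal requests) for calendar year 2023
url = sushi_report_url("https://sushi.example.org/r5", "tr_j1",
                       "CUST123", "2023-01-01", "2023-12-31")
```

When a harvest fails, comparing the URL Alma actually sent against one built by hand like this is a quick way to spot a missing `requestor_id` or malformed date range before contacting vendor support.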
48. Beyond the Algorithm: Understanding How ChatGPT Handles Complex Library Queries.
- Author
-
Yang, Sharon Q. and Mason, Sarah
- Subjects
- *
WORLD Wide Web , *LIBRARY reference services , *T-test (Statistics) , *PLAGIARISM , *ARTIFICIAL intelligence , *STATISTICAL sampling , *QUESTIONNAIRES , *ACADEMIC libraries , *LIBRARIANS , *DESCRIPTIVE statistics , *INFORMATION services , *INFORMATION retrieval , *CONFIDENCE intervals , *ALGORITHMS , *REFERENCE interviews (Library science) - Abstract
The introduction of ChatGPT 3.5 in November 2022 ignited a sensation in the academic community, leaving many astounded by its capabilities. This new release emulates human responses more closely than its predecessors. Among its remarkable capabilities, it can answer questions, catalog items in MARC21, recommend reading lists, and make suggestions on a wide array of topics. To assess ChatGPT’s efficacy in aiding library users, the authors of this paper conducted an experiment comparing ChatGPT’s performance with that of librarians in answering reference questions. Thirty questions were randomly selected from the transaction log of reference inquiries from June 1 to July 31, 2023 at the Rider University Libraries. These queries constituted 34% of the total user questions during this two-month period. The authors compared the answers by ChatGPT and those by reference librarians for their accuracy, relevance, and friendliness. The findings indicate that reference librarians markedly outperformed their robotic counterpart. An evident issue arises from ChatGPT’s deficiency in understanding local policies and practices, which hinders its ability to provide satisfactory answers in those areas. OpenAI posits that ChatGPT’s proficiency can be enhanced through targeted fine-tuning using locally specific information. For the moment, ChatGPT remains a useful tool for librarians. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. ChatGPT: Unleashing the Power of Conversational AI for Library Reference Services.
- Author
-
Yang, Sharon Q.
- Subjects
- *
CHATGPT , *ARTIFICIAL intelligence , *INFORMATION retrieval , *LIBRARY information networks , *LIBRARY reference services - Abstract
Purpose--Explore the impact of AI and ChatGPT on library information services; Design/methodology/approach--A sample of twenty-two reference questions is fed to ChatGPT and the answers are evaluated for quality and accuracy; Findings--ChatGPT is excellent at information retrieval in some areas, but it is not comparable to a reference librarian in others; Research limitations/implications--The findings may not be conclusive due to the small sample size; Practical implications--Understand AI and ChatGPT and their behavior; Social implications--The knowledge from the study can assist librarians in adjusting their services to better serve users; Originality/value--No research has been done in this area. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Use of Newly Acquired Materials: An Analysis of Print and E-book Acquisitions.
- Author
-
Best, Rickey
- Subjects
- *
LIBRARY planning , *LIBRARY administration , *LIBRARY research , *INFORMATION retrieval , *LIBRARY science - Abstract
The Auburn University at Montgomery (AUM) Library examined the circulation of its current acquisitions over a period of five years (2017-2021) to determine whether the materials being added met student and faculty needs, as demonstrated by circulation patterns, and whether differences in circulation between acquired print books and e-books might help the library determine where to better focus its resources: print or electronic. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF