226 results for "Zero-shot"
Search Results
2. VCP-CLIP: A Visual Context Prompting Model for Zero-Shot Anomaly Segmentation
- Author
-
Qu, Zhen, Tao, Xian, Prasad, Mukesh, Shen, Fei, Zhang, Zhengtao, Gong, Xinyi, and Ding, Guiguang
- Published
- 2025
- Full Text
- View/download PDF
3. Dual-Branch Task Residual Enhancement with Parameter-Free Attention for Zero-Shot Multi-label Image Recognition
- Author
-
Zhang, Shizhou, Dang, Kairui, Cheng, De, Xing, Yinghui, Wu, Qirui, Kong, Dexuan, and Zhang, Yanning
- Published
- 2025
- Full Text
- View/download PDF
4. LineTR: Unified Text Line Segmentation for Challenging Palm Leaf Manuscripts
- Author
-
Agrawal, Vaibhav, Vadlamudi, Niharika, Waseem, Muhammad, Joseph, Amal, Chitluri, Sreenya, and Sarvadevabhatla, Ravi Kiran
- Published
- 2025
- Full Text
- View/download PDF
5. Can Language Improve Visual Features For Distinguishing Unseen Plant Diseases?
- Author
-
Liaw, Jerad Zherui, Chai, Abel Yu Hao, Lee, Sue Han, Bonnet, Pierre, and Joly, Alexis
- Published
- 2025
- Full Text
- View/download PDF
6. SPK: Semantic and Positional Knowledge for Zero-Shot Referring Expression Comprehension
- Author
-
Du, Zetao, Yang, Jianhua, Wang, Junbo, Huang, Yan, and Wang, Liang
- Published
- 2025
- Full Text
- View/download PDF
7. FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior
- Author
-
Chen, Zhekai, Wang, Wen, Yang, Zhen, Yuan, Zeqing, Chen, Hao, and Shen, Chunhua
- Published
- 2025
- Full Text
- View/download PDF
8. CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-Shot Anomaly Detection
- Author
-
Chen, Xuhai, Zhang, Jiangning, Tian, Guanzhong, He, Haoyang, Zhang, Wuhao, Wang, Yabiao, Wang, Chengjie, and Liu, Yong
- Published
- 2025
- Full Text
- View/download PDF
9. Zero-Shot Referring Image Segmentation with Hierarchical Prompts and Frequency Domain Fusion
- Author
-
Li, Changlong, Zhuang, Jiedong, Hu, Jiaqi, and Hu, Haoji
- Published
- 2025
- Full Text
- View/download PDF
10. ZeFaV: Boosting Large Language Models for Zero-Shot Fact Verification
- Author
-
Luu, Son T., Nguyen, Hiep, Vo, Trung, and Nguyen, Le-Minh
- Published
- 2025
- Full Text
- View/download PDF
11. MFNAS: Multi-fidelity Exploration in Neural Architecture Search with Stable Zero-Shot Proxy
- Author
-
Fu, Wei, Lou, Wenqi, Qin, Yunji, Gong, Lei, Wang, Chao, and Zhou, Xuehai
- Published
- 2025
- Full Text
- View/download PDF
12. Enhancing Zero-Shot Anomaly Detection: CLIP-SAM Collaboration with Cascaded Prompts
- Author
-
Hou, Yanning, Xu, Ke, Li, Junfa, Ruan, Yanran, and Qiu, Jianfeng
- Published
- 2025
- Full Text
- View/download PDF
13. Zero-Shot Fault Diagnosis of Hydraulic Pumps Based on an Improved Category Embedding Mining Network.
- Author
-
李克, 郑直, 刘彤谣, 袁晓明, and 韩炬
- Subjects
PARTICLE swarm optimization, NETWORK performance, FAULT diagnosis, FEATURE extraction, GENERALIZATION - Abstract
Copyright of Machine Tool & Hydraulics is the property of Guangzhou Mechanical Engineering Research Institute (GMERI) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
14. Deep Learning-Based Detection of Impacted Teeth on Panoramic Radiographs.
- Author
-
He, Zhicheng, Wang, Yipeng, and Li, Xiao
- Subjects
IMPACTION of teeth, COMPUTER-aided diagnosis, X-ray imaging, IMAGE segmentation, RADIOGRAPHS - Abstract
Objective: To detect impacted teeth on panoramic radiographs by fine-tuning the pretrained MedSAM model. Study design: Impacted teeth are dental issues that can cause complications and are diagnosed via radiographs. We modified the SAM model for individual tooth segmentation using 1016 X-ray images, split into training, validation, and testing sets at a ratio of 16:3:1. We enhanced the SAM model to automatically detect impacted teeth by focusing on each tooth's center for more accurate results. Results: The model was trained on randomly sampled images for 200 epochs with a batch size of 1 and a learning rate of 0.001. On the test set, SAM-based models achieved up to 86.73% accuracy, an F1-score of 0.5350, and an IoU of 0.3652. Conclusion: This study fine-tunes MedSAM for impacted-tooth segmentation in X-ray images, aiding dental diagnosis. Further improvements in model accuracy and selection are essential for enhancing dental practitioners' diagnostic capabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Improving ChatGPT's Competency in Generating Effective Business Communication Messages: Integrating Rhetorical Genre Analysis into Prompting Techniques.
- Author
-
Wang, Junhua
- Subjects
CHATGPT, RHETORICAL analysis, ARTIFICIAL intelligence, BUSINESS communication, BUSINESS writing - Abstract
This study explores how prompting techniques, especially those integrated with rhetorical analysis results, may improve the effectiveness of artificial intelligence (AI)-generated business communication messages. I conducted an experiment to assess the effectiveness of these prompting techniques in the context of crafting a negative message generated with ChatGPT 3.5 (n = 85). A multiple regression was calculated to explore the prompting techniques' impact on the negative-message grades and how each technique influences the message grade. The results (F(4, 80) = 31.84, p < .001), with an adjusted R² = .595, indicate a positive relationship between prompting techniques and the effectiveness of AI-generated messages. This study also identified challenges related to students' AI literacy. I conclude the study by recommending practical measures on how to incorporate AI into business and professional writing classrooms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Applying Retrieval-Augmented Generation for Academic Discipline Development: Insights from Zero-Shot to Tree-of-Thought Prompting
- Author
-
Polina Shnaider, Anastasiia Chernysheva, Anton Govorov, Maksim Khlopotov, and Anna Nikiforova
- Subjects
large language models, prompt engineering, quantized models, rag, few-shot, zero-shot, chain-of-thought, Telecommunication, TK5101-6720 - Abstract
This study assesses the efficiency of large language models (LLMs) in generating university course structures, comparing traditional methods with Retrieval-Augmented Generation (RAG). It involves a comparative analysis across twelve courses using four LLMs: starling-lm-7b-alpha, openchat_3.5, saiga2_13b, and gpt-3.5-turbo, with four distinct prompting approaches. Findings indicate that advanced prompting techniques significantly influence model performance and response variability. The study underscores the importance of selecting appropriate LLMs and prompting strategies to optimize educational outcomes, highlighting RAG's role in enhancing data retrieval accuracy in educational technology.
- Published
- 2024
- Full Text
- View/download PDF
17. AutoCTS++: zero-shot joint neural architecture and hyperparameter search for correlated time series forecasting.
- Author
-
Wu, Xinle, Wu, Xingjian, Yang, Bin, Zhou, Lekui, Guo, Chenjuan, Qiu, Xiangfei, Hu, Jilin, Sheng, Zhenli, and Jensen, Christian S.
- Abstract
Sensors in cyber-physical systems often capture interconnected processes and thus emit correlated time series (CTS), the forecasting of which enables important applications. Recent deep learning based forecasting methods show strong capabilities at capturing both the temporal dynamics of time series and the spatial correlations among time series, thus achieving impressive accuracy. In particular, automated CTS forecasting, where a deep learning architecture is configured automatically, enables forecasting accuracy that surpasses what has been achieved by manual approaches. However, automated CTS forecasting remains in its infancy, as existing proposals are only able to find optimal architectures for predefined hyperparameters and for specific datasets and forecasting settings (e.g., short vs. long term forecasting). These limitations hinder real-world industrial application, where forecasting faces diverse datasets and forecasting settings. We propose AutoCTS++, a zero-shot, joint search framework, to efficiently configure effective CTS forecasting models (including both neural architectures and hyperparameters), even when facing unseen datasets and forecasting settings. Specifically, we propose an architecture-hyperparameter joint search space by encoding candidate architectures and accompanying hyperparameters into a graph representation. We then introduce a zero-shot Task-aware Architecture-Hyperparameter Comparator (T-AHC) to rank architecture-hyperparameter pairs according to different tasks (i.e., datasets and forecasting settings). We propose a zero-shot means of training T-AHC, enabling it to rank architecture-hyperparameter pairs given unseen datasets and forecasting settings. A final forecasting model is then selected from the top-ranked pairs.
Extensive experiments involving multiple benchmark datasets and forecasting settings demonstrate that AutoCTS++ is able to efficiently devise forecasting models for unseen datasets and forecasting settings that are capable of outperforming existing manually designed and automated models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
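The ranking step described in the AutoCTS++ abstract (a comparator scores architecture-hyperparameter pairs against a task, then the top-ranked pairs are kept) can be sketched as follows. The random vector encodings and the cosine-similarity scorer are toy stand-ins for the paper's graph encoding and learned T-AHC comparator, not the actual method.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical pool of candidate (architecture, hyperparameter) pairs, each
# already encoded as a fixed-length vector (the paper uses a graph encoding).
candidates = rng.normal(size=(50, 8))
task = rng.normal(size=8)  # embedding of the dataset + forecasting setting

# Toy comparator: cosine similarity stands in for the learned T-AHC ranker.
scores = candidates @ task / (
    np.linalg.norm(candidates, axis=1) * np.linalg.norm(task))
top = np.argsort(scores)[::-1][:5]  # shortlist to actually train and evaluate
```

In the paper the final forecasting model is chosen from this shortlist; here the shortlist is simply the five highest-scoring candidates.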
18. ZS-CEBE: leveraging zero-shot cross and bi-encoder architecture for cold-start news recommendation.
- Author
-
Rauf, Muhammad Arslan, Khalil, Mian Muhammad Yasir, Ghani, Muhammad Ahmad Nawaz Ul, Wang, Weidong, Wang, Qingxian, and Hassan, Junaid
- Abstract
News recommendation systems heavily rely on the information exchange between news articles and users to personalize the recommendation. Consequently, one of the significant challenges is the cold-start problem in news recommendation models, referring to the low accuracy of recommendations for new users due to a lack of interaction data. This study addresses the cold-start challenge in news recommendation systems by introducing a novel zero-shot-based approach. The ZS-CEBE approach presented in this paper utilizes a rarely explored zero-shot paradigm to effectively tackle the cold-start problem in news recommendations. The methodology incorporates two crucial models: the fine-tuned Cross-Encoder and a Bi-Encoder model. The cross-encoder captures user-news interactions, predicting the likelihood of user engagement with a news article. Subsequently, the bi-encoder, designed for swift inference, precomputes embeddings for users and articles and calculates their relevance during predictions. The proposed technique is applicable to various neural news recommendation systems and is empirically evaluated using the real-world benchmark datasets MIND and Adressa. The experimental results demonstrate that ZS-CEBE outperforms baseline methods in terms of nDCG@k, AUC, and MRR, both in cold-start scenarios and in regular user-news interaction situations. This underscores the efficacy of the zero-shot approach in mitigating the cold-start dilemma and improving overall recommendation system performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
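The bi-encoder half of the ZS-CEBE pipeline above (precompute article embeddings once, then score relevance by a cheap dot product at prediction time) can be sketched as follows. The random embeddings and the `recommend` helper are illustrative assumptions, not the paper's trained encoders.

```python
import numpy as np

rng = np.random.default_rng(1)
# Article embeddings precomputed offline by the bi-encoder, L2-normalized
# so that a dot product acts as cosine relevance.
article_embs = rng.normal(size=(1000, 64))
article_embs /= np.linalg.norm(article_embs, axis=1, keepdims=True)

def recommend(user_emb, k=10):
    """Rank all articles for one user against the precomputed index."""
    u = user_emb / np.linalg.norm(user_emb)
    scores = article_embs @ u          # one matrix-vector product per request
    return np.argsort(scores)[::-1][:k]

recs = recommend(rng.normal(size=64))  # top-10 article indices for a new user
```

The design point is that only the user embedding is computed online; the expensive cross-encoder is reserved for training-time supervision.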
19. Comparing Fine-Tuning, Zero and Few-Shot Strategies with Large Language Models in Hate Speech Detection in English.
- Author
-
Pan, Ronghao, García-Díaz, José Antonio, and Valencia-García, Rafael
- Subjects
LANGUAGE models, NATURAL language processing, CONTEXTUAL learning, HATE speech, WOMEN immigrants - Abstract
Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs under strategies ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test set. Finally, it is confirmed that the evaluated models perform well in hate-text detection, as they beat the best result on the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language. However, the fine-tuned approach tends to produce many false positives. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Causal Reasoning of Occupational Incident Texts Using Large Language Models.
- Author
-
Nakamura, Manato, Hayamizu, Satoru, Hattori, Masanori, Fuseya, Takafumi, Iwamatsu, Hidetoshi, and Terada, Kazunori
- Abstract
In this study, we conducted multi-label annotation based on textual entailment for text data related to incident cases that occurred at an electric power company, utilizing GPT, a general-purpose LLM, without additional training. The experiment examined GPT's zero-shot textual entailment performance and, for comparison, its one-shot textual entailment performance using prompt engineering. Furthermore, in this study, the abstract category labels for the causes of incidents used in the annotation task were also extracted zero-shot from GPT-4, and these were approved by human participants to determine the labels. The results of the experiment showed that, particularly in the one-shot approach using prompt engineering, GPT exhibited strong generalization capabilities and demonstrated promising performance, approaching the level of human annotators in certain evaluation metrics. However, it was also suggested that when dealing with highly specialized and multifaceted cases like those in this study, careful adjustments in model choice and prompt settings are required. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Multi-Level Disentangled Personalized Speech Synthesis for Out-of-Domain Speaker Adaptation.
- Author
-
高盛祥, 杨元樟, 王琳钦, 莫尚斌, 余正涛, and 董凌
- Subjects
SPEECH synthesis, PHONEME (Linguistics), GENERALIZATION, SPEECH perception - Abstract
Copyright of Journal of Guangxi Normal University - Natural Science Edition is the property of Gai Kan Bian Wei Hui and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
22. CP-CLIP: Core-Periphery Feature Alignment CLIP for Zero-Shot Medical Image Analysis
- Author
-
Yu, Xiaowei, Wu, Zihao, Zhang, Lu, Zhang, Jing, Lyu, Yanjun, and Zhu, Dajiang
- Published
- 2024
- Full Text
- View/download PDF
23. Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches
- Author
-
Teoh, Yun Xin, Othmani, Alice, Goh, Siew Li, Usman, Juliana, and Lai, Khin Wee
- Published
- 2024
- Full Text
- View/download PDF
24. Leveraging Large Language Models for Flexible and Robust Table-to-Text Generation
- Author
-
Oro, Ermelinda, De Grandis, Luca, Granata, Francesco Maria, and Ruffolo, Massimo
- Published
- 2024
- Full Text
- View/download PDF
25. Zero-Shot Relation Triplet Extraction via Knowledge-Driven LLM Synthetic Data Generation
- Author
-
He, Li, Zhang, Hayilang, Liu, Jie, Sun, Kang, and Zhang, Qing
- Published
- 2024
- Full Text
- View/download PDF
26. Multilingual Fake News Detection: A Study on Various Models and Training Scenarios
- Author
-
Chalehchaleh, Razieh, Farahbakhsh, Reza, and Crespi, Noel
- Published
- 2024
- Full Text
- View/download PDF
27. Large Language Models for Emotion Evolution Prediction
- Author
-
Leung, Clement, and Xu, Zhifei
- Published
- 2024
- Full Text
- View/download PDF
28. Knowledge Enhanced Zero-Shot Visual Relationship Detection
- Author
-
Ding, Nan, Lai, Yong, and Liu, Jie
- Published
- 2024
- Full Text
- View/download PDF
29. Exploring Text-Driven Approaches for Online Action Detection
- Author
-
Benavent-Lledo, Manuel, Mulero-Pérez, David, Ortiz-Perez, David, Garcia-Rodriguez, Jose, and Orts-Escolano, Sergio
- Published
- 2024
- Full Text
- View/download PDF
30. Zero-Shot Rolling Bearing Compound Fault Diagnosis Based on Envelope Spectrum Semantic Construction
- Author
-
Sun, Heming, Tian, Shaoning, Kong, Jinzhen, Li, Haiyang, Ramli, Rahizar, Feng, Guojin, and Zhen, Dong
- Published
- 2024
- Full Text
- View/download PDF
31. Zero-Shot Learning of Individualized Task Contrast Prediction from Resting-State Functional Connectomes
- Author
-
Nguyen, Minh, Ngo, Gia H., and Sabuncu, Mert R.
- Published
- 2024
- Full Text
- View/download PDF
32. Simple Domain Adaptation for Sparse Retrievers
- Author
-
Vast, Mathias, Zong, Yuxuan, Piwowarski, Benjamin, and Soulier, Laure
- Published
- 2024
- Full Text
- View/download PDF
33. GenQREnsemble: Zero-Shot LLM Ensemble Prompting for Generative Query Reformulation
- Author
-
Dhole, Kaustubh D., and Agichtein, Eugene
- Published
- 2024
- Full Text
- View/download PDF
34. Zero-Shot Relation Triplet Extraction via Retrieval-Augmented Synthetic Data Generation
- Author
-
Zhang, Qing, Yang, Yuechen, Zhang, Hayilang, Gao, Zhengxin, Wang, Hao, Duan, Jianyong, He, Li, and Liu, Jie
- Published
- 2024
- Full Text
- View/download PDF
35. Zero-Shot Singing Voice Conversion Based on Timbre Space Modeling and Excitation Signal Control
- Author
-
Jiang, Yuan, Chen, Yan-Nian, Liu, Li-Juan, Hu, Ya-Jun, Fang, Xin, and Ling, Zhen-Hua
- Published
- 2024
- Full Text
- View/download PDF
36. Hierarchical Multi-task Learning with Articulatory Attributes for Cross-Lingual Phoneme Recognition
- Author
-
Glocker, Kevin, and Georges, Munir
- Published
- 2024
- Full Text
- View/download PDF
37. SKZC: self-distillation and k-nearest neighbor-based zero-shot classification
- Author
-
Muyang Sun and Haitao Jia
- Subjects
Image classification, Zero-shot, Self-distillation, k-NN, Cluster, Engineering (General). Civil engineering (General), TA1-2040 - Abstract
Zero-shot learning represents a formidable paradigm in machine learning, wherein the crux lies in distilling and generalizing knowledge from observed classes to novel ones. The objective is to identify unfamiliar objects that were not included in the model's training, leveraging learned patterns and knowledge from previously encountered categories. As a crucial subtask of open-world object detection, zero-shot classification can also provide insights and solutions for this field. Despite its potential, current zero-shot classification models often suffer from a performance gap due to the limited transfer ability and discriminative capability of learned representations. To advance the subpar state of zero-shot object classification, this paper introduces a novel model for image classification which can be applied to object detection, namely, the self-distillation and k-nearest neighbor-based zero-shot classification method. First, we employ a diffusion detector to identify potential objects in images. Then, self-distillation and distance-based classifiers are used to distinguish unseen objects from seen classes. The k-nearest neighbor-based cluster heads are designed to cluster the unseen objects. Extensive experiments and visualizations were conducted on publicly available datasets to demonstrate the efficacy of the proposed approach. Precisely, our model demonstrates a performance improvement of over 20% compared to contrastive clustering. Moreover, it achieves a precision of 0.910 and a recall of 0.842 on the CIFAR-10 dataset, and a precision of 0.737 and a recall of 0.688 on the CIFAR-100 dataset for the macro average. Compared to a more recent model (SGFR), our model realized improvements of 10.9%, 13.3%, and 7.8% in Sacc, Uacc, and H metrics, respectively. This study aims to introduce fresh ideas into the domain of zero-shot image classification, and it can be applied to open-world object detection tasks.
Our code is available at https://www.github.com/CmosWolf1/Code_implementation_for_paper_SKZC .
- Published
- 2024
- Full Text
- View/download PDF
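The distance-based seen/unseen split described in the SKZC abstract above can be sketched as follows. The class prototypes and the threshold value are illustrative assumptions, not the paper's trained classifier; objects flagged as unseen would then be handed to the k-NN cluster heads.

```python
import numpy as np

def classify(feat, prototypes, threshold):
    """Assign a feature to the nearest seen-class prototype; distances above
    the threshold are flagged as unseen (returned as -1) and would be routed
    to the k-NN-based cluster heads for grouping."""
    d = np.linalg.norm(prototypes - feat, axis=1)
    i = int(np.argmin(d))
    return i if d[i] < threshold else -1

# Toy demo with two hypothetical seen-class centers.
prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])
seen = classify(np.array([0.4, -0.2]), prototypes, threshold=2.0)
unseen = classify(np.array([5.0, 5.0]), prototypes, threshold=2.0)
```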
38. Adapting low‐dose CT denoisers for texture preservation using zero‐shot local noise‐level matching.
- Author
-
Ko, Youngjun, Song, Seongjong, Baek, Jongduk, and Shim, Hyunjung
- Subjects
IMAGE denoising, COMPUTED tomography, SUPERCONDUCTING quantum interference devices, DEEP learning, RADIOLOGISTS - Abstract
Background: Various denoising methods have achieved meaningful improvements in the image quality of low-dose computed tomography (LDCT). However, they commonly produce over-smoothed results; the denoised images tend to be more blurred than the normal-dose targets (NDCTs). Furthermore, many recent denoising methods employ deep learning (DL)-based models, which require a vast number of CT images (or image pairs). Purpose: Our goal is to address the problem of over-smoothed results and to design an algorithm that achieves plausible denoising without requiring a large training dataset. Over-smoothed images negatively affect diagnosis and treatment, since radiologists have built their clinical experience on NDCT. Moreover, a large-scale training dataset is often unavailable in clinical situations. To overcome these limitations, we propose locally-adaptive noise-level matching (LANCH), which requires the output to retain the same noise level and characteristics as the NDCT without additional training. Methods: We represent the NDCT image as the pixel-wise weighted sum of an over-smoothed output from an off-the-shelf denoiser (OSD) and the difference between the LDCT image and the OSD output. Herein, LANCH determines a 2D ratio map (i.e., a pixel-wise weight matrix) by locally matching the noise level of the output to that of the NDCT, where the LDCT-to-NDCT device flux (mAs) ratio reveals the NDCT noise level. Thereby, LANCH preserves important details in the LDCT and enhances the sharpness of noise-free regions. Note that LANCH can enhance any LDCT denoiser without additional training data (i.e., zero-shot). Results: The proposed method is applicable to any OSD denoiser, yielding significant improvements in texture plausibility over the baseline denoisers in both quantitative and qualitative evaluations.
Surprisingly, the denoising accuracy achieved by our method with a zero-shot denoiser was comparable or superior to that of the best training-based denoisers; our results showed 1% and 33% gains in SSIM and DISTS, respectively. A reader study with experienced radiologists shows significant image-quality improvements, a gain of +1.18 on a five-point mean opinion score scale. Conclusions: In this paper, we propose a technique to enhance any low-dose CT denoiser by leveraging the fundamental physical relationship between x-ray flux and noise variance. Our method operates in a zero-shot condition, meaning that only a single low-dose CT image is required for the enhancement process. We demonstrate that our approach is comparable or even superior to supervised DL-based denoisers trained on numerous CT images. Extensive experiments show that our method consistently improves the performance of all tested LDCT denoisers. [ABSTRACT FROM AUTHOR]
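The pixel-wise blending rule this abstract describes can be sketched in a few lines. This is a minimal illustration of the weighted sum, with function and variable names of our own choosing (the authors' exact formulation may differ): the output adds back a locally weighted share of the residual noise that the off-the-shelf denoiser removed.

```python
import numpy as np

def lanch_blend(ldct, osd_out, ratio_map):
    """Pixel-wise weighted sum sketched from the abstract: the output is the
    over-smoothed denoiser result plus a locally weighted share of the
    residual (LDCT minus denoised), so each region keeps just enough noise
    to match the NDCT target's local noise level."""
    return osd_out + ratio_map * (ldct - osd_out)

# Toy example: a constant ratio map of 0.5 keeps half of the residual noise.
ldct = np.array([[2.0, 4.0]])
osd = np.array([[0.0, 0.0]])
print(lanch_blend(ldct, osd, np.full_like(ldct, 0.5)))  # [[1. 2.]]
```

In the paper the ratio map is chosen per pixel from the mAs flux ratio; here it is simply supplied as an argument.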
- Published
- 2024
- Full Text
- View/download PDF
39. Zero-Shot Referring Image Segmentation Based on Frequency-Domain Fusion of Multimodal Features.
- Author
-
林浩然, 刘春黔, 薛榕融, 谢勋伟, and 雷印杰
- Abstract
Semantic segmentation cannot handle undefined categories when applied to downstream tasks in the real world; referring image segmentation addresses this by locating the target in an image according to a natural-language description. Most existing methods use a cross-modal decoder to fuse features extracted independently by a visual encoder and a language encoder, but they cannot effectively exploit the edge features of the image and are complicated to train. CLIP is a powerful pre-trained visual-language cross-modal model that can effectively extract image and text features. This paper therefore proposes a method that fuses multimodal features in the frequency domain after CLIP encoding. First, an unsupervised model segments the image, and nouns are extracted from the natural-language text for the subsequent task. The image and text are then encoded with CLIP's image encoder and text encoder, respectively. Next, a wavelet transform decomposes the image and text features, and decomposition and fusion are performed in the frequency domain, which makes full use of the image's edge features and positional information; the image and text features are fused band by band, and the inverse transform is then applied to the fused features. Finally, the text features are matched against the image features pixel by pixel to obtain the segmentation result, which is evaluated on commonly used datasets. The experimental results show that the network achieves good zero-shot results without training and has good robustness and generalization ability. [ABSTRACT FROM AUTHOR]
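The band-wise fusion idea can be sketched with a one-level Haar transform standing in for the paper's wavelet (all names, the single decomposition level, and the fixed blending weight are illustrative assumptions, not the authors' design):

```python
import numpy as np

def haar_1d(x):
    """One-level Haar decomposition: low-pass (averages) and high-pass (details)."""
    return (x[0::2] + x[1::2]) / 2.0, (x[0::2] - x[1::2]) / 2.0

def inverse_haar_1d(lo, hi):
    """Invert the one-level Haar decomposition."""
    out = np.empty(lo.size * 2)
    out[0::2] = lo + hi
    out[1::2] = lo - hi
    return out

def fuse_in_frequency(img_feat, txt_feat, alpha=0.5):
    """Fuse two feature vectors band by band, then invert the transform."""
    il, ih = haar_1d(img_feat)
    tl, th = haar_1d(txt_feat)
    lo = alpha * il + (1 - alpha) * tl  # blend low-frequency (coarse) bands
    hi = alpha * ih + (1 - alpha) * th  # blend high-frequency (edge) bands
    return inverse_haar_1d(lo, hi)

x = np.array([1.0, 3.0, 5.0, 7.0])
y = np.array([7.0, 5.0, 3.0, 1.0])
print(fuse_in_frequency(x, y))  # [4. 4. 4. 4.]
```

Blending the high-frequency bands separately is what lets edge information from the image survive the fusion, which is the motivation the abstract gives for working in the frequency domain.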
- Published
- 2024
- Full Text
- View/download PDF
40. SKZC: self-distillation and k-nearest neighbor-based zero-shot classification.
- Author
-
Sun, Muyang and Jia, Haitao
- Subjects
IMAGE recognition (Computer vision) ,CLASSIFICATION ,MACHINE learning ,DATA visualization ,MULTISPECTRAL imaging - Abstract
Zero-shot learning represents a formidable paradigm in machine learning, wherein the crux lies in distilling and generalizing knowledge from observed classes to novel ones. The objective is to identify unfamiliar objects that were not included in the model's training, leveraging patterns and knowledge learned from previously encountered categories. As a crucial subtask of open-world object detection, zero-shot classification can also provide insights and solutions for this field. Despite its potential, current zero-shot classification models often suffer from a performance gap due to the limited transfer ability and discriminative capability of learned representations. To advance the subpar state of zero-shot object classification, this paper introduces a novel image classification model that can also be applied to object detection: a self-distillation and k-nearest neighbor-based zero-shot classification method. First, we employ a diffusion detector to identify potential objects in images. Then, self-distillation and distance-based classifiers are used to distinguish unseen objects from seen classes. k-nearest neighbor-based cluster heads are designed to cluster the unseen objects. Extensive experiments and visualizations on publicly available datasets demonstrate the efficacy of the proposed approach. Precisely, our model improves performance by over 20% compared to contrastive clustering. Moreover, for the macro average it achieves a precision of 0.910 and a recall of 0.842 on the CIFAR-10 dataset, and a precision of 0.737 and a recall of 0.688 on the CIFAR-100 dataset. Compared to a more recent model (SGFR), our model realized improvements of 10.9%, 13.3%, and 7.8% in the Sacc, Uacc, and H metrics, respectively. This study aims to introduce fresh ideas into the domain of zero-shot image classification, and it can be applied to open-world object detection tasks.
Our code is available at https://www.github.com/CmosWolf1/Code_implementation_for_paper_SKZC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Adapting the Segment Anything Model for Plant Recognition and Automated Phenotypic Parameter Measurement.
- Author
-
Zhang, Wenqi, Dang, L. Minh, Nguyen, Le Quan, Alam, Nur, Bui, Ngoc Dung, Park, Han Yong, and Moon, Hyeonjoon
- Subjects
PHENOTYPES ,COMPUTER vision ,CULTIVARS ,MACHINE learning ,DATA recorders & recording - Abstract
Traditional phenotyping relies on experts visually examining plants for physical traits like size, color, or disease presence. Measurements are taken manually using rulers, scales, or color charts, with all data recorded by hand. This labor-intensive and time-consuming process poses a significant obstacle to the efficient breeding of new cultivars. Recent innovations in computer vision and machine learning offer potential solutions for accelerating the development of robust and highly effective plant phenotyping. This study introduces an efficient plant recognition framework that leverages the power of the Segment Anything Model (SAM) guided by Explainable Contrastive Language–Image Pretraining (ECLIP). This approach can be applied to a variety of plant types, eliminating the need for labor-intensive manual phenotyping. To enhance the accuracy of plant phenotype measurements, a B-spline curve is incorporated during the plant component skeleton extraction process. The effectiveness of our approach is demonstrated through experimental results, which show that the proposed framework achieves a mean absolute error (MAE) of less than 0.05 for the majority of test samples. Remarkably, this performance is achieved without the need for model training or labeled data, highlighting the practicality and efficiency of the framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Training-free neural architecture search: A review
- Author
-
Meng-Ting Wu and Chun-Wei Tsai
- Subjects
Neural architecture search ,Deep neural network ,Training-free ,Zero-shot ,Internet of things ,Information technology ,T58.5-58.64 - Abstract
The goal of neural architecture search (NAS) is to downsize the neural architecture and model of a deep neural network (DNN), adjust a neural architecture to improve its end result, or even speed up the whole training process. Such improvements make it possible to generate or install a DNN model on a small device, such as an internet-of-things or wireless-sensor-network device. Because most NAS algorithms are time-consuming, finding ways to reduce their computation cost has become a critical research issue. The training-free method (also called zero-shot learning) provides an alternative, more efficient way to estimate how good a neural architecture is during NAS, using a lightweight score function instead of a full training process to avoid incurring heavy costs. This paper starts with a brief discussion of DNNs and NAS, followed by a brief review of both model-dependent and model-independent training-free score functions. A brief introduction to the search algorithms and benchmarks widely used in training-free NAS is also given. The challenges, potential, open issues, and future trends of this research topic are then addressed at the end of the paper.
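To make the idea of a lightweight score function concrete, here is a sketch in the spirit of activation-pattern proxies such as NASWOT (an illustrative simplification under our own assumptions, not any surveyed paper's exact formula): an untrained ReLU layer maps each input in a mini-batch to a binary activation pattern, and the architecture is scored by how distinguishable those patterns are, with no training at all.

```python
import numpy as np

def training_free_score(batch, w1):
    """Score = log|K|, where K[i, j] counts the hidden units on which
    inputs i and j produce the same on/off ReLU pattern. Architectures
    that separate inputs into distinct patterns score higher."""
    codes = (np.maximum(batch @ w1, 0.0) > 0).astype(float)
    agree = codes @ codes.T + (1.0 - codes) @ (1.0 - codes.T)
    agree += 1e-3 * np.eye(len(batch))  # small ridge keeps log-det finite
    _, logdet = np.linalg.slogdet(agree)
    return logdet

rng = np.random.default_rng(0)
batch = rng.standard_normal((8, 16))       # one mini-batch of inputs
w1 = rng.standard_normal((16, 32))         # untrained layer weights
print(np.isfinite(training_free_score(batch, w1)))  # True
```

A single forward pass per candidate replaces a full training run, which is exactly the cost reduction the review attributes to training-free NAS.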
- Published
- 2024
- Full Text
- View/download PDF
43. A Novel Zero-Shot Real World Spatio-Temporal Super-Resolution (ZS-RW-STSR) Model for Video Super-Resolution
- Author
-
Ankit Shukla, Avinash Upadhyay, Manoj Sharma, Anil Saini, Nuzhat Fatema, Hasmat Malik, Asyraf Afthanorhan, and Mohammad Asef Hossaini
- Subjects
Zero-shot ,super-resolution ,convolutional auto-encoder ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Super-resolution (SR) of degraded, real low-resolution (LR) video remains a challenging problem despite the development of deep learning-based SR models. Most existing state-of-the-art networks focus on recovering high-resolution (HR) videos from the corresponding down-sampled LR videos but fail in scenarios with noisy or degraded low-resolution video. In this article, a novel real-world “zero-shot” video spatio-temporal SR model is proposed: a 3D Deep Convolutional Auto-Encoder (3D-CAE) guided, attention-based deep spatio-temporal back-projection network. The 3D-CAE extracts noise-free features from real low-resolution video, which are used in the attention-based deep spatio-temporal back-projection network for clean, high-resolution video reconstruction. In the proposed framework, the denoising loss on the low-resolution video and the high-resolution reconstruction loss are used jointly, end to end, in a zero-shot setting. Further, meta-learning is used to initialize the weights of the proposed model, combining learning on an external dataset with internal learning in the zero-shot environment. To maintain temporal coherency, we use the Motion Compensation Transformer (MCT) for motion estimation and the Sub-Pixel Motion Compensation (SPMC) layer for motion compensation. We have evaluated the performance of our proposed model on the REDS and Vid4 datasets. The PSNR of our model is 25.13 dB on the RealVSR dataset, 0.72 dB higher than the next-best performing model, EAVSR+. On MVSR4x, our model achieves 24.61 dB PSNR, 0.67 dB higher than EAVSR+. Experimental results demonstrate the effectiveness of the proposed framework on degraded and noisy real low-resolution video compared to existing methods. Furthermore, an ablation study highlights the contributions of the 3D-CAE and the attention layer to overall network performance.
- Published
- 2024
- Full Text
- View/download PDF
44. Phrase based code-switching for cross-lingual question understanding.
- Author
-
Haisa, Gulizada, Altenbek, Gulila, and Li, Wen
- Abstract
Cross-lingual question understanding involves identifying named entities and question intent in the target language based on corresponding texts from the source-language training dataset. However, relying solely on bilingual parallel corpora has limitations, especially for low-resource languages where such corpora are scarce or unavailable. This paper argues that current cross-lingual techniques fail to handle various phrases effectively, particularly noun phrases and interrogative phrases. To address this, a new code-switching data augmentation method called PBCS is introduced for zero-shot cross-lingual training. Unlike recent methods, this approach utilizes small bilingual phrase dictionaries instead of relying on a large bilingual parallel corpus. Moreover, a cross-lingual question understanding model, XQUM, is proposed. At the lower level, the model shares input features and hidden-layer states to mitigate error accumulation. At the top level, model performance is enhanced through a bi-directional correlation layer based on an iterative mechanism, specifically tailored to the given task. Experimental results on the MQUC and MTOD datasets demonstrate that XQUM significantly improves the accuracy of cross-lingual question understanding tasks. [ABSTRACT FROM AUTHOR]
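Phrase-based code-switching from a small dictionary can be sketched as a greedy longest-match substitution (an illustrative guess at the mechanism, not the PBCS algorithm itself; the dictionary entry below is a placeholder tag, not a real translation):

```python
def phrase_code_switch(tokens, phrase_dict, max_len=3):
    """Replace source-language phrases with target-language dictionary
    entries, preferring the longest match at each position, so multi-word
    noun phrases are switched as a unit rather than word by word."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in phrase_dict:
                out.append(phrase_dict[phrase])
                i += n
                break
        else:  # no dictionary phrase starts here; keep the source token
            out.append(tokens[i])
            i += 1
    return out

tokens = ["set", "an", "alarm", "for", "seven"]
print(phrase_code_switch(tokens, {"set an alarm": "<tgt:set-an-alarm>"}))
# ['<tgt:set-an-alarm>', 'for', 'seven']
```

Augmenting the source-language training data with such mixed sentences is what lets a model trained only on the source language align phrase representations across languages.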
- Published
- 2024
- Full Text
- View/download PDF
45. Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping.
- Author
-
Li, Wenwen, Hsu, Chia-Yu, Wang, Sizhe, Yang, Yezhou, Lee, Hyunho, Liljedahl, Anna, Witharana, Chandi, Yang, Yili, Rogers, Brendan M., Arundel, Samantha T., Jones, Matthew B., McHenry, Kenton, and Solis, Patricia
- Subjects
- *
LANGUAGE models , *BUILDING foundations , *ARTIFICIAL intelligence , *PERMAFROST , *GLOBAL warming , *TUNDRAS - Abstract
This paper assesses trending AI foundation models, especially emerging computer vision foundation models, and their performance in natural landscape feature segmentation. While the term foundation model has quickly garnered interest from the geospatial domain, its definition remains vague. Hence, this paper first introduces AI foundation models and their defining characteristics. Building on the tremendous success achieved by Large Language Models (LLMs) as foundation models for language tasks, this paper discusses the challenges of building foundation models for geospatial artificial intelligence (GeoAI) vision tasks. To evaluate the performance of large AI vision models, especially Meta's Segment Anything Model (SAM), we implemented different instance segmentation pipelines that minimize the changes to SAM in order to leverage its power as a foundation model. A series of prompt strategies was developed to test SAM's performance regarding its theoretical upper bound of predictive accuracy, its zero-shot performance, and its domain adaptability through fine-tuning. The analysis used two permafrost feature datasets, ice-wedge polygons and retrogressive thaw slumps, because (1) these landform features are more challenging to segment than man-made features due to their complicated formation mechanisms, diverse forms, and vague boundaries; and (2) their presence and changes are important indicators of Arctic warming and climate change. The results show that, although promising, SAM still has room for improvement to support AI-augmented terrain mapping. The spatial and domain generalizability of this finding is further validated using a more general dataset, EuroCrops, for agricultural field mapping. Finally, we discuss future research directions that would strengthen SAM's applicability in challenging geospatial domains. [ABSTRACT FROM AUTHOR]
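The "theoretical upper bound" the abstract probes can be estimated with an oracle selection over candidate masks. A minimal sketch under our own assumptions (binary masks as NumPy arrays; function names are ours, and the paper's actual pipeline differs):

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union between two binary masks."""
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 0.0

def oracle_upper_bound(candidate_masks, gt):
    """For one ground-truth feature, keep the best of the model's candidate
    masks, as if an oracle had supplied the perfect prompt; averaging this
    over a dataset estimates the accuracy ceiling of a promptable model."""
    return max(mask_iou(m, gt) for m in candidate_masks)

gt = np.array([[1, 1], [0, 0]], dtype=bool)
candidates = [np.ones((2, 2), dtype=bool),            # over-segmented guess
              np.array([[1, 1], [0, 0]], dtype=bool)]  # exact match
print(oracle_upper_bound(candidates, gt))  # 1.0
```

The gap between this oracle score and zero-shot performance with realistic prompts is what separates a model's potential from its out-of-the-box usefulness on landforms with vague boundaries.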
- Published
- 2024
- Full Text
- View/download PDF
46. Remote Sensing Object Detection in the Deep Learning Era—A Review.
- Author
-
Gui, Shengxi, Song, Shuang, Qin, Rongjun, and Tang, Yang
- Subjects
- *
OBJECT recognition (Computer vision) , *DEEP learning , *REMOTE sensing , *SYNTHETIC aperture radar , *OPTICAL radar , *OPTICAL remote sensing , *DIGITAL elevation models - Abstract
Given the large volume of remote sensing images collected daily, automatic object detection and segmentation have been a consistent need in Earth observation (EO). However, objects of interest vary in shape, size, appearance, and reflecting properties. This is reflected not only by the fact that these objects exhibit differences due to their geographical diversity but also by the fact that they appear differently in images collected from different sensors (optical and radar) and platforms (satellite, aerial, and unmanned aerial vehicles (UAVs)). Although a plethora of object detection methods exists in the area of remote sensing, the very fast development of prevalent deep learning methods means that recent updates on object detection methods are still lacking. In this paper, we aim to provide an update that informs researchers about the recent development of object detection methods and their close sibling in the deep learning era, instance segmentation. The methods covered span data at different scales and modalities, such as optical images, synthetic aperture radar (SAR) images, and digital surface models (DSMs). Specific emphasis is placed on approaches addressing data and label limitations in this deep learning era. Further, we survey examples of remote sensing applications that have benefited from automatic object detection and discuss future trends of automatic object detection in EO. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Attentional Composition Networks for Long-Tailed Human Action Recognition.
- Author
-
HAORAN WANG, YAJIE WANG, BAOSHENG YU, YIBING ZHAN, CHUNFENG YUAN, and WANKOU YANG
- Subjects
HUMAN activity recognition ,EYE tracking - Abstract
The problem of long-tailed visual recognition has been receiving increasing research attention. However, the long-tailed distribution problem remains underexplored for video-based visual recognition. To address this issue, in this article we propose a compositional-learning-based solution for video-based human action recognition. Our method, named Attentional Composition Networks (ACN), first learns verb-like and preposition-like components, then shuffles these components to generate samples for the tail classes in the feature space, augmenting the data for those classes. Specifically, during training, we represent each action video by a graph that captures the spatial-temporal relations (edges) among detected human/object instances (nodes). Then, ACN utilizes the position information to decompose each action into a set of verb and preposition representations using the edge features in the graph. After that, the verb and preposition features from different videos are combined via an attention structure to synthesize feature representations for the tail classes. In this way, we can enrich the data for the tail classes and consequently improve action recognition for these classes. To evaluate compositional human action recognition, we further contribute a new human action recognition dataset, NEU-Interaction (NEU-I). Experimental results on both Something-Something V2 and the proposed NEU-I demonstrate the effectiveness of the proposed method for long-tailed, few-shot, and zero-shot problems in human action recognition. [ABSTRACT FROM AUTHOR]
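The feature-space augmentation step can be sketched as an attention-weighted composition of one verb component and one preposition component drawn from different videos (a hypothetical illustration; ACN's actual attention structure is learned, and all names here are ours):

```python
import numpy as np

def compose_tail_feature(verb_feat, prep_feat, attn_logits):
    """Softmax-weighted combination of a verb-like feature and a
    preposition-like feature into a synthetic sample for a tail class."""
    w = np.exp(attn_logits) / np.exp(attn_logits).sum()  # attention weights
    return w[0] * verb_feat + w[1] * prep_feat

# A verb component from one (head-class) video and a preposition component
# from another are shuffled together to synthesize a new tail-class sample.
verb = np.array([1.0, 0.0])
prep = np.array([0.0, 1.0])
print(compose_tail_feature(verb, prep, np.array([0.0, 0.0])))  # [0.5 0.5]
```

Because the components are recombined in feature space rather than pixel space, one pass over the head classes can mint many synthetic samples for the tail.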
- Published
- 2024
- Full Text
- View/download PDF
48. Zero-Shot Video Moment Retrieval Using BLIP-Based Models
- Author
-
Wattasseril, Jobin Idiculla, Shekhar, Sumit, Döllner, Jürgen, Trapp, Matthias, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bebis, George, editor, Ghiasi, Golnaz, editor, Fang, Yi, editor, Sharf, Andrei, editor, Dong, Yue, editor, Weaver, Chris, editor, Leo, Zhicheng, editor, LaViola Jr., Joseph J., editor, and Kohli, Luv, editor
- Published
- 2023
- Full Text
- View/download PDF
49. Zero-Shot NER via Extractive Question Answering
- Author
-
Tirskikh, Danil, Konovalov, Vasily, Kacprzyk, Janusz, Series Editor, Kryzhanovsky, Boris, editor, Dunin-Barkowski, Witali, editor, Redko, Vladimir, editor, Tiumentsev, Yury, editor, and Klimov, Valentin, editor
- Published
- 2023
- Full Text
- View/download PDF
50. ZeroGen: Zero-Shot Multimodal Controllable Text Generation with Multiple Oracles
- Author
-
Tu, Haoqin, Yang, Bowen, Zhao, Xianfeng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Fei, editor, Duan, Nan, editor, Xu, Qingting, editor, and Hong, Yu, editor
- Published
- 2023
- Full Text
- View/download PDF