Author: "Wang, Zekun" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wang, Zekun"' showing total 888 results

Start Over Author "Wang, Zekun"

888 results on '"Wang, Zekun"'

1. MIO: A Foundation Model on Multimodal Tokens

Author: Wang, Zekun, Zhu, King, Xu, Chunpu, Zhou, Wangchunshu, Liu, Jiaheng, Zhang, Yibo, Wang, Jiashuo, Shi, Ning, Li, Siyu, Li, Yizhi, Que, Haoran, Zhang, Zhaoxiang, Zhang, Yuanxing, Zhang, Ge, Xu, Ke, Fu, Jie, and Huang, Wenhao
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In this paper, we introduce MIO, a novel foundation model built on multimodal tokens, capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner. While the emergence of large language models (LLMs) and multimodal large language models (MM-LLMs) propels advancements in artificial general intelligence through their versatile capabilities, they still lack true any-to-any understanding and generation. Recently, the release of GPT-4o has showcased the remarkable potential of any-to-any LLMs for complex real-world tasks, enabling omnidirectional input and output across images, speech, and text. However, it is closed-source and does not support the generation of multimodal interleaved sequences. To address this gap, we present MIO, which is trained on a mixture of discrete tokens across four modalities using causal multimodal modeling. MIO undergoes a four-stage training process: (1) alignment pre-training, (2) interleaved pre-training, (3) speech-enhanced pre-training, and (4) comprehensive supervised fine-tuning on diverse textual, visual, and speech tasks. Our experimental results indicate that MIO exhibits competitive, and in some cases superior, performance compared to previous dual-modal baselines, any-to-any model baselines, and even modality-specific baselines. Moreover, MIO demonstrates advanced capabilities inherent to its any-to-any feature, such as interleaved video-text generation, chain-of-visual-thought reasoning, visual guideline generation, instructional image editing, etc., Comment: Technical Report. Codes and models will be available soon
Published: 2024

2. HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Author: Que, Haoran, Duan, Feiyu, He, Liqun, Mou, Yutao, Zhou, Wangchunshu, Liu, Jiaheng, Rong, Wenge, Wang, Zekun Moore, Yang, Jian, Zhang, Ge, Peng, Junran, Zhang, Zhaoxiang, Zhang, Songyang, and Chen, Kai
Subjects: Computer Science - Computation and Language
Abstract: In recent years, Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks (e.g., long-context understanding), and many benchmarks have been proposed. However, we observe that long text generation capabilities are not well investigated. Therefore, we introduce the Hierarchical Long Text Generation Benchmark (HelloBench), a comprehensive, in-the-wild, and open-ended benchmark to evaluate LLMs' performance in generating long text. Based on Bloom's Taxonomy, HelloBench categorizes long text generation tasks into five subtasks: open-ended QA, summarization, chat, text completion, and heuristic text generation. Besides, we propose Hierarchical Long Text Evaluation (HelloEval), a human-aligned evaluation method that significantly reduces the time and effort required for human evaluation while maintaining a high correlation with human evaluation. We have conducted extensive experiments across around 30 mainstream LLMs and observed that the current LLMs lack long text generation capabilities. Specifically, first, regardless of whether the instructions include explicit or implicit length constraints, we observe that most LLMs cannot generate text that is longer than 4000 words. Second, we observe that while some LLMs can generate longer text, many issues exist (e.g., severe repetition and quality degradation). Third, to demonstrate the effectiveness of HelloEval, we compare HelloEval with traditional metrics (e.g., ROUGE, BLEU, etc.) and LLM-as-a-Judge methods, which show that HelloEval has the highest correlation with human evaluation. We release our code in https://github.com/Quehry/HelloBench.
Published: 2024

3. CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information

Author: Wang, Yuxin, Ma, Minghua, Wang, Zekun, Chen, Jingchang, Fan, Huiming, Shan, Liping, Yang, Qing, Xu, Dongliang, Liu, Ming, and Qin, Bing
Subjects: Computer Science - Computation and Language
Abstract: The colossal parameters and computational overhead of Large Language Models (LLMs) challenge their real-world applications. Network pruning, which targets unstructured or structured sparsity by removing redundant parameters, has recently been explored for LLM acceleration. Existing LLM pruning works focus on unstructured pruning, which typically requires special hardware support for a practical speed-up. In contrast, structured pruning can reduce latency on general devices. However, it remains a challenge to perform structured pruning efficiently and maintain performance, especially at high sparsity ratios. To this end, we introduce an efficient structured pruning framework named CFSP, which leverages both Coarse (interblock) and Fine-grained (intrablock) activation information as an importance criterion to guide pruning. The pruning is highly efficient, as it only requires one forward pass to compute feature activations. Specifically, we first allocate the sparsity budget across blocks based on their importance and then retain important weights within each block. In addition, we introduce a recovery fine-tuning strategy that adaptively allocates training overhead based on coarse-grained importance to further improve performance. Experimental results demonstrate that CFSP outperforms existing methods on diverse models across various sparsity budgets. Our code will be available at https://github.com/wyxscir/CFSP., Comment: Work in progress
Published: 2024

4. Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos

Author: Fei, Jiajun, Li, Dian, Deng, Zhidong, Wang, Zekun, Liu, Gang, and Wang, Hui
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Multi-modal large language models (MLLMs) have demonstrated considerable potential across various downstream tasks that require cross-domain knowledge. MLLMs capable of processing videos, known as Video-MLLMs, have attracted broad interest in video-language understanding. However, videos, especially long videos, contain more visual tokens than images, making them difficult for LLMs to process. Existing works either downsample visual features or extend the LLM context size, risking the loss of high-resolution information or slowing down inference speed. To address these limitations, we apply cross-attention layers in the intermediate projector between the visual encoder and the large language model (LLM). As the naive cross-attention mechanism is insensitive to temporal order, we further introduce causal cross-attention masks (CCAMs) within the cross-attention layers. This Video-MLLM, named Video-CCAM, is trained in a straightforward two-stage fashion: feature alignment and visual instruction tuning. We develop several Video-CCAM models based on LLMs of different sizes (4B, 9B, and 14B). Video-CCAM proves to be a robust Video-MLLM and shows outstanding performance from short videos to long ones. Among standard video benchmarks like MVBench and VideoChatGPT-QA, Video-CCAM shows outstanding performances (1st/2nd/3rd in MVBench and TGIF-QA, 2nd/3rd/4th in MSVD-QA, MSRVTT-QA, and ActivityNet-QA). In benchmarks encompassing long videos, Video-CCAM models can be directly adapted to long video understanding and still achieve exceptional scores despite being trained solely with images and 16-frame videos. Using 96 frames (6$\times$ the training number of frames), Video-CCAM models rank 1st/2nd/3rd in VideoVista and 1st/2nd/4th in MLVU among all open-source Video-MLLMs, respectively. The code is publicly available in \url{https://github.com/QQ-MM/Video-CCAM}., Comment: 10 pages, 5 figures
Published: 2024

5. GUIDE: A Guideline-Guided Dataset for Instructional Video Comprehension

Author: Liang, Jiafeng, Jiang, Shixin, Wang, Zekun, Pan, Haojie, Chen, Zerui, Chu, Zheng, Liu, Ming, Fu, Ruiji, Wang, Zhongyuan, and Qin, Bing
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: There are substantial instructional videos on the Internet, which provide us tutorials for completing various tasks. Existing instructional video datasets only focus on specific steps at the video level, lacking experiential guidelines at the task level, which can lead to beginners struggling to learn new tasks due to the lack of relevant experience. Moreover, the specific steps without guidelines are trivial and unsystematic, making it difficult to provide a clear tutorial. To address these problems, we present the GUIDE (Guideline-Guided) dataset, which contains 3.5K videos of 560 instructional tasks in 8 domains related to our daily life. Specifically, we annotate each instructional task with a guideline, representing a common pattern shared by all task-related videos. On this basis, we annotate systematic specific steps, including their associated guideline steps, specific step descriptions and timestamps. Our proposed benchmark consists of three sub-tasks to evaluate comprehension ability of models: (1) Step Captioning: models have to generate captions for specific steps from videos. (2) Guideline Summarization: models have to mine the common pattern in task-related videos and summarize a guideline from them. (3) Guideline-Guided Captioning: models have to generate captions for specific steps under the guide of guideline. We evaluate plenty of foundation models with GUIDE and perform in-depth analysis. Given the diversity and practicality of GUIDE, we believe that it can be used as a better benchmark for instructional video comprehension., Comment: IJCAI 2024
Published: 2024

6. PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

Author: Wang, Junjie, Zhang, Yin, Ji, Yatai, Zhang, Yuxiang, Jiang, Chunyang, Wang, Yubo, Zhu, Kang, Wang, Zekun, Wang, Tiezhen, Huang, Wenhao, Fu, Jie, Chen, Bei, Lin, Qunshu, Liu, Minghao, Zhang, Ge, and Chen, Wenhu
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Recent advancements in Large Multimodal Models (LMMs) have leveraged extensive multimodal datasets to enhance capabilities in complex knowledge-driven tasks. However, persistent challenges in perceptual and reasoning errors limit their efficacy, particularly in interpreting intricate visual data and deducing multimodal relationships. Addressing these issues, we introduce a novel dataset format, PIN (Paired and INterleaved multimodal documents), designed to significantly improve both the depth and breadth of multimodal training. The PIN format is built on three foundational principles: knowledge intensity, scalability, and support for diverse training modalities. This innovative format combines markdown files and comprehensive images to enrich training data with a dense knowledge structure and versatile training strategies. We present PIN-14M, an open-source dataset comprising 14 million samples derived from a diverse range of Chinese and English sources, tailored to include complex web and scientific content. This dataset is constructed meticulously to ensure data quality and ethical integrity, aiming to facilitate advanced training strategies and improve model robustness against common multimodal training pitfalls. Our initial results, forming the basis of this technical report, suggest significant potential for the PIN format in refining LMM performance, with plans for future expansions and detailed evaluations of its impact on model capabilities.
Published: 2024

7. McEval: Massively Multilingual Code Evaluation

Author: Chai, Linzheng, Liu, Shukai, Yang, Jian, Yin, Yuwei, Jin, Ke, Liu, Jiaheng, Sun, Tao, Zhang, Ge, Ren, Changyu, Guo, Hongcheng, Wang, Zekun, Wang, Boyang, Wu, Xianjie, Wang, Bing, Li, Tongliang, Yang, Liqun, Duan, Sufeng, and Li, Zhoujun
Subjects: Computer Science - Programming Languages
Abstract: Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard to evaluate the capability of different LLMs in such tasks. However, most existing benchmarks primarily focus on Python and are still restricted to a limited number of languages, where other languages are translated from the Python samples (e.g. MultiPL-E) degrading the data diversity. To further facilitate the research of code LLMs, we propose a massively multilingual code benchmark covering 40 programming languages (McEval) with 16K test samples, which substantially pushes the limits of code LLMs in multilingual scenarios. The benchmark contains challenging code completion, understanding, and generation evaluation tasks with finely curated massively multilingual instruction corpora McEval-Instruct. In addition, we introduce an effective multilingual coder mCoder trained on McEval-Instruct to support multilingual programming language generation. Extensive experimental results on McEval show that there is still a difficult journey between open-source models and closed-source LLMs (e.g. GPT-series models) in numerous languages. The instruction corpora, evaluation benchmark, and leaderboard are available at \url{https://mceval.github.io/}., Comment: 22 pages
Published: 2024

8. II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

Author: Liu, Ziqiang, Fang, Feiteng, Feng, Xi, Du, Xinrun, Zhang, Chenhao, Wang, Zekun, Bai, Yuelin, Zhao, Qixuan, Fan, Liyang, Gan, Chengguang, Lin, Hongquan, Li, Jiaming, Ni, Yuansheng, Wu, Haihong, Narsupalli, Yaswanth, Zheng, Zhigang, Li, Chengming, Hu, Xiping, Xu, Ruifeng, Chen, Xiaojun, Yang, Min, Liu, Jiaheng, Liu, Ruibo, Huang, Wenhao, Zhang, Ge, and Ni, Shiwen
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: The rapid advancements in the development of multimodal large language models (MLLMs) have consistently led to new breakthroughs on various benchmarks. In response, numerous challenging and comprehensive benchmarks have been proposed to more accurately assess the capabilities of MLLMs. However, there is a dearth of exploration of the higher-order perceptual capabilities of MLLMs. To fill this gap, we propose the Image Implication understanding Benchmark, II-Bench, which aims to evaluate the model's higher-order perception of images. Through extensive experiments on II-Bench across multiple MLLMs, we have made significant findings. Initially, a substantial gap is observed between the performance of MLLMs and humans on II-Bench. The pinnacle accuracy of MLLMs attains 74.8%, whereas human accuracy averages 90%, peaking at an impressive 98%. Subsequently, MLLMs perform worse on abstract and complex images, suggesting limitations in their ability to understand high-level semantics and capture image details. Finally, it is observed that most models exhibit enhanced accuracy when image sentiment polarity hints are incorporated into the prompts. This observation underscores a notable deficiency in their inherent understanding of image sentiment. We believe that II-Bench will inspire the community to develop the next generation of MLLMs, advancing the journey towards expert artificial general intelligence (AGI). II-Bench is publicly available at https://huggingface.co/datasets/m-a-p/II-Bench., Comment: 100 pages, 82 figures, add citations
Published: 2024

9. Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

Author: Chen, Jingchang, Tang, Hongxuan, Chu, Zheng, Chen, Qianglong, Wang, Zekun, Liu, Ming, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Software Engineering
Abstract: Despite recent progress made by large language models in code generation, they still struggle with programs that meet complex requirements. Recent work utilizes plan-and-solve decomposition to decrease the complexity and leverage self-tests to refine the generated program. Yet, planning deep-inside requirements in advance can be challenging, and the tests need to be accurate to accomplish self-improvement. To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. Specifically, FunCoder recursively branches off sub-functions as smaller goals during code generation, represented by a tree hierarchy. These sub-functions are then composited to attain more complex objectives. Additionally, we designate functions via a consensus formed by identifying similarities in program behavior, mitigating error propagation. FunCoder outperforms state-of-the-art methods by +9.8% on average in HumanEval, MBPP, xCodeEval and MATH with GPT-3.5 and GPT-4. Moreover, our method demonstrates superiority on smaller models: With FunCoder, StableCode-3b surpasses GPT-3.5 by +18.6% and achieves 97.7% of GPT-4's performance on HumanEval. Further analysis reveals that our proposed dynamic function decomposition is capable of handling complex requirements, and the functional consensus prevails over self-testing in correctness evaluation.
Published: 2024

10. Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations

Author: Ma, Ziqiao, Wang, Zekun, and Chai, Joyce
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Humans are efficient language learners and inherently social creatures. Our language development is largely shaped by our social interactions, for example, the demonstration and feedback from caregivers. Contrary to human language learning, recent advancements in large language models have primarily adopted a non-interactive training paradigm, and refined pre-trained models through feedback afterward. In this work, we aim to examine how corrective feedback from interactions influences neural language acquisition from the ground up through systematically controlled experiments, assessing whether it contributes to learning efficiency in language models. We introduce a trial-and-demonstration (TnD) learning framework that incorporates three components: student trials, teacher demonstrations, and a reward conditioned on language competence at various developmental stages. Our experiments reveal that the TnD approach accelerates word acquisition for student models of equal and smaller numbers of parameters, and we highlight the significance of both trials and demonstrations. We further show that the teacher's choices of words influence students' word-specific learning efficiency, and a practice-makes-perfect effect is evident by a strong correlation between the frequency of words in trials and their respective learning curves. Our findings suggest that interactive language learning, with teacher demonstrations and student trials, can facilitate efficient word learning in language models.
Published: 2024

11. EFACT: an External Function Auto-Completion Tool to Strengthen Static Binary Lifting

Author: Zhang, Yilei, Liao, Haoyu, Wang, Zekun, Huang, Bo, and Guo, Jianmei
Subjects: Computer Science - Software Engineering
Abstract: Static binary lifting is essential in binary rewriting frameworks. Existing tools overlook the impact of External Function Completion (EXFC) in static binary lifting. EXFC recovers the prototypes of External Functions (EXFs, functions defined in standard shared libraries) using only the function symbols available. Incorrect EXFC can misinterpret the source binary, or cause memory overflows in static binary translation, which eventually results in program crashes. Notably, existing tools struggle to recover the prototypes of mangled EXFs originating from binaries compiled from C++. Moreover, they require time-consuming manual processing to support new libraries. This paper presents EFACT, an External Function Auto-Completion Tool for static binary lifting. Our EXF recovery algorithm better recovers the prototypes of mangled EXFs, particularly addressing the template specialization mechanism in C++. EFACT is designed as a lightweight plugin to strengthen other static binary rewriting frameworks in EXFC. Our evaluation shows that EFACT outperforms RetDec and McSema in mangled EXF recovery by 96.4% and 97.3% on SPEC CPU 2017. Furthermore, we delve deeper into static binary translation and address several cross-ISA EXFC problems. When integrated with McSema, EFACT correctly translates 36.7% more benchmarks from x86-64 to x86-64 and 93.6% more from x86-64 to AArch64 than McSema alone on EEMBC.
Published: 2024

12. Legendre Transformation under Micro Canonical Ensemble

Author: Wu, Jingxu, Li, Chenjia, Lei, Zhenzhou, Wumaier, Tuerdi, Li, Congyu, Wang, Yan, and Wang, Zekun
Subjects: Condensed Matter - Statistical Mechanics, Mathematical Physics
Abstract: The Legendre transformation is a crucial tool in theoretical physics, known for its symmetry, especially when applied to multivariate functions. In statistical mechanics, ensembles represent the central focus. Leveraging the dimensionless aspect of Legendre transformation, this paper explores the transformation process from the entropy characteristic function of microcanonical ensembles to the analogous definition of partition function transformation. Additionally, it derives characteristic functions, partition functions, and establishes their interrelations, along with deriving corresponding thermodynamic formulas for various ensembles. This streamlined approach sheds light on the fundamental principles of statistical mechanics and underscores the symmetry inherent in Legendre transformation.
Published: 2024

13. COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning

Author: Bai, Yuelin, Du, Xinrun, Liang, Yiming, Jin, Yonggang, Liu, Ziqiang, Zhou, Junting, Zheng, Tianyu, Zhang, Xincheng, Ma, Nuo, Wang, Zekun, Yuan, Ruibin, Wu, Haihong, Lin, Hongquan, Huang, Wenhao, Zhang, Jiajun, Chen, Wenhu, Lin, Chenghua, Fu, Jie, Yang, Min, Ni, Shiwen, and Zhang, Ge
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Recently, there have been significant advancements in large language models (LLMs), particularly focused on the English language. These advancements have enabled these LLMs to understand and execute complex instructions with unprecedented accuracy and fluency. However, despite these advancements, there remains a noticeable gap in the development of Chinese instruction tuning. The unique linguistic features and cultural depth of the Chinese language pose challenges for instruction tuning tasks. Existing datasets are either derived from English-centric LLMs or are ill-suited for aligning with the interaction patterns of real-world Chinese users. To bridge this gap, we introduce COIG-CQIA, a high-quality Chinese instruction tuning dataset. Our aim is to build a diverse, wide-ranging instruction-tuning dataset to better align model behavior with human interactions. To this end, we collect a high-quality human-written corpus from various sources on the Chinese Internet, including Q&A communities, Wikis, examinations, and existing NLP datasets. This corpus was rigorously filtered and carefully processed to form the COIG-CQIA dataset. Furthermore, we train models of various scales on different subsets of CQIA, following in-depth evaluation and analyses. The findings from our experiments offer valuable insights for selecting and developing Chinese instruction-tuning datasets. We also find that models trained on CQIA-Subset achieve competitive results in human assessment as well as knowledge and security benchmarks. Data are available at https://huggingface.co/datasets/m-a-p/COIG-CQIA
Published: 2024

14. CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models

Author: LI, Yizhi, Zhang, Ge, Qu, Xingwei, Li, Jiali, Li, Zhaoqun, Wang, Zekun, Li, Hao, Yuan, Ruibin, Ma, Yinghao, Zhang, Kai, Zhou, Wangchunshu, Liang, Yiming, Zhang, Lei, Ma, Lei, Zhang, Jiajun, Li, Zuowen, Huang, Stephen W., Lin, Chenghua, and Fu, Jie
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The advancement of large language models (LLMs) has enhanced the ability to generalize across a wide range of unseen natural language processing (NLP) tasks through instruction-following. Yet, their effectiveness often diminishes in low-resource languages like Chinese, exacerbated by biased evaluations from data leakage, casting doubt on their true generalizability to new linguistic territories. In response, we introduce the Chinese Instruction-Following Benchmark (CIF-Bench), designed to evaluate the zero-shot generalizability of LLMs to the Chinese language. CIF-Bench comprises 150 tasks and 15,000 input-output pairs, developed by native speakers to test complex reasoning and Chinese cultural nuances across 20 categories. To mitigate data contamination, we release only half of the dataset publicly, with the remainder kept private, and introduce diversified instructions to minimize score variance, totaling 45,000 data instances. Our evaluation of 28 selected LLMs reveals a noticeable performance gap, with the best model scoring only 52.9%, highlighting the limitations of LLMs in less familiar language and task contexts. This work not only uncovers the current limitations of LLMs in handling Chinese language tasks but also sets a new standard for future LLM generalizability research, pushing towards the development of more adaptable, culturally informed, and linguistically diverse models., Comment: Camera-ready version for ACL 2024. Project page at https://yizhilll.github.io/CIF-Bench/
Published: 2024

15. PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents

Author: Yang, Qisen, Wang, Zekun, Chen, Honghui, Wang, Shenzhi, Pu, Yifan, Gao, Xin, Huang, Wenhao, Song, Shiji, and Huang, Gao
Subjects: Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Psychological measurement is essential for mental health, self-understanding, and personal development. Traditional methods, such as self-report scales and psychologist interviews, often face challenges with engagement and accessibility. While game-based and LLM-based tools have been explored to improve user interest and automate assessment, they struggle to balance engagement with generalizability. In this work, we propose PsychoGAT (Psychological Game AgenTs) to achieve a generic gamification of psychological assessment. The main insight is that powerful LLMs can function both as adept psychologists and innovative game designers. By incorporating LLM agents into designated roles and carefully managing their interactions, PsychoGAT can transform any standardized scales into personalized and engaging interactive fiction games. To validate the proposed method, we conduct psychometric evaluations to assess its effectiveness and employ human evaluators to examine the generated content across various psychological constructs, including depression, cognitive distortions, and personality traits. Results demonstrate that PsychoGAT serves as an effective assessment tool, achieving statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity. Moreover, human evaluations confirm PsychoGAT's enhancements in content coherence, interactivity, interest, immersion, and satisfaction., Comment: ACL 2024
Published: 2024

16. CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Author: Zhang, Ge, Du, Xinrun, Chen, Bei, Liang, Yiming, Luo, Tongxu, Zheng, Tianyu, Zhu, Kang, Cheng, Yuyang, Xu, Chunpu, Guo, Shuyue, Zhang, Haoran, Qu, Xingwei, Wang, Junjie, Yuan, Ruibin, Li, Yizhi, Wang, Zekun, Liu, Yudong, Tsai, Yu-Hsuan, Zhang, Fengji, Lin, Chenghua, Huang, Wenhao, and Fu, Jie
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: As the capabilities of large multimodal models (LMMs) continue to advance, evaluating the performance of LMMs emerges as an increasing need. Additionally, there is an even larger gap in evaluating the advanced knowledge and reasoning abilities of LMMs in non-English contexts such as Chinese. We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context. CMMMU is inspired by and strictly follows the annotation and analysis pattern of MMMU. CMMMU includes 12k manually collected multimodal questions from college exams, quizzes, and textbooks, covering six core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, and Tech & Engineering, like its companion, MMMU. These questions span 30 subjects and comprise 39 highly heterogeneous image types, such as charts, diagrams, maps, tables, music sheets, and chemical structures. CMMMU focuses on complex perception and reasoning with domain-specific knowledge in the Chinese context. We evaluate 11 open-source LLMs and one proprietary GPT-4V(ision). Even GPT-4V only achieves accuracies of 42%, indicating a large space for improvement. CMMMU will boost the community to build the next-generation LMMs towards expert artificial intelligence and promote the democratization of LMMs by providing diverse language contexts.
Published: 2024

17. CogGPT: Unleashing the Power of Cognitive Dynamics on Large Language Models

Author: Lv, Yaojia, Pan, Haojie, Wang, Zekun, Liang, Jiafeng, Liu, Yuanxing, Fu, Ruiji, Liu, Ming, Wang, Zhongyuan, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Cognitive dynamics are pivotal to advance human understanding of the world. Recent advancements in large language models (LLMs) reveal their potential for cognitive simulation. However, these LLM-based cognitive studies primarily focus on static modeling, overlooking the dynamic nature of cognition. To bridge this gap, we propose the concept of the cognitive dynamics of LLMs and present a corresponding task with the inspiration of longitudinal studies. Towards the task, we develop CogBench, a novel benchmark to assess the cognitive dynamics of LLMs and validate it through participant surveys. We also design two evaluation metrics for CogBench, including Authenticity and Rationality. Recognizing the inherent static nature of LLMs, we introduce CogGPT for the task, which features an innovative iterative cognitive mechanism aimed at enhancing lifelong cognitive dynamics. Empirical results demonstrate the superiority of CogGPT over existing methods, particularly in its ability to facilitate role-specific cognitive dynamics under continuous information flows., Comment: Accepted to EMNLP 2024 (Findings)
Published: 2024

18. Align on the Fly: Adapting Chatbot Behavior to Established Norms

Author: Xu, Chunpu, Chern, Steffi, Chern, Ethan, Zhang, Ge, Wang, Zekun, Liu, Ruibo, Li, Jing, Fu, Jie, and Liu, Pengfei
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we aim to align large language models with the ever-changing, complex, and diverse human values (e.g., social norms) across time and locations. This presents a challenge to existing alignment techniques, such as supervised fine-tuning, which internalize values within model parameters. To overcome this, we propose an On-the-fly Preference Optimization (OPO) method, which is a real-time alignment that works in a streaming way. It employs an external memory to store established rules for alignment, which can constrain LLMs' behaviors without further training, allowing for convenient updates and customization of human values. We also introduce a scalable evaluation to assess the proposed method more effectively. Experimental results on both human-annotated and auto-generated questions from legal and moral domains indicate the effectiveness of the proposed OPO method. Our code and data are released at https://github.com/GAIR-NLP/OPO.
Published: 2023

19. MTGER: Multi-view Temporal Graph Enhanced Temporal Reasoning over Time-Involved Document

Author: Chu, Zheng, Wang, Zekun, Liang, Jiafeng, Liu, Ming, and Qin, Bing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The facts and time in the document are intricately intertwined, making temporal reasoning over documents challenging. Previous work models time implicitly, making it difficult to handle such complex relationships. To address this issue, we propose MTGER, a novel Multi-view Temporal Graph Enhanced Temporal Reasoning framework for temporal reasoning over time-involved documents. Concretely, MTGER explicitly models the temporal relationships among facts by multi-view temporal graphs. On the one hand, the heterogeneous temporal graphs explicitly model the temporal and discourse relationships among facts; on the other hand, the multi-view mechanism captures both time-focused and fact-focused information, allowing the two views to complement each other through adaptive fusion. To further improve the implicit reasoning capability of the model, we design a self-supervised time-comparing objective. Extensive experimental results demonstrate the effectiveness of our method on the TimeQA and SituatedQA datasets. Furthermore, MTGER gives more consistent answers under question perturbations., Comment: Findings of EMNLP 2023, long paper
Published: 2023

20. Endovascular thrombectomy with or without intravenous alteplase in large-core ischemic stroke: a systematic review and meta-analysis

Author: Wang, Zekun, Ji, Kangxiang, and Fang, Qi
Published: 2024
Full Text: View/download PDF

21. Hot deformation behavior and processing workability of ERNiCrMo-3 alloy

Author: Sun, Zhiren, Yang, Yan, Ning, Xu, Li, Yuan, Yang, Sen, Wang, Zekun, and Wang, Kaikun
Published: 2024
Full Text: View/download PDF

22. A highoutput PDMS-MXene/gelatin triboelectric nanogenerator with the petal surface-microstructure

Author: Wang, Zekun, Hao, Congcong, Cai, Mingzhe, Cui, Juan, Zheng, Yongqiu, and Xue, Chenyang
Published: 2024
Full Text: View/download PDF

23. RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Author: Wang, Zekun Moore, Peng, Zhongyuan, Que, Haoran, Liu, Jiaheng, Zhou, Wangchunshu, Wu, Yuhan, Guo, Hongcheng, Gan, Ruitong, Ni, Zehao, Yang, Jian, Zhang, Man, Zhang, Zhaoxiang, Ouyang, Wanli, Xu, Ke, Huang, Stephen W., Fu, Jie, and Peng, Junran
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4)., Comment: 30 pages, repo at https://github.com/InteractiveNLP-Team/RoleLLM-public
Published: 2023

24. The trigger system for the CSR external-target experiment

Author: Guo, Dong, Xyu, Haoqian, Qi, DongDong, Wang, HeXiang, Zhang, Lei, Sun, Zhengyang, Qin, Zhi, Wang, Botan, Zhou, Yingjie, Wang, Zekun, Yang, Yuansheng, Qin, Yuhao, Wei, Xianglun, Yang, Herun, Yu, Yuhong, Zhao, Lei, and Xiao, Zhigang
Subjects: Physics - Instrumentation and Detectors, Nuclear Experiment
Abstract: A trigger system has been designed and implemented for the HIRFL-CSR external target experiment (CEE), the spectrometer for studying nuclear matter properties with heavy ion collisions in the GeV energy region. The system adopts master-slave structure and serial data transmission mode using optical fiber to deal with different types of detectors and long-distance signal transmission. The trigger logic can be accessed based on command register and controlled by a remote computer. The overall field programmable gate array (FPGA) logic can be flexibly reconfigured online to match the physical requirements of the experiment. The trigger system has been tested in beam experiment. It is demonstrated that the trigger system functions correctly and meets the physical requirements of CEE.
Published: 2023

25. Fragment and Integrate Network (FIN): A Novel Spatial-Temporal Modeling Based on Long Sequential Behavior for Online Food Ordering Click-Through Rate Prediction

Author: Li, Jun, Wang, Jingjian, Wang, Hongwei, Deng, Xing, Chen, Jielong, Cao, Bing, Wang, Zekun, Xu, Guanjie, Zhang, Ge, Shi, Feng, and Liu, Hualei
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Spatial-temporal information has been proven to be of great significance for click-through rate prediction tasks in online Location-Based Services (LBS), especially in mainstream food ordering platforms such as DoorDash, Uber Eats, Meituan, and Ele.me. Modeling user spatial-temporal preferences with sequential behavior data has become a hot topic in recommendation systems and online advertising. However, most of existing methods either lack the representation of rich spatial-temporal information or only handle user behaviors with limited length, e.g. 100. In this paper, we tackle these problems by designing a new spatial-temporal modeling paradigm named Fragment and Integrate Network (FIN). FIN consists of two networks: (i) Fragment Network (FN) extracts Multiple Sub-Sequences (MSS) from lifelong sequential behavior data, and captures the specific spatial-temporal representation by modeling each MSS respectively. Here both a simplified attention and a complicated attention are adopted to balance the performance gain and resource consumption. (ii) Integrate Network (IN) builds a new integrated sequence by utilizing spatial-temporal interaction on MSS and captures the comprehensive spatial-temporal representation by modeling the integrated sequence with a complicated attention. Both public datasets and production datasets have demonstrated the accuracy and scalability of FIN. Since 2022, FIN has been fully deployed in the recommendation advertising system of Ele.me, one of the most popular online food ordering platforms in China, obtaining 5.7% improvement on Click-Through Rate (CTR) and 7.3% increase on Revenue Per Mille (RPM)., Comment: Accepted by CIKM 2023 Applied Research Paper
Published: 2023

26. Components of dissolved organic carbon in relation to environmental factors in lakes along an altitudinal gradient in central China

Author: Long, Xuejing, Zhang, Huimin, Liu, Hui, Wang, Zekun, Zeng, Linghan, Huang, Xianyu, and Chen, Xu
Published: 2024
Full Text: View/download PDF

27. Exploring & Exploiting High-Order Graph Structure for Sparse Knowledge Graph Completion

Author: He, Tao, Liu, Ming, Cao, Yixin, Wang, Zekun, Zheng, Zihao, Chu, Zheng, and Qin, Bing
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Sparse knowledge graph (KG) scenarios pose a challenge for previous Knowledge Graph Completion (KGC) methods, that is, the completion performance decreases rapidly with the increase of graph sparsity. This problem is also exacerbated because of the widespread existence of sparse KGs in practical applications. To alleviate this challenge, we present a novel framework, LR-GCN, that is able to automatically capture valuable long-range dependency among entities to supplement insufficient structure features and distill logical reasoning knowledge for sparse KGC. The proposed approach comprises two main components: a GNN-based predictor and a reasoning path distiller. The reasoning path distiller explores high-order graph structures such as reasoning paths and encodes them as rich-semantic edges, explicitly compositing long-range dependencies into the predictor. This step also plays an essential role in densifying KGs, effectively alleviating the sparse issue. Furthermore, the path distiller further distills logical reasoning knowledge from these mined reasoning paths into the predictor. These two components are jointly optimized using a well-designed variational EM algorithm. Extensive experiments and analyses on four sparse benchmarks demonstrate the effectiveness of our proposed method., Comment: 12 pages, 5 figures
Published: 2023

28. SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models

Author: Wang, Zekun, Chen, Jingchang, Zhou, Wangchunshu, Zhu, Haichao, Liang, Jiafeng, Shan, Liping, Liu, Ming, Xu, Dongliang, Yang, Qing, and Qin, Bing
Subjects: Computer Science - Computation and Language
Abstract: Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications. Moreover, the degree of redundancy in token representations and model parameters, such as attention heads, varies significantly for different inputs. In light of the challenges, we propose SmartTrim, an adaptive acceleration framework for VLMs, which adjusts the computational overhead per instance. Specifically, we integrate lightweight modules into the original backbone to identify and prune redundant token representations and attention heads within each layer. Furthermore, we devise a self-distillation strategy to enhance the consistency between the predictions of the pruned model and its fully-capacity counterpart. Experimental results across various vision-language tasks consistently demonstrate that SmartTrim accelerates the original model by 2-3 times with minimal performance degradation, highlighting the effectiveness and efficiency compared to previous approaches. Code will be available at https://github.com/kugwzk/SmartTrim., Comment: COLING-LREC 2024
Published: 2023

29. Interactive Natural Language Processing

Author: Wang, Zekun, Zhang, Ge, Yang, Kexin, Shi, Ning, Zhou, Wangchunshu, Hao, Shaochun, Xiong, Guangzheng, Li, Yizhi, Sim, Mong Yuan, Chen, Xiuying, Zhu, Qingqing, Yang, Zhenzhu, Nik, Adam, Liu, Qi, Lin, Chenghua, Wang, Shi, Liu, Ruibo, Chen, Wenhu, Xu, Ke, Liu, Dayiheng, Guo, Yike, and Fu, Jie
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP., Comment: 110 pages
Published: 2023

30. Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models

Author: Ignat, Oana, Jin, Zhijing, Abzaliev, Artem, Biester, Laura, Castro, Santiago, Deng, Naihao, Gao, Xinyi, Gunal, Aylin, He, Jacky, Kazemi, Ashkan, Khalifa, Muhammad, Koh, Namho, Lee, Andrew, Liu, Siyang, Min, Do June, Mori, Shinka, Nwatu, Joan, Perez-Rosas, Veronica, Shen, Siqi, Wang, Zekun, Wu, Winston, and Mihalcea, Rada
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Recent progress in large language models (LLMs) has enabled the deployment of many generative NLP applications. At the same time, it has also led to a misleading public discourse that ``it's all been solved.'' Not surprisingly, this has, in turn, made many NLP researchers -- especially those at the beginning of their careers -- worry about what NLP research area they should focus on. Has it all been solved, or what remaining questions can we work on regardless of LLMs? To address this question, this paper compiles NLP research directions rich for exploration. We identify fourteen different research areas encompassing 45 research directions that require new research and are not directly solvable by LLMs. While we identify many research areas, many others exist; we do not cover areas currently addressed by LLMs, but where LLMs lag behind in performance or those focused on LLM development. We welcome suggestions for other research directions to include: https://bit.ly/nlp-era-llm, Comment: Accepted at COLING 2024
Published: 2023

31. CircTRIM1 encodes TRIM1-269aa to promote chemoresistance and metastasis of TNBC via enhancing CaM-dependent MARCKS translocation and PI3K/AKT/mTOR activation

Author: Li, Yaming, Wang, Zekun, Yang, Jingwen, Sun, Yuhan, He, Yinqiao, Wang, Yuping, Chen, Xi, Liang, Yiran, Zhang, Ning, Wang, Xiaolong, Zhao, Wenjing, Hu, Guohong, and Yang, Qifeng
Published: 2024
Full Text: View/download PDF

32. PSMB2 plays an oncogenic role in glioma and correlates to the immune microenvironment

Author: He, Wei, Zhang, Zhe, Tan, ZiLong, Liu, XinXian, Wang, ZeKun, Xiong, Bo, Shen, XiaoLi, and Zhu, XinGen
Published: 2024
Full Text: View/download PDF

33. Spatial and temporal distribution of emerging airborne viral infectious diseases outbreaks on a global scale

Author: Wang, Zekun, Yan, Xiangyu, Zhao, Mingchen, Zhang, Shan, and Jia, Zhongwei
Published: 2024
Full Text: View/download PDF

34. Electron Backscatter Diffraction Investigation on Microstructure Evolution of TiB2(p)/Al-Cu Composite after Single-Pass Equal Channel Angular Pressing for Formability Assessment

Author: Yang, Sen, Wang, Kaikun, Jarfors, Anders E. W., Sun, Zhiren, Li, Qipeng, Wang, Zekun, and Huang, Zherong
Published: 2024
Full Text: View/download PDF

35. Chinese Open Instruction Generalist: A Preliminary Release

Author: Zhang, Ge, Shi, Yemin, Liu, Ruibo, Yuan, Ruibin, Li, Yizhi, Dong, Siwei, Shu, Yu, Li, Zhaoqun, Wang, Zekun, Lin, Chenghua, Huang, Wenhao, and Fu, Jie
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Instruction tuning is widely recognized as a key technique for building generalist language models, which has attracted the attention of researchers and the public with the release of InstructGPT~\citep{ouyang2022training} and ChatGPT\footnote{\url{https://chat.openai.com/}}. Despite impressive progress in English-oriented large-scale language models (LLMs), it is still under-explored whether English-based foundation LLMs can perform similarly on multilingual tasks compared to English tasks with well-designed instruction tuning and how we can construct the corpora needed for the tuning. To remedy this gap, we propose the project as an attempt to create a Chinese instruction dataset by various methods adapted to the intrinsic characteristics of 4 sub-tasks. We collect around 200k Chinese instruction tuning samples, which have been manually checked to guarantee high quality. We also summarize the existing English and Chinese instruction corpora and briefly describe some potential applications of the newly constructed Chinese instruction corpora. The resulting \textbf{C}hinese \textbf{O}pen \textbf{I}nstruction \textbf{G}eneralist (\textbf{COIG}) corpora are available in Huggingface\footnote{\url{https://huggingface.co/datasets/BAAI/COIG}} and Github\footnote{\url{https://github.com/BAAI-Zlab/COIG}}, and will be continuously updated.
Published: 2023

36. Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for Longer-Range Object Tracking Applications

Author: Feng, Weiyu, Zhao, Seth Z., Pan, Chuanyu, Chang, Adam, Chen, Yichen, Wang, Zekun, and Yang, Allen Y.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Digital twin is a problem of augmenting real objects with their digital counterparts. It can underpin a wide range of applications in augmented reality (AR), autonomy, and UI/UX. A critical component in a good digital-twin system is real-time, accurate 3D object tracking. Most existing works solve 3D object tracking through the lens of robotic grasping, employ older generations of depth sensors, and measure performance metrics that may not apply to other digital-twin applications such as in AR. In this work, we create a novel RGB-D dataset, called Digital Twin Tracking Dataset (DTTD), to enable further research of the problem and extend potential solutions towards longer ranges and mm localization accuracy. To reduce point cloud noise from the input source, we select the latest Microsoft Azure Kinect as the state-of-the-art time-of-flight (ToF) camera. In total, 103 scenes of 10 common off-the-shelf objects with rich textures are recorded, with each frame annotated with a per-pixel semantic segmentation and ground-truth object poses provided by a commercial motion capturing system. Through extensive experiments with model-level and dataset-level analysis, we demonstrate that DTTD can help researchers develop future object tracking methods and analyze new challenges. The dataset, data generation, annotation, and model evaluation pipeline are made publicly available as open source code at: https://github.com/augcog/DTTDv1.
Published: 2023

37. Synergy Between Urban Planning and Environmental Design: Creating Sustainable and Livable Cities

Author: Feng, Yifan, Wang, Zekun, Appolloni, Andrea, Series Editor, Caracciolo, Francesco, Series Editor, Ding, Zhuoqi, Series Editor, Gogas, Periklis, Series Editor, Huang, Gordon, Series Editor, Nartea, Gilbert, Series Editor, Ngo, Thanh, Series Editor, Striełkowski, Wadim, Series Editor, Magdalena, Radulescu, editor, Majoul, Bootheina, editor, Singh, Satya Narayan, editor, and Rauf, Abdul, editor
Published: 2024
Full Text: View/download PDF

38. A novel spectral clustering algorithm based on neighbor relation and Gaussian kernel function with only one parameter

Author: Zhou, Hao, Wang, Zekun, Chen, Hongjia, and Wang, Xiang
Published: 2024
Full Text: View/download PDF

39. Sampled Measurement-Based Robust Impulsive Filtering for Delayed Lur’e Implicit Hybrid Systems and Its Application in RC Circuit

Author: Wang, Zekun, Zhuang, Guangming, Xia, Jianwei, and Xie, Xiangpeng
Published: 2023
Full Text: View/download PDF

40. Foundations for NLP-assisted formative assessment feedback for short-answer tasks in large-enrollment classes

Author: Lloyd, Susan, Beckman, Matthew, Pearl, Dennis, Passonneau, Rebecca, Li, Zhaohui, and Wang, Zekun
Subjects: Statistics - Other Statistics
Abstract: Research suggests "write-to-learn" tasks improve learning outcomes, yet constructed-response methods of formative assessment become unwieldy with large class sizes. This study evaluates natural language processing algorithms to assist this aim. Six short-answer tasks completed by 1,935 students were scored by several human raters, using a detailed rubric, and an algorithm. Results indicate substantial inter-rater agreement using quadratic weighted kappa for rater pairs (each QWK > 0.74) and group consensus (Fleiss Kappa = 0.68). Additionally, intra-rater agreement was estimated for one rater who had scored 178 responses seven years prior (QWK = 0.89). With compelling rater agreement, the study then pilots cluster analysis of response text toward enabling instructors to ascribe meaning to clusters as a means for scalable formative assessment., Comment: 6 pages
Published: 2022
Full Text: View/download PDF

41. A comprehensive study on the regulation of Compound Zaoren Granules on cAMP/CREB signaling pathway and metabolic disorder in CUMS-PCPA induced insomnia rats

Author: Wang, Zekun, Li, Danting, Chen, Min, Yu, Xiaocong, Chen, Chen, Chen, Yajun, Zhang, Lingfeng, and Shu, Yachun
Published: 2024
Full Text: View/download PDF

42. SMEK1 promotes clear cell renal cell carcinoma progression via EGFR tyrosine-kinase dependent pathway

Author: Wang, Jue, Bi, Wenhao, Lv, Renguang, Wang, Zekun, Xin, Qian, Li, Kailin, Chen, Yuan, Liu, Qiji, and Zhang, Xiang
Published: 2024
Full Text: View/download PDF

43. Habitat-Based Radiomics for Predicting EGFR Mutations in Exon 19 and 21 From Brain Metastasis

Author: Yang, Chunna, Fan, Ying, Zhao, Dan, Wang, Zekun, Wang, Xiaoyu, Wang, Huan, Hu, Yanjun, He, Lingzi, Zhang, Jin, Wang, Yan, Liu, Yan, Sha, Xianzheng, and Su, Juan
Published: 2024
Full Text: View/download PDF

44. High-performance coaxial reversal rotational triboelectric nanogenerator based on charge pumping strategy driving tip high voltage breakdown

Author: Hao, Congcong, Wang, Zekun, Cai, Mingzhe, Liu, Tingshan, Zhai, Cong, Cui, Juan, Zheng, Yongqiu, and Xue, Chenyang
Published: 2024
Full Text: View/download PDF

45. The therapeutic effect of Yinqiaosan decoction against influenza A virus infection by regulating T cell receptor signaling pathway

Author: Li, Danting, Wang, Zekun, Wang, Wenlei, Zheng, Zhihui, Wei, Hailin, Su, Qin, Yang, Mengmeng, Zhao, Yimeng, Zhang, Xinyuan, Yu, Xiaocong, Zhang, Pinghu, and Shu, Yachun
Published: 2024
Full Text: View/download PDF

46. Boron modification promoting electrochemical surface reconstruction of NiFe-LDH for efficient and stable freshwater/seawater oxidation catalysis

Author: Wang, Zekun, Niu, Xueqing, Ye, Lin, Wang, Xiaoyu, Wang, Chao, Wen, Yonghong, Zong, Lingbo, Wang, Lei, Gao, Hongtao, Li, Xingwei, and Zhan, Tianrong
Published: 2024
Full Text: View/download PDF

47. Investigation of the Flow Field in the Pulse Tube Refrigerator with the Multi-Bypass Structure Through the Finite Element Method

Author: Wang, ZeKun, Fang, ChuShu, Zhou, Yuan, and Li, Laifeng
Published: 2023
Full Text: View/download PDF

48. Timer-dependent L-K functional-based H∞ impulsive filtering for delayed implicit hybrid systems

Author: Wang, Zekun, Zhuang, Guangming, Xie, Xiangpeng, and Wang, Yanqian
Published: 2023
Full Text: View/download PDF

49. Distilled Dual-Encoder Model for Vision-Language Understanding

Author: Wang, Zekun, Wang, Wenhui, Zhu, Haichao, Liu, Ming, Qin, Bing, and Wei, Furu
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose a cross-modal attention distillation framework to train a dual-encoder model for vision-language understanding tasks, such as visual reasoning and visual question answering. Dual-encoder models have a faster inference speed than fusion-encoder models and enable the pre-computation of images and text during inference. However, the shallow interaction module used in dual-encoder models is insufficient to handle complex vision-language understanding tasks. In order to learn deep interactions of images and text, we introduce cross-modal attention distillation, which uses the image-to-text and text-to-image attention distributions of a fusion-encoder model to guide the training of our dual-encoder model. In addition, we show that applying the cross-modal attention distillation for both pre-training and fine-tuning stages achieves further improvements. Experimental results demonstrate that the distilled dual-encoder model achieves competitive performance for visual reasoning, visual entailment and visual question answering tasks while enjoying a much faster inference speed than fusion-encoder models. Our code and models will be publicly available at https://github.com/kugwzk/Distilled-DualEncoder., Comment: EMNLP 2022
Published: 2021

50. Early anomaly detection of wind turbine gearbox based on SLFormer neural network

Author: Wang, Zekun, Jiang, Xue, Xu, Zifei, Cai, Chang, Wang, Xiaodong, Xu, Jianzhong, Zhong, Xiaohui, Yang, Wei, and Li, Qing 'an
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

888 results on '"Wang, Zekun"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources