Author: "Feng, Hongwei" / Topic: fos: computer and information sciences - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Feng, Hongwei"' showing total 5 results

Start Over Author "Feng, Hongwei" Topic fos: computer and information sciences

5 results on '"Feng, Hongwei"'

1. Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release

Author: Gu, Zhouhong, Zhu, Xiaoxuan, Ye, Haoning, Zhang, Lin, Xiong, Zhuozhi, Li, Zihan, He, Qianyu, Jiang, Sihang, Feng, Hongwei, and Xiao, Yanghua
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: Domain knowledge refers to the in-depth understanding, expertise, and familiarity with a specific subject, industry, field, or area of special interest. The existing benchmarks are all lack of an overall design for domain knowledge evaluation. Holding the belief that the real ability of domain language understanding can only be fairly evaluated by an comprehensive and in-depth benchmark, we introduces the Domma, a Domain Mastery Benchmark. DomMa targets at testing Large Language Models (LLMs) on their domain knowledge understanding, it features extensive domain coverage, large data volume, and a continually updated data set based on Chinese 112 first-level subject classifications. DomMa consist of 100,000 questions in both Chinese and English sourced from graduate entrance examinations and undergraduate exams in Chinese college. We have also propose designs to make benchmark and evaluation process more suitable to LLMs.
Published: 2023

2. Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining

Author: Gu, Zhouhong, Jiang, Sihang, Huang, Wenhao, Liang, Jiaqing, Feng, Hongwei, and Xiao, Yanghua
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
Abstract: The model's ability to understand synonymous expression is crucial in many kinds of downstream tasks. It will make the model to better understand the similarity between context, and more robust to the synonym substitution attack. However, many Pretrained Language Model (PLM) lack synonym knowledge due to limitation of small-scale synsets and PLM's pretraining objectives. In this paper, we propose a framework called Sem4SAP to mine synsets from Open Knowledge Graph (Open-KG) and using the mined synsets to do synonym-aware pretraining for language models. We propose to coarsly filter the content in Open-KG and use the frequency information to better help the clustering process under low-resource unsupervised conditions. We expand the mined synsets by migrating core semantics between synonymous expressions.We also propose two novel and effective synonym-aware pre-training methods for injecting synonym knowledge into PLMs.Extensive experiments demonstrate that Sem4SAP can dramatically outperform the original PLMs and other baselines on ten different tasks.
Published: 2023

3. GANTEE: Generative Adversatial Network for Taxonomy Entering Evaluation

Author: Gu, Zhouhong, Jiang, Sihang, Liu, Jingping, Xiao, Yanghua, Feng, Hongwei, Li, Zhixu, Liang, Jiaqing, and Zhong, Jian
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
Abstract: Taxonomy is formulated as directed acyclic concepts graphs or trees that support many downstream tasks. Many new coming concepts need to be added to an existing taxonomy. The traditional taxonomy expansion task aims only at finding the best position for new coming concepts in the existing taxonomy. However, they have two drawbacks when being applied to the real-scenarios. The previous methods suffer from low-efficiency since they waste much time when most of the new coming concepts are indeed noisy concepts. They also suffer from low-effectiveness since they collect training samples only from the existing taxonomy, which limits the ability of the model to mine more hypernym-hyponym relationships among real concepts. This paper proposes a pluggable framework called Generative Adversarial Network for Taxonomy Entering Evaluation (GANTEE) to alleviate these drawbacks. A generative adversarial network is designed in this framework by discriminative models to alleviate the first drawback and the generative model to alleviate the second drawback. Two discriminators are used in GANTEE to provide long-term and short-term rewards, respectively. Moreover, to further improve the efficiency, pre-trained language models are used to retrieve the representation of the concepts quickly. The experiments on three real-world large-scale datasets with two different languages show that GANTEE improves the performance of the existing taxonomy expansion methods in both effectiveness and efficiency., Accepted by AAAI 2023
Published: 2023

4. Beyond the Obvious: Evaluating the Reasoning Ability In Real-life Scenarios of Language Models on Life Scapes Reasoning Benchmark~(LSR-Benchmark)

Author: Gu, Zhouhong, Li, Zihan, Zhang, Lin, Xiong, Zhuozhi, Jiang, Sihang, Zhu, Xiaoxuan, Wang, Shusen, Wang, Zili, Wang, Jianchen, Ye, Haoning, Huang, Wenhao, Zhang, Yikai, Feng, Hongwei, and Xiao, Yanghua
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: This paper introduces the Life Scapes Reasoning Benchmark (LSR-Benchmark), a novel dataset targeting real-life scenario reasoning, aiming to close the gap in artificial neural networks' ability to reason in everyday contexts. In contrast to domain knowledge reasoning datasets, LSR-Benchmark comprises free-text formatted questions with rich information on real-life scenarios, human behaviors, and character roles. The dataset consists of 2,162 questions collected from open-source online sources and is manually annotated to improve its quality. Experiments are conducted using state-of-the-art language models, such as gpt3.5-turbo and instruction fine-tuned llama models, to test the performance in LSR-Benchmark. The results reveal that humans outperform these models significantly, indicating a persisting challenge for machine learning models in comprehending daily human life.
Published: 2023
Full Text: View/download PDF

5. Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

Author: Gu, Zhouhong, Zhu, Xiaoxuan, Ye, Haoning, Zhang, Lin, Wang, Jianchen, Jiang, Sihang, Xiong, Zhuozhi, Li, Zihan, He, Qianyu, Xu, Rui, Huang, Wenhao, Wang, Zili, Wang, Shusen, Zheng, Weiguo, Feng, Hongwei, and Xiao, Yanghua
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: New Natural Langauge Process~(NLP) benchmarks are urgently needed to align with the rapid development of large language models (LLMs). We present Xiezhi, the most comprehensive evaluation suite designed to assess holistic domain knowledge. Xiezhi comprises multiple-choice questions across 516 diverse disciplines ranging from 13 different subjects with 249,587 questions and accompanied by Xiezhi-Specialty and Xiezhi-Interdiscipline, both with 15k questions. We conduct evaluation of the 47 cutting-edge LLMs on Xiezhi. Results indicate that LLMs exceed average performance of humans in science, engineering, agronomy, medicine, and art, but fall short in economics, jurisprudence, pedagogy, literature, history, and management. We anticipate Xiezhi will help analyze important strengths and shortcomings of LLMs, and the benchmark is released in~\url{https://github.com/MikeGu721/XiezhiBenchmark}., Comment: Under review of NeurIPS 2023
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

5 results on '"Feng, Hongwei"'

1. Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release

2. Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining

3. GANTEE: Generative Adversatial Network for Taxonomy Entering Evaluation

4. Beyond the Obvious: Evaluating the Reasoning Ability In Real-life Scenarios of Language Models on Life Scapes Reasoning Benchmark~(LSR-Benchmark)

5. Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Database

5 results on '"Feng, Hongwei"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources