Author: "Koreeda, Yuta" - Searchworks@Jio Institute Digital Library Search Results

Search keyword "Koreeda, Yuta": 30 results

1. Acquiring Bidirectionality via Large and Small Language Models

Author: Goto, Takumi, Nagao, Hiroyoshi, and Koreeda, Yuta
Subjects: Computer Science - Computation and Language
Abstract: Using token representation from bidirectional language models (LMs) such as BERT is still a widely used approach for token-classification tasks. Even though there exist much larger unidirectional LMs such as Llama-2, they are rarely used to replace the token representation of bidirectional LMs. In this work, we hypothesize that their lack of bidirectionality is keeping them behind. To that end, we propose to newly train a small backward LM and concatenate its representations to those of an existing LM for downstream tasks. Through experiments in named entity recognition, we demonstrate that introducing a backward model improves benchmark performance by more than 10 points. Furthermore, we show that the proposed method is especially effective for rare domains and in few-shot learning settings.
Published: 2024

2. LARCH: Large Language Model-based Automatic Readme Creation with Heuristics

Author: Koreeda, Yuta, Morishita, Terufumi, Imaichi, Osamu, and Sogawa, Yasuhiro
Subjects: Computer Science - Computation and Language, Computer Science - Software Engineering
Abstract: Writing a readme is a crucial aspect of software development as it plays a vital role in managing and reusing program code. Though it is a pain point for many developers, automatically creating one remains a challenge even with the recent advancements in large language models (LLMs), because it requires generating an abstract description from thousands of lines of code. In this demo paper, we show that LLMs are capable of generating coherent and factually correct readmes if we can identify a code fragment that is representative of the repository. Building upon this finding, we developed LARCH (LLM-based Automatic Readme Creation with Heuristics) which leverages representative code identification with heuristics and weak supervision. Through human and automated evaluations, we illustrate that LARCH can generate coherent and factually correct readmes in the majority of cases, outperforming a baseline that does not rely on representative code identification. We have made LARCH open-source and provided a cross-platform Visual Studio Code interface and command-line interface, accessible at https://github.com/hitachi-nlp/larch. A demo video showcasing LARCH's capabilities is available at https://youtu.be/ZUKkh5ED-O4.
Comment: This is a pre-print of a paper accepted at CIKM'23 Demo. Refer to the DOI URL for the original publication
Published: 2023
Full Text: View/download PDF

3. Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News

Author: Koreeda, Yuta, Yokote, Ken-ichi, Ozaki, Hiroaki, Yamaguchi, Atsuki, Tsunokake, Masaya, and Sogawa, Yasuhiro
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: This paper explains the participation of team Hitachi in SemEval-2023 Task 3 "Detecting the genre, the framing, and the persuasion techniques in online news in a multi-lingual setup." Based on the multilingual, multi-task nature of the task and the low-resource setting, we investigated different cross-lingual and multi-task strategies for training the pretrained language models. Through extensive experiments, we found that (a) cross-lingual/multi-task training, and (b) collecting an external balanced dataset, can benefit genre and framing detection. We constructed ensemble models from the results and achieved the highest macro-averaged F1 scores in Italian and Russian genre categorization subtasks.
Comment: Accepted at SemEval-2023 Task 3
Published: 2023

4. Holistic Evaluation of Language Models

Author: Liang, Percy, Bommasani, Rishi, Lee, Tony, Tsipras, Dimitris, Soylu, Dilara, Yasunaga, Michihiro, Zhang, Yian, Narayanan, Deepak, Wu, Yuhuai, Kumar, Ananya, Newman, Benjamin, Yuan, Binhang, Yan, Bobby, Zhang, Ce, Cosgrove, Christian, Manning, Christopher D., Ré, Christopher, Acosta-Navas, Diana, Hudson, Drew A., Zelikman, Eric, Durmus, Esin, Ladhak, Faisal, Rong, Frieda, Ren, Hongyu, Yao, Huaxiu, Wang, Jue, Santhanam, Keshav, Orr, Laurel, Zheng, Lucia, Yuksekgonul, Mert, Suzgun, Mirac, Kim, Nathan, Guha, Neel, Chatterji, Niladri, Khattab, Omar, Henderson, Peter, Huang, Qian, Chi, Ryan, Xie, Sang Michael, Santurkar, Shibani, Ganguli, Surya, Hashimoto, Tatsunori, Icard, Thomas, Zhang, Tianyi, Chaudhary, Vishrav, Wang, William, Li, Xuechen, Mai, Yifan, Zhang, Yuhui, and Koreeda, Yuta
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest for LMs. Then we select a broad subset based on coverage and feasibility, noting what's missing or underrepresented (e.g. question answering for neglected English dialects, metrics for trustworthiness). Second, we adopt a multi-metric approach: We measure 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency) for each of 16 core scenarios when possible (87.5% of the time). This ensures metrics beyond accuracy don't fall to the wayside, and that trade-offs are clearly exposed. We also perform 7 targeted evaluations, based on 26 targeted scenarios, to analyze specific aspects (e.g. reasoning, disinformation). Third, we conduct a large-scale evaluation of 30 prominent language models (spanning open, limited-access, and closed models) on all 42 scenarios, 21 of which were not previously used in mainstream LM evaluation. Prior to HELM, models on average were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: now all 30 models have been densely benchmarked on the same core scenarios and metrics under standardized conditions. Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly for further analysis, as well as a general modular toolkit. 
We intend for HELM to be a living benchmark for the community, continuously updated with new scenarios, metrics, and models.
Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/helm/v1.0
Published: 2022

5. ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts

Author: Koreeda, Yuta and Manning, Christopher D.
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Reviewing contracts is a time-consuming procedure that incurs large expenses to companies and social inequality to those who cannot afford it. In this work, we propose "document-level natural language inference (NLI) for contracts", a novel, real-world application of NLI that addresses such problems. In this task, a system is given a set of hypotheses (such as "Some obligations of Agreement may survive termination.") and a contract, and it is asked to classify whether each hypothesis is "entailed by", "contradicting to" or "not mentioned by" (neutral to) the contract as well as identifying "evidence" for the decision as spans in the contract. We annotated and release the largest corpus to date consisting of 607 annotated contracts. We then show that existing models fail badly on our task and introduce a strong baseline, which (1) models evidence identification as multi-label classification over spans instead of trying to predict start and end tokens, and (2) employs more sophisticated context segmentation for dealing with long documents. We also show that linguistic characteristics of contracts, such as negations by exceptions, are contributing to the difficulty of this task and that there is much room for improvement.
Comment: Accepted at the Findings of the Association for Computational Linguistics: EMNLP 2021
Published: 2021

6. Capturing Logical Structure of Visually Structured Documents with Multimodal Transition Parser

Author: Koreeda, Yuta and Manning, Christopher D.
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval
Abstract: While many NLP pipelines assume raw, clean texts, many texts we encounter in the wild, including a vast majority of legal documents, are not so clean, with many of them being visually structured documents (VSDs) such as PDFs. Conventional preprocessing tools for VSDs mainly focused on word segmentation and coarse layout analysis, whereas fine-grained logical structure analysis (such as identifying paragraph boundaries and their hierarchies) of VSDs is underexplored. To that end, we proposed to formulate the task as prediction of "transition labels" between text fragments that maps the fragments to a tree, and developed a feature-based machine learning system that fuses visual, textual and semantic cues. Our system is easily customizable to different types of VSDs and it significantly outperformed baselines in identifying different structures in VSDs. For example, our system obtained a paragraph boundary detection F1 score of 0.953 which is significantly better than a popular PDF-to-text tool with an F1 score of 0.739.
Comment: 11 pages, 5 figures
Published: 2021

7. Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels using Statistical Sampling and Post-Processing

Author: Ravikiran, Manikandan, Muljibhai, Amin Ekant, Miyoshi, Toshinori, Ozaki, Hiroaki, Koreeda, Yuta, and Masayuki, Sakata
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: In this paper, we present our participation in SemEval-2020 Task-12 Subtask-A (English Language) which focuses on offensive language identification from noisy labels. To this end, we developed a hybrid system with the BERT classifier trained with tweets selected using Statistical Sampling Algorithm (SA) and Post-Processed (PP) using an offensive wordlist. Our developed system achieved 34th position with Macro-averaged F1-score (Macro-F1) of 0.90913 over both offensive and non-offensive classes. We further show comprehensive results and error analysis to assist future research in offensive language identification with noisy labels.
Comment: preprint v1, Under submission for SemEval 2020 Workshop
Published: 2020

8. Hitachi at MRP 2019: Unified Encoder-to-Biaffine Network for Cross-Framework Meaning Representation Parsing

Author: Koreeda, Yuta, Morio, Gaku, Morishita, Terufumi, Ozaki, Hiroaki, and Yanai, Kohsuke
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: This paper describes the proposed system of the Hitachi team for the Cross-Framework Meaning Representation Parsing (MRP 2019) shared task. In this shared task, the participating systems were asked to predict nodes, edges and their attributes for five frameworks, each with a different order of "abstraction" from input tokens. We proposed a unified encoder-to-biaffine network for all five frameworks, which effectively incorporates a shared encoder to extract rich input features, decoder networks to generate anchorless nodes in UCCA and AMR, and biaffine networks to predict edges. Our system was ranked fifth with the macro-averaged MRP F1 score of 0.7604, and outperformed the baseline unified transition-based MRP. Furthermore, post-evaluation experiments showed that we can boost the performance of the proposed system by incorporating multi-task learning, whereas the baseline could not. These results imply the efficacy of incorporating the biaffine network into the shared architecture for MRP and that learning heterogeneous meaning representations at once can boost the system performance.
Comment: 13 pages, 3 figures
Published: 2019
Full Text: View/download PDF

9. Split First and Then Rephrase: Hierarchical Generation for Sentence Simplification

Author: Wang, Mengru, Ozaki, Hiroaki, Koreeda, Yuta, and Yanai, Kohsuke
Editorial Board Member: Filipe, Joaquim, Ghosh, Ashish, Kotenko, Igor, Prates, Raquel Oliveira, and Zhou, Lizhu
Editor: Nguyen, Le-Minh, Phan, Xuan-Hieu, Hasida, Kôiti, and Tojo, Satoshi
Published: 2020
Full Text: View/download PDF

10. Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News

Author: Koreeda, Yuta, primary, Yokote, Ken-ichi, additional, Ozaki, Hiroaki, additional, Yamaguchi, Atsuki, additional, Tsunokake, Masaya, additional, and Sogawa, Yasuhiro, additional
Published: 2023
Full Text: View/download PDF

11. Hitachi at SemEval-2023 Task 4: Exploring Various Task Formulations Reveals the Importance of Description Texts on Human Values

Author: Tsunokake, Masaya, primary, Yamaguchi, Atsuki, additional, Koreeda, Yuta, additional, Ozaki, Hiroaki, additional, and Sogawa, Yasuhiro, additional
Published: 2023
Full Text: View/download PDF

12. Virtually transparent surgical instruments in endoscopic surgery with augmentation of obscured regions

Author: Koreeda, Yuta, Kobayashi, Yo, Ieiri, Satoshi, Nishio, Yuya, Kawamura, Kazuya, Obata, Satoshi, Souzaki, Ryota, Hashizume, Makoto, and Fujie, Masakatsu G.
Published: 2016
Full Text: View/download PDF

13. The effect of forceps manipulation for expert pediatric surgeons using an endoscopic pseudo-viewpoint alternating system: the phenomenon of economical slow and fast performance in endoscopic surgery

Author: Ieiri, Satoshi, Jimbo, Takahiro, Koreeda, Yuta, Obata, Satoshi, Uemura, Munenori, Souzaki, Ryota, Kobayashi, Yo, Fujie, Masakatsu G., Hashizume, Makoto, and Taguchi, Tomoaki
Published: 2015
Full Text: View/download PDF

14. Virtual Shadow Drawing System Using Augmented Reality for Laparoscopic Surgery

Author: Miura, Satoshi, primary, Seki, Masaki, additional, Koreeda, Yuta, additional, Cao, Yang, additional, Kawamura, Kazuya, additional, Kobayashi, Yo, additional, Fujie, Masakatsu G., additional, and Miyashita, Tomoyuki, additional
Published: 2022
Full Text: View/download PDF

15. i-Parser: Interactive Parser Development Kit for Natural Language Processing

Author: Morio, Gaku, primary, Ozaki, Hiroaki, additional, Koreeda, Yuta, additional, Morishita, Terufumi, additional, and Miyoshi, Toshinori, additional
Published: 2021
Full Text: View/download PDF

16. ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts

Author: Koreeda, Yuta, primary and Manning, Christopher, additional
Published: 2021
Full Text: View/download PDF

17. Capturing Logical Structure of Visually Structured Documents with Multimodal Transition Parser

Author: Koreeda, Yuta, primary and Manning, Christopher, additional
Published: 2021
Full Text: View/download PDF

18. Evaluation of Virtual Shadow’s Direction in Laparoscopic Surgery

Author: Miura, Satoshi, primary, Seki, Masaki, additional, Koreeda, Yuta, additional, Cao, Yang, additional, Kawamura, Kazuya, additional, Kobayashi, Yo, additional, Fujie, Masakatsu G., additional, and Miyashita, Tomoyuki, additional
Published: 2020
Full Text: View/download PDF

19. Hitachi at MRP 2020: Text-to-Graph-Notation Transducer

Author: Ozaki, Hiroaki, primary, Morio, Gaku, additional, Koreeda, Yuta, additional, Morishita, Terufumi, additional, and Miyoshi, Toshinori, additional
Published: 2020
Full Text: View/download PDF

20. Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels Using Statistical Sampling and Post-Processing

Author: Ravikiran, Manikandan, primary, Muljibhai, Amin Ekant, additional, Miyoshi, Toshinori, additional, Ozaki, Hiroaki, additional, Koreeda, Yuta, additional, and Masayuki, Sakata, additional
Published: 2020
Full Text: View/download PDF

21. Towards Better Non-Tree Argument Mining: Proposition-Level Biaffine Parsing with Task-Specific Parameterization

Author: Morio, Gaku, primary, Ozaki, Hiroaki, additional, Morishita, Terufumi, additional, Koreeda, Yuta, additional, and Yanai, Kohsuke, additional
Published: 2020
Full Text: View/download PDF

22. A Joint Neural Model for Patent Classification and Rationale Identification

Author: Koreeda, Yuta, primary, Mase, Hisao, additional, and Yanai, Kohsuke, additional
Published: 2019
Full Text: View/download PDF

23. Hitachi at MRP 2019: Unified Encoder-to-Biaffine Network for Cross-Framework Meaning Representation Parsing

Author: Koreeda, Yuta, primary, Morio, Gaku, additional, Morishita, Terufumi, additional, Ozaki, Hiroaki, additional, and Yanai, Kohsuke, additional
Published: 2019
Full Text: View/download PDF

24. StruAP: A Tool for Bundling Linguistic Trees through Structure-based Abstract Pattern

Author: Yanai, Kohsuke, primary, Sato, Misa, additional, Yanase, Toshihiko, additional, Kurotsuchi, Kenzo, additional, Koreeda, Yuta, additional, and Niwa, Yoshiki, additional
Published: 2017
Full Text: View/download PDF

25. bunji at SemEval-2017 Task 3: Combination of Neural Similarity Features and Comment Plausibility Features

Author: Koreeda, Yuta, primary, Hashito, Takuya, additional, Niwa, Yoshiki, additional, Sato, Misa, additional, Yanase, Toshihiko, additional, Kurotsuchi, Kenzo, additional, and Yanai, Kohsuke, additional
Published: 2017
Full Text: View/download PDF

26. Textual Supportiveness Recognition Based on Combinations of Syntax Features for Automated Argument Generation

Author: Sato, Misa, primary, Yanai, Kohsuke, additional, Yanase, Toshihiko, additional, Miyoshi, Toshinori, additional, Koreeda, Yuta, additional, and Niwa, Yoshiki, additional
Published: 2016
Full Text: View/download PDF

27. Neural Attention Model for Classification of Sentences that Support Promoting/Suppressing Relationship

Author: Koreeda, Yuta, primary, Yanase, Toshihiko, additional, Yanai, Kohsuke, additional, Sato, Misa, additional, and Niwa, Yoshiki, additional
Published: 2016
Full Text: View/download PDF

28. Estimation technique of the destined target with gesture recognition for the development of in-car interface of automated driving vehicle

Author: Nakayama, Masayuki, primary, Miura, Satoshi, additional, Kawano, Shinya, additional, Koreeda, Yuta, additional, Yamamoto, Akihiro, additional, Fukumoto, Ryota, additional, Sakuma, Tsuyoshi, additional, Nakashima, Yasutaka, additional, Kobayashi, Yo, additional, and Fujie, Masakatsu G., additional
Published: 2015
Full Text: View/download PDF

29. Characteristics of PHEMTs and MSM photodetectors simultaneously fabricated on same epitaxial wafer with In0.75Ga0.25As/InGaAs channel layer

Author: Koreeda, Yuta, primary, Endo, Yutaka, additional, Sato, Kouichi, additional, Yoshizawa, Kenya, additional, Nishio, Yui, additional, Taguchi, Hirohisa, additional, Iida, Tsutomu, additional, and Takanashi, Yoshifumi, additional
Published: 2011
Full Text: View/download PDF

30. Characteristics of PHEMTs and MSM photodetectors simultaneously fabricated on same epitaxial wafer with In0.75Ga0.25As/InGaAs channel layer.

Author: Koreeda, Yuta, Endo, Yutaka, Sato, Kouichi, Yoshizawa, Kenya, Nishio, Yui, Taguchi, Hirohisa, Iida, Tsutomu, and Takanashi, Yoshifumi
Published: 2012
Full Text: View/download PDF
