1. On The Adaptation of Unlimiformer for Decoder-Only Transformers
- Author
Ahrabian, Kian; Benhaim, Alon; Patra, Barun; Pujara, Jay; Singhal, Saksham; Song, Xia
- Subjects
Computer Science - Computation and Language; Computer Science - Machine Learning
- Abstract
One of the prominent issues stifling the current generation of large language models is their limited context length. Recent proprietary models such as GPT-4 and Claude 2 have introduced longer context lengths, 8k/32k and 100k, respectively; however, despite the efforts in the community, most common models, such as Llama-2, have a context length of 4k or less. Unlimiformer (Bertsch et al., 2023) is a recently popular vector-retrieval augmentation method that offloads cross-attention computations to a kNN index. However, its main limitation is incompatibility with decoder-only transformers out of the box. In this work, we explore practical considerations of adapting Unlimiformer to decoder-only transformers and introduce a series of modifications to overcome this limitation. Moreover, we expand the original experimental setup on summarization to include a new task (i.e., free-form Q&A) and an instruction-tuned model (i.e., a custom 6.7B GPT model). Our results showcase the effectiveness of these modifications on summarization, performing on par with a model with 2x the context length. Finally, we discuss limitations and future directions for free-form Q&A and instruction-tuned models.
- Comment
8 pages, 6 figures
- Published
2024
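The abstract's core mechanism is offloading attention over a long context to a kNN index: instead of attending over every past token, each query retrieves only its top-k nearest keys. The NumPy sketch below is a rough illustration of that retrieval-augmented attention idea, not the authors' implementation; the flat inner-product "index", function names, and dimensions are all illustrative assumptions.

```python
# Minimal sketch of kNN-offloaded attention (illustrative, not the paper's code):
# key/value vectors for a context longer than the model's window are held in a
# flat index; each query attends only over its top-k retrieved keys.
import numpy as np

def knn_attention(query, keys, values, k=16):
    """query: (d,); keys, values: (n, d) for a long context.
    Retrieval over k keys stands in for full attention over all n keys."""
    scores = keys @ query                       # inner-product search (flat "index")
    topk = np.argpartition(-scores, k)[:k]      # indices of the k highest-scoring keys
    s = scores[topk] / np.sqrt(query.shape[0])  # scaled scores over the retrieved set
    w = np.exp(s - s.max())
    w /= w.sum()                                # softmax over the k retrieved keys only
    return w @ values[topk]                     # attention output, shape (d,)

# Toy usage: 10k "context" tokens with 64-dim attention heads.
rng = np.random.default_rng(0)
K = rng.standard_normal((10_000, 64)).astype(np.float32)
V = rng.standard_normal((10_000, 64)).astype(np.float32)
q = rng.standard_normal(64).astype(np.float32)
print(knn_attention(q, K, V, k=16).shape)  # (64,)
```

In practice the flat search would be replaced by an approximate nearest-neighbor index, and adapting this to a decoder-only model (the paper's focus) raises the positional-encoding and causal-masking questions the authors address.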