492 results on '"CHEUNG, ALVIN"'
Search Results
2. Flo: a Semantic Foundation for Progressive Stream Processing
- Author
-
Laddad, Shadaj, Cheung, Alvin, Hellerstein, Joseph M., and Milano, Mae
- Subjects
Computer Science - Programming Languages; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Streaming systems are present throughout modern applications, processing continuous data in real-time. Existing streaming languages have a variety of semantic models and guarantees that are often incompatible. Yet all these languages are considered "streaming" -- what do they have in common? In this paper, we identify two general yet precise semantic properties: streaming progress and eager execution. Together, they ensure that streaming outputs are deterministic and kept fresh with respect to streaming inputs. We formally define these properties in the context of Flo, a parameterized streaming language that abstracts over dataflow operators and the underlying structure of streams. It leverages a lightweight type system to distinguish bounded streams, which allow operators to block on termination, from unbounded ones. Furthermore, Flo provides constructs for dataflow composition and nested graphs with cycles. To demonstrate the generality of our properties, we show how key ideas from representative streaming and incremental computation systems -- Flink, LVars, and DBSP -- have semantics that can be modeled in Flo and guarantees that map to our properties.
- Published
- 2024
3. Legal Gaslighting
- Author
-
Cheung, Alvin
- Published
- 2021
4. LLM-Aided Compilation for Tensor Accelerators
- Author
-
Hong, Charles, Bhatia, Sahil, Haan, Altan, Dong, Shengjun Kris, Nikiforov, Dima, Cheung, Alvin, and Shao, Yakun Sophia
- Subjects
Computer Science - Hardware Architecture; Computer Science - Machine Learning; Computer Science - Programming Languages
- Abstract
Hardware accelerators, in particular accelerators for tensor processing, have many potential application domains. However, they currently lack the software infrastructure to support the majority of domains outside of deep learning. Furthermore, a compiler that can easily be updated to reflect changes at both application and hardware levels would enable more agile development and design space exploration of accelerators, allowing hardware designers to realize closer-to-optimal performance. In this work, we discuss how large language models (LLMs) could be leveraged to build such a compiler. Specifically, we demonstrate the ability of GPT-4 to achieve high pass rates in translating code to the Gemmini accelerator, and prototype a technique for decomposing translation into smaller, more LLM-friendly steps. Additionally, we propose a 2-phase workflow for utilizing LLMs to generate hardware-optimized code. Comment: 4 page workshop paper
- Published
- 2024
5. Suki: Choreographed Distributed Dataflow in Rust
- Author
-
Laddad, Shadaj, Cheung, Alvin, and Hellerstein, Joseph M.
- Subjects
Computer Science - Programming Languages; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Programming models for distributed dataflow have long focused on analytical workloads that allow the runtime to dynamically place and schedule compute logic. Meanwhile, models that enable fine-grained control over placement, such as actors, make global optimization difficult. In this extended abstract, we present Suki, an embedded Rust DSL that lets developers implement streaming dataflow with explicit placement of computation. Key to this choreographic programming approach is our use of staged programming, which lets us expose a high-level Rust API while compiling local compute units into individual binaries with zero-overhead. We also explore how this approach, combined with Rust's trait system, enables a type-safe API for mapping dataflow programs to cloud computing resources.
- Published
- 2024
6. Optimizing Speculative Decoding for Serving Large Language Models Using Goodput
- Author
-
Liu, Xiaoxuan, Daniel, Cade, Hu, Langxiang, Kwon, Woosuk, Li, Zhuohan, Mo, Xiangxi, Cheung, Alvin, Deng, Zhijie, Stoica, Ion, and Zhang, Hao
- Subjects
Computer Science - Artificial Intelligence; Computer Science - Performance
- Abstract
Reducing the inference latency of large language models (LLMs) is crucial, and speculative decoding (SD) stands out as one of the most effective techniques. Rather than letting the LLM generate all tokens directly, speculative decoding employs effective proxies to predict potential outputs, which are then verified by the LLM without compromising the generation quality. Yet, deploying SD in real online LLM serving systems (with continuous batching) does not always yield improvement -- under higher request rates or low speculation accuracy, it paradoxically increases latency. Furthermore, no single speculation length works best for all workloads under different system loads. Based on these observations, we develop a dynamic framework, SmartSpec. SmartSpec dynamically determines the best speculation length for each request (from 0, i.e., no speculation, to many tokens) -- hence the associated speculative execution costs -- based on a new metric called goodput, which characterizes the current observed load of the entire system and the speculation accuracy. We show that SmartSpec consistently reduces average request latency by up to 3.2x compared to non-speculative decoding baselines across different sizes of target models, draft models, request rates, and datasets. Moreover, SmartSpec can be applied to different styles of speculative decoding, including traditional, model-based approaches as well as model-free methods like prompt lookup and tree-style decoding.
- Published
- 2024
7. Verified Code Transpilation with LLMs
- Author
-
Bhatia, Sahil, Qiu, Jie, Hasabnis, Niranjan, Seshia, Sanjit A., and Cheung, Alvin
- Subjects
Computer Science - Programming Languages
- Abstract
Domain-specific languages (DSLs) are integral to various software workflows. Such languages offer domain-specific optimizations and abstractions that improve code readability and maintainability. However, leveraging these languages requires developers to rewrite existing code using the specific DSL's API. While large language models (LLMs) have shown some success in automatic code transpilation, none of them provide any functional correctness guarantees on the transpiled code. Another approach for automating this task is verified lifting, which relies on program synthesis to find programs in the target language that are functionally equivalent to the source language program. While several verified lifting tools have been developed for various application domains, they are specialized for specific source-target languages or require significant expertise in domain knowledge to make the search efficient. In this paper, leveraging recent advances in LLMs, we propose an LLM-based approach (LLMLift) to building verified lifting tools. We use the LLM's capabilities to reason about programs to translate a given program into its corresponding equivalent in the target language. Additionally, we use LLMs to generate proofs for functional equivalence. We develop lifting-based compilers for four different DSLs targeting different application domains. Our approach not only outperforms previous symbolic-based tools in both the number of benchmarks transpiled and transpilation time, but also requires significantly less effort to build.
- Published
- 2024
8. Tenspiler: A Verified Lifting-Based Compiler for Tensor Operations
- Author
-
Qiu, Jie, Cai, Colin, Bhatia, Sahil, Hasabnis, Niranjan, Seshia, Sanjit A., and Cheung, Alvin
- Subjects
Computer Science - Programming Languages
- Abstract
Tensor processing infrastructures such as deep learning frameworks and specialized hardware accelerators have revolutionized how computationally intensive code from domains such as deep learning and image processing is executed and optimized. These infrastructures provide powerful and expressive abstractions while ensuring high performance. However, to utilize them, code must be written specifically using the APIs / ISAs of such software frameworks or hardware accelerators. Importantly, given the fast pace of innovation in these domains, code written today quickly becomes legacy as new frameworks and accelerators are developed, and migrating such legacy code manually is a considerable effort. To enable developers to leverage such DSLs while preserving their current programming paradigm, we introduce Tenspiler, a verified lifting-based compiler that uses program synthesis to translate sequential programs written in general-purpose programming languages (e.g., C++ or Python code) into tensor operations. Central to Tenspiler is our carefully crafted yet simple intermediate language, named TensIR, that expresses tensor operations. TensIR enables efficient lifting, verification, and code generation. Currently, Tenspiler already supports six DSLs, spanning a broad spectrum of software and hardware environments. Furthermore, we show that new backends can be easily supported by Tenspiler by adding simple pattern-matching rules for TensIR. Using 10 real-world code benchmark suites, our experimental evaluation shows that by translating code to be executed on 6 different software frameworks and hardware devices, Tenspiler offers on average 105x kernel and 9.65x end-to-end execution time improvement over the fully-optimized sequential implementation of the same benchmarks.
- Published
- 2024
9. Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
- Author
-
Griggs, Tyler, Liu, Xiaoxuan, Yu, Jiaxiang, Kim, Doyoung, Chiang, Wei-Lin, Cheung, Alvin, and Stoica, Ion
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Machine Learning
- Abstract
Large language models (LLMs) are increasingly integrated into many online services, yet they remain cost-prohibitive to deploy due to the requirement of expensive GPU instances. Prior work has addressed the high cost of LLM serving by improving the inference engine, but less attention has been given to selecting the most cost-efficient GPU type(s) for a specific LLM service. There is a large and growing landscape of GPU types and, within these options, higher cost does not always lead to increased performance. Instead, through a comprehensive investigation, we find that three key LLM service characteristics (request size, request rate, SLO) strongly influence GPU cost efficiency, and differing GPU types are most cost efficient for differing LLM service settings. As a result, the most cost-efficient allocation for a given service is typically a mix of heterogeneous GPU types. Based on this analysis, we introduce Mélange, a GPU allocation framework that navigates these diverse LLM service characteristics and heterogeneous GPU option space to automatically and efficiently derive the minimal-cost GPU allocation for a given LLM service. We formulate the GPU allocation task as a cost-aware bin packing problem where GPUs are bins and items are slices of the service workload. Our formulation's constraints account for a service's unique characteristics, allowing Mélange to be flexible to support diverse service settings and heterogeneity-aware to adapt the GPU allocation to a specific service. Compared to using only a single GPU type, Mélange reduces deployment costs by up to 77% in conversational settings, 33% in document-based settings, and 51% in a mixed setting.
- Published
- 2024
10. Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
- Author
-
Gong, Linyuan, Wang, Sida, Elhoushi, Mostafa, and Cheung, Alvin
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Computer Science - Software Engineering
- Abstract
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task. This benchmark focuses on syntax-aware completions of program structures such as code blocks and conditional expressions, and includes 17,720 examples from multiple programming languages, sourced from recent code submissions after April 2022 to minimize data contamination. SAFIM provides a robust framework with various prompt designs and novel syntax-aware post-processing techniques, facilitating accurate and fair comparisons across LLMs. Our comprehensive evaluation of 15 LLMs shows that FIM pretraining not only enhances FIM proficiency but also improves Left-to-Right (L2R) inference using LLMs. Our findings challenge conventional beliefs and suggest that pretraining methods and data quality have more impact than model size. SAFIM thus serves as a foundational platform for future research in effective pretraining strategies for code LLMs. The evaluation toolkit and dataset are available at https://github.com/gonglinyuan/safim, and the leaderboard is available at https://safimbenchmark.com. Comment: 22 pages; ICML 2024 Oral: https://icml.cc/virtual/2024/oral/35482
- Published
- 2024
11. AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
- Author
-
Gong, Linyuan, Elhoushi, Mostafa, and Cheung, Alvin
- Subjects
Computer Science - Software Engineering; Computer Science - Computation and Language; Computer Science - Machine Learning
- Abstract
Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature. We introduce AST-T5, a novel pretraining paradigm that leverages the Abstract Syntax Tree (AST) for enhanced code generation, transpilation, and understanding. Using dynamic programming, our AST-Aware Segmentation retains code structure, while our AST-Aware Span Corruption objective equips the model to reconstruct various code structures. Unlike other models, AST-T5 avoids intricate program analyses or architectural changes, so it integrates seamlessly with any encoder-decoder Transformer. Evaluations show that AST-T5 consistently outperforms similar-sized LMs across various code-related tasks. Structure-awareness makes AST-T5 particularly powerful in code-to-code tasks, surpassing CodeT5 by 2 points in exact match score for the Bugs2Fix task and by 3 points in exact match score for Java-C# Transpilation in CodeXGLUE. Our code and model are publicly available at https://github.com/gonglinyuan/ast_t5. Comment: 15 pages; ICML 2024: https://icml.cc/virtual/2024/poster/33601
- Published
- 2024
12. Peptostreptococcus anaerobius mediates anti-PD1 therapy resistance and exacerbates colorectal cancer via myeloid-derived suppressor cells in mice
- Author
-
Liu, Yali, Wong, Chi Chun, Ding, Yanqiang, Gao, Mengxue, Wen, Jun, Lau, Harry Cheuk-Hay, Cheung, Alvin Ho-Kwan, Huang, Dan, Huang, He, and Yu, Jun
- Published
- 2024
13. Online Speculative Decoding
- Author
-
Liu, Xiaoxuan, Hu, Lanxiang, Bailis, Peter, Cheung, Alvin, Deng, Zhijie, Stoica, Ion, and Zhang, Hao
- Subjects
Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Machine Learning
- Abstract
Speculative decoding is a pivotal technique to accelerate the inference of large language models (LLMs) by employing a smaller draft model to predict the target model's outputs. However, its efficacy can be limited due to the low predictive accuracy of the draft model, particularly when faced with diverse text inputs and a significant capability gap between the draft and target models. We introduce online speculative decoding to address this challenge. The main idea is to continuously update the (multiple) draft model(s) on observed user query data. Adapting to query distribution mitigates the shifts between the training distribution of the draft model and the query distribution, enabling the draft model to more accurately predict the target model's outputs. We develop a prototype of online speculative decoding based on knowledge distillation and evaluate it using both synthetic and real query data. The results show a substantial increase in the token acceptance rate by 0.1 to 0.65, bringing 1.42x to 2.17x latency reduction. Our code is available at https://github.com/LiuXiaoxuanPKU/OSD.
- Published
- 2023
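Illustrative note on entry 13: the draft-then-verify loop that the abstract describes can be sketched as below. This is a minimal toy under stated assumptions, not the authors' implementation: it covers only the greedy (deterministic) case, the names `speculative_step`, `toy_target`, and `toy_draft` are hypothetical, and the "models" are plain functions over integer tokens. A real serving system would verify all proposed tokens in one batched forward pass of the target model rather than one call per token.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token prefix to its next token.
NextToken = Callable[[List[int]], int]


def speculative_step(
    prefix: List[int], draft: NextToken, target: NextToken, k: int
) -> Tuple[List[int], int]:
    """One greedy draft-then-verify step.

    Returns (tokens to append, number of draft tokens accepted).
    """
    # 1. The draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target model checks each proposal in order.
    out: List[int] = []
    ctx = list(prefix)
    accepted = 0
    for token in proposed:
        expected = target(ctx)
        if token != expected:
            # First mismatch: keep the target's token and stop.
            out.append(expected)
            return out, accepted
        out.append(token)
        ctx.append(token)
        accepted += 1

    # 3. Every proposal matched, so the target contributes one extra token.
    out.append(target(ctx))
    return out, accepted


def toy_target(ctx: List[int]) -> int:
    """Hypothetical target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Hypothetical draft model: agrees with the target except after token 3."""
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


tokens, accepted = speculative_step([1], toy_draft, toy_target, k=4)
acceptance_rate = accepted / 4  # accepted draft tokens / proposed draft tokens
```

In this toy run the output always equals what the target alone would have generated greedily; the draft model only changes how many target calls are needed. A draft model that agrees with the target more often accepts more tokens per step, which is the quantity the abstract's token acceptance rate measures.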
14. Code Transpilation for Hardware Accelerators
- Author
-
Nishida, Yuto, Bhatia, Sahil, Laddad, Shadaj, Genc, Hasan, Shao, Yakun Sophia, and Cheung, Alvin
- Subjects
Computer Science - Programming Languages; Computer Science - Hardware Architecture
- Abstract
DSLs and hardware accelerators have proven to be very effective in optimizing computationally expensive workloads. In this paper, we propose a solution to the challenge of manually rewriting legacy or unoptimized code in domain-specific languages and hardware accelerators. We introduce an approach that integrates two open-source tools: Metalift, a code translation framework, and Gemmini, a DNN accelerator generator. The integration of these two tools offers significant benefits, including simplified workflows for developers to run legacy code on Gemmini generated accelerators and a streamlined programming stack for Gemmini that reduces the effort required to add new instructions. This paper provides details on this integration and its potential to simplify and optimize computationally expensive workloads.
- Published
- 2023
15. Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations
- Author
-
Kittivorawong, Chanwut, Ge, Yongming, Helal, Yousef, and Cheung, Alvin
- Subjects
Computer Science - Databases; Computer Science - Computer Vision and Pattern Recognition
- Abstract
Videos that are shot using commodity hardware such as phones and surveillance cameras record various metadata such as time and location. We encounter such geospatial videos on a daily basis and such videos have been growing in volume significantly. Yet, we do not have data management systems that allow users to interact with such data effectively. In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos. Spatialyze comes with a domain-specific language where users can construct geospatial video analytic workflows using a 3-step, declarative, build-filter-observe paradigm. Internally, Spatialyze leverages the declarative nature of such workflows, the temporal-spatial metadata stored with videos, and physical behavior of real-world objects to optimize the execution of workflows. Our results using real-world videos and workflows show that Spatialyze can reduce execution time by up to 5.3x, while maintaining up to 97.1% accuracy compared to unoptimized execution. Comment: Project Page: https://spatialyze.github.io
- Published
- 2023
16. Optimizing Stateful Dataflow with Local Rewrites
- Author
-
Laddad, Shadaj, Power, Conor, Hou, Tyler, Cheung, Alvin, and Hellerstein, Joseph M.
- Subjects
Computer Science - Programming Languages; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Optimizing a stateful dataflow language is a challenging task. There are strict correctness constraints for preserving properties expected by downstream consumers, a large space of possible optimizations, and complex analyses that must reason about the behavior of the program over time. Classic compiler techniques with specialized optimization passes yield unpredictable performance and have complex correctness proofs. But with e-graphs, we can dramatically simplify the process of building a correct optimizer while yielding more consistent results! In this short paper, we discuss our early work using e-graphs to develop an optimizer for the Hydroflow dataflow language. Our prototype demonstrates that composing simple, easy-to-prove rewrite rules is sufficient to match techniques in hand-optimized systems. Comment: EGRAPHS 2023
- Published
- 2023
17. SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics
- Author
-
Ardakani, Arash, Haan, Altan, Tan, Shangyin, Popovici, Doru Thom, Cheung, Alvin, Iancu, Costin, and Sen, Koushik
- Subjects
Computer Science - Computation and Language
- Abstract
Transformer-based models, such as BERT and ViT, have achieved state-of-the-art results across different natural language processing (NLP) and computer vision (CV) tasks. However, these models are extremely memory intensive during their fine-tuning process, making them difficult to deploy on GPUs with limited memory resources. To address this issue, we introduce a new tool called SlimFit that reduces the memory requirements of these models by dynamically analyzing their training dynamics and freezing less-contributory layers during fine-tuning. The layers to freeze are chosen using a runtime inter-layer scheduling algorithm. SlimFit adopts quantization and pruning for particular layers to balance the load of dynamic activations and to minimize the memory footprint of static activations, where static activations refer to those that cannot be discarded regardless of freezing. This allows SlimFit to freeze up to 95% of layers and reduce the overall on-device GPU memory usage of transformer-based models such as ViT and BERT by an average of 2.2x, across different NLP and CV benchmarks/datasets such as GLUE, SQuAD 2.0, CIFAR-10, CIFAR-100 and ImageNet with an average degradation of 0.2% in accuracy. For such NLP and CV tasks, SlimFit can reduce the total on-device memory usage by up to 3.1x with an accuracy degradation of only up to 0.4%. As a result, while fine-tuning of ViT on ImageNet and BERT on SQuAD 2.0 with a batch size of 128 requires 3 and 2 32GB GPUs respectively, SlimFit enables their fine-tuning on a single 32GB GPU without any significant accuracy degradation.
- Published
- 2023
18. Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers
- Author
-
Gong, Linyuan, Xiong, Chenyan, Liu, Xiaodong, Bajaj, Payal, Xie, Yiqing, Cheung, Alvin, Gao, Jianfeng, and Song, Xia
- Subjects
Computer Science - Computation and Language
- Abstract
This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5. We study various designs to pretrain T5 using an auxiliary model to construct more challenging token replacements for the main model to denoise. Key aspects under study include the decoding target, the location of the RTD head, and the masking pattern. Based on these studies, we develop a new model, METRO-T0, which is pretrained using the redesigned ELECTRA-Style pretraining strategies and then prompt-finetuned on a mixture of NLP tasks. METRO-T0 outperforms all similar-sized baselines on prompted NLP benchmarks, such as T0 Eval and MMLU, and rivals the state-of-the-art T0-11B model with only 8% of its parameters. Our analysis of the model's neural activation and parameter sensitivity reveals that the effectiveness of METRO-T0 stems from a more balanced contribution of parameters and better utilization of their capacity. The code and model checkpoints are available at https://github.com/gonglinyuan/metro_t0. Comment: Published as a conference paper at ACL 2023. 9 pages
- Published
- 2023
19. An Evaluation of Memory Optimization Methods for Training Neural Networks
- Author
-
Liu, Xiaoxuan, Jha, Siddharth, and Cheung, Alvin
- Subjects
Computer Science - Machine Learning; Computer Science - Performance
- Abstract
As models continue to grow in size, the development of memory optimization methods (MOMs) has emerged as a solution to address the memory bottleneck encountered when training large models. To comprehensively examine the practical value of various MOMs, we have conducted a thorough analysis of existing literature from a systems perspective. Our analysis has revealed a notable challenge within the research community: the absence of standardized metrics for effectively evaluating the efficacy of MOMs. The scarcity of informative evaluation metrics hinders the ability of researchers and practitioners to compare and benchmark different approaches reliably. Consequently, drawing definitive conclusions and making informed decisions regarding the selection and application of MOMs becomes a challenging endeavor. To address the challenge, this paper summarizes the scenarios in which MOMs prove advantageous for model training. We propose the use of distinct evaluation metrics under different scenarios. By employing these metrics, we evaluate the prevailing MOMs and find that their benefits are not universal. We present insights derived from experiments and discuss the circumstances in which they can be advantageous.
- Published
- 2023
20. ADELT: Transpilation Between Deep Learning Frameworks
- Author
-
Gong, Linyuan, Wang, Jiayi, and Cheung, Alvin
- Subjects
Computer Science - Computation and Language; Computer Science - Machine Learning
- Abstract
We propose the Adversarial DEep Learning Transpiler (ADELT), a novel approach to source-to-source transpilation between deep learning frameworks. ADELT uniquely decouples code skeleton transpilation and API keyword mapping. For code skeleton transpilation, it uses few-shot prompting on large language models (LLMs), while for API keyword mapping, it uses contextual embeddings from a code-specific BERT. These embeddings are trained in a domain-adversarial setup to generate a keyword translation dictionary. ADELT is trained on an unlabeled web-crawled deep learning corpus, without relying on any hand-crafted rules or parallel data. It outperforms state-of-the-art transpilers, improving pass@1 rate by 17.4 pts and 15.0 pts for PyTorch-Keras and PyTorch-MXNet transpilation pairs respectively. We provide open access to our code at https://github.com/gonglinyuan/adelt. Comment: 19 pages, to be published in the main track of IJCAI 2024
- Published
- 2023
21. Keep CALM and CRDT On
- Author
-
Laddad, Shadaj, Power, Conor, Milano, Mae, Cheung, Alvin, Crooks, Natacha, and Hellerstein, Joseph M.
- Subjects
Computer Science - Databases
- Abstract
Despite decades of research and practical experience, developers have few tools for programming reliable distributed applications without resorting to expensive coordination techniques. Conflict-free replicated datatypes (CRDTs) are a promising line of work that enable coordination-free replication and offer certain eventual consistency guarantees in a relatively simple object-oriented API. Yet CRDT guarantees extend only to data updates; observations of CRDT state are unconstrained and unsafe. We propose an agenda that embraces the simplicity of CRDTs, but provides richer, more uniform guarantees. We extend CRDTs with a query model that reasons about which queries are safe without coordination by applying monotonicity results from the CALM Theorem, and lay out a larger agenda for developing CRDT data stores that let developers safely and efficiently interact with replicated application state.
- Published
- 2022
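Illustrative note on entry 21: the abstract's distinction between coordination-free updates and safe queries can be sketched with a grow-only counter, a standard textbook CRDT. This is a minimal sketch and not the paper's query model; the class name `GCounter` and the method `at_least` are hypothetical names chosen for illustration. The threshold check is monotone (once true on a replica it stays true after any further increment or merge), which is the kind of query that monotonicity reasoning treats as safe without coordination, whereas reading an exact value from a single replica is not.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        """Local update; only this replica's own slot is touched."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        """Join two states with an element-wise max.

        Max is commutative, associative, and idempotent, so replicas
        converge regardless of the order or number of merges.
        """
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        """Current local view of the total; may lag behind other replicas."""
        return sum(self.counts.values())

    def at_least(self, threshold: int) -> bool:
        """Monotone query: a True answer can never be invalidated later."""
        return self.value() >= threshold


# Two replicas update independently, then exchange state in either order.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
seen_before_merge = a.at_least(2)  # already true on replica a alone
a.merge(b)
b.merge(a)
a.merge(b)  # a repeated merge changes nothing (idempotence)
```

After the exchange both replicas hold the same state, and the threshold answer observed before the merge is still valid afterwards.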
22. The artificial intelligence-based model ANORAK improves histopathological grading of lung adenocarcinoma
- Author
-
Pan, Xiaoxi, AbdulJabbar, Khalid, Coelho-Lima, Jose, Grapa, Anca-Ioana, Zhang, Hanyun, Cheung, Alvin Ho Kwan, Baena, Juvenal, Karasaki, Takahiro, Wilson, Claire Rachel, Sereno, Marco, Veeriah, Selvaraju, Aitken, Sarah J., Hackshaw, Allan, Nicholson, Andrew G., Jamal-Hanjani, Mariam, Swanton, Charles, Yuan, Yinyin, Le Quesne, John, and Moore, David A.
- Published
- 2024
23. NumS: Scalable Array Programming for the Cloud
- Author
-
Elibol, Melih, Benara, Vinamra, Yagati, Samyu, Zheng, Lianmin, Cheung, Alvin, Jordan, Michael I., and Stoica, Ion
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Machine Learning; Computer Science - Mathematical Software; Statistics - Applications
- Abstract
Scientists increasingly rely on Python tools to perform scalable distributed memory array operations using rich, NumPy-like expressions. However, many of these tools rely on dynamic schedulers optimized for abstract task graphs, which often encounter memory and network bandwidth-related bottlenecks due to sub-optimal data and operator placement decisions. Tools built on the message passing interface (MPI), such as ScaLAPACK and SLATE, have better scaling properties, but these solutions require specialized knowledge to use. In this work, we present NumS, an array programming library which optimizes NumPy-like expressions on task-based distributed systems. This is achieved through a novel scheduler called Load Simulated Hierarchical Scheduling (LSHS). LSHS is a local search method which optimizes operator placement by minimizing maximum memory and network load on any given node within a distributed system. Coupled with a heuristic for load balanced data layouts, our approach is capable of attaining communication lower bounds on some common numerical operations, and our empirical study shows that LSHS enhances performance on Ray by decreasing network load by a factor of 2x, requiring 4x less memory, and reducing execution time by 10x on the logistic regression problem. On terabyte-scale data, NumS achieves competitive performance to SLATE on DGEMM, up to 20x speedup over Dask on a key operation for tensor factorization, and a 2x speedup on logistic regression compared to Dask ML and Spark's MLlib.
- Published
- 2022
24. GACT: Activation Compressed Training for Generic Network Architectures
- Author
-
Liu, Xiaoxuan, Zheng, Lianmin, Wang, Dequan, Cen, Yukuo, Chen, Weize, Han, Xu, Chen, Jianfei, Liu, Zhiyuan, Tang, Jie, Gonzalez, Joey, Mahoney, Michael, and Cheung, Alvin
- Subjects
Computer Science - Machine Learning
- Abstract
Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint. This paper presents GACT, an ACT framework to support a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge on operator type or model architecture. To make training stable, we propose an algorithm that decides the compression ratio for each tensor by estimating its impact on the gradient at run time. We implement GACT as a PyTorch library that readily applies to any NN architecture. GACT reduces the activation memory for convolutional NNs, transformers, and graph NNs by up to 8.1x, enabling training with a 4.2x to 24.7x larger batch size, with negligible accuracy loss. We implement GACT as a PyTorch library at https://github.com/LiuXiaoxuanPKU/GACT-ICML.
- Published
- 2022
25. Katara: Synthesizing CRDTs with Verified Lifting
- Author
-
Laddad, Shadaj, Power, Conor, Milano, Mae, Cheung, Alvin, and Hellerstein, Joseph M.
- Subjects
Computer Science - Programming Languages; Computer Science - Distributed, Parallel, and Cluster Computing; D.1.2
- Abstract
Conflict-free replicated data types (CRDTs) are a promising tool for designing scalable, coordination-free distributed systems. However, constructing correct CRDTs is difficult, posing a challenge for even seasoned developers. As a result, CRDT development is still largely the domain of academics, with new designs often awaiting peer review and a manual proof of correctness. In this paper, we present Katara, a program synthesis-based system that takes sequential data type implementations and automatically synthesizes verified CRDT designs from them. Key to this process is a new formal definition of CRDT correctness that combines a reference sequential type with a lightweight ordering constraint that resolves conflicts between non-commutative operations. Our process follows the tradition of work in verified lifting, including an encoding of correctness into SMT logic using synthesized inductive invariants and hand-crafted grammars for the CRDT state and runtime. Katara is able to automatically synthesize CRDTs for a wide variety of scenarios, from reproducing classic CRDTs to synthesizing novel designs based on specifications in existing literature. Crucially, our synthesized CRDTs are fully, automatically verified, eliminating entire classes of common errors and reducing the process of producing a new CRDT from a painstaking paper proof of correctness to a lightweight specification.
- Published
- 2022
26. The Sky Above The Clouds
- Author
-
Chasins, Sarah, Cheung, Alvin, Crooks, Natacha, Ghodsi, Ali, Goldberg, Ken, Gonzalez, Joseph E., Hellerstein, Joseph M., Jordan, Michael I., Joseph, Anthony D., Mahoney, Michael W., Parameswaran, Aditya, Patterson, David, Popa, Raluca Ada, Sen, Koushik, Shenker, Scott, Song, Dawn, and Stoica, Ion
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing - Abstract
Technology ecosystems often undergo significant transformations as they mature. For example, telephony, the Internet, and PCs all started with a single provider, but in the United States each is now served by a competitive market that uses comprehensive and universal technology standards to provide compatibility. This white paper presents our view on how the cloud ecosystem, barely over fifteen years old, could evolve as it matures. Comment: 35 pages
- Published
- 2022
27. Leveraging Application Data Constraints to Optimize Database-Backed Web Applications
- Author
-
Liu, Xiaoxuan, Wang, Shuxian, Sun, Mengzhu, Pan, Sicheng, Li, Ge, Jha, Siddharth, Yan, Cong, Yang, Junwen, Lu, Shan, and Cheung, Alvin
- Subjects
Computer Science - Databases ,Computer Science - Programming Languages - Abstract
Exploiting the relationships among data is a classical query optimization technique. As persistent data is increasingly being created and maintained programmatically, prior work that infers data relationships from data statistics misses an important opportunity. We present ConstrOpt, the first tool that identifies data relationships by analyzing database-backed applications. Once identified, ConstrOpt leverages the constraints to optimize the application's physical design and query execution. Instead of developing a fixed set of predefined rewriting rules, ConstrOpt employs an enumerate-test-verify technique to automatically exploit the discovered data constraints to improve query execution. Each resulting rewrite is provably equivalent to the original query. Using 14 real-world web applications, our experiments show that ConstrOpt can discover numerous data constraints from code analysis and improve real-world application performance significantly.
- Published
- 2022
28. Synthesizing Analytical SQL Queries from Computation Demonstration
- Author
-
Zhou, Xiangyu, Bodik, Rastislav, Cheung, Alvin, and Wang, Chenglong
- Subjects
Computer Science - Programming Languages ,Computer Science - Databases - Abstract
Analytical SQL is widely used in modern database applications and data analysis. However, its partitioning and grouping operators are challenging for novice users. Unfortunately, programming by example, shown to be effective on standard SQL, is less attractive because examples for analytical queries are more laborious to solve by hand. To make demonstrations easier to create, we designed a new end-user specification, programming by computation demonstration, that allows the user to demonstrate the task using a (possibly incomplete) cell-level computation trace. This specification is exploited in a new abstraction-based synthesis algorithm to prove that a partially formed query cannot be completed to satisfy the specification, allowing us to prune the search space. We implemented our approach in a tool named Sickle and tested it on 80 real-world analytical SQL tasks. Results show that even from small demonstrations, Sickle can solve 76 tasks, in 12.8 seconds on average, while the prior approaches can solve only 60 tasks and are on average 22.5x slower. Our user study with 13 participants reveals that our specification increases user efficiency and confidence on challenging tasks.
- Published
- 2022
29. VSTM2A reverses immunosuppression in colorectal cancer by antagonizing the PD-L1/PD-1 interaction
- Author
-
Dong, Yujuan, Liu, Jiaxun Jade, Zhou, Yunfei, Kang, Wei, Li, Shanglin, Cheung, Alvin H.K., Hu, Yi, Liao, Rui, Wong, Nathalie, Wong, Chi Chun, Ng, Simon S.M., and Yu, Jun
- Published
- 2024
30. Peptostreptococcus stomatis promotes colonic tumorigenesis and receptor tyrosine kinase inhibitor resistance by activating ERBB2-MAPK
- Author
-
Huang, Pingmei, Ji, Fenfen, Cheung, Alvin Ho-Kwan, Fu, Kaili, Zhou, Qiming, Ding, Xiao, Chen, Danyu, Lin, Yufeng, Wang, Luyao, Jiao, Ying, Chu, Eagle S.H., Kang, Wei, To, Ka Fai, Yu, Jun, and Wong, Chi Chun
- Published
- 2024
31. Integrative plasma and fecal metabolomics identify functional metabolites in adenoma-colorectal cancer progression and as early diagnostic biomarkers
- Author
-
Sun, Yang, Zhang, Xiang, Hang, Dong, Lau, Harry Cheuk-Hay, Du, Jie, Liu, Chuanfa, Xie, Mingxu, Pan, Yasi, Wang, Le, Liang, Cong, Zhou, Xingyu, Chen, Danyu, Rong, Jiamei, Zhao, Zengren, Cheung, Alvin Ho-Kwan, Wu, Yuet, Gou, Hongyan, Wong, Chi Chun, Du, Lingbin, Deng, Junliang, Hu, Zhibin, Shen, Hongbing, Miao, Yinglei, and Yu, Jun
- Published
- 2024
32. SMARCA4 deficiency and mutations are frequent in large cell lung carcinoma and are prognostically significant
- Author
-
Cheung, Alvin Ho-Kwan, Wong, Kit-Yee, Chau, Shuk-Ling, Xie, Fuda, Mui, Zeta, Li, Gordon Yuan-Ho, Li, Molly Siu Ching, Tong, Joanna, Ng, Calvin Sze-Hang, Mok, Tony S., Kang, Wei, and To, Ka-Fai
- Published
- 2024
33. MLK4 promotes glucose metabolism in lung adenocarcinoma through CREB-mediated activation of phosphoenolpyruvate carboxykinase and is regulated by KLF5
- Author
-
Cheung, Alvin Ho-Kwan, Wong, Kit-Yee, Liu, Xiaoli, Ji, Fenfen, Hui, Chris Ho-Lam, Zhang, Yihan, Kwan, Johnny Sheung-Him, Chen, Bonan, Dong, Yujuan, Lung, Raymond Wai-Ming, Yu, Jun, Lo, Kwok Wai, Wong, Chi Chun, Kang, Wei, and To, Ka-Fai
- Published
- 2023
34. VSS: A Storage System for Video Analytics [Technical Report]
- Author
-
Haynes, Brandon, Daum, Maureen, He, Dong, Mazumdar, Amrita, Balazinska, Magdalena, Cheung, Alvin, and Ceze, Luis
- Subjects
Computer Science - Databases - Abstract
We present a new video storage system (VSS) designed to decouple high-level video operations from the low-level details required to store and efficiently retrieve video data. VSS is designed to be the storage subsystem of a video data management system (VDBMS) and is responsible for: (1) transparently and automatically arranging the data on disk in an efficient, granular format; (2) caching frequently-retrieved regions in the most useful formats; and (3) eliminating redundancies found in videos captured from multiple cameras with overlapping fields of view. Our results suggest that VSS can improve VDBMS read performance by up to 54%, reduce storage costs by up to 45%, and enable developers to focus on application logic rather than video storage and retrieval.
- Published
- 2021
35. Falx: Synthesis-Powered Visualization Authoring
- Author
-
Wang, Chenglong, Feng, Yu, Bodik, Rastislav, Dillig, Isil, Cheung, Alvin, and Ko, Amy J.
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Programming Languages - Abstract
Modern visualization tools aim to allow data analysts to easily create exploratory visualizations. When the input data layout conforms to the visualization design, users can easily specify visualizations by mapping data columns to visual channels of the design. However, when there is a mismatch between data layout and the design, users need to spend significant effort on data transformation. We propose Falx, a synthesis-powered visualization tool that allows users to specify visualizations in a similarly simple way but without needing to worry about data layout. In Falx, users specify visualizations using examples of how concrete values in the input are mapped to visual channels, and Falx automatically infers the visualization specification and transforms the data to match the design. In a study with 33 data analysts on four visualization tasks involving data transformation, we found that users can effectively adopt Falx to create visualizations they otherwise cannot implement. Comment: CHI 2021
- Published
- 2021
36. New Directions in Cloud Programming
- Author
-
Cheung, Alvin, Crooks, Natacha, Hellerstein, Joseph M., and Milano, Mae
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Databases ,Computer Science - Operating Systems ,Computer Science - Programming Languages - Abstract
Nearly twenty years after the launch of AWS, it remains difficult for most developers to harness the enormous potential of the cloud. In this paper we lay out an agenda for a new generation of cloud programming research aimed at bringing research ideas to programmers in an evolutionary fashion. Key to our approach is a separation of distributed programs into a PACT of four facets: Program semantics, Availability, Consistency and Targets of optimization. We propose to migrate developers gradually to PACT programming by lifting familiar code into our more declarative level of abstraction. We then propose a multi-stage compiler that emits human-readable code at each stage that can be hand-tuned by developers seeking more control. Our agenda raises numerous research challenges across multiple areas including language design, query optimization, transactions, distributed consistency, compilers and program synthesis.
- Published
- 2021
37. Interpretability Meets Generalizability: A Hybrid Machine Learning System to Identify Nonlinear Granger Causality in Global Stock Indices
- Author
-
Lu, Yixiao, Lee, Yokiu, Feng, Haoran, Leung, Johnathan, Cheung, Alvin, Dost, Katharina, Taskova, Katerina, Lacombe, Thomas, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kashima, Hisashi, editor, Ide, Tsuyoshi, editor, and Peng, Wen-Chih, editor
- Published
- 2023
38. Streptococcus anginosus promotes gastric inflammation, atrophy, and tumorigenesis in mice
- Author
-
Fu, Kaili, Cheung, Alvin Ho Kwan, Wong, Chi Chun, Liu, Weixin, Zhou, Yunfei, Wang, Feixue, Huang, Pingmei, Yuan, Kai, Coker, Olabisi Oluwabukola, Pan, Yasi, Chen, Danyu, Lam, Nga Man, Gao, Mengxue, Zhang, Xiang, Huang, He, To, Ka Fai, Sung, Joseph Jao Yiu, and Yu, Jun
- Published
- 2024
39. High Soluble Fiber Promotes Colorectal Tumorigenesis Through Modulating Gut Microbiota and Metabolites in Mice
- Author
-
Yang, Jia, Wei, Hong, Lin, Yufeng, Chu, Eagle S.H., Zhou, Yunfei, Gou, Hongyan, Guo, Shang, Lau, Harry C.H., Cheung, Alvin H.K., Chen, Huarong, To, Ka Fei, Sung, Joseph J.Y., Wang, Yong, and Yu, Jun
- Published
- 2024
40. Germline Human Leukocyte Antigen Status is Associated With Immunotherapy-Induced Pneumonitis and Treatment Response in Patients With Non–Small Cell Lung Carcinoma With High Programmed Death-Ligand 1 Expression
- Author
-
Cheung, Alvin H.K., Mui, Zeta, Yeung, Walter W., Chow, Chit, Yu, Mandy F., Chen, Olivia H., Wong, Kit-Yee, Xie, Fuda, Lau, Yat Ming, Cheng, Alfred S-L., Kang, Wei, To, Ka-Fai, Mok, Tony S., and Li, Molly S.C.
- Published
- 2024
41. Bifidobacterium pseudolongum-generated acetate suppresses non-alcoholic fatty liver disease-associated hepatocellular carcinoma
- Author
-
Song, Qian, Zhang, Xiang, Liu, Weixin, Wei, Hong, Liang, Wei, Zhou, Yunfei, Ding, Yanqiang, Ji, Fenfen, Ho-Kwan Cheung, Alvin, Wong, Nathalie, and Yu, Jun
- Published
- 2023
42. Visualization by Example
- Author
-
Wang, Chenglong, Feng, Yu, Bodik, Rastislav, Cheung, Alvin, and Dillig, Isil
- Subjects
Computer Science - Programming Languages ,Computer Science - Human-Computer Interaction - Abstract
While visualizations play a crucial role in gaining insights from data, generating useful visualizations from a complex dataset is far from an easy task. Besides understanding the functionality provided by existing visualization libraries, generating the desired visualization also requires reshaping and aggregating the underlying data as well as composing different visual elements to achieve the intended visual narrative. This paper aims to simplify visualization tasks by automatically synthesizing the required program from simple visual sketches provided by the user. Specifically, given an input data set and a visual sketch that demonstrates how to visualize a very small subset of this data, our technique automatically generates a program that can be used to visualize the entire data set. Automating visualization poses several challenges. First, because many visualization tasks require data wrangling in addition to generating plots, we need to decompose the end-to-end synthesis task into two separate sub-problems. Second, because the intermediate specification that results from the decomposition is necessarily imprecise, the data wrangling task is particularly challenging in our context. In this paper, we address these problems by developing a new compositional visualization-by-example technique that (a) decomposes the end-to-end task into two different synthesis problems over different DSLs and (b) leverages bi-directional program analysis to deal with the complexity that arises from having an imprecise intermediate specification. We implemented our visualization-by-example algorithm as Viser and evaluated it on 83 visualization tasks collected from on-line forums and tutorials. Viser can solve 84% of these benchmarks within a 600-second time limit, and, for those tasks that can be solved, the desired visualization is among the top-5 generated by Viser in 70% of the cases.
- Published
- 2019
43. METTL3 drives NAFLD-related hepatocellular carcinoma and is a therapeutic target for boosting immunotherapy
- Author
-
Pan, Yasi, Chen, Huarong, Zhang, Xiang, Liu, Weixin, Ding, Yanqiang, Huang, Dan, Zhai, Jianning, Wei, Wenchao, Wen, Jun, Chen, Danyu, Zhou, Yunfei, Liang, Cong, Wong, Nathalie, Man, Kwan, Cheung, Alvin Ho-Kwan, Wong, Chi Chun, and Yu, Jun
- Published
- 2023
44. Learning Programmatic Idioms for Scalable Semantic Parsing
- Author
-
Iyer, Srinivasan, Cheung, Alvin, and Zettlemoyer, Luke
- Subjects
Computer Science - Computation and Language - Abstract
Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens. In contrast, state of the art (SOTA) semantic parsers still map natural language instructions to source code by building the code syntax tree one node at a time. In this paper, we introduce an iterative method to extract code idioms from large source code corpora by repeatedly collapsing most-frequent depth-2 subtrees of their syntax trees, and train semantic parsers to apply these idioms during decoding. Applying idiom-based decoding on a recent context-dependent semantic parsing task improves the SOTA by 2.2% BLEU score while reducing training time by more than 50%. This improved speed enables us to scale up the model by training on an extended training set that is 5x larger, to further move up the SOTA by an additional 2.3% BLEU and 0.9% exact match. Finally, idioms also significantly improve accuracy of semantic parsing to SQL on the ATIS-SQL dataset, when training data is limited. Comment: Accepted at EMNLP 2019
- Published
- 2019
45. Vignette: Perceptual Compression for Video Storage and Processing Systems
- Author
-
Mazumdar, Amrita, Haynes, Brandon, Balazinska, Magdalena, Ceze, Luis, Cheung, Alvin, and Oskin, Mark
- Subjects
Computer Science - Multimedia ,Computer Science - Databases - Abstract
Compressed videos constitute 70% of Internet traffic, and video upload growth rates far outpace compute and storage improvement trends. Past work in leveraging perceptual cues like saliency, i.e., regions where viewers focus their perceptual attention, reduces compressed video size while maintaining perceptual quality, but requires significant changes to video codecs and ignores the data management of this perceptual information. In this paper, we propose Vignette, a compression technique and storage manager for perception-based video compression. Vignette complements off-the-shelf compression software and hardware codec implementations. Vignette's compression technique uses a neural network to predict saliency information used during transcoding, and its storage manager integrates perceptual information into the video storage system to support a perceptual compression feedback loop. Vignette's saliency-based optimizations reduce storage by up to 95% with minimal quality loss, and Vignette videos lead to power savings of 50% on mobile phones during video playback. Our results demonstrate the benefit of embedding information about the human visual system into the architecture of video storage systems.
- Published
- 2019
46. Parvimonas micra promotes colorectal tumorigenesis and is associated with prognosis of colorectal cancer patients
- Author
-
Zhao, Liuyang, Zhang, Xiang, Zhou, Yunfei, Fu, Kaili, Lau, Harry Cheuk-Hay, Chun, Tommy Wai-Yiu, Cheung, Alvin Ho-Kwan, Coker, Olabisi Oluwabukola, Wei, Hong, Wu, William Ka-Kei, Wong, Sunny Hei, Sung, Joseph Jao-Yiu, To, Ka Fai, and Yu, Jun
- Published
- 2022
47. Interpretability Meets Generalizability: A Hybrid Machine Learning System to Identify Nonlinear Granger Causality in Global Stock Indices
- Author
-
Lu, Yixiao, primary, Lee, Yokiu, additional, Feng, Haoran, additional, Leung, Johnathan, additional, Cheung, Alvin, additional, Dost, Katharina, additional, Taskova, Katerina, additional, and Lacombe, Thomas, additional
- Published
- 2023
48. Improving High Contention OLTP Performance via Transaction Scheduling
- Author
-
Prasaad, Guna, Cheung, Alvin, and Suciu, Dan
- Subjects
Computer Science - Databases - Abstract
Research in transaction processing has made significant progress in improving the performance of multi-core in-memory transactional systems. However, the focus has mainly been on low-contention workloads. Modern transactional systems perform poorly on workloads with transactions accessing a few highly contended data items. We observe that most transactional workloads, including those with high contention, can be divided into clusters of data conflict-free transactions and a small set of residuals. In this paper, we introduce a new concurrency control protocol called Strife that leverages the above observation. Strife executes transactions in batches, where each batch is partitioned into clusters of conflict-free transactions and a small set of residual transactions. The conflict-free clusters are executed in parallel without any concurrency control, followed by executing the residual cluster either serially or with concurrency control. We present a low-overhead algorithm that partitions a batch of transactions into clusters that do not have cross-cluster conflicts and a small residual cluster. We evaluate Strife against the optimistic concurrency control protocol and several variants of two-phase locking, where the latter is known to perform better than other concurrency protocols under high contention, and show that Strife can improve transactional throughput by up to 2x. We also perform an in-depth micro-benchmark analysis to empirically characterize the performance and quality of our clustering algorithm.
- Published
- 2018
49. Mapping Language to Code in Programmatic Context
- Author
-
Iyer, Srinivasan, Konstas, Ioannis, Cheung, Alvin, and Zettlemoyer, Luke
- Subjects
Computer Science - Computation and Language - Abstract
Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to "return the smallest element" in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting that there is significant room for future work on this task. Comment: Accepted at EMNLP 2018
- Published
- 2018
50. Cuttlefish: A Lightweight Primitive for Adaptive Query Processing
- Author
-
Kaftan, Tomer, Balazinska, Magdalena, Cheung, Alvin, and Gehrke, Johannes
- Subjects
Computer Science - Databases ,Computer Science - Distributed, Parallel, and Cluster Computing - Abstract
Modern data processing applications execute increasingly sophisticated analysis that requires operations beyond traditional relational algebra. As a result, operators in query plans grow in diversity and complexity. Designing query optimizer rules and cost models to choose physical operators for all of these novel logical operators is impractical. To address this challenge, we develop Cuttlefish, a new primitive for adaptively processing online query plans that explores candidate physical operator instances during query execution and exploits the fastest ones using multi-armed bandit reinforcement learning techniques. We prototype Cuttlefish in Apache Spark and adaptively choose operators for image convolution, regular expression matching, and relational joins. Our experiments show Cuttlefish-based adaptive convolution and regular expression operators can reach 72-99% of the throughput of an all-knowing oracle that always selects the optimal algorithm, even when individual physical operators are up to 105x slower than the optimal. Additionally, Cuttlefish achieves join throughput improvements of up to 7.5x compared with Spark SQL's query optimizer.
- Published
- 2018