11,447 results for "Su, Yi"
Search Results
2. EVOLvE: Evaluating and Optimizing LLMs For Exploration
- Author
-
Nie, Allen, Su, Yi, Chang, Bo, Lee, Jonathan N., Chi, Ed H., Le, Quoc V., and Chen, Minmin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Despite their success in many domains, large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty. This is crucial as many real-world applications, ranging from personalized recommendations to healthcare interventions, demand that LLMs not only predict but also actively learn to make optimal decisions through exploration. In this work, we measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications. We develop a comprehensive suite of environments, including both context-free and contextual bandits with varying task difficulties, to benchmark LLMs' performance. Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs: by providing explicit algorithm-guided support during inference; and through algorithm distillation via in-context demonstrations and fine-tuning, using synthetic data generated from these algorithms. Impressively, these techniques allow us to achieve superior exploration performance with smaller models, surpassing larger models on various tasks. We conducted an extensive ablation study to shed light on various factors, such as task difficulty and data representation, that influence the efficiency of LLM exploration. Additionally, we conduct a rigorous analysis of the LLM's exploration efficiency using the concept of regret, linking its ability to explore to the model size and underlying algorithm., Comment: 28 pages
- Published
- 2024
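The entry above relies on optimal bandit algorithms both as inference-time guidance and as a source of synthetic trajectories for algorithm distillation. Below is a minimal Python sketch of that data-generation step, assuming a standard UCB1 policy on a context-free Bernoulli bandit; the arm means, horizon, and trajectory format are illustrative and not taken from the paper.

```python
import math
import random

def ucb1_trajectory(arm_means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit and record (history, chosen arm) pairs.

    The trajectory format here is purely illustrative; the paper's actual
    data representation for algorithm distillation may differ.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    history, records = [], []
    for t in range(1, horizon + 1):
        if t <= n_arms:                      # play each arm once first
            arm = t - 1
        else:
            arm = max(range(n_arms), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1 if rng.random() < arm_means[arm] else 0
        records.append((list(history), arm))  # state = past interactions
        history.append((arm, reward))
        counts[arm] += 1
        sums[arm] += reward
    return records

demos = ucb1_trajectory([0.2, 0.5, 0.8], horizon=50)
print(len(demos), demos[-1][1])  # later steps should usually favor arm 2
```

Each recorded (history, action) pair could then serve as an in-context demonstration or a fine-tuning example in the distillation setup the abstract describes.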
3. Supervised Multi-Modal Fission Learning
- Author
-
Mao, Lingchao, Wang, Qi, Su, Yi, Lure, Fleming, and Li, Jing
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Learning from multimodal datasets can leverage complementary information and improve performance in prediction tasks. A commonly used strategy to account for feature correlations in high-dimensional datasets is the latent variable approach. Several latent variable methods have been proposed for multimodal datasets. However, these methods either focus on extracting the shared component across all modalities or on extracting both a shared component and individual components specific to each modality. To address this gap, we propose a Multi-Modal Fission Learning (MMFL) model that simultaneously identifies globally joint, partially joint, and individual components underlying the features of multimodal datasets. Unlike existing latent variable methods, MMFL uses supervision from the response variable to identify predictive latent components and has a natural extension for incorporating incomplete multimodal data. Through simulation studies, we demonstrate that MMFL outperforms various existing multimodal algorithms in both complete and incomplete modality settings. We applied MMFL to a real-world case study for early prediction of Alzheimer's Disease using multimodal neuroimaging and genomics data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. MMFL provided more accurate predictions and better insights into within- and across-modality correlations compared to existing methods.
- Published
- 2024
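To make the "globally joint, partially joint, and individual" decomposition in the entry above concrete, here is a small simulation of three modalities generated with exactly that latent structure. The dimensions, loadings, and noise level are arbitrary illustrative choices, and this is the data-generating view rather than the MMFL estimator itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # samples
d = 30                                    # features per modality
z_global = rng.normal(size=(n, 2))        # shared across all modalities
z_pair12 = rng.normal(size=(n, 2))        # shared only by modalities 1 and 2
z_ind = [rng.normal(size=(n, 2)) for _ in range(3)]  # modality-specific

def modality(shared_parts, individual):
    """Build one modality as a noisy linear mix of its latent components."""
    parts = shared_parts + [individual]
    loadings = [rng.normal(size=(2, d)) for _ in parts]
    signal = sum(z @ w for z, w in zip(parts, loadings))
    return signal + 0.1 * rng.normal(size=(n, d))

X1 = modality([z_global, z_pair12], z_ind[0])
X2 = modality([z_global, z_pair12], z_ind[1])
X3 = modality([z_global], z_ind[2])

# A supervised response driven mainly by a few predictive latent components,
# which is what MMFL is designed to recover (illustrative only).
y = z_global[:, 0] + 0.5 * z_pair12[:, 1] + 0.1 * rng.normal(size=n)
print(X1.shape, X2.shape, X3.shape, y.shape)
```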
4. Training Language Models to Self-Correct via Reinforcement Learning
- Author
-
Kumar, Aviral, Zhuang, Vincent, Agarwal, Rishabh, Su, Yi, Co-Reyes, John D, Singh, Avi, Baumli, Kate, Iqbal, Shariq, Bishop, Colton, Roelofs, Rebecca, Zhang, Lei M, McKinney, Kay, Shrivastava, Disha, Paduraru, Cosmin, Tucker, George, Precup, Doina, Behbahani, Feryal, and Faust, Aleksandra
- Subjects
Computer Science - Machine Learning - Abstract
Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training self-correction typically depend on either multiple models, a more advanced model, or additional forms of supervision. To address these shortcomings, we develop a multi-turn online reinforcement learning (RL) approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data. To build SCoRe, we first show that variants of supervised fine-tuning (SFT) on offline model-generated correction traces are often insufficient for instilling self-correction behavior. In particular, we observe that training via SFT falls prey to either a distribution mismatch between mistakes made by the data-collection policy and the model's own responses, or to behavior collapse, where learning implicitly prefers only a certain mode of correction behavior that is often not effective at self-correction on test problems. SCoRe addresses these challenges by training under the model's own distribution of self-generated correction traces and using appropriate regularization to steer the learning process into learning a self-correction behavior that is effective at test time as opposed to fitting high-reward responses for a given prompt. This regularization process includes an initial phase of multi-turn RL on a base model to generate a policy initialization that is less susceptible to collapse, followed by using a reward bonus to amplify self-correction. With Gemini 1.0 Pro and 1.5 Flash models, we find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on MATH and HumanEval.
- Published
- 2024
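The abstract above mentions a reward bonus used to amplify self-correction in SCoRe's second training stage. The snippet below is a hedged illustration of that idea only: a two-attempt shaped reward in which improving on the first attempt is worth more than simply being correct twice. The functional form and the weight alpha are assumptions for illustration, not the paper's exact objective.

```python
def score_style_reward(r_first, r_second, alpha=1.0):
    """Two-turn reward shaping in the spirit of SCoRe's second stage.

    r_first, r_second: task rewards (e.g. 0/1 correctness) of the first
    attempt and of the revised attempt. The bonus term amplifies the
    *change* between attempts so the policy is pushed toward genuine
    self-correction rather than repeating a good first answer.
    alpha is an illustrative bonus weight, not a value from the paper.
    """
    return r_second + alpha * (r_second - r_first)

# A correct revision of an initially wrong answer earns more than a
# correct answer that was never revised at all.
print(score_style_reward(0.0, 1.0))   # 2.0
print(score_style_reward(1.0, 1.0))   # 1.0
print(score_style_reward(1.0, 0.0))   # -1.0: degrading a correct answer is penalized
```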
5. CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement
- Author
-
Dong, Xuanzhao, Vasa, Vamsi Krishna, Zhu, Wenhui, Qiu, Peijie, Chen, Xiwen, Su, Yi, Xiong, Yujian, Yang, Zhangsihao, Chen, Yanxi, and Wang, Yalin
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Retinal fundus photography is significant in diagnosing and monitoring retinal diseases. However, systemic imperfections and operator/patient-related factors can hinder the acquisition of high-quality retinal images. Previous efforts in retinal image enhancement primarily relied on GANs, which are limited by the trade-off between training stability and output diversity. In contrast, the Schrödinger Bridge (SB) offers a more stable solution by utilizing Optimal Transport (OT) theory to model a stochastic differential equation (SDE) between two arbitrary distributions. This allows SB to effectively transform low-quality retinal images into their high-quality counterparts. In this work, we leverage the SB framework to propose an image-to-image translation pipeline for retinal image enhancement. Additionally, previous methods often fail to capture fine structural details, such as blood vessels. To address this, we enhance our pipeline by introducing Dynamic Snake Convolution, whose tortuous receptive field can better preserve tubular structures. We name the resulting retinal fundus image enhancement framework the Context-aware Unpaired Neural Schrödinger Bridge (CUNSB-RFIE). To the best of our knowledge, this is the first endeavor to use the SB approach for retinal image enhancement. Experimental results on a large-scale dataset demonstrate the advantage of the proposed method compared to several state-of-the-art supervised and unsupervised methods in terms of image quality and performance on downstream tasks. The code is available at https://github.com/Retinal-Research/CUNSB-RFIE.
- Published
- 2024
6. Achieving Optimal Short-Blocklength Secrecy Rate Using Multi-Kernel PAC Codes for the Binary Erasure Wiretap Channel
- Author
-
Lin, Hsuan-Yin, Su, Yi-Sheng, and Chiu, Mao-Ching
- Subjects
Computer Science - Information Theory - Abstract
We investigate practical short-blocklength coding for the semi-deterministic binary erasure wiretap channel (BE-WTC), where the main channel to the legitimate receiver is noiseless, and the eavesdropper's channel is a binary erasure channel (BEC). It is shown that under the average total variation distance secrecy metric, multi-kernel polarization-adjusted convolutional (MK-PAC) codes can achieve the best possible theoretical secrecy rate at blocklengths of 16, 32, 64, and 128 if the secrecy leakage is less than or equal to certain values., Comment: Paper accepted for presentation at the 2024 IEEE International Symposium on Information Theory and Its Applications (ISITA 2024)
- Published
- 2024
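For readers unfamiliar with the channel model in the entry above, the following sketch simulates a binary erasure channel, the eavesdropper's channel in the BE-WTC. The erasure probability and blocklength are illustrative, and the MK-PAC code construction itself is not reproduced here.

```python
import random

def bec(bits, eps, seed=0):
    """Binary erasure channel: each transmitted bit is independently
    replaced by an erasure symbol (None) with probability eps."""
    rng = random.Random(seed)
    return [None if rng.random() < eps else b for b in bits]

codeword = [0, 1, 1, 0, 1, 0, 0, 1] * 2   # a 16-bit block, the shortest length studied above
received = bec(codeword, eps=0.5)
print(sum(b is None for b in received), "of", len(received), "bits erased")
```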
7. Interplay of Quantum Resources in Nonlocality Tests
- Author
-
Dong, Hai-Hao, Zhu, Yuwei, Cheng, Su-Yi, Zhang, Xingjian, Li, Cheng-Long, Li, Ying-Zhao, Li, Hao, You, Lixing, Ma, Xiongfeng, Zhang, Qiang, and Pan, Jian-Wei
- Subjects
Quantum Physics - Abstract
Nonlocality, evidenced by the violation of Bell inequalities, not only signifies entanglement but also highlights measurement incompatibility in quantum systems. Utilizing the generalized Clauser-Horne-Shimony-Holt (CHSH) Bell inequality, our high-efficiency optical setup achieves a loophole-free violation of $2.0132$. This result provides a device-independent lower bound on entanglement, quantified as the entanglement of formation at $0.0159$. Moreover, by tuning the parameters of the generalized Bell inequality, we enhance the estimation of measurement incompatibility, which is quantified by an effective overlap of $4.3883 \times 10^{-5}$. To explore the intricate interplay among nonlocality, entanglement, and measurement incompatibility, we generate mixed states, allowing for flexible modulation of entanglement via fast switching among the four Bell states using Pockels cells, achieving a fidelity above $99.10\%$. Intriguingly, our results reveal a counterintuitive relationship where increasing incompatibility initially boosts nonlocality but eventually leads to its reduction. Typically, maximal nonlocality does not coincide with maximal incompatibility. This experimental study sheds light on the optimal management of quantum resources for Bell-inequality-based quantum information processing., Comment: 15 pages, 9 figures
- Published
- 2024
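As context for the Bell-test value reported above, the short computation below evaluates the standard CHSH expression for a maximally entangled two-qubit state, recovering the quantum maximum of 2√2 ≈ 2.828. The experiment itself uses a generalized CHSH inequality and reports a loophole-free violation of 2.0132, so this is background arithmetic rather than a model of the actual setup.

```python
import numpy as np

# Pauli operators and the maximally entangled Bell state |Phi+>.
X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)
phi_plus = np.array([1, 0, 0, 1], dtype=float) / np.sqrt(2)

def correlation(theta_a, theta_b, state=phi_plus):
    """Expectation of A(theta_a) x B(theta_b) with A(t) = cos(t) Z + sin(t) X."""
    A = np.cos(theta_a) * Z + np.sin(theta_a) * X
    B = np.cos(theta_b) * Z + np.sin(theta_b) * X
    return state @ np.kron(A, B) @ state

# Standard CHSH measurement angles for the two parties.
a, a2, b, b2 = 0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4
S = correlation(a, b) - correlation(a, b2) + correlation(a2, b) + correlation(a2, b2)
print(round(S, 4))  # 2.8284, the Tsirelson bound 2*sqrt(2)
```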
8. Testing learning hypotheses using neural networks by manipulating learning data
- Author
-
Leong, Cara Su-Yi and Linzen, Tal
- Subjects
Computer Science - Computation and Language - Abstract
Although passivization is productive in English, it is not completely general -- some exceptions exist (e.g. *One hour was lasted by the meeting). How do English speakers learn these exceptions to an otherwise general pattern? Using neural network language models as theories of acquisition, we explore the sources of indirect evidence that a learner can leverage to learn whether a verb can passivize. We first characterize English speakers' judgments of exceptions to the passive, confirming that speakers find some verbs more passivizable than others. We then show that a neural network language model can learn restrictions to the passive that are similar to those displayed by humans, suggesting that evidence for these exceptions is available in the linguistic input. We test the causal role of two hypotheses for how the language model learns these restrictions by training models on modified training corpora, which we create by altering the existing training corpora to remove features of the input implicated by each hypothesis. We find that while the frequency with which a verb appears in the passive significantly affects its passivizability, the semantics of the verb does not. This study highlights the utility of altering a language model's training data for answering questions where complete control over a learner's input is vital., Comment: Submitted to Journal of Memory and Language
- Published
- 2024
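The study above manipulates training corpora to test which evidence drives the learning of passive restrictions. The sketch below shows one hypothetical flavor of such a manipulation: filtering out passive uses of a target verb with a rough regular-expression heuristic. The regex and the filtering criterion are assumptions for illustration, not the paper's preprocessing pipeline.

```python
import re

def remove_passives(sentences, verb_past_participle):
    """Drop sentences in which the target verb appears in a passive
    construction ("be" + past participle), so a model trained on the
    filtered corpus never sees direct evidence of that verb passivizing."""
    passive = re.compile(
        rf"\b(is|are|was|were|been|being|be)\s+(\w+\s+)?{verb_past_participle}\b",
        re.IGNORECASE,
    )
    return [s for s in sentences if not passive.search(s)]

corpus = [
    "The meeting lasted one hour.",
    "One hour was lasted by the meeting.",   # the ungrammatical passive
    "The storm was followed by a calm.",
]
print(remove_passives(corpus, "lasted"))     # removes only the passive of "lasted"
```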
9. OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
- Author
-
Wang, Jikai, Su, Yi, Li, Juntao, Xia, Qingrong, Ye, Zi, Duan, Xinyu, Wang, Zhefeng, and Zhang, Min
- Subjects
Computer Science - Computation and Language - Abstract
Autoregressive language models demonstrate excellent performance in various scenarios. However, their inference efficiency is limited by the one-step-one-word generation mode, which has become a pressing problem recently as the models become increasingly larger. Speculative decoding employs a "draft and then verify" mechanism to allow multiple tokens to be generated in one step, realizing lossless acceleration. Existing methods mainly adopt fixed heuristic draft structures, which fail to adapt to different situations to maximize the acceptance length during verification. To alleviate this dilemma, we propose OPT-Tree, an algorithm to construct adaptive and scalable draft trees. It searches for the optimal tree structure that maximizes the mathematical expectation of the acceptance length in each decoding step. Experimental results reveal that OPT-Tree outperforms the existing draft structures and achieves a speed-up ratio of up to 3.2 compared with autoregressive decoding. If the draft model is powerful enough and the node budget is sufficient, it can generate more than ten tokens in a single step. Our code is available at https://github.com/Jikai0Wang/OPT-Tree.
- Published
- 2024
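The entry above constructs draft trees that maximize the expected acceptance length at each decoding step. The sketch below shows the underlying accounting under a common simplifying assumption: a draft node is accepted with its draft probability only if all its ancestors were accepted, so each node contributes its path probability to the expectation, and nodes can be added greedily by path probability. The interface and the greedy rule are illustrative simplifications, not the paper's exact algorithm.

```python
import heapq

def build_opt_tree(root_children_probs, expand, budget):
    """Greedily grow a draft tree that maximizes the expected number of
    accepted draft tokens under the independence assumption above.

    `expand(path)` must return candidate (token, prob) children for a path;
    this interface is a stand-in for querying the draft model.
    """
    # Priority queue of candidate nodes keyed by (negative) path probability.
    heap = [(-p, (tok,)) for tok, p in root_children_probs]
    heapq.heapify(heap)
    tree, expected_len = [], 0.0
    while heap and len(tree) < budget:
        neg_p, path = heapq.heappop(heap)
        path_prob = -neg_p
        tree.append(path)
        expected_len += path_prob          # node contributes its path probability
        for tok, p in expand(path):
            heapq.heappush(heap, (-path_prob * p, path + (tok,)))
    return tree, expected_len

# Toy draft distribution: every node proposes two children with probs 0.6/0.3.
toy_expand = lambda path: [("a", 0.6), ("b", 0.3)]
tree, e_len = build_opt_tree([("a", 0.6), ("b", 0.3)], toy_expand, budget=8)
print(len(tree), round(e_len, 3))
```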
10. Demonstration Augmentation for Zero-shot In-context Learning
- Author
-
Su, Yi, Tai, Yunpeng, Ji, Yixin, Li, Juntao, Yan, Bowen, and Zhang, Min
- Subjects
Computer Science - Computation and Language - Abstract
Large Language Models (LLMs) have demonstrated an impressive capability known as In-context Learning (ICL), which enables them to acquire knowledge from textual demonstrations without the need for parameter updates. However, many studies have highlighted that the model's performance is sensitive to the choice of demonstrations, presenting a significant challenge for practical applications where we lack prior knowledge of user queries. Consequently, we need to construct an extensive demonstration pool and incorporate external databases to assist the model, leading to considerable time and financial costs. In light of this, some recent research has shifted focus towards zero-shot ICL, aiming to reduce the model's reliance on external information by leveraging their inherent generative capabilities. Despite the effectiveness of these approaches, the content generated by the model may be unreliable, and the generation process is time-consuming. To address these issues, we propose Demonstration Augmentation for In-context Learning (DAIL), which employs the model's previously predicted historical samples as demonstrations for subsequent ones. DAIL brings no additional inference cost and does not rely on the model's generative capabilities. Our experiments reveal that DAIL can significantly improve the model's performance over direct zero-shot inference and can even outperform few-shot ICL without any external information., Comment: Accepted to ACL 2024 Findings
- Published
- 2024
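DAIL, described above, recycles the model's own past predictions as demonstrations so that zero-shot inference gradually becomes few-shot. Below is a minimal sketch of that loop, assuming a `model(prompt)` callable that returns a label; the prompt template, the "most recent k" selection rule, and the toy stand-in model are illustrative assumptions rather than the paper's exact procedure.

```python
def dail_predict(model, queries, k=4):
    """Demonstration-augmentation loop in the spirit of DAIL.

    Each prediction is appended to a history pool and reused as a
    pseudo-demonstration for later queries; no external data or extra
    generation step is needed.
    """
    history, outputs = [], []
    for query in queries:
        demos = history[-k:]  # most recent predicted samples as demonstrations
        prompt = "".join(f"Input: {q}\nLabel: {y}\n\n" for q, y in demos)
        prompt += f"Input: {query}\nLabel:"
        label = model(prompt)
        outputs.append(label)
        history.append((query, label))
    return outputs

# Tiny stand-in "model" for demonstration purposes only.
fake_model = lambda prompt: "positive" if "good" in prompt.split("Input:")[-1] else "negative"
print(dail_predict(fake_model, ["good movie", "bad plot", "good acting"]))
```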
11. OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
- Author
-
Qiao, Dan, Su, Yi, Wang, Pinzheng, Ye, Jing, Xie, Wenjing, Zhou, Yuechi, Ding, Yuyang, Tang, Zecheng, Wang, Jikai, Ji, Yixin, Wang, Yue, Guo, Pei, Sun, Zechen, Zhang, Zikang, Li, Juntao, Chao, Pingfu, Chen, Wenliang, Fu, Guohong, Zhou, Guodong, Zhu, Qiaoming, and Zhang, Min
- Subjects
Computer Science - Computation and Language - Abstract
Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived from multi-stage compression and continual pre-training from the original 15B OpenBA model. OpenBA-V2 utilizes more data, more flexible training objectives, and techniques such as layer pruning, neural pruning, and vocabulary pruning to achieve a compression rate of 77.3% with minimal performance loss. OpenBA-V2 demonstrates competitive performance compared to other open-source models of similar size, achieving results close to or on par with the 15B OpenBA model in downstream tasks such as common sense reasoning and Named Entity Recognition (NER). OpenBA-V2 illustrates that LLMs can be compressed into smaller ones with minimal performance loss by employing advanced training objectives and data strategies, which may help deploy LLMs in resource-limited scenarios.
- Published
- 2024
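As a quick sanity check on the headline number in the entry above, compressing a 15B-parameter model down to 3.4B parameters corresponds to the reported 77.3% compression ratio:

```python
original_params = 15.0e9    # OpenBA (15B), as stated in the abstract
compressed_params = 3.4e9   # OpenBA-V2 (3.4B)

compression_ratio = 1 - compressed_params / original_params
print(f"{compression_ratio:.1%}")  # 77.3%, matching the reported figure
```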
12. A Symphony of Poems and Pictures: Hwa-Jen Ho's Nonfiction Picturebooks about Wild Birds in Taiwan
- Author
-
Su, Yi-Ching
- Published
- 2018
13. AutoGFI: Streamlined Generalized Fiducial Inference for Modern Inference Problems
- Author
-
Du, Wei, Hannig, Jan, Lee, Thomas C. M., Su, Yi, and Zhang, Chunzhe
- Subjects
Statistics - Methodology - Abstract
The origins of fiducial inference trace back to the 1930s when R. A. Fisher first introduced the concept as a response to what he perceived as a limitation of Bayesian inference - the requirement for a subjective prior distribution on model parameters in cases where no prior information was available. However, Fisher's initial fiducial approach fell out of favor as complications arose, particularly in multi-parameter problems. In the wake of 2000, amidst a renewed interest in contemporary adaptations of fiducial inference, generalized fiducial inference (GFI) emerged to extend Fisher's fiducial argument, providing a promising avenue for addressing numerous crucial and practical inference challenges. Nevertheless, the adoption of GFI has been limited due to its often demanding mathematical derivations and the necessity for implementing complex Markov Chain Monte Carlo algorithms. This complexity has impeded its widespread utilization and practical applicability. This paper presents a significant advancement by introducing an innovative variant of GFI designed to alleviate these challenges. Specifically, this paper proposes AutoGFI, an easily implementable algorithm that streamlines the application of GFI to a broad spectrum of inference problems involving additive noise. AutoGFI can be readily implemented as long as a fitting routine is available, making it accessible to a broader audience of researchers and practitioners. To demonstrate its effectiveness, AutoGFI is applied to three contemporary and challenging problems: tensor regression, matrix completion, and regression with network cohesion. These case studies highlight the immense potential of GFI and illustrate AutoGFI's promising performance when compared to specialized solutions for these problems. Overall, this research paves the way for a more accessible and powerful application of GFI in a range of practical domains.
- Published
- 2024
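AutoGFI, as summarized above, only requires a fitting routine for an additive-noise model. The sketch below captures the spirit of that recipe: redraw plausible noise realizations, subtract them from the data, and refit to collect fiducial parameter samples. The Gaussian noise model, the toy least-squares example, and the overall loop are illustrative assumptions rather than the paper's exact algorithm.

```python
import numpy as np

def autogfi_samples(y, fit, noise_sigma, n_samples=200, seed=0):
    """Minimal sketch of the AutoGFI idea for y = f(theta) + eps:
    repeatedly redraw plausible noise realizations, remove them from the
    data, and rerun an off-the-shelf fitting routine to obtain a
    collection of fiducial parameter samples."""
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        eps_star = rng.normal(scale=noise_sigma, size=y.shape)
        samples.append(fit(y - eps_star))
    return np.array(samples)

# Toy linear model y = X @ beta + eps, fit by least squares.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + 0.3 * rng.normal(size=50)
fit_ls = lambda target: np.linalg.lstsq(X, target, rcond=None)[0]
fid = autogfi_samples(y, fit_ls, noise_sigma=0.3)
print(fid.mean(axis=0), fid.std(axis=0))   # fiducial point estimate and spread
```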
14. VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs
- Author
-
Gui, Yi, Li, Zhen, Wan, Yao, Shi, Yemin, Zhang, Hongyu, Su, Yi, Dong, Shaoling, Zhou, Xing, and Jiang, Wenbin
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Software Engineering - Abstract
Automatically generating UI code from webpage design visions can significantly alleviate the burden of developers, enabling beginner developers or designers to directly generate Web pages from design diagrams. Currently, prior research has accomplished the objective of generating UI code from rudimentary design visions or sketches through designing deep neural networks. Inspired by the groundbreaking advancements achieved by Multimodal Large Language Models (MLLMs), the automatic generation of UI code from high-fidelity design images is now emerging as a viable possibility. Nevertheless, our investigation reveals that existing MLLMs are hampered by the scarcity of authentic, high-quality, and large-scale datasets, leading to unsatisfactory performance in automated UI code generation. To mitigate this gap, we present a novel dataset, termed VISION2UI, extracted from real-world scenarios, augmented with comprehensive layout information, tailored specifically for finetuning MLLMs in UI code generation. Specifically, this dataset is derived through a series of operations, encompassing collecting, cleaning, and filtering of the open-source Common Crawl dataset. In order to uphold its quality, a neural scorer trained on labeled samples is utilized to refine the data, retaining higher-quality instances. Ultimately, this process yields a dataset comprising 2,000 (Much more is coming soon) parallel samples encompassing design visions and UI code. The dataset is available at https://huggingface.co/datasets/xcodemind/vision2ui.
- Published
- 2024
15. Broadband and fabrication-tolerant 3-dB couplers with topological valley edge modes
- Author
-
Tang, Guo-Jing, Chen, Xiao-Dong, Sun, Lu, Guo, Chao-Heng, Li, Meng-Yu, Tian, Zhong-Tao, Chen, Hou-Hong, Wang, Hong-Wei, Sun, Qi-Yao, Pan, Ying-Di, He, Xin-Tao, Su, Yi-Kai, and Dong, Jian-Wen
- Subjects
Physics - Optics ,Physics - Applied Physics - Abstract
3-dB couplers, which are commonly used in photonic integrated circuits for on-chip information processing, precision measurement, and quantum computing, face challenges in achieving robust performance due to their limited 3-dB bandwidths and sensitivity to fabrication errors. To address this, we introduce topological physics to nanophotonics, developing a framework for topological 3-dB couplers. These couplers exhibit broad working wavelength range and robustness against fabrication dimensional errors. By leveraging valley-Hall topology and mirror symmetry, the photonic-crystal-slab couplers achieve ideal 3-dB splitting characterized by a wavelength-insensitive scattering matrix. Tolerance analysis confirms the superiority on broad bandwidth of 48 nm and robust splitting against dimensional errors of 20 nm. We further propose a topological interferometer for on-chip distance measurement, which also exhibits robustness against dimensional errors. This extension of topological principles to the fields of interferometers, may open up new possibilities for constructing robust wavelength division multiplexing, temperature-drift-insensitive sensing, and optical coherence tomography applications., Comment: 20 pages, 4 figures
- Published
- 2024
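The coupler work above targets an ideal, wavelength-insensitive 3-dB scattering matrix. For reference, the snippet below writes down the conventional symmetric form of such a matrix and checks that it is unitary and splits power exactly 50/50; the specific matrix is a textbook assumption, not extracted from the paper.

```python
import numpy as np

# Ideal, wavelength-insensitive 2x2 scattering matrix of a 3-dB coupler:
# equal 50/50 power splitting with a 90-degree relative phase.
S = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

print(np.allclose(S @ S.conj().T, np.eye(2)))   # unitary (lossless)
powers = np.abs(S @ np.array([1, 0])) ** 2      # launch light into port 1
print(powers)                                   # [0.5 0.5] -> exact 3-dB split
```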
16. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
- Author
-
Gemini Team, Georgiev, Petko, Lei, Ving Ian, Burnell, Ryan, Bai, Libin, Gulati, Anmol, Tanzer, Garrett, Vincent, Damien, Pan, Zhufeng, Wang, Shibo, Mariooryad, Soroosh, Ding, Yifan, Geng, Xinyang, Alcober, Fred, Frostig, Roy, Omernick, Mark, Walker, Lexi, Paduraru, Cosmin, Sorokin, Christina, Tacchetti, Andrea, Gaffney, Colin, Daruki, Samira, Sercinoglu, Olcan, Gleicher, Zach, Love, Juliette, Voigtlaender, Paul, Jain, Rohan, Surita, Gabriela, Mohamed, Kareem, Blevins, Rory, Ahn, Junwhan, Zhu, Tao, Kawintiranon, Kornraphop, Firat, Orhan, Gu, Yiming, Zhang, Yujing, Rahtz, Matthew, Faruqui, Manaal, Clay, Natalie, Gilmer, Justin, Co-Reyes, JD, Penchev, Ivo, Zhu, Rui, Morioka, Nobuyuki, Hui, Kevin, Haridasan, Krishna, Campos, Victor, Mahdieh, Mahdis, Guo, Mandy, Hassan, Samer, Kilgour, Kevin, Vezer, Arpi, Cheng, Heng-Tze, de Liedekerke, Raoul, Goyal, Siddharth, Barham, Paul, Strouse, DJ, Noury, Seb, Adler, Jonas, Sundararajan, Mukund, Vikram, Sharad, Lepikhin, Dmitry, Paganini, Michela, Garcia, Xavier, Yang, Fan, Valter, Dasha, Trebacz, Maja, Vodrahalli, Kiran, Asawaroengchai, Chulayuth, Ring, Roman, Kalb, Norbert, Soares, Livio Baldini, Brahma, Siddhartha, Steiner, David, Yu, Tianhe, Mentzer, Fabian, He, Antoine, Gonzalez, Lucas, Xu, Bibo, Kaufman, Raphael Lopez, Shafey, Laurent El, Oh, Junhyuk, Hennigan, Tom, Driessche, George van den, Odoom, Seth, Lucic, Mario, Roelofs, Becca, Lall, Sid, Marathe, Amit, Chan, Betty, Ontanon, Santiago, He, Luheng, Teplyashin, Denis, Lai, Jonathan, Crone, Phil, Damoc, Bogdan, Ho, Lewis, Riedel, Sebastian, Lenc, Karel, Yeh, Chih-Kuan, Chowdhery, Aakanksha, Xu, Yang, Kazemi, Mehran, Amid, Ehsan, Petrushkina, Anastasia, Swersky, Kevin, Khodaei, Ali, Chen, Gowoon, Larkin, Chris, Pinto, Mario, Yan, Geng, Badia, Adria Puigdomenech, Patil, Piyush, Hansen, Steven, Orr, Dave, Arnold, Sebastien M. 
R., Grimstad, Jordan, Dai, Andrew, Douglas, Sholto, Sinha, Rishika, Yadav, Vikas, Chen, Xi, Gribovskaya, Elena, Austin, Jacob, Zhao, Jeffrey, Patel, Kaushal, Komarek, Paul, Austin, Sophia, Borgeaud, Sebastian, Friso, Linda, Goyal, Abhimanyu, Caine, Ben, Cao, Kris, Chung, Da-Woon, Lamm, Matthew, Barth-Maron, Gabe, Kagohara, Thais, Olszewska, Kate, Chen, Mia, Shivakumar, Kaushik, Agarwal, Rishabh, Godhia, Harshal, Rajwar, Ravi, Snaider, Javier, Dotiwalla, Xerxes, Liu, Yuan, Barua, Aditya, Ungureanu, Victor, Zhang, Yuan, Batsaikhan, Bat-Orgil, Wirth, Mateo, Qin, James, Danihelka, Ivo, Doshi, Tulsee, Chadwick, Martin, Chen, Jilin, Jain, Sanil, Le, Quoc, Kar, Arjun, Gurumurthy, Madhu, Li, Cheng, Sang, Ruoxin, Liu, Fangyu, Lamprou, Lampros, Munoz, Rich, Lintz, Nathan, Mehta, Harsh, Howard, Heidi, Reynolds, Malcolm, Aroyo, Lora, Wang, Quan, Blanco, Lorenzo, Cassirer, Albin, Griffith, Jordan, Das, Dipanjan, Lee, Stephan, Sygnowski, Jakub, Fisher, Zach, Besley, James, Powell, Richard, Ahmed, Zafarali, Paulus, Dominik, Reitter, David, Borsos, Zalan, Joshi, Rishabh, Pope, Aedan, Hand, Steven, Selo, Vittorio, Jain, Vihan, Sethi, Nikhil, Goel, Megha, Makino, Takaki, May, Rhys, Yang, Zhen, Schalkwyk, Johan, Butterfield, Christina, Hauth, Anja, Goldin, Alex, Hawkins, Will, Senter, Evan, Brin, Sergey, Woodman, Oliver, Ritter, Marvin, Noland, Eric, Giang, Minh, Bolina, Vijay, Lee, Lisa, Blyth, Tim, Mackinnon, Ian, Reid, Machel, Sarvana, Obaid, Silver, David, Chen, Alexander, Wang, Lily, Maggiore, Loren, Chang, Oscar, Attaluri, Nithya, Thornton, Gregory, Chiu, Chung-Cheng, Bunyan, Oskar, Levine, Nir, Chung, Timothy, Eltyshev, Evgenii, Si, Xiance, Lillicrap, Timothy, Brady, Demetra, Aggarwal, Vaibhav, Wu, Boxi, Xu, Yuanzhong, McIlroy, Ross, Badola, Kartikeya, Sandhu, Paramjit, Moreira, Erica, Stokowiec, Wojciech, Hemsley, Ross, Li, Dong, Tudor, Alex, Shyam, Pranav, Rahimtoroghi, Elahe, Haykal, Salem, Sprechmann, Pablo, Zhou, Xiang, Mincu, Diana, Li, Yujia, Addanki, Ravi, Krishna, Kalpesh, Wu, Xiao, Frechette, Alexandre, Eyal, Matan, Dafoe, Allan, Lacey, Dave, Whang, Jay, Avrahami, Thi, Zhang, Ye, Taropa, Emanuel, Lin, Hanzhao, Toyama, Daniel, Rutherford, Eliza, Sano, Motoki, Choe, HyunJeong, Tomala, Alex, Safranek-Shrader, Chalence, Kassner, Nora, Pajarskas, Mantas, Harvey, Matt, Sechrist, Sean, Fortunato, Meire, Lyu, Christina, Elsayed, Gamaleldin, Kuang, Chenkai, Lottes, James, Chu, Eric, Jia, Chao, Chen, Chih-Wei, Humphreys, Peter, Baumli, Kate, Tao, Connie, Samuel, Rajkumar, Santos, Cicero Nogueira dos, Andreassen, Anders, Rakićević, Nemanja, Grewe, Dominik, Kumar, Aviral, Winkler, Stephanie, Caton, Jonathan, Brock, Andrew, Dalmia, Sid, Sheahan, Hannah, Barr, Iain, Miao, Yingjie, Natsev, Paul, Devlin, Jacob, Behbahani, Feryal, Prost, Flavien, Sun, Yanhua, Myaskovsky, Artiom, Pillai, Thanumalayan Sankaranarayana, Hurt, Dan, Lazaridou, Angeliki, Xiong, Xi, Zheng, Ce, Pardo, Fabio, Li, Xiaowei, Horgan, Dan, Stanton, Joe, Ambar, Moran, Xia, Fei, Lince, Alejandro, Wang, Mingqiu, Mustafa, Basil, Webson, Albert, Lee, Hyo, Anil, Rohan, Wicke, Martin, Dozat, Timothy, Sinha, Abhishek, Piqueras, Enrique, Dabir, Elahe, Upadhyay, Shyam, Boral, Anudhyan, Hendricks, Lisa Anne, Fry, Corey, Djolonga, Josip, Su, Yi, Walker, Jake, Labanowski, Jane, Huang, Ronny, Misra, Vedant, Chen, Jeremy, Skerry-Ryan, RJ, Singh, Avi, Rijhwani, Shruti, Yu, Dian, Castro-Ros, Alex, Changpinyo, Beer, Datta, Romina, Bagri, Sumit, Hrafnkelsson, Arnar Mar, Maggioni, Marcello, Zheng, Daniel, Sulsky, Yury, Hou, Shaobo, Paine, Tom Le, Yang, 
Antoine, Riesa, Jason, Rogozinska, Dominika, Marcus, Dror, Badawy, Dalia El, Zhang, Qiao, Wang, Luyu, Miller, Helen, Greer, Jeremy, Sjos, Lars Lowe, Nova, Azade, Zen, Heiga, Chaabouni, Rahma, Rosca, Mihaela, Jiang, Jiepu, Chen, Charlie, Liu, Ruibo, Sainath, Tara, Krikun, Maxim, Polozov, Alex, Lespiau, Jean-Baptiste, Newlan, Josh, Cankara, Zeyncep, Kwak, Soo, Xu, Yunhan, Chen, Phil, Coenen, Andy, Meyer, Clemens, Tsihlas, Katerina, Ma, Ada, Gottweis, Juraj, Xing, Jinwei, Gu, Chenjie, Miao, Jin, Frank, Christian, Cankara, Zeynep, Ganapathy, Sanjay, Dasgupta, Ishita, Hughes-Fitt, Steph, Chen, Heng, Reid, David, Rong, Keran, Fan, Hongmin, van Amersfoort, Joost, Zhuang, Vincent, Cohen, Aaron, Gu, Shixiang Shane, Mohananey, Anhad, Ilic, Anastasija, Tobin, Taylor, Wieting, John, Bortsova, Anna, Thacker, Phoebe, Wang, Emma, Caveness, Emily, Chiu, Justin, Sezener, Eren, Kaskasoli, Alex, Baker, Steven, Millican, Katie, Elhawaty, Mohamed, Aisopos, Kostas, Lebsack, Carl, Byrd, Nathan, Dai, Hanjun, Jia, Wenhao, Wiethoff, Matthew, Davoodi, Elnaz, Weston, Albert, Yagati, Lakshman, Ahuja, Arun, Gao, Isabel, Pundak, Golan, Zhang, Susan, Azzam, Michael, Sim, Khe Chai, Caelles, Sergi, Keeling, James, Sharma, Abhanshu, Swing, Andy, Li, YaGuang, Liu, Chenxi, Bostock, Carrie Grimes, Bansal, Yamini, Nado, Zachary, Anand, Ankesh, Lipschultz, Josh, Karmarkar, Abhijit, Proleev, Lev, Ittycheriah, Abe, Yeganeh, Soheil Hassas, Polovets, George, Faust, Aleksandra, Sun, Jiao, Rrustemi, Alban, Li, Pen, Shivanna, Rakesh, Liu, Jeremiah, Welty, Chris, Lebron, Federico, Baddepudi, Anirudh, Krause, Sebastian, Parisotto, Emilio, Soricut, Radu, Xu, Zheng, Bloxwich, Dawn, Johnson, Melvin, Neyshabur, Behnam, Mao-Jones, Justin, Wang, Renshen, Ramasesh, Vinay, Abbas, Zaheer, Guez, Arthur, Segal, Constant, Nguyen, Duc Dung, Svensson, James, Hou, Le, York, Sarah, Milan, Kieran, Bridgers, Sophie, Gworek, Wiktor, Tagliasacchi, Marco, Lee-Thorp, James, Chang, Michael, Guseynov, Alexey, Hartman, Ale Jakse, Kwong, Michael, Zhao, Ruizhe, Kashem, Sheleem, Cole, Elizabeth, Miech, Antoine, Tanburn, Richard, Phuong, Mary, Pavetic, Filip, Cevey, Sebastien, Comanescu, Ramona, Ives, Richard, Yang, Sherry, Du, Cosmo, Li, Bo, Zhang, Zizhao, Iinuma, Mariko, Hu, Clara Huiyi, Roy, Aurko, Bijwadia, Shaan, Zhu, Zhenkai, Martins, Danilo, Saputro, Rachel, Gergely, Anita, Zheng, Steven, Jia, Dawei, Antonoglou, Ioannis, Sadovsky, Adam, Gu, Shane, Bi, Yingying, Andreev, Alek, Samangooei, Sina, Khan, Mina, Kocisky, Tomas, Filos, Angelos, Kumar, Chintu, Bishop, Colton, Yu, Adams, Hodkinson, Sarah, Mittal, Sid, Shah, Premal, Moufarek, Alexandre, Cheng, Yong, Bloniarz, Adam, Lee, Jaehoon, Pejman, Pedram, Michel, Paul, Spencer, Stephen, Feinberg, Vladimir, Xiong, Xuehan, Savinov, Nikolay, Smith, Charlotte, Shakeri, Siamak, Tran, Dustin, Chesus, Mary, Bohnet, Bernd, Tucker, George, von Glehn, Tamara, Muir, Carrie, Mao, Yiran, Kazawa, Hideto, Slone, Ambrose, Soparkar, Kedar, Shrivastava, Disha, Cobon-Kerr, James, Sharman, Michael, Pavagadhi, Jay, Araya, Carlos, Misiunas, Karolis, Ghelani, Nimesh, Laskin, Michael, Barker, David, Li, Qiujia, Briukhov, Anton, Houlsby, Neil, Glaese, Mia, Lakshminarayanan, Balaji, Schucher, Nathan, Tang, Yunhao, Collins, Eli, Lim, Hyeontaek, Feng, Fangxiaoyu, Recasens, Adria, Lai, Guangda, Magni, Alberto, De Cao, Nicola, Siddhant, Aditya, Ashwood, Zoe, Orbay, Jordi, Dehghani, Mostafa, Brennan, Jenny, He, Yifan, Xu, Kelvin, Gao, Yang, Saroufim, Carl, Molloy, James, Wu, Xinyi, Arnold, Seb, Chang, Solomon, Schrittwieser, Julian, 
Buchatskaya, Elena, Radpour, Soroush, Polacek, Martin, Giordano, Skye, Bapna, Ankur, Tokumine, Simon, Hellendoorn, Vincent, Sottiaux, Thibault, Cogan, Sarah, Severyn, Aliaksei, Saleh, Mohammad, Thakoor, Shantanu, Shefey, Laurent, Qiao, Siyuan, Gaba, Meenu, Chang, Shuo-yiin, Swanson, Craig, Zhang, Biao, Lee, Benjamin, Rubenstein, Paul Kishan, Song, Gan, Kwiatkowski, Tom, Koop, Anna, Kannan, Ajay, Kao, David, Schuh, Parker, Stjerngren, Axel, Ghiasi, Golnaz, Gibson, Gena, Vilnis, Luke, Yuan, Ye, Ferreira, Felipe Tiengo, Kamath, Aishwarya, Klimenko, Ted, Franko, Ken, Xiao, Kefan, Bhattacharya, Indro, Patel, Miteyan, Wang, Rui, Morris, Alex, Strudel, Robin, Sharma, Vivek, Choy, Peter, Hashemi, Sayed Hadi, Landon, Jessica, Finkelstein, Mara, Jhakra, Priya, Frye, Justin, Barnes, Megan, Mauger, Matthew, Daun, Dennis, Baatarsukh, Khuslen, Tung, Matthew, Farhan, Wael, Michalewski, Henryk, Viola, Fabio, Quitry, Felix de Chaumont, Lan, Charline Le, Hudson, Tom, Wang, Qingze, Fischer, Felix, Zheng, Ivy, White, Elspeth, Dragan, Anca, Alayrac, Jean-baptiste, Ni, Eric, Pritzel, Alexander, Iwanicki, Adam, Isard, Michael, Bulanova, Anna, Zilka, Lukas, Dyer, Ethan, Sachan, Devendra, Srinivasan, Srivatsan, Muckenhirn, Hannah, Cai, Honglong, Mandhane, Amol, Tariq, Mukarram, Rae, Jack W., Wang, Gary, Ayoub, Kareem, FitzGerald, Nicholas, Zhao, Yao, Han, Woohyun, Alberti, Chris, Garrette, Dan, Krishnakumar, Kashyap, Gimenez, Mai, Levskaya, Anselm, Sohn, Daniel, Matak, Josip, Iturrate, Inaki, Chang, Michael B., Xiang, Jackie, Cao, Yuan, Ranka, Nishant, Brown, Geoff, Hutter, Adrian, Mirrokni, Vahab, Chen, Nanxin, Yao, Kaisheng, Egyed, Zoltan, Galilee, Francois, Liechty, Tyler, Kallakuri, Praveen, Palmer, Evan, Ghemawat, Sanjay, Liu, Jasmine, Tao, David, Thornton, Chloe, Green, Tim, Jasarevic, Mimi, Lin, Sharon, Cotruta, Victor, Tan, Yi-Xuan, Fiedel, Noah, Yu, Hongkun, Chi, Ed, Neitz, Alexander, Heitkaemper, Jens, Sinha, Anu, Zhou, Denny, Sun, Yi, Kaed, Charbel, Hulse, Brice, Mishra, Swaroop, Georgaki, Maria, Kudugunta, Sneha, Farabet, Clement, Shafran, Izhak, Vlasic, Daniel, Tsitsulin, Anton, Ananthanarayanan, Rajagopal, Carin, Alen, Su, Guolong, Sun, Pei, V, Shashank, Carvajal, Gabriel, Broder, Josef, Comsa, Iulia, Repina, Alena, Wong, William, Chen, Warren Weilun, Hawkins, Peter, Filonov, Egor, Loher, Lucia, Hirnschall, Christoph, Wang, Weiyi, Ye, Jingchen, Burns, Andrea, Cate, Hardie, Wright, Diana Gage, Piccinini, Federico, Zhang, Lei, Lin, Chu-Cheng, Gog, Ionel, Kulizhskaya, Yana, Sreevatsa, Ashwin, Song, Shuang, Cobo, Luis C., Iyer, Anand, Tekur, Chetan, Garrido, Guillermo, Xiao, Zhuyun, Kemp, Rupert, Zheng, Huaixiu Steven, Li, Hui, Agarwal, Ananth, Ngani, Christel, Goshvadi, Kati, Santamaria-Fernandez, Rebeca, Fica, Wojciech, Chen, Xinyun, Gorgolewski, Chris, Sun, Sean, Garg, Roopal, Ye, Xinyu, Eslami, S. M. 
Ali, Hua, Nan, Simon, Jon, Joshi, Pratik, Kim, Yelin, Tenney, Ian, Potluri, Sahitya, Thiet, Lam Nguyen, Yuan, Quan, Luisier, Florian, Chronopoulou, Alexandra, Scellato, Salvatore, Srinivasan, Praveen, Chen, Minmin, Koverkathu, Vinod, Dalibard, Valentin, Xu, Yaming, Saeta, Brennan, Anderson, Keith, Sellam, Thibault, Fernando, Nick, Huot, Fantine, Jung, Junehyuk, Varadarajan, Mani, Quinn, Michael, Raul, Amit, Le, Maigo, Habalov, Ruslan, Clark, Jon, Jalan, Komal, Bullard, Kalesha, Singhal, Achintya, Luong, Thang, Wang, Boyu, Rajayogam, Sujeevan, Eisenschlos, Julian, Jia, Johnson, Finchelstein, Daniel, Yakubovich, Alex, Balle, Daniel, Fink, Michael, Agarwal, Sameer, Li, Jing, Dvijotham, Dj, Pal, Shalini, Kang, Kai, Konzelmann, Jaclyn, Beattie, Jennifer, Dousse, Olivier, Wu, Diane, Crocker, Remi, Elkind, Chen, Jonnalagadda, Siddhartha Reddy, Lee, Jong, Holtmann-Rice, Dan, Kallarackal, Krystal, Liu, Rosanne, Vnukov, Denis, Vats, Neera, Invernizzi, Luca, Jafari, Mohsen, Zhou, Huanjie, Taylor, Lilly, Prendki, Jennifer, Wu, Marcus, Eccles, Tom, Liu, Tianqi, Kopparapu, Kavya, Beaufays, Francoise, Angermueller, Christof, Marzoca, Andreea, Sarcar, Shourya, Dib, Hilal, Stanway, Jeff, Perbet, Frank, Trdin, Nejc, Sterneck, Rachel, Khorlin, Andrey, Li, Dinghua, Wu, Xihui, Goenka, Sonam, Madras, David, Goldshtein, Sasha, Gierke, Willi, Zhou, Tong, Liu, Yaxin, Liang, Yannie, White, Anais, Li, Yunjie, Singh, Shreya, Bahargam, Sanaz, Epstein, Mark, Basu, Sujoy, Lao, Li, Ozturel, Adnan, Crous, Carl, Zhai, Alex, Lu, Han, Tung, Zora, Gaur, Neeraj, Walton, Alanna, Dixon, Lucas, Zhang, Ming, Globerson, Amir, Uy, Grant, Bolt, Andrew, Wiles, Olivia, Nasr, Milad, Shumailov, Ilia, Selvi, Marco, Piccinno, Francesco, Aguilar, Ricardo, McCarthy, Sara, Khalman, Misha, Shukla, Mrinal, Galic, Vlado, Carpenter, John, Villela, Kevin, Zhang, Haibin, Richardson, Harry, Martens, James, Bosnjak, Matko, Belle, Shreyas Rammohan, Seibert, Jeff, Alnahlawi, Mahmoud, McWilliams, Brian, Singh, Sankalp, Louis, Annie, Ding, Wen, Popovici, Dan, Simicich, Lenin, Knight, Laura, Mehta, Pulkit, Gupta, Nishesh, Shi, Chongyang, Fatehi, Saaber, Mitrovic, Jovana, Grills, Alex, Pagadora, Joseph, Petrova, Dessie, Eisenbud, Danielle, Zhang, Zhishuai, Yates, Damion, Mittal, Bhavishya, Tripuraneni, Nilesh, Assael, Yannis, Brovelli, Thomas, Jain, Prateek, Velimirovic, Mihajlo, Akbulut, Canfer, Mu, Jiaqi, Macherey, Wolfgang, Kumar, Ravin, Xu, Jun, Qureshi, Haroon, Comanici, Gheorghe, Wiesner, Jeremy, Gong, Zhitao, Ruddock, Anton, Bauer, Matthias, Felt, Nick, GP, Anirudh, Arnab, Anurag, Zelle, Dustin, Rothfuss, Jonas, Rosgen, Bill, Shenoy, Ashish, Seybold, Bryan, Li, Xinjian, Mudigonda, Jayaram, Erdogan, Goker, Xia, Jiawei, Simsa, Jiri, Michi, Andrea, Yao, Yi, Yew, Christopher, Kan, Steven, Caswell, Isaac, Radebaugh, Carey, Elisseeff, Andre, Valenzuela, Pedro, McKinney, Kay, Paterson, Kim, Cui, Albert, Latorre-Chimoto, Eri, Kim, Solomon, Zeng, William, Durden, Ken, Ponnapalli, Priya, Sosea, Tiberiu, Choquette-Choo, Christopher A., Manyika, James, Robenek, Brona, Vashisht, Harsha, Pereira, Sebastien, Lam, Hoi, Velic, Marko, Owusu-Afriyie, Denese, Lee, Katherine, Bolukbasi, Tolga, Parrish, Alicia, Lu, Shawn, Park, Jane, Venkatraman, Balaji, Talbert, Alice, Rosique, Lambert, Cheng, Yuchung, Sozanschi, Andrei, Paszke, Adam, Kumar, Praveen, Austin, Jessica, Li, Lu, Salama, Khalid, Kim, Wooyeol, Dukkipati, Nandita, Baryshnikov, Anthony, Kaplanis, Christos, Sheng, XiangHai, Chervonyi, Yuri, Unlu, Caglar, Casas, Diego de Las, Askham, Harry, Tunyasuvunakool, 
Kathryn, Gimeno, Felix, Poder, Siim, Kwak, Chester, Miecnikowski, Matt, Dimitriev, Alek, Parisi, Aaron, Liu, Dangyi, Tsai, Tomy, Shevlane, Toby, Kouridi, Christina, Garmon, Drew, Goedeckemeyer, Adrian, Brown, Adam R., Vijayakumar, Anitha, Elqursh, Ali, Jazayeri, Sadegh, Huang, Jin, Carthy, Sara Mc, Hoover, Jay, Kim, Lucy, Kumar, Sandeep, Chen, Wei, Biles, Courtney, Bingham, Garrett, Rosen, Evan, Wang, Lisa, Tan, Qijun, Engel, David, Pongetti, Francesco, de Cesare, Dario, Hwang, Dongseong, Yu, Lily, Pullman, Jennifer, Narayanan, Srini, Levin, Kyle, Gopal, Siddharth, Li, Megan, Aharoni, Asaf, Trinh, Trieu, Lo, Jessica, Casagrande, Norman, Vij, Roopali, Matthey, Loic, Ramadhana, Bramandia, Matthews, Austin, Carey, CJ, Johnson, Matthew, Goranova, Kremena, Shah, Rohin, Ashraf, Shereen, Dasgupta, Kingshuk, Larsen, Rasmus, Wang, Yicheng, Vuyyuru, Manish Reddy, Jiang, Chong, Ijazi, Joana, Osawa, Kazuki, Smith, Celine, Boppana, Ramya Sree, Bilal, Taylan, Koizumi, Yuma, Xu, Ying, Altun, Yasemin, Shabat, Nir, Bariach, Ben, Korchemniy, Alex, Choo, Kiam, Ronneberger, Olaf, Iwuanyanwu, Chimezie, Zhao, Shubin, Soergel, David, Hsieh, Cho-Jui, Cai, Irene, Iqbal, Shariq, Sundermeyer, Martin, Chen, Zhe, Bursztein, Elie, Malaviya, Chaitanya, Biadsy, Fadi, Shroff, Prakash, Dhillon, Inderjit, Latkar, Tejasi, Dyer, Chris, Forbes, Hannah, Nicosia, Massimo, Nikolaev, Vitaly, Greene, Somer, Georgiev, Marin, Wang, Pidong, Martin, Nina, Sedghi, Hanie, Zhang, John, Banzal, Praseem, Fritz, Doug, Rao, Vikram, Wang, Xuezhi, Zhang, Jiageng, Patraucean, Viorica, Du, Dayou, Mordatch, Igor, Jurin, Ivan, Liu, Lewis, Dubey, Ayush, Mohan, Abhi, Nowakowski, Janek, Ion, Vlad-Doru, Wei, Nan, Tojo, Reiko, Raad, Maria Abi, Hudson, Drew A., Keshava, Vaishakh, Agrawal, Shubham, Ramirez, Kevin, Wu, Zhichun, Nguyen, Hoang, Liu, Ji, Sewak, Madhavi, Petrini, Bryce, Choi, DongHyun, Philips, Ivan, Wang, Ziyue, Bica, Ioana, Garg, Ankush, Wilkiewicz, Jarek, Agrawal, Priyanka, Guo, Danhao, Xue, Emily, Shaik, Naseer, Leach, Andrew, Khan, Sadh MNM, Wiesinger, Julia, Jerome, Sammy, Chakladar, Abhishek, Wang, Alek Wenjiao, Ornduff, Tina, Abu, Folake, Ghaffarkhah, Alireza, Wainwright, Marcus, Cortes, Mario, Liu, Frederick, Maynez, Joshua, Terzis, Andreas, Samangouei, Pouya, Mansour, Riham, Kępa, Tomasz, Aubet, François-Xavier, Algymr, Anton, Banica, Dan, Weisz, Agoston, Orban, Andras, Senges, Alexandre, Andrejczuk, Ewa, Geller, Mark, Santo, Niccolo Dal, Anklin, Valentin, Merey, Majd Al, Baeuml, Martin, Strohman, Trevor, Bai, Junwen, Petrov, Slav, Wu, Yonghui, Hassabis, Demis, Kavukcuoglu, Koray, Dean, Jeffrey, and Vinyals, Oriol
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
- Published
- 2024
17. Federated Deep Q-Learning and 5G load balancing
- Author
-
Lin, Hsin, Su, Yi-Kang, Chen, Hong-Qi, and Ko, La-Fei
- Subjects
Computer Science - Networking and Internet Architecture ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Computer Science - Multiagent Systems - Abstract
Despite advances in cellular network technology, base station (BS) load balancing remains a persistent problem. Although centralized resource allocation methods can address the load balancing problem, it still remains an NP-hard problem. In this research, we study how federated deep Q learning can be used to inform each user equipment (UE) of each BS's load conditions. Federated deep Q learning's load balancing enables intelligent UEs to independently select the best BS while also limiting the amount of private information exposed to the network. In this study, we propose and analyze a federated deep Q learning load balancing system, which is implemented using the Open-RAN xAPP framework and the near-Real Time Radio Interface Controller (near-RT RIC). Our simulation results indicate that compared to the maximum Signal-To-Noise-Ratio (MAX-SINR) method currently used by UEs, our proposed deep Q learning model can consistently provide a higher average UE quality of service, Comment: 5 pages, in Chinese language. 8 figures. Presented at 2022 Taiwan telecommunications annual symposium
- Published
- 2024
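The study above has each UE learn base-station selection with federated deep Q-learning. As a simplified, hedged illustration, the snippet below uses a tabular Q-learning update per UE plus a FedAvg-style averaging of Q-tables; the paper uses deep Q-networks and an O-RAN xApp deployment, so the state discretization and the aggregation rule here are assumptions.

```python
import numpy as np

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update; `state` indexes a discretized BS-load
    observation and `action` indexes the chosen base station."""
    td_target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])

def federated_average(q_tables):
    """Aggregate per-UE Q-tables by simple averaging, a FedAvg-style step
    that shares load knowledge without exposing raw observations."""
    return np.mean(q_tables, axis=0)

n_states, n_bs, n_ues = 4, 3, 5
local_q = [np.zeros((n_states, n_bs)) for _ in range(n_ues)]
q_update(local_q[0], state=2, action=1, reward=0.8, next_state=3)
global_q = federated_average(local_q)
print(global_q.shape)   # (4, 3): shared value table over load states and BS choices
```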
18. Online Feature Updates Improve Online (Generalized) Label Shift Adaptation
- Author
-
Wu, Ruihan, Datta, Siddhartha, Su, Yi, Baby, Dheeraj, Wang, Yu-Xiang, and Weinberger, Kilian Q.
- Subjects
Computer Science - Machine Learning - Abstract
This paper addresses the prevalent issue of label shift in an online setting with missing labels, where data distributions change over time and obtaining timely labels is challenging. While existing methods primarily focus on adjusting or updating the final layer of a pre-trained classifier, we explore the untapped potential of enhancing feature representations using unlabeled data at test-time. Our novel method, Online Label Shift adaptation with Online Feature Updates (OLS-OFU), leverages self-supervised learning to refine the feature extraction process, thereby improving the prediction model. Theoretical analyses confirm that OLS-OFU reduces algorithmic regret by capitalizing on self-supervised learning for feature refinement. Empirical studies on various datasets, under both online label shift and generalized label shift conditions, underscore the effectiveness and robustness of OLS-OFU, especially in cases of domain shifts.
- Published
- 2024
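OLS-OFU, summarized above, interleaves test-time self-supervised feature updates with an existing online label-shift correction. The skeleton below only shows that control flow; all four callables are no-op or toy stand-ins, and the real method would plug in a self-supervised objective for the feature update and an online reweighting rule for the shift correction.

```python
def ols_ofu_step(extract, classify, ssl_update, shift_update, x_batch):
    """One online round in the spirit of OLS-OFU: first refine the feature
    extractor with self-supervised learning on the unlabeled batch, then
    apply an existing online label-shift correction on top of the refined
    features. All callables are illustrative stand-ins."""
    extract = ssl_update(extract, x_batch)          # test-time feature update
    preds = [classify(extract(x)) for x in x_batch]
    classify = shift_update(classify, preds)        # e.g. re-estimated class priors
    return extract, classify, preds

# Trivial stand-ins just to exercise the control flow.
extract = lambda x: x
classify = lambda f: int(f > 0)
ssl_update = lambda ext, batch: ext                 # no-op "self-supervised" update
shift_update = lambda clf, preds: clf               # no-op reweighting
extract, classify, preds = ols_ofu_step(extract, classify, ssl_update, shift_update, [-1.0, 2.0])
print(preds)  # [0, 1]
```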
19. Comparative analysis of hybrid transumbilical and anal laparoscopic pull-through versus totally transanal laparoscopic assisted pull-through for common type Hirschsprung’s disease
- Author
-
Huang, Guizhen, Sun, Chi, He, Chaosheng, Xu, Weili, Su, Yi, and Li, Suolin
- Published
- 2024
20. Fluoxetine promotes the recovery of dysphagia and improves nutritional status and neurotrophic status in dysphagia patients after acute ischemic stroke
- Author
-
Su, Yi, Hao, Youguo, Zeng, Xianjing, and Li, Jing
- Published
- 2024
21. Investigation of Chemical Species in the Corrosion Barrier Layer on Thermal CO2 Treated AZ91D Magnesium Alloy with Lithium Nitrate
- Author
-
Su, Yi-Feng, Jang, Gyoung Gug, Wade, IV, John E., and Jun, Jiheon
- Published
- 2024
22. Enhanced wear resistance of LDED 316L stainless steel fabricated by in-situ ultrasonic rolling
- Author
-
Su, Yi-gui, Liu, Guan, Pi, Xu-yu, Wen, Dong-xu, Liu, De-fu, and Lin, Yong-cheng
- Published
- 2024
23. Enhanced electrochemical corrosion resistance of 316L stainless steel manufactured by ultrasonic rolling assisted laser directed energy deposition
- Author
-
Liu, Guan, Su, Yi-gui, Pi, Xu-yu, Wen, Dong-xu, Liu, De-fu, and Lin, Yong-cheng
- Published
- 2024
24. Dynamic characterization and optimization of moving platforms for enhancing precision in semiconductor point testing equipment
- Author
-
Chan, Tzu-Chi, Fan, Su-Yi, Ullah, Aman, and Farooq, Umar
- Published
- 2024
25. Synergistic enhancement effect of multi-dimensional nanomaterials on high-damping polyurethane
- Author
-
Su, Yi, Chen, Yuying, Zhang, Hengyuan, Liu, Shaobo, and Guo, Peng
- Published
- 2024
26. Fabrication of Ni–Zn spinel ferrite with superior magnetic performance from electric arc furnace dust and iron scale
- Author
-
Li, Yang, Jie, Wei-zhe, Chen, Chang, Su, Yi, Zhou, Wu, Zhang, Hua, and Ni, Hong-wei
- Published
- 2024
27. High-temperature superconductivity with zero resistance and strange-metal behaviour in La3Ni2O7−δ
- Author
-
Zhang, Yanan, Su, Dajun, Huang, Yanen, Shan, Zhaoyang, Sun, Hualei, Huo, Mengwu, Ye, Kaixin, Zhang, Jiawen, Yang, Zihan, Xu, Yongkang, Su, Yi, Li, Rui, Smidman, Michael, Wang, Meng, Jiao, Lin, and Yuan, Huiqiu
- Published
- 2024
28. Molecular docking-aided AIEgen design: concept, synthesis and applications
- Author
-
Zhang, Jian-Qing, Xu, Xiao-Yu, Liu, Fu-Sheng, Cao, Shu-Qiang, Gui, Yu-Xin, Su, Yi-Wen, He, Xiao-Yu, Liang, Ji-Yuan, and Zou, You-Quan
- Published
- 2024
29. Local stability of glued laminated bamboo columns with box sections under axial compression
- Author
-
Su, Yi and Zou, Jun
- Published
- 2024
30. When and How Knowledge Hiding Motivates Perpetrators' Organizational Citizenship Behavior
- Author
-
Pan, Wei, Lua, Egan, Yang, Zaoli, and Su, Yi
- Published
- 2024
31. A Human-Algorithm Integration System for Hip Fracture Detection on Plain Radiography: System Development and Validation Study
- Author
-
Cheng, Chi-Tung, Chen, Chih-Chi, Cheng, Fu-Jen, Chen, Huan-Wu, Su, Yi-Siang, Yeh, Chun-Nan, Chung, I-Fang, and Liao, Chien-Hung
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 - Abstract
Background: Hip fracture is the most common type of fracture in elderly individuals. Numerous deep learning (DL) algorithms for plain pelvic radiographs (PXRs) have been applied to improve the accuracy of hip fracture diagnosis. However, their efficacy is still undetermined. Objective: The objective of this study is to develop and validate a human-algorithm integration (HAI) system to improve the accuracy of hip fracture diagnosis in a real clinical environment. Methods: The HAI system with hip fracture detection ability was developed using a deep learning algorithm trained on trauma registry data and 3605 PXRs from August 2008 to December 2016. To compare their diagnostic performance before and after HAI system assistance using an independent testing dataset, 34 physicians were recruited. We analyzed the physicians’ accuracy, sensitivity, specificity, and agreement with the algorithm; we also performed subgroup analyses according to physician specialty and experience. Furthermore, we applied the HAI system in the emergency departments of different hospitals to validate its value in the real world. Results: With the support of the algorithm, which achieved 91% accuracy, the diagnostic performance of physicians was significantly improved in the independent testing dataset, as was revealed by the sensitivity (physician alone, median 95%; HAI, median 99%; P
- Published
- 2020
32. Postcards
- Author
-
Barger, Bettie Parsons, Montañés-Lleras, Andrés, Su, Yi-Ching, Vorster, Magdel, Wee, Jongsun, Harde, Roxanne, Onmuş, İpek, and Bukhina, Olga
- Published
- 2017
33. Flortaucipir tau PET findings from former professional and college American football players in the DIAGNOSE CTE research project.
- Author
-
Su, Yi, Protas, Hillary, Luo, Ji, Chen, Kewei, Alosco, Michael, Adler, Charles, Balcer, Laura, Bernick, Charles, Au, Rhoda, Banks, Sarah, Barr, William, Coleman, Michael, Dodick, David, Katz, Douglas, Marek, Kenneth, McClean, Michael, McKee, Ann, Mez, Jesse, Daneshvar, Daniel, Palmisano, Joseph, Peskind, Elaine, Turner, Robert, Wethe, Jennifer, Johnson, Keith, Tripodis, Yorghos, Cummings, Jeffrey, Shenton, Martha, Stern, Robert, Reiman, Eric, and Rabinovici, Gil
- Subjects
CTE ,PET ,Tau ,flortaucipir ,football ,Male ,Humans ,Middle Aged ,Chronic Traumatic Encephalopathy ,Football ,tau Proteins ,Positron-Emission Tomography ,Brain Injuries ,Traumatic ,Carbolines - Abstract
INTRODUCTION: Tau is a key pathology in chronic traumatic encephalopathy (CTE). Here, we report our findings in tau positron emission tomography (PET) measurements from the DIAGNOSE CTE Research Project. METHOD: We compare flortaucipir PET measures from 104 former professional players (PRO), 58 former college football players (COL), and 56 same-age men without exposure to repetitive head impacts (RHI) or traumatic brain injury (unexposed [UE]); characterize their associations with RHI exposure; and compare players who did or did not meet diagnostic criteria for traumatic encephalopathy syndrome (TES). RESULTS: Significantly elevated flortaucipir uptake was observed in former football players (PRO+COL) in prespecified regions (p
- Published
- 2024
34. Recommendations and guidelines of integrative medicine for COVID-19 care: The APEC project outcome.
- Author
-
Jia, Libin, Beidelschies, Michelle, Evans, Joel, Niemtzow, Richard, Niemtzow, Songxuan, Dusek, Jeffery, Lin, Yufang, Wu, Charles, Su, Yi-Chang, Wang, C, Lin, Chien-Yu, Astana, Peristiwan, Ardiyanto, Danang, Hardjoutomo, Rusmiyati, Visithanon, Khwanchai, Puagkong, Jagravudh, Chokpaisarn, Julalak, Lopez, Martha, Yotsuyanagi, Hiroshi, Lee, Myeong, Ramirez, Hernan, Bobadilla, Cecilia, Quinteros, Elizabeth, Galanti de la Paz, Monica, and Maramba-Lazarte, Cecilia
- Subjects
COVID-19 care ,Guidelines ,Integrative medicine - Abstract
This article - Recommendations and Guidelines of Integrative Medicine (IM) for COVID-19 Care - was one of the outcomes from an Asia-Pacific Economic Cooperation (APEC) Project (Integrative Medicine (IM) and COVID-19 Care) during the time between May 2022 and March 2023. With the efforts of care providers, researchers, health policy makers, and healthcare administrative leaders among APEC economies, the purpose of this article was to provide comprehensive IM systems for COVID-19 care as recommendations and suggestive guidelines, including care methods, tools, procedures, symptom conditions, target selections, and points that need to be considered during care applications. All cited COVID-19 care practices have confirmed their efficacy and usefulness, whether used alone or combined with conventional medicine. This article provides current useful medical information on IM for COVID-19 care, which could benefit APEC economies and world health communities and their healthcare systems.
- Published
- 2024
35. KGLens: Towards Efficient and Effective Knowledge Probing of Large Language Models with Knowledge Graphs
- Author
-
Zheng, Shangshang, Bai, He, Zhang, Yizhe, Su, Yi, Niu, Xiaochuan, and Jaitly, Navdeep
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Large Language Models (LLMs) might hallucinate facts, while curated Knowledge Graphs (KGs) are typically factually reliable, especially with domain-specific knowledge. Measuring the alignment between KGs and LLMs can effectively probe the factualness and identify the knowledge blind spots of LLMs. However, verifying the LLMs over extensive KGs can be expensive. In this paper, we present KGLens, a Thompson-sampling-inspired framework aimed at effectively and efficiently measuring the alignment between KGs and LLMs. KGLens features a graph-guided question generator for converting KGs into natural language, along with a carefully designed importance sampling strategy based on parameterized KG structure to expedite KG traversal. Our simulation experiment compares the brute force method with KGLens under six different sampling methods, demonstrating that our approach achieves superior probing efficiency. Leveraging KGLens, we conducted in-depth analyses of the factual accuracy of ten LLMs across three large domain-specific KGs from Wikidata, comprising over 19K edges, 700 relations, and 21K entities. Human evaluation results indicate that KGLens can assess LLMs with a level of accuracy nearly equivalent to that of human annotators, achieving 95.7% of the accuracy rate., Comment: ACL 2024 Workshop Towards Knowledgeable Language Models
- Published
- 2023
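KGLens, described above, is Thompson-sampling-inspired: it steers probing toward the parts of the knowledge graph where the LLM looks least reliable. The sketch below shows that sampling loop at the level of individual edges with Beta posteriors over error rates; the per-edge granularity, the toy oracle, and the omission of the question-generation step are illustrative simplifications, not the paper's exact strategy.

```python
import random

def kg_probe(edges, ask_llm, n_rounds=100, seed=0):
    """Thompson-sampling-flavored KG probing in the spirit of KGLens.

    Each edge keeps a Beta(alpha, beta) posterior over the probability that
    the LLM answers the corresponding fact incorrectly; at each round we
    sample from every posterior and probe the edge that looks most likely
    to expose a blind spot. `ask_llm(edge)` should return True if the LLM
    answered correctly; verbalizing the edge as a question is omitted.
    """
    rng = random.Random(seed)
    alpha = {e: 1.0 for e in edges}   # pseudo-counts of wrong answers
    beta = {e: 1.0 for e in edges}    # pseudo-counts of correct answers
    for _ in range(n_rounds):
        edge = max(edges, key=lambda e: rng.betavariate(alpha[e], beta[e]))
        if ask_llm(edge):
            beta[edge] += 1
        else:
            alpha[edge] += 1
    return {e: alpha[e] / (alpha[e] + beta[e]) for e in edges}  # estimated error rate

edges = [("Paris", "capital_of", "France"), ("Mars", "capital_of", "France")]
fake_llm = lambda edge: edge[0] == "Paris"     # toy oracle for illustration only
print(kg_probe(edges, fake_llm, n_rounds=20))
```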
36. A Novel Hybrid Ordinal Learning Model with Health Care Application
- Author
-
Wang, Lujia, Wang, Hairong, Su, Yi, Lure, Fleming, and Li, Jing
- Subjects
Computer Science - Machine Learning ,Mathematics - Optimization and Control ,68Q32 - Abstract
Ordinal learning (OL) is a type of machine learning model with broad utility in health care applications such as diagnosis of different grades of a disease (e.g., mild, modest, severe) and prediction of the speed of disease progression (e.g., very fast, fast, moderate, slow). This paper aims to tackle a situation when precisely labeled samples are limited in the training set due to cost or availability constraints, whereas there could be an abundance of samples with imprecise labels. We focus on imprecise labels that are intervals, i.e., one can know that a sample belongs to an interval of labels but cannot know which unique label it has. This situation is quite common in health care datasets due to limitations of the diagnostic instrument, sparse clinical visits, and/or patient dropout. Limited research has been done to develop OL models with imprecise/interval labels. We propose a new Hybrid Ordinal Learner (HOL) to integrate samples with both precise and interval labels to train a robust OL model. We also develop a tractable and efficient optimization algorithm to solve the HOL formulation. We compare HOL with several recently developed OL methods on four benchmarking datasets, which demonstrate the superior performance of HOL. Finally, we apply HOL to a real-world dataset for predicting the speed of progressing to Alzheimer's Disease (AD) for individuals with Mild Cognitive Impairment (MCI) based on a combination of multi-modality neuroimaging and demographic/clinical datasets. HOL achieves high accuracy in the prediction and outperforms existing methods. The capability of accurately predicting the speed of progression to AD for each individual with MCI has the potential for helping facilitate more individually-optimized interventional strategies., Comment: 16 pages, 3 figures, 2 tables
- Published
- 2023
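The HOL entry above centers on training an ordinal model from a mix of precisely labeled and interval-labeled samples. The snippet below is a minimal, hedged sketch of one generic way to handle interval labels in a threshold-based ordinal model: a prediction is only penalized when it crosses the boundaries implied by the label interval. It illustrates the idea only, not the paper's HOL formulation or its optimization algorithm.

```python
import numpy as np

def interval_ordinal_loss(scores, labels, thresholds):
    """Hinge-style ordinal loss where each label is an interval [lo, hi] of
    ordinal levels (lo == hi for precisely labeled samples). A prediction is
    only penalized for crossing the boundaries implied by its interval.
    This is a generic sketch, not the HOL formulation from the paper."""
    loss = 0.0
    k = len(thresholds) + 1  # number of ordinal levels
    for s, (lo, hi) in zip(scores, labels):
        if lo > 0:                       # score must lie above threshold lo-1
            loss += max(0.0, 1.0 - (s - thresholds[lo - 1]))
        if hi < k - 1:                   # score must lie below threshold hi
            loss += max(0.0, 1.0 - (thresholds[hi] - s))
    return loss / len(scores)

# Toy usage: 4 ordinal levels (0..3), thresholds separate adjacent levels.
thresholds = np.array([-1.0, 0.0, 1.0])
scores = np.array([-1.5, 0.4, 2.0])
labels = [(0, 0),   # precise label: level 0
          (1, 2),   # interval label: somewhere in levels 1-2
          (3, 3)]   # precise label: level 3
print(interval_ordinal_loss(scores, labels, thresholds))
```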
37. Factors Analysis of Intelligent Construction Technology Adoption Barriers for Expressway Construction Enterprises
- Author
-
Zhou, Zhi-chao, Su, Yi-kun, Zheng, Zhi-zhe, and Wang, Yi-lin
- Published
- 2024
- Full Text
- View/download PDF
38. Position Tracking Control of Permanent Magnet Synchronous Motor Based on AHONFTSMC Control
- Author
-
Ruan, Guan-Qiang, Su, Yi-Yu, Cao, Jin-Liang, Su, Yu-Han, Gong, Zheng-Da, and Hu, Xing
- Published
- 2024
- Full Text
- View/download PDF
39. Experimental Study on the Static Strain Aging of Q345 Steel Using Complementary In-Situ Non-destructive Testing Techniques
- Author
-
Zhou, Wei, Li, Dong-qi, Su, Yi-fan, and Zhang, Yi-fei
- Published
- 2024
- Full Text
- View/download PDF
40. Global prevalence and factors associated with preoperative depression in women undergoing breast surgery: a meta-analysis and meta-regression
- Author
-
Leo, Celest Su Yi, Cheng, Ling Jie, Lam, Xin Rong, and He, Honggu
- Published
- 2024
- Full Text
- View/download PDF
41. Open-air plasma-assisted deposition of organosilicon coating for corrosion protection of AZ91D Mg alloy
- Author
-
Jun, Jiheon, Su, Yi-Feng, Wade, IV, John E., Pappas, Daphne, Sy, Andrew, Robinson, Ryan, and Lim, Yong Chae
- Published
- 2024
- Full Text
- View/download PDF
42. Ordinal Classification with Distance Regularization for Robust Brain Age Prediction
- Author
-
Shah, Jay, Siddiquee, Md Mahfuzur Rahman, Su, Yi, Wu, Teresa, and Li, Baoxin
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Age is one of the major known risk factors for Alzheimer's Disease (AD). Detecting AD early is crucial for effective treatment and preventing irreversible brain damage. Brain age, a measure derived from brain imaging that reflects structural changes due to aging, may have the potential to identify AD onset, assess disease risk, and plan targeted interventions. Deep learning-based regression techniques for predicting brain age from magnetic resonance imaging (MRI) scans have recently shown great accuracy. However, these methods are subject to an inherent regression-to-the-mean effect, which causes a systematic bias resulting in an overestimation of brain age in young subjects and underestimation in old subjects. This weakens the reliability of predicted brain age as a valid biomarker for downstream clinical applications. Here, we reformulate the brain age prediction task from regression to classification to address the issue of systematic bias. Recognizing the importance of preserving ordinal information in age labels to understand aging trajectories and monitor aging longitudinally, we propose a novel ORdinal Distance Encoded Regularization (ORDER) loss that incorporates the order of age labels, enhancing the model's ability to capture age-related patterns. Extensive experiments and ablation studies demonstrate that this framework reduces systematic bias, outperforms state-of-the-art methods by statistically significant margins, and better captures subtle differences between clinical groups in an independent AD dataset. Our implementation is publicly available at https://github.com/jaygshah/Robust-Brain-Age-Prediction., Comment: Accepted in WACV 2024
- Published
- 2023
- Full Text
- View/download PDF
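The ORDER entry above recasts brain age prediction as classification with a regularizer that encodes the ordering of age labels. The PyTorch sketch below is a hedged illustration of what such an ordinal distance regularizer could look like (pairwise feature distances encouraged to respect pairwise age gaps), combined with a classification loss over age bins; the exact ORDER loss in the paper and in the released implementation may differ.

```python
import torch
import torch.nn.functional as F

def ordinal_distance_regularizer(features, ages):
    """Encourage pairwise feature distances to respect pairwise age gaps.
    A hedged sketch in the spirit of the ORDER loss; the paper's exact
    formulation may differ."""
    z = F.normalize(features, dim=1)               # (B, D) unit-norm embeddings
    feat_dist = torch.cdist(z, z, p=2)             # (B, B) pairwise distances
    age_gap = (ages[:, None] - ages[None, :]).abs().float()
    age_gap = age_gap / (age_gap.max() + 1e-8)     # scale gaps to [0, 1]
    # Penalize pairs whose feature distance under-represents their age gap.
    return F.relu(age_gap - feat_dist).mean()

def brain_age_loss(logits, features, ages, lam=0.1):
    # Age prediction cast as classification over integer age bins,
    # plus the ordinal distance regularizer on the learned features.
    ce = F.cross_entropy(logits, ages)
    return ce + lam * ordinal_distance_regularizer(features, ages)

# Toy usage with random tensors standing in for a CNN's outputs.
batch, dim, n_bins = 8, 32, 100
logits = torch.randn(batch, n_bins)
features = torch.randn(batch, dim)
ages = torch.randint(0, n_bins, (batch,))
print(brain_age_loss(logits, features, ages))
```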
43. Leveraging Large Language Models for Exploiting ASR Uncertainty
- Author
-
Dighe, Pranay, Su, Yi, Zheng, Shangshang, Liu, Yunshu, Garg, Vineet, Niu, Xiaochuan, and Tewfik, Ahmed
- Subjects
Computer Science - Computation and Language ,Computer Science - Human-Computer Interaction ,Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
While large language models excel in a variety of natural language processing (NLP) tasks, to perform well on spoken language understanding (SLU) tasks they must either rely on off-the-shelf automatic speech recognition (ASR) systems for transcription, or be equipped with an in-built speech modality. This work focuses on the former scenario, where the LLM's accuracy on SLU tasks is constrained by the accuracy of a fixed ASR system on the spoken input. Specifically, we tackle the speech-intent classification task, where a high word error rate can limit the LLM's ability to understand the spoken intent. Instead of chasing high accuracy by designing complex or specialized architectures regardless of deployment costs, we seek to answer how far we can go without substantially changing the underlying ASR and LLM, which can potentially be shared by multiple unrelated tasks. To this end, we propose prompting the LLM with an n-best list of ASR hypotheses instead of only the error-prone 1-best hypothesis. We explore prompt engineering to explain the concept of n-best lists to the LLM, followed by finetuning of Low-Rank Adapters on the downstream tasks. Our approach using n-best lists proves effective on a device-directed speech detection task as well as on a keyword spotting task, where systems using n-best list prompts outperform those using the 1-best ASR hypothesis, thus paving the way for an efficient method to exploit ASR uncertainty via LLMs for speech-based applications., Comment: Added references
- Published
- 2023
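The key idea of the ASR-uncertainty entry above is to prompt the LLM with an n-best list of ASR hypotheses instead of a single 1-best transcript. A minimal sketch of such a prompt builder is shown below; the paper's actual prompt wording, LoRA finetuning setup, and downstream task heads are not reproduced, and the example hypotheses and scores are made up.

```python
def build_nbest_prompt(nbest, task="intent classification"):
    """Format an ASR n-best list into an LLM prompt. A minimal sketch of the
    prompting idea described above, not the paper's exact prompt."""
    lines = [
        "The following are candidate transcriptions of a user's speech,",
        "ordered from most to least likely according to an ASR system.",
        "They may contain recognition errors; consider all of them.",
        "",
    ]
    for rank, (hypothesis, score) in enumerate(nbest, start=1):
        lines.append(f"{rank}. (score={score:.2f}) {hypothesis}")
    lines += ["", f"Task: {task}. Answer with a single intent label."]
    return "\n".join(lines)

# Toy usage with a hypothetical 3-best list for a voice command.
nbest = [
    ("play some jazz in the living room", -1.2),
    ("play some chads in the living room", -2.7),
    ("lay some jazz in the living room", -3.5),
]
print(build_nbest_prompt(nbest))
```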
44. Prompting invokes expert-like downward shifts in GPT-4V's conceptual hierarchies
- Author
-
Leong, Cara Su-Yi and Lake, Brenden
- Subjects
Artificial Intelligence ,Linguistics ,Psychology ,Concepts and categories ,Large Language Models - Abstract
Humans tend to privilege an intermediate level of categorization, known as the basic level, when categorizing objects that exist in a conceptual hierarchy (e.g. choosing to call a Labrador a dog rather than a Labrador or an animal). Domain experts demonstrate a downward shift in their object categorization behaviour, recruiting subordinate levels in a conceptual hierarchy as readily as conventionally basic categories (Tanaka & Philibert, 2022; Tanaka & Taylor, 1991). Do multimodal large language models show similar behavioural changes when prompted to behave in an expert-like way? We test whether GPT-4 with Vision (GPT-4V, OpenAI, 2023a) and LLaVA (Liu, Li, Wu, & Lee, 2023; Liu, Li, Li, & Lee, 2023) demonstrate downward shifts using an object naming task, eliciting expert-like personas by altering the model's system prompt. We find evidence of downward shifts in GPT-4V when expert system prompts are used, suggesting that human expert-like behaviour can be elicited from GPT-4V through prompting, but find no evidence of a downward shift in LLaVA. We also find that in some cases there is an unpredicted upward shift in areas of non-expertise. These findings suggest that in the default case GPT-4V is not a novice: instead, it behaves by default with a median level of expertise, while further expertise can be primed or forgotten through textual prompts. These results open the door for GPT-4V and similar models to be used as tools for studying differences in the behaviour of experts and novices, and even for comparing contrasting levels of expertise within the same large language model.
- Published
- 2024
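The GPT-4V entry above elicits expert-like behaviour purely by altering the system prompt before an object naming task. The fragment below is a hedged sketch of that experimental setup; the prompts are illustrative rather than the study's actual wording, and `query_vlm` is a hypothetical placeholder for a call to a vision-language model.

```python
def persona_system_prompt(domain=None):
    """Build a system prompt that either leaves the model at its default or
    primes an expert-like persona, loosely following the study's setup.
    The exact wording used in the paper is not reproduced here."""
    if domain is None:
        return "You are a helpful assistant."
    return (f"You are a world-renowned expert in {domain} with decades of "
            f"experience identifying and naming {domain} specimens.")

def naming_query():
    # Object naming task: ask for the single name the model would use.
    return "In one word, what would you call the object in this image?"

def query_vlm(system_prompt, user_prompt, image_path):
    # Placeholder for a call to a vision-language model (e.g. GPT-4V or LLaVA).
    raise NotImplementedError

# Comparing default vs. expert-primed naming for the same image would look like:
# default_name = query_vlm(persona_system_prompt(), naming_query(), "labrador.jpg")
# expert_name  = query_vlm(persona_system_prompt("dog breeds"), naming_query(), "labrador.jpg")
```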
45. A global database of bird nest traits.
- Author
-
Chia, Stephanie, Fang, Yi-Ting, Su, Yi-Ting, Tsai, Pei-Yu, Hsieh, Chia, Tsao, Shu-Han, Juang, Jia-Yang, Hung, Chih-Ming, and Tuanmu, Mao-Ning
- Subjects
Animals ,Birds ,Breeding ,Nesting Behavior ,Phylogeny ,Reproduction - Abstract
The reproductive success of birds is closely tied to the characteristics of their nests. It is crucial to understand the distribution of nest traits across phylogenetic and geographic dimensions to gain insight into bird evolution and adaptation. Despite the extensive historical documentation on breeding behavior, a structured dataset describing bird nest characteristics has been lacking. To address this gap, we have compiled a comprehensive dataset that characterizes three ecologically and evolutionarily significant nest traits (site, structure, and attachment) for 9,248 bird species, representing all 36 orders and 241 out of the 244 families. By defining seven sites, seven structures, and four attachment types, we have systematically classified the nests of each species using information from text descriptions, photos, and videos sourced from online databases and literature. This nest traits dataset serves as a valuable addition to the existing body of morphological and ecological trait data for bird species, providing a useful resource for a wide range of avian macroecological and macroevolutionary research.
- Published
- 2023
46. High-temperature superconductivity with zero-resistance and strange metal behavior in La$_{3}$Ni$_{2}$O$_{7-\delta}$
- Author
-
Zhang, Yanan, Su, Dajun, Huang, Yanen, Shan, Zhaoyang, Sun, Hualei, Huo, Mengwu, Ye, Kaixin, Zhang, Jiawen, Yang, Zihan, Xu, Yongkang, Su, Yi, Li, Rui, Smidman, Michael, Wang, Meng, Jiao, Lin, and Yuan, Huiqiu
- Subjects
Condensed Matter - Superconductivity - Abstract
Recently, signatures of superconductivity were observed close to 80 K in La$_{3}$Ni$_{2}$O$_{7}$ under pressure. This discovery positions La$_{3}$Ni$_{2}$O$_{7}$ as the first bulk nickelate with high-temperature superconductivity, but the lack of zero resistance presents a significant drawback for validating the findings. Here we report pressure measurements up to over 30 GPa using a liquid pressure medium and show that single crystals of La$_{3}$Ni$_{2}$O$_{7-\delta}$ do exhibit zero resistance. We find that La$_{3}$Ni$_{2}$O$_{7-\delta}$ remains metallic under applied pressures, suggesting the absence of a metal-insulator transition proximate to the superconductivity. Analysis of the normal-state $T$-linear resistance suggests an intricate link between this strange metal behaviour and superconductivity, whereby at high pressures both the linear resistance coefficient and the superconducting transition are slowly suppressed by pressure, while at intermediate pressures both the superconductivity and the strange metal behaviour appear disrupted, possibly due to a nearby structural instability. The association between strange metal behaviour and high-temperature superconductivity is very much in line with diverse classes of unconventional superconductors, including the cuprates and Fe-based superconductors. Understanding the superconductivity of La$_{3}$Ni$_{2}$O$_{7-\delta}$ evidently requires further revealing the interplay of strange metal behaviour, superconductivity, and possible competing electronic or structural phases., Comment: 28 pages, 4+8 figures, including Extended Data Files
- Published
- 2023
- Full Text
- View/download PDF
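For readers unfamiliar with the strange-metal terminology in the nickelate entry above, the "$T$-linear resistance" refers to a normal-state resistivity of the generic form below; this is standard strange-metal phenomenology, not an equation quoted from the paper.

```latex
\rho(T) = \rho_0 + A\,T
```

Here $\rho_0$ is the residual resistivity and $A$ is the linear resistance coefficient whose pressure dependence the abstract relates to the superconducting transition.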
47. Intensity-free Convolutional Temporal Point Process: Incorporating Local and Global Event Contexts
- Author
-
Zhou, Wang-Tao, Kang, Zhao, Tian, Ling, and Su, Yi
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Social and Information Networks - Abstract
Event prediction in the continuous-time domain is a crucial but rather difficult task. Temporal point process (TPP) learning models have shown great advantages in this area. Existing models mainly focus on encoding global contexts of events using techniques like recurrent neural networks (RNNs) or self-attention mechanisms. However, local event contexts also play an important role in the occurrence of events, which has been largely ignored. Popular convolutional neural networks, which are designed for capturing local context, have never been applied to TPP modelling due to their inability to model in continuous time. In this work, we propose a novel TPP modelling approach that combines local and global contexts by integrating a continuous-time convolutional event encoder with an RNN. The presented framework is flexible and scalable to handle large datasets with long sequences and complex latent patterns. Experimental results show that the proposed model improves the performance of probabilistic sequential modelling and the accuracy of event prediction. To the best of our knowledge, this is the first work that applies convolutional neural networks to TPP modelling., Comment: Accepted to Information Sciences
- Published
- 2023
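The TPP entry above combines a continuous-time convolutional event encoder (local context) with an RNN (global context). The PyTorch sketch below illustrates that local-plus-global encoding idea under simplifying assumptions: each event aggregates a small window of preceding events with weights computed from continuous inter-event time gaps, and a GRU then summarizes the sequence. The paper's actual architecture and its intensity-free decoder are not reproduced, and all module names are hypothetical.

```python
import torch
import torch.nn as nn

class ContinuousTimeConvEncoder(nn.Module):
    """Hedged sketch of a local+global event encoder: a continuous-time
    windowed aggregation (weights derived from real-valued time gaps)
    followed by an RNN over the sequence."""

    def __init__(self, n_types, d_model=32, window=5):
        super().__init__()
        self.window = window
        self.embed = nn.Embedding(n_types, d_model)
        self.kernel = nn.Sequential(nn.Linear(1, d_model), nn.Tanh(),
                                    nn.Linear(d_model, 1))
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, times, types):
        # times: (B, L) event timestamps, types: (B, L) event type ids
        emb = self.embed(types)                       # (B, L, D)
        B, L, D = emb.shape
        local = torch.zeros_like(emb)
        for i in range(L):
            j0 = max(0, i - self.window + 1)
            gaps = (times[:, i:i + 1] - times[:, j0:i + 1]).unsqueeze(-1)  # (B, w, 1)
            w = torch.softmax(self.kernel(gaps), dim=1)                    # (B, w, 1)
            local[:, i] = (w * emb[:, j0:i + 1]).sum(dim=1)                # (B, D)
        global_ctx, _ = self.rnn(local)               # (B, L, D)
        return global_ctx

# Toy usage: a batch of 2 sequences with 6 events drawn from 4 event types.
times = torch.cumsum(torch.rand(2, 6), dim=1)
types = torch.randint(0, 4, (2, 6))
print(ContinuousTimeConvEncoder(n_types=4)(times, types).shape)  # torch.Size([2, 6, 32])
```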
48. Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
- Author
-
Zhang, Zeyu, Su, Yi, Yuan, Hui, Wu, Yiran, Balasubramanian, Rishab, Wu, Qingyun, Wang, Huazheng, and Wang, Mengdi
- Subjects
Computer Science - Machine Learning ,Computer Science - Information Retrieval - Abstract
Off-policy Learning to Rank (LTR) aims to optimize a ranker from data collected by a deployed logging policy. However, existing off-policy learning to rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence need to be tailored to different click models. In this paper, we unify the ranking process under general stochastic click models as a Markov Decision Process (MDP) and show that the optimal ranking can be learned directly with offline reinforcement learning (RL). Building upon this, we leverage offline RL techniques for off-policy LTR and propose the Click Model-Agnostic Unified Off-policy Learning to Rank (CUOLR) method, which can be easily applied to a wide range of click models. Through a dedicated formulation of the MDP, we show that offline RL algorithms can adapt to various click models without complex debiasing techniques and prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms state-of-the-art off-policy learning to rank algorithms while maintaining consistency and robustness under different click models., Comment: Accepted by NeurIPS 2023
- Published
- 2023
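The CUOLR entry above reformulates ranking under general click models as an MDP so that logged sessions can be consumed by off-the-shelf offline RL. The sketch below shows, under assumed log formats and without feature encodings, how such sessions might be converted into (state, action, reward, next state) transitions; the paper's state representations, click models, and the specific offline RL algorithms are omitted.

```python
def sessions_to_transitions(sessions):
    """Convert logged ranking sessions into MDP transitions for offline RL.
    A hedged sketch of the formulation described above: the state is the
    query plus the documents already placed, the action is the document
    shown at the next position, and the reward is that position's click."""
    transitions = []
    for session in sessions:
        query = session["query"]
        placed = []
        for doc, clicked in zip(session["ranking"], session["clicks"]):
            state = (query, tuple(placed))
            placed.append(doc)
            next_state = (query, tuple(placed))
            transitions.append({
                "state": state,
                "action": doc,
                "reward": float(clicked),
                "next_state": next_state,
                "done": len(placed) == len(session["ranking"]),
            })
    return transitions

# Toy usage with one logged session from a hypothetical logging policy.
log = [{"query": "q1", "ranking": ["d3", "d7", "d1"], "clicks": [0, 1, 0]}]
for t in sessions_to_transitions(log):
    print(t)
```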
49. Language Models Can Learn Exceptions to Syntactic Rules
- Author
-
Leong, Cara Su-Yi and Linzen, Tal
- Subjects
Computer Science - Computation and Language - Abstract
Artificial neural networks can generalize productively to novel contexts. Can they also learn exceptions to those productive rules? We explore this question using the case of restrictions on English passivization (e.g., the fact that "The vacation lasted five days" is grammatical, but "*Five days was lasted by the vacation" is not). We collect human acceptability judgments for passive sentences with a range of verbs, and show that the probability distribution defined by GPT-2, a language model, matches the human judgments with high correlation. We also show that the relative acceptability of a verb in the active vs. passive voice is positively correlated with the relative frequency of its occurrence in those voices. These results provide preliminary support for the entrenchment hypothesis, according to which learners track and use the distributional properties of their input to learn negative exceptions to rules. At the same time, this hypothesis fails to explain the magnitude of unpassivizability demonstrated by certain individual verbs, suggesting that other cues to exceptionality are available in the linguistic input., Comment: Accepted to SCiL 2023
- Published
- 2023
- Full Text
- View/download PDF
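The passivization entry above scores sentences with the probability distribution defined by GPT-2. One straightforward way to obtain such scores with the Hugging Face `transformers` library is sketched below; the study's exact scoring, normalization, and stimuli are not reproduced, so treat this as an assumption-laden illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability GPT-2 assigns to a sentence, one way to compare
    the relative acceptability of active vs. passive variants as in the
    study above (their exact scoring may differ)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=ids makes the model return the mean token-level NLL.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # convert back to a summed log-prob

active = "The vacation lasted five days."
passive = "Five days was lasted by the vacation."
print(sentence_logprob(active), sentence_logprob(passive))
```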
50. Long-Term Value of Exploration: Measurements, Findings and Algorithms
- Author
-
Su, Yi, Wang, Xiangyu, Le, Elaine Ya, Liu, Liang, Li, Yuening, Lu, Haokai, Lipshitz, Benjamin, Badam, Sriraj, Heldt, Lukasz, Bi, Shuchao, Chi, Ed, Goodrow, Cristos, Wu, Su-Lin, Baugher, Lexi, and Chen, Minmin
- Subjects
Computer Science - Information Retrieval - Abstract
Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. Here we introduce new experiment designs to formally quantify the long-term value of exploration by examining its effects on the content corpus, and by connecting content corpus growth to the long-term user experience from real-world experiments. Having established the value of exploration, we investigate the Neural Linear Bandit algorithm as a general framework for introducing exploration into any deep learning based ranking system. We conduct live experiments on one of the largest short-form video recommendation platforms, serving billions of users, to validate the new experiment designs, quantify the long-term value of exploration, and verify the effectiveness of the adopted neural linear bandit algorithm for exploration., Comment: 11 pages, WSDM 2024
- Published
- 2023
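The exploration entry above adopts the Neural Linear Bandit as a general recipe for adding exploration to deep ranking systems. The NumPy sketch below illustrates the core mechanism under simplifying assumptions: item features (in practice produced by the ranking model's neural tower) feed a Bayesian linear regression over rewards, and Thompson sampling from that posterior drives exploration. The production system, its feature tower, and its training loop are not reproduced, and all names and numbers are hypothetical.

```python
import numpy as np

class NeuralLinearBandit:
    """Hedged sketch of a neural linear bandit head: Bayesian linear
    regression over (pretrained) neural features, explored via Thompson
    sampling from the weight posterior."""

    def __init__(self, feature_dim, noise_var=1.0, prior_var=1.0):
        self.noise_var = noise_var
        self.precision = np.eye(feature_dim) / prior_var  # posterior precision
        self.b = np.zeros(feature_dim)                     # running sum of x * reward

    def select(self, candidate_features):
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.b / self.noise_var
        w = np.random.multivariate_normal(mean, cov)       # Thompson sample
        scores = candidate_features @ w
        return int(np.argmax(scores))

    def update(self, x, reward):
        self.precision += np.outer(x, x) / self.noise_var
        self.b += x * reward

# Toy usage: features would come from the ranking model's item tower.
rng = np.random.default_rng(0)
bandit = NeuralLinearBandit(feature_dim=8)
for _ in range(100):
    candidates = rng.normal(size=(20, 8))      # 20 candidate items per request
    idx = bandit.select(candidates)
    reward = float(rng.random() < 0.1 + 0.4 * (candidates[idx, 0] > 0))
    bandit.update(candidates[idx], reward)
```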