7,052 results for "Sun, Fei"
Search Results
2. Half Bowl of Mengpo's Soup
- Author
- Sun, Fei
- Published
- 2022
- Full Text
- View/download PDF
3. Improving the Shortest Plank: Vulnerability-Aware Adversarial Training for Robust Recommender System
- Author
- Zhang, Kaike, Cao, Qi, Wu, Yunfan, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval
- Abstract
Recommender systems play a pivotal role in mitigating information overload in various fields. Nonetheless, the inherent openness of these systems introduces vulnerabilities, allowing attackers to insert fake users into the system's training data to skew the exposure of certain items, known as poisoning attacks. Adversarial training has emerged as a notable defense mechanism against such poisoning attacks within recommender systems. Existing adversarial training methods apply perturbations of the same magnitude across all users to enhance system robustness against attacks. Yet, in reality, we find that attacks often affect only a subset of users who are vulnerable. These perturbations of indiscriminate magnitude make it difficult to balance effective protection for vulnerable users without degrading recommendation quality for those who are not affected. To address this issue, our research delves into understanding user vulnerability. Considering that poisoning attacks pollute the training data, we note that the higher degree to which a recommender system fits users' training data correlates with an increased likelihood of users incorporating attack information, indicating their vulnerability. Leveraging these insights, we introduce the Vulnerability-aware Adversarial Training (VAT), designed to defend against poisoning attacks in recommender systems. VAT employs a novel vulnerability-aware function to estimate users' vulnerability based on the degree to which the system fits them. Guided by this estimation, VAT applies perturbations of adaptive magnitude to each user, not only reducing the success ratio of attacks but also preserving, and potentially enhancing, the quality of recommendations. Comprehensive experiments confirm VAT's superior defensive capabilities across different recommendation models and against various types of attacks.
- Published
- 2024
- Full Text
- View/download PDF
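The core of VAT, as the abstract describes it, is mapping each user's fit degree to an adaptive perturbation magnitude. A minimal pure-Python sketch of that idea (the function name and the loss-to-vulnerability mapping are illustrative assumptions, not the paper's actual formula):

```python
def adaptive_perturbation_magnitudes(per_user_loss, base_eps=0.1):
    """Map each user's training loss to an adversarial perturbation size.

    Assumption for illustration: a lower loss means the recommender fits
    the user more tightly, which the paper links to higher vulnerability,
    so that user receives a larger perturbation (up to base_eps).
    """
    lo, hi = min(per_user_loss), max(per_user_loss)
    span = (hi - lo) or 1e-12          # avoid division by zero
    return [base_eps * (hi - loss) / span for loss in per_user_loss]

# The best-fit user (loss 0.2) gets the full magnitude; the worst-fit gets none.
eps = adaptive_perturbation_magnitudes([0.2, 1.0, 0.6])
```

The point of the adaptive scaling is exactly what the abstract claims: well-fit (vulnerable) users get strong protection while poorly-fit users are left nearly unperturbed, preserving their recommendation quality.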
4. Creation of independently controllable and long lifetime polar skyrmion textures in ferroelectric-metallic heterostructures
- Author
- Sun, Fei, Ren, Jianhua, Li, Hongfang, Wu, Yiwei, Liang, Jianwei, Yang, Hui, Zhang, Yi, Liu, Jianyi, Liu, Linjie, Wu, Mengjun, Zhang, Xiaoyue, Zhu, Wenpeng, Chen, Weijin, and Zheng, Yue
- Subjects
Condensed Matter - Materials Science
- Abstract
Topological textures like vortices, labyrinths and skyrmions formed in ferroic materials have attracted extensive interest during the past decade for their fundamental physics, intriguing topology, and technological prospects. So far, polar skyrmions remain scarce in ferroelectrics as they require a delicate balance between various dipolar interactions. Here, we report that PbTiO3 thin films in metallic contact undergo a topological phase transition and stabilize a broad family of skyrmion-like textures (e.g., skyrmion bubbles, multiple π-twist target skyrmions, and skyrmion bags) with independent controllability, analogous to those reported in magnetic systems. Weakly interacting skyrmion arrays with a density over 300 Gb/inch² are successfully written, erased and read out by local electrical and mechanical stimuli of a scanning probe. Interestingly, in contrast to the relatively short lifetime (<20 hours) of the skyrmion bubbles, the multiple π-twist target skyrmions and skyrmion bags show topology-enhanced stability, with lifetimes of over two weeks. Experimental and theoretical analysis implies that the heterostructures carry an electric Dzyaloshinskii-Moriya interaction mediated by oxygen octahedral tilting. Our results demonstrate ferroelectric-metallic heterostructures as a fertile playground for topological states and emergent phenomena.
- Published
- 2024
5. Accelerating the Surrogate Retraining for Poisoning Attacks against Recommender Systems
- Author
- Wu, Yunfan, Cao, Qi, Tao, Shuchang, Zhang, Kaike, Sun, Fei, and Shen, Huawei
- Subjects
Computer Science - Information Retrieval
- Abstract
Recent studies have demonstrated the vulnerability of recommender systems to data poisoning attacks, where adversaries inject carefully crafted fake user interactions into the training data of recommenders to promote target items. Current attack methods involve iteratively retraining a surrogate recommender on the poisoned data with the latest fake users to optimize the attack. However, this repetitive retraining is highly time-consuming, hindering the efficient assessment and optimization of fake users. To mitigate this computational bottleneck and develop a more effective attack in an affordable time, we analyze the retraining process and find that a change in the representation of one user/item will cause a cascading effect through the user-item interaction graph. Under theoretical guidance, we introduce Gradient Passing (GP), a novel technique that explicitly passes gradients between interacted user-item pairs during backpropagation, thereby approximating the cascading effect and accelerating retraining. With just a single update, GP can achieve effects comparable to multiple original training iterations. Under the same number of retraining epochs, GP enables a closer approximation of the surrogate recommender to the victim. This more accurate approximation provides better guidance for optimizing fake users, ultimately leading to enhanced data poisoning attacks. Extensive experiments on real-world datasets demonstrate the efficiency and effectiveness of our proposed GP.
- Comment
- Accepted by RecSys 2024
- Published
- 2024
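The Gradient Passing idea, sharing gradient signal across interacted user-item pairs in a single update, can be caricatured on scalar "embeddings". A toy sketch under stated assumptions (the real method operates on embedding vectors inside backpropagation; the alpha coefficient and dictionary layout are invented here):

```python
def gradient_passing(grads, edges, alpha=0.5):
    """One gradient-passing step on scalar gradients.

    Each node keeps its own gradient and additionally receives alpha
    times the gradient of every neighbor it interacted with, imitating
    the cascading effect of retraining in a single update.
    """
    passed = dict(grads)
    for u, v in edges:
        passed[u] += alpha * grads.get(v, 0.0)
        passed[v] += alpha * grads.get(u, 0.0)
    return passed

# A gradient on user u1 immediately reaches the item it interacted with.
out = gradient_passing({"u1": 1.0, "i1": 0.0}, [("u1", "i1")])
```

This is why one GP update can stand in for several plain training iterations: the neighbor-to-neighbor propagation that normally takes multiple epochs happens explicitly in one pass.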
6. Two-dimensional superconductivity in a thick exfoliated kagome film
- Author
- Sun, Fei, Salinas, Andrea Capa, Wilson, Stephen D., and Zhang, Haijing
- Subjects
Condensed Matter - Superconductivity, Condensed Matter - Materials Science, Condensed Matter - Strongly Correlated Electrons
- Abstract
We report the observation of two-dimensional superconductivity (2D SC) in the exfoliated kagome metal CsV$_3$Sb$_5$ at thicknesses far beyond the atomic limit. By examining the critical current and upper critical magnetic fields ($H_{c2}$) of 40-60 nm thick films in the superconducting state, we identify pronounced Berezinskii-Kosterlitz-Thouless (BKT) transition behavior, i.e., a drastic decrease of the superfluid stiffness near the transition, and a cusp-like feature of the angular-dependent $H_{c2}$, both of which serve as direct evidence of 2D SC. In addition, the in-plane $H_{c2}$ exceeding the Pauli paramagnetic limit is consistent with the 2D SC nature. The observed 2D SC occurs in thick films with the highest superconducting transition temperature $T_c$ and the lowest charge density wave transition temperature $T_{\rm CDW}$, which suggests that the charge density wave states are interrelated with the superconducting states. Our findings impose constraints on the understanding of the enhancement of SC in kagome superconductors, and illuminate pathways for achieving novel 2D superconducting states in more stable and much thicker systems.
- Comment
- 6 pages and 4 figures for the main text; 10 pages and 9 figures for the Supplementary Materials
- Published
- 2024
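For context on the cusp-like angular dependence cited as 2D evidence: thin-film superconductors are conventionally fitted with the Tinkham formula (a textbook expression quoted here for orientation, not taken from this paper), with $\theta$ the angle between the applied field and the film plane:

```latex
\left|\frac{H_{c2}(\theta)\sin\theta}{H_{c2}^{\perp}}\right|
+\left(\frac{H_{c2}(\theta)\cos\theta}{H_{c2}^{\parallel}}\right)^{2}=1
```

A 3D anisotropic Ginzburg-Landau model instead yields a smooth, bell-shaped $H_{c2}(\theta)$, so a sharp cusp at $\theta = 0$ discriminates 2D from 3D superconductivity.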
7. The Llama 3 Herd of Models
- Author
Dubey, Abhimanyu, Jauhri, Abhinav, Pandey, Abhinav, Kadian, Abhishek, Al-Dahle, Ahmad, Letman, Aiesha, Mathur, Akhil, Schelten, Alan, Yang, Amy, Fan, Angela, Goyal, Anirudh, Hartshorn, Anthony, Yang, Aobo, Mitra, Archi, Sravankumar, Archie, Korenev, Artem, Hinsvark, Arthur, Rao, Arun, Zhang, Aston, Rodriguez, Aurelien, Gregerson, Austen, Spataru, Ava, Roziere, Baptiste, Biron, Bethany, Tang, Binh, Chern, Bobbie, Caucheteux, Charlotte, Nayak, Chaya, Bi, Chloe, Marra, Chris, McConnell, Chris, Keller, Christian, Touret, Christophe, Wu, Chunyang, Wong, Corinne, Ferrer, Cristian Canton, Nikolaidis, Cyrus, Allonsius, Damien, Song, Daniel, Pintz, Danielle, Livshits, Danny, Esiobu, David, Choudhary, Dhruv, Mahajan, Dhruv, Garcia-Olano, Diego, Perino, Diego, Hupkes, Dieuwke, Lakomkin, Egor, AlBadawy, Ehab, Lobanova, Elina, Dinan, Emily, Smith, Eric Michael, Radenovic, Filip, Zhang, Frank, Synnaeve, Gabriel, Lee, Gabrielle, Anderson, Georgia Lewis, Nail, Graeme, Mialon, Gregoire, Pang, Guan, Cucurell, Guillem, Nguyen, Hailey, Korevaar, Hannah, Xu, Hu, Touvron, Hugo, Zarov, Iliyan, Ibarra, Imanol Arrieta, Kloumann, Isabel, Misra, Ishan, Evtimov, Ivan, Copet, Jade, Lee, Jaewon, Geffert, Jan, Vranes, Jana, Park, Jason, Mahadeokar, Jay, Shah, Jeet, van der Linde, Jelmer, Billock, Jennifer, Hong, Jenny, Lee, Jenya, Fu, Jeremy, Chi, Jianfeng, Huang, Jianyu, Liu, Jiawen, Wang, Jie, Yu, Jiecao, Bitton, Joanna, Spisak, Joe, Park, Jongsoo, Rocca, Joseph, Johnstun, Joshua, Saxe, Joshua, Jia, Junteng, Alwala, Kalyan Vasuden, Upasani, Kartikeya, Plawiak, Kate, Li, Ke, Heafield, Kenneth, Stone, Kevin, El-Arini, Khalid, Iyer, Krithika, Malik, Kshitiz, Chiu, Kuenley, Bhalla, Kunal, Rantala-Yeary, Lauren, van der Maaten, Laurens, Chen, Lawrence, Tan, Liang, Jenkins, Liz, Martin, Louis, Madaan, Lovish, Malo, Lubo, Blecher, Lukas, Landzaat, Lukas, de Oliveira, Luke, Muzzi, Madeline, Pasupuleti, Mahesh, Singh, Mannat, Paluri, Manohar, Kardas, Marcin, Oldham, Mathew, Rita, Mathieu, Pavlova, 
Maya, Kambadur, Melanie, Lewis, Mike, Si, Min, Singh, Mitesh Kumar, Hassan, Mona, Goyal, Naman, Torabi, Narjes, Bashlykov, Nikolay, Bogoychev, Nikolay, Chatterji, Niladri, Duchenne, Olivier, Çelebi, Onur, Alrassy, Patrick, Zhang, Pengchuan, Li, Pengwei, Vasic, Petar, Weng, Peter, Bhargava, Prajjwal, Dubal, Pratik, Krishnan, Praveen, Koura, Punit Singh, Xu, Puxin, He, Qing, Dong, Qingxiao, Srinivasan, Ragavan, Ganapathy, Raj, Calderer, Ramon, Cabral, Ricardo Silveira, Stojnic, Robert, Raileanu, Roberta, Girdhar, Rohit, Patel, Rohit, Sauvestre, Romain, Polidoro, Ronnie, Sumbaly, Roshan, Taylor, Ross, Silva, Ruan, Hou, Rui, Wang, Rui, Hosseini, Saghar, Chennabasappa, Sahana, Singh, Sanjay, Bell, Sean, Kim, Seohyun Sonia, Edunov, Sergey, Nie, Shaoliang, Narang, Sharan, Raparthy, Sharath, Shen, Sheng, Wan, Shengye, Bhosale, Shruti, Zhang, Shun, Vandenhende, Simon, Batra, Soumya, Whitman, Spencer, Sootla, Sten, Collot, Stephane, Gururangan, Suchin, Borodinsky, Sydney, Herman, Tamar, Fowler, Tara, Sheasha, Tarek, Georgiou, Thomas, Scialom, Thomas, Speckbacher, Tobias, Mihaylov, Todor, Xiao, Tong, Karn, Ujjwal, Goswami, Vedanuj, Gupta, Vibhor, Ramanathan, Vignesh, Kerkez, Viktor, Gonguet, Vincent, Do, Virginie, Vogeti, Vish, Petrovic, Vladan, Chu, Weiwei, Xiong, Wenhan, Fu, Wenyin, Meers, Whitney, Martinet, Xavier, Wang, Xiaodong, Tan, Xiaoqing Ellen, Xie, Xinfeng, Jia, Xuchao, Wang, Xuewei, Goldschlag, Yaelle, Gaur, Yashesh, Babaei, Yasmine, Wen, Yi, Song, Yiwen, Zhang, Yuchen, Li, Yue, Mao, Yuning, Coudert, Zacharie Delpierre, Yan, Zheng, Chen, Zhengxing, Papakipos, Zoe, Singh, Aaditya, Grattafiori, Aaron, Jain, Abha, Kelsey, Adam, Shajnfeld, Adam, Gangidi, Adithya, Victoria, Adolfo, Goldstand, Ahuva, Menon, Ajay, Sharma, Ajay, Boesenberg, Alex, Vaughan, Alex, Baevski, Alexei, Feinstein, Allie, Kallet, Amanda, Sangani, Amit, Yunus, Anam, Lupu, Andrei, Alvarado, Andres, Caples, Andrew, Gu, Andrew, Ho, Andrew, Poulton, Andrew, Ryan, Andrew, Ramchandani, Ankit, Franco, 
Annie, Saraf, Aparajita, Chowdhury, Arkabandhu, Gabriel, Ashley, Bharambe, Ashwin, Eisenman, Assaf, Yazdan, Azadeh, James, Beau, Maurer, Ben, Leonhardi, Benjamin, Huang, Bernie, Loyd, Beth, De Paola, Beto, Paranjape, Bhargavi, Liu, Bing, Wu, Bo, Ni, Boyu, Hancock, Braden, Wasti, Bram, Spence, Brandon, Stojkovic, Brani, Gamido, Brian, Montalvo, Britt, Parker, Carl, Burton, Carly, Mejia, Catalina, Wang, Changhan, Kim, Changkyu, Zhou, Chao, Hu, Chester, Chu, Ching-Hsiang, Cai, Chris, Tindal, Chris, Feichtenhofer, Christoph, Civin, Damon, Beaty, Dana, Kreymer, Daniel, Li, Daniel, Wyatt, Danny, Adkins, David, Xu, David, Testuggine, Davide, David, Delia, Parikh, Devi, Liskovich, Diana, Foss, Didem, Wang, Dingkang, Le, Duc, Holland, Dustin, Dowling, Edward, Jamil, Eissa, Montgomery, Elaine, Presani, Eleonora, Hahn, Emily, Wood, Emily, Brinkman, Erik, Arcaute, Esteban, Dunbar, Evan, Smothers, Evan, Sun, Fei, Kreuk, Felix, Tian, Feng, Ozgenel, Firat, Caggioni, Francesco, Guzmán, Francisco, Kanayet, Frank, Seide, Frank, Florez, Gabriela Medina, Schwarz, Gabriella, Badeer, Gada, Swee, Georgia, Halpern, Gil, Thattai, Govind, Herman, Grant, Sizov, Grigory, Guangyi, Zhang, Lakshminarayanan, Guna, Shojanazeri, Hamid, Zou, Han, Wang, Hannah, Zha, Hanwen, Habeeb, Haroun, Rudolph, Harrison, Suk, Helen, Aspegren, Henry, Goldman, Hunter, Damlaj, Ibrahim, Molybog, Igor, Tufanov, Igor, Veliche, Irina-Elena, Gat, Itai, Weissman, Jake, Geboski, James, Kohli, James, Asher, Japhet, Gaya, Jean-Baptiste, Marcus, Jeff, Tang, Jeff, Chan, Jennifer, Zhen, Jenny, Reizenstein, Jeremy, Teboul, Jeremy, Zhong, Jessica, Jin, Jian, Yang, Jingyi, Cummings, Joe, Carvill, Jon, Shepard, Jon, McPhie, Jonathan, Torres, Jonathan, Ginsburg, Josh, Wang, Junjie, Wu, Kai, U, Kam Hou, Saxena, Karan, Prasad, Karthik, Khandelwal, Kartikay, Zand, Katayoun, Matosich, Kathy, Veeraraghavan, Kaushik, Michelena, Kelly, Li, Keqian, Huang, Kun, Chawla, Kunal, Lakhotia, Kushal, Huang, Kyle, Chen, Lailin, Garg, Lakshya, A, 
Lavender, Silva, Leandro, Bell, Lee, Zhang, Lei, Guo, Liangpeng, Yu, Licheng, Moshkovich, Liron, Wehrstedt, Luca, Khabsa, Madian, Avalani, Manav, Bhatt, Manish, Tsimpoukelli, Maria, Mankus, Martynas, Hasson, Matan, Lennie, Matthew, Reso, Matthias, Groshev, Maxim, Naumov, Maxim, Lathi, Maya, Keneally, Meghan, Seltzer, Michael L., Valko, Michal, Restrepo, Michelle, Patel, Mihir, Vyatskov, Mik, Samvelyan, Mikayel, Clark, Mike, Macey, Mike, Wang, Mike, Hermoso, Miquel Jubert, Metanat, Mo, Rastegari, Mohammad, Bansal, Munish, Santhanam, Nandhini, Parks, Natascha, White, Natasha, Bawa, Navyata, Singhal, Nayan, Egebo, Nick, Usunier, Nicolas, Laptev, Nikolay Pavlovich, Dong, Ning, Zhang, Ning, Cheng, Norman, Chernoguz, Oleg, Hart, Olivia, Salpekar, Omkar, Kalinli, Ozlem, Kent, Parkin, Parekh, Parth, Saab, Paul, Balaji, Pavan, Rittner, Pedro, Bontrager, Philip, Roux, Pierre, Dollar, Piotr, Zvyagina, Polina, Ratanchandani, Prashant, Yuvraj, Pritish, Liang, Qian, Alao, Rachad, Rodriguez, Rachel, Ayub, Rafi, Murthy, Raghotham, Nayani, Raghu, Mitra, Rahul, Li, Raymond, Hogan, Rebekkah, Battey, Robin, Wang, Rocky, Maheswari, Rohan, Howes, Russ, Rinott, Ruty, Bondu, Sai Jayesh, Datta, Samyak, Chugh, Sara, Hunt, Sara, Dhillon, Sargun, Sidorov, Sasha, Pan, Satadru, Verma, Saurabh, Yamamoto, Seiji, Ramaswamy, Sharadh, Lindsay, Shaun, Feng, Sheng, Lin, Shenghao, Zha, Shengxin Cindy, Shankar, Shiva, Zhang, Shuqiang, Wang, Sinong, Agarwal, Sneha, Sajuyigbe, Soji, Chintala, Soumith, Max, Stephanie, Chen, Stephen, Kehoe, Steve, Satterfield, Steve, Govindaprasad, Sudarshan, Gupta, Sumit, Cho, Sungmin, Virk, Sunny, Subramanian, Suraj, Choudhury, Sy, Goldman, Sydney, Remez, Tal, Glaser, Tamar, Best, Tamara, Kohler, Thilo, Robinson, Thomas, Li, Tianhe, Zhang, Tianjun, Matthews, Tim, Chou, Timothy, Shaked, Tzook, Vontimitta, Varun, Ajayi, Victoria, Montanez, Victoria, Mohan, Vijai, Kumar, Vinay Satish, Mangla, Vishal, Albiero, Vítor, Ionescu, Vlad, Poenaru, Vlad, Mihailescu, Vlad Tiberiu, 
Ivanov, Vladimir, Li, Wei, Wang, Wenchen, Jiang, Wenwen, Bouaziz, Wes, Constable, Will, Tang, Xiaocheng, Wang, Xiaofang, Wu, Xiaojian, Wang, Xiaolan, Xia, Xide, Wu, Xilun, Gao, Xinbo, Chen, Yanjun, Hu, Ye, Jia, Ye, Qi, Ye, Li, Yenda, Zhang, Yilin, Zhang, Ying, Adi, Yossi, Nam, Youngjin, Yu, Wang, Hao, Yuchen, Qian, Yundi, He, Yuzi, Rait, Zach, DeVito, Zachary, Rosnbrick, Zef, Wen, Zhaoduo, Yang, Zhenyu, and Zhao, Zhiwei
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
- Published
- 2024
8. Broadband planar electromagnetic hyper-lens with uniform magnification in air
- Author
- Sun, Ran, Sun, Fei, Chen, Hanchuan, Liu, Yichao, and Wang, Qi
- Subjects
Physics - Applied Physics
- Abstract
A planar hyper-lens, capable of sub-wavelength imaging of broadband electromagnetic waves, is designed based on an electromagnetic null medium. Subsequently, a scheme for implementing the proposed hyper-lens is given using well-designed flexural metal plates, which function as the reduced electromagnetic null medium for TM-polarized microwaves. Both simulated and measured results verify that the hyper-lens designed with flexural metal plates can achieve super-resolution imaging for microwaves at the operating wavelength (λ0 = 3 cm) with a resolution of 0.25λ0 and a uniform magnification of about 5. Moreover, the designed hyper-lens ensures that both the object and image surfaces are planes, and simultaneously provides a uniform magnification for objects in different positions. Additionally, the proposed hyper-lens offers broadband super-resolution imaging capability, achieving good super-resolution imaging effects for microwave frequencies ranging from 8.5 to 11 GHz. The proposed hyper-lens may find applications in high-precision imaging, detection, and sensing.
- Published
- 2024
9. The Fall of ROME: Understanding the Collapse of LLMs in Model Editing
- Author
- Yang, Wanli, Sun, Fei, Tan, Jiajun, Ma, Xinyu, Su, Du, Yin, Dawei, and Shen, Huawei
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Despite significant progress in model editing methods, their application in real-world scenarios remains challenging as they often cause large language models (LLMs) to collapse. Among them, ROME is particularly concerning, as it can disrupt LLMs with only a single edit. In this paper, we study the root causes of such collapse. Through extensive analysis, we identify two primary factors that contribute to the collapse: i) inconsistent handling of prefixed and unprefixed keys in the parameter update equation may result in very small denominators, causing excessively large parameter updates; ii) the subject of collapse cases is usually the first token, whose unprefixed key distribution significantly differs from the prefixed key distribution in autoregressive transformers, causing the aforementioned issue to materialize. To validate our analysis, we propose a simple yet effective approach: uniformly using prefixed keys during the editing phase and adding prefixes during the testing phase. The experimental results show that the proposed solution can prevent model collapse while maintaining the effectiveness of the edits.
- Published
- 2024
10. Impurity-level induced broadband photoelectric response in wide-band semiconductor SrSnO3
- Author
- Zhang, Yuyang, Wang, Lisheng, Wu, Weijie, Wang, Zhaoyang, Sun, Fei, Jiang, He, Zhang, Bangmin, and Zheng, Yue
- Subjects
Physics - Applied Physics, Condensed Matter - Materials Science
- Abstract
Broadband spectrum detectors show great promise in fields such as multispectral imaging and optical communications. Despite significant progress, challenges like material instability, complex manufacturing processes and high costs still hinder further application. Here we present a method that achieves broadband spectral detection via impurity levels in SrSnO3. We report over 200 mA/W photo-responsivity at 275 nm (ultraviolet C, solar-blind) and 367 nm (ultraviolet A) and ~1 mA/W photo-responsivity at 532 nm and 700 nm (visible) at a voltage bias of 5 V. Further transport and photoluminescence results indicate that the broadband response comes from the impurity levels and their mutual interactions. Additionally, the photodetector demonstrates excellent robustness and stability under repeated tests and prolonged exposure to air. These findings show the potential of SSO photodetectors and propose a method to achieve broadband spectrum detection, creating new possibilities for the development of single-phase, low-cost, simple-structure and high-efficiency photodetectors.
- Comment
- 5 Figures
- Published
- 2024
11. Is Flash Attention Stable?
- Author
- Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, and Wu, Carole-Jean
- Subjects
Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantifying it is especially challenging given the costly nature of training runs. In this work, we develop a principled approach to understanding the effects of numeric deviation, and construct proxies to put observations into context when downstream effects are difficult to quantify. As a case study, we apply this framework to analyze the widely-adopted Flash Attention optimization. We find that Flash Attention sees roughly an order of magnitude more numeric deviation than Baseline Attention at BF16 when measured during an isolated forward pass. We then use a data-driven analysis based on the Wasserstein distance to provide upper bounds on how this numeric deviation impacts model weights during training, finding that the numerical deviation present in Flash Attention is 2-5 times less significant than that introduced by low-precision training.
- Published
- 2024
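Two of the quantities this abstract leans on, the elementwise numeric deviation between two attention implementations and the 1D Wasserstein distance between weight distributions, are easy to state precisely. A minimal sketch (function names are ours; the paper's exact measurement pipeline is not reproduced here):

```python
def max_abs_deviation(xs, ys):
    """Largest elementwise gap between two runs of the same computation,
    e.g. Flash Attention vs. baseline attention outputs at BF16."""
    return max(abs(x - y) for x, y in zip(xs, ys))

def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size empirical distributions,
    usable as a proxy for how far model weights have drifted."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

dev = max_abs_deviation([0.10, 0.20], [0.11, 0.18])
```

Note that the two metrics answer different questions: the first bounds per-call numerical error, while the second compares whole weight distributions after training, which is how the paper contextualizes whether the deviation matters downstream.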
12. When to Trust LLMs: Aligning Confidence with Response Quality
- Author
- Tao, Shuchang, Yao, Liuyi, Ding, Hanxing, Xie, Yuexiang, Cao, Qi, Sun, Fei, Gao, Jinyang, Shen, Huawei, and Ding, Bolin
- Subjects
Computer Science - Computation and Language
- Abstract
Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods often express reliability via a confidence level; however, their effectiveness is limited by the lack of objective guidance. To address this, we propose a CONfidence-Quality-ORDer-preserving alignment approach (CONQORD), which leverages reinforcement learning guided by a tailored dual-component reward function. This function integrates a quality reward and an order-preserving alignment reward. Specifically, the order-preserving reward incentivizes the model to verbalize greater confidence for responses of higher quality, aligning the order of confidence and quality. Experiments demonstrate that CONQORD significantly improves the alignment between confidence and response accuracy without causing over-cautiousness. Furthermore, the aligned confidence provided by CONQORD informs when to trust LLMs and acts as a determinant for initiating the retrieval of external knowledge. Aligning confidence with response quality ensures more transparent and reliable responses, providing better trustworthiness.
- Comment
- Accepted by ACL 2024
- Published
- 2024
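The order-preserving component can be illustrated as a pairwise agreement score between stated confidence and measured quality. A hedged sketch (the paper's actual reward is a shaped RL signal, not this exact fraction):

```python
def order_preserving_reward(confidences, qualities):
    """Fraction of response pairs whose confidence order matches
    their quality order (1.0 = perfectly order-preserving)."""
    pairs = [(i, j) for i in range(len(qualities))
                    for j in range(i + 1, len(qualities))
                    if qualities[i] != qualities[j]]
    if not pairs:
        return 1.0
    agree = sum((confidences[i] - confidences[j])
                * (qualities[i] - qualities[j]) > 0
                for i, j in pairs)
    return agree / len(pairs)
```

A model that voices high confidence on its best answers scores 1.0; one whose confidence ranking inverts quality scores 0.0, which is the behavior the order-preserving reward penalizes.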
13. Experimental demonstration of a thermal-EM concentrator for enhancing EM signals and converging heat fluxes simultaneously
- Author
- Chen, Hanchuan, Liu, Yichao, Sun, Fei, Sun, Qianhan, Wu, Xiaoxiao, and Sun, Ran
- Subjects
Physics - General Physics
- Abstract
Simultaneously concentrating EM waves and heat fluxes onto the same target region within an on-chip system carries substantial academic research importance and practical application value. Nevertheless, existing research is primarily aimed at the design and experimental demonstration of concentrators for EM waves or temperature fields individually. In this work, a thermal-EM concentrator, capable of simultaneously concentrating EM waves and heat fluxes, is designed using transformation optics/thermodynamics and fabricated with engineered EM-thermal metamaterials. The concentrating effects of the proposed thermal-EM concentrator on heat fluxes and EM waves are verified through numerical simulations and experimental measurements, respectively, which are in good agreement with each other. Both numerically simulated and experimentally measured results demonstrate the concentrating capability of the proposed thermal-EM concentrator, which can concentrate broadband TM-polarized EM waves ranging from 8 to 12 GHz and heat/cold flows to the same target region within an on-chip operating environment. The thermal-EM concentrator exhibits a thermal focusing efficiency close to 100% and a more than threefold enhancement of the magnetic field at the designed center frequency of 10 GHz. The proposed thermal-EM concentrator can be utilized for efficient cooling of a specified component while simultaneously enhancing an EM antenna's radiation/reception efficiency within an on-chip system.
- Comment
- 15 pages, 5 figures
- Published
- 2024
- Full Text
- View/download PDF
14. Chiral phase transition and spin alignment of vector meson in the Polarized-Polyakov-loop Nambu-Jona-Lasinio model under rotation
- Author
- Sun, Fei, Shao, Jingdong, Wen, Rui, Xu, Kun, and Huang, Mei
- Subjects
High Energy Physics - Phenomenology
- Abstract
By using an extrapolation method, a polarized Polyakov-loop potential at finite real angular velocity is constructed from lattice results at finite imaginary angular velocity. The chiral and deconfinement phase transitions under rotation are investigated simultaneously in the Polarized-Polyakov-loop Nambu-Jona-Lasinio (PPNJL) model. It is observed that the critical temperatures of both the deconfinement and chiral phase transitions increase with the angular velocity, consistent with lattice results. The spin alignment of the vector meson shows a negative deviation of $\rho_{00} - 1/3$ under rotation, and the deviation in the PPNJL model is much more significant than in the NJL model and the quark coalescence model, revealing the important role of rotating gluons in quark polarization.
- Comment
- 10 pages, 6 figures
- Published
- 2024
15. Unlink to Unlearn: Simplifying Edge Unlearning in GNNs
- Author
- Tan, Jiajun, Sun, Fei, Qiu, Ruichen, Su, Du, and Shen, Huawei
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
- Abstract
As concerns over data privacy intensify, unlearning in Graph Neural Networks (GNNs) has emerged as a prominent research frontier in academia. This concept is pivotal in enforcing the "right to be forgotten", which entails the selective removal of specific data from trained GNNs upon user request. Our research focuses on edge unlearning, a process of particular relevance to real-world applications. Current state-of-the-art approaches like GNNDelete can eliminate the influence of specific edges yet suffer from "over-forgetting", which means the unlearning process inadvertently removes excessive information beyond what is needed, leading to a significant performance decline for the remaining edges. Our analysis identifies the loss functions of GNNDelete as the primary source of over-forgetting and also suggests that loss functions may be redundant for effective edge unlearning. Building on these insights, we simplify GNNDelete to develop Unlink to Unlearn (UtU), a novel method that facilitates unlearning exclusively through unlinking the forget edges from the graph structure. Our extensive experiments demonstrate that UtU delivers privacy protection on par with that of a retrained model while preserving high accuracy in downstream tasks, upholding over 97.3% of the retrained model's privacy protection capabilities and 99.8% of its link prediction accuracy. Meanwhile, UtU requires only constant computational demands, underscoring its advantage as a highly lightweight and practical edge unlearning solution.
- Comment
- Accepted by WWW 2024 as a Short Research Paper
- Published
- 2024
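Since UtU reduces edge unlearning to pure unlinking, its core operation is just edge deletion, with no auxiliary unlearning loss. A minimal undirected-graph sketch (the edge-list data layout is our own simplification):

```python
def unlink_to_unlearn(edges, forget_edges):
    """Drop every forget edge (in either direction) from the edge list.

    No loss terms or retraining are involved; the constant-time removal
    is what makes the approach lightweight."""
    forget = {frozenset(e) for e in forget_edges}
    return [e for e in edges if frozenset(e) not in forget]

# (3, 2) matches the stored edge (2, 3) regardless of direction.
remaining = unlink_to_unlearn([(1, 2), (2, 3), (3, 4)], [(3, 2)])
```

Using `frozenset` per edge makes the match orientation-agnostic, which matters because edge lists for undirected graphs often store each pair in an arbitrary order.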
16. The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse
- Author
- Yang, Wanli, Sun, Fei, Ma, Xinyu, Liu, Xun, Yin, Dawei, and Cheng, Xueqi
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Although model editing has shown promise in revising knowledge in Large Language Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this work, we reveal a critical phenomenon: even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. However, benchmarking LLMs after each edit, while necessary to prevent such collapses, is impractically time-consuming and resource-intensive. To mitigate this, we propose using perplexity as a surrogate metric, validated by extensive experiments demonstrating that changes in an edited model's perplexity are strongly correlated with its downstream task performance. We further conduct an in-depth study on sequential editing, a practical setting for real-world scenarios, across various editing methods and LLMs, focusing on hard cases from our previous single-edit studies. The results indicate that nearly all examined editing methods result in model collapse after only a few edits. To facilitate further research, we have utilized GPT-3.5 to develop a new dataset, HardEdit, based on those hard cases. This dataset aims to establish the foundation for pioneering research in reliable model editing and the mechanisms underlying editing-induced model collapse. We hope this work can draw the community's attention to the potential risks inherent in model editing practices.
- Comment
- Accepted at Findings of ACL 2024
- Published
- 2024
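Perplexity as a collapse surrogate is cheap to monitor. A minimal sketch of the metric itself (how the paper samples text to compute it is not shown here):

```python
import math

def perplexity(token_logprobs):
    """exp(-mean log p) over a token sequence; a sharp rise after an
    edit is the proposed warning sign of model collapse."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model assigning probability 1/2 to every token has perplexity 2.
ppl = perplexity([math.log(0.5)] * 4)
```

Tracking this one scalar after each edit replaces a full benchmark sweep, which is the paper's efficiency argument.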
17. Tunable uniform field enhancement in a subwavelength air pillar by photonic doping in epsilon-near-zero medium
- Author
- Sun, Fei, Shan, Jinyuan, and Liu, Yichao
- Subjects
Physics - Applied Physics, Physics - Optics
- Abstract
In this study, a novel electric field compressor is proposed by doping a metal-air-metal pillar in an epsilon-near-zero medium within a metallic waveguide, which effectively enhances the background electric fields in a sub-wavelength air pillar with high uniformity. The field enhancement factor can be determined analytically through theoretical derivations from Maxwell's equations, reaching its maximum value by appropriately selecting the size of the air pillar. Furthermore, the proposed compressor can achieve a tunable electric field enhancement effect within the deep sub-wavelength air pillar by adjusting the height of the air pillar through movement of two metal pillars inserted into the waveguide. Both theoretical analysis and numerical simulations are employed to validate the performance of the electric field compressor, which exhibits a wide range of tunable field enhancement (i.e., a continuously adjustable enhancement factor between 20 and 800) with uniformity below 1.5 within the deep sub-wavelength air pillar (wherein the air volume is smaller than λ0/10 × λ0/10 × λ0/16). Finally, a practical model is proposed for realizing the electric field compressor in the microwave range, where specially placed metal pillars and wires are incorporated into a rectangular metallic waveguide structure. The field enhancement and tunability of this practical model have been verified through simulations.
- Published
- 2024
- Full Text
- View/download PDF
18. Evaluation of the Liver Disease Information in Baidu Encyclopedia and Wikipedia: Longitudinal Study
- Author
-
Sun, Fei, Yang, Fuchun, and Zheng, Shusen
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Public aspects of medicine ,RA1-1270 - Abstract
Background: The internet has changed the way people acquire health information. Previous studies have shown that Wikipedia is a reasonably reliable medical resource, and it has been ranked higher than other general websites in various search engines. Baidu Encyclopedia is one of the most popular encyclopedia websites in China. However, no studies have assessed the quality of the content provided in Baidu Encyclopedia. Objective: This study aimed to evaluate the quality of liver disease information provided by Wikipedia (in English) and Baidu Encyclopedia (in Chinese) and to compare the quality and timeliness of the articles published in these two encyclopedias. Moreover, a 3-year follow-up study was conducted to determine whether the information on both websites was updated regularly over this period. Methods: We searched for information on liver diseases by using the International Statistical Classification of Diseases and Related Health Problems 10th Revision Version 2016 codes on Wikipedia (in English) and Baidu Encyclopedia (in Chinese). The quality of the articles was assessed using the DISCERN instrument, which consists of 3 sections. We recorded the latest editing date of the webpages and calculated the date interval to evaluate the update timeliness of these websites. Results: We found 22 entries on liver diseases in Baidu Encyclopedia and 15 articles in Wikipedia between September 15, 2016, and September 30, 2016, and we found 25 entries in Baidu Encyclopedia and 16 articles in Wikipedia between September 15, 2019, and September 30, 2019. In section 1 of the DISCERN instrument, the mean (SE) scores of Baidu Encyclopedia entries were significantly lower than those of Wikipedia articles. In sections 2 and 3 of the DISCERN instrument, the scores of Baidu Encyclopedia entries were lower than those of Wikipedia articles, but the differences were not statistically significant. The total DISCERN scores of Baidu Encyclopedia entries were significantly lower than those of Wikipedia articles. The update interval of the entries in Baidu Encyclopedia was significantly longer than that of the articles in Wikipedia. Conclusions: This study shows that the quality of articles and the reliability of the research content on liver diseases in Wikipedia are better than those of the entries in Baidu Encyclopedia. However, the quality of the treatment choices presented in both Wikipedia and Baidu Encyclopedia is not satisfactory. Wikipedia is updated more frequently than Baidu Encyclopedia, thereby ensuring that the information presented reflects the most recent research findings. The findings of our study suggest that, to find accurate health information, it is important to seek the help of medical professionals rather than looking for a prescription amid the confusing information available on the internet.
- Published
- 2021
- Full Text
- View/download PDF
19. The Effect of Mindfulness on Marital Stability via Perceived Stress: Findings from a Dyadic Study of Older Couples in China
- Author
-
Zhang, Rong, Wu, Jie, Duan, Yemo, and Sun, Fei
- Published
- 2024
- Full Text
- View/download PDF
20. The role of tumor-associated macrophages in the radioresistance of esophageal cancer cells via regulation of the VEGF-mediated angiogenic pathway
- Author
-
Sun, Fei, Lian, Yingying, Zhou, Mengyun, Luo, Judong, Hu, Lijun, Wang, Jianlin, Sun, Zhiqiang, and Yu, Jingping
- Published
- 2024
- Full Text
- View/download PDF
21. Study on Acoustic Emission and Crack Propagation Characteristics of Single-Fissured Sandstone with Different Angles Under Uniaxial Compression
- Author
-
Guo, Jia-Qi, Zhu, Zi-Hui, Chen, Jian-Xun, Sun, Fei-Yue, and Wang, Zheng
- Published
- 2024
- Full Text
- View/download PDF
22. LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks
- Author
-
Zhang, Kaike, Cao, Qi, Wu, Yunfan, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval - Abstract
Sequential recommender systems stand out for their ability to capture users' dynamic interests and the patterns of item-to-item transitions. However, the inherent openness of sequential recommender systems renders them vulnerable to poisoning attacks, where fraudulent users are injected into the training data to manipulate learned patterns. Traditional defense strategies predominantly depend on predefined assumptions or rules extracted from specific known attacks, limiting their generalizability to unknown attack types. To solve the above problems, considering the rich open-world knowledge encapsulated in Large Language Models (LLMs), our research initially focuses on the capabilities of LLMs in the detection of unknown fraudulent activities within recommender systems, a strategy we denote as LLM4Dec. Empirical evaluations demonstrate the substantial capability of LLMs in identifying unknown fraudsters, leveraging their expansive, open-world knowledge. Building upon this, we propose the integration of LLMs into defense strategies to extend their effectiveness beyond the confines of known attacks. We propose LoRec, an advanced framework that employs LLM-Enhanced Calibration to strengthen the robustness of sequential recommender systems against poisoning attacks. LoRec integrates an LLM-enhanced CalibraTor (LCT) that refines the training process of sequential recommender systems with knowledge derived from LLMs, applying a user-wise reweighting to diminish the impact of fraudsters injected by attacks. By incorporating LLMs' open-world knowledge, the LCT effectively converts the limited, specific priors or rules into a more general pattern of fraudsters, offering improved defenses against poisoning attacks. Our comprehensive experiments validate that LoRec, as a general framework, significantly strengthens the robustness of sequential recommender systems.
- Published
- 2024
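The user-wise reweighting that LoRec's LLM-enhanced CalibraTor applies can be sketched in a few lines. This is an illustration only, assuming an upstream LLM-based detector has already produced a suspicion score in [0, 1] per user; the scores and losses below are made up:

```python
# Minimal sketch of user-wise loss reweighting: users the detector flags as
# likely fraudsters contribute less to the recommender's training objective.
def reweighted_loss(per_user_losses, suspicion):
    weights = [1.0 - s for s in suspicion]   # higher suspicion -> lower weight
    total_w = sum(weights)
    return sum(w * l for w, l in zip(weights, per_user_losses)) / total_w

losses = [0.4, 0.5, 3.0]   # third user looks like an injected fraudster
scores = [0.1, 0.0, 0.9]   # hypothetical LLM-derived suspicion scores
print(round(reweighted_loss(losses, scores), 3))
```

The paper's actual calibration is learned jointly with the recommender; the sketch only shows the mechanical effect of down-weighting suspicious profiles.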
23. Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?
- Author
-
Tan, Hexiang, Sun, Fei, Yang, Wanli, Wang, Yuanzhuo, Cao, Qi, and Cheng, Xueqi
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
While auxiliary information has become a key to enhancing Large Language Models (LLMs), relatively little is known about how LLMs merge these contexts, specifically contexts generated by LLMs and those retrieved from external sources. To investigate this, we formulate a systematic framework to identify whether LLMs' responses are attributed to either generated or retrieved contexts. To easily trace the origin of the response, we construct datasets with conflicting contexts, i.e., each question is paired with both generated and retrieved contexts, yet only one of them contains the correct answer. Our experiments reveal a significant bias in several LLMs (GPT-4/3.5 and Llama2) to favor generated contexts, even when they provide incorrect information. We further identify two key factors contributing to this bias: i) contexts generated by LLMs typically show greater similarity to the questions, increasing their likelihood of being selected; ii) the segmentation process used in retrieved contexts disrupts their completeness, thereby hindering their full utilization in LLMs. Our analysis enhances the understanding of how LLMs merge diverse contexts, offers valuable insights for advancing current LLM augmentation methods, and highlights the risk of generated misinformation for retrieval-augmented LLMs., Comment: Accepted at ACL 2024 Main, Homepage (https://tan-hexiang.github.io/Blinded_by_Generated_Contexts/)
- Published
- 2024
24. Photoproduction of lepton pair in ultra-relativistic heavy-ion collisions
- Author
-
Yu, Kewei, Peng, Jiazhen, Li, Shuang, Wu, Kejun, Xie, Wei, and Sun, Fei
- Subjects
Nuclear Theory ,High Energy Physics - Phenomenology - Abstract
Dilepton production provides a unique probe of the strong electromagnetic field produced in heavy-ion collisions. To map out the behavior of its transverse momentum broadening, we present a theoretical model based on the equivalent photon approximation, and we then update it to make direct comparisons with recent experimental measurements. We find that the model calculations can describe well not only the average transverse momentum squared of $e^{+}e^{-}$ pairs in Au--Au collisions at $\sqrt{s_{\rm NN}}=200$ GeV, but also the acoplanarity of $\mu^{+}\mu^{-}$ pairs in Pb--Pb collisions at $\sqrt{s_{\rm NN}}=5.02$ TeV. Furthermore, the model predictions are also able to reproduce the measured dependencies on the pair mass and the transverse momentum squared., Comment: 10 pages, 9 figures
- Published
- 2024
- Full Text
- View/download PDF
25. Unraveling collisional energy loss of a heavy quark in quark-gluon plasma
- Author
-
Peng, Jiazhen, Yu, Kewei, Li, Shuang, Xiong, Wei, Sun, Fei, and Xie, Wei
- Subjects
High Energy Physics - Phenomenology ,Nuclear Theory - Abstract
At leading order in the QCD coupling constant, we compute the energy loss per traveling distance, $dE/dz$, of a heavy quark from elastic scattering off thermal quarks and gluons at a temperature $T$, including the thermal perturbative description of soft scatterings ($-t<-t^{\ast}$) and a perturbative QCD-based calculation for hard collisions ($-t>-t^{\ast}$). Within this soft-hard factorization model, we find that the full result for $dE/dz$ shows only a mild sensitivity to the intermediate cutoff $t^{\ast}$, supporting the validity of the soft-hard approach within the temperature region of interest. We re-derive the analytic formula for $dE/dz$ in the high-energy approximation, $E_{1}\gg m^{2}_{1}/T$, where $E_{1}$ is the injected heavy quark energy and $m_{1}$ is its mass. It is realized that the soft logarithmic contribution, $dE/dz\propto \ln(-t^{\ast}/m^{2}_{D})$, arises from the $t$-channel scattering off thermal partons, while the hard logarithmic term, $dE/dz\propto \ln[E_{1}T/(-t^{\ast})]$, stems from the $t$-channel scattering off thermal partons, and the term $dE/dz\propto \ln(E_{1}T/m^{2}_{1})$ comes from the $s$- and $u$-channel scattering off gluons. The sum of these contributions cancels the $t^{\ast}$-dependence, as observed in the full result. The mass hierarchy $dE/dz(charm)>dE/dz(bottom)$ is observed. Our full results are crucial for a better description of heavy quark transport in the QCD medium, in particular at low and moderate energy. We also calculate the energy loss by imposing the Einstein relation; the resulting values appear to be systematically larger than those obtained without imposing it., Comment: 16 pages, 8 figures
- Published
- 2024
- Full Text
- View/download PDF
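The cutoff cancellation described in the abstract above can be made explicit in one line: adding the soft and hard logarithms gives

$\ln\frac{-t^{\ast}}{m^{2}_{D}} + \ln\frac{E_{1}T}{-t^{\ast}} = \ln\frac{E_{1}T}{m^{2}_{D}}$,

so the intermediate cutoff $t^{\ast}$ drops out of the sum, consistent with the mild $t^{\ast}$-sensitivity of the full result.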
26. Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
- Author
-
Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, and Wu, Carole-Jean
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Machine Learning ,Computer Science - Multimedia - Abstract
As the development of large-scale Generative AI models evolve beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation models. Current model architecture designs are bifurcated into 2 categories: Diffusion- and Transformer-based models. Our systematic performance characterization on a suite of eight representative TTI/TTV models shows that after state-of-the-art optimization techniques such as Flash Attention are applied, Convolution accounts for up to 44% of execution time for Diffusion-based TTI models, while Linear layers consume up to 49% of execution time for Transformer-based models. We additionally observe that Diffusion-based TTI models resemble the Prefill stage of LLM inference, and benefit from 1.1-2.5x greater speedup from Flash Attention than Transformer-based TTI models that resemble the Decode phase. Since optimizations designed for LLMs do not map directly onto TTI/TTV models, we must conduct a thorough characterization of these workloads to gain insights for new optimization opportunities. In doing so, we define sequence length in the context of TTI/TTV models and observe sequence length can vary up to 4x in Diffusion model inference. We additionally observe temporal aspects of TTV workloads pose unique system bottlenecks, with Temporal Attention accounting for over 60% of total Attention time. Overall, our in-depth system performance characterization is a critical first step towards designing efficient and deployable systems for emerging TTI/TTV workloads., Comment: Published at 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
- Published
- 2023
27. On-chip omnidirectional electromagnetic-thermal cloak
- Author
-
Liu, Yichao, Chen, Hanchuan, Zhao, Gang, and Sun, Fei
- Subjects
Physics - Applied Physics - Abstract
Simultaneously guiding electromagnetic waves and heat flow at any incidence angle to smoothly bypass electromagnetic/thermal sensitive elements is a key factor in ensuring efficient communication and thermal protection for an on-chip system. In this study, an omnidirectional on-chip electromagnetic-thermal cloak is proposed. Firstly, a holey metallic plate with a periodic array of subwavelength apertures is designed by optical surface transformation to realize an omnidirectional electromagnetic cloaking module for the on-chip electromagnetic signal. Secondly, a two-layer ring-shaped engineered thermal structure is designed by solving the Laplace equation to realize an omnidirectional thermal cloaking module for on-chip heat flow. Finally, these two cloaking modules are combined elaborately to achieve a cloaking effect for both the electromagnetic waves and thermal fields simultaneously from any detecting direction, thus protecting the built-in electromagnetic/thermal sensitive elements without disturbing the external electromagnetic/thermal signal. The proposed electromagnetic-thermal cloak may have potential advantages in dealing with omnidirectional electromagnetic compatibility/shielding and multi-directional thermal management/dissipation of an on-chip system.
- Published
- 2023
- Full Text
- View/download PDF
28. EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
- Author
-
Xiong, Yunyang, Varadarajan, Bala, Wu, Lemeng, Xiang, Xiaoyu, Xiao, Fanyi, Zhu, Chenchen, Dai, Xiaoliang, Wang, Dilin, Sun, Fei, Iandola, Forrest, Krishnamoorthi, Raghuraman, and Chandra, Vikas
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision applications. A key component that drives its impressive performance in zero-shot transfer and high versatility is a super large Transformer model trained on the extensive high-quality SA-1B dataset. While beneficial, the huge computation cost of the SAM model has limited its use in wider real-world applications. To address this limitation, we propose EfficientSAMs, light-weight SAM models that exhibit decent performance with largely reduced complexity. Our idea is based on leveraging masked image pretraining, SAMI, which learns to reconstruct features from the SAM image encoder for effective visual representation learning. Further, we take SAMI-pretrained light-weight image encoders and a mask decoder to build EfficientSAMs, and finetune the models on SA-1B for the segment anything task. We perform evaluations on multiple vision tasks including image classification, object detection, instance segmentation, and semantic object detection, and find that our proposed pretraining method, SAMI, consistently outperforms other masked image pretraining methods. On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably, with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.
- Published
- 2023
29. TEA: Test-time Energy Adaptation
- Author
-
Yuan, Yige, Xu, Bingbing, Hou, Liang, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Machine Learning - Abstract
Test-time adaptation (TTA) aims to improve model generalizability when test data diverges from training distribution, offering the distinct advantage of not requiring access to training data and processes, especially valuable in the context of large pre-trained models. However, current TTA methods fail to address the fundamental issue: covariate shift, i.e., the decreased generalizability can be attributed to the model's reliance on the marginal distribution of the training data, which may impair model calibration and introduce confirmation bias. To address this, we propose a novel energy-based perspective, enhancing the model's perception of target data distributions without requiring access to training data or processes. Building on this perspective, we introduce $\textbf{T}$est-time $\textbf{E}$nergy $\textbf{A}$daptation ($\textbf{TEA}$), which transforms the trained classifier into an energy-based model and aligns the model's distribution with the test data's, enhancing its ability to perceive test distributions and thus improving overall generalizability. Extensive experiments across multiple tasks, benchmarks and architectures demonstrate TEA's superior generalization performance against state-of-the-art methods. Further in-depth analyses reveal that TEA can equip the model with a comprehensive perception of test distribution, ultimately paving the way toward improved generalization and calibration., Comment: Accepted by IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR 2024). Code is available at https://github.com/yuanyige/tea
- Published
- 2023
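The classifier-to-EBM conversion at the heart of TEA has a compact form: the energy assigned to an input is the negative log-sum-exp of the classifier's logits, and adaptation then lowers this energy on test data. A minimal, framework-free sketch (the logit values are illustrative, not from the paper):

```python
import math

def energy(logits):
    """Energy of an input under a classifier viewed as an EBM:
    E(x) = -logsumexp(f(x)), computed with the max-shift trick for stability."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

confident = energy([5.0, 0.0, 0.0])   # low energy: input the model covers well
uncertain = energy([0.1, 0.0, 0.0])   # higher energy: poorly modeled input
print(confident < uncertain)          # True
```

In TEA proper, this energy is minimized over test batches by gradient steps on the trained classifier, so the model's implicit density shifts toward the test distribution; the sketch only shows the energy definition itself.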
30. Evaluation of the Tensile Properties of Vanadium-Added Steels with Different Ferrite and Pearlite Hardness Ratios
- Author
-
Kawamura, Minato, Ogawa, Toshio, Sun, Fei, and Adachi, Yoshitaka
- Published
- 2024
- Full Text
- View/download PDF
31. Improvement in the Strength–Ductility Balance of Tempered Martensite Steel by Controlling Cementite Particle Size Distribution
- Author
-
Hayakawa, Kenji, Ogawa, Toshio, He, Lei, Sun, Fei, and Adachi, Yoshitaka
- Published
- 2024
- Full Text
- View/download PDF
32. Simultaneously enhancing toluene adsorption and regeneration process by hierarchical pore in activated coke: a combined experimental and adsorption kinetic modeling study
- Author
-
Chen, Guoqing, Zhang, Wenshuang, Sun, Fei, Qu, Zhibin, Hu, Yun, Li, Xuhan, Li, Junfeng, and Wang, Tao
- Published
- 2024
- Full Text
- View/download PDF
33. Future in the past: paternal reprogramming of offspring phenotype and the epigenetic mechanisms
- Author
-
Wu, Di, Zhang, Kejia, Guan, Kaifeng, Khan, Faheem Ahmed, Pandupuspitasari, Nuruliarizki Shinta, Negara, Windu, Sun, Fei, and Huang, Chunjie
- Published
- 2024
- Full Text
- View/download PDF
34. Tbx21 gene and its association with resistance against viral nervous necrosis (VNN) in Asian seabass, Lates calcarifer
- Author
-
Wong, Joey, Yang, Zituo, Wang, Le, Sun, Fei, and Yue, Gen Hua
- Published
- 2024
- Full Text
- View/download PDF
35. Unveiling the tapestry of teacher belief research: tracing the present and forging the future through bibliometric analysis
- Author
-
Wang, Xiaochen, Gao, Yang, Sun, Fei, and Wang, Qikai
- Published
- 2024
- Full Text
- View/download PDF
36. The rotation effect on the thermodynamics of the QCD matter
- Author
-
Sun, Fei, Li, Shuang, Wen, Rui, Huang, Anping, and Xie, Wei
- Subjects
High Energy Physics - Phenomenology - Abstract
In this study, we investigate the impact of rotation on the thermodynamic characteristics of QCD matter using the three-flavor NJL model. We examine the temperature, quark chemical potential, and angular velocity dependencies of key thermodynamic quantities, such as the trace anomaly, specific heat, speed of sound, angular momentum, and moment of inertia. As the main finding of our analysis, we observe that the speed of sound exhibits a nonmonotonic behavior as the angular velocity changes., Comment: 18 pages, 19 figures
- Published
- 2023
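For orientation, the bulk quantities named in the abstract above have standard definitions, written here in their usual non-rotating forms (the paper's rotating-frame generalizations may differ):

$\Delta = \frac{\varepsilon - 3p}{T^{4}}, \qquad c_{V} = \frac{\partial \varepsilon}{\partial T}, \qquad c^{2}_{s} = \frac{\partial p}{\partial \varepsilon}$,

where $\varepsilon$ is the energy density and $p$ the pressure.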
37. Robust Recommender System: A Survey and Future Directions
- Author
-
Zhang, Kaike, Cao, Qi, Sun, Fei, Wu, Yunfan, Tao, Shuchang, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval - Abstract
With the rapid growth of information, recommender systems have become integral for providing personalized suggestions and overcoming information overload. However, their practical deployment often encounters "dirty" data, where noise or malicious information can lead to abnormal recommendations. Research on improving recommender systems' robustness against such dirty data has thus gained significant attention. This survey provides a comprehensive review of recent work on recommender systems' robustness. We first present a taxonomy to organize current techniques for withstanding malicious attacks and natural noise. We then explore state-of-the-art methods in each category, including fraudster detection, adversarial training, and certifiably robust training against malicious attacks, as well as regularization, purification, and self-supervised learning against natural noise. Additionally, we summarize evaluation metrics and common datasets used to assess robustness. We discuss robustness across varying recommendation scenarios and its interplay with other properties like accuracy, interpretability, privacy, and fairness. Finally, we delve into open issues and future research directions in this emerging field. Our goal is to equip readers with a holistic understanding of robust recommender systems and spotlight pathways for future research and development.
- Published
- 2023
38. A Large Language Model Enhanced Conversational Recommender System
- Author
-
Feng, Yue, Liu, Shuchang, Xue, Zhenghai, Cai, Qingpeng, Hu, Lantao, Jiang, Peng, Gai, Kun, and Sun, Fei
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
Conversational recommender systems (CRSs) aim to recommend high-quality items to users through a dialogue interface. A CRS usually contains multiple sub-tasks, such as user preference elicitation, recommendation, explanation, and item information search. Developing effective CRSs poses several challenges: 1) how to properly manage sub-tasks; 2) how to effectively solve different sub-tasks; and 3) how to correctly generate responses that interact with users. Recently, Large Language Models (LLMs) have exhibited an unprecedented ability to reason and generate, presenting a new opportunity to develop more powerful CRSs. In this work, we propose a new LLM-based CRS, referred to as LLMCRS, to address the above challenges. For sub-task management, we leverage the reasoning ability of the LLM to effectively manage sub-tasks. For sub-task solving, we pair the LLM with expert models for different sub-tasks to achieve enhanced performance. For response generation, we utilize the generation ability of the LLM as a language interface to better interact with users. Specifically, LLMCRS divides the workflow into four stages: sub-task detection, model matching, sub-task execution, and response generation. LLMCRS also employs schema-based instruction, demonstration-based instruction, dynamic sub-task and model matching, and summary-based generation to instruct the LLM to generate the desired results in the workflow. Finally, to adapt the LLM to conversational recommendation, we also propose fine-tuning the LLM with reinforcement learning from CRS performance feedback, referred to as RLPF. Experimental results on benchmark datasets show that LLMCRS with RLPF outperforms existing methods.
- Published
- 2023
39. Farnesoid X receptor mediates macrophage-intrinsic responses to suppress colitis-induced colon cancer progression.
- Author
-
Dong, Xingchen, Qi, Ming, Cai, Chunmiao, Zhu, Yu, Li, Yuwenbin, Coulter, Sally, Sun, Fei, Liddle, Christopher, Uboha, Nataliya V, Halberg, Richard, Xu, Wei, Marker, Paul, and Fu, Ting
- Subjects
Biomedical and Clinical Sciences ,Immunology ,Autoimmune Disease ,Inflammatory Bowel Disease ,Digestive Diseases ,Crohn's Disease ,Colo-Rectal Cancer ,Cancer ,Aetiology ,2.1 Biological and endogenous factors ,Oral and gastrointestinal ,Animals ,Mice ,Humans ,Colonic Neoplasms ,Colitis ,Macrophages ,Inflammatory Bowel Diseases ,Inflammation ,Bile Acids and Salts ,Disease Models ,Animal ,Colorectal cancer ,Endocrinology ,Gastroenterology ,Biomedical and clinical sciences ,Health sciences - Abstract
Bile acids (BAs) affect the intestinal environment by ensuring barrier integrity, maintaining microbiota balance, regulating epithelium turnover, and modulating the immune system. As a master regulator of BA homeostasis, farnesoid X receptor (FXR) is severely compromised in patients with inflammatory bowel disease (IBD) and colitis-associated colorectal cancer (CAC). At the front line, gut macrophages react to the microbiota and metabolites that breach the epithelium. We aim to study the role of the BA/FXR axis in macrophages. This study demonstrates that inflammation-induced epithelial abnormalities compromised FXR signaling and altered BAs' profile in a mouse CAC model. Further, gut macrophage-intrinsic FXR sensed aberrant BAs, leading to pro-inflammatory cytokines' secretion, which promoted intestinal stem cell proliferation. Mechanistically, activation of FXR ameliorated intestinal inflammation and inhibited colitis-associated tumor growth, by regulating gut macrophages' recruitment, polarization, and crosstalk with Th17 cells. However, deletion of FXR in bone marrow or gut macrophages escalated the intestinal inflammation. In summary, our study reveals a distinctive regulatory role of FXR in gut macrophages, suggesting its potential as a therapeutic target for addressing IBD and CAC.
- Published
- 2024
40. An injury-responsive mmp14b enhancer is required for heart regeneration.
- Author
-
Zlatanova, Ivana, Sun, Fei, Wu, Roland, Chen, Xiaoxin, Lau, Bryan, Colombier, Pauline, Sinha, Tanvi, Xu, Shan-Mei, Huang, Guo, Black, Brian, Materna, Stefan, and Celona, Barbara
- Subjects
Animals ,Mice ,Zebrafish ,Endothelial Cells ,Myocardium ,Myocytes ,Cardiac ,Cell Proliferation ,Regeneration ,Mammals - Abstract
Mammals have limited capacity for heart regeneration, whereas zebrafish have extraordinary regeneration abilities. During zebrafish heart regeneration, endothelial cells promote cardiomyocyte cell cycle reentry and myocardial repair, but the mechanisms responsible for promoting an injury microenvironment conducive to regeneration remain incompletely defined. Here, we identify the matrix metalloproteinase Mmp14b as an essential regulator of heart regeneration. We identify a TEAD-dependent mmp14b endothelial enhancer induced by heart injury in zebrafish and mice, and we show that the enhancer is required for regeneration, supporting a role for Hippo signaling upstream of mmp14b. Last, we show that MMP-14 function in mice is important for the accumulation of Agrin, an essential regulator of neonatal mouse heart regeneration. These findings reveal mechanisms for extracellular matrix remodeling that promote heart regeneration.
- Published
- 2023
41. The Splitting of Chiral and Deconfinement Phase Transitions induced by Rotation
- Author
-
Sun, Fei, Xu, Kun, and Huang, Mei
- Subjects
High Energy Physics - Phenomenology - Abstract
The chiral and deconfinement phase transitions under rotation have been investigated simultaneously in the Polyakov-Nambu-Jona-Lasinio (PNJL) model. Interestingly, we find that the chiral phase transition is catalyzed while the deconfinement phase transition is decelerated by rotation; therefore, a chirally symmetric but confined phase is induced by rotation, which indicates that chiral dynamics and gluon dynamics can be split by rotation., Comment: 9 pages, 5 figures
- Published
- 2023
- Full Text
- View/download PDF
42. Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts
- Author
-
Jawahar, Ganesh, Yang, Haichuan, Xiong, Yunyang, Liu, Zechun, Wang, Dilin, Sun, Fei, Li, Meng, Pappu, Aasish, Oguz, Barlas, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Krishnamoorthi, Raghuraman, and Chandra, Vikas
- Subjects
Computer Science - Computation and Language - Abstract
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed due to weight sharing. In NLP tasks like machine translation and pre-trained language modeling, there is a significant performance gap between supernet and training from scratch for the same model architecture, necessitating retraining post optimal architecture identification. This study introduces a solution called mixture-of-supernets, a generalized supernet formulation leveraging mixture-of-experts (MoE) to enhance supernet model expressiveness with minimal training overhead. Unlike conventional supernets, this method employs an architecture-based routing mechanism, enabling indirect sharing of model weights among subnetworks. This customization of weights for specific architectures, learned through gradient descent, minimizes retraining time, significantly enhancing training efficiency in NLP. The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models, exhibiting a superior latency-BLEU tradeoff compared to HAT, the SoTA NAS framework for machine translation. Furthermore, it excels in NAS for building memory-efficient task-agnostic BERT models, surpassing NAS-BERT and AutoDistil across various model sizes. The code can be found at: https://github.com/UBC-NLP/MoS., Comment: ACL 2024 Findings
- Published
- 2023
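The architecture-routed weight generation in the mixture-of-supernets abstract above can be caricatured as a convex combination of expert weight matrices, with mixing coefficients produced by a router from the sampled architecture's descriptor. A deliberately simplified, dependency-free sketch; all names, shapes, and coefficient values are hypothetical:

```python
def mix_weights(expert_weights, alphas):
    """Combine expert weight matrices as W = sum_i alpha_i * W_i.
    expert_weights: list of equally shaped matrices (lists of lists).
    alphas: architecture-dependent routing coefficients, summing to 1."""
    rows, cols = len(expert_weights[0]), len(expert_weights[0][0])
    mixed = [[0.0] * cols for _ in range(rows)]
    for a, W in zip(alphas, expert_weights):
        for i in range(rows):
            for j in range(cols):
                mixed[i][j] += a * W[i][j]
    return mixed

# Two 1x2 experts; a router favouring expert 1 for this sampled architecture.
experts = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(mix_weights(experts, [0.25, 0.75]))  # [[2.5, 3.5]]
```

In the actual method the expert weights and the router are learned jointly by gradient descent; the point of the sketch is only that each sampled architecture gets its own effective weights without a separately stored model per architecture.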
43. PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
- Author
-
Yuan, Yige, Xu, Bingbing, Lin, Bo, Hou, Liang, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
The generalization of neural networks is a central challenge in machine learning, especially concerning the performance under distributions that differ from training ones. Current methods, mainly based on the data-driven paradigm such as data augmentation, adversarial training, and noise injection, may encounter limited generalization due to model non-smoothness. In this paper, we propose to investigate generalization from a Partial Differential Equation (PDE) perspective, aiming to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data. Specifically, we first establish the connection between neural network generalization and the smoothness of the solution to a specific PDE, namely "transport equation". Building upon this, we propose a general framework that introduces adaptive distributional diffusion into transport equation to enhance the smoothness of its solution, thereby improving generalization. In the context of neural networks, we put this theoretical framework into practice as $\textbf{PDE+}$ ($\textbf{PDE}$ with $\textbf{A}$daptive $\textbf{D}$istributional $\textbf{D}$iffusion) which diffuses each sample into a distribution covering semantically similar inputs. This enables better coverage of potentially unobserved distributions in training, thus improving generalization beyond merely data-driven methods. The effectiveness of PDE+ is validated through extensive experimental settings, demonstrating its superior performance compared to SOTA methods., Comment: Accepted by Annual AAAI Conference on Artificial Intelligence (AAAI) 2024. Code is available at https://github.com/yuanyige/pde-add
- Published
- 2023
44. Retraction Note: Policies to obtain energy transformation target: evidence from emission accounting impacts
- Author
-
Qu, Zhaojun, Sun, Fei, and Wu, Qitao
- Published
- 2024
- Full Text
- View/download PDF
45. tRNA modifications and tRNA-derived small RNAs: new insights of tRNA in human disease
- Author
-
Wu, Di, Li, Xiuling, Khan, Faheem Ahmed, Yuan, Chenyang, Pandupuspitasari, Nuruliarizki Shinta, Huang, Chunjie, Sun, Fei, and Guan, Kaifeng
- Published
- 2024
- Full Text
- View/download PDF
46. Emergence and transformation of polar skyrmion lattices via flexoelectricity
- Author
-
Ren, Jianhua, Liu, Linjie, Sun, Fei, He, Qian, Wu, Mengjun, Chen, Weijin, and Zheng, Yue
- Published
- 2024
- Full Text
- View/download PDF
47. Abnormal expression of circ_0013958 in patients with acute myocardial infarction (AMI) and its influence on prognosis
- Author
-
Sun, Fei, Zou, Shenglan, Li, Xiaomin, and Liu, Xueya
- Published
- 2024
- Full Text
- View/download PDF
48. Variant analysis and PGT-M of OTC gene in a Chinese family with ornithine carbamoyltransferase deficiency
- Author
-
Zhou, Yao, Jiang, Xinxing, Zhang, Yongfang, Zhang, Yu, Sun, Fei, and Ma, Yanlin
- Published
- 2024
- Full Text
- View/download PDF
49. Detection of AZF microdeletions and analysis of reproductive hormonal profiles in Hainan men undergoing assisted reproductive technology
- Author
-
He, Qina, Zhang, Yongle, Song, Mengyi, Zhou, Yao, Lin, Dan, Ma, Yanlin, Sun, Fei, and Li, Qi
- Published
- 2024
- Full Text
- View/download PDF
50. Effects of different doses of intranasal dexmedetomidine on related complications and parents’ satisfaction in anesthetized children: a systematic review
- Author
-
Hu, Wei, Wang, Ming, and Sun, Fei
- Published
- 2024
- Full Text
- View/download PDF