7,052 results for "Sun, Fei"
Search Results
2. Half Bowl of Mengpo's Soup
- Author
- Sun, Fei
- Published
- 2022
- Full Text
- View/download PDF
3. Improving the Shortest Plank: Vulnerability-Aware Adversarial Training for Robust Recommender System
- Author
- Zhang, Kaike, Cao, Qi, Wu, Yunfan, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval
- Abstract
Recommender systems play a pivotal role in mitigating information overload in various fields. Nonetheless, the inherent openness of these systems introduces vulnerabilities, allowing attackers to insert fake users into the system's training data to skew the exposure of certain items, known as poisoning attacks. Adversarial training has emerged as a notable defense mechanism against such poisoning attacks within recommender systems. Existing adversarial training methods apply perturbations of the same magnitude across all users to enhance system robustness against attacks. Yet, in reality, we find that attacks often affect only a subset of users who are vulnerable. These perturbations of indiscriminate magnitude make it difficult to balance effective protection for vulnerable users without degrading recommendation quality for those who are not affected. To address this issue, our research delves into understanding user vulnerability. Considering that poisoning attacks pollute the training data, we note that the higher degree to which a recommender system fits users' training data correlates with an increased likelihood of users incorporating attack information, indicating their vulnerability. Leveraging these insights, we introduce the Vulnerability-aware Adversarial Training (VAT), designed to defend against poisoning attacks in recommender systems. VAT employs a novel vulnerability-aware function to estimate users' vulnerability based on the degree to which the system fits them. Guided by this estimation, VAT applies perturbations of adaptive magnitude to each user, not only reducing the success ratio of attacks but also preserving, and potentially enhancing, the quality of recommendations. Comprehensive experiments confirm VAT's superior defensive capabilities across different recommendation models and against various types of attacks.
- Published
- 2024
- Full Text
- View/download PDF
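The core of VAT, as the abstract describes it, is mapping each user's fit degree to an adaptive perturbation magnitude. A minimal pure-Python sketch of that idea (the function name and the loss-to-vulnerability mapping are illustrative assumptions, not the paper's actual formula):

```python
def adaptive_perturbation_magnitudes(per_user_loss, base_eps=0.1):
    """Map each user's training loss to an adversarial perturbation size.

    Assumption for illustration: a lower loss means the recommender fits
    the user more tightly, which the paper links to higher vulnerability,
    so that user receives a larger perturbation (up to base_eps).
    """
    lo, hi = min(per_user_loss), max(per_user_loss)
    span = (hi - lo) or 1e-12          # avoid division by zero
    return [base_eps * (hi - loss) / span for loss in per_user_loss]

# The best-fit user (loss 0.2) gets the full magnitude; the worst-fit gets none.
eps = adaptive_perturbation_magnitudes([0.2, 1.0, 0.6])
```

The point of the adaptive scaling is exactly what the abstract claims: well-fit (vulnerable) users get strong protection while poorly-fit users are left nearly unperturbed, preserving their recommendation quality.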
4. Creation of independently controllable and long lifetime polar skyrmion textures in ferroelectric-metallic heterostructures
- Author
- Sun, Fei, Ren, Jianhua, Li, Hongfang, Wu, Yiwei, Liang, Jianwei, Yang, Hui, Zhang, Yi, Liu, Jianyi, Liu, Linjie, Wu, Mengjun, Zhang, Xiaoyue, Zhu, Wenpeng, Chen, Weijin, and Zheng, Yue
- Subjects
Condensed Matter - Materials Science
- Abstract
Topological textures like vortices, labyrinths and skyrmions formed in ferroic materials have attracted extensive interest during the past decade for their fundamental physics, intriguing topology, and technological prospects. So far, polar skyrmions remain scarce in ferroelectrics as they require a delicate balance between various dipolar interactions. Here, we report that PbTiO3 thin films in metallic contact undergo a topological phase transition and stabilize a broad family of skyrmion-like textures (e.g., skyrmion bubbles, multiple π-twist target skyrmions, and skyrmion bags) with independent controllability, analogous to those reported in magnetic systems. Weakly interacting skyrmion arrays with a density over 300 Gb/inch² are successfully written, erased and read out by local electrical and mechanical stimuli of a scanning probe. Interestingly, in contrast to the relatively short lifetime (<20 hours) of the skyrmion bubbles, the multiple π-twist target skyrmions and skyrmion bags show topology-enhanced stability, with lifetimes of over two weeks. Experimental and theoretical analysis implies that the heterostructures carry an electric Dzyaloshinskii-Moriya interaction mediated by oxygen octahedral tilting. Our results demonstrate ferroelectric-metallic heterostructures as a fertile playground for topological states and emergent phenomena.
- Published
- 2024
5. Accelerating the Surrogate Retraining for Poisoning Attacks against Recommender Systems
- Author
- Wu, Yunfan, Cao, Qi, Tao, Shuchang, Zhang, Kaike, Sun, Fei, and Shen, Huawei
- Subjects
Computer Science - Information Retrieval
- Abstract
Recent studies have demonstrated the vulnerability of recommender systems to data poisoning attacks, where adversaries inject carefully crafted fake user interactions into the training data of recommenders to promote target items. Current attack methods involve iteratively retraining a surrogate recommender on the poisoned data with the latest fake users to optimize the attack. However, this repetitive retraining is highly time-consuming, hindering the efficient assessment and optimization of fake users. To mitigate this computational bottleneck and develop a more effective attack in an affordable time, we analyze the retraining process and find that a change in the representation of one user/item will cause a cascading effect through the user-item interaction graph. Under theoretical guidance, we introduce Gradient Passing (GP), a novel technique that explicitly passes gradients between interacted user-item pairs during backpropagation, thereby approximating the cascading effect and accelerating retraining. With just a single update, GP can achieve effects comparable to multiple original training iterations. Under the same number of retraining epochs, GP enables a closer approximation of the surrogate recommender to the victim. This more accurate approximation provides better guidance for optimizing fake users, ultimately leading to enhanced data poisoning attacks. Extensive experiments on real-world datasets demonstrate the efficiency and effectiveness of our proposed GP.
- Comment
- Accepted by RecSys 2024
- Published
- 2024
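The Gradient Passing idea, sharing gradient signal across interacted user-item pairs in a single update, can be caricatured on scalar "embeddings". A toy sketch under stated assumptions (the real method operates on embedding vectors inside backpropagation; the alpha coefficient and dictionary layout are invented here):

```python
def gradient_passing(grads, edges, alpha=0.5):
    """One gradient-passing step on scalar gradients.

    Each node keeps its own gradient and additionally receives alpha
    times the gradient of every neighbor it interacted with, imitating
    the cascading effect of retraining in a single update.
    """
    passed = dict(grads)
    for u, v in edges:
        passed[u] += alpha * grads.get(v, 0.0)
        passed[v] += alpha * grads.get(u, 0.0)
    return passed

# A gradient on user u1 immediately reaches the item it interacted with.
out = gradient_passing({"u1": 1.0, "i1": 0.0}, [("u1", "i1")])
```

This is why one GP update can stand in for several plain training iterations: the neighbor-to-neighbor propagation that normally takes multiple epochs happens explicitly in one pass.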
6. Two-dimensional superconductivity in a thick exfoliated kagome film
- Author
- Sun, Fei, Salinas, Andrea Capa, Wilson, Stephen D., and Zhang, Haijing
- Subjects
Condensed Matter - Superconductivity, Condensed Matter - Materials Science, Condensed Matter - Strongly Correlated Electrons
- Abstract
We report the observation of two-dimensional superconductivity (2D SC) in the exfoliated kagome metal CsV$_3$Sb$_5$ at thicknesses far beyond the atomic limit. By examining the critical current and upper critical magnetic fields ($H_{c2}$) of 40-60 nm thick films in the superconducting state, we identify pronounced Berezinskii-Kosterlitz-Thouless (BKT) transition behavior, i.e., a drastic decrease of the superfluid stiffness near the transition, and a cusp-like feature of the angular-dependent $H_{c2}$, both of which serve as direct evidence of 2D SC. In addition, the in-plane $H_{c2}$ exceeding the Pauli paramagnetic limit is consistent with the 2D SC nature. The observed 2D SC occurs in thick films with the highest superconducting transition temperature $T_c$ and the lowest charge density wave transition temperature $T_{\rm CDW}$, which suggests that the charge density wave states are interrelated with the superconducting states. Our findings impose constraints on the understanding of the enhancement of SC in kagome superconductors, and illuminate pathways for achieving novel 2D superconducting states in more stable and much thicker systems.
- Comment
- 6 pages and 4 figures for the main text; 10 pages and 9 figures for the Supplementary Materials
- Published
- 2024
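For context on the cusp-like angular dependence cited as 2D evidence: thin-film superconductors are conventionally fitted with the Tinkham formula (a textbook expression quoted here for orientation, not taken from this paper), with $\theta$ the angle between the applied field and the film plane:

```latex
\left|\frac{H_{c2}(\theta)\sin\theta}{H_{c2}^{\perp}}\right|
+\left(\frac{H_{c2}(\theta)\cos\theta}{H_{c2}^{\parallel}}\right)^{2}=1
```

A 3D anisotropic Ginzburg-Landau model instead yields a smooth, bell-shaped $H_{c2}(\theta)$, so a sharp cusp at $\theta = 0$ discriminates 2D from 3D superconductivity.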
7. The Llama 3 Herd of Models
- Author
Dubey, Abhimanyu, Jauhri, Abhinav, Pandey, Abhinav, Kadian, Abhishek, Al-Dahle, Ahmad, Letman, Aiesha, Mathur, Akhil, Schelten, Alan, Yang, Amy, Fan, Angela, Goyal, Anirudh, Hartshorn, Anthony, Yang, Aobo, Mitra, Archi, Sravankumar, Archie, Korenev, Artem, Hinsvark, Arthur, Rao, Arun, Zhang, Aston, Rodriguez, Aurelien, Gregerson, Austen, Spataru, Ava, Roziere, Baptiste, Biron, Bethany, Tang, Binh, Chern, Bobbie, Caucheteux, Charlotte, Nayak, Chaya, Bi, Chloe, Marra, Chris, McConnell, Chris, Keller, Christian, Touret, Christophe, Wu, Chunyang, Wong, Corinne, Ferrer, Cristian Canton, Nikolaidis, Cyrus, Allonsius, Damien, Song, Daniel, Pintz, Danielle, Livshits, Danny, Esiobu, David, Choudhary, Dhruv, Mahajan, Dhruv, Garcia-Olano, Diego, Perino, Diego, Hupkes, Dieuwke, Lakomkin, Egor, AlBadawy, Ehab, Lobanova, Elina, Dinan, Emily, Smith, Eric Michael, Radenovic, Filip, Zhang, Frank, Synnaeve, Gabriel, Lee, Gabrielle, Anderson, Georgia Lewis, Nail, Graeme, Mialon, Gregoire, Pang, Guan, Cucurell, Guillem, Nguyen, Hailey, Korevaar, Hannah, Xu, Hu, Touvron, Hugo, Zarov, Iliyan, Ibarra, Imanol Arrieta, Kloumann, Isabel, Misra, Ishan, Evtimov, Ivan, Copet, Jade, Lee, Jaewon, Geffert, Jan, Vranes, Jana, Park, Jason, Mahadeokar, Jay, Shah, Jeet, van der Linde, Jelmer, Billock, Jennifer, Hong, Jenny, Lee, Jenya, Fu, Jeremy, Chi, Jianfeng, Huang, Jianyu, Liu, Jiawen, Wang, Jie, Yu, Jiecao, Bitton, Joanna, Spisak, Joe, Park, Jongsoo, Rocca, Joseph, Johnstun, Joshua, Saxe, Joshua, Jia, Junteng, Alwala, Kalyan Vasuden, Upasani, Kartikeya, Plawiak, Kate, Li, Ke, Heafield, Kenneth, Stone, Kevin, El-Arini, Khalid, Iyer, Krithika, Malik, Kshitiz, Chiu, Kuenley, Bhalla, Kunal, Rantala-Yeary, Lauren, van der Maaten, Laurens, Chen, Lawrence, Tan, Liang, Jenkins, Liz, Martin, Louis, Madaan, Lovish, Malo, Lubo, Blecher, Lukas, Landzaat, Lukas, de Oliveira, Luke, Muzzi, Madeline, Pasupuleti, Mahesh, Singh, Mannat, Paluri, Manohar, Kardas, Marcin, Oldham, Mathew, Rita, Mathieu, Pavlova, 
Maya, Kambadur, Melanie, Lewis, Mike, Si, Min, Singh, Mitesh Kumar, Hassan, Mona, Goyal, Naman, Torabi, Narjes, Bashlykov, Nikolay, Bogoychev, Nikolay, Chatterji, Niladri, Duchenne, Olivier, Çelebi, Onur, Alrassy, Patrick, Zhang, Pengchuan, Li, Pengwei, Vasic, Petar, Weng, Peter, Bhargava, Prajjwal, Dubal, Pratik, Krishnan, Praveen, Koura, Punit Singh, Xu, Puxin, He, Qing, Dong, Qingxiao, Srinivasan, Ragavan, Ganapathy, Raj, Calderer, Ramon, Cabral, Ricardo Silveira, Stojnic, Robert, Raileanu, Roberta, Girdhar, Rohit, Patel, Rohit, Sauvestre, Romain, Polidoro, Ronnie, Sumbaly, Roshan, Taylor, Ross, Silva, Ruan, Hou, Rui, Wang, Rui, Hosseini, Saghar, Chennabasappa, Sahana, Singh, Sanjay, Bell, Sean, Kim, Seohyun Sonia, Edunov, Sergey, Nie, Shaoliang, Narang, Sharan, Raparthy, Sharath, Shen, Sheng, Wan, Shengye, Bhosale, Shruti, Zhang, Shun, Vandenhende, Simon, Batra, Soumya, Whitman, Spencer, Sootla, Sten, Collot, Stephane, Gururangan, Suchin, Borodinsky, Sydney, Herman, Tamar, Fowler, Tara, Sheasha, Tarek, Georgiou, Thomas, Scialom, Thomas, Speckbacher, Tobias, Mihaylov, Todor, Xiao, Tong, Karn, Ujjwal, Goswami, Vedanuj, Gupta, Vibhor, Ramanathan, Vignesh, Kerkez, Viktor, Gonguet, Vincent, Do, Virginie, Vogeti, Vish, Petrovic, Vladan, Chu, Weiwei, Xiong, Wenhan, Fu, Wenyin, Meers, Whitney, Martinet, Xavier, Wang, Xiaodong, Tan, Xiaoqing Ellen, Xie, Xinfeng, Jia, Xuchao, Wang, Xuewei, Goldschlag, Yaelle, Gaur, Yashesh, Babaei, Yasmine, Wen, Yi, Song, Yiwen, Zhang, Yuchen, Li, Yue, Mao, Yuning, Coudert, Zacharie Delpierre, Yan, Zheng, Chen, Zhengxing, Papakipos, Zoe, Singh, Aaditya, Grattafiori, Aaron, Jain, Abha, Kelsey, Adam, Shajnfeld, Adam, Gangidi, Adithya, Victoria, Adolfo, Goldstand, Ahuva, Menon, Ajay, Sharma, Ajay, Boesenberg, Alex, Vaughan, Alex, Baevski, Alexei, Feinstein, Allie, Kallet, Amanda, Sangani, Amit, Yunus, Anam, Lupu, Andrei, Alvarado, Andres, Caples, Andrew, Gu, Andrew, Ho, Andrew, Poulton, Andrew, Ryan, Andrew, Ramchandani, Ankit, Franco, 
Annie, Saraf, Aparajita, Chowdhury, Arkabandhu, Gabriel, Ashley, Bharambe, Ashwin, Eisenman, Assaf, Yazdan, Azadeh, James, Beau, Maurer, Ben, Leonhardi, Benjamin, Huang, Bernie, Loyd, Beth, De Paola, Beto, Paranjape, Bhargavi, Liu, Bing, Wu, Bo, Ni, Boyu, Hancock, Braden, Wasti, Bram, Spence, Brandon, Stojkovic, Brani, Gamido, Brian, Montalvo, Britt, Parker, Carl, Burton, Carly, Mejia, Catalina, Wang, Changhan, Kim, Changkyu, Zhou, Chao, Hu, Chester, Chu, Ching-Hsiang, Cai, Chris, Tindal, Chris, Feichtenhofer, Christoph, Civin, Damon, Beaty, Dana, Kreymer, Daniel, Li, Daniel, Wyatt, Danny, Adkins, David, Xu, David, Testuggine, Davide, David, Delia, Parikh, Devi, Liskovich, Diana, Foss, Didem, Wang, Dingkang, Le, Duc, Holland, Dustin, Dowling, Edward, Jamil, Eissa, Montgomery, Elaine, Presani, Eleonora, Hahn, Emily, Wood, Emily, Brinkman, Erik, Arcaute, Esteban, Dunbar, Evan, Smothers, Evan, Sun, Fei, Kreuk, Felix, Tian, Feng, Ozgenel, Firat, Caggioni, Francesco, Guzmán, Francisco, Kanayet, Frank, Seide, Frank, Florez, Gabriela Medina, Schwarz, Gabriella, Badeer, Gada, Swee, Georgia, Halpern, Gil, Thattai, Govind, Herman, Grant, Sizov, Grigory, Guangyi, Zhang, Lakshminarayanan, Guna, Shojanazeri, Hamid, Zou, Han, Wang, Hannah, Zha, Hanwen, Habeeb, Haroun, Rudolph, Harrison, Suk, Helen, Aspegren, Henry, Goldman, Hunter, Damlaj, Ibrahim, Molybog, Igor, Tufanov, Igor, Veliche, Irina-Elena, Gat, Itai, Weissman, Jake, Geboski, James, Kohli, James, Asher, Japhet, Gaya, Jean-Baptiste, Marcus, Jeff, Tang, Jeff, Chan, Jennifer, Zhen, Jenny, Reizenstein, Jeremy, Teboul, Jeremy, Zhong, Jessica, Jin, Jian, Yang, Jingyi, Cummings, Joe, Carvill, Jon, Shepard, Jon, McPhie, Jonathan, Torres, Jonathan, Ginsburg, Josh, Wang, Junjie, Wu, Kai, U, Kam Hou, Saxena, Karan, Prasad, Karthik, Khandelwal, Kartikay, Zand, Katayoun, Matosich, Kathy, Veeraraghavan, Kaushik, Michelena, Kelly, Li, Keqian, Huang, Kun, Chawla, Kunal, Lakhotia, Kushal, Huang, Kyle, Chen, Lailin, Garg, Lakshya, A, 
Lavender, Silva, Leandro, Bell, Lee, Zhang, Lei, Guo, Liangpeng, Yu, Licheng, Moshkovich, Liron, Wehrstedt, Luca, Khabsa, Madian, Avalani, Manav, Bhatt, Manish, Tsimpoukelli, Maria, Mankus, Martynas, Hasson, Matan, Lennie, Matthew, Reso, Matthias, Groshev, Maxim, Naumov, Maxim, Lathi, Maya, Keneally, Meghan, Seltzer, Michael L., Valko, Michal, Restrepo, Michelle, Patel, Mihir, Vyatskov, Mik, Samvelyan, Mikayel, Clark, Mike, Macey, Mike, Wang, Mike, Hermoso, Miquel Jubert, Metanat, Mo, Rastegari, Mohammad, Bansal, Munish, Santhanam, Nandhini, Parks, Natascha, White, Natasha, Bawa, Navyata, Singhal, Nayan, Egebo, Nick, Usunier, Nicolas, Laptev, Nikolay Pavlovich, Dong, Ning, Zhang, Ning, Cheng, Norman, Chernoguz, Oleg, Hart, Olivia, Salpekar, Omkar, Kalinli, Ozlem, Kent, Parkin, Parekh, Parth, Saab, Paul, Balaji, Pavan, Rittner, Pedro, Bontrager, Philip, Roux, Pierre, Dollar, Piotr, Zvyagina, Polina, Ratanchandani, Prashant, Yuvraj, Pritish, Liang, Qian, Alao, Rachad, Rodriguez, Rachel, Ayub, Rafi, Murthy, Raghotham, Nayani, Raghu, Mitra, Rahul, Li, Raymond, Hogan, Rebekkah, Battey, Robin, Wang, Rocky, Maheswari, Rohan, Howes, Russ, Rinott, Ruty, Bondu, Sai Jayesh, Datta, Samyak, Chugh, Sara, Hunt, Sara, Dhillon, Sargun, Sidorov, Sasha, Pan, Satadru, Verma, Saurabh, Yamamoto, Seiji, Ramaswamy, Sharadh, Lindsay, Shaun, Feng, Sheng, Lin, Shenghao, Zha, Shengxin Cindy, Shankar, Shiva, Zhang, Shuqiang, Wang, Sinong, Agarwal, Sneha, Sajuyigbe, Soji, Chintala, Soumith, Max, Stephanie, Chen, Stephen, Kehoe, Steve, Satterfield, Steve, Govindaprasad, Sudarshan, Gupta, Sumit, Cho, Sungmin, Virk, Sunny, Subramanian, Suraj, Choudhury, Sy, Goldman, Sydney, Remez, Tal, Glaser, Tamar, Best, Tamara, Kohler, Thilo, Robinson, Thomas, Li, Tianhe, Zhang, Tianjun, Matthews, Tim, Chou, Timothy, Shaked, Tzook, Vontimitta, Varun, Ajayi, Victoria, Montanez, Victoria, Mohan, Vijai, Kumar, Vinay Satish, Mangla, Vishal, Albiero, Vítor, Ionescu, Vlad, Poenaru, Vlad, Mihailescu, Vlad Tiberiu, 
Ivanov, Vladimir, Li, Wei, Wang, Wenchen, Jiang, Wenwen, Bouaziz, Wes, Constable, Will, Tang, Xiaocheng, Wang, Xiaofang, Wu, Xiaojian, Wang, Xiaolan, Xia, Xide, Wu, Xilun, Gao, Xinbo, Chen, Yanjun, Hu, Ye, Jia, Ye, Qi, Ye, Li, Yenda, Zhang, Yilin, Zhang, Ying, Adi, Yossi, Nam, Youngjin, Yu, Wang, Hao, Yuchen, Qian, Yundi, He, Yuzi, Rait, Zach, DeVito, Zachary, Rosnbrick, Zef, Wen, Zhaoduo, Yang, Zhenyu, and Zhao, Zhiwei
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
- Published
- 2024
8. Broadband planar electromagnetic hyper-lens with uniform magnification in air
- Author
- Sun, Ran, Sun, Fei, Chen, Hanchuan, Liu, Yichao, and Wang, Qi
- Subjects
Physics - Applied Physics
- Abstract
A planar hyper-lens, capable of sub-wavelength imaging of broadband electromagnetic waves, is designed based on an electromagnetic null medium. Subsequently, a scheme for implementing the proposed hyper-lens is given using well-designed flexural metal plates, which function as the reduced electromagnetic null medium for TM-polarized microwaves. Both simulated and measured results verify that the hyper-lens designed with flexural metal plates can achieve super-resolution imaging for microwaves at the operating wavelength (λ0 = 3 cm) with a resolution of 0.25λ0 and a uniform magnification of about 5. Moreover, the designed hyper-lens ensures that both the object and image surfaces are planes, and simultaneously provides a uniform magnification for objects in different positions. Additionally, the proposed hyper-lens offers broadband super-resolution imaging capability, achieving good super-resolution imaging effects for microwave frequencies ranging from 8.5 to 11 GHz. The proposed hyper-lens may find applications in high-precision imaging, detection, and sensing.
- Published
- 2024
9. The Fall of ROME: Understanding the Collapse of LLMs in Model Editing
- Author
- Yang, Wanli, Sun, Fei, Tan, Jiajun, Ma, Xinyu, Su, Du, Yin, Dawei, and Shen, Huawei
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Despite significant progress in model editing methods, their application in real-world scenarios remains challenging as they often cause large language models (LLMs) to collapse. Among them, ROME is particularly concerning, as it can disrupt LLMs with only a single edit. In this paper, we study the root causes of such collapse. Through extensive analysis, we identify two primary factors that contribute to the collapse: i) inconsistent handling of prefixed and unprefixed keys in the parameter update equation may result in very small denominators, causing excessively large parameter updates; ii) the subject of collapse cases is usually the first token, whose unprefixed key distribution significantly differs from the prefixed key distribution in autoregressive transformers, causing the aforementioned issue to materialize. To validate our analysis, we propose a simple yet effective approach: uniformly using prefixed keys during the editing phase and adding prefixes during the testing phase. The experimental results show that the proposed solution can prevent model collapse while maintaining the effectiveness of the edits.
- Published
- 2024
10. Impurity-level induced broadband photoelectric response in wide-band semiconductor SrSnO3
- Author
- Zhang, Yuyang, Wang, Lisheng, Wu, Weijie, Wang, Zhaoyang, Sun, Fei, Jiang, He, Zhang, Bangmin, and Zheng, Yue
- Subjects
Physics - Applied Physics, Condensed Matter - Materials Science
- Abstract
Broadband spectrum detectors show great promise in fields such as multispectral imaging and optical communications. Despite significant progress, challenges like material instability, complex manufacturing processes and high costs still hinder further application. Here we present a method that achieves broadband spectral detection via impurity levels in SrSnO3. We report over 200 mA/W photo-responsivity at 275 nm (ultraviolet C, solar-blind) and 367 nm (ultraviolet A) and ~1 mA/W photo-responsivity at 532 nm and 700 nm (visible) at a voltage bias of 5 V. Further transport and photoluminescence results indicate that the broadband response comes from the impurity levels and their mutual interactions. Additionally, the photodetector demonstrates excellent robustness and stability under repeated tests and prolonged exposure to air. These findings show the potential of SSO photodetectors and propose a method to achieve broadband spectrum detection, creating new possibilities for the development of single-phase, low-cost, simple-structure and high-efficiency photodetectors.
- Comment
- 5 Figures
- Published
- 2024
11. Is Flash Attention Stable?
- Author
- Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, and Wu, Carole-Jean
- Subjects
Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantifying it is especially challenging given the costly nature of training runs. In this work, we develop a principled approach to understanding the effects of numeric deviation, and construct proxies to put observations into context when downstream effects are difficult to quantify. As a case study, we apply this framework to analyze the widely-adopted Flash Attention optimization. We find that Flash Attention sees roughly an order of magnitude more numeric deviation than Baseline Attention at BF16 when measured during an isolated forward pass. We then use a data-driven analysis based on the Wasserstein distance to provide upper bounds on how this numeric deviation impacts model weights during training, finding that the numerical deviation present in Flash Attention is 2-5 times less significant than that introduced by low-precision training.
- Published
- 2024
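Two of the quantities this abstract leans on, the elementwise numeric deviation between two attention implementations and the 1D Wasserstein distance between weight distributions, are easy to state precisely. A minimal sketch (function names are ours; the paper's exact measurement pipeline is not reproduced here):

```python
def max_abs_deviation(xs, ys):
    """Largest elementwise gap between two runs of the same computation,
    e.g. Flash Attention vs. baseline attention outputs at BF16."""
    return max(abs(x - y) for x, y in zip(xs, ys))

def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size empirical distributions,
    usable as a proxy for how far model weights have drifted."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

dev = max_abs_deviation([0.10, 0.20], [0.11, 0.18])
```

Note that the two metrics answer different questions: the first bounds per-call numerical error, while the second compares whole weight distributions after training, which is how the paper contextualizes whether the deviation matters downstream.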
12. When to Trust LLMs: Aligning Confidence with Response Quality
- Author
- Tao, Shuchang, Yao, Liuyi, Ding, Hanxing, Xie, Yuexiang, Cao, Qi, Sun, Fei, Gao, Jinyang, Shen, Huawei, and Ding, Bolin
- Subjects
Computer Science - Computation and Language
- Abstract
Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods often express reliability via a confidence level; however, their effectiveness is limited by the lack of objective guidance. To address this, we propose a CONfidence-Quality-ORDer-preserving alignment approach (CONQORD), which leverages reinforcement learning guided by a tailored dual-component reward function. This function integrates a quality reward and an order-preserving alignment reward. Specifically, the order-preserving reward incentivizes the model to verbalize greater confidence for responses of higher quality, aligning the order of confidence and quality. Experiments demonstrate that CONQORD significantly improves the alignment between confidence and response accuracy without causing over-cautiousness. Furthermore, the aligned confidence provided by CONQORD informs when to trust LLMs and acts as a determinant for initiating the retrieval of external knowledge. Aligning confidence with response quality ensures more transparent and reliable responses, providing better trustworthiness.
- Comment
- Accepted by ACL 2024
- Published
- 2024
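The order-preserving component can be illustrated as a pairwise agreement score between stated confidence and measured quality. A hedged sketch (the paper's actual reward is a shaped RL signal, not this exact fraction):

```python
def order_preserving_reward(confidences, qualities):
    """Fraction of response pairs whose confidence order matches
    their quality order (1.0 = perfectly order-preserving)."""
    pairs = [(i, j) for i in range(len(qualities))
                    for j in range(i + 1, len(qualities))
                    if qualities[i] != qualities[j]]
    if not pairs:
        return 1.0
    agree = sum((confidences[i] - confidences[j])
                * (qualities[i] - qualities[j]) > 0
                for i, j in pairs)
    return agree / len(pairs)
```

A model that voices high confidence on its best answers scores 1.0; one whose confidence ranking inverts quality scores 0.0, which is the behavior the order-preserving reward penalizes.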
13. Experimental demonstration of a thermal-EM concentrator for enhancing EM signals and converging heat fluxes simultaneously
- Author
- Chen, Hanchuan, Liu, Yichao, Sun, Fei, Sun, Qianhan, Wu, Xiaoxiao, and Sun, Ran
- Subjects
Physics - General Physics
- Abstract
Simultaneously concentrating EM waves and heat fluxes onto the same target region within an on-chip system carries substantial academic research importance and practical application value. Nevertheless, existing research is primarily aimed at the design and experimental demonstration of concentrators for EM waves or temperature fields individually. In this work, a thermal-EM concentrator, capable of simultaneously concentrating EM waves and heat fluxes, is designed using transformation optics/thermodynamics and fabricated with engineered EM-thermal metamaterials. The concentrating effects of the proposed thermal-EM concentrator on heat fluxes and EM waves are verified through numerical simulations and experimental measurements, respectively, which are in good agreement with each other. Both numerically simulated and experimentally measured results demonstrate the concentrating capability of the proposed thermal-EM concentrator, which can concentrate broadband TM-polarized EM waves ranging from 8 to 12 GHz and heat/cold flows to the same target region within an on-chip operating environment. The thermal-EM concentrator exhibits a thermal focusing efficiency close to 100% and a more than threefold enhancement of the magnetic field at the designed center frequency of 10 GHz. The proposed thermal-EM concentrator can be utilized for efficient cooling of a specified component while simultaneously enhancing an EM antenna's radiation/reception efficiency within an on-chip system.
- Comment
- 15 pages, 5 figures
- Published
- 2024
- Full Text
- View/download PDF
14. Chiral phase transition and spin alignment of vector meson in the Polarized-Polyakov-loop Nambu-Jona-Lasinio model under rotation
- Author
- Sun, Fei, Shao, Jingdong, Wen, Rui, Xu, Kun, and Huang, Mei
- Subjects
High Energy Physics - Phenomenology
- Abstract
By using an extrapolation method, a polarized Polyakov-loop potential at finite real angular velocity is constructed from lattice results at finite imaginary angular velocity. The chiral and deconfinement phase transitions under rotation are investigated simultaneously in the Polarized-Polyakov-loop Nambu-Jona-Lasinio (PPNJL) model. It is observed that the critical temperatures of both the deconfinement and chiral phase transitions increase with the angular velocity, consistent with lattice results. The spin alignment of the vector meson shows a negative deviation of $\rho_{00} - 1/3$ under rotation, and the deviation in the PPNJL model is much more significant than in the NJL model and the quark coalescence model, revealing the important role of rotating gluons in quark polarization.
- Comment
- 10 pages, 6 figures
- Published
- 2024
15. Unlink to Unlearn: Simplifying Edge Unlearning in GNNs
- Author
- Tan, Jiajun, Sun, Fei, Qiu, Ruichen, Su, Du, and Shen, Huawei
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
- Abstract
As concerns over data privacy intensify, unlearning in Graph Neural Networks (GNNs) has emerged as a prominent research frontier in academia. This concept is pivotal in enforcing the "right to be forgotten", which entails the selective removal of specific data from trained GNNs upon user request. Our research focuses on edge unlearning, a process of particular relevance to real-world applications. Current state-of-the-art approaches like GNNDelete can eliminate the influence of specific edges yet suffer from "over-forgetting", which means the unlearning process inadvertently removes excessive information beyond what is needed, leading to a significant performance decline for the remaining edges. Our analysis identifies the loss functions of GNNDelete as the primary source of over-forgetting and also suggests that loss functions may be redundant for effective edge unlearning. Building on these insights, we simplify GNNDelete to develop Unlink to Unlearn (UtU), a novel method that facilitates unlearning exclusively through unlinking the forget edges from the graph structure. Our extensive experiments demonstrate that UtU delivers privacy protection on par with that of a retrained model while preserving high accuracy in downstream tasks, upholding over 97.3% of the retrained model's privacy protection capabilities and 99.8% of its link prediction accuracy. Meanwhile, UtU requires only constant computational demands, underscoring its advantage as a highly lightweight and practical edge unlearning solution.
- Comment
- Accepted by WWW 2024 as a Short Research Paper
- Published
- 2024
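Since UtU reduces edge unlearning to pure unlinking, its core operation is just edge deletion, with no auxiliary unlearning loss. A minimal undirected-graph sketch (the edge-list data layout is our own simplification):

```python
def unlink_to_unlearn(edges, forget_edges):
    """Drop every forget edge (in either direction) from the edge list.

    No loss terms or retraining are involved; the constant-time removal
    is what makes the approach lightweight."""
    forget = {frozenset(e) for e in forget_edges}
    return [e for e in edges if frozenset(e) not in forget]

# (3, 2) matches the stored edge (2, 3) regardless of direction.
remaining = unlink_to_unlearn([(1, 2), (2, 3), (3, 4)], [(3, 2)])
```

Using `frozenset` per edge makes the match orientation-agnostic, which matters because edge lists for undirected graphs often store each pair in an arbitrary order.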
16. The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse
- Author
- Yang, Wanli, Sun, Fei, Ma, Xinyu, Liu, Xun, Yin, Dawei, and Cheng, Xueqi
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Although model editing has shown promise in revising knowledge in Large Language Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this work, we reveal a critical phenomenon: even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. However, benchmarking LLMs after each edit, while necessary to prevent such collapses, is impractically time-consuming and resource-intensive. To mitigate this, we propose using perplexity as a surrogate metric, validated by extensive experiments demonstrating that changes in an edited model's perplexity are strongly correlated with its downstream task performance. We further conduct an in-depth study on sequential editing, a practical setting for real-world scenarios, across various editing methods and LLMs, focusing on hard cases from our previous single-edit studies. The results indicate that nearly all examined editing methods result in model collapse after only a few edits. To facilitate further research, we have utilized GPT-3.5 to develop a new dataset, HardEdit, based on those hard cases. This dataset aims to establish the foundation for pioneering research in reliable model editing and the mechanisms underlying editing-induced model collapse. We hope this work can draw the community's attention to the potential risks inherent in model editing practices.
- Comment
- Accepted at Findings of ACL 2024
- Published
- 2024
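Perplexity as a collapse surrogate is cheap to monitor. A minimal sketch of the metric itself (how the paper samples text to compute it is not shown here):

```python
import math

def perplexity(token_logprobs):
    """exp(-mean log p) over a token sequence; a sharp rise after an
    edit is the proposed warning sign of model collapse."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model assigning probability 1/2 to every token has perplexity 2.
ppl = perplexity([math.log(0.5)] * 4)
```

Tracking this one scalar after each edit replaces a full benchmark sweep, which is the paper's efficiency argument.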
17. Tunable uniform field enhancement in a subwavelength air pillar by photonic doping in epsilon-near-zero medium
- Author
- Sun, Fei, Shan, Jinyuan, and Liu, Yichao
- Subjects
Physics - Applied Physics, Physics - Optics
- Abstract
In this study, a novel electric field compressor is proposed by doping a metal-air-metal pillar in an epsilon-near-zero medium within a metallic waveguide, which effectively enhances the background electric fields in a sub-wavelength air pillar with high uniformity. The field enhancement factor can be determined analytically through theoretical derivations from Maxwell's equations, reaching its maximum value by appropriately selecting the size of the air pillar. Furthermore, the proposed compressor can achieve a tunable electric field enhancement effect within the deep sub-wavelength air pillar by adjusting the height of the air pillar through movement of two metal pillars inserted into the waveguide. Both theoretical analysis and numerical simulations are employed to validate the performance of the electric field compressor, which exhibits a wide range of tunable field enhancement (i.e., a continuously adjustable enhancement factor between 20 and 800) with uniformity below 1.5 within the deep sub-wavelength air pillar (wherein the air volume is smaller than λ0/10 × λ0/10 × λ0/16). Finally, a practical model is proposed for realizing the electric field compressor in the microwave range, where specially placed metal pillars and wires are incorporated into a rectangular metallic waveguide structure. The field enhancement and tunability of this practical model have been verified through simulations.
- Published
- 2024
- Full Text
- View/download PDF
18. Evaluation of the Liver Disease Information in Baidu Encyclopedia and Wikipedia: Longitudinal Study
- Author
-
Sun, Fei, Yang, Fuchun, and Zheng, Shusen
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Public aspects of medicine ,RA1-1270 - Abstract
Background: The internet has changed the way people acquire health information. Previous studies have shown that Wikipedia is a reasonably reliable medical resource, and it has been ranked higher than other general websites in various search engines. Baidu Encyclopedia is one of the most popular encyclopedia websites in China. However, no studies have assessed the quality of the content provided in Baidu Encyclopedia. Objective: This study aimed to evaluate the quality of liver disease information provided by Wikipedia (in English) and Baidu Encyclopedia (in Chinese) and to compare the quality and timeliness of the articles published in these two encyclopedias. Moreover, a 3-year follow-up study was conducted to determine whether the information on both websites was updated regularly over this period. Methods: We searched for information on liver diseases by using the International Statistical Classification of Diseases and Related Health Problems 10th Revision Version 2016 codes on Wikipedia (in English) and Baidu Encyclopedia (in Chinese). The quality of the articles was assessed using the DISCERN instrument, which consists of 3 sections. We recorded the latest editing date of the webpages and calculated the date interval to evaluate the update timeliness of these websites. Results: We found 22 entries on liver diseases in Baidu Encyclopedia and 15 articles in Wikipedia between September 15, 2016, and September 30, 2016, and we found 25 entries in Baidu Encyclopedia and 16 articles in Wikipedia between September 15, 2019, and September 30, 2019. In section 1 of the DISCERN instrument, the mean (SE) scores of Baidu Encyclopedia entries were significantly lower than those of Wikipedia articles. In sections 2 and 3 of the DISCERN instrument, the scores of Baidu Encyclopedia entries were lower than those of Wikipedia articles, but the differences were not statistically significant. The total DISCERN scores of Baidu Encyclopedia entries were significantly lower than those of Wikipedia articles. The update interval of the entries in Baidu Encyclopedia was significantly longer than that of the articles in Wikipedia. Conclusions: This study shows that the quality of articles and the reliability of the research content on liver diseases in Wikipedia are better than those of the entries in Baidu Encyclopedia. However, the quality of the treatment choices presented in both Wikipedia and Baidu Encyclopedia is not satisfactory. Wikipedia is updated more frequently than Baidu Encyclopedia, thereby ensuring that the information presented reflects the most recent research findings. The findings of our study suggest that, to find accurate health information, it is important to seek the help of medical professionals rather than looking for a prescription amid the confusing information available on the internet.
- Published
- 2021
- Full Text
- View/download PDF
19. The Effect of Mindfulness on Marital Stability via Perceived Stress: Findings from a Dyadic Study of Older Couples in China
- Author
-
Zhang, Rong, Wu, Jie, Duan, Yemo, and Sun, Fei
- Published
- 2024
- Full Text
- View/download PDF
20. The role of tumor-associated macrophages in the radioresistance of esophageal cancer cells via regulation of the VEGF-mediated angiogenic pathway
- Author
-
Sun, Fei, Lian, Yingying, Zhou, Mengyun, Luo, Judong, Hu, Lijun, Wang, Jianlin, Sun, Zhiqiang, and Yu, Jingping
- Published
- 2024
- Full Text
- View/download PDF
21. Study on Acoustic Emission and Crack Propagation Characteristics of Single-Fissured Sandstone with Different Angles Under Uniaxial Compression
- Author
-
Guo, Jia-Qi, Zhu, Zi-Hui, Chen, Jian-Xun, Sun, Fei-Yue, and Wang, Zheng
- Published
- 2024
- Full Text
- View/download PDF
22. LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks
- Author
-
Zhang, Kaike, Cao, Qi, Wu, Yunfan, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval - Abstract
Sequential recommender systems stand out for their ability to capture users' dynamic interests and the patterns of item-to-item transitions. However, the inherent openness of sequential recommender systems renders them vulnerable to poisoning attacks, where fraudulent users are injected into the training data to manipulate learned patterns. Traditional defense strategies predominantly depend on predefined assumptions or rules extracted from specific known attacks, limiting their generalizability to unknown attack types. To solve the above problems, considering the rich open-world knowledge encapsulated in Large Language Models (LLMs), our research initially focuses on the capabilities of LLMs in the detection of unknown fraudulent activities within recommender systems, a strategy we denote as LLM4Dec. Empirical evaluations demonstrate the substantial capability of LLMs in identifying unknown fraudsters, leveraging their expansive, open-world knowledge. Building upon this, we propose the integration of LLMs into defense strategies to extend their effectiveness beyond the confines of known attacks. We propose LoRec, an advanced framework that employs LLM-Enhanced Calibration to strengthen the robustness of sequential recommender systems against poisoning attacks. LoRec integrates an LLM-enhanced CalibraTor (LCT) that refines the training process of sequential recommender systems with knowledge derived from LLMs, applying a user-wise reweighting to diminish the impact of fraudsters injected by attacks. By incorporating LLMs' open-world knowledge, the LCT effectively converts the limited, specific priors or rules into a more general pattern of fraudsters, offering improved defenses against poisoning attacks. Our comprehensive experiments validate that LoRec, as a general framework, significantly strengthens the robustness of sequential recommender systems.
- Published
- 2024
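The user-wise reweighting that LoRec's LLM-enhanced CalibraTor applies can be sketched in a few lines. This is an illustration only, assuming an upstream LLM-based detector has already produced a suspicion score in [0, 1] per user; the scores and losses below are made up:

```python
# Minimal sketch of user-wise loss reweighting: users the detector flags as
# likely fraudsters contribute less to the recommender's training objective.
def reweighted_loss(per_user_losses, suspicion):
    weights = [1.0 - s for s in suspicion]   # higher suspicion -> lower weight
    total_w = sum(weights)
    return sum(w * l for w, l in zip(weights, per_user_losses)) / total_w

losses = [0.4, 0.5, 3.0]   # third user looks like an injected fraudster
scores = [0.1, 0.0, 0.9]   # hypothetical LLM-derived suspicion scores
print(round(reweighted_loss(losses, scores), 3))
```

The paper's actual calibration is learned jointly with the recommender; the sketch only shows the mechanical effect of down-weighting suspicious profiles.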
23. Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?
- Author
-
Tan, Hexiang, Sun, Fei, Yang, Wanli, Wang, Yuanzhuo, Cao, Qi, and Cheng, Xueqi
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
While auxiliary information has become a key to enhancing Large Language Models (LLMs), relatively little is known about how LLMs merge these contexts, specifically contexts generated by LLMs and those retrieved from external sources. To investigate this, we formulate a systematic framework to identify whether LLMs' responses are attributed to either generated or retrieved contexts. To easily trace the origin of the response, we construct datasets with conflicting contexts, i.e., each question is paired with both generated and retrieved contexts, yet only one of them contains the correct answer. Our experiments reveal a significant bias in several LLMs (GPT-4/3.5 and Llama2) to favor generated contexts, even when they provide incorrect information. We further identify two key factors contributing to this bias: i) contexts generated by LLMs typically show greater similarity to the questions, increasing their likelihood of being selected; ii) the segmentation process used in retrieved contexts disrupts their completeness, thereby hindering their full utilization in LLMs. Our analysis enhances the understanding of how LLMs merge diverse contexts, offers valuable insights for advancing current LLM augmentation methods, and highlights the risk of generated misinformation for retrieval-augmented LLMs., Comment: Accepted at ACL 2024 Main, Homepage (https://tan-hexiang.github.io/Blinded_by_Generated_Contexts/)
- Published
- 2024
24. Photoproduction of lepton pair in ultra-relativistic heavy-ion collisions
- Author
-
Yu, Kewei, Peng, Jiazhen, Li, Shuang, Wu, Kejun, Xie, Wei, and Sun, Fei
- Subjects
Nuclear Theory ,High Energy Physics - Phenomenology - Abstract
Dilepton production provides a unique probe of the strong electromagnetic field produced in heavy-ion collisions. To map out the behavior of its transverse momentum broadening, we present a theoretical model based on the equivalent photon approximation, and we then update it to make direct comparisons with recent experimental measurements. We find that the model calculations can describe well not only the average transverse momentum squared of $e^{+}e^{-}$ pairs in Au--Au collisions at $\sqrt{s_{\rm NN}}=200$ GeV, but also the acoplanarity of $\mu^{+}\mu^{-}$ pairs in Pb--Pb collisions at $\sqrt{s_{\rm NN}}=5.02$ TeV. Furthermore, the model predictions are also able to reproduce the measured dependencies on the pair mass and the transverse momentum squared., Comment: 10 pages, 9 figures
- Published
- 2024
- Full Text
- View/download PDF
25. Unraveling collisional energy loss of a heavy quark in quark-gluon plasma
- Author
-
Peng, Jiazhen, Yu, Kewei, Li, Shuang, Xiong, Wei, Sun, Fei, and Xie, Wei
- Subjects
High Energy Physics - Phenomenology ,Nuclear Theory - Abstract
At leading order in the QCD coupling constant, we compute the energy loss per traveling distance, $dE/dz$, of a heavy quark from elastic scattering off thermal quarks and gluons at a temperature $T$, including the thermal perturbative description of soft scatterings ($-t<-t^{\ast}$) and a perturbative QCD-based calculation for hard collisions ($-t>-t^{\ast}$). Within this soft-hard factorization model, we find that the full result for $dE/dz$ shows only a mild sensitivity to the intermediate cutoff $t^{\ast}$, supporting the validity of the soft-hard approach within the temperature region of interest. We re-derive the analytic formula for $dE/dz$ in the high-energy approximation, $E_{1}\gg m^{2}_{1}/T$, where $E_{1}$ is the injected heavy quark energy and $m_{1}$ is its mass. It is realized that the soft logarithmic contribution, $dE/dz\propto \ln(-t^{\ast}/m^{2}_{D})$, arises from the $t$-channel scattering off thermal partons, while the hard logarithmic term, $dE/dz\propto \ln[E_{1}T/(-t^{\ast})]$, stems from the $t$-channel scattering off thermal partons, and the term $dE/dz\propto \ln(E_{1}T/m^{2}_{1})$ comes from the $s$- and $u$-channel scattering off gluons. The sum of these contributions cancels the $t^{\ast}$-dependence, as observed in the full result. The mass hierarchy $dE/dz(charm)>dE/dz(bottom)$ is observed. Our full results are crucial for a better description of heavy quark transport in the QCD medium, in particular at low and moderate energy. We also calculate the energy loss by imposing the Einstein relation; the resulting values appear to be systematically larger than those obtained without imposing it., Comment: 16 pages, 8 figures
- Published
- 2024
- Full Text
- View/download PDF
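The cutoff cancellation described in the abstract above can be made explicit in one line: adding the soft and hard logarithms gives

$\ln\frac{-t^{\ast}}{m^{2}_{D}} + \ln\frac{E_{1}T}{-t^{\ast}} = \ln\frac{E_{1}T}{m^{2}_{D}}$,

so the intermediate cutoff $t^{\ast}$ drops out of the sum, consistent with the mild $t^{\ast}$-sensitivity of the full result.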
26. Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
- Author
-
Golden, Alicia, Hsia, Samuel, Sun, Fei, Acun, Bilge, Hosmer, Basil, Lee, Yejin, DeVito, Zachary, Johnson, Jeff, Wei, Gu-Yeon, Brooks, David, and Wu, Carole-Jean
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Machine Learning ,Computer Science - Multimedia - Abstract
As the development of large-scale Generative AI models evolve beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation models. Current model architecture designs are bifurcated into 2 categories: Diffusion- and Transformer-based models. Our systematic performance characterization on a suite of eight representative TTI/TTV models shows that after state-of-the-art optimization techniques such as Flash Attention are applied, Convolution accounts for up to 44% of execution time for Diffusion-based TTI models, while Linear layers consume up to 49% of execution time for Transformer-based models. We additionally observe that Diffusion-based TTI models resemble the Prefill stage of LLM inference, and benefit from 1.1-2.5x greater speedup from Flash Attention than Transformer-based TTI models that resemble the Decode phase. Since optimizations designed for LLMs do not map directly onto TTI/TTV models, we must conduct a thorough characterization of these workloads to gain insights for new optimization opportunities. In doing so, we define sequence length in the context of TTI/TTV models and observe sequence length can vary up to 4x in Diffusion model inference. We additionally observe temporal aspects of TTV workloads pose unique system bottlenecks, with Temporal Attention accounting for over 60% of total Attention time. Overall, our in-depth system performance characterization is a critical first step towards designing efficient and deployable systems for emerging TTI/TTV workloads., Comment: Published at 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
- Published
- 2023
27. On-chip omnidirectional electromagnetic-thermal cloak
- Author
-
Liu, Yichao, Chen, Hanchuan, Zhao, Gang, and Sun, Fei
- Subjects
Physics - Applied Physics - Abstract
Simultaneously guiding electromagnetic waves and heat flow at any incidence angle to smoothly bypass electromagnetic/thermal sensitive elements is a key factor in ensuring efficient communication and thermal protection for an on-chip system. In this study, an omnidirectional on-chip electromagnetic-thermal cloak is proposed. Firstly, a holey metallic plate with a periodic array of subwavelength apertures is designed by optical surface transformation to realize an omnidirectional electromagnetic cloaking module for the on-chip electromagnetic signal. Secondly, a two-layer ring-shaped engineered thermal structure is designed by solving the Laplace equation to realize an omnidirectional thermal cloaking module for on-chip heat flow. Finally, these two cloaking modules are combined elaborately to achieve a cloaking effect for both the electromagnetic waves and thermal fields simultaneously from any detecting direction, thus protecting the built-in electromagnetic/thermal sensitive elements without disturbing the external electromagnetic/thermal signal. The proposed electromagnetic-thermal cloak may have potential advantages in dealing with omnidirectional electromagnetic compatibility/shielding and multi-directional thermal management/dissipation of an on-chip system.
- Published
- 2023
- Full Text
- View/download PDF
28. EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
- Author
-
Xiong, Yunyang, Varadarajan, Bala, Wu, Lemeng, Xiang, Xiaoyu, Xiao, Fanyi, Zhu, Chenchen, Dai, Xiaoliang, Wang, Dilin, Sun, Fei, Iandola, Forrest, Krishnamoorthi, Raghuraman, and Chandra, Vikas
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision applications. A key component that drives its impressive performance in zero-shot transfer and high versatility is a super large Transformer model trained on the extensive high-quality SA-1B dataset. While beneficial, the huge computation cost of the SAM model has limited its use in wider real-world applications. To address this limitation, we propose EfficientSAMs, light-weight SAM models that exhibit decent performance with largely reduced complexity. Our idea is based on leveraging masked image pretraining, SAMI, which learns to reconstruct features from the SAM image encoder for effective visual representation learning. Further, we take SAMI-pretrained light-weight image encoders and a mask decoder to build EfficientSAMs, and finetune the models on SA-1B for the segment anything task. We perform evaluations on multiple vision tasks including image classification, object detection, instance segmentation, and semantic object detection, and find that our proposed pretraining method, SAMI, consistently outperforms other masked image pretraining methods. On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably, with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.
- Published
- 2023
29. TEA: Test-time Energy Adaptation
- Author
-
Yuan, Yige, Xu, Bingbing, Hou, Liang, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Machine Learning - Abstract
Test-time adaptation (TTA) aims to improve model generalizability when test data diverges from training distribution, offering the distinct advantage of not requiring access to training data and processes, especially valuable in the context of large pre-trained models. However, current TTA methods fail to address the fundamental issue: covariate shift, i.e., the decreased generalizability can be attributed to the model's reliance on the marginal distribution of the training data, which may impair model calibration and introduce confirmation bias. To address this, we propose a novel energy-based perspective, enhancing the model's perception of target data distributions without requiring access to training data or processes. Building on this perspective, we introduce $\textbf{T}$est-time $\textbf{E}$nergy $\textbf{A}$daptation ($\textbf{TEA}$), which transforms the trained classifier into an energy-based model and aligns the model's distribution with the test data's, enhancing its ability to perceive test distributions and thus improving overall generalizability. Extensive experiments across multiple tasks, benchmarks and architectures demonstrate TEA's superior generalization performance against state-of-the-art methods. Further in-depth analyses reveal that TEA can equip the model with a comprehensive perception of test distribution, ultimately paving the way toward improved generalization and calibration., Comment: Accepted by IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR 2024). Code is available at https://github.com/yuanyige/tea
- Published
- 2023
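The classifier-to-EBM conversion at the heart of TEA has a compact form: the energy assigned to an input is the negative log-sum-exp of the classifier's logits, and adaptation then lowers this energy on test data. A minimal, framework-free sketch (the logit values are illustrative, not from the paper):

```python
import math

def energy(logits):
    """Energy of an input under a classifier viewed as an EBM:
    E(x) = -logsumexp(f(x)), computed with the max-shift trick for stability."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

confident = energy([5.0, 0.0, 0.0])   # low energy: input the model covers well
uncertain = energy([0.1, 0.0, 0.0])   # higher energy: poorly modeled input
print(confident < uncertain)          # True
```

In TEA proper, this energy is minimized over test batches by gradient steps on the trained classifier, so the model's implicit density shifts toward the test distribution; the sketch only shows the energy definition itself.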
30. Evaluation of the Tensile Properties of Vanadium-Added Steels with Different Ferrite and Pearlite Hardness Ratios
- Author
-
Kawamura, Minato, Ogawa, Toshio, Sun, Fei, and Adachi, Yoshitaka
- Published
- 2024
- Full Text
- View/download PDF
31. Improvement in the Strength–Ductility Balance of Tempered Martensite Steel by Controlling Cementite Particle Size Distribution
- Author
-
Hayakawa, Kenji, Ogawa, Toshio, He, Lei, Sun, Fei, and Adachi, Yoshitaka
- Published
- 2024
- Full Text
- View/download PDF
32. Simultaneously enhancing toluene adsorption and regeneration process by hierarchical pore in activated coke: a combined experimental and adsorption kinetic modeling study
- Author
-
Chen, Guoqing, Zhang, Wenshuang, Sun, Fei, Qu, Zhibin, Hu, Yun, Li, Xuhan, Li, Junfeng, and Wang, Tao
- Published
- 2024
- Full Text
- View/download PDF
33. Future in the past: paternal reprogramming of offspring phenotype and the epigenetic mechanisms
- Author
-
Wu, Di, Zhang, Kejia, Guan, Kaifeng, Khan, Faheem Ahmed, Pandupuspitasari, Nuruliarizki Shinta, Negara, Windu, Sun, Fei, and Huang, Chunjie
- Published
- 2024
- Full Text
- View/download PDF
34. Tbx21 gene and its association with resistance against viral nervous necrosis (VNN) in Asian seabass, Lates calcarifer
- Author
-
Wong, Joey, Yang, Zituo, Wang, Le, Sun, Fei, and Yue, Gen Hua
- Published
- 2024
- Full Text
- View/download PDF
35. Unveiling the tapestry of teacher belief research: tracing the present and forging the future through bibliometric analysis
- Author
-
Wang, Xiaochen, Gao, Yang, Sun, Fei, and Wang, Qikai
- Published
- 2024
- Full Text
- View/download PDF
36. The rotation effect on the thermodynamics of the QCD matter
- Author
-
Sun, Fei, Li, Shuang, Wen, Rui, Huang, Anping, and Xie, Wei
- Subjects
High Energy Physics - Phenomenology - Abstract
In this study, we investigate the impact of rotation on the thermodynamic characteristics of QCD matter using the three-flavor NJL model. We examine the temperature, quark chemical potential, and angular velocity dependencies of key thermodynamic quantities, such as the trace anomaly, specific heat, speed of sound, angular momentum, and moment of inertia. As the main finding of our analysis, we observe that the speed of sound exhibits a nonmonotonic behavior as the angular velocity changes., Comment: 18 pages, 19 figures
- Published
- 2023
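For orientation, the bulk quantities named in the abstract above have standard definitions, written here in their usual non-rotating forms (the paper's rotating-frame generalizations may differ):

$\Delta = \frac{\varepsilon - 3p}{T^{4}}, \qquad c_{V} = \frac{\partial \varepsilon}{\partial T}, \qquad c^{2}_{s} = \frac{\partial p}{\partial \varepsilon}$,

where $\varepsilon$ is the energy density and $p$ the pressure.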
37. Robust Recommender System: A Survey and Future Directions
- Author
-
Zhang, Kaike, Cao, Qi, Sun, Fei, Wu, Yunfan, Tao, Shuchang, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Information Retrieval - Abstract
With the rapid growth of information, recommender systems have become integral for providing personalized suggestions and overcoming information overload. However, their practical deployment often encounters "dirty" data, where noise or malicious information can lead to abnormal recommendations. Research on improving recommender systems' robustness against such dirty data has thus gained significant attention. This survey provides a comprehensive review of recent work on recommender systems' robustness. We first present a taxonomy to organize current techniques for withstanding malicious attacks and natural noise. We then explore state-of-the-art methods in each category, including fraudster detection, adversarial training, and certifiably robust training against malicious attacks, as well as regularization, purification, and self-supervised learning against natural noise. Additionally, we summarize evaluation metrics and common datasets used to assess robustness. We discuss robustness across varying recommendation scenarios and its interplay with other properties like accuracy, interpretability, privacy, and fairness. Finally, we delve into open issues and future research directions in this emerging field. Our goal is to equip readers with a holistic understanding of robust recommender systems and spotlight pathways for future research and development.
- Published
- 2023
38. A Large Language Model Enhanced Conversational Recommender System
- Author
-
Feng, Yue, Liu, Shuchang, Xue, Zhenghai, Cai, Qingpeng, Hu, Lantao, Jiang, Peng, Gai, Kun, and Sun, Fei
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
Conversational recommender systems (CRSs) aim to recommend high-quality items to users through a dialogue interface. A CRS usually contains multiple sub-tasks, such as user preference elicitation, recommendation, explanation, and item information search. Developing effective CRSs poses several challenges: 1) how to properly manage sub-tasks; 2) how to effectively solve different sub-tasks; and 3) how to correctly generate responses that interact with users. Recently, Large Language Models (LLMs) have exhibited an unprecedented ability to reason and generate, presenting a new opportunity to develop more powerful CRSs. In this work, we propose a new LLM-based CRS, referred to as LLMCRS, to address the above challenges. For sub-task management, we leverage the reasoning ability of the LLM to effectively manage sub-tasks. For sub-task solving, we pair the LLM with expert models for different sub-tasks to achieve enhanced performance. For response generation, we utilize the generation ability of the LLM as a language interface to better interact with users. Specifically, LLMCRS divides the workflow into four stages: sub-task detection, model matching, sub-task execution, and response generation. LLMCRS also employs schema-based instruction, demonstration-based instruction, dynamic sub-task and model matching, and summary-based generation to instruct the LLM to generate the desired results in the workflow. Finally, to adapt the LLM to conversational recommendation, we also propose fine-tuning the LLM with reinforcement learning from CRS performance feedback, referred to as RLPF. Experimental results on benchmark datasets show that LLMCRS with RLPF outperforms existing methods.
- Published
- 2023
39. Farnesoid X receptor mediates macrophage-intrinsic responses to suppress colitis-induced colon cancer progression.
- Author
-
Dong, Xingchen, Qi, Ming, Cai, Chunmiao, Zhu, Yu, Li, Yuwenbin, Coulter, Sally, Sun, Fei, Liddle, Christopher, Uboha, Nataliya V, Halberg, Richard, Xu, Wei, Marker, Paul, and Fu, Ting
- Subjects
Biomedical and Clinical Sciences ,Immunology ,Autoimmune Disease ,Inflammatory Bowel Disease ,Digestive Diseases ,Crohn's Disease ,Colo-Rectal Cancer ,Cancer ,Aetiology ,2.1 Biological and endogenous factors ,Oral and gastrointestinal ,Animals ,Mice ,Humans ,Colonic Neoplasms ,Colitis ,Macrophages ,Inflammatory Bowel Diseases ,Inflammation ,Bile Acids and Salts ,Disease Models ,Animal ,Colorectal cancer ,Endocrinology ,Gastroenterology ,Biomedical and clinical sciences ,Health sciences - Abstract
Bile acids (BAs) affect the intestinal environment by ensuring barrier integrity, maintaining microbiota balance, regulating epithelium turnover, and modulating the immune system. As a master regulator of BA homeostasis, farnesoid X receptor (FXR) is severely compromised in patients with inflammatory bowel disease (IBD) and colitis-associated colorectal cancer (CAC). At the front line, gut macrophages react to the microbiota and metabolites that breach the epithelium. We aim to study the role of the BA/FXR axis in macrophages. This study demonstrates that inflammation-induced epithelial abnormalities compromised FXR signaling and altered BAs' profile in a mouse CAC model. Further, gut macrophage-intrinsic FXR sensed aberrant BAs, leading to pro-inflammatory cytokines' secretion, which promoted intestinal stem cell proliferation. Mechanistically, activation of FXR ameliorated intestinal inflammation and inhibited colitis-associated tumor growth, by regulating gut macrophages' recruitment, polarization, and crosstalk with Th17 cells. However, deletion of FXR in bone marrow or gut macrophages escalated the intestinal inflammation. In summary, our study reveals a distinctive regulatory role of FXR in gut macrophages, suggesting its potential as a therapeutic target for addressing IBD and CAC.
- Published
- 2024
40. An injury-responsive mmp14b enhancer is required for heart regeneration.
- Author
-
Zlatanova, Ivana, Sun, Fei, Wu, Roland, Chen, Xiaoxin, Lau, Bryan, Colombier, Pauline, Sinha, Tanvi, Xu, Shan-Mei, Huang, Guo, Black, Brian, Materna, Stefan, and Celona, Barbara
- Subjects
Animals ,Mice ,Zebrafish ,Endothelial Cells ,Myocardium ,Myocytes ,Cardiac ,Cell Proliferation ,Regeneration ,Mammals - Abstract
Mammals have limited capacity for heart regeneration, whereas zebrafish have extraordinary regeneration abilities. During zebrafish heart regeneration, endothelial cells promote cardiomyocyte cell cycle reentry and myocardial repair, but the mechanisms responsible for promoting an injury microenvironment conducive to regeneration remain incompletely defined. Here, we identify the matrix metalloproteinase Mmp14b as an essential regulator of heart regeneration. We identify a TEAD-dependent mmp14b endothelial enhancer induced by heart injury in zebrafish and mice, and we show that the enhancer is required for regeneration, supporting a role for Hippo signaling upstream of mmp14b. Last, we show that MMP-14 function in mice is important for the accumulation of Agrin, an essential regulator of neonatal mouse heart regeneration. These findings reveal mechanisms for extracellular matrix remodeling that promote heart regeneration.
- Published
- 2023
41. The Splitting of Chiral and Deconfinement Phase Transitions induced by Rotation
- Author
-
Sun, Fei, Xu, Kun, and Huang, Mei
- Subjects
High Energy Physics - Phenomenology - Abstract
The chiral and deconfinement phase transitions under rotation have been investigated simultaneously in the Polyakov-Nambu-Jona-Lasinio (PNJL) model. Interestingly, we find that the chiral phase transition is catalyzed while the deconfinement phase transition is decelerated by rotation; therefore, a chirally symmetric but confined phase is induced by rotation, which indicates that chiral dynamics and gluon dynamics can be split by rotation., Comment: 9 pages, 5 figures
- Published
- 2023
- Full Text
- View/download PDF
42. Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts
- Author
-
Jawahar, Ganesh, Yang, Haichuan, Xiong, Yunyang, Liu, Zechun, Wang, Dilin, Sun, Fei, Li, Meng, Pappu, Aasish, Oguz, Barlas, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Krishnamoorthi, Raghuraman, and Chandra, Vikas
- Subjects
Computer Science - Computation and Language - Abstract
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed due to weight sharing. In NLP tasks like machine translation and pre-trained language modeling, there is a significant performance gap between supernet and training from scratch for the same model architecture, necessitating retraining post optimal architecture identification. This study introduces a solution called mixture-of-supernets, a generalized supernet formulation leveraging mixture-of-experts (MoE) to enhance supernet model expressiveness with minimal training overhead. Unlike conventional supernets, this method employs an architecture-based routing mechanism, enabling indirect sharing of model weights among subnetworks. This customization of weights for specific architectures, learned through gradient descent, minimizes retraining time, significantly enhancing training efficiency in NLP. The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models, exhibiting a superior latency-BLEU tradeoff compared to HAT, the SoTA NAS framework for machine translation. Furthermore, it excels in NAS for building memory-efficient task-agnostic BERT models, surpassing NAS-BERT and AutoDistil across various model sizes. The code can be found at: https://github.com/UBC-NLP/MoS., Comment: ACL 2024 Findings
- Published
- 2023
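The architecture-routed weight generation in the mixture-of-supernets abstract above can be caricatured as a convex combination of expert weight matrices, with mixing coefficients produced by a router from the sampled architecture's descriptor. A deliberately simplified, dependency-free sketch; all names, shapes, and coefficient values are hypothetical:

```python
def mix_weights(expert_weights, alphas):
    """Combine expert weight matrices as W = sum_i alpha_i * W_i.
    expert_weights: list of equally shaped matrices (lists of lists).
    alphas: architecture-dependent routing coefficients, summing to 1."""
    rows, cols = len(expert_weights[0]), len(expert_weights[0][0])
    mixed = [[0.0] * cols for _ in range(rows)]
    for a, W in zip(alphas, expert_weights):
        for i in range(rows):
            for j in range(cols):
                mixed[i][j] += a * W[i][j]
    return mixed

# Two 1x2 experts; a router favouring expert 1 for this sampled architecture.
experts = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(mix_weights(experts, [0.25, 0.75]))  # [[2.5, 3.5]]
```

In the actual method the expert weights and the router are learned jointly by gradient descent; the point of the sketch is only that each sampled architecture gets its own effective weights without a separately stored model per architecture.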
43. PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
- Author
-
Yuan, Yige, Xu, Bingbing, Lin, Bo, Hou, Liang, Sun, Fei, Shen, Huawei, and Cheng, Xueqi
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
The generalization of neural networks is a central challenge in machine learning, especially concerning the performance under distributions that differ from training ones. Current methods, mainly based on the data-driven paradigm such as data augmentation, adversarial training, and noise injection, may encounter limited generalization due to model non-smoothness. In this paper, we propose to investigate generalization from a Partial Differential Equation (PDE) perspective, aiming to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data. Specifically, we first establish the connection between neural network generalization and the smoothness of the solution to a specific PDE, namely "transport equation". Building upon this, we propose a general framework that introduces adaptive distributional diffusion into transport equation to enhance the smoothness of its solution, thereby improving generalization. In the context of neural networks, we put this theoretical framework into practice as $\textbf{PDE+}$ ($\textbf{PDE}$ with $\textbf{A}$daptive $\textbf{D}$istributional $\textbf{D}$iffusion) which diffuses each sample into a distribution covering semantically similar inputs. This enables better coverage of potentially unobserved distributions in training, thus improving generalization beyond merely data-driven methods. The effectiveness of PDE+ is validated through extensive experimental settings, demonstrating its superior performance compared to SOTA methods., Comment: Accepted by Annual AAAI Conference on Artificial Intelligence (AAAI) 2024. Code is available at https://github.com/yuanyige/pde-add
- Published
- 2023
44. Retraction Note: Policies to obtain energy transformation target: evidence from emission accounting impacts
- Author
-
Qu, Zhaojun, Sun, Fei, and Wu, Qitao
- Published
- 2024
- Full Text
- View/download PDF
45. tRNA modifications and tRNA-derived small RNAs: new insights of tRNA in human disease
- Author
-
Wu, Di, Li, Xiuling, Khan, Faheem Ahmed, Yuan, Chenyang, Pandupuspitasari, Nuruliarizki Shinta, Huang, Chunjie, Sun, Fei, and Guan, Kaifeng
- Published
- 2024
- Full Text
- View/download PDF
46. Emergence and transformation of polar skyrmion lattices via flexoelectricity
- Author
-
Ren, Jianhua, Liu, Linjie, Sun, Fei, He, Qian, Wu, Mengjun, Chen, Weijin, and Zheng, Yue
- Published
- 2024
- Full Text
- View/download PDF
47. Abnormal expression of circ_0013958 in patients with acute myocardial infarction (AMI) and its influence on prognosis
- Author
-
Sun, Fei, Zou, Shenglan, Li, Xiaomin, and Liu, Xueya
- Published
- 2024
- Full Text
- View/download PDF
48. Variant analysis and PGT-M of OTC gene in a Chinese family with ornithine carbamoyltransferase deficiency
- Author
-
Zhou, Yao, Jiang, Xinxing, Zhang, Yongfang, Zhang, Yu, Sun, Fei, and Ma, Yanlin
- Published
- 2024
- Full Text
- View/download PDF
49. Detection of AZF microdeletions and analysis of reproductive hormonal profiles in Hainan men undergoing assisted reproductive technology
- Author
-
He, Qina, Zhang, Yongle, Song, Mengyi, Zhou, Yao, Lin, Dan, Ma, Yanlin, Sun, Fei, and Li, Qi
- Published
- 2024
- Full Text
- View/download PDF
50. Effects of different doses of intranasal dexmedetomidine on related complications and parents’ satisfaction in anesthetized children: a systematic review
- Author
-
Hu, Wei, Wang, Ming, and Sun, Fei
- Published
- 2024
- Full Text
- View/download PDF