44,666 results for "SUN, Wei"
Search Results
2. Showcasing Japan
- Author
- Sun, Wei and Zancan, Claudia
- Subjects
- Exhibition Studies. Identity. Italo-Japanese cultural exchange. Japanese archaeology. Japanese art, Languages and literature of Eastern Asia, Africa, Oceania, PL1-8844
- Abstract
To what extent does the narrative of Japan’s prehistorical origins matter to Italy? In the second half of the twentieth century, Palazzo delle Esposizioni in Rome hosted two significant exhibitions dedicated to Japanese archaeology and ancient art: Tesori dell’Arte Giapponese in 1958 and Il Giappone prima dell’Occidente in 1995. Both displays provided Italian visitors with an unparalleled framework for engaging with early artistic manifestations of the archipelago known today as Japan. Building on a critical analysis of the prehistoric and protohistoric artefacts, from the Jōmon to the Kofun periods, selected for the Italian audience, this paper examines the Japanese government’s active deployment of narrative discourse on Japan’s identity in Italy. It also sheds light on the presence of Japanese archaeology and art in Italian public and private collections throughout the twentieth century. The analysis delves into the textual and visual presentation of the exhibits, examining both the venues and the catalogues. These sources offer insights into potential instances of orientalism or self-orientalism, revealing a narrative closely tied to stereotypical views. The investigation unravels the aspects of Japan’s past emphasised in diplomatic shows, which evolved alongside ground-breaking archaeological discoveries in post-war Japan.
- Published
- 2024
3. Optimizing Online Teaching: Total Quality Management in Action for Quality Assurance Measures
- Author
- Sun Wei and Guozhen Yin
- Abstract
The large-scale online teaching amid the pandemic triggered increasing concern over online teaching management and quality assurance. Taking the theory of Total Quality Management (TQM) as guidance, a Chinese higher education institution (CHEI) built a multi-level, multi-link, and multi-dimensional teaching quality monitoring system (the Online Teaching Quality Assurance Measures), featuring full participation, whole-process coverage, and all-round development, by innovating its teaching quality management and monitoring mechanisms. The aim was to ensure continuous improvement in talent-training quality and realize the sustainable development of application-oriented undergraduate universities with quality improvement at the core. The effectiveness of online teaching quality was demonstrated through the Questionnaire of Student Evaluation of Online Teaching Faculty and students' academic performance (GPA) before and after the implementation of the Online Teaching Quality Assurance Measures. The results indicated that the Measures have a series of positive effects on online teaching in the CHEI and systematically guide online instructors, as evidenced by outstanding ratings and feedback in course evaluations and by students' academic performance. The study also revealed that the CHEI's online teaching faces some challenges, especially in promoting learning interaction and teaching cooperation. The study underscored the importance of continuous improvement and proposed interventions for enhancing online educational practices, in line with TQM principles. The findings are expected to make an important contribution to the field of online teaching quality management in higher education.
- Published
- 2024
4. Bilateral boundary finite-time stabilization of 2x2 linear first-order hyperbolic systems with spatially varying coefficients
- Author
- Sun, Wei, Li, Jing, and Xu, Liangyu
- Subjects
- Mathematics - Optimization and Control, Mathematics - Analysis of PDEs, 35L04, 35L40, 93D15
- Abstract
This paper presents bilateral control laws for one-dimensional (1-D) linear 2x2 first-order hyperbolic systems with spatially varying coefficients. Bilateral control means that actuation is applied at both ends of the domain. The situation becomes more complex because the transport velocities are no longer constant, and this extension is nontrivial. By selecting an appropriate backstepping transformation and target system, the infinite-dimensional backstepping method is extended and a full-state feedback control law is given that ensures the closed-loop system converges to its zero equilibrium in finite time. The bilateral controller design also opens the potential for fault-tolerant designs.
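For concreteness, 2x2 linear first-order hyperbolic systems with spatially varying coefficients are commonly written in the following standard form (a sketch of the usual notation; the paper's exact symbols and boundary conditions may differ):

```latex
\begin{aligned}
u_t(x,t) &= -\lambda_1(x)\, u_x(x,t) + c_1(x)\, v(x,t), \\
v_t(x,t) &= \phantom{-}\lambda_2(x)\, v_x(x,t) + c_2(x)\, u(x,t), \qquad x \in [0,1],
\end{aligned}
```

with transport speeds $\lambda_1(x), \lambda_2(x) > 0$, so that $u$ propagates rightward and $v$ leftward. In the bilateral setting, control inputs enter through the boundary conditions at both ends, e.g. $u(0,t) = U_0(t)$ and $v(1,t) = U_1(t)$.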
- Published
- 2024
5. Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency
- Author
- Sun, Wei, Zhang, Weixia, Cao, Yuqin, Cao, Linhan, Jia, Jun, Chen, Zijian, Zhang, Zicheng, Min, Xiongkuo, and Zhai, Guangtao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
UHD images, typically with resolutions equal to or higher than 4K, pose a significant challenge for efficient image quality assessment (IQA) algorithms: adopting full-resolution images as inputs leads to overwhelming computational complexity, while commonly used pre-processing methods such as resizing or cropping may cause substantial loss of detail. To address this problem, we design a multi-branch deep neural network (DNN) to assess the quality of UHD images from three perspectives: global aesthetic characteristics, local technical distortions, and salient content perception. Specifically, aesthetic features are extracted from low-resolution images downsampled from the UHD ones, which lose high-frequency texture information but still preserve the global aesthetic characteristics. Technical distortions are measured using a fragment image composed of mini-patches cropped from UHD images via the grid mini-patch sampling strategy. The salient content of UHD images is detected and cropped to extract quality-aware features from the salient regions. We adopt Swin Transformer Tiny as the backbone network to extract features from these three perspectives. The extracted features are concatenated and regressed into quality scores by a two-layer multi-layer perceptron (MLP) network. We employ the mean squared error (MSE) loss to optimize prediction accuracy and the fidelity loss to optimize prediction monotonicity. Experimental results show that the proposed model achieves the best performance on the UHD-IQA dataset while maintaining the lowest computational complexity, demonstrating its effectiveness and efficiency. Moreover, the proposed model won first prize in the ECCV AIM 2024 UHD-IQA Challenge. The code is available at https://github.com/sunwei925/UIQA.
- Comment
- The proposed model won first prize in ECCV AIM 2024 Pushing the Boundaries of Blind Photo Quality Assessment Challenge
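The grid mini-patch sampling mentioned in the abstract can be sketched as follows: the image is divided into a uniform grid, one mini-patch is cropped from each cell, and the patches are stitched into a small "fragment" image that preserves pixel-level distortion detail. This is a minimal NumPy sketch; the grid and patch sizes here are illustrative assumptions, and the released UIQA code is the authoritative implementation.

```python
import numpy as np

def grid_minipatch_fragment(img, grid=8, patch=32, seed=0):
    """Stitch one random mini-patch per grid cell into a fragment image.

    A sketch of grid mini-patch sampling; `grid` and `patch` values
    are illustrative assumptions, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    h, w, _ = img.shape
    ch, cw = h // grid, w // grid  # size of each grid cell
    rows = []
    for i in range(grid):
        row = []
        for j in range(grid):
            # random top-left corner of the mini-patch inside cell (i, j)
            y = i * ch + rng.integers(0, ch - patch + 1)
            x = j * cw + rng.integers(0, cw - patch + 1)
            row.append(img[y:y + patch, x:x + patch])
        rows.append(np.concatenate(row, axis=1))
    return np.concatenate(rows, axis=0)

# A 4K-like frame collapses to a 256x256 fragment that keeps local detail.
frame = np.random.rand(2160, 3840, 3)
fragment = grid_minipatch_fragment(frame)
```

Unlike resizing, the fragment keeps original-resolution pixels, so local compression and noise artifacts survive the dimensionality reduction.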
- Published
- 2024
6. LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models
- Author
- Ge, Qihang, Sun, Wei, Zhang, Yu, Li, Yunhao, Ji, Zhongpeng, Sun, Fengyu, Jui, Shangling, Min, Xiongkuo, and Zhai, Guangtao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
The explosive growth of videos on streaming media platforms has underscored the urgent need for effective video quality assessment (VQA) algorithms to monitor and perceptually optimize the quality of streaming videos. However, VQA remains an extremely challenging task due to the diverse video content and the complex spatial and temporal distortions, thus necessitating more advanced methods to address these issues. Nowadays, large multimodal models (LMMs), such as GPT-4V, have exhibited strong capabilities for various visual understanding tasks, motivating us to leverage the powerful multimodal representation ability of LMMs to solve the VQA task. Therefore, we propose the first Large Multi-Modal Video Quality Assessment (LMM-VQA) model, which introduces a novel spatiotemporal visual modeling strategy for quality-aware feature extraction. Specifically, we first reformulate the quality regression problem into a question-and-answer (Q&A) task and construct Q&A prompts for VQA instruction tuning. Then, we design a spatiotemporal vision encoder to extract spatial and temporal features to represent the quality characteristics of videos, which are subsequently mapped into the language space by the spatiotemporal projector for modality alignment. Finally, the aligned visual tokens and the quality-inquired text tokens are aggregated as inputs for the large language model (LLM) to generate the quality score and level. Extensive experiments demonstrate that LMM-VQA achieves state-of-the-art performance across five VQA benchmarks, exhibiting an average improvement of $5\%$ in generalization ability over existing methods. Furthermore, due to the advanced design of the spatiotemporal encoder and projector, LMM-VQA also performs exceptionally well on general video understanding tasks, further validating its effectiveness. Our code will be released at https://github.com/Sueqk/LMM-VQA.
- Published
- 2024
7. AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results
- Author
- Smirnov, Maksim, Gushchin, Aleksandr, Antsiferova, Anastasia, Vatolin, Dmitry, Timofte, Radu, Jia, Ziheng, Zhang, Zicheng, Sun, Wei, Qian, Jiaying, Cao, Yuqin, Sun, Yinan, Zhu, Yuxin, Min, Xiongkuo, Zhai, Guangtao, De, Kanjar, Luo, Qing, Zhang, Ao-Xiang, Zhang, Peng, Lei, Haibo, Jiang, Linyan, Li, Yaqing, Meng, Wenhui, Tan, Xiaoheng, Wang, Haiqiang, Xu, Xiaozhong, Liu, Shan, Chen, Zhenzhong, Cheng, Zhengxue, Xiao, Jiahao, Xu, Jun, He, Chenlong, Zheng, Qi, Zhu, Ruoxi, Li, Min, Fan, Yibo, and Tu, Zhengzhong
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
- Abstract
Video quality assessment (VQA) is a crucial task in the development of video compression standards, as it directly impacts the viewer experience. This paper presents the results of the Compressed Video Quality Assessment challenge, held in conjunction with the Advances in Image Manipulation (AIM) workshop at ECCV 2024. The challenge aimed to evaluate the performance of VQA methods on a diverse dataset of 459 videos, encoded with 14 codecs of various compression standards (AVC/H.264, HEVC/H.265, AV1, and VVC/H.266) and containing a comprehensive collection of compression artifacts. To measure the methods' performance, we employed traditional correlation coefficients between their predictions and subjective scores, which were collected via large-scale crowdsourced pairwise human comparisons. For training purposes, participants were provided with the Compressed Video Quality Assessment Dataset (CVQAD), a previously developed dataset of 1022 videos. Up to 30 participating teams registered for the challenge; we report the results of the 6 teams that submitted valid final solutions and code for reproducing the results. Moreover, we calculated and present the performance of state-of-the-art VQA methods on the developed dataset, providing a comprehensive benchmark for future research. The dataset, results, and online leaderboard are publicly available at https://challenges.videoprocessing.ai/challenges/compressedvideo-quality-assessment.html.
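The "traditional correlation coefficients" used in such evaluations are typically the Pearson linear correlation coefficient (PLCC, prediction accuracy) and the Spearman rank-order correlation coefficient (SROCC, prediction monotonicity) between model outputs and subjective scores. A minimal NumPy sketch, assuming no tied scores (production code would typically use `scipy.stats.pearsonr` and `spearmanr`):

```python
import numpy as np

def plcc(pred, mos):
    """Pearson linear correlation between predictions and subjective scores."""
    p = np.asarray(pred, float) - np.mean(pred)
    m = np.asarray(mos, float) - np.mean(mos)
    return float(p @ m / np.sqrt((p @ p) * (m @ m)))

def srocc(pred, mos):
    """Spearman rank-order correlation: PLCC computed on the ranks.

    Uses a simple double-argsort ranking, which assumes no ties.
    """
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    return plcc(rank(np.asarray(pred)), rank(np.asarray(mos)))

# A monotone but nonlinear predictor: perfect SROCC, imperfect PLCC.
scores = [1.0, 2.0, 3.0, 4.0]
preds = [1.0, 4.0, 9.0, 16.0]
```

The pair illustrates why both metrics are reported: SROCC rewards correct ordering regardless of scale, while PLCC penalizes nonlinearity of the mapping.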
- Published
- 2024
8. Depth-guided Texture Diffusion for Image Semantic Segmentation
- Author
- Sun, Wei, Li, Yuan, Ye, Qixiang, Jiao, Jianbin, and Zhou, Yanzhao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Depth information provides valuable insights into the 3D structure, especially the outlines of objects, which can be utilized to improve semantic segmentation tasks. However, a naive fusion of depth information can disrupt features and compromise accuracy due to the modality gap between depth and vision. In this work, we introduce a Depth-guided Texture Diffusion approach that effectively tackles this challenge. Our method extracts low-level features from edges and textures to create a texture image. This image is then selectively diffused across the depth map, enhancing structural information vital for precisely extracting object outlines. By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image, enabling more accurate semantic segmentation. We conduct comprehensive experiments across diverse, commonly used datasets spanning a wide range of semantic segmentation tasks, including Camouflaged Object Detection (COD), Salient Object Detection (SOD), and indoor semantic segmentation. With source-free estimated depth or depth captured by depth cameras, our method consistently outperforms existing baselines and achieves new state-of-the-art results, demonstrating the effectiveness of our Depth-guided Texture Diffusion for image semantic segmentation.
- Published
- 2024
9. Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS
- Author
- Sun, Wei, Zhang, Xiaosong, Wan, Fang, Zhou, Yanzhao, Li, Yuan, Ye, Qixiang, and Jiao, Jianbin
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Novel View Synthesis (NVS) without Structure-from-Motion (SfM) pre-processed camera poses--referred to as SfM-free methods--is crucial for promoting rapid response capabilities and enhancing robustness against variable operating conditions. Recent SfM-free methods have integrated pose optimization, designing end-to-end frameworks for joint camera pose estimation and NVS. However, most existing works rely on per-pixel image loss functions, such as the L2 loss. In SfM-free methods, inaccurate initial poses lead to misalignment issues, which, under the constraints of per-pixel image loss functions, result in excessive gradients, causing unstable optimization and poor convergence for NVS. In this study, we propose correspondence-guided SfM-free 3D Gaussian splatting for NVS. We use correspondences between the target and the rendered result to achieve better pixel alignment, facilitating the optimization of relative poses between frames. We then apply the learned poses to optimize the entire scene. Each 2D screen-space pixel is associated with its corresponding 3D Gaussians through approximated surface rendering to facilitate gradient backpropagation. Experimental results underline the superior performance and time efficiency of the proposed approach compared to the state-of-the-art baselines.
- Comment
- arXiv admin note: text overlap with arXiv:2312.07504 by other authors
- Published
- 2024
10. Nanometric dual-comb ranging using photon-level microcavity solitons
- Author
- Wang, Zihao, Wang, Yifei, Shi, Baoqi, Sun, Wei, Yang, Changxi, Liu, Junqiu, and Bao, Chengying
- Subjects
- Physics - Optics
- Abstract
Absolute distance measurement with low return power, fast measurement speed, high precision, and immunity to intensity fluctuations is highly demanded in nanotechnology. However, achieving all these objectives simultaneously remains a significant challenge for miniaturized systems. Here, we demonstrate dual-comb ranging (DCR) that encompasses all these capabilities by using counter-propagating (CP) solitons generated in an integrated Si$_3$N$_4$ microresonator. We derive equations linking the DCR precision with comb line powers, revealing the advantage of the microcomb's large line spacing in precise ranging. Leveraging this advantage, our system reaches 1-nm precision and measures nm-scale vibration at frequencies up to 0.9 MHz. We also show that precise DCR is possible even in the presence of strong intensity noise and loss, using a mean received photon number as low as 5.5$\times$10$^{-4}$ per pulse. Our work establishes an optimization principle for dual-comb systems and bridges high-performance ranging with foundry-manufactured photonic chips.
- Published
- 2024
11. SG-JND: Semantic-Guided Just Noticeable Distortion Predictor For Image Compression
- Author
- Cao, Linhan, Sun, Wei, Min, Xiongkuo, Jia, Jun, Zhang, Zicheng, Chen, Zijian, Zhu, Yucheng, Liu, Lizhou, Chen, Qiubo, Chen, Jing, and Zhai, Guangtao
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Just noticeable distortion (JND), representing the threshold of distortion in an image that is minimally perceptible to the human visual system (HVS), is crucial for image compression algorithms to achieve a trade-off between transmission bit rate and image quality. However, traditional JND prediction methods rely only on pixel-level or sub-band-level features, lacking the ability to capture the impact of image content on JND. To bridge this gap, we propose a Semantic-Guided JND (SG-JND) network that leverages semantic information for JND prediction. In particular, SG-JND consists of three essential modules: the image preprocessing module extracts semantic-level patches from images, the feature extraction module extracts multi-layer features by utilizing cross-scale attention layers, and the JND prediction module regresses the extracted features into the final JND value. Experimental results show that SG-JND achieves state-of-the-art performance on two publicly available JND datasets, which demonstrates the effectiveness of SG-JND and highlights the significance of incorporating semantic information in JND assessment.
- Comment
- Accepted by ICIP 2024
- Published
- 2024
12. A microcomb-empowered Fourier domain mode-locked LiDAR
- Author
- Cai, Zhaoyu, Wang, Zihao, Wei, Ziqi, Shi, Baoqi, Sun, Wei, Yang, Changxi, Liu, Junqiu, and Bao, Chengying
- Subjects
- Physics - Optics, Physics - Applied Physics
- Abstract
Light detection and ranging (LiDAR) has emerged as an indispensable tool in autonomous technology. Among its various techniques, frequency modulated continuous wave (FMCW) LiDAR stands out due to its capability to operate with ultralow return power, immunity to unwanted light, and simultaneous acquisition of distance and velocity. However, achieving a rapid update rate with sub-micron precision remains a challenge for FMCW LiDARs. Here, we present such a LiDAR with a sub-10 nm precision and a 24.6 kHz update rate by combining a broadband Fourier domain mode-locked (FDML) laser with a silicon nitride soliton microcomb. An ultrahigh frequency chirp rate up to 320 PHz/s is linearized by a 50 GHz microcomb to reach this performance. Our theoretical analysis also contributes to resolving the challenge of FMCW velocity measurements with nonlinear frequency sweeps and enables us to realize velocity measurement with an uncertainty below 0.4 mm/s. Our work shows how nanophotonic microcombs can unlock the potential of ultrafast frequency sweeping lasers for applications including LiDAR, optical coherence tomography and sensing.
- Published
- 2024
13. Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
- Author
- Zhang, Zhichao, Li, Xinyue, Sun, Wei, Jia, Jun, Min, Xiongkuo, Zhang, Zicheng, Li, Chunyi, Chen, Zijian, Wang, Puyi, Ji, Zhongpeng, Sun, Fengyu, Jui, Shangling, and Zhai, Guangtao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
In recent years, artificial intelligence (AI) driven video generation has garnered significant attention due to advancements in stable diffusion and large language model techniques. Thus, there is a great demand for accurate video quality assessment (VQA) models to measure the perceptual quality of AI-generated content (AIGC) videos as well as to optimize video generation techniques. However, assessing the quality of AIGC videos is quite challenging due to the highly complex distortions they exhibit (e.g., unnatural action, irrational objects, etc.). Therefore, in this paper, we systematically investigate the AIGC-VQA problem from both subjective and objective quality assessment perspectives. For the subjective perspective, we construct a Large-scale Generated Video Quality assessment (LGVQ) dataset, consisting of 2,808 AIGC videos generated by 6 video generation models using 468 carefully selected text prompts. Unlike previous subjective VQA experiments, we evaluate the perceptual quality of AIGC videos from three dimensions: spatial quality, temporal quality, and text-to-video alignment, which hold utmost importance for current video generation techniques. For the objective perspective, we establish a benchmark for evaluating existing quality assessment metrics on the LGVQ dataset, which reveals that current metrics perform poorly on it. Thus, we propose a Unified Generated Video Quality assessment (UGVQ) model to comprehensively and accurately evaluate the quality of AIGC videos across the three aspects with a single model, which uses visual, textual, and motion features of the video and the corresponding prompt, and integrates key features to enhance feature expression. We hope that our benchmark can promote the development of quality evaluation metrics for AIGC videos. The LGVQ dataset and the UGVQ metric will be publicly released.
- Published
- 2024
14. Domain Adaptable Prescriptive AI Agent for Enterprise
- Author
- Orderique, Piero, Sun, Wei, and Greenewald, Kristjan
- Subjects
- Computer Science - Artificial Intelligence
- Abstract
Despite advancements in causal inference and prescriptive AI, their adoption in enterprise settings remains limited, primarily due to technical complexity. Many users lack the necessary knowledge and appropriate tools to effectively leverage these technologies. This work at the MIT-IBM Watson AI Lab focuses on developing a proof-of-concept agent, PrecAIse, a domain-adaptable conversational agent equipped with a suite of causal and prescriptive tools to help enterprise users make better business decisions. The objective is to make advanced, novel causal inference and prescriptive tools widely accessible through natural language interactions. The presented Natural Language User Interface (NLUI) enables users with limited expertise in machine learning and data science to harness prescriptive analytics in their decision-making processes without requiring intensive computing resources. We present an agent capable of function calling, maintaining faithful, interactive, and dynamic conversations, and supporting new domains.
- Published
- 2024
15. UNQA: Unified No-Reference Quality Assessment for Audio, Image, Video, and Audio-Visual Content
- Author
- Cao, Yuqin, Min, Xiongkuo, Gao, Yixuan, Sun, Wei, Lin, Weisi, and Zhai, Guangtao
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
As multimedia data flourishes on the Internet, quality assessment (QA) of multimedia data becomes paramount for digital media applications. Since multimedia data includes multiple modalities, namely audio, image, video, and audio-visual (A/V) content, researchers have developed a range of QA methods to evaluate the quality of data in each modality. While these methods focus exclusively on single-modality QA issues, a unified QA model that can handle diverse media across multiple modalities is still missing, even though such a model would better resemble human perception behaviour and have a wider range of applications. In this paper, we propose the Unified No-reference Quality Assessment model (UNQA) for audio, image, video, and A/V content, which trains a single QA model across different media modalities. To tackle the issue of inconsistent quality scales among different QA databases, we develop a multi-modality strategy to jointly train UNQA on multiple QA databases. Based on the input modality, UNQA selectively extracts spatial features, motion features, and audio features, and calculates a final quality score via the four corresponding modality regression modules. Compared with existing QA methods, UNQA has two advantages: 1) the multi-modality training strategy makes the QA model learn a more general and robust quality-aware feature representation, as evidenced by the superior performance of UNQA compared to state-of-the-art QA methods; 2) UNQA reduces the number of models required to assess multimedia data across different modalities and is easy to deploy in practical applications.
- Published
- 2024
16. DOLOS: Tricking the Wi-Fi APs with Incorrect User Locations
- Author
- Arun, Aditya, Anand, Vaibhav, Sun, Wei, Ayyalasomayajula, Roshan, and Bharadia, Dinesh
- Subjects
- Computer Science - Networking and Internet Architecture, Electrical Engineering and Systems Science - Signal Processing
- Abstract
Wi-Fi-based indoor localization has been extensively studied for context-aware services. As a result, accurate Wi-Fi-based indoor localization poses a serious location-privacy threat. However, the existing solutions for location privacy protection are hard to implement on current devices. They require extra hardware deployment in the environment or hardware modifications at the transmitter or receiver side. To this end, we propose DOLOS, a system that can protect the location privacy of the Wi-Fi user with a novel signal obfuscation approach. DOLOS is a software-only solution that can be deployed on existing protocol-compliant Wi-Fi user devices. We provide this obfuscation by invalidating a simple assumption made by most localization systems -- that the direct-path signal arrives earlier than all reflections, allowing the direct path to be distinguished prior to estimating the location. DOLOS creates a novel software fix that allows the user to transmit a signal in which this direct path arrives later, creating ambiguity in the location estimates. Our experimental results demonstrate that DOLOS can degrade the localization accuracy of state-of-the-art systems by 6x for a single AP and 2.5x for multiple-AP scenarios, thereby protecting the Wi-Fi user's location privacy without compromising Wi-Fi communication performance.
- Published
- 2024
17. DiffStega: Towards Universal Training-Free Coverless Image Steganography with Diffusion Models
- Author
- Yang, Yiwei, Liu, Zheyuan, Jia, Jun, Gao, Zhongpai, Li, Yunhao, Sun, Wei, Liu, Xiaohong, and Zhai, Guangtao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Traditional image steganography focuses on concealing one image within another, aiming to avoid steganalysis by unauthorized entities. Coverless image steganography (CIS) enhances imperceptibility by not using any cover image. Recent works have utilized text prompts as keys in CIS through diffusion models. However, this approach faces three challenges: it is invalidated when the private prompt is guessed, crafting public prompts for semantic diversity is difficult, and prompt leakage is a risk during frequent transmission. To address these issues, we propose DiffStega, an innovative training-free diffusion-based CIS strategy for universal application. DiffStega uses a password-dependent reference image as an image prompt alongside the text, ensuring that only authorized parties can retrieve the hidden information. Furthermore, we develop a Noise Flip technique to further secure the steganography against unauthorized decryption. To comprehensively assess our method across general CIS tasks, we create a dataset comprising various image steganography instances. Experiments indicate substantial improvements in our method over existing ones, particularly in versatility, password sensitivity, and recovery quality. Codes are available at \url{https://github.com/evtricks/DiffStega}.
- Comment
- 9 pages, 7 figures; reference added; accepted at IJCAI2024 main track
- Published
- 2024
18. Unveiling mussel plaque core ductility: the role of pore distribution and hierarchical structure
- Author
- Lyu, Yulan, Tan, Mengting, Pang, Yong, Sun, Wei, Li, Shuguang, and Liu, Tao
- Subjects
- Physics - Applied Physics, Condensed Matter - Soft Condensed Matter, Physics - Biological Physics
- Abstract
The mussel thread-plaque system exhibits strong adhesion and high ductility, allowing it to adhere to various surfaces. While the microstructure of plaques has been thoroughly studied, the effect of their unique porous structure on ductility remains unclear. This study first investigated the porous structure of mussel plaque cores using scanning electron microscopy (SEM). Two-dimensional (2D) porous representative volume elements (RVEs) with scaled distribution parameters were generated, and a calibrated phase-field modelling method was applied to analyse the effect of the pore distribution and multi-scale porous structure on the failure mechanism of porous RVEs. The SEM analysis revealed that large-scale pores exhibited a lognormal size distribution and a uniform spatial distribution. Simulations showed that increasing the normalised mean radius of the large-scale pore distribution statistically leads to a decreasing trend in ductility, strength, and strain energy, but cannot solely determine their values. The interaction between pores can lead to two different failure modes under the same pore distribution: a progressive failure mode and a sudden failure mode. Additionally, the hierarchical structure of multi-scale porous RVEs can further increase ductility by 40%-60% compared to single-scale porous RVEs by reducing stiffness, highlighting that the hierarchical structure could be another key factor contributing to the high ductility. These findings deepen our understanding of how the pore distribution and multi-scale porous structure in mussel plaques contribute to their high ductility and affect other mechanical properties, providing valuable insights for the future design of highly ductile biomimetic materials.
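The reported pore statistics (lognormal radii, uniform spatial centres) translate directly into a recipe for generating random porous RVEs. A minimal NumPy sketch with illustrative parameter values; the paper's calibrated distribution parameters are not reproduced here:

```python
import numpy as np

def sample_pore_layout(n_pores, median_radius, sigma, domain=1.0, seed=0):
    """Sample a 2D pore layout: lognormal radii, uniform spatial centres.

    `median_radius` is the median of the lognormal law (exp of its
    log-mean); `sigma` is its log-scale spread. All values here are
    illustrative assumptions, not the paper's calibrated parameters.
    """
    rng = np.random.default_rng(seed)
    radii = rng.lognormal(mean=np.log(median_radius), sigma=sigma, size=n_pores)
    centres = rng.uniform(0.0, domain, size=(n_pores, 2))
    # Nominal area fraction of pores (ignores possible pore overlap).
    porosity = np.pi * np.sum(radii**2) / domain**2
    return radii, centres, porosity

radii, centres, porosity = sample_pore_layout(200, median_radius=0.01, sigma=0.4)
```

Such sampled layouts would then be meshed and fed to the phase-field fracture solver; repeating the draw with different seeds gives the statistical ensemble behind trends like the reported ductility decrease with increasing normalised mean radius.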
- Published
- 2024
19. An alkali-referenced vector spectrum analyzer for visible-light integrated photonics
- Author
- Shi, Baoqi, Zheng, Ming-Yang, Zhao, Yunkai, Luo, Yi-Han, Long, Jinbao, Sun, Wei, Ma, Wenbo, Xie, Xiu-Ping, Gao, Lan, Shen, Chen, Wang, Anting, Liang, Wei, Zhang, Qiang, and Liu, Junqiu
- Subjects
- Physics - Optics
- Abstract
Integrated photonics has transformed our information society by offering on-chip optical signal synthesis, processing, and detection with reduced size, weight, and power consumption. As such, it has been successfully established in the near-infrared (NIR) telecommunication bands. With the soaring demand for miniaturized systems for biosensing, quantum information, and transportable atomic clocks, extensive endeavors have been devoted to translating integrated photonics into the visible spectrum, i.e. visible-light integrated photonics. Various innovative visible-light integrated devices have been demonstrated, such as lasers, frequency combs, and atom traps, highlighting the capacity and prospect to create chip-based optical atomic clocks that can make timing and frequency metrology ubiquitous. A pillar of the development of visible-light integrated photonics is characterization techniques featuring high frequency resolution and wide spectral coverage, which, however, remain elusive. Here, we demonstrate a vector spectrum analyzer (VSA) for visible-light integrated photonics, offering spectral bandwidth from 766 to 795 nm and frequency resolution of 415 kHz. The VSA is rooted in a widely chirping, high-power, narrow-linewidth, mode-hop-free laser around 780 nm, which is frequency-doubled from the near-infrared via an efficient, broadband CPLN waveguide. The VSA is further referenced to hyperfine structures of rubidium and potassium atoms, enabling 8.1 MHz frequency accuracy. We apply our VSA to showcase the characterization of loss, dispersion, and phase response of passive integrated devices, as well as densely spaced spectra of mode-locked lasers. Combining operation in the NIR and visible spectra, our VSA allows characterization bandwidth exceeding an octave and can be an invaluable diagnostic tool for spectroscopy, nonlinear optical processing, imaging, and quantum interfaces to atomic devices.
- Published
- 2024
20. Unlocking the Potential of Early Epochs: Uncertainty-aware CT Metal Artifact Reduction
- Author
-
Yang, Xinquan, Zhou, Guanqun, Sun, Wei, Zhang, Youjian, Wang, Zhongya, He, Jiahui, and Zhang, Zhicheng
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
In computed tomography (CT), the presence of metallic implants in patients often leads to disruptive artifacts in the reconstructed images, hindering accurate diagnosis. Recently, a large number of supervised deep-learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods neglect the influence of initial training weights. In this paper, we have discovered that the uncertainty image computed from the restoration result of initial training weights can effectively highlight high-frequency regions, including metal artifacts. This observation can be leveraged to assist the MAR network in removing metal artifacts. Therefore, we propose an uncertainty constraint (UC) loss that utilizes the uncertainty image as an adaptive weight to guide the MAR network to focus on the metal artifact region, leading to improved restoration. The proposed UC loss is designed to be a plug-and-play method, compatible with any MAR framework, and easily adoptable. To validate the effectiveness of the UC loss, we conduct extensive experiments on the publicly available DeepLesion and CLINIC-metal datasets. Experimental results demonstrate that the UC loss further optimizes the network training process and significantly improves the removal of metal artifacts.
- Published
- 2024
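The adaptive-weighting idea behind the UC loss above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the function name `uc_loss`, the max-normalization of the uncertainty map, and the toy arrays are all assumptions.

```python
import numpy as np

def uc_loss(pred, target, uncertainty, eps=1e-8):
    """Sketch of an uncertainty-constraint loss: per-pixel squared error
    weighted by a normalized uncertainty map, so high-uncertainty regions
    (e.g. around metal artifacts) dominate the penalty."""
    w = uncertainty / (uncertainty.max() + eps)  # normalize weights to [0, 1]
    return float(np.mean(w * (pred - target) ** 2))

rng = np.random.default_rng(0)
pred = rng.normal(size=(8, 8))                    # toy restoration output
target = np.zeros((8, 8))                         # toy clean reference
flat = np.ones((8, 8))                            # uniform weights -> plain MSE
peaked = np.zeros((8, 8)); peaked[2:4, 2:4] = 1.0 # weight one "artifact" region

mse = uc_loss(pred, target, flat)      # reduces to ordinary MSE
focused = uc_loss(pred, target, peaked)  # penalizes only the peaked region
```

With a flat map the loss collapses to plain MSE; a peaked map concentrates the gradient on the flagged region, which is the plug-and-play behavior the abstract describes.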
21. Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition
- Author
-
Jiang, Yicong, Wang, Tianzi, Xie, Xurong, Liu, Juan, Sun, Wei, Yan, Nan, Chen, Hui, Wang, Lan, Liu, Xunying, and Tian, Feng
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Artificial Intelligence ,Computer Science - Sound - Abstract
Disordered speech recognition has profound implications for improving the quality of life for individuals afflicted with, for example, dysarthria. Dysarthric speech recognition encounters challenges including limited data, substantial dissimilarities between dysarthric and non-dysarthric speakers, and significant speaker variations stemming from the disorder. This paper introduces Perceiver-Prompt, a method for speaker adaptation that utilizes P-Tuning on the Whisper large-scale model. We first fine-tune Whisper using LoRA and then integrate a trainable Perceiver to generate fixed-length speaker prompts from variable-length inputs, to improve model recognition of Chinese dysarthric speech. Experimental results from our Chinese dysarthric speech dataset demonstrate consistent improvements in recognition performance with Perceiver-Prompt. A relative CER reduction of up to 13.04% is obtained over the fine-tuned Whisper., Comment: Accepted by Interspeech 2024
- Published
- 2024
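The core Perceiver trick the abstract relies on, mapping a variable-length sequence to a fixed-length prompt via cross-attention against learned latent queries, can be illustrated with a single numpy attention step. A minimal sketch under stated assumptions: `perceiver_pool` is a hypothetical name, the latents stand in for learned parameters, and the real model uses multi-head attention inside Whisper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_pool(features, latents):
    """One cross-attention step: a fixed set of latent queries attends over
    a variable-length feature sequence and returns a fixed-length 'prompt'
    regardless of the input length."""
    d = latents.shape[-1]
    attn = softmax(latents @ features.T / np.sqrt(d), axis=-1)  # (m, n)
    return attn @ features                                       # (m, d)

rng = np.random.default_rng(1)
latents = rng.normal(size=(4, 16))    # 4 latent queries, feature dim 16
short = rng.normal(size=(10, 16))     # a 10-frame utterance
long = rng.normal(size=(500, 16))     # a 500-frame utterance
p1 = perceiver_pool(short, latents)
p2 = perceiver_pool(long, latents)    # same (4, 16) prompt shape either way
```

Both utterances yield a (4, 16) prompt, which is what makes the prompt length independent of speaker input length.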
22. Recy-ctronics: Designing Fully Recyclable Electronics With Varied Form Factors
- Author
-
Cheng, Tingyu, Zhang, Zhihan, Huang, Han, Gao, Yingting, Sun, Wei, Abowd, Gregory D., Oh, HyunJoo, and Hester, Josiah
- Subjects
Computer Science - Human-Computer Interaction - Abstract
For today's electronics manufacturing process, the emphasis on stable functionality, durability, and fixed physical forms is designed to ensure long-term usability. However, this focus on robustness and permanence complicates the disassembly and recycling processes, leading to significant environmental repercussions. In this paper, we present three approaches that leverage easily recyclable materials-specifically, polyvinyl alcohol (PVA) and liquid metal (LM)-alongside accessible manufacturing techniques to produce electronic components and systems with versatile form factors. Our work centers on the development of recyclable electronics through three methods: 1) creating sheet electronics by screen printing LM traces on PVA substrates; 2) developing foam-based electronics by immersing mechanically stirred PVA foam into an LM solution; and 3) fabricating recyclable electronic tubes by injecting LM into mold cast PVA tubes, which can then be woven into various structures. To further assess the sustainability of our proposed methods, we conducted a life cycle assessment (LCA) to evaluate the environmental impact of our recyclable electronics in comparison to their conventional counterparts.
- Published
- 2024
23. GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
- Author
-
Chen, Zijian, Sun, Wei, Tian, Yuan, Jia, Jun, Zhang, Zicheng, Wang, Jiarui, Huang, Ru, Min, Xiongkuo, Zhai, Guangtao, and Zhang, Wenjun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Assessing action quality is both imperative and challenging due to its significant impact on the quality of AI-generated videos, further complicated by the inherently ambiguous nature of actions within AI-generated video (AIGV). Current action quality assessment (AQA) algorithms predominantly focus on actions from real specific scenarios and are pre-trained with normative action features, thus rendering them inapplicable in AIGVs. To address these problems, we construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective, resulting in 971,244 ratings among 9,180 video-action pairs. Based on GAIA, we evaluate a suite of popular text-to-video (T2V) models on their ability to generate visually rational actions, revealing their pros and cons on different categories of actions. We also extend GAIA as a testbed to benchmark the AQA capacity of existing automatic evaluation methods. Results show that traditional AQA methods, action-related metrics in recent T2V benchmarks, and mainstream video quality methods correlate poorly with human opinions, indicating a sizable gap between current models and human action perception patterns in AIGVs. Our findings underscore the significance of action quality as a unique perspective for studying AIGVs and can catalyze progress towards methods with enhanced capacities for AQA in AIGVs., Comment: 28 pages, 13 figures
- Published
- 2024
24. A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
- Author
-
Zhang, Zicheng, Wu, Haoning, Li, Chunyi, Zhou, Yingjie, Sun, Wei, Min, Xiongkuo, Chen, Zijian, Liu, Xiaohong, Lin, Weisi, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
How to accurately and efficiently assess AI-generated images (AIGIs) remains a critical challenge for generative models. Given the high costs and extensive time commitments required for user studies, many researchers have turned towards employing large multi-modal models (LMMs) as AIGI evaluators, the precision and validity of which are still questionable. Furthermore, traditional benchmarks often utilize mostly natural-captured content rather than AIGIs to test the abilities of LMMs, leading to a noticeable gap for AIGIs. Therefore, we introduce A-Bench in this paper, a benchmark designed to diagnose whether LMMs are masters at evaluating AIGIs. Specifically, A-Bench is organized under two key principles: 1) Emphasizing both high-level semantic understanding and low-level visual quality perception to address the intricate demands of AIGIs. 2) Various generative models are utilized for AIGI creation, and various LMMs are employed for evaluation, which ensures a comprehensive validation scope. Ultimately, 2,864 AIGIs from 16 text-to-image models are sampled, each paired with question-answers annotated by human experts, and tested across 18 leading LMMs. We hope that A-Bench will significantly enhance the evaluation process and promote the generation quality for AIGIs. The benchmark is available at https://github.com/Q-Future/A-Bench.
- Published
- 2024
25. Quantum Computing in Intelligent Transportation Systems: A Survey
- Author
-
Zhuang, Yifan, Azfar, Talha, Wang, Yinhai, Sun, Wei, Wang, Xiaokun Cara, Guo, Qianwen Vivian, and Ke, Ruimin
- Subjects
Quantum Physics ,Computer Science - Distributed, Parallel, and Cluster Computing - Abstract
Quantum computing, a field utilizing the principles of quantum mechanics, promises great advancements across various industries. This survey paper is focused on the burgeoning intersection of quantum computing and intelligent transportation systems, exploring its potential to transform areas such as traffic optimization, logistics, routing, and autonomous vehicles. By examining current research efforts, challenges, and future directions, this survey aims to provide a comprehensive overview of how quantum computing could affect the future of transportation.
- Published
- 2024
26. Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian
- Author
-
Sun, Wei, Zhang, Qi, Zhou, Yanzhao, Ye, Qixiang, Jiao, Jianbin, and Li, Yuan
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis. However, achieving successful reconstruction from RGB images generally requires multiple input views captured under static conditions. To address the challenge of sparse input views, previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting, using dense predictions from pretrained depth networks as pseudo-ground truth. Nevertheless, depth predictions from monocular depth estimation models inherently exhibit significant uncertainty in specific areas. Relying solely on pixel-wise L2 loss may inadvertently incorporate detrimental noise from these uncertain areas. In this work, we introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates. To address these localized errors in depth predictions, we integrate a patch-wise optimal transport strategy to complement traditional L2 loss in depth supervision. Extensive experiments conducted on the LLFF, DTU, and Blender datasets demonstrate that our approach, UGOT, achieves superior novel view synthesis and consistently outperforms state-of-the-art methods., Comment: 10 pages
- Published
- 2024
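Why a patch-wise optimal transport term tolerates localized depth errors that a pixel-wise L2 loss punishes can be shown with a toy example: in 1-D, the squared-cost OT distance between two empirical distributions is just the mean squared difference of their sorted samples. This is an illustrative sketch of the general principle, not the authors' UGOT objective; `patch_ot_loss` and the 8×8 toy depth maps are assumptions.

```python
import numpy as np

def patch_ot_loss(depth_pred, depth_prior, patch=4):
    """Toy patch-wise OT: per patch, the squared-cost 1-D optimal transport
    between the two empirical depth distributions equals the mean squared
    difference of their sorted values."""
    h, w = depth_pred.shape
    losses = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            a = np.sort(depth_pred[i:i + patch, j:j + patch].ravel())
            b = np.sort(depth_prior[i:i + patch, j:j + patch].ravel())
            losses.append(np.mean((a - b) ** 2))
    return float(np.mean(losses))

rng = np.random.default_rng(2)
prior = rng.uniform(1.0, 5.0, size=(8, 8))   # pseudo depth prior
noisy = prior.copy()                         # same per-patch depth values,
for i in range(0, 8, 4):                     # but pixel positions scrambled
    for j in range(0, 8, 4):
        block = noisy[i:i + 4, j:j + 4].ravel()  # ravel() copies this view
        rng.shuffle(block)
        noisy[i:i + 4, j:j + 4] = block.reshape(4, 4)

ot = patch_ot_loss(noisy, prior)             # distributions match: zero
l2 = float(np.mean((noisy - prior) ** 2))    # pixel-wise mismatch: positive
```

The scrambled prediction keeps each patch's depth distribution intact, so the OT term is zero while the pixel-wise L2 loss stays large, exactly the kind of localized error the distribution-level supervision is meant to forgive.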
27. Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
- Author
-
Chao, Chen-Hao, Feng, Chien, Sun, Wei-Fang, Lee, Cheng-Kuang, See, Simon, and Lee, Chun-Yi
- Subjects
Computer Science - Machine Learning - Abstract
Existing Maximum-Entropy (MaxEnt) Reinforcement Learning (RL) methods for continuous action spaces are typically formulated based on actor-critic frameworks and optimized through alternating steps of policy evaluation and policy improvement. In the policy evaluation steps, the critic is updated to capture the soft Q-function. In the policy improvement steps, the actor is adjusted in accordance with the updated soft Q-function. In this paper, we introduce a new MaxEnt RL framework modeled using Energy-Based Normalizing Flows (EBFlow). This framework integrates the policy evaluation steps and the policy improvement steps, resulting in a single objective training process. Our method enables the calculation of the soft value function used in the policy evaluation target without Monte Carlo approximation. Moreover, this design supports the modeling of multi-modal action distributions while facilitating efficient action sampling. To evaluate the performance of our method, we conducted experiments on the MuJoCo benchmark suite and a number of high-dimensional robotic tasks simulated by Omniverse Isaac Gym. The evaluation results demonstrate that our method achieves superior performance compared to widely-adopted representative baselines.
- Published
- 2024
28. WiDRa -- Enabling Millimeter-Level Differential Ranging Accuracy in Wi-Fi Using Carrier Phase
- Author
-
Ratnam, Vishnu V., Sadiq, Bilal, Chen, Hao, Sun, Wei, Wu, Shunyao, Ng, Boon L., and Zhang, Jianzhong
- Subjects
Computer Science - Information Theory - Abstract
Although Wi-Fi is an ideal technology for many ranging applications, the performance of current methods is limited by the system bandwidth, leading to low accuracy of ~1 m. For many applications, measuring differential range, viz., the change in the range between adjacent measurements, is sufficient. Correspondingly, this work proposes WiDRa - a Wi-Fi based Differential Ranging solution that provides differential range estimates by using the sum-carrier-phase information. The proposed method is not limited by system bandwidth and can track range changes even smaller than the carrier wavelength. The proposed method is first theoretically justified, while taking into consideration the various hardware impairments affecting Wi-Fi chips. In the process, methods to isolate the sum-carrier phase from the hardware impairments are proposed. Extensive simulation results show that WiDRa can achieve a differential range estimation root-mean-square-error (RMSE) of ≈1 mm in channels with a Rician factor ≥ 7 (a 100× improvement over existing methods). The proposed methods are also validated on off-the-shelf Wi-Fi hardware to demonstrate feasibility, where they achieve an RMSE of <1 mm in the differential range. Finally, limitations of the current investigation and future directions of exploration are suggested, to further tap into the potential of WiDRa., Comment: Accepted to IEEE JSAC special issue on Positioning and Sensing Over Wireless Networks, 2024
- Published
- 2024
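The reason carrier phase gives sub-wavelength differential accuracy is the relation Δd = λ·Δφ/(2π): a phase increment, wrapped to (-π, π], maps to a fraction of the carrier wavelength. The sketch below illustrates only this core relation, not the paper's sum-carrier-phase construction or impairment isolation; the 5.8 GHz carrier frequency and the function name are assumptions.

```python
import numpy as np

C = 3e8          # speed of light, m/s
FC = 5.8e9       # assumed Wi-Fi carrier frequency, Hz
LAM = C / FC     # carrier wavelength, ~5.2 cm

def differential_range(phases):
    """Convert a sequence of carrier-phase measurements (radians) into
    differential range: each phase increment, wrapped to (-pi, pi],
    maps to a sub-wavelength change in path length."""
    dphi = np.angle(np.exp(1j * np.diff(phases)))  # wrap the increments
    return dphi / (2 * np.pi) * LAM                # metres per step

# Simulate a target receding by 1 mm between consecutive measurements.
true_steps = np.full(50, 1e-3)
phases = 2 * np.pi * np.cumsum(np.concatenate([[0.0], true_steps])) / LAM
est = differential_range(phases)                   # recovers ~1 mm per step
```

A 1 mm step is only ~2% of the ~5 cm wavelength, so it is invisible to bandwidth-limited time-of-flight methods but shows up directly as a ~0.12 rad phase increment.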
29. A GAN-Based Data Poisoning Attack Against Federated Learning Systems and Its Countermeasure
- Author
-
Sun, Wei, Gao, Bo, Xiong, Ke, and Wang, Yuwei
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Networking and Internet Architecture - Abstract
As a distributed machine learning paradigm, federated learning (FL) is collaboratively carried out on privately owned datasets but without direct data access. Although the original intention is to allay data privacy concerns, "available but not visible" data in FL potentially brings new security threats, particularly poisoning attacks that target such "not visible" local data. Initial attempts have been made to conduct data poisoning attacks against FL systems, but they cannot be fully successful due to their high chance of causing statistical anomalies. To unleash the potential for truly "invisible" attacks and build a more deterrent threat model, in this paper, a new data poisoning attack model named VagueGAN is proposed, which can generate seemingly legitimate but noisy poisoned data by untraditionally taking advantage of generative adversarial network (GAN) variants. Capable of manipulating the quality of poisoned data on demand, VagueGAN enables a trade-off between attack effectiveness and stealthiness. Furthermore, a cost-effective countermeasure named Model Consistency-Based Defense (MCD) is proposed to identify GAN-poisoned data or models by examining the consistency of GAN outputs. Extensive experiments on multiple datasets indicate that our attack method is generally much more stealthy as well as more effective in degrading FL performance with low complexity. Our defense method is also shown to be more competent in identifying GAN-poisoned data or models. The source codes are publicly available at https://github.com/SSssWEIssSS/VagueGAN-Data-Poisoning-Attack-and-Its-Countermeasure., Comment: 18 pages, 16 figures
- Published
- 2024
30. Enhancing Blind Video Quality Assessment with Rich Quality-aware Features
- Author
-
Sun, Wei, Wu, Haoning, Zhang, Zicheng, Jia, Jun, Zhang, Zhichao, Cao, Linhan, Chen, Qiubo, Min, Xiongkuo, Lin, Weisi, and Zhai, Guangtao
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Multimedia - Abstract
In this paper, we present a simple but effective method to enhance blind video quality assessment (BVQA) models for social media videos. Motivated by previous research that leverages pre-trained features extracted from various computer vision models as the feature representation for BVQA, we further explore rich quality-aware features from pre-trained blind image quality assessment (BIQA) and BVQA models as auxiliary features to help the BVQA model handle complex distortions and diverse content of social media videos. Specifically, we use SimpleVQA, a BVQA model that consists of a trainable Swin Transformer-B and a fixed SlowFast, as our base model. The Swin Transformer-B and SlowFast components are responsible for extracting spatial and motion features, respectively. Then, we extract three kinds of features from Q-Align, LIQE, and FAST-VQA to capture frame-level quality-aware features, frame-level quality-aware along with scene-specific features, and spatiotemporal quality-aware features, respectively. Through concatenating these features, we employ a multi-layer perceptron (MLP) network to regress them into quality scores. Experimental results demonstrate that the proposed model achieves the best performance on three public social media VQA datasets. Moreover, the proposed model won first place in the CVPR NTIRE 2024 Short-form UGC Video Quality Assessment Challenge. The code is available at https://github.com/sunwei925/RQ-VQA.git.
- Published
- 2024
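The concatenate-then-regress head described above is a standard pattern and can be sketched in a few lines. A hedged toy: the feature dimensions, random weights, and the name `mlp_regress` are assumptions, and the real model trains these weights jointly with the backbones rather than fixing them.

```python
import numpy as np

rng = np.random.default_rng(3)

def mlp_regress(x, w1, b1, w2, b2):
    """Two-layer MLP regression head (sketch): ReLU hidden layer,
    scalar quality-score output."""
    h = np.maximum(0.0, x @ w1 + b1)
    return h @ w2 + b2

# Hypothetical per-video features from three pretrained quality models.
f_qalign = rng.normal(size=64)
f_liqe = rng.normal(size=32)
f_fastvqa = rng.normal(size=16)
x = np.concatenate([f_qalign, f_liqe, f_fastvqa])   # fused (112,) vector

w1, b1 = rng.normal(size=(112, 8)) * 0.1, np.zeros(8)
w2, b2 = rng.normal(size=8) * 0.1, 0.0
score = float(mlp_regress(x, w1, b1, w2, b2))       # scalar quality score
```

The design point is that the MLP only has to fuse already quality-aware representations, so a shallow head suffices.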
31. Dual-Branch Network for Portrait Image Quality Assessment
- Author
-
Sun, Wei, Zhang, Weixia, Jiang, Yanwei, Wu, Haoning, Zhang, Zicheng, Jia, Jun, Zhou, Yingjie, Ji, Zhongpeng, Min, Xiongkuo, Lin, Weisi, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Multimedia - Abstract
Portrait images typically consist of a salient person against diverse backgrounds. With the development of mobile devices and image processing techniques, users can conveniently capture portrait images anytime and anywhere. However, the quality of these portraits may suffer from the degradation caused by unfavorable environmental conditions, subpar photography techniques, and inferior capturing devices. In this paper, we introduce a dual-branch network for portrait image quality assessment (PIQA), which can effectively address how the salient person and the background of a portrait image influence its visual quality. Specifically, we utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it. To enhance the quality-aware feature representation of the backbones, we pre-train them on the large-scale video quality assessment dataset LSVQ and the large-scale facial image quality assessment dataset GFIQA. Additionally, we leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features. Finally, we concatenate these features and regress them into quality scores via a multi-layer perceptron (MLP). We employ the fidelity loss to train the model in a learning-to-rank manner to mitigate inconsistencies in quality scores in the portrait image quality assessment dataset PIQ. Experimental results demonstrate that the proposed model achieves superior performance on the PIQ dataset, validating its effectiveness. The code is available at https://github.com/sunwei925/DN-PIQA.git.
- Published
- 2024
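The fidelity loss mentioned above is a standard pairwise learning-to-rank objective: under a Thurstone-style model the probability that image i is preferred over image j is Phi((s_i - s_j)/sqrt(2)), and fidelity measures how far that prediction is from the label. A minimal scalar sketch, assuming unit-variance scores; the real training operates on batches of score pairs with gradients.

```python
import math

def fidelity_loss(s_i, s_j, p):
    """Pairwise fidelity loss (sketch). Predicted preference probability is
    Phi((s_i - s_j)/sqrt(2)) = 0.5 * (1 + erf((s_i - s_j)/2)); the loss is
    zero only when the prediction matches the label p exactly."""
    p_hat = 0.5 * (1.0 + math.erf((s_i - s_j) / 2.0))
    return 1.0 - math.sqrt(p * p_hat) - math.sqrt((1.0 - p) * (1.0 - p_hat))

good = fidelity_loss(2.0, 0.0, 1.0)   # model agrees with label: small loss
bad = fidelity_loss(0.0, 2.0, 1.0)    # model disagrees: larger loss
tie = fidelity_loss(1.0, 1.0, 0.5)    # equal scores, p = 0.5: zero loss
```

Because only score differences enter the loss, training is invariant to a global score offset, which is what lets it absorb inter-session rating inconsistencies.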
32. Deep Learning-Based Object Pose Estimation: A Comprehensive Survey
- Author
-
Liu, Jian, Sun, Wei, Yang, Hui, Zeng, Zhiwen, Liu, Chongpei, Zheng, Jin, Liu, Xingyu, Rahmani, Hossein, Sebe, Nicu, and Mian, Ajmal
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Object pose estimation is a fundamental computer vision problem with broad applications in augmented reality and robotics. Over the past decade, deep learning models, due to their superior accuracy and robustness, have increasingly supplanted conventional algorithms reliant on engineered point pair features. Nevertheless, several challenges persist in contemporary methods, including their dependency on labeled training data, model compactness, robustness under challenging conditions, and their ability to generalize to novel unseen objects. A recent survey discussing the progress made on different aspects of this area, outstanding challenges, and promising future directions is missing. To fill this gap, we discuss the recent advances in deep learning-based object pose estimation, covering all three formulations of the problem, i.e., instance-level, category-level, and unseen object pose estimation. Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks, providing the readers with a holistic understanding of this field. Additionally, it discusses training paradigms of different domains, inference modes, application areas, evaluation metrics, and benchmark datasets, as well as reports the performance of current state-of-the-art methods on these benchmarks, thereby facilitating the readers in selecting the most suitable method for their application. Finally, the survey identifies key challenges, reviews the prevailing trends along with their pros and cons, and identifies promising directions for future research. We also keep tracing the latest works at https://github.com/CNJianLiu/Awesome-Object-Pose-Estimation., Comment: 27 pages, 7 figures
- Published
- 2024
33. DMON: A Simple yet Effective Approach for Argument Structure Learning
- Author
-
Sun, Wei, Li, Mingxiao, Sun, Jingyuan, Davis, Jesse, and Moens, Marie-Francine
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Argument structure learning (ASL) entails predicting relations between arguments. Because it can structure a document to facilitate its understanding, it has been widely applied in many fields (medical, commercial, and scientific domains). Despite its broad utilization, ASL remains a challenging task because it involves examining the complex relationships between the sentences in a potentially unstructured discourse. To resolve this problem, we have developed a simple yet effective approach called Dual-tower Multi-scale cOnvolution neural Network (DMON) for the ASL task. Specifically, we organize arguments into a relationship matrix that together with the argument embeddings forms a relationship tensor and design a mechanism to capture relations with contextual arguments. Experimental results on three different-domain argument mining datasets demonstrate that our framework outperforms state-of-the-art models. The code is available at https://github.com/VRCMF/DMON.git., Comment: COLING 2024
- Published
- 2024
34. LMM-PCQA: Assisting Point Cloud Quality Assessment with LMM
- Author
-
Zhang, Zicheng, Wu, Haoning, Zhou, Yingjie, Li, Chunyi, Sun, Wei, Chen, Chaofeng, Min, Xiongkuo, Liu, Xiaohong, Lin, Weisi, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Although large multi-modality models (LMMs) have seen extensive exploration and application in various quality assessment studies, their integration into Point Cloud Quality Assessment (PCQA) remains unexplored. Given LMMs' exceptional performance and robustness in low-level vision and quality assessment tasks, this study aims to investigate the feasibility of imparting PCQA knowledge to LMMs through text supervision. To achieve this, we transform quality labels into textual descriptions during the fine-tuning phase, enabling LMMs to derive quality rating logits from 2D projections of point clouds. To compensate for the loss of perception in the 3D domain, structural features are extracted as well. These quality logits and structural features are then combined and regressed into quality scores. Our experimental results affirm the effectiveness of our approach, showcasing a novel integration of LMMs into PCQA that enhances model understanding and assessment accuracy. We hope our contributions can inspire subsequent investigations into the fusion of LMMs with PCQA, fostering advancements in 3D visual quality analysis and beyond. The code is available at https://github.com/zzc-1998/LMM-PCQA.
- Published
- 2024
35. Large Multi-modality Model Assisted AI-Generated Image Quality Assessment
- Author
-
Wang, Puyi, Sun, Wei, Zhang, Zicheng, Jia, Jun, Jiang, Yanwei, Zhang, Zhichao, Min, Xiongkuo, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Traditional deep neural network (DNN)-based image quality assessment (IQA) models leverage convolutional neural networks (CNN) or Transformers to learn the quality-aware feature representation, achieving commendable performance on natural scene images. However, when applied to AI-Generated images (AGIs), these DNN-based IQA models exhibit subpar performance. This situation is largely due to the semantic inaccuracies inherent in certain AGIs caused by the uncontrollable nature of the generation process. Thus, the capability to discern semantic content becomes crucial for assessing the quality of AGIs. Traditional DNN-based IQA models, constrained by limited parameter complexity and training data, struggle to capture complex fine-grained semantic features, making it challenging to grasp the existence and coherence of semantic content of the entire image. To address the shortfall in semantic content perception of current IQA models, we introduce a large Multi-modality model Assisted AI-Generated Image Quality Assessment (MA-AGIQA) model, which utilizes semantically informed guidance to sense semantic information and extract semantic vectors through carefully designed text prompts. Moreover, it employs a mixture of experts (MoE) structure to dynamically integrate the semantic information with the quality-aware features extracted by traditional DNN-based IQA models. Comprehensive experiments conducted on two AI-generated content datasets, AIGCQA-20k and AGIQA-3k, show that MA-AGIQA achieves state-of-the-art performance and demonstrates its superior generalization capabilities in assessing the quality of AGIs. Code is available at https://github.com/wangpuyi/MA-AGIQA., Comment: ACM MM'24
- Published
- 2024
- Full Text
- View/download PDF
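The dynamic integration step the MA-AGIQA abstract attributes to its mixture-of-experts structure can be illustrated with a tiny gating sketch. This is a generic MoE gate under stated assumptions, not the paper's architecture: `moe_fuse`, the linear gate, and the matching 8-dimensional feature vectors are all illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_fuse(semantic, quality, gate_w):
    """Tiny MoE-style fusion (sketch): a linear gating network sees both
    feature vectors and outputs mixing weights for the two 'experts'
    (semantic features vs. quality-aware features)."""
    gates = softmax(np.concatenate([semantic, quality]) @ gate_w)  # (2,)
    return gates[0] * semantic + gates[1] * quality, gates

rng = np.random.default_rng(4)
semantic = rng.normal(size=8)          # hypothetical LMM semantic vector
quality = rng.normal(size=8)           # hypothetical DNN quality features
gate_w = rng.normal(size=(16, 2)) * 0.1
fused, gates = moe_fuse(semantic, quality, gate_w)
```

Because the gate is input-dependent, semantically broken images can lean on the semantic expert while ordinary distortions lean on the quality expert.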
36. NTIRE 2024 Quality Assessment of AI-Generated Content Challenge
- Author
-
Liu, Xiaohong, Min, Xiongkuo, Zhai, Guangtao, Li, Chunyi, Kou, Tengchuan, Sun, Wei, Wu, Haoning, Gao, Yixuan, Cao, Yuqin, Zhang, Zicheng, Wu, Xiele, Timofte, Radu, Peng, Fei, Fu, Huiyuan, Ming, Anlong, Wang, Chuanming, Ma, Huadong, He, Shuai, Dou, Zifei, Chen, Shu, Zhang, Huacong, Xie, Haiyi, Wang, Chengwei, Chen, Baoying, Zeng, Jishen, Yang, Jianquan, Wang, Weigang, Fang, Xi, Lv, Xiaoxin, Yan, Jun, Zhi, Tianwu, Zhang, Yabin, Li, Yaohui, Li, Yang, Xu, Jingwen, Liu, Jianzhao, Liao, Yiting, Li, Junlin, Yu, Zihao, Lu, Yiting, Li, Xin, Motamednia, Hossein, Hosseini-Benvidi, S. Farhad, Guan, Fengbin, Mahmoudi-Aznaveh, Ahmad, Mansouri, Azadeh, Gankhuyag, Ganzorig, Yoon, Kihwan, Xu, Yifang, Fan, Haotian, Kong, Fangyuan, Zhao, Shiling, Dong, Weifeng, Yin, Haibing, Zhu, Li, Wang, Zhiling, Huang, Bingchen, Saha, Avinab, Mishra, Sandeep, Gupta, Shashank, Sureddi, Rajesh, Saha, Oindrila, Celona, Luigi, Bianco, Simone, Napoletano, Paolo, Schettini, Raimondo, Yang, Junfeng, Fu, Jing, Zhang, Wei, Cao, Wenzhi, Liu, Limei, Peng, Han, Yuan, Weijun, Li, Zhan, Cheng, Yihang, Deng, Yifan, Li, Haohui, Qu, Bowen, Li, Yao, Luo, Shuqing, Wang, Shunzhou, Gao, Wei, Lu, Zihao, Conde, Marcos V., Wang, Xinrui, Chen, Zhibo, Liao, Ruling, Ye, Yan, Wang, Qiulin, Li, Bing, Zhou, Zhaokun, Geng, Miao, Chen, Rui, Tao, Xin, Liang, Xiaoyu, Sun, Shangkun, Ma, Xingyuan, Li, Jiaze, Yang, Mengduo, Xu, Haoran, Zhou, Jie, Zhu, Shiding, Yu, Bohan, Chen, Pengfei, Xu, Xinrui, Shen, Jiabin, Duan, Zhichao, Asadi, Erfan, Liu, Jiahe, Yan, Qi, Qu, Youran, Zeng, Xiaohui, Wang, Lele, and Liao, Renjie
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. The challenge addresses a major problem in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Content (AIGC). The challenge is divided into the image track and the video track. The image track uses the AIGIQA-20K dataset, which contains 20,000 AI-Generated Images (AIGIs) generated by 15 popular generative models. The image track has a total of 318 registered participants. A total of 1,646 submissions were received in the development phase, and 221 submissions were received in the test phase. Finally, 16 participating teams submitted their models and fact sheets. The video track uses the T2VQA-DB dataset, which contains 10,000 AI-Generated Videos (AIGVs) generated by 9 popular Text-to-Video (T2V) models. A total of 196 participants registered in the video track. A total of 991 submissions were received in the development phase, and 185 submissions were received in the test phase. Finally, 12 participating teams submitted their models and fact sheets. Some methods achieved better results than the baseline methods, and the winning methods in both tracks demonstrated superior prediction performance on AIGC.
- Published
- 2024
37. AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results
- Author
-
Conde, Marcos V., Zadtootaghaj, Saman, Barman, Nabajeet, Timofte, Radu, He, Chenlong, Zheng, Qi, Zhu, Ruoxi, Tu, Zhengzhong, Wang, Haiqiang, Chen, Xiangguang, Meng, Wenhui, Pan, Xiang, Shi, Huiying, Zhu, Han, Xu, Xiaozhong, Sun, Lei, Chen, Zhenzhong, Liu, Shan, Zhang, Zicheng, Wu, Haoning, Zhou, Yingjie, Li, Chunyi, Liu, Xiaohong, Lin, Weisi, Zhai, Guangtao, Sun, Wei, Cao, Yuqin, Jiang, Yanwei, Jia, Jun, Zhang, Zhichao, Chen, Zijian, Zhang, Weixia, Min, Xiongkuo, Göring, Steve, Qi, Zihao, and Feng, Chen
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Multimedia - Abstract
This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge, focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based methods capable of estimating the perceptual quality of UGC videos. The user-generated videos from the YouTube UGC Dataset include diverse content (sports, games, lyrics, anime, etc.), qualities, and resolutions. The proposed methods must process 30 FHD frames in under 1 second. In the challenge, a total of 102 participants registered, and 15 submitted code and models. The performance of the top-5 submissions is reviewed and provided here as a survey of diverse deep models for efficient video quality assessment of user-generated content., Comment: CVPR 2024 Workshop -- AI for Streaming (AIS) Video Quality Assessment Challenge
- Published
- 2024
38. NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results
- Author
-
Li, Xin, Yuan, Kun, Pei, Yajing, Lu, Yiting, Sun, Ming, Zhou, Chao, Chen, Zhibo, Timofte, Radu, Sun, Wei, Wu, Haoning, Zhang, Zicheng, Jia, Jun, Zhang, Zhichao, Cao, Linhan, Chen, Qiubo, Min, Xiongkuo, Lin, Weisi, Zhai, Guangtao, Sun, Jianhui, Wang, Tianyi, Li, Lei, Kong, Han, Wang, Wenxuan, Li, Bing, Luo, Cheng, Wang, Haiqiang, Chen, Xiangguang, Meng, Wenhui, Pan, Xiang, Shi, Huiying, Zhu, Han, Xu, Xiaozhong, Sun, Lei, Chen, Zhenzhong, Liu, Shan, Kong, Fangyuan, Fan, Haotian, Xu, Yifang, Xu, Haoran, Yang, Mengduo, Zhou, Jie, Li, Jiaze, Wen, Shijie, Xu, Mai, Li, Da, Yao, Shunyu, Du, Jiazhi, Zuo, Wangmeng, Li, Zhibo, He, Shuai, Ming, Anlong, Fu, Huiyuan, Ma, Huadong, Wu, Yong, Xue, Fie, Zhao, Guozhi, Du, Lina, Guo, Jie, Zhang, Yu, Zheng, Huimin, Chen, Junhao, Liu, Yue, Zhou, Dulan, Xu, Kele, Xu, Qisheng, Sun, Tao, Ding, Zhixiang, and Hu, Yuhang
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence
- Abstract
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), in which various solutions were submitted and evaluated on KVQ, a dataset collected from the popular short-form video platform Kuaishou/Kwai. The KVQ database is divided into three parts: 2926 videos for training, 420 videos for validation, and 854 videos for testing. The purpose is to build new benchmarks and advance the development of S-UGC VQA. The competition attracted 200 participants, and 13 teams submitted valid solutions for the final testing phase. The proposed solutions achieved state-of-the-art performance for S-UGC VQA. The project can be found at https://github.com/lixinustc/KVQChallenge-CVPR-NTIRE2024., Comment: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge
- Published
- 2024
39. THQA: A Perceptual Quality Assessment Database for Talking Heads
- Author
-
Zhou, Yingjie, Zhang, Zicheng, Sun, Wei, Liu, Xiaohong, Min, Xiongkuo, Wang, Zhihua, Zhang, Xiao-Ping, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
In the realm of media technology, digital humans have gained prominence due to rapid advancements in computer technology. However, the manual modeling and control required for the majority of digital humans pose significant obstacles to efficient development. The speech-driven methods offer a novel avenue for manipulating the mouth shape and expressions of digital humans. Despite the proliferation of driving methods, the quality of many generated talking head (TH) videos remains a concern, impacting user visual experiences. To tackle this issue, this paper introduces the Talking Head Quality Assessment (THQA) database, featuring 800 TH videos generated through 8 diverse speech-driven methods. Extensive experiments affirm the THQA database's richness in character and speech features. Subsequent subjective quality assessment experiments analyze correlations between scoring results and speech-driven methods, ages, and genders. In addition, experimental results show that mainstream image and video quality assessment methods have limitations for the THQA database, underscoring the imperative for further research to enhance TH video quality assessment. The THQA database is publicly accessible at https://github.com/zyj-2000/THQA.
- Published
- 2024
40. Modulus representation of the Riemann $\xi$ function and polynomial inequalities equivalent to the Riemann hypothesis
- Author
-
Sun, Wei
- Subjects
Mathematics - Number Theory, 11M26
- Abstract
We use the Jacobi theta function to give a representation of the modulus of the Riemann $\xi$ function. Based on this modulus representation, we show that the Riemann hypothesis is equivalent to the validity of a family of polynomial inequalities.
- Published
- 2024
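A note on entry 40: the abstract builds on the classical Jacobi-theta integral representation of the xi function, xi(s) = 1/2 + s(s-1)/2 * Integral_1^inf psi(x) (x^(s/2-1) + x^((1-s)/2-1)) dx with psi(x) = sum_{n>=1} exp(-pi n^2 x). The sketch below numerically evaluates this textbook formula only; it is not the paper's new modulus representation, and the location of the first critical-line zero (t ~ 14.1347) is a standard reference value.

```python
# Classical theta-integral representation of the Riemann xi function
# (Riemann's formula; see e.g. Titchmarsh).  This is NOT the modulus
# representation derived in the paper above -- just the standard formula
# such work builds on, shown as a numerical sanity check.
#
#   xi(s) = 1/2 + s(s-1)/2 * Int_1^inf psi(x) (x^{s/2-1} + x^{(1-s)/2-1}) dx,
#   psi(x) = sum_{n>=1} exp(-pi n^2 x)    (Jacobi theta remainder)

import math

def psi(x: float) -> float:
    """Jacobi theta remainder sum_{n>=1} exp(-pi n^2 x); converges fast for x >= 1."""
    total, n = 0.0, 1
    while True:
        term = math.exp(-math.pi * n * n * x)
        if term < 1e-18:
            return total
        total += term
        n += 1

def xi(s: complex, upper: float = 40.0, steps: int = 40000) -> complex:
    """Evaluate xi(s) by trapezoidal integration of the theta representation."""
    h = (upper - 1.0) / steps
    integral = 0.0 + 0.0j
    for i in range(steps + 1):
        x = 1.0 + i * h
        w = 0.5 if i in (0, steps) else 1.0  # trapezoid endpoint weights
        kern = x ** (s / 2 - 1) + x ** ((1 - s) / 2 - 1)
        integral += w * psi(x) * kern
    integral *= h
    return 0.5 + s * (s - 1) / 2 * integral

# xi is real on the critical line s = 1/2 + it, and its modulus vanishes
# exactly at the nontrivial zeta zeros (the first lies near t ~ 14.1347).
print(abs(xi(0.5)))               # ~0.4971, the known value of xi(1/2)
print(abs(xi(0.5 + 14.134725j)))  # ~0 at the first nontrivial zero
```

Polynomial-inequality reformulations like the paper's typically start from such modulus information on the critical line; the sketch only illustrates the underlying theta-function machinery.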
41. Thermal X-ray Emission in the Western Half of the LMC Superbubble 30 Dor C
- Author
-
Chi, Yi-Heng, Chen, Han-Xiao, Chen, Yang, Meng, Yi-Fan, Zhou, Ping, Sun, Lei, and Sun, Wei
- Subjects
Astrophysics - High Energy Astrophysical Phenomena
- Abstract
While 30 Dor C is a unique superbubble in the Large Magellanic Cloud for its luminous non-thermal X-ray emission, the thermal X-ray emission it emanates has not yet been thoroughly investigated and well constrained. Based on the separate ~1 Ms deep XMM-Newton and Chandra observations, we report the discovery of the thermally-emitting plasma in some portions of the western half of 30 Dor C. The thermal emission can be reproduced by a collisional-ionization-equilibrium plasma model with an average electron temperature of ~0.4 keV. We find a significant overabundance of the intermediate-mass elements such as O, Ne, Mg, and Si, which may be indicative of a recent supernova explosion in 30 Dor C. Dynamical properties in combination with the information of the OB association LH 90 suggest that the internal post-main-sequence stars dominate the power of the superbubble and blow it out in the past ~1 Myr., Comment: 15 pages, 18 figures. Accepted by MNRAS
- Published
- 2024
42. Pressure Balance and Energy Budget of the Nuclear Superbubble of NGC 3079
- Author
-
Li, Jiang-Tao, Sun, Wei, Ji, Li, and Yang, Yang
- Subjects
Astrophysics - Astrophysics of Galaxies
- Abstract
Superbubbles in the nuclear regions of galaxies can be produced by the AGN or by a nuclear starburst via different driving forces. We report an analysis of multi-wavelength data of the kpc-scale nuclear superbubble in NGC 3079, in order to probe the mechanisms driving its expansion. Based on the Chandra X-ray observations, we derive the hot gas thermal pressure inside the bubble, which is about one order of magnitude higher than that of the warm ionized gas traced by optical lines. We derive a [C II]-based star formation rate of ${\rm SFR}\sim1.3\rm~M_\odot~{\rm yr}^{-1}$ from the nuclear region using the SOFIA/FIFI-LS observation. This SFR implies a radiation pressure on the bubble shells much lower than the thermal pressure of the gas. The VLA radio image indicates a magnetic pressure at the northeast cap above the superbubble that is lower than the thermal pressure of the hot gas enclosed in the bubble, but extends over a clearly larger region. The magnetic field may thus still help account for the expansion of the bubble. The observed thermal energy of the hot gas enclosed in the bubble requires an energy injection rate of $\gtrsim10^{42}\rm~ergs~s^{-1}$ within the bubble's dynamical age, which is probably larger than the power provided by the current nuclear starburst and the parsec-scale jet. If this is true, stronger past AGN activity may provide an alternative energy source to drive the observed bubble expansion., Comment: 12 pages, 3 figures, 1 table, ApJ in press
- Published
- 2024
43. AIGIQA-20K: A Large Database for AI-Generated Image Quality Assessment
- Author
-
Li, Chunyi, Kou, Tengchuan, Gao, Yixuan, Cao, Yuqin, Sun, Wei, Zhang, Zicheng, Zhou, Yingjie, Zhang, Zhichao, Zhang, Weixia, Wu, Haoning, Liu, Xiaohong, Min, Xiongkuo, and Zhai, Guangtao
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
With the rapid advancements in AI-Generated Content (AIGC), AI-Generated Images (AIGIs) have been widely applied in entertainment, education, and social media. However, due to the significant variance in quality among different AIGIs, there is an urgent need for models that consistently match human subjective ratings. To address this issue, we organized a challenge on AIGC quality assessment at NTIRE 2024 that considers 15 popular generative models with dynamic hyper-parameters (including classifier-free guidance, iteration epochs, and output image resolution), and gathered subjective scores from 21 subjects covering both perceptual quality and text-to-image alignment. This effort culminates in the largest fine-grained AIGI subjective quality database to date, with 20,000 AIGIs and 420,000 subjective ratings, known as AIGIQA-20K. Furthermore, we conduct benchmark experiments on this database to assess the correspondence between 16 mainstream AIGI quality models and human perception. We anticipate that this large-scale quality database will inspire robust quality indicators for AIGIs and propel the evolution of AIGC for vision. The database is released on https://www.modelscope.cn/datasets/lcysyzxdxc/AIGCQA-30K-Image.
- Published
- 2024
44. ATF3 is a neuron-specific biomarker for spinal cord injury and ischaemic stroke.
- Author
-
Wang, Zhanqiang, Sun, Wei, Pan, Peipei, Li, Wei, Sun, Yongtao, Chen, Shoulin, Lin, Amity, Tan, Wulin, He, Liangliang, Greene, Jacob, Yao, Virginia, An, Lijun, Liang, Rich, Li, Qifeng, Yu, Jessica, Zhang, Lingyi, Kyritsis, Nikolaos, Fernandez, Xuan, Moncivais, Sara, Mendoza, Esmeralda, Fung, Pamela, Wang, Gongming, Niu, Xinhuan, Du, Qihang, Xiao, Zhaoyang, Chang, Yuwen, Lv, Peiyuan, Huie, J, Torres-Espin, Abel, Ferguson, Adam, Hemmerle, Debra, Talbott, Jason, Weinstein, Philip, Pascual, Lisa, Singh, Vineeta, DiGiorgio, Anthony, Saigal, Rajiv, Manley, Geoffrey, Dhall, Sanjay, Bresnahan, Jacqueline, Jiang, Xiangning, Singhal, Neel, Beattie, Michael, Su, Hua, Maze, Mervyn, Guan, Zhonghui, Pan, Jonathan, and Whetstone, William
- Subjects
activating transcription factor 3 (ATF3), biomarker, neuronal injury, neuroprotection, spinal cord injury, stroke, Animals, Female, Humans, Male, Mice, Activating Transcription Factor 3, Biomarkers, Disease Models, Animal, Ischemic Stroke, Mice, Knockout, Neurons, Spinal Cord Injuries
- Abstract
BACKGROUND: Although many molecules have been investigated as biomarkers for spinal cord injury (SCI) or ischemic stroke, none of them are specifically induced in central nervous system (CNS) neurons following injuries with low baseline expression. However, neuronal injury constitutes a major pathology associated with SCI or stroke and strongly correlates with neurological outcomes. Biomarkers characterized by low baseline expression and specific induction in neurons post-injury are likely to better correlate with injury severity and recovery, demonstrating higher sensitivity and specificity for CNS injuries compared to non-neuronal markers or pan-neuronal markers with constitutive expressions.
METHODS: In animal studies, young adult wildtype and global Atf3 knockout mice underwent unilateral cervical 5 (C5) SCI or permanent distal middle cerebral artery occlusion (pMCAO). Gene expression was assessed using RNA-sequencing and qRT-PCR, while protein expression was detected through immunostaining. Serum ATF3 levels in animal models and clinical human samples were measured using commercially available enzyme-linked immune-sorbent assay (ELISA) kits.
RESULTS: Activating transcription factor 3 (ATF3), a molecular marker for injured dorsal root ganglion sensory neurons in the peripheral nervous system, was not expressed in spinal cord or cortex of naïve mice but was induced specifically in neurons of the spinal cord or cortex within 1 day after SCI or ischemic stroke, respectively. Additionally, ATF3 protein levels in mouse blood significantly increased 1 day after SCI or ischemic stroke. Importantly, ATF3 protein levels in human serum were elevated in clinical patients within 24 hours after SCI or ischemic stroke. Moreover, Atf3 knockout mice, compared to the wildtype mice, exhibited worse neurological outcomes and larger damage regions after SCI or ischemic stroke, indicating that ATF3 has a neuroprotective function.
CONCLUSIONS: ATF3 is an easily measurable, neuron-specific biomarker for clinical SCI and ischemic stroke, with neuroprotective properties.
HIGHLIGHTS: ATF3 was induced specifically in neurons of the spinal cord or cortex within 1 day after SCI or ischemic stroke, respectively. Serum ATF3 protein levels are elevated in clinical patients within 24 hours after SCI or ischemic stroke. ATF3 exhibits neuroprotective properties, as evidenced by the worse neurological outcomes and larger damage regions observed in Atf3 knockout mice compared to wildtype mice following SCI or ischemic stroke.
- Published
- 2024
45. DriveEnv-NeRF: Exploration of A NeRF-Based Autonomous Driving Environment for Real-World Performance Validation
- Author
-
Shen, Mu-Yi, Hsu, Chia-Chi, Hou, Hao-Yu, Huang, Yu-Chen, Sun, Wei-Fang, Chang, Chia-Che, Liu, Yu-Lun, and Lee, Chun-Yi
- Subjects
Computer Science - Robotics
- Abstract
In this study, we introduce the DriveEnv-NeRF framework, which leverages Neural Radiance Fields (NeRF) to enable the validation and faithful forecasting of the efficacy of autonomous driving agents in a targeted real-world scene. Standard simulator-based rendering often fails to accurately reflect real-world performance due to the sim-to-real gap, which represents the disparity between virtual simulations and real-world conditions. To mitigate this gap, we propose a workflow for building a high-fidelity simulation environment of the targeted real-world scene using NeRF. This approach is capable of rendering realistic images from novel viewpoints and constructing 3D meshes for emulating collisions. The validation of these capabilities through the comparison of success rates in both simulated and real environments demonstrates the benefits of using DriveEnv-NeRF as a real-world performance indicator. Furthermore, the DriveEnv-NeRF framework can serve as a training environment for autonomous driving agents under various lighting conditions. This approach enhances the robustness of the agents and reduces performance degradation when deployed to the target real scene, compared to agents fully trained using the standard simulator rendering pipeline., Comment: Project page: https://github.com/muyishen2040/DriveEnvNeRF
- Published
- 2024
46. Form factor for Dalitz decays from $J/\psi$ to light pseudoscalars
- Author
-
Shi, Chunjiang, Chen, Ying, Jiang, Xiangyu, Gong, Ming, Liu, Zhaofeng, and Sun, Wei
- Subjects
High Energy Physics - Lattice, High Energy Physics - Experiment, High Energy Physics - Phenomenology
- Abstract
We calculate the form factor $M(q^2)$ for the Dalitz decay $J/\psi\to \gamma^*(q^2)\eta_{(N_f=1)}$ with $\eta_{(N_f)}$ being the SU($N_f$) flavor singlet pseudoscalar meson. The difference among the partial widths $\Gamma(J/\psi\to \gamma \eta_{(N_f)})$ at different $N_f$ can be attributed in part to the $\mathbf{U}_A(1)$ anomaly that induces a $N_f$ scaling. $M(q^2)$'s in $N_f=1,2$ are both well described by the single pole model $M(q^2)=M(0)/(1-q^2/\Lambda^2)$. Combined with the known experimental results of the Dalitz decays $J/\psi\to Pe^+e^-$, the pseudoscalar mass $m_P$ dependence of the pole parameter $\Lambda$ is approximated by $\Lambda(m_P^2)=\Lambda_1(1-m_P^2/\Lambda_2^2)$ with $\Lambda_1=2.64(4)~\mathrm{GeV}$ and $\Lambda_2=2.97(33)~\mathrm{GeV}$. These results provide inputs for future theoretical and experimental studies on the Dalitz decays $J/\psi\to Pe^+e^-$., Comment: 9 pages, 5 figures
- Published
- 2024
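To make the parameterization quoted in entry 46 concrete, here is a minimal numerical sketch of the two pole formulas stated in the abstract, M(q^2) = M(0)/(1 - q^2/Lambda^2) and Lambda(m_P^2) = Lambda_1 (1 - m_P^2/Lambda_2^2) with Lambda_1 = 2.64 GeV and Lambda_2 = 2.97 GeV. The eta-meson mass used in the example (~0.548 GeV) is an illustrative input, not a value taken from the paper.

```python
# Single-pole form-factor model quoted in the abstract of entry 46.
# Only the two formulas and the fitted constants come from the abstract;
# the example meson mass below is an illustrative (PDG-like) input.

LAMBDA_1 = 2.64  # GeV, from the abstract
LAMBDA_2 = 2.97  # GeV, from the abstract

def pole_lambda(m_P: float) -> float:
    """Pole parameter Lambda (GeV) for a pseudoscalar of mass m_P (GeV):
    Lambda(m_P^2) = Lambda_1 * (1 - m_P^2 / Lambda_2^2)."""
    return LAMBDA_1 * (1.0 - m_P**2 / LAMBDA_2**2)

def form_factor_ratio(q2: float, m_P: float) -> float:
    """Normalized single-pole form factor M(q^2)/M(0) for q^2 in GeV^2:
    M(q^2)/M(0) = 1 / (1 - q^2 / Lambda^2)."""
    lam = pole_lambda(m_P)
    return 1.0 / (1.0 - q2 / lam**2)

# Example: an eta-like pseudoscalar (m_P ~ 0.548 GeV) at q^2 = 0.2 GeV^2
print(pole_lambda(0.548))            # ~2.55 GeV
print(form_factor_ratio(0.2, 0.548))  # ~1.03, a mild timelike enhancement
```

Such a form factor is the input needed to predict Dalitz-decay rates J/psi -> P e+ e- differentially in the lepton-pair invariant mass, which is how the paper frames its results.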
47. A resource-constrained stochastic scheduling algorithm for homeless street outreach and gleaning edible food
- Author
-
Artman, Conor M., Mate, Aditya, Nwankwo, Ezinne, Heching, Aliza, Idé, Tsuyoshi, Navrátil, Jiří, Shanmugam, Karthikeyan, Sun, Wei, Varshney, Kush R., Goldkind, Lauri, Kroch, Gidi, Sawyer, Jaclyn, and Watson, Ian
- Subjects
Computer Science - Machine Learning, Computer Science - Computers and Society, Statistics - Machine Learning
- Abstract
We developed a common algorithmic solution addressing the problem of resource-constrained outreach encountered by social change organizations with different missions and operations: Breaking Ground -- an organization that helps individuals experiencing homelessness in New York transition to permanent housing and Leket -- the national food bank of Israel that rescues food from farms and elsewhere to feed the hungry. Specifically, we developed an estimation and optimization approach for partially-observed episodic restless bandits under $k$-step transitions. The results show that our Thompson sampling with Markov chain recovery (via Stein variational gradient descent) algorithm significantly outperforms baselines for the problems of both organizations. We carried out this work in a prospective manner with the express goal of devising a flexible-enough but also useful-enough solution that can help overcome a lack of sustainable impact in data science for social good.
- Published
- 2024
48. Precision Spectroscopy and Nuclear Structure Parameters in 7Li+ ion
- Author
-
Guan, Hua, Qi, Xiao-Qiu, Zhou, Peng-Peng, Sun, Wei, Chen, Shao-Long, Chang, Xu-Rui, Huang, Yao, Zhang, Pei-Pei, Yan, Zong-Chao, Drake, G. W. F., Chen, Ai-Xi, Zhong, Zhen-Xiang, Shi, Ting-Yun, and Gao, Ke-Lin
- Subjects
Physics - Atomic Physics
- Abstract
The optical Ramsey technique is used to obtain precise measurements of the hyperfine splittings in the $2\,^3\!S_1$ and $2\,^3\!P_J$ states of $^7$Li$^+$. Together with bound-state quantum electrodynamic theory, the Zemach radius and quadrupole moment of the $^7$Li nucleus are determined to be $3.35(1)$~fm and $-3.86(5)$~fm$^2$ respectively, with the quadrupole moment deviating from the recommended value of $-4.00(3)$~fm$^2$ by $1.75\sigma$. Furthermore, we determine the quadrupole moment ratio of $^6$Li to $^7$Li as $0.101(13)$, exhibiting a $6\sigma$ deviation from the previous measured value of $0.020161(13)$ by LiF molecular spectroscopy. The results taken together provide a sensitive test of nuclear structure models.
- Published
- 2024
49. A chip-integrated comb-based microwave oscillator
- Author
-
Sun, Wei, Chen, Zhiyang, Li, Linze, Shen, Chen, Long, Jinbao, Zheng, Huamin, Yang, Luyu, Chen, Qiushi, Zhang, Zhouze, Shi, Baoqi, Li, Shichang, Gao, Lan, Luo, Yi-Han, Chen, Baile, and Liu, Junqiu
- Subjects
Physics - Optics, Physics - Applied Physics
- Abstract
Low-noise microwave oscillators are cornerstones for wireless communication, radar and clocks. Optical frequency combs have enabled photonic microwaves with unrivalled noise performance and bandwidth. Emerging interest is to generate microwaves using chip-based frequency combs, namely microcombs. Here, we demonstrate the first, fully integrated, microcomb-based, microwave oscillator chip. The chip, powered by a microelectronic circuit, leverages hybrid integration of a DFB laser, a nonlinear microresonator, and a high-speed photodetector. Each component represents the best of its own class, yet allows large-volume manufacturing with low cost in CMOS foundries. The hybrid chip outputs an ultralow-noise laser of 6.9 Hz linewidth, a microcomb of 10.7 GHz repetition rate, and a 10.7 GHz microwave of 6.3 mHz linewidth -- all three in one entity of 76 mm$^2$ size. The microwave phase noise reaches -75/-105/-130 dBc/Hz at 1/10/100 kHz Fourier offset frequency. Our results can reinvigorate our information society for communication, sensing, timing and precision measurement.
- Published
- 2024
50. Reinforcement Learning Based Robust Volt/Var Control in Active Distribution Networks With Imprecisely Known Delay
- Author
-
Cheng, Hong, Luo, Huan, Liu, Zhi, Sun, Wei, Li, Weitao, and Li, Qiyue
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
Active distribution networks (ADNs) incorporating massive photovoltaic (PV) devices encounter challenges of rapid voltage fluctuations and potential violations. Due to the fluctuation and intermittency of PV generation, the state gap, arising from time-inconsistent states and exacerbated by imprecisely known system delays, significantly impacts the accuracy of voltage control. This paper addresses this challenge by introducing a framework for delay-adaptive Volt/Var control (VVC) in the presence of imprecisely known system delays to regulate the reactive power of PV inverters. The proposed approach formulates the voltage control, based on predicted system operation states, as a robust VVC problem. It employs sample selection from the state prediction interval to promptly identify the worst-performing system operation state. Furthermore, we leverage the decentralized partially observable Markov decision process (Dec-POMDP) to reformulate the robust VVC problem, and employ the Multiple Policy Networks and Reward Shaping-based Multi-agent Twin Delayed Deep Deterministic Policy Gradient (MPNRS-MATD3) algorithm to solve the Dec-POMDP model-based problem efficiently. Simulation results demonstrate the delay-adaptation capability of the proposed framework, and MPNRS-MATD3 outperforms other multi-agent reinforcement learning algorithms in robust voltage control.
- Published
- 2024