Author: "Wu, Wentao" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wu, Wentao"' showing total 1,285 results

1. VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models

Author: Wu, Wentao, Hong, Fanghua, Wang, Xiao, Li, Chenglong, and Tang, Jin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing
Abstract: Existing vehicle detectors are usually obtained by training a typical detector (e.g., YOLO, RCNN, DETR series) on vehicle images based on a pre-trained backbone (e.g., ResNet, ViT). Some researchers also exploit and enhance the detection performance using pre-trained large foundation models. However, we think these detectors may only get sub-optimal results because the large models they use are not specifically designed for vehicles. In addition, their results heavily rely on visual features, and they seldom consider the alignment between the vehicle's semantic information and visual representations. In this work, we propose a new vehicle detection paradigm based on a pre-trained foundation vehicle model (VehicleMAE) and a large language model (T5), termed VFM-Det. It follows the region proposal-based detection framework and the features of each proposal can be enhanced using VehicleMAE. More importantly, we propose a new VAtt2Vec module that predicts the vehicle semantic attributes of these proposals and transforms them into feature vectors to enhance the vision features via contrastive learning. Extensive experiments on three vehicle detection benchmark datasets thoroughly proved the effectiveness of our vehicle detector. Specifically, our model improves the baseline approach by $+5.1\%$, $+6.2\%$ on the $AP_{0.5}$, $AP_{0.75}$ metrics, respectively, on the Cityscapes dataset. The source code of this work will be released at https://github.com/Event-AHU/VFM-Det., Comment: In Peer Review
Published: 2024

2. Apple Intelligence Foundation Language Models

Author: Gunter, Tom, Wang, Zirui, Wang, Chong, Pang, Ruoming, Narayanan, Andy, Zhang, Aonan, Zhang, Bowen, Chen, Chen, Chiu, Chung-Cheng, Qiu, David, Gopinath, Deepak, Yap, Dian Ang, Yin, Dong, Nan, Feng, Weers, Floris, Yin, Guoli, Huang, Haoshuo, Wang, Jianyu, Lu, Jiarui, Peebles, John, Ye, Ke, Lee, Mark, Du, Nan, Chen, Qibin, Keunebroek, Quentin, Wiseman, Sam, Evans, Syd, Lei, Tao, Rathod, Vivek, Kong, Xiang, Du, Xianzhi, Li, Yanghao, Wang, Yongqiang, Gao, Yuan, Ahmed, Zaid, Xu, Zhaoyang, Lu, Zhiyun, Rashid, Al, Jose, Albin Madappally, Doane, Alec, Bencomo, Alfredo, Vanderby, Allison, Hansen, Andrew, Jain, Ankur, Anupama, Anupama Mann, Kamal, Areeba, Wu, Bugu, Brum, Carolina, Maalouf, Charlie, Erdenebileg, Chinguun, Dulhanty, Chris, Moritz, Dominik, Kang, Doug, Jimenez, Eduardo, Ladd, Evan, Shi, Fangping, Bai, Felix, Chu, Frank, Hohman, Fred, Kotek, Hadas, Coleman, Hannah Gillis, Li, Jane, Bigham, Jeffrey, Cao, Jeffery, Lai, Jeff, Cheung, Jessica, Shan, Jiulong, Zhou, Joe, Li, John, Qin, Jun, Singh, Karanjeet, Vega, Karla, Zou, Kelvin, Heckman, Laura, Gardiner, Lauren, Bowler, Margit, Cordell, Maria, Cao, Meng, Hay, Nicole, Shahdadpuri, Nilesh, Godwin, Otto, Dighe, Pranay, Rachapudi, Pushyami, Tantawi, Ramsey, Frigg, Roman, Davarnia, Sam, Shah, Sanskruti, Guha, Saptarshi, Sirovica, Sasha, Ma, Shen, Ma, Shuang, Wang, Simon, Kim, Sulgi, Jayaram, Suma, Shankar, Vaishaal, Paidi, Varsha, Kumar, Vivek, Wang, Xin, Zheng, Xin, Cheng, Walker, Shrager, Yael, Ye, Yang, Tanaka, Yasu, Guo, Yihao, Meng, Yunsong, Luo, Zhao Tang, Ouyang, Zhi, Aygar, Alp, Wan, Alvin, Walkingshaw, Andrew, Lin, Antonie, Farooq, Arsalan, Ramerth, Brent, Reed, Colorado, Bartels, Chris, Chaney, Chris, Riazati, David, Yang, Eric Liang, Feldman, Erin, Hochstrasser, Gabriel, Seguin, Guillaume, Belousova, Irina, Pelemans, Joris, Yang, Karen, Vahid, Keivan Alizadeh, Cao, Liangliang, Najibi, Mahyar, Zuliani, Marco, Horton, Max, Cho, Minsik, Bhendawade, Nikhil, Dong, Patrick, Maj, Piotr, Agrawal, Pulkit, Shan, Qi, Fu, Qichen, Poston, Regan, Xu, Sam, Liu, Shuangning, Rao, Sushma, Heeramun, Tashweena, Merth, Thomas, Rayala, Uday, Cui, Victor, Sridhar, Vivek Rangarajan, Zhang, Wencong, Zhang, Wenqi, Wu, Wentao, Zhou, Xingyu, Liu, Xinwen, Zhao, Yang, Xia, Yin, Ren, Zhile, and Ren, Zhongzheng
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used to train the model, the training process, how the models are optimized for inference, and the evaluation results. We highlight our focus on Responsible AI and how the principles are applied throughout the model development.
Published: 2024

3. PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

Author: Chen, Linqing, Wang, Weilei, Bai, Zilong, Xu, Peng, Fang, Yan, Fang, Jie, Wu, Wentao, Zhou, Lizhi, Zhang, Ruiji, Xia, Yubin, Xu, Chaobo, Hu, Ran, Xu, Licong, Cai, Qijun, Hua, Haoran, Sun, Jing, Liu, Jin, Qiu, Tian, Liu, Haowen, Hu, Meng, Li, Xiuwen, Gao, Fei, Wang, Yufu, Tie, Lin, Wang, Chaochao, Lu, Jianping, Sun, Cheng, Wang, Yixin, Yang, Shengjie, Li, Yuancheng, Jin, Lu, Zhang, Lisha, Bian, Fu, Ye, Zhongkai, Pei, Lidong, and Tu, Changyang
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general-purpose LLMs often fall short. In this study, we introduce PharmaGPT, a suite of domain-specialized LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus tailored to the Bio-Pharmaceutical and Chemical domains. Our evaluation shows that PharmaGPT surpasses existing general models on specific-domain benchmarks such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. Remarkably, this performance is achieved with a model that has only a fraction, sometimes just one-tenth, of the parameters of general-purpose large models. This advancement establishes a new benchmark for LLMs in the bio-pharmaceutical and chemical fields, addressing the existing gap in specialized language modeling. It also suggests a promising path for enhanced research and development, paving the way for more precise and effective NLP applications in these areas.
Published: 2024

4. Pre-training on High Definition X-ray Images: An Experimental Study

Author: Wang, Xiao, Li, Yuehang, Wu, Wentao, Jin, Jiandong, Rong, Yao, Jiang, Bo, Li, Chuanfu, and Tang, Jin
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Existing X-ray based pre-trained vision models are usually trained on a relatively small-scale dataset (less than 500k samples) with limited resolution (e.g., 224 $\times$ 224). However, the key to the success of self-supervised pre-training large models lies in massive training data, and maintaining high resolution in the field of X-ray images is the guarantee of effective solutions to difficult miscellaneous diseases. In this paper, we address these issues by proposing the first high-definition (1280 $\times$ 1280) X-ray based pre-trained foundation vision model on our newly collected large-scale dataset which contains more than 1 million X-ray images. Our model follows the masked auto-encoder framework, which takes the tokens remaining after masking (at a high rate) as input, and the masked image patches are reconstructed by the Transformer encoder-decoder network. More importantly, we introduce a novel context-aware masking strategy that utilizes the chest contour as a boundary for adaptive masking operations. We validate the effectiveness of our model on two downstream tasks, including X-ray report generation and disease recognition. Extensive experiments demonstrate that our pre-trained medical foundation vision model achieves comparable or even new state-of-the-art performance on downstream benchmark datasets. The source code and pre-trained models of this paper will be released on https://github.com/Event-AHU/Medical_Image_Analysis., Comment: Technology Report
Published: 2024

5. State Space Model for New-Generation Network Alternative to Transformers: A Survey

Author: Wang, Xiao, Wang, Shiao, Ding, Yuhe, Li, Yuehang, Wu, Wentao, Rong, Yao, Kong, Weizhe, Huang, Ju, Li, Shihao, Yang, Haoxiang, Wang, Ziwen, Jiang, Bo, Li, Chenglong, Wang, Yaowei, Tian, Yonghong, and Tang, Jin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: In the post-deep learning era, the Transformer architecture has demonstrated its powerful performance across pre-trained big models and various downstream tasks. However, the enormous computational demands of this architecture have deterred many researchers. To further reduce the complexity of attention models, numerous efforts have been made to design more efficient methods. Among them, the State Space Model (SSM), as a possible replacement for the self-attention based Transformer model, has drawn more and more attention in recent years. In this paper, we give the first comprehensive review of these works and also provide experimental comparisons and analysis to better demonstrate the features and advantages of SSM. Specifically, we first give a detailed description of principles to help the readers quickly capture the key ideas of SSM. After that, we dive into the reviews of existing SSMs and their various applications, including natural language processing, computer vision, graph, multi-modal and multi-media, point cloud/event stream, time series data, and other domains. In addition, we give statistical comparisons and analysis of these models and hope it helps the readers to understand the effectiveness of different structures on various tasks. Then, we propose possible research points in this direction to better promote the development of the theoretical model and application of SSM. More related works will be continuously updated on the following GitHub: https://github.com/Event-AHU/Mamba_State_Space_Model_Paper_List., Comment: The First review of State Space Model (SSM)/Mamba and their applications in artificial intelligence, 33 pages
Published: 2024

6. Budget-aware Query Tuning: An AutoML Perspective

Author: Wu, Wentao and Wang, Chi
Subjects: Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. Such query optimizers rely on cost models to estimate the costs of candidate query execution plans. A cost model represents a function from a set of cost units to query execution cost, where each cost unit specifies the unit cost of executing a certain type of query processing operation (such as table scan or join). These cost units are traditionally viewed as constants, whose values depend only on the platform configuration on which the database system runs and are invariant across the queries processed by the database system. In this paper, we challenge this classic view by thinking of these cost units as variables instead. We show that, by varying the cost-unit values, one can obtain query plans that significantly outperform the default query plans returned by the query optimizer when viewing the cost units as constants. We term this cost-unit tuning process "query tuning" (QT) and show that it is similar to the well-known hyper-parameter optimization (HPO) problem in AutoML. As a result, any state-of-the-art HPO technology can be applied to QT. We study the QT problem in the context of anytime tuning, which is desirable in practice by constraining the total time spent on QT within a given budget -- we call this problem budget-aware query tuning. We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims to minimize the execution time of the entire workload. WT is more challenging as one needs to further prioritize individual query tuning within the given time budget. We propose solutions to both QT and WT, and experimental evaluation using both benchmark and real workloads demonstrates the efficacy of our proposed solutions.
Published: 2024

7. TablePuppet: A Generic Framework for Relational Federated Learning

Author: Xu, Lijie, Xie, Chulin, Guo, Yiran, Alonso, Gustavo, Li, Bo, Li, Guoliang, Wang, Wei, Wu, Wentao, and Zhang, Ce
Subjects: Computer Science - Machine Learning, Computer Science - Databases, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Current federated learning (FL) approaches view decentralized training data as a single table, divided among participants either horizontally (by rows) or vertically (by columns). However, these approaches are inadequate for handling distributed relational tables across databases. This scenario requires intricate SQL operations like joins and unions to obtain the training data, which is either costly or restricted by privacy concerns. This raises the question: can we directly run FL on distributed relational tables? In this paper, we formalize this problem as relational federated learning (RFL). We propose TablePuppet, a generic framework for RFL that decomposes the learning process into two steps: (1) learning over join (LoJ) followed by (2) learning over union (LoU). In a nutshell, LoJ pushes learning down onto the vertical tables being joined, and LoU further pushes learning down onto the horizontal partitions of each vertical table. TablePuppet incorporates computation/communication optimizations to deal with the duplicate tuples introduced by joins, as well as differential privacy (DP) to protect against both feature and label leakages. We demonstrate the efficiency of TablePuppet in combination with two widely-used ML training algorithms, stochastic gradient descent (SGD) and alternating direction method of multipliers (ADMM), and compare their computation/communication complexity. We evaluate the SGD/ADMM algorithms developed atop TablePuppet by training diverse ML models. Our experimental results show that TablePuppet achieves model accuracy comparable to the centralized baselines running directly atop the SQL results. Moreover, ADMM takes less communication time than SGD to converge to similar model accuracy., Comment: 14 pages, 8 figures
Published: 2024

8. Arterial Progression Signal Optimization for Speed Uncertainty Scenarios

Author: Zhang, Zhe, Cao, Qi, Chen, Weihan, Ren, Gang, Hu, Tongyu, and Wu, Wentao
Published: 2024
Full Text: View/download PDF

9. Visual guidance method for artificial assembly in visual blind areas based on augmented reality

Author: Zheng, Yizhen, Li, Yuefeng, Wu, Wentao, Meng, Fanwei, and Chen, Changyu
Published: 2024
Full Text: View/download PDF

10. Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems

Author: Xu, Lijie, Qiu, Shuang, Yuan, Binhang, Jiang, Jiawei, Renggli, Cedric, Gan, Shaoduo, Kara, Kaan, Li, Guoliang, Liu, Ji, Wu, Wentao, Ye, Jieping, and Zhang, Ce
Published: 2024
Full Text: View/download PDF

11. Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception

Author: Wang, Xiao, Wu, Wentao, Li, Chenglong, Zhao, Zhicheng, Chen, Zhe, Shi, Yukai, and Tang, Jin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Understanding vehicles in images is important for various applications such as intelligent transportation and self-driving systems. Existing vehicle-centric works typically pre-train models on large-scale classification datasets and then fine-tune them for specific downstream tasks. However, they neglect the specific characteristics of vehicle perception in different tasks and might thus lead to sub-optimal performance. To address this issue, we propose a novel vehicle-centric pre-training framework called VehicleMAE, which incorporates the structural information including the spatial structure from vehicle profile information and the semantic structure from informative high-level natural language descriptions for effective masked vehicle appearance reconstruction. To be specific, we explicitly extract the sketch lines of vehicles as a form of the spatial structure to guide vehicle reconstruction. The more comprehensive knowledge distilled from the CLIP big model based on the similarity between the paired/unpaired vehicle image-text sample is further taken into consideration to help achieve a better understanding of vehicles. A large-scale dataset is built to pre-train our model, termed Autobot1M, which contains about 1M vehicle images and 12,693 pieces of text information. Extensive experiments on four vehicle-based downstream tasks fully validated the effectiveness of our VehicleMAE. The source code and pre-trained models will be released at https://github.com/Event-AHU/VehicleMAE., Comment: Accepted by AAAI-2024
Published: 2023

12. VeCLIP: Improving CLIP Training via Visual-Enriched Captions

Author: Lai, Zhengfeng, Zhang, Haotian, Zhang, Bowen, Wu, Wentao, Bai, Haoping, Timofeev, Aleksei, Du, Xianzhi, Gan, Zhe, Shan, Jiulong, Chuah, Chen-Nee, Yang, Yinfei, Cao, Meng, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

13. VeCLIP: Improving CLIP Training via Visual-enriched Captions

Author: Lai, Zhengfeng, Zhang, Haotian, Zhang, Bowen, Wu, Wentao, Bai, Haoping, Timofeev, Aleksei, Du, Xianzhi, Gan, Zhe, Shan, Jiulong, Chuah, Chen-Nee, Yang, Yinfei, and Cao, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Large-scale web-crawled datasets are fundamental for the success of pre-training vision-language models, such as CLIP. However, the inherent noise and potential irrelevance of web-crawled AltTexts pose challenges in achieving precise image-text alignment. Existing methods utilizing large language models (LLMs) for caption rewriting have shown promise on small, curated datasets like CC3M and CC12M. This study introduces a scalable pipeline for noisy caption rewriting. Unlike recent LLM rewriting techniques, we emphasize the incorporation of visual concepts into captions, termed Visual-enriched Captions (VeCap). To ensure data diversity, we propose a novel mixed training scheme that optimizes the utilization of AltTexts alongside newly generated VeCap. We showcase the adaptation of this method for training CLIP on large-scale web-crawled datasets, termed VeCLIP. Employing this cost-effective pipeline, we effortlessly scale our dataset, named the VeCap dataset, up to 300 million samples. Our results show significant advantages in image-text alignment and overall model performance. For example, VeCLIP achieves up to +25.2% gain in COCO and Flickr30k retrieval tasks under the 12M setting. For data efficiency, VeCLIP achieves +3% gain while only using 14% of the data employed in the vanilla CLIP and 11% in ALIGN. We also note the VeCap data is complementary with other well-curated datasets suited to zero-shot classification tasks. When combining VeCap and DFN, our model can achieve strong performance on both image-text retrieval and zero-shot classification tasks, e.g. 83.1% accuracy@1 on ImageNet zero-shot for a H/14 model. We release the pre-trained models at https://github.com/apple/ml-veclip., Comment: CV/ML
Published: 2023

14. ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges

Author: Siddiqui, Tarique and Wu, Wentao
Subjects: Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The scale and complexity of workloads in modern cloud services have brought into sharper focus a critical challenge in automated index tuning -- the need to recommend high-quality indexes while maintaining index tuning scalability. This challenge is further compounded by the requirement for automated index implementations to introduce minimal query performance regressions in production deployments, representing a significant barrier to achieving scalability and full automation. This paper directs attention to these challenges within automated index tuning and explores ways in which machine learning (ML) techniques provide new opportunities in their mitigation. In particular, we reflect on recent efforts in developing ML techniques for workload selection, candidate index filtering, speeding up index configuration search, reducing the amount of query optimizer calls, and lowering the chances of performance regressions. We highlight the key takeaways from these efforts and underline the gaps that need to be closed for their effective functioning within the traditional index tuning framework. Additionally, we present a preliminary cross-platform design aimed at democratizing index tuning across multiple SQL-like systems -- an imperative in today's continuously expanding data system landscape. We believe our findings will help provide context and impetus to the research and development efforts in automated index tuning.
Published: 2023

15. MOFI: Learning Image Representations from Noisy Entity Annotated Images

Author: Wu, Wentao, Timofeev, Aleksei, Chen, Chen, Zhang, Bowen, Duan, Kun, Liu, Shuangning, Zheng, Yantao, Shlens, Jonathon, Du, Xianzhi, Gan, Zhe, and Yang, Yinfei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We present MOFI, Manifold OF Images, a new vision foundation model designed to learn image representations from noisy entity annotated images. MOFI differs from previous work in two key aspects: (i) pre-training data, and (ii) training recipe. Regarding data, we introduce a new approach to automatically assign entity labels to images from noisy image-text pairs. Our approach involves employing a named entity recognition model to extract entities from the alt-text, and then using a CLIP model to select the correct entities as labels of the paired image. It's a simple, cost-effective method that can scale to handle billions of web-mined image-text pairs. Through this method, we have created Image-to-Entities (I2E), a new dataset with 1 billion images and 2 million distinct entities, covering rich visual concepts in the wild. Building upon the I2E dataset, we study different training recipes like supervised pre-training, contrastive pre-training, and multi-task learning. For contrastive pre-training, we treat entity names as free-form text, and further enrich them with entity descriptions. Experiments show that supervised pre-training with large-scale fine-grained entity labels is highly effective for image retrieval tasks, and multi-task training further improves the performance. The final MOFI model achieves 86.66% mAP on the challenging GPR1200 dataset, surpassing the previous state-of-the-art performance of 72.19% from OpenAI's CLIP model. Further experiments on zero-shot and linear probe image classification also show that MOFI outperforms a CLIP model trained on the original image-text data, demonstrating the effectiveness of the I2E dataset in learning strong image representations. We release our code and model weights at https://github.com/apple/ml-mofi., Comment: Accepted to ICLR 2024
Published: 2023

16. The impact of postoperative adjuvant therapy on EGFR-mutated stage IA lung adenocarcinoma with micropapillary pathological subtypes

Author: Cheng, Ran, Hao, Zhexue, Qiu, Li, Zheng, Xiang, Huang, Sihe, Xian, Jianzhao, Huang, Haoyang, Li, Jianfu, Zhang, Zhenhui, Ye, Kaiwen, Wu, Wentao, Zhang, Yaowen, and Liu, Jun
Published: 2024
Full Text: View/download PDF

17. Asymmetric N-oxidation catalyzed by bisguanidinium dinuclear oxodiperoxomolybdosulfate

Author: Wu, Wentao, Ang, Esther Cai Xia, Xu, Xinru, Wang, Qi, Wang, Hong, Lee, Richmond, Tan, Choon-Hong, and Ye, Xinyi
Published: 2024
Full Text: View/download PDF

18. Forecasting and analyzing influenza activity in Hebei Province, China, using a CNN-LSTM hybrid model

Author: Li, Guofan, Li, Yan, Han, Guangyue, Jiang, Caixiao, Geng, Minghao, Guo, Nana, Wu, Wentao, Liu, Shangze, Xing, Zhihuai, Han, Xu, and Li, Qi
Published: 2024
Full Text: View/download PDF

19. The clinical significance of inflammatory mediators in predicting obesity and progression-free survival in patients with adult-onset Craniopharyngioma

Author: Xiao, Youchao, Wu, Wentao, Liu, Fangzheng, Jia, Yanfei, Jin, Lu, Qiao, Ning, Cai, Kefan, Ru, Siming, Cao, Lei, and Gui, Songbai
Published: 2024
Full Text: View/download PDF

20. Elevated SCN11A concentrations associated with lower serum lipid levels in patients with major depressive disorder

Author: Xu, Ke, Zhao, Shuang, Ren, Yi, Zhong, Qi, Feng, Jinzhou, Tu, Dianji, Wu, Wentao, Wang, Jiaolin, Chen, Jianjun, and Xie, Peng
Published: 2024
Full Text: View/download PDF

21. Gut microbiota composition and metabolic characteristics in patients with Craniopharyngioma

Author: Liu, Chunhui, Liu, Fangzheng, Nie, Ding, Xiao, Youchao, Wu, Wentao, Jia, Yanfei, Jin, Lu, Qiao, Ning, Cai, Kefan, Ru, Siming, Liu, Xin, Song, Yifan, Xu, Jintian, Cao, Lei, and Gui, Songbai
Published: 2024
Full Text: View/download PDF

22. Concomitant Prediction of the Ki67 and PIT-1 Expression in Pituitary Adenoma Using Different Radiomics Models

Author: Liu, Fangzheng, Zang, Yuying, Feng, Limei, Shi, Xinyao, Wu, Wentao, Liu, Xin, Song, Yifan, Xu, Jintian, Gui, Songbai, and Chen, Xuzhu
Published: 2024
Full Text: View/download PDF

23. How good are machine learning clouds? Benchmarking two snapshots over 5 years

Author: Jiang, Jiawei, Wei, Yi, Liu, Yu, Wu, Wentao, Hu, Chuang, Zheng, Zhigao, Zhang, Ziyi, Shao, Yingxia, and Zhang, Ce
Published: 2024
Full Text: View/download PDF

24. A systematic evaluation of machine learning on serverless infrastructure

Author: Jiang, Jiawei, Gan, Shaoduo, Du, Bo, Alonso, Gustavo, Klimovic, Ana, Singla, Ankit, Wu, Wentao, Wang, Sheng, and Zhang, Ce
Published: 2024
Full Text: View/download PDF

25. Finite-Time Extended State Observer-Based Performance-Critical Control for Uncertain MIMO Nonlinear Systems

Author: Wu, Wentao, Zhang, Chenming, Li, Zhenhua, Zhang, Weidong, Zhang, Yibo, Xie, Wei, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Wang, Qing, editor, Dong, Xiwang, editor, and Song, Peng, editor
Published: 2024
Full Text: View/download PDF

26. Development of a dc-SQUID Amplifier with Intra-Coil Resistors

Author: Wu, Wentao, Lin, Zhirong, Ni, Zhi, Zhang, Shuo, Zhang, Guofeng, Wang, Yongliang, Ying, Liliang, Peng, Wei, You, Lixing, and Wang, Zhen
Published: 2024
Full Text: View/download PDF

27. Stochastic Gradient Descent without Full Data Shuffle

Author: Xu, Lijie, Qiu, Shuang, Yuan, Binhang, Jiang, Jiawei, Renggli, Cedric, Gan, Shaoduo, Kara, Kaan, Li, Guoliang, Liu, Ji, Wu, Wentao, Ye, Jieping, and Zhang, Ce
Subjects: Computer Science - Machine Learning
Abstract: Stochastic gradient descent (SGD) is the cornerstone of modern machine learning (ML) systems. Despite its computational efficiency, SGD requires random data access that is inherently inefficient when implemented in systems that rely on block-addressable secondary storage such as HDD and SSD, e.g., TensorFlow/PyTorch and in-DB ML systems over large files. To address this impedance mismatch, various data shuffling strategies have been proposed to balance the convergence rate of SGD (which favors randomness) and its I/O performance (which favors sequential access). In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate. With this in mind, we propose a simple but novel hierarchical data shuffling strategy, CorgiPile. Compared with existing strategies, CorgiPile avoids a full data shuffle while maintaining comparable convergence rate of SGD as if a full shuffle were performed. We provide a non-trivial theoretical analysis of CorgiPile on its convergence behavior. We further integrate CorgiPile into PyTorch by designing new parallel/distributed shuffle operators inside a new CorgiPileDataSet API. We also integrate CorgiPile into PostgreSQL by introducing three new physical operators with optimizations. Our experimental results show that CorgiPile can achieve comparable convergence rate with the full shuffle based SGD for both deep learning and generalized linear models. For deep learning models on ImageNet dataset, CorgiPile is 1.5X faster than PyTorch with full data shuffle. For in-DB ML with linear models, CorgiPile is 1.6X-12.8X faster than two state-of-the-art in-DB ML systems, Apache MADlib and Bismarck, on both HDD and SSD., Comment: This technical report is an extension of our SIGMOD 2022 paper titled "In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle". https://doi.org/10.1145/3514221.3526150
Published: 2022

28. Data Debugging with Shapley Importance over End-to-End Machine Learning Pipelines

Author: Karlaš, Bojan, Dao, David, Interlandi, Matteo, Li, Bo, Schelter, Sebastian, Wu, Wentao, and Zhang, Ce
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Databases
Abstract: Developing modern machine learning (ML) applications is data-centric, of which one fundamental challenge is to understand the influence of data quality to ML training -- "Which training examples are 'guilty' in making the trained ML model predictions inaccurate or unfair?" Modeling data influence for ML training has attracted intensive interest over the last decade, and one popular framework is to compute the Shapley value of each training example with respect to utilities such as validation accuracy and fairness of the trained ML model. Unfortunately, despite recent intensive interest and research, existing methods only consider a single ML model "in isolation" and do not consider an end-to-end ML pipeline that consists of data transformations, feature extractors, and ML training. We present DataScope (ease.ml/datascope), the first system that efficiently computes Shapley values of training examples over an end-to-end ML pipeline, and illustrate its applications in data debugging for ML training. To this end, we first develop a novel algorithmic framework that computes Shapley value over a specific family of ML pipelines that we call canonical pipelines: a positive relational algebra query followed by a K-nearest-neighbor (KNN) classifier. We show that, for many subfamilies of canonical pipelines, computing Shapley value is in PTIME, contrasting the exponential complexity of computing Shapley value in general. We then put this to practice -- given an sklearn pipeline, we approximate it with a canonical pipeline to use as a proxy. We conduct extensive experiments illustrating different use cases and utilities. Our results show that DataScope is up to four orders of magnitude faster over state-of-the-art Monte Carlo-based methods, while being comparably, and often even more, effective in data debugging.
Published: 2022
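
Note: The abstract above contrasts PTIME algorithms for canonical pipelines with the exponential cost of computing Shapley values in general. As a hedged illustration of that general case only, the sketch below computes exact Shapley values by enumerating every subset of training examples. The function `shapley_values` and the `utility` callback are hypothetical names introduced here; this is the textbook definition, not DataScope's algorithm.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_values(
    n: int,
    utility: Callable[[FrozenSet[int]], float],
) -> List[float]:
    """Exact Shapley value of each of n training examples (exponential in n).

    utility -- maps a subset of example indices to a score such as validation
               accuracy (a hypothetical callback supplied by the caller)
    """
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset that excludes example i
            # Weight of a size-k subset in the Shapley average: k!(n-k-1)!/n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of example i when added to subset s.
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


if __name__ == "__main__":
    # Toy additive utility: each example contributes a fixed amount.
    contributions = [1.0, 2.0, 3.0]
    print(shapley_values(3, lambda s: sum(contributions[j] for j in s)))
```

For an additive utility like the toy one here, each example's Shapley value equals its own contribution, which makes the function easy to sanity-check on small inputs.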

29. Picking up railway track vibration energy using a novel doughnut-shaped piezoelectric energy harvester

Author: Min, Zhaowei, Du, Xuteng, Zhang, Xiaofan, Wu, Wentao, Shan, Xiaobiao, and Xie, Tao
Published: 2024
Full Text: View/download PDF

30. Low frequency coupled bandgap regulation of staggered piezoelectric supercell beam

Author: Wu, Wentao, Shan, Xiaobiao, Zhang, Huan, Sun, Chenghui, Du, Xuteng, and Min, Zhaowei
Published: 2024
Full Text: View/download PDF

31. Short-term thermal resilience and building energy flexibility using thermal mass and controlled natural ventilation

Author: Yoon, Nari and Wu, Wentao
Published: 2024
Full Text: View/download PDF

32. Micromechanism insights and constitutive modeling for deformation in a novel high manganese austenitic steel

Author: Wu, Wentao, Liu, Shengqiang, Fan, Wenjie, Xue, Zhaojiang, Chen, Kaixuan, Wu, Ying, Guo, Hongyan, Gan, Bin, Zhao, Feng, Jiang, Naisheng, Xia, Min, and He, Manchao
Published: 2024
Full Text: View/download PDF

33. Joint mining of fluid knowledge and multi-sensor data for gas–water two-phase flow status monitoring and evolution analysis

Author: Wu, Wentao, Tan, Chao, Zhang, Shumei, and Dong, Feng
Published: 2024
Full Text: View/download PDF

34. Critical cutting thickness model considering subsurface damage of zirconia grinding and friction–wear performance evaluation applied in simulated oral environment

Author: Yang, Min, Hao, Jiachao, Wu, Wentao, Li, Zhonghao, Ma, Yunqi, Zhou, Zongming, Gao, Teng, Liu, Mingzheng, Cui, Xin, Zhang, Yanbin, Li, Benkai, Ma, Xiao, Dambatta, Yusuf Suleiman, and Li, Changhe
Published: 2024
Full Text: View/download PDF

35. Analytical solution for surrounding rock temperature of cold-region tunnel considering phase change

Author: Wu, Wentao, Guo, Jiaqi, Wang, Xiaochuan, Hu, Huanmeng, and Zhao, Pengyu
Published: 2024
Full Text: View/download PDF

36. Abnormal hypothalamic functional connectivity associated with cognitive impairment in craniopharyngiomas

Author: Jin, Lu, Lu, Pengwei, Kang, Jie, Liu, Fangzheng, Liu, Xin, Song, Yifan, Wu, Wentao, Cai, Kefan, Ru, Siming, Cao, Jingtao, Zuo, Zentao, and Gui, Songbai
Published: 2024
Full Text: View/download PDF

37. Dynamic tensile behavior and constitutive model of a novel high-strength and high-toughness plate steel

Author: Tang, Jie, He, Manchao, Qiao, Yafei, Wu, Wentao, and Xia, Min
Published: 2024
Full Text: View/download PDF

38. A 6-mode pre-amplifier for turbulence-resistant free-space optical communication

Author: Zhang, Peng, Yu, Hao, Wu, Wentao, He, Shuang, Wang, Yuanxin, Tian, Dongsheng, and Tong, Shoufeng
Published: 2025
Full Text: View/download PDF

39. Flexural behaviour of BFRP grid/bar reinforced UHPC beams and slabs

Author: Wu, Dongyan, Wu, Wentao, Zhao, Junliang, and Xia, Zhi-Yu
Published: 2024
Full Text: View/download PDF

40. Experimental and numerical study on upper-room Far-UVC system under different ventilation schemes to disinfect airborne microorganisms in indoor environments

Author: Lv, Yang, Chen, Xi, Wu, Wentao, Wu, Fang, Wu, Xiaozhou, Yuan, Wenjie, and Qu, Changfeng
Published: 2024
Full Text: View/download PDF

41. Enhanced energy storage density and efficiency of nanocomposite dielectrics by modifying polymer matrix and aminated boron nitride nanosheet

Author: Wang, Jian, Wang, Baohui, Wu, Wentao, Gong, Honghong, Guo, Yuxuan, Mao, Jie, He, Lijun, Liang, Sen, and Xie, Yunchuan
Published: 2024
Full Text: View/download PDF

42. Optimized isolation and purification of Shaoyao Gancao decoction using macroporous resin

Author: Luo, Yao, Wu, Wentao, Gao, Rui, and Guo, Yongxue
Published: 2024
Full Text: View/download PDF

43. The influence of annealing temperature on the microstructure and mechanical properties of Fe-0.52 C-11Mn-5.14Al-1Cr lightweight steel

Author: Ge, Meiling, He, Zhongping, Wang, Lijuan, Fu, Hua, Wu, Wentao, Chen, Zhijiang, Cheng, Hong, Si, Tianyu, Che, Lun, Zheng, Kaiyuan, Xu, Xiaotian, He, Yanlin, and Zhao, Feng
Published: 2024
Full Text: View/download PDF

44. Voltage-assisted 3D printing of polymer composite dielectric films with low energy loss and high energy storage density

Author: Wang, Jian, Peng, Biyun, Zhang, Yifei, Gong, Honghong, Wang, Baohui, Wu, Wentao, He, Lijun, Liang, Sen, and Xie, Yunchuan
Published: 2024
Full Text: View/download PDF

45. Multi-energy X-ray imaging enabled by [formula omitted]E-E telescope scintillator

Author: He, Tengyue, Shao, Wenyi, Yin, Jun, Wang, Hongyun, Zhou, Yang, Wang, Jian-Xin, Yuan, Peng, Gutiérrez-Arzaluz, Luis, Wu, Wentao, Zhou, Renqian, Shao, Bingyao, Xia, Xiaochuan, Liang, Hongwei, Bakr, Osman M., and Mohammed, Omar F.
Published: 2024
Full Text: View/download PDF

46. Mechanical properties and microstructure of high-strength and high-ductility steel at elevated temperature

Author: Xia, Min, Wu, Wentao, Xue, Zhaojiang, Sang, Ying, Nie, Shuyu, Jiang, Naisheng, Guo, Hongyan, and He, Manchao
Published: 2024
Full Text: View/download PDF

47. Application of visual inertia fusion technology in rice transplanter operation

Author: Wu, Wentao, Zhang, Zeqing, Zhang, Xiya, He, Yong, and Fang, Hui
Published: 2024
Full Text: View/download PDF

48. Comparison of heavy metal contaminants removal using EDTA and Cyanex 302 as chelating agents for supercritical CO2-based soil remediation

Author: Wu, Wentao, Chen, Lin, Zhang, Wenhong, and Mei, Deqing
Published: 2024
Full Text: View/download PDF

49. 2D/2D heterojunction of MgAlTi-LDH/g-C3N4 with oxygen vacancy engineering for enhanced photocatalytic activities under natural sunlight

Author: Chen, Quan, Wu, Lunan, Wu, Jun, Ma, Kesong, Ma, Wenzhen, Wu, Wentao, Guan, Fangyuan, Li, Peng, Liu, Dong, and Yang, Xiu-Jie
Published: 2024
Full Text: View/download PDF

50. VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition

Author: Li, Yang, Shen, Yu, Zhang, Wentao, Jiang, Jiawei, Ding, Bolin, Li, Yaliang, Zhou, Jingren, Yang, Zhi, Wu, Wentao, Zhang, Ce, and Cui, Bin
Subjects: Computer Science - Machine Learning
Abstract: End-to-end AutoML has attracted intensive interests from both academia and industry, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning. Existing AutoML systems, however, suffer from scalability issues when applying to application domains with large, high-dimensional search spaces. We present VolcanoML, a scalable and extensible framework that facilitates systematic exploration of large AutoML search spaces. VolcanoML introduces and implements basic building blocks that decompose a large search space into smaller ones, and allows users to utilize these building blocks to compose an execution plan for the AutoML problem at hand. VolcanoML further supports a Volcano-style execution model - akin to the one supported by modern database systems - to execute the plan constructed. Our evaluation demonstrates that, not only does VolcanoML raise the level of expressiveness for search space decomposition in AutoML, it also leads to actual findings of decomposition strategies that are significantly more efficient than the ones employed by state-of-the-art AutoML systems such as auto-sklearn.
Published: 2021
Full Text: View/download PDF
