Author: "Lixin Zou" / Publisher: acm - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lixin Zou"' showing total 11 results

1. Model-based Unbiased Learning to Rank

Author: Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Dawei Yin, and Brian D. Davison
Subjects: FOS: Computer and information sciences, Information Retrieval (cs.IR), Computer Science - Information Retrieval
Abstract: Unbiased Learning to Rank (ULTR), which learns to rank documents from biased user feedback data, is a well-known challenge in information retrieval. Existing methods in unbiased learning to rank typically rely on click modeling or inverse propensity weighting (IPW). Unfortunately, search engines face a severe long-tail query distribution, which neither click modeling nor IPW handles well. Click modeling suffers from data sparsity, since the same query-document pair appears only a limited number of times for tail queries; IPW suffers from high variance, since it is highly sensitive to small propensity score values. Therefore, a general debiasing framework that works well under tail queries is badly needed. To address this problem, we propose a model-based unbiased learning-to-rank framework. Specifically, we develop a general context-aware user simulator to generate pseudo clicks for unobserved ranked lists to train rankers, which addresses the data sparsity problem. In addition, considering the discrepancy between pseudo clicks and actual clicks, we take the observation of a ranked list as the treatment variable and further incorporate inverse propensity weighting with pseudo labels in a doubly robust way. The derived bias and variance indicate that the proposed model-based method is more robust than existing methods. Finally, extensive experiments on benchmark datasets, including simulated datasets and real click logs, demonstrate that the proposed model-based method consistently outperforms state-of-the-art methods in various scenarios. The code is available at https://github.com/rowedenny/MULTR. Comment: accepted at WSDM '23; extended version
Published: 2023
Full Text: View/download PDF
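
The abstract above notes that IPW is highly sensitive to small propensity scores. The following is a minimal Python sketch of that effect, not code from the paper or its repository. The function name `ipw_click_estimate` and the toy click logs are assumptions made for illustration only. Each click is divided by the probability that its position was examined, so a single click logged under a tiny propensity can dominate the average.

```python
def ipw_click_estimate(clicks, propensities):
    """Inverse-propensity-weighted average of logged clicks.

    clicks: list of 0/1 click indicators (hypothetical toy log).
    propensities: examination probability for each logged impression.
    """
    if len(clicks) != len(propensities):
        raise ValueError("clicks and propensities must have equal length")
    if any(p <= 0 for p in propensities):
        raise ValueError("propensities must be strictly positive")
    # Each click is reweighted by 1 / propensity before averaging.
    return sum(c / p for c, p in zip(clicks, propensities)) / len(clicks)


# Moderate propensities: every click contributes a weight of 2.
balanced = ipw_click_estimate([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])

# One click under a propensity of 0.01 contributes a weight of 100,
# which is the high-variance behaviour the abstract refers to.
skewed = ipw_click_estimate([1, 0, 0, 0], [0.01, 0.5, 0.5, 0.5])
```

In this toy log the second estimate is driven almost entirely by one impression, which is why tail queries with few observations are hard for plain IPW.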

2. H-ERNIE

Author: Xiaokai Chu, Jiashu Zhao, Lixin Zou, and Dawei Yin
Published: 2022
Full Text: View/download PDF

3. Generative Session-based Recommendation

Author: Zhidan Wang, Wenwen Ye, Xu Chen, Wenqiang Zhang, Zhenlei Wang, Lixin Zou, and Weidong Liu
Published: 2022
Full Text: View/download PDF

4. Fast Semantic Matching via Flexible Contextualized Interaction

Author: Wenwen Ye, Yiding Liu, Lixin Zou, Hengyi Cai, Suqi Cheng, Shuaiqiang Wang, and Dawei Yin
Published: 2022
Full Text: View/download PDF

5. Pre-trained Language Model based Ranking in Baidu Search

Author: Zhicong Cheng, Shuaiqiang Wang, Suqi Cheng, Lixin Zou, Dehong Ma, Hengyi Cai, Dawei Yin, Daiting Shi, and Shengqiang Zhang
Subjects: Search engine, Information retrieval, Ranking, Exploit, Computer science, Online search, Relevance (information retrieval), Learning to rank, Language model, Latency (engineering)
Abstract: As the heart of a search engine, the ranking system plays a crucial role in satisfying users' information demands. More recently, neural rankers fine-tuned from pre-trained language models (PLMs) establish state-of-the-art ranking effectiveness. However, it is nontrivial to directly apply these PLM-based rankers to the large-scale web search system due to the following challenging issues: (1) the prohibitively expensive computations of massive neural PLMs, especially for long texts in the web document, prohibit their deployments in an online ranking system that demands extremely low latency; (2) the discrepancy between existing ranking-agnostic pre-training objectives and the ad-hoc retrieval scenarios that demand comprehensive relevance modeling is another main barrier for improving the online ranking system; (3) a real-world search engine typically involves a committee of ranking components, and thus the compatibility of the individually fine-tuned ranking model is critical for a cooperative ranking system. In this work, we contribute a series of successfully applied techniques in tackling these exposed issues when deploying the state-of-the-art Chinese pre-trained language model, i.e., ERNIE, in the online search engine system. We first articulate a novel practice to cost-efficiently summarize the web document and contextualize the resultant summary content with the query using a cheap yet powerful Pyramid-ERNIE architecture. Then we endow an innovative paradigm to finely exploit the large-scale noisy and biased post-click behavioral data for relevance-oriented pre-training. We also propose a human-anchored fine-tuning strategy tailored for the online ranking system, aiming to stabilize the ranking signals across various online components. Extensive offline and online experimental results show that the proposed techniques significantly boost the search engine's performance.
Published: 2021
Full Text: View/download PDF

6. Enhanced Doubly Robust Learning for Debiasing Post-Click Conversion Rate Estimation

Author: Hechang Chen, Dawei Yin, Lixin Zou, Wenwen Ye, Shuaiqiang Wang, Yi Chang, Suqi Cheng, Siyuan Guo, and Yiding Liu
Subjects: FOS: Computer and information sciences, Selection bias, Computer Science - Machine Learning, Computer science, Estimator, Variance (accounting), Recommender system, Debiasing, Machine Learning (cs.LG), Computer Science - Information Retrieval, Robustness (computer science), Code (cryptography), Imputation (statistics), Algorithm, Information Retrieval (cs.IR)
Abstract: Post-click conversion, as a strong signal indicating the user preference, is salutary for building recommender systems. However, accurately estimating the post-click conversion rate (CVR) is challenging due to the selection bias, i.e., the observed clicked events usually happen on users' preferred items. Currently, most existing methods utilize counterfactual learning to debias recommender systems. Among them, the doubly robust (DR) estimator has achieved competitive performance by combining the error imputation based (EIB) estimator and the inverse propensity score (IPS) estimator in a doubly robust way. However, inaccurate error imputation may result in higher variance than the IPS estimator. Worse still, existing methods typically use simple model-agnostic methods to estimate the imputation error, which are not sufficient to approximate the dynamically changing model-correlated target (i.e., the gradient direction of the prediction model). To solve these problems, we first derive the bias and variance of the DR estimator. Based on this, we propose a more robust doubly robust (MRDR) estimator to further reduce its variance while retaining its double robustness. Moreover, we propose a novel double learning approach for the MRDR estimator, which can convert the error imputation into the general CVR estimation. Besides, we empirically verify that the proposed learning scheme can further eliminate the high variance problem of the imputation learning. To evaluate its effectiveness, extensive experiments are conducted on a semi-synthetic dataset and two real-world datasets. The results demonstrate the superiority of the proposed approach over the state-of-the-art methods. The code is available at https://github.com/guosyjlu/MRDR-DL. Comment: 10 pages, 3 figures, accepted by SIGIR 2021
Published: 2021
Full Text: View/download PDF
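
The abstract above describes the doubly robust (DR) estimator as a combination of the error imputation based (EIB) and inverse propensity score (IPS) estimators. The following is a minimal Python sketch of the standard textbook forms of these three estimators. It is not the MRDR estimator proposed in the paper and is not taken from its repository. All function names and the toy numbers are assumptions made for illustration.

```python
def ips_estimate(observed, errors, propensities):
    """IPS: reweight errors of observed samples by 1 / propensity."""
    n = len(observed)
    return sum(o * e / p for o, e, p in zip(observed, errors, propensities)) / n


def eib_estimate(observed, errors, imputed):
    """EIB: true error where observed, imputed error elsewhere."""
    n = len(observed)
    return sum(o * e + (1 - o) * i for o, e, i in zip(observed, errors, imputed)) / n


def dr_estimate(observed, errors, imputed, propensities):
    """DR: imputed error plus a propensity-weighted correction on observed samples."""
    n = len(observed)
    return sum(
        i + o * (e - i) / p
        for o, e, i, p in zip(observed, errors, imputed, propensities)
    ) / n


# Hypothetical toy data: observed[k] is 1 if sample k was clicked (observed).
observed = [1, 0, 1, 0]
errors = [0.2, 0.0, 0.4, 0.0]        # true errors; unobserved entries are placeholders
imputed = [0.2, 0.3, 0.4, 0.1]       # imputation model output for every sample
propensities = [0.5, 0.5, 0.5, 0.5]  # estimated click propensities

dr = dr_estimate(observed, errors, imputed, propensities)
ips = ips_estimate(observed, errors, propensities)
eib = eib_estimate(observed, errors, imputed)
```

In this toy case the imputed errors match the true errors on the observed samples, so the DR correction term vanishes and the DR value reduces to the mean imputed error. When the imputation is inaccurate, the correction term is divided by the propensity, which is where the variance concern raised in the abstract comes from.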

7. UserSim: User Simulation via Supervised Generative Adversarial Network

Author: Xiangyu Zhao, Lixin Zou, Dawei Yin, Jiliang Tang, Long Xia, and Hui Liu
Subjects: Discriminator, Computer science, Recommender system, Machine learning, Benchmark (computing), Reinforcement learning, Artificial intelligence, Generative adversarial network, Generator (mathematics)
Abstract: With the recent advances in Reinforcement Learning (RL), there has been tremendous interest in employing RL for recommender systems. However, directly training and evaluating a new RL-based recommendation algorithm requires collecting users’ real-time feedback in the real system, which is time- and effort-consuming and could negatively impact users’ experiences. Thus, it calls for a user simulator that can mimic real users’ behaviors to pre-train and evaluate new recommendation algorithms. Simulating users’ behaviors in a dynamic system faces immense challenges: (i) the underlying item distribution is complex, and (ii) historical logs for each user are limited. In this paper, we develop a user simulator based on a Generative Adversarial Network (GAN). To be specific, the generator captures the underlying distribution of users’ historical logs and generates realistic logs that can be considered as augmentations of real logs, while the discriminator not only distinguishes real and fake logs but also predicts users’ behaviors. The experimental results based on benchmark datasets demonstrate the effectiveness of the proposed simulator.
Published: 2021
Full Text: View/download PDF

8. Deep Multifaceted Transformers for Multi-objective Ranking in Large-Scale E-commerce Recommender Systems

Author: Shuaiqiang Wang, Zhuoye Ding, Dawei Yin, Yulong Gu, Lixin Zou, and Yiding Liu
Subjects: Selection bias, Training set, Exploit, Computer science, Multi-task learning, E-commerce, Recommender system, Machine learning, Single task, Artificial intelligence
Abstract: Recommender Systems have been playing essential roles in e-commerce portals. Existing recommendation algorithms usually learn the ranking scores of items by optimizing a single task (e.g. Click-through rate prediction) based on users' historical click sequences, but they generally pay little attention to simultaneously modeling users' multiple types of behaviors or jointly optimizing multiple objectives (e.g. both Click-through rate and Conversion rate), which are both vital for e-commerce sites. In this paper, we argue that it is crucial to formulate users' different interests based on multiple types of behaviors and perform multi-task learning for significant improvement in multiple objectives simultaneously. We propose Deep Multifaceted Transformers (DMT), a novel framework that can model users' multiple types of behavior sequences simultaneously with multiple Transformers. It utilizes Multi-gate Mixture-of-Experts to optimize multiple objectives. Besides, it exploits unbiased learning to reduce the selection bias in the training data. Experiments on a real JD production dataset demonstrate the effectiveness of DMT, which significantly outperforms state-of-the-art methods. DMT has been successfully deployed to serve the main traffic in the commercial Recommender System in JD.com. To facilitate future research, we release the code and datasets at https://github.com/guyulongcs/CIKM2020_DMT.
Published: 2020
Full Text: View/download PDF

9. Pseudo Dyna-Q

Author: Weidong Liu, Ting Bai, Long Xia, Pan Du, Dawei Yin, Jian-Yun Nie, Lixin Zou, and Zhuo Zhang
Subjects: Selection bias, Computer science, Variance (accounting), Recommender system, Machine learning, Convergence (routing), Offline learning, Reinforcement learning, Artificial intelligence, Temporal difference learning, Function (engineering)
Abstract: Applying reinforcement learning (RL) in recommender systems is attractive but costly due to the constraint of the interaction with real customers, where performing online policy learning through interacting with real customers usually harms customer experiences. A practical alternative is to build a recommender agent offline from logged data, but directly using logged data offline leads to the problem of selection bias between the logging policy and the recommendation policy. The existing direct offline learning algorithms, such as Monte Carlo methods and temporal difference methods, are either computationally expensive or unstable in convergence. To address these issues, we propose Pseudo Dyna-Q (PDQ). In PDQ, instead of interacting with real customers, we resort to a customer simulator, referred to as the World Model, which is designed to simulate the environment and handle the selection bias of logged data. During policy improvement, the World Model is constantly updated and optimized adaptively, according to the current recommendation policy. This way, the proposed PDQ not only avoids the instability of convergence and high computation cost of existing approaches but also provides unlimited interactions without involving real customers. Moreover, a proven upper bound on the empirical error of the reward function guarantees that the learned offline policy has lower bias and variance. Extensive experiments demonstrated the advantages of PDQ on two real-world datasets against state-of-the-art methods.
Published: 2020
Full Text: View/download PDF

10. Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems

Author: Weidong Liu, Jiaxing Song, Dawei Yin, Zhuoye Ding, Long Xia, and Lixin Zou
Subjects: FOS: Computer and information sciences, Computer science, Supervised learning, Recommender system, Computer Science - Information Retrieval, Term (time), User engagement, Human–computer interaction, Reinforcement learning, Information Retrieval (cs.IR)
Abstract: Recommender systems play a crucial role in our daily lives. The feed streaming mechanism has been widely used in recommender systems, especially in mobile apps. The feed streaming setting provides users with an interactive form of recommendation in never-ending feeds. In such an interactive setting, a good recommender system should pay more attention to user stickiness, which goes far beyond classical instant metrics and is typically measured by long-term user engagement. Directly optimizing long-term user engagement is a non-trivial problem, as the learning target is usually not available for conventional supervised learning methods. Though reinforcement learning (RL) naturally fits the problem of maximizing long-term rewards, applying RL to optimize long-term user engagement still faces challenges: user behaviors are versatile and difficult to model, typically consisting of both instant feedback (e.g. clicks, ordering) and delayed feedback (e.g. dwell time, revisit); in addition, performing effective off-policy learning is still immature, especially when combining bootstrapping and function approximation. To address these issues, in this work, we introduce a reinforcement learning framework, FeedRec, to optimize long-term user engagement. FeedRec includes two components: 1) a Q-Network, designed as a hierarchical LSTM, which takes charge of modeling complex user behaviors, and 2) an S-Network, which simulates the environment, assists the Q-Network, and avoids the instability of convergence in policy learning. Extensive experiments on synthetic data and real-world large-scale data show that FeedRec effectively optimizes long-term user engagement and outperforms state-of-the-art methods.
Published: 2019
Full Text: View/download PDF

11. CTRec

Author: Ji-Rong Wen, Lixin Zou, Ting Bai, Weidong Liu, Wayne Xin Zhao, Pan Du, and Jian-Yun Nie
Subjects: Process (engineering), Computer science, Mechanism (biology), Product (category theory), Industrial engineering
Abstract: In e-commerce, users' demands are not only conditioned by their profile and preferences, but also by their recent purchases that may generate new demands, as well as periodical demands that depend on purchases made some time ago. We call them respectively short-term demands and long-term demands. In this paper, we propose a novel self-attentive Continuous-Time Recommendation model (CTRec) for capturing the evolving demands of users over time. For modeling such time-sensitive demands, a Demand-aware Hawkes Process (DHP) framework is designed in CTRec to learn from the discrete purchase records of users. More specifically, a convolutional neural network is utilized to capture the short-term demands; and a self-attention mechanism is employed to capture the periodical purchase cycles of long-term demands. All types of demands are fused in DHP to make final continuous-time recommendations. We conduct extensive experiments on four real-world commercial datasets to demonstrate that CTRec is effective for general sequential recommendation problems, including next-item and next-session/basket recommendations. We observe in particular that CTRec is capable of learning the purchase cycles of products and estimating the purchase time of a product given a user.
Published: 2019
Full Text: View/download PDF
