Author: "Shroff, Ness B." / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Shroff, Ness B."' showing total 6 results


1. How to Find the Exact Pareto Front for Multi-Objective MDPs?

Author: Li, Yining, Ju, Peizhong, and Shroff, Ness B.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Multi-objective Markov Decision Processes (MDPs) are receiving increasing attention, as real-world decision-making problems often involve conflicting objectives that cannot be addressed by a single-objective MDP. The Pareto front identifies the set of policies that cannot be dominated, providing a foundation for finding optimal solutions that can efficiently adapt to various preferences. However, finding the Pareto front is a highly challenging problem. Most existing methods either (i) rely on traversing the continuous preference space, which is impractical and results in approximations that are difficult to evaluate against the true Pareto front, or (ii) focus solely on deterministic Pareto optimal policies, from which there are no known techniques to characterize the full Pareto front. Moreover, finding the structure of the Pareto front itself remains unclear even in the context of dynamic programming. This work addresses the challenge of efficiently discovering the Pareto front. By investigating the geometric structure of the Pareto front in MO-MDP, we uncover a key property: the Pareto front is on the boundary of a convex polytope whose vertices all correspond to deterministic policies, and neighboring vertices of the Pareto front differ by only one state-action pair of the deterministic policy, almost surely. This insight transforms the global comparison across all policies into a localized search among deterministic policies that differ by only one state-action pair, drastically reducing the complexity of searching for the exact Pareto front. We develop an efficient algorithm that identifies the vertices of the Pareto front by solving a single-objective MDP only once and then traversing the edges of the Pareto front, making it more efficient than existing methods. Our empirical studies demonstrate the effectiveness of our theoretical strategy in discovering the Pareto front.
Published: 2024
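
The dominance relation mentioned in the abstract above can be illustrated with a brute-force filter. This is only a sketch of the definition, not the edge-traversal algorithm the paper describes. It assumes each candidate policy is already summarized by a tuple of expected returns (one per objective, higher is better); the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is at least as good as v on every objective and strictly better on one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(returns: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the return vectors that no other vector dominates (O(n^2) comparison)."""
    return [u for u in returns if not any(dominates(v, u) for v in returns)]


# Hypothetical two-objective return vectors for four policies.
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0)]
front = pareto_front(candidates)  # (1.5, 3.0) is dominated by (2.0, 4.0)
```

A global comparison like this checks every pair of candidates, which is the cost the abstract says the localized search among neighboring deterministic policies is meant to avoid.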

2. Theory on Mixture-of-Experts in Continual Learning

Author: Li, Hongbo, Lin, Sen, Duan, Lingjie, Liang, Yingbin, and Shroff, Ness B.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks. The Mixture-of-Experts (MoE) model has recently been shown to effectively mitigate catastrophic forgetting in CL, by employing a gating network to sparsify and distribute diverse tasks among multiple experts. However, there is a lack of theoretical analysis of MoE and its impact on the learning performance in CL. This paper provides the first theoretical results to characterize the impact of MoE in CL via the lens of overparameterized linear regression tasks. We establish the benefit of MoE over a single expert by proving that the MoE model can diversify its experts to specialize in different tasks, while its router learns to select the right expert for each task and balance the loads across all experts. Our study further suggests an intriguing fact that the MoE in CL needs to terminate the update of the gating network after sufficient training rounds to attain system convergence, which is not needed in the existing MoE studies that do not consider the continual task arrival. Furthermore, we provide explicit expressions for the expected forgetting and overall generalization error to characterize the benefit of MoE in the learning performance in CL. Interestingly, adding more experts requires additional rounds before convergence, which may not enhance the learning performance. Finally, we conduct experiments on both synthetic and real datasets to extend these insights from linear models to deep neural networks (DNNs), which also shed light on the practical algorithm design for MoE in CL.
Published: 2024

3. Adversarial Online Reinforcement Learning Under Limited Defender Resources

Author: Shi, Ming, Liang, Yingbin, and Shroff, Ness B.
Series Editor: Jajodia, Sushil, Samarati, Pierangela, Lopez, Javier, and Vaidya, Jaideep
Editor: Chen, Yingying, Wu, Jie, Yu, Paul, and Wang, Xiaogang
Published: 2024
Full Text: View/download PDF

4. Linear Bandits With Side Observations on Networks

Author: Kar, Avik, Singh, Rahul, Liu, Fang, Liu, Xin, and Shroff, Ness B.
Abstract: We investigate linear bandits in a network setting in the presence of side-observations across nodes in order to design recommendation algorithms for users connected via social networks. Users in social networks respond to their friends’ activity and, hence, provide information about each other’s preferences. In our model, when a learning algorithm recommends an article to a user, not only does it observe her response (e.g., an ad click) but also the side-observations, i.e., the response of her neighbors if they were presented with the same article. We model these observation dependencies by a graph $\mathcal{G}$ in which nodes correspond to users and edges to social links. We derive a problem/instance-dependent lower-bound on the regret of any consistent algorithm. We propose an optimization-based data-driven learning algorithm that utilizes the structure of $\mathcal{G}$ in order to make recommendations to users and show that it is asymptotically optimal, in the sense that its regret matches the lower-bound as the number of rounds $T \to \infty$. We show that this asymptotically optimal regret is upper-bounded as $O(|\chi(\mathcal{G})| \log T)$, where $|\chi(\mathcal{G})|$ is the domination number of $\mathcal{G}$. In contrast, a naive application of the existing learning algorithms results in $O(N \log T)$ regret, where $N$ is the number of users.
Published: 2024
Full Text: View/download PDF
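
To make the gap between the domination number and the number of users concrete, the following sketch computes the domination number of a small graph by exhaustive search. It is an illustration of the graph quantity only, not of the paper's learning algorithm. The adjacency-dict representation and the name `domination_number` are assumptions, and the search is exponential, so it suits toy graphs only.

```python
from itertools import combinations
from typing import Dict, Hashable, Set


def domination_number(adj: Dict[Hashable, Set[Hashable]]) -> int:
    """Size of the smallest node set whose closed neighborhoods cover every node.

    `adj` maps each node to the set of its neighbors (undirected graph,
    every neighbor also appears as a key).
    """
    nodes = list(adj)
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            covered = set(subset)
            for u in subset:
                covered.update(adj[u])
            if len(covered) == len(nodes):
                return k
    return 0  # empty graph


# Hypothetical 5-user star network: one hub connected to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# Hypothetical 5-user path network.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On the star graph the hub alone dominates all five users, so the domination number is 1 while the number of users is 5; on the path it is 2.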

5. Minimizing Edge Caching Service Costs Through Regret-Optimal Online Learning

Author: Quan, Guocong, Eryilmaz, Atilla, and Shroff, Ness B.
Abstract: Edge caching has been widely implemented to efficiently serve data requests from end users. Numerous edge caching policies have been proposed to adaptively update the cache contents based on various statistics. One critical statistic is the miss cost, which could measure the latency or the bandwidth/energy consumption to resolve the cache miss. Existing caching policies typically assume that the miss cost for each data item is fixed and known. However, in real systems, they could be random with unknown statistics. A promising approach would be to use online learning to estimate the unknown statistics of these random costs, and make caching decisions adaptively. Unfortunately, conventional learning techniques cannot be directly applied, because the caching problem has additional cache capacity and cache update constraints that are not covered in traditional learning settings. In this work, we resolve these issues by developing a novel edge caching policy that learns uncertain miss costs efficiently, and is shown to be asymptotically optimal. We first derive an asymptotic lower bound on the achievable regret. We then design a Kullback-Leibler lower confidence bound (KL-LCB) based edge caching policy, which adaptively learns the random miss costs by following the “optimism in the face of uncertainty” principle. By employing a novel analysis that accounts for the new constraints and the dynamics of the setting, we prove that the regret of the proposed policy matches the regret lower bound, thus showing asymptotic optimality. Further, via numerical experiments we demonstrate the performance improvements of our policy over natural benchmarks.
Published: 2024
Full Text: View/download PDF
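
A KL-based lower confidence bound of the kind named in the abstract above can be sketched as follows. The abstract does not give the exact index, so this is the standard KL-UCB construction mirrored for costs, under two assumptions made here: miss costs are Bernoulli in [0, 1], and the exploration budget is log(t). The function names are hypothetical.

```python
import math


def bernoulli_kl(p: float, q: float) -> float:
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_lcb(mean: float, n: int, t: int, iters: int = 50) -> float:
    """Smallest q in [0, mean] with n * KL(mean, q) <= log(t), found by bisection.

    `mean` is the empirical mean cost of an item after `n` observations at round `t`.
    An unobserved item gets the most optimistic cost estimate, 0.
    """
    if n == 0:
        return 0.0
    budget = math.log(max(t, 1)) / n
    lo, hi = 0.0, mean
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) > budget:
            lo = mid  # mid is too far below the mean to be plausible
        else:
            hi = mid  # mid is still within the confidence region
    return hi
```

Because a lower bound on a cost is the optimistic estimate, a policy built on this index would favor evicting or skipping items whose costs could plausibly be small. With more observations the bound tightens toward the empirical mean: for example, `kl_lcb(0.5, 1000, 100)` is larger than `kl_lcb(0.5, 10, 100)`.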

6. Optimal Edge Caching For Individualized Demand Dynamics

Author: Quan, Guocong, Eryilmaz, Atilla, and Shroff, Ness B.
Published: 2024
Full Text: View/download PDF
