198 results for "Ambuj K Singh"
Search Results
2. Tree++: Truncated Tree Based Graph Kernels
- Author
-
Rachel Redberg, Ambuj K. Singh, Zhen Wang, and Wei Ye
- Subjects
Social and Information Networks (cs.SI), Machine Learning (cs.LG), Machine Learning (stat.ML), Graph kernel, Graph classification, Graph-structured data, Data structure, Combinatorics, Computational Theory and Mathematics, Computer Science Applications, Information Systems - Abstract
Graph-structured data arise ubiquitously in many application domains. A fundamental problem is to quantify their similarities. Graph kernels are often used for this purpose: they decompose graphs into substructures and compare these substructures. However, most existing graph kernels lack the property of scale-adaptivity, i.e., they cannot compare graphs at multiple levels of granularity. Many real-world graphs, such as molecules, exhibit structure at varying levels of granularity. To tackle this problem, we propose a new graph kernel called Tree++ in this paper. At the heart of Tree++ is a graph kernel called the path-pattern graph kernel, which first builds a truncated BFS tree rooted at each vertex and then uses the paths from the root to every vertex in the truncated BFS tree as features to represent graphs. The path-pattern graph kernel can only capture graph similarity at fine granularity. To capture graph similarity at coarse granularity, we incorporate into it a new concept called the super path, whose elements are truncated BFS trees rooted at the vertices in a path. Our evaluation on a variety of real-world graphs demonstrates that Tree++ achieves the best classification accuracy compared with previous graph kernels.
- Published
- 2021
- Full Text
- View/download PDF
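The path-pattern construction in the Tree++ abstract above can be sketched in a few lines. This is a minimal illustration of the fine-granularity kernel only: the depth, label encoding, and plain dot-product counting are simplifying assumptions, and super paths are omitted.

```python
from collections import Counter

def truncated_bfs_paths(adj, labels, root, depth):
    """Label sequences of all root-to-vertex paths in a truncated BFS tree."""
    paths = [(labels[root],)]
    visited = {root}
    frontier = [(root, (labels[root],))]
    for _ in range(depth):
        nxt = []
        for v, path in frontier:
            for u in adj[v]:
                if u not in visited:       # BFS tree: each vertex appears once
                    visited.add(u)
                    p = path + (labels[u],)
                    paths.append(p)
                    nxt.append((u, p))
        frontier = nxt
    return paths

def path_pattern_kernel(g1, g2, depth=2):
    """Compare two (adjacency, labels) graphs via the dot product of their
    path-pattern count vectors (fine-granularity similarity only)."""
    feats = []
    for adj, labels in (g1, g2):
        counts = Counter()
        for root in adj:
            counts.update(truncated_bfs_paths(adj, labels, root, depth))
        feats.append(counts)
    return sum(feats[0][p] * feats[1][p] for p in feats[0])
```

On a single labeled edge A–B with depth 1, each endpoint contributes its own label plus one two-label path, so the graph's self-kernel counts four matching features.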
3. Predictive models for human-AI nexus in group decision making
- Author
-
Omid Askarisichani, Francesco Bullo, Noah E. Friedkin, and Ambuj K. Singh
- Subjects
Machine Learning, History and Philosophy of Science, Artificial Intelligence, General Neuroscience, Decision Making, Humans, General Biochemistry, Genetics and Molecular Biology, Algorithms - Abstract
Machine learning (ML) and artificial intelligence (AI) have had a profound impact on our lives. Domains like health and learning are naturally helped by human-AI interactions and decision making. In these areas, as ML algorithms prove their value in making important decisions, humans add their distinctive expertise and judgment on social and interpersonal issues that need to be considered in tandem with algorithmic inputs of information. Some questions naturally arise. What rules and regulations should be invoked on the employment of AI, and what protocols should be in place to evaluate available AI resources? What are the forms of effective communication and coordination with AI that best promote effective human-AI teamwork? In this review, we highlight factors that we believe are especially important in assembling and managing human-AI decision making in a group setting.
- Published
- 2022
4. Author response for 'Predictive models for human–AI nexus in group decision making'
- Author
-
Omid Askarisichani, Francesco Bullo, Noah E. Friedkin, and Ambuj K. Singh
- Published
- 2022
- Full Text
- View/download PDF
5. Deep Representations for Time-varying Brain Datasets
- Author
-
Sikun Lin, Shuyun Tang, Scott T. Grafton, and Ambuj K. Singh
- Subjects
Machine Learning (cs.LG), Neurons and Cognition (q-bio.NC) - Abstract
Finding an appropriate representation of dynamic activities in the brain is crucial for many downstream applications. Due to its highly dynamic nature, temporally averaged fMRI (functional magnetic resonance imaging) can only provide a narrow view of underlying brain activities. Previous works lack the ability to learn and interpret the latent dynamics in brain architectures. This paper builds an efficient graph neural network model that incorporates both region-mapped fMRI sequences and structural connectivities obtained from DWI (diffusion-weighted imaging) as inputs. We find good representations of the latent brain dynamics through learning sample-level adaptive adjacency matrices and performing a novel multi-resolution inner cluster smoothing. We also attribute inputs with integrated gradients, which enables us to infer (1) highly involved brain connections and subnetworks for each task, (2) temporal keyframes of imaging sequences that characterize tasks, and (3) subnetworks that discriminate between individual subjects. This ability to identify critical subnetworks that characterize signal states across heterogeneous tasks and individuals is of great importance to neuroscience and other scientific domains. Extensive experiments and ablation studies demonstrate our proposed method's superiority and efficiency in spatial-temporal graph signal modeling with insightful interpretations of brain dynamics.
- Published
- 2022
- Full Text
- View/download PDF
6. A Broader Picture of Random-walk Based Graph Embedding
- Author
-
Zexi Huang, Ambuj K. Singh, and Arlei Silva
- Subjects
Machine Learning (cs.LG), Theoretical computer science, Graph embedding, Dot product, Pointwise mutual information, Ranking (information retrieval), Autocovariance, Similarity, Embedding - Abstract
Graph embedding based on random walks supports effective solutions for many graph-related downstream tasks. However, the abundance of embedding literature has made it increasingly difficult to compare existing methods and to identify opportunities to advance the state-of-the-art. Meanwhile, existing work has left several fundamental questions -- such as how embeddings capture different structural scales and how they should be applied for effective link prediction -- unanswered. This paper addresses these challenges with an analytical framework for random-walk based graph embedding that consists of three components: a random-walk process, a similarity function, and an embedding algorithm. Our framework not only categorizes many existing approaches but naturally motivates new ones. With it, we illustrate novel ways to incorporate embeddings at multiple scales to improve downstream task performance. We also show that embeddings based on autocovariance similarity, when paired with dot product ranking for link prediction, outperform state-of-the-art methods based on Pointwise Mutual Information similarity by up to 100%. (Accepted to KDD 2021.)
- Published
- 2021
- Full Text
- View/download PDF
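The autocovariance similarity paired with dot-product link ranking, as described in the abstract above, can be sketched as follows. The eigendecomposition stands in for the embedding algorithm, and the stationary distribution formula assumes a connected undirected graph; this is an illustrative sketch under those assumptions, not the paper's implementation.

```python
import numpy as np

def autocovariance_embedding(A, tau=3, dim=2):
    """Embed nodes via the autocovariance of a simple random walk:
    R(tau) = diag(pi) M^tau - pi pi^T, factored by eigendecomposition."""
    d = A.sum(axis=1)
    M = A / d[:, None]                    # random-walk transition matrix
    pi = d / d.sum()                      # stationary distribution (undirected)
    R = np.diag(pi) @ np.linalg.matrix_power(M, tau) - np.outer(pi, pi)
    R = (R + R.T) / 2                     # symmetrize before eigendecomposition
    w, V = np.linalg.eigh(R)
    idx = np.argsort(w)[::-1][:dim]       # keep top-dim spectral components
    return V[:, idx] * np.sqrt(np.abs(w[idx]))

def link_score(emb, i, j):
    """Dot-product ranking score for link prediction."""
    return float(emb[i] @ emb[j])
```

Candidate links can then be ranked by `link_score` over all non-adjacent pairs.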
7. Modulating ALS-Related Amyloidogenic TDP-43(307–319) Oligomeric Aggregates with Computationally Derived Therapeutic Molecules
- Author
-
Nicole M. Marsh, Dezmond Bishop, Michael T. Bowers, Kristi Lazar Cantrell, Veronica Laos, Christian A. Lang, Steven K. Buratto, and Ambuj K. Singh
- Subjects
Drug, Mutation, Atomic force microscopy, Computational biology, Biochemistry, Oligomer, Molecule, Pharmacophore, Amyotrophic lateral sclerosis, Frontotemporal dementia - Abstract
TDP-43 aggregates are a salient feature of amyotrophic lateral sclerosis (ALS), frontotemporal dementia (FTD), and a variety of other neurodegenerative diseases, including Alzheimer's disease (AD). With an anticipated growth in the most susceptible demographic, projections predict neurodegenerative diseases will potentially affect 15 million people in the United States by 2050. Currently, there are no cures for ALS, FTD, or AD. Previous studies of the amyloidogenic core of TDP-43 have demonstrated that oligomers greater than a trimer are associated with toxicity. Utilizing a joint pharmacophore space (JPS) method, potential drugs have been designed specifically for amyloid-related diseases. These molecules were generated on the basis of key chemical features necessary for blood-brain barrier permeability, low adverse side effects, and target selectivity. Combining ion-mobility mass spectrometry and atomic force microscopy with the JPS computational method allows us to more efficiently evaluate a potential drug's efficacy in disrupting the development of putative toxic species. Our results demonstrate the dissociation of higher-order oligomers in the presence of these novel JPS-generated inhibitors into smaller oligomer species. Additionally, drugs approved by the Food and Drug Administration for the treatment of ALS were also evaluated and demonstrated to maintain higher-order oligomeric assemblies. Possible mechanisms for the observed action of the JPS molecules are discussed.
- Published
- 2019
- Full Text
- View/download PDF
8. Structural balance emerges and explains performance in risky decision-making
- Author
-
Brian Uzzi, Omid Askarisichani, Noah E. Friedkin, Ambuj K. Singh, Jacqueline N. Lane, and Francesco Bullo
- Subjects
Structural balance, Decision Making, Risk-Taking, Social Networking, Text Messaging, Sociology, Economics, Commerce, Polarization (politics), Markov Chains, Social structure, Multidisciplinary - Abstract
Polarization affects many forms of social organization. A key issue focuses on which affective relationships are prone to change and how their change relates to performance. In this study, we analyze a financial institution over a two-year period that employed 66 day traders, focusing on links between changes in affective relations and trading performance. Traders’ affective relations were inferred from their IMs (>2 million messages) and trading performance was measured from profit and loss statements (>1 million trades). Here, we find that triads of relationships, the building blocks of larger social structures, have a propensity towards affective balance, but one unbalanced configuration resists change. Further, balance is positively related to performance. Traders with balanced networks have the “hot hand”, showing streaks of high performance. Research implications focus on how changes in polarization relate to performance and how polarized states can depolarize. How do socially polarized systems change, and how does a change in polarization relate to performance? Using instant messaging data and performance records from day traders, the authors find that certain relations are prone to balance and that balance is associated with better trading decisions.
- Published
- 2019
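The notion of triadic affective balance analyzed in the study above can be illustrated with a small helper: a closed triad is balanced when the product of its three edge signs is positive. The dictionary encoding of signed ties is an assumption for illustration.

```python
from itertools import combinations

def balance_ratio(signs):
    """Fraction of closed triads whose sign product is positive (balanced).
    `signs` maps undirected pairs (i, j) to +1 (positive tie) or -1 (negative)."""
    nodes = set()
    for i, j in signs:
        nodes.update((i, j))

    def sign(a, b):
        return signs.get((a, b)) or signs.get((b, a))

    balanced = total = 0
    for a, b, c in combinations(sorted(nodes), 3):
        s1, s2, s3 = sign(a, b), sign(b, c), sign(a, c)
        if None in (s1, s2, s3):
            continue                      # open triad: not a closed triangle
        total += 1
        balanced += (s1 * s2 * s3 > 0)
    return balanced / total if total else 0.0
```

Tracking this ratio over time windows is one simple way to observe a drift toward balance.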
9. A Game Theoretic Approach For Core Resilience
- Author
-
Tiyani Ma, Arlei Silva, Sourav Medya, and Ambuj K. Singh
- Subjects
Core (game theory) ,Game theoretic ,Computer science ,Resilience (network) ,Computer security ,computer.software_genre ,computer - Abstract
K-cores are maximal induced subgraphs where all vertices have degree at least k. These dense patterns have applications in community detection, network visualization, and protein function prediction. However, k-cores can be quite unstable to network modifications, which motivates the question: How resilient is the k-core structure of a network, such as the Web or Facebook, to edge deletions? We investigate this question from an algorithmic perspective. More specifically, we study the problem of computing a small set of edges whose removal minimizes the k-core structure of a network. This paper provides a comprehensive characterization of the hardness of the k-core minimization problem (KCM), including inapproximability and parameterized complexity. Motivated by these challenges, we propose a novel algorithm inspired by the Shapley value, a cooperative game-theoretic concept, that is able to leverage the strong interdependencies in the effects of edge removals in the search space. We efficiently approximate Shapley values using a randomized algorithm with probabilistic guarantees. Our experiments show that the proposed algorithm outperforms competing solutions in terms of k-core minimization while being able to handle large graphs. Moreover, we illustrate how KCM can be applied in the analysis of the k-core resilience of networks.
- Published
- 2020
- Full Text
- View/download PDF
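The randomized Shapley-value approximation described in the abstract can be sketched by sampling permutations of candidate edge deletions and averaging each edge's marginal k-core shrinkage. The peeling subroutine and the naive sampling scheme below are a simplified illustration, not the paper's algorithm.

```python
import random

def kcore_size(n, edges, k):
    """Number of vertices in the k-core, by iterative peeling."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if len(adj[v] & alive) < k:   # peel vertices below degree k
                alive.discard(v)
                changed = True
    return len(alive)

def shapley_kcore(n, edges, candidates, k, samples=200, seed=0):
    """Monte Carlo Shapley values of candidate edge deletions with respect
    to k-core shrinkage (marginal contributions over random permutations)."""
    rng = random.Random(seed)
    base = kcore_size(n, edges, k)
    phi = {e: 0.0 for e in candidates}
    for _ in range(samples):
        order = candidates[:]
        rng.shuffle(order)
        removed = set()
        prev = base
        for e in order:
            removed.add(e)
            cur = kcore_size(n, [x for x in edges if x not in removed], k)
            phi[e] += prev - cur          # marginal k-core shrinkage
            prev = cur
    return {e: v / samples for e, v in phi.items()}
```

By the efficiency property, the sampled values always sum to the total shrinkage obtained by deleting all candidates.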
10. Learning Deep Graph Representations via Convolutional Neural Networks
- Author
-
Ambuj K. Singh, Omid Askarisichani, Alex T. Jones, and Wei Ye
- Subjects
Machine Learning (cs.LG), Machine Learning (stat.ML), Theoretical computer science, Feature vector, Convolutional neural network, Graph classification, Computational Theory and Mathematics, Computer Science Applications, Information Systems - Abstract
Graph-structured data arise in many scenarios. A fundamental problem is to quantify the similarities of graphs for tasks such as classification. R-convolution graph kernels are positive-semidefinite functions that decompose graphs into substructures and compare them. One problem in the effective implementation of this idea is that the substructures are not independent, which leads to high-dimensional feature space. In addition, graph kernels cannot capture the high-order complex interactions between vertices. To mitigate these two problems, we propose a framework called DeepMap to learn deep representations for graph feature maps. The learned deep representation for a graph is a dense and low-dimensional vector that captures complex high-order interactions in a vertex neighborhood. DeepMap extends Convolutional Neural Networks (CNNs) to arbitrary graphs by generating aligned vertex sequences and building the receptive field for each vertex. We empirically validate DeepMap on various graph classification benchmarks and demonstrate that it achieves state-of-the-art performance. (arXiv admin note: text overlap with arXiv:2002.09846.)
- Published
- 2020
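The aligned vertex sequences and per-vertex receptive fields that DeepMap feeds to a CNN can be illustrated roughly as below. The degree-based ordering, the fixed field width of k+1, and the 'PAD' token are simplifying assumptions standing in for the paper's construction, and the CNN itself is omitted.

```python
def receptive_fields(adj, labels, k=3):
    """Build an aligned vertex sequence (sorted by degree, ties by label) and
    a fixed-size receptive field of labels per vertex, as raw CNN input."""
    order = sorted(adj, key=lambda v: (-len(adj[v]), labels[v]))
    fields = []
    for v in order:
        neigh = sorted(adj[v], key=lambda u: (-len(adj[u]), labels[u]))[:k]
        field = [labels[v]] + [labels[u] for u in neigh]
        field += ['PAD'] * (k + 1 - len(field))   # pad to fixed width k+1
        fields.append(field)
    return fields
```

Each row has the same width, so the stacked fields form a fixed-shape matrix that a 1D convolution could consume after one-hot encoding.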
11. DANR: Discrepancy-aware Network Regularization
- Author
-
Ambuj K. Singh, Furkan Kocayusufoglu, and Hongyuan You
- Subjects
Evolving networks, Computer science, Regularization (mathematics), Algorithm - Abstract
Network regularization is an effective tool for incorporating structural prior knowledge to learn coherent models over networks, and has yielded provably accurate estimates in applications ranging from spatial economics to neuroimaging studies. Recently, there has been an increasing interest in extending network regularization to the spatio-temporal case to accommodate the evolution of networks. However, in both static and spatio-temporal cases, missing or corrupted edge weights can compromise the ability of network regularization to discover desired solutions. To address these gaps, we propose a novel approach---{\it discrepancy-aware network regularization} (DANR)---that is robust to inadequate regularizations and effectively captures model evolution and structural changes over spatio-temporal networks. We develop a distributed and scalable algorithm based on the alternating direction method of multipliers (ADMM) to solve the proposed problem with guaranteed convergence to global optimum solutions. Experimental results on both synthetic and real-world networks demonstrate that our approach achieves improved performance on various tasks, and enables interpretation of model changes in evolving networks.
- Published
- 2020
- Full Text
- View/download PDF
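The core idea of network regularization, before DANR's discrepancy-aware and spatio-temporal extensions, can be illustrated with a graph-Laplacian smoother that has a closed-form solution. The squared penalty and direct linear solve are illustrative simplifications; DANR itself solves a different objective via ADMM.

```python
import numpy as np

def network_smooth(A, y, lam=1.0):
    """Network-regularized estimates:
        argmin_x ||x - y||^2 + lam * x^T L x,
    where L is the combinatorial graph Laplacian. Solved in closed form as
    (I + lam * L) x = y, which shrinks neighboring estimates toward each other."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.solve(np.eye(len(y)) + lam * L, y)
```

For two connected nodes with observations 0 and 2 and lam = 1, the solution pulls both values toward their common mean.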
12. Incorporating User's Preference into Attributed Graph Clustering
- Author
-
Wei Ye, Christian Bohm, Ambuj K. Singh, Dominik Mautz, and Claudia Plant
- Subjects
Machine Learning (cs.LG), Machine Learning (stat.ML), Human-Computer Interaction (cs.HC), Social and Information Networks (cs.SI), Vertex (graph theory), Cluster analysis, Clustering coefficient, Unimodality, Compact space, Subspace topology, Computational Theory and Mathematics, Computer Science Applications, Information Systems - Abstract
Graph clustering has been studied extensively on both plain graphs and attributed graphs. However, all these methods need to partition the whole graph to find cluster structures. Sometimes, based on domain knowledge, people may have information about a specific target region in the graph and only want to find a single cluster concentrated on this local region. Such a task is called local clustering. In contrast to global clustering, local clustering aims to find only one cluster that is concentrated on the given seed vertex (and also on the designated attributes for attributed graphs). Currently, very few methods can deal with this kind of task. To this end, we propose two quality measures for a local cluster: Graph Unimodality (GU) and Attribute Unimodality (AU). The former measures the homogeneity of the graph structure while the latter measures the homogeneity of the subspace that is composed of the designated attributes. We call their linear combination Compactness. Further, we propose LOCLU to optimize the Compactness score. The local cluster detected by LOCLU concentrates on the region of interest, provides efficient information flow in the graph, and exhibits a unimodal data distribution in the subspace of the designated attributes.
- Published
- 2020
- Full Text
- View/download PDF
13. Making a Small World Smaller: Path Optimization in Networks
- Author
-
Ambuj K. Singh, Petko Bogdanov, and Sourav Medya
- Subjects
Mathematical optimization, Social network, Network packet, Node (networking), Network delay, Propagation delay, Probabilistic logic, Telecommunications network, Network planning and design, Path (graph theory), Algorithm design, Greedy algorithm, Computational Theory and Mathematics, Computer Science Applications, Information Systems - Abstract
Reduction of end-to-end network delay is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks, and increased rate of packets in the case of communication networks. Delay reduction can be achieved by both improving the propagation capabilities of individual nodes and adding additional edges in the network. One of the main challenges in such network design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search-space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. We consider the problem of network delay minimization via node upgrades. We show that the problem is NP-hard and prove strong inapproximability results about it (i.e., APX-hard) even for equal vertex delays. On the positive side, probabilistic approximations for a restricted version of the problem can be obtained. We propose a greedy heuristic to solve the general problem setting which has good quality in practice, but does not scale to very large instances. To enable scalability to real-world networks, we develop approximations for Greedy with probabilistic guarantees for every iteration, tailored to different models of delay distribution and network structures. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality. We evaluate our approaches on several real-world graphs from different genres. We achieve up to two orders of magnitude speed-up compared to alternatives from the literature on moderate size networks, and obtain high-quality results in minutes on large datasets while competitors from the literature require more than four hours.
- Published
- 2018
- Full Text
- View/download PDF
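The greedy heuristic for delay minimization via node upgrades can be sketched as below. The upgrade semantics (an upgrade zeroes a node's delay) and the node-delay shortest-path model are assumptions for illustration; the paper's scalable variants replace the exact evaluation with probabilistic approximations.

```python
import heapq

def total_delay(adj, delay):
    """Sum of shortest-path delays over all ordered (source, target) pairs,
    where a path's delay is the sum of the delays of its nodes (source included)."""
    total = 0
    for s in adj:
        dist = {s: delay[s]}
        pq = [(delay[s], s)]
        while pq:                          # Dijkstra over node delays
            d, v = heapq.heappop(pq)
            if d > dist.get(v, float('inf')):
                continue
            for u in adj[v]:
                nd = d + delay[u]
                if nd < dist.get(u, float('inf')):
                    dist[u] = nd
                    heapq.heappush(pq, (nd, u))
        total += sum(dist.values())
    return total

def greedy_upgrades(adj, delay, k):
    """Greedily upgrade k nodes, each time picking the node whose upgrade
    (here: setting its delay to zero) most reduces the cumulative delay."""
    delay = dict(delay)
    picks = []
    for _ in range(k):
        best = min(adj, key=lambda v: total_delay(adj, {**delay, v: 0}))
        delay[best] = 0
        picks.append(best)
    return picks
```

On a star graph with uniform delays, the center is the first node the heuristic upgrades, since every leaf-to-leaf path passes through it.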
14. Noticeable network delay minimization via node upgrades
- Author
-
Jithin Vachery, Sourav Medya, Ambuj K. Singh, and Sayan Ranu
- Subjects
Computer science, Distributed computing, Network packet, Node (networking), Network delay, General Engineering, Telecommunications network, Network planning and design, Reduction (complexity), Data flow diagram, Importance sampling - Abstract
In several domains, the flow of data is governed by an underlying network. Reduction of delays in end-to-end data flow is an important network optimization task. Reduced delays enable shorter travel times for vehicles in road networks, faster information flow in social networks, and increased rate of packets in communication networks. While techniques for network delay minimization have been proposed, they fail to provide any noticeable reduction in individual data flows. Furthermore, they treat all nodes as equally important, which is often not the case in real-world networks. In this paper, we incorporate these practical aspects and propose a network design problem where the goal is to perform k network upgrades such that it maximizes the number of flows in the network with a noticeable reduction in delay. We show that the problem is NP-hard, APX-hard, and non-submodular. We overcome these computational challenges by designing an importance sampling based algorithm with provable quality guarantees. Through extensive experiments on real and synthetic data sets, we establish that importance sampling imparts up to 1000 times speed-up over the greedy approach, and provides up to 70 times the improvement achieved by the state-of-the-art technique.
- Published
- 2018
- Full Text
- View/download PDF
15. Polar Opinion Dynamics in Social Networks
- Author
-
Victor Amelkin, Ambuj K. Singh, and Francesco Bullo
- Subjects
Lyapunov function, Persuasion, Dynamical systems theory, Social network, Nonlinear system, Context model, Control and Systems Engineering, Computer Science Applications, Electrical and Electronic Engineering, Mathematical economics - Abstract
For decades, scientists have studied opinion formation in social networks, where information travels via word of mouth. The particularly interesting case is when polar opinions (Democrats versus Republicans, or iOS versus Android) compete in the network. The central problem is to design and analyze a model that captures how polar opinions evolve in the real world. In this paper, we propose a general nonlinear model of polar opinion dynamics, rooted in several theories of sociology and social psychology. The model's key distinguishing trait is that, unlike in the existing linear models such as the DeGroot and Friedkin–Johnsen models, an individual's susceptibility to persuasion is a function of his or her current opinion. For example, a person holding a neutral opinion may be rather malleable, while "extremists" may be strongly committed to their current beliefs. We also study three specializations of our general model, whose susceptibility functions correspond to different sociopsychological theories. We provide a comprehensive theoretical analysis of our nonlinear models' behavior using several tools from nonsmooth analysis of dynamical systems. To study convergence, we use nonsmooth max–min Lyapunov functions together with the generalized Invariance Principle. For our general model, we derive a general sufficient condition for convergence to consensus. For the specialized models, we provide a full theoretical analysis of their convergence, whether to consensus or disagreement. Our results are rather general and easily apply to the analysis of other nonlinear models defined over directed networks, with Lyapunov functions constructed out of convex components.
- Published
- 2017
- Full Text
- View/download PDF
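The model's key trait, susceptibility that depends on the current opinion, can be illustrated with a discrete-time sketch. The specific susceptibility function s(x) = 1 - |x| (neutral agents fully malleable, extremists at ±1 fully committed) is one illustrative choice, not one of the paper's three specializations.

```python
import numpy as np

def polar_opinion_step(W, x):
    """One step of a nonlinear polar opinion update on opinions in [-1, 1]:
    each agent mixes its own opinion with its neighbors' average (row-stochastic
    W), weighted by an opinion-dependent susceptibility s(x) = 1 - |x|."""
    s = 1 - np.abs(x)                     # neutral -> malleable, extreme -> committed
    return (1 - s) * x + s * (W @ x)

def simulate(W, x0, steps=50):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = polar_opinion_step(W, x)
    return x
```

With one committed extremist and one neutral agent on a complete two-node network, the extremist never moves while the neutral agent drifts toward the extreme.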
16. Topology Design Games and Dynamics in Adversarial Environments
- Author
-
Kevin S. Chan, Derya Cansever, Ambuj K. Singh, Siddharth Pal, Ertugrul N. Ciftcioglu, Ananthram Swami, and Prithwish Basu
- Subjects
Network architecture, Computer Networks and Communications, Logical topology, Topology (electrical circuits), Network topology, Adversary, Strategy, Nash equilibrium, Equilibrium selection, Game theory, Electrical and Electronic Engineering - Abstract
We study the problem of network topology design within a set of policy-compliant topologies as a game between a designer and an adversary. At any time instant, the designer aims to operate the network in an optimal topology within the set of policy compliant topologies with respect to a desired network property. Simultaneously, the adversary counters the designer trying to force operation in a suboptimal topology. Specifically, if the designer and the attacker choose the same link in the current topology to defend/grow and attack, respectively, then the latter is thwarted. However, if the defender does not correctly guess where the attacker is going to attack, and, hence, acts elsewhere, the topology reverts to the best policy-compliant configuration after a successful attack. We show the existence of various mixed strategy equilibria in this game and systematically study its structural properties. We study the effect of parameters, such as probability of a successful attack, and characterize the steady state behavior of the underlying Markov chain. While the intuitive adversarial strategy here is to attack the most important links, the Nash equilibrium strategy is for the designer to defend the most crucial links and for the adversary to focus attack on the lesser crucial links. We validate these properties through two use cases with example sets of network topologies. Next, we consider a multi-stage framework where the designer is not only interested in the instantaneous network property costs but a discounted sum of costs over many time instances. We establish structural properties of the equilibrium strategies in the multi-stage setting, and also demonstrate that applying algorithms based on the Q-Learning and Rollout methods can result in significant benefits for the designer compared with strategies resulting from a one-shot based game.
- Published
- 2017
- Full Text
- View/download PDF
17. Modulating ALS-Related Amyloidogenic TDP-43(307–319) Oligomeric Aggregates with Computationally Derived Therapeutic Molecules
- Author
-
Veronica Laos, Dezmond Bishop, Christian A. Lang, Nicole M. Marsh, Kristi Lazar Cantrell, Steven K. Buratto, Ambuj K. Singh, and Michael T. Bowers
- Subjects
DNA-Binding Proteins, Alzheimer Disease, Blood-Brain Barrier, Drug Design, Frontotemporal Dementia, TDP-43 Proteinopathies, Amyotrophic Lateral Sclerosis, Ion Mobility Spectrometry, Mutation, Computational Biology, Humans, Microscopy, Atomic Force - Abstract
TDP-43 aggregates are a salient feature of amyotrophic lateral sclerosis (ALS), frontotemporal dementia (FTD), and a variety of other neurodegenerative diseases, including Alzheimer's disease (AD). With an anticipated growth in the most susceptible demographic, projections predict neurodegenerative diseases will potentially affect 15 million people in the United States by 2050. Currently, there are no cures for ALS, FTD, or AD. Previous studies of the amyloidogenic core of TDP-43 have demonstrated that oligomers greater than a trimer are associated with toxicity. Utilizing a joint pharmacophore space (JPS) method, potential drugs have been designed specifically for amyloid-related diseases. These molecules were generated on the basis of key chemical features necessary for blood-brain barrier permeability, low adverse side effects, and target selectivity. Combining ion-mobility mass spectrometry and atomic force microscopy with the JPS computational method allows us to more efficiently evaluate a potential drug's efficacy in disrupting the development of putative toxic species. Our results demonstrate the dissociation of higher-order oligomers in the presence of these novel JPS-generated inhibitors into smaller oligomer species. Additionally, drugs approved by the Food and Drug Administration for the treatment of ALS were also evaluated and demonstrated to maintain higher-order oligomeric assemblies. Possible mechanisms for the observed action of the JPS molecules are discussed.
- Published
- 2019
18. Fighting Opinion Control in Social Networks via Link Recommendation
- Author
-
Ambuj K. Singh and Victor Amelkin
- Subjects
Markov chain, Social network, Heuristic, Control (management), Process (computing), Network planning and design, Computer security - Abstract
The process of opinion formation is inherently a network process, with user opinions in a social network being driven to a certain average opinion. One simple and intuitive incarnation of this opinion attractor is the average of user opinions weighted by the users' eigenvector centralities. This value is a lucrative target for control, as altering it essentially changes the mass opinion in the network. Since any potentially malicious influence upon the opinion distribution in a society is undesirable, it is important to design methods to prevent external attacks upon it. In this work, we assume that the adversary aims to maliciously change the network's average opinion by altering the opinions of some unknown users. We then state an NP-hard problem of disabling such opinion control attempts via strategically altering the network's users' eigencentralities by recommending a limited number of links to the users. Relying on Markov chain theory, we provide perturbation analysis that shows how eigencentrality, and hence our problem's objective, changes in response to a link's addition to the network. The latter leads to the design of a pseudo-linear-time heuristic, relying on efficient estimation of mean first passage times in Markov chains. We have confirmed our theoretical and algorithmic findings, and studied the effectiveness and efficiency of our heuristic in experiments with synthetic and real networks.
- Published
- 2019
- Full Text
- View/download PDF
19. Approximate Algorithms for Data-driven Influence Limitation
- Author
-
Ambuj K. Singh, Arlei Silva, and Sourav Medya
- Subjects
Computational Theory and Mathematics ,Computer science ,Algorithm ,Computer Science Applications ,Information Systems ,Data-driven
- Published
- 2020
- Full Text
- View/download PDF
20. Learning Multiclassifiers with Predictive Features that Vary with Data Distribution
- Author
-
Omid Askarisichani, Ambuj K. Singh, and Xuan Hong Dang
- Subjects
Computer science ,business.industry ,Feature extraction ,Supervised learning ,Multi-task learning ,02 engineering and technology ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Support vector machine ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,business ,Gradient descent ,computer ,0105 earth and related environmental sciences - Abstract
In many real-world big data applications, the data distribution is not homogeneous over the entire data set, but instead varies across groups/clusters of data samples. Although a model's predictive performance remains vital, there is also a need to learn succinct sets of features that evolve and capture smooth variations in data distribution. These small sets of features not only lead to high prediction accuracy, but also reveal the important underlying processes. We investigate this challenging problem by developing a novel multi-task learning paradigm that trains multiple support vector machine (SVM) classifiers over a set of related data clusters and directly imposes smoothness constraints on adjacent classifiers. We show that such patterns can be effectively learned in the dual form of the classical SVM, and further show that a parsimonious solution can be achieved in the primal form. Although the solution can be optimized effectively via gradient descent, the technical development is not straightforward, requiring a relaxation of the SVM loss function. We demonstrate the performance of our algorithm in two practical application domains: team performance and road traffic prediction. Empirical results show that our model not only achieves competitive prediction accuracy, but also discovers patterns that truly capture, and give intuition about, the variation in the data distribution across multiple data clusters.
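The coupling idea can be sketched as follows. This is a simplified, assumed formulation (subgradient descent on the hinge loss with a ||w_t - w_neighbor||^2 smoothness penalty between classifiers of adjacent clusters), not the paper's exact primal/dual derivation; the data and hyperparameters are illustrative.

```python
def train_smooth_svms(tasks, lam=0.1, reg=0.01, lr=0.05, epochs=300):
    """One linear hinge-loss classifier per cluster, with adjacent
    classifiers pulled together by a quadratic smoothness penalty."""
    d = len(tasks[0][0][0])
    ws = [[0.0] * d for _ in tasks]
    for _ in range(epochs):
        for t, (X, y) in enumerate(tasks):
            w = ws[t]
            grad = [reg * wi for wi in w]          # L2 regularization
            for xi, yi in zip(X, y):
                if yi * sum(a * b for a, b in zip(w, xi)) < 1:  # hinge active
                    for k in range(d):
                        grad[k] -= yi * xi[k]
            # smoothness coupling to neighboring clusters' classifiers
            for nb in (t - 1, t + 1):
                if 0 <= nb < len(tasks):
                    for k in range(d):
                        grad[k] += 2 * lam * (w[k] - ws[nb][k])
            for k in range(d):
                w[k] -= lr * grad[k] / len(X)
    return ws

# Two related clusters whose decision boundaries differ slightly.
X1 = [(1.0, 1.0), (2.0, 1.5), (-1.0, -1.0), (-2.0, -1.5)]
y1 = [1, 1, -1, -1]
X2 = [(1.5, 0.5), (2.0, 1.0), (-1.5, -0.5), (-2.0, -1.0)]
y2 = [1, 1, -1, -1]
ws = train_smooth_svms([(X1, y1), (X2, y2)])
```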
- Published
- 2018
- Full Text
- View/download PDF
21. Summarizing Network Processes with Network-Constrained Boolean Matrix Factorization
- Author
-
Minh X. Hoang, Furkan Kocayusufoglu, and Ambuj K. Singh
- Subjects
Theoretical computer science ,Computer science ,Markov chain Monte Carlo ,02 engineering and technology ,Complex network ,Matrix decomposition ,Constraint (information theory) ,Set (abstract data type) ,symbols.namesake ,020204 information systems ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,020201 artificial intelligence & image processing ,Gibbs sampling - Abstract
Understanding and modeling complex network processes is an important task in many real-world applications. The first challenge is to discover patterns in such complex data. In this work, our goal is to summarize the different processes in a network by a small yet interpretable set of network patterns, each of which represents a local community of connected nodes frequently participating in the same network processes. We formulate this problem as Boolean Matrix Factorization with a network constraint, which we prove to be NP-hard. We then propose an efficient algorithm that incrementally adds the best patterns and achieves scalability through two further improvements. First, to decide which network processes contain which network patterns, we introduce two mapping algorithms with linear costs. Second, to systematically mine the exponential subgraph search space for good patterns, we devise two sampling algorithms based on Markov chain Monte Carlo. Experimental results on both synthetic and real-world datasets show that our solutions are scalable and find network patterns that effectively summarize network processes.
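A much-simplified greedy sketch of the incremental idea follows; assumed simplifications not in the paper: the candidate patterns are given up front, the residual counts only uncovered 1s (false positives are ignored), and the network connectivity constraint on node sets is not enforced.

```python
def boolean_residual(M, factors):
    """Count 1s of binary matrix M left uncovered by the Boolean-OR
    reconstruction of rank-1 (process set, node set) patterns."""
    rows, cols = len(M), len(M[0])
    R = [[M[i][j] for j in range(cols)] for i in range(rows)]
    for procs, nodes in factors:
        for i in procs:
            for j in nodes:
                R[i][j] = 0  # covered by this pattern
    return sum(sum(r) for r in R)

def greedy_bmf(M, candidates, k):
    """Incrementally add the candidate pattern that most reduces the residual."""
    factors = []
    for _ in range(k):
        best = min(candidates, key=lambda c: boolean_residual(M, factors + [c]))
        factors.append(best)
    return factors

# Toy process-by-node participation matrix and candidate patterns.
M = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1]]
candidates = [({0, 1}, {0, 1}), ({2}, {2, 3}), ({0}, {0, 1, 2, 3})]
factors = greedy_bmf(M, candidates, 2)
```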
- Published
- 2018
- Full Text
- View/download PDF
22. Group Centrality Maximization via Network Design
- Author
-
Ananthram Swami, Ambuj K. Singh, Prithwish Basu, Arlei Silva, and Sourav Medya
- Subjects
Network planning and design ,Theoretical computer science ,Computer science ,Group (mathematics) ,Maximization ,Centrality
- Published
- 2018
- Full Text
- View/download PDF
23. Inhibiting and Remodeling Toxic Amyloid-beta Oligomer Formation Using a Computationally Designed Drug Molecule that Targets Alzheimer’s Disease
- Author
-
Michael T. Bowers, Christian A. Lang, Ambuj K. Singh, Maxwell J. Giammona, Steven K. Buratto, and Matthew A. Downey
- Subjects
Aging ,Pyrrolidines ,Neurodegenerative ,Alzheimer's Disease ,Proteomics ,01 natural sciences ,Analytical Chemistry ,Machine Learning ,Amyloid-β protein ,Atomic force microscopy ,Amyloid disease ,Models ,Structural Biology ,Spectroscopy ,Inhibition ,Microscopy ,Chemistry ,Aβ42 ,Atomic Force ,Preclinical ,Drug development ,Blood-Brain Barrier ,5.1 Pharmaceuticals ,Neurological ,Development of treatments and therapeutic interventions ,Pharmacophore ,Alzheimer’s disease ,Biotechnology ,Physical Chemistry (incl. Structural) ,Amyloid ,Ion mobility mass spectrometry ,Bioengineering ,010402 general chemistry ,Fibril ,Article ,Aggregation ,Medicinal and Biomolecular Chemistry ,Alzheimer Disease ,Nitriles ,Ion Mobility Spectrometry ,Acquired Cognitive Impairment ,Humans ,Molecule ,Amyloid beta-Peptides ,010401 analytical chemistry ,Neurosciences ,Molecular ,Alzheimer's Disease including Alzheimer's Disease Related Dementias (AD/ADRD) ,Peptide Fragments ,Brain Disorders ,0104 chemical sciences ,Dodecameric protein ,Drug Design ,Biophysics ,Drug Evaluation ,Dementia ,Joint pharmacophore space - Abstract
Alzheimer's disease (AD) is rapidly reaching epidemic status among a burgeoning aging population. Much evidence suggests the toxicity of this amyloid disease is most influenced by the formation of soluble oligomeric forms of amyloid β-protein, particularly the 42-residue alloform (Aβ42). Developing potential therapeutics in a directed, streamlined approach to treating this disease is necessary. Here we utilize the joint pharmacophore space (JPS) model to design a new molecule [AC0107] incorporating structural characteristics of known Aβ inhibitors, blood-brain barrier permeability, and limited toxicity. To test the molecule's efficacy experimentally, we employed ion mobility mass spectrometry (IM-MS) to discover that [AC0107] inhibits the formation of the toxic Aβ42 dodecamer at both high (1:10) and equimolar concentrations of inhibitor. Atomic force microscopy (AFM) experiments reveal that [AC0107] prevents further aggregation of Aβ42, destabilizes preformed fibrils, and reverses Aβ42 aggregation. This trend continues for long-term interaction times of 2 days, until only small aggregates remain and virtually no fibrils or higher-order oligomers survive. Pairing JPS with IM-MS and AFM presents a powerful and effective first step for AD drug development.
- Published
- 2018
24. Spectral Algorithms for Temporal Graph Cuts
- Author
-
Ananthram Swami, Ambuj K. Singh, and Arlei Silva
- Subjects
FOS: Computer and information sciences ,Computer science ,Maximum cut ,Comparability graph ,010103 numerical & computational mathematics ,0102 computer and information sciences ,Strength of a graph ,01 natural sciences ,law.invention ,Coxeter graph ,Pathwidth ,Computer Science - Databases ,law ,Graph power ,Outerplanar graph ,Cut ,Line graph ,Clique-width ,0101 mathematics ,Graph cuts in computer vision ,Graph property ,Complement graph ,Distance-hereditary graph ,Social and Information Networks (cs.SI) ,Discrete mathematics ,Block graph ,Spectral graph theory ,Voltage graph ,Databases (cs.DB) ,Computer Science - Social and Information Networks ,1-planar graph ,Graph ,Vertex (geometry) ,Graph bandwidth ,010201 computation theory & mathematics ,Cubic graph ,Null graph ,Algorithm ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
The sparsest cut problem consists of identifying a small set of edges that breaks the graph into balanced sets of vertices. The normalized cut problem balances the total degree, instead of the size, of the resulting sets. Applications of graph cuts include community detection and computer vision. However, cut problems were originally proposed for static graphs, an assumption that does not hold in many modern applications where graphs are highly dynamic. In this paper, we introduce the sparsest and normalized cut problems in temporal graphs, which generalize their standard definitions by enforcing the smoothness of cuts over time. We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes.
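For intuition, here is a brute-force sketch of the static sparsest cut that the temporal formulation above generalizes; the exhaustive search and the toy graph are illustrative assumptions, not the paper's spectral method.

```python
from itertools import combinations

def sparsest_cut(n, edges):
    """Minimize cut(S) / min(|S|, |V \\ S|) over all vertex subsets S,
    by exhaustive enumeration (only feasible for tiny graphs)."""
    best_score, best_set = float("inf"), None
    for k in range(1, n // 2 + 1):
        for subset in combinations(range(n), k):
            S = set(subset)
            cut = sum(1 for u, v in edges if (u in S) != (v in S))
            score = cut / min(len(S), n - len(S))
            if score < best_score:
                best_score, best_set = score, S
    return best_score, best_set

# Two triangles joined by a single bridge edge: the bridge is the sparsest cut.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
score, S = sparsest_cut(6, edges)
```

The temporal versions in the paper additionally penalize how much the cut changes between consecutive snapshots, which this static sketch does not model.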
- Published
- 2018
- Full Text
- View/download PDF
25. Modeling the Co-evolution of Committee Formation and Awareness Networks in Organizations
- Author
-
Ambuj K. Singh, Alex T. Jones, and Noah E. Friedkin
- Subjects
Heuristic (computer science) ,Management science ,Stochastic modelling ,Political science ,Maximum coverage problem ,media_common.quotation_subject ,Interpersonal communication ,Baseline (configuration management) ,Deliberation ,Telecommunications network ,Task (project management) ,media_common - Abstract
Large-scale organizations confront the difficult problem of assigning personnel to non-routine tasks. Such projects require novel combinations of individuals from separate parts of the organization. We propose a computational method for assembling a committee that recruits individuals to solve the task. This method is driven by data on interpersonal awareness relationships, fostered by mutual interaction and indirect contact. By inducing a network from these relationships, we can reduce the committee selection problem to the maximum coverage problem. A stochastic model is then presented to capture plausible new relationships that form during committee deliberation. This framework provides administrators the opportunity to track the evolution of awareness relationships while improving the organization's ability to solve non-routine tasks. To track awareness network evolution, we demonstrate correlations in empirical networks between communication network topology and the probability of awareness relationships. To formalize organizational improvement over a series of committee formations, we extend a static committee formation problem to a repeated committee formation problem. After showing the computational cost of exact solutions to this problem, we propose an efficient, heuristic-based solution. Through simulation on real-world networks, we demonstrate that our approach produces more efficient and productive network states than a baseline algorithm during a series of committee formations.
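The reduction to maximum coverage can be illustrated with the standard greedy algorithm, which carries the usual (1 - 1/e) approximation guarantee; the awareness sets below are hypothetical, not from the paper's data.

```python
def greedy_committee(awareness, k):
    """Pick up to k members whose awareness sets cover the most individuals.
    awareness: dict mapping candidate -> set of individuals they can recruit."""
    covered, committee = set(), []
    for _ in range(k):
        best = max(awareness, key=lambda m: len(awareness[m] - covered))
        if not awareness[best] - covered:
            break  # no candidate adds new coverage
        committee.append(best)
        covered |= awareness[best]
    return committee, covered

# Hypothetical awareness sets for four candidate committee members.
awareness = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}
committee, covered = greedy_committee(awareness, 2)
```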
- Published
- 2017
- Full Text
- View/download PDF
26. Spatial coherence of oriented white matter microstructure: Applications to white matter regions associated with genetic similarity
- Author
-
Matthew Cieslak, Scott T. Grafton, Ambuj K. Singh, Haraldur Tómas Hallgrímsson, and Luca Foschini
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Cognitive Neuroscience ,Computer Science - Computer Vision and Pattern Recognition ,Biology ,computer.software_genre ,Statistics - Applications ,Quantitative Biology - Quantitative Methods ,030218 nuclear medicine & medical imaging ,White matter ,03 medical and health sciences ,0302 clinical medicine ,Similarity (network science) ,Voxel ,medicine ,Connectome ,Image Processing, Computer-Assisted ,Twins, Dizygotic ,Humans ,Applications (stat.AP) ,Quantitative Methods (q-bio.QM) ,Human Connectome Project ,Orientation (computer vision) ,Superior longitudinal fasciculus ,Brain ,Coherence (statistics) ,Twins, Monozygotic ,White Matter ,medicine.anatomical_structure ,Diffusion Tensor Imaging ,Neurology ,FOS: Biological sciences ,Cartography ,computer ,030217 neurology & neurosurgery ,Diffusion MRI - Abstract
We present a method to discover differences between populations with respect to the spatial coherence of their oriented white matter microstructure in arbitrarily shaped white matter regions. This method is applied to diffusion MRI scans of a subset of the Human Connectome Project dataset: 57 pairs of monozygotic and 52 pairs of dizygotic twins. After controlling for morphological similarity between twins, we identify 3.7% of all white matter as being associated with genetic similarity (35.1k voxels, p < 10^-4, false discovery rate 1.5%), 75% of which spatially clusters into twenty-two contiguous white matter regions. Furthermore, we show that the orientation similarity within these regions generalizes to a subset of 47 pairs of non-twin siblings, and show that these siblings are on average as similar as dizygotic twins. The regions are located in deep white matter including the superior longitudinal fasciculus, the optic radiations, the middle cerebellar peduncle, the corticospinal tract, and within the anterior temporal lobe, as well as the cerebellum, brain stem, and amygdalae. These results extend previous work using undirected fractional anisotropy for measuring putative heritable influences in white matter. Our multidirectional extension better accounts for crossing fiber connections within voxels. This bottom-up approach has at its basis a novel measurement of coherence within neighboring voxel dyads between subjects, and avoids some of the fundamental ambiguities encountered with tractographic approaches to white matter analysis that estimate global connectivity.
- Published
- 2017
27. Subnetwork Mining with Spatial and Temporal Smoothness
- Author
-
Xuan-Hong Dang, Hongyuan You, Ambuj K. Singh, and Scott Grafton
- Published
- 2017
- Full Text
- View/download PDF
28. Dynamics of collective performance in collaboration networks
- Author
-
Ambuj K. Singh, Omid Askarisichani, Thomas W. Malone, Young Ji Kim, and Victor Amelkin
- Subjects
Knowledge management ,Computer science ,Intelligence ,lcsh:Medicine ,Social Sciences ,01 natural sciences ,Task (project management) ,010104 statistics & probability ,Empirical research ,Mental Processes ,Mathematical and Statistical Techniques ,Cognition ,Learning and Memory ,Sociology ,Psychology ,Cooperative Behavior ,lcsh:Science ,Verbal Communication ,Multidisciplinary ,05 social sciences ,Statistics ,Social Communication ,Workload ,Variety (cybernetics) ,Physical Sciences ,Regression Analysis ,Team Behavior ,Network Analysis ,Research Article ,Computer and Information Sciences ,Models, Psychological ,Research and Analysis Methods ,Memory ,0502 economics and business ,Humans ,0101 mathematics ,Statistical Methods ,Baseline (configuration management) ,Set (psychology) ,Network Reciprocity ,Behavior ,business.industry ,Verbal Behavior ,lcsh:R ,Cognitive Psychology ,Biology and Life Sciences ,Communications ,Group Processes ,Collective Human Behavior ,Cognitive Science ,lcsh:Q ,business ,050203 business & management ,Mathematics ,Forecasting ,Neuroscience - Abstract
Today, many complex tasks are assigned to teams, rather than individuals. One reason for teaming up is to expand each individual's skill coverage into a joint team skill set. However, numerous empirical studies of human groups suggest that the performance of equally skilled teams can widely differ. Two natural questions arise: what are the factors defining team performance, and how can we best predict the performance of a given team on a specific task? While the team members' task-related capabilities constrain the potential for the team's success, the key to understanding team performance is in the analysis of the team process, encompassing the behaviors of the team members during task completion. In this study, we extend the existing body of research on team process and prediction models of team performance. Specifically, we analyze the dynamics of historical team performance over a series of tasks as well as the fine-grained patterns of collaboration between team members, and formally connect these dynamics to the team performance in the predictive models. Our major qualitative finding is that higher-performing teams have well-connected collaboration networks—as indicated by the topological and spectral properties of the latter—which are more robust to perturbations, and where network processes spread more efficiently. Our major quantitative finding is that our predictive models deliver accurate team performance predictions—with a prediction error of 15-25%—on a variety of simple tasks, outperforming baseline models that do not capture the micro-level dynamics of team member behaviors. We also show how to use our models in an application, for optimal online planning of workload distribution in an organization. Our findings emphasize the importance of studying the dynamics of team collaboration as the major driver of high performance in teams.
- Published
- 2017
29. Predictive modeling and scalability analysis for large graph analytics
- Author
-
Sourav Medya, Ludmila Cherkasova, and Ambuj K. Singh
- Subjects
020203 distributed computing ,Computer science ,Breadth-first search ,020207 software engineering ,02 engineering and technology ,Bandwidth throttling ,Parallel computing ,Computer engineering ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,Distributed memory ,Algorithm design ,Cluster analysis ,Graph500 - Abstract
Many HPC and modern large graph processing applications belong to a class of scale-out applications, where the application dataset is partitioned and processed by a cluster of machines. Assessing the application scalability is one of the primary goals during such application implementation. Typically, in the design phase, programmers are limited to the small clusters available for their experiments. Therefore, predictive modeling is required for the analysis of the application scalability and its performance in a larger cluster. While in a larger cluster each node will process a smaller portion of the original dataset, a higher communication volume between a larger number of nodes may cripple the application scalability and provide diminishing performance benefits. One of the main challenges is the analysis of bandwidth demands due to the increased communication volume in a larger cluster. In this paper, we introduce a novel regression-based approach to assess the scalability and performance of a distributed memory program for execution in a large-scale cluster. Our solution involves 1) a limited set of traditional experiments performed in a small cluster and 2) an additional set of similar experiments performed with an "interconnect bandwidth throttling" tool, which exposes the bandwidth impact on the application performance. These measurements are used in creating an ensemble of analytical models for performance and scalability analysis. Using a linear regression approach, step by step, we incorporate into the model the following important parameters: i) the number of cluster nodes and application processes, ii) the dataset size, and iii) interconnect bandwidth. We demonstrate our solution, its power, and accuracy using the popular Graph500 benchmark, which implements a Breadth First Search algorithm on large, synthetically generated graphs.
By utilizing measurements collected in a 32-node cluster, we are able to project the program performance in a large size cluster with hundreds of nodes. The proposed approach and derived models help to provide an early feedback to programmers on the scalability and efficiency of their solution.
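A minimal sketch of the regression idea, under an assumed simplified model T(p) ~ a*(N/p) + b*p (a compute term that shrinks with cluster size p and a communication term that grows with it), fitted by least squares from small-cluster runs; the problem size, coefficients, and timings are synthetic.

```python
def fit_scaling_model(runs, N):
    """Least-squares fit of T = a*(N/p) + b*p via 2x2 normal equations.
    runs: list of (cluster size p, measured time T)."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for p, T in runs:
        x1, x2 = N / p, float(p)
        s11 += x1 * x1; s12 += x1 * x2; s22 += x2 * x2
        t1 += x1 * T; t2 += x2 * T
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

N = 1 << 20                      # synthetic problem size
true_a, true_b = 2e-6, 0.05      # assumed "true" compute/communication costs
runs = [(p, true_a * N / p + true_b * p) for p in (4, 8, 16, 32)]
a, b = fit_scaling_model(runs, N)
predicted_256 = a * N / 256 + b * 256  # extrapolate to a 256-node cluster
```

The fitted communication term b makes explicit when adding nodes starts to yield diminishing returns, which is the scalability question the abstract targets.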
- Published
- 2017
- Full Text
- View/download PDF
30. GPOP
- Author
-
Xuan Hong Dang, Minh X. Hoang, Zhenyu Yan, Wu Xiang, and Ambuj K. Singh
- Subjects
education.field_of_study ,Social network ,business.industry ,Computer science ,Population ,02 engineering and technology ,Network topology ,Machine learning ,computer.software_genre ,Popularity ,Advertising campaign ,Ranking ,020204 information systems ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,Web content ,Cluster analysis ,business ,education ,computer ,Clustering coefficient - Abstract
Predicting the popularity of online content in social networks is important in many applications, ranging from ad campaign design, web content caching and prefetching, to web-search result ranking. Earlier studies target this problem by learning models that either generalize behaviors of the entire network population or capture behaviors of each individual user. In this paper, we claim that a novel approach based on group-level popularity is necessary and more practical, given that users naturally organize themselves into clusters and that users within a cluster react to online content in a uniform manner. We develop a novel framework by first grouping users into cohesive clusters, and then adopt tensor decomposition to make predictions. In order to minimize the impact of noisy data and be more flexible in capturing changes in users' interests, our framework exploits both the network topology and interaction among users in learning a robust user clustering. The PARAFAC tensor decomposition is adapted to work with hierarchical constraint over user groups, and we show that optimizing this constrained function via gradient descent achieves faster convergence and leads to more stable solutions. Extensive experimental results over two social networks demonstrate that our framework is scalable, finds meaningful user groups, and significantly outperforms eight baseline methods in terms of prediction accuracy.
- Published
- 2017
- Full Text
- View/download PDF
31. Base Motif Recognition and Design of DNA Templates for Fluorescent Silver Clusters by Machine Learning
- Author
-
Ambuj K. Singh, Stacy M. Copp, Mark Debord, Petko Bogdanov, and Elisabeth G. Gwinn
- Subjects
Materials science ,Base Sequence ,business.industry ,Mechanical Engineering ,Silver Compounds ,DNA ,Machine learning ,computer.software_genre ,Fluorescence ,Pattern Recognition, Automated ,chemistry.chemical_compound ,Spectrometry, Fluorescence ,Template ,chemistry ,Discriminative model ,Artificial Intelligence ,Mechanics of Materials ,DNA nanotechnology ,General Materials Science ,Artificial intelligence ,business ,computer - Abstract
Discriminative base motifs within DNA templates for fluorescent silver clusters are identified using methods that combine large experimental data sets with machine learning tools for pattern recognition. Combining the discovery of certain multibase motifs important for determining fluorescence brightness with a generative algorithm, the probability of selecting DNA templates that stabilize fluorescent silver clusters is increased by a factor of >3.
- Published
- 2014
- Full Text
- View/download PDF
32. Static and Dynamic Structural Correlations in Graphs
- Author
-
Ambuj K. Singh, Ziyu Guan, Jian Wu, Qing Zhang, and Xifeng Yan
- Subjects
Theoretical computer science ,Social network ,Computer science ,business.industry ,Stochastic process ,Graph theory ,computer.software_genre ,Electronic mail ,Graph ,Computer Science Applications ,Computational Theory and Mathematics ,Scalability ,The Internet ,Data mining ,business ,computer ,Information Systems - Abstract
Real-life graphs not only contain nodes and edges, but also have events taking place, e.g., product sales in social networks. Among different events, some exhibit strong correlations with the network structure, while others do not. Such structural correlations shed light on the viral influence existing in the corresponding network. Unfortunately, the traditional association mining concept is not applicable in graphs because it only works on homogeneous data sets like transactions and baskets. We propose a novel measure for assessing such structural correlations in heterogeneous graph data sets with events. The measure applies hitting time to aggregate the proximity among nodes that have the same event. To calculate the correlation scores for many events in a large network, we develop a scalable framework, called gScore, using sampling and approximation. By comparing to the situation where events are randomly distributed in the same network, our method is able to discover events that are highly correlated with the graph structure. We test gScore's effectiveness with synthetic events on the DBLP coauthor network and report interesting correlation results in a social network extracted from TaoBao.com, the largest online shopping network in China. Scalability of gScore is tested on the Twitter network. Since an event is essentially a temporal phenomenon, we also propose a dynamic measure, which reveals structural correlations at specific time steps and can be used for discovering detailed evolutionary patterns.
- Published
- 2013
- Full Text
- View/download PDF
33. Quantifying spatial relationships from whole retinal images
- Author
-
Steven K. Fisher, Gabriel Luna, Ambuj K. Singh, Brian E. Ruttenberg, and Geoffrey P. Lewis
- Subjects
Statistics and Probability ,Computer science ,Feature vector ,Image processing ,Spatial distribution ,computer.software_genre ,Biochemistry ,Retina ,Mice ,chemistry.chemical_compound ,Microscopy ,Image Processing, Computer-Assisted ,medicine ,Animals ,Point (geometry) ,Molecular Biology ,Pixel ,Retinal Vessels ,Retinal ,Computer Science Applications ,Computational Mathematics ,medicine.anatomical_structure ,Computational Theory and Mathematics ,chemistry ,Astrocytes ,Data mining ,computer - Abstract
Motivation: Microscopy advances have enabled the acquisition of large-scale biological images that capture whole tissues in situ. This in turn has fostered the study of spatial relationships between cells and various biological structures, which has proved enormously beneficial toward understanding organ and organism function. However, the unique nature of biological images and tissues precludes the application of many existing spatial mining and quantification methods necessary to make inferences about the data. Especially difficult is attempting to quantify the spatial correlation between heterogeneous structures and point objects, which often occurs in many biological tissues. Results: We develop a method to quantify the spatial correlation between a continuous structure and point data in large (17,500 × 17,500 pixel) biological images. We use this method to study the spatial relationship between the vasculature and a type of cell in the retina called astrocytes. We use a geodesic feature space based on vascular structures and embed astrocytes into the space by spatial sampling. We then propose a quantification method in this feature space that enables us to empirically demonstrate that the spatial distribution of astrocytes is often correlated with vascular structure. Additionally, these patterns are conserved in the retina after injury. These results prove the long-assumed patterns of astrocyte spatial distribution and provide a novel methodology for conducting other spatial studies of similar tissue and structures. Availability: The Matlab code for the method described in this article can be found at http://www.cs.ucsb.edu/~dbl/software.php. Contact: bruttenberg@cra.com or ambuj@cs.ucsb.edu Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2013
- Full Text
- View/download PDF
34. FCCF
- Author
-
Yu Zheng, Minh X. Hoang, and Ambuj K. Singh
- Subjects
Exploit ,business.industry ,Big data ,02 engineering and technology ,Missing data ,computer.software_genre ,Residual ,Partition (database) ,Geography ,Crowds ,020204 information systems ,Urban computing ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,business ,computer - Abstract
Predicting the movement of crowds in a city is strategically important for traffic management, risk assessment, and public safety. In this paper, we propose predicting two types of flows of crowds in every region of a city based on big data, including human mobility data, weather conditions, and road network data. To develop a practical solution for citywide traffic prediction, we first partition the map of a city into regions using both its road network and historical records of human mobility. Our problem differs from predicting each individual's movements or each road segment's traffic conditions, which is computationally costly and not necessary from the perspective of public safety on a citywide scale. To model the multiple complex factors affecting crowd flows, we decompose flows into three components: seasonal (periodic patterns), trend (changes in periodic patterns), and residual flows (instantaneous changes). The seasonal and trend models are built as intrinsic Gaussian Markov random fields which can cope with noisy and missing data, whereas a residual model exploits the spatio-temporal dependence among different flows and regions, as well as the effect of weather. Experimental results on three real-world datasets show that our method is scalable and outperforms all baselines significantly in terms of accuracy.
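The seasonal/residual split can be illustrated with plain per-phase averaging; this is an assumption made for brevity (the paper models the seasonal and trend components as intrinsic Gaussian Markov random fields, and the trend component is omitted from this sketch).

```python
def decompose(series, period):
    """Split a flow series into a periodic seasonal part (per-phase mean)
    and a residual of instantaneous changes."""
    sums = [0.0] * period
    counts = [0] * period
    for i, v in enumerate(series):
        sums[i % period] += v
        counts[i % period] += 1
    phase_mean = [s / c for s, c in zip(sums, counts)]
    seasonal = [phase_mean[i % period] for i in range(len(series))]
    residual = [v - s for v, s in zip(series, seasonal)]
    return seasonal, residual

# Synthetic hourly crowd flow: daily period, busy 8:00-20:00,
# plus one instantaneous spike at hour 30.
period = 24
series = [10.0 + 5.0 * (8 <= i % period < 20) + (0.9 if i == 30 else 0.0)
          for i in range(72)]
seasonal, residual = decompose(series, period)
```

By construction the two parts reconstruct the series exactly, and the spike surfaces as the largest residual, which is the role the residual model plays in the abstract.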
- Published
- 2016
- Full Text
- View/download PDF
35. Outlier Detection from Network Data with Subnetwork Interpretation
- Author
-
Arlei Silva, Prithwish Basu, Ambuj K. Singh, Xuan Hong Dang, and Ananthram Swami
- Subjects
FOS: Computer and information sciences ,Network segmentation ,business.industry ,Computer Science - Artificial Intelligence ,02 engineering and technology ,Machine learning ,computer.software_genre ,Network topology ,Machine Learning (cs.LG) ,Network simulation ,Computer Science - Learning ,Artificial Intelligence (cs.AI) ,020204 information systems ,Outlier ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,Algorithm design ,Artificial intelligence ,Data mining ,business ,Subnetwork ,computer ,Subspace topology ,Mathematics - Abstract
Detecting a small number of outliers from a set of data observations is always challenging. This problem is more difficult in the setting of multiple network samples, where computing the anomalous degree of a network sample is generally not sufficient. In fact, explaining why the network is exceptional, expressed in the form of a subnetwork, is equally important. In this paper, we develop a novel algorithm to address these two key problems. We treat each network sample as a potential outlier and identify subnetworks that best discriminate it from nearby regular samples. The algorithm is developed in the framework of network regression, combined with constraints on both network topology and L1-norm shrinkage to perform subnetwork discovery. Our method thus goes beyond subspace/subgraph discovery, and we show that it converges to a global optimum. Evaluation on various real-world network datasets demonstrates that our algorithm not only outperforms baselines in both the network and high-dimensional settings, but also discovers highly relevant and interpretable local subnetworks, further enhancing our understanding of anomalous networks.
- Published
- 2016
36. Towards Scalable Network Delay Minimization
- Author
-
Sourav Medya, Ambuj K. Singh, and Petko Bogdanov
- Subjects
FOS: Computer and information sciences ,Computer science ,Network packet ,Node (networking) ,Distributed computing ,Network delay ,Approximation algorithm ,Databases (cs.DB) ,02 engineering and technology ,Propagation delay ,Telecommunications network ,Reduction (complexity) ,Computer Science - Databases ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Combinatorial search ,020201 artificial intelligence & image processing - Abstract
Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks, and higher packet rates in communication networks. Delay reduction can be achieved both by improving the propagation capabilities of individual nodes and by adding edges to the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence there is a combinatorial search space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that a probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling, targeted to different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.
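To make the design problem concrete, here is a toy sketch of delay minimization via node upgrades, under stated assumptions (node-weighted shortest-path delays, a plain greedy loop over an exhaustive recomputation). It illustrates the problem setting only; the paper's actual methods are sampling-based and far more scalable.

```python
import heapq

def all_pairs_delay(adj, delay):
    """Sum of node-weighted shortest-path delays over all ordered pairs."""
    total = 0
    for src in adj:
        dist = {src: delay[src]}
        heap = [(delay[src], src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v in adj[u]:
                nd = d + delay[v]
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        total += sum(dist.values())
    return total

def greedy_upgrade(adj, delay, upgraded_delay, budget):
    """Pick `budget` nodes to upgrade, one at a time, largest gain first."""
    delay = dict(delay)
    chosen = []
    for _ in range(budget):
        base = all_pairs_delay(adj, delay)
        best, best_gain = None, 0
        for u in adj:
            if u in chosen:
                continue
            trial = dict(delay)
            trial[u] = upgraded_delay
            gain = base - all_pairs_delay(adj, trial)
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:
            break
        delay[best] = upgraded_delay
        chosen.append(best)
    return chosen
```

On a path a-b-c where only b is slow, the greedy step correctly upgrades b; the combinatorial difficulty the abstract mentions arises because gains of different upgrades interact on larger graphs.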
- Published
- 2016
37. Graph Wavelets via Sparse Cuts
- Author
-
Ananthram Swami, Ambuj K. Singh, Xuan Hong Dang, Arlei Silva, and Prithwish Basu
- Subjects
Mathematical optimization ,Voltage graph ,Comparability graph ,020206 networking & telecommunications ,Graph theory ,0102 computer and information sciences ,02 engineering and technology ,Strength of a graph ,01 natural sciences ,Modular decomposition ,010201 computation theory & mathematics ,0202 electrical engineering, electronic engineering, information engineering ,Graph property ,Null graph ,Algorithm ,Complement graph ,Mathematics - Abstract
Modeling information that resides on vertices of large graphs is a key problem in several real-life applications, ranging from social networks to the Internet-of-things. Signal Processing on Graphs and, in particular, graph wavelets can exploit the intrinsic smoothness of these datasets in order to represent them in a compact and accurate manner. However, how to discover wavelet bases that capture the geometry of the data with respect to the signal as well as the graph structure remains an open problem. In this paper, we study the problem of computing graph wavelet bases via sparse cuts in order to produce low-dimensional encodings of data-driven bases. This problem is connected to known hard problems in graph theory (e.g. multiway cuts) and thus requires an efficient heuristic. We formulate the basis discovery task as a relaxation of a vector optimization problem, which leads to an elegant solution as a regularized eigenvalue computation. Moreover, we propose several strategies in order to scale our algorithm to large graphs. Experimental results show that the proposed algorithm can effectively encode both the graph structure and signal, producing compressed and accurate representations for vertex values in a wide range of datasets (e.g. sensor and gene networks) and significantly outperforming the best baseline.
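As a rough illustration of the spectral-relaxation flavor mentioned above (not the paper's signal-aware, regularized formulation), a sparse two-way cut can be obtained from an approximate Fiedler vector, computed here by power iteration on a shifted Laplacian with the trivial all-ones eigenvector deflated out:

```python
def fiedler_cut(adj, iters=500):
    """Two-way spectral cut: sign pattern of an approximate Fiedler vector.
    Hedged sketch; assumes an undirected, connected adjacency dict."""
    nodes = sorted(adj)
    n = len(nodes)
    idx = {u: i for i, u in enumerate(nodes)}
    deg = [len(adj[u]) for u in nodes]
    shift = 2.0 * max(deg) + 1.0          # makes shift*I - L positive definite
    x = [i - (n - 1) / 2.0 for i in range(n)]  # deterministic start, sums to 0
    for _ in range(iters):
        # y = (shift*I - L) x, where (L x)_i = deg_i x_i - sum over neighbors
        y = []
        for i, u in enumerate(nodes):
            lap = deg[i] * x[i] - sum(x[idx[v]] for v in adj[u])
            y.append(shift * x[i] - lap)
        mean = sum(y) / n                  # deflate the all-ones eigenvector
        y = [v - mean for v in y]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    neg = [u for u in nodes if x[idx[u]] < 0.0]
    pos = [u for u in nodes if x[idx[u]] >= 0.0]
    return neg, pos
```

On a barbell graph of two triangles joined by one edge, the sign pattern recovers the two triangles, i.e. the sparsest balanced cut.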
- Published
- 2016
- Full Text
- View/download PDF
38. Topology design under adversarial dynamics
- Author
-
Kevin S. Chan, Ertugrul Necdet Ciftcioglu, Siddharth Pal, Derya Cansever, Ananthram Swami, Prithwish Basu, and Ambuj K. Singh
- Subjects
Sequence ,Mathematical optimization ,Markov chain ,Computer science ,Logical topology ,020206 networking & telecommunications ,Topology (electrical circuits) ,02 engineering and technology ,Adversary ,Network topology ,symbols.namesake ,Strategy ,Nash equilibrium ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,020201 artificial intelligence & image processing ,Computer Science::Cryptography and Security - Abstract
We study the problem of network topology design within a sequence of policy-compliant topologies as a game between a designer and an adversary. At any time instant, the designer aims to operate the network in an optimal topology within this policy compliant sequence with respect to a desired network property. Simultaneously, the adversary counters the designer trying to force operation in a suboptimal topology. We show the existence of various mixed strategy equilibria in this game and systematically study its structural properties. We study the effect of parameters, and characterize the steady state behavior of the underlying Markov chain. While the intuitive adversarial strategy here is to attack links appearing early in the topology sequence, the Nash Equilibrium strategy is for the designer to defend the earlier links and for the adversary to attack the later links. We validate these properties through two use cases with example sets of network topologies.
- Published
- 2016
- Full Text
- View/download PDF
39. Indexing the earth mover's distance using normal distributions
- Author
-
Brian E. Ruttenberg and Ambuj K. Singh
- Subjects
FOS: Computer and information sciences ,Uncertain data ,Computer science ,business.industry ,General Engineering ,Stochastic dominance ,Databases (cs.DB) ,Pattern recognition ,02 engineering and technology ,Vector projection ,Upper and lower bounds ,k-nearest neighbors algorithm ,Normal distribution ,Data set ,Cardinality ,Computer Science - Databases ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Probability distribution ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Earth mover's distance - Abstract
Querying uncertain data sets (represented as probability distributions) presents many challenges due to the large amount of data involved and the difficulty of comparing uncertainty between distributions. The Earth Mover's Distance (EMD) has increasingly been employed to compare uncertain data due to its ability to effectively capture the differences between two distributions. Computing the EMD entails solving the transportation problem, which is computationally intensive. In this paper, we propose a new lower bound to the EMD and an index structure that significantly improve the performance of EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower bound approximates the EMD on a projection vector: each distribution is projected onto the vector and approximated by a normal distribution, together with an accompanying error term. We then represent each normal as a point in a Hough-transformed space and use the concept of stochastic dominance to implement an efficient index structure in the transformed space. We show that our method significantly decreases K-NN query time on uncertain databases. The index structure also scales well with database cardinality and is well suited for heterogeneous data sets, helping to keep EMD-based queries tractable as uncertain data sets become larger and more complex. (VLDB 2012)
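The projection idea behind the lower bound can be illustrated directly. In this hedged sketch, discrete distributions are projected onto a vector and compared with the exact 1-D EMD (the area between their CDFs); because projection never increases ground distances, the result lower-bounds the true EMD. The paper's additional step of approximating each projection by a normal distribution for indexing is omitted here.

```python
import math

def emd_1d(pts_a, wts_a, pts_b, wts_b):
    """Exact 1-D EMD between discrete distributions: area between the CDFs."""
    xs = sorted(set(pts_a) | set(pts_b))
    cdf_a = cdf_b = emd = 0.0
    for x, nxt in zip(xs, xs[1:]):
        cdf_a += sum(w for p, w in zip(pts_a, wts_a) if p == x)
        cdf_b += sum(w for p, w in zip(pts_b, wts_b) if p == x)
        emd += abs(cdf_a - cdf_b) * (nxt - x)
    return emd

def projected_lower_bound(pts_a, wts_a, pts_b, wts_b, direction):
    """Project both distributions onto `direction`; the 1-D EMD of the
    projections is a lower bound on the full EMD."""
    norm = math.sqrt(sum(c * c for c in direction))
    proj = lambda p: sum(c * x for c, x in zip(direction, p)) / norm
    return emd_1d([proj(p) for p in pts_a], wts_a,
                  [proj(p) for p in pts_b], wts_b)
```

For unit masses at (0,0) and (3,4) the true EMD is 5; projecting onto (1,0) gives the valid but loose bound 3, while projecting along (3,4) makes the bound tight, which is why the choice of projection vector matters for pruning power.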
- Published
- 2011
- Full Text
- View/download PDF
40. Probabilistic Substructure Mining From Small-Molecule Screens
- Author
-
Ambuj K. Singh, Sayan Ranu, S. Joshua Swamidass, and Bradley T. Calhoun
- Subjects
Structure (mathematical logic) ,Virtual screening ,Computer science ,Organic Chemistry ,Probabilistic logic ,computer.software_genre ,Computer Science Applications ,Set (abstract data type) ,Structural Biology ,Cheminformatics ,Drug Discovery ,Benchmark (computing) ,Molecular Medicine ,Substructure ,Data mining ,Throughput (business) ,computer - Abstract
Identifying the overrepresented substructures from a set of molecules with similar activity is a common task in chemical informatics. Existing substructure miners are deterministic, requiring the activity of all mined molecules to be known with high confidence. In contrast, we introduce pGraphSig, a probabilistic structure miner, which effectively mines structures from noisy data, where many molecules are labeled with their probability of being active. We benchmark pGraphSig on data from several small-molecule high throughput screens, finding that it can more effectively identify overrepresented structures than a deterministic structure miner.
- Published
- 2011
- Full Text
- View/download PDF
41. Scalable discovery of best clusters on large graphs
- Author
-
Kathy Macropol and Ambuj K. Singh
- Subjects
Parallelizable manifold ,Theoretical computer science ,Computer science ,Correlation clustering ,Scalability ,General Engineering ,Directed graph ,Cluster analysis ,Time complexity ,Graph ,Clustering coefficient - Abstract
The identification of clusters, well-connected components in a graph, is useful in many applications, from biological function prediction to social community detection. However, finding these clusters can be difficult as graph sizes increase. Most current graph clustering algorithms scale poorly in terms of time or memory. An important insight is that many clustering applications need only the subset of best clusters, and not all clusters in the entire graph. In this paper we propose a new technique, Top Graph Clusters (TopGC), which probabilistically searches large, edge-weighted, directed graphs for their best clusters in linear time. The algorithm is inherently parallelizable and is able to find variable-size, overlapping clusters. To increase scalability, a parameter is introduced that controls memory use. When compared with three other state-of-the-art clustering techniques, TopGC achieves running-time speedups of up to 70% on large-scale real-world datasets. In addition, the clusters returned by TopGC are consistently found to be better, both in calculated score and when compared on real-world benchmarks.
- Published
- 2010
- Full Text
- View/download PDF
42. Efficient and Robust Detection of Duplicate Videos in a Large Database
- Author
-
Anindya Sarkar, Vishwakarma Singh, B.S. Manjunath, Ambuj K. Singh, and Pradiptya Ghosh
- Subjects
Database ,Search algorithm ,Computer science ,Search engine indexing ,Media Technology ,Vector quantization ,Data mining ,Electrical and Electronic Engineering ,Fingerprint recognition ,computer.software_genre ,Digital watermarking ,computer ,Data compression - Abstract
We present an efficient and accurate method for duplicate video detection in a large database using video fingerprints. We have empirically chosen the color layout descriptor, a compact and robust frame-based descriptor, to create fingerprints, which are further encoded by vector quantization (VQ). We propose a new nonmetric distance measure to find the similarity between a query and a database video fingerprint and experimentally show its superior performance over other distance measures for accurate duplicate detection. Efficient search cannot be performed for high-dimensional data using a nonmetric distance measure with existing indexing techniques. Therefore, we develop novel search algorithms based on precomputed distances and new dataset-pruning techniques, yielding practical retrieval times. We perform experiments with a database of 38,000 videos, worth 1600 hours of content. For individual queries with an average duration of 60 s (about 50% of the average database video length), the duplicate video is retrieved in 0.032 s on a 2.33 GHz Intel Xeon CPU, with a very high accuracy of 97.5%.
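A toy sketch of the fingerprinting step described above, with a made-up codebook and descriptors for illustration: each frame descriptor is vector-quantized to its nearest codeword, and the video fingerprint is the resulting index sequence. The paper's color layout descriptors and nonmetric matching are not reproduced here.

```python
def quantize(descriptor, codebook):
    """Index of the nearest codeword under squared Euclidean distance."""
    d2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: d2(descriptor, codebook[k]))

def fingerprint(frame_descriptors, codebook):
    """A video fingerprint: the sequence of codeword indices, one per frame."""
    return [quantize(f, codebook) for f in frame_descriptors]
```

Two near-duplicate videos then yield similar index sequences, which is what makes compact fingerprint comparison feasible at database scale.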
- Published
- 2010
- Full Text
- View/download PDF
43. Bisque: a platform for bioimage analysis and management
- Author
-
Kristian Kvilekval, Boguslaw Obara, Dmitry Fedorov, B.S. Manjunath, and Ambuj K. Singh
- Subjects
Diagnostic Imaging ,Statistics and Probability ,Databases, Factual ,Computer science ,Process (engineering) ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computational Biology ,Information Storage and Retrieval ,Bioimage informatics ,Biochemistry ,Data science ,Field (computer science) ,Computer Science Applications ,Digital media ,Metadata ,World Wide Web ,User-Computer Interface ,Computational Mathematics ,Computational Theory and Mathematics ,business ,Molecular Biology ,Software - Abstract
Motivation: Advances in the field of microscopy have brought about the need for better image management and analysis solutions. Novel imaging techniques have created vast stores of images and metadata that are difficult to organize, search, process and analyze. These tasks are further complicated by conflicting and proprietary image and metadata formats that impede the analysis and sharing of images and any associated data. These obstacles have resulted in research resources being locked away in digital media and file cabinets. Current image management systems do not address the pressing needs of researchers who must quantify image data on a regular basis. Results: We present Bisque, a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend Bisque with both data-model and analysis extensions in order to adapt the system to local needs. Bisque's extensibility stems from two core concepts: a flexible metadata facility and an open web-based architecture. Together these empower researchers to create, develop and share novel bioimage analyses. Several case studies using Bisque with specific applications are presented as an indication of how users can extend Bisque for their own purposes. Availability: Bisque is web-based, cross-platform and open source. The system is also available as software-as-a-service through the Center of Bioimage Informatics at UCSB. Contact: kris@cs.ucsb.edu; fedorov@ece.ucsb.edu Supplementary information: Supplementary material is available at Bioinformatics online, including screen shots, metadata XML descriptions and implementation details.
- Published
- 2009
- Full Text
- View/download PDF
44. Mining Statistically Significant Molecular Substructures for Efficient Molecular Classification
- Author
-
Sayan Ranu and Ambuj K. Singh
- Subjects
Molecular classification ,Molecular Structure ,Computer science ,General Chemical Engineering ,Information Storage and Retrieval ,General Chemistry ,Data mining ,Library and Information Sciences ,computer.software_genre ,computer ,Computer Science Applications - Abstract
The increased availability of large repositories of chemical compounds has created new challenges in designing efficient molecular querying and mining systems. Molecular classification is an important problem in drug development where libraries of chemical compounds are screened and molecules with the highest probability of success against a given target are selected. We have developed a technique called GraphSig to mine significantly over-represented molecular substructures in a given class of molecules. GraphSig successfully overcomes the scalability bottleneck of mining patterns at a low frequency. Patterns mined by GraphSig display correlation with biological activities and serve as an excellent platform on which to build molecular analysis tools. The potential of GraphSig as a chemical descriptor is explored, and support vector machines are used to classify molecules described by patterns mined using GraphSig. Furthermore, the over-represented patterns are more informative than features generated exhaustively by traditional fingerprints; this has potential in providing scaffolds and lead generation. Extensive experiments are carried out to evaluate the proposed techniques, and empirical results show promising performance in terms of classification quality. An implementation of the algorithm is available free for academic use at http://www.uweb.ucsb.edu/~sayan/software/GraphSig.tar.
- Published
- 2009
- Full Text
- View/download PDF
45. Optimization Techniques for Reactive Network Monitoring
- Author
-
Ahmet Bulut, A. Meka, Divesh Srivastava, Nick Koudas, and Ambuj K. Singh
- Subjects
Network architecture ,Computer science ,Distributed computing ,Real-time computing ,Local area network ,Network monitoring ,Network traffic control ,Computer Science Applications ,law.invention ,Scheduling (computing) ,Network simulation ,Network element ,Computational Theory and Mathematics ,law ,Internet Protocol ,Wireless sensor network ,Network management station ,Information Systems - Abstract
We develop a framework for minimizing the communication overhead of monitoring global system parameters in IP networks and sensor networks. A global system predicate is defined as a conjunction of the local properties of different network elements. A typical example is to identify the time windows when the outbound traffic from each network element exceeds a predefined threshold. Our main idea is to optimize the scheduling of local event reporting across network elements for a given network traffic load and given local event frequencies. The system architecture consists of N distributed network elements coordinated by a central monitoring station. Each network element monitors a set of local properties, and the central station is responsible for identifying the status of global parameters registered in the system. We design an optimal algorithm, the partition and rank (PAR) scheme, for the case when local events are independent; when they are dependent, we show that the problem is NP-complete and develop two efficient heuristics: PAR for dependent events (PAR-D) and adaptive (Ada) algorithms, which adapt well to changing network conditions and outperform the current state-of-the-art techniques in terms of communication cost.
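One way to see why reporting order matters for a conjunctive global predicate, assuming independent local events as in the PAR setting (this sketch illustrates the intuition only, not the PAR algorithm itself): if each element reports only while the conjunction is still alive, the expected number of messages is minimized by letting the least likely element report first.

```python
def expected_messages(probs, order):
    """Expected reports when element order[i] reports only if every
    earlier element in the order fired (independent local events)."""
    exp, alive = 0.0, 1.0
    for i in order:
        exp += alive          # this element reports with probability `alive`
        alive *= probs[i]     # the conjunction survives only if it fired
    return exp

def best_order(probs):
    """Ascending firing probability minimizes the expected message count."""
    return sorted(range(len(probs)), key=lambda i: probs[i])
```

With firing probabilities [0.9, 0.1, 0.5], the order (0.1, 0.5, 0.9) costs an expected 1.15 messages versus 2.35 for the worst order, which is the kind of communication saving the framework targets.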
- Published
- 2009
- Full Text
- View/download PDF
46. FTDP-17 Mutations in Tau Alter the Regulation of Microtubule Dynamics
- Author
-
Adria C LeBoeuf, Sasha F. Levy, Leslie Wilson, Ambuj K. Singh, Michelle R. Gaylord, Stuart C. Feinstein, Arnab Bhattacharya, and Mary Ann Jordan
- Subjects
Genetics ,Mutation ,biology ,Neurodegeneration ,Mutant ,Tau protein ,Regulator ,Cell Biology ,medicine.disease ,medicine.disease_cause ,Biochemistry ,Cell biology ,Tubulin ,Microtubule ,mental disorders ,biology.protein ,medicine ,Molecular Biology ,Peptide sequence - Abstract
Mutations affecting either the structure or regulation of the microtubule-associated protein Tau cause neuronal cell death and dementia. However, the molecular mechanisms mediating these deleterious effects remain unclear. Among the most characterized activities of Tau is the ability to regulate microtubule dynamics, known to be essential for proper cell function and viability. Here we have tested the hypothesis that Tau mutations causing neurodegeneration also alter the ability of Tau to regulate the dynamic instability behaviors of microtubules. Using in vitro microtubule dynamics assays to assess average microtubule growth rates, microtubule growth rate distributions, and catastrophe frequencies, we found that all tested mutants possessing amino acid substitutions or deletions mapping to either the repeat or interrepeat regions of Tau do indeed compromise its ability to regulate microtubule dynamics. Further mutational analyses suggest a novel mechanism of Tau regulatory action based on an "alternative core" of microtubule binding and regulatory activities composed of two repeats and the interrepeat between them. In this model, the interrepeat serves as the primary regulator of microtubule dynamics, whereas the flanking repeats serve as tethers to properly position the interrepeat on the microtubule. Importantly, since there are multiple interrepeats on each Tau molecule, there are also multiple cores on each Tau molecule, each with distinct mechanistic capabilities, thereby providing significant regulatory potential. Taken together, the data are consistent with a microtubule misregulation mechanism for Tau-mediated neuronal cell death and provide a novel mechanistic model for normal and pathological Tau action.
- Published
- 2008
- Full Text
- View/download PDF
47. Learning Predictive Substructures with Regularization for Network Data
- Author
-
Xuan Hong Dang, Petko Bogdanov, Ambuj K. Singh, and Hongyuan You
- Subjects
Computer science ,Graph theory ,Semi-supervised learning ,Complex network ,computer.software_genre ,Generalization error ,Regularization (mathematics) ,Convex optimization ,Graph (abstract data type) ,Algorithm design ,Data mining ,Gradient descent ,computer ,Subspace topology - Abstract
Learning a succinct set of substructures that predicts global network properties plays a key role in understanding complex network data. Existing approaches address this problem by sampling the exponential space of all possible subnetworks to find ones of high prediction accuracy. In this paper, we develop a novel framework that avoids sampling by formulating the problem of predictive subnetwork learning as node selection, subject to network-constrained regularization. Our framework involves two steps: (i) subspace learning, and (ii) predictive substructures discovery with network regularization. The framework is developed based upon two mathematically sound techniques of spectral graph learning and gradient descent optimization, and we show that their solutions converge to a global optimum solution -- a desired property that cannot be guaranteed by sampling approaches. Through experimental analysis on a number of real world datasets, we demonstrate the performance of our framework against state-of-the-art algorithms, not only based on prediction accuracy but also in terms of domain relevance of the discovered substructures.
- Published
- 2015
- Full Text
- View/download PDF
48. Beyond Models
- Author
-
Ambuj K. Singh, Bruno Ribeiro, and Minh X. Hoang
- Subjects
Social network ,Computer science ,business.industry ,Stochastic process ,Process (engineering) ,Degrees of freedom (statistics) ,Complex network ,computer.software_genre ,Machine learning ,Data mining ,Artificial intelligence ,Information cascade ,Consensus forecast ,business ,computer ,Equivalence (measure theory) - Abstract
Complex network phenomena -- such as information cascades in online social networks -- are hard to fully observe, model, and forecast. In forecasting, a recent trend has been to forgo the use of parsimonious models in favor of models with increasingly large degrees of freedom that are trained to learn the behavior of a process from historical data. Extrapolating this trend into the future, eventually we would renounce models altogether. But is it possible to forecast the evolution of a complex stochastic process directly from the data, without a model? In this work we show that model-free forecasting is possible. We present SED, an algorithm that forecasts process statistics based on relationships of statistical equivalence, using two general axioms and historical data. To the best of our knowledge, SED is the first method that can perform axiomatic, model-free forecasts of complex stochastic processes. Our simulations using simple and complex evolving processes, together with tests performed on a large real-world dataset, show promising results.
- Published
- 2015
- Full Text
- View/download PDF
49. Hierarchical in-network attribute compression via importance sampling
- Author
-
Ambuj K. Singh, Petko Bogdanov, and Arlei Silva
- Subjects
Tree (data structure) ,Computer science ,Node (networking) ,Approximation algorithm ,Data compression ratio ,Data mining ,Enhanced Data Rates for GSM Evolution ,Lossy compression ,computer.software_genre ,computer ,Importance sampling - Abstract
Many real-world complex systems can be modeled as dynamic networks with real-valued vertex/edge attributes. Examples include users' opinions in social networks and average speeds in a road system. When managing these large dynamic networks, compressing attribute values becomes a key requirement, since it enables the answering of attribute-based queries regarding a node/edge or network region based on a compact representation of the data. To address this problem, we introduce a lossy network compression scheme called Slice Tree (ST), which partitions a network into smooth regions with respect to node/edge values and compresses each value as the average of its region. ST applies a compact representation for network partitions, called slices, defined by a center node and a radius distance. We propose an importance sampling algorithm to efficiently prune the search space of candidate slices in the ST construction by biasing the sampling process towards the node values that most affect the compression error. The effectiveness of ST in terms of compression error, compression rate, and running time is demonstrated using synthetic and real datasets. ST scales to million-node instances and removes up to 87% of the error in attribute values at a 10^3 compression ratio. We also illustrate how ST captures relevant phenomena in real networks, such as research collaboration patterns and traffic congestion.
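The slice idea can be sketched in miniature. The following illustrative code (exhaustive search, not the paper's importance-sampling construction) treats a slice as a (center, radius) hop-distance ball, compresses each of the two resulting regions by its mean value, and picks the slice that removes the most squared error relative to one global mean:

```python
from collections import deque

def ball(adj, center, radius):
    """Nodes within `radius` hops of `center` (BFS)."""
    seen, frontier = {center}, deque([(center, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == radius:
            continue
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                frontier.append((v, d + 1))
    return seen

def sse(nodes, value):
    """Squared error of compressing `nodes` by their mean attribute value."""
    mean = sum(value[n] for n in nodes) / len(nodes)
    return sum((value[n] - mean) ** 2 for n in nodes)

def best_slice(adj, value, max_radius=2):
    """Best (center, radius) slice and the error it removes; exhaustive."""
    base = sse(list(adj), value)
    best, best_err = None, base
    for c in adj:
        for r in range(max_radius + 1):
            inside = ball(adj, c, r)
            outside = [u for u in adj if u not in inside]
            err = sse(inside, value) + (sse(outside, value) if outside else 0.0)
            if err < best_err:
                best, best_err = (c, r), err
    return best, base - best_err
```

On a path a-b-c-d with values 0, 0, 10, 10, the slice centered at a with radius 1 splits the graph into two smooth regions and removes all of the compression error; importance sampling lets the full algorithm find such slices without enumerating every (center, radius) pair.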
- Published
- 2015
- Full Text
- View/download PDF
50. Discovery of LRRK2 inhibitors using sequential in silico joint pharmacophore space (JPS) and ensemble docking
- Author
-
Gregory D. Cuny, Christian A. Lang, Soumya S. Ray, Ambuj K. Singh, and Min Liu
- Subjects
Models, Molecular ,In silico ,Clinical Biochemistry ,Pharmaceutical Science ,Computational biology ,Protein Serine-Threonine Kinases ,Leucine-Rich Repeat Serine-Threonine Protein Kinase-2 ,Biochemistry ,Structure-Activity Relationship ,Drug Discovery ,Ic50 values ,Humans ,Computer Simulation ,Molecular Biology ,Protein Kinase Inhibitors ,Binding Sites ,Chemistry ,Organic Chemistry ,Parkinson Disease ,LRRK2 ,Combinatorial chemistry ,nervous system diseases ,High-Throughput Screening Assays ,Amino Acid Substitution ,Docking (molecular) ,Structural Homology, Protein ,Molecular Medicine ,Mutant Proteins ,Pharmacophore - Abstract
Joint pharmacophore space (JPS), ensemble docking and sequential JPS-ensemble docking were used to select three panels of compounds (10 per panel) for evaluation as LRRK2 inhibitors. These computational methods identified four LRRK2 inhibitors with IC50 values
- Published
- 2015