Author: "Mesquita, Diego" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Mesquita, Diego"' showing total 119 results

Start Over Author "Mesquita, Diego"

119 results on '"Mesquita, Diego"'

1. On Divergence Measures for Training GFlowNets

Author: da Silva, Tiago, da Silva, Eliezer de Souza, and Mesquita, Diego
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning, 68T05, G.3, I.5.1, I.2.8, I.2.6
Abstract: Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distributions over composable objects, with applications in generative modeling for tasks in fields such as causal discovery, NLP, and drug discovery. Traditionally, the training procedure for GFlowNets seeks to minimize the expected log-squared difference between a proposal (forward policy) and a target (backward policy) distribution, which enforces certain flow-matching conditions. While this training procedure is closely related to variational inference (VI), directly attempting standard Kullback-Leibler (KL) divergence minimization can lead to proven biased and potentially high-variance estimators. Therefore, we first review four divergence measures, namely, Renyi-$\alpha$'s, Tsallis-$\alpha$'s, reverse and forward KL's, and design statistically efficient estimators for their stochastic gradients in the context of training GFlowNets. Then, we verify that properly minimizing these divergences yields a provably correct and empirically effective training scheme, often leading to significantly faster convergence than previously proposed optimization. To achieve this, we design control variates based on the REINFORCE leave-one-out and score-matching estimators to reduce the variance of the learning objectives' gradients. Our work contributes by narrowing the gap between GFlowNets training and generalized variational approximations, paving the way for algorithmic ideas informed by the divergence minimization viewpoint., Comment: Accepted at NeurIPS 2024, https://openreview.net/forum?id=N5H4z0Pzvn
Published: 2024

2. Embarrassingly Parallel GFlowNets

Author: da Silva, Tiago, Carvalho, Luiz Max, Souza, Amauri, Kaski, Samuel, and Mesquita, Diego
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: GFlowNets are a promising alternative to MCMC sampling for discrete compositional random variables. Training GFlowNets requires repeated evaluations of the unnormalized target distribution or reward function. However, for large-scale posterior sampling, this may be prohibitive since it incurs traversing the data several times. Moreover, if the data are distributed across clients, employing standard GFlowNets leads to intensive client-server communication. To alleviate both these issues, we propose embarrassingly parallel GFlowNet (EP-GFlowNet). EP-GFlowNet is a provably correct divide-and-conquer method to sample from product distributions of the form $R(\cdot) \propto R_1(\cdot) ... R_N(\cdot)$ -- e.g., in parallel or federated Bayes, where each $R_n$ is a local posterior defined on a data partition. First, in parallel, we train a local GFlowNet targeting each $R_n$ and send the resulting models to the server. Then, the server learns a global GFlowNet by enforcing our newly proposed \emph{aggregating balance} condition, requiring a single communication step. Importantly, EP-GFlowNets can also be applied to multi-objective optimization and model reuse. Our experiments illustrate the EP-GFlowNets's effectiveness on many tasks, including parallel Bayesian phylogenetics, multi-objective multiset, sequence generation, and federated Bayesian structure learning., Comment: Accepted to ICML 2024
Published: 2024

3. In-n-Out: Calibrating Graph Neural Networks for Link Prediction

Author: Nascimento, Erik, Mesquita, Diego, Kaski, Samuel, and Souza, Amauri H
Subjects: Computer Science - Machine Learning
Abstract: Deep neural networks are notoriously miscalibrated, i.e., their outputs do not reflect the true probability of the event we aim to predict. While networks for tabular or image data are usually overconfident, recent works have shown that graph neural networks (GNNs) show the opposite behavior for node-level classification. But what happens when we are predicting links? We show that, in this case, GNNs often exhibit a mixed behavior. More specifically, they may be overconfident in negative predictions while being underconfident in positive ones. Based on this observation, we propose IN-N-OUT, the first-ever method to calibrate GNNs for link prediction. IN-N-OUT is based on two simple intuitions: i) attributing true/false labels to an edge while respecting a GNNs prediction should cause but small fluctuations in that edge's embedding; and, conversely, ii) if we label that same edge contradicting our GNN, embeddings should change more substantially. An extensive experimental campaign shows that IN-N-OUT significantly improves the calibration of GNNs in link prediction, consistently outperforming the baselines available -- which are not designed for this specific task., Comment: 18 pages, 4 figures, 8 tables
Published: 2024

4. Thin and Deep Gaussian Processes

Author: de Souza, Daniel Augusto, Nikitin, Alexander, John, ST, Ross, Magnus, Álvarez, Mauricio A., Deisenroth, Marc Peter, Gomes, João P. P., Mesquita, Diego, and Mattos, César Lincoln C.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Gaussian processes (GPs) can provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer-by-layer but lose all the interpretability of shallow GPs. An alternative construction is to successively parameterize the lengthscale of a kernel, improving the interpretability but ultimately giving away the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies which may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP). Each TDGP layer defines locally linear transformations of the original input data maintaining the concept of latent embeddings while also retaining the interpretation of lengthscales of a kernel. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well in standard benchmark datasets., Comment: Accepted at the Conference on Neural Information Processing Systems (NeurIPS) 2023
Published: 2023

5. Human-in-the-Loop Causal Discovery under Latent Confounding using Ancestral GFlowNets

Author: da Silva, Tiago, Silva, Eliezer, Góis, António, Heider, Dominik, Kaski, Samuel, Mesquita, Diego, and Ribeiro, Adèle
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Structure learning is the crux of causal inference. Notably, causal discovery (CD) algorithms are brittle when data is scarce, possibly inferring imprecise causal relations that contradict expert knowledge -- especially when considering latent confounders. To aggravate the issue, most CD methods do not provide uncertainty estimates, making it hard for users to interpret results and improve the inference process. Surprisingly, while CD is a human-centered affair, no works have focused on building methods that both 1) output uncertainty estimates that can be verified by experts and 2) interact with those experts to iteratively refine CD. To solve these issues, we start by proposing to sample (causal) ancestral graphs proportionally to a belief distribution based on a score function, such as the Bayesian information criterion (BIC), using generative flow networks. Then, we leverage the diversity in candidate graphs and introduce an optimal experimental design to iteratively probe the expert about the relations among variables, effectively reducing the uncertainty of our belief over ancestral graphs. Finally, we update our samples to incorporate human feedback via importance sampling. Importantly, our method does not require causal sufficiency (i.e., unobserved confounders may exist). Experiments with synthetic observational data show that our method can accurately sample from distributions over ancestral graphs and that we can greatly improve inference quality with human aid.
Published: 2023

6. Locking and Quacking: Stacking Bayesian model predictions by log-pooling and superposition

Author: Yao, Yuling, Carvalho, Luiz Max, Mesquita, Diego, and McLatchie, Yann
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Combining predictions from different models is a central problem in Bayesian inference and machine learning more broadly. Currently, these predictive distributions are almost exclusively combined using linear mixtures such as Bayesian model averaging, Bayesian stacking, and mixture of experts. Such linear mixtures impose idiosyncrasies that might be undesirable for some applications, such as multi-modality. While there exist alternative strategies (e.g. geometric bridge or superposition), optimising their parameters usually involves computing an intractable normalising constant repeatedly. We present two novel Bayesian model combination tools. These are generalisations of model stacking, but combine posterior densities by log-linear pooling (locking) and quantum superposition (quacking). To optimise model weights while avoiding the burden of normalising constants, we investigate the Hyvarinen score of the combined posterior predictions. We demonstrate locking with an illustrative example and discuss its practical application with importance sampling., Comment: An earlier version appeared at the NeurIPS 2022 Workshop on Score-Based Methods
Published: 2023

7. Distill n' Explain: explaining graph neural networks using simple surrogates

Author: Pereira, Tamara, Nascimento, Erik, Resck, Lucas E., Mesquita, Diego, and Souza, Amauri
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Explaining node predictions in graph neural networks (GNNs) often boils down to finding graph substructures that preserve predictions. Finding these structures usually implies back-propagating through the GNN, bonding the complexity (e.g., number of layers) of the GNN to the cost of explaining it. This naturally begs the question: Can we break this bond by explaining a simpler surrogate GNN? To answer the question, we propose Distill n' Explain (DnX). First, DnX learns a surrogate GNN via knowledge distillation. Then, DnX extracts node or edge-level explanations by solving a simple convex program. We also propose FastDnX, a faster version of DnX that leverages the linear decomposition of our surrogate model. Experiments show that DnX and FastDnX often outperform state-of-the-art GNN explainers while being orders of magnitude faster. Additionally, we support our empirical findings with theoretical results linking the quality of the surrogate model (i.e., distillation error) to the faithfulness of explanations., Comment: To appear in AISTATS 2023
Published: 2023

8. Towards automatic labeling of exception handling bugs: A case study of 10 years bug-fixing in Apache Hadoop

Author: da Silva, Antônio José A., Vieira, Renan G., Mesquita, Diego P. P., Gomes, João Paulo P., and Rocha, Lincoln S.
Published: 2024
Full Text: View/download PDF

9. Provably expressive temporal graph networks

Author: Souza, Amauri H., Mesquita, Diego, Kaski, Samuel, and Garg, Vikas
Subjects: Computer Science - Machine Learning
Abstract: Temporal graph networks (TGNs) have gained prominence as models for embedding dynamic interactions, but little is known about their theoretical underpinnings. We establish fundamental results about the representational power and limits of the two main categories of TGNs: those that aggregate temporal walks (WA-TGNs), and those that augment local message passing with recurrent memory modules (MP-TGNs). Specifically, novel constructions reveal the inadequacy of MP-TGNs and WA-TGNs, proving that neither category subsumes the other. We extend the 1-WL (Weisfeiler-Leman) test to temporal graphs, and show that the most powerful MP-TGNs should use injective updates, as in this case they become as expressive as the temporal WL. Also, we show that sufficiently deep MP-TGNs cannot benefit from memory, and MP/WA-TGNs fail to compute graph properties such as girth. These theoretical insights lead us to PINT -- a novel architecture that leverages injective temporal message passing and relative positional features. Importantly, PINT is provably more expressive than both MP-TGNs and WA-TGNs. PINT significantly outperforms existing TGNs on several real-world benchmarks., Comment: Accepted to NeurIPS 2022
Published: 2022

10. Bayesian ART for incomplete datasets

Author: Matias, Alan L.S., Gomes, João Paulo P., Mattos, César Lincoln C., Rocha Neto, Ajalmar R., and Mesquita, Diego
Published: 2024
Full Text: View/download PDF

11. Parallel MCMC Without Embarrassing Failures

Author: de Souza, Daniel Augusto, Mesquita, Diego, Kaski, Samuel, and Acerbi, Luigi
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Methodology
Abstract: Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach. First, MCMC is run in parallel on (sub)posteriors defined on data partitions. Then, a server combines local results. While efficient, this framework is very sensitive to the quality of subposterior sampling. Common sampling problems such as missing modes or misrepresentation of low-density regions are amplified -- instead of being corrected -- in the combination phase, leading to catastrophic failures. In this work, we propose a novel combination strategy to mitigate this issue. Our strategy, Parallel Active Inference (PAI), leverages Gaussian Process (GP) surrogate modeling and active learning. After fitting GPs to subposteriors, PAI (i) shares information between GP surrogates to cover missing modes; and (ii) uses active sampling to individually refine subposterior approximations. We validate PAI in challenging benchmarks, including heavy-tailed and multi-modal posteriors and a real-world application to computational neuroscience. Empirical results show that PAI succeeds where previous methods catastrophically fail, with a small communication overhead., Comment: To appear in the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022). For associated code, see https://github.com/spectraldani/pai/
Published: 2022

12. Rethinking pooling in graph neural networks

Author: Mesquita, Diego, Souza, Amauri H., and Kaski, Samuel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Graph pooling is a central component of a myriad of graph neural network (GNN) architectures. As an inheritance from traditional CNNs, most approaches formulate graph pooling as a cluster assignment problem, extending the idea of local patches in regular grids to graphs. Despite the wide adherence to this design choice, no work has rigorously evaluated its influence on the success of GNNs. In this paper, we build upon representative GNNs and introduce variants that challenge the need for locality-preserving representations, either using randomization or clustering on the complement graph. Strikingly, our experiments demonstrate that using these variants does not result in any decrease in performance. To understand this phenomenon, we study the interplay between convolutional layers and the subsequent pooling ones. We show that the convolutions play a leading role in the learned representations. In contrast to the common belief, local pooling is not responsible for the success of GNNs on relevant and widely-used benchmarks., Comment: Accepted to NeurIPS 2020
Published: 2020

13. Federated Stochastic Gradient Langevin Dynamics

Author: Mekkaoui, Khaoula El, Mesquita, Diego, Blomstedt, Paul, and Kaski, Samuel
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), employ fast but noisy gradient estimates to enable large-scale posterior sampling. Although we can easily extend SGLD to distributed settings, it suffers from two issues when applied to federated non-IID data. First, the variance of these estimates increases significantly. Second, delaying communication causes the Markov chains to diverge from the true posterior even for very simple models. To alleviate both these problems, we propose conducive gradients, a simple mechanism that combines local likelihood approximations to correct gradient updates. Notably, conducive gradients are easy to compute, and since we only calculate the approximations once, they incur negligible overhead. We apply conducive gradients to distributed stochastic gradient Langevin dynamics (DSGLD) and call the resulting method federated stochastic gradient Langevin dynamics (FSGLD). We demonstrate that our approach can handle delayed communication rounds, converging to the target posterior in cases where DSGLD fails. We also show that FSGLD outperforms DSGLD for non-IID federated data with experiments on metric learning and neural networks., Comment: Accepted to UAI 2021
Published: 2020

14. Learning GPLVM with arbitrary kernels using the unscented transformation

Author: de Souza, Daniel Augusto R. M. A., Mesquita, Diego, Mattos, César Lincoln C., and Gomes, João Paulo P.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Gaussian Process Latent Variable Model (GPLVM) is a flexible framework to handle uncertain inputs in Gaussian Processes (GPs) and incorporate GPs as components of larger graphical models. Nonetheless, the standard GPLVM variational inference approach is tractable only for a narrow family of kernel functions. The most popular implementations of GPLVM circumvent this limitation using quadrature methods, which may become a computational bottleneck even for relatively low dimensions. For instance, the widely employed Gauss-Hermite quadrature has exponential complexity on the number of dimensions. In this work, we propose using the unscented transformation instead. Overall, this method presents comparable, if not better, performance than offthe-shelf solutions to GPLVM and its computational complexity scales only linearly on dimension. In contrast to Monte Carlo methods, our approach is deterministic and works well with quasi-Newton methods, such as the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. We illustrate the applicability of our method with experiments on dimensionality reduction and multistep-ahead prediction with uncertainty propagation., Comment: 10 pages, currently under review
Published: 2019

15. LS-SVR as a Bayesian RBF network

Author: Mesquita, Diego P. P., Freitas, Luis A., Gomes, João P. P., and Mattos, César L. C.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We show theoretical similarities between the Least Squares Support Vector Regression (LS-SVR) model with a Radial Basis Functions (RBF) kernel and maximum a posteriori (MAP) inference on Bayesian RBF networks with a specific Gaussian prior on the regression weights. Although previous works have pointed out similar expressions between those learning approaches, we explicit and formally state the existing correspondences. We empirically demonstrate our result by performing computational experiments with standard regression benchmarks. Our findings open a range of possibilities to improve LS-SVR by borrowing strength from well-established developments in Bayesian methodology., Comment: 14 pages, currently under review
Published: 2019

16. Meta-analysis of Bayesian analyses

Author: Blomstedt, Paul, Mesquita, Diego, Rivasplata, Omar, Lintusaari, Jarno, Sivula, Tuomas, Corander, Jukka, and Kaski, Samuel
Subjects: Statistics - Methodology
Abstract: Meta-analysis aims to generalize results from multiple related statistical analyses through a combined analysis. While the natural outcome of a Bayesian study is a posterior distribution, traditional Bayesian meta-analyses proceed by combining summary statistics (i.e., point-valued estimates) computed from data. In this paper, we develop a framework for combining posterior distributions from multiple related Bayesian studies into a meta-analysis. Importantly, the method is capable of reusing pre-computed posteriors from computationally costly analyses, without needing the implementation details from each study. Besides providing a consensus across studies, the method enables updating the local posteriors post-hoc and therefore refining them by sharing statistical strength between the studies, without rerunning the original analyses. We illustrate the wide applicability of the framework by combining results from likelihood-free Bayesian analyses, which would be difficult to carry out using standard methodology., Comment: Published at Bayesian Analysis
Published: 2019
Full Text: View/download PDF

17. Embarrassingly parallel MCMC using deep invertible transformations

Author: Mesquita, Diego, Blomstedt, Paul, and Kaski, Samuel
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: While MCMC methods have become a main work-horse for Bayesian inference, scaling them to large distributed datasets is still a challenge. Embarrassingly parallel MCMC strategies take a divide-and-conquer stance to achieve this by writing the target posterior as a product of subposteriors, running MCMC for each of them in parallel and subsequently combining the results. The challenge then lies in devising efficient aggregation strategies. Current strategies trade-off between approximation quality, and costs of communication and computation. In this work, we introduce a novel method that addresses these issues simultaneously. Our key insight is to introduce a deep invertible transformation to approximate each of the subposteriors. These approximations can be made accurate even for complex distributions and serve as intermediate representations, keeping the total communication cost limited. Moreover, they enable us to sample from the product of the subposteriors using an efficient and stable importance sampling scheme. We demonstrate the approach outperforms available state-of-the-art methods in a range of challenging scenarios, including high-dimensional and heterogeneous subposteriors., Comment: Accepted to UAI 2019
Published: 2019

18. ConveXplainer for Graph Neural Networks

Author: Pereira, Tamara A., Nascimento, Erik Jhones F., Mesquita, Diego, Souza, Amauri H., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Xavier-Junior, João Carlos, editor, and Rios, Ricardo Araújo, editor
Published: 2022
Full Text: View/download PDF

19. ConveXplainer for Graph Neural Networks

Author: Pereira, Tamara A., primary, Nascimento, Erik Jhones F., additional, Mesquita, Diego, additional, and Souza, Amauri H., additional
Published: 2022
Full Text: View/download PDF

20. Artificial Neural Networks with Random Weights for Incomplete Datasets

Author: Mesquita, Diego P. P., Gomes, João Paulo P., and Rodrigues, Leonardo R.
Published: 2019
Full Text: View/download PDF

21. Gaussian kernels for incomplete data

Author: Mesquita, Diego P.P., Gomes, João P.P., Corona, Francesco, Souza, Amauri H., Junior, and Nobre, Juvêncio S.
Published: 2019
Full Text: View/download PDF

22. Uma Introdução Amigável às Redes Neurais para Grafos

Author: Silva, Tiago, primary, Souza Junior, Amauri Holanda, additional, and Mesquita, Diego Parente Paiva, additional
Published: 2023
Full Text: View/download PDF

23. Radial Basis Function Neural Networks for Datasets with Missing Values

Author: Mesquita, Diego P. Paiva, Gomes, João Paulo P., Madureira, Ana Maria, editor, Abraham, Ajith, editor, Gamboa, Dorabela, editor, and Novais, Paulo, editor
Published: 2017
Full Text: View/download PDF

24. Forward Stagewise Regression on Incomplete Datasets

Author: Veras, Marcelo B. A., Mesquita, Diego P. P., Gomes, João P. P., Souza Junior, Amauri H., Barreto, Guilherme A., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Rojas, Ignacio, editor, Joya, Gonzalo, editor, and Catala, Andreu, editor
Published: 2017
Full Text: View/download PDF

25. Building selective ensembles of Randomization Based Neural Networks with the successive projections algorithm

Author: Mesquita, Diego P.P., P. Gomes, João Paulo, Rodrigues, Leonardo R., Oliveira, Saulo A.F., and Galvão, Roberto K.H.
Published: 2018
Full Text: View/download PDF

26. Classification with reject option for software defect prediction

Author: Mesquita, Diego P.P., Rocha, Lincoln S., Gomes, João Paulo P., and Rocha Neto, Ajalmar R.
Published: 2016
Full Text: View/download PDF

27. A Minimal Learning Machine for Datasets with Missing Values

Author: Mesquita, Diego P. Paiva, Gomes, João Paulo P., Jr., Amauri H. Souza, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Arik, Sabri, editor, Huang, Tingwen, editor, Lai, Weng Kin, editor, and Liu, Qingshan, editor
Published: 2015
Full Text: View/download PDF

28. Ensemble of Minimal Learning Machines for Pattern Classification

Author: Mesquita, Diego Parente Paiva, Gomes, João Paulo Pordeus, Junior, Amauri Holanda Souza, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Rojas, Ignacio, editor, Joya, Gonzalo, editor, and Catala, Andreu, editor
Published: 2015
Full Text: View/download PDF

29. Fast Co-MLM: An Efficient Semi-supervised Co-training Method Based on the Minimal Learning Machine

Author: Caldas, Weslley L., Gomes, João P. P., and Mesquita, Diego P. P.
Published: 2017
Full Text: View/download PDF

30. Ensemble of Efficient Minimal Learning Machines for Classification and Regression

Author: Mesquita, Diego P. P., Gomes, João P. P., and Souza Junior, Amauri H.
Published: 2017
Full Text: View/download PDF

31. Bayesian Analysis of Bug-Fixing Time using Report Data

Author: Vieira, Renan, primary, Mesquita, Diego, additional, Mattos, César Lincoln, additional, Britto, Ricardo, additional, Rocha, Lincoln, additional, and Gomes, João, additional
Published: 2022
Full Text: View/download PDF

32. Forward Stagewise Regression on Incomplete Datasets

Author: Veras, Marcelo B. A., primary, Mesquita, Diego P. P., additional, Gomes, João P. P., additional, Souza Junior, Amauri H., additional, and Barreto, Guilherme A., additional
Published: 2017
Full Text: View/download PDF

33. What's on the News? The Use of Media Texts in Exams of Clinical Biochemistry for Medical and Nutrition Students

Author: Oliveira, Julia Martins, Mesquita, Diego Martins, and Hermes-Lima, Marcelo
Abstract: Health-related popular articles are easily found among media sources. With the increasing popularity of the internet, medical information--full of misconceptions--has become easily available to the lay people. The ability to recognize misconceptions may require good biomedical knowledge. In this sense, we decided to use articles from the internet as part of a formal exam to evaluate students' learning of Clinical and Applied Biochemistry (CAB). This test, known as the True-or-False (T-or-F) exam, is made up of statements found online that are judged by freshmen medical and nutrition students taking Basic Biochemistry. In the last four teaching-semesters, students' acceptance and responses to T-or-F exam on CAB were evaluated through questionnaires (using a 0-4 Likert scale). Results from 258 students revealed that 71, 87, and 94% of them believed, respectively, that the exam was (i) difficult, (ii) of good quality, and (iii) that using media-questions is relevant for evaluating the learning of CAB. Moreover, the average grade in the T-of-F exam was 5.85 (out of 10). This low average is probably because students are not familiarized with this sort of examination that does not emphasize on memorizations of biochemical pathways and processes--it instead evaluates mostly the comprehension and application of knowledge, levels 2 and 3 in Bloom's scale. Such conclusion was possible by analyzing 192 questions in four exams--67% were at levels 2, 3 or above. This kind of media-based exam could be well applied to several other disciplines in health sciences. (Contains 4 tables, 2 boxes, and 2 footnotes.)
Published: 2010
Full Text: View/download PDF

34. Bayesian Analysis of Bug-Fixing Time using Report Data

Author: Vieira, Renan, Mesquita, Diego, Mattos, César Lincoln, Britto, Ricardo, Rocha, Lincoln, Gomes, João, Vieira, Renan, Mesquita, Diego, Mattos, César Lincoln, Britto, Ricardo, Rocha, Lincoln, and Gomes, João
Abstract: Background: Bug-fixing is the crux of software maintenance. It entails tending to heaps of bug reports using limited resources. Using historical data, we can ask questions that contribute to betterinformed allocation heuristics. The caveat here is that often there is not enough data to provide a sound response. This issue is especially prominent for young projects. Also, answers may vary from project to project. Consequently, it is impossible to generalize results without assuming a notion of relatedness between projects. Aims: Evaluate the independent impact of three report features in the bug-fixing time (BFT), generalizing results from many projects: bug priority, code-churn size in bug fixing commits, and existence of links to other reports (e.g., depends on or blocks other bug reports). Method: We analyze 55 projects from the Apache ecosystem using Bayesian statistics. Similar to standard random effects methodology, we assume each project's average BFT is a dispersed version of a global average BFT that we want to assess. We split the data based on feature values/range (e.g., with or without links). For each split, we compute a posterior distribution over its respective global BFT. Finally, we compare the posteriors to establish the feature's effect on the BFT. We run independent analyses for each feature. Results: Our results show that the existence of links and higher code-churn values lead to BFTs that are at least twice as long. On the other hand, considering three levels of priority (low, medium, and high), we observe no difference in the BFT. Conclusion: To the best of our knowledge, this is the first study using hierarchical Bayes to extrapolate results from multiple projects and assess the global effect of different attributes on the BFT. We use this methodology to gain insight on how links, priority, and code-churn size impact the BFT. On top of that, our posteriors can be used as a prior to analyze novel projects, potentially young and scarce on data., open access
Published: 2022
Full Text: View/download PDF

35. Cambiar terapia antirretroviral: inefectividad o inseguridad?

Author: Mesquita, Diego Pinheiro, Araújo, Andressa Silva, Mourão, Erika Maria Siqueira, and Romeu, Geysa Aguiar
Subjects: Acquired Immunodeficiency Syndrome, Drug-Related Side Effects and Adverse Reactions, Anti-HIV Agents, Atención al paciente, Patient care, Infecciones por VIH, Efectos colaterales y reacciones adversas relacionados con medicamentos, Efeitos Colaterais e Reações Adversas Relacionados a Medicamentos, Fármacos anti-VIH, Fármacos Anti-HIV, Assistência ao paciente, Síndrome de Imunodeficiência Adquirida, Síndrome de inmunodeficiencia adquirida, Infecções por HIV, HIV infections
Abstract: The aim of the study was to analyze the changes in antiretroviral therapy in patients treated at a medicine dispensing unit in Fortaleza, Ceará. An observational, documentary, descriptive, retrospective study with a quantitative approach was carried out from January to May 2020. All HIV/AIDS patients, registered in the drug dispensing unit, aged 18 or over were included. years that made changes in antiretroviral therapy from January to December 2019. Users who changed units during this period were excluded. There was a predominance of males, aged between 31 and 40 years, brown, single, with education corresponding to high school to higher education, natives and coming from Fortaleza, with diagnosis time between 1 and 5 years. Twenty-four different combinations of antiretroviral therapy were recorded, the most used association being tenofovir, lamivudine and efavirenz. Of the reasons that led to the change in antiretroviral therapy, the incidence or risk of adverse drug reaction, beginning or end of pregnancy and facilitating adherence to treatment prevailed. Efavirenz was the main drug involved in suspected adverse reactions, followed by tenofovir. It is concluded that the data obtained helped to understand the profile of people living with HIV in a Medicine Dispensing Unit in Fortaleza and the causes for changing antiretroviral therapy. El objetivo del estudio fue analizar los cambios en la terapia antirretroviral en pacientes atendidos en una unidad de dispensación de medicamentos en Fortaleza, Ceará. Se realizó un estudio observacional, documental, descriptivo, retrospectivo con abordaje cuantitativo de enero a mayo de 2020. Se incluyeron todos los pacientes con VIH/SIDA, registrados en la unidad de dispensación de medicamentos, de 18 años o más. Años que realizaron cambios en la terapia antirretroviral desde enero a diciembre de 2019. Se excluyeron los usuarios que cambiaron de unidad durante este período. Predominó el sexo masculino, con edades entre 31 y 40 años, morenos, solteros, con educación correspondiente a bachillerato a nivel superior, nativos y provenientes de Fortaleza, con tiempo de diagnóstico entre 1 y 5 años. Se registraron veinticuatro combinaciones diferentes de terapia antirretroviral, siendo la asociación más utilizada tenofovir, lamivudina y efavirenz. De las razones que motivaron el cambio en la terapia antirretroviral, prevalecieron la incidencia o riesgo de reacción adversa al medicamento, inicio o finalización del embarazo y facilitación de la adherencia al tratamiento. Efavirenz fue el principal fármaco implicado en las sospechas de reacciones adversas, seguido de tenofovir. Se concluye que los datos obtenidos ayudaron a comprender el perfil de las personas que viven con el VIH en una Unidad de Dispensación de Medicamentos en Fortaleza y las causas del cambio de terapia antirretroviral. O objetivo do estudo foi analisar as alterações da terapia antirretroviral de pacientes atendidos em uma unidade dispensadora de medicamentos de Fortaleza, Ceará. Realizou-se estudo observacional, documental, descritivo, retrospectivo, com abordagem quantitativa, no período de janeiro a maio de 2020. Incluíram-se todos os portadores de HIV/AIDS, cadastrados na unidade dispensadora de medicamentos, com idade igual ou superior a 18 anos que realizaram mudanças na terapia antirretroviral no período de janeiro a dezembro de 2019. Excluíram-se os usuários que mudaram de unidade durante esse período. Predominaram pessoas do sexo masculino, faixa etária entre 31 a 40 anos, pardos, solteiros, com escolaridade correspondente do ensino médio ao ensino superior, naturais e procedentes de Fortaleza, com tempo de diagnóstico entre 1 e 5 anos. Foram registradas 24 combinações diferentes de terapia antirretroviral, sendo a associação mais utilizada com tenofovir, lamivudina e efavirenz. Dos motivos que levaram à mudança de terapia antirretroviral, prevaleceram a incidência ou risco de reação adversa ao medicamento, início ou fim da gestação e facilitação da adesão ao tratamento. O efavirenz foi o principal medicamento envolvido nas suspeitas de reação adversa, seguido pelo tenofovir. Conclui-se que os dados obtidos auxiliaram na compreensão do perfil das pessoas vivendo com HIV de uma Unidade Dispensadora de Medicamentos de Fortaleza e as causas para mudança da terapia antirretroviral.
Published: 2022

36. Parallel MCMC Without Embarrassing Failures

Author: Augusto de Souza, Daniel, Parente Paiva Mesquita, Diego, Kaski, Samuel, Acerbi, Luigi, University College London, Department of Computer Science, Computer Science Professors, University of Helsinki, Aalto-yliopisto, and Aalto University
Abstract: Embarrassingly parallel Markov Chain Monte Carlo (MCMC) exploits parallel computing to scale Bayesian inference to large datasets by using a two-step approach. First, MCMC is run in parallel on (sub)posteriors defined on data partitions. Then, a server combines local results. While efficient, this framework is very sensitive to the quality of subposterior sampling. Common sampling problems such as missing modes or misrepresentation of low-density regions are amplified – instead of being corrected – in the combination phase, leading to catastrophic failures. In this work, we propose a novel combination strategy to mitigate this issue. Our strategy, Parallel Active Inference (PAI), leverages Gaussian Process (GP) surrogate modeling and active learning. After fitting GPs to subposteriors, PAI (i) shares information between GP surrogates to cover missing modes; and (ii) uses active sampling to individually refine subposterior approximations. We validate PAI in challenging benchmarks, including heavy-tailed and multi-modal posteriors and a real-world application to computational neuroscience. Empirical results show that PAI succeeds where previous methods catastrophically fail, with a small communication overhead.
Published: 2022

37. Federated Stochastic Gradient Langevin Dynamics

Author: El Mekkaoui, Khaoula, Parente Paiva Mesquita, Diego, Blomstedt, Paul, Kaski, Samuel, Department of Computer Science, F-Secure, Computer Science Professors, Aalto-yliopisto, and Aalto University
Abstract: Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), employ fast but noisy gradient estimates to enable large-scale posterior sampling. Although we can easily extend SGLD to distributed settings, it suffers from two issues when applied to federated non-IID data. First, the variance of these estimates increases significantly. Second, delaying communication causes the Markov chains to diverge from the true posterior even for very simple models. To alleviate both these problems, we propose conducive gradients, a simple mechanism that combines local likelihood approximations to correct gradient updates. Notably, conducive gradients are easy to compute, and since we only calculate the approximations once, they incur negligible overhead. We apply conducive gradients to distributed stochastic gradient Langevin dynamics (DSGLD) and call the resulting method federated stochastic gradient Langevin dynamics (FSGLD). We demonstrate that our approach can handle delayed communication rounds, converging to the target posterior in cases where DSGLD fails. We also show that FSGLD outperforms DSGLD for non-IID federated data with experiments on metric learning and neural networks.
Published: 2021

38. Alteração da terapia antirretroviral: inefetividade ou insegurança?

Author: Mesquita, Diego Pinheiro, primary, Araújo, Andressa Silva, additional, Mourão, Erika Maria Siqueira, additional, and Romeu, Geysa Aguiar, additional
Published: 2022
Full Text: View/download PDF

39. Bayesian Multilateration

Author: Sampaio de Carvalho Alencar, Alisson, primary, Mattos, Cesar, additional, Pordeus Gomes, Joao Paulo, additional, and Mesquita, Diego, additional
Published: 2022
Full Text: View/download PDF

40. Long-Term Results of the Modified Thal Procedure in Patients with Chagasic Megaesophagus

Author: Alves, Andréa Pedrosa Ribeiro, de Oliveira, Paulo Gonçalves, de Oliveira, Julia Martins, de Mesquita, Diego Martins, and dos Santos, João Henrique Zanotelli
Published: 2014
Full Text: View/download PDF

41. A Minimal Learning Machine for Datasets with Missing Values

Author: Mesquita, Diego P. Paiva, primary, Gomes, João Paulo P., additional, and Souza, Amauri H., additional
Published: 2015
Full Text: View/download PDF

42. Ensemble of Minimal Learning Machines for Pattern Classification

Author: Mesquita, Diego Parente Paiva, primary, Gomes, João Paulo Pordeus, additional, and Junior, Amauri Holanda Souza, additional
Published: 2015
Full Text: View/download PDF

43. Bayesian Multilateration

Author: Alencar, Alisson, primary, Mattos, César, primary, Gomes, João, primary, and Mesquita, Diego, primary
Published: 2021
Full Text: View/download PDF

44. Improving Graph Variational Autoencoders with Multi-Hop Simple Convolutions

Author: Freitas do Nascimento, Erik Jhones, primary, Souza, Amauri, additional, and Mesquita, Diego, additional
Published: 2021
Full Text: View/download PDF

45. How do loss functions impact the performance of graph neural networks?

Author: Duarte, Gabriel Jonas, primary, Pereira, Tamara Arruda, additional, Nascimento, Erik Jhones, additional, Mesquita, Diego, additional, and Souza Junior, Amauri Holanda, additional
Published: 2021
Full Text: View/download PDF

46. LS-SVR as a Bayesian RBF Network

Author: Mesquita, Diego P. P., primary, Freitas, Luis A., additional, Gomes, Joao P. P., additional, and Mattos, Cesar L. C., additional
Published: 2020
Full Text: View/download PDF

47. A sparse linear regression model for incomplete datasets

Author: Veras, Marcelo B. A., primary, Mesquita, Diego P. P., additional, Mattos, Cesar L. C., additional, and Gomes, João P. P., additional
Published: 2019
Full Text: View/download PDF

48. An investigation together with a formation political of a mathematics teacher

Author: Mesquita, Diego Marques [UNESP], Universidade Estadual Paulista (Unesp), and Miarka, Roger [UNESP]
Subjects: Mathematics and society, Teacher training, Narrative, Policy, Teacher education, Matemática e sociedade, Escola sem partido, Strike, Política, Greve, Romance, Narrativa, Formação de professores
Abstract: Submitted by Diego Marques Mesquita (diego.mmesquita@hotmail.com) on 2018-06-13T15:16:08Z No. of bitstreams: 1 Dissertacao-Diego-Marques-Mesquita-Educacao-Matematica-2018.pdf: 3252860 bytes, checksum: 4ddac71688f55afa6c6ee75a1c0222ba (MD5) Approved for entry into archive by Adriana Aparecida Puerta null (dripuerta@rc.unesp.br) on 2018-06-13T17:51:27Z (GMT) No. of bitstreams: 1 mesquita_dm_me_rcla.pdf: 3252860 bytes, checksum: 4ddac71688f55afa6c6ee75a1c0222ba (MD5) Made available in DSpace on 2018-06-13T17:51:27Z (GMT). No. of bitstreams: 1 mesquita_dm_me_rcla.pdf: 3252860 bytes, checksum: 4ddac71688f55afa6c6ee75a1c0222ba (MD5) Previous issue date: 2018-05-02 A formação de docente de matemática é o foco de diversas pesquisas na Educação Matemática. Há mapeamentos de possibilidades de formações nestas pesquisas, cada qual com uma proposta de investigação sobre o movimento que desencadeia o ato de ser docente de matemática. Esta investigação também aborda o tema formação docente, contudo, com um enfoque na formação política do(a) docente de matemática e em alguns momentos no questionamento das políticas de formação. Ao longo da formação, escolhas políticas são feitas a todo momento, seja, por exemplo, ao escolher as disciplinas e grade curricular na universidade; na prática docente, ao escolher os materiais didáticos; enquanto trabalhador, ao ser confrontado com medidas macropolíticas e, para não restringir, formar é um movimento constante de leituras, vivências, iniciativas de mudanças, militância.... Portanto, dissociar as ações políticas da formação docente aparenta ser perigoso e um tanto quanto nocivo para esta área, ainda mais em tempos de Escola sem partido. Sendo assim, torna-se interessante haver uma investigação da potência que a formação política traz a docentes de matemática e, consequentemente, à Educação Matemática. Nesse rastro, esta pesquisa se debruça em torno da pergunta: “Que pode a formação política para/(d) o professor de matemática?” Para abrir a visão de um horizonte de possibilidades dessa pergunta, buscamos trabalhar com algumas marcas presentes na formação do autor deste trabalho em sua formação como professor de matemática, em especial, aquelas produzidas pelo movimento estudantil, pelo movimento grevista e pela prática docente. A opção de mobilizar esses rastros de experiências se deu por meio de narrativas, que constituirão um bildungsroman, de modo a produzir um corpo e delinear discussões, via embaralhamento de códigos que possam investigar a potência da formação política do professor de matemática. The education of mathematics teacher is the focus of several researches in Mathematics Education. There are mappings of training possibilities in these res earches, each one of them with a research proposal on the movement that triggers the act of being a mathematics teacher. This research also addresses the subject teacher education , however, with a slight syntactic difference - political formation of the ma th teacher. Throughout the formation political choices are made at all times, for example, in the university when choosing the disci plines ; in teaching practice when choosing courseware ; as a worker when confronted by macropolitical measures and, in order not to restrict, to form is a constant movement of readings, experiences, initiatives of changes, militancy, etc. Therefore, dissociating political actions from teacher education seems to be dangerous and somewhat harmful to this area, especially in “Escol a Sem Partido” times . Thus, it becomes interesting to have an investigation of the power that the political formation allows to the teacher of mathematics and, consequently, to Mathematical Education. In this trail , this research looks at the question “Wha t can the political formation for/(of) the mathematics teacher?” In order to open the vision of a horizon of possibilities of this question, we seek to work with some marks present in the formation of t he author of this research in his education as a teach er of mathematics, especially those produced by the student movement, by the striking movement and by the teaching practice. The option of mobilizing these traces of experiences was through narratives, which will constitute a bildungsroman, in order to pro duce a bod y and delineate discussions, by means of code shuffling, which can investigate the power of the political formation of the mathematics teacher.
Published: 2018

49. A Robust Minimal Learning Machine based on the M-Estimator

Author: Kärkkäinen, Tommi, Gomes, Joao, Mesquita, Diego, Freire, Ananda, and Junior, Amauri Souza
Subjects: ComputingMethodologies_PATTERNRECOGNITION, koneoppiminen, learning methods
Abstract: In this paper we propose a robust Minimal Learning Machine (R-RLM) for regression problems. The proposed method uses a robust M-estimator to generate a linear mapping between input and output distances matrices of MLM. The R-MLM was tested on one synthetic and three real world datasets that were contaminated with an increasing number of outliers. The method achieved a performance comparable to the robust Extreme Learning Machine (R-RLM) and thus can be seen as a valid alternative for regression tasks on datasets with outliers. peerReviewed
Published: 2017

50. Machine Learning for incomplete data

Author: Mesquita, Diego Parente Paiva and Gomes, João Paulo Pordeus
Subjects: Missing data, Machine learning, Epanechnikov kernel, Gaussian kernel, Euclidean distance, Basis functions
Abstract: Methods based on basis functions (such as the sigmoid and q-Gaussian functions) and similarity measures (such as distances or kernel functions) are widely used in machine learning and related fields. These methods often take for granted that data is fully observed and are not equipped to handle incomplete data in an organic manner. This assumption is often flawed, as incomplete data is a fact in various domains such as medical diagnosis and sensor analytics. Therefore, one might find it useful to be able to estimate the value of these functions in the presence of partially observed data. We propose methodologies to estimate the Gaussian Kernel, the Euclidean Distance, the Epanechnikov kernel and arbitrary basis functions in the presence of possibly incomplete feature vectors. To obtain such estimates, the incomplete feature vectors are treated as continuous random variables and, based on that, we take the expected value of the transforms of interest. Métodos baseados em funções de base (como as funções sigmoid e a q-Gaussian) e medidas de similaridade (como distâncias ou funções de kernel) são comuns em Aprendizado de Máquina e áreas correlatas. Comumente, no entanto, esses métodos não são equipados para utilizar dados incompletos de maneira orgânica. Isso pode ser visto como um impedimento, uma vez que dados parcialmente observados são comuns em vários domínios, como aplicações médicas e dados provenientes de sensores. Nesta dissertação, propomos metodologias para estimar o valor do kernel Gaussiano, da distância Euclidiana, do kernel Epanechnikov e de funções de base arbitrárias na presença de vetores possivelmente parcialmente observados. Para obter tais estimativas, os vetores incompletos são tratados como variáveis aleatórias contínuas e, baseado nisso, tomamos o valor esperado da transformada de interesse.
Published: 2017

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

119 results on '"Mesquita, Diego"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources