Descriptor: "Hinge loss" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hinge loss"' showing total 593 results

1. Improving top-N recommendations using batch approximation for weighted pair-wise loss

Author: Sofia Aftab and Heri Ramampiaro
Subjects: Recommender system, Bayesian personalized ranking, Triplet network, Hinge loss, Impostors, Cybernetics, Q300-390, Electronic computers. Computer science, QA75.5-76.95
Abstract: In collaborative filtering, matrix factorization and collaborative metric learning are challenged by situations where non-preferred items may appear so close to a user in the feature embedding space that they lead to degrading the recommendation performance. We call such items ‘potential impostor’ risks. Addressing the issues with ‘potential impostor’ is important because it can result in inefficient learning and poor feature extraction. To achieve this, we propose a novel loss function formulation designed to enhance learning efficiency by actively identifying and addressing impostors, leveraging item associations and learning the distribution of negative items. This approach is crucial for models to differentiate between positive and negative items effectively, even when they are closely aligned in the feature space. Here, a loss function is generally an objective optimization function that is defined based on user–item interaction data, through either implicit or explicit feedback. The loss function essentially decides how well a recommendation algorithm performs. In this paper, we introduce and define the concept of ‘potential impostor’, highlighting its impact on learned representation quality and algorithmic efficiency. We tackle the limitations of non-metric methods, like the Weighted Approximate Rank Pairwise Loss (WARP) method, which struggles to capture item–item similarities, by using a ‘similarity propagation’ strategy with a new loss term. Similarly, we address fixed margin inefficiencies in Weighted Collaborative Metric Learning (WCML), through density distribution approximation. This moves potential impostors away from the margin for more robust learning. Additionally, we propose a large-scale batch approximation algorithm for increased detection of impostors, coupled with an active learning strategy for improved top-N recommendation performance. 
Our extensive empirical analysis across five major and diverse datasets demonstrates the effectiveness and feasibility of our methods, compared to existing techniques with respect to improving AUC, reducing impostor rate, and increasing the average distance metrics. More specifically, our evaluation shows that our two proposed methods outperform the existing state-of-the-art techniques, with an improvement of AUC by 3.5% and 3.7%, NDCG by 1.0% and 9.1% and HR by 1.3% and 3.6%, respectively. Similarly, the impostor rate is decreased by 35% and 18%, and their average distance is increased by 33% and 37%, respectively.
Published: 2024
Full Text: View/download PDF

2. Levenberg–Marquardt multi-classification using hinge loss function

Author: Ozyildirim, Buse Melis and Kiran, Mariam
Subjects: Information and Computing Sciences, Machine Learning, Algorithms, Neural Networks, Computer, Neural networks, Levenberg-Marquardt, Hinge loss, Loss functions, Classification, Levenberg–Marquardt, Artificial Intelligence & Image Processing, Artificial intelligence, Machine learning, Statistics
Abstract: Incorporating higher-order optimization functions, such as Levenberg-Marquardt (LM), has revealed better generalizable solutions for deep learning problems. However, these higher-order optimization functions suffer from very large processing time and training complexity, especially as training datasets become large, such as in multi-view classification problems, where finding global optima is very costly. To solve this issue, we develop a solution for LM-enabled classification with, to the best of our knowledge, the first implementation of hinge loss for multiview classification. Hinge loss allows the neural network to converge faster and perform better than other loss functions such as logistic or square loss. We validate our method by experimenting with various multiclass classification challenges of varying complexity and training data size. The empirical results show the training time and accuracy rates achieved, highlighting how our method outperforms in all cases, especially when training time is limited. Our paper presents important results in the relationship between optimization and loss functions and how these can impact deep learning problems.
Published: 2021

3. A new fuzzy support vector machine with pinball loss

Author: Ram Nayan Verma, Rahul Deo, Rakesh Srivastava, Naidu Subbarao, and Gajendra Pratap Singh
Subjects: Hinge loss, Pinball loss, Support vector machine, Fuzzy support vector machine, Computational linguistics. Natural language processing, P98-98.5, Electronic computers. Computer science, QA75.5-76.95
Abstract: The fuzzy support vector machine (FSVM) assigns each sample a fuzzy membership value based on its relevance, making it less sensitive to noise or outliers in the data. Although FSVM has had some success in avoiding the negative effects of noise, it uses hinge loss, which maximizes the shortest distance between two classes and is ineffective in dealing with feature noise near the decision boundary. Furthermore, whereas FSVM concentrates on misclassification errors, it neglects to consider the critical within-class scatter minimization. We present a Fuzzy support vector machine with pinball loss (FPin-SVM), which is a fuzzy extension of a reformulation of a recently proposed support vector machine with pinball loss (Pin-SVM) with several significant improvements, to improve the performance of FSVM. First, because we used the squared L2-norm of error variables instead of the L1 norm, our FPin-SVM is a strongly convex minimization problem; second, to speed up the training procedure, solutions of the proposed FPin-SVM, as an unconstrained minimization problem, are obtained using the functional iterative and Newton methods. Third, it is proposed to solve the minimization problem directly in primal. Unlike FSVM and Pin-SVM, our FPin-SVM does not require a toolbox for optimization. We dig deeper into the features of FPin-SVM, such as noise insensitivity and within-class scatter minimization. We conducted experiments on synthetic and real-world datasets with various sounds to validate the usefulness of the suggested approach. Compared to the SVM, FSVM, and Pin-SVM, the presented approaches demonstrate equivalent or superior generalization performance in less training time.
Published: 2023
Full Text: View/download PDF
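
Note on the two losses named in this record: the sketch below is a minimal illustration, not the authors' FPin-SVM formulation. It assumes the commonly used definitions in terms of the margin m = y * f(x): hinge loss max(0, 1 - m), and pinball loss equal to (1 - m) when 1 - m >= 0 and -tau * (1 - m) otherwise. The function names and the default tau = 0.5 are hypothetical choices made for illustration.

```python
def hinge_loss(margin: float) -> float:
    """Hinge loss on the margin m = y * f(x): max(0, 1 - m)."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin: float, tau: float = 0.5) -> float:
    """Pinball loss on the margin (assumed standard form).

    With u = 1 - m, the loss is u when u >= 0 and -tau * u when u < 0,
    so points classified correctly with a large margin are still penalized.
    tau is an illustrative parameter; tau = 0 recovers the hinge loss.
    """
    u = 1.0 - margin
    return u if u >= 0.0 else -tau * u


# Compare both losses on a few margins (hypothetical values).
for m in (-1.0, 0.0, 1.0, 2.0):
    print(m, hinge_loss(m), pinball_loss(m))
```

Under these assumptions the two losses agree whenever the margin is at most 1 and differ only for points beyond the margin, where the hinge loss is flat at zero and the pinball loss grows with slope tau.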

4. Deep Contextual Grid Triplet Network for Context-Aware Recommendation

Author: Sofia Aftab, Heri Ramampiaro, Helge Langseth, and Massimiliano Ruocco
Subjects: Recommender systems, context-awareness, deep learning, triplet network, hinge loss, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Modeling contextual information is a vital part of developing effective recommender systems. Still, existing work on recommendation algorithms has generally put limited focus on the effective treatment of contextual information. Moreover, adding context to recommendation models is challenging since it increases the dimensionality and complexity of the model. Therefore, an efficient learning method is required to extract an association and inter-relationship between user/item features and contextual features for preference-driven modeling. The engineering of features through the exploration of adjacent correlations between the user/item and their context, and their further learning through a distance-based metric, is critical for effective personalization. Motivated by this, we introduce a context-aware recommendation strategy using a ‘contextual grid triplet network.’ This strategy uses a contextual grid topology to capture robust semantic representations of users, items, and contextual data. We present a learning methodology that merges a triplet network with a convolutional neural network. This fusion enables the exploration of associations both ‘within’ the contextual grid, such as between users or items, and ‘between’ different contextual grids, like between a user and items of input. Moreover, we present a variant of a hinge loss function using a triplet network for improved performance and fast convergence. In this work, we study how these aspects boost the quality of top-N recommendations. Furthermore, we show through extensive ablation-based experiments that the proposed method outperforms existing state-of-the-art techniques, demonstrating its robustness and feasibility.
Published: 2023
Full Text: View/download PDF

5. Exploring SVM for Federated Machine Learning Applications

Author: Nair, Divya G., Aswartha Narayana, C. V., Jaideep Reddy, K., Nair, Jyothisha J., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Rout, Rashmi Ranjan, editor, Ghosh, Soumya Kanti, editor, Jana, Prasanta K., editor, Tripathy, Asis Kumar, editor, Sahoo, Jyoti Prakash, editor, and Li, Kuan-Ching, editor
Published: 2022
Full Text: View/download PDF

6. FMD-cGAN: Fast Motion Deblurring Using Conditional Generative Adversarial Networks

Author: Kumar, Jatin, Mastan, Indra Deep, Raman, Shanmuganathan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Raman, Balasubramanian, editor, Murala, Subrahmanyam, editor, Chowdhury, Ananda, editor, Dhall, Abhinav, editor, and Goyal, Puneet, editor
Published: 2022
Full Text: View/download PDF

7. A new fuzzy support vector machine with pinball loss.

Author: Verma, Ram Nayan, Deo, Rahul, Srivastava, Rakesh, Subbarao, Naidu, and Singh, Gajendra Pratap
Subjects: SUPPORT vector machines, PINBALL machines, GENERALIZATION, PROBLEM solving, ITERATIVE methods (Mathematics)
Abstract: The fuzzy support vector machine (FSVM) assigns each sample a fuzzy membership value based on its relevance, making it less sensitive to noise or outliers in the data. Although FSVM has had some success in avoiding the negative effects of noise, it uses hinge loss, which maximizes the shortest distance between two classes and is ineffective in dealing with feature noise near the decision boundary. Furthermore, whereas FSVM concentrates on misclassification errors, it neglects to consider the critical within-class scatter minimization. We present a Fuzzy support vector machine with pinball loss (FPin-SVM), which is a fuzzy extension of a reformulation of a recently proposed support vector machine with pinball loss (Pin-SVM) with several significant improvements, to improve the performance of FSVM. First, because we used the squared L2-norm of error variables instead of the L1 norm, our FPin-SVM is a strongly convex minimization problem; second, to speed up the training procedure, solutions of the proposed FPin-SVM, as an unconstrained minimization problem, are obtained using the functional iterative and Newton methods. Third, it is proposed to solve the minimization problem directly in primal. Unlike FSVM and Pin-SVM, our FPin-SVM does not require a toolbox for optimization. We dig deeper into the features of FPin-SVM, such as noise insensitivity and within-class scatter minimization. We conducted experiments on synthetic and real-world datasets with various sounds to validate the usefulness of the suggested approach. Compared to the SVM, FSVM, and Pin-SVM, the presented approaches demonstrate equivalent or superior generalization performance in less training time. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

8. The Facial Expression Data Enhancement Method Induced by Improved StarGAN V2.

Author: Han, Baojin and Hu, Min
Subjects: *FACIAL expression, *DATA augmentation, *PROBLEM solving, *BIOMETRY
Abstract: Due to the small data and unbalanced sample distribution in the existing facial emotion datasets, the effect of facial expression recognition is not ideal. Traditional data augmentation methods include image angle modification, image shearing, and image scrambling. The above approaches cannot solve the problem that is the high similarity of the generated images. StarGAN V2 can generate different styles of images across multiple domains. Nevertheless, there are some defects in generating these facial expression images, such as crooked mouths and fuzzy facial expression images. To service such problems, we improved StarGAN V2 by solving the drawbacks of creating pictures that apply an SENet to the generator of StarGAN V2. The generator's SENet can concentrate attention on the important regions of the facial expression images. Thus, this makes the generated symmetrical expression image more obvious and easier to distinguish. Meanwhile, to further improve the quality of the generated pictures, we customized the hinge loss function to reconstruct the loss functions that increase the boundary of real and fake images. The created facial expression pictures testified that our improved model could solve the defects in the images created by the original StarGAN V2. The experiments were conducted on the CK+ and MMI datasets. The correct recognition rate of the facial expressions on the CK+ was 99.2031%, which is a 1.4186% higher accuracy than that of StarGAN V2. The correct recognition rate of the facial expressions on the MMI displays was 98.1378%, which is 5.059% higher than that of the StarGAN V2 method. Furthermore, contrast test outcomes proved that the improved StarGAN V2 performed better than most state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

9. A Granular Parakeratosis Classification using SVM Hinge and Cross Validation

Author: Sheetal Janthakal and Girisha Hosalli
Subjects: 10-fold cross validation, convolutional neural networks, hinge loss, linear activation function, support vector machine, Engineering (General). Civil engineering (General), TA1-2040, Chemical engineering, TP155-156, Physics, QC1-999
Abstract: Nowadays, a challenging task in the medical field is the diagnosis of skin illness considering numerous characteristics such as color, size, and the lesion region. Dermoscopy is a technique that has been frequently used to diagnose skin lesions. Researchers have recently demonstrated a keen interest in building an automated diagnosis system, and a satisfying result can be achieved with a high degree of skill, as skin lesion classification necessitates a great deal of knowledge and expertise. Automated skin lesion classification in dermoscopy images is an essential way to improve diagnostic performance. This paper presents the power of convolutional neural networks in classifying the skin lesions into two different categories, namely Granular Parakeratosis and Paraneoplastic Pemphigus. The proposed method includes implementation of Support Vector Machine with hinge loss and linear activation function for classification of lesions and this output is fed to the 10-fold cross validation model, yielding an accuracy of 94%, sensitivity of 93%, and specificity of 91%. The proposed strategy outperforms the SVM kernel Radial basis function (RBF), which was created specifically for binary classification problems.
Published: 2022
Full Text: View/download PDF

10. Robust General Twin Support Vector Machine with Pinball Loss Function

Author: Ganaie, M. A., Tanveer, M., Kacprzyk, Janusz, Series Editor, Kumar, Pardeep, editor, and Singh, Amit Kumar, editor
Published: 2021
Full Text: View/download PDF

11. Damping Percentage Detection Using Unconventional Methods

Author: Bansal, Tushar, Singhal, Sanjhi, Deswal, Suprita, Nagarajan, S. T., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Chen, Joy Iong-Zong, editor, Tavares, João Manuel R. S., editor, Shakya, Subarna, editor, and Iliyasu, Abdullah M., editor
Published: 2021
Full Text: View/download PDF

12. On Loss Functions and Regret Bounds for Multi-Category Classification.

Author: Tan, Zhiqiang and Zhang, Xinwei
Subjects: *CONVEX functions, *CLASSIFICATION, *ENTROPY
Abstract: We develop new approaches in multi-class settings for constructing loss functions and establishing corresponding regret bounds with respect to the zero-one or cost-weighted classification loss. We provide new general representations of losses by deriving inverse mappings from a concave generalized entropy to a loss through a convex dissimilarity function related to the multi-distribution $f$-divergence. This approach is then applied to study both hinge-like losses and proper scoring rules. In the first case, we derive new hinge-like convex losses, which are tighter extensions outside the probability simplex than related hinge-like losses and geometrically simpler with fewer non-differentiable edges. We also establish a classification regret bound in general for all losses with the same generalized entropy as the zero-one loss, thereby substantially extending and improving existing results. In the second case, we identify new sets of multi-class proper scoring rules through different types of dissimilarity functions and reveal interesting relationships between various composite losses currently in use. We also establish new classification regret bounds in general for multi-class proper scoring rules and, as applications, provide simple meaningful regret bounds for two specific sets of proper scoring rules. These results generalize, for the first time, previous two-class regret bounds to multi-class settings. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

13. Ramp loss KNN-weighted multi-class twin support vector machine.

Author: Wang, Huiru, Xu, Yitian, and Zhou, Zhijian
Subjects: *SUPPORT vector machines, *NAIVE Bayes classification, *HINGES, *ALGORITHMS
Abstract: The K-nearest neighbor-weighted multi-class twin support vector machine (KWMTSVM) is an effective multi-classification algorithm which utilizes the local information of all training samples. However, it is easily affected by the noises and outliers owing to the use of the hinge loss function. That is because the outlier will obtain a huge loss and become the support vector, which will shift the separating hyperplane inappropriately. To reduce the negative influence of outliers, we use the ramp loss function to replace the hinge loss function in KWMTSVM and propose a novel sparse and robust multi-classification algorithm named ramp loss K-nearest neighbor-weighted multi-class twin support vector machine (RKWMTSVM) in this paper. Firstly, the proposed RKWMTSVM restricts the loss of outlier to a fixed value, thus the negative influence on the construction of hyperplane is suppressed and the classification performance is further improved. Secondly, since outliers will not become support vectors, the RKWMTSVM is a sparser algorithm, especially compared with KWMTSVM. Thirdly, because RKWMTSVM is a non-differentiable non-convex optimization problem, we use the concave–convex procedure (CCCP) to solve it. In each iteration of CCCP, the proposed RKWMTSVM solves a series of KWMTSVM-like problems. That also means RKWMTSVM inherits the merits of KWMTSVM, namely, it can exploit the local information of intra-class to improve the generalization ability and use inter-class information to remove the redundant constraints and accelerate the solution process. In the end, the clipping dual coordinate descent (clipDCD) algorithm is employed into our RKWMTSVM to further speed up the computational speed. We do numerical experiments on twenty-four benchmark datasets. The experimental results verify the validity and effectiveness of our algorithm. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
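
Note on the hinge and ramp losses contrasted in this record: the sketch below is a minimal illustration under assumed definitions, not the authors' RKWMTSVM. It uses the common form of the ramp loss as a difference of two hinge functions, R_s(m) = H_1(m) - H_s(m) with H_a(m) = max(0, a - m) and s < 1, which caps the loss at 1 - s. That difference-of-convex form is also what makes a concave-convex procedure applicable. The function names and the default s = -1 are hypothetical.

```python
def hinge_loss(margin: float, offset: float = 1.0) -> float:
    """Shifted hinge H_a(m) = max(0, a - m); a = 1 gives the usual hinge loss."""
    return max(0.0, offset - margin)


def ramp_loss(margin: float, s: float = -1.0) -> float:
    """Ramp loss as a difference of two convex hinges (assumed form).

    R_s(m) = H_1(m) - H_s(m) with s < 1. For m < s the two hinges grow at
    the same rate, so the loss stays fixed at 1 - s instead of growing.
    """
    return hinge_loss(margin, 1.0) - hinge_loss(margin, s)


# An outlier far on the wrong side of the boundary (hypothetical margin).
outlier_margin = -10.0
print(hinge_loss(outlier_margin), ramp_loss(outlier_margin))
```

Under these assumptions the hinge loss of a badly misclassified point grows linearly with its distance from the margin, while the ramp loss of the same point is bounded, which is the mechanism the abstract describes for limiting the influence of outliers.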

14. The Facial Expression Data Enhancement Method Induced by Improved StarGAN V2

Author: Baojin Han and Min Hu
Subjects: face expression recognition, data enhancement, StarGAN V2, hinge loss, SENet, symmetry and asymmetry, Mathematics, QA1-939
Abstract: Due to the small data and unbalanced sample distribution in the existing facial emotion datasets, the effect of facial expression recognition is not ideal. Traditional data augmentation methods include image angle modification, image shearing, and image scrambling. The above approaches cannot solve the problem that is the high similarity of the generated images. StarGAN V2 can generate different styles of images across multiple domains. Nevertheless, there are some defects in generating these facial expression images, such as crooked mouths and fuzzy facial expression images. To service such problems, we improved StarGAN V2 by solving the drawbacks of creating pictures that apply an SENet to the generator of StarGAN V2. The generator’s SENet can concentrate attention on the important regions of the facial expression images. Thus, this makes the generated symmetrical expression image more obvious and easier to distinguish. Meanwhile, to further improve the quality of the generated pictures, we customized the hinge loss function to reconstruct the loss functions that increase the boundary of real and fake images. The created facial expression pictures testified that our improved model could solve the defects in the images created by the original StarGAN V2. The experiments were conducted on the CK+ and MMI datasets. The correct recognition rate of the facial expressions on the CK+ was 99.2031%, which is a 1.4186% higher accuracy than that of StarGAN V2. The correct recognition rate of the facial expressions on the MMI displays was 98.1378%, which is 5.059% higher than that of the StarGAN V2 method. Furthermore, contrast test outcomes proved that the improved StarGAN V2 performed better than most state-of-the-art methods.
Published: 2023
Full Text: View/download PDF

15. Biologically Plausible Training Mechanisms for Self-Supervised Learning in Deep Networks.

Author: Tang, Mufeng, Yang, Yibo, and Amit, Yali
Subjects: SHORT-term memory, CONVOLUTIONAL neural networks, PERFORMANCE standards, DEEP learning
Abstract: We develop biologically plausible training mechanisms for self-supervised learning (SSL) in deep networks. Specifically, by biologically plausible training we mean (i) all updates of weights are based on current activities of pre-synaptic units and current, or activity retrieved from short term memory of post synaptic units, including at the top-most error computing layer, (ii) complex computations such as normalization, inner products and division are avoided, (iii) asymmetric connections between units, and (iv) most learning is carried out in an unsupervised manner. SSL with a contrastive loss satisfies the third condition as it does not require labeled data and it introduces robustness to observed perturbations of objects, which occur naturally as objects or observers move in 3D and with variable lighting over time. We propose a contrastive hinge based loss whose error involves simple local computations satisfying (ii), as opposed to the standard contrastive losses employed in the literature, which do not lend themselves easily to implementation in a network architecture due to complex computations involving ratios and inner products. Furthermore, we show that learning can be performed with one of two more plausible alternatives to backpropagation that satisfy conditions (i) and (ii). The first is difference target propagation (DTP), which trains network parameters using target-based local losses and employs a Hebbian learning rule, thus overcoming the biologically implausible symmetric weight problem in backpropagation. The second is layer-wise learning, where each layer is directly connected to a layer computing the loss error. The layers are either updated sequentially in a greedy fashion (GLL) or in random order (RLL), and each training stage involves a single hidden layer network. 
Backpropagation through one layer needed for each such network can either be altered with fixed random feedback weights (RF) or using updated random feedback weights (URF) as in Amit (2019). Both methods represent alternatives to the symmetric weight issue of backpropagation. By training convolutional neural networks (CNNs) with SSL and DTP, GLL or RLL, we find that our proposed framework achieves comparable performance to standard BP learning in downstream linear classifier evaluation of the learned embeddings. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

16. Generalization Bounds and Algorithms for Learning to Communicate Over Additive Noise Channels.

Subjects: *MACHINE learning, *GENERALIZATION, *ERROR probability, *NOISE, *GIBBS sampling, *TELECOMMUNICATION systems
Abstract: An additive noise channel is considered, in which the distribution of the noise is nonparametric and unknown. The problem of learning encoders and decoders based on noise samples is considered. For uncoded communication systems, the problem of choosing a codebook and possibly also a generalized minimal distance decoder (which is parameterized by a covariance matrix) is addressed. High probability generalization bounds for the error probability loss function, as well as for a hinge-type surrogate loss function are provided. A stochastic-gradient based alternating-minimization algorithm for the latter loss function is proposed. In addition, a Gibbs-based algorithm that gradually expurgates an initial codebook from codewords in order to obtain a smaller codebook with improved error probability is proposed, and bounds on its average empirical error and generalization error, as well as a high probability generalization bound, are stated. Various experiments demonstrate the performance of the proposed algorithms. For coded systems, the problem of maximizing the mutual information between the input and the output with respect to the input distribution is addressed, and uniform convergence bounds for two different classes of input distributions are obtained. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

17. Biologically Plausible Training Mechanisms for Self-Supervised Learning in Deep Networks

Author: Mufeng Tang, Yibo Yang, and Yali Amit
Subjects: difference target propagation, layerwise learning, hinge loss, back-propagation (BP), self-supervised learning, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
Abstract: We develop biologically plausible training mechanisms for self-supervised learning (SSL) in deep networks. Specifically, by biologically plausible training we mean (i) all updates of weights are based on current activities of pre-synaptic units and current, or activity retrieved from short term memory of post synaptic units, including at the top-most error computing layer, (ii) complex computations such as normalization, inner products and division are avoided, (iii) asymmetric connections between units, and (iv) most learning is carried out in an unsupervised manner. SSL with a contrastive loss satisfies the third condition as it does not require labeled data and it introduces robustness to observed perturbations of objects, which occur naturally as objects or observers move in 3D and with variable lighting over time. We propose a contrastive hinge based loss whose error involves simple local computations satisfying (ii), as opposed to the standard contrastive losses employed in the literature, which do not lend themselves easily to implementation in a network architecture due to complex computations involving ratios and inner products. Furthermore, we show that learning can be performed with one of two more plausible alternatives to backpropagation that satisfy conditions (i) and (ii). The first is difference target propagation (DTP), which trains network parameters using target-based local losses and employs a Hebbian learning rule, thus overcoming the biologically implausible symmetric weight problem in backpropagation. The second is layer-wise learning, where each layer is directly connected to a layer computing the loss error. The layers are either updated sequentially in a greedy fashion (GLL) or in random order (RLL), and each training stage involves a single hidden layer network. 
Backpropagation through one layer needed for each such network can either be altered with fixed random feedback weights (RF) or using updated random feedback weights (URF) as in Amit (2019). Both methods represent alternatives to the symmetric weight issue of backpropagation. By training convolutional neural networks (CNNs) with SSL and DTP, GLL or RLL, we find that our proposed framework achieves comparable performance to standard BP learning in downstream linear classifier evaluation of the learned embeddings.
Published: 2022
Full Text: View/download PDF

18. Solving L1-regularized SVMs and Related Linear Programs: Revisiting the Effectiveness of Column and Constraint Generation.

Author: Dedieu, Antoine, Mazumder, Rahul, and Haoyue Wang
Subjects: *SUPPORT vector machines, *CONVEX functions, *MACHINE learning
Abstract: The linear Support Vector Machine (SVM) is a classic classification technique in machine learning. Motivated by applications in high dimensional statistics, we consider penalized SVM problems involving the minimization of a hinge-loss function with a convex sparsity-inducing regularizer such as: the L1-norm on the coefficients, its grouped generalization and the sorted L1-penalty (aka Slope). Each problem can be expressed as a Linear Program (LP) and is computationally challenging when the number of features and/or samples is large; the current state of algorithms for these problems is rather nascent when compared to the usual L2-regularized linear SVM. To this end, we propose new computational algorithms for these LPs by bringing together techniques from (a) classical column (and constraint) generation methods and (b) first order methods for non-smooth convex optimization, techniques that appear to be rarely used together for solving large scale LPs. These components have their respective strengths; and while they are found to be useful as separate entities, they appear to be more powerful in practice when used together in the context of solving large-scale LPs such as the ones studied herein. Our approach complements the strengths of (a) and (b), leading to a scheme that seems to significantly outperform commercial solvers as well as specialized implementations for these problems. We present numerical results on a series of real and synthetic data sets demonstrating the surprising effectiveness of classic column/constraint generation methods in the context of challenging LP-based machine learning tasks. [ABSTRACT FROM AUTHOR]
Published: 2022

19. Improved Classification Rates for Localized SVMs.

Author: Blaschzyk, Ingrid and Steinwart, Ingo
Subjects: *MACHINE learning, *REGULARIZATION parameter, *GLOBAL method of teaching, *CLASSIFICATION, *SUPPORT vector machines, *STATISTICS
Abstract: Localized support vector machines solve SVMs on many spatially defined small chunks, and besides their computational benefit compared to global SVMs, one of their main characteristics is the freedom of choosing an arbitrary kernel and regularization parameter on each cell. We take advantage of this observation to derive global learning rates for localized SVMs with Gaussian kernels and hinge loss. It turns out that our rates outperform, under suitable sets of assumptions, known classification rates for localized SVMs, for global SVMs, and other learning algorithms based on e.g., plug-in rules or trees. The localized SVM rates are achieved under a set of margin conditions, which describe the behavior of the data-generating distribution, and no assumption on the existence of a density is made. Moreover, we show that our rates are obtained adaptively, that is without knowing the margin parameters in advance. The statistical analysis of the excess risk relies on a simple partitioning based technique, which splits the input space into a subset that is close to the decision boundary and into a subset that is sufficiently far away. A crucial condition to derive then improved global rates is a margin condition that relates the distance to the decision boundary to the amount of noise. [ABSTRACT FROM AUTHOR]
Published: 2022

20. Benefits of Using Symmetric Loss in Recommender Systems

Author: Singh, Gaurav, Mitrović, Sandra, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Pasi, Gabriella, editor, Piwowarski, Benjamin, editor, Azzopardi, Leif, editor, and Hanbury, Allan, editor
Published: 2018
Full Text: View/download PDF

21. Bias of Homotopic Gradient Descent for the Hinge Loss.

Author: Molitor, Denali, Needell, Deanna, and Ward, Rachel
Subjects: *HINGES, *SMOOTHNESS of functions, *MACHINE learning, *NONSMOOTH optimization
Abstract: Gradient descent is a simple and widely used optimization method for machine learning. For homogeneous linear classifiers applied to separable data, gradient descent has been shown to converge to the maximal-margin (or equivalently, the minimal-norm) solution for various smooth loss functions. The previous theory does not, however, apply to the non-smooth hinge loss which is widely used in practice. Here, we study the convergence of a homotopic variant of gradient descent applied to the hinge loss and provide explicit convergence rates to the maximal-margin solution for linearly separable data. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

22. Sample-based online learning for bi-regular hinge loss.

Author: Xue, Wei, Zhong, Ping, Zhang, Wensheng, Yu, Gaohang, and Chen, Yebin
Abstract: Support vector machine (SVM), a state-of-the-art classifier for supervised classification task, is famous for its strong generalization guarantees derived from the max-margin property. In this paper, we focus on the maximum margin classification problem cast by SVM and study the bi-regular hinge loss model, which not only performs feature selection but tends to select highly correlated features together. To solve this model, we propose an online learning algorithm that aims at solving a non-smooth minimization problem by alternating iterative mechanism. Basically, the proposed algorithm alternates between intrusion samples detection and iterative optimization, and at each iteration it obtains a closed-form solution to the model. In theory, we prove that the proposed algorithm achieves O(1/T) convergence rate under some mild conditions, where T is the number of training samples received in online learning. Experimental results on synthetic data and benchmark datasets demonstrate the effectiveness and performance of our approach in comparison with several popular algorithms, such as LIBSVM, SGD, PEGASOS, SVRG, etc. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

23. Sparse Twin Extreme Learning Machine With ε-Insensitive Zone Pinball Loss

Author: Jumei Shen and Jun Ma
Subjects: Twin extreme learning machine, pinball loss, hinge loss, sparsity, noise insensitivity, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Twin extreme learning machine (TELM) based on the hinge-loss function shows great potential for pattern classification. However, the hinge loss is related to the shortest distance between sets and the corresponding classifier is hence sensitive to noise and unstable for resampling. In contrast, the ε-insensitive zone pinball loss is related to the quantile distance and is therefore not sensitive to noise, and the resulting solution is sparse. To improve the performance of TELM, we propose a novel TELM learning framework by introducing the ε-insensitive zone pinball loss function into TELM. Compared to TELM with hinge loss, the proposed SPTELM has the same computational complexity, is insensitive to noise, is stable under resampling, and maintains the sparsity of the solution. Further, we theoretically analyze the sparsity, noise insensitivity and time complexity of SPTELM. Experimental results on multiple datasets demonstrate the noise insensitivity and retained sparsity of the proposed method.
Published: 2019
Full Text: View/download PDF

24. Adaptive FH-SVM for Imbalanced Classification

Author: Qi Wang, Yingjie Tian, and Dalian Liu
Subjects: Focal loss, hinge loss, class imbalance, support vector machines (SVMs), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Support vector machines (SVMs), powerful learning methods, have been popular among machine learning researchers due to their strong performance on both classification and regression problems. However, the traditional SVM making use of Hinge Loss cannot deal with class imbalance problems, because it applies the same weight of loss to each class. Recently, Focal Loss has been widely used in deep learning to address imbalanced datasets. The significant effectiveness of Focal loss has attracted attention in many fields, such as object detection and semantic segmentation. Inspired by Focal loss, we reconstructed Hinge Loss with the scaling factor of Focal loss, called FH Loss, which not only deals with class imbalance problems but also preserves the distinctive property of Hinge loss. Owing to the difficulty of the trade-off between positive and negative accuracy in imbalanced classification, FH loss pays more attention to the minority class and misclassified instances to improve the accuracy of each class and further reduce the influence of imbalance. In addition, due to the difficulty of solving SVM with FH loss, we propose an improved model with modified FH loss, called Adaptive FH-SVM. The algorithm solves the optimization problem iteratively and adaptively updates the FH loss of each instance. Experimental results on 31 binary imbalanced datasets demonstrate the effectiveness of our proposed method.
Published: 2019
Full Text: View/download PDF

25. Predicting Interrelated Alzheimer’s Disease Outcomes via New Self-learned Structured Low-Rank Model

Author: for the ADNI, Wang, Xiaoqian, Liu, Kefei, Yan, Jingwen, Risacher, Shannon L., Saykin, Andrew J., Shen, Li, Huang, Heng, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Niethammer, Marc, editor, Styner, Martin, editor, Aylward, Stephen, editor, Zhu, Hongtu, editor, Oguz, Ipek, editor, Yap, Pew-Thian, editor, and Shen, Dinggang, editor
Published: 2017
Full Text: View/download PDF

26. Multi-parameter safe screening rule for hinge-optimal margin distribution machine.

Author: Ma, Mengdan and Xu, Yitian
Subjects: CLASSIFICATION algorithms, MACHINERY, WHOLE-body vibration
Abstract: Optimal margin distribution machine (ODM) is an efficient algorithm for classification problems. ODM attempts to optimize the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously, so it can achieve a better generalization performance. However, it is relatively time-consuming for large-scale problems. In this paper, we propose a hinge loss-based optimal margin distribution machine (Hinge-ODM), which derives a simplified substitute formulation. It can speed up the solving process without noticeably affecting the optimal accuracy. Besides, inspired by its sparse solution, we put forward a multi-parameter safe screening rule for Hinge-ODM, called MSSR-Hinge-ODM. Based on the MSSR, most non-support vectors can be identified and deleted beforehand so the scale of the dual problem will be greatly reduced. Moreover, our MSSR is safe, that is, it yields exactly the same optimal solutions as the original problem. Furthermore, a fast algorithm DCDM is introduced to further solve the reduced Hinge-ODM. Finally, we integrate the MSSR into the grid search method to accelerate the whole training process. Experimental results on twenty data sets demonstrate the superiority of the proposed methods. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

27. Spam Classification: Genetically Optimized Passive-Aggressive Approach

Author: Naravajhula, Priyatam and Naravajula, Alekhya
Published: 2023
Full Text: View/download PDF

28. On Lagrangian L2-norm pinball twin bounded support vector machine via unconstrained convex minimization.

Author: Prasad, Subhash Chandra and Balasundaram, S.
Subjects: *SUPPORT vector machines, *MATRIX inversion
Abstract: With the introduction of the regularization term in the formulation of the well-known twin support vector machine (TWSVM) for classification, the twin bounded support vector machine (TBSVM) method was proposed recently as an improved version by implementing the structural risk minimization principle. However, TBSVM employs the hinge loss function and it is sensitive to noise and unstable to re-sampling. Since the pinball loss function related to quantile distance enjoys the noise insensitivity property, a novel TBSVM method with squared pinball loss function for classification is proposed. The noise insensitivity and scatter minimization properties are discussed. Our formulation is further simplified as a pair of unconstrained strongly convex minimization problems in the dual space free of matrix inversion terms and having only m variables, where m is the number of training examples. As opposed to TWSVM and TBSVM, wherein approximate kernel generated surfaces are constructed, the kernel trick is applied directly in our formulation and thereby an elegant formulation as in the classical support vector machine (SVM) is achieved. Numerical experiments on a synthetic and thirteen benchmark datasets with noise, in which the proposed method shows better or comparable generalization performance with faster learning speed, confirm its suitability and applicability to problems of interest. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

29. Learning Rate for Convex Support Tensor Machines.

Subjects: *SUPPORT vector machines
Abstract: Tensors are increasingly encountered in prediction problems. We extend previous results for high-dimensional least-squares convex tensor regression to classification problems with a hinge loss and establish its asymptotic statistical properties. Based on a general convex decomposable penalty, the rate depends on both the intrinsic dimension and the Rademacher complexity of the class of linear functions of tensor predictors. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

30. A General Framework for Deep Supervised Discrete Hashing.

Author: Li, Qi, Sun, Zhenan, He, Ran, and Tan, Tieniu
Subjects: *HASHING, *STREAMING video & television, *BINARY codes, *DEEP learning, *COST functions, *VIDEO compression
Abstract: With the rapid growth of image and video data on the web, hashing has been extensively studied for image or video search in recent years. Benefiting from recent advances in deep learning, deep hashing methods have shown superior performance over the traditional hashing methods. However, there are some limitations of previous deep hashing methods (e.g., the semantic information is not fully exploited). In this paper, we develop a general deep supervised discrete hashing framework based on the assumption that the learned binary codes should be ideal for classification. Both the similarity information and the classification information are used to learn the hash codes within one stream framework. We constrain the outputs of the last layer to be binary codes directly, which is rarely investigated in deep hashing algorithms. Besides, both the pairwise similarity information and the triplet ranking information are exploited in this paper. In addition, two different loss functions are presented: ℓ2 loss and hinge loss, which are carefully designed for the classification term under the one stream framework. Because of the discrete nature of hash codes, an alternating minimization method is used to optimize the objective function. Experimental results have shown that our approach outperforms current state-of-the-art methods on benchmark datasets. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

31. Ramp loss for twin multi-class support vector classification.

Author: Wang, Huiru, Lu, Sijie, and Zhou, Zhijian
Subjects: *COST functions, *FORECASTING, *CLASSIFICATION, *HINGES, *HYPERPLANES, *OUTLIERS (Statistics)
Abstract: Twin K-class support vector classification (TKSVC) adopts 'One-vs.-One-vs.-Rest' structure to utilise all the samples to increase the prediction accuracy. However, TKSVC is sensitive to noises or outliers due to the use of the Hinge loss function. To reduce the negative influence of outliers, in this paper, we propose a more robust algorithm termed as Ramp loss for twin K-class support vector classification (Ramp-TKSVC) where we use the Ramp loss function to substitute the Hinge loss function in TKSVC. Because the Ramp-TKSVC is a non-differentiable non-convex optimisation problem, we adopt Concave–Convex Procedure (CCCP) to solve it. To overcome the drawbacks of conventional multi-classification methodologies, the TKSVC is utilised as a core of our Ramp-TKSVC. In the Ramp-TKSVC, the outliers are prevented from becoming support vectors, thus they are not involved in the construction of hyperplanes, making the Ramp-TKSVC more robust. Besides, the Ramp-TKSVC is sparser than the TKSVC. To verify the validity of our Ramp-TKSVC, we conduct experiments on 12 benchmark datasets in both linear and nonlinear cases. The experimental results indicate that our algorithm outperforms the other five compared algorithms. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

32. On Security and Sparsity of Linear Classifiers for Adversarial Settings

Author: Demontis, Ambra, Russu, Paolo, Biggio, Battista, Fumera, Giorgio, Roli, Fabio, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Robles-Kelly, Antonio, editor, Loog, Marco, editor, Biggio, Battista, editor, Escolano, Francisco, editor, and Wilson, Richard, editor
Published: 2016
Full Text: View/download PDF

33. Hinge Loss Projection for Classification

Author: Alfarozi, Syukron Abu Ishaq, Woraratpanya, Kuntpong, Pasupa, Kitsuchart, Sugimoto, Masanori, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Hirose, Akira, editor, Ozawa, Seiichi, editor, Doya, Kenji, editor, Ikeda, Kazushi, editor, Lee, Minho, editor, and Liu, Derong, editor
Published: 2016
Full Text: View/download PDF

34. Primal dual algorithm for solving the nonsmooth Twin SVM.

Author: Lyaqini, S., Hadri, A., Ellahyani, A., and Nachaoui, M.
Subjects: *NONSMOOTH optimization, *CONSTRAINED optimization, *RANDOM noise theory, *ALGORITHMS, *PROBLEM solving
Abstract: In this paper, we propose an improved version of Twin SVM using a non-smooth optimization method. Twin SVM generally consists in determining two non-parallel planes by alternately solving two constrained optimization models. Solving this problem using the classical Lagrangian method has many limitations, notably: it is limited to handling Gaussian noise, generally exaggerates the influence of outliers and cannot handle unbalanced data, all of which is due to the differentiability of the model. To circumvent these issues, we transform the two constrained optimization models, using the penalty method, into an unconstrained non-smooth optimization one. The non-smooth nature of the problem has many advantages, but it requires special treatment, which is why we use the primal dual method to solve it, since it is the most appropriate and it is robust in terms of stability, convergence and speed (Lyaqini, Nachaoui and Hadri, 2022). To demonstrate the effectiveness of the proposed approach, several experiments were carried out on numerous UCI benchmarks, medical image and HandPD datasets. These experiments demonstrated the effectiveness and applicability of the proposed approach, with satisfactory results compared to the state of the art. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Top-k Partial Label Machine

Author: Dong Yuan, Xiuwen Gong, and Wei Bao
Subjects: Optimization algorithm, Linear programming, Computer Networks and Communications, Computer science, business.industry, Regular polygon, Pattern recognition, Computer Science Applications, Dual (category theory), Set (abstract data type), Artificial Intelligence, Partial loss, Hinge loss, Artificial intelligence, Noise (video), business, Software
Abstract: To deal with ambiguities in partial label learning (PLL), the existing PLL methods implement disambiguations, by either identifying the ground-truth label or averaging the candidate labels. However, these methods can be easily misled by the false-positive labels in the candidate label set. We find that these ambiguities often originate from the noise caused by highly correlated or overlapping candidate labels, which leads to the difficulty in identifying the ground-truth label on the first attempt. To give the trained models more tolerance, we first propose the top-k partial loss and convex top-k partial hinge loss. Based on the losses, we present a novel top-k partial label machine (TPLM) for partial label classification. An efficient optimization algorithm is proposed based on accelerated proximal stochastic dual coordinate ascent (Prox-SDCA) and linear programming (LP). Moreover, we present a theoretical analysis of the generalization error for TPLM. Comprehensive experiments on both controlled UCI datasets and real-world partial label datasets demonstrate that the proposed method is superior to the state-of-the-art approaches.
Published: 2022

36. On the Rates of Convergence From Surrogate Risk Minimizers to the Bayes Optimal Classifier

Author: Dacheng Tao, Jingwei Zhang, and Tongliang Liu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Mathematical optimization, Optimization problem, Computer Networks and Communications, Computer science, Machine Learning (stat.ML), Bayes Theorem, Machine Learning (cs.LG), Computer Science Applications, Support vector machine, Bayes' theorem, Rate of convergence, Statistics - Machine Learning, Artificial Intelligence, Sample Size, Classifier (linguistics), Hinge loss, Convergence (routing), Neural Networks, Computer, AdaBoost, Algorithms, Software
Abstract: We study the rates of convergence from empirical surrogate risk minimizers to the Bayes optimal classifier. Specifically, we introduce the notion of consistency intensity to characterize a surrogate loss function and exploit this notion to obtain the rate of convergence from an empirical surrogate risk minimizer to the Bayes optimal classifier, enabling fair comparisons of the excess risks of different surrogate risk minimizers. The main result of the paper has practical implications including (1) showing that hinge loss is superior to logistic and exponential loss in the sense that its empirical minimizer converges faster to the Bayes optimal classifier and (2) guiding the modification of surrogate loss functions to accelerate the convergence to the Bayes optimal classifier. Under Minor Revision in TNNLS
Published: 2022

37. Joint Learning for Attribute-Consistent Person Re-Identification

Author: Khamis, Sameh, Kuo, Cheng-Hao, Singh, Vivek K., Shet, Vinay D., Davis, Larry S., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Agapito, Lourdes, editor, Bronstein, Michael M., editor, and Rother, Carsten, editor
Published: 2015
Full Text: View/download PDF

38. Centroid Estimation With Guaranteed Efficiency: A General Framework for Weakly Supervised Learning

Author: Jane J. You, Jian Yang, Chen Gong, and Masashi Sugiyama
Subjects: Computer Science::Machine Learning, Computer science, 02 engineering and technology, Minimum-variance unbiased estimator, Artificial Intelligence, Hinge loss, 0202 electrical engineering, electronic engineering, information engineering, business.industry, Applied Mathematics, Supervised learning, Centroid, Estimator, Term (time), Benchmarking, ComputingMethodologies_PATTERNRECOGNITION, Efficiency, Computational Theory and Mathematics, Benchmark (computing), 020201 artificial intelligence & image processing, Supervised Machine Learning, Computer Vision and Pattern Recognition, Artificial intelligence, business, Algorithm, Algorithms, Software
Abstract: In this paper, we propose a general framework termed "Centroid Estimation with Guaranteed Efficiency" (CEGE) for Weakly Supervised Learning (WSL) with incomplete, inexact, and inaccurate supervision. The core of our framework is to devise an unbiased and statistically efficient risk estimator that is applicable to various weak supervision. Specifically, by decomposing the loss function (e.g., the squared loss and hinge loss) into a label-independent term and a label-dependent term, we discover that only the latter is influenced by the weak supervision and is related to the centroid of the entire dataset. Therefore, by constructing two auxiliary pseudo-labeled datasets with synthesized labels, we derive unbiased estimates of centroid based on the two auxiliary datasets, respectively. These two estimates are further linearly combined with a properly decided coefficient which makes the final combined estimate not only unbiased but also statistically efficient. This is better than some existing methods that only care about the unbiasedness of estimation but ignore the statistical efficiency. The good statistical efficiency of the derived estimator is guaranteed as we theoretically prove that it acquires the minimum variance when estimating the centroid. As a result, intensive experimental results on a large number of benchmark datasets demonstrate that our CEGE generally obtains better performance than the existing approaches related to typical WSL problems including semi-supervised learning, positive-unlabeled learning, multiple instance learning, and label noise learning.
Published: 2022

39. Deep Learning With Asymmetric Connections and Hebbian Updates

Author: Yali Amit
Subjects: Hebbian learning, asymmetric backpropagation, feedback connections, hinge loss, convolutional networks, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
Abstract: We show that deep networks can be trained using Hebbian updates yielding similar performance to ordinary back-propagation on challenging image datasets. To overcome the unrealistic symmetry in connections between layers, implicit in back-propagation, the feedback weights are separate from the feedforward weights. The feedback weights are also updated with a local rule, the same as the feedforward weights—a weight is updated solely based on the product of activity of the units it connects. With fixed feedback weights as proposed in Lillicrap et al. (2016) performance degrades quickly as the depth of the network increases. If the feedforward and feedback weights are initialized with the same values, as proposed in Zipser and Rumelhart (1990), they remain the same throughout training thus precisely implementing back-propagation. We show that even when the weights are initialized differently and at random, and the algorithm is no longer performing back-propagation, performance is comparable on challenging datasets. We also propose a cost function whose derivative can be represented as a local Hebbian update on the last layer. Convolutional layers are updated with tied weights across space, which is not biologically plausible. We show that similar performance is achieved with untied layers, also known as locally connected layers, corresponding to the connectivity implied by the convolutional layers, but where weights are untied and updated separately. In the linear case we show theoretically that the convergence of the error to zero is accelerated by the update of the feedback weights.
Published: 2019
Full Text: View/download PDF

40. Communication-Efficient Distributed Online Prediction by Dynamic Model Synchronization

Author: Kamp, Michael, Boley, Mario, Keren, Daniel, Schuster, Assaf, Sharfman, Izchak, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, Series editor, Tanaka, Yuzuru, Series editor, Wahlster, Wolfgang, Series editor, Siekmann, Jörg, Series editor, Calders, Toon, editor, Esposito, Floriana, editor, Hüllermeier, Eyke, editor, and Meo, Rosa, editor
Published: 2014
Full Text: View/download PDF

41. An Cutting Plane Algorithm for Structured Output Ranking

Author: Blaschko, Matthew B., Mittal, Arpit, Rahtu, Esa, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Jiang, Xiaoyi, editor, Hornegger, Joachim, editor, and Koch, Reinhard, editor
Published: 2014
Full Text: View/download PDF

42. Person Re-Identification Using Kernel-Based Metric Learning Methods

Author: Xiong, Fei, Gou, Mengran, Camps, Octavia, Sznaier, Mario, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Fleet, David, editor, Pajdla, Tomas, editor, Schiele, Bernt, editor, and Tuytelaars, Tinne, editor
Published: 2014
Full Text: View/download PDF

43. Data Augmented Maximum Margin Matrix Factorization for Flickr Group Recommendation

Author: Chen, Liang, Wang, Yilun, Liang, Tingting, Ji, Lichuan, Wu, Jian, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Kobsa, Alfred, editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Tanaka, Yuzuru, editor, Wahlster, Wolfgang, editor, Siekmann, Jörg, editor, Tseng, Vincent S., editor, Ho, Tu Bao, editor, Zhou, Zhi-Hua, editor, Chen, Arbee L. P., editor, and Kao, Hung-Yu, editor
Published: 2014
Full Text: View/download PDF

44. Deep Learning With Asymmetric Connections and Hebbian Updates.

Author: Amit, Yali
Subjects: DEEP learning, COST functions
Abstract: We show that deep networks can be trained using Hebbian updates yielding similar performance to ordinary back-propagation on challenging image datasets. To overcome the unrealistic symmetry in connections between layers, implicit in back-propagation, the feedback weights are separate from the feedforward weights. The feedback weights are also updated with a local rule, the same as the feedforward weights—a weight is updated solely based on the product of activity of the units it connects. With fixed feedback weights as proposed in Lillicrap et al. (2016) performance degrades quickly as the depth of the network increases. If the feedforward and feedback weights are initialized with the same values, as proposed in Zipser and Rumelhart (1990), they remain the same throughout training thus precisely implementing back-propagation. We show that even when the weights are initialized differently and at random, and the algorithm is no longer performing back-propagation, performance is comparable on challenging datasets. We also propose a cost function whose derivative can be represented as a local Hebbian update on the last layer. Convolutional layers are updated with tied weights across space, which is not biologically plausible. We show that similar performance is achieved with untied layers, also known as locally connected layers, corresponding to the connectivity implied by the convolutional layers, but where weights are untied and updated separately. In the linear case we show theoretically that the convergence of the error to zero is accelerated by the update of the feedback weights. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

45. BiLSTM-SSVM: Training the BiLSTM with a Structured Hinge Loss for Named-Entity Recognition

Author: Massimo Piccardi and Hanieh Poostchi
Subjects: Information Systems and Management, Computer science, business.industry, computer.software_genre, Cross entropy, Named-entity recognition, Bounding overwatch, Hinge loss, Benchmark (computing), Artificial intelligence, business, F1 score, Hamming code, computer, Sentence, Natural language processing, Information Systems
Abstract: Building on the achievements of the BiLSTM-CRF in named-entity recognition (NER), this paper introduces the BiLSTM-SSVM, an equivalent neural model where training is performed using a structured hinge loss. The typical loss functions used for evaluating NER are entity-level variants of the F1 score such as the CoNLL and MUC losses. Unfortunately, the common loss function used for training NER - the cross entropy - is only loosely related to the evaluation losses. For this reason, in this paper we propose a training approach for the BiLSTM-CRF that leverages a hinge loss bounding the CoNLL loss from above. In addition, we present a mixed hinge loss that bounds either the CoNLL loss or the Hamming loss based on the density of entity tokens in each sentence. The experimental results over four benchmark languages (English, German, Spanish and Dutch) show that training with the mixed hinge loss has led to small but consistent improvements over the cross entropy across all languages and four different evaluation measures.
Published: 2022

46. Weighted Coordinate-Wise Pegasos

Author: Jumutc, Vilen, Suykens, Johan A. K., Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Maji, Pradipta, editor, Ghosh, Ashish, editor, Murty, M. Narasimha, editor, Ghosh, Kuntal, editor, and Pal, Sankar K., editor
Published: 2013
Full Text: View/download PDF

47. Efficient Discriminative Learning of Class Hierarchy for Many Class Prediction

Author: Chen, Lin, Duan, Lixin, Tsang, Ivor W., Xu, Dong, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Lee, Kyoung Mu, editor, Matsushita, Yasuyuki, editor, Rehg, James M., editor, and Hu, Zhanyi, editor
Published: 2013
Full Text: View/download PDF

48. Statistical Tests Using Hinge/ε-Sensitive Loss

Author: Yıldız, Olcay Taner, Alpaydın, Ethem, Gelenbe, Erol, editor, and Lent, Ricardo, editor
Published: 2013
Full Text: View/download PDF

49. A Latent Variable Ranking Model for Content-Based Retrieval

Author: Quattoni, Ariadna, Carreras, Xavier, Torralba, Antonio, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Baeza-Yates, Ricardo, editor, de Vries, Arjen P., editor, Zaragoza, Hugo, editor, Cambazoglu, B. Barla, editor, Murdock, Vanessa, editor, Lempel, Ronny, editor, and Silvestri, Fabrizio, editor
Published: 2012
Full Text: View/download PDF

50. Distributed Sparse Class-Imbalance Learning and Its Applications

Author: Gopalan Vijendran Venkoparao, Durga Toshniwal, and Chandresh Kumar Maurya
Subjects: 0209 industrial biotechnology, Mathematical optimization, Information Systems and Management, Optimization problem, Computer science, Random coordinate descent, Online machine learning, 02 engineering and technology, Semi-supervised learning, Regularization (mathematics), 020901 industrial engineering & automation, Distributed algorithm, Hinge loss, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Coordinate descent, Information Systems
Abstract: The present work studies class-imbalance problems in a distributed setting, exploiting sparsity structure in the data. We formulate the class-imbalance learning problem as a cost-sensitive learning problem with $L_1$ regularization. The cost-sensitive loss function is a cost-weighted smooth hinge loss. The resultant optimization problem is minimized within the Distributed Alternating Direction Method of Multiplier (DADMM) framework. We partition the data matrix across samples. This operation splits the original problem into a distributed $L_2$-regularized smooth loss minimization and an $L_1$-regularized squared loss minimization. The $L_2$-regularized subproblem is solved via Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and a random coordinate descent method in parallel at multiple processing nodes using MPI, whereas the $L_1$-regularized problem is just a simple soft-thresholding operation. We show, empirically, that the distributed solution approximates the centralized solution on many benchmark data sets. The centralized solution is obtained via Cost-Sensitive Stochastic Coordinate Descent (CSSCD). Empirical results on small and large-scale benchmark datasets show some promising avenues to further investigate the real-world applications of the proposed algorithms such as anomaly detection, class-imbalance learning, etc. To the best of our knowledge, ours is the first work to study class-imbalance in a distributed environment on large-scale sparse data.
Published: 2021
