
Functional analysis of generalized linear models under non-linear constraints with applications to identifying highly-cited papers.

Authors :
Chowdhury, K.P.
Source :
Journal of Informetrics; Feb2021, Vol. 15 Issue 1, pN.PAG-N.PAG, 1p
Publication Year :
2021

Abstract

• Robust functional form contains true parameters far more often than popular models.
• Matches/outperforms widely used regression and Neural Network models.
• Finds appropriate balance between Model Fit, Inference and Prediction (MIPs).
• Introduces a new large-sample DGP test that can be used to improve AI models.
• For the MIS field, finds the Popularity Parameter to be important for predicting citations.

This article introduces a versatile functional form for Generalized Linear Models (GLMs) through a simple, yet effective, transformation of the current framework. The models are applied, through a new hierarchical Bayesian estimation procedure for logistic regression, to highly-cited papers in the Management Information Systems (MIS) field. The results are uniformly better with regard to model fit and inference, for in-sample and out-of-sample data and for both simulation studies and real-world data applications, and the method requires very little time to converge to the true population parameters. In simulation studies, I show that the method contains the true parameters nearly three times as often as widely used existing GLMs, with confidence intervals that are 54.50% smaller and around two-thirds as many MCMC iterations as existing Bayesian methods. In Scientometric applications the methodology is shown to be highly robust, with predictive/classification accuracy equaling or exceeding that of existing methods for identifying highly-cited articles, including Artificial Neural Networks (ANN). Thus, the method is shown to be robust to the amount of asymmetry (or symmetry) of the probability of success (or failure), to unbalanced samples, and to varying Data Generating Processes. Further, the methodology is equivalent to current methods if the data support them and is therefore complementary to existing methods, without loss of interpretability of model parameters.
For the MIS field, it finds that the Popularity Parameter (PP) of an article's keywords can predict whether a paper will be highly cited (top 25% of highly-cited articles) between two and three years after publication and beyond. Furthermore, given the small number of iterations needed for convergence, the methodology can also be used as a baseline method in Big Data (BD) settings for both Artificial Intelligence (AI) and Machine Learning (ML) contexts. [ABSTRACT FROM AUTHOR]
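
As a rough illustration of the binary outcome the abstract refers to, the sketch below flags the top quarter of papers in a sample by citation count. The abstract does not say how the cutoff is computed, so the function name `label_highly_cited`, the tie-handling rule, and the sample counts are all assumptions made for this example, not the article's procedure.

```python
from typing import List


def label_highly_cited(citations: List[int], top_share: float = 0.25) -> List[bool]:
    """Flag papers whose citation count falls in the top `top_share` of the sample.

    Hypothetical helper: the cutoff rule here is an assumption, not taken
    from the article.
    """
    if not 0 < top_share <= 1:
        raise ValueError("top_share must be in (0, 1]")
    if not citations:
        return []
    # Number of papers to flag; always flag at least one.
    n_top = max(1, int(len(citations) * top_share))
    # The citation count of the n_top-th most-cited paper is the threshold.
    threshold = sorted(citations, reverse=True)[n_top - 1]
    # Papers tied at the threshold are all flagged, so ties can push the
    # flagged share slightly above top_share.
    return [count >= threshold for count in citations]


# Made-up citation counts for eight papers.
citations = [0, 2, 3, 5, 8, 13, 21, 40]
labels = label_highly_cited(citations)
print(labels)
```

A label vector of this kind is what a logistic regression, or any other binary classifier, would take as its response variable.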

Details

Language :
English
ISSN :
17511577
Volume :
15
Issue :
1
Database :
Supplemental Index
Journal :
Journal of Informetrics
Publication Type :
Academic Journal
Accession number :
149053330
Full Text :
https://doi.org/10.1016/j.joi.2020.101112