467 results for "Michael J. Hill"
Search Results
2. Professor Michael J Hill (1939-2003)
- Author
- Reed, Peter I
- Published
- 2003
3. The Sociology of Public Administration. Michael J. Hill
- Author
- Thompson, Victor A.
- Published
- 1973
- Full Text
- View/download PDF
4. Community Action and Race Relations Michael J. Hill Ruth M. Issacharoff
- Author
- Moore, Robert
- Published
- 1973
5. The Sociology of Public Administration Michael J. Hill
- Author
- Kelsall, R. K.
- Published
- 1973
6. Community Action and Race Relations. A Study of Community Relations Committees in Britain Michael J. Hill Ruth M. Issacharoff
- Author
- Dummett, Ann
- Published
- 1973
7. Using remote sensing to monitor the spring phenology of Acadia National Park across elevational gradients
- Author
- Yan Liu, Caitlin McDonough MacKenzie, Richard B. Primack, Michael J. Hill, Xiaoyang Zhang, Zhuosen Wang, and Crystal B. Schaaf
- Subjects
- field observation, Landsat, mountainous region, spring phenology, VIIRS, Ecology, QH540-549.5
- Abstract
Greenup dates and their responses to elevation and temperature variations across the mountains of Acadia National Park are monitored using remote sensing data, including Landsat 8 surface reflectances (at a 30‐m spatial resolution) and VIIRS reflectances adjusted to a nadir view (gridded at a 500‐m spatial resolution), during the 2013–2016 growing seasons. The 30‐m resolution provides a better scale for studying the phenology variation across elevational gradients than the 500‐m resolution, as greenup dates monitored at 30‐m scale have better agreement with leaf‐out dates recorded in the field alongside the north–south‐oriented hiking trails on three of the park’s tallest mountains (466 m, 418 m, and 380 m), and can provide landcover‐specific analysis. The spring phenology responses to temperature and elevation vary among different spatial scales. Greenup dates of Acadia National Park monitored at 30‐m scale show a weak advancing trend with higher spring temperature, while greenup dates monitored at 500 m show a weak delaying trend. The species mix within landcover at 30‐m scale could weaken the advancing trend detected at field observation level. The landcover mix and elevation variation within 500‐m scale could alter the spring phenology response to spring temperature variation. Greenup dates monitored at both 30‐m and 500‐m scales vary among different elevational zones, aspects, landcovers, and years. However, the relationship between greenup dates and elevation is rather weak.
- Published
- 2021
- Full Text
- View/download PDF
8. The Sociology of Public Administration Michael J. Hill
- Author
- Martindale, Don
- Published
- 1973
9. Updating the Grassland Vegetation Inventory Using Change Vector Analysis and Functionally-Based Vegetation Indices
- Author
- Xiaohui Yang, Anne M. Smith, and Michael J. Hill
- Subjects
- Environmental sciences, GE1-350, Technology
- Abstract
The Grassland Vegetation Inventory (GVI), which represents a comprehensive biophysical, anthropogenic, and land-use inventory of grasslands in Alberta, is widely used as a baseline for grassland conditions. An up-to-date GVI is essential for understanding grassland changes and for planning management or conservation actions on grasslands. In this study, a hybrid change detection method is proposed that incorporates change vector analysis and a set of vegetation indices (VIs) measuring different vegetation attributes for mapping the conversion of native grassland to cultivated agriculture, and ultimately to update the GVI based on multiseasonal and multiyear Landsat images. Vegetation indices that contribute significantly to differentiation between existing native grassland and land recently converted from native grassland to cultivated cropland were identified by using stepwise regression analyses and were used as inputs for mapping the conversion between 2006 and 2011 or 2015. The results showed that land conversion can be detected using a single image acquired during the growing season, but that the accuracy of identification is affected by the date of image collection and the nature of the VIs used. The greatest accuracy in detecting land conversion between 2006 and 2011 was achieved using the difference in VI between years (dVI) for the Shortwave Infrared Reflectance 3/2 Ratio (SWIR32) and the Enhanced Vegetation Difference Index (EVI) derived from July imagery (accuracy = 95.2 %; Kappa = 0.86). The same combination of SWIR32 and EVI was also effective, although with lower accuracy (accuracy = 86.0 %; Kappa = 0.64) when tested on a larger geographical area and for detecting land use change between 2006 and 2015. The method proposed here could be applied to detect the land cover conversion in other grassland regions, although the optimal VIs and image acquisition date may need to be modified depending on the type of land use activities implemented in each region.
- Published
- 2017
- Full Text
- View/download PDF
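The change-detection signal described in this abstract rests on a between-year vegetation-index difference (dVI). The sketch below illustrates that idea only: the reflectance values are invented, and the EVI coefficients are the commonly used MODIS formulation rather than anything taken from the paper.

```python
import numpy as np

def swir32(swir3, swir2):
    """Shortwave Infrared Reflectance 3/2 ratio (SWIR32)."""
    return swir3 / swir2

def evi(nir, red, blue):
    """Enhanced Vegetation Index, using the widely used MODIS coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def dvi(vi_earlier, vi_later):
    """Between-year difference in a vegetation index (dVI)."""
    return vi_later - vi_earlier

# Toy 2x2 reflectance grids for two acquisition years (hypothetical values
# standing in for July Landsat surface reflectance).
nir06, red06, blue06 = (np.full((2, 2), v) for v in (0.40, 0.10, 0.05))
nir11, red11, blue11 = (np.full((2, 2), v) for v in (0.30, 0.15, 0.05))
swir3_06, swir2_06 = np.full((2, 2), 0.25), np.full((2, 2), 0.20)
swir3_11, swir2_11 = np.full((2, 2), 0.30), np.full((2, 2), 0.20)

d_evi = dvi(evi(nir06, red06, blue06), evi(nir11, red11, blue11))
d_swir32 = dvi(swir32(swir3_06, swir2_06), swir32(swir3_11, swir2_11))
# A drop in EVI together with a rise in SWIR32 is the kind of joint signal
# the abstract associates with conversion of native grassland to cropland.
```

In a real workflow these dVI layers would feed the change-vector step; here they simply show the arithmetic.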
10. Remote Sensing of Savannas and Woodlands: Editorial
- Author
- Michael J. Hill
- Subjects
- n/a, Science
- Abstract
Savannas and woodlands represent one of the most challenging targets for remote sensing [...]
- Published
- 2021
- Full Text
- View/download PDF
11. Community Action and Race Relations: A Study of Community Relations Committees in Britain Michael J. Hill Ruth M. Issacharoff
- Author
- Newton, K.
- Published
- 1972
- Full Text
- View/download PDF
12. The MODIS Global Vegetation Fractional Cover Product 2001–2018: Characteristics of Vegetation Fractional Cover in Grasslands and Savanna Woodlands
- Author
- Michael J. Hill and Juan P. Guerschman
- Subjects
- vegetation, grassland, savanna, fractional cover, trend, ecoregion, bare soil, livestock, production systems, Science
- Abstract
Vegetation Fractional Cover (VFC) is an important global indicator of land cover change, land use practice and landscape, and ecosystem function. In this study, we present the Global Vegetation Fractional Cover Product (GVFCP) and explore the levels and trends in VFC across World Grassland Type (WGT) Ecoregions considering variation associated with Global Livestock Production Systems (GLPS). Long-term average levels and trends in fractional cover of photosynthetic vegetation (FPV), non-photosynthetic vegetation (FNPV), and bare soil (FBS) are mapped, and variation among GLPS types within WGT Divisions and Ecoregions is explored. Analysis also focused on the savanna-woodland WGT Formations. Many WGT Divisions showed wide variation in long-term average VFC and trends in VFC across GLPS types. Results showed large areas of many ecoregions experiencing significant positive and negative trends in VFC. East Africa, Patagonia, and the Mitchell Grasslands of Australia exhibited large areas of negative trends in FNPV and positive trends in FBS. These trends may reflect interactions between extended drought, heavy livestock utilization, expanded agriculture, and other land use changes. Compared to previous studies, explicit measurement of FNPV revealed interesting additional information about vegetation cover and trends in many ecoregions. The Australian and Global products are available via the GEOGLAM RAPP (Group on Earth Observations Global Agricultural Monitoring Rangeland and Pasture Productivity) website, and the scientific community is encouraged to utilize the data and contribute to improved validation.
- Published
- 2020
- Full Text
- View/download PDF
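A per-pixel trend in fractional cover, of the kind this abstract summarizes across 2001–2018, can be estimated with an ordinary least-squares slope. This is a generic sketch on synthetic data, not the study's actual trend methodology.

```python
import numpy as np

def fractional_cover_trend(stack, years):
    """Per-pixel OLS slope of fractional cover over time.

    stack: (n_years, rows, cols) cover fractions in [0, 1]
    years: 1-D array of years, length n_years
    Returns a (rows, cols) array of slopes in cover units per year.
    """
    n_years, rows, cols = stack.shape
    flat = stack.reshape(n_years, -1)            # one column per pixel
    slopes = np.polyfit(years, flat, deg=1)[0]   # row 0 holds the slopes
    return slopes.reshape(rows, cols)

years = np.arange(2001, 2019)                    # the product's 2001-2018 span
# Synthetic bare-soil fraction rising by 0.01 of cover per year everywhere.
stack = 0.2 + 0.01 * (years - 2001)[:, None, None] * np.ones((1, 3, 3))
trend = fractional_cover_trend(stack, years)     # ~0.01 at every pixel
```

`np.polyfit` accepts a 2-D right-hand side, so all pixels are fitted in one call; significance testing, as in the paper, would require additional statistics.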
13. Functional Phenology of a Texas Post Oak Savanna from a CHRIS PROBA Time Series
- Author
- Michael J. Hill, Andrew Millington, Rebecca Lemons, and Cherie New
- Subjects
- savanna, post oak, vegetation index, ecosystem function, phenology, encroachment, evergreen, deciduous, Science
- Abstract
Remnant midwestern oak savannas in the USA have been altered by fire suppression and the encroachment of woody evergreen trees and shrubs. The Gus Engeling Wildlife Management Area (GEWMA) near Palestine, Texas represents a relatively intact southern example of thickening and evergreen encroachment in oak savannas. In this study, 18 images from the CHRIS/PROBA (Compact High-Resolution Imaging Spectrometer/Project for On-Board Autonomy) sensor were acquired between June 2009 and October 2010 and used to explore variation in canopy dynamics among deciduous and evergreen trees and shrubs, and savanna grassland in seasonal leaf-on and leaf-off conditions. Nadir CHRIS images from the 11 useable dates were processed to surface reflectance and a selection of vegetation indices (VIs) sensitive to pigments, photosynthetic efficiency, and canopy water content were calculated. An analysis of temporal VI phenology was undertaken using a fishnet polygon at 90 m resolution incorporating tree densities from a classified aerial photo and soil type polygons. The results showed that the major differences in spectral phenology were associated with deciduous tree density, the density of evergreen trees and shrubs—especially during deciduous leaf-off periods—broad vegetation types, and soil type interactions with elevation. The VIs were sensitive to high densities of evergreens during the leaf-off period and indicative of a photosynthetic advantage over deciduous trees. The largest differences in VI profiles were associated with high and low tree density, and soil types with the lowest and highest available soil water. The study showed how time series of hyperspectral data could be used to monitor the relative abundance and vigor of desirable and less desirable species in conservation lands.
- Published
- 2019
- Full Text
- View/download PDF
14. Assessment of Regional Vegetation Response to Climate Anomalies: A Case Study for Australia Using GIMMS NDVI Time Series between 1982 and 2006
- Author
- Wanda De Keersmaecker, Stef Lhermitte, Michael J. Hill, Laurent Tits, Pol Coppin, and Ben Somers
- Subjects
- vegetation stability, non-stationarity, resistance, resilience, variance, Australia, climate change, Science
- Abstract
Within the context of climate change, it is of utmost importance to quantify the stability of ecosystems with respect to climate anomalies. It is well acknowledged that ecosystem stability may change over time. As these temporal stability changes may provide a warning for increased vulnerability of the system, this study provides a methodology to quantify and assess these temporal changes in vegetation stability. Within this framework, vegetation stability changes were quantified over Australia from 1982 to 2006 using GIMMS NDVI and climate time series (i.e., SPEI (Standardized Precipitation and Evaporation Index)). Starting from a stability assessment on the complete time series, we aim to assess: (i) the magnitude and direction of stability changes; and (ii) the similarity in these changes for different stability metrics, i.e., the standard deviation of the NDVI anomaly (SD), auto-correlation at lag one of the NDVI anomaly (AC) and the correlation of NDVI anomaly with SPEI (CS). Results show high variability in magnitude and direction for the different stability metrics. Large areas and types of Australian vegetation showed an increase in variability (SD) over time; however, vegetation memory (AC) decreased. The association of NDVI anomalies with drought events (CS) showed a mixed response: the association increased in the western part, while it decreased in the eastern part. This methodology shows the potential for quantifying vegetation responses to major climate shifts and land use change, but results could be enhanced with higher resolution time series data.
- Published
- 2017
- Full Text
- View/download PDF
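The three stability metrics this abstract names (SD, AC, CS) reduce to simple statistics of the NDVI anomaly series. The sketch below uses generic NumPy estimators and synthetic data; it is not the study's processing chain, which works on moving windows over GIMMS pixels.

```python
import numpy as np

def stability_metrics(ndvi_anomaly, spei):
    """SD, lag-1 autocorrelation (AC), and correlation with SPEI (CS)
    of an NDVI anomaly time series."""
    sd = ndvi_anomaly.std(ddof=1)
    ac = np.corrcoef(ndvi_anomaly[:-1], ndvi_anomaly[1:])[0, 1]
    cs = np.corrcoef(ndvi_anomaly, spei)[0, 1]
    return sd, ac, cs

rng = np.random.default_rng(0)
spei = rng.normal(size=300)
# An anomaly series partly driven by SPEI plus independent noise, so the
# drought association (CS) should come out strongly positive.
anomaly = 0.5 * spei + 0.1 * rng.normal(size=300)
sd, ac, cs = stability_metrics(anomaly, spei)
```

Tracking these metrics in sliding windows, rather than once over the whole series, is what lets the study speak of temporal *changes* in stability.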
15. Book Reviews : Community Action and Race Relations: A Study of Community Relations Committees in Britain. By MICHAEL J. HILL and RUTH M. ISSACHAROFF (London, Oxford University Press for the Institute of Race Relations, 1971). xvi + 295 pp., ...
- Author
- Allen, Sheila
- Published
- 1972
- Full Text
- View/download PDF
16. Integration of Optical and Radar Classifications for Mapping Pasture Type in Western Australia.
- Author
- Michael J. Hill, Catherine J. Ticehurst, Jong-sen Lee, Mitchell R. Grunes, Graham E. Donald, and David Henry
- Subjects
- SYNTHETIC aperture radar, OPTICS, IMAGING systems, COHERENT radar, SPECTRUM analysis
- Abstract
In this study, independent classifications of Landsat Thematic Mapper imagery and Jet Propulsion Laboratory AirSAR were combined to create an integrated classification of pasture and other vegetation types for a study area in the agricultural zone of Western Australia. The resulting classification combines greenness and brightness information from optical data with structure and water content information from synthetic aperture radar (SAR). Field observations of vegetation type, botanical composition, ground cover percentage, wet and dry biomass, canopy height, and soil water content were collected at 34 sites representing a range of pastures, browse shrubs, and crops. An unsupervised version of the Complex Wishart classification procedure, based on preserving scattering characteristics from the Freeman and Durden backscatter decomposition, was applied to the C-, L-, and P-band polarimetric SAR data. The optical classification was carried out using a principal component analysis on the green, red, and near-infrared bands and clustering on the basis of a class centroid distance measure and knowledge of ground targets. These two classification results were then fused together. Assessment of a confusion matrix using the individual sites showed that identification of more uniform, dense, and structurally distinct canopies was better than that of more diverse, sparse, and structurally ambiguous canopies, as the former were better represented by the canopy height attribute used in the SAR classification component. The optical classification enabled correction of SAR misclassification of vegetation due to surface roughness and soil moisture effects, or similar backscatter responses from herbaceous or arboreal canopies.
The results show that simplification of vegetation into groups based upon properties with sensitive responses in both the optical and SAR domains, and combination of separate SAR and optical classifications, has potential for improving classification of diverse and heterogeneous herbaceous and browse cover in grazing lands. However, collection of ground calibration data must be at an appropriate spatial scale and include canopy and surface measurements directly related to backscatter mechanisms and spectral sensitivity. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
17. The First Michael J Hill Memorial Lecture - 2003.
- Author
- Maskens, A. P.
- Published
- 2004
- Full Text
- View/download PDF
18. Causation and prevention of human cancer, proceedings of the 8th annual symposium of the European Organization for the Cooperation in Cancer Prevention Studies (ECP), Heidelberg, Germany, April 2-3, 1990. Michael J. Hill, DSc, FRCPath, and Attilio Giacosa, MD, eds. Kluwer Academic Publishers, 1991. 164 pp
- Author
- Jardines, Lori
- Published
- 1991
- Full Text
- View/download PDF
19. Symbol ungrounding: what the successes (and failures) of large language models reveal about human cognition.
- Author
- Dove, Guy
- Subjects
- LANGUAGE models, ARTIFICIAL intelligence, SEMANTIC memory, COGNITION, HUMAN beings
- Abstract
Large language models can handle sophisticated natural language processing tasks. This raises the question of how their understanding of semantic meaning compares to that of human beings. Supporters of embodied cognition often point out that because these models are trained solely on text, their representations of semantic content are not grounded in sensorimotor experience. This paper contends that human cognition exhibits capabilities that fit with both the embodied and artificial intelligence approaches. Evidence suggests that semantic memory is partially grounded in sensorimotor systems and dependent on language-specific learning. From this perspective, large language models demonstrate the richness of language as a source of semantic information. They show how our experience with language might scaffold and extend our capacity to make sense of the world. In the context of an embodied mind, language provides access to a valuable form of ungrounded cognition. This article is part of the theme issue 'Minds in movement: embodied cognition in the age of artificial intelligence'. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Explainable depression symptom detection in social media.
- Author
- Bao, Eliseo, Pérez, Anxo, and Parapar, Javier
- Published
- 2024
- Full Text
- View/download PDF
21. Efficiency-oriented approaches for self-supervised speech representation learning.
- Author
- Lugo, Luis and Vielzeuf, Valentin
- Subjects
- SPEECH, AUTOMATIC speech recognition, NATURAL language processing, SUPERVISED learning, COMPUTER vision, ENVIRONMENTAL economics
- Abstract
Self-supervised learning enables the training of large neural models without the need for large, labeled datasets. It has been generating breakthroughs in several fields, including computer vision, natural language processing, biology, and speech. In particular, the state-of-the-art in several speech processing applications, such as automatic speech recognition or speaker identification, are models where the latent representation is learned using self-supervised approaches. Several configurations exist in self-supervised learning for speech, including contrastive, predictive, and multilingual approaches. There is, however, a crucial limitation in the majority of existing approaches: their high computational costs. These costs limit the deployment of models, the size of the training dataset, and the number of research groups that can afford research with large self-supervised models. Likewise, we should consider the environmental costs that high energy consumption implies. Efforts in this direction comprise optimization of existing models, neural architecture efficiency, improvements in finetuning for speech processing tasks, and data efficiency. But despite current efforts, more work could be done to address high computational costs in self-supervised representation learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges.
- Author
- Kumar, Pranjal
- Published
- 2024
- Full Text
- View/download PDF
23. The Limits of Calibration and the Possibility of Roles for Trustworthy AI.
- Author
- Franke, Ulrik
- Abstract
With increasing use of artificial intelligence (AI) in high-stakes contexts, a race for “trustworthy AI” is under way. However, Dorsch and Deroy (Philosophy & Technology 37, 62, 2024) recently argued that regardless of its feasibility, morally trustworthy AI is unnecessary: We should merely rely on rather than trust AI, and carefully calibrate our reliance using the reliability scores which are often available. This short commentary on Dorsch and Deroy engages with the claim that morally trustworthy AI is unnecessary and argues that since there are important limits to how good calibration based on reliability scores can be, some residual roles for trustworthy AI (if feasible) are still possible. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Y-Tuning: an efficient tuning paradigm for large-scale pre-trained models via label representation learning.
- Author
- Liu, Yitao, An, Chenxin, and Qiu, Xipeng
- Abstract
With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but needs to save and compute the gradient of the whole computational graph. In this paper, we propose Y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. Y-Tuning learns dense representations for labels Y defined in a given task and aligns them to a fixed feature representation. Without computing the gradients of the text encoder in the training phase, Y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTaXXL with 1.6 billion parameters, Y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE Benchmark with only 2% tunable parameters and much fewer training costs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. PETA: evaluating the impact of protein transfer learning with sub-word tokenization on downstream applications.
- Author
- Tan, Yang, Li, Mingchen, Zhou, Ziyi, Tan, Pan, Yu, Huiqun, Fan, Guisheng, and Hong, Liang
- Abstract
Protein language models (PLMs) play a dominant role in protein representation learning. Most existing PLMs regard proteins as sequences of 20 natural amino acids. The problem with this representation method is that it simply divides the protein sequence into sequences of individual amino acids, ignoring the fact that certain residues often occur together. Therefore, it is inappropriate to view amino acids as isolated tokens. Instead, the PLMs should recognize the frequently occurring combinations of amino acids as a single token. In this study, we use the byte-pair-encoding algorithm and unigram to construct advanced residue vocabularies for protein sequence tokenization, and we have shown that PLMs pre-trained using these advanced vocabularies exhibit superior performance on downstream tasks when compared to those trained with simple vocabularies. Furthermore, we introduce PETA, a comprehensive benchmark for systematically evaluating PLMs. We find that vocabularies comprising 50 and 200 elements achieve optimal performance. Our code, model weights, and datasets are available at https://github.com/ginnm/ProteinPretraining. Scientific contribution: This study introduces advanced protein sequence tokenization analysis, leveraging the byte-pair-encoding algorithm and unigram. By recognizing frequently occurring combinations of amino acids as single tokens, our proposed method enhances the performance of PLMs on downstream tasks. Additionally, we present PETA, a new comprehensive benchmark for the systematic evaluation of PLMs, demonstrating that vocabularies of 50 and 200 elements offer optimal performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
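The byte-pair-encoding idea this abstract builds on, repeatedly merging the most frequent adjacent pair of residues into a single vocabulary token, can be sketched minimally as follows. The protein fragments are invented and this is not the PETA implementation.

```python
from collections import Counter

def most_frequent_pair(sequences):
    """Count adjacent symbol pairs across tokenized sequences and return
    the most frequent one."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(sequences, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged = []
    for seq in sequences:
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(seq[i] + seq[i + 1])
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged

# Toy protein fragments as lists of single amino-acid tokens.
seqs = [list("MKVL"), list("MKAA"), list("MKVV")]
pair = most_frequent_pair(seqs)   # ('M', 'K') occurs in all three fragments
seqs = merge_pair(seqs, pair)     # 'MK' now behaves as one vocabulary token
```

Iterating these two steps until the vocabulary reaches a target size (50 or 200 elements, per the abstract's finding) yields the multi-residue tokens the paper evaluates.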
26. MicroBERT: Distilling MoE-Based Knowledge from BERT into a Lighter Model.
- Author
- Zheng, Dashun, Li, Jiaxuan, Yang, Yunchu, Wang, Yapeng, and Pang, Patrick Cheong-Iao
- Subjects
- LANGUAGE models, GENERATIVE adversarial networks, KNOWLEDGE transfer, DISTILLATION, GLUE
- Abstract
Natural language-processing tasks have been improved greatly by large language models (LLMs). However, numerous parameters make their execution computationally expensive and difficult on resource-constrained devices. For this problem, as well as maintaining accuracy, some techniques such as distillation and quantization have been proposed. Unfortunately, current methods fail to integrate model pruning with downstream tasks and overlook sentence-level semantic modeling, resulting in reduced efficiency of distillation. To alleviate these limitations, we propose a novel distilled lightweight model for BERT named MicroBERT. This method can transfer the knowledge contained in the "teacher" BERT model to a "student" BERT model. The sentence-level feature alignment loss (FAL) distillation mechanism, guided by Mixture-of-Experts (MoE), captures comprehensive contextual semantic knowledge from the "teacher" model to enhance the "student" model's performance while reducing its parameters. To make the outputs of "teacher" and "student" models comparable, we introduce the idea of a generative adversarial network (GAN) to train a discriminator. Our experimental results based on four datasets show that all steps of our distillation mechanism are effective, and the MicroBERT (101.14%) model outperforms TinyBERT (99%) by 2.24% in terms of average distillation reductions in various tasks on the GLUE dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Large language models for automated Q&A involving legal documents: a survey on algorithms, frameworks and applications.
- Author
- Yang, Xiaoxian, Wang, Zhifeng, Wang, Qi, Wei, Ke, Zhang, Kaiqi, and Shi, Jiangang
- Abstract
Purpose: This study aims to adopt a systematic review approach to examine the existing literature on law and LLMs. It involves analyzing and synthesizing relevant research papers, reports and scholarly articles that discuss the use of LLMs in the legal domain. The review encompasses various aspects, including an analysis of LLMs, legal natural language processing (NLP), model tuning techniques, data processing strategies and frameworks for addressing the challenges associated with legal question-and-answer (Q&A) systems. Additionally, the study explores potential applications and services that can benefit from the integration of LLMs in the field of intelligent justice. Design/methodology/approach: This paper surveys the state-of-the-art research on law LLMs and their application in the field of intelligent justice. The study aims to identify the challenges associated with developing Q&A systems based on LLMs and explores potential directions for future research and development. The ultimate goal is to contribute to the advancement of intelligent justice by effectively leveraging LLMs. Findings: To effectively apply a law LLM, systematic research on LLM, legal NLP and model adjustment technology is required. Originality/value: This study contributes to the field of intelligent justice by providing a comprehensive review of the current state of research on law LLMs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Language in Brains, Minds, and Machines.
- Author
- Tuckute, Greta, Kanwisher, Nancy, and Fedorenko, Evelina
- Subjects
- LANGUAGE models, ARTIFICIAL languages, COGNITIVE neuroscience, TASK performance, BRAIN imaging
- Abstract
It has long been argued that only humans could produce and understand language. But now, for the first time, artificial language models (LMs) achieve this feat. Here we survey the new purchase LMs are providing on the question of how language is implemented in the brain. We discuss why, a priori, LMs might be expected to share similarities with the human language system. We then summarize evidence that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding and decoding during language processing. Finally, we examine which LM properties—their architecture, task performance, or training—are critical for capturing human neural responses to language and review studies using LMs as in silico model organisms for testing hypotheses about language. These ongoing investigations bring us closer to understanding the representations and processes that underlie our ability to comprehend sentences and express thoughts in language. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Pretrained models for cross-modal retrieval: experiments and improvements.
- Author
- Zhou, Kun, Hassan, Fadratul Hafinaz, and Gan, Keng Hoon
- Abstract
Cross-modal retrieval, the process of retrieving relevant data from one modality in response to a query in another, has become increasingly important with the growing amount of multimodal data. This paper proposes using a pretrained model CLIP as the backbone of a cross-modal retrieval system and explores various methods to enhance its performance. The proposed approach reduces the output feature dimension to 384, reducing model parameters, storage capacity, and retrieval time by 62.5%. By conducting cross-training on the training dataset, the model not only enhances its intermodal invariance but also achieves multimodal retrieval. The residual connections and an increased dropout ratio of 30% increase average retrieval performance. Additionally, we propose the utilization of class proxies as missing data to accomplish training in an incomplete (imbalanced) dataset. The proposed approach is evaluated on four benchmark datasets: Wikipedia, NUS-WIDE, Pascal-Sentence, and XmediaNet, achieving 3.4%, 1.9%, 2.3%, and 5.8% retrieval performance improvement, respectively. The results demonstrate the effectiveness of the proposed approach in significantly improving the performance of cross-modal retrieval systems, outperforming state-of-the-art methods on benchmark datasets while reducing the number of model parameters, retrieval time, and database storage space. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Raising the Bar on Acceptability Judgments Classification: An Experiment on ItaCoLA Using ELECTRA.
- Author
- Guarasci, Raffaele, Minutolo, Aniello, Buonaiuto, Giuseppe, De Pietro, Giuseppe, and Esposito, Massimo
- Subjects
- NATURAL language processing, LEGAL judgments, LANGUAGE models, CORPORA, CLASSIFICATION
- Abstract
The task of automatically evaluating acceptability judgments has relished increasing success in Natural Language Processing, starting from including the Corpus of Linguistic Acceptability (CoLa) in the GLUE benchmark dataset. CoLa spawned a thread that led to the development of several similar datasets in different languages, broadening the investigation possibilities to many languages other than English. In this study, leveraging the Italian Corpus of Linguistic Acceptability (ItaCoLA), comprising nearly 10,000 sentences with acceptability judgments, we propose a new methodology that utilizes the neural language model ELECTRA. This approach exceeds the scores obtained from current baselines and demonstrates that it can overcome language-specific limitations in dealing with specific phenomena. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. LCAS-DetNet: A Ship Target Detection Network for Synthetic Aperture Radar Images.
- Author
- Liu, Junlin, Liao, Dingyi, Wang, Xianyao, Li, Jun, Yang, Bing, and Chen, Guanyu
- Subjects
- SYNTHETIC aperture radar, SYNTHETIC apertures, SPECKLE interference, SPATIAL arrangement, SHIPS, SPATIAL ability
- Abstract
Monitoring ships on water surfaces encounters obstacles such as weather conditions, sunlight, and water ripples, posing significant challenges in accurately detecting target ships in real time. Synthetic Aperture Radar (SAR) offers a viable solution for real-time ship detection, unaffected by cloud coverage, precipitation, or light levels. However, SAR images are often affected by speckle noise, salt-and-pepper noise, and water surface ripple interference. This study introduces LCAS-DetNet, a Multi-Location Cross-Attention Ship Detection Network tailored for the ships in SAR images. Modeled on the YOLO architecture, LCAS-DetNet comprises a feature extractor, an intermediate layer ("Neck"), and a detection head. The feature extractor includes the computation of Multi-Location Cross-Attention (MLCA) for precise extraction of ship features at multiple scales. Incorporating both local and global branches, MLCA bolsters the network's ability to discern spatial arrangements and identify targets via a cross-attention mechanism. Each branch utilizes Multi-Location Attention (MLA) and calculates pixel-level correlations in both channel and spatial dimensions, further combating the impact of salt-and-pepper noise on the distribution of objective ship pixels. The feature extractor integrates downsampling and MLCA stacking, enhanced with residual connections and Patch Embedding, to improve the network's multi-scale spatial recognition capabilities. As the network deepens, we consider this structure to be cascaded and multi-scale, providing the network with a richer receptive field. Additionally, we introduce a loss function based on Wise-IoUv3 to address the influence of label quality on the gradient updates. The effectiveness of our network was validated on the HRSID and SSDD datasets, where it achieved state-of-the-art performance: a 96.59% precision on HRSID and 97.52% on SSDD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Accelerating BERT inference with GPU-efficient exit prediction.
- Author
-
Li, Lei, Wang, Chengyu, Qiu, Minghui, Chen, Cen, Gao, Ming, and Zhou, Aoying
- Abstract
BERT is a representative pre-trained language model that has drawn extensive attention for significant improvements in downstream Natural Language Processing (NLP) tasks. The complex architecture and massive parameters bring BERT competitive performance but also result in slow speed at model inference time. To speed up BERT inference, FastBERT realizes adaptive inference with an acceptable drop in accuracy based on knowledge distillation and the early-exit technique. However, many factors may limit the performance of FastBERT, such as the teacher classifier that is not knowledgeable enough, the batch size shrinkage and the redundant computation of student classifiers. To overcome these limitations, we propose a new BERT inference method with GPU-Efficient Exit Prediction (GEEP). GEEP leverages the shared exit loss to simplify the training process of FastBERT from two steps into only one step and makes the teacher classifier more knowledgeable by feeding diverse Transformer outputs to the teacher classifier. In addition, the exit layer prediction technique is proposed to utilize a GPU hash table to handle the token-level exit layer distribution and to sort test samples by predicted exit layers. In this way, GEEP can avoid batch size shrinkage and redundant computation of student classifiers. Experimental results on twelve public English and Chinese NLP datasets prove the effectiveness of the proposed approach. The source codes of GEEP will be released to the public upon paper acceptance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
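The entropy-based early-exit idea behind FastBERT-style inference (which GEEP builds on) can be sketched in a few lines. This is a generic numpy illustration, not GEEP's actual code: the `threshold` value and the per-layer classifier logits are assumed placeholders.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability vector; low entropy = confident."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def early_exit_inference(layer_logits, threshold=0.3):
    """Return (prediction, exit_layer): stop at the first student
    classifier whose softmax entropy falls below `threshold`."""
    for i, logits in enumerate(layer_logits):
        z = np.exp(logits - logits.max())   # stable softmax
        probs = z / z.sum()
        if entropy(probs) < threshold:
            return int(np.argmax(probs)), i
    # no classifier was confident enough: fall back to the last layer
    return int(np.argmax(layer_logits[-1])), len(layer_logits) - 1
```

GEEP's contribution is on top of this loop: predicting each sample's exit layer up front (via a GPU hash table) and sorting the batch by predicted exit layer, so confident samples do not keep the whole batch alive through deeper layers.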
33. Natural Language Processing: An Overview of Models, Transformers and Applied Practices.
- Author
-
Canchila, Santiago, Meneses-Eraso, Carlos, Casanoves-Boix, Javier, Cortés-Pellicer, Pascual, and Castelló-Sirvent, Fernando
- Abstract
The study of utilizing human language in computer systems, referred to as NLP, is becoming increasingly significant in various aspects of life, including research, daily activities, commerce, and entrepreneurship endeavors. A multitude of tech companies are dedicating resources towards the development and improvement of NLP methods, models, and products. In addition, open-source contributions to the field are on the rise. However, with so much progress being made, it may be challenging to understand the current state of NLP and what models are considered to be the most efficient. To help those grappling with the fast-paced and constantly evolving NLP landscape, we have put together a comprehensive overview of the latest NLP research and advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. WeatherBench 2: A Benchmark for the Next Generation of Data‐Driven Global Weather Models.
- Author
-
Rasp, Stephan, Hoyer, Stephan, Merose, Alexander, Langmore, Ian, Battaglia, Peter, Russell, Tyler, Sanchez‐Gonzalez, Alvaro, Yang, Vivian, Carver, Rob, Agrawal, Shreya, Chantry, Matthew, Ben Bouallegue, Zied, Dueben, Peter, Bromberg, Carla, Sisk, Jared, Barrington, Luke, Bell, Aaron, and Sha, Fei
- Subjects
NUMERICAL weather forecasting ,WEATHER forecasting ,WEATHER ,ARTIFICIAL intelligence ,WEATHERING - Abstract
WeatherBench 2 is an update to the global, medium‐range (1–14 days) weather forecasting benchmark proposed by (Rasp et al., 2020, https://doi.org/10.1029/2020ms002203), designed with the aim to accelerate progress in data‐driven weather modeling. WeatherBench 2 consists of an open‐source evaluation framework, publicly available training, ground truth and baseline data as well as a continuously updated website with the latest metrics and state‐of‐the‐art models: https://sites.research.google/weatherbench. This paper describes the design principles of the evaluation framework and presents results for current state‐of‐the‐art physical and data‐driven weather models. The metrics are based on established practices for evaluating weather forecasts at leading operational weather centers. We define a set of headline scores to provide an overview of model performance. In addition, we also discuss caveats in the current evaluation setup and challenges for the future of data‐driven weather forecasting. Plain Language Summary: Traditionally, weather forecasts are made by models that attempt to replicate the physical processes of the atmosphere. This has been very successful over the last few decades as better computers, better observations and model upgrades have led to steadily improving weather forecasts. However, with rapid advances in artificial intelligence (AI), the question can be asked whether one can simply learn a weather model from past observations or reanalyses. In the last couple of years, we have seen tremendous progress with state‐of‐the‐art AI models rivaling the best "traditional" weather models in skill. WeatherBench 2 is a benchmark data set designed to evaluate and compare the quality of AI and traditional models. By setting a standard for evaluation, alongside providing open‐source data and code, this project aims to accelerate this research direction and lead to better weather prediction. 
Key Points: WeatherBench 2 is a framework for evaluating and comparing data‐driven and traditional numerical weather forecasting models. It provides an evaluation framework, publicly available data sets and a website to assess the state‐of‐the‐art weather models. The evaluation protocol has been designed following best practices established in the operational weather forecasting community. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
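A core convention in weather-model evaluation of the kind this benchmark standardizes is area weighting: grid cells shrink toward the poles, so per-row errors are weighted by the cosine of latitude. The sketch below illustrates that idea for an RMSE score; it follows the spirit of WeatherBench's latitude-weighted headline metrics, though the benchmark's own code may normalize differently.

```python
import numpy as np

def lat_weighted_rmse(forecast, truth, lats_deg):
    """Area-weighted RMSE on a regular lat/lon grid.

    Each latitude row is weighted by cos(latitude), with the weights
    normalized to mean 1, so a spatially uniform error gives the same
    value as an unweighted RMSE."""
    w = np.cos(np.deg2rad(lats_deg))
    w = w / w.mean()                     # mean-1 normalization
    sq_err = (forecast - truth) ** 2     # shape: (lat, lon)
    return float(np.sqrt((w[:, None] * sq_err).mean()))
```

With this weighting, the same pointwise error counts for more near the equator than near the poles, matching each cell's share of the globe's area.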
35. Transformers for Remote Sensing: A Systematic Review and Analysis.
- Author
-
Wang, Ruikun, Ma, Lei, He, Guangjun, Johnson, Brian Alan, Yan, Ziyun, Chang, Ming, and Liang, Ying
- Subjects
TRANSFORMER models ,REMOTE sensing ,CONVOLUTIONAL neural networks ,OPTICAL remote sensing ,DATABASES ,RECURRENT neural networks ,OBJECT recognition (Computer vision) - Abstract
Research on transformers in remote sensing (RS), which started to increase after 2021, is facing the problem of a relative lack of review. To understand the trends of transformers in RS, we undertook a quantitative analysis of the major research on transformers over the past two years by dividing the application of transformers into eight domains: land use/land cover (LULC) classification, segmentation, fusion, change detection, object detection, object recognition, registration, and others. Quantitative results show that transformers achieve a higher accuracy in LULC classification and fusion, with more stable performance in segmentation and object detection. Combining the analysis results on LULC classification and segmentation, we have found that transformers need more parameters than convolutional neural networks (CNNs). Additionally, further research is also needed regarding inference speed to improve transformers' performance. It was determined that the most common application scenes for transformers in our database are urban, farmland, and water bodies. We also found that transformers are employed in the natural sciences such as agriculture and environmental protection rather than the humanities or economics. Finally, this work summarizes the analysis results of transformers in remote sensing obtained during the research process and provides a perspective on future directions of development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A Historical Survey of Advances in Transformer Architectures.
- Author
-
Sajun, Ali Reza, Zualkernan, Imran, and Sankalpa, Donthi
- Subjects
LANGUAGE models ,TRANSFORMER models ,DEEP learning ,COMPUTER vision ,MACHINE learning - Abstract
In recent times, transformer-based deep learning models have risen in prominence in the field of machine learning for a variety of tasks such as computer vision and text generation. Given this increased interest, a historical outlook at the development and rapid progression of transformer-based models becomes imperative in order to gain an understanding of the rise of this key architecture. This paper presents a survey of key works related to the early development and implementation of transformer models in various domains such as generative deep learning and as backbones of large language models. Previous works are classified based on their historical approaches, followed by key works in the domain of text-based applications, image-based applications, and miscellaneous applications. A quantitative and qualitative analysis of the various approaches is presented. Additionally, recent directions of transformer-related research such as those in the biomedical and timeseries domains are discussed. Finally, future research opportunities, especially regarding the multi-modality and optimization of the transformer training process, are identified. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Design of a large language model for improving customer service in telecom operators.
- Author
-
Xiaoliang, Ma, RuQiang, Zhao, Ying, Liu, Congjian, Deng, and Dequan, Du
- Subjects
LANGUAGE models ,CUSTOMER services ,INFORMATION technology security ,QUALITY of service ,KNOWLEDGE base - Abstract
Telecommunications operators are tasked with enhancing service quality, reducing operational costs, and preserving customer privacy. This study presents an innovative application of large language models (LLMs) integrated with the LangChain technology framework, aimed at revolutionizing customer service in the telecom sector. The LangChain framework features a Knowledge Organizing Module and a Knowledge Search Module, both designed to refine customer support operations. The research develops an LLM‐based approach to improve the segmentation and organization of knowledge bases, tailored for the telecommunications industry. This approach ensures seamless integration with existing LLMs while preserving distinct knowledge domains, crucial for search accuracy. Additionally, the framework includes an advanced information security protocol with a robust filtering system that effectively excludes sensitive data from the model's outputs, enhancing data security. Empirical findings indicate that the ChatGLM2‐6B+LangChain model outperforms the baseline ChatGLM2, demonstrating heightened effectiveness in telecom‐specific tasks and outstripping even more sophisticated models like GPT‐4. The implementation of this LLM‐based framework within telecom customer service systems has significantly sharpened the precision of knowledge recommendations, as reflected by a dramatic increase in acceptance rates from 15% to 70%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Enriching Language Models with Graph-Based Context Information to Better Understand Textual Data.
- Author
-
Roethel, Albert, Ganzha, Maria, and Wróblewska, Anna
- Subjects
LANGUAGE models ,NATURAL language processing ,GRAPH neural networks - Abstract
A considerable number of texts encountered daily are somehow connected. For example, Wikipedia articles refer to other articles via hyperlinks, or scientific papers relate to others via citations or (co)authors; tweets relate via users that follow each other or reshare content. Hence, a graph-like structure can represent existing connections and be seen as capturing the "context" of the texts. The question thus arises of whether extracting and integrating such context information into a language model might help facilitate a better-automated understanding of the text. In this study, we experimentally demonstrate that incorporating graph-based contextualization into the BERT model enhances its performance on an example of a classification task. Specifically, in the Pubmed dataset, we observed a reduction in balanced mean error from 8.51% to 7.96%, while increasing the number of parameters just by 1.6%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
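The paper fuses graph-based context into BERT; the exact architecture is not given in the abstract, but the underlying idea can be sketched in its simplest form: enrich each document's text embedding with an aggregate of its graph neighbors (cited papers, hyperlinked articles, connected users) before classification. The neighbor-mean-plus-concatenation below is a naive stand-in, not the paper's model.

```python
import numpy as np

def graph_contextualize(node_emb, adjacency, text_emb):
    """Concatenate each document's text embedding with the mean
    embedding of its graph neighbors; isolated nodes get a zero
    context vector."""
    n = adjacency.shape[0]
    context = np.zeros_like(node_emb)
    for i in range(n):
        nbrs = np.nonzero(adjacency[i])[0]
        if len(nbrs):
            context[i] = node_emb[nbrs].mean(axis=0)
    return np.concatenate([text_emb, context], axis=1)
```

A graph neural network, as used in the paper's line of work, generalizes this by learning the aggregation instead of fixing it to a mean.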
39. Korean named entity recognition based on language-specific features.
- Author
-
Chen, Yige, Lim, KyungTae, and Park, Jungyeul
- Subjects
KOREAN language ,AMBIGUITY ,MORPHEMICS - Abstract
In this paper, we propose a novel way of improving named entity recognition (NER) in the Korean language using its language-specific features. While the field of NER has been studied extensively in recent years, the mechanism of efficiently recognizing named entities (NEs) in Korean has hardly been explored. This is because the Korean language has distinct linguistic properties that present challenges for modeling. Therefore, an annotation scheme for Korean corpora by adopting the CoNLL-U format, which decomposes Korean words into morphemes and reduces the ambiguity of NEs in the original segmentation that may contain functional morphemes such as postpositions and particles, is proposed herein. We investigate how the NE tags are best represented in this morpheme-based scheme and implement an algorithm to convert word-based and syllable-based Korean corpora with NEs into the proposed morpheme-based format. Analyses of the results of traditional and neural models reveal that the proposed morpheme-based format is feasible, and the varied performances of the models under the influence of various additional language-specific features are demonstrated. Extrinsic conditions were also considered to observe the variance of the performances of the proposed models, given different types of data, including the original segmentation and different types of tagging formats. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Fine-tuning pretrained transformer encoders for sequence-to-sequence learning.
- Author
-
Bao, Hangbo, Dong, Li, Wang, Wenhui, Yang, Nan, Piao, Songhao, and Wei, Furu
- Abstract
In this paper, we introduce s2s-ft, a method for adapting pretrained bidirectional Transformer encoders, such as BERT and RoBERTa, to sequence-to-sequence tasks like abstractive summarization and question generation. By employing a unified modeling approach and well-designed self-attention masks, s2s-ft leverages the generative capabilities of pretrained Transformer encoders without the need for an additional decoder. We conduct extensive experiments comparing three fine-tuning algorithms (causal fine-tuning, masked fine-tuning, and pseudo-masked fine-tuning) and various pretrained models for initialization. Results demonstrate that s2s-ft achieves strong performance across different tasks and languages. Additionally, the method is successfully extended to multilingual pretrained models, such as XLM-RoBERTa, and evaluated on multilingual generation tasks. Our work highlights the importance of reducing the discrepancy between masked language model pretraining and sequence-to-sequence fine-tuning and showcases the effectiveness and expansibility of the s2s-ft method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification.
- Author
-
Razuvayevskaya, Olesya, Wu, Ben, Leite, João A., Heppell, Freddy, Srba, Ivan, Scarton, Carolina, Bontcheva, Kalina, and Song, Xingyi
- Subjects
LANGUAGE models ,CLASSIFICATION ,ENGLISH language ,DESIGN techniques - Abstract
Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results demonstrated that these methods can even improve performance on some classification tasks. This paper complements existing research by investigating how these techniques influence classification performance and computation costs compared to full fine-tuning. We focus specifically on multilingual text classification tasks (genre, framing, and persuasion techniques detection; with different input lengths, number of predicted classes and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of their efficacy across different training scenarios (training on the original multilingual data; on the translations into English; and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of parameter-efficient fine-tuning techniques, particularly for multilabel classification and non-parallel multilingual tasks which are aimed at analysing input texts of varying length. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
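The LoRA technique this study benchmarks keeps the pretrained weight matrix frozen and learns only a low-rank additive update. A minimal numpy sketch of the forward pass, under assumed shapes (real implementations such as Hugging Face PEFT apply the update inside the attention projections rather than to a standalone matrix):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """LoRA forward pass: y = x (W + (alpha/r) B A)^T.

    W (d_out, d_in) is frozen; only A (r, d_in) and B (d_out, r) are
    trained, cutting trainable parameters from d_out*d_in to
    r*(d_out + d_in). B is initialized to zero so training starts
    from the unmodified pretrained model."""
    return x @ (W + (alpha / r) * (B @ A)).T
```

The parameter saving is what makes the paper's comparison interesting: with small `r`, LoRA trains a fraction of a percent of the model yet can match full fine-tuning on some classification tasks.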
42. A Survey of LLM Datasets: From Autoregressive Model to AI Chatbot.
- Author
-
Du, Fei, Ma, Xin-Jian, Yang, Jing-Ru, Liu, Yi, Luo, Chao-Ran, Wang, Xue-Bin, Jiang, Hai-Ou, and Jing, Xiang
- Subjects
LANGUAGE models ,CHATBOTS ,AUTOREGRESSIVE models ,ARTIFICIAL intelligence ,NATURAL language processing ,CHATGPT - Abstract
Since OpenAI opened access to ChatGPT, large language models (LLMs) have become an increasingly popular topic attracting researchers' attention from abundant domains. However, public researchers encounter problems when developing LLMs given that most of the LLMs are produced by industries and the training details are typically unrevealed. Since datasets are an important setup of LLMs, this paper presents a holistic survey on the training datasets used in both the pre-train and fine-tune processes. The paper first summarizes 16 pre-train datasets and 16 fine-tune datasets used in the state-of-the-art LLMs. Secondly, based on the properties of the pre-train and fine-tune processes, it comments on pre-train datasets in terms of quality, quantity, and their relation with models, and on fine-tune datasets in terms of quality, quantity, and concerns. This study then critically identifies the problems and research trends that exist in current LLM datasets. The study helps public researchers train and investigate LLMs through visual cases and provides useful comments to the research community regarding data development. To the best of our knowledge, this paper is the first to summarize and discuss datasets used in both autoregressive and chat LLMs. The survey offers insights and suggestions to researchers and LLM developers as they build their models, and contributes to the LLM study by pointing out the existing problems of LLM studies from the perspective of data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. GPT-4 passes the bar exam.
- Author
-
Katz, Daniel Martin, Bommarito, Michael James, Gao, Shang, and Arredondo, Pablo
- Subjects
GENERATIVE pre-trained transformers ,LANGUAGE models ,BAR examinations ,CHATGPT ,COMPLEXITY (Philosophy) - Abstract
In this paper, we experimentally evaluate the zero-shot performance of GPT-4 against prior generations of GPT on the entire uniform bar examination (UBE), including not only the multiple-choice multistate bar examination (MBE), but also the open-ended multistate essay exam (MEE) and multistate performance test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average of 4.2/6.0 when compared with much lower scores for ChatGPT. Graded across the UBE components, in the manner in which a human test-taker would be, GPT-4 scores approximately 297 points, significantly in excess of the passing threshold for all UBE jurisdictions. These findings document not just the rapid and remarkable advance of large language model performance generally, but also the potential for such models to support the delivery of legal services in society. This article is part of the theme issue 'A complexity science approach to law and governance'. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Identification of Perceived Challenges in the Green Energy Transition by Turkish Society through Sentiment Analysis.
- Author
-
Bilgin, Ugur and Soner Kara, Selin
- Abstract
Green energy refers to energy derived from renewable sources such as solar, wind, hydro, and biomass, which are environmentally sustainable. It aims to reduce reliance on fossil fuels and mitigate environmental impacts. In the Turkish context, alongside positive sentiments regarding the establishment of energy plants, there are also prevalent negative perspectives. Societal responses to the transition towards green energy can be effectively gauged through the analysis of individual comments. However, manually examining thousands of comments is both time-consuming and impractical. To address this challenge, this study proposes the integration of the Transformer method, a Natural Language Processing (NLP) technique. This study presents a defined NLP procedure that utilizes a multi-labeled NLP model, with a particular emphasis on the analysis of comments on social media classified as "dirty text". The primary objective of this investigation is to ascertain the evolving perception of Turkish society regarding the transition to green energy over time and to conduct a comprehensive analysis utilizing NLP. The study utilizes a dataset that is multi-labeled, wherein emotions are not equally represented and each dataset may contain multiple emotions. Consequently, the measured accuracy rates for the risk, environment, and cost labels are, respectively, 0.950, 0.924, and 0.913, whereas the ROC AUC scores are 0.896, 0.902, and 0.923. The obtained results indicate that the developed model yielded successful outcomes. This study aims to develop a forecasting model tailored to green energy to analyze the current situation and monitor societal behavior dynamically. The central focus is on determining the reactions of Turkish society during the transition to green energy. The insights derived from the study aim to guide decision-makers in formulating policies for the transition. 
The research concludes with policy recommendations based on the model outputs, providing valuable insights for decision-makers in the context of the green energy transition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Benchmarking Large Language Model (LLM) Performance for Game Playing via Tic-Tac-Toe.
- Author
-
Topsakal, Oguzhan and Harper, Jackson B.
- Subjects
LANGUAGE models ,GENERATIVE pre-trained transformers ,SIMULATION games ,MOBILE apps - Abstract
This study investigates the strategic decision-making abilities of large language models (LLMs) via the game of Tic-Tac-Toe, renowned for its straightforward rules and definitive outcomes. We developed a mobile application coupled with web services, facilitating gameplay among leading LLMs, including Jurassic-2 Ultra by AI21, Claude 2.1 by Anthropic, Gemini-Pro by Google, GPT-3.5-Turbo and GPT-4 by OpenAI, Llama2-70B by Meta, and Mistral Large by Mistral, to assess their rule comprehension and strategic thinking. Using a consistent prompt structure in 10 sessions for each LLM pair, we systematically collected data on wins, draws, and invalid moves across 980 games, employing two distinct prompt types to vary the presentation of the game's status. Our findings reveal significant performance variations among the LLMs. Notably, GPT-4, GPT-3.5-Turbo, and Llama2 secured the most wins with the list prompt, while GPT-4, Gemini-Pro, and Mistral Large excelled using the illustration prompt. GPT-4 emerged as the top performer, achieving victory with the minimum number of moves and the fewest errors for both prompt types. This research introduces a novel methodology for assessing LLM capabilities using a game that can illuminate their strategic thinking abilities. Beyond enhancing our comprehension of LLM performance, this study lays the groundwork for future exploration into their utility in complex decision-making scenarios, offering directions for further inquiry and the exploration of LLM limits within game-based frameworks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
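The study counts "invalid moves" among wins and draws, which implies a rule-checking layer between the LLM's reply and the game state. A sketch of that check, assuming a 9-character board encoding ('X', 'O', '.' per cell); the encoding is an illustration, not the paper's actual app:

```python
def valid_moves(board):
    """Given a 9-char board string ('X', 'O', or '.' per cell),
    return the indices of empty cells."""
    return [i for i, c in enumerate(board) if c == '.']

def is_invalid(board, move):
    """An LLM's reply counts as an invalid move if it names an
    occupied or out-of-range cell."""
    return move not in valid_moves(board)
```

Logging the output of a check like this per turn is what allows invalid-move counts to be compared across models and across the two prompt types (list vs. illustration).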
46. Adapting transformer-based language models for heart disease detection and risk factors extraction.
- Author
-
Houssein, Essam H., Mohamed, Rehab E., Hu, Gang, and Ali, Abdelmgeid A.
- Subjects
NATURAL language processing ,LANGUAGE models ,DISEASE risk factors ,TRANSFORMER models ,HEART diseases - Abstract
Efficiently treating cardiac patients before the onset of a heart attack relies on the precise prediction of heart disease. Identifying and detecting the risk factors for heart disease such as diabetes mellitus, Coronary Artery Disease (CAD), hyperlipidemia, hypertension, smoking, familial CAD history, obesity, and medications is critical for developing effective preventative and management measures. Although Electronic Health Records (EHRs) have emerged as valuable resources for identifying these risk factors, their unstructured format poses challenges for cardiologists in retrieving relevant information. This research proposed employing transfer learning techniques to automatically extract heart disease risk factors from EHRs. Leveraging transfer learning, a deep learning technique has demonstrated a significant performance in various clinical natural language processing (NLP) applications, particularly in heart disease risk prediction. This study explored the application of transformer-based language models, specifically utilizing pre-trained architectures like BERT (Bidirectional Encoder Representations from Transformers), RoBERTa, BioClinicalBERT, XLNet, and BioBERT for heart disease detection and extraction of related risk factors from clinical notes, using the i2b2 dataset. These transformer models are pre-trained on an extensive corpus of medical literature and clinical records to gain a deep understanding of contextualized language representations. Adapted models are then fine-tuned using annotated datasets specific to heart disease, such as the i2b2 dataset, enabling them to learn patterns and relationships within the domain. These models have demonstrated superior performance in extracting semantic information from EHRs, automating high-performance heart disease risk factor identification, and performing downstream NLP tasks within the clinical domain. 
This study fine-tuned five widely used transformer-based models, namely BERT, RoBERTa, BioClinicalBERT, XLNet, and BioBERT, using the 2014 i2b2 clinical NLP challenge dataset. The fine-tuned models surpass conventional approaches in predicting the presence of heart disease risk factors with impressive accuracy. The RoBERTa model has achieved the highest performance, with micro F1-scores of 94.27%, while the BERT, BioClinicalBERT, XLNet, and BioBERT models have provided competitive performances with micro F1-scores of 93.73%, 94.03%, 93.97%, and 93.99%, respectively. Finally, a simple ensemble of the five transformer-based models has been proposed, which outperformed most existing methods in heart disease risk factor identification, achieving a micro F1-Score of 94.26%. This study demonstrated the efficacy of transfer learning using transformer-based models in enhancing risk prediction and facilitating early intervention for heart disease prevention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
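The abstract calls the final system a "simple ensemble" of the five fine-tuned models without specifying the combination rule; majority voting, shown below, is one common instantiation of such an ensemble and is offered here only as an illustrative stand-in.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label predictions for one sample by simple
    majority vote; on a tie, the label reaching the top count first
    (in listed model order) wins."""
    return Counter(predictions).most_common(1)[0][0]
```

For multi-label risk-factor extraction, a vote like this would be applied independently per risk factor (diabetes, hypertension, smoking, and so on).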
47. Sarcasm detection in online comments using machine learning.
- Author
-
Šandor, Daniel and Bagić Babac, Marina
- Published
- 2024
- Full Text
- View/download PDF
48. "Transforming" Personality Scale Development: Illustrating the Potential of State-of-the-Art Natural Language Processing.
- Author
-
Fyffe, Shea, Lee, Philseok, and Kaplan, Seth
- Subjects
NATURAL language processing ,TRANSFORMER models ,INDUSTRIAL psychology ,CONTENT analysis - Abstract
Natural language processing (NLP) techniques are becoming increasingly popular in industrial and organizational psychology. One promising area for NLP-based applications is scale development; yet, while many possibilities exist, so far these applications have been restricted—mainly focusing on automated item generation. The current research expands this potential by illustrating an NLP-based approach to content analysis, which manually categorizes scale items by their measured constructs. In NLP, content analysis is performed as a text classification task whereby a model is trained to automatically assign scale items to the construct that they measure. Here, we present an approach to text classification—using state-of-the-art transformer models—that builds upon past approaches. We begin by introducing transformer models and their advantages over alternative methods. Next, we illustrate how to train a transformer to content analyze Big Five personality items. Then, we compare the models trained to human raters, finding that transformer models outperform human raters and several alternative models. Finally, we present practical considerations, limitations, and future research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Natural Language Processing in Knowledge-Based Support for Operator Assistance.
- Author
-
Besharati Moghaddam, Fatemeh, Lopez, Angel J., De Vuyst, Stijn, and Gautama, Sidharta
- Subjects
NATURAL language processing ,LINGUISTICS ,MANUFACTURING industries - Abstract
The manufacturing industry faces increasing complexity in the performance of assembly tasks due to escalating demand for complex products with a greater number of variations. Operators require robust assistance systems to enhance productivity, efficiency, and safety. However, existing support services often fall short when operators encounter unstructured open questions and incomplete sentences due to primarily relying on procedural digital work instructions. This draws attention to the need for practical application of natural language processing (NLP) techniques. This study addresses these challenges by introducing a domain-specific dataset tailored to assembly tasks, capturing unique language patterns and linguistic characteristics. We explore strategies to process declarative and imperative sentences, including incomplete ones, effectively. A thorough evaluation of three pre-trained NLP libraries (NLTK, SPACY, and Stanford) is performed to assess their effectiveness in handling assembly-related concepts and their ability to address the domain's distinctive challenges. Our findings demonstrate the efficient performance of these open-source NLP libraries in accurately handling assembly-related concepts. By providing valuable insights, our research contributes to developing intelligent operator assistance systems, bridging the gap between NLP techniques and the assembly domain within the manufacturing industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. AMPLIFY: attention-based mixup for performance improvement and label smoothing in transformer.
- Author
-
Yang, Leixin and Xiang, Yu
- Subjects
TRANSFORMER models ,DATA augmentation ,PROBLEM solving - Abstract
Mixup is an effective data augmentation method that generates new augmented samples by aggregating linear combinations of different original samples. However, if there are noises or aberrant features in the original samples, mixup may propagate them to the augmented samples, leading to over-sensitivity of the model to these outliers. To solve this problem, this paper proposes a new mixup method called AMPLIFY. This method uses the attention mechanism of Transformer itself to reduce the influence of noises and aberrant values in the original samples on the prediction results, without increasing additional trainable parameters, and the computational cost is very low, thereby avoiding the problem of high resource consumption in common mixup methods such as Sentence Mixup. The experimental results show that, under a smaller computational resource cost, AMPLIFY outperforms other mixup methods in text classification tasks on seven benchmark datasets, providing new ideas and new ways to further improve the performance of pre-trained models based on the attention mechanism, such as BERT, ALBERT, RoBERTa, and GPT. Our code can be obtained at https://github.com/kiwi-lilo/AMPLIFY. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
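For context on what AMPLIFY improves, the vanilla mixup baseline it modifies is worth seeing concretely: two samples and their labels are linearly interpolated, which is exactly how noise in either original sample propagates into the augmented one. AMPLIFY's attention-based weighting is not reproduced here; this sketch shows only the baseline behavior the abstract criticizes.

```python
import numpy as np

def mixup(x1, y1, x2, y2, lam=0.7):
    """Vanilla mixup: interpolate two inputs and their (one-hot)
    labels with coefficient lam. Any aberrant feature in x1 or x2
    is carried into the augmented sample at weight lam or 1-lam,
    which is the sensitivity AMPLIFY is designed to damp."""
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```

In practice lam is drawn per-pair from a Beta distribution; AMPLIFY's variant instead leans on the Transformer's own attention scores to down-weight noisy positions, at no extra trainable-parameter cost.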