Descriptor: "T-SNE" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"T-SNE"' showing total 811 results

1. Characterization of Water Consumers in Urban Areas Based on Data Visualization Techniques

Author: Rubiños, Manuel, Arcano-Bea, Paula, Díaz-Longueira, Antonio, Timiraos, Míriam, Michelena, Álvaro, Zayas-Gato, Francisco, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Quintián, Héctor, editor, Corchado, Emilio, editor, Troncoso Lora, Alicia, editor, Pérez García, Hilde, editor, Jove Pérez, Esteban, editor, Calvo Rolle, José Luis, editor, Martínez de Pisón, Francisco Javier, editor, García Bringas, Pablo, editor, Martínez Álvarez, Francisco, editor, Herrero, Álvaro, editor, and Fosci, Paolo, editor
Published: 2025
Full Text: View/download PDF

2. Analyzing public sentiment toward economic stimulus using natural language processing

Author: Chowdhury, Mohammad Ashraful Ferdous, Abdullah, Mohammad, and Albashrawi, Mousa
Published: 2024
Full Text: View/download PDF

3. Gradient-based explanation for non-linear non-parametric dimensionality reduction.

Author: Corbugy, Sacha, Marion, Rebecca, and Frénay, Benoît
Subjects: NEIGHBORHOODS, EXPLANATION, ALGORITHMS
Abstract: Dimensionality reduction (DR) is a popular technique that shows great results to analyze high-dimensional data. Generally, DR is used to produce visualizations in 2 or 3 dimensions. While it can help understanding correlations between data, embeddings generated by DR are hard to grasp. The position of instances in low-dimension may be difficult to interpret, especially for non-linear, non-parametric DR techniques. Because most of the techniques are said to be neighborhood preserving (which means that explaining long distances is not relevant), some approaches try explaining them locally. These methods use simpler interpretable models to approximate the decision frontier locally. This can lead to misleading explanations. In this paper a novel approach to locally explain non-linear, non-parametric DR embeddings like t-SNE is introduced. It is the first gradient-based method for explaining these DR algorithms. The technique presented in this paper is applied on t-SNE, but is theoretically suitable for any DR method that is a minimization or maximization problem. The approach uses the analytical derivative of a t-SNE embedding to explain the position of an instance in the visualization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. Unveiling Geographical Variation and Chemotypes of Cupressus torulosa Needle Essential Oil: A Novel Approach Using t‐SNE and HCA.

Author: Bhalla, Piyush, Chauhan, Kiran, and Varshney, V. K.
Subjects: *HIERARCHICAL clustering (Cluster analysis), *ESSENTIAL oils, *GAS chromatography, *CYPRESS, *COMMERCIALIZATION
Abstract: This study aimed to assess the geographical variation in the content and chemical composition of Cupressus torulosa needles essential oil across different locations in the Himalayan region of India. The methodology involved the collection of needles from 14 distinct locations, followed by hydro‐distillation using a Clevenger‐type apparatus. Qualitative analysis was conducted using gas chromatography–mass spectrometry (GC–MS), while gas chromatography with flame ionisation detector (GC‐FID) was employed for quantitative analysis. The GC–MS analysis identified a total of 57 compounds, with oxygenated monoterpenes and monoterpene hydrocarbons being the dominant chemical constituents, ranging from 22.5% to 63.01% and from 10.39% to 63.95%, respectively. Terpinen‐4‐ol emerged as the major compound, with concentrations ranging from 101.2 ± 45.7 μg/mg to 393.8 ± 12.5 μg/mg across different locations, with the highest concentration observed in the Dehradun location. The application of t‐distributed stochastic neighbour embedding (t‐SNE) analysis and hierarchical cluster analysis (HCA) revealed the presence of five distinct chemotypes within the essential oil, characterised by different combinations of chemical constituents. These chemotypes were identified as terpinen‐4‐ol/limonene, terpinen‐4‐ol/sabinene, terpinen‐4‐ol, terpinen‐4‐ol/umbellulone, and terpinen‐4‐ol/totarol chemotypes. This research serves as a foundational framework for future investigations aimed at harnessing the unique properties of different chemotypes for specific purposes, potentially facilitating the successful commercialization and utilisation of C. torulosa needles essential oil. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Lamb mode identification based on lightweight CNN.

Author: Li, Juanjuan and Wang, Anhong
Subjects: *CONVOLUTIONAL neural networks, *LAMB waves, *DATABASES, *LAMBS, *TITANIUM
Abstract: In this study, a lightweight convolutional neural network (CNN) is employed to identify Lamb modes. The proposed approach consists of five convolutional and pooling layers, then a fully-connected layer and a sigmoid layer, in which the first convolutional layer uses a wide-scale kernel. Lamb wave responses based on forward modelling are obtained for different plate materials (aluminium, steel and titanium), different excitation frequencies (250 kHz, 500 kHz), and different excitation cycles (4-cycle, 5-cycle). 16800 Lamb wave samples labelled by ‘A0 mode’ and ‘S0 mode’ are beforehand and hosted in a database, then trained via the lightweight CNN. In the validation process, the lightweight CNN reaches 100% accuracy. The performance of the lightweight CNN is also compared with some popular networks. Now, the well-trained network can be used to identify Lamb mode. Some responses are stimulated by ABAQUS under different excitation signal, different propagating distance, different plate material, and the predicted results via the lightweight CNN are all right. In addition, the extensibility of the network is validated by identifying new-converted Lamb mode correctly. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. How to visualize high‐dimensional data.

Author: Mrowka, Ralf and Schmauder, Ralf
Subjects: *SCIENTIFIC literature, *PRINCIPAL components analysis, *REGIONAL development, *DATA structures, *BLOOD pressure measurement
Abstract: This article discusses the visualization of high-dimensional data in the field of physiology. The authors emphasize the importance of clarifying the axes and variables represented in diagrams to ensure accurate interpretation. They explain that traditional methods like principal component analysis (PCA) may not be sufficient for high-dimensional data and introduce nonlinear techniques like t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). These methods allow for the visualization of complex data and have been widely used in various fields, including neurophysiology, immunology, cancer research, and infectious diseases. The authors caution that the interpretation of these plots requires careful consideration due to the nonlinear transformations involved. They also mention ongoing efforts to improve these methods. Overall, the article highlights the need for clear explanations of high-dimensional plots in presentations and acknowledges the interdisciplinary nature of physiology and the rapid development of methods in the field. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

7. Analysis and Application Research of E-Commerce Financial Management Based on T-DPC Optimization Algorithm.

Author: Yilan Wang and Yao Shan
Subjects: DATA analysis, ELECTRONIC commerce, FINANCIAL management, PERFORMANCE evaluation, CORPORATE finance
Abstract: Given the intricate, multifaceted nature of financial data in e-commerce enterprises, this article presents a T-DPC algorithm for analyzing financial management in these businesses. The algorithm utilizes the t-SNE method to reduce the dimensionality of financial data, whilst also implementing an enhanced DPC algorithm based on the K-nearest neighbor concept to analyze financial data clusters. The results show that the F-measure metrics of the DPC algorithm optimized by t-SNE improve 16.7% and 3.07% over the DPC algorithm after testing on the PID and Wine datasets, and its running time is faster than the DPC algorithm on the Aggregation, D31, and R15 datasets by 16.2. Therefore, the algorithm has reference significance for the financial analysis of e-commerce enterprises. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

8. Using Neural Networks to Develop a Database of Failures and Emergencies at Hydroelectric Power Stations.

Author: Shipilov, A. V., Tikhonova, T. S., and Pechantikova, O. A.
Abstract: The article provides an example of employing a neural network and a natural language model to develop the database of failures and emergencies at hydroelectric power stations around the world that is available at JSC Vedeneev VNIIG. Using particular examples in conjunction with the t-SNE machine learning algorithm for visualization and the DBSCAN data clustering algorithm, the study shows an approach for enhancing the database. This technique enables a remarkable improvement in the selection of analog objects when justifying accident scenarios. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. Compressed representation of brain genetic transcription.

Author: Ruffle, James K., Watkins, Henry, Gray, Robert J., Hyare, Harpreet, Thiebaut de Schotten, Michel, and Nachev, Parashkev
Subjects: *GENETIC transcription, *PRINCIPAL components analysis, *GENE expression, *DEEP learning, *BRAIN imaging
Abstract: The architecture of the brain is too complex to be intuitively surveyable without the use of compressed representations that project its variation into a compact, navigable space. The task is especially challenging with high‐dimensional data, such as gene expression, where the joint complexity of anatomical and transcriptional patterns demands maximum compression. The established practice is to use standard principal component analysis (PCA), whose computational felicity is offset by limited expressivity, especially at great compression ratios. Employing whole‐brain, voxel‐wise Allen Brain Atlas transcription data, here we systematically compare compressed representations based on the most widely supported linear and non‐linear methods—PCA, kernel PCA, non‐negative matrix factorisation (NMF), t‐stochastic neighbour embedding (t‐SNE), uniform manifold approximation and projection (UMAP), and deep auto‐encoding—quantifying reconstruction fidelity, anatomical coherence, and predictive utility across signalling, microstructural, and metabolic targets, drawn from large‐scale open‐source MRI and PET data. We show that deep auto‐encoders yield superior representations across all metrics of performance and target domains, supporting their use as the reference standard for representing transcription patterns in the human brain. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. Statistical plots in oncologic imaging, a primer for neuroradiologists.

Author: Bagheri, Sina, Taghvaei, Mohammad, Familiar, Ariana, Haldar, Debanjan, Zandifar, Alireza, Khalili, Nastaran, Vossough, Arastoo, and Nabavizadeh, Ali
Abstract: The simplest approach to convey the results of scientific analysis, which can include complex comparisons, is typically through the use of visual items, including figures and plots. These statistical plots play a critical role in scientific studies, making data more accessible, engaging, and informative. A growing number of visual representations have been utilized recently to graphically display the results of oncologic imaging, including radiomic and radiogenomic studies. Here, we review the applications, distinct properties, benefits, and drawbacks of various statistical plots. Furthermore, we provide neuroradiologists with a comprehensive understanding of how to use these plots to effectively communicate analytical results based on imaging data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. Manifold information through neighbor embedding projection for image retrieval.

Author: Leticio, Gustavo Rosseto, Kawai, Vinicius Sato, Valem, Lucas Pascotti, Pedronette, Daniel Carlos Guimarães, and da S. Torres, Ricardo
Subjects: *IMAGE retrieval, *CONVOLUTIONAL neural networks, *TRANSFORMER models, *DATA visualization, *DIMENSION reduction (Statistics)
Abstract: Although studied for decades, constructing effective image retrieval remains an open problem in a wide range of relevant applications. Impressive advances have been made to represent image content, mainly supported by the development of Convolution Neural Networks (CNNs) and Transformer-based models. On the other hand, effectively computing the similarity between such representations is still challenging, especially in collections in which images are structured in manifolds. This paper introduces a novel solution to this problem based on dimensionality reduction techniques, often used for data visualization. The key idea consists in exploiting the spatial relationships defined by neighbor embedding data visualization methods, such as t-SNE and UMAP, to compute a more effective distance/similarity measure between images. Experiments were conducted on several widely-used datasets. Obtained results indicate that the proposed approach leads to significant gains in comparison to the original feature representations. Experiments also indicate competitive results in comparison with state-of-the-art image retrieval approaches. • Manifold information encoded by the Neighbor Embedding framework for image retrieval. • Use of 2D spatial relationships given by Neighbor Embedding for similarity definition. • A simple, yet effective and efficient image retrieval scheme is proposed. • A late fusion method is used to combine distance given by t-SNE and UMAP projections. • Significant gains obtained on diverse datasets and features based on CNNs and Transformers. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. Shape Pattern Recognition of Building Footprints Using t-SNE Dimensionality Reduction Visualization.

Author: Li, Jingzhong and Mao, Kainan
Subjects: *RECOGNITION (Psychology), *PATTERN perception, *GAUSSIAN mixture models, *VISUAL perception, *CARTESIAN coordinates, *GEOGRAPHIC information systems
Abstract: The shape pattern recognition of building footprints stands as a pivotal concern within GIS spatial cognition. In this study, we introduce a novel approach for the shape recognition of building footprints, leveraging t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction visualization. First, the Canonical Time Warping (CTW) algorithm is employed to gauge the shape similarity distance of building footprints. Subsequently, the t-SNE model is utilized to map the building footprints, featuring varying numbers of coordinate vertices, onto points within the Cartesian coordinate system. The shape similarity distance serves as the input to the t-SNE model for parameter optimization. Lastly, building footprint shapes are identified through the inherent clustering patterns of points using a Gaussian Mixture Model (GMM). Experimental results demonstrate the method's robustness to the translation, rotation, scaling, and mirroring of geometric objects, while effectively measuring shape similarity between building footprints. Furthermore, diverse types of building footprints are discernible through natural clustering in low-dimensional spaces, aligning closely with human visual perception. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

13. Identification of Dendrobium Using Laser-Induced Breakdown Spectroscopy in Combination with a Multivariate Algorithm Model.

Author: Zhang, Tingsong, Liu, Ziyuan, Ma, Qing, Hu, Dong, Dai, Yujia, Zhang, Xinfeng, and Zhou, Zhu
Subjects: LASER-induced breakdown spectroscopy, DENDROBIUM, IDENTIFICATION, K-nearest neighbor classification, SUPPORT vector machines, FEATURE selection
Abstract: Dendrobium, a highly effective traditional Chinese medicinal herb, exhibits significant variations in efficacy and price among different varieties. Therefore, achieving an efficient classification of Dendrobium is crucial. However, most of the existing identification methods for Dendrobium make it difficult to simultaneously achieve both non-destructiveness and high efficiency, making it challenging to truly meet the needs of industrial production. In this study, we combined Laser-Induced Breakdown Spectroscopy (LIBS) with multivariate models to classify 10 varieties of Dendrobium. LIBS spectral data for each Dendrobium variety were collected from three circular medicinal blocks. During the data analysis phase, multivariate models to classify different Dendrobium varieties first preprocess the LIBS spectral data using Gaussian filtering and stacked correlation coefficient feature selection. Subsequently, the constructed fusion model is utilized for classification. The results demonstrate that the classification accuracy of 10 Dendrobium varieties reached 100%. Compared to Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN), our method improved classification accuracy by 14%, 20%, and 20%, respectively. Additionally, it outperforms three models (SVM, RF, and KNN) with added Principal Component Analysis (PCA) by 10%, 10%, and 17%. This fully validates the excellent performance of our classification method. Finally, visualization analysis of the entire research process based on t-distributed Stochastic Neighbor Embedding (t-SNE) technology further enhances the interpretability of the model. This study, by combining LIBS and machine learning technologies, achieves efficient classification of Dendrobium, providing a feasible solution for the identification of Dendrobium and even traditional Chinese medicinal herbs. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. Malaria cell identification using improved machine learning and modified deep learning architecture.

Author: S., Shashikiran and D., Sunitha H.
Subjects: DEEP learning, MACHINE learning, CONVOLUTIONAL neural networks, MALARIA, DATA augmentation, ANTIGEN analysis
Abstract: Malaria continues to be a serious problem for public health because of its occurrence in tropical and subtropical areas with inadequate healthcare systems and few resources. For prompt intervention and treatment of malaria, effective and precise diagnosis is essential. Professional pathologists examine blood smear films by hand to get a microscopic diagnosis and another way they will do a rapid antigen malaria test which produces the result of 50% accuracy. Convolutional neural network (CNN) is a type of deep learning (DL) model that has been effectively used for a variety of image recognition applications. Our suggested approach uses improved machine learning (IML) methods like support vector machine (SVM)+principal component analysis (PCA) fit, SVM+t-distributed stochastic neighbor embedding (t-SNE) fit, and CNN architecture, with accuracies of 86.23%, 88.27%, and 97.16%, respectively, to combine feature extraction, data augmentation, and modify the layers by including the SVM algorithm in the final layer of the CNN architecture. The proposed method will significantly reduce pathologists' burden by automating the identification of malaria and improving diagnosis accuracy in resource-constrained contexts. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Biomedical Text Data Visualization

Author: Verma, Shikha, Gupta, Yogesh, Chakrabarti, Amlan, Series Editor, Becker, Jürgen, Editorial Board Member, Hu, Yu-Chen, Editorial Board Member, Chattopadhyay, Anupam, Editorial Board Member, Tribedi, Gaurav, Editorial Board Member, Saha, Sriparna, Editorial Board Member, Goswami, Saptarsi, Editorial Board Member, Sharan, Aditi, editor, Malik, Nidhi, editor, Imran, Hazra, editor, and Ghosh, Indira, editor
Published: 2024
Full Text: View/download PDF

16. Exploration and Analysis of Seizure Spikes Through Spectral Domain Transformation

Author: Najmusseher, Nizar Banu, P. K., Azar, Ahmad Taher, Kamal, Nashwa Ahmad, Alzahrani, Abdulkareem, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Hassanien, Aboul Ella, editor, Zheng, Dequan, editor, Zhao, Zhijie, editor, and Fan, Zhipeng, editor
Published: 2024
Full Text: View/download PDF

17. Quantitative Stock Market Modeling Using Multivariate Geometric Random Walk

Author: Pokojovy, Michael, Anum, Andrews T., Amo, Obed, Mariani, Maria C., Orosz, Michael C., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, and Han, Henry, editor
Published: 2024
Full Text: View/download PDF

18. NRASV: Noise Robust ASV System for Audio Replay Attack Detection

Author: Chakravarty, Nidhi, Dua, Mohit, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Pastor-Escuredo, David, editor, Brigui, Imene, editor, Kesswani, Nishtha, editor, Bordoloi, Sushanta, editor, and Ray, Ashok Kumar, editor
Published: 2024
Full Text: View/download PDF

19. Classification of Cancer Types Based on RNA HI-SEQ Data Using Dimensionality Reduction

Author: Tunny, Zannatul Ferdous, Munna, MD Abir Hasan, Hossain, MD. Shahadat, Raisa, Roksana Akter, Rahman, Muhammad Arifur, Brown, David J., Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Mahmud, Mufti, editor, Ben-Abdallah, Hanene, editor, Kaiser, M. Shamim, editor, Ahmed, Muhammad Raisuddin, editor, and Zhong, Ning, editor
Published: 2024
Full Text: View/download PDF

20. Beyond Accuracy: Measuring Representation Capacity of Embeddings to Preserve Structural and Contextual Information

Author: Ali, Sarwan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Lossio-Ventura, Juan Antonio, editor, Ceh-Varela, Eduardo, editor, Vargas-Solar, Genoveva, editor, Marcacini, Ricardo, editor, Tadonki, Claude, editor, Calvo, Hiram, editor, and Alatrista-Salas, Hugo, editor
Published: 2024
Full Text: View/download PDF

21. Visualization of large-scale user-related feature data based on nonlinear dimensionality reduction method

Author: Wei, Xiuzhuo, Wang, Chunjie, Tang, Bo, Zhao, Huinan, Li, Kan, Editor-in-Chief, Li, Qingyong, Associate Editor, Fournier-Viger, Philippe, Series Editor, Hong, Wei-Chiang, Series Editor, Liang, Xun, Series Editor, Wang, Long, Series Editor, Xu, Xuesong, Series Editor, Subramaniyam, Kannimuthu, editor, Leng, Lu, editor, Li, Jing, editor, and Wheeb, Ali Hussein, editor
Published: 2024
Full Text: View/download PDF

22. Critical Analysis of 5G Networks’ Traffic Intrusion Using PCA, t-SNE, and UMAP Visualization and Classifying Attacks

Author: Ghani, Humera, Salekzamankhani, Shahram, Virdee, Bal, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Polkowski, Zdzislaw, editor, Correia, Sérgio Duarte, editor, and Virdee, Bal, editor
Published: 2024
Full Text: View/download PDF

23. Weighted t-Distributed Stochastic Neighbor Embedding for Projection-Based Clustering

Author: Nápoles, Gonzalo, Concepción, Leonardo, Özgöde Yigin, Büşra, Saygili, Görkem, Vanhoof, Koen, Bello, Rafael, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Hernández Heredia, Yanio, editor, Milián Núñez, Vladimir, editor, and Ruiz Shulcloper, José, editor
Published: 2024
Full Text: View/download PDF

24. Quantifying User Experience Through Self-reporting Questionnaires: A Systematic Analysis of the Sentence Similarity Between the Items of the Measurement Approaches

Author: Graser, Stefan, Böhm, Stephan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Stephanidis, Constantine, editor, Antona, Margherita, editor, Ntoa, Stavroula, editor, and Salvendy, Gavriel, editor
Published: 2024
Full Text: View/download PDF

25. A Semi-Supervised Learning Framework for Classifying Colorectal Neoplasia Based on the NICE Classification

Author: Wang, Yu, Ni, Haoxiang, Zhou, Jielu, Liu, Lihe, Lin, Jiaxi, Yin, Minyue, Gao, Jingwen, Zhu, Shiqi, Yin, Qi, Zhu, Jinzhou, and Li, Rui
Published: 2024
Full Text: View/download PDF

26. An autoencoder based unsupervised clustering approach to analyze the effect of E-learning on the mental health of Indian students during the Covid-19 pandemic

Author: Banerjee, Pritha, Jana, Chandan, Saha, Jayita, and Chowdhury, Chandreyee
Published: 2024
Full Text: View/download PDF

27. The synergistic effect of QR decomposition with t-SNE.

Author: Ali, Mohsin and Choudhary, Jitendra
Subjects: MANN Whitney U Test, CONFIDENCE intervals
Abstract: The study utilized non-parametric tests, specifically, the Mann-Whitney U test, to evaluate the performance of a proposed model called QRPCA-t-SNE, along with two other models, MDS and UMAP. The study compared these three models with two datasets on performance metrics such as model accuracy, training accuracy, testing accuracy, mean square error, AUC scores, precision, recall, and F1 scores. Once the model's performance was conducted, the Anderson-Darling test was used to check for data normality before applying the hypothesis for model proof. The analysis revealed that Model 1 (QRPCA-t-SNE) significantly outperformed Model 2 (UMAP) and Model 3 (MDS) in terms of accuracy, with p-values of 0.0027 and 0.0003, respectively. This finding suggests that Model 1 (QRPCA-t-SNE) is suitable for high-accuracy and reliability applications, providing valuable insights into predictive analytics with a 95% confidence interval (confidence level α = 0.05). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Predicting Alloying Element Yield in Converter Steelmaking Using t-SNE-WOA-LSTM.

Author: Liu, Xin, Qu, Xihui, Xie, Xinjun, Li, Sijun, Bao, Yanping, and Zhao, Lihua
Subjects: METAHEURISTIC algorithms, OPTIMIZATION algorithms, STEEL manufacture, FERROSILICON, IRON alloys, ALLOYS
Abstract: The performance and quality of steel products are significantly impacted by the alloying element control. The efficiency of alloy utilization in the steelmaking process was directly related to element yield. This study analyses the factors that influence the yield of elements in the steelmaking process using correlation analysis. A yield prediction model was developed using a t-distributed stochastic neighbor embedding (t-SNE) algorithm, a whale optimization algorithm (WOA), and a long short-term memory (LSTM) neural network. The t-SNE algorithm was used to reduce the dimensionality of the original data, while the WOA optimization algorithm was employed to optimize the hyperparameters of the LSTM neural network. The t-SNE-WOA-LSTM model accurately predicted the yield of Mn and Si elements with hit rates of 71.67%, 96.67%, and 99.17% and 57.50%, 89.17%, and 97.50%, respectively, falling within the error range of ±1%, ±2%, and ±3% for Mn and ±1%, ±3%, and ±5% for Si. The results demonstrate that the t-SNE-WOA-LSTM model outperforms the backpropagation (BP), LSTM, and WOA-LSTM models in terms of prediction accuracy. The model was applied to actual production in a Chinese plant. The actual performance of the industrial application is within a ±3% error range, with an accuracy of 100%. Furthermore, the elemental yield predicted by the model and then added the ferroalloys resulted in a reduction in the elemental content of the product by 0.017%. The model enables accurate prediction of alloying element yields and was effectively applied in industrial production. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. Method for wind power forecasting based on support vector machines optimized and weighted composite gray relational analysis.

Author: You, Miaona, Zhuang, Sumei, and Luo, Ruxue
Subjects: *SUPPORT vector machines, *WIND forecasting, *WIND power, *GREY relational analysis, *NUMERICAL weather forecasting
Abstract: This study proposes a weighted composite approach for grey relational analysis (GRA) that utilizes a numerical weather prediction (NWP) and support vector machine (SVM). The approach is optimized using an improved grey wolf optimization (IGWO) algorithm. Initially, the dimension of NWP data is decreased by t-distributed stochastic neighbor embedding (t-SNE), then the weight of sample coefficients is calculated by entropy-weight method (EWM), and the weighted grey relational of data points is calculated for different weather numerical time series data. At the same time, a new weighted composite grey relational degree is formed by combining the weighted cosine similarity of NWP values of the historical day and to be measured day. The SVM's regression power prediction model is constructed by the time series data. To improve the accuracy of the system's predictions, the grey relational time series data is chosen as the input variable for the SVM, and the influence parameters of the ideal SVM are discovered using the IGWO technique. According to the simulated prediction and analysis based on NWP, it can be observed that the proposed method in this study significantly improves the prediction accuracy of the data. Specifically, evaluation metrics such as root mean squared error (RMSE), regression correlation coefficient (r2), mean absolute error (MAE) and mean absolute percent error (MAPE) all show corresponding enhancements, while the computational burden remains relatively low. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Characterization of CD34 + Cells from Patients with Acute Myeloid Leukemia (AML) and Myelodysplastic Syndromes (MDS) Using a t-Distributed Stochastic Neighbor Embedding (t-SNE) Protocol.

Author: Nollmann, Cathrin, Moskorz, Wiebke, Wimmenauer, Christian, Jäger, Paul S., Cadeddu, Ron P., Timm, Jörg, Heinzel, Thomas, and Haas, Rainer
Subjects: *MYELODYSPLASTIC syndromes, *RESEARCH funding, *HEMATOPOIETIC stem cells
Abstract: Simple Summary: Hematopoietic stem and progenitor cells (HSPCs) play a pivotal role in maintaining the homeostasis of the blood and immune systems. Acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS) represent heterogeneous hematologic malignancies resulting from genetic mutations within cells of the hematopoietic lineage, leading to the expansion of leukemic blasts including leukemic stem cells (LSCs). Using the t-distributed stochastic neighbor embedding (t-SNE) methodology, we examined the immunological phenotype of HSPCs based on the differential expression of CD34, CD38, CD45RA, CD123 and programmed death ligand 1 (PD-L1) antigens, and contrasted it with the immunophenotype of blasts and LSCs in AML and MDS. Using multi-color flow cytometry analysis, we studied the immunophenotypical differences between leukemic cells from patients with AML/MDS and hematopoietic stem and progenitor cells (HSPCs) from patients in complete remission (CR) following their successful treatment. The panel of markers included CD34, CD38, CD45RA, CD123 as representatives for a hierarchical hematopoietic stem and progenitor cell (HSPC) classification as well as programmed death ligand 1 (PD-L1). Rather than restricting the evaluation on a 2- or 3-dimensional analysis, we applied a t-distributed stochastic neighbor embedding (t-SNE) approach to obtain deeper insight and segregation between leukemic cells and normal HSPCs. For that purpose, we created a t-SNE map, which resulted in the visualization of 27 cell clusters based on their similarity concerning the composition and intensity of antigen expression. Two of these clusters were "leukemia-related" containing a great proportion of CD34+/CD38− hematopoietic stem cells (HSCs) or CD34+ cells with a strong co-expression of CD45RA/CD123, respectively. CD34+ cells within the latter cluster were also highly positive for PD-L1 reflecting their immunosuppressive capacity.
Beyond this proof of principle study, the inclusion of additional markers will be helpful to refine the differentiation between normal HSPCs and leukemic cells, particularly in the context of minimal disease detection and antigen-targeted therapeutic interventions. Furthermore, we suggest a protocol for the assignment of new cell ensembles in quantitative terms, via a numerical value, the Pearson coefficient, based on a similarity comparison of the t-SNE pattern with a reference. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. An iterative missing-value handling technique for mitigating performance degradation of machine learning models.

Author: 이종관 and 이민우
Abstract: Machine learning models find extensive application across diverse domains, with their performance heavily reliant on the data quality employed during the learning process. However, real-world datasets include some missing data due to limitations and errors in data collection methods, incomplete or inconsistent data-gathering processes, and human errors during processing. Consequently, effective handling of missing values becomes imperative to ensure optimal model performance. A common way to deal with missing data is to either delete the data containing the missing values or to impute them appropriately. Deletion is straightforward, but at the cost of information loss. Imputation, on the other hand, can result in a loss of variability in the dataset and skewed correlations between variables. The proposed scheme reduces dimensionality by utilizing variables without missing values and employs the outcomes to estimate the missing values. Experimental validations affirm that the proposed scheme mitigates the performance degradation of various machine learning models compared to existing methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. Beta Hebbian Learning for intrusion detection in networks with MQTT Protocols for IoT devices.

Author: Michelena, Álvaro, Ordás, María Teresa García, Aveleira-Mata, José, Blanco, David Yeregui Marcos del, Díaz, Míriam Timiraos, Zayas-Gato, Francisco, Jove, Esteban, Casteleiro-Roca, José-Luis, Quintián, Héctor, Alaiz-Moretón, Héctor, and Calvo-Rolle, José Luis
Abstract: This paper aims to enhance security in IoT device networks through a visual tool that utilizes three projection techniques, including Beta Hebbian Learning (BHL), t-distributed Stochastic Neighbor Embedding (t-SNE) and ISOMAP, in order to facilitate the identification of network attacks by human experts. This research begins with the creation of a testing environment with IoT devices and web clients, simulating attacks over Message Queuing Telemetry Transport (MQTT) for recording all relevant traffic information. The unsupervised algorithms chosen provide a set of projections that enable human experts to visually identify most attacks in real-time, making it a powerful tool that can be implemented in IoT environments easily. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Nonlinear dimensionality reduction with q-Gaussian distribution.

Author: Abe, Motoshi, Nomura, Yuichiro, and Kurita, Takio
Abstract: In recent years, dimensionality reduction has become more important as the number of dimensions of data used in various tasks such as regression and classification has increased. As popular nonlinear dimensionality reduction methods, t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) have been proposed. However, the former outputs only one low-dimensional space determined by the t-distribution, and the latter makes it difficult to control the distribution of distance between each pair of samples in low-dimensional space. To tackle these issues, we propose novel t-SNE and UMAP extended by q-Gaussian distribution, called q-Gaussian-distributed stochastic neighbor embedding (q-SNE) and q-Gaussian-distributed uniform manifold approximation and projection (q-UMAP). The q-Gaussian distribution is a probability distribution derived by maximizing the Tsallis entropy by escort distribution with mean and variance, and a generalized version of Gaussian distribution with a hyperparameter q. Since the shape of the q-Gaussian distribution can be tuned smoothly by the hyperparameter q, q-SNE and q-UMAP can intuitively derive different embedding spaces. To show the quality of the proposed method, we compared the visualization of the low-dimensional embedding space and the classification accuracy by k-NN in the low-dimensional space. Empirical results on MNIST, COIL-20, OlivettiFaces and FashionMNIST demonstrate that the q-SNE and q-UMAP can derive better embedding spaces than t-SNE and UMAP. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
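
Note on record 33: the abstract describes the q-Gaussian as a family that generalizes the Gaussian through a hyperparameter q. The sketch below is not taken from the paper. It evaluates the standard unnormalized q-Gaussian kernel at a squared distance, assuming a hypothetical scale parameter `beta`; the exact parameterization used in q-SNE and q-UMAP may differ. It shows two well-known special cases: q approaching 1 recovers the Gaussian kernel, and q = 2 gives the Cauchy-type kernel 1 / (1 + d^2) that t-SNE uses in the low-dimensional space.

```python
import math


def q_gaussian_kernel(sq_dist, q, beta=1.0):
    """Unnormalized q-Gaussian kernel at a squared distance.

    `beta` is an assumed scale parameter for illustration only;
    the paper's own parameterization is not reproduced here.
    """
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: ordinary Gaussian kernel.
        return math.exp(-beta * sq_dist)
    base = 1.0 + (q - 1.0) * beta * sq_dist
    if base <= 0.0:
        # For q < 1 the kernel has compact support.
        return 0.0
    return base ** (-1.0 / (q - 1.0))


if __name__ == "__main__":
    for q in (0.5, 1.0, 1.5, 2.0):
        print(q, q_gaussian_kernel(1.0, q))
```

Larger values of q give heavier tails, which is one way to read the abstract's statement that the shape can be tuned smoothly by q.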

34. Demystifying dimensionality reduction techniques in the 'omics' era: A practical approach for biological science students.

Author: Garma, Leonardo D. and Osório, Nuno S.
Subjects: LIFE sciences, SCIENCE students, PYTHON programming language, PRINCIPAL components analysis, LAPTOP computers, MOLECULAR biology, GRADUATE students
Abstract: Dimensionality reduction techniques are essential in analyzing large 'omics' datasets in biochemistry and molecular biology. Principal component analysis, t‐distributed stochastic neighbor embedding, and uniform manifold approximation and projection are commonly used for data visualization. However, these methods can be challenging for students without a strong mathematical background. In this study, intuitive examples were created using COVID‐19 data to help students understand the core concepts behind these techniques. In a 4‐h practical session, we used these examples to demonstrate dimensionality reduction techniques to 15 postgraduate students from biomedical backgrounds. Using Python and Jupyter notebooks, our goal was to demystify these methods, typically treated as "black boxes", and empower students to generate and interpret their own results. To assess the impact of our approach, we conducted an anonymous survey. The majority of the students agreed that using computers enriched their learning experience (67%) and that Jupyter notebooks were a valuable part of the class (66%). Additionally, 60% of the students reported increased interest in Python, and 40% gained both interest and a better understanding of dimensionality reduction methods. Despite the short duration of the course, 40% of the students reported acquiring research skills necessary in the field. While further analysis of the learning impacts of this approach is needed, we believe that sharing the examples we generated can provide valuable resources for others to use in interactive teaching environments. These examples highlight advantages and limitations of the major dimensionality reduction methods used in modern bioinformatics analysis in an easy‐to‐understand way. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Data Management and Network Architecture Effect on Performance Variability in Direct Attenuation Correction via Deep Learning for Cardiac SPECT: A Feasibility Study

Author: Torkaman, Mahsa, Yang, Jaewon, Shi, Luyao, Wang, Rui, Miller, Edward J, Sinusas, Albert J, Liu, Chi, Gullberg, Grant T, and Seo, Youngho
Subjects: Biomedical and Clinical Sciences, Clinical Sciences, Heart Disease - Coronary Heart Disease, Cardiovascular, Bioengineering, Heart Disease, Biomedical Imaging, Attenuation correction, deep learning, hierarchical clustering, myocardial perfusion imaging, performance variability, single-photon computed tomography, t-distributed stochastic neighbor embedding, Wasserstein cycle generative adversarial network, Deep learning, Hierarchical clustering, Myocardial perfusion imaging, Performance variability, SPECT, Wasserstein cycle GAN, t-SNE, Clinical sciences, Oncology and carcinogenesis, Biomedical engineering
Abstract: Attenuation correction (AC) is important for accurate interpretation of SPECT myocardial perfusion imaging (MPI). However, it is challenging to perform AC in dedicated cardiac systems not equipped with a transmission imaging capability. Previously, we demonstrated the feasibility of generating attenuation-corrected SPECT images using a deep learning technique (SPECTDL) directly from non-corrected images (SPECTNC). However, we observed performance variability across patients which is an important factor for clinical translation of the technique. In this study, we investigate the feasibility of overcoming the performance variability across patients for the direct AC in SPECT MPI by proposing to develop an advanced network and a data management strategy. To investigate, we compared the accuracy of the SPECTDL for the conventional U-Net and Wasserstein cycle GAN (WCycleGAN) networks. To manage the training data, clustering was applied to a representation of data in the lower-dimensional space, and the training data were chosen based on the similarity of data in this space. Quantitative analysis demonstrated that DL model with an advanced network improves the global performance for the AC task with the limited data. However, the regional results were not improved. The proposed data management strategy demonstrated that the clustered training has potential benefit for effective training.
Published: 2022

36. A Study of the Relationship Between Driving and Health Based on Large-Scale Data Analysis Using PLSA and t-SNE

Author: Mitsugu Mera, Nanae Michida, Masanori Honda, Kazuo Sakamoto, Yoshinori Tamada, Tatsuya Mikami, and Shigeyuki Nakaji
Subjects: Cognitive function, driving, health, machine learning, PLSA, t-SNE, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The purpose of this study is to facilitate knowledge discovery about the relationship between driving and health among the elderly. In the Iwaki Health Promotion Project, an annual project conducted by Hirosaki University, we included a survey on driving for the first time in 2019. After linking the data obtained from the survey with four years of health data for 2016–2019, we utilized PLSA as a machine learning method to cluster those data in an integrated manner. As a result, we found latent classes broadly classified according to whether the health level has been generally high or low. Also, when we focused on a specific health item, for example, cognitive function, we found some people with higher and lower maintenance of cognitive function over four years, even if they belonged to the same latent class. To characterize these differences in detail, we utilized t-SNE as a machine learning method. As a result, we found that “I like driving” as a factor related to the Kansei (sensitivity) may characterize the high maintenance of cognitive function. For those who like driving, it is considered that the high maintenance of cognitive function may occur because they enjoy driving, have a wider range of activities, and increase the possibility of multitasking.
Published: 2024
Full Text: View/download PDF

37. Buy One Get One: The Legal and Socio-Cultural Context of ‘Gifting’ Within the Australian Human Remains Trade

Author: Damien Huffer
Subjects: australia, facebook, human remains trafficking, discourse analysis, t-sne, voyant tools, Archaeology, CC1-960, Electronic computers. Computer science, QA75.5-76.95
Abstract: Today’s global human remains trade – how it operates on and offline, where remains come from, and how algorithmic amplification allows for complex networks to form between buyers, sellers, and middlemen – has seen an increasing amount of research and media attention. Underpinning this increasing interest is the growing realization that poorly regulated trafficking inflicts genuine psychological harm on the living (whether relatives of body donors or descendant communities), as well as accrues losses to the archaeological record or risks the jeopardization of crime scenes. Much of this work, however, has focused on the global north. Within the global south, Australia is recognized as an emerging market country for many categories of cultural heritage trafficking, including human remains. This paper reviews the function and socio-legal context of a specific seller’s tactic so far seen only among Australian human remains collectors, whereby photographs of human remains are offered for sale, with the bones themselves included as a “gift”. From a network analysis of text from a corpus of anonymized posts from Facebook, conducted using t-SNE and Voyant Tools, 11 key discourse themes are identified that point to how and why this sales tactic is used. Better understanding its function is a necessary first step to closing this loophole within Australian law, but also to identifying similar tricks at work within collector networks elsewhere.
Published: 2024
Full Text: View/download PDF

38. Exploring lifestyle patterns from GPS trajectory data: embedding spatio-temporal context information via geohash and POI

Author: Lee, Huiju, Kang, Youngok, Noh, Seungmin, Kim, Jiyeon, and Lee, Jiyoon
Published: 2024
Full Text: View/download PDF

39. Anomaly detection in Structural Health Monitoring using spectral distance and t-SNE–GMM framework under ambient excitation

Author: Laha, S. K., Swarnakar, B., and Kansabanik, S.
Published: 2024
Full Text: View/download PDF

40. Fault Feature Extraction of Parallel-Axis Gearbox Based on IDBO-VMD and t-SNE.

Author: Wang, Zhen, Wang, Shuaiyu, and Cheng, Yiyang
Subjects: GEARBOXES, DIMENSIONAL reduction algorithms, FEATURE extraction, OPTIMIZATION algorithms, FAULT diagnosis, PARALLEL algorithms, DUNG beetles, SUPPORT vector machines
Abstract: For the problem that the fault states of parallel shaft gearboxes are difficult to identify, a diagnostic method is proposed to optimize variational modal decomposition (VMD) and t-distributed stochastic neighbor embedding (t-SNE) using an improved dung beetle optimization algorithm (IDBO). IDBO is obtained by amplifying dung beetle optimization (DBO) using strategies such as chaos mapping, Levy flight policy, and dynamic adaptive weighting. IDBO is employed to optimize VMD, extracting decomposed eigenvalues restructured into high-dimensional feature vectors. Subsequently, we employ the t-SNE algorithm for dimensionality reduction to eliminate redundancy, obtaining two-dimensional vectors. Finally, these vectors are input into a support vector machine (SVM) for fault diagnosis. We apply IDBO, grey wolf optimization (GWO), DBO, and the sparrow search algorithm (SSA) to both benchmark functions and VMD, conducting a performance comparison. The results demonstrate that IDBO exhibits superior convergence speed and global search capability, effectively suppressing modal aliasing issues in VMD, thereby enhancing the algorithm's robustness. Through experimental fault diagnosis on a gear transmission system, we compare our proposed method with EMD + t-SNE and traditional VMD + t-SNE feature extraction approaches. The experimental results indicate that the fault diagnosis accuracy reaches 100% after processing the fault signals with IDBO-VMD + t-SNE. This method proves to be an effective fault diagnosis approach specifically tailored for parallel-axis gearboxes, providing a reliable means to enhance diagnostic accuracy. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. A hybrid model for data visualization using linear algebra methods and machine learning algorithm.

Author: Ali, Mohsin, Choudhary, Jitendra, and Kasbe, Tanmay
Subjects: LINEAR algebra, EIGENVECTORS, DATA visualization, RECEIVER operating characteristic curves, PRINCIPAL components analysis, MACHINE learning, DATA modeling, DIMENSION reduction (Statistics)
Abstract: The t-distributed stochastic neighbor embedding (t-SNE) is a powerful technique for visualizing high-dimensional datasets. By reducing the dimensionality of the data, t-SNE transforms it into a format that can be more easily understood and analyzed. Existing approaches visualize high-dimensional data but do not do so in depth. This paper proposes a model that enhances visualization and improves the accuracy. The proposed model combines the non-linear embedding technique t-SNE, the linear dimensionality reduction method principal component analysis (PCA), and the QR decomposition algorithm for discovering eigenvalues and eigenvectors. In addition, we quantitatively compare the proposed model QRPCA-t-SNE with PCA-t-SNE using the following criteria: data visualization with different perplexity and different principal components, confusion matrix, model score, mean square error (MSE), training, testing accuracy, receiver operating characteristic curve (ROC) score, and AUC score. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. New methods of structural break detection and an ensemble approach to analyse exchange rate volatility of Indian rupee during coronavirus pandemic.

Author: Mareeswaran, M, Sen, Shubhajit, and Deb, Soudeep
Subjects: INDIAN rupee, COVID-19 pandemic, FOREIGN exchange rates, SPECTRAL energy distribution, TIME series analysis, MARKETING forecasting
Abstract: In this work, we develop a methodology to detect structural breaks in multivariate time series data using the t-distributed stochastic neighbour embedding (t-SNE) technique and non-parametric spectral density estimates. By applying the proposed algorithm to the exchange rates of Indian rupee against four primary currencies, we establish that the coronavirus pandemic (COVID-19) has indeed caused a structural break in the volatility dynamics. Next, to study the effect of the pandemic on the Indian currency market, we provide a compact and efficient way of combining three models, each with a specific objective, to explain and forecast the exchange rate volatility. We find that a forward-looking regime change makes a drop in persistence, while an exogenous shock like COVID-19 makes the market highly persistent. Our analysis shows that although all exchange rates are found to be exposed to common structural breaks, the degrees of impact vary across the four series. Finally, we develop an ensemble approach to combine predictions from multiple models in the context of volatility forecasting. Using model confidence set procedure, we show that the proposed approach improves the accuracy from benchmark models. Relevant economic explanations to our findings are provided as well. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Laplacian-based Cluster-Contractive t-SNE for High-Dimensional Data Visualization.

Author: YAN SUN, YI HAN, and JICONG FAN
Subjects: DATA visualization, HIGH-dimensional model representation, DIMENSION reduction (Statistics), EIGENVALUES, SCATTER diagrams, PROBLEM solving
Abstract: Dimensionality reduction techniques aim at representing high-dimensional data in low-dimensional spaces to extract hidden and useful information or facilitate visual understanding and interpretation of the data. However, few of them take into consideration the potential cluster information contained implicitly in the high-dimensional data. In this article, we propose LaptSNE, a new graph-layout nonlinear dimensionality reduction method based on t-SNE, one of the best techniques for visualizing high-dimensional data as 2D scatter plots. Specifically, LaptSNE leverages the eigenvalue information of the graph Laplacian to shrink the potential clusters in the low-dimensional embedding when learning to preserve the local and global structure from high-dimensional space to low-dimensional space. It is nontrivial to solve the proposed model because the eigenvalues of normalized symmetric Laplacian are functions of the decision variable. We provide a majorization-minimization algorithm with convergence guarantee to solve the optimization problem of LaptSNE and show how to calculate the gradient analytically, which may be of broad interest when considering optimization with Laplacian-composited objective. We evaluate our method by a formal comparison with state-of-the-art methods on seven benchmark datasets, both visually and via established quantitative measurements. The results demonstrate the superiority of our method over baselines such as t-SNE and UMAP. We also provide out-of-sample extension, large-scale extension, and mini-batch extension for our LaptSNE to facilitate dimensionality reduction in various scenarios. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. Translating single-neuron axonal reconstructions into meso-scale connectivity statistics in the mouse somatosensory thalamus.

Author: Timonidis, Nestor, Bakker, Rembrandt, Rubio-Teves, Mario, Alonso-Martínez, Carmen, Garcia-Amado, Maria, Clascá, Francisco, and Tiesinga, Paul H. E.
Subjects: THALAMUS, MICE, THALAMIC nuclei, NEURONS, SOMATOSENSORY cortex
Abstract: Characterizing the connectomic and morphological diversity of thalamic neurons is key for better understanding how the thalamus relays sensory inputs to the cortex. The recent public release of complete single-neuron morphological reconstructions enables the analysis of previously inaccessible connectivity patterns from individual neurons. Here we focus on the Ventral Posteromedial (VPM) nucleus and characterize the full diversity of 257 VPM neurons, obtained by combining data from the MouseLight and Braintell projects. Neurons were clustered according to their most dominantly targeted cortical area and further subdivided by their jointly targeted areas. We obtained a 2D embedding of morphological diversity using the dissimilarity between all pairs of axonal trees. The curved shape of the embedding allowed us to characterize neurons by a 1-dimensional coordinate. The coordinate values were aligned both with the progression of soma position along the dorsal-ventral and lateral-medial axes and with that of axonal terminals along the posterior-anterior and medial-lateral axes, as well as with an increase in the number of branching points, distance from soma and branching width. Taken together, we have developed a novel workflow for linking three challenging aspects of connectomics, namely the topography, higher order connectivity patterns and morphological diversity, with VPM as a test-case. The workflow is linked to a unified access portal that contains the morphologies and integrated with 2D cortical flatmap and subcortical visualization tools. The workflow and resulting processed data have been made available in Python, and can thus be used for modeling and experimentally validating new hypotheses on thalamocortical connectivity. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

45. Unravelling the gait and balance: A novel approach for detecting depression in young healthy individuals.

Author: Maguluri, Lakshmana Phaneendra, Vinya, Viyyapu Lokeshwari, Goutham, V., Uma Maheswari, B., Kumar, Boddepalli Kiran, Musthafa, Syed, Manikandan, S., Srivastava, Suraj, and Munjal, Neha
Subjects: *GAIT in humans, *ATMOSPHERIC pressure, *PATTERN recognition systems, *MEDICAL personnel, *MENTAL illness, *PRESSURE sensors
Abstract: Depression is a prevalent mental health disorder that affects people of all ages and origins; therefore, early detection is essential for timely intervention and support. This investigation proposes a novel method for detecting melancholy in young, healthy individuals by analysing their gait and balance patterns. In order to accomplish this, a comprehensive system is designed that incorporates cutting-edge technologies such as a Barometric Pressure Sensor, Beck Depression Inventory (BDI), and t-Distributed Stochastic Neighbour Embedding (t-SNE) algorithm. The system intends to capitalize on the subtle motor and physiological changes associated with melancholy, which may manifest in a person's gait and balance. The Barometric Pressure Sensor is used to estimate variations in altitude and vertical velocity, thereby adding context to the evaluation. The mood states of participants are evaluated using the BDI, a well-established psychological assessment instrument that provides insight into their emotional health. Data from the Barometric Pressure Sensor, BDI responses, and gait and balance measurements are integrated and pre-processed. The t-SNE algorithm is then used to map the high-dimensional data into a lower-dimensional space while maintaining the local structure and identifying underlying patterns within the dataset. The t-SNE algorithm improves visualization and pattern recognition by reducing the dimensionality of the data, allowing for a more nuanced analysis of depression-related markers. As the proposed system combines objective physiological measurements with subjective psychological assessments, it has the potential to advance the early detection and prediction of depression in young, healthy individuals. The results of this exploratory study have implications for the development of non-intrusive and easily accessible instruments that can assist healthcare professionals in identifying individuals at risk and implementing targeted interventions. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

46. Collaboration System for Multidisciplinary Research with Essential Data Analysis Toolkit Built-In.

Author: Garay-Jiménez, Laura I., Romero-Lujambio, Jose Fausto, Santiago-Horta, Amaury, Tovar-Corona, Blanca, Gómez-Miranda, Pilar, and Mata-Rivera, Miguel Félix
Subjects: *INTERDISCIPLINARY research, *ENVIRONMENTAL research, *ACCESS control, *RESEARCH personnel, *RESEARCH teams
Abstract: Environmental research calls for a multidisciplinary approach, where highly specialized research teams collaborate in data analysis. Nevertheless, managing the data lifecycle and research artifacts becomes challenging because the project teams require techniques and tools tailored to their study fields. Another pain point is the unavailability of essential analysis and data representation formats for querying and interpreting the shared results. In addition, managing progress reports across the teams is demanding because they manage different platforms and systems. These concerns discourage the knowledge-sharing process and lead to researchers' low adherence to the system. A hybrid methodology based on Design Thinking and an Agile approach enables us to understand and attend to the research process needs. As a result, a microservices-based architecture of the system, which can be deployed in cloud, hybrid, or standalone environments and adapt the computing resources according to the actual requirements with an access control system based on users and roles, enables the security and confidentiality, allowing the team's lead to share or revoke access. Additionally, intelligent assistance is available for document searches and dataset analyses. A multidisciplinary researchers' team that uses this system as a knowledge-sharing workspace reported an 83% acceptance. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

47. Improving ISOMAP Efficiency with RKS: A Comparative Study with t-Distributed Stochastic Neighbor Embedding on Protein Sequences.

Author: Ali, Sarwan and Patterson, Murray
Subjects: AMINO acid sequence, COMPUTATIONAL complexity, COMPARATIVE studies, DATA visualization, COMPUTATIONAL neuroscience
Abstract: Data visualization plays a crucial role in gaining insights from high-dimensional datasets. ISOMAP is a popular algorithm that maps high-dimensional data into a lower-dimensional space while preserving the underlying geometric structure. However, ISOMAP can be computationally expensive, especially for large datasets, due to the computation of the pairwise distances between data points. The motivation behind this study is to improve efficiency by leveraging an approximate method, which is based on random kitchen sinks (RKS). This approach provides a faster way to compute the kernel matrix. Using RKS significantly reduces the computational complexity of ISOMAP while still obtaining a meaningful low-dimensional representation of the data. We compare the performance of the approximate ISOMAP approach using RKS with the traditional t-SNE algorithm. The comparison involves computing the distance matrix using the original high-dimensional data and the low-dimensional data computed from both t-SNE and ISOMAP. The quality of the low-dimensional embeddings is measured using several metrics, including mean squared error (MSE), mean absolute error (MAE), and explained variance score (EVS). Additionally, the runtime of each algorithm is recorded to assess its computational efficiency. The comparison is conducted on a set of protein sequences, used in many bioinformatics tasks. We use three different embedding methods based on k-mers, minimizers, and position weight matrix (PWM) to capture various aspects of the underlying structure and the relationships between the protein sequences. By comparing different embeddings and by evaluating the effectiveness of the approximate ISOMAP approach using RKS and comparing it against t-SNE, we provide insights on the efficacy of our proposed approach. Our goal is to retain the quality of the low-dimensional embeddings while improving the computational performance. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
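
Note on record 47: the abstract says embedding quality is judged by comparing the pairwise distance matrix of the original high-dimensional data with that of the low-dimensional embedding, using MSE, MAE, and explained variance score (EVS). The following is a minimal sketch of that kind of comparison, not the authors' code. The function names are hypothetical, Euclidean distance is an assumption, and EVS is computed with the usual definition 1 - Var(error) / Var(reference).

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances for all unordered pairs, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def distance_preservation(high, low):
    """Return (MSE, MAE, EVS) between high- and low-dimensional distances.

    `high` and `low` are sequences of points in matching order.
    EVS is undefined when all high-dimensional distances are equal.
    """
    d_high = pairwise_distances(high)
    d_low = pairwise_distances(low)
    errors = [h - l for h, l in zip(d_high, d_low)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_err = sum(errors) / n
    mean_high = sum(d_high) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    var_high = sum((h - mean_high) ** 2 for h in d_high) / n
    evs = 1.0 - var_err / var_high
    return mse, mae, evs


if __name__ == "__main__":
    high = [(0, 0, 0), (3, 4, 0), (6, 8, 0)]
    low = [(0,), (4,), (8,)]  # a made-up 1D embedding for illustration
    print(distance_preservation(high, low))
```

Lower MSE and MAE and an EVS closer to 1 indicate that the embedding preserves the original pairwise distances more faithfully.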

48. Nitrogen Metabolism in Pseudomonas putida: Functional Analysis Using Random Barcode Transposon Sequencing

Author: Schmidt, Matthias, Pearson, Allison N, Incha, Matthew R, Thompson, Mitchell G, Baidoo, Edward EK, Kakumanu, Ramu, Mukhopadhyay, Aindrila, Shih, Patrick M, Deutschbauer, Adam M, Blank, Lars M, and Keasling, Jay D
Subjects: Biochemistry and Cell Biology, Biological Sciences, Industrial Biotechnology, Genetics, Biotechnology, Amino Acids, Nitrogen, Phenotype, Pseudomonas putida, Transaminases, nitrogen, RB-TnSeq, transposon, metabolism, BarSeq, t-SNE, aminotransferase, lactam, biosensor, amino acid, nucleotide, nitrate, nitrite, polyamine, aminotransferases, biosensors, nitrogen metabolism, Microbiology, Medical microbiology
Abstract: Pseudomonas putida KT2440 has long been studied for its diverse and robust metabolisms, yet many genes and proteins imparting these growth capacities remain uncharacterized. Using pooled mutant fitness assays, we identified genes and proteins involved in the assimilation of 52 different nitrogen containing compounds. To assay amino acid biosynthesis, 19 amino acid drop-out conditions were also tested. From these 71 conditions, significant fitness phenotypes were elicited in 672 different genes including 100 transcriptional regulators and 112 transport-related proteins. We divide these conditions into 6 classes, and propose assimilatory pathways for the compounds based on this wealth of genetic data. To complement these data, we characterize the substrate range of three promiscuous aminotransferases relevant to metabolic engineering efforts in vitro. Furthermore, we examine the specificity of five transcriptional regulators, explaining some fitness data results and exploring their potential to be developed into useful synthetic biology tools. In addition, we use manifold learning to create an interactive visualization tool for interpreting our BarSeq data, which will improve the accessibility and utility of this work to other researchers. IMPORTANCE Understanding the genetic basis of P. putida's diverse metabolism is imperative for us to reach its full potential as a host for metabolic engineering. Many target molecules of the bioeconomy and their precursors contain nitrogen. This study provides functional evidence linking hundreds of genes to their roles in the metabolism of nitrogenous compounds, and provides an interactive tool for visualizing these data. We further characterize several aminotransferases, lactamases, and regulators, which are of particular interest for metabolic engineering.
Published: 2022

49. Improving ISOMAP Efficiency with RKS: A Comparative Study with t-Distributed Stochastic Neighbor Embedding on Protein Sequences

Author: Sarwan Ali and Murray Patterson
Subjects: t-SNE, ISOMAP, data visualization, COVID-19, Science
Abstract: Data visualization plays a crucial role in gaining insights from high-dimensional datasets. ISOMAP is a popular algorithm that maps high-dimensional data into a lower-dimensional space while preserving the underlying geometric structure. However, ISOMAP can be computationally expensive, especially for large datasets, due to the computation of the pairwise distances between data points. The motivation behind this study is to improve efficiency by leveraging an approximate method, which is based on random kitchen sinks (RKS). This approach provides a faster way to compute the kernel matrix. Using RKS significantly reduces the computational complexity of ISOMAP while still obtaining a meaningful low-dimensional representation of the data. We compare the performance of the approximate ISOMAP approach using RKS with the traditional t-SNE algorithm. The comparison involves computing the distance matrix using the original high-dimensional data and the low-dimensional data computed from both t-SNE and ISOMAP. The quality of the low-dimensional embeddings is measured using several metrics, including mean squared error (MSE), mean absolute error (MAE), and explained variance score (EVS). Additionally, the runtime of each algorithm is recorded to assess its computational efficiency. The comparison is conducted on a set of protein sequences, used in many bioinformatics tasks. We use three different embedding methods based on k-mers, minimizers, and position weight matrix (PWM) to capture various aspects of the underlying structure and the relationships between the protein sequences. By comparing different embeddings and by evaluating the effectiveness of the approximate ISOMAP approach using RKS and comparing it against t-SNE, we provide insights on the efficacy of our proposed approach. Our goal is to retain the quality of the low-dimensional embeddings while improving the computational performance.
Published: 2023
Full Text: View/download PDF
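
The evaluation described in this abstract (comparing the pairwise distance matrix of the original data with that of the low-dimensional embedding, scored by MSE, MAE, and EVS) could be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions: the function names, the use of Euclidean distance, and the toy points are hypothetical and are not taken from the authors' code. EVS is computed here with the common definition 1 - Var(y - ŷ) / Var(y).

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points (assumed metric)."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def _variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """Explained variance score: 1 - Var(y_true - y_pred) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - _variance(residuals) / _variance(y_true)


def embedding_quality(high_dim, low_dim):
    """Score how well low_dim preserves the pairwise distances of high_dim."""
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    return {
        "mse": mse(d_high, d_low),
        "mae": mae(d_high, d_low),
        "evs": explained_variance(d_high, d_low),
    }


# Toy data (hypothetical): three 3-D points and a 2-D embedding of them.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
low = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
scores = embedding_quality(high, low)
```

In this sketch, lower MSE and MAE and an EVS closer to 1 indicate that the embedding preserves pairwise distances more faithfully; the same function could be applied to the output of any embedding method.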

50. HDSNE a new unsupervised multiple image database fusion learning algorithm with flexible and crispy production of one database: a proof case study of lung infection diagnose In chest X-ray images

Author: Muhammad Atta Othman Ahmed, Ibrahim A. Abbas, and Yasser AbdelSatar
Subjects: COVID-19, X-ray, Coronavirus, MD5, t-SNE, Data aggregation, Medical technology, R855-855.5
Abstract: Continuous release of image databases with fully or partially identical inner categories dramatically deteriorates the production of autonomous Computer-Aided Diagnostics (CAD) systems for true comprehensive medical diagnostics. The first challenge is the frequent massive bulk release of medical image databases, which often suffer from two common drawbacks: image duplication and corruption. The many subsequent releases of the same data with the same classes or categories come with no clear evidence of success in the concatenation of those identical classes among image databases. This issue stands as a stumbling block in the path of hypothesis-based experiments for the production of a single learning model that can successfully classify all of them correctly. Removing redundant data, enhancing performance, and optimizing energy resources are among the most challenging aspects. In this article, we propose a global data aggregation scale model that incorporates six image databases selected from specific global resources. The proposed valid learner is based on training all the unique patterns within any given data release, thereby creating a unique dataset hypothetically. The Hash MD5 algorithm (MD5) generates a unique hash value for each image, making it suitable for duplication removal. The T-Distributed Stochastic Neighbor Embedding (t-SNE), with a tunable perplexity parameter, can represent data dimensions. Both the Hash MD5 and t-SNE algorithms are applied recursively, producing a balanced and uniform database containing equal samples per category: normal, pneumonia, and Coronavirus Disease of 2019 (COVID-19). We evaluated the performance of all proposed data and the new automated version using the Inception V3 pre-trained model with various evaluation metrics.
The performance outcome of the proposed scale model showed more respectable results than traditional data aggregation, achieving a high accuracy of 98.48%, along with high precision, recall, and F1-score. The results have been proved through a statistical t-test, yielding t-values and p-values. It’s important to emphasize that all t-values are undeniably significant, and the p-values provide irrefutable evidence against the null hypothesis. Furthermore, it’s noteworthy that the Final dataset outperformed all other datasets across all metric values when diagnosing various lung infections with the same factors.
Published: 2023
Full Text: View/download PDF
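
The MD5-based duplicate removal mentioned in this abstract could look roughly like the sketch below. It is a minimal illustration only: the function names and the in-memory `(name, bytes)` input are hypothetical, and the authors' actual pipeline (including its t-SNE step) is not reproduced here. Note that an MD5 digest only matches byte-identical files, so resized or re-encoded copies of an image would not be detected this way.

```python
import hashlib


def md5_digest(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of raw image bytes."""
    return hashlib.md5(data).hexdigest()


def drop_exact_duplicates(images):
    """Keep the first image seen for each distinct MD5 digest.

    `images` is an iterable of (name, raw_bytes) pairs (hypothetical format).
    Returns the names of the images that were kept, in input order.
    """
    seen = set()
    kept = []
    for name, data in images:
        digest = md5_digest(data)
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


# Toy data (hypothetical): b.png is a byte-identical copy of a.png.
sample = [("a.png", b"pixels-1"), ("b.png", b"pixels-1"), ("c.png", b"pixels-2")]
unique_names = drop_exact_duplicates(sample)
```

For files on disk, the same idea applies by reading each file in binary mode and hashing its contents before deciding whether to keep it.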
