Author: "Zyblewski, Paweł" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zyblewski, Paweł"' showing total 32 results

1. Cross-Modality Clustering-based Self-Labeling for Multimodal Data Classification

Author: Zyblewski, Paweł and Minku, Leandro L.
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Technological advances facilitate the ability to acquire multimodal data, posing a challenge for recognition systems while also providing an opportunity to use the heterogeneous nature of the information to increase the generalization capability of models. An often overlooked issue is the cost of the labeling process, which is typically high due to the need for a significant investment in time and money associated with human experts. Existing semi-supervised learning methods often focus on operating in the feature space created by the fusion of available modalities, neglecting the potential for cross-utilizing complementary information available in each modality. To address this problem, we propose Cross-Modality Clustering-based Self-Labeling (CMCSL). Based on a small set of pre-labeled data, CMCSL groups instances belonging to each modality in the deep feature space and then propagates known labels within the resulting clusters. Next, information about the instances' class membership in each modality is exchanged based on the Euclidean distance to ensure more accurate labeling. Experimental evaluation conducted on 20 datasets derived from the MM-IMDb dataset indicates that cross-propagation of labels between modalities -- especially when the number of pre-labeled instances is small -- can allow for more reliable labeling and thus increase the classification performance in each modality.
Comment: 10 pages, 5 figures, 9 tables
Published: 2024
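
The self-labeling steps named in the abstract above (grouping instances per modality, propagating known labels within clusters, then exchanging labels between modalities by Euclidean distance) can be illustrated with a toy sketch. This is not the authors' CMCSL implementation: the function names, the fixed centroids, the majority-vote propagation, and the rule "keep the label from the modality where the instance lies closer to its centroid" are all assumptions made for illustration, and the sketch ignores that distances in different feature spaces are not generally comparable.

```python
import math
from collections import Counter

def assign_clusters(points, centroids):
    """Return (cluster index, Euclidean distance to that centroid) for each point."""
    out = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        k = min(range(len(centroids)), key=dists.__getitem__)
        out.append((k, dists[k]))
    return out

def propagate_labels(assignments, known):
    """Give each point the majority known label of its cluster (None if the cluster has no labeled member)."""
    votes = {}
    for idx, label in known.items():
        votes.setdefault(assignments[idx][0], Counter())[label] += 1
    majority = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    return [majority.get(k) for k, _ in assignments]

def exchange(labels_a, assign_a, labels_b, assign_b):
    """Assumed exchange rule: keep the label from the modality with the smaller centroid distance."""
    merged = []
    for la, (_, da), lb, (_, db) in zip(labels_a, assign_a, labels_b, assign_b):
        if la is None:
            merged.append(lb)
        elif lb is None:
            merged.append(la)
        else:
            merged.append(la if da <= db else lb)
    return merged

# Hypothetical 2-D "deep features" for four instances in two modalities.
text_feats = [(0.0, 0.0), (0.2, 0.1), (0.9, 1.0), (1.0, 1.0)]
image_feats = [(0.0, 0.0), (0.6, 0.6), (1.0, 0.9), (1.0, 1.0)]
text_centroids = [(0.1, 0.05), (0.95, 1.0)]   # assumed to come from a prior clustering step
image_centroids = [(0.0, 0.0), (1.0, 1.0)]
known = {0: "drama", 3: "comedy"}             # small pre-labeled set

assign_text = assign_clusters(text_feats, text_centroids)
assign_image = assign_clusters(image_feats, image_centroids)
labels_text = propagate_labels(assign_text, known)
labels_image = propagate_labels(assign_image, known)
merged = exchange(labels_text, assign_text, labels_image, assign_image)
print(merged)
```

In this toy data, instance 1 sits between the two image clusters but close to a text centroid, so the exchange step keeps the text-derived label for it.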

2. Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain

Author: Zyblewski, Paweł, Klikowski, Jakub, Borek-Marciniec, Weronika, and Ksieniewicz, Paweł
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Tabular data is considered the last unconquered castle of deep learning, yet the task of data stream classification is stated to be an equally important and demanding research area. Due to the temporal constraints, it is assumed that deep learning methods are not the optimal solution for application in this field. However, excluding the entire -- and prevalent -- group of methods seems rather rash given the progress that has been made in recent years in its development. For this reason, the following paper is the first to present an approach to natural language data stream classification using the sentence space method, which allows for encoding text into the form of a discrete digital signal. This allows the use of convolutional deep networks dedicated to image classification to solve the task of recognizing fake news based on text data. Based on the real-life Fakeddit dataset, the proposed approach was compared with state-of-the-art algorithms for data stream classification based on generalization ability and time complexity.
Comment: 8 pages, 8 figures
Published: 2024

3. WarCov -- Large multilabel and multimodal dataset from social platform

Author: Borek-Marciniec, Weronika, Zyblewski, Paweł, Klikowski, Jakub, and Ksieniewicz, Paweł
Subjects: Computer Science - Computation and Language, Computer Science - Social and Information Networks
Abstract: In classification tasks, from raw data acquisition to the curation of a dataset suitable for use in evaluating machine learning models, a series of steps - often associated with high costs - are necessary. In the case of Natural Language Processing, initial cleaning and conversion can be performed automatically, but obtaining labels still requires the rationalized input of human experts. As a result, even though many articles often state that "the world is filled with data", data scientists suffer from its shortage. It is crucial in the case of natural language applications, which are constantly evolving and must adapt to new concepts or events. For example, the topic of the COVID-19 pandemic and the vocabulary related to it would have been mostly unrecognizable before 2019. For this reason, creating new datasets, also in languages other than English, is still essential. This work presents a collection of 3,187,105 posts in Polish about the pandemic and the war in Ukraine published on popular social media platforms in 2022. The collection includes not only preprocessed texts but also images, so it can also be used for multimodal recognition tasks. The labels define posts' topics and were created using hashtags accompanying the posts. The work presents the process of curating a dataset from acquisition to sample pattern recognition experiments.
Comment: 13 pages, 6 figures
Published: 2024

4. Employing Two-Dimensional Word Embedding for Difficult Tabular Data Stream Classification

Author: Zyblewski, Paweł
Subjects: Computer Science - Machine Learning
Abstract: Rapid technological advances are inherently linked to the increased amount of data, a substantial portion of which can be interpreted as data streams, capable of exhibiting the phenomenon of concept drift and having a high imbalance ratio. Consequently, developing new approaches to classifying difficult data streams is a rapidly growing research area. At the same time, the proliferation of deep learning and transfer learning, as well as the success of convolutional neural networks in computer vision tasks, have contributed to the emergence of a new research trend, namely Multi-Dimensional Encoding (MDE), focusing on transforming tabular data into a homogeneous form of a discrete digital signal. This paper proposes Streaming Super Tabular Machine Learning (SSTML), thereby exploring for the first time the potential of MDE in the difficult data stream classification task. SSTML encodes consecutive data chunks into an image representation using the STML algorithm and then performs a single ResNet-18 training epoch. Experiments conducted on synthetic and real data streams have demonstrated the ability of SSTML to achieve classification quality statistically significantly superior to state-of-the-art algorithms while maintaining comparable processing time.
Comment: 16 pages, 8 figures
Published: 2024

5. Machine Learning Model for Predicting Production Process Capability in Packaging Process

Author: Orłowski, Robert, Burduk, Anna, Zyblewski, Paweł, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Tolio, Tullio A. M., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Schmitt, Robert, Editorial Board Member, Xu, Jinyang, Editorial Board Member, Machado, Jose, editor, Soares, Filomena, editor, Yildirim, Sahin, editor, Vojtěšek, Jiří, editor, Rea, Pierluigi, editor, Gramescu, Bogdan, editor, and Hrybiuk, Olena O., editor
Published: 2024
Full Text: View/download PDF

6. Lifelong Learning Natural Language Processing Approach for Multilingual Data Classification

Author: Kozal, Jędrzej, Leś, Michał, Zyblewski, Paweł, Ksieniewicz, Paweł, and Woźniak, Michał
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The abundance of information in digital media, which in today's world is the main source of knowledge about current events for the masses, makes it possible to spread disinformation on a larger scale than ever before. Consequently, there is a need to develop novel fake news detection approaches capable of adapting to changing factual contexts and generalizing previously or concurrently acquired knowledge. To deal with this problem, we propose a lifelong learning-inspired approach, which allows for fake news detection in multiple languages and the mutual transfer of knowledge acquired in each of them. Both classical feature extractors, such as Term frequency-inverse document frequency or Latent Dirichlet Allocation, and integrated deep NLP (Natural Language Processing) BERT (Bidirectional Encoder Representations from Transformers) models paired with MLP (Multilayer Perceptron) classifier, were employed. The results of experiments conducted on two datasets dedicated to the fake news classification task (in English and Spanish, respectively), supported by statistical analysis, confirmed that utilization of additional languages could improve performance for traditional methods. Also, in some cases supplementing the deep learning method with classical ones can positively impact obtained results. The ability of models to generalize the knowledge acquired between the analyzed languages was also observed.
Published: 2022

7. Active Weighted Aging Ensemble for Drifted Data Stream Classification

Author: Woźniak, Michał, Zyblewski, Paweł, and Ksieniewicz, Paweł
Subjects: Computer Science - Machine Learning
Abstract: One of the significant problems of streaming data classification is the occurrence of concept drift, consisting of the change of probabilistic characteristics of the classification task. This phenomenon destabilizes the performance of the classification model and seriously degrades its quality. An appropriate strategy counteracting this phenomenon is required to adapt the classifier to the changing probabilistic characteristics. One of the significant problems in implementing such a solution is the access to data labels. It is usually costly, so to minimize the expenses related to this process, learning strategies based on semi-supervised learning are proposed, e.g., employing active learning methods indicating which of the incoming objects are valuable to be labeled for improving the classifier's performance. This paper proposes a novel chunk-based method for non-stationary data streams based on classifier ensemble learning and an active learning strategy considering a limited budget that can be successfully applied to any data stream classification algorithm. The proposed method has been evaluated through computer experiments using both real and generated data streams. The results confirm the high quality of the proposed algorithm over state-of-the-art methods.
Comment: 29 pages, 3 figures
Published: 2021
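
The idea of an active learning strategy under a limited labeling budget, as mentioned in the abstract above, can be sketched roughly as follows. This is a generic uncertainty-sampling illustration rather than the method from the paper: the function name `select_for_labeling`, the binary-support input, the `threshold` parameter, and the per-chunk budget interpretation are all assumptions.

```python
def select_for_labeling(supports, budget, threshold=0.2):
    """Pick the instances of one chunk whose labels are worth requesting.

    supports  -- assumed positive-class support (0..1) predicted for each instance
    budget    -- fraction of the chunk that may be labeled (e.g. 0.5 = at most half)
    threshold -- instances with |support - 0.5| below this count as uncertain
    """
    max_queries = int(budget * len(supports))
    # Distance from the decision boundary: small margin = uncertain prediction.
    margins = [(abs(s - 0.5), i) for i, s in enumerate(supports)]
    uncertain = sorted(m for m in margins if m[0] < threshold)
    # Query the most uncertain instances first, never exceeding the budget.
    return [i for _, i in uncertain[:max_queries]]

chunk_supports = [0.95, 0.55, 0.48, 0.10, 0.62, 0.51]
queried = select_for_labeling(chunk_supports, budget=0.5)
print(queried)
```

Four instances in this chunk fall inside the uncertainty band, but the budget of 0.5 limits the request to the three closest to the decision boundary.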

8. A Non-deep Approach to Classifying Movie Genres Based on Multimodal Data

Author: Niedziółka, Paweł, Zyblewski, Paweł, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Burduk, Robert, editor, Choraś, Michał, editor, Kozik, Rafał, editor, Ksieniewicz, Paweł, editor, Marciniak, Tomasz, editor, and Trajdos, Paweł, editor
Published: 2023
Full Text: View/download PDF

9. stream-learn -- open-source Python library for difficult data stream batch analysis

Author: Ksieniewicz, Paweł and Zyblewski, Paweł
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: stream-learn is a Python package compatible with scikit-learn and developed for drifting and imbalanced data stream analysis. Its main component is a stream generator, which makes it possible to produce a synthetic data stream that may incorporate each of the three main concept drift types (i.e. sudden, gradual and incremental drift) in their recurring or non-recurring versions. The package allows conducting experiments following established evaluation methodologies (i.e. Test-Then-Train and Prequential). In addition, estimators adapted for data stream classification have been implemented, including both simple classifiers and state-of-the-art chunk-based and online classifier ensembles. To improve computational efficiency, the package utilises its own implementations of prediction metrics for imbalanced binary classification tasks.
Published: 2020

10. Active Weighted Aging Ensemble for drifted data stream classification

Author: Woźniak, Michał, Zyblewski, Paweł, and Ksieniewicz, Paweł
Published: 2023
Full Text: View/download PDF

11. Cyber-Attack Detection from IoT Benchmark Considered as Data Streams

Author: Zyblewski, Paweł, Pawlicki, Marek, Kozik, Rafał, Choraś, Michał, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Choraś, Michal, editor, Choraś, Ryszard S., editor, Kurzyński, Marek, editor, Trajdos, Paweł, editor, Pejaś, Jerzy, editor, and Hyla, Tomasz, editor
Published: 2022
Full Text: View/download PDF

12. Statistical Drift Detection Ensemble for batch processing of data streams

Author: Komorniczak, Joanna, Zyblewski, Paweł, and Ksieniewicz, Paweł
Published: 2022
Full Text: View/download PDF

13. Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access

Author: Zyblewski, Paweł, Woźniak, Michał, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Rutkowski, Leszek, editor, Scherer, Rafał, editor, Korytkowski, Marcin, editor, Pedrycz, Witold, editor, Tadeusiewicz, Ryszard, editor, and Zurada, Jacek M., editor
Published: 2021
Full Text: View/download PDF

14. Clustering-Based Ensemble Pruning in the Imbalanced Data Classification

Author: Zyblewski, Paweł, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Paszynski, Maciej, editor, Kranzlmüller, Dieter, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M.A., editor
Published: 2021
Full Text: View/download PDF

15. Analysis of Variance Application in the Construction of Classifier Ensemble Based on Optimal Feature Subset for the Task of Supporting Glaucoma Diagnosis

Author: Sułot, Dominika, Zyblewski, Paweł, Ksieniewicz, Paweł, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Paszynski, Maciej, editor, Kranzlmüller, Dieter, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M.A., editor
Published: 2021
Full Text: View/download PDF

16. Data Preprocessing and Dynamic Ensemble Selection for Imbalanced Data Stream Classification

Author: Zyblewski, Paweł, Sabourin, Robert, Woźniak, Michał, Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Cellier, Peggy, editor, and Driessens, Kurt, editor
Published: 2020
Full Text: View/download PDF

17. Combination of Active and Random Labeling Strategy in the Non-stationary Data Stream Classification

Author: Zyblewski, Paweł, Ksieniewicz, Paweł, Woźniak, Michał, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Rutkowski, Leszek, editor, Scherer, Rafał, editor, Korytkowski, Marcin, editor, Pedrycz, Witold, editor, Tadeusiewicz, Ryszard, editor, and Zurada, Jacek M., editor
Published: 2020
Full Text: View/download PDF

18. Dynamic Classifier Selection for Data with Skewed Class Distribution Using Imbalance Ratio and Euclidean Distance

Author: Zyblewski, Paweł, Woźniak, Michał, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Krzhizhanovskaya, Valeria V., editor, Závodszky, Gábor, editor, Lees, Michael H., editor, Dongarra, Jack J., editor, Sloot, Peter M. A., editor, Brissos, Sérgio, editor, and Teixeira, João, editor
Published: 2020
Full Text: View/download PDF

19. Fusion of linear base classifiers in geometric space

Author: Ksieniewicz, Paweł, Zyblewski, Paweł, and Burduk, Robert
Published: 2021
Full Text: View/download PDF

20. Preprocessed dynamic classifier ensemble selection for highly imbalanced drifted data streams

Author: Zyblewski, Paweł, Sabourin, Robert, and Woźniak, Michał
Published: 2021
Full Text: View/download PDF

21. Clustering-Based Ensemble Pruning and Multistage Organization Using Diversity

Author: Zyblewski, Paweł, Woźniak, Michał, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pérez García, Hilde, editor, Sánchez González, Lidia, editor, Castejón Limas, Manuel, editor, Quintián Pardo, Héctor, editor, and Corchado Rodríguez, Emilio, editor
Published: 2019
Full Text: View/download PDF

22. Classifier Selection for Highly Imbalanced Data Streams with Minority Driven Ensemble

Author: Zyblewski, Paweł, Ksieniewicz, Paweł, Woźniak, Michał, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Rutkowski, Leszek, editor, Scherer, Rafał, editor, Korytkowski, Marcin, editor, Pedrycz, Witold, editor, Tadeusiewicz, Ryszard, editor, and Zurada, Jacek M., editor
Published: 2019
Full Text: View/download PDF

23. Cyber-Attack Detection from IoT Benchmark Considered as Data Streams

Author: Zyblewski, Paweł, primary, Pawlicki, Marek, additional, Kozik, Rafał, additional, and Choraś, Michał, additional
Published: 2021
Full Text: View/download PDF

24. Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access

Author: Zyblewski, Paweł, primary and Woźniak, Michał, additional
Published: 2021
Full Text: View/download PDF

25. Clustering-Based Ensemble Pruning in the Imbalanced Data Classification

Author: Zyblewski, Paweł, primary
Published: 2021
Full Text: View/download PDF

26. Analysis of Variance Application in the Construction of Classifier Ensemble Based on Optimal Feature Subset for the Task of Supporting Glaucoma Diagnosis

Author: Sułot, Dominika, primary, Zyblewski, Paweł, additional, and Ksieniewicz, Paweł, additional
Published: 2021
Full Text: View/download PDF

27. Combination of Active and Random Labeling Strategy in the Non-stationary Data Stream Classification

Author: Zyblewski, Paweł, primary, Ksieniewicz, Paweł, additional, and Woźniak, Michał, additional
Published: 2020
Full Text: View/download PDF

28. Data Preprocessing and Dynamic Ensemble Selection for Imbalanced Data Stream Classification

Author: Zyblewski, Paweł, primary, Sabourin, Robert, additional, and Woźniak, Michał, additional
Published: 2020
Full Text: View/download PDF

29. Classifier Selection for Highly Imbalanced Data Streams with Minority Driven Ensemble

Author: Zyblewski, Paweł, primary, Ksieniewicz, Paweł, additional, and Woźniak, Michał, additional
Published: 2019
Full Text: View/download PDF

30. Alphabet Flatting as a variant of n-gram feature extraction method in ensemble classification of fake news

Author: Ksieniewicz, Paweł, primary, Zyblewski, Paweł, additional, Borek-Marciniec, Weronika, additional, Kozik, Rafał, additional, Choraś, Michał, additional, and Woźniak, Michał, additional
Published: 2023
Full Text: View/download PDF

31. Novel clustering-based pruning algorithms

Author: Zyblewski, Paweł, primary and Woźniak, Michał, additional
Published: 2020
Full Text: View/download PDF

32. Alphabet Flatting as a variant of n-gram feature extraction method in ensemble classification of fake news.

Author: Ksieniewicz, Paweł, Zyblewski, Paweł, Borek-Marciniec, Weronika, Kozik, Rafał, Choraś, Michał, and Woźniak, Michał
Subjects: *NATURAL language processing, *FAKE news, *FEATURE extraction, *PATTERN recognition systems
Abstract: The detection of disinformation has become a significant challenge in the modern world. Most of our communication media and most of the sources of information about reality are located on distributed network services, where the published content is usually not subject to any initial verification. One of the few tools that seem to be able to process such large volumes of data efficiently are pattern recognition methods employing extraction of features obtained through Natural Language Processing models and procedures. The following paper proposes Alphabet Flatting, a modification of the preprocessing method for feature extraction from large language corpora, which allows the construction of diverse classifier ensembles integrated by support accumulation, the generalization power of which may compete with the quality of state-of-the-art models in environments with strict time constraints. The proposed method has been thoroughly evaluated with a set of computer experiments, the results of which allow us to conclude its potential usefulness in automatic systems for preventing the spread of fake news. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
