107 results for "Andrzejak, Artur"
Search Results
102. Introduction to randomized algorithms.
- Author
- Goos, Gerhard, Hartmanis, Juris, van Leeuwen, Jan, Mayr, Ernst W., Prömel, Hans Jürgen, Steger, Angelika, and Andrzejak, Artur
- Published
- 1998
- Full Text
- View/download PDF
103. Service-Centric Globally Distributed Computing.
- Author
- Graupner, Sven, Kotov, Vadim, Andrzejak, Artur, and Trinks, Holger
- Subjects
- DATA libraries, INFORMATION storage & retrieval systems, INFORMATION resources management, COMPUTER systems, INFORMATION technology
- Abstract
Addresses the process of confederating multiple compute-data centers or utility data centers (UDC) to increase flexibility and utilization for business workloads. Information on the control cycle of an automated resource demand-supply control system; Features of the UDC, which was developed by Hewlett-Packard; Functions of a UDC platform; Ways in which the UDC can apply demand-control instruments; Description of the three layers of the UDC control system. INSET: Related Work in Autonomous System Management.
- Published
- 2003
- Full Text
- View/download PDF
104. Memory and resource leak defects in Java projects.
- Author
- Ghanavati, Mohammadreza, Costa, Diego, Andrzejak, Artur, and Seboek, Janos
- Subjects
- SOFTWARE engineering, PROGRAMMING languages, FAULT-tolerant computing, MALWARE, JAVA programming language
- Abstract
Despite many software engineering efforts and programming language support, resource and memory leaks remain a troublesome issue in managed languages such as Java. Understanding the properties of leak-related issues, such as their type distribution, how they are found, and which defects induce them, is an essential prerequisite for designing better approaches for avoidance, diagnosis, and repair of leak-related bugs. To answer these questions, we conduct an empirical study on 452 issues found in repositories of 10 mature Apache Java projects. [ABSTRACT FROM AUTHOR]
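The defect class this study targets is easy to illustrate. Below is a minimal, self-contained Java sketch (not drawn from the studied projects) of a typical resource-leak defect and its repair: a reader that leaks its file handle when an exception is thrown, versus a try-with-resources version that closes it on all paths.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class LeakExample {

    // Defective: if readLine() throws, close() is never reached and the
    // file handle leaks -- a typical resource-leak defect in Java code.
    static String firstLineLeaky(String path) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        String line = reader.readLine();
        reader.close(); // skipped when readLine() throws
        return line;
    }

    // Fixed: try-with-resources closes the reader on all paths,
    // including exceptional ones.
    static String firstLineSafe(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: LeakExample <file>");
            return;
        }
        System.out.println(firstLineSafe(args[0]));
    }
}
```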
- Published
- 2018
- Full Text
- View/download PDF
105. Software failure prediction based on patterns of multiple-event failures
- Author
- Santos, Caio Augusto Rodrigues dos, Matias Júnior, Rivalino, Trivedi, Kishor S., Andrzejak, Artur, Silva, Dima da, and Albertini, Marcelo Keese
- Subjects
- Event (computing), Computer science, Exact and Earth Sciences::Computer Science [CNPQ], Multiple-event failures, Software failures, Failure sequences, Failure associations, Reliability engineering, Patterns, Prediction
- Abstract
A fundamental need for software reliability engineering is to comprehend how software systems fail, which means understanding the dynamics that govern different types of failure manifestation. In this research, I present an exploratory study on multiple-event failures, a failure manifestation characterized by sequences of failure events that vary in length, duration, and combination of failure types. This study aims to (i) improve the understanding of multiple-event failures in real software systems by investigating their occurrences, associations, and causes; (ii) propose analysis protocols that take multiple-event failure manifestations into account; and (iii) exploit the sequential nature of this type of software failure to perform predictions. The failures analyzed in this research were observed empirically. In total, I analyzed 42,209 real software failures from 644 computers used in different workplaces. The major contributions of this study are a protocol developed to investigate the existence of patterns of failure associations; a protocol to discover patterns of failure sequences; and a prediction approach whose main idea is to calculate the probability of a certain failure event occurring within a time interval after the occurrence of a particular pattern of preceding failures. I used three methods to tackle the prediction problem: Multinomial Logistic Regression (with and without Ridge regularization), Decision Tree, and Random Forest. These methods were chosen due to the nature of the failure data, in which the failure types must be handled as categorical variables. Initially, I performed a failure association discovery analysis that included only failures from a widely used commercial off-the-shelf Operating System (OS). As a result, I discovered 45 OS failure association patterns with 153,511 occurrences, composed of the same or different failure types and systematically occurring within well-established time intervals. The observed associations suggest the existence of underlying mechanisms governing these failure occurrences, which motivated improving the previous method by creating a protocol to discover patterns of failure sequences using flexible time thresholds, together with a failure prediction approach. To obtain a comprehensive view of how different software failures may affect each other, both methods were applied to three different samples: the first contained only OS failures, the second only User Application failures, and the third both OS and User Application failures. As a result, I found 165, 480, and 640 different failure sequences, respectively, each with thousands of occurrences. Finally, the proposed approach was able to predict failures with good to high accuracy (86% to 93%). Doctoral thesis (Tese de Doutorado), funded by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior).
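To make the prediction idea concrete: the approach estimates the probability that a given failure type occurs within a time interval after a particular pattern of preceding failures. The following Java sketch computes such an estimate by simple frequency counting over an event log; the event types, timestamps, and counting scheme are illustrative stand-ins, not the thesis's Multinomial Logistic Regression, Decision Tree, or Random Forest models.

```java
import java.util.List;

public class FailurePatternPredictor {

    // A failure event with a type label and a timestamp (illustrative).
    record FailureEvent(String type, long timestamp) {}

    // Estimate P(target failure occurs within `window` time units after an
    // occurrence of `pattern`) by counting in a log sorted by timestamp.
    static double estimate(List<FailureEvent> log, List<String> pattern,
                           String target, long window) {
        int patternHits = 0, followedByTarget = 0;
        for (int i = 0; i + pattern.size() <= log.size(); i++) {
            boolean match = true;
            for (int j = 0; j < pattern.size(); j++) {
                if (!log.get(i + j).type().equals(pattern.get(j))) {
                    match = false;
                    break;
                }
            }
            if (!match) continue;
            patternHits++;
            long end = log.get(i + pattern.size() - 1).timestamp();
            // scan forward while still inside the prediction window
            for (int k = i + pattern.size();
                 k < log.size() && log.get(k).timestamp() - end <= window; k++) {
                if (log.get(k).type().equals(target)) {
                    followedByTarget++;
                    break;
                }
            }
        }
        return patternHits == 0 ? 0.0 : (double) followedByTarget / patternHits;
    }

    public static void main(String[] args) {
        List<FailureEvent> log = List.of(
            new FailureEvent("APP_HANG", 0), new FailureEvent("APP_CRASH", 5),
            new FailureEvent("OS_CRASH", 8), new FailureEvent("APP_HANG", 50),
            new FailureEvent("APP_CRASH", 55), new FailureEvent("OS_CRASH", 70));
        double p = estimate(log, List.of("APP_HANG", "APP_CRASH"), "OS_CRASH", 10);
        System.out.printf("P(OS_CRASH within 10 | HANG->CRASH) = %.2f%n", p); // 0.50
    }
}
```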
- Published
- 2021
- Full Text
- View/download PDF
106. Approximate Distributed Set Reconciliation with Defined Accuracy
- Author
- Kruber, Nico, Reinefeld, Alexander, Schweikardt, Nicole, and Andrzejak, Artur
- Subjects
- Accuracy Models, Approximate Algorithms, Replication, Set Reconciliation, Bloom Filter, Merkle Tree, Distributed Systems, Synchronisation, ST 200, ddc:000, 000 Computer science, information & general works
- Abstract
The objective comparison of approximate versioned set reconciliation algorithms is challenging. Each algorithm's behaviour can be tuned for a given use case, e.g. for low bandwidth or low computational overhead, using different sets of parameters. Changes to these parameters, however, often also influence the algorithm's accuracy in recognising differences between the participating sets and thus hinder objective comparisons based on the same level of accuracy. We develop a method to fairly compare approximate set reconciliation algorithms by enforcing a fixed accuracy and deriving the accuracy-influencing parameters accordingly. We show this method's universal applicability by applying it to two trivial hash-based algorithms as well as to set reconciliation with Bloom filters and with Merkle trees. Compared to previous research on Merkle trees, we propose dynamic hash sizes to align the transfer overhead with the desired accuracy, creating a new Merkle tree reconciliation algorithm with an adjustable accuracy target. An extensive evaluation of each algorithm under this accuracy model verifies the feasibility of the methodology and ranks the four algorithms.
Our results make it easy to choose an efficient algorithm for practical set reconciliation tasks based on the required level of accuracy. Our way of finding configuration parameters that make different algorithms equally accurate can also be applied to other set reconciliation algorithms and allows their performance to be rated objectively under various metrics. The resulting new approximate Merkle tree reconciliation broadens the applicability of Merkle trees and sheds new light on their effectiveness.
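For readers unfamiliar with Merkle-tree set reconciliation, the following Java sketch shows the underlying divide-and-compare idea: both sides hash key ranges, and only ranges whose hashes differ are split and examined further. It keeps both sets in one process and uses full SHA-256 digests; a real protocol exchanges the hashes over the network, and the thesis's contribution of dynamically sized hashes for a target accuracy is not modelled here.

```java
import java.security.MessageDigest;
import java.util.*;

public class MerkleReconcile {

    // Hash all keys of `set` falling in the range [lo, hi).
    static byte[] hashRange(SortedSet<String> set, String lo, String hi) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (String k : set.subSet(lo, hi)) {
            md.update(k.getBytes());
            md.update((byte) 0); // separator so key boundaries are unambiguous
        }
        return md.digest();
    }

    // Collect the keys in [lo, hi) on which the two sets differ.
    static void diff(SortedSet<String> a, SortedSet<String> b,
                     String lo, String hi, Set<String> out) throws Exception {
        // identical range hashes: skip the whole range (the bandwidth saving)
        if (Arrays.equals(hashRange(a, lo, hi), hashRange(b, lo, hi))) return;
        SortedSet<String> keys = new TreeSet<>(a.subSet(lo, hi));
        keys.addAll(b.subSet(lo, hi));
        if (keys.size() <= 2) { // leaf: compare items directly
            for (String k : keys) if (a.contains(k) != b.contains(k)) out.add(k);
            return;
        }
        // split the range at its median key and recurse into both halves
        String mid = new ArrayList<>(keys).get(keys.size() / 2);
        diff(a, b, lo, mid, out);
        diff(a, b, mid, hi, out);
    }

    public static void main(String[] args) throws Exception {
        SortedSet<String> a = new TreeSet<>(List.of("apple", "cherry", "fig", "plum"));
        SortedSet<String> b = new TreeSet<>(List.of("apple", "banana", "fig", "plum"));
        Set<String> out = new TreeSet<>();
        diff(a, b, "", "\uffff", out); // "" and \uffff bound all keys used here
        System.out.println("Differing keys: " + out); // [banana, cherry]
    }
}
```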
- Published
- 2020
- Full Text
- View/download PDF
107. Scalable time series similarity search for data analytics
- Author
- Schäfer, Patrick, Reinefeld, Alexander, Leser, Ulf, and Andrzejak, Artur
- Subjects
- Similarity Search, Scalable, Data Analytics, Data Mining, Time Series, SK 845, ST 265, 28 Computer science, data processing, ddc:004, 004 Computer science
- Abstract
A time series is a collection of values sequentially recorded from sensors or live observations over time. Sensors for recording time series have become cheap and omnipresent. While data volumes explode, research in the field of time series data analytics has, over the last decades, focused on (a) pre-processed and (b) moderately sized time series datasets. The analysis of real-world datasets raises two major problems. Firstly, state-of-the-art similarity models require the time series to be pre-processed. Pre-processing aims at extracting approximately aligned characteristic subsequences and reducing noise; it is typically performed by a domain expert, may be more time-consuming than the data mining itself, and simply does not scale to large data volumes. Secondly, time series research has been driven by accuracy metrics rather than by reasonable execution times on large data volumes, resulting in quadratic to biquadratic computational complexities of state-of-the-art similarity models. This dissertation addresses both issues by introducing a symbolic time series representation and three different similarity models, which advance the state of the art by being pre-processing-free, noise-robust, and scalable. Our experimental evaluation on 91 real-world and benchmark datasets shows that our methods provide higher accuracy than 15 state-of-the-art similarity models on most datasets, while being up to three orders of magnitude faster, requiring less pre-processing for noise or alignment, and scaling to large data volumes.
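The dissertation's own symbolic representation is not spelled out in this abstract; as a generic illustration of what a symbolic time series representation does, the following Java sketch implements a simple SAX-style discretization (z-normalization, piecewise aggregation, mapping segment means to letters). The alphabet size and breakpoints are illustrative, not the thesis's method.

```java
import java.util.Arrays;

public class SymbolicSeries {

    // Breakpoints for a 4-letter alphabet under a standard normal (illustrative).
    static final double[] BREAKS = {-0.6745, 0.0, 0.6745};

    // Turn a real-valued series into a short symbolic word: z-normalize,
    // average over `segments` equal-length pieces, map each mean to a letter.
    static String symbolize(double[] series, int segments) {
        double mean = Arrays.stream(series).average().orElse(0);
        double var = Arrays.stream(series)
                           .map(v -> (v - mean) * (v - mean)).average().orElse(1);
        double std = Math.sqrt(var) + 1e-9; // avoid division by zero
        double[] z = Arrays.stream(series).map(v -> (v - mean) / std).toArray();

        StringBuilder word = new StringBuilder();
        int len = z.length / segments; // tail values beyond len*segments are dropped
        for (int s = 0; s < segments; s++) {
            double avg = 0;
            for (int i = s * len; i < (s + 1) * len; i++) avg += z[i];
            avg /= len;
            char c = 'a';
            for (double b : BREAKS) if (avg > b) c++;
            word.append(c);
        }
        return word.toString();
    }

    public static void main(String[] args) {
        double[] rising = {1, 1.1, 2, 2.2, 3, 3.1, 4, 4.2};
        System.out.println(symbolize(rising, 4)); // prints "abcd": low-to-high segments
    }
}
```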
- Published
- 2015