25 results for "Statistics"
Search Results
2. Defect Data Analysis Based on Extended Association Rule Mining
- Abstract
This paper describes an empirical study to reveal rules associated with defect correction effort. We defined defect correction effort as a quantitative (ratio scale) variable, and extended conventional (nominal scale based) association rule mining to directly handle such quantitative variables. An extended rule describes the statistical characteristic of a ratio or interval scale variable in the consequent part of the rule by its mean value and standard deviation, so that conditions producing distinctive statistics can be discovered. As an analysis target, we collected various attributes of about 1,200 defects found in a typical medium-scale, multi-vendor (distance development) information system development project in Japan. Our findings based on extracted rules include: (1) Defects detected in coding/unit testing were easily corrected (less than 7% of mean effort) when they were related to data output or validation of input data. (2) Nevertheless, they sometimes required much more effort (lift of standard deviation was 5.845) in cases of low reproducibility. (3) Defects introduced in coding/unit testing often required large correction effort (mean was 12.596 staff-hours and standard deviation was 25.716) when they were related to data handling. From these findings, we confirmed that we need to pay attention to types of defects having large mean effort as well as those having large standard deviation of effort, since such defects sometimes cause excess effort., MSR'07: ICSE Workshops 2007: Fourth International Workshop on Mining Software Repositories, 20-26 May 2007, Minneapolis, MN, USA
- Published
- 2023
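The "extended rule" idea above — summarizing a quantitative consequent by its mean and standard deviation, and comparing it against the whole dataset via a lift — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the defect records and attribute names are hypothetical.

```python
import statistics

# Hypothetical defect records: nominal attributes plus a quantitative
# correction-effort value in staff-hours (a ratio-scale variable).
defects = [
    {"phase": "coding", "type": "output", "effort": 0.5},
    {"phase": "coding", "type": "output", "effort": 0.8},
    {"phase": "coding", "type": "logic",  "effort": 14.0},
    {"phase": "design", "type": "logic",  "effort": 3.0},
    {"phase": "coding", "type": "logic",  "effort": 30.0},
]

def extended_rule(defects, antecedent):
    """Describe the quantitative consequent (effort) of the defects
    matching `antecedent` by its mean and standard deviation, and
    report the lift of each statistic relative to the whole dataset."""
    matched = [d["effort"] for d in defects
               if all(d[k] == v for k, v in antecedent.items())]
    overall = [d["effort"] for d in defects]
    mean, sd = statistics.mean(matched), statistics.pstdev(matched)
    return {
        "mean": mean,
        "stdev": sd,
        "lift_mean": mean / statistics.mean(overall),
        "lift_stdev": sd / statistics.pstdev(overall),
    }

rule = extended_rule(defects, {"phase": "coding", "type": "output"})
# Output-related coding defects are cheap here: small mean, lift well below 1.
```

A rule with a small mean lift corresponds to the paper's "easily corrected" finding; a large standard-deviation lift flags defect types whose effort is unpredictable.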
3. A Method for Sharing Traffic Jam Information using Inter-Vehicle Communication
- Abstract
In this paper, we propose a method by which cars autonomously and cooperatively collect traffic jam statistics, using inter-vehicle communication, so that each car can estimate its arrival time at its destination. In the method, the target geographical region is divided into areas, and each car measures the time to pass through each area. Traffic information is collected by exchanging information between cars using inter-vehicle communication. To improve estimation accuracy, we introduce several mechanisms to avoid the same data being counted repeatedly. Since the wireless bandwidth usable for exchanging statistics is limited, the proposed method includes a mechanism to categorize data and send important data before other data. To evaluate the effectiveness of the proposed method, we implemented it on NETSTREAM, a traffic simulator developed by Toyota Central R&D Labs, conducted experiments, and confirmed that the method achieves practical performance in sharing traffic jam information using inter-vehicle communication., V2VCOM 2006: Vehicle-to-Vehicle Communications, Jul 17-21, 2006, San Jose, CA, USA
- Published
- 2023
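The scheme described above — per-area transit-time samples that are deduplicated when cars exchange data — can be sketched roughly as follows. The class and identifiers are hypothetical, and real radio exchange, bandwidth limits, and data prioritization are omitted.

```python
from collections import defaultdict

class Car:
    """Each car keeps per-area transit-time samples keyed by a unique
    measurement id, so a sample received via several neighbours is
    counted only once (the paper's duplicate-avoidance mechanism)."""

    def __init__(self):
        self.samples = {}  # measurement_id -> (area_id, seconds)

    def record(self, measurement_id, area_id, seconds):
        self.samples[measurement_id] = (area_id, seconds)

    def exchange(self, other):
        # Inter-vehicle exchange: union of samples; duplicate ids collapse.
        merged = {**self.samples, **other.samples}
        self.samples = dict(merged)
        other.samples = dict(merged)

    def estimate(self, route):
        """Estimated travel time: sum of mean transit time per area."""
        per_area = defaultdict(list)
        for area_id, seconds in self.samples.values():
            per_area[area_id].append(seconds)
        return sum(sum(per_area[a]) / len(per_area[a])
                   for a in route if per_area[a])
```

After an exchange, both cars hold the union of their samples, and a sample forwarded through several neighbours contributes to the per-area mean exactly once.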
4. PeakRNN and StatsRNN: Dynamic Pruning in Recurrent Neural Networks
- Abstract
This paper introduces two dynamic real-time pruning techniques, PeakRNN and StatsRNN, for reducing costly multiplications and memory accesses in recurrent neural networks. The methods are demonstrated on a gated recurrent unit in a multi-layer network, solving a single-channel speech enhancement task with a wide variety of real-world acoustic environments and speakers. The performance is compared against the baseline gated recurrent unit and the DeltaRNN method. Compared to the unprocessed speech, the SNR and Perceptual Evaluation of Speech Quality were on average improved by 8.11 dB and 0.43 MOS-LQO, respectively. Additionally, the two proposed methods outperformed DeltaRNN by 0.7 dB and 0.11 MOS-LQO in the two objective measures, while using the same computational budget per timestep and reducing the original operations by 88%. Furthermore, PeakRNN is fully deterministic, i.e. it is always known in advance how many computations will be executed. Such worst-case guarantees are crucial for real-time acoustic applications.
- Published
- 2022
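The abstract does not spell out PeakRNN/StatsRNN, but the delta-style pruning they are compared against (DeltaRNN) can be sketched: skip the columns of a matrix-vector product whose inputs barely changed since the last timestep. A minimal numpy sketch with hypothetical names:

```python
import numpy as np

def delta_matvec(W, x_new, x_prev, y_prev, threshold):
    """Update y = W @ x incrementally: only the columns of W whose
    corresponding input changed by at least `threshold` since the
    previous timestep are touched; tiny changes are skipped (pruned)."""
    delta = x_new - x_prev
    active = np.abs(delta) >= threshold        # columns worth recomputing
    y_new = y_prev + W[:, active] @ delta[active]
    # Where we skipped, the remembered input stays stale on purpose, so
    # small changes can accumulate and eventually cross the threshold.
    x_kept = np.where(active, x_new, x_prev)
    return y_new, x_kept, int(active.sum())
```

The number of active columns varies with the input, which is exactly the non-determinism the abstract says PeakRNN removes by bounding the per-timestep computation in advance.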
5. Accessible or Not? An Empirical Investigation of Android App Accessibility
- Abstract
Mobile apps provide new opportunities for people with disabilities to act independently in the world. Following the laws of the US and EU, mobile OS vendors such as Google and Apple have included accessibility features in their mobile systems and provide a set of guidelines and toolsets for ensuring mobile app accessibility. Motivated by this trend, researchers have conducted empirical studies using the inaccessibility issue rate of each page (i.e., at the screen level) to represent the characteristics of mobile app accessibility. However, an empirical investigation directly focusing on the issues themselves (i.e., at the issue level), which could unveil more fine-grained findings, is still lacking, due to the absence of an effective issue detection method and a relatively comprehensive dataset of issues. To fill this gap in the literature, we first propose an automated app page exploration tool, named Xbot, to facilitate app accessibility testing and automatically collect accessibility issues by leveraging instrumentation techniques and static program analysis. Owing to the relatively high activity coverage (around 80%) achieved by Xbot when exploring apps, Xbot achieves better performance on accessibility issue collection than existing testing tools such as Google Monkey. With Xbot, we are able to collect a relatively comprehensive accessibility issue dataset, finally collecting 86,767 issues from 2,270 unique apps, including both closed-source and open-source apps, based on which we carry out an empirical study from the perspective of the accessibility issues themselves to investigate their novel characteristics. Specifically, we extensively investigate these issues by checking (1) the overall severity of issues with multiple criteria, (2) the in-depth relation between issue types and app categories and GUI component types, (3) the frequent issue patterns quantitatively, and (4) the fixing status of accessibility issues. Finally, we highlight some insights to the community and h
- Published
- 2022
6. The Impact of Potentially Realistic Fabricated Road Sign Messages on Route Change
- Abstract
This article studies self-reported route change behavior of 4,706 licensed drivers in the continental U.S. through a stated preference survey when they encounter road sign messages. Respondents are asked to score their likelihood of route change and speed change on a 5-point Likert scale to three messages: (1) "Heavy Traffic Due to Accident," (2) "Road Closure Due to Police Activity," and (3) "Storm Watch, Flooding in Area Soon." We fulfill three objectives. First, we identify the relationship between the route change behavior and socioeconomic and attitudinal-related factors. Second, we explore the impact of road sign messages with different contents on route change behavior. Third, we test the association between route change and speed change behaviors. The results demonstrate that: (1) the response of participants to compromised dynamic message signs varies according to the socioeconomic standing and attitude of participants, (2) the response of participants varies under different messages, and socioeconomic and attitudinal factors impact this differentiation, and (3) the likelihood of route change is positively associated with slowing down. This means, in practice, a malicious adversary has the potential to shunt and disturb traffic by disseminating fabricated messages and engineering route choice of drivers.
- Published
- 2022
- Full Text
- View/download PDF
7. Context-aware learning for generative models
- Abstract
This work studies the class of algorithms for learning with side-information that emerge by extending generative models with embedded context-related variables. Using finite mixture models (FMM) as the prototypical Bayesian network, we show that maximum-likelihood estimation (MLE) of parameters through expectation-maximization (EM) improves over the regular unsupervised case and can approach the performance of supervised learning, despite the absence of any explicit ground-truth data labeling. By direct application of the missing information principle (MIP), the algorithms' performance is proven to range between the conventional supervised and unsupervised MLE extremities in proportion to the information content of the contextual assistance provided. The benefits include higher estimation precision, smaller standard errors, faster convergence rates, and improved classification accuracy or regression fitness, shown in various scenarios, while also highlighting important properties of and differences among the outlined situations. Applicability is showcased with three real-world unsupervised classification scenarios employing Gaussian mixture models. Importantly, we exemplify the natural extension of this methodology to any type of generative model by deriving an equivalent context-aware algorithm for variational autoencoders (VAEs), thus broadening the spectrum of applicability to unsupervised deep learning with artificial neural networks. The latter is contrasted with a neural-symbolic algorithm exploiting side-information.
- Published
- 2020
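The core mechanism — EM for a mixture model in which side-information clamps the responsibilities of some points — can be sketched for a 1-D two-component Gaussian mixture. This is an illustrative sketch under simplified assumptions, not the paper's algorithm; the data, labels, and initial values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two 1-D Gaussian components; labels are known for a fraction of points
# (the "context" / side-information) and unknown (None) for the rest.
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])
labels = [0] * 50 + [None] * 150 + [1] * 50 + [None] * 150

mu = np.array([-1.0, 1.0])          # deliberately crude initial means
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibilities; labelled points are clamped to their class,
    # which is how the side-information enters the estimation.
    pdf = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    r = pi * pdf
    r /= r.sum(axis=1, keepdims=True)
    for i, lab in enumerate(labels):
        if lab is not None:
            r[i] = 0.0
            r[i, lab] = 1.0
    # M-step: standard MLE updates given the (partially clamped) weights.
    nk = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)
```

With no labels this is plain unsupervised EM; with every point labelled it reduces to supervised MLE — the two extremities between which the paper proves performance interpolates.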
8. Leadership and Pedagogical Skills in Computer Science Engineering by Combining a Degree in Engineering with a Degree in Education
- Abstract
In this full paper on innovative practice, we describe and discuss findings from dual degree study programmes that combine a master's degree in engineering with a master's degree in education. This innovative study programme design has emerged in Sweden due to an alarming demand for more Upper Secondary School teachers in STEM subjects. Studies on alumni from these programmes indicate that the graduates are highly appreciated not only as teachers in schools, but also in business and industry, e.g. in roles as IT consultants and computer science engineers. Data indicate that the breadth of the combined education, and especially leadership and pedagogical skills, are important factors for these graduates' success as engineers.
- Published
- 2020
- Full Text
- View/download PDF
9. Context-aware learning for generative models
- Abstract
This work studies the class of algorithms for learning with side-information that emerge by extending generative models with embedded context-related variables. Using finite mixture models (FMM) as the prototypical Bayesian network, we show that maximum-likelihood estimation (MLE) of parameters through expectation-maximization (EM) improves over the regular unsupervised case and can approach the performance of supervised learning, despite the absence of any explicit ground-truth data labeling. By direct application of the missing information principle (MIP), the algorithms' performance is proven to range between the conventional supervised and unsupervised MLE extremities in proportion to the information content of the contextual assistance provided. The benefits include higher estimation precision, smaller standard errors, faster convergence rates, and improved classification accuracy or regression fitness, shown in various scenarios, while also highlighting important properties of and differences among the outlined situations. Applicability is showcased with three real-world unsupervised classification scenarios employing Gaussian mixture models. Importantly, we exemplify the natural extension of this methodology to any type of generative model by deriving an equivalent context-aware algorithm for variational autoencoders (VAEs), thus broadening the spectrum of applicability to unsupervised deep learning with artificial neural networks. The latter is contrasted with a neural-symbolic algorithm exploiting side-information.
- Published
- 2020
10. How Good Is Your Puppet?: An Empirically Defined and Validated Quality Model for Puppet
- Abstract
Puppet is a declarative language for configuration management that has rapidly gained popularity in recent years. Numerous organizations now rely on Puppet code for deploying their software systems onto cloud infrastructures. In this paper we provide a definition of code quality for Puppet code and an automated technique for measuring and rating Puppet code quality. To this end, we first explore the notion of code quality as it applies to Puppet code by performing a survey among Puppet developers. Second, we develop a measurement model for the maintainability aspect of Puppet code quality. To arrive at this measurement model, we derive appropriate quality metrics from our survey results and from existing software quality models. We implemented the Puppet code quality model in a software analysis tool. We validate our definition of Puppet code quality and the measurement model by a structured interview with Puppet experts and by comparing the tool results with quality judgments of those experts. The validation shows that the measurement model and tool provide quality judgments of Puppet code that closely match the judgments of experts. Also, the experts deem the model appropriate and usable in practice. The Software Improvement Group (SIG) has started using the model in its consultancy practice., Software Engineering
- Published
- 2018
- Full Text
- View/download PDF
11. How Good Is Your Puppet?: An Empirically Defined and Validated Quality Model for Puppet
- Abstract
Puppet is a declarative language for configuration management that has rapidly gained popularity in recent years. Numerous organizations now rely on Puppet code for deploying their software systems onto cloud infrastructures. In this paper we provide a definition of code quality for Puppet code and an automated technique for measuring and rating Puppet code quality. To this end, we first explore the notion of code quality as it applies to Puppet code by performing a survey among Puppet developers. Second, we develop a measurement model for the maintainability aspect of Puppet code quality. To arrive at this measurement model, we derive appropriate quality metrics from our survey results and from existing software quality models. We implemented the Puppet code quality model in a software analysis tool. We validate our definition of Puppet code quality and the measurement model by a structured interview with Puppet experts and by comparing the tool results with quality judgments of those experts. The validation shows that the measurement model and tool provide quality judgments of Puppet code that closely match the judgments of experts. Also, the experts deem the model appropriate and usable in practice. The Software Improvement Group (SIG) has started using the model in its consultancy practice., Software Engineering
- Published
- 2018
- Full Text
- View/download PDF
12. Symbolic method for deriving policy in reinforcement learning
- Abstract
This paper addresses the problem of deriving a policy from the value function in the context of reinforcement learning in continuous state and input spaces. We propose a novel method based on genetic programming to construct a symbolic function, which serves as a proxy to the value function and from which a continuous policy is derived. The symbolic proxy function is constructed such that it maximizes the number of correct choices of the control input for a set of selected states. Maximization methods can then be used to derive a control policy that performs better than the policy derived from the original approximate value function. The method was experimentally evaluated on two control problems with continuous spaces, pendulum swing-up and magnetic manipulation, and compared to a standard policy derivation method using the value function approximation. The results show that the proposed method and its variants outperform the standard method., Accepted Author Manuscript, OLD Intelligent Control & Robotics
- Published
- 2016
- Full Text
- View/download PDF
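For context, the standard policy-derivation baseline the paper compares against — for each state, choose the control input whose successor state the (approximate) value function rates highest — can be sketched as follows; the dynamics, value proxy, and input set are toy assumptions, not the paper's benchmarks.

```python
def derive_policy(value_fn, dynamics, inputs):
    """Standard policy derivation from an (approximate) value function:
    for each state, pick the control input whose successor state has
    the highest value."""
    def policy(x):
        return max(inputs, key=lambda u: value_fn(dynamics(x, u)))
    return policy

# Toy example (assumed, for illustration): drive a 1-D state to the origin.
value_fn = lambda x: -abs(x)          # value proxy: closer to 0 is better
dynamics = lambda x, u: x + u         # trivial integrator dynamics
policy = derive_policy(value_fn, dynamics, inputs=[-1.0, 0.0, 1.0])
# policy(3.0) → -1.0 : push the state toward the origin
```

The paper's contribution is to replace `value_fn` with a smooth symbolic proxy found by genetic programming, so that the maximization yields a better continuous policy than the raw value-function approximation does.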
13. Symbolic method for deriving policy in reinforcement learning
- Abstract
This paper addresses the problem of deriving a policy from the value function in the context of reinforcement learning in continuous state and input spaces. We propose a novel method based on genetic programming to construct a symbolic function, which serves as a proxy to the value function and from which a continuous policy is derived. The symbolic proxy function is constructed such that it maximizes the number of correct choices of the control input for a set of selected states. Maximization methods can then be used to derive a control policy that performs better than the policy derived from the original approximate value function. The method was experimentally evaluated on two control problems with continuous spaces, pendulum swing-up and magnetic manipulation, and compared to a standard policy derivation method using the value function approximation. The results show that the proposed method and its variants outperform the standard method., Accepted Author Manuscript, OLD Intelligent Control & Robotics
- Published
- 2016
- Full Text
- View/download PDF
14. Software defined health
- Published
- 2015
15. Defect Data Analysis Based on Extended Association Rule Mining
- Abstract
This paper describes an empirical study to reveal rules associated with defect correction effort. We defined defect correction effort as a quantitative (ratio scale) variable, and extended conventional (nominal scale based) association rule mining to directly handle such quantitative variables. An extended rule describes the statistical characteristic of a ratio or interval scale variable in the consequent part of the rule by its mean value and standard deviation, so that conditions producing distinctive statistics can be discovered. As an analysis target, we collected various attributes of about 1,200 defects found in a typical medium-scale, multi-vendor (distance development) information system development project in Japan. Our findings based on extracted rules include: (1) Defects detected in coding/unit testing were easily corrected (less than 7% of mean effort) when they were related to data output or validation of input data. (2) Nevertheless, they sometimes required much more effort (lift of standard deviation was 5.845) in cases of low reproducibility. (3) Defects introduced in coding/unit testing often required large correction effort (mean was 12.596 staff-hours and standard deviation was 25.716) when they were related to data handling. From these findings, we confirmed that we need to pay attention to types of defects having large mean effort as well as those having large standard deviation of effort, since such defects sometimes cause excess effort.
- Published
- 2007
16. Defect Data Analysis Based on Extended Association Rule Mining
- Abstract
This paper describes an empirical study to reveal rules associated with defect correction effort. We defined defect correction effort as a quantitative (ratio scale) variable, and extended conventional (nominal scale based) association rule mining to directly handle such quantitative variables. An extended rule describes the statistical characteristic of a ratio or interval scale variable in the consequent part of the rule by its mean value and standard deviation, so that conditions producing distinctive statistics can be discovered. As an analysis target, we collected various attributes of about 1,200 defects found in a typical medium-scale, multi-vendor (distance development) information system development project in Japan. Our findings based on extracted rules include: (1) Defects detected in coding/unit testing were easily corrected (less than 7% of mean effort) when they were related to data output or validation of input data. (2) Nevertheless, they sometimes required much more effort (lift of standard deviation was 5.845) in cases of low reproducibility. (3) Defects introduced in coding/unit testing often required large correction effort (mean was 12.596 staff-hours and standard deviation was 25.716) when they were related to data handling. From these findings, we confirmed that we need to pay attention to types of defects having large mean effort as well as those having large standard deviation of effort, since such defects sometimes cause excess effort., MSR'07: ICSE Workshops 2007: Fourth International Workshop on Mining Software Repositories, 20-26 May 2007, Minneapolis, MN, USA, conference paper
- Published
- 2007
17. Infrequent item mining in multiple data streams
- Abstract
The problem of extracting infrequent patterns from streams and building associations between these patterns is becoming increasingly relevant today as many events of interest such as attacks in network data or unusual stories in news data occur rarely. The complexity of the problem is compounded when a system is required to deal with data from multiple streams. To address these problems, we present a framework that combines the time based association mining with a pyramidal structure that allows a rolling analysis of the stream and maintains a synopsis of the data without requiring increasing memory resources. We apply the algorithms and show the usefulness of the techniques. © 2007 Crown Copyright.
- Published
- 2007
18. A Technique for Information Sharing using Inter-Vehicle Communication with Message Ferrying
- Abstract
In this paper, we propose a method to realize traffic information sharing among cars using inter-vehicle communication. When traffic information on a target area is retained only by ordinary cars near the area, the information may be lost when the density of cars becomes low. In our method, we use the message ferrying technique together with neighboring broadcast to mitigate this problem. We use buses, which travel along regular routes, as ferries: each bus maintains the traffic information statistics for each area, received from its neighboring cars. We implemented the proposed system and conducted a performance evaluation using the traffic simulator NETSTREAM. As a result, we confirmed that the proposed method achieves better performance than using neighboring broadcast alone.
- Published
- 2006
19. A Method for Sharing Traffic Jam Information using Inter-Vehicle Communication
- Abstract
In this paper, we propose a method by which cars autonomously and cooperatively collect traffic jam statistics, using inter-vehicle communication, so that each car can estimate its arrival time at its destination. In the method, the target geographical region is divided into areas, and each car measures the time to pass through each area. Traffic information is collected by exchanging information between cars using inter-vehicle communication. To improve estimation accuracy, we introduce several mechanisms to avoid the same data being counted repeatedly. Since the wireless bandwidth usable for exchanging statistics is limited, the proposed method includes a mechanism to categorize data and send important data before other data. To evaluate the effectiveness of the proposed method, we implemented it on NETSTREAM, a traffic simulator developed by Toyota Central R&D Labs, conducted experiments, and confirmed that the method achieves practical performance in sharing traffic jam information using inter-vehicle communication.
- Published
- 2006
20. Application of statistical methods for making maintenance decisions within power utilities
- Abstract
Electrical Engineering, Mathematics and Computer Science
- Published
- 2006
21. Application of statistical methods for making maintenance decisions within power utilities
- Abstract
Electrical Engineering, Mathematics and Computer Science
- Published
- 2006
22. Approximate Transmit Covariance Optimization of MIMO Systems with Covariance Feedback
- Abstract
The data rate of a multiple-input, multiple-output (MIMO) communication link can be improved if knowledge about the channel statistics is exploited at the transmitter. However, many realistic system scenarios require computationally prohibitive Monte Carlo methods in order to optimize the transmit covariance for maximization of the exact mutual information. This paper instead considers an approximate approach to maximizing the performance of a communication link where covariance feedback is available at the transmitter. The algorithm presented is based on an asymptotic expression of the mutual information that allows for correlation at both the transmit and the receive side of the system. From simulations we demonstrate significant gains over beamforming and pure diversity schemes in many realistic system scenarios.
- Published
- 2003
- Full Text
- View/download PDF
23. Control Charts and Efficient Sampling Methodologies in the Field of Photovoltaics
- Abstract
Industries must keep their processes under control in order to become more performant; this is particularly true for the photovoltaic (PV) industry. A Statistical Process Control (SPC) study has been realised for the semi-industrial process at IMEC. In this context, control charts have been built and sampling schemes have been investigated. The approach has proven powerful for detecting special causes of variation and will continue to be used for early detection of out-of-control situations.
- Published
- 2002
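A Shewhart X-bar chart of the kind used in such SPC studies can be sketched in a few lines: the centre line sits at the grand mean and the control limits three standard errors away; points outside the limits signal special causes of variation. The data below are hypothetical, not IMEC's.

```python
import statistics

def xbar_limits(means, sigma_xbar):
    """Shewhart X-bar chart: centre line at the grand mean, control
    limits three standard errors of the subgroup mean away."""
    cl = statistics.mean(means)
    return cl - 3 * sigma_xbar, cl, cl + 3 * sigma_xbar

def out_of_control(means, sigma_xbar):
    """Return the subgroup means falling outside the control limits,
    i.e. points signalling a special (assignable) cause of variation."""
    lcl, _, ucl = xbar_limits(means, sigma_xbar)
    return [m for m in means if m < lcl or m > ucl]

# Hypothetical subgroup means of cell efficiency (%), sigma assumed known:
eff = [16.1, 16.0, 16.2, 16.1, 16.0, 16.9]
flagged = out_of_control(eff, sigma_xbar=0.1)
```

In a production setting the limits would be fixed from an in-control baseline period rather than recomputed from the data being judged, as done here for brevity.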
24. Robust Blind Second Order Deconvolution of Multiple FIR Channels
- Abstract
Second-order blind deconvolution of single-input multiple-output (SIMO) FIR channels is considered herein. A major drawback of several blind diversity techniques using antenna arrays or temporal oversampling is high sensitivity to the choice of model order. In this article, a robust method using only second-order statistics is described. It provides high estimation accuracy even for a limited sample size and an unknown model order. In contrast to other suggested approaches, which often exploit properties valid only in the large-sample case, the proposed method is applicable in both the large-sample and high signal-to-noise ratio (SNR) scenarios; it also enjoys a simple implementation.
- Published
- 1998
- Full Text
- View/download PDF
25. Asymptotic Comparison of two Blind Channel Identification Algorithms
- Abstract
In this paper the performance of two second-order based blind channel identification techniques is studied. The methods are compared theoretically, and the formulas are validated by simulations. The first method is a well-known subspace approach and the second is a covariance matching estimator. The latter estimator should attain the lower bound on the asymptotic estimation error covariance of any second-order based blind identification algorithm.
- Published
- 1997
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library