Descriptor: "006" / Publisher: university of surrey - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"006"' showing total 10 results

1. Advanced pre-and-post processing techniques for speech coding

Author: Farsi, Hassan
Subjects: 006, Pattern recognition & image processing
Abstract: Advances in digital technology in the last decade have motivated the development of very efficient and high quality speech compression algorithms. While in the early low bit rate coding systems, the main target was the production of intelligible speech at low bit rates, expansion of new applications such as mobile satellite systems increased the demand for reducing the transmission bandwidth and achieving higher speech quality. This resulted in the development of efficient parametric models for speech production system. These models were the basis of powerful speech compression algorithms such as CELP, MBE, MELP and WI. The performance of a speech coder not only depends on the speech production model employed but also on the accurate estimation of speech parameters. Periodicity, also known as pitch, is one of the speech parameters that greatly affect the synthesised speech quality. Thus, the subject of pitch determination has attracted much research in the area of low bit rate coding. In these studies it is assumed that for a short segment of speech, called frame, the pitch is fixed or smoothly evolving. The pitch estimation algorithms generally fail to determine irregular variations, which can occur at onset and offset speech segments. In order to overcome this problem, a novel preprocessing method, which detects irregular pitch variations and modifies the speech signal such as to improve the accuracy of the pitch estimation, is proposed. This method results in more regular speech while maintaining perceptual speech quality. The perceptual quality of the synthesised speech may also be improved using postfiltering techniques. Conventional postfiltering methods generally consider the enhancement of the whole speech spectrum. This may result in the broadening of the first formant, which leads to the increase of quantisation noise for this formant. 
A new postfiltering technique, which is based on factorising the linear prediction synthesis filter, is proposed. This provides more control over the formant bandwidth and attenuation of spectral speech valleys. Key words: Pitch smoothing, speech pre-processor, postfiltering.
Published: 2003
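
The abstract of result 1 treats pitch as fixed within a short frame. As a rough illustration of that assumption only, here is a minimal sketch of a frame-level pitch estimator. It uses plain autocorrelation peak picking, which is an assumption chosen for illustration and not the thesis's pre-processing method. The function name, the search range (`f_min`, `f_max`) and the test tone are all hypothetical.

```python
import math

def estimate_pitch(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Return the pitch in Hz whose lag maximises the frame's autocorrelation."""
    n = len(frame)
    lag_min = int(sample_rate / f_max)               # shortest period searched
    lag_max = min(int(sample_rate / f_min), n - 1)   # longest period searched

    def autocorr(lag):
        # Unnormalised autocorrelation of the frame at the given lag.
        return sum(frame[i] * frame[i + lag] for i in range(n - lag))

    best_lag = max(range(lag_min, lag_max + 1), key=autocorr)
    return sample_rate / best_lag

# Hypothetical test signal: a 40 ms frame of a steady 100 Hz tone at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE) for i in range(320)]
pitch = estimate_pitch(frame, SAMPLE_RATE)
```

An estimator of this kind returns a single period per frame, so it cannot follow the irregular pitch variations at onsets and offsets that the abstract identifies as the failure case.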

2. Fast statistically robust image registration

Author: Fitch, Alistair John
Subjects: 006, Pattern recognition & image processing
Abstract: Image registration is the automatic alignment of images. It is a fundamental task in computer vision. Image registration is challenging, in part, because of the wide range of applications with an equally wide range of content. Applications that require the automatic alignment of images include: super-resolution, face detection, video coding, medical imaging, mosaicking, post-production video effects, and satellite image registration. The wide and diverse range of applications have led to a wide and diverse range of image registration algorithms. An image registration algorithm is defined by its transformation, criterion, and search. The transformation is the model of image deformation required for alignment. The criterion is the definition of the best registration. The search describes how the best registration is to be found. This thesis presents two image registration methods; fast robust correlation and orientation correlation. The presented methods find translational transformations. Both define their criterion of the best registration using robust statistics. Fast robust correlation applies robust statistics to pixel intensity differences. Orientation correlation applies robust statistics to differences in orientation of intensity gradient. This gives orientation correlation the property of illumination invariance. Both use an exhaustive search to find the best registration. The novelty of fast robust correlation and orientation correlation is the combination of robust statistics, with an exhaustive search that can be computed quickly with fast Fourier transforms (FFTs). This is achieved by expressing a statistically robust registration surface with correlations. The correlations are computed quickly using FFTs. Computation with FFTs is shown to be particularly advantageous in registration of large images of similar size. Experimental comparisons demonstrate the advantages of the methods over standard correlation-based approaches. 
Advantage is shown in the experiments of: video coding, video frame registration, tolerance of rotation and zoom, registration of multimodal microscopy images, and face registration.
Published: 2003
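
The abstract of result 2 describes an exhaustive translational search computed quickly with FFTs. The sketch below illustrates that idea in one dimension, under stated assumptions: it uses plain circular cross-correlation (not the robust or orientation-based criteria of the thesis), a small hand-written radix-2 FFT, and hypothetical function names and test data. Images would need a 2-D transform.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def circular_shift_estimate(reference, shifted):
    """Return d such that shifted[i] best matches reference[(i - d) % n]."""
    n = len(reference)
    ref_f = fft(reference)
    shf_f = fft(shifted)
    # Correlation theorem: the correlation surface over all n shifts is the
    # inverse transform of conj(F_reference) * F_shifted.
    product = [a.conjugate() * b for a, b in zip(ref_f, shf_f)]
    surface = [v.real / n for v in fft(product, inverse=True)]
    # Exhaustive search: every candidate shift is scored, the best one wins.
    return max(range(n), key=lambda s: surface[s])

# Hypothetical 16-sample signal and a copy circularly shifted by 5 samples.
reference = [0, 1, 3, 7, 2, 0, 0, 1, 0, 4, 9, 3, 1, 0, 0, 2]
shifted = reference[-5:] + reference[:-5]
shift = circular_shift_estimate(reference, shifted)
```

The whole correlation surface costs O(n log n) here, against O(n^2) for scoring each shift directly, which is the kind of saving the abstract attributes to FFT-based computation.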

3. Object recognition by region matching using relaxation with relational constraints

Author: Ahmadyfard, Alireza
Subjects: 006, Computer vision
Abstract: Our objective in this thesis is to develop a method for establishing an object recognition system based on the matching of image regions. A region is segmented from the image based on the colour homogeneity of pixels. The method can be applied to a number of computer vision applications such as object recognition (in general) and image retrieval. The motivation for using regions as image primitives is that they can be represented invariantly to a group of geometric transformations and regions are stable under scaling. We model each object of interest in our database using a single frontal image. The recognition task is to determine the presence of object(s) of interest in scene images. We propose a novel method for affine invariant representation of image regions in the form of an Attributed Relational Graph (ARG). To make image regions comparable for matching, we project each region to an affine invariant space and describe it using a set of unary measurements. The distinctiveness of these features is enhanced by describing the relation between the region and its neighbours. We limit ourselves to the low order relations, binary relations, to minimise the combinatorial complexity of both feature extraction and model matching, and to maximise the probability of the features being observed. We propose two sets of binary measurements: geometric relations between pairs of regions, and colour profile on the line connecting the centroids of regions. We demonstrate that the former measurements are very discriminative when the shape of segmented regions is informative. However, they are susceptible to distortion of region boundaries as a result of severe geometric transformations. In contrast, the colour profile binary measurements are very robust. Using this representation we construct a graph to represent the regions in the scene image and refer to it as the scene graph.
Similarly a graph containing the regions of all object models is constructed and referred to as the model graph. We consider the object recognition as the problem of matching the scene graph and model graphs. We adopt the probabilistic relaxation labelling technique for our problem. The method is modified to cope better with image segmentation errors. The implemented algorithm is evaluated under affine transformation, occlusion, illumination change and cluttered scene. Good performance for recognition even under severe scaling and in cluttered scenes is reported. Key words: Region Matching, Object Recognition, Relaxation Labelling, Affine Invariant.
Published: 2003

4. Geometric surface registration for 3D model building

Author: Cunnington, Simon James
Subjects: 006, Pattern recognition & image processing
Abstract: This thesis concerns geometric surface registration, a vital part of automatic 3D model building. The work is centred on the iterative closest point (ICP) algorithm and a study is made of how the choice of closest point method affects the accuracy, stability and speed of the algorithm. A comparative study of n view point set alignment methods is also presented. It is shown how the ICP algorithm can be extended to use the n view point set alignment methods to register multiple surfaces. The value of robustness checks in improving registration is demonstrated, especially when registering multiple surfaces. Finally a post-processing self-calibration technique is presented for data acquired using the ModelMaker Reality Capture System, a laser sensor on a coordinate-measuring arm.
Published: 2003
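
The abstract of result 4 centres on the iterative closest point (ICP) algorithm. For readers unfamiliar with it, the following is a minimal 2-D sketch under stated assumptions: brute-force nearest-neighbour matching as the closest point method, a closed-form least-squares rigid fit, and hypothetical function names and point sets. It is not the thesis's implementation and omits the robustness checks and multi-view extensions the abstract mentions.

```python
import math

def best_rigid_transform(src, dst):
    """Least-squares rotation angle and translation mapping src[i] onto dst[i]."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    # Sums of cross and dot products of the centred point pairs.
    cross = sum((p[0] - sx) * (q[1] - dy) - (p[1] - sy) * (q[0] - dx)
                for p, q in zip(src, dst))
    dot = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
              for p, q in zip(src, dst))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def apply_transform(points, theta, tx, ty):
    """Rotate each point by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def closest_point(p, cloud):
    """Brute-force nearest neighbour of p in cloud (squared Euclidean distance)."""
    return min(cloud, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def icp(source, target, iterations=10):
    """Alternate closest-point matching and rigid fitting; return moved source."""
    current = list(source)
    for _ in range(iterations):
        matches = [closest_point(p, target) for p in current]
        theta, tx, ty = best_rigid_transform(current, matches)
        current = apply_transform(current, theta, tx, ty)
    return current

# Hypothetical data: a small point set and a slightly rotated, shifted copy.
model = [(0.0, 0.0), (2.0, 0.0), (4.0, 1.0), (1.0, 3.0), (3.0, 4.0), (5.0, 5.0)]
scan = apply_transform(model, 0.05, 0.1, -0.1)
aligned = icp(model, scan)
max_error = max(math.dist(a, b) for a, b in zip(aligned, scan))
```

Swapping `closest_point` for a different matching rule is the kind of choice whose effect on accuracy, stability and speed the abstract says the thesis studies.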

5. Human modelling from multiple views

Author: Starck, J. R.
Subjects: 006, Pattern recognition & image processing
Abstract: A long standing problem in computer graphics and animation is the production of synthetic computer graphics models whose appearance, movement and behaviour are visually indistinguishable from the real world. This thesis addresses the problem of reconstructing visually realistic computer graphics models of real people using multiple camera views. A model-based computer vision algorithm is introduced to reconstruct the shape and appearance of a person in an arbitrary pose viewed in a multiple camera studio. Current techniques for multiple view reconstruction address the problem of general scene recovery. These non model-based approaches can fail to accurately reconstruct shape and appearance in the presence of visual ambiguities. The techniques also provide no structure to edit or reuse the captured content in computer animation. The primary novel contributions in this research work are 1) a shape constrained deformable model formulation to match a generic model to shape information in multiple view silhouettes in the presence of visual ambiguities; and 2) a model-based multiple view reconstruction algorithm to recover a model that matches appearance across multiple views to sub-pixel accuracy. Model-based multiple view reconstruction of people is evaluated and results are presented for the reconstruction of shape and appearance of people in an arbitrary pose. The recovered models provide an accurate shape representation for a person and a visual appearance approaching the quality of the original camera images. The models also provide a consistent structured representation for the editing, synthesis and transmission of 3D content in computer graphics and animation.
Published: 2003

6. Efficient system identification based on root cepstral deconvolution

Author: Sarpal, Sanjeev
Subjects: 006, Speech synthesis
Abstract: This thesis summarizes approximately three years of research on signal modelling for the purposes of system identification. Improvements in signal modelling techniques have been encouraged over the years by society's demand for more efficient ways of accessing information. As a consequence, several modelling/compression techniques in both the time domain and the frequency domain have been developed as possible solutions to these problems. Cepstral deconvolution is a frequency domain modelling technique that has been successfully applied to many diverse fields, such as speech and seismic analysis. Thus far, all cepstral modelling performance has been empirical, relying on the judgement of the designer. Therefore a novel method for measuring root cepstral pole-zero modelling performance is proposed, by introducing a cost function applied directly to the root cepstral domain. It is, therefore, possible to demonstrate the optimized modelling of a pole-zero model and show that its performance is superior to that of a FIR Wiener filter and LPC. The optimized modelling of speech data is considered by a special form of the developed cost function. It is demonstrated that the modelling performance of the root cepstral method is superior to that of the real (magnitude) cepstrum and LPC. A novel method of model order identification for use with time domain modelling methods based around z-plane root cepstral plots is also developed and discussed. It is demonstrated that the positions of a model or plant's poles and zeros may be determined by visual inspection of the resulting z-plane plot. However, performance in noise was poorer than that of LPC, leading to difficulties when trying to determine the model's order. Finally, an investigation into the poor phase modelling performance of the algorithm when modelling signals comprised of multiple excitations is presented.
It is demonstrated that all DFT/FFT based analysis techniques are fundamentally flawed due to discontinuities. As a consequence, a simple pre-filtering algorithm is presented as a possible solution.
Published: 2003

7. Adaptive resonance theory : theory and application to synthetic aperture radar

Author: Saddington, P.
Subjects: 006, Artificial intelligence
Abstract: Artificial Neural Networks are massively parallel systems that are constructed from many simple processing elements called neurons. The neurons are connected via weights. This structure is inspired by the current understanding of how biological networks function. Since the 1980s, research into this field has exploded into the hive of activity that currently surrounds neural networks and intelligent systems. The work in this thesis is concerned with one particular artificial neural network: Adaptive Resonance Theory (ART). It is an unsupervised neural network that attempts to solve the stability-plasticity dilemma. The model is, however, limited by a few serious problems that restrict its use in real life situations. The network's ability to cluster consistently with uncorrupt inputs when the input is subject to even modest amounts of noise is severely handicapped. The work detailed herein attempts to improve on ART's behaviour towards noisy inputs. Novel equations are developed and described that improve on the network's performance when the system is subject to noisy inputs. One of the novel equations affecting vigilance makes a significant improvement over the originators' equations and can cope with 16% target noise before results fall to the same values as the standard equation. The novel work is tested using a real-life (not simulated) data set from the MSTAR database. Synthetic Aperture Radar targets are clustered and then subject to noise before being represented to the network. These data simulate a typical environment where a clustering or classifying module would be needed for object recognition. Such a module could then be used in an Automatic Target Recognition (ATR) system. Once the problem is mitigated, Adaptive Resonance Theory neural networks could play important roles in ATR systems due to its lack of computational complexity and low memory requirements when compared with other clustering techniques. 
Keywords: Adaptive Resonance Theory, clustering consistency, neural network, automatic target recognition, noisy inputs.
Published: 2002
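
The abstract of result 7 refers to ART's vigilance parameter without describing the clustering step. As background, here is a minimal sketch of a simplified, fast-learning ART1-style clusterer for binary inputs. It follows the commonly described textbook scheme (choice function, vigilance test, prototype intersection) and is an assumption for illustration only: it does not reproduce the novel vigilance equations of the thesis, and the function name, parameter values and patterns are hypothetical.

```python
def art1_cluster(patterns, vigilance=0.6, beta=1.0):
    """Cluster binary tuples; return (labels, prototypes)."""
    prototypes = []
    labels = []
    for pattern in patterns:
        norm = sum(pattern)

        def overlap(j):
            # Number of bits set in both the input and prototype j.
            return sum(a & b for a, b in zip(pattern, prototypes[j]))

        # Rank existing categories by the choice function, best first.
        order = sorted(range(len(prototypes)),
                       key=lambda j: -overlap(j) / (beta + sum(prototypes[j])))
        chosen = None
        for j in order:
            # Vigilance test: fraction of the input explained by the prototype.
            if norm and overlap(j) / norm >= vigilance:
                # Fast learning: prototype becomes its intersection with input.
                prototypes[j] = tuple(a & b for a, b in zip(pattern, prototypes[j]))
                chosen = j
                break
        if chosen is None:
            # No category passed the test, so commit a new one.
            prototypes.append(tuple(pattern))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes

# Hypothetical patterns: two clean inputs and a version of each with a bit dropped.
patterns = [(1, 1, 1, 0, 0, 0), (0, 0, 0, 1, 1, 1),
            (1, 1, 0, 0, 0, 0), (0, 0, 0, 1, 1, 0)]
labels, prototypes = art1_cluster(patterns)

# An input with a spurious extra bit: accepted at low vigilance, split off at high.
noisy = [(1, 1, 1, 0, 0, 0), (1, 1, 1, 1, 0, 0)]
loose_labels, _ = art1_cluster(noisy, vigilance=0.7)
strict_labels, _ = art1_cluster(noisy, vigilance=0.9)
```

The last two calls show how a single flipped bit can push an input into a new category once vigilance is high, which is the sensitivity to noise that the abstract sets out to reduce.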

8. Decision tree simplification for classifier ensembles

Author: Ardeshir, G.
Subjects: 006, Pruning
Abstract: Design of ensemble classifiers involves three factors: 1) a learning algorithm to produce a classifier (base classifier), 2) an ensemble method to generate diverse classifiers, and 3) a combining method to combine decisions made by base classifiers. With regard to the first factor, a good choice for constructing a classifier is a decision tree learning algorithm. However, a possible problem with this learning algorithm is its complexity which has only been addressed previously in the context of pruning methods for individual trees. Furthermore, the ensemble method may require the learning algorithm to produce a complex classifier. Considering the fact that performance of simplification methods as well as ensemble methods changes from one domain to another, our main contribution is to address a simplification method (post-pruning) in the context of ensemble methods including Bagging, Boosting and Error-Correcting Output Code (ECOC). Using a statistical test, the performance of ensembles made by Bagging, Boosting and ECOC as well as five pruning methods in the context of ensembles is compared. In addition to the implementation a supporting theory called Margin, is discussed and the relationship of Pruning to bias and variance is explained. For ECOC, the effect of parameters such as code length and size of training set on performance of Pruning methods is also studied. Decomposition methods such as ECOC are considered as a solution to reduce complexity of multi-class problems in many real problems such as face recognition. Focusing on the decomposition methods, AdaBoost.OC which is a combination of Boosting and ECOC is compared with the pseudo-loss based version of Boosting, AdaBoost.M2. In addition, the influence of pruning on the performance of ensembles is studied. Motivated by the result that both pruned and unpruned ensembles made by AdaBoost.OC have similar accuracy, pruned ensembles are compared with ensembles of single node decision trees. 
This results in the hypothesis that ensembles of simple classifiers may give better performance as shown for AdaBoost.OC on the identification problem in face recognition. The implication is that in some problems to achieve best accuracy of an ensemble, it is necessary to select base classifier complexity.
Published: 2002

9. Automatic architecture selection for probability density function estimation in computer vision

Author: Sadeghi, Mohammad T.
Subjects: 006, Gaussian mixture modelling
Abstract: In this thesis, the problem of probability density function estimation using finite mixture models is considered. Gaussian mixture modelling is used to provide a semi-parametric density estimate for a given data set. The fundamental problem with this approach is that the number of mixtures required to adequately describe the data is not known in advance. In this work, a predictive validation technique [91] is studied and developed as a useful, operational tool that automatically selects the number of components for Gaussian mixture models. The predictive validation test approves a candidate model if, for the set of events they try to predict, the predicted frequencies derived from the model match the empirical ones derived from the data set. A model selection algorithm, based on the validation test, is developed which prevents both problems of over-fitting and under-fitting. We investigate the influence of the various parameters in the model selection method in order to develop it into a robust operational tool. The capability of the proposed method in real world applications is examined on the problem of face image segmentation for automatic initialisation of lip tracking systems. A segmentation approach is proposed which is based on Gaussian mixture modelling of the pixels RGB values using the predictive validation technique. The lip region segmentation is based on the estimated model. First a grouping of the model components is performed using a novel approach. The resulting groups are then the basis of a Bayesian decision making system which labels the pixels in the mouth area as lip or non-lip. The experimental results demonstrate the superiority of the method over the conventional clustering approaches. In order to improve the method computationally an image sampling technique is applied which is based on Sobol sequences. 
Also, the image modelling process is strengthened by incorporating spatial contextual information using two different methods, a Neighbourhood Expectation Maximisation technique and a spatial clustering method based on a Gibbs/Markov random field modelling approach. Both methods are developed within the proposed modelling framework. The results obtained on the lip segmentation application suggest that spatial context is beneficial.
Published: 2002

10. Proper name knowledge acquisition for text understanding

Author: Guo, Runli
Subjects: 006, Proper names
Published: 2002
