13 results for "Aria Pezeshk"
Search Results
2. Reducing overfitting of a deep learning breast mass detection algorithm in mammography using synthetic images
- Author
- Aria Pezeshk, Kenny H. Cha, Andreu Badal, Diksha Sharma, Berkman Sahiner, Christian G. Graff, Nicholas Petrick, and Aldo Badano
- Subjects
Training set, Computer science, Breast imaging, Deep learning, Overfitting, Data set, Breast cancer, Mammography, Artificial intelligence, Algorithm
- Abstract
We evaluated whether using synthetic mammograms for training data augmentation may reduce the effects of overfitting and increase the performance of a deep learning algorithm for breast mass detection. Synthetic mammograms were generated using a combination of an in-silico random breast generation algorithm and x-ray transport simulation. In-silico breast phantoms containing masses were modeled across the four BI-RADS breast density categories, and the masses were modeled with different sizes, shapes, and margins. A Monte Carlo-based x-ray transport simulation code, MC-GPU, was used to project the 3D phantoms into realistic synthetic mammograms. A training data set of 2,000 synthetic mammograms containing 2,522 masses was generated and used to augment a data set of real mammograms for training. The data set of real mammograms included all the masses in the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) and consisted of 1,112 mammograms (1,198 masses) for training, 120 mammograms (120 masses) for validation, and 361 mammograms (378 masses) for testing. Our deep learning network was Faster R-CNN with a ResNet-101 backbone pre-trained on ImageNet. We compared the detection performance when the network was trained using only the CBIS-DDSM training images and when subsets of the training set were augmented with 250, 500, 1,000, and 2,000 synthetic mammograms. FROC analysis was performed to compare performance with and without the synthetic mammograms. Our study showed that enlarging the training data with synthetic mammograms shows promise in reducing overfitting, and that the inclusion of synthetic images in training increased the performance of the deep learning algorithm for mass detection on mammograms. (An illustrative code sketch follows this record.)
- Published
- 2019
- Full Text
- View/download PDF
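The FROC analysis named in the abstract above can be illustrated with a minimal sketch, assuming detections have already been pooled over the test set and labeled as true or false positives. The data, counts, and function names below are illustrative, not the authors' code.
```python
# Minimal FROC sketch (illustrative, not the paper's code): sweep the score
# threshold from high to low and trace sensitivity against mean FPs per image.
import numpy as np

def froc_points(scores, is_tp, n_true_masses, n_images):
    order = np.argsort(scores)[::-1]             # highest score first
    hits = np.asarray(is_tp, dtype=float)[order]
    tp = np.cumsum(hits)                         # true positives so far
    fp = np.cumsum(1.0 - hits)                   # false positives so far
    return fp / n_images, tp / n_true_masses

# Fake detections; 361 test images and 378 masses echo the CBIS-DDSM split.
rng = np.random.default_rng(0)
scores = rng.uniform(size=500)
is_tp = rng.uniform(size=500) < 0.5
fppi, sensitivity = froc_points(scores, is_tp, n_true_masses=378, n_images=361)
```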
3. Test data reuse for evaluation of adaptive machine learning algorithms: over-fitting to a fixed 'test' dataset and a potential solution
- Author
- Alexej Gossmann, Berkman Sahiner, and Aria Pezeshk
- Subjects
Computer science, Generalization, Population, Overfitting, Reuse, Machine learning, Medical test, Artificial intelligence, Algorithm, Performance metric, Test data
- Abstract
After the initial release of a machine learning algorithm, the subsequently gathered data can be used to augment the training dataset in order to modify or fine-tune the algorithm. For algorithm performance evaluation that generalizes to a targeted population of cases, ideally, test datasets randomly drawn from the targeted population are used. To ensure that test results generalize to new data, the algorithm needs to be evaluated on new and independent test data each time a new performance evaluation is required. However, medical test datasets of sufficient quality are often hard to acquire, and it is tempting to utilize a previously used test dataset for a new performance evaluation. With extensive simulation studies, we illustrate how such a "naive" approach to test data reuse can inadvertently result in overfitting the algorithm to the test data, even when only a global performance metric is reported back from the test dataset. The overfitting behavior leads to a loss in generalization and overly optimistic conclusions about the algorithm performance. We investigate the use of the Thresholdout method of Dwork et al. (Ref. 1) to tackle this problem. Thresholdout allows repeated reuse of the same test dataset. It essentially reports a noisy version of the performance metric on the test data, and provides theoretical guarantees on how many times the test dataset can be accessed to ensure generalization of the reported answers to the underlying distribution. With extensive simulation studies, we show that Thresholdout indeed substantially reduces the problem of overfitting to the test data under the simulation conditions, at the cost of a mild additional uncertainty on the reported test performance. We also extend some of the theoretical guarantees to the area under the ROC curve as the reported performance metric. (An illustrative code sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
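Thresholdout itself is compact enough to sketch. The following is a simplified rendering of the mechanism from Dwork et al., with illustrative threshold and noise settings rather than the configuration used in the paper.
```python
# Simplified Thresholdout sketch: answer each adaptive query from the
# training set unless it disagrees with the holdout beyond a noisy threshold,
# in which case return a noise-perturbed holdout value.
import numpy as np

rng = np.random.default_rng(1)

def thresholdout(train_est, holdout_est, threshold=0.04, sigma=0.01):
    if abs(train_est - holdout_est) > threshold + rng.laplace(0.0, sigma):
        return holdout_est + rng.laplace(0.0, sigma)
    return train_est

# Example query: a possibly overfit training estimate vs. the holdout estimate.
print(thresholdout(train_est=0.83, holdout_est=0.76))
```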
4. A database for assessment of effect of lossy compression on digital mammograms
- Author
- Jiheng Wang, Aria Pezeshk, Nicholas Petrick, and Berkman Sahiner
- Subjects
Lossless compression, Digital mammography, Database, Computer science, Image quality, Mammography Quality Standards Act, Lossy compression, JPEG, JPEG 2000, Image compression
- Abstract
With widespread use of screening digital mammography, efficient storage of the vast amounts of data has become a challenge. While lossless image compression causes no risk to the interpretation of the data, it does not allow for high compression rates. Lossy compression and the associated higher compression ratios are therefore more desirable. The U.S. Food and Drug Administration (FDA) currently interprets the Mammography Quality Standards Act as prohibiting lossy compression of digital mammograms for primary image interpretation, image retention, or transfer to the patient or her designated recipient. Previous work has used reader studies to determine proper usage criteria for evaluating lossy image compression in mammography, and utilized different measures and metrics to characterize medical image quality. The drawback of such studies is that they rely on a threshold on compression ratio as the fundamental criterion for preserving the quality of images. However, compression ratio is not a useful indicator of image quality. On the other hand, many objective image quality metrics (IQMs) have shown excellent performance for natural image content in consumer electronics applications. In this paper, we create a new synthetic mammogram database with several unique features. We compare and characterize the impact of image compression on several clinically relevant image attributes, such as perceived contrast and mass appearance, for different kinds of masses. We plan to use this database to develop a new objective IQM for measuring the quality of compressed mammographic images to help determine the maximum allowed compression for different kinds of breasts and masses in terms of visual and diagnostic quality. (An illustrative code sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
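As a generic illustration of scoring compressed images with objective IQMs, the sketch below compresses an image with JPEG at several quality settings and reports the compression ratio, PSNR, and SSIM. PSNR and SSIM are stand-ins for the metric the authors plan to develop, and the random image is a placeholder for a database entry.
```python
# Generic IQM scoring sketch: JPEG-compress an image, then measure PSNR/SSIM.
import io
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(2)
original = (rng.uniform(size=(256, 256)) * 255).astype(np.uint8)  # placeholder

for quality in (90, 50, 10):
    buf = io.BytesIO()
    Image.fromarray(original).save(buf, format="JPEG", quality=quality)
    compressed = np.asarray(Image.open(buf))
    ratio = original.nbytes / buf.getbuffer().nbytes
    print(f"quality={quality}: ratio={ratio:.1f}:1, "
          f"PSNR={peak_signal_noise_ratio(original, compressed):.1f} dB, "
          f"SSIM={structural_similarity(original, compressed):.3f}")
```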
5. Towards the use of computationally inserted lesions for mammographic CAD assessment
- Author
- Berkman Sahiner, Zahra Ghanian, Aria Pezeshk, and Nicholas Petrick
- Subjects
Digital mammography, Receiver operating characteristic, Image detector, Computer science, CAD, Pattern recognition, Breast cancer, Microcalcification clusters, Artificial intelligence, Area under the ROC curve
- Abstract
Computer-aided detection (CADe) devices used for breast cancer detection on mammograms are typically first developed and assessed for a specific "original" acquisition system, e.g., a specific image detector. When CADe developers are ready to apply their CADe device to a new mammographic acquisition system, they typically assess the CADe device with images acquired using the new system. Collecting large repositories of clinical images containing verified cancer locations and acquired by the new image acquisition system is costly and time-consuming. Our goal is to develop a methodology to reduce the clinical data burden in the assessment of a CADe device for use with a different image acquisition system. We are developing an image blending technique that allows users to seamlessly insert lesions imaged using an original acquisition system into normal images or regions acquired with a new system. In this study, we investigated the insertion of microcalcification clusters imaged using an original acquisition system into normal images acquired with that same system, utilizing our previously developed image blending technique. We first performed a reader study to assess whether experienced observers could distinguish between computationally inserted and native clusters. For this purpose, we applied our insertion technique to clinical cases taken from the University of South Florida Digital Database for Screening Mammography (DDSM) and the Breast Cancer Digital Repository (BCDR). Regions of interest containing microcalcification clusters from one breast of a patient were inserted into the contralateral breast of the same patient. The reader study included 55 native clusters and their 55 inserted counterparts. Analysis of the reader ratings using receiver operating characteristic (ROC) methodology indicated that inserted clusters cannot be reliably distinguished from native clusters (area under the ROC curve, AUC = 0.58 ± 0.04). Furthermore, CADe sensitivity was evaluated on mammograms with native and inserted microcalcification clusters using a commercial CADe system. For this purpose, we used full-field digital mammograms (FFDMs) from 68 clinical cases acquired at the University of Michigan Health System. The average sensitivities for native and inserted clusters were equal at 85.3% (58/68). These results demonstrate the feasibility of using inserted microcalcification clusters for assessing mammographic CAD devices. (An illustrative code sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
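A toy rendition of the reader-study analysis: the AUC for discriminating inserted from native clusters, with a bootstrap standard error. The ratings are simulated near chance; none of this is the study's data.
```python
# Toy reader-study analysis: near-chance ratings for 55 native (0) and
# 55 inserted (1) clusters; AUC with a bootstrap standard error.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
labels = np.repeat([0, 1], 55)
ratings = rng.normal(loc=0.2 * labels, scale=1.0)   # simulated reader scores

auc = roc_auc_score(labels, ratings)
boot = []
for _ in range(2000):
    i = rng.integers(0, len(labels), len(labels))   # resample cases
    boot.append(roc_auc_score(labels[i], ratings[i]))
print(f"AUC = {auc:.2f} +/- {np.std(boot):.2f}")
```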
6. 3D convolutional neural network for automatic detection of lung nodules in chest CT
- Author
- Sardar Hamidian, Nicholas Petrick, Aria Pezeshk, and Berkman Sahiner
- Subjects
Artificial neural network, Computer science, Deep learning, Image segmentation, Convolutional neural network, Sliding window, Segmentation, Computer vision, Artificial intelligence
- Abstract
Deep convolutional neural networks (CNNs) form the backbone of many state-of-the-art computer vision systems for classification and segmentation of 2D images. The same principles and architectures can be extended to three dimensions to obtain 3D CNNs that are suitable for volumetric data such as CT scans. In this work, we train a 3D CNN for automatic detection of pulmonary nodules in chest CT images using volumes of interest extracted from the LIDC dataset. We then convert the 3D CNN, which has a fixed field of view, to a 3D fully convolutional network (FCN), which can generate the score map for the entire volume efficiently in a single pass. Compared to the sliding window approach for applying a CNN across the entire input volume, the FCN leads to a nearly 800-fold speed-up, and thereby fast generation of output scores for a single case. This screening FCN is used to generate difficult negative examples that are used to train a new discriminant CNN. The overall system consists of the screening FCN for fast generation of candidate regions of interest, followed by the discriminant CNN. (An illustrative code sketch follows this record.)
- Published
- 2017
- Full Text
- View/download PDF
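The CNN-to-FCN conversion is a standard trick that can be sketched in PyTorch. The toy architecture below is a stand-in, not the paper's network: the dense classification layer of a fixed-field-of-view 3D CNN is written as a convolution, so the same weights can emit a score map for a whole sub-volume in one pass.
```python
# CNN-to-FCN sketch: a 3D classifier for 24^3 patches whose "dense" head is
# a convolution, applied unchanged to a 64^3 volume for a dense score map.
import torch
import torch.nn as nn

features = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=5), nn.ReLU(),
    nn.MaxPool3d(2),
    nn.Conv3d(16, 32, kernel_size=5), nn.ReLU(),
)
head = nn.Conv3d(32, 2, kernel_size=6)   # dense layer in convolutional form

with torch.no_grad():
    patch_scores = head(features(torch.randn(1, 1, 24, 24, 24)))
    volume_scores = head(features(torch.randn(1, 1, 64, 64, 64)))
print(patch_scores.shape)    # -> torch.Size([1, 2, 1, 1, 1]), one patch
print(volume_scores.shape)   # -> torch.Size([1, 2, 21, 21, 21]), score map
```
Because of the pooling layer, the score map samples the sliding-window positions at a stride of two voxels; the speed-up comes from sharing convolutional work across overlapping windows.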
7. Semi-parametric estimation of the area under the precision-recall curve
- Author
- Aria Pezeshk, Berkman Sahiner, Nicholas Petrick, and Weijie Chen
- Subjects
Receiver operating characteristic, Estimator, Confidence interval, Semiparametric model, Delta method, Binary classification, Sample size determination, Statistics, Precision and recall, Mathematics
- Abstract
Precision and recall are two common metrics used in the evaluation of information retrieval systems. By changing the number of retrieved documents, one can obtain a precision-recall curve. The area under the precision-recall curve (AUCPR) has been suggested as a performance measure for information retrieval systems, in a manner similar to the use of the area under the receiver operating characteristic curve in binary classification. Limited work has been performed in the literature to investigate the bias and variance of AUCPR estimators. The goal of our study was to investigate the bias and variability of a semi-parametric binormal method for estimating the AUCPR, and to compare it to other techniques, such as the average precision (AP) and the lower trapezoid (LT) approximation. We show how the AUCPR can be obtained given the binormal model parameters, and how its variance can be estimated using the delta method. We performed simulation experiments with normal and non-normal data, and investigated the effect of sample size and prevalence. Our results indicated that the semi-parametric binormal approach provided AUCPR estimates with small bias and confidence intervals with acceptable coverage when the sample size was large, and that the performance of the binormal model was comparable to or better than the alternative methods evaluated in this study when the sample size was small. We conclude that the semi-parametric binormal model can be used to accurately estimate the AUCPR, and that the confidence intervals derived from the model can be at least as accurate as those from other alternatives, even for non-normal decision variable distributions. (An illustrative code sketch follows this record.)
- Published
- 2016
- Full Text
- View/download PDF
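Given fitted binormal parameters, the AUCPR can be obtained by integrating precision over recall. The sketch below does this numerically for made-up parameter values; the paper's fitting procedure and delta-method variance are not reproduced here.
```python
# AUCPR from a binormal ROC model, TPF = Phi(a + b * Phi^{-1}(FPF)),
# integrated numerically over recall. Parameter values are illustrative.
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

def binormal_aucpr(a, b, prevalence, n_grid=200_000):
    fpf = np.linspace(1e-12, 1.0 - 1e-12, n_grid)
    tpf = norm.cdf(a + b * norm.ppf(fpf))          # recall at each threshold
    precision = prevalence * tpf / (prevalence * tpf + (1 - prevalence) * fpf)
    return trapezoid(precision, tpf)               # integrate d(recall)

print(binormal_aucpr(a=1.5, b=1.0, prevalence=0.3))
```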
8. Seamless lesion insertion in digital mammography: methodology and reader study
- Author
- Aria Pezeshk, Nicholas Petrick, and Berkman Sahiner
- Subjects
Ground truth, Poisson image editing, Digital mammography, Receiver operating characteristic, Computer science, Screening mammography, Cancer, Digital mammogram, Lesion, Mammography
- Abstract
Collection of large repositories of clinical images containing verified cancer locations is costly and time-consuming due to difficulties associated with both the accumulation of data and the establishment of the ground truth. This problem poses a significant challenge to the development of machine learning algorithms, which require large amounts of data to train properly and avoid overfitting. In this paper, we expand the methods in our previous publications by making several modifications that significantly increase the speed of our insertion algorithms, thereby allowing them to be used for inserting lesions that are much larger in size. These algorithms have been incorporated into an image composition tool that we have made publicly available. This tool allows users to modify or supplement existing datasets by seamlessly inserting a real breast mass or microcalcification cluster extracted from a source digital mammogram into a different location on another mammogram. We demonstrate examples of the performance of this tool on clinical cases taken from the University of South Florida Digital Database for Screening Mammography (DDSM). Finally, we report the results of a reader study evaluating the realism of inserted lesions compared to clinical lesions. Analysis of the radiologist scores in the study using receiver operating characteristic (ROC) methodology indicates that inserted lesions cannot be reliably distinguished from clinical lesions. (An illustrative code sketch follows this record.)
- Published
- 2016
- Full Text
- View/download PDF
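The subject tags name Poisson image editing, and the idea can be sketched generically: solve the discrete Poisson equation inside the pasted region so that the source's interior gradients are kept while the border matches the target. The rectangular-patch version below is an illustration under those assumptions, not the authors' algorithm.
```python
# Generic gradient-domain ("Poisson") blending sketch for a rectangular patch.
import numpy as np
from scipy.ndimage import laplace
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def poisson_blend(source, target, top, left):
    h, w = source.shape
    out = target.astype(float).copy()
    region = out[top:top + h, left:left + w]
    lap_src = laplace(source.astype(float))   # guidance field (divergence)
    idx = np.arange(h * w).reshape(h, w)
    A = lil_matrix((h * w, h * w))
    b = np.zeros(h * w)
    for i in range(h):
        for j in range(w):
            k = idx[i, j]
            if i in (0, h - 1) or j in (0, w - 1):
                A[k, k] = 1.0                 # border pixel: match the target
                b[k] = region[i, j]
            else:
                A[k, k] = -4.0                # interior: discrete Laplacian
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    A[k, idx[i + di, j + dj]] = 1.0
                b[k] = lap_src[i, j]
    region[:] = spsolve(A.tocsr(), b).reshape(h, w)
    return out

rng = np.random.default_rng(4)
blended = poisson_blend(rng.uniform(size=(16, 16)),
                        rng.uniform(size=(64, 64)), top=20, left=20)
```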
9. Improving CAD performance by seamless insertion of pulmonary nodules in chest CT exams
- Author
- Nicholas Petrick, Aria Pezeshk, Berkman Sahiner, and Weijie Chen
- Subjects
Ground truth, Computer science, Chest CT, CAD, Pattern recognition, Lesion, Computer-aided diagnosis, Artificial intelligence, Simulation
- Abstract
The availability of large medical image datasets is critical in training and testing of computer-aided diagnosis (CAD) systems. However, collection of data and establishment of ground truth for medical images are both costly and difficult. To address this problem, we have developed an image composition tool that allows users to modify or supplement existing datasets by seamlessly inserting a clinical lesion extracted from a source image into a different location on a target image. In this study we focus on the application of this tool to the training of a CAD system designed to detect pulmonary nodules in chest CT. To compare the performance of a CAD system trained without and with the use of our image composition tool, we trained the system on two sets of data. The first training set was obtained from original CT cases, while the second set consisted of the first set plus nodules from the first set inserted into new locations. We then compared the performance of the two CAD systems in differentiating nodules from normal areas by testing each trained system against a fixed dataset containing natural nodules, using the area under the ROC curve (AUC) as the figure of merit. The performance of the system trained with the augmented dataset was found to be significantly better than that of the system trained with the original dataset under several training scenarios. (An illustrative code sketch follows this record.)
- Published
- 2015
- Full Text
- View/download PDF
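A toy version of the experimental design, with synthetic tabular features standing in for nodule candidates and jittered copies standing in for nodules re-inserted at new locations; the paper's CAD features and classifier are not reproduced.
```python
# Toy original-vs-augmented training comparison on a fixed test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

rng = np.random.default_rng(0)
X_aug = np.vstack([X_tr, X_tr + rng.normal(0.0, 0.3, X_tr.shape)])  # "inserts"
y_aug = np.concatenate([y_tr, y_tr])

for name, (Xt, yt) in {"original": (X_tr, y_tr),
                       "augmented": (X_aug, y_aug)}.items():
    clf = RandomForestClassifier(random_state=0).fit(Xt, yt)
    print(name, roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```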
10. CT image quality evaluation for detection of signals with unknown location, size, contrast and shape using unsupervised methods
- Author
- Berkman Sahiner, Aria Pezeshk, and Lucretiu M. Popescu
- Subjects
Radon transform, Image quality, Computer science, Template matching, Pattern recognition, Image processing, Iterative reconstruction, Signal, Computer vision, Artificial intelligence, Image restoration, Feature detection
- Abstract
The advent of new image reconstruction and image processing techniques for CT images has increased the need for robust objective image quality assessment methods. One of the most common quality assessment methods is the measurement of signal detectability for a known signal at a known location using supervised classification techniques. However, this method requires a large number of simulations or physical measurements, and its underlying assumptions may be considered clinically unrealistic. In this study we focus on objective assessment of image quality in terms of detection of a signal with unknown location, size, shape, and contrast. We explore several unsupervised saliency detection methods, which assume no knowledge about the signal, along with a template matching technique, which uses information about the signal's size and shape in the object domain, for simulated phantoms that have been reconstructed using filtered back projection (FBP) and iterative reconstruction algorithms (IRAs). The performance of each of the image reconstruction algorithms is then measured using the area under the localization receiver operating characteristic curve (LROC) and the exponential transformation of the free-response operating characteristic curve (EFROC). Our results indicate that unsupervised saliency detection methods can be effectively used to determine image quality in terms of signal detectability for unknown signals, given only a small number of sample images. (An illustrative code sketch follows this record.)
- Published
- 2015
- Full Text
- View/download PDF
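The template-matching arm can be sketched with normalized cross-correlation; the unsupervised saliency detectors are not reproduced here, and the signal shape, contrast, and noise below are stand-ins.
```python
# Template-matching detection sketch: the NCC peak is the detection score,
# and its position is the localization guess for LROC-style scoring.
import numpy as np
from skimage.feature import match_template

rng = np.random.default_rng(5)
image = rng.normal(size=(128, 128))                # stand-in noisy slice
signal = np.outer(np.hanning(9), np.hanning(9))    # stand-in blob template
image[60:69, 40:49] += 2.0 * signal                # embed at an unknown spot

ncc = match_template(image, signal, pad_input=True)
score = ncc.max()                                  # rating for the image
row, col = np.unravel_index(ncc.argmax(), ncc.shape)
print(score, (row, col))
```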
11. Comparison of two stand-alone CADe systems at multiple operating points
- Author
- Nicholas Petrick, Aria Pezeshk, Weijie Chen, and Berkman Sahiner
- Subjects
False discovery rate, Operating point, Bonferroni correction, Computer science, Multiple comparisons problem, Algorithm, Simulation, Statistical hypothesis testing, Type I and type II errors
- Abstract
Computer-aided detection (CADe) systems are typically designed to work at a given operating point: the device displays a mark if and only if the level of suspiciousness of a region of interest is above a fixed threshold. To compare the standalone performances of two systems, one approach is to select the parameters of the systems to yield a target false-positive rate that defines the operating point, and to compare the sensitivities at that operating point. Increasingly, CADe developers offer multiple operating points, which turns the comparison of two CADe systems into a multiple-comparison problem. To control the Type I error, multiple-comparison correction is needed to keep the family-wise error rate (FWER) below a given alpha level. The sensitivities of a single modality at different operating points are correlated. In addition, the sensitivities of the two modalities at the same or different operating points are also likely to be correlated. It has been shown in the literature that when test statistics are correlated, well-known methods for controlling the FWER are conservative. In this study, we compared the FWER and power of three methods, namely the Bonferroni, step-up, and adjusted step-up methods, in comparing the sensitivities of two CADe systems at multiple operating points, where the adjusted step-up method uses the estimated correlations. Our results indicate that the adjusted step-up method has a substantial advantage over the other two methods, both in terms of the FWER and power. (An illustrative code sketch follows this record.)
- Published
- 2015
- Full Text
- View/download PDF
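For reference, a sketch of the two generic procedures in the comparison, Bonferroni and a Hochberg-type step-up, applied to p-values from per-operating-point sensitivity comparisons. The paper's adjusted step-up method additionally uses the estimated correlations, which this sketch does not attempt.
```python
# Bonferroni vs. step-up (Hochberg) rejection decisions at level alpha.
import numpy as np

def bonferroni(pvals, alpha=0.05):
    return np.asarray(pvals) <= alpha / len(pvals)

def hochberg_step_up(pvals, alpha=0.05):
    p = np.asarray(pvals)
    order = np.argsort(p)                       # ascending p-values
    m = len(p)
    reject = np.zeros(m, dtype=bool)
    for rank in range(m, 0, -1):                # step up from the largest p
        if p[order[rank - 1]] <= alpha / (m - rank + 1):
            reject[order[:rank]] = True
            break
    return reject

pvals = [0.009, 0.013, 0.040, 0.200]            # one p per operating point
print(bonferroni(pvals))                        # [ True False False False]
print(hochberg_step_up(pvals))                  # [ True  True False False]
```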
12. Investigation of methods for calibration of classifier scores to probability of disease
- Author
- Weijie Chen, Aria Pezeshk, Berkman Sahiner, Frank W. Samuelson, and Nicholas Petrick
- Subjects
Bayes' theorem, Brier score, Receiver operating characteristic, Mean squared error, Sample size determination, Nonparametric statistics, Isotonic regression, Pattern recognition, Artificial intelligence, Classifier, Mathematics
- Abstract
Classifier scores in many diagnostic devices, such as computer-aided diagnosis systems, are usually on an arbitrary scale, the meaning of which is unclear. Calibration of classifier scores to a meaningful scale such as the probability of disease is potentially useful when such scores are used by a physician or another algorithm. In this work, we investigated the properties of two methods for calibrating classifier scores to probability of disease. The first is a semiparametric method in which the likelihood ratio for each score is estimated based on a semiparametric proper receiver operating characteristic model, and an estimate of the probability of disease is then obtained using Bayes' theorem, assuming a known prevalence of disease. The second method is nonparametric, in which isotonic regression via the pool-adjacent-violators algorithm is used. We employed the mean square error (MSE) and the Brier score to evaluate the two methods. We evaluated the methods under two paradigms: (a) the dataset used to construct the score-to-probability mapping function is also used to calculate the performance metric (MSE or Brier score) (resubstitution); (b) an independent test dataset is used to calculate the performance metric (independent). Under our simulation conditions, the semiparametric method was found to be superior to the nonparametric method at small to medium sample sizes, and the two methods appeared to converge at large sample sizes. Our simulation results also indicate that the resubstitution bias may depend on the performance metric and that, for the semiparametric method, the resubstitution bias is small when a reasonable number of cases (> 100 cases per class) are available. (An illustrative code sketch follows this record.)
- Published
- 2015
- Full Text
- View/download PDF
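A sketch of the nonparametric arm: isotonic regression (the pool-adjacent-violators solution, here via scikit-learn) mapping simulated classifier scores to probability of disease, evaluated with resubstitution MSE against the generating probabilities and with the Brier score. The semiparametric ROC-based method is not reproduced, and the simulation settings are made up.
```python
# Isotonic (PAV) score-to-probability calibration on simulated data.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(6)
scores = rng.normal(size=2000)                 # arbitrary classifier scores
p_true = 1.0 / (1.0 + np.exp(-1.5 * scores))   # true P(disease | score)
y = (rng.uniform(size=2000) < p_true).astype(float)

iso = IsotonicRegression(out_of_bounds="clip").fit(scores, y)
p_hat = iso.predict(scores)                    # resubstitution mapping
print("MSE vs. truth:", np.mean((p_hat - p_true) ** 2))
print("Brier score:", brier_score_loss(y, p_hat))
```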
13. Seamless insertion of real pulmonary nodules in chest CT exams
- Author
- Adam Wunderlich, Berkman Sahiner, Aria Pezeshk, Weijie Chen, Rongping Zeng, and Nicholas Petrick
- Subjects
Ground truth, Poisson image editing, Computer science, Nodule (medicine), Image segmentation, Computer-aided diagnosis, Segmentation, Computer vision, Artificial intelligence
- Abstract
The availability of large medical image datasets is critical in many applications, such as training and testing of computer-aided diagnosis (CAD) systems, evaluation of segmentation algorithms, and conducting perceptual studies. However, collection of large repositories of clinical images is hindered by the high cost and difficulties associated with both the accumulation of data and establishment of the ground truth. To address this problem, we are developing an image blending tool that allows users to modify or supplement existing datasets by seamlessly inserting a real lesion extracted from a source image into a different location on a target image. In this study we focus on the application of this tool to pulmonary nodules in chest CT exams. We minimize the impact of user skill on the perceived quality of the blended image by limiting user involvement to two simple steps: the user first draws a rough boundary around the nodule of interest in the source image, and then selects the center of the desired insertion area in the target image. We demonstrate examples of the performance of the proposed system on samples taken from the Lung Image Database Consortium (LIDC) dataset, and compare the noise power spectrum (NPS) of blended nodules versus that of native nodules in simulated phantoms. (An illustrative code sketch follows this record.)
- Published
- 2014
- Full Text
- View/download PDF
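The NPS comparison can be illustrated with a generic periodogram estimator over an ensemble of mean-subtracted flat ROIs; the normalization convention and ROI choices below are generic, not the paper's.
```python
# 2D NPS estimate (mean periodogram) from a stack of noise-only ROIs.
import numpy as np

def nps_2d(rois, pixel_size=1.0):
    """rois: (n, h, w) stack of flat patches; returns the mean periodogram."""
    n, h, w = rois.shape
    centered = rois - rois.mean(axis=(1, 2), keepdims=True)
    dft = np.fft.fftshift(np.fft.fft2(centered), axes=(-2, -1))
    return (np.abs(dft) ** 2).mean(axis=0) * pixel_size ** 2 / (h * w)

rng = np.random.default_rng(7)
patches = rng.normal(size=(50, 64, 64))        # stand-in noise-only ROIs
print(nps_2d(patches).shape)                   # -> (64, 64)
```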