483 results on '"Homer, H"'
Search Results
2. Time-Division Multiplexing Light Field Display With Learned Coded Aperture
- Author
-
Chun-Hao Chao, Chang-Le Liu, and Homer H. Chen
- Subjects
Computer Graphics and Computer-Aided Design ,Software
- Published
- 2023
3. A novel birdbath eyepiece for light field AR glasses
- Author
-
Chao Chien Wu, Kuang-Tsu Shih, Jiun-Woei Huang, and Homer H. Chen
- Published
- 2023
4. High resolution full-field optical coherence tomography microscope for the evaluation of freshly excised skin specimens during Mohs surgery: A feasibility study
- Author
-
Manu Jain, Shu-Wen Chang, Kiran Singh, Nicholas R. Kurtansky, Sheng-Lung Huang, Homer H. Chen, and Chih-Shan Jason Chen
- Abstract
Histopathology for tumor margin assessment is time-consuming and expensive. High-resolution full-field optical coherence tomography (FF-OCT) images fresh tissues rapidly at cellular resolution and potentially facilitates evaluation. Here, we define FF-OCT features of normal and neoplastic skin lesions in fresh ex vivo tissues and assess its diagnostic accuracy for malignancies. For this, normal and neoplastic tissues were obtained from Mohs surgery, imaged using FF-OCT, and their features were described. Two expert OCT readers conducted a blinded analysis to evaluate their diagnostic accuracies, using histopathology as the ground truth. A convolutional neural network was built to distinguish and outline normal structures and tumors. Of the 113 tissues imaged, 95 (84%) had a tumor (75 BCCs and 17 SCCs). The average reader diagnostic accuracy was 88.1%, with a sensitivity of 93.7% and a specificity of 58.3%. The AI model achieved a diagnostic accuracy of 87.6%±5.9%, sensitivity of 93.2%±2.1%, and specificity of 81.2%±9.2%. A mean intersection-over-union of 60.3%±10.1% was achieved when delineating the nodular BCC from normal structures. A limitation of the study was the small sample size for all tumors, especially SCCs. However, based on our preliminary results, we envision FF-OCT to rapidly image fresh tissues, facilitating surgical margin assessment. AI algorithms can aid in automated tumor detection, enabling widespread adoption of this technique.
- Published
- 2023
5. Identification of Sex and Age from Macular Optical Coherence Tomography and Feature Analysis Using Deep Learning
- Author
-
Homer H. Chen, Yi-Ting Hsieh, I-Hsin Ma, Kuan-Ming Chueh, and Sheng-Lung Huang
- Subjects
medicine.medical_specialty ,genetic structures ,Fundus Oculi ,Age prediction ,Deep Learning ,Optical coherence tomography ,Foveal ,Ophthalmology ,medicine ,Humans ,Macula Lutea ,Child ,Retina ,medicine.diagnostic_test ,business.industry ,Deep learning ,Fundus photography ,eye diseases ,Cross-Sectional Studies ,medicine.anatomical_structure ,Child, Preschool ,Population study ,sense organs ,Artificial intelligence ,Choroid ,business ,Tomography, Optical Coherence
- Abstract
PURPOSE To develop deep learning models for identification of sex and age from macular optical coherence tomography (OCT), and to analyze the features for differentiation of sex and age. DESIGN Algorithm development using database of macular OCT. SETTING One eye center in Taiwan. STUDY POPULATION 6147 sets of macular optical coherence tomography (OCT) images from the healthy eyes of 3134 persons. MAIN OUTCOME MEASURES Deep learning based algorithms were used to develop models for identification of sex and age, and 10-fold cross-validation was applied. Gradient-weighted class activation mapping (Grad-CAM) was used for feature analysis. RESULTS The accuracy for sex prediction using deep learning from macular OCT was 85.6±2.1%, compared to the accuracy of 61.9% by using macular thickness and 61.4±4.0% by using deep learning from infrared fundus photography (P
- Published
- 2022
6. Efficient and Accurate Stitching for 360° Dual-Fisheye Images and Videos
- Author
-
Kuang-Tsu Shih, I-Chan Lo, and Homer H. Chen
- Subjects
Artifact (error) ,Computer science ,business.industry ,Distortion (optics) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Graphics and Computer-Aided Design ,Compensation (engineering) ,Image stitching ,Seam carving ,Calibration ,Trajectory ,Computer vision ,Artificial intelligence ,business ,Software ,ComputingMethodologies_COMPUTERGRAPHICS ,Jitter - Abstract
Back-to-back dual-fisheye cameras are the most cost-effective devices to capture 360° visual content. However, image and video stitching for such cameras often suffer from the effect of fisheye distortion, photometric inconsistency between the two views, and non-collocated optical centers. In this paper, we present algorithms for geometric calibration, photometric compensation, and seamless stitching to address these issues for back-to-back dual-fisheye cameras. Specifically, we develop a co-centric trajectory model for geometric calibration to characterize both intrinsic and extrinsic parameters of the fisheye camera to fifth-order precision, a photometric correction model for intensity and color compensation to provide efficient and accurate local color transfer, and a mesh deformation model along with an adaptive seam carving method for image stitching to reduce geometric distortion and ensure optimal spatiotemporal alignment. The stitching algorithm and the compensation algorithm can run efficiently for 1920×960 images. Quantitative evaluation of geometric distortion, color discontinuity, jitter, and ghost artifact of the resulting image and video shows that our solution outperforms the state-of-the-art techniques.
- Published
- 2022
7. Efficient and Iterative Training for High-Performance Light Field Synthesis
- Author
-
Jun-Hua Ko and Homer H. Chen
- Published
- 2022
8. Face Recognition for Fisheye Images
- Author
-
Yi-Cheng Lo, Chiao-Chun Huang, Yueh-Feng Tsai, I-Chan Lo, An-Yeu Andy Wu, and Homer H. Chen
- Published
- 2022
9. Comparison of Virtual-Real Integration Efficiency between Light Field and Conventional Near-Eye AR Displays
- Author
-
Wei-An Teng, Su-Ling Yeh, and Homer H. Chen
- Published
- 2022
10. Enhancement and Speedup of Photometric Compensation for Projectors by Reducing Inter-Pixel Coupling and Calibration Patterns
- Author
-
Jen-Shuo Liu, Homer H. Chen, Frank Shyu, and Kuang-Tsu Shih
- Subjects
Coupling ,Speedup ,Pixel ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Graphics and Computer-Aided Design ,law.invention ,Compensation (engineering) ,Photometry (optics) ,Projector ,law ,Distortion ,Calibration ,Computer vision ,Artificial intelligence ,Projection (set theory) ,business ,Software ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
For a projector-camera (procam) system to preserve the color appearance of an image projected on a color surface, the photometric distortion introduced by the color surface has to be properly compensated. The performance of such photometric compensation relies on an accurate estimation of the projector nonlinearity. In this paper, we improve the accuracy of projector nonlinearity estimation by taking inter-pixel coupling into consideration. In addition, to respond quickly to the change of projection area due to projector movement, we reduce the number of calibration patterns from six to one and use the projected image as the calibration pattern. This greatly improves the computational efficiency of re-calibration that needs to be performed on the fly during a multimedia presentation without breaking its continuity. Both objective and subjective results are provided to illustrate the effectiveness of the proposed method for color compensation.
- Published
- 2021
11. Tag Propagation and Cost-Sensitive Learning for Music Auto-Tagging
- Author
-
Yi-Hsun Lin and Homer H. Chen
- Subjects
Signal processing ,Similarity (geometry) ,Exploit ,Computer science ,Speech recognition ,media_common.quotation_subject ,Cost sensitive ,Context (language use) ,02 engineering and technology ,Computer Science Applications ,Robustness (computer science) ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,Quality (business) ,Electrical and Electronic Engineering ,Function (engineering) ,media_common - Abstract
The performance of music auto-tagging depends on the quality of training data. In practice, the links between songs and tags in the manually labeled training data can be incorrect (false positive) or missing (false negative). In this paper, we propose a cost-sensitive tag propagation learning method to improve auto-tagging. Specifically, we exploit music context to determine similar songs and propagate tags between them. Both propagated tags and original tags are used to optimize the auto-tagging models, and cost-sensitivity is incorporated into the loss function to enhance the robustness by adjusting the weight of relevant (positive) links with respect to irrelevant (negative) links. The proposed method is tested on three auto-tagging models: 2D-CNN, CRNN, and SampleCNN. The Million Song Dataset is used for training, and four music contexts, artist, playlist, tag, and listener, are used for song similarity measurement. The experimental results show that 1) the proposed method can successfully improve the performance of the three auto-tagging models, 2) the cost-sensitive loss function helps reduce the impact of missing tags, and 3) the artist music context is more powerful for tag propagation than the other three music contexts.
- Published
- 2021
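A minimal sketch of the two ideas named in the entry above, tag propagation between similar songs and a cost-sensitive (positively weighted) loss. It is not the authors' implementation; the neighbor count, the weight value, and all variable names are illustrative assumptions.

```python
# Sketch (assumed, not the paper's code): propagate tags from similar songs and
# weight relevant (positive) links more heavily in a binary cross-entropy loss.
import numpy as np

def propagate_tags(tag_matrix: np.ndarray, similarity: np.ndarray, k: int = 5) -> np.ndarray:
    """Share tags between each song and its k most similar songs."""
    propagated = tag_matrix.copy().astype(float)
    for i in range(tag_matrix.shape[0]):
        neighbors = np.argsort(-similarity[i])[1:k + 1]   # skip the song itself
        propagated[i] = np.maximum(propagated[i], tag_matrix[neighbors].max(axis=0))
    return propagated

def cost_sensitive_bce(pred: np.ndarray, target: np.ndarray, pos_weight: float = 2.0) -> float:
    """Binary cross-entropy with a larger weight on relevant (positive) links."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1.0 - eps)
    loss = -(pos_weight * target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(loss.mean())

# Toy usage: 4 songs, 3 tags, and a symmetric song-to-song similarity matrix.
rng = np.random.default_rng(0)
tags = rng.integers(0, 2, size=(4, 3))
sim = rng.random((4, 4)); sim = (sim + sim.T) / 2
np.fill_diagonal(sim, 1.0)                 # each song is most similar to itself
targets = propagate_tags(tags, sim)
print(cost_sensitive_bce(rng.random((4, 3)), targets))
```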
12. Deep Face Rectification for 360° Dual-Fisheye Cameras
- Author
-
Homer H. Chen, Yi-Hsin Li, and I-Chan Lo
- Subjects
Computer science ,business.industry ,Distortion (optics) ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Real image ,Computer Graphics and Computer-Aided Design ,Facial recognition system ,Face (geometry) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Projection (set theory) ,business ,Software ,Image restoration - Abstract
Rectilinear face recognition models suffer from severe performance degradation when applied to fisheye images captured by 360° back-to-back dual fisheye cameras. We propose a novel face rectification method to combat the effect of fisheye image distortion on face recognition. The method consists of a classification network and a restoration network specifically designed to handle the non-linear property of fisheye projection. The classification network classifies an input fisheye image according to its distortion level. The restoration network takes a distorted image as input and restores the rectilinear geometric structure of the face. The performance of the proposed method is tested on an end-to-end face recognition system constructed by integrating the proposed rectification method with a conventional rectilinear face recognition system. The face verification accuracy of the integrated system is 99.18% when tested on images in the synthetic Labeled Faces in the Wild (LFW) dataset and 95.70% for images in a real image dataset, resulting in an average accuracy improvement of 6.57% over the conventional face recognition system. For face identification, the average improvement over the conventional face recognition system is 4.51%.
- Published
- 2021
13. Light Field Synthesis by Training Deep Network in the Refocused Image Domain
- Author
-
Jiun-Woei Huang, Kuang-Tsu Shih, Homer H. Chen, and Chang-Le Liu
- Subjects
FOS: Computer and information sciences ,Image quality ,Structural similarity ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Image (mathematics) ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,Angular resolution ,Computer vision ,Image sensor ,business.industry ,Image and Video Processing (eess.IV) ,Electrical Engineering and Systems Science - Image and Video Processing ,Computer Graphics and Computer-Aided Design ,Ray ,View synthesis ,020201 artificial intelligence & image processing ,Augmented reality ,Artificial intelligence ,business ,Software ,Light field - Abstract
Light field imaging, which captures spatio-angular information of incident light on an image sensor, enables many interesting applications like image refocusing and augmented reality. However, due to the limited sensor resolution, a trade-off exists between the spatial and angular resolution. To increase the angular resolution, view synthesis techniques have been adopted to generate new views from existing views. However, traditional learning-based view synthesis mainly considers the image quality of each view of the light field and neglects the quality of the refocused images. In this paper, we propose a new loss function called refocused image error (RIE) to address the issue. The main idea is that the image quality of the synthesized light field should be optimized in the refocused image domain because it is where the light field is perceived. We analyze the behavior of RIE in the spectral domain and test the performance of our approach against previous approaches on both real and software-rendered light field datasets using objective assessment metrics such as MSE, MAE, PSNR, SSIM, and GMSD. Experimental results show that the light field generated by our method results in better refocused images than previous methods.
- Published
- 2020
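The entry above measures error in the refocused image domain rather than per view. A hedged sketch of that idea follows: both light fields are refocused by simple shift-and-add over a few slopes, and the error is computed on the refocused images. The slopes, rounding, and array shapes are assumptions, not the paper's formulation.

```python
# Sketch of a refocused-image-domain loss (assumed, not the paper's code).
import numpy as np

def shift_and_add_refocus(light_field: np.ndarray, slope: float) -> np.ndarray:
    """light_field has shape (U, V, H, W); slope controls the focus depth."""
    U, V, H, W = light_field.shape
    refocused = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            dy, dx = slope * (u - U // 2), slope * (v - V // 2)
            refocused += np.roll(light_field[u, v],
                                 (int(round(dy)), int(round(dx))), axis=(0, 1))
    return refocused / (U * V)

def refocused_image_error(lf_pred, lf_true, slopes=(-1.0, 0.0, 1.0)) -> float:
    """Mean squared error accumulated over several refocusing slopes."""
    return float(np.mean([
        np.mean((shift_and_add_refocus(lf_pred, s) - shift_and_add_refocus(lf_true, s)) ** 2)
        for s in slopes
    ]))

# Toy usage with a 5x5 array of 32x32 views.
rng = np.random.default_rng(0)
lf_gt = rng.random((5, 5, 32, 32))
print(refocused_image_error(lf_gt + 0.01 * rng.random((5, 5, 32, 32)), lf_gt))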
14. AF-Net: A Convolutional Neural Network Approach to Phase Detection Autofocus
- Author
-
Homer H. Chen, Chi-Jui Ho, and Chin-Cheng Chan
- Subjects
Autofocus ,Computer science ,business.industry ,Detector ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Computer Graphics and Computer-Aided Design ,Convolutional neural network ,law.invention ,law ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Image sensor ,business ,Software - Abstract
It is important for an autofocus system to accurately and quickly find the in-focus lens position so that sharp images can be captured without human intervention. Phase detectors have been embedded in image sensors to improve the performance of autofocus; however, the phase shift estimation between the left and right phase images is sensitive to noise. In this paper, we propose a robust model based on convolutional neural network to address this issue. Our model includes four convolutional layers to extract feature maps from the phase images and a fully-connected network to determine the lens movement. The final lens position error of our model is five times smaller than that of a state-of-the-art statistical PDAF method. Furthermore, our model works consistently well for all initial lens positions. All these results verify the robustness of our model.
- Published
- 2020
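The abstract above describes four convolutional layers over the phase images followed by a fully-connected network that regresses the lens movement. A rough PyTorch sketch of such an architecture is given below; the channel counts, kernel sizes, and input resolution are illustrative assumptions, not the paper's actual configuration.

```python
# Rough sketch of a phase-detection autofocus regressor (assumed configuration).
import torch
import torch.nn as nn

class AFNetSketch(nn.Module):
    def __init__(self, in_height: int = 32, in_width: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * in_height * in_width, 128), nn.ReLU(),
            nn.Linear(128, 1),   # predicted lens movement (signed)
        )

    def forward(self, phase_pair: torch.Tensor) -> torch.Tensor:
        # phase_pair: (batch, 2, H, W), left and right phase images stacked.
        return self.head(self.features(phase_pair))

# Toy usage: one pair of 32x128 phase images.
model = AFNetSketch()
print(model(torch.randn(1, 2, 32, 128)).shape)   # torch.Size([1, 1])
```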
15. Robust Masked Face Recognition via Balanced Feature Matching
- Author
-
Yu-Chieh Huang, Lin-Hsi Tsao, and Homer H. Chen
- Published
- 2022
16. Unsupervised Learning of 3D Object Reconstruction with Small Dataset
- Author
-
Shan-Ling Chen, Kuang-Tsu Shih, and Homer H. Chen
- Published
- 2021
17. Speed Up Light Field Synthesis from Stereo Images
- Author
-
Yi-Chou Chen, Chun-Hao Chao, Chang-Le Liu, Kuang-Tsu Shih, and Homer H. Chen
- Published
- 2021
18. Robust Light Field Synthesis From Stereo Images With Left-Right Geometric Consistency
- Author
-
Chun-Hao Chao, Homer H. Chen, and Chang-Le Liu
- Subjects
Computer science ,business.industry ,Computer vision ,Artificial intelligence ,business ,Geometric consistency ,Light field
- Published
- 2021
19. H&E-like staining of OCT images of human skin via generative adversarial network
- Author
-
Sheng-Ting Tsai, Chih-Hao Liu, Chin-Cheng Chan, Yi-Hsin Li, Sheng-Lung Huang, and Homer H. Chen
- Subjects
Physics and Astronomy (miscellaneous)
- Abstract
Noninvasive and high-speed optical coherence tomography (OCT) systems have been widely deployed for daily clinical uses. High-resolution OCTs are advancing rapidly; however, grey-level OCT images are not easy to read for pathologists due to the lack of diagnosis specificity compared with hematoxylin and eosin (H&E) stained images. This work presents an OCT to H&E image translation model to convert the OCT images to H&E-like stained images using unpaired OCT and H&E datasets. “H&E like” means the stratum corneum (SC) boundary and the dermal-epidermal junction (DEJ) of the OCT and the translated images are consistent. Pre-trained segmentation models for the DEJ and the SC are exploited to enhance the performance of anatomical image translation and reduce the DEJ and SC lower boundary errors to ±2.3 and ±1.7 μm, respectively. A pre-trained VGG16 network extracts the features of the nuclei. Pearson's correlation coefficient of the nuclei location and size consistency is 84% ± 1%. As a result, in vivo medical image translation accuracy with cellular resolution was achieved.
- Published
- 2022
20. Generating High-Resolution Image and Depth Map Using a Camera Array With Mixed Focal Lengths
- Author
-
Homer H. Chen and Kuang-Tsu Shih
- Subjects
Pixel ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Astrophysics::Instrumentation and Methods for Astrophysics ,020207 software engineering ,02 engineering and technology ,Camera array ,Computer Science Applications ,Computational Mathematics ,Kernel (image processing) ,Pixel aspect ratio ,High resolution image ,Depth map ,Computer Science::Computer Vision and Pattern Recognition ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Focal length ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,Image resolution - Abstract
Producing high-resolution images and depth maps of a scene is critical to many applications. Although some imaging systems today are equipped with multiple cameras, the output image resolution is usually only a small fraction of the total number of sensor pixels. To significantly increase the output pixel ratio, we propose an imaging system that consists of an array of telescopic cameras and a wide-angle camera. By minimizing the overlap between the telescopic cameras and maximizing the overlap between the low-resolution wide-angle camera and the telescopic cameras, a camera system with nonparallel optical axes is created. The heterogeneous images of different resolutions are fused into a high-resolution wide-angle image, and its corresponding depth map is generated by pairwise heterogeneous matching. The performance of the proposed imaging system is evaluated using both synthetic and real data.
- Published
- 2019
21. Deep face recognition for dim images
- Author
-
Yu-Hsuan Huang and Homer H. Chen
- Subjects
Artificial Intelligence ,Signal Processing ,Computer Vision and Pattern Recognition ,Software
- Published
- 2022
22. A 20-year-old woman with abnormal eye movements
- Author
-
Homer H Chiang, Konstantinos A A Douglas, Nurhan Torun, Vivian Paraskevi Douglas, and Tavé van Zyl
- Subjects
Abnormal eye movements ,Eye Movements ,business.industry ,Grand Rounds ,General Medicine ,Anatomy ,Diagnosis, Differential ,Young Adult ,Ocular Motility Disorders ,Medicine ,Humans ,Cavernous Sinus ,Female ,business ,Brain Stem
- Abstract
A 20-year-old woman was referred to the neuro-ophthalmology clinic at Beth Israel Deaconess Medical Center for evaluation of diplopia. Three months prior to presentation, she awoke with oblique binocular diplopia, which resolved spontaneously over a period of several weeks but recurred 1 month prior to presentation, accompanied by an inability to make facial expressions on her left side and difficulty with eye movements. She denied diplopia in primary gaze but complained of horizontal diplopia in right- and leftward gaze, worse in left gaze. She denied eye pain and any change in visual acuity. Associated symptoms included dry eye on the left, for which she was applying artificial tears four times daily and lubricating ointment at bedtime. She was previously healthy and took no medications. Family history was unremarkable. She had no history of head or eye trauma.
- Published
- 2021
23. Photometric Consistency For Dual Fisheye Cameras
- Author
-
I-Chan Lo, Homer H. Chen, Kuang-Tsu Shih, and Gwo-Hwa Ju
- Subjects
Computer science ,Cost effectiveness ,business.industry ,Color correction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,02 engineering and technology ,law.invention ,Dual (category theory) ,Compensation (engineering) ,Lens (optics) ,Photometry (optics) ,Consistency (statistics) ,law ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Back-to-back dual fisheye cameras are the most popular and cost-effective device to capture 360° images. However, inconsistent intensity and color between the pair of fisheye images often cause visible artifacts in the final 360° image. In this paper, we present a method to create photometric consistency between the two fisheye images. Specifically, we propose a loss function for image intensity compensation and a local color transfer model for color correction. Experimental results show that our method is able to correct the photometric inconsistency between dual fisheye images for high-quality 360° imaging.
- Published
- 2020
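A simplified sketch of the color-transfer idea in the entry above: statistics computed in the overlap region of the two fisheye images drive a per-channel affine correction that pulls the second image toward the first. The paper's model is local and more elaborate; this global mean/std matching is only an illustrative assumption of the general approach.

```python
# Sketch: per-channel mean/std matching on the overlap region (assumed).
import numpy as np

def match_overlap_statistics(img_a, img_b, overlap_a, overlap_b):
    """Return img_b corrected so its overlap statistics match those of img_a.

    img_a, img_b: float arrays of shape (H, W, 3) in [0, 1].
    overlap_a, overlap_b: boolean masks selecting the shared field of view.
    """
    corrected = img_b.copy()
    for c in range(3):
        mu_a, std_a = img_a[..., c][overlap_a].mean(), img_a[..., c][overlap_a].std() + 1e-6
        mu_b, std_b = img_b[..., c][overlap_b].mean(), img_b[..., c][overlap_b].std() + 1e-6
        corrected[..., c] = (img_b[..., c] - mu_b) * (std_a / std_b) + mu_a
    return np.clip(corrected, 0.0, 1.0)

# Toy usage: two random images whose right/left borders overlap.
rng = np.random.default_rng(0)
a, b = rng.random((64, 64, 3)), 0.8 * rng.random((64, 64, 3))
mask_a = np.zeros((64, 64), bool); mask_a[:, -8:] = True
mask_b = np.zeros((64, 64), bool); mask_b[:, :8] = True
print(match_overlap_statistics(a, b, mask_a, mask_b).shape)
```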
24. Face Recognition Under Low Illumination Via Deep Feature Reconstruction Network
- Author
-
Homer H. Chen and Yu-Hsuan Huang
- Subjects
Identification (information) ,business.industry ,Feature (computer vision) ,Computer science ,Deep learning ,Face (geometry) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer vision ,Artificial intelligence ,Image enhancement ,business ,Facial recognition system ,Image (mathematics) - Abstract
Recent major benchmarks show that deep-learning-based face recognition can achieve superb performance, even surpassing human capability. However, many state-of-the-art face recognition models suffer from severe performance degradation for images captured under low illumination. The issue can be addressed by enhancing the illumination of face images before performing face recognition. In this paper, we evaluate such enhancement methods and, based on the findings, propose a novel feature reconstruction network to make face features illumination-invariant by generating a feature image from both the raw face image and the illumination-enhanced face image. The performance of the proposed approach is tested on the Specs on Faces (SoF) dataset. The overall verification accuracy is improved by 0.5% to 2.5% and the rank-1 identification accuracy is improved by 2.1%.
- Published
- 2020
25. Segmentation based OCT Image to H&E-like Image Conversion
- Author
-
Chin-Cheng Chan, Jeng-Wei Tjiu, Homer H. Chen, Sheng-Lung Huang, and Sheng-Ting Tsai
- Subjects
Physics ,genetic structures ,integumentary system ,medicine.diagnostic_test ,business.industry ,Image (category theory) ,Image segmentation ,eye diseases ,Image conversion ,Three dimensional imaging ,medicine.anatomical_structure ,Optical coherence tomography ,Medical imaging ,medicine ,Stratum corneum ,Segmentation ,Computer vision ,sense organs ,Artificial intelligence ,business - Abstract
Weakly supervised conversion from in vivo OCT images on human skin to H&E-stain-like images is developed. The dermis-epidermis junction, stratum corneum boundary, and nuclei distribution match well between the OCT and converted H&E-stain-like images.
- Published
- 2020
26. Classification of squamous cell carcinoma from FF-OCT images: Data selection and progressive model construction
- Author
-
Manuel Calderon-Delgado, Sheng-Lung Huang, Ming-Yi Lin, Chi-Jui Ho, Jeng-Wei Tjiu, and Homer H. Chen
- Subjects
Speedup ,Radiological and Ultrasound Technology ,medicine.diagnostic_test ,Channel (digital image) ,Computer science ,business.industry ,Deep learning ,Health Informatics ,Pattern recognition ,Computer Graphics and Computer-Aided Design ,Convolutional neural network ,Regularization (mathematics) ,Mice ,Optical coherence tomography ,Feature (computer vision) ,Classifier (linguistics) ,Carcinoma, Squamous Cell ,medicine ,Animals ,Radiology, Nuclear Medicine and imaging ,Neural Networks, Computer ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Tomography, Optical Coherence - Abstract
We investigate the speed and performance of squamous cell carcinoma (SCC) classification from full-field optical coherence tomography (FF-OCT) images based on the convolutional neural network (CNN). Due to the unique characteristics of SCC features, the high variety of CNN architectures, and the high volume of our 3D FF-OCT dataset, progressive model construction is a time-consuming process. To address the issue, we develop a training strategy for data selection that makes model training 16 times faster by exploiting the dependency between images and the knowledge of SCC feature distribution. The speedup makes progressive model construction computationally feasible. Our approach further refines the regularization, channel attention, and optimization mechanism of the SCC classifier and improves the accuracy of SCC classification to 87.12% at the image level and 90.10% at the tomogram level. The results are obtained by testing the proposed approach on an FF-OCT dataset with over one million mouse skin images.
- Published
- 2021
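A hedged sketch of the data-selection idea in the entry above: adjacent slices of a 3D FF-OCT volume are highly redundant, so training images can be subsampled with a stride, while slices carrying tumor features are kept at a finer stride so the classifier still sees enough positives. The strides and the tumor labels are illustrative assumptions, not the paper's selection rule.

```python
# Sketch: stride-based selection of training slices from a 3D volume (assumed).
from typing import List, Sequence

def select_training_slices(num_slices: int, has_tumor: Sequence[bool],
                           normal_stride: int = 16, tumor_stride: int = 4) -> List[int]:
    selected = []
    for i in range(num_slices):
        stride = tumor_stride if has_tumor[i] else normal_stride
        if i % stride == 0:
            selected.append(i)
    return selected

# Toy usage: a 200-slice volume whose middle third contains tumor.
labels = [200 // 3 <= i < 2 * 200 // 3 for i in range(200)]
print(len(select_training_slices(200, labels)))   # far fewer than 200 slices
```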
27. Clark and Katz's The Law of Domestic Relations in the United States, 3d (Hornbook Series)
- Author
-
Clark Jr., Homer H., and Katz, Sanford N.
- Subjects
- Hornbooks (Law), Domestic relations--United States, Divorce--Law and legislation--United States, Divorce suits--United States, Parent and child--Law and legislation--United States, Children--Legal status, laws, etc.--United States
- Published
- 2021
28. Component Tying for Mixture Model Adaptation in Personalization of Music Emotion Recognition
- Author
-
Ju-Chiang Wang, Homer H. Chen, Yu-An Chen, and Yi-Hsuan Yang
- Subjects
Acoustics and Ultrasonics ,Computer science ,Process (engineering) ,Speech recognition ,Feature vector ,02 engineering and technology ,computer.software_genre ,Personalization ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Component (UML) ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Electrical and Electronic Engineering ,Adaptation (computer science) ,Focus (computing) ,business.industry ,Tying ,Mixture model ,Computational Mathematics ,020201 artificial intelligence & image processing ,Artificial intelligence ,0305 other medical science ,business ,computer ,Natural language processing - Abstract
Personalizing a music emotion recognition model is needed because the perception of music emotion is highly subjective, but it is a time-consuming process. In this paper, we consider how to expedite the personalization process that begins with a general model trained offline using a general user base and progressively adapts the model to a music listener using the emotion annotations of the listener. Specifically, we focus on reducing the number of user annotations needed for the personalization. We investigate and evaluate four component tying methods: single group tying, quadrantwise tying, hierarchical tying, and random tying. These methods aim to exploit the available annotations by identifying related model parameters on-the-fly and updating them jointly. In the evaluation, we use the AMG1608 dataset, which contains the clip-level valence-arousal emotion ratings of 1608 30-s music clips annotated by 665 listeners. Also, we use the acoustic emotion Gaussians model as the general model that uses a mixture of Gaussian components to learn the mapping between the acoustic feature space and the emotion space. The results show that the model adaptation with component tying requires only 10-20 personal annotations to obtain the same level of prediction accuracy as the baseline model adaptation method that uses 50 personal annotations without component tying.
- Published
- 2017
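A simplified sketch of the quadrant-wise tying idea from the entry above: Gaussian components of a general emotion model are grouped by the valence-arousal quadrant of their means, and each group shares one mean offset estimated jointly from the listener's few annotations. This is an illustrative reduction of the idea, not the acoustic emotion Gaussians adaptation itself.

```python
# Sketch: quadrant-wise tied mean adaptation of Gaussian components (assumed).
import numpy as np

def quadrant_of(point: np.ndarray) -> int:
    """Map a (valence, arousal) point to one of four quadrants (0..3)."""
    return int(point[0] >= 0) + 2 * int(point[1] >= 0)

def adapt_means_tied(component_means: np.ndarray, annotations: np.ndarray) -> np.ndarray:
    """Shift all components in a quadrant by the mean residual of the
    annotations that fall in that quadrant (one tied update per group)."""
    adapted = component_means.copy()
    for q in range(4):
        ann_in_q = np.array([a for a in annotations if quadrant_of(a) == q])
        comp_idx = [i for i, m in enumerate(component_means) if quadrant_of(m) == q]
        if len(ann_in_q) == 0 or not comp_idx:
            continue
        offset = ann_in_q.mean(axis=0) - component_means[comp_idx].mean(axis=0)
        adapted[comp_idx] += offset
    return adapted

# Toy usage: 8 components and 10 personal annotations in the valence-arousal plane.
rng = np.random.default_rng(0)
means = rng.uniform(-1, 1, size=(8, 2))
personal = rng.uniform(-1, 1, size=(10, 2))
print(adapt_means_tied(means, personal))
```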
29. A case of ophthalmomyiasis interna in the Pacific Northwest
- Author
-
Andreas K. Lauer, Rasanamar K. Sandhu, Justin Baynham, David J. Wilson, and Homer H. Chiang
- Subjects
Pars plana ,medicine.medical_specialty ,genetic structures ,medicine.medical_treatment ,030231 tropical medicine ,Vitrectomy ,Ophthalmomyiasis interna ,03 medical and health sciences ,Botfly ,0302 clinical medicine ,Oestrus ovis ,lcsh:Ophthalmology ,Ophthalmology ,Case report ,parasitic diseases ,medicine ,Close contact ,biology ,business.industry ,fungi ,biology.organism_classification ,eye diseases ,3. Good health ,Surgery ,medicine.anatomical_structure ,lcsh:RE1-994 ,030221 ophthalmology & optometry ,sense organs ,business ,Sheep botfly - Abstract
Purpose We report a case of ophthalmomyiasis interna successfully removed in toto with pars plana vitrectomy. Observations An 84-year-old woman with recent close contact with lambs presented with a new floater. Examination revealed subretinal tracks pathognomonic for ophthalmomyiasis and a larva suspended in the vitreous. The larva was successfully removed in toto with pars plana vitrectomy by aspiration through the vitreous cutter. Conclusions and importance Aspiration with pars plana vitrectomy can be considered a primary therapeutic modality for botfly larvae suspended in the vitreous. In our case, in toto removal of the larva reduced the risk of inflammatory reaction.
- Published
- 2017
30. On the Distinction between Phase Images and Two-View Light Field for PDAF of Mobile Imaging
- Author
-
Homer H. Chen and Chi-Jui Ho
- Subjects
Autofocus ,Computer science ,business.industry ,Iterative method ,Stereo matching ,Phase detector ,Phase image ,law.invention ,law ,Phase correlation ,Computer vision ,Artificial intelligence ,business ,Optical correlation ,Light field - Published
- 2020
31. Generation of Affective Accompaniment in Accordance With Emotion Flow
- Author
-
Homer H. Chen and Yi-Chan Wu
- Subjects
Acoustics and Ultrasonics ,Computer science ,Speech recognition ,05 social sciences ,02 engineering and technology ,050105 experimental psychology ,Arousal ,Computational Mathematics ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Chord (music) ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Electrical and Electronic Engineering ,Valence (psychology) - Abstract
The emotion expressed by a music piece varies as the music unfolds in time. To create such dynamic expression, we develop an algorithm that automatically generates the accompaniment for a melody according to the emotion flow specified by a user. The emotion flow is given in the form of arousal and valence curves, each as a function of time. The affective accompaniment is composed of chord progression and accompaniment pattern. The chord progression, which controls the valence of the composed music, is generated by dynamic programming using the input melody and valence data as constraints. A mathematical model is developed to describe the temporal relationship between valence and chord progression. The accompaniment pattern, which controls the arousal of the composed music, is determined according to the quantized arousal values. The performance of the system is evaluated by subjective tests. The cross-correlation coefficient between the input arousal (valence) and the perceived arousal (valence) of the composed music is 0.85 (0.52). It rises to 0.92 for arousal and 0.88 for valence if only musician subjects in the test are considered. Overall, the proposed system is capable of generating subjectively appropriate accompaniments conforming to the user specification.
- Published
- 2016
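The entry above generates the chord progression by dynamic programming under valence constraints. Below is a minimal Viterbi-style sketch of that kind of search; the chord set, the per-chord valence values, and the transition cost are all illustrative assumptions, not the paper's learned model.

```python
# Sketch: valence-constrained chord selection by dynamic programming (assumed).
import numpy as np

CHORDS = ["C", "Am", "F", "G", "Dm", "Em"]
CHORD_VALENCE = np.array([0.8, -0.4, 0.5, 0.6, -0.3, -0.2])   # assumed values

def chord_progression(target_valence, transition_cost: float = 0.2):
    """Minimize |chord valence - target valence| plus a chord-change penalty."""
    T, K = len(target_valence), len(CHORDS)
    cost = np.full((T, K), np.inf)
    back = np.zeros((T, K), dtype=int)
    cost[0] = np.abs(CHORD_VALENCE - target_valence[0])
    for t in range(1, T):
        for k in range(K):
            step = cost[t - 1] + transition_cost * (np.arange(K) != k)
            back[t, k] = int(np.argmin(step))
            cost[t, k] = step[back[t, k]] + abs(CHORD_VALENCE[k] - target_valence[t])
    # Backtrack the cheapest path.
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [CHORDS[k] for k in reversed(path)]

print(chord_progression([0.7, 0.6, -0.2, -0.4, 0.5, 0.8]))
```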
32. 360° Video Stitching for Dual Fisheye Cameras
- Author
-
Homer H. Chen, Kuang-Tsu Shih, and I-Chan Lo
- Subjects
Artifact (error) ,Computer science ,business.industry ,Distortion (optics) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,02 engineering and technology ,Visualization ,law.invention ,Lens (optics) ,Dimension (vector space) ,Seam carving ,law ,Distortion ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,Parallax ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
The back-to-back dual fisheye camera configuration offers a cost-effective 360° video solution. However, the inherent parallax of this camera configuration and the unmatched scene along the seam between the two camera views present a great challenge to video stitching. To address the jittering and fragmented visual appearance issues, we present in this paper a robust method that preserves the geometric structure of the scene and enhances the stability of the video along the temporal dimension. The proposed video stitching method entails a mesh deformation operation for the minimization of the geometric distortion and an adaptive seam carving operation that generates optimal spatial and temporal alignment. Experimental results show that our method can produce 360° videos without jitter and ghost artifacts.
- Published
- 2019
33. Seamless Stitching Dual Fisheye Images For 360° Free View
- Author
-
Minghuang Shih, Homer H. Chen, Makoto Odamaki, Po Chin Yu, I-Chan Lo, Kuang-Tsu Shih, and Chun-Ting Hung
- Subjects
Computer science ,business.industry ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,Image processing ,02 engineering and technology ,Artifact (software development) ,Dual (category theory) ,Image stitching ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business - Abstract
We demonstrate a seamless stitching technology for dual-fisheye video cameras that combines two images/frames of ultra-wide field of view into a 360° image/frame. Our technology generates high-quality 360° free view without the disjoint appearance and ghost artifact that plague most 360° cameras in the market today. Our technology is efficient, robust, and stunning. Target customers include VR device manufactures, telco operators, broadcasters, surveillance service providers, and immersive free view users.
- Published
- 2019
34. Thumbnail Image Selection for VOD Services
- Author
-
Jing-Kai Lou, Homer H. Chen, and Chun-Ning Tsao
- Subjects
Multimedia ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video content analysis ,Selection (linguistics) ,Video on demand ,Thumbnail Image ,Representation (mathematics) ,computer.software_genre ,computer ,Video retrieval - Abstract
The rapid growth of video on demand (VOD) provides numerous TV series and movies for users to watch anywhere at any time. As the amount of video content available for viewing grows explosively, thumbnail image representation of a video works as a surrogate and facilitates quick and easy video retrieval. Unlike previous work, which considers representativeness as the main criterion for thumbnail image selection, the work described in this paper incorporates attractiveness as an additional criterion. Our idea is based on the observation that a thumbnail image for VOD services should not only convey the gist of the video but also intrigue the users. We propose a two-stage method that efficiently utilizes visual features to select a thumbnail image from each TV series. The effectiveness of the proposed method is verified by a subjective test. The results provide further insight into the user preference.
- Published
- 2019
35. Cross-Cultural Music Emotion Recognition by Adversarial Discriminative Domain Adaptation
- Author
-
Yi-Wei Chen, Yi-Hsuan Yang, and Homer H. Chen
- Subjects
Artificial neural network ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Computer science ,Speech recognition ,05 social sciences ,010501 environmental sciences ,01 natural sciences ,050105 experimental psychology ,Rhythm ,Discriminative model ,Emotion perception ,Cultural diversity ,0501 psychology and cognitive sciences ,Valence (psychology) ,Timbre ,0105 earth and related environmental sciences - Abstract
Annotation of the perceived emotion of a music piece is required for an automatic music emotion recognition system. Most music emotion datasets are developed for Western pop songs. The problem is that a music emotion recognizer trained on such datasets may not work well for non-Western pop songs due to the differences in acoustic characteristics and emotion perception that are inherent to cultural background. The problem was also found in cross-cultural and cross-dataset studies; however, little has been done to learn how to adapt a model pre-trained on a source music genre to a target music genre of interest. In this paper, we propose to address the problem by an unsupervised adversarial domain adaptation method. It employs neural network models to make the target music indistinguishable from the source music in a learned feature representation space. Because emotion perception is multifaceted, three types of input feature representations related to timbre, pitch, and rhythm are considered for performance evaluation. The results show that the proposed method effectively improves the prediction of the valence of Chinese pop songs from a model trained for Western pop songs.
- Published
- 2018
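The entry above adapts a Western-pop emotion model to Chinese pop with adversarial domain adaptation. A compact PyTorch sketch of that training scheme follows: a target encoder is updated so a domain discriminator cannot tell its features from those of a frozen source encoder. Network sizes, feature dimensions, and the single optimization step shown are illustrative assumptions.

```python
# Sketch of adversarial discriminative domain adaptation (assumed sizes).
import torch
import torch.nn as nn

feat_dim, hidden = 64, 32
source_encoder = nn.Sequential(nn.Linear(128, feat_dim), nn.ReLU())   # pre-trained, frozen
target_encoder = nn.Sequential(nn.Linear(128, feat_dim), nn.ReLU())   # to be adapted
discriminator = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
opt_t = torch.optim.Adam(target_encoder.parameters(), lr=1e-3)

source_x, target_x = torch.randn(16, 128), torch.randn(16, 128)

# 1) Train the discriminator to separate source (label 1) from target (label 0).
with torch.no_grad():
    src_feat = source_encoder(source_x)
tgt_feat = target_encoder(target_x).detach()
d_loss = bce(discriminator(src_feat), torch.ones(16, 1)) + \
         bce(discriminator(tgt_feat), torch.zeros(16, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) Train the target encoder to fool the discriminator (label its features 1).
g_loss = bce(discriminator(target_encoder(target_x)), torch.ones(16, 1))
opt_t.zero_grad(); g_loss.backward(); opt_t.step()
print(float(d_loss), float(g_loss))
```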
36. Dermal epidermal junction detection for full-field optical coherence tomography data of human skin by deep learning
- Author
-
Jeng-Wei Tjiu, Homer H. Chen, Hua-Yu Chou, and Sheng-Lung Huang
- Subjects
Computer science ,Health Informatics ,Convolutional neural network ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,Deep Learning ,0302 clinical medicine ,Optical coherence tomography ,medicine ,Humans ,Radiology, Nuclear Medicine and imaging ,Optical tomography ,Skin ,Dermoepidermal junction ,integumentary system ,Radiological and Ultrasound Technology ,medicine.diagnostic_test ,Pixel ,business.industry ,Deep learning ,Pattern recognition ,Computer Graphics and Computer-Aided Design ,Cross-Sectional Studies ,Computer-aided diagnosis ,Computer Vision and Pattern Recognition ,Tomography ,Artificial intelligence ,Epidermis ,business ,Tomography, Optical Coherence ,030217 neurology & neurosurgery - Abstract
Full-field optical coherence tomography (FF-OCT) has been developed to obtain three-dimensional (3D) OCT data of human skin for early diagnosis of skin cancer. Detection of the dermal-epidermal junction (DEJ), where melanomas and basal cell carcinomas originate, is an essential step for skin cancer diagnosis. However, most existing DEJ detection methods consider each cross-sectional frame of the 3D OCT data independently, leaving the relationship between neighboring frames unexplored. In this paper, we exploit the continuity of 3D OCT data to enhance DEJ detection. In particular, we propose a method for noise reduction of the training data and a multi-directional convolutional neural network to predict the probability of epidermal pixels in the 3D OCT data, which is more stable than a one-directional convolutional neural network for DEJ detection. Our cross-check refinement method also exploits domain knowledge to generate a smooth DEJ surface. The average mean error of the entire DEJ detection system is approximately 6 μm.
- Published
- 2021
37. Blocking harmful blue light while preserving image color appearance
- Author
-
Homer H. Chen, Frank Shyu, Jen-Shuo Liu, Kuang-Tsu Shih, and Su-Ling Yeh
- Subjects
business.industry ,Computer science ,Blocking (radio) ,Frequency band ,020207 software engineering ,02 engineering and technology ,Spectral transmittance ,Computer Graphics and Computer-Aided Design ,Image (mathematics) ,03 medical and health sciences ,Vision science ,0302 clinical medicine ,Optics ,Distortion ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Artificial intelligence ,business ,030217 neurology & neurosurgery ,Blue light - Abstract
Recent studies in vision science have shown that blue light in a certain frequency band affects human circadian rhythm and impairs our health. Although applying a light blocker to an image display can block the harmful blue light, it inevitably makes an image look like an aged photo. In this paper, we show that it is possible to reduce harmful blue light while preserving the blue appearance of an image. Moreover, we optimize the spectral transmittance profile of the blue light blocker based on psychophysical data and develop a color compensation algorithm to minimize color distortion. A prototype using notch filters is built as a proof of concept.
- Published
- 2016
38. Efficient Quantization Based on Rate–Distortion Optimization for Video Coding
- Author
-
Homer H. Chen and Tsung-Yau Huang
- Subjects
0209 industrial biotechnology ,Mathematical optimization ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Vector quantization ,Variable-length code ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,Coding tree unit ,Sub-band coding ,020901 industrial engineering & automation ,Rate–distortion optimization ,Computer Science::Multimedia ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,Electrical and Electronic Engineering ,Algorithm ,Group of pictures ,Context-adaptive binary arithmetic coding ,Context-adaptive variable-length coding ,Mathematics - Abstract
Most rate-distortion (R-D) optimized quantization methods of video coding involve an exhaustive search process to determine the optimal quantized transform coefficients of a coding block and are computationally more expensive than the conventional quantization. In this paper, we present a novel analytical method that directly solves the rate–distortion optimization problem in a closed form by employing a rate model for entropy coding. It has the appealing property of low complexity and is easy to implement. The results show that the proposed method is 4× to 40× faster than the previous methods and 3%–5% more efficient in bitrate than the H.264/AVC reference encoder, which uses the conventional quantization, for video coded with the IBBP structure of group of pictures (GOP) in the normal peak signal-to-noise ratio (PSNR) range (30–40 dB).
- Published
- 2016
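A hedged restatement in equations of the per-coefficient rate-distortion quantization problem the entry above refers to. The linear rate model used here is a generic assumption for illustration, not necessarily the rate model employed in the paper.

```latex
% Per-coefficient R-D quantization with an assumed linear rate model.
\[
q^{*} \;=\; \arg\min_{q \in \mathbb{Z}} \;
\underbrace{(c - q\,\Delta)^{2}}_{\text{distortion}} \;+\; \lambda\, R(q),
\qquad R(q) \;\approx\; a\,|q| + b ,
\]
\[
\Rightarrow\quad
q^{*} \;\approx\; \operatorname{sign}(c)\,
\operatorname{round}\!\left(\max\!\left(0,\;
\frac{|c|}{\Delta} \;-\; \frac{\lambda a}{2\Delta^{2}}\right)\right).
\]
```

Here c is a transform coefficient, Δ the quantization step, and λ the Lagrange multiplier; the closed form follows by setting the derivative of the cost to zero under the assumed linear rate model, which illustrates how a search-free solution can arise once a rate model is adopted.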
39. Using Disparity Information for Stereo Autofocus in 3-D Photography
- Author
-
Cheng-Chieh Yang, Shao-Kang Huang, Kuang-Tsu Shih, and Homer H. Chen
- Subjects
Autofocus ,business.industry ,Computer science ,law ,Photography ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,02 engineering and technology ,Artificial intelligence ,business ,law.invention - Published
- 2016
40. Exploiting Perceptual Anchoring for Color Image Enhancement
- Author
-
Homer H. Chen and Kuang-Tsu Shih
- Subjects
Color histogram ,Image quality ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Color balance ,02 engineering and technology ,HSL and HSV ,False color ,Backlight ,Luminance ,Image texture ,Color depth ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Computer vision ,Electrical and Electronic Engineering ,business.industry ,Color image ,Binary image ,020206 networking & telecommunications ,Frame rate ,Color quantization ,Computer Science Applications ,Signal Processing ,Human visual system model ,Chrominance ,RGB color model ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
The preservation of image quality under various display conditions has become increasingly important in the multimedia era. A considerable amount of effort has been devoted to compensating for the quality degradation caused by dim LCD backlight for mobile devices and desktop monitors. However, most previous enhancement methods for backlight-scaled images only consider the luminance component and overlook the impact of color appearance on image quality. In this paper, we propose a fast and elegant method that exploits the anchoring property of the human visual system to preserve the color appearance of backlight-scaled images as much as possible. Our approach is distinguished from previous ones in many aspects. First, it has a sound theoretical basis. Second, it takes the luminance and chrominance components into account in an integral manner. Third, it has low complexity and can process 720p high-definition videos at 35 frames per second without flicker. The superior performance of the proposed method is verified through psychophysical tests.
- Published
- 2016
41. Multi-Label Playlist Classification Using Convolutional Neural Network
- Author
-
Yian Chen, Guan-Hua Wang, Homer H. Chen, and Chia-Hao Chung
- Subjects
Audio signal ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Computer science ,business.industry ,Feature extraction ,Matrix (music) ,Pattern recognition ,02 engineering and technology ,Convolutional neural network ,Support vector machine ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Embedding ,Domain knowledge ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Theme (computing) - Abstract
With the popularization of music streaming services, millions of songs can be accessed easily. The effectiveness of music access can be further enhanced by making the labels of a music playlist more indicative of the theme of the playlist. However, manually classifying playlists is laborious and often requires domain knowledge. In this paper, we propose a novel multi-label model for playlist classification based on a convolutional neural network. The network is trained in an end-to-end manner to jointly learn song embedding and convolutional filters without the need of feature extraction from audio signals. Specifically, the song embedding vectors are concatenated as a matrix to represent a playlist, and the convolutional filters for playlist classification are applied to the playlist matrix. We also propose two augmentation techniques to prevent over-fitting of playlist data and to improve the training of the proposed model. Experimental results show that the proposed model performs significantly better than the support vector machine and k-nearest neighbors models.
- Published
- 2018
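The entry above represents a playlist as a matrix of song embedding vectors and applies convolutional filters over it for multi-label classification. A PyTorch sketch of that structure follows; the embedding size, filter width, and label count are illustrative assumptions.

```python
# Sketch: playlist-as-matrix multi-label CNN with jointly learned song embeddings.
import torch
import torch.nn as nn

class PlaylistCNNSketch(nn.Module):
    def __init__(self, num_songs: int = 10000, emb_dim: int = 64, num_labels: int = 20):
        super().__init__()
        self.embed = nn.Embedding(num_songs, emb_dim)            # learned jointly
        self.conv = nn.Conv1d(emb_dim, 128, kernel_size=3, padding=1)
        self.classify = nn.Linear(128, num_labels)

    def forward(self, song_ids: torch.Tensor) -> torch.Tensor:
        # song_ids: (batch, playlist_length) integer song indices.
        x = self.embed(song_ids).transpose(1, 2)          # (batch, emb_dim, length)
        x = torch.relu(self.conv(x)).max(dim=2).values    # max-pool over songs
        return torch.sigmoid(self.classify(x))            # multi-label probabilities

model = PlaylistCNNSketch()
print(model(torch.randint(0, 10000, (4, 30))).shape)   # torch.Size([4, 20])
```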
42. Depth from Gaze
- Author
-
Kuang-Tsu Shih, Homer H. Chen, Sheng-Lung Chung, and Tzu-Sheng Kuo
- Subjects
business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Gaze ,Visualization ,InformationSystems_MODELSANDPRINCIPLES ,010201 computation theory & mathematics ,Depth map ,Fixation (visual) ,0202 electrical engineering, electronic engineering, information engineering ,Eye tracking ,020201 artificial intelligence & image processing ,Augmented reality ,Computer vision ,Artificial intelligence ,business - Abstract
Eye trackers are found on various electronic devices. In this paper, we propose to exploit the gaze information acquired by an eye tracker for depth estimation. The data collected from the eye tracker in a fixation interval are used to estimate the depth of a gazed object. The proposed method can be used to construct a sparse depth map of an augmented reality space. The resulting depth map can be applied to, for example, controlling the visual information displayed to the viewer. A mathematical model for determining whether two depths in the augmented reality space are statistically distinguishable is also developed. Experimental results show that the proposed method can estimate and distinguish different object depths effectively.
- Published
- 2018
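A geometric sketch of the depth-from-gaze idea in the entry above: each fixation sample gives two gaze rays (one per eye), the point closest to both rays is triangulated, and samples within the fixation interval are averaged to suppress noise. The eye positions, units, and noise model are illustrative assumptions.

```python
# Sketch: triangulating gaze depth from binocular gaze rays (assumed setup).
import numpy as np

def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t*d1 and o2 + s*d2."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w0 = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b + 1e-12
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    return 0.5 * ((o1 + t * d1) + (o2 + s * d2))

def gaze_depth(left_dirs, right_dirs, ipd: float = 0.063):
    """Average gazed-point depth (metres) over one fixation interval."""
    left_eye, right_eye = np.array([-ipd / 2, 0, 0]), np.array([ipd / 2, 0, 0])
    points = [closest_point_between_rays(left_eye, l, right_eye, r)
              for l, r in zip(left_dirs, right_dirs)]
    return float(np.mean([p[2] for p in points]))   # depth along the viewing axis

# Toy usage: both eyes converge on a point 1 m ahead, with small angular noise.
rng = np.random.default_rng(0)
target = np.array([0.0, 0.0, 1.0])
ls = [target - np.array([-0.0315, 0, 0]) + 0.002 * rng.normal(size=3) for _ in range(30)]
rs = [target - np.array([0.0315, 0, 0]) + 0.002 * rng.normal(size=3) for _ in range(30)]
print(round(gaze_depth(ls, rs), 3))   # close to 1.0
```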
43. Image Stitching for Dual Fisheye Cameras
- Author
-
Homer H. Chen, Kuang-Tsu Shih, and I-Chan Lo
- Subjects
Computer science ,business.industry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,02 engineering and technology ,law.invention ,Image stitching ,Lens (optics) ,law ,0202 electrical engineering, electronic engineering, information engineering ,Equirectangular projection ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Image warping ,business ,Panoramic photography ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Panoramic photography creates stunning immersive visual experiences for viewers. In this paper, we investigate how to seamlessly stitch a pair of images captured by two uncalibrated, back-to-back, 195-degree fisheye cameras to generate a surround view of a 3D scene. It is a challenging task because the two camera centers are displaced and because the common region is the most distorted area. To enhance the robustness of feature matching and hence the quality of stitching, we propose a novel technique that projects the image rectilinearly onto an equirectangular plane. Unlike most previous global warping methods that are sensitive to the disparity between back-to-back fisheye images, our method employs local warping for image alignment while preserving the geometric structure of the scene. Experimental results show that our method effectively produces high-quality seamless panoramic images without stitching artifacts.
- Published
- 2018
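The entry above builds its stitching on projecting each fisheye image onto an equirectangular plane before alignment and blending. Below is a hedged sketch of that projection step for a single fisheye view; the equidistant lens model, the 195-degree field of view, and nearest-neighbour sampling are simplifying assumptions.

```python
# Sketch: equidistant fisheye to equirectangular remapping (assumed lens model).
import numpy as np

def fisheye_to_equirect(fisheye: np.ndarray, fov_deg: float = 195.0,
                        out_h: int = 256) -> np.ndarray:
    h, w = fisheye.shape[:2]
    cx, cy, radius = w / 2.0, h / 2.0, min(h, w) / 2.0
    out_w = 2 * out_h
    jj, ii = np.meshgrid(np.arange(out_w), np.arange(out_h))
    lon = (jj / out_w - 0.5) * 2.0 * np.pi           # longitude in [-pi, pi)
    lat = (0.5 - ii / out_h) * np.pi                 # latitude in (-pi/2, pi/2]
    # Unit viewing direction (z is the fisheye optical axis).
    x, y, z = np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)
    theta = np.arccos(np.clip(z, -1.0, 1.0))         # angle from the optical axis
    r = radius * theta / np.radians(fov_deg / 2.0)   # equidistant model: r = f * theta
    denom = np.sqrt(x ** 2 + y ** 2) + 1e-12
    u = np.clip(cx + r * x / denom, 0, w - 1).astype(int)
    v = np.clip(cy + r * y / denom, 0, h - 1).astype(int)
    out = fisheye[v, u]
    out[theta > np.radians(fov_deg / 2.0)] = 0       # outside the lens field of view
    return out

# Toy usage with a random 512x512 "fisheye" frame.
print(fisheye_to_equirect(np.random.rand(512, 512, 3)).shape)   # (256, 512, 3)
```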
44. Playlist-Based Tag Propagation for Improving Music Auto-Tagging
- Author
-
Homer H. Chen, Yi-Hsun Lin, and Chia-Hao Chung
- Subjects
InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Computer science ,020204 information systems ,Speech recognition ,0202 electrical engineering, electronic engineering, information engineering ,Task analysis ,020201 artificial intelligence & image processing ,02 engineering and technology ,Coherence (statistics) ,Convolutional neural network - Abstract
The performance of a music auto-tagging system highly relies on the quality of the training dataset. In particular, each training song should have sufficient relevant tags. Tag propagation is a technique that creates additional tags for a song by passing the tags from other similar songs. In this paper, we present a novel tag propagation approach that exploits the song coherence of a playlist to improve the training of an auto-tagging model. The main idea is to share the tags between neighboring songs in a playlist and to optimize the auto-tagging model through a multi-task objective function. We test the proposed playlist-based approach on a convolutional neural network for music auto-tagging and show that it can indeed provide a significant performance improvement.
- Published
- 2018
45. Extracting Blood Vessels From Full-Field OCT Data of Human Skin by Short-Time RPCA
- Author
-
Pin-Hsien Lee, Sheng-Lung Huang, Andrew C.A. Chen, Homer H. Chen, and Chin-Cheng Chan
- Subjects
Adult ,Male ,genetic structures ,Computer science ,01 natural sciences ,010309 optics ,030207 dermatology & venereal diseases ,03 medical and health sciences ,0302 clinical medicine ,Optical coherence tomography ,0103 physical sciences ,Medical imaging ,medicine ,Image Processing, Computer-Assisted ,Humans ,Electrical and Electronic Engineering ,Sparse matrix ,Skin ,Ground truth ,Principal Component Analysis ,Radiological and Ultrasound Technology ,medicine.diagnostic_test ,business.industry ,Angiography ,Pattern recognition ,Speckle noise ,Blood flow ,eye diseases ,Computer Science Applications ,Face (geometry) ,Blood Vessels ,sense organs ,Artificial intelligence ,business ,Robust principal component analysis ,Software ,Algorithms ,Tomography, Optical Coherence - Abstract
Recent advances in optical coherence tomography (OCT) lead to the development of OCT angiography to provide additional helpful information for diagnosis of diseases like basal cell carcinoma. In this paper, we investigate how to extract blood vessels of human skin from full-field OCT (FF-OCT) data using the robust principal component analysis (RPCA) technique. Specifically, we propose a short-time RPCA method that divides the FF-OCT data into segments and decomposes each segment into a low-rank structure representing the relatively static tissues of human skin and a sparse matrix representing the blood vessels. The method mitigates the problem associated with the slow-varying background and is free of the detection error that RPCA may have when dealing with FF-OCT data. Both short-time RPCA and RPCA methods can extract blood vessels from FF-OCT data with heavy speckle noise, but the former takes only half the computation time of the latter. We evaluate the performance of the proposed method by comparing the extracted blood vessels with the ground truth vessels labeled by a dermatologist and show that the proposed method works equally well for FF-OCT volumes of different quality. The average F-measure improvements over the correlation-mapping OCT method, the modified amplitude-decorrelation OCT angiography method, and the RPCA method, respectively, are 0.1835, 0.1032, and 0.0458.
- Published
- 2018
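A simplified sketch of the short-time RPCA idea in the entry above: the frame sequence is cut into short segments, each segment is reshaped into a frames-by-pixels matrix and decomposed into a low-rank part (static tissue) plus a sparse part (flowing blood) with a basic inexact-ALM RPCA. The segment length, lambda, and iteration budget are illustrative assumptions.

```python
# Sketch: segment-wise robust PCA for vessel extraction (assumed parameters).
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca(M, lam=None, mu=None, iters=100, tol=1e-6):
    """Decompose M ~ L + S with L low rank and S sparse (inexact ALM)."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(iters):
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt       # singular value thresholding
        S = soft_threshold(M - L + Y / mu, lam / mu)       # elementwise shrinkage
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S

def short_time_rpca(frames, segment_len=20):
    """frames: (T, H, W) en-face OCT frames; returns the sparse (vessel) volume."""
    T, H, W = frames.shape
    sparse = np.zeros_like(frames, dtype=float)
    for start in range(0, T, segment_len):
        seg = frames[start:start + segment_len].reshape(-1, H * W).astype(float)
        _, S = rpca(seg)
        sparse[start:start + seg.shape[0]] = S.reshape(-1, H, W)
    return sparse

print(short_time_rpca(np.random.rand(40, 32, 32)).shape)   # (40, 32, 32)
```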
46. Occlusion-and-Edge-Aware Depth Estimation From Stereo Images for Synthetic Refocusing
- Author
-
Hua-Yu Chou, Kuang-Tsu Shih, and Homer H. Chen
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Process (computing) ,020207 software engineering ,02 engineering and technology ,GeneralLiterature_MISCELLANEOUS ,Image (mathematics) ,Feature (computer vision) ,Depth map ,Occlusion ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Enhanced Data Rates for GSM Evolution ,Artificial intelligence ,business - Abstract
Due to the emergence of dual cameras for smart phones in recent years, synthetic refocusing using stereo data has become an important issue. For synthetic refocusing to produce satisfactory results, it is crucial to render the image with a high-quality depth map, which is often obtained through a refinement process. However, most existing depth map refinement algorithms pay little attention to the notorious occlusion problem and the loss of image detail. In this paper, we study how the quality of the depth map affects the performance of synthetic refocusing and, based on the study, propose a new method that integrates the depth information and the RGB image for synthetic refocusing. A notable feature of our approach is that it formulates occlusion filling as a labeling problem and solves it by multi-layer alpha matting, resulting in a depth map with edges well-aligned with the RGB image in the occluded area and giving the synthetic refocusing results a realistic appearance.
- Published
- 2018
47. Dehazing With A See-Through Near-Eye Display
- Author
-
Kuang-Tsu Shih, Homer H. Chen, and Kai-En Lin
- Subjects
Brightness ,Haze ,Computer science ,business.industry ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,GeneralLiterature_MISCELLANEOUS ,Perception ,Near eye display ,Contrast (vision) ,Computer vision ,Artificial intelligence ,business ,media_common - Abstract
Human perception of the visual world in the presence of fog or haze usually suffers from loss of color and detail. In this demo, we show a powerful technique to provide a see-through head-mounted display with the image dehazing capability. Our method utilizes an auxiliary image projected from the head-mounted device onto our eyes to enhance the perceived contrast and brightness of the visual world. This method can be applied in tourism, military, and navigation scenarios to overcome poor visibility conditions.
- Published
- 2018
48. Subjective Evaluation of Vector Representation of Emotion Flow for Music Retrieval
- Author
-
Chia-Hao Chung, Homer H. Chen, and Ming-I Yang
- Subjects
Computer science ,business.industry ,Learnability ,Novelty ,Representation (systemics) ,Usability ,computer.software_genre ,Visualization ,Dynamics (music) ,Music information retrieval ,Artificial intelligence ,Affordance ,business ,computer ,Natural language processing - Abstract
Because it consists simply of an initial point and a terminal point in a two-dimensional emotion plane, the vector representation of music emotion provides an intuitive and instant visualization of the dynamics of music emotion. In this paper, we investigate the performance of this representation for music information retrieval by conducting a series of subjective tests. A music retrieval system is created, and the user experience data are evaluated by seven metrics: learnability, ease of use, affordance, usefulness, joyfulness, novelty, and overall satisfaction. Compared with the point representation, the vector representation performs relatively better in affordance, novelty, and joyfulness but slightly worse in learnability and ease of use. The overall satisfaction score is 5.19 for the point representation and 5.43 for the vector representation. The results suggest that each representation has its own strengths and that the choice between the two depends on which metrics carry more weight in the application at hand.
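To make the representation concrete, the sketch below encodes a clip's emotion flow as an initial and a terminal point in the valence-arousal plane and ranks candidates by a simple endpoint distance. The distance function and the example values are illustrative assumptions; the paper evaluates the representation through subjective tests rather than prescribing a retrieval metric.

```python
from dataclasses import dataclass
import math

@dataclass
class EmotionVector:
    """Emotion flow of a clip as a vector in the valence-arousal plane:
    an initial point (start of the clip) and a terminal point (end)."""
    v0: float; a0: float   # initial valence, arousal
    v1: float; a1: float   # terminal valence, arousal

def vector_distance(q: EmotionVector, c: EmotionVector, w_end: float = 1.0) -> float:
    """One plausible retrieval score: sum of Euclidean distances between the
    corresponding endpoints of the query and candidate vectors."""
    d_start = math.hypot(q.v0 - c.v0, q.a0 - c.a0)
    d_end = math.hypot(q.v1 - c.v1, q.a1 - c.a1)
    return d_start + w_end * d_end

# Example query: emotion moving from calm-positive toward excited-positive
query = EmotionVector(0.3, -0.2, 0.6, 0.5)
catalog = {"song_a": EmotionVector(0.2, -0.1, 0.7, 0.4),
           "song_b": EmotionVector(-0.5, 0.6, -0.4, 0.7)}
print(min(catalog, key=lambda k: vector_distance(query, catalog[k])))
```

A point representation would collapse each vector to a single (valence, arousal) coordinate, which is simpler to specify but discards the direction of emotional change that the vector form preserves.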
- Published
- 2018
49. Towards radiologist-level cancer risk assessment in CT lung screening using deep learning
- Author
-
Tobias Klinder, Homer H. Pien, Christoph Wald, Stojan Trajanovski, Binyam Gebrekidan Gebre, Brady McKee, Sebastian Flacke, Bastiaan S. Veeling, Heber MacMahon, Shawn Regis, Dimitrios Mavroeidis, Rafael Wiemker, Amir M. Tahmasebi, and Christine Leon Swisher
- Subjects
Male ,FOS: Computer and information sciences ,medicine.medical_specialty ,Lung Neoplasms ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,Stability (learning theory) ,Health Informatics ,Context (language use) ,Risk Assessment ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,Deep Learning ,0302 clinical medicine ,Radiologists ,medicine ,Humans ,Radiology, Nuclear Medicine and imaging ,Stage (cooking) ,Lung cancer ,Lung ,Early Detection of Cancer ,Radiological and Ultrasound Technology ,business.industry ,Cancer ,medicine.disease ,Computer Graphics and Computer-Aided Design ,National Lung Screening Trial ,Computer Vision and Pattern Recognition ,Radiology ,Tomography, X-Ray Computed ,business ,Risk assessment ,030217 neurology & neurosurgery ,Lung cancer screening - Abstract
Importance: Lung cancer is the leading cause of cancer mortality in the US, responsible for more deaths than breast, prostate, colon, and pancreas cancers combined, and it has recently been demonstrated that low-dose computed tomography (CT) screening of the chest can significantly reduce this death rate. Objective: To compare the performance of a deep learning model to state-of-the-art automated algorithms and radiologists, and to assess the robustness of the algorithm on heterogeneous datasets. Design, Setting, and Participants: Three low-dose CT lung cancer screening datasets from heterogeneous sources were used, including National Lung Screening Trial (NLST, n=3410) data, Lahey Hospital and Medical Center (LHMC, n=3174) data, Kaggle competition data (from both stages, n=1595+505), and University of Chicago data (UCM, a subset of NLST annotated by radiologists, n=197). Prior work on automated lung cancer malignancy estimation has used datasets significantly smaller in size and diversity. In the first stage, our framework employs a nodule detector; in the second stage, we use both the image area around the nodules and nodule features as inputs to a neural network that estimates the malignancy risk for the entire CT scan. We trained the two-stage algorithm on part of the NLST dataset and validated it on the other datasets. Results, Conclusions, and Relevance: The proposed deep learning model (a) generalizes well across all three datasets, achieving an AUC between 86% and 94%; (b) outperforms the widely accepted PanCan Risk Model, achieving an 11% better AUC score; (c) improves on the state of the art represented by the winners of the Kaggle Data Science Bowl 2017 competition on lung cancer screening; and (d) performs comparably to radiologists in estimating cancer risk at the patient level.
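The two-stage design can be summarized as: detect candidate nodules, score each nodule from a local 3D patch plus nodule features, and aggregate the per-nodule scores into a scan-level risk. The PyTorch sketch below is a minimal illustration of the second stage only; the architecture, patch size, feature count, and max-pooling aggregation are assumptions for illustration, not the authors' network.

```python
import torch
import torch.nn as nn

class NoduleRiskNet(nn.Module):
    """Second-stage sketch: scores each detected nodule from a 3D image patch
    plus hand-crafted nodule features, then aggregates to a scan-level risk."""
    def __init__(self, num_nodule_features=8):
        super().__init__()
        self.encoder = nn.Sequential(                  # input: 1 x 32 x 32 x 32 patch
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())      # output: 64-d embedding
        self.head = nn.Sequential(
            nn.Linear(64 + num_nodule_features, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, patches, features):
        # patches: (num_nodules, 1, 32, 32, 32); features: (num_nodules, F)
        per_nodule = self.head(torch.cat([self.encoder(patches), features], dim=1))
        # Scan-level risk taken as the risk of the most suspicious nodule
        return torch.sigmoid(per_nodule.max())

# Usage with dummy inputs for five detected nodules
model = NoduleRiskNet()
risk = model(torch.randn(5, 1, 32, 32, 32), torch.randn(5, 8))
```

Aggregating by the maximum over nodules is one simple way to turn per-nodule scores into a patient-level prediction; other pooling or attention schemes are equally plausible.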
- Published
- 2018
50. Guest Editorial: Visual Information Processing and Perception
- Author
-
Homer H. Chen, Hari Kalva, Gerardo Fernández-Escribano, and Velibor Adzic
- Subjects
Visual perception ,Point (typography) ,Computer Networks and Communications ,Computer science ,business.industry ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Wearable computer ,Video processing ,Automatic summarization ,Hardware and Architecture ,Perception ,Media Technology ,Computer vision ,Artificial intelligence ,Bitstream ,business ,Software ,media_common - Abstract
The goal of this special issue was to bring together experts working on applying models, principles, and knowledge of human audio-visual perception and cognition to optimize video processing algorithms and applications. This is an area where a deeper understanding of visual perception, applied to video processing, will lead to new breakthroughs in visual information processing. Attentional focus is an important aspect of perception that helps understand user interest and intent. Determining points of saliency is computationally complex, so compressed-domain methods become valuable tools for developing computationally efficient and practical solutions. In "Compressed-Domain Correlates of Human Fixations in Dynamic Scenes" (10.1007/s11042-015-2802-3), the authors present a method for detecting points of fixation in H.264/AVC video using motion vectors, block coding modes, and coded residuals parsed from an H.264/AVC bitstream. Egocentric videos are captured using wearable cameras and used to detect the wearer's point of view. The amount of video captured with wearable cameras is increasing, and saliency detection in such videos enables applications such as efficient video summarization. In "Geometrical Cues in Visual Saliency Models for Active Object Recognition in Egocentric Videos" (10.1007/s11042-015-2803-2), the authors use geometrical cues to improve saliency detection in videos. Action recognition from videos is a challenging task with many applications, including surveillance and social behavior understanding. Inspired by models of neural response to visual input, the paper entitled "Deep Learning Human Actions from Video via Sparse Filtering and Locally Competitive Algorithms" (10.1007/s11042-015-2808-x) presents a method that combines sparse filtering with locally competitive algorithms for applications in action recognition.
- Published
- 2015