85 results for "Smolic, Aljosa"
Search Results
2. Exercise quantification from single camera view markerless 3D pose estimation
- Author
-
Mercadal-Baudart, Clara, Liu, Chao-Jung, Farrell, Garreth, Boyne, Molly, González Escribano, Jorge, Smolic, Aljosa, and Simms, Ciaran
- Published
- 2024
- Full Text
- View/download PDF
3. Spectral analysis of re-parameterized light fields
- Author
-
Alain, Martin and Smolic, Aljosa
- Published
- 2022
- Full Text
- View/download PDF
4. Frequency-domain loss function for deep exposure correction of dark images
- Author
-
Yadav, Ojasvi, Ghosal, Koustav, Lutz, Sebastian, and Smolic, Aljosa
- Published
- 2021
- Full Text
- View/download PDF
5. Delivery of omnidirectional video using saliency prediction and optimal bitrate allocation
- Author
-
Ozcinar, Cagri, İmamoğlu, Nevrez, Wang, Weimin, and Smolic, Aljosa
- Published
- 2021
- Full Text
- View/download PDF
6. Per-point processing for detailed urban solar estimation with aerial laser scanning and distributed computing
- Author
-
Vo, Anh Vu, Laefer, Debra F., Smolic, Aljosa, and Zolanvari, S.M. Iman
- Published
- 2019
- Full Text
- View/download PDF
7. 2DToonShade: A stroke based toon shading system
- Author
-
Hudon, Matis, Grogan, Mairéad, Pagés, Rafael, Ondřej, Jan, and Smolić, Aljoša
- Published
- 2019
- Full Text
- View/download PDF
8. Robust global and local color matching in stereoscopic omnidirectional content
- Author
-
Dudek, Roman, Croci, Simone, Smolic, Aljosa, and Knorr, Sebastian
- Published
- 2019
- Full Text
- View/download PDF
9. SalNet360: Saliency maps for omni-directional images with CNN
- Author
-
Monroy, Rafael, Lutz, Sebastian, Chalasani, Tejo, and Smolic, Aljosa
- Published
- 2018
- Full Text
- View/download PDF
10. Visual attention-aware quality estimation framework for omnidirectional video using spherical Voronoi diagram
- Author
-
Croci, Simone, Ozcinar, Cagri, Zerman, Emin, Knorr, Sebastian, Cabrera, Julián, and Smolic, Aljosa
- Published
- 2020
- Full Text
- View/download PDF
11. BASICS: Broad quality Assessment of Static point clouds In Compression Scenarios
- Author
-
Ak, Ali, Zerman, Emin, Quach, Maurice, Chetouani, Aladine, Smolic, Aljosa, Valenzise, Giuseppe, and Le Callet, Patrick
- Subjects
FOS: Computer and information sciences ,Computer Science - Graphics ,Image and Video Processing (eess.IV) ,FOS: Electrical engineering, electronic engineering, information engineering ,Electrical Engineering and Systems Science - Image and Video Processing ,Graphics (cs.GR) ,Computer Science - Multimedia ,Multimedia (cs.MM) - Abstract
Point clouds are now commonly used to represent 3D scenes in virtual worlds, in addition to 3D meshes. Their ease of capture enables various applications on mobile devices, such as smartphones or other microcontrollers. Point cloud compression is now at an advanced level and is being standardized. Nevertheless, quality assessment databases, which are needed to develop better objective quality metrics, are still limited. In this work, we create a broad quality assessment database for static point clouds, mainly for telepresence scenarios. For the sake of completeness, the created database is analyzed using the mean opinion scores, and it is used to benchmark several state-of-the-art quality estimators. The generated database is named Broad quality Assessment of Static point clouds In Compression Scenarios (BASICS). Currently, the BASICS database is used as part of the ICIP 2023 Grand Challenge on Point Cloud Quality Assessment, and therefore only a part of the database has been made publicly available on the challenge website. The rest of the database will be made available once the challenge is over., Manuscript in preparation, 11 pages, 8 figures
- Published
- 2023
12. Feel the Music!—Audience Experiences of Audio–Tactile Feedback in a Novel Virtual Reality Volumetric Music Video.
- Author
-
Young, Gareth W., O'Dwyer, Néill, Vargas, Mauricio Flores, McDonnell, Rachel, and Smolic, Aljosa
- Subjects
VIRTUAL reality ,MUSIC videos ,MUSICAL performance ,HUMAN-computer interaction ,USER experience ,PRACTICING (Music performance) - Abstract
The creation of imaginary worlds has been the focus of philosophical discourse and artistic practice for millennia. Humans have long evolved to use media and imagination to express their inner worlds outwardly via artistic practice. As a fundamental factor of fantasy world-building, the imagination can produce novel objects, virtual sensations, and unique stories related to previously unlived experiences. The expression of the imagination often takes a narrative form that applies some medium to facilitate communication, for example, books, statues, music, or paintings. These virtual realities are expressed and communicated via multiple multimedia immersive technologies, stimulating modern audiences via their combined Aristotelian senses. Incorporating interactive graphic, auditory, and haptic narrative elements in extended reality (XR) permits artists to express their imaginative intentions with visceral accuracy. However, these technologies are constantly in flux, and the precise role of multimodality has yet to be fully explored. Thus, this contribution to Feeling the Future—Haptic Audio explores the potential of novel multimodal technology to communicate artistic expression via an immersive virtual reality (VR) volumetric music video. We compare user experiences of our affordable volumetric video (VV) production to more expensive commercial VR music videos. Our research also inspects audio–tactile interactions in the auditory experience of immersive music videos, where both auditory and haptic channels receive vibrations during the imaginative virtual performance. This multimodal interaction is then analyzed from the audience's perspective to capture the user's experiences and examine the impact of this form of haptic feedback in practice via applied human–computer interaction (HCI) evaluation practices. Our results demonstrate the application of haptics in contemporary music consumption practices, discussing how they affect audience experiences regarding functionality, usability, and the perceived quality of a musical performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Chapter 21 - Volumetric video as a novel medium for creative storytelling
- Author
-
Young, Gareth W., O'Dwyer, Néill, and Smolic, Aljosa
- Published
- 2023
- Full Text
- View/download PDF
14. Chapter 18 - Subjective and objective quality assessment for volumetric video
- Author
-
Alexiou, Evangelos, Nehmé, Yana, Zerman, Emin, Viola, Irene, Lavoué, Guillaume, Ak, Ali, Smolic, Aljosa, Le Callet, Patrick, and Cesar, Pablo
- Published
- 2023
- Full Text
- View/download PDF
15. Chapter 4 - Subjective and objective quality assessment for omnidirectional video
- Author
-
Croci, Simone, Singla, Ashutosh, Fremerey, Stephan, Raake, Alexander, and Smolic, Aljosa
- Published
- 2023
- Full Text
- View/download PDF
16. Exploring virtual reality for quality immersive empathy building experiences.
- Author
-
Young, Gareth W., O'Dwyer, Néill, and Smolic, Aljosa
- Subjects
KRUSKAL-Wallis Test ,EMPATHY ,FOCUS groups ,ANALYSIS of variance ,CONFIDENCE intervals ,VIRTUAL reality ,RESEARCH methodology ,MANN Whitney U Test ,COMPARATIVE studies ,CRONBACH'S alpha ,T-test (Statistics) ,IMAGINATION ,RESEARCH funding ,REPEATED measures design ,DESCRIPTIVE statistics ,QUESTIONNAIRES ,SCALE analysis (Psychology) ,CHI-squared test ,SOCIAL skills ,CONTENT analysis - Abstract
Virtual reality (VR) technology presents users with virtual environments to experience various interactive, immersive, and imaginary experiences. While traditional perspective-taking exercises rely on the participant to imagine a self-other merging process to feel connected with other people (typically using second and third-person narrative perspectives), VR can allow an individual to embody an other through first-person narratives delivered via multimodal – visual, aural, haptic – technology-mediated experiences. This process enables users to perceptually and effectively portal into somebody else's body, where they can potentially see, hear, and feel from the point of view of the protagonist and control choices on their behalf in real-time. This article explores the use of VR as an 'empathy-making machine' by facilitating perspective-taking and allowing users to experience another person's circumstances. An experiment was performed to compare two different types of perspective-taking VR applications. Levels of empathy, oneness, and attitudes towards a protagonist or focus group within VR materials were captured. Participants then identified the elements of the VR content that contributed to a quality experience. These measures were used to discuss methodologies and techniques for creating quality empathy-building techniques. The findings of this research will be used to inform future creative technology projects presented in VR. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. Assessment of deep learning pose estimates for sports collision tracking.
- Author
-
Blythman, Richard, Saxena, Manan, Tierney, Gregory J., Richter, Chris, Smolic, Aljosa, and Simms, Ciaran
- Subjects
SPORTS injuries risk factors ,DEEP learning ,PREDICTIVE tests ,RUGBY football ,RISK assessment ,BODY movement ,MOTION capture (Human mechanics) ,KINEMATICS ,VIDEO recording - Abstract
Injury assessment during sporting collisions requires estimation of the associated kinematics. While marker-based solutions are widely accepted as providing accurate and reliable measurements, setup times are lengthy and it is not always possible to outfit athletes with restrictive equipment in sporting situations. A new generation of markerless motion capture based on deep learning techniques holds promise for enabling measurement of movement in the wild. The aim of this work is to evaluate the performance of a popular deep learning model "out of the box" for human pose estimation, on a dataset of ten staged rugby tackle movements performed in a marker-based motion capture laboratory with a system of three high-speed video cameras. An analysis of the discrepancy between joint positions estimated by the marker-based and markerless systems shows that the deep learning approach performs acceptably well in most instances, although high errors exist during challenging intervals of heavy occlusion and self-occlusion. In total, 75.6% of joint position estimates are found to have a mean absolute error (MAE) of less than or equal to 25 mm, 17.8% have an MAE between 25 and 50 mm, and 6.7% have an MAE greater than 50 mm. The mean per-joint position error is 47 mm. [ABSTRACT FROM AUTHOR] (A short illustrative code sketch follows this record.)
- Published
- 2022
- Full Text
- View/download PDF
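The joint-position error analysis described in the abstract above reduces to a per-joint mean absolute error between time-aligned marker-based and markerless trajectories. The sketch below is a minimal illustration of that comparison, not the authors' evaluation code; the array shapes, the per-axis averaging, and the random placeholder data are assumptions.

```python
import numpy as np

def per_joint_mae(marker_xyz, markerless_xyz):
    """Per-joint mean absolute error between two motion-capture systems.

    Both inputs are assumed to have shape (frames, joints, 3), expressed in
    millimetres in a common, time-aligned coordinate frame (an assumption;
    the paper's alignment and exact error definition may differ).
    """
    abs_err = np.abs(marker_xyz - markerless_xyz)   # (frames, joints, 3)
    return abs_err.mean(axis=(0, 2))                # average over frames and x/y/z

# Toy data standing in for real captures; joints are bucketed by the
# 25 mm / 50 mm thresholds quoted in the abstract.
marker = np.random.rand(100, 17, 3) * 50
markerless = marker + np.random.randn(100, 17, 3) * 20
mae = per_joint_mae(marker, markerless)
print(f"<=25 mm: {(mae <= 25).mean():.1%}, "
      f"25-50 mm: {((mae > 25) & (mae <= 50)).mean():.1%}, "
      f">50 mm: {(mae > 50).mean():.1%}")
```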
18. A Study on Visual Perception of Light Field Content
- Author
-
Smolic, Aljosa, Zerman, Emin, Ozcinar, Cagri, and Gill, Ailbhe
- Subjects
FOS: Computer and information sciences ,I.2.10 ,I.4 ,I.5 ,Computer Vision and Pattern Recognition (cs.CV) ,Image and Video Processing (eess.IV) ,05 social sciences ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,Electrical Engineering and Systems Science - Image and Video Processing ,050105 experimental psychology ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences - Abstract
The effective design of visual computing systems depends heavily on the anticipation of visual attention, or saliency. While visual attention is well investigated for conventional 2D images and video, it is nevertheless a very active research area for emerging immersive media. In particular, visual attention of light fields (light rays of a scene captured by a grid of cameras or micro lenses) has only recently become a focus of research. As they may be rendered and consumed in various ways, a primary challenge that arises is the definition of what visual perception of light field content should be. In this work, we present a visual attention study on light field content. We conducted perception experiments displaying them to users in various ways and collected corresponding visual attention data. Our analysis highlights characteristics of user behaviour in light field imaging applications. The light field data set and attention data are provided with this paper., To appear in Irish Machine Vision and Image Processing (IMVIP) 2020
- Published
- 2020
19. Sub-Pixel Back-Projection Network For Lightweight Single Image Super-Resolution
- Author
-
Smolic, Aljosa, Banerjee, Supratik, Ozcinar, Cagri, Rana, Aakanksha, and Manzke, Michael
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Image and Video Processing (eess.IV) ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition ,020201 artificial intelligence & image processing ,02 engineering and technology ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Convolutional neural network (CNN)-based methods have achieved great success for single-image super-resolution (SISR). However, most models attempt to improve reconstruction accuracy at the cost of an increased number of model parameters. To tackle this problem, in this paper we study how to reduce the number of parameters and the computational cost of CNN-based SISR methods while maintaining super-resolution reconstruction accuracy. To this end, we introduce a novel network architecture for SISR which strikes a good trade-off between reconstruction quality and low computational complexity. Specifically, we propose an iterative back-projection architecture using sub-pixel convolution instead of deconvolution layers. We evaluate the computational cost and reconstruction accuracy of our proposed model with extensive quantitative and qualitative evaluations. Experimental results reveal that our proposed method uses fewer parameters and reduces the computational cost while maintaining reconstruction accuracy against state-of-the-art SISR methods over four well-known SR benchmark datasets. Code is available at "https://github.com/supratikbanerjee/SubPixel-BackProjection_SuperResolution"., To appear in IMVIP 2020 (A short illustrative code sketch follows this record.)
- Published
- 2020
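The abstract above hinges on replacing deconvolution layers with sub-pixel convolution inside an iterative back-projection architecture. The PyTorch sketch below illustrates that idea only in outline; the channel counts, kernel sizes, activation choice, and the single back-projection step are illustrative assumptions rather than the paper's actual network.

```python
import torch
import torch.nn as nn

class SubPixelUp(nn.Module):
    """Upsampling via sub-pixel convolution (conv + pixel shuffle) instead of
    a deconvolution layer. Channel count and kernel size are placeholders."""
    def __init__(self, channels=64, scale=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * scale ** 2, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)
        self.act = nn.PReLU()

    def forward(self, x):
        return self.act(self.shuffle(self.conv(x)))

class BackProjectionBlock(nn.Module):
    """One iterative back-projection step: upsample, project back down,
    and use the low-resolution residual to correct the upsampled estimate."""
    def __init__(self, channels=64, scale=2):
        super().__init__()
        self.up1 = SubPixelUp(channels, scale)
        self.down = nn.Conv2d(channels, channels, kernel_size=3, stride=scale, padding=1)
        self.up2 = SubPixelUp(channels, scale)

    def forward(self, lr_feat):
        hr = self.up1(lr_feat)        # initial high-resolution feature estimate
        lr_back = self.down(hr)       # project back to low resolution
        residual = lr_feat - lr_back  # low-resolution reconstruction error
        return hr + self.up2(residual)

x = torch.randn(1, 64, 32, 32)
print(BackProjectionBlock()(x).shape)  # torch.Size([1, 64, 64, 64])
```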
20. A Case Study on Video Color Transfer: Exploring User Motivations, Expectations, and Satisfaction
- Author
-
Grogan, Mairéad, Zerman, Emin, Young, Gareth W., and Smolic, Aljosa
- Subjects
FOS: Computer and information sciences ,Image and Video Processing (eess.IV) ,FOS: Electrical engineering, electronic engineering, information engineering ,Computer Science - Human-Computer Interaction ,Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Multimedia ,Human-Computer Interaction (cs.HC) ,Multimedia (cs.MM) - Abstract
Multimedia and creativity software products are being used to edit and control various elements of creative media practices. These days, the technical affordances of mobile multimedia devices and the advent of high-speed 5G internet access mean that these abilities are simpler and more readily available to be harnessed by mobile applications. In this paper, using a prototype application, we discuss how potential users of such technology are motivated to use a video recoloring application and explore the role that user expectation and satisfaction play in this process. By exploring this topic and focusing on the human-computer interaction, we found that color transfer interactions are driven by several intrinsic motivations and that user expectations and satisfaction ratings can be maintained via clear visualizations of the processes to be undertaken. Furthermore, we reveal the specific language that users use to communicate video recoloring when regarding user motivations, expectations, and satisfaction. This research provides important information for developers of state-of-art recoloring processes and contributes to dialogues surrounding the users of mobile multimedia technology in practice.
- Published
- 2020
21. A Study of Efficient Light Field Subsampling and Reconstruction Strategies
- Author
-
Smolic, Aljosa, Alain, Martin, and Yang, Chen
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,0202 electrical engineering, electronic engineering, information engineering ,020207 software engineering ,020201 artificial intelligence & image processing ,02 engineering and technology - Abstract
Limited angular resolution is one of the main obstacles for practical applications of light fields. Although numerous approaches have been proposed to enhance angular resolution, view selection strategies have not been well explored in this area. In this paper, we study subsampling and reconstruction strategies for light fields. First, different subsampling strategies are studied with a fixed sampling ratio, such as row-wise sampling, column-wise sampling, or their combinations. Second, several strategies are explored to reconstruct intermediate views from four regularly sampled input views. The influence of the angular density of the input is also evaluated. We evaluate these strategies on both real-world and synthetic datasets, and optimal selection strategies are devised from our results. These can be applied in future light field research such as compression, angular super-resolution, and design of camera systems., Comment: Accepted at IMVIP 2020
- Published
- 2020
- Full Text
- View/download PDF
22. XR Ulysses: addressing the disappointment of cancelled site-specific re-enactments of Joycean literary cultural heritage on Bloomsday.
- Author
-
O'Dwyer, Néill, Young, Gareth W., and Smolic, Aljosa
- Subjects
CULTURAL property ,DISAPPOINTMENT ,DIGITAL technology ,VIRTUAL reality ,COVID-19 pandemic ,SOCIAL reality - Abstract
Site-specific performances are shows created for a specific location and can occur in one or more areas outside the traditional theatre. Social gathering restrictions during the Covid-19 lockdown demanded that these shows be shut down. However, site-specific performances that apply emergent and novel mobile digital technologies have been afforded a compelling voice in showing how performance practitioners and audiences might proceed under the stifling constraints of lockdown and altered live performance paradigms, however they may manifest. Although extended reality (XR) technologies have been in development for a long time, their recent surge in sophistication presents renewed potentialities for site-specific performers to explore ways of bringing the physical world into the digital to recreate real-world places in shared digital spaces. In this research, we explore the potential role of digital XR technologies, such as volumetric video, social virtual reality (VR) and photogrammetry, for simulating site-specific theatre, thereby assessing the potential of these content creation techniques to support future remote performative events. We report specifically on adapting a real-world site-specific performance for VR. This case study approach provides examples and opens dialogues on innovative approaches to site-specific performance in the post-Covid-19 era. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
23. List of contributors
- Author
-
Ak, Ali, Alain, Martin, Alaya Cheikh, Faouzi, Alexiou, Evangelos, Battisti, Federica, Bätz, Michel, Cagnazzo, Marco, Cesar, Pablo, Chao, Fang-Yi, Chelli, Kelvin, Chen, Siheng, Croci, Simone, Dufaux, Frédéric, Eisert, Peter, Feldmann, Ingo, Fink, Laura, Fößel, Siegfried, Forchhammer, Søren, Fremerey, Stephan, Garus, Patrick, Goldmann, Florian, Graziosi, Danillo, Guedes, Alan, Gul, Muhammad Shahzeb Khan, Hellge, Cornelius, Helzle, Volker, Herfet, Thorsten, Hilsmann, Anna, Jaschke, Tobias, Jung, Joël, Keinert, Joachim, Krivokuća, Maja, Lavoué, Guillaume, Lebreton, Pierre, Le Callet, Patrick, Le Pendu, Mikael, Li, Jie, Liu, Jingyu, Mantel, Claire, Mantiuk, Rafał K., Marvie, Jean-Eudes, Maugey, Thomas, Milovanović, Marta, Nehmé, Yana, O'Dwyer, Néill, Ozcinar, Cagri, Palomar, Rafael, Pang, Jiahao, Pelanis, Egidijus, Prappacher, Nico, Prasanna Kumar, Rahul, Quach, Maurice, Raake, Alexander, Rossi, Silvia, Schreer, Oliver, Singla, Ashutosh, Smolic, Aljosa, Stepanov, Milan, Tian, Dong, Toni, Laura, Valenzise, Giuseppe, Viola, Irene, Wang, Congcong, Young, Gareth W., Zeng, Jin, Zerman, Emin, Zhong, Fangcheng, and Ziegler, Matthias
- Published
- 2023
- Full Text
- View/download PDF
24. Interactive Light Field Tilt-Shift Refocus with Generalized Shift-and-Sum
- Author
-
Alain, Martin, Aenchbacher, Weston, and Smolic, Aljosa
- Subjects
FOS: Computer and information sciences ,Computer Science - Graphics ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Human-Computer Interaction ,Graphics (cs.GR) ,Human-Computer Interaction (cs.HC) - Abstract
Since their introduction more than two decades ago, light fields have gained considerable interest in the graphics and vision communities due to their ability to provide the user with interactive visual content. One of the earliest and most common light field operations is digital refocus, enabling the user to choose the focus and depth-of-field for the image after capture. A common interactive method for such an operation utilizes disparity estimates, readily available from the light field, to allow the user to point-and-click on the image to choose the location of the refocus plane. In this paper, we address the interactivity of a lesser-known light field operation: refocus to a non-frontoparallel plane, simulating the result of traditional tilt-shift photography. For this purpose we introduce a generalized shift-and-sum framework. Further, we show that the inclusion of depth information allows for intuitive interactive methods for placement of the refocus plane. In addition to refocusing, light fields also enable the user to interact with the viewpoint, which can be easily included in the proposed generalized shift-and-sum framework., 4 pages, 5 figures, to be published in Proceedings of the European Light Field Imaging Workshop 2019, authors Martin Alain and Weston Aenchbacher contributed equally to this publication, additional results can be found at https://v-sense.scss.tcd.ie/research/tilt-shift/ (A short illustrative code sketch follows this record.)
- Published
- 2019
- Full Text
- View/download PDF
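The paper above generalizes the classical shift-and-sum refocus to non-frontoparallel (tilt-shift) planes. As background, the sketch below shows only the baseline frontoparallel shift-and-sum operation on a 4D light field; the (U, V, H, W) layout, grayscale views, and integer pixel shifts are simplifying assumptions, and the generalized framework from the paper is not reproduced.

```python
import numpy as np

def shift_and_sum_refocus(light_field, disparity):
    """Classical frontoparallel shift-and-sum refocus.

    light_field : array of shape (U, V, H, W) holding grayscale sub-aperture
                  views on a regular (U, V) camera grid (an assumed layout).
    disparity   : per-view shift in pixels per unit of angular distance,
                  selecting the depth of the in-focus plane.
    Integer shifts are used for brevity; sub-pixel shifts would require
    interpolation.
    """
    U, V, H, W = light_field.shape
    u0, v0 = (U - 1) / 2.0, (V - 1) / 2.0
    out = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            dy = int(round(disparity * (u - u0)))
            dx = int(round(disparity * (v - v0)))
            out += np.roll(light_field[u, v], shift=(dy, dx), axis=(0, 1))
    return out / (U * V)

lf = np.random.rand(9, 9, 64, 64)   # toy 9x9 light field
refocused = shift_and_sum_refocus(lf, disparity=1.5)
print(refocused.shape)  # (64, 64)
```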
25. A PILOT STUDY ON VIDEO COLOR TRANSFER: A SURVEY OF USER-TYPE OPINIONS.
- Author
-
Grogan, Mairéad, Zerman, Emin, Young, Gareth W., and Smolic, Aljosa
- Subjects
PILOT projects ,ELECTRONIC commerce ,MOBILE apps ,IDL (Computer program language) ,HUMAN-computer interaction - Abstract
Multimedia software products can be used to create and edit various aspects of online media. Recently, the affordances of mobile devices and high-speed mobile data networks mean that these editing capabilities are more readily available for mobile devices enabling a broader consumer-base. However, the precise role of the user in creative practice is often neglected in favor of reporting faster, more streamlined device functionality. In this paper, we seek to identify high-level human-computer interaction issues concerning video recoloring interfaces that are driven by the needs of different user-types via a methodological and explorative process. By conducting a pilot study, we have captured both quantitative and qualitative responses that formatively explore the role of the user in video recoloring tasks carried out on mobile devices. This research presents a variety of user responses to a video recoloring application, identifying areas of future investigation for explorative practices in user interface design for video recoloring visualization. These findings present important information for researchers exploring the use of state-of-art video recoloring processes and contribute to dialogues surrounding the study of mobile technology in use. [ABSTRACT FROM AUTHOR]
- Published
- 2020
26. Dynamic Environment Mapping for Augmented Reality Applications on Mobile Devices
- Author
-
Monroy, Rafael, Hudon, Matis, and Smolic, Aljosa
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Augmented Reality is a topic of foremost interest nowadays. Its main goal is to seamlessly blend virtual content into real-world scenes. Due to the lack of computational power in mobile devices, rendering a virtual object with a high-quality, coherent appearance in real time remains an area of active research. In this work, we present a novel pipeline that allows for coupled environment acquisition and virtual object rendering on a mobile device equipped with a depth sensor. While keeping human interaction to a minimum, our system can scan a real scene and project it onto a two-dimensional environment map containing RGB+Depth data. Furthermore, we define a set of criteria that allows for an adaptive update of the environment map to account for dynamic changes in the scene. Then, under the assumption of diffuse surfaces and distant illumination, our method exploits an analytic expression for the irradiance in terms of spherical harmonic coefficients, which leads to a very efficient rendering algorithm. We show that all the processes in our pipeline can be executed while maintaining an average frame rate of 31 Hz on a mobile device., To be presented at VMV 2018 (Eurographics Digital Library) (A short illustrative code sketch follows this record.)
- Published
- 2018
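The abstract above mentions an analytic expression for diffuse irradiance in terms of spherical harmonic coefficients. A common form of that expression is the nine-coefficient formula of Ramamoorthi and Hanrahan (2001), sketched below; the dictionary-based coefficient layout and the toy lighting values are assumptions, and the environment-map acquisition and update criteria from the paper are not shown.

```python
import numpy as np

# Constants from Ramamoorthi and Hanrahan, "An Efficient Representation for
# Irradiance Environment Maps" (SIGGRAPH 2001).
C1, C2, C3, C4, C5 = 0.429043, 0.511664, 0.743125, 0.886227, 0.247708

def sh_irradiance(L, n):
    """Diffuse irradiance for unit surface normal n = (x, y, z), given the
    nine spherical-harmonic lighting coefficients L indexed as L[(l, m)]
    for l <= 2 (scalar here; one set per colour channel in practice)."""
    x, y, z = n
    return (C1 * L[(2, 2)] * (x * x - y * y)
            + C3 * L[(2, 0)] * z * z
            + C4 * L[(0, 0)]
            - C5 * L[(2, 0)]
            + 2 * C1 * (L[(2, -2)] * x * y + L[(2, 1)] * x * z + L[(2, -1)] * y * z)
            + 2 * C2 * (L[(1, 1)] * x + L[(1, -1)] * y + L[(1, 0)] * z))

# Toy example: constant (ambient-only) lighting.
L = {(l, m): 0.0 for l in range(3) for m in range(-l, l + 1)}
L[(0, 0)] = 1.0
print(sh_irradiance(L, (0.0, 0.0, 1.0)))  # C4 * 1.0 ≈ 0.886
```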
27. AlphaGAN: Generative adversarial networks for natural image matting
- Author
-
Lutz, Sebastian, Amplianitis, Konstantinos, and Smolic, Aljosa
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition - Abstract
We present the first generative adversarial network (GAN) for natural image matting. Our novel generator network is trained to predict visually appealing alphas with the addition of an adversarial loss from the discriminator, which is trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherent in convolutional neural networks (CNNs) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alphamatting online benchmark for the gradient error and comparable results on the other metrics. Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production., Accepted at BMVC 2018
- Published
- 2018
28. Colour Correction for Stereoscopic Omnidirectional Images
- Author
-
Smolic, Aljosa, Croci, Simone, Grogan, Mairead, and Knorr, Sebastian
- Subjects
FOS: Computer and information sciences ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Stereoscopic omnidirectional images (ODI) when viewed with a head-mounted display are a way to generate an immersive experience. Unfortunately, their creation is not an easy process, and different problems can be present in the ODI that can reduce the quality of experience. A common problem is colour mismatch, which occurs when the colours of the objects in the scene are different between the two stereoscopic views. In this paper we propose a novel method for the correction of colour mismatch based on the subdivision of ODIs into patches, where local colour correction transformations are fitted and then globally combined. The results presented in the paper show that the proposed method is able to reduce the colour mismatch in stereoscopic ODIs.
- Published
- 2018
- Full Text
- View/download PDF
29. Automatic Palette Extraction for Image Editing
- Author
-
Smolic, Aljosa, Grogan, Mairéad, Hudon, Matis, and McCormack, Daniel
- Subjects
FOS: Computer and information sciences ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Data_MISCELLANEOUS ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Interactive palette-based colour editing applications have grown in popularity in recent years, but while many methods propose fast palette extraction techniques, they typically rely on the user to define the number of colours needed. In this paper, we present an approach that extracts a small set of representative colours from an image automatically, determining the optimal palette size without user interaction. Our iterative technique assigns a vote to each pixel in the image based on how close it is in colour space to the colours already in the palette. We use a histogram to divide the colours into bins and determine which colour occurs most frequently in the image but is far away from all of the palette colours, and we add this colour to the palette. This process continues until all pixels in the image are well represented by the palette. Comparisons with existing methods show that our colour palettes compare well to other state-of-the-art techniques, while also computing the optimal number of colours automatically at interactive speeds. In addition, we showcase how our colour palette performs when used in image editing applications such as colour transfer and layer decomposition. (A short illustrative code sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
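The abstract above outlines the iterative vote-and-histogram procedure fairly concretely, so a rough sketch of that loop is given below. The bin count, distance threshold, and coverage criterion are illustrative guesses, not values from the paper; only the overall idea (add the most frequent colour that is still far from the palette, stop when all pixels are well represented) follows the abstract.

```python
import numpy as np

def extract_palette(img, bins=16, coverage_threshold=0.95, distance_threshold=60.0):
    """Histogram-and-vote palette extraction in the spirit of the abstract above.

    img : RGB image of shape (H, W, 3) with values in [0, 255].
    The bin count, coverage criterion and distance threshold are assumed
    values; the paper determines the palette size from pixel votes without
    user input, which is the behaviour mimicked here.
    """
    pixels = img.reshape(-1, 3).astype(np.float64)
    palette = []
    while True:
        if palette:
            dists = np.min(
                np.linalg.norm(pixels[:, None, :] - np.array(palette)[None, :, :], axis=2),
                axis=1)
        else:
            dists = np.full(len(pixels), np.inf)
        # Stop once (almost) every pixel is close to some palette colour.
        if np.mean(dists < distance_threshold) >= coverage_threshold:
            break
        # Vote: only pixels still far from the palette count, binned in colour space.
        far = pixels[dists >= distance_threshold]
        hist, edges = np.histogramdd(far, bins=bins, range=[(0, 255)] * 3)
        idx = np.unravel_index(np.argmax(hist), hist.shape)
        centre = [(edges[d][i] + edges[d][i + 1]) / 2.0 for d, i in enumerate(idx)]
        palette.append(centre)
    return np.array(palette)

img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
print(extract_palette(img).shape)  # (k, 3), with k chosen automatically
```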
30. A Geometry-Sensitive Approach for Photographic Style Classification
- Author
-
Smolic, Aljosa, Ghosal, Koustav, and Prasad, Mukta
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer Vision and Pattern Recognition (cs.CV) ,Image and Video Processing (eess.IV) ,FOS: Electrical engineering, electronic engineering, information engineering ,Computer Science - Computer Vision and Pattern Recognition ,0202 electrical engineering, electronic engineering, information engineering ,020206 networking & telecommunications ,02 engineering and technology ,Electrical Engineering and Systems Science - Image and Video Processing ,Machine Learning (cs.LG) - Abstract
Photographs are characterized by different compositional attributes like the Rule of Thirds, depth of field, vanishing-lines etc. The presence or absence of one or more of these attributes contributes to the overall artistic value of an image. In this work, we analyze the ability of deep learning based methods to learn such photographic style attributes. We observe that although a standard CNN learns the texture and appearance based features reasonably well, its understanding of global and geometric features is limited by two factors. First, the data-augmentation strategies (cropping, warping, etc.) distort the composition of a photograph and affect the performance. Secondly, the CNN features, in principle, are translation-invariant and appearance-dependent. But some geometric properties important for aesthetics, e.g. the Rule of Thirds (RoT), are position-dependent and appearance-invariant. Therefore, we propose a novel input representation which is geometry-sensitive, position-cognizant and appearance-invariant. We further introduce a two-column CNN architecture that performs better than the state-of-the-art (SoA) in photographic style classification. From our results, we observe that the proposed network learns both the geometric and appearance-based attributes better than the SoA., Irish Machine Vision and Image Processing Conference, Belfast, 2018
- Published
- 2018
- Full Text
- View/download PDF
31. Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems
- Author
-
Ozcinar, Cagri, De Abreu, Ana, Knorr, Sebastian, and Smolic, Aljosa
- Subjects
Computer Science - Multimedia - Abstract
Given the significant industrial growth of demand for virtual reality (VR), 360° video streaming is one of the most important VR applications that require cost-optimal solutions to achieve widespread proliferation of VR technology. Because of the inherent variability of its data-intensive content types and its tile-based encoding and streaming, 360° video requires new encoding ladders in adaptive streaming systems to achieve cost-optimal and immersive streaming experiences. In this context, this paper targets both the provider's and the client's perspectives and introduces a new content-aware encoding ladder estimation method for tiled 360° VR video in adaptive streaming systems. The proposed method first categorizes a given 360° video using its encoding-complexity features and estimates the visual distortion and resource cost of each bitrate level based on the proposed distortion and resource cost models. An optimal encoding ladder is then formed using the proposed integer linear programming (ILP) algorithm by considering practical constraints. Experimental results of the proposed method are compared with the recommended encoding ladders of professional streaming service providers. Evaluations show that the proposed encoding ladders deliver better results than the recommended encoding ladders in terms of objective quality for 360° video, providing optimal encoding ladders under a set of service-provider constraint parameters., Comment: The 19th IEEE International Symposium on Multimedia (ISM 2017), Taichung, Taiwan
- Published
- 2017
32. Viewport-aware adaptive 360° video streaming using tiles for virtual reality
- Author
-
Ozcinar, Cagri, De Abreu, Ana, and Smolic, Aljosa
- Subjects
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Multimedia - Abstract
360° video is attracting an increasing amount of attention in the context of Virtual Reality (VR). Owing to its very high resolution requirements, existing professional streaming services for 360° video suffer from severe drawbacks. This paper introduces a novel end-to-end streaming system, from encoding to display, to transmit 8K-resolution 360° video and to provide an enhanced VR experience using Head Mounted Displays (HMDs). The main contributions of the proposed system concern tiling, integration of the MPEG Dynamic Adaptive Streaming over HTTP (DASH) standard, and viewport-aware bitrate level selection. Tiling and adaptive streaming enable the proposed system to deliver very high-resolution 360° video at good visual quality. Further, the proposed viewport-aware bitrate assignment selects an optimal DASH representation for each tile in a viewport-aware manner. The quality performance of the proposed system is verified in simulations with varying network bandwidth using realistic view trajectories recorded from user experiments. Our results show that the proposed streaming system compares favorably to existing methods in terms of PSNR and SSIM inside the viewport., Comment: IEEE International Conference on Image Processing (ICIP) 2017 (A short illustrative code sketch follows this record.)
- Published
- 2017
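The abstract above describes selecting a DASH representation per tile in a viewport-aware manner under a bandwidth constraint. The sketch below is a toy greedy heuristic for that selection step only; the field names, the overlap-per-bit upgrade rule, and the example numbers are assumptions, and the paper's actual assignment method is not reproduced.

```python
def select_tile_representations(tiles, bandwidth_budget):
    """Toy viewport-aware per-tile representation selection.

    tiles : list of dicts with
        'overlap'  - fraction of the tile inside the predicted viewport (0..1)
        'bitrates' - available representation bitrates in kbit/s, ascending
    bandwidth_budget : total available bitrate in kbit/s.
    """
    # Start every tile at its lowest representation.
    choice = [0 for _ in tiles]
    spent = sum(t['bitrates'][0] for t in tiles)
    while True:
        best, best_gain, best_extra = None, 0.0, 0
        for i, t in enumerate(tiles):
            if choice[i] + 1 >= len(t['bitrates']):
                continue
            extra = t['bitrates'][choice[i] + 1] - t['bitrates'][choice[i]]
            if spent + extra > bandwidth_budget or extra <= 0:
                continue
            gain = t['overlap'] / extra   # viewport overlap bought per extra bit
            if gain > best_gain:
                best, best_gain, best_extra = i, gain, extra
        if best is None:
            return choice
        choice[best] += 1
        spent += best_extra

tiles = [{'overlap': 0.9, 'bitrates': [500, 1500, 4000]},
         {'overlap': 0.1, 'bitrates': [500, 1500, 4000]},
         {'overlap': 0.0, 'bitrates': [500, 1500, 4000]}]
print(select_tile_representations(tiles, bandwidth_budget=6000))  # [2, 1, 0]
```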
33. High Quality Light Field Extraction and Post-Processing for Raw Plenoptic Data.
- Author
-
Matysiak, Pierre, Grogan, Mairead, Le Pendu, Mikael, Alain, Martin, Zerman, Emin, and Smolic, Aljosa
- Subjects
LIGHT-field cameras ,COMPUTER vision ,LIGHT in art ,LIGHT art ,VIRTUAL reality ,PIXELS - Abstract
Light field technology has reached a certain level of maturity in recent years, and its applications in both computer vision research and industry are offering new perspectives for cinematography and virtual reality. Several methods of capture exist, each with its own advantages and drawbacks. One of these methods involves the use of handheld plenoptic cameras. While these cameras offer freedom and ease of use, they also suffer from various visual artefacts and inconsistencies. We propose in this paper an advanced pipeline that enhances their output. After extracting sub-aperture images from the RAW images with our demultiplexing method, we perform three correction steps. We first remove hot-pixel artefacts, then correct colour inconsistencies between views using a colour transfer method, and finally apply a state-of-the-art light field denoising technique to ensure high image quality. An in-depth analysis is provided for every step of the pipeline, as well as their interaction within the system. We compare our approach to existing state-of-the-art sub-aperture image extraction algorithms, using a number of metrics as well as a subjective experiment. Finally, we showcase the positive impact of our system on a number of relevant light field applications. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
34. Deep Tone Mapping Operator for High Dynamic Range Images.
- Author
-
Rana, Aakanksha, Singh, Praveer, Valenzise, Giuseppe, Dufaux, Frederic, Komodakis, Nikos, and Smolic, Aljosa
- Subjects
HIGH dynamic range imaging ,GENERATIVE adversarial networks ,TONE color (Music theory) - Abstract
A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is quintessential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subjective-quality tone-mapped output. In this paper, we address this problem by proposing a fast, parameter-free and scene-adaptable deep tone mapping operator (DeepTMO) that yields a high-resolution, high-subjective-quality tone-mapped output. Based on a conditional generative adversarial network (cGAN), DeepTMO not only learns to adapt to vast scenic content (e.g., outdoor, indoor, human, structures, etc.) but also tackles HDR-related scene-specific challenges such as contrast and brightness, while preserving fine-grained details. We explore 4 possible combinations of generator-discriminator architectural designs to specifically address some prominent issues in HDR-related deep-learning frameworks such as blurring, tiling patterns and saturation artifacts. By exploring different influences of scales, loss functions and normalization layers under a cGAN setting, we conclude by adopting a multi-scale model for our task. To further leverage the large-scale availability of unlabeled HDR data, we train our network by generating targets using an objective HDR quality metric, namely the Tone Mapping Image Quality Index (TMQI). We demonstrate results both quantitatively and qualitatively, and showcase that our DeepTMO generates high-resolution, high-quality output images over a large spectrum of real-world scenes. Finally, we evaluate the perceived quality of our results by conducting a pair-wise subjective study, which confirms the versatility of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
35. Do Users Behave Similarly in VR? Investigation of the User Influence on the System Design.
- Author
-
Rossi, Silvia, Ozcinar, Cagri, Smolic, Aljosa, and Toni, Laura
- Subjects
SYSTEMS design ,ARCHITECTURAL design ,VIRTUAL reality software ,VIRTUAL reality ,AUTOMOTIVE navigation systems ,HEAD-mounted displays ,DATA analysis ,INTEGERS - Abstract
With the overarching goal of developing user-centric Virtual Reality (VR) systems, a new wave of studies focused on understanding how users interact in VR environments has recently emerged. Despite the intense efforts, however, current literature still does not provide the right framework to fully interpret and predict users' trajectories while navigating in VR scenes. This work advances the state of the art on both the study of users' behaviour in VR and user-centric system design. In more detail, we complement current datasets by presenting a publicly available dataset that provides navigation trajectories acquired for heterogeneous omnidirectional videos and different viewing platforms--namely, head-mounted display, tablet, and laptop. We then present an exhaustive analysis of the collected data to better understand navigation in VR across users, content, and, for the first time, across viewing platforms. The novelty lies in the user-affinity metric, proposed in this work to investigate users' similarities when navigating within the content. The analysis reveals useful insights into the effect of device and content on navigation, which can be valuable considerations from a system design perspective. As a case study of the importance of studying users' behaviour when designing VR systems, we finally propose a user-centric server optimisation. We formulate an integer linear program that seeks the best stored set of omnidirectional content that minimises encoding and storage cost while maximising the user's experience. This is posed while taking into account network dynamics, type of video content, and also user population interactivity. Experimental results prove that our solution outperforms common company recommendations in terms of experienced quality but also in terms of encoding and storage, achieving savings of up to 70%. More importantly, we highlight a strong correlation between the storage cost and the user-affinity metric, showing the impact of the latter on the system architecture design. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
36. Occlusion-Aware Depth Map Coding Optimization Using Allowable Depth Map Distortions.
- Author
-
Gao, Pan and Smolic, Aljosa
- Subjects
VIDEO coding ,PIXELS ,DYNAMIC programming - Abstract
In depth map coding, rate-distortion optimization for those pixels that will cause occlusion in view synthesis is a rather challenging task, since the synthesis distortion estimation is complicated by the warping competition and the occlusion order can easily be changed by the adopted optimization strategy. In this paper, an efficient depth map coding approach using allowable depth map distortions is proposed for occlusion-inducing pixels. First, we derive the range of allowable depth level change for both the zero disparity error case and the non-zero disparity error case with theoretical and geometrical proofs. Then, we formulate the problem of optimally selecting the depth distortion within the allowable depth distortion range with the objective of minimizing the overall synthesis distortion involved in the occlusion. The unicity and occlusion-order-invariance properties of the allowable depth distortion range are demonstrated. Finally, we propose a dynamic programming based algorithm to locate the optimal depth distortion for each pixel. Simulation results illustrate the performance improvement of the proposed algorithm over other state-of-the-art depth map coding optimization schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
37. Pipelines for HDR Video Coding Based on Luminance Independent Chromaticity Preprocessing.
- Author
-
Mahmalat, Samir, Aydin, Tunc Ozan, and Smolic, Aljosa
- Subjects
VIDEO coding ,HIGH dynamic range imaging ,IMAGE color analysis ,LUMINANCE (Video) ,CHROMATICITY ,DATA compression - Abstract
We consider chromaticity in high dynamic range (HDR) video coding and show the advantages of a constant luminance color space for encoding. For this, we introduce two constant luminance HDR video coding pipelines, which convert the source video to linear Y u'v'. A content-dependent scaling of the chromaticity components serves as a color quality parameter. This reduces perceivable color artifacts while remaining fully compatible with core High Efficiency Video Coding (HEVC) or other video coding standards. One of the pipelines further combines the scaling with a dedicated chromaticity transform to optimize the representation of the chromaticity components for encoding. We validate both pipelines with subjective user studies in addition to an objective comparison to other state-of-the-art methods. The user studies show a significant improvement in perceived color quality at medium to high compression rates without sacrificing luminance quality compared with current standard coding pipelines. The objective evaluation suggests that both pipelines perform at least comparably to the current state-of-the-art methods. [ABSTRACT FROM AUTHOR] (A short illustrative code sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
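The pipelines in the abstract above start by converting the source video to a linear Y u'v' representation. The sketch below shows only that base conversion (linear BT.709 RGB to CIE XYZ to CIE 1976 u'v' chromaticities with linear luminance Y); the content-dependent chromaticity scaling and the dedicated chromaticity transform are the paper's contributions and are not reproduced here.

```python
import numpy as np

# Linear BT.709 RGB -> CIE XYZ (standard matrix), then Y plus u'v' chromaticities.
RGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])

def linear_rgb_to_Yuv_prime(rgb):
    """rgb: array (..., 3) of linear-light BT.709 values (no transfer function)."""
    xyz = rgb @ RGB_TO_XYZ.T
    X, Y, Z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    denom = X + 15.0 * Y + 3.0 * Z
    denom = np.where(denom == 0.0, 1.0, denom)   # avoid division by zero for black
    u_prime = 4.0 * X / denom
    v_prime = 9.0 * Y / denom
    return Y, u_prime, v_prime

Y, u, v = linear_rgb_to_Yuv_prime(np.array([1.0, 1.0, 1.0]))
print(Y, u, v)  # D65 white: Y = 1, u' ≈ 0.198, v' ≈ 0.468
```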
38. Luminance independent chromaticity preprocessing for HDR video coding.
- Author
-
Mahmalat, Samir, Stefanoski, Nikolce, Luginbuhl, Daniel, Aydin, Tunc Ozan, and Smolic, Aljosa
- Published
- 2016
- Full Text
- View/download PDF
39. Robust calibration of broadcast cameras based on ellipse and line contours.
- Author
-
Croci, Simone, Stefanoski, Nikolce, and Smolic, Aljosa
- Published
- 2016
- Full Text
- View/download PDF
40. Advanced tools and framework for historical film restoration.
- Author
-
Croci, Simone, Aydin, Tunç Ozan, Stefanoski, Nikolce, Gross, Markus, and Smolic, Aljosa
- Subjects
HISTORICAL films ,PRESERVATION of motion picture film ,MOTION pictures & history ,DIGITAL cinematography ,IMAGE processing - Abstract
Digital restoration of film content that has historical value is crucial for the preservation of cultural heritage. Digital restoration is not only a relevant application area for various video processing technologies developed in the computer graphics literature, it also involves a multitude of unresolved research challenges. Currently, the digital restoration workflow is highly labor intensive and often relies heavily on expert knowledge. We revisit some key steps of this workflow and propose semiautomatic methods for performing them. To do this, we build upon state-of-the-art video processing techniques by adding the components necessary for enabling (i) restoration of chemically degraded colors of the film stock, (ii) removal of excessive film grain through spatiotemporal filtering, and (iii) contrast recovery by transferring contrast from the negative film stock to the positive. We show that, when applied individually, our tools produce compelling results and, when applied in concert, significantly improve the degraded input content. Building on a conceptual framework of film restoration ensures the best possible combination of tools and use of available materials. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
41. Hybrid ASIC/FPGA System for Fully Automatic Stereo-to-Multiview Conversion Using IDW.
- Author
-
Schaffner, Michael, Gurkaynak, Frank K., Greisen, Pierre, Kaeslin, Hubert, Benini, Luca, and Smolic, Aljosa
- Subjects
VIDEO processing ,STREAMING technology ,THREE-dimensional imaging ,COMPUTER software - Abstract
Recently, multiview autostereoscopic displays (MADs), which enable a limited glasses-free 3D experience, have become commercially available. The main problem of MADs is that they require several (typically eight or nine) views, while most of today's 3D video content is stereoscopic. In order to bridge this gap, the research community has started to devise automatic multiview synthesis (MVS) methods. These algorithms require real-time processing and should be portable to end-user devices to develop their full potential. To this end, we revisit an algorithmic solution based on image domain warping (IDW) and devise a hardware architecture for a complete synthesis pipeline, provide insights into where the computationally challenging parts are, and present implementation results of a hybrid field-programmable gate array/application-specific integrated circuit prototype, which is the first hardware implementation of a complete IDW-based MVS system. Based on these results, we also estimate the complexity and energy efficiency of a fully integrated solution in 65- and 28-nm CMOS technology and show that a full-high-definition real-time solution on a single chip is within reach. The proposed architecture could be used as a coprocessor in a system-on-chip targeting 3D TV sets, thereby enabling efficient content generation with limited user interaction (e.g., depth range adjustment) in real time. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
42. Chromatic calibration of an HDR display using 3D octree forests.
- Author
-
Liu, Jing, Stefanoski, Nikolce, Aydin, Tunc Ozan, Grundhofer, Anselm, and Smolic, Aljosa
- Published
- 2015
- Full Text
- View/download PDF
43. Automatic multiview synthesis - Towards a mobile system on a chip.
- Author
-
Schaffner, Michael, Gurkaynak, Frank K., Kaeslin, Hubert, Benini, Luca, and Smolic, Aljosa
- Published
- 2015
- Full Text
- View/download PDF
44. An Approximate Computing Technique for Reducing the Complexity of a Direct-Solver for Sparse Linear Systems in Real-Time Video Processing.
- Author
-
Schaffner, Michael, Gürkaynak, Frank K., Smolic, Aljosa, Kaeslin, Hubert, and Benini, Luca
- Published
- 2014
- Full Text
- View/download PDF
45. ColorBrush: Animated diffusion for intuitive colorization simulating water painting.
- Author
-
Marki, Nicolas, Wang, Oliver, Gross, Markus, and Smolic, Aljosa
- Published
- 2014
- Full Text
- View/download PDF
46. MasterCam FVV: Robust registration of multiview sports video to a static high-resolution master camera for free viewpoint video.
- Author
-
Angehrn, Florian, Wang, Oliver, Aksoy, Yagiz, Gross, Markus, and Smolic, Aljosa
- Published
- 2014
- Full Text
- View/download PDF
47. Depth estimation and depth enhancement by diffusion of depth features.
- Author
-
Stefanoski, Nikolce, Bal, Can, Lang, Manuel, Wang, Oliver, and Smolic, Aljosa
- Published
- 2013
- Full Text
- View/download PDF
48. Optimizing stereo-to-multiview conversion for autostereoscopic displays.
- Author
-
Chapiro, Alexandre, Heinzle, Simon, Aydın, Tunç Ozan, Poulakos, Steven, Zwicker, Matthias, Smolic, Aljosa, and Gross, Markus
- Subjects
IMAGE processing ,COMPUTER vision software ,DIGITAL image processing ,ANALYSIS of variance ,CHI-square distribution ,THREE-dimensional display systems - Abstract
We present a novel stereo-to-multiview video conversion method for glasses-free multiview displays. Different from previous stereo-to-multiview approaches, our mapping algorithm utilizes the limited depth range of autostereoscopic displays optimally and strives to preserve the scene's artistic composition and perceived depth even under strong depth compression. We first present an investigation of how perceived image quality relates to spatial frequency and disparity. The outcome of this study is utilized in a two-step mapping algorithm, where we (i) compress the scene depth using a non-linear global function to the depth range of an autostereoscopic display and (ii) enhance the depth gradients of salient objects to restore the perceived depth and salient scene structure. Finally, an adapted image domain warping algorithm is proposed to generate the multiview output, which enables overall disparity range extension. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
49. Automatic View Synthesis by Image-Domain-Warping.
- Author
-
Stefanoski, Nikolce, Wang, Oliver, Lang, Manuel, Greisen, Pierre, Heinzle, Simon, and Smolic, Aljosa
- Subjects
THREE-dimensional imaging ,BLU-ray discs ,TELEVISION programs ,MOTION pictures ,DATA transmission systems ,DATA conversion ,MPEG (Video coding standard) - Abstract
Today, stereoscopic 3D (S3D) cinema is already mainstream, and almost all new display devices for the home support S3D content. S3D distribution infrastructure to the home is already partly established in the form of 3D Blu-ray discs, video-on-demand services, or television channels. The necessity to wear glasses is, however, often considered an obstacle that hinders broader acceptance of this technology in the home. Multiview autostereoscopic displays enable glasses-free perception of S3D content for several observers simultaneously, and support head motion parallax in a limited range. To support multiview autostereoscopic displays in an already established S3D distribution infrastructure, a synthesis of new views from S3D video is needed. In this paper, a view synthesis method based on image-domain-warping (IDW) is presented that synthesizes new views directly from S3D video and functions completely automatically. IDW relies on an automatic and robust estimation of sparse disparities and image saliency information, and enforces target disparities in synthesized images using an image warping framework. Two configurations of the view synthesizer within a transmission and view synthesis framework are analyzed and evaluated. A transmission and view synthesis system that uses IDW was recently submitted to MPEG's call for proposals on 3D video technology, where it was ranked among the four best-performing proposals. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
50. Analysis and VLSI Implementation of EWA Rendering for Real-Time HD Video Applications.
- Author
-
Greisen, Pierre, Schaffner, Michael, Heinzle, Simon, Runo, Marian, Smolic, Aljosa, Burg, Andreas, Kaeslin, Hubert, and Gross, Markus
- Subjects
RENDERING algorithms ,RESAMPLING (Statistics) ,RENDERING (Computer graphics) ,VIDEO processing ,VERY large scale circuit integration - Abstract
Nonlinear image warping or image resampling is a necessary step in many current and upcoming video applications, such as video retargeting, stereoscopic 3-D mapping, and multiview synthesis. The challenges for real-time resampling include not only image quality but also available energy and computational power of the employed device. In this paper, we employ an elliptical-weighted average (EWA) rendering approach to 2-D image resampling. We extend the classical EWA framework for increased visual quality and provide a very large scale integration architecture for efficient view rendering. The resulting architecture is able to render high-quality video sequences in real time targeted for low-power applications in end-user display devices. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF