819 results for "image manipulation"
Search Results
2. Advanced Algorithms to Detect and Prevent the Spread of Manipulated Images and Videos Using Deep Learning and AI
- Author
-
Pachimatla, Divya, Kilari, Rampriya, Ranitha, I. B., Ranaveer, Indur, Rao, Katakam Srinivasa, Renuka, Kummari, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Gunjan, Vinit Kumar, editor, and Zurada, Jacek M., editor
- Published
- 2025
- Full Text
- View/download PDF
3. AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling
- Author
-
Chen, Sherry X., Vaxman, Yaron, Baruch, Elad Ben, Asulin, David, Moreshet, Aviad, Sra, Misha, Sen, Pradeep, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
4. Detection of Manipulations in Digital Images: A Review of Passive and Active Methods Utilizing Deep Learning.
- Author
-
Duszejko, Paweł, Walczyna, Tomasz, and Piotrowski, Zbigniew
- Subjects
SCIENTIFIC literature, DEEPFAKES, PUBLIC opinion, DIGITAL images, MODERN society
- Abstract
Modern society generates vast amounts of digital content, whose credibility plays a pivotal role in shaping public opinion and decision-making processes. The rapid development of social networks and generative technologies, such as deepfakes, significantly increases the risk of disinformation through image manipulation. This article aims to review methods for verifying images' integrity, particularly through deep learning techniques, addressing both passive and active approaches. Their effectiveness in various scenarios has been analyzed, highlighting their advantages and limitations. This study reviews the scientific literature and research findings, focusing on techniques that detect image manipulations and localize areas of tampering, utilizing both statistical properties of images and embedded hidden watermarks. Passive methods, based on analyzing the image itself, are versatile and can be applied across a broad range of cases; however, their effectiveness depends on the complexity of the modifications and the characteristics of the image. Active methods, which involve embedding additional information into the image, offer precise detection and localization of changes but require complete control over creating and distributing visual materials. Both approaches have their applications depending on the context and available resources. In the future, a key challenge remains the development of methods resistant to advanced manipulations generated by diffusion models and further leveraging innovations in deep learning to protect the integrity of visual content. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
5. A constrained convolutional neural network with attention mechanism for image manipulation detection.
- Author
-
Hamidja, Kamagate Beman, Rosalie Tokpa, Fatoumata Wongbé, Monsan, Vincent, and Oumtanaga, Souleymane
- Subjects
CONVOLUTIONAL neural networks, LITERARY sources, PUBLIC opinion, FALSIFICATION, DEEP learning
- Abstract
The information disseminated by online media is often presented in the form of images, in order to quickly captivate readers and increase audience ratings. However, these images can be manipulated for malicious purposes, such as influencing public opinion, undermining media credibility, disrupting democratic processes or creating conflict within society. Various approaches, whether relying on manually developed features or deep learning, have been devised to detect falsified images. However, they frequently prove less effective when confronted with widespread and multiple manipulations. To address this challenge, in our study, we have designed a model comprising a constrained convolution layer combined with an attention mechanism and a transfer learning ResNet50 network. These components are intended to automatically learn image manipulation features in the initial layer and extract spatial features, respectively. This makes it possible to detect various falsifications with greater accuracy and precision. The proposed model has been trained and tested on real datasets sourced from the literature, which include MediaEval and CASIA. The obtained results indicate that our proposal surpasses other models documented in the literature. Specifically, we achieve an accuracy of 87% and a precision of 93% on the MediaEval dataset. In comparison, the performance of methods from the literature on the same dataset does not exceed 84% for accuracy and 90% for precision. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
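The "constrained convolution layer" named in entry 5 is, in the literature it draws on (Bayar and Stamm), a first layer whose filters are projected after each update so the center weight is -1 and the off-center weights sum to +1, turning the layer into a learnable prediction-error filter that suppresses image content and exposes manipulation traces. A minimal sketch of that projection, assuming the standard formulation rather than the authors' exact code:

```python
import numpy as np

def constrain_filter(w):
    """Project a kxk filter onto the Bayar-Stamm constrained set:
    center weight fixed to -1, remaining weights summing to +1."""
    w = w.copy()
    c = w.shape[0] // 2
    w[c, c] = 0.0
    w /= w.sum()      # normalize off-center weights to sum to 1
    w[c, c] = -1.0    # fix the center weight to -1
    return w

rng = np.random.default_rng(0)
f = constrain_filter(rng.normal(size=(5, 5)))
# The projected filter has zero total weight, so it gives zero response
# on constant (unmanipulated) regions and reacts only to local residuals.
print(abs(f.sum()) < 1e-9)
```

In training, this projection would be reapplied to the first-layer kernels after every gradient step, which is what makes the layer "constrained" rather than a free convolution.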
6. TmfimCLIP: Text-Driven Multi-Attribute Face Image Manipulation.
- Author
-
Yaermaimaiti, Yilihamu, Wang, Ruohao, Lou, Xudong, Liu, Yajie, and Xi, Linfei
- Subjects
LANGUAGE models, HAIR
- Abstract
Text-to-image conversion has garnered significant research attention, with contemporary methods leveraging the latent space analysis of StyleGAN. However, issues with latent code decoupling, interpretability, and controllability often remain, leading to misaligned image attributes. To address these challenges, we propose a refined approach that segments StyleGAN’s latent code using the Visual Language Model (CLIP). Our method aligns the latent code segments with text embeddings via an image-text alignment module and modulates them through a text injection module. Additionally, we incorporate semantic segmentation loss and mouth loss to constrain operations that affect irrelevant attributes. Compared to previous CLIP-driven techniques, our approach significantly enhances decoupling, interpretability, and controllability. Experiments on the CelebA-HQ and FFHQ datasets validate our model’s efficacy through both qualitative and quantitative comparisons. Our model effectively handles a wide range of style variations, achieving an FID score of 21.15 for facial attributes and an ID metric of 0.88 for hair attributes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Colorful Diffuse Intrinsic Image Decomposition in the Wild.
- Author
-
Careaga, Chris and Aksoy, Yağız
- Subjects
ALBEDO, REFLECTANCE, LIGHTING, PHOTOGRAPHS, EDITING
- Abstract
Intrinsic image decomposition aims to separate the surface reflectance and the effects from the illumination given a single photograph. Due to the complexity of the problem, most prior works assume a single-color illumination and a Lambertian world, which limits their use in illumination-aware image editing applications. In this work, we separate an input image into its diffuse albedo, colorful diffuse shading, and specular residual components. We arrive at our result by gradually removing first the single-color illumination and then the Lambertian-world assumptions. We show that by dividing the problem into easier sub-problems, in-the-wild colorful diffuse shading estimation can be achieved despite the limited ground-truth datasets. Our extended intrinsic model enables illumination-aware analysis of photographs and can be used for image editing applications such as specularity removal and per-pixel white balancing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
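The extended intrinsic model described in entry 7 can be written as image = albedo × colorful diffuse shading + specular residual. A tiny numerical sketch with made-up values (not the authors' method) shows why editing applications such as specularity removal follow directly from the decomposition:

```python
import numpy as np

# Illustrative forward model with hypothetical per-pixel values.
rng = np.random.default_rng(1)
albedo   = rng.uniform(0.1, 1.0, size=(4, 4, 3))  # surface reflectance
shading  = rng.uniform(0.0, 1.5, size=(4, 4, 3))  # colorful (RGB) diffuse shading
specular = rng.uniform(0.0, 0.2, size=(4, 4, 3))  # non-Lambertian residual
image = albedo * shading + specular               # forward intrinsic model

# "Specularity removal" keeps only the diffuse term, and dividing it
# by the albedo recovers the colorful diffuse shading per pixel.
diffuse_only = image - specular
print(np.allclose(diffuse_only / albedo, shading))
```

The hard part the paper addresses is estimating these components from a single photograph in the wild; the identity above only shows how the components compose and why each recovered layer is directly editable.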
8. An efficient convolutional neural network for adversarial training against adversarial attack.
- Author
-
Vaddadi, Srinivas A., Vadakkethi, Sanjaikanth E., Pillai, Somanathan, Addula, Santosh Reddy, Vallabhaneni, Rohith, and Ananthan, Bhuvanesh
- Subjects
CONVOLUTIONAL neural networks, DEEP learning, IMAGE databases, RESEARCH personnel
- Abstract
Convolutional neural networks (CNNs) are widely used by researchers due to their advantages across various applications. However, images are highly susceptible to malicious attacks using perturbations that go unrecognized even under human inspection. This poses significant security perils and challenges to CNN-related applications. In this article, an efficient adversarial training model against malevolent attacks is demonstrated. The model is highly robust to black-box malicious examples, as it is trained with diverse malicious samples. Initially, attack techniques such as the fast gradient sign method (FGSM), iterative FGSM (I-FGSM), DeepFool, and Carlini and Wagner (CW) are utilized to generate adversarial inputs from a CNN known to the attacker. In the experimentation process, the MNIST dataset comprising 60K training and 10K testing grey-scale images is utilized. In the experimental section, the adversarial training model reduces the attack success rate (ASR) by an average of 29.2% for different malicious inputs, while preserving an accuracy of 98.9% on actual images in the MNIST database. The simulation outcomes show the preeminence of the model against adversarial attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Disentangled Lifespan Synthesis via Transformer‐Based Nonlinear Regression.
- Author
-
Li, Mingyuan and Guo, Yingchun
- Subjects
NONLINEAR regression, IMAGE reconstruction, SPACE Age 1957-, IMAGE processing, AGE
- Abstract
Lifespan face age transformation aims to generate facial images that accurately depict an individual's appearance at different age stages. This task is highly challenging due to the need for reasonable changes in facial features while preserving identity characteristics. Existing methods tend to synthesize unsatisfactory results, such as entangled facial attributes and low identity preservation, especially when dealing with large age gaps. Furthermore, over-manipulating the style vector may deviate it from the latent space and damage image quality. To address these issues, this paper introduces a novel nonlinear regression model, Disentangled Lifespan face Aging (DL-Aging), to achieve high-quality age transformation images. Specifically, we propose an age modulation encoder to extract age-related multi-scale facial features as key and value, and use the reconstructed style vector of the image as the query. The multi-head cross-attention in the W+ space is utilized to update the query for aging image reconstruction iteratively. This nonlinear transformation enables the model to learn a more disentangled mode of transformation, which is crucial for alleviating facial attribute entanglement. Additionally, we introduce a W+ space age regularization term to prevent excessive manipulation of the style vector and ensure it remains within the W+ space during transformation, thereby improving generation quality and aging accuracy. Extensive qualitative and quantitative experiments demonstrate that the proposed DL-Aging outperforms state-of-the-art methods regarding aging accuracy, image quality, attribute disentanglement, and identity preservation, especially for large age gaps. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Text-free diffusion inpainting using reference images for enhanced visual fidelity.
- Author
-
Kim, Beomjo and Sohn, Kyung-Ah
- Subjects
INPAINTING, VISUAL education
- Abstract
• Language-based subject generation faces challenges in accurately portraying the subject.
• Current reference-guided generation lacks the ability to preserve subject identity.
• Exemplar-based instructions with visual tokens preserve the subject's visual details.
• Model-based guidance samples better-quality images with different poses.
• Our model achieved the highest CLIP and DINO scores and the best user-study results compared to others.
This paper presents a novel approach to subject-driven image generation that addresses the limitations of traditional text-to-image diffusion models. Our method generates images using reference images without relying on language-based prompts. We introduce a visual detail preserving module that captures intricate details and textures, addressing overfitting issues associated with limited training samples. The model's performance is further enhanced through a modified classifier-free guidance technique and feature concatenation, enabling the natural positioning and harmonization of subjects within diverse scenes. Quantitative assessments using CLIP, DINO and Quality scores (QS), along with a user study, demonstrate the superior quality of our generated images. Our work highlights the potential of pre-trained models and visual patch embeddings in subject-driven editing, balancing diversity and fidelity in image generation tasks. Our implementation is available at https://github.com/8eomio/Subject-Inpainting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Attribute-guided face adversarial example generation.
- Author
-
Gan, Yan, Xiao, Xinyao, and Xiang, Tao
- Subjects
ARTIFICIAL neural networks, PROBLEM solving, INTERPOLATION, SEMANTICS, GENERALIZATION
- Abstract
Deep neural networks (DNNs) are susceptible to adversarial examples generally generated by adding imperceptible perturbations to the clean images, resulting in the degraded performance of DNNs models. To generate adversarial examples, most methods utilize the L p norm to limit the perturbations and satisfy such imperceptibility. However, the L p norm cannot fully guarantee the semantic authenticity of adversarial examples. Defenses may take advantage of this defect to weaken the attack capability of adversarial examples. Moreover, existing methods with L p restriction have poor generalization ability in white-box attacks and have inferior aggressiveness in black-box attacks. To solve the problems mentioned above, we propose a multiple feature interpolation method to generate face adversarial examples. In the proposed method, we perform the multiple feature interpolation to generate face adversarial examples with new semantics in the process of original image reconstruction and conditional attribute-guided image generation based on StarGAN. Experimental results demonstrate that adversarial examples generated by our method possess new attribute-guided semantics and satisfactory attack success rates under both white-box and black-box settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. mosaic-library: A Python video mosaicking library specialised for seabed mapping
- Author
-
Fletcher Thompson, David O’Brien-Møller, Bo Lundgren, and Patrizio Mariani
- Subjects
Video mosaic, Seabed mapping, Image manipulation, Computer software, QA76.75-76.765
- Abstract
This paper presents mosaic-library, a Python software package designed for underwater mosaicking applications to manipulate and process seabed video data. The library aims to simplify and enhance the process of creating mosaics from underwater videos, allowing for improved exploration and analysis of seabed environments for marine science applications (such as bottom feature classification and biodiversity monitoring). The library offers various functionalities, including reading input videos, colour and contrast balancing, image resizing, image registration using feature detection and description, transformation estimation, homography transformations, visual-inertial alignment, and mosaic generation. Moreover, version 2 of the library contains an extensible set of classes to allow advanced users to develop their own mosaicking applications, with support for CUDA acceleration on NVIDIA hardware. The library is currently used for several applications to map the seabottom and in support of fish stock assessment procedures and biodiversity analyses. The software’s capabilities are demonstrated with examples showcasing the various features.
- Published
- 2025
- Full Text
- View/download PDF
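Entry 12 lists homography transformations among mosaic-library's functionalities. As a generic illustration of that step (this is not mosaic-library's own API), a 3x3 planar homography maps pixel coordinates through homogeneous coordinates and a perspective divide; the matrix H below is a made-up similarity transform (2x scale plus a (10, 5) pixel translation):

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 pixel coordinates through a 3x3 planar homography."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # perspective divide

H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0,  5.0],
              [0.0, 0.0,  1.0]])
# Corners of a hypothetical 640x480 video frame.
corners = np.array([[0.0, 0.0], [640.0, 0.0], [640.0, 480.0], [0.0, 480.0]])
print(apply_homography(H, corners))
```

In a full mosaicking pipeline, H is estimated per frame (e.g. from matched feature points) and each frame is warped into a common reference before the warped frames are blended into the mosaic.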
13. Virtual non-contrast images in photon-counting computed tomography: impact of different contrast phases.
- Author
-
Gadsbøll, Eva Laurin, Aurumskjöld, Marie-Louise, Holmquist, Fredrik, and Baubeta, Erik
- Subjects
ERECTOR spinae muscles, COMPUTED tomography, IMAGE reconstruction, APPLICATION software, STANDARD deviations
- Abstract
Background: Photon-counting computed tomography (PCCT) enables new ways of image reconstruction, e.g. material decomposition and creation of virtual non-contrast (VNC) series with higher resolution and lower radiation dose than standard computed tomography (CT). Clinical experiences of this are limited. Purpose: To compare true non-contrast (TNC) series with VNC series derived from non-enhanced (VNCu), arterial phase (VNCa) and portal venous phase (VNCv) in clinically approved PCCT. Material and Methods: A total of 45 clinical, tri-phasic abdominal CT scans from the PCCT NAEOTOM Alpha, between February 2022 and November 2022, were retrospectively assessed. Placing a region of interest in six different locations in each VNC series – right liver parenchyma, left liver parenchyma, spleen, aorta, erector spinae muscle, and in the subcutaneous fat – absolute Hounsfield values (HU) and standard deviations (SD) were collected. Differences in HU (ΔHU) were compared and statistically analyzed. Results: Statistically significant differences between VNC and TNC were seen in all measurements, with the largest difference in the subcutaneous fat and the smallest difference in the erector spinae muscle. Only small differences were seen between VNCa and VNCv, where the largest differences were seen in the left and right liver lobes. Conclusion: VNC images from the first-generation clinically approved PCCT showed a significant difference between VNC and TNC images. The differences vary with the type of tissue. Only small differences were seen depending on which contrast phase the VNC was derived from. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Improving synthetic media generation and detection using generative adversarial networks.
- Author
-
Zia, Rabbia, Rehman, Mariam, Hussain, Afzaal, Nazeer, Shahbaz, and Anjum, Maria
- Subjects
ARTIFICIAL neural networks, MACHINE learning, GENERATIVE adversarial networks, DATA augmentation, ARTIFICIAL intelligence, DEEP learning
- Abstract
Synthetic images are created using computer graphics modeling and artificial intelligence techniques, referred to as deepfakes. They modify human features by using generative models and deep learning algorithms, posing risks of violating social media regulations and spreading false information. To address these concerns, the study proposed an improved generative adversarial network (GAN) model which improves accuracy in differentiating between real and fake images, focusing on data augmentation and label smoothing strategies for GAN training. The study utilizes a dataset containing human faces and employs DCGAN (deep convolutional generative adversarial network) as the base model. In comparison with traditional GANs, the proposed GAN outperforms in terms of frequently used metrics, i.e., Fréchet Inception Distance (FID) and accuracy. The model's effectiveness is demonstrated through evaluation on the Flickr-Faces Nvidia dataset and Fakefaces dataset, achieving an FID score of 55.67, an accuracy of 98.82%, and an F1-score of 0.99 in detection. The study fine-tunes the model parameters to reach optimal settings, thereby reducing risks in synthetic image generation. The article introduces an effective framework for both image manipulation and detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. The effectiveness of a virtual learning companion for supporting the critical judgment of social media content.
- Author
-
Aprin, Farbod, Peters, Pascal, and Hoppe, H. Ulrich
- Subjects
COURSEWARE, SOCIAL media, CLASSROOMS, EMPIRICAL research, CRITICAL thinking
- Abstract
Social media usage has become a daily habit for the younger generation. It can have positive effects on educational processes, but it also raises concerns about harmful content, such as fake news or hate speech. Fake news is often distributed with the intention to manipulate public opinion by propagating disinformation. This includes the manipulation of images taken from reputable news resources. In response to these concerns and manipulations, we developed a web-based learning environment with a virtual learning companion (VLC). The VLC is designed to assist students in developing their critical thinking skills while interacting with social media content. The VLC is incorporated into a controlled learning environment that resembles Instagram and contains real and manipulated content. The "Courage" companion communicates with the students through chat dialogue and knowledge-activating questions. It also links additional images from web sources discovered through reverse image search based on image similarity. Learners are provided with various textual descriptions and keywords and are prompted to reflect on this information. The system's effectiveness and assistance in identifying real or manipulated images have been evaluated in an empirical classroom study in August 2022 with 22 high school students in Germany, which generated 95 companion-assisted conversations around five images and their texts. Users' interactions with the companion in a controlled Instagram-like social media environment have been recorded in xAPI format and then analyzed using a research dashboard. The results of this evaluation demonstrate that guidance through the VLC, along with the provision of additional content sources containing similar images and their explanation through the companion, improves the learners' judgment. This corroborates the claim that the tool can help enhance learners' critical thinking, resilience, and sensitivity to social media issues. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Image Forgery Detection System Using Passive Techniques
- Author
-
Agarwal, Lalit, Budhwani, Saksham, Sharma, Shivay, Jain, Devansh, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Malik, Hasmat, editor, Mishra, Sukumar, editor, Sood, Y. R., editor, García Márquez, Fausto Pedro, editor, and Ustun, Taha Selim, editor
- Published
- 2024
- Full Text
- View/download PDF
17. Seeing Through the Lies: A Vision Transformer-Based Solution
- Author
-
Rasool, Aale, Katarya, Rahul, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Santosh, K. C., editor, Sood, Sandeep Kumar, editor, Pandey, Hari Mohan, editor, and Virmani, Charu, editor
- Published
- 2024
- Full Text
- View/download PDF
18. Text-Guided Multi-region Scene Image Editing Based on Diffusion Model
- Author
-
Li, Ruichen, Wu, Lei, Wang, Changshuo, Dong, Pei, Li, Xin, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Pan, Yijie, editor, and Zhang, Qinhu, editor
- Published
- 2024
- Full Text
- View/download PDF
19. Image Manipulation Detection Using Augmentation and Convolutional Neural Networks
- Author
-
Maheshwari, Annant, Jain, Rishi, Mahapatra, Ritika, Palakuru, Saagar, Kumar, M. Anand, Celebi, Emre, Series Editor, Chen, Jingdong, Series Editor, Gopi, E. S., Series Editor, Neustein, Amy, Series Editor, Liotta, Antonio, Series Editor, Di Mauro, Mario, Series Editor, and Maheswaran, P, editor
- Published
- 2024
- Full Text
- View/download PDF
20. Fighting Fake Visual Media: A Study of Current and Emerging Methods for Detecting Image and Video Tampering
- Author
-
Khan, Mahejabi, Gajbhiye, Samta, Tiwari, Rajesh, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, and Mozar, Stefan, editor
- Published
- 2024
- Full Text
- View/download PDF
21. Detection of Manipulations in Digital Images: A Review of Passive and Active Methods Utilizing Deep Learning
- Author
-
Paweł Duszejko, Tomasz Walczyna, and Zbigniew Piotrowski
- Subjects
image manipulation, deep learning, active protection, passive protection, image forensics, deepfakes, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Modern society generates vast amounts of digital content, whose credibility plays a pivotal role in shaping public opinion and decision-making processes. The rapid development of social networks and generative technologies, such as deepfakes, significantly increases the risk of disinformation through image manipulation. This article aims to review methods for verifying images' integrity, particularly through deep learning techniques, addressing both passive and active approaches. Their effectiveness in various scenarios has been analyzed, highlighting their advantages and limitations. This study reviews the scientific literature and research findings, focusing on techniques that detect image manipulations and localize areas of tampering, utilizing both statistical properties of images and embedded hidden watermarks. Passive methods, based on analyzing the image itself, are versatile and can be applied across a broad range of cases; however, their effectiveness depends on the complexity of the modifications and the characteristics of the image. Active methods, which involve embedding additional information into the image, offer precise detection and localization of changes but require complete control over creating and distributing visual materials. Both approaches have their applications depending on the context and available resources. In the future, a key challenge remains the development of methods resistant to advanced manipulations generated by diffusion models and further leveraging innovations in deep learning to protect the integrity of visual content.
- Published
- 2025
- Full Text
- View/download PDF
22. Enhancing copy-move forgery detection through a novel CNN architecture and comprehensive dataset analysis.
- Author
-
Kuznetsov, Oleksandr, Frontoni, Emanuele, Romeo, Luca, and Rosati, Riccardo
- Subjects
FORGERY, CONVOLUTIONAL neural networks, DIGITAL technology
- Abstract
In the contemporary digital era, images are omnipresent, serving as pivotal entities in conveying information, authenticating experiences, and substantiating facts. The ubiquity of image editing tools has precipitated a surge in image forgeries, notably through copy-move attacks where a portion of an image is copied and pasted within the same image to concoct deceptive narratives. This phenomenon is particularly perturbing considering the pivotal role images play in legal, journalistic, and scientific domains, necessitating robust forgery detection mechanisms to uphold image integrity and veracity. While advancements in Convolutional Neural Networks (CNN) have propelled copy-move forgery detection, existing methodologies grapple with limitations concerning the detection efficacy amidst complex manipulations and varied dataset characteristics. Additionally, a palpable void exists in comprehensively understanding and exploiting dataset heterogeneity to enhance detection capabilities. This heralds a pronounced exigency for innovative CNN architectures and nuanced understandings of dataset intricacies to augment detection capabilities, which has remained notably underexplored in the prevailing literature. Against this backdrop, our research broaches novel frontiers in copy-move forgery detection by introducing an innovative CNN architecture meticulously tailored to discern the subtlest manipulations, even amidst intricate image contexts. An extensive analysis of multiple datasets – MICC-F220, MICC-F600, and a combined variant – enables us to delineate a granular understanding of their attributes, thereby shedding unprecedented light on their influences on detection performance. Further, our research goes beyond mere detection, delving deep into comprehensive analyses of varied datasets and conducting additional experiments with differential training-validation sets and randomly labeled data to scrutinize the robustness and reliability of our model. 
We not only meticulously document and analyze our findings but also juxtapose them against extant models, offering an exhaustive comparative analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Deep Fake Detection using Inception ResNetV2.
- Author
-
Chemparathy, Alex B. and T., Kavitha
- Subjects
DEEPFAKES, DEEP learning, DIGITAL technology, DIGITAL media, DATA integrity
- Abstract
The increasing prevalence of DeepFake technology poses significant threats to various industries and public trust, making the development of robust detection methods crucial. In this study, we propose a novel approach for DeepFake detection using the InceptionResNetV2 architecture, leveraging its advanced capabilities in extracting features from images. Our method utilizes a deep learning framework to train the model on a diverse dataset of authentic and DeepFake videos, enabling it to learn distinct patterns and discrepancies between the two types of content. Through extensive experimentation and evaluation, we demonstrate the effectiveness of the proposed approach in accurately identifying DeepFake videos with high precision and recall rates. Furthermore, our method exhibits robustness against various manipulation techniques, showcasing its potential for real-world applications in combating the spread of misinformation and fraudulent content. The implementation of InceptionResNetV2 for DeepFake detection presents a promising solution to the growing challenges posed by synthetic media, providing a reliable tool for safeguarding the integrity of visual information in digital environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
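The study above trains a classifier on features extracted by InceptionResNetV2. As a hedged, framework-free sketch of the final stage only, the snippet below trains a logistic-regression head on hypothetical frozen backbone embeddings; the backbone itself is not reproduced, and the feature dimensions are invented for illustration.

```python
import numpy as np

def train_head(feats, labels, lr=0.1, epochs=200):
    """Logistic-regression head on frozen backbone features -- a
    stand-in for the classification layer trained on top of
    InceptionResNetV2 embeddings (real=1, fake=0)."""
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # sigmoid
        g = p - labels                               # BCE gradient
        w -= lr * feats.T @ g / len(labels)
        b -= lr * g.mean()
    return w, b

def predict(feats, w, b):
    """Hard real/fake decision from the trained head."""
    return (feats @ w + b > 0).astype(int)
```

In practice the head would be trained on embeddings of real and DeepFake frames; here any linearly separable features suffice to exercise the code.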
24. Real Time Sign Language Recognition using Knowledge Assisted Method.
- Author
-
Srivastava, Vivek, Pratap, Jay, Sinha, Shobhit, and Kumar, Sunil
- Subjects
SIGN language ,DEAF children ,ORAL communication ,IMAGE processing ,SPEECH ,DEAF people ,FUNCTIONAL magnetic resonance imaging - Abstract
Verbal communication can be hampered by speech impairment, and sign language is one of the most effective systems for overcoming this barrier. The goal of our paper is to create a system or application that can recognize sign language gestures in order to reduce the communication gap between persons with disabilities, such as deaf and mute people, and others. The proposed system uses machine learning and image processing to operate in real time: image processing preprocesses the photos and segments hand gestures from the background. These images form a dataset covering 24 English alphabet letters. The Faster R-CNN proposed in our research is evaluated both on this custom dataset and on live hand gestures produced by people of different ages, achieving an accuracy of nearly 95%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. Toward effective image forensics via a novel computationally efficient framework and a new image splice dataset.
- Author
-
Yadav, Ankit and Vishwakarma, Dinesh Kumar
- Abstract
Splice detection models are the need of the hour, since splice manipulations can be used to mislead, spread rumors and create disharmony in society. However, there is a severe lack of image-splicing datasets, which restricts the capability of deep learning models to extract discriminative features without overfitting. This manuscript presents twofold contributions toward splice detection. Firstly, a novel splice detection dataset with two variants is proposed: spliced samples generated from code and through manual editing. Spliced images in both variants have corresponding binary masks to aid localization approaches. Secondly, a novel spatio-compression lightweight splice detection framework is proposed for accurate splice detection with minimum computational cost. The proposed dual-branch framework extracts discriminative spatial features from a lightweight spatial branch and uses original-resolution compression data to extract double compression artifacts in the second branch, thereby making it 'information preserving.' Several CNNs are tested in combination with the proposed framework on a composite dataset of images from the proposed dataset and the CASIA v2.0 dataset. The best model accuracy of 0.9382 is achieved and compared with similar state-of-the-art methods, demonstrating the superiority of the proposed framework. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
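The second branch of the framework above extracts double-compression artifacts. A simplified model of why such artifacts are detectable: quantizing DCT coefficients twice with different step sizes leaves periodic gaps in the coefficient histogram. The snippet below sketches this effect on synthetic coefficients; the function names are illustrative, not from the paper.

```python
import numpy as np

def double_quantize(coeffs, q1, q2):
    """Quantize coefficients with step q1, dequantize, then requantize
    with step q2 -- a toy model of a double-JPEG pipeline."""
    return np.round(np.round(coeffs / q1) * q1 / q2)

def histogram_gaps(values):
    """Fraction of integer bins in the value range left empty.
    Double compression with q1 > q2 leaves periodic empty bins,
    a classic double-compression fingerprint."""
    vals = values.astype(int)
    present = np.unique(vals)
    span = vals.max() - vals.min() + 1
    return 1.0 - len(present) / span
```

A singly compressed signal fills every bin, while q1=3 followed by q2=1 leaves roughly two thirds of the bins empty, which is the statistic a learned branch can pick up.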
26. Region‐Aware Simplification and Stylization of 3D Line Drawings.
- Author
-
Nguyen, Vivien, Fisher, Matthew, Hertzmann, Aaron, and Rusinkiewicz, Szymon
- Subjects
- *
ARTISTIC style , *TOPOLOGY , *GEOMETRY - Abstract
Shape‐conveying line drawings generated from 3D models normally create closed regions in image space. These lines and regions can be stylized to mimic various artistic styles, but for complex objects, the extracted topology is unnecessarily dense, leading to unappealing and unnatural results under stylization. Prior works typically simplify line drawings without considering the regions between them, and lines and regions are stylized separately, then composited together, resulting in unintended inconsistencies. We present a method for joint simplification of lines and regions simultaneously that penalizes large changes to region structure, while keeping regions closed. This feature enables region stylization that remains consistent with the outline curves and underlying 3D geometry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. EmoStyle: Emotion-Aware Semantic Image Manipulation with Audio Guidance.
- Author
-
Shen, Qiwei, Xu, Junjie, Mei, Jiahao, Wu, Xingjiao, and Dong, Daoguo
- Subjects
EMOTIONS ,AFFECT (Psychology) ,IMAGE retrieval - Abstract
With the flourishing development of generative models, image manipulation is receiving increasing attention. Rather than text modality, several elegant designs have delved into leveraging audio to manipulate images. However, existing methodologies mainly focus on image generation conditional on semantic alignment, ignoring the vivid affective information depicted in the audio. We propose an Emotion-aware StyleGAN Manipulator (EmoStyle), a framework where affective information from audio can be explicitly extracted and further utilized during image manipulation. Specifically, we first leverage the multi-modality model ImageBind for initial cross-modal retrieval between images and music, and select the music-related image for further manipulation. Simultaneously, by extracting sentiment polarity from the lyrics of the audio, we generate an emotionally rich auxiliary music branch to accentuate the affective information. We then leverage pre-trained encoders to encode audio and the audio-related image into the same embedding space. With the aligned embeddings, we manipulate the image via a direct latent optimization method. We conduct objective and subjective evaluations on the generated images, and our results show that our framework is capable of generating images with specified human emotions conveyed in the audio. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
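EmoStyle manipulates the image "via a direct latent optimization method." A toy sketch of that idea follows, with linear maps standing in for StyleGAN and the pre-trained encoder so that the gradient is analytic; everything here is a simplified assumption, not the paper's pipeline.

```python
import numpy as np

def optimize_latent(z0, G, E, target, lr=0.1, steps=500):
    """Direct latent optimization: gradient-descend on the latent z so
    that the encoder embedding of the generated image, E @ (G @ z),
    matches the target (audio) embedding. G and E are linear stand-ins
    for the generator and the pre-trained encoder."""
    z = z0.copy()
    M = E @ G                              # embedding of G(z) is M @ z
    for _ in range(steps):
        grad = 2 * M.T @ (M @ z - target)  # d/dz ||M z - target||^2
        z -= lr * grad
    return z
```

With a real StyleGAN the same loop would backpropagate through the generator and encoder instead of through a fixed matrix.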
28. The digitally manipulated family photograph: MyHeritage's 'Deep Nostalgia', and the extended temporality of the photographic image.
- Author
-
Conaghan, Fern
- Subjects
PHOTOGRAPHS ,DIGITAL photography ,TECHNOLOGICAL innovations ,EXTENDED families ,NOSTALGIA ,POPULAR culture - Abstract
This article examines how the digitally manipulated family photograph functions as a means of understanding the temporal instability of the use and interpretations of photographic images. It begins by taking a close look at scholarly debates on how 'credible' the documentary value of a still photograph is, as well as how it is able to emotionally resonate with spectators. From this discussion, it becomes important to look at a key example of how an image can produce an emotional effect on a viewer; in this case, photographs of individuals' deceased family members. While exploring how this allows the spectator to reconnect with their relatives, it is also crucial to acknowledge that readings of images like these are often determined by reductive interpretations of their stillness. As the consideration of photographs as 'documents' has been contested for an extensive amount of time, it is illuminating to turn to the properties of digital photography by inspecting the photo manipulation feature 'Deep Nostalgia' on the MyHeritage app that circulated around TikTok in 2021. I look at a YouTube compilation of people reacting to seeing photographs of their family manipulated in a way that gives the impression that they are moving and emoting, alongside discussions about this in recent pop culture articles. By taking a Barthesian reading of the extended temporality of these family photographs, it is important to recognise that the connection between the subject and the image is severed both iconically and indexically from its original context. However, by understanding this photographic image in the context of being digital it must be understood differently. I will therefore use the MyHeritage phenomenon as a means of arguing that the digital image is not inferior to the 'realism' of analogue photography and must, instead, be read in relation to the history of technological change. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Improving synthetic media generation and detection using generative adversarial networks
- Author
-
Rabbia Zia, Mariam Rehman, Afzaal Hussain, Shahbaz Nazeer, and Maria Anjum
- Subjects
Generative adversarial networks ,Deep neural networks ,Image manipulation ,DeepFake ,Manipulation detection ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Synthetic images, referred to as deepfakes, are created using computer graphics modeling and artificial intelligence techniques. They modify human features using generative models and deep learning algorithms, posing risks of violating social media regulations and spreading false information. To address these concerns, the study proposes an improved generative adversarial network (GAN) model that improves accuracy in differentiating between real and fake images, focusing on data augmentation and label smoothing strategies for GAN training. The study utilizes a dataset containing human faces and employs DCGAN (deep convolutional generative adversarial network) as the base model. The proposed GAN outperforms traditional GANs on frequently used metrics, i.e., Fréchet Inception Distance (FID) and accuracy. Its effectiveness is demonstrated through evaluation on the Flickr-Faces Nvidia dataset and the Fakefaces dataset, achieving an FID score of 55.67, an accuracy of 98.82%, and an F1-score of 0.99 in detection. The study fine-tunes the model parameters to reach optimal settings, thereby reducing risks in synthetic image generation. The article introduces an effective framework for both image manipulation and detection.
- Published
- 2024
- Full Text
- View/download PDF
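The entry above reports FID and uses label smoothing during GAN training. Both are standard techniques and can be sketched directly: FID is the Fréchet distance between two Gaussians fitted to feature statistics, and one-sided label smoothing softens the discriminator's real targets. The exact settings in the paper may differ from these common defaults.

```python
import numpy as np

def fid(mu1, cov1, mu2, cov2):
    """Frechet Inception Distance between Gaussians N(mu1, cov1) and
    N(mu2, cov2): ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1 cov2)^(1/2))."""
    diff = mu1 - mu2
    # Tr((cov1 cov2)^(1/2)) = sum of sqrt of eigenvalues of cov1 @ cov2
    eig = np.linalg.eigvals(cov1 @ cov2)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0, None)).sum()
    return diff @ diff + np.trace(cov1) + np.trace(cov2) - 2 * tr_sqrt

def smoothed_targets(n_real, n_fake, real_val=0.9):
    """One-sided label smoothing for the discriminator: real targets
    are softened to real_val (commonly 0.9); fake targets stay at 0."""
    return np.concatenate([np.full(n_real, real_val), np.zeros(n_fake)])
```

Identical feature distributions give FID 0; in practice the Gaussians are fitted to Inception activations of real and generated batches.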
30. Intrinsic Image Decomposition via Ordinal Shading.
- Author
-
Careaga, Chris and Aksoy, Yağız
- Subjects
COMPUTATIONAL photography ,ALBEDO ,QUANTITATIVE research - Abstract
Intrinsic decomposition is a fundamental mid-level vision problem that plays a crucial role in various inverse rendering and computational photography pipelines. Generating highly accurate intrinsic decompositions is an inherently under-constrained task that requires precisely estimating continuous-valued shading and albedo. In this work, we achieve high-resolution intrinsic decomposition by breaking the problem into two parts. First, we present a dense ordinal shading formulation using a shift- and scale-invariant loss in order to estimate ordinal shading cues without restricting the predictions to obey the intrinsic model. We then combine low- and high-resolution ordinal estimations using a second network to generate a shading estimate with both global coherency and local details. We encourage the model to learn an accurate decomposition by computing losses on the estimated shading as well as the albedo implied by the intrinsic model. We develop a straightforward method for generating dense pseudo ground truth using our model's predictions and multi-illumination data, enabling generalization to in-the-wild imagery. We present exhaustive qualitative and quantitative analysis of our predicted intrinsic components against state-of-the-art methods. Finally, we demonstrate the real-world applicability of our estimations by performing otherwise difficult editing tasks such as recoloring and relighting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
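The shift- and scale-invariant loss mentioned above can be illustrated in one function: fit the best affine alignment between prediction and target by least squares, then measure the residual error. This shows the general idea only; the paper's exact formulation may differ.

```python
import numpy as np

def ssi_mse(pred, target):
    """Scale- and shift-invariant MSE: solve for the scale s and shift t
    that best align pred to target (least squares), then return the MSE
    of the aligned prediction. Any affine transform of a correct
    prediction therefore incurs zero loss."""
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
    return np.mean((s * pred + t - target) ** 2)
```

This is why such a loss suits ordinal shading: the network is rewarded for getting relative brightness right without committing to an absolute scale.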
31. An unconditional generative model with self-attention module for single image generation.
- Author
-
Yıldız, Eyyüp, Yüksel, Mehmet Erkan, and Sevgen, Selçuk
- Subjects
- *
GENERATIVE adversarial networks , *DEEP learning , *IMAGE processing , *ROBUST statistics , *REALISM - Abstract
Generative Adversarial Networks (GANs) have revolutionized the field of deep learning by enabling the production of high-quality synthetic data. However, the effectiveness of GANs largely depends on the size and quality of training data. In many real-world applications, collecting large amounts of high-quality training data is time-consuming and expensive. Accordingly, in recent years, GAN models that use limited data have begun to be developed. In this study, we propose a GAN model that can learn from a single training image. Our model is based on the principle of multiple GANs operating sequentially at different scales, where each GAN learns the features of the training image and transfers them to the next GAN, ultimately generating examples with different realistic structures at the final scale. In our model, we utilized a self-attention module and a new scaling method to increase the realism and quality of the generated images. The experimental results show that our model performs image generation successfully. In addition, we demonstrated the robustness of our model by testing it in different image manipulation applications. As a result, our model can successfully produce realistic, high-quality, diverse images from a single training image, providing short training time and good training stability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
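The self-attention module mentioned above follows the standard scaled dot-product formulation, softmax(Q K^T / sqrt(d_k)) V. A minimal single-head version is sketched below; the projection matrices are supplied by the caller and all names are illustrative.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of feature vectors
    x of shape (n, d): project to queries, keys, values, then mix the
    values by softmax-normalized query-key similarity."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # rows sum to 1
    return attn @ v
```

In a single-image GAN this lets each spatial location aggregate information from the whole feature map rather than a local convolutional window.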
32. When Seeing Isn't Believing: Navigating Visual Health Misinformation through Library Instruction.
- Author
-
Cowles, Kelsey, Miller, Rebekah, and Suppok, Rachel
- Subjects
- *
DIGITAL image processing , *MEDICAL libraries , *INFORMATION display systems , *COMPUTER assisted instruction , *SOCIAL media , *ARTIFICIAL intelligence , *FRAUD in science , *DISINFORMATION , *INFORMATION literacy , *HEALTH literacy , *FRAUD , *HEALTH , *INFORMATION resources , *COMMUNICATION , *MISINFORMATION , *VIDEO recording - Abstract
Visual misinformation poses unique challenges to public health due to its potential for persuasiveness and rapid spread on social media. In this article, librarians at the University of Pittsburgh Health Sciences Library System identify four types of visual health misinformation: misleading graphs and charts, out of context visuals, image manipulation in scientific publications, and AI-generated images and videos. To educate our campus's health sciences audience and wider community on these topics, we have developed a range of instruction about visual health misinformation. We describe our strategies and provide suggestions for implementing visual misinformation programming for a variety of audiences. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. AI, What a Beauty!: Viewer Attitudes toward Faces Manipulated by Artificial Intelligence.
- Author
-
EVELIN, HORVÁTH
- Abstract
The attention of modern media research in the last decade has been focused on the truth value of images: within the field of manipulated photographs, portrait photographs represent a special group, since the human face is a specific visual stimulus that plays an important role in the perception of beauty. The present research investigated portrait photographs that were automatically retouched by artificial-intelligence-based image editing software, using an online questionnaire. According to the results, viewers' perception of AI-based image retouching is not influenced by the gender or age of the photo model, nor by the gender or age of the recipients. However, the beauty judged by the recipients does affect the viewer's attitude towards both the model and the photograph. The study points to new potential research directions in the field of AI-generated beauty ideals and automated AI-based retouching. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Enhancing Sensing and Imaging Capabilities Through Surface Plasmon Resonance for Deepfake Image Detection
- Author
-
Maheshwari, R. Uma, B.Paulchamy, Pandey, Binay Kumar, and Pandey, Digvijay
- Published
- 2024
- Full Text
- View/download PDF
35. Diffusion-Adapter: Text Guided Image Manipulation with Frozen Diffusion Models
- Author
-
Wei, Rongting, Fan, Chunxiao, Wu, Yuexin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Iliadis, Lazaros, editor, Papaleonidas, Antonios, editor, Angelov, Plamen, editor, and Jayne, Chrisina, editor
- Published
- 2023
- Full Text
- View/download PDF
36. Text Guided Facial Image Synthesis Using StyleGAN and Variational Autoencoder Trained CLIP
- Author
-
Srinivasa, Anagha, Praveen, Anjali, Mavathur, Anusha, Pothumarthi, Apurva, Arya, Arti, Agarwal, Pooja, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Rutkowski, Leszek, editor, Scherer, Rafał, editor, Korytkowski, Marcin, editor, Pedrycz, Witold, editor, Tadeusiewicz, Ryszard, editor, and Zurada, Jacek M., editor
- Published
- 2023
- Full Text
- View/download PDF
37. Machine Learning Techniques for Image Manipulation Detection: A Review and Analysis
- Author
-
Iqbal, Suhaib Wajahat, Arora, Bhavna, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Singh, Yashwant, editor, Verma, Chaman, editor, Zoltán, Illés, editor, Chhabra, Jitender Kumar, editor, and Singh, Pradeep Kumar, editor
- Published
- 2023
- Full Text
- View/download PDF
38. Binary Pattern for Copy-Move Image Forgery Detection
- Author
-
Rathore, Neeraj, Jain, Neelesh, Singh, Pawan, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Kumar Singh, Koushlendra, editor, Bajpai, Manish Kumar, editor, and Sheikh Akbari, Akbar, editor
- Published
- 2023
- Full Text
- View/download PDF
39. Learning of Linear Transformations Involving Mathematical Modelling Supported by Technology: A Study with Undergraduate Students
- Author
-
Ramirez-Montes, Guillermo, Carreira, Susana, Henriques, Ana, Kaiser, Gabriele, Series Editor, Stillman, Gloria Ann, Series Editor, Biembengut, Maria Salett, Editorial Board Member, Blum, Werner, Editorial Board Member, Doerr, Helen, Editorial Board Member, Galbraith, Peter, Editorial Board Member, Ikeda, Toshikazu, Editorial Board Member, Niss, Mogens, Editorial Board Member, Xie, Jinxing, Editorial Board Member, Greefrath, Gilbert, editor, and Carreira, Susana, editor
- Published
- 2023
- Full Text
- View/download PDF
40. Container Security Using Algorithmic Approach.
- Author
-
R. P. N. M., Ramanayaka and J. A. S. T., Abeywickrama
- Subjects
INTERNET security ,MACHINE learning ,DEEP learning ,TECHNOLOGICAL innovations ,DIGITAL photography - Abstract
Container technology is one of the fastest-growing technologies, but container security vulnerabilities have increased in line with its popularity: when users work with containers, they may unknowingly perform actions that make their Docker environment insecure, so the security of the entire container environment must be verified. In this paper, we describe a mechanism that validates the security level of a container environment using an algorithmic approach. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Unsupervised Discovery and Manipulation of Continuous Disentangled Factors of Variation.
- Author
-
FONTANINI, TOMASO, DONATI, LUCA, BERTOZZI, MASSIMO, and PRATI, ANDREA
- Abstract
Learning a disentangled representation of a distribution in a completely unsupervised way is a challenging task that has drawn attention recently. In particular, much focus has been put in separating factors of variation (i.e., attributes) within the latent code of a Generative Adversarial Network (GAN). Achieving that permits control of the presence or absence of those factors in the generated samples by simply editing a small portion of the latent code. Nevertheless, existing methods that perform very well in a noise-to-image setting often fail when dealing with a real data distribution, i.e., when the discovered attributes need to be applied to real images. However, some methods are able to extract and apply a style to a sample but struggle to maintain its content and identity, while others are not able to locally apply attributes and end up achieving only a global manipulation of the original image. In this article, we propose a completely (i.e., truly) unsupervised method that is able to extract a disentangled set of attributes from a data distribution and apply them to new samples from the same distribution by preserving their content. This is achieved by using an image-to-image GAN that maps an image and a random set of continuous attributes to a new image that includes those attributes. Indeed, these attributes are initially unknown and they are discovered during training by maximizing the mutual information between the generated samples and the attributes' vector. Finally, the obtained disentangled set of continuous attributes can be used to freely manipulate the input samples. We prove the effectiveness of our method over a series of datasets and show its application on various tasks, such as attribute editing, data augmentation, and style transfer. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Learning to Generate and Manipulate 3D Radiance Field by a Hierarchical Diffusion Framework with CLIP Latent.
- Author
-
Wang, Jiaxu, Zhang, Ziyi, and Xu, Renjing
- Abstract
3D‐aware generative adversarial networks (GAN) are widely adopted in generating and editing neural radiance fields (NeRF). However, these methods still suffer from GAN‐related issues including degraded diversity and training instability. Moreover, 3D‐aware GANs consider NeRF pipeline as regularizers and do not directly operate with 3D assets, leading to imperfect 3D consistencies. Besides, the independent changes in disentangled editing cannot be ensured due to the sharing of some shallow hidden features in generators. To address these challenges, we propose the first purely diffusion‐based three‐stage framework for generative and editing tasks, with a series of well‐designed loss functions that can directly handle 3D models. In addition, we present a generalizable neural point field as our 3D representation, which explicitly disentangles geometry and appearance in feature spaces. For 3D data conversion, it simplifies the preparation pipeline of datasets. Assisted by the representation, our diffusion model can separately manipulate the shape and appearance in a hierarchical manner by image/text prompts that are provided by the CLIP encoder. Moreover, it can generate new samples by adding a simple generative head. Experiments show that our approach outperforms the SOTA work in the generative tasks of direct generation of 3D representations and novel image synthesis, and completely disentangles the manipulation of shape and appearance with correct semantic correspondence in the editing tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Natural Image Decay With a Decay Effects Generator
- Author
-
Guoqing Hao, Satoshi Iizuka, Kensho Hara, Hirokatsu Kataoka, and Kazuhiro Fukui
- Subjects
Arbitrary-sized image generation ,generative adversarial networks ,image decay ,image editing ,image manipulation ,image processing ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
We present a novel framework for simulating time-varying decay effects for natural images. Conventional methods assume the input image includes enough decay information and use the color or texture information of the decayed regions to transfer its effect to the non-decayed regions. Unlike these approaches, our framework generates diverse patterns of decay effects by leveraging a decay effects generator network without referencing the decay features of the input image, which allows us to handle more general images with non-decayed objects. Our decay generator network is formed by a style-based generative adversarial network with an arbitrary-sized stationary texture generation mechanism that allows us to synthesize various sizes of decay textures. This arbitrary-sized stationary texture generation is necessary to synthesize photo-realistic decay effects since the appropriate resolutions of the decay textures depend on those of the target objects. We construct a novel decay texture image dataset that contains various types of decay texture images, such as moss and rust, to train the decay generator network. We show that our framework is able to synthesize diverse decay effects on various non-decayed objects without using additional decayed object images.
- Published
- 2023
- Full Text
- View/download PDF
44. Estimating the Rotation Angle of JPEG Images via Blocking-Artifact Spectrum Analysis.
- Author
-
党良慧 and 张玉金
- Subjects
- *
HIGHPASS electric filters , *JPEG (Image coding standard) , *QUALITY factor , *ARCHAEOLOGY methodology , *FOURIER transforms , *IMAGE compression , *ANGLES - Abstract
Image rotation makes fake images more geometrically realistic, and the performance of existing JPEG image rotation angle estimation algorithms is easily disturbed by block artifacts and the image block size. Estimation remains challenging for rotation angles in the interval [1°, 15°]. This study presents an effective algorithm to estimate the rotation angle of JPEG images based on blocking-artifact spectrum analysis. Firstly, the edges of the image are extracted and removed using an ant colony algorithm variant to highlight block effects. Secondly, the effect of image texture is further mitigated by cross-differencing. Thirdly, extraneous peaks in the Fourier transform domain are removed by setting reasonable thresholds and applying a Gaussian high-pass filter to reduce interference. Finally, the amplitude component of the Fourier spectrum is projected into polar coordinates, and the polar angle corresponding to the peak gives the estimated rotation angle. Experimental results show that the method's mean absolute error is lower than that of existing methods for small-size JPEG images rotated within the [1°, 15°] interval. Moreover, as the compression quality factor is gradually reduced, the proposed method still outperforms existing methods, showing better robustness to JPEG compression. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
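The method above locates spectral peaks produced by JPEG blocking artifacts. A simplified 1-D illustration of the underlying signal model: an 8-pixel-periodic artifact in a gradient profile produces a Fourier peak at frequency bin N/8. The paper works in 2-D and reads the rotation angle off a polar projection; this sketch shows only the peak-finding step.

```python
import numpy as np

def block_artifact_peak(profile):
    """Locate the dominant periodicity in a 1-D gradient profile.
    For an unrotated JPEG, the 8-pixel block grid puts a spectral
    peak at bin N/8 of the length-N FFT."""
    spec = np.abs(np.fft.rfft(profile - profile.mean()))
    spec[0] = 0.0                    # ignore any residual DC component
    return int(np.argmax(spec))
```

After rotation the 2-D artifact grid tilts, so the corresponding peaks move in the Fourier plane; projecting the spectrum into polar coordinates turns that shift into a readable angle.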
45. Impact of Image Manipulation through Digital Software on Pakistani Advertising Design.
- Author
-
Faraz, Ahmed
- Subjects
BRAND loyalty ,ADVERTISING ,VALUES (Ethics) ,CONSUMERS ,COMPUTER software - Abstract
The significance of the image in advertising design has led to advances in the technological development of images. This article aims to evaluate the aesthetics, function, and implications of image manipulation in Pakistani advertising design. From limited facilities to digital advancement, the history of Pakistani design reveals a creative evolution. The advent of digital software has enabled the manipulation of photographs to effectively convey intended messages. The research emphasizes the analysis of contemporary Pakistani advertising design through a qualitative method. It puts forth various image manipulation techniques employed in contemporary advertising design. The research reflects evolving digital trends in image manipulation and emphasizes the captivating quality of these images in engaging and influencing the consumer, thereby bolstering brand promotion. The study underscores the effective role of image manipulation in Pakistani advertising. Furthermore, advertisers and designers are recommended to consider ethical values regarding image manipulation to ensure progress in advertising practices. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. AISMSNet: Advanced Image Splicing Manipulation Identification Based on Siamese Networks
- Author
-
Ana Elena Ramirez-Rodriguez, Rodrigo Eduardo Arevalo-Ancona, Hector Perez-Meana, Manuel Cedillo-Hernandez, and Mariko Nakano-Miyatake
- Subjects
deep learning ,splicing detection ,K-means ,Siamese neural network ,image manipulation ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
The exponential surge in specialized image editing software has intensified visual forgery, with splicing attacks emerging as a popular forgery technique. In this context, Siamese neural networks are a remarkable tool in pattern identification for detecting image manipulations. This paper introduces a deep learning approach for splicing detection based on a Siamese neural network tailored to identifying manipulated image regions. The Siamese neural network learns unique features of specific image areas and detects tampered regions through feature comparison. This architecture employs two identical branches with shared weights and image features to compare image blocks and identify tampered areas. Subsequently, a K-means algorithm is applied to identify similar centroids and determine the precise localization of duplicated regions in the image. The experimental results encompass various splicing attacks to assess effectiveness, demonstrating a high accuracy of 98.6% and a precision of 97.5% for splicing manipulation detection. This study presents an advanced splicing image forgery detection and localization algorithm, showcasing its efficacy through comprehensive experiments.
- Published
- 2024
- Full Text
- View/download PDF
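The pipeline above compares block features from the two Siamese branches and then clusters candidate locations with K-means. Both steps can be sketched with plain NumPy; the embeddings and the deterministic initialization below are illustrative simplifications, not the paper's implementation.

```python
import numpy as np

def siamese_distance(emb_a, emb_b):
    """Distance between the two branch embeddings; a small distance
    means the two image blocks look alike (a candidate tampered pair)."""
    return np.linalg.norm(emb_a - emb_b)

def kmeans(points, k, iters=20):
    """Minimal K-means for grouping candidate tampered-block locations
    into spatial clusters (simple deterministic initialization)."""
    centers = points[:k].astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers
```

In the full system the embeddings come from the shared-weight branches, and the clustered centroids localize the duplicated regions in the image.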
47. EmoStyle: Emotion-Aware Semantic Image Manipulation with Audio Guidance
- Author
-
Qiwei Shen, Junjie Xu, Jiahao Mei, Xingjiao Wu, and Daoguo Dong
- Subjects
generative model ,image manipulation ,affective information ,audio-based image manipulation ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
With the flourishing development of generative models, image manipulation is receiving increasing attention. Beyond the text modality, several elegant designs have explored leveraging audio to manipulate images. However, existing methodologies mainly focus on image generation conditioned on semantic alignment, ignoring the vivid affective information depicted in the audio. We propose an Emotion-aware StyleGAN Manipulator (EmoStyle), a framework in which affective information from audio can be explicitly extracted and further utilized during image manipulation. Specifically, we first leverage the multi-modality model ImageBind for initial cross-modal retrieval between images and music, and select the music-related image for further manipulation. Simultaneously, by extracting sentiment polarity from the lyrics of the audio, we generate an emotionally rich auxiliary music branch to accentuate the affective information. We then leverage pre-trained encoders to encode the audio and the audio-related image into the same embedding space. With the aligned embeddings, we manipulate the image via a direct latent optimization method. We conduct objective and subjective evaluations on the generated images, and our results show that our framework is capable of generating images with specified human emotions conveyed in the audio.
- Published
- 2024
- Full Text
- View/download PDF
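The direct latent optimization step described in the abstract can be sketched with toy linear stand-ins: a matrix `G` plays the role of the pre-trained generator (StyleGAN in the paper) and `E` the role of the pre-trained image encoder, while the target embedding represents the audio-derived embedding in the shared space. Gradient descent on the latent then pulls the encoded image toward that target. This is a minimal sketch under those assumptions; none of these names or dimensions come from the paper.

```python
import numpy as np

# Toy stand-ins for the pre-trained components the framework assumes:
# G maps a latent z to an "image"; E maps images into the shared
# audio/image embedding space.
rng = np.random.default_rng(1)
G = rng.standard_normal((16, 4))   # latent (4-d) -> "image" (16-d)
E = rng.standard_normal((3, 16))   # "image" -> embedding (3-d)

def embed(z):
    """Encode the generated image into the shared embedding space."""
    return E @ (G @ z)

def latent_optimize(z0, target_emb, steps=20000):
    """Direct latent optimization: gradient-descend the latent z so the
    encoded image matches the target (audio-derived) embedding,
    minimizing ||E G z - target||^2."""
    z = z0.astype(float).copy()
    M = E @ G                              # combined linear map
    lr = 0.5 / np.linalg.norm(M, 2) ** 2   # safe step for this quadratic
    for _ in range(steps):
        grad = 2.0 * M.T @ (M @ z - target_emb)
        z -= lr * grad
    return z
```

In the real framework the generator and encoder are deep networks, so the objective is non-convex and optimized with automatic differentiation; the linear toy keeps the mechanics of latent optimization visible in a few lines.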
48. VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
- Author
-
Crowson, Katherine, Biderman, Stella, Kornis, Daniel, Stander, Dashiell, Hallahan, Eric, Castricato, Louis, Raff, Edward, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
49. Traffic Sign Detection and Recognition for Hazy Images: ADAS
- Author
-
Galgali, Raiee, Punagin, Sahana, Iyer, Nalini, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Chen, Joy Iong-Zong, editor, Tavares, João Manuel R. S., editor, Iliyasu, Abdullah M., editor, and Du, Ke-Lin, editor
- Published
- 2022
- Full Text
- View/download PDF
50. BUILDING A FACE DATABASE OF ARAB FACES TOWARD EVALUATING BIAS IN FACIAL ANALYSIS SYSTEMS.
- Author
-
Khalil, Ashraf, Glal, Suha, Ahmed, Khider, Khan, Sana Zeb, and Abdulgani, Aysha
- Subjects
DATABASES ,FACIAL anatomy ,PASSENGERS ,MOBILE health ,TRANSPORTATION - Abstract
Machine learning algorithms are fundamentally driven by the data provided by humans; consequently, the decisions made by those algorithms are not free from human bias. This is particularly evident in the case of facial analysis systems that employ machine learning algorithms. Recent studies have shown that the decisions made by many of the commercially available facial analysis systems are prejudiced against certain groups of race, ethnicity, age, gender and culture. Further studies have identified that the underlying reason for such biased decisions is that the open-source facial image databases used in commerce and academia to train the algorithms have meager diversity in these categories. To compound this issue, facial analysis technology is promoted by influential companies and artificial intelligence service providers without affirming the fairness and accuracy of the decisions given by these systems. To minimize bias and ensure representation of the Middle Eastern population in the imminent growth of this technology, we propose the development of two Arab face databases along with an algorithmic audit involving seven commercially available facial analysis systems. Of the databases, the first, Arab-LEANA, will include 300 Arab subjects' face images with variation in lighting, expression, accessory, nationality and age (LEANA). The second, Arab Public Figures Faces (APFF), will contain images and videos of 300 Arab public figures captured "in the wild". Faces for APFF will be selected manually from the internet, since manual selection yields a high degree of variability in scale, pose, expression, illumination, age, occlusion and make-up. These databases will provide the worldwide community of face recognition researchers with a large-scale, diverse collection of Arab face images for training and evaluating algorithms toward developing a more representative, and therefore more robust, capacity for facial analysis.
This, in turn, will facilitate the development of more accurate face recognition technology as it prepares to go mainstream and enter numerous facets of modern life. [ABSTRACT FROM AUTHOR]
- Published
- 2023