Search

Your search for the descriptor "Multimedia (cs.MM)" returned 4,986 results.


Search Results

1. SeamlessGAN: Self-Supervised Synthesis of Tileable Texture Maps

2. Exploring the Contextual Factors Affecting Multimodal Emotion Recognition in Videos

3. A Survey on Perceptually Optimized Video Coding

4. A Deep Multi-level Attentive Network for Multimodal Sentiment Analysis

5. Perceived Conversation Quality in Spontaneous Interactions

6. Ray-Space Motion Compensation for Lenslet Plenoptic Video Coding

7. Side-Informed Steganography for JPEG Images by Modeling Decompressed Images

8. Reduced-Reference Quality Assessment of Point Clouds via Content-Oriented Saliency Projection

9. Temporal Sentence Grounding in Videos: A Survey and Future Directions

10. M2P2: Multimodal Persuasion Prediction Using Adaptive Fusion

11. Adaptive Marginalized Semantic Hashing for Unpaired Cross-Modal Retrieval

12. Deep Learning for Predictive Analytics in Reversible Steganography

13. Plug-and-Play Regulators for Image-Text Matching

14. Cross-Modal Variational Auto-Encoder for Content-Based Micro-Video Background Music Recommendation

15. Embedding-Based Music Emotion Recognition Using Composite Loss

16. Blind Quality Assessment for in-the-Wild Images via Hierarchical Feature Fusion and Iterative Mixed Database Training

17. OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup

18. Making DeepFakes More Spurious: Evading Deep Face Forgery Detection via Trace Removal Attack

19. Class-Aware Sounding Objects Localization via Audiovisual Correspondence

20. Smartbanner: intelligent banner design framework that strikes a balance between creative freedom and design rules

21. Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring

22. SHREC’22 track: Sketch-based 3D shape retrieval in the wild

23. Meta-Transformer: A Unified Framework for Multimodal Learning

24. Investigating VTubing as a Reconstruction of Streamer Self-Presentation: Identity, Performance, and Gender

25. AGAR: Attention Graph-RNN for Adaptative Motion Prediction of Point Clouds of Deformable Objects

26. Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning

27. NTIRE 2023 Quality Assessment of Video Enhancement Challenge

28. AI-assisted Improved Service Provisioning for Low-latency XR over 5G NR

29. CSSL-RHA: Contrastive Self-Supervised Learning for Robust Handwriting Authentication

30. Neural Video Recovery for Cloud Gaming

31. Semantic Communications System with Model Division Multiple Access and Controllable Coding Rate for Point Cloud

32. Just noticeable difference-aware per-scene bitrate-laddering for adaptive video streaming

33. Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback

34. Predictive Coding For Animation-Based Video Compression

35. Emotion-Guided Music Accompaniment Generation Based on Variational Autoencoder

36. Physical-aware Cross-modal Adversarial Network for Wearable Sensor-based Human Action Recognition

37. Anableps: Adapting Bitrate for Real-Time Communication Using VBR-encoded Video

38. Towards Robust SDRTV-to-HDRTV via Dual Inverse Degradation Network

39. MultiVENT: Multilingual Videos of Events with Aligned Natural Text

40. DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

41. Transcribing Educational Videos Using Whisper: A preliminary study on using AI for transcribing educational videos

42. musif: a Python package for symbolic music feature extraction

43. Conformer LLMs -- Convolution Augmented Large Language Models

44. StyleStegan: Leak-free Style Transfer Based on Feature Steganography

45. INDCOR White Paper 0: Interactive Digital Narratives (IDNs) -- A Solution to the Challenge of Representing Complex Issues

46. SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

47. Deep Equilibrium Multimodal Fusion

48. Envisioning a Next Generation Extended Reality Conferencing System with Efficient Photorealistic Human Rendering

49. $\mathbf{C}^2$Former: Calibrated and Complementary Transformer for RGB-Infrared Object Detection

50. Learning to Pan-sharpening with Memories of Spatial Details
