111 results for "Gordon Wetzstein"
Search Results
2. Larger visual changes compress time: The inverted effect of asemantic visual features on interval time perception
- Author
-
Sandra Malpica, Belen Masia, Laura Herman, Gordon Wetzstein, David M. Eagleman, Diego Gutierrez, Zoya Bylinskii, and Qi Sun
- Subjects
Medicine, Science - Abstract
Time perception is fluid and affected by manipulations to visual inputs. Previous literature shows that changes to low-level visual properties alter time judgments at the millisecond level. At longer intervals, in the span of seconds and minutes, high-level cognitive effects (e.g., emotions, memories) elicited by visual inputs affect time perception, but these effects are confounded with semantic information in these inputs, and are therefore challenging to measure and control. In this work, we investigate the effect of asemantic visual properties (pure visual features devoid of emotional or semantic value) on interval time perception. Our experiments were conducted with binary and production tasks in both conventional and head-mounted displays, testing the effects of four different visual features (spatial luminance contrast, temporal frequency, field of view, and visual complexity). Our results reveal a consistent pattern: larger visual changes all shorten perceived time in intervals of up to 3 min, remarkably contrary to their effect on millisecond-level perception. Our findings may help alter participants’ time perception, which can have broad real-world implications.
- Published
- 2022
3. Off-Axis Layered Displays: Hybrid Direct-View/Near-Eye Mixed Reality with Focus Cues
- Author
-
Christoph Ebner, Peter Mohr, Tobias Langlotz, Yifan Peng, Dieter Schmalstieg, Gordon Wetzstein, and Denis Kalkofen
- Subjects
Signal Processing, Computer Vision and Pattern Recognition, Computer Graphics and Computer-Aided Design, Software - Published
- 2023
- Full Text
- View/download PDF
4. Acorn
- Author
-
Gordon Wetzstein, Marco Monteiro, Julien N. P. Martel, David B. Lindell, Eric R. Chan, and Connor Z. Lin
- Subjects
Computer Science - Machine Learning (cs.LG), Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer Science - Graphics (cs.GR), Computer science, Computer vision, Rendering (computer graphics), Octree, Quadtree, Polygon mesh, Network architecture, Computer Graphics and Computer-Aided Design, Artificial intelligence, Geometric modeling, Encoder - Abstract
Neural representations have emerged as a new paradigm for applications in rendering, imaging, geometric modeling, and simulation. Compared to traditional representations such as meshes, point clouds, or volumes, they can be flexibly incorporated into differentiable learning-based pipelines. While recent improvements to neural representations now make it possible to represent signals with fine details at moderate resolutions (e.g., for images and 3D shapes), adequately representing large-scale or complex scenes has proven a challenge. Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference based on the local complexity of a signal of interest. Our approach uses a multiscale block-coordinate decomposition, similar to a quadtree or octree, that is optimized during training. The network architecture operates in two stages: using the bulk of the network parameters, a coordinate encoder generates a feature grid in a single forward pass. Then, hundreds or thousands of samples within each block can be efficiently evaluated using a lightweight feature decoder. With this hybrid implicit-explicit network architecture, we demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in scale of over 1000x compared to the resolution of previously demonstrated image-fitting experiments. Moreover, our approach is able to represent 3D shapes significantly faster and better than previous techniques; it reduces training times from days to hours or minutes and memory requirements by over an order of magnitude. (J. N. P. Martel and D. B. Lindell equally contributed to this work.)
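To make the two-stage design concrete, here is a minimal PyTorch sketch of a block-wise coordinate encoder feeding a lightweight sample decoder. The layer sizes, fixed block layout, and feature-grid resolution are illustrative assumptions; the sketch omits the adaptive quadtree/octree optimization described in the abstract and is not the authors' implementation.

```python
# Minimal sketch of a hybrid implicit-explicit, two-stage coordinate network (assumed shapes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlockEncoder(nn.Module):
    """Maps a block's global (center) coordinate to a small explicit feature grid."""
    def __init__(self, in_dim=2, feat_ch=16, grid_res=8, hidden=256):
        super().__init__()
        self.feat_ch, self.grid_res = feat_ch, grid_res
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_ch * grid_res * grid_res),
        )

    def forward(self, block_coords):              # (B, 2) block centers in [-1, 1]
        feats = self.net(block_coords)
        return feats.view(-1, self.feat_ch, self.grid_res, self.grid_res)

class SampleDecoder(nn.Module):
    """Lightweight decoder: interpolates the feature grid at many local coordinates."""
    def __init__(self, feat_ch=16, out_dim=3):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_ch, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, grid, local_xy):            # grid: (B, C, H, W), local_xy: (B, N, 2)
        samples = F.grid_sample(grid, local_xy.unsqueeze(2), align_corners=True)  # (B, C, N, 1)
        samples = samples.squeeze(-1).permute(0, 2, 1)                            # (B, N, C)
        return self.head(samples)

encoder, decoder = BlockEncoder(), SampleDecoder()
block_centers = torch.rand(4, 2) * 2 - 1          # 4 blocks
local_samples = torch.rand(4, 1024, 2) * 2 - 1    # 1024 query points per block
rgb = decoder(encoder(block_centers), local_samples)  # one encoder pass, many cheap decodes
print(rgb.shape)                                  # torch.Size([4, 1024, 3])
```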
- Published
- 2021
- Full Text
- View/download PDF
5. Neural light field 3D printing
- Author
-
Quan Zheng, Gordon Wetzstein, Gurprit Singh, Matthias Zwicker, Hans-Peter Seidel, and Vahid Babaei
- Subjects
Computer science, Pipeline (computing), 3D printing, Volumetric display, Grid, Computer Graphics and Computer-Aided Design, Planar, Computer engineering, Representation (mathematics), Light field - Abstract
Modern 3D printers are capable of printing large-size light-field displays at high resolutions. However, optimizing such displays in full 3D volume for a given light-field imagery is still a challenging task. Existing light field displays optimize over relatively small resolutions using a few co-planar layers in a 2.5D fashion to keep the problem tractable. In this paper, we propose a novel end-to-end optimization approach that encodes input light field imagery as a continuous-space implicit representation in a neural network. This allows fabricating high-resolution, attenuation-based volumetric displays that exhibit the target light fields. In addition, we incorporate the physical constraints of the material into the optimization such that the result can be printed in practice. Our simulation experiments demonstrate that our approach brings significant visual quality improvement compared to the multilayer and uniform grid-based approaches. We validate our simulations with fabricated prototypes and demonstrate that our pipeline is flexible enough to allow fabrication of both planar and non-planar displays.
- Published
- 2020
- Full Text
- View/download PDF
6. Neural Sensors: Learning Pixel Exposures for HDR Imaging and Video Compressive Sensing With Programmable Sensors
- Author
-
Gordon Wetzstein, Lorenz K. Muller, Stephen J. Carey, Julien N. P. Martel, and Piotr Dudek
- Subjects
High-speed imaging, Computer science, End-to-end optimization, Vision chip, Image processing, Computational photography, Optical imaging, Artificial intelligence, High-dynamic-range imaging, Shutter, Computer vision, Image sensor, Programmable sensors, Pixel, Applied Mathematics, Rolling shutter, Video compressive sensing, Deep neural networks, Computational Theory and Mathematics, Computer Vision and Pattern Recognition, Software - Abstract
Camera sensors rely on global or rolling shutter functions to expose an image. This fixed-function approach severely limits the sensors' ability to capture high-dynamic-range (HDR) scenes and resolve high-speed dynamics. Spatially varying pixel exposures have been introduced as a powerful computational photography approach to optically encode irradiance on a sensor and computationally recover additional information of a scene, but existing approaches rely on heuristic coding schemes and bulky spatial light modulators to optically implement these exposure functions. Here, we introduce neural sensors as a methodology to optimize per-pixel shutter functions jointly with a differentiable image processing method, such as a neural network, in an end-to-end fashion. Moreover, we demonstrate how to leverage emerging programmable and re-configurable sensor-processors to implement the optimized exposure functions directly on the sensor. Our system takes specific limitations of the sensor into account to optimize physically feasible optical codes, and we evaluate its performance for snapshot HDR and high-speed compressive imaging both in simulation and experimentally with real scenes.
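A toy end-to-end sketch of the idea follows, assuming a simplified sensor simulation (a relaxed per-pixel, per-subframe exposure code multiplied into a video stack) and a small decoder CNN; it does not model the constraints of an actual programmable sensor-processor or the paper's training setup.

```python
# Toy joint optimization of per-pixel exposure codes and a reconstruction network (assumed shapes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CodedExposureSensor(nn.Module):
    def __init__(self, frames=8, height=64, width=64):
        super().__init__()
        # Logits for a per-pixel, per-subframe shutter pattern, relaxed to [0, 1].
        self.code_logits = nn.Parameter(torch.randn(frames, height, width))

    def forward(self, video):                           # video: (B, T, H, W) irradiance
        code = torch.sigmoid(self.code_logits)          # differentiable surrogate for on/off shutter
        return (video * code).sum(dim=1, keepdim=True)  # single coded exposure (B, 1, H, W)

decoder = nn.Sequential(                                # small CNN recovering the T subframes
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 8, 3, padding=1),
)
sensor = CodedExposureSensor()
opt = torch.optim.Adam(list(sensor.parameters()) + list(decoder.parameters()), lr=1e-3)

video = torch.rand(2, 8, 64, 64)                        # stand-in for high-speed training clips
for _ in range(5):                                      # a few illustrative training steps
    recon = decoder(sensor(video))
    loss = F.mse_loss(recon, video)
    opt.zero_grad(); loss.backward(); opt.step()
```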
- Published
- 2020
- Full Text
- View/download PDF
7. Learning Spatially Varying Pixel Exposures for Motion Deblurring
- Author
-
Cindy M. Nguyen, Julien N. P. Martel, and Gordon Wetzstein
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Electrical Engineering and Systems Science - Image and Video Processing (eess.IV) - Abstract
Computationally removing the motion blur introduced by camera shake or object motion in a captured image remains a challenging task in computational photography. Deblurring methods are often limited by the fixed global exposure time of the image capture process. The post-processing algorithm either must deblur a longer exposure that contains relatively little noise or denoise a short exposure that intentionally removes the opportunity for blur at the cost of increased noise. We present a novel approach that leverages spatially varying pixel exposures for motion deblurring using next-generation focal-plane sensor-processors, along with an end-to-end design of these exposures and a machine-learning-based motion-deblurring framework. We demonstrate in simulation and with a physical prototype that learned spatially varying pixel exposures (L-SVPE) can successfully deblur scenes while recovering high-frequency detail. Our work illustrates the promising role that focal-plane sensor-processors can play in the future of computational imaging. (Project page with code: https://ccnguyen.github.io/lsvpe/)
- Published
- 2022
8. Focus issue introduction: 3D image acquisition and display: technology, perception and applications
- Author
-
Bahram Javidi, Hong Hua, Adrian Stern, Manuel Martinez, Osamu Matoba, and Gordon Wetzstein
- Subjects
Optics, Technological innovations, Atomic and Molecular Physics, and Optics - Abstract
This Feature Issue of Optics Express is organized in conjunction with the 2021 Optica (OSA) conference on 3D Image Acquisition and Display: Technology, Perception and Applications, which was held virtually from 19 to 23 July 2021 as part of the Imaging and Sensing Congress 2021. This Feature Issue presents 29 articles which cover the topics and scope of the 2021 3D conference. This Introduction provides a summary of these articles.
- Published
- 2022
9. Video See-Through Mixed Reality with Focus Cues
- Author
-
Christoph Ebner, Shohei Mori, Peter Mohr, Yifan Peng, Dieter Schmalstieg, Gordon Wetzstein, and Denis Kalkofen
- Subjects
Augmented Reality, Signal Processing, Computer Graphics, Computer Vision and Pattern Recognition, Cues, Computer Graphics and Computer-Aided Design, Software - Abstract
This work introduces the first approach to video see-through mixed reality with full support for focus cues. By combining the flexibility to adjust the focus distance found in varifocal designs with the robustness to eye-tracking error found in multifocal designs, our novel display architecture reliably delivers focus cues over a large workspace. In particular, we introduce gaze-contingent layered displays and mixed reality focal stacks, an efficient representation of mixed reality content that lends itself to fast processing for driving layered displays in real time. We thoroughly evaluate this approach by building a complete end-to-end pipeline for capture, render, and display of focus cues in video see-through displays that uses only off-the-shelf hardware and compute components.
- Published
- 2022
10. Computational optical sensing and imaging 2021: feature issue introduction
- Author
-
Jun Ke, Tatiana Alieva, Figen S. Oktem, Paulo E. X. Silveira, Gordon Wetzstein, and Florian Willomitzer
- Subjects
Optics, Atomic and Molecular Physics, and Optics - Abstract
This Feature Issue includes 2 reviews and 34 research articles that highlight recent works in the field of Computational Optical Sensing and Imaging. Many of the works were presented at the 2021 OSA Topical Meeting on Computational Optical Sensing and Imaging, held virtually from July 19 to July 23, 2021. Articles in the feature issue cover a broad scope of computational imaging topics, such as microscopy, 3D imaging, phase retrieval, non-line-of-sight imaging, imaging through scattering media, ghost imaging, compressed sensing, and applications with new types of sensors. Deep learning approaches for computational imaging and sensing are also a focus of this feature issue.
- Published
- 2022
11. ScanGAN360: a generative model of realistic scanpaths for 360 images
- Author
-
Daniel Martin, Ana Serrano, Alexander W. Bergman, Gordon Wetzstein, and Belen Masia
- Subjects
Signal Processing, Computer Graphics, Humans, Computer Simulation, Computer Vision and Pattern Recognition, Computer Graphics and Computer-Aided Design, Software - Abstract
Understanding and modeling the dynamics of human gaze behavior in 360° environments is crucial for creating, improving, and developing emerging virtual reality applications. However, recruiting human observers and acquiring enough data to analyze their behavior when exploring virtual environments requires complex hardware and software setups, and can be time-consuming. Being able to generate virtual observers can help overcome this limitation, and thus stands as an open problem in this medium. Particularly, generative adversarial approaches could alleviate this challenge by generating a large number of scanpaths that reproduce human behavior when observing new scenes, essentially mimicking virtual observers. However, existing methods for scanpath generation do not adequately predict realistic scanpaths for 360° images. We present ScanGAN360, a new generative adversarial approach to address this problem. We propose a novel loss function based on dynamic time warping and tailor our network to the specifics of 360° images. The quality of our generated scanpaths outperforms competing approaches by a large margin, and is almost on par with the human baseline. ScanGAN360 allows fast simulation of large numbers of virtual observers, whose behavior mimics real users, enabling a better understanding of gaze behavior, facilitating experimentation, and aiding novel applications in virtual reality and beyond.
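For reference, classic dynamic time warping (DTW) between two scanpaths can be computed as below; the paper's training loss uses a differentiable (soft) DTW variant suitable for GAN training, so this plain dynamic-programming version, with synthetic gaze samples, is only meant to illustrate the kind of sequence alignment involved.

```python
# Classic DTW distance between two 2D gaze sequences (illustrative, not the paper's loss).
import numpy as np

def dtw_distance(path_a, path_b):
    """path_a: (N, 2), path_b: (M, 2) gaze samples (e.g., longitude/latitude pairs)."""
    n, m = len(path_a), len(path_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(path_a[i - 1] - path_b[j - 1])   # local matching cost
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

rng = np.random.default_rng(0)
print(dtw_distance(rng.random((30, 2)), rng.random((40, 2))))
```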
- Published
- 2022
12. CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images
- Author
-
Axel Levy, Frédéric Poitevin, Julien Martel, Youssef Nashed, Ariana Peck, Nina Miolane, Daniel Ratner, Mike Dunne, and Gordon Wetzstein
- Subjects
Computer Science - Machine Learning (cs.LG), Computer Science - Computer Vision and Pattern Recognition (cs.CV), Quantitative Biology - Biomolecules (q-bio.BM) - Abstract
Cryo-electron microscopy (cryo-EM) has become a tool of fundamental importance in structural biology, helping us understand the basic building blocks of life. The algorithmic challenge of cryo-EM is to jointly estimate the unknown 3D poses and the 3D electron scattering potential of a biomolecule from millions of extremely noisy 2D images. Existing reconstruction algorithms, however, cannot easily keep pace with the rapidly growing size of cryo-EM datasets due to their high computational and memory cost. We introduce cryoAI, an ab initio reconstruction algorithm for homogeneous conformations that uses direct gradient-based optimization of particle poses and the electron scattering potential from single-particle cryo-EM data. CryoAI combines a learned encoder that predicts the poses of each particle image with a physics-based decoder to aggregate each particle image into an implicit representation of the scattering potential volume. This volume is stored in the Fourier domain for computational efficiency and leverages a modern coordinate network architecture for memory efficiency. Combined with a symmetrized loss function, this framework achieves results of a quality on par with state-of-the-art cryo-EM solvers for both simulated and experimental data, one order of magnitude faster for large datasets and with significantly lower memory requirements than existing methods. (Project page: https://www.computationalimaging.org/publications/cryoai/)
- Published
- 2022
- Full Text
- View/download PDF
13. Speckle-free holography with partially coherent light sources and camera-in-the-loop calibration
- Author
-
Yifan Peng, Suyeon Choi, Gordon Wetzstein, and Jonghyun Kim
- Subjects
Physics, Multidisciplinary, Holography, Optics, Speckle pattern, Calibration, Physical and Materials Sciences, Augmented reality, Research Article - Abstract
A holographic display combines artificial intelligence with partially coherent light sources to reduce speckle. Computer-generated holography (CGH) holds transformative potential for a wide range of applications, including direct-view, virtual and augmented reality, and automotive display systems. While research on holographic displays has recently made impressive progress, image quality and eye safety of holographic displays are fundamentally limited by the speckle introduced by coherent light sources. Here, we develop an approach to CGH using partially coherent sources. For this purpose, we devise a wave propagation model for partially coherent light that is demonstrated in conjunction with a camera-in-the-loop calibration strategy. We evaluate this algorithm using light-emitting diodes (LEDs) and superluminescent LEDs (SLEDs) and demonstrate improved speckle characteristics of the resulting holograms compared with coherent lasers. SLEDs in particular are demonstrated to be promising light sources for holographic display applications, because of their potential to generate sharp and high-contrast two-dimensional (2D) and 3D images that are bright, eye safe, and almost free of speckle.
- Published
- 2021
- Full Text
- View/download PDF
14. Holographic near-eye displays based on overlap-add stereograms
- Author
-
Gordon Wetzstein, Yifan Peng, and Nitish Padmanaban
- Subjects
Image quality, Computer science, Short-time Fourier transform, Holography, Computer Graphics and Computer-Aided Design, Angular resolution, Computer vision, Artificial intelligence, Image resolution, Light field - Abstract
Holographic near-eye displays are a key enabling technology for virtual and augmented reality (VR/AR) applications. Holographic stereograms (HS) are a method of encoding a light field into a hologram, which enables them to natively support view-dependent lighting effects. However, existing HS algorithms require the choice of a hogel size, forcing a tradeoff between spatial and angular resolution. Based on the fact that the short-time Fourier transform (STFT) connects a hologram to its observable light field, we develop the overlap-add stereogram (OLAS) as the correct method of "inverting" the light field into a hologram via the STFT. The OLAS makes more efficient use of the information contained within the light field than previous HS algorithms, exhibiting better image quality at a range of distances and hogel sizes. Most remarkably, the OLAS does not degrade spatial resolution with increasing hogel size, overcoming the spatio-angular resolution tradeoff that previous HS algorithms face. Importantly, the optimal hogel size of previous methods typically varies with the depth of every object in a scene, making the OLAS not only a hogel size-invariant method, but also nearly scene independent. We demonstrate the performance of the OLAS both in simulation and on a prototype near-eye display system, showing focusing capabilities and view-dependent effects.
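A schematic 1D overlap-add synthesis, assuming illustrative array sizes and a Hann window, shows how windowed inverse FFTs of local angular samples can be accumulated with overlap into a hologram field; this sketch omits the physical sampling, scaling, and 2D geometry of an actual holographic display and is not the authors' OLAS implementation.

```python
# Schematic overlap-add accumulation of local wavefronts into a 1D hologram (toy sizes).
import numpy as np

num_positions, num_angles, hop = 128, 32, 4      # light-field sampling and overlap stride
light_field = np.random.rand(num_positions, num_angles) * np.exp(
    1j * 2 * np.pi * np.random.rand(num_positions, num_angles))

window = np.hanning(num_angles)
hologram = np.zeros(num_positions * hop + num_angles, dtype=complex)

for i in range(num_positions):
    local_field = np.fft.ifft(light_field[i])     # local wavefront from angular samples
    start = i * hop
    hologram[start:start + num_angles] += window * local_field  # overlap-add

print(hologram.shape)
```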
- Published
- 2019
- Full Text
- View/download PDF
15. Learned large field-of-view imaging with thin-plate optics
- Author
-
Gordon Wetzstein, Felix Heide, Wolfgang Heidrich, Yifan Peng, Xiong Dun, and Qilin Sun
- Subjects
Pixel, Computer science, Image quality, Field of view, Computer Graphics and Computer-Aided Design, Lens (optics), Optics, Pinhole (optics) - Abstract
Typical camera optics consist of a system of individual elements that are designed to compensate for the aberrations of a single lens. Recent computational cameras shift some of this correction task from the optics to post-capture processing, reducing the imaging optics to only a few optical elements. However, these systems only achieve reasonable image quality by limiting the field of view (FOV) to a few degrees - effectively ignoring severe off-axis aberrations with blur sizes of multiple hundred pixels. In this paper, we propose a lens design and learned reconstruction architecture that lift this limitation and provide an order of magnitude increase in field of view using only a single thin-plate lens element. Specifically, we design a lens to produce spatially shift-invariant point spread functions, over the full FOV, that are tailored to the proposed reconstruction architecture. We achieve this with a mixture PSF, consisting of a peak and a low-pass component, which provides residual contrast instead of a small spot size as in traditional lens designs. To perform the reconstruction, we train a deep network on captured data from a display lab setup, eliminating the need for manual acquisition of training data in the field. We assess the proposed method in simulation and experimentally with a prototype camera system. We compare our system against existing single-element designs, including an aspherical lens and a pinhole, and we compare against a complex multielement lens, validating high-quality large field-of-view (i.e., 53°) imaging performance using only a single thin-plate element.
- Published
- 2019
- Full Text
- View/download PDF
16. Non-line-of-sight Imaging with Partial Occluders and Surface Normals
- Author
-
Steven Diamond, Kai Zang, Matthew O'Toole, Gordon Wetzstein, Felix Heide, and David B. Lindell
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer science, Computer Graphics and Computer-Aided Design, Reflectivity, Non-line-of-sight propagation, Computational photography, Computer vision, Artificial intelligence, Search and rescue - Abstract
Imaging objects obscured by occluders is a significant challenge for many applications. A camera that could “see around corners” could help improve navigation and mapping capabilities of autonomous vehicles or make search and rescue missions more effective. Time-resolved single-photon imaging systems have recently been demonstrated to record optical information of a scene that can lead to an estimation of the shape and reflectance of objects hidden from the line of sight of a camera. However, existing non-line-of-sight (NLOS) reconstruction algorithms have been constrained in the types of light transport effects they model for the hidden scene parts. We introduce a factored NLOS light transport representation that accounts for partial occlusions and surface normals. Based on this model, we develop a factorization approach for inverse time-resolved light transport and demonstrate high-fidelity NLOS reconstructions for challenging scenes both in simulation and with an experimental NLOS imaging system.
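Schematically (not the paper's exact factored formulation, whose normalization and cosine terms differ in detail), a time-resolved NLOS measurement for a laser position $x_l$ and sensor position $x_s$ on the visible wall can be written as

$$\tau(x_l, x_s, t) \;\approx\; \int_\Omega \frac{\rho(x)\, V(x_l, x)\, V(x, x_s)\, \langle n(x), \omega_l(x)\rangle}{\lVert x_l - x\rVert^2\, \lVert x_s - x\rVert^2}\; \delta\!\big(\lVert x_l - x\rVert + \lVert x_s - x\rVert - c\,t\big)\, \mathrm{d}x,$$

where $\rho$ is the hidden-surface albedo, $V$ is a binary visibility term accounting for partial occlusions, $n(x)$ is the surface normal, $\omega_l(x)$ is the direction from $x$ toward $x_l$, and $c$ is the speed of light.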
- Published
- 2019
- Full Text
- View/download PDF
17. Advances in neural rendering
- Author
-
Stephen Lombardi, M. Guo, Ayush Tewari, Sergio Orts-Escolano, Tomas Simon, Christian Theobalt, Lingjie Liu, Sean Fanello, Matthias Nießner, Gordon Wetzstein, Jun-Yan Zhu, Pratul P. Srinivasan, Maneesh Agrawala, Edgar Tretschk, Vincent Sitzmann, Zexiang Xu, Michael Zollhöfer, Ohad Fried, Justus Thies, Ben Mildenhall, Dan B. Goldman, and Rohit Pandey
- Subjects
Computer science, Computer graphics (images), Rendering (computer graphics) - Published
- 2021
- Full Text
- View/download PDF
18. Deep S3PR: Simultaneous Source Separation and Phase Retrieval Using Deep Generative Models
- Author
-
Gordon Wetzstein and Christopher A. Metzler
- Subjects
Generative model, Computer science, Phase (waves), Source separation, Wireless, Phase retrieval, Algorithm - Abstract
This paper introduces and solves the simultaneous source separation and phase retrieval (S3PR) problem. S3PR is an important but largely unsolved problem in a number of application domains, including microscopy, wireless communication, and imaging through scattering media, where one has multiple independent coherent sources whose phase is difficult to measure. In general, S3PR is highly under-determined, non-convex, and difficult to solve. In this work, we demonstrate that by restricting the solutions to lie in the range of a deep generative model, we can constrain the search space sufficiently to solve S3PR. Code associated with this work is available at https://github.com/computational-imaging/DeepS3PR. An extended version of this work is available at https://arxiv.org/abs/2002.05856.
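The core idea of constraining solutions to a generator's range can be illustrated with a single-source toy phase-retrieval problem: recover a signal from Fourier magnitudes by optimizing a latent code. The untrained generator, signal sizes, and measurement operator below are placeholder assumptions; the actual S3PR setting involves multiple sources, a trained generative model, and a different measurement model.

```python
# Toy generative-prior phase retrieval: search only over the generator's latent space.
import torch
import torch.nn as nn

torch.manual_seed(0)
generator = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 256))  # z -> signal

z_true = torch.randn(16)
x_true = generator(z_true).detach()
y = torch.fft.fft(x_true).abs()                   # magnitude-only (phaseless) measurements

z = torch.zeros(16, requires_grad=True)           # unknown latent code to optimize
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(500):
    loss = ((torch.fft.fft(generator(z)).abs() - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```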
- Published
- 2021
- Full Text
- View/download PDF
19. AutoInt: Automatic Integration for Fast Neural Volume Rendering
- Author
-
David B. Lindell, Julien N. P. Martel, and Gordon Wetzstein
- Subjects
Computer Science - Machine Learning (cs.LG), Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer Science - Graphics (cs.GR), Artificial neural network, Computer science, Image quality, Volume rendering, Antiderivative, Rendering (computer graphics), View synthesis, Computer engineering, Fundamental theorem of calculus, Artificial intelligence - Abstract
Numerical integration is a foundational technique in scientific computing and is at the core of many computer vision applications. Among these applications, neural volume rendering has recently been proposed as a new paradigm for view synthesis, achieving photorealistic image quality. However, a fundamental obstacle to making these methods practical is the extreme computational and memory requirements caused by the required volume integrations along the rendered rays during training and inference. Millions of rays, each requiring hundreds of forward passes through a neural network, are needed to approximate those integrations with Monte Carlo sampling. Here, we propose automatic integration, a new framework for learning efficient, closed-form solutions to integrals using coordinate-based neural networks. For training, we instantiate the computational graph corresponding to the derivative of the coordinate-based network. The graph is fitted to the signal to be integrated. After optimization, we reassemble the graph to obtain a network that represents the antiderivative. By the fundamental theorem of calculus, this enables the calculation of any definite integral in two evaluations of the network. Applying this approach to neural rendering, we improve the tradeoff between rendering speed and image quality, improving render times by greater than 10× at the cost of reduced image quality.
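A minimal 1D illustration of the automatic-integration idea: fit the derivative of a coordinate network to an integrand, then evaluate the network itself as the antiderivative. The toy integrand, network size, and training schedule are assumptions, and the explicit "grad network" reassembly from the paper is replaced here by autograd.

```python
# Fit d(net)/dx to f(x); then net(b) - net(a) approximates the definite integral of f.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def integrand(x):                         # example signal to integrate: f(x) = cos(x)
    return torch.cos(x)

for _ in range(2000):
    x = torch.rand(256, 1) * 6.28318
    x.requires_grad_(True)
    y = net(x)
    dydx = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)[0]
    loss = ((dydx - integrand(x)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

a, b = torch.tensor([[0.0]]), torch.tensor([[3.14159]])
with torch.no_grad():
    definite_integral = net(b) - net(a)   # by the fundamental theorem of calculus
print(definite_integral.item())           # should approach sin(pi) - sin(0) = 0
```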
- Published
- 2021
- Full Text
- View/download PDF
20. Neural Lumigraph Rendering
- Author
-
Ryan Spicer, Kari Pulli, Andrew Jones, Petr Kellnhofer, Lars C. Jebe, and Gordon Wetzstein
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer Science - Graphics (cs.GR), Computer science, Image quality, Volume rendering, Facial recognition system, Graphics pipeline, Rendering (computer graphics), View synthesis, Computer vision, Artificial intelligence, Graphics - Abstract
Novel view synthesis is a challenging and ill-posed inverse rendering problem. Neural rendering techniques have recently achieved photorealistic image quality for this task. State-of-the-art (SOTA) neural volume rendering approaches, however, are slow to train and require minutes of inference (i.e., rendering) time for high image resolutions. We adopt high-capacity neural scene representations with periodic activations for jointly optimizing an implicit surface and a radiance field of a scene supervised exclusively with posed 2D images. Our neural rendering pipeline accelerates SOTA neural volume rendering by about two orders of magnitude and our implicit surface representation is unique in allowing us to export a mesh with view-dependent texture information. Thus, like other implicit surface representations, ours is compatible with traditional graphics pipelines, enabling real-time rendering rates, while achieving unprecedented image quality compared to other surface methods. We assess the quality of our approach using existing datasets as well as high-quality 3D face data captured with a custom multi-camera rig. (Project website: http://www.computationalimaging.org/publications/nlr/)
- Published
- 2021
- Full Text
- View/download PDF
21. Event-Based Near-Eye Gaze Tracking Beyond 10,000 Hz
- Author
-
Gordon Wetzstein, Julien N. P. Martel, Jörg Conradt, Amit P. S. Kohli, and Anastasios Angelopoulos
- Subjects
Event (computing), Computer science, Tracking system, Computer Graphics and Computer-Aided Design, Gaze, Rendering (computer graphics), Signal Processing, Parametric model, Eye tracking, Computer vision, Augmented reality, Computer Vision and Pattern Recognition, Artificial intelligence, Microsaccade, Software - Abstract
The cameras in modern gaze-tracking systems suffer from fundamental bandwidth and power limitations, realistically constraining data acquisition speed to 300 Hz. This obstructs the use of mobile eye trackers to perform, e.g., low-latency predictive rendering, or to study quick and subtle eye motions like microsaccades using head-mounted devices in the wild. Here, we propose a hybrid frame-event-based near-eye gaze tracking system offering update rates beyond 10,000 Hz with an accuracy that matches that of high-end desktop-mounted commercial trackers when evaluated in the same conditions. Our system builds on emerging event cameras that simultaneously acquire regularly sampled frames and adaptively sampled events. We develop an online 2D pupil fitting method that updates a parametric model every one or few events. Moreover, we propose a polynomial regressor for estimating the point of gaze from the parametric pupil model in real time. Using the first event-based gaze dataset, we demonstrate that our system achieves accuracies of 0.45°–1.75° for fields of view from 45° to 98°. With this technology, we hope to enable a new generation of ultra-low-latency gaze-contingent rendering and display techniques for virtual and augmented reality.
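An illustrative polynomial regression from fitted pupil parameters to a 2D point of gaze, in the spirit of the regressor described above; the feature choice and the synthetic calibration data are assumptions for the sketch, not the authors' calibration procedure.

```python
# Fit a second-order polynomial map from pupil-ellipse centers to gaze targets (toy data).
import numpy as np

rng = np.random.default_rng(1)
pupil = rng.random((200, 2))                                     # pupil centers from the 2D fit
gaze_true = 2.0 * pupil + 0.05 * rng.standard_normal((200, 2))   # synthetic calibration targets

def poly_features(p):                                            # second-order polynomial features
    x, y = p[:, 0], p[:, 1]
    return np.stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2], axis=1)

coeffs, *_ = np.linalg.lstsq(poly_features(pupil), gaze_true, rcond=None)  # fit once at calibration
gaze_pred = poly_features(pupil) @ coeffs                        # cheap per-update gaze estimate
print(np.abs(gaze_pred - gaze_true).mean())
```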
- Published
- 2021
22. Larger visual changes compress time: The inverted effect of asemantic visual features on interval time perception
- Author
-
Sandra Malpica, Belen Masia, Laura Herman, Gordon Wetzstein, David M. Eagleman, Diego Gutierrez, Zoya Bylinskii, and Qi Sun
- Subjects
Judgment, Multidisciplinary, Time Perception, Visual Perception, Humans, Orientation, Spatial, Vision, Ocular, Time - Abstract
Time perception is fluid and affected by manipulations to visual inputs. Previous literature shows that changes to low-level visual properties alter time judgments at the millisecond level. At longer intervals, in the span of seconds and minutes, high-level cognitive effects (e.g., emotions, memories) elicited by visual inputs affect time perception, but these effects are confounded with semantic information in these inputs, and are therefore challenging to measure and control. In this work, we investigate the effect of asemantic visual properties (pure visual features devoid of emotional or semantic value) on interval time perception. Our experiments were conducted with binary and production tasks in both conventional and head-mounted displays, testing the effects of four different visual features (spatial luminance contrast, temporal frequency, field of view, and visual complexity). Our results reveal a consistent pattern: larger visual changes all shorten perceived time in intervals of up to 3 min, remarkably contrary to their effect on millisecond-level perception. Our findings may help alter participants’ time perception, which can have broad real-world implications.
- Published
- 2021
23. A Human-centric Approach to Near-eye Display Engineering
- Author
-
Gordon Wetzstein
- Subjects
Human–computer interaction, Computer science, Human centric, Near eye display - Published
- 2021
- Full Text
- View/download PDF
24. A Perceptual Model for Eccentricity-dependent Spatio-temporal Flicker Fusion and its Applications to Foveated Graphics
- Author
-
Brooke Krajancich, Gordon Wetzstein, and Petr Kellnhofer
- Subjects
FOS: Computer and information sciences ,Computer science ,Image quality ,Computer Science - Human-Computer Interaction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Flicker fusion threshold ,02 engineering and technology ,Virtual reality ,Luminance ,01 natural sciences ,Human-Computer Interaction (cs.HC) ,010309 optics ,Computer Science - Graphics ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,FOS: Electrical engineering, electronic engineering, information engineering ,Computer vision ,Graphics ,business.industry ,Image and Video Processing (eess.IV) ,020207 software engineering ,Electrical Engineering and Systems Science - Image and Video Processing ,Computer Graphics and Computer-Aided Design ,Graphics (cs.GR) ,Human visual system model ,020201 artificial intelligence & image processing ,Augmented reality ,Spatial frequency ,Artificial intelligence ,business - Abstract
Virtual and augmented reality (VR/AR) displays strive to provide a resolution, framerate and field of view that matches the perceptual capabilities of the human visual system, all while constrained by limited compute budgets and transmission bandwidths of wearable computing systems. Foveated graphics techniques have emerged that could achieve these goals by exploiting the falloff of spatial acuity in the periphery of the visual field. However, considerably less attention has been given to temporal aspects of human vision, which also vary across the retina. This is in part due to limitations of current eccentricity-dependent models of the visual system. We introduce a new model, experimentally measuring and computationally fitting eccentricity-dependent critical flicker fusion thresholds jointly for both space and time. In this way, our model is unique in enabling the prediction of temporal information that is imperceptible for a certain spatial frequency, eccentricity, and range of luminance levels. We validate our model with an image quality user study, and use it to predict potential bandwidth savings 7X higher than those afforded by current spatial-only foveated models. As such, this work forms the enabling foundation for new temporally foveated graphics techniques.
- Published
- 2021
- Full Text
- View/download PDF
25. pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
- Author
-
Gordon Wetzstein, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, and Eric R. Chan
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer Science - Graphics (cs.GR), Network architecture, Computer science, Image quality, Volume rendering, Rendering (computer graphics), Visualization, Generative model, Artificial intelligence, Representation (mathematics) - Abstract
We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. Existing approaches however fall short in two ways: first, they may lack an underlying 3D representation or rely on view-inconsistent rendering, hence synthesizing images that are not multi-view consistent; second, they often depend upon representation network architectures that are not expressive enough, and their results thus lack in image quality. We propose a novel generative model, named Periodic Implicit Generative Adversarial Networks (π-GAN or pi-GAN), for high-quality 3D-aware image synthesis. π-GAN leverages neural representations with periodic activation functions and volumetric rendering to represent scenes as view-consistent 3D representations with fine detail. The proposed approach obtains state-of-the-art results for 3D-aware image synthesis with multiple real and synthetic datasets.
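A sine-activated ("periodic") fully connected layer of the kind pi-GAN builds on is sketched below; the FiLM frequency/phase conditioning and the volume renderer from the paper are omitted, and the frequency factor w0 = 30.0 follows common SIREN practice rather than the paper's settings.

```python
# A periodic (sine) activation layer and a tiny coordinate MLP built from it.
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_features, out_features, w0=30.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.w0 = w0

    def forward(self, x):
        return torch.sin(self.w0 * self.linear(x))

# Maps 3D points to a 4-channel output (e.g., density plus a small feature vector).
mlp = nn.Sequential(SineLayer(3, 128), SineLayer(128, 128), nn.Linear(128, 4))
print(mlp(torch.rand(1024, 3)).shape)   # torch.Size([1024, 4])
```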
- Published
- 2020
26. Semantic Implicit Neural Scene Representations With Semi-Supervised Training
- Author
-
Vincent Sitzmann, Gordon Wetzstein, and Amit P. S. Kohli
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Computer science, Point cloud, Semantics, Segmentation, Representation (mathematics), Artificial neural network, Trilinear interpolation, Pattern recognition, Image segmentation, Task analysis, Artificial intelligence - Abstract
The recent success of implicit neural scene representations has presented a viable new method for how we capture and store 3D scenes. Unlike conventional 3D representations, such as point clouds, which explicitly store scene properties in discrete, localized units, these implicit representations encode a scene in the weights of a neural network which can be queried at any coordinate to produce these same scene properties. Thus far, implicit representations have primarily been optimized to estimate only the appearance and/or 3D geometry information in a scene. We take the next step and demonstrate that an existing implicit representation (SRNs) is actually multi-modal; it can be further leveraged to perform per-point semantic segmentation while retaining its ability to represent appearance and geometry. To achieve this multi-modal behavior, we utilize a semi-supervised learning strategy atop the existing pre-trained scene representation. Our method is simple, general, and only requires a few tens of labeled 2D segmentation masks in order to achieve dense 3D semantic segmentation. We explore two novel applications for this semantically aware implicit neural scene representation: 3D novel view and semantic label synthesis given only a single input RGB image or 2D label mask, as well as 3D interpolation of appearance and semantics. (3DV 2020 camera-ready: https://www.computationalimaging.org/publications/)
- Published
- 2020
- Full Text
- View/download PDF
27. D-VDAMP: Denoising-based Approximate Message Passing for Compressive MRI
- Author
-
Gordon Wetzstein and Christopher A. Metzler
- Subjects
Electrical Engineering and Systems Science - Signal Processing (eess.SP), Electrical Engineering and Systems Science - Image and Video Processing (eess.IV), Signal processing, Computer science, Gaussian, Noise reduction, Message passing, Approximation algorithm, Iterative reconstruction, Noise, Algorithm - Abstract
Plug and play (P&P) algorithms iteratively apply highly optimized image denoisers to impose priors and solve computational image reconstruction problems, to great effect. However, in general the "effective noise", that is, the difference between the true signal and the intermediate solution, within the iterations of P&P algorithms is neither Gaussian nor white. This fact makes existing denoising algorithms suboptimal. In this work, we propose a CNN architecture for removing colored Gaussian noise and combine it with the recently proposed VDAMP algorithm, whose effective noise follows a predictable colored Gaussian distribution. We apply the resulting denoising-based VDAMP (D-VDAMP) algorithm to variable density sampled compressive MRI, where it substantially outperforms existing techniques.
- Published
- 2020
28. Single-shot Hyperspectral-Depth Imaging with Learned Diffractive Optics
- Author
-
Seung-Hwan Baek, Hayato Ikoma, Daniel S. Jeon, Yuqi Li, Wolfgang Heidrich, Gordon Wetzstein, and Min H. Kim
- Subjects
Computer Science - Computer Vision and Pattern Recognition (cs.CV), Electrical Engineering and Systems Science - Image and Video Processing (eess.IV) - Abstract
Imaging depth and spectrum have been extensively studied in isolation from each other for decades. Recently, hyperspectral-depth (HS-D) imaging emerges to capture both information simultaneously by combining two different imaging systems; one for depth, the other for spectrum. While being accurate, this combinational approach induces increased form factor, cost, capture time, and alignment/registration problems. In this work, departing from the combinational principle, we propose a compact single-shot monocular HS-D imaging method. Our method uses a diffractive optical element (DOE), the point spread function of which changes with respect to both depth and spectrum. This enables us to reconstruct spectrum and depth from a single captured image. To this end, we develop a differentiable simulator and a neural-network-based reconstruction that are jointly optimized via automatic differentiation. To facilitate learning the DOE, we present a first HS-D dataset by building a benchtop HS-D imager that acquires high-quality ground truth. We evaluate our method with synthetic and real experiments by building an experimental prototype and achieve state-of-the-art HS-D imaging results.
- Published
- 2020
29. MetaSDF: Meta-learning Signed Distance Functions
- Author
-
Vincent Sitzmann, Eric R. Chan, Richard Tucker, Noah Snavely, and Gordon Wetzstein
- Subjects
Computer Science - Machine Learning (cs.LG), Computer Science - Graphics (cs.GR), Computer Science - Computer Vision and Pattern Recognition (cs.CV) - Abstract
Neural implicit shape representations are an emerging paradigm that offers many potential benefits over conventional discrete representations, including memory efficiency at a high spatial resolution. Generalizing across shapes with such neural implicit representations amounts to learning priors over the respective function space and enables geometry reconstruction from partial or noisy observations. Existing generalization methods rely on conditioning a neural network on a low-dimensional latent code that is either regressed by an encoder or jointly optimized in the auto-decoder framework. Here, we formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task. We demonstrate that this approach performs on par with auto-decoder based approaches while being an order of magnitude faster at test-time inference. We further demonstrate that the proposed gradient-based method outperforms encoder-decoder based methods that leverage pooling-based set encoders. (Project website: https://vsitzmann.github.io/metasdf/)
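A sketch of gradient-based (MAML-style) meta-learning for fitting signed distance functions: a few inner-loop SGD steps specialize shared initial weights to one shape's SDF samples. The network size, step counts, and the synthetic "shapes" (random circles) are illustrative assumptions, not the paper's setup.

```python
# Meta-learn an initialization that adapts to a new shape's SDF in a few gradient steps.
import torch

def make_params(sizes=(2, 64, 64, 1)):
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        params.append((torch.randn(m, n) * m ** -0.5).requires_grad_(True))  # weight
        params.append(torch.zeros(n, requires_grad=True))                    # bias
    return params

def mlp(params, x):
    for i in range(0, len(params) - 2, 2):
        x = torch.relu(x @ params[i] + params[i + 1])
    return x @ params[-2] + params[-1]

def circle_sdf(points, radius):                          # ground-truth SDF of a circle
    return points.norm(dim=-1, keepdim=True) - radius

meta_params = make_params()
meta_opt = torch.optim.Adam(meta_params, lr=1e-4)
inner_lr, inner_steps = 1e-2, 3

for _ in range(200):                                     # meta-training loop
    radius = 0.3 + 0.5 * torch.rand(1)                   # one "shape" per task
    context = torch.rand(256, 2) * 2 - 1                 # support samples
    query = torch.rand(256, 2) * 2 - 1                   # target samples

    fast = meta_params
    for _ in range(inner_steps):                         # inner loop: specialize to this shape
        loss = ((mlp(fast, context) - circle_sdf(context, radius)) ** 2).mean()
        grads = torch.autograd.grad(loss, fast, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]

    meta_loss = ((mlp(fast, query) - circle_sdf(query, radius)) ** 2).mean()
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```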
- Published
- 2020
30. Non-line-of-sight Imaging
- Author
-
Gordon Wetzstein, Andreas Velten, and Daniele Faccio
- Subjects
Line-of-sight, Photon, Computer science, Detector, General Physics and Astronomy, Electrical Engineering and Systems Science - Image and Video Processing (eess.IV), Non-line-of-sight propagation, Computer vision, Artificial intelligence, Inverse method, Physics - Optics (physics.optics) - Abstract
Emerging single-photon-sensitive sensors produce picosecond-accurate time-stamped photon counts. Applying advanced inverse methods to process these data has resulted in unprecedented imaging capabilities, such as non-line-of-sight (NLOS) imaging. Rather than imaging photons that travel along direct paths from a source to an object and back to the detector, NLOS methods analyse photons that travel along indirect light paths, scattered from multiple surfaces, to estimate 3D images of scenes outside the direct line of sight of a camera, hidden by a wall or other obstacles. We review the transient imaging techniques that underlie many NLOS imaging approaches, discuss methods for reconstructing hidden scenes from time-resolved measurements, describe some other methods for NLOS imaging that do not require transient imaging and discuss the future of ‘seeing around corners’. Non-line-of-sight (NLOS) imaging methods use light scattered from multiple surfaces to reconstruct images of scenes that are hidden by another object. This Perspective summarizes existing NLOS imaging techniques and discusses which directions show most promise for future developments.
- Published
- 2020
31. Capture, Reconstruction, and Representation of the Visual Real World for Virtual Reality
- Author
-
James Tompkin, Gordon Wetzstein, Christian Richardt, Marcus Magnor, and Alexander Sorkine-Hornung
- Subjects
Computer science, Human–computer interaction, Perception, Representation (systemics), Virtual reality, Image-based modeling and rendering - Abstract
We provide an overview of the concerns, current practice, and limitations for capturing, reconstructing, and representing the real world visually within virtual reality. Given that our goals are to capture, transmit, and depict complex real-world phenomena to humans, these challenges cover the opto-electro-mechanical, computational, informational, and perceptual fields. Practically producing a system for real-world VR capture requires navigating a complex design space and pushing the state of the art in each of these areas. As such, we outline several promising directions for future work to improve the quality and flexibility of real-world VR capture systems.
- Published
- 2020
- Full Text
- View/download PDF
32. Computational Optical Sensing and Imaging 2021: introduction to the feature issue
- Author
-
Jun Ke, Tatiana Alieva, Figen S. Oktem, Paulo E. X. Silveira, Gordon Wetzstein, and Florian Willomitzer
- Subjects
Electrical and Electronic Engineering, Engineering (miscellaneous), Atomic and Molecular Physics, and Optics - Abstract
This feature issue includes two reviews and 34 research papers that highlight recent works in the field of computational optical sensing and imaging. Many of the works were presented at the 2021 Optica (formerly OSA) Topical Meeting on Computational Optical Sensing and Imaging, held virtually from 19 July to 23 July 2021. Papers in the feature issue cover a broad scope of computational imaging topics, such as microscopy, 3D imaging, phase retrieval, non-line-of-sight imaging, imaging through scattering media, ghost imaging, compressed sensing, and applications with new types of sensors. Deep learning approaches for computational imaging and sensing are also a focus of this feature issue.
- Published
- 2022
- Full Text
- View/download PDF
33. Hybrid optical-electronic convolutional neural networks with optimized diffractive optics for image classification
- Author
-
Wolfgang Heidrich, Julie Chang, Vincent Sitzmann, Gordon Wetzstein, and Xiong Dun
- Subjects
Diffraction, Multidisciplinary, Contextual image classification, Computer science, Optical computing, Convolutional neural network, Computer engineering, Optical correlator - Abstract
Convolutional neural networks (CNNs) excel in a wide variety of computer vision applications, but their high performance also comes at a high computational cost. Despite efforts to increase efficiency both algorithmically and with specialized hardware, it remains difficult to deploy CNNs in embedded systems due to tight power budgets. Here we explore a complementary strategy that incorporates a layer of optical computing prior to electronic computing, improving performance on image classification tasks while adding minimal electronic computational cost or processing time. We propose a design for an optical convolutional layer based on an optimized diffractive optical element and test our design in two simulations: a learned optical correlator and an optoelectronic two-layer CNN. We demonstrate in simulation and with an optical prototype that the classification accuracies of our optical systems rival those of the analogous electronic implementations, while providing substantial savings on computational cost.
- Published
- 2018
- Full Text
- View/download PDF
34. End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging
- Author
-
Stephen Boyd, Felix Heide, Gordon Wetzstein, Xiong Dun, Vincent Sitzmann, Wolfgang Heidrich, Steven Diamond, and Yifan Peng
- Subjects
Computer science, Image processing, Reconstruction algorithm, Computer Graphics and Computer-Aided Design, Superresolution, Achromatic lens, Computer vision, Depth of field, Artificial intelligence - Abstract
In typical cameras the optical system is designed first; once it is fixed, the parameters in the image processing algorithm are tuned to get good image reproduction. In contrast to this sequential design approach, we consider joint optimization of an optical system (for example, the physical shape of the lens) together with the parameters of the reconstruction algorithm. We build a fully-differentiable simulation model that maps the true source image to the reconstructed one. The model includes diffractive light propagation, depth and wavelength-dependent effects, noise and nonlinearities, and the image post-processing. We jointly optimize the optical parameters and the image processing algorithm parameters so as to minimize the deviation between the true and reconstructed image, over a large set of images. We implement our joint optimization method using autodifferentiation to efficiently compute parameter gradients in a stochastic optimization algorithm. We demonstrate the efficacy of this approach by applying it to achromatic extended depth of field and snapshot super-resolution imaging.
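A toy version of this joint design loop is sketched below: the "optics" is reduced to a single learnable, normalized PSF applied by convolution, and the reconstruction is a small CNN; the differentiable wave-optics simulation, noise model, and height-map parameterization used in the paper are omitted, and all sizes are assumptions.

```python
# Jointly optimize a learnable PSF (stand-in for the optics) and a reconstruction CNN.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnablePSF(nn.Module):
    def __init__(self, size=9):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(size, size))

    def forward(self, img):                                    # img: (B, 1, H, W)
        psf = torch.softmax(self.logits.flatten(), dim=0).view(1, 1, *self.logits.shape)
        return F.conv2d(img, psf, padding=self.logits.shape[0] // 2)

optics = LearnablePSF()
recon = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.Conv2d(32, 1, 3, padding=1))
opt = torch.optim.Adam(list(optics.parameters()) + list(recon.parameters()), lr=1e-3)

images = torch.rand(4, 1, 64, 64)                              # stand-in for training images
for _ in range(10):
    sensor = optics(images) + 0.01 * torch.randn_like(images)  # simulated capture with noise
    loss = F.mse_loss(recon(sensor), images)
    opt.zero_grad(); loss.backward(); opt.step()
```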
- Published
- 2018
- Full Text
- View/download PDF
35. A convex 3D deconvolution algorithm for low photon count fluorescence imaging
- Author
-
Hayato Ikoma, Takamasa Kudo, Michael Broxton, and Gordon Wetzstein
- Subjects
Hessian matrix, Multidisciplinary, Optimization problem, Computer science, Shot noise, Regularization (mathematics), Biological specimen, Noise, Calibration, Deconvolution, Algorithm - Abstract
Deconvolution is widely used to improve the contrast and clarity of a 3D focal stack collected using a fluorescence microscope. But despite being extensively studied, deconvolution algorithms can introduce reconstruction artifacts when their underlying noise models or priors are violated, such as when imaging biological specimens at extremely low light levels. In this paper we propose a deconvolution method specifically designed for 3D fluorescence imaging of biological samples in the low-light regime. Our method utilizes a mixed Poisson-Gaussian model of photon shot noise and camera read noise, which are both present in low light imaging. We formulate a convex loss function and solve the resulting optimization problem using the alternating direction method of multipliers algorithm. Among several possible regularization strategies, we show that a Hessian-based regularizer is most effective for describing locally smooth features present in biological specimens. Our algorithm also estimates noise parameters on-the-fly, thereby eliminating a manual calibration step required by most deconvolution software. We demonstrate our algorithm on simulated images and experimentally-captured images with peak intensities of tens of photoelectrons per voxel. We also demonstrate its performance for live cell imaging, showing its applicability as a tool for biological research.
- Published
- 2018
- Full Text
- View/download PDF
36. Shift-variant color-coded diffractive spectral imaging system
- Author
-
Samuel Pinilla, Hayato Ikoma, Yifan Peng, Jorge Bacca, Gordon Wetzstein, and Henry Arguello
- Subjects
Point spread function ,Image formation ,medicine.medical_specialty ,Pixel ,Computer science ,business.industry ,Spectral bands ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,Convolution ,Spectral imaging ,Imaging spectroscopy ,Optics ,medicine ,Projection (set theory) ,business - Abstract
State-of-the-art snapshot spectral imaging (SI) systems introduce color-coded apertures (CCAs) into their setups to obtain a flexible spatial-spectral modulation, allowing spectral information to be reconstructed from a set of coded measurements. Besides the CCA, other optical elements, such as lenses, prisms, or beam splitters, are usually employed, making these systems large and impractical. Recently, diffractive optical elements (DOEs) have partially replaced refractive lenses to drastically reduce the size of SI devices. The sensing model of these systems is a projection described by a spatially shift-invariant convolution between the unknown scene and a point spread function (PSF) at each spectral band. However, the height map of the DOE is the only free parameter available to shape the spectral modulation, which significantly increases the ill-posedness of the reconstruction. To overcome this challenge, our work explores the advantages of the spectral modulation of an optical setup composed of a DOE and a CCA. Specifically, the light is diffracted by the DOE and then filtered by the CCA, located close to the sensor. The proposed system is clearly shift-variant, resulting in a different PSF for each pixel; a symmetric structure constraint is imposed on the CCA to reduce the large number of resulting PSFs. Additionally, we jointly design the DOE and the CCA parameters with a fully differentiable image formation model using an end-to-end approach to minimize the deviation between the true and reconstructed image over a large set of images. Simulations show that the proposed system improves spectral reconstruction quality by up to 4 dB compared with current state-of-the-art systems. Finally, experimental results with a fabricated prototype in indoor and outdoor scenes validate the proposed system, which can recover up to 49 high-fidelity spectral bands in the 420–660 nm range.
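A hedged sketch of the kind of sensing model described here: each spectral band is blurred by its own PSF, attenuated by a per-pixel color code, and summed onto a monochrome sensor. The array shapes, random scene, and noise level below are placeholders, not the paper's calibration data.

```python
# Hedged sketch of the described sensing model: per-band diffractive blur followed
# by a per-pixel color-coded aperture (CCA) and summation onto a monochrome sensor.
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
H, W, L = 64, 64, 8                        # spatial size and number of spectral bands
scene = rng.random((H, W, L))              # hypothetical spectral scene
psfs = rng.random((15, 15, L))
psfs /= psfs.sum(axis=(0, 1), keepdims=True)
cca = rng.random((H, W, L))                # per-pixel, per-band transmission code

def sense(scene, psfs, cca):
    """y[i,j] = sum_l cca[i,j,l] * (scene[:,:,l] * psf_l)[i,j]  (plus sensor noise)."""
    y = np.zeros((H, W))
    for l in range(L):
        blurred = fftconvolve(scene[:, :, l], psfs[:, :, l], mode="same")
        y += cca[:, :, l] * blurred
    return y + 0.01 * rng.standard_normal((H, W))

measurement = sense(scene, psfs, cca)
```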
- Published
- 2021
- Full Text
- View/download PDF
37. Holographic pancake optics for thin and lightweight optical see-through augmented reality
- Author
-
Gordon Wetzstein, Peter Bosel, Yi Qin, and Ozan Cakmakci
- Subjects
Physics ,business.industry ,Orientation (computer vision) ,Holography ,Curved mirror ,Field of view ,Diffraction efficiency ,Atomic and Molecular Physics, and Optics ,law.invention ,Optics ,Reflection (mathematics) ,law ,Optical transfer function ,Focal length ,business - Abstract
Holographic pancake optics have been designed and fabricated in the eyewear display literature dating back to 1985; however, a see-through pancake optic has not been demonstrated to date. The key contribution here is the first full-color volume holographic pancake optic in an optical see-through configuration for applications in mobile augmented reality. Specifically, the full-color volume holographic pancake is combined with a flat lightguide in order to achieve the optical see-through property. The fabricated hardware has a measured field of view of 29 degrees (horizontal) by 12 degrees (vertical) and a large measured eyebox that allows ±10 mm of horizontal motion and approximately ±3 mm of vertical motion for a 4 mm diameter pupil. The measured modulation transfer function (averaged over orientation) is 10% contrast at 10 lp/deg. Three holograms were characterized with respect to their diffraction efficiency, angular bandwidth, focal length, haze, and thickness. The phase function of the reflection-mode hologram implements a spherical mirror that has a relatively simple recording geometry.
- Published
- 2021
- Full Text
- View/download PDF
38. SpinVR
- Author
-
Donald G. Dansereau, Aniq Masood, Robert Konrad, and Gordon Wetzstein
- Subjects
Cave automatic virtual environment ,Artificial reality ,Computer science ,Real-time computing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,02 engineering and technology ,Computer-mediated reality ,Virtual reality ,Metaverse ,Frame rate ,Computer Graphics and Computer-Aided Design ,Live streaming ,Mixed reality ,Computational photography ,0202 electrical engineering, electronic engineering, information engineering ,Immersion (virtual reality) ,020201 artificial intelligence & image processing ,Augmented reality - Abstract
Streaming of 360° content is gaining attention as an immersive way to remotely experience live events. However, live capture is presently limited to 2D content due to the prohibitive computational cost associated with multi-camera rigs. In this work we present a system that directly captures streaming 3D virtual reality content. Our approach does not suffer from spatial or temporal seams and natively handles phenomena that are challenging for existing systems, including refraction, reflection, transparency and specular highlights. The system natively captures in the omni-directional stereo (ODS) format, which is widely supported by VR displays and streaming pipelines. We identify an important source of distortion inherent to the ODS format and demonstrate a simple means of correcting it. We include a detailed analysis of the design space, including tradeoffs between noise, frame rate, resolution, and hardware complexity. Processing is minimal, enabling live transmission of immersive, 3D, 360° content. We construct a prototype and demonstrate capture of 360° scenes at up to 8192 × 4096 pixels at 5 fps, and establish the viability of operation at up to 32 fps.
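For readers unfamiliar with the ODS format mentioned above, the sketch below writes out one common formulation of its ray geometry: each pixel column maps to an azimuth, and rays originate on a viewing circle of half the interpupillary distance, pointing tangentially to it. This is a generic textbook-style formulation under an assumed equirectangular mapping, not the system's own calibration or its distortion correction.

```python
# Simplified omni-directional stereo (ODS) ray model: each pixel column maps to an
# azimuth; rays originate on a viewing circle of radius ipd/2 and point tangentially.
import numpy as np

def ods_ray(u, v, width, height, ipd=0.064, eye=+1):
    """Return (origin, direction) for pixel (u, v); eye = +1 or -1 selects the eye.
    Assumes an equirectangular mapping: u -> azimuth theta, v -> elevation phi."""
    theta = 2.0 * np.pi * (u / width) - np.pi
    phi = np.pi * (v / height) - np.pi / 2.0
    r = eye * ipd / 2.0
    origin = np.array([r * np.cos(theta), 0.0, r * np.sin(theta)])
    direction = np.array([-np.sin(theta) * np.cos(phi),
                          np.sin(phi),
                          np.cos(theta) * np.cos(phi)])
    return origin, direction   # origin lies on the circle, direction is tangent to it

o, d = ods_ray(2048, 1024, 8192, 4096)
```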
- Published
- 2017
- Full Text
- View/download PDF
39. State of the Art on Neural Rendering
- Author
-
Jun-Yan Zhu, Kalyan Sunkavalli, Christian Theobalt, Maneesh Agrawala, Gordon Wetzstein, Ricardo Martin-Brualla, Sean Fanello, Stephen Lombardi, Matthias Nießner, Dan B. Goldman, Tomas Simon, Ayush Tewari, Michael Zollhöfer, Vincent Sitzmann, Ohad Fried, Rohit Pandey, Justus Thies, Jason Saragih, and Eli Shechtman
- Subjects
FOS: Computer and information sciences ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,02 engineering and technology ,Metaverse ,Computer Graphics and Computer-Aided Design ,Graphics (cs.GR) ,Rendering (computer graphics) ,View synthesis ,Computer graphics ,Open research ,Computer Science - Graphics ,Human–computer interaction ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Use case ,Augmented reality ,Graphics ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Efficient rendering of photo-realistic virtual worlds is a long-standing goal of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep generative models. Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. With a plethora of applications in computer graphics and vision, neural rendering is poised to become a new area in the graphics community, yet no survey of this emerging field exists. This state-of-the-art report summarizes the recent trends and applications of neural rendering. We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs. Starting with an overview of the underlying computer graphics and machine learning concepts, we discuss critical aspects of neural rendering approaches. This state-of-the-art report focuses on the many important use cases for the described algorithms, such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence. Finally, we conclude with a discussion of the social implications of such technology and investigate open research problems., Comment: Eurographics 2020 survey paper
- Published
- 2020
- Full Text
- View/download PDF
40. Inference in artificial intelligence with deep optics and photonics
- Author
-
Cornelia Denz, Sylvain Gigan, Gordon Wetzstein, Marin Soljacic, Aydogan Ozcan, Demetri Psaltis, Shanhui Fan, Dirk Englund, and David A. B. Miller
- Subjects
neural-networks ,Multidisciplinary ,Artificial neural network ,business.industry ,Computer science ,Optical computing ,Inference ,02 engineering and technology ,021001 nanoscience & nanotechnology ,01 natural sciences ,Visual computing ,010309 optics ,0103 physical sciences ,microscopy ,Artificial intelligence ,Applications of artificial intelligence ,Photonics ,recognition ,0210 nano-technology ,business ,implementation - Abstract
Artificial intelligence tasks across numerous applications require accelerators for fast and low-power execution. Optical computing systems may be able to meet these domain-specific needs but, despite half a century of research, general-purpose optical computing systems have yet to mature into a practical technology. Artificial intelligence inference, however, especially for visual computing applications, may offer opportunities for implementations based on optical and photonic systems. In this Perspective, we review recent work on optical computing for artificial intelligence applications and discuss its promise and challenges.
- Published
- 2019
41. Deep Optics for Monocular Depth Estimation and 3D Object Detection
- Author
-
Gordon Wetzstein and Julie Chang
- Subjects
FOS: Computer and information sciences ,Monocular ,Simple lens ,Artificial neural network ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Image and Video Processing (eess.IV) ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,Image processing ,02 engineering and technology ,Electrical Engineering and Systems Science - Image and Video Processing ,Object detection ,law.invention ,Lens (optics) ,Optics ,law ,Chromatic aberration ,FOS: Electrical engineering, electronic engineering, information engineering ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,business - Abstract
Depth estimation and 3D object detection are critical for scene understanding but remain challenging to perform with a single image due to the loss of 3D information during image capture. Recent models using deep neural networks have improved monocular depth estimation performance, but they still struggle to predict absolute depth and to generalize beyond a standard dataset. Here we introduce the paradigm of deep optics, i.e. end-to-end design of optics and image processing, to the monocular depth estimation problem, using coded defocus blur as an additional depth cue to be decoded by a neural network. We evaluate several optical coding strategies along with an end-to-end optimization scheme for depth estimation on three datasets, including NYU Depth v2 and KITTI. We find that an optimized freeform lens design yields the best results, but chromatic aberration from a singlet lens also offers significantly improved performance. We build a physical prototype and validate that chromatic aberrations improve depth estimation on real-world results. In addition, we train object detection networks on the KITTI dataset and show that the lens optimized for depth estimation also results in improved 3D object detection performance., 10 pages, 5 figures
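The depth cue being exploited here can be illustrated with a toy simulation: defocus blur grows with distance from the focal plane and, to mimic chromatic aberration, each color channel is given a different focal distance. The layered-blur proxy, the per-channel focus distances, and the scaling constant below are illustrative assumptions, not the paper's optical model.

```python
# Toy layered defocus simulation: blur grows with distance from the focal plane and,
# to mimic chromatic aberration, differs per color channel (illustrative only).
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
H, W = 96, 96
rgb = rng.random((H, W, 3))
depth = np.linspace(1.0, 5.0, W)[None, :].repeat(H, axis=0)   # depth in meters

def coded_capture(rgb, depth, per_channel_focus=(1.8, 2.0, 2.2)):
    out = np.zeros_like(rgb)
    for c, f_c in enumerate(per_channel_focus):               # channel-dependent focus
        # defocus blur width grows with |1/depth - 1/focus| (thin-lens-style proxy)
        sigma_map = 4.0 * np.abs(1.0 / depth - 1.0 / f_c)
        # quantize depth into blur layers and blur each layer with its own sigma
        for s in np.unique(np.round(sigma_map, 1)):
            mask = np.isclose(np.round(sigma_map, 1), s)
            out[:, :, c] += mask * gaussian_filter(rgb[:, :, c], sigma=float(s) + 1e-3)
    return out

blurred = coded_capture(rgb, depth)   # a network would learn to decode depth from this
```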
- Published
- 2019
- Full Text
- View/download PDF
42. Gaze-Contingent Ocular Parallax Rendering for Virtual Reality
- Author
-
Gordon Wetzstein, Robert Konrad, and Anastasios Angelopoulos
- Subjects
FOS: Computer and information sciences ,J.4 ,Computer science ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Human-Computer Interaction ,02 engineering and technology ,Virtual reality ,050105 experimental psychology ,Human-Computer Interaction (cs.HC) ,Rendering (computer graphics) ,Computer graphics ,InformationSystems_MODELSANDPRINCIPLES ,Computer Science - Graphics ,H.5.1 ,Perception ,0202 electrical engineering, electronic engineering, information engineering ,Immersion (virtual reality) ,medicine ,0501 psychology and cognitive sciences ,Computer vision ,media_common ,ComputingMethodologies_COMPUTERGRAPHICS ,Retina ,business.industry ,05 social sciences ,Perspective (graphical) ,I.3.7 ,020207 software engineering ,Computer Graphics and Computer-Aided Design ,Gaze ,Graphics (cs.GR) ,medicine.anatomical_structure ,Eye tracking ,Augmented reality ,Artificial intelligence ,Parallax ,business ,Depth perception - Abstract
Immersive computer graphics systems strive to generate perceptually realistic user experiences. Current-generation virtual reality (VR) displays are successful in accurately rendering many perceptually important effects, including perspective, disparity, motion parallax, and other depth cues. In this article, we introduce ocular parallax rendering, a technology that accurately renders small amounts of gaze-contingent parallax capable of improving depth perception and realism in VR. Ocular parallax describes the small amounts of depth-dependent image shifts on the retina that are created as the eye rotates. The effect occurs because the centers of rotation and projection of the eye are not the same. We study the perceptual implications of ocular parallax rendering by designing and conducting a series of user experiments. Specifically, we estimate perceptual detection and discrimination thresholds for this effect and demonstrate that it is clearly visible in most VR applications. Additionally, we show that ocular parallax rendering provides an effective ordinal depth cue and it improves the impression of realistic depth in VR., Video: https://www.youtube.com/watch?v=FvBYYAObJNM&feature=youtu.be Project Page: http://www.computationalimaging.org/publications/gaze-contingent-ocular-parallax-rendering-for-virtual-reality/
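The magnitude of the effect can be approximated with small-angle parallax geometry: rotating the eye translates its center of projection along an arc whose radius is the offset from the center of rotation, and objects at different depths shift relative to one another accordingly. The sketch below is my own back-of-the-envelope formulation, with an assumed offset of roughly half a centimeter; it is not the perceptual model or thresholds reported in the article.

```python
# Back-of-the-envelope estimate of ocular parallax (a simplification, not the
# article's model): rotating the eye translates the center of projection,
# producing depth-dependent relative image shifts.
import numpy as np

def ocular_parallax_arcmin(rotation_deg, depth_near_m, depth_far_m, offset_m=0.006):
    """Relative angular shift between two depths after an eye rotation.
    offset_m ~ distance between the eye's centers of rotation and projection
    (treated here as an assumed constant)."""
    baseline = offset_m * np.sin(np.deg2rad(rotation_deg))  # projection-center travel
    parallax_rad = baseline * abs(1.0 / depth_near_m - 1.0 / depth_far_m)
    return np.rad2deg(parallax_rad) * 60.0

print(ocular_parallax_arcmin(20.0, 0.5, 2.0))   # on the order of ten arcminutes
```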
- Published
- 2019
43. Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
- Author
-
Vincent Sitzmann, Michael Zollhöfer, and Gordon Wetzstein
- Subjects
FOS: Computer and information sciences ,I.2.10 ,Artificial Intelligence (cs.AI) ,Computer Science - Artificial Intelligence ,Computer Vision and Pattern Recognition (cs.CV) ,I.4.5 ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition ,I.4.8 ,I.4.10 ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Unsupervised learning with generative models has the potential of discovering rich representations of 3D scenes. While geometric deep learning has explored 3D-structure-aware representations of scene geometry, these models typically require explicit 3D supervision. Emerging neural scene representations can be trained only with posed 2D images, but existing methods ignore the three-dimensional structure of scenes. We propose Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance. SRNs represent scenes as continuous functions that map world coordinates to a feature representation of local scene properties. By formulating the image formation as a differentiable ray-marching algorithm, SRNs can be trained end-to-end from only 2D images and their camera poses, without access to depth or shape. This formulation naturally generalizes across scenes, learning powerful geometry and appearance priors in the process. We demonstrate the potential of SRNs by evaluating them for novel view synthesis, few-shot reconstruction, joint shape and appearance interpolation, and unsupervised discovery of a non-rigid face model., Video: https://youtu.be/6vMEBWD8O20 Project page: https://vsitzmann.github.io/srns/
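The two core ingredients described, a coordinate network mapping 3D points to features and a differentiable ray marcher, can be sketched compactly. The version below is heavily simplified and purely illustrative: the actual SRN uses a learned (LSTM-based) ray marcher and hypernetwork-generated scene weights, neither of which is reproduced here, and all layer sizes are assumptions.

```python
# Highly simplified sketch of a coordinate network plus a fixed-step differentiable
# ray marcher (the actual SRN uses a learned ray marcher; this is only illustrative).
import torch
import torch.nn as nn

class SceneNet(nn.Module):
    def __init__(self, feat_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, feat_dim))
        self.to_rgb = nn.Linear(feat_dim, 3)       # feature -> color decoder
        self.to_dist = nn.Linear(feat_dim, 1)      # feature -> step length proposal

    def forward(self, xyz):
        return self.net(xyz)

def render_rays(model, origins, dirs, n_steps=10):
    """March each ray, letting the network propose step lengths, then decode color."""
    pts = origins.clone()
    for _ in range(n_steps):
        feat = model(pts)
        step = torch.sigmoid(model.to_dist(feat)) * 0.2   # bounded step size
        pts = pts + step * dirs
    return torch.sigmoid(model.to_rgb(model(pts)))

model = SceneNet()
origins = torch.zeros(4, 3)
dirs = nn.functional.normalize(torch.randn(4, 3), dim=-1)
colors = render_rays(model, origins, dirs)   # differentiable w.r.t. model parameters
```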
- Published
- 2019
44. Autofocals: Evaluating gaze-contingent eyeglasses for presbyopes
- Author
-
Nitish Padmanaban, Gordon Wetzstein, and Robert Konrad
- Subjects
Male ,Visual acuity ,Computer science ,media_common.quotation_subject ,Visual Acuity ,Neurophysiology ,02 engineering and technology ,01 natural sciences ,Task (project management) ,law.invention ,010309 optics ,Contrast Sensitivity ,Engineering ,law ,0103 physical sciences ,Task Performance and Analysis ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,Contrast (vision) ,Humans ,Research Articles ,media_common ,Aged ,Multidisciplinary ,business.industry ,Vision Tests ,SciAdv r-articles ,020207 software engineering ,Presbyopia ,Middle Aged ,021001 nanoscience & nanotechnology ,medicine.disease ,Gaze ,Lens (optics) ,Stereopsis ,Eyeglasses ,Eye tracking ,Optometry ,Female ,medicine.symptom ,0210 nano-technology ,business ,Accommodation ,Research Article - Abstract
Modern presbyopia corrections exhibit unnatural refocusing behavior; we build and evaluate autofocal eyeglasses to improve them. As humans age, they gradually lose the ability to accommodate, or refocus, to near distances because of the stiffening of the crystalline lens. This condition, known as presbyopia, affects nearly 20% of people worldwide. We design and build a new presbyopia correction, autofocals, to externally mimic the natural accommodation response, combining eye-tracker and depth-sensor data to automatically drive focus-tunable lenses. We evaluated 19 users on visual acuity, contrast sensitivity, and a refocusing task. Autofocals exhibit better visual acuity than monovision and progressive lenses while maintaining similar contrast sensitivity. On the refocusing task, autofocals are faster and, compared to progressives, also significantly more accurate. In a separate study, a majority (23 of 37) of users ranked autofocals as the best correction in terms of ease of refocusing. Our work demonstrates the superiority of autofocals over current forms of presbyopia correction and could affect the lives of millions.
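The control loop described here essentially converts the fixated distance into a lens power. A minimal sketch of one such step is given below; the sensor inputs, function name, and power limit are hypothetical placeholders rather than the authors' driver code.

```python
# Hedged sketch of an autofocal-style control step: look up scene depth at the
# gaze point and convert it into a target lens power in diopters. The inputs and
# the clipping limit are hypothetical placeholders.
import numpy as np

def target_lens_power(depth_map, gaze_px, max_power_diopters=3.0):
    """depth_map in meters; gaze_px = (row, col) reported by an eye tracker."""
    r, c = gaze_px
    fixation_distance = float(depth_map[r, c])
    power = 1.0 / max(fixation_distance, 1e-3)     # diopters = 1 / distance in meters
    return float(np.clip(power, 0.0, max_power_diopters))

depth_map = np.full((480, 640), 2.0)               # everything at 2 m
depth_map[200:300, 300:400] = 0.4                  # a near object
print(target_lens_power(depth_map, (250, 350)))    # -> 2.5 D for the 0.4 m target
```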
- Published
- 2019
45. Tensor low-rank and sparse light field photography
- Author
-
Mahdad Hosseini Kamal, Barmak Heshmat, Gordon Wetzstein, Pierre Vandergheynst, and Ramesh Raskar
- Subjects
Low-rank tensor factorization ,Imagination ,Computer science ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,High dimensional ,Low-rank and sparse decomposition ,law.invention ,Redundancy (information theory) ,Computational photography ,law ,0202 electrical engineering, electronic engineering, information engineering ,Image acquisition ,Computer vision ,media_common ,Light-field camera ,business.industry ,020207 software engineering ,Compressive sensing ,Compressed sensing ,Computer Science::Computer Vision and Pattern Recognition ,Signal Processing ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Light field ,Curse of dimensionality - Abstract
We present a computational camera system for efficient light field image and video acquisition. Our mathematical framework models the intrinsic low dimensionality of light fields using tensor low-rank and sparse priors. We design and implement a prototype compressive light field camera that avoids capturing the redundancy of the high-dimensional plenoptic function. High-quality light field photography has been one of the most difficult challenges in computational photography. Conventional methods either sacrifice resolution, use multiple devices, or require multiple images to be captured. Combining coded image acquisition and compressive reconstruction is one of the most promising directions for overcoming the limitations of conventional light field cameras. We present a new approach to compressive light field photography that exploits a joint tensor low-rank and sparse prior (LRSP) on natural light fields. As opposed to recently proposed light field dictionaries, our method does not require a computationally expensive learning stage but rather models the redundancies of high-dimensional visual signals using a tensor low-rank prior. This is not only computationally more efficient but also more flexible, in that the proposed techniques are easily applicable to a wide range of different imaging systems, camera parameters, and scene types.
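The flavor of a low-rank-plus-sparse prior can be shown with a matrix analogue: alternating truncated SVD and soft thresholding (robust-PCA style) on a matricized light field. This toy sketch omits the paper's tensor factorization and the compressive sensing operator; the rank, threshold, and synthetic data are assumptions.

```python
# Matrix-analogue sketch of a low-rank + sparse decomposition (robust-PCA style)
# applied to a matricized light field; the paper's tensor/compressive machinery
# is not reproduced here.
import numpy as np

def low_rank_plus_sparse(M, rank=3, sparse_thresh=0.1, iters=25):
    L = np.zeros_like(M); S = np.zeros_like(M)
    for _ in range(iters):
        # low-rank update: truncated SVD of the residual
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # sparse update: soft-threshold the remaining residual
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - sparse_thresh, 0.0)
    return L, S

rng = np.random.default_rng(0)
views, pixels = 25, 1024                                 # e.g., 5x5 views, flattened
lf = rng.random((views, 1)) @ rng.random((1, pixels))    # heavily redundant views
lf += (rng.random((views, pixels)) < 0.02) * 0.5         # sparse specular-like outliers
L, S = low_rank_plus_sparse(lf)
```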
- Published
- 2016
- Full Text
- View/download PDF
46. Optimizing image quality for holographic near-eye displays with Michelson Holography
- Author
-
Yifan Peng, Jonghyun Kim, Gordon Wetzstein, and Suyeon Choi
- Subjects
Liquid-crystal display ,business.industry ,Image quality ,Computer science ,Holography ,020207 software engineering ,02 engineering and technology ,01 natural sciences ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,law.invention ,010309 optics ,Quality (physics) ,Stochastic gradient descent ,Optics ,law ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Holographic display ,business - Abstract
We introduce Michelson holography (MH), a holographic display technology that optimizes image quality for emerging holographic near-eye displays. Using two spatial light modulators (SLMs), MH is capable of leveraging destructive interference to optically cancel out undiffracted light corrupting the observed image. We calibrate this system using emerging camera-in-the-loop holography techniques and demonstrate state-of-the-art 2D and multi-plane holographic image quality.
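The stochastic-gradient phase optimization underlying such holographic displays can be sketched with a single simulated SLM and an idealized Fourier propagation model standing in for the camera-in-the-loop measurement. The target pattern, learning rate, and normalization below are assumptions, and the dual-SLM destructive interference of MH is not modeled.

```python
# Toy SGD hologram optimization with a simulated propagation model standing in for
# the camera-in-the-loop measurement (one SLM only; purely illustrative).
import math
import torch

torch.manual_seed(0)
N = 128
target = torch.zeros(N, N); target[32:96, 32:96] = 1.0     # target intensity pattern

phase = torch.nn.Parameter(2 * math.pi * torch.rand(N, N))
opt = torch.optim.Adam([phase], lr=0.05)

def propagate(phi):
    field = torch.exp(1j * phi)                            # unit-amplitude SLM field
    far = torch.fft.fftshift(torch.fft.fft2(field)) / N    # idealized Fourier propagation
    return far.abs() ** 2

for _ in range(500):
    opt.zero_grad()
    img = propagate(phase)
    img = img / img.mean() * target.mean()                 # crude exposure normalization
    loss = torch.mean((img - target) ** 2)
    loss.backward(); opt.step()
```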
- Published
- 2021
- Full Text
- View/download PDF
47. Toward the next-generation VR/AR optics: a review of holographic near-eye displays from a human-centric perspective
- Author
-
Gordon Wetzstein, Byoungho Lee, Chenliang Chang, Kiseung Bang, and Liang Gao
- Subjects
Computer science ,Headset ,Holography ,Wearable computer ,02 engineering and technology ,021001 nanoscience & nanotechnology ,Stereo display ,01 natural sciences ,Article ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,law.invention ,010309 optics ,law ,Human–computer interaction ,0103 physical sciences ,Holographic display ,Immersion (virtual reality) ,Augmented reality ,0210 nano-technology ,Depth perception - Abstract
Wearable near-eye displays for virtual and augmented reality (VR/AR) have seen enormous growth in recent years. While researchers are exploiting a plethora of techniques to create life-like three-dimensional (3D) objects, there is a lack of awareness of the role of human perception in guiding hardware development. An ultimate VR/AR headset must integrate the display, sensors, and processors in a compact enclosure that people can comfortably wear for long periods, while allowing a superior immersive experience and user-friendly human–computer interaction. Compared with other 3D displays, the holographic display has unique advantages in providing natural depth cues and correcting eye aberrations. Therefore, it holds great promise as the enabling technology for next-generation VR/AR devices. In this review, we survey the recent progress in holographic near-eye displays from this human-centric perspective.
- Published
- 2020
- Full Text
- View/download PDF
48. Has half the time passed? Investigating time perception at long time scales
- Author
-
Qi Sun, Gordon Wetzstein, Zoya Bylinskii, Diego Gutierrez, David M. Eagleman, Laura M. Herman, Belen Masia, and Sandra Malpica
- Subjects
Ophthalmology ,Time perception ,Psychology ,Sensory Systems ,Cognitive psychology - Published
- 2020
- Full Text
- View/download PDF
49. Learned rotationally symmetric diffractive achromat for full-spectrum computational imaging
- Author
-
Xiong Dun, Hayato Ikoma, Gordon Wetzstein, Xinbin Cheng, Yifan Peng, and Zhanshan Wang
- Subjects
Computational complexity theory ,Artificial neural network ,Computer science ,Image quality ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Iterative reconstruction ,Inverse problem ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,law.invention ,Achromatic lens ,law ,Optical transfer function ,Algorithm - Abstract
Diffractive achromats (DAs) promise ultra-thin and light-weight form factors for full-color computational imaging systems. However, designing DAs with the optimal optical transfer function (OTF) distribution suitable for image reconstruction algorithms has been a difficult challenge. Emerging end-to-end optimization paradigms of diffractive optics and processing algorithms have achieved impressive results, but these approaches require immense computational resources and solve non-convex inverse problems with millions of parameters. Here, we propose a learned rotationally symmetric DA design using a concentric ring decomposition that reduces the computational complexity and memory requirements by one order of magnitude compared with conventional end-to-end optimization procedures, which simplifies the optimization significantly. With this approach, we realize the joint learning of a DA with an aperture size of 8 mm and an image recovery neural network, i.e., Res-Unet, in an end-to-end manner across the full visible spectrum (429–699 nm). The peak signal-to-noise ratio of the images recovered with our learned DA is 1.3 dB higher than that of DAs designed by conventional sequential approaches. This is because the learned DA exhibits higher OTF amplitudes at high frequencies over the full spectrum. We fabricate the learned DA using imprinting lithography. Experiments show that it resolves both fine details and color fidelity of diverse real-world scenes under natural illumination. The proposed design paradigm paves the way for incorporating DAs into thinner, lighter, and more compact full-spectrum imaging systems.
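The key parameter reduction, describing a 2D height map by a 1D radial profile of concentric rings, is easy to illustrate. The sketch below is generic; the pixel count, ring count, and height range are illustrative values, not the fabricated design's.

```python
# Illustration of the concentric-ring parameterization: a full 2D DOE height map is
# generated from a 1D vector of ring heights, shrinking the number of free
# parameters from ~N^2 to ~N/2 (values below are illustrative, not the paper's).
import numpy as np

def heightmap_from_rings(ring_heights, n_pixels):
    c = (n_pixels - 1) / 2.0
    yy, xx = np.mgrid[0:n_pixels, 0:n_pixels]
    radius = np.sqrt((xx - c) ** 2 + (yy - c) ** 2)
    ring_idx = np.minimum(radius.astype(int), len(ring_heights) - 1)
    return ring_heights[ring_idx]            # each ring shares one learnable height

n_pixels = 512
ring_heights = np.random.default_rng(0).uniform(0.0, 1.2e-6, n_pixels // 2)  # meters
height_map = heightmap_from_rings(ring_heights, n_pixels)
print(height_map.shape, ring_heights.size)   # (512, 512) from only 256 parameters
```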
- Published
- 2020
- Full Text
- View/download PDF
50. Cortical Observation by Synchronous Multifocal Optical Sampling Reveals Widespread Population Encoding of Actions
- Author
-
Timothy A. Machado, Gordon Wetzstein, Elle Yuen, John Kochalka, Karl Deisseroth, Isaac Kauvar, William E. Allen, and Minseung Choi
- Subjects
0301 basic medicine ,Population ,Neocortex ,Neuroimaging ,Observation ,Signal-To-Noise Ratio ,Biology ,Article ,Mice ,03 medical and health sciences ,0302 clinical medicine ,Calcium imaging ,Robotic Surgical Procedures ,Cortex (anatomy) ,Encoding (memory) ,medicine ,Animals ,education ,Cerebral Cortex ,Neurons ,Brain Mapping ,education.field_of_study ,Behavior, Animal ,General Neuroscience ,Optical sampling ,Optogenetics ,030104 developmental biology ,medicine.anatomical_structure ,Cosmos (category theory) ,Conditioning, Operant ,Neuroscience ,Algorithms ,Craniotomy ,Psychomotor Performance ,030217 neurology & neurosurgery ,Neural decoding - Abstract
Summary To advance the measurement of distributed neuronal population representations of targeted motor actions on single trials, we developed an optical method (COSMOS) for tracking neural activity in a largely uncharacterized spatiotemporal regime. COSMOS allowed simultaneous recording of neural dynamics at ∼30 Hz from over a thousand near-cellular resolution neuronal sources spread across the entire dorsal neocortex of awake, behaving mice during a three-option lick-to-target task. We identified spatially distributed neuronal population representations spanning the dorsal cortex that precisely encoded ongoing motor actions on single trials. Neuronal correlations measured at video rate using unaveraged, whole-session data had localized spatial structure, whereas trial-averaged data exhibited widespread correlations. Separable modes of neural activity encoded history-guided motor plans, with similar population dynamics in individual areas throughout cortex. These initial experiments illustrate how COSMOS enables investigation of large-scale cortical dynamics and that information about motor actions is widely shared between areas, potentially underlying distributed computations.
- Published
- 2020
- Full Text
- View/download PDF