62 results for "Eric J. Kelmelis"
Search Results
2. An embedded processor for real-time atmospheric compensation.
- Author
-
Michael R. Bodnar, Petersen F. Curt, Fernando E. Ortiz, Carmen J. Carrano, and Eric J. Kelmelis
- Published
- 2009
- Full Text
- View/download PDF
3. Real-time embedded atmospheric compensation for long-range imaging using the average bispectrum speckle method.
- Author
-
Petersen F. Curt, Michael R. Bodnar, Fernando E. Ortiz, Carmen J. Carrano, and Eric J. Kelmelis
- Published
- 2009
- Full Text
- View/download PDF
4. An architecture for the efficient implementation of compressive sampling reconstruction algorithms in reconfigurable hardware.
- Author
-
Fernando E. Ortiz, Eric J. Kelmelis, and Gonzalo R. Arce
- Published
- 2007
- Full Text
- View/download PDF
5. Front Matter: Volume 10650
- Author
-
Eric J. Kelmelis
- Published
- 2018
- Full Text
- View/download PDF
6. Real-time high performance atmospheric distortion correction using a Xilinx UltraScale Plus
- Author
-
Eric J. Kelmelis, Nick Henning, Aaron Paolini, James Bonnett, Wayne Cranwell, and Steve Carl Jamieson Parker
- Subjects
Computer science, Image processing, Video processing, MPSoC, Lucky imaging, Adaptive optics, Frame rate, Field-programmable gate array, Bispectrum, Computer hardware - Abstract
Long-range video surveillance is usually limited by the wavefront aberrations caused by atmospheric turbulence, rather than by the quality of the imaging optics or sensor. These aberrations can be mitigated optically by adaptive optics, or corrected post-detection by digital video processing. Video processing is preferred when the quality of the enhancement is acceptable, because the hardware is less expensive and has lower size, weight, and power (SWaP). Several competing video processing solutions may be employed: speckle imaging with bispectrum processing, lucky imaging, geometric correction, and blind deconvolution. Speckle imaging was originally developed for astronomy. It has subsequently been adapted for the more challenging problem of low-altitude, slant-path imaging, where the atmosphere is denser and more turbulent. This paper considers a bispectrum-based video processing solution, called ATCOM, which was originally implemented on an i7 CPU and accelerated using a GPU by EM Photonics. The design has since been adapted in a joint venture with RFEL Ltd to produce a low-SWaP implementation based around Xilinx's Zynq 7045 all-programmable system-on-a-chip (SoC). This system is called ATACAMA. Bispectrum processing is computationally expensive and, for both ATCOM and ATACAMA, a sub-region of the image must be processed to achieve operation at standard video frame rates. This paper considers how the design may be optimized to increase the size of this region while maintaining high performance. Finally, use of Xilinx's next-generation UltraScale+ multiprocessor SoC (MPSoC), which has an embedded Mali-400 GPU as well as an ARM CPU, is explored to further improve functionality.
- Published
- 2018
- Full Text
- View/download PDF
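The entries above center on average-bispectrum speckle processing. The property that makes frame averaging work is that the bispectrum is invariant to image translation, so per-frame atmospheric tilts cancel out. A minimal 1-D illustration of that invariance (pure Python with a hand-rolled DFT; illustrative only, not the embedded implementation):

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
            for u in range(n)]

def bispectrum(x):
    """B(u, v) = F(u) * F(v) * conj(F(u+v)); the shift phases cancel."""
    f = dft(x)
    n = len(f)
    return [[f[u] * f[v] * f[(u + v) % n].conjugate() for v in range(n)]
            for u in range(n)]

signal = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shifted = signal[2:] + signal[:2]          # circular translation by 2 samples

b0 = bispectrum(signal)
b1 = bispectrum(shifted)
err = max(abs(b0[u][v] - b1[u][v]) for u in range(8) for v in range(8))
print(err)   # ~0: the bispectrum is unchanged by the shift
```

A shift multiplies F(u) by exp(2πius/N); in B(u, v) the three phase factors cancel exactly, which is why averaging bispectra over many tilted frames preserves object information.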
7. Front Matter: Volume 10204
- Author
-
Eric J. Kelmelis
- Published
- 2017
- Full Text
- View/download PDF
8. Enhancing data from commercial space flights (Conference Presentation)
- Author
-
Eric J. Kelmelis, Stephen Kozacik, Aaron Paolini, and Ariel Sherman
- Subjects
Aeronautics, Computer science, Rocket, Range (aeronautics), Space (commercial competition), Remote sensing - Abstract
Video tracking of rocket launches inherently must be done from long range. Due to the high temperatures produced, cameras are often placed far from launch sites and their distance to the rocket increases as it is tracked through the flight. Consequently, the imagery collected is generally severely degraded by atmospheric turbulence. In this talk, we present our experience in enhancing commercial space flight videos. We will present the mission objectives, the unique challenges faced, and the solutions to overcome them.
- Published
- 2017
- Full Text
- View/download PDF
9. Photo-acoustic and video-acoustic methods for sensing distant sound sources
- Author
-
Eric J. Kelmelis, Dan Slater, and Stephen Kozacik
- Subjects
Electromagnetic field, Pixel, Dynamic range, Computer science, Turbulence, Microphone, Video processing, Signal, Photodiode, Stereophonic sound, Transducer, Sampling (signal processing), Distortion, Demodulation, Waveform, Computer vision, Artificial intelligence - Abstract
Long-range telescopic video imagery of distant terrestrial scenes, aircraft, rockets, and other aerospace vehicles can be a powerful observational tool. But what about the associated acoustic activity? A new technology, Remote Acoustic Sensing (RAS), may provide a method to remotely listen to the acoustic activity near these distant objects. Local acoustic activity sometimes weakly modulates the ambient illumination in a way that can be remotely sensed. RAS is a new type of microphone that separates an acoustic transducer into two spatially separated components: 1) a naturally formed in situ acousto-optic modulator (AOM) located within the distant scene and 2) a remote sensing readout device that recovers the distant audio. These two elements are passively coupled over long distances at the speed of light by naturally occurring ambient light energy or other electromagnetic fields. Stereophonic, multichannel, and acoustic beam-forming configurations are all possible using RAS techniques, and when combined with high-definition video imagery RAS can help to provide a more cinema-like immersive viewing experience. A practical implementation of a remote acousto-optic readout device can be a challenging engineering problem. The acoustic influence on the optical signal is generally weak, often rides on a strong bias term, and is further degraded by atmospheric seeing turbulence. In this paper, we consider two fundamentally different optical readout approaches: 1) a low-pixel-count photodiode-based RAS photoreceiver and 2) audio extraction directly from a video stream. Most of our RAS experiments to date have used the first method for reasons of performance and simplicity, but there are potential advantages to extracting audio directly from a video stream. These advantages include the straightforward ability to work with multiple AOMs (useful for acoustic beam forming), simpler optical configurations, and a potential ability to use certain preexisting video recordings. However, doing so requires overcoming significant limitations, typically including much lower sample rates, reduced sensitivity and dynamic range, more expensive video hardware, and the need for sophisticated video processing. The ATCOM real-time image processing software environment provides many of the capabilities needed for researching video-acoustic signal extraction. ATCOM is already a powerful tool for the visual enhancement of telescopic views distorted by atmospheric turbulence. To explore the potential of acoustic signal recovery from video imagery, we modified ATCOM to extract audio waveforms from the same telescopic video sources. In this paper, we demonstrate and compare both readout techniques for several aerospace test scenarios to better show where each has advantages.
- Published
- 2017
- Full Text
- View/download PDF
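As a toy illustration of the second readout approach described in the abstract above (audio extracted directly from a video stream): the per-frame mean brightness can be treated as an audio waveform sampled at the frame rate, and a weak tone riding on a strong bias term can be recovered by correlation against candidate frequencies. All numbers here are hypothetical:

```python
import math

# Hypothetical parameters: a 1 kHz frame rate and a 120 Hz acoustic tone
# weakly modulating scene brightness on top of a strong bias term.
frame_rate = 1000.0
tone_hz = 120.0
n_frames = 500

# Stand-in for per-frame mean pixel intensity from a telescopic video.
brightness = [100.0 + 0.5 * math.sin(2 * math.pi * tone_hz * t / frame_rate)
              for t in range(n_frames)]

mean = sum(brightness) / n_frames
ac = [b - mean for b in brightness]        # remove the bias term

def power_at(samples, freq, rate):
    """Signal power at one frequency via direct correlation."""
    re = sum(s * math.cos(2 * math.pi * freq * t / rate) for t, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * t / rate) for t, s in enumerate(samples))
    return re * re + im * im

# Scan candidate frequencies; the recovered tone should land at 120 Hz.
candidates = [f * 2.0 for f in range(1, 250)]   # 2..498 Hz in 2 Hz steps
recovered = max(candidates, key=lambda f: power_at(ac, f, frame_rate))
print(recovered)
```

A real system would face the limitations the abstract lists: the video frame rate caps the audio bandwidth (here at 500 Hz Nyquist), and turbulence adds broadband noise to the brightness series.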
10. Development of an embedded atmospheric turbulence mitigation engine
- Author
-
Aaron Paolini, Stephen Kozacik, Eric J. Kelmelis, and James Bonnett
- Subjects
Ubiquitous computing, Computer science, Atmospheric correction, Graphics processing unit, Chip, Software, Gate array, Software deployment, Embedded system, Field-programmable gate array - Abstract
Methods to reconstruct pictures from imagery degraded by atmospheric turbulence have been under development for decades. The techniques were initially developed for observing astronomical phenomena from the Earth’s surface, but have more recently been modified for ground and air surveillance scenarios. Such applications can impose significant constraints on deployment options because they both increase the computational complexity of the algorithms themselves and often dictate a requirement for low size, weight, and power (SWaP) form factors. Consequently, embedded implementations must be developed that can perform the necessary computations on low-SWaP platforms. Fortunately, there is an emerging class of embedded processors driven by the mobile and ubiquitous computing industries. We have leveraged these processors to develop embedded versions of the core atmospheric correction engine found in our ATCOM software. In this paper, we will present our experience adapting our algorithms for embedded systems on a chip (SoCs), namely the NVIDIA Tegra that couples general-purpose ARM cores with their graphics processing unit (GPU) technology and the Xilinx Zynq which pairs similar ARM cores with their field-programmable gate array (FPGA) fabric.
- Published
- 2017
- Full Text
- View/download PDF
11. Enhancement of DARPA SRVS data with a real-time commercial turbulence mitigation software
- Author
-
Aaron Paolini, Ariel Sherman, Eric J. Kelmelis, Richard L. Espinola, and Stephen Kozacik
- Subjects
Computer science, Turbulence, Digital imaging, Speckle pattern, Software, Remote sensing - Abstract
Modern digital imaging systems are susceptible to imagery degraded by atmospheric turbulence. Notwithstanding significant improvements in resolution and speed, degradation of captured imagery still hampers system designers and operators. Several techniques exist for mitigating the effects of turbulence on captured imagery; we will concentrate on the effects of the Bi-Spectrum Speckle Averaging [1], [2] approach to image enhancement, applied to a data set captured in conjunction with meteorological data.
- Published
- 2017
- Full Text
- View/download PDF
12. Improving developer productivity with C++ embedded domain specific languages
- Author
-
Eric J. Kelmelis, James Bonnett, Aaron Paolini, Evenie Chao, and Stephen Kozacik
- Subjects
010302 applied physics ,Domain-specific language ,Vocabulary ,Programming language ,Computer science ,media_common.quotation_subject ,Interoperability ,computer.software_genre ,Supercomputer ,01 natural sciences ,Domain (software engineering) ,010309 optics ,Set (abstract data type) ,0103 physical sciences ,Preprocessor ,Compiler ,computer ,media_common - Abstract
Domain-specific languages are a useful productivity tool, allowing domain experts to program using familiar concepts and vocabulary while benefiting from performance choices made by computing experts. Embedding a domain-specific language into an existing language allows easy interoperability with non-domain-specific code and the use of standard compilers and build systems. In C++, this is enabled through the template and preprocessor features. C++ embedded domain-specific languages (EDSLs) allow the user to write simple, safe, performant, domain-specific code that has access to all the low-level functionality that C and C++ offer, as well as the diverse set of libraries available in the C/C++ ecosystem. In this paper, we will discuss several tools available for building EDSLs in C++ and show examples of projects successfully leveraging EDSLs. Modern C++ has added many useful new features to the language, which we have leveraged to further extend the capability of EDSLs. At EM Photonics, we have used EDSLs to allow developers to transparently benefit from high-performance computing (HPC) hardware. We will show ways EDSLs combine with existing technologies and EM Photonics' high-performance tools and libraries to produce clean, short, high-performance code in ways that were not previously possible.
- Published
- 2017
- Full Text
- View/download PDF
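The paper's EDSLs are built on C++ templates and the preprocessor; the underlying idea, ordinary arithmetic syntax that builds a deferred expression tree a backend can inspect, optimize, or offload, can be sketched in any language with operator overloading. A Python analogue (names are illustrative, not from the paper):

```python
class Expr:
    """Node in a deferred expression tree built by ordinary arithmetic syntax."""
    def __add__(self, other): return BinOp('+', self, wrap(other))
    def __mul__(self, other): return BinOp('*', self, wrap(other))

class Const(Expr):
    def __init__(self, value): self.value = value
    def eval(self, env): return self.value

class Var(Expr):
    def __init__(self, name): self.name = name
    def eval(self, env): return env[self.name]

class BinOp(Expr):
    def __init__(self, op, lhs, rhs): self.op, self.lhs, self.rhs = op, lhs, rhs
    def eval(self, env):
        a, b = self.lhs.eval(env), self.rhs.eval(env)
        return a + b if self.op == '+' else a * b

def wrap(x):
    return x if isinstance(x, Expr) else Const(x)

# Domain code reads like plain arithmetic, but instead of computing
# immediately it builds a tree a backend could rewrite or offload.
x, y = Var('x'), Var('y')
expr = x * y + 2
print(expr.eval({'x': 3, 'y': 4}))   # 14
```

C++ expression templates apply the same pattern at compile time, so the tree can be fused into a single optimized loop with no runtime dispatch overhead.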
13. Quantifying the improvement of turbulence mitigation technology
- Author
-
Aaron Paolini, Stephen Kozacik, James Bonnett, Ariel Sherman, and Eric J. Kelmelis
- Subjects
Scintillation, Computer science, Turbulence, Atmospheric correction, Strehl ratio, Image processing, Optical transfer function, Metric (mathematics), Computer vision, Artificial intelligence, Image warping, Remote sensing - Abstract
Atmospheric turbulence degrades imagery by imparting scintillation and warping effects that can reduce the ability to identify key features of the subjects. While a human can intuitively see the improvement that turbulence mitigation techniques offer in increasing visual information, this enhancement is rarely quantified in a meaningful way. In this paper, we discuss methods for measuring the potential improvement in system performance that video enhancement algorithms can provide. To accomplish this, we explore two metrics. We use resolution targets to determine the difference between imagery degraded by turbulence and that improved by atmospheric correction techniques. By comparing line scans of the data before and after processing, it is possible to quantify the additional information extracted. Advanced processing of this data can provide information about the effective modulation transfer function (MTF) of the system when atmospheric effects are considered and removed. Using this data, we compute a second metric: the relative improvement in Strehl ratio.
- Published
- 2017
- Full Text
- View/download PDF
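One simple way to turn the line scans described in the abstract above into a number is Michelson contrast. The paper goes further (effective MTF and relative Strehl ratio), but the sketch below shows the basic before/after comparison; the line-scan values are made up for illustration:

```python
# Illustrative only: two hypothetical line scans across a bar resolution
# target, one from turbulence-degraded video and one after enhancement.
degraded = [120, 118, 112, 115, 119, 113, 116, 120, 114, 117]
enhanced = [140, 60, 138, 62, 141, 59, 139, 61, 140, 60]

def michelson_contrast(scan):
    """C = (Imax - Imin) / (Imax + Imin) over a line scan."""
    hi, lo = max(scan), min(scan)
    return (hi - lo) / (hi + lo)

c_before = michelson_contrast(degraded)
c_after = michelson_contrast(enhanced)
print(round(c_before, 3), round(c_after, 3))
```

Turbulence blurs the bars together (low contrast); after mitigation the bright/dark cycles of the target reappear, and the contrast ratio quantifies the recovered detail at that spatial frequency.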
14. Front Matter: Volume 9846
- Author
-
Eric J. Kelmelis
- Published
- 2016
- Full Text
- View/download PDF
15. Comparison of turbulence mitigation algorithms
- Author
-
Aaron Paolini, Eric J. Kelmelis, Stephen Kozacik, and James Bonnett
- Subjects
Computer science, Turbulence, Image quality, Image processing, Image warping, Image sensor, Algorithm - Abstract
When capturing image data over long distances (0.5 km and above), images are often degraded by atmospheric turbulence, especially when imaging paths are close to the ground or in hot environments. These issues manifest as time-varying scintillation and warping effects that decrease the effective resolution of the sensor and reduce actionable intelligence. In recent years, several image processing approaches to turbulence mitigation have shown promise. Each of these algorithms has different computational requirements, usability demands, and degrees of independence from camera sensors. They also produce different degrees of enhancement when applied to turbulent imagery. Additionally, some of these algorithms are applicable to real-time operational scenarios while others may only be suitable for post-processing workflows. EM Photonics has been developing image-processing-based turbulence mitigation technology since 2005 as part of our ATCOM [1] image processing suite. In this paper, we will compare techniques from the literature with our commercially available real-time, GPU-accelerated turbulence mitigation software suite, as well as with in-house research algorithms. These comparisons will be made using real, experimentally obtained data for a variety of conditions, including varying optical hardware, imaging range, subjects, and turbulence conditions. Comparison metrics will include image quality, video latency, computational complexity, and potential for real-time operation.
- Published
- 2016
- Full Text
- View/download PDF
16. Enabling power-aware software in embedded systems
- Author
-
Eric J. Kelmelis, Adam Markey, Aaron Paolini, James Bonnett, Paul Fox, and Stephen Kozacik
- Subjects
Battery (electricity), Mobile processor, Computer science, Wearable computer, Linux kernel, Software, Embedded system, Compiler, Mobile device, Electrical efficiency - Abstract
The use of commodity mobile processors in wearable computing and field-deployed applications has risen as these processors have become increasingly powerful and inexpensive. Battery technology, however, has not advanced as quickly, and as the processing power of these systems has increased, so has their power consumption. In order to maximize endurance without compromising performance, fine-grained control of power consumption by these devices is highly desirable. Various methodologies exist to affect system-level bias with respect to the prioritization of performance or efficiency, but these are fragmented and global in effect, and so do not offer the breadth and granularity of control desired. This paper introduces a method of giving application programmers more control over system power consumption using a directive-based approach similar to existing APIs such as OpenMP. On supported platforms the compiler, application runtime, and Linux kernel will work together to translate the power-saving intent expressed in compiler directives into instructions to control the hardware, reducing power consumption when possible while still providing high performance when required.
- Published
- 2016
- Full Text
- View/download PDF
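The directive-based approach described in the abstract above can be sketched, in spirit, as a scoped hint: the power-saving intent applies within a region of code and is restored on exit. The Python context manager below is a hypothetical analogue, not the paper's compiler/kernel mechanism; a real implementation would translate the hint into governor or kernel settings rather than storing a string:

```python
from contextlib import contextmanager

# Hypothetical runtime state standing in for hardware power policy.
_policy_stack = ["balanced"]

def current_policy():
    return _policy_stack[-1]

@contextmanager
def power_hint(policy):
    """Scoped analogue of a power directive: the hint applies inside
    the block and is restored on exit, even if an error occurs."""
    _policy_stack.append(policy)
    try:
        yield
    finally:
        _policy_stack.pop()

with power_hint("efficiency"):
    # Battery-sensitive phase, e.g. a sensor polling loop.
    inner = current_policy()
    with power_hint("performance"):
        # Latency-critical burst, e.g. an image enhancement kernel.
        burst = current_policy()

print(inner, burst, current_policy())   # efficiency performance balanced
```

Nesting mirrors the fine-grained control the paper argues for: a latency-critical region can temporarily override an enclosing efficiency bias without global side effects.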
17. Many-core graph analytics using accelerated sparse linear algebra routines
- Author
-
Paul Fox, Eric J. Kelmelis, Stephen Kozacik, and Aaron Paolini
- Subjects
Power graph analysis, Wait-for graph, Theoretical computer science, Graph database, Computer science, Computer programming, Graph, Analytics, Linear algebra, Programming paradigm, Graph (abstract data type) - Abstract
Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Large-scale graph analytics frameworks provide a convenient and highly scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis, as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and TinkerPop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run time, for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without requiring the customer to make any changes to their analytics code, thanks to compatibility with existing graph APIs.
- Published
- 2016
- Full Text
- View/download PDF
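In the GraphBLAS formulation referenced in the abstract above, breadth-first search is one sparse matrix-vector product per level over a Boolean semiring (OR for addition, AND for multiplication), with a mask excluding already-visited vertices. A pure-Python sketch of that formulation, with a dict-of-sets adjacency standing in for a sparse matrix:

```python
# BFS expressed GraphBLAS-style: one Boolean-semiring matvec per level.

def matvec_bool(adj, frontier):
    """y = A^T x over the Boolean semiring: vertices one hop from the frontier."""
    out = set()
    for v in frontier:
        out |= adj.get(v, set())
    return out

def bfs_levels(adj, source):
    """Return {vertex: level} via repeated semiring matvecs with masking."""
    levels = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        # Mask out already-visited vertices, as a GraphBLAS mask would.
        frontier = matvec_bool(adj, frontier) - set(levels)
        for v in frontier:
            levels[v] = depth
    return levels

# Small directed graph: 0 -> {1, 3}, 1 -> 2, 2 -> 4
graph = {0: {1, 3}, 1: {2}, 2: {4}}
print(dict(sorted(bfs_levels(graph, 0).items())))
# {0: 0, 1: 1, 2: 2, 3: 1, 4: 3}
```

The payoff of the matrix view is that `matvec_bool` is exactly the operation a tuned many-core sparse linear algebra kernel accelerates, while the vertex-centric API can be layered on top unchanged.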
18. Front Matter: Volume 9478
- Author
-
Eric J. Kelmelis
- Published
- 2015
- Full Text
- View/download PDF
19. Adaptive OpenCL libraries for platform portability
- Author
-
Paul Fox, Allyssa L. Batten, Eric J. Kelmelis, and Marcus Hayes
- Subjects
Software portability, Computer architecture, Computer science, Computer data storage, Computer programming, Memory organisation, Field-programmable gate array, Execution model, Massively parallel - Abstract
The OpenCL API provides an abstract mechanism for massively parallel programming on a very wide range of hardware, including traditional CPUs, GPUs, accelerator devices, FPGAs, and more. However, these different hardware architectures and platforms function quite differently, so coding OpenCL applications that are usefully portable is challenging. Care is therefore required in developing an effectively portable OpenCL library that enables parallel application development without fully separate code paths for each target platform. By making use of the device detection and characterization provided by the OpenCL API, valuable information can be obtained to make runtime decisions for optimization. In particular, the effects of memory affinity change depending on the memory organization of the device architecture. Work partitioning and assignment depend on the device execution model, in particular the types of parallel execution supported and the available synchronization primitives. These considerations, in turn, affect the selection and invocation of kernel code. For certain devices, platform-specific libraries are available, while others can benefit from kernel code generated from the specified device parameters. By parameterizing an algorithm based on how these considerations affect performance, a combination of device parameters can be used to produce an execution strategy that provides improved performance for that device or collection of devices.
- Published
- 2015
- Full Text
- View/download PDF
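The runtime decisions described in the abstract above can be sketched as a dispatch from device characterization to an execution strategy. The device dictionaries and strategy names below are illustrative stand-ins for values an OpenCL implementation would return from `clGetDeviceInfo`, not an actual library API:

```python
# Sketch: choose an execution strategy from queried device characteristics.

def choose_strategy(dev):
    """Map device characterization to an execution strategy name."""
    if dev["type"] == "GPU" and dev["local_mem_bytes"] >= 32 * 1024:
        return "tiled_local_memory"     # stage tiles in fast local memory
    if dev["type"] == "CPU":
        return "coarse_grained_rows"    # few work-items, large per-item work
    return "naive_global_memory"        # safe fallback for unknown devices

# Illustrative stand-ins for clGetDeviceInfo query results.
gpu = {"type": "GPU", "local_mem_bytes": 48 * 1024, "compute_units": 28}
cpu = {"type": "CPU", "local_mem_bytes": 32 * 1024, "compute_units": 8}

print(choose_strategy(gpu), choose_strategy(cpu))
```

This keeps one portable code path: the same application calls the library, and the device query, not the caller, decides how work is partitioned and which kernel variant runs.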
20. Real-time image processing for passive mmW imagery
- Author
-
Aaron Paolini, Daniel G. Mackrides, James Bonnett, Thomas E. Dillon, Eric J. Kelmelis, Charles Harrity, Dennis W. Prather, Christopher A. Schuetz, Richard D. Martin, and Stephen Kozacik
- Subjects
Diffraction, Computer science, Aperture, Noise reduction, Image processing, Signal-to-noise ratio, Transmission (telecommunications), Digital image processing, Computer vision, Artificial intelligence, Noise (video) - Abstract
The transmission characteristics of millimeter waves (mmWs) make them suitable for many applications in defense and security, from airport preflight scanning to penetrating degraded visual environments such as brownout or heavy fog. While the cold sky provides sufficient illumination for these images to be taken passively in outdoor scenarios, this utility comes at a cost; the diffraction limit of the longer wavelengths involved leads to lower resolution imagery compared to the visible or IR regimes, and the low power levels inherent to passive imagery allow the data to be more easily degraded by noise. Recent techniques leveraging optical upconversion have shown significant promise, but are still subject to fundamental limits in resolution and signal-to-noise ratio. To address these issues we have applied techniques developed for visible and IR imagery to decrease noise and increase resolution in mmW imagery. We have developed these techniques into fieldable software, making use of GPU platforms for real-time operation of computationally complex image processing algorithms. We present data from a passive, 77 GHz, distributed aperture, video-rate imaging platform captured during field tests at full video rate. These videos demonstrate the increase in situational awareness that can be gained through applying computational techniques in real-time without needing changes in detection hardware.
- Published
- 2015
- Full Text
- View/download PDF
21. Real-time technology for enhancing long-range imagery
- Author
-
Eric J. Kelmelis, Paul Fox, Aaron Paolini, James Bonnett, and Stephen Kozacik
- Subjects
Atmosphere (unit), Computer science, Real-time computing, Image processing, Video processing, Range (mathematics), Computer vision, Quality (business), Artificial intelligence, Image warping, Focus (optics) - Abstract
Many ISR applications require constant monitoring of targets from long distance. When capturing over long distances, imagery is often degraded by atmospheric turbulence. This adds a time-variant blurring effect to captured data, and can result in a significant loss of information. To recover it, image processing techniques have been developed to enhance sequences of short exposure images or videos in order to remove frame-specific scintillation and warping. While some of these techniques have been shown to be quite effective, the associated computational complexity and required processing power limits the application of these techniques to post-event analysis. To meet the needs of real-time ISR applications, video enhancement must be done in real-time in order to provide actionable intelligence as the scene unfolds. In this paper, we will provide an overview of an algorithm capable of providing the enhancement desired and focus on its real-time implementation. We will discuss the role that GPUs play in enabling real-time performance. This technology can be used to add performance to ISR applications by improving the quality of long-range imagery as it is collected and effectively extending sensor range.
- Published
- 2015
- Full Text
- View/download PDF
22. Automatic parameter estimation for atmospheric turbulence mitigation techniques
- Author
-
Aaron Paolini, Stephen Kozacik, and Eric J. Kelmelis
- Subjects
Range (mathematics), Computer science, Estimation theory, Turbulence, Real-time computing, Image processing, Simulation - Abstract
Several image processing techniques for turbulence mitigation have been shown to be effective under a wide range of long-range capture conditions; however, complex, dynamic scenes have often required manual interaction with the algorithm’s underlying parameters to achieve optimal results. While this level of interaction is sustainable in some workflows, in-field determination of ideal processing parameters greatly diminishes usefulness for many operators. Additionally, some use cases, such as those that rely on unmanned collection, lack human-in-the-loop usage. To address this shortcoming, we have extended a well-known turbulence mitigation algorithm based on bispectral averaging with a number of techniques to greatly reduce (and often eliminate) the need for operator interaction. Automations were made in the areas of turbulence strength estimation (Fried’s parameter), as well as the determination of optimal local averaging windows to balance turbulence mitigation and the preservation of dynamic scene content (non-turbulent motions). These modifications deliver a level of enhancement quality that approaches that of manual interaction, without the need for operator interaction. As a consequence, the range of operational scenarios where this technology is of benefit has been significantly expanded.
- Published
- 2015
- Full Text
- View/download PDF
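The abstract above does not spell out its estimators, so the sketch below is purely hypothetical: a crude frame-to-frame activity score drives the choice of local averaging window, shrinking the window when dynamic scene content would otherwise be smeared. Function names and the scaling constant are invented for illustration:

```python
# Hypothetical heuristic, not the paper's algorithm: adapt the temporal
# averaging window to a frame-to-frame motion/turbulence activity score.

def frame_diff_score(frames):
    """Mean absolute frame-to-frame pixel difference as a crude activity score."""
    total, count = 0.0, 0
    for a, b in zip(frames, frames[1:]):
        total += sum(abs(x - y) for x, y in zip(a, b))
        count += len(a)
    return total / count

def pick_window(score, min_frames=8, max_frames=64, scale=10.0):
    """Stronger activity -> shorter window, to preserve scene motion."""
    window = int(max_frames / (1.0 + scale * score))
    return max(min_frames, min(max_frames, window))

# Tiny synthetic "videos": each frame is a short list of pixel values.
calm = [[100, 100, 100]] * 5                       # static scene
busy = [[100, 100, 100], [110, 90, 105], [95, 112, 98],
        [108, 94, 103], [99, 109, 96]]             # active scene

print(pick_window(frame_diff_score(calm)), pick_window(frame_diff_score(busy)))
```

The real system additionally estimates turbulence strength (Fried's parameter) from the imagery; the point of the sketch is only the direction of the mapping, long windows for static scenes, short ones where non-turbulent motion must be preserved.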
23. Comparison of turbulence mitigation algorithms
- Author
-
Ariel Sherman, Stephen Kozacik, Aaron Paolini, James Bonnett, and Eric J. Kelmelis
- Subjects
Image fusion, Turbulence, Computer science, Image quality, General Engineering, Image processing, Real image, Atomic and Molecular Physics and Optics, Speckle pattern, Deconvolution, Image warping, Image sensor, Algorithm - Abstract
When capturing imagery over long distances, atmospheric turbulence often degrades the data, especially when observation paths are close to the ground or in hot environments. These issues manifest as time-varying scintillation and warping effects that decrease the effective resolution of the sensor and reduce actionable intelligence. In recent years, several image processing approaches to turbulence mitigation have shown promise. Each of these algorithms has different computational requirements, usability demands, and degrees of independence from camera sensors. They also produce different degrees of enhancement when applied to turbulent imagery. Additionally, some of these algorithms are applicable to real-time operational scenarios while others may only be suitable for postprocessing workflows. EM Photonics has been developing image-processing-based turbulence mitigation technology since 2005. We will compare techniques from the literature with our commercially available, real-time, GPU-accelerated turbulence mitigation software. These comparisons will be made using real (not synthetic), experimentally obtained data for a variety of conditions, including varying optical hardware, imaging range, subjects, and turbulence conditions. Comparison metrics will include image quality, video latency, computational complexity, and potential for real-time operation. Additionally, we will present a technique for quantitatively comparing turbulence mitigation algorithms using real images of radial resolution targets.
- Published
- 2017
- Full Text
- View/download PDF
24. Practical considerations for real-time turbulence mitigation in long-range imagery
- Author
-
Aaron Paolini, Stephen Kozacik, and Eric J. Kelmelis
- Subjects
Signal processing, Computer science, Image quality, Turbulence, Real-time computing, General Engineering, Image processing, Video processing, Atomic and Molecular Physics and Optics, Range (mathematics), Computer vision, Artificial intelligence, Image warping - Abstract
Atmospheric turbulence degrades imagery by imparting scintillation and warping effects that blur the collected pictures and reduce the effective level of detail. While this reduction in image quality can occur in a wide range of scenarios, it is particularly noticeable when capturing over long distances, when close to the ground, or in hot and humid environments. For decades, researchers have attempted to correct these problems through device and signal processing solutions. While fully digital approaches have the advantage of not requiring specialized hardware, they have been difficult to realize in real-time scenarios due to a variety of practical considerations, including computational performance, the need to integrate with cameras, and the ability to handle complex scenes. We address these challenges and our experience overcoming them. We enumerate the considerations for developing an image processing approach to atmospheric turbulence correction and describe how we approached them to develop software capable of real-time enhancement of long-range imagery.
- Published
- 2017
- Full Text
- View/download PDF
25. Optimization techniques for OpenCL-based linear algebra routines
- Author
-
John R. Humphrey, Stephen Kozacik, Paul Fox, Aryeh Kuller, Dennis W. Prather, and Eric J. Kelmelis
- Subjects
Set (abstract data type) ,Kernel (linear algebra) ,Computer science ,business.industry ,Linear algebra ,Computer programming ,Multiplication ,Parallel computing ,General-purpose computing on graphics processing units ,business ,Parametrization ,Matrix multiplication ,Block (data storage) - Abstract
The OpenCL standard for general-purpose parallel programming allows a developer to target highly parallel computations towards graphics processing units (GPUs), CPUs, co-processing devices, and field programmable gate arrays (FPGAs). The computationally intense domains of linear algebra and image processing have shown significant speedups when implemented in the OpenCL environment. A major benefit of OpenCL is that a routine written for one device can be run across many different devices and architectures; however, a kernel optimized for one device may not exhibit high performance when executed on a different device. For this reason, kernels must typically be hand-optimized for every target device family. Due to the large number of parameters that can affect performance, hand tuning for every possible device is impractical and often produces suboptimal results. For this work, we focused on optimizing the general matrix multiplication routine. General matrix multiplication is used as a building block for many linear algebra routines and often comprises a large portion of the run-time. Prior work has shown this routine to be a good candidate for high-performance implementation in OpenCL. We selected several candidate algorithms from the literature that are suitable for parameterization. We then developed parameterized kernels implementing these algorithms using only portable OpenCL features. Our implementation queries device information supplied by the OpenCL runtime and utilizes this as well as user input to generate a search space that satisfies device and algorithmic constraints. Preliminary results from our work confirm that optimizations are not portable from one device to the next, and show the benefits of automatic tuning. Using a standard set of tuning parameters seen in the literature for the NVIDIA Fermi architecture achieves a performance of 1.6 TFLOPS on an AMD 7970 device, while automatic tuning achieves a peak of 2.7 TFLOPS.
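The constraint-driven search-space generation described in the abstract can be sketched in a few lines. The tile sizes, vector widths, and local-memory cost model below are illustrative assumptions for a blocked GEMM kernel, not the parameters of the authors' actual kernels; the device limits would come from `clGetDeviceInfo` in a real OpenCL host program.

```python
from itertools import product

def tuning_search_space(local_mem_bytes, max_work_group_size,
                        tile_sizes=(8, 16, 32, 64),
                        vector_widths=(1, 2, 4)):
    # Keep only (tile, vector width) pairs that fit the device limits
    # reported by the OpenCL runtime; hypothetical cost model for a
    # blocked GEMM that stages one tile of A and one of B in local memory.
    space = []
    for tile, vw in product(tile_sizes, vector_widths):
        work_items = (tile // vw) * tile   # one work-item per vw outputs
        local_mem = 2 * tile * tile * 4    # two float tiles (4 bytes each)
        if work_items <= max_work_group_size and local_mem <= local_mem_bytes:
            space.append((tile, vw))
    return space

# e.g. a device reporting 32 KiB local memory and 256 work-items per group
candidates = tuning_search_space(32 * 1024, 256)
```

Each surviving candidate would then be compiled and timed, with the fastest kernel retained for that device.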
- Published
- 2014
- Full Text
- View/download PDF
26. Targeting multiple heterogeneous hardware platforms with OpenCL
- Author
-
Aaron Paolini, John R. Humphrey, Stephen Kozacik, Paul Fox, Aryeh Kuller, and Eric J. Kelmelis
- Subjects
Hardware architecture ,business.industry ,Computer science ,Subroutine ,Symmetric multiprocessor system ,computer.software_genre ,Software portability ,Just-in-time compilation ,Computer architecture ,Modular programming ,Preprocessor ,Compiler ,business ,computer ,Implementation ,Computer hardware - Abstract
The OpenCL API allows for the abstract expression of parallel, heterogeneous computing, but hardware implementations have substantial implementation differences. The abstractions provided by the OpenCL API are often insufficiently high-level to conceal differences in hardware architecture. Additionally, implementations often do not take advantage of potential performance gains from certain features due to hardware limitations and other factors. These factors make it challenging to produce code that is portable in practice, resulting in much OpenCL code being duplicated for each hardware platform being targeted. This duplication of effort offsets the principal advantage of OpenCL: portability. The use of certain coding practices can mitigate this problem, allowing a common code base to be adapted to perform well across a wide range of hardware platforms. To this end, we explore some general practices for producing performant code that are effective across platforms. Additionally, we explore some ways of modularizing code to enable optional optimizations that take advantage of hardware-specific characteristics. The minimum requirement for portability implies avoiding the use of OpenCL features that are optional, not widely implemented, poorly implemented, or missing in major implementations. Exposing multiple levels of parallelism allows hardware to take advantage of the types of parallelism it supports, from the task level down to explicit vector operations. Static optimizations and branch elimination in device code help the platform compiler to effectively optimize programs. Modularization of some code is important to allow operations to be chosen for performance on target hardware. Optional subroutines exploiting explicit memory locality allow for different memory hierarchies to be exploited for maximum performance. 
The C preprocessor and JIT compilation using the OpenCL runtime can be used to enable some of these techniques, as well as to factor in hardware-specific optimizations as necessary.
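The preprocessor-plus-JIT technique in the closing sentence can be illustrated with a minimal sketch. The kernel body and option names below are hypothetical; the pattern shown, prepending `#define` lines to kernel source before handing it to the runtime compiler, lets the platform compiler constant-fold branches and specialize the code per device.

```python
def specialize_kernel(template, **options):
    # Prepend #define lines so the platform's JIT compiler can
    # constant-fold branches and fix constants for this device.
    defines = "".join(f"#define {key} {val}\n" for key, val in options.items())
    return defines + template

# Hypothetical kernel: the vector-width branch is resolved at build time.
KERNEL_TEMPLATE = """
__kernel void scale(__global float *x) {
    int i = get_global_id(0);
#if USE_VEC4
    /* wide-load path for devices that prefer vector operations */
#endif
    x[i] *= ALPHA;
}
"""

src = specialize_kernel(KERNEL_TEMPLATE, USE_VEC4=1, ALPHA="2.0f")
# src is the string that would be passed to clCreateProgramWithSource
# and compiled per device with clBuildProgram
```

The same effect can be obtained by passing `-D` options in the `clBuildProgram` options string; generating the defines in host code simply keeps the specialization logic in one place.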
- Published
- 2014
- Full Text
- View/download PDF
27. Mean square error performance evaluation of a commercial speckle imaging system using simulated imagery
- Author
-
Jeremy P. Bos, Michael C. Roggemann, Eric J. Kelmelis, and Aaron Paolini
- Subjects
Diffraction ,Mean squared error ,Turbulence ,Computer science ,business.industry ,Volume (computing) ,Image processing ,Computer Science::Computer Vision and Pattern Recognition ,Range (statistics) ,Computer vision ,Speckle imaging ,Artificial intelligence ,business ,Bispectrum ,Remote sensing - Abstract
We examine the performance of a commercially available speckle imaging system in reconstructing static scenes from imagery corrupted by anisoplanatic distortions commonly observed when imaging over long horizontal paths near the ground. Performance is evaluated using the Mean Squared Error between system outputs and a diffraction-limited reference image. Input image frames are taken from a large library of simulated imagery of a static object observed over a 1 km horizontal path through volume turbulence under three turbulence conditions. One thousand image frames are available for each condition, allowing for a statistically significant characterization of system performance over a range of turbulence conditions.
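The evaluation metric is ordinary mean squared error between the system output and the diffraction-limited reference. A minimal sketch, using flat pixel lists purely for illustration:

```python
def mse(reference, output):
    # Mean squared error between two equal-sized grayscale frames,
    # given here as flat sequences of pixel values.
    if len(reference) != len(output):
        raise ValueError("frames must have the same number of pixels")
    return sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)

# toy 2x2 frames: a perfect reconstruction scores 0.0
ref = [10.0, 20.0, 30.0, 40.0]
perfect = mse(ref, ref)
off_by_one = mse(ref, [11.0, 21.0, 31.0, 41.0])
```

In the study, this scalar is averaged over the 1000 frames per condition to characterize each turbulence level statistically.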
- Published
- 2014
- Full Text
- View/download PDF
28. Multi-frame image processing with panning cameras and moving subjects
- Author
-
Eric J. Kelmelis, Aaron Paolini, Petersen F. Curt, and John R. Humphrey
- Subjects
Computer science ,business.industry ,Digital image processing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Computer vision ,Artificial intelligence ,Speckle imaging ,Panning (camera) ,business ,Multi frame - Abstract
Imaging scenarios commonly involve erratic, unpredictable camera behavior or subjects that are prone to movement, complicating multi-frame image processing techniques. To address these issues, we developed three techniques that can be applied to multi-frame image processing algorithms to mitigate the adverse effects observed when cameras are panning or subjects within the scene are moving. We provide a detailed overview of the techniques and discuss the applicability of each to various movement types. In addition, we evaluated algorithm efficacy on field-test video processed using our commercially available surveillance product. Our results show that algorithm efficacy is significantly improved in common scenarios, expanding our software's operational scope. Our methods introduce little computational burden, enabling their use in real-time and low-power solutions, and are appropriate for long observation periods. Our test cases focus on imaging through turbulence, a common use case for multi-frame techniques. We present results of a field study designed to test the efficacy of these techniques under expanded use cases.
- Published
- 2014
- Full Text
- View/download PDF
29. Using ATCOM to enhance long-range imagery collected by NASA’s flight test tracking cameras at Armstrong Flight Research Center
- Author
-
David Tow, Aaron Paolini, and Eric J. Kelmelis
- Subjects
Computer science ,business.industry ,Turbulence ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Image processing ,Tracking (particle physics) ,Flight test ,Rocket launch ,Software ,Range (aeronautics) ,business ,Research center ,Remote sensing - Abstract
Located at Edwards Air Force Base, Armstrong Flight Research Center (AFRC) is NASA's premier site for aeronautical research and operates some of the most advanced aircraft in the world. As such, flight tests for advanced manned and unmanned aircraft are regularly performed there. All such tests are tracked through advanced electro-optic imaging systems to monitor the flight status in real-time and to archive the data for later analysis. This necessitates collecting imagery of fast-moving targets with long-range camera systems positioned a significant distance away. Such imagery is severely degraded by the atmospheric turbulence between the camera and the object of interest. The result is imagery that becomes blurred and suffers a substantial reduction in contrast, causing significant detail in the video to be lost. In this paper, we discuss the image processing techniques implemented in the ATCOM software, which uses a multi-frame method to compensate for the distortions caused by the turbulence.
- Published
- 2014
- Full Text
- View/download PDF
30. Front Matter: Volume 8752
- Author
-
Eric J. Kelmelis
- Subjects
Volume (thermodynamics) ,Mechanics ,Geology ,Front (military) - Published
- 2013
- Full Text
- View/download PDF
31. Advances in computational fluid dynamics solvers for modern computing environments
- Author
-
Aaron Paolini, John R. Humphrey, Eric J. Kelmelis, and Daniel Hertenstein
- Subjects
Multi-core processor ,business.industry ,Computer science ,Parallel computing ,Computational fluid dynamics ,Solver ,Supercomputer ,Computational science ,Software ,Scalability ,Software architecture ,business ,Multicore architecture ,Xeon Phi - Abstract
EM Photonics has been investigating the application of massively multicore processors to a key problem area: Computational Fluid Dynamics (CFD). While the capabilities of CFD solvers have continually increased and improved to support features such as moving bodies and adjoint-based mesh adaptation, the software architecture has often lagged behind. This has led to poor scaling as core counts reach the tens of thousands. In the modern High Performance Computing (HPC) world, clusters with hundreds of thousands of cores are becoming the standard. In addition, accelerator devices such as NVIDIA GPUs and the Intel Xeon Phi are being installed in many new systems. It is important for CFD solvers to take advantage of the new hardware, as the computations involved are well suited to the massively multicore architecture. In our work, we demonstrate that new features in NVIDIA GPUs can empower existing CFD solvers, using as an example AVUS, a CFD solver developed by the Air Force Research Laboratory (AFRL) and the Volcanic Ash Advisory Center (VAAC). The effort has resulted in increased performance and scalability without sacrificing accuracy. There are many well-known codes in the CFD space that can benefit from this work, such as FUN3D, OVERFLOW, and TetrUSS. Such codes are widely used in the commercial, government, and defense sectors.
- Published
- 2013
- Full Text
- View/download PDF
32. Frontmatter: Volume 8403
- Author
-
Eric J. Kelmelis
- Subjects
Volume (thermodynamics) ,Mechanics ,Geology - Published
- 2012
- Full Text
- View/download PDF
33. Accelerating CULA Linear Algebra Routines with Hybrid GPU and Multicore Computing
- Author
-
Daniel K. Price, Eric J. Kelmelis, John R. Humphrey, and Kyle E. Spagnoli
- Subjects
Fortran ,Computer science ,Graphics processing unit ,Parallel computing ,System of linear equations ,LU decomposition ,Computational science ,law.invention ,law ,Interfacing ,Linear algebra ,Computer Science::Mathematical Software ,Central processing unit ,MATLAB ,computer ,computer.programming_language - Abstract
The LU decomposition is a popular linear algebra technique with applications such as the solution of systems of linear equations and calculation of matrix inverses and determinants. Central processing unit (CPU) versions of this routine exhibit very high performance, making the port to a graphics processing unit (GPU) a challenging prospect. This chapter discusses the implementation of LU decomposition in the CULA library for linear algebra on the GPU, describing the steps necessary for achieving significant speed-ups over the CPU. Specialized techniques are employed by CULA to obtain significant speed-ups over existing packages. CULA features a wide variety of linear algebra functions, including least squares solvers (constrained and unconstrained), system solvers (general and symmetric positive definite), eigenproblem solvers (general and symmetric), singular value decompositions, and many useful factorizations (QR, Hessenberg). It also presents a number of methods for interfacing with CULA. The two major interfaces are host and device, and they accept data via host memory and device memory, respectively. The host interface features high convenience, whereas the device interface is more manual, but can avoid data transfer times. Additionally, there are facilities for interfacing with MATLAB and the Fortran language.
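For reference, the mathematical core of the factorization can be sketched in its unblocked Doolittle form. This is only an illustration of what A = LU computes; CULA's actual GPU implementation is blocked and pivoted, which this sketch deliberately omits.

```python
def lu_decompose(A):
    # Doolittle LU factorization without pivoting: A = L @ U, where L is
    # unit lower-triangular. Input is a square list-of-lists matrix.
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # row i of U from previously computed rows/columns
        for j in range(i, n):
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        # column i of L, divided by the pivot U[i][i]
        for j in range(i + 1, n):
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

L_, U_ = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
```

High-performance versions reorganize these loops into large matrix-matrix multiplies over column blocks, which is what maps well to the GPU.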
- Published
- 2012
- Full Text
- View/download PDF
34. Accelerating sparse linear algebra using graphics processing units
- Author
-
Eric J. Kelmelis, John R. Humphrey, Kyle E. Spagnoli, and Daniel K. Price
- Subjects
Numerical linear algebra ,Computer science ,Graphics processing unit ,Parallel computing ,computer.software_genre ,Finite element method ,Computational science ,CUDA ,Linear algebra ,Computer Science::Mathematical Software ,Central processing unit ,General-purpose computing on graphics processing units ,Graphics ,computer ,Execution model - Abstract
The modern graphics processing unit (GPU) found in many standard personal computers is a highly parallel math processor capable of over 1 TFLOPS of peak computational throughput at a cost similar to a high-end CPU, with an excellent FLOPS-to-watt ratio. High-level sparse linear algebra operations are computationally intense, often requiring large amounts of parallel operations, and would seem a natural fit for the processing power of the GPU. Our work is on a GPU accelerated implementation of sparse linear algebra routines. We present results from both direct and iterative sparse system solvers. The GPU execution model featured by NVIDIA GPUs based on CUDA demands very strong parallelism, requiring between hundreds and thousands of simultaneous operations to achieve high performance. Some constructs from linear algebra map extremely well to the GPU and others map poorly. CPUs, on the other hand, do well at smaller order parallelism and perform acceptably during low-parallelism code segments. Our work addresses this via a hybrid processing model, in which the CPU and GPU work simultaneously to produce results. In many cases, this is accomplished by allowing each platform to do the work it performs most naturally. For example, the CPU is responsible for the graph-theory portion of the direct solvers while the GPU simultaneously performs the low-level linear algebra routines.
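The low-level routines offloaded to the GPU are dominated by kernels such as the sparse matrix-vector product. A minimal compressed-sparse-row (CSR) sketch, in pure Python for clarity; on the GPU, each row reduction below would be one of thousands running simultaneously, which is the fine-grained parallelism the abstract refers to:

```python
def csr_matvec(values, col_idx, row_ptr, x):
    # y = A @ x for a sparse matrix stored in compressed sparse row form:
    # values holds the nonzeros, col_idx their column indices, and
    # row_ptr[r]:row_ptr[r+1] delimits the nonzeros of row r.
    y = []
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y

# 2x2 example: [[2, 0], [1, 3]] @ [1, 1]
y = csr_matvec([2.0, 1.0, 3.0], [0, 0, 1], [0, 1, 3], [1.0, 1.0])
```

Because each output row is independent, this kernel parallelizes trivially, whereas the symbolic graph analysis that orders a direct factorization does not, which is why the hybrid model leaves that part on the CPU.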
- Published
- 2011
- Full Text
- View/download PDF
35. Front Matter: Volume 8060
- Author
-
Eric J. Kelmelis
- Subjects
Volume (thermodynamics) ,Mechanics ,Geology ,Front (military) - Published
- 2011
- Full Text
- View/download PDF
36. CULA: hybrid GPU accelerated linear algebra routines
- Author
-
John R. Humphrey, Daniel K. Price, Aaron Paolini, Kyle E. Spagnoli, and Eric J. Kelmelis
- Subjects
CUDA ,law ,Computer science ,Linear algebra ,Singular value decomposition ,Computer Science::Mathematical Software ,Graphics processing unit ,Parallel computing ,Central processing unit ,FLOPS ,LU decomposition ,law.invention ,QR decomposition - Abstract
The modern graphics processing unit (GPU) found in many standard personal computers is a highly parallel math processor capable of nearly 1 TFLOPS peak throughput at a cost similar to a high-end CPU, with an excellent FLOPS/watt ratio. High-level linear algebra operations are computationally intense, often requiring O(N3) operations, and would seem a natural fit for the processing power of the GPU. Our work is on CULA, a GPU accelerated implementation of linear algebra routines. We present results from factorizations such as LU decomposition, singular value decomposition and QR decomposition along with applications like system solution and least squares. The GPU execution model featured by NVIDIA GPUs based on CUDA demands very strong parallelism, requiring between hundreds and thousands of simultaneous operations to achieve high performance. Some constructs from linear algebra map extremely well to the GPU and others map poorly. CPUs, on the other hand, do well at smaller order parallelism and perform acceptably during low-parallelism code segments. Our work addresses this via a hybrid processing model, in which the CPU and GPU work simultaneously to produce results. In many cases, this is accomplished by allowing each platform to do the work it performs most naturally.
- Published
- 2010
- Full Text
- View/download PDF
37. Comparing FPGAs and GPUs for high-performance image processing applications
- Author
-
Michael R. Bodnar, Daniel K. Price, Petersen F. Curt, Eric J. Kelmelis, Fernando E. Ortiz, Kyle E. Spagnoli, and Aaron Paolini
- Subjects
Flexibility (engineering) ,Workstation ,business.industry ,Computer science ,Image quality ,Image processing ,law.invention ,Microprocessor ,Parallel processing (DSP implementation) ,law ,Embedded system ,Computer data storage ,business ,Field-programmable gate array - Abstract
Modern image enhancement techniques have been shown to be effective in improving the quality of imagery. However, the computational requirements of applying such algorithms to streams of video in real-time often cannot be satisfied by standard microprocessor-based systems. While a scaled solution involving clusters of microprocessors may provide the necessary arithmetic capacity, deployment is limited to data-center scenarios. What is needed is a way to perform these techniques in real time on embedded platforms. A new paradigm of computing utilizing special-purpose commodity hardware, including Field-Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), has recently emerged as an alternative to parallel computing using clusters of traditional CPUs. Recent research has shown that for many applications, such as image processing techniques requiring intense computations and large memory spaces, these hardware platforms significantly outperform microprocessors. Furthermore, while microprocessor technology has begun to stagnate, GPUs and FPGAs have continued to improve exponentially. FPGAs, flexible and powerful, are best targeted at embedded, low-power systems and specific applications. GPUs, inexpensive and readily available, are accessible to most users through their standard desktop machines. Additionally, as fabrication scale continues to shrink, the heat and power consumption issues that typically limit GPU deployment to high-end desktop workstations are becoming less of a factor. The ability to include these devices in embedded environments opens up entire new application domains. In this paper, we investigate two state-of-the-art image processing techniques, super-resolution and the average-bispectrum speckle method, and compare FPGA and GPU implementations in terms of performance, development effort, cost, deployment options, and platform flexibility.
- Published
- 2010
- Full Text
- View/download PDF
38. Front Matter: Volume 7705
- Author
-
Eric J. Kelmelis
- Subjects
Volume (thermodynamics) ,Mechanics ,Geology ,Front (military) - Published
- 2010
- Full Text
- View/download PDF
39. Organically enabled silicon-based photonic/RF-photonic applications
- Author
-
Matthew Zablocki, Peng Yao, Dennis W. Prather, Ozgenc Ebil, Ahmed S. Sharkawy, Christopher A. Schuctz, Eric J. Kelmelis, and Shouyuan Shi
- Subjects
Fabrication ,Materials science ,Silicon ,business.industry ,Physics::Optics ,chemistry.chemical_element ,Amorphous solid ,chemistry ,Polymer chemistry ,Optoelectronics ,Photonics ,Hybrid material ,business ,Ultrashort pulse ,Realization (systems) ,Photonic crystal - Abstract
In this paper, we present novel designs for the realization of organic-inorganic hybrid material systems and develop concepts and designs for silicon-organic hybrid ultrafast RF photonic devices. The designs presented combine crystalline electro-optic materials, conventional crystalline materials, and amorphous polymers. Numerical simulation results as well as fabrication results are also included.
- Published
- 2010
- Full Text
- View/download PDF
40. An embedded processor for real-time atmospheric compensation
- Author
-
Petersen F. Curt, Eric J. Kelmelis, Fernando E. Ortiz, Carmen J. Carrano, and Michael R. Bodnar
- Subjects
Speckle pattern ,business.industry ,Image quality ,Computer science ,Interface (computing) ,Image processing ,Computer vision ,Artificial intelligence ,business ,Bispectrum ,Computer hardware ,Compensation (engineering) - Abstract
Imaging over long distances is crucial to a number of defense and security applications, such as homeland security and launch tracking. However, the image quality obtained from current long-range optical systems can be severely degraded by the turbulent atmosphere in the path between the region under observation and the imager. While this obscured image information can be recovered using post-processing techniques, the computational complexity of such approaches has prohibited deployment in real-time scenarios. To overcome this limitation, we have coupled a state-of-the-art atmospheric compensation algorithm, the average-bispectrum speckle method, with a powerful FPGA-based embedded processing board. The end result is a lightweight, low-power image processing system that improves the quality of long-range imagery in real-time, and uses modular video I/O to provide a flexible interface to most common digital and analog video transport methods. By leveraging the custom, reconfigurable nature of the FPGA, a 20x speed increase over a modern desktop PC was achieved in a form-factor that is compact, low-power, and field-deployable.
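The average-bispectrum computation at the heart of the algorithm can be sketched for 1-D frames. This toy version (naive O(n²) DFT, pure Python) only illustrates the key property that makes the method robust to turbulence-induced image motion, namely that the bispectrum is invariant to per-frame translations; it is in no way the embedded implementation.

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform, adequate for a small sketch.
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def avg_bispectrum(frames):
    # B(u, v) = mean over frames of F(u) * F(v) * conj(F(u + v)).
    # A cyclic shift multiplies F(f) by a phase exp(-2*pi*i*f*s/n); in the
    # triple product those phases cancel, so random per-frame motion
    # averages away while the scene's phase information is preserved.
    n = len(frames[0])
    B = [[0j] * n for _ in range(n)]
    for frame in frames:
        F = dft(frame)
        for u in range(n):
            for v in range(n):
                B[u][v] += F[u] * F[v] * F[(u + v) % n].conjugate()
    return [[b / len(frames) for b in row] for row in B]

# a frame and its cyclic shift yield (numerically) identical bispectra
B1 = avg_bispectrum([[1.0, 2.0, 3.0, 4.0]])
B2 = avg_bispectrum([[4.0, 1.0, 2.0, 3.0]])
```

The full method works on 2-D frames with FFTs and then recovers the object's Fourier phase recursively from the averaged bispectrum, which is the computationally expensive step the FPGA accelerates.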
- Published
- 2009
- Full Text
- View/download PDF
41. A GPU-accelerated toolbox for the solutions of systems of linear equations
- Author
-
John R. Humphrey, Aaron Paolini, Daniel K. Price, and Eric J. Kelmelis
- Subjects
Computer science ,law ,Graphics processing unit ,Parallel computing ,Solver ,System of linear equations ,Supercomputer ,Generalized minimal residual method ,Linear equation ,LU decomposition ,law.invention - Abstract
The modern graphics processing unit (GPU) found in many off-the-shelf personal computers is a very high performance computing engine that often goes unutilized. The tremendous computing power coupled with reasonable pricing has made the GPU a topic of interest in recent research. An application for such power would be the solution of large systems of linear equations. Two popular solution domains are direct solution, via the LU decomposition, and iterative solution, via a solver such as the Generalized Minimal Residual method (GMRES). Our research focuses on the acceleration of such processes, utilizing the latest in GPU technologies. We show performance that exceeds that of a standard computer by an order of magnitude, thus significantly reducing the run time of the numerous applications that depend on the solution of a set of linear equations.
- Published
- 2009
- Full Text
- View/download PDF
42. Biologically inspired collision avoidance system for unmanned vehicles
- Author
-
Eric J. Kelmelis, Kyle E. Spagnoli, Brett J. Graham, and Fernando E. Ortiz
- Subjects
business.industry ,Computer science ,Controller (computing) ,Central nervous system ,Robotics ,Optic tectum ,Cerebro ,Object detection ,Midbrain ,medicine.anatomical_structure ,Computer architecture ,Embedded system ,medicine ,Collision avoidance system ,Artificial intelligence ,Field-programmable gate array ,business ,Massively parallel ,Collision avoidance - Abstract
In this project, we collaborate with researchers in the neuroscience department at the University of Delaware to develop a Field-Programmable Gate Array (FPGA)-based embedded computer inspired by the brains of small vertebrates (fish). The mechanisms of object detection and avoidance in fish have been extensively studied by our Delaware collaborators. The midbrain optic tectum is a biological multimodal navigation controller capable of processing input from all senses that convey spatial information, including vision, audition, touch, and lateral-line (water current sensing in fish). Unfortunately, computational complexity makes these models too slow for use in real-time applications. These simulations are run offline on state-of-the-art desktop computers, presenting a gap between the application and the target platform: a low-power embedded device. EM Photonics has expertise in developing high-performance computers based on commodity platforms such as graphics cards (GPUs) and FPGAs. FPGAs offer (1) high computational power, low power consumption and small footprint (in line with typical autonomous vehicle constraints), and (2) the ability to implement massively-parallel computational architectures, which can be leveraged to closely emulate biological systems. By combining UD's brain-modeling algorithms with the power of FPGAs, this computer enables autonomous navigation in complex environments, and further types of onboard neural processing in future applications.
- Published
- 2009
- Full Text
- View/download PDF
43. Real-time embedded atmospheric compensation for long-range imaging using the average bispectrum speckle method
- Author
-
Eric J. Kelmelis, Petersen F. Curt, Fernando E. Ortiz, Carmen J. Carrano, and Michael R. Bodnar
- Subjects
Speckle pattern ,business.industry ,Computer science ,Image processing ,Angular resolution ,Speckle imaging ,business ,Field-programmable gate array ,Bispectrum ,Computer hardware ,Simulation - Abstract
While imaging over long distances is critical to a number of security and defense applications, such as homeland security and launch tracking, current optical systems are limited in resolving power. This is largely a result of the turbulent atmosphere in the path between the region under observation and the imaging system, which can severely degrade captured imagery. There are a variety of post-processing techniques capable of recovering this obscured image information; however, the computational complexity of such approaches has prohibited real-time deployment and hampers the usability of these technologies in many scenarios. To overcome this limitation, we have designed and manufactured an embedded image processing system based on commodity hardware which can compensate for these atmospheric disturbances in real-time. Our system consists of a reformulation of the average bispectrum speckle method coupled with a high-end FPGA processing board, and employs modular I/O capable of interfacing with most common digital and analog video transport methods (composite, component, VGA, DVI, SDI, HD-SDI, etc.). By leveraging the custom, reconfigurable nature of the FPGA, we have achieved performance twenty times faster than a modern desktop PC, in a form-factor that is compact, low-power, and field-deployable. Keywords: bispectral speckle imaging, FPGA, embedded, atmospheric compensation, real-time image processing
- Published
- 2009
- Full Text
- View/download PDF
44. Fabrication of Large Area 'Woodpile' Photonic Crystal Structures for Near IR
- Author
-
Dennis W. Prather, Peng Yao, Shouyuan Shi, Ahmed S. Sharkawy, Ozgenc Ebil, Elton Marchena, Neilanjan Dutta, and Eric J. Kelmelis
- Subjects
chemistry.chemical_classification ,Fabrication ,Materials science ,business.industry ,Nanotechnology ,Polymer ,law.invention ,Planar ,chemistry ,Resist ,law ,Optoelectronics ,Batch fabrication ,Photolithography ,business ,Lithography ,Photonic crystal - Abstract
We have fabricated large area 3D polymer photonic crystals by modifying planar lithography to achieve exposure confinement and multiple resist application. This fabrication process allows arbitrary defect introduction and is suitable for batch fabrication.
- Published
- 2009
- Full Text
- View/download PDF
45. Accelerated determination of UAV flight envelopes
- Author
-
Michael R. Bodnar, John R. Humphrey, Eric J. Kelmelis, and Lyle N. Long
- Subjects
Engineering ,business.industry ,Computational fluid dynamics ,Solver ,Supercomputer ,Euler equations ,Modeling and simulation ,symbols.namesake ,Software ,symbols ,System integration ,Aerospace engineering ,Graphics ,business ,Simulation - Abstract
Unmanned Aerial Vehicle (UAV) system integration with naval vessels is currently realized in limited form. The operational envelopes of these vehicles are constricted due to the complexities involved with at-sea flight testing. Furthermore, the unsteady nature of ship airwakes and the use of automated UAV control software necessitates that these tests be extremely conservative in nature. Modeling and simulation are natural alternatives to flight testing; however, a fully-coupled computational fluid dynamics (CFD) solution requires many thousands of CPU hours. We therefore seek to decrease simulation time by accelerating the underlying computations using state-of-the-art, commodity hardware. In this paper we present the progress of our proposed solution, harnessing the computational power of high-end commodity graphics processing units (GPUs) to create an accelerated Euler equations solver on unstructured hexahedral grids.
- Published
- 2008
- Full Text
- View/download PDF
46. Fabrication of 3D polymer photonic crystals for near-IR applications
- Author
-
Garrett J. Schneider, Dennis W. Prather, Liang Qiu, Eric J. Kelmelis, Ahmed S. Sharkawy, Shouyuan Shi, and Peng Yao
- Subjects
Materials science ,Fabrication ,business.industry ,Nanotechnology ,law.invention ,Surface micromachining ,Resist ,law ,Optoelectronics ,X-ray lithography ,Photolithography ,business ,Lithography ,Microfabrication ,Photonic crystal - Abstract
Photonic crystals (PhCs)[1, 2] have stirred enormous research interest and have become a growing enterprise over the last 15 years. Generally, PhCs consist of periodic structures whose periodicity is comparable to the wavelength the PhCs are designed to modulate. If the material and periodic pattern are properly selected, PhCs can serve many applications based on their unique properties, including photonic band gaps (PBG)[3], self-collimation[4], the superprism effect[5], etc. Strictly speaking, PhCs need to possess periodicity in three dimensions to maximize their advantageous capabilities. However, much current research is based on scaled two-dimensional PhCs, mainly due to the difficulty of fabricating such three-dimensional PhCs. Many approaches have been explored for the fabrication of 3D photonic crystals, including layer-by-layer surface micromachining[6], glancing angle deposition[7], the 3D micro-sculpture method[8], self-assembly[9], and lithographic methods[10-12]. Among them, lithographic methods have become increasingly accepted due to low costs and precise control over the photonic crystal structure. The three most developed lithographic methods are X-ray lithography[10], holographic lithography[11], and two-photon polymerization[12]. Although significant progress has been made in developing these lithography-based technologies, these approaches still suffer from significant disadvantages: X-ray lithography relies on an expensive radiation source, holographic lithography lacks the flexibility to create engineered defects, and multi-photon polymerization is not suitable for parallel fabrication. In our previous work, we developed a multi-layer photolithography process[13, 14] based on multiple resist applications and enhanced absorption upon exposure. Using a negative lift-off resist (LOR) and a 254 nm DUV source, we demonstrated fabrication of arbitrary 3D structures with feature sizes of several microns.
However, a severe intermixing problem occurred as we reduced the lattice constant for near-IR applications. In this work, we address this problem by employing SU8. The exposure is vertically confined by using a mismatched 220 nm DUV source, and the intermixing problem is eliminated because the resist molecules are more densely crosslinked. Using this method, we have demonstrated a 3D "woodpile" structure with a 1.55 μm lattice constant and a 2 mm by 2 mm pattern area.
- Published
- 2008
- Full Text
- View/download PDF
47. FPGA acceleration of superresolution algorithms for embedded processing in millimeter-wave sensors
- Author
-
Fernando E. Ortiz, Dennis W. Prather, James P. Durbano, and Eric J. Kelmelis
- Subjects
Pixel ,Computer science ,business.industry ,Computation ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Optical flow ,Bottleneck ,Power (physics) ,Computer Science::Computer Vision and Pattern Recognition ,Computer vision ,Artificial intelligence ,Dither ,business ,Field-programmable gate array ,Algorithm ,Linear least squares - Abstract
Superresolution reconstruction (SR-REC) algorithms combine multiple frames captured using spatially under-sampled imagers to produce a single higher-resolution image. Sub-pixel information is gained from natural motion within the image instead of active pixel scanning (dithering/micro-scanning), eliminating the reliability issues and power consumption associated with moving parts. One of the major computational challenges associated with SR-REC methods is the estimation of the optical flow of the image (i.e., determining the unknown pixel shifts between consecutive frames). A linear least squares approximation is the simplest method for estimating the pixel movements from the captured data, but the size of the problem (directly proportional to the number of pixels in the image) creates a computational bottleneck, which in turn limits the usability of this algorithm in real-time portable systems. We propose the use of a reconfigurable platform to implement these computations in a low power/size environment, suitable for integration into portable millimeter wave imagers.
- Published
- 2007
- Full Text
- View/download PDF
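The least-squares shift estimation described in entry 47 can be illustrated in a few lines. The sketch below is not the authors' implementation: it assumes a single global translation, linearizes brightness constancy around a reference frame, and solves for the sub-pixel shift with `numpy.linalg.lstsq` (the function name `estimate_shift_lsq` and the Gaussian test image are illustrative).

```python
import numpy as np

def estimate_shift_lsq(ref, frame):
    """Estimate the global sub-pixel shift (dx, dy) between two frames.
    Brightness constancy, frame(x) ~ ref(x - d), linearizes to
    [Ix Iy] @ d ~ -(frame - ref), solved here in the least-squares sense."""
    Ix = np.gradient(ref, axis=1).ravel()   # horizontal image gradient
    Iy = np.gradient(ref, axis=0).ravel()   # vertical image gradient
    It = (frame - ref).ravel()              # temporal difference
    A = np.column_stack((Ix, Iy))           # one equation per pixel
    d, *_ = np.linalg.lstsq(A, -It, rcond=None)
    return d                                # (dx, dy)

# Usage: a smooth test image shifted by a known 0.5-pixel amount in x.
y, x = np.mgrid[0:64, 0:64]
ref = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 100.0)
shifted = np.exp(-((x - 32 - 0.5) ** 2 + (y - 32) ** 2) / 100.0)
dx, dy = estimate_shift_lsq(ref, shifted)
```

Each pixel contributes one equation, so the system grows linearly with image size; this is the computational bottleneck the abstract refers to.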
48. Reconfigurable device for enhancement of long-range imagery
- Author
-
Fernando E. Ortiz, Eric J. Kelmelis, Petersen F. Curt, and Carmen J. Carrano
- Subjects
Flexibility (engineering) ,Acceleration ,Engineering ,Speedup ,business.industry ,Electronic engineering ,Solver ,business ,Field-programmable gate array ,Reconfigurable computing ,Compensation (engineering) ,Reusability - Abstract
In this paper, we discuss the real-time compensation of air turbulence in imaging through long atmospheric paths. We propose the use of a reconfigurable hardware platform, specifically field-programmable gate arrays (FPGAs), to reduce costs and development time, as well as to increase flexibility and reusability. We present the results of our acceleration efforts to date (a 40x speedup) and our strategy for achieving a real-time atmospheric compensation solver for high-definition video signals.
- Published
- 2007
- Full Text
- View/download PDF
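Entry 48 concerns multi-frame turbulence mitigation. As a much simpler stand-in for the actual bispectrum-style processing (which is not shown here), the sketch below illustrates the basic frame-accumulation idea: register each frame to a reference by integer-pixel FFT cross-correlation, then average. All names and parameters are illustrative.

```python
import numpy as np

def align_and_average(frames):
    """Register each frame to the first via the peak of the circular
    cross-correlation (computed with FFTs), then average the aligned
    frames. Integer-pixel shifts only; a toy cousin of speckle imaging."""
    ref = frames[0]
    F_ref = np.fft.fft2(ref)
    acc = np.zeros_like(ref, dtype=float)
    for f in frames:
        # corr(s) = sum_x ref(x + s) * f(x); its peak is the shift that
        # best re-aligns f with the reference.
        corr = np.fft.ifft2(F_ref * np.conj(np.fft.fft2(f))).real
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        acc += np.roll(f, (dy, dx), axis=(0, 1))
    return acc / len(frames)

# Usage: circularly shifted copies of one image average back to the original.
rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
frames = [img,
          np.roll(img, (2, 3), axis=(0, 1)),
          np.roll(img, (5, 1), axis=(0, 1))]
restored = align_and_average(frames)
```

An FPGA implementation of even this simplified pipeline hinges on streaming 2-D FFTs, which is where reconfigurable hardware earns its speedup.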
49. An architecture for the efficient implementation of compressive sampling reconstruction algorithms in reconfigurable hardware
- Author
-
Eric J. Kelmelis, Gonzalo R. Arce, and Fernando E. Ortiz
- Subjects
Signal processing ,Compressed sensing ,Computer science ,Pipeline (computing) ,Bandwidth (signal processing) ,Sampling (statistics) ,Hardware acceleration ,Algorithm ,Wireless sensor network ,Reconfigurable computing - Abstract
According to Shannon-Nyquist theory, the number of samples required to reconstruct a signal is proportional to its bandwidth. Recently, it has been shown that acceptable reconstructions are possible from a reduced number of random samples, a process known as compressive sampling. Taking advantage of this realization has a radical impact on power consumption and communication bandwidth, which are crucial in applications based on small/mobile/unattended platforms such as UAVs and distributed sensor networks. Although the benefits of these compression techniques are self-evident, the reconstruction process requires the solution of nonlinear signal processing algorithms, which limits applicability in portable and real-time systems. In particular, (1) the power consumed by these difficult computations offsets the power savings afforded by compressive sampling, and (2) limited computational power prevents these algorithms from keeping pace with the data-capturing sensors, resulting in undesirable data loss. FPGA-based computers offer low power consumption and high computational capacity, providing a solution to both problems simultaneously. In this paper, we present an architecture that implements the algorithms central to compressive sampling in an FPGA environment. We start by studying the computational profile of the convex optimization algorithms used in compressive sampling. We then present the design of a pixel pipeline suitable for FPGA implementation, able to compute these algorithms.
- Published
- 2007
- Full Text
- View/download PDF
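The convex reconstruction problem at the heart of entry 49 is often posed as an l1-regularized least-squares (lasso) problem. The sketch below uses iterative soft-thresholding (ISTA), one standard first-order solver for that formulation; it is a plain-software illustration, not the paper's FPGA pixel pipeline, and all parameter values are illustrative.

```python
import numpy as np

def ista(A, y, lam=0.02, iters=3000):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1,
    a simple solver for the convex reconstruction step in compressive
    sampling. Each iteration is a gradient step plus a soft threshold."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - y)              # gradient of the data-fit term
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x

# Usage: recover a 5-sparse signal of length 128 from 40 random measurements.
rng = np.random.default_rng(1)
n, m, k = 128, 40, 5
x_true = np.zeros(n)
idx = rng.choice(n, k, replace=False)
x_true[idx] = rng.choice([-1.0, 1.0], k) * (1.0 + rng.random(k))
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random sampling matrix
y = A @ x_true                                  # compressed measurements
x_hat = ista(A, y)
```

Note that only 40 of the 128 nominal samples are taken, yet the sparse signal is recovered; the per-iteration cost is dominated by the two matrix-vector products, which is what an FPGA pipeline would stream.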
50. A reconfigurable self-collimation-based photonic crystal switch in silicon
- Author
-
Dennis W. Prather, Richard K. Martin, Ahmed S. Sharkawy, Eric J. Kelmelis, Caihua Chen, and Binglin Miao
- Subjects
Optics ,Materials science ,business.industry ,Electric field ,Dispersion (optics) ,Topology (electrical circuits) ,Absorption (electromagnetic radiation) ,business ,Optical switch ,Electromagnetic radiation ,Signal ,Photonic crystal - Abstract
We present a reconfigurable, compact, low-loss optical switch in silicon. The device utilizes the self-collimation properties of photonic crystal structures and provides a technique for efficiently switching an electromagnetic wave guided through a pre-engineered, dispersion-based photonic crystal self-guiding structure. The electromagnetic wave can be in either the microwave or the optical regime, depending on the constituent materials and dimensions of the photonic crystal host. We propose that the loss tangent of the dielectric material in the switching region can be modified by external commands to control the direction of propagation of the self-collimated signal and hence attain switching, thereby redirecting the light. Based on the geometrical orientation and position of the applied electric field, electromagnetic waves can be completely redirected (switched), or partially routed, toward any arbitrary direction on a Manhattan grid or network. We have found that the induced loss does not significantly attenuate the waves switched in any direction. The structure presented can be generalized to an arbitrary N-by-M interconnected switching network or fabric, in which the switching topology can be dynamically modulated by the application of external fields. To attain switching, the free-carrier absorption loss of Si is controlled by carrier injection from a forward-biased PN junction. The concept device is designed and analyzed using FastFDTD.
- Published
- 2007
- Full Text
- View/download PDF
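FastFDTD is the authors' solver and is not reproduced here. As a generic illustration of the underlying FDTD method and of loss-based switching, the sketch below runs a textbook 1-D Yee-style leapfrog update in which a region of the grid can be made conductive, absorbing a pulse much as free-carrier absorption does in the device above. All grid sizes and coefficients are illustrative and bear no relation to the actual design.

```python
import numpy as np

def fdtd_1d(sigma, steps=600, n=400, loss_start=250):
    """Minimal 1-D FDTD in normalized units (Courant number 0.5).
    Cells beyond loss_start carry a conductivity term sigma; a pulse
    entering that region is attenuated, which is the switching mechanism."""
    ez = np.zeros(n)
    hy = np.zeros(n)
    ca = np.ones(n)                 # per-step E-field decay factor
    cb = np.full(n, 0.5)            # curl coefficient (Courant = 0.5)
    ca[loss_start:] = (1 - sigma) / (1 + sigma)
    cb[loss_start:] = 0.5 / (1 + sigma)
    for t in range(steps):
        hy[:-1] += 0.5 * (ez[1:] - ez[:-1])                 # H update
        ez[1:] = ca[1:] * ez[1:] + cb[1:] * (hy[1:] - hy[:-1])  # E update
        ez[50] += np.exp(-((t - 40) / 12.0) ** 2)           # soft Gaussian source
    return ez

# Peak field beyond the switchable region, with the loss off and on.
peak_off = np.abs(fdtd_1d(sigma=0.0)[260:]).max()
peak_on = np.abs(fdtd_1d(sigma=0.05)[260:]).max()
```

With the loss enabled, the pulse is strongly attenuated before it can traverse the region, mimicking the "off" state of one switch arm.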