Compile-Time and Instruction-Set Methods for Improving Floating- to Fixed-Point Conversion Accuracy.
- Source :
- ACM Transactions on Embedded Computing Systems; Apr 2008, Vol. 7 Issue 3, p26-26:27, 27p, 4 Diagrams, 7 Charts, 8 Graphs
- Publication Year :
- 2008
Abstract
- This paper proposes and evaluates compile-time and instruction-set techniques for improving the accuracy of signal-processing algorithms run on fixed-point embedded processors. These techniques are proposed in the context of a profile-guided, compiler-based floating- to fixed-point conversion process. A novel fixed-point scaling algorithm (IRP) is introduced that exploits correlations between values in a program by applying fixed-point scaling, retaining as much precision as possible without causing overflow. This approach is extended into a more aggressive scaling algorithm (IRP-SA) by leveraging the modulo nature of 2's complement addition and subtraction to discard most significant bits that may not be redundant sign-extension bits. A complementary scaling technique (IDS) is then proposed that enables the fixed-point scaling of a variable to be parameterized, depending upon the context of its definitions and uses. Finally, a novel instruction-set enhancement--fractional multiplication with internal left shift (FMLS)--is proposed to further leverage interoperand correlations uncovered by the IRP-SA scaling algorithm. FMLS preserves a different subset of the full product's bits than traditional fractional fixed-point or integer multiplication. On average, FMLS combined with IRP-SA improves accuracy on processors with uniform bitwidth register architectures by the equivalent of 0.61 bits of additional precision for a set of signal-processing benchmarks (up to 2 bits). Even without employing FMLS, the IRP-SA scaling algorithm achieves additional accuracy over two previous fixed-point scaling algorithms by averages of 1.71 and 0.49 bits. Furthermore, as FMLS combines multiplication with a scaling shift, it reduces execution time by an average of 9.8%. An implementation of IDS, specialized to single-nested loops, is found to improve accuracy of a lattice filter benchmark by the equivalent of more than 16 bits of precision. [ABSTRACT FROM AUTHOR]
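Two ideas in the abstract can be illustrated with a small sketch: the modulo behaviour of 2's complement addition that IRP-SA relies on, and the way a multiply with an internal left shift keeps a different window of the full product than a conventional fractional multiply. The code below is an illustration only, not the paper's definition. The function names (`to_signed`, `fractional_mul`, `fmls`), the 16-bit word size, the Q15 format, and the exact bit window chosen are all assumptions made for this example.

```python
BITS = 16  # assumed register width for this illustration


def to_signed(value, bits=BITS):
    """Wrap an integer into the two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def fractional_mul(a, b, bits=BITS):
    """Conventional fractional multiply (assumed Q15 x Q15 -> Q15):
    drop the low bits-1 bits of the full product, keep one word."""
    return to_signed((a * b) >> (bits - 1), bits)


def fmls(a, b, shift, bits=BITS):
    """Hypothetical fractional multiply with internal left shift:
    the kept window sits `shift` bits lower in the full product, so
    `shift` extra low-order bits survive. The caller must know that
    the discarded top bits are not needed."""
    return to_signed((a * b) >> (bits - 1 - shift), bits)


# Modulo property: an intermediate sum may overflow, yet the final
# result is exact as long as it fits in the word.
partial = to_signed(30000 + 10000)   # overflows 16 bits and wraps
total = to_signed(partial - 20000)   # wraps back into range

# Small operands (about 0.1 in Q15): shifting after the multiply
# cannot restore bits that the multiply already discarded.
x = 3277
late_shift = fractional_mul(x, x) << 3
internal_shift = fmls(x, x, 3)
```

In this example the wrapped partial sum is -25536, but the final total is the exact value 20000. For the product, shifting after the conventional multiply gives 2616, while the internal shift gives 2621, which is closer to the scaled full-precision product.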
Details
- Language :
- English
- ISSN :
- 15399087
- Volume :
- 7
- Issue :
- 3
- Database :
- Complementary Index
- Journal :
- ACM Transactions on Embedded Computing Systems
- Publication Type :
- Academic Journal
- Accession number :
- 32203919
- Full Text :
- https://doi.org/10.1145/1347375.1347379