27 results for "Reconfigurable hardware"
Search Results
2. Universal Gaussian elimination hardware for cryptographic purposes.
- Author
-
Hu, Jingwei, Wang, Wen, Gaj, Kris, Chen, Donglong, and Wang, Huaxiong
- Abstract
In this paper, we investigate the possibility of performing Gaussian elimination for arbitrary binary matrices on hardware. In particular, we present a generic approach for hardware-based Gaussian elimination that can process both non-singular and singular matrices. Previous works on hardware-based Gaussian elimination can only process non-singular ones. However, a plethora of cryptosystems, for instance the quantum-safe key encapsulation mechanisms ROLLO and RQC, which are based on rank-metric codes and are among the NIST post-quantum cryptography standardization round-2 candidates, require performing Gaussian elimination on random matrices regardless of singularity. We accordingly implemented an optimized and parameterized Gaussian eliminator for (singular) matrices over binary fields, making the intense computation of linear algebra feasible and efficient on hardware. To the best of our knowledge, this work is the first to eliminate a singular matrix on reconfigurable hardware, and it also describes a generic hardware architecture for rank-code-based cryptographic schemes. The experimental results suggest hardware-based Gaussian elimination can be done in linear time regardless of the matrix type. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
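The abstract above turns on eliminating binary matrices that may be singular. As a minimal software sketch of that idea (not the paper's hardware architecture), the Python below row-reduces a matrix over GF(2) and skips any column that has no pivot instead of failing. The function name `gf2_row_reduce` and the bit-packed row encoding are assumptions made for illustration.

```python
def gf2_row_reduce(rows, ncols):
    """Reduce a binary matrix to reduced row echelon form over GF(2).

    Each row is a Python int; bit (ncols - 1 - j) holds column j.
    Returns (reduced_rows, rank). A singular matrix is handled by
    skipping columns in which no pivot can be found.
    """
    rows = list(rows)
    rank = 0
    for col in range(ncols):
        mask = 1 << (ncols - 1 - col)
        # Look for a row at or below `rank` with a 1 in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r] & mask), None)
        if pivot is None:
            continue  # dependent column: no pivot, move to the next column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear the column in every other row; XOR is addition in GF(2).
        for r in range(len(rows)):
            if r != rank and rows[r] & mask:
                rows[r] ^= rows[rank]
        rank += 1
    return rows, rank


# Hypothetical inputs: the third row of `singular` is the XOR of the first two.
singular = [0b110, 0b011, 0b101]
invertible = [0b100, 0b110, 0b111]
```

Calling `gf2_row_reduce(singular, 3)` reports rank 2 and leaves one all-zero row, while the invertible input reduces to the identity.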
3. Breaking TrustZone memory isolation and secure boot through malicious hardware on a modern FPGA-SoC.
- Author
-
Gross, Mathieu, Jacob, Nisha, Zankl, Andreas, and Sigl, Georg
- Abstract
FPGA-SoCs are heterogeneous embedded computing platforms consisting of reconfigurable hardware and high-performance processing units. This combination offers flexibility and good performance for the design of embedded systems. However, allowing the sharing of resources between an FPGA and an embedded CPU enables possible attacks from one system on the other. This work demonstrates that a malicious hardware block contained inside the reconfigurable logic can manipulate the memory and peripherals of the CPU. Previous works have already considered direct memory access attacks from malicious logic on platforms containing no memory isolation mechanism. In this work, such attacks are investigated on a modern platform which contains state-of-the-art memory and peripherals isolation mechanisms. We demonstrate two attacks capable of compromising a Trusted Execution Environment based on ARM TrustZone and show a new attack capable of bypassing the secure boot configuration set by a device owner via the manipulation of Battery-Backed RAM and eFuses from malicious logic. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
4. Disk encryption: do we need to preserve length?
- Author
-
Chakraborty, Debrup, López, Cuauhtemoc Mancillas, and Sarkar, Palash
- Abstract
Over the last decade and a half there has been a lot of activity toward the development of cryptographic techniques for disk encryption. It has been almost canonized that an encryption scheme suitable for disk encryption must be length preserving, i.e., this rules out schemes such as authenticated encryption, where an authentication tag is produced as part of the ciphertext, resulting in ciphertexts that are longer than the corresponding plaintexts. The notion of a tweakable enciphering scheme (TES) has been formalized as the appropriate primitive for disk encryption, and it has been argued that TESs provide the maximum security possible for a tagless scheme. On the other hand, TESs are less efficient than some existing authenticated encryption schemes. Also, TESs cannot provide true authentication as they do not have authentication tags. In this paper, we analyze the possibility of using encryption schemes with length expansion for the purpose of disk encryption. On the negative side, we argue that nonce-based authenticated encryption schemes are not appropriate for this application. On the positive side, we demonstrate that deterministic authenticated encryption (DAE) schemes may have more advantages than disadvantages compared to a TES when used for disk encryption. Finally, we propose a new deterministic authenticated encryption scheme called BCTR which is suitable for this purpose. We provide the full specification of BCTR, prove its security and also report an efficient implementation in reconfigurable hardware. Our experiments suggest that BCTR performs significantly better than existing TESs and existing DAE schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
5. An efficient hardware accelerator for NTT-based polynomial multiplication using FPGA.
- Author
-
Salarifard, Raziyeh and Soleimany, Hadi
- Abstract
The number theoretic transform (NTT) is used to efficiently execute polynomial multiplication. It has become an important part of lattice-based post-quantum methods and the subsequent generation of standard cryptographic systems. However, implementing post-quantum schemes is challenging since they rely on intricate structures. This paper demonstrates how to develop a high-speed NTT multiplier highly optimized for FPGAs with few logical resources. We describe a novel architecture for NTT that leverages unique precomputation. Our method efficiently maps these specific pre-computed values into the built-in Block RAMs, which greatly reduces the area and time required for implementation when compared to previous works. We have chosen Kyber parameters to implement the proposed architectures. Compared to the most well-known approach for implementing Kyber's polynomial multiplication using NTT, the AC (area × latency) is reduced by 33%, and AT (area × time) is improved by 18% as a result of the pre-computation we suggest in this study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
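To make the NTT-based multiplication mentioned in the abstract above concrete, here is a toy Python sketch of negacyclic polynomial multiplication in Z_q[x]/(x^n + 1). The parameters (q = 17, n = 8, psi = 3) are small illustrative assumptions, not Kyber's, and the transform is a naive O(n^2) evaluation rather than the paper's architecture. The result is checked against schoolbook multiplication.

```python
Q, N, PSI = 17, 8, 3  # toy values; PSI is a primitive 2N-th root of unity mod Q


def ntt(a, root):
    """Naive O(N^2) transform: evaluate polynomial a at powers of `root` mod Q."""
    return [sum(a[j] * pow(root, i * j, Q) for j in range(N)) % Q for i in range(N)]


def negacyclic_mul_ntt(a, b):
    """Multiply a and b modulo (x^N + 1, Q) using pointwise products of transforms."""
    omega = PSI * PSI % Q
    # Scaling coefficient i by PSI^i turns the negacyclic product into a cyclic one.
    a_s = [a[i] * pow(PSI, i, Q) % Q for i in range(N)]
    b_s = [b[i] * pow(PSI, i, Q) % Q for i in range(N)]
    fc = [x * y % Q for x, y in zip(ntt(a_s, omega), ntt(b_s, omega))]
    # Inverse transform: use omega^-1, then scale by N^-1 and PSI^-i.
    c_s = ntt(fc, pow(omega, -1, Q))  # pow(x, -1, m) needs Python 3.8+
    n_inv, psi_inv = pow(N, -1, Q), pow(PSI, -1, Q)
    return [c_s[i] * n_inv * pow(psi_inv, i, Q) % Q for i in range(N)]


def negacyclic_mul_schoolbook(a, b):
    """Reference O(N^2) product with the wrap-around sign flip x^N = -1."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                c[i + j] = (c[i + j] + a[i] * b[j]) % Q
            else:
                c[i + j - N] = (c[i + j - N] - a[i] * b[j]) % Q
    return c
```

The powers of `PSI` and `omega` are recomputed on every call here for brevity. A hardware design would more plausibly read such constants from a precomputed table, which is the kind of data the abstract describes mapping into Block RAM.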
6. Streamlined NTRU Prime on FPGA.
- Author
-
Peng, Bo-Yuan, Marotzke, Adrian, Tsai, Ming-Han, Yang, Bo-Yin, and Chen, Ho-Lin
- Abstract
We present a novel full hardware implementation of Streamlined NTRU Prime, with two variants: a high-speed, high-area implementation and a slower, low-area implementation. We introduce several new techniques that improve performance, including a batch inversion for key generation, a high-speed schoolbook polynomial multiplier, an NTT polynomial multiplier combined with a CRT map, a new DSP-free modular reduction method, a high-speed radix sorting module, and new encoders and decoders. With the high-speed design, we achieve the fastest speeds to date for Streamlined NTRU Prime, with speeds of 5007, 10,989, and 64,026 cycles for encapsulation, decapsulation, and key generation, respectively, while running at 285 MHz on a Xilinx Zynq Ultrascale+. The entire design uses 40,060 LUTs, 26,384 flip-flops, 36.5 BRAMs, and 31 DSPs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. Low area-time complexity point multiplication architecture for ECC over GF(2^m) using polynomial basis.
- Author
-
Nadikuda, Pradeep Kumar Goud and Boppana, Lakshmi
- Abstract
In the present day, billions of devices communicate over wireless networks. The massive amount of information transmitted over the open-ended and unsecured Internet architecture results in eavesdropping on private, sensitive and confidential information. Therefore, it is necessary to incorporate data encryption techniques when communicating any sensitive information. Public key cryptography is one of the most widely used data encryption techniques, and elliptic curve cryptography (ECC) is the most sought-after public key cryptographic algorithm. The efficiency of ECC depends on a series of hierarchical finite field operations, and point multiplication is one of the most time-critical and resource-consuming ECC operations. Point multiplication involves a substantial number of multiplications, additions and inversion operations over finite fields of higher orders. In this article, we present a point multiplication architecture developed for a modified Montgomery-ladder algorithm. A digit-serial multiplier is employed to perform multiplication in the realization of the modified Montgomery-ladder algorithm. The area and time complexities of the proposed elliptic curve point multiplication (ECPM) architecture are computed for the irreducible pentanomial GF(2^163) and the irreducible trinomial GF(2^233), targeting a Virtex-5 (XC5VLX110) FPGA, and compared with similar architectures available in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
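The digit-serial multiplier mentioned in the abstract above can be sketched in software as follows. This is only an illustration of the arithmetic, not the paper's circuit: it assumes the NIST reduction pentanomial x^163 + x^7 + x^6 + x^3 + 1 for GF(2^163), and the names `gf2m_mul` and `gf2m_reduce` and the default digit width of 4 are hypothetical.

```python
M = 163
# Assumed reduction polynomial: x^163 + x^7 + x^6 + x^3 + 1 (NIST binary field).
POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_reduce(v):
    """Reduce a binary polynomial (packed in an int) modulo POLY."""
    while v.bit_length() > M:
        v ^= POLY << (v.bit_length() - 1 - M)
    return v


def gf2m_mul(a, b, digit=4):
    """Digit-serial multiply in GF(2^M): consume `digit` bits of b per step.

    Digits of b are processed most significant first (Horner's rule):
    acc = acc * x^digit + a * chunk, reduced after every step.
    """
    acc = 0
    ndigits = (M + digit - 1) // digit
    for d in reversed(range(ndigits)):
        chunk = (b >> (d * digit)) & ((1 << digit) - 1)
        partial = 0
        for i in range(digit):
            if (chunk >> i) & 1:
                partial ^= a << i  # carry-less partial product a * chunk
        acc = gf2m_reduce((acc << digit) ^ partial)
    return acc
```

A wider `digit` means fewer iterations but a larger partial-product step, which mirrors the area versus latency trade-off that digit-serial hardware multipliers expose.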
8. Rethinking modular multi-exponentiation in real-world applications.
- Author
-
Attias, Vidal, Vigneri, Luigi, and Dimitrov, Vassil
- Abstract
The importance of efficient multi-exponentiation algorithms in a large spectrum of cryptographic applications continues to grow. Previous literature on the subject focuses exclusively on minimizing the number of modular multiplications. However, a small reduction of the multiplicative complexity can be easily overshadowed by other figures of merit. In this article, we demonstrate that the most efficient algorithm for computing multi-exponentiation changes if considering execution time instead of number of multi-exponentiations. We focus our work on two algorithms that perform best under the number of multi-exponentiation metric and show that some side operations affect their theoretical ranking. We provide this analysis on different hardware, such as Intel Core and ARM CPUs and the two latest generations of Raspberry Pis, to show how the machine chosen affects the execution time of multi-exponentiation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
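For readers unfamiliar with multi-exponentiation, the sketch below shows one classic approach, simultaneous square-and-multiply (often called Shamir's trick), for computing g^a · h^b mod n. The abstract does not name the two algorithms it studies, so this is an assumed illustrative example rather than one of them; the function name `multi_exp` is hypothetical.

```python
def multi_exp(g, a, h, b, n):
    """Compute (g**a * h**b) % n with one shared chain of squarings."""
    # Precompute the product selected by each non-zero pair of exponent bits.
    table = {(1, 0): g % n, (0, 1): h % n, (1, 1): g * h % n}
    result = 1
    for i in reversed(range(max(a.bit_length(), b.bit_length()))):
        result = result * result % n
        bits = ((a >> i) & 1, (b >> i) & 1)
        if bits != (0, 0):
            result = result * table[bits] % n
    return result
```

The two exponents share every squaring, so the multiplication count drops compared with two separate exponentiations. Table lookups and other side operations still cost time, which is the kind of effect the abstract argues the multiplication count alone does not capture.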
9. A new read–write collision-based SRAM PUF implemented on Xilinx FPGAs.
- Author
-
Cicek, Ihsan and Al Khas, Ahmad
- Abstract
Physically unclonable functions (PUFs) are device-specific digital fingerprints derived from physical properties. They are used in critical cryptographic applications, including unique ID generation, key generation, and challenge-response-based authentication. The advantages of low implementation cost and robust operation render the PUF an indispensable component for secure embedded systems. In the last decade, SRAM-based PUFs have become very popular in the ASIC industry due to increasing security demand against cloning and counterfeiting. However, their use in FPGA applications is limited, since it is almost impossible to power-cycle the SRAMs after the FPGA device is configured. Additionally, FPGA vendors usually clear the SRAM contents after power-on, which also makes it hard to implement an SRAM PUF. In this work, we propose a new approach for designing SRAM-based PUFs on Xilinx FPGAs. The proposed PUF is based on the idea of triggering a collision between reading and writing operations in a block-RAM to generate random responses induced by timing violation instead of power cycling. We have integrated the proposed PUF as an AXI peripheral with a synthesizable processor core for data acquisition. The design has been tested on 10 different Xilinx Artix-7 devices of the same type, and the acquired data were tested for reliability, uniqueness, bit-aliasing, and uniformity properties. On average, the proposed PUF achieved 93% reliability (at 55 °C), 37% uniqueness, 47% bit-aliasing, and 55% uniformity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
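The four quality figures quoted in the abstract above are usually computed from Hamming distances and bit counts. The sketch below uses the commonly cited definitions, which the abstract itself does not spell out, so treat the formulas and function names as assumptions. Responses are modeled as equal-length lists of 0/1 bits.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def uniformity(response):
    """Fraction of 1 bits in a single device response (ideal: 0.5)."""
    return sum(response) / len(response)


def uniqueness(responses):
    """Mean fractional Hamming distance over all device pairs (ideal: 0.5)."""
    pairs = list(combinations(responses, 2))
    return sum(hamming(a, b) / len(a) for a, b in pairs) / len(pairs)


def bit_aliasing(responses):
    """Per bit position, the fraction of devices that output a 1 (ideal: 0.5)."""
    return [sum(column) / len(responses) for column in zip(*responses)]


def reliability(reference, remeasurements):
    """One minus the mean fractional distance to a reference (ideal: 1.0)."""
    total = sum(hamming(reference, r) / len(reference) for r in remeasurements)
    return 1 - total / len(remeasurements)
```

With these definitions, a uniqueness well below 0.5 indicates that responses from different devices are more alike than independent random bit strings would be.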
10. Six shades lighter: a bit-serial implementation of the AES family.
- Author
-
Roldán Lombardía, Sergio, Balli, Fatih, and Banik, Subhadeep
- Abstract
Recently, cryptographic literature has seen new block cipher designs such as PRESENT, GIFT or SKINNY that aim to be more lightweight than the current standard, i.e., AES. Even though the AES family of block ciphers was designed two decades ago, it still remains the de facto encryption standard, with AES-128 being the most widely deployed variant. In this work, we revisit the combined one-in-all implementation of the AES family, namely both encryption and decryption of each of AES-128/192/256 as a single ASIC circuit. A preliminary version appeared in Africacrypt 2019 by Balli and Banik, where the authors design a byte-serial circuit with such functionality. We improve on their work by reducing the size of the compact circuit to 2268 GE through a 1-bit-serial implementation, which achieves a 38% reduction in area. We also report stand-alone bit-serial versions of the circuit, targeting only a subset of modes and versions, e.g., AES-192 and AES-256. Our results imply that, in terms of area, AES-192 and AES-256 can easily compete with the larger members of the recently designed SKINNY family, e.g., SKINNY-128-256, SKINNY-128-384. Thus, our implementations can be used interchangeably inside authenticated encryption candidates such as SKINNY-AEAD/-HASH, ForkAE or Romulus in place of SKINNY. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
11. Exploring Parallelism to Improve the Performance of FrodoKEM in Hardware.
- Author
-
Howe, James, Martinoli, Marco, Oswald, Elisabeth, and Regazzoni, Francesco
- Abstract
FrodoKEM is a lattice-based key encapsulation mechanism, currently a semi-finalist in NIST's post-quantum standardisation effort. A condition for these candidates is to use NIST standards for sources of randomness (i.e. seed-expanding), and as such most candidates utilise SHAKE, an XOF defined in the SHA-3 standard. However, for many of the candidates, this module is a significant implementation bottleneck. Trivium is a lightweight, ISO standard stream cipher which performs well in hardware and has been used in previous hardware designs for lattice-based cryptography. This research proposes optimised designs for FrodoKEM, concentrating on high throughput by parallelising the matrix multiplication operations within the cryptographic scheme. This process is eased by the use of Trivium due to its higher throughput and lower area consumption. The parallelisations proposed also complement the addition of first-order masking to the decapsulation module. Overall, we significantly increase the throughput of FrodoKEM; for encapsulation we see a 16× speed-up, achieving 825 operations per second, and for decapsulation we see a 14× speed-up, achieving 763 operations per second, compared to the previous state of the art, whilst also maintaining a similar FPGA area footprint of less than 2000 slices. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
12. Review of error correction for PUFs and evaluation on state-of-the-art FPGAs.
- Author
-
Hiller, Matthias, Kürzinger, Ludwig, and Sigl, Georg
- Abstract
Efficient error correction and key derivation is a prerequisite to generate secure and reliable keys from PUFs. The most common methods can be divided into linear schemes and pointer-based schemes. This work compares the performance of several previous designs on an algorithmic level concerning the required number of PUF response bits, helper data bits, number of clock cycles, and FPGA slices for two scenarios. One targets the widely used key error probability of 10^-6, while the other one requires a key error probability of 10^-9. In addition, we provide a wide span of new implementation results on state-of-the-art Xilinx FPGAs and set them in context to old synthesis results on legacy FPGAs. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
13. Kite attack: reshaping the cube attack for a flexible GPU-based maxterm search.
- Author
-
Cianfriglia, Marco, Guarino, Stefano, Bernaschi, Massimo, Lombardi, Flavio, and Pedicini, Marco
- Abstract
Dinur and Shamir's cube attack has attracted significant attention in the literature. Nevertheless, the lack of implementations achieving effective results casts doubts on its practical relevance. On the theoretical side, promising results have been recently achieved leveraging on division trails. The present paper follows a more practical approach and aims at giving new impetus to this line of research by means of a cipher-independent flexible framework that is able to carry out the cube attack on GPU/CPU clusters. We address all issues posed by a GPU implementation, providing evidence in support of parallel variants of the attack and identifying viable directions for solving open problems in the future. We report the results of running our GPU-based cube attack against round-reduced versions of three well-known ciphers: Trivium, Grain-128 and SNOW 3G. Our attack against Trivium improves the state of the art, permitting full key recovery for Trivium reduced to (up to) 781 initialization rounds (out of 1152) and finding the first-ever maxterm after 800 rounds. In this paper, we also present the first standard cube attack (i.e., neither dynamic nor tester) to yield maxterms for Grain-128 up to 160 initialization rounds on non-programmable hardware. We include a thorough evaluation of the impact of system parameters and GPU architecture on the performance. Moreover, we demonstrate the scalability of our solution on multi-GPU systems. We believe that our extensive set of results can be useful for the cryptographic engineering community at large and can pave the way to further results in the area. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
14. Karatsuba-like formulae and their associated techniques.
- Author
-
Cenk, Murat
- Abstract
Efficient polynomial multiplication formulae are required for cryptographic computation. From elliptic curve cryptography to homomorphic encryption, many cryptographic systems need efficient multiplication formulae. The most widely used multiplication formulae for cryptographic systems are the Karatsuba-like polynomial multiplication formulae. In this paper, these formulae are introduced, along with Montgomery's work yielding more efficient formulae of this kind. Moreover, recent efforts to improve these results are discussed by presenting associated techniques. The state of the art for this area is also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
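As a reminder of what a Karatsuba-like formula does, the sketch below multiplies two polynomials by splitting each in half and using three half-size products instead of four. It shows only the basic two-way recursion over the integers, not the refined formulae surveyed in the paper, and it assumes equal-length inputs whose length is a power of two.

```python
def schoolbook(a, b):
    """Reference product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def karatsuba(a, b):
    """Two-way Karatsuba; len(a) == len(b) must be a power of two."""
    n = len(a)
    if n == 1:
        return [a[0] * b[0]]
    h = n // 2
    a0, a1, b0, b1 = a[:h], a[h:], b[:h], b[h:]
    lo = karatsuba(a0, b0)                      # a0 * b0
    hi = karatsuba(a1, b1)                      # a1 * b1
    mid = karatsuba([x + y for x, y in zip(a0, a1)],
                    [x + y for x, y in zip(b0, b1)])
    # (a0 + a1)(b0 + b1) - a0*b0 - a1*b1 = a0*b1 + a1*b0
    cross = [m - l - u for m, l, u in zip(mid, lo, hi)]
    out = [0] * (2 * n - 1)
    for i, v in enumerate(lo):
        out[i] += v
    for i, v in enumerate(cross):
        out[i + h] += v
    for i, v in enumerate(hi):
        out[i + 2 * h] += v
    return out
```

For example, `karatsuba([1, 2, 3, 4], [5, 6, 7, 8])` matches the schoolbook product while performing 9 coefficient multiplications instead of 16.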
15. Spectral arithmetic in Montgomery modular multiplication.
- Author
-
Dai, Wangchen and Cheung, Ray C. C.
- Abstract
Modular multiplication is considered to be the most computation-intensive operation for cryptographic algorithms involving large operands, such as RSA and Diffie-Hellman. Their key sizes have been increased significantly in recent decades to provide sufficient cryptographic strength. Thus, highly efficient large-integer modular multiplication algorithms are in demand. Montgomery modular multiplication (MMM) integrated with spectral arithmetic can be a suitable solution. This is because MMM eliminates the time-consuming trial division, while spectral arithmetic can speed up integer multiplication from quadratic time to linearithmic time. This survey paper introduces the development of spectral-based MMM, as well as its two important properties: high parallelism and low complexity. Besides, different algorithms are explored to demonstrate how each of them benefits the modular multiplication. Moreover, we also compare these algorithms in terms of digit-level complexity and provide general ideas about algorithm selection when implementing modular multiplication with 1024-bit operand size and above. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
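The claim that Montgomery multiplication avoids trial division can be illustrated with a short sketch of Montgomery reduction (REDC). It covers only the plain integer version, without the spectral (transform-based) multiplication the survey is about. The helper names are hypothetical, and the modulus is assumed to be odd and smaller than 2^k.

```python
def montgomery_setup(n, k):
    """Return n' with n * n' ≡ -1 (mod 2^k); n must be odd and n < 2^k."""
    r = 1 << k
    return (-pow(n, -1, r)) % r  # pow(x, -1, m) needs Python 3.8+


def redc(t, n, k, n_prime):
    """Montgomery reduction: return t * 2^-k mod n for 0 <= t < n * 2^k.

    Only masks, shifts, multiplications and one conditional subtraction
    are used; no division by n takes place.
    """
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k  # exact: the low k bits of t + m*n are zero
    return u - n if u >= n else u


def mont_mul(a, b, n, k):
    """Compute a * b mod n through the Montgomery domain (for illustration)."""
    n_prime = montgomery_setup(n, k)
    r = 1 << k
    a_bar, b_bar = a * r % n, b * r % n   # conversion into the Montgomery domain
    prod_bar = redc(a_bar * b_bar, n, k, n_prime)
    return redc(prod_bar, n, k, n_prime)  # conversion back out of the domain
```

The conversions into the Montgomery domain use `%` here for brevity. In a long exponentiation they are paid once, while every intermediate product needs only `redc`.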
16. A review of lightweight block ciphers.
- Author
-
Hatzivasilis, George, Fysarakis, Konstantinos, Papaefstathiou, Ioannis, and Manifavas, Charalampos
- Abstract
Embedded systems are deployed in various domains, including industrial installations, critical and nomadic environments, private spaces and public infrastructures. Their operation typically involves access, storage and communication of sensitive and/or critical information that requires protection, making the security of their resources and services an imperative design concern. The demand for applicable cryptographic components is therefore strong and growing. However, the limited resources of these devices, in conjunction with the ever-present need for smaller size and lower production costs, hinder the deployment of secure algorithms typically found in other environments and necessitate the adoption of lightweight alternatives. This paper provides a survey of lightweight cryptographic algorithms, presenting recent advances in the field and identifying opportunities for future research. More specifically, we examine lightweight implementations of symmetric-key block ciphers in hardware and software architectures. We evaluate 52 block ciphers and 360 implementations based on their security, performance and cost, classifying them with regard to their applicability to different types of embedded devices and referring to the most important cryptanalysis pertaining to these ciphers. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
17. Arithmetic coding and blinding countermeasures for lattice signatures.
- Author
-
Saarinen, Markku-Juhani O.
- Abstract
We describe new arithmetic coding techniques and side-channel blinding countermeasures for lattice-based cryptography. Using these techniques, we develop a practical, compact, and more quantum-resistant variant of the BLISS Ideal Lattice Signature Scheme. We first show how the BLISS parameters and hash-based random oracle can be modified to be more secure against quantum pre-image attacks while optimizing signature size. Arithmetic Coding offers an information theoretically optimal compression for stationary and memoryless sources, such as the discrete Gaussian distributions often present in lattice-based cryptography. We show that this technique gives better signature sizes than the previously proposed advanced Huffman-based signature compressors. We further demonstrate that arithmetic decoding from a uniform source to a target distribution is also an optimal non-uniform sampling method in the sense that a minimal amount of true random bits is required. Performance of this new Binary Arithmetic Coding sampler is comparable to other practical samplers. The same code, tables, or circuitry can be utilized for both tasks, eliminating the need for separate sampling and compression components. We then describe simple randomized blinding techniques that can be applied to anti-cyclic polynomial multiplication to mask timing and power consumption side-channels in ring arithmetic. We further show that the Gaussian sampling process can also be blinded by a split-and-permute technique as an effective countermeasure against side-channel attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
18. An overview of hardware-level statistical power analysis attack countermeasures.
- Author
-
Mayhew, Matthew and Muresan, Radu
- Abstract
While the cryptographic modules used in modern embedded systems may employ mathematically secure algorithms, an attacker may still be able to compromise the security of a design using side-channel analysis. Side-channel attacks use leaked information in order to make inferences regarding the value of the secret key used for encryption. Statistical power analysis attacks are a class of side-channel attack which target power consumption as a leakage vector and apply statistical analysis to collected traces. As these attacks have been proven to be effective on a variety of hardware implementations, there exists a corresponding body of research regarding countermeasures. This work examines several statistical power analysis attack countermeasures in the literature and groups them into three broad categories consisting of secure logic styles, alterations to existing functional modules, and the inclusion of additional modules designed to enhance security. While a variety of options are available to a designer, there will always be a corresponding trade-off in terms of overhead factors like additional power consumption and area. As such, this work seeks to document and classify several of the approaches presented in the literature in order to help designers better select a countermeasure suited to their needs. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
19. Masking ring-LWE.
- Author
-
Reparaz, Oscar, Roy, Sujoy, Clercq, Ruan, Vercauteren, Frederik, and Verbauwhede, Ingrid
- Abstract
In this paper, we propose a masking scheme to protect ring-LWE decryption from first-order side-channel attacks. In an unprotected ring-LWE decryption, the recovered plaintext is computed by first performing polynomial arithmetic on the secret key and then decoding the result. We mask the polynomial operations by arithmetically splitting the secret key polynomial into two random shares; the final decoding operation is performed using a new bespoke masked decoder. The outputs of our masked ring-LWE decryption are Boolean shares suitable for derivation of a symmetric key. Thus, the masking scheme keeps all intermediates, including the recovered plaintext, in the masked domain. We have implemented the masking scheme on both hardware and software. On a Xilinx Virtex-II FPGA, the masked ring-LWE processor requires around 2000 LUTs, a 20% increase in the area with respect to the unprotected architecture. A masked decryption operation takes 7478 cycles, which is only a factor 2.6× larger than the unprotected decryption. On a 32-bit ARM Cortex-M4F processor, the masked software implementation costs around 5.2× more cycles than the unprotected implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
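The arithmetic splitting of the secret key described in the abstract above relies on polynomial multiplication being linear: multiplying each share separately and adding the results gives the same value as multiplying the unsplit secret. The sketch below demonstrates just that property in a toy ring Z_q[x]/(x^n + 1). The parameters q = 7681 and n = 16 and all function names are illustrative assumptions, and the paper's masked decoder is not shown.

```python
import secrets

Q, N = 7681, 16  # toy ring parameters, chosen only for illustration


def poly_mul(a, b):
    """Schoolbook product in Z_Q[x]/(x^N + 1)."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                c[i + j] = (c[i + j] + a[i] * b[j]) % Q
            else:
                c[i + j - N] = (c[i + j - N] - a[i] * b[j]) % Q
    return c


def split_secret(s):
    """Split s into two shares with s = s1 + s2 (mod Q); s1 is uniformly random."""
    s1 = [secrets.randbelow(Q) for _ in s]
    s2 = [(x - r) % Q for x, r in zip(s, s1)]
    return s1, s2


def masked_mul(c, shares):
    """Multiply the public polynomial c by each share independently."""
    return tuple(poly_mul(c, share) for share in shares)


def recombine(shares):
    """Add shares coefficient-wise (done here only to check correctness)."""
    return [sum(coeffs) % Q for coeffs in zip(*shares)]
```

In a masked implementation the shares would not be recombined before decoding. `recombine` exists here purely to confirm that `masked_mul` preserves the product.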
20. Leakage assessment methodology.
- Author
-
Schneider, Tobias and Moradi, Amir
- Abstract
Prompted by the increasing need to integrate side-channel countermeasures into security-enabled commercial devices, evaluation labs are seeking a standard approach that enables a fast, reliable and robust evaluation of the side-channel vulnerability of the given products. To this end, standardization bodies such as NIST intend to establish a leakage assessment methodology fulfilling these demands. One such proposal is Welch's t-test, which is being put forward by Cryptography Research Inc. and is able to relax the dependency between the evaluations and the device's underlying architecture. In this work, we deeply study the theoretical background of the test's different flavors and present a roadmap which can be followed by the evaluation labs to efficiently and correctly conduct the tests. More precisely, we express a stable, robust and efficient way to perform the tests at higher orders. Further, we extend the test to multivariate settings and provide details on how to efficiently and rapidly carry out such a multivariate higher-order test. Including a suggested methodology to collect the traces for these tests, we point out practical case studies where different types of t-tests can exhibit the leakage of supposedly secure designs. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
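For reference, the first-order Welch's t statistic named in the abstract above compares the means of two sample sets without assuming equal variances. The sketch below computes it at a single point in time from two lists of measurements. It is a plain textbook formula, not the paper's incremental or higher-order procedure, and the function name `welch_t` is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two samples with possibly unequal variances.

    t = (mean(x) - mean(y)) / sqrt(var(x)/len(x) + var(y)/len(y)),
    where var is the unbiased sample variance.
    """
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))
```

In a leakage assessment, `x` and `y` would hold the power samples at one time index for two classes of traces (for example, fixed versus random inputs). A large |t| suggests the two classes are statistically distinguishable at that point.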
21. Practical feasibility evaluation and improvement of a pay-per-use licensing scheme for hardware IP cores in Xilinx FPGAs.
- Author
-
Vliegen, Jo, Mentens, Nele, Koch, Dirk, Schellekens, Dries, and Verbauwhede, Ingrid
- Abstract
In earlier published work, Maes et al. present a pay-per-use licensing scheme for hardware Intellectual Property (IP) cores. This scheme focuses on the use of IP cores on static random access memory-based field programmable gate arrays (FPGAs) and is mainly based on the partial reconfigurability property of this type of FPGA. Our work evaluates the practical feasibility of the scheme and the accompanying architecture. As already (partly) indicated by Maes et al., their solution introduces some security and usability issues. Therefore, we present improvements to the scheme and the architecture together with an additional method for decreasing the area overhead. The overall result is the first practical implementation of the pay-per-use licensing scheme occupying 841 slices on a Xilinx XC6S-LX45 FPGA. The small area overhead is mainly achieved by moving the storage of keys from slice flip-flops to configuration memory. Moreover, the implementation would not have been feasible with commercially available tools. We use an academic tool that allows nested partial reconfiguration and flexible IP core placement. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
22. A new power-aware FPGA design metric.
- Author
-
Templin, Joshua and Hamlet, Jason
- Abstract
Dozens of Advanced Encryption Standard (AES) implementations have been presented since AES was officially published by the National Institute of Standards and Technology in 2001. Many of these implementations have targeted FPGA platforms either for ASIC prototyping or as the destination hardware. Typically, these publications have comparative metrics to show how the proposed implementation compares to previously published work. Unfortunately, these metrics often present inaccurate comparisons. To date, these metrics have focused on area and speed, neglecting the third point of the hardware optimization triangle, power. As AES becomes more prolific and attractive for use in embedded devices, power considerations will be increasingly important. In this paper, we discuss the subtleties and qualities of metrics previously applied to FPGA AES publications. We then propose a power metric to generate a more complete, quantitative description of the quality of the implementation. The proposed metric is not specific to AES but has general FPGA design applicability. Finally, we present a comparison between four AES-256 implementations that demonstrates the inconsistent conclusions drawn when various metrics are used. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
23. Minimizing performance overhead in memory encryption.
- Author
-
Kurdziel, Michael, Lukowiak, Marcin, and Sanfilippo, Michael
- Abstract
Modern communications devices process, distribute and store massive amounts of data compared to only a few years ago. These devices can contain very sensitive information. In addition, they are used in uncontrolled, open environments where they can be lost or compromised. The communications channels are protected using encryption technologies, but the internal data-at-rest is often not secured in any way. If the device is lost or stolen while in service, a motivated adversary could attempt to compromise the unprotected internal data. This paper presents a keystream caching methodology and architecture for encrypting/decrypting program code and data in real-time during each access within the CPU's system memory. A prototype was developed for the Cyclone III FPGA using a Nios II processor, the 256-bit key Advanced Encryption Standard (AES) block cipher operating in counter mode, and low-latency off-chip SRAM memory. Various applications were used to benchmark the performance overhead of the method. The results show that this can be achieved while incurring as little as 1% performance overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
24. An exploration of mechanisms for dynamic cryptographic instruction set extension.
- Author
-
Grabher, P., Großschädl, J., Hoerder, S., Järvinen, K., Page, D., Tillich, S., and Wójcik, M.
- Abstract
Instruction set extensions (ISEs) supplement a host processor with special-purpose, typically fixed-function hardware components and instructions to utilise them. For cryptographic use-cases, this can be very effective due to the demand for non-standard or niche operations that are not supported by general-purpose architectures. However, one disadvantage of fixed-function ISEs is inflexibility, contradicting a need for 'algorithm agility'. This paper explores a new approach, namely the provision of reconfigurable mechanisms to support dynamic (run-time changeable) ISEs. Our results, obtained using an FPGA-based LEON3 prototype, show that this approach provides a flexible general-purpose platform for cryptographic ISEs with all known advantages of previous work, but relies on careful analysis of the associated security issues. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
25. Utilizing hard cores of modern FPGA devices for high-performance cryptography.
- Author
-
Güneysu, Tim
- Abstract
This article presents a unique design approach for the implementation of standardized symmetric and asymmetric cryptosystems on modern FPGA devices. In contrast to many other FPGA implementations that algorithmically optimize the cryptosystems for being optimally placed in the generic array logic, our primary implementation goal is to shift as many cryptographic operations as possible into specific hard cores that have become available on many reconfigurable devices. For example, some of these dedicated functions are designed to provide large blocks of memory or fast arithmetic functions for Digital Signal Processing applications that can also be adopted for efficient cryptographic implementations. Based on these dedicated functions, we present specific design approaches that enable a performance for the symmetric AES block cipher (FIPS 197) of up to 55 GBit/s and a throughput of more than 30,000 scalar multiplications per second for asymmetric Elliptic Curve Cryptography over NIST's P-224 prime (FIPS 186-3). [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
26. Introduction to the CHES 2012 special issue.
- Author
-
Prouff, Emmanuel and Schaumont, Patrick
- Published
- 2013
- Full Text
- View/download PDF
27. Your rails cannot hide from localized EM: how dual-rail logic fails on FPGAs—extended version
- Author
-
Immler, Vincent, Specht, Robert, and Unterstein, Florian
- Published
- 2018
- Full Text
- View/download PDF