Monday, February 8, 2016

CAESAR Candidates and RFID for IOT

Authenticated Encryption (AE) schemes provide data authenticity in addition to confidentiality of the message content. Achieving these two security goals is crucial for communication in the presence of an active adversary, one that may tamper with message content in addition to observing communication. This problem is common to many real-life scenarios, from traditional ones (e.g., securing protocols for financial transactions) to the rising number of Internet of Things (IOT) applications. For example, data acquired in a wireless sensor network should be available only to the owners of the network. Additionally, an adversary that can modify valid sensor data, or inject spoofed sensor data into the system, would cause complete loss of credibility of the sensor readings, rendering the system useless.

In spite of many relevant use-cases, AE schemes remained unformalized for many years. Instead, the cryptographic community, as well as “crypto-enthusiasts”, relied on generic compositions of existing cryptographic constructs (e.g., AES-CCM). Unfortunately, this approach often led to different types of exploitable weaknesses. In 2002, Rogaway made one of the first steps towards more reliable design of AE schemes. The Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) has been running since 2014 with the goal of selecting a portfolio of AE ciphers. The CAESAR competition is now in its second round, with 29 candidates remaining after the selection committee eliminated nearly half of them. Therefore, at this point one may consider that the remaining candidates provide a satisfactory level of security and robustness. Additionally, many of the authors discuss use-cases that involve both constrained and high-end devices, supporting them with hardware and software implementation results.

Considering IOT applications of CAESAR candidates, the boundaries of high-end implementations lie in what fits in a server rack. The constrained domain, on the other hand, imposes many limitations. Namely, the swarm of low-end devices at the frontier of IOT must provide sufficient data throughput, in both computation and communication, at a very low price. This price is most commonly expressed as the maximum allowed chip area, equivalent to 2000 two-input NAND gates (2000 gate equivalents, GE) in the technology of choice. Furthermore, the majority of these devices are either passively powered or battery-powered, and therefore must conform to low-power or low-energy criteria, respectively. The contradicting nature of all these requirements makes it very difficult to find a single optimal approach for all use-cases. Hence, we will focus on the devices that are most constrained in terms of resources at hand: passively powered Radio Frequency IDentification (RFID) circuits.

RFID systems have infiltrated the world of IOT in many applications, including electronic identification documents, access management, and tracking. More importantly, they can handle medical and payment applications. Whether they are in passports, in credit cards, or in tags that mark shipping crates, RFID devices are often international travelers. Consequently, interoperability in different parts of the world is crucial, and their use is commonly standardized. For example, the ISO/IEC 18000 family of standards specifies RFID devices for a number of different applications. Compliant with this standard is the EPCGlobal Gen2 air interface protocol, published in 2004 and widely deployed since. The cryptographic community often overlooks these standards, and ones akin to them. Nevertheless, these standards include various overheads and constraints (e.g., maximum latency of the response), and complying with them makes the difference between a usable design and one that is actually applicable in real life. A paper from 2012 gives an example of an RFID security study within the boundaries of the aforementioned protocol. Table 1 gives an overview of timing constraints for the EPCGlobal Gen2 protocol. Here, Tari determines the length of the logic 0 symbol that initiates the communication. Additionally, T1 denotes the amount of time an RFID tag has to begin its response after successfully receiving the last symbol of the command; it depends on the Tari value.

Table 1. Timing constraints for the EPCGlobal Gen2 protocol.
Tari (µs) | T1 Time (µs) | #Cycles @ 1.5 MHz | #Cycles @ 2 MHz | #Cycles @ 2.5 MHz

Since the most straightforward way of minimizing power consumption is reducing the area of the circuit, implementers strive for both low area and low power by serializing the computation. This often leads to an unsuitably large number of cycles. For example, a 1000 GE serial implementation of the lightweight block cipher PRESENT fits the general constraint of 2000 GE rather elegantly. Nevertheless, its computational latency is 563 clock cycles, which makes it ill-suited for the purpose at hand. Since this implementation takes only 50% of the area budget, it leaves ample room to trade area for speed in order to meet the needs of the EPCGlobal Gen2 protocol, which is not the case for a number of CAESAR candidates (see below). Lastly, note that this standard is only one real-life example chosen to depict, not dictate, requirements. Nevertheless, due to the very scarce resources and the interoperability imperative, we believe it would be most beneficial to consider concrete use-cases while evaluating hardware implementations of the remaining candidates.
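To make the timing constraint concrete: the number of clock cycles a tag can spend before it must respond is roughly the T1 window multiplied by the clock frequency. The following is a minimal sketch of that arithmetic, not taken from the Gen2 specification. The function names and the two T1 values are hypothetical and chosen for illustration only (the real values are those of Table 1); the clock frequencies and the 563-cycle PRESENT latency come from the text.

```python
import math

# Clock frequencies considered in Table 1, in MHz.
CLOCKS_MHZ = (1.5, 2.0, 2.5)

# Latency of the serial PRESENT implementation quoted in the text.
PRESENT_SERIAL_CYCLES = 563


def cycle_budget(t1_us: float, clock_mhz: float) -> int:
    """Whole clock cycles that fit into a T1 window (µs * MHz = cycles)."""
    return math.floor(t1_us * clock_mhz)


def fits(latency_cycles: int, t1_us: float, clock_mhz: float) -> bool:
    """True if a computation of the given latency finishes within T1."""
    return latency_cycles <= cycle_budget(t1_us, clock_mhz)


# HYPOTHETICAL T1 windows in µs, for illustration only (not from Table 1).
for t1_us in (50.0, 250.0):
    for clock_mhz in CLOCKS_MHZ:
        budget = cycle_budget(t1_us, clock_mhz)
        ok = fits(PRESENT_SERIAL_CYCLES, t1_us, clock_mhz)
        print(f"T1={t1_us} us @ {clock_mhz} MHz: {budget} cycles, "
              f"PRESENT fits: {ok}")
```

Under these assumed windows, the serial PRESENT design would only fit the longer window at the highest clock frequency, which illustrates why latency has to be considered alongside area.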

Continuing the overview of the remaining candidates, we notice that 15 of them either mandate or recommend the use of AES or the AES round function. While these are sound design decisions from the perspective of the knowledge base on AES and its presence in the modern-day world, relying on AES may bring hardship in lightweight applications. Namely, the smallest implementation of AES, by Moradi et al., requires 2400 GE in the UMCL18G212T3 standard-cell library and 226 clock cycles, and consumes 3.7 µW of power at an operating frequency of 100 kHz. Since the AES computation dominates both the latency and the area of these designs, it largely determines the required resources; therefore, we will not discuss these candidates separately.
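The AES figures above can be turned into a latency and an energy-per-block estimate. This is a rough sketch that assumes the reported 3.7 µW is the average power over the whole 226-cycle computation; that assumption and the function name are ours, not from the cited implementation.

```python
def energy_nj(power_uw: float, cycles: int, clock_khz: float) -> float:
    """Energy in nJ: P [µW] * cycles / f [kHz] (µW / kHz = nJ per cycle)."""
    return power_uw * cycles / clock_khz


# Figures quoted in the text for the smallest AES implementation.
AES_POWER_UW = 3.7
AES_CYCLES = 226
AES_CLOCK_KHZ = 100.0

# cycles / kHz = milliseconds
aes_latency_ms = AES_CYCLES / AES_CLOCK_KHZ
# Assumes constant average power over the whole computation.
aes_energy = energy_nj(AES_POWER_UW, AES_CYCLES, AES_CLOCK_KHZ)

print(f"AES latency: {aes_latency_ms:.2f} ms per block")
print(f"AES energy:  {aes_energy:.2f} nJ per block")
```

At 100 kHz this works out to about 2.26 ms and roughly 8.4 nJ per block, before any mode-of-operation overhead is added.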

Below we overview, in alphabetical order, the candidates that claim to be lightweight in hardware, and present relevant implementation results. Designs intended for high-end applications are therefore not considered. Furthermore, since implementation results are not available for all candidates, we allow certain estimates. In addition, some authors have abstained from providing power figures, and results are reported for different technology libraries; hence these implementations should not be compared directly with one another, since the comparison might be unfair.

  • ACORN. Based on a novel construct; uses a 128-bit key. It is a stream-based cipher, and the authors estimate its hardware implementation to be slightly larger than that of the TRIVIUM cipher, which requires at least 2599 GE in 130 nm CMOS technology and 1333 cycles for initialization, followed by one cycle for each bit of encryption. We found no power estimates for this implementation.
  • Ascon. Sponge-based mode of operation built on a novel SPN permutation; uses a 128-bit key. A variety of hardware implementations are available, the smallest one requiring 2570 GE (3750 GE with interface) in UMC 90 nm technology and 512 cycles per round, and consuming 1.5 µW at an operating frequency of 100 kHz. Note that 6 or 12 rounds are required for a complete permutation.
  • Ketje. Based on round-reduced versions of the SHA-3 competition winner, Keccak-f[400] (Ketje-Sr) and Keccak-f[200] (Ketje-Jr); uses 128- and 96-bit keys, respectively. The authors did not include any hardware implementation results in the submission document. Nevertheless, this candidate has the advantage of being able to share hardware with SHA-3 circuitry, and the Keccak-f[200] permutation can be implemented in as little as 1270 GE.
  • Minalpher. Based on a novel SPN permutation, employed in a tweakable Even-Mansour construct; uses a 128-bit key. The smallest hardware implementation results presented cover only the underlying Minalpher-P permutation, which requires 2700 GE in the NanGate 45 nm standard-cell library and 288 clock cycles to finish the computation. The authors did not present any power figures.
  • PRIMATEs. Another sponge-based design, with a proprietary SPN permutation; uses 80- or 120-bit keys. PRIMATEs offer 3 modes of operation, allowing tradeoffs between security and performance, for each key size. The authors did not include any hardware implementation results in the submission document. Our own, still unpublished, work shows that the core permutation for an 80-bit key can be implemented in as little as 1200 GE using the UMC 90 nm standard-cell library, requiring 16 cycles per round and consuming 0.68 µW of power at an operating frequency of 100 kHz. Note that 6 or 12 rounds are required for a complete permutation.
  • TriviA-Ck. Based on the TriviA-SC stream cipher, a variant of TRIVIUM; uses a 128-bit key.
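For the two candidates above that report both a cycles-per-round figure and a round count, the latency of a full permutation and the fit against the 2000 GE budget can be worked out directly. The sketch below uses only the figures quoted in the list; the names are ours, and because the areas come from different technology libraries, the area check is indicative only.

```python
AREA_BUDGET_GE = 2000  # general area constraint discussed earlier

# name -> (area in GE, cycles per round), as quoted in the list above.
CANDIDATES = {
    "Ascon": (2570, 512),
    "PRIMATEs (80-bit key, core permutation)": (1200, 16),
}

# Both candidates need 6 or 12 rounds for a complete permutation.
ROUND_COUNTS = (6, 12)


def permutation_cycles(cycles_per_round: int, rounds: int) -> int:
    """Clock cycles needed for one complete permutation call."""
    return cycles_per_round * rounds


for name, (area_ge, cycles_per_round) in CANDIDATES.items():
    within_budget = area_ge <= AREA_BUDGET_GE
    latencies = [permutation_cycles(cycles_per_round, r) for r in ROUND_COUNTS]
    print(f"{name}: {area_ge} GE (within budget: {within_budget}), "
          f"{latencies[0]} or {latencies[1]} cycles per permutation")
```

These are latencies of a single permutation call only; a complete authenticated encryption of a message involves several such calls plus interface overhead.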
