Cryptography

📎 Number theory basics

Definition

notation, variable definitions
$k$ key - can be secret $sk$ or public $pk$
$c$ ciphertext
$m$ message, plaintext
$\text{Gen}() = k$ randomized key generator
$\text{Enc}(k, m) = c$ (often randomized) encryption algorithm
$\text{Dec}(k,c) =m$ decryption algorithm
$(\text{Gen, Enc, Dec})$ encryption Scheme
$(\text{Gen, Sig, Ver})$ MAC / signature scheme
$\mathcal{K}$ key space, $k \in \mathcal{K}$
$\mathcal{M}$ message space, $m \in \mathcal{M}$
$\mathcal{C}$ ciphertext space, $c \in \mathcal{C}$
$\mathcal{T}$ tag space, $t \in \mathcal{T}$
$K$ random variable over space $\mathcal K$
$M$ random variable over space $\mathcal M$
$C$ random variable over space $\mathcal C$
Each random variable has its own probability distribution. (We assume that every element of each space occurs with non-zero probability.)

Symmetric vs. Asymmetric

Symmetric

$+$ fast

$-$ based on heuristics (no proofs)

$-$ 1 key per user-pair (lots of keys)

$-$ must be kept a secret by both ends

Examples
- AES, based on the Rijndael cipher (the standard)
- electronic codebook ECB (always avoid)
- cipher block chaining CBC
- cipher feedback CFB
- output feedback OFB
- counter mode CTR

Asymmetric

$-$ slow

$+$ Based on security proofs

$+$ One $pk$ for all users

$+$ only $sk$ must be kept a secret

Examples
- ElGamal, RSA
- Note: CBC-MAC (similar to block cipher) is not an asymmetric scheme, it is a symmetric MAC.
It has 2 keys, would otherwise be vulnerable

MAC vs. Digital Signature

Both are similar in that they provide message integrity: an attacker cannot change the message, i.e. generate a new valid pair $(m,t)$

Message Authentication Code MAC

Symmetric

Same key $k$ used to sign and verify

Digital signature

Asymmetric

$sk$ for signature, $pk$ for verification

public verifiability through $pk$

non-repudiation: only the signer has $sk$ , so the signer can not deny having signed (legal evidence)

Hash Functions

$H: \mathcal M \to \mathcal T$ (every message is mapped to a hash of the same fixed size)

e.g.: MD5 (broken), SHA1 (broken), SHA2 family, SHA3 family, $\dots$

One-way function: easy to compute the output, infeasible to find an input from the output

Collision resistance: infeasible to find different inputs that map to the same output

Such a function is called a collision-resistant hash function CRHF

Collision: $\big(H(m_1) = H(m_2)\big) \wedge \big(m_1 \not = m_2 \big)$
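To illustrate the fixed output size, here is a minimal sketch using SHA-256 from Python's standard `hashlib`. The helper name `h` and the sample inputs are chosen only for this example.

```python
import hashlib

def h(message: bytes) -> bytes:
    """Hash a message of any length to a 32-byte SHA-256 digest."""
    return hashlib.sha256(message).digest()

short_tag = h(b"a")
long_tag = h(b"a" * 1_000_000)

# Every message is mapped to a tag of the same size (32 bytes = 256 bits).
print(len(short_tag), len(long_tag))
# A small change of the input gives a different tag.
print(h(b"message 1").hex())
print(h(b"message 2").hex())
```

Finding two different inputs with the same tag is what collision resistance is supposed to make infeasible.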

Attack Models

Passive Attack

Ciphertext only: observation of ciphertexts

Known plaintext: observation of plaintext-ciphertext pairs

Active Attacks

Chosen plaintext CPA: access to the encryption algorithm

Chosen ciphertext CCA: access to the decryption algorithm

Perfect Secrecy of Ciphers

= information theoretic security of encryption schemes

Perfect Secrecy

The ciphertext $c$ should reveal nothing about the plaintext $m$ .

An encryption scheme has perfect secrecy if, for any probability distribution over $\mathcal{M}$ with random variable $M$ :

$\forall m \in \mathcal M, c \in \mathcal C: \quad \mathbf{Pr}(M=m \mid C =c)=\mathbf{Pr}(M=m)$

That means: observing the ciphertext does not change the probability of any message.

Proof: For perfect secrecy we need $|\mathcal{K}| \geq |\mathcal{M}|$

Assume uniformly distributed $M$ and fix any ciphertext $c \in \mathcal C$ :

$M(c)=\{ m \mid \exists k \in \mathcal K: m=\operatorname{Dec}(k, c)\}$ (= set of messages that $c$ can be decrypted to under some key), so $|M(c)| \leq |\mathcal K|$

If $|\mathcal{K}| < |\mathcal{M}|$ then

$\exists m^{\prime} \in \mathcal M: m^{\prime} \notin M(c) \Rarr$

$\mathbf{Pr}(M=m' \mid C=c)=0 \neq \mathbf{Pr}(M=m')$

Therefore observing $c$ rules out the message $m'$ , so the ciphertext reveals information - no perfect secrecy.
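The bound can be checked by brute force on a tiny scheme. This is a sketch under stated assumptions: the toy cipher `shift` on $\Z_4$ and the function `has_perfect_secrecy` are made up for this example, and only the uniform distribution over keys and messages is tested (the definition asks for every message distribution).

```python
from fractions import Fraction
from itertools import product

def has_perfect_secrecy(keys, messages, enc):
    """Check Pr(M=m | C=c) == Pr(M=m) for uniform K and uniform M by enumeration."""
    pairs = list(product(keys, messages))          # all equally likely (k, m)
    prior = Fraction(1, len(messages))             # Pr(M = m)
    for c in {enc(k, m) for k, m in pairs}:
        hits = [(k, m) for k, m in pairs if enc(k, m) == c]
        for m in messages:
            posterior = Fraction(sum(1 for _, m2 in hits if m2 == m), len(hits))
            if posterior != prior:
                return False
    return True

def shift(k, m):
    """Toy cipher on Z_4: add the key modulo 4."""
    return (m + k) % 4

messages = range(4)
print(has_perfect_secrecy(range(4), messages, shift))  # |K| = |M|
print(has_perfect_secrecy(range(2), messages, shift))  # |K| < |M|
```

With only two keys, each ciphertext can come from only two of the four messages, so the other two get posterior probability $0$.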

____________

Symmetric Encryption

(Syntactic) Correctness of symmetric encryption (for all examples)

$\forall k \in \mathcal{K}, m \in \mathcal{M}: \quad \text{Enc}(k,m) = c ~~\Rarr ~~\text{Dec}(k,c) = m$

Ancient example: Substitution Cipher / Caesar Encryption

Ciphertext-only attack: letter and letter-pair frequency analysis
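A minimal sketch of the Caesar cipher and a letter-frequency attack. It assumes lowercase English text in which `e` is the most frequent letter. The function names and the sample sentence are chosen for this example.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase

def caesar(text: str, shift: int) -> str:
    """Shift every lowercase letter by `shift` positions, keep other characters."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for 'e'."""
    letters = [ch for ch in ciphertext if ch in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("e")) % 26

plaintext = "the secret meeting ever needs three keen men"
ciphertext = caesar(plaintext, 3)
recovered_shift = guess_shift(ciphertext)
print(ciphertext)
print(recovered_shift, caesar(ciphertext, -recovered_shift))
```

The attacker only sees `ciphertext`. The key space has just 26 elements, so trying every shift would work as well.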

One-Time-Pad OTP

$+$ fast encryption and decryption

$+$ perfect secrecy

$-$ the key must be as long as the message (key size = message size, requires too much storage)

$-$ needs generation of lots of true-randomness

Definition

All spaces are $n$-bit strings: $\mathcal K = \mathcal M = \mathcal C = \{0,1\}^n$

$\text{Gen}() = k$ chosen uniformly at random (k.length = m.length)

$\text{Enc}(k, m) = k \oplus m = c$

$\text{Dec}(k,c) = k \oplus c= m$
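The three algorithms can be sketched in a few lines of Python, working on bytes instead of single bits. `secrets.token_bytes` serves as the source of randomness here, which is an assumption of this sketch. The function names mirror the notation above.

```python
import secrets

def gen(n_bytes: int) -> bytes:
    """Gen(): a fresh random key with the same length as the message."""
    return secrets.token_bytes(n_bytes)

def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("key and message must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def enc(k: bytes, m: bytes) -> bytes:
    return xor(k, m)

def dec(k: bytes, c: bytes) -> bytes:
    return xor(k, c)

m = b"attack at dawn"
k = gen(len(m))
c = enc(k, m)
print(c.hex())
print(dec(k, c))
```

Encryption and decryption are the same operation, because $k \oplus k \oplus m = m$ .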

Proof of perfect secrecy

$\operatorname{Pr}(C=c \mid M=m)=$

$\text{Pr}(K \oplus M = c \mid M = m ) =$

$\text{Pr}(K \oplus \textcolor{pink}m = c) =$

$\text{Pr}(K = m \oplus c ) =$ (just a property of $\oplus$ )

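$\frac{1}{2^n}$ ( $2^n$ is the number of all possible keys / values of the uniform random variable $K$ )

This value does not depend on $m$ , so $\operatorname{Pr}(C=c \mid M=m)=\operatorname{Pr}(C=c)$ and by Bayes' theorem $\operatorname{Pr}(M=m \mid C=c)=\operatorname{Pr}(M=m)$ .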

Important: A key must be used only once, for one entire $m$

To save storage, one might try to split $m$ up into smaller pieces and encrypt them with the same $k$ .

$c_1=\operatorname{Enc}(k,m_1)= k \oplus m_1$

$c_2=\operatorname{Enc}(k,m_2)= k \oplus m_2$

$c_{1} \oplus c_{2}= ( k \oplus m_1)\oplus(k \oplus m_2)= m_{1} \oplus {m}_{2}$

which is vulnerable to frequency analysis.
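A short sketch of this leak, with two made-up messages of equal length:

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"transfer 100 EUR"
m2 = b"transfer 999 USD"
k = secrets.token_bytes(len(m1))   # the same key is (wrongly) used twice

c1 = xor(k, m1)
c2 = xor(k, m2)

# The key cancels out: an eavesdropper learns m1 XOR m2 without knowing k.
leak = xor(c1, c2)
print(leak == xor(m1, m2))
# Positions where both messages agree show up as zero bytes.
print([i for i, byte in enumerate(leak) if byte != 0])
```

Anyone who knows or guesses one of the two messages recovers the other one directly from `leak`.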

Stream Cipher

Pseudo random number generator PRG

Small bit sequence (seed) $\mapsto$ large pseudorandom bit sequence (e.g. by a Linear Feedback Shift Register LFSR)

To be secure, the seed should be private and truly random (not chosen from a known message part like the email header)
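A minimal 4-bit LFSR sketch to show how a short state is stretched into a longer bit sequence. The register size and tap positions are chosen for this example only. A plain LFSR is linear and therefore predictable, so it only illustrates the mechanism and is not a secure PRG on its own.

```python
def lfsr4(state: int):
    """4-bit Fibonacci LFSR with taps on the two lowest bits, yields one bit per step."""
    if not 0 < state < 16:
        raise ValueError("state must be a non-zero 4-bit value")
    while True:
        yield state & 1                        # output bit
        feedback = (state ^ (state >> 1)) & 1  # XOR of the two tapped bits
        state = (state >> 1) | (feedback << 3)

def take(generator, count: int) -> list:
    return [next(generator) for _ in range(count)]

bits = take(lfsr4(0b0001), 30)
print(bits[:15])
print(bits[:15] == bits[15:])   # the sequence repeats after 2^4 - 1 = 15 steps
```

The short period shows why the output is only pseudorandom: the whole sequence is determined by the 4-bit seed.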

Randomness in practice
Weak:
- coin tossing
- data from load / system parameters
Stronger:
- physical processes
- thermal noise, air perturbation XORed, hashed
Even stronger:
- Truly random seed for an unpredictable PRG, with added entropy

Stream Cipher

$\text{PRG}(seed) \oplus m = c$

This gives no perfect secrecy: the seed is shorter than the message, so the PRG output is not truly random.
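A sketch of the construction in Python. SHAKE-256 from `hashlib` is used as a stand-in for the PRG. That choice, like the names `prg` and `stream_xor`, is an assumption of this example and not something the notes prescribe.

```python
import hashlib
import secrets

def prg(seed: bytes, n_bytes: int) -> bytes:
    """Stretch a short seed into n_bytes of pseudorandom output (SHAKE-256 as stand-in)."""
    return hashlib.shake_256(seed).digest(n_bytes)

def stream_xor(seed: bytes, data: bytes) -> bytes:
    """Enc and Dec are the same operation: XOR with the keystream PRG(seed)."""
    keystream = prg(seed, len(data))
    return bytes(x ^ y for x, y in zip(keystream, data))

seed = secrets.token_bytes(16)          # short, random, secret
m = b"a message that is much longer than the 16-byte seed"
c = stream_xor(seed, m)
print(c.hex())
print(stream_xor(seed, c))
```

As with the OTP, the same seed must not be reused for a second message.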

Stream Cipher usage

Generate a truly random seed, send with asymmetric encryption (using $pk$ of receiver)

Then use PRG from that point on

No integrity for OTP and Stream Cipher

We preserve confidentiality but not integrity.

Example:

Voting system, where a vote $m \in\{0,1\}$ , and we have result-predictions.

$\text{Voter} ~— ~\{m\}_k \longrightarrow \bigcirc~— ~\{m \oplus 1\}_k \longrightarrow \text{VotingSys}$ (flips votes)
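The vote flip can be reproduced with a one-bit OTP. A minimal sketch, with function names chosen for this example:

```python
import secrets

def enc(k: int, m: int) -> int:
    """One-bit one-time pad: c = k XOR m."""
    return k ^ m

def dec(k: int, c: int) -> int:
    return k ^ c

def attacker(c: int) -> int:
    """Flip the ciphertext bit without knowing k or m."""
    return c ^ 1

results = []
for vote in (0, 1):
    k = secrets.randbelow(2)            # fresh random key bit per vote
    received = dec(k, attacker(enc(k, vote)))
    results.append((vote, received))
print(results)
```

The attacker never learns the vote, yet the voting system always receives the opposite one.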

Message Authentication Code MAC

$\text{Gen}() = k$ randomized key generation algorithm

$\text{Sig}(k, m) = t$ (often randomized) signing algorithm that generates a tag

$\text{Ver}(k,m,t) \in \{0,1\}$ verification algorithm (unlike $\text{Dec}$ , which takes $k$ and $c$ , it takes the message $m$ and the tag $t$ )

Used to provide message integrity: an attacker cannot change the message, i.e. generate a new valid pair $(m,t)$

Correctness

$\forall k, m, t\in \{\text{Sig}(k,m) \}: \quad \text{Ver}(k,m,t)=1$
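One concrete instance of $(\text{Gen, Sig, Ver})$ is HMAC, available in Python's standard library. This sketch picks HMAC-SHA256 and a 32-byte key as an assumption. The notes themselves do not fix a particular MAC.

```python
import hashlib
import hmac
import secrets

def gen() -> bytes:
    return secrets.token_bytes(32)

def sig(k: bytes, m: bytes) -> bytes:
    """Sig(k, m) = t, here HMAC-SHA256 as one concrete MAC."""
    return hmac.new(k, m, hashlib.sha256).digest()

def ver(k: bytes, m: bytes, t: bytes) -> int:
    """Ver(k, m, t): recompute the tag and compare in constant time."""
    return 1 if hmac.compare_digest(sig(k, m), t) else 0

k = gen()
m = b"vote=1"
t = sig(k, m)
print(ver(k, m, t))          # correctness: a valid pair is accepted
print(ver(k, b"vote=0", t))  # a changed message is rejected
```

The same key `k` is needed for signing and for verifying, which is the symmetric setting described above.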

____________

Asymmetric Encryption

Public-Key Encryption

$\text{Gen}() = (pk,sk)$ randomized key generation algorithm

$\text{Enc}(pk, m) = c$ (often randomized) encryption algorithm

$\text{Dec}(sk,c) =m$ decryption algorithm

Correctness

$\forall (pk,sk), m: \quad \text{Enc}(pk,m) = c ~~\Rarr ~~\text{Dec}(sk,c) = m$

CPA-Security

Ciphertext indistinguishability under CPA (= chosen plaintext attack) for any public-key encryption scheme.

An experiment between challenger and adversary / attacker:

Generate a key pair, send $pk$ to attacker (attacker has access to encryption algorithm - but it is usually randomized)

Receive $m_0, m_1$ from attacker

Randomly choose one of them, encrypt it and send it back

The attacker should not be able to guess which of the two messages was encrypted with a probability noticeably better than $50\%$ .

Definition

$n$ = "security parameter"; the attacker's running time (number of attempts) is polynomially bounded in $n$

$\operatorname{Exp}_{P E, A}^{C P A}(b)$ = output (the attacker's guess) of the experiment in which the challenger chose $b \in \{0,1\}$

Ciphertexts are indistinguishable under a CPA attack if, for all such adversaries, the following expression is very small (negligible in $n$ ):

$\operatorname{Adv}_{P E, A}^{C P A}=\left|\operatorname{Pr}\left(\operatorname{Exp}_{P E, A}^{C P A}(0)=1\right)-\operatorname{Pr}\left(\operatorname{Exp}_{P E, A}^{C P A}(1)=1\right)\right|$

The probability that the attacker outputs $1$ while reality is $0$ vs. the probability that he outputs $1$ while reality is $1$ .
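The link between this expression and guessing correctly "$50\%$ of the time" can be written out. A short derivation, assuming the challenger picks $b \in \{0,1\}$ uniformly, writing $b'$ for the attacker's output and $\operatorname{Exp}(b)$ as shorthand for $\operatorname{Exp}_{P E, A}^{C P A}(b)$ :

$\operatorname{Pr}(b'=b) = \frac{1}{2}\operatorname{Pr}(b'=1 \mid b=1) + \frac{1}{2}\operatorname{Pr}(b'=0 \mid b=0)$

$= \frac{1}{2}\operatorname{Pr}(\operatorname{Exp}(1)=1) + \frac{1}{2}\big(1-\operatorname{Pr}(\operatorname{Exp}(0)=1)\big)$

$= \frac{1}{2} + \frac{1}{2}\big(\operatorname{Pr}(\operatorname{Exp}(1)=1)-\operatorname{Pr}(\operatorname{Exp}(0)=1)\big)$

So $\left|\operatorname{Pr}(b'=b)-\frac{1}{2}\right| = \frac{1}{2}\operatorname{Adv}_{P E, A}^{C P A}$ : a very small advantage means the attacker guesses correctly only about half of the time.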

1-CPA

This extension does not strengthen the definition - the attacker already had access to the encryption algorithm through $pk$ .

Generate key pair, send $pk$ to attacker

Receive $m \not\in \{m_0,m_1\}$ from attacker and immediately return it in encrypted form $\text{Enc}(pk, m)$ (encryption algorithm is usually randomized)

$\dots$ previous experiment from this point

ElGamal Encryption Scheme

Example for public key encryption - within $\Z_p^*$

$\text{Gen}()$

pick a prime $p$ , a random generator $g$ of $\Z_p^*$ and a random exponent $x$

$p k:=\left(p, g, \textcolor{pink}{g^x}\right)$

$s k:=(p, g, \textcolor{pink}{x})$ - $x$ is private and can not be computed feasibly from $pk$ , because that would require the discrete logarithm $x = \text{Dlog}_g(g^x)$

$\text{Enc}(pk, m)$

$p k =\left(p, \textcolor{pink}g, \textcolor{pink}{g^x}\right)$

pick random $y$

return $c := \textcolor{grey}{(g^y,m\cdot (g^x)^y) = }(g^y,m\cdot g^{xy})$

$\text{Dec}(sk, c)$

$sk =(p, g, \textcolor{pink}x)$

received $c=(\overbrace{g^y}^A,\overbrace{m\cdot g^{xy}}^B)$

return $m:=A^{-x} \cdot B$

Correctness

$B\cdot A^{\textcolor{pink}{-x}} =$

$m\cdot g^{xy} ~~\cdot ~~ (g^{{y}})^{\textcolor{pink}{-x}} =$

$m\cdot g^{xy} ~~\cdot ~~ (g^{\textcolor{pink}{-xy}}) =$

$m\cdot g^{\textcolor{pink}{xy-xy}} = m\cdot g^{\textcolor{pink}{0}} = m$
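The scheme can be tried out in a toy group. A minimal sketch assuming $p = 23$ and generator $g = 5$ , which are illustrative values that are far too small to be secure. $A^{-x}$ is computed as $A^{p-1-x}$ , which is the same element because $A^{p-1} = 1$ in $\Z_p^*$ .

```python
import secrets

P = 23   # toy prime, far too small for real use
G = 5    # a generator of Z_23^*

def gen():
    x = secrets.randbelow(P - 2) + 1          # secret exponent in 1..P-2
    return (P, G, pow(G, x, P)), (P, G, x)    # (pk, sk)

def enc(pk, m: int):
    p, g, gx = pk
    y = secrets.randbelow(p - 2) + 1          # fresh randomness per encryption
    return pow(g, y, p), (m * pow(gx, y, p)) % p

def dec(sk, c):
    p, _, x = sk
    a, b = c
    a_inv_x = pow(a, p - 1 - x, p)            # A^(-x), since A^(p-1) = 1 in Z_p^*
    return (b * a_inv_x) % p

pk, sk = gen()
c = enc(pk, 7)
print(c, dec(sk, c))
```

Because of the random $y$ , encrypting the same message twice usually gives different ciphertexts.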

Proof of Security

ElGamal has CPA security if the Decisional Diffie-Hellman (DDH) assumption holds in $G := \Z_p^*$ .

Decisional Diffie-Hellman Assumption

(= No algorithm whose running time is polynomially bounded in $\textcolor{pink}n$ can distinguish $g^{xy}$ from a random group element. This is at least as strong as assuming that the discrete logarithm can not be computed feasibly.)

$|G| \approx 2^{\textcolor{pink}n}$ and we choose random $g\in \Z_p^*$ .

Given $g^x, g^y, Z$ for random $x,y \in \{1,\dots,|\Z_p^*|\}$ , we can not decide whether $Z = g^{xy}$ or $Z$ is a random element of $G$ .

Proof by contradiction

If we can break ElGamal, then we can break the DDH assumption.

If $\exists A$ algorithm that breaks ElGamal with any $pk$ then we imitate the challenger to break DDH with his advantage $\operatorname{Adv} >0$ of distinguishing correctly.

We are given $g^x, g^y, Z$ and want to know whether $Z = g^{xy}$

Build the public key from the given $g^x$ :
$p k:=\left(p, g, \textcolor{pink}{g^x}\right)$
(we do not know $x$ , so there is no $sk$ - it is not needed)
and send $pk$ to $A$ .

receive $m_0, m_1$ , randomly choose $b \in \{0,1\}$ and encrypt $m_b$ using the given $g^y$ , with $Z$ in place of $g^{xy}$ .
return $c := (g^y,m_b\cdot Z)$

Receive attackers guess $b'$ of $b$ .
If $Z=g^{xy}$ then $c$ is a real ElGamal ciphertext and $b' = b$ more often than $50\%$ of the time because of his advantage.
If $Z$ was just a random number then $c$ is independent of $b$ and $b' = b$ exactly $50\%$ of the time.
So we answer: $(b = b') \Rarr (Z=g^{xy})$ and $(b \not = b') \Rarr (Z \not=g^{xy})$ , which is correct more often than $50\%$ of the time.
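The reduction can be played through in a toy group. This is only a sketch under strong assumptions: the group $\Z_{23}^*$ with $g=5$ is tiny, and the "ElGamal breaker" `brute_force_adversary` is a hypothetical stand-in that finds $x$ by exhaustive search, which works only because the group is so small. The distinguisher builds the challenge ciphertext as $(g^y, m_b \cdot Z)$ from the given DDH tuple.

```python
import secrets

P, G = 23, 5   # toy group Z_23^* with generator 5 (assumed for illustration)

def brute_force_adversary(pk, challenge_for):
    """Hypothetical ElGamal breaker: finds x by exhaustive search (toy group only)."""
    p, g, gx = pk
    x = next(i for i in range(1, p) if pow(g, i, p) == gx)
    m0, m1 = 3, 4                                  # the adversary's two messages
    a, b = challenge_for(m0, m1)
    decrypted = (b * pow(a, p - 1 - x, p)) % p
    if decrypted == m0:
        return 0
    if decrypted == m1:
        return 1
    return secrets.randbelow(2)                    # ciphertext looks random: guess

def ddh_distinguisher(gx, gy, z, adversary) -> bool:
    """Answer 'Z = g^(xy)?' by playing the CPA challenger with (g^y, m_b * Z)."""
    chosen = {}
    def challenge_for(m0, m1):
        chosen["b"] = secrets.randbelow(2)
        m_b = (m0, m1)[chosen["b"]]
        return gy, (m_b * z) % P
    guess = adversary((P, G, gx), challenge_for)
    return guess == chosen["b"]

x, y = 6, 15
gx, gy = pow(G, x, P), pow(G, y, P)
print(ddh_distinguisher(gx, gy, pow(G, x * y, P), brute_force_adversary))
```

For a real tuple the distinguisher always answers `True` here. For a random $Z$ the answer is wrong in a large fraction of runs, so repeating the game separates the two cases.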

Naive RSA

Also called "textbook RSA" because it is simplified - but not secure.

$\text{Gen}()$

Pick two distinct random primes $p,q$
$p \cdot q = N$
calculate $\varphi(N) \textcolor{grey}{=\left|\mathbb{Z}_{N}^{*}\right|} = (p-1)(q-1)$ (size of set where all elements have an inverse)

Choose a random $\textcolor{pink}e$ so that it has an inverse in $\Z_{\varphi(N)}: ~~~\gcd(\textcolor{pink}e,\varphi(N))=1$
$e$ has an inverse in $\Z_{\varphi(N)}$ but not necessarily one in $\Z_N$ .
$\textcolor{pink}d:=\textcolor{pink}{e^{-1} }$ in $\Z_{\varphi(N)}$
return key pair
$p k:=(N, \textcolor{pink}e)$
$s k:=(p, q, \textcolor{pink}d)$
You can not figure out $d$ just from $e$ and $N$ unless you know $p,q$ - this is a simplified version.

Encryption $\text{Enc}(pk, m)$

$p k =\left(N,\textcolor{pink}e\right)$

return ciphertext $c:=m^{\textcolor{pink}e}$ in $\Z_N$ .

Decryption $\text{Dec}(sk, c)$

$sk =(p, q,\textcolor{pink} d)$

return $m:=c^{\textcolor{pink}d}$ in $\Z_N$ .

Correctness

Decryption: $c=m^{\textcolor{pink}e}$

$c^d = (m^e)^d = m^{ed} = m^{ee^{-1}}= m^1 = m$ in $\Z_N$

What still needs to be proven:

We know that $ed = ee^{-1} = 1$ in $\Z_{\varphi(N)}$ - but what about $\Z_N$ ? See below.
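A small numeric run of textbook RSA. The sketch assumes the toy values $p=61$ , $q=53$ , $e=17$ , which are illustrative and far too small to be secure, and uses Python's built-in `pow(e, -1, phi)` (Python 3.8+) to compute $d$ .

```python
from math import gcd

def gen(p: int, q: int, e: int):
    """Textbook RSA key generation from two given primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e has no inverse in Z_phi(N)")
    d = pow(e, -1, phi)            # modular inverse (Python 3.8+)
    return (n, e), (p, q, d)

def enc(pk, m: int) -> int:
    n, e = pk
    return pow(m, e, n)

def dec(sk, c: int) -> int:
    p, q, d = sk
    return pow(c, d, p * q)

pk, sk = gen(61, 53, 17)           # toy primes, far too small for real use
c = enc(pk, 65)
print(sk[2], c, dec(sk, c))        # 2753 2790 65
```

Here $e \cdot d = 17 \cdot 2753 = 46801 = 15 \cdot 3120 + 1$ , so $ed = 1$ in $\Z_{\varphi(N)}$ .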

This version is insecure: No randomization of $\text{Enc}$

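Not CPA-secure because it is deterministic: an attacker who has $pk$ can encrypt candidate messages himself and compare the results with an observed ciphertext.

Same messages result in the same ciphertexts: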

$m=m^{\prime} \Rightarrow \operatorname{Enc}(p k, m)=\operatorname{Enc}\left(p k, m^{\prime}\right)$
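A sketch of how determinism lets an attacker win the CPA experiment every time. It assumes a toy public key $(N, e) = (3233, 17)$ . The function names are chosen for this example.

```python
import secrets

N, E = 3233, 17   # toy textbook-RSA public key (61 * 53), assumed for illustration

def enc(m: int) -> int:
    """Deterministic textbook RSA encryption."""
    return pow(m, E, N)

def adversary(m0: int, m1: int, c: int) -> int:
    """Re-encrypt m0 with the public key and compare with the challenge."""
    return 0 if enc(m0) == c else 1

def cpa_experiment(m0: int, m1: int) -> bool:
    b = secrets.randbelow(2)                # challenger's secret choice
    c = enc((m0, m1)[b])
    return adversary(m0, m1, c) == b

wins = sum(cpa_experiment(42, 1337) for _ in range(1000))
print(wins)   # 1000: the attacker always guesses correctly
```

With a randomized encryption algorithm, re-encrypting $m_0$ would give a different ciphertext each time and this comparison would fail.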

Secure usage of RSA

large key length (large modulus $N$ )

Preprocessing, randomized padding

no timing side channels

Proof of Correctness

Chinese remainder theorem CRT

Let $p \not = q$ be primes and $N = p\cdot q$ .

Two numbers are equal in $\Z_{N}$ $\Leftrightarrow$ they are also equal in $\Z_{p}$ and $\Z_{q}$ .

Proof of Correctness

$e$ has an inverse $d$ in $\Z_{\varphi(N)}$ , now we want to prove that $m^{ed} = m$ also holds in $\Z_N$ .

By the CRT it is sufficient to prove that $m^{ed} = m^{e\cdot e^{-1}} = m$ in $\Z_{p}$ and $\Z_{q}$ . We show it for $\Z_p$ , the argument for $\Z_q$ is the same.

case 1: $m = 0$ in $\Z_p$

then $m^{ed} = 0^{ed} = 0 = m$ in $\Z_{p}$ .

case 2: $m \not= 0$ in $\Z_p$

Since $e\cdot d = e \cdot e^{-1}= 1$ in $\Z_{\varphi(N)}= \Z_{(p-1)(q-1)}$ : $\exists k:~ e \cdot d=k(p-1)(q-1)+1$

$m \in \Z^*_p$ , so $m^{p-1} = 1$ in $\Z_p$ (Fermat's little theorem) and therefore

$m^{ed} = m^{k(p-1)(q-1)+1} = (m^{p-1})^{k(q-1)} \cdot m = 1 \cdot m = m$ in $\Z_p$
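The statement $m^{ed} = m$ in $\Z_N$ can be checked exhaustively for a toy modulus, including the messages that are not invertible in $\Z_N$ . A sketch assuming $p=61$ , $q=53$ , $e=17$ (illustrative values):

```python
from math import gcd

p, q, e = 61, 53, 17               # toy parameters, assumed for illustration
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)                # e * d = 1 in Z_phi(N) (Python 3.8+)

# m^(ed) = m for every m in Z_N, including the m that share a factor with N.
all_correct = all(pow(m, e * d, n) == m for m in range(n))
not_invertible = [m for m in range(n) if gcd(m, n) != 1]

# CRT view: equality in Z_N is the same as equality in Z_p and in Z_q.
m = 2 * p                          # m = 0 in Z_p, m != 0 in Z_q
same_mod_p = pow(m, e * d, p) == m % p
same_mod_q = pow(m, e * d, q) == m % q
print(all_correct, len(not_invertible), same_mod_p, same_mod_q)
```

The message $m = 2p$ falls into case 1 for $\Z_p$ and into case 2 for $\Z_q$ , which is why the cases are treated per prime.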

Digital signature

$\mathcal{M}$ message space, where $m \in \mathcal{M}_{pk} \subseteq \mathcal{M}$

$\mathcal M_{pk} = \{ m \mid \text{Sig}(sk,m) \neq \text{error} \}$ (all messages that can be signed without an error)

$\text{Gen}() = (pk, sk)$ randomized key generation algorithm

$\text{Sig}(\textcolor{pink}{sk}, m) = t$ (often randomized) signing algorithm that generates a tag

$\text{Ver}(\textcolor{pink}{pk},m,t) \in \{0,1\}$ verification algorithm - returns $1$ if $t$ is a valid tag for $m$ (e.g. for RSA: decrypts $t$ and checks whether it is equal to $m$ )

Correctness

$\forall (pk,sk), m, t\in \{\text{Sig}(sk,m) \}: \quad \text{Ver}(pk,m,t)=1$

CMA-Security

Chosen message attack CMA: access to the signing algorithm.

Goal: forging a signature - the attacker is successful if he outputs a pair $(m,t)$ for a message $m$ that was never signed for him, and the verifier returns $1$ .

$\operatorname{Pr}(\operatorname{Exp}_{I_{n}, A_{n}}^{\mathrm{CMA}}=1) \approx 0$

Naive RSA-based Digital Signatures

Same as naive RSA but

$p k:=(N, \textcolor{pink}e)$

$s k:=\textcolor{pink}d$

with the signing algorithm just returning $t:=m^{\textcolor{pink}d}$ and the verifier returning $1$ if the decrypted $t$ , i.e. $t^{\textcolor{pink}e}$ , is equal to $m$ .

Correctness

$t^{e} = m^{e d} = m$ in $\Z_N$

This is a simplified version. Not secure.

Would be secure if the signing algorithm also hashed the messages:

$\text{Sig}(sk, m)=(\mathrm{H}(m))^{d} \bmod \mathrm{N}$
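A sketch comparing both variants. It assumes toy parameters (the Mersenne primes $2^{61}-1$ and $2^{31}-1$ with $e = 65537$ , far too small for real use) and SHA-256 reduced modulo $N$ as $\mathrm{H}$ , which is a simplification made for this example. The forgery shown against the naive scheme is one well-known way to produce a valid pair $(m,t)$ without $sk$ .

```python
import hashlib

P, Q, E = 2**61 - 1, 2**31 - 1, 65537     # toy primes, far too small for real use
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))         # modular inverse (Python 3.8+)

def h(message: bytes) -> int:
    """Hash to an integer in Z_N (SHA-256 reduced mod N, a simplification)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sig_naive(m: int) -> int:
    return pow(m, D, N)

def ver_naive(m: int, t: int) -> int:
    return 1 if pow(t, E, N) == m % N else 0

def sig_hashed(message: bytes) -> int:
    return pow(h(message), D, N)

def ver_hashed(message: bytes, t: int) -> int:
    return 1 if pow(t, E, N) == h(message) else 0

# Forgery against the naive scheme without sk: pick any t, then m := t^e.
forged_t = 123456789
forged_m = pow(forged_t, E, N)
print(ver_naive(forged_m, forged_t))

t = sig_hashed(b"pay 10 EUR")
print(ver_hashed(b"pay 10 EUR", t), ver_hashed(b"pay 99 EUR", t))
```

With hashing, the same trick would require a message whose hash equals the chosen $t^e$ , i.e. inverting $\mathrm{H}$ .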