Missing Message Agreement #55

Open
lumaier opened this issue Sep 24, 2024 · 4 comments
Labels
protocol research Issues for tracking protocol research and choices security Potential and confirmed security issues

Comments

@lumaier
Contributor

lumaier commented Sep 24, 2024

Problem: The current protocol draft has no message agreement. In general, message agreement is concerned with whether the sender and receiver have a shared understanding of the messages exchanged. Non-injective agreement guarantees that if $B$ receives a message $msg$ from $A$, then $A$ has sent $msg$ to $B$.

The attack works as follows: Source $S$ wants to send a message $msg$ to journalist $J_1$ of newsroom $NR_1$. If the long-term signing key $sk_1$ of $J_1$ is leaked, an adversary can use $sk_1$ to sign an ephemeral encryption key $ek$ of a different journalist $J_2$ (enrolled at a different newsroom $NR_2$). This key $ek$, which the source believes belongs to $J_1$, is used to encrypt the message $msg$. The ciphertext is then relayed to $J_2$ by an active network adversary. Hence $J_2$ of $NR_2$ receives and decrypts the message, even though $S$ intended to send it to $J_1$ of $NR_1$.
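The attack trace can be replayed in a purely symbolic toy model. This is only a sketch: the `SignedKey` and `Ciphertext` classes and the party labels are hypothetical names introduced for illustration, and no real cryptography is performed. The only point is that the source checks who signed $ek$, not who holds it.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SignedKey:
    holder: str  # party that actually holds the decryption key for ek
    signer: str  # party whose long-term signing key produced the signature

@dataclass(frozen=True)
class Ciphertext:
    holder: str  # only this party can decrypt
    msg: str

def source_encrypt(intended: str, key: SignedKey, msg: str) -> Ciphertext:
    # The source only verifies the signature under the intended journalist's key.
    if key.signer != intended:
        raise ValueError("signature does not verify under the intended key")
    return Ciphertext(holder=key.holder, msg=msg)

def decrypt(party: str, ct: Ciphertext) -> Optional[str]:
    # Symbolic decryption: succeeds only for the holder of ek.
    return ct.msg if party == ct.holder else None

# The adversary knows J1's leaked signing key and signs J2's ephemeral key with it.
forged = SignedKey(holder="J2@NR2", signer="J1@NR1")
ct = source_encrypt("J1@NR1", forged, "msg")

received_by_j1 = decrypt("J1@NR1", ct)  # intended recipient cannot decrypt
received_by_j2 = decrypt("J2@NR2", ct)  # relayed ciphertext decrypts at J2
```

Nothing in the ciphertext tells $J_2$ that the source meant to reach $NR_1$, which is the missing agreement.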

We propose two possible approaches: Both work with the assumption that from the POV of the source, a particular newsroom $NR$ is the intended receiver (not a journalist - journalists only act on behalf of the newsroom).

Here is how the protocol encrypts a message $msg$ using an ephemeral key $m$:

Screenshot from 2024-09-24 16-54-45

Variant 1: Incorporate the newsroom identity in the message and use the source's long-term key $s$ as part of the encryption key. (Our changes are shown in red.)

Screenshot from 2024-09-24 16-55-00

The encryption key incorporates the source's long-term secret $s$ but masks it with the ephemeral key $m$, so $\hat{m}$ does not leak the identity of the source. The journalist checks whether $g^s$ and $m$ were used to encrypt the message, which gives origin authentication. Because no adversary can tamper with the message without knowing $k$, including the intended newsroom $NR$ lets the journalist verify the source's intention. Without $NR$, the ciphertext could still be relayed to a different journalist.

Variant 2: Incorporate the newsroom identity in the message and let the source sign the message using its long-term secret $s$.

Screenshot from 2024-09-24 16-55-06

The journalist first decrypts the message and then checks whether the source with knowledge of $s$ has sent the message.

Security: We were able to prove that both variants guarantee non-injective and injective agreement between sources and newsrooms on messages in the symbolic model.

@lsd-cat lsd-cat added protocol research Issues for tracking protocol research and choices security Potential and confirmed security issues labels Oct 4, 2024
@lsd-cat
Member

lsd-cat commented Oct 5, 2024

Thank you, this is very valuable input. As discussed, I think this attack is a demonstration of the underlying problem of the missing message agreement. I prefer variant 1, because by avoiding signatures we both avoid introducing another cryptographic primitive and keep DH-related message repudiation (deniability).

While thinking about variant 1, I wondered whether the more common way to achieve the same is to compute partial DH shares and then use them to derive a key, as X3DH does. To understand that, I have taken a stab at demoing the protocol using PQXDH (which has the side benefit of introducing PQ resistance for message secrecy). Due to the asymmetry in the protocol there are some limitations, as we will see, and I am uncertain whether we would inherit the same properties given these changes.

pqxdh_securedrop.txt

Test 1: Source to Journalist
{"Dir: <class '__main__.Source'> -> <class '__main__.Journalist'>"}
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce
Success!


Test 2: Journalist to Source
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Source'>"}
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1
Success!


Test 3: Journalist to Journalist
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Journalist'>"}
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b
Success!


Test 4: Source to Source
{"Dir: <class '__main__.Source'> -> <class '__main__.Source'>"}
DH1: 
DH2: 2e18c8ccd684870e13379add7ecd1b409dc4bdc5ca1e6c00fc888689c9b0ca41
DH3: cde4df9507d53d591bd5b91335b49dee9d4c9090b8487874af814b677bf78f75
DH4: 
 SS: ec3ea58924d00131c94f2c990f405b585abd838423ee3215629743a5e159ff6e
KEY: dfd543a5097d752fbeb0558580b83abf55565fda31b7f390b9218ed845210e33
DH1: 
DH2: 2e18c8ccd684870e13379add7ecd1b409dc4bdc5ca1e6c00fc888689c9b0ca41
DH3: cde4df9507d53d591bd5b91335b49dee9d4c9090b8487874af814b677bf78f75
DH4: 
 SS: ec3ea58924d00131c94f2c990f405b585abd838423ee3215629743a5e159ff6e
KEY: dfd543a5097d752fbeb0558580b83abf55565fda31b7f390b9218ed845210e33
Success!

It has been implemented manually, directly from the official specification. As we can see in Test 1, when sending from a source to a journalist we cannot do DH1 because the public key of the source cannot be advertised (and thus we cannot satisfy the spec and attach AD = EncodeEC(IKA) || EncodeEC(IKB)). If we did, a journalist could never decrypt, because they cannot learn the sender's public key before the first contact message.

Similarly, when a journalist is sending to a source, as in Test 2, we cannot do DH4, because sources do not have ephemeral (one-time) keys.

When doing journalist to journalist, as in Test 3, PQXDH should be complete as per the spec.

Both Test 2 and Test 3 should inherit full PQXDH properties. I am unsure of the consequences of removing DH1 from Test 1.

A returning source could potentially do a full PQXDH too, since the source is then known.

Furthermore, this makes decryption more complex: I am quite sure the journalist will have to do more decryption attempts (for example, trying all the ephemeral keys for every source already known).

@lsd-cat
Member

lsd-cat commented Oct 5, 2024

In a more readable format, with the requirements to run it, here.

Memo: for simplicity I am currently using a single long-term PQ key, which would not provide forward secrecy in the PQ domain. Let's think about this later :). Signal uses them interchangeably, as they are used only for encryption and not for authentication.

The consideration here is that we already have 3 sets of keys (plus the PQ one) for every participant, and the one-time (or ephemeral) keys for the journalist. This actually matches PQXDH 1:1 if we use all of them, including the fetching key.

The way I applied it is by considering the following mapping:

| Signal participant short key name | Signal description | SD Journalist | SD Source | SD description |
|---|---|---|---|---|
| IK | Identity key | J | S | Long term identity key |
| EK | Ephemeral key | ME | ME | Per-message ephemeral key |
| SPK | Signed prekey | JC | SC | Fetching key |
| (OPK1, OPK2, …) | Set of one-time prekeys | (JE1, JE2, ...) | | Journalist ephemeral keys |
| PQPK | PQ signed prekey | JPQPK | SPQPK | Post quantum public key |

This is how a run of the protocol should work with this set of keys.

With this mapping, let's picture the 3 different types of exchanges.

Source to Journalist

This is a first contact message between a secret party and a public party.

Source shared key computation

  1. DH1 = DH(S, JC) -> The Journalist will not be able to decrypt, because they do not know S.
  2. DH2 = DH(ME, J)
  3. DH3 = DH(ME, JC)
  4. DH4 = DH(ME, JEi)
  5. (CT, SS) = PQKEM-ENC(JPQPK)
  6. SK = KDF(DH1 || DH2 || DH3 || DH4 || SS)

Journalist shared key computation (trial decryption with the set of (JE, ...))

  1. DH1 = DH(JC, S)
  2. DH2 = DH(J, ME)
  3. DH3 = DH(JC, ME)
  4. DH4 = DH(JEi, ME)
  5. (SS) = PQKEM-DEC(JPQPK, CT)
  6. SK = KDF(DH1 || DH2 || DH3 || DH4 || SS)

The journalist can compute the shared secret by knowing only the ME public key, and the PQ CT.
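The two computations above can be sketched as follows, with DH1 left empty as in Test 1. This is a minimal illustration under loud assumptions: the modular-exponentiation group is a toy stand-in for the curve used by PQXDH and is not secure, SHA-256 over the concatenation stands in for the real KDF, and the PQ KEM is replaced by a random value `ss` that both sides are assumed to share (a real KEM would transport it inside CT).

```python
import hashlib
import secrets

# Toy Diffie-Hellman group for illustration only (NOT secure, NOT X25519).
P = 2**255 - 19
G = 2

def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def dh(sk, pk):
    return pow(pk, sk, P).to_bytes(32, "big")

def kdf(*parts):
    # Stand-in KDF: SHA-256 over DH1 || DH2 || DH3 || DH4 || SS.
    return hashlib.sha256(b"".join(parts)).hexdigest()

# Journalist keys: identity J, fetching key JC, one-time key JEi.
j_sk, j_pk = keygen()
jc_sk, jc_pk = keygen()
je_sk, je_pk = keygen()
# Source per-message ephemeral key ME.
me_sk, me_pk = keygen()
# Placeholder for the KEM shared secret SS.
ss = secrets.token_bytes(32)

dh1 = b""  # omitted: the journalist does not know S on first contact

# Source side: uses the journalist's public keys and its own ME secret.
source_key = kdf(dh1, dh(me_sk, j_pk), dh(me_sk, jc_pk), dh(me_sk, je_pk), ss)

# Journalist side: needs only the ME public key (and SS recovered from CT).
journalist_key = kdf(dh1, dh(j_sk, me_pk), dh(jc_sk, me_pk), dh(je_sk, me_pk), ss)
```

Both sides arrive at the same key, but nothing in it depends on a source secret other than ME, which is why question 1 below asks what is lost without DH1.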

Questions

  1. What do we lose when not doing DH1? I would say sender authentication, but this is to be verified.
  2. Are CT values unlinkable? Does sending them to the server paired with the messages and the other values break other guarantees?
  3. Does using the fetching key for multiple purposes, which are now encryption and fetching, weaken something else?
Test 1: Source to Journalist
{"Dir: <class '__main__.Source'> -> <class '__main__.Journalist'>"}
DH1: 
DH2: 1ac1a247d2eb55ba48673b03a5f85104103920cb7b63e84aa156c8d8fc4daf26
DH3: 61fb424b5c5d2f574224b376d508b4fe6ff8eda02d95b58e48f9268b2aeb4d2d
DH4: 0cc8a6f78a3ed68d34a4c76e6a7b5391ef2cc0fa052f9fbef2f3bcad527ada66
 SS: 1c57fd0c60c6f2f4616a82a4f0d0f3091a3d797c3f2a3b704068d48f61260f97
KEY: 92117d36dc8a258cc35e804dd32a078bcb065b3731f0620ffafdad8abef198ce

Journalist to Source

  1. DH1 = DH(J, SC)
  2. DH2 = DH(ME, S)
  3. DH3 = DH(ME, SC)
  4. DH4 = DH(ME, ..) -> Sources do not have ephemeral (one-time) keys
  5. (CT, SS) = PQKEM-ENC(SPQPK)
  6. SK = KDF(DH1 || DH2 || DH3 || DH4 || SS)
Test 2: Journalist to Source
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Source'>"}
DH1: 8b233fe20a51e15dc719d1d07e82373459189a7a135eb1870ade9065d3b8fd36
DH2: 75b60a29a82ddc1b48242788297c594dfd28c87a14ac3ff760ea1c37e8005130
DH3: ed89b89afde975e5eec0dc4638f4dee136d391f7031a84d47975936a016d8551
DH4: 
 SS: 2d1072d3258944f98ee0b7b96fe4eb9757207e642f4eb64d2a26078c3dc2e562
KEY: 905492e3c82caa0cd880e33403c90fd68f73e5c605aa5b90069243583f5080b1

Journalist to Journalist

What a joy:

  1. DH1 = DH(JA, JCB)
  2. DH2 = DH(ME, JB)
  3. DH3 = DH(ME, JCB)
  4. DH4 = DH(ME, JEiB)
  5. (CT, SS) = PQKEM-ENC(JPQPK)
  6. SK = KDF(DH1 || DH2 || DH3 || DH4 || SS)

There is a subtlety here: the journalist does not know from whom the message is coming, so they have to use trial decryption. They have to try all n JE as if the sender were a source, and then all n JE * the number of journalists as if the sender were a journalist.

If we assume that replying sources can also add their DH1, because their identity is now known, then the journalist would also have to try the total number of sources * n JE.
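As a rough way to see how these counts add up, here is a small helper. It is only a sketch: the function name and the example numbers are hypothetical, and it counts worst-case key derivations per incoming message rather than measuring anything.

```python
def trial_decryptions(n_je: int, n_journalists: int, n_known_sources: int = 0) -> int:
    """Worst-case number of candidate keys a journalist tries for one message."""
    as_unknown_source = n_je                  # DH1 omitted, one try per JE
    as_journalist = n_je * n_journalists      # DH1 with each journalist identity
    as_known_source = n_je * n_known_sources  # DH1 with each already known source
    return as_unknown_source + as_journalist + as_known_source

# Hypothetical deployment: 100 one-time keys, 10 journalists, 50 known sources.
first_contact_only = trial_decryptions(100, 10)
with_returning_sources = trial_decryptions(100, 10, 50)
```

With these made-up numbers the count grows from 1100 to 6100 once returning sources add DH1, so the cost is dominated by the number of identities that must be guessed.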

Test 3: Journalist to Journalist
{"Dir: <class '__main__.Journalist'> -> <class '__main__.Journalist'>"}
DH1: f569399114e5a20e4fa4186c5f5f7a78dfe9c8d5cbf7bc4d3e729eeb8f3c9874
DH2: 06e6b25d848381d2b4310e1af7e710c68a1b648de05695e40c55ed2dd51f0b2d
DH3: bbd41415989395f60091140e0fa3c2e38e9e6d146bf787a1094b71892c137719
DH4: 77261e96ee8ea1193a90822c709213fe569c23cbd2943a3b9f6422cd12372923
 SS: 223b39718f0cfbf91cf3bba6d0db823b4b408daa5e74225d5490d4305ab29e69
KEY: e06a4fa0ae35ada34aeea06a11f5d9d1b32a6120d3584919208c5c6b7aa72d4b

If we want to use ephemeral or semi-ephemeral PQ keys, we have to pair each of them with a classical journalist ephemeral key, otherwise we'd get another quadratic increase in trial decryption complexity.
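The difference between paired and independent key sets can be illustrated by enumerating the candidates a journalist would have to try. The key labels (`JE0`, `JPQ0`, ...) are hypothetical placeholders, not real keys.

```python
from itertools import product

n = 5
je_keys = [f"JE{i}" for i in range(n)]   # classical one-time keys
pq_keys = [f"JPQ{i}" for i in range(n)]  # hypothetical one-time PQ keys

# Independent sets: any JE may have been combined with any PQ key.
independent = list(product(je_keys, pq_keys))

# Paired sets: JEi is only ever used together with JPQi.
paired = list(zip(je_keys, pq_keys))
```

Here `independent` has n * n entries while `paired` has n, which is the quadratic factor mentioned above.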

Note: I noticed that what I called a clue in the code is not consistent with the nomenclature of the README or even the blog post. The message fetching part is demoed only to show that everything can work together, and it is not really the point. But we should really start fixing the docs.

Also doing this would finally close #48 (partially), #31, #30.

@lumaier
Contributor Author

lumaier commented Nov 4, 2024

We came up with another variant to get message agreement. From a high-level perspective: the sender encrypts its public DH share $g^a$ using an asymmetric encryption scheme under the receiver's public key $pk(sk_B)$. This way, the sender's identity $g^a$ is not revealed unless the adversary has access to $sk_B$. Because the receiver's identity also needs to be kept secret, the PKE scheme needs to provide anonymous encryption (i.e., the ciphertext doesn't reveal the public key that was used).

The KeyGeneration algorithm works as follows:

Source:

  • The source derives a master secret $MS_S$ from the passphrase.
  • It generates two secrets $S_{SK,DH} \parallel S_{SK,PKE} = KDF(MS_S)$ from the master secret
  • The corresponding public keys are $S_{PK,DH} = DH(g, S_{SK,DH})$ and $S_{PK,PKE} = GetPub(S_{SK,PKE})$

Journalist:

  • The journalist derives a master secret $MS_J$.
  • It generates two long-term secrets $J_{SK,DH} \parallel J_{SK,SIG} = KDF(MS_J)$ from the master secret
  • The corresponding public keys are $J_{PK,DH} = DH(g, J_{SK,DH})$ and $J_{PK,SIG} = GetPub(J_{SK,SIG})$. The tuple $J_{PK,DH} \parallel J_{PK,SIG}$ is signed by the newsroom.
  • It generates ephemeral master secrets $MS_{JE}$
  • For each of them it generates two secrets $JE_{SK,DH} \parallel JE_{SK,PKE} = KDF(MS_{JE})$ and corresponding public keys $JE_{PK,DH} = DH(g, JE_{SK,DH})$ and $JE_{PK,PKE} = GetPub(JE_{SK,PKE})$. The tuple $JE_{PK,DH} \parallel JE_{PK,PKE}$ is signed using $J_{SK,SIG}$.

Note that

  • the parties are annotated with $S$ or $J$
  • the suffix $E$ denotes an ephemeral key
  • the usages are denoted with $DH$ (Diffie Hellman key share), $PKE$ for a public-key encryption scheme, $SIG$ for a signature scheme
  • $PK$ stands for public key and $SK$ for secret key

For example, $JE_{SK,PKE}$ is a journalist's ephemeral private key used in a public-key encryption scheme.
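The split $S_{SK,DH} \parallel S_{SK,PKE} = KDF(MS_S)$ could look roughly like this. The thread does not fix a KDF or a passphrase-hardening function, so everything concrete here is an assumption: PBKDF2 with a hypothetical salt and iteration count for the passphrase step, an HKDF-Expand-style loop over HMAC-SHA256 for the expansion, and a made-up `info` label.

```python
import hashlib
import hmac

def master_secret(passphrase: str, salt: bytes = b"hypothetical-salt") -> bytes:
    # Assumption: PBKDF2 as the passphrase-to-master-secret step.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)

def kdf(ms: bytes, length: int = 64, info: bytes = b"key-split") -> bytes:
    # HKDF-Expand-style expansion: chain HMAC blocks until enough output exists.
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(ms, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def split_secrets(ms: bytes):
    # First half becomes the DH secret, second half the PKE secret.
    okm = kdf(ms)
    return okm[:32], okm[32:]

ms_s = master_secret("correct horse battery staple")
s_sk_dh, s_sk_pke = split_secrets(ms_s)
```

Because the derivation is deterministic, a returning source can recompute both secrets from the passphrase alone; the same pattern would apply to $MS_J$ and each $MS_{JE}$.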

The encryption from source to journalist would look as follows:

(assuming the source has access to the newsroom's keys and the journalist's verified public keys $JE_{PK,DH}$ and $JE_{PK,PKE}$)

$\textbf{Source Encryption}$

  1. $k = KDF(DH(JE_{PK,DH},{S_{SK,DH}}))$
  2. $ckey = PKE.Enc(JE_{PK,PKE}, S_{PK,DH})$
  3. $c = SE.Enc(k, msg \parallel S_{PK,DH} \parallel S_{PK,PKE} \parallel J \parallel NR)$
  4. return $ckey, c$

$\textbf{Journalist Decryption}$ on input $(ckey, c)$

  1. $S_{PK,DH} = PKE.Dec(JE_{SK,PKE}, ckey)$
  2. $k = KDF(DH(S_{PK,DH},{JE_{SK,DH}}))$
  3. $msg \parallel S_{PK,DH} \parallel S_{PK,PKE} \parallel J \parallel NR = SE.Dec(k, c)$

Since this version relies on the source's long-term key for encryption, we would lose secrecy if the sender's (in this case the source's) long-term key were revealed. We can extend the scheme by incorporating an ephemeral key $ME_{SK, DH}$:

$\textbf{Source Encryption}$

  1. generates an ephemeral message key $ME_{SK,DH}$ and computes $ME_{PK,DH} = DH(g, ME_{SK,DH})$
  2. $k = KDF(DH(JE_{PK,DH},{S_{SK,DH}}) \parallel DH(JE_{PK,DH},{ME_{SK,DH}}))$
  3. $ckey = PKE.Enc(JE_{PK,PKE}, S_{PK,DH})$
  4. $c = SE.Enc(k, msg \parallel S_{PK,PKE} \parallel J \parallel NR)$
  5. return $ckey, c, ME_{PK,DH}$

$\textbf{Journalist Decryption}$ on input $(ckey, c, ME_{PK,DH})$

  1. $S_{PK,DH} = PKE.Dec(JE_{SK,PKE}, ckey)$
  2. $k = KDF(DH(S_{PK,DH},JE_{SK,DH}) \parallel DH(ME_{PK,DH},JE_{SK,DH} ))$
  3. $msg \parallel S_{PK,PKE} \parallel J \parallel NR = SE.Dec(k, c)$

Note that you have to include the intended recipient $J$ and $NR$ so that the receiver can verify them (otherwise attack traces exist).
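The extended source-to-journalist flow, including the recipient check, can be sketched end to end with the standard library only. Every primitive here is a toy stand-in chosen so the example runs without dependencies, and none of it is secure or taken from the proposal: a modular-exponentiation group instead of a real DH group, hashed ElGamal as `PKE`, a SHA-256 keystream with an HMAC tag as `SE`, and JSON as the encoding of $msg \parallel S_{PK,PKE} \parallel J \parallel NR$.

```python
import hashlib
import hmac
import json
import secrets

P, G = 2**255 - 19, 2  # toy group, illustration only

def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def dh(pk, sk):
    return pow(pk, sk, P).to_bytes(32, "big")

def kdf(data):
    return hashlib.sha256(b"kdf" + data).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def pke_enc(pk, m32):
    # Toy hashed ElGamal: fresh share c1, 32-byte message masked with a hash pad.
    r, c1 = keygen()
    return c1, xor(m32, kdf(dh(pk, r)))

def pke_dec(sk, ckey):
    c1, c2 = ckey
    return xor(c2, kdf(dh(c1, sk)))

def stream(k, n):
    return b"".join(hashlib.sha256(k + bytes([i])).digest() for i in range(n // 32 + 1))[:n]

def se_enc(k, pt):
    # Toy encrypt-then-MAC; k is fresh per message because ME is fresh.
    ct = xor(pt, stream(k, len(pt)))
    return ct + hmac.new(k, ct, hashlib.sha256).digest()

def se_dec(k, c):
    ct, tag = c[:-32], c[-32:]
    if not hmac.compare_digest(tag, hmac.new(k, ct, hashlib.sha256).digest()):
        raise ValueError("authentication failed")
    return xor(ct, stream(k, len(ct)))

def source_encrypt(msg, s_sk_dh, s_pk_dh, s_pk_pke, je_pk_dh, je_pk_pke, j, nr):
    me_sk, me_pk = keygen()                                   # step 1
    k = kdf(dh(je_pk_dh, s_sk_dh) + dh(je_pk_dh, me_sk))      # step 2
    ckey = pke_enc(je_pk_pke, s_pk_dh.to_bytes(32, "big"))    # step 3
    body = json.dumps({"msg": msg, "S_PK_PKE": s_pk_pke, "J": j, "NR": nr})
    return ckey, se_enc(k, body.encode()), me_pk              # steps 4-5

def journalist_decrypt(packet, je_sk_dh, je_sk_pke, j, nr):
    ckey, c, me_pk = packet
    s_pk_dh = int.from_bytes(pke_dec(je_sk_pke, ckey), "big") # step 1
    k = kdf(dh(s_pk_dh, je_sk_dh) + dh(me_pk, je_sk_dh))      # step 2
    body = json.loads(se_dec(k, c))                           # step 3
    if (body["J"], body["NR"]) != (j, nr):
        raise ValueError("message intended for a different recipient")
    return body["msg"], s_pk_dh

s_sk_dh, s_pk_dh = keygen()
_, s_pk_pke = keygen()
je_sk_dh, je_pk_dh = keygen()
je_sk_pke, je_pk_pke = keygen()
packet = source_encrypt("hello", s_sk_dh, s_pk_dh, s_pk_pke, je_pk_dh, je_pk_pke, "J1", "NR1")
msg, sender = journalist_decrypt(packet, je_sk_dh, je_sk_pke, "J1", "NR1")
```

A receiver that decrypts successfully but is not the named $J$ and $NR$ rejects the message, which is the check that closes the relay attack from the original report.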

For messages from journalists to sources, it works very similarly:

$\textbf{Journalist Encryption}$

  1. generates an ephemeral message key $ME_{SK,DH}$ and computes $ME_{PK,DH} = DH(g, ME_{SK,DH})$
  2. $k = KDF(DH(S_{PK,DH},{J_{SK,DH}}) \parallel DH(S_{PK,DH},{ME_{SK,DH}}))$
  3. $ckey = PKE.Enc(S_{PK,PKE}, J_{PK,DH})$
  4. $c = SE.Enc(k, msg)$
  5. return $ckey, c, ME_{PK,DH}$

$\textbf{Source Decryption}$ on input $(ckey, c, ME_{PK,DH})$

  1. $J_{PK,DH} = PKE.Dec(S_{SK,PKE}, ckey)$
  2. $k = KDF(DH(J_{PK,DH},S_{SK,DH}) \parallel DH(ME_{PK,DH},S_{SK,DH} ))$
  3. $msg = SE.Dec(k, c)$
  4. The source believes that journalist with control over $J_{PK,DH}$ sent the message

We can discuss whether $J_{SK,DH}$ should be used in the key generation for messages from journalists to sources to authenticate the sending journalist.
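To make the role of $J_{SK,DH}$ in that discussion concrete, the following sketch derives $k$ on both sides and compares it with the key an impostor would get without the journalist's long-term secret. As before, the group is a toy stand-in that is not secure, SHA-256 replaces the unspecified KDF, and the `x_sk` impostor key is a hypothetical name.

```python
import hashlib
import secrets

P, G = 2**255 - 19, 2  # toy group, illustration only

def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def dh(pk, sk):
    return pow(pk, sk, P).to_bytes(32, "big")

def kdf(a, b):
    return hashlib.sha256(a + b).digest()

s_sk, s_pk = keygen()    # S_{SK,DH}, S_{PK,DH}
j_sk, j_pk = keygen()    # J_{SK,DH}, J_{PK,DH}
me_sk, me_pk = keygen()  # ME_{SK,DH}, ME_{PK,DH}

# Journalist encryption, step 2.
k_journalist = kdf(dh(s_pk, j_sk), dh(s_pk, me_sk))
# Source decryption, step 2.
k_source = kdf(dh(j_pk, s_sk), dh(me_pk, s_sk))

# An impostor who claims J_{PK,DH} in ckey but only holds its own secret x_sk.
x_sk, _ = keygen()
k_impostor = kdf(dh(s_pk, x_sk), dh(s_pk, me_sk))
```

The keys match only when the sender actually holds $J_{SK,DH}$, which is what step 4 of the source decryption relies on.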

We were able to prove message agreement in the symbolic model (Tamarin) under this scheme.

@cfm
Member

cfm commented Nov 14, 2024

We've done a toy implementation of this DHETM-based encryption scheme in https://gist.github.com/cfm/dab18074b9cecb06cbd006e1ab7ede7f and will be happy to proceed this way, @lumaier. Thank you for the proposal!
