KDF API review, round 2

Jamil Nimeh
Hello all,

Thanks to everyone who has given input so far.  I've updated the KeyDerivation API with the comments I've received.  The new specification is here:

Text: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/kdfspec.02.txt
Javadoc: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/javadoc.02/

In terms of high level changes:
  • Moved to a getInstance/init usage pattern similar to Mac, KeyAgreement, Cipher, etc.  This allows KDF objects to be reused with different parameters by reinitializing.
  • Name change: DerivedKeyParameterSpec --> DerivationParameterSpec
  • Keys returned by derivation methods are now java.security.Key rather than SecretKey
  • Provided additional derivation methods to support non-key based output: deriveData, deriveObject
  • Added a new constructor to DerivationParameterSpec to support the Object return type.
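
For readers unfamiliar with the getInstance/init pattern mentioned in the first bullet, here is a minimal sketch using the existing javax.crypto.Mac class (not the proposed KeyDerivation class, whose final shape is still under review). The class name MacReuseSketch and its helper methods are made up for illustration; only the Mac and SecretKeySpec calls are real JCA API.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class MacReuseSketch {
    /** Selects the algorithm once, by its standard JCA name. */
    static Mac newHmac() {
        try {
            return Mac.getInstance("HmacSHA256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Re-initializes the same Mac object with a new key, then computes a tag over data. */
    static byte[] tag(Mac mac, byte[] keyBytes, byte[] data) {
        try {
            mac.init(new SecretKeySpec(keyBytes, "HmacSHA256")); // replaces any earlier key and state
        } catch (InvalidKeyException e) {
            throw new IllegalArgumentException(e);
        }
        return mac.doFinal(data);
    }

    public static void main(String[] args) {
        Mac mac = newHmac();
        byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] first = tag(mac, new byte[] {1, 2, 3, 4}, msg);
        byte[] second = tag(mac, new byte[] {5, 6, 7, 8}, msg); // same object, different key
        System.out.println("tags differ: " + !Arrays.equals(first, second));
    }
}
```

The point of the pattern is that the algorithm choice is made once in getInstance, while everything that varies per use goes through init.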
Thanks,
--Jamil

Re: KDF API review, round 2

Jamil Nimeh

Hello everyone, one other request.  I would like to end this comment period by the end of next week (11/24) ideally.

Thanks,
--Jamil

On 11/15/2017 08:43 AM, Jamil Nimeh wrote:
> Hello all,
>
> Thanks to everyone who has given input so far.  I've updated the KeyDerivation API with the comments I've received.  The new specification is here:
>
> Text: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/kdfspec.02.txt
> Javadoc: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/javadoc.02/
>
> In terms of high level changes:
>   • Moved to a getInstance/init usage pattern similar to Mac, KeyAgreement, Cipher, etc.  This allows KDF objects to be reused with different parameters by reinitializing.
>   • Name change: DerivedKeyParameterSpec --> DerivationParameterSpec
>   • Keys returned by derivation methods are now java.security.Key rather than SecretKey
>   • Provided additional derivation methods to support non-key based output: deriveData, deriveObject
>   • Added a new constructor to DerivationParameterSpec to support the Object return type.
> Thanks,
> --Jamil


Re: KDF API review, round 2

Michael StJohns
In reply to this post by Jamil Nimeh
On 11/15/2017 11:43 AM, Jamil Nimeh wrote:
> Hello all,
>
> Thanks to everyone who has given input so far.  I've updated the KeyDerivation API with the comments I've received.  The new specification is here:
>
> Text: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/kdfspec.02.txt
> Javadoc: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/javadoc.02/
>
> In terms of high level changes:
>   • Moved to a getInstance/init usage pattern similar to Mac, KeyAgreement, Cipher, etc.  This allows KDF objects to be reused with different parameters by reinitializing.
>   • Name change: DerivedKeyParameterSpec --> DerivationParameterSpec
>   • Keys returned by derivation methods are now java.security.Key rather than SecretKey
>   • Provided additional derivation methods to support non-key based output: deriveData, deriveObject
>   • Added a new constructor to DerivationParameterSpec to support the Object return type.
> Thanks,
> --Jamil


This is pretty close, but I think you need to add an AlgorithmParameters argument to each of the getInstance calls in KeyDerivation - or require each KDF to specify a default model - not all KDFs are fully specified in a given document.

Alternately, you could use the .setParameter/.getParameter model of Signature, but it may be that underlying code will actually be creating a whole new instance.  (E.g. getInstance("NIST-SP800-108") vs getInstance("NIST-SP800-108-Counter") vs getInstance("NIST-SP800-108/Counter"))


Here's the model I'm thinking about:

SP800-108 is a parameterized set of key derivation functions which goes something like:

  • Pick either Counter or Feedback.
  • Pick the PRF (e.g. HMAC-SHA256, AES-128-CMAC, etc).
  • Pick the size of the counter and endianness (e.g. big-endian Uint16).
  • Pick the size and endianness of L.
  • Pick whether the counter precedes or follows the fixed data (for counter mode).
  • Pick whether the counter is included and whether it precedes or follows the fixed data (for feedback mode).

Taken together those instantiation parameters define a particular KDF model.

Then for the .init() call, the kdfParams is where the Label and Context data go (what HKDF calls 'info').  For most KDFs this could just be a byte array.
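
To make the split between "instantiation parameters" and per-use inputs concrete, here is a rough sketch of a counter-mode construction in the spirit of SP800-108. It is a simplification and not a conformant implementation: it only supports big-endian counters, omits L, and treats Label/Context as one opaque fixed-data array. The class CounterKdfModel and its fields are hypothetical and do not appear in the proposed API.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;

/** Hypothetical holder for the choices that define one concrete KDF model. */
final class CounterKdfModel {
    final String prf;           // JCA Mac name used as the PRF, e.g. "HmacSHA256"
    final int counterBytes;     // width of the big-endian counter, 1..4
    final boolean counterFirst; // true: counter precedes the fixed data

    CounterKdfModel(String prf, int counterBytes, boolean counterFirst) {
        if (counterBytes < 1 || counterBytes > 4) {
            throw new IllegalArgumentException("counterBytes must be 1..4");
        }
        this.prf = prf;
        this.counterBytes = counterBytes;
        this.counterFirst = counterFirst;
    }

    /** Per-use inputs: the key, the fixed data (label/context) and the output length. */
    byte[] derive(byte[] key, byte[] fixedData, int length) {
        if (length < 0) throw new IllegalArgumentException("negative length");
        long maxCounter = (1L << (8 * counterBytes)) - 1;
        try {
            Mac mac = Mac.getInstance(prf);
            mac.init(new SecretKeySpec(key, prf));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int i = 1; out.size() < length; i++) {
                if (i > maxCounter) throw new IllegalArgumentException("length too large for counter width");
                byte[] ctr = encodeCounter(i);
                if (counterFirst) { mac.update(ctr); mac.update(fixedData); }
                else { mac.update(fixedData); mac.update(ctr); }
                byte[] block = mac.doFinal(); // doFinal also resets the Mac for the next round
                out.write(block, 0, Math.min(block.length, length - out.size()));
            }
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private byte[] encodeCounter(int i) {
        byte[] b = new byte[counterBytes];
        for (int k = 0; k < counterBytes; k++) {
            b[counterBytes - 1 - k] = (byte) (i >>> (8 * k)); // big-endian
        }
        return b;
    }
}
```

Two models that differ only in counter width or counter position produce unrelated output for the same key and fixed data, which is why those choices have to be pinned down before any derivation takes place.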

For HKDF the getInstance must specify an underlying hash function - by definition mode is feedback, the size of the counter is fixed, L is not included in the base calculation.  (TLS1.3 uses HKDF and makes L a mandatory part of the HKDF).

I want to do a worked example from instantiation to use to make sure this covers the corner cases.  Give me a day....  I'm currently in Singapore.

Mike







Re: KDF API review, round 2

Adam Petcher

On 11/16/2017 12:47 AM, Michael StJohns wrote:

> This is pretty close, but I think you need to add an AlgorithmParameters argument to each of the getInstance calls in KeyDerivation - or require each KDF to specify a default model - not all KDFs are fully specified in a given document.
>
> Alternately, you could use the .setParameter/.getParameter model of signature,  but it may be that underlying code will actually be creating a whole new instance.  (E.g. getInstance("NIST-SP800-108") vs getInstance("NIST-SP800-108-Counter") vs getInstance("NIST-SP800-108/Counter"))
>
>
> Here's the model I'm thinking about:
>
> SP800-108 is a parameterized set of Key derivation functions which goes something like:
>
> Pick either Counter or Feedback
>
> Pick the PRF (e.g. HMAC-SHA256, AES-128-CMAC, etc)
> Pick the size of the counter and endianness:  (e.g. Big endian Uint16)
>
> Pick the size and endianness of L
>
> Pick whether the counter precedes or follows the fixed data (for counter mode).
> Pick whether the counter is included and whether it precedes or follows the fixed data (for feedback mode)
>
> Taken together those instantiation parameters define a particular KDF model.
>
> Then for the .init() call, the kdfParams is where the Label and Context data go (what HKDF calls 'info').  For most KDFs this could just be a byte array.
>
> For HKDF the getInstance must specify an underlying hash function - by definition mode is feedback, the size of the counter is fixed, L is not included in the base calculation.  (TLS1.3 uses HKDF and makes L a mandatory part of the HKDF).

I don't like the idea of putting algorithm parameters in getInstance, because we don't have this pattern in JCA, and it doesn't seem like it is necessary here. In your example above, the first set of parameters are somehow different from the second set, but it is not clear how. So it seems like they could all be supplied to init. Alternatively, algorithm names could specify more concrete algorithms that include the mode/PRF/etc. Can you provide more information to explain why these existing patterns won't work in this case?


Re: KDF API review, round 2

Michael StJohns
On 11/16/2017 2:15 PM, Adam Petcher wrote:

> On 11/16/2017 12:47 AM, Michael StJohns wrote:
>
>> This is pretty close, but I think you need to add an AlgorithmParameters argument to each of the getInstance calls in KeyDerivation - or require each KDF to specify a default model - not all KDFs are fully specified in a given document.
>>
>> Alternately, you could use the .setParameter/.getParameter model of signature,  but it may be that underlying code will actually be creating a whole new instance.  (E.g. getInstance("NIST-SP800-108") vs getInstance("NIST-SP800-108-Counter") vs getInstance("NIST-SP800-108/Counter"))
>>
>>
>> Here's the model I'm thinking about:
>>
>> SP800-108 is a parameterized set of Key derivation functions which goes something like:
>>
>> Pick either Counter or Feedback
>>
>> Pick the PRF (e.g. HMAC-SHA256, AES-128-CMAC, etc)
>> Pick the size of the counter and endianness:  (e.g. Big endian Uint16)
>>
>> Pick the size and endianness of L
>>
>> Pick whether the counter precedes or follows the fixed data (for counter mode).
>> Pick whether the counter is included and whether it precedes or follows the fixed data (for feedback mode)
>>
>> Taken together those instantiation parameters define a particular KDF model.
>>
>> Then for the .init() call, the kdfParams is where the Label and Context data go (what HKDF calls 'info').  For most KDFs this could just be a byte array.
>>
>> For HKDF the getInstance must specify an underlying hash function - by definition mode is feedback, the size of the counter is fixed, L is not included in the base calculation.  (TLS1.3 uses HKDF and makes L a mandatory part of the HKDF).

> I don't like the idea of putting algorithm parameters in getInstance, because we don't have this pattern in JCA, and it doesn't seem like it is necessary here.

Which is why I mentioned the Signature.setParameter() pattern as an alternative.

> In your example above, the first set of parameters are somehow different from the second set, but it is not clear how.

The first set configures HOW the KDF operates; the second (.init()) gives the parameters needed for a specific set of invocations.

> So it seems like they could all be supplied to init. Alternatively, algorithm names could specify more concrete algorithms that include the mode/PRF/etc. Can you provide more information to explain why these existing patterns won't work in this case?

What I need to do is provide a lifecycle diagram, but it's hard to do in text.  But basically, the .getInstance() followed by .setParameters() builds a concrete engine while the .init() initializes that engine with a key and the derivation parameters.  Think about a TLS 1.2 instance - the PRF is selected once, but the KDF may be used multiple times.

I considered the mode/PRF/etc stuff but that works for things like Cipher and Signature because most of those have exactly the same pattern.  For the KDF pattern we've got fully specified KDFs (e.g. TLS 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2 and HKDF need a PRF) and then the SP800 style KDFs which are defined to be *very* flexible.  So translating that into a naming convention is going to be restrictive and may not cover all of the possible approaches.  I'd rather do it as an algorithm parameter instead, with a given KDF implementation having a default if nothing is specified during instantiation.

Mike





Re: KDF API review, round 2

Adam Petcher
On 11/17/2017 10:04 AM, Michael StJohns wrote:

> On 11/16/2017 2:15 PM, Adam Petcher wrote:
>> So it seems like they could all be supplied to init. Alternatively,
>> algorithm names could specify more concrete algorithms that include
>> the mode/PRF/etc. Can you provide more information to explain why
>> these existing patterns won't work in this case?
> What I need to do is provide a lifecycle diagram, but it's hard to do
> in text.  But basically, the .getInstance() followed by
> .setParameters() builds a concrete engine while the .init()
> initializes that engine with a key and the derivation parameters.
> Think about a TLS 1.2 instance - the PRF is selected once, but the KDF
> may be used multiple times.

This is the information I was missing. There are two sets of parameters,
and the first set should be fixed, but the second set should be changed
on each init.

>
> I considered the mode/PRF/etc stuff but that works for things like
> Cipher and Signature because most of those have exactly the same
> pattern.  For the KDF pattern we've got fully specified KDFs (e.g. TLS
> 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2 and HKDF
> need a PRF) and then the SP800 style KDFs which are defined to be
> *very* flexible.  So translating that into a naming convention is
> going to be restrictive and may not cover all of the possible
> approaches.  I'd rather do it as an algorithm parameter instead, with
> a given KDF implementation having a default if nothing is specified
> during instantiation.

I agree that this is challenging because there is so much variety in
KDFs. But I don't think that SP 800-108 is a good example of something
that should be exposed as an algorithm in JCA, because it is too broad.
SP 800-108 is more of a toolbox that can be used to construct KDFs.
Particular specializations of SP 800-108 are widely used, and they will
get names that can be used in getInstance. For example, HKDF-Expand is a
particular specialization of SP 800-108.

So I think the existing pattern of using algorithm names to specify
concrete algorithms should work just as well in this API as it does in
the rest of JCA. Of course, more flexibility in the API is a nice
feature, but supporting this level of generality may be out of scope for
this effort.
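
As an illustration of what such a named, concrete specialization looks like, here is a minimal sketch of HKDF-Expand as described in RFC 5869, fixed to HMAC-SHA256: feedback of the previous block, then the info bytes, then a one-byte counter. The class name HkdfExpandSketch is made up for this example; it is a sketch for discussion and has not been checked against the RFC test vectors here.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

final class HkdfExpandSketch {
    /** T(i) = HMAC(prk, T(i-1) || info || i), output = first `length` bytes of T(1) || T(2) || ... */
    static byte[] expand(byte[] prk, byte[] info, int length) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            int hashLen = mac.getMacLength();
            if (length < 0 || length > 255 * hashLen) {
                throw new IllegalArgumentException("length must be 0.." + 255 * hashLen);
            }
            mac.init(new SecretKeySpec(prk, "HmacSHA256"));
            byte[] okm = new byte[length];
            byte[] previous = new byte[0]; // T(0) is the empty string
            int pos = 0;
            for (int i = 1; pos < length; i++) {
                mac.update(previous);  // feedback of the previous block
                mac.update(info);      // fixed data
                mac.update((byte) i);  // one-byte counter, placed last
                previous = mac.doFinal();
                int n = Math.min(previous.length, length - pos);
                System.arraycopy(previous, 0, okm, pos, n);
                pos += n;
            }
            return okm;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every choice that the SP 800-108 toolbox leaves open (mode, counter width, counter position, whether L is included) is fixed by the algorithm definition, so a bare algorithm name is enough to identify it.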


>
> Mike
>
>
>
>


Re: KDF API review, round 2

Michael StJohns
On 11/17/2017 1:07 PM, Adam Petcher wrote:

> On 11/17/2017 10:04 AM, Michael StJohns wrote:
>
>> On 11/16/2017 2:15 PM, Adam Petcher wrote:
>>> So it seems like they could all be supplied to init. Alternatively,
>>> algorithm names could specify more concrete algorithms that include
>>> the mode/PRF/etc. Can you provide more information to explain why
>>> these existing patterns won't work in this case?
>> What I need to do is provide a lifecycle diagram, but it's hard to do
>> in text.  But basically, the .getInstance() followed by
>> .setParameters() builds a concrete engine while the .init()
>> initializes that engine with a key and the derivation parameters.
>> Think about a TLS 1.2 instance - the PRF is selected once, but the
>> KDF may be used multiple times.
>
> This is the information I was missing. There are two sets of
> parameters, and the first set should be fixed, but the second set
> should be changed on each init.
>
>>
>> I considered the mode/PRF/etc stuff but that works for things like
>> Cipher and Signature because most of those have exactly the same
>> pattern.  For the KDF pattern we've got fully specified KDFs (e.g.
>> TLS 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2 and
>> HKDF need a PRF) and then the SP800 style KDFs which are defined to
>> be *very* flexible.  So translating that into a naming convention is
>> going to be restrictive and may not cover all of the possible
>> approaches.  I'd rather do it as an algorithm parameter instead, with
>> a given KDF implementation having a default if nothing is specified
>> during instantiation.
>
> I agree that this is challenging because there is so much variety in
> KDFs. But I don't think that SP 800-108 is a good example of something
> that should be exposed as an algorithm in JCA, because it is too
> broad. SP 800-108 is more of a toolbox that can be used to construct
> KDFs. Particular specializations of SP 800-108 are widely used, and
> they will get names that can be used in getInstance. For example,
> HKDF-Expand is a particular specialization of SP 800-108.

>
> So I think the existing pattern of using algorithm names to specify
> concrete algorithms should work just as well in this API as it does in
> the rest of JCA. Of course, more flexibility in the API is a nice
> feature, but supporting this level of generality may be out of scope
> for this effort.


The more I think about it the more I think you're mostly right.  But
let's split this slightly as almost every KDF allows for the
specification of the PRF.  So

<kdfname>/<prf>    as the standard naming convention.

Or TLS13/HMAC-SHA256 and HKDF/HMAC-SHA256 (which are different because
of the mandatory inclusion of "L" in the derivation parameters and each
component object for TLS13)
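
A provider would have to split such names before looking up the underlying PRF. The following is only a sketch of how that parsing might look; the class KdfAlgorithmName is hypothetical, and treating a name without a slash as a fully specified KDF is an assumption made for this example rather than something the thread has agreed on.

```java
final class KdfAlgorithmName {
    final String kdf; // e.g. "HKDF" or "TLS13"
    final String prf; // e.g. "HMAC-SHA256"; null when the name has no "/<prf>" part

    private KdfAlgorithmName(String kdf, String prf) {
        this.kdf = kdf;
        this.prf = prf;
    }

    /** Splits "<kdfname>/<prf>" at the single slash; rejects empty parts and extra slashes. */
    static KdfAlgorithmName parse(String name) {
        int slash = name.indexOf('/');
        String kdf = slash < 0 ? name : name.substring(0, slash);
        String prf = slash < 0 ? null : name.substring(slash + 1);
        if (kdf.isEmpty() || (prf != null && (prf.isEmpty() || prf.contains("/")))) {
            throw new IllegalArgumentException("expected <kdfname> or <kdfname>/<prf>: " + name);
        }
        return new KdfAlgorithmName(kdf, prf);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"HKDF/HMAC-SHA256", "TLS13/HMAC-SHA256", "TLS11"}) {
            KdfAlgorithmName n = parse(s);
            System.out.println(n.kdf + " with PRF " + n.prf);
        }
    }
}
```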

Still - let's include the .setParameters() call as a failsafe as looking
forward I can see the need for flexibility rearing its ugly head (e.g.
adding PSS parameters to RSA signatures way late in the game.....) and
it does match the pattern for Signature so it's not a new concept. A
given provider need not support the call, but it's there if needed.

Mike

>
>
>>
>> Mike
>>
>>
>>
>>
>


Re: KDF API review, round 2

Jamil Nimeh


On 11/19/2017 12:45 PM, Michael StJohns wrote:

> On 11/17/2017 1:07 PM, Adam Petcher wrote:
>> On 11/17/2017 10:04 AM, Michael StJohns wrote:
>>
>>> On 11/16/2017 2:15 PM, Adam Petcher wrote:
>>>> So it seems like they could all be supplied to init. Alternatively,
>>>> algorithm names could specify more concrete algorithms that include
>>>> the mode/PRF/etc. Can you provide more information to explain why
>>>> these existing patterns won't work in this case?
>>> What I need to do is provide a lifecycle diagram, but it's hard to do
>>> in text.  But basically, the .getInstance() followed by
>>> .setParameters() builds a concrete engine while the .init()
>>> initializes that engine with a key and the derivation parameters.
>>> Think about a TLS 1.2 instance - the PRF is selected once, but the
>>> KDF may be used multiple times.
>>
>> This is the information I was missing. There are two sets of
>> parameters, and the first set should be fixed, but the second set
>> should be changed on each init.
>>
>>>
>>> I considered the mode/PRF/etc stuff but that works for things like
>>> Cipher and Signature because most of those have exactly the same
>>> pattern.  For the KDF pattern we've got fully specified KDFs (e.g.
>>> TLS 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2 and
>>> HKDF need a PRF) and then the SP800 style KDFs which are defined to
>>> be *very* flexible.  So translating that into a naming convention is
>>> going to be restrictive and may not cover all of the possible
>>> approaches. I'd rather do it as an algorithm parameter instead, with
>>> a given KDF implementation having a default if nothing is specified
>>> during instantiation.
>>
>> I agree that this is challenging because there is so much variety in
>> KDFs. But I don't think that SP 800-108 is a good example of
>> something that should be exposed as an algorithm in JCA, because it
>> is too broad. SP 800-108 is more of a toolbox that can be used to
>> construct KDFs. Particular specializations of SP 800-108 are widely
>> used, and they will get names that can be used in getInstance. For
>> example, HKDF-Expand is a particular specialization of SP 800-108.
>
>>
>> So I think the existing pattern of using algorithm names to specify
>> concrete algorithms should work just as well in this API as it does
>> in the rest of JCA. Of course, more flexibility in the API is a nice
>> feature, but supporting this level of generality may be out of scope
>> for this effort.
>
>
> The more I think about it the more I think you're mostly right. But
> let's split this slightly as almost every KDF allows for the
> specification of the PRF.  So
>
> <kdfname>/<prf>    as the standard naming convention.
>
> Or TLS13/HMAC-SHA256 and HKDF/HMAC-SHA256 (which are different because
> of the mandatory inclusion of "L" in the derivation parameters and
> each component object for TLS13)
>
> Still - let's include the .setParameters() call as a failsafe as
> looking forward I can see the need for flexibility rearing its ugly
> head (e.g. adding PSS parameters to RSA signatures way late in the
> game.....) and it does match the pattern for Signature so it's not a
> new concept. A given provider need not support the call, but it's there
> if needed.
Signature appears to have setParameter because initSign and
initVerify didn't have AlgorithmParameterSpec (APS) parameters in their
method signatures. Since we're talking about providing APS objects
through both getInstance() for those locked to the algorithm and init()
for things like salts, info, etc. that can be changed on successive
inits, it seems like we're covered without the need for a setParameter
method.

One additional topic for discussion: Late in the week we talked about
the current state of the API internally and one item to revisit is where
the DerivationParameterSpec objects are passed.  It was brought up by a
couple people that it would be better to provide the DPS objects
pertaining to keys at the time they are called for through deriveKey()
and deriveKeys() (and possibly deriveData).

Originally we had them all grouped in a List in the init method. One
reason for needing it up there was to know the total length of material
to generate.  If we can provide the total length through the
AlgorithmParameterSpec passed in via init() then things like:

Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);

become possible.  To my eyes at least it makes it clearer which
DPS you're processing, since they're provided at derive time, rather than
the caller having to keep track in their head of where in the DPS list
they might be with each successive deriveKey or deriveKeys call.  And I
think we could do away with deriveKeys(int), too.
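
A rough sketch of how the derive-time variant could behave is given below. Everything in it is hypothetical: DpsSketch stands in for DerivationParameterSpec (reduced to an algorithm name and a byte length), DeriveTimeKdfSketch stands in for a provider implementation, and the key stream is produced with an HKDF-Expand-like HMAC-SHA256 loop purely for illustration. Passing the total length as an int to init() is a simplification of supplying it in the AlgorithmParameterSpec.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for DerivationParameterSpec. */
final class DpsSketch {
    final String algorithm;
    final int length; // in bytes
    DpsSketch(String algorithm, int length) {
        if (length <= 0) throw new IllegalArgumentException("length must be positive");
        this.algorithm = algorithm;
        this.length = length;
    }
}

final class DeriveTimeKdfSketch {
    private byte[] stream = new byte[0];
    private int offset;

    /** init() is told the total length, so the whole key stream can be produced up front. */
    void init(byte[] key, byte[] info, int totalLength) {
        if (totalLength < 0 || totalLength > 255 * 32) throw new IllegalArgumentException("bad totalLength");
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] out = new byte[totalLength];
            byte[] t = new byte[0];
            for (int i = 1, pos = 0; pos < totalLength; i++) {
                mac.update(t);
                mac.update(info);
                mac.update((byte) i);
                t = mac.doFinal();
                int n = Math.min(t.length, totalLength - pos);
                System.arraycopy(t, 0, out, pos, n);
                pos += n;
            }
            stream = out;
            offset = 0; // re-initializing starts a fresh key stream
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The spec arrives at derive time; the next spec.length bytes of the stream become the key. */
    Key deriveKey(DpsSketch spec) {
        if (spec.length > stream.length - offset) throw new IllegalStateException("key stream exhausted");
        Key k = new SecretKeySpec(stream, offset, spec.length, spec.algorithm);
        offset += spec.length;
        return k;
    }

    List<Key> deriveKeys(List<DpsSketch> specs) {
        List<Key> keys = new ArrayList<>();
        for (DpsSketch spec : specs) keys.add(deriveKey(spec));
        return keys;
    }
}
```

With this shape the caller never has to remember a position in a list handed over at init time; the only state carried between calls is the offset into the key stream.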

--Jamil


Re: KDF API review, round 2

Adam Petcher
On 11/20/2017 5:12 AM, Jamil Nimeh wrote:

>
> On 11/19/2017 12:45 PM, Michael StJohns wrote:
>> On 11/17/2017 1:07 PM, Adam Petcher wrote:
>>>
>>> I agree that this is challenging because there is so much variety in
>>> KDFs. But I don't think that SP 800-108 is a good example of
>>> something that should be exposed as an algorithm in JCA, because it
>>> is too broad. SP 800-108 is more of a toolbox that can be used to
>>> construct KDFs. Particular specializations of SP 800-108 are widely
>>> used, and they will get names that can be used in getInstance. For
>>> example, HKDF-Expand is a particular specialization of SP 800-108.
>>> So I think the existing pattern of using algorithm names to specify
>>> concrete algorithms should work just as well in this API as it does
>>> in the rest of JCA. Of course, more flexibility in the API is a nice
>>> feature, but supporting this level of generality may be out of scope
>>> for this effort.
>>
>> The more I think about it the more I think you're mostly right. But
>> let's split this slightly as almost every KDF allows for the
>> specification of the PRF.  So
>>
>> <kdfname>/<prf>    as the standard naming convention.
>>
>> Or TLS13/HMAC-SHA256 and HKDF/HMAC-SHA256 (which are different
>> because of the mandatory inclusion of "L" in the derivation
>> parameters and each component object for TLS13)

This approach seems fine to me. We would probably want to allow any
algorithm name after the / (rather than limiting it to PRFs), because
JCA doesn't have a notion of PRF, and because some KDFs take other kinds
of functions (e.g. PBKDF1 uses a bare hash function).

>>
>> Still - let's include the .setParameters() call as a failsafe as
>> looking forward I can see the need for flexibility rearing its ugly
>> head (e.g. adding PSS parameters to RSA signatures way late in the
>> game.....) and it does match the pattern for Signature so it's not a
>> new concept. A given provider need not support the call, but it's
>> there if needed.
> Signature appears to have setParameter because the initSign and
> initVerify didn't have APS parameters in their method signatures.
> Since we're talking about providing APS objects through both
> getInstance() for those locked to the algorithm and init() for things
> like salts, info, etc. that can be changed on successive inits it
> seems like we're covered without the need for a setParameter method.

My argument is that providing APS in getInstance doesn't appear to be
necessary. Of course, if you want to tackle this, that's fine with me. 
But I think it complicates the API and I expect it will lead to other
API/design problems that will need to be sorted out.

I agree that setParameter() in Signature appears to be there to solve a
different problem. This API doesn't have that problem because the init
method takes an APS.

>
> One additional topic for discussion: Late in the week we talked about
> the current state of the API internally and one item to revisit is
> where the DerivationParameterSpec objects are passed. It was brought
> up by a couple people that it would be better to provide the DPS
> objects pertaining to keys at the time they are called for through
> deriveKey() and deriveKeys() (and possibly deriveData).
>
> Originally we had them all grouped in a List in the init method. One
> reason for needing it up there was to know the total length of
> material to generate.  If we can provide the total length through the
> AlgorithmParameterSpec passed in via init() then things like:
>
> Key deriveKey(DerivationParameterSpec param);
> List<Key> deriveKeys(List<DerivationParameterSpec> params);
>
> become possible.  To my eyes at least it does make it more clear what
> DPS you're processing since they're provided at derive time, rather
> than the caller having to keep track in their heads where in the DPS
> list they might be with each successive deriveKey or deriveKeys
> calls.  And I think we could do away with deriveKeys(int), too.

I like this change. It simplifies the API, and forcing the JCA client to
be explicit and supply the output length in the APS is a good thing.


Re: KDF API review, round 2

Michael StJohns
In reply to this post by Jamil Nimeh
Apologies in advance for top posting and a need to be a little pedantic
about KDFs.  I'll have some comments inline below as well.

KDFs aren't well understood, but people think they are.  The key stream
generation part is pretty straightforward (keyed PRBG), but the
interaction of how the key stream is generated and how the key stream is
assigned to actual cryptographic objects is not.  Here's why:

1) KDFs are repeatable.  Given the exact same inputs (key, mixin data)
they produce the same key stream.
2) Any change in the inputs changes ALL of the key stream.
3) Unless the overall length property is included, changing the
length of the key stream will not change the prefix (e.g. if the
original call was for 10 bytes and a second call was for 20, the first
10 bytes of both calls will be the exact same key stream data).
4) The general format of each round of key stream generation is
something like PRF (master key, mixins), where mixins are the
concatenation of at least a label and context and a value to
differentiate each round (a counter or the previous round's output for
example).  Including L in the mixin prevents the property described in
(3) above.  Including a length for each subcomponent as a mixin prevents
the property described in (5) below.
5) Unless the length for each derived object is included in the mixin,
it is possible to move the assignment of key stream bytes between
objects.  For example, both TLS (1.2 and before) and IPSEC use KDFs that
generate non-secret IV material along with secret session key
material.  This is less important for software-only KDFs, as the
secret key material and the IV material are both in the JVM memory
domain.  It is very important if you're trying to keep your secret key
material secure in an HSM.
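
Properties (3) and (4) can be demonstrated with a few lines of Java. The sketch below is a toy key stream generator (HMAC-SHA256 over counter, label and optionally a 4-byte big-endian L); the class and method names are invented for this example and it does not implement any particular standardized KDF.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;

final class LengthMixinSketch {
    /** Each round is PRF(key, counter || label [|| L]); L is only mixed in when requested. */
    static byte[] stream(byte[] key, byte[] label, int length, boolean mixInLength) {
        if (length < 0 || length > 255 * 32) throw new IllegalArgumentException("bad length");
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] out = new byte[length];
            for (int i = 1, pos = 0; pos < length; i++) {
                mac.update((byte) i); // per-round counter
                mac.update(label);
                if (mixInLength) {
                    mac.update(ByteBuffer.allocate(4).putInt(length).array()); // L, big-endian
                }
                byte[] block = mac.doFinal();
                int n = Math.min(block.length, length - pos);
                System.arraycopy(block, 0, out, pos, n);
                pos += n;
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[32];
        byte[] label = {'d', 'e', 'm', 'o'};
        boolean sharedPrefixWithoutL = Arrays.equals(
                stream(key, label, 10, false), Arrays.copyOf(stream(key, label, 20, false), 10));
        boolean sharedPrefixWithL = Arrays.equals(
                stream(key, label, 10, true), Arrays.copyOf(stream(key, label, 20, true), 10));
        System.out.println("prefix shared without L: " + sharedPrefixWithoutL);
        System.out.println("prefix shared with L:    " + sharedPrefixWithL);
    }
}
```

Without L in the mixin, the 10-byte request is simply the first 10 bytes of the 20-byte request; with L mixed in, the two requests yield unrelated streams.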

Example:  a given TLS session may need two 256-bit AES keys and two
128-bit IVs.  That is a requirement for 96 bytes of key stream
(2 x 32 + 2 x 16).  We have the HSM produce this (see the PKCS11
calling sequence for example) and we get out the IVs.  An attacker who
has access to the HSM (which may or may not be on the same machine as
the TLS instantiation) can call the derivation function with new output
parameters (but with the same master key and mixins) which specify
only IV material, and have the function output, as IV material, the
same key stream bytes that were previously assigned to the secret key
material.  A very easy key extraction attack.
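
The attack can be sketched in code. The "HSM" below is a toy model invented for this illustration: it derives a key stream with HMAC-SHA256 and releases only the bytes the caller has declared to be IV material. None of the names correspond to a real PKCS11 or JCA interface, and the per-object lengths are deliberately NOT mixed into the PRF input, which is exactly what makes the attack work.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.Arrays;

final class IvReassignmentSketch {
    /** Toy key stream: HMAC-SHA256 over (counter || mixins); no lengths are mixed in. */
    static byte[] keyStream(byte[] master, byte[] mixins, int length) {
        if (length < 0 || length > 255 * 32) throw new IllegalArgumentException("bad length");
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(master, "HmacSHA256"));
            byte[] out = new byte[length];
            for (int i = 1, pos = 0; pos < length; i++) {
                mac.update((byte) i);
                mac.update(mixins);
                byte[] block = mac.doFinal();
                int n = Math.min(block.length, length - pos);
                System.arraycopy(block, 0, out, pos, n);
                pos += n;
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** What the naive "HSM" hands back: only the bytes the caller labelled as IV material. */
    static byte[] exportIvs(byte[] master, byte[] mixins, int secretBytes, int ivBytes) {
        byte[] stream = keyStream(master, mixins, secretBytes + ivBytes);
        return Arrays.copyOfRange(stream, secretBytes, secretBytes + ivBytes);
    }

    public static void main(String[] args) {
        byte[] master = new byte[48];
        byte[] mixins = {'t', 'l', 's'};
        int secret = 2 * 32; // two 256-bit AES keys
        int iv = 2 * 16;     // two 128-bit IVs, 96 bytes in total

        byte[] honest = exportIvs(master, mixins, secret, iv);     // 32 bytes of IV
        byte[] attack = exportIvs(master, mixins, 0, secret + iv); // everything declared "IV"

        // Bytes that were supposed to stay inside the HSM as AES keys.
        byte[] keysInsideHsm = Arrays.copyOf(keyStream(master, mixins, secret + iv), secret);
        System.out.println("honest export length: " + honest.length);
        System.out.println("attack recovers AES key bytes: "
                + Arrays.equals(Arrays.copyOf(attack, secret), keysInsideHsm));
    }
}
```

If each object's length (or the total length) were part of the mixins, the second call would produce a different key stream and the exported bytes would be useless to the attacker.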

This is why TLS1.3 only does single outputs per KDF call and makes the
length of that output a mandatory mixin.  An HSM can also look at the
labels and make a determination as to whether an object need be
protected (key material) or in the clear (IV).

Given (3) and (5) I believe that both L and l[i] (subcomponent length)
may need to be provided BEFORE any key material is produced, which
argues for input during the initialization phase.


On 11/20/2017 5:12 AM, Jamil Nimeh wrote:

>
>
> On 11/19/2017 12:45 PM, Michael StJohns wrote:
>> On 11/17/2017 1:07 PM, Adam Petcher wrote:
>>> On 11/17/2017 10:04 AM, Michael StJohns wrote:
>>>
>>>> On 11/16/2017 2:15 PM, Adam Petcher wrote:
>>>>> So it seems like they could all be supplied to init.
>>>>> Alternatively, algorithm names could specify more concrete
>>>>> algorithms that include the mode/PRF/etc. Can you provide more
>>>>> information to explain why these existing patterns won't work in
>>>>> this case?
>>>> What I need to do is provide a lifecycle diagram, but it's hard to
>>>> do in text.  But basically, the .getInstance() followed by
>>>> .setParameters() builds a concrete engine while the .init()
>>>> initializes that engine with a key and the derivation parameters.
>>>> Think about a TLS 1.2 instance - the PRF is selected once, but the
>>>> KDF may be used multiple times.
>>>
>>> This is the information I was missing. There are two sets of
>>> parameters, and the first set should be fixed, but the second set
>>> should be changed on each init.
>>>
>>>>
>>>> I considered the mode/PRF/etc stuff but that works for things like
>>>> Cipher and Signature because most of those have exactly the same
>>>> pattern.  For the KDF pattern we've got fully specified KDFs (e.g.
>>>> TLS 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2
>>>> and HKDF need a PRF) and then the SP800 style KDFs which are
>>>> defined to be *very* flexible.  So translating that into a naming
>>>> convention is going to be restrictive and may not cover all of the
>>>> possible approaches. I'd rather do it as an algorithm parameter
>>>> instead, with a given KDF implementation having a default if
>>>> nothing is specified during instantiation.
>>>
>>> I agree that this is challenging because there is so much variety in
>>> KDFs. But I don't think that SP 800-108 is a good example of
>>> something that should be exposed as an algorithm in JCA, because it
>>> is too broad. SP 800-108 is more of a toolbox that can be used to
>>> construct KDFs. Particular specializations of SP 800-108 are widely
>>> used, and they will get names that can be used in getInstance. For
>>> example, HKDF-Expand is a particular specialization of SP 800-108.
>>
>>>
>>> So I think the existing pattern of using algorithm names to specify
>>> concrete algorithms should work just as well in this API as it does
>>> in the rest of JCA. Of course, more flexibility in the API is a nice
>>> feature, but supporting this level of generality may be out of scope
>>> for this effort.
>>
>>
>> The more I think about it the more I think you're mostly right. But
>> let's split this slightly as almost every KDF allows for the
>> specification of the PRF.  So
>>
>> <kdfname>/<prf>    as the standard naming convention.
>>
>> Or TLS13/HMAC-SHA256 and HKDF/HMAC-SHA256 (which are different
>> because of the mandatory inclusion of "L" in the derivation
>> parameters and each component object for TLS13)
>>
>> Still - let's include the .setParameters() call as a failsafe as
>> looking forward I can see the need for flexibility rearing its ugly
>> head (e.g. adding PSS parameters to RSA signatures way late in the
>> game.....) and it does match the pattern for Signature so it's not a
>> new concept. A given provider need not support the call, but it's
>> there if needed.
> Signature appears to have setParameter because the initSign and
> initVerify didn't have APS parameters in their method signatures.
> Since we're talking about providing APS objects through both
> getInstance() for those locked to the algorithm and init() for things
> like salts, info, etc. that can be changed on successive inits it
> seems like we're covered without the need for a setParameter method.

You're missing the point that setParameter() provides information used
in all future calls to the signature generation, while init() provides
data specifically for a given key stream production.  In Signature() you
call .setParameter() to set up the PSS parameters (or use the
defaults).  Each subsequent call to initSign or initVerify uses those
PSS parameters.  The  equivalent part of .init() in KeyDerivation is
actually the calls to .update() in signature as they provide the
specific information for the production of the output key stream.  In
fact, setting up an HMAC signature instance and passing it the mixin
data as part of a .update() is a way of producing the key stream round.

So equivalences:

KeyDerivation.getInstance(PRF) == Signature.getInstance(HMAC)
KeyDerivation.setParameters() == Signature.setParameters()
KeyDerivation.init(key, List<Parameters>) == concatenation of the
results of multiple calls (each key stream round based on the needed
output length) to [Signature.initSign(Key) followed by
Signature.update(converttobytearray(List<Parameters>)) followed by 
Signature.sign()] to produce the key stream
KeyDerivation.deriveKey() ==  various calls to key or object factories
with parts of the key stream (signature).
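
To make the equivalences above concrete, here is a minimal sketch of key stream production by repeated PRF rounds, in the style of an SP 800-108 counter-mode KDF. It uses javax.crypto.Mac (where HMAC actually lives in JCA) rather than Signature. The class name, method name and exact input encoding are assumptions made for illustration only; they are not part of the proposed KeyDerivation API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: counter-mode key stream built from HMAC rounds. */
final class KeyStreamSketch {

    /**
     * Produces lengthBytes of key stream. Every round MACs
     * [4-byte round counter || mixins]; only the counter differs per round.
     */
    static byte[] keyStream(byte[] masterKey, byte[] mixins, int lengthBytes) {
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            for (int round = 1; stream.size() < lengthBytes; round++) {
                // Round differentiator: the only input that changes between rounds.
                prf.update(ByteBuffer.allocate(4).putInt(round).array());
                // Label/context mixins: identical for every round.
                prf.update(mixins);
                // doFinal() returns the MAC and resets the Mac for the next round.
                byte[] block = prf.doFinal();
                stream.write(block, 0, block.length);
            }
            // Truncate the concatenated rounds to the requested length.
            return Arrays.copyOf(stream.toByteArray(), lengthBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the requested length is not one of the mixins in this sketch, a shorter request simply yields a prefix of a longer one, which is the "free-running" behaviour discussed elsewhere in this thread.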

(Hmm.. I think I forgot to get back to this comment - a KDF key should
be tagged differently than an HMAC key even though the underlying
functions are the same.  It shouldn't be possible to use an HMAC
SecretKey (or an AES secret key) as a KDF master key and vice versa,
basically because of the property that an HMAC output is by definition
non-secret data while the key stream production is by definition -
secret.  You want to make sure that it's not trivial to do this).

>
> One additional topic for discussion: Late in the week we talked about
> the current state of the API internally and one item to revisit is
> where the DerivationParameterSpec objects are passed. It was brought
> up by a couple people that it would be better to provide the DPS
> objects pertaining to keys at the time they are called for through
> deriveKey() and deriveKeys() (and possibly deriveData).
>
> Originally we had them all grouped in a List in the init method. One
> reason for needing it up there was to know the total length of
> material to generate.  If we can provide the total length through the
> AlgorithmParameterSpec passed in via init() then things like:
>
> Key deriveKey(DerivationParameterSpec param);
> List<Key> deriveKeys(List<DerivationParameterSpec> params);
>
> become possible.  To my eyes at least it does make it more clear what
> DPS you're processing since they're provided at derive time, rather
> than the caller having to keep track in their heads where in the DPS
> list they might be with each successive deriveKey or deriveKeys
> calls.  And I think we could do away with deriveKeys(int), too.

See above - the key stream is logically produced in its entirety before
any assignment of that stream is made to any cryptographic objects
because the mixins (except for the round differentiator) are the same
for each key stream production round.   Simply passing in the total
length may not give you the right result if the KDF requires a per
component length (and it should to defeat (5) or it should only produce
a single key).

95% of the time this will be a call to produce a single key.  4% of the
time it will be a call to produce multiple keys. Only 1% of the time
will it need to intermix key, data and object productions. Anybody who
is doing that is going to write a wrapper around this class to make sure
they get the key and data production order correct for each call.  So
I'm not all that bothered by keeping the complexity as a price for
keeping flexibility.

You could have a Key deriveKey(Key k, DerivationParameterSpec param) for
some things like TLS1.3 (where you can only make a single call to derive
key between inits) , but then you'd also need at least a byte[]
deriveData (Key k, DerivationParameterSpec param) and an Object
deriveObject(Key k, DerivationParameterSpec param).


I think the most common pattern will be

.init(Key k, DerivationParameterSpec param) followed by .deriveKey()  or
.init(Key k, List<DerivationParameterSpec> params) followed by .deriveKeys()

but the other intermixed patterns are just as valid.


>
> --Jamil
>


Re: KDF API review, round 2

Adam Petcher
On 11/20/2017 11:17 AM, Michael StJohns wrote:

>
>>
>> One additional topic for discussion: Late in the week we talked about
>> the current state of the API internally and one item to revisit is
>> where the DerivationParameterSpec objects are passed. It was brought
>> up by a couple people that it would be better to provide the DPS
>> objects pertaining to keys at the time they are called for through
>> deriveKey() and deriveKeys() (and possibly deriveData).
>>
>> Originally we had them all grouped in a List in the init method. One
>> reason for needing it up there was to know the total length of
>> material to generate.  If we can provide the total length through the
>> AlgorithmParameterSpec passed in via init() then things like:
>>
>> Key deriveKey(DerivationParameterSpec param);
>> List<Key> deriveKeys(List<DerivationParameterSpec> params);
>>
>> become possible.  To my eyes at least it does make it more clear what
>> DPS you're processing since they're provided at derive time, rather
>> than the caller having to keep track in their heads where in the DPS
>> list they might be with each successive deriveKey or deriveKeys
>> calls.  And I think we could do away with deriveKeys(int), too.
>
> See above - the key stream is logically produced in its entirety
> before any assignment of that stream is made to any cryptographic
> objects because the mixins (except for the round differentiator) are
> the same for each key stream production round.   Simply passing in the
> total length may not give you the right result if the KDF requires a
> per component length (and it should to defeat (5) or it should only
> produce a single key).

In general, if the KDF needs any information up front to produce the
output bits, then that information can be supplied in init using the
APS. In your example attack for (5), an implementation that prevents
this attack would probably take a length and label (e.g. "key1", "key2",
"iv1", "iv2") for each derived value. Then the HSM would need to enforce
that any value derived with a "key*" label could not be extracted. If
that entire sequence of lengths and labels needs to be known up front,
then it can be supplied in the APS. In this organization, the only
additional information that the DPS passed to deriveKey() needs to
provide is the encryption algorithm name and parameters (though taking a
length and checking it is probably a good idea).
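
As a rough sketch of the organization described above: the full sequence of labels and lengths is fixed up front, the key stream is cut according to that sequence, and a policy hook decides which outputs may leave the module. The Component class, the "key" prefix rule and the method names are hypothetical; a real HSM would enforce this internally rather than in caller-visible Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: up-front layout of labelled outputs plus a policy check. */
final class LabeledOutputSketch {

    /** Hypothetical descriptor for one derived value: a label and a length in bytes. */
    static final class Component {
        final String label;
        final int length;

        Component(String label, int length) {
            if (length < 0) {
                throw new IllegalArgumentException("negative length");
            }
            this.label = label;
            this.length = length;
        }
    }

    /** Hypothetical policy: anything labelled "key*" must never be extractable. */
    static boolean extractable(Component c) {
        return !c.label.startsWith("key");
    }

    /** Cuts an already-produced key stream into pieces following the declared layout. */
    static List<byte[]> split(byte[] keyStream, List<Component> layout) {
        int total = 0;
        for (Component c : layout) {
            total += c.length;
        }
        // Checking the length, as suggested above, catches layout mismatches early.
        if (total != keyStream.length) {
            throw new IllegalArgumentException("layout does not match key stream length");
        }
        List<byte[]> parts = new ArrayList<>();
        int offset = 0;
        for (Component c : layout) {
            parts.add(Arrays.copyOfRange(keyStream, offset, offset + c.length));
            offset += c.length;
        }
        return parts;
    }
}
```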




Re: KDF API review, round 2

Jamil Nimeh
In reply to this post by Michael StJohns

> You're missing the point that setParameter() provides information used
> in all future calls to the signature generation, while init() provides
> data specifically for a given key stream production.  In Signature()
> you call .setParameter() to set up the PSS parameters (or use the
> defaults).  Each subsequent call to initSign or initVerify uses those
> PSS parameters.  The  equivalent part of .init() in KeyDerivation is
> actually the calls to .update() in signature as they provide the
> specific information for the production of the output key stream.  In
> fact, setting up an HMAC signature instance and passing it the mixin
> data as part of a .update() is a way of producing the key stream round.
>
> So equivalences:
>
> KeyDerivation.getInstance(PRF) == Signature.getInstance(HMAC)
> KeyDerivation.setParameters() == Signature.setParameters()
> KeyDerivation.init(key, List<Parameters>) == concatenation of the
> results of multiple calls (each key stream round based on the needed
> output length) to [Signature.initSign(Key) followed by
> Signature.update(converttobytearray(List<Parameters>)) followed by 
> Signature.sign()] to produce the key stream
> KeyDerivation.deriveKey() ==  various calls to key or object factories
> with parts of the key stream (signature).
>
Are you expecting that setParameters is called once per instantiation? 
If so, then the parameters that would go into setParameter (an APS I
assume) could just as easily go into the getInstance call.  It also
removes the chance that someone would call it twice.

If you're expecting someone to call setParameter more than once, then I
would expect an init must follow.  So why not place it in a form of init
that allows you to change that particular set of params?  Either way it
seems like we could coalesce this method into one of the calls that
sandwich it in your proposed model.



Re: KDF API review, round 2

Michael StJohns
On 11/20/2017 1:10 PM, Jamil Nimeh wrote:

>
>> You're missing the point that setParameter() provides information
>> used in all future calls to the signature generation, while init()
>> provides data specifically for a given key stream production.  In
>> Signature() you call .setParameter() to set up the PSS parameters (or
>> use the defaults).  Each subsequent call to initSign or initVerify
>> uses those PSS parameters.  The  equivalent part of .init() in
>> KeyDerivation is actually the calls to .update() in signature as they
>> provide the specific information for the production of the output key
>> stream.  In fact, setting up an HMAC signature instance and passing
>> it the mixin data as part of a .update() is a way of producing the
>> key stream round.
>>
>> So equivalences:
>>
>> KeyDerivation.getInstance(PRF) == Signature.getInstance(HMAC)
>> KeyDerivation.setParameters() == Signature.setParameters()
>> KeyDerivation.init(key, List<Parameters>) == concatenation of the
>> results of multiple calls (each key stream round based on the needed
>> output length) to [Signature.initSign(Key) followed by
>> Signature.update(converttobytearray(List<Parameters>)) followed by 
>> Signature.sign()] to produce the key stream
>> KeyDerivation.deriveKey() ==  various calls to key or object
>> factories with parts of the key stream (signature).
>>
> Are you expecting that setParameters is called once per
> instantiation?  If so, then the parameters that would go into
> setParameter (an APS I assume) could just as easily go into the
> getInstance call.  It also removes the chance that someone would call
> it twice.

That was my original proposal.  .setParameter() was an alternative that
matched the Signature pattern.
>
> If you're expecting someone to call setParameter more than once, then
> I would expect an init must follow.  So why not place it in a form of
> init that allows you to change that particular set of params?  Either
> way it seems like we could coalesce this method into one of the calls
> that sandwich it in your proposed model.
>
>

I don't expect them to call it more than once.  The original (now
deprecated) .setParameter (String, Object) method in Signature indicated
it could be called only once and would throw an error if called again -
I'm not sure why that wasn't brought forward to the
Signature.setParameter(AlgorithmParameterSpec) method.

In any event, I'd rather do the parameter setting in the getInstance
call than as a separate .setParameters call if it can be done without
exploding the interface.

Hmm.. how does that map to the Spi?  Does the
KeyDerivation.getInstance() code instantiate the object, call a
setParameter() method on the SPI and then return the new object? Or
what?  It may make more sense to just add the parameter related methods
to both the KeyDerivationSpi and the KeyDerivation classes and leave the
getInstance() method alone....

I'm sort of a don't care as long as I have a way of tweaking the KDF
before I run the first derivation.



Mike





Re: KDF API review, round 2

Jamil Nimeh


On 11/20/2017 12:34 PM, Michael StJohns wrote:

> On 11/20/2017 1:10 PM, Jamil Nimeh wrote:
>>
>>> You're missing the point that setParameter() provides information
>>> used in all future calls to the signature generation, while init()
>>> provides data specifically for a given key stream production.  In
>>> Signature() you call .setParameter() to set up the PSS parameters
>>> (or use the defaults).  Each subsequent call to initSign or
>>> initVerify uses those PSS parameters.  The equivalent part of
>>> .init() in KeyDerivation is actually the calls to .update() in
>>> signature as they provide the specific information for the
>>> production of the output key stream.  In fact, setting up an HMAC
>>> signature instance and passing it the mixin data as part of a
>>> .update() is a way of producing the key stream round.
>>>
>>> So equivalences:
>>>
>>> KeyDerivation.getInstance(PRF) == Signature.getInstance(HMAC)
>>> KeyDerivation.setParameters() == Signature.setParameters()
>>> KeyDerivation.init(key, List<Parameters>) == concatenation of the
>>> results of multiple calls (each key stream round based on the needed
>>> output length) to [Signature.initSign(Key) followed by
>>> Signature.update(converttobytearray(List<Parameters>)) followed by 
>>> Signature.sign()] to produce the key stream
>>> KeyDerivation.deriveKey() ==  various calls to key or object
>>> factories with parts of the key stream (signature).
>>>
>> Are you expecting that setParameters is called once per
>> instantiation?  If so, then the parameters that would go into
>> setParameter (an APS I assume) could just as easily go into the
>> getInstance call.  It also removes the chance that someone would call
>> it twice.
>
> That was my original proposal.  .setParameter() was an alternative
> that matched the Signature pattern.
Yes, I recall that.  Since it's a once-per instance call let me come up
with a rev with it in the getInstance.  There's precedent for
getInstance with APS, too.  It's just not any of the keyed forms, and
the rationale for that was to combine instantiation and initialization
(CertStore is an example).  It's not a great comparison to KDF for a
myriad of reasons most of which you've talked about...but it at least
shows that we have added params to getInstance calls in the past.  This
seems like one place where we're not going to come up with an answer
that pleases everyone.

>>
>> If you're expecting someone to call setParameter more than once, then
>> I would expect an init must follow.  So why not place it in a form of
>> init that allows you to change that particular set of params?  Either
>> way it seems like we could coalesce this method into one of the calls
>> that sandwich it in your proposed model.
>>
>>
>
> I don't expect them to call it more than once.  The original (now
> deprecated) .setParameter (String, Object) method in Signature
> indicated it could be called only once and would throw an error if
> called again - I'm not sure why that wasn't brought forward to the
> Signature.setParameter(AlgorithmParameterSpec) method.
>
> In any event, I'd rather do the parameter setting in the getInstance
> call than as a separate .setParameters call if it can be done without
> exploding the interface.
>
> Hmm.. how does that map to the Spi?  Does the
> KeyDerivation.getInstance() code instantiate the object, call a
> setParameter() method on the SPI and then return the new object? Or
> what?  It may make more sense to just add the parameter related
> methods to both the KeyDerivationSpi and the KeyDerivation classes and
> leave the getInstance() method alone....
>
> I'm sort of a don't care as long as I have a way of tweaking the KDF
> before I run the first derivation.
>
That's a good question, and one that I've been turning around in my head
and don't (yet) have a great answer for, but we'll get there.

So my original prototype was based off KeyAgreement.java and the order of
obtaining the spi depends on the form of getInstance.  If it's a simple
string-based algorithm form, then the provider is actually selected
during the init method.  In the other two forms where a provider is
specified as either a String or a Provider, the spi is obtained through
the Provider object and therefore no init-time selection is needed.

We may need to have provider selection done a bit earlier since we're
not only having to deal with the KDF itself, but a flavor of the KDF
with the underlying PRF.  I need to find out a little bit of the history
on why the provider selection happens during init for some of these
APIs.  IIRC there was a reason to have delayed provider selection, but I
don't have the history on that one.
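
For readers unfamiliar with the two paths, the following sketch exercises the existing KeyAgreement API in both forms. The wrapper class and method names are made up for the example. The comment about getProvider() reflects my understanding of the OpenJDK implementation and is not something the Javadoc guarantees.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import javax.crypto.KeyAgreement;

/** Illustrative only: the algorithm-only and provider-specific getInstance forms. */
final class ProviderSelectionSketch {

    /** Algorithm-only form: the provider may not be fixed until init() (or getProvider()). */
    static KeyAgreement byAlgorithm(String algorithm) {
        try {
            return KeyAgreement.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Provider form: the spi comes straight from the given Provider object. */
    static KeyAgreement byProvider(String algorithm, Provider provider) {
        try {
            return KeyAgreement.getInstance(algorithm, provider);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyAgreement lazy = byAlgorithm("DiffieHellman");
        // Assumption: in OpenJDK, asking for the provider before init()
        // makes the object commit to the first suitable provider.
        Provider chosen = lazy.getProvider();
        KeyAgreement pinned = byProvider("DiffieHellman", chosen);
        System.out.println(chosen.getName() + " / " + pinned.getProvider().getName());
    }
}
```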


Re: KDF API review, round 2

Jamil Nimeh
In reply to this post by Michael StJohns

>
>>
>> One additional topic for discussion: Late in the week we talked about
>> the current state of the API internally and one item to revisit is
>> where the DerivationParameterSpec objects are passed. It was brought
>> up by a couple people that it would be better to provide the DPS
>> objects pertaining to keys at the time they are called for through
>> deriveKey() and deriveKeys() (and possibly deriveData).
>>
>> Originally we had them all grouped in a List in the init method. One
>> reason for needing it up there was to know the total length of
>> material to generate.  If we can provide the total length through the
>> AlgorithmParameterSpec passed in via init() then things like:
>>
>> Key deriveKey(DerivationParameterSpec param);
>> List<Key> deriveKeys(List<DerivationParameterSpec> params);
>>
>> become possible.  To my eyes at least it does make it more clear what
>> DPS you're processing since they're provided at derive time, rather
>> than the caller having to keep track in their heads where in the DPS
>> list they might be with each successive deriveKey or deriveKeys
>> calls.  And I think we could do away with deriveKeys(int), too.
>
> See above - the key stream is logically produced in its entirety
> before any assignment of that stream is made to any cryptographic
> objects because the mixins (except for the round differentiator) are
> the same for each key stream production round.   Simply passing in the
> total length may not give you the right result if the KDF requires a
> per component length (and it should to defeat (5) or it should only
> produce a single key).
From looking at 800-108, I don't see any place where the KDF needs a
per-component length.  It looks like it takes L (total length) as an
input and that is applied to each round of the PRF.  HKDF takes L
up-front as an input too, though it doesn't use it as an input to the
HMAC function itself.  For TLS 1.3 that component length becomes part of
the context info (HkdfLabel) through the HKDF-Expand-Label
function...and it's only doing one key for a given label which is also
part of that context specific info, necessitating an init() call.  Seems
like the length can go into the APS provided via init (for those KDFs
that need it at least) and you shouldn't need a DPS list up-front.

As far as your (5) scenario goes, I can see how you can twiddle the
lengths to get the keystream output with zero-length keys and large IV
buffers.  But that scenario really glosses over what should be a big
hurdle and a major access control issue that stands outside the KDF API:
That the attacker shouldn't have access to the input keying material in
the first place.  Protect the input keying material properly and their
attack cannot be done.

I would rather see the DPS provided in the deriveKey.  It couples what
you want out with the call that makes the object and it makes a lot more
sense to keep those two together than try to remember where in the
submitted list of DPS objects you are.

>
> 95% of the time this will be a call to produce a single key.  4% of
> the time it will be a call to produce multiple keys. Only 1% of the
> time will it need to intermix key, data and object productions.
> Anybody who is doing that is going to write a wrapper around this
> class to make sure they get the key and data production order correct
> for each call.  So I'm not all that bothered by keeping the complexity
> as a price for keeping flexibility.
>
> You could have a Key deriveKey(Key k, DerivationParameterSpec param)
> for some things like TLS1.3 (where you can only make a single call to
> derive key between inits) , but then you'd also need at least a byte[]
> deriveData (Key k, DerivationParameterSpec param) and an Object
> deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary.  If you're just doing HKDF-Expand
(for the HKDF-Expand-Label TLS 1.3 key derivation) then you can provide
the input key, label and max length and any other context info that goes
into that HkdfLabel structure...all of that would go into init().  Then
provide the key alg and desired length via the DPS at deriveKey time. 
Any subsequent keys in the TLS 1.3 key schedule would need a new init
call anyway since the labels change and possibly the output length.
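
To illustrate why each TLS 1.3 derivation needs its own set of inputs, here is a small sketch of HKDF-Expand (RFC 5869) and of HKDF-Expand-Label with the HkdfLabel encoding as it appears in the published RFC 8446 (the drafts current at the time of this thread used a different label prefix). It uses javax.crypto.Mac directly; the class and method names are assumptions and this is not the proposed KeyDerivation API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: HKDF-Expand and HKDF-Expand-Label over HMAC-SHA256. */
final class ExpandLabelSketch {

    /** HKDF-Expand: T(i) = HMAC(prk, T(i-1) || info || i), output truncated to length. */
    static byte[] expand(byte[] prk, byte[] info, int length) {
        if (length < 0 || length > 255 * 32) {
            throw new IllegalArgumentException("length out of range");
        }
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(prk, "HmacSHA256"));
            ByteArrayOutputStream okm = new ByteArrayOutputStream();
            byte[] t = new byte[0];
            for (int i = 1; okm.size() < length; i++) {
                hmac.update(t);          // previous block, empty for the first round
                hmac.update(info);       // context info; L itself is not an input here
                hmac.update((byte) i);   // one-byte round counter
                t = hmac.doFinal();
                okm.write(t, 0, t.length);
            }
            return Arrays.copyOf(okm.toByteArray(), length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** HKDF-Expand-Label: the length and the label are folded into the info. */
    static byte[] expandLabel(byte[] secret, String label, byte[] context, int length) {
        byte[] fullLabel = ("tls13 " + label).getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream hkdfLabel = new ByteArrayOutputStream();
        hkdfLabel.write(length >>> 8);                    // uint16 length, high byte
        hkdfLabel.write(length);                          // uint16 length, low byte
        hkdfLabel.write(fullLabel.length);                // one-byte label length
        hkdfLabel.write(fullLabel, 0, fullLabel.length);
        hkdfLabel.write(context.length);                  // one-byte context length
        hkdfLabel.write(context, 0, context.length);
        return expand(secret, hkdfLabel.toByteArray(), length);
    }
}
```

With plain expand(), asking for 16 bytes gives a prefix of what a 32-byte request would give. With expandLabel(), changing the length or the label changes the info and therefore the entire output, which is why a fresh init() per derived key falls out naturally for TLS 1.3.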

Over the next day or so I'm going to have to make some final decisions
on this API as there are internal projects that are waiting on this API
to proceed.  I'm already past the cut-off date I set, but I recognize
these discussions are important to have and I appreciate the input you
and others have provided.

--Jamil



Re: KDF API review, round 2

Michael StJohns
On 11/27/2017 1:03 AM, Jamil Nimeh wrote:



One additional topic for discussion: Late in the week we talked about the current state of the API internally and one item to revisit is where the DerivationParameterSpec objects are passed. It was brought up by a couple people that it would be better to provide the DPS objects pertaining to keys at the time they are called for through deriveKey() and deriveKeys() (and possibly deriveData).

Originally we had them all grouped in a List in the init method. One reason for needing it up there was to know the total length of material to generate.  If we can provide the total length through the AlgorithmParameterSpec passed in via init() then things like:

Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);

become possible.  To my eyes at least it does make it more clear what DPS you're processing since they're provided at derive time, rather than the caller having to keep track in their heads where in the DPS list they might be with each successive deriveKey or deriveKeys calls.  And I think we could do away with deriveKeys(int), too.

See above - the key stream is logically produced in its entirety before any assignment of that stream is made to any cryptographic objects because the mixins (except for the round differentiator) are the same for each key stream production round.   Simply passing in the total length may not give you the right result if the KDF requires a per component length (and it should to defeat (5) or it should only produce a single key).
From looking at 800-108, I don't see any place where the KDF needs a per-component length.  It looks like it takes L (total length) as an input and that is applied to each round of the PRF.  HKDF takes L up-front as an input too, though it doesn't use it as an input to the HMAC function itself.  For TLS 1.3 that component length becomes part of the context info (HkdfLabel) through the HKDF-Expand-Label function...and it's only doing one key for a given label which is also part of that context specific info, necessitating an init() call.  Seems like the length can go into the APS provided via init (for those KDFs that need it at least) and you shouldn't need a DPS list up-front.


HKDF and SP800-108 only deal with the creation of the key stream and ignore the issues with assigning the key stream to cryptographic objects.  In the TLS version of HKDF, the L value is mandatory and only a single object is assigned per init/call to the KDF.   An HSM can look at the HKDF label information and set the appropriate policies for the assigned cryptographic object (because if any of the label data changes, the entire key stream changes).  That's not the case for the raw HKDF nor for any KDF that allows for multiple objects to be extracted out of a single key stream.  Hence the per-component length values.

Ideally, there should be a complete object spec for each object to be generated that is part of the mixins (label and context) for any KDF.   That allows an HSM to rely upon the object spec when setting policy controls for each generated object - and incidentally allows for a KDF to generate both public and non-public data in a secure way.

So as long as you allow for the specification of all of the production objects as part of the .init() I'm good.   A given KDF might not require this - but I can't see any way of fixing the current KDFs to work in HSMs without something like this.

As far as your (5) scenario goes, I can see how you can twiddle the lengths to get the keystream output with zero-length keys and large IV buffers.  But that scenario really glosses over what should be a big hurdle and a major access control issue that stands outside the KDF API: That the attacker shouldn't have access to the input keying material in the first place.  Protect the input keying material properly and their attack cannot be done.

Let me give you an example.   I'm running an embedded HSM - to protect TLS keys and to do all of the crypto.  An attacker compromises the TLS server and now has access to the HSM.  No problem - I'm going to notice if the attacker starts exfiltrating large amounts of data from the server (e.g. copies of the TLS in-the-clear but possibly re-encrypted data stream), so this isn't a threat, or is it?  A smart attacker does an extraction attack on the KDF used by TLS 1.2 and earlier, turns all of the key stream material into IV material and exports it from the HSM.  The attacker now has the much smaller key material, so he can send a few messages with those keys and allow for the passive external interception and decryption of the traffic without the risk of detection that sending all that traffic would carry.  Alternately, the attacker can place the key material in a picture via steganography and publish it as part of the server data.

The idea is to prevent extraction of the key material from an HSM, even by authorized users of that key material.  KDFs don't currently do this well.  Adding the overall length, the per-component lengths and a per-component spec to the data used to derive the key stream means that 1) changes to any of those change the entire key stream, 2) the per-component spec data may be used by the security module policy engine to enforce restrictions and 3) because of (1) and (2) calling the KDF a second time gets me exactly the same objects rather than just the same key stream.  The last isn't very important in a software based security domain, but turns out to have real implications for policy enforcing security modules.
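
Point (1) can be sketched as follows: the total length and every (label, length) pair are encoded into the PRF input, so "twiddling" the layout to turn key bytes into IV bytes yields an unrelated key stream. The encoding, the Component class and the method names are invented for this example and do not correspond to any standardized KDF.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: the output layout is part of the key stream mixins. */
final class SpecMixinSketch {

    /** Hypothetical per-component spec: a label and a length in bytes. */
    static final class Component {
        final String label;
        final int length;

        Component(String label, int length) {
            this.label = label;
            this.length = length;
        }
    }

    private static void writeInt(ByteArrayOutputStream out, int value) {
        out.write(ByteBuffer.allocate(4).putInt(value).array(), 0, 4);
    }

    /** Encodes the total length followed by every (label, length) pair. */
    static byte[] encode(List<Component> layout) {
        int total = 0;
        for (Component c : layout) {
            total += c.length;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt(out, total);
        for (Component c : layout) {
            byte[] label = c.label.getBytes(StandardCharsets.UTF_8);
            writeInt(out, label.length);
            out.write(label, 0, label.length);
            writeInt(out, c.length);
        }
        return out.toByteArray();
    }

    /** Counter-mode HMAC key stream whose mixins include the encoded layout. */
    static byte[] keyStream(byte[] masterKey, List<Component> layout) {
        byte[] mixins = encode(layout);
        // The first four bytes of the encoding hold the total length.
        int total = ByteBuffer.wrap(mixins).getInt();
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            for (int round = 1; stream.size() < total; round++) {
                prf.update(ByteBuffer.allocate(4).putInt(round).array());
                prf.update(mixins);
                byte[] block = prf.doFinal();
                stream.write(block, 0, block.length);
            }
            return Arrays.copyOf(stream.toByteArray(), total);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Deriving with a layout of a 16-byte key plus a 16-byte IV and then with a zero-length key plus a 32-byte IV produces two different 32-byte streams, so the exported "IV" no longer reveals the key bytes of the honest derivation.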

This gets worse when you realize that the KDF key is, under it all, either a HASH, HMAC or CMAC key and all of those algorithms produce public data.   Ideally you need a way of preventing a KDF key from calling the raw HASH/HMAC/CMAC functions directly (and vice versa).


I would rather see the DPS provided in the deriveKey.  It couples what you want out with the call that makes the object and it makes a lot more sense to keep those two together than try to remember where in the submitted list of DPS objects you are.

95% of the time this will be a call to produce a single key.  4% of the time it will be a call to produce multiple keys. Only 1% of the time will it need to intermix key, data and object productions. Anybody who is doing that is going to write a wrapper around this class to make sure they get the key and data production order correct for each call.  So I'm not all that bothered by keeping the complexity as a price for keeping flexibility.

You could have a Key deriveKey(Key k, DerivationParameterSpec param) for some things like TLS1.3 (where you can only make a single call to derive key between inits) , but then you'd also need at least a byte[] deriveData (Key k, DerivationParameterSpec param) and an Object deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary.  If you're just doing HKDF-Expand (for the HKDF-Expand-Label TLS 1.3 key derivation) then you can provide the input key, label and max length and any other context info that goes into that HkdfLabel structure...all of that would go into init().  Then provide the key alg and desired length via the DPS at deriveKey time.  Any subsequent keys in the TLS 1.3 key schedule would need a new init call anyway since the labels change and possibly the output length.

Over the next day or so I'm going to have to make some final decisions on this API as there are internal projects that are waiting on this API to proceed.  I'm already past the cut-off date I set, but I recognize these discussions are important to have and I appreciate the input you and others have provided.

--Jamil


Reading this last I think I've lost the context.   Here's where I think we are:

1) getInstance gets the default configuration of a given KDF (and that default will be attached to the instance name definition)
2) .setParameter() may be used to update the KDF configuration - once.
3) .init() takes at least the key, it may optionally take a set of derivation parameters.   The derivation parameters provided in .init() are intended for use in forming the label and context mixins for the KDF.   They may provide - for example - the total length of the key stream, the objects to be derived, the length of the objects, protection parameters for each of the objects etc.
4) A KDF generates a free-running or fixed-length key stream depending on the derivation parameters (e.g. it is free-running, and may produce as much key stream as desired, if "L" is not a mixin to the KDF or if the production object specifications are not part of the derivation mixins).

Doing (4) is mostly not a good idea, but someone might want to do this.   In that case it may make the most sense to just allow them to do deriveData(int length) calls as the only function (a keyed PRNG basically).
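
A toy model of points (1), (2) and (4) above might look like the sketch below: a default PRF, a setParameter() that may be called once before init(), and a deriveData(int) that keeps reading from a free-running stream. None of these names or signatures come from the actual specification under review; they are placeholders for discussion.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: once-only configuration plus a free-running deriveData. */
final class FreeRunningKdfSketch {
    private String prfName = "HmacSHA256";  // default configuration
    private boolean parameterSet;
    private Mac prf;
    private int round;
    private byte[] leftover = new byte[0];

    /** May be called at most once, and only before init(). */
    void setParameter(String newPrfName) {
        if (parameterSet || prf != null) {
            throw new IllegalStateException("configuration is already fixed");
        }
        prfName = newPrfName;
        parameterSet = true;
    }

    /** Keys the PRF and restarts the key stream. */
    void init(byte[] key) {
        try {
            prf = Mac.getInstance(prfName);
            prf.init(new SecretKeySpec(key, prfName));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException(e);
        }
        round = 0;
        leftover = new byte[0];
    }

    /** Returns the next length bytes of the stream (a keyed PRNG, basically). */
    byte[] deriveData(int length) {
        if (prf == null) {
            throw new IllegalStateException("init() has not been called");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(leftover, 0, leftover.length);
        while (out.size() < length) {
            // No L and no object specs in the PRF input: the stream never "ends".
            byte[] block = prf.doFinal(ByteBuffer.allocate(4).putInt(++round).array());
            out.write(block, 0, block.length);
        }
        byte[] all = out.toByteArray();
        // Keep unused bytes so the next call continues where this one stopped.
        leftover = Arrays.copyOfRange(all, length, all.length);
        return Arrays.copyOf(all, length);
    }
}
```

Two successive calls for 10 and 22 bytes return the same 32 bytes as a single 32-byte call on a fresh instance, which is exactly the property that makes this mode hard for a policy-enforcing module to reason about.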

Re the last version of your api - if you add the .setParameter() .getParameter() calls to both KeyDerivation and KeyDerivationSpi I think I'm happy with this part of the API.  I'm wondering if we should talk about KeyAgreement though.




Re: KDF API review, round 2

Jamil Nimeh


On 11/27/2017 10:09 AM, Michael StJohns wrote:
On 11/27/2017 1:03 AM, Jamil Nimeh wrote:



One additional topic for discussion: Late in the week we talked about the current state of the API internally and one item to revisit is where the DerivationParameterSpec objects are passed. It was brought up by a couple people that it would be better to provide the DPS objects pertaining to keys at the time they are called for through deriveKey() and deriveKeys() (and possibly deriveData).

Originally we had them all grouped in a List in the init method. One reason for needing it up there was to know the total length of material to generate.  If we can provide the total length through the AlgorithmParameterSpec passed in via init() then things like:

Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);

become possible.  To my eyes at least it does make it more clear what DPS you're processing since they're provided at derive time, rather than the caller having to keep track in their heads where in the DPS list they might be with each successive deriveKey or deriveKeys calls.  And I think we could do away with deriveKeys(int), too.

See above - the key stream is logically produced in its entirety before any assignment of that stream is made to any cryptographic objects because the mixins (except for the round differentiator) are the same for each key stream production round.   Simply passing in the total length may not give you the right result if the KDF requires a per component length (and it should to defeat (5) or it should only produce a single key).
From looking at 800-108, I don't see any place where the KDF needs a per-component length.  It looks like it takes L (total length) as an input and that is applied to each round of the PRF.  HKDF takes L up-front as an input too, though it doesn't use it as an input to the HMAC function itself.  For TLS 1.3 that component length becomes part of the context info (HkdfLabel) through the HKDF-Expand-Label function...and it's only doing one key for a given label which is also part of that context specific info, necessitating an init() call.  Seems like the length can go into the APS provided via init (for those KDFs that need it at least) and you shouldn't need a DPS list up-front.
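To make the point about L concrete, here is a minimal sketch of HKDF-Expand as described in RFC 5869, written directly against javax.crypto.Mac rather than the proposed KeyDerivation API.  The class and method names are made up for illustration, and HmacSHA256 is an arbitrary choice.  The requested length only controls how many HMAC blocks are produced and where the output is cut off; it never goes into the HMAC input, so a shorter request is a prefix of a longer one.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Hypothetical helper, not part of the proposed KeyDerivation API.
class HkdfExpandSketch {
    // HKDF-Expand (RFC 5869): T(i) = HMAC(PRK, T(i-1) || info || i), output = T(1) || T(2) || ...
    static byte[] expand(byte[] prk, byte[] info, int length) {
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(prk, "HmacSHA256"));
            if (length < 0 || length > 255 * hmac.getMacLength()) {
                throw new IllegalArgumentException("length out of range");
            }
            ByteArrayOutputStream okm = new ByteArrayOutputStream();
            byte[] t = new byte[0];
            for (int i = 1; okm.size() < length; i++) {
                hmac.update(t);          // previous block (empty for the first round)
                hmac.update(info);       // context/label data
                hmac.update((byte) i);   // round counter; note that length is NOT mixed in
                t = hmac.doFinal();
                okm.write(t, 0, t.length);
            }
            return Arrays.copyOf(okm.toByteArray(), length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] prk = new byte[32];
        Arrays.fill(prk, (byte) 0x0b);
        byte[] info = { 1, 2, 3 };
        byte[] shortOut = expand(prk, info, 42);
        byte[] longOut = expand(prk, info, 64);
        // Compares the 42-byte output with the first 42 bytes of the 64-byte output.
        System.out.println(Arrays.equals(shortOut, Arrays.copyOf(longOut, 42)));
    }
}
```

With raw HKDF, then, the total length is only a bound that could be supplied in the APS at init time; it is not something that differentiates one key stream from another.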


HKDF and SP800-108 only deal with the creation of the key stream and ignore the issues with assigning the key stream to cryptographic objects.  In the TLS version of HKDF, the L value is mandatory and only a single object is assigned per init/call to the KDF.   An HSM can look at the HKDF label information and set the appropriate policies for the assigned cryptographic object (because if any of the label data changes, the entire key stream changes).  That's not the case for the raw HKDF nor for any KDF that allows for multiple objects to be extracted out of a single key stream.  Hence the per-component length values.

Ideally, there should be a complete object spec for each object to be generated that is part of the mixins (label and context) for any KDF.   That allows an HSM to rely upon the object spec when setting policy controls for each generated object - and incidentally allows for a KDF to generate both public and non-public data in a secure way.

So as long as you allow for the specification of all of the production objects as part of the .init() I'm good.   A given KDF might not require this - but I can't see any way of fixing the current KDFs to work in HSMs without something like this.

As far as your (5) scenario goes, I can see how you can twiddle the lengths to get the keystream output with zero-length keys and large IV buffers.  But that scenario really glosses over what should be a big hurdle and a major access control issue that stands outside the KDF API: That the attacker shouldn't have access to the input keying material in the first place.  Protect the input keying material properly and their attack cannot be done.

Let me give you an example.  I'm running an embedded HSM to protect TLS keys and to do all of the crypto.  An attacker compromises the TLS server and now has access to the HSM.  No problem - I'm going to notice if the attacker starts exfiltrating large amounts of data from the server (e.g. copies of the TLS data stream in the clear, possibly re-encrypted), so this isn't a threat.  Or is it?  A smart attacker does an extraction attack on the KDF of TLS 1.2 and earlier, turns all of the key stream material into IV material and exports it from the HSM.  The attacker now has the much smaller key material, so he can send out a few messages containing those keys and then passively intercept and decrypt the traffic externally, without the risk of detection that sending all that traffic would carry.  Alternately, he can place the key material in a picture via steganography and publish it as part of the server data.
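The extraction attack in this example can be sketched in a few lines.  This is only an illustration: the class name, the 104-byte stand-in key stream and the segment lengths are assumptions chosen for the demo, not values from any particular cipher suite or HSM.  The point is that when the layout is not mixed into the KDF, the same key stream can be carved up a second time with zero-length keys and oversized IVs, and the "IV" segments then carry the key bytes.

```java
import java.util.Arrays;

// Hypothetical demo class; the lengths below are illustrative only.
class KeyBlockCarveSketch {
    // Splits one key stream into consecutive segments of the given lengths.
    static byte[][] carve(byte[] keyStream, int... lengths) {
        byte[][] parts = new byte[lengths.length][];
        int offset = 0;
        for (int i = 0; i < lengths.length; i++) {
            if (lengths[i] < 0 || offset + lengths[i] > keyStream.length) {
                throw new IllegalArgumentException("layout does not fit the key stream");
            }
            parts[i] = Arrays.copyOfRange(keyStream, offset, offset + lengths[i]);
            offset += lengths[i];
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] keyStream = new byte[104];   // stand-in for the PRF output held inside the module
        for (int i = 0; i < keyStream.length; i++) {
            keyStream[i] = (byte) i;
        }
        // Intended layout: two MAC keys, two cipher keys, two IVs.
        byte[][] honest = carve(keyStream, 20, 20, 16, 16, 16, 16);
        // Attacker's layout: zero-length keys, everything declared as (exportable) IV.
        byte[][] abused = carve(keyStream, 0, 0, 0, 0, 52, 52);
        // Checks whether the first "IV" starts with the bytes of the first MAC key.
        System.out.println(Arrays.equals(honest[0], Arrays.copyOf(abused[4], 20)));
    }
}
```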

The idea is to prevent extraction of the key material from an HSM, even by authorized users of that key material.  KDFs don't currently do this well.  Adding the overall length and per-component lengths, as well as a per-component spec, to the data used to derive the key stream means that 1) changes to any of those change the entire key stream, 2) the per-component spec data may be used by the security module policy engine to enforce restrictions and 3) because of (1) and (2), calling the KDF a second time gets me exactly the same objects rather than just the same key stream.  The last isn't very important in a software-based security domain, but turns out to have real implications for policy-enforcing security modules.
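Property (1) can be sketched with a counter-mode KDF in the style of SP 800-108, again written against javax.crypto.Mac rather than the proposed API.  The class name, the 32-bit encodings of the counter and L, and the textual component layout used as the context (e.g. "AES:16,IV:16") are all assumptions made for this illustration.  Because both L and the layout are inputs to every PRF round, re-describing the same number of bytes as a different set of objects yields an unrelated key stream.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Hypothetical helper, not part of the proposed KeyDerivation API.
class CounterKdfSketch {
    // K(i) = HMAC(KI, [i]_32 || label || 0x00 || context || [L]_32), with L in bits.
    static byte[] derive(byte[] ki, String label, String context, int lengthBytes) {
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(ki, "HmacSHA256"));
            byte[] lBits = ByteBuffer.allocate(4).putInt(lengthBytes * 8).array();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int i = 1; out.size() < lengthBytes; i++) {
                hmac.update(ByteBuffer.allocate(4).putInt(i).array());   // round counter
                hmac.update(label.getBytes(StandardCharsets.UTF_8));
                hmac.update((byte) 0x00);
                hmac.update(context.getBytes(StandardCharsets.UTF_8));   // per-component layout (assumed encoding)
                hmac.update(lBits);                                      // total length is a mixin
                byte[] block = hmac.doFinal();
                out.write(block, 0, block.length);
            }
            return Arrays.copyOf(out.toByteArray(), lengthBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] ki = new byte[32];
        Arrays.fill(ki, (byte) 0x42);
        byte[] intended = derive(ki, "demo", "AES:16,IV:16", 32);
        byte[] twiddled = derive(ki, "demo", "AES:0,IV:32", 32);
        // Compares the two streams; they share the key and total length but not the layout.
        System.out.println(Arrays.equals(intended, twiddled));
    }
}
```

Repeating a call with identical inputs reproduces the same stream, which is what lets a policy-enforcing module hand back the same objects on a second derivation.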

This gets worse when you realize that the KDF key is, underneath it all, either a hash, HMAC or CMAC key, and all of those algorithms produce public data.   Ideally you need a way of preventing a KDF key from being used to call the raw HASH/HMAC/CMAC functions directly (and vice versa).


I would rather see the DPS provided in the deriveKey.  It couples what you want out with the call that makes the object and it makes a lot more sense to keep those two together than try to remember where in the submitted list of DPS objects you are.

95% of the time this will be a call to produce a single key.  4% of the time it will be a call to produce multiple keys. Only 1% of the time will it need to intermix key, data and object productions. Anybody who is doing that is going to write a wrapper around this class to make sure they get the key and data production order correct for each call.  So I'm not all that bothered by keeping the complexity as a price for keeping flexibility.

You could have a Key deriveKey(Key k, DerivationParameterSpec param) for some things like TLS 1.3 (where you can only make a single call to deriveKey between inits), but then you'd also need at least a byte[] deriveData(Key k, DerivationParameterSpec param) and an Object deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary.  If you're just doing HKDF-Expand (for the HKDF-Expand-Label TLS 1.3 key derivation) then you can provide the input key, label and max length and any other context info that goes into that HkdfLabel structure...all of that would go into init().  Then provide the key alg and desired length via the DPS at deriveKey time.  Any subsequent keys in the TLS 1.3 key schedule would need a new init call anyway since the labels change and possibly the output length.
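For reference, the context info mentioned here can be sketched as follows, assuming the HkdfLabel layout as it appears in RFC 8446 (a 16-bit length, then a length-prefixed label with the "tls13 " prefix, then a length-prefixed context); the draft current at the time of this thread may differ in detail.  The class and method names are made up for illustration.  The resulting bytes would be the "info" input to HKDF-Expand, which is why a change in label or output length means a fresh init().

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that only builds the HkdfLabel bytes; it performs no derivation.
class HkdfLabelSketch {
    static byte[] hkdfLabel(int length, String label, byte[] context) {
        byte[] fullLabel = ("tls13 " + label).getBytes(StandardCharsets.US_ASCII);
        if (length < 0 || length > 0xFFFF || fullLabel.length > 255 || context.length > 255) {
            throw new IllegalArgumentException("field does not fit the HkdfLabel encoding");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(length >>> 8);                    // uint16 length, high byte
        out.write(length);                          // uint16 length, low byte
        out.write(fullLabel.length);                // one-byte label length
        out.write(fullLabel, 0, fullLabel.length);
        out.write(context.length);                  // one-byte context length
        out.write(context, 0, context.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] keyInfo = hkdfLabel(16, "key", new byte[0]);
        byte[] ivInfo = hkdfLabel(12, "iv", new byte[0]);
        // Each label/length pair gives a different info value, hence a separate init().
        System.out.println(keyInfo.length + " " + ivInfo.length);
    }
}
```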

Over the next day or so I'm going to have to make some final decisions on this API as there are internal projects that are waiting on this API to proceed.  I'm already past the cut-off date I set, but I recognize these discussions are important to have and I appreciate the input you and others have provided.

--Jamil


Reading this last I think I've lost the context.   Here's where I think we are:

1) Get instance gets the default configuration of a given KDF (and that default will be attached to the instance name definition)
2) .setParameter() may be used to update the KDF configuration - once.
I thought that we had ditched setParameter in favor of putting these parameters in getInstance.  IIRC we were headed toward an algorithm naming convention of <KDF>/<PRF>, plus an APS in getInstance (which may be null, and might be for most KDFs that we start with: HKDF and possibly TLS-PRF).

For those I could see naming conventions:
HKDF would need a PRF specifier, so HKDF/HmacSHA256, HKDF/HmacSHA384.  Basically for that PRF field I want to see values that line up with Mac algorithms in the standard names document.
TLS-PRF would probably allow a default: "TLS-PRF" would be the TLS-PRF used in 1.1 and earlier.  "TLS-PRF/SHA256" would be P_SHA256 from RFC 5246.  Or we could make it also follow the Mac standard name, so "TLS-PRF/HmacSHA256".  I'm fine with that too.  Basically each implementation

3) .init() takes at least the key, it may optionally take a set of derivation parameters.   The derivation parameters provided in .init() are intended for use in forming the label and context mixins for the KDF.   They may provide - for example - the total length of the key stream, the objects to be derived, the length of the objects, protection parameters for each of the objects etc.
Okay.  I think you've made a pretty strong case for the DerivationParameterSpec objects up-front.
4) A KDF generates a free-running or fixed-length key stream depending on the derivation parameters (e.g. it is free-running, and may produce as much key stream as desired, if "L" is not a mixin to the KDF or if the production object specifications are not part of the derivation mixins).

Doing (4) is mostly not a good idea, but someone might want to do this.   In that case it may make the most sense to just allow them to do deriveData(int length) calls as the only function (a keyed PRNG basically).
There are a couple of ways we could do this:
byte[] deriveData(int length);
int deriveData(byte[] buf, int offset, int length);

I don't think we'll add these for this release of the KDF API.  It's easier to add these types of calls later if we need to than it is to have these extra forms for a KDF use-case that is "mostly not a good idea".

Re the last version of your api - if you add the .setParameter() .getParameter() calls to both KeyDerivation and KeyDerivationSpi I think I'm happy with this part of the API.  I'm wondering if we should talk about KeyAgreement though

See above with respect to set/getParameter.  But hopefully you'll be happy with the API after this next round.  I have one other change I will be making.  I'm removing deriveObject.  I'm uncomfortable right now with the idea of the API executing an arbitrary class' constructor.  This is something I'm definitely willing to examine in the future once the most pressing tasks, both with this API and with the projects that immediately depend on it, are taken care of.  It is easier to add calls to the API than it is to remove/modify/deprecate them if there's a problem.  I will file an RFE so that we can track this enhancement.

Modifications to the KeyAgreement API are beyond the scope of this JEP.  We can certainly discuss ideas you have, but this KDF JEP isn't going to be dependent on those discussions.

--Jamil

Re: KDF API review, round 2

Anthony Scarpino
On 11/27/2017 11:16 AM, Jamil Nimeh wrote:

> I thought that we had ditched setParameter in favor of putting these
> parameters in getInstance.  IIRC we were headed toward an algorithm
> naming convention of <KDF>/<PRF>, plus an APS in getInstance (which may
> be null, and might be for most KDFs that we start with: HKDF and
> possibly TLS-PRF).
>
> For those I could see naming conventions:
> HKDF would need a PRF specifier, so HKDF/HmacSHA256, HKDF/HmacSHA384.  
> Basically for that PRF field I want to see values that line up with Mac
> algorithms in the standard names document.
> TLS-PRF would probably allow a default: "TLS-PRF" would be the TLS-PRF used
> in 1.1 and earlier.  "TLS-PRF/SHA256" would be P_SHA256 from RFC 5246.
> Or we could make it also follow the Mac standard name, so
> "TLS-PRF/HmacSHA256".  I'm fine with that too.  Basically each
> implementation


When the naming convention first came up, I never got around to
replying.  I think it would be better to specify the KDF and PRF as
separate parameters.  I don't think it's worth creating a naming
convention, given what we have experienced and are still experiencing
with Cipher transformations.  It's simpler to spell out each one separately.
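A rough sketch of what the compound-name option implies for an implementation, using only javax.crypto.Mac; the class and both helper methods are hypothetical and are not taken from the proposed spec.  A provider has to split the string itself and then decide whether the PRF part lines up with a Mac standard name, whereas with separate parameters the PRF name could be handed to Mac.getInstance unchanged.

```java
import javax.crypto.Mac;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper illustrating the "<KDF>/<PRF>" naming idea.
class KdfNameSketch {
    // Returns {kdf, prf}; prf is null when the name has no "/<PRF>" part.
    static String[] split(String name) {
        int slash = name.indexOf('/');
        if (slash < 0) {
            return new String[] { name, null };
        }
        return new String[] { name.substring(0, slash), name.substring(slash + 1) };
    }

    // True when some installed provider offers a Mac under this name.
    static boolean isMacName(String prf) {
        try {
            Mac.getInstance(prf);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] { "HKDF/HmacSHA256", "TLS-PRF/HmacSHA256", "TLS-PRF" }) {
            String[] parts = split(name);
            boolean ok = parts[1] != null && isMacName(parts[1]);
            System.out.println(parts[0] + " -> " + parts[1] + " (Mac name: " + ok + ")");
        }
    }
}
```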

Tony


Re: KDF API review, round 2

Michael StJohns
In reply to this post by Jamil Nimeh
On 11/27/2017 2:16 PM, Jamil Nimeh wrote:

> See above with respect to set/getParameter.  But hopefully you'll be
> happy with the API after this next round.  I have one other change I
> will be making.  I'm removing deriveObject.  I'm uncomfortable right
> now with the idea of the API executing an arbitrary class'
> constructor.  This is something I'm definitely willing to examine in
> the future once the most pressing tasks both with this API, and
> projects that are immediately depending on it are taken care of. It is
> easier to add calls to the API than it is to remove/modify/deprecate
> them if there's a problem.  I will file an RFE so that we can track
> this enhancement.
>
> Modifications to the KeyAgreement API are beyond the scope of this
> JEP.  We can certainly discuss ideas you have, but this KDF JEP isn't
> going to be dependent on those discussions.


Fair enough.

The deriveObject stuff is a problem because it doesn't fit well in the
JCA.  Mostly we've got KeyGenerator/KeyPairGenerator/KeyFactory that
produce objects of a particular provider.  KeyDerivation is weird in
that one provider will be producing the derived key stream and
potentially others might need to provide key or cryptographic objects
from that stream.   I can see the point in delaying this to a later rev
though it might make something like [KDF is Bouncycastle, keys are
PKCS11] a bit difficult to work around.

Last one -

Can I get you to buy into a MasterKey/MasterKeySpec that is not a subclass
of SecretKey but has the same characteristics (and probably the
same definitions) as those classes (and is what gets used in the .init()
argument)?  This goes back to trying to prevent a SecretKey from being
used both with a KDF and the underlying PRF of the KDF.  I know this is
a don't care for software based providers but would be useful for
security module based ones.

I'm really hoping to improve cryptographic type and use safety along the
way.

Thanks - Mike




Re: KDF API review, round 2

Jamil Nimeh


On 11/27/2017 11:46 AM, Michael StJohns wrote:
On 11/27/2017 2:16 PM, Jamil Nimeh wrote:
See above with respect to set/getParameter.  But hopefully you'll be happy with the API after this next round.  I have one other change I will be making.  I'm removing deriveObject.  I'm uncomfortable right now with the idea of the API executing an arbitrary class' constructor.  This is something I'm definitely willing to examine in the future once the most pressing tasks, both with this API and with the projects that immediately depend on it, are taken care of. It is easier to add calls to the API than it is to remove/modify/deprecate them if there's a problem.  I will file an RFE so that we can track this enhancement.

Modifications to the KeyAgreement API are beyond the scope of this JEP.  We can certainly discuss ideas you have, but this KDF JEP isn't going to be dependent on those discussions.


Fair enough.

The deriveObject stuff is a problem because it doesn't fit well in the JCA.  Mostly we've got KeyGenerator/KeyPairGenerator/KeyFactory that produce objects of a particular provider.  KeyDerivation is weird in that one provider will be producing the derived key stream and potentially others might need to provide key or cryptographic objects from that stream.   I can see the point in delaying this to a later rev though it might make something like [KDF is Bouncycastle, keys are PKCS11] a bit difficult to work around.

Last one -

Can I get you to buy into a MasterKey/MasterKeySpec that is not a subclass of SecretKey but has the same characteristics (and probably the same definitions) as those classes (and is what gets used in the .init() argument)?  This goes back to trying to prevent a SecretKey from being used both with a KDF and the underlying PRF of the KDF.  I know this is a don't care for software based providers but would be useful for security module based ones.

I'm really hoping to improve cryptographic type and use safety along the way.

I'm not quite getting what you mean here.  From looking at the KDFs described in 800-108, it looks like the key input to the KDF is KI, and KI ends up being the seed for each round of the PRF.  If that isn't what you're referring to, can you explain what you're looking for in more detail?

--Jamil 