Date   

Re: Coding Up LdapIdentityStoreDefinition Changes

Will Hopkins
 

FYI, I posted a PR for these changes, and subsequently merged it to master, so that I could build a release for the TCK team.

https://github.com/javaee/security-soteria/pull/116

Review/comments (of the closed PR) would still be welcome; I can address any issues in a subsequent PR. (I think I've implemented all the necessary logic, but I'd be surprised if there aren't bugs in it that need to be fixed).

On 07/23/2017 07:04 PM, Arjan Tijms wrote:
Hi,

As far as I am concerned, please go ahead. I'm not currently implementing any of those changes.

Kind regards,
Arjan Tijms

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 



On 07/24/2017 02:43 AM, Ivar Grimstad wrote:
Hi,

Sorry for not responding sooner. Been to JCrete...

A minor detail, but wouldn't it make sense to skip the Hash suffix for the generateHash and verifyHash methods on the Hash algorithm classes?
E.g.
XXXHash.generateHash() -> XXXHash.generate()
XXXHash.verifyHash() -> XXXHash.verify()

I'm in favour of DRY and it is not like these classes will generate or verify anything else. Just my 2c...
I agree -- that was my initial thought for naming, but I decided to stick with the naming pattern Arjan used.

On a related note, I've thought a little bit about PasswordHash vs. PasswordHashAlgorithm for the IF name. Initially I just didn't want the names to get too long, but the more I think about it, "Algorithm" doesn't seem accurate even if length isn't an issue.

An Algorithm is "RSA" or "HmacSHA256" or "PBKDF2", and the original intent of the hashAlgorithm attribute was to specify an algorithm by name. What we're now providing is an _implementation_ of an algorithm. It's a "PasswordHasher" more than it's a "PasswordHashAlgorithm". On balance, I still favor the name "PasswordHash". (And possibly renaming "hashAlgorithm" to "passwordHash" or "passwordHasher".)


Ivar

On Mon, Jul 24, 2017 at 7:46 AM Guillermo González de Agüero <z06.guillermo@...> wrote:
I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

--

Java Champion, JCP EC/EG Member, JUG Leader


-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 

All true.

A couple of counter-points to ponder:
  • Property values will never be standard, even if we support passing them. At best they would be standard for the default PasswordHash that is supplied by the RI. A developer passing properties would have to understand the specifics of the implementation being used.
  • Passing it to the method, rather than to the constructor or an init method. And we won't ever be in a position to pass them to generateHash(), so it can never behave consistently with verifyHash.
  • Implementations can provide their own configuration mechanisms -- a standard config object that's injected, decorators, annotations of their own, abstract impls that developers extend to customize and deploy, etc. Those won't be standard, but neither would properties, because the actual property names and values would be specific to the implementation in use.

Will


On 07/24/2017 02:54 AM, Rudy De Busscher wrote:
Guillermo,

We can't add additional methods to an interface later on. This would make WAR files packages and compiled against JAVA EE 8, break when they are deployed on a Java EE 9 server for example which has the additional methods in the interface.

So if we decide now (today?) that they are out, they are out forever.

Or we have to define new interface classes.

Rudy


On 24 July 2017 at 07:46, Guillermo González de Agüero <z06.guillermo@...> wrote:
I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Ivar Grimstad
 


On Mon, Jul 24, 2017 at 8:54 AM Rudy De Busscher <rdebusscher@...> wrote:
Guillermo,

We can't add additional methods to an interface later on. This would make WAR files packages and compiled against JAVA EE 8, break when they are deployed on a Java EE 9 server for example which has the additional methods in the interface.

Won't default methods solve that problem?
 

So if we decide now (today?) that they are out, they are out forever.

Or we have to define new interface classes.

Rudy


On 24 July 2017 at 07:46, Guillermo González de Agüero <z06.guillermo@...> wrote:
I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



--

Java Champion, JCP EC/EG Member, JUG Leader


Re: LDAP Annotation and Database Hashing Proposal

Rudy De Busscher
 

Guillermo,

We can't add additional methods to an interface later on. This would make WAR files packages and compiled against JAVA EE 8, break when they are deployed on a Java EE 9 server for example which has the additional methods in the interface.

So if we decide now (today?) that they are out, they are out forever.

Or we have to define new interface classes.

Rudy


On 24 July 2017 at 07:46, Guillermo González de Agüero <z06.guillermo@...> wrote:
I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803




Re: LDAP Annotation and Database Hashing Proposal

Ivar Grimstad
 

Hi,

Sorry for not responding sooner. Been to JCrete...

A minor detail, but wouldn't it make sense to skip the Hash suffix for the generateHash and verifyHash methods on the Hash algorithm classes?
E.g.
XXXHash.generateHash() -> XXXHash.generate()
XXXHash.verifyHash() -> XXXHash.verify()

I'm in favour of DRY and it is not like these classes will generate or verify anything else. Just my 2c...

Ivar

On Mon, Jul 24, 2017 at 7:46 AM Guillermo González de Agüero <z06.guillermo@...> wrote:
I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


--

Java Champion, JCP EC/EG Member, JUG Leader


Re: LDAP Annotation and Database Hashing Proposal

Guillermo González de Agüero
 

I was going to raise the same concern, although I think properties could be supported in the future by providing a new method and specifying that the old one y to be used when there are no parameters defined, and the new one when there are.

If that is acceptable and can be done in the future, I think it's not such a big deal to leave them for now if the EG don't find consensus.


Regards,

Guillermo González de Agüero


El lun., 24 de julio de 2017 6:49, Rudy De Busscher <rdebusscher@...> escribió:
Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



Re: LDAP Annotation and Database Hashing Proposal

Rudy De Busscher
 

Will,

regarding the properties. Indeed when we do not support the properties, they don't need to be specified at the method.

But then, we will never be able to support them in any further version since it would break the backward compatibility of the interface methods. Once this release is out, we can't change anything anymore to interface (no change in the method parameters, method count, etc ... only we can add default implemented methods)

So I guess it is in now or leave out for ever.

Rudy



On 23 July 2017 at 23:31, Will Hopkins <will.hopkins@...> wrote:
Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



Re: Coding Up LdapIdentityStoreDefinition Changes

Arjan Tijms
 

Hi,

As far as I am concerned, please go ahead. I'm not currently implementing any of those changes.

Kind regards,
Arjan Tijms


Coding Up LdapIdentityStoreDefinition Changes

Will Hopkins
 

Unless someone else is already doing this, I plan to code up the changes to LdapIdentityStore needed to support the LdapIdentityStoreDefinition changes in the PFD (search scope, memberOf support, etc.).
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: Review PasswordHash / DatabaseIdentityStore

Will Hopkins
 

Thanks, Rudy, I responded to your comments in the original thread.

On 07/23/2017 08:08 AM, Rudy De Busscher wrote:
Copied some parts out of the other thread as it becomes too unreadable.

  • remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implementation ever that can use them!

  • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8)
Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise, it is better to deliver nothing than something which isn't working properly and usable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the truth.


A small detail about the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.

Rudy

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 

Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: JetBrains s.r.o. feedback?

Will Hopkins
 

I wrote directly to their EC member, Trish Gee, right after the ballot concluded. I gave her a copy of the PFD spec, which I hoped would address some of their concerns, explained why we had chosen not to do OpenID Connect/OAuth2, and asked for a more detailed breakdown of their concerns.

She did respond, and said the concerns had been raised by an internal working group, and that she'd forward the updated spec to them and follow up to make sure we got more detailed feedback. I haven't heard anything further from them, but I just now pinged her to see if there was any further feedback.

There's not much we can do about the fact that we didn't to OIDC/OAuth2, but I'm optimistic that the recent updates will address most of their other concerns.

Will

On 07/23/2017 06:18 AM, Arjan Tijms wrote:
Hi,

In their voting comment for the Public Review Ballot, JetBrains s.r.o. mentioned working on a detailed response. We're about to post the Proposed Final Draft 2, but I haven't heard from them. Any concerns they may have will be increasingly hard to address at this point I think.

Has anyone seen their response or know what it's about other than what was mentioned in the comments?

Werner, Ivar, have you heard of anything?

Kind regards,
Arjan Tijms

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: Missing description of @LoginToContinue interceptor in spec

Arjan Tijms
 

Hi,

I just provided a PR for the mentioned missing description of LoginToContinue. See https://github.com/javaee/security-spec/pull/41

Kind regards,
Arjan Tijms


Review PasswordHash / DatabaseIdentityStore

Rudy De Busscher
 

Copied some parts out of the other thread as it becomes too unreadable.

  • remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implementation ever that can use them!

  • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8)
Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise, it is better to deliver nothing than something which isn't working properly and usable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the truth.


A small detail about the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.

Rudy


Re: LDAP Annotation and Database Hashing Proposal

Rudy De Busscher
 

Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!
 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 
As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.

Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803




JetBrains s.r.o. feedback?

Arjan Tijms
 

Hi,

In their voting comment for the Public Review Ballot, JetBrains s.r.o. mentioned working on a detailed response. We're about to post the Proposed Final Draft 2, but I haven't heard from them. Any concerns they may have will be increasingly hard to address at this point I think.

Has anyone seen their response or know what it's about other than what was mentioned in the comments?

Werner, Ivar, have you heard of anything?

Kind regards,
Arjan Tijms


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 



On 07/22/2017 05:17 PM, Arjan Tijms wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
It wasn't PasswordHashAlgorithm by itself so much, but the derived names like Pbkdf2PasswordHashAlgorithm felt long.

I'm OK with the longer form, though, if people like it better. It would fit better with the "hashAlgorithm" attribute.

 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.
See below for my thoughts re: properties. It comes down to a cost/benefit question for me at this point, given that we're already at the PFD stage.

  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.
Agree, we could invent something like that, but it's not completely simple, because we'd need to design an algorithm-independent canonical representation for the various "parts" of the hash that might need to be encoded/decoded (i.e., the hash, salt, algorithm name, algorithm parameters -- which vary depending on the algorithm -- etc.). The canonical representation is needed so that the values can be passed back and forth between the encoder and the hash algorithm.

Alternatively, some sort of a string representation of the format, like a printf format string, for example, might allow the storage format to be passed in as a parameter.

I'd actually like to do something like this, I just don't think we have time to do a good job on it.

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.
You might be right, but I still feel like properties would address the 20%, leaving the 80% to write their own implementation anyway (or use one provided by the environment to adapt to it's own local variations).

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.
i don't disagree, I actually like the properties idea the more I think about it. I'm just trying to keep things simple and minimize churn to the spec. I think the fact that we've got a good solution for the hash algorithm is already a big win.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)
How about this: default plan of record will be to go forward without properties, but I'll check with the RI and TCK teams, and if they feel they can handle the additional surface area, we can add them into the spec. If you come up with any good ideas for an "encoding decorator", let me know that as well, but keep in mind that it's unlikely we can fit it in, especially if it's complicated to specify, implement, or test.

Thanks for the review comments in github, I'll address them there.

Regards,

Will


Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 

PRs are created.

On 07/22/2017 07:41 PM, Will Hopkins wrote:
I didn't create a PR because I wasn't sure I'd be submitting it as is, but I can do so later.

Meanwhile, you should be able to easily compare against master by selecting "branches" from the "code" tab, clicking on the hashalgorithm branch to select it, then clicking the "compare" button.

Thanks,

Will

On July 22, 2017 4:42:48 PM EDT, "Guillermo González de Agüero" <z06.guillermo@...> wrote:
Thanks Will!

I imagine your changes are on the https://github.com/javaee/security-soteria/tree/hashalgorithm branch, no? Could you please create a pull request to master? That simplifies reviewing the changes of all commits at once and also will trigger notifications for repository watchers.

I'll take a look at it tomorrow and come back with some feedback.


Regards,

Guillermo González de Agüero

El sáb., 22 de julio de 2017 22:06, Will Hopkins <will.hopkins@...> escribió:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • One change was a rename of the interface from HashAlgorithm to PasswordHash, to be more explicit/accurate about what the interface is for -- it's specific to passwords, which is very distinct from regular cryptographic hashing. I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. Willing to reconsider the name, but if others have strong feelings.
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.
  • Added generateHash() method.
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)
As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

--
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


Re: LDAP Annotation and Database Hashing Proposal

Will Hopkins
 

I didn't create a PR because I wasn't sure I'd be submitting it as is, but I can do so later.

Meanwhile, you should be able to easily compare against master by selecting "branches" from the "code" tab, clicking on the hashalgorithm branch to select it, then clicking the "compare" button.

Thanks,

Will


On July 22, 2017 4:42:48 PM EDT, "Guillermo González de Agüero" <z06.guillermo@...> wrote:
Thanks Will!

I imagine your changes are on the https://github.com/javaee/security-soteria/tree/hashalgorithm branch, no? Could you please create a pull request to master? That simplifies reviewing the changes of all commits at once and also will trigger notifications for repository watchers.

I'll take a look at it tomorrow and come back with some feedback.


Regards,

Guillermo González de Agüero

El sáb., 22 de julio de 2017 22:06, Will Hopkins <will.hopkins@...> escribió:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • One change was a rename of the interface from HashAlgorithm to PasswordHash, to be more explicit/accurate about what the interface is for -- it's specific to passwords, which is very distinct from regular cryptographic hashing. I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. Willing to reconsider the name, but if others have strong feelings.
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.
  • Added generateHash() method.
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)
As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803


--
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

221 - 240 of 736