Re: LDAP Annotation and Database Hashing Proposal


Will Hopkins
 

Hi Rudy,

Thanks for the feedback, responses inline.

Will

On 07/23/2017 08:05 AM, Rudy De Busscher wrote:
Will, See inline comments (hope it is clear enough, I'll put them also in a separate thread)


On 22 July 2017 at 23:17, Arjan Tijms <arjan.tijms@...> wrote:
Hi,

On Sat, Jul 22, 2017 at 10:06 PM, Will Hopkins <will.hopkins@...> wrote:
Arjan, et al.:

I've pushed some changes to the hashalgorithm branch in both security-api and security-soteria; please have a look to see what you think.
  • I considered PasswordHashAlgorithm, but that seemed too long, especially for interfaces/classes that would extend it. 


"PasswordHashAlgorithm" certainly doesn't feel too long for me.
 
  • Also renamed IdentityHashAlgorithm to PlaintextPasswordHash.

Ok 
  • Added generateHash() method.

Ok 
  • Removed hash parameters passed to verifyHash(). I did not remove the DatabaseIdentityStoreDefinition hashProperties attribute, nor did I remove the "plumbing" that passees the properties down to the hash implementation, in case we want to restore the properties. But they are not currently passed in to the PasswordHash.

Even if the standard Pbkdf2PasswordHashImpl doesn't use it, I think it doesn't hurt much to pass them in anyway.

The parameters are required. if the PasswordHash interface doesn't have methods with the parameters passed along, there is no implemtattion ever that can use them!

I assume what you mean is that the methods need the parameters passed along if we retain the hashProperties attribute ... correct?

The intent would be to either support properties, or remove the hashProperties attribute on DatabaseIdentityStoreDefinition. My change leaves them on the annotation because I didn't feel like we'd made a final decision yet, and didn't want to have to add all that back if we decided to keep them. If we don't, we would remove the annotation attribute.

 
 
  • Implemented a (more or less) constant time compare for passwords/hashes, and updated the two PasswordHash impl classes to use it.
  • Re-wrote Pbkdf2PasswordHashImpl to:
    • Generate hashes with a specific storage format (encoding), and parse that encoding when doing verifyHash(), such that the verify operation can use the algorithm/parameters/salt that were used to hash the password originally, instead of the currently configured paramters.
    • Added support for multiple flavors of PBKDF2 (i.e., "with" different Hmac algorithms). Now supports all SHA1/2 variants, but not MD5 (which doesn't appear to be available, or at least not enabled by default, in JDK8).
    • Changed the default parameters to be more in line with current best practices.

Ok

I would also consider removing SHA1. Just as MD5, it is no longer considered as really safe.
Good point, I'll do that. My thought in keeping it (and perhaps MD5, if it worked) was to support passwords hashed with legacy algorithms, but that doesn't make much sense since it's unlikely they'd be encoded the same way in the database, so the default PasswordHashAlgorithm wouldn't work anyway.

 
 
So, this is what I had in mind. It's really not so different from what you coded up, Arjan, except for some of the algorithm implementation details, the lack of properties support, and the flexible encoding format that allows default parameters to be changed while preserving the ability to verify passwords hashed with different parameters.

My reasoning behind removing the properties support is as follows:
  • The application developer already has complete control because he/she can provide a custom algorithm to meet any requirement. Properties may be convenient in some cases, but even without them the developer can do whatever they need to do.
  • We are already introducing a very large change after PFD publication. Properties mean more to specify and document, and, more crucially, they mean more test assertions and TCK test cases to write. There is already concern internally about our ability to complete the TCK on schedule.

Ok
 
  • As is hopefully made more clear by my implementation code, the algorithm itself, and related parameters, are not really the source of most variation and complexity for password hashes -- it's the content and format of the encoded hashed password, as stored in the database, that seems likely to be infinitely more variable. There are likely a number of fairly common formats, but I suspect most are ad hoc, and practices are changing rapidly over time. In any event, we simply don't have time to research that question to figure out if there are any formats common enough to standardize on. (Nor do we have time to develop and implement some kind of expression syntax to specifies what the encoding should look like.)

As I mentioned, the encoding step is very likely common enough to warrant having its own method in the interface, so developers can e.g. decorate the default algorithm and only change the encoding.

 

As a result, I think it's highly unlikely that the ability to supply properties is going to be enough to allow developers to customize everything they'll want or need to

Like the entire concept of the DatabaseIdentityStore, it's indeed unlikely to ever cover everything everyone ever wants, but the 80% rule comes in play here; it's okay if we design for the 80% and then the 20% can do whatever they want in a custom implementation.

I personally still feel parameters with reasonable defaults are still a good intermediate solution between using the standard algorithm as is, and having to write something from scratch.

That said, if there's no time, there's no time. Or in other words, it is what it is ;)


But it needs to be useable. Otherwise it is better to deliver nothing then something which isn't working properly and useable!
If the perception of Java EE security API is not good during the first months, it will never be used by anyone, even if we come up with a better version later on.

Sorry to be this hard, but that is the thruth.
I agree that it needs to be usable, but, to me, the properties/no properties question doesn't seem to be a fatal flaw -- properties would be convenient, in some cases, but lack of properties won't prevent a developer (or platform vendor) from implementing any algorithm(s) they need. Do you see it differently?

 
Kind regards,
Arjan Tijms


 
, even when they're creating a new identity store and have complete control over the format. In the case that they need to develop for an existing identity store, with existing password hash formats, the chances are close to zero that a generic password hash implementation could be adequately customized with properties.
With all of that in mind, I do think properties are a nice feature, but they don't seem absolutely necessary -- developers have full control without them -- and it's not clear they'd really be useful in very many scenarios. I suspect they would mostly be used to fiddle with parameters in cases where the default storage encoding is deemed to be acceptable, and that, in most cases, the default algorithm/parameters would probably be deemed acceptable as well, if properties weren't available. As such, I'm reluctant to take on the extra complexity and work required to support properties at this point.

Anyway, that's my proposal. Thoughts?

Will


A small detail anout the code, I would make it more Java 8 alike ( with diamond operator, foreach method, ....)

If you like, I can make a PR for that.
Sure, if you want, but let's wait until we're fully decided on the approach. Also, did you sign the OCA yet? I don't understand why that's suddenly a requirement, but apparently it is.


Rudy
 

On 07/20/2017 03:25 PM, Arjan Tijms wrote:
Hi,

The adaptability concern is basically the main concern of the entire DataBaseIdentityStore. Indeed, having a query to get the (hashed) password from the caller name and another to get the groups covers maybe 80% of use cases, but certainly not all.

There are for instance exotic database schemes were you'd need at least 2 seperate queries to get that data.

For the hash algorithm the same thing holds. We need to specify enough moving parts so we can adapt to *many* existing databases and requirements, but we shouldn't even be dreaming about covering them all.

Since parameters can contain EL as values as well, it wouldn't be crazy hard really to supply the encoding as a method expression. It's basically a byte[] -> String function.

In this case though I think encoding is a general enough moving part to warrant its own method in the interface. It can easily be defaulted to e.g. base64 and it can be overridden without exposing the implementation by use of a CDI decorator (or should be after we solve a bug or problem that currently prevents decorating).

Kind regards,
Arjan Tijms


On Thursday, July 20, 2017, Will Hopkins <will.hopkins@...> wrote:


On 07/20/2017 12:38 PM, Guillermo González de Agüero wrote:


Similarly, parameters like the salt and iteration count can be defaulted. That way you can have 3 levels:

1. Specify only the class name; you get defaults for all parameters and it Just Works
2. Specify the parameters locally to match a specific DB or requirement
3. (via EL) fetch the parameter values externally

The most important rule should always be that the developer is in full control, or can be in full control. External overrides should be possible, but never be required. That is -the- big mistake that Java EE made in the past.
I agree, but I think we're only talking about what the mechanism for override is, and what the trade-off between simplicity and the override mechanism is. In practice, I don't think parameters are going to be very helpful to most developers. They can be used to tweak some simple things, but the simple things won't typically be necessary and the more complex things can't be handled by parameters.
 
Could you please provide an example of such a setting, please? Just to better understand. Otherwise, we could just wait for your new proposal to see.

Basically, it's the encoding of the hashed password. Parameters like salt, iterations, hash key size, etc. are all pretty standard. They can be dialed up or dialed down to make the hash stronger or weaker, and I think "stronger vs. weaker" is the level at which most people think about them. I doubt many developers are concerned about specifying precisely a certain number of iterations for some reason, they just want iterations count to be high enough to be secure, without imposing excessive overhead or resource consumption, and security most often wins out vs. resource consumption these days. Most will be happy if they can turn up the volume with a single knob, they don't need bass and treble and balance knobs. Even better: if the default volume is about right, and if the volume automatically goes up when you're driving at higher speeds.

In contrast, the format of the encoded string is not standard in any way that I'm aware of. There are probably LDAP vendors that do things in a common way, but for any given database user store, either the format has already been decided, because there are existing user accounts, or the developer won't care as long as the default format is reasonable (assuming the DB schema allows enough space for the chosen encoding format.

I can't think of any good way to standardize parameters to describe how the hashed password is encoded for storage, and that's the hard part, the part that you really have to adapt to. It's not a big deal if you have to use 4000 iterations instead of 2000, but it's a big deal if you can't parse the encoded password. And since the encoding is completely arbitrary and non-standard, that's the real problem, and that's what will likely force people to do their own implementation of HashAlgorithm. That can't be specified as Properties.

We could probably come up with an interface for encoding/decoding the hashed value, or for describing the encoded value as a string and passing it as a property. But that's a lot to spec out, and we just don't have time for it. It's complicated by the fact that decoding a hash potentially yields all the parameters that were used to generate the hash, so there'd need to be a generalized interface for returning those in a standard way. Plus, in some databases, the encodings used will vary, as legacy hashes are replaced with stronger hashes/different encodings when people change their password.

It's hard to argue against supplying Properties. It's not difficult to implement, and it might be useful. But I'm trying to keep the interface as simple as possible -- in part to reduce the number of additional TCK tests that need to be written -- and I suspect the Properties will rarely be useful. Some people will want to use them, but for many the defaults will be good enough, and many of those with specific requirements will already be forced to write an implementation in order to deal with encoding issues.

Will
-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803



-- 
Will Hopkins | WebLogic Security Architect | +1.781.442.0310
Oracle Application Development
35 Network Drive, Burlington, MA 01803

Join javaee-security-spec@javaee.groups.io to automatically receive all group messages.