Suggestion to enhance the OPVault design document

juanii · January 2019

Hello,

While writing a program to parse the OPVault file format I came across a couple of errors/omissions and I though it would be good to report them.

The Key Derivation section says that to derive the encryption and mac keys the master password is converted to a "UTF-8 null terminated string". This is a little confusing because the terminating NULL character is not actually used to derive the key. It only serves the purpose of using strlen to easily determine the password length to call CCKeyDerivationPBKDF. This is also highly dependent on both the Apple platform and the C language. Given the multi-platform nature of your software I would describe the file format in a more language and platform agnostic way. Moreover the document does not specify how Unicode strings should be normalized (NFC or NFD) which is as important as specifying the UTF-8 encoding is used to get the correct derived key.

In the hmac section, the pseudo-code to compute the MAC shows that the elements' values are converted to strings. The problem here arises when a value is of a boolean type, like in the "trashed" element. From my tests I guess your code to compute the MAC is converting the true JSON token to the string "1" instead of the more intuitive string "true". It would be good to clearly state how these conversions are performed to properly validate the MAC.

Regards,
Juan

1Password Version: Not Provided
Extension Version: Not Provided
OS Version: Not Provided
Sync Type: Not Provided

juanii · January 2019

Just found the Attachments section describes what once was the first and only attachments version (0x01) but there seems to be a newer 0x02. This new version doesn't seem to introduce any breaking changes, but knowing them all allows software to properly handle the files or discard if they're unknown.

Regards,
Juan

jpgoldberg · February 2019

Well spotted, @juanii!

Yes, null termination isn't applicable for these strings. I don't know why I made that mistake in the original document.
When I first wrote that document (mid 2012), we only had OPVault running in a private beta of 1Password for Mac. Although the broad features were planned out beforehand, I was documenting from our code. So there are lots of Mac-isms in there. As you can imagine, this annoyed our Windows and Android developers when it came time for them to support OPVault.
Unicode normalization. We lost an opportunity here. We didn't normalize. And so someone entering a ü on one keyboard or OS (or some combination) might find that their Master Password failed when using a different keyboard or OS. We did not repeat this mistake with our latest format. (We now normalize to NFKD):
HMACing string representations or the raw bytes? I will need to actually go to the source to see how that is done. I'm sure I knew at the time that I wrote that document, but I will have to "use the source" to see what we actually did in those case.
I can't recall precisely what the difference is between the version 1 and version 2 of the attachment format is, but it was something that we did in response to some migration errors from Agile Keychain. We had a bug in the migration process that affected a small number of attachments, and somehow or other this change was part of our fix. But again that was more than six years ago, and I do not recall the details.

I can't promise to get these documentation issues fixed quickly, but these are things that should be corrected at some point. When I get a chance to update documentation of this nature, my priority is on filling in the gaps of our Security white paper (PDF).

Thank you!

juanii · February 2019

As you can imagine, this annoyed our Windows and Android developers when it came time for them to support OPVault.

I wouldn't go that far and call it annoying, at least it's described using documented functions you can go and see how they work. I don't think it's a road block for someone trying to read the file though. If you're fiddling with it, chances are you already have an idea of what you're dealing with. But yeah, the standard for these kind of specifications is a more formal description.

HMACing string representations or the raw bytes? I will need to actually go to the source to see how that is done. I'm sure I knew at the time that I wrote that document, but I will have to "use the source" to see what we actually did in those case.

From my tests I can confirm you're feeding string representations of the values to the HMAC function, otherwise I just found a crazy easy collision! (of course I did not). Using the strings worked fine with the exception of this quirk in the conversion of boolean values. While not completely arbitrary, it's not clear that true translates into "1" (and I bet false turns into "0").
I guess this could be another point for improvement. The guts in the computation of a MAC are byte-blocks operations. So it would be wise to precisely describe the bytes you're supposed to use as input. I'm no expert in the inner workings of JSON parsing frameworks and file I/O libraries, but it's not crazy to think an UTF-8 string could be normalized after being read from a file. So specifying the encoding and normalization for these string representations could save some headaches.

I can't recall precisely what the difference is between the version 1 and version 2 of the attachment format is

Probably nothing of interest for third parties, but knowing there are newer versions is useful to write code that behaves.

I know I'm being overly picky on this but, you know, it's the cost of openness :) I appreciate your will and effort to publicly document the file format.

Thanks for taking the time to read and answer, your comments were enlightening.

Regards

AGAlumB · March 2019

Likewise, thank you for raising these points! There's always room for us to improve, both with existing documentation and planning as we continue to work on newer stuff. :)

Suggestion to enhance the OPVault design document

Comments