Random errors during concurrent requests
Hi, we are using the CLI tool as a part of a CI/CD step to test our numerous integrations. In doing so, we have created a node wrapper that invokes the CLI. As a part of the build, we pull a vault with 100+ items and then call each of items in that vault to hydrate our object since there is not enough of the detailed information in the op list items --vault=<OUR_VAULT>
call.
During that step, we are getting a whole host of strange errors. The typical ones are invalid character '"' after top-level value
or invalid character '}' after top-level value
. We suspected it had to do with the concurrency so we implemented an exponential backoff strategy, (up to 5 requests) which partially alleviated the problem. As we have added more items to the vault, the performance has gotten worse.
Example Error:
Error: Command failed: printf 'XfVsdbE4VMMtf53-YmRn-NO7P5sCHTNm1eCz2gk9ENA' | op get item 4sxtoqsn5atn73zy7sia6sds4u [LOG] 2019/05/30 14:19:52 (ERROR) invalid character '}' after top-level value at ChildProcess.exithandler (child_process.js:275:12) at emitTwo (events.js:126:13) at ChildProcess.emit (events.js:214:7) at maybeClose (internal/child_process.js:925:16) at Socket.stream.socket.on (internal/child_process.js:346:11) at emitOne (events.js:116:13) at Socket.emit (events.js:211:7) at Pipe._handle.close [as _onclose] (net.js:554:12)
1Password Version: 0.5.5
Extension Version: Not Provided
OS Version: Not Provided
Sync Type: Not Provided
Comments
-
To note here: it doesn't seem to be a JSON error. If I switch the session token to one that's active, then I'm able to retrieve the data successfully.
0 -
Hi @tyltot
As Mickael mentioned, this is most likely due to how
op
currently handles concurrency. In short, poorly.The current implementation of the command line tool does not support truly simultaneous API calls due to the authentication model requiring a unique identifier. Each CLI request has an identifier based off of the time since the session creation, and if the
op
requests are close enough, it is possible for them to have the same identifier thus leading to the server rejecting the duplicates.We are looking to change this in the future, but for now we recommend to not do concurrent requests with the CLI.
0 -
Thanks @mickael and @graham_1P for the confirmation. Switching it to synchronous changed the workflow from 2 min (when successful) to 14 min. Would love more concurrency options in the future!
0 -
Would we be able to improve the error logging to indicate the cause rather than the random
invalid character
error? That would've saved me a few hours of backtracking a red herring0 -
Improving error logging and error handling is one thing we are looking to improve in the future. Currently our error output can be a little verbose, and sometimes a bit inconsistent, depending on how you are consuming the cli. As we work towards 1.0.0 for the cli, we are going to be taking another look at our inputs, outputs, and errors. While I can't give any specifics, I can say that we are working towards standardising everything and ensuring a consistent interface across interactions with the cil.
0 -
For further context/use-case — my team's been trying to integrate op into our existing Ansible configurations. It works reasonably well with a handful of hosts, but beyond about 10 it's very likely you'll run into this problem. Would be lovely if it "Just Worked (TM)". :)
0 -
@danschmidt5189 Thanks for the feedback - we're going to be looking into this to see how we can improve things.
0 -
Is there any ETA on when this will be resolved? Also having the same struggle with Ansible and concurrency. Would be nice to have an API as opposed to a CLI wrapper as well...
0 -
FYI v1.0 includes a fix that should prevent corrupt session files when running concurrent "op" processes.
0 -
Version v1.1.0 contains another change to improve the reliability of concurrent
op
processes. However, there are some limitations for concurrentop
processes that we probably won't be able to solve.It would be interesting to learn more about how/why any of you are using
op
concurrently. We are looking into solutions that would eliminate the need to runop
concurrently, but of course every use case is different.0