2.3.1 SSL Cert failed to update for SCIM URL

KnowsNothing
Community Member

Hello community,

We went the Digital Ocean (DO) route and followed this guide: https://support.1password.com/scim-deploy-digitalocean/

The latest version in the DO marketplace is 2.3.1 even though I see there are more recent versions offered by 1Password.
Our Identity Provider is Azure AD.

This worked great, but unfortunately the sync broke at the three-month mark. I looked at the SSL certificate issued by Let's Encrypt (R3), and it seems to have expired at the same time the SCIM bridge provisioning sync broke.

Is my only option to delete and re-deploy the Bridge or is there a way we can troubleshoot why the cert failed to renew?
I was under the assumption that 2.3.1 had resolved issues with R3 certs.
TIA.


1Password Version: 8.7.3
Extension Version: 2.3.6
OS Version: Windows 10
Browser: Chrome

Comments

  • Hi @KnowsNothing. Thanks for reaching out!

    Thank you for reporting that the DO marketplace does not have the latest version available. I'm following up with the team to confirm why that is, and to get it bumped to the latest.

    I'm sorry to hear that your Let's Encrypt certificate didn't renew automatically. Have you perhaps noticed any errors in the SCIM bridge logs related to the renewal of the certificate?

    You can access the logs via the web interface by navigating to the URL where the SCIM bridge is deployed and entering your bearer token. This may not be possible if the certificate is expired. In that case you can also obtain the logs via the Kubernetes command line interface, kubectl.

    In a terminal that has access to kubectl and your DO cluster, you can run the following commands to obtain the SCIM bridge logs:
    1. Get namespaces: kubectl get namespaces
    2. Get application pods: kubectl get pods [--namespace=<scim-bridge-namespace>]
    3. Get logs for SCIM bridge pod: kubectl logs <scim-bridge-pod> [--namespace=<scim-bridge-namespace>]
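
    As a rough sketch, the log steps above can be wrapped in two small shell functions that keep only certificate-related lines. The function names and the search pattern are illustrative assumptions, not part of the SCIM bridge tooling; the pod name and namespace are values you supply from the earlier kubectl get commands.

```shell
# Keep only log lines that look certificate-related (case-insensitive).
# Reads log lines on stdin. The pattern is a guess at useful keywords.
filter_cert_logs() {
  grep -i -E 'cert|acme|letsencrypt'
}

# Fetch the logs for one pod and filter them.
# $1 = SCIM bridge pod name, $2 = namespace (both supplied by you).
scim_cert_logs() {
  kubectl logs "$1" --namespace="$2" | filter_cert_logs
}
```

    For example: scim_cert_logs <scim-bridge-pod> <scim-bridge-namespace>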

    Any errors related to the certificate renewal will help us diagnose why the automatic renewal failed.

    For now you can force a renewal of the certificate by clearing the redis cache and restarting the SCIM bridge. The SCIM bridge caches the Let's Encrypt certificate in redis, and attempts to obtain a new certificate on startup when no certificate is available in the cache.

    Note before you continue to the steps below: the tradeoff is that you will also lose the SCIM bridge logs for the last 3 days. If you'd like to keep a copy of the application logs, you can follow the steps mentioned previously in this post.

    The steps to clear the redis cache and restart the SCIM bridge:
    1. Open a terminal where you have access to the Kubernetes command line interface (kubectl), and make sure you are connected to the cluster running your SCIM bridge
    2. Scale down the SCIM bridge instance in your cluster: kubectl scale --replicas=0 deployment/<scim-bridge-deployment-name>
    3. Scale down the redis instance in your cluster: kubectl scale --replicas=0 deployment/<redis-deployment-name>
    4. Wait a few seconds for the Kubernetes scheduler to remove the running redis and SCIM bridge instance
    5. Scale up the redis instance in your cluster: kubectl scale --replicas=1 deployment/<redis-deployment-name>
    6. Scale up the SCIM bridge instance in your cluster: kubectl scale --replicas=1 deployment/<scim-bridge-deployment-name>

    You may need to specify the namespace for your SCIM bridge and redis deployment. You can get the namespace for these by issuing the kubectl get deployment --all-namespaces command. You can then include the namespace in the above commands by adding the --namespace=<deployment-namespace> flag.
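
    The scale-down and scale-up steps above could be scripted roughly as follows. This is a sketch only: the function name and the WAIT_SECONDS variable are made up for illustration, and the 15-second default wait is an arbitrary choice standing in for "a few seconds". The namespace and deployment names are passed in as arguments.

```shell
# Scale the SCIM bridge and redis down to zero, wait, then bring redis
# back first and the SCIM bridge second.
# $1 = namespace, $2 = SCIM bridge deployment name, $3 = redis deployment name
# Runs in a subshell so the helper variables do not leak into your session.
force_cert_renewal() (
  ns=$1
  scim=$2
  redis=$3
  kubectl scale --replicas=0 "deployment/$scim" --namespace="$ns" &&
  kubectl scale --replicas=0 "deployment/$redis" --namespace="$ns" &&
  sleep "${WAIT_SECONDS:-15}" &&
  kubectl scale --replicas=1 "deployment/$redis" --namespace="$ns" &&
  kubectl scale --replicas=1 "deployment/$scim" --namespace="$ns"
)
```

    For example: force_cert_renewal <deployment-namespace> <scim-bridge-deployment-name> <redis-deployment-name>. The commands are chained with && so that a failed step stops the sequence instead of scaling things back up in an unknown state.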

    The SCIM bridge will attempt to get a new certificate from Let's Encrypt when it starts up. You may experience a couple of minutes of downtime while you scale down/up the SCIM bridge and redis instances in your cluster. Feel free to pause provisioning in your identity provider while you perform the steps to force a certificate renewal.

    I hope this helps!

  • KnowsNothing
    Community Member
    edited July 2022

    @DeVille_1P

    The available logs don't mention "Let" and the only mention of "Cert" is this:

    {"level":"info","version":"2.3.1","build":"203011","application":"op-scim","service":"health","source":"CertificateManager","created":"2022-07-03T02:07:55Z","expires":"2022-07-03T02:17:55Z","status":"unhealthy","time":"2022-07-03T02:10:49Z","message":"health report"}

    Thanks for the reply. We'd love to attempt a forced renewal and report back. Unfortunately, the guide I mentioned for the DO setup does not reference a terminal window or how to access one. Here is the guide we used again for your reference: https://support.1password.com/scim-deploy-digitalocean/

    Could you elaborate on how to gain terminal access?
    I see an opportunity to expand your knowledge base articles here. ;-)

    Thanks

  • Hi @KnowsNothing.

    I can confirm that v2.4.1 of the SCIM bridge has been submitted to the DO app store. The new version should show as an available option once the DO team approves the release.

    Given that this is for Digital Ocean it may be better to follow the steps using the interface provided through the dashboard. The following is an adapted version of our upgrade steps for Digital Ocean:

    Access the Kubernetes Dashboard

    1. Visit the DigitalOcean Kubernetes console
    2. Choose the cluster where the SCIM bridge is deployed
    3. Click "Kubernetes Dashboard"

    Scale down the resources

    1. Select the op-scim-bridge namespace from the dropdown at the top of the screen
    2. Select "Deployment" from the Workloads menu on the left
    3. Select op-scim-bridge under the Deployments
    4. Select "Scale resource" from the top menu (looks like ...)
    5. Change "Desired replicas *" to 0
    6. Click "Scale"
    7. Select "Stateful Sets" from the Workloads menu on the left
    8. Select op-scim-bridge-redis-master under the Stateful Sets
    9. Select "Scale resource" from the top menu (looks like ...)
    10. Change "Desired replicas *" to 0
    11. Click "Scale"

    Wait a few seconds for the resources to be scaled down.

    Scale the resources back up (reverse of previous steps)

    1. Ensure that you still have the op-scim-bridge namespace selected from the dropdown at the top of the screen
    2. Select "Stateful Sets" from the Workloads menu on the left
    3. Select op-scim-bridge-redis-master under the Stateful Sets
    4. Select "Scale resource" from the top menu (looks like ...)
    5. Change "Desired replicas *" to 1
    6. Click "Scale"
    7. Select "Deployment" from the Workloads menu on the left
    8. Select op-scim-bridge under the Deployments
    9. Select "Scale resource" from the top menu (looks like ...)
    10. Change "Desired replicas *" to 1
    11. Click "Scale"

    It may take a minute or two for the bridge to come back online and obtain a new certificate.
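
    For anyone who prefers a terminal, the dashboard steps above correspond roughly to the kubectl commands below. This is a sketch that assumes kubectl is already configured for the cluster and that the resource names match the ones shown in the dashboard steps (op-scim-bridge namespace and deployment, op-scim-bridge-redis-master stateful set). The function name, the WAIT_SECONDS variable, and its 15-second default are illustrative assumptions.

```shell
# Names taken from the dashboard steps; adjust them if your install differs.
NS=op-scim-bridge
BRIDGE=deployment/op-scim-bridge
REDIS=statefulset/op-scim-bridge-redis-master

# Scale both workloads to zero, wait, scale redis and then the bridge back
# to one replica, and finally wait until the bridge rollout completes.
restart_bridge() {
  kubectl scale --replicas=0 "$BRIDGE" --namespace="$NS" &&
  kubectl scale --replicas=0 "$REDIS" --namespace="$NS" &&
  sleep "${WAIT_SECONDS:-15}" &&
  kubectl scale --replicas=1 "$REDIS" --namespace="$NS" &&
  kubectl scale --replicas=1 "$BRIDGE" --namespace="$NS" &&
  kubectl rollout status "$BRIDGE" --namespace="$NS"
}
```

    Running restart_bridge performs the whole sequence. Note that redis is addressed as a stateful set here, because that is where it appears in the dashboard.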

    I hear you regarding expanding our support articles, but for the most part the bridge should "just work" after being set up and continue to obtain new certificates automatically. Unfortunately this wasn't the case in this instance. We will definitely be investigating why the certificate wasn't automatically renewed and continue to monitor the new releases.

  • KnowsNothing
    Community Member

    @DeVille_1P

    This worked flawlessly on the first go. Thank you very much.

    We'll look forward to the upgrade once released from DO. Then we can get patched up to 2.4.1.

    Thanks again.

  • No problem @KnowsNothing. I'm glad you managed to resolve the issue. Thanks for letting us know!

  • KnowsNothing
    Community Member

    @DeVille_1P

    Here's what I found in the logs after we successfully renewed the R3 cert. I wonder if there is more we can do to investigate this.

    {"level":"error","version":"2.3.1","build":"203011","application":"op-scim","component":"CertificateManager","subcomponent":"certmagic","error":"[scim.example.com] solving challenges: scim.example.com: no solvers available for remaining challenges (configured=[tls-alpn-01] offered=[http-01 dns-01 tls-alpn-01] remaining=[http-01 dns-01]) (order=https://acme-v02.api.letsencrypt.org/acme/order/44217013/412830559) (ca=https://acme-v02.api.letsencrypt.org/directory)","retry_time":0,"elapsed":0,"attempt":0,"time":"2022-07-05T22:12:25Z","message":"certificate manager error"}

    {"level":"error","version":"2.3.1","build":"203011","application":"op-scim","component":"CertificateManager","subcomponent":"certmagic","error":"[scim.example.com] Renew: [scim.example.com] solving challenges: scim.example.com: no solvers available for remaining challenges (configured=[tls-alpn-01] offered=[http-01 dns-01 tls-alpn-01] remaining=[http-01 dns-01]) (order=https://acme-v02.api.letsencrypt.org/acme/order/44217013/4128305596) (ca=https://acme-v02.api.letsencrypt.org/directory)","retry_time":60,"elapsed":2.0186756,"attempt":1,"time":"2022-07-05T22:12:25Z","message":"certificate manager error"}

    *Domain redacted

    Thanks for the additional logs @KnowsNothing. Some errors are normal as Let's Encrypt goes through its challenge process to confirm that the domain is configured correctly before a certificate is issued.

    The important part of the process is that the tls-alpn-01 challenge method is available. This requires that port 443 is open and accessible on the SCIM bridge pod. In the latest version of the bridge this is the preferred challenge method we use. This means that we no longer require port 80 to be open and accessible for the Let's Encrypt challenge.
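
    To make the challenge lists in those log lines easier to read, and to check what is actually being served on port 443, something like the following may help. Both function names are hypothetical helpers, and the second one assumes the openssl command line tool is installed and that the bridge is reachable from where you run it.

```shell
# Print one bracketed list from a certmagic error line read on stdin.
# $1 = field name as it appears in the log: configured, offered or remaining
challenge_list() {
  sed -n "s/.*$1=\[\([^]]*\)].*/\1/p"
}

# Print the expiry date of the certificate served on port 443.
# $1 = hostname of the SCIM bridge (for example scim.example.com)
served_cert_enddate() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -enddate
}
```

    For example, piping one of the error lines above into challenge_list configured prints tls-alpn-01, and challenge_list remaining prints http-01 dns-01.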

    Just to confirm, your SCIM bridge did receive a valid certificate, even after the shown errors, right?

  • KnowsNothing
    Community Member

    @DeVille_1P

    Correct. We did get a cert. Not sure why it failed to renew after the initial 3 month period but we'll keep an eye on it.

This discussion has been closed.