Status History

Filter: Google Compute Engine



February 2024

Bad gateway errors

February 27, 2024 13:10 UTC

Incident Status

Degraded Performance


Components

Website


Locations

Google Compute Engine




February 27, 2024 13:10 UTC
[Resolved] This incident has been resolved. More information can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17667

February 27, 2024 12:44 UTC
[Monitoring] The service status has been restored; however, we are still looking into the root cause of the issue. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17667

February 27, 2024 12:18 UTC
[Investigating] We are currently investigating 502 errors on GitLab.com. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17667

GitLab Agent timeout issues

February 27, 2024 04:01 UTC

Incident Status

Degraded Performance


Components

API


Locations

Google Compute Engine




February 27, 2024 04:01 UTC
[Resolved] The issue has been resolved and the affected services are now operating normally. Further details are available in gitlab.com/gitlab-com/gl-infra/production/-/issues/17655

February 26, 2024 23:46 UTC
[Monitoring] The resolution has been merged. Waiting for builds to complete. We will continue to monitor. Connectivity does appear to be resolved at this time.

February 26, 2024 21:02 UTC
[Monitoring] We have a permanent fix ready to deploy. We are working through other issues preventing us from deploying this fix. For now, we will continue to monitor. Most connectivity does appear to be resolved at this time.

February 26, 2024 18:15 UTC
[Identified] We are in the process of resolving the issue. Work on this issue continues and may take around 2h-3h until it is fully deployed on GitLab.com. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17655

February 26, 2024 15:08 UTC
[Identified] We are in the process of resolving the issue. This process will take around 2h-4h until it is fully deployed on GitLab.com. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17655

February 26, 2024 12:22 UTC
[Identified] We are in the process of resolving the issue. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17655

February 26, 2024 11:55 UTC
[Identified] We are working on resolving GitLab Agent timeout issues. Users might experience 500 errors in CI jobs when using it. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17655

Intermittent job failures with GitLab.com SaaS runners

February 21, 2024 17:54 UTC

Incident Status

Partial Service Disruption


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine




February 21, 2024 17:54 UTC
[Resolved] The issue has been resolved and the affected services are now operating normally. Further details are available in gitlab.com/gitlab-com/gl-infra/production/-/issues/17636

February 21, 2024 17:03 UTC
[Monitoring] We have taken steps to mitigate the problem and are now monitoring. The next update will be in 60 minutes unless there is anything to report sooner. Tracking in gitlab.com/gitlab-com/gl-infra/production/-/issues/17636

February 21, 2024 16:32 UTC
[Investigating] We're currently investigating a known problem with SaaS runners which may cause CI jobs to fail intermittently. Next update will be in 30 minutes unless there is anything to report sooner. Tracking in gitlab.com/gitlab-com/gl-infra/production/-/issues/17636

Intermittent job failures with GitLab.com SaaS runners

February 16, 2024 03:09 UTC

Incident Status

Partial Service Disruption


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine




February 16, 2024 03:09 UTC
[Resolved] The issue has been resolved and affected services are now operating normally. Further details are available in gitlab.com/gitlab-com/gl-infra/production/-/issues/17603

February 16, 2024 02:47 UTC
[Monitoring] The cause of the problem has been identified and subsequent action has been taken to resolve it. We're seeing a return to normal operation and are continuing to monitor before declaring the incident as resolved. Tracking in gitlab.com/gitlab-com/gl-infra/production/-/issues/17603

February 16, 2024 02:37 UTC
[Investigating] We're currently investigating a known SSL certificate problem with SaaS runners which may cause CI jobs to fail intermittently. Tracking in gitlab.com/gitlab-com/gl-infra/production/-/issues/17603

Multipart uploads failing on /-/user_settings/profile

February 14, 2024 07:08 UTC

Incident Status

Degraded Performance


Components

Website


Locations

Google Compute Engine




February 14, 2024 07:08 UTC
[Resolved] The increased error rate has been identified as malicious traffic which has since subsided and operations are back to normal. This incident is considered resolved. Further details can be found in the incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17587

February 14, 2024 06:33 UTC
[Investigating] The severity of this incident has been decreased because fewer than 5% of users are impacted.

February 14, 2024 06:18 UTC
[Investigating] Our team is still investigating the cause of the increased error rate. We'll continue to provide updates as more information is available.

February 14, 2024 06:00 UTC
[Investigating] We're seeing elevated error rates on GitLab.com related to updating profile pictures. We are currently investigating the cause. More information in gitlab.com/gitlab-com/gl-infra/production/-/issues/17587

Temporarily archiving the gitlab-org/gitlab repository

February 3, 2024 12:00 UTC

Description

We will be performing a maintenance task that requires us to temporarily put the gitlab-org/gitlab repository in archived mode.


Components

Canary


Locations

Google Compute Engine


Schedule

February 3, 2024 11:00 - February 3, 2024 12:00 UTC



February 3, 2024 12:00 UTC
[Update] Scheduled maintenance is complete.

February 3, 2024 11:00 UTC
[Update] Scheduled maintenance is starting.

January 2024

Sidekiq degradation

January 30, 2024 23:02 UTC

Incident Status

Degraded Performance


Components

Website, CI/CD - Hosted runners on Linux, CI/CD - Hosted runners on Windows, CI/CD - Hosted runners on macOS, CI/CD - Hosted runners for GitLab community contributions, Background Processing


Locations

Google Compute Engine




January 30, 2024 23:02 UTC
[Resolved] The feature flag responsible for the issue has been disabled by default and operations are back to normal. This incident is considered resolved. Further details can be found in the incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17504

January 30, 2024 22:02 UTC
[Identified] We've identified the issue and have narrowed the problem to a recent feature flag that created slow queries caused by specific import workers. These slow queries lead to database saturation. To mitigate the issue, the team has disabled the feature flag. We are now re-processing jobs from these workers. This issue is considered mitigated. Further updates will be minimal until there is a permanent code fix in place. More details in gitlab.com/gitlab-com/gl-infra/production/-/issues/17504

January 30, 2024 20:22 UTC
[Investigating] No material update at this time. Our team is still investigating the root cause.

January 30, 2024 19:54 UTC
[Investigating] We've temporarily disabled some features of the GitHub import functionality. Users might experience issues with GitHub imports at this time while our team continues to investigate.

January 30, 2024 19:28 UTC
[Investigating] Sidekiq has largely recovered and jobs are processing. Our team is still investigating the root cause.

January 30, 2024 19:08 UTC
[Investigating] Our team is still investigating the root cause of this issue. We'll continue to provide updates as more information is available.

January 30, 2024 18:51 UTC
[Investigating] We're currently seeing some Sidekiq degradation that appears to be impacting some pipelines and merge requests. Our team is investigating. More information in gitlab.com/gitlab-com/gl-infra/production/-/issues/17504

Email delivery delays

January 25, 2024 17:45 UTC

Incident Status

Degraded Performance


Components

Background Processing


Locations

Google Compute Engine




January 25, 2024 17:45 UTC
[Resolved] No further email delays have been reported by our users while our new IPs continue to warm up for the next couple of days. We consider this incident resolved until new reports are received. See gitlab.com/gitlab-com/gl-infra/production/-/issues/17404 for details.

January 19, 2024 23:34 UTC
[Monitoring] Earlier this week, we added additional IPs to our mail server. As these IPs go through the warm-up phase, we anticipate ongoing improvements. Expect additional updates on Monday, unless notable changes necessitate earlier communication. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 19, 2024 04:28 UTC
[Monitoring] We continue to monitor email delivery for improvements after mitigation steps have been implemented. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 19, 2024 04:24 UTC
[Monitoring] We continue to monitor email delivery for improvements after mitigation steps have been implemented. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 18, 2024 18:15 UTC
[Monitoring] Although there are signs of improvement in email delivery delays, reports of delayed emails from certain users persist. We aim to identify additional options for further mitigation and continue to monitor the situation closely. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 18, 2024 00:30 UTC
[Monitoring] Indications are that the email delivery delay has improved; however, we are still receiving reports of late-arriving email from some users. We are continuing our analysis to determine options to further mitigate the delayed delivery.

January 17, 2024 00:31 UTC
[Monitoring] Further monitoring indicates an improvement since the changes were implemented. We will continue to monitor and provide an update once we have concluded a measurable success. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 23:31 UTC
[Identified] We have initiated a change aimed at resolving the email delays. The full implementation of this change may take 24-48 hours to show results. We will continue to monitor the situation and provide the next update once we confirm the effectiveness of the changes. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 22:02 UTC
[Identified] We're working on implementing mitigation measures to address the email delays. Our team is monitoring the situation, and we will provide an update as soon as additional information becomes available. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 19:55 UTC
[Identified] Our team continues to work on resolving this issue as swiftly as possible. We will post the next update when we have significant developments to share. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 18:44 UTC
[Identified] We are actively working to address the email delivery delays. We appreciate your patience as we continue to work towards a solution. Next update will be in an hour. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 17:27 UTC
[Identified] We continue to address email delivery delays. While there is no new information to share at this time, we are actively working on a resolution. Another update will be provided in one hour. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 16:19 UTC
[Identified] We have identified an issue causing delays in email delivery and are actively working to resolve the issue. We anticipate providing an update within the next hour. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 15:31 UTC
[Investigating] No material updates to report. We are still working with the vendor to identify the cause. Next update will be posted when we know more. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 15:05 UTC
[Investigating] We are working with the email vendor to identify the cause. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 14:40 UTC
[Investigating] We are still actively investigating reports of delayed emails. More information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404

January 16, 2024 14:10 UTC
[Investigating] We have identified a problem where emails are being delayed intermittently. More information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17404.

System notes authored by a deleted user/bot prevent any Work Item discussions from being loaded

January 22, 2024 19:05 UTC

Incident Status

Partial Service Disruption


Components

Website, API


Locations

Google Compute Engine




January 22, 2024 19:05 UTC
[Resolved] We have now confirmed that the issue has been resolved by the implemented fix. Please review gitlab.com/gitlab-com/gl-infra/production/-/issues/17436 for details.

January 22, 2024 17:49 UTC
[Monitoring] The fix has now been confirmed to be in production. We will monitor the status of this incident to confirm it is solved. For more details continue to follow gitlab.com/gitlab-com/gl-infra/production/-/issues/17436.

January 22, 2024 17:29 UTC
[Identified] Making a slight correction to our previous update: the fix has been deployed to our pre-production environment and is pending deployment to production. Follow gitlab.com/gitlab-com/gl-infra/production/-/issues/17436 for more details.

January 22, 2024 16:58 UTC
[Monitoring] The deployment of the fix to production is complete. We will continue to monitor the situation. For details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 09:52 UTC
[Identified] We are expecting the deployment of the fix to be in production in approx. 4.5 h. We will post our next update when the deployment is complete. For details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 09:15 UTC
[Identified] The MR has been merged, and we're waiting for the fix to be deployed. For details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 08:35 UTC
[Identified] We're waiting for the fix to be deployed. For details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 07:59 UTC
[Identified] We continue working on resolving the problem. A tentative fix is in progress. For details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 07:12 UTC
[Identified] We have identified an issue preventing discussion loading and are actively working to resolve it. We will provide an update within the next hour. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17436

January 22, 2024 06:41 UTC
[Investigating] We became aware of an issue where system notes authored by deleted users or bots prevent users from loading the discussions on the page. We are now investigating the root cause and will provide updates along the way. See gitlab.com/gitlab-com/gl-infra/production/-/issues/17436 for details.

DIND not working in Shared Runners after update

January 18, 2024 15:03 UTC

Incident Status

Partial Service Disruption


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine




January 18, 2024 15:03 UTC
[Resolved] The rollback was successful in restoring service. Runner service has been restored. More information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17422

January 18, 2024 14:53 UTC
[Monitoring] The shared runner update has been successfully rolled back. Positive reports have been received from some users. We are still monitoring to confirm if errors have subsided. More information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17422

January 18, 2024 14:39 UTC
[Identified] Shared runner rollback is still in progress. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17422

January 18, 2024 14:15 UTC
[Identified] We’re aware of an issue with Shared Runners and a rollback is in progress. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17422

December 2023

Add-ons cannot be purchased via GitLab.com

December 23, 2023 00:09 UTC

Incident Status

Partial Service Disruption


Components

Website


Locations

Google Compute Engine




December 23, 2023 00:09 UTC
[Resolved] The fix has been deployed to production and the incident is resolved. Add-ons can once again be purchased directly via GitLab.com.

December 22, 2023 23:43 UTC
[Monitoring] The fix for this problem is being deployed. More information is in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17320. Next update: within two hours.

December 22, 2023 20:54 UTC
[Identified] We are deploying a fix we expect to resolve this problem. More information and a workaround are in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17320. Next update: within three hours.

December 22, 2023 18:53 UTC
[Identified] We identified a problem where add-ons (Compute, Storage) cannot be purchased via GitLab.com. More information and a workaround are in: gitlab.com/gitlab-com/gl-infra/production/-/issues/17320. Next update: within two hours.

SaaS runners erroring out when queuing jobs

December 15, 2023 21:00 UTC

Incident Status

Degraded Performance


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine




December 15, 2023 21:00 UTC
[Resolved] Docker has been updated and functionality has been restored. Customers are now able to use the latest and 24.0.7 tags for Docker. Marking the issue as resolved. For more information, see gitlab.com/gitlab-com/gl-infra/production/-/issues/17283

December 15, 2023 08:45 UTC
[Identified] We have identified the cause of the issue introduced with the latest Docker image (24.0.7 tag) and are looking into ways to address it. We will post the next update when we have identified a solution to this change. More information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17283

December 15, 2023 08:07 UTC
[Identified] We are working on compatibility issues with the latest Docker image (24.0.7 tag) for the shared runner component. Affected users, kindly use the workaround mentioned in the previous update. For more info: gitlab.com/gitlab-com/gl-infra/production/-/issues/17283.

December 15, 2023 07:26 UTC
[Identified] We've identified the cause of the issue as the latest/24.0.7 tag of the Docker image used. The workaround is to pin a tag of 24.0.6 or earlier. For more info: gitlab.com/gitlab-com/gl-infra/production/-/issues/17283.

December 15, 2023 07:03 UTC
[Investigating] We are primarily seeing SaaS Linux small runners affected. Our investigation is ongoing. More information on gitlab.com/gitlab-com/gl-infra/production/-/issues/17283.

December 15, 2023 06:43 UTC
[Investigating] We are investigating an issue with elevated errors on our SaaS runners when queuing jobs. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17283

Self hosted Pipelines Failing with Cert Errors

December 14, 2023 10:22 UTC

Incident Status

Operational


Components

CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine




December 14, 2023 10:22 UTC
[Resolved] The issue is resolved. For self-hosted runners, the previous workaround may need to be re-applied as we have updated the GitLab.com certificate. Please see possible workarounds in gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 14, 2023 09:28 UTC
[Monitoring] We are updating the certificate on GitLab.com. If you are experiencing self-hosted runner connection errors, please see possible workarounds in gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 14, 2023 08:24 UTC
[Monitoring] We are investigating connection issues for self-hosted runners and users of the AWS OIDC provider caused by an SSL certificate change. If you are experiencing self-hosted runner connection errors, please see possible workarounds in gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 14, 2023 05:47 UTC
[Monitoring] We continue to monitor the situation, and are looking into a solution to reduce the need for either of the two workarounds mentioned in our previous updates. For more information: gitlab.com/gitlab-com/gl-infra/production/-/issues/17265.

December 14, 2023 00:52 UTC
[Monitoring] GitLab.com shared runners may be affected by the switch to LetsEncrypt certs if they use container images that do not recognise LetsEncrypt's CA certificates. If you are facing this issue, please update the container image. More details: gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 13, 2023 22:43 UTC
[Monitoring] We have a workaround in place and are monitoring the incident. The current workaround is to reboot the runners to restore functionality. We will keep this alert in monitoring for now. This is affecting SaaS self-hosted runners. Shared runners are not affected.

December 13, 2023 22:03 UTC
[Monitoring] We are currently monitoring the situation. Self-hosted runners on SaaS are experiencing failed pipelines due to certificate errors. SaaS runners are not affected. The current workaround is to reboot the runner to refresh SSL and restore service. gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 13, 2023 21:44 UTC
[Identified] Further investigation reveals this issue to be affecting specifically self-hosted runners on SaaS. SaaS runners are not affected. The current workaround is to reboot the runner to refresh the SSL.

December 13, 2023 21:26 UTC
[Investigating] We are still investigating an issue with SSL certificates causing pipelines to fail on self-managed. As a workaround, rebooting the runners may refresh the cached SSL certificates. More details: gitlab.com/gitlab-com/gl-infra/production/-/issues/17265

December 13, 2023 20:52 UTC
[Investigating] We have made a change to our Cloudflare certificates and self-hosted runners may have trouble running pipelines on self-managed. Please stand by while we are investigating this issue.

November 2023

Sidekiq processing is delayed

November 23, 2023 19:38 UTC

Incident Status

Degraded Performance


Components

Background Processing


Locations

Google Compute Engine




November 23, 2023 19:38 UTC
[Resolved] The backlog of jobs has been completed. Security Scan Result Policies are working as expected. This incident is now considered resolved.

November 23, 2023 15:52 UTC
[Monitoring] Our team has merged a code change for the issue into production and we've re-enabled the affected worker. We are working through a backlog of jobs and will provide an update once it is completed.

November 22, 2023 23:25 UTC
[Monitoring] No material updates at this time. We'll provide additional updates once more information is available.

November 22, 2023 22:17 UTC
[Monitoring] While the issue has been mitigated, our team has disabled a worker that impacts the availability of the Security Scan Result Policies. This can impact the ability to enable/disable and update policies. Our team is working on a code change to fix the root issue.

November 22, 2023 21:18 UTC
[Resolved] The mitigation has been successful and the issue is considered resolved. Our team will continue to work on additional action items. More details in gitlab.com/gitlab-com/gl-infra/production/-/issues/17168

November 22, 2023 21:05 UTC
[Monitoring] Our team has disabled the offending worker and performance is improving. We are continuing to monitor the state of the background processing.

November 22, 2023 20:49 UTC
[Identified] Our team has identified the root cause as related to a specific worker. We are continuing to work on mitigation efforts.

November 22, 2023 20:19 UTC
[Investigating] Our team is still continuing the investigation. Additional details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/17168

November 22, 2023 19:45 UTC
[Investigating] Our team is continuing to investigate the root cause. We believe we have identified a query that is driving high CPU usage and saturating Sidekiq.

November 22, 2023 19:20 UTC
[Investigating] Our team is still investigating the delays with Sidekiq processing. This may affect jobs getting picked up, or the ability to create or update merge requests. More details in gitlab.com/gitlab-com/gl-infra/production/-/issues/17168

November 22, 2023 19:06 UTC
[Investigating] Our team is currently investigating Sidekiq performance issues that are causing delays in job processing.

Performance issues affecting processing of MRs

November 20, 2023 18:49 UTC

Incident Status

Degraded Performance


Components

CI/CD - Hosted runners on Linux, CI/CD - Hosted runners on Windows, CI/CD - Hosted runners on macOS, CI/CD - Hosted runners for GitLab community contributions, Background Processing


Locations

Google Compute Engine, AWS




November 20, 2023 18:49 UTC
[Resolved] Performance of CI pipelines and merge requests has fully returned to normal and shows no further signs of degradation. The status page is being marked resolved, and future updates can be found in the production issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17158

November 20, 2023 18:05 UTC
[Monitoring] Performance of CI job pickup and merge requests has returned to a normal state. The incident response team will continue monitoring and reviewing the rolled back changes. Next update in an hour, unless there is anything to report prior.

November 20, 2023 17:21 UTC
[Investigating] We have started to notice some signs of recovery after the rollback. Investigation into root causes is still ongoing to identify the source of the CI and merge request performance problems.

November 20, 2023 16:50 UTC
[Investigating] Rollback to an earlier deployment has completed. No material change in the CI job pickup slowness has been observed yet. Next update will be in 30 minutes unless there is anything to report sooner.

November 20, 2023 16:32 UTC
[Investigating] The rollback to an earlier deployment is in progress, and all changes are being reviewed to identify the root cause of slow CI job pickup. Additional support from other teams is being brought in to assist.

November 20, 2023 16:12 UTC
[Investigating] The incident response team is continuing to investigate the slow CI job pickup. A rollback to a previous deployment is in progress as a potential mitigating solution.

November 20, 2023 15:53 UTC
[Investigating] We are actively investigating an issue where CI jobs are being picked up slowly, affecting the performance of MRs. More details here: gitlab.com/gitlab-com/gl-infra/production/-/issues/17158

Configured Services in GitLab CI Not Receiving Variables from .gitlab-ci.yml

November 16, 2023 07:49 UTC

Incident Status

Partial Service Disruption


Components

CI/CD - Hosted runners on Linux, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine




November 16, 2023 07:49 UTC
[Resolved] The rollback has been successfully completed, and variables are now being correctly passed to the services specified in the CI as expected. This incident has been resolved. For more details, please visit: gitlab.com/gitlab-com/gl-infra/production/-/issues/17143

November 16, 2023 07:03 UTC
[Identified] Our team is currently working on a rollback of the recent changes. For more details, please visit: gitlab.com/gitlab-com/gl-infra/production/-/issues/17143

November 16, 2023 06:39 UTC
[Identified] The revert merge request (gitlab.com/gitlab-org/gitlab/-/merge_requests/137084) has been successfully merged. Our team is now working on deploying these changes to production. For more details, please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/17143

November 16, 2023 06:15 UTC
[Identified] Our team has identified the probable root cause of the issue and is currently working on a revert merge request. In the meantime, a workaround is to specify the variables in the global section. More details here: gitlab.com/gitlab-com/gl-infra/production/-/issues/17143

November 16, 2023 05:57 UTC
[Investigating] We are actively investigating an issue where services specified in .gitlab-ci.yml are not picking up environment variables as expected. More details here: gitlab.com/gitlab-com/gl-infra/production/-/issues/17143

OpenID issues

November 7, 2023 21:23 UTC

Incident Status

Operational


Components

Website


Locations

Google Compute Engine




November 7, 2023 21:23 UTC
[Resolved] After a period of monitoring, we have not received any further reports of OpenID authentication errors. As a result, we are marking the incident as resolved. For a detailed timeline of the incident, please refer to: gitlab.com/gitlab-com/gl-infra/production/-/issues/17073

November 3, 2023 14:05 UTC
[Monitoring] We identified a caching issue with our gitlab.com/.well-known/openid-configuration endpoint that may lead to authentication failures with OpenID. It has been resolved by purging the cache and we continue monitoring. More details: gitlab.com/gitlab-com/gl-infra/production/-/issues/17073

Merge requests incorrectly blocked by denied license

November 6, 2023 16:12 UTC

Incident Status

Partial Service Disruption


Components

Website, API


Locations

Google Compute Engine




November 6, 2023 16:12 UTC
[Resolved] The issue has been successfully resolved. As a result, the blockage of merge requests due to license denials is no longer expected to occur. For a comprehensive timeline of the incident, please refer to: gitlab.com/gitlab-com/gl-infra/production/-/issues/17082

November 6, 2023 16:10 UTC
[Resolved] The MR gitlab.com/gitlab-org/gitlab/-/merge_requests/135986 has been deployed to production, resolving this incident.

November 3, 2023 21:53 UTC
[Identified] The planned mitigation has been verified and committed and will be deployed on Monday. In the meantime a workaround is to create a scan result policy approval to unblock your merge request. gitlab.com/gitlab-com/gl-infra/production/-/issues/17082#note_1634269721

November 3, 2023 21:16 UTC
[Identified] The incident response team has a potential mitigation path planned, and is working to deploy a fix as soon as it's verified. Details may be found in this issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17082

November 3, 2023 20:43 UTC
[Identified] We are investigating reports of merge requests incorrectly being blocked due to denied licenses

October 2023

GitLab.com availability issues

October 30, 2023 18:33 UTC

Incident Status

Service Disruption


Components

Website, API, Git Operations


Locations

Google Compute Engine




October 30, 2023 18:33 UTC
[Resolved] Import functionality has been restored and operation of GitLab.com has fully returned to normal. We're marking the incident resolved; please see the issue for further details: gitlab.com/gitlab-com/gl-infra/production/-/issues/17054

October 30, 2023 17:58 UTC
[Monitoring] The incident response team has mitigated the root cause of the performance impacts and is now working to restore full import functionality. Monitoring is ongoing to ensure there are no further impacts. Issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17054

October 30, 2023 17:28 UTC
[Identified] Performance continues to improve and the team is continuing to work on mitigations to prevent future impact.

October 30, 2023 17:09 UTC
[Identified] We continue to work towards full mitigation. Performance of GitLab.com is returning, though project import remains partially disabled while the team investigates.

October 30, 2023 16:54 UTC
[Identified] A likely cause of the database load has been identified. Our incident response team has temporarily disabled part of the project import feature while we continue to investigate.

October 30, 2023 16:32 UTC
[Investigating] We are seeing signs of improvement, and the team has narrowed the potential causes. Mitigation and additional investigation are in progress.

October 30, 2023 16:23 UTC
[Investigating] We've confirmed potential impacts to Git operations and API as well. Investigation still ongoing.

October 30, 2023 16:10 UTC
[Investigating] The incident response team is continuing to investigate causes of the increased database usage; additional teams are being brought in to assist.

October 30, 2023 15:55 UTC
[Investigating] We are investigating increased database usage as a primary cause. We will update as more information becomes available.

October 30, 2023 15:39 UTC
[Investigating] We are currently investigating availability issues on GitLab.com. We will update as more information becomes available.

Delays in background and CI processing

October 25, 2023 18:08 UTC

Incident Status

Degraded Performance


Components

Website, Background Processing


Locations

Google Compute Engine




October 25, 2023 18:08 UTC
[Resolved] GitLab identified problematic usage patterns that lead to DB saturation. We took action to mitigate the problems caused by these usage patterns and will be investigating how to mitigate the impact of similar usage patterns in the future. We will post further updates to the incident issue. Incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17030

October 25, 2023 17:30 UTC
[Monitoring] The rollback is done and we are monitoring. Background processing should be back to normal levels. We see that CI jobs are being picked up and workers are working through the CI queues; users may see delays in jobs being picked up but they are being processed. Incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17030

October 25, 2023 16:25 UTC
[Investigating] We are continuing to investigate the root cause for CI and Merge Request performance issues; the rollback has completed and we have identified a possible proximate cause for DB saturation. We have made changes to mitigate DB saturation and are monitoring to understand if our changes have been effective. Incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/17030




