Status History



December 2024

Fargate runner error - file name too long

December 4, 2024 16:50 UTC

Incident Status

Partial Service Disruption


Components

Website, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine




December 4, 2024 16:50 UTC
[Resolved] As no new user reports have been received during our monitoring period we consider this incident resolved. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18939 for the full incident history.

December 4, 2024 15:46 UTC
[Monitoring] We have disabled the feature flag and are now monitoring the issue for 1 hour before marking as resolved. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18939.

December 4, 2024 15:42 UTC
[Identified] We've identified the cause of the issue and are working on resolving it. More details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18939.

December 4, 2024 15:16 UTC
[Investigating] We are currently investigating issues with GitLab runners using the Fargate driver returning "file name too long" errors. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18939.

Project mirror disabled due to excessive notifications

December 3, 2024 20:26 UTC

Incident Status

Degraded Performance


Components

Website, Background Processing


Locations

Google Compute Engine




December 3, 2024 20:26 UTC
[Resolved] After seeing no further issues arise during our monitoring period, we are considering this incident resolved. Please review gitlab.com/gitlab-com/gl-infra/production/-/issues/18929 for more details.

December 3, 2024 18:49 UTC
[Monitoring] Project mirroring has been re-enabled on GitLab.com and we are monitoring to make sure no further issues arise. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18929 for further details.

December 3, 2024 15:51 UTC
[Identified] We've turned off the functionality relating to schedules for project mirroring. Project mirroring will be reenabled once we resolve this issue. Mirrored projects will not be updated during this time.

December 3, 2024 14:47 UTC
[Identified] We've turned off the functionality that sends out email updates temporarily for project mirrors. We are continuing to investigate this incident.

December 3, 2024 14:04 UTC
[Investigating] We are currently investigating emails being sent for older project mirrors and imports. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18929

November 2024

Customers Portal is down (customers.gitlab.com)

November 28, 2024 10:39 UTC

Incident Status

Service Disruption


Components

GitLab Customers Portal


Locations

Google Compute Engine




November 28, 2024 10:39 UTC
[Resolved] Functionality has been restored to the Customers Portal (customers.gitlab.com). The billing pages in GitLab.com are also available.

November 28, 2024 09:39 UTC
[Investigating] The Customers Portal is currently in maintenance mode and unavailable due to a 3rd-party API outage. The billing pages in GitLab.com may be affected.

Duplicate Merge Request Events

November 26, 2024 19:52 UTC

Incident Status

Degraded Performance


Components

Website


Locations

Google Compute Engine




November 26, 2024 19:52 UTC
[Resolved] We have confirmed that duplicate merge request events have stopped being created. Please see the production issue for details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18904

November 26, 2024 19:04 UTC
[Monitoring] A fix has been rolled out and merge request events should no longer be duplicated. See the production issue for more details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18904

November 26, 2024 18:43 UTC
[Identified] Users may see duplicated events on merge requests, or failed merge attempts and merge request updates. Please see gitlab.com/gitlab-com/gl-infra/production/-/issues/18904 for more information.

Error when installing GitLab for Jira Cloud app

November 19, 2024 22:18 UTC

Incident Status

Partial Service Disruption


Components

Website


Locations

Google Compute Engine




November 19, 2024 22:18 UTC
[Resolved] We have verified that GitLab for Jira Cloud application issues have been fixed. We are resolving this incident; any further updates will be posted to gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

November 19, 2024 22:02 UTC
[Monitoring] The fix has been deployed to production and has been tested; we believe that issues with the GitLab for Jira Cloud application are resolved; we are monitoring to confirm. See more details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

November 19, 2024 14:19 UTC
[Identified] The fix has been merged and is currently being deployed to production. We will confirm when deployment has completed or provide another update sooner if relevant. See more details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

November 19, 2024 12:51 UTC
[Identified] The merge request to fix the issue is still in progress, but expected to complete shortly. We will confirm when it has been merged or provide another update sooner if relevant. See more details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

November 19, 2024 12:08 UTC
[Identified] We've identified the root cause and are currently working on a merge request to fix the issue. See more details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

November 19, 2024 11:27 UTC
[Investigating] We're aware of an issue with installing the GitLab for Jira Cloud app from Atlassian Marketplace and are investigating. See more details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18872

500 errors on user preferences pages

November 18, 2024 15:45 UTC

Incident Status

Degraded Performance


Components

Website


Locations

Google Compute Engine




November 18, 2024 15:45 UTC
[Resolved] This incident has been resolved as we are no longer seeing any issues on the User Preferences page after a period of monitoring. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 15:18 UTC
[Monitoring] The fix has been deployed to production and the User Preferences page should be accessible again. We're now monitoring to ensure the issue has been resolved. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 14:07 UTC
[Identified] Deployment to production is currently ongoing, but is expected to complete shortly. We will provide another update once deployment is completed. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 13:16 UTC
[Identified] We have deployed on canary and the revert fixed the issue with accessing the User Preferences page. We will confirm once deployment to production is complete within the next ~40 minutes. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 11:58 UTC
[Identified] The deployment process to resolve the issue is still ongoing. We will confirm once the deployment is complete or provide another progress update within ~1 hour. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 09:24 UTC
[Identified] A fix is currently being deployed and is expected to complete in the next ~3 hours. Another update will be provided at 12:00 UTC or once the deployment has completed. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 07:57 UTC
[Identified] The fix is awaiting deployment, which is expected to occur before 12:00 UTC. Once the fix has deployed the User Preferences page will be monitored to ensure the issue has been resolved. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 07:34 UTC
[Identified] A fix has been merged and will be deployed shortly. Once it has been deployed, we will monitor the number of 500s occurring on the settings page to ensure the issue is resolved. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 07:18 UTC
[Identified] A fix has been implemented and is being tested before being merged. For more information please see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

November 18, 2024 07:02 UTC
[Identified] We are seeing a number of 500 errors occurring when users attempt to navigate to the user profile settings in the UI. We have identified the issue and are working on a fix. The issue is being tracked in the incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/18867

Elevated errors in Duo Chat

November 15, 2024 09:20 UTC

Incident Status

Degraded Performance


Components

GitLab Duo


Locations

Google Compute Engine




November 15, 2024 09:20 UTC
[Resolved] The revert MR has now been deployed to production and Duo Chat is working as expected. Details to be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18858

November 15, 2024 01:05 UTC
[Identified] The revert MR has been merged and is awaiting deployment to production. A further update will be provided once deployment has completed. Details to be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18858

November 15, 2024 00:39 UTC
[Identified] The root cause has been identified and a reversion MR is being prepared. Further details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18858

November 15, 2024 00:26 UTC
[Investigating] We are currently investigating errors from Duo Chat (A9999). Further details can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18858

Update the CA of certificates for a number of hostnames

November 7, 2024 01:55 UTC

Description

To maximize device compatibility going forward, GitLab will update the CA of certificates used by a number of hostnames to Google Trust Services. For full details please refer to the CR issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/18625


Components

packages.gitlab.com


Locations

AWS


Schedule

November 6, 2024 00:00 - November 7, 2024 00:00 UTC



November 7, 2024 01:55 UTC
[Update] GitLab.com planned maintenance for updating certificate CAs has completed. There were no reported problems during the monitoring period in the maintenance window. Thank you for your patience.

November 6, 2024 23:44 UTC
[Update] Maintenance started as of November 6, 2024 00:00 and is expected to end November 7, 2024 00:00 UTC

November 6, 2024 02:48 UTC
[Update] Maintenance is underway to update the CA of certificates used by a number of hostnames to Google Trust Services. No customer impact is expected from this change. We'll continue to monitor until the end of the maintenance window. For full details please refer to the CR issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/18625

November 4, 2024 00:39 UTC
[Update] Due to the ongoing production change lock, this maintenance has been rescheduled to run from 2024-11-06 00:00 to 2024-11-07 00:00 UTC. We apologize for the inconvenience; this update has been reflected on the infrastructure issue as well: gitlab.com/gitlab-com/gl-infra/production/-/issues/18655

October 30, 2024 14:01 UTC
[Update] Due to circumstances beyond our control, the maintenance has been rescheduled from the original timeframe of 2024-10-31 - 2024-11-01 to 2024-11-04 - 2024-11-05. We apologize for the inconvenience; this update has been reflected on the infrastructure issue as well: gitlab.com/gitlab-com/gl-infra/production/-/issues/18655

Database query timeout on groups API

November 6, 2024 12:59 UTC

Incident Status

Partial Service Disruption


Components

API


Locations

Google Compute Engine




November 6, 2024 12:59 UTC
[Resolved] After a period of monitoring, we have confirmed that the issue has been resolved. See the GitLab incident issue for more details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18820

November 6, 2024 12:50 UTC
[Monitoring] The fix has now successfully been deployed to Production and we will continue to monitor this situation. See the GitLab incident issue for more details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18820

November 6, 2024 11:15 UTC
[Identified] Deployment of this fix is continuing, and more updates will follow shortly. Please see gitlab.com/gitlab-com/gl-infra/production/-/issues/18820 for more information.

November 6, 2024 09:28 UTC
[Identified] A fix has been identified, and will be deployed to production shortly. More updates will follow, but please see gitlab.com/gitlab-com/gl-infra/production/-/issues/18820 for more information.

November 6, 2024 08:36 UTC
[Investigating] Investigation of this issue is continuing, with no material updates to report. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18820 for more information.

November 6, 2024 08:12 UTC
[Investigating] No material updates to report. We are continuing to investigate potential causes of this issue. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18820 for more information.

November 6, 2024 07:44 UTC
[Investigating] No material updates to report. We are continuing to investigate potential causes of this issue. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18820 for more information.

November 6, 2024 07:20 UTC
[Investigating] We previously resolved this issue, but with the recent increase in traffic, we're noticing a spike in error rates for the Groups API. We're re-opening this incident for further investigation. Details in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18820

November 6, 2024 03:20 UTC
[Resolved] After a period of monitoring, we have confirmed that the issue has been resolved after the rollout of the feature flag. See the GitLab incident issue for more details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18811

November 6, 2024 01:45 UTC
[Monitoring] A fix has successfully been deployed to Production. The corresponding Feature Flag is being rolled-out gradually. We will continue to monitor. See the GitLab incident issue for more details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18811

November 5, 2024 23:47 UTC
[Identified] A fix is currently being rolled out to production. We are still waiting for the pipeline to complete. Once completed, we will continue to monitor.

November 5, 2024 22:46 UTC
[Identified] A fix is currently being rolled out to production. Once complete, the issue should be mitigated, and we will continue to monitor.

November 5, 2024 21:37 UTC
[Identified] A fix is currently being rolled out to production. Once complete, the issue should be mitigated, and we will continue to monitor.

November 5, 2024 19:56 UTC
[Identified] We have identified the issue and are in the process of merging a fix. We will update you once that fix has merged.

November 5, 2024 18:39 UTC
[Identified] We have identified the issue and are in the process of merging a fix. Please stand by for updates.

November 5, 2024 17:35 UTC
[Identified] We have identified the issue and are still working with our team to resolve it. The next update is in an hour. Please stand by.

November 5, 2024 16:54 UTC
[Identified] We have identified the issue and are still working with our team to resolve it. Please stand by for updates.

November 5, 2024 16:01 UTC
[Identified] The root cause has been identified, and our engineers are working on a fix. Please stand by for further updates.

November 5, 2024 15:50 UTC
[Investigating] Investigation is continuing; we will provide another update in 15 minutes.

November 5, 2024 15:35 UTC
[Investigating] Our engineers continue to investigate the root cause. Please stand by for further updates.

November 5, 2024 15:20 UTC
[Investigating] Our engineers are continuing their investigation. Please stand by for further updates.

November 5, 2024 15:04 UTC
[Investigating] Our engineers continue to investigate the root cause. Please stand by for further updates.

November 5, 2024 14:49 UTC
[Investigating] Investigation is continuing; we will provide another update in 15 minutes.

November 5, 2024 14:33 UTC
[Investigating] We have identified an issue with the Groups API when the min_access_level parameter is included in the API call. Investigation is proceeding, and we will post an update in 15 minutes.
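
For readers who want to see what kind of call was affected, the following is a minimal sketch of a Groups API request URL that includes min_access_level. The base URL, the per_page default, and the helper name groups_url are assumptions made for illustration; the status updates do not state which exact requests timed out. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed base URL for the GitLab.com REST API (not stated in the updates).
GITLAB_API = "https://gitlab.com/api/v4"


def groups_url(min_access_level=None, per_page=20):
    """Build a Groups API list URL (hypothetical helper, for illustration).

    min_access_level is only added to the query string when given, which
    makes it easy to compare a call with and without the parameter.
    """
    params = {"per_page": per_page}
    if min_access_level is not None:
        params["min_access_level"] = min_access_level
    return f"{GITLAB_API}/groups?{urlencode(params)}"


# 30 corresponds to the Developer access level in GitLab.
print(groups_url(min_access_level=30))
```

Omitting the argument produces the same URL without the parameter, which may be useful when checking whether an error is specific to calls that filter by access level.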

MAIN DB cluster upgrade

November 5, 2024 11:02 UTC

Planned Maintenance

Description

Next week, we will be undergoing scheduled maintenance to our MAIN database layer. The maintenance will start at 2024-11-03 06:00 UTC and should finish at 2024-11-05 11:00 UTC (including performance regression observability period). GitLab.com will be available during the whole period as the maintenance should be seamless and transparent for the application. We apologize in advance for any inconvenience this may cause. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18747>


Components

Website, API, Git Operations, GitLab Pages, CI/CD - Hosted runners on Linux, CI/CD - Hosted runners on Windows, CI/CD - Hosted runners on macOS, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine, AWS


Schedule

November 3, 2024 06:00 - November 5, 2024 11:00 UTC



November 5, 2024 11:02 UTC
[Update] The GitLab.com MAIN database upgrade is now complete, with all systems functioning correctly. Thank you for your patience. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18747

November 3, 2024 09:01 UTC
[Update] The GitLab.com MAIN database upgrade has been performed. We'll continue to monitor for any performance issues until the end of the maintenance window. Thank you for your patience. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18747>

November 3, 2024 06:05 UTC
[Update] GitLab.com scheduled maintenance of our MAIN database layer has started. GitLab.com should be available during the whole period. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18747>

November 3, 2024 05:00 UTC
[Update] MAIN DB layer maintenance will start in 1 hour, at 2024-11-03 06:00 UTC. GitLab.com will remain available. Details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18747

November 2, 2024 05:00 UTC
[Update] Scheduled maintenance on MAIN DB layer: Start: 2024-11-03 06:00 UTC End: 2024-11-05 11:00 UTC (incl. monitoring) GitLab.com remains available. Maintenance should be seamless. We apologize for any inconvenience. Details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18747

October 30, 2024 06:00 UTC
[Update] In three days, we will be undergoing scheduled maintenance to our MAIN database layer. The maintenance will start at 2024-11-03 06:00 UTC and should finish at 2024-11-05 11:00 UTC (including performance and regression observability period). GitLab.com will be available during the whole period as the maintenance should be seamless and transparent for the application. We apologize in advance for any inconvenience this may cause. See: gitlab.com/gitlab-com/gl-infra/production/-/issues/18747

October 2024

Runner authentication verification API errors

October 31, 2024 22:50 UTC

Incident Status

Partial Service Disruption


Components

API


Locations

Google Compute Engine




October 31, 2024 22:50 UTC
[Resolved] After a period of monitoring we confirmed that the issue has been resolved.

October 31, 2024 22:39 UTC
[Monitoring] We've manually applied a fix that has stopped the error rate and mitigated the incident. We'll be monitoring for a time to ensure the issue doesn't recur. Details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18792.

October 31, 2024 18:51 UTC
[Identified] We're finalizing the next method we'll be using to mitigate the incident and will be attempting another fix shortly. Details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18792.

October 31, 2024 17:42 UTC
[Identified] Deployment of the fix has completed, but it is not having the impact we hoped for. We're currently weighing options for a second fix. Details in gitlab.com/gitlab-com/gl-infra/production/-/issues/18792.

October 31, 2024 14:50 UTC
[Identified] We have identified and pushed the fix, and it is currently in the process of being deployed. For more details, see the production issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/18792

October 31, 2024 09:19 UTC
[Identified] We have identified the cause of the errors, and are currently working on the fix. For more details, see the incident issue: gitlab.com/gitlab-com/gl-infra/production/-/issues/18792

October 31, 2024 08:42 UTC
[Investigating] We are seeing elevated error rates on the POST /runners/verify endpoint. We are currently investigating. See the infrastructure issue for details: gitlab.com/gitlab-com/gl-infra/production/-/issues/18792
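
As a rough illustration of the endpoint named above, the sketch below builds (but does not send) a POST request to /runners/verify. The base URL, the JSON request body, the placeholder token, and the helper name build_verify_request are assumptions for illustration only; they are not taken from the incident updates.

```python
import json
from urllib.request import Request


def build_verify_request(token, base="https://gitlab.com/api/v4"):
    """Build a POST /runners/verify request (hypothetical helper).

    The request is constructed but not sent, so this runs offline.
    Sending the token as a JSON body is an assumption of this sketch.
    """
    body = json.dumps({"token": token}).encode("utf-8")
    return Request(
        f"{base}/runners/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "example-token" is a placeholder, not a real runner token.
req = build_verify_request("example-token")
print(req.get_method(), req.full_url)
```

A runner that receives an error from this call cannot confirm its credentials, which is consistent with the impact described in this incident.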

CI DB cluster upgrade

October 30, 2024 01:37 UTC

Planned Maintenance

Description

Next week, we will be undergoing scheduled maintenance to our CI database layer. The maintenance will start at 2024-10-26 06:00 UTC and should finish at 2024-10-29 11:00 UTC (including performance regression observability period). GitLab.com will be available during the whole period as the maintenance should be seamless and transparent for the application. We apologize in advance for any inconvenience this may cause. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18639>


Components

Website, API, Git Operations, CI/CD - Hosted runners on Linux, CI/CD - Hosted runners on Windows, CI/CD - Hosted runners on macOS, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine, AWS


Schedule

October 26, 2024 06:00 - October 29, 2024 11:00 UTC



October 30, 2024 01:37 UTC
[Update] The GitLab.com upgrade of the CI database layer is now complete, with all systems functioning correctly. Thank you for your patience. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18639

October 27, 2024 08:47 UTC
[Update] GitLab.com upgrade of the CI database layer is complete. We'll be monitoring the platform to ensure all systems are functioning correctly. Thank you for your patience.

October 26, 2024 06:01 UTC
[Update] We are starting maintenance on the CI database layer, which should finish at 2024-10-29 11:00 UTC (including performance regression and observability period). GitLab.com will be available during the whole period as the maintenance should be seamless. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18639

October 25, 2024 19:03 UTC
[Update] Reminder: Tomorrow, we will be undergoing scheduled maintenance to our CI database layer. The maintenance will start at 2024-10-26 06:00 UTC and should finish at 2024-10-29 11:00 UTC (including performance regression observability period). GitLab.com will be available during the whole period as the maintenance should be seamless and transparent for the application. We apologize in advance for any inconvenience this may cause. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18639>

October 23, 2024 17:25 UTC
[Update] In 3 days, we will be undergoing scheduled maintenance to our CI database layer. The maintenance will start at 2024-10-26 06:00 UTC and should finish at 2024-10-29 11:00 UTC (including performance regression observability period). GitLab.com will be available during the whole period as the maintenance should be seamless and transparent for the application. We apologize in advance for any inconvenience this may cause. See <gitlab.com/gitlab-com/gl-infra/production/-/issues/18639>

Self-managed runners are failing after runners are upgraded to 17.5

October 18, 2024 21:18 UTC

Incident Status

Partial Service Disruption


Components

Website, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine




October 18, 2024 21:18 UTC
[Resolved] Runner version 17.5.1 has been released and includes a fix for the "failed to get user home dir: $HOME is not defined" error. No action besides upgrading Runner is required. For more details see: gitlab.com/gitlab-com/gl-infra/production/-/issues/18732

October 18, 2024 10:25 UTC
[Identified] We have identified the problem and are working on releasing a patch for version 17.5. We will send a new update once the patch is available. For more information please see gitlab.com/gitlab-com/gl-infra/production/-/issues/18732.

October 18, 2024 09:49 UTC
[Investigating] Self-managed runners are failing with the error "failed to get user home dir: $HOME is not defined" after upgrading the GitLab runner to version 17.5. The current workaround is to downgrade the affected runners back to version 17.4. More details are available in gitlab.com/gitlab-org/gitlab-runner/-/issues/38252.

Sidekiq queue lengths increasing causing delays in MR and pipeline processing

October 16, 2024 15:02 UTC

Incident Status

Partial Service Disruption


Components

Website, Background Processing


Locations

Google Compute Engine




October 16, 2024 15:02 UTC
[Resolved] Our monitoring shows no further issues in Sidekiq performance after the mitigation steps were taken. We are resolving this issue and any further updates will be in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18720

October 16, 2024 13:47 UTC
[Monitoring] We are continuing to see the queue recover, and are monitoring Sidekiq performance until recovery is complete. More details about this incident can be found in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18720

October 16, 2024 12:49 UTC
[Monitoring] We are continuing to see the queue recover, and we are monitoring performance until recovery is complete. More details about this incident can be found in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18720

October 16, 2024 12:14 UTC
[Identified] We've identified issues with delays in MR and pipeline processing as a result of Sidekiq queue lengths increasing. We're starting to see the queue recover. More details about this incident can be found in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18720

Errors on work item updates

October 10, 2024 17:44 UTC

Incident Status

Partial Service Disruption


Components

Website, API


Locations

Google Compute Engine




October 10, 2024 17:44 UTC
[Resolved] The MR to remove a database foreign key has been deployed to production. This incident is now resolved. For more details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18692.

October 10, 2024 06:21 UTC
[Identified] We are preparing an MR to remove a database foreign key that is causing this issue. The next update will be provided once the incident has been mitigated. For more details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18692

October 10, 2024 05:01 UTC
[Investigating] We are currently investigating an error with Work Item updates. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18692

Pipelines not completing

October 9, 2024 17:33 UTC

Incident Status

Partial Service Disruption


Components

Background Processing


Locations

Google Compute Engine




October 9, 2024 17:33 UTC
[Resolved] The MR reverting the offending commit that caused this issue has been deployed to production. This incident is now resolved. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for all the details.

October 9, 2024 10:16 UTC
[Monitoring] We have enabled a feature flag that will prevent new pipelines and merge requests from being impacted. Pipelines and MRs that were previously stuck will need to be recreated. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676

October 9, 2024 09:28 UTC
[Identified] We have identified a potential commit that could have caused this incident. We are working on reverting it. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 09:23 UTC
[Investigating] Some merge requests are still impacted by this incident. Merge requests are stuck with the message "Your merge request is almost ready!". The workaround is to recreate the impacted merge request. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 08:26 UTC
[Investigating] We are moving this incident back to Investigating, as after resolving the Redis saturation, we are still seeing Pipelines not progressing between stages. Merge Requests are functioning normally now. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 04:25 UTC
[Monitoring] We have resolved the Redis saturation issue and see that all pipelines are functioning normally now. We will continue to monitor while continuing investigation into other aspects. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 04:18 UTC
[Resolved] We have resolved the Redis saturation issue and see that all pipelines are functioning normally now. We will continue to monitor while continuing investigation into other aspects. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 03:33 UTC
[Monitoring] We have resolved the Redis saturation issue and see that all pipelines are functioning normally now. We will continue to monitor while continuing investigation into other aspects. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 9, 2024 02:33 UTC
[Investigating] No material updates to report. We continue investigations and will provide further updates in 1 hour. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for the latest.

October 9, 2024 01:26 UTC
[Investigating] No material updates to report. We continue to investigate potential causes of this issue. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for the latest updates.

October 9, 2024 01:02 UTC
[Investigating] Investigation continues. Remember, restarting the pipeline or individual jobs may serve as a workaround. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 9, 2024 00:02 UTC
[Investigating] Reminder: If you have urgent pipelines affected by the ongoing issue, you can cancel and retry them. A retry may make the pipeline run properly. We're still working on a permanent fix. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 23:57 UTC
[Investigating] Despite resolving the Redis saturation issue, we're still receiving reports of hanging pipelines on GitLab.com. We understand the frustration this causes and are urgently investigating. Updates at gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 20:24 UTC
[Monitoring] After alleviating the saturation reported in our Redis infrastructure we are seeing readings go back to healthy levels. We will monitor now for new occurrences and reports to confirm this is resolved. Please review gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for details.

October 8, 2024 19:23 UTC
[Investigating] We're taking corrective actions to alleviate some unexpected pressure in one of our Redis shards, which may alleviate the problem. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for more details.

October 8, 2024 18:29 UTC
[Investigating] We are currently investigating errors on a specific Redis shard that may be contributing to the problem. For the latest updates and detailed information, please check gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 17:55 UTC
[Investigating] Investigation continues. Remember, restarting the pipeline or individual jobs may serve as a workaround. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 17:25 UTC
[Investigating] Work is ongoing to identify the root of this issue, including an analysis of our Redis infrastructure. Please continue to follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 16:48 UTC
[Investigating] Work is still in progress to identify the cause of service degradation to pipelines. Update frequency will be increased to every 30 minutes or as soon as we have material updates to share. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for more information.

October 8, 2024 16:23 UTC
[Investigating] We continue to investigate potential causes of this issue. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18676 for the latest updates.

October 8, 2024 15:39 UTC
[Investigating] We are currently investigating an issue where pipelines are not progressing between stages. This causes them to remain in a running state indefinitely, and merge requests are not being marked as ready. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

Pipelines using SaaS Runners stuck on Pending status

October 9, 2024 08:38 UTC

Incident Status

Service Disruption


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine




October 9, 2024 08:38 UTC
[Resolved] Based on our monitoring, pipeline jobs are now being picked up by Runners. We are marking this incident as resolved. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18679.

October 8, 2024 23:31 UTC
[Monitoring] We've discovered this incident also impacted self-managed Runners when attempting to verify their credentials on GitLab.com. 404 errors were generated, preventing job execution. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18679.

October 8, 2024 22:39 UTC
[Monitoring] Next update for this incident will be once the revert MR is deployed to production. Please follow gitlab.com/gitlab-com/gl-infra/production/-/issues/18679 for more details.

October 8, 2024 22:38 UTC
[Monitoring] Pipelines processed by GitLab-hosted runners are back to normal operation after the rollback. A revert MR that addresses the root cause of the issue is in the process of being deployed to production. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18679.

October 8, 2024 22:02 UTC
[Monitoring] We have performed a deployment rollback and previously stuck pipelines are being picked up. We will start a monitoring window for 30 minutes to confirm incident resolution. See gitlab.com/gitlab-com/gl-infra/production/-/issues/18679.

October 8, 2024 21:44 UTC
[Identified] We are receiving reports of CI Pipelines failing to start and stuck on Pending status. This is only affecting pipelines using GitLab-hosted runners. A fix has already been identified and should be deployed to production soon. Please see gitlab.com/gitlab-com/gl-infra/production/-/issues/18679.

GitLab slow background processing

October 8, 2024 11:42 UTC

Incident Status

Degraded Performance


Components

Background Processing


Locations

Google Compute Engine




October 8, 2024 11:42 UTC
[Resolved] We are resolving the background job slowness incident (gitlab.com/gitlab-com/gl-infra/production/-/issues/18677) and will follow up on stuck MRs and pipelines in gitlab.com/gitlab-com/gl-infra/production/-/issues/18676.

October 8, 2024 10:42 UTC
[Identified] We have identified the cause of the background job saturation and the services are recovering. More information in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18676

October 8, 2024 10:27 UTC
[Investigating] We are currently investigating slowness with background jobs which might impact Merge Requests and Pipelines. More information in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18676

Sidekiq job delays resulting in updates to MR approval policies being ineffective

October 4, 2024 06:59 UTC

Incident Status

Partial Service Disruption


Components

Website, API


Locations

Google Compute Engine




October 4, 2024 06:59 UTC
[Resolved] Our monitoring shows no further issues after the fix was applied. We are resolving this issue and any further updates will be in: gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 22:23 UTC
[Monitoring] The fix has been successfully merged, but we are still waiting for the changes to reach production. We will continue to monitor Sidekiq’s performance, and the next update will be provided once the incident has been mitigated. For more details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 21:04 UTC
[Monitoring] The fix has been successfully merged. We will continue to monitor Sidekiq’s performance to ensure everything is functioning as expected. For details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 20:00 UTC
[Monitoring] The fix is expected to be merged soon. In the meantime, we’ll continue monitoring Sidekiq’s performance. Updates to MR approval policies will remain delayed until the issue is fully resolved. For more details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 19:02 UTC
[Monitoring] We are still awaiting the deployment of the fix and will continue to monitor Sidekiq’s performance in the meantime. Updates to MR approval policies are delayed until the issue is resolved. For more details, visit gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 18:02 UTC
[Monitoring] We continue to closely monitor Sidekiq’s performance while awaiting the deployment of the fix. In the meantime, updates to MR approval policies remain delayed. More details about this incident can be found at gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 17:06 UTC
[Monitoring] We are actively monitoring the performance of Sidekiq until the fix is deployed. Updates to merge request approval policies are still experiencing delays. Details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 15:55 UTC
[Monitoring] We're continuing to monitor the performance of Sidekiq until the fix is deployed. Updates to MR approval policies are still processed with delays. The MR to fix the root cause is currently waiting to be merged. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 14:46 UTC
[Monitoring] We're continuing to see improvements and are monitoring until a fix is deployed. Updates to MR approval policies are still processed with delays. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 14:28 UTC
[Monitoring] We've identified the root cause. We are recovering from a high volume of request traffic and are seeing improvements. Updates to MR approval policies are currently being processed, but with delays. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18660

October 3, 2024 14:12 UTC
[Investigating] We are currently investigating issues with updating MR Approval policies as a result of Sidekiq performance degradation. More details about this incident can be found in gitlab.com/gitlab-com/gl-infra/production/-/issues/18660.

September 2024

Main DB Upgrade

September 26, 2024 03:54 UTC

Planned Maintenance

Description

We will be undergoing scheduled maintenance to our database layer. GitLab.com will continue to be available during the maintenance window. For more details, see gitlab.com/gitlab-com/gl-infra/dbre/-/issues/227


Components

Website, API, Git Operations, CI/CD - Hosted runners on Linux, CI/CD - Hosted runners on Windows, CI/CD - Hosted runners on macOS, CI/CD - Hosted runners for GitLab community contributions


Locations

Google Compute Engine, AWS


Schedule

October 25, 2024 00:00 - October 30, 2024 01:00 UTC



September 26, 2024 03:54 UTC
[Update] Scheduled maintenance on October 25 to our database layer has now been cancelled. For more details, see gitlab.com/gitlab-com/gl-infra/dbre/-/issues/227




