
Status History




October 2019

Delays in job processing

October 25, 2019 14:04 UTC


Incident Status

Degraded Performance


Components

Website, API, Git Operations, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com, version.gitlab.com, forum.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




October 25, 2019 14:04 UTC
[Resolved] A patch was pushed yesterday evening to fix the root cause of the issue. See gitlab.com/gitlab-org/gitlab/commit/b4037524908171800e92d72a4f12eca5ce5e7972. CI shared runners are operational.

October 24, 2019 23:16 UTC
[Monitoring] We've cleared out another problematic build that caused a resurgence in the issue and are applying a patch to fix the underlying problem. Details in: gitlab.com/gitlab-org/gitlab/issues/34860 and gitlab.com/gitlab-org/gitlab/merge_requests/19124

October 24, 2019 20:41 UTC
[Monitoring] We're seeing vast improvements in job queue times for Shared Runners on GitLab.com. Service levels are nearing normal operation and we're now monitoring to ensure the issue does not recur.

October 24, 2019 19:18 UTC
[Identified] We are still seeing issues with job queue processing and are continuing to work towards getting the matter fully resolved. Tracking in gitlab.com/gitlab-com/gl-infra/production/issues/1275.

October 24, 2019 17:52 UTC
[Resolved] CI jobs on shared runners are fully operational again. We apologize for any delays you may have experienced.

October 24, 2019 15:34 UTC
[Monitoring] Shared runner CI jobs are starting and our queues are slowly coming down. We expect to achieve normal levels within 90 minutes. We'll continue to monitor and will update once we're fully operational again.

October 24, 2019 14:46 UTC
[Identified] We've identified an issue where malformed data from a project import is throwing errors and preventing some CI pipelines from starting. We've canceled the pipelines in question and are monitoring metrics.
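
For illustration only, here is a minimal sketch of cancelling stuck pipelines in bulk through the GitLab pipelines API; the project ID and token are placeholders, and this is not necessarily the exact remediation run during the incident.

    import requests

    GITLAB = "https://gitlab.com/api/v4"
    PROJECT_ID = 12345                      # placeholder project ID
    HEADERS = {"PRIVATE-TOKEN": "<token>"}  # placeholder token with API scope

    # List pipelines that are still pending, then cancel each one.
    pipelines = requests.get(
        f"{GITLAB}/projects/{PROJECT_ID}/pipelines",
        headers=HEADERS,
        params={"status": "pending", "per_page": 100},
    )
    pipelines.raise_for_status()

    for pipeline in pipelines.json():
        cancel = requests.post(
            f"{GITLAB}/projects/{PROJECT_ID}/pipelines/{pipeline['id']}/cancel",
            headers=HEADERS,
        )
        cancel.raise_for_status()
        print(f"cancelled pipeline {pipeline['id']}")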

October 24, 2019 12:59 UTC
[Investigating] The job durations are still higher than usual. We are continuing to investigate the situation.

October 24, 2019 12:39 UTC
[Monitoring] Job durations are looking good again. We are still monitoring and investigating the root cause of the increased durations in gitlab.com/gitlab-com/gl-infra/production/issues/1275.

October 24, 2019 11:31 UTC
[Investigating] We are currently seeing delays in CI job processing and are investigating.

gitlab.com outage

October 23, 2019 17:05 UTC


Incident Status

Service Disruption


Components

Website, API, Git Operations, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com, version.gitlab.com, forum.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




October 23, 2019 17:05 UTC
[Resolved] The incident is resolved. We are conducting our review in gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8247.

October 23, 2019 16:02 UTC
[Monitoring] We've alleviated the memory pressure on our Redis cluster and we'll be monitoring for the next hour before sounding the all clear. All systems are operating normally.

October 23, 2019 13:30 UTC
[Identified] We confirmed the issues were caused by failures with our Redis cluster. We observed unusual activity that contributed to OOM errors on Redis. We'll be continuing to report our findings in an incident review issue: gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8247.
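
As background on what memory pressure on Redis looks like, below is a minimal sketch of checking a node's memory usage with redis-py; the host name is a placeholder and the snippet is illustrative rather than the exact tooling used in this incident.

    import redis

    # Placeholder host; production Redis endpoints are not public.
    r = redis.Redis(host="redis.example.internal", port=6379)

    info = r.info("memory")
    used = info["used_memory"]
    limit = info.get("maxmemory", 0)

    print("used_memory:", info["used_memory_human"])
    if limit:
        print(f"usage: {used / limit:.1%} of maxmemory "
              f"(policy: {info.get('maxmemory_policy')})")
    else:
        print("no maxmemory set; the kernel OOM killer is the only backstop")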

October 23, 2019 12:17 UTC
[Investigating] While the site is up again, we are investigating problems with our Redis cluster as the root cause.

October 23, 2019 11:56 UTC
[Resolved] The site is flapping again. We are investigating the root cause in gitlab.com/gitlab-com/gl-infra/production/issues/1272.

October 23, 2019 11:39 UTC
[Investigating] The site is up again. We are still checking for the root cause of the short outage.

October 23, 2019 11:35 UTC
[Investigating] We are experiencing an outage of gitlab.com and are investigating the root cause.

July 2019

Database Replica Replacement

July 11, 2019 12:07 UTC

Incident Status

Operational


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




July 11, 2019 12:07 UTC
[Resolved] Replica created.

July 11, 2019 11:08 UTC
[Identified] We are performing maintenance on a secondary database due to a replication issue. No users are impacted.

We are currently investigating delayed execution of mirror jobs.

July 2, 2019 17:06 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




July 2, 2019 17:06 UTC
[Resolved] We're all caught up with the delayed execution of pull mirror jobs (see dashboards.gitlab.com/d/_MKRXrSmk/pull-mirrors?orgId=1&refresh=30s). We apologize for the inconvenience. The original issue has been moved into our production tracker; the ongoing conversation will take place there: gitlab.com/gitlab-com/gl-infra/production/issues/934.

July 2, 2019 17:06 UTC
[Resolved] All pull mirrors have been processed and the queue has been back to normal operation since 16:00 UTC.

July 2, 2019 15:05 UTC
[Investigating] We are currently investigating delayed execution of repository mirror jobs. See gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7155 for details.
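
As a side note, once a backlog like this has cleared, an individual project's pull mirror can be re-synced on demand through the API; the project ID and token below are placeholders, and the endpoint is only available on paid tiers.

    import requests

    GITLAB = "https://gitlab.com/api/v4"
    PROJECT_ID = 12345                      # placeholder project ID
    HEADERS = {"PRIVATE-TOKEN": "<token>"}  # placeholder token

    # Ask GitLab to start a pull-mirror sync for this project immediately,
    # rather than waiting for the scheduled mirror worker to pick it up.
    resp = requests.post(f"{GITLAB}/projects/{PROJECT_ID}/mirror/pull", headers=HEADERS)
    resp.raise_for_status()
    print("pull mirror update queued, HTTP", resp.status_code)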

Degraded performance on GitLab.com

July 1, 2019 19:40 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, CI/CD - Hosted runners on Linux


Locations

Google Compute Engine, Digital Ocean




July 1, 2019 19:40 UTC
[Resolved] GitLab.com, including pending CI jobs, is now operating normally.

July 1, 2019 18:11 UTC
[Monitoring] We are continuing to monitor GitLab.com. All services are operating normally. CI pipelines continue to catch up from delays earlier and are nearly at normal levels.

July 1, 2019 16:51 UTC
[Monitoring] We are continuing to investigate degraded performance on GitLab.com. More of our traffic looks healthy and CI jobs are catching up.

July 1, 2019 13:51 UTC
[Investigating] We are continuing to investigate the degraded performance and CI pipeline delays on GitLab.com. We are tracking on gitlab.com/gitlab-com/gl-infra/production/issues/928.

July 1, 2019 13:14 UTC
[Investigating] We are now tracking CI issues and observed performance degradation as one incident.

July 1, 2019 09:56 UTC
[Investigating] We are adding more workers to alleviate the symptoms of the incident.

July 1, 2019 09:06 UTC
[Investigating] We are investigating slow response times on GitLab.com

June 2019

increased delays for CI jobs

June 27, 2019 10:12 UTC


Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




June 27, 2019 10:12 UTC
[Resolved] CI jobs have not been delayed since June 26 14:00 UTC.

June 26, 2019 14:43 UTC
[Investigating] CI pipeline delays are improving now. We are still investigating the root cause. See gitlab.com/gitlab-com/gl-infra/production/issues/922 for details.

June 26, 2019 14:10 UTC
[Investigating] We are investigating delays for jobs in CI pipelines.

git operations over https are slow

June 20, 2019 19:12 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




June 20, 2019 19:12 UTC
[Resolved] We're operating normally again. We are continuing the root cause analysis and will post updates in gitlab.com/gitlab-com/gl-infra/production/issues/912.

June 20, 2019 17:52 UTC
[Monitoring] We were unable to determine the root cause of the problem, but we've seen the latencies return to normal levels. We will continue monitoring for spikes and we'll be carefully listening for user reports.

June 20, 2019 17:04 UTC
[Investigating] We are still investigating the git access slowdowns for a limited number of repositories.

June 20, 2019 16:31 UTC
[Investigating] We've narrowed the impact to specific projects, including gitlab-ce. The majority of users are not impacted by this issue.

June 20, 2019 16:14 UTC
[Investigating] We've confirmed the latency issues, but we're still investigating. Thanks for your patience.

June 20, 2019 15:49 UTC
[Investigating] We're investigating reports of slow git operations over https connections.

Elevated Error Rates on GitLab.com

June 3, 2019 00:04 UTC

Incident Status

Service Disruption


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing


Locations

Google Compute Engine, Digital Ocean




June 3, 2019 00:04 UTC
[Resolved] We are no longer seeing errors and Google Cloud has resolved the issue as of 23:00 UTC. Any further information can be found on the issue at gitlab.com/gitlab-com/gl-infra/production/issues/862

June 2, 2019 20:17 UTC
[Monitoring] We are continuing to monitor issues with GitLab.com. They appear to be related to Google Cloud issues. We will continue to update on gitlab.com/gitlab-com/gl-infra/production/issues/862

June 2, 2019 19:41 UTC
[Monitoring] We are continuing to monitor issues with GitLab.com. We are tracking on gitlab.com/gitlab-com/gl-infra/production/issues/862.

June 2, 2019 19:29 UTC
[Monitoring] GitLab.com is now available. There was a failover on our DB and services are now working normally. We'll continue to monitor.

June 2, 2019 19:21 UTC
[Identified] We are currently investigating a database failover on GitLab.com that's led to elevated error rates and latencies. More details in working doc at docs.google.com/document/d/1RM3QnuJ4FPH10J3UrJS0T26d-mn_Dd11-jmZweHQVV8/edit

June 2, 2019 19:04 UTC
[Investigating] We are investigating elevated error rates on GitLab.com.

May 2019

Observing git-over-ssh errors

May 23, 2019 19:22 UTC

Incident Status

Operational


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




May 23, 2019 19:22 UTC
[Resolved] We've determined the issue is no longer impacting users and are marking this issue resolved.

May 23, 2019 17:05 UTC
[Monitoring] We are continuing to monitor git over ssh performance. Response latency and error ratios are back to normal, and we've not observed any other metrics that indicate any other operations on GitLab.com were impacted.

May 23, 2019 16:29 UTC
[Investigating] We're observing increased errors on GitLab.com where git connections via ssh are unexpectedly failing. The issue is being tracked in gitlab.com/gitlab-com/gl-infra/production/issues/844.

Issues with credentials for customers previously logged in via SSO

May 22, 2019 16:05 UTC

Incident Status

Operational


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




May 22, 2019 16:05 UTC
[Resolved] We haven't received any additional reports of issues with SSO authentication. Both login and runner credentials are operating normally again. Our apologies to anyone who was impacted, and our thanks to the few who brought this issue to our attention!

May 22, 2019 15:15 UTC
[Monitoring] We've received customer confirmation that the configuration change we reverted has resolved their issues. We'll continue monitoring with our support team.

May 22, 2019 14:58 UTC
[Investigating] We've reverted a setting that was forcing new sessions for SSO authentication. But we're still investigating, as we've yet to find definitive metrics indicating the issue is resolved.

May 22, 2019 14:39 UTC
[Investigating] We're currently investigating customer reports of issues with CI runners and user logins. If you're using SSO you may be impacted.

We are investigating performance issues on our site.

May 17, 2019 15:26 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




May 17, 2019 15:26 UTC
[Resolved] The performance issues are fully resolved and the site is fully operational.

May 17, 2019 13:10 UTC
[Monitoring] The rollback is still ongoing and the website performance is not fully recovered yet.

May 17, 2019 11:42 UTC
[Monitoring] We are seeing performance of GitLab.com going back to normal levels while the rollback is going on. See gitlab.com/gitlab-com/gl-infra/production/issues/832 for more details.

May 17, 2019 10:37 UTC
[Investigating] We are rolling back the changes that led to the performance degradation. See gitlab.com/gitlab-com/gl-infra/production/issues/832 for more details.

May 17, 2019 09:58 UTC
[Investigating] We are continuing to investigate the performance issues. See gitlab.com/gitlab-com/gl-infra/production/issues/832 for more details.

May 17, 2019 08:20 UTC
[Investigating] We are investigating performance issues. See gitlab.com/gitlab-com/gl-infra/production/issues/832 for more details.

April 2019

Scheduled jobs not triggering

April 26, 2019 23:44 UTC


Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




April 26, 2019 23:44 UTC
[Resolved] Scheduled jobs are now getting triggered as expected. Please check: gitlab.com/gitlab-com/gl-infra/production/issues/805 for further details.

April 26, 2019 07:48 UTC
[Investigating] We are investigating an issue with scheduled jobs not getting triggered. Please follow: gitlab.com/gitlab-com/gl-infra/production/issues/805 for further details and investigation.

Degraded service availability

April 24, 2019 16:21 UTC


Incident Status

Partial Service Disruption


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




April 24, 2019 16:21 UTC
[Resolved] Our cloud provider resolved the underlying inconsistency within their infrastructure three hours ago, and we started our remaining job processor 30 minutes ago. We are not seeing any further issues. Details: gitlab.com/gitlab-com/gl-infra/production/issues/802

April 24, 2019 11:32 UTC
[Monitoring] The jobs that had been stuck have all been caught up and processed. We are monitoring the issue on our end while we wait for a further update from our cloud provider. For details: gitlab.com/gitlab-com/gl-infra/production/issues/802

April 24, 2019 08:36 UTC
[Identified] We believe we have a good lead on what might be happening and are waiting to hear back from our provider with an update. Error rates have dropped drastically and users should be seeing improvements. Details: gitlab.com/gitlab-com/gl-infra/production/issues/802

April 24, 2019 06:40 UTC
[Investigating] We are investigating an issue within our infrastructure that is causing degraded service availability. Currently known symptoms are intermittent 500 errors when performing certain operations that involve database writes. Follow gitlab.com/gitlab-com/gl-infra/production/issues/802 for details.

March 2019

ssh connections - IP blocked

March 20, 2019 15:51 UTC

Incident Status

Operational


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




March 20, 2019 15:51 UTC
[Resolved] We just blocked an IP that was hogging ssh connections. We apologize to anyone who may have had recent trouble executing git commands over ssh!
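
For context, here is a rough sketch of how connections to sshd can be tallied per client IP to spot an abusive source; it assumes a Linux host with the ss utility available and is illustrative only, not the exact procedure used here.

    import subprocess
    from collections import Counter

    # List established TCP connections terminating on local port 22 (sshd).
    out = subprocess.run(
        ["ss", "-tn", "state", "established", "sport", "=", ":22"],
        capture_output=True, text=True, check=True,
    ).stdout

    peers = Counter()
    for line in out.splitlines()[1:]:               # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            peer = fields[-1].rsplit(":", 1)[0]     # strip the port from "ip:port"
            peers[peer] += 1

    # Show the five peers holding the most simultaneous connections.
    for ip, count in peers.most_common(5):
        print(f"{ip}\t{count} connections")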

GitLab.com not responding

March 18, 2019 22:58 UTC


Incident Status

Service Disruption


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




March 18, 2019 22:58 UTC
[Resolved] GitLab.com has been operating normally with the temporary remediations in place. RCA will be on gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6407.

March 18, 2019 20:53 UTC
[Monitoring] GitLab.com is again operating normally and we will continue to monitor while we patch the affected area. Thanks for bearing with us!

March 18, 2019 20:35 UTC
[Identified] We have identified the source of the issues with the requests and we are tracking on gitlab.com/gitlab-com/gl-infra/production/issues/735

March 18, 2019 20:23 UTC
[Investigating] We are investigating slow queries on our database which appear to be related to the higher error rates and slow requests.

March 18, 2019 20:05 UTC
[Investigating] We are currently investigating elevated error rates on GitLab.com web and API requests.

gitlab-com/gl-infra/production#733

March 18, 2019 18:44 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services, packages.gitlab.com


Locations

Google Compute Engine, Digital Ocean, Zendesk, AWS




March 18, 2019 18:44 UTC
[Resolved] We've stabilized and we're resolving the issue. All operations are normal.

March 18, 2019 18:24 UTC
[Monitoring] We've relieved the pressure on our build infrastructure and systems are operating normally. We will continue to monitor for stability before issuing an all clear.

March 18, 2019 18:05 UTC
[Identified] We've narrowed the issue to our build infrastructure. 500 errors are still occurring, but in low numbers. Users might continue to experience issues or delays with their build pipelines.

March 18, 2019 17:34 UTC
[Investigating] We're still seeing a slightly elevated error count, but the error rate is declining. API users are those most likely to be affected by this incident. We apologize for any degraded performance and will update you again soon.

March 18, 2019 17:13 UTC
[Investigating] We are currently seeing a high rate of errors on GitLab.com. Our infrastructure team is investigating.

Problems with CI queue

March 11, 2019 16:34 UTC


Incident Status

Degraded Performance


Components

CI/CD - Hosted runners on Linux


Locations

Google Compute Engine, Digital Ocean




March 11, 2019 16:34 UTC
[Resolved] The problem has been fully resolved, and we have not seen any performance issues with our CI infrastructure for the past few hours.

March 11, 2019 12:58 UTC
[Monitoring] All of the managers have been updated and restarted. The number of pending jobs is dropping. The problem should be resolved now, but we are still monitoring the infrastructure.

March 11, 2019 12:29 UTC
[Identified] We've identified the problem with the deployment. Most of our runners are back up and the CI queue is slowly going down.

March 11, 2019 12:17 UTC
[Investigating] We're investigating problems with the CI queue after a deploy rollback.

January 2019

Investigating slow Git interactions on GitLab.com

January 17, 2019 17:43 UTC

Incident Status

Degraded Performance


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, Support Services


Locations

Google Compute Engine, Digital Ocean, Zendesk




January 17, 2019 17:43 UTC
[Resolved] All GitLab.com services are operating normally again and more notes have been added to gitlab.com/gitlab-com/gl-infra/production/issues/657 related to our investigation.

January 17, 2019 17:04 UTC
[Monitoring] Error rates on GitLab.com have gone back to normal levels and all services are operating normally. We will continue to investigate and update on gitlab.com/gitlab-com/gl-infra/production/issues/657.

January 17, 2019 16:44 UTC
[Investigating] We are currently investigating slow interactions with git on GitLab.com.

Investigating High Error Rate on GitLab.com requests

January 7, 2019 15:22 UTC

Incident Status

Service Disruption


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing


Locations

Google Compute Engine, Digital Ocean




January 7, 2019 15:22 UTC
[Resolved] All GitLab.com systems are running normally.

January 7, 2019 15:07 UTC
[Monitoring] All GitLab.com services are operating normally. We will put further updates and a link to the RCA on gitlab.com/gitlab-com/gl-infra/production/issues/640. We apologize for the inconvenience!

January 7, 2019 14:55 UTC
[Monitoring] GitLab.com and GitLab Pages are both operational. We are continuing to monitor the health of the stack.

January 7, 2019 14:44 UTC
[Monitoring] While we bring services on GitLab.com back up, GitLab Pages will have degraded performance for a short period of time.

January 7, 2019 14:42 UTC
[Monitoring] We have repaired the database backends for GitLab.com and are continuing to monitor the health of the system.

January 7, 2019 14:33 UTC
[Identified] We have identified an issue with our database VMs and are working on restoring service.

December 2018

Repository storage nodes are down

December 22, 2018 00:06 UTC

Incident Status

Service Disruption


Components

Website, API, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing


Locations

Google Compute Engine, Digital Ocean




December 22, 2018 00:06 UTC
[Resolved] GitLab.com git nodes are all operating normally. The summary and RCA will continue to be updated on gitlab.com/gitlab-com/gl-infra/production/issues/632. We apologize for the issues today.

December 21, 2018 22:18 UTC
[Monitoring] GitLab.com gitaly nodes have been restored. We will have a timeline on issue gitlab.com/gitlab-com/gl-infra/production/issues/632. Currently all GitLab.com services should be back to operating normally.

December 21, 2018 21:59 UTC
[Monitoring] The service is partially restored; we are still monitoring the situation. Sorry for the inconvenience.

December 21, 2018 21:43 UTC
[Identified] Due to an engineering error, repository storage nodes were brought down; we are working on bringing them back up. No data was lost. More updates will follow in 15 minutes.




