Status History

Filter: forum.gitlab.com



October 2019

Delays in job processing

October 25, 2019 14:04 UTC

Incident Status

Degraded Performance


Components

Website, API, Git Operations, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com, version.gitlab.com, forum.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




October 25, 2019 14:04 UTC
[Resolved] A patch was pushed yesterday evening to fix the root cause of the issue. See gitlab.com/gitlab-org/gitlab/commit/b4037524908171800e92d72a4f12eca5ce5e7972. CI shared runners are operational.

October 24, 2019 23:16 UTC
[Monitoring] We've cleared out another problematic build that caused a resurgence in the issue and are applying a patch to fix the underlying problem. Details in: gitlab.com/gitlab-org/gitlab/issues/34860 and gitlab.com/gitlab-org/gitlab/merge_requests/19124

October 24, 2019 20:41 UTC
[Monitoring] We're seeing vast improvements in job queue times for Shared Runners on GitLab.com. Service levels are nearing normal operation and we're now monitoring to ensure the issue does not recur.

October 24, 2019 19:18 UTC
[Identified] We are still seeing issues with job queue processing and are continuing to work towards getting the matter fully resolved. Tracking in gitlab.com/gitlab-com/gl-infra/production/issues/1275.

October 24, 2019 17:52 UTC
[Resolved] CI jobs on shared runners are fully operational again. We apologize for any delays you may have experienced.

October 24, 2019 15:34 UTC
[Monitoring] Shared runner CI jobs are starting and our queues are slowly coming down. We expect to achieve normal levels within 90 minutes. We'll continue to monitor and will update once we're fully operational again.

October 24, 2019 14:46 UTC
[Identified] We've identified an issue where malformed data from a project import began throwing errors and is preventing some CI pipelines from starting. We've canceled the pipelines in question and are monitoring metrics.

October 24, 2019 12:59 UTC
[Investigating] The job durations are still higher than usual. We are continuing to investigate the situation.

October 24, 2019 12:39 UTC
[Monitoring] Job durations are looking good again. We are still monitoring and are investigating the root cause of the increased durations in gitlab.com/gitlab-com/gl-infra/production/issues/1275.

October 24, 2019 11:31 UTC
[Investigating] We are currently seeing delays in CI job processing and are investigating.

gitlab.com outage

October 23, 2019 17:05 UTC

Incident Status

Service Disruption


Components

Website, API, Git Operations, Container Registry, GitLab Pages, CI/CD - Hosted runners on Linux, Background Processing, GitLab Customers Portal, Support Services, packages.gitlab.com, version.gitlab.com, forum.gitlab.com


Locations

Google Compute Engine, Azure, Digital Ocean, Zendesk, AWS




October 23, 2019 17:05 UTC
[Resolved] The incident is resolved. We are conducting our review in gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8247.

October 23, 2019 16:02 UTC
[Monitoring] We've alleviated the memory pressure on our Redis cluster and we'll be monitoring for the next hour before sounding the all clear. All systems are operating normally.

October 23, 2019 13:30 UTC
[Identified] We confirmed the issues were caused by failures with our Redis cluster. We observed unusual activity that contributed to OOM errors on Redis. We'll be continuing to report our findings in an incident review issue: gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8247.

October 23, 2019 12:17 UTC
[Investigating] While the site is up again, we are investigating problems with our Redis cluster as the root cause.

October 23, 2019 11:56 UTC
[Resolved] The site is flapping again. We are investigating the root cause in gitlab.com/gitlab-com/gl-infra/production/issues/1272.

October 23, 2019 11:39 UTC
[Investigating] The site is up again. We are still checking for the root cause of the short outage.

October 23, 2019 11:35 UTC
[Investigating] We are experiencing an outage of gitlab.com and are investigating the root cause.
