The Wayback Machine - https://web.archive.org/web/20231205000557/https://www.githubstatus.com/
All Systems Operational
Git Operations: Operational
API Requests: Operational
Webhooks: Operational
Issues: Operational
Pull Requests: Operational
Actions: Operational
Packages: Operational
Pages: Operational
Codespaces: Operational
Copilot Operational
Status levels: Operational, Degraded Performance, Partial Outage, Major Outage, Maintenance
Past Incidents
Dec 4, 2023

No incidents reported today.

Dec 3, 2023

No incidents reported.

Dec 2, 2023

No incidents reported.

Dec 1, 2023
Resolved - This incident has been resolved.
Dec 1, 18:16 UTC
Update - A small percentage of Copilot Chat users are still experiencing long request times and errors. We are still investigating to determine the root cause.
Dec 1, 17:36 UTC
Update - Some customers are experiencing higher latency for Copilot Chat. We are continuing our investigation.
Dec 1, 16:41 UTC
Update - Copilot is experiencing degraded performance. We are continuing to investigate.
Dec 1, 15:55 UTC
Update - We are investigating reports that some customers are experiencing increased latency and failed requests for Copilot Chat.
Dec 1, 15:53 UTC
Investigating - We are currently investigating this issue.
Dec 1, 15:49 UTC
Nov 30, 2023

No incidents reported.

Nov 29, 2023

No incidents reported.

Nov 28, 2023
Resolved - This incident has been resolved.
Nov 28, 19:40 UTC
Update - We were not able to publish webhooks in response to push events triggered between 16:23 and 17:12 UTC. To avoid further disruption to customer workflows, we’ve decided not to continue our attempts to re-process those events.
Workflows that trigger on pull_request or git push events may not have been run during this time period.
You can run impacted workflows manually or by pushing a new commit to the same branch.
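
To decide which workflows need a manual re-run, one option is to compare each push time with the affected window. The sketch below assumes you already have timezone-aware push timestamps (for example from your own deploy or audit logs); the function name and the sample branches are illustrative and not part of any GitHub API.

```python
from datetime import datetime, timezone

# Affected window from the update above: Nov 28, 2023, 16:23 to 17:12 UTC.
WINDOW_START = datetime(2023, 11, 28, 16, 23, tzinfo=timezone.utc)
WINDOW_END = datetime(2023, 11, 28, 17, 12, tzinfo=timezone.utc)


def pushed_during_incident(pushed_at: datetime) -> bool:
    """Return True if a timezone-aware push time falls inside the window (inclusive)."""
    return WINDOW_START <= pushed_at.astimezone(timezone.utc) <= WINDOW_END


# Hypothetical push times, keyed by branch name (assumed data, not from the incident).
pushes = {
    "feature-a": datetime(2023, 11, 28, 16, 45, tzinfo=timezone.utc),
    "feature-b": datetime(2023, 11, 28, 17, 30, tzinfo=timezone.utc),
}

needs_rerun = [branch for branch, ts in pushes.items() if pushed_during_incident(ts)]
print(needs_rerun)
```

Branches listed in needs_rerun are candidates for a manual run or a fresh commit. Pass timezone-aware datetimes only, since a naive datetime would be interpreted as local time.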

Nov 28, 19:36 UTC
Update - Customers saw push event deliveries for Actions and Webhooks fail between 16:23 and 17:12 (UTC). We fixed the issue, and we are working to re-process push events for the affected time period.
Nov 28, 18:09 UTC
Investigating - We are currently investigating this issue.
Nov 28, 18:09 UTC
Resolved - This incident has been resolved. An interaction between two feature flag rollouts caused us to suppress delivery of push webhooks between 16:23 and 17:11 UTC in a manner that evaded our existing observability. This affected 71,000 repositories whose users will have noted missing webhook deliveries and/or experienced Actions jobs with push triggers failing to start. During this period, around 25% of new Actions jobs were impacted. The issues were resolved by disabling one of the feature flags in question.

After weighing various options for retroactively dispatching old push webhooks, we concluded that the risk of delivering stale data to customers with these redeliveries outweighed the possible benefit.

As a follow-up to this incident, we are working to improve monitoring of webhook throughput and to document a policy around webhook redelivery timelines.
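
One way throughput monitoring of this kind could work is to compare each interval's delivery count against a recent baseline and flag a sharp drop. This is an illustrative sketch only, not GitHub's monitoring: the function name, the 50% threshold, and the sample counts are all assumptions.

```python
from statistics import mean


def throughput_dropped(history: list[int], current: int, min_ratio: float = 0.5) -> bool:
    """Flag the current interval if it falls below min_ratio of the recent average."""
    if not history:
        return False  # no baseline yet, so there is nothing to compare against
    baseline = mean(history)
    return current < baseline * min_ratio


# Hypothetical deliveries per minute over the last four intervals.
recent_counts = [980, 1010, 1005, 995]

print(throughput_dropped(recent_counts, 990))  # close to the baseline
print(throughput_dropped(recent_counts, 120))  # far below the baseline
```

A check like this watches the volume of deliveries rather than error rates, which matters when events are silently suppressed instead of failing loudly.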

Nov 28, 17:59 UTC
Update - Actions is operating normally.
Nov 28, 17:59 UTC
Update - Customers saw pull request push event deliveries for Actions and Webhooks fail between 16:23 and 17:12 (UTC). We fixed the issue, and we are working to re-process push events for the affected time period.
Nov 28, 17:36 UTC
Investigating - We are investigating reports of degraded performance for Actions and Webhooks.
Nov 28, 17:24 UTC
Nov 27, 2023
Resolved - This incident has been resolved.
On November 27, 2023 at 18:46 UTC, we attempted to rotate our OpenID Connect (OIDC) authentication flow certificates. Due to an error in the certificate formatting, we uploaded an invalid certificate configuration that was not observed in our pre-production testing. Our background job servers were unable to start because a valid configuration is required at worker startup. As a result, users experienced delays in Pull Requests, Webhooks, Issues, Actions and Projects. Rollback of the change was slowed by the invalid certificate as our deployment system relied on the same certificate. Rollback was completed at 20:35 UTC. Most services recovered by 20:44 UTC.

Delayed updates to Issues and Pull Requests were applied normally once the changes were rolled back. After the change was rolled back, a large queue of Actions-related jobs built up which included Pull Request, Pull Request review and Pull Request review comment events. About 2.3% of Actions jobs failed during the incident. Job queue times returned to normal once all remaining jobs were processed.

We are working to improve our certificate testing and rotation process to reduce the risk of customer-impacting errors.
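
As a rough illustration of a pre-upload formatting check, the sketch below verifies only the PEM envelope of a certificate: the BEGIN and END lines and a strictly valid base64 body. It is an assumption about what such a check could look like, not GitHub's testing process, and it does not parse or validate the X.509 contents. The function name is hypothetical.

```python
import base64
import binascii

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def pem_format_errors(pem_text: str) -> list[str]:
    """Return a list of formatting problems; an empty list means the envelope looks sane."""
    errors = []
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if not lines or lines[0] != PEM_HEADER:
        errors.append("missing or malformed BEGIN line")
    if not lines or lines[-1] != PEM_FOOTER:
        errors.append("missing or malformed END line")
    # Everything between the first and last line is treated as the base64 body.
    body = "".join(lines[1:-1])
    if not body:
        errors.append("empty certificate body")
    else:
        try:
            base64.b64decode(body, validate=True)
        except binascii.Error:
            errors.append("body is not valid base64")
    return errors


# Deliberately malformed input: the body contains characters outside the base64 alphabet.
broken = PEM_HEADER + "\nnot base64!!\n" + PEM_FOOTER
print(pem_format_errors(broken))
```

Running a check like this in the same pipeline that uploads the certificate would catch envelope-level mistakes early, though a full check would also load the certificate with the consuming service's own parser.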

Nov 27, 21:11 UTC
Update - Webhooks is operating normally.
Nov 27, 20:44 UTC
Update - Issues is operating normally.
Nov 27, 20:44 UTC
Update - Pull Requests is operating normally.
Nov 27, 20:43 UTC
Update - Actions customers are experiencing workflow start delays as part of the ongoing PRs incident. We are seeing previously delayed runs kick off and will continue to monitor.
Nov 27, 20:39 UTC
Update - Actions is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:30 UTC
Update - Customers are also experiencing delays in webhook delivery and issue updates. We are seeing recovery and are continuing to monitor.
Nov 27, 20:22 UTC
Update - Webhooks is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:16 UTC
Update - Issues is experiencing degraded performance. We are continuing to investigate.
Nov 27, 20:16 UTC
Update - Customers are seeing delays in pushed commits appearing on pull requests. We are currently investigating.
Nov 27, 19:46 UTC
Investigating - We are investigating reports of degraded performance for Pull Requests.
Nov 27, 19:43 UTC
Nov 26, 2023

No incidents reported.

Nov 25, 2023

No incidents reported.

Nov 24, 2023

No incidents reported.

Nov 23, 2023

No incidents reported.

Nov 22, 2023

No incidents reported.

Nov 21, 2023
Resolved - On November 21, 2023, at 09:50 UTC, GitHub Actions jobs encountered delays due to an incident in our background job service caused by excessive rebalancing in a Kafka consumer group. After a quick mitigation, we began to see recovery on the job queues by 10:02 UTC. During this time window, 100% of Actions jobs were delayed in starting for up to 11 minutes.

Unfortunately, the rapid queue recovery sent a thundering herd of jobs to Actions hosted runner pools, causing a database deadlock that resulted in some hosted runner pools having increased latency when accepting new jobs. This affected only a small percentage of overall jobs, around 2%. Configuration changes resolved the issue; the system had fully recovered by 11:27 UTC, and all in-progress jobs were processed.
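
A common way to soften a thundering herd is to drain a backlog at a bounded rate instead of releasing everything at once. The sketch below illustrates that general idea only; it is not the configuration change described above, and the function name and rate are hypothetical.

```python
def paced_release_times(job_count: int, max_jobs_per_second: int) -> list[float]:
    """Return a release offset in seconds for each job, capped at the given rate."""
    if max_jobs_per_second <= 0:
        raise ValueError("max_jobs_per_second must be positive")
    # Job i is released in whole second i // rate, so no second exceeds the cap.
    return [float(index // max_jobs_per_second) for index in range(job_count)]


# Hypothetical backlog of 5 jobs released at no more than 2 jobs per second.
print(paced_release_times(5, 2))
```

Spreading the backlog this way trades a slightly longer catch-up period for a steadier load on whatever accepts the jobs downstream.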

The incident is now resolved.

Nov 21, 11:27 UTC
Update - We've applied a mitigation to fix the issues with queuing and running Actions jobs. We are seeing improvements in telemetry and are monitoring for full recovery.
Nov 21, 11:12 UTC
Update - We have recovery for the underlying issue but are waiting for Actions queues to catch up. We expect this to be completed in less than 1 hour.
Nov 21, 10:24 UTC
Investigating - We are investigating reports of degraded performance for Actions.
Nov 21, 10:11 UTC
Nov 20, 2023

No incidents reported.