
GitHub Copilot Will Train on Your Code by Default Starting April 24 -- Here's How to Opt Out

GitHub announced it will use Copilot interaction data to train AI models starting April 24, with the setting defaulting to ON. Private repo code snippets are included. What you need to know and do.

[Image: GitHub Copilot data policy change. Source: GitHub]


GitHub quietly dropped a bombshell this week. On March 25, the company announced updates to its Privacy Statement and Terms of Service. The key change: starting April 24, interaction data from Copilot Free, Pro, and Pro+ users will be used to train AI models.

The default setting is ON. If you do not explicitly turn it off, you are opting in.

The Backstory

GitHub's relationship with AI training data has always been contentious. When Copilot first launched in 2021, it was trained on public GitHub repositories, including GPL-licensed code. A class-action lawsuit followed in 2022.

That controversy pushed GitHub to take a cautious stance. "We do not use your code for training" became a key Copilot selling point, especially for Business and Enterprise plans. It was a promise the platform leaned on heavily in marketing.

This policy change effectively reverses that promise for individual users. Business, Enterprise, student, and teacher accounts remain exempt. But individual developers and Pro subscribers are directly affected.

| Plan | Data used for training? | Can opt out? |
| --- | --- | --- |
| Copilot Free | Yes (from 4/24) | Yes |
| Copilot Pro | Yes (from 4/24) | Yes |
| Copilot Pro+ | Yes (from 4/24) | Yes |
| Copilot Business | No | N/A |
| Copilot Enterprise | No | N/A |
| Students/Teachers | No | N/A |

Breaking It Down

What Data Gets Collected

"Interaction data" sounds like chat logs. But the actual scope is broader: code snippets, accepted suggestions, navigation patterns, and active session data from private repositories.

One important nuance. GitHub says "this update does not change the treatment of private repository source code stored on GitHub." In other words, your entire codebase is not being ingested. But code snippets and context generated during Copilot interactions are fair game.

This is a subtle distinction. They are not taking your whole repo. They are taking the pieces you show Copilot while you work.

The Microsoft Precedent

Chief Product Officer Mario Rodriguez offered an interesting justification. Over the past year, using interaction data from Microsoft employees yielded meaningful improvements, including higher code acceptance rates across multiple programming languages.

The "better products" framing is reasonable. But the core question remains: is it right to collect data by default when users have not explicitly consented?

How to Opt Out

Go to Settings > Copilot > Features, and disable "Allow GitHub to use my data for AI model training" under the Privacy heading. Do this before April 24. You can still opt out after, but data collected before that point may already be in the training pipeline.

The Bigger Picture

This is not just a GitHub story. It is part of a sweeping industry shift in data policies across major tech platforms.

In recent years, major companies have steadily expanded their data collection scope. LinkedIn changed its terms to allow user posts to be used for AI training. Reddit struck a licensing deal with Google for AI training data. Now GitHub joins the pattern.

The playbook is familiar: promise not to use customer data, build market dominance, then change the terms. This is not unique to GitHub. It is the classic platform business pivot.

The open-source community's reaction has been intense. Some developers are actively discussing migration to alternatives like GitLab and Codeberg.

What This Means for You

The action items are clear.

First, check your opt-out settings before April 24. If you are uncomfortable with your code snippets entering training data, turn it off now. The path is Settings > Copilot > Features > Privacy.

Second, if your company uses Copilot, verify you are on a Business or Enterprise plan. Those plans are not affected.

Third, think long-term. When your code writing tool doubles as a training data collector, you may need to reconsider using Copilot for sensitive code involving security, finance, or healthcare.

The deadline is April 24. Less than a month away.

