Original analysis

What 710 Healthcare Data Breaches Say About Putting Patient Data in Cloud AI

Every large healthcare data breach in the United States gets reported to the HHS Office for Civil Rights and posted in a public list. The 2025 record contains 710 of them, affecting more than 61.5 million people. Two patterns in that record matter for any health system thinking about cloud AI, and they show up before you get past the first cut of the data.


The headline figures, plain

OCR received 710 large data breach reports in 2025, where "large" means a single incident exposing the protected health information of 500 or more people. The reports cover hospitals, health plans, healthcare clearinghouses, and the third-party vendors that work on their behalf. Hacking and other IT incidents continued to dominate the cause column, as they have for several years now.[1][2]

710: Large healthcare data breaches reported to OCR in 2025.[2]
61.5M+: Individuals whose protected health information was exposed or impermissibly disclosed.[2]
22%: Share of all 2025 ransomware attacks across every industry that hit healthcare, the worst-affected sector.[2]

The number of breaches is roughly flat against 2024, and the individuals-affected total is down sharply, mostly because the 2024 figure was inflated by the Change Healthcare ransomware attack. The year-over-year story isn't what matters here. What matters is the shape of the breaches in the 2025 column.


More than a third happened at a third-party vendor

The OCR breach record sorts every reported incident by the type of entity it happened to. In 2025, the breakdown looked like this: 57.5% of breaches happened at healthcare providers, 35.8% at business associates, 6.5% at health plans, and 0.3% at clearinghouses.[2]

That 35.8% number is the one that matters for any cloud AI conversation, and it deserves a moment.

A business associate under HIPAA is a third party that handles protected health information on a covered entity's behalf: billing companies, transcription services, IT vendors, claims processors. When a hospital signs a Business Associate Agreement with one of these companies, the business associate inherits a real, enforceable slice of HIPAA responsibility, and the hospital remains accountable for the data the whole way down the chain.[3]

A cloud AI service that touches PHI fits this definition by another name. Every time a clinician pastes a discharge summary into a hosted model for a quick summary, every time a billing team uses an AI tool to draft an appeal, every time a vendor agent reads a chart, the cloud service is doing what a business associate does: handling regulated patient data outside the institution's network. Whether the paperwork has been done or not, the function is the same.

And the 2025 record shows what happens to that function under stress. More than one in three large breaches in the public record happened on the third-party side of the line. Cloud AI does not invent this risk. It joins it.
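To put the entity breakdown above on an absolute scale, the published shares can be converted back into approximate incident counts. The following is a minimal sketch, not part of the source data: the function name `approx_counts` is hypothetical, and the results are estimates implied by rounded percentages rather than counts taken from the OCR portal.

```python
# Approximate incident counts implied by the published entity shares.
# These are estimates from rounded percentages, not counts from the OCR portal.

TOTAL_BREACHES = 710  # large breaches reported to OCR in 2025

ENTITY_SHARE_PCT = {
    "healthcare providers": 57.5,
    "business associates": 35.8,
    "health plans": 6.5,
    "clearinghouses": 0.3,
}


def approx_counts(total, shares_pct):
    """Convert percentage shares into rounded, approximate incident counts."""
    return {name: round(total * pct / 100) for name, pct in shares_pct.items()}


estimated = approx_counts(TOTAL_BREACHES, ENTITY_SHARE_PCT)

for name, count in estimated.items():
    print(f"{name}: about {count} of {TOTAL_BREACHES}")
```

With the 2025 figures this works out to about 254 incidents on the business-associate side. The exact portal count may differ by an incident or two because the inputs are rounded to one decimal place.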


Eighty-six percent of breaches involved data in digital, network-accessible form

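OCR also classifies breaches by where the protected health information was sitting when it leaked. For the locations reported in this analysis, the 2025 distribution looks like this:[2]

  1. Network servers: 61.5%
  2. Compromised email accounts: 24.9%
  3. Paper and films: 5.6%
  4. Electronic medical records: 4.6%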

Add the first two together and 86.4% of 2025 breaches involved PHI sitting in the same digital, network-accessible form that a cloud AI request needs. The cloud AI argument doesn't depend on a hypothesis here. The breach record is telling you what category of data gets exposed: it is the category of data you would be sending to an outside model.


Blue Shield of California: 4.7 million records, and no one got hacked

The fourth-largest healthcare data breach of 2025 deserves its own paragraph, because it is the cloud-vendor scenario in the public record, exactly as it would play out with any outside AI service.

Blue Shield of California disclosed that its website had been sending personal information, and in some cases protected health information, to third parties including Meta and Google through tracking tools embedded on its pages. No hacking group was involved. No ransomware payment was demanded. The exposure was caused by ordinary website integrations with outside cloud services, used in good faith and not fully understood. The eventual count of affected individuals was 4.7 million.[4]

The pattern that matters for cloud AI

A regulated entity, in normal operation, sent regulated data to outside platforms it had voluntarily integrated with. The platforms were not hostile. The integration was not a mistake of policy; it was a feature of how the platforms work. The data left the network anyway, and the breach was real. That is the same shape as routing PHI to a cloud AI service. The institution is responsible for the data. The outside service is in the chain. The chain leaks.


What the 2025 record actually argues for

Two things, neither of them complicated.

First, the third-party surface is where a substantial share of breach risk lives. More than a third of 2025 large healthcare breaches happened on the business-associate side. Adding another business associate, especially one whose value proposition depends on processing more PHI faster, expands that surface. The 35.8% number is the price tag, attached to the public record.

Second, the form of the data being protected is exactly the form cloud AI requests need. 86% of 2025 breaches involved PHI on network servers or in email accounts, which is the digital, network-accessible category. The cloud AI question is not whether the model is trustworthy. It is whether your data should be in the form that leaks.

The argument the record makes is not "do not use AI." It is "decide where the AI runs before you decide which model to use, because the deployment choice sets the breach surface."


What changes when the AI runs inside your network

An on-premises deployment does not eliminate breach risk. Ransomware still exists, insider risk still exists, and the security work of running a regulated system still has to be done. The honest claim is narrower than that.

When the AI runs inside your environment, the data does not get handed to an outside vendor to get an answer. The cloud-vendor category of risk, the one that 35.8% of 2025 breaches sit in, does not get added to your stack. The 86% category, data sitting on network servers and in email, is still your category to defend, but it is yours alone rather than yours plus a chain of third parties whose security posture you cannot fully see.

That is the trade Cognetryx is built for. The platform deploys inside your environment, indexes your own documents, serves the model privately, and keeps every prompt, every answer, and every cited source where you already keep your other records. The on-premises decision is the part of the architecture that determines which categories in the breach record you are exposed to.

See what on-premises AI looks like with your own data

A short AI Strategy Assessment maps where cloud AI is creating breach surface in your institution and what running the model inside your own network would take. No data leaves your walls to find out.

Book a free AI Strategy Assessment

Frequently asked questions

How many healthcare data breaches were reported in 2025?

OCR received 710 large healthcare data breach reports in 2025, affecting more than 61.5 million individuals. A large breach is one involving 500 or more people. Hacking and other IT incidents continued to dominate the cause column, as they have for several years.

What share of healthcare data breaches happened at third-party vendors?

35.8% of 2025 healthcare data breaches happened at business associates, the third-party vendors that handle protected health information on behalf of healthcare providers, health plans, and clearinghouses. Healthcare providers were 57.5%, health plans 6.5%, and clearinghouses 0.3%. A cloud AI service that touches PHI is a business associate by another name.

Where was the breached data sitting?

61.5% of 2025 healthcare data breaches involved PHI on network servers, and 24.9% involved compromised email accounts. Together, that is 86% of breaches involving data in digital, network-accessible form. Paper and films accounted for 5.6%, and electronic medical records for 4.6%.

Does on-premises AI eliminate breach risk?

No. On-premises AI removes a specific category of risk, the addition of a new business associate processing PHI outside your network. It does not remove ransomware risk, insider risk, or the general security work of running a regulated system. The point is to subtract the risks that come with sending data to outside services, not to claim a property no architecture can offer.

Is the Blue Shield of California case really a cloud-AI parallel?

The mechanism is the same. Regulated data flowed from a covered entity's environment to outside cloud services through a feature of how the integration worked, not through a hostile act, and ended up in the year's top breaches. Cloud AI services are another set of outside cloud services that PHI flows to during normal use. The category of risk is identical even though the platforms differ.


Methodology

This page analyzes large healthcare data breaches as reported to the HHS Office for Civil Rights for calendar year 2025. A "large" breach is one affecting 500 or more individuals, the threshold at which HIPAA requires the breach be reported to OCR and posted in the public list known informally as the "wall of shame." Source figures and category percentages are drawn from the OCR breach portal and HIPAA Journal's annual compilation of that portal, which lists each breach with its covered entity, state, entity type, individuals affected, type of breach, and location of breached information.

Percentages are calculated on the share of breach incidents, not the share of individuals affected, unless noted otherwise. The 2025 total of 710 breaches reflects the OCR portal as of HIPAA Journal's February 2026 compilation. Late additions to the portal will adjust the total slightly, including for breaches under continuing review.
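The difference between the two bases can be sketched in a few lines. This is an illustrative example only: the field names `entity_type` and `individuals_affected` and both function names are assumptions, not the portal's actual export schema, and the sample rows are made up rather than taken from the breach record.

```python
from collections import Counter, defaultdict


def incident_shares(rows, field):
    """Percent of breach incidents per category (the basis used on this page)."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}


def individual_shares(rows, field, weight="individuals_affected"):
    """Percent of affected individuals per category (the alternative basis)."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[field]] += row[weight]
    total = sum(sums.values())
    return {cat: round(100 * n / total, 1) for cat, n in sums.items()}


# Made-up rows for illustration only; not records from the OCR portal.
sample_rows = [
    {"entity_type": "Healthcare Provider", "individuals_affected": 1200},
    {"entity_type": "Business Associate", "individuals_affected": 50000},
    {"entity_type": "Healthcare Provider", "individuals_affected": 700},
    {"entity_type": "Health Plan", "individuals_affected": 900},
]

by_incident = incident_shares(sample_rows, "entity_type")
by_individual = individual_shares(sample_rows, "entity_type")
```

In the made-up sample, the business associate accounts for one incident in four but for most of the affected individuals, which is why the basis has to be stated alongside any percentage.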

This analysis is informational and not legal or compliance advice. Confirm how any rule applies to your institution with your own counsel and examiners.

Sources

  1. HHS Office for Civil Rights, Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information. The primary public list of large healthcare data breach reports. ocrportal.hhs.gov/ocr/breach/breach_report.jsf
  2. HIPAA Journal, 2025 Healthcare Data Breach Report, February 2026. Annual compilation of OCR breach portal data, including the 710 breach total, the 61.5 million affected individuals, the 61.5% network-server and 24.9% email distribution, the 57.5% / 35.8% / 6.5% / 0.3% entity breakdown, and the 22% ransomware share. hipaajournal.com/2025-healthcare-data-breach-report
  3. U.S. Department of Health and Human Services, Business Associates. HIPAA's definition of a business associate and the obligations that travel with the role. hhs.gov/hipaa/for-professionals/privacy/guidance/business-associates
  4. HIPAA Journal, Blue Shield of California Google Ads Data Breach. Reporting on the website-tracking-tool exposure of PHI to Meta and Google, which affected 4.7 million individuals. hipaajournal.com/blue-shield-of-california-google-ads-data-breach