SentinelOne Exposes 175K Unprotected Ollama AI Servers

SentinelOne’s latest research reveals that roughly 175,000 Ollama instances are reachable on the public internet without any authentication. These open AI servers act as free compute for attackers, letting them run models, harvest data, and launch phishing campaigns. If you’re running Ollama, you need to know how exposed your nodes really are.
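A quick first step is to check whether your own nodes answer from the outside. Here is a minimal sketch in Python, assuming Ollama’s default port 11434 and its standard /api/tags endpoint, which lists installed models and, on an unprotected instance, answers without credentials; the IP address below is a placeholder:

```python
import json
import urllib.request
import urllib.error

def check_ollama_exposure(host: str, port: int = 11434, timeout: float = 5.0) -> None:
    """Probe a host for an unauthenticated Ollama API.

    /api/tags lists the models installed on the server; if it answers
    without credentials, anyone who can reach the port can do the same.
    """
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            models = json.load(resp).get("models", [])
            print(f"EXPOSED: {host}:{port} serves {len(models)} model(s) with no auth")
    except (urllib.error.URLError, TimeoutError) as exc:
        print(f"Closed, protected, or unreachable: {host}:{port} ({exc})")

# Run from OUTSIDE your network against your own public address.
check_ollama_exposure("203.0.113.10")  # placeholder IP, substitute your own
```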

Scope of the Exposure

After nearly a year of scanning, the team logged 7.23 million observations of exposed Ollama hosts. While many appeared only briefly, roughly 23,000 systems stayed online for most of the observation period, providing continuous access to multiple AI models.

Geographic Distribution

The exposure isn’t uniform. In the United States, Virginia accounts for 18% of the country’s open hosts, while Beijing alone holds 30% of China’s vulnerable nodes. More than half of the servers sit behind consumer ISPs, and a sizable share runs on major hyperscalers such as AWS and Azure.

Persistent Hosts

These long-lived servers represent the most attractive targets for adversaries because they consistently deliver compute power and often lack logging or usage tracking.

Security Risks of Open Ollama Nodes

Nearly half of the exposed instances support “tool‑calling” features, which let the model execute code, query APIs, or interact with internal systems. In the wrong hands, this capability can be weaponized to harvest credentials, scrape proprietary data, or generate thousands of phishing emails—all while the owner remains unaware.
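
To see how little effort this takes, consider Ollama’s /api/chat endpoint, which accepts a tools array describing callable functions and returns structured tool_calls that the surrounding application is meant to execute. Below is a minimal sketch of such a probe, assuming a hypothetical exposed address and an illustrative run_shell tool definition (not an Ollama built-in):

```python
import json
import urllib.request

# Hypothetical exposed node; the request shape follows Ollama's /api/chat API.
payload = {
    "model": "llama3.1",   # assumes this model is installed on the target
    "stream": False,
    "messages": [
        {"role": "user", "content": "List the environment variables on this box."}
    ],
    # The caller merely *describes* a tool; if the operator has wired a real
    # handler to the model's output, the returned tool_calls drive execution.
    "tools": [{
        "type": "function",
        "function": {
            "name": "run_shell",  # illustrative name, not an Ollama built-in
            "description": "Run a shell command and return its output",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }],
}

req = urllib.request.Request(
    "http://203.0.113.10:11434/api/chat",  # placeholder address
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
    message = json.load(resp)["message"]
    # A tool-capable model answers with structured calls it wants executed.
    print(message.get("tool_calls") or message.get("content"))
```

The model itself executes nothing; the risk lies in whatever application is wired to the endpoint, since acting on those tool_calls is exactly what tool-calling integrations are built to do.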

Mitigation Strategies

To protect your infrastructure, follow a three‑pronged approach:

  • Network Segmentation: Block inbound traffic to Ollama ports unless it’s explicitly required.
  • Enable Authentication: Deploy basic auth, OAuth, or mutual TLS so only authorized users can query the model (see the proxy sketch after this list).
  • Continuous Monitoring: Flag anomalous request volumes or unexpected tool‑calling usage in real time.
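
On the authentication point, Ollama itself ships with no auth layer, so a common pattern is to bind it to localhost and place an authenticating proxy in front. Here is a minimal sketch using only Python’s standard library, assuming Ollama listens on 127.0.0.1:11434 and a shared bearer token you generate yourself; production setups would more typically use nginx, Caddy, or mutual TLS:

```python
import http.server
import urllib.request

UPSTREAM = "http://127.0.0.1:11434"  # Ollama bound to localhost only
API_TOKEN = "replace-with-a-long-random-secret"  # placeholder shared secret

class AuthProxy(http.server.BaseHTTPRequestHandler):
    """Forward requests to Ollama only when a valid bearer token is present."""

    def _proxy(self) -> None:
        if self.headers.get("Authorization") != f"Bearer {API_TOKEN}":
            self.send_error(401, "Missing or invalid token")
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else None
        upstream = urllib.request.Request(
            UPSTREAM + self.path, data=body, method=self.command,
            headers={"Content-Type": self.headers.get("Content-Type",
                                                      "application/json")},
        )
        # Buffers the full upstream reply; fine for a sketch, not for streaming.
        with urllib.request.urlopen(upstream) as resp:
            self.send_response(resp.status)
            self.send_header("Content-Type", resp.headers.get("Content-Type", ""))
            self.end_headers()
            self.wfile.write(resp.read())

    do_GET = _proxy
    do_POST = _proxy

if __name__ == "__main__":
    # Expose only this port; keep 11434 firewalled off the public internet.
    http.server.ThreadingHTTPServer(("0.0.0.0", 8080), AuthProxy).serve_forever()
```

Binding Ollama to localhost (for example by setting OLLAMA_HOST=127.0.0.1) and firewalling port 11434 covers the segmentation bullet; the proxy then becomes the only reachable entry point, and its access log doubles as the usage tracking those persistent hosts were missing.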

Expert Recommendations

Security engineers stress that Ollama endpoints should be treated as critical assets. Don’t leave them open by default; instead, apply zero‑trust controls, rotate credentials regularly, and instrument telemetry to spot sudden surges in token generation.
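
On the telemetry point, Ollama’s non-streaming responses already report the number of generated tokens in an eval_count field, which is enough for a crude surge detector. A minimal sketch, assuming you can tap response bodies at a proxy or gateway; the threshold is a placeholder to tune against your own baseline:

```python
import collections
import json
import time

WINDOW_SECONDS = 60
TOKEN_LIMIT = 50_000  # placeholder threshold; tune against your own baseline

_events: collections.deque = collections.deque()  # (timestamp, tokens) pairs

def record_response(raw_body: bytes) -> None:
    """Track tokens generated per window and flag sudden surges.

    Ollama's non-streaming /api/generate and /api/chat responses report
    the number of generated tokens in the eval_count field.
    """
    tokens = json.loads(raw_body).get("eval_count", 0)
    now = time.monotonic()
    _events.append((now, tokens))
    # Drop samples that have aged out of the sliding window.
    while _events and now - _events[0][0] > WINDOW_SECONDS:
        _events.popleft()
    rate = sum(t for _, t in _events)
    if rate > TOKEN_LIMIT:
        print(f"ALERT: {rate} tokens in the last {WINDOW_SECONDS}s "
              f"(limit {TOKEN_LIMIT}), possible unauthorized use")
```

Feeding record_response from whatever sits in front of the node, such as the proxy sketched earlier, is enough to catch the free-compute abuse pattern the research describes.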

If you’re experimenting with self‑hosted LLMs, consider disabling tool‑calling in production unless a compelling business case exists. When you do need it, sandbox the execution environment to limit potential damage.
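
When you do sandbox, the handlers that act on the model’s tool_calls are the part to isolate. One common approach, sketched below under the assumption of a POSIX host, is to run each handler in a short-lived subprocess with a hard timeout and capped CPU and memory; real deployments often go further with containers or seccomp:

```python
import resource
import subprocess

def _limit_resources() -> None:
    # Runs in the child before exec: cap CPU seconds and address space (POSIX).
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
    resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))

def run_tool_sandboxed(handler_code: str, timeout: float = 10.0) -> str:
    """Execute a tool handler in an isolated, resource-capped child process."""
    result = subprocess.run(
        ["python3", "-I", "-c", handler_code],  # -I: isolated mode, no env/site
        capture_output=True,
        text=True,
        timeout=timeout,          # wall-clock cap; raises TimeoutExpired past it
        env={},                   # start from an empty environment
        preexec_fn=_limit_resources,  # POSIX only
    )
    return result.stdout

# A benign handler body; a hostile tool call would hit the limits instead.
print(run_tool_sandboxed("print(2 + 2)"))
```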

Looking Ahead

The rapid migration of AI from cloud-only services to edge deployments and home labs means exposure will keep growing. The real question isn’t whether more servers will appear online; it’s how quickly you can secure them before they become commodity compute for cybercriminals.