If you know where to look, plenty of secrets can be found online. Since the fall of 2021, independent security researcher Bill Demirkapi has been building ways to tap into huge data sources, which are often overlooked by researchers, to find masses of security problems. This includes automatically finding developer secrets—such as passwords, API keys, and authentication tokens—that could give cybercriminals access to company systems and the ability to steal data.

Today, at the Defcon security conference in Las Vegas, Demirkapi is unveiling the results of this work, detailing a massive trove of leaked secrets and wider website vulnerabilities. Among at least 15,000 developer secrets hard-coded into software, he found hundreds of username and password details linked to Nebraska’s Supreme Court and its IT systems; the details needed to access Stanford University’s Slack channels; and more than a thousand API keys belonging to OpenAI customers.

A major smartphone manufacturer, customers of a fintech company, and a multibillion-dollar cybersecurity company are counted among the thousands of organizations that inadvertently exposed secrets. As part of his efforts to stem the tide, Demirkapi hacked together a way to automatically get the details revoked, making them useless to any hackers.

In a second strand to the research, Demirkapi also scanned data sources to find 66,000 websites with dangling subdomain issues, making them vulnerable to various attacks including hijacking. Some of the world’s biggest websites, including a development domain owned by The New York Times, had the weaknesses.

While the two security issues he looked into are well-known among researchers, Demirkapi says that turning to unconventional datasets, which are usually reserved for other purposes, allowed thousands of issues to be identified en masse and, if expanded, offers the potential to help protect the web at large. “The goal has been to find ways to discover trivial vulnerability classes at scale,” Demirkapi tells WIRED. “I think that there’s a gap for creative solutions.”

Spilled Secrets; Vulnerable Websites

It is relatively trivial for a developer to accidentally include their company’s secrets in software or code. Alon Schindel, the vice president of AI and threat research at the cloud security company Wiz, says there’s a huge variety of secrets that developers can inadvertently hard-code, or expose, throughout the software development pipeline. These can include passwords, encryption keys, API access tokens, cloud provider secrets, and TLS certificates.

“The most acute risk of leaving secrets hard-coded is that if digital authentication credentials and secrets are exposed, they can grant adversaries unauthorized access to a company’s code bases, databases, and other sensitive digital infrastructure,” Schindel says.

The risks are high: Exposed secrets can result in data breaches, hackers breaking into networks, and supply chain attacks, Schindel adds. Previous research in 2019 found thousands of secrets were being leaked on GitHub every day. And while various secret scanning tools exist, these largely are focused on specific targets and not the wider web, Demirkapi says.

During his research, Demirkapi, who first found prominence for his teenage school-hacking exploits five years ago, hunted for these secret keys at scale—as opposed to selecting a company and looking specifically for its secrets. To do this, he turned to VirusTotal, the Google-owned website, which allows developers to upload files—such as apps—and have them scanned for potential malware.