The future is encrypted. Real-time, encrypted chat apps like Signal and WhatsApp, and messaging apps like Telegram, WeChat, and Messenger—used by two out of five people worldwide—help safeguard privacy and facilitate our rights to organize, speak freely, and keep close contact with our communities.
They are intentionally built for convenience and speed, for person-to-person communication as well as large group connections. Yet these same qualities have fueled abusive and illegal behavior, disinformation and hate speech, and hoaxes and scams, all to the detriment of the vast majority of their users. As early as 2018, investigative reports explored the role these very features played in dozens of deaths in India and Indonesia, as well as in elections in Nigeria and Brazil. The ease with which users can forward messages without verifying their accuracy means disinformation can spread quickly, secretly, and at significant scale. Some apps allow extremely large groups of up to 200,000 members, or have played host to organized encrypted propaganda machinery, breaking away from the original vision of emulating a “living room.” And some platforms have proposed profit-driven policy changes that allow business users to leverage customer data in new and invasive ways, ultimately eroding privacy.
In response to the harms that these apps have enabled, prominent governments have urged platforms to implement so-called backdoors or to employ client-side automated scanning of messages. But as many have pointed out, such measures erode everyone’s basic liberties and put many users at greater risk. These invasive measures, and other traditional moderation approaches that depend on access to content, are rarely effective for combating online abuse, as recent research by Stanford University’s Riana Pfefferkorn shows.
Product design changes, not backdoors, are key to reconciling the competing uses and misuses of encrypted messaging. While the content of individual messages can be harmful, it is the scale and virality with which they spread that presents the real challenge, turning sets of harmful messages into a groundswell of debilitating societal forces. Researchers and advocates have already analyzed how changes such as forwarding limits, better labeling, and smaller group sizes could dramatically reduce the spread and severity of problematic content, organized propaganda, and criminal behavior. However, such work relies on workarounds such as tiplines and public groups. Without good datasets from platforms, auditing the real-world effectiveness of these changes is hampered.
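To make concrete how such a design change can operate without ever touching message content, here is a minimal sketch of a client-side forwarding limit of the kind WhatsApp has deployed. The field names, the five-chat cap, and the hop threshold are illustrative assumptions, not any platform’s actual implementation.

```python
from dataclasses import dataclass, field

# Illustrative only: names and limits are assumptions, not any app's real schema.
FORWARD_LIMIT = 5            # max chats a single message may be forwarded to at once
HIGHLY_FORWARDED_HOPS = 5    # hop count at which a "forwarded many times" label appears

@dataclass
class OutgoingForward:
    message_id: str
    forward_hops: int                               # times this message has already been forwarded
    target_chat_ids: list = field(default_factory=list)

def apply_forwarding_policy(fwd: OutgoingForward) -> dict:
    """Enforce limits using only routing metadata; the encrypted payload is never read."""
    allowed_targets = fwd.target_chat_ids[:FORWARD_LIMIT]
    return {
        "allowed_targets": allowed_targets,
        "blocked_count": len(fwd.target_chat_ids) - len(allowed_targets),
        "label_as_highly_forwarded": fwd.forward_hops >= HIGHLY_FORWARDED_HOPS,
    }

# Example: forwarding a well-traveled message to eight chats at once.
result = apply_forwarding_policy(
    OutgoingForward(
        message_id="m-123",
        forward_hops=6,
        target_chat_ids=[f"chat-{i}" for i in range(8)],
    )
)
print(result["blocked_count"], result["label_as_highly_forwarded"])  # 3 True
```

The point of the sketch is that the check runs on counts and routing information alone, which is exactly why audits of such features need the aggregated data discussed below rather than access to message content.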
The platforms could do a lot more. For such important product changes to become more effective, companies need to share the “metadata of the metadata” with researchers: aggregated datasets showing how many users a platform has, where and when accounts are created, how information travels, which types of messages and formats spread fastest, which messages are commonly reported, and how (and when) users are booted off. To be clear, this is not what is typically meant by “metadata,” which usually describes a specific individual and can be deeply personal, such as one’s name, email address, mobile number, close contacts, and even payment information. It is important to protect that kind of personal metadata, which is why the United Nations Office of the High Commissioner for Human Rights rightly considers a user’s metadata to be covered by the right to privacy online.
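As a rough illustration of the difference, the kind of aggregate report being asked for might look like the sketch below. Every field name and number here is hypothetical; the point is that nothing in it refers to an identifiable person.

```python
# Hypothetical shape of an aggregated "metadata of the metadata" report.
# All fields and values are illustrative; no entry identifies an individual user.
daily_report = {
    "date": "2022-03-01",
    "country": "BR",
    "accounts_created": 412_000,
    "accounts_removed_for_abuse": 3_800,
    "messages_reported_by_users": 91_000,
    "forwards_by_hop_count": {"1": 5_200_000, "2-4": 1_900_000, "5+": 240_000},
    "fastest_spreading_formats": ["video", "image", "text"],
    "median_group_size": 12,
}
```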
Luckily, we do not need this level or type of data to start seriously addressing harms. Instead, companies must first be forthcoming with researchers and regulators about the nature and extent of the metadata they do collect, with whom they share it, and how they analyze it to shape product design and revenue model choices. We know for certain that many private messaging platforms collect troves of information that yield tremendous insights, which they use both to design and trial new product features and to entice investors and advertisers.
The aggregated, anonymized data they collect can, without compromising encryption and privacy, be used by platforms and researchers alike to shed light on important patterns. Such aggregated metadata could lead to game-changing trust and safety improvements through better features and design choices.
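One common safeguard before releasing such aggregates is to suppress small buckets so that rare combinations cannot single anyone out. The sketch below shows that idea with a hypothetical minimum bucket size; it is not a description of any platform’s actual pipeline.

```python
from collections import Counter

# Illustrative suppression rule: a bucket must cover at least MIN_BUCKET_SIZE
# accounts before it appears in a published aggregate (threshold is an assumption).
MIN_BUCKET_SIZE = 1000

def publishable_aggregate(bucket_counts: Counter) -> dict:
    """Drop any (country, day) bucket too small to share safely."""
    return {bucket: n for bucket, n in bucket_counts.items() if n >= MIN_BUCKET_SIZE}

raw = Counter({("BR", "2022-03-01"): 5_200_000, ("SM", "2022-03-01"): 37})
print(publishable_aggregate(raw))  # the 37-account bucket is suppressed
```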