Anna’s Archive, the open-source search engine for shadow libraries, has announced that it has effectively “backed up Spotify” – archiving hundreds of millions of tracks’ worth of metadata and audio files, and distributing the whole thing via torrents.
Best known for preserving books and academic papers, the group says it scraped metadata for around 256 million tracks and audio files for roughly 86 million songs. By its own estimate, that accounts for about 99.6 percent of all listens on Spotify, bundled into a dataset weighing just under 300TB and sorted by ‘popularity’.
In a blog post, Anna’s Archive describes the project as the largest publicly available music metadata database in the world, framing it as a long-term, fully open “preservation archive” for modern music. The group argues that while chart-topping hits are unlikely to disappear, swathes of lesser-known music could be lost if streaming platforms drop licences or shut down entirely.
“A while ago, we discovered a way to scrape Spotify at scale. We saw a role for us here to build a music archive primarily aimed at preservation,” the post states. “This Spotify scrape is our humble attempt to start such a ‘preservation archive’ for music. Of course Spotify doesn’t have all the music in the world, but it’s a great start.”
According to Anna’s Archive, much of the audio originates from Spotify itself. Popular tracks are stored in their original 160kbps format, while less-played songs have been re-encoded into smaller files to save space. The archive is said to contain about 37 percent of all songs available on Spotify as of July 2025. Metadata torrents are already live, while the audio files are being released in stages, starting with the most-streamed tracks.
Legally, however, the project sits on very shaky ground. Spotify licenses the vast majority of its catalogue under strict agreements with labels and rights holders, and mass scraping and redistribution of audio files – preservation-minded or not – violates both its terms of service and copyright law in many jurisdictions.
Following the announcement, the streaming giant responded in a statement: “Spotify has identified and disabled the nefarious user accounts that engaged in unlawful scraping,” the company says. “We’ve implemented new safeguards for these types of anti-copyright attacks and are actively monitoring for suspicious behavior. Since day one, we have stood with the artist community against piracy, and we are actively working with our industry partners to protect creators and defend their rights.”

