Welcome to the URLTeam website. The URLTeam is the ArchiveTeam subcommittee on URL shorteners. We believe that they pose a serious threat to the internet's integrity. If one of them dies, gets hacked or sells out, millions of links will stop working. Thus we preemptively release backups, because URL shorteners are too busy to make backups themselves.

Releases

Every 6 months or so we release a torrent of all backed-up files. When a new torrent is released, you can simply delete the old torrent and download the new files to the same location. Your BitTorrent client will figure out which files have changed and will redownload them, even if you had not finished downloading the previous torrent.

The latest torrent was released on July 20th, 2013: urlteam.torrent - List of files - Readme

For our previous releases, see the releases subdirectory. The next release is planned for around January 2014.

Data format

All data files in the torrent are plain text files compressed with LZMA2/xz. The text format is very simple: each line contains one mapping in the following format: shortcode, pipe (ASCII 0x7C), long URL, line feed (ASCII 0x0A). The file is sorted by shortcode using the following order:

Depending on the URL shortener there might be multiple long URLs for one shortcode.
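
As an illustration of the format described above, here is a minimal Python sketch that reads one xz-compressed data file and groups long URLs by shortcode. The function names, the UTF-8 decoding, and the choice to split on the first pipe only (assuming shortcodes never contain a pipe, while long URLs might) are assumptions made for this example and are not part of the format description.

```python
import lzma
from collections import defaultdict


def parse_mappings(lines):
    """Group long URLs by shortcode from an iterable of text lines."""
    mappings = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumption: split on the first pipe only, so that a pipe
        # inside the long URL stays part of the URL.
        shortcode, sep, long_url = line.partition("|")
        if not sep:
            continue  # no pipe found: skip the malformed line
        # One shortcode may map to several long URLs, so collect a list.
        mappings[shortcode].append(long_url)
    return dict(mappings)


def read_mappings(path, encoding="utf-8"):
    """Read an xz-compressed mapping file (hypothetical helper)."""
    # Assumption: UTF-8 with replacement of undecodable bytes; the
    # actual encoding of the data files is not stated here.
    with lzma.open(path, "rt", encoding=encoding,
                   errors="replace", newline="\n") as handle:
        return parse_mappings(handle)
```

Since individual files can be a few hundred megabytes, holding a whole file in a dictionary may use a lot of memory; iterating over the lines and processing each mapping as it is read is a reasonable alternative.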

Q&A

Can you do a backup of shortener XY please?

Maybe. Some shorteners are very fast at banning scrapers, which makes it impossible to do a backup efficiently. Contact us and we will look into it.

What about 301Works.org? They help URL shorteners with backups.

Unfortunately, they rely on the cooperation of URL shorteners, and many of the biggest URL shorteners refuse to cooperate. Furthermore, they don't plan on releasing any data files to the public. We do, however, greatly value their work, and when selecting which URL shorteners to scrape next, we concentrate especially on those that don't cooperate with 301Works.

Since March 2011 we have been actively uploading data from non-cooperating shorteners to the 301Works archive. While those files are not available for download (they contain the same data as our torrents anyway), you can watch our progress here.

I like what you do. Can I help?

Sure thing. We can always use people who help with scraping. Or programmers. We could also use a fast server with lots of space for storing the data and seeding the torrents. Anyhow, if you want to help, please contact us.

What's with all those weird directory and file names in the torrent?

It's hard to organize that stuff into files so that each individual file is only a few hundred megabytes in size. Because we want to accommodate people with inferior operating systems, we also need to assume a case-insensitive file system.

Contact

You can contact us via IRC in #urlteam on EFNet or leave a message on our ArchiveTeam Wiki talk page (requires free registration).