How does a URL Shortener work?

UrlUrl ShortenerTinyurl

Url Problem Overview


I wonder how a URL Shortener works, like how they extract the text from address bar and map it to correct URL, later redirect it. What programming language do they use? How do they maintain the history of the mapping? How do they ensure the uniqueness of the shortened url? How can a lay man unmap it without visiting the URL?

Url Solutions


Solution 1 - Url

Wiki Is Your Friend

Basically, a website with a shorter name is used as a place holder, such as bit.ly.

Then, bit.ly generates a key for the user to provide, which is randomly generated to not repeat. With 35 character options and 8 or so values, do the math. That's a lot of possible keys. If a URL is equal to a previously existing key, I remember reading somewhere that they reuse keys as well.

They don't really use a specific programming language, they just use a simple URL redirect, which can be done with HTTP response status code 301, 302, 307 or 308, depending.

Solution 2 - Url

URL shortners just generate a shortcode, map the target URL to the shortcode, and provide a new URL. Visiting the URL performs a database lookup with the shortcode as a key, and redirects you to the target URL. There is no algorithmic association between a shortened URL and a destination URL, so you can't "unmap" it without going through the URL shortener's systems.

You can do it with any programming language and data store. Code generation is trivial to ensure uniqueness as well; if you had an incrementing primary integer key, you could simply encode the key as base62 and serve that. Since codes are incremental in nature, you'll never have a conflict.

Solution 3 - Url

The process is pretty simple actually: There a script that asks for the URL, generates a random string (and verifies that this string isn't already used), and puts the two in some kind of database. When you request a url, another script looks in the database for the random string, and if its found redirects you to the site.

This is of course more complicated in production due to needed features like abuse prevention, URL filtering, spam prevention, URL verification, etc. But those are pretty simple to implement.


The language is irrelevant, mostly any one will do.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionprap19View Question on Stackoverflow
Solution 1 - UrlDaniel G. WilsonView Answer on Stackoverflow
Solution 2 - UrlChris HealdView Answer on Stackoverflow
Solution 3 - UrlTheLQView Answer on Stackoverflow