I used MD5 for password hashing back in the day (a few years ago). I used it in combination with a salt, and it was enough to secure the web applications I worked on at the time. Now, MD5 is broken. Not that the algorithm itself was changed, but techniques have been developed to craft different inputs that produce the same hash. Also, GPUs are powerful enough to brute-force passwords. Ask LinkedIn about that. But the question now is, since we are all moving to more secure methods of protecting passwords, are there any uses left for good old MD5?
The answer is yes. MD5 should never have been used for cryptography, but we only learn from our mistakes. There are still many tasks that can be simplified with the use of this hashing algorithm. One of the places I use it is for comparing two entities.
Fast Image comparison
Years ago I wrote a small web app that takes screenshots of websites. What it does is open a web page, wait for it to render, then save it as a PNG. I opened it up to the web and in a matter of days, it filled up the entirety of my small hard drive with images. Most if not all images were duplicates of the same page. A person would request the same page 3 to 4 times and I would create 6 different images (different resolutions) and an original for each request.
To fix the problem, I decided to do an image comparison before saving the image. Reading a PNG is complex work. I wrote a simple tutorial for a basic format like Bitmap, and even that was overwhelming. Trying to read a PNG, decompress it, then compare it pixel by pixel would have been overkill. Instead, I used an MD5 hash.
When a request for a screenshot came in, I took the screenshot, kept it in memory, and MD5ed it into a 32-character string. Then I loaded the old version into memory, MD5ed that one too, and compared the two strings.
// Check the cache first
...
if ($md5ImageA === $md5ImageB) {
    // Identical to the stored image: no need to save it again
    serveImageB();
    exit;
}
Comparing two 32-character strings is much easier than comparing two long binaries, and MD5 is fast enough to run on the fly.
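As a rough sketch of that check, the comparison could be wrapped in a small helper. The function name `isDuplicateScreenshot` and the idea of keeping the previous capture at a path on disk are assumptions made for illustration, not the code the app actually used. `md5()` and `md5_file()` are PHP built-ins.

```php
<?php
// Hypothetical helper: returns true when the new screenshot is byte-identical
// to the image already stored at $storedPath.
function isDuplicateScreenshot(string $newImageBytes, string $storedPath): bool
{
    if (!is_file($storedPath)) {
        return false; // nothing stored yet, so it cannot be a duplicate
    }
    // md5() hashes a string held in memory; md5_file() hashes a file on disk.
    return md5($newImageBytes) === md5_file($storedPath);
}
```

Keep in mind that this only catches byte-identical captures. If a single pixel renders differently, the two hashes will not match and the image is treated as new.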
String Comparison.
Some URLs, in an attempt to make them SEO friendly, end up being very long. Those URLs are usually saved in a database field as VARCHAR. In MySQL, an InnoDB index key prefix cannot be longer than 767 bytes (with the older COMPACT and REDUNDANT row formats), so a long URL column cannot be fully indexed. This means the bigger your table, the slower it will be to fetch content by its URL.
The solution is to keep the URL field as long as necessary and add another, indexed field: a VARCHAR of length 32. This field will contain the 32-character hash of the URL. Remember, an MD5 hash is always 32 hexadecimal characters, no matter what the input is. So when you get a particular URL, you can search the database by its MD5 hash.
$url = get_url();
$md5Url = md5($url);
$article = Db::get_article_by_hash($md5Url);
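To make the fixed-length property concrete, here is a minimal sketch that uses a plain PHP array as a stand-in for the indexed hash column. The `urlHash` helper, the `$articlesByHash` array, and the example URL are all hypothetical. The extra comparison against the full URL is my own precaution for the unlikely case of two URLs sharing a hash.

```php
<?php
// Hypothetical stand-in for a table with an indexed 32-character hash column.
$articlesByHash = [];

function urlHash(string $url): string
{
    return md5($url); // always 32 lowercase hexadecimal characters
}

// A URL far longer than 767 bytes still maps to a 32-character key.
$longUrl = 'https://example.com/blog/' . str_repeat('a-very-long-seo-friendly-slug-', 40);
$articlesByHash[urlHash($longUrl)] = ['url' => $longUrl, 'title' => 'Example article'];

// Look the article up by hash, then confirm the full URL really matches.
$found = $articlesByHash[urlHash($longUrl)] ?? null;
$article = ($found !== null && $found['url'] === $longUrl) ? $found : null;
```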
I also use MD5 to generate a unique key to cache data with Memcached. To avoid connecting to the database, I can save the records into memory using their URL for example.
$url = get_url();
$md5Url = md5($url);
$article = $memcache->get($md5Url);
if ($article === false) {
    // Cache miss: load from the database and store it for next time
    $article = Db::get_article_by_hash($md5Url);
    $memcache->set($md5Url, $article);
}
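The same pattern can be sketched without a running Memcached server. In the version below, `ArrayCache` is a hypothetical in-memory stand-in that only mimics the `get()`/`set()` shape used above, and `fetchArticle` and the `article_` key prefix are assumptions added for illustration.

```php
<?php
// Hypothetical in-memory stand-in with the same get()/set() shape as above.
final class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false; // false signals a cache miss
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}

function fetchArticle(ArrayCache $cache, string $url, callable $loadFromDb)
{
    // The prefix keeps article keys apart from other kinds of cached data.
    $key = 'article_' . md5($url);
    $article = $cache->get($key);
    if ($article === false) {
        $article = $loadFromDb($url); // only runs on a cache miss
        $cache->set($key, $article);
    }
    return $article;
}
```

Passing the loader in as a callable means the database is only touched on a miss, and the hashed key stays the same short length however long the URL is.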
Initial file comparison.
Doing a file diff can be a complex task. It requires checking the files line by line and using a complex algorithm to see what was added and what was removed. If two files are the same, you would only know after you have run your algorithm over every line. Instead, you can do a quick MD5 of both files and check whether the resulting strings are the same before you start running your algorithm.
if ($md5FileA !== $md5FileB) {
    // The files differ, so the full diff is worth running
    runAlgorithm();
}
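One way to produce those two hashes is PHP's built-in `md5_file()`. The sketch below wraps the pre-check in a hypothetical `filesAreIdentical` helper. The `filesize()` comparison in front of it is my own addition, on the assumption that files of different sizes are common and not worth hashing at all.

```php
<?php
// Hypothetical helper: a cheap identity check to run before a full diff.
function filesAreIdentical(string $pathA, string $pathB): bool
{
    // Files of different sizes can never be identical, and filesize()
    // does not need to read their contents.
    if (filesize($pathA) !== filesize($pathB)) {
        return false;
    }
    return md5_file($pathA) === md5_file($pathB);
}
```

The diff algorithm then only needs to run when `filesAreIdentical()` returns false.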
Conclusion.
In the cases I have listed, MD5 is not used for its cryptographic features but as a time saver to quickly compare sets of data. We may no longer use it to protect our passwords or secrets, but we can still use it for its ability to generate a unique-enough fingerprint of a piece of data. There are many other places you can use MD5, but just remember that if you are using it to hide something, it can be broken.