What the Hash. A Simple Guide to Hashing
What is Hashing
Hashing, plainly put, is using hardcore mathematics to increase security on some "message". Hashing is a procedure that takes a variable-length value as input and spits out a fixed-length value as output, known as a hash, digest or thumbprint depending on the scenario the hashing is used in. This procedure is irreversible in the general sense, meaning the input cannot be computed back from the output, but some special procedures can still uncover the input value (more on that later). And most importantly, with a good hash function it is practically infeasible to find two different inputs that produce the same output.
Why Hashing
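Before the practical example, here is a minimal sketch of the properties described above, using SHA-256 from Python's standard `hashlib` module. The helper name `sha256_hex` and the sample messages are made up for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


short_digest = sha256_hex("hi")
long_digest = sha256_hex("a much longer message " * 1000)

# Both digests are 64 hex characters (256 bits), whatever the input length.
print(len(short_digest), len(long_digest))

# The same input always gives the same digest.
print(sha256_hex("hi") == short_digest)

# A one-character change gives a different digest.
print(sha256_hex("hi!") == short_digest)
```

The digest length depends only on the hash function, not on the message, which is why a hash fits neatly into a fixed-size database column.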
Now let's look at a practical example. Imagine a user registration and login scenario.
- User enters new credentials and clicks register, which posts the username and password to the server.
- Server grabs the password and sends it through a hash function (SHA256 in this case) and gets the hashed password in return.
- Server then stores the user's username and the corresponding hashed password in the database.
- User logs in using their username and password.
- Server, as in the earlier case, retrieves the password from the post request, runs it through the SHA256 function and obtains the hash value. (Typically the server would first check for the username in the database and continue with this and the following steps only if it exists.)
- Server then compares the obtained hash value with the one stored in the database for the user "munsif".
- DB returns true, confirming the match.
- Server responds with successful login for the user.
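The steps above can be sketched roughly as follows. This is an illustration, not a production design: the in-memory dictionary `user_table` stands in for the database, and the function names and the sample password are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical stand-in for the "User" table: username -> hashed password.
user_table = {}


def hash_password(password: str) -> str:
    """Run the password through SHA-256, as in the steps above."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    # Only the hash is stored; the plain-text password is never saved.
    user_table[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    stored_hash = user_table.get(username)
    if stored_hash is None:
        # Unknown username: stop before hashing anything.
        return False
    # compare_digest compares the two hashes in constant time.
    return hmac.compare_digest(stored_hash, hash_password(password))


register("munsif", "s3cret")
print(login("munsif", "s3cret"))   # correct password
print(login("munsif", "wrong"))    # wrong password
print(login("nobody", "s3cret"))   # unknown user
```

Note that the server never needs the original password again after registration. It only ever compares hashes.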
What is Salt and Why
The above scenario ensures that no one can read the password even if someone (most probably a rogue DBA) gets hold of the "User" table information.
But that is not the whole story. Now imagine a scenario where multiple users have the same password. What happens then? When that rogue DBA goes through the records, he can see that multiple users have the same hashed value for the password. Since a hash function always maps the same input to the same output, and two different inputs practically never share an output, it's obvious that those users have the same plain-text password as well. If he manages to figure out the plain-text password of one of those users (using a dictionary or pre-computed rainbow table attack), the other users' accounts are also in peril.
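To see the problem concretely, here is a small sketch in which two users pick the same password. The usernames other than "munsif" and all the passwords are invented for the example.

```python
import hashlib


def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical rows of the "User" table; two users picked the same password.
user_table = {
    "munsif": hash_password("password123"),
    "alice": hash_password("password123"),
    "bob": hash_password("something-else"),
}

# Group usernames by stored hash, the way someone reading the table could.
users_by_hash = {}
for username, stored_hash in user_table.items():
    users_by_hash.setdefault(stored_hash, []).append(username)

shared = [names for names in users_by_hash.values() if len(names) > 1]
print(shared)  # [['munsif', 'alice']]
```

Nothing was cracked here. Simply reading the table is enough to learn which accounts share a password.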
To overcome this, we can append some string to the password, say the username, and store the hash value of the result. But if you think about it, a similar issue arises in systems where the username is not unique. That is where the final wildcard comes into play: "salt". Salting, like the previous idea, adds a text to the password before hashing it, but in this case a random text. This random salt could be generated using a UUID library of sorts (i.e. it should be unique for each user). Finally, in the database, the salt is saved along with the username and the salted hashed value of the password.
| username | salted_hashed_password | salt |
|---|---|---|
| munsif | 4E367EB..... | newbie |
Here, when a user logs in, the server gets the posted username and checks whether it exists in the database. If it does, the server retrieves the salt from the table, appends it to the posted password and hashes the result. It then compares this with the hashed value in the table to authenticate the user.
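A minimal sketch of the salted flow might look like this. It follows the post's suggestion of a UUID-based salt (`uuid.uuid4()`); the standard library's `secrets.token_hex()` would be another way to get a random value. The dictionary `user_table`, the function names and the second username are assumptions for illustration.

```python
import hashlib
import hmac
import uuid

# Hypothetical "User" table: username -> (salted_hashed_password, salt).
user_table = {}


def hash_with_salt(password: str, salt: str) -> str:
    # Append the salt to the password, then hash the combined text.
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    salt = uuid.uuid4().hex  # random salt, generated once per user
    user_table[username] = (hash_with_salt(password, salt), salt)


def login(username: str, password: str) -> bool:
    row = user_table.get(username)
    if row is None:
        return False
    stored_hash, salt = row
    # Re-create the salted hash from the posted password and compare.
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))


register("munsif", "password123")
register("alice", "password123")

print(login("munsif", "password123"))
# Same password, different salts, so the stored hashes differ.
print(user_table["munsif"][0] == user_table["alice"][0])
```

Because every user gets a different salt, identical passwords no longer show up as identical rows, and a pre-computed table of unsalted hashes does not match any of them.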
Real-world use cases
Hashing is best known for its use in user authentication, but it is also used in many other cases, such as integrity protection and HTTPS certificate handling.
Integrity protection
Integrity protection is sort of an umbrella term that covers quite a lot of functions.
- Hashing is used to detect whether a message has been altered in transit, for example by a man-in-the-middle attack.
- Also, when you download software, the website often provides the checksum of the original file so that you can verify it against the file you downloaded.
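The download check in the last point can be sketched as below, assuming the website publishes a SHA-256 checksum as a hex string. The function names are made up for this example; the file is read in chunks so that large downloads do not have to fit in memory.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 checksum of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the file's checksum with the one shown on the website."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)
```

A call such as `matches_published_checksum("installer.bin", checksum_from_website)` (both names hypothetical) returns `False` if even a single byte of the file differs from the original.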
HTTPS certificate handling
- Client (browser) gets the certificate from the server on the initial request and stores it.
- When the user later accesses the site, the server responds with its certificate again.
- Browser compares the stored certificate's SHA1/SHA256 fingerprint with that of the newly received certificate and, if they match, allows the communication to flow through.
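The fingerprint comparison in the last step can be sketched like this. A fingerprint is simply the hash of the certificate's raw (DER-encoded) bytes. The byte strings below are placeholders rather than real certificates; in a real Python client, bytes of that kind can be obtained from `ssl.SSLSocket.getpeercert(binary_form=True)`. The function names are assumptions for this example.

```python
import hashlib
import hmac


def cert_fingerprint(der_bytes: bytes) -> str:
    """SHA-256 fingerprint as colon-separated uppercase hex pairs."""
    hex_digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(hex_digest[i:i + 2] for i in range(0, len(hex_digest), 2))


def same_certificate(stored_der: bytes, received_der: bytes) -> bool:
    """Compare the stored certificate with the newly received one."""
    return hmac.compare_digest(
        cert_fingerprint(stored_der), cert_fingerprint(received_der)
    )


# Placeholder bytes standing in for real DER-encoded certificates.
stored = b"placeholder certificate bytes"
print(same_certificate(stored, stored))          # same certificate
print(same_certificate(stored, stored + b"x"))   # altered certificate
```

Comparing two short fingerprints is much cheaper than comparing whole certificates, and any change to the certificate changes its fingerprint.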
This post is aimed at novice learners, to give a brief introduction to hashing.
Special thanks go to Mr. Tharindu Edirisinghe.