I understand that SHA-256 is a poor choice for hashing passwords because it's too fast, which lets attackers brute-force the hashes of commonly used passwords. I recently used the Google Ads API and other ad platforms to upload conversions. Each uploaded conversion includes the user's email, hashed with SHA-256. I'm wondering why SHA-256 is okay for hashing emails in this case. If an attacker managed to obtain the database of hashed emails, wouldn't they be able to easily recover a user's original email?

Tag:google-ads-api, sha256

2 comments.

  1. dorian

    I'd say there are two main factors.

    Email addresses are typically much longer than passwords. I found a couple of sources that place the average length at around 20 to 25 characters. Even if we go with the lower estimate and only allow lower-case characters, that already gives 26^20, or around 2e28, possible strings.

    Compare that to passwords, where a standard requirement might be a minimum length of 8 characters, including at least one uppercase letter, one lowercase letter, one special character, and one digit. That gives you an alphabet of 94 symbols, but still only about 6e15 individual combinations because of the shorter length.

    So, naively brute-forcing SHA256 hashes of potential email addresses will take on the order of a trillion times longer than for passwords. Of course, this completely ignores important aspects such as email addresses being much more "regular" than passwords, but it should still give an idea of the different complexities.
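    The estimates above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the alphabet sizes and lengths are the simplifying assumptions from this comment, not measured data, and the variable names are made up for illustration.

```python
# Rough search-space estimates under the simplifying assumptions above.
EMAIL_ALPHABET = 26      # lower-case letters only
EMAIL_LENGTH = 20        # lower end of the quoted average length
PASSWORD_ALPHABET = 94   # printable ASCII symbols, excluding space
PASSWORD_LENGTH = 8      # typical minimum password length

email_space = EMAIL_ALPHABET ** EMAIL_LENGTH
password_space = PASSWORD_ALPHABET ** PASSWORD_LENGTH
ratio = email_space / password_space

print(f"email candidates:    {email_space:.1e}")
print(f"password candidates: {password_space:.1e}")
print(f"ratio:               {ratio:.1e}")
```

    The ratio lands in the trillions, which is where the "trillion times longer" figure comes from. Real email addresses are far from uniformly random strings, so treat these numbers as an upper bound on the attacker's work.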

    Email addresses in that specific scenario aren't really all that interesting. I'd argue that most customer lists used in Google Ads are probably rather smallish, in the sense that they wouldn't be of much value to a spambot operator, who is ideally after data sets of millions of fresh email addresses.

    And the general fact that email address xyz@gmail.com is included in the customer list of company ABC in itself also isn't of much worth to an attacker.
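    To make the "regularity" caveat concrete: an attacker who already holds a list of candidate addresses doesn't need to brute-force anything, they only need to hash each candidate and look it up. Below is a minimal sketch of that check. The addresses, the `leaked_hashes` set, and the trim-and-lower-case normalization are all assumptions made up for illustration, not the exact rules of any particular ad platform.

```python
import hashlib

def sha256_hex(email: str) -> str:
    # Assumed normalization: trim whitespace and lower-case before hashing.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical set of hashed emails that an attacker got hold of.
leaked_hashes = {
    sha256_hex("xyz@gmail.com"),
    sha256_hex("someone@example.org"),
}

# Hypothetical candidate list, e.g. taken from some earlier breach.
candidates = ["abc@gmail.com", "xyz@gmail.com", "nobody@example.com"]

# Each candidate costs one hash and one set lookup.
recovered = [c for c in candidates if sha256_hex(c) in leaked_hashes]
print(recovered)
```

    All this reveals is which already-known addresses appear in the list, which is exactly the kind of information I'd argue is of little worth here.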

  2. Jerome Doucet

    Don't forget that security is all about tradeoffs and is never absolute. Hash functions recommended for passwords, like bcrypt, are far slower and more resource-intensive to use than the SHA* family of functions.

    That makes them more robust against brute-force attacks, but it also means you can't use them in some scenarios.

    For instance, if you need to compute hashes in a high-throughput process, bcrypt will just be a pain in the ass, while SHA256 will be fast enough and still offer decent cryptographic security (SHA256 is still considered strong).
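    A rough way to see the throughput gap is to time both kinds of function. The sketch below uses `hashlib.pbkdf2_hmac` from the Python standard library as a stand-in for bcrypt (which is not in the standard library); the iteration count and the `time_it` helper are arbitrary choices for illustration, and absolute timings will vary by machine.

```python
import hashlib
import os
import time

data = b"user@example.com"
salt = os.urandom(16)

def time_it(fn, repeats):
    # Average wall-clock seconds per call over `repeats` calls.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# Plain SHA-256: a single fast hash per input.
sha_seconds = time_it(lambda: hashlib.sha256(data).digest(), 10_000)

# Deliberately slow key derivation (stand-in for bcrypt), 100,000 iterations.
kdf_seconds = time_it(
    lambda: hashlib.pbkdf2_hmac("sha256", data, salt, 100_000), 3
)

print(f"SHA-256: {sha_seconds:.2e} s per hash")
print(f"PBKDF2:  {kdf_seconds:.2e} s per hash")
print(f"slowdown factor: {kdf_seconds / sha_seconds:.0f}x")
```

    Note also that password hashes use a per-user salt, so the same input produces a different hash each time it is stored. That is the opposite of what you want when two parties need to compute the same hash for the same email.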

    In your case, I suppose that this choice from the API designer came from such considerations.
