I am learning about various hashing techniques and found an interesting library to start with, CryptoJS.
In the documentation, there are multiple options defined as below:
hashing
HMAC
PBKDF2
Ciphers
Encoders
I understand that hashing is about generating the ciphertext and HMAC is about generating a message authentication code. But I am struggling to differentiate between PBKDF2, Ciphers, and Encoders. Which one should I choose when?
Any pointers are helpful.
PBKDF2 (Password-Based Key Derivation Function 2) is a function used to create cryptographic keys that are harder to brute-force, using key stretching. This matters because humans are lazy and create passwords that are way too easy to brute-force.
For example: our favorite password is "password"
Given a salt of "5C52FBAE9A4D97A49D14C8AF338DA55C"
The cryptographic key becomes
(Hex)A2EB261802FFD1965D034AC252E880A44955078D6D4F12EDCDF6D03549F0
(B64)ousmGAL/0ZZdA0rCUuiApElVB41tTxLtzfbQNUnw
try it here
It becomes apparent that the hash is not as easy to break as "password" on its own.
Nevertheless, breaking it is still possible with pre-computed hashes. You can see more here.
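The derivation described above can be sketched as follows. This is a minimal sketch in Python's standard library rather than CryptoJS. The hash function (SHA-256), the iteration count and the key length are assumptions, since the example does not state them, so the output is not expected to match the key shown above. The salt is treated as hex-encoded bytes, which is also an assumption.

```python
import hashlib

# Values from the example above; the salt is assumed to be hex-encoded bytes.
password = b"password"
salt = bytes.fromhex("5C52FBAE9A4D97A49D14C8AF338DA55C")

# Assumed parameters: the original example does not state them.
ITERATIONS = 100_000
KEY_LENGTH = 32  # bytes


def derive_key(password: bytes, salt: bytes) -> bytes:
    """Stretch a password into a fixed-length key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=KEY_LENGTH)


key = derive_key(password, salt)
print(key.hex().upper())
```

The same password and salt always give the same key, while a different salt gives an unrelated key. The iteration count is what makes each guess expensive for an attacker.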
Ciphers, on the other hand, consist of methods for performing encryption as well as decryption. Some ciphers you see in CryptoJS are your basic AES, DES, Triple DES etc.
Encoders are simply used for encoding, where encoding is a very general term. It is largely used to transform data so that another system can understand it. In the technology field, this is largely because every system architecture and technology has its own interpretations. Different applications will understand different encodings as per their needs.
In Summary,
Encryption and encoding are designed to be two-way (reversible), whereas PBKDF2 is a method of generating cryptographic keys (hashes), which are designed to be one-way. Encoders are used to encode data into a form that can be transmitted or interpreted by another system.
Putting it in context:
If we want to store the password in a database, we hash it because we do not need to know what the password is (no reversal required). However, when we send an encrypted mail to a friend, we want to be able to reverse that encryption (decryption). Otherwise the content is lost. Suppose we add an attachment when the mail is sent. The attachment is encoded in a way that other email clients can decode; otherwise the other system cannot open the attachment or will wrongly interpret the data sent.
So encoding and encrypting are similar in that encoded text and encrypted text can both be reversed. However, encoded text is meant to be reversed by anyone or any system that gets its hands on it, since the encoding schemes are publicly available, whereas encrypted text (ciphertext) is meant to be reversed only by certain specified individuals, i.e. people who possess the key. In our example above, we want our attachment to be interpreted by any system, but we do not want the content of the email, including the attachment, to be opened by everyone.
PBKDF2 is used when you want to hash a password. With the usual hashing functions, your password is vulnerable to dictionary attacks, so here come PBKDF2 and a salt.
Ciphers: These are your normal encryption functions. Use them if you want to send an encrypted message that only someone with the right key can decrypt.
Encoders: These are for text encoding formats.
Related
I'm currently experimenting with AES encryption using Google Apps Script, and I found out about cCryptoGS.
It feels weird, as all ciphered texts seem to start with U2FsdGVkX1 (even when I change the "this is my passphrase" part in the example to something very different). I am not sure if I remember correctly, but I once tried AES on Node.js and it looked very different: I would get a completely different ciphered text even if I changed only a single character in either my message or my key.
Even in this post How is AES implemented in CryptoJS?, the ciphered text also starts with U2FsdGVkX1.
What I am asking is this: Does cCryptoGS actually do what it claims to do (i.e. apply AES encryption to a message)?
Here's the website: https://ramblings.mcpher.com/gassnippets2/cryptojs-libraries-for-google-apps-script/ There are also graphs on the site showing that Google Apps Script cannot handle complicated calculations well, so it looks legit, but the result seems so... weird... since, overall, AES seemed to be one of the best options for encryption.
If this is indeed how AES should work, is there any way that I can make it seem more random?
Thank you so much in advance.
CryptoJS can process both passphrases and keys for encryption and decryption. Strings are interpreted as passphrases, WordArrays as keys, see The Cipher Input.
cCryptoGS wraps the passphrase variant and supports the algorithms AES, DES, TripleDES and Rabbit, see Usage.
E.g. for AES, cCryptoGS/CryptoJS encrypts with AES-256, whereby a passphrase must be passed in addition to the plaintext.
Before encryption, a random 8-byte salt is generated, and a 32-byte key and a 16-byte IV are derived from the passphrase and salt with the OpenSSL key derivation function EVP_BytesToKey().
The result is generated in OpenSSL format for compatibility with OpenSSL. It consists of the ASCII encoding of Salted__ followed by the 8-byte salt and the actual ciphertext, with the entire expression Base64 encoded.
The Base64 encoding of Salted__ is U2FsdGVkX18=, where U2FsdGVkX1 is fixed (the last two characters depend on the 1st byte of the salt and can therefore change). Thus, any encryption starts with U2FsdGVkX1, but this does not reveal any information.
So yes, it is encrypted with AES-256, and the constant prefix U2FsdGVkX1 is not critical.
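The constant prefix can be checked with a minimal sketch of the OpenSSL layout described above, written in Python's standard library. The function name `openssl_envelope` is hypothetical, and the "ciphertext" bytes are random placeholders rather than real AES output. Only the layout is being illustrated.

```python
import base64
import os


def openssl_envelope(ciphertext: bytes) -> str:
    """Build the OpenSSL layout: b"Salted__" + 8-byte salt + ciphertext, Base64 encoded."""
    salt = os.urandom(8)
    return base64.b64encode(b"Salted__" + salt + ciphertext).decode("ascii")


# The ciphertext bytes here are random placeholders, not real AES output.
samples = [openssl_envelope(os.urandom(16)) for _ in range(5)]
for sample in samples:
    print(sample)
```

Every printed line starts with U2FsdGVkX1, while everything after that prefix differs from run to run because of the random salt.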
However, the key derivation function EVP_BytesToKey() is deemed insecure nowadays, especially with the parameters used by cCryptoGS/CryptoJS (broken MD5 digest and an iteration count of 1), see e.g. here, 3rd part, so its use cannot actually be recommended (apart from compatibility, maybe).
This applies to the wrapped functionalities that use passphrases for encryption/decryption. cCryptoGS also directly allows the use of CryptoJS functions, see CryptoJS direct, whose security is to be assessed individually.
The secure way is to pass key and IV directly, or when using a passphrase not to apply the built-in function EVP_BytesToKey(), but a reliable key derivation function like PBKDF2.
These variants are supported by CryptoJS, but apparently not by cCryptoGS, at least not by the wrapped functionalities.
Also note that at least the linked cCryptoGS sources seem to be based on CryptoJS version 3.1.2, which is from 2013, see cCryptoGS sources (the current CryptoJS version is 4.1.1).
Coda Hale's article "How To Safely Store a Password" claims that:
bcrypt has salts built-in to prevent rainbow table attacks.
He cites this paper, which says that in OpenBSD's implementation of bcrypt:
OpenBSD generates the 128-bit bcrypt salt from an arcfour
(arc4random(3)) key stream, seeded with random data the kernel
collects from device timings.
I don't understand how this can work. In my conception of a salt:
It needs to be different for each stored password, so that a separate rainbow table would have to be generated for each
It needs to be stored somewhere so that it's repeatable: when a user tries to log in, we take their password attempt, repeat the same salt-and-hash procedure we did when we originally stored their password, and compare
When I'm using Devise (a Rails login manager) with bcrypt, there is no salt column in the database, so I'm confused. If the salt is random and not stored anywhere, how can we reliably repeat the hashing process?
In short, how can bcrypt have built-in salts?
This is bcrypt:
Generate a random salt. A "cost" factor has been pre-configured. Collect a password.
Derive an encryption key from the password using the salt and cost factor. Use it to encrypt a well-known string. Store the cost, salt, and cipher text. Because these three elements have a known length, it's easy to concatenate them and store them in a single field, yet be able to split them apart later.
When someone tries to authenticate, retrieve the stored cost and salt. Derive a key from the input password, cost and salt. Encrypt the same well-known string. If the generated cipher text matches the stored cipher text, the password is a match.
Bcrypt operates in a very similar manner to more traditional schemes based on algorithms like PBKDF2. The main difference is its use of a derived key to encrypt known plain text; other schemes (reasonably) assume the key derivation function is irreversible, and store the derived key directly.
Stored in the database, a bcrypt "hash" might look something like this:
$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa
This is actually three fields, delimited by "$":
2a identifies the bcrypt algorithm version that was used.
10 is the cost factor; 2^10 iterations of the key derivation function are used (which is not enough, by the way. I'd recommend a cost of 12 or more.)
vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa is the salt and the cipher text, concatenated and encoded in a modified Base-64. The first 22 characters decode to a 16-byte value for the salt. The remaining characters are cipher text to be compared for authentication.
This example is taken from the documentation for Coda Hale's ruby implementation.
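The field layout described above can be sketched as a small parser. This is only an illustration in Python's standard library: the names `BcryptFields` and `split_bcrypt_hash` are hypothetical, and the code splits the stored string without verifying any password.

```python
from typing import NamedTuple


class BcryptFields(NamedTuple):
    version: str
    cost: int
    salt: str        # first 22 characters of the modified Base-64 payload
    ciphertext: str  # remaining characters, compared during authentication


def split_bcrypt_hash(stored: str) -> BcryptFields:
    """Split a stored bcrypt string into its fields without verifying anything."""
    # The string starts with "$", so the first element of the split is empty.
    _, version, cost, payload = stored.split("$")
    return BcryptFields(version, int(cost), payload[:22], payload[22:])


fields = split_bcrypt_hash("$2a$10$vI8aWBnW3fID.ZQ4/zo1G.q1lRps.9cGLcZEiGDMVr5yUP1KUOYTa")
print(fields)
print(2 ** fields.cost)  # number of key derivation iterations implied by the cost
```

Because the salt travels inside the stored string, no separate salt column is needed in the database.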
I believe that phrase should have been worded as follows:
bcrypt has salts built into the generated hashes to prevent rainbow table attacks.
The bcrypt utility itself does not appear to maintain a list of salts. Rather, salts are generated randomly and appended to the output of the function so that they are remembered later on (according to the Java implementation of bcrypt). Put another way, the "hash" generated by bcrypt is not just the hash. Rather, it is the hash and the salt concatenated.
In simple terms...
Bcrypt does not have a database in which it stores the salt...
The salt is added to the hash in Base64 format...
The question is: how does bcrypt verify the password when it has no database?
What bcrypt does is extract the salt from the password hash, use the extracted salt to encrypt the plain password, and compare the new hash with the old hash to see if they are the same.
To make things even clearer,
Registration/Login direction ->
A well-known string is encrypted with a key generated from the cost, the salt and the password. We call that encrypted value the cipher text. Then we attach the salt to this value and encode it using Base64. Attaching the cost to it gives the string produced by bcrypt:
$2a$COST$BASE64
This value is what eventually gets stored.
What would the attacker need to do in order to find the password? (other direction <-)
If the attacker gets control over the DB, the attacker can easily decode the Base64 value and will then be able to see the salt. The salt is not secret, though it is random.
Then he will need to work back from the cipher text to the password.
What is more important: there is no plain hashing in this process, but rather CPU-expensive encryption. Thus rainbow tables are less relevant here.
Let's imagine a table that has 1 hashed password. If a hacker gets access, he would know the salt, but he would have to run the calculation for a big list of common passwords and compare after each calculation. This will take time, and he would have cracked only 1 password.
Imagine a second hashed password in the same table. The salt is visible, but the same calculation needs to happen again to crack this one too, because the salts are different.
If no random salts were used, it would have been much easier. Why? If we use simple hashing, we can generate hashes for common passwords a single time (a rainbow table) and then do a simple table search, or a simple file search, between the DB table hashes and our pre-calculated hashes to find the plain passwords.
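The difference can be sketched with a tiny precomputed lookup table, which is a simplified stand-in for a real rainbow table. Plain SHA-256 is used only to keep the sketch short and is not a recommendation for password storage. The password list and the name `crack_salted` are hypothetical.

```python
import hashlib
import os

COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Precomputed once and reusable against every unsalted database.
lookup = {sha256_hex(p.encode()): p for p in COMMON_PASSWORDS}

# Unsalted storage: a single dictionary lookup recovers the password.
unsalted_hash = sha256_hex(b"password")
cracked = lookup.get(unsalted_hash)

# Salted storage: the precomputed table no longer contains the stored value.
salt = os.urandom(16)
salted_hash = sha256_hex(salt + b"password")
missed = lookup.get(salted_hash)


def crack_salted(salt: bytes, target: str):
    """Redo the hashing for every candidate; this work is repeated per salt."""
    for candidate in COMMON_PASSWORDS:
        if sha256_hex(salt + candidate.encode()) == target:
            return candidate
    return None


print(cracked, missed, crack_salted(salt, salted_hash))
```

The salted password can still be guessed, but only by repeating the whole loop for each stored salt, which is exactly the cost that a slow function like bcrypt multiplies.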
I would like to get both CryptoJS's SHA256 and PHP's crypt SHA256 output to match.
PHP's crypt takes a salt and a number of rounds. E.g. for 5000 rounds and a salt of "usesomesillystringforsalt" it would be:
$hash = crypt('Clear Text String', '$5$rounds=5000$usesomesillystringforsalt$');
I hope I'm not blind, but I can't find how to reproduce this behaviour in crypto-js. Its syntax doesn't seem to allow for rounds or a salt.
Is it possible, or should I just resort to using the basic PHP hash instead of crypt?
The CryptoJS API doesn't provide a way to specify a salt or the number of rounds for SHA256. You could add a salt manually if necessary, and specifying rounds doesn't make sense since "plain" SHA256 always uses a fixed number of rounds (64).
The number of rounds in PHP's crypt() actually defines how often the SHA256 algorithm is applied consecutively, in order to increase the complexity of brute force attacks. The PHP source code comments on this as follows: "Repeatedly run the collected hash value through SHA256 to burn CPU cycles".
As you can see in the source code (here and here), crypt() is actually a key derivation function that only makes use of SHA256 to generate cryptographically secure, salted hashes. So it doesn't simply append the given salt to the key; instead, it's a more elaborate process.
Therefore, it is not possible to get the same results with the algorithms provided by CryptoJS.
crypt() is mainly intended for password hashing. So if you need the hashes for another purpose, hash() is a good alternative (and of course creates exactly the same results as CryptoJS.SHA256()).
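For comparison, a minimal sketch of plain SHA-256 in Python's standard library, used here as a stand-in for PHP's hash('sha256', ...) and CryptoJS.SHA256(). Prepending the salt manually is only one possible convention and is an assumption. Neither value below reproduces the output of crypt(), which runs a more elaborate salted, multi-round scheme.

```python
import hashlib

message = b"Clear Text String"

# Plain SHA-256: no salt, no configurable rounds, same digest in any correct implementation.
plain = hashlib.sha256(message).hexdigest()

# A manually salted variant; prepending the salt is just one possible convention.
salt = b"usesomesillystringforsalt"
salted = hashlib.sha256(salt + message).hexdigest()

print(plain)
print(salted)
```

Since plain SHA-256 is fully specified, two libraries agree as long as they hash the same bytes, so the input encoding is the main thing to keep consistent.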
However, bear in mind that any cryptography with JavaScript is generally considered harmful. Hence, you should use SSL in your application, if possible, and generate the hashes server-side. If this is an option, have a look at bcrypt.
In PBKDF2 the salt should be unique for each password, so that two users using the same password get two different hashes.
My idea for the salt is a SHA1 hash of the username and the password, so it will be unique for each user.
I actually have to generate the PBKDF2 hash in a JavaScript environment. Is it safe to show how the salt is generated, given that JavaScript sources are plain text?
Using the username as a salt is mostly satisfactory, but it does not protect against several attack scenarios. For example, the password hashes can be compared across the databases of two websites that use the same algorithm. Furthermore, rainbow tables can be generated in advance, reducing the cracking time after a compromise.
For this exact reason, the salt should be generated using a cryptographic PRNG. The idea of having the source code of the random source visible to attackers isn't a problem in itself, if it is non-predictable. See this question for how to generate it using javascript.
What is wrong with simply using a random salt? Other than the fact that your method doesn't require the salt to be stored with the hashed password, I can't think of any advantages.
A hash of the username and password concatenated together should be okay. Since the password is secret the hash of the username and password will be unpredictable. However, as others have mentioned, there could be a security issue if there are two websites using the same algorithm, and someone using those websites uses the same credentials for both (i.e. looking at their hashed passwords we can tell they use the same password for each website). Adding the domain of your site should get around this. For example (+ is concatenation):
salt = hash(domain + username + password)
All of that being said, I would strongly recommend using a cryptographic random number generator to generate your salt since it's the standard practice for salt generation.
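A minimal sketch of the random-salt approach, in Python's standard library rather than JavaScript. The salt length, the iteration count and the helper names `new_salt` and `hash_password` are assumptions made for illustration.

```python
import hashlib
import secrets


def new_salt(num_bytes: int = 16) -> bytes:
    """Draw a fresh salt from the operating system's cryptographic random source."""
    return secrets.token_bytes(num_bytes)


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2-HMAC-SHA256; the salt must be stored next to the result."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# Two users choosing the same password still end up with different hashes.
salt_alice = new_salt()
salt_bob = new_salt()
hash_alice = hash_password("same password", salt_alice)
hash_bob = hash_password("same password", salt_bob)

print(hash_alice.hex())
print(hash_bob.hex())
```

The salt is not secret. It only has to be unpredictable in advance and stored so that the same derivation can be repeated at login.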
Let's say I am creating a webapp where users can create a nested tree of strings (with sensitive information). These strings are presumably quite short. I want to encrypt both keys and values in this tree before saving it. All values in the tree will be encrypted client-side using a symmetric key supplied by the user. Likewise, they will be decrypted client-side when reading.
The tree is persisted in a Mongo database.
I can't decide whether I should serialize the tree and encrypt it as a whole string or whether to encrypt values individually, considering that all data in the tree will be encrypted using the same key.
What are the pros and cons of either?
From what I can tell, AES uses a block size of 128 bits, meaning that any string can grow by up to 16 bytes when encrypted (assuming PKCS#7 padding), which speaks in favor of encrypting a serialized string (if you want to avoid overhead).
Note: Although the webapp will use both HTTPS, IP whitelisting and multifactor authentication, I want to make an effort to prevent data breach in the event the Mongo database is stolen. That's what I'm going for here. Advice or thoughts on how to accomplish this is appreciated.
Update
Furthermore, I also want my service to inspire trust. Sending data in the clear (although over HTTPS) means the user must trust me to encrypt it before persisting it. Encrypting client-side allows me to emphasize that I don't know (or need to know) what I'm saving.
I can't think of a reason why these approaches would be different in terms of security of the actual strings (assuming they are both implemented correctly). Encrypting the strings individually obviously means that the structure of the tree will not be secret, but I'm not sure if you are concerned with that or not. For example, if you encrypt each string individually, someone seeing the ciphertexts could find out how many keys there are in the tree, and he could also learn something about the length of each key and value. If you encrypt the tree as a whole serialized blob, then someone seeing the ciphertext can tell roughly how much data is in the tree but nothing about the lengths or number of individual keys/values.
In terms of overhead, the padding would be a consideration, as you mentioned. A bigger source of storage overhead is IVs: if you are using a block cipher mode such as CTR, you need to use a distinct IV for each ciphertext. This means if you are encrypting each string individually, you need to store an IV for each string. If you encrypt the whole serialized tree, then you just need to store the one IV for that one ciphertext.
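The storage overhead can be estimated with a little arithmetic. This is a sketch under stated assumptions: a padded mode with PKCS#7 padding (which adds between 1 and 16 bytes), a 16-byte IV stored with every ciphertext, and hypothetical string lengths. The extra bytes added by the serialization syntax itself are ignored.

```python
BLOCK = 16  # AES block size in bytes
IV = 16     # assumed IV size stored alongside each ciphertext


def padded_length(n: int) -> int:
    """PKCS#7 pads to the next multiple of BLOCK, adding a full block when already aligned."""
    return (n // BLOCK + 1) * BLOCK


def stored_size(plaintext_lengths) -> int:
    """Total bytes stored when each plaintext gets its own IV and padding."""
    return sum(IV + padded_length(n) for n in plaintext_lengths)


strings = [5, 12, 20, 8]               # hypothetical lengths of short keys/values
individually = stored_size(strings)    # one IV and one padding per string
as_blob = stored_size([sum(strings)])  # one IV and one padding for the whole tree

print(individually, as_blob)  # 144 64
```

For many short strings, the per-string IV dominates the overhead, which matches the point above that IVs cost more storage than padding.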
Before you implement this in Javascript, though, you should make sure that you're actually getting a real improvement in security from doing client-side encryption. This article is a classic: http://www.matasano.com/articles/javascript-cryptography/ One important point is to remember that the server is providing the Javascript encryption code, so encrypting data on the client doesn't protect it from the server. If your main concern is a stolen database, you could achieve the same security by just encrypting the data on the server before inserting it in the database.
First of all, I am not a security expert ;-)
I can't decide whether I should serialize the tree and encrypt it as a whole string or whether to encrypt values individually, considering that all data in the tree will be encrypted using the same key.
I would say serializing the tree first and encrypting the result of that has the biggest con.
What plays a huge role in successfully cracking encryption is often the knowledge about certain characters that appear quite often in the original text – for example the letters e and n in English language – and doing statistical analysis based on that on the encrypted text.
Now let's say you use, for example, JSON to serialize your tree client-side before encrypting it. As the attacker, I would easily know that, since I can analyze your client-side script at my leisure. So I also know already that the “letters” {, }, [, ], : and " will have a high percentage of occurrence in every “text” that you encrypt … and that the first letter of every text will be either a { or a [ (depending on whether your tree is an object or an array) – that’s already quite a bit of potentially very useful knowledge about the texts that get encrypted by your app.