SHA-256 hashes different between C# and Javascript

I am currently working on a project that will involve credit card swipes for admissions based on database rows. Like a will call system, the SHA-256 hash of the CC number must match the hash in the DB row in order to be considered the "proper pickup".
However, because the box office system is based in the browser, the CC number on pickup must be hashed client-side, using Javascript, and then compared to the previously downloaded will call data.
When I try to hash the numbers, though, the hash always ends up different from the one created along with the DB row (using VB.NET and SQL Server 2008 R2). For example, if a CC number in the database happened to be 4444333322221111, then the resulting hash from .NET would become xU6sVelMEme0N8aEcCKlNl5cG25kl8Mo5pzTowExenM=.
However, when using any SHA-256 hash library for Javascript I could find, the resulting hash would always be NbjuSagE7lHVQzKSZG096bHtQoMLscYAXyuCXX0Wtw0=.
I'm assuming this is some kind of Unicode/UTF-8 issue, but no matter what I try I cannot get the hashes to come out the same and it's starting to drive me crazy. Can anyone offer any advice?
Here's something that may provide some insight. Please go to http://www.insidepro.com/hashes.php?lang=eng and insert "4444333322221111" without quotes into the Password box. Afterwards, scroll down to the SHA-256 section.
You can see that there are four results, two of them are the hash codes I posted (the second from the top being the Javascript hash and the bottom one being the SQL hash). According to that page, the bottom hash result is generated using a base 64 string, as well as making the password into unicode format.
I've investigated this and tried many different functions to encode the password into unicode format, but no matter what little tweaks I try or other functions I make, I could never get it to match the hash code I need.
I am currently investigating the parameters used when calling the SHA-256 function on the server side.
UPDATE:
So just to make sure I wasn't crazy, I ran the Hash method I'm using for the CC numbers in the immediate window while debugging. Again, the result remains the same as before. You can see a screenshot here: http://i.imgur.com/raEyX.png

According to an online SHA-256 hash calculator and a base-64 to hex decoder, it is the .NET implementation that has not calculated the hash correctly. You may want to double check the parameters you pass to the hashing functions.
When you are dealing with two untrusted implementations, it is always a good idea to find another independent implementation, and choose the one that matches the third one as correct. Either that, or find some test vectors, and validate the implementations individually.
EDIT:
A quick experiment shows that the SHA-256 hash you get from .NET is the hash of the bytes given by the hex string 3400340034003400330033003300330032003200320032003100310031003100, that is, your input as little-endian 16-bit characters. Make sure you pass in ASCII.

Adam Liss had it right when he mentioned that the byte array behind a string in .NET/SQL Server differs from the one behind the same string in Javascript. The array in .NET for the string 4444333322221111 would look like [52 0 52 0 52 0 52 0 51 0 51 0... etc.] and the same thing in Javascript would just look like [52 52 52 52 51 51 51 51...]. Thus, with different byte arrays, different hashes were generated.
I was able to remedy this for my application by modifying the base 64 SHA-256 hashing algorithm from here, where each character is pulled from the string one at a time in order to generate the hash.
Rather than having it do it this way, I first converted the string into a unicode-like byte array (like the .NET example above, 52 0 52 0 etc), fed that array to the hashing algorithm instead of the string, and did some very minor tweaks in order for it to grab each array member to generate the hash. Lo and behold, it worked, and now I have a very convenient method of hashing CC numbers in the same fashion as the .NET framework for quick and easy order lookup.
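The fix described above can be sketched roughly as follows. This is not the poster's modified library but a minimal illustration, assuming Node.js and its built-in crypto module; toUtf16leBytes and sha256Base64 are hypothetical helper names. The 16-bit little-endian layout is the one shown in the [52 0 52 0 ...] example (it is also what .NET's Encoding.Unicode produces).

```javascript
const crypto = require('crypto');

// Expand each UTF-16 code unit into two bytes, low byte first,
// giving the [52, 0, 52, 0, ...] layout described above.
function toUtf16leBytes(text) {
  const bytes = new Uint8Array(text.length * 2);
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    bytes[2 * i] = unit & 0xff;   // low byte
    bytes[2 * i + 1] = unit >> 8; // high byte (0 for ASCII digits)
  }
  return bytes;
}

// SHA-256 over raw bytes, returned as a base64 string.
function sha256Base64(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('base64');
}

const cc = '4444333322221111';
// Hash of the 16-bit little-endian bytes (the .NET-style input).
console.log(sha256Base64(toUtf16leBytes(cc)));
// Hash of the one-byte-per-character input (what the JS libraries used).
console.log(sha256Base64(Buffer.from(cc, 'utf8')));
```

If the values reported in this thread are accurate, the first line printed should be the hash stored in the database and the second should be the hash the Javascript libraries produced.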

Are you sure about your JavaScript SHA-256 function?
And about the hash you generated first?
SHA-256("4444333322221111"); // 35b8ee49a804ee51d5433292646d3de9b1ed42830bb1c6005f2b825d7d16b70d
hex: 35b8ee49a804ee51d5433292646d3de9b1ed42830bb1c6005f2b825d7d16b70d
HEX: 35B8EE49A804EE51D5433292646D3DE9B1ED42830BB1C6005F2B825D7D16B70D
h:e:x: 35:b8:ee:49:a8:04:ee:51:d5:43:32:92:64:6d:3d:e9:b1:ed:42:83:0b:b1:c6:00:5f:2b:82:5d:7d:16:b7:0d
base64: NbjuSagE7lHVQzKSZG096bHtQoMLscYAXyuCXX0Wtw0=

Related

Storing more info in QR Code

I am trying to develop a hybrid mobile app with QR code functionality. A QR code can only store a limited number of characters. So I am wondering: is it possible to compress the string to make it shorter, so that I can store more information in the QR code?

At lengths that short, most compression algorithms will actually make data longer, not shorter. There are some algorithms which may work well, though… smaz comes to mind. However, it is going to depend heavily on what you are trying to compress, and you haven't really provided any information about that.
Instead of thinking about compression, your best bet may be to find an encoding scheme which makes more sense for your data. For example, if you're encoding a date and time, store it as a single number instead of text. Think about whether you really need seconds. If you are storing numbers, consider using variable-length quantities. If your data is JSON, consider using protobuf instead.
If what you have really is text, it may be worth considering coming up with your own character set. Instead of ASCII, where each character is 8 bits, can you limit yourself to 64 characters? a-z, A-Z, 0-9, and two punctuation characters is only 64 possible symbols… if that is all you need, you could use a 6-bit encoding. If the strings aren't case-sensitive you have tons of room for punctuation.
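A 6-bit encoding along those lines might look like the sketch below. The alphabet (letters, digits, space and period) and the names pack6 and unpack6 are assumptions made for illustration; the decoder needs to be told the character count, because the last byte may contain padding bits.

```javascript
// Hypothetical 64-symbol alphabet: a-z, A-Z, 0-9, space and period.
const ALPHABET =
  'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .';

// Pack each character into 6 bits, most significant bit first.
function pack6(text) {
  const out = new Uint8Array(Math.ceil((text.length * 6) / 8));
  let bitPos = 0;
  for (const ch of text) {
    const code = ALPHABET.indexOf(ch);
    if (code < 0) throw new Error('Character not in alphabet: ' + ch);
    for (let b = 5; b >= 0; b--) {
      if ((code >> b) & 1) out[bitPos >> 3] |= 0x80 >> (bitPos & 7);
      bitPos++;
    }
  }
  return out;
}

// Read back `length` characters of 6 bits each.
function unpack6(bytes, length) {
  let text = '';
  for (let i = 0; i < length; i++) {
    let code = 0;
    for (let b = 0; b < 6; b++) {
      const bitPos = i * 6 + b;
      const bit = (bytes[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
      code = (code << 1) | bit;
    }
    text += ALPHABET[code];
  }
  return text;
}

const packed = pack6('Hello 42');       // 8 characters fit in 6 bytes
console.log(packed.length, unpack6(packed, 8));
```

Note that the packed bytes are binary, so this only pays off if the QR code is written in a byte mode rather than as text.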

How does this JavaScript compression technique work?

I was checking the results of a security contest involving XSS (link) and found some wonderful and scary JS XSS payloads. The winner (#kinugawamasato) used a JavaScript compression technique that seems completely otherworldly to me:
Compressed payload:
https://cure53.de/xmas2013/?xss=<scriPt>document.write(unescape(escape(location)
.replace(/u(..)/g,'$1%')))<\/scriPt>㱯扪散琠楤㵥⁣污獳楤㵣汳楤㨳㌳䌷䉃㐭㐶うⴱㅄ〭
䉃〴ⴰ〸ぃ㜰㔵䄸㌠潮牯睥湴敲㵡汥牴⠯繷⸪ℱ⼮數散⡲散潲摳整⠰⤩⤾㱳癧⁯湬潡搽攮摡瑡畲氽慬汛攮
牯睤敬業㴳㍝⬧㽳慮瑡㵀Ⅱ汬潷彤潭慩湳㴧⭤潭慩渻攮捨慲獥琽❵瑦ⴷ✾
What really happened:
<object id=e classid=clsid:333C7BC4-460F-11D0-BC04-0080C7055A83 onrowenter=alert(/~w.*!1/.exec(recordset(0)))><svg onload=e.dataurl=all[e.rowdelim=33]+'?santa=#!allow_domains='+domain;e.charset='utf-7'>
Is this technique already documented somewhere so I can study it? How exactly does this thing work? Is there already some JavaScript compressor that does this in an automated way? How would a WAF react to a payload like that?
You can see more examples here.

I am using lz-string library for JS compression whenever placing any data into localStorage. I am just a user of this library - not the compression expert. But this is information which could be found around that tool...
The lz-string Goal:
lz-string was designed to fulfill the need of storing large amounts of data in localStorage, specifically on mobile devices. localStorage being usually limited to 5MB, all you can compress is that much more data you can store.
...
I (note: "I" means, Pieroxy, author of the lz-string) started out from an LZW implementation (no more patents on that), which is very simple...
So, the foundation of this implementation is LZW, which is mentioned here: Javascript client-data compression by Andy E. Let me point out
the link to Wikipedia article on LZW
the LZW compression example.
An extract from Wikipedia - Algorithm:
The scenario described by Welch's 1984 encodes sequences of 8-bit data as fixed-length 12-bit codes. The codes from 0 to 255 represent 1-character sequences consisting of the corresponding 8-bit character, and the codes 256 through 4095 are created in a dictionary for sequences encountered in the data as it is encoded. At each stage in compression, input bytes are gathered into a sequence until the next character would make a sequence for which there is no code yet in the dictionary. The code for the sequence (without that character) is added to the output, and a new code (for the sequence with that character) is added to the dictionary.
Wikipedia - Encoding:
A high level view of the encoding algorithm is shown here:
Initialize the dictionary to contain all strings of length one.
Find the longest string W in the dictionary that matches the current input.
Emit the dictionary index for W to output and remove W from the input.
Add W followed by the next symbol in the input to the dictionary.
Go to Step 2.
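The five steps above can be sketched in a few lines. This is a plain textbook LZW encoder, not the lz-string code; lzwEncode is a hypothetical name, the input is assumed to contain only 8-bit characters (codes 0 to 255), and the dictionary is allowed to grow without the 12-bit limit mentioned in the extract.

```javascript
function lzwEncode(input) {
  // Step 1: the dictionary starts with every single-character string.
  const dictionary = new Map();
  for (let i = 0; i < 256; i++) dictionary.set(String.fromCharCode(i), i);
  let nextCode = 256;

  let w = '';          // longest match found so far
  const output = [];
  for (const c of input) {
    const wc = w + c;
    if (dictionary.has(wc)) {
      w = wc;                          // Step 2: keep extending the match
    } else {
      output.push(dictionary.get(w));  // Step 3: emit the code for W
      dictionary.set(wc, nextCode++);  // Step 4: add W + next symbol
      w = c;                           // Step 5: continue from the new symbol
    }
  }
  if (w !== '') output.push(dictionary.get(w)); // flush the final match
  return output;
}

console.log(lzwEncode('TOBEORNOTTOBEORTOBEORNOT'));
// → [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```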
How it works in the case of lz-string we can observe here:
The source code: lz-string-1.3.3.js
Let me cite few steps from the already mentioned lz-string source:
What I did was:
localStorage can only contain JavaScript strings. Strings in JavaScript are stored internally in UTF-16, meaning every character weight 16 bits. I modified the implementation to work with a 16bit-wide token space.
I had to remove the default dictionary initialization, totally useless on a 16bit-wide token space.
I initialize the dictionary with three tokens:
An entry that produces a 16-bit token.
An entry that produces an 8-bit token, because most of what I will store is in the iso-latin-1 space, meaning tokens below 256.
An entry that mark the end of the stream.
The output is processed by a bit stream that stores effectively 16 bits per character in the output string.
Each token is stored with just as many bits that are needed according to the size of the dictionary. Hence, the first token takes 2 bits, the second to 7th three bits, etc....
Well, now we know that these compression techniques yield 16 bits of information per output character. We can test it in this demo: http://pieroxy.net/blog/pages/lz-string/demo.html (or/and another here)
Which converts the: Hello, world. into
85 04 36 30 f6 60 40 03 0e 04 01 e9 80 39 03 26
00 a2
So we need the final step, let me cite again:
Well, this lib produces stuff that isn't really a string. By using all 16 bits of the UTF-16 bitspace, those strings aren't exactly valid UTF-16. By version 1.3.0, I added two helper encoders to produce stuff that we can do something with:
compress produces invalid UTF-16 strings. Those can be stored in localStorage only on webkit browsers (Tested on Android, Chrome, Safari). Can be decompressed with decompress
to continue our example, the Hello, world. would be converted into
҅〶惶̀Ў㦀☃ꈀ
And that's finally it. We can see that the set of all the ...other than Latin characters... comes from the final conversion into UTF-16. Hope this gives some hints...
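As a side note on the payload quoted in the question: judging only from the visible unescape(escape(location).replace(/u(..)/g,'$1%')) chain, it seems to rely on a simpler trick than LZW, namely packing two 8-bit characters into each 16-bit character. The sketch below reproduces that step in isolation. It assumes the standard behavior of the legacy escape and unescape globals; packPairs and unpackPairs are hypothetical names, and the input is assumed to be printable ASCII.

```javascript
// Pack two ASCII characters into one UTF-16 code unit (first char = high byte).
function packPairs(ascii) {
  if (ascii.length % 2) ascii += ' '; // pad to an even length
  let packed = '';
  for (let i = 0; i < ascii.length; i += 2) {
    packed += String.fromCharCode(
      (ascii.charCodeAt(i) << 8) | ascii.charCodeAt(i + 1)
    );
  }
  return packed;
}

// Same chain as in the payload: escape() turns each packed character into
// %uXXYY, the replace rewrites that to %XX%YY, and unescape() decodes two
// separate characters.
function unpackPairs(packed) {
  return unescape(escape(packed).replace(/u(..)/g, '$1%'));
}

console.log(unpackPairs('\u3C6F\u626A')); // → <obj
```

This halves the character count of the visible payload but not its size in bytes, so it is closer to obfuscation than to compression in the lz-string sense.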

Identifying cookie data type

Is there some library (preferably in javascript) that will classify a cookie value into some sensible data type?
When I look at various cookie values, I see various types such as:
plain english
numbers (hex, dec)
base64
some combination of above
It would be even more awesome if in addition to guessing the data-type, the library can also guess the type of encryption, or hashing used.
I remember experimenting with a Python library a while ago, but it did not seem to guess even simple hashes such as shasum, sha256sum, md5sum, etc.

There's no way to do this since all cookies are stored as string values. A workaround can be to classify them into general classes by putting them through regular expressions.
/^[0-9A-Fa-f]{8}$/ Signifying that could be an Adler-32 or CRC-32 Checksum
/^[0-9A-Fa-f]{32}$/ Signifying that could be an MD2, MD4, MD5, or Haval Sum
/^[0-9A-Fa-f]{64}$/ Signifying that could be a SHA-256 Sum
/^[0-9A-Fa-f]{96}$/ Signifying that could be a SHA-384 Sum
/^[0-9A-Fa-f]{128}$/ Signifying that could be a SHA-512 Sum
/^[0-9A-Za-z+\/]+={0,2}$/ Signifying that could be a Base-64 Encode
There's a chance that any one of these could just be regular numbers or plain text too (like "DEADBEEF"). If you're in charge of that data, I would specify the type in another cookie.
In summary, there's just no guarantee, unless you know what to expect.
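To make the regular-expression idea concrete, here is a small sketch that returns every class a value could belong to. classifyCookieValue and the label strings are hypothetical, and the Base-64 pattern is written without the negation (a leading ^ inside the brackets would match everything except Base-64 characters). Overlaps are expected: any hex string also passes the Base-64 test.

```javascript
// Each entry pairs a human-readable guess with the pattern that suggests it.
const PATTERNS = [
  ['Adler-32 or CRC-32 checksum', /^[0-9A-Fa-f]{8}$/],
  ['MD2, MD4, MD5 or Haval sum', /^[0-9A-Fa-f]{32}$/],
  ['SHA-256 sum', /^[0-9A-Fa-f]{64}$/],
  ['SHA-384 sum', /^[0-9A-Fa-f]{96}$/],
  ['SHA-512 sum', /^[0-9A-Fa-f]{128}$/],
  ['Base-64 encoded data', /^[0-9A-Za-z+\/]+={0,2}$/],
];

// Return the labels of all patterns the value matches (possibly none).
function classifyCookieValue(value) {
  return PATTERNS
    .filter(([, pattern]) => pattern.test(value))
    .map(([label]) => label);
}

console.log(classifyCookieValue('DEADBEEF'));
console.log(classifyCookieValue('hello world'));
```

The result is only a list of candidates; as noted above, nothing here proves what the value actually is.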

Greasemonkey Storage

Is there any limit on how much data can be stored using GM_setValue?

GM stores it in properties. Open about:config and look for them.
According to http://diveintogreasemonkey.org/api/gm_getvalue.html, you can find them in the greasemonkey.scriptvals branch.
This sqlite info on its limits shows some default limits for strings and blobs, but they may be changed by Firefox.

More information is in the Greasespot Wiki:
The Firefox preference store is not designed for storing large amounts of data. There are no hard limits, but very large amounts of data may cause Firefox to consume more memory and/or run more slowly.
The link refers to a discussion on the Greasemonkey mailing list. Anthony Lieuallen answers the same question as you posted:
I've just tested this. Running up to a 32 meg string seems to work
without major issues, but 64 or 128 starts to thrash the disk for
virtual memory a fair deal.

According to the site you provided, "The value argument can be a string, boolean, or integer."
Obviously, a string can hold far more information than an integer or boolean.
Since GreaseMonkey scripts are JavaScript, the max length for a GM_setValue is the max length of a JavaScript string. Actually, the JavaScript engine (browser specific) determines the max length of a string.
I do not know any specifics, but you could write a script to determine max length.
Keep doubling the length until you get an error. Then try a value halfway between maxGoodLen and minBadLen, and repeat until maxGoodLen = minBadLen - 1.
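The doubling-then-halving search could be sketched like this. findMaxLength is a hypothetical name, and works(length) is a caller-supplied predicate that reports whether a string of that length could be stored; how to detect a failed store in a real userscript is left open here.

```javascript
// Find the largest length for which works(length) is true, assuming that
// every length below the limit works and every length above it fails.
function findMaxLength(works, start = 1) {
  if (!works(start)) return 0;

  // Phase 1: keep doubling until a length fails.
  let maxGoodLen = start;
  let minBadLen = start * 2;
  while (works(minBadLen)) {
    maxGoodLen = minBadLen;
    minBadLen *= 2;
  }

  // Phase 2: bisect until the two bounds are adjacent.
  while (minBadLen - maxGoodLen > 1) {
    const mid = maxGoodLen + Math.floor((minBadLen - maxGoodLen) / 2);
    if (works(mid)) {
      maxGoodLen = mid;
    } else {
      minBadLen = mid;
    }
  }
  return maxGoodLen;
}

// Stand-in predicate with an artificial limit of 1000.
console.log(findMaxLength((length) => length <= 1000)); // → 1000
```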

Writing a JavaScript zip code validation function

I would like to write a JavaScript function that validates a zip code, by checking if the zip code actually exists. Here is a list of all zip codes:
http://www.census.gov/tiger/tms/gazetteer/zips.txt (I only care about the 2nd column)
This is really a compression problem. I would like to do this for fun. OK, now that's out of the way, here is a list of optimizations over a straight hashtable that I can think of, feel free to add anything I have not thought of:
Break zipcode into 2 parts, first 2 digits and last 3 digits.
Make a giant if-else statement first checking the first 2 digits, then checking ranges within the last 3 digits.
Or, convert the zips into hex, and see if I can do the same thing using smaller groups.
Find out if within the range of all valid zip codes there are more valid zip codes vs invalid zip codes. Write the above code targeting the smaller group.
Break up the hash into separate files, and load them via Ajax as user types in the zipcode. So perhaps break into 2 parts, first for first 2 digits, second for last 3.
Lastly, I plan to generate the JavaScript files using another program, not by hand.
Edit: performance matters here. I do want to use this, if it doesn't suck. Performance of the JavaScript code execution + download time.
Edit 2: JavaScript only solutions please. I don't have access to the application server, plus, that would make this into a whole other problem =)

You could do the unthinkable and treat the code as a number (remember that it's not actually a number). Convert your list into a series of ranges, for example:
zips = [10000, 10001, 10002, 10003, 23001, 23002, 23003, 36001]
// becomes
zips = [[10000,10003], [23001,23003], [36001,36001]]
// make sure to keep this sorted
then to test:
function isValidZip(myzip) {
  // linear scan over the sorted [low, high] ranges
  for (var i = 0, l = zips.length; i < l; ++i) {
    if (myzip >= zips[i][0] && myzip <= zips[i][1]) {
      return true;
    }
  }
  return false;
}
isValidZip(23002); // true
This is just using a very naive linear search (O(n)). If you kept the list sorted and used binary searching, you could achieve O(log n).
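A sketch of the binary-search variant, together with a helper that builds the ranges from a sorted list, might look like this. toRanges and isValidZipBinary are hypothetical names, and the zips are assumed to be numbers sorted in ascending order.

```javascript
// Collapse a sorted list of numeric zips into inclusive [low, high] ranges.
function toRanges(sortedZips) {
  const ranges = [];
  for (const zip of sortedZips) {
    const last = ranges[ranges.length - 1];
    if (last && zip <= last[1] + 1) {
      last[1] = Math.max(last[1], zip); // extend the current run
    } else {
      ranges.push([zip, zip]);          // start a new run
    }
  }
  return ranges;
}

// Binary search over the sorted, non-overlapping ranges: O(log n).
function isValidZipBinary(ranges, zip) {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (zip < ranges[mid][0]) {
      hi = mid - 1;
    } else if (zip > ranges[mid][1]) {
      lo = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}

const ranges = toRanges([10000, 10001, 10002, 10003, 23001, 23002, 23003, 36001]);
console.log(JSON.stringify(ranges));
// → [[10000,10003],[23001,23003],[36001,36001]]
console.log(isValidZipBinary(ranges, 23002)); // → true
```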

I would like to write a JavaScript function that validates a zip code
Might be more effort than it's worth, keeping it updated so that at no point someone's real valid ZIP code is rejected. You could also try an external service, or do what everyone else does and just accept any 5-digit number!
here is a list of optimizations over a straight hashtable that I can think of
Sorry to spoil the potential Fun, but you're probably not going to manage much better actual performance than JavaScript's Object gives you when used as a hashtable. Object member access is one of the most common operations in JS and will be super-optimised; building your own data structures is unlikely to beat it even if they are potentially better structures from a computer science point of view. In particular, anything using ‘Array’ is not going to perform as well as you think because Array is actually implemented as an Object (hashtable) itself.
Having said that, a possible space compression tool if you only need to know 'valid or not' would be to use a 100000-bit bitfield, packed into a string. For example for a space of only 100 ZIP codes, where codes 032-043 are ‘valid’:
var zipfield= '\x00\x00\x00\x00\xFF\x0F\x00\x00\x00\x00\x00\x00\x00';
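function isvalid(zip) {
  // anchor the pattern so that only exactly three digits pass
  if (!/^[0-9]{3}$/.test(zip))
    return false;
  var z= parseInt(zip, 10);
  // byte z/8 holds the flag for code z in bit z%8
  return !!( zipfield.charCodeAt(Math.floor(z/8)) & (1<<(z%8)) );
}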
Now we just have to work out the most efficient way to get the bitfield to the script. The naive '\x00'-filled version above is pretty inefficient. Conventional approaches to reducing that would be eg. to base64-encode it:
var zipfield= atob('AAAAAP8PAAAAAAAAAA==');
That would get the 100000 flags down to 16.6kB. Unfortunately atob is Mozilla-only, so an additional base64 decoder would be needed for other browsers. (It's not too hard, but it's a bit more startup time to decode.) It might also be possible to use an AJAX request to transfer a direct binary string (encoded in ISO-8859-1 text to responseText). That would get it down to 12.5kB.
But in reality probably anything, even the naive version, would do as long as you served the script using mod_deflate, which would compress away a lot of that redundancy, and also the repetition of '\x00' for all the long ranges of ‘invalid’ codes.
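Since the bitfield would be produced by a generator program rather than by hand, here is a rough sketch of that step using a Uint8Array instead of a packed string. buildBitfield and isValidZipBit are hypothetical names, and the bit layout (code z lives in byte z/8, bit z%8) is the same as in the string example above.

```javascript
// Build a bitfield with one flag per possible code in [0, size).
function buildBitfield(validZips, size) {
  const bits = new Uint8Array(Math.ceil(size / 8));
  for (const z of validZips) {
    bits[z >> 3] |= 1 << (z & 7);
  }
  return bits;
}

// Look up a five-digit ZIP string in a 100000-bit field.
function isValidZipBit(bits, zip) {
  if (!/^[0-9]{5}$/.test(zip)) return false;
  const z = parseInt(zip, 10);
  return (bits[z >> 3] & (1 << (z & 7))) !== 0;
}

const field = buildBitfield([10000, 10001, 10002, 10003, 23001], 100000);
console.log(field.length);                  // → 12500
console.log(isValidZipBit(field, '10002')); // → true
console.log(isValidZipBit(field, '10004')); // → false
```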

I use Google Maps API to check whether a zipcode exists.
It's more accurate.

Assuming you've got the zips in a sorted array (seems fair if you're controlling the generation of the datastructure), see if a simple binary search is fast enough.

So... you're doing client-side validation and want to optimize for file size? You probably cannot beat general compression. Fortunately, most browsers support gzip, so you get that much for free.
How about a simple JSON-coded dict or list with the zip codes in sorted order, and a lookup on the dict? It'll compress well, since it's a predictable sequence; it imports easily, since it's JSON and uses the browser's built-in parser; and lookup will probably be very fast too, since that's a JavaScript primitive.

This might be useful:
PHP Zip Code Range and Distance Calculation
As well as List of postal codes.
