Cross code Hash for Javascript & Python - javascript

NO. THAT SUGGESTION DOES NOT ANSWER THIS AT ALL. SEE CORRECT ANSWER BELOW.
I am building an application whereby I want a user to enter a password into a browser, which is sent via my server to another device running Python. The password then needs to be validated by the device running Python.
The problem is, I don't want my server handling passwords in any way. So I figured I could hash the password in the browser before it is sent, have the server pass on the hash to the device, then check the hash is equivalent on the Python side.
Python has a built-in library for this purpose, but it seems javascript does not. I thought I could leverage a public javascript library, but when I compare the results from the javascript SHA256 algorithm here to what the SHA256 function in Python produces it is not the same string of characters.
Is there a cross code hash function (or any other solution) I can use?
An Update
In response to a "gee whiz, this question is the same as all these ones" let me clarify. This is not about a strategy for storing passwords or finding a 'trustworthy' library (like the post suggested). There is NOT any discussion about cross code compatibility of SHA2 on this site. I could not even find a discussion that pointed out that different SHA2 implementations SHOULD produce the same result. I did plenty of research. In fact it was the various discussions about different javascript "implementations" of SHA2 that confused me. I actually tested a scenario myself, which further confused me as the website picked up a carriage return and produced a different hash. (see below)
This is about having a function in TWO languages that produces the same output...on different devices. I think it is actually an unusual application of hashing, as generally the same code layer is used to hash, store and compare hashed values.
In the rush to down-vote the question and establish mental superiority it seems to me the question was not read properly and incorrect assumptions were made. Hopefully contributors to this site will in future take a more considered and helpful approach like the successful answer.
The link for the javascript library I provided produced the following hash for the text 'MyPassword'
5e618e009fe35ea092150ad1f2c24e3181b4cf6693dc7bbd9a09ea9c8144720d
If I use the sha256 function from Python I get the result below, which seems to indicate to me that not all SHA256 functions are equal and produce the same result.

All proper implementations of SHA256 (or any hash/encryption algorithm) produce the same result if supplied with the same data. Your problem is solved by properly processing the data that you supply to the javascript library. The "5e61..." hash is the result of an additional newline appended to the end of the "MyPassword" string, look:
In [1]: import hashlib
In [2]: hashlib.sha256(b'MyPassword').hexdigest()
Out[2]: 'dc1e7c03e162397b355b6f1c895dfdf3790d98c10b920c55e91272b8eecada2a'
In [3]: hashlib.sha256(b'MyPassword\n').hexdigest()
Out[3]: '5e618e009fe35ea092150ad1f2c24e3181b4cf6693dc7bbd9a09ea9c8144720d'
For the future, popular implementations of hashes and cryptographic algorithms are thoroughly tested, and if the answer seems wrong - it's probably because your data is wrong.
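To make the point above concrete, here is a minimal Python sketch. It assumes the password arrives as text that may carry a stray trailing newline (for example from a textarea on a test page). The helper name sha256_hex is made up for illustration; hashlib.sha256 is the only library call involved.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Hash the UTF-8 bytes of text and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

clean = sha256_hex("MyPassword")
with_newline = sha256_hex("MyPassword\n")

# One extra byte of input gives a completely different digest.
print(clean == with_newline)  # False

# Dropping the stray line ending before hashing makes both sides agree.
print(sha256_hex("MyPassword\n".rstrip("\r\n")) == clean)  # True
```

The same applies on the browser side: whatever library is used there has to be fed exactly the same bytes (same characters, same encoding, no extra line ending) to get the same digest.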

Related

Producing the same signature with WebAuthn

I just started playing around with WebAuthn on localhost. I was given to understand that the signature value found in credentials.response.signature was signing credentials.response.clientDataJSON. However, for the same inputs / challenge for navigator.credentials.get I seem to be getting a different signature. My best guess is there is a timestamp or counter going somewhere into the value that is signed?
I can't seem to decode the signature or authenticatorData, which would really help me to visualize what's going on inside. I'm able to decode clientDataJSON as follows; does anyone have sample code with which I could decode the other two aforementioned params?
String.fromCharCode.apply(null, new Uint8Array(credentials.response.clientDataJSON))
I also found when decoding clientDataJSON I get the occasional extra field in Chrome, which is a little annoying for my use case.
My goal is to get the user to produce the same signature or hash each time when authenticating the same PublicKeyCredential. Is there a way to do this? Or are there other methods, within the scope of WebAuthn or outside of it, to benefit from the biometric auth and produce identical signatures or hashes from the same inputs?
Please forgive any misconceptions I might have about WebAuthn, I'm quite new to this amazing tech. I completely understand that this is not the original intended use of WebAuthn so a janky workaround may be needed.
My goal is to get the user to produce the same signature or hash each time when authenticating the same PublicKeyCredential.
This is actually a really bad idea. The whole purpose of signing a message with a random challenge is to avoid replay attacks. Otherwise, if an attacker somehow intercepts an authentication message, that message could simply be reused to impersonate the user.
I was given to understand that the signature value found in credentials.response.signature was signing credentials.response.clientDataJSON
That is not accurate. The signature signs authenticatorData + SHA256(clientDataJSON).
Both are variable. The authenticatorData contains a "counter" increasing each time the credential key was used to authenticate and clientDataJSON should (or must to be secure) contain a randomly server side generated challenge.
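A rough Python sketch of why the signed bytes differ between two assertions, even with an identical challenge. The helper fake_authenticator_data is a hypothetical stand-in that only builds the fixed header (32-byte rpIdHash, one flags byte, 4-byte big-endian counter); a real authenticator produces these bytes itself, and the clientDataJSON below is a made-up sample.

```python
import hashlib
import struct

def signed_message(authenticator_data: bytes, client_data_json: bytes) -> bytes:
    """Bytes covered by the signature: authenticatorData || SHA-256(clientDataJSON)."""
    return authenticator_data + hashlib.sha256(client_data_json).digest()

def fake_authenticator_data(sign_count: int) -> bytes:
    # Hypothetical stand-in: rpIdHash (32 bytes) + flags (1 byte) + counter (4 bytes).
    rp_id_hash = hashlib.sha256(b"localhost").digest()
    return rp_id_hash + bytes([0x01]) + struct.pack(">I", sign_count)

# Made-up clientDataJSON with the SAME challenge for both assertions.
client_data = b'{"type":"webauthn.get","challenge":"same-challenge","origin":"http://localhost"}'

first = signed_message(fake_authenticator_data(1), client_data)
second = signed_message(fake_authenticator_data(2), client_data)

# The counter changed, so the message that gets signed changed too.
print(first == second)  # False
```

So even a fully deterministic signature algorithm would produce a different signature the second time, because its input is different.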
I can't seem to decode the signature or authenticatorData, which would really help me to visualize what's going on inside. I'm able to decode clientDataJSON as follows; does anyone have sample code with which I could decode the other two aforementioned params?
The signature cannot be "decoded", it can only be "verified" given the corresponding public key. For the other parameters, authenticatorData and clientDataJSON, check out the following link; at the bottom, it will decode them.
https://webauthn.passwordless.id/demos/playground.html
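If you would rather look at the bytes yourself, the fixed part of authenticatorData is simple to split up. Below is a minimal Python sketch. It assumes the ArrayBuffer has already been shipped to Python as raw bytes, and it only reads the 37-byte header (rpIdHash, flags, signature counter), not any attested credential data or extensions that may follow. The function name and the sample bytes are made up for illustration.

```python
import hashlib
import struct

def parse_authenticator_data(data: bytes) -> dict:
    """Split the fixed 37-byte header of authenticatorData into its fields."""
    if len(data) < 37:
        raise ValueError("authenticatorData is shorter than 37 bytes")
    # 32-byte rpIdHash, 1 flags byte, 4-byte big-endian signature counter
    rp_id_hash, flags, sign_count = struct.unpack(">32sBI", data[:37])
    return {
        "rp_id_hash": rp_id_hash.hex(),
        "user_present": bool(flags & 0x01),
        "user_verified": bool(flags & 0x04),
        "attested_credential_data": bool(flags & 0x40),
        "extensions": bool(flags & 0x80),
        "sign_count": sign_count,
    }

# Made-up sample: rpId "localhost", flags UP+UV, counter 42.
sample = hashlib.sha256(b"localhost").digest() + bytes([0x05]) + struct.pack(">I", 42)
parsed = parse_authenticator_data(sample)
print(parsed["sign_count"], parsed["user_present"], parsed["user_verified"])
```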
I also found when decoding clientDataJSON I get the occasional extra field in Chrome, which is a little annoying for my use case.
I'm not sure, I believe this is related to localhost testing.
If you want a small, fixed bit of data associated with a credential then you may wish to investigate the credBlob or prf extensions. Not all authenticators will support them, however. Many more will support prf but support for that in Chromium won't appear for a few more months. So there's not a great answer here yet, but it may work better than trying to fix the signature.
So, first things first: in general, whether the same signature is produced for the same input data depends on the signature scheme used. See this question, https://crypto.stackexchange.com/questions/26974/, where this is discussed.
Now, coming back to WebAuthn (assuming that you use a signature algorithm that generates the same signature for the same input), let's look at how the signature is generated. Here is a small piece of code from my virtual authenticator that is responsible for generating the WebAuthn signature:
let authData = this._concatUint8Arrays(
    rp_id_hash,
    flags,
    sign_count, // The signature counter will always increase
    this._getAAGUID(),
    credential_id_length,
    credential_id,
    cose_key
);
// Attestation object
let attestation_object = {'fmt': 'none', 'attStmt': {}, 'authData': authData};
// ...
// Generate signature
let client_data_hash = new Uint8Array(await crypto.subtle.digest('SHA-256', client_data));
let signatureData = this._concatUint8Arrays(authData, client_data_hash);
let signature = await Algorithms.Sign(this.private_key, signatureData);
You will notice that the data to be signed includes the authenticator's signature counter, which should increase each time you use the authenticator. This helps detect replay attacks or cloned authenticators (more info here).
Thus, it is not feasible to generate the same signature.
If you want to look more into what is going on under the hood of WebAuthn you can have a look into my WebDevAuthn project and browser extension that allows you to inspect the WebAuthn requests and responses.

PHP - Filtering user query to prevent all attacks [closed]

A user submits a search query to my site.
I then take this query and use it in other places, as well as echo'ing it back out to the page.
Right now I'm using htmlspecialchars(); to filter it.
What other steps should I take to prevent XSS, SQL Injection, etc, and things I can't even think of. I want to have all my bases covered.
<?php
$query = $_GET["query"];
$query = htmlspecialchars($query);
?>
Right now I'm using htmlspecialchars(); to filter it.
What other steps should I take to prevent XSS, SQL Injection, etc, and things I can't even think of. I want to have all my bases covered.
To cover all your bases, this depends a lot. The most straight forward (but unsatisfying) answer then probably is: do not accept user input.
And even though this may sound easy, it often is not, and it is easily forgotten that any input from a different context has to be considered user input: for example a file opened from the file system, records read from a database, or data from some other system or service, not only a parameter from the HTTP request or a file upload.
Thinking this through, in context of PHP, this normally also includes the PHP code itself which is often read from disk. Not SQL, just PHP code injection.
So if you really think about the question in such a generally broad way ("etc"), the first thing you need to ensure is that you've got a defined process to deploy the application and have checks in place so that the files of the deployment can't be tampered with (e.g. a read-only file system). And from the operational side: you can create and restore the known state of the program within seconds with little or no side effects.
Only after that you should start to worry about other kind of user-input. For which - to complete the answer - you should only accept what is acceptable.
A user submits a search query to my site.
Accepting a search query is the higher art of user input. It involves (free form) text which tends to become more and more complex after every other day and may also include logical operators and instructions which may require parsing which involves even more components that can break and can be exploited by various kind of attacks (and SQL Injection is only one of those, albeit still pretty popular). So plan ahead for it.
As a first level mitigation, you can question if the search is really a feature that is needed. Then if you have decided on that, you should outline which problems it generally creates and you should take a look if the problems are common. That question is important because common questions may already have answers, even common answers. So if a problem is common, it is likely that the problem is already solved. Leaning towards an existing solution then only bears the problem to integrate that solution (that is understanding the problem - you always need to do it and you learn soon enough, one or two decades is normally fine - and then understanding the specific solution as you need to integrate it).
For example:
$query = $_GET["query"];
$query = htmlspecialchars($query);
is making use of variable re-use. This is commonly known to be error prone. Using different variable names that mark the context of its value(s) can help:
$getQuery = $_GET["query"];
$htmlQuery = htmlspecialchars($getQuery);
It is then more visible that $htmlQuery can be used in HTML output to show the search query (or at least was intended for it). Similar to $_GET["query"], it would make totally visible that $getQuery would not be appropriate for HTML output and its string concatenation operations.
In the original example, this would not be equally visible for $query.
It would then perhaps also be made visible that in contexts other than HTML output, it ($htmlQuery) is not appropriate either. As your question suggests, you already imagine that neither $getQuery nor $htmlQuery is appropriate for dealing with the risk of an SQL injection, for example.
The example is intentionally verbose on the naming, real-life naming schemes are normally different and wouldn't emphasize the type on the variable name that much but would have a concrete type:
try {
...
$query = new Query($_GET["query"]);
...
<?= htmlspecialchars($query) ?>
If you have read up to this point, it may have become clearer that there can hardly be any one-size-fits-all function that magically prevents all attacks (apart from muting any kind of user input, which is sometimes equal to deleting the overall software in the first place - which is known to be safe, perhaps most of all for your software users). If you allow me the joke, maybe this is it:
unset($_GET["query"]); $safeQuery = null; // unset() is a statement and has no return value
which technically works in PHP, but I hope you get the idea, it's not really meant as an answer to your question.
So now as it is hopefully clear that each input needs to be treated in context of input and output to work, it should give some pointers how and where to look for the data-handling that is of need.
Context is a big word here. One guidance is to take a look if you're dealing with user data (user input) in the input phase of a system or in the output phase.
In the input phase what you normally want to do is to sanitize, that is, to verify the data. E.g. is it correctly encoded? Can the actual value or values the data represents (or is intended to represent) be safely decoded? Can any actual value be obtained from that data? If the encoding is already broken, ensure no further processing of that data is done. This is basically error handling and commonly means refusing the input. In the context of a web application this can mean closing the connection on the TCP transport layer (or not sending anything back on UDP); responding with an HTTP status code that denotes an error (with or without further, sparse details in the response body); responding with a more user-friendly hypertext message in the response body; for an HTML form, showing dedicated error messages for the part of the input that was not accepted; or, for an API, reporting the errors with the request input data in the format that the client can consume for the API protocol (the deeper you go, the more complicated).
In the output phase it is a bit different. If, for example, you identified the user input as a search query, passed the query (as a value) to a search service or system and then got back the results (the reflected user input, which still is user input), all this data needs to be correctly encoded to transport all result values back to the user. So, for example, if you output the search query along with the search results, all this data needs to be passed in the expected format. In the context of a web application, the user normally tells with each request what the preferred encoding of the response should be. Let's say this is normally hypertext encoded as HTML. Then all values need to be output in a form in which they are properly represented in HTML (and not, by mistake, interpreted as HTML: a search for <marquee> should not cause the whole output to move all over the page - you get the idea).
htmlspecialchars() may do the job here, so might by chance htmlentities(), but which function to use with which parameters highly depends on underlying encoding like HTTP, HTML or character encoding and to which part something belongs in the response (e.g. using htmlspecialchars() on a value that is communicated back with a cookie response header would certainly not lead to intended results).
In the input phase you assert the input is matching your expectations so that you can safely let pass it along into the application or refuse further processing. Only you can know in detail what these requirements are.
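As an illustration of "accept what is acceptable, refuse the rest", here is a minimal sketch in Python rather than PHP, just to keep it short. The limit of 200 characters, the exception class and the function name are assumptions made up for this example; your own requirements decide what the real checks are.

```python
MAX_QUERY_LENGTH = 200  # assumed limit; pick what your application needs

class RejectedInput(ValueError):
    """Raised when input does not meet expectations; the caller refuses the request."""

def accept_search_query(raw: bytes) -> str:
    """Return the query unchanged if it is acceptable, otherwise refuse it."""
    try:
        text = raw.decode("utf-8")  # strict decoding: broken encoding is refused
    except UnicodeDecodeError as exc:
        raise RejectedInput("query is not valid UTF-8") from exc
    if not text.strip():
        raise RejectedInput("query is empty")
    if len(text) > MAX_QUERY_LENGTH:
        raise RejectedInput("query is too long")
    if any(not ch.isprintable() for ch in text):
        raise RejectedInput("query contains control characters")
    return text  # note: nothing was "fixed" or stripped on the way

print(accept_search_query("trains to München".encode("utf-8")))
```

Note that the function never repairs anything: the value either passes as it is or the request is refused, and the encoding for HTML, SQL etc. is left to the output side.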
In the output phase your job is to ensure that all data is properly encoded and formatted for the overall output to work and the user can safely consume it.
In the input phase you should not try to "fix" issues with the incoming data yourself. Instead, assume the best and communicate back either that there will be no communication or what the problem was (note: do not fool yourself: if this involves output of user input, mind what is important for the output phase; there is less risk in just dropping user input and not processing it further, e.g. do not reflect it by communicating it back).
This is a bit different for the non-error handling output phase (given the input was acceptable), you err here on the safe side and encode it properly, you may even be fine with filtering the user-data so that it is safe in the output (not as the output which belongs to your overall process, and mind filtering is harder than it looks on first sight).
In short, don't filter input, only let it pass along if it is acceptable (sanitize). Filter input only in/for output if you do not have any other option (it is a fall-back, often gone wrong). Mind that filtering is often much harder and much more error prone incl. opening up to attacks than just refusing the data overall (so there is some truth in the initial joke).
Next to input or output context for the data, there is also the context in use of the values. In your example the search query. How could anyone here on Stackoverflow or any other internet site answer that as it remains completely undefined in your question: A search query. A search query for what? Isn't your question itself in a search for an answer? Taking it as an example, Stackoverflow can take it:
Verify the input is in the form of a question title and its text message that can safely enter their database - it passed that check, which can be verified as your question was published.
With your attempt to enter that query on Stackoverflow, some input validation steps were done prior sending it to the database - while already querying it: Similar questions, is your user valid etc.
As this short example shows, many of the questions for a concrete application (your application, your code) need not only the basic foundation to work (and therefore error handling on the protocol level, standard input and output so to say), but also something built on top of it that works technically correctly (a database search for existing questions must not be prone to SQL injection, neither on the title nor on the question text, nor must the display of error messages or hints introduce other forms of injection).
To come back to your own example, $htmlQuery is not appropriate if you need to encode it as a Javascript string in a response. To encode a value within Javascript as a string you would certainly use a different function, maybe json_encode($string) instead of htmlspecialchars($string).
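The same idea as a small sketch, in Python instead of PHP: html.escape stands in for htmlspecialchars() and json.dumps for json_encode(). The two helper names are made up. One assumption to be aware of: Python's json.dumps does not escape "<", so for a value placed inside an inline script block the sketch escapes it additionally.

```python
import html
import json

def for_html(value: str) -> str:
    """Encode a value for HTML text or a quoted attribute value."""
    return html.escape(value, quote=True)

def for_inline_script(value: str) -> str:
    """Encode a value as a JavaScript string literal inside a <script> block."""
    # json.dumps yields a valid JS string literal; escaping "<" as well keeps
    # a "</script>" inside the data from closing the script element early.
    return json.dumps(value).replace("<", "\\u003c")

query = '<marquee>"hi"</marquee>'
print(for_html(query))           # safe to place in HTML output
print(for_inline_script(query))  # safe to place in: var q = ...;
```

The same value gets two different encodings, and neither result is suitable for the other context.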
And for passing the search query to a search service, it may well be encoded differently again, e.g. as XML, JSON or SQL (for which most database drivers offer a nice feature called parameterized queries, or more formally prepared statements, which are of great help in handling input and output context more easily - common problems, common solutions).
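A minimal sketch of a parameterized query, using Python's built-in sqlite3 module so it runs without a database server (in PHP the counterpart would be PDO prepared statements). The table, its rows and the function name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [("PHP security basics",), ("Cooking with gas",)],
)

def search_titles(connection: sqlite3.Connection, user_query: str) -> list:
    # The ? placeholder passes the value separately from the SQL text,
    # so quotes in user_query cannot change the statement itself.
    pattern = f"%{user_query}%"
    rows = connection.execute(
        "SELECT title FROM articles WHERE title LIKE ?", (pattern,)
    )
    return [title for (title,) in rows]

print(search_titles(conn, "security"))
print(search_titles(conn, "'; DROP TABLE articles; --"))
```

The second call simply finds nothing and the table is still there. Note that % and _ in the user's text still act as LIKE wildcards here; that is a search-semantics question, not an injection.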
prevent XSS, SQL Injection, etc, and things I can't even think of. I want to have all my bases covered.
You may already now spot the "error" with this "search query". It's not about the part that there are things you or anyone else can't even think of. Regardless of how much knowledge you have, there will always be known and unknown unknowns, next to the sheer number of mistakes we encode into software every other day. The one "wrong" is perhaps in thinking that there would be a one-size-fits-all solution (even with good intent, as things must have been solved already - and truly most have been, but one still needs to learn about them first, so it is good that you ask), and perhaps more important, the other one is to assume that others are solving your problems: your technical problems perhaps, but your own problems you can only solve yourself. And if that sentence sounds hard, take the good side of it: you can solve them. And I write this even though I can only give a lengthy answer to your question.
So take any security advice - including the text-wall I just placed here - on Stackoverflow or elsewhere with a grain of salt. Only your own sharp eyes can decide if they are appropriate to cover your bases.
Older PHP Security Poster (via my blog)

Possible XSS Attack in JavaScript

Fortify on Demand shows me this code as a possible XSS problem:
if (window.location.search != '') {
    window.location.href = window.location.href.substr(0,baseurl.length+1)+'currencyCode='+event.getCurrencyCode()+'&'+window.location.href.substr(baseurl.length+1);
} else {
    window.location.href = window.location.href.substr(0,baseurl.length)+'?currencyCode='+event.getCurrencyCode()+window.location.href.substr(baseurl.length);
}
I'm far from being JavaScript expert, but I need to fix this code.
Can you please help?
I think Fortify has found that event.getCurrencyCode() could be any length string and may contain a cross-site scripting attack that might send an unsuspecting user to a malicious site or cause the browser to load JavaScript that does bad things to the user. You might be able to tell this by looking at the details tab of the finding in Fortify's Audit Workbench tool.
Assuming that the potentially malicious data could be supplied by event.getCurrencyCode, you need to whitelist-validate this value, either when the event is sourced or here in this code. I'm going to bet that the range of currency code values in this application is relatively small and that each is of limited length, so it should be straightforward to whitelist this value using JavaScript's built-in regex functionality.
As it stands now JavaScript will happily add a practically unlimited length string in that URL if it is supplied by the event, and with the UTF-8 character set there is a lot that an attacker can do (inlined JavaScript, etc.)
Hope this helps. Good luck.
Unfortunately the existing answer is incorrect, and ivy_lynx's comment doesn't really address the question.
Fortify On Demand is reporting a data flow vulnerability. This is a case where some data originates anywhere except from the programmer - and is reflected to the unaware end user's browser.
The potentially dangerous data comes from:
event.getCurrencyCode()
You haven't posted enough source for us to know what this is, but a pretty good guess is that the function is supposed to return nothing other than an ISO currency code, or rather three, uppercase letters. ("EUR" or "JPY" etc.) Note I am making a big assumption here; I cannot see the code.
The potentially dangerous data ends up going into the browser's location.
The problem is that the developer has no guarantee what event will be sent, or that unexpected data might appear in that currency code.
The simplest fix is for you to transform the return value from "event.getCurrencyCode()" into three guaranteed uppercase letters. There is no known attack that can be expressed in three such uppercase letters. So you could replace:
event.getCurrencyCode()
with
/^[A-Z][A-Z][A-Z]$/.exec( event.getCurrencyCode() )
(reference: http://www.w3schools.com/jsref/jsref_regexp_exec.asp )
That will correctly build your URL if and only if event.getCurrencyCode() resolves to three uppercase letters like "USD". Otherwise, "null" will go into the URL at the point where the currency code was expected.
Obviously, you need to work with a real JavaScript developer to implement such a fix so that no further problems are introduced.
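For what it's worth, here is the same allow-list idea as a small Python sketch, not the JavaScript fix itself. It assumes, like the answer above, that a valid currency code is exactly three uppercase letters. Unlike the exec() one-liner it refuses bad values outright instead of putting "null" into the URL. The function name and the example URL are made up.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CURRENCY_CODE = re.compile(r"[A-Z]{3}")  # assumed format: three uppercase letters

def with_currency(url: str, currency_code: str) -> str:
    """Return url with currencyCode set, refusing anything but three A-Z letters."""
    if not CURRENCY_CODE.fullmatch(currency_code):
        raise ValueError("unexpected currency code")
    parts = urlsplit(url)
    # Keep the existing parameters, drop any previous currencyCode.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "currencyCode"]
    params.insert(0, ("currencyCode", currency_code))
    return urlunsplit(parts._replace(query=urlencode(params)))

print(with_currency("https://shop.example/cart?item=42", "EUR"))
# https://shop.example/cart?currencyCode=EUR&item=42
```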

Basic Knowledge about HTML Query Strings needed

I am new to the world of web design and have already set myself a task that is (at least for me) very difficult: I want to build a webpage that sends a query string to the website of the German Railway (bahn.de) with the parameters I entered on my webpage.
My question now is whether there is a way to decipher the answer that the other webpage (bahn.de) sends back in response to my query string.
In my case there will be departure and arrival times, fares, line numbers, .... Is it possible to extract this information from the answer the bahn.de- page is sending?
First and foremost you need to determine what type of data-encoding the website is returning. Is it XML or perhaps JSON? Both formats can be dealt with using a server-side language such as PHP, but the extraction process may differ slightly.
In order to continue along your learning path (which is great, by the way!) you'll need to find out a bit more about what kind of data object the website is sending back from your query. There are plenty of great resources at the other end of a Google search that can teach you how to handle that incoming data once you know which format it is in.
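To give a rough idea of the two halves (building the query string, then pulling values out of the answer), here is a small Python sketch. Everything specific in it is an assumption: the endpoint timetable.example, the parameter names and the JSON shape are invented for illustration and are not bahn.de's real interface. No request is actually sent; the response is a hard-coded sample.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://timetable.example/search"

def build_search_url(origin: str, destination: str, date: str) -> str:
    """Encode the form values into a query string and append it to the base URL."""
    query = urlencode({"from": origin, "to": destination, "date": date})
    return f"{BASE_URL}?{query}"

def extract_connections(response_body: str) -> list:
    """Pull departure, arrival and fare out of a JSON answer of an assumed shape."""
    payload = json.loads(response_body)
    return [
        (c["departure"], c["arrival"], c["fare"])
        for c in payload["connections"]
    ]

url = build_search_url("Berlin Hbf", "München Hbf", "2024-05-01")
sample_body = '{"connections": [{"departure": "08:04", "arrival": "12:02", "fare": 59.9}]}'
print(url)
print(extract_connections(sample_body))
```

If the site answers with an HTML page rather than JSON or XML, the extraction step becomes HTML parsing instead, but the overall flow stays the same.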

Why does gmail use eval?

This question suggests that using eval is a bad practice and many other questions suggest that it is 'evil'.
An answer to the question suggests that using eval() could be helpful in one of these cases:
Evaluate code received from a remote server. (Say you want to make a site that can be remotely controlled by sending JavaScript code to it?)
Evaluate user-written code. Without eval, you can't program, for example, an online editor/REPL.
Creating functions of arbitrary length dynamically (function.length is readonly, so the only way is using eval).
Loading a script and returning its value. If your script is, for example, a self-calling function, and you want to evaluate it and get its result (e.g. my_result = get_script_result("foo.js")), the only way of programming the function get_script_result is by using eval inside it.
Re-creating a function in a different closure.
While looking at the Google Accounts page Source code I've found this:
(function(){eval('var f,g=this,k=void 0,p=Date.now||function(){return+new Date},q=function(a,b,c,d,e){c=a.split("."),d=g,c[0]in d||!d.execScript||d.execScript("var "+c[0]);for(;c.length&&(e=c.shift());) [a lot of code...] q("botguard.bg.prototype.invoke",K.prototype.ha);')})()</script>
I just can't see how this is helpful, as it does not match any of the above cases. A comment there says:
/* Anti-spam. Want to say hello? Contact (base64)Ym90Z3VhcmQtY29udGFjdEBnb29nbGUuY29tCg== */
I can't see how eval would be used as anti-spam . Can somebody tell me why is it used in this specific case?
Mike Hearn from plan99.net created an anti-bot JS system, and what you see are parts of its anti-reverse-engineering methods (random encryption). He mentions it in this mailing list post: https://moderncrypto.org/mail-archive/messaging/2014/000780.html
[messaging] Modern anti-spam and E2E crypto
Mike Hearn
Fri Sep 5 08:07:30 PDT 2014
There's a significant amount of magic involved in preventing bulk signups.
As an example, I created a system that randomly generates encrypted
JavaScripts that are designed to resist reverse engineering attempts. These
programs know how to detect automated signup scripts and entirely wiped
them out
http://webcache.googleusercontent.com/search?q=cache:v6Iza2JzJCwJ:www.hackforums.net/archive/index.php/thread-2198360.html+&cd=8&hl=en&ct=clnk&gl=ch
You can google for information about the system using its "Ym90Z3VhcmQtY29udGFjdEBnb29nbGUuY29tCg" base64 contact code or "botguard-contact".
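The contact code in that comment is plain base64, so it is easy to check for yourself. A short Python sketch follows; it uses the padded form of the string as it appears in the page's comment (with the trailing "=="), since the unpadded form would need the padding added back before decoding.

```python
import base64

# Padded form, exactly as it appears in the comment on the Google Accounts page.
encoded = "Ym90Z3VhcmQtY29udGFjdEBnb29nbGUuY29tCg=="

decoded = base64.b64decode(encoded).decode("ascii")
print(repr(decoded))  # repr() makes the trailing newline visible
```

The decoded text starts with "botguard-contact" and ends with a newline character.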
The post http://webcache.googleusercontent.com/search?q=cache:v6Iza2JzJCwJ:www.hackforums.net/archive/index.php/thread-2198360.html+&cd=8&hl=en&ct=clnk&gl=ch says:
The reason for this is being the new protection google introduced a couple of weeks/months ago.
Let me show you a part of the new Botguard ( as google calls it )
Code:
/* Anti-spam. Want to say hello? Contact (base64) Ym90Z3VhcmQtY29udGFjdEBnb29nbGUuY29tCg== */
You will have to crack the algorithm of this javascript, to be able to create VALID tokens that allow you to register a new account.
Google still allows you to create accounts without these tokens, and you wanna know why?
Its because they wait a couple of weeks, follow up the trace you and your stupid bot leave behind and than make a banwave.
ALL accounts you've sold, all accounts your customers created will be banned.
Your software might be able to be able to still create accounts after the banwave, but whats the use?
So, botguard is an optional security measure. It can be computed correctly in a browser, but not in some/most of the JavaScript engines used by bots. You can bypass it by not submitting a correct code, but the created account will be marked as a bot account and will soon be disabled (and linked accounts will be terminated too).
There are also several epic threads on GitHub:
https://github.com/assaf/zombie/issues/336
Why does Zombie produce an improper output compared to the more basic contextify version in the following example?
Output varies depending on when document.bg is initialized to new botguard.bg(), because the botguard script mixes in a timestamp salt when encoding.
mikehearn commented on May 21, 2012
Hi there,
I work for Google on signup and login security.
Please do not attempt to automate the Google signup form. This is not a good idea and you are analyzing a system that is specifically designed to stop you.
There are no legitimate use cases for automating this form. If you do so and we detect you, the accounts you create with it will be immediately terminated. Accounts associated with the IPs you use (ie, your personal accounts) may also be terminated.
If you believe you have a legitimate use case, you may be best off exploring other alternatives.
In the https://github.com/jonatkins/ingress-intel-total-conversion/issues/864 thread there are some details:
a contains heavily obfuscated code that starts with this comment:
The code contains a lot of generic stuff: useragent sniffing (yay, Internet Explorer), object type detection, code for listening to mouse/kb events... So it's looks like some generic library. After that there's a lot of cryptic stuff that makes absolutely no sense. The interesting bit is that it calls something labeled as "botguard.bg.prototype.invoke".
Evidently this must be google's botguard. From what I know, It collects data about user behavior on the page and its browser and avaluates it against other know data, this way it can detect anomaly usage and detect bots (kinda like clienBlob in ingress client). My guess would be it's detecting what kind of actions it takes the user to send requests (clicks, map events would be the most sensible)
So, Google uses evil eval to fight evil users, who are unable to emulate the evaluated code fast/correctly enough.
eval() is dangerous when used on untrusted input. When used on a hardcoded string, that's not generally the case.
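The distinction is the same in any language that has an eval. A tiny illustration in Python, not JavaScript; the names are made up and json.loads is just one example of treating outside input as data instead of code.

```python
import json

# Hardcoded source that the author ships: eval runs exactly what was written.
TRUSTED_SNIPPET = "2 ** 10"
print(eval(TRUSTED_SNIPPET))  # 1024

def parse_untrusted(text: str):
    """Treat outside input as data: parse it, never execute it."""
    return json.loads(text)

print(parse_untrusted("[1, 2, 3]"))  # [1, 2, 3]

# Code smuggled in as "data" is rejected by the parser instead of being run.
try:
    parse_untrusted('__import__("os").getcwd()')
except json.JSONDecodeError:
    print("rejected")
```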
