Sanitizers vs. dangerouslySetInnerHTML

According to some React documentation:
Improper use of the innerHTML can open you up to a cross-site
scripting (XSS) attack. Sanitizing user input for display is
notoriously error-prone, and failure to properly sanitize is one of
the leading causes of web vulnerabilities on the internet.
It seems that improper use of sanitizers and innerHTML can expose the site to XSS (cross-site scripting) attacks.
On the other hand, other documentation (such as Gatsby's, or that of the sanitizers themselves) recommends them:
The most straightforward way to prevent a XSS attack is to sanitize
the innerHTML string before dangerously setting it. Fortunately, there
are npm packages that can accomplish this; packages like sanitize-html
and DOMPurify.
What's the best and safest approach to avoid exposing an application to XSS attacks in React while also avoiding improper usage of sanitizers?

The two options are not in contrast with each other:
Improper use of the innerHTML can open you up to a cross-site scripting (XSS) attack
Emphasis on 'improper'.
sanitize the innerHTML string before dangerously setting it
Using an established and well-known library to sanitize the input before setting it is safe, because it is not an improper use of innerHTML.

I think the best and safest approach, as has been said in the comments (especially by Corey Ward), is to avoid dangerouslySetInnerHTML wherever possible, before reaching for sanitizers. There are some great libraries, such as markdown-to-jsx, that provide the benefit of dangerouslySetInnerHTML (rendering HTML) without exposing the site to XSS attacks.
If the only solution for the use case is dangerouslySetInnerHTML, then a sanitizer must be used, keeping in mind that it should be configured to keep styles, classes, and other desired attributes so that the intended formatting is not lost.
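A minimal sketch of that last case, assuming DOMPurify as the sanitizer. ALLOWED_TAGS and ALLOWED_ATTR are DOMPurify option names; the helper name toInnerHtmlProp and the particular allow-list values are assumptions made for illustration, not a recommended policy.

```javascript
// Allow-list handed to the sanitizer. ALLOWED_TAGS / ALLOWED_ATTR are DOMPurify
// options; the values are an assumption, adjust them to the content you expect.
const sanitizerConfig = {
  ALLOWED_TAGS: ['p', 'b', 'i', 'em', 'strong', 'a', 'ul', 'ol', 'li', 'span'],
  ALLOWED_ATTR: ['href', 'class', 'style'],
};

// Hypothetical helper: builds the object that dangerouslySetInnerHTML expects
// and refuses to run when no sanitizer function is supplied.
function toInnerHtmlProp(dirtyHtml, sanitize) {
  if (typeof sanitize !== 'function') {
    throw new TypeError('a sanitizer function is required');
  }
  return { __html: sanitize(String(dirtyHtml), sanitizerConfig) };
}

// In a browser app the sanitizer would be the real library, for example:
//   import DOMPurify from 'dompurify';
//   const clean = (html, cfg) => DOMPurify.sanitize(html, cfg);
//   <div dangerouslySetInnerHTML={toInnerHtmlProp(html, clean)} />
```

Passing the sanitizer in as an argument keeps every use of dangerouslySetInnerHTML behind one function, so there is no code path that sets raw HTML by accident.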

Related

React: dangerouslySetInnerHTML

I have a question about dangerouslySetInnerHTML: is it dangerous to use when the HTML comes from my API, or does that make no difference and I may be attacked anyway? Thanks

When the HTML comes from any API is when it's most dangerous to use dangerouslySetInnerHTML.
This is covered in the docs:
In general, setting HTML from code is risky because it’s easy to inadvertently expose your users to a cross-site scripting (XSS) attack.
Essentially, the issue is that if your server doesn't properly sanitise user input, you open your site up to attacks that React protects you from by default. For example, let's say you allow users to write comments on your website, and your API sends back those comments as HTML. If a user writes a comment like
my comment <img src="./404" onerror="alert('test')" />
then that JavaScript code might be executed whenever someone else views that comment, unless you remember to protect against that when processing the comments.
I'd recommend returning your data from your API in a different format (e.g. JSON), and then processing it into React elements on the frontend.
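To illustrate the default protection mentioned above: React escapes string children, so <p>{comment.text}</p> shows the payload as text instead of running it. The sketch below imitates that escaping in plain JavaScript so it can run outside React; escapeHtml and the shape of apiResponse are assumptions for the example.

```javascript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  const replacements = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Hypothetical API response: the comment travels as plain data, not as HTML.
const apiResponse = JSON.stringify({
  author: 'mallory',
  text: 'my comment <img src="./404" onerror="alert(\'test\')" />',
});

const comment = JSON.parse(apiResponse);
// The markup in the comment is turned into inert text before it is displayed.
const rendered = `<p>${escapeHtml(comment.text)}</p>`;
console.log(rendered);
```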

DangerouslySetInnerHTML
Yes, there's a potential concern, although you may gain some performance, since React doesn't compare the dangerouslySetInnerHTML content against the virtual DOM.
HOW EXACTLY IS IT DANGEROUS?
If a user adds malicious code to a comment or other input, and that data is returned by the API and rendered through dangerouslySetInnerHTML, your application is exposed to an XSS attack (https://owasp.org/www-community/attacks/xss/).
HOW CAN POTENTIAL XSS BE PREVENTED?
As a best practice, sanitize the HTML before passing it to dangerouslySetInnerHTML, using an external module such as DOMPurify (https://github.com/cure53/DOMPurify). It removes potentially malicious code from the HTML and helps prevent XSS.

Is setting innerHTML from sanitized HTML still dangerous?

There is a replacement for innerHTML in React: dangerouslySetInnerHTML.
Its name scares me.
In the React documents I read that:
In general, setting HTML from code is risky because it’s easy to inadvertently expose your users to a cross-site scripting (XSS) attack.
But I sanitized my html with dompurify.
Does this completely protect me from XSS attacks?

But I sanitized my html with dompurify. Does this completely protect me from XSS attacks?
Likely yes, but it's not 100% guaranteed. If DOMPurify doesn't have bugs that will let XSS through, setting innerHTML or dangerouslySetInnerHTML with its results will be safe. DOMPurify is open-source and relatively popular, so if it did have such vulnerabilities, they would probably have been seen by now.
But, like with everything humans do, mistakes and coincidences that result in vulnerabilities not being seen are still possible.

Its name scares me.
It's supposed to. :-)
But I sanitized my html with dompurify. Does this completely protect me from XSS attacks?
It claims to:
DOMPurify sanitizes HTML and prevents XSS attacks. You can feed DOMPurify with string full of dirty HTML and it will return a string (unless configured otherwise) with clean HTML. DOMPurify will strip out everything that contains dangerous HTML and thereby prevent XSS attacks and other nastiness.
Whether you believe the claim is really your call to make. That said, sanitizing HTML is a well-studied problem so it's certainly possible to do. I make no claims for that particular library, which I haven't used or audited.

JavaScript framekiller and XSS vulnerability

Are all known JavaScript framekillers vulnerable to XSS?
If yes, would it be enough to sanitize window.location before breaking out of an iframe?
What would be the best way to do it?
Could you please give an example of possible XSS attack?
Thanks!
UPD: The reason I'm asking is because I got a vulnerability scan alert saying that JS framekiller code containing top.location.replace(document.location) is XSS vulnerable as document.location is controlled by the user.

What was right in their description: variables like 'document.location', 'window.location', and 'self.location' are (partially) controlled by an untrusted user. This is because the untrusted domain and page location ('http://non.trusted.domain.com/mypage') and the untrusted request string ('http://my.domain.com/?myrequest') are chosen by the user, whose intentions may not always be good for you.
What was wrong: this user dependency is not necessarily an XSS vulnerability. For XSS to occur, you need some code that actually places content controlled by an untrusted user into the output stream of your page. A simple framekiller like top.location.replace(window.location) carries no danger of XSS.
One example where we could talk about XSS would be code like
document.write('<a href="' + document.location + '">Click here</a>')
Constructing a URI like http://test.com/?dummy"<script>alert("Test")</script>"dummy, which your code then substitutes for document.location, will trigger an untrusted script in the context of the trusted web page. Since constructing such a URI and getting it passed unescaped is a challenge, real XSS tends to work in more complex scenarios that involve inserting untrusted variables verbatim into a flow of language directives, be it HTML, CSS, JS, PHP, etc.
Another well-known example of XSS-unaware development is the early history of JSON. While JSON has become very popular (and I am among its proponents), it was initially intended as a "quick-n-dirty" way of storing JS data as plain JS-formatted data structures. To "parse" a JSON block, it was enough to eval() it. Fortunately, people quickly recognised how flawed this idea was, so nowadays any knowledgeable developer will use a proper, safe JSON parser instead.
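A small sketch of the difference, using only the built-in JSON.parse. The wrapper name parseUntrustedJson and the two sample strings are assumptions for the example; the point is that a parser rejects anything that is not data, whereas eval() would run it.

```javascript
// Parse text as data only. Anything that is not valid JSON is rejected
// instead of being executed.
function parseUntrustedJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

const valid = '{"user":"alice","admin":false}';
// Not JSON: an expression that eval() would happily execute as code.
const hostile = '({"user":"alice"}, globalThis.pwned = true)';

console.log(parseUntrustedJson(valid).ok);
console.log(parseUntrustedJson(hostile).ok);
```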

Are DOM based XSS attacks still possible in modern browsers?

I am currently doing some research into XSS prevention but I am a bit confused about DOM based attacks. Most papers I have read about the subject give the example of injecting JavaScript through URL parameters to modify the DOM if the value is rendered in the page by JavaScript instead of server-side code.
However, it appears that all modern browsers encode all special characters given through URL parameters if rendered by JavaScript.
Does this mean DOM based XSS attacks cannot be performed unless against older browsers such as IE6?

They are absolutely possible. If you don't filter output that originated from your users, that output can be anything, including scripts. The browser doesn't have a way to know whether it is a legitimate script controlled by you or not.
It's not a matter of modern browsers; it's the basic principle that the browser treats all content that comes from your domain as legitimate to execute.
There are other aspects that are indeed blocked (sometimes, not always) by modern browsers (although security flaws always exist) like cross-domain scripting, 3rd party access to resources etc.
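On the question's point about URL encoding, here is a small sketch of why it does not settle the matter: the payload may be percent-encoded in the address bar, but a script that reads the parameter through the standard URL API gets the decoded markup back. The example.com URL and the variable names are assumptions for the example.

```javascript
// A URL as it might appear in the address bar: the payload is percent-encoded.
const url = new URL('https://example.com/search?q=%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E');

// Reading the parameter decodes it back to raw markup.
const query = url.searchParams.get('q');
console.log(query);

// What happens next depends on the sink the page's own script uses:
//   unsafe (browser only): element.innerHTML = 'Results for ' + query;
//   safer  (browser only): element.textContent = 'Results for ' + query;
```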

Forget about those old-school XSS examples from 10 years ago. Programmers who write JavaScript that renders a page from unescaped query params have either been fired or switched to frameworks like Angular/Backbone a long time ago.
However, reflected/stored XSS is still widespread. Preventing it requires proper escaping on both the server side and the client side. Modern frameworks all provide good support for escaping sensitive characters when rendering HTML. For example, when rendering views from model data, Angular has the $sce (Strict Contextual Escaping) service (https://docs.angularjs.org/api/ng/service/$sce) to address possible XSS threats. Backbone models also have methods like "model.escape(attribute)" (http://backbonejs.org/#Model-escape) to eliminate XSS threats.

Is it possible to sanitize Javascript code?

I want to allow user contributed Javascript in areas of my website.
Is this completely insane?
Are there any Javascript sanitizer scripts or good regex patterns out there to scan for alerts, iframes, remote script includes and other malicious Javascript?
Should this process be manually authorized (by a human checking the Javascript)?
Would it be more sensible to allow users to only use a framework (like jQuery) rather than giving them access to actual Javascript? This way it might be easier to monitor.
Thanks

I think the correct answer is 1.
As soon as you allow Javascript, you open yourself and your users to all kinds of issues. There is no perfect way to clean Javascript, and people like the Troll Army will take it as their personal mission to mess you up.

1. Is this completely insane?
Don't think so, but near. Let's see.
2. Are there any Javascript sanitizer scripts or good regex patterns out there to scan for alerts, iframes, remote script includes and other malicious Javascript?
Yeah, at least there are Google Caja and ADsafe to sanitize the code, allowing it to be sandboxed. I don't know what degree of trustworthiness they provide, though.
3. Should this process be manually authorized (by a human checking the Javascript)?
It's possible for the sandbox to fail, so it would be a sensible solution, depending on the risk and the trade-off of being attacked by malicious (or faulty) code.
4. Would it be more sensible to allow users to only use a framework (like jQuery) rather than giving them access to actual Javascript? This way it might be easier to monitor.
jQuery is just plain JavaScript, so if you're trying to protect against attacks, it won't help at all.
If it is crucial to prevent this kind of attack, you can implement a custom language, parse it in the backend and produce controlled, safe JavaScript; or you may consider another strategy, like providing an API and accessing it from a third-party component of your app.

Take a look at Google Caja:
Caja allows websites to safely embed DHTML web applications from third parties, and enables rich interaction between the embedding page and the embedded applications. It uses an object-capability security model to allow for a wide range of flexible security policies, so that the containing page can effectively control the embedded applications' use of user data and to allow gadgets to prevent interference between gadgets' UI elements.

Instead of checking for evil things like script includes, I would go for regex-based whitelisting of the few commands you expect to be used. Then involve a human to authorize and add new acceptable commands to the whitelist.
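A rough sketch of what such a whitelist could look like. The three commands (show, hide, setText), their syntax, and the function name reviewScript are all made up for the example; a real list would hold whatever commands you actually expect.

```javascript
// Hypothetical allow-list: each pattern matches one complete, permitted command.
const allowedCommands = [
  /^show\(#[A-Za-z][\w-]*\)$/,
  /^hide\(#[A-Za-z][\w-]*\)$/,
  /^setText\(#[A-Za-z][\w-]*, "[^"<>\\]*"\)$/,
];

// Split a submitted script into lines and report every line that is not on
// the allow-list, so a human can review those before anything is accepted.
function reviewScript(source) {
  const lines = source
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const rejected = lines.filter(
    (line) => !allowedCommands.some((pattern) => pattern.test(line))
  );
  return { accepted: rejected.length === 0, rejected };
}

console.log(reviewScript('show(#banner)\nsetText(#title, "Hello")'));
console.log(reviewScript('hide(#ad)\ndocument.cookie'));
```

Accepted lines should still be carried out by your own code that understands these commands, not handed to eval(); the patterns only decide what is let through.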

Think about all of the things YOU can do with JavaScript. Then think about the things you would do if you could do them on someone else's site. These are things that people will do just because they can, or to find out if they can. I don't think it is a good idea at all.

It might be safer to design/implement your own restricted scripting language, which can be very similar to JavaScript, but which is under the control of your own interpreter.
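As a toy illustration of the idea, here is a tiny interpreter for an arithmetic-only language. The grammar (numbers, + - * /, parentheses) and the function name evaluate are assumptions chosen to keep the sketch short; a real restricted language would need far more design. The point is that the script is never passed to eval(), so it can only do what the interpreter implements.

```javascript
// Tiny arithmetic language: numbers, + - * /, unary minus, parentheses.
// Nothing else exists in the language, so a script has no way to name the
// DOM, cookies, or the network.
function evaluate(source) {
  // Any other non-space character becomes a one-character token and is
  // rejected later by the parser.
  const tokens = source.match(/\d+(?:\.\d+)?|[-+*\/()]|\S/g) || [];
  let pos = 0;

  function primary() {
    const tok = tokens[pos++];
    if (tok === '(') {
      const value = expression();
      if (tokens[pos++] !== ')') throw new SyntaxError('expected ")"');
      return value;
    }
    if (tok === '-') return -primary();
    if (tok !== undefined && /^\d/.test(tok)) return Number(tok);
    throw new SyntaxError(`unexpected token: ${tok}`);
  }

  function term() {
    let value = primary();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const rhs = primary();
      value = op === '*' ? value * rhs : value / rhs;
    }
    return value;
  }

  function expression() {
    let value = term();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const rhs = term();
      value = op === '+' ? value + rhs : value - rhs;
    }
    return value;
  }

  const result = expression();
  if (pos < tokens.length) throw new SyntaxError(`unexpected token: ${tokens[pos]}`);
  return result;
}

console.log(evaluate('2 + 3 * (4 - 1)'));
```

Input such as alert(1) fails with a SyntaxError at the first letter, because identifiers are simply not part of the language.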

1. Probably. The scope for doing bad things is going to be much greater than it is when you simply allow HTML but try to avoid allowing JavaScript.
2. I do not know.
3. Well, two things: do you really want to spend your time doing this, and if you do, you had better make sure they see the JavaScript source code rather than actual live JavaScript!
4. I can't see why this would make any difference, unless you do have someone approving posts and that person happens to be more at home with jQuery than plain JavaScript.

Host it on a different domain. Same-origin security policy in browsers will then prevent user-submitted JS from attacking your site.
It's not enough to host it on a different subdomain, because subdomains can set cookies on the higher-level domain, and this could be used for session fixation attacks.
