How can I disable JavaScript for some part of my page? For example, I have the following structure:
<html>
<head>
my js files
</head>
<body>
<div>
my components (they use JavaScript)
</div>
<div>
some untrusted content (may include elements with JavaScript triggers such as 'onload' or something like that)
</div>
</body>
</html>
I don't want to process this content; I only want to output it 'AS IS', but without being vulnerable to XSS attacks.
update
I want to build a small service for posting text from a simple form and saving it to the database. I also want to show it to the user in preview mode on the HTML page (it includes two elements: a header and a body).
Try replacing the <'s in your untrusted content with &lt;'s:
var somewhatSanitizedContent = untrustedContent.replace(/</g, "&lt;"); // the /g flag replaces every occurrence, not just the first
This should make all HTML tags appear as plain text, disabling <script> tags as well.
However, it may be a good idea to sanitize your input before storing it, instead of after.
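The replacement above can be extended a little. Here is a minimal sketch, assuming the untrusted content ends up as text between tags (not inside an attribute or a script block). The name escapeHtmlText is made up for this example.

```javascript
// Hypothetical helper: escape the characters that are significant in HTML text content.
function escapeHtmlText(untrusted) {
  return String(untrusted)
    .replace(/&/g, "&amp;") // must run first so the entities added below are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const untrustedContent = '<img src=x onerror="alert(1)"> & <script>alert(2)</script>';

// Prints: &lt;img src=x onerror="alert(1)"&gt; &amp; &lt;script&gt;alert(2)&lt;/script&gt;
console.log(escapeHtmlText(untrustedContent));
```

The browser then shows the markup as visible text instead of building elements from it.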
Related
I'm working to address some stored XSS vulnerabilities and I am using HTML Purifier. I have an input box on the page, and if I type '" onclick="alert(1);" the code is saved to the database and executed on the client. This happens even after running the input and output through the purifier. It seems as if HTML Purifier only strips these attributes when they are included within an HTML tag. I'm wondering if there is some config for the purifier that will strip just the event attributes, or any other suggestions on how to clean these up.
HTML Purifier is purely intended for use on content which will be used as HTML on a page. It is not appropriate for validating content which, for example, will go in an attribute for an HTML element.
You can use some internal APIs of HTML Purifier to validate content for this case. However, for the example quoted in the comments, all you need is htmlspecialchars to do the right thing. The right choice of validator depends on what attribute you put the content in.
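To illustrate why quote escaping is what matters in an attribute, here is a rough JavaScript sketch of the idea. escapeHtmlAttr is a hypothetical helper that approximates what PHP's htmlspecialchars does with ENT_QUOTES; the input element and the payload are assumptions modelled on the question.

```javascript
// Hypothetical helper, roughly equivalent to htmlspecialchars($value, ENT_QUOTES) in PHP.
function escapeHtmlAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Payload similar to the one in the question.
const payload = '" onclick="alert(1);"';

// Unescaped: the first quote closes the value attribute and onclick becomes a real attribute.
const unsafe = '<input type="text" value="' + payload + '">';

// Escaped: the whole payload stays inside the value attribute as plain text.
const safe = '<input type="text" value="' + escapeHtmlAttr(payload) + '">';

console.log(unsafe);
console.log(safe);
```

Note that the payload contains no tags at all, which is consistent with a tag-oriented purifier leaving it alone.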
I was trying to perform a reflected XSS attack on a tutorial website. The webpage basically consists of a form with an input field and a submit button. On submitting the form, the contents of the input field are displayed on the same webpage.
I figured out that the website is blacklisting script tag and some of the JavaScript methods in order to prevent an XSS attack. So, I decided to encode my input and then tried submitting the form. I tried 2 different inputs and one of them worked and the other one didn't.
When I tried:
<body onload="alert('Hi')"></body>
It worked and an alert box was displayed. However, when I encoded some characters in the HTML tag, something like:
&lt;body onload="alert('Hi')"&gt;&lt;/body&gt;
It didn't work! It simply printed <body onload="alert('Hi')"></body> as it is on the webpage!
I know that the browsers execute inline JavaScript as they parse an HTML document (please correct me if I'm wrong). But, I'm not able to understand why did the browser show different behavior for the different inputs that I've mentioned.
-------------------------------------------------------------Edit---------------------------------------------------------
I tried the same with a more basic XSS tutorial with no XSS protection. Again:
<script>alert("Hi")</script> -> Worked!
&lt;script&gt;alert("Hi")&lt;/script&gt; -> Didn't work! (Got printed as a string on the web page)
So basically, if I encode anything in JavaScript, it works. But if I'm encoding anything that is HTML, it's not executing the JavaScript within that HTML!
I can't come up with words to describe this properly, so I'll just give you an example. Let's say we have this string:
<div>Hello World! &lt;span id="foo"&gt;Foobar&lt;/span&gt;</div>
When this gets parsed, you end up with a div element that contains the text:
Hello World! <span id="foo">Foobar</span>
Note, while there is something that looks like html inside the text, it is still just text, not html. For that text to become html, it would have to be parsed again.
Attributes work a little bit differently: HTML entities in attributes do get parsed the first time.
tl;dr:
If the service you are using is stripping out tags, there's nothing you can do about it unless the script is poorly written in a way that results in the string getting parsed twice.
Demo: http://jsfiddle.net/W6UhU/ Note how, after setting the div's inner HTML equal to its inner text, the span becomes an HTML element rather than a string.
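As a rough illustration of the "parsed twice" case, here is a sketch of a poorly written filter. Both functions are hypothetical and only stand in for whatever the service might do: the blacklist runs on the encoded input, and a later decoding step brings the tag back.

```javascript
// Hypothetical blacklist filter: removes literal <script> elements only.
function stripScriptTags(input) {
  return input.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
}

// Minimal entity decoder covering only the entities used in this example.
function decodeBasicEntities(input) {
  return input
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" decodes to "&lt;" and not "<"
}

const submitted = '&lt;script&gt;alert("Hi")&lt;/script&gt;';

// The filter finds no literal tag, so the input passes through unchanged.
const filtered = stripScriptTags(submitted);

// The extra decoding step is the second "parse" that recreates the tag.
const decodedOnServer = decodeBasicEntities(filtered);

console.log(filtered);        // &lt;script&gt;alert("Hi")&lt;/script&gt;
console.log(decodedOnServer); // <script>alert("Hi")</script>
```

If the page echoes filtered, the browser shows it as text. If it echoes decodedOnServer, the browser sees a real script element.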
When an HTML page says &lt;body, the browser treats it as the literal text <body.
That is, it just displays the characters that were encoded and doesn't parse them as HTML. So you're not creating a new tag with an onload attribute: http://jsfiddle.net/SSfNw/1/
alert(document.body.innerHTML);
// When an HTML page says &lt;body, it is treated as the literal text <body
So in your case, you're never creating a body tag, just content that ends up getting moved into the body tag http://jsfiddle.net/SSfNw/2/
alert(document.body.innerHTML)
// &lt;body onload="alert('Hi')"&gt;&lt;/body&gt;
In the case of <body onload="alert('Hi')"></body>, the parser is able to create the body tag and, once within the body tag, it's also able to create the onload attribute. Once within the attribute, everything gets parsed as a string.
So I don't know if I'm looking for the wrong thing on Google or Stackoverflow, but I want to achieve this-
There is a text-area in a form and I want the user to be able to enter HTML tags.
So the user would enter this in to the text area:
<html>
<p>Hello World</p>
</html>
This is then submitted by AJAX and JavaScript to the database; however, it seems to get rid of the tags.
What I want is to keep the tags when the data is returned, without affecting the other data in the text area. For example, if I were to echo out the content of the text area, it would echo out:
<html>
<p>Hello World</p>
</html>
as plain text.
Okay, I have gone down the route of using htmlspecialchars, which does what I wanted, as it displays the tags as plain text. However, I would still like some tags to be executed, such as the bold tag. How would I combine htmlspecialchars and strip_tags so that tags are displayed as plain text, while the tags specified in strip_tags are still executed?
There is nothing you need (or can) do to allow users to enter HTML tags. The reason is that the input is read as plain text anyway, so any < character is taken just as-is. So if the user types <a>, these three characters get inserted into the form data.
What you do with the data then, server-side or otherwise, may or may not handle HTML tags. It's all up to your code. If you simply echo everything as such on a generated HTML page, then HTML markup will have the usual effect. If you wish to render it as text, as visible tags, then simply encode any & as &amp; and any < as &lt;.
You don't need to do anything; the tags are kept automatically as long as you don't filter the user-submitted text.
N.B. If you want to echo the entered HTML back to users, be very aware of potential malicious code in the entered HTML. This security issue is known as Cross-site scripting (or XSS).
In other words: never trust the entered code
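For the follow-up in the question (show most tags as plain text but let a few, such as bold, take effect), one possible approach is to escape everything first and then turn back only an allow-list of bare tags. This is a sketch in JavaScript rather than PHP, and the helper names and the tag list are assumptions for illustration.

```javascript
// Escape everything that could start markup or break out of a quoted attribute.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical allow-list: only bare tags with no attributes are restored.
const ALLOWED_TAGS = ["b", "i", "u", "strong", "em"];

function escapeExceptAllowed(text) {
  // Matches an escaped opening or closing tag such as &lt;b&gt; or &lt;/b&gt;.
  const pattern = new RegExp("&lt;(/?)(" + ALLOWED_TAGS.join("|") + ")&gt;", "gi");
  return escapeHtml(text).replace(pattern, "<$1$2>");
}

// Prints: &lt;p&gt;Hello <b>World</b>&lt;/p&gt;
console.log(escapeExceptAllowed("<p>Hello <b>World</b></p>"));

// A tag carrying attributes is not restored, so the onclick stays inert text.
console.log(escapeExceptAllowed('<b onclick="alert(1)">no</b>'));
```

Because only attribute-free tags are restored, event handlers such as onclick never become live. Unbalanced tags can still disturb the layout, so this is not a substitute for a real sanitizer.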
I'm gathering HTML from a HTML-editor and save in my database. I want to display this data to the user, but I don't know how to do this without the HTML-text being affected by the styling of my page.
Are there any cool libraries around which can help me with this, or is there a very simple way using only HTML tags and/or javascript?
The easiest way to do this is probably simply stuffing your HTML into an iframe.
Have a look at this question if you want to set it as HTML: Set content of iframe .
But I typically simply accept that the contents of the iframe are loaded using a separate request.
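If a separate request is not wanted, one option is the iframe's srcdoc attribute. The sketch below builds such a tag as a string; buildPreviewFrame is a hypothetical helper, and using srcdoc together with the sandbox attribute is my assumption, not something from the linked question.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap stored HTML in an iframe so the page's styles do not apply to it.
// An empty sandbox attribute also blocks scripts inside the frame.
function buildPreviewFrame(storedHtml) {
  return '<iframe sandbox srcdoc="' + escapeAttr(storedHtml) + '"></iframe>';
}

// Prints: <iframe sandbox srcdoc="<h1 class=&quot;title&quot;>Hello</h1><p>Tom &amp; Jerry</p>"></iframe>
console.log(buildPreviewFrame('<h1 class="title">Hello</h1><p>Tom & Jerry</p>'));
```

The framed document gets its own styling context, so neither side's CSS leaks into the other.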
I want to display html provided by a user in a page. My page is almost entirely dynamic (JS code), and I was wondering if there's an easy way to sanitize it?
Like, maybe I could remove all the <script> and <iframe> tags and unbind all the events contained in the string (or remove any html attribute starting by 'on') in order to not have any javascript code from the string possibly executed?
Can the users possibly insert javascript with a css 'content' property in a style attribute?
The jquery $(...).text(...) function doesn't help me, since I want to preserve any html mark-up or css styling.
If there's no easy solution i'm ready to live with a whitelist of html tags (table span div img a b u i strong...), but i'd rather not have to white-list the attributes too.
The more foolproof way to show user content safely is to embed it in an iframe whose origin is a different domain than your host web page. This is what jsFiddle does. The main page is served from jsfiddle.net, but the user scripts are served from fiddle.jshell.net. This lets the user content do what it would normally do, but the browser's cross-origin protection keeps the user content from messing with the host page, domain, cookies, etc.
Trying to strip all possible places that scripts could be in the content is a risky proposition in which you will probably be chasing new attack vectors forever. I'd personally much rather let the browser be in that business and put the user content on a different domain. Plus, allowing the user content to have its normal JS will also let it work as desired.