Allow HTML to be entered in text area - javascript

I don't know if I'm searching for the wrong thing on Google or Stack Overflow, but I want to achieve this:
There is a textarea in a form and I want the user to be able to enter HTML tags.
So the user would enter this into the text area:
<html>
<p>Hello World</p>
</html>
This is then submitted by AJAX and JavaScript to the database; however, it seems to get rid of the tags.
What I want is to keep the tags when the data is returned, without affecting the other data in the text area. For example, if I were to echo out the content of the text area, it would echo out:
<html>
<p>Hello World</p>
</html>
as plain text.
Okay, I have gone down the route of using htmlspecialchars, which does what I wanted, as it displays the tags as plain text. However, I would still like some tags to be executed, such as the bold tag. How would I combine htmlspecialchars and strip_tags so that tags are displayed as plain text, while the tags allowed in strip_tags are still executed?

There is nothing you need to (or can) do to allow users to enter HTML tags. The reason is that the input is read as plain text anyway, so any < character is taken just as-is. So if the user types <a>, these three characters get inserted into the form data.
What you do with the data then, server-side or otherwise, may or may not handle HTML tags. It’s all up to your code. If you simply echo everything as such on a generated HTML page, then HTML markup will have the usual effect. If you wish to render it as text, as visible tags, then simply encode any & as &amp; and any < as &lt;.
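As a rough illustration of that encoding step, here is a minimal sketch in JavaScript. The function name escapeHtml is made up for this example (it is not a built-in), and it also escapes >, " and ' so the result can be used inside attribute values as well.

```javascript
// Minimal sketch: escape the characters that are significant in HTML so that
// user input is shown as visible text instead of being parsed as markup.
// escapeHtml is an illustrative name, not a built-in function.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first, otherwise later entities get escaped twice
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const escaped = escapeHtml("<p>Hello World</p>");
// escaped is "&lt;p&gt;Hello World&lt;/p&gt;"
```

If you output the result into a page, the browser shows the tags as plain text.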

You don't need to do anything; it happens automatically as long as you don't filter the user-submitted text.

N.B. If you want to echo the entered HTML back to users, be very aware of potential malicious code in the entered HTML. This security issue is known as Cross-site scripting (or XSS).
In other words: never trust the entered code.
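For the follow-up in the question (show most tags as plain text but still let a few, such as bold, take effect), one possible approach is to escape everything first and then turn a small allowlist of attribute-free tags back into real markup. This is only a sketch in JavaScript: the names escapeHtml, ALLOWED_TAGS and escapeExceptAllowed are invented for the example, and for anything beyond trivial formatting tags a dedicated HTML sanitizer is the safer choice.

```javascript
// Escape the characters that would otherwise start markup or entities.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical allowlist of simple tags that may stay active.
const ALLOWED_TAGS = ["b", "i", "em", "strong"];

function escapeExceptAllowed(text) {
  const escaped = escapeHtml(text);
  // Matches only an escaped opening or closing tag WITHOUT attributes,
  // e.g. &lt;b&gt; or &lt;/b&gt;, so <b onclick="..."> stays escaped.
  const pattern = new RegExp(
    "&lt;(/?)(" + ALLOWED_TAGS.join("|") + ")&gt;",
    "gi"
  );
  return escaped.replace(pattern, "<$1$2>");
}

const result = escapeExceptAllowed("<b>bold</b> <p>para</p>");
// result is "<b>bold</b> &lt;p&gt;para&lt;/p&gt;"
```

Because the ampersand is escaped before anything else, a user who literally types &lt;b&gt; cannot smuggle a tag through the second step.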

Related

what is best practice to show typed data - textarea vs p tag

I have a form in which users create a profile. In the form there is a textarea in which they put their blurb / description.
Later, when I want to show the profile in a view-only screen, what is best practice? To use a <p> tag or a <textarea> tag?
It appears I lose the paragraphs etc when I display the data in a <p> tag.
If the best practice is to use a readonly textarea for viewing, how can one dynamically adjust the rows depending on the length of the text?
Textareas are designed for accepting input from users, not displaying data back to them.
You need to process the submitted data before displaying it on the user profile. Typically this would involve formatting (like splitting the content on double new lines and then wrapping each part of the result with paragraphs) and implementing protection against XSS attacks.
For formatting, you might consider using a Markdown engine (similar to what Stack Overflow does).
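The "split on double new lines and wrap in paragraphs" idea could look roughly like this in JavaScript. It is a sketch, not a full formatter: toParagraphs and escapeHtml are names made up for the example, and single line breaks inside a paragraph are turned into <br> as an extra assumption.

```javascript
// Escape markup characters so user text cannot inject HTML (XSS protection).
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn plain text from a textarea into a series of <p> elements.
function toParagraphs(text) {
  return String(text)
    .replace(/\r\n?/g, "\n")            // normalize line endings to \n
    .split(/\n{2,}/)                    // blank lines separate paragraphs
    .map((part) => part.trim())
    .filter((part) => part.length > 0)  // drop empty chunks
    .map((part) => "<p>" + escapeHtml(part).replace(/\n/g, "<br>") + "</p>")
    .join("\n");
}

const html = toParagraphs("Hi <b>there</b>\nsecond line\n\nNext");
// html is "<p>Hi &lt;b&gt;there&lt;/b&gt;<br>second line</p>\n<p>Next</p>"
```

Escaping happens per paragraph before the <p> and <br> tags are added, so only tags produced by this function end up as real markup.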
Use the <p> tag.
That is fine, but you should process the contents before output with at least two functions:
1) htmlentities() - to protect against XSS attacks and print any HTML symbols as text
2) nl2br() - to add <br/> tags (HTML line breaks) next to \n symbols (new lines in plain text)
You can look at the <pre> tag as well; then you do not need the nl2br() function.

Disable javascript for some part of my page

How can I disable JavaScript for some part of my page? For example, I have the following structure:
html
head
my js files
head
body
div
my components(they use javascript)
div
div
some untrusted content (may contain elements with JavaScript triggers such as 'onload' or something like that)
div
body
html
I don't want to process this content; I only want to output it 'AS IS', but without being vulnerable to XSS attacks.
update
I want to build a small service for posting text from a simple form and saving it to the database. I also want to show it to the user in preview mode on the HTML page (which includes two elements: header and body).
Try replacing the <'s in your untrusted content with &lt;'s:
// The g flag replaces every "<", not just the first one.
var somewhatSanitizedContent = untrustedContent.replace(/</g, "&lt;");
This should make all HTML tags appear as plain text, disabling <script> tags as well.
However, it may be a good idea to sanitize your input before storing it, instead of after.
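One detail worth checking when doing this in JavaScript: String.prototype.replace with a plain string pattern only replaces the first match, so a regular expression with the g flag is needed to catch every <. A small sketch (the sample string is made up):

```javascript
// Sample untrusted input, invented for this example.
const untrustedContent = "<b>one</b> <script>two</script>";

// A string pattern replaces only the FIRST occurrence.
const firstOnly = untrustedContent.replace("<", "&lt;");
// firstOnly is "&lt;b>one</b> <script>two</script>"  (the script tag survives)

// A regular expression with the g flag replaces EVERY occurrence.
const all = untrustedContent.replace(/</g, "&lt;");
// all is "&lt;b>one&lt;/b> &lt;script>two&lt;/script>"
```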

How does a browser render this inline JavaScript within an encoded tag?

I was trying to perform a reflected XSS attack on a tutorial website. The webpage basically consists of a form with an input field and a submit button. On submitting the form, the content of the input field is displayed on the same webpage.
I figured out that the website is blacklisting script tag and some of the JavaScript methods in order to prevent an XSS attack. So, I decided to encode my input and then tried submitting the form. I tried 2 different inputs and one of them worked and the other one didn't.
When I tried:
<body onload="&#97lert('Hi')"></body>
It worked and an alert box was displayed. However, when I encoded some characters in the HTML tag, something like:
&#60body onload="&#97lert('Hi')"&#62&#60/body&#62
It didn't work! It simply printed <body onload="alert('Hi')"></body> as it is on the webpage!
I know that browsers execute inline JavaScript as they parse an HTML document (please correct me if I'm wrong). But I'm not able to understand why the browser behaved differently for the two inputs that I've mentioned.
-------------------------------------------------------------Edit---------------------------------------------------------
I tried the same with a more basic XSS tutorial with no XSS protection. Again:
<script>alert("Hi")</script> -> Worked!
&#60s&#99ript&#62&#97lert("Hi")&#60/s&#99ript&#62 -> Didn't work! (Got printed as string on the Web Page)
So basically, if I encode anything in JavaScript, it works. But if I'm encoding anything that is HTML, it's not executing the JavaScript within that HTML!
I can't come up with words to describe this properly, so I'll just give you an example. Let's say we have this string:
<div>Hello World! &lt;span id="foo"&gt;Foobar&lt;/span&gt;</div>
When this gets parsed, you end up with a div element that contains the text:
Hello World! <span id="foo">Foobar</span>
Note, while there is something that looks like html inside the text, it is still just text, not html. For that text to become html, it would have to be parsed again.
Attributes work a little bit differently: HTML entities in attributes do get parsed the first time.
tl;dr:
if the service you are using is stripping out tags, there's nothing you can do about it unless the script is poorly written in a way that results in the string getting parsed twice.
Demo: http://jsfiddle.net/W6UhU/ note how after setting the div's inner HTML equal to its inner text, the span becomes an HTML element rather than a string.
When an HTML page says &#60body, it treats it the same as if it said &lt;body
That is, it just displays the encoded characters, doesn't parse them as HTML. So you're not creating a new tag with onload attributes http://jsfiddle.net/SSfNw/1/
alert(document.body.innerHTML);
// When an HTML page says &lt;body It treats it the same as if it said &lt;body
So in your case, you're never creating a body tag, just content that ends up getting moved into the body tag http://jsfiddle.net/SSfNw/2/
alert(document.body.innerHTML)
// &lt;body onload="alert('Hi')"&gt;&lt;/body&gt;
In the case of <body onload="&#97lert('Hi')"></body>, the parser is able to create the body tag and, once within the body tag, it is also able to create the onload attribute. Once within the attribute, everything gets parsed as a string.
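To illustrate the "parsed twice" situation mentioned above, here is a rough simulation in plain JavaScript. Both helpers are hypothetical stand-ins written for this example: decodeNumericRefs mimics how decimal character references turn into characters, and stripTags is a naive filter. The point is only the order of operations, not how any real site or browser is implemented.

```javascript
// Hypothetical helper: decodes decimal numeric character references such as
// &#60 or &#60; (the trailing semicolon is treated as optional here).
function decodeNumericRefs(text) {
  return text.replace(/&#(\d+);?/g, (match, code) =>
    String.fromCodePoint(Number(code))
  );
}

// Hypothetical naive filter: removes anything that looks like a tag.
function stripTags(text) {
  return text.replace(/<[^>]*>/g, "");
}

const input = "&#60body onload=\"&#97lert('Hi')\"&#62";

// Filter first, decode afterwards: the filter sees no "<", removes nothing,
// and the later decoding step recreates the tag.
const unsafe = decodeNumericRefs(stripTags(input));
// unsafe is "<body onload=\"alert('Hi')\">"

// Decode first, filter afterwards: the tag is visible to the filter.
const safe = stripTags(decodeNumericRefs(input));
// safe is ""
```

When the encoded input is decoded only once, as character data, the result is just text on the page, which matches what the question observed.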

html markup in form field does not work

I am creating a site where a user can log in and write or paste text into a form field like so
<textarea name="descr" id="descr" class="textformfront" rows="24" cols="50" required onFocus="cleari();"></textarea>
The text is saved in a DB (postgreSQL 9.1-extended with PostGIS 2.0). The data type of the column in the DB is "text". Then the text is displayed in the front-end, in a div like so
<div id="formdescr" style="overflow-y:auto; height:400px; width:100%;"></div>
My problem is that if the user inserts a long text in the form, with paragraphs and breaks, none of those are displayed in the div. In the div all I see is continuous text with no breaks and no paragraphs.
How do I fix this?
Thanks.
UPDATE
I use nodejs 0.10.12 / websockets to transfer from DB to browser and from browser to DB. I put text in the div like document.getElementById("formdescr").innerHTML=descr; where descr came from websockets in the client. In the source code I see no text. The user has to search first and then the div will get text.
Your problem is that browsers ignore white space in content. Multiple spaces and new lines are all collapsed down into one space in the rendered output.
If you want to preserve all of the original formatting, with indents and line breaks, you could output the text into a <pre> block inside that div.
Your other option is to encode the white space into HTML entities. Use <br> for line breaks and &nbsp; for spaces that should be preserved.
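Since the question sets the div content from JavaScript, the encoding could be done on the client right before the innerHTML assignment. The following is a sketch only; preserveWhitespace is a name invented for this example, and it also escapes &, < and > so that user text is not interpreted as markup.

```javascript
// Sketch: make line breaks and runs of spaces survive HTML rendering.
function preserveWhitespace(text) {
  return String(text)
    .replace(/&/g, "&amp;")          // escape markup characters first
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/\r\n?|\n/g, "<br>")    // every kind of line ending becomes <br>
    // In a run of 2+ spaces, keep the last one as a normal space and turn
    // the others into non-breaking spaces so the run is not collapsed.
    .replace(/ {2,}/g, (run) => "&nbsp;".repeat(run.length - 1) + " ");
}

const rendered = preserveWhitespace("a  b\n  c");
// rendered is "a&nbsp; b<br>&nbsp; c"
```

With the element from the question, usage would be along the lines of document.getElementById("formdescr").innerHTML = preserveWhitespace(descr);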
Your solution very likely depends on the backend programming language you use, not the database. I guess this should answer your question if you use php (and if not, you should be able to do the transfer ;-) )

Get raw HTML from a div using js?

I'm working on a website where users can create and save their own HTML forms. Instead of inserting form elements and ids one by one in the database I was thinking to use js (preferably jquery) to just get the form's HTML (in code source format) and insert it in a text row via mysql.
For example I have a form in a div
<div class="new_form">
<form>
Your Name:
<input type="text" name="something" />
About You:
<textarea name=about_you></textarea>
</form>
</div>
With js is it possible to get the raw HTML within the "new_form" div?
To get all HTML inside the div
$(".new_form").html()
To get only the text it would be
$(".new_form").text()
You might need to validate the HTML, this question might help you (it's in C# but you can get the idea)
Yes, it is. You use the innerHTML property of the div. Since new_form is a class rather than an id, select it with querySelector. Like this:
var myHTML = document.querySelector('.new_form').innerHTML;
Note when you use innerHTML or html() as above you won't get the exact raw HTML you put in. You'll get the web browser's idea of what the current document objects should look like serialised into HTML.
There will be browser differences in the exact format that comes out, in areas like name case, spacing, attribute order, which characters are &-escaped, and attribute quoting. IE, in particular, can give you invalid HTML where attributes that should be quoted aren't. IE will also, incorrectly, output the current values of form fields in their value attributes.
You should also be aware of the cross-site-scripting risks involved in letting users submit arbitrary HTML. If you are to make this safe you will need some heavy duty HTML ‘purification’.
