SO kept preventing me from posting the title I wanted so finally got a title that let me post though it kind of sucks so feel free to edit/change it.
I have fields a user can fill in and in the javascript we have
'${chart.title}'
and stuff like that. Is it sufficient to just strip out the single quote character so that they cannot break out of the string into JavaScript? Or are there other ways to close a string that started with a single quote?
${chart.title} inserts the title a user typed in on a previous page, so naturally they could type something like "Title'+callMethod()+'RestOfTitle", injecting a call to callMethod into my JavaScript.
thanks,
Dean
The best way would be to restrict the input to alphanumerical and space characters.
If you want to allow anything inside the title, you can use an escaping function.
http://xkr.us/articles/javascript/encode-compare/
Just stripping the string of single quote characters is definitely not enough. Newlines are one reason: a raw line break inside a quoted string literal is a syntax error, so it breaks the script.
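As a rough sketch of what such an escaping function might look like if the page is rendered from JavaScript: the helper name `toJsStringLiteral` is hypothetical (it is not from the posts above). It leans on `JSON.stringify` to handle quotes, backslashes and newlines, then additionally escapes `<`, `>`, `&` and the line separators U+2028/U+2029 so the value cannot close the surrounding script element. It assumes the output replaces the whole quoted literal, quotes included, rather than being dropped between existing quotes.

```javascript
// Hypothetical helper: turn arbitrary user text into a JavaScript string
// literal (including its surrounding double quotes) intended for use
// inside an inline <script> block.
function toJsStringLiteral(value) {
  return JSON.stringify(String(value))
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const attack = "Title'+callMethod()+'RestOfTitle";
const literal = toJsStringLiteral(attack);

// The template would emit the literal in place of '${chart.title}'.
console.log('var title = ' + literal + ';');
```

Because the literal is double-quoted, the single quotes in the attack string stay inside the string instead of terminating it.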
There are a couple of options.
First, go the very restrictive way and do both: white-list validation of the input field for your title, and always encode the text that you output to the page. Validation filters out all unwanted (and potentially dangerous) characters. If some of them pass the filter (or somebody updates the text to contain some JS code after the filters were applied), the encoding step makes any malicious script non-runnable by turning it into plain text.
Second, let your users input whatever they want (which is highly discouraged, but developers are sometimes asked to do it), but always encode the text that you output to the page.
You can implement white-list validation yourself using a regular expression, or you can use one of the libraries.
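A minimal sketch of the regular-expression route, assuming the title should contain only letters, digits and spaces. The function name `isValidTitle` and the 100-character limit are assumptions made for illustration, not something from the posts above.

```javascript
// Hypothetical white-list check: letters, digits and spaces only,
// between 1 and 100 characters (the length limit is an assumption).
const TITLE_PATTERN = /^[A-Za-z0-9 ]{1,100}$/;

function isValidTitle(title) {
  return typeof title === 'string' && TITLE_PATTERN.test(title);
}

console.log(isValidTitle('Quarterly Sales 2024'));             // true
console.log(isValidTitle("Title'+callMethod()+'RestOfTitle")); // false
```

Validation like this belongs on the server; a client-side check alone can be bypassed, and the output should still be encoded either way.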
Related
How would you link to a file that contains a space? Is it possible? I have a javascript document and already have dozens of images that contain spaces but I was hoping to be able to still link to them.
%20 is the escaped value for a blank space. Use that in a hyperlink, and you'll get the file you want :)
In case you test it in a browser: modern browsers (Chrome for sure) do not visually change the space to %20 in the address bar anymore, but they still escape all characters before making a web request.
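If the links are built in JavaScript, the built-in `encodeURIComponent` can do the escaping rather than typing %20 by hand. A small sketch; the base URL and file name here are made up for illustration.

```javascript
// Hypothetical base URL and file name, for illustration only.
const baseUrl = 'http://example.com/images/';
const fileName = 'my holiday photo.jpg';

// encodeURIComponent escapes the spaces (and other reserved characters)
// in a single path segment.
const href = baseUrl + encodeURIComponent(fileName);

console.log(href); // http://example.com/images/my%20holiday%20photo.jpg
```

Note that `encodeURIComponent` also escapes `/`, so it is meant for one file name at a time, not for a whole path.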
Edit
Generally speaking, you'd want to HTML-encode your strings via an existing method, rather than manually escaping the needed characters.
The following SO question has a very elegant solution. If you use it with an element that is not visible to the user (or not even part of the DOM, as is the case with the linked answer), they won't even know.
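For situations where no DOM is available, a plain string-based encoder is one possible alternative. This is only a sketch and is not the technique from the linked answer; the name `escapeHtml` and the choice of five characters are assumptions.

```javascript
// Hypothetical DOM-free HTML encoder covering the five characters that
// matter in element content and quoted attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<b>Tom & "Jerry"</b>'));
// &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;
```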
I have a client's site that keeps getting hacked with XSS injections somehow. These XSS attacks are without fail in the banners section, and the banner ads need to have <script> tags to function.
I am still trying to figure out where and when this happens (it is a HUGE site, it is badly coded (sorry, previous guy...), and I am really swamped). So, in the meantime, I want to write a regular expression that deletes the extra tag that gets inserted.
So, if the script should be:
<script src="valid_script.js"></script>
The hacker simply does this:
<script src="valid_script.js"></script>
<script src="invalid_script.js"></script>
I need the regex to delete the script tag (there may be multiple matches) that contains "invalid_script.js" but leave the one that contains "valid_script.js" intact.
My question: Could you experts out there please show me how to do this regex? I am sorry, but I do not understand regex, I tried so hard to understand, but it is way over my head :-(
Taking note of all the comments, as you have: to answer your question, if the text to be output is in the $content variable (which will contain both the good and the bad script), then the following regular expression will strip just the bad one:
$content = preg_replace('#<script[^>]*invalid_script\.js[^>]*></script>#s', '', $content);
This says, briefly, look for the following in sequence: <script, a string of non-> characters, invalid_script.js, a string of non-> characters, and ></script>.
But to reiterate all the comments, this could be got around and is certainly only a sticking plaster of sorts.
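To see both the match and one way around it, here is a JavaScript port of the same pattern. It is for illustration only (the answer itself is PHP), and the function name `stripBadScripts` is made up.

```javascript
// JavaScript port of the pattern above, for illustration only.
const BAD_SCRIPT = /<script[^>]*invalid_script\.js[^>]*><\/script>/g;

function stripBadScripts(content) {
  return content.replace(BAD_SCRIPT, '');
}

const injected =
  '<script src="valid_script.js"></script>\n' +
  '<script src="invalid_script.js"></script>';

// Only the tag that mentions invalid_script.js is removed.
console.log(stripBadScripts(injected));

// The pattern is case-sensitive, so an upper-case tag slips straight through.
const evasion = '<SCRIPT src="invalid_script.js"></SCRIPT>';
console.log(stripBadScripts(evasion) === evasion); // true
```

The second case is the kind of thing the comments warn about: the filter only catches the exact shape it was written for.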
Consider the following Javascript:
var previewImg = 'http://example.com/preview_img/hey.jpg';
var fullImg = previewImg.replace('preview','full');
I would expect the value of fullImg to be:
http://example.com/full_img/hey.jpg
In fact, it is... sort of. Running alert(fullImg); shows the expected url string. But when I deliver that variable to jQuery Fancybox, like this:
jQuery.fancybox.open(fullImg);
Something adds characters into the string, like this:
http://example.com/%EF%BF%BCfull_img/hey.jpg
Where is this %EF%BF%BC coming from? What is it? And most importantly, how do I get rid of it?
Some other clues: This is a Drupal 7 site, running jQuery 1.5.1. I'm using that same Fancybox script elsewhere on the site with no issues.
%EF%BF%BC is a sequence of three URL-encoded bytes, which together form a single UTF-8 character.
You clearly can't see any unexpected characters in the string. That's because the character encoded as %EF%BF%BC is invisible.
It's the UTF-8 encoding of U+FFFC, the object replacement character. (It is easy to mistake for the UTF-8 byte-order mark, which is U+FEFF and encodes as %EF%BB%BF.) Invisible characters like this probably got into your code when you did a copy+paste from another file.
The quickest way to get rid of them is to find the bit of code that was copied+pasted, delete the characters on either side of the problem, and retype them. Depending on your editor, you may find the delete behaves strangely as it deletes the hidden characters.
Some text editors and IDEs will have an option to show hidden characters. If your editor has this, it may help you see where the mystery characters are so you can delete them.
Hope that helps.
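If hunting through the editor doesn't turn anything up, one way to confirm which invisible character is involved is to URL-encode the likely candidates and compare the output with what shows up in the URL. The sketch below also strips them at runtime as a stopgap; the helper name `stripInvisible` is hypothetical.

```javascript
// U+FFFC (object replacement character) and U+FEFF (byte-order mark)
// are both invisible in most editors; URL-encoding exposes them.
console.log(encodeURIComponent('\uFFFC')); // %EF%BF%BC
console.log(encodeURIComponent('\uFEFF')); // %EF%BB%BF

// Hypothetical cleanup helper: drop both characters from a string.
function stripInvisible(text) {
  return text.replace(/[\uFFFC\uFEFF]/g, '');
}

const dirty = 'http://example.com/\uFFFCfull_img/hey.jpg';
console.log(stripInvisible(dirty)); // http://example.com/full_img/hey.jpg
```

Removing the character from the source file is still the proper fix; the runtime strip only hides the symptom.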
So I'm working on a micro lib, html.js, and basically it creates text nodes with document.createTextNode, but when I want to create a text node with a&nbsp;b I get the literal text a&nbsp;b, so I'm wondering how to escape the & char, ideally without using innerHTML.
Javascript supports the \uXXXX notation, so in the case of a non-breaking space, that would be \u00A0.
document.createTextNode('a\u00A0b');
That's as far as you can get. It's a text node, consisting only of text, and there's no difference between texts created from entity references or from normal characters.
If that's not what you want, you should take a second look at innerHTML. Can't you read it, modify it and put it back?
There's not much functionality in JS to encode/decode HTML entities. It seems like there are some libraries out there, though, that can help you achieve this. Here is one I found on Google. I haven't tried it, but you can check it out, or look for others.
http://www.strictly-software.com/htmlencode
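If pulling in a library feels like too much for a micro lib, a tiny decoder can be hand-rolled and its result passed to document.createTextNode. This is only a sketch: the function name `decodeEntities` and the five-entry table are assumptions, and a real library covers far more named entities.

```javascript
// Hypothetical, deliberately tiny entity table.
const NAMED_ENTITIES = { nbsp: '\u00A0', amp: '&', lt: '<', gt: '>', quot: '"' };

// Replace numeric references (&#160; / &#xA0;) and the few named
// entities in the table; anything unrecognised is left untouched.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points alone instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities('a&nbsp;b') === 'a\u00A0b'); // true

// In a browser: document.createTextNode(decodeEntities('a&nbsp;b'));
```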
I'm using CKEditor and I need to be able to impose a maxLength restriction on it.
For instance, prevent the user from entering more than 100 characters, excluding the HTML markup applied by the user.
Has anyone been able to do this?
Thanks, I'd appreciate if you point me towards a resource. I found similar questions here but they were not of much help.
I doubt this is going to end up being reliable even if someone posts an approach. Consider the following:
var tags = /<[^>]*?\/?>/;
That should match most tags, but what if you get someone who does something screwy like this:
<img alt=">My Title<" />
Now your regular expression, which should be ignoring tags, improperly counts the contents of this image's alt attribute towards the character limit. If some back-end system requires that the text content be only 100 characters, what I'd suggest is giving the user a single text input with a maxlength of 100, and then looking for another control or library that will let them change its look and feel via CSS.
Attempting to strip out the HTML Tags and then count the remaining characters is unlikely to do anything but give you a headache, will be error prone in the best of cases, and will malfunction entirely in the worst of cases.
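To make the failure concrete, here is a small sketch that runs the pattern from above over both a well-behaved snippet and the awkward image tag. The global flag is added so every match is removed, and the function name `naiveTextLength` is made up for illustration.

```javascript
// The tag-matching pattern from the answer, with the global flag added
// so that every match is removed.
const tags = /<[^>]*?\/?>/g;

function naiveTextLength(html) {
  return html.replace(tags, '').length;
}

console.log(naiveTextLength('<b>Hello</b>'));             // 5
console.log(naiveTextLength('<img alt=">My Title<" />')); // 8
```

The second result should be 0, since the image contributes no text content, but the pattern stops at the first > inside the attribute value and counts "My Title" as if the user had typed it.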