How would you link to a file that contains a space? Is it possible? I have a javascript document and already have dozens of images that contain spaces but I was hoping to be able to still link to them.
%20 is the escaped value for a blank space. Use that in a hyperlink, and you'll get the file you want :)
In case you test it in a browser: modern browsers (Chrome for sure) no longer visually change the space to %20 in the address bar, but they do still escape all characters before making a web request.
Edit
Generally speaking, you'd want to HTML-encode your strings with an existing function rather than manually escaping the needed characters.
The following SO question has a very elegant solution. If you use it with an element that is not visible to the user (or not even part of the DOM, as is the case with the linked answer), they won't even know.
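As a rough illustration of encoding with a built-in function instead of typing %20 by hand, here is a minimal sketch using JavaScript's standard encodeURI and encodeURIComponent. The file names are made up for the example and are not from the question.

```javascript
// Hypothetical image path containing spaces (not from the original post).
const rawPath = 'images/my holiday photo.jpg';

// encodeURI escapes characters that are not legal in a URL (such as spaces)
// but leaves URL structure like "/" and ":" untouched.
const encodedPath = encodeURI(rawPath);
console.log(encodedPath); // images/my%20holiday%20photo.jpg

// encodeURIComponent is meant for a single component (one file name or one
// query value): it also escapes "/", "?", "&" and "=".
const encodedName = encodeURIComponent('cats & dogs.png');
console.log(encodedName); // cats%20%26%20dogs.png
```

Use encodeURI for a whole path and encodeURIComponent for one piece of it; mixing them up either leaves "&" unescaped or turns "/" into %2F.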
Related
I have this site:
http://a.b/x – y
where the dash is non-ASCII \u2013 or %E2%80%93 in UTF-8 speak.
The following link with UTF-8 works fine:
True Link
but scripting it with window.open() with the exact same URL gives a 404:
Raw JS Link
Viewing properties on the error page to see the resulting URL I note the extended dash is replaced with:
â??
If I replace the extended dash, and only the extended dash with "\u2013" the link works fine:
Modified JS Link
and the resulting URL seems to have re-encoded the extended dash back to UTF-8.
With this in mind I tried to decode the UTF-8 encoding and re-encode just the space but this failed with the same error as before:
Raw JS Link
I suspect that window.open() is mangling the URL for some reason.
I then went on to try a bunch of different ideas and combinations of decode / encode and even dragged escape()/unescape() back into use, but to no avail.
The reason for window.open is that I am limited to controlling just the content of the HREF attribute. In this case it's an SSRS expression in a "Go to URL" Action, and SSRS UTF-8-encodes certain characters, so that even with the split(' ') above I actually have to use split(String.fromCharCode(32)).
However, I've stripped everything out into a simple HTML page, which is where I am doing my analysis.
PS: IE8, though user base is IE8+
PSS: Added missing quote.
PSS: It looks like this might be an IE8 specific issue.
<a href="javascript:void(window.open('http://a.b/...component...
So here you've got multiple nested escaping contexts. You're injecting text into:
a component of a URL (needs URL-escaping), inside
a JavaScript string literal (needs JS-escaping), inside
a javascript: pseudo-URL (needs URL-escaping), inside
an HTML attribute value (needs HTML-escaping)
So the value x – y has to be escaped four times:
URL-escape to x%20%E2%80%93%20y
JS-escape to x%20%E2%80%93%20y (no changes this time as there are no JS-special characters in this value)
URL-escape to x%2520%25E2%2580%2593%2520y
HTML-escape to x%2520%25E2%2580%2593%2520y (no changes this time as there are no HTML-special characters in this value).
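The four steps above can be sketched in JavaScript as follows. This is only an illustration of the layering: the helper names escapeJsString and escapeHtmlAttr are made up here, and both are simplified (for example, the JS-escape step does not handle newlines).

```javascript
// The raw value from the question; \u2013 is the en dash.
const value = 'x \u2013 y';

// Simplified JS string-literal escaping: backslash and both quote characters.
function escapeJsString(text) {
  return text.replace(/[\\'"]/g, '\\$&');
}

// Simplified HTML attribute escaping; "&" must be replaced first.
function escapeHtmlAttr(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const step1 = encodeURIComponent(value); // component of a URL
const step2 = escapeJsString(step1);     // inside a JS string literal
const step3 = encodeURIComponent(step2); // inside the javascript: pseudo-URL
const step4 = escapeHtmlAttr(step3);     // inside the HTML attribute value

console.log(step1); // x%20%E2%80%93%20y
console.log(step2); // x%20%E2%80%93%20y
console.log(step3); // x%2520%25E2%2580%2593%2520y
console.log(step4); // x%2520%25E2%2580%2593%2520y
```

Steps 2 and 4 happen to leave this particular value unchanged, but a value containing a quote or an ampersand would be altered at those layers too.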
Nested syntaxes needing escaping are very, very difficult to get right. And generally you should never use javascript: URLs: as well as being a nightmare of multiple-escaping, they're also pretty bad for usability and accessibility.
Avoid injecting into nested code. A better pattern for links that open in a new window (if you absolutely must) is to put the real URL in the href, so it responds correctly to middle-click and other link affordances, and then read that href from JS, eg.:
<a href="http://a.b/x%20%E2%80%93%20y" onclick="window.open(this.href, ...options...); return false;">
(The return-false prevents the link being followed after the window is opened.) Also consider breaking the JS code out into a separate script that binds to all appropriate links automatically (eg by class attribute) so you don't have to have inline JavaScript in your HTML.
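A minimal sketch of that separate-script approach might look like the following. The class name "popup" and the function name bindPopupLinks are assumptions for illustration, and addEventListener is not available in IE8, so an older user base would need attachEvent or a library instead.

```javascript
// Attach a click handler to every link matching the selector.
// "a.popup" is an assumed class name, not something from the original post.
function bindPopupLinks(root, openWindow, selector = 'a.popup') {
  const links = Array.from(root.querySelectorAll(selector));
  for (const link of links) {
    link.addEventListener('click', (event) => {
      // Stop the browser from also following the link in the current window.
      event.preventDefault();
      // The real, already-escaped URL lives in the href attribute.
      openWindow(link.href);
    });
  }
  return links.length;
}

// In a browser, wire it up once the DOM is ready.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => {
    bindPopupLinks(document, (url) => window.open(url, '_blank'));
  });
}
```

Because the URL is only ever written into the href attribute, it needs URL-escaping and HTML-escaping but never passes through a JavaScript string literal or a javascript: pseudo-URL.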
The single quotes were misplaced in your last example. Also, there's no need for .split(' ').join('%20'), as it will create errors.
Raw JS Link
demo
http://jsfiddle.net/bf2703ah/1/
I have a problem concerning string output on HTML page when using Javascript and ASP. Logic of page generation goes like this:
We use an ASP page to generate HTML code using Response.Write(). If a string contains a numeric character reference (for example &#1057;), it shows up on the user's side just fine as a character.
After that we add an OnLoad event, which calls a JavaScript function. All this happens inside the <body></body> tags. The source for the JavaScript is added inside <script></script> tags. The function only sets document.href, which contains a reference to the same ASP page.
The ASP logic runs again and adds some text to the page using Response.BinaryWrite() (Response.Write behaves the same). All character references are now shown as codes: &#1057;. Evidently every '&' becomes &amp; (automatic ASP conversion), the browser decodes it as &, and we see the code &#1057; instead of the character 'С'.
As far as I know, such behaviour can be caused by <script> tags, as a precaution against XSS attacks. In the end I want to stop '&' from being encoded as &amp;.
However here is the most important part:
If I add a header with "Content-Type" "text/html", IE (any version) starts rendering NCR symbols correctly. But Firefox, Chrome and Safari do not change behavior and keep encoding & as &amp;. I can see several questions on Stack Overflow which look like mine, yet the situation is not exactly the same (my strings are not inserted directly by JavaScript, so I cannot manipulate the output string and change &amp; to &; also, my strings have the correct symbols in the first place, and they get changed by ASP or by the browser). Is there any elegant way to force Firefox or Chrome to decode the page as IE does? Maybe some settings or attributes in HTML tags? This problem looks browser-dependent to me, am I right?
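To make the symptom concrete, here is a small sketch of what double-escaping does to a numeric character reference. The helpers escapeHtml and decodeOnce are hypothetical stand-ins, not the ASP or browser code: decodeOnce handles only decimal references and &amp;, and makes a single pass the way a browser decodes text once.

```javascript
// Minimal HTML text escaper, standing in for an automatic server-side conversion.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Tiny stand-in for the browser's single decoding pass:
// handles decimal numeric character references and &amp; only.
function decodeOnce(html) {
  return html.replace(/&(?:#(\d+)|amp);/g, (match, digits) =>
    digits ? String.fromCodePoint(Number(digits)) : '&');
}

const ncr = '&#1057;';             // reference for Cyrillic "С" (U+0421)
const shownDirectly = decodeOnce(ncr);
const escaped = escapeHtml(ncr);   // the "&" gets escaped a second time
const shownAfterEscape = decodeOnce(escaped);

console.log(shownDirectly);    // С
console.log(escaped);          // &amp;#1057;
console.log(shownAfterEscape); // &#1057;
```

If this is what is happening, the fix belongs wherever the second escape is applied, since the browser is decoding exactly what it was sent.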
I am trying to make a regex for finding: background-image:url('URL'); where the URL is an external link to an image.
Been trying for something like this:
/\s*?[ \t\n]background-image:url('https?:\/\/(?:[a-z\-]+\.)+[a-z]{2,6}(?:\/[^\/#?]+)+\.(?:jpe?g|gif|png)$');/i
But couldn't get it to work.
I am using this with javascript/jquery
Does this get what you want?:
/\s*?[ \t\n]background-image:url\('.+?'\);/i
I think you can simplify it to this if you know it will only change with the URL in the middle. I probably went overboard with the \ escapes but better to be safe than sorry.
/background\-image\:url\(\'.*?\'\)\;/
Epascarello hit the nail on the head. Is this source you control? Or at least a predictable website? What are multiple different examples of input and your expected results?
Will this always be inline in double quotes, and therefore your URL will always be in single quotes? Some old websites use double-quotes in their CSS Files or header CSS.
Do you want to capture the whole thing? Or are you just trying to extract the resulting URL?
SirCapsAlot brings up a good question: are you just looking for background image URLs in general? Because they can also be set with the background property, or even in JavaScript with .backgroundImage="url(image.jpg)".
And you definitely only want the ones that include http(s)?
With the limited requirements you gave, this is the best Regex:
background-image\s*:\s*url\('(https?://[^']+)
Comment here if you have answers to my questions which may alter your requirements, and thusly my answer.
Breakdown:
background-image\s*:\s*url //Find the literal text to begin, allowing optional whitespace around the colon
\(' //Find the literal opening paren and quote
( //Begin Capture Group 1
https?:// //Require http:// or https:// (the s is optional because of the ?)
[^']+ //Match everything up to, but not including, the next quote
) //End Capture Group 1
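Here is a minimal sketch of using that pattern from JavaScript. The sample markup is invented for the example. The slashes are escaped because this is a regex literal, and the g and i flags are additions of this sketch (to find every match, case-insensitively), not part of the pattern above.

```javascript
// Same pattern as above, written as a JavaScript regex literal.
const pattern = /background-image\s*:\s*url\('(https?:\/\/[^']+)/gi;

// Made-up sample markup: two external images and one relative path.
const html = [
  '<div style="background-image:url(\'http://example.com/a.png\');"></div>',
  '<div style="background-image: url(\'https://example.com/img/b.jpg\');"></div>',
  '<div style="background-image:url(\'local/c.gif\');"></div>'
].join('\n');

// matchAll yields one match per occurrence; index 1 is Capture Group 1.
const urls = Array.from(html.matchAll(pattern), (match) => match[1]);

console.log(urls);
// [ 'http://example.com/a.png', 'https://example.com/img/b.jpg' ]
```

The relative path is skipped because the pattern requires http:// or https://, which matches the "external link" requirement in the question.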
A Co-Worker pointed out that I might have been downvoted for not capturing the closing tick. Note: Capturing the closing tick would be a wasted step, and is not necessary for this regex to work.
He also pointed out somebody might have downvoted me for requiring http or https in the url portion. But the user's question was specifically for external URLs, not internal ones. So this is a valid requirement and gets him closer to what he asked.
Sooo... not sure why this got a downvote.
SO kept preventing me from posting the title I wanted so finally got a title that let me post though it kind of sucks so feel free to edit/change it.
I have fields a user can fill in and in the javascript we have
'${chart.title}'
and stuff like that. Is it sufficient to just strip out the single quote character so that they cannot break out of the string back into JavaScript? Or are there other ways to close the string that started with the single quote character?
${chart.title} inserts the title a user typed in on a previous page so naturally they could type something like "Title'+callMethod()+'RestOfTitle" injecting a callMethod into my javascript.
thanks,
Dean
The best way would be to restrict the input to alphanumerical and space characters.
If you want to allow anything inside the title, you can use an escaping function.
http://xkr.us/articles/javascript/encode-compare/
Just stripping the string of single quote characters is definitely not enough. Think of newlines, for one.
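As a rough sketch of what an escaping function for this context could look like, the following builds a complete JavaScript string literal (quotes included) from arbitrary text. It assumes the page is rendered by JavaScript on the server, which may not match your template engine, and the name toJsStringLiteral is made up for the example.

```javascript
// Turn arbitrary text into a double-quoted JavaScript string literal.
// JSON.stringify handles quotes, backslashes, newlines and control characters;
// the extra replacements cover characters that are risky inside an HTML
// <script> block or that older engines treat as line terminators.
function toJsStringLiteral(value) {
  return JSON.stringify(String(value))
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E')
    .replace(/&/g, '\\u0026')
    .replace(/'/g, '\\u0027')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// The injection attempt from the question.
const title = "Title'+callMethod()+'RestOfTitle";
const line = 'var chartTitle = ' + toJsStringLiteral(title) + ';';

console.log(line);
// var chartTitle = "Title\u0027+callMethod()+\u0027RestOfTitle";
```

Note that the function emits its own surrounding quotes, so the template would output the literal directly rather than wrapping it in '...'.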
There are a couple of options.
First, go the very restrictive way and do both: apply so-called white-list validation to the input field for your title, and always encode the text that you output to the page. Validation filters out all unwanted (and potentially dangerous) characters, and if some of them pass the filter (or somebody updates the text to contain some JS code after the filters were applied), the encoding step makes any malicious script non-runnable by turning it into plain text.
Second, let your users input whatever they want (which is not recommended, but sometimes developers are asked to do it), but always encode the text that you output to the page.
You can implement white-list validation yourself using a regular expression, or you can use one of the libraries.
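A white-list check with a regular expression can be as small as the sketch below. The allowed character set (letters, digits and spaces) and the 100-character limit are assumptions for illustration; adjust them to whatever your titles legitimately need.

```javascript
// Allow only ASCII letters, digits and spaces, 1 to 100 characters long.
// Both the character set and the length limit are illustrative assumptions.
const TITLE_PATTERN = /^[A-Za-z0-9 ]{1,100}$/;

function isValidTitle(title) {
  return typeof title === 'string' && TITLE_PATTERN.test(title);
}

console.log(isValidTitle('Quarterly Sales 2024'));             // true
console.log(isValidTitle("Title'+callMethod()+'RestOfTitle")); // false
console.log(isValidTitle('Line one\nLine two'));               // false
```

Validation on input does not replace encoding on output: a value that reaches the page by another route still has to be encoded where it is written.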
Consider the following Javascript:
var previewImg = 'http://example.com/preview_img/hey.jpg';
var fullImg = previewImg.replace('preview','full');
I would expect the value of fullImg to be:
http://example.com/full_img/hey.jpg
In fact, it is... sort of. Running alert(fullImg); shows the expected url string. But when I deliver that variable to jQuery Fancybox, like this:
jQuery.fancybox.open(fullImg);
Something adds characters into the string, like this:
http://example.com/%EF%BF%BCfull_img/hey.jpg
Where is this %EF%BF%BC coming from? What is it? And most importantly, how do I get rid of it?
Some other clues: This is a Drupal 7 site, running jQuery 1.5.1. I'm using that same Fancybox script elsewhere on the site with no issues.
%EF%BF%BC is a sequence of three URL-encoded bytes.
You can't see any unexpected characters in the string because the character those bytes encode is usually invisible in an editor.
It's the UTF-8 encoding of U+FFFC, the object replacement character (not to be confused with the UTF-8 byte-order mark, which is %EF%BB%BF). Rich-text applications use it as a placeholder for embedded objects. It probably got into your code when you did a copy+paste from another document.
The quickest way to get rid of them is to find the bit of code that was copied+pasted, delete the characters on either side of the problem, and retype them. Depending on your editor, you may find the delete behaves strangely as it deletes the hidden characters.
Some text editors and IDEs will have an option to show hidden characters. If your editor has this, it may help you see where the mystery characters are so you can delete them.
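If your editor can't show the character, a few lines of JavaScript can. The sketch below decodes the sequence to see which code point it is, reproduces the symptom with a replacement string that contains the hidden character, and strips it. The function name stripInvisible and the exact set of characters it removes are assumptions for the example.

```javascript
// Decode the mystery sequence and inspect the resulting character.
const mystery = decodeURIComponent('%EF%BF%BC');
console.log(mystery.length);                      // 1
console.log(mystery.codePointAt(0).toString(16)); // fffc

// Reproduce the symptom: the replacement string carries a hidden U+FFFC.
const previewImg = 'http://example.com/preview_img/hey.jpg';
const fullImg = previewImg.replace('preview', '\uFFFCfull');
console.log(encodeURI(fullImg));
// http://example.com/%EF%BF%BCfull_img/hey.jpg

// Remove a few common invisible characters: byte-order mark (U+FEFF),
// object replacement character (U+FFFC) and zero-width space (U+200B).
function stripInvisible(text) {
  return text.replace(/[\uFEFF\uFFFC\u200B]/g, '');
}

console.log(stripInvisible(fullImg));
// http://example.com/full_img/hey.jpg
```

Stripping at runtime only hides the problem, so it is still worth deleting the stray character from the source file.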
Hope that helps.