As expected, I need to encode a URI component before I call an API with it. However, somewhere along the line after the request hits our server, our backend framework (Tapestry) converts the spaces too early (see: Java URLEncoding / Decoding URL Spaces with dashes).
I figured out that it works if I change the %20 in the URI to $0020. The code below works in Chrome and Firefox and converts each % to $00.
function furtherEncode(uriComp) {
    var nonSafe = encodeURIComponent(uriComp);
    return nonSafe.replace(/%/g, "$00");
}
In Internet Explorer 11 (and IE10) it doesn't do the replacement.
I have tried /\x25/g and /%/g as well as "$00" and '$00' but to no avail.
Any help would be greatly appreciated.
You need to have two dollar signs, like this (tested in IE and Chrome):
"%".replace(/%/g, "$$00") // returns "$00"
See the documentation for String.prototype.replace().
Relevant parts are defined under "Specifying a string as a parameter":
$$ | Inserts a "$".
$n | Where n or nn are decimal digits, inserts the nth parenthesized submatch string, provided the first argument was a RegExp object
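As a minimal sketch, here is that fix applied to the function from the question. The second helper, furtherEncodeWithCallback, is a hypothetical alternative that is not in the original post: the return value of a replacement function is inserted literally, so it avoids the $ patterns altogether.

```javascript
// The question's helper with the replacement string corrected.
function furtherEncode(uriComp) {
    var nonSafe = encodeURIComponent(uriComp);
    // "$$" in a replacement string inserts one literal "$".
    return nonSafe.replace(/%/g, "$$00");
}

// Hypothetical alternative: a function's return value is used as-is,
// so "$00" is never interpreted as a submatch reference.
function furtherEncodeWithCallback(uriComp) {
    return encodeURIComponent(uriComp).replace(/%/g, function () {
        return "$00";
    });
}

console.log(furtherEncode("a b"));             // "a$0020b"
console.log(furtherEncodeWithCallback("a b")); // "a$0020b"
```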
I've run into an issue of regex match not evaluating in Internet Explorer and in Firefox. It works fine in Chrome and Opera. I know Chrome is generally much more tolerant of mistakes so I suspect I've dropped the ball somewhere along the way - yet none of the online evaluation tools seem to find any errors in my expression. I'm sorry that it's such a convoluted expression but hopefully something will be easily obvious as the culprit. The expression is as follows:
keyData = data.match(/\w+\u0009\w+\u0009[\u0009]?\w+\u0009([-]?\w+|%%)[#]?\u0009([-]?\w+|%%)[#]?\u0009([-]?\w+|%%)[#]?(\u0009([-]?\w+|%%)[#]?)?(\u0009([-]?\w+|%%)[#]?)?(\u0009([-]?\w+|%%)[#]?)?\u0009\u0009\/\//g);
'data' is a text file which I am parsing with no errors. I won't post the whole file here, but what I am hoping to match is something such as the following:
10 Q 1 0439 0419 -1 // CYRILLIC SMALL LETTER SHORT I, CYRILLIC CAPITAL LETTER SHORT I, <none>
I believe that when I post the string here it removes the 'u0009' characters so if you'd like to see one of the full files, I've linked one here. If there is anything more I can clarify, please let me know!
Edit:
My goal in this post is to understand not only why this is failing, but also whether this expression is well-formed. After further review, it seems that the issue is how Internet Explorer and Firefox parse the text file. They seem to strip out the tabs and replace them with spaces. I updated the expression and it matches with no problems in an online validator, but it still fails in IE/FF.
Edit 2
I have since updated my expression to a clearer form, taking feedback into account. The issue still persists in IE and Firefox. It seems to be an issue with the string itself. IE won't let me match more than a single character, no matter what my expression is. For example, if the character string of the file is KEYBOARD and I try to match with /\w+/, it will just return K.
/[0-9](\w)?(\t+|\s+)\w+(\t+|\s+)[0-9](\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?(\t+|\s+)\/\//g
After poking around with my regex for a while, I suspected something was wrong with the way IE was actually reading the text file as compared to Chrome. Specifically, if I had the string KEYBOARD within the text file and I tried to match it using /\w+/, it would simply return K in IE, but in Chrome it would match the whole string KEYBOARD. I suspected IE was inserting some dead space between characters, so I stepped through the first few characters of the file and printed their Unicode code units.
for (i = 0; i < 30; i++) {
    console.log(data.charCodeAt(i) + ' ' + data[i]);
}
This confirmed my suspicion: I saw \u0000 pop up between each character. I'm not sure why there are NULL characters between each character, but to resolve my issue I simply performed:
data = data.replace(/\u0000+/g, '');
This completely resolved my issue and I was able to parse my string like normal using the expression:
keyData = data.match(/[0-9](\w)?(\t+|\s+)\w+(\t+|\s+)[0-9](\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?(\t+|\s+)\/\//g);
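For anyone who wants to reproduce this without the original file, here is a small sketch. The sample string and the stripNulls helper name are made up for illustration. One possible cause (an assumption, not something confirmed in this post) is a UTF-16 file being decoded as a single-byte encoding, which would leave a NUL after every ASCII character.

```javascript
// Hypothetical helper: remove every NUL (\u0000) character from a string.
function stripNulls(text) {
    return text.replace(/\u0000+/g, '');
}

// Made-up sample imitating what IE returned: a NUL after each character.
var raw = 'K\u0000E\u0000Y\u0000B\u0000O\u0000A\u0000R\u0000D\u0000';

// \w does not match NUL, so the first match stops after one letter.
var before = raw.match(/\w+/)[0];
var after = stripNulls(raw).match(/\w+/)[0];

console.log(before); // "K"
console.log(after);  // "KEYBOARD"
```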
I have a number of Dynamic Actions in my Oracle Apex 4.2 page with action "Execute Javascript Code" on a phone number entry field:
$s("P40_MOBILE_PHONE", $v("P40_MOBILE_PHONE").replace(/[()-\s]+/g, ''));
This works in IE and Chrome. In Firefox, however, it not only doesn't work, but it causes all other dynamic actions on the page to stop working entirely.
The only difference between this and the other dynamic actions seems to be the use of string.replace(/[()-\s]+/g, ''). This is supposed to strip any spaces, (, ) and - characters from the phone number.
As @dandavis said in a comment, escaping the dash works (no need to escape parentheses, though).
If you try to run the code
/[()-\s]+/
you get
SyntaxError: invalid range in character class
That's because Firefox is trying to use the dash as a range operator, not as a literal dash.
To fix it, you can:
Escape the dash: /[()\-\s]+/
Place the dash at the beginning or end: /[-()\s]+/, /[()\s-]+/
For future reference, changing the regex as follows fixed the problem:
replace(/[\(\)\-\s]+/g, '')
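A minimal sketch of the fixed expression outside of Apex, with the dash moved to the end of the character class so it can only be read as a literal. The stripPhonePunctuation name and the sample number are made up for illustration; the $s and $v calls from the question are left out so the snippet runs anywhere.

```javascript
// Hypothetical helper: strip spaces, "(", ")" and "-" from a phone number.
function stripPhonePunctuation(phone) {
    // The dash is last in the class, so it cannot start a range.
    return phone.replace(/[()\s-]+/g, '');
}

console.log(stripPhonePunctuation('(555) 123-4567')); // "5551234567"
```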
I've been working on my Safari extension for saving content to Instapaper and have been working on enhancing my title parsing for bookmarks. For example, an article that I recently saved has a <title> tag that looks like this:
Report: Bing Users Disproportionately Affected By Malware Redirects | TechCrunch
I want to use the JavaScript in my Safari extension to remove all of the text after the pipe character so that I can make the final bookmark look neater once it is saved to Instapaper.
I've attempted the title parsing successfully in a couple of similar cases using blocks of code that look like this:
if(safari.application.activeBrowserWindow.activeTab.title.search(' - ') != -1) {
    console.log(safari.application.activeBrowserWindow.activeTab.title);
    console.log(safari.application.activeBrowserWindow.activeTab.title.search(' - '));
    var parsedTitle = safari.application.activeBrowserWindow.activeTab.title.substring(0, safari.application.activeBrowserWindow.activeTab.title.search(' - '));
    console.log(parsedTitle);
};
I got thrown for a loop, however, once I tried doing the same thing with the pipe character, since JavaScript treats it as a special character. I've tried several bits of code to solve this problem. The most recent looks like this (attempting to use a regular expression and escape the pipe character):
if(safari.application.activeBrowserWindow.activeTab.title.search('/\|') != -1) {
    console.log(safari.application.activeBrowserWindow.activeTab.title);
    console.log(safari.application.activeBrowserWindow.activeTab.title.search('/\|'));
    var parsedTitle = safari.application.activeBrowserWindow.activeTab.title.substring(0, safari.application.activeBrowserWindow.activeTab.title.search('/\|'));
    console.log(parsedTitle);
};
If anybody could give me a tip that works for this, your help would be greatly appreciated!
Your regex is malformed. It should be:
safari.application.activeBrowserWindow.activeTab.title.search(/\|/)
Note the lack of quotes; I'm using a regex literal here. Also, regex literals need to be bound by /.
Instead of searching and then replacing, you can simply do a replace with the following regex:
str = str.replace(/\|.*$/, "");
This will remove the | character and everything after it, if a | exists. Any space before the | is left in place.
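Here is a small sketch of the replace approach using the title from the question. The trimTitle name is made up, and the leading \s* is an addition to the regex above so that the space before the pipe is dropped as well. The first two lines also show why the quoted version misbehaves: the string '/\|' is turned into the pattern /|, whose empty alternative matches at index 0.

```javascript
var title = 'Report: Bing Users Disproportionately Affected By Malware Redirects | TechCrunch';

// The string '/\|' becomes the pattern "/|" (a slash OR nothing),
// and "nothing" matches at the very start of any string.
var quotedResult = title.search('/\|');
console.log(quotedResult); // 0

// Hypothetical helper: drop the pipe, any whitespace before it, and the rest.
function trimTitle(text) {
    return text.replace(/\s*\|.*$/, '');
}

console.log(trimTitle(title));
// "Report: Bing Users Disproportionately Affected By Malware Redirects"
```

A title without a pipe is returned unchanged, so no separate search() check is needed.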
I'm doing some pretty unholy things with JavaScript, and I've run into a weird problem.
I am creating binary data that fills a buffer of a static size. If the content doesn't fill the buffer, the remainder is filled with null characters.
The next step is to convert to base64.
The size (bytes) isn't always a multiple of 3, so I may need to add padding to the end. The last bytes in the buffer are always null (actually, it's about a kb of nulls).
When I convert this to base64 on Firefox and Chrome, I get an ERR_INVALID_URL when I have a trailing '=', but it downloads fine when I don't.
For example:
var url = "data:application/octet-stream;base64,";
window.open(url + "AAAA"); // works
window.open(url + "AAAA="); // doesn't work
window.open(url + "icw="); // works
My files work, but they're not up to spec.
Is there a reason why this is invalid base64? More importantly, is this a bug or part of the specification?
Edit:
I've posted an answer that gives some of the oddities between Firefox and Chrome. Does anyone know what the standard specifies? Or is it one of those loose specifications that causes fragmentation? I'd like something definitive if possible.
The padding character = is used to fill up to a multiple of four code characters. As every three bytes of input are mapped onto four bytes of output, a number of input bytes that is not a multiple of three requires padding (a remainder of one byte requires == and a remainder of two bytes requires =).
In your case AAAA is already a valid code word and doesn't require padding.
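The padding rule above can be sketched in a few lines. Both helper names are made up for illustration, and this only covers the standard alphabet with = padding, not any browser-specific behaviour.

```javascript
// Hypothetical helper: pad an unpadded base64 string to a multiple of four characters.
function padBase64(unpadded) {
    var remainder = unpadded.length % 4;
    if (remainder === 0) return unpadded; // whole 4-character groups, nothing to add
    if (remainder === 1) throw new Error('not a valid base64 length');
    return unpadded + (remainder === 2 ? '==' : '=');
}

// Hypothetical helper: the padding implied by the number of input bytes.
function paddingForByteCount(byteCount) {
    return ['', '==', '='][byteCount % 3];
}

console.log(padBase64('AAAA'));   // "AAAA"
console.log(padBase64('icw'));    // "icw="
console.log(padBase64('AAAAAA')); // "AAAAAA=="
```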
Why would you imagine that adding an "=" character to the end of the string would work? That's not a valid character in base64.
The character set is upper- and lower-case letters; the digits; and "+" and "/". Anything else is therefore not valid in a base64 string.
edit — well for URLs it seems that instead of "+" and "/" you use "-" and "_" (for positions 62 and 63 in the character set).
edit some more — this is a very confusing topic due to the existence of different, apparently authoritative but contradictory, sources of information. For example, the Mozilla description of the data URL scheme makes no mention of using the "filename-friendly" alternate encoding. Same goes for the IETF data url RFC. However, other IETF documents clearly discuss and specify the variation with "-" and "_" replacing the problematic (for file names) "+" and "/".
Therefore, I declare myself ignorant :-) Gumbo is probably right, that the complaints you're getting are about incorrect padding (that is, padding when no padding is actually necessary).
Notes about different browsers:
Chrome:
datalength % 4 === 1: a single equals sign is necessary
datalength % 4 === 2: two equals signs are necessary
Firefox: equals signs are optional, but follow the same conventions as in Chrome
This is the line I used to test it (I replaced AAAAAA== with a different string each time):
var url = "data:application/octet-stream;base64,AAAAAA=="; window.open(url);
Also, both Firefox and Chrome use + & /, not - and _.
Source:
My tests on Ubuntu 11.04 with Chrome 11 and Firefox 4.
Edit:
The code I need this for is a tar utility for Javascript. My code works as is, but I'd like to be as standards-compliant as possible, and I'm missing a byte I think. No biggie because tar in Linux recognizes it.
I'm going mad with this regex in JS:
var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
If I give it an input string like "http://www.eitb.com/servicios/concursos/516522/", this regex is supposed to return NULL, because there is a "folder" after the base URL. It works in PHP, but not in JavaScript, as in this script:
<script type="text/javascript">
var str="http://www.eitb.com/servicios/concursos/516522/";
var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
document.write(str.match(patt1));
</script>
It returns
http://www.eitb.com/servicios/concursos/516522/,,/516522,,/
The question is: why is it not working, and how can I make it work?
The idea is to implement this regex in another function to get NULL when the URL passed is not in the correct format:
http://www.eitb.com/ -> Correct
http://www.eitb.com/something -> Incorrect
Thanks
I'm no JavaScript pro, but I'm accustomed to Perl regexps, so I'll give it a try: the . in the middle of the regexp probably needs to be escaped, since an unescaped . can match a / and jinx the whole regexp.
Try this way:
var patt1=/^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
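To see the difference, here is a quick sketch that runs both patterns against the URLs from the question. The variable names unescaped and escaped and the isBaseUrl helper are made up for illustration.

```javascript
// Original pattern: the bare "." matches any character, including "/".
var unescaped = /^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
// Fixed pattern: "\." only matches a literal dot.
var escaped = /^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;

// Hypothetical helper: true only for a bare base URL.
function isBaseUrl(url) {
    return escaped.test(url);
}

var deep = 'http://www.eitb.com/servicios/concursos/516522/';
console.log(unescaped.test(deep));                       // true (the bug)
console.log(isBaseUrl(deep));                            // false
console.log(isBaseUrl('http://www.eitb.com/'));          // true
console.log(isBaseUrl('http://www.eitb.com/something')); // false
```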
Considering you have a properly formatted URL this simple RegExp should do the trick every time.
var patt1=/^https?:\/\/[^\/]+/i;
Here's the breakdown...
Starting with the first position (denoted by ^)
Look for http
http can be followed by s (denoted by the ? which means 0 or 1 of the character or set before it)
Then look for :// after the http or https (denoted by :\/\/)
Next match any number of characters except for / (denoted by [^\/]+ - the + means 1 or more)
Case insensitive (denoted by i)
NOTE: this will also pick up ports http://example.com:80 - to get rid of the :80 (or a colon followed by any port number) simply add a : to the negated character class [^\/:] for example.
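A short sketch of both variants, assuming the goal is to pull the base out of a URL rather than to validate it. The helper names baseWithPort and baseWithoutPort are made up for illustration.

```javascript
// Hypothetical helper: scheme plus everything up to the first "/".
function baseWithPort(url) {
    var match = url.match(/^https?:\/\/[^\/]+/i);
    return match ? match[0] : null;
}

// Hypothetical helper: same idea, but ":" is also excluded, so the port is dropped.
function baseWithoutPort(url) {
    var match = url.match(/^https?:\/\/[^\/:]+/i);
    return match ? match[0] : null;
}

console.log(baseWithPort('http://example.com:80/some/path'));    // "http://example.com:80"
console.log(baseWithoutPort('http://example.com:80/some/path')); // "http://example.com"
console.log(baseWithPort('ftp://example.com/'));                 // null
```

Note that this extracts the base from any URL that starts correctly; it does not return null for http://www.eitb.com/something the way the question asks.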