RegEx does match in Expresso but doesn't in JavaScript - javascript

I have written an regex with the help of Expresso. It matches all my samples so I copied it into my JavaScript code. There it doesn't match one of my examples, but why?
RegEx:
^(\d{1,2}):?(\d\d)?\s*-\s*(\d{1,2}):?(\d\d)?$
Should match:
10-12
10:00-12:00
1000-1200
In JavaScript 10:00-12:00 doesn't work for me in all browsers like IE9, Chrome, Firefox.
Any ideas?
Update (JavaScript Code):
input.match(/^(\d{1,2}):?(\d\d)?\s*-\s*(\d{1,2}):?(\d\d)?$/);
Update (solved):
Due some prefiltering the code never got reached. Sorry for that!

Testing it in Chrome right now, and it appears to work:
var exp = /^(\d{1,2}):?(\d\d)?\s*-\s*(\d{1,2}):?(\d\d)?$/;
exp.test('10-12') // true
exp.test('10:00-12:00') // true
exp.test('1000-1200') // true
exp.test('1000-12005') // false

Did you escape the \'s when placing the expression in your Javascript code?
When embedding it as a string you'll end up writing:
var expression = "^(\\d{1,2}):?(\\d\\d)?\\s* etc

Related

JavaScript RegEx Fails In IE / Firefox

I've run into an issue of regex match not evaluating in Internet Explorer and in Firefox. It works fine in Chrome and Opera. I know Chrome is generally much more tolerant of mistakes so I suspect I've dropped the ball somewhere along the way - yet none of the online evaluation tools seem to find any errors in my expression. I'm sorry that it's such a convoluted expression but hopefully something will be easily obvious as the culprit. The expression is as follows:
keyData = data.match(/\w+\u0009\w+\u0009[\u0009]?\w+\u0009([-]?\w+|%%)[#]?\u0009([-]?\w+|%%)[#]?\u0009([-]?\w+|%%)[#]?(\u0009([-]?\w+|%%)[#]?)?(\u0009([-]?\w+|%%)[#]?)?(\u0009([-]?\w+|%%)[#]?)?\u0009\u0009\/\//g);
'data' is a text file which I am parsing with no errors. I wont post the whole file here but what I am hoping to match is something such as the following:
10 Q 1 0439 0419 -1 // CYRILLIC SMALL LETTER SHORT I, CYRILLIC CAPITAL LETTER SHORT I, <none>
I believe that when I post the string here it removes the 'u0009' characters so if you'd like to see one of the full files, I've linked one here. If there is anything more I can clarify, please let me know!
Edit:
My goal in this post is understanding not only why this is failing, but also if this expression well-formatted. After further review, it seems that it's an issue with how Internet Explorer and Firefox parse the text file. They seem to strip out the tabs and replace them with spaces. I tried to update the expression and it matches with no problems in an online validator but it still fails in IE/FF.
Edit 2
I have since updated my expression to a clearer form taking into account feedback. The issue still is persisting in IE and Firefox. It seems to be an issue with the string itself. IE won't let me match more than a single character, no matter what my expression is. For example, if the character string of the file is KEYBOARD and I try to match with /\w+/, it will just return K.
/[0-9](\w)?(\t+|\s+)\w+(\t+|\s+)[0-9](\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?(\t+|\s+)\/\//g
After poking around with my regex for a while, I suspected something was wrong with the way IE was actually reading the text file as compared to Chrome. Specifically, if I had the string KEYBOARDwithin the text file and I tried to match it using /\w+/, it would simply return K in IE but in Chrome it would match the whole string KEYBOARD. I suspected IE was inserting some dead space between characters so I stepped through the first few characters of the file and printed their unicode equivalent.
for (i = 0; i < 30; i++) {
console.log(data.charCodeAt(i) + ' ' + data[i]);
}
This confirmed my suspicion and I saw u0000 pop up between each character. I'm not sure why there are NULL characters between each character but to resolve my issue I simply performed:
data = data.replace(/\u0000+/g, '');
This completely resolved my issue and I was able to parse my string like normal using the expression:
keyData = data.match(/[0-9](\w)?(\t+|\s+)\w+(\t+|\s+)[0-9](\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)(\t+|\s+)(-1|\w+#?|%%)((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?((\t+|\s+)(-1|\w+#?|%%))?(\t+|\s+)\/\//g);

.trim() and regular expressions producing unexpected results

I wrote a fairly simple regular expression to detect when a string looks like it could be an email:
var looksLikeEmail = /^\S+#\S+\.\S+$/gi;
I'm using Knockout and the string being tested is the value of a textarea.
Essentially, say we have the value of the textarea in a variable text. This value was, for example, the typed in value abc#example.com.
What's odd, is it seems like, even though text === text.trim(), looksLikeEmail.test(text) returns true, but looksLikeEmail.test(text.trim()) returns false.
On the other hand, if I manually create the string var test2 = 'abc#example.com', it does not have this issue.
This seems to indicate to me that the textarea is inserting some odd characters or something... that .trim() is doing something weird with. But test.length === test2.length and test.length === test.trim().length
Does anyone know how to make this behave correctly?
I've written up a jsfiddle to quickly demonstrate the behavior...
If you go to the fiddle and try typing in an email... you will see the problem. another weird behavior: add a space after the email, then remove it. /confused
Any help is much appreciated. Thanks.
.test(), just like .exec() will remember the last index of a match when using a global regex, and try to match from it onward, failing on the second call. Just remove the /g option from your regex - it doesn't make sense to have /g in a non-multiline regex which matches beginning and end.

Parsing Phrases with a Pipe Character Using JavaScript

I've been working on my Safari extension for saving content to Instapaper and have been working on enhancing my title parsing for bookmarks. For example, an article that I recently saved has a tag that looks like this:
Report: Bing Users Disproportionately Affected By Malware Redirects | TechCrunch
I want to use the JavaScript in my Safari extension to remove all of the text after the pipe character so that I can make the final bookmark look neater once it is saved to Instapaper.
I've attempted the title parsing successfully in a couple of similar cases using blocks of code that look like this:
if(safari.application.activeBrowserWindow.activeTab.title.search(' - ') != -1) {
console.log(safari.application.activeBrowserWindow.activeTab.title);
console.log(safari.application.activeBrowserWindow.activeTab.title.search(' - '));
var parsedTitle = safari.application.activeBrowserWindow.activeTab.title.substring(0, safari.application.activeBrowserWindow.activeTab.title.search(' - '));
console.log(parsedTitle);
};
I started getting thrown for a loop once I tried doing this same thing with the pipe character; however, since JavaScript uses it as a special character. I've tried several bits of code to try and solve this problem. The most recent looks like this (attempting to use regular expressions and escape the pipe character):
if(safari.application.activeBrowserWindow.activeTab.title.search('/\|') != -1) {
console.log(safari.application.activeBrowserWindow.activeTab.title);
console.log(safari.application.activeBrowserWindow.activeTab.title.search('/\|'));
var parsedTitle = safari.application.activeBrowserWindow.activeTab.title.substring(0, safari.application.activeBrowserWindow.activeTab.title.search('/\|'));
console.log(parsedTitle);
};
If anybody could give me a tip that works for this, your help would be greatly appreciated!
Your regex is malformed. It should be:
safari.application.activeBrowserWindow.activeTab.title.search(/\|/)
Note the lack of quotes; I'm using a regex literal here. Also, regex literals need to be bound by /.
Instead of searching and then replacing, you can simply do a replace with the following regex:
str = str.replace(/\|.*$/, "");
This will remove everything after the | character if it exists.

Regex not working in Javascript with IE 7

last_tag="abcde x";
last_tag = last_tag.replace(/[\s]+x$/, '');
this is my problem: i have to remove an "x" at the end of my string. This piece of code is used in a plugin i've been using without problems until now. On IE 7 "last_tag" is selected in the wrong way, so i get an "x" and i have to remove it. I think who wrote the plugin added this replace to do exactly this but it's not working on IE7.
Example:
before:last_tag="abcde x"
after: last_tag="abcde"
Actually the problem is that last_tag remain exactly the same.
is the regex correct? is there any error or compatibility issue with IE?
EDIT: Probably the regex is not the issue.
I've tried this piece of code, but nothing happens:
var temp_tag="abc x";
alert(temp_tag);
temp_tag = temp_tag.replace(/[\s]+x$/, '');
alert(temp_tag)
The same piece of code work perfectly on Chrome.
The regex looks okay, but it's possible you're trying to match non-breaking spaces (U+00A0). \s doesn't match those in IE (as explained in this answer), but it does in FireFox and Chrome.
I'd go for this RegExp
/\s+x$/
don't use character class [] for \s which is a character class already
(shorthand for something like [ \t\r\n\v\f]) (space, tab, carriage return, line feed, vertical tab, form feed)
edit
Alan Moore is right:
try this instead
/[\s\u00A0]+x$/
edit
maybe this is case sensitive: maybe \u00a0would not be correct
this should match every white-space-character as well as the non breaking spaces
Try this
last_tag = last_tag.replace(/[\t\r\n]+x$/, '');

Making quoted text pretty using javascript regular expression

I want to make lines of text prefixed with > to be wrapped in <span class="quoted"></span>.
Here is my code:
$(document).ready(function(){
x = $('DIV#test').html();
x = x.replace(/(^|\n)>([^\n]+)(\n|$)/g, "$1<span class=\"quoted\">$2</span>$3");
$('DIV#test').html(x);
});
I can't find a reason why, but this makes only odd lines quoted in Chrome and in IE it makes all text gray. Any ideas what is wrong with this code?
Demo at jsFiddle: http://jsfiddle.net/jBJ6V/1/
Use the m ("multiline") flag on the regular expression rather than the alternation at the outset (and with the necessary adjustments to the resulting capture groups):
$(document).ready(function(){
x = $('DIV#test').html();
x = x.replace(/^>([^\n]+)$/gm, "<span class=\"quoted\">$1</span>");
// Changes: ^----here----^ ^--here ^^--here-^
$('DIV#test').html(x);
});
Updated fiddle
The m flag tells the regular expression to treat ^ and $ as relative to individual lines, not relative to the string as a whole. More in sections 15.10.2.6 and 15.10.2.7 of the specification, and on MDC.
The reason it doesn't work in IE is that older IE handles innerHTML differently, and this reflects in jQuery having different results of .html().
Try this jsfiddle:
http://jsfiddle.net/jBJ6V/8/
I ran it through 7,8 and 9 compatibility modes and seems to be working.

Categories