.trim() and regular expressions producing unexpected results

I wrote a fairly simple regular expression to detect when a string looks like it could be an email:
var looksLikeEmail = /^\S+#\S+\.\S+$/gi;
I'm using Knockout and the string being tested is the value of a textarea.
Essentially, say we have the value of the textarea in a variable text. This value was, for example, the typed in value abc#example.com.
Oddly, even though text === text.trim(), looksLikeEmail.test(text) returns true but looksLikeEmail.test(text.trim()) returns false.
On the other hand, if I manually create the string var test2 = 'abc#example.com', it does not have this issue.
This suggests to me that the textarea is inserting some odd characters that .trim() handles strangely. But text.length === test2.length and text.length === text.trim().length.
Does anyone know how to make this behave correctly?
I've written up a jsfiddle to quickly demonstrate the behavior...
If you go to the fiddle and try typing in an email, you will see the problem. Another weird behavior: add a space after the email, then remove it. /confused
Any help is much appreciated. Thanks.

.test(), just like .exec(), remembers the index where the last match ended (lastIndex) when the regex is global, and starts the next match attempt from that index, so the second call fails. Just remove the g flag from your regex: it doesn't make sense to have g on a non-multiline regex that is anchored at both the beginning and the end.
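The behavior described above can be reproduced with a short sketch. It uses the regex from the question as posted; the variable names are illustrative only.

```javascript
// Regex from the question, including the global flag that causes the problem.
const globalRe = /^\S+#\S+\.\S+$/gi;
const text = 'abc#example.com';

// First call matches the whole string and moves lastIndex to its end.
const first = globalRe.test(text);
const indexAfterFirst = globalRe.lastIndex;

// Second call resumes at lastIndex, where ^ can no longer match.
const second = globalRe.test(text.trim());
// A failed match resets lastIndex to 0.
const indexAfterSecond = globalRe.lastIndex;

// Same pattern without g: the regex keeps no state between calls.
const plainRe = /^\S+#\S+\.\S+$/i;
const plainResults = [plainRe.test(text), plainRe.test(text.trim())];

console.log(first, indexAfterFirst, second, indexAfterSecond, plainResults);
// prints: true 15 false 0 [ true, true ]
```

Because a failed match resets lastIndex, a third call on the global regex would succeed again, which is consistent with the on/off behavior seen when typing in the fiddle.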

Related

Why isn't .replace() working on a large generated string from escodegen.generate()?

I am attempting to generate some code using escodegen's .generate() function which gives me a string.
Unfortunately it does not remove the semicolons completely (only on blocks of code), so I need to get rid of them myself. I am using the .replace() function for that, but the semicolons are not removed for some reason.
Here is what I currently have:
generatedCode = escodegen.generate(esprima.parseModule(code), escodegenOptions)
const cleanGeneratedCode = generatedCode.replace(';', '')
console.log('cleanGeneratedCode ', cleanGeneratedCode) // string stays the exact same.
Am I doing something wrong or missing something perhaps?

As per MDN, if you provide a substring instead of a regex:
It is treated as a verbatim string and is not interpreted as a regular expression. Only the first occurrence will be replaced.
So, the output probably isn't exactly the same as the code generated, but rather the first semicolon has been removed. To remedy this, simply use a regex with the "global" flag (g). An example:
const cleanGeneratedCode = escodegen.generate(esprima.parseModule(code), escodegenOptions).replace(/;/g, '');
console.log('Clean generated code: ', cleanGeneratedCode);
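To see the difference without installing escodegen, here is a minimal sketch that uses a hard-coded string as a stand-in for the generated code (the string and the variable names are assumptions for illustration).

```javascript
// Stand-in for the generator output; any string with several semicolons will do.
const generated = 'let a = 1;\nlet b = 2;\nlet c = 3;';

// A string pattern replaces only the first occurrence.
const firstOnly = generated.replace(';', '');

// A regex with the g flag replaces every occurrence.
const allRemoved = generated.replace(/;/g, '');

// split/join produces the same result without a regex.
const viaSplit = generated.split(';').join('');

console.log(JSON.stringify(firstOnly));  // "let a = 1\nlet b = 2;\nlet c = 3;"
console.log(JSON.stringify(allRemoved)); // "let a = 1\nlet b = 2\nlet c = 3"
```

Note that this works purely on text: it would also strip semicolons inside string literals and for-loop headers, so it is only safe if the generated code contains neither.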

Get all the WORDS except one specific word

I want to get all the words, except one, from a string using JS regex match function. For example, for a string testhello123worldtestWTF, excluding the word test, the result would be helloworldWTF.
I realize that I have to do it using lookaheads, but I can't figure out how exactly. I came up with the following regex (?!test)[a-zA-Z]+(?=.*test), however, it works only partially.
http://refiddle.com/refiddles/59511c2075622d324c090000

IMHO, I would simply replace the unwanted word with an empty string, no?

Lookarounds seem like overkill here; you can just replace test with nothing:
var str = 'testhello123worldtestWTF';
var res = str.replace(/test/g, '');

Plugging this into your refiddle produces the results you're looking for:
/(test)/g
It matches all occurrences of the word "test" without picking up unwanted words/letters. You can set this to whatever variable you need to hold these.

WORDS OF CAUTION
Seeing that you have no set delimiters in your inputted string, I must say that you cannot reliably exclude a specific word - to a certain extent.
For example, if you want to exclude test, this might create a problem if the input was protester or rotatestreet. You don't have clear demarcations of what a word is, thus leading you to exclude test when you might not have meant to.
On the other hand, if you just want to ignore the string test regardless, just replace test with an empty string and you are good to go.
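The caveat above can be illustrated with a small sketch. The extra step that drops non-letters is an assumption, added only because the expected result in the question (helloworldWTF) also omits the digits; the space-delimited sample string is likewise made up for illustration.

```javascript
const input = 'testhello123worldtestWTF';

// Removing the substring everywhere leaves the digits in place.
const stripped = input.replace(/test/g, '');

// Assumed extra step: drop everything that is not a letter.
const lettersOnly = stripped.replace(/[^a-zA-Z]/g, '');

// The same replace also cuts into longer words.
const damaged = 'protester'.replace(/test/g, '');

// With real delimiters (spaces here), \b restricts the match to whole words.
const spaced = 'test protester test'.replace(/\btest\b/g, '').trim();

console.log(stripped, lettersOnly, damaged, spaced);
// prints: hello123worldWTF helloworldWTF proer protester
```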

Regex expression for exactly known pattern without "cutting into" the string not working

I am currently developing a web application where I work with Java, JavaScript, HTML, jQuery, etc., and at some point I need to check whether an input matches a known pattern and only proceed if it does.
The pattern should be [at least one but max 3 numbers between 0-9]/[exactly 4 numbers between 0-9], so the only acceptable variations should be like
1/2014 or 23/2015 or 123/2016
and nothing else. I CANNOT accept something like 1234/3012 or anything else, and this is exactly my problem: it accepts any input in which it can find the above pattern, so from 12345/6789 it accepts and saves 345/6789.
I am a total newbie with regex, so I checked out http://regexr.com and this is the code I have in my javascript:
$.validator.addMethod("hatarozat", function(value, element) {
return (this.optional(element) || /[0-9]{1,3}(?:\/)[0-9]{4}/i.test(value));
}, "Hibás határozat szám!");
So this is my regex: /[0-9]{1,3}(?:\/)[0-9]{4}/i
which I built using the above website. What could be the problem, or how can I achieve what I described? I tried /^[0-9]{1,3}(?:\/)[0-9]{4}$/i but this doesn't seem to work. Please, can anyone help me? I have everything else done and am getting pretty stressed over something that looks so simple yet I cannot solve. Thank you!

Your last regex, with the anchors (^ and $), is correct. What prevents your code from working is this.optional(element) ||. That part is probably true, so no error is shown: || is an OR condition, so if the first operand is true the whole expression is true and the regex is not checked at all.
So, use
return /^[0-9]{1,3}\/[0-9]{4}$/.test(value);
Note you do not need the (?:...) with \/ as the grouping does not do anything important here and is just redundant. The anchors are important, since you want the whole string to match the pattern (and ^ anchors the regex at the start of the string and $ does that at the end of the string.)
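A quick way to see the effect of the anchors is to run both versions of the regex over the inputs mentioned in the question. This is only a sketch: the sample list and variable names are illustrative, and the jQuery Validate wiring is left out.

```javascript
// Unanchored: succeeds if the pattern occurs anywhere in the input.
const unanchored = /[0-9]{1,3}\/[0-9]{4}/;
// Anchored: the whole input must match the pattern.
const anchored = /^[0-9]{1,3}\/[0-9]{4}$/;

const samples = ['1/2014', '23/2015', '123/2016', '1234/3012', '12345/6789'];

const unanchoredResults = samples.map((s) => unanchored.test(s));
const anchoredResults = samples.map((s) => anchored.test(s));

console.log(unanchoredResults); // [ true, true, true, true, true ]
console.log(anchoredResults);   // [ true, true, true, false, false ]
```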

You need to use the following special characters in your regular expression:
^ and $
or \b
so either of these two regexes will be correct:
/\b[0-9]{1,3}(?:\/)[0-9]{4}\b/i;
or
/^[0-9]{1,3}(?:\/)[0-9]{4}$/i

Javascript substring check using indexOf or search on a date string with forward slash /

I am surprised not to find any post regarding this; I must be missing something very trivial. I have a small JavaScript function to check if a string matches an object's properties. Simple stuff, right? It works easily with all strings except those which contain a forward slash.
"‎04‎/‎08‎/‎2015‎".indexOf('4') // returns 2 :good
"‎04‎/‎08‎/‎2015‎".indexOf('4/') // returns -1 :why?
The same issue appears with the .search() function as well. I encountered this issue while working on date strings.
Please note that I don't want to use a regex-based solution for performance reasons. Thanks for your help in advance!

Your string has invisible Unicode characters in it. The "left-to-right mark" (hex 200E) appears around the two slash characters as well as at the beginning and the end of the string.
If you type the code in on your browser console instead of cutting and pasting, you'll see that it works as expected.
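The following sketch rebuilds the pasted string with explicit \u200E escapes so the invisible characters can be seen in the source, then strips them without a regex. The variable names are illustrative, and the split/join cleanup is one possible approach rather than the only one.

```javascript
// U+200E, the invisible left-to-right mark.
const LRM = '\u200E';

// The date string as pasted: a mark at both ends and around each slash.
const pasted = `${LRM}04${LRM}/${LRM}08${LRM}/${LRM}2015${LRM}`;

const visibleLength = '04/08/2015'.length; // 10 visible characters
const actualLength = pasted.length;        // 16 including the six marks

const singleChar = pasted.indexOf('4');    // 2: mark, '0', then '4'
const before = pasted.indexOf('4/');       // -1: a mark sits between '4' and '/'

// Strip the marks without a regex, then search again.
const cleaned = pasted.split(LRM).join('');
const after = cleaned.indexOf('4/');       // 1

console.log(visibleLength, actualLength, singleChar, before, after);
// prints: 10 16 2 -1 1
```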

Regex to validate textbox length

I have this RegEx that validates input (in javascript) to make sure user didn't enter more than 1000 characters in a textbox:
^.{0,1000}$
It works ok if you enter text in one line, but once you hit Enter and add new line, it stops matching. How should I change this RegEx to fix that problem?

The problem is that . doesn't match the newline character. I suppose you could use something like this:
^(?:.|[\r\n]){0,1000}$
It should work (as long as you're not using m), but do you really need a regular expression here? Why not just use the .length property?
Obligatory jwz quote:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Edit: You could use a CustomValidator to check the length instead of using Regex. MSDN has an example available here.

What you wish is this:
/^[\s\S]{0,1000}$/
The reason is that . won't match newlines.
A better way however is to not use regular expressions and just use <text area element>.value.length
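As a rough sketch of the points above, the snippet below compares the original pattern, the [\s\S] version, and a plain length check. The isWithinLimit helper and the sample strings are hypothetical, introduced only for this example.

```javascript
const twoLines = 'first line\nsecond line';
const tooLong = 'x'.repeat(1001);

// . does not match \n, so the original pattern rejects multi-line input.
const dotMatches = /^.{0,1000}$/.test(twoLines);

// [\s\S] matches any character, including line breaks.
const anyCharMatches = /^[\s\S]{0,1000}$/.test(twoLines);
const anyCharTooLong = /^[\s\S]{0,1000}$/.test(tooLong);

// Hypothetical helper: the same check using only the length property.
function isWithinLimit(value, max = 1000) {
  return value.length <= max;
}

console.log(dotMatches, anyCharMatches, anyCharTooLong);
// prints: false true false
console.log(isWithinLimit(twoLines), isWithinLimit(tooLong));
// prints: true false
```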

If you just want to verify the length of the input wouldn't it be easier to just verify the length of the string?
if (input.length > 1000) {
    // fail validation
}
