Regex line in js - javascript

Please help. I have the regex ^((?![<>=^##]).)*$, which rejects unusual symbols in an input field. It works, but I need to add one more condition: the input must contain vs.
For example, the name of a sports game such as Patriots vs. Tigers.
How can I extend my ^((?![<>=^##]).)*$ pattern with a rule that checks for vs. in the line (the input field must contain vs.)?
It would be great if the condition also checked for spaces on both sides of vs., because, for example, Patriotsvs.Tigers is not valid and should also show an error.

I think what you want is
/^[^<>=^##]*?\bvs\.[^<>=^##]*$/
which blacklists the characters [<>=^##] and requires the literal text "vs." somewhere in the string.
That character blacklist is probably insufficient if you're trying to only approve inputs that won't lead to SQL-injection or XSS. Please consider using a stock input filtering/escaping system with this.

You can use lookaheads with the start anchor to effectively apply multiple conditions. Here is something that should work for you:
^(?=((?![<>=^##]).)*$)(?=.*?\svs\.\s).*$
Will match:
thing vs. another
Patriots vs. Tigers
Won't match:
th%^hg vs. another
thing another
thingvs.another
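To try the lookahead pattern against the examples listed above, a quick sketch like the following could be used. The regex is the one from the answer; the helper name isValidMatchup and the variable names are made up for illustration.

```javascript
// Pattern from the answer above: two lookaheads, both anchored at the start.
//  1) no character in the string may be one of < > = ^ #
//  2) the string must contain "vs." with whitespace on both sides
const matchupPattern = /^(?=((?![<>=^##]).)*$)(?=.*?\svs\.\s).*$/;

// Hypothetical helper name, not part of the original question.
function isValidMatchup(input) {
  return matchupPattern.test(input);
}

const samples = [
  "thing vs. another",
  "Patriots vs. Tigers",
  "th%^hg vs. another", // contains the forbidden ^
  "thing another",      // no "vs." at all
  "thingvs.another",    // "vs." without surrounding spaces
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), "=>", isValidMatchup(sample));
}
```

The first two samples should be accepted and the last three rejected, matching the lists above.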

Related

How to select and replace certain words in input strings regardless of the full string. (More info for clarification below)

I am making a filter for a chat room I own.
I was successful in having it turn NSFW words into a bunch of symbols and asterisks to censor them, but many people bypass it by simply putting a backslash, period, or other symbol or letter after the word, because I only put in the words without punctuation and symbols. They also come up with more creative methods such as eeeNSFWeee, so the filter doesn't count it as a word.
Is there a way to make it so that the filter will select certain characters that form a word in a string and replace them (with or without replacing the extra characters connected to the message)?
The Filter is made in javascript and Socket.io
Filter code:
const array = [
"NSFW",
"Bad Word",
"Inappropriate Word"
];
message = message
.split(" ")
.map((word) => (array.includes(word.toLowerCase()) ? "$#!%" : word))
.join(" ");
For example, if somebody typed "Bad Word" exactly like that (caps are not a problem), it would censor it successfully.
But if somebody typed "Bad Word." that would be a problem: because of the period, it counts as a different word. That's what I need fixed.
There are a number of approaches you could take here.
You could use replace() if you just want to remove symbols. For example:
word.replace(/[&\/\\#,+()$~%.`'"!;\^:*?<>{}_\[\]]/g, '')
You could use Regular Expressions in general, which allows you to match on patterns instead of exact string matching.
You could also use more complex fuzzy matching libraries or custom fuzzy matching to accomplish your goal. This post may be helpful.
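As a rough sketch of the regular-expression approach mentioned above, the word list could be turned into one case-insensitive pattern and applied to the whole message rather than to space-separated words. The names bannedWords, escapeRegExp and censor are invented here, and the list entries are placeholders. Matching anywhere in the message catches "Bad Word." and eeeNSFWeee, but it will also hit innocent words that happen to contain a banned one, so treat it as a starting point only.

```javascript
// Placeholder entries; the real list is an assumption.
const bannedWords = ["nsfw", "bad word", "inappropriate word"];

// Escape characters that have a special meaning in a regex,
// so every list entry is matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// One alternation of all entries: g = replace every hit, i = ignore case.
const bannedPattern = new RegExp(bannedWords.map(escapeRegExp).join("|"), "gi");

// Hypothetical helper: replaces each banned word wherever it occurs.
function censor(message) {
  // A function is used as the replacement so "$" is not treated specially.
  return message.replace(bannedPattern, () => "$#!%");
}

console.log(censor("That is a Bad Word."));
console.log(censor("eeeNSFWeee"));
```

Because the pattern is not tied to word boundaries or spaces, trailing punctuation and padding letters no longer hide a banned word.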

Incorrect Regex Expression

I found this site:
https://mathiasbynens.be/demo/url-regex
and wanted to use the regex by @diegoperini for my URL validation because, according to the table at the top of the site, it is the best one.
When I try to use it, I get a range value error.
P.S. I am using the following Regex expression:
_^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?$_iuS
and the following online validator:
http://regexr.com/
It does show where the error is in the regex, but I don't know how to fix it. I tried swapping the two ranges, but that doesn't do the trick.
I would appreciate some help.
P. P. S.
I use the regex in the AngularJS directive to validate url input.
Buried within your character classes, you have this range:
\x{00a1}-\x{ffff}
But it should be:
\u00a1-\uffff
Your expression \x{00a1}-\x{ffff} is not valid JavaScript syntax for a hex-encoded character. As written, it means any of the characters "x{}0a1f" plus the range "}-x", and because "x" sorts before "}", a range error is raised.
This should work (in JavaScript, keep the i flag; the surrounding _ delimiters and the uS modifiers come from the PHP version of the pattern and are not part of it):
^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?$
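To see the range error in isolation, a small sketch like this one compiles just the character class with both spellings. The helper name compiles and the two variable names are made up for the example; only the character-class fragments come from the patterns above.

```javascript
// Hypothetical helper: reports whether a pattern source compiles in this engine.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (error) {
    return false; // the engine threw a SyntaxError while parsing the pattern
  }
}

// PCRE/PHP-style code point escapes: JavaScript reads "}-\x" as a reversed range.
const pcreStyleClass = "[a-z\\x{00a1}-\\x{ffff}0-9]";

// JavaScript-style \uXXXX escapes for the same intended range.
const jsStyleClass = "[a-z\\u00a1-\\uffff0-9]";

console.log(compiles(pcreStyleClass, "i")); // rejected: range out of order
console.log(compiles(jsStyleClass, "i"));   // accepted
console.log(new RegExp(jsStyleClass, "i").test("\u00e9")); // U+00E9 lies inside the range
```

Building the pattern with new RegExp inside a try/catch is also a convenient way to surface this kind of error in application code instead of at script load time.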

My regex that should only accept latin-based characters is acting strangely

I've got a regex written to the best of my ability that allows the latin character set only with the option of a '-' that, if included MUST be followed by at least one other latin character.
My RegEx:
[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:[-]?[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)
I came to this after reading a few posts and rereading the manual to figure out the best way to approach this. This check is attached to a text field where a user types only their first name and then submits.
It works okay but there is certainly room for improvement.
Examples:
Tom // passes
Éve // passes
John-Paul // passes
2pac // passes and removes numbers (not really what I want)
John316 // passes and removes numbers (not really what I want)
What I would REALLY want to happen is a fail on those last two checks.
How would I revise it to get the outcome I'd like?
You need to anchor the regex by adding ^ at the start and $ at the end. That way, the pattern must match the whole input string, and any other symbol makes it fail.
I also suggest moving the ? from after the hyphen to the end of the group. The hyphen then has no quantifier and is required whenever the group is present, which limits backtracking:
^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$
See regex demo.
JS snippet:
console.log(/^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/.test('Éve')); //=> true
console.log(/^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/.test('John-Paul')); // => true
console.log(/^[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+(?:-[\u00BF-\u1FFF\u2C00-\uD7FFA-Za-z]+)?$/.test('John316')); // => false

JavaScript Repetitive RegEx

Consider these input strings.
http://local.app.com/local/frontend/v12/#/abcde/
http://local.app.com/local/frontend/v12/#/abcde/!/fghij/
http://local.app.com/local/frontend/v12/#/abcde/!/ghijk/!/klmno/
I have written this regex which works fine for input string 1.
(?:([a-zA-Z0-9.://_]*)(/#/(?=([a-zA-Z0-9]{5})/)))
Output:
http://local.app.com/local/frontend/v12/#/,http://local.app.com/local/frontend/v12,/#/,abcde
But when I extend it to support the repeated !/.../ placeholder for input strings 1, 2 and 3, it doesn't work and gives an empty string rather than the token.
(?:([a-zA-Z0-9.://_]*)(/#/(?=([a-zA-Z0-9]{5})/))(!/(?=([a-zA-Z0-9]{5})/))*)
Output:
http://local.app.com/local/frontend/v12/#/,http://local.app.com/local/frontend/v12,/#/,abcde,,
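A lookahead (?=...) only asserts that something follows at the current position. It is zero-width: whatever it matches is not consumed, so the regex engine is still standing right after /#/ when it reaches the next part of the pattern.
That is why your repeated (!/...)* group never matches: at that position the next characters are abcde/, not !/.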
Try
(.+?#(/[a-zA-Z0-9]{5}/)(!/([a-zA-Z0-9]{5})/)*)
(hope I didn't make a typo, can't test it right now.)
This should capture the complete input, but the various captures inside give you access to the captured "tokens".
You can, in addition, give names to the various captures inside, making it easier to identify them in the match:
(.+?#(/(?<tokenFirst>[a-zA-Z0-9]{5})/)(!/(?<tokenMore>[a-zA-Z0-9]{5})/)*)
Hope this will clarify my comment and earlier remarks.
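One thing to keep in mind in JavaScript: a capture group that sits inside a repeated group only keeps its last iteration, so a single match cannot hand back every !/.../ token. A possible workaround, sketched below, is to capture the whole repeated tail once and then scan it with matchAll. The names urlPattern and extractTokens are made up for this example.

```javascript
// Group 1: everything before "/#/"; group 2: the first 5-character token;
// group 3: the whole tail of zero or more "!/xxxxx/" segments.
const urlPattern = /^(.+?)\/#\/([a-zA-Z0-9]{5})\/((?:!\/[a-zA-Z0-9]{5}\/)*)$/;

// Hypothetical helper: returns the base URL and every token, or null if no match.
function extractTokens(url) {
  const match = urlPattern.exec(url);
  if (match === null) {
    return null;
  }
  const [, base, firstToken, tail] = match;
  // Second pass: pull each token out of the repeated tail.
  const moreTokens = Array.from(
    tail.matchAll(/!\/([a-zA-Z0-9]{5})\//g),
    (segment) => segment[1]
  );
  return { base, tokens: [firstToken, ...moreTokens] };
}

console.log(extractTokens("http://local.app.com/local/frontend/v12/#/abcde/"));
console.log(extractTokens("http://local.app.com/local/frontend/v12/#/abcde/!/ghijk/!/klmno/"));
```

For the third input string this should yield the base http://local.app.com/local/frontend/v12 and the tokens abcde, ghijk and klmno.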

Regex to validate textbox length

I have this RegEx that validates input (in javascript) to make sure user didn't enter more than 1000 characters in a textbox:
^.{0,1000}$
It works ok if you enter text in one line, but once you hit Enter and add new line, it stops matching. How should I change this RegEx to fix that problem?
The problem is that . doesn't match newline characters. Note that putting . inside a character class, as in [.\r\n], does not help, because there it means a literal dot. You could use an alternation instead:
^(?:.|[\r\n]){0,1000}$
It should work (as long as you're not using the m flag), but do you really need a regular expression here? Why not just use the .length property?
Obligatory jwz quote:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Edit: You could use a CustomValidator to check the length instead of using Regex. MSDN has an example available here.
What you want is this:
/^[\s\S]{0,1000}$/
The reason is that . won't match newlines.
A better way however is to not use regular expressions and just use <text area element>.value.length
If you just want to verify the length of the input wouldn't it be easier to just verify the length of the string?
if (input.length > 1000) {
  // fail validation
}
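To compare the approaches from the answers above side by side, here is a small sketch. The names MAX_LENGTH, isWithinLimit and twoLines are made up for the example; the two patterns are the ones discussed in this thread.

```javascript
const MAX_LENGTH = 1000; // limit taken from the question

// Hypothetical helper: plain length check, no regex involved.
// Note that .length counts UTF-16 code units.
function isWithinLimit(text, maxLength = MAX_LENGTH) {
  return text.length <= maxLength;
}

const twoLines = "first line\nsecond line";

// "." does not match "\n", so the original pattern rejects multi-line input.
console.log(/^.{0,1000}$/.test(twoLines)); // false

// [\s\S] matches any character, including newlines.
console.log(/^[\s\S]{0,1000}$/.test(twoLines)); // true

// The length check gives the same verdict without a regex.
console.log(isWithinLimit(twoLines)); // true
console.log(isWithinLimit("x".repeat(1001))); // false
```

In engines that support the s (dotAll) flag, /^.{0,1000}$/s is another way to let . match newlines.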
