regexp: match everything beginning from second dot including dot - javascript

I want to match everything beginning from second ., including .
Regexp: /(?<=\d\.\d+)\..*/g. Playground regex101
It does not work for strings such as 1232..233232.

Update
As @WiktorStribiżew points out, the regex in the old answer does not handle 1212.2e1.121212, so this might be a better solution:
/(?<=^[^.]*\.[^.]*)\..*/
It handles that case as well.
Old answer:
You can do it like this (regex101); the match begins at the second . and includes it.
Regexp: /(?<=\d?\.\d*)\..*/g
You need to use * (zero or more occurrences of the preceding element) instead of + (one or more occurrences).
I have added a ? after your first \d to handle the case if it starts with a . and not a digit.
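As a quick check, the updated pattern can be run against the strings mentioned in this thread. This is only a minimal sketch: the helper name fromSecondDot is made up for illustration, and it assumes a JavaScript engine with lookbehind support (ES2018).

```javascript
// Hypothetical helper: returns everything from the second dot onward,
// or null when the input contains fewer than two dots.
function fromSecondDot(input) {
  // The lookbehind requires exactly one dot between the start of the
  // string and the dot where the match begins.
  const match = input.match(/(?<=^[^.]*\.[^.]*)\..*/);
  return match ? match[0] : null;
}

console.log(fromSecondDot("1232..233232"));    // ".233232"
console.log(fromSecondDot("1212.2e1.121212")); // ".121212"
console.log(fromSecondDot("12.34"));           // null
```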

When reading your question literally.
I want to match everything beginning from second ., including .
This would do the trick:
[.][^.]*([.].*)
This leaves the result in group 1. Keep in mind that [^.] also matches newline characters; if you don't want this, add \n to the negated character class.
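For engines without lookbehind, the capture-group pattern above can be used as follows. This is a sketch under the assumption that the part of interest is read from group 1; the helper name fromSecondDotGroup is hypothetical.

```javascript
// Hypothetical helper: same goal as before, but the text from the second
// dot onward is taken from capture group 1 instead of the whole match.
function fromSecondDotGroup(input) {
  const match = /[.][^.]*([.].*)/.exec(input);
  return match ? match[1] : null;
}

console.log(fromSecondDotGroup("1232..233232")); // ".233232"
console.log(fromSecondDotGroup("1.2.3.4"));      // ".3.4"
console.log(fromSecondDotGroup("12.34"));        // null
```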

Related

Replacements only in the first line with a regex

There is a transform of multiline string.
!a! b!
should become
.a. b.
And
!a! b!
c!
!d!
should become
.a. b.
c!
!d!
I approached it with a lookbehind:
str.replace(/(?<!\n)([^\n!]*)!+/g, '$1.')
It didn't work as intended:
.a. b.
c.
!d.
Splitting the string and transforming the first line seems straightforward. But is there a reliable way to do replacements only in the first line of a multiline string with a regex only?
I would also appreciate an explanation of what exactly goes wrong with my approach.
The question is not limited to the JS regex flavour, but that is the one I'm most interested in.
About the pattern you tried:
(?<!\n) Negative lookbehind, assert what is directly to the left is not a newline
([^\n!]*) Capture group 1, match 0+ times any char except a newline or !
!+ Match 1+ times ! (What you want to remove)
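The pattern matches too much because it matches every individual part of the string, not only those in the first line. There is no rule that limits how often the pattern may match, so the replacement with group 1 happens for every match. The lookbehind only rejects a match that starts directly after a newline; a match that starts one character later is still accepted.
Note that because the quantifier in ([^\n!]*) allows zero repetitions, the pattern also matches a single ! on its own, except when that ! is directly preceded by a newline.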
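The behaviour described above can be reproduced by listing the individual matches. This is a small sketch that assumes the attempted call was str.replace with the pattern from the question; the variable names are made up for illustration.

```javascript
const sampleInput = "!a! b!\nc!\n!d!";
const attemptedPattern = /(?<!\n)([^\n!]*)!+/g;

// Every separate match is replaced, including those on later lines.
const matchedParts = [...sampleInput.matchAll(attemptedPattern)].map((m) => m[0]);
console.log(matchedParts); // [ '!', 'a!', ' b!', '!', 'd!' ]

// Only the "!" that directly follows a newline is skipped by the lookbehind.
const attemptedResult = sampleInput.replace(attemptedPattern, "$1.");
console.log(JSON.stringify(attemptedResult)); // ".a. b.\nc.\n!d."
```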
If you can make use of (*SKIP)(*FAIL) (PCRE backtracking verbs, which JavaScript does not support), you can first match what you want to avoid, which in this case is a line that optionally starts with an exclamation mark and ends with an exclamation mark with none in between.
After that match all the other exclamation marks and replace them with a dot.
^!?[^\r\n!]*!$(*SKIP)(*FAIL)|!
See a regex demo
Another option could be using 2 capturing groups.
The first group will match between the first set of exclamation marks, and the second group will match the whitespaces after followed by a char other than !.
Then match the ! at the end so it is not in the replacement
!([^\s!]+)!([^\S\r\n]+[^\s!])!
See another regex demo
In the replacement use the 2 capturing groups with the dots
.$1.$2.
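Since the question is mainly about JavaScript, here are two possible ways to restrict the replacement to the first line there. Neither comes from the answers above; both are sketches that assume lines are separated by \n and, for the first variant, that the engine supports lookbehind (ES2018).

```javascript
const multiline = "!a! b!\nc!\n!d!";

// Variant 1: regex only. Without the m flag, ^ matches only at index 0,
// so a "!" qualifies only if no newline occurs anywhere before it.
const viaLookbehind = multiline.replace(/(?<=^[^\n]*)!+/g, ".");

// Variant 2: match the first line once, then replace inside a callback.
const viaCallback = multiline.replace(/^[^\n]*/, (line) => line.replace(/!+/g, "."));

console.log(JSON.stringify(viaLookbehind)); // ".a. b.\nc!\n!d!"
console.log(JSON.stringify(viaCallback));   // ".a. b.\nc!\n!d!"
```

The callback variant avoids lookbehind entirely, at the cost of using two patterns instead of one.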

Regular Expression for Blocking a character in the beginning

I am facing an issue with a regular expression while trying to block any string that has a minus (-) at the beginning of a sequence of whitelisted characters.
^(?!-.*$).([a-zA-Z0-9-:#\\,()\\/\\.]+)$
It blocks a minus (-) in the first position and allows it anywhere else in the character sequence, but the regex does not work if the passed string is a single character,
e.g. A or 9.
Please help me out with this or give me a good regex to do the task.
Your pattern requires at least 2 chars in the input string because there is a dot after the first lookahead and then a character class follows that has + after it (that is, at least 1 occurrence must be present in the string).
So, you need to remove the dot. Also, you do not need to escape these special chars inside a character class. Besides, to avoid matching strings starting with -, a mere (?!-) will suffice; there is no need to add .*$ there. You may use
^(?!-)[a-zA-Z0-9:#,()/.-]+$
See the regex demo. Remember to escape / if used in a regex literal notation in JavaScript, there is no need to escape it in a constructor notation or in a Java regex pattern.
Details
^ - start of a string
(?!-) - cannot start with -
[a-zA-Z0-9:#,()/.-]+ - 1 or more ASCII letters, digits and special chars defined in the character class (:, #, ,, (, ), /, ., -)
$ - end of string.
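In JavaScript regex literal notation, the pattern above could be checked like this. It is a minimal sketch; the names allowedPattern and isAllowed are hypothetical, and the / inside the character class is escaped as the answer recommends.

```javascript
// Same pattern as above, written as a JavaScript regex literal.
const allowedPattern = /^(?!-)[a-zA-Z0-9:#,()\/.-]+$/;

// Hypothetical helper: true when the value contains only whitelisted
// characters and does not start with a minus.
function isAllowed(value) {
  return allowedPattern.test(value);
}

console.log(isAllowed("A"));    // true  (single characters now pass)
console.log(isAllowed("a-b"));  // true  (minus allowed after the start)
console.log(isAllowed("-abc")); // false (leading minus is blocked)
console.log(isAllowed("a b"));  // false (space is not whitelisted)
```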
If I understand correctly, and you don't want a minus at the beginning, does ^[^-].* work as a regex for you? Java's matches() would return false if the string starts with a minus.
The String class has a method that does exactly what you are asking for: startsWith(). You could use it like this (read it as "if the given String doesn't start with -, do something; otherwise run the else part, which can contain some code or stay empty if nothing should happen"):
if (!yourString.startsWith("-")) {
    doSomething();
} else {
    doNothingOrProvideAnyInformationAboutWrongInput();
}
I think that it can help you.
^(?!-).*[a-zA-Z0-9-:#\\,()\/\\.]+$

Why not being greedy in triple dots?

I'm not looking to brute-force this into working with a workaround; I am interested in learning why it failed.
I am trying to match all occurrences of a comma or period NOT followed by a space.
I used this pattern: ([.,]+)(?! )
It should match only two cases in this string:
This is a test... And,another test.
It should match the , between And and another AND it should match the final period of the sentence. HOWEVER, it is also matching the first two dots of the triple dots (...). Shouldn't the + make it greedy, so that it sees the triple dots are followed by a space and does not match them?
Your regex ([.,]+)(?! ) matches .. in ... because of backtracking. It happens when a regex may match a part of the string in different ways, which is the case when you use quantifiers and lookarounds. Here, the engine matches ... and checks if there is a space. There is a space after ... in your string, so the match fails at that point, but the regex engine knows there is another possible way to match at the current location, and backtracks. It discards the final . from the match and checks whether the second . in ... is followed by a space. It is not; there is a . after it. So, .. is matched.
You can use an atomic group workaround here:
/(?=([.,]+))\1(?! )/g
See the regex demo
One or more dots or commas are captured inside a lookahead and then \1 consumes the text. Since there is no backtracking possible into backreferences, the negative lookahead is checked after the last . or , and if there is a space, fail occurs and the preceding . or , are not checked.
A better way for a JS regex engine to match what you want is to include the . and , into the negative lookahead condition (see Pavneet's suggestion):
/[.,]+(?![ .,])/g
         ^^^^^
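The three patterns discussed in this answer can be compared side by side on the sentence from the question. This is only an illustrative sketch; the variable names are made up.

```javascript
const sentence = "This is a test... And,another test.";

// Original pattern: backtracking lets it settle for ".." inside "...".
const original = sentence.match(/([.,]+)(?! )/g);

// Atomic-group emulation: the lookahead captures the whole run, and \1
// cannot give characters back, so "..." fails as a whole.
const atomic = sentence.match(/(?=([.,]+))\1(?! )/g);

// Lookahead that also rejects a following dot or comma.
const extended = sentence.match(/[.,]+(?![ .,])/g);

console.log(original); // [ '..', ',', '.' ]
console.log(atomic);   // [ ',', '.' ]
console.log(extended); // [ ',', '.' ]
```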

JavaScript: Regex for one letter and a point

Using Javascript, I am trying to split a text using a regular expression that contains only one letter and a point; for example 'A.' or 'Q.'
I am using: array[0].split(/[A-Z]+./);
But it also matches when there are multiple letters, like 'AA.' or 'ZZ.'
What is the regex for one character only? I also have access to jQuery.
Thank you!
This should work: array[0].split(/([A-Z]{1}\.)/);
Explanation:
[A-Z]{1} - this looks for any single letter
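\. - this looks for a literal . (you need the backslash because an unescaped . in a regex means any character).
These are surrounded by parentheses so that you are looking for the sequence:
one letter followed by a .
With split, the parentheses also cause the matched separators to be included in the result array.
You'll need to remove the +, as it means one or more. If you want to find several occurrences, you'll need the global flag (g at the end): array[0].split(/([A-Z]{1}\.)/g); Note that split already splits at every match, so the g flag matters for methods such as match and replace rather than for split.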
Limit to A. and not AA.
If you want to make sure it's only one letter followed by a . and not a match on the end of AA., then you can use the word boundary like so: array[0].split(/\b([A-Z]{1}\.)/);
This site is amazing for helping with regex: https://regex101.com/
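To see how the variants above behave, here is a short sketch on made-up sample strings (the strings and variable names are assumptions, not taken from the question).

```javascript
const labelled = "A. first B. second";

// Without a capture group, the separators are dropped from the result.
const partsOnly = labelled.split(/\b[A-Z]\./);

// With a capture group, split keeps the matched separators in the array.
const partsWithLabels = labelled.split(/\b([A-Z]\.)/);

// Without \b, the pattern still matches the trailing "A." of "AA.".
const doubled = "AA. text";
const withoutBoundary = doubled.split(/[A-Z]\./);
const withBoundary = doubled.split(/\b[A-Z]\./);

console.log(partsOnly);       // [ '', ' first ', ' second' ]
console.log(partsWithLabels); // [ '', 'A.', ' first ', 'B.', ' second' ]
console.log(withoutBoundary); // [ 'A', ' text' ]
console.log(withBoundary);    // [ 'AA. text' ]
```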
You don't want to use +, as it means "one or more". Just remove it. Also you have to escape the . otherwise it means "any character".
If you want to match only a capital letter and a point, this should do the trick : /[A-Z]\./g if you want the regex to be case insensitive, simply add the i flag : /[A-Z]\./gi. Remove the g flag if you only want to match the first occurrence.
You can try this one:
/\s[A-Z]\.\s/g

Explanation of this Regular expression

I'm not very good with regular expressions, and I didn't fully understand this one. All I get from it is that it finds every h1 and adds a class to its last word.
$("h1").html(function(index, old) {
    return old.replace(/(\b\w+)$/, '<span class="myClass">$1</span>');
});
I'm trying to make it work for the last two characters instead.
Here is an explanation:
/ : regex delimiter
( : begin capture group #1
\b : word boundary
\w+ : one or more word characters (same as [a-zA-Z0-9_]+)
) : end of group
$ : end of string
/ : regex delimiter
It matches the last word of the string, ie the last word of the h1 tag.
This (poorly written) regex finds a sequence of word characters (Latin letters, digits and underscore) at the end of the input. The same can be achieved much more simply with /\w+$/, so neither \b nor the parentheses are actually necessary here.
To match two last words you'll need something like
/\w+(?=(\W+\w+)?$)/g
which means "a word, optionally followed by another word before the end of the input".
To match two last characters -- well, this is something you should be able to figure out on your own (hint: any character is . (dot) in regex language).
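Following the hint, one possible way to wrap the last two characters is sketched below on plain strings, so it runs without jQuery. The function names are hypothetical, and the sketch assumes the heading ends with ordinary text rather than markup.

```javascript
// Hypothetical helper: wraps the last two characters of the text.
// .{2} means "any two characters"; $ anchors them to the end.
function wrapLastTwoChars(text) {
  return text.replace(/(.{2})$/, '<span class="myClass">$1</span>');
}

// For comparison: the simplified last-word pattern, using $& (whole match).
function wrapLastWord(text) {
  return text.replace(/\w+$/, '<span class="myClass">$&</span>');
}

console.log(wrapLastTwoChars("Hello World"));
// Hello Wor<span class="myClass">ld</span>
console.log(wrapLastWord("Hello World"));
// Hello <span class="myClass">World</span>
```

With jQuery, as in the question, the helper could be passed through the same callback: $("h1").html(function(index, old) { return wrapLastTwoChars(old); });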
