Obtain everything after one of these phrases in regex? - javascript

I'm trying to use a regex to obtain everything after one of these phrases in a string in javascript.
The phrases are: call me, my name's, my name is, my names, I am, or I'm.
So I want everything after those phrases in the string.
I'm trying to do it like so, but it captures everything, not only the text after the phrase:
/call\s+me(.*)|my\s+name\s+is(.*)|my\s+name's(.*)|my\s+names(.*)|Im(.*)|I\s+am(.*)|I'm(.*)/i.exec(string);
How can I do this properly?

The text after it will be in the capture groups. It will be in a different capture group depending on which prefix matched. So it would be better to put just the prefixes in the | alternatives, and just have a single capture group:
var result = str.match(/(?:call me|my name's|my name is|my names|I am|I'm)(.*)/)
Now result[1] will contain the text after the phrase.
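A minimal sketch of this approach, assuming case-insensitive matching and that surrounding whitespace should be trimmed. The helper name textAfterPhrase and the sample inputs are made up for illustration; they are not from the question.

```javascript
// Hypothetical helper: returns the text after the first intro phrase, or null.
function textAfterPhrase(str) {
  // Only the prefixes are alternated; a single capture group holds the rest.
  const pattern = /(?:call me|my name's|my name is|my names|I am|I'm)(.*)/i;
  const result = str.match(pattern);
  return result ? result[1].trim() : null;
}

console.log(textAfterPhrase("Hi, my name is Alice")); // "Alice"
console.log(textAfterPhrase("You can call me Bob"));  // "Bob"
console.log(textAfterPhrase("Hello there"));          // null
```

Returning null when nothing matches avoids reading result[1] from a failed match.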

Try a positive lookbehind assertion:
(?<=call me|my name's|my name is|my names|I am|I'm).*
Edit
This regex won't work in JavaScript engines that lack lookbehind support, which was only added in ES2018 (see @Barmar's answer for a version that works everywhere).
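For what it's worth, engines that implement ES2018 do accept lookbehind, so on a current runtime the pattern can be tried as below. This is only a sketch under that assumption; the helper name afterPhraseLookbehind and the sample input are illustrative.

```javascript
// Assumes a JavaScript engine with ES2018 lookbehind support.
function afterPhraseLookbehind(str) {
  const pattern = /(?<=call me|my name's|my name is|my names|I am|I'm).*/i;
  const result = str.match(pattern);
  // result[0] is the whole match, which here is only the text after the phrase.
  return result ? result[0].trim() : null;
}

console.log(afterPhraseLookbehind("Hello, I'm Carol")); // "Carol"
```

On older engines the regex literal itself is a syntax error, so the capture-group version remains the portable choice.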

Related

Splitting a string at question mark, exclamation mark, or period in javascript and retain those marks?

I was a bit surprised that no one seems to have had exactly the same issue in JavaScript...
I tried several different solutions, but none of them parse the content correctly.
The closest one I tried (I took its regex from a PHP solution):
const test = `abc?aaa.abcd?.aabbccc!`;
const sentencesList = test.split("/(\?|\.|!)/");
But the result is just
["abc?aaa.abcd?.aabbccc!"]
What I want to get is
['abc?', 'aaa.', 'abcd?','.', 'aabbccc!']
I am so confused.. what exactly is wrong?
/[a-z]*[?!.]/g will do what you want:
const test = `abc?aaa.abcd?.aabbccc!`;
console.log(test.match(/[a-z]*[?!.]/g))
To help you out, what you wrote is not a regex. In test.split("/(\?|\.|!)/"); the argument is simply a string, so split looks for that literal text and never finds it. A regex would be, for example, test.split(/(\?|\.|!)/);. This still would not be the regex you're looking for.
The problem with this regex is that it's looking for a ?, ., or ! character only, and capturing that lone character. What you want to do is find any number of characters, followed by one of those three characters.
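Next, String.split does accept regexes as arguments, but it is the wrong tool here: it treats each match as a separator, and with a capturing group it returns the marks as separate array elements rather than attached to the preceding text. You'll want to use a function that returns the matches themselves (such as String.match).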
Putting this all together, you'll want to start out your regex with something like this: /.*?/. The dot means any character matches, the asterisk means 0 or more, and the question mark means "non-greedy": try to match as few characters as possible while keeping a valid match.
To search for your three characters, you would follow this up with /[?!.]/ to indicate you want one of these three characters (so far we have /.*?[?!.]/). Lastly, you want to add the g flag so it searches for every instance, rather than only the first. /.*?[?!.]/g. Now we can use it in match:
const rawText = `abc?aaa.abcd?.aabbccc!`;
const matchedArray = rawText.match(/.*?[?!.]/g);
console.log(matchedArray);
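For comparison, here is a sketch of what split does when it is handed a real regex, using the same sample string. The second call assumes an engine with lookbehind support (ES2018); the variable names are illustrative.

```javascript
const text = "abc?aaa.abcd?.aabbccc!";

// A capturing group makes split() return the delimiters as separate items.
const withCapture = text.split(/(\?|\.|!)/);
console.log(withCapture);
// ["abc", "?", "aaa", ".", "abcd", "?", "", ".", "aabbccc", "!", ""]

// Splitting at the zero-width position after each mark keeps it attached.
// Assumes lookbehind support (ES2018).
const withLookbehind = text.split(/(?<=[?!.])/);
console.log(withLookbehind);
// ["abc?", "aaa.", "abcd?", ".", "aabbccc!"]
```

The match-based version above gives the same result as the lookbehind split without depending on lookbehind.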
The following code works; I do not think we need pattern matching. I take that back: I have been answering in Java, not JavaScript.
final String S = "A sentence may end with a period. Does it end any other way? Of course!";
final String[] simpleSentences = S.split("[?!.]");
// simpleSentences now has three elements; the punctuation marks themselves are dropped.

Get all the WORDS except one specific word

I want to get all the words, except one, from a string using JS regex match function. For example, for a string testhello123worldtestWTF, excluding the word test, the result would be helloworldWTF.
I realize that I have to do it using look-ahead functions, but I can't figure out how exactly. I came up with the following regex (?!test)[a-zA-Z]+(?=.*test), however, it works only partially.
http://refiddle.com/refiddles/59511c2075622d324c090000
IMHO, I would try to replace the offending word with an empty string, no?
Lookarounds seem to be overkill for this; you can just replace test with nothing:
var str = 'testhello123worldtestWTF';
var res = str.replace(/test/g, '');
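Note that the replace alone leaves the digits in place (hello123worldWTF). If the digits should go as well, as the expected output in the question suggests, one possible sketch is to split on the excluded word or on any run of non-letters. The variable names here are illustrative.

```javascript
const str = "testhello123worldtestWTF";

// Split on the excluded word or on any run of non-letters, then drop empties.
const words = str.split(/test|[^a-zA-Z]+/).filter(Boolean);

console.log(words);          // ["hello", "world", "WTF"]
console.log(words.join("")); // "helloworldWTF"
```

The filter(Boolean) step removes the empty string that split produces when the input starts with the excluded word.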
Plugging this into your refiddle produces the results you're looking for:
/(test)/g
It matches all occurrences of the word "test" without picking up unwanted words/letters. You can set this to whatever variable you need to hold these.
WORDS OF CAUTION
Since your input string has no set delimiters, you cannot reliably exclude a specific word.
For example, if you want to exclude test, this might create a problem if the input was protester or rotatestreet. You don't have clear demarcations of what a word is, thus leading you to exclude test when you might not have meant to.
On the other hand, if you just want to ignore the string test regardless, just replace test with an empty string and you are good to go.
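If the input does have delimiters such as spaces, a word boundary (\b) is one way to avoid touching words like protester. A rough sketch, assuming the excluded word contains no regex metacharacters; the helper name removeWholeWord and the sample sentence are made up for illustration.

```javascript
// Hypothetical helper: removes `word` only where it stands alone.
function removeWholeWord(text, word) {
  // Assumes `word` contains no regex metacharacters.
  const pattern = new RegExp("\\b" + word + "\\b", "g");
  return text.replace(pattern, "");
}

console.log(removeWholeWord("a test for the protester", "test"));
// "a  for the protester"
```

The surrounding spaces are left as they were, so some whitespace cleanup may still be wanted afterwards.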

Javascript regex optional on left or right

Hi, I can't get my pattern to work correctly. I want to detect whether a specific word has any word character or letter on its left side, its right side, or both.
For example:
a{placeholder} = found
{placeholder}b = found
a{placeholder}b = found
{placeholder} = not found
This is my pattern so far (\w)?\{LINK_TO_WEB_VERSION\}(\w)?
https://regex101.com/r/hX4lM0/1
You need to spell out each case explicitly and combine them with the alternation operator |:
\w\{LINK_TO_WEB_VERSION\}\w?|\w?\{LINK_TO_WEB_VERSION\}\w|\w\{LINK_TO_WEB_VERSION\}\w
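When only a yes/no answer is needed, the optional \w? parts can be dropped, which leaves two alternatives. A small sketch of that simplification against the four cases from the question (the names neighbourPattern and samples are illustrative):

```javascript
// Found when a word character touches the placeholder on at least one side.
const neighbourPattern = /\w\{LINK_TO_WEB_VERSION\}|\{LINK_TO_WEB_VERSION\}\w/;

const samples = [
  "a{LINK_TO_WEB_VERSION}",
  "{LINK_TO_WEB_VERSION}b",
  "a{LINK_TO_WEB_VERSION}b",
  "{LINK_TO_WEB_VERSION}",
];

for (const sample of samples) {
  console.log(sample, neighbourPattern.test(sample) ? "found" : "not found");
}
```

The pattern has no g flag on purpose, so repeated test() calls do not carry state between strings.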
You can use this negative lookahead based regex:
/^(?!\B{LINK_TO_WEB_VERSION}\B).+$/gim
You have more options:
1: with an or condition (|) you can match a letter before or a letter after, as follows: /(\w){placeholder}|{placeholder}(\w)/img
If either side matches, it is found; you do not need to match before AND after ;)
2: negate everything: find the placeholder with no letter before and no letter after: /(?:^|[^A-Z]){placeholder}(?:[^A-Z]|$)/img => not found; any other case means found. The ^ and $ alternatives cover a placeholder at the start or end of a line, where there is no neighbouring character for [^A-Z] to consume.
You do not need lookbehind or lookahead, in my opinion, but you can use them if you want: /(?<!\w)\{placeholder\}(?!\w)/ . Be careful, though: lookbehind is not supported by every regex engine (in JavaScript it is only available from ES2018 onward).
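The lookaround variant could be used roughly like this. It is a sketch that assumes an engine with lookbehind support (ES2018) and at most one placeholder per string; the helper name hasAdjacentWordChar is made up.

```javascript
// Assumes lookbehind support (ES2018) and one placeholder per string.
const standalonePattern = /(?<!\w)\{LINK_TO_WEB_VERSION\}(?!\w)/;

function hasAdjacentWordChar(text) {
  // A standalone placeholder means "not found"; anything else means "found".
  return text.includes("{LINK_TO_WEB_VERSION}") && !standalonePattern.test(text);
}

console.log(hasAdjacentWordChar("a{LINK_TO_WEB_VERSION}")); // true
console.log(hasAdjacentWordChar("{LINK_TO_WEB_VERSION}"));  // false
```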

What's wrong with this regular expression to find URLs?

I'm working on a JavaScript to extract a URL from a Google search URL, like so:
http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8
Right now, my code looks like this:
var checkForURL = /[\w\d](.org)/i;
var findTheURL = checkForURL.exec(theURL);
I've run this through a couple of regex testers and it seems to work, but in practice the string I get returned looks like this:
thisisthepartiwanttofind.org,.org
So where's that trailing ,.org coming from?
I know my pattern isn't super robust but please don't suggest better patterns to use. I'd really just like advice on what in particular I did wrong with this one. Thanks!
Remove the parentheses from the regex if you do not need to process the .org part separately (unlikely, since it is a literal). As per @Mark's comment, add a + to match one or more characters of the class [\w\d]. Also, I would escape the dot:
var checkForURL = /[\w\d]+\.org/i;
What you're actually getting is an array of 2 results, the first being the whole match, the second - the group you defined by using parens (.org).
Compare with:
/([\w\d]+)\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl"]
/[\w\d]+\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org"]
/([\w\d]+)(\.org)/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl", ".org"]
The result of an .exec of a JS regex is an Array of strings, the first being the whole match and the subsequent representing groups that you defined by using parens. If there are no parens in the regex, there will only be one element in this array - the whole match.
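To see where the trailing ,.org comes from, here is a small sketch using the URL from the question with the + and the escaped dot already applied. Converting the result array to a string joins its elements with commas. The variable name found is illustrative.

```javascript
const theURL =
  "http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8";

const found = /[\w\d]+(\.org)/i.exec(theURL);

// Converting the array to a string joins its elements with commas,
// which is where the trailing ",.org" comes from.
console.log(String(found)); // "thisisthepartiwanttofind.org,.org"

// Index 0 is the whole match; index 1 is the first group.
console.log(found[0]); // "thisisthepartiwanttofind.org"
console.log(found[1]); // ".org"
```

Reading found[0] directly, or dropping the parentheses, gives the string without the extra group.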
You should escape the . (dot) in the (.org) group; otherwise it matches any character. So your regex would become:
/[\w\d]+(\.org)/
To match the url in your example you can use something like this:
https?://([0-9a-zA-Z_.?=&\-]+/?)+
or something more accurate like this (you should choose the right regex according to your needs):
^https?://([0-9a-zA-Z_\-]+\.)+(com|org|net|WhatEverYouWant)(/[0-9a-zA-Z_\-?=&.]+)$

Match a specific sequence or everything else with regex

Been trying to come up with a regex in JS that could split user input like :
"Hi{user,10,default} {foo,10,bar} Hello"
into:
["Hi","{user,10,default} ","{foo,10,bar} ","Hello"]
So far I have managed to split these strings with ({.+?,(?:.+?){2}})|([\w\d\s]+) but the second capturing group is too restrictive, as I want every character to be matched in this group. I tried (.+?) but of course it fails...
Ideas, fellow regex gurus?
Here's the regex I came up with:
(?:[^\{])+|(?:\{.+?\})
Like the one above, it includes that space as a match.
Use this:
"Hi{user,10,default} {foo,10,bar} Hello".split(/(\{.*?\})/)
And you will get this
["Hi", "{user,10,default}", " ", "{foo,10,bar}", " Hello"]
Note the {.*?} part: the question mark ('?') makes the match lazy, so it stops at the first '}'.
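If the trailing space should stay attached to each {...} token, as in the output the question asks for, match with a global regex is another option. This is only a sketch; it assumes every { has a closing } and the variable names are illustrative.

```javascript
const input = "Hi{user,10,default} {foo,10,bar} Hello";

// Either a {...} token plus any whitespace after it, or a run of non-brace text.
const parts = input.match(/\{[^}]*\}\s*|[^{]+/g);

console.log(parts);
// ["Hi", "{user,10,default} ", "{foo,10,bar} ", "Hello"]
```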
Being no JavaScript expert, I would suggest the following:
get all positive matches using ({[^},]*,[^},]*,[^},]*?})
remove all positive matches from the original string
split up the remaining string
Although, this might get tricky if you need the resulting values in order.
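The three steps above might look roughly like this in JavaScript. It is a sketch only: splitting the remainder on whitespace is an assumption, the braces are escaped for safety, and the variable names are made up.

```javascript
const userInput = "Hi{user,10,default} {foo,10,bar} Hello";
const tokenPattern = /\{[^},]*,[^},]*,[^},]*?\}/g;

// Step 1: collect every {a,b,c} token.
const tokens = userInput.match(tokenPattern) || [];

// Step 2: remove the tokens from the original string.
const remainder = userInput.replace(tokenPattern, "");

// Step 3: split what is left (here on whitespace, which is an assumption).
const plainWords = remainder.split(/\s+/).filter(Boolean);

console.log(tokens);     // ["{user,10,default}", "{foo,10,bar}"]
console.log(plainWords); // ["Hi", "Hello"]
```

As noted, the original order of tokens and plain text is lost with this approach.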
