JavaScript regex optional on left or right - javascript

Hi, I can't get my pattern to work correctly. I want to detect whether a specific word has any word character on its left side, on its right side, or on both.
For example:
a{placeholder} = found
{placeholder}b = found
a{placeholder}b = found
{placeholder} = not found
This is my pattern so far: (\w)?\{LINK_TO_WEB_VERSION\}(\w)?
https://regex101.com/r/hX4lM0/1

You need to spell out each case explicitly and combine the cases with the alternation operator |:
\w\{LINK_TO_WEB_VERSION\}\w?|\w?\{LINK_TO_WEB_VERSION\}\w|\w\{LINK_TO_WEB_VERSION\}\w

You can use this negative lookahead based regex:
/^(?!\B{LINK_TO_WEB_VERSION}\B).+$/gim

You have several options.
1: With an alternation (|) you can match a letter before or a letter after the placeholder: /(\w){placeholder}|{placeholder}(\w)/img
If either side matches, the placeholder counts as found; you don't need to match before AND after ;)
2: Negate everything: match the placeholder with a non-letter before and a non-letter after: /[^A-Z]{placeholder}[^A-Z]/img => not found; any other case means found. Note that this needs an actual character on each side, so a placeholder at the very start or end of the string is not matched.
You don't need lookbehind or lookahead, in my opinion, but you can use them if you want: /(?<!\w)\{placeholder\}(?!\w)/ matches a placeholder with no word character on either side. Be careful, though: lookbehind is not supported everywhere (JavaScript, for example, only gained it in ES2018).
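The alternation idea can be checked against the four cases from the question. This is only a minimal sketch: the helper name hasNeighbour and the literal {placeholder} are assumptions made for illustration, and the braces are escaped so they are always read as literal characters.

```javascript
// Hypothetical helper: true when a word character touches the placeholder
// on the left, on the right, or on both sides.
function hasNeighbour(text) {
  // Either "word char + placeholder" or "placeholder + word char" is enough.
  return /\w\{placeholder\}|\{placeholder\}\w/i.test(text);
}

const cases = ['a{placeholder}', '{placeholder}b', 'a{placeholder}b', '{placeholder}'];
for (const text of cases) {
  console.log(text, '=', hasNeighbour(text) ? 'found' : 'not found');
}
```

Only the last case prints "not found", because neither alternative can match without a neighbouring word character.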

Related

Regex for specific strings/paths

I need a regular expression to match many specific paths/strings but I can't figure it out.
E.g.
../foo/hoo/something.js -> Needs to match ../foo/hoo/
../foo/bar/somethingElse.js -> Needs to match ../foo/bar/
../foo/something-else.js -> Needs to match ../foo/
What I tried with no luck is the following regex:
/\..\/foo\/|bar\/|hoo\//g
This should work out for you:
/(\.\.\/foo\/(hoo\/|bar\/)?)/
https://regex101.com/r/1aTf7y/1
So you match ../foo/ first and then have a group that can contain either hoo/ or bar/. The question mark allows zero or one occurrence of that group.
If you want to be a little less specific, you could also do
/(\.\.\/[^\/]+\/(hoo\/|bar\/)?)/
The [^\/]+ allows any characters except a slash.
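As a quick check, the first pattern from this answer can be run against the three paths from the question. This is a sketch only; the names dirPattern and matchDir are hypothetical.

```javascript
// Pattern from the answer: "../foo/" plus an optional "hoo/" or "bar/".
const dirPattern = /(\.\.\/foo\/(hoo\/|bar\/)?)/;

// Hypothetical helper: returns the matched directory prefix, or null.
function matchDir(path) {
  const m = path.match(dirPattern);
  return m ? m[1] : null;
}

console.log(matchDir('../foo/hoo/something.js'));     // ../foo/hoo/
console.log(matchDir('../foo/bar/somethingElse.js')); // ../foo/bar/
console.log(matchDir('../foo/something-else.js'));    // ../foo/
```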
You can use the regex
(\/[^\/\s]+)+(?=\/)
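function match(str) {
  // Logs the directory part; the result has no leading ".." and no trailing slash.
  var m = str.match(/(\/[^\/\s]+)+(?=\/)/);
  console.log(m ? m[0] : null);
}
match('../foo/hoo/something.js');      // /foo/hoo
match('../foo/bar/somethingElse.js');  // /foo/bar
match('../foo/something-else.js');     // /foo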
This regex should capture the whole directory part, without the filename:
/^(.*[/])[^/]+$/
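A short sketch of how that last pattern might be used; the helper name dirOf is an assumption. The greedy .* runs up to the last slash, so the first capture group holds the full directory part.

```javascript
// Hypothetical helper: everything up to and including the last slash,
// or null when the input has no "directory/filename" shape.
function dirOf(path) {
  const m = /^(.*[/])[^/]+$/.exec(path);
  return m ? m[1] : null;
}

console.log(dirOf('../foo/hoo/something.js'));  // ../foo/hoo/
console.log(dirOf('../foo/something-else.js')); // ../foo/
console.log(dirOf('something.js'));             // null (no slash at all)
```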

JS regexp error?

It seems there is a bug in JS regexps; otherwise I cannot explain why this matches a character:
'test '.match(/^s(?:e)?|s(?:e)?|c(?:q)?|c(?:q)?$/i);
> ["s"]
Why does that happen?
The intent of this regexp is: if you have a keyword like 'se' and you want to match either only part of it (just s) or the whole of it (se), you write something like this.
The duplicates appear when multiple key phrases relate to the same keyword.
Your regexp will match ONE of these four groups and only one.
/^s(?:e)?|s(?:e)?|c(?:q)?|c(?:q)?$/i
 ^^^^^^^^ ^^^^^^^ ^^^^^^^ ^^^^^^^^
That is to say:
^s OR ^se
OR
s OR se
OR
c OR cq
OR
c$ OR cq$
where ^ is the start of the string and $ is the end.
In this case it is matching exactly the s (the first possibility in the second alternation group).
I have no idea what you actually want it to match so can't really advise further but that's why it's matching what it is.
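The precedence described above can be observed directly. In this sketch the variable names are hypothetical, and the grouped pattern is a deduplicated variant written only for comparison.

```javascript
// Original pattern: ^ binds only to the first alternative, $ only to the last.
const ungrouped = /^s(?:e)?|s(?:e)?|c(?:q)?|c(?:q)?$/i;
const hit = 'test '.match(ungrouped);
console.log(hit[0], hit.index); // s 2  (the unanchored second alternative)

// Hypothetical grouped variant: both anchors now apply to every alternative.
const grouped = /^(?:s(?:e)?|c(?:q)?)$/i;
console.log(grouped.test('test ')); // false
console.log(grouped.test('se'));    // true
console.log(grouped.test('c'));     // true
```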
You gotta group all the possible alternations, or you'll fall into the trap of alternating between partial fragmented subpatterns, which won't work for you. See this:
/^(?:s(?:e)?|s(?:e)?|c(?:q)?|c(?:q)?)$/i
  ^^^                               ^
If you don't need an alternation, the anchors or the groups at all, remove them:
/se?se?cq?cq?/i
It sounds like you may simply have built the wrong regular expression. If so, you will need:
/(?:se|es|cq|qc)/i

Obtain everything after one of these phrases in regex?

I'm trying to use a regex to obtain everything after one of these phrases in a string in JavaScript.
The phrases are: call me, my name's, my name is, my names, I am, or I'm.
So I want everything after those phrases in the string.
I'm trying to do it like so, but it captures everything, not only the text after the phrase.
/call\s+me(.*)|my\s+name\s+is(.*)|my\s+name's(.*)|my\s+names(.*)|Im(.*)|I\s+am(.*)|I'm(.*)/i.exec(string);
How can I do this properly?
The text after it will be in the capture groups. It will be in a different capture group depending on which prefix matched. So it would be better to put just the prefixes in the | alternatives, and just have a single capture group:
var result = str.match(/(?:call me|my name's|my name is|my names|I am|I'm)(.*)/)
Now result[1] will contain the text after the phrase.
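A small sketch of the single-capture-group approach. The wrapper name textAfterPhrase, the i flag (taken from the question's attempt), and the trim() call are assumptions added for illustration.

```javascript
// Hypothetical wrapper: returns the text after the first matching phrase,
// or null when none of the phrases occurs in the string.
function textAfterPhrase(str) {
  const result = str.match(/(?:call me|my name's|my name is|my names|I am|I'm)(.*)/i);
  // result[1] starts right after the phrase, so it usually begins with a space.
  return result ? result[1].trim() : null;
}

console.log(textAfterPhrase('You can call me Bob')); // Bob
console.log(textAfterPhrase("Hi, I'm Carol"));       // Carol
console.log(textAfterPhrase('Good morning'));        // null
```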
Try a positive lookbehind assertion:
(?<=call me|my name's|my name is|my names|I am|I'm).*
Edit
This regex won't work in JavaScript engines that lack lookbehind support (lookbehind was only added in ES2018); see @Barmar's answer for a JavaScript approach.
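On engines that do support lookbehind (it was added to JavaScript in ES2018), the pattern can be tried behind a feature check. This is a sketch under that assumption; the function name is hypothetical, and the pattern is built with the RegExp constructor because a regex literal with unsupported syntax would fail when the script is parsed, before any try/catch could run.

```javascript
// Hypothetical helper: returns the text after the phrase, null when no
// phrase is found, or undefined when the engine rejects lookbehind syntax.
function textAfterPhraseLookbehind(str) {
  let pattern;
  try {
    pattern = new RegExp("(?<=call me|my name's|my name is|my names|I am|I'm).*", 'i');
  } catch (e) {
    return undefined; // lookbehind is not supported by this engine
  }
  const m = str.match(pattern);
  return m ? m[0] : null;
}

// On an engine with lookbehind support this logs " Alice" (leading space kept).
console.log(textAfterPhraseLookbehind('my name is Alice'));
```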

What's wrong with this regular expression to find URLs?

I'm working on a JavaScript to extract a URL from a Google search URL, like so:
http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8
Right now, my code looks like this:
var checkForURL = /[\w\d](.org)/i;
var findTheURL = checkForURL.exec(theURL);
I've run this through a couple of regex testers and it seems to work, but in practice the string I get back looks like this:
thisisthepartiwanttofind.org,.org
So where's that trailing ,.org coming from?
I know my pattern isn't super robust but please don't suggest better patterns to use. I'd really just like advice on what in particular I did wrong with this one. Thanks!
Remove the parentheses in the regex if you do not process the .org (unlikely, since it is a literal). As per @Mark's comment, add a + to match one or more characters of the class [\w\d]. Also, I would escape the dot:
var checkForURL = /[\w\d]+\.org/i;
What you're actually getting is an array of 2 results, the first being the whole match, the second - the group you defined by using parens (.org).
Compare with:
/([\w\d]+)\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl"]
/[\w\d]+\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org"]
/([\w\d]+)(\.org)/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl", ".org"]
The result of an .exec of a JS regex is an Array of strings, the first being the whole match and the subsequent representing groups that you defined by using parens. If there are no parens in the regex, there will only be one element in this array - the whole match.
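The trailing ,.org from the question appears when that array is turned into a string, since array elements are joined with commas. A minimal sketch, using the URL from the question and the corrected pattern with the group kept; the variable names are assumptions.

```javascript
const theURL = 'http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8';
const found = /[\w\d]+(\.org)/i.exec(theURL);

// Coercing the result array to a string joins whole match and group with a comma.
console.log(String(found)); // thisisthepartiwanttofind.org,.org

// Indexing picks out the individual pieces.
console.log(found[0]);      // thisisthepartiwanttofind.org
console.log(found[1]);      // .org
```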
You should escape the dot in the (.org) group; otherwise it matches any character. So your regex would become:
/[\w\d]+(\.org)/
To match the url in your example you can use something like this:
https?://([0-9a-zA-Z_.?=&\-]+/?)+
or something more accurate like this (you should choose the right regex according to your needs):
^https?://([0-9a-zA-Z_\-]+\.)+(com|org|net|WhatEverYouWant)(/[0-9a-zA-Z_\-?=&.]+)$

JavaScript negative lookbehind issue

I've got some JavaScript that looks for Amazon ASINs within an Amazon link, for example
http://www.amazon.com/dp/B00137QS28
For this I use the following regex: /([A-Z0-9]{10})
However, I don't want it to match artist links which look like:
http://www.amazon.com/Artist-Name/e/B000AQ1JZO
So I need to exclude any links where there's a '/e' before the slash and the 10-character alphanumeric code. I thought the following would do that: (?<!/e)([A-Z0-9]{10}), but it turns out negative lookbehinds don't work in JavaScript. Is that right? Is there another way to do this instead?
Any help would be much appreciated!
As a side note, be aware there are plenty of Amazon link formats, which is why I want to blacklist rather than whitelist, eg, these are all the same page:
http://www.amazon.com/gp/product/B00137QS28/
http://www.amazon.com/dp/B00137QS28
http://www.amazon.com/exec/obidos/ASIN/B00137QS28/
http://www.amazon.com/Product-Title-Goes-Here/dp/B00137QS28/
In your case an expression like this would work:
/(?!\/e)..\/([A-Z0-9]{10})/
([A-Z0-9]{10}) will work equally well on the reverse of its input, so you can
reverse the string,
use positive lookahead,
reverse it back.
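The three steps above might look like the following sketch. The helper names are hypothetical, the URLs are assumed to be plain ASCII, and a negative lookahead is used here because the goal is to exclude the /e/ links: on the reversed string, "not preceded by /e/" becomes "not followed by e/".

```javascript
// Hypothetical helper for reversing an ASCII string.
function reverse(s) {
  return s.split('').reverse().join('');
}

// Hypothetical helper: returns the ASIN, or null for artist (/e/) links.
function findAsin(url) {
  // Reversed, the code comes first, then the slash, then whatever preceded it.
  const m = reverse(url).match(/([A-Z0-9]{10})\/(?!e\/)/);
  return m ? reverse(m[1]) : null;
}

console.log(findAsin('http://www.amazon.com/dp/B00137QS28'));            // B00137QS28
console.log(findAsin('http://www.amazon.com/Artist-Name/e/B000AQ1JZO')); // null
```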
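You need to use a lookahead to filter the /e/* ones out. Then trim the three leading characters (the two characters before the slash, plus the slash itself) from each of the matches.
var source = 'http://www.amazon.com/dp/B00137QS28'; // example value; use the source you're matching against the RegExp
var matches = source.match(/(?!\/e)..\/[A-Z0-9]{10}/g) || [];
var ids = matches.map(function (match) {
  // drop the two leading characters and the slash, keeping only the code
  return match.slice(3);
});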
