Regex - get the fileName? - javascript

I have this path:
C:\Projects\Ensure_Solution\GD_EServices_Web\App_WebReferences\GD_Eservices_Web_Service\GD_Eservices_Web_Service.wsdl
I want to get the WSDL file name (without the leading backslash).
I have come close with 2 patterns:
\\[^\\]+$
\\(.(?!\\))+$
But these return the leading backslash as well: http://regexr.com?32lvi
How can I change my regex so that it returns only the file name?

You just need to exclude the leading backslash from the regex.
var path = 'C:\\Projects\\Ensure_Solution\\GD_EServices_Web\\App_WebReferences\\GD_Eservices_Web_Service\\GD_Eservices_Web_Service.wsdl';
console.log(path.match(/[^\\]+$/));
You could also get it without a regex: use split, then take the last element with pop:
console.log(path.split('\\').pop());
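Both approaches can be wrapped in small helpers. This is only a sketch: the names fileNameByRegex and fileNameBySplit are made up for illustration, and it assumes the path uses backslashes as separators.

```javascript
// Hypothetical helpers, named here for illustration only.
function fileNameByRegex(path) {
  // [^\\]+$ : one or more non-backslash characters up to the end of the string.
  var m = path.match(/[^\\]+$/);
  return m ? m[0] : '';
}

function fileNameBySplit(path) {
  // Split on the backslash and take the last segment.
  return path.split('\\').pop();
}

var path = 'C:\\Projects\\Ensure_Solution\\GD_EServices_Web\\App_WebReferences\\GD_Eservices_Web_Service\\GD_Eservices_Web_Service.wsdl';
console.log(fileNameByRegex(path)); // GD_Eservices_Web_Service.wsdl
console.log(fileNameBySplit(path)); // GD_Eservices_Web_Service.wsdl
```

Note that match returns an array (or null), so the regex version takes element 0 before returning.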

This should work [^\\]+$
But for your case I'd prefer something like string.split('\\').pop() (JavaScript), or splitting on the backslash with explode() and taking the last element (PHP; I don't know which language you are using), rather than a regexp.

Try with the negative look-ahead (?!\\)(.(?!\\))+$

This should do it:
([\w\d_-]*)\.?[^\\\/]*$
This thread has some examples for javascript.
Alternatively, you can do a string split on "\" to create an array and take the last element of the array.

Try this
\\[^\\]+$
Note: that means try with only one leading backslash.

Related

regex replace all backward slashes before '\?'

I have a string such as
'frontend\less\defaults\layout.css?file=\foo'
I want a regex that replaces it with
'frontend/less/defaults/layout.css?file=\foo'
I tried /\\/g, but it keeps matching stuff after a \?, which I want to avoid somehow
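The following will work; use a lookahead in your regexp:
var myString = "path\\to\\file.php?query=\\something";
// If there is a "?", replace only the backslashes that still have a "?" ahead of them.
var r = /\?/.test(myString) ? /\\(?=.+\?)/g : /\\/g;
myString = myString.replace(r, "/");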
You can do this with String.replace, with a replacement function:
str.replace(/^([^?]*)/, function (_, $1) {
    return $1.replace(/\\/g, '/');
});
This will work regardless of whether the query string exists or not.
Explanation
/^([^?]*)/
([^?]*) will match and capture everything before ? (if any).
I assume the URL is valid, so there is no validation done here.
(Thanks to @Pumbaa80 for the suggestion. There is no need to match the query string part if it is going to stay the same after the replacement.)
Unless you know the number of \'s in advance, I doubt you can do this with a comprehensible regex. I would:
split the string in two parts: the part before the ?, and after it
use your regex on the first part
put the two strings back together.
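The split-and-rejoin steps above could look roughly like this. It is a sketch, not the answerer's code: the helper name normalizePathPart is invented here, and it assumes the first "?" marks the start of the query string.

```javascript
// Hypothetical helper: convert backslashes to forward slashes,
// but only in the part before the first "?".
function normalizePathPart(str) {
  var i = str.indexOf('?');
  if (i === -1) {
    // No query string: every backslash belongs to the path.
    return str.replace(/\\/g, '/');
  }
  var before = str.slice(0, i); // path part
  var after = str.slice(i);     // "?" and everything after it, left untouched
  return before.replace(/\\/g, '/') + after;
}

console.log(normalizePathPart('frontend\\less\\defaults\\layout.css?file=\\foo'));
// frontend/less/defaults/layout.css?file=\foo
```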

What's wrong with this regular expression to find URLs?

I'm working on a JavaScript to extract a URL from a Google search URL, like so:
http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8
Right now, my code looks like this:
var checkForURL = /[\w\d](.org)/i;
var findTheURL = checkForURL.exec(theURL);
I've run this through a couple of regex testers and it seems to work, but in practice the string I get returned looks like this:
thisisthepartiwanttofind.org,.org
So where's that trailing ,.org coming from?
I know my pattern isn't super robust but please don't suggest better patterns to use. I'd really just like advice on what in particular I did wrong with this one. Thanks!
Remove the parentheses in the regex if you do not need to capture the .org (unlikely, since it is a literal). As per @Mark's comment, add a + to match one or more characters of the class [\w\d]. Also, I would escape the dot:
var checkForURL = /[\w\d]+\.org/i;
What you're actually getting is an array of 2 results, the first being the whole match, the second - the group you defined by using parens (.org).
Compare with:
/([\w\d]+)\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl"]
/[\w\d]+\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org"]
/([\w\d]+)(\.org)/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl", ".org"]
The result of an .exec of a JS regex is an Array of strings, the first being the whole match and the subsequent representing groups that you defined by using parens. If there are no parens in the regex, there will only be one element in this array - the whole match.
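A short sketch of how the array shows up in practice. The input string is a shortened piece of the query string from the question, and the variable name m is arbitrary. Converting the array to a string joins its elements with commas, which is where a trailing ",.org" comes from.

```javascript
var m = /([\w\d]+)(\.org)/i.exec('q=thisisthepartiwanttofind.org&ie=UTF-8');

if (m) {
  console.log(m[0]);      // whole match: thisisthepartiwanttofind.org
  console.log(m[1]);      // first group: thisisthepartiwanttofind
  console.log(m[2]);      // second group: .org
  // String conversion joins all elements with commas:
  console.log(String(m)); // thisisthepartiwanttofind.org,thisisthepartiwanttofind,.org
}
```

If only the matched text is needed, use m[0] rather than the array itself.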
You should escape .(DOT) in (.org) regex group or it matches any character. So your regex would become:
/[\w\d]+(\.org)/
To match the url in your example you can use something like this:
https?://([0-9a-zA-Z_.?=&\-]+/?)+
or something more accurate like this (you should choose the right regex according to your needs):
^https?://([0-9a-zA-Z_\-]+\.)+(com|org|net|WhatEverYouWant)(/[0-9a-zA-Z_\-?=&.]+)$

How to identify all URLs that contain a (domain) substring?

If I am correct, the following code will only match a URL that is exactly as presented.
However, what would it look like if you wanted to identify subdomains as well as urls that contain various different query strings - in other words, any address that contains this domain:
var url = /test.com/
if (window.location.href.match(url)){
alert("match!");
}
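If you want this regex to match "test.com" literally, you need to escape the ".", which means any character in regex syntax. The "/" characters are only the delimiters of the regex literal; escape them as well only if the slashes themselves must be matched.
Escaped: /test\.com/
Take a look here for more info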
No, your pattern will actually match on all strings containing test.com.
The regular expression /test.com/ says to match test[ANY CHARACTER]com anywhere in the string.
It is better to use example.com for example links, so I replaced test with example.
Some example matches could be
http://example.com
http://examplexcom.xyz
http://example!com.xyz
http://example.com?q=123
http://sub.example.com
http://fooexample.com
http://example.com/asdf/123
http://stackoverflow.com/?site=example.com
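Several of those matches are probably not wanted. One way to accept only the domain itself and its subdomains is to compare host names instead of searching the whole address. This swaps the regex for the URL constructor, so it assumes an environment that provides URL (modern browsers, Node.js); the helper name isOnDomain is made up for this sketch.

```javascript
// Hypothetical helper: true when the URL's host is `domain` or a subdomain of it.
function isOnDomain(href, domain) {
  var host;
  try {
    host = new URL(href).hostname.toLowerCase();
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  domain = domain.toLowerCase();
  // Exact host, or a host that ends with ".domain".
  return host === domain || host.endsWith('.' + domain);
}

console.log(isOnDomain('http://sub.example.com', 'example.com'));                     // true
console.log(isOnDomain('http://example.com?q=123', 'example.com'));                   // true
console.log(isOnDomain('http://fooexample.com', 'example.com'));                      // false
console.log(isOnDomain('http://stackoverflow.com/?site=example.com', 'example.com')); // false
```

In a page, the same check could be applied to window.location.href.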
I think you need to use /g. /g enables "global" matching. When using the replace() method, specify this modifier to replace all matches, rather than only the first one:
var url = /test.com/g;
If you want to test whether a URL is valid, this is the one I use. It is fairly complex, because it also takes care of numeric domains and a few other peculiarities:
var urlMatcher = /(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?/;
It takes care of parameters, anchors, etc. Please don't ask me to explain the details.

JS/Jquery, Match not finding the PNG = match('/gif|jpg|jpeg|png/')

I have the following code which I use to match fancybox possible elements:
$('a.grouped_elements').each(function(){
    var elem = $(this);
    // Convert everything to lower case to match smart
    if(elem.attr('href').toLowerCase().match('/gif|jpg|jpeg|png/') != null) {
        elem.fancybox();
    }
});
It works great with JPGs but it isn't matching PNGs for some reason. Anyone see a bug with the code?
Thanks
A couple of things.
match expects a RegExp object, not a string. A string argument is converted with new RegExp(string), so the slashes inside the quotes become literal characters to match instead of delimiters.
"gif".match('/gif|png|jpg/'); // null
Without the quotes:
"gif".match(/gif|png|jpg/); // ["gif"]
Also, you would want to check these at the end of a filename, instead of anywhere in the string.
"isthisagif.nope".match(/(gif|png|jpg|jpeg)/); // ["gif", "gif"]
Only searching at the end of string with $ suffix
"isthisagif.nope".match(/(gif|png|jpg|jpeg)$/); // null
No need to make href lowercase, just do a case insensitive search /i.
Look for a dot before the image extension as an additional check.
And some tests. With a string argument the slashes are matched literally, which would explain why only some extensions gave results. What browser are you on?
I guess the fact that it'll match anywhere in the string (it would match "http://www.giftshop.com/" for instance) could be considered a bug. I'd use
/\.(gif|jpe?g|png)$/i
You are passing a string to the match() function rather than a regular expression. In JavaScript, strings are delimited with single or double quotes, and regular expression literals are delimited with forward slashes. If you use both, you have a string, not a regex.
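The difference can be seen directly. This is a small sketch with made-up file names: when match receives a string, it builds a RegExp from it, so the quoted slashes have to appear in the input. The alternatives then are "/gif", "jpg", "jpeg" and "png/", which is consistent with JPGs matching and PNGs not.

```javascript
var asString = '/gif|jpg|jpeg|png/'; // a string that merely looks like a regex literal

console.log('photo.jpg'.match(asString));             // matches "jpg"
console.log('photo.png'.match(asString));             // null: "png/" is not in the input
console.log('dir/png/x'.match(asString));             // matches "png/"
console.log('photo.png'.match(/gif|jpg|jpeg|png/));   // matches "png"
```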
This worked perfectly for me: /.+\.(gif|png|jpe?g)$/i
.+ -> any string
\. -> followed by a point.
(gif|png|jpe?g) -> and then followed by any of these extensions. jpeg may or may not have the letter e.
$ -> then the end of the string is expected
/i -> case insensitive mode: matches both sflkj.JPG and lkjfsl.jpg
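Put together as a reusable check, this might look as follows. The names IMAGE_EXT and isImageHref are hypothetical. The sketch assumes the href ends with the file name, so a trailing query string such as ?v=2 would make it return false.

```javascript
// Dot, then one of the extensions, anchored at the end, case-insensitive.
var IMAGE_EXT = /\.(gif|jpe?g|png)$/i;

// Hypothetical helper name, for illustration only.
function isImageHref(href) {
  return IMAGE_EXT.test(href);
}

console.log(isImageHref('images/photo.PNG'));         // true
console.log(isImageHref('images/photo.jpeg'));        // true
console.log(isImageHref('http://www.giftshop.com/')); // false
console.log(isImageHref('isthisagif.nope'));          // false
```

With this, the toLowerCase() call in the original loop is no longer needed.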

Regex to match part of a string

Regex fun again...
Take for example http://something.com/en/page
I want to test for an exact match on /en/ including the forward slashes, otherwise it could match 'en' from other parts of the string.
I'm sure this is easy, for someone other than me!
EDIT:
I'm using it for a string.match() in javascript
Well it really depends on what programming language will be executing the regex, but the actual regex is simply
/en/
For .Net the following code works properly:
string url = "http://something.com/en/page";
bool MatchFound = Regex.Match(url, "/en/").Success;
Here is the JavaScript version:
var url = 'http://something.com/en/page';
if (url.match(/\/en\//)) {
alert('match found');
}
else {
alert('no match');
}
DUH
Thank you to Welbog and Chris Ballance for making what should have been the most obvious point. This does not require regular expressions to solve. It is simply a contains check. Regex should only be used where it is needed, and that should have been my first consideration, not the last.
If you're trying to match /en/ specifically, you don't need a regular expression at all. Just use your language's equivalent of contains to test for that substring.
If you're trying to match any two-letter part of the URL between two slashes, you need an expression like this:
/../
If you want to capture the two-letter code, enclose the periods in parentheses:
/(..)/
Depending on your language, you may need to escape the slashes:
\/..\/
\/(..)\/
And if you want to make sure you match letters instead of any character (including numbers and symbols), you might want to use an expression like this instead:
/[a-z]{2}/
Which will be recognized by most regex variations.
Again, you can escape the slashes and add a capturing group this way:
\/([a-z]{2})\/
And if you don't need to escape them:
/([a-z]{2})/
This expression will match any string in the form /xy/ where x and y are letters. So it will match /en/, /fr/, /de/, etc.
In JavaScript, you'll need the escaped version: \/([a-z]{2})\/.
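Both variants in JavaScript, as a rough sketch. The helper name languageSegment is invented here; the URL is the one from the question.

```javascript
var url = 'http://something.com/en/page';

// Exact substring check: no regex needed.
console.log(url.indexOf('/en/') !== -1); // true

// Hypothetical helper: capture the first two-letter segment between slashes.
function languageSegment(u) {
  var m = u.match(/\/([a-z]{2})\//);
  return m ? m[1] : null;
}

console.log(languageSegment(url));                    // en
console.log(languageSegment('http://something.com')); // null
```

The capture group sits at index 1 of the match array; index 0 holds the whole match including the slashes.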
You may need to escape the forward-slashes...
/\/en\//
Any reason /en/ would not work?
/\/en\// or perhaps /http\w*:\/\/[^\/]*\/en\//
You don't need a regex for this:
location.pathname.substr(0, 4) === "/en/"
Of course, if you insist on using a regex, use this:
/^\/en\//.test(location.pathname)
