str replace all in JavaScript

I am trying to process some URLs through JavaScript, where part of each URL needs to be replaced. I have a textarea with some URLs, for example:
http://mywebsite.com/preview.aspx?mode=desktop&url=http://mywebsite.com/post.aspx?id=44&content=1
http://mywebsite.com/preview.aspx?mode=desktop&url=http://mywebsite.com/post.aspx?id=44&content=2
http://mywebsite.com/preview.aspx?mode=desktop&url=http://mywebsite.com/post.aspx?id=44&content=3
http://mywebsite.com/preview.aspx?mode=desktop&url=http://mywebsite.com/post.aspx?id=44&content=3
What I am trying to do is replace http://mywebsite.com/preview.aspx?mode=desktop&url= with an empty string, i.e. remove it.
I have tried using str.replace(), but it replaces only the first occurrence of that URL.
I have also tried the global flag g. The call I used is:
str_replace(\http://mywebsite.com/preview.aspx?mode=desktop&url=/g,'');
But it's not working. Can anyone tell me how I can do that?
I want the textarea output to look like this:
http://mywebsite.com/post.aspx?id=44&content=1
http://mywebsite.com/post.aspx?id=44&content=2
http://mywebsite.com/post.aspx?id=44&content=3
http://mywebsite.com/post.aspx?id=44&content=4

I believe that your biggest issue is that your regex syntax is incorrect. Try this:
Imagine that var s is equal to the value of your textarea.
s = s.replace(/http:\/\/mywebsite\.com\/preview\.aspx\?mode=desktop&url=/g, '');
The issues you were having were improper delimiters and unescaped reserved symbols. Note that replace() returns a new string rather than modifying s, so the result has to be assigned.
Though JavaScript has some of its own regex idiosyncrasies, the issues here were related to basic regex. You might find these resources useful:
http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
http://regexpal.com
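As a minimal sketch of the same idea, the prefix can be escaped programmatically instead of by hand. The helper names `escapeRegExp` and `removeAll` and the two-line sample value are assumptions made for illustration; they are not from the question.

```javascript
// Escape every character that has a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Remove every occurrence of `prefix` from `input` (hypothetical helper).
function removeAll(input, prefix) {
  return input.replace(new RegExp(escapeRegExp(prefix), 'g'), '');
}

// Hypothetical textarea content with two lines.
const prefix = 'http://mywebsite.com/preview.aspx?mode=desktop&url=';
const sample =
  prefix + 'http://mywebsite.com/post.aspx?id=44&content=1\n' +
  prefix + 'http://mywebsite.com/post.aspx?id=44&content=2';

// replace() returns a new string, so the result must be captured.
const cleaned = removeAll(sample, prefix);
console.log(cleaned);
```

In a page, the result would then be written back to the textarea's value property.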

Try this.
var string = document.getElementById('textareaidhere').value;
string = string.replace(/http:\/\/mywebsite\.com\/preview\.aspx\?mode=desktop&url=/g, '');

Related

Replace params in javascript

I have tried a lot to remove a query parameter using JavaScript, but it's not working. Can you please share a solution for removing the parameter?
Below is an example:
console.log("www.test.com?x=a".replace(new RegExp(`${"x=a"}&?`),''));
The output I am getting is www.test.com? (with the trailing question mark). Is there any way to remove the ? as well and get only www.test.com?
If you want to remove everything from the question mark onwards, including the question mark itself, try this instead:
console.log("www.test.com?x=a".split("?")[0]);
That way you get only what's before the question mark.
I hope that helps you out.
You can remove all query strings using the following regex:
\?(.*)
const url = "www.test.com?x=1&b=2"
console.log(url.replace(/\?(.*)/, ''));
You could brutally replace the '?x=a' string with the JavaScript replace function or, even better, you could split the string in two (based on the index of ?) with the JavaScript split function and take the first part, e.g.:
let str = 'www.test.com?x=a';
console.log(str.replace('?x=a', ''));
console.log(str.split('?')[0]);
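If only one parameter should go while the others stay, a small helper can rebuild the query string and drop the dangling ? when nothing is left. This is a sketch under assumptions: `removeParam` is a hypothetical name, and the input is assumed to have no #fragment part.

```javascript
// Remove one query parameter (hypothetical helper) and drop a dangling "?".
function removeParam(url, name) {
  const queryStart = url.indexOf('?');
  if (queryStart === -1) return url; // no query string at all

  const base = url.slice(0, queryStart);
  const kept = url
    .slice(queryStart + 1)
    .split('&')
    // Keep every non-empty pair whose key differs from `name`.
    .filter((pair) => pair !== '' && pair.split('=')[0] !== name);

  return kept.length > 0 ? base + '?' + kept.join('&') : base;
}

console.log(removeParam('www.test.com?x=a', 'x'));     // www.test.com
console.log(removeParam('www.test.com?x=a&b=2', 'x')); // www.test.com?b=2
```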

RegEx to search for a href="something" pattern

I know RegEx should not be used for parsing HTML, but I'm unable to use any other solution, so I'm stuck with this.
I got this from URI.js:
/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’&quote]))/ig
However it doesn't work very well, so I wanted to add a prefix that would search only for strings starting with href=
Ended up with something like this (which works in the RegEx tester):
href\=\"\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’&quote]))
But when compiled, it throws "illegal character" error. Not sure if it's the " or = that causes that.
JS code:
matches_temp = result_content.match(href\=\"\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’&quote])));
result_content is taken from the DB.
You need the slashes that mark this as a regex, much like quotes mark a value as a string. So .match(regex) should be .match(/regex/). Take a look:
var result_content = '<a href="http://example.com/index.html">link</a>'; // sample input for illustration
var matches_temp = result_content.match(/href\=\"\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’&quote]))/);
console.log(matches_temp ? matches_temp[1] : null); // match() returns null when nothing matches
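The difference between the two ways of writing a regex can be sketched as follows. The pattern here is a deliberately simplified href matcher, not the URI.js expression, and the sample markup is made up for illustration.

```javascript
// Hypothetical input; in the question it comes from the database.
const html = '<a href="http://example.com/index.html">link</a>';

// 1. Regex literal: delimited by slashes, so "/" inside must be escaped.
const literal = /href="(https?:\/\/[^"\s]+)"/;

// 2. RegExp constructor: takes a string, so backslashes must be doubled.
const constructed = new RegExp('href="(https?://[^"\\s]+)"');

const fromLiteral = html.match(literal);
const fromConstructor = html.match(constructed);

// match() returns null when nothing matches, so guard before indexing.
console.log(fromLiteral ? fromLiteral[1] : null);
console.log(fromConstructor ? fromConstructor[1] : null);
```

Passing the bare pattern without slashes or quotes, as in the question, is parsed as ordinary JavaScript, which is why it fails before the regex engine ever sees it.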

Why isn't .replace() working on a large generated string from escodegen.generate()?

I am attempting to generate some code using escodegen's .generate() function which gives me a string.
Unfortunately, it does not remove the semicolons completely (only on blocks of code), so I need to get rid of them myself. I am using the .replace() function, but the semicolons are not removed for some reason.
Here is what I currently have:
generatedCode = escodegen.generate(esprima.parseModule(code), escodegenOptions)
const cleanGeneratedCode = generatedCode.replace(';', '')
console.log('cleanGeneratedCode ', cleanGeneratedCode) // string stays the exact same.
Am I doing something wrong or missing something perhaps?
As per MDN, if you provide a substring instead of a regex:
"It is treated as a verbatim string and is not interpreted as a regular expression. Only the first occurrence will be replaced."
So, the output probably isn't exactly the same as the code generated, but rather the first semicolon has been removed. To remedy this, simply use a regex with the "global" flag (g). An example:
const cleanGeneratedCode = escodegen.generate(esprima.parseModule(code), escodegenOptions).replace(/;/g, '');
console.log('Clean generated code: ', cleanGeneratedCode);
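The difference is easy to see on a small input. The string below is a hypothetical stand-in for the output of escodegen.generate(), so the sketch runs without either library.

```javascript
// Hypothetical stand-in for the string returned by escodegen.generate().
const generated = 'const a = 1;\nconst b = 2;\nfoo();';

// A string pattern removes only the first semicolon.
const firstOnly = generated.replace(';', '');

// A regex with the g flag removes every semicolon.
const viaRegex = generated.replace(/;/g, '');

// split/join does the same without a regex.
const viaSplit = generated.split(';').join('');

console.log(firstOnly);
console.log(viaRegex);
console.log(viaRegex === viaSplit); // true
```

Keep in mind that a blanket replacement also strips semicolons inside string literals and for (;;) headers, so it can change the meaning of the generated code.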

Javascript regex to get src url between script tag

I wanted to get name of the script from such a string:
var text = '<script src="scripts/044c7c5e.vendor.js"></script><script src="scripts/fa9f85fb.scripts.js"></script>'
I wanted to retrieve the second script name i.e. fa9f85fb.scripts. How can I achieve this using javascript regex?
I'm writing something like this:
text.match(new RegExp(/<script src="scripts\/[(.*?)]\.scripts\.js"><\/script>/), 'g')[0]
But it's returning the whole string.
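Your capture group is a bit off: [(.*?)] is a character class, not a group. Use a plain group instead, and prefer ([^"]*?) over (.*?), because .*? can run from the first script tag across into the second:
/<script src="scripts\/([^"]*?)\.scripts\.js"><\/script>/
will be the entire regex; there is no need to call the RegExp constructor either. The matched string is stored at index 0, and the capture groups are stored from index 1 onwards. This only holds without the g flag: with g, match() returns just the list of whole matches and no groups.
text.match( /<script src="scripts\/([^"]*?)\.scripts\.js"><\/script>/ )[1]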
Try /\w+\.scripts(?=\.js)/ ?
Reference: https://developer.mozilla.org/en/docs/Web/JavaScript/Guide/Regular_Expressions
Your match pattern is a bit vague. I can simply use /fa9f85fb.scripts/ to match it.
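How match() behaves with and without the g flag, and how matchAll() recovers the capture groups for every match, can be sketched like this. The pattern is an assumption chosen for illustration: it captures the file name without the .js extension for each script tag.

```javascript
const text =
  '<script src="scripts/044c7c5e.vendor.js"></script>' +
  '<script src="scripts/fa9f85fb.scripts.js"></script>';

// [^"]+ cannot cross a quote, so each match stays inside one src attribute.
const pattern = /<script src="scripts\/([^"]+)\.js"><\/script>/;

// Without g: index 0 is the whole match, index 1 the first capture group.
const single = text.match(pattern);
console.log(single[1]); // 044c7c5e.vendor

// With g: match() returns only the whole matches, no capture groups.
const globalPattern = new RegExp(pattern.source, 'g');
console.log(text.match(globalPattern).length); // 2

// matchAll() keeps the groups for every match (it requires the g flag).
const names = [...text.matchAll(globalPattern)].map((m) => m[1]);
console.log(names[1]); // fa9f85fb.scripts
```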

What's wrong with this regular expression to find URLs?

I'm working on a JavaScript to extract a URL from a Google search URL, like so:
http://www.google.com/search?client=safari&rls=en&q=thisisthepartiwanttofind.org&ie=UTF-8&oe=UTF-8
Right now, my code looks like this:
var checkForURL = /[\w\d](.org)/i;
var findTheURL = checkForURL.exec(theURL);
I've run this through a couple of regex testers and it seems to work, but in practice the string I get back looks like this:
thisisthepartiwanttofind.org,.org
So where's that trailing ,.org coming from?
I know my pattern isn't super robust but please don't suggest better patterns to use. I'd really just like advice on what in particular I did wrong with this one. Thanks!
Remove the parentheses in the regex if you do not need to process the .org separately (unlikely, since it is a literal). As per @Mark's comment, add a + to match one or more characters of the class [\w\d]. Also, I would escape the dot:
var checkForURL = /[\w\d]+\.org/i;
What you're actually getting is an array of 2 results, the first being the whole match, the second - the group you defined by using parens (.org).
Compare with:
/([\w\d]+)\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl"]
/[\w\d]+\.org/.exec('thisistheurl.org')
→ ["thisistheurl.org"]
/([\w\d]+)(\.org)/.exec('thisistheurl.org')
→ ["thisistheurl.org", "thisistheurl", ".org"]
The result of an .exec of a JS regex is an Array of strings, the first being the whole match and the subsequent representing groups that you defined by using parens. If there are no parens in the regex, there will only be one element in this array - the whole match.
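The comma in the question's output comes from converting that array to a string, which joins its elements with commas. A minimal sketch, using a shortened, made-up input and a pattern with one capture group:

```javascript
// Hypothetical pattern with one capture group, run on a shortened input.
const result = /[\w\d]+(\.org)/i.exec('q=thisisthepartiwanttofind.org&ie=UTF-8');

console.log(result.length);  // 2: the whole match plus one group
console.log(String(result)); // thisisthepartiwanttofind.org,.org
console.log(result[0]);      // thisisthepartiwanttofind.org
```

Indexing the result with [0] gives the whole match as a plain string, without the trailing group.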
You should escape the . (dot) in the (.org) group, or it matches any character. So your regex would become:
/[\w\d]+(\.org)/
To match the url in your example you can use something like this:
https?://([0-9a-zA-Z_.?=&\-]+/?)+
or something more accurate like this (you should choose the right regex according to your needs):
^https?://([0-9a-zA-Z_\-]+\.)+(com|org|net|WhatEverYouWant)(/[0-9a-zA-Z_\-?=&.]+)$
