A regular expression which excludes comment lines starting with "//" in JavaScript - javascript

I need to find all lines with string "new qx.ui.form.Button" WHICH EXCLUDE lines starting with comments "//".
Example
line 1:" //btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({"
line 2:" btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({"
Pattern should catch only "line 2"!
Be aware of leading spaces.
Finally I have to FIND and REPLACE "new qx.ui.form.Button" in all UNCOMMENTED code lines with "this.__getButton".
I tried.
/new.*Button/g
/[^\/]new.*Button/g
and many others without success.

In JavaScript this is a bit icky:
^\s*(?=\S)(?!//)
excludes a comment at the start of a line. So far, so standard. But you cannot look backwards for this pattern because JS engines older than ES2018 don't support lookbehind, so you have to match and replace more than needed:
^(\s*)(?=\S)(?!//)(.*)(new qx\.ui\.form\.Button)
Replace that by
$1$2this.__getButton
Quick PowerShell test:
PS Home:\> $line1 -replace '^(\s*)(?=\S)(?!//)(.*)(new qx\.ui\.form\.Button)','$1$2this.__getButton'
//btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({
PS Home:\> $line2 -replace '^(\s*)(?=\S)(?!//)(.*)(new qx\.ui\.form\.Button)','$1$2this.__getButton'
btn = this.__getButton(plugin.menuName, plugin.menuIcon).set({
That being said, why do you care about what's in the commented lines anyway? It's not as if they had any effect on the program.
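The PowerShell test above carries over to JavaScript's String.prototype.replace. Here is a minimal sketch, assuming the whole source is held in one string. The m flag makes ^ match at each line start, and [ \t]* stands in for \s* (my change, not part of the answer) so that the indentation group cannot run across line breaks. The name replaceButtonCalls is made up for illustration.

```javascript
// Hypothetical helper: rewrites "new qx.ui.form.Button" on lines that do not
// start with "//" (ignoring leading spaces and tabs).
function replaceButtonCalls(source) {
  // g: every line, m: ^ matches at the start of each line.
  // The slashes of "//" must be escaped inside a regex literal.
  const pattern = /^([ \t]*)(?=\S)(?!\/\/)(.*)new qx\.ui\.form\.Button/gm;
  return source.replace(pattern, '$1$2this.__getButton');
}

const sample = [
  ' //btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({',
  ' btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({',
].join('\n');

console.log(replaceButtonCalls(sample));
```

Because (.*) is greedy, a single pass replaces only the last occurrence on each line; earlier occurrences on the same line stay inside $2 untouched.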

Ah, if only JavaScript had lookbehinds... Then all you'd need is
/(?<!\/\/.*)new\s+qx\.ui\.form\.Button/g... Ah well.
This'll work just fine too:
.replace(/(.*)new\s(qx\.ui\.form\.Button)/g,function(_,m) {
    // note that the second set of parentheses isn't needed
    // it is there for readability, especially with the \s there.
    if( m.indexOf("//") > -1) {
        // line is commented, return as-is
        // note that this allows comments in an arbitrary position
        // to only allow comments at the start of the line (with optional spaces)
        // use if(m.match(/^\s*\/\//))
        return _;
    }
    else {
        // uncommented! Perform replacement
        return m+"this.__getButton";
    }
});
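For what it's worth, engines that implement the ES2018 lookbehind assertions accept the wished-for pattern as written. A small sketch, assuming such an engine (a current Node.js, for example); the sample lines are the two from the question:

```javascript
// Assumes ES2018 lookbehind support; older engines throw a SyntaxError here.
const source = [
  '  //btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({',
  '  btn = new qx.ui.form.Button(plugin.menuName, plugin.menuIcon).set({',
].join('\n');

// (?<!\/\/.*) rejects a match when "//" occurs earlier on the same line.
// "." does not cross line breaks, so each line is judged on its own.
const result = source.replace(
  /(?<!\/\/.*)new\s+qx\.ui\.form\.Button/g,
  'this.__getButton'
);

console.log(result);
```

Like the callback version, this treats any earlier "//" on the line as a comment start, including one inside a string such as "http://".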

grep uses regular expressions too. With -v it prints only the lines that do not match, so this drops every line that starts with optional whitespace followed by // (\s is a GNU grep extension):
grep -v "^\s*//"

Related

How to extract body of this callback-ish style function using regex?

I'm trying to extract body of this function using JavaScript
J_Script(void, someName, (const char *str), {
    function howdy() {
        console.log("What's up");
    }
    howdy();
});
I have attempted the following regex,
(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*(?=\}))
It captures most of it but fails to detect the end of the function, thus corrupting the end result.
The end result needs to look like this:
function howdy() {
    console.log("What's up");
}
howdy();
Yes, I know regex may not be perfect for this, but I don't have time to create an AST and I'm looking to do some pre-processing using JavaScript.
Worth noting that the function will always end with }); not })
Assuming your function has the substring }); starting at character 0 at the start of the function closing line (i.e. is flush to the left), you can use
(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*?(?=^\}\);))
With the multiline flag
The only modifications from your original are:
(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*?(?=^\}\);))
// [\s|\S]*?   the added ? makes the quantifier non-greedy
// (?=^\}\);)  the added ^ anchors the lookahead at the beginning of a line
If your functions have a fixed offset indentation, you can exploit that just the same, using /^ {x}/ where x is a digit representing whatever indentation count you have.
This also handles nested }); or whatever else might be in the function, so long as it's indented correctly.
If you want to capture the closing });, add a capture group to the above pattern:
(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*?(?=(^\}\);)))
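To see the multiline variant in action, here is a short sketch run against the sample from the question. It assumes the closing }); sits flush left, as described above, and takes the body from capture group 5; the variable names are only illustrative.

```javascript
const input = [
  'J_Script(void, someName, (const char *str), {',
  '  function howdy() {',
  '    console.log("What\'s up");',
  '  }',
  '  howdy();',
  '});',
].join('\n');

// Pattern from the answer above. The m flag lets ^ inside the lookahead
// match at the start of the closing line.
const pattern = /(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*?(?=^\}\);))/m;

const match = input.match(pattern);
// Group 5 holds everything between the opening "{" and the flush-left "});".
const body = match ? match[5].trim() : null;

console.log(body);
```

The inner } of howdy is indented here, so the lookahead skips it and stops only at the final line.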
Try changing the lookahead to find the }); that you always expect at the end:
(J_Script\s?)([^\.])([\w|,|\s|-|_|\$]*)(.+?\{)([^\.][\s|\S]*(?=(\}\);)))
Testbed: https://regex101.com/r/qdsB1w/1/

Regex working in debugger, but not in JavaScript [duplicate]

I want to get all the content in a text file before the first empty line.
I've found a working regex, but when I try to accomplish the same in Javascript it doesn't work.
(loading the file's contents is working)
const fs = require('fs');
const path = require('path');

async function readDir() {
    return new Promise((resolve,reject) => {
        fs.readdir('./content', (err, files) => {
            if(err) { return reject(err) } // return so resolve is not called as well
            resolve(files)
        });
    });
}
readDir().then((files) => {
    files.forEach(file => {
        var filepath = path.resolve('./content/'+file)
        if(filepath.endsWith('.txt')) {
            if(fs.statSync(filepath)["size"] > 0) {
                let data = fs.readFileSync(filepath).toString();
                let reg = /^[\s\S]*?(?=\n{2,})/;
                console.log(data.match(reg)) //returns null
            }
        }
    });
})
EDIT:
As O. Jones pointed out, the problem lies with the line endings. My regex was not picking up on \r\n line endings present in my file.
For now, this one seems to do the job: /^[\s\S]*?(?=(\r\n\r\n?|\n\n))/m
It looks like you want to match your re against the whole, multiline contents of your file. The multiline flag is worth a try, although note that m only changes where ^ and $ match; [\s\S] already spans line breaks without it.
Try this
let reg = /^[\s\S]*?(?=\n{2,})/m;
Notice the m after the re's closing /. For more explanation see the section called Advanced Searching With Flags here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
Also, it's possible you have line-ending trouble. Linux/ FreeBSD/ UNIX systems (and modern macOS) use \n aka newline to mark the end of each line. Classic Mac OS (before OS X) used \r aka return for that. And Windows uses \r\n, two characters at the end of each line. Yeah, we all know what a pain in the xxx neck this is.
So your blank line detector is probably too simple. See "Regular Expression to match cross platform newline characters". Try using this to match cross-OS ends of lines:
\r\n?|\n
meaning either a return followed by an optional newline, or just a newline.
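It might look something like this.
let reg = /^[\s\S]*?(?=(\r\n|\r(?!\n)|\n){2})/m;
That looks for two end of line sequences in a row. The \r(?!\n) alternative is there because simply doubling \r\n?|\n would let a single \r\n count as a \r followed by a \n (not tested by me, sorry).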
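The blank-line idea can be wrapped in a small function. This is a sketch under a few assumptions: the helper name and the fallback (return the whole input when no empty line exists) are mine, and a lone \r counts as a line ending only when no \n follows it, so that one \r\n is never read as two line endings.

```javascript
// One line ending: \r\n, a lone \r (not followed by \n), or \n.
// A blank line shows up as two of these in a row.
const BEFORE_BLANK_LINE = /^[\s\S]*?(?=(?:\r\n|\r(?!\n)|\n){2})/;

// Hypothetical helper: returns the text before the first empty line,
// or the whole input when there is no empty line (an assumed fallback).
function textBeforeFirstBlankLine(data) {
  const match = data.match(BEFORE_BLANK_LINE);
  return match ? match[0] : data;
}

// JSON.stringify makes the line endings visible in the output.
console.log(JSON.stringify(textBeforeFirstBlankLine('title\r\nsubtitle\r\n\r\nbody')));
console.log(JSON.stringify(textBeforeFirstBlankLine('title\nsubtitle\n\nbody')));
```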
You may want to try:
const EOL = require('os').EOL; // system newline.
const regex = new RegExp('^.*?(?=' + EOL + EOL + ')', 's'); // everything before first two newlines.

Javascript trim double slashes

I would like to trim //a/url/// to a/url. There are a few questions on Stackoverflow, but the answers either don't work, solve another problem, or are too long and complex.
The code below is working and is based on Javascript regular expression: remove first and last slash
function trimSlashes(str) {
str = str.replace(/^\/|\/$/g, '');
return str.replace(/^\/|\/$/g, '');
};
However, it's not very nice to duplicate code like that. What would a regex look like that takes care of double slashes as well?
Testcase
let str1 = trimSlashes('/some/url/here/');
let str2 = trimSlashes('//some/other/url/here///');
Expected result
some/url/here
some/other/url/here
Wishlist
Just a single regex
Shorter or faster is better
Here's another variation without a regex but with a functional flair. I don't know about the performance but I had fun writing it and seems less cryptic.
const newString = '//some/other/url/here///'
.split('/')
.filter(s => s)
.join('/')
Edit:
Just ran some perf tests and this is slower than a regex but it might be insignificant if used sparingly.
https://jsperf.com/regex-vs-functional/1
replace(/^\/+|\/+$/g, '') is what you're looking for:
Result with both test cases:
> '/some/url/here/'.replace(/^\/+|\/+$/g, '');
"some/url/here"
> '//some/other/url/here///'.replace(/^\/+|\/+$/g, '');
"some/other/url/here"
Explained:
^\/+ # one or more forward slashes at the beginning
| # or
\/+$ # one or more forward slashes at the end
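Dropped into the trimSlashes function from the question, the pattern needs only one pass. A minimal sketch; the third call is my own addition to show that slashes in the middle are left alone:

```javascript
// Removes every leading and trailing slash in a single pass.
function trimSlashes(str) {
  return str.replace(/^\/+|\/+$/g, '');
}

console.log(trimSlashes('/some/url/here/'));
console.log(trimSlashes('//some/other/url/here///'));
console.log(trimSlashes('/a//b/')); // the interior "//" is not touched
```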
With regexes you must be careful of unintended matches. For example, do you want to trim the slashes when the text is "// and this is a comment in some line of text//"?
If you don't want to trim things like that down you need to be a little more careful with the regex, how about this?
let regex = /^\/+([\w\/]+?)\/+$/;
let matches = regex.exec("//some/other/url/here///");
let url = matches[1];
https://regex101.com/r/K8CnxP/1
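One thing to watch with exec is that it returns null when the input does not fit, so matches[1] would throw. A hedged sketch with a guard; the function name extractPath and the choice to return null are assumptions of mine:

```javascript
// Returns the part between the outer slashes, or null when the input is not
// made up solely of word characters and slashes wrapped in slashes.
function extractPath(str) {
  const matches = /^\/+([\w\/]+?)\/+$/.exec(str);
  return matches ? matches[1] : null;
}

console.log(extractPath('//some/other/url/here///'));
console.log(extractPath('// and this is a comment in some line of text//'));
```

Note that this stricter pattern requires at least one slash at both ends, so a string like some/url yields null rather than being passed through.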

Javascript Regex Look behind

In my web app I need to remove all whitespace and line breaks before and after the content between a pair of ``. Example:
``\s\s\s\s\stest1234\s\s\s\s23432\s\s\s\s\s\s\s`` would become something like this: ``test1234\s\s\s\s23432``.
(\s is a whitespace)
The regex I wrote for this is: /(``(?<=[\s]*)[^`]*(?=[\s]*)``)/g but I found out JS doesn't have look behind, how would I transform this regex into something that does the job?
My JavaScript would look something like this:
replace(/(``(?<=[\s]*)[^`]*(?=[\s]*)``)/g, function(match, p1) {
return p1;
})
Note, I only want to remove the outer whitespace; the whitespace that belongs to the content needs to be preserved.
Make it two steps.
var src = "`` test123423432 \n\n ``";
var results = src.replace(/``([\s\S]*?)``/g,function(_,m) {
    // note [\s\S] above stands in for a DOTALL flag, which JS lacked before ES2018
    return "``"+m.replace(/^\s+|\s+$/g,"")+"``"; // trim all whitespace
});
If a problem seems too hard, usually breaking it down into smaller problems is the answer.
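The same two steps can be packaged as a reusable function. This sketch swaps the inner regex for String.prototype.trim, which strips the same leading and trailing whitespace; the function name is hypothetical.

```javascript
// Step 1: find each pair of `` delimiters and capture what is between them.
// Step 2: trim only the outer whitespace of the captured content.
function trimInsideBackticks(src) {
  return src.replace(/``([\s\S]*?)``/g, (_, inner) => '``' + inner.trim() + '``');
}

console.log(trimInsideBackticks('``     test1234    23432       ``'));
console.log(trimInsideBackticks('a ``  x  `` b `` y\n``'));
```

Whitespace inside the content, such as the run between test1234 and 23432, is kept because only the ends of each captured piece are trimmed.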

Regex remove line breaks

I am hoping that someone can help me. I am not exactly sure how to use the following regex. I am using classic ASP with Javascript
completehtml = completehtml.replace(/\<\!-- start-code-remove --\>.*?\<\!-- start-code-end --\>/ig, '');
I have this code to remove everything between
<\!-- start-code-remove --\> and <\!-- start-code-end --\>
It works perfectly up to the point where there are line breaks in the values between the start and end code...
How should I write the regex to remove everything between start and end even if there are line breaks?
Thanks a million for responding...
Should I use the \n and \s characters? I'm not 100% sure..
(/\<\!-- start-code-remove --\>\s\n.*?\s\n\<\!-- start-code-end --\>/ig, '');
Also, the match should not be greedy between <\!-- start-code-remove --\> and <\!-- start-code-end --\>, and it should capture the values in groups...
There could be 3 or more of these sets...
The dot doesn't match new lines in Javascript, and engines older than ES2018 (classic ASP's JScript among them) have no modifier to make it do that (unlike most modern regex engines). A common work-around is to use this character class in place of the dot: [\s\S]. So your regex becomes:
completehtml = completehtml.replace(
/\<\!-- start-code-remove --\>[\s\S]*?\<\!-- start-code-end --\>/ig, '');
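Here is a small check that the [\s\S] version copes with line breaks and with several marked sets. It is only a sketch: the function name and the sample markup are made up, and it sticks to var and plain functions since the question mentions classic ASP.

```javascript
// Removes every block from a start marker to the nearest end marker.
// [\s\S]*? spans line breaks and stays non-greedy, so each set is removed
// separately instead of everything from the first start to the last end.
function stripMarkedBlocks(html) {
  return html.replace(/<!-- start-code-remove -->[\s\S]*?<!-- start-code-end -->/gi, '');
}

var page = [
  '<p>keep 1</p>',
  '<!-- start-code-remove -->',
  '<p>drop 1</p>',
  '<!-- start-code-end -->',
  '<p>keep 2</p>',
  '<!-- START-CODE-REMOVE --><p>drop 2</p><!-- start-code-end -->',
  '<p>keep 3</p>'
].join('\n');

console.log(stripMarkedBlocks(page));
```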
Try (.|\n|\r)*.
completehtml = completehtml.replace(/\<\!-- start-code-remove --\>(.|\n|\r)*?\<\!-- start-code-end --\>/ig, '');
Source
There was indeed no /s modifier to make the dot match all characters, including line breaks, when this was written (ES2018 has since added one). To match absolutely any character, you can use a character class that contains a shorthand class and its negated version, such as [\s\S].
Regex support in javascript is not very reliable.
function remove_tag_from_text(text, begin_tag, end_tag) {
    var tmp = text.split(begin_tag);
    while(tmp.length > 1) {
        var before = tmp.shift();
        var after = tmp.join(begin_tag).split(end_tag);
        after.shift();
        text = before + after.join(end_tag);
        tmp = text.split(begin_tag);
    }
    return text;
}
