JavaScript RegEx to get test spec - javascript

I have a need to get the current test spec my caret is in when using Jasmine. So if I have a spec like:
it("should do something", function() {
var foo = 'bar';
expect(foo).toEqual('bar');
});
and my caret is on the blank line when I click some button in a UI, the code should walk back from the caret to find the spec. It first checks the var foo = 'bar'; line, detects that it is not a match, and moves on to the previous line, which contains it() and is therefore the spec line. Walking back line by line I can do; what I need help with is detecting whether a line is the one containing it().
My end goal is to detect whether the function() passed as the 2nd argument declares a parameter. If it doesn't, I need to add one. Since the snippet above has no parameter in function(), I need to add one so that it looks like:
it("should do something", function(done) {
var foo = 'bar';
expect(foo).toEqual('bar');
});
Notice the done now in the function(). Also, the "should do something" can be double quotes or single quotes and can contain any legal JavaScript character in it.
As a test, I used this RegEx:
/^\s*it\((?:"|')[\w\s]+(?:"|'), function\((?:\w+)?\) {/
And it works for my simple tests but it feels incomplete especially in the "should do something" detection part.

I think you can safely use a regex that is based on the unroll-the-loop method:
^\s*it\([^,]*(?:,(?!\s*function\()[^,]*)*,\s*function\(\w*\)\s*{
It matches it( at the beginning of a string, followed by anything that is not , function(, and then , function(...) {. It is equivalent to ^\s*it\([\s\S]*?,\s*function\(\w*\)\s*{, but much more efficient.
See the regex demo
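As a rough illustration of how this pattern could drive the backward search described in the question, here is a minimal sketch. The helper name findSpecLine and the idea of receiving the editor contents as an array of lines are assumptions made for the example, not part of any editor API.

```javascript
// Pattern from the answer above; the trailing { is escaped for clarity.
var specStart = /^\s*it\([^,]*(?:,(?!\s*function\()[^,]*)*,\s*function\(\w*\)\s*\{/;

// Hypothetical helper: walk upward from the caret line until a line
// matches the spec pattern. Returns the line index, or -1 if none is found.
function findSpecLine(lines, caretIndex) {
  for (var i = caretIndex; i >= 0; i--) {
    if (specStart.test(lines[i])) {
      return i;
    }
  }
  return -1;
}

var lines = [
  'it("should do something", function() {',
  "  var foo = 'bar';",
  "",
  "  expect(foo).toEqual('bar');",
  "});"
];

// Caret on the blank line (index 2): the search ends at the it( line.
console.log(findSpecLine(lines, 2));
```

The regex has no g flag, so calling test() repeatedly does not carry any lastIndex state between lines.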
Now, if you need to match such signatures without any text inside function(), you can use capturing groups around the subpatterns you want to keep and that you can later reference as $1 and $2:
var re = /^(\s*it\([^,]*(?:,(?!\s*function\()[^,]*)*,\s*function\()(\)\s*{)/;
var str = 'it("should do something", function() {\n var foo = \'bar\';\n\n expect(foo).toEqual(\'bar\');\n});';
var subst = '$1done$2';
var result = str.replace(re, subst);
document.write(result);
If you really can have such weird strings as Oriol suggests, use
^\s*it\((?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*'),\s*function\(\w*\)\s*{
See another regex demo
It will match
it(",function(){", function() {
it(',function(){', function() {
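Combining the string-aware pattern with the capturing-group replacement shown earlier, adding done could be sketched as follows. The helper name addDoneParam is hypothetical, and the sketch assumes the spec title is a plain single- or double-quoted string literal on the same line as it(.

```javascript
// String-aware pattern from above, split into two capturing groups:
// $1 = everything up to and including "function(", $2 = ") {".
var emptyCallback = /^(\s*it\((?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*'),\s*function\()(\)\s*\{)/;

// Hypothetical helper: insert `done` only when the callback has no parameter.
// Lines that already declare a parameter do not match and are returned as-is.
function addDoneParam(line) {
  return line.replace(emptyCallback, "$1done$2");
}

console.log(addDoneParam('it(",function(){", function() {'));
console.log(addDoneParam("it('already async', function(done) {"));
```

Because group 2 requires the closing parenthesis to follow function( immediately, a callback that already has a parameter is left untouched.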

Related

Javascript regex - getting function name from string

I am trying to get a function name from a string in javascript.
Let's say I have this string:
function multiply($number) {
return ($number * 2);
}
I am using the following regex in javascript:
/([a-zA-Z_{1}][a-zA-Z0-9_]+)\(/g
However, what is selected is multiply(. This is wrong. What I want is the word multiply without the (, though the regex should still require that the function name is immediately followed by a (.
I can't get this done. How can I make the proper regex for this? I know that this is not something I really should do and that it is quite error sensitive, but I still wanna try to make this work.
Just replace the last \( with (?=\()
`function multiply($number) {
return ($number * 2);
}`.match(/([a-zA-Z_{1}][a-zA-Z0-9_]+)(?=\()/g) // ["multiply"]
You can use:
var name = functionString.match(/function(.*?)\(/)[1].trim();
Get anything between function and the first ( (using the non-greedy quantifier *?), then get the value of the group with [1]. Finally, trim to remove surrounding spaces.
Example:
var functionString = "function dollar$$$AreAllowedToo () { }";
var name = functionString.match(/function(.*?)\(/)[1].trim();
console.log(name);
Notes:
The characters allowed in JavaScript variable names are far too many to express in a character set. My answer takes care of that. More about this here
You may want to consider the possibility of a comment between function and (, or a new line too. But this really depends on how you intend to use this regex.
Take for example:
function /*this is a tricky comment*/ functionName // another one
(param1, param2) {
}
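One way to cope with the tricky case above is to remove comments before matching and to let the name span a line break. The following is only a sketch: getFunctionName is a hypothetical helper, and its comment removal is deliberately naive (it would also strip // or /* */ sequences that appear inside string literals).

```javascript
// Hypothetical helper: returns the declared name, "" for an anonymous
// function, or null when no `function ... (` is found.
function getFunctionName(source) {
  // Naive comment removal; acceptable only for a sketch.
  var withoutComments = source.replace(/\/\*[\s\S]*?\*\/|\/\/.*$/gm, "");
  // [\s\S]*? instead of .*? so the name may be separated from ( by a newline.
  var match = withoutComments.match(/function([\s\S]*?)\(/);
  return match ? match[1].trim() : null;
}

var tricky =
  "function /*this is a tricky comment*/ functionName // another one\n" +
  "(param1, param2) {\n" +
  "}";

console.log(getFunctionName("function multiply($number) {\n return ($number * 2);\n}"));
console.log(getFunctionName(tricky));
```

Returning null instead of indexing into the match result directly avoids a TypeError when the input contains no function at all.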

Doubts in JavaScript RegExp and String.replace() method

I am trying to enter 'username' in a webpage using VBA. So in the source code of the webpage, there are some modifications done to the 'username' value.
I have attached the code,
function myFunction()
{
document.str.value = "Abc02023";
document.str.value = document.str.value.toUpperCase();
pattern = new RegExp("\\*", "g");
document.str.value = document.str.value.replace(pattern, "");
document.str.value = document.str.value.replace(/^\s+/, "");
document.str.value = document.str.value.replace(/\s+$/, "");
}
I read about these and from my understanding, after the modifications document.str.value is ABC02023.
Obviously I am wrong, as there would be no point in doing all these modifications then. Also, I am getting an 'incorrect username' error.
So can anybody please help me to understand these. What would be the value of document.str.value and how did you figure it out? I am new to JavaScript so please forgive me if I am being too slow...
Looks like you are using some very old code to learn from. ☹
Let's see if we can still learn something by bringing this code up to date, then you go find some newer learning materials. Here is a well-written book series with free online versions available: You Don't Know JS.
function myFunction() {
// Assuming your code runs in a browser, `document` is a property of the
// global object (`window`). If `document.str` already refers to an object
// (for instance a form named `str`), this sets the `value` property of
// that object to 'Abc02023'. If there is no `document` at all (for
// instance when not running in a browser) this throws a ReferenceError,
// and if `document` has no property called `str` it throws a TypeError,
// because you cannot set a property on `undefined`.
document.str.value = "Abc02023";
// You probably were just trying to create a new variable `str` so let's
// just start over
}
Second try
function myFunction() {
// create a variable `str` and set it to 'Abc02023'
var str = "Abc02023";
// Take value of str and convert it to all capital letters
// then overwrite current value of str with the result.
// So now `str === 'ABC02023'`
str = str.toUpperCase();
// Create a regular expression representing all occurrences of `*`
// and assign it to variable `pattern`.
var pattern = new RegExp("\\*", "g");
// Remove all instances of our pattern from the string (which does not
// affect this string, but prevents a user from inputting some types of
// bad strings to hack your website).
str = str.replace(pattern, "");
// Remove any leading whitespace from our string (which does not
// affect this string, but cleans up strings input by a user).
str = str.replace(/^\s+/, "");
// Remove any trailing whitespace from our string (which does not
// affect this string, but cleans up strings input by a user).
str = str.replace(/\s+$/, "");
// Let's at least see our result behind the scenes. Press F12
// to see the developer console in most browsers.
console.log("`str` is equal to: ", str );
}
Third try, let's clean this up a little:
// The reason to use functions is so we can contain the logic
// separate from the data. Let's extract our data (the string value)
// and then pass it in as a function parameter
var result = myFunction('Abc02023')
console.log('result = ', result)
function myFunction(str) {
str = str.toUpperCase();
// Nicer syntax for defining regular expression.
var pattern = /\*/g;
str = str.replace(pattern, '');
// Unnecessary use of regular expressions. Let's use trim instead
// to clean leading and trailing whitespace at once.
str = str.trim()
// Let's return our result so the rest of the program can use it.
return str
}
Last go round. We can make this much shorter and easier to read by chaining together all the modifications to str. And let's also give our function a useful name and try it out against a bad string.
var cleanString1 = toCleanedCaps('Abc02023')
var cleanString2 = toCleanedCaps(' ** test * ')
console.log('cleanString1 = ', cleanString1)
console.log('cleanString2 = ', cleanString2)
function toCleanedCaps(str) {
return str
.toUpperCase()
.replace(/\\*/g, '')
.trim()
}
@skylize's answer is close.
What is equivalent to your code is actually
function toCleanedCaps(str) {
return str
.toUpperCase()
.replace(/\*/g, '') // he got this wrong
.trim()
}
Let's go over the statements one by one
document.str.value = document.str.value.toUpperCase();
makes the string uppercase
pattern = new RegExp("\\*", "g");
document.str.value = document.str.value.replace(pattern, "");
removes every literal * character (the string "\\*" produces the regex \*, an escaped asterisk), so no match in this case.
document.str.value = document.str.value.replace(/^\s+/, "");
replaces any whitespace character occurring between one and unlimited times at the beginning of the string, so no match.
document.str.value = document.str.value.replace(/\s+$/, "");
replaces any whitespace character occurring between one and unlimited times at the end of the string, so no match.
You are right. With "Abc02023" as input, the output is what you suggest.
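The difference between the two patterns discussed in this thread can be checked directly. This is a small sketch; the helper name clean and the sample input are only illustrative.

```javascript
var fromString = new RegExp("\\*", "g"); // same as the literal /\*/g
var doubled = /\\*/g;                    // zero or more backslashes

// Illustrative helper mirroring the chained version from the answers.
function clean(str, pattern) {
  return str.toUpperCase().replace(pattern, "").trim();
}

var input = " ** test * ";

console.log(fromString.source);        // the pattern text: backslash, asterisk
console.log(clean(input, fromString)); // asterisks removed
console.log(clean(input, doubled));    // asterisks still present
```

The doubled pattern only ever matches empty strings or runs of backslashes, so it leaves the asterisks in place.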

regex not being called repeatedly for multiple matches (isn't global)

I have this regex /#[a-zA-Z0-9_]+$/g to do a global look up of all user names that are mentioned.
Here is some sample code.
var userRegex = /#[a-zA-Z0-9_]+$/g;
var text = "This is some sample text #Stuff #Stuff2 #Stuff3";
text.replace(userRegex, function(match, text, urlId) {
console.log(match);
});
So basically that console.log only gets called once, in this case it'll just show #Stuff3. I'm not sure why it isn't searching globally. If someone can help fix up that regex for me, that'd be awesome!
$ means "Assert the position at the end of the string (or before a line break at the end of the string, if any)". But you don't seem to want that.
So remove the $ and use /#[a-zA-Z0-9_]+/g instead.
var userRegex = /#[a-zA-Z0-9_]+/g,
text = "This is some sample text #Stuff #Stuff2 #Stuff3";
text.match(userRegex); // [ "#Stuff", "#Stuff2", "#Stuff3" ]
It isn't doing a global search throughout the entire context simply because of the end of string $ anchor (which only asserts at the end of string position). You can use the following here:
var results = text.match(/#\w+/g) //=> [ '#Stuff', '#Stuff2', '#Stuff3' ]
Note: \w is shorthand for matching any word character.
Adding to @Oriol's answer: you can add a word boundary to be more specific.
#([a-zA-Z0-9_]+)\b
The \b requires the username to end at a word boundary, that is, to be followed by a non-word character or the end of the string.
Here is the regex demo.
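To compare the variants discussed above side by side, here is a short sketch using the sample text from the question. The last variant assumes an environment that supports String.prototype.matchAll (ES2020).

```javascript
var text = "This is some sample text #Stuff #Stuff2 #Stuff3";

// With the $ anchor only the mention at the very end can match.
var anchored = text.match(/#[a-zA-Z0-9_]+$/g);

// Without the anchor every mention is found.
var mentions = text.match(/#\w+/g);

// matchAll exposes capture groups, so the leading # can be dropped.
var names = Array.from(text.matchAll(/#(\w+)/g), function (m) {
  return m[1];
});

console.log(anchored);
console.log(mentions);
console.log(names);
```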

Why do these JavaScript regular expression capture parenthesis snag entire line instead of the suffixes appended to a word?

Can someone please tell me WHY my simple expression doesn't capture the optional arbitrary length .suffix fragments following hello, matching complete lines?
Instead, it matches the ENTIRE LINE (hello.aa.b goodbye) instead of the contents of the capture parenthesis.
Using this code (see JSFIDDLE):
//var line = "hello goodbye"; // desired: suffix null
//var line = "hello.aa goodbye"; // desired: suffix[0]=.aa
var line = "hello.aa.b goodbye"; // desired: suffix[0]=.aa suffix[1]=.b
var suffix = line.match(/^hello(\.[^\.]*)*\sgoodbye$/g);
I've been working on this simple expression for OVER three hours and I'm beginning to believe I have a fundamental misunderstanding of how capturing works: isn't there a "cursor" gobbling up each line character-by-character and capturing content inside the parenthesis ()?
I originally started from Perl and then PHP. When I started with JavaScript, I got stuck with this situation once myself.
In JavaScript, a GLOBAL match does NOT produce a multidimensional array. In other words, with the g flag, match() returns only the full matches (no sub-patterns).
Please note that suffix[0] matches the whole string.
Try this:
//var line = "hello goodbye"; // desired: suffix undefined
//var line = "hello.aa goodbye"; // desired: suffix[1]=.aa
var line = "hello.aa.b goodbye"; // desired: suffix[1]=.aa suffix[2]=.b
var suffix = line.match(/^hello(\.[^.]+)?(\.[^.]+)?\s+goodbye$/);
If you have to use a global match, then you have to capture the whole strings first, then run a second RegEx to get the sub-patterns.
Good luck
:)
Update: Further Explanation
If each string only has ONE matchable pattern (like var line = "hello.aa.b goodbye";)
then you can use the pattern I posted above (without the GLOBAL modifier)
If a string has more than ONE matchable pattern, then look at the following:
// modifier g means it will match more than once in the string
// ^ at the start means "starting with", for when you want the match to start from the beginning of the string
// $ means the end of the string
// if you have ^.....$ it means the whole string should be ONE match
var suffix = line.match(/^hello(\.[^.]+)?(\.[^.]+)?\s+goodbye$/g);
var line = 'hello.aa goodbye and more hello.aa.b goodbye and some more hello.cc.dd goodbye';
// no match here since the whole of the string doesn't match the RegEx
var suffix = line.match(/^hello(\.[^.]+)?(\.[^.]+)?\s+goodbye$/);
// one match here, only the first one since it is not a GLOBAL match (hello.aa goodbye)
// suffix[0] = hello.aa goodbye
// suffix[1] = .aa
// suffix[2] = undefined
var suffix = line.match(/hello(\.[^.]+)?(\.[^.]+)?\s+goodbye/);
// 3 matches here (but no sub-patterns), only a one dimensional array with GLOBAL match in JavaScript
// suffix[0] = hello.aa goodbye
// suffix[1] = hello.aa.b goodbye
// suffix[2] = hello.cc.dd goodbye
var suffix = line.match(/hello(\.[^.]+)?(\.[^.]+)?\s+goodbye/g);
I hope that helps.
:)
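For the multi-match case above, one alternative to running a second regex is String.prototype.matchAll, which yields the sub-patterns for every match. This sketch assumes an ES2020 environment, and it uses [^.\s] instead of [^.] so that a suffix cannot run across spaces.

```javascript
var line = 'hello.aa goodbye and more hello.aa.b goodbye and some more hello.cc.dd goodbye';
var pattern = /hello(\.[^.\s]+)?(\.[^.\s]+)?\s+goodbye/g;

// One row per match: [full match, first suffix, second suffix].
// A suffix that is absent shows up as undefined.
var rows = Array.from(line.matchAll(pattern), function (m) {
  return [m[0], m[1], m[2]];
});

rows.forEach(function (row) {
  console.log(row);
});
```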
Inside the (), do not look for a . followed by a space; instead look for a . followed by some characters, and then look for the space outside the ().
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations.
var suffix = line.match(/^hello((\.[^\.]*)*)\sgoodbye$/); // no g flag, so suffix[1] holds the outer capturing group
if (suffix !== null)
suffix = suffix[1].match(/(\.[^\.\s]*)/g)
I also recommend the regex101 site.
Using the global flag with the match method doesn't return any capturing groups. See the specification.
Although you use ()*, it's only one capturing group. The * only defines that the content has to be matched 0 or more times before the space comes.
As @EveryEvery has pointed out, you can use a two-step approach.
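The two-step approach can be sketched as a small function that handles any number of suffixes. The name getSuffixes and the choice to return null for a non-matching line are assumptions made for this example.

```javascript
// Hypothetical helper: returns an array of suffixes, or null when the
// line does not have the "hello[.suffix...] goodbye" shape at all.
function getSuffixes(line) {
  // Step 1 (no g flag): the outer group captures the whole suffix run.
  var whole = line.match(/^hello((?:\.[^.\s]+)*)\sgoodbye$/);
  if (whole === null) {
    return null;
  }
  // Step 2 (g flag): split the run into the individual suffixes.
  // match() returns null for an empty run, hence the fallback to [].
  return whole[1].match(/\.[^.\s]+/g) || [];
}

console.log(getSuffixes("hello goodbye"));
console.log(getSuffixes("hello.aa goodbye"));
console.log(getSuffixes("hello.aa.b goodbye"));
```

Step 1 relies on the capture group, which is why it must not use the g flag. Step 2 only needs the full matches, so the g flag is appropriate there.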

Regular expression to find all methods in a piece of code

I am trying to write a regular expression to match all the JavaScript method definitions in a constructor string.
//These two should match
this.myMethod_1 = function(test){ return "foo" }; //Standard
this.myMethod_2 = function(test, test2){ return "foo" }; //Spaces before
//All of these should not
//this.myMethod_3 = function(test){ return "foo" }; //Comment shouldn't match
/**
*this.myMethod_4 = function(test){ return "foo" }; //Block comment shouldn't match
*/
// this.myMethod_5 = function(test){ return "foo" }; //Comment then spaces shouldn't match
/*
* this.myMethod_6 = function(test){ return "foo" }; //Block comment + spaces shouldn't match
*/
this.closure = (function(){ alert("test") })(); //closures shouldn't match
The regular expression should match ['myMethod_1', 'myMethod_2']. The regular expression should not match ['myMethod_3', 'myMethod_5', 'myMethod_6', 'closure'].
Here's what I have so far, but I am having problems with the ones that appear in comments:
/(?<=this\.)\w*(?=\s*=\s*function\()/g
I've been using this cool site to test it.
How do I solve this?
This sounds complicated to do correctly. You will need to create a parser for this; a simple regular expression will most likely not be enough.
A very good starting point is Narcissus, which is a JavaScript parser written in ... JavaScript.
It is just 1000 lines of code. It should be possible to extract just the method-matching parts of it.
Adding a ^\s* to the beginning might help. It's not perfect, but it will work for your test cases.
One regular expression might be difficult to write and debug. Think about writing several regular expressions, one for each line that should either match to confirm or reject a piece of code.
For example,
/(?<=this\.)\w*(?=\s*=\s*function\()/g // Matches a simple constructor.
/^\/\// // If it matches then this line starts with a comment.
and so on.
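The "several small regular expressions" idea could look roughly like the sketch below, which checks the source line by line. findMethodNames is a hypothetical helper. It assumes comments start at the beginning of a line (after optional whitespace), and it uses a capturing group rather than a lookbehind so that it also runs in engines without lookbehind support.

```javascript
// Hypothetical helper: collect names assigned as `this.name = function(`
// while skipping line comments and block comments.
function findMethodNames(source) {
  var methodLine = /^\s*this\.(\w+)\s*=\s*function\s*\(/;
  var names = [];
  var inBlockComment = false;
  source.split("\n").forEach(function (line) {
    if (inBlockComment) {
      // Stay in comment mode until a line contains the closing */.
      if (line.indexOf("*/") !== -1) inBlockComment = false;
      return;
    }
    if (/^\s*\/\*/.test(line)) {
      // A block comment opens here; it may also close on the same line.
      inBlockComment = line.indexOf("*/") === -1;
      return;
    }
    if (/^\s*\/\//.test(line)) return; // line comment
    var m = methodLine.exec(line);
    if (m) names.push(m[1]);
  });
  return names;
}

var sample = [
  'this.myMethod_1 = function(test){ return "foo" }; //Standard',
  '    this.myMethod_2 = function(test, test2){ return "foo" }; //Spaces before',
  '//this.myMethod_3 = function(test){ return "foo" };',
  '/**',
  '*this.myMethod_4 = function(test){ return "foo" };',
  '*/',
  '/*',
  '* this.myMethod_6 = function(test){ return "foo" };',
  '*/',
  'this.closure = (function(){ alert("test") })();'
].join("\n");

var found = findMethodNames(sample);
console.log(found);
```

This is still not a parser: a method definition that follows a comment on the same line, or one that appears inside a string, would be handled incorrectly.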
