JavaScript and Regex: undefined and ""

I am not really familiar with regular expressions and I am having the following problem:
When I run the regex to split my string, I get lots of undefined and "" entries along with the result. I already tried using "(?:", which I saw in another answer here on Stack Overflow, and lots of other things.
I am then using the array.filter function to remove them, but I didn't want to do that. Can anyone help me (and also explain why this is happening)?
let values = line.split(/(\!=)|(<=)|(>=)|(==)|(\/\/)|(\/\*)|(\*\/)|(")|(=)|(<)|(>)|(\+)|(-)|(\*)|(\\)|(\()|(\))|;| /g);
return values.filter(value => {
return value !== undefined && value !== "";
});
Strings that can be used:
"int x = 7;" => ["int","x","=","7",";"]
"int x = 7 + 25 * 52" => ["int","x","=","7","+","25","*","52"]
"while( x != 0)" => ["while","(","x","!=","0",")"]
'if(idade > 70 && sexo == "masculino")'
=> ["if","(","idade",">","70","&&","sexo","==","\"masculino\"",")"]
Thanks!

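The reason is that you have each alternative in its own capture group. split() adds every capture group to the result, and only the group that actually matched is filled in; the others come back as undefined. Instead, put a single capture group around all the alternatives. Empty strings can still appear wherever two delimiters are adjacent.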
line = "#include <stdio.h>";
let values = line.split(/(\!=|<=|>=|==|\/\/|\/\*|\*\/|"|=|<|>|\+|-|\*|\\|\(|\)|;| )/g);
console.log(values);
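A minimal sketch of the difference, plus one way to avoid the filter step altogether. The `TOKEN` pattern and the `tokenize` helper are illustrative assumptions, not part of the original answer: they match tokens directly instead of splitting, and they treat a lone `!` or `/` as its own token, which differs slightly from the split pattern above.

```javascript
// One group per alternative: every group that did not match adds `undefined`.
const perAlternative = "x=7".split(/(=)|(\+)/);
console.log(perAlternative); // [ 'x', '=', undefined, '7' ]

// One group around all alternatives: only the matched delimiter is added.
const singleGroup = "x=7".split(/(=|\+)/);
console.log(singleGroup); // [ 'x', '=', '7' ]

// Hypothetical alternative: match the tokens instead of splitting on them.
// Two-character operators come first, then single-character delimiters,
// then runs of anything that is neither whitespace nor a delimiter.
const TOKEN = /!=|<=|>=|==|\/\/|\/\*|\*\/|["=<>+\-*\\();!\/]|[^\s"=<>+\-*\\();!\/]+/g;

function tokenize(line) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return line.match(TOKEN) || [];
}

console.log(tokenize("while( x != 0)")); // [ 'while', '(', 'x', '!=', '0', ')' ]
```

Because whitespace never matches any alternative, it is skipped without producing empty strings, so no filter pass is needed.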

Related

Check length between two characters in a string

So I've been working on this coding challenge for about a day now and I still feel like I haven't scratched the surface, even though it's supposed to be easy. The problem asks us to take a string parameter and return true if there are exactly 3 characters (not including spaces) in between the letters 'a' and 'b'.
Example: Input: "maple bread"; Output: false // Because there are > 3 places
Input: "age bad"; Output: true // Exactly three places in between 'a' and 'b'
Here is what I've written, although it is unfinished and most likely in the wrong direction:
function challengeOne(str) {
let places = 0;
for (let i=0; i < str.length; i++) {
if (str[i] != 'a') {
places++
} else if (str[i] === 'b'){
}
}
console.log(places)
}
So my idea was to start counting places after the letter 'a' until it gets to 'b', then it would return the amount of places. I would then start another flow where if 'places' > 3, return false or if 'places' === 3, then return true.
However, attempting the first flow only returns the total count for places that aren't 'a'. I'm using console.log instead of return to test if it works or not.
I'm only looking for a push in the right direction and if there is a method I might be missing or if there are other examples similar to this. I feel like the solution is pretty simple yet I can't seem to grasp it.
Edit:
I took a break from this challenge so I could look at it with fresh eyes, and I was able to solve it quickly! I looked through everyone's suggestions and applied them until I found the solution. Here is the new code that worked:
function challengeOne(str) {
// code goes here
str = str.replace(/ /g, '')
let count = Math.abs(str.lastIndexOf('a')-str.lastIndexOf('b'));
if (count === 3) {
return true
} else return false
}
Thank you for all your input!
Here's a more efficient approach: simply find the indexes of the letters a and b and check whether the absolute difference between the two is 4 (three characters in between means the indexes are four apart):
function challengeOne(str) {
return Math.abs(str.indexOf("a") - str.indexOf("b")) == 4;
}
console.log(challengeOne("age bad"));
console.log(challengeOne("maple bread"));
if there are exactly 3 characters (not including spaces)
Simply remove all spaces via String#replace, then perform the check:
function challengeOne(str) {
  str = str.replace(/ /g, '');
  return Math.abs(str.indexOf("a") - str.indexOf("b")) == 4;
}
console.log(challengeOne("age bad"));
console.log(challengeOne("maple bread"));
References:
Math#abs
String#indexOf
Here is another approach: This one excludes spaces as in the OP, so the output reflects that. If it is to include spaces, that line could be removed.
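function challengeOne(str) {
  // strip whitespace
  str = str.replace(/\s/g, '');
  // take just the characters between the first 'a' and the last 'b'
  const match = str.match(/a(.*)b/);
  // match is null when no 'a' is followed by a 'b'
  return match !== null && match[1].length == 3;
}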
console.log(challengeOne('maple bread'));
console.log(challengeOne('age bad'));
You can go recursive:
Check if the string starts with 'a' and ends with 'b' and check the length
Continue by cutting the string either left or right (or both) until there are 3 characters in between or the string is empty.
Examples:
maple bread
aple brea
aple bre
aple br
aple b
ple
le
FALSE
age bad
age ba
age b
TRUE
const check = (x, y, z) => str => {
const exec = s => {
const xb = s.startsWith(x);
const yb = s.endsWith(y);
return ( !s ? false
: xb && yb && s.length === z + 2 ? true
: xb && yb ? exec(s.slice(1, -1))
: xb ? exec(s.slice(0, -1))
: exec(s.slice(1)));
};
return exec(str);
}
const challenge = check('a', 'b', 3);
console.log(`
challenge("maple bread"): ${challenge("maple bread")}
challenge("age bad"): ${challenge("age bad")}
challenge("aabab"): ${challenge("aabab")}
`)
I assume spaces are counted and your examples seem to indicate this, although your question says otherwise. If so, here's a push that should be helpful. You're right, there are JavaScript methods for strings, including one that should help you find the index (location) of the a and b within the given string.
Try here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#instance_methods
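The index-based answers above only look at the first (or last) 'a' and 'b'. As a rough sketch of a variant that checks every position, here is a hypothetical `hasGap` helper; its name and the `gap` and `ignoreSpaces` parameters are assumptions made for illustration, and spaces are counted by default because the question's examples suggest they are.

```javascript
// Hypothetical helper: true if some 'a' and some 'b' (in either order)
// have exactly `gap` characters between them.
function hasGap(str, gap = 3, ignoreSpaces = false) {
  const s = ignoreSpaces ? str.replace(/ /g, "") : str;
  // Two characters with `gap` characters between them are gap + 1 indexes apart.
  for (let i = 0; i + gap + 1 < s.length; i++) {
    const pair = s[i] + s[i + gap + 1];
    if (pair === "ab" || pair === "ba") return true;
  }
  return false;
}

console.log(hasGap("age bad"));     // true
console.log(hasGap("maple bread")); // false
console.log(hasGap("aabab"));       // true
```

With "aabab", comparing only the first 'a' and the first 'b' would give false, because those two are just two indexes apart; scanning every position finds the 'a' at index 0 and the 'b' at index 4.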

How do I grab user input after an # symbol and before a space?

I want to grab the user input from an input tag including everything after the # symbol and up to a space if the space exists. For example:
If the user input is "hello#yourname"
I want to grab "yourname"
If the user input is "hello#yourname hisname"
I want to grab "yourname" because it is after the # symbol and ends at the space.
I have some code written that attempts to grab the user input based on these rules, but there is a bug present that I can't figure out how to fix. Right now if I type "hello#yourname hisname"
My code will return "yourname hisn"
I don't know why the space and four characters "hisn" are being returned. Please help me figure out where the bug is.
Here is my function which performs the user input extraction.
handleSearch(event) {
let rawName, nameToSearch;
rawName = event.target.value.toLowerCase();
if (rawName.indexOf('#') >= 0 && rawName.indexOf(' ') >= 0) {
nameToSearch = rawName.substr(rawName.indexOf('#') + 1, rawName.indexOf(' ') - 1);
} else if (rawName.indexOf('#') >= 0 && rawName.indexOf(' ') < 0) {
nameToSearch = rawName.substr(rawName.indexOf('#') + 1);
} else {
nameToSearch = '';
}
return nameToSearch;
}
Working example:
handleSearch(event) {
let rawName = event.target.value.toLowerCase();
if (rawName.indexOf("#") === -1) {
return '';
}
return (rawName.split("#")[1].split(" "))[0];
}
You have to handle a lack of "#", but you don't need to handle the case where there is a space or not after the "#". The split function will still behave correctly in either of those scenarios.
Edit: The specific reason why OP's code doesn't work is that the substr method's second argument is not the end index, but the number of characters to return starting from the start index. You can use the similar substring method instead of substr to make this easier, since its second argument is an end index. Change the line after the first if statement as follows:
nameToSearch = rawName.substring(rawName.indexOf('#') + 1, rawName.indexOf(' '));
const testCases = [
"hello#yourname",
"hello#yourname hisname"
];
for (let test of testCases) {
let re = /#(.*?)(?:\s|$)/g;
let result = re.exec(test);
console.log(result[1]);
}
Use regex instead if you know how the string will be created.
You could do something like this--
var string = "me#somename yourname";
var parts = string.split("#");
var parts2 = parts[1];
var yourPart = parts2.split(" ");
console.log(yourPart[0]);
NOTE:
I am suggesting it just because you know your string structure.
Suggestion
For your Piece of code I think you have some white space after hisn that is why it is returning this output. Try to replace all the white spaces with some character see if you are getting any white space after hisn.
I'm not sure of the language your code is in (there are several it 'could be', probably Javascript), but in most languages (including Javascript) a substring function 'starts at' the position of the first parameter, and then 'ends at' that position plus the second parameter. So when your second parameter is 'the position of the first space - 1', you can substitute 'the position of the first space - 1' with the number 13. Thus, you're saying 'get a substring by starting one after the position of the first # character i.e. position 6 in a zero-based system. Then return me the next 13 characters.'
In other words, you seem to be trying to say 'give me the characters between position 6 and position 13 (inclusive)', but you're really saying 'give me the characters between position 6 and position 18 (inclusive)'.
This is
y o u r n a m e h i s n
1 2 3 4 5 6 7 8 9 10 11 12 13
(For some reason I can't get my spaces and newlines to get preserved in this answer; but if you count the letters in 'yourname hisn' it should make sense :) )
This is why you could use Neophyte's code so long as you can presume what the string would be. To expand on Neophyte's answer, here's the code I would use (in the true branch of the conditional - you could also probably rename the variables based on this logic, etc.):
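// everything after the first '#'
nameToSearch = rawName.substr(rawName.indexOf('#') + 1);
// from the start of that remainder up to (not including) its first space
var nameFromNameToSearch = nameToSearch.substr(0, nameToSearch.indexOf(' '));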
nameFromNameToSearch would contain the string you're looking for. I haven't completely tested this code, but I hope it 'conceptually' gives you the answer you're looking for. Also, 'conceptually', it should work whether there are more than one '#' sign, etc.
P.S. In that first 'rawName.substr' I'm not giving a second parameter, which in Javascript et al. effectively says 'start at the first position and give me every character up to the end of the string'.
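To make the substr versus substring difference discussed in this thread concrete, here is a small sketch using the question's own input. The `extractName` helper is a hypothetical name introduced for illustration; it uses slice and looks for the space only after the '#', so an earlier space in the input does not affect the result.

```javascript
const raw = "hello#yourname hisname";
const start = raw.indexOf("#") + 1; // 6
const space = raw.indexOf(" ");     // 14

// substr(start, length): 13 characters starting at index 6.
console.log(raw.substr(start, space - 1)); // "yourname hisn"

// substring(start, end): from index 6 up to, but not including, index 14.
console.log(raw.substring(start, space));  // "yourname"

// Hypothetical helper: text after the first '#', up to the next space if any.
function extractName(input) {
  const hash = input.indexOf("#");
  if (hash === -1) return "";
  const rest = input.slice(hash + 1);
  const end = rest.indexOf(" ");
  return end === -1 ? rest : rest.slice(0, end);
}

console.log(extractName("hello#yourname"));         // "yourname"
console.log(extractName("hello#yourname hisname")); // "yourname"
```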

Check for empty string is failing in js

After splitting a set of glossary terms with:
lines = text.split(/[\r\n]+/);
I then iterate through the array and parse out each term to properly format them during output. However, a simple check for empty strings has become much more of a headache than I could've ever imagined. Console logging gives me this:
...
"Pushing" "dyspnea: Labored or difficult respiration." //correct
"Pushing" ""
...
Things I have tried in order to find these empty strings:
line === ""
line.length == 0
if(line)
isNaN(line.charCodeAt(0))
typeof line == "undefined"
And various combinations of the list above. On recommendation from a coworker, I checked the line endings of the input text, but it all seemed normal.
I'm sure I'm just doing something really stupid, but the solution has eluded me for far too long. Any help/suggestions would be greatly appreciated!
Edit:
Thanks for the suggestions everyone. Alas, the problem persists...
Also, I forgot to mention, but I have tried both trimming and replacing whitespace in each line after the split, but came up with nothing.
As requested, here is more relevant code.
var text = "";
var end = /\x2E\s\x5B/gm; // ". ["
var lines = [];
var terms = [];
text = document.getElementById("terms").value;
lines = text.split(/[\r\n]+/);
parseText();
function parseText() {
var i = 0;
while(i < lines.length) {
var line = lines[i];
endIndex = lines[i].search(end);
if(line != "" || line != " " || line.length != 0 ) {
parseTerm(lines[i].substring(0, endIndex+1));
}
i++;
}
As the previous answer stated, the issue is probably whitespace. You can use the trim function to shorten your code:
if (line.trim() == "") {
alert("Blank");
}
Maybe the string is not "", but " "?
So check not only for zero length, but also for whitespace:
if(st1 == "" || st1 == " " || st1.length == 0 ){
console.log("find empty")
}
Turns out that in my input there was a line with two spaces. I have NO idea why this was causing problems, considering the split was specifically on the pattern described above, but replacing instances of too much whitespace fixed the issue. The new line:
text.replace(/\s\s+/g, " ").split(/[\r\n]+/);
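Judging only from the excerpt posted in the question, one likely contributor is the condition itself: `line != "" || line != " " || line.length != 0` is true for every string, because no line can equal both "" and " " at once, so blank lines are never skipped. The sketch below shows this and a trim-based filter; `nonBlankLines` is a hypothetical helper name, not code from the thread.

```javascript
// The question's condition is true for every string, including blank ones.
const alwaysTrue = ["", " ", "term"].map(
  line => line != "" || line != " " || line.length != 0
);
console.log(alwaysTrue); // [ true, true, true ]

// Hypothetical helper: split on line breaks and keep only lines that still
// contain something after trimming whitespace.
function nonBlankLines(text) {
  return text
    .split(/[\r\n]+/)
    .filter(line => line.trim() !== "");
}

console.log(nonBlankLines("a: one\r\n  \r\nb: two\n")); // [ 'a: one', 'b: two' ]
```

The trailing newline also produces a final empty string after the split, which the same filter removes.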

Why doesnt this Javascript regular expression work

var existing = "";
if(disk.isLinux){
var valinvalid = "/usr" ;
var valinput = /^\/[a-zA-Z]{2,}/ ;
if(!valinput.match(valinvalid)){
return "^/" + existing + "[^/][a-zA-Z]{2,}[^/]$";
}
}
Here I'm trying to do the following in the first if condition, i.e. if(disk.isLinux):
1. there should be a minimum of 3 characters
2. the first character should be /
3. the entire input shouldn't match "/usr", but it can be /us or /usra
If you are just trying to test whether it matches, use test on the regexp:
/^\/[\w]{2,}/.test("/usr/"); //true
Is this what you are trying to do?
A couple of things:
1) vars should never ever ever be inside if statements
2) String.prototype.match exists; RegExp.prototype.match does not
But more importantly, you don't need a regex at all
if (
input.length < 3 ||
input.charAt(0) !== '/' ||
input === '/usr'
) {
throw new Error("I'm not happy with the input");
}
Try changing your code to use the following (the RegExp constructor takes the pattern without the surrounding slashes):
var valinput = new RegExp("^/[a-zA-Z]{2,}");
if(!valinput.test(valinvalid)){
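As a sketch of how the three rules from the question could be combined, here are two equivalent checks. The names `isValidPath` and `SINGLE_REGEX` are hypothetical, and the letter-only character class is carried over from the question's pattern rather than being a stated requirement.

```javascript
// Hypothetical validator: "/" followed by at least two letters (so three or
// more characters in total), and the whole input must not be exactly "/usr".
function isValidPath(input) {
  return /^\/[a-zA-Z]{2,}/.test(input) && input !== "/usr";
}

// The same rules in one pattern: the negative lookahead rejects the input
// only when "/usr" is immediately followed by the end of the string.
const SINGLE_REGEX = /^(?!\/usr$)\/[a-zA-Z]{2,}/;

console.log(isValidPath("/usr"));        // false
console.log(isValidPath("/us"));         // true
console.log(isValidPath("/usra"));       // true
console.log(SINGLE_REGEX.test("/usr"));  // false
console.log(SINGLE_REGEX.test("/usra")); // true
```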

Determine if string is in base64 using JavaScript

I'm using the window.atob('string') function to decode a string from base64 to a string. Now I wonder, is there any way to check that 'string' is actually valid base64? I would like to be notified if the string is not base64 so I can perform a different action.
If you want to check whether it can be decoded or not, you can simply try decoding it and see whether it failed:
try {
window.atob(str);
} catch(e) {
// something failed
// if you want to be specific and only catch the error which means
// the base 64 was invalid, then check for 'e.code === 5'.
// (because 'DOMException.INVALID_CHARACTER_ERR === 5')
}
Building on #anders-marzi-tornblad's answer, using the regex to make a simple true/false test for base64 validity is as easy as follows:
var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
base64regex.test("SomeStringObviouslyNotBase64Encoded..."); // FALSE
base64regex.test("U29tZVN0cmluZ09idmlvdXNseU5vdEJhc2U2NEVuY29kZWQ="); // TRUE
Update 2021
Following the comments below, it transpires that this regex-based solution provides a more accurate check than simply trying atob, because the latter doesn't check for =-padding. According to RFC 4648, =-padding may only be ignored for base16 encoding or if the data length is known implicitly.
The regex-based solution also seems to be the fastest, as hinted by kai. As jsperf seems flaky at the moment, I made a new test on jsbench, which confirms this.
This should do the trick.
function isBase64(str) {
if (str ==='' || str.trim() ===''){ return false; }
try {
return btoa(atob(str)) == str;
} catch (err) {
return false;
}
}
If "valid" means "only has base64 chars in it" then check against /^[A-Za-z0-9+/=]*$/.
If "valid" means a "legal" base64-encoded string then you should check for the = at the end.
If "valid" means it's something reasonable after decoding then it requires domain knowledge.
I would use a regular expression for that. Try this one:
/^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/
Explanation:
^ # Start of input
([0-9a-zA-Z+/]{4})* # Groups of 4 valid characters decode
# to 24 bits of data for each group
( # Either ending with:
([0-9a-zA-Z+/]{2}==) # two valid characters followed by ==
| # , or
([0-9a-zA-Z+/]{3}=) # three valid characters followed by =
)? # , or nothing
$ # End of input
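The breakdown above can be tried out with a small sketch. `BASE64_PATTERN` and `looksLikeBase64` are hypothetical names; the pattern is the same one, written with non-capturing groups, and the extra empty-string check is an assumption about what a caller would want.

```javascript
// Same pattern as explained above, with non-capturing groups.
const BASE64_PATTERN =
  /^(?:[0-9a-zA-Z+\/]{4})*(?:[0-9a-zA-Z+\/]{2}==|[0-9a-zA-Z+\/]{3}=)?$/;

// Hypothetical helper: a purely syntactic check. It cannot tell whether the
// decoded bytes are meaningful, only whether the string is well-formed.
function looksLikeBase64(str) {
  return typeof str === "string" && str.length > 0 && BASE64_PATTERN.test(str);
}

console.log(looksLikeBase64("SGVsbG8="));    // true  (padded, 8 characters)
console.log(looksLikeBase64("SGVsbG8"));     // false (padding missing)
console.log(looksLikeBase64("yyyyyyyy"));    // true  (well-formed, whatever it decodes to)
console.log(looksLikeBase64("not base64!")); // false
```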
This method attempts to decode, then encode, and compares the result to the original. It could also be combined with the other answers for environments that throw on parsing errors. It's also possible to have a string that looks like valid base64 from a regex point of view but is not actual base64.
if(btoa(atob(str))==str){
//...
}
This is how it's done in one of my favorite validation libs:
const notBase64 = /[^A-Z0-9+\/=]/i;
export default function isBase64(str) {
assertString(str); // remove this line and make sure you pass in a string
const len = str.length;
if (!len || len % 4 !== 0 || notBase64.test(str)) {
return false;
}
const firstPaddingChar = str.indexOf('=');
return firstPaddingChar === -1 ||
firstPaddingChar === len - 1 ||
(firstPaddingChar === len - 2 && str[len - 1] === '=');
}
https://github.com/chriso/validator.js/blob/master/src/lib/isBase64.js
For me, a string is likely an encoded base64 if:
its length is divisible by 4
uses A-Z a-z 0-9 +/=
only uses = in the end (0-2 chars)
so the code would be
function isBase64(str)
{
return str.length % 4 == 0 && /^[A-Za-z0-9+/]+[=]{0,2}$/.test(str);
}
Implementation in Node.js (validates not just the allowed characters but the base64 string as a whole):
const validateBase64 = function(encoded1) {
  // decode to raw bytes, re-encode, and compare with the input;
  // going through a utf8 string in between would corrupt non-text data
  var encoded2 = Buffer.from(encoded1, 'base64').toString('base64');
  return encoded1 == encoded2;
}
I have tried the below answers but there are some issues.
var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
base64regex.test(value)
When using this, it returns true for capital letters such as "BBBB", and it also returns true for "4444".
I added some code to work correctly for me.
function (value) {
  var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
  // also reject purely numeric and purely alphabetic values
  if (base64regex.test(value) && isNaN(value) && !/^[a-zA-Z]+$/.test(value)) {
    return decodeURIComponent(escape(window.atob(value)));
  }
}
Throwing my results into the fray here.
In my case, there was a string that was not base64 but was valid base64 so it was getting decoded into gibberish. (i.e. yyyyyyyy is valid base64 according to the usual regex)
My testing resulted in checking first whether the string was a valid base64 string, using the regex others shared here, and then decoding it and testing whether the result was a valid ASCII string, since (in my case) I should only get ASCII characters back. (This can probably be extended to include other characters that may not fall into the ASCII range.)
This is a bit of a mix of multiple answers.
let base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
function isBase64(str) {
if (str ==='' || str.trim() ===''){ return false; }
try {
if (base64regex.test(str)) {
return /^[\x00-\x7F]*$/.test(atob(str));
} else {
return false
}
} catch (err) {
// atob threw, so the string is not decodable base64
return false;
}
}
As always with my JavaScript answers, I have no idea what I am doing, so there might be a better way to write this out. But it works for my needs and covers the case where you have a string that isn't supposed to be base64 but is valid and still decodes as base64.
I know it's late, but I tried to make it simple here:
function isBase64(encodedString) {
var regexBase64 = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
return regexBase64.test(encodedString); // return TRUE if its base64 string.
}
