What does this JS do? - javascript

var passwordArray = pwd.replace(/\s+/g, '').split(/\s*/);
I found the above line of code in a rather poorly documented JavaScript file, and I don't know exactly what it does. I think it splits a string into an array of characters, similar to PHP's str_split. Am I correct, and if so, is there a better way of doing this?

It removes all whitespace from the password and then splits the result into an array of characters.
Converting a string into an array of characters is somewhat redundant, because you can already access the characters of a string through bracket notation (though not in older versions of IE) or through the string method charAt:
var a = "abcdefg";
alert(a[3]);//"d"
alert(a.charAt(1));//"b"

It does the same as: pwd.split(/\s*/).
pwd.replace(/\s+/g, '').split(/\s*/) removes all whitespace (tabs, spaces, line feeds, carriage returns, etc.) and splits the remainder (the string returned by the replace operation) into an array of characters. The split(/\s*/) pattern is odd and redundant here, because there can't be any whitespace (\s) left once replace has run; it only works because \s* also matches the empty string between characters.
Hence pwd.split(/\s*/) should be sufficient. So:
'hello cruel\nworld\t how are you?'.split(/\s*/)
// prints in alert: h,e,l,l,o,c,r,u,e,l,w,o,r,l,d,h,o,w,a,r,e,y,o,u,?
as will
'hello cruel\nworld\t how are you?'.replace(/\s+/g, '').split(/\s*/)
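One caveat to the equivalence claimed above: when the string begins or ends with whitespace, split(/\s*/) on its own leaves an empty string at that end of the array, whereas stripping the whitespace first does not. A minimal sketch, with a made-up sample string and variable names; the spread form at the end is one possible alternative in ES2015+ environments, not something the original file uses:

```javascript
// Hypothetical sample input with leading and trailing whitespace.
const pwd = ' ab c ';

// split alone: whitespace at either end yields an empty string there.
const splitOnly = pwd.split(/\s*/);

// Original approach: strip whitespace first, then split into characters.
const stripThenSplit = pwd.replace(/\s+/g, '').split(/\s*/);

// ES2015+ alternative: spread the stripped string into an array.
const spread = [...pwd.replace(/\s+/g, '')];

console.log(splitOnly);      // [ '', 'a', 'b', 'c', '' ]
console.log(stripThenSplit); // [ 'a', 'b', 'c' ]
console.log(spread);         // [ 'a', 'b', 'c' ]
```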

The replace portion removes all whitespace from the password. The \s+ atom matches a run of one or more whitespace characters. The g flag makes the pattern match every such run, and each one is replaced with an empty string.

Related

Best practice for converting string to object in JavaScript

I am working on a small UI for JSON editing which includes some object and string manipulation. I was able to make it work, but one of the fields is a bit tricky and I would be grateful for some advice.
Initial string:
'localhost=3000,password=12345,ssl=True,isAdmin=False'
Should be converted to this:
{ app_server: 'localhost:3000', app_password:'12345', app_ssl: 'True', app_isAdmin: 'False' }
I was able to do that by first splitting the string on ',', which returns an array. Then I loop through that array and split each item on '=', giving a second array of pairs. In the last step I simply use forEach to loop through that second array and create an object:
const obj = {}
arr2.forEach((item) => (obj[`app_${item[0]}`] = item[1]));
This approach works, but if some of the fields, e.g. the password, contain ',' or '=', my code will break. Any idea how to approach this? Would some advanced regex be a good idea?
Edit: In trying to keep things simple, it seems I have caused the opposite effect, so I apologize for that.
The string mentioned is part of a larger JSON file; it is one of the values. At a high level, I am changing the shape of the object: every value that has the structure I described, 'server=something, password=1234, ssl=True', has to be transformed into separate values which will populate the input fields. After that, the user can modify them or simply download the file (I have separate logic for joining the input fields back into the initial shape).
Observations/limitations of your current design:
As per your comment, none of the special characters is escaped in any way, so how should the string password=12345,ssl=True be read? Is it app_password: 12345,ssl=True or app_password: 12345?
Why is localhost=3000 converted into app_server: 'localhost:3000' instead of app_localhost: '3000' like the other keys? Is there a special requirement for this?
You have to design your password field so that it at least does not accept the , character, which is used to split the string.
Here you go, assuming the design observations above can be addressed:
const str = 'localhost=3000,password=123=45,ssl=True,isAdmin=False';
const splittedStr = str.split(',');
const result = {};
splittedStr.forEach(s => {
    // Split on '=' and rejoin everything after the first one, so values may contain '='
    const [key, ...values] = s.split('=');
    const value = values.join('=');
    result[`app_${key}`] = value;
});
console.log(result);
As you can see in the above code snippet, I set the password value to 123=45 and it is parsed correctly, as required.
You can use a regular expression that matches key and value in the key=value format, and will capture anything between single quotes when the value happens to start with a single quote:
(\w+)=(?:'(.*?)'(?![^,])|([^,]+))
This assumes that:
The key consists of alphanumerical characters and underscores only
There is no white space around the = (any space that follows it, is considered part of the value)
If the value starts with a single quote, it is considered a delimiter for the whole value, which will be terminated by another quote that must be followed by a comma, or must be the last character in the string.
If the value is not quoted, all characters up to the next comma or end of the string will be part of the value.
As you've explained that the first part does not follow the key=value pattern, but is just a value, we need to deal with this exception. I suggest prefixing the string with server=, so that now also that first part has the key=value pattern.
Furthermore, as this input is part of a value that occurs in JSON, it should be parsed as a JSON string (double quoted), in order to decode any escaped characters that might occur in it, like for instance \n (backslash followed by "n").
Since it was not clarified how quotes would be escaped when they occur in a quoted string, it remains undecided how for instance a password (or any text field) can include a quote. The above regex will require that if there is a character after a quote that is not a comma, the quote will be considered part of the value, as opposed to terminating the string. But this is just shifting the problem, as now it is impossible to encode the sequence ', in a quoted field. If ever this point is clarified, the regex can be adapted accordingly.
Implementation in JavaScript:
const regex = /(\w+)=(?:'(.*?)'(?![^,])|([^,]+))/gs; // s flag: . also matches newlines inside quoted values
function parse(s) {
    // Prefix with "server=" so the first part also follows the key=value pattern,
    // and parse as a JSON string to decode escapes such as \n.
    return Object.fromEntries(Array.from(JSON.parse('"server=' + s + '"').matchAll(regex),
        ([_, key, quoted, value]) => ["app_" + key, quoted ?? (isNaN(value) ? value : +value)]
    ));
}
// demo:
// The password here includes a single quote and a JSON-encoded newline character
const s = "localhost:3000, password='12'\\n345', ssl='True', isAdmin='False'";
console.log(parse(s));

regex with replace() for letters only

I have a string that outputs
20153 Risk
What I am trying to achieve is to get only the letters. I have already managed to get only the numbers, using this regular expression:
const cf_regex_number = cf_input.replace(/\D/g, '');
This returns only 20153. But as soon as I tried to get only the letters, it returned the whole string instead of Risk. I have done my research, and the regular expression to get only letters is supposedly /^[a-zA-Z]*$/
This is the line of code I tried in order to get only the letters:
const cf_regex_character = cf_input.replace(/^[a-zA-Z]*$/,'')
But instead of returning Risk, it returns 20153 Risk, which is the whole string.
/[^a-z]+/i
The [ brackets ] define a character class; here it contains the range a to z.
Because the i flag makes the match case-insensitive, that includes A to Z as well.
The caret ^ negates the class: it matches any character not in the specified range.
And the + means keep adding characters to the match as long as they are outside that range.
Then stop matching.
In effect this matches everything up to and including the space in 20153 Risk.
Then you replace this match with the empty string '' and what you've got left is Risk.
const string = '20153 Risk';
const result = string.replace(/[^a-z]+/i, '');
console.log(result);
Your first pattern is locating every non-digit and replacing it with nothing.
On the other hand, your second pattern locates just the first occurrence of a pattern, and the pattern looks for the start of the string, followed by letters, followed by the end of the string. There is no such sequence: if you start from the start of the string, there are exactly zero letters, and then you are left very far from the expected end of the string. Even if that worked, you would be deleting letters, not non-letters.
This pattern is parallel to your first one (delete every occurrence of a non-letter):
const cf_regex_character = cf_input.replace(/[^a-zA-Z]/g,'')
but possibly a better way to go is to extract the desired substring, instead of deleting everything that it is not:
const letters = cf_input.match(/[a-z]+/i)[0];
const numbers = cf_input.match(/\d+/)[0];
(This is if you know there is such a substring; if you are unsure it would be better to code a bit more defensively.)
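A more defensive version might look like the sketch below. String.prototype.match returns null when nothing matches, so indexing [0] directly would throw; firstMatch is a hypothetical helper name introduced here for illustration, and returning an empty string as the fallback is an assumption about what the caller wants:

```javascript
// Return the first match of a pattern, or a fallback when nothing matches.
// firstMatch is a hypothetical helper name, not part of any library.
function firstMatch(input, pattern, fallback = '') {
  const found = input.match(pattern); // null when there is no match
  return found ? found[0] : fallback;
}

const cf_input = '20153 Risk';
console.log(firstMatch(cf_input, /[a-z]+/i)); // "Risk"
console.log(firstMatch(cf_input, /\d+/));     // "20153"
console.log(firstMatch('20153', /[a-z]+/i));  // "" (no letters present)
```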
const cf_input = "20153 Risk";
const cf_regex_character = cf_input.replace(/\d+\s/, '');
console.log(cf_regex_character); // Risk
const str = "20153 Risk";
const reg = /[a-z]+/gi;
const res = str.match(reg);
console.log(res[0]); // Risk

Remove all ANSI colors/styles from strings

I use a library that adds ANSI colors / styles to strings. For example:
> "Hello World".rgb(255, 255, 255)
'\u001b[38;5;231mHello World\u001b[0m'
> "Hello World".rgb(255, 255, 255).bold()
'\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m'
When I do:
console.log('\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m')
a "Hello World" white and bold message will be output.
Having a string like '\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m' how can these elements be removed?
foo('\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m') //=> "Hello World"
Maybe a good regular expression? Or is there any built-in feature?
The workaround I was thinking of was to create a child process:
require("child_process")
.exec("node -pe \"console.error('\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m')\""
, function (err, stderr, stdout) { console.log(stdout);
});
But the output is the same...
The regex you should be using is
/[\u001b\u009b][[()#;?]*(?:[0-9]{1,4}(?:;[0-9]{0,4})*)?[0-9A-ORZcf-nqry=><]/g
This matches most of the ANSI escape codes, beyond just colors, including the extended VT100 codes, archaic/proprietary printer codes, etc.
Note that the \u001b in the above regex may not work for your particular library (even though it should); check out my answer to a similar question regarding acceptable escape characters if it doesn't.
If you don't like regexes, you can always use the strip-ansi package.
For instance, the string jumpUpAndRed below contains ANSI codes for jumping to the previous line, writing some red text, and then going back to the beginning of the next line, some of which require suffixes other than m.
var jumpUpAndRed = "\x1b[F\x1b[31;1mHello, there!\x1b[m\x1b[E";
var justText = jumpUpAndRed.replace(
/[\u001b\u009b][[()#;?]*(?:[0-9]{1,4}(?:;[0-9]{0,4})*)?[0-9A-ORZcf-nqry=><]/g, '');
console.log(justText);
The escape character is \u001b, and the sequence from [ until the first m is encountered is the styling. You just need to remove that. So, replace globally using the following pattern:
/\u001b\[.*?m/g
Thus,
'\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m'.replace(/\u001b\[.*?m/g, '')
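This simpler pattern relies on every sequence ending in m, which holds for color and style codes but not for other ANSI sequences. The sketch below illustrates the difference; the ESC[2K (erase line) input is a made-up example, not output from the library in the question:

```javascript
const simple = /\u001b\[.*?m/g;

// Color/style sequences all end in "m", so they are removed cleanly.
const styled = '\u001b[1m\u001b[38;5;231mHello World\u001b[0m\u001b[22m';
console.log(styled.replace(simple, '')); // Hello World

// ESC[2K ends in "K"; the lazy match runs on to the next "m"
// and swallows ordinary text along the way.
const erased = '\u001b[2Kmessage';
console.log(erased.replace(simple, '')); // essage
```

If the input may contain sequences other than colors and styles, a broader pattern or a dedicated package is the safer choice.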
The colors are in the ESC[39m format; the shortest regexp for them is /\u001b[^m]*?m/g
Where \u001b is the ESC character,
[^m]*? is any character(s) till m (not greedy pattern),
the m itself, and /g for global (all) replace.
Example:
var line="\x1B[90m2021-02-03 09:35:50.323\x1B[39m\t\x1B[97mFinding: \x1B[39m\x1B[97m»\x1B[39m\x1B[33m42125121242\x1B[39m\x1B[97m«\x1B[39m\x1B[0m\x1B[0m\t\x1B[92mOK\x1B[39m";
console.log(line.replace(/\u001b[^m]*?m/g,""));
// -> 2021-02-03 09:35:50.323 Finding: »42125121242« OK ( without colors )
console.log(line);
// -> 2021-02-03 09:35:50.323 Finding: »42125121242« OK ( colored )

javascript url-safe filename-safe string

Looking for a regex/replace function to take a user inputted string say, "John Smith's Cool Page" and return a filename/url safe string like "john_smith_s_cool_page.html", or something to that extent.
Well, here's one that replaces anything that's not a letter or a number, and makes it all lower case, like your example.
var s = "John Smith's Cool Page";
var filename = s.replace(/[^a-z0-9]/gi, '_').toLowerCase();
Explanation:
The regular expression is /[^a-z0-9]/gi. Well, actually the gi at the end is just a set of options that are used when the expression is used.
i means "ignore upper/lower case differences"
g means "global", which really means that every match should be replaced, not just the first one.
So what we're really looking at is just [^a-z0-9]. Let's read it step by step:
The [ and ] define a "character class", which is a list of single-characters. If you'd write [one], then that would match either 'o' or 'n' or 'e'.
However, there's a ^ at the start of the list of characters. That means it should match only characters not in the list.
Finally, the list of characters is a-z0-9. Read this as "a through z and 0 through 9". It's a short way of writing abcdefghijklmnopqrstuvwxyz0123456789.
So basically, what the regular expression says is: "Find every character that is not between 'a' and 'z' or between '0' and '9'".
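To get closer to the "john_smith_s_cool_page.html" output from the question, the same character class can be applied to runs of characters and the extension appended afterwards. This is only a sketch: toSafeFilename and its extension parameter are hypothetical names, and collapsing runs and trimming underscores at the ends are assumptions about the desired result.

```javascript
// toSafeFilename and its extension parameter are hypothetical names.
function toSafeFilename(title, extension = '.html') {
  const base = title
    .replace(/[^a-z0-9]+/gi, '_') // one underscore per run of unsafe chars
    .replace(/^_+|_+$/g, '')      // drop underscores at either end
    .toLowerCase();
  return base + extension;
}

console.log(toSafeFilename("John Smith's Cool Page")); // john_smith_s_cool_page.html
console.log(toSafeFilename('  Hello,   World!  '));    // hello_world.html
```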
I know the original poster asked for a simple Regular Expression, however, there is more involved in sanitizing filenames, including filename length, reserved filenames, and, of course reserved characters.
Take a look at the code in node-sanitize-filename for a more robust solution.
For more flexible and robust handling of Unicode characters etc., you could use the slugify package in conjunction with a regex that removes unsafe URL characters:
const urlSafeFilename = slugify(filename, { remove: /["<>#%\{\}\|\\\^~\[\]`;\?:=&]/g });
This produces nice kebab-case filenames in your URL and allows for more characters outside the a-z0-9 range.
Here's what I did. It works to convert full sentences into a decently clean URL.
First it trims the string, then it converts spaces to dashes (-), lowercases everything, and finally gets rid of anything that's not a letter, number, or dash:
function slugify(title) {
    return title
        .trim()
        .replace(/ +/g, '-')
        .toLowerCase()
        .replace(/[^a-z0-9-]/g, '');
}
slug.value = slugify(text.value);
text.oninput = () => { slug.value = slugify(text.value); };
<input id="text" value="Foo: the old #Foobîdoo!! " style="font-size:1.2em">
<input id="slug" readonly style="font-size:1.2em">
I think your requirement is to replace whitespace and the apostrophe in 's with _, and to append .html at the end. Try to find a regex that does this.
See:
http://www.regular-expressions.info/javascriptexample.html

Split string by HTML entities?

My string contains a lot of HTML entities, like this:
&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;
And I want to split it by HTML entities into this:
Hello
everybody
there
Can anybody suggest me a way to do this please? May be using Regex?
It looks like you can just split on the regex &[^;]*;. That is, the delimiters are strings that start with &, end with ;, and in between can contain anything but ;.
If you can have multiple delimiters in a row, and you don't want the empty strings between them, just use (&[^;]*;)+ (or, in general, a (delim)+ pattern).
If you can have delimiters at the beginning or end of the string, and you don't want the empty strings caused by them, then just trim them away before you split.
Example
Here's a snippet to demonstrate the above ideas (see also on ideone.com):
var s = "&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;";
print (s.split(/&[^;]*;/));
// ,Hello,,everybody,,there,
print (s.split(/(?:&[^;]*;)+/));
// ,Hello,everybody,there,
print (
s.replace(/^(?:&[^;]*;)+/, "")
.replace(/(?:&[^;]*;)+$/, "")
.split(/(?:&[^;]*;)+/)
);
// Hello,everybody,there
var a = str.split(/\&[#a-z0-9]+\;/); should do it, although you'll end up with empty slots in the array when you have two entities next to each other.
split(/&.*?;(?=[^&]|$)/)
and discard the first and last results:
["", "Hello", "everybody", "there", ""]
>> "&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;".split(/(?:&[^;]+;)+/)
['', 'Hello', 'everybody', 'there', '']
The regex is: /(?:&[^;]+;)+/
Matches entities as & followed by 1+ non-; characters, followed by a ;. Then matches at least one of those (or more) as the split delimiter. The (?:expression) non-capturing syntax is used so that the delimiters captured don't get put into the result array (split() puts capture groups into the result array if they appear in the pattern).
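The empty strings at either end can also be dropped after the split instead of trimming beforehand. A minimal sketch, assuming empty parts are never wanted; splitOnEntities is a hypothetical helper name and the sample string is illustrative:

```javascript
// splitOnEntities is a hypothetical helper name.
function splitOnEntities(text) {
  return text
    .split(/(?:&[^;]+;)+/)        // runs of entities act as one delimiter
    .filter(part => part !== ''); // drop empty strings at the ends
}

const s = '&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;';
console.log(splitOnEntities(s)); // [ 'Hello', 'everybody', 'there' ]
```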
