Best practice for converting string to object in JavaScript - javascript

I am working on a small UI for JSON editing which includes some object and string manipulation. I was able to make it work, but one of the fields is a bit tricky and I would be grateful for some advice.
Initial string:
'localhost=3000,password=12345,ssl=True,isAdmin=False'
Should be converted to this:
{ app_server: 'localhost:3000', app_password:'12345', app_ssl: 'True', app_isAdmin: 'False' }
I was able to do that by first splitting the string on ',', which returns an array. Then I loop through that array and split each item on '=', which gives a second array of [key, value] pairs. In the last step I simply use forEach to loop through that second array and create an object:
const obj = {}
arr2.forEach((item) => (obj[`app_${item[0]}`] = item[1]));
This approach works, but if one of the fields (e.g. the password) contains ',' or '=', my code will break. Any idea on how to approach this? Would some advanced regex be a good idea?
Edit: In trying to keep things simple, it seems I have caused the opposite effect, so I apologize for that.
The string in question is part of a larger JSON file; it is one of the values. At a high level, I am changing the shape of the object: every value that has the structure I described, 'server=something, password=1234, ssl=True', has to be split into separate values which will populate the input fields. After that, the user can modify them or simply download the file (I have separate logic for joining the input fields back into the initial shape).

Observations/limitations of your current design:
As per your comment, none of the special characters are escaped in any way. How, then, should the string password=12345,ssl=True be read? As app_password: '12345,ssl=True' or as app_password: '12345'?
Why is localhost=3000 converted into app_server: 'localhost:3000' instead of app_localhost: '3000', like the other keys? Is there a special requirement for this?
You have to design your password field so that it rejects at least the , character, which is used to split the string.
Here you go, assuming the design issues above are addressed:
const str = 'localhost=3000,password=123=45,ssl=True,isAdmin=False';
const splittedStr = str.split(',');
const result = {};
splittedStr.forEach(s => {
const [key, ...values] = s.split('=')
const value = values.join('=');
result[`app_${key}`] = value
});
console.log(result);
As you can see in the snippet above, I set the password value to 123=45, and it is parsed correctly: only the first = in each pair separates the key from the value.
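The same idea can be wrapped in a reusable function. This is only a sketch: the name parsePairs, the prefix parameter and the decision to skip fragments without an = are my own assumptions, not part of the question. It cuts each pair at the first = with indexOf instead of split/join.

```javascript
// Hypothetical helper: cuts each "key=value" pair at the FIRST '=' only,
// so any further '=' characters stay inside the value.
function parsePairs(str, prefix = 'app_') {
  const result = {};
  for (const pair of str.split(',')) {
    const idx = pair.indexOf('=');
    if (idx === -1) continue; // assumption: ignore fragments that have no '='
    const key = pair.slice(0, idx).trim();
    result[prefix + key] = pair.slice(idx + 1);
  }
  return result;
}

console.log(parsePairs('localhost=3000,password=123=45,ssl=True,isAdmin=False'));
// { app_localhost: '3000', app_password: '123=45', app_ssl: 'True', app_isAdmin: 'False' }
```

A comma inside a value still breaks this, for the reason given in the observations above.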

You can use a regular expression that matches key and value in the key=value format, and will capture anything between single quotes when the value happens to start with a single quote:
(\w+)=(?:'(.*?)'(?![^,])|([^,]+))
This assumes that:
The key consists of alphanumerical characters and underscores only
There is no white space around the = (any space that follows it, is considered part of the value)
If the value starts with a single quote, it is considered a delimiter for the whole value, which will be terminated by another quote that must be followed by a comma, or must be the last character in the string.
If the value is not quoted, all characters up to the next comma or end of the string will be part of the value.
As you've explained that the first part does not follow the key=value pattern, but is just a value, we need to deal with this exception. I suggest prefixing the string with server=, so that now also that first part has the key=value pattern.
Furthermore, as this input is part of a value that occurs in JSON, it should be parsed as a JSON string (double quoted), in order to decode any escaped characters that might occur in it, like for instance \n (backslash followed by "n").
Since it was not clarified how quotes would be escaped when they occur in a quoted string, it remains undecided how for instance a password (or any text field) can include a quote. The above regex will require that if there is a character after a quote that is not a comma, the quote will be considered part of the value, as opposed to terminating the string. But this is just shifting the problem, as now it is impossible to encode the sequence ', in a quoted field. If ever this point is clarified, the regex can be adapted accordingly.
Implementation in JavaScript:
const regex = /(\w+)=(?:'(.*?)'(?![^,])|([^,]+))/gs; // s flag: lets . match the decoded newline in the demo below
function parse(s) {
return Object.fromEntries(Array.from(JSON.parse('"server=' + s + '"').matchAll(regex),
([_, key, quoted, value]) => ["app_" + key, quoted ?? (isNaN(value) ? value : +value)]
));
}
// demo:
// Password includes here a single quote and a JSON encoded newline character
const s = "localhost:3000, password='12'\\n345', ssl='True', isAdmin='False'";
console.log(parse(s));
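To make the type handling visible, here is a smaller sketch of the same pattern without the JSON decoding and without the server= prefix. The names pairRegex and parsePairsTyped are made up for this illustration. An unquoted value that looks numeric is converted to a number, while a quoted value always stays a string.

```javascript
// Same pattern as above; the s flag lets '.' also match newlines.
const pairRegex = /(\w+)=(?:'(.*?)'(?![^,])|([^,]+))/gs;

// Hypothetical helper: quoted values stay strings,
// unquoted numeric-looking values become numbers.
function parsePairsTyped(s) {
  return Object.fromEntries(Array.from(s.matchAll(pairRegex),
    ([, key, quoted, value]) => [key, quoted ?? (isNaN(value) ? value : +value)]
  ));
}

console.log(parsePairsTyped("port=3000,password='12345',ssl=True"));
// { port: 3000, password: '12345', ssl: 'True' }
```

So if a password made only of digits must remain a string, it has to be quoted in the input.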

Related

How to split a string by one delimiter but having a particular format as described below

I have a string as:
const str = 'My [Link format](https://google.com) demo'
I want the word array to be like:
['My', '[Link format](https://google.com)', 'demo']
What to do in javascript?
I was trying using split() and str.match(). Nothing worked yet.
This is a simple split with whitespace as the delimiter, but we use negative lookaheads to skip whitespace that sits inside a pair of square brackets [] or round brackets ().
const str = 'My [Link format](https://google.com) demo'
console.log(str.split(/\s+(?![^\[]*\])(?![^\(]*\))/));
We also allow for spaces in the URL portion, even though it has a low chance of having spaces, it could still happen
Try it here: https://jsfiddle.net/m4q6e9x7/
["My", "[Link format](https://google.com)", "demo"]
In the fiddle I've tried to show the two separate negative lookaheads, one for each type of bracket (I've put a space inside the round brackets to prove the concept):
const str = 'My [Link format](http s://google.com) demo'
ignore space between []
console.log(str.split(/\s+(?![^\[]*\])/));
["My", "[Link format](http", "s://google.com)", "demo"]
ignore space between ()
console.log(str.split(/\s+(?![^\(]*\))/));
["My", "[Link", "format](http s://google.com)", "demo"]
So we can easily combine the two criteria because we need both of them to not match.
Because [] and () need to be escaped, it might be easier to see the regex if we modify and test for spaces between braces {}
const str = 'My {Link format}(https://google.com) demo'
console.log(str.split(/\s+(?![^{]*})/));
["My", "{Link format}(https://google.com)", "demo"]
Both solutions assume that the string is well formed (basically: no space between ']' and '(', no ']' characters inside [...], and similar intuitions). You didn't really provide information about what the input string can be other than your concrete example, so the solutions work well in this and very similar cases. The second is very easily modified as needed; the first is easily extended to check whether the string is in fact well formed.
Solution using Regular Expressions
Below code finds everything before first '[', everything in '[...](...)' pattern (note: first ... must not contain ']', and second – ')', but I assume this would make for an incorrect input in the first place), and everything after that.
So
let regex = /(.*)(\[.*\]\(.*\))(.*)/
let res = str.match(regex).splice(1,3)
gives res as
['My ', '[Link format](https://google.com)', ' demo']
From there, you can trim every entry in this array ('My ' => 'My') for example using a trim function like so:
res.map((val) => val.trim());
See the documentation of String.prototype.match for what the returned array represents; in short, index 0 holds the whole match, and the following indices hold the capture groups, meaning the parts of the string corresponding to the parts of the regex surrounded by parentheses.
If you are not familiar with regular expressions (regexes) in JS, or at all, you will easily find many online resources on the topic. After grasping the basics, regex101 is a nice tool to experiment with regexes and explore their capabilities. When using it, you should probably choose the ECMAScript (JavaScript) flavor from the menu on the left.
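Putting the match and the trim together, a guarded version could look like the sketch below. The function name splitAroundLink and the fallback for strings without a link are my own assumptions. Note that .match() returns null when nothing matches, so calling .splice() directly on its result would throw for such input.

```javascript
// Hypothetical wrapper around the regex above, with a guard for "no match".
function splitAroundLink(str) {
  const m = str.match(/(.*)(\[.*\]\(.*\))(.*)/);
  if (m === null) return [str.trim()]; // assumption: no link means one single entry
  return m.slice(1, 4)                 // the three capture groups
    .map(part => part.trim())
    .filter(part => part !== '');      // drop empty text before/after the link
}

console.log(splitAroundLink('My [Link format](https://google.com) demo'));
// [ 'My', '[Link format](https://google.com)', 'demo' ]
```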
Equivalent solution without regex
An equivalent solution is to find the first '[' manually, as well as where the '[...](...)' pattern ends. Then slice the parts (before '[', the pattern, and after the pattern) out of the string, and probably trim them. So just loop over the characters of the string in search of '[' and then ']', '(', ')'. Note that in this case you can easily decide, at a fine-grained level, what to do if the string has an unexpected/incorrect form.
TODO: I will probably sketch some code when I have time for it
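A rough sketch of that regex-free idea, using indexOf instead of a hand-written character loop (the name splitAroundLinkManually and the fallback behaviour are assumptions made for illustration):

```javascript
// Hypothetical regex-free variant: locate '[', then '](', then ')'.
function splitAroundLinkManually(str) {
  const open = str.indexOf('[');
  const mid = open === -1 ? -1 : str.indexOf('](', open);
  const close = mid === -1 ? -1 : str.indexOf(')', mid);
  if (close === -1) return [str.trim()]; // assumption: malformed input stays one entry
  return [str.slice(0, open), str.slice(open, close + 1), str.slice(close + 1)]
    .map(part => part.trim())
    .filter(part => part !== '');
}

console.log(splitAroundLinkManually('My [Link format](https://google.com) demo'));
// [ 'My', '[Link format](https://google.com)', 'demo' ]
```

Each of the three lookups is a natural place to add stricter checks or error reporting if the input can be malformed.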
Regex is your friend!
const regexMdLinks = /!?\[([^\]]*)\]\(([^\)]+)\)/gm
// Example md file contents
const str = `My [Link format](https://google.com) demo My [Link format2](https://google.com/2) demo2`
let regex_splitted = str.split(regexMdLinks);
let arr = [];
//1. Item will be the text (or empty text)
//2. Item is the link text
//3. Item is the url
for(let i = 0; i < regex_splitted.length; i++){
if(i % 3 == 0){ //Split normal text
arr.push(...regex_splitted[i].split(" ").filter(i => i));
} else if(i % 3 == 1){//Add brackets around link text
arr.push("["+regex_splitted[i]+"]");
} else {
arr.push("("+regex_splitted[i]+")");
}
}
console.log(arr)
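Note that the loop above pushes the link text and the URL as separate entries. If each link should stay one entry, as in the question, one possible variation is to wrap the whole link in a single capture group, so that split() keeps it as one piece. The names linkToken and tokenizeMarkdown are invented for this sketch.

```javascript
// One capture group around the whole link: split() then returns
// plain text at even indices and complete links at odd indices.
const linkToken = /(!?\[[^\]]*\]\([^)]+\))/;

function tokenizeMarkdown(str) {
  return str.split(linkToken).flatMap((part, i) =>
    i % 2 === 1
      ? [part]                               // a complete [text](url) link
      : part.split(/\s+/).filter(Boolean));  // ordinary words
}

console.log(tokenizeMarkdown('My [Link format](https://google.com) demo'));
// [ 'My', '[Link format](https://google.com)', 'demo' ]
```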

Check whether string contains other than specific word

I need to check whether a string contains anything other than the specified words/sentences (JavaScript). It should return true if:
it contains any letters other than this phrase: ANOTHER CMD
it contains anything other than the specified number sequences, for example: ["8809 8805", "8806 8807"] (the numbers are examples; I should be able to test the string against any array of numbers)
Thank you!
Yes, you can remove every allowed entry from the string and check whether anything is left:
const arr = ["ANOTHER CMD","8809 8805", "8806 8807"]
const okContent = str => {
arr.forEach(entry => str = str.replaceAll(entry,""))
return str.trim()==="";
};
console.log(okContent('Has other stuff than ANOTHER CMD and 8809 8805'))
console.log(okContent('8809 8805 ANOTHER CMD 8809 8805'))
I don't know if it's the correct way of doing it, but this worked for me:
replace all the valid words with a blank (using replace)
check whether the string is left empty
if it's empty, the string does not contain any unwanted content (to ignore leftover spaces you could use the trim method)
You can try a regex!
Use your array of strings as the '|'-separated alternatives of the regex
and test the given line against it. If a match is present, negate the output.
const regex = /(ANOTHER CMD|8809 8805|8806 8807)/i
console.log(!regex.test('Should not contain word ANOTHER CMD'))
console.log(regex.test('Should contain word ANOTHER CMD'))
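Since the allowed phrases should come from an arbitrary array, the regex can also be built at runtime. The sketch below is one way to do that, not the only one: containsOther and escapeRegExp are hypothetical names, and the pattern accepts a string only if it consists entirely of allowed phrases separated by optional whitespace.

```javascript
// Escape characters that have a special meaning inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: true if str contains anything besides the allowed phrases.
function containsOther(str, allowed) {
  const alternatives = allowed.map(escapeRegExp).join('|');
  const onlyAllowed = new RegExp(`^(?:\\s*(?:${alternatives}))*\\s*$`);
  return !onlyAllowed.test(str);
}

const allowedPhrases = ['ANOTHER CMD', '8809 8805', '8806 8807'];
console.log(containsOther('8809 8805 ANOTHER CMD 8806 8807', allowedPhrases)); // false
console.log(containsOther('Has other stuff than ANOTHER CMD', allowedPhrases)); // true
```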

regex with replace() for letters only

I have a string that outputs
20153 Risk
What I am trying to achieve is getting only the letters. I have managed to get only the numbers using this regular expression:
const cf_regex_number = cf_input.replace(/\D/g, '');
This returns only 20153. But as soon as I tried to get only the letters, it returns the whole string instead of Risk. I have done my research, and the regular expression for matching only letters is /^[a-zA-Z]*$/
This is the line of code I tried to get only the letters:
const cf_regex_character = cf_input.replace(/^[a-zA-Z]*$/,'')
but instead of returning Risk, it returns 20153 Risk, which is the whole string.
/[^a-z]+/i
The [ brackets ] define a character class; here it is the range a to z.
The i flag means case-insensitive, so that includes A to Z as well.
The caret ^ at the start of the class inverts it; it means anything not in the specified range.
And the + means keep adding characters to the match as long as they are outside that range.
Then stop matching.
In effect this matches everything up to and including the space in 20153 Risk.
Then you replace this match with the empty string '' and what you've got left is Risk.
const string = '20153 Risk';
const result = string.replace(/[^a-z]+/i, '');
console.log(result);
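One thing to keep in mind: without the g flag only the first run of non-letters is removed. A small comparison on a made-up input (the string '20153 Risk 42' is not from the question):

```javascript
const sampleInput = '20153 Risk 42';

// No g flag: only the first run of non-letters ('20153 ') is replaced.
const firstRunOnly = sampleInput.replace(/[^a-z]+/i, '');
// With g: every run of non-letters is replaced.
const allRuns = sampleInput.replace(/[^a-z]+/gi, '');

console.log(firstRunOnly); // 'Risk 42'
console.log(allRuns);      // 'Risk'
```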
Your first pattern is locating every non-digit and replacing it with nothing.
On the other hand, your second pattern locates just the first occurrence of a pattern, and the pattern looks for the start of the string, followed by letters, followed by the end of the string. There is no such sequence: if you start from the start of the string, there are exactly zero letters, and then you are left very far from the expected end of the string. Even if that worked, you would be deleting letters, not non-letters.
This pattern is parallel to your first one (delete every occurrence of a non-letter):
const cf_regex_character = cf_input.replace(/[^a-zA-Z]/g,'')
but possibly a better way to go is to extract the desired substring, instead of deleting everything that it is not:
const letters = cf_input.match(/[a-z]+/i)[0];
const numbers = cf_input.match(/\d+/)[0];
(This is if you know there is such a substring; if you are unsure it would be better to code a bit more defensively.)
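A more defensive variant might look like this. The function name extractParts and the choice of an empty string as the fallback are assumptions; the point is only that .match() returns null when nothing is found, so indexing [0] directly would throw.

```javascript
// Hypothetical helper: falls back to '' when a part is missing.
function extractParts(input) {
  const letterMatch = input.match(/[a-z]+/i);
  const digitMatch = input.match(/\d+/);
  return {
    letters: letterMatch ? letterMatch[0] : '',
    numbers: digitMatch ? digitMatch[0] : '',
  };
}

console.log(extractParts('20153 Risk')); // { letters: 'Risk', numbers: '20153' }
console.log(extractParts('!!!'));        // { letters: '', numbers: '' }
```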
const cf_input = "20153 Risk";
const cf_regex_character = cf_input.replace(/\d+\s/, '');
console.log(cf_regex_character);
const str = "20153 Risk";
const reg = /[a-z]+/gi;
const res = str.match(reg);
console.log(res[0]);

Empty strings in array after using the split method with a regexp

I'm reading through Chapter 5 of Professional JavaScript for Web Developers and came across this example involving the split method and a regular expression. My confusion stems from the output of the variable colors3. Why does the array contain an empty string before and after the commas?
var colorText = "red,blue,green,yellow";
var colors1 = colorText.split(","); //["red", "blue", "green", "yellow"]
var colors2 = colorText.split(",", 2); //["red", "blue"]
var colors3 = colorText.split(/[^\,]+/); //["", ",", ",", ",", ""]
In the last case, you're defining separator as "any run of characters that aren't commas".
Because nothing precedes the first "separator" ("red") and nothing follows the last "separator" ("yellow"). Split presumes that the first separator is preceded by a value, and that the last separator is followed by a value -- as they are, in your first and second examples, and in any normal case such as a line in a CSV file. The only quasi-exception would be if the first (or last) value in the CSV line were an empty string; in that case, what would you see if there were an empty string followed by a separator?
You would see just a seemingly orphaned separator at the beginning of the line (or a separator at the end). It has to be this way because you have to support empty values.
If you preceded "red" with a comma, you would see an initial empty string in the first array, and an initial comma in the last.
I think you're thrown off by the fact that your last regex redefines "separator" as a set of characters normally regarded as data, and redefines "data" as a character normally defined as a separator.
Accept the arbitrariness. Let it flow through you. They're not commas and letters, they're zeroes and ones.
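The cases discussed above can be checked side by side. The variable names are made up for this sketch; the last line shows match() with the g flag, which is one way to get the colors themselves out of the same pattern.

```javascript
const colorList = 'red,blue,green,yellow';

// The "separator" is every run of non-commas, so the commas become the data.
const swapped = colorList.split(/[^,]+/);
// A leading comma produces an empty first value with the normal separator...
const leadingNormal = (',' + colorList).split(',');
// ...and a leading ',' value (instead of '') with the swapped separator.
const leadingSwapped = (',' + colorList).split(/[^,]+/);
// match() with the g flag returns the runs of non-commas themselves.
const colorNames = colorList.match(/[^,]+/g);

console.log(swapped);        // [ '', ',', ',', ',', '' ]
console.log(leadingNormal);  // [ '', 'red', 'blue', 'green', 'yellow' ]
console.log(leadingSwapped); // [ ',', ',', ',', ',', '' ]
console.log(colorNames);     // [ 'red', 'blue', 'green', 'yellow' ]
```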

What does this JS do?

var passwordArray = pwd.replace(/\s+/g, '').split(/\s*/);
I found the above line of code in a rather poorly documented JavaScript file, and I don't know exactly what it does. I think it splits a string into an array of characters, similar to PHP's str_split. Am I correct, and if so, is there a better way of doing this?
It removes all whitespace from the password and then splits the password into an array of characters.
It is a bit redundant to convert a string into an array of characters, because you can already access the characters of a string through bracket notation (though not in older IE :( ) or through the string method "charAt":
var a = "abcdefg";
alert(a[3]);//"d"
alert(a.charAt(1));//"b"
It does the same as: pwd.split(/\s*/), provided the string has no leading or trailing whitespace.
pwd.replace(/\s+/g, '').split(/\s*/) removes all whitespace (tab, space, CR/LF, etc.) and splits the remainder (the string returned from the replace operation) into an array of characters. The \s* in split(/\s*/) is odd and redundant here, because there is no whitespace (\s) left after the replace, so the pattern only ever matches the empty string between characters.
Hence pwd.split(/\s*/) should be sufficient when whitespace only occurs between other characters. So:
'hello cruel\nworld\t how are you?'.split(/\s*/)
// prints in alert: h,e,l,l,o,c,r,u,e,l,w,o,r,l,d,h,o,w,a,r,e,y,o,u,?
as will
'hello cruel\nworld\t how are you?'.replace(/\s+/g, '').split(/\s*/)
The replace portion removes all whitespace from the password. The \s+ atom matches a non-empty run of whitespace. The 'g' flag makes the pattern match all such runs, and each one is replaced with an empty string.
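As for a "better way", one option in current JavaScript is to strip the whitespace and then let Array.from do the splitting. The helper name toChars is made up for this sketch. The last line shows why dropping the replace() step is not always equivalent: leading or trailing whitespace leaves empty strings in the result.

```javascript
// Hypothetical helper: remove all whitespace, then split into characters.
function toChars(pwd) {
  return Array.from(pwd.replace(/\s+/g, ''));
}

const charsFromHelper = toChars('ab c\td');
const charsWithoutReplace = ' ab '.split(/\s*/);

console.log(charsFromHelper);     // [ 'a', 'b', 'c', 'd' ]
console.log(charsWithoutReplace); // [ '', 'a', 'b', '' ]
```

Array.from iterates by code point, so characters outside the Basic Multilingual Plane stay in one piece.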
