Parsing text using regex javascript - javascript

I am stuck parsing the following text into an object. I have created two separate regexes, but I want to use only one. Below are the sample text and my current regex patterns.
PAYER:\r\n\r\n MCNA \r\n\r\nPROVIDER:\r\n\r\n MY KHAN \r\n Provider ID: 115446397114\r\n Tax ID: 27222193992\r\n\r\nINSURED:\r\n\r\n VICTORY OKOYO\r\n Member ID: 60451158048\r\n Birth Date: 05/04/2008\r\n Gender: Male\r\n\r\nCOVERAGE TYPE:\r\n\r\n Dental Care
REGEX:
re = new RegExp('(.*?):\r\n\r\n(.*?)(?:\r\n|$)', 'g');
re2 = new RegExp('(.*?):(.*?)(?:\r\n|$)', 'g');
Expected result:
{
payer: 'MCNA',
provider: 'MY KHAN'
}

This turns your input into an object that contains all key/value pairs:
const input = 'PAYER:\r\n\r\n MCNA \r\n\r\nPROVIDER:\r\n\r\n MY KHAN \r\n Provider ID: 115446397114\r\n Tax ID: 27222193992\r\n\r\nINSURED:\r\n\r\n VICTORY OKO\r\n Member ID: 60451158048\r\n Birth Date: 05/04/2009\r\n Gender: Male\r\n\r\nCOVERAGE TYPE:\r\n\r\n Dental Care';
let result = Object.fromEntries(input
.replace(/([^:]+):\s+([^\n\r]+)\s*/g, (m, c1, c2) => c1.toLowerCase() + '\r' + c2 + '\n')
.split('\n')
.filter(Boolean)
.map(item => item.trim().split('\r'))
);
console.log(result);
Output:
{
"payer": "MCNA",
"provider": "MY KHAN",
"provider id": "115446397114",
"tax id": "27222193992",
"insured": "VICTORY OKO",
"member id": "60451158048",
"birth date": "05/04/2009",
"gender": "Male",
"coverage type": "Dental Care"
}
Explanation:
Object.fromEntries() -- convert a 2D array to object, ex: [ ['a', 1], ['b', 2] ] => {a: 1, b: 2}
.replace() regex /([^:]+):\s+([^\n\r]+)\s*/g -- two capture groups, one for key, one for value
replace action c1.toLowerCase() + '\r' + c2 + '\n' -- convert key to lowercase, join key and value with \r, end each pair with \n
.split('\n') -- split into one string per key/value pair
.filter(Boolean) -- remove empty items
.map(item => item.trim().split('\r')) -- change each array item to [key, value], i.e. change the flat array to a 2D array
You could add one more filter after the .map() to keep only keys of interest.
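As a minimal sketch of that last suggestion, assuming the only keys of interest are payer and provider (as in the question's expected result), a second .filter() on the [key, value] pairs could look like this. The names wanted and filtered are introduced here for illustration only.

```javascript
const input = 'PAYER:\r\n\r\n MCNA \r\n\r\nPROVIDER:\r\n\r\n MY KHAN \r\n Provider ID: 115446397114\r\n Tax ID: 27222193992\r\n\r\nINSURED:\r\n\r\n VICTORY OKO\r\n Member ID: 60451158048\r\n Birth Date: 05/04/2009\r\n Gender: Male\r\n\r\nCOVERAGE TYPE:\r\n\r\n Dental Care';

// Hypothetical allow-list of keys (lowercase, because the keys are lowercased below).
const wanted = new Set(['payer', 'provider']);

const filtered = Object.fromEntries(input
  .replace(/([^:]+):\s+([^\n\r]+)\s*/g, (m, c1, c2) => c1.toLowerCase() + '\r' + c2 + '\n')
  .split('\n')
  .filter(Boolean)
  .map(item => item.trim().split('\r'))
  .filter(([key]) => wanted.has(key)) // keep only the keys of interest
);

console.log(filtered);
```

Filtering on the pairs before Object.fromEntries() avoids building the full object and deleting keys afterwards.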

Related

How to fix invalid JSON with RegExp in Javascript?

This is what I've tried
// input
let input = "{id: 1, name: apple, qty: 2, colors: [{id: 1, hex: #f95}], store: {id: 1, name: Apple Store}}"
let result = input.replace((/([\w]+)(:)/g), "\"$1\"$2");
// {"id": 1, "name": apple, "qty": 2, "colors": [{"id": 1, "hex": #f95}], "store": {"id": 1, "name": Apple Store}}
After that I just run another replace, like replaceAll(': ', ': "'). I don't think that is good practice. Maybe someone can help me with this problem. Thank you so much.
You can convert the stated string that looks almost like an object into an actual JavaScript object with the following assumptions:
keys are composed of alphanumeric and underscore chars
values are treated as numbers if they have the format of a number, i.e. an optional minus sign, followed by digits with an optional .
values are treated as strings unless they have the form of a number, or start with [ (array) or { (object)
string values may not contain , or }
const input = "{id: 1, name: apple, qty: 2, colors: [{id: 1, hex: #f95}], store: {id: 1, name: Apple Store}}";
const regex1 = /([,\{] *)(\w+):/g;
const regex2 = /([,\{] *"\w+":)(?! *-?[0-9\.]+[,\}])(?! *[\{\[])( *)([^,\}]*)/g;
let json = input
.replace(regex1, '$1"$2":')
.replace(regex2, '$1$2"$3"')
let result = JSON.parse(json);
console.log(JSON.stringify(result, null, ' '));
Output:
{
"id": 1,
"name": "apple",
"qty": 2,
"colors": [
{
"id": 1,
"hex": "#f95"
}
],
"store": {
"id": 1,
"name": "Apple Store"
}
}
Explanation of regex1:
([,\{] *) -- capture group 1: , or {, followed by optional spaces
(\w+) -- capture group 2: 1+ word chars (alphanumeric and underscore)
: -- literal :
replace '$1"$2":' -- capture group 1, followed by capture group 2 enclosed in quotes, followed by colon
Explanation of regex2:
([,\{] *"\w+":) -- capture group 1: , or {, followed by optional spaces, quote, 1+ word chars, quote, colon
(?! *-?[0-9\.]+[,\}]) -- negative lookahead for optional spaces, a number, followed by , or }
(?! *[\{\[]) -- negative lookahead for optional spaces, followed by { or [
( *) -- capture group 2: optional spaces
([^,\}]*) -- capture group 3: everything that is not a , or }
replace '$1$2"$3"' -- capture group 1, followed by capture group 2, followed by capture group 3 enclosed in quotes
Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex
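A minimal sketch that wraps the two replacements above in one helper, so the stated assumptions can be tried against other inputs. The function name looseToObject and the sample string are made up for illustration; the two regexes are the ones from this answer.

```javascript
// Hypothetical helper: quote the keys first, then quote every value
// that is not a number, an array, or an object.
function looseToObject(text) {
  const quoteKeys = /([,\{] *)(\w+):/g;
  const quoteStrings = /([,\{] *"\w+":)(?! *-?[0-9\.]+[,\}])(?! *[\{\[])( *)([^,\}]*)/g;
  const json = text
    .replace(quoteKeys, '$1"$2":')
    .replace(quoteStrings, '$1$2"$3"');
  return JSON.parse(json); // throws a SyntaxError if the input breaks the assumptions
}

const parsed = looseToObject('{id: -3, price: 1.5, name: pear, tags: [{label: ripe}]}');
console.log(parsed);
```

A string value that contains a , or } falls outside the assumptions listed above, so JSON.parse would be expected to throw for such input.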
Thanks for all the answers. I tried it this way and it works:
class FixJson {
constructor() {
this.run = (json) => {
const fixDataType = (json) => {
for (const key in json) {
if (json.hasOwnProperty(key)) {
const value = json[key];
if (typeof value === 'object') {
fixDataType(value);
} else if (value === 'true' || value === 'false') {
json[key] = value === 'true';
} else if (!isNaN(value)) {
json[key] = Number(value);
}
}
}
return json;
}
// use the replace function to add double quotes around every token (property names and values)
const fixedJson = json.replace(/([a-zA-Z0-9!##\$%\^\&*\)\(+=._-]+)/g, '"$1"');
// use the JSON.parse function to parse the fixed JSON string into a JavaScript object
const obj = JSON.parse(fixedJson.replaceAll('" "', ' '));
// fix json data type, and return the result
return fixDataType(obj)
}
}
}
const fix = new FixJson()
let result = fix.run("<your_invalid_json>")

Jquery Object and Array implementation

I have an Object
let data = {
a: 1,
b: 2,
c: {
abc: "ak",
bcd: "gh",
cfv: "ht"
}
}
Then I have variables whose values I need to show together with these object values:
let abc = "first 1", bcd="sec2", cfv="third3" , def="fourth 4", tdf = "fifth 5";
Now the object comes from an API call, and its keys can be any of these variable names.
How can I match the variable names with the keys of data.c and concatenate their values?
For example, since the object keys are (abc, bcd, cfv), the output would be:
first 1ak ==> that is the value of (abc + data.c["abc"])
sec2gh ==> that is the value of (bcd + data.c["bcd"])
third3ht ==> that is the value of (cfv + data.c["cfv"])
I tried the Object.keys() method, which returns the object keys as an array, but how can I then match them with the variable names?
Object.keys(data.c);
==> ["abc", "bcd", "cfv"] (After this how can I proceed to match the variable and show their values?)
Should I loop through the object (data.c)?
Please help me with some ideas for implementing this.
Thank you.
If it's possible for you to amend the format of your abc, bcd etc. variables to be the properties in an object, then this problem becomes trivial. You can use flatMap() to create a new array of the output values by linking the properties of the two target objects, like this:
let values = {
abc: "first 1",
bcd: "sec2",
cfv: "third3",
def: "fourth 4",
tdf: "fifth 5"
}
let data = {
a: 1,
b: 2,
c: {
abc: "ak",
bcd: "gh",
cfv: "ht"
}
}
let output = Object.keys(values).flatMap(k => data.c.hasOwnProperty(k) ? values[k] + data.c[k] : []);
console.log(output);
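If the lookup should start from the keys of data.c, as the question proposed with Object.keys(data.c), a sketch along the same lines is below. It still assumes the loose variables have been collected into a values object; the name labelled is made up for this example.

```javascript
const values = { abc: "first 1", bcd: "sec2", cfv: "third3", def: "fourth 4", tdf: "fifth 5" };
const data = { a: 1, b: 2, c: { abc: "ak", bcd: "gh", cfv: "ht" } };

// Walk the keys the API actually returned and keep those that have a matching label.
const labelled = Object.keys(data.c)
  .filter(k => Object.prototype.hasOwnProperty.call(values, k))
  .map(k => values[k] + data.c[k]);

console.log(labelled); // [ 'first 1ak', 'sec2gh', 'third3ht' ]
```

Starting from data.c means keys that the API did not return (def and tdf here) are never looked at.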

Split each element in array into object after certain character

I'm new to node.js and javascript. I have the following array:
var oldarray = [
'name1\tstreet\tperson\tphone1\tphone2\nname2\tstreet2\tperson1\tphone82\tphone3\n'
]
Note that this is a single-element array. First, I need the array to contain a new element after each new line; then I need to re-format it like below:
let headers = {
name: "",
street: "",
person: "",
"phone 1": "",
"phone 2": ""
}
How can I parse each element (after splitting the string into a new element at each line break) and build one object per element from its tab-separated fields?
The desired output is this:
[{
name: 'name1',
street: 'street2',
person: 'person1',
phone1: 'phone82 ',
phone2: 'phone3'
},
{
name: 'name2',
street: 'street2',
person: 'person1',
phone1: 'phone1 ',
phone2: 'phone2'
}]
Any help is highly appreciated.
If all items in OLD_ARRAY have the same structure, you can use map, filter and reduce to manipulate your input.
Here is what I did:
In case you have multiple strings like the example input (more than one array item), I convert each string to a sub-array using map and split by \n, which is your row separator. Then I filter out the empty strings (because your input also ends with a trailing \n).
From each sub-array I extract all the contacts using the extractContacts function: it splits each row by your field separator, \t, and maps the fields according to your contact template.
Since the result is an array of arrays, I use reduce to concatenate all the arrays together.
const OLD_ARRAY = [
'name1\tstreet\tperson\tphone1\tphone2\n' +
'name2\tstreet2\tperson1\tphone82\tphone3\n'
];
function extractContacts(templates) {
return templates.map(t => t.split('\t'))
.map(details => ({
name: details[0],
street: details[1],
person: details[2],
phone1: details[3],
phone2: details[4]
}));
}
let contacts = OLD_ARRAY.map(str => str.split('\n').filter(str => str !== ''))
.map(template => extractContacts(template))
.reduce((acc, a) => acc.concat(a), []);
console.log(contacts)
You can split each oldarray value on \n and then \t into newarray, and then use Object.fromEntries to build an object from each newarray value, combining the split values with each key from headers:
var oldarray = [
'name1\tstreet\tperson\tphone1\tphone2\n' +
'name2\tstreet2\tperson1\tphone82\tphone3\n'
]
let newarray = [];
oldarray.forEach(s => s.trim().split('\n').map(v => newarray.push(v.split('\t'))));
let headers = {
'name': "",
'street': "",
'person': "",
'phone 1': "",
'phone 2': ""
}
let keys = Object.keys(headers);
out = newarray.map(s => Object.fromEntries(s.map((v, i) => [keys[i], v])));
console.log(out);
First split each string by \n to get the individual rows, then split the rows by \t, and use reduce to create a new object from each subarray:
var oldarray = [
'name1\tstreet\tperson\tphone1\tphone2\n' +
'name2\tstreet2\tperson1\tphone82\tphone3\n' +
'name4\tstreet4\tperson4\tphone84\tphone4\n'
]
arr = oldarray.flatMap(o => o.split("\n"))
c = arr.map(o => o.split("\t"))
c.pop()
result = c.reduce((acc,[name, street, person, phone1, phone2],i) => {
acc = [...acc,{name:name,street:street,person:person,phone1:phone1,phone2:phone2}]
return acc
},[])
console.log(result)
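The answers above share the same steps: split on \n, drop the empty trailing row, split on \t, then pair each field with a key. A compact sketch of those steps follows, assuming every row lists its fields in the same order. The names keys and rows are illustrative, and the keys are written without spaces so they can be read with dot notation.

```javascript
const oldarray = [
  'name1\tstreet\tperson\tphone1\tphone2\n' +
  'name2\tstreet2\tperson1\tphone82\tphone3\n'
];

// Assumed field order, taken from the question's headers object.
const keys = ['name', 'street', 'person', 'phone1', 'phone2'];

const rows = oldarray
  .flatMap(s => s.split('\n'))   // one entry per line
  .filter(Boolean)               // the trailing \n leaves an empty string
  .map(line => Object.fromEntries(
    line.split('\t').map((value, i) => [keys[i], value])
  ));

console.log(rows);
```

Using filter(Boolean) instead of pop() also covers inputs where several strings each end with a trailing \n.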

Match and replace all strings inside object with strings from another object

I have an object which contains UTF-8 characters as strings - I figured I could make another object with the list of characters and how I'd like to replace them?
The Data Object
var data = [
{"my_string":"ABC & I","value":13,"key":8},
{"my_string":"A “B” C","value":12,"key":9}
];
The Replacement Object
var str_to_change = [
{value: "&", replace: "&"},
{value: "“", replace: ""},
{value: "”", replace: ""}
];
I'd like to write a function where anytime a str_to_change.value is seen inside data.my_string, replace it with str_to_change.replace
Is this the best way to go about changing various character strings, and how would I execute this? I found this: Iterate through object literal and replace strings but it's a little more complex since I'm not just replacing with a singular string.
Rather than an array of objects, consider constructing just a single object with multiple keys:
const replacements = {
"&": "&",
"“": '',
"”": '',
};
Then, with the keys, escape characters with a special meaning in regular expressions, join the keys by |, construct a regular expression, and have a replacer function access the matched substring as a property of the replacements object:
var str_to_change = [{value: "&", replace: "&"},
{value: "“", replace: ""},
{value: "”", replace: ""}];
const replacements = Object.fromEntries(str_to_change.map(({ value, replace }) => [value, replace]));
const escape = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
const pattern = new RegExp(Object.keys(replacements).map(escape).join('|'), 'gi');
var data = [{
"my_string": "ABC & I",
"value": 13,
"key": 8
},
{
"my_string": "A “B” C",
"value": 12,
"key": 9
}];
const mappedData = data.map(({ my_string, ...rest }) => ({
...rest,
my_string: my_string.replace(
pattern,
prop => replacements[prop]
)
}));
console.log(mappedData);
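To see why the keys are escaped before being joined into a pattern, here is a small sketch with made-up replacement keys that contain regex metacharacters (a dot and parentheses). Without escaping, the dot matches every character, and the lookup in the replacer returns undefined for most of them. The names table, rawPattern and safePattern are hypothetical.

```javascript
const escapeRegExp = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical replacement table whose keys contain regex metacharacters.
const table = { '.': '!', '(c)': '©' };
const text = 'ab (c).';

const rawPattern = new RegExp(Object.keys(table).join('|'), 'g');                    // .|(c)
const safePattern = new RegExp(Object.keys(table).map(escapeRegExp).join('|'), 'g'); // \.|\(c\)

const unescapedResult = text.replace(rawPattern, m => table[m]);
const escapedResult = text.replace(safePattern, m => table[m]);

console.log(unescapedResult); // mostly the string "undefined", repeated
console.log(escapedResult);   // ab ©!
```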

How to trim the last one or last two characters of a string

I have an object with a bunch of strings:
[
{
date: "03/12/2014",
name: "mr blue",
title: "math teacher -"
},
{
date: "04/02/2015",
name: "mrs yellow",
title: "chemistry teacher"
},
{
date: "11/04/2014",
name: "mrs green",
title: "chemistry teacher - "
},
]
How can I strip the - from the title field if that string contains a -?
I know I can perform a slice/substring:
var myvalue = myobject.title.substring(0, myobject.title.length-1);
However this will apply for all cases, and not just the ones that contain the -
Use replace:
var myvalue = myobject.title.replace(/\s*-\s*$/,'');
Bonus: with this regular expression only a dash at the end will be removed (along with the spaces around).
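Applied to the whole array from the question, the same replace could be used inside map(). This is a sketch; the variable names staff and cleaned are assumptions, since the question does not name the array.

```javascript
const staff = [
  { date: "03/12/2014", name: "mr blue", title: "math teacher -" },
  { date: "04/02/2015", name: "mrs yellow", title: "chemistry teacher" },
  { date: "11/04/2014", name: "mrs green", title: "chemistry teacher - " }
];

// Copy each entry, stripping a trailing dash (and the whitespace around it) from title.
const cleaned = staff.map(entry => ({
  ...entry,
  title: entry.title.replace(/\s*-\s*$/, '')
}));

console.log(cleaned.map(entry => entry.title));
// [ 'math teacher', 'chemistry teacher', 'chemistry teacher' ]
```

Titles without a trailing dash pass through unchanged because the pattern is anchored with $.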
var title = 'math teacher -';
title = title.replace('-', '').trim();
document.write(title);
Update
Above will fail if title has dashes in the middle. Therefore, using lastIndexOf you can do
var i = title.lastIndexOf("-");
if (i !== -1) title = title.substring(0, i).trim();
