I am trying to make my code look more professional by removing duplicated code. I want to extract some data from a string; specifically, I need the NUMBER, X, Y, Z, A, B, etc. values. The regular expressions are different for each variable, so I end up repeating myself and writing a lot of duplicated code.
let TextString = `DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
const regNumber = /(?<=NUMBER:=)[0-9]+/gm;
let lineNumber = Number(TextString.match(regNumber));
const regX = /(?<=X:=)(-?[0-9]+)(.[0-9]+)?/gm;
let X = Number(TextString.match(regX)).toFixed(1);
const regY = /(?<=Y:=)(-?[0-9]+)(.[0-9]+)?/gm;
let Y = Number(TextString.match(regY)).toFixed(1);
const regZ = /(?<=Z:=)(-?[0-9]+)(.[0-9]+)?/gm;
let Z = Number(TextString.match(regZ)).toFixed(1);
const regA = /(?<=A:=)(-?[0-9]+)(.[0-9]+)?/gm;
let A = Number(TextString.match(regA)).toFixed(1);
const regB = /(?<=B:=)(-?[0-9]+)(.[0-9]+)?/gm;
let B = Number(TextString.match(regB)).toFixed(1);
// and many more duplicate code.
console.log(lineNumber, X, Y, Z, A, B);
The only way I could think of is the one above: match each variable individually and run .match() multiple times. But as you can see there are 17 variables in total, and in real situations there are hundreds of these TextString values. I am worried that this matching process will have a large impact on performance.
Are there other ways to fetch all variables in one match and store them in an array or object? Or is there any other elegant way of doing this?
Every coordinate has a single-letter identifier, so you can use a more general positive lookbehind, (?<=,[A-Z]:=). This lookbehind matches a comma followed by a single uppercase letter and then the := operator.
You can then use .match() to get all matches and use .map() to run the conversion you were doing.
let TextString = `DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
const regNumber = /(?<=NUMBER:=)[0-9]+/gm;
let lineNumber = Number(TextString.match(regNumber));
const regex = /(?<=,[A-Z]:=)(-?[0-9]+)(\.[0-9]+)?/gm; // \. matches a literal decimal point
let coord = TextString.match(regex).map(n => Number(n).toFixed(1));
console.log(lineNumber, coord);
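The array above no longer says which letter each value belongs to. If that matters, one option is to capture the letter as well. This is only a sketch: parseCoords is a hypothetical helper name, and it assumes every coordinate is preceded by a comma, as in the sample string.

```javascript
// Hypothetical helper: maps each single-letter key to its value, rounded to one decimal.
function parseCoords(text) {
  const coords = {};
  // Group 1 is the letter, group 2 the number; capture groups replace the lookbehind.
  for (const [, key, value] of text.matchAll(/,([A-Z]):=(-?\d+(?:\.\d+)?)/g)) {
    coords[key] = Number(value).toFixed(1);
  }
  return coords;
}

const sample = "DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1);";
console.log(parseCoords(sample));
// { X: '10.1', Y: '73.3', Z: '0.0', A: '-1.4', B: '1.0' }
```

Keys such as M1 or PN are skipped because the pattern only accepts one letter directly before the := operator.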
You could write a single pattern:
(?<=\b(?:NUMBER|[XYZAB]):=)-?\d+(?:\.\d+)?\b
Explanation
(?<= Positive lookbehind, assert that to the left of the current position is
\b(?:NUMBER|[XYZAB]):= Match either NUMBER or one of X Y Z A B preceded by a word boundary and followed by :=
) Close the lookbehind
-? Match an optional -
\d+(?:\.\d+)? Match 1+ digits and an optional decimal part
\b A word boundary to prevent a partial word match
See a regex demo.
const TextString = `DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
const regNumber = /(?<=\b(?:NUMBER|[XYZAB]):=)-?\d+(?:\.\d+)?\b/g;
const result = TextString
.match(regNumber)
.map(s =>
Number(s).toFixed(1)
);
console.log(result);
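To see what the \b inside the lookbehind buys you, here is a small sketch that runs the pattern with and without it. The input is made up: MAX is a hypothetical key that merely ends in X and is not part of the OP's data.

```javascript
// Made-up input: MAX is a hypothetical key that happens to end in X.
const probe = "DRILL(MAX:=9,X:=1.5);";

// Same pattern as above, once with and once without the word boundary before the key.
const withBoundary = /(?<=\b(?:NUMBER|[XYZAB]):=)-?\d+(?:\.\d+)?\b/g;
const withoutBoundary = /(?<=(?:NUMBER|[XYZAB]):=)-?\d+(?:\.\d+)?\b/g;

const strict = probe.match(withBoundary);
const loose = probe.match(withoutBoundary);

console.log(strict); // [ '1.5' ]
console.log(loose);  // [ '9', '1.5' ]
```

Without the boundary, the X at the end of MAX is accepted as a key, so the value 9 leaks into the result.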
One possible approach could be based on a regex pattern which utilizes capturing groups. The matching regex for the OP's sample text would look like this ...
/\b(NUMBER|[XYZAB])\:=([^,]+),/g
... and the regex's test site provides a description of it.
The pattern is both simple and generic. The latter is due to always capturing both the matching key, like NUMBER, and its related value, like 20. Thus the order in which the key-value pairs occur within a drill-data string doesn't matter, as long as each pair is followed by a comma.
To assign all of the OP's variables at once later on via an object-based destructuring assignment, the post-processing task needs to reduce the result array of matchAll into an object which features all the captured keys and values. Within this task one can also control how the values are computed and whether or how the keys get sanitized.
const regXDrillData = /\b(NUMBER|[XYZAB])\:=([^,]+),/g;
const textString =
`DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
// - processed values via reducing the captured
// groups of a `matchAll` result array of a
// generic drill-data match-pattern.
const {
number: lineNumber,
x, y, z,
a, b,
} = [...textString.matchAll(regXDrillData)]
.reduce((result, [match, key, value]) => {
value = Number(value);
value = (key !== 'NUMBER') ? value.toFixed(1) : value;
return Object.assign(result, { [ key.toLowerCase() ]: value });
}, {})
console.log(
`processed values via reducing the captured
groups of a 'matchAll' result array of a
generic drill-data match-pattern ...`,
{ lineNumber, x, y, z, a, b },
);
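If all key-value pairs are wanted rather than a fixed list, the same capturing-group idea can be pushed a bit further. The following is a rough sketch, not part of the answer above: parseDrill is a hypothetical helper, and it assumes keys are uppercase letters optionally followed by digits and that values contain neither a comma nor a closing parenthesis.

```javascript
// Hypothetical helper: parses every KEY:=value pair of one drill line into an object.
function parseDrill(line) {
  // Group 1 is the key, group 2 the raw value up to the next comma or closing parenthesis.
  const pairs = [...line.matchAll(/\b([A-Z]+\d*):=([^,)]+)/g)]
    .map(([, key, raw]) => {
      const num = Number(raw);
      // Keep non-numeric values (such as the quoted NAME) as strings.
      return [key, Number.isNaN(num) ? raw : num];
    });
  return Object.fromEntries(pairs);
}

const drillLine =
  "DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);";

const { NUMBER: lineNo, X: xPos, RS2: lastValue } = parseDrill(drillLine);
console.log(lineNo, xPos, lastValue); // 20 10.1 2
```

Because the value group stops at either a comma or a closing parenthesis, the last pair before the ");" is picked up too.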
Every value matches the pattern :=[value], or :=[value]) for the last one. So here is my regex:
(?<=:=)-?[\d\w.']+(?=[,)])
Positive Lookbehind (?<=:=) look for match behind :=
-? match - optional (for negative number)
[\d\w.']+: match digit, word character, ., '
Positive Lookahead (?=[,)]) look for match ahead character , or )
Live regex101.com demo
Now change your code to
let TextString = `DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
const regexPattern= /(?<=:=)-?[\d\w.']+(?=[,)])/g;
console.log(TextString.match(regexPattern))
// ['20', "'4'", '1', '10.1', '73.344', '0', '-1.435', '1.045', '1', '2', '3', '4', '1', '10.5', '2.1', '1.2', '2', '2.4', '1', '2']
Edit
I just realized that the positive lookahead is unnecessary, as Peter Seliger mentioned:
(?<=:=)-?[\d\w.']+
Change your regex pattern to
const regexPattern= /(?<=:=)-?[\d\w.']+/g;
Here is a solution that uses .reduce() over the keys of interest and returns an object:
const TextString = `DRILL(NUMBER:=20,NAME:='4',PN:=1,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045,M1:=1,M2:=2,M3:=3,M4:=4,M5:=1,S1:=10.5,S2:=2.1,S3:=1.2,S4:=2,S5:=2.4,RS1:=1,RS2:=2);`;
const keys = [ 'NUMBER', 'X', 'Y', 'Z', 'A', 'B' ];
let result = keys.reduce((obj, key) => {
const regex = new RegExp('(?<=\\b' + key + ':=)-?[0-9.]+');
obj[key] = Number(TextString.match(regex)).toFixed(1);
return obj;
}, {});
console.log(result);
Output:
{
"NUMBER": "20.0",
"X": "10.1",
"Y": "73.3",
"Z": "0.0",
"A": "-1.4",
"B": "1.0"
}
Notes:
The regex is built dynamically from the key
A \b word boundary is added to the regex to reduce the chance of unintended matches
If you need the line number as an integer you could take that out of the keys, and handle it separately.
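Since the question mentions hundreds of these strings, it may be worth building the pattern once and reusing it for every line. The sketch below is one way to do that, under the assumption that the lines arrive as an array of strings; KEYS, PAIR and parseLines are illustrative names. It keeps the values as plain numbers instead of calling toFixed(1).

```javascript
const KEYS = ['NUMBER', 'X', 'Y', 'Z', 'A', 'B'];

// One alternation pattern built once, instead of one RegExp per key per line.
const PAIR = new RegExp('\\b(' + KEYS.join('|') + '):=(-?[0-9]+(?:\\.[0-9]+)?)', 'g');

// Hypothetical helper: turns an array of drill lines into an array of objects.
function parseLines(lines) {
  return lines.map(line => {
    const row = {};
    // matchAll works on an internal copy of the regex, so PAIR.lastIndex is never touched.
    for (const [, key, value] of line.matchAll(PAIR)) {
      row[key] = Number(value);
    }
    return row;
  });
}

const rows = parseLines([
  "DRILL(NUMBER:=20,X:=10.1,Y:=73.344,Z:=0,A:=-1.435,B:=1.045);",
  "DRILL(NUMBER:=21,X:=-5,Y:=2.5,Z:=1,A:=0,B:=0);",
]);
console.log(rows[0].NUMBER, rows[1].X); // 20 -5
```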
I know how to get a substring between two characters using indexes or the split method. But I'm stuck in a scenario with lots of strings that have similar names, such as:
"2020-12-09-name_of_this_mission_num1_mission1_fileName_something"
"2020-12-09-name_of_this_mission_num1_mission12_fileName_something"
"2020-12-09-name_of_this_mission_num23_mission1_fileName_something_else"
I am stuck on how to extract just the "mission#" part: the names can differ, so the lengths differ, and sometimes the names are the same, as is the fileName. I also thought about using the index of "_", but there are multiple "_" characters and they may end up at different indexes if the name is different.
Could anyone give me some hint on this?
If the structure of the strings is always the same, and you want the second instance of 'mission', split the full string on the text 'mission'.
This yields an array with three portions:
["2020-12-09-name_of_this_", "_num1_", "1_fileName_something"]
Then take the last item in this array and parse the number from the start of that string.
Prefix it with the 'mission' that the split removed, push it into an array, and you have an array of missions.
If the initial string contains only one instance of 'mission', return the 2nd portion instead of the 3rd, as I have done for the 'mission2' entry.
const missions = [
"2020-12-09-name_of_this_mission_num1_mission1_fileName_something",
"2020-12-09-name_of_this_mission_num1_mission12_fileName_something",
"2020-12-09-name_of_this_mission_num23_mission1_fileName_something_else",
"2020-12-09-name_of_this_mission2_fileName_something_else"
]
let missionsArr = [];
missions.forEach(function(mission) {
const missionPortions = mission.split('mission');
// Two portions means 'mission' occurred once; otherwise take the third portion.
const index = missionPortions.length === 2 ? 1 : 2;
missionsArr.push('mission' + parseInt(missionPortions[index]))
})
console.log(missionsArr); //gives ["mission1","mission12", "mission1", "mission2"];
A simple regex match function. Note that 'match' outputs an array, so push match[0]:
const missions = [
"2020-12-09-name_of_this_mission_num1_mission1_fileName_something",
"2020-12-09-name_of_this_mission_num1_mission12_fileName_something",
"2020-12-09-name_of_this_mission_num23_mission1_fileName_something_else"
]
let Arr = [];
missions.forEach(function(mission) {
const missionID = mission.match(/mission\d+/);
Arr.push(missionID[0]);
})
console.log(Arr);
The easiest way to get just the mission##, assuming # is a variable number of digits, is to use a regex.
The base regex would be /mission\d+/, which matches the string "mission" followed by at least one digit.
Assuming you have your input as:
const missionsTexts = [
"2020-12-09-name_of_this_mission_num1_mission1_fileName_something",
"2020-12-09-name_of_this_mission_num1_mission12_fileName_something",
"2020-12-09-name_of_this_mission_num23_mission1_fileName_something_else"
];
You can transform them into an array of just mission# with the following algorithm:
const missions = missionsTexts.map(missionText => missionText.match(/mission\d+/g)[0]);
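Note that this assumes there's at least one mission# per missionText and keeps only the first. The g modifier is not what keeps the digits together: \d+ is greedy and consumes every consecutive digit either way. With g, .match() returns an array of all matches, so [0] is the first one; without g, [0] of the result is that same first match.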
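Both regex versions above throw if a string has no mission# at all, because .match() then returns null. A minimal sketch of a guarded variant follows; extractMission is a hypothetical helper name, and returning null for "not found" is just one possible convention.

```javascript
// Hypothetical helper: returns the first "mission<digits>" token, or null when there is none.
function extractMission(text) {
  const match = text.match(/mission\d+/);
  return match ? match[0] : null;
}

console.log(extractMission("2020-12-09-name_of_this_mission_num1_mission12_fileName_something"));
// mission12
console.log(extractMission("2020-12-09-no_id_in_this_name"));
// null
```

The "mission" in "mission_num1" is skipped because it is followed by an underscore rather than a digit.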
I have various strings with numbers in brackets like "[4]Motherboard, [25]RAM" how can I convert such a string to a JSON array (keeping both ids and values) like this:
{"data":[
{"id":"4","item":"Motherboard"},
{"id":"25","item":"RAM"}
]};
I tried using split(",") to create the array, but I can't figure out how to get the inner data in this case.
You could use a regular expression, which takes the number and the string, and assign it as property to an object.
var string = "[4]Motherboard, [25]RAM",
data = string.split(', ').map(function (a) {
var p = a.match(/^\[(\d+)\](.+)$/);
return { id: p[1], item: p[2] };
});
console.log(data);
Here is one way to do it. The pattern \[(\d+?)\](.+) works like this:
(…) is a capture group. It just means that whatever matches within the parentheses will be a token in the result.
\d means a digit
\d+ means a digit, one or more times
\d+? means a digit, one or more times, but as few as possible before the pattern matches something else.
.+ means any character, one or more times.
[ and ] have a special meaning in regular expression, so if you actually want to match the characters themselves, you need to escape them like so \[ and \].
The double backslashes \\ are just a JS oddity when defining a regex via a string as opposed to using a /literal/. Just two ways of saying the same thing.
There are plenty of resources for learning regex syntax, and http://regex101.com is a great place to play with patterns and experiment.
var input = "[4]Motherboard, [25]RAM";
var pattern = '\\[(\\d+?)\\](.+)';
var result = input.split(',').map(function (item) {
var matches = item.match(new RegExp(pattern));
return {id: matches[1], val: matches[2]};
});
console.log(result)
A version without regular expressions, using trim(), indexOf() and substring():
function toArray(string) {
return {
data: string.split(",").map(function(str) {
str = str.trim();
return {
id: str.substring(1, str.indexOf("]")),
item: str.substring(str.indexOf("]") + 1),
};
}),
};
}
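The answers above all split on the comma first. As a further variation, a global pattern with matchAll can walk the string directly. This is only a sketch: toData is a hypothetical name, and it assumes item names never contain a comma.

```javascript
// Hypothetical helper: builds the { data: [...] } shape asked for in the question.
function toData(text) {
  // Group 1: digits inside the brackets; group 2: the label up to the next comma.
  const data = [...text.matchAll(/\[(\d+)\]\s*([^,]+)/g)]
    .map(([, id, item]) => ({ id, item: item.trim() }));
  return { data };
}

console.log(JSON.stringify(toData("[4]Motherboard, [25]RAM")));
// {"data":[{"id":"4","item":"Motherboard"},{"id":"25","item":"RAM"}]}
```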
I have example string:
[:pl]Field_value_in_PL[:en]Field_value_in_EN[:]
And I want to get something like this:
Object {
pl: "Field_value_in_PL",
en: "Field_value_in_EN"
}
But I cannot assume there will always be "[:pl]" and "[:en]" in the input string. There can be only :pl or :en, :de and :fr, or any other combination.
I tried to write a RegExp for this but I failed.
Try the following: use .match() with the RegExp /:(\w{2})/g to find each : followed by two word characters, and String.prototype.slice() to strip the leading : from each result so that it can serve as a property name. Use .split() with the RegExp /\[:\w{2}\]|\[:\]|:\w{2}/ to cut the string at every tag, and .filter() with Boolean as the callback to drop the empty strings that .split() leaves behind. Finally, inside .map(), use the index of each match to pick the corresponding value, set it on an object, and return that object. The result is an array of single-property objects.
var str = "[:pl]Field_value_in_PL[:en]Field_value_in_EN[:]:deField_value_in_DE";
var props = str.match(/:(\w{2})/g).map(function(val, index) {
var obj = {}
, prop = val.slice(1)
,vals = str.split(/\[:\w{2}\]|\[:\]|:\w{2}/).filter(Boolean);
obj[prop] = vals[index];
return obj
});
console.log(JSON.stringify(props, null, 2))
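The question asks for a single object rather than an array of objects. One way to get that shape is sketched below; parseTagged is a hypothetical helper, and it assumes every tag is written in the bracketed [:xx] form and that values never contain a "[".

```javascript
// Hypothetical helper: collects every [:xx]value pair into one object.
function parseTagged(text) {
  const result = {};
  // Group 1: the code inside [:xx]; group 2: everything up to the next "[" or the end.
  for (const [, lang, value] of text.matchAll(/\[:(\w+)\]([^\[]*)/g)) {
    result[lang] = value;
  }
  return result;
}

console.log(parseTagged("[:pl]Field_value_in_PL[:en]Field_value_in_EN[:]"));
// { pl: 'Field_value_in_PL', en: 'Field_value_in_EN' }
```

The closing [:] is ignored because \w+ needs at least one character between the colon and the bracket.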
Solution with the String.replace, String.split and Array.forEach functions:
var str = "[:pl]Field_value_in_PL[:en]Field_value_in_EN[:fr]Field_value_in_FR[:de]Field_value_in_DE[:]",
obj = {},
fragment = "";
var matches = str.replace(/\[:(\w+?)\]([a-zA-Z_]+)/gi, "$1/$2|").split('|');
matches.forEach(function(v){ // iterating through key/value pairs
fragment = v.split("/");
if (fragment.length == 2) obj[fragment[0]] = fragment[1]; // making sure that we have a proper 'final' key/value pair
});
console.log(obj);
// the output:
Object { pl: "Field_value_in_PL", en: "Field_value_in_EN", fr: "Field_value_in_FR", de: "Field_value_in_DE" }
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split
You can try this regex to capture in one group what's inside a pair of brackets and in the other group the group of words that follow the brackets.
(\[.*?\])(\w+)
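In JavaScript that pattern could be applied roughly as follows. This is a sketch under the assumption that the global flag and matchAll are acceptable; the variable names are illustrative.

```javascript
const bracketPattern = /(\[.*?\])(\w+)/g;
const taggedInput = "[:pl]Field_value_in_PL[:en]Field_value_in_EN[:]";

// Each match yields the bracketed tag (group 1) and the words after it (group 2).
const tagPairs = [...taggedInput.matchAll(bracketPattern)]
  .map(([, tag, words]) => ({ tag, words }));

console.log(tagPairs);
// [ { tag: '[:pl]', words: 'Field_value_in_PL' },
//   { tag: '[:en]', words: 'Field_value_in_EN' } ]
```

Note that group 1 still includes the brackets and the colon, so they have to be stripped before the tag can be used as a property name.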
I have two possible strings that I need to match:
+/-90000
and
+9000 / -80000
I need to recognise the two patterns separately, so I wrote some regex for this. I can match the first, single-number string like so:
/\+\/\-{1}/g
And I wrote this for the second:
/(\+(?=[0-9]+){1}|\-(?=[0-9]+){1}|\/(?=\s){1})/g
The second would also partially match the number in the first string, i.e. the -90000. Is there a way to improve them so that they match exclusively?
You can use a single expression:
^(?:(\+\/-\s*\d+)|((\+\s*\d+)\s*\/\s*(-\s*\d+)))$
The only restriction you'll have to work with would be that in the second type of input, the positive number should come first.
You'll get the matched group in matches[1] if the input was of type 1, and in matches[2] if it was of type 2. For type-2 input, the individual numbers are stored in matches[3] and matches[4].
You can see the demo on regex101.
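A short sketch of how the groups could be read in JavaScript. The classify function and the shape of its return value are hypothetical, chosen only for illustration.

```javascript
const RANGE = /^(?:(\+\/-\s*\d+)|((\+\s*\d+)\s*\/\s*(-\s*\d+)))$/;

// Hypothetical helper: reports which of the two input types a string is.
function classify(text) {
  const matches = text.match(RANGE);
  if (!matches) return null;
  // Groups that did not take part in the match are undefined in JavaScript.
  if (matches[1] !== undefined) return { type: 1, value: matches[1] };
  return { type: 2, plus: matches[3], minus: matches[4] };
}

console.log(classify('+/-90000'));       // { type: 1, value: '+/-90000' }
console.log(classify('+9000 / -80000')); // { type: 2, plus: '+9000', minus: '-80000' }
console.log(classify('90000'));          // null
```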
Here are two solutions with slightly different semantics.
With the first, if the string is type 1 the number will be in capture group 1 (result[1]) and if it's type 2 the numbers will be in capture groups 2 and 3 (and capture group 1 will be undefined, which is what JavaScript reports for a group that did not participate in the match). The test for type 1, then, is result[1] !== undefined.
var a = '+/-90000';
var b = '+9000 / -80000';
var result;
var expr1 = /\+(?:\/-(\d+)|(\d+) \/ -(\d+))/;
result = a.match(expr1);
// => [ '+/-90000', '90000', undefined, undefined ]
result = b.match(expr1);
// => [ '+9000 / -80000', undefined, '9000', '80000' ]
With the second, if the string is type 1 the number will be in capture group 2 (and capture group 1 will be undefined), and if it's type 2 the numbers will be in capture groups 1 and 2. The test for type 1 is result[1] === undefined.
// (?:(\d+) )? keeps the space after the first number out of capture group 1
var expr2 = /\+(?:(\d+) )?\/ ?-(\d+)/;
result = a.match(expr2);
// => [ '+/-90000', undefined, '90000' ]
result = b.match(expr2);
// => [ '+9000 / -80000', '9000', '80000' ]
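If counting group numbers feels error-prone, named capture groups are another option. The sketch below reuses the shape of the first expression with anchors added; the group names both, plus and minus, the parseRange helper, and the reading of +/-90000 as "plus and minus the same number" are all assumptions made for illustration.

```javascript
const RANGE_EXPR = /^\+(?:\/-(?<both>\d+)|(?<plus>\d+) \/ -(?<minus>\d+))$/;

// Hypothetical helper: returns the two bounds as numbers, or null if the string fits neither type.
function parseRange(text) {
  const match = text.match(RANGE_EXPR);
  if (!match) return null;
  const { both, plus, minus } = match.groups;
  // Assumption: "+/-N" is read as the symmetric range +N / -N.
  return both !== undefined
    ? { type: 1, plus: Number(both), minus: -Number(both) }
    : { type: 2, plus: Number(plus), minus: -Number(minus) };
}

console.log(parseRange('+/-90000'));       // { type: 1, plus: 90000, minus: -90000 }
console.log(parseRange('+9000 / -80000')); // { type: 2, plus: 9000, minus: -80000 }
```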