I am trying to make a regex that matches all the combinations of a given string. For example, if the string is "1234", answers would include:
"1"
"123"
"4321"
"4312"
Nonexamples would include:
"11"
"11234"
"44132"
If it matters, the programming language I am using is JavaScript.
Thank you for any help.
You may use these lookahead-based assertions in your regex:
^(?!(?:[^1]*1){2})(?!(?:[^2]*2){2})(?!(?:[^3]*3){2})(?!(?:[^4]*4){2})[1234]+$
Here we have 4 lookahead assertions:
(?!(?:[^1]*1){2}): Assert that we don't have more than one instance of 1
(?!(?:[^2]*2){2}): Assert that we don't have more than one instance of 2
(?!(?:[^3]*3){2}): Assert that we don't have more than one instance of 3
(?!(?:[^4]*4){2}): Assert that we don't have more than one instance of 4
We use [1234]+ to match any string with these 4 characters.
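Since the question asks about "a given string", the same pattern can be generated rather than typed by hand. The following is only a sketch: buildNoRepeatRegex is a hypothetical helper name, and it assumes the input consists of distinct alphanumeric characters, so nothing needs escaping.

```javascript
// Hypothetical helper: builds the lookahead pattern for any set of distinct
// alphanumeric characters (assumption: no regex metacharacters in `chars`).
function buildNoRepeatRegex(chars) {
  const lookaheads = [...chars]
    .map((c) => `(?!(?:[^${c}]*${c}){2})`) // reject a second occurrence of c
    .join("");
  return new RegExp(`^${lookaheads}[${chars}]+$`);
}

const noRepeat = buildNoRepeatRegex("1234");
const candidates = ["1", "123", "4321", "4312", "11", "11234", "44132"];
const accepted = candidates.filter((s) => noRepeat.test(s));
console.log(accepted); // [ '1', '123', '4321', '4312' ]
```

For "1234" the generated source is the same expression as the one written out above.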
A combination of group captures using character classes and negative look-ahead assertions using back-references would do the trick.
Let's begin by simply matching any combination of 1, 2, 3, and 4 using a character class, [1-4], and allowing any length from 1 to 4 characters with {1,4}.
const regex = /^[1-4]{1,4}$/;
// Create set of inputs from 0 to 4322
const inputs = Array.from(new Array(4323), (v, i) => i.toString());
// Output only values that match criteria
console.log(inputs.filter((input) => regex.test(input)));
When that code is run, it's easy to see that although only numbers made up of 1, 2, 3, and 4 are matched, it also matches numbers with repeated digits (e.g. 11, 22, 33, 112). This is not what was desired.
Preventing repeated characters requires a reference to the previously matched characters and a way to exclude them from any following match. Negative look-aheads, (?!...), combined with back-references, \1 through \9, can accomplish this.
Building on the previous example with a subset of the inputs (limited to a maximum length of two characters for the moment), the pattern now wraps the first character in a capture group, ([1-4]), follows it with a negative look-ahead containing a back-reference to the first capture, (?!\1), and ends with a second, optional character class.
const regex = /^([1-4])(?!\1)[1-4]?$/;
// Create set of inputs from 0 to 44
const inputs = Array.from(new Array(45), (v, i) => i.toString());
// Output only values that match criteria
console.log(inputs.filter((input) => regex.test(input)));
This matches the desired characters with no repetition!
Expanding this pattern to include back-references for each of the previously matched characters up to the desired max length of 4 yields the following expression.
const regex = /^([1-4])((?!\1)[1-4])?((?!\1|\2)[1-4])?((?!\1|\2|\3)[1-4])?$/;
// Create set of inputs from 0 to 4322
const inputs = Array.from(new Array(4323), (v, i) => i.toString());
// Output only values that match criteria
console.log(inputs.filter((input) => regex.test(input)));
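The expression grows by one group per allowed character, so it can also be assembled in a loop. This is a rough sketch under the same assumptions as the answer (distinct characters, none of them special inside a character class); buildBackrefRegex is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: one capture group per position, each guarded by a
// negative look-ahead that lists every earlier capture.
function buildBackrefRegex(chars) {
  const cls = `[${chars}]`;
  let source = `^(${cls})`;
  for (let i = 1; i < chars.length; i++) {
    // "\1|\2|...|\i": back-references to all groups captured so far
    const refs = Array.from({ length: i }, (_, k) => `\\${k + 1}`).join("|");
    source += `((?!${refs})${cls})?`;
  }
  return new RegExp(source + "$");
}

const generated = buildBackrefRegex("1234");
const samples = ["1", "123", "4321", "4312", "11", "11234", "44132"];
const kept = samples.filter((s) => generated.test(s));
console.log(kept); // [ '1', '123', '4321', '4312' ]
```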
Hope this helps!
You don't need to use regex for this. The snippet below does the following:
Loop over possible combinations (a => s) (1, 123, 4321, etc.)
Copy the current combination so as not to overwrite it (s2 = s)
Loop over the characters of test string (x => ch) (1234 => 1, 2, 3, 4)
Replace common characters in the combination string shared with the test string (s2.replace)
For example in the combination 1, the 1 will be replaced when the loop gets to the character 1 in 1234 resulting in an empty string
If the combination string's length reaches 0 (s2.length == 0) write the result to the console and break out of the loop (no point in continuing to attempt to replace on an empty string)
const x = "1234"
const a = ["1","123","4321","4312","11","11234","44132"]
a.forEach(function(s) {
  var s2 = s
  for (var ch of x) {
    s2 = s2.replace(ch, '')
    if (s2.length == 0) {
      console.log(s);
      break;
    }
  }
})
Results:
1
123
4321
4312
I'm trying to create a regex that selects the numbers (with or without their commas; I can trim commas later if that is easier) that are not followed by a parenthesis. The numbers inside the parentheses should not be selected either.
It will be used with JavaScript's String.match method.
Example strings
9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
What I have so far:
/((^\d+[^\(])|(,\d+,)|(,*\d+$))/gm
I tried this in regex101 and underlined the numbers I would like to match, and put an x on the ones that should not match.
You could start with a substitution to remove all the unwanted parts:
/\d*\(.*?\),?//gm
This leaves you with
5,10
10,2,5,
10,7,2,4
which makes the matching pretty straightforward:
/(\d+)/gm
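The two steps above can be sketched in JavaScript as follows. The variable names are illustrative only; the substitution is written with String.prototype.replace and an empty replacement string.

```javascript
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;

// Step 1: drop every "number(...)" group together with a trailing comma.
const stripped = text.replace(/\d*\(.*?\),?/gm, "");

// Step 2: collect the remaining digit runs, line by line.
const perLine = stripped.split("\n").map((line) => line.match(/\d+/g) || []);

console.log(stripped);
console.log(perLine);
// [ [ '5', '10' ], [ '10', '2', '5' ], [ '10', '7', '2', '4' ] ]
```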
If you want it as a single match expression you could use a negative lookbehind:
/(?<!\([\d,]*)(\d+)(?:,|$)/gm
Here's the same matching expression as runnable JavaScript (skeleton code borrowed from Wiktor's answer):
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/(?<!\([\d,]*)(\d+)(?:,|$)/gm), x=>x[1])
console.log(matches);
Here, I'd recommend the so-called "best regex trick ever": just match what you do not need (negative contexts) and then match and capture what you need, and grab the captured items only.
If you want to match integers that are not part of a \d+\([^()]*\) match (a number followed by a parenthetical substring), you can match that pattern, or else match and capture \d+ (one or more digits), and then simply grab the Group 1 values from the matches:
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/\d+\([^()]*\)|(\d+)/g), x=> x[1] ?? "").filter(Boolean)
console.log(matches);
Details:
text.matchAll(/\d+\([^()]*\)|(\d+)/g) - matches one or more digits (\d+), then ( (with \(), then any zero or more chars other than ( and ) (with [^()]*), then ) (with \)), or (|) one or more digits captured into Group 1 ((\d+))
Array.from(..., x=> x[1] ?? "") - gets Group 1 value, or, if not assigned, just adds an empty string
.filter(Boolean) - removes empty strings.
Using several replacement regexes
var textA = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`
console.log('A', textA)
var textB = textA.replace(/\(.*?\),?/g, ';')
console.log('B', textB)
var textC = textB.replace(/^\d+|\d+$|\d*;\d*/gm, '')
console.log('C', textC)
var textD = textC.replace(/,+/g, ' ').trim()
console.log('D', textD)
With a loop
Here is a solution which splits the lines on comma and loops over the pieces:
var inside = false;
var result = [];
`9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`.split("\n").forEach(line => {
  let pieceArray = line.split(",")
  pieceArray.forEach((piece, k) => {
    if (piece.includes('(')) {
      inside = true
    } else if (piece.includes(')')) {
      inside = false
    } else if (!inside && k > 0 && k < pieceArray.length-1 && !pieceArray[k-1].includes(')')) {
      result.push(piece)
    }
  })
})
console.log(result)
It does print the expected result: ["5", "7"]
My plan is to extract numbers from a string, except numbers preceded by a special character. What do I mean?
Imagine the following (Excel-like) formula:
=$A12+A$345+A6789
I need to extract the numbers that do not have a $ character directly in front of them, so the result of the right regex should be:
12
6789
I did some investigating and used the following regex:
/[A-Z][0-9]+(?![/W])/g
which extracts:
A12
A6789
I was thinking of using a nested regex (to additionally extract the numbers from that result), but I have no idea if that is possible. My JavaScript source code so far:
http://jsfiddle.net/janzitniak/fvczu7a0/7/
Regards
Jan
const regex = /(?<ref>\$?[A-Z]+(?<!\$)[0-9]+)/g;
const str = `=$A12+A$345+A6789`;
const refs = [...(str.matchAll(regex) || [])].map(result => result.groups.ref);
console.log(refs)
The regex matches an optional $, then one or more letters A-Z, then one or more digits 0-9 that are not preceded by a $.
The whole reference is captured in a named group, ref (you can call it whatever you want), and that group is what gets mapped into the result.
Output:
["$A12","A6789"]
If you want just the number part, you can use:
const regex = /\$?[A-Z]+(?<!\$)(?<num>[0-9]+)/g;
const str = `=$A12+A$345+A6789`;
const nums = [...(str.matchAll(regex) || [])].map(result => +result.groups.num);
console.log(nums)
Output:
[12, 6789]
const charSequence = '=$A12+A$345+A6789';
const numberList = (charSequence
.split(/\$\d+/) // - split at "'$' followed by one or more numbers".
.join('') // - join array of split results into string again.
.match(/\d+/g) || []) // - match any number-sequence or fall back to empty array.
.map(str => +str); // - typecast string into number.
//.map(str => parseInt(str, 10)); // parse string into integer.
console.log('numberList : ', numberList);
@ibraheem can you help me once again please? How can I increment the ref output if I want the following result: ["$A13","A6790"]? - JanZitniak
... the split/join/match approach can be iterated very fast, thus it proves to be quite flexible.
const charSequence = '=$A13+A$345+A6790';
const numberList = (charSequence
.split(/\$\d+/) // - split at "'$' followed by one or more numbers".
.join('') // - join array of split results into string again.
.match(/\$*[A-Z]\d+/g) || []); // - match any sequence of an optional '$' followed
// by 1 basic latin uppercase character followed
// by one or more number character(s).
console.log('numberList : ', numberList);
Peter, thank you for your quick response about the increment, but at the start I have const charSequence = '=$A12+A$345+A6789'; and as output I need ["$A13","A6790"]. – JanZitniak
... ok, finally one gets a full picture of the entire problem, which is (1) getting rid of the unnecessary patterns, (2) matching numbers within specific patterns and somehow remembering the latter, and (3) incrementing such numbers and reworking them into their remembered pattern.
const anchorSequence = '=$A12+A$345+A6789';
const listOfIncrementedAnchorCoordinates = [...(anchorSequence
// - split at "'$' followed by one or more numbers".
.split(/\$\d+/)
// - join array of split results into string again.
.join('')
// - match any sequence of an optional '$' followed by 1 basic latin
// uppercase character followed by one or more number character(s)
// and store each capture into a named group.
.matchAll(/(?<anchor>\$*[A-Z])(?<integer>\d+)/g) || [])
// map each regexp result from a list of RegExpStringIterator entries.
].map(({ groups }) => `${ groups.anchor }${ (+groups.integer + 1) }`);
console.log('listOfIncrementedAnchorCoordinates : ', listOfIncrementedAnchorCoordinates);
Peter, if you are interested in another problem, I have one. How can I change const anchorSequence = '=$A12+A$345+A6789'; to the following output: ["B$345","B6789"]? I mean changing the letter to the next one in alphabetical order (A becomes B, B becomes C, and so on) if the letter is not preceded by $. In my example it should change only A$345 and A6789. – JanZitniak
... with a little thinking effort it was not that hard to iterate/refactor the version before to this last one ...
const anchorSequence = '=$A12+A$345+A6789';
const listOfIncrementedColumns = [...(anchorSequence
// - split at "'$' followed by 1 basic latin uppercase character
// followed by one or more number character(s)".
.split(/\$[A-Z]\d+/)
// - join array of split results into string again.
.join('')
// - match any sequence of 1 basic latin uppercase character
// followed by an optional '$' followed by one or more number
// character(s) and store each capture into a named group.
.matchAll(/(?<column>[A-Z])(?<anchor>\$*)(?<row>\d+)/g) || [])
// map each regexp result from a list of RegExpStringIterator entries.
].map(({ groups }) => [
// - be aware that "Z" (charCode:90) will be followed by "[" (charCode:91)
// - thus, the handling of this edge case still needs to be implemented.
String.fromCharCode(groups.column.charCodeAt(0) + 1),
groups.anchor,
groups.row
].join(''));
console.log('listOfIncrementedColumns : ', listOfIncrementedColumns);
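The "Z" edge case mentioned in the code comments is left open above. One possible way to handle it, assuming spreadsheet-style column names where "Z" is followed by "AA", is sketched below. nextColumn is a hypothetical helper, and the [A-Z] in the regex above would need to become [A-Z]+ before multi-letter columns could reach it.

```javascript
// Hypothetical helper: returns the next spreadsheet-style column name,
// assuming uppercase A-Z only ("A" -> "B", "Z" -> "AA", "AZ" -> "BA").
function nextColumn(column) {
  const letters = [...column];
  let i = letters.length - 1;
  // Walk from the right: a trailing "Z" rolls over to "A" and carries left.
  while (i >= 0 && letters[i] === "Z") {
    letters[i] = "A";
    i--;
  }
  if (i < 0) {
    letters.unshift("A"); // every letter was "Z", so the name gets longer
  } else {
    letters[i] = String.fromCharCode(letters[i].charCodeAt(0) + 1);
  }
  return letters.join("");
}

console.log(["A", "Z", "AZ", "ZZ"].map(nextColumn)); // [ 'B', 'AA', 'BA', 'AAA' ]
```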
This may be a simple expression to write but I am having the hardest time with this one. I need to match group sets where each group has 2 parts, what we can call the operation and the value. I need the value to match to anything after the operation EXCEPT another operation.
Valid operations to match (standard math operators): [>,<,=,!,....]
For example: '>=25!30<50' Would result in three matching groups:
1. (>=, 25)
2. (!, 30)
3. (<, 50)
I can currently solve the above using: /(>=|<=|>|<|!|=)(\d*)/g however this only works if the characters in the second match set are numbers.
The wall I am running into is how to match EVERYTHING after EXCEPT for the specified operators.
For example I don't know how to solve: '<=2017-01-01' without writing a regex to specify each and every character I would allow (which is anything except the operators) and that just doesn't seem like the correct solution.
There has got to be a way to do this! Thanks guys.
You could match the operation, (>=|<=|>|<|!|=), as the first of the 2 parts. For the second part, use a capturing group in which a negative lookahead lets you keep matching characters as long as no operation starts directly to the right.
(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)
(?:>=|<=|>|<|!|=) Match one of the operations using an alternation
( Start capturing group (This will contain your value)
(?: Start non capturing group
(?!(?:>=|<=|>|<|!|=)). Negative lookahead which asserts what is on the right side is not an operation and matches any character .
)+ Close non capturing group and repeat one or more times
) Close capturing group
const regex = /(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)/gm;
const strings = [
">=25!30<50",
">=test!30<$##%",
"34"
];
let m;
strings.forEach((s) => {
while ((m = regex.exec(s)) !== null) {
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
console.log(m[1]);
}
});
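The snippet above logs only the value. If the operation is needed as well, as in the (operation, value) pairs from the question, the leading group can be made capturing. This is a small sketch of that variation; the input string is made up to combine the question's numeric and date examples.

```javascript
// Same idea as above, with the operation captured in group 1 and the value in group 2.
const pairRegex = /(>=|<=|>|<|!|=)((?:(?!>=|<=|>|<|!|=).)+)/g;

const input = ">=25!30<=2017-01-01"; // illustrative input, not from the original post
const pairs = [...input.matchAll(pairRegex)].map((match) => [match[1], match[2]]);

console.log(pairs);
// [ [ '>=', '25' ], [ '!', '30' ], [ '<=', '2017-01-01' ] ]
```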
You can use this code
var str = ">=25!30<50";
var pattern = RegExp(/(?:([\<\>\=\!]{1,2})(\d+))/, "g");
var output = [];
let matchs = null;
while((matchs = pattern.exec(str)) != null) {
output.push([matchs[1], matchs[2]]);
}
console.log(output);
Output array :
0: Array [ ">=", "25" ]
1: Array [ "!", "30" ]
2: Array [ "<", "50" ]
I think this is what you need:
/((?:>=|<=|>|<|!|=)[^>=<!]+)/g
The ^ inside the character class excludes the characters you don't want, and + means one or more.
How to check if a digit appears more than once in a number (anywhere within it)?
Example input numbers:
1, 10, 11, 1010, 1981
Output should tell which of them has any repeated digits:
false, false, true, true, true
Published all the good answers given on a jsperf page.
I think the fastest way would be a RegExp test. You can use it to get a quick true or false on whether there is a repeat, and it's compact enough to use in conditional operators. Here's an example that'd work with numbers and strings of numbers.
function hasRepeatingdigits(N) {
return (/([0-9]).*?\1/).test(N)
}
console.log(
[1, 10, 11, 1010, 1981, 12345678901, 123456789].map(hasRepeatingdigits)
)
(Edit -Isaac B)
Here's a breakdown of how the RegExp works:
The [0-9] creates a list of single characters between 0 and 9 to be matched.
Adding the parentheses ([0-9]) defines this list as the first capture group. These parens would not be needed if you were only searching for a char and didn't need the RegExp to perform a subsequent action. (i.e. /[0-9]/ is all you need to find the first index of a char 0 through 9 in a string, or true in a RegExp test)
The . matches any single char, except for line terminators. Adding the lazy quantifier *? matches between 0 and infinity times, as few times as possible.
The \1 matches the same text as most recently matched by the first capture group
In summary: /([0-9]).*?\1/ is a regular expression that scans a string for a char 0 through 9 and returns a match the first time that same char, held in the first capture group, appears again later in the string.
In the string '123432', this RegExp would return the full match '23432', with capture group 1 holding '2'.
RegExp.prototype.test() searches a string using the provided RegExp and returns true if the RegExp returns a match; otherwise it returns false. This could easily be modified to find a duplicate letter as well, using (/([A-Za-z]).*?\1/).test(N).
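To see the full match and the capture group for the '123432' example, String.prototype.match can be used instead of test(). A short check, with illustrative variable names:

```javascript
const repeatedDigit = /([0-9]).*?\1/;
const found = "123432".match(repeatedDigit);

// The leftmost digit that occurs again is the '2' at index 1, so the match
// runs from that '2' up to its next occurrence.
console.log(found[0], found[1], found.index); // 23432 2 1

// test() only reports whether such a match exists.
console.log(repeatedDigit.test("123432"), repeatedDigit.test("1234")); // true false
```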
Beyond the very useful MDN section on RegExp, I highly recommend people working to get more comfortable with them to check out this RegularExpressions101 tool.
function checkForRepeatingDigits(N){
var arr = (''+N).split(''),
result = arr.filter((elem, i) => arr.indexOf(elem) == i);
return result.length != (''+N).length;
}
// Or
function checkForRepeatingDigits(N){
return [...new Set((''+N).split(''))].length != (''+N).length;
}
console.log([1, 10, 11, 1010, 1981].map(checkForRepeatingDigits))
You could use a check with Array#indexOf and Array#lastIndexOf.
function check(a, _, aa) {
return aa.indexOf(a) !== aa.lastIndexOf(a);
}
console.log([1, 10, 11, 1010, 1981].map(a => a.toString().split('').some(check)));
Short solution using the Array.prototype.map() and String.prototype.match() functions:
function checkForRepeatingDigits(N) {
return N.map(function (v) {
return [v, Boolean(String(v).match(/(\d)\d*?\1/g))];
});
}
console.log(checkForRepeatingDigits([1, 10, 11, 1010, 1981]));
function repeated(n) {
var digits = [];
var digit;
while (n) {
digit = n % 10;
if (digits[digit]) return true;
digits[digit] = true;
n = Math.floor(n / 10);
}
return false;
}
[1, 10, 11, 1010, 1981].forEach(n => console.log(n, repeated(n)));
This works by first converting the number to a string with N = N + '' and then checking the result of split(), a String method that breaks a string into smaller parts around a delimiter.
For example, if I split "aba" by "b", I'll get an array containing ["a", "a"]. As you can see, if there's one occurrence of "b", the length of the returned array is 2. If there's more, it will be over 2. This is what I use in my solution.
As a bonus, it works with other types of data, even null and undefined. ;)
function check(N) {
for (var N = N + '', i = (N).length; i--;)
if (N.split(N[i]).length > 2)
return true;
return false;
}
[1, 10, 11, 1010, 1981, "abcd23", "aab", "", null, undefined].forEach(num => {
console.log(num, check(num));
});
const isRepeated = (num) => new Set('' + num).size != ~~Math.log10(num) + 1;
[1, 10, 11, 1010, 1981].forEach(n => console.log(n, isRepeated(n)));
JS offers a Set object, a data type that holds only unique elements, i.e. it discards any duplicates from its input. new Set(string) therefore keeps one copy of each character in the string. We call new Set('' + num) to pass num as a string, and set.size then returns the number of unique characters in it.
A number has no repeated digits if its number of digits equals the number of unique characters found above. There are several ways to count the digits of a number; num.toString().length would work too. Here ~~Math.log10(num) + 1 is used: the base-10 logarithm of the number gives a decimal value, ~~ truncates it to an integer that is 1 less than the number of digits, and adding 1 gives the digit count.
Thus, the function returns true when the two counts are not equal, implying that there is repetition, and false otherwise.
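The num.toString().length alternative mentioned above might look like the following sketch. hasRepeatedDigit is a hypothetical name chosen here to avoid clashing with isRepeated.

```javascript
// Variant that counts digits via the string length instead of Math.log10.
const hasRepeatedDigit = (num) => {
  const digits = String(num);
  return new Set(digits).size !== digits.length;
};

const flags = [1, 10, 11, 1010, 1981].map(hasRepeatedDigit);
console.log(flags); // [ false, false, true, true, true ]
```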
I have two possible strings that I need to match:
+/-90000
and
+9000 / -80000
I need to recognise the two patterns separately, so I wrote some regex for this. The first, single-number string I can match like so:
/\+\/\-{1}/g
And I wrote this for the second:
/(\+(?=[0-9]+){1}|\-(?=[0-9]+){1}|\/(?=\s){1})/g
The second would also partially match the first string, i.e. the -90000. Is there a way to improve them so that they match exclusively?
You can use a single expression:
^(?:(\+\/-\s*\d+)|((\+\s*\d+)\s*\/\s*(-\s*\d+)))$
The only restriction you'll have to work with would be that in the second type of input, the positive number should come first.
You'll get the match in matches[1] if the input was of type 1, and in matches[2] if it was of type 2. For type-2 input, the individual numbers are stored in matches[3] and matches[4].
You can see the demo on regex101.
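In JavaScript, the group layout described above could be used roughly like this. classify is a hypothetical helper, and it relies on unmatched groups being undefined in JavaScript match results.

```javascript
const pattern = /^(?:(\+\/-\s*\d+)|((\+\s*\d+)\s*\/\s*(-\s*\d+)))$/;

// Hypothetical helper: reports which of the two input shapes matched.
function classify(input) {
  const matches = input.match(pattern);
  if (!matches) return null; // neither shape
  return matches[1] !== undefined
    ? { type: 1, value: matches[1] }
    : { type: 2, positive: matches[3], negative: matches[4] };
}

console.log(classify("+/-90000"));       // { type: 1, value: '+/-90000' }
console.log(classify("+9000 / -80000")); // { type: 2, positive: '+9000', negative: '-80000' }
```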
Here are two solutions with slightly different semantics.
With the first, if the string is type 1 the number will be in capture group 1 (result[1]), and if it's type 2 the numbers will be in capture groups 2 and 3 (and capture group 1 will be undefined). The test for type 1, then, is result[1] !== undefined.
var a = '+/-90000';
var b = '+9000 / -80000';
var result;
var expr1 = /\+(?:\/-(\d+)|(\d+) \/ -(\d+))/;
result = a.match(expr1);
// => [ '+/-90000', '90000', undefined, undefined ]
result = b.match(expr1);
// => [ '+9000 / -80000', undefined, '9000', '80000' ]
With the second, if the string is type 1 the number will be in capture group 2 (and capture group 1 will be undefined), and if it's type 2 the numbers will be in capture groups 1 and 2. The test for type 1 is result[1] === undefined.
// The space after the first number sits outside the capture group,
// so group 1 holds '9000' without a trailing space.
var expr2 = /\+(?:(\d+) )?\/ ?-(\d+)/;
result = a.match(expr2);
// => [ '+/-90000', undefined, '90000' ]
result = b.match(expr2);
// => [ '+9000 / -80000', '9000', '80000' ]