Check if string has a format: str-str-str. Regex - javascript

I need to check whether a string has the format above when the user enters it.
Examples of acceptable strings:
string
home-appliances
such-a-long-word-but-still-valid
Examples of disallowed strings:
home-appliances-
home-appliances- smth-else
-home-appliances
-home-appliances-
string-with-digits23132-fails
string-with-a-lot---------of-dashes
string.with-some-other-symbols-except-dashes
Every regex I have tried fails to validate these strings correctly.
For example, the following regex
/^[a-z][a-z-]+[a-z]$/
fails several of these cases.
I would be glad of your help, thanks a lot.

Create a group matching (-[a-z]+) and let it repeat zero or more times after the initial run of letters.
const good = [
"string",
"home-appliances",
"such-a-long-word-but-still-valid",
];
const bad = [
"home-appliances-",
"home-appliances- smth-else",
"-home-appliances",
"-home-appliances-",
"string-with-digits23132-fails",
"string-with-a-lot---------of-dashes",
"string.with-some-other-symbols-except-dashes",
];
const rx = /^[a-z]+(-[a-z]+)*$/;
const check = (str) => console.log(str, rx.test(str) ? "✅" : "❌");
good.forEach(check);
bad.forEach(check);
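To see why the pattern from the question falls short, it may help to run both patterns side by side. This is only a small sketch; the sample strings are chosen for illustration and are not from the question.

```javascript
// Pattern from the question: it needs at least three characters and
// lets dashes repeat, because "-" is part of the middle character class.
const original = /^[a-z][a-z-]+[a-z]$/;
// Pattern from the answer: letters, then zero or more "-letters" groups.
const corrected = /^[a-z]+(-[a-z]+)*$/;

// Illustrative samples (assumed, not taken from the question).
const samples = ["a", "ab", "a-b", "a--b", "a-b-"];

samples.forEach((sample) => {
  console.log(
    sample.padEnd(6),
    "original:", original.test(sample),
    "corrected:", corrected.test(sample)
  );
});
```

The original pattern rejects the one- and two-letter words and accepts "a--b"; the corrected one does the opposite in all three cases.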


Complex Regex optional search with multiple returned objects

I'm trying to get the below to search with certain conditions.
There will always be the file name given, the others are optional.
[[File:HumanMaleDiagram.png|right|300px|thumb|A diagram of a male human.]]
The second item can be [left, right, center] or optional, the size is optional (either px or %), thumb is optional, the alt text at the end is optional.
I've used:
var imageFinder = /\|\s*\[\[File\:([\w\-\_\. ]+)\|*(?=[right|left|center]*)\|*(?=[\w]*px)\|*(?=[thumb]*)\|*(?=[\w\.\s]*)\]\]/gi
To return $1...$5.
So I've used lookaheads (?=) and * on the pipes, but it's not finding anything.
I've only been using regex in anger for the last few days. All my other patterns work (they're smaller than this), but some assistance would be gratefully received.
Doing this in JavaScript.
Thank you to anyone who can help.
You can make every part optional, so you don't need to use lookahead.
I recommend using named capturing groups (but it's not required).
\[\[File:[^|]+(?:\|(?<align>left|right|center))?(?:\|(?<size>\d+(?:px|%)))?(?:\|(?<thumb>thumb))?(?:\|(?<alt>[^|]+))?\]\]
See this at work on regex101
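As a rough sketch of how the named groups could be read back in JavaScript: the pattern below is the one above with a single assumed addition, a "file" group around the file name. The helper name parseFileTag is hypothetical.

```javascript
// Pattern from the answer, plus an assumed named group "file"
// so the file name can be read back as well.
const fileTag = /\[\[File:(?<file>[^|]+)(?:\|(?<align>left|right|center))?(?:\|(?<size>\d+(?:px|%)))?(?:\|(?<thumb>thumb))?(?:\|(?<alt>[^|]+))?\]\]/i;

// Hypothetical helper: returns the named groups as a plain object,
// or null when the text contains no matching tag.
const parseFileTag = (text) => {
  const match = fileTag.exec(text);
  return match ? { ...match.groups } : null;
};

console.log(parseFileTag('[[File:HumanMaleDiagram.png|right|300px|thumb|A diagram of a male human.]]'));
console.log(parseFileTag('[[File:Plain.png]]'));
```

Groups that did not take part in the match come back as undefined, so the caller can tell which options were present.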
Here's a quick snippet using exec(), destructuring the results, and returning an object with a value of undefined for any capturing groups that are missing.
I wasn't sure if the pipes were permanent even if the string between was absent.
const regex = /(?:File:([\w\s-]+\.\w{3,4}))\|(right|left|center)?\|(\d+(?:px|%))?\|(thumb)?\|(.*[^\]])?/i
const parse = (str) => {
const [_, file, align, width, thumb, desc] = regex.exec(str) ?? [];
return { file, align, width, thumb, desc };
}
const
str1 = '[[File:HumanMaleDiagram.png|right|300px|thumb|A diagram of a male human.]]',
str2 = '[[File:Human_23.mp4|left|300%||A diagram of a male human.]]',
str3 = '[[File:HumanMDiagram.jpeg||300%||A diagram of a human, male.]]';
console.log(parse(str1));
console.log(parse(str2));
console.log(parse(str3));
String manipulation
Given that your string is delimited with | characters you could forgo the regex and use standard string manipulation.
Trim off the leading and trailing [[ and ]]
str = str.replace(/(?:^\[\[)|(?:\]\]$)/g, '')
Split the result at the | characters.
str.split('|')
const parse = (str) => {
str = str.replace(/(?:^\[\[)|(?:\]\]$)/g, '');
return str.split('|');
}
const
str1 = '[[File:HumanMaleDiagram.png|right|300px|thumb|A diagram of a male human.]]',
str2 = '[[File:Human_23.mp4|left|300%||A diagram of a male human.]]',
str3 = '[[File:HumanMDiagram.jpeg||300%||A diagram of a human, male.]]';
console.log(parse(str1));
console.log(parse(str2));
console.log(parse(str3));
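If named fields are wanted rather than a bare array, the split parts could be classified by their content. This is only a sketch under the assumption that an option's meaning can be told from its shape (alignment keyword, size with px or %, the word thumb, anything else is the alt text); the name parseBySplitting is hypothetical.

```javascript
// Hypothetical helper: split the tag at "|" and sort each non-empty
// part into a field according to what it looks like.
const parseBySplitting = (tag) => {
  const [file, ...options] = tag.replace(/^\[\[File:|\]\]$/g, '').split('|');
  const result = { file };
  for (const option of options) {
    if (option === '') continue; // empty slot such as "||"
    if (/^(?:left|right|center)$/i.test(option)) result.align = option;
    else if (/^\d+(?:px|%)$/i.test(option)) result.size = option;
    else if (/^thumb$/i.test(option)) result.thumb = true;
    else result.alt = option; // assumption: anything else is the alt text
  }
  return result;
};

console.log(parseBySplitting('[[File:HumanMaleDiagram.png|right|300px|thumb|A diagram of a male human.]]'));
console.log(parseBySplitting('[[File:HumanMDiagram.jpeg||300%||A diagram of a human, male.]]'));
```

Because the parts are classified rather than read by position, this works whether or not the empty pipes are kept for missing options.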

Separate a string into a pair of substrings at a special character sequence

I want to get two string values separated by a special character.
For example, say user types in a search query like
japanese->chinese (or with spaces japanese -> chinese).
The spaces around the arrow should not matter. I should still get 'japanese' and 'chinese' even if there are multiple spaces between the arrow and the two strings.
The whole string 'japanese -> chinese' will be sent.
From that string, I want to retrieve the individual strings 'japanese' and 'chinese' to perform some search logic. How would I do that with JavaScript? Is this possible with a regex?
The solution below takes the whitespace requirement into account.
Its limitation is input that does not strictly follow the pattern described in the question.
The regex itself, /\s*->\s*/g, matches exactly this:
\s* ... an optional whitespace sequence ... followed by ...
-> ... followed by ...
\s* ... an optional whitespace sequence
The global flag is set, although the OP's current requirements do not need it.
const test = ' japanese ->chinese ';
const regXSplit = (/\s*->\s*/g);
console.log(
"' japanese ->chinese '.trim().split(/\\s*->\\s*/g)",
test.trim().split(regXSplit)
);
You can use split:
const str = "japanese->chinese";
const res = str.split("->");
console.log(res);
res will be an array of strings; you don't need a regex.
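A plain split("->") keeps any surrounding spaces, so the variant 'japanese -> chinese' needs a trim on each part. A minimal sketch, assuming exactly one arrow and two non-empty sides are expected; the name splitQuery is hypothetical.

```javascript
// Hypothetical helper: split at the arrow, trim each side, and return
// null unless there are exactly two non-empty parts.
const splitQuery = (query) => {
  const parts = query.split("->").map((part) => part.trim());
  return parts.length === 2 && parts.every(Boolean) ? parts : null;
};

console.log(splitQuery("japanese->chinese"));
console.log(splitQuery("  japanese   ->  chinese "));
console.log(splitQuery("japanese"));
```

Returning null for malformed input is a design choice for this sketch; throwing or returning an empty array would work just as well.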

How to extract only numbers from string except substring with special character in beginning of numbers

My plan is to extract numbers from a string, except numbers preceded by a special character. What do I mean?
Imagine the following (like an Excel formula):
=$A12+A$345+A6789
I need to extract the numbers that are not immediately preceded by a $ character, so the result of the right regex should be:
12
6789
I did some investigating and tried the following regex:
/[A-Z][0-9]+(?![/W])/g
which extracts:
A12
A6789
I was thinking of using a nested regex (to extract the numbers from that result), but I have no idea whether that is possible. My JavaScript source code so far:
http://jsfiddle.net/janzitniak/fvczu7a0/7/
Regards
Jan
const regex = /(?<ref>\$?[A-Z]+(?<!\$)[0-9]+)/g;
const str = `=$A12+A$345+A6789`;
const refs = [...(str.matchAll(regex) || [])].map(result => result.groups.ref);
console.log(refs)
The pattern matches an optional $, then one or more letters A-Z, then one or more digits 0-9 that are not immediately preceded by a $. A reference such as A$345 is therefore skipped.
The whole match is captured in a named group, referenced as ref (you can call it whatever you want).
Output:
["$A12","A6789"]
If you want just the number part, you can use:
const regex = /\$?[A-Z]+(?<!\$)(?<num>[0-9]+)/g;
const str = `=$A12+A$345+A6789`;
const nums = [...(str.matchAll(regex) || [])].map(result => +result.groups.num);
console.log(nums)
Output:
[12, 6789]
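If only the numbers matter, a single negative lookbehind may be enough. This is an alternative sketch, not the approach above: it matches any digit run that is not preceded by a $ or by another digit, so it would also pick up bare numbers such as the 5 in =5+A1.

```javascript
const formula = "=$A12+A$345+A6789";

// (?<![$\d]) rejects a start position that follows "$" or a digit,
// so a run like 345 cannot be entered at its second or third digit.
const rowNumbers = (formula.match(/(?<![$\d])\d+/g) || []).map(Number);

console.log(rowNumbers); // [12, 6789]
```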
const charSequence = '=$A12+A$345+A6789';
const numberList = (charSequence
.split(/\$\d+/) // - split at "'$' followed by one or more numbers".
.join('') // - join array of split results into string again.
.match(/\d+/g) || []) // - match any number-sequence or fall back to empty array.
.map(str => +str); // - typecast string into number.
//.map(str => parseInt(str, 10)); // parse string into integer.
console.log('numberList : ', numberList);
@ibraheem, can you help me once again please? How can I increment the ref output to get the following result: ["$A13","A6790"]? - JanZitniak
... the split/join/match approach can be iterated very fast, thus it proves to be quite flexible.
const charSequence = '=$A13+A$345+A6790';
const numberList = (charSequence
.split(/\$\d+/) // - split at "'$' followed by one or more numbers".
.join('') // - join array of split results into string again.
.match(/\$*[A-Z]\d+/g) || []); // - match any sequence of an optional '$' followed
// by 1 basic latin uppercase character followed
// by one or more number character(s).
console.log('numberList : ', numberList);
Peter thank you for your quick response about increment but on start I have const charSequence = '=$A12+A$345+A6789'; and as output I need ["$A13","A6790"]. – JanZitniak
... ok, now the full picture of the problem emerges ... which is (1) getting rid of the unnecessary patterns, (2) matching numbers within specific patterns AND somehow remembering the latter, (3) incrementing such numbers AND somehow reworking them into their remembered pattern.
const anchorSequence = '=$A12+A$345+A6789';
const listOfIncrementedAnchorCoordinates = [...(anchorSequence
// - split at "'$' followed by one or more numbers".
.split(/\$\d+/)
// - join array of split results into string again.
.join('')
// - match any sequence of an optional '$' followed by 1 basic latin
// uppercase character followed by one or more number character(s)
// and store each capture into a named group.
.matchAll(/(?<anchor>\$*[A-Z])(?<integer>\d+)/g) || [])
// map each regexp result from a list of RegExpStringIterator entries.
].map(({ groups }) => `${ groups.anchor }${ (+groups.integer + 1) }`);
console.log('listOfIncrementedAnchorCoordinates : ', listOfIncrementedAnchorCoordinates);
Peter if you are interest(...ed) in another problem I have one. How can I change const anchorSequence = '=$A12+A$345+A6789'; to following output ["B$345","B6789"]? I mean to change letter to next one in alphabetical order (if it is A then change to B, if it is B change to C and so on) if letter doesn't start with $. In my example it should change only A$345 and A6789. – JanZitniak
... with a little thinking effort it was not that hard to iterate/refactor the version before to this last one ...
const anchorSequence = '=$A12+A$345+A6789';
const listOfIncrementedColumns = [...(anchorSequence
// - split at "'$' followed by 1 basic latin uppercase character
// followed by one or more number character(s)".
.split(/\$[A-Z]\d+/)
// - join array of split results into string again.
.join('')
// - match any sequence of 1 basic latin uppercase character
// followed by an optional '$' followed by one or more number
// character(s) and store each capture into a named group.
.matchAll(/(?<column>[A-Z])(?<anchor>\$*)(?<row>\d+)/g) || [])
// map each regexp result from a list of RegExpStringIterator entries.
].map(({ groups }) => [
// - be aware that "Z" (charCode:90) will be followed by "[" (charCode:91)
// - thus, the handling of this edge case still needs to be implemented.
String.fromCharCode(groups.column.charCodeAt(0) + 1),
groups.anchor,
groups.row
].join(''));
console.log('listOfIncrementedColumns : ', listOfIncrementedColumns);
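The "Z" edge case mentioned in the comments could be handled roughly as follows. This sketch assumes spreadsheet-style column names, where Z is followed by AA and AZ by BA; the name nextColumn is hypothetical.

```javascript
// Hypothetical helper: increment a column name the way a spreadsheet
// does, carrying over from right to left when a letter is "Z".
const nextColumn = (column) => {
  const chars = column.split('');
  for (let i = chars.length - 1; i >= 0; i -= 1) {
    if (chars[i] !== 'Z') {
      chars[i] = String.fromCharCode(chars[i].charCodeAt(0) + 1);
      return chars.join('');
    }
    chars[i] = 'A'; // "Z" wraps to "A" and the carry moves one place left
  }
  return 'A' + chars.join(''); // every letter was "Z", e.g. "ZZ" -> "AAA"
};

console.log(nextColumn('A'), nextColumn('Z'), nextColumn('AZ'), nextColumn('ZZ'));
```

In the snippet above, nextColumn(groups.column) could stand in for the String.fromCharCode call, provided the column pattern is widened to [A-Z]+.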

Looking for one regex to replace my string

I have a problem with a regex and need some help.
Currently my URLs have the form
/search/:year/:month/:day/xxxx, where :month and :day may or may not be present. I need to replace the /search/:year/:month/:day pattern in the URL with an empty string, that is, keep only the remainder of the URL. Here are some examples:
1.'/search/2017/02/03/category/gun' => '/category/gun'
2.'/search/2017/02/03/' => '/'
3.'/search/2017/01/category/gun' => '/category/gun/'
4.'/search/2017/category/gun/' => '/category/gun/'
5.'/search/2018/?category=gun&type%5B%5D=sendo' => '/?category=gun&type%5B%5D=sendo/'
I tried the regex /^\/search\/((?:\d{4}))(?:\/((?:\d|1[012]|0[1-9])))?(?:\/((?:[0-3]\d)))/
but it fails for the case /search/2017/category/gun/
const regex = /^\/search\/((?:\d{4}))(?:\/((?:\d|1[012]|0[1-9])))?(?:\/((?:[0-3]\d)))/
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/category/gun/',
'/search/2017/02/03/category/gun/',
'/search/2018/?category=gun&type%5B%5D=sendo'
]
testLink.forEach((value, i) => {
console.log(value.replace(regex, ''))
console.log('-------------------')
})
Use this regex pattern (\/search\/.*\d+)(?=\/)
const regex = /(\/search\/.*\d+)(?=\/)/g;
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/',
'/search/2017/02/03/category/gun/',
'/search/2018/?category=gun&type%5B%5D=sendo'
]
testLink.forEach((value, i) => {
console.log(value.replace(regex, ''))
console.log('-------------------')
})
Splitting on the following pattern seems to work:
\d{4}(?:/\d{2})*
This would place the target you want as the second element in the split array, with the first portion being what precedes the year-digit portions of the path.
const input = '/search/2017/02/03/category/gun';
const parts = input.split(/\d{4}(?:\/\d{2})*/);
console.log(parts[1]);
console.log('/search/2017/02/03'.split(/\d{4}(?:\/\d{2})*/)[1]);
console.log('/search/2017/01/category/gun'.split(/\d{4}(?:\/\d{2})*/)[1]);
console.log('/search/2017/category/gun/'.split(/\d{4}(?:\/\d{2})*/)[1]);
Whether or not you expect a final trailing path separator is not entirely clear. In any case, the answer above can be modified to fit that expectation.
Here's one way:
const strings = [
'/search/2017/02/03/category/gun',
'/search/2017/02/03/',
'/search/2017/01/category/gun',
'/search/2017/category/gun/',
];
for (let i = 0; i < strings.length; i++) {
alert(strings[i].replace(/\/search\/[0-9\/]+(\/)(category\/[^\/$]+)?.*/,'$1$2'));
}
You can use the same regex, but reverse the order of the alternatives in this part: (?:\d|1[012]|0[1-9])
so that it becomes (?:0[1-9]|1[012]|\d), and make the last group optional: (?:\/((?:[0-3]\d)))? With \d first, a month such as 02 would match only its first digit.
const regex = /^\/search\/((?:\d{4}))(?:\/((?:0[1-9]|1[012]|\d)))?(?:\/((?:[0-3]\d)))?/
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/category/gun/',
'/search/2017/02/03/category/gun/'
]
testLink.forEach((value) => console.log(value.replace(regex, '')))
Output:
/category/gun/
/category/gun/
/category/gun/
/category/gun/
If you are sure of the date format, you can simplify the expression to ^\/search\/\d{4}(?:\/\d{1,2}){0,2}
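A short sketch of the simplified pattern applied to the inputs from the question. The names datePrefix and stripDatePrefix are hypothetical, and the pattern assumes that the path segment after the date does not itself start with a digit.

```javascript
// Simplified pattern: a four-digit year, then up to two optional
// one- or two-digit segments (month and day).
const datePrefix = /^\/search\/\d{4}(?:\/\d{1,2}){0,2}/;

// Hypothetical helper: drop the date prefix, keep the rest of the URL.
const stripDatePrefix = (url) => url.replace(datePrefix, '');

const urls = [
  '/search/2017/02/03/category/gun',
  '/search/2017/02/03/',
  '/search/2017/01/category/gun',
  '/search/2017/category/gun/',
  '/search/2018/?category=gun&type%5B%5D=sendo',
];

urls.forEach((url) => console.log(url, '=>', stripDatePrefix(url)));
```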

Regular expression to allow either 6 digit or 10 digit with a symbol (-)

I want to validate the text in a text box using a regex: the user should be allowed to enter either 6 or 10 digits, with a dash (-) anywhere between the digits, for a zip code. I can enforce the digit counts with /^(\d{6}|\d{10})$/, but I cannot work out how to allow the optional dash.
Can someone please help me sort this out?
You could check first the wanted length and then the content.
/^ string start
(?=(.{7}|.{11})$) length check with positive look ahead
\d+-\d+ pattern check
$/ end of string
var test = [
'1',
'a',
'-',
'12',
'1245678901234567',
'1-23456',
'12-3456',
'123-456',
'1234-56',
'12345-6',
'12-345-6',
'12345-67890',
'foo-bar'
];
test.forEach(function (a) {
console.log(a, /^(?=(.{7}|.{11})$)\d+-\d+$/.test(a));
});
This regex makes the dash optional:
var test = [
'12456789',
'12345678901',
'123456',
'1234567890',
'1-23456',
'12-34567890',
];
console.log(test.map(function (a) {
return a+' : '+/^(?:(?:(?=.{7}$|.{11}$)\d+-\d+)|\d{6}|\d{10})$/.test(a);
}));
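An alternative sketch that splits the check into two steps, which may be easier to read than the lookahead: first confirm the shape (digits with at most one dash strictly between them), then count the digits. The function name isValidZip is hypothetical.

```javascript
// Hypothetical helper: accept 6 or 10 digits with at most one dash,
// and only when the dash sits between two digits.
const isValidZip = (value) => {
  // Shape check: digits, an optional single dash, digits.
  if (!/^\d+-?\d+$/.test(value)) return false;
  // Count check: remove the dash (if any) and count what remains.
  const digitCount = value.replace('-', '').length;
  return digitCount === 6 || digitCount === 10;
};

['123456', '1234567890', '1-23456', '12-34567890', '-123456', '12-345-6', '1234567']
  .forEach((value) => console.log(value, isValidZip(value)));
```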
As per @vickey's comment, this achieves his format "12345-1234":
[0-9]{1,5}-[0-9]{1,4}
http://jsfiddle.net/5PNcJ/239/
