Split after regex match without losing delimiter


I'd like to transform a string like:
hello!world.what?up into ["hello!", "world.", "what?", "up"]
.split(/[?=<\.\?\!>]+/) is close to what I'm after, which returns:
["hello", "world", "what", "up"]
.split(/(?=[\?\!\.])/) is a bit closer yet, which returns:
["hello", "!world", ".what", "?up"]
This does the trick, but it's not pretty:
.split(/(?=[\?\!\.])/).map((s, idx, arr) => { if (idx > 0) s = s.slice(1); return idx < arr.length - 1 ? s + arr[idx+1][0] : s }).filter(s => s)
How would I rephrase this to achieve the desired output?
Edit: Updated question.

Not sure of the real requirement but to accomplish what you want you could use .match instead of .split.
const items =
'hello!world.what?'.match(/\w+\W/g);
console.log(items);
update after comment
You could add a group for any character you want to use as the terminator for each part.
const items =
'hello!world.what?'.match(/\w+[!.?]/g);
console.log(items);
additional update
the previous solution would only select alphanumeric chars before the !.?
If you want to match any char except the delimiters then use
const items =
'hello!world.what?up'.match(/[^!.?]+([!.?]|$)/g);
console.log(items);
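If the target engines support lookbehind assertions, split itself can keep each delimiter attached to the chunk before it. A minimal sketch under that assumption (the helper name splitAfter is made up for illustration):

```javascript
// Hypothetical helper: split right after each !, . or ? using a lookbehind.
// Assumes the JavaScript engine supports lookbehind assertions.
const splitAfter = (str) => str.split(/(?<=[!.?])/);

const parts = splitAfter("hello!world.what?up");
console.log(parts); // ["hello!", "world.", "what?", "up"]
```

Because the lookbehind is zero-width, nothing is consumed by the split, so no delimiter is lost.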

One solution is to first use replace() to append a token after each matched delimiter, and then split on that token. The token must be a character that never occurs in the input.
let input = "hello!world.what?";
const customSplit = (str) =>
{
let token = "#";
return str.replace(/[!.?]/g, (match) => match + token)
.split(token)
.filter(Boolean);
}
console.log(customSplit(input));

Related

Remove a substring to make given word in javascript

I want to know about the algorithm for below question in JavaScript.
Check whether the given word can be turned into "programming" by removing a substring from it. You may remove only one substring from the given word.
Answer with 'yes' or 'no'.
example              answer   explanation
"progabcramming"     yes      remove substring 'abc'
"programmmeding"     yes      remove substring 'med'
"proasdgrammiedg"    no       you would have to remove 2 substrings, 'asd' and 'ied', which is not allowed
"pxrogramming"       yes      remove substring 'x'
"pxrogramminyg"      no       you would have to remove 2 substrings, 'x' and 'y', which is not allowed
Please tell me an algorithm to solve it

{
// will create a regexp for fuzzy search
const factory = (str) => new RegExp(str.split('').join('(.*?)'), 'i')
const re = factory('test') // re = /t(.*?)e(.*?)s(.*?)t/i
const matches = re.exec('te-abc-st') ?? [] // an array of captured groups
const result = matches
.slice(1) // first element is a full match, we don't need it
.filter(group => group.length) // we're also not interested in empty matches
// now result contains a list of captured groups
// in this particular example a single '-abc-'
}

I'm not sure how efficient this code is, but the only thing I can come up with is a regular expression.
const word = 'programming';
const test = ['progabcramming', 'programmmeding', 'proasdgrammiedg', 'pxrogramming', 'pxrogramminyg', 'programming'];
// a manually written regular expression would look like this:
// const regexp = /^(?:p.+rogramming|pr.+ogramming|pro.+gramming|prog.+ramming|progr.+amming|progra.+mming|program.+ming|programm.+ing|programmi.+ng|programmin.+g)$/;
// create the same regular expression programmatically:
// one alternative per split point, with .+ standing for the removed substring
const alternatives = [];
for (let i = 1; i < word.length; i++) {
alternatives.push(`${word.substring(0, i)}.+${word.substring(i)}`);
}
// the non-capturing group makes ^ and $ apply to every alternative
const regexp = new RegExp(`^(?:${alternatives.join('|')})$`);
// content output
let content = '';
test.forEach(element => {
content += `${element}: ${regexp.test(element)}\n`;
});
document.body.innerText = content;
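The same check can also be done without a regular expression by comparing the common prefix and the common suffix of the two strings: if together they cover the whole target word, whatever lies between them in the candidate is one contiguous substring that can be removed. A minimal sketch; the function name canBecome is made up for illustration, and treating an exact match (nothing to remove) as "yes" is an assumption, since the question does not say.

```javascript
// Hypothetical helper: can `candidate` become `word` by removing one contiguous substring?
function canBecome(candidate, word) {
  if (candidate.length < word.length) return "no";
  // Length of the common prefix of candidate and word.
  let prefix = 0;
  while (prefix < word.length && candidate[prefix] === word[prefix]) prefix++;
  // Length of the common suffix of candidate and word.
  let suffix = 0;
  while (
    suffix < word.length &&
    candidate[candidate.length - 1 - suffix] === word[word.length - 1 - suffix]
  ) suffix++;
  // Assumption: an exact match (nothing to remove) counts as "yes".
  return prefix + suffix >= word.length ? "yes" : "no";
}

const samples = ["progabcramming", "programmmeding", "proasdgrammiedg", "pxrogramming", "pxrogramminyg"];
const answers = samples.map((s) => canBecome(s, "programming"));
console.log(answers); // ["yes", "yes", "no", "yes", "no"]
```

This runs in time linear in the length of the word and needs no pattern construction.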

Match all instances of character except the first one, without lookbehind

I’m struggling with this simple regex that is not working correctly in Safari:
(?<=\?.*)\?
It should match each ? except the first one.
I know that lookbehind is not working on Safari yet, but I need to find some workaround for it. Any suggestions?

You can use an alternation. The first alternative captures everything up to and including the first question mark; use that group again in the replacement to leave it unmodified.
The second alternative matches any later question mark, which is the one to be replaced.
const regex = /^([^?]*\?)|\?/g;
const s = "test ? test ? test ?? test /";
console.log(s.replace(regex, (m, g1) => g1 ? g1 : "[REPLACE]"));

There are always alternatives to lookbehinds.
In this case, all you need to do is replace all instances of a character (sequence), except the first.
The .replace method accepts a function as the second argument.
That function receives the full match, each capture group match (if any), the offset of the match, and a few other things as parameters.
.indexOf can report the first offset of a match.
Alternatively, .search can also report the first offset of a match, but works with regexes.
The two offsets can be compared inside the function:
const yourString = "Hello? World? What? Who?",
yourReplacement = "!",
pattern = /\?/g,
patternString = "?",
firstMatchOffsetIndexOf = yourString.indexOf(patternString),
firstMatchOffsetSearch = yourString.search(pattern);
console.log(yourString.replace(pattern, (match, offset) => {
if(offset !== firstMatchOffsetIndexOf){
return yourReplacement;
}
return match;
}));
console.log(yourString.replace(pattern, (match, offset) => {
if(offset !== firstMatchOffsetSearch){
return yourReplacement;
}
return match;
}));
This works for character sequences, too:
const yourString = "Hello. Hello. Hello. Hello.",
yourReplacement = "Hi",
pattern = /Hello/g,
firstOffset = yourString.search(pattern);
console.log(yourString.replace(pattern, (match, offset) => {
if(offset !== firstOffset){
return yourReplacement;
}
return match;
}));

Split on ? and join the remaining parts with the replacement:
var s = "one ? two ? three ? four"
var l = s.split("?") // Split with ?
var first = l.shift() // Get first item and remove from l
console.log(first + "?" + l.join("<REPLACED>")) // Build the results
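Note that the split-and-join snippet appends a ? even when the input contains none. A variant that avoids this is to pass a callback to replace and let a flag skip the first match. A minimal sketch; the helper name replaceAllButFirst is made up for illustration, and it assumes the pattern is passed with the g flag:

```javascript
// Hypothetical helper: replace every match of `pattern` except the first one.
// Assumes `pattern` is a RegExp with the g flag set.
function replaceAllButFirst(str, pattern, replacement) {
  let seenFirst = false;
  return str.replace(pattern, (match) => {
    if (!seenFirst) {
      seenFirst = true; // leave the first match untouched
      return match;
    }
    return replacement;
  });
}

const replaced = replaceAllButFirst("one ? two ? three ? four", /\?/g, "<REPLACED>");
console.log(replaced); // "one ? two <REPLACED> three <REPLACED> four"
```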

Looking for one regex to replace my string

I have a problem with a regex and need some help.
Currently my URLs have the form
/search/:year/:month/:day/xxxx, where :month and :day may or may not be present. I need to replace the /search/:year/:month/:day pattern in the URL with an empty string, i.e. keep only the remaining part of the URL. Some examples:
1.'/search/2017/02/03/category/gun' => '/category/gun'
2.'/search/2017/02/03/' => '/'
3.'/search/2017/01/category/gun' => '/category/gun/'
4.'/search/2017/category/gun/' => '/category/gun/'
5.'/search/2018/?category=gun&type%5B%5D=sendo' => '/?category=gun&type%5B%5D=sendo/'
I tried regex = /^\/search\/((?:\d{4}))(?:\/((?:\d|1[012]|0[1-9])))?(?:\/((?:[0-3]\d)))/
But it fails for the case /search/2017/category/gun/
const regex = /^\/search\/((?:\d{4}))(?:\/((?:\d|1[012]|0[1-9])))?(?:\/((?:[0-3]\d)))/
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/category/gun/',
'/search/2017/02/03/category/gun/',
'/search/2018/?category=gun&type%5B%5D=sendo'
]
testLink.forEach((value, i) => {
console.log(value.replace(regex, ''))
console.log('-------------------')
})

Use this regex pattern (\/search\/.*\d+)(?=\/)
Demo
const regex = /(\/search\/.*\d+)(?=\/)/g;
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/',
'/search/2017/02/03/category/gun/',
'/search/2018/?category=gun&type%5B%5D=sendo'
]
testLink.forEach((value, i) => {
console.log(value.replace(regex, ''))
console.log('-------------------')
})

Splitting on the following pattern seems to work:
\d{4}(?:/\d{2})*
This would place the target you want as the second element in the split array, with the first portion being what precedes the year-digit portions of the path.
input = '/search/2017/02/03/category/gun';
parts = input.split(/\d{4}(?:\/\d{2})*/);
console.log(parts[1]);
console.log('/search/2017/02/03'.split(/\d{4}(?:\/\d{2})*/)[1]);
console.log('/search/2017/01/category/gun'.split(/\d{4}(?:\/\d{2})*/)[1]);
console.log('/search/2017/category/gun/'.split(/\d{4}(?:\/\d{2})*/)[1]);
Whether or not you expect a final trailing path separator is not entirely clear. In any case, the above answer can be modified to fit that expectation.

Here's one way:
strings = [
'/search/2017/02/03/category/gun',
'/search/2017/02/03/',
'/search/2017/01/category/gun',
'/search/2017/category/gun/',
];
for (var i = 0; i < strings.length; i++) {
alert(strings[i].replace(/\/search\/[0-9\/]+(\/)(category\/[^\/$]+)?.*/,'$1$2'));
}

You can use the same regex, but reverse the order of the alternatives in this part: (?:\d|1[012]|0[1-9])
becomes (?:0[1-9]|1[012]|\d). Also make the last group optional: (?:\/((?:[0-3]\d)))?
const regex = /^\/search\/((?:\d{4}))(?:\/((?:0[1-9]|1[012]|\d)))?(?:\/((?:[0-3]\d)))?/
const testLink = [
'/search/2017/02/03/category/gun/',
'/search/2017/01/category/gun/',
'/search/2017/category/gun/',
'/search/2017/02/03/category/gun/'
]
output
/category/gun/
/category/gun/
/category/gun/
/category/gun/
If you are sure of the date format, you can simplify the expression to ^\/search\/\d{4}(?:\/\d{1,2}){0,2}
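As a quick check, the simplified expression can be run against the URLs from the question. This is only a sketch; the variable names are made up for illustration, and it assumes that no path segment following the date starts with a digit.

```javascript
// Simplified pattern: /search/, a 4-digit year, then up to two 1-2 digit segments.
const datePrefix = /^\/search\/\d{4}(?:\/\d{1,2}){0,2}/;

const urls = [
  "/search/2017/02/03/category/gun",
  "/search/2017/02/03/",
  "/search/2017/01/category/gun",
  "/search/2017/category/gun/",
  "/search/2018/?category=gun&type%5B%5D=sendo",
];

// Strip the date prefix and keep the remainder of each URL.
const remainders = urls.map((url) => url.replace(datePrefix, ""));
console.log(remainders);
// ["/category/gun", "/", "/category/gun", "/category/gun/", "/?category=gun&type%5B%5D=sendo"]
```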

Convert kebab-case to camelCase with JavaScript

Say I have a function that transforms kebab-case to camelCase:
camelize("my-kebab-string") === 'myKebabString';
I'm almost there, but my code outputs the first letter with uppercase too:
function camelize(str){
let arr = str.split('-');
let capital = arr.map(item=> item.charAt(0).toUpperCase() + item.slice(1).toLowerCase());
let capitalString = capital.join("");
console.log(capitalString);
}
camelize("my-kebab-string");

You can also try regex.
camelize = s => s.replace(/-./g, x=>x[1].toUpperCase())
This looks for a hyphen followed by any character and replaces the hyphen+character pair with the capitalised character.
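The one-liner matches any character after the hyphen. A slightly stricter sketch, assuming only lowercase ASCII letters should be capitalised and that the input is already lower-case kebab-case, captures the letter in a group instead:

```javascript
// Sketch: capitalise only a lowercase letter that directly follows a hyphen.
// Assumes the input is already lower-case kebab-case.
const camelize = (s) => s.replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());

console.log(camelize("my-kebab-string")); // "myKebabString"
console.log(camelize("plain"));           // "plain"
```

With this pattern a hyphen followed by a digit or another hyphen is left untouched rather than swallowed.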

To keep your existing code, I've just added a check on the index that returns the item lower-cased, instead of capitalised, when index is 0 (falsy), since the problem is just that you are upper-casing the first item as well, which you shouldn't.
In a nutshell, the inline expression becomes: (item, index) => index ? item.charAt(0).toUpperCase() + item.slice(1).toLowerCase() : item.toLowerCase(), because:
If index is not falsy (so, if index is > 0 in your context), the capitalized string is returned.
Otherwise, the current item is returned in lower case.
Of course, this could be cleaner and likely a single line, but I wanted to stay as close as possible to your code so that you could understand what was wrong:
function camelize(str){
let arr = str.split('-');
let capital = arr.map((item, index) => index ? item.charAt(0).toUpperCase() + item.slice(1).toLowerCase() : item.toLowerCase());
// ^-- change here.
let capitalString = capital.join("");
console.log(capitalString);
}
camelize("my-kebab-string");
As a side note, you could've found a potential cleaner answer here: Converting any string into camel case

For lodash users:
_.camelCase('my-kebab-string') => 'myKebabString'

The first method is to just transform to lower case the first entry of your capital array, like this:
capital[0] = capital[0].toLowerCase();
Another method, which I think to be more efficient, is to pass another parameter to the map callback, which is the index. Take a look at this for further reading:
https://www.w3schools.com/jsref/jsref_map.asp
So you transform to upper case only if (index > 0).
Like this:
let capital = arr.map((item, index) => index ? item.charAt(0).toUpperCase() + item.slice(1).toLowerCase() : item);

I tried both the array/string approach and the regex approach, and in my quick test the regex was slower:
const string = " background-color: red; \n color: red;\n z-index: 10"
// regex
console.time("regex")
let property = string
const camelProp = property.replace(/(-[a-z])/g, (g) => {
return g.replace("-", "").toUpperCase()
})
console.timeEnd("regex")
// custom
console.time("custom")
const str = string
let strNew = str
.split("-")
.map((e) => {
return e[0].toUpperCase() + e.slice(1)
})
.join("")
console.timeEnd("custom")
console.log(camelProp)
console.log(strNew)

JS / TS splitting string by a delimiter without removing the delimiter

I have a string that I need to split by a certain delimiter and convert into an array, but without removing the delimiter itself.
For example, consider the following code:
var str = "#mavic#phantom#spark";
str.split("#") //["", "mavic", "phantom", "spark"]
I need the output to be as follows:
["#mavic", "#phantom", "#spark"]
I read here but that does not answer my question.

You could split by positive lookahead of #.
var string = "#mavic#phantom#spark",
splitted = string.split(/(?=#)/);
console.log(splitted);
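The lookahead split can be generalised to an arbitrary delimiter string by escaping it before building the regex. A minimal sketch; the helper name splitKeepingDelimiter is made up for illustration:

```javascript
// Hypothetical helper: split before each occurrence of `delimiter`, keeping it
// at the start of every chunk.
function splitKeepingDelimiter(str, delimiter) {
  // Escape regex metacharacters so the delimiter is matched literally.
  const escaped = delimiter.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return str.split(new RegExp(`(?=${escaped})`));
}

const tags = splitKeepingDelimiter("#mavic#phantom#spark", "#");
console.log(tags); // ["#mavic", "#phantom", "#spark"]
```

The escaping step matters for delimiters such as "." or "|", which would otherwise be interpreted as regex syntax.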

Split the string by # and use reduce to build the result array:
var str = "#mavic#phantom#spark";
let x = str.split("#").reduce((acc, curr) => {
if (curr !== '') {
acc.push('#' + curr);
}
return acc;
}, [])
console.log(x)

Here are also some non-regexp methods of solving your task:
Solution 1, the classical approach: iterate over the string, and each time indexOf finds our delimiter, push the substring between the current position and the next position to the result array. The else block handles the last substring: we simply add it to the result array and break the loop.
const str = "#mavic#phantom#spark";
const delimiter = '#';
const result1 = [];
let i = 0;
while (i < str.length) {
const nextPosition = str.indexOf(delimiter, i+1);
if (nextPosition > 0) {
result1.push(str.substring(i, nextPosition));
i = nextPosition;
} else {
result1.push(str.substring(i));
break;
}
}
Solution 2 - split the initial string starting at index 1 (in order to not include empty string in the result array) and then just map the result array by concatenating the delimiter and current array item.
const result2 = str.slice(1).split(delimiter).map(s => delimiter + s);

Another way:
filter out empty elements after splitting, then map the remaining elements so that each starts with the character you split on.
string.split("#").filter((elem) => elem).map((elem) => "#" + elem);
