How to split a string by a difference in character as delimiter? - javascript

What I'd like to achieve is splitting a string like this, i.e. the delimiters are the indexes where the character before that index is different from the character after that index:
"AAABBCCCCDEEE" -> ["AAA", "BB", "CCCC", "D", "EEE"]
I've been trying to make up a concise solution, but I ended up with this rather verbose code: http://jsfiddle.net/b39aM/1/.
var arr = [], // output
text = "AAABBCCCCDEEE", // input
current;
for(var i = 0; i < text.length; i++) {
var char = text[i];
if(char !== current) { // new letter
arr.push(char); // create new array element
current = char; // update current
} else { // current letter continued
arr[arr.length - 1] += char; // append letter to last element
}
}
It's naive and I don't like it:
I'm manually iterating over each character, and I'm appending to the array character by character
It's a little too long for the simple thing I want to achieve
I was thinking of using a regexp but I'm not sure what the regexp should be. Is it possible to define a regexp that means "one character and a different character following"?
Or more generally, is there a more elegant solution for achieving this splitting method?

Yes, you can use a regular expression:
"AAABBCCCCDEEE".match(/(.)\1*/g)
Here . will match any character and \1* will match any following characters that are the same as the formerly matched one. And with a global match you’ll get all matching sequences.
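A minimal sketch that wraps the one-liner above in a reusable helper. The name splitRuns is hypothetical. It swaps . for [\s\S] on the assumption that line breaks should count as characters too, and it falls back to an empty array because match() returns null when nothing matches (for example on an empty string).

```javascript
// Hypothetical helper: split a string into runs of identical characters.
function splitRuns(text) {
  // ([\s\S]) captures any single character, including line breaks;
  // \1* extends the match over repeats of that same character.
  // match() returns null when there is no match (e.g. ""), hence the fallback.
  return text.match(/([\s\S])\1*/g) || [];
}

console.log(splitRuns("AAABBCCCCDEEE")); // ["AAA", "BB", "CCCC", "D", "EEE"]
console.log(splitRuns(""));              // []
```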


check if for every single char in string

I am trying to write an if condition that checks for a large number of chars.
I can use
if (str==!||str==#||str==#||str==$||str==^||str==&)
And so on, but this seems very inefficient. I would like the condition to be true if the char is one of these:
!##%$^&()_-+=\?/.,'][{}<>`~
Is there any shorter and more efficient way of doing it?
for (var c0 = 1; c0 > fn.length++; c0++) {
var str = fn.charAt(c0--);
if (str ==-"!##%$^&()_-+=\?/.,'][{}<>`~") {
}
}
I want the check to occur for every single char from the string above.
You can use a regular expression character class to check if your character matches a particular character:
/^[\!##%$\^&\(\)_\-\+=\?\/\.,'\]\[\{\}\<\>`~]$/
Here I have escaped the special characters so that they get treated like regular characters.
See working example below:
const regex = /^[\!##%$\^&\(\)_\-\+=\?\/\.,'\]\[\{\}\<\>`~]$/,
charA = '#', // appears in char set
charB = 'A'; // doesn't appear in char set
console.log(regex.test(charA)); // true
console.log(regex.test(charB)); // false
Alternatively, if you don't want to use regular expressions you can instead put all your characters into an array and use .includes to check if your character is in your array.
const chars = "!##%$^&()_-+=\?/.,'][{}<>`~",
charArr = [...chars],
charA = '#', // is in char set
charB = 'A'; // isn't in char set
console.log(charArr.includes(charA)); // true
console.log(charArr.includes(charB)); // false
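As a variation on the array lookup above, the characters could also be kept in a Set, which lets the same check cover a whole string. This is only a sketch: the names SPECIAL_CHARS, isSpecialChar and containsSpecialChar are hypothetical, and the character list is copied from the question with the backslash written as \\ so that it survives inside the string literal.

```javascript
// Assumed character set, copied from the question (backslash escaped as \\).
const SPECIAL_CHARS = new Set("!##%$^&()_-+=\\?/.,'][{}<>`~");

// True if the single character ch is in the set.
function isSpecialChar(ch) {
  return SPECIAL_CHARS.has(ch);
}

// True if at least one character of str is in the set.
function containsSpecialChar(str) {
  return [...str].some(isSpecialChar);
}

console.log(isSpecialChar("#"));               // true
console.log(isSpecialChar("A"));               // false
console.log(containsSpecialChar("file_name")); // true (underscore)
console.log(containsSpecialChar("filename"));  // false
```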
Just use regular expressions rather than manual single character checking.
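// The characters must sit inside a character class [...]; passing them as a bare pattern string is a syntax error.
const pattern = /[!##%$^&()_\-+=\\?\/.,'\][{}<>`~]/;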
const exists = pattern.test(str);
if (exists) {
// code logic for special character exists in string
}
First you can use split('') to split a string into an array of characters. Next you can use .some to check if a condition is true for at least one element in the array:
"!##%$^&()_-+=\?/.,'][{}<>`~".split('').some(x => x === str)

Javascript regex: match first 50 characters, respecting words

I'm trying to keep some nav bar lines short by matching the first 50 chars then concatenating '...', but using substr sometimes creates some awkward word chops.
So I want to figure out a way to respect words.
I could write a function to do this, but I'm just seeing if there's an easier/cleaner way.
I've used this successfully in perl:
^(.{50,50}[^ ]*)
Nice and elegant! But it doesn't work in Javascript :(
let catName = "A string that is longer than 50 chars that I want to abbreviate";
let regex = /^(.{50,50}[^ ]*)/;
let match = regex.exec(catName);
match is undefined
Use the String#match method with a regex that ends on a word boundary, so that the last word is included whole.
str.match(/^.{1,50}.*?\b/)[0]
var str="I'm trying to keep some nav bar lines short by matching the first 50 chars then concatenating '...', but using substr sometimes creates some awkward word chops. So I want to figure out a way to respect words.";
console.log('With your code:', str.substr(0,50));
console.log('Using match:',str.match(/^.{1,50}.*?\b/)[0]);
Probably the most fool-proof solution with regular expression would be to use replace method instead. It won't fail with strings less than 50 characters:
str.replace(/^(.{50}[^ ]*).*/, '$1...');
var str = 'A string that is longer than 50 chars that I want to abbreviate';
console.log( str.replace(/^(.{50}[^ ]*).*/, '$1...') );
Tinkering with Pranov's answer, I think this works and is most succinct:
// abbreviate strings longer than 50 char, respecting words
if (catName.length > 50) {
catName = catName.match(/^(.{50,50}[^ ]*)/)[0] + '...';
}
The regex in my OP did work, but it was used in a loop and was choking on strings that already had fewer than 50 chars.
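The length guard and the regex can be combined into one helper. This is a sketch under a few assumptions: the name abbreviate and the limit parameter are hypothetical, [\s\S] replaces . so that a line break inside the first characters does not make the match fail, and the ellipsis is skipped when the kept part already covers the whole string.

```javascript
// Hypothetical helper: keep the first `limit` characters plus the rest of
// the word they end in, then append "..." if anything was cut off.
function abbreviate(text, limit) {
  if (text.length <= limit) return text; // short strings pass through untouched
  const re = new RegExp("^[\\s\\S]{" + limit + "}[^ ]*");
  const head = text.match(re)[0]; // cannot be null: text is longer than limit
  return head.length < text.length ? head + "..." : head;
}

console.log(abbreviate("hello world foo", 7)); // "hello world..."
console.log(abbreviate("hello world", 7));     // "hello world"
console.log(abbreviate("short", 50));          // "short"
```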
You can .split() on \s and loop over the resulting words, adding the .length of each word to a running total. When the total reaches 50 or more, .slice() the array at the current iteration, .join() the words with a space character " ", .concat() the ellipsis, and break out of the loop.
let catName = "A string that is longer than 50 chars that I want to abbreviate";
let [stop, res] = [50, catName]; // default: strings within the limit stay unchanged
if (catName.length > stop) {
  let arr = catName.split(/\s/);
  for (let i = 0, n = 0; i < arr.length; i++) {
    // word length, plus one for the space that preceded it
    n += arr[i].length + (i > 0 ? 1 : 0);
    if (n >= stop) {
      // keep the words before the one that crosses the limit
      res = arr.slice(0, i).join(" ").concat("...");
      break;
    }
  }
}
document.querySelector("pre").textContent = res;
<pre></pre>

Remove (n)th space from string in JavaScript

I am trying to remove some spaces from a few dynamically generated strings. Which space I remove depends on the length of the string. The strings change all the time so in order to know how many spaces there are, I iterate over the string and increment a variable every time the iteration encounters a space. I can already remove all of a specific type of character with str.replace(' ',''); where 'str' is the name of my string, but I only need to remove a specific occurrence of a space, not all the spaces. So let's say my string is
var str = "Hello, this is a test.";
How can I remove ONLY the space after the word "is"? (Assuming that the next string will be different so I can't just write str.replace('is ','is'); because the word "is" might not be in the next string).
I checked documentation on .replace, but there are no other parameters that it accepts so I can't tell it just to replace the nth instance of a space.
If you want to go by indexes of the spaces:
var str = 'Hello, this is a test.';
function replace(str, indexes){
return str.split(' ').reduce(function(prev, curr, i){
var separator = ~indexes.indexOf(i) ? '' : ' ';
return prev + separator + curr;
});
}
console.log(replace(str, [2,3]));
http://jsfiddle.net/96Lvpcew/1/
As it is easy for you to get the index of the space (as you are iterating over the string) , you can create a new string without the space by doing:
str = str.substr(0, index) + str.substr(index + 1);
where index is the index of the space you want to remove.
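The index-based approach can be sketched end to end as follows. Both helper names are hypothetical, and n is assumed to be 1-based (n = 3 means the third space). If the string has fewer than n spaces it is returned unchanged.

```javascript
// Hypothetical helper: index of the n-th (1-based) space, or -1 if there are fewer.
function indexOfNthSpace(str, n) {
  let index = -1;
  for (let found = 0; found < n; found++) {
    index = str.indexOf(" ", index + 1);
    if (index === -1) return -1;
  }
  return index;
}

// Hypothetical helper: drop the character at `index`; out-of-range leaves str as is.
function removeCharAt(str, index) {
  if (index < 0 || index >= str.length) return str;
  return str.slice(0, index) + str.slice(index + 1);
}

var str = "Hello, this is a test.";
console.log(removeCharAt(str, indexOfNthSpace(str, 3))); // "Hello, this isa test."
```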
I came up with this for unknown indices
function removeNthSpace(str, n) {
var spacelessArray = str.split(' ');
return spacelessArray
.slice(0, n - 1) // left prefix part may be '', saves spaces
.concat([spacelessArray.slice(n - 1, n + 1).join('')]) // middle part: the one without the space
.concat(spacelessArray.slice(n + 1)).join(' '); // right part, saves spaces
}
Do you know which space you want to remove because of word count or chars count?
If char count, you can use Rafaels Cardoso's answer.
If word count you can split them with space and join however you want:
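var wordArray = str.split(" ");
var newStr = "";
var wordIndex = 3; // or whatever you want
for (var i = 0; i < wordArray.length; i++) {
  newStr += wordArray[i];
  // add a space after every word except the chosen one and the last one
  if (i != wordIndex && i < wordArray.length - 1) {
    newStr += ' ';
  }
}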
I think your best bet is to split the string into an array based on placement of spaces in the string, splice off the space you don't want, and rejoin the array into a string.
Check this out:
var x = "Hello, this is a test.";
var n = 3; // we want to remove the third space
var arr = x.split(/([ ])/); // copy to an array based on space placement
// arr: ["Hello,"," ","this"," ","is"," ","a"," ","test."]
arr.splice(n*2-1,1); // Remove the third space
x = arr.join("");
alert(x); // "Hello, this isa test."
Further Notes
The first thing to note is that str.replace(' ',''); will actually only replace the first instance of a space character. String.replace() also accepts a regular expression as the first parameter, which you'll want to use for more complex replacements.
To actually replace all spaces in the string, you could do str.replace(/ /g,""); and to replace all whitespace (including spaces, tabs, and newlines), you could do str.replace(/\s/g,"");
To fiddle around with different regular expressions and see what they mean, I recommend using http://www.regexr.com
A lot of the functions on the JavaScript String object that seem to take strings as parameters can also take regular expressions, including .split() and .search().
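Because replace() also accepts a function as its second argument, the nth occurrence can be removed by counting matches inside that function. A minimal sketch, assuming a 1-based n; the name removeNthMatch is hypothetical.

```javascript
// Hypothetical helper: remove only the n-th (1-based) space from str.
function removeNthMatch(str, n) {
  let count = 0;
  // The callback runs once per match; returning the match keeps it,
  // returning "" deletes it.
  return str.replace(/ /g, function (match) {
    count++;
    return count === n ? "" : match;
  });
}

console.log(removeNthMatch("Hello, this is a test.", 3)); // "Hello, this isa test."
```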

Retrieving several capturing groups recursively with RegExp

I have a string with this format:
#someID#tn#company#somethingNew#classing#somethingElse#With
There might be unlimited #-separated words, but definitely the whole string begins with #
I have written the following regexp. It matches the string, but I cannot get each #-separated word: all I get back is the last repetition of the group (plus the whole string). How can I get an array with every word as a separate element?
(?:^\#\w*)(?:(\#\w*)+) //I know I have ruled out second capturing group with ?: , though doesn't make much difference.
And here is my Javascript code:
var reg = /(?:^\#\w*)(?:(\#\w*)+)/g;
var x = null;
while(x = reg.exec("#someID#tn#company#somethingNew#classing#somethingElse#With"))
{
console.log(x);
}
And here is the result (Firebug, console):
["#someID#tn#company#somet...sing#somethingElse#With", "#With"]
0
"#someID#tn#company#somet...sing#somethingElse#With"
1
"#With"
index
0
input
"#someID#tn#company#somet...sing#somethingElse#With"
EDIT :
I want an output like this with regular expression if possible:
["#someID", "#tn", "#company", "#somethingNew", "#classing", "#somethingElse", "#With"]
NOTE that I want a RegExp solution. I know about String.split() and String operations.
You can use:
var s = '#someID#tn#company#somethingNew#classing#somethingElse#With'
if (s.substr(0, 1) == "#")
tok = s.substr(1).split('#');
//=> ["someID", "tn", "company", "somethingNew", "classing", "somethingElse", "With"]
You could try this regex also,
((?:#|#)\w+)
Explanation:
() Capturing groups. Anything inside this capturing group would be captured.
(?:) It just matches the strings but won't capture anything.
#|# Literal # or # symbol.
\w+ Followed by one or more word characters.
OR
> "#someID#tn#company#somethingNew#classing#somethingElse#With".split(/\b(?=#|#)/g);
[ '#someID',
'#tn',
'#company',
'#somethingNew',
'#classing',
'#somethingElse',
'#With' ]
It will be easier without regExp:
var str = "#someID#tn#company#somethingNew#classing#somethingElse#With";
var strSplit = str.split("#");
for(var i = 1; i < strSplit.length; i++) {
strSplit[i] = "#" + strSplit[i];
}
console.log(strSplit);
// ["#someID", "#tn", "#company", "#somethingNew", "#classing", "#somethingElse", "#With"]
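For the pure RegExp route the question asks for, one option is to drop the repeated group and let the g flag do the repetition. A capturing group that repeats only keeps its last iteration, which is why the original pattern returned just "#With". The sketch below uses a hypothetical helper name, splitTags.

```javascript
// Hypothetical helper: return every "#word" token as its own array element.
function splitTags(input) {
  // With the g flag, match() returns all matches instead of capture groups.
  return input.match(/#\w*/g) || [];
}

var tags = splitTags("#someID#tn#company#somethingNew#classing#somethingElse#With");
console.log(tags);
// ["#someID", "#tn", "#company", "#somethingNew", "#classing", "#somethingElse", "#With"]
```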

How to get the characters preceded by "add_"

I have the strings "add_dinner", "add_meeting", "add_fuel_surcharge" and I want to get the characters that are preceded by "add_" (dinner, meeting, fuel_surcharge).
[^a][^d]{2}[^_]\w+
I have tried this one, but it only works for "add_dinner"
[^add_]\w+
This one works for "add_fuel_surcharge", but takes "inner" from "add_dinner"
Help me to understand please.
Use capturing groups:
/^add_(\w+)$/
Check the returned array to see the result.
Lookbehind assertions were only added to JavaScript in ES2018 and are missing from older engines, so the portable approach is a capturing group:
var myregexp = /add_(\w+)/;
var match = myregexp.exec(subject);
if (match != null) {
result = match[1];
}
[^add_] is a character class that matches a single character except a, d or _. When applied to add_dinner, the first character it matches is i, and \w+ then matches nner.
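The difference is easy to see by running both patterns over the three strings from the question. This is only an illustration; the variable names are made up for the example.

```javascript
var inputs = ["add_dinner", "add_meeting", "add_fuel_surcharge"];

// Character class: skips ahead to the first char that is not a, d or _.
var viaCharClass = inputs.map(function (s) {
  return s.match(/[^add_]\w+/)[0];
});
console.log(viaCharClass); // ["inner", "meeting", "fuel_surcharge"]

// Capturing group: matches the literal prefix, then captures what follows.
var viaGroup = inputs.map(function (s) {
  return s.match(/^add_(\w+)$/)[1];
});
console.log(viaGroup); // ["dinner", "meeting", "fuel_surcharge"]
```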
The [^...] construct matches any single character except the ones listed. So [^add_] matches any single character other than "a", "d" or "_".
If you want to retrieve the bit after the _ you can do this:
/add_(\w+)/
Where the parentheses "capture" the part of the expression inside. So to get the actual text from a string:
var s = "add_meeting";
var result = s.match(/add_(\w+)/)[1];
This assumes the string will match such that you can directly get the second element in the returned array that will be the "meeting" part that matched (\w+).
If there's a possibility that you'll be testing a string that won't match you need to test that the result of match() is not null.
(Or, possibly easier to understand: result = "add_meeting".split("_")[1];)
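The null check mentioned above can be folded into a small helper. This is a sketch with a hypothetical name, afterAddPrefix, and it assumes the whole string must be "add_" followed by word characters. Note that the split("_")[1] shortcut only returns "fuel" for "add_fuel_surcharge", because it splits on every underscore.

```javascript
// Hypothetical helper: return the part after "add_", or null if s does not match.
function afterAddPrefix(s) {
  var match = /^add_(\w+)$/.exec(s);
  return match === null ? null : match[1];
}

console.log(afterAddPrefix("add_meeting"));            // "meeting"
console.log(afterAddPrefix("add_fuel_surcharge"));     // "fuel_surcharge"
console.log(afterAddPrefix("dinner"));                 // null
console.log("add_fuel_surcharge".split("_")[1]);       // "fuel"
```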
You can also extract the part after the first "_" with a plain JavaScript for loop:
var str = ['add_dinner', 'add_meeting', 'add_fuel_surcharge'];
var filterString = [];
for(var i = 0; i < str.length; i ++){
if(str[i].indexOf("_")>-1){
filterString.push(str[i].substring(str[i].indexOf("_") + 1, str[i].length));
}
}
alert(filterString.join(", "));
