How can I split a string into a given number of lines? - javascript

Here is my question:
Given a string, which is made up of space separated words, how can I split that into N strings of (roughly) even length, only breaking on spaces?
Here is what I've gathered from research:
I started by researching word-wrapping algorithms, because it seems to me that this is basically a word-wrapping problem. However, the majority of what I've found so far (and there is A LOT out there about word wrapping) assumes that the width of the line is a known input, and the number of lines is an output. I want the opposite.
I have found a (very) few questions, such as this that seem to be helpful. However, they are all focused on the problem as one of optimization - e.g. how can I split a sentence into a given number of lines, while minimizing the raggedness of the lines, or the wasted whitespace, or whatever, and do it in linear (or NlogN, or whatever) time. These questions seem mostly to be unanswered, as the optimization part of the problem is relatively "hard".
However, I don't care that much about optimization. As long as the lines are (in most cases) roughly even, I'm fine if the solution doesn't work in every single edge case, or can't be proven to be the least time complexity. I just need a real world solution that can take a string, and a number of lines (greater than 2), and give me back an array of strings that will usually look pretty even.
Here is what I've come up with:
I think I have a workable method for the case when N=3. I start by putting the first word on the first line, the last word on the last line, and then iteratively putting another word on the first and last lines, until my total width (measured by the length of the longest line) stops getting shorter. This usually works, but it gets tripped up if your longest words are in the middle of the line, and it doesn't seem very generalizable to more than 3 lines.
var getLongestHeaderLine = function(headerText) {
//Utility function definitions
var getLongest = function(arrayOfArrays) {
return arrayOfArrays.reduce(function(a, b) {
return a.length > b.length ? a : b;
});
};
var sumOfLengths = function(arrayOfArrays) {
return arrayOfArrays.reduce(function(a, b) {
return a + b.length + 1;
}, 0);
};
var getLongestLine = function(lines) {
return lines.reduce(function(a, b) {
return sumOfLengths(a) > sumOfLengths(b) ? a : b;
});
};
var getHeaderLength = function(lines) {
return sumOfLengths(getLongestLine(lines));
}
//first, deal with the degenerate cases
if (!headerText)
return headerText;
headerText = headerText.trim();
var headerWords = headerText.split(" ");
if (headerWords.length === 1)
return headerText;
if (headerWords.length === 2)
return getLongest(headerWords);
//If we have more than 2 words in the header,
//we need to split them into 3 lines
var firstLine = headerWords.splice(0, 1);
var lastLine = headerWords.splice(-1, 1);
var lines = [firstLine, headerWords, lastLine];
//The header length is the length of the longest
//line in the header. We will keep iterating
//until the header length stops getting shorter.
var headerLength = getHeaderLength(lines);
var lastHeaderLength = headerLength;
while (true) {
//Take the first word from the middle line,
//and add it to the first line
firstLine.push(headerWords.shift());
headerLength = getHeaderLength(lines);
if (headerLength > lastHeaderLength || headerWords.length === 0) {
//If we stopped getting shorter, undo
headerWords.unshift(firstLine.pop());
break;
}
//Take the last word from the middle line,
//and add it to the last line
lastHeaderLength = headerLength;
lastLine.unshift(headerWords.pop());
headerLength = getHeaderLength(lines);
if (headerLength > lastHeaderLength || headerWords.length === 0) {
//If we stopped getting shorter, undo
headerWords.push(lastLine.shift());
break;
}
lastHeaderLength = headerLength;
}
return getLongestLine(lines).join(" ");
};
debugger;
var header = "an apple a day keeps the doctor away";
var longestHeaderLine = getLongestHeaderLine(header);
debugger;
EDIT: I tagged javascript, because ultimately I would like a solution I can implement in that language. It's not super critical to the problem though, and I would take any solution that works.
EDIT#2: While performance is not what I'm most concerned about here, I do need to be able to perform whatever solution I come up with ~100-200 times, on strings that can be up to ~250 characters long. This would be done during a page load, so it needs to not take forever. For example, I've found that trying to offload this problem to the rendering engine by putting each string into a DIV and playing with the dimensions doesn't work, since it (seems to be) incredibly expensive to measure rendered elements.

Try this. For any reasonable N, it should do the job:
function format(srcString, lines) {
var target = "";
var arr = srcString.split(" ");
var c = 0;
var MAX = Math.ceil(srcString.length / lines);
for (var i = 0, len = arr.length; i < len; i++) {
var cur = arr[i];
if(c + cur.length > MAX) {
target += '\n' + cur;
c = cur.length;
}
else {
if(target.length > 0)
target += " ";
target += cur;
c += cur.length;
}
}
return target;
}
alert(format("this is a very very very very " +
"long and convoluted way of creating " +
"a very very very long string",7));

You may want to give this solution a try, using canvas. It is only a quick shot and will need optimization, but I think canvas might be a good idea because you can calculate real widths. You can also set the font to the one actually used, and so on. Important to note: this won't be the most performant way of doing things, because it creates a lot of canvases.
var t = `However, I don't care that much about optimization. As long as the lines are (in most cases) roughly even, I'm fine if the solution doesn't work in every single edge case, or can't be proven to be the least time complexity. I just need a real world solution that can take a string, and a number of lines (greater than 2), and give me back an array of strings that will usually look pretty even.`;
function getTextTotalWidth(text) {
var canvas = document.createElement("canvas");
var ctx = canvas.getContext("2d");
ctx.font = "12px Arial";
ctx.fillText(text,0,12);
return ctx.measureText(text).width;
}
function getLineWidth(lines, totalWidth) {
return totalWidth / lines ;
}
function getAverageLetterSize(text) {
var t = text.replace(/\s/g, "").split("");
var sum = t.map(function(d) {
return getTextTotalWidth(d);
}).reduce(function(a, b) { return a + b; });
return sum / t.length;
}
function getLines(text, numberOfLines) {
var lineWidth = getLineWidth(numberOfLines, getTextTotalWidth(text));
var letterWidth = getAverageLetterSize(text);
var t = text.split("");
return createLines(t, letterWidth, lineWidth);
}
function createLines(t, letterWidth, lineWidth) {
var i = 0;
var res = t.map(function(d) {
if (i < lineWidth || d != " ") {
i+=letterWidth;
return d;
}
i = 0;
return "<br />";
})
return res.join("");
}
var div = document.createElement("div");
div.innerHTML = getLines(t, 7);
document.body.appendChild(div);

I'm sorry this is C#. I had created my project already when you updated your post with the Javascript tag.
Since you said all you care about is roughly the same line length... I came up with this. Sorry for the simplistic approach.
private void DoIt() {
List<string> listofwords = txtbx_Input.Text.Split(' ').ToList();
int totalcharcount = 0;
int neededLineCount = int.Parse(txtbx_LineCount.Text);
foreach (string word in listofwords)
{
totalcharcount = totalcharcount + word.Count(char.IsLetter);
}
int averagecharcountneededperline = totalcharcount / neededLineCount;
List<string> output = new List<string>();
int positionsneeded = 0;
while (output.Count < neededLineCount)
{
string tempstr = string.Empty;
while (positionsneeded < listofwords.Count)
{
tempstr += " " + listofwords[positionsneeded];
if ((positionsneeded != listofwords.Count - 1) && (tempstr.Count(char.IsLetter) + listofwords[positionsneeded + 1].Count(char.IsLetter) > averagecharcountneededperline))//if (this is not the last word) and (we are going to bust the average)
{
if (output.Count + 1 == neededLineCount)//if we are writing the last line
{
//who cares about exceeding.
}
else
{
//we're going to exceed the allowed average, gotta force this loop to stop
positionsneeded++;//dont forget!
break;
}
}
positionsneeded++;//increment the needed position by one
}
output.Add(tempstr);//store the string in our list of string to output
}
//display the line on the screen
foreach (string lineoftext in output)
{
txtbx_Output.AppendText(lineoftext + Environment.NewLine);
}
}

(Adapted from here, How to partition an array of integers in a way that minimizes the maximum of the sum of each partition?)
If we consider the word lengths as a list of numbers, we can binary search the partition.
Our max length ranges from 0 to sum (word-length list) + (num words - 1), meaning the spaces. mid = (range / 2). We check if mid can be achieved by partitioning into N sets in O(m) time: traverse the list, adding (word_length + 1) to the current part while the current sum is less than or equal to mid. When the sum passes mid, start a new part. If the result includes N or less parts, mid is achievable.
If mid can be achieved, try a lower range; otherwise, a higher range. The time complexity is O(m log num_chars). (You'll also have to consider how deleting a space per part, meaning where the line break would go, features into the calculation.)
JavaScript code (adapted from http://articles.leetcode.com/the-painters-partition-problem-part-ii):
function getK(arr,maxLength) {
var total = 0,
k = 1;
for (var i=0; i<arr.length; i++) {
total += arr[i] + 1;
if (total > maxLength) {
total = arr[i];
k++;
}
}
return k;
}
function partition(arr,n) {
var lo = Math.max(...arr),
hi = arr.reduce((a,b) => a + b);
while (lo < hi) {
var mid = lo + ((hi - lo) >> 1);
var k = getK(arr,mid);
if (k <= n){
hi = mid;
} else{
lo = mid + 1;
}
}
return lo;
}
var s = "this is a very very very very "
+ "long and convoluted way of creating "
+ "a very very very long string",
n = 7;
var words = s.split(/\s+/),
maxLength = partition(words.map(x => x.length),7);
console.log('max sentence length: ' + maxLength);
console.log(words.length + ' words');
console.log(n + ' lines')
console.log('')
var i = 0;
for (var j=0; j<n; j++){
var str = '';
while (true){
if (!words[i] || str.length + words[i].length > maxLength){
break
}
str += words[i++] + ' ';
}
console.log(str);
}
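The binary search described above can also be sketched so that the joining spaces are counted the same way in the feasibility check and in the final fill, and so that the result comes back as an array of lines. This is only a sketch under my own assumptions: the names `linesNeeded` and `splitIntoLines` are hypothetical, `n` is assumed to be at least 1, line width is measured in characters rather than rendered pixels, and the result may contain fewer than `n` lines when the text is short.

```javascript
// Count the lines a greedy fill needs when no line may exceed maxWidth characters.
function linesNeeded(words, maxWidth) {
  let count = 1;
  let width = 0;
  for (const word of words) {
    // A word that joins a non-empty line costs one extra character for the space.
    const extra = width === 0 ? word.length : word.length + 1;
    if (width + extra > maxWidth) {
      count++;
      width = word.length;
    } else {
      width += extra;
    }
  }
  return count;
}

// Binary search the smallest maximum width that fits in at most n lines,
// then fill the lines greedily with that width.
function splitIntoLines(text, n) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  if (words.length === 0) return [];
  let lo = Math.max(...words.map(w => w.length)); // a line must fit the longest word
  let hi = words.join(' ').length;                // everything on a single line
  while (lo < hi) {
    const mid = lo + ((hi - lo) >> 1);
    if (linesNeeded(words, mid) <= n) {
      hi = mid;
    } else {
      lo = mid + 1;
    }
  }
  const lines = [];
  let current = '';
  for (const word of words) {
    const candidate = current ? current + ' ' + word : word;
    if (candidate.length > lo) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  lines.push(current);
  return lines;
}

console.log(splitIntoLines("an apple a day keeps the doctor away", 3));
// → ["an apple a", "day keeps the", "doctor away"]
```

Because the greedy fill packs early lines as full as possible, the last line can come out noticeably shorter than the others; that is the "raggedness" the optimizing approaches try to remove.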

Using the Java String Split() Method to split a string we will discover How and Where to Apply This String Manipulation Technique:
We'll examine the Java Split() method's explanation and discover how to apply it. The principles are explained simply and with enough programming examples, either as a separate explanation or in the comment part of the programs.
The Java String split() method, as the name implies, divides the calling String into pieces and returns them as an array. The pieces are produced by splitting around each match of the delimiter (for example "" or " ") or regular expression supplied as the argument.
Syntax
String[ ] split(String regExp)
First Case: It involves initializing a Java String variable with a variety of words separated by spaces, using the Java String Split() method, and evaluating the results. We can effectively print each word without the space using the Java Split() function.
Second Case: In this case, we initialize a Java String variable and attempt to split or deconstruct the main String variable to use the String Split() method utilizing a substring of the initialized String variable.
Third Case: In this case, we will attempt to split a String using its character by taking a String variable (a single word).
You can check out other approaches to this problem on YouTube and even coding websites on google such as Coding Ninjas

This old question was revived by a recent answer, and I think I have a simpler technique than the answers so far:
const evenSplit = (text = '', lines = 1) => {
if (lines < 2) {return [text]}
const baseIndex = Math .round (text .length / lines)
const before = text .slice (0, baseIndex) .lastIndexOf (' ')
const after = text .slice (baseIndex) .indexOf (' ') + baseIndex
const index = after - baseIndex < baseIndex - before ? after : before
return [
text .slice (0, index),
... evenSplit (text .slice (index + (before > -1 ? 1 : 0)), lines - 1)
]
}
const text = `However, I don't care that much about optimization. As long as the lines are (in most cases) roughly even, I'm fine if the solution doesn't work in every single edge case, or can't be proven to be the least time complexity. I just need a real world solution that can take a string, and a number of lines (greater than 2), and give me back an array of strings that will usually look pretty even.`
const display = (lines) => console .log (lines .join ('\n'))
display (evenSplit (text, 7))
display (evenSplit (text, 5))
display (evenSplit (text, 12))
display (evenSplit (`this should be three lines, but it has a loooooooooooooooooooooooooooooooong word`, 3))
It works by finding the first line then recurring on the remaining text with one fewer lines. The recursion bottoms out when we have a single line. To calculate the first line, we take an initial target index which is just an equal share of the string based on its length and the number of lines. We then check to find the closest space to that index, and split the string there.
It does no optimization, and could certainly be occasionally misled by long words, but mostly it just seems to work.

Related

How to make JS function faster/reduce complexity O(n)/more efficient

I am working on some challenges on HackerRank and I am having trouble making functions fast enough that they do not time out during the submit process. They usually time out for really large inputs (e.g. a string length of 1000 or more) because of the number of loops I am using to get the function working. I know the loops make the complexity O(n * n) or O(n * n * n). I understand that this complexity is why the function times out, but I am not sure how to make the function more efficient so that it can handle larger inputs. I am a self-taught coder, so please explain any answers thoroughly and simply so I can learn. Thanks!
Here is an example problem:
A string is said to be a special palindromic string if either of two conditions is met:
All of the characters are the same, e.g. aaa.
All characters except the middle one are the same, e.g. aadaa. (acdca will not satisfy this rule but aadaa will)
A special palindromic substring is any substring of a string which meets one of those criteria. Given a string, determine how many special palindromic substrings can be formed from it.
For example, given the string s = mnonopoo, we have the following special palindromic substrings:
m, n, o, n, o, p, o, o
oo
non, ono, opo
Function Description
Complete the substrCount function in the editor below. It should return an integer representing the number of special palindromic substrings that can be formed from the given string.
substrCount has the following parameter(s):
n: an integer, the length of string s
s: a string
function substrCount(n, s) {
//if each letter is its own palindrome then can start with length for count
let count = n;
//space used to get the right string slices
let space = 1;
//so we only get full strings with the split and no duplicates
let numberToCount = n;
for(let i = 0; i < n; i++){
for(let j = 0; j < n; j++){
//slice the string into the different sections for testing if palindrome
let str = s.slice(j, j+space);
if(numberToCount > 0){
//if it is an even length the all characters must be the same
if(str.length % 2 === 0){
let split = str.split('');
let matches = 0;
for(let k = 0; k < split.length; k++){
if(split[k] === split[k+1]){
matches++;
}
}
if(matches === split.length -1){
count++;
}
//if it is not even then we must check that all characters on either side
//of the middle are all the same
} else {
if(str.length > 1){
let splitMid = Math.floor(str.length / 2);
let firstHalf = str.slice(0, splitMid);
let lastHalf = str.slice(splitMid+1, str.length);
if(firstHalf === lastHalf){
if(str.length === 3){
count++;
} else {
let sNew = firstHalf + lastHalf;
let split = sNew.split('');
let matches = 0;
for(let k = 0; k < split.length; k++){
if(split[k] === split[k+1]){
matches++;
}
}
if(matches === split.length -1){
count++;
}
}
}
}
}
}
numberToCount--;
}
numberToCount = n-space;
space++;
}
return count;
}
I came up with a solution that I think is not too complex in terms of performance (one loop and one recursion at a time).
Steps:
Split the string and insert it into an array.
Check first for even pairs, using recursion.
Next, check for odd pairs, again using recursion.
Check that the values inserted into the final array are unique (duplicates are allowed only for single characters).
Please let me know if this is the correct solution or whether we can speed it up.
const stirng = "mnonopoo";
const str = stirng.split("");
let finalArray = [];
str.forEach((x, index) => {
if (str[index] === str[index + 1]) {
checkEven(str, index, 1)
}
if (str[index - 1] === str[index + 1]) {
checkOdd(str, index, 0)
}
finalArray.push(x);
})
function checkOdd(str1, index, counter) {
if (str1[index - counter] === str1[index + counter]) {
counter++;
checkOdd(str1, index, counter);
} else {
pushUnique(finalArray, str1.slice(index - counter + 1, index + counter).join(""));
return str1.slice(index - counter, index + counter).join("")
}
}
function checkEven(str1, index, counter) {
if (str1[index] === str1[index + counter]) {
counter++;
checkEven(str1, index, counter);
} else {
pushUnique(finalArray, str1.slice(index, index + counter).join(""));
return;
}
}
function pushUnique(array, value) {
if (array.indexOf(value) === -1) {
array.push(value);
}
}
console.log(finalArray)
Since you're only looking for special palindromes, and not all palindromes, reducing complexity is a bit easier, but even then there will be some special cases, like "abababababababa....". I can see no way to reduce the complexity of that one very far.
I'd approach this like so. Start by grouping all the repeated letters into runs. I'm not sure of the best way to do that, but I'm thinking maybe create an array of objects, with object properties of count and letter.
Start with your totalCount at 0.
Then, select all objects with a count of 1, and check the objects to the left and right of them. If they have the same letter value, take the MIN of the two counts (the palindrome has to be symmetric, so the shorter side limits it), and add that value + 1 to your totalCount (the +1 being to account for the single letter by itself). If the letter values on either side do not match, just add 1 (to account for the letter by itself).
That handles all the odd number palindromes. Now to handle the even number palindromes.
Select all the objects with a count > 1, and take their count value and add the series from 1-count to the totalCount. The formula for this is (count/2)*(1+count).
Example:
In the string you have a run of 4 A's. There are the special palindromes (a, a, a, a, aa, aa, aa, aaa, aaa, aaaa) in that, for a total of 10. (4/2)*(1+4)=10.
I don't know how much that will reduce your processing time, but I believe it should reduce it some.

Optimize recursive string manipulation function with JavaScript

Problem
I was given this problem in my Algorithms class today:
Given function maxSubstring(s, t), where s is a string and t is a substring of s, find the maximum number of times you can delete either the first or the last occurrence of substring t.
Concept
Here is a visualization of the function maxSubstring called on s = banababbaa and t = ba.
b a n a b b a a
1st move: n a b a b b a or b a n a b a b a
2nd move: n a b b a a or n a b a b a n a b a b a or b a n a b a
3rd move: n a b a or n a b a n a b a or n a b a
4th move: n a n a
Thus, this operation takes four moves.
Attempt
Here is my solution to the problem. It works, but it is very slow when I use larger strings as arguments.
Attempt #1
function maxSubstring(s, t) {
if (s.includes(t)) {
var idxSubstr = s.replace(t, '');
var lastIdxSubstr = s.substr(0, s.lastIndexOf(t)) + s.substr(s.lastIndexOf(t) + t.length, s.length);
return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t));
}
return 0;
}
Attempt #2
function maxSubstring(s, t) {
if (s.includes(t)) {
var idx = s.indexOf(t), lastIdx = s.lastIndexOf(t);
var idxSubstr = s.substr(0, idx) + s.substr(idx + t.length, s.length);
var lastIdxSubstr = s.substr(0, lastIdx) + s.substr(lastIdx + t.length, s.length);
if (idx != lastIdx) {
return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t));
} else {
return 1 + maxSubstring(idxSubstr, t);
}
}
return 0;
}
Reason for update: Minor change in efficiency by storing values of indexOf and lastIndexOf in variables.
Attempt #3
function maxSubstring(s, t) {
var idx = s.indexOf(t);
if (idx >= 0) {
var lastIdx = s.lastIndexOf(t);
var idxSubstr = s.substr(0, idx) + s.substr(idx + t.length);
if (idx != lastIdx) {
var lastIdxSubstr = s.substr(0, lastIdx) + s.substr(lastIdx + t.length);
return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t));
} else {
return 1 + maxSubstring(idxSubstr, t);
}
}
return 0;
}
Reason for update: Reduced instances in which certain values were redefined and prevented lastIndexOf calculation before the first index is checked.
Answer Requirements
Is there any algorithm or method I may use to optimize this code? Math.max is the main culprit, so I would appreciate it if anyone has an idea on how to avoid using this method altogether.
In other words, maxSubstring should only be called once inside of itself, but Math.max requires that it be called twice (once for the first index of a substring and another time for the last index of that substring).
Lastly, do you mind telling me what the Big O Notation is for my solution and what the Big O Notation is for yours? This is not part of the original challenge, but I am curious, myself. Thanks in advance.
The major problem with the naive recursive algorithm that you have presented is that it is called very often on the same input s - exponentially often even, and exactly that is what causes the noticeable slowdown on larger strings.
What you can do against this is to use memoisation - remember the result for a specific input in a lookup table.
Another optimisation you can do is to check whether deleting the first vs the last occurrence leads to different results at all. In most cases it will not matter in which sequence you remove them; the number of possible removals is always the same. However, that is not the case when the matched substring can overlap with itself. As an example, try maxSubstring('ababaa', 'aba').
function maxSubstring(s, t, prevResults = new Map()) {
function result(x) { prevResults.set(s, x); return x; }
if (prevResults.has(s))
return prevResults.get(s); // memoisation
const first = s.indexOf(t);
if (first == -1)
return result(0);
const withoutFirst = s.slice(0, first) + s.slice(first + t.length);
const last = s.lastIndexOf(t);
if (last == first) // only one match
return result(1 + maxSubstring(withoutFirst, t, prevResults));
if (t.lastIndexOf(t.charAt(t.length-1), t.length-1) == -1 // last character of t is found nowhere else in t
|| !t.includes(s.charAt(first+t.length))) // character after the match can never be part of a match
// so this match is always removed in the optimal sequence and it doesn't matter whether as first or last
return result(1 + maxSubstring(withoutFirst, t, prevResults));
const withoutLast = s.slice(0, last) + s.slice(last + t.length);
if (t.indexOf(t.charAt(0), 1) == -1 // first character of t is found nowhere else in t
|| !t.includes(s.charAt(last - 1))) // character before the match can never be part of a match
// so this match is always removed and it doesn't matter when
return result(1 + maxSubstring(withoutLast, t, prevResults));
return result(1 + Math.max(maxSubstring(withoutFirst, t, prevResults),
maxSubstring(withoutLast, t, prevResults)));
}
Time Complexity Analysis
The number of recursive calls should be roughly quadratic in the number of removals. With my second suggestion, it might get down to linear in the best cases (depending on the patterns).
For every call, factor in the linear searches (indexOf, slice, etc.) and the Map lookup, though their average cost will be less than that, as the input gets smaller and the pattern is often found early in the input. In any case, the complexity is polynomial, not exponential.
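For comparison, the memoisation idea on its own, without the extra pruning checks, can be sketched as below. The name `maxRemovals` is hypothetical, and returning 0 for an empty pattern is my own assumption to keep the recursion finite. It tries both the first and the last occurrence, but each distinct intermediate string is solved only once.

```javascript
function maxRemovals(s, t, memo = new Map()) {
  if (t === '') return 0;              // assumption: an empty pattern removes nothing
  if (memo.has(s)) return memo.get(s); // already solved for this string

  let best = 0;
  const first = s.indexOf(t);
  if (first !== -1) {
    const withoutFirst = s.slice(0, first) + s.slice(first + t.length);
    best = 1 + maxRemovals(withoutFirst, t, memo);

    const last = s.lastIndexOf(t);
    if (last !== first) {
      // Only branch when the first and last occurrences really differ.
      const withoutLast = s.slice(0, last) + s.slice(last + t.length);
      best = Math.max(best, 1 + maxRemovals(withoutLast, t, memo));
    }
  }

  memo.set(s, best);
  return best;
}

console.log(maxRemovals('banababbaa', 'ba')); // → 4
console.log(maxRemovals('ababaa', 'aba'));    // → 2
```

In the second call, removing the first occurrence leaves "baa" (one removal in total), while removing the last leaves "aba" (two in total), which is why both branches still have to be explored when the pattern can overlap with itself.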

How can I use a Regular Expression to replace everything except specific words in a string with Javascript

Imagine you have a string like this: "This is a sentence with words."
I have an array of words like $wordList = ["sentence", "words"];
I want to highlight words that aren't on the list. Which means I need to find and replace everything else and I can't seem to crack how to do that (if it's possible) with RegEx.
If I want to match the words I can do something like:
text = text.replace(/(sentence|words)\b/g, '<mark>$&</mark>');
(which will wrap the matching words in "mark" tags and, assuming I have some css for <mark>, highlight them) which works perfectly. But I need the opposite! I need it to basically select the entire string and then exclude the words listed. I've tried /^((?!sentence|words)*)*$/gm but this gives me a strange infinity issue because I think it's too open ended.
Taking that original sentence, what I would hope to end up with is "<mark> This is a </mark> sentence <mark> with some </mark> words."
Basically wrapping (via replace) everything except the words listed.
The closest I can seem to get is something like /^(?!sentence|words).*\b/igm which will successfully do it if a line starts with one of the words (ignoring that entire line).
So to summarize: 1) Take a string 2) take a list of words 3) replace everything in the string except the list of words.
Possible? (jQuery is loaded for something else already, so raw JS or jQuery are both acceptable).
Create the regex from the word list.
Then do a string replace with the regex.
(It's a tricky regex)
var wordList = ["sentence", "words"];
// join the array into a string using '|'.
var str = wordList.join('|');
// finalize the string with a negative assertion
str = '\\W*(?:\\b(?!(?:' + str + ')\\b)\\w+\\W*|\\W+)+';
//create a regex from the string
var Rx = new RegExp( str, 'g' );
console.log( Rx );
var text = "%%%555This is a sentence with words, but not sentences ?!??!!...";
text = text.replace( Rx, '<mark>$&</mark>');
console.log( text );
Output
/\W*(?:\b(?!(?:sentence|words)\b)\w+\W*|\W+)+/g
<mark>%%%555This is a </mark>sentence<mark> with </mark>words<mark>, but not sentences ?!??!!...</mark>
Addendum
The regex above assumes the word list contains only word characters.
If that's not the case, you must match the words to advance the match position
past them. This is easily accomplished with a simplified regex and a callback function.
var wordList = ["sentence", "words", "won't"];
// join the array into a string using '|'.
var str = wordList.join('|');
str = '([\\S\\s]*?)(\\b(?:' + str + ')\\b|$)';
//create a regex from the string
var Rx = new RegExp( str, 'g' );
console.log( Rx );
var text = "%%%555This is a sentence with words, but won't be sentences ?!??!!...";
// Use a callback to insert the 'mark'
text = text.replace(
Rx,
function(match, p1,p2)
{
var retStr = '';
if ( p1.length > 0 )
retStr = '<mark>' + p1 + '</mark>';
return retStr + p2;
}
);
console.log( text );
Output
/([\S\s]*?)(\b(?:sentence|words|won't)\b|$)/g
<mark>%%%555This is a </mark>sentence<mark> with </mark>words<mark>, but
</mark>won't<mark> be sentences ?!??!!...</mark>
You could still perform the replacement on the positive matches, but reverse the closing/opening tag, and add an opening tag at the start and a closing one at the end of the string. I use here your regular expression which could be anything you want, so I'll assume it matches correctly what needs to be matched:
var text = "This is a sentence with words.";
text = "<mark>" + text.replace(/\b(sentence|words)\b/g, '</mark>$&<mark>') + "</mark>";
// If empty tags bother you, you can add:
text = text.replace(/<mark><\/mark>/g, "");
console.log(text);
Time Complexity
In comments below someone makes a point that the second replacement (which is optional) is a waste of time. But it has linear time complexity as is illustrated in the following snippet which charts the duration for increasing string sizes.
The X axis represents the number of characters in the input string, and the Y-axis represents the number of milliseconds it takes to execute the replacement with /<mark><\/mark>/g on such input string:
// Reserve memory for the longest string
const s = '<mark></mark>' + '<mark>x</mark>'.repeat(2000),
regex = /<mark><\/mark>/g,
millisecs = {};
// Collect timings for several string sizes:
for (let size = 100; size < 25000; size+=100) {
millisecs[size] = test(15, 8, _ => s.substr(0, size).replace(regex, ''));
}
// Show results in a chart:
chartFunction(canvas, millisecs, "len", "ms");
// Utilities
function test(countPerRun, runs, f) {
let fastest = Infinity;
for (let run = 0; run < runs; run++) {
const started = performance.now();
for (let i = 0; i < countPerRun; i++) f();
// Keep the duration of the fastest run:
fastest = Math.min(fastest, (performance.now() - started) / countPerRun);
}
return fastest;
}
function chartFunction(canvas, y, labelX, labelY) {
const ctx = canvas.getContext('2d'),
axisPix = [40, 20],
largeY = Object.values(y).sort( (a, b) => b - a )[
Math.floor(Object.keys(y).length / 10)
] * 1.3, // add 30% to value at the 90th percentile
max = [+Object.keys(y).pop(), largeY],
coeff = [(canvas.width-axisPix[0]) / max[0], (canvas.height-axisPix[1]) / max[1]],
textAlignPix = [-8, -13];
ctx.translate(axisPix[0], canvas.height-axisPix[1]);
text(labelY + "/" + labelX, [-5, -13], [1, 1], false, 2);
// Draw axis lines
for (let dim = 0; dim < 2; dim++) {
const c = coeff[dim], world = [c, 1];
let interval = 10**Math.floor(Math.log10(60 / c));
while (interval * c < 30) interval *= 2;
if (interval * c > 60) interval /= 2;
let decimals = ((interval+'').split('.')[1] || '').length;
line([[0, 0], [max[dim], 0]], world, dim);
for (let x = 0; x <= max[dim]; x += interval) {
line([[x, 0], [x, -5]], world, dim);
text(x.toFixed(decimals), [x, textAlignPix[1-dim]], world, dim, dim+1);
}
}
// Draw function
line(Object.entries(y), coeff);
function translate(coordinates, world, swap) {
return coordinates.map( p => {
p = [p[0] * world[0], p[1] * world[1]];
return swap ? p.reverse() : p;
});
}
function line(coordinates, world, swap) {
coordinates = translate(coordinates, world, swap);
ctx.beginPath();
ctx.moveTo(coordinates[0][0], -coordinates[0][1]);
for (const [x, y] of coordinates.slice(1)) ctx.lineTo(x, -y);
ctx.stroke();
}
function text(s, p, world, swap, align) { // align: 0=left,1=center,2=right
const [[x, y]] = translate([p], world, swap);
ctx.font = '9px courier';
ctx.fillText(s, x - 2.5*align*s.length, 2.5-y);
}
}
<canvas id="canvas" width="600" height="200"></canvas>
For each string size (which is incremented with steps of 100 characters), the time to run the regex 15 times is measured. This measurement is repeated 8 times and the duration of the fastest run is reported in the graph. On my PC the regex runs in 25µs on a string with 25 000 characters (consisting of <mark> tags). So not something to worry about ;-)
You may see some spikes in the chart (due to browser and OS interference), but the overall tendency is linear.
Given that the main regex has linear time complexity, the overall time complexity is not negatively affected by it.
However that optional part can be performed without regular expression as follows:
if (text.substr(6, 7) === '</mark>') text = text.substr(13);
if (text.substr(-13, 6) === '<mark>') text = text.substr(0, text.length-13);
Due to how JavaScript engines deal with strings (immutable), this longer code runs in constant time.
Of course, it does not change the overall time complexity, which remains linear.
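For readability, the same two checks could be wrapped in a small function. This is only a sketch: the name `stripEmptyMarks` is invented here, and it assumes the optional step in question is the removal of an empty `<mark></mark>` pair at the very start or end of the marked-up text, which is what the two `substr` tests above look for.

```javascript
// Hypothetical helper: drop an empty <mark></mark> pair at either end.
// '<mark>' is 6 characters and '</mark>' is 7, so an empty pair is 13.
function stripEmptyMarks(text) {
  const EMPTY = '<mark></mark>';
  if (text.startsWith(EMPTY)) text = text.slice(EMPTY.length);
  if (text.endsWith(EMPTY)) text = text.slice(0, -EMPTY.length);
  return text;
}

console.log(stripEmptyMarks('<mark></mark>sentence<mark> with </mark>words<mark></mark>'));
// sentence<mark> with </mark>words
```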
I'm not sure if this will work for every case, but for the given string it does.
let s1 = "This is a sentence with words.";
let wordList = ["sentence", "words"];
let reg = new RegExp("([\\s\\S]*?)(" + wordList.join("|") + ")", "g");
console.log(s1.replace(reg, "<mark>$1</mark>$2"))
Do it the opposite way: Mark everything and unmark the matched words you have.
text = `<mark>${text.replace(/\b(sentence|words)\b/g, '</mark>$&<mark>')}</mark>`;
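If the word list is not hard-coded, the same "mark everything, then unmark the matches" idea can be generalised along these lines. The function name `markAllExcept` is hypothetical, and the sketch assumes the words should be matched on word boundaries as in the one-liner above; regex metacharacters in the words are escaped so they are treated literally.

```javascript
// Hypothetical helper: wrap everything in <mark> except the listed words.
function markAllExcept(text, wordList) {
  // With no words to exclude, the whole text is marked
  if (wordList.length === 0) return '<mark>' + text + '</mark>';
  // Escape regex metacharacters so each word matches literally
  const escaped = wordList.map(w => w.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  const re = new RegExp('\\b(' + escaped.join('|') + ')\\b', 'g');
  // Close the mark before each match and reopen it right after
  return '<mark>' + text.replace(re, '</mark>$&<mark>') + '</mark>';
}

console.log(markAllExcept("This is a sentence with words.", ["sentence", "words"]));
// <mark>This is a </mark>sentence<mark> with </mark>words<mark>.</mark>
```

When the text starts or ends with one of the listed words, this produces an empty `<mark></mark>` pair at that end, which can be stripped afterwards if it matters.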
A negated regex is possible but inefficient for this. In fact, a regex is not the right tool here. The viable method is to go through the string and manually construct the resulting string:
//var text = "This is a sentence with words.";
//var wordlist = ["sentence", "words"];
var result = "";
var marked = false;
var nextIndex = 0;
while (nextIndex != -1) {
var endIndex = text.indexOf(" ", nextIndex + 1);
var substring = text.slice(nextIndex, endIndex == -1 ? text.length : endIndex);
var contains = wordlist.some(word => substring.includes(word));
if (!contains && !marked) {
result += "<mark>";
marked = true;
}
if (contains && marked) {
result += "</mark>";
marked = false;
}
result += substring;
nextIndex = endIndex;
}
if (marked) {
result += "</mark>";
}
text = result;

How can I prepend characters to a string using loops?

I have an input field that expects a 10-digit number. If the user enters and submits a number with fewer than 10 digits, the function should simply add "0"s until the entered value is 10 digits long.
I haven't really used recursive functions, nor do I understand how they work, but I'm basically looking for an efficient way of doing this. One minor issue I'm having is figuring out how to prepend the "0"s to the beginning of the string rather than appending them to the end.
My thinking:
function lengthCheck(sQuery) {
for (var i = 0; i < sQuery.length; i++) {
if (sQuery.length !== 10) {
sQuery += "0";
//I'd like to add the 0s to the beginning of the sQuery string.
console.log(sQuery);
lengthCheck(sQuery);
} else return sQuery
}
}
Change:
sQuery += "0"; // added at end of string
to:
sQuery = "0" + sQuery; // added at start of string
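Note that the function in the question also ignores the value returned by its recursive call, so changing the concatenation alone will not make it return the padded string. If recursion is what you want to practise, a minimal sketch could look like this (the name `padRecursive` and the `length` parameter are made up for illustration, and the input is assumed to be a string):

```javascript
// Hypothetical recursive version: prepend one "0" per call until long enough.
function padRecursive(sQuery, length = 10) {
  // Base case: nothing left to pad
  if (sQuery.length >= length) return sQuery;
  // Recursive case: pass the longer string on and return its result
  return padRecursive("0" + sQuery, length);
}

console.log(padRecursive("123")); // 0000000123
```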
To remove the for loop/recursion, you could slice out the desired length in one step:
function padZeros(sQuery) {
// the max amount of zeros you want to lead with
const maxLengthZeros = "0000000000";
// takes the 10 rightmost characters and outputs them in a new string
return (maxLengthZeros + sQuery).slice(-10);
}
Simple generic function using ES6 repeat:
// edge case constraints not implemented for brevity
function padZeros(sQuery = "", maxPadding = 10, outputLength = 10) {
// the max amount of zeros you want to lead with
const maxLengthZeros = "0".repeat(maxPadding);
// returns the "outputLength" rightmost characters
return (maxLengthZeros + sQuery).slice(-outputLength);
}
console.log('padZeros: ' + padZeros("1234567890"));
console.log('padZeros: ' + padZeros("123"));
console.log('padZeros: ' + padZeros(""));
Alternate version that doesn't affect strings over your set limit:
function padZerosIfShort(inputString = "", paddedOutputLength = 10) {
let inputLen = inputString.length;
// only padded if under set length, otherwise returned untouched
return (paddedOutputLength > inputLen)
? "0".repeat(paddedOutputLength - inputLen) + inputString
: inputString;
}
console.log('padZerosIfShort: ' + padZerosIfShort("1234567890", 5));
console.log('padZerosIfShort: ' + padZerosIfShort("123", 5));
console.log('padZerosIfShort: ' + padZerosIfShort("", 5));
It will ultimately depend on your needs how you want to implement this behavior.
The += operator appends to the end of a string, which is equivalent to:
sQuery = sQuery + "0";
You can add characters to the front of a string like this:
sQuery = "0" + sQuery;
I also found something interesting here. It works like this:
("00000" + sQuery).slice(-5)
You add zeros to the front, then slice off everything except the last 5 characters. So, to get 10 characters, you would use:
("0000000000" + sQuery).slice(-10)
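In environments that support ES2017, the built-in `String.prototype.padStart` does the prepending for you. A short sketch (the wrapper name `padToTen` is just for illustration):

```javascript
// Hypothetical wrapper around the built-in padStart.
// String(value) lets the function accept numbers as well as strings.
function padToTen(value) {
  return String(value).padStart(10, "0");
}

console.log(padToTen("123"));         // 0000000123
console.log(padToTen(42));            // 0000000042
console.log(padToTen("12345678901")); // 12345678901
```

Unlike the slice approach, `padStart` never truncates: input that is already longer than 10 characters is returned unchanged.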
You don't need recursion to solve this, just a simple for loop should do the trick. Try this:
function lengthCheck (sQuery) {
for (var i = sQuery.length; i<10; i++) {
sQuery = "0" + sQuery;
}
return sQuery;
}
You're looking to pad the string with zeroes. This is an example I've used before from here and will shorten your code a little bit:
function lengthCheck (sQuery) {
while (sQuery.length < 10)
sQuery = "0" + sQuery; // string "0" makes the concatenation explicit
return sQuery;
}
I believe this has already been answered here (or similar enough to provide you the solution): How to output integers with leading zeros in JavaScript

How to prepend two chars at the beginning of an Int16Array?

For app-specific reasons, I need to prepend exactly these two chars, 'a,' (the letter a and one comma), at the beginning of an existing Int16Array.
At the moment I am trying this code, but it does not seem to work correctly:
function convertFloat32ToInt16(buffer) {
var prefix = 'a,',
prefixLength = prefix.length / 2, // divided by 2 because we deal with 16 bits, not 8 bits
bufferLength = buffer.length,
totalLength = prefixLength + bufferLength,
arr = new Int16Array(totalLength),
i
for (i = 0; i < prefixLength; i = i + 2) {
arr[i] = prefix.charCodeAt(i) + prefix.charCodeAt(i + 1)
}
for (i = prefixLength; i < totalLength; i++) {
arr[i] = Math.min(1, buffer[i - prefixLength]) * 0x7FFF // convert to 16 bit
}
return arr.buffer
}
Any suggestions how I can do it better and fix the above code?
Many thanks!
Why use an Int16Array if you need to store random characters in it? You're asking for problems doing this.
Why not just use a regular array? Replace your definition of arr with arr = [], and replace references to buffer with arr
In any case, you'll need to use a different data structure if you want to store random characters. You could always make your return line something like this:
return {buffer: arr.buffer, type: prefix}
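If the prefix really has to travel in the same binary buffer as the samples, another option is to write it at the byte level with a `DataView` instead of squeezing characters into Int16 slots. This is only a sketch under several assumptions that the question does not confirm: the receiver expects the prefix as one byte per ASCII character, followed by little-endian 16-bit samples, and the input values are floats in the range -1 to 1. The name `prefixedInt16Buffer` is made up.

```javascript
// Hypothetical helper: ASCII prefix bytes followed by 16-bit samples.
function prefixedInt16Buffer(samples, prefix = 'a,') {
  const out = new ArrayBuffer(prefix.length + samples.length * 2);
  const view = new DataView(out);
  // One byte per prefix character (assumes an ASCII-only prefix)
  for (let i = 0; i < prefix.length; i++) {
    view.setUint8(i, prefix.charCodeAt(i));
  }
  // Clamp each float to [-1, 1], scale to 16 bits, write little-endian (assumption)
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(prefix.length + i * 2, Math.round(clamped * 0x7FFF), true);
  }
  return out;
}

const buf = prefixedInt16Buffer([0, 1, -1]);
console.log(buf.byteLength); // 8 (2 prefix bytes + 3 samples of 2 bytes each)
```

A `DataView` is used because an odd-length prefix would leave the samples at a byte offset that an `Int16Array` view cannot start from.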
