Replacing characters: how is a character in a webpage identified in JavaScript? - javascript

I'm looking to replace all characters that appear in a webpage with another character, for example replace 'a' with 'A'. Except for one caveat which I will further explain, I currently have this working well with the following code:
function replaceTextOnPage(){
getAllTextNodes().forEach(function(node){
let map = new Map()
map.set('さ', ['さ', 'サ'])
node.nodeValue = node.nodeValue.replace(new RegExp(quote(map.get('さ')[0]), 'g'), map.get('さ')[1]);
});
function getAllTextNodes(){
var result = [];
(function scanSubTree(node){
if(node.childNodes.length)
for(var i = 0; i < node.childNodes.length; i++)
scanSubTree(node.childNodes[i]);
else if(node.nodeType == Node.TEXT_NODE)
result.push(node);
})(document);
return result;
}
function quote(str){
return (str+'').replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
}
}
Now if we take a look at the upper portion, the second function
getAllTextNodes().forEach(function(node){
let map = new Map()
map.set('a', ['a', 'A'])
node.nodeValue = node.nodeValue.replace(new RegExp(quote(map.get('a')[0]), 'g'), map.get('a')[1]);
});
I use a map (for efficiency purposes if using this for replacements of many different characters). The way the code is written here works as I want - effectively replaces all 'a' with 'A'. map.get('a')[0] gets the value associated with 'a', which is an array at the 0 index, or 'a'. This is the value to be replaced. map.get('a')[1] then gets the value at the 1 index of the array, or 'A'. The value to replace with.
My question is making this process "generic" so to speak. Ultimately I will be swapping all values of around 60 different characters (for Japanese, but can be thought of as me swapping every distinct lower case with upper-case and vice-versa). I was thinking, as each character (current key) is being traversed over, that key's respective map value would replace the key. Doing this iteratively for each of the getAllTextNodes with O(1) map lookup time.
I essentially need a way to refer to whatever character is currently being iterated over. I have tried map.get(node.nodeValue) as well as map.get(node.textContent), but neither worked.
Any help is greatly appreciated!!

If you are willing to use jQuery, you could use the technique from this Stack Overflow post to get all of the text nodes on the page and then manipulate the text using the .map function.
Next, you could use an object as a dictionary to declare each change you want to take place.
For example:
var dict = {"a":"A", "b":"B"}
I have added some code to illustrate what I mean :)
This may not be the most performant strategy if you have many translations or many individual blocks of text, because it loops through every item in the dictionary for each block of text. It does work, though, and it is very easy to add more items to the dictionary if need be.
jQuery.fn.textNodes = function() {
return this.contents().filter(function() {
return (this.nodeType === Node.TEXT_NODE && this.nodeValue.trim() !== "");
});
}
var dict = {"a":"A", "b":"B", "t":"T"};
function changeToUppercase()
{
//Manipulate the output how you see fit.
$('*').textNodes().toArray().map(obj => obj.replaceWith(manipulateText($(obj).text())));
}
function manipulateText(inputText)
{
var keys = Object.keys(dict);
var returnText = inputText;
for(var i = 0; i < keys.length; i++)
{
var key = keys[i];
var value = dict[key];
returnText = returnText.split(key).join(value); // replace every occurrence, not just the first
}
return returnText;
}
changeToUppercase();
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<p>test 1</p>
<p>test 2</p>
<p>test 3</p>
<p>test 4</p>
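As for the original question of how to get at "the current character": node.nodeValue is the whole string of a text node, so map.get(node.nodeValue) looks up the entire text rather than one character. A minimal sketch of a single-pass alternative follows, assuming a table that maps each character directly to its replacement. The names swapCharacters and kanaTable are made up for illustration and are not from the question.

```javascript
// Hypothetical helper: swaps every character that has an entry in `table`.
// Each character is looked up exactly once, so a pair such as
// a -> A and A -> a does not undo itself the way chained replace() calls would.
function swapCharacters(text, table) {
  // Array.from iterates by code point, so characters outside the BMP stay intact.
  return Array.from(text, function (ch) {
    return table.has(ch) ? table.get(ch) : ch;
  }).join('');
}

// Example table (an assumption): key is the character to find, value its replacement.
var kanaTable = new Map([
  ['さ', 'サ'],
  ['サ', 'さ'],
  ['a', 'A'],
  ['A', 'a']
]);

console.log(swapCharacters('さくら and サクラ', kanaTable));
```

On a page, this could be applied inside the forEach from the question, for example node.nodeValue = swapCharacters(node.nodeValue, kanaTable), with the Map built once outside the loop.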

Related

Javascript Get Multiple Substrings Within One String

How can I get multiple occurrences of words in a string? I've tried multiple functions but with no success. I know I can get back the first true value using the some() method, as below.
var keyword_array = ["Trap","Samples","WAV","MIDI","Loops"];
function validateContentKeywords(content,keyword){
return keyword.some(function(currentValue,index){
console.log(currentValue + " ");
return content.indexOf(currentValue) >= 0;
});
}
// Outputs --> Trap Samples
if(validateContentKeywords("Beat Loops WAV Trap Samples Dog Cat MIDI",keyword_array)){
console.log("Matches");
}
// What I Want is --> Trap,Samples,MIDI,Loops
The above function only outputs 2 occurrences, and I want it to output all of the matching values at the same time, such as --> Trap,Samples,MIDI,Loops.
Is there a way to get multiple occurrences of words in a string at the same time?
UPDATED:: The solution that helped me out is below
function Matches(value){
return "Beat Loops WAV Trap Samples Dog Cat MIDI".indexOf(value) !== -1;
}
var keyword_array = ["Trap","Samples","WAV","MIDI","Loops"].filter(Matches);
document.write(keyword_array);
You seem to be looking for the filter Array method, which returns an array of the matched elements instead of a boolean value indicating whether some matched.
var keyword_array = ["Trap", "Samples", "WAV", "MIDI", "Loops"];
function validateContentKeywords(content, keyword) {
var words = content.split(' '); //split the given string
for (var i = 0; i < words.length; i++) {
if (keyword.indexOf(words[i]) > -1) { //check if actually iterated word from string is in the provided keyword array
document.write(words[i] + " "); //if it is, write it to the document
};
}
}
validateContentKeywords("Beat Loops WAV Trap Samples Dog Cat MIDI", keyword_array);
The easiest way to do it would be this:
keyword_array.filter(keyword => searchString.includes(keyword));
You can learn more about filter here. I highly recommend learning about how to use map, reduce and filter. They are your best friends.
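To make the filter approach concrete, here is a small sketch wrapped in a function. The name findKeywords and the extra "Piano" keyword are assumptions added for illustration. Note that it matches substrings, so a keyword like "Cat" would also match inside "Category".

```javascript
// Hypothetical helper: returns every keyword that occurs somewhere in `content`.
function findKeywords(content, keywords) {
  return keywords.filter(function (keyword) {
    // indexOf returns -1 when the keyword is absent.
    return content.indexOf(keyword) !== -1;
  });
}

var foundKeywords = findKeywords(
  "Beat Loops WAV Trap Samples Dog Cat MIDI",
  ["Trap", "Samples", "WAV", "MIDI", "Loops", "Piano"]
);
console.log(foundKeywords.join(","));
```

The result keeps the order of the keyword array, not the order in which the words appear in the string.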

Removing array elements that contain a number

I have seen several answers on Stackoverflow but none have helped me. I have a huge array of nearly 100,000 words, of which I am trying to remove all words that contain a number. I am using the following to do that:
for(var i = 0; i < words.length; i++){
if (hasNumbers(words[i])) {
words.splice(i, 1);
}
}
function hasNumbers(t)
{ return /\d/.test(t); }
It seems to work, but not all the time because I am still getting words that contain numbers. What can I change to make this remove all words that contain any number at all?
(I am using p5.js with my js)
That is because when you delete a word at index i, the next word will have index i, yet you still increase i, thereby skipping a word which you never inspect.
To solve this you can go backwards through your array:
for(var i = words.length - 1; i >= 0; i--){
if (hasNumbers(words[i])) words.splice(i, 1); // removing here never shifts an unvisited element
}
Here is a shorter way to remove words with digits:
words = words.filter(a => !hasNumbers(a));
Finally, you really should call your second function hasDigits instead of hasNumbers. The words "digit" and "number" have a slightly different meaning.
Here is a snippet, using ES6 syntax, that defines the opposite function hasNoDigits and applies it to some sample data:
let words = ['abcd', 'ab0d', '4444', '-)#', '&9µ*'];
let hasNoDigits = s => /^\D*$/.test(s);
console.log(words.filter(hasNoDigits));
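The skipping effect is easy to see on a tiny array. The following sketch compares the forward loop from the question with the backward loop; the function names and the sample data are made up for illustration, and both functions work on a copy so the input is left alone.

```javascript
function hasDigits(s) {
  return /\d/.test(s);
}

// Forward loop as in the question: after each splice the next element
// slides into index i and is skipped by the following i++.
function removeForward(list) {
  var copy = list.slice();
  for (var i = 0; i < copy.length; i++) {
    if (hasDigits(copy[i])) copy.splice(i, 1);
  }
  return copy;
}

// Backward loop: elements that shift after a splice were already inspected.
function removeBackward(list) {
  var copy = list.slice();
  for (var i = copy.length - 1; i >= 0; i--) {
    if (hasDigits(copy[i])) copy.splice(i, 1);
  }
  return copy;
}

var sampleWords = ['a1', 'b2', 'c', 'd3'];
console.log(removeForward(sampleWords));  // 'b2' survives because it was skipped
console.log(removeBackward(sampleWords)); // only 'c' remains
```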
I had started writing this and then trincot answered. His answer is correct, though with the popular and widespread usage of ES5 array functions, I feel like you could simplify this down quite a bit.
window.addEventListener('load', function() {
var data = [
'w3.org',
'google.com',
'00011118.com'
]; //This is supposed to be your data, I didn't have it so I made it up.
var no_nums = data.filter(function(item) {
//Tests each string against the regex, inverts the value (false becomes true, true becomes false)
return !/\d/.test(item);
});
var results = document.getElementById('results');
no_nums.forEach(function(item) {
results.innerHTML += item + '<br />';
//Loops through each of our new array to add the item so we can see it.
});
});
<div id="results">
</div>

How to identify pairs in a string

Suppose I have the string : "((a,(b,c)),(d,(e,(f,g))))"
How would I go about extracting each pair separately such as splitting the first pair and extracting (a,(b,c)) and (d,(e,(f,g))).
I am kind of lost as to how I should approach this. Since the pairs can vary as the example I can't exactly look for a set pattern.
I believe an approach to this would be identifying where the "," is in the outer most parentheses. such as finding it in ( (set of pairs 1) , (set of pairs 2)).
so I can then take everything to the left and right of it. But I do not know how to do this. Using str.indexOf() will find the first occurrence of ",", which is not the one I am interested in.
I would loop through the string's characters keeping track of how nested the parentheses are, to find the first comma that isn't nested, and then (as you said) take the parts to the left and right of that:
function getPairs(input) {
// remove outer parentheses, if present
if (input[0] === "(")
input = input.slice(1,-1);
// find first comma that isn't inside parentheses
var parenNestLevel = 0;
for (var i = 0; i < input.length; i++) {
if (parenNestLevel === 0 && input[i] === ",")
return [input.slice(0, i), input.slice(i+1)];
else if (input[i] === "(")
parenNestLevel++;
else if (input[i] === ")")
parenNestLevel--;
}
// note: returns undefined if the input couldn't be parsed
}
var input = "((a,(b,c)),(d,(e,(f,g))))";
var pairs = getPairs(input);
console.log(pairs);
console.log(getPairs(pairs[0]));
For your input, that would return the array ["(a,(b,c))", "(d,(e,(f,g)))"], and you could then run getPairs() on the parts of the returned array, or make it recursive, or whatever - you don't really make it clear what the final output should be from your sample "((a,(b,c)),(d,(e,(f,g))))" input.
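If the goal is the fully nested structure, the same nesting-level scan can be applied recursively. This is only a sketch under the assumption that every parenthesized group holds exactly two comma-separated parts and that leaves are plain strings; splitPair and parsePairs are hypothetical names.

```javascript
// Splits "(left,right)" at the first comma that is not inside nested parentheses.
// Returns null when no such comma exists.
function splitPair(input) {
  var inner = input.slice(1, -1); // drop the outer parentheses
  var depth = 0;
  for (var i = 0; i < inner.length; i++) {
    var ch = inner[i];
    if (ch === '(') depth++;
    else if (ch === ')') depth--;
    else if (ch === ',' && depth === 0) {
      return [inner.slice(0, i), inner.slice(i + 1)];
    }
  }
  return null;
}

// Recursively turns the string into nested two-element arrays.
function parsePairs(input) {
  if (input[0] !== '(') return input; // a leaf such as "a"
  var parts = splitPair(input);
  if (parts === null) throw new Error('Not a pair: ' + input);
  return [parsePairs(parts[0]), parsePairs(parts[1])];
}

console.log(JSON.stringify(parsePairs("((a,(b,c)),(d,(e,(f,g))))")));
```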
Here is a simpler solution. We first remove the first and last parenthesis, then split the resulting string on '),(', then iterate over the result and prepend/append the missing parenthesis to individual elements depending on their position. Note that this relies on both halves of the outer pair being parenthesized themselves, as they are in the sample input:
var a = "((a,(b,c)),(d,(e,(f,g))))";
var b = a.substring(1,a.length-1); //remove first and last parenthesis
var c = b.split('),('); //get pairs
for(var i=0;i<c.length;i++){
if(i%2===0){
c[i] = c[i]+')';
}else{
c[i] = '('+c[i];
}
}
console.log(c); // ["(a,(b,c))", "(d,(e,(f,g)))"]

How to get an Array of all words used on a page

So I'm trying to get an array of all the words used in my web page.
Should be easy, right?
The problem I run into is that $("body").text().split(" ") returns an array where the words at the beginning of one element and end of another are joined as one.
i.e:
<div id="1">Hello
<div id="2">World</div>
</div>
returns ["HelloWorld"] when I want it to return ["Hello", "World"].
I also tried:
wordArr = [];
function getText(target)
{
if($(this).children())
{
$(this).children(function(){getText(this)});
}
else
{
var testArr = $(this).text().split(" ");
for(var i =0; i < testArr.length; i++)
wordArr.push(testArr[i]);
}
}
getText("body");
but $(node).children() is truthy for any node in the DOM that exists, so that didn't work.
I'm sure I'm missing something obvious, so I'd appreciate an extra set of eyes.
For what it's worth, I don't need unique words, just every word in the body of the document as an element in the array. I'm trying to use it to generate context and lexical co-occurrence with another set of words, so duplicates just up the contextual importance of a given word.
Thanks in advance for any ideas.
How about something like this?
var res = $('body *').contents().map(function () {
if (this.nodeType == 3 && this.nodeValue.trim() != "")
return this.nodeValue.trim();
}).get().join(" ");
console.log(res);
Get the array of words:
var res = $('body *').contents().map(function () {
if (this.nodeType == 3 && this.nodeValue.trim() != "") //check for nodetype text and ignore empty text nodes
return this.nodeValue.trim().split(/\W+/); //split the nodevalue to get words.
}).get(); //get the array of words.
console.log(res);
function getText(target) {
var wordArr = [];
$('*',target).add(target).each(function(k,v) {
var words = $('*',v.cloneNode(true)).remove().end().text().split(/(\s+|\n)/);
wordArr = wordArr.concat(words.filter(function(n){return n.trim()}));
});
return wordArr;
}
you can do this
var words = []; // collects the words found by getwords
function getwords(e){
e.contents().each(function(){
if ( $(this).children().length > 0 ) {
getwords($(this))
}
else if($.trim($(this).text())!=""){
words=words.concat($.trim($(this).text()).split(/\W+/))
}
});
}
http://jsfiddle.net/R55eM/
The question assumes that words are not internally separated by elements. If you simply create an array of words separated by white space and elements, you will end up with:
Fr<b>e</b>d
being read as
['Fr', 'e', 'd'];
Another thing to consider is punctuation. How do you deal with: "There were three of them: Mark, Sue and Tom. They were un-remarkable. One—the red head—was in the middle." Do you remove all punctuation? Or replace it with white space before trimming? How do you re-join words that are split by markup or characters that might be inter–word or intra–word punctuation? Note that while it is popular to write a dash between words with a space at either side, "correct" punctuation uses an m dash with no spaces.
Not so simple…
Anyhow, an approach that just splits on spaces and elements using recursion and works in any browser in use without any library support is:
function getWords(element) {
element = element || document.body;
var node, nodes = element.childNodes;
var words = [];
var text, i=0;
while (node = nodes[i++]) {
if (node.nodeType == 1) {
words = words.concat(getWords(node));
} else if (node.nodeType == 3) {
text = node.data.replace(/^\s+|\s+$/g,'').replace(/\s+/g,' ');
words = !text.length? words : words.concat(text.split(/\s/));
}
}
return words;
}
but it does not deal with the issues above.
Edit
To avoid script elements, change:
if (node.nodeType == 1) {
to
if (node.nodeType == 1 && node.tagName.toLowerCase() != 'script') {
Any element that should be avoided can be added to the condition. If a number of element types should be avoided, you can do:
var elementsToAvoid = {script:'script', button:'button'};
...
if (node.nodeType == 1 && node.tagName && !(node.tagName.toLowerCase() in elementsToAvoid)) {
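Putting the recursive walk and the exclusion list together might look like the sketch below. The function name collectWords and the choice of skipped tags are assumptions; it only reads nodeType, tagName, childNodes and data, so it can be tried on document.body in a browser or on plain objects shaped like DOM nodes.

```javascript
// Tags whose text should be ignored (an assumption, adjust as needed).
var SKIPPED_TAGS = { script: true, style: true };

function collectWords(node) {
  var words = [];
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === 1) {
      // Element node: recurse unless its tag is on the skip list.
      var tag = child.tagName.toLowerCase();
      if (!Object.prototype.hasOwnProperty.call(SKIPPED_TAGS, tag)) {
        words = words.concat(collectWords(child));
      }
    } else if (child.nodeType === 3) {
      // Text node: trim, then split on runs of whitespace.
      var text = child.data.trim();
      if (text.length) words = words.concat(text.split(/\s+/));
    }
  }
  return words;
}
```

In a browser, collectWords(document.body) would return the words of the page. Like the version above, it still splits Fr<b>e</b>d into three pieces and leaves punctuation attached to words.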

alternatives for excessive for() looping in javascript

Situation
I'm currently writing a JavaScript widget that displays a random quote in an HTML element. The quotes are stored in a JavaScript array, along with how many times each has been displayed. A quote to be displayed cannot be the same as the one previously displayed. Furthermore, the chance of a quote being selected is based on its previous occurrences in the HTML element (fewer occurrences should result in a higher chance of being selected compared to the other quotes).
Current solution
I've currently made it work (with my severely lacking JavaScript knowledge) by looping through various arrays a lot. While this works, I find the solution rather expensive for what I want to achieve.
What I'm looking for
Alternative methods of removing an element from an array; currently I loop through the entire array to find the element I want removed and copy all other elements into a new array
Alternative methods of calculating and selecting an element from an array based on its occurrences
Anything else you notice I should or could do differently while still enforcing the business rules stated under Situation
The Code
var quoteElement = $("div#Quotes > q"),
quotes = [[" AAAAAAAAAAAA ", 1],
[" BBBBBBBBBBBB ", 1],
[" CCCCCCCCCCCC ", 1],
[" DDDDDDDDDDDD ", 1]],
fadeTimer = 600,
displayNewQuote = function () {
var currentQuote = quoteElement.text();
var eligibleQuotes = new Array();
var exclusionFound = false;
for (var i = 0; i < quotes.length; i++) {
var iteratedQuote = quotes[i];
if (exclusionFound === false) {
if (currentQuote == iteratedQuote[0].toString())
exclusionFound = true;
else
eligibleQuotes.push(iteratedQuote);
} else
eligibleQuotes.push(iteratedQuote);
}
eligibleQuotes.sort( function (current, next) {
return current[1] - next[1];
} );
var calculatePoint = eligibleQuotes[0][1];
var occurenceRelation = new Array();
var relationSum = 0;
for (var i = 0; i < eligibleQuotes.length; i++) {
if (i == 0)
occurenceRelation[i] = 1 / ((calculatePoint / calculatePoint) + (calculatePoint / eligibleQuotes[i+1][1]));
else
occurenceRelation[i] = occurenceRelation[0] * (calculatePoint / eligibleQuotes[i][1]);
relationSum = relationSum + (occurenceRelation[i] * 100);
}
var generatedNumber = Math.floor(relationSum * Math.random());
var newQuote;
for (var i = 0; i < occurenceRelation.length; i++) {
if (occurenceRelation[i] <= generatedNumber) {
newQuote = eligibleQuotes[i][0].toString();
i = occurenceRelation.length;
}
}
for (var i = 0; i < quotes.length; i++) {
var iteratedQuote = quotes[i][0].toString();
if (iteratedQuote == newQuote) {
quotes[i][1]++;
i = quotes.length;
}
}
quoteElement.stop(true, true)
.fadeOut(fadeTimer);
setTimeout( function () {
quoteElement.html(newQuote)
.fadeIn(fadeTimer);
}, fadeTimer);
}
if (quotes.length > 1)
setInterval(displayNewQuote, 10000);
Alternatives considered
Always choose the array element with the lowest occurrence.
Decided against this, as it could reveal too obvious a pattern in the animation
Combine several for loops to reduce the workload
Decided against this, as it would make the code too esoteric; I probably wouldn't understand the code anymore next week
jsFiddle reference
http://jsfiddle.net/P5rk3/
Update
Rewrote my function with the techniques mentioned. While I fear that these techniques still loop through the entire array to find what they need, at least my code looks cleaner : )
References used after reading the answers here:
http://www.tutorialspoint.com/javascript/array_map.htm
http://www.tutorialspoint.com/javascript/array_filter.htm
http://api.jquery.com/jQuery.each/
I suggest array functions that are mostly supported (and easily added if not):
[].splice(index, howManyToDelete); // you can alternatively add extra parameters to slot into the place of deletion
[].indexOf(elementToSearchFor);
[].filter(function(){});
Other useful functions include forEach and map.
I agree that combining all the work into one giant loop is ugly (and not always possible), and you gain little by doing it, so readability is definitely the winner. Although you shouldn't need too many loops with these array functions.
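As a rough illustration of how these functions replace the hand-written exclusion loop, here is a sketch using the [text, count] layout from the question. The helper names are made up. One difference to be aware of: filter drops every entry with matching text, whereas the original loop only drops the first.

```javascript
// Removes the first occurrence of `item` from `list` in place.
function removeInPlace(list, item) {
  var index = list.indexOf(item);
  if (index !== -1) list.splice(index, 1);
  return list;
}

// Returns a new array without the quote that is currently displayed.
// Each quote is a [text, displayCount] pair, as in the question.
function eligibleQuotes(quotes, currentText) {
  return quotes.filter(function (quote) {
    return quote[0] !== currentText;
  });
}

var sampleQuotes = [[" AAAA ", 1], [" BBBB ", 2], [" CCCC ", 1]];
console.log(eligibleQuotes(sampleQuotes, " BBBB ").length); // 2
console.log(removeInPlace(['x', 'y', 'z'], 'y'));           // [ 'x', 'z' ]
```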
The answer that you want:
Create an integer array that stores the number of uses of every quote. Also keep a global variable Tot with the total number of quotes already used (i.e., the sum of that integer array). Also compute Mean, as Tot / number of quotes.
Choose a random number between 0 and Tot - 1.
For each quote, add Mean * 2 - the number of uses (*1) to a running total. Once that running total exceeds the random number generated, select that quote.
In case that quote is the one currently displayed, either select the next or the previous quote, or just repeat the process.
The real answer:
Use a random quote, and at most pick again if the quote is the same as the one displayed. The usage data is going to be lost when the user reloads or leaves the page. And no matter how cleverly you have chosen them, most users do not care.
(*1) Check for limits, i.e. that the first or last quote will be eligible with this formula.
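The running-total idea can be sketched as follows. Note that this is not the exact Mean * 2 formula from the steps above: as an assumption, it weights each quote by 1 / displayCount, which stays positive as long as counts start at 1 as they do in the question. The function name pickWeighted and the injectable random parameter are hypothetical and exist only to keep the example testable.

```javascript
// Picks a [text, displayCount] pair, never the one currently shown.
// Quotes shown less often get a larger weight (1 / displayCount).
function pickWeighted(quotes, currentText, random) {
  random = random || Math.random;
  var candidates = quotes.filter(function (quote) {
    return quote[0] !== currentText;
  });
  if (candidates.length === 0) return null;

  var weights = candidates.map(function (quote) {
    return 1 / quote[1];
  });
  var total = weights.reduce(function (sum, w) {
    return sum + w;
  }, 0);

  // Walk the cumulative weights until the random threshold is passed.
  var threshold = random() * total;
  var running = 0;
  for (var i = 0; i < candidates.length; i++) {
    running += weights[i];
    if (threshold < running) return candidates[i];
  }
  return candidates[candidates.length - 1]; // guards against rounding error
}
```

The caller would then increment the count of the returned pair, for example picked[1]++, before displaying its text.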
Alternative methods of removing an array element from an array
With ES5's Array.filter() method:
Array.prototype.without = function(v) {
return this.filter(function(x) {
return v !== x;
});
};
given an array a, a.without(v) will return a copy of a without the element v in it.
less occurrences should result in a higher chance compared to the other quotes to be selected for display
You shouldn't mess with chance - as my mathematician other-half says, "chance doesn't have a memory".
What you're suggesting is akin to the idea that numbers in the lottery that haven't come up yet must be "overdue" and therefore more likely to appear. It simply isn't true.
You can write functions that explicitly define what you're trying to do with the loop.
Your first loop is a filter.
Your second loop is a map + some side effect.
I don't know about the other loops, they're weird :P
A filter is something like:
function filter(array, condition) {
var i = 0, new_array = [];
for (; i < array.length; i += 1) {
if (condition(array[i], i)) {
new_array.push(array[i]);
}
}
return new_array;
}
var numbers = [1,2,3,4,5,6,7,8,9];
var even_numbers = filter(numbers, function (number, index) {
return number % 2 === 0;
});
alert(even_numbers); // [2,4,6,8]
You can't avoid the loop, but you can add more semantics to the code by making a function that explains what you're doing.
If, for some reason, you are not comfortable with splice or filter methods, there is a nice (outdated, but still working) method by John Resig: http://ejohn.org/blog/javascript-array-remove/
