Find and replace all strings with a certain length JavaScript/Google Script - javascript

I am a JavaScript/Google Apps Script rookie, so please bear with me. I am trying to create a script in Google Docs that locates every word that is exactly 10 characters long and prepends some text to it, which would in turn give me a URL.
Example : Here is my link pineapples
I would like to find the 10-character string, here pineapples, and add google.com/ in front of every string that has a length of 10.
Giving me "Here is my link google.com/pineapples".
function myFunction() {
var str = document.getElementById(str.length=10);
var res = str.replace("str.length=10", "br"+"str.length=10");
This seems completely wrong, but it is all I can come up with for now.

You can make it work by using a regex and then using a backreference to refer to the captured group.
Regex: (\S{10})
It has 3 parts:
\S matches any character other than whitespace (space, tab, newline and so on).
{10} matches the preceding token exactly 10 times.
() is the capturing group, which is referenced later in the replacement string as $1.
You can get more information here, which explains the above regex in detail.
You may change it to fit your needs. Note that, as written, it replaces only the first run of 10 non-whitespace characters, even when that run is part of a longer word.
var stringVal = "Here is my link pineapples";
var stringReplaced = stringVal.replace(/(\S{10})/, "google.com/$1");
console.log(stringReplaced);
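Building on the regex above, here is a minimal sketch that handles every 10-character word in the text and leaves longer words alone. The function name prefixTenCharWords and its parameters are made up for illustration, and it assumes that "words" are runs of non-whitespace separated by whitespace.

```javascript
// Hypothetical helper, not part of the original answer.
// (^|\s) captures the start of the text or the whitespace before the word,
// (\S{10}) captures exactly 10 non-whitespace characters, and the lookahead
// (?=\s|$) makes sure the word ends there, so 11+ character words are skipped.
function prefixTenCharWords(text, prefix) {
  return text.replace(/(^|\s)(\S{10})(?=\s|$)/g, function (match, lead, word) {
    return lead + prefix + word;
  });
}

console.log(prefixTenCharWords("Here is my link pineapples", "google.com/"));
// "Here is my link google.com/pineapples"
```

Punctuation counts as part of a word here, so "pineapples." is 11 characters long and would not be touched.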

Here is a possible solution:
Split your string using space as a separator (this will give you an array)
Test the length of each part in a loop
Prepend google.com/ if a part has 10 characters
Join your array and enjoy your transformed string
var str = "Here is my link pineapples",
arr = str.split(' ');
for (var i = 0; i < arr.length; i++) {
if (arr[i].length === 10) {
arr[i] = 'google.com/' + arr[i];
}
}
console.log(arr.join(' '));

Okay so bear with me, but my idea is as follows:
The text that you want to replace, is it all within elements of the same class? If so, you could do something like this (it uses jQuery, I hope you don't mind):
function myFunction(){
  $('.myClass').each(function(){
    var innerText = $(this).text();
    // keep only the first 9 characters of the element's text
    var substring = innerText.substr(0,9);
    $(this).text(substring);
  });
}

Related

Javascript iterate over string looking for multiple character sets

Ok, so I know how to do a standard loop to iterate over a string to find a character or a word that matches a single character or word, but in this instance, I have multiple character sets that I am looking for. Some are letters, some have characters (including protected ones). I can't split it into an array of words on space or anything like that because the character sets might not have a space, so wouldn't split. I suspect I'm going to have to do a regex, but I'm not sure how to set it up. This is basically the pseudo code of what I'm trying to do and I'd appreciate any tips on how to move forward. I apologize if this is an easy thing and I'm missing it, I'm still working on my javascript.
Pseudo code:
var string = "This *^! is abdf random&!# text to x*?ysearch for character sets";
var tempSet = [];
// start a typical for loop
for(string.length bla bla...){
// look for one of those four character sets and if it hits one
if(foundSet == "abdf" | "x*?y" | "*^!" | "&!#")
// push that character set to the tempSet array
tempSet.push(foundSet);
// continue searching for the next set until the string is done
console.log(tempSet);
//expected result = ["*^!", "abdf", "&!#", "x*?y"]
and all the sets are in the array in the order in which they appeared in the string
there is obviously more, but that part I can handle. It's this line
if(??? == "abdf" | "x*?y" | "*^!" | "&!#")
that I don't know really how to tackle. I suspect it should be some kind of regex but can you have a regex like that with a | when doing an if statement? I've done them with a | when doing a map/replace but I've never used a regex in a loop. I also don't know how to get it to search multiple characters at a time. Some of the character sets are 3, some are 4 characters long.
I would appreciate any help or if you have a suggestion on how to approach this in an easier way, that would be great.
Thanks!
You can use a regular expression. Just list all your strings as alternatives separated by |. Characters that have special meaning in regular expressions (e.g. *, ?, ^, $) will need to be escaped with \ (you can safely escape any non-alphanumeric characters -- some will be redundant).
var string = "This *^! is abdf random&!# text to x*?ysearch for character sets";
var tempSet = string.match(/abdf|x\*\?y|\*\^!|&!#/g);
console.log(tempSet);
If you need a loop you can call RegExp.prototype.exec() in a loop.
var string = "This *^! is abdf random&!# text to x*?ysearch for character sets";
var regex = /abdf|x\*\?y|\*\^!|&!#/g;
var tempSet = [];
var match;
// exec() returns null once there are no more matches
while ((match = regex.exec(string)) !== null) {
  tempSet.push(match[0]);
}
console.log(tempSet);
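If the list of character sets changes often, escaping each one by hand gets error-prone. A minimal sketch, assuming the sets arrive as an array of plain strings; escapeRegExp and findSets are names made up for this example, not built-in functions.

```javascript
// Hypothetical helpers, not built into JavaScript.
// Escape every character that has a special meaning in a regular expression.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation from the escaped sets and collect all matches
// in the order in which they appear in the text.
function findSets(text, sets) {
  var pattern = new RegExp(sets.map(escapeRegExp).join("|"), "g");
  return text.match(pattern) || [];
}

var string = "This *^! is abdf random&!# text to x*?ysearch for character sets";
console.log(findSets(string, ["abdf", "x*?y", "*^!", "&!#"]));
// ["*^!", "abdf", "&!#", "x*?y"]
```

If one set is a prefix of another, list the longer one first, because the regex tries the alternatives from left to right.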
A bit more of a manual method than Barmar's excellent RegEx, but it was fun to put together and shows the pieces maybe a bit more clearly:
var text = "This *^! is abdf random&!# text to x*?ysearch for character sets",
detect = ["abdf", "x*?y", "*^!", "&!#"],
haystack = '',
found = [];
text.split('').forEach(function(letter){
haystack += letter;
detect.forEach(function(needle){
if (haystack.indexOf(needle) !== -1
&& found.indexOf(needle) === -1) {
found.push(needle);
}
});
});
console.log(found);
I think what you're looking for is the includes() function.
var sample = "This *^! is abdf random&!# text to x*?ysearch for character sets";
var toSearch = ["*^!", "abdf", "&!#", "x*?y"];
var tempSet = [];
for (var i = 0; i < toSearch.length; i++) {
  if (sample.includes(toSearch[i])) {
    tempSet.push(toSearch[i]);
  }
}
console.log(tempSet);
//expected result = ["*^!", "abdf", "&!#", "x*?y"]
This way you can iterate through an entire array of whatever strings you're searching for and push all matching elements to tempSet.
Note: This is case sensitive, so make sure you consider your check accordingly.
I would just add this as a comment to Kevin's answer if I was able to, but if you need IE support you can also check searchString.indexOf(searchToken) !== -1.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/includes
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/indexOf

regex lookbehind in javascript

I am trying to match some words in a text.
Working example (what I want), on regex101:
regex = /(?<![a-z])word/g
text = word 1word !word aword
Only the first three words are matched, which is what I want to achieve.
But the lookbehind will not work in JavaScript :(
So now I am trying this regex, also on regex101:
regex = /(\b|\B)word/g
text = word 1word !word aword
But then all the words match, and the word may not be preceded by another letter, only by a digit or a special character.
If I use only \b, 1word won't match, and if I use only \B, !word won't match.
Edit
The output should be ["word","word","word"]
The 1 and ! must not be included in the match, nor in another group, because I want to use it with JavaScript's .replace(regex,function(match){}), which should not loop over the 1 and !.
The code i use it for
for(var i = 0; i < elements.length; i++){
text = elements[i].innerHTML;
textnew = text.replace(regexp,function(match){
matched = getCrosslink(match)[0];
return "<a href='"+matched.url+"'>"+match+"</a>";
});
elements[i].innerHTML = textnew;
}
Capturing the leading character
It's difficult to know exactly what you want without seeing more output examples, but what about matching the word either at a word boundary or right after a non-letter? For example:
(\bword|[^a-zA-Z]word)
Output: ['word', '1word', '!word']
Here is a working example
Capturing only the "word"
If you only want the "word" part to be captured you can use the following and fetch the 2nd capture group:
(\b|[^a-zA-Z])(word)
Output: ['word', 'word', 'word']
Here is a working example
With replace()
You can use specific capture groups when defining the replace value, so this will work for you (where "new" is the word you want to use):
var regex = /(\b|[^a-zA-Z])(word)/g;
var text = "word 1word !word aword";
text = text.replace(regex, "$1" + "new");
output: "new 1new !new aword"
Here is a working example
If you are using a dedicated function in replace, try this:
textnew = text.replace(regexp,function (allMatch, match1, match2){
  matched = getCrosslink(match2)[0];
  // match1 is the leading character (or empty); put it back in front of the link
  return match1+"<a href='"+matched.url+"'>"+match2+"</a>";
});
Here is a working example
You can use the following regex
([^a-zA-Z]|\b)(word)
Then use replace like this (it returns a new string, so assign the result):
var str = "word 1word !word aword";
str = str.replace(/([^a-zA-Z]|\b)(word)/g,"$1"+"<a>$2</a>");
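For what it's worth, lookbehind assertions were added to JavaScript in ES2018, so on engines that support them a negative lookbehind like the one in the question works directly and the leading character never becomes part of the match. A minimal sketch, assuming such an engine (older browsers throw a SyntaxError on this pattern); the `<a>` wrapper is only a placeholder for the real link markup.

```javascript
var text = "word 1word !word aword";

// (?<![a-zA-Z]) asserts that the character before "word" is not a letter,
// without consuming that character.
var matches = text.match(/(?<![a-zA-Z])word/g);
console.log(matches); // ["word", "word", "word"]

// The replacer only ever sees "word", so nothing has to be put back.
var replaced = text.replace(/(?<![a-zA-Z])word/g, function (match) {
  return "<a>" + match + "</a>";
});
console.log(replaced); // "<a>word</a> 1<a>word</a> !<a>word</a> aword"
```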

JavaScript: How can I remove any words containing (or directly preceding) capital letters, numbers, or commas, from a string?

I'm trying to write the code so it removes the "bad" words from the string (the text).
A word is "bad" if it contains a comma or any other special character, or is directly followed by one. A word is not "bad" if it contains only the lowercase letters a to z.
So, the result I'm trying to achieve is:
<script>
String.prototype.azwords = function() {
return this.replace(/[^a-z]+/g, "0");
}
var res = "good Remove remove1 remove, ### rem0ve? RemoVE gooood remove.".azwords();//should be "good gooood"
//Remove has a capital letter
//remove1 has 1
//remove, has comma
//### has three #
//rem0ve? has 0 and ?
//RemoVE has R and V and E
//remove. has .
alert(res);//should alert "good gooood"
</script>
Try this:
return this.replace(/(^|\s+)[a-z]*[^a-z\s]\S*(?!\S)/g, "");
It tries to match a word (delimited by whitespace or the ends of the string) that consists of any non-whitespace characters but contains at least one that is not a-z. However, this is quite complicated and hard to maintain. Maybe you should try a more functional approach:
return this.split(/\s+/).filter(function(word) {
return word && !/[^a-z]/.test(word);
}).join(" ");
Okay, first off you probably want to use the word boundary escape \b in your regex. Also, it's a bit tricky if you match the bad words, because a bad word might contain lowercase characters, so your current regex will exclude anything which does have lowercase letters.
I'd be tempted to pick out the good words and put them in a new string. It's a much easier regex.
/\b[a-z]+\b/g
NB: I'm not totally sure that it'll work for the first and last words in the string so you might need to account for that as well. http://www.regextester.com/ is exceptionally useful.
EDIT: as you want punctuation after the word to be 'bad', this will actually do what I was suggesting
(^|\s)[a-z]+(\s|$)
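One way to use a pattern like this to collect the good words, as a minimal sketch: it swaps the trailing (\s|$) for a lookahead so that two good words in a row can both match (the shared space is not consumed). The function name keepLowercaseWords is invented for this example.

```javascript
// Hypothetical helper: keep only the words made up entirely of a-z.
function keepLowercaseWords(text) {
  // (?:^|\s) requires the start of the string or whitespace before the word;
  // (?=\s|$) requires whitespace or the end of the string after it,
  // without consuming it.
  var re = /(?:^|\s)([a-z]+)(?=\s|$)/g;
  var words = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    words.push(m[1]);
  }
  return words.join(" ");
}

console.log(keepLowercaseWords("good Remove remove1 remove, ### rem0ve? RemoVE gooood remove."));
// "good gooood"
```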
Firstly I wouldn't recommend changing the prototype of String (or of any native object) if you can avoid because you leave yourself open to conflicts with other code that might define the same property in different ways. Much better to put custom methods like this on a namespaced object, though I'm sure some will disagree.
Second, is there any need to use RegEx completely? (Genuine question; not trying to be facetious.)
Here is an example of the function with plain old JS using a little bit of RegEx here and there. Easier to comment, debug, and reuse.
Here is the code:
var azwords = function(str) {
var arr = str.split(/\s+/),
len = arr.length,
i = 0,
res = "";
for (i; i < len; i += 1) {
if (!(arr[i].match(/[^a-z]/))) {
res += (!res) ? arr[i] : " " + arr[i];
}
}
return res;
}
var res = "good Remove remove1 remove, ### rem0ve? RemoVE gooood remove."; //should be "good gooood"
//Remove has a capital letter
//remove1 has 1
//remove, has comma
//### has three #
//rem0ve? has 0 and ?
//RemoVE has R and V and E
//remove. has .
alert(azwords(res));//should alert "good gooood";
Try this one:
var res = "good Remove remove1 remove, ### rem0ve? RemoVE gooood remove.";
var new_one = res.replace(/\s*\w*[#A-Z0-9,.?\xA1-\xFF]\w*/g,'');
//Output `good gooood`
Description:
\s* # zero-or-more whitespace characters
\w* # zero-or-more word characters (letters, digits, underscore)
[#A-Z0-9,.?\xA1-\xFF] # matches any one of the listed characters
\w* # zero-or-more word characters (letters, digits, underscore)
/g - global (run over the whole string)
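This will find all the words you want: /^[a-z]+\s|\s[a-z]+$|\s[a-z]+\s/g, so you could use match.
(this.match(/^[a-z]+\s|\s[a-z]+$|\s[a-z]+\s/g) || []).map(function(w){ return w.trim(); }).join(" "); should return the list of valid words (each match includes its surrounding whitespace, hence the trim, and match returns null when nothing matches).
Note that this took some time to run in a JSFiddle, so it may be more efficient to split and iterate over your list.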

remove all but a specific portion of a string in javascript

I am writing a little app for Sharepoint. I am trying to extract some text from the middle of a field that is returned:
var ows_MetaInfo="1;#Subject:SW|NameOfADocument
vti_parservers:SR|23.0.0.6421
ContentTypeID:SW|0x0101001DB26Cf25E4F31488B7333256A77D2CA
vti_cachedtitle:SR|NameOfADocument
vti_title:SR|ATitleOfADocument
_Author:SW:|TheNameOfOurCompany
_Category:SW|
ContentType:SW|Document
vti_author::SR|mrwienerdog
_Comments:SW|This is very much the string I need extracted
vti_categories:VW|
vtiapprovallevel:SR|
vti_modifiedby:SR|mrwienerdog
vti_assignedto:SR|
Keywords:SW|Project Name
ContentType _Comments"
So......All I want returned is "This is very much the string I need extracted"
Do I need a regex and a string replace? How would you write the regex?
Yes, you can use a regular expression for this (this is the sort of thing they are good for). Assuming you always want the string after the pipe (|) on the line starting with "_Comments:SW|", here's how you can extract it:
var matchresult = ows_MetaInfo.match(/^_Comments:SW\|(.*)$/m);
var comment = (matchresult==null) ? "" : matchresult[1];
Note that the .match() method of the String object returns an array. The first element (index 0) is the entire match (here, the entire match is the whole line, as we anchored it with ^ and $; note that adding the "m" after the regex makes this a multiline regex, allowing ^ and $ to match the start and end of any line within the multi-line input), and the rest of the array holds the submatches that we capture using parentheses. Above we've captured the part of the line that you want, so it will be present in the second item in the array (index 1).
If there is no match ("_Comments:SW|" doesn't appear in ows_MetaInfo), then .match() will return null, which is why we test it before pulling out the comment.
If you need to adjust the regex for other scenarios, have a look at the Regex docs on Mozilla Dev Network: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
You can use this code:
var match = ows_MetaInfo.match(/_Comments:SW\|([^\n]+)/);
if (match)
document.writeln(match[1]);
I'm far from competent with RegEx, so here is my RegEx-less solution. See comments for further detail.
var extractedText = ExtractText(ows_MetaInfo);
function ExtractText(arg) {
  // Use the pipe delimiter to turn the string into an array
  var aryValues = arg.split("|");
  // The comment sits between the pipe after "_Comments:SW" and the next key,
  // so find the piece that contains "vti_categories:VW" and strip that key
  for (var i = 0; i < aryValues.length; i++) {
    if (aryValues[i].indexOf("vti_categories:VW") != -1)
      return aryValues[i].replace("vti_categories:VW", "").trim();
  }
  return null;
}
Here's a working fiddle to demonstrate.
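If you need more than one field, another option is to parse the whole value into an object first. This is only a sketch under assumptions about the format: each field is on its own line, the key is everything before the first colon, and the value is everything after the first pipe. The name parseMetaInfo is made up for this example.

```javascript
// Hypothetical helper: turn "key:TYPE|value" lines into { key: value }.
function parseMetaInfo(raw) {
  var fields = {};
  raw.split(/\r?\n/).forEach(function (line) {
    var pipe = line.indexOf("|");
    if (pipe === -1) {
      return; // not a "key|value" line, skip it
    }
    var key = line.slice(0, pipe).split(":")[0];
    fields[key] = line.slice(pipe + 1);
  });
  return fields;
}

var sample = "_Category:SW|\n" +
  "_Comments:SW|This is very much the string I need extracted\n" +
  "vti_categories:VW|";
console.log(parseMetaInfo(sample)._Comments);
// "This is very much the string I need extracted"
```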

Searching for a last word in JavaScript

I am doing some logic for the last word that is on the sentence. Words are separated by either space or with a '-' character.
What is easiest way to get it?
Edit
I could do it by traversing backwards from the end of the sentence, but I would like to find better way
Try splitting on a regex that matches spaces or hyphens and taking the last element:
var lastWord = function(o) {
return (""+o).replace(/[\s-]+$/,'').split(/[\s-]/).pop();
};
lastWord('This is a test.'); // => 'test.'
lastWord('Here is something to-do.'); // => 'do.'
As @alex points out, it's worth trimming any trailing whitespace or hyphens. Ensuring the argument is a string is a good idea too.
Using a regex:
/.*[\s-](\S+)/.exec(str)[1];
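This also ignores whitespace at the end, but exec returns null (so the [1] lookup throws) when the string contains no space or hyphen at all.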
Have you tried the lastIndexOf function? http://www.w3schools.com/jsref/jsref_lastIndexOf.asp
Or the split function? http://www.w3schools.com/jsref/jsref_split.asp
Here is a similar discussion; have a look.
You can try something like this...
<script type="text/javascript">
var txt = "This is the sample sentence";
var spl = txt.split(" ");
for (var i = 0; i < spl.length; i++) {
document.write("<br /> Element " + i + " = " + spl[i]);
}
</script>
Well, using the Split function (note that this code is C#, not JavaScript):
string lastWord = input.Split(' ').Last();
or
string[] parts = input.Split(' ');
string lastWord = parts[parts.Length - 1];
While this would work for this string, it might not work for a slightly different string, so either you'll have to figure out how to change the code accordingly, or post all the rules.
string input = ".... ,API";
here, the comma would be part of the "word".
Also, if the first method of obtaining the word is correct, ie. everything after the last space, and your string adheres to the following rules:
Will always contain at least one space
Does not end with one or more spaces (if it does, you can trim it first)
then you can use this code that will allocate fewer objects on the heap for GC to worry about later:
string lastWord = input.Substring(input.LastIndexOf(' ') + 1);
I hope this helps.
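Since the question is about JavaScript, here is a rough JavaScript counterpart of the lastIndexOf approach that also treats '-' as a separator. It is a sketch under the assumption that only spaces and hyphens separate words; the function name lastWord is just an illustrative choice.

```javascript
// Hypothetical helper: return the text after the last space or hyphen.
function lastWord(sentence) {
  // Drop trailing whitespace and hyphens so they do not count as separators.
  var trimmed = String(sentence).replace(/[\s-]+$/, "");
  // Position of the last separator, or -1 when there is none.
  var cut = Math.max(trimmed.lastIndexOf(" "), trimmed.lastIndexOf("-"));
  return trimmed.substring(cut + 1);
}

console.log(lastWord("This is a test."));          // "test."
console.log(lastWord("Here is something to-do.")); // "do."
console.log(lastWord("single"));                   // "single"
```

No array is created along the way, which is the same reason the Substring version above gives for avoiding Split.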
