Remove string after predefined string - javascript

I am pulling content from an RSS feed and then using jQuery to format and edit the returned string. I am using replace to swap strings and characters, like so:
var spanish = $("#wod a").text();
var newspan = spanish.replace("=","-");
$("#wod a").text(newspan);
This works great. I am also trying to remove all text after a certain point. Similar to truncation, I would like to hide all text starting from the word "Example".
In this particular RSS feed, the word "Example" appears in every item. I would like to hide "Example" and all text that follows it. How can I accomplish this?

You don't need jQuery to remove everything after a certain word in a given string; plain JavaScript string methods are enough. The first approach is to use substring:
var new_str = str.substring(0, str.indexOf("Example"));
The second is a trick with split:
var new_str = str.split("Example")[0];

If you also want to keep "Example" and just remove everything after that particular word, you can do:
var str = "aaaa1111?bbb&222:Example=123456",
newStr = str.substring(0, str.indexOf('Example') + 'Example'.length);
// will output: aaaa1111?bbb&222:Example
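One detail the snippets above do not cover is the case where the marker word is missing: indexOf then returns -1 and substring(0, -1) yields an empty string, whereas the split version returns the whole string. A minimal sketch that makes the behaviour explicit is shown below; the function name cutAt and its keepMarker parameter are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper: returns the part of `str` before `marker`.
// If keepMarker is true, the marker itself is kept at the end.
// If the marker is not found, the string is returned unchanged.
function cutAt(str, marker, keepMarker = false) {
  const index = str.indexOf(marker);
  if (index === -1) {
    return str; // avoid substring(0, -1), which would return ""
  }
  return str.substring(0, keepMarker ? index + marker.length : index);
}

console.log(cutAt("hola - hello Example: Hola, amigo", "Example"));
console.log(cutAt("aaaa1111?bbb&222:Example=123456", "Example", true));
console.log(cutAt("no marker here", "Example"));
```

Returning the input unchanged when the marker is absent is a design choice for this sketch; returning an empty string or null may suit other callers better.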

jQuery isn't intended for string manipulation; you should use vanilla JavaScript for that:
newspan = newspan.replace(/example.*$/i, "");
The .replace() method accepts a regular expression, so in this case I've used /example.*$/i which does a case-insensitive match against the word "example" followed by zero or more of any other characters to the end of the string and replaces them with an empty string.
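One caveat worth illustrating: in JavaScript, "." does not match line breaks, and without the m flag "$" only matches at the very end of the input. If the feed text can contain line breaks after the word, the pattern above may therefore not match at all. The sketch below compares it with a variant that uses [\s\S]; the sample text is made up for the demonstration.

```javascript
// Hypothetical sample text with a line break after the marker word.
const feedText = "hola - hello\nExample: Hola, amigo.\nMore text";

// Original pattern: "." stops at the line break and "$" (no m flag)
// only matches at the end of the input, so nothing is removed here.
const singleLine = feedText.replace(/example.*$/i, "");

// Variant: [\s\S] matches any character, including line breaks.
const multiLine = feedText.replace(/example[\s\S]*$/i, "");

console.log(JSON.stringify(singleLine)); // same as feedText
console.log(JSON.stringify(multiLine));  // "hola - hello\n"
```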

I would like to hide all text starting from the word "Example"
A solution that uses the simpler replace WITH backreferences so as to "hide" everything starting with the word Example but keeping the stuff before it.
var str = "my house example is bad"
str.replace(/(.*?) example.*/i, "$1") // returns "my house"
// case insensitive. also note the space before example because you
// probably want to throw that out.

Related

Match and replace a substring while ignoring special characters

I am currently looking for a way to turn matching text into a bold html line. I have it partially working except for special characters giving me problems because I desire to maintain the original string, but not compare the original string.
Example:
Given the original string:
Taco John's is my favorite place to eat.
And wanting to match:
is my 'favorite'
To get the desired result:
Taco John's <b>is my favorite</b> place to eat.
The way I'm currently getting around the extra quotes in the matching string is by replacing them
let regex = new RegExp('('+escapeRegexCharacters(matching_text.replace(/[^a-z 0-9]/gi,''))+')',"gi")
let html = full_text.replace(/[^a-z 0-9]/gi,'').replace(regex, "<b>$1</b>")
This almost works, except that I lose all punctuation:
Taco Johns <b>is my favorite</b> place to eat
Is there any way to use regex, or another method, to add tags surrounding a matching phrase while ignoring both case and special characters during the matching process?
UPDATE #1:
It seems that I was unclear. I need the original string's punctuation to remain in the resulting HTML, and I need the matching logic to ignore all special characters and capitalization. So "is my favorite", "is My favorite" and "is my 'favorite'" should all trigger a match.
Instead of removing the special characters from the string being searched, you could inject in your regular expression a pattern between each character-to-match that will skip any special characters that might occur. That way you build a regular expression that can be applied directly to the string being searched, and the replacing operation will thus not touch the special characters outside of the matches:
let escapeRegexCharacters =
s => s.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&"),
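full_text = "Taco John's is My favorite place to eat.",
matching_text = "is my 'favorite'",
// Strip special characters from the phrase, then allow any run of
// special characters between each pair of remaining characters.
regex = new RegExp(matching_text.replace(/[^a-z\s\d]/gi, '')
.split('').map(escapeRegexCharacters).join('[^a-z\\s\\d]*'), "gi"),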
html = full_text.replace(regex, "<b>$&</b>");
console.log(html);
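The same idea can be packaged as a reusable function. This is only a sketch under a few assumptions: the names escapeRegex and boldLooseMatch are hypothetical, and "special characters" means anything that is not a letter, digit or whitespace, as in the answer above.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every loose match of `phrase` in <b> tags, ignoring case and
// any special characters that appear inside the matched text.
function boldLooseMatch(fullText, phrase) {
  // Drop special characters from the phrase being searched for.
  const wanted = phrase.replace(/[^a-z\s\d]/gi, "");
  if (wanted === "") {
    return fullText; // nothing to search for
  }
  // Between each pair of characters, allow any run of special characters.
  const pattern = wanted
    .split("")
    .map(escapeRegex)
    .join("[^a-z\\s\\d]*");
  return fullText.replace(new RegExp(pattern, "gi"), "<b>$&</b>");
}

console.log(boldLooseMatch("Taco John's is My favorite place to eat.", "is my 'favorite'"));
// Taco John's <b>is My favorite</b> place to eat.
```

Because the skip pattern is only inserted between characters, special characters directly before or after the phrase stay outside the <b> tags.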
Regular expressions are useful when there is a pattern, but in this case you have a direct match, so a plain String.prototype.replace with a string argument is a good approach (note that a string pattern replaces only the first occurrence):
function wrap(source, part, tagName) {
return source
.replace(part,
`<${tagName}>${part}</${tagName}>`
)
;
}
At least, if there is a pattern, you should edit your question and provide it.
As an option, for the single-occurrence case, use String.split.
Example replacing '###' with '<b>###</b>':
let inputString = '1234###5678'
const chunks = inputString.split('###')
inputString = `${chunks[0]}<b>###</b>${chunks[1]}`
It's possible to avoid using a capture group with the $& replacement string, which means "entire matched substring":
var phrase = "Taco John's is my favorite place to eat."
var matchingText = "is my favorite"
var re = new RegExp(escapeRegexCharacters(matchingText), "ig");
phrase.replace(re, "<b>$&</b>");
(Code based on obarakon's answer.)
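The snippet above relies on an escapeRegexCharacters helper that is not defined in this answer. A self-contained sketch follows; the helper body is one common way to write it and the wrapper name boldAll is hypothetical.

```javascript
// Escape regex metacharacters so the search text is matched literally.
function escapeRegexCharacters(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every case-insensitive occurrence of matchingText in <b> tags.
// "$&" in the replacement string stands for the entire matched substring,
// so the original capitalization is preserved.
function boldAll(phrase, matchingText) {
  const re = new RegExp(escapeRegexCharacters(matchingText), "ig");
  return phrase.replace(re, "<b>$&</b>");
}

console.log(boldAll("Taco John's is my favorite place to eat.", "is my favorite"));
console.log(boldAll("Is it a.b or axb? A.B!", "a.b"));
```

The second call shows why escaping matters: without it, the "." in "a.b" would also match "axb".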
Generalizing, the regex you could use is /is my \w*/. You can use it with a replacer function, which lets you manipulate the matched text in JavaScript:
var str = "Taco John's is my favorite place to eat.";
var html = str.replace(/is my \w*/, function (x) {
return "<b>" + x + "</b>";
} );
console.log(html);

Replace words of text area

I have made a javascript function to replace some words with other words in a text area, but it doesn't work. I have made this:
function wordCheck() {
var text = document.getElementById("eC").value;
var newText = text.replace(/hello/g, '<b>hello</b>');
document.getElementById("eC").innerText = newText;
}
When I alert the variable newText, the console says that the variable doesn't exist.
Can anyone help me?
Edit:
Now it replaces the words, but it inserts the literal text <b>hello</b>; I want the word to appear in bold. Is there a solution?
Update:
In response to your edit, about wanting to see the word "hello" show up in bold: the short answer is that it can't be done. Not in a simple textarea, at least. You're probably looking for something more like an online WYSIWYG editor, or at least an RTE (Rich Text Editor). There are a couple of them out there, like tinyMCE, for example, which is a decent WYSIWYG editor. A list of RTEs and HTML editors can be found here.
First off: As others have already pointed out: a textarea element's contents is available through its value property, not the innerText. You get the contents alright, but you're trying to update it through the wrong property: use value in both cases.
If you want to replace all occurrences of a string/word/substring, you'll have to resort to using a regular expression, using the g modifier. I'd also recommend making the matching case-insensitive, to replace "hello", "Hello" and "HELLO" all the same:
var txtArea = document.querySelector('#eC');
txtArea.value = txtArea.value.replace(/(hello)/gi, '<b>$1</b>');
As you can see: I captured the match, and used it in the replacement string, to preserve the caps the user might have used.
But wait, there's more:
What if, for some reason, the input already contains <b>Hello</b>, or contains a word that includes the string "hello", like "The company is called hellonearth"? Enter lookaround assertions and word boundaries:
txtArea.value = txtArea.value.replace(/(?<!>)\b(hello)\b(?!<)/gi, '<b>$1</b>');
How it works:
(?<!>): only match the rest if it isn't preceded by a > char (be more specific if you want to, and use (?<!<b>)). This is called a negative lookbehind; it requires a JavaScript engine with ES2018 lookbehind support. A lookahead such as (?!>) in this position would test the character that follows, not the one that precedes, so it would never exclude anything here.
\b: a word boundary, to make sure we're not matching part of a word
(hello): match and capture the string literal, provided (as explained above) it is not preceded by a > and there is a word boundary
(?!<): a negative lookahead: only match if the word isn't followed by a < char. Here we don't want to find a closing </b>, so you can replace this with the more specific (?!<\/b>)
/gi: modifiers, or flags, that affect the entire pattern: g for global (meaning this pattern will be applied to the entire string, not just a single match). The i tells the regex engine the pattern is case-insensitive, i.e. h matches both the upper and lowercase character.
The replacement string <b>$1</b>: when the replacement string contains $n substrings, where n is a number, they are treated as backreferences. A regex can group matches into various parts, each group has a number, starting with 1, depending on how many groups you have. We're only grouping one part of the pattern, but suppose we wrote:
'foobar hello foobar'.replace(/(hel)(lo)/g, '<b>$1-$2</b>');
The output would be "foobar <b>hel-lo</b> foobar", because we've split the match up into 2 parts, and added a dash in the replacement string.
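The word-boundary and lookaround ideas can be tried outside the browser as a pure function. This is a sketch under assumptions: the name boldWord is hypothetical, and it uses a negative lookbehind (?<!>) to test the preceding character, which needs an engine with ES2018 lookbehind support.

```javascript
// Wrap whole-word, case-insensitive occurrences of `word` in <b> tags,
// skipping occurrences that directly follow ">" or directly precede "<"
// (a rough way to avoid wrapping text that is already inside a tag).
function boldWord(text, word) {
  // Escape regex metacharacters in the word.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp("(?<!>)\\b(" + escaped + ")\\b(?!<)", "gi");
  // $1 re-inserts the captured word with its original capitalization.
  return text.replace(re, "<b>$1</b>");
}

console.log(boldWord("<b>Hello</b> hello hellonearth HELLO", "hello"));
// <b>Hello</b> <b>hello</b> hellonearth <b>HELLO</b>
```

Checking single characters around the match is only a heuristic; it does not parse HTML, so nested or unusual markup can still slip through.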
I think I'll leave the introduction to RegExp at that... even though we've only scratched the surface, I think it's quite clear now just how powerful regex's can be. Put some time and effort into learning more about this fantastic tool, it is well worth it.
If it is a <textarea>, then you need to use the .value property.
document.getElementById("eC").value = newText;
And, as Barmar mentioned, replace() with a string pattern replaces only the first occurrence. To replace every occurrence, you need to use a simple regex. Note that I removed the quotes. /g means global replace.
var newText = text.replace(/hello/g, '<b>hello</b>');
But if you really want to bold your text, you need to use a contenteditable div, not a textarea:
<div id="eC" contenteditable></div>
So then you need to access innerHTML:
function wordCheck() {
var text = document.getElementById("eC").innerHTML;
var newText = text.replace(/hello/g, '<b>hello</b>');
newText = newText.replace(/<b><b>/g,"<b>");//These two lines are there to prevent <b><b>hello</b></b>
newText = newText.replace(/<\/b><\/b>/g,"</b>");
document.getElementById("eC").innerHTML = newText;
}

Ignore Word List

I have a list of words to ignore. However, when I call it, it replaces every instance of it even when it's inside a string.
For example: "he" ends up turning "the" into "t".
How can I have it just remove the words when they're on their own?
Here's the code:
var commonWords=/and|a|an|has|he|to|was|in|were|are|is|will|as|it|if|
with|at|its|it's|be|by|on|that|from|the|about|again|all|almost|also|although|
always|among|another|any|be|because|been|before|being|between|both|by|can|could|
did|do|does|doesn't|'|done|due|during|each|either|enough|from|had|has|have|having|
here|i|if|into|is|isn't|itself|just|may|might|most|mostly|must|nor|no|neither|nearly|
of|often|on|our|ours|his|hers|he's|he|she|she's|overall|perhaps|quite|rather|really|
regarding|seem|seems|seen|several|should|show|showewd|shown|shows|significant|
significantly|since|so|some|such|than|that|then|their|theirs|there's|therefore|these|
they|this|those|through|thus|to|upon|use|used|using|various|very|was|we|were|what|when|
which|while|with|within|without|would|however|or|for|the|but|etc|yet|/g;
commonWords.ignoreCase;
var w = w.replace(commonWords, '');
You're not trying to replace any instance in a string, you want to replace whole words. You need to look for word boundaries using the \b anchor.
For example...
var commonWords = /\b(and|a|an|has|he|she)\b/g;
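For a long word list it can be easier to keep the words in an array and build the pattern from it. The sketch below assumes that approach; the function name removeCommonWords and the short sample list are hypothetical. Longer words are tried first so that, for example, "it's" is not cut down to "'s" by a shorter entry such as "it".

```javascript
// Remove whole-word, case-insensitive occurrences of the given words.
function removeCommonWords(text, words) {
  const pattern = [...words]
    // Try longer alternatives first (e.g. "it's" before "it").
    .sort((x, y) => y.length - x.length)
    // Escape regex metacharacters in each word.
    .map(w => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("|");
  const re = new RegExp("\\b(?:" + pattern + ")\\b", "gi");
  // Collapse the leftover whitespace after removing the words.
  return text.replace(re, "").replace(/\s+/g, " ").trim();
}

const sampleWords = ["and", "a", "an", "has", "he", "the", "it", "it's"];
console.log(removeCommonWords("The cat and the theory he has", sampleWords));
// cat theory
```

Passing the i flag to the RegExp constructor replaces the commonWords.ignoreCase line in the question, which only reads a property and does not change the pattern.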

replacing spaces in a string with hyphens

I have a string and I need to fix it in order to append it to a query.
Say I have the string "A Basket For Every Occasion" and I want it to be "A-Basket-For-Every-Occasion"
I need to find a space and replace it with a hyphen. Then, I need to check if there is another space in the string. If not, return the fixed string. If so, run the same process again.
Sounds like a recursive function to me but I am not sure how to set it up. Any help would be greatly appreciated.
You can use a regex replacement like this:
var str = "A Basket For Every Occasion";
str = str.replace(/\s/g, "-");
The "g" flag in the regex will cause all spaces to get replaced.
You may want to collapse multiple spaces to a single hyphen so you don't end up with multiple dashes in a row. That would look like this:
var str = "A Basket For Every Occasion";
str = str.replace(/\s+/g, "-");
Use replace and find for whitespaces \s globally (flag g)
var a = "asd asd sad".replace(/\s/g,"-");
a becomes
"asd-asd-sad"
Try
value = value.split(' ').join('-');
I used this to get rid of my spaces. Instead of a hyphen I joined with an empty string, and it works great. It is also plain JS. .split(delimiter) removes the delimiter and puts the pieces of the string in an array (with no delimiter elements); then you can join the array with hyphens.
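The approaches above behave differently when the input contains runs of spaces. A small comparison, using a made-up title with extra spaces and a hypothetical function name hyphenate:

```javascript
// Replace each run of whitespace with a single hyphen,
// ignoring leading and trailing whitespace.
function hyphenate(text) {
  return text.trim().replace(/\s+/g, "-");
}

const title = "A  Basket For   Every Occasion";

console.log(hyphenate(title));           // A-Basket-For-Every-Occasion
console.log(title.split(" ").join("-")); // A--Basket-For---Every-Occasion
```

No recursion is needed in either case: the g flag (or split/join) already handles every occurrence in one pass.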

remove all but a specific portion of a string in javascript

I am writing a little app for Sharepoint. I am trying to extract some text from the middle of a field that is returned:
var ows_MetaInfo="1;#Subject:SW|NameOfADocument
vti_parservers:SR|23.0.0.6421
ContentTypeID:SW|0x0101001DB26Cf25E4F31488B7333256A77D2CA
vti_cachedtitle:SR|NameOfADocument
vti_title:SR|ATitleOfADocument
_Author:SW:|TheNameOfOurCompany
_Category:SW|
ContentType:SW|Document
vti_author::SR|mrwienerdog
_Comments:SW|This is very much the string I need extracted
vti_categories:VW|
vtiapprovallevel:SR|
vti_modifiedby:SR|mrwienerdog
vti_assignedto:SR|
Keywords:SW|Project Name
ContentType _Comments"
So......All I want returned is "This is very much the string I need extracted"
Do I need a regex and a string replace? How would you write the regex?
Yes, you can use a regular expression for this (this is the sort of thing they are good for). Assuming you always want the string after the pipe (|) on the line starting with "_Comments:SW|", here's how you can extract it:
var matchresult = ows_MetaInfo.match(/^_Comments:SW\|(.*)$/m);
var comment = (matchresult==null) ? "" : matchresult[1];
Note that the .match() method of the String object returns an array. The first (index 0) element will be the entire match (here, the entire match is the whole line, as we anchored it with ^ and $; note that adding the "m" after the regex makes this a multiline regex, allowing us to match the start and end of any line within the multi-line input), and the rest of the array are the submatches that we capture using parentheses. Above we've captured the part of the line that you want, so that will be present in the second item in the array (index 1).
If there is no match ("_Comments:SW|" doesn't appear in ows_MetaInfo), then .match() will return null, which is why we test it before pulling out the comment.
If you need to adjust the regex for other scenarios, have a look at the Regex docs on Mozilla Dev Network: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
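The same pattern can be generalized to any field name. This is a sketch only: the function name getMetaField is hypothetical, and the sample string is a shortened stand-in for the real ows_MetaInfo value, assumed to have one "name|value" pair per line.

```javascript
// Return the text after the pipe on the line that starts with `key`,
// or an empty string if no such line exists.
function getMetaField(metaInfo, key) {
  // Escape regex metacharacters in the key (it may contain ":" or "_" etc.).
  const escaped = key.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The "m" flag lets ^ and $ match at the start and end of each line.
  const result = metaInfo.match(new RegExp("^" + escaped + "\\|(.*)$", "m"));
  return result === null ? "" : result[1];
}

// Shortened, made-up sample in the same shape as the question's data.
const sampleMeta = [
  "1;#Subject:SW|NameOfADocument",
  "_Comments:SW|This is very much the string I need extracted",
  "vti_categories:VW|"
].join("\n");

console.log(getMetaField(sampleMeta, "_Comments:SW"));
```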
You can use this code:
var match = ows_MetaInfo.match(/_Comments:SW\|([^\n]+)/);
if (match)
document.writeln(match[1]);
I'm far from competent with RegEx, so here is my RegEx-less solution. See comments for further detail.
var extractedText = ExtractText(ows_MetaInfo);
function ExtractText(arg) {
// Use the pipe delimiter to turn the string into an array
var aryValues = arg.split("|");
// The comment text ends up in the same piece as the next field name,
// so find the piece that contains "vti_categories:VW" and strip that name
for (var i = 0; i < aryValues.length; i++) {
if (aryValues[i].indexOf("vti_categories:VW") != -1)
return aryValues[i].replace("vti_categories:VW", "").trim();
}
return null;
}
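A regex-free alternative that does not depend on which field happens to follow the comment is to parse the whole value into name/value pairs. This is a sketch under the assumption that each line holds one "name|value" pair; the function name parseMetaInfo and the sample string are hypothetical.

```javascript
// Parse "name|value" lines into a Map, splitting each line at its first pipe.
function parseMetaInfo(metaInfo) {
  const fields = new Map();
  for (const line of metaInfo.split("\n")) {
    const pipe = line.indexOf("|");
    if (pipe === -1) {
      continue; // skip lines without a pipe
    }
    fields.set(line.slice(0, pipe), line.slice(pipe + 1));
  }
  return fields;
}

// Shortened, made-up sample in the same shape as the question's data.
const metaSample = [
  "1;#Subject:SW|NameOfADocument",
  "_Comments:SW|This is very much the string I need extracted",
  "vti_categories:VW|"
].join("\n");

console.log(parseMetaInfo(metaSample).get("_Comments:SW"));
```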
