Encoding/decoding issue of a string in HTML5 - javascript

I am trying to display the string below in the browser, but the moment the browser encounters a custom tag such as "<authorization-code>" in the string, it just skips it.
I tried encodeURIComponent() and later decodeURIComponent() in my HTML template, but that did not work.
I even tried to sanitize the HTML by creating a pipe like the one below, but no luck.
transform(v: string): SafeHtml {
  return this._sanitizer.bypassSecurityTrustHtml(v);
}
Somehow, the browser skips elements like <authorization-code> in the string.
The string is as follows:
“The endpoint is browser-based, rather than RESTful. Therefore it could
result in the following different scenarios,↵1. SUCCESS
(response_type=code)↵> redirect_uri?code=<authorization-code>&scope=
<resource-owner-approved-scopes>[&state=<state-provided-by-the-client>]."

Never insert text with innerHTML; only insert HTML with innerHTML.
Browsers nowadays have a textContent property:
yourDiv.textContent = stringFromServer;
Also, just to clear up the common mistake you and many other developers make, encodeURIComponent is meant to encode a string for insertion into a URL, not insertion into HTML. Same goes with encodeURI (which you probably should never use anyway).
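To make the distinction concrete, here is a small sketch contrasting URL encoding with HTML escaping. The escapeHtml helper and the sample string are hypothetical, introduced only for illustration; escapeHtml is not a built-in, and in the browser, assigning to textContent makes such escaping unnecessary.

```javascript
// Hypothetical helper (not a built-in): escapes the characters that are
// significant in HTML so a string can be embedded in markup as literal text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Sample input modelled on the string in the question.
const sample = 'redirect_uri?code=<authorization-code>&scope=<scopes>';

// Percent-encoding: the right tool for a URL component.
const forUrl = encodeURIComponent(sample);

// Entity escaping: the right tool when building an HTML string.
const forHtml = escapeHtml(sample);

console.log(forUrl);
console.log(forHtml);
```

The two outputs differ completely: the first uses %XX sequences that HTML does not interpret, which is why decoding it in the template did not help.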
Update:
As stated in your comments, you actually want to transform the text into HTML using some rules, rather than insert it as plain text, which is subject to the browser's usual whitespace normalization (newlines collapse into spaces).
There are many options for this. Here are two:
Still insert just the plain text, but set the CSS style in the container to white-space: pre. This changes the way that whitespace is rendered, so newlines cause line breaks.
Split your original string, then intersperse your div with text nodes and <br/> elements.
Code for the latter could look something like this:
function insertFormattedText(container, text) {
  const chunks = text.split('\n');
  chunks.forEach((chunk, i) => {
    container.appendChild(document.createTextNode(chunk));
    if (i < chunks.length - 1) {
      container.appendChild(document.createElement('br'));
      // Equivalent of `&nbsp;&nbsp;`
      container.appendChild(document.createTextNode('\u00A0\u00A0'));
    }
  });
}

Related

Javascript: how to convert an element into an HTML-evaluated string?

Here's a simplified example of what I'd like to do:
var footnote = somewhere.innerHTML // This is <q>the note</q>.
var result = ???(footnote)
target.setAttribute("title", result) // This is "the note".
I've tried various methods and functions for the "???", but end up with either the raw tags displayed in the title, or with plain text and no quotation marks.
Other than processing all the inner tags myself, is there a way to convert an element into a string that contains how it would appear when HTML expanded?
Clarification:
I thought it was obvious from the "I have" and "I want" values indicated in the code comments, but this is what I want to do:
I have an element (say a <p> if you need a specific type)
that has content "This is <q>the note</q>."
I want something that will convert it into a string suitable for use in a title="..." attribute in some other element.
Displayable internal tags (in this specific example <q>) need to be HTML-interpreted so that they display as actual quotation marks, ideally handling nested quotations.
innerHTML conversion to string leaves the raw tags in place.
innerText conversion to string ignores the tags and produces no quotation marks.
Is there some other way of doing the HTML interpretation other than by writing my own function to process it?
When you add the q tag, you're actually adding a text node (this is what you get with textContent, innerText, etc.) that has two CSS pseudo-elements around it: the opening and closing quotation marks.
Neither pseudo-elements nor pseudo-classes appear in the document source or document tree. They don't actually exist in the DOM and therefore are not selectable and won't show up in any values of the element's properties.
In short, using <q></q> is more semantic markup, but if you want to represent those quotations outside the scope of the view, you may want to use the traditional " character instead.
Example:
let p = document.querySelector("p"), div = document.querySelector("div");
div.title = p.textContent;
console.log(div.title);
<p>"Example Text"</p>
<div></div>
Additionally, though I will say that I don't recommend it, if you really wanted to keep what you have and you're not too concerned with optimization you could simply use a replace:
let p = document.querySelector("p"), div = document.querySelector("div");
div.title = p.innerHTML.replace(/<q>|<\/q>/gmi, '"');
console.log(div.title);
<p><q>Example Text</q></p>
<div></div>
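Since the question also asks about nested quotations, here is a minimal sketch that extends the replace idea by tracking nesting depth and alternating between double and single curly quotes. The function name qTagsToQuotes is hypothetical, and the choice of quote characters is an assumption (browsers pick them from the CSS quotes property and the element's language). It works on the markup string only, leaves any other tags in place, and does not decode entities.

```javascript
// Hypothetical helper: replaces <q> and </q> in an HTML string with curly
// quotes, alternating double/single marks by nesting depth.
function qTagsToQuotes(html) {
  let depth = 0;
  return html.replace(/<(\/?)q\b[^>]*>/gi, (match, slash) => {
    if (slash) {
      // Closing tag: step back out one level, then pick the mark for it.
      depth = Math.max(0, depth - 1);
      return depth % 2 === 0 ? '\u201D' : '\u2019';
    }
    // Opening tag: pick the mark for the current level, then go deeper.
    const mark = depth % 2 === 0 ? '\u201C' : '\u2018';
    depth += 1;
    return mark;
  });
}

const flat = qTagsToQuotes('This is <q>the note</q>.');
const nested = qTagsToQuotes('<q>outer <q>inner</q> outer</q>');

console.log(flat);
console.log(nested);
```

In a page, the result could then be assigned with something like target.title = qTagsToQuotes(p.innerHTML), assuming the paragraph contains no markup other than q elements.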

Javascript - search for HTML elements in string

I have a string containing HTML elements. There are tables with captions. I need to find the table whose caption contains certain text and then return that table as a string.
What is the best way to do this with plain JavaScript, without any libraries?
For example, this is the initial string:
<table border="1"><caption><strong>First</strong></caption><tbody><tr><td>...</td></tr></tbody></table><table border="1"><caption><strong>Result</strong></caption><tbody><tr><td>...</td></tr></tbody></table><table border="1"><caption><strong>Last</strong></caption><tbody><tr><td>...</td></tr></tbody></table>
I want to get this string :
<table border="1"><caption><strong>Result</strong></caption><tbody><tr><td></td></tr></tbody></table>
Any advice or algorithm for solving this problem efficiently? The challenge is to solve it with JavaScript without using any third-party libraries and without converting the text into XML or something similar (because some of the HTML is not well formed and that causes errors).
I have not had time to test this completely, but you could try a regular expression with the match() function. Assuming your table string is in a variable called str, then something along the lines of
var res = str.match(/<table[\s\S]*?<\/table>/gi);
res will be an array of the substrings that begin with '<table' and end with '</table>' (or null if nothing matches), which you could then check to see which one contains the caption that you need.
Hope that helps!
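The two steps described above (collect the tables, then check each caption) could be sketched as follows. The function name findTableByCaption is hypothetical, and the approach assumes tables are not nested inside one another, since the non-greedy match stops at the first closing table tag.

```javascript
// Hypothetical helper: returns the first <table>...</table> substring whose
// caption text equals captionText, or null if there is none.
// Assumes tables are not nested.
function findTableByCaption(html, captionText) {
  const tables = html.match(/<table\b[\s\S]*?<\/table>/gi) || [];
  for (const table of tables) {
    const caption = /<caption\b[^>]*>([\s\S]*?)<\/caption>/i.exec(table);
    if (!caption) continue;
    // Drop inner tags such as <strong> before comparing the text.
    const text = caption[1].replace(/<[^>]*>/g, '').trim();
    if (text === captionText) return table;
  }
  return null;
}

const str =
  '<table border="1"><caption><strong>First</strong></caption><tbody><tr><td>...</td></tr></tbody></table>' +
  '<table border="1"><caption><strong>Result</strong></caption><tbody><tr><td>...</td></tr></tbody></table>' +
  '<table border="1"><caption><strong>Last</strong></caption><tbody><tr><td>...</td></tr></tbody></table>';

const found = findTableByCaption(str, 'Result');
console.log(found);
```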

Assign a HTML-entity as an attribute value using JavaScript/jQuery

I've been following this technique on css-tricks to add icons to a website: http://css-tricks.com/html-for-icon-font-usage/
I've got this in my CSS:
[data-icon]:before {
font-family: Symbol;
content: attr(data-icon);
speak: none;
}
However, part of my interface is generated using jQuery. Here's one such control:
var $control = $('<div>', {
'aria-hidden':'true'
, 'data-icon': ''
});
I have tried encoding it as \e01c, \\e01c, and as an HTML entity; you name it, I've probably tried it. The result is always the same: either everything after the ampersand is rendered to the screen, because the ampersand comes out as &amp; in the source code, or the backslashes show up.
I tried concatenating in the CSS content:
content: "&" attr(data-icon) ";";
and just including the number in the data but the ampersand still shows up encoded on its own.
Is there any way to encode this entity and have it output to the page correctly?
To encode codepoint 57372 in JavaScript, use '\ue01c'.
You should be aware that it won't render well, since Unicode defines it as an unassigned codepoint in the Private Use Area:
http://www.unicode.org/charts/PDF/UE000.pdf
Private Use Area
Range: E000-F8FF
The Private Use Area does not contain any character assignments, consequently no character code charts or names lists are
provided for this area.
Maybe you meant another code-point.
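As a quick check of the escape above, the following sketch shows that the decimal code point 57372, the hexadecimal value e01c, and the '\ue01c' escape all denote the same character. The decodeNumericEntities helper is a hypothetical name added for illustration; it handles only numeric references of the form &#NNN; and &#xHHH;, not named entities.

```javascript
const codePoint = 57372;
const hex = codePoint.toString(16);                  // 'e01c'
const fromEscape = '\ue01c';                         // JavaScript string escape
const fromNumber = String.fromCodePoint(codePoint);  // same character

// Hypothetical helper: turns numeric character references into characters.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|\d+);/gi, (match, code) => {
    const isHex = code[0] === 'x' || code[0] === 'X';
    const value = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    // Leave out-of-range references untouched instead of throwing.
    return value > 0x10ffff ? match : String.fromCodePoint(value);
  });
}

console.log(hex);
console.log(fromEscape === fromNumber);
console.log(decodeNumericEntities('&#57372;') === fromEscape);
```

With a helper like this, the attribute could be set from a plain string, for example 'data-icon': decodeNumericEntities('&#57372;'), without the ampersand ever reaching the markup.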
This is a bit hackish but should work:
'data-icon': $("<span>&entity;</span>").contents().get(0).nodeValue // replace &entity; with your actual entity
It works as follows:
$("<span>&entity;</span>") creates a span element with one child of type textNode. .contents() is used to extract all children nodes including the text node, whose nodeValue is then assigned to data-icon attribute.

How can I separately retrieve the HTML that's before and after a child element inside a parent element?

We're writing a web app that relies on Javascript/jQuery. It involves users filling out individual words in a large block of text, kind of like Mad Libs. We've created a sort of HTML format that we use to write the large block of text, which we then manipulate with jQuery as the user fills it out.
Part of a block of text might look like this:
<span class="fillmeout">This is a test of the <span>NOUN</span> Broadcast System.</span>
Given that markup, I need to separately retrieve and manipulate the text before and after the inner <span>; we're calling those the "prefix" and "suffix".
I know that you can't parse HTML with simple string manipulation, but I tried anyway; I tried using split() on the <span> and </span> tags. It seemed simple enough. Unfortunately, Internet Explorer casts all HTML tags to uppercase, so that technique fails. I could write a special case, but the error has taught me to do this the right way.
I know I could simply use extra HTML tags to manually denote the prefix and suffix, but that seems ugly and redundant; I'd like to keep our markup format as lean and readable and writable as possible.
I've looked through the jQuery docs, and can't find a function that does exactly what I need. There are all sorts of functions to add stuff before and after and around and inside elements, but none that I can find to retrieve what's already there. I could remove the inner <span>, but then I don't know how I can tell what came before the deleted element apart from what came after it.
Is there a "right" way to do what I'm trying to do?
With simple string manipulations you can also use Regex.
That should solve your problem.
var array = $('.fillmeout').html().split(/<\/?span>/i);
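To see what that split produces without a browser, here is a small sketch on a plain string. The splitAroundSpan name is hypothetical, and the input stands in for what .html() would return, written with uppercase tags to mimic the Internet Explorer behaviour mentioned in the question. It assumes exactly one inner span with no attributes.

```javascript
// Hypothetical helper: splits inner HTML around a single attribute-less
// <span>...</span>, matching the tags case-insensitively.
function splitAroundSpan(innerHtml) {
  const [prefix, word, suffix] = innerHtml.split(/<\/?span>/i);
  return { prefix, word, suffix };
}

const parts = splitAroundSpan(
  'This is a test of the <SPAN>NOUN</SPAN> Broadcast System.'
);

console.log(parts.prefix);
console.log(parts.word);
console.log(parts.suffix);
```

If the inner span ever gains a class or other attribute, the pattern no longer matches the opening tag, which is one reason the DOM-based answers are more robust.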
Use your jQuery API! $('.fillmeout').children() and then you can manipulate that element as required.
http://api.jquery.com/children/
For completeness, I thought I should point out that the cleanest answer is to put the prefix and suffix text each in its own <span>, like this, so that you can use jQuery selectors and methods to access the desired text directly:
<span class="fillmeout">
<span class="prefix">This is a test of the </span>
<span>NOUN</span>
<span class="suffix"> Broadcast System.</span>
</span>
Then, the code would be as simple as:
var fillme = $(".fillmeout").eq(0);
var prefix = fillme.find(".prefix").text();
var suffix = fillme.find(".suffix").text();
FYI, I would not call this level of simplicity "ugly and redundant" as you theorized. You're using HTML markup to delineate the text into separate elements that you want to separately access. That's just smart, not redundant.
By way of analogy, imagine you have toys of three separate colors (red, white and blue) and they are initially organized by color and you know that sometime in the future you are going to need to have them separated by color again. You also have three boxes to store them in. You can either put them all in one box now and manually sort them out by color again later or you can just take the already separated colors and put them each into their own box so there's no separation work to do later. Which is easier? Which is smarter?
HTML elements are like the boxes. They are containers for your text. If you want the text separated out in the future, you might as well put each piece of text into its own named container so it's easy to access just that piece of text later.
Several of these answers almost got me what I needed, but in the end I found a function not mentioned here: .contents(). It returns all child nodes, including text nodes, as a jQuery collection that I can then iterate over (recursively if needed) to find what I need.
I'm not sure if this is the 'right' way either, but you could replace the SPANs with an element you could consistently split the string on:
jQuery('.fillmeout span').replaceWith('|');
http://api.jquery.com/replaceWith/
http://jsfiddle.net/mdarnell/P24se/
You could use
$('.fillmeout span').get(0).previousSibling.textContent
$('.fillmeout span').get(0).nextSibling.textContent
This works in IE9, but sadly not in IE versions earlier than 9.
Based on your example, you could use your target as a delimiter to split the sentence.
var str = $('.fillmeout').html();
str = str.split('<span>NOUN</span>');
This would return an array of ["This is a test of the ", " Broadcast System."]. Here's a jsFiddle example.
You could just use the nextSibling and previousSibling native JavaScript (coupled with jQuery selectors):
$('.fillmeout span').each(function () {
  var prefix = this.previousSibling.nodeValue,
      suffix = this.nextSibling.nodeValue;
});
References:
each().
node.nextSibling.
node.previousSibling.
If you want to use the DOM instead of parsing the HTML yourself and you can't put the desired text in its own elements, then you will need to look through the DOM for text nodes and find the text nodes before and after the span tag.
jQuery isn't a whole lot of help when dealing with text nodes instead of element nodes, so the work is mostly done in plain JavaScript like this:
$(".fillmeout").each(function() {
  var node = this.firstChild, prefix = "", suffix = "", foundSpan = false;
  while (node) {
    if (node.nodeType == 3) {
      // if text node
      if (!foundSpan) {
        prefix += node.nodeValue;
      } else {
        suffix += node.nodeValue;
      }
    } else if (node.nodeType == 1 && node.tagName == "SPAN") {
      // if element and span tag
      foundSpan = true;
    }
    node = node.nextSibling;
  }
  // here prefix and suffix are the text before and after the first
  // <span> tag in the HTML
  // You can do with them what you want here
});
Note: This code does not assume that all text before the span is located in one text node and one text node only. It might be, but it also might not be so it collates all the text nodes together that are before and after the span tag. The code would be simpler if you could just reference one text node on each side, but it isn't 100% certain that that is a safe assumption.
This code also handles the case where there is no text before or after the span.
You can see it work here: http://jsfiddle.net/jfriend00/P9YQ6/

Best way to pick up text in a HTML element that is in the parent node only

I have, for example, markup like this
<div id="content">
<p>Here is some wonderful text, and here is a link. All links should have a `href` attribute.</p>
</div>
Now I want to be able to perform some regex replace on the text inside the p element, but not in any HTML, i.e. be able to match the href within backticks, but not inside the anchor element.
I thought about regex, but as the general consensus is, I shouldn't be using them to parse HTML.
My current method of doing this is like so: I've got a bunch of words in an array, and I am looping through them and making an object of data like so:
termsData[term] = {
  regex: new RegExp('(\\b' + term + '\\b)', 'gmi'),
  replaceWith: '<span>{TERM}</span>'
};
I then loop through it again, making the replacements like so:
var html = obj.html();
$.each(terms, function(i, term) {
  // Replace each word in the HTML with the span
  html = html.replace(termsData[term].regex, termsData[term].replaceWith.replace(/{TERM}/, '$1'));
});
obj.html(html);
Now, I did a lot of this last night at an ungodly hour, and copying and pasting it here makes me think I should refactor some of it.
So, as you should be able to tell, I want to be able to replace plain text, but not anything inside an HTML tag.
What would be the best way to do it?
Note: The source code is coming from here if you'd like a better look.
You're right to not want to be processing HTML with regex. It's also bad news to be assigning huge chunks of .html(); apart from the performance drawbacks of serialising and reparsing a large amount of HTML, you'll also lose unserialisable data like event listeners, form data and JS properties/references.
See the findText function in this answer and call something like (assuming obj is a jQuery wrapper over your topmost node to search in):
findText(obj[0], /\b(term1|term2|term3)\b/g, function(node, match) {
  var span = document.createElement('span');
  node.splitText(match.index + match[0].length);
  span.appendChild(node.splitText(match.index));
  node.parentNode.insertBefore(span, node.nextSibling);
});
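The findText helper that this call relies on is not reproduced here. As a rough sketch of what such a helper might look like (an assumption about its shape, not the linked answer's actual code), the version below walks child nodes in reverse, collects regex matches in each text node, and reports them last to first so that a callback that splits the node does not invalidate earlier match indexes. It relies only on nodeType, childNodes and data, and requires a pattern with the g flag. buildTermsPattern is a hypothetical companion that escapes each term before joining the terms into one pattern.

```javascript
// Hypothetical helper: builds one global, case-insensitive pattern from a
// list of plain-text terms, escaping regex metacharacters in each term.
function buildTermsPattern(terms) {
  const escaped = terms.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  return new RegExp('\\b(' + escaped.join('|') + ')\\b', 'gi');
}

// Sketch of a findText-style walker (an assumed shape, see the note above).
function findText(element, pattern, callback) {
  if (!pattern.global) {
    throw new Error('pattern must use the g flag');
  }
  // Walk children backwards so DOM changes made by the callback do not
  // disturb the nodes still to be visited.
  for (let i = element.childNodes.length - 1; i >= 0; i--) {
    const child = element.childNodes[i];
    if (child.nodeType === 1) {
      // Element node: recurse into it.
      findText(child, pattern, callback);
    } else if (child.nodeType === 3) {
      // Text node: collect every match first, then report in reverse.
      const matches = [];
      let match;
      pattern.lastIndex = 0;
      while ((match = pattern.exec(child.data)) !== null) {
        matches.push(match);
        if (match[0] === '') pattern.lastIndex += 1; // avoid an endless loop
      }
      for (let j = matches.length - 1; j >= 0; j--) {
        callback(child, matches[j]);
      }
    }
  }
}
```

Because only text nodes are searched, attribute values and tag names are never touched, which is the property the question asks for.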
