This question already exists:
Decode & back to & in JavaScript [duplicate]
Closed 5 years ago.
The current problem I'm having is that I'm received a string from the Database with ascii mixed with the string. I would like to know how to create a converstion function to change the string to the output below.
var string = 'Mc'Dought';
Output an end result of Mc'dought on the webpage.
function convert(str){
//Code Here
}
Be careful what you ask for.
You're looking at the escape sequence for single quote. If you throw it onto the page, it'll display properly.
On the other hand, if you insist on converting it to an actual single quote character, you could run into problems.
Check out this code snippet to see the consequences:
<INPUT Value="Mc'Dought">
<BR>
<INPUT Value="Mc'Dought">
<BR>
<SPAN>Mc'Dought</SPAN>
<BR>
<SPAN>Mc'Dought</SPAN>
<BR>
<INPUT Value='Mc'Dought'> <!-- Problems!! -->
<BR>
<SPAN>Mc'Dought</SPAN>
However, if you understand all this and still want to unescape it (e.g. you need to use the string to set the window title or issue an alert), I'd suggest you use an accepted HTML-unescape method, rather than replacing just '. See this answer.
Jquery:
function escapeHtml(unsafe) {
return $('<div />').text(unsafe).html()
}
function unescapeHtml(safe) {
return $('<div />').html(safe).text();
}
Plain Javascript:
function unescapeHtml(safe) {
return safe.replace(/&/g, '&')
.replace(/</g, '<')
.replace(/>/g, '>')
.replace(/"/g, '"')
.replace(/'/g, "'");
}
Use String#replace with a regular expression and a replacement function. You can parse the character code from each HTML entity using the regex, and then call String.fromCharCode to turn each code into its character form:
var string = 'Mc'Dought'
function convert (string) {
return string.replace(/&#(?:x([\da-f]+)|(\d+));/ig, function (_, hex, dec) {
return String.fromCharCode(dec || +('0x' + hex))
})
}
console.log(convert(string))
Related
I currently have a working WordWrap function in Javascript that uses RegEx. I pass the string I want wrapped and the length I want to begin wrapping the text, and the function returns a new string with newlines inserted at appropriate locations in the string as shown below:
wordWrap(string, width) {
let newString = string.replace(
new RegExp(`(?![^\\n]{1,${width}}$)([^\\n]{1,${width}})\\s`, 'g'), '$1\n'
);
return newString;
}
For consistency purposes I won't go into, I need to use an identical or similar RegEx in C#, but I am having trouble successfully replicating the function. I've been through a lot of iterations of this, but this is what I currently have:
private static string WordWrap(string str, int width)
{
Regex rgx = new Regex("(?![^\\n]{ 1,${" + width + "}}$)([^\\n]{1,${" + width + "}})\\s");
MatchCollection matches = rgx.Matches(str);
string newString = string.Empty;
if (matches.Count > 0)
{
foreach (Match match in matches)
{
newString += match.Value + "\n";
}
}
else
{
newString = "No matches found";
}
return newString;
}
This inevitably ends up finding no matches regardless of the string and length I pass. I've read that the RegEx used in JavaScript is different than the standard RegEx functionality in .NET. I looked into PCRE.NET but have had no luck with that either.
Am I heading in the right general direction with this? Can anyone help me convert the first code block in JavaScript to something moderately close in C#?
edit: For those looking for more clarity on what the working function does and what I am looking for the C# function to do: What I am looking to output is a string that has a newline (\n) inserted at the width passed to the function. One thing I forgot to mention (but really isn't related to my issue here) is that the working JavaScript version finds the end of the word so it doesn't cut up the word. So for example this string:
"This string is really really long so we want to use the word wrap function to keep it from running off the page.\n"
...would be converted to this with the width set to 20:
"This string is really \nreally long so we want \nto use the word wrap \nfunction to keep it \nfrom running off the \npage.\n"
Hope that clears it up a bit.
JavaScript and C# Regex engines are different. Also each language has it's own regex pattern executor, so Regex is language dependent. It's not the case, if it is working for one language so it will work for another.
C# supports named groups while JavaScript doesn't support them.
So you can find multiple difference between these two languages regex.
There are issues with the way you've translated the regex pattern from a JavaScript string to a C# string.
You have extra whitespace in the c# version, and you've also left in $ symbols and curly brackets { that are part of the string interpolation syntax in the JavaScript version (they are not part of the actual regex pattern).
You have:
"(?![^\\n]{ 1,${" + width + "}}$)([^\\n]{1,${" + width + "}})\\s"
when what I believe you want is:
"(?![^\\n]{1," + width + "}$)([^\\n]{1," + width + "})\\s"
I'm trying to replace double quotes with curly quotes, except when the text is wrapped in certain tags, like [quote] and [code].
Sample input
[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>"Why no goodbye?" replied [b]Bob[/b]. "It's always Hello!"</p>
Expected output
[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>“Why no goodbye?” replied [b]Bob[/b]. “It's always Hello!”</p>
I figured how to elegantly achieve what I want in PHP by using (*SKIP)(*F), however my code will be run in javascript, and the javascript solution is less than ideal.
Right now I'm splitting the string at those tags, running the replace, then putting the string together:
var o = 3;
a = a
.split(/(\[(?<first>(?:icode|quote|code))[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(?:\k<first>)\])/i)
.map(function(x,i) {
if (i == o-1 && x) {
x = '';
}
else if (i == o && x)
{
x = x.replace(/(?![^<]*>|[^\[]*\])"([^"]*?)"/gi, '“$1”')
o = o+3;
}
return x;
}).join('');
Javascript Regex Breakdown
Inside split():
(\[(?<first>icode|quote|code)[^\]]*?\](?:.)*?\[\/(\k<first>)\]) - captures the pattern inside parentheses:
\[(?<first>quote|code|icode)[^\]]*?\] - a [quote], [code], or [icode] opening tag, with or without parameters like =html, eg [code=html]
(?:[\s]*?.)*? - any 0+ (as few as possible) occurrences of any char (.), preceded or not by whitespace, so it doesn't break if the opening tag is followed by a line break
[\s]*? - 0+ whitespaces
\[\/(\k<first>)\] - [\quote], [\code], or [\icode] closing tags. Matches the text captured in the (?<first>) group. Eg: if it's a quote opening tag, it'll be a quote closing tag
Inside replace():
(?![^<]*>|[^\[]*\])"([^"]*?)" - captures text inside double quotes:
(?![^<]*>|[^\[]*\]) - negative lookahead, looks for characters (that aren't < or [) followed by either > or ] and discards them, so it won't match anything inside bbcode and html tags. Eg: [spoiler="Name"] or <span style="color: #24c4f9">. Note that matches wrapped in tags are left untouched.
" - literal opening double quotes character.
([^"]*?) - any 0+ character, except double quotes.
" - literal closing double quotes character.
SPLIT() REGEX DEMO: https://regex101.com/r/Ugy3GG/1
That's awful, because the replace is executed multiple times.
Meanwhile, the same result can be achieved with a single PHP regex. The regex I wrote was based on Match regex pattern that isn't within a bbcode tag.
(\[(?<first>quote|code|icode)[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(\k<first>)\])(*SKIP)(*F)|(?![^<]*>|[^\[]*\])"([^"]*?)"
PHP Regex Breakdown
(\[(?<first>quote|code|icode)[^\]]*?\](?:[\s]*?.)*?[\s]*?\[\/(\k<first>)\])(*SKIP)(*F) - matches the pattern inside capturing parentheses just like javascript split() above, then (*SKIP)(*F) make the regex engine omit the matched text.
| - or
(?![^<]*>|[^\[]*\])"([^"]*?)" - captures text inside double quotes in the same way javascript replace() does
PHP DEMO: https://regex101.com/r/fB0lyI/1
The beauty of this regex is that it only needs to be run once. No splitting and joining of strings. Is there a way to implement it in javascript?
Because JS lacks backtracking verbs you will need to consume those bracketed chunks but later replace them as is. By obtaining the second side of the alternation from your own regex the final regex would be:
\[(quote|i?code)[^\]]*\][\s\S]*?\[\/\1\]|(?![^<]*>|[^\[]*\])"([^"]*)"
But the tricky part is using a callback function with replace() method:
str.replace(regex, function($0, $1, $2) {
return $1 ? $0 : '“' + $2 + '”';
})
Above ternary operator returns $0 (whole match) if first capturing group exists otherwise it encloses second capturing group value in curly quotes and returns it.
Note: this may fail in different cases.
See live demo here
Nested markup is hard to parse with rx, and JS's RegExp in particular. Complex regular expressions also hard to read, maintain, and debug. If your needs are simple, a tag content replacement with some banned tags excluded, consider a simple code-based alternative to run-on RegExps:
function curly(str) {
var excludes = {
quote: 1,
code: 1,
icode: 1
},
xpath = [];
return str.split(/(\[[^\]]+\])/) // breakup by tag markup
.map(x => { // for each tag and content:
if (x[0] === "[") { // tag markup:
if (x[1] === "/") { // close tag
xpath.pop(); // remove from current path
} else { // open tag
xpath.push(x.slice(1).split(/\W/)[0]); // add to current path
} //end if open/close tag
} else { // tag content
if (xpath.every(tag =>!excludes[tag])) x = x.replace(/"/g, function repr() {
return (repr.z = !repr.z) ? "“" : "”"; // flip flop return value (naive)
});
} //end if markup or content?
return x;
}) // end term map
.join("");
} /* end curly() */
var input = `[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>"Why no goodbye?" replied [b]Bob[/b]. "It's always Hello!"</p>`;
var wants = `[quote="Name"][b]Alice[/b] said, "Hello world!"[/quote]
<p>“Why no goodbye?” replied [b]Bob[/b]. “It's always Hello!”</p>`;
curly(input) == wants; // true
To my eyes, even though it a bit longer, code allows documentation, indentation, and explicit naming that makes these sort of semi-complicated logical operations easier to understand.
If your needs are more complex, use a true BBCode parser for JavaScript and map/filter/reduce it's model as needed.
I'm outputting values from a database (it isn't really open to public entry, but it is open to entry by a user at the company -- meaning, I'm not worried about XSS).
I'm trying to output a tag like this:
Click Me
DESCRIPTION is actually a value from the database that is something like this:
Prelim Assess "Mini" Report
I've tried replacing " with \", but no matter what I try, Firefox keeps chopping off my JavaScript call after the space after the word Assess, and it is causing all sorts of issues.
I must bemissing the obvious answer, but for the life of me I can't figure it out.
Anyone care to point out my idiocy?
Here is the entire HTML page (it will be an ASP.NET page eventually, but in order to solve this I took out everything else but the problem code)
<html>
<body>
edit
</body>
</html>
You need to escape the string you are writing out into DoEdit to scrub out the double-quote characters. They are causing the onclick HTML attribute to close prematurely.
Using the JavaScript escape character, \, isn't sufficient in the HTML context. You need to replace the double-quote with the proper XML entity representation, ".
" would work in this particular case, as suggested before me, because of the HTML context.
However, if you want your JavaScript code to be independently escaped for any context, you could opt for the native JavaScript encoding:
' becomes \x27
" becomes \x22
So your onclick would become:DoEdit('Preliminary Assessment \x22Mini\x22');
This would work for example also when passing a JavaScript string as a parameter to another JavaScript method (alert() is an easy test method for this).
I am referring you to the duplicate Stack Overflow question, How do I escape a string inside JavaScript code inside an onClick handler?.
<html>
<body>
edit
</body>
</html>
Should do the trick.
Folks, there is already the unescape function in JavaScript which does the unescaping for \":
<script type="text/javascript">
var str="this is \"good\"";
document.write(unescape(str))
</script>
The problem is that HTML doesn't recognize the escape character. You could work around that by using the single quotes for the HTML attribute and the double quotes for the onclick.
<a href="#" onclick='DoEdit("Preliminary Assessment \"Mini\""); return false;'>edit</a>
This is how I do it, basically str.replace(/[\""]/g, '\\"').
var display = document.getElementById('output');
var str = 'class="whatever-foo__input" id="node-key"';
display.innerHTML = str.replace(/[\""]/g, '\\"');
//will return class=\"whatever-foo__input\" id=\"node-key\"
<span id="output"></span>
If you're assembling the HTML in Java, you can use this nice utility class from Apache commons-lang to do all the escaping correctly:
org.apache.commons.lang.StringEscapeUtils Escapes and unescapes
Strings for Java, Java Script, HTML, XML, and SQL.
Please find in the below code which escapes the single quotes as part of the entered string using a regular expression. It validates if the user-entered string is comma-separated and at the same time it even escapes any single quote(s) entered as part of the string.
In order to escape single quotes, just enter a backward slash followed by a single quote like: \’ as part of the string. I used jQuery validator for this example, and you can use as per your convenience.
Valid String Examples:
'Hello'
'Hello', 'World'
'Hello','World'
'Hello','World',' '
'It\'s my world', 'Can\'t enjoy this without me.', 'Welcome, Guest'
HTML:
<tr>
<td>
<label class="control-label">
String Field:
</label>
<div class="inner-addon right-addon">
<input type="text" id="stringField"
name="stringField"
class="form-control"
autocomplete="off"
data-rule-required="true"
data-msg-required="Cannot be blank."
data-rule-commaSeparatedText="true"
data-msg-commaSeparatedText="Invalid comma separated value(s).">
</div>
</td>
JavaScript:
/**
*
* #param {type} param1
* #param {type} param2
* #param {type} param3
*/
jQuery.validator.addMethod('commaSeparatedText', function(value, element) {
if (value.length === 0) {
return true;
}
var expression = new RegExp("^((')([^\'\\\\]*(?:\\\\.[^\'\\\\])*)[\\w\\s,\\.\\-_\\[\\]\\)\\(]+([^\'\\\\]*(?:\\\\.[^\'\\\\])*)('))(((,)|(,\\s))(')([^\'\\\\]*(?:\\\\.[^\'\\\\])*)[\\w\\s,\\.\\-_\\[\\]\\)\\(]+([^\'\\\\]*(?:\\\\.[^\'\\\\])*)('))*$");
return expression.test(value);
}, 'Invalid comma separated string values.');
I have done a sample one using jQuery
var descr = 'test"inside"outside';
$(function(){
$("#div1").append('Click Me');
});
function DoEdit(desc)
{
alert ( desc );
}
And this works in Internet Explorer and Firefox.
You can copy those two functions (listed below), and use them to escape/unescape all quotes and special characters. You don't have to use jQuery or any other library for this.
function escape(s) {
return ('' + s)
.replace(/\\/g, '\\\\')
.replace(/\t/g, '\\t')
.replace(/\n/g, '\\n')
.replace(/\u00A0/g, '\\u00A0')
.replace(/&/g, '\\x26')
.replace(/'/g, '\\x27')
.replace(/"/g, '\\x22')
.replace(/</g, '\\x3C')
.replace(/>/g, '\\x3E');
}
function unescape(s) {
s = ('' + s)
.replace(/\\x3E/g, '>')
.replace(/\\x3C/g, '<')
.replace(/\\x22/g, '"')
.replace(/\\x27/g, "'")
.replace(/\\x26/g, '&')
.replace(/\\u00A0/g, '\u00A0')
.replace(/\\n/g, '\n')
.replace(/\\t/g, '\t');
return s.replace(/\\\\/g, '\\');
}
Escape whitespace as well. It sounds to me like Firefox is assuming three arguments instead of one. is the non-breaking space character. Even if it's not the whole problem, it may still be a good idea.
You need to escape quotes with double backslashes.
This fails (produced by PHP's json_encode):
<script>
var jsonString = '[{"key":"my \"value\" "}]';
var parsedJson = JSON.parse(jsonString);
</script>
This works:
<script>
var jsonString = '[{"key":"my \\"value\\" "}]';
var parsedJson = JSON.parse(jsonString);
</script>
You can use the escape() and unescape() jQuery methods. Like below,
Use escape(str); to escape the string and recover again using unescape(str_esc);.
Here is the situation presented as-is in the fiddle:
http://jsfiddle.net/am4ph9ut
function encodeAllCharacters(){
var result = "%60~!%40%23%24%25%26*()_%2B-%3D%7B%7D%7C%5B%5D%3B%3A'%22%2F.%2C%3C%3E%3F%60";
result = result.replace(/[!'()*-_.~]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
return result;
}
This is a simplified version of the fixedEncodeURIComponent method explained in this page:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
I am not sure why I get %3 added after every character after the regexp is executed.
Your regular expression uses - inside the character class (ie. the square brackets) which means all the characters between the one on the left to the one on the right of the - (inclusive), eg. [c-f] will match c,d,e,f.
To match against the character - itself use the backslash to escape it:
result = result.replace(/[!'()*\-_.~]/g, function(c) {
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to escape HTML
How can a string be converted to HTML in JavaScript?
e.g.
var unsafestring = "<oohlook&atme>";
var safestring = magic(unsafestring);
where safestring now equals "<ohhlook&atme>"
I am looking for magic(...).
I am not using JQuery for magic.
function htmlEntities(str) {
return String(str).replace(/&/g, '&').replace(/</g, '<').replace(/>/g, '>').replace(/"/g, '"');
}
So then with var unsafestring = "<oohlook&atme>"; you would use htmlEntities(unsafestring);
Do not bother with encoding. Use a text node instead. Data in text node is guaranteed to be treated as text.
document.body.appendChild(document.createTextNode("Your&funky<text>here"))
If you want to use a library rather than doing it yourself:
The most commonly used way is using jQuery for this purpose:
var safestring = $('<div>').text(unsafestring).html();
If you want to to encode all the HTML entities you will have to use a library or write it yourself.
You can use a more compact library than jQuery, like HTML Encoder and Decode
You need to escape < and &. Escaping > too doesn't hurt:
function magic(input) {
input = input.replace(/&/g, '&');
input = input.replace(/</g, '<');
input = input.replace(/>/g, '>');
return input;
}
Or you let the DOM engine do the dirty work for you (using jQuery because I'm lazy):
function magic(input) {
return $('<span>').text(input).html();
}
What this does is creating a dummy element, assigning your string as its textContent (i.e. no HTML-specific characters have side effects since it's just text) and then you retrieve the HTML content of that element - which is the text but with special characters converted to HTML entities in cases where it's necessary.
The only character that needs escaping is <. (> is meaningless outside of a tag).
Therefore, your "magic" code is:
safestring = unsafestring.replace(/</g,'<');