How to write special characters in an HTML file?

I'm having trouble writing special characters in JavaScript that is generated by Python code. Let me explain.
I wanted a graphical interface for a Python script. I don't have much experience with graphical interfaces in Python, so I chose to write it in HTML/CSS. To make sure it works in every situation, I wrote code that generates the HTML after processing some information found in a text file. In a nutshell, my code takes already-written blocks of HTML, modifies them, assembles them, and then writes the new HTML code into a .html file that I can open in my web browser.
Everything works perfectly fine. The problem is that I'm French, so I need to handle special characters like é, è or à. Furthermore, the HTML code I mentioned contains some JavaScript that uses .innerHTML to modify the web page without any loading time. My problem is that when the innerHTML code is triggered, the resulting text is "s�ance" when it should be "séance". I think it's an encoding problem, but it happens only in JavaScript when I use innerHTML: if I write the same string directly in HTML, it is fine.
This is exactly what happens to a string in my code:
I read it from a file:
file = file.read()
Then I write it into an html file:
interface = open(r'interface.html','w')
my_text = 'séance'
interface.write("var text = '" + my_text + "';")
I then obviously use id_of_an_element.innerHTML = text
And so, as explained earlier, when I open the HTML file in a web browser, 'séance' becomes 's�ance'.

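Add this <meta> tag to the <head> of your HTML file to declare UTF-8 as its encoding:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
The file must also actually be written as UTF-8. In Python, open() uses a platform-dependent default encoding unless you pass one explicitly, e.g. open(r'interface.html', 'w', encoding='utf-8').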
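To see why an encoding mismatch produces "s�ance", here is a minimal JavaScript sketch. It assumes the file was written in a single-byte encoding such as Latin-1 (where é is the single byte 0xE9) and then read as UTF-8. The byte values are illustrative and not taken from the asker's actual file.

```javascript
// é encoded as UTF-8 is two bytes (0xC3 0xA9); in Latin-1 it is one byte (0xE9).
const utf8Bytes = new TextEncoder().encode('séance');
// Assumed bytes: "séance" as a Latin-1 writer would store it.
const latin1Bytes = new Uint8Array([0x73, 0xe9, 0x61, 0x6e, 0x63, 0x65]);

const decoder = new TextDecoder('utf-8');
const roundTrip = decoder.decode(utf8Bytes);  // bytes match the declared encoding
const mismatch = decoder.decode(latin1Bytes); // a lone 0xE9 is invalid UTF-8

console.log(roundTrip);
console.log(mismatch);
```

When the bytes and the declared encoding agree, the text survives; when they do not, the decoder substitutes the replacement character U+FFFD for the byte it cannot interpret.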

Related

How to parse HTML string and get <body> contents

I am working with TinyMCE, which in its latest release doesn't support editing <head> and <!doctype>. I still need to use it to create a "full" document, so I need a way of prepending those tags to the editor content.
The problem can be split into two parts:
When I edit a file in my page and use save() from TinyMCE, only the contents of the editor are POSTed to my Node.js + Express backend. The solution I've found is to have, outside of the editor, some buttons that set options for the request, so that the server knows what to write together with the content.
The real problem arises when I want to upload a file from my machine to be modified in the editor, and that file is a "full" HTML file. In that case, the content outside the <body> tag isn't displayed in the editor; it is discarded, which stops me from editing it from within the page as I would if I were creating a new document.
As I mentioned, I'm working with Node.js and Express for the backend, and since I'm not familiar with jQuery, I'd need the solution to be vanilla JS.
I have looked into the html-dom-parser library, but it doesn't seem to fit the bill, as I don't see how I can use the DOM object it produces to do the splitting.
I am using an <input type="file"> to choose the file I want to upload, but this problem is stumping me and I'm no longer sure it's the right path, so any help is welcome.

New tag and parser in HTML

I've sketched out a new programming language for client-side scripting. I'm familiar with how to build a language, and most of the work that goes into it. I'm prepared for a long project but have a few questions about how to implement it in an HTML file. I'd like to implement something like the <script> tag in HTML, and I've found a few links that explain how to add a new tag to HTML. I figure developers will have to link to a JS file containing the language parser (it's actually more than a parser, but that's beside the point) in the head of their HTML file, like so: <script src="my-lang.js"></script>. Here are some of the links I've found:
https://blog.teamtreehouse.com/create-custom-html-elements-2
https://developers.google.com/web/fundamentals/web-components/customelements
Here are some of my questions, assuming I want my language between <my-language> tags:
How do I prevent the browser's HTML parser from misinterpreting whatever is in the <my-language> tags as HTML code? All the code in my language would show up as plain text on the website from what I can see.
How can I implement my parser into the JS file? I'm not asking how to parse a custom language, just where to put it. The first link mentions a createdCallback() function that is called when a <my-language> tag is used. Assuming nobody adds a <my-language> tag with JavaScript later (which for this language that would be pointless) that callback should be called anytime the custom tag is used.
To parse my language, should I access the innerHTML attribute of my custom tag? I don't know if by parsing time the tag even has that attribute, or if I have to add that attribute, as I'm not familiar with how that part of it works.
Thanks for putting up with my silly questions. I like to dive into the deep end with this stuff even though I'm fairly new at this part of scripting. Basically, this is what I want the code to look like:
<!DOCTYPE html>
<html>
<head><script src="my-lang.js"></script></head>
<my-language>
//do some stuff in my language, like DOM editing.
//Just a replacement for JavaScript basically that doesn't serve much purpose.
</my-language>
</html>
Any suggestions would be appreciated. Thanks in advance!

Rather than using a new nonstandard tag in the HTML, I would recommend using a <script> tag. Although <script> tags usually hold JavaScript, this is not always the case. For example, one technique for the server to send data for the client's JS to parse is to put JSON in a tag like <script type="application/json">{"foo":"bar"}</script>. The browser will not try to execute a script whose type it does not recognize as a script type (anything other than a JavaScript MIME type such as text/javascript, or module), but the text inside the tag can still be retrieved with JavaScript: select the tag and read its textContent property. (The innerHTML property is probably only appropriate when you deliberately want to retrieve HTML markup; otherwise, it is probably best to use textContent.)
You can use nearly the same technique, but rather than passing the content of the <script> tag to JSON.parse, send it through your parser.
For example:
const tag = document.querySelector('script[type="myNewLanguage"]');
const scriptText = tag.textContent;
// use your parser to parse scriptText
scriptText.split('\n').forEach(line => {
  console.log(line);
});
<script type="myNewLanguage">foo
bar
baz</script>
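The JSON variant mentioned at the start of this answer can be sketched the same way. The helper name readEmbeddedJson and the stand-in fakeDocument object are hypothetical, introduced only so the example also runs outside a browser; in a page you would pass the real document.

```javascript
// Reads a <script type="application/json"> data block and parses it.
// `doc` is any object with a querySelector method (normally `document`).
function readEmbeddedJson(doc, selector) {
  const tag = doc.querySelector(selector);
  if (!tag) {
    throw new Error('No element matches ' + selector);
  }
  return JSON.parse(tag.textContent);
}

// Hypothetical stand-in for `document`, returning one fixed data block.
const fakeDocument = {
  querySelector: (selector) =>
    selector === 'script[type="application/json"]'
      ? { textContent: '{"foo":"bar"}' }
      : null,
};

const data = readEmbeddedJson(fakeDocument, 'script[type="application/json"]');
console.log(data.foo);
```

Swapping JSON.parse for a custom parser gives the approach suggested for the new language.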

Retrieving Servlet info from non-form parts

I am trying to create an application that reads JSON strings.
Right now: I can input a JSON string through Java, write it to an HTML document and have a JavaScript application read it; which then parses it and writes it to the same HTML application. I need to know how, using Java, to read the HTML that it gets written to so I can use that data. It is important to note this HTML file is all generated by code so there is no actual text file to read.
I realize this is a roundabout way of doing it, but up until this point it has worked. My question is simple: How can I read an HTML page in a part that is not in a <form> through either regular Java or Servlet.

You can only do that by parsing the HTML in Java. There are open-source libraries that do this job for you.
Here is one that you can use:
http://jsoup.org/

Javascript write html easier

Does javascript have a method for having html on multiple lines without appending \ to the end of every line:
alert('\
<a>hello</a>\
<div>world</div>\
');
This is really irritating and escaping all the single quotations is even more irritating.
PHP offers
$variable = <<<XYZ
<html>
<body>
</body>
</html>
XYZ;
Normally I would just keep the HTML in a separate file and use jQuery's .load() to get it.
But the project I'm working on is going to be offline and in a single file, so that's a no-go.

I would recommend just putting the HTML into a hidden div:
<div id="my_html" style="display:none;">
<a>hello</a>
<div>world</div>
</div>
Then on page load set the variable:
var my_html = "";
$(function() {
my_html = $('#my_html').html();
});
The code above assumes you're using jQuery.

The answer is "No" for JavaScript before ES2015, as per the comments above.
But a workaround I've used from time to time is to create the HTML (or other string or data that is easier to create without JS escaping) in a temporary file where I can format it as required while I work on it with nice indenting, extra blank lines for grouping or whatever, and then use a find-and-replace to escape any quotes and do something about the linebreaks (your choice of inserting \ at the end of each line, wrapping each line with "..." +, or just removing all linebreaks) such that the resulting string is safe to copy and paste directly into my JavaScript source code. Once it's in the JS source I would then abandon the temporary file and do any further edits directly in the JS, though obviously you could save the file if you're willing to do the find-and-replace and copy-paste every time something changes.
(Actually you don't need the temporary file at all if you don't care about keeping the non-JSified text, because most editors that you're likely to use for coding will have a find-and-replace that works on a selection.)
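The find-and-replace step described above can also be automated. The sketch below is one possible way to do it and not the method the answer used: it relies on JSON.stringify, which escapes double quotes, backslashes and line breaks, to produce a string literal that can be pasted into JavaScript source. The helper name toJsStringLiteral is hypothetical.

```javascript
// Turns arbitrary text into a quoted, escaped JavaScript string literal.
// JSON.stringify escapes double quotes, backslashes and line breaks.
function toJsStringLiteral(text) {
  return JSON.stringify(text);
}

const html = '<a href="#">hello</a>\n<div class=\'box\'>world</div>';
const literal = toJsStringLiteral(html);

// Prints a single-line assignment that is safe to paste into a JS file.
console.log('var my_html = ' + literal + ';');
```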

You could still have it in a separate file as you used to do and have a compile step to create just one file.
You could load the file using RequireJS and the text! plugin, and use r.js to compile the js/text/html files to one js file. After that, you can include the js content into the html itself...
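For the original single-file requirement there is also a language-level option in engines that support ES2015 or later: template literals, which are delimited by backticks and may span several lines. A minimal sketch follows; the markup is illustrative.

```javascript
// A template literal (backticks) may span several lines and can contain
// both single and double quotes without escaping.
const myHtml = `
<a href="#">hello</a>
<div class='box'>world</div>
`;

// The line breaks are part of the string value.
console.log(myHtml.trim().split('\n').length);
```

Only backticks, backslashes and the sequence ${ need escaping inside such a string.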

JavaScript Character Issue When Filling Dropdown With jQuery From External JS File

I'm running into a character encoding issue when I load a dropdown using jQuery from an external js file. This only seems to happen when the JavaScript object is not within the page.
For example the below is the JavaScript object.
var langs = [
{value:'zh-CN', text:'中文 (简体) Chinese Simplified'},
{value:'en', text:'English'},
{value:'eo', text:'EsperAnt'},
{value:'es', text:'Español'},
{value:'ja', text:'日本語 (Japanese)'},
{value:'pt-PT', text:'Português'},
{value:'ru', text:'Русский (Russian)'},
];
If this is in my page with the proper meta tags <meta http-equiv="content-type" content="text/html; charset=utf-8" /> the below code works.
$(document).ready(function() {
  // Fill language select
  $.each(langs, function(i, j){
    $('#LangSelect').append($("<option></option>").attr("value",j.value).text(j.text));
  });
});
But since I need the languages on more than one page, I've moved the langs object to an external JS file and reference it from there. After doing this, I run into encoding issues: for example, the Russian characters become Ð ÑƒÑÑÐºÐ¸Ð¹ (Russian).
The encoding issue still appears even when the reference to the external JS file is set as below:
<script type="text/javascript" charset="UTF-8" src="externalJS.js"></script>
Is there any way to force the JavaScript object to be loaded with the proper encoding from an external file?
Please note I am experiencing these issues when viewing content in the iPhone Mobile Safari browser. Additionally, these pages are simply HTML and JavaScript without any server-side components.
Thanks in advance,
Ben

Is there anyway to force the JavaScript object to be loaded with the proper encoding from an external file?
Yes, the script charset attribute as you quoted. However it historically didn't work everywhere and was best not relied on. Where this is not supported, the browser will always use the charset of the main page as the charset in the script. So as long as you include the UTF-8 charset parameter in the main page you should be fine either way.
I am surprised if a modern browser like Mobile Safari doesn't understand it, though.
Is it possible your server might be serving .js files with a bad Content-Type header containing a wrong charset? A combination of unset mime-types for JS plus AddDefaultCharset in Apache could leave you with:
Content-Type: text/plain;charset=iso-8859-1
That might have the effect of mucking it up.
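When inspecting what a server sends, it helps to pull the charset out of the header value. The helper below is a hypothetical sketch that handles only the common simple forms of a Content-Type value and is not a full header parser.

```javascript
// Extracts the charset parameter from a Content-Type header value,
// or returns null when none is present. Handles simple forms only.
function charsetFromContentType(headerValue) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType('text/plain;charset=iso-8859-1'));
console.log(charsetFromContentType('application/javascript; charset=UTF-8'));
console.log(charsetFromContentType('application/javascript'));
```

A result other than utf-8 for a .js file would be consistent with the misconfiguration described above.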

Make sure you save the javascript file using UTF-8 encoding. If you open the file in Notepad++, then you can click Format>Encode in UTF-8 (If you try Format>Convert to UTF-8, then have a look at the page using a hex editor. Sometimes you end up with some strange characters at the beginning of the file).
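One thing a hex editor may reveal at the start of a file is a UTF-8 byte order mark (the bytes EF BB BF). Whether that is what the "strange characters" above refer to is an assumption. The sketch below checks for and strips such a prefix; both helper names are hypothetical.

```javascript
// Returns true when the byte sequence starts with the UTF-8 byte order mark.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf;
}

// Returns the bytes without a leading UTF-8 BOM, if there is one.
function stripUtf8Bom(bytes) {
  return hasUtf8Bom(bytes) ? bytes.slice(3) : bytes;
}

// Illustrative input: a BOM followed by the ASCII text "var".
const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x76, 0x61, 0x72]);
const stripped = new TextDecoder('utf-8').decode(stripUtf8Bom(withBom));

console.log(hasUtf8Bom(withBom));
console.log(stripped);
```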
