I am trying to encode German characters in HTML. I just want to use the special character codes. For the umlaut, I've tried using both the named entity (&Uuml;) and the numeric code for Ü, and neither renders properly. What am I doing wrong? Thanks.
This is for a Squarespace site, and I am inserting JavaScript into the footer on their Code Injection page. I am using JavaScript to write a German word on the page. The relevant part of my code is below. The problem is that this simply renders "&Uuml;ber" on the page rather than "Über" with an umlaut. Thanks!
var strings = {
  'About': {
    'de': '&Uuml;ber'
  },
Try using &Uuml; for the German character Ü.
You need to escape special characters in HTML, unless...
You address the encoding issue on a document-wide level by adding the following line of code at the beginning of the <head> section:
<meta charset="utf-8">
Then you don't need to escape special characters individually.
Further reading:
Character encodings for beginners
Declaring character encodings in HTML
Declaring character encodings in CSS
UPDATE 1 (Javascript)
Convert special characters to HTML in Javascript
How to convert characters to HTML entities using plain JavaScript
UPDATE 2 (Squarespace)
HTML Special Characters and Squarespace
special character in squarespace (text block)
If you are a Squarespace customer, they provide 24/7 customer support. Contact them directly.
This solution worked for me. Use the hex code, but replace the &# at the beginning with a backslash (\) and drop the trailing semicolon.
So, to render the word "Über" within JavaScript in Squarespace, use \xdcber
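As a minimal sketch of why this works, assuming the text sits inside a JavaScript string literal (as in the question's strings object): escape sequences are resolved by the JavaScript engine itself, so nothing depends on HTML entity decoding. The variable names are illustrative only.

```javascript
// Two escape forms for the same string inside a JavaScript string literal.
var hexEscape = '\xdcber';       // \xHH: exactly two hex digits, here U+00DC
var unicodeEscape = '\u00dcber'; // \uHHHH: exactly four hex digits, same character

// An HTML entity is just nine ordinary characters to JavaScript.
// It is only decoded if the string is later parsed as HTML.
var entity = '&Uuml;ber';

console.log(hexEscape === unicodeEscape);
console.log(hexEscape.length, entity.length);
```

Either escape form keeps the script file pure ASCII, which avoids depending on how the host page serves the injected code.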
I have a textarea meant for plain text that users sometimes copy and paste special characters into. It becomes a problem when emoticons are used, because it's material we then need to include in PDF files.
For instance: ❤️
❤
Now my question is, how could I go about identifying such characters and removing them with Javascript as the form is validated? I don't want to be too restrictive, as many languages are allowed (Russian, Arabic, etc.). Only those symbols would need to be excluded.
Thank you
See http://crocodillon.com/blog/parsing-emoji-unicode-in-javascript. The problem is that most emoticons are in a supplementary plane, so you cannot use a normal character range; instead you need to work with "surrogate pairs", along the lines of
/\ud83d[\ude00-\ude4f]/
The link above has additional information on how to find and treat emoticon characters in other Unicode ranges.
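A minimal sketch of how that pattern might be applied during validation. The function name stripEmoticons is hypothetical, and the range only covers U+1F600 to U+1F64F; symbols elsewhere, such as the heart at U+2764, would need their own ranges.

```javascript
// Matches one surrogate pair in the range U+1F600..U+1F64F.
// The "g" flag removes every occurrence, not just the first.
var EMOTICON_PATTERN = /\ud83d[\ude00-\ude4f]/g;

// Hypothetical helper: returns the text with matching emoticons removed.
function stripEmoticons(text) {
  return text.replace(EMOTICON_PATTERN, '');
}

// Russian and Arabic letters sit outside the range and are left alone.
var input = '\u041f\u0440\u0438\u0432\u0435\u0442 \ud83d\ude00 \u0645\u0631\u062d\u0628\u0627';
console.log(stripEmoticons(input));
```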
I want to reduce the size of my HTML output stream by removing all empty lines and whitespace. However, I'm not very good at regex, and the pattern I have seems to remove more than wanted, e.g. whole script blocks. How can I make sure that script blocks are kept intact?
This is what I have so far:
html = Regex.Replace(html, ">\s+<", "><", RegexOptions.Compiled)
I think you're looking for a conditional regex. Look at the examples here: Regex Tutorial If-Then-Else
Regex syntax differs between systems (.NET, Python, etc.)
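One alternative to a conditional regex, sketched here in JavaScript rather than the question's .NET, is to split the markup on script blocks first and only collapse whitespace in the remaining segments. The function name collapseBetweenTags is hypothetical, and this assumes script blocks are not nested or malformed.

```javascript
// Hypothetical helper: collapses whitespace between tags, but leaves
// <script>...</script> blocks exactly as they were.
function collapseBetweenTags(html) {
  // A capturing group in split() keeps the matched script blocks in the
  // result, so the array alternates: markup, script, markup, script, ...
  var parts = html.split(/(<script\b[\s\S]*?<\/script>)/i);
  return parts.map(function (part, index) {
    // Odd indexes hold the captured script blocks.
    return index % 2 === 1 ? part : part.replace(/>\s+</g, '><');
  }).join('');
}

var page = '<div>\n  <p>a</p>\n</div>\n<script>\nvar x = "> <";\n</script>\n<p>b</p>';
console.log(collapseBetweenTags(page));
```

Whitespace directly next to a script block is left in place by this sketch, because each segment is processed on its own.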
I've been trying to embed some Hebrew characters in Thom Sander's free HTML5 template (download link).
For example, I've tried to change a left-side menu item text to Hebrew, i.e.,
Home Page => עמוד הבית
For some reason the Hebrew characters are not shown at all.
When I add Hebrew in other places in the document, it is shown correctly. At first I thought this might be an encoding issue, but the encoding declared in the head seems to be valid: UTF-8. I think there might be some JS code ignoring the Hebrew text, but I'm not sure.
Any ideas?
Seems like someone already found a solution for that. I didn't try to implement the whole solution but tested it with your files and it works.
You can find the solution here
Basically you just need to use CufonRTL.js to be able to use Hebrew and Cufon.js together.
You may find CufonRTL.js at the beginning of the blog post or just download the file from here
Then you'll have to load CufonRTL.js and execute something like:
CufonRTL.RTL('#menu a');
So the menu links would support Hebrew while using the Cufon library & custom font.
The reason you cannot embed Hebrew characters into your website is because the template is using the Cufon technique, which doesn't support right-to-left languages.
Planned features:
Support for right-to-left and bi-directional text
However, it looks like there is a way around it:
Using Cufon with Right-To-Left Text
Try adding this rule to the CSS
html { unicode-bidi: embed; }
http://www.w3schools.com/jsref/prop_style_unicodebidi.asp
The unicodeBidi property is used with the direction property to set or
return whether the text should be overridden to support multiple
languages in the same document.
Be sure to use:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
Or (as a new HTML5 standard):
<meta charset='utf-8'>
And try adding this property in your CSS:
unicode-bidi: embed;
You can also try to display something using HTML entities instead of Unicode characters: &#1425; &#1426; &#1427;
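If typing numeric entities by hand gets tedious, they can be generated. A minimal sketch, assuming decimal entities are wanted for every non-ASCII character; the function name toNumericEntities is hypothetical, and the sketch does not escape &, < or >.

```javascript
// Hypothetical helper: replaces every non-ASCII character with &#N;
function toNumericEntities(text) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text).map(function (ch) {
    var codePoint = ch.codePointAt(0);
    // Leave plain ASCII untouched; encode everything else as a decimal entity.
    return codePoint < 128 ? ch : '&#' + codePoint + ';';
  }).join('');
}

console.log(toNumericEntities('\u0591 \u0592 \u0593'));
```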
This is my string:
<link href="/post?page=4&tags=example" rel="last" title="Last Page">
From there I am trying to obtain the 4 out of that page parameter, using this regular expression:
link href="/post?page=(.*?)&tags=(.*?)" rel="last"
I will then collect the 4 out of the first group. The tags parameter has a wildcard because its contents can change. However, I don't seem to be getting a match with this. Can anyone help?
And I know I shouldn't be using regex to parse HTML, but this is just a small thing and it would be a waste to import a huge module for this.
Assuming you are using a /regex literal/, you will need to escape the / in that path as \/.
Alternatively, it depends on how you are getting this string. Is it really typed that way, or is it part of an innerHTML that you are then reading out again? If that's the case, then the innerHTML won't be what you expect it to be, because the browser will "normalise" it.
If it is an innerHTML, then it'd be far easier to get the tag, then get the tag's href attribute, then regex that.
link href="/post\?page=(.*?)&tags=(.*?)" rel="last"
You forgot the backslash before the ?
I think it might be better to change your capture groups to something a little different that will catch everything up to the terminating character:
link href="/post\?page=([^&]+)&tags=([^\"]+)" rel="last"
Putting the negation character (^) first in a character class tells the regex engine "capture all characters EXCEPT the ones listed here". This makes it very easy to capture everything up until it hits a termination character, such as the ampersand and double quote. Assuming you're using PHP or Java, this should also slightly improve regex performance.
If the page parameter always comes first, try the PCRE /\?page=(\d+)/. Match group 1 will contain the page number.
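Pulling the answers together, here is a minimal sketch in JavaScript, assuming the string is available exactly as typed in the question and that a regex literal is used. The variable names are illustrative.

```javascript
var html = '<link href="/post?page=4&tags=example" rel="last" title="Last Page">';

// In a regex literal both "/" and "?" need a backslash.
// \d+ captures the page number; [^"]* captures the tags up to the closing quote.
var pattern = /link href="\/post\?page=(\d+)&tags=([^"]*)" rel="last"/;

var match = pattern.exec(html);
var lastPage = match ? parseInt(match[1], 10) : null;
var tags = match ? match[2] : null;

console.log(lastPage, tags);
```

The null fallback covers the case where the markup changes and the pattern no longer matches.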
I have a company name that always needs to be italicized. I have navigation that is driven by my sitemap, and I cannot figure out how to italicize the word. The word is always the same, so I thought about some JavaScript, but was wondering if I had any other options. Thank you.
If the sitemap is an XML document, then you might use an XSLT stylesheet to print out the content (a little tutorial: http://www.w3schools.com/xsl/).
But without using CSS or tags you can't make a word italic. There is no separate italic character for each symbol, so in a pure XML document there is no way to do that.
I added character encodings to my sitemap to italicize. Ex: for < I used &lt;
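A minimal sketch of that encoding step, assuming the title is stored in an XML attribute and that the navigation outputs the decoded value without encoding it again. The function name escapeXml and the company name "Acme" are hypothetical.

```javascript
// Hypothetical helper: encodes markup so it can be stored inside XML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')  // must run first, or later entities get double-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

console.log(escapeXml('<i>Acme</i> Tools'));
```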