How to create a new web character symbol recognizable by html/javascript? - javascript

I have checked many questions here but none answers mine. I would appreciate it if you could point me in the right direction.
I would like to make a new symbol (not a font, and not related to any existing alphabet), like creating the alphabet of a new language, that a browser could recognize or translate via HTML or JavaScript code.
In other words, I want to assign a single custom character to multiple characters (e.g. 1 character translates into 5 characters).
I assume I need to make the font first and then assign that character. What programs do you suggest, or what is the best approach?
Edit:
A better example:
Make a new character like ¢ (cent) that has entity name --> & c e n t; and entity number --> & # 1 6 2;
Edit 2: Thank you all for your replies. I'm checking your links and suggestions. As I understand it, there might be a browser compatibility issue. So how about putting the new symbols in a text file saved on the server, and when the user views the file, JavaScript converts those symbols into a word or other standard characters?
Edit3: Sorry for any confusion guys, you are all awesome. This example might clear things.
Make a new symbol that maps to "AB", so that one character translates into two characters.
Edit4:
This is based on Jared's answer. It does work for Z and P. Now how should I add my custom font to this file (replacing Z and P with my own symbols)?
Assuming Z and P are my custom made symbols
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
<!ENTITY Z "AB">
<!ENTITY P "DE">
...
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Extending XHTML - Example 1</title>
</head>
<body>
<p>My symbols are &Z; and &P;</p>
</body>
</html>

This may be an option for you, though I'm not quite sure. If you use XHTML and/or XSLT, you could possibly achieve what you're looking for with custom-defined entities. For example:
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
<!ENTITY mailto "mailto:">
<!ENTITY username "gabriel">
<!ENTITY arobase "@">
<!ENTITY hostname "gabsoftware">
<!ENTITY tld ".com">
<!ENTITY email "&username;&arobase;&hostname;&tld;">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Extending XHTML - Example 1</title>
</head>
<body>
<p>My email is &email;</p>
</body>
</html>
http://jfcoder.com/test/entities.xhtml
You'll see the SGML ENTITY declaration. By the way, I believe HTML5 no longer supports it, since HTML5 is no longer based on SGML. Whether you can embed an image in those entities, or achieve your goal through an XSLT transformation, I haven't figured out yet.
For many more examples and options, see this page:
Extending XHTML with XML, XSLT, entities, CDATA sections and JavaScript

You have to use an SGML entity declaration to create a new character entity. This should work in languages that inherit from SGML, including HTML and XML, but be sure to test it in every user agent you support: browsers generally honor such declarations only when the document is parsed as XML (for example, served as application/xhtml+xml).
http://en.wikipedia.org/wiki/XML_entity
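Where a page is parsed as plain HTML and custom entity declarations are ignored, the references could instead be expanded by script. The following is only a minimal sketch: the names CUSTOM_ENTITIES and expandEntities are made up for illustration, and the Z/P mappings are taken from the question's example. Unknown references are left untouched so that standard entities such as &amp; still work.

```javascript
// Hypothetical lookup table: entity name -> replacement text.
const CUSTOM_ENTITIES = { Z: "AB", P: "DE" };

// Replace every &name; reference that appears in the table.
// References that are not in the table are returned unchanged.
function expandEntities(text, entities = CUSTOM_ENTITIES) {
  return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(entities, name) ? entities[name] : match
  );
}

console.log(expandEntities("My symbols are &Z; and &P;"));
```

In a browser this would have to run on the raw text before it is parsed as HTML, since the parser handles unknown entity references on its own.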

If you're using standard characters (a-z, A-Z, 0-9, !, @, #, $, etc.), you could use a PHP variable that holds the translation and apply it as the page is being rendered (JavaScript runs after the page is rendered/downloaded).
<?php
$from = "A";
$into = "ABCD";
$content = "This is A whole lot of page content";
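// preg_replace needs a delimited pattern; preg_quote escapes any regex metacharacters in $from
echo preg_replace('/' . preg_quote($from, '/') . '/', $into, $content);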
// will output "This is ABCD whole lot of page content"
?>
Could work with special fonts too, I'd imagine, or even swapping $from into <img src="pic/of/chars.png" />.
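The same substitution could be sketched on the client in JavaScript, which is closer to what Edit 2 of the question asks for. This assumes the custom symbols are stored as Unicode Private Use Area code points (U+E000 and U+E001 here are arbitrary choices), and the names SYMBOL_MAP and expandSymbols are hypothetical.

```javascript
// Hypothetical mapping: one private-use character -> several standard characters.
const SYMBOL_MAP = new Map([
  ["\uE000", "AB"],
  ["\uE001", "DE"],
]);

// Walk the text one code point at a time and expand any mapped symbol.
function expandSymbols(text, map = SYMBOL_MAP) {
  let out = "";
  for (const ch of text) {
    out += map.has(ch) ? map.get(ch) : ch;
  }
  return out;
}

console.log(expandSymbols("x\uE000y\uE001"));
```

A custom font would still be needed to display the private-use characters themselves; the sketch only covers the translation step.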

Related

How to correct / parse xHTML errors [duplicate]

I need to add a closing image tag. Current html:
<img class="logoEmail" src="/images/logoPDF.png">
What I want:
<img class="logoEmail" src="/images/logoPDF.png"/>
How can I do that?
myInput ='<img class="example1" src="/images/example1.png">';
myInput += '<img class="example2" src="/images/example2.png"/>';
result = myInput.replace(/(<img("[^"]*"|[^\/">])*)>/gi, "$1/>");
Explanation of the regex:
<img The start
"[^"]*" A string inside the tag. May contain the / character.
[^\/">] Anything else (not a string, not a / and not the end of the tag)
> The end of an IMG tag
This will only match unfinished tags, and will replace it by the whole thing, plus a />
As I said before this is NOT bulletproof, probably there is no regex that would work 100%.
You could try this regex also,
result = myInput.replace(/^([^\.]*\.[^>]*)(.*)$/g, "$1/$2");
DEMO
The first group captures everything up to a literal dot and then everything up to (but not including) the next >. The second group captures the rest of the string, starting with that >. Inserting a / between the two captured groups in the replacement gives you the desired output.
It can be as easy as this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Demo Replace IMG tags w/ regex</title>
<script type="text/javascript">
window.onload = function() {
document.body.innerHTML = document.body.innerHTML.replace(/(<img[^>]+)/g, "$1 /");
}
</script>
</head>
<body>
<p>Some text.</p>
<img src="images/logoPhone.jpg">
<br>
<img src="images/logoMail.png">
<p>Some more text.</p>
</body>
</html>
Explanation:
<img: match must start with this.
[^>]: after the starting match, the next character may be anything but >.
+: one or more occurrences.
g: apply globally, do not return on the first match.
$1: as in the first capture group (= stuff between first set of parentheses).
Be aware that Firebug never shows closing slashes, regardless of doctype. But you can see the regex script in action here: http://regex101.com/r/zS2zO1.

escape vs encodeURIComponent [duplicate]

I thought values entered in forms are properly encoded by browsers.
But this simple test file "test_get_vs_encodeuri.html" shows it's not true:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title></title>
</head><body>
<form id="test" action="test_get_vs_encodeuri.html" method="GET" onsubmit="alert(encodeURIComponent(this.one.value));">
<input name="one" type="text" value="Euro-€">
<input type="submit" value="SUBMIT">
</form>
</body></html>
When hitting submit button:
encodeURIComponent encodes the input value as "Euro-%E2%82%AC"
while the browser writes only "Euro-%80" into the GET query.
Could someone explain?
How do I encode everything the same way the browser's form does (windows-1252) using JavaScript? (The escape function does not work, and neither does encodeURIComponent.)
Or is encodeURIComponent doing unnecessary conversions?
This is a character encoding issue. Your document uses the charset Windows-1252, where the € is at position 128 and is encoded as the single byte 0x80. But encodeURIComponent always encodes as UTF-8, using Unicode's character set, where the € is at position 8364 (U+20AC) and is encoded in UTF-8 as the three bytes 0xE2 0x82 0xAC.
A solution would be to use UTF-8 for your document as well. Alternatively, write a mapping that converts the characters to their Windows-1252 bytes.
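As a rough sketch of the mapping idea, the following percent-encodes a string using Windows-1252 bytes instead of UTF-8. The names CP1252_HIGH and encodeCp1252Component are made up for illustration. It does not reproduce every detail of browser form submission: browsers encode spaces as + and replace unmappable characters with numeric references, whereas this sketch emits %20 and throws.

```javascript
// Windows-1252 matches ISO-8859-1 except for bytes 0x80-0x9F,
// which map to the Unicode code points listed here.
const CP1252_HIGH = new Map([
  [0x20AC, 0x80], [0x201A, 0x82], [0x0192, 0x83], [0x201E, 0x84],
  [0x2026, 0x85], [0x2020, 0x86], [0x2021, 0x87], [0x02C6, 0x88],
  [0x2030, 0x89], [0x0160, 0x8A], [0x2039, 0x8B], [0x0152, 0x8C],
  [0x017D, 0x8E], [0x2018, 0x91], [0x2019, 0x92], [0x201C, 0x93],
  [0x201D, 0x94], [0x2022, 0x95], [0x2013, 0x96], [0x2014, 0x97],
  [0x02DC, 0x98], [0x2122, 0x99], [0x0161, 0x9A], [0x203A, 0x9B],
  [0x0153, 0x9C], [0x017E, 0x9E], [0x0178, 0x9F],
]);

function encodeCp1252Component(str) {
  let out = "";
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      // ASCII: defer to the built-in rules for reserved characters.
      out += encodeURIComponent(ch);
      continue;
    }
    let byte;
    if (cp >= 0xA0 && cp <= 0xFF) {
      byte = cp; // same value as in ISO-8859-1
    } else if (CP1252_HIGH.has(cp)) {
      byte = CP1252_HIGH.get(cp);
    } else {
      throw new RangeError("Not representable in Windows-1252: " + ch);
    }
    out += "%" + byte.toString(16).toUpperCase();
  }
  return out;
}

console.log(encodeCp1252Component("Euro-\u20AC"));
console.log(encodeURIComponent("Euro-\u20AC"));
```

The two log lines show the single-byte Windows-1252 form next to the UTF-8 form that encodeURIComponent produces for the same input.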
I think the root of the problem is character encodings. If I mess around with charset in the meta tag and save the file with different encodings I can get the page to render in the browser like this:
(source: boogdesign.com)
That € looks a lot like what you're getting from encodeURIComponent. However, I could find no combination of encodings that made any difference to what encodeURIComponent returned. I can make a difference to what the GET query returns. This is your original page; submitting gives a URL like:
test-get-vs-encodeuri.html?one=Euro-%80
This is a UTF-8 version of the page; submitting gives a URL that looks like this (in Firefox):
http://www.boogdesign.com/examples/encode/test-get-vs-encodeuri-utf8.html?one=Euro-€
But if I copy and paste it I get:
http://www.boogdesign.com/examples/encode/test-get-vs-encodeuri-utf8.html?one=Euro-%E2%82%AC
So it looks like if the page is UTF-8 then the GET and encodeURIComponent match.

How to show exact URL with escaped characters in Safari?

I have a url like this : http://www.refskou.dk/safari-%F8.html
The file is named like this: safari-ø.html
The file consists of this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<script>
alert(this.location);
</script>
</head>
<body>
</body>
</html>
But it does not print out /safari-%F8.html nor safari-ø.html
It prints out the question mark indicating that it does not know of the character "ø".
All I want is to print out the URL as I see it in the address bar.
Please give me a hint. This is only a problem in Safari as far as I have tested.
I need to tell you that I do not have control over what kind of charset used on the page. I can only execute javascript :-)
In response to this answer.
The reason for the lack of control is that I am writing a script that can hopefully be included in any webpage, so I have no control over what charset is used. The included script can of course have its own charset, set by the charset attribute on the "script" tag, but I cannot get that to work.
unescape('/safari-%F8.html') == '/safari-ø.html'
Note that Safari still gives you a ?, but Chrome shows either a %F8 or ø
In Safari (nevermind):
var str = '/safari-%F8.html';
alert(str.replace(/%[A-F0-9]{2}/g, function(v){ return String.fromCharCode(parseInt(v.substr(1), 16)); }));
The above works on normal strings, but Safari is seeing that character as Unicode 65533 (U+FFFD, the replacement character), and I'm not sure how to convert that back to character code 248 (ø in Latin-1)...
Try the unescape javascript function:
alert(unescape(this.location));
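Since unescape is deprecated, the same single-byte decoding could be sketched with a small helper. The name decodeLatin1Escapes is hypothetical, and the sketch assumes every %XX escape stands for one ISO-8859-1 (Latin-1) byte, as in the URL from the question. Note that decodeURIComponent is not a drop-in replacement here, because %F8 on its own is not valid UTF-8 and it throws a URIError.

```javascript
// Decode each %XX escape as a single Latin-1 byte (code points 0-255).
// Unlike decodeURIComponent, this never treats the bytes as UTF-8.
function decodeLatin1Escapes(str) {
  return str.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(decodeLatin1Escapes("/safari-%F8.html"));

try {
  decodeURIComponent("/safari-%F8.html");
} catch (e) {
  // A lone 0xF8 byte is malformed UTF-8, so this branch runs.
  console.log(e.name);
}
```

This only helps when the escaped string is still available; it cannot recover a character that Safari has already replaced with U+FFFD.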
I believe you'll need to specify a character set.
The first thing in your Head section...
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
EDIT: I missed the part where the OP states he has no control over the character set on the page. I believe this is the root of the problem and wonder why he has no control over this.
Well, I finally got it working. For some reason Safari cannot handle the unusual characters when reading this.location or window.location. But moving down a level to the document object and reading document.URL gives me just what I need. Why this is, I cannot tell you, but it solves the problem.

getElementsByTagNameNS in (X)HTML documents

I have a question on Javascript and DOM; shouldn't the following code retrieve the three foo:bar elements in the body? The alert window displays zero. It doesn't work in any browser I have (not even Chrome Canary). Thank you for helping, have a nice weekend.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:foo="http://example.com">
<head>
<title>Hello!</title>
<script type="text/javascript">
function bodyLoad() {
var extElements = document.getElementsByTagNameNS('http://example.com', 'bar');
alert(extElements.length);
}
</script>
</head>
<body onload="bodyLoad();">
<foo:bar>First Foo-Bar</foo:bar>
<foo:bar>Second Foo-Bar</foo:bar>
<foo:bar>Third Foo-Bar</foo:bar>
</body>
</html>
You are likely serving the document with the wrong content type. The browser has to treat it as XML for namespaces to be recognized, so you need to use application/xhtml+xml or another XML content-type, and not text/html.
As an aside, your Doctype is wrong. If you want to use a DTD, then you will need one that includes the elements you are using from the foo namespace. If you don't, then just get rid of the Doctype — it has no bearing on rendering mode in XML documents (again, text/html documents are treated as tag soup, not XML).

JSF EL expression renders question marks (?) for Chinese characters inside JavaScript

I am using an EL expression inside JavaScript for rendering Chinese value.
alert('#{bundle.chinese}');
But it renders question marks (?) instead of actual characters.
When I use it outside a script tag in the same XHTML page, e.g.
<p>#{bundle.chinese}</p>
it renders the right Chinese characters. View source shows the HTML-encoded values (&....;).
I am using JSF on Facelets.
Sorry, I can't reproduce this with Mojarra 2.0.2 on Tomcat 6.0.20. Here's the JSF page I used:
<!DOCTYPE html>
<html
xmlns="http://www.w3.org/1999/xhtml"
xmlns:f="http://java.sun.com/jsf/core"
xmlns:h="http://java.sun.com/jsf/html">
<f:loadBundle basename="com.example.i18n.text" var="bundle" />
<h:head>
<title>test</title>
<script>alert('#{bundle.chinese}');</script>
</h:head>
<h:body>
<p>#{bundle.chinese}</p>
</h:body>
</html>
And here are the contents of com/example/i18n/text.properties.
chinese=\u6C49\u8BED\uFF0F\u6F22\u8A9E\u002C\u0020\u534E\u8BED\uFF0F\u83EF\u8A9E\u0020\u006F\u0072\u0020\u4E2D\u6587
The generated HTML source is:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<title>test</title>
<script>alert('汉语/漢語, 华语/華語 or 中文');</script></head><body>
<p>汉语/漢語, 华语/華語 or 中文</p></body>
</html>
Probably you're doing some stuff a bit differently and/or using a different JSF impl/version. Aren't you somewhere hardcoding/using a non-UTF-8 character encoding? Watch the IDE settings as well.
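The properties file above stores its text as \uXXXX escapes, which keeps the file pure ASCII so its content does not depend on the editor's or IDE's encoding. As an illustrative sketch only (the function name toUnicodeEscapes is made up, and unlike the file above it leaves ASCII characters unescaped), such escapes could be generated like this:

```javascript
// Convert every non-ASCII UTF-16 code unit to a \uXXXX escape,
// the form used in Java .properties files.
function toUnicodeEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out += unit < 0x80
      ? text[i]
      : "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
  }
  return out;
}

console.log("chinese=" + toUnicodeEscapes("\u4E2D\u6587"));
```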
