React Replacing Special Characters in CSS `content`

Issue
When loading my React webapp, sometimes special characters (e.g. ⋮) are replaced with strings of other characters (e.g. â‹®). The special characters appear correctly most of the time, but when this bug occurs all the below-listed special characters fail to display correctly. I am unable to intentionally replicate the issue.
Code Sample
The special characters are all within the CSS styles:
.someElement:before {
content: "⋮";
}
Example Characters
Here are the unintended replacements that are occurring with seemingly random frequency:
Intended Character | Characters that Appear
⋮ | â‹®
▾ | â–¾
✔ | âœ”

To use special characters reliably in CSS, write them as CSS escape sequences rather than as literal characters.
In your case you want ⋮ in the stylesheet, and its escape code is \22EE:
.someElement:before {
content: "\22EE";
}
For more escape codes, see:
https://www.w3schools.com/cssref/css_entities.asp
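
The escape used above is simply the character's Unicode code point written in hexadecimal, so it can be derived instead of looked up. A minimal JavaScript sketch follows; the helper name `toCssEscape` is made up for illustration and is not part of React or any library:

```javascript
// Hypothetical helper: turn the first character of a string into a CSS escape
// such as \22EE, suitable for use inside a `content` string.
function toCssEscape(char) {
  const codePoint = char.codePointAt(0); // also handles characters outside the BMP
  return "\\" + codePoint.toString(16).toUpperCase();
}

console.log(toCssEscape("⋮")); // \22EE
console.log(toCssEscape("▾")); // \25BE
console.log(toCssEscape("✔")); // \2714
```

Because the escaped form is pure ASCII, it is unaffected if the stylesheet is ever decoded with the wrong character encoding. The garbled output in the question is consistent with UTF-8 bytes being read as Windows-1252, although the question does not confirm the cause. If the character after an escape is itself a hex digit, put a space after the escape so the two are not read together.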

Related

Are there problems that I can run into by minifying graphql queries

I have a graphql query that I'm sending to the server (uses apollo-server) using fetch. I strip all extra whitespace from the query string before I send it. Should I be worried?
const query = `
{
thing {
id
name
relatedThing {
id
name
createdAt
}
}
}`
After query.replace(/\s+/g, ' ') I get:
"{ thing { id name relatedThing { id name createdAt } } }"
I haven't had any complaints from the server or any weird behaviour, but I don't understand what the server requires or whether this could break some queries. Is it possible that I can break some queries by doing this?

From the spec:
White space is used to improve legibility of source text and act as separation between tokens, and any amount of white space may appear before or after any token. White space between tokens is not significant to the semantic meaning of a GraphQL Document, however white space characters may appear within a String or Comment token... Like white space, line terminators are used to improve the legibility of source text, any amount may appear before or after any other token and have no significance to the semantic meaning of a GraphQL Document. Line terminators are not found within any other token.
In other words, there's nothing wrong with what you are doing. There are just two things to keep in mind:
If your queries include String literals, that regex will also change the value of any String literal that contains consecutive whitespace, a tab, or a line break (as in a block string).
GraphQL returns errors with a location that includes both the line number and character number where the error occurred. By transforming your queries like this, that information will reflect the transformed query and not your original one.
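
The first caveat can be seen with a small JavaScript sketch. The `name` argument on `thing` is an assumption made up for this example; it does not appear in the question's schema:

```javascript
// A query whose argument is a string literal containing two consecutive spaces.
const query = `
{
  thing(name: "two  spaces") {
    id
  }
}`;

// Same transformation as in the question: collapse every whitespace run.
const minified = query.replace(/\s+/g, " ").trim();

console.log(minified);
// { thing(name: "two spaces") { id } }
```

The structure of the query is untouched, but the server now receives "two spaces" with a single space, which is a different argument value. Passing such values as variables instead of inline literals keeps them out of the query text altogether.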

Why can class name not start with a number? [duplicate]

What characters/symbols are allowed within the CSS class selectors?
I know that the following characters are invalid, but what characters are valid?
~ ! # $ % ^ & * ( ) + = , . / ' ; : " ? > < [ ] \ { } | ` #

You can check directly at the CSS grammar.
Basically [1], a name must begin with an underscore (_), a hyphen (-), or a letter (a–z), followed by any number of hyphens, underscores, letters, or numbers. There is a catch: if the first character is a hyphen, the second character must [2] be a letter or underscore, and the name must be at least 2 characters long.
-?[_a-zA-Z]+[_a-zA-Z0-9-]*
In short, the previous rule translates to the following, extracted from the W3C specification:
In CSS, identifiers (including element names, classes, and IDs in
selectors) can contain only the characters [a-z0-9] and ISO 10646
characters U+00A0 and higher, plus the hyphen (-) and the underscore
(_); they cannot start with a digit, or a hyphen followed by a digit.
Identifiers can also contain escaped characters and any ISO 10646
character as a numeric code (see next item). For instance, the
identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F".
Identifiers beginning with a hyphen or underscore are typically reserved for browser-specific extensions, as in -moz-opacity.
[1] It's all made a bit more complicated by the inclusion of escaped Unicode characters (that no one really uses).
[2] Note that, according to the grammar I linked, a rule starting with two hyphens, e.g., --indent1, is invalid. However, I'm pretty sure I've seen this in practice.
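
The simplified pattern from this answer can be tried out directly. This is only a sketch: the function name `isSimpleCssIdentifier` is made up, and the pattern deliberately ignores escapes and non-ASCII characters, so it is stricter than the specification:

```javascript
// Anchored version of the simplified pattern quoted above.
// It ignores escapes and non-ASCII characters, so it rejects some valid names.
const SIMPLE_IDENT = /^-?[_a-zA-Z]+[_a-zA-Z0-9-]*$/;

function isSimpleCssIdentifier(name) {
  return SIMPLE_IDENT.test(name);
}

console.log(isSimpleCssIdentifier("someElement"));  // true
console.log(isSimpleCssIdentifier("-moz-opacity")); // true
console.log(isSimpleCssIdentifier("000000-8"));     // false
console.log(isSimpleCssIdentifier("-1col"));        // false
console.log(isSimpleCssIdentifier("--indent1"));    // false
```

The last result matches the second footnote: under this grammar a name with two leading hyphens is rejected. Newer CSS does accept that form for custom properties such as --main-color, which may be where it was seen in practice.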

To my surprise most answers here are wrong. It turns out that:
Any character except NUL is allowed in CSS class names. (If CSS contains NUL (escaped or not), the result is undefined. [CSS-characters])
Mathias Bynens' answer links to an explanation and demos showing how to use these names. Written down in CSS code, a class name may need escaping, but that doesn’t change the class name. E.g. an unnecessarily over-escaped representation will look different from other representations of that name, but it still refers to the same class name.
Most other (programming) languages don’t have that concept of escaping variable names (“identifiers”), so all representations of a variable have to look the same. This is not the case in CSS.
Note that in HTML there is no way to include space characters (space, tab, line feed, form feed and carriage return) in a class name attribute, because they already separate classes from each other.
So, if you need to turn a random string into a CSS class name: take care of NUL and space, and escape (accordingly for CSS or HTML). Done.
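
As a rough illustration of the "escape accordingly for CSS" step, here is a simplified JavaScript sketch. The function name `escapeCssIdentifier` is made up, and throwing on NUL is an assumption of this sketch; in browsers the built-in `CSS.escape()` already does this job and should be preferred:

```javascript
// Simplified sketch; browsers already ship CSS.escape() for this job.
function escapeCssIdentifier(name) {
  if (name.includes("\0")) {
    // Assumption of this sketch: reject NUL instead of replacing it.
    throw new RangeError("NUL cannot appear in a CSS identifier");
  }
  const chars = Array.from(name); // iterate by code point, not UTF-16 unit
  if (chars.length === 1 && chars[0] === "-") return "\\-";

  return chars
    .map((ch, i) => {
      const code = ch.codePointAt(0);
      const isDigit = ch >= "0" && ch <= "9";
      const isPlain = /^[a-zA-Z0-9_-]$/.test(ch) || code >= 0x80;
      // A digit may not come first, nor directly after a leading hyphen.
      const digitAtStart = isDigit && (i === 0 || (i === 1 && chars[0] === "-"));
      if (isPlain && !digitAtStart) return ch;
      return "\\" + code.toString(16) + " "; // hex escape, space-terminated
    })
    .join("");
}

console.log(escapeCssIdentifier("test.123")); // test\2e 123
console.log(escapeCssIdentifier("000000-8")); // \30 00000-8
```

The escaped text is only how the name is written in a stylesheet or selector string; the class attribute in the HTML still holds the original, unescaped name.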

I’ve answered your question in-depth at CSS character escape sequences. The article also explains how to escape any character in CSS (and JavaScript), and I made a handy tool for this as well. From that page:
If you were to give an element an ID value of ~!#$%^&*()_+-=,./';:"?><[]{}|`#, the selector would look like this:
CSS:
<style>
#\~\!\#\$\%\^\&\*\(\)\_\+-\=\,\.\/\'\;\:\"\?\>\<\[\]\\\{\}\|\`\#
{
background: hotpink;
}
</style>
JavaScript:
<script>
// document.getElementById or similar
document.getElementById('~!#$%^&*()_+-=,./\';:"?><[]\\{}|`#');
// document.querySelector or similar
$('#\\~\\!\\#\\$\\%\\^\\&\\*\\(\\)\\_\\+-\\=\\,\\.\\/\\\'\\;\\:\\"\\?\\>\\<\\[\\]\\\\\\{\\}\\|\\`\\#');
</script>

Read the W3C spec. (This is CSS 2.1; find the version appropriate for the browsers you target.)
relevant paragraph:
In CSS, identifiers (including
element names, classes, and IDs in
selectors) can contain only the
characters [a-z0-9] and ISO 10646
characters U+00A1 and higher, plus the
hyphen (-) and the underscore (_);
they cannot start with a digit, or a
hyphen followed by a digit.
Identifiers can also contain escaped
characters and any ISO 10646 character
as a numeric code (see next item). For
instance, the identifier "B&W?" may be
written as "B\&W\?" or "B\26 W\3F".
As @mipadi points out in Kenan Banks's answer, there's this caveat, also on the same webpage:
In CSS, identifiers may begin with '-'
(dash) or '_' (underscore). Keywords
and property names beginning with '-'
or '_' are reserved for
vendor-specific extensions. Such
vendor-specific extensions should have
one of the following formats:
'-' + vendor identifier + '-' + meaningful name
'_' + vendor identifier + '-' + meaningful name
Example(s):
For example, if XYZ organization added
a property to describe the color of
the border on the East side of the
display, they might call it
-xyz-border-east-color.
Other known examples:
-moz-box-sizing
-moz-border-radius
-wap-accesskey
An initial dash or underscore is
guaranteed never to be used in a
property or keyword by any current or
future level of CSS. Thus typical CSS
implementations may not recognize such
properties and may ignore them
according to the rules for handling
parsing errors. However, because the
initial dash or underscore is part of
the grammar, CSS 2.1 implementers
should always be able to use a
CSS-conforming parser, whether or not
they support any vendor-specific
extensions.
Authors should avoid vendor-specific
extensions

The complete regular expression is:
-?(?:[_a-z]|[\200-\377]|\\[0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|\\[^\r\n\f0-9a-f])(?:[_a-z0-9-]|[\200-\377]|\\[0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|\\[^\r\n\f0-9a-f])*
So, apart from “-” and “_”, none of the characters you listed can be used directly. But you can escape them with a backslash (foo\~bar) or with the Unicode notation (foo\7E bar).

For those looking for a workaround, you can use an attribute selector, for instance, if your class begins with a number. Change:
.000000-8{background:url(../../images/common/000000-0.8.png);} /* DOESN'T WORK!! */
to this:
[class="000000-8"]{background:url(../../images/common/000000-0.8.png);} /* WORKS :) */
Also, if the element has multiple classes, you will need to specify all of them in the selector or use the ~= operator:
[class~="000000-8"]{background:url(../../images/common/000000-0.8.png);}
Sources:
https://benfrain.com/when-and-where-you-can-use-numbers-in-id-and-class-names/
Is there a workaround to make CSS classes with names that start with numbers valid?
https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors

My understanding is that the underscore is technically valid. Check out:
https://developer.mozilla.org/en/underscores_in_class_and_id_names
"...errata to the specification published in early 2001 made underscores legal for the first time."
The article linked above says never use them, then gives a list of browsers that don't support them, all of which are, in terms of numbers of users at least, long-redundant.

I would not recommend using anything except A–Z, a–z, 0–9, _ and -, because it's simply easier to code with those symbols. Also, do not start class names with -, since such names are usually browser-specific flags. Sticking to these characters avoids issues with IDE autocompletion and keeps things simple if you ever need to generate class names from other code; some transpiling software may also fail on unusual names.
Yet CSS is quite loose on this. You can use any symbol, and even emoji works.
<style>
.😭 {
border: 2px solid blue;
width: 100px;
height: 100px;
overflow: hidden;
}
</style>
<div class="😭">
😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅
</div>

We can use all characters in a class name, even characters like # and a period (.). We just have to escape them with \ (backslash).
.test\.123 {
color: red;
}
.test\#123 {
color: blue;
}
.test\#123 {
color: green;
}
.test\<123 {
color: brown;
}
.test\`123 {
color: purple;
}
.test\~123 {
color: tomato;
}
<div class="test.123">test.123</div>
<div class="test#123">test#123</div>
<div class="test#123">test#123</div>
<div class="test<123">test<123</div>
<div class="test`123">test`123</div>
<div class="test~123">test~123</div>

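In HTML5, class and id attribute values can start with a digit. In a CSS selector, however, such a name must still be escaped (for example, .\31 23 for the class "123").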

Going off of Kenan Banks's answer, you can use the following two regex matches to make a string valid:
[^a-z0-9A-Z_-]
This is a negated character class that selects anything that isn't a letter, number, dash or underscore, for easy removal.
^-*[0-9]+
This matches zero or more dashes followed by one or more digits at the beginning of a string, also for easy removal.
How I use it in PHP:
// Make alphanumeric with dashes and underscores (removes all other characters)
$class = preg_replace("/[^a-z0-9A-Z_-]/", "", $class);
// Classes only begin with an underscore or letter
$class = preg_replace("/^-*[0-9]+/", "", $class);
// Make sure the string is two or more characters long
return 2 <= strlen($class) ? $class : '';
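
For comparison, the same two replacements can be written in JavaScript. This is a direct port under the assumption that the PHP lines above form the body of a function; the name `sanitizeClassName` is made up:

```javascript
// Port of the PHP steps above: strip disallowed characters, strip a leading
// run of dashes and digits, then require at least two characters.
function sanitizeClassName(input) {
  const stripped = input.replace(/[^a-z0-9A-Z_-]/g, "");
  const trimmed = stripped.replace(/^-*[0-9]+/, "");
  return trimmed.length >= 2 ? trimmed : "";
}

console.log(sanitizeClassName("12 col!"));  // col
console.log(sanitizeClassName("000000-8")); // -8
console.log(sanitizeClassName("-9"));       // (empty string)
```

The second result shows a limit of a single pass: after the leading digits are removed, "-8" still starts with a hyphen followed by a digit, which the rules quoted earlier do not allow unescaped. Repeating the second replacement until the string stops changing would be one way to handle that.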

How to ignore / allow multiple line breaks in a Javascript replace regex

What I am trying to do is clean up an input field on blur. My blur code is completely functional, but I can't get the regex to work correctly. All of the characters I'm trying to allow are working correctly except for any number of line breaks. I've tried different variations and combinations of \s, \r, and \n.
I am doing this because I want to prevent as many characters that don't really belong in a descriptive input field as possible. I am using entity to linq for database input, which should protect me from sql injection attacks, but I still want to restrict the characters for added security. I am allowing apostrophes, but that should be the only potential threat from the allowed characters listed in the regex below.
Once I get the regex, I'll also replace on paste using the same code block.
This is my javascript method that I reverted back to.
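function CleanSentenceInput(AlphaNumString) {
    // Remove everything except letters, digits, whitespace and . , ; : ' ( ) -
    var CleanInput = AlphaNumString.replace(/[^a-z0-9\s.,;:'()-]/gi, '');
    // Trim leading and trailing whitespace (myTrim is defined further down)
    return myTrim(CleanInput);
}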
Is there a way to allow any number of line breaks by modifying this replace regex?
Test Input:
aaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbb
cccccccccccccccc
cccccccccccccccc
Test Result:
aaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccc
Expected Result:
aaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbb
cccccccccccccccc
cccccccccccccccc
Update ** It turns out that the trim function I was using was removing the line-breaks. Here is that function:
Bad trim function:
function myTrim(x) {
return x.replace(/^\s+|\s+$/gm, '');
}
Is there a way to fix this regex so that it still replaces whitespace before and after, but not inside of the content?
Updated **
Good Trim Function:
function myTrim(x) {
return x.replace(/^\s+|\s+$/g, '');
}

As I noted from the beginning the problem is not with the regex since
/[^a-z0-9\s.,;:'()-]/gi
matches characters other than whitespace (beside others in the character class).
In myTrim you need to remove the m flag. With it, ^ and $ are treated as line-start and line-end anchors, whereas you only want to trim the string at its very beginning and end:
function myTrim(x) {
return x.replace(/^\s+|\s+$/g, '');
}
It is also possible to use the built-in trim() method, which is supported by all modern browsers (including IE9 and later).
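
The effect of the m flag can be seen side by side in a short sketch; the sample string is made up for illustration:

```javascript
const input = "  first line\n\nsecond line  ";

// With the m flag, ^ and $ also match at every line break inside the string.
const perLine = input.replace(/^\s+|\s+$/gm, "");

// Without it, they match only at the very start and end of the string.
const wholeString = input.replace(/^\s+|\s+$/g, "");

console.log(JSON.stringify(perLine));      // "first linesecond line"
console.log(JSON.stringify(wholeString));  // "first line\n\nsecond line"
console.log(wholeString === input.trim()); // true
```

In the multiline version the blank line is matched once as trailing whitespace and once as leading whitespace, so both line breaks vanish and the two lines are joined.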

How should I enter a Unicode character in CSS now that octal escapes are deprecated?

I am adding CSS styles using JavaScript (specifically with GreaseMonkey's GM_addStyle).
I'd like to put a Unicode character in a CSS property. I've seen a lot of questions like this one and the answer always seems to be along the lines of
#target:before {
content: "\2611";
}
Now, as I said, this style is being specified in GM_addStyle, and the calling function has Strict mode enabled. When my script runs, I get an error on the console with the message octal literals and octal escape sequences are deprecated.
I think the conflict here is between doing the operation in JavaScript (i.e. putting a Unicode character into a JavaScript string) and the operation in CSS (escaping the character when declaring a CSS property). What syntax should I use to have the character escaped without generating an error?

I'm trying to do just this in GreaseMonkey, with Strict mode enabled.
If you're going via JavaScript, then the String '\2611' doesn't have the same meaning as you think it does and you probably want the String '\u2611' or '\\2611'
'\2611'; // "±1"
'\u2611'; // "☑"
'\\2611'; // "\2611"
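
Put together as a stylesheet string, the two working options look like this. This is a sketch that runs in strict mode; the variable names are made up, and either string could then be passed to GM_addStyle as in the question:

```javascript
"use strict";

// The doubled backslash is a JavaScript escape for one backslash, so the
// string holds \2611, which the CSS parser then reads as an escape for U+2611.
const cssEscaped = '#target:before { content: "\\2611"; }';

// Alternatively, let JavaScript resolve the escape and hand CSS the literal character.
const cssLiteral = '#target:before { content: "\u2611"; }';

console.log(cssEscaped); // #target:before { content: "\2611"; }
console.log(cssLiteral); // #target:before { content: "☑"; }
```

The first form keeps the stylesheet text pure ASCII, which can be the safer choice when the encoding of the injected CSS is in doubt.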

Mysterious garbage character - IE 8 only

I am building a table, with content pulled from other elements in the page (page scraping).
I am using innerText or textContent to pull the text, then a regular expression to trim it:
string.replace(/^\s+|\s+$/g,"");
This works fine in IE 9 and Chrome, but in IE 8 I am getting a garbage character that I cannot identify. I was able to reproduce the behavior with alerts in jsfiddle:
http://jsfiddle.net/Te4FQ/
What is this extra character, and how can I get rid of it?
Update: thanks for the helpful replies! It seems that the character in question is U+200E (left-to-right mark). So the second part of my question remains: how can I get rid of such characters with regular expressions and keep just the regular text?

Both the "At Risk" and "Complete" <th> tags in your jsFiddle snippet have a U+200E (Left-to-Right Mark, aka LRM) code point at the end of their content. That is not a whitespace character, so it cannot be matched by \s.
One way to get rid of this character is to use the XRegExp library, so that you can replace all matches of \p{C} with the empty string (i.e., delete them). \p{C} matches any code point in Unicode's "Other" category, which includes control, format, private use, surrogate, and unassigned code points. U+200E, specifically, is within the \p{Cf} "Other, Format" subcategory.
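
In JavaScript engines that support Unicode property escapes (ES2018 and later, so not IE 8), a similar replacement works without a library. The sketch below swaps in the narrower \p{Cf} instead of \p{C}, since \p{C} would also match control characters such as line breaks; the function name `stripFormatCharacters` is made up:

```javascript
// \p{Cf} needs the u flag; it matches "Other, Format" code points such as U+200E.
function stripFormatCharacters(text) {
  return text.replace(/\p{Cf}/gu, "");
}

const scraped = "  At Risk\u200E ";
const cleaned = stripFormatCharacters(scraped).replace(/^\s+|\s+$/g, "");

console.log(JSON.stringify(cleaned)); // "At Risk"
```

For IE 8 itself, an explicit pattern such as /\u200E/g removes just the left-to-right mark and needs no newer regex features.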

Try printing to the page the result of
escape(string.replace(/^\s+|\s+$/g,""));
Your garbage character should show up as an escape code.
