How do I remove emoji code using JavaScript? I thought I had taken care of it using the code below, but I still have characters like 🔴.
function removeInvalidChars() {
return this.replace(/[\uE000-\uF8FF]/g, '');
}
For me none of the answers completely removed all emojis so I had to do some work myself and this is what i got :
text.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, '');
Also, it should take into account that if one inserting the string later to the database, replacing with empty string could expose security issue. instead replace with the replacement character U+FFFD, see : http://www.unicode.org/reports/tr36/#Deletion_of_Noncharacters
The range you have selected is the Private Use Area, containing non-standard characters. Carriers used to encode emoji as different, inconsistent values inside this range.
More recently, the emoji have been given standardised 'unified' codepoints. Many of these are outside of the Basic Multilingual Plane, in the block U+1F300–U+1F5FF, including your example 🔴 U+1F534 Large Red Circle.
You could detect these characters with [\U0001F300-\U0001F5FF] in a regex engine that supported non-BMP characters, but JavaScript's RegExp is not such a beast. Unfortunately the JS string model is based on UTF-16 code units, so you'd have to work with the UTF-16 surrogates in a regexp:
return this.replace(/([\uE000-\uF8FF]|\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDDFF])/g, '')
However, note that there are other characters in the Basic Multilingual Plane that are used as emoji by phones but which long predate emoji. For example U+2665 is the traditional Heart Suit character ♥, but it may be rendered as an emoji graphic on some devices. It's up to you whether you treat this as emoji and try to remove it. See this list for more examples.
I solved it by using a regex with Unicode property escapes. I got it from this article, it's for Java but still very helpful - Remove Emojis from a Java String.
'Smile😀'.replace(/[^\p{L}\p{N}\p{P}\p{Z}^$\n]/gu, '');
It removes all symbols except:
\p{L} - all letters from any language
\p{N} - numbers
\p{P} - punctuation
\p{Z} - whitespace separators
^$\n - add any symbols you want to keep
This one should be more correct and it works, but for me it leaves some trash symbols in the string:
'Smile😀'.replace(/\p{Emoji}/gu, '');
Edit: added symbols from comments
I've found many suggestions around but the regex that have solved my problem is:
/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g
A short example
function removeEmojis (string) {
var regex = /(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g;
return string.replace(regex, '');
}
Hope it can help you
Just an addition to #hababr answer.
If you need to get rid of complicated emojis, you have to remove also additional things like modifiers and etc:
'👨🏿🎤'.replace(/[\p{Emoji}\p{Emoji_Modifier}\p{Emoji_Component}\p{Emoji_Modifier_Base}\p{Emoji_Presentation}]/gu, '').charCodeAt(0)
update:
*#0-9 - are Emoji characters with a text representation by default, per the Unicode Standard.
so, my current solution is next:
'👨🏿🎤'.replace(/(?![*#0-9]+)[\p{Emoji}\p{Emoji_Modifier}\p{Emoji_Component}\p{Emoji_Modifier_Base}\p{Emoji_Presentation}]/gu, '').charCodeAt(0)
I know this post is a bit old, but I stumbled across this very problem at work and a colleague came up with an interesting idea. Basically instead of stripping emoji character only allow valid characters in. Consulting this ASCII table:
http://www.asciitable.com/
A function such as this could only keep legal characters (the range itself dependent on what you are after)
function (input) {
var result = '';
if (input.length == 0)
return input;
for (var indexOfInput = 0, lengthOfInput = input.length; indexOfInput < lengthOfInput; indexOfInput++) {
var charAtSpecificIndex = input[indexOfInput].charCodeAt(0);
if ((32 <= charAtSpecificIndex) && (charAtSpecificIndex <= 126)) {
result += input[indexOfInput];
}
}
return result;
};
This should preserve all numbers, letters and special characters of the Alphabet for a situation where you wish to preserve the English alphabet + number + special characters. Hope it helps someone :)
#bobince's solution didn't work for me. Either the Emojis stayed there or they were swapped by a different Emoji.
This solution did the trick for me:
var ranges = [
'\ud83c[\udf00-\udfff]', // U+1F300 to U+1F3FF
'\ud83d[\udc00-\ude4f]', // U+1F400 to U+1F64F
'\ud83d[\ude80-\udeff]' // U+1F680 to U+1F6FF
];
$('#mybtn').on('click', function() {
removeInvalidChars();
})
function removeInvalidChars() {
var str = $('#myinput').val();
str = str.replace(new RegExp(ranges.join('|'), 'g'), '');
$("#myinput").val(str);
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="text" id="myinput"/>
<input type="submit" id="mybtn" value="clear"/>
Source
After searching and trying lots of unicode regex, I suggest you try this, it can cover all of emojis:
function removeEmoji(str) {
let strCopy = str;
const emojiKeycapRegex = /[\u0023-\u0039]\ufe0f?\u20e3/g;
const emojiRegex = /\p{Extended_Pictographic}/gu;
const emojiComponentRegex = /\p{Emoji_Component}/gu;
if (emojiKeycapRegex.test(strCopy)) {
strCopy = strCopy.replace(emojiKeycapRegex, '');
}
if (emojiRegex.test(strCopy)) {
strCopy = strCopy.replace(emojiRegex, '');
}
if (emojiComponentRegex.test(strCopy)) {
// eslint-disable-next-line no-restricted-syntax
for (const emoji of (strCopy.match(emojiComponentRegex) || [])) {
if (/[\d|*|#]/.test(emoji)) {
continue;
}
strCopy = strCopy.replace(emoji, '');
}
}
return strCopy;
}
let a = "1️⃣aa🤹♂️b#️⃣🔤✅❎23#!^*bb🤹🏾🤹♀️🚴🏻ccc";
console.log(removeEmoji(a))
Refrence: Unicode Emoij Document
None of the answers here worked for all the unicode characters I tested (specifically characters in the miscellaneous range such as ⛽ or ☯️).
Here is one that worked for me, (heavily) inspired from this SO PHP answer:
function _removeEmojis(str) {
return str.replace(/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?/g, '');
}
(My use case is sorting in a data grid where emojis can come first in a string but users want the text ordered by the actual words.)
sandre89's answer is good but not perfect.
I spent some time on the subject and have a working solution.
var ranges = [
'[\u00A0-\u269f]',
'[\u26A0-\u329f]',
// The following characters could not be minified correctly
// if specifed with the ES6 syntax \u{1F400}
'[🀄-🧀]'
//'[\u{1F004}-\u{1F9C0}]'
];
$('#mybtn').on('click', function() {
removeInvalidChars();
});
function removeInvalidChars() {
var str = $('#myinput').val();
str = str.replace(new RegExp(ranges.join('|'), 'ug'), '');
$("#myinput").val(str);
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="text" id="myinput" />
<input type="submit" id="mybtn" value="clear" />
Here is my CodePen
There are some points to note, though.
Unicode characters from U+1F000 up need a special notation, so you can use sandre89's way, or opt for the \u{1F000} ES6 notation, which may or may not work with your minificator. I succeeded pasting the emojis directly in the UTF-8 encoded script.
Don't forget the u flag in the regex, or your Javascript engine may throw an error.
Beware that things may not be working due to the file encoding, character set, or minificator. In my case nothing worked until I took the script off an .isml file (Demandware) and pasted it into a .js file.
You may gain some insight by referring to Wikipedia Emoji page and How many bytes does one Unicode character take?, and by tinkering with this Online Unicode converter, as I did.
var emoji =/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?|[\u20E3]|[\u26A0-\u3000]|\uD83E[\udd00-\uddff]|[\u00A0-\u269F]/g;
str.replace(emoji, "");
i add this '\uD83E[\udd00-\uddff]'
these emojis were updated when 2018 june
if u want block emojis after other update then use this
str.replace(/[^0-9a-zA-Zㄱ-힣+×÷=%♤♡☆♧)(*&^/~##!-:;,?`_|<>{}¥£€$◇■□●○•°※¤《》¡¿₩\[\]\"\' \\]/g ,"");
u can block all emojis and u can only use eng, num, hangle, and some Characters
thx :)
You can use this function to replace emojis with nothing:
function msgAfterClearEmojis(msg)
{
var new_msg = msg.replace(/([#0-9]\u20E3)|[\xA9\xAE\u203C\u2047-\u2049\u2122\u2139\u3030\u303D\u3297\u3299][\uFE00-\uFEFF]?|[\u2190-\u21FF][\uFE00-\uFEFF]?|[\u2300-\u23FF][\uFE00-\uFEFF]?|[\u2460-\u24FF][\uFE00-\uFEFF]?|[\u25A0-\u25FF][\uFE00-\uFEFF]?|[\u2600-\u27BF][\uFE00-\uFEFF]?|[\u2900-\u297F][\uFE00-\uFEFF]?|[\u2B00-\u2BF0][\uFE00-\uFEFF]?|(?:\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDEFF])[\uFE00-\uFEFF]?|[\u20E3]|[\u26A0-\u3000]|\uD83E[\udd00-\uddff]|[\u00A0-\u269F]/g, '').trim();
return new_msg;
}
You can check here with emoji..
😊 , 😌 , 👽
function removeEmoji() {
var y = document.getElementById('textbox_id1');
y.value = y.value.replace(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g, '');
}
input {
padding: 5px;
}
<input type="text" id="textbox_id1" placeholder="Remove emoji..." oninput="removeEmoji()">
You can take more emojis from here: Emoji Keyboard Online
This is the iteration on #hababr's answer.
His answer removes lots of standard chars like $, +, < and so on.
This version keeps all of them (except for the \ backslash - dunno how to properly escape it).
"hey😁 hau💓 ahoy🏴☠️ !##$%^&*()-_=+±§;:'\|`~/?[]{},.<>".replace(/[^\p{L}\p{N}\p{P}\p{Z}{^$=+±\\'|`\\~<>}]/gu, "")
// "hey hau ahoy !##$%^&*()-_=+±§;:'|`~/?[]{},.<>"
I have this regex and it works for all emojis i found on this page
try this regex
<:[^:\s]+:\d+>|<a:[^:\s]+:\d+>|(\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff]|\ufe0f)
var emojiRegex = /\uD83C\uDFF4(?:\uDB40\uDC67\uDB40\uDC62(?:\uDB40\uDC65\uDB40\uDC6E\uDB40\uDC67|\uDB40\uDC77\uDB40\uDC6C\uDB40\uDC73|\uDB40\uDC73\uDB40\uDC63\uDB40\uDC74)\uDB40\uDC7F|\u200D\u2620\uFE0F)|\uD83D\uDC69\u200D\uD83D\uDC69\u200D(?:\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67]))|\uD83D\uDC68(?:\u200D(?:\u2764\uFE0F\u200D(?:\uD83D\uDC8B\u200D)?\uD83D\uDC68|(?:\uD83D[\uDC68\uDC69])\u200D(?:\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67]))|\uD83D\uDC66\u200D\uD83D\uDC66|\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67])|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|(?:\uD83C[\uDFFB-\uDFFF])\u200D(?:\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3]))|\uD83D\uDC69\u200D(?:\u2764\uFE0F\u200D(?:\uD83D\uDC8B\u200D(?:\uD83D[\uDC68\uDC69])|\uD83D[\uDC68\uDC69])|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|\uD83D\uDC69\u200D\uD83D\uDC66\u200D\uD83D\uDC66|(?:\uD83D\uDC41\uFE0F\u200D\uD83D\uDDE8|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2695\u2696\u2708]|\uD83D\uDC68(?:(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2695\u2696\u2708]|\u200D[\u2695\u2696\u2708])|(?:(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)\uFE0F|\uD83D\uDC6F|\uD83E[\uDD3C\uDDDE\uDDDF])\u200D[\u2640\u2642]|(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2640\u2642]|(?:\uD83C[\uDFC3\uDFC4\uDFCA]|\uD83D[\uDC6E\uDC71\uDC73\uDC77\uDC81\uDC82\uDC86\uDC87\uDE45-\uDE47\uDE4B\uDE4D\uDE4E\uDEA3\uDEB4-\uDEB6]|\uD83E[\uDD26\uDD37-\uDD39\uDD3D\uDD3E\uDDB8\uDDB9\uDDD6-\uDDDD])(?:(?:\uD83C[\uDFFB-\uDFFF])\u200D[\u2640\u2642]|\u200D[\u2640\u2642])|\uD83D\uDC69\u200D[\u2695\u2696\u2708])\uFE0F|\uD83D\uDC69\u200D\uD83D\uDC67\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D\uDC69\u200D\uD83D\uDC69\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D\uDC68(?:\u200D(?:(?:\uD83D[\uDC68\uDC69])\u200D(?:\uD83D[\uDC66\uDC67])|\uD83D[\uDC66\uDC67])|\uD83C[\uDFFB-\uDFFF])|\uD83C\uDFF3\uFE0F\u200D\uD83C\uDF08|\uD83D\uDC69\u200D\uD83D\uDC67|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])\u200D(?:\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDB0-\uDDB3])|\uD83D\uDC69\u200D\uD83D\uDC66|\uD83C\uDDF6\uD83C\uDDE6|\uD83C\uDDFD\uD83C\uDDF0|\uD83C\uDDF4\uD83C\uDDF2|\uD83D\uDC69(?:\uD83C[\uDFFB-\uDFFF])|\uD83C\uDDED(?:\uD83C[\uDDF0\uDDF2\uDDF3\uDDF7\uDDF9\uDDFA])|\uD83C\uDDEC(?:\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEE\uDDF1-\uDDF3\uDDF5-\uDDFA\uDDFC\uDDFE])|\uD83C\uDDEA(?:\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDED\uDDF7-\uDDFA])|\uD83C\uDDE8(?:\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDEE\uDDF0-\uDDF5\uDDF7\uDDFA-\uDDFF])|\uD83C\uDDF2(?:\uD83C[\uDDE6\uDDE8-\uDDED\uDDF0-\uDDFF])|\uD83C\uDDF3(?:\uD83C[\uDDE6\uDDE8\uDDEA-\uDDEC\uDDEE\uDDF1\uDDF4\uDDF5\uDDF7\uDDFA\uDDFF])|\uD83C\uDDFC(?:\uD83C[\uDDEB\uDDF8])|\uD83C\uDDFA(?:\uD83C[\uDDE6\uDDEC\uDDF2\uDDF3\uDDF8\uDDFE\uDDFF])|\uD83C\uDDF0(?:\uD83C[\uDDEA\uDDEC-\uDDEE\uDDF2\uDDF3\uDDF5\uDDF7\uDDFC\uDDFE\uDDFF])|\uD83C\uDDEF(?:\uD83C[\uDDEA\uDDF2\uDDF4\uDDF5])|\uD83C\uDDF8(?:\uD83C[\uDDE6-\uDDEA\uDDEC-\uDDF4\uDDF7-\uDDF9\uDDFB\uDDFD-\uDDFF])|\uD83C\uDDEE(?:\uD83C[\uDDE8-\uDDEA\uDDF1-\uDDF4\uDDF6-\uDDF9])|\uD83C\uDDFF(?:\uD83C[\uDDE6\uDDF2\uDDFC])|\uD83C\uDDEB(?:\uD83C[\uDDEE-\uDDF0\uDDF2\uDDF4\uDDF7])|\uD83C\uDDF5(?:\uD83C[\uDDE6\uDDEA-\uDDED\uDDF0-\uDDF3\uDDF7-\uDDF9\uDDFC\uDDFE])|\uD83C\uDDE9(?:\uD83C[\uDDEA\uDDEC\uDDEF\uDDF0\uDDF2\uDDF4\uDDFF])|\uD83C\uDDF9(?:\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDED\uDDEF-\uDDF4\uDDF7\uDDF9\uDDFB\uDDFC\uDDFF])|\uD83C\uDDE7(?:\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEF\uDDF1-\uDDF4\uDDF6-\uDDF9\uDDFB\uDDFC\uDDFE\uDDFF])|[#\*0-9]\uFE0F\u20E3|\uD83C\uDDF1(?:\uD83C[\uDDE6-\uDDE8\uDDEE\uDDF0\uDDF7-\uDDFB\uDDFE])|\uD83C\uDDE6(?:\uD83C[\uDDE8-\uDDEC\uDDEE\uDDF1\uDDF2\uDDF4\uDDF6-\uDDFA\uDDFC\uDDFD\uDDFF])|\uD83C\uDDF7(?:\uD83C[\uDDEA\uDDF4\uDDF8\uDDFA\uDDFC])|\uD83C\uDDFB(?:\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDEE\uDDF3\uDDFA])|\uD83C\uDDFE(?:\uD83C[\uDDEA\uDDF9])|(?:\uD83C[\uDFC3\uDFC4\uDFCA]|\uD83D[\uDC6E\uDC71\uDC73\uDC77\uDC81\uDC82\uDC86\uDC87\uDE45-\uDE47\uDE4B\uDE4D\uDE4E\uDEA3\uDEB4-\uDEB6]|\uD83E[\uDD26\uDD37-\uDD39\uDD3D\uDD3E\uDDB8\uDDB9\uDDD6-\uDDDD])(?:\uD83C[\uDFFB-\uDFFF])|(?:\u26F9|\uD83C[\uDFCB\uDFCC]|\uD83D\uDD75)(?:\uD83C[\uDFFB-\uDFFF])|(?:[\u261D\u270A-\u270D]|\uD83C[\uDF85\uDFC2\uDFC7]|\uD83D[\uDC42\uDC43\uDC46-\uDC50\uDC66\uDC67\uDC70\uDC72\uDC74-\uDC76\uDC78\uDC7C\uDC83\uDC85\uDCAA\uDD74\uDD7A\uDD90\uDD95\uDD96\uDE4C\uDE4F\uDEC0\uDECC]|\uD83E[\uDD18-\uDD1C\uDD1E\uDD1F\uDD30-\uDD36\uDDB5\uDDB6\uDDD1-\uDDD5])(?:\uD83C[\uDFFB-\uDFFF])|(?:[\u231A\u231B\u23E9-\u23EC\u23F0\u23F3\u25FD\u25FE\u2614\u2615\u2648-\u2653\u267F\u2693\u26A1\u26AA\u26AB\u26BD\u26BE\u26C4\u26C5\u26CE\u26D4\u26EA\u26F2\u26F3\u26F5\u26FA\u26FD\u2705\u270A\u270B\u2728\u274C\u274E\u2753-\u2755\u2757\u2795-\u2797\u27B0\u27BF\u2B1B\u2B1C\u2B50\u2B55]|\uD83C[\uDC04\uDCCF\uDD8E\uDD91-\uDD9A\uDDE6-\uDDFF\uDE01\uDE1A\uDE2F\uDE32-\uDE36\uDE38-\uDE3A\uDE50\uDE51\uDF00-\uDF20\uDF2D-\uDF35\uDF37-\uDF7C\uDF7E-\uDF93\uDFA0-\uDFCA\uDFCF-\uDFD3\uDFE0-\uDFF0\uDFF4\uDFF8-\uDFFF]|\uD83D[\uDC00-\uDC3E\uDC40\uDC42-\uDCFC\uDCFF-\uDD3D\uDD4B-\uDD4E\uDD50-\uDD67\uDD7A\uDD95\uDD96\uDDA4\uDDFB-\uDE4F\uDE80-\uDEC5\uDECC\uDED0-\uDED2\uDEEB\uDEEC\uDEF4-\uDEF9]|\uD83E[\uDD10-\uDD3A\uDD3C-\uDD3E\uDD40-\uDD45\uDD47-\uDD70\uDD73-\uDD76\uDD7A\uDD7C-\uDDA2\uDDB0-\uDDB9\uDDC0-\uDDC2\uDDD0-\uDDFF])|(?:[#\*0-9\xA9\xAE\u203C\u2049\u2122\u2139\u2194-\u2199\u21A9\u21AA\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA\u24C2\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE\u2600-\u2604\u260E\u2611\u2614\u2615\u2618\u261D\u2620\u2622\u2623\u2626\u262A\u262E\u262F\u2638-\u263A\u2640\u2642\u2648-\u2653\u265F\u2660\u2663\u2665\u2666\u2668\u267B\u267E\u267F\u2692-\u2697\u2699\u269B\u269C\u26A0\u26A1\u26AA\u26AB\u26B0\u26B1\u26BD\u26BE\u26C4\u26C5\u26C8\u26CE\u26CF\u26D1\u26D3\u26D4\u26E9\u26EA\u26F0-\u26F5\u26F7-\u26FA\u26FD\u2702\u2705\u2708-\u270D\u270F\u2712\u2714\u2716\u271D\u2721\u2728\u2733\u2734\u2744\u2747\u274C\u274E\u2753-\u2755\u2757\u2763\u2764\u2795-\u2797\u27A1\u27B0\u27BF\u2934\u2935\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55\u3030\u303D\u3297\u3299]|\uD83C[\uDC04\uDCCF\uDD70\uDD71\uDD7E\uDD7F\uDD8E\uDD91-\uDD9A\uDDE6-\uDDFF\uDE01\uDE02\uDE1A\uDE2F\uDE32-\uDE3A\uDE50\uDE51\uDF00-\uDF21\uDF24-\uDF93\uDF96\uDF97\uDF99-\uDF9B\uDF9E-\uDFF0\uDFF3-\uDFF5\uDFF7-\uDFFF]|\uD83D[\uDC00-\uDCFD\uDCFF-\uDD3D\uDD49-\uDD4E\uDD50-\uDD67\uDD6F\uDD70\uDD73-\uDD7A\uDD87\uDD8A-\uDD8D\uDD90\uDD95\uDD96\uDDA4\uDDA5\uDDA8\uDDB1\uDDB2\uDDBC\uDDC2-\uDDC4\uDDD1-\uDDD3\uDDDC-\uDDDE\uDDE1\uDDE3\uDDE8\uDDEF\uDDF3\uDDFA-\uDE4F\uDE80-\uDEC5\uDECB-\uDED2\uDEE0-\uDEE5\uDEE9\uDEEB\uDEEC\uDEF0\uDEF3-\uDEF9]|\uD83E[\uDD10-\uDD3A\uDD3C-\uDD3E\uDD40-\uDD45\uDD47-\uDD70\uDD73-\uDD76\uDD7A\uDD7C-\uDDA2\uDDB0-\uDDB9\uDDC0-\uDDC2\uDDD0-\uDDFF])\uFE0F|(?:[\u261D\u26F9\u270A-\u270D]|\uD83C[\uDF85\uDFC2-\uDFC4\uDFC7\uDFCA-\uDFCC]|\uD83D[\uDC42\uDC43\uDC46-\uDC50\uDC66-\uDC69\uDC6E\uDC70-\uDC78\uDC7C\uDC81-\uDC83\uDC85-\uDC87\uDCAA\uDD74\uDD75\uDD7A\uDD90\uDD95\uDD96\uDE45-\uDE47\uDE4B-\uDE4F\uDEA3\uDEB4-\uDEB6\uDEC0\uDECC]|\uD83E[\uDD18-\uDD1C\uDD1E\uDD1F\uDD26\uDD30-\uDD39\uDD3D\uDD3E\uDDB5\uDDB6\uDDB8\uDDB9\uDDD1-\uDDDD])/g;
console.log(text.replace(emojiRegex,'');
<!DOCTYPE html>
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script>
function isEmoji(str) {
var ranges = [
'[\uE000-\uF8FF]',
'\uD83C[\uDC00-\uDFFF]',
'\uD83D[\uDC00-\uDFFF]',
'[\u2011-\u26FF]',
'\uD83E[\uDD10-\uDDFF]'
];
if (str.match(ranges.join('|'))) {
return true;
} else {
return false;
}
}
$(document).ready(function(){
$('input').on('input',function(){
var $th = $(this);
console.log("Value of Input"+$th.val());
emojiInput= isEmoji($th.val());
if (emojiInput==true) {
$th.val("");
}
});
});
</script>
</head>
<body>
Enter your name: <input type="text">
</body>
</html>
There is a modern solution using categories
Modern browsers support Unicode property, which allows you to match emojis based on their belonging in the Emoji Unicode category. For example, you can use Unicode property escapes like \p{Emoji} or \P{Emoji} to match/no match emoji characters. Note that 0123456789#* and other characters are interpreted as emojis using the previous Unicode category. Therefore, a better way to do this is to use the {Extended_Pictographic} Unicode category that denotes all the characters typically understood as emojis instead of the {Emoji} category.
const withEmojis = /\p{Extended_Pictographic}/u
withEmojis.test('😀😀');
//true
withEmojis.test('ab');
//false
withEmojis.test('1');
//false
or with negation
const noEmojis = /\P{Extended_Pictographic}/u
noEmojis.test('😀');
//false
noEmojis.test('1212');
//false
You can use mathiasbynens/emoji-regex package to remove or replace emojis.
You can see the latest build's content to grab the regex by visiting following url:
http://unpkg.com/emoji-regex/index.js
In detail, this function first uses TextEncoder to convert content into a byte array with utf-8 encoding, then loops through this array, if it finds a byte whose first five bits are 11110 (i.e. 0xF0), it means this is an emoji start, then it replaces this byte and the next three bytes with 0x30 (i.e. number 0). Finally, it uses TextDecoder to convert the modified byte array back to a string, and uses replaceAll method to remove extra 0s.
function removeEmoji (content) {
let conByte = new TextEncoder("utf-8").encode(content);
for (let i = 0; i < conByte.length; i++) {
if ((conByte[i] & 0xF8) == 0xF0) {
for (let j = 0; j < 4; j++) {
conByte[i+j]=0x30;
}
i += 3;
}
}
content = new TextDecoder("utf-8").decode(conByte);
return content.replaceAll("0000", "");
}
I am still new to javascript and I am trying to validate my form.
One of my inputs is a text input for an identity number that follows the following pattern: ####XX where # represents a number and X represents a capital letter from A-Z.
Here is my code so far:
var IDnum = document.getElementById('identityNumber').value;
if ( (isNaN(IDnum.charAt(0))) && (isNaN(IDnum.charAt(1)))&& (isNaN(IDnum.charAt(2))) && (isNaN(IDnum.charAt(3))) && (!isNaN(IDnum.charAt(4))) )
{
document.getElementById('identityError').style.display = "inline-block";
}
else
{
document.getElementById('identityError').style.display = "none";
}
I have tried to google it and have seen some info where they use a RegExp however i have yet to learn anything like that.
With my code above, no matter what i type it, it still validates it. Any ideas what i am doing wrong and if there is a more simple and easier way?
EDIT: after looking to regex and similar answers the following
^\d{4}[A-Z]{2}$
did not work either
A regular expression is the way to go here. Use the pattern ^\d{4}[A-Z]$:
document.querySelector('button').addEventListener('click', (e) => {
const { value } = document.querySelector('input');
if (value.match(/^\d{4}[A-Z]$/)) {
console.log('OK');
} else {
console.log('Bad');
}
});
<input>
<button>submit</button>
^\d{4}[A-Z]$ means:
^ - Match the start of the string
\d{4} - Match a digit character (0 to 9) 4 times
[A-Z] - Match a character from A to Z
$ - Match the end of the string
You can use regular expression to identify whether string has 4 digits before a character.
each \d represents a digit, \d\d\d\d means 4 digits (alternatively \d{4}).
followed by . means 4 digits followed by any character.
function isAllowed(str) {
return str.match(/^\d\d\d\d.$/g) !== null
}
console.log(isAllowed("1234X"));
console.log(isAllowed("123a"));
console.log(isAllowed("3892#"));
console.log(isAllowed("X"));
var IDnum = document.getElementById('identityNumber').value;
if (isAllowed(IDnum))
{
document.getElementById('identityError').style.display = "inline-block";
}
else
{
document.getElementById('identityError').style.display = "none";
}
function RegexCheck(str) {
var pettern = new RegExp('^[0-9]{4,}[A-Z]{1,}');
return pettern.test(str);
}
console.log(RegexCheck("####X"));
console.log(RegexCheck("1234A"));
console.log(RegexCheck("2C35B"));
console.log(RegexCheck("A698C"));
console.log(RegexCheck("1698b"));
You can use the pattern attribute to provide a RegExp string:
^\d{4}[A-Z]{2}$ would be a string consisting of 4 digits followed by two capital letters between A and Z.
Explanation
^: Beginning of the string.
\d{4}: Exactly 4 digits in a row (this could also be written as \d\d\d\d)
[A-Z]{2}: Exactly 2 characters from the range of character between A and Z (alternatively [A-Z][A-Z]).
$: The end of the string.
input:invalid {
color: red;
}
input:not(:invalid) {
color: green;
}
<input type="text" pattern="^\d{4}[A-Z]{2}$">
Have a question. Sorry for my English.
I have a string. Customer enter SMS text at text-area.
How can I find the existing character of utf-16 at the string or not?
At php I check this code:
if (iconv("UTF-8","UTF-8//IGNORE",$_entry_text) != $_entry_text) {
// exist utf-16
}
How can I at Javascript check? Try to find an answer the second day ((
Thanks.
A string is a series of characters, each which have a character code. ASCII defines characters from 0 to 127, so if a character in the string has a code greater than that, then it is a Unicode character. This function checks for that. See String#charCodeAt.
function hasUnicode (str) {
for (var i = 0; i < str.length; i++) {
if (str.charCodeAt(i) > 127)
return true;
}
return false;
}
Then use it like, hasUnicode("Test message");
If it's a short string, one method would be to just look across the length of it and check if any of the char-codes sit outside of the single-byte 0-255 range
if (_entry_text.charCodeAt(i) > 255) ...
I think there is no need for a loop;
var ascii = /^[ -~]+$/;
ascii.test("Sefa"); // it is true no non-ascii characters
ascii.test("Sefa£"); // it is false there is non-ascii character
The following regex:
x.toString().replace(/\B(?=(?:\d{3})+(?!\d))/g, "-");
adds dash after each 3rd character so entered 123456789 turns into 123-456-789.
Im trying to use this regex to format phone number. The problem arises on the 10th character. So entered 1234567890 turns into 1-234-567-890.
How would I modify the above regex to turn strings that have 10 digits into 123-456-7890. I use this regex because this happens as user is typing in uses keyup event.
If you know easier or better way of doing this please help me out, dashes has to be added while user is typing in. No other characters allowed.
Notes:
Cant use Jquery Masked input plugin (because if editing the middle character it's focus gets messed up)
How about
> "12345678".match(/\d{3}(?=\d{2,3})|\d+/g).join("-")
"123-456-78"
> "123456789".match(/\d{3}(?=\d{2,3})|\d+/g).join("-")
"123-456-789"
> "1234567890".match(/\d{3}(?=\d{2,3})|\d+/g).join("-")
"123-456-7890"
If you ALREADY have the complete number or string
var x = "329193914";
console.log(x.replace(/(\d{3})(\d{3})(\d{3})/, "$1-$2-$3"));
If you WANT AS someone is typing...
$('#locAcct').keyup(function () {
var foo = $(this).val().split("-").join(""); // remove hyphens
if (foo.length > 0) {
foo = foo.match(new RegExp('.{1,3}', 'g')).join("-");
}
$(this).val(foo);
});
Do you need to use regular expressions for everything or would maybe something like this also help you out?
function convertToValidPhoneNumber(text) {
var result = [];
text = text.replace(/[^\d]/g,"");
while (text.length >= 6) {
result.push(text.substring(0, 3));
text = text.substring(3);
}
if(text.length > 0) result.push(text);
return result.join("-");
}
You could use this function everytime the text in your inputfield changes. It will produce the following results:
"12345678" -> "123-45678"
"123d456789" -> "123-456-789"
"123-4567-89" -> "123-456-789"
I believe the simplest way would be to add dash after every n digits would be like
var a = $('#result');
var x = "<p>asiija kasdjflaksd jflka asdkhflakjshdfk jasd flaksjdhfklasd f</p><p>12345678912345678912345678912312344545545456789</p>"
a.html(x.replace(/(\d{15})/g, "$1-"));
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="result"></div>
Most easiest way is the following using simple javascript onkey and function... it will put dash hyphen after every 3 characters you input / type.
<input type="text" class="form-control" name="sector" id="sector" onkeyup="addDash(this)" required>
add the following script
<script>
function addDash (element) {
let ele = document.getElementById(element.id);
ele = ele.value.split('-').join(''); // Remove dash (-) if mistakenly entered.
let finalVal = ele.match(/.{1,3}/g).join('-');
document.getElementById(element.id).value = finalVal;
}
</script>
i need a javascript function that able to check for digit and - only.
example: 1,2,3,4,5,6,7,8,9,0 will return true
and - will return true as well.
other than that all return false including enter is pressed.
i have a function like this:
function IsNumeric(sText){
var filter = /^[0-9-+]+$/;
if (filter.test(sText)) {
return true;
}else {
return false;
}
}
i call it like this:
if(!IsNumeric(value)) {
alert("Number and - only please");
}
for some reason it does not work, any method to do the verification without using regex?
EDIT: OK, updated as per your comment, an expression to match either a lone minus sign or any combination of digits with no minus sign:
function IsNumeric(sText){
return /^(-|\d+)$/.test(sText);
}
If you want only positive numbers and don't want to allow leading zeros then use this regex:
/^(-|[1-9]\d*)$/
Regarding your question "any method to do the verification without using regex?", yes, there are endless ways to achieve this with the various string and number manipulation functions provided by JS. But a regex is simplest.
Your function returns true if the supplied value contains any combination of digits and the plus or minus symbols, including repeats such as in "---+++123". Note that the + towards the end of your regex means to match the preceding character 1 or more times.
What you probably want is a regex that allows a single plus or minus symbol at the beginning, followed by any combination of digits:
function IsNumeric(sText){
return /^[-+]?\d+$/.test(sText);
}
? means match the preceding character 0 or 1 times. You can simplify [0-9] as \d. Note that you don't need the if statement: just return the result from .test() directly.
That will accept "-123", "123", "+123" but not "--123". If you don't want to allow a plus sign at the beginning change the regex to /^-?\d+$/.
"example: 1,2,3,4,5,6,7,8,9,0 will return true and - will return true as well."
Your example seems to be saying that only a single digit or a single minus sign is considered valid - if so then try this:
function IsNumeric(sText){
return /^[\d-]$/.test(sText);
}
How about
function IsNumeric(s) {
return /^(+|-|)\d*$/.test(s);
}
Hiphen(-) has special meaning so use escape character in character set.
Try this:
var filter = /^[0-9\-]+$/;
Can be simple ... try this:
function IsNumeric(str) {
return str.length == 1 && (parseInt(str) < 10 || str == "-");
}