My team and I are working on a React/Redux project, and I want to filter out duplicated tags. However, I've found that some of the tag data contains tricky strings like this
When I log those tags to the console, the first and second tags in the list, for example, both look like "HumanIty", but when I compare them, even with the strict equality operator, the result is false.
When I select and copy the text of both tags and paste them back into the console, I get a surprising result: the string in the second tag somehow has spaces between its characters (the red dots in the picture below).
If someone has faced this problem before, please explain what is going on.
Thank you.
To answer your question directly:
Is it possible for two equal strings to be unequal in JavaScript?
No.
As mentioned in the comments, you have some invisible characters in your strings, which makes them unequal when you compare them.
To fix the problem, remove the invisible characters with a method of your choice (my recommendation would be to not let users enter invisible characters in the first place).
What is the .length property of each string?
If you iterate an index variable over each character position from 0 (inclusive) to length (exclusive), and print the .charCodeAt(index), what do you see?
In doing this, you might see differences between the strings.
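The check described above can be sketched as follows. The helper name codeUnits and the zero-width space in the sample are assumptions made for illustration, not the actual data from the question.

```javascript
// Hypothetical helper: returns the UTF-16 code units of a string as hex.
function codeUnits(str) {
  const units = [];
  for (let i = 0; i < str.length; i++) {
    units.push(str.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0"));
  }
  return units;
}

const plain = "Hi";
const tricky = "H\u200Bi"; // U+200B ZERO WIDTH SPACE, an assumed example

console.log(plain === tricky);            // false
console.log(plain.length, tricky.length); // 2 3
console.log(codeUnits(plain).join(" "));  // 0048 0069
console.log(codeUnits(tricky).join(" ")); // 0048 200B 0069
```

Any code unit that shows up in one list but not in the other is the character that makes the comparison fail.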
I've found out that one of the two look-alike strings contains a special invisible, zero-width character called the Byte Order Mark
(https://www.ionos.com/digitalguide/websites/web-development/byte-order-mark/)
and we can strip such characters with the regex /[^\x20-\x7E]/g, as in
(https://www.w3resource.com/javascript-exercises/javascript-string-exercise-32.php)
Note that this regex removes every character outside the printable ASCII range, so it also strips accented letters and other legitimate non-ASCII text.
We can detect invisible characters with tools that display Unicode characters
(https://qaz.wtf/u/show.cgi?show=a%E2%80%8Bc&type=string)
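A narrower alternative, as a minimal sketch: strip only a known set of invisible characters before de-duplicating. The character list and the names INVISIBLE, normalizeTag and uniqueTags are assumptions for illustration. The list is not exhaustive, and U+200C/U+200D are meaningful in some scripts and emoji sequences, so adjust it to your data.

```javascript
// Assumed set of invisible characters: zero-width space and joiners,
// directional marks, word joiner, invisible operators, and the BOM.
const INVISIBLE = /[\u200B-\u200F\u2060-\u2064\uFEFF]/g;

// Hypothetical helper: removes the characters above from one tag.
function normalizeTag(tag) {
  return tag.replace(INVISIBLE, "");
}

// Hypothetical helper: de-duplicates tags after normalizing them.
function uniqueTags(tags) {
  return [...new Set(tags.map(normalizeTag))];
}

const tags = ["HumanIty", "H\uFEFFumanIty", "Caf\u00E9"];
console.log(uniqueTags(tags)); // two entries: "HumanIty" and "Café"

// For comparison, the printable-ASCII regex also drops the accented letter:
console.log("Caf\u00E9".replace(/[^\x20-\x7E]/g, "")); // "Caf"
```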
I am trying to match multiple values between quotes
(these values can be anything except spaces).
The best I can achieve is to match everything between the first and the last quote.
I have already checked many SO answers, yet I cannot make it work.
Here is the regex:
\[\[\[(\w*img\w*)\s(\w*id|url\w*)+="([^"]|.*)"\]\]\]
here is the string I try to match (values are numbers but I could have urls or anything similar)
[[[img id="37" w="100" h="70"]]]
I should get all parameters and their respective values, but I get only one parameter, with the value being 37" w="100" h="70
I know I am close, but this one is tricky
regards
I don't think you need all the \w.
And I also would suggest splitting the task in two parts as suggested in a comment.
However, I also see an option in doing it in just one step:
\[\[\[img(?:\s(\w+)="([^"]+)")?(?:\s(\w+)="([^"]+)")?(?:\s(\w+)="([^"]+)")?\]\]\]
This is basically the wrapper [[[]]], a normal character part img and then (?:\s(\w+)="([^"]+)")? repeated as many times as you expect attributes to appear. (\w+) matches the name of the attribute and ([^"]+) its value.
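The two-part approach mentioned above could look roughly like this. It is a sketch under assumptions: the function name parseTag and the shape of the returned object are made up for illustration, and it relies on String.prototype.matchAll being available.

```javascript
// Hypothetical helper. Step 1 validates the [[[name ...]]] wrapper and
// captures the attribute text; step 2 extracts each name="value" pair.
function parseTag(input) {
  const outer = /^\[\[\[(\w+)((?:\s+\w+="[^"]*")*)\]\]\]$/.exec(input);
  if (!outer) return null;

  const attrs = {};
  for (const m of outer[2].matchAll(/(\w+)="([^"]*)"/g)) {
    attrs[m[1]] = m[2];
  }
  return { name: outer[1], attrs };
}

const parsed = parseTag('[[[img id="37" w="100" h="70"]]]');
console.log(parsed.name);  // "img"
console.log(parsed.attrs); // id is "37", w is "100", h is "70"
```

Unlike the one-step regex, this does not need the attribute group repeated once per expected attribute.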
Ok so I am rather stumped by this one.
I get a string value from a JavaScript library. I call myStringVar = myStringVar.trim(), but when I do myStringVar.substring(0,1) it gives me what looks like an empty string. When I call var arr = myStringVar.split('') the first element in the array looks like an empty string, and when I call arr[0].trim().length it returns 1 instead of zero.
Am I missing something?
EDIT
Following the comments and responses I have been able to isolate the problem down to the existence of a non-visual unicode character at the beginning of the string. I will now try to find a way to remove those characters from the string....or better yet extract the portions of the string that are of interest.
Thanks for the help.
The most likely answer for this is that you have some invisible Unicode character in your string (for instance, "", U+2063 INVISIBLE SEPARATOR).
A string containing only such a character would look to a user (or programmer) like an empty string, but would in fact have length 1, since it does contain a character.
One simple way to test whether this is the case is to get the Unicode character code of the first character in the string with string.charCodeAt(0). You can then look this value up in a Unicode table (such as this one), which should tell you if you have an invisible character in your string.
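As a rough illustration of that test, here is a sketch that uses an assumed sample string starting with U+2063. The helper stripLeadingFormatChars is hypothetical, and it assumes a runtime that supports Unicode property escapes (\p{...} with the u flag, ES2018).

```javascript
const raw = "\u2063abc"; // assumed sample: INVISIBLE SEPARATOR, then "abc"
const trimmed = raw.trim(); // trim() leaves U+2063 alone: it is not whitespace

console.log(trimmed.length);                     // 4
console.log(trimmed.substring(0, 1).length);     // 1, although it prints as nothing
console.log(trimmed.charCodeAt(0).toString(16)); // "2063"

// Hypothetical helper: drops leading characters in the Unicode "Format"
// category (Cf), which includes U+2063, U+200B, U+200E and U+FEFF.
function stripLeadingFormatChars(str) {
  return str.replace(/^\p{Cf}+/u, "");
}

const cleaned = stripLeadingFormatChars(trimmed);
console.log(cleaned.substring(0, 1)); // "a"
```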
I am surprised not to find any post about this, so I must be missing something very trivial. I have a small JavaScript function to check if a string matches an object's properties. Simple stuff, right? It works easily with all strings except those which contain a forward slash.
"04/08/2015".indexOf('4') // returns 2 :good
"04/08/2015".indexOf('4/') // returns -1 :why?
The same issue appears to be with .search() function as well. I encountered this issue while working on date strings.
Please note that I don't want to use regex based solution for performance reasons. Thanks for your help in advance!
Your string has invisible Unicode characters in it. The "left-to-right mark" (hex 200E) appears around the two slash characters as well as at the beginning and the end of the string.
If you type the code in on your browser console instead of cutting and pasting, you'll see that it works as expected.
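To reproduce this without relying on copy and paste, the marks can be written as escape sequences. The sketch below assumes the marks sit exactly where described above. Since the question rules out regex, it removes them with split and join.

```javascript
const LRM = "\u200E"; // LEFT-TO-RIGHT MARK

// Assumed layout: a mark at both ends and on each side of both slashes.
const pasted =
  LRM + "04" + LRM + "/" + LRM + "08" + LRM + "/" + LRM + "2015" + LRM;

console.log(pasted.length);        // 16, not 10
console.log(pasted.indexOf("4"));  // 2 (it would be 1 in a clean string)
console.log(pasted.indexOf("4/")); // -1, a mark sits between "4" and "/"

// Remove the marks without a regular expression.
const clean = pasted.split(LRM).join("");
console.log(clean === "04/08/2015"); // true
console.log(clean.indexOf("4/"));    // 1
```

Note that the first indexOf already hints at the problem: it returns 2 because of the mark at the start of the string.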
I'm trying to match variables and numbers in a javascript string (surrounding matches with span tags).
I'm having issues with variables in the form x1, c2 etc. My code originally looked like this
output = output.replace(/\d+/g,"<span class=\"number\">$&</span>");
output = output.replace(/\)/g,"<span class=\"function\">$&</span>");
output = output.replace(/[a-zA-Z]+\d*/g,returnTextValue);
//returnTextValue is a function checking whether the string is a variable or plain text
//and highlighting accordingly
This caused variables in the form [a-zA-Z]+\d+ to not get matched correctly, because they had already been replaced with the number tag.
I've been trying a few things using lookaheads and stuff like [^A-Za-z]?\d+ for the numbers, but have not been able to find a good way of doing this.
I know I could match the tags, but would like a more elegant solution.
Am I missing an obvious logical solution, or does somebody have a regex operator I don't know for this situation?
Is the \d+ in the first rule supposed to match isolated numbers? Add boundaries \b\d+\b, then it won't match the a2 type. – Michael Berkowski Dec 6 at 2:55
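A quick sketch of the boundary suggestion, followed by one possible single-pass variant. The function highlight and the "variable" class name are assumptions for illustration; the callback stands in for the asker's returnTextValue.

```javascript
const input = "x1 + 42 * c2";

// With word boundaries, only the standalone number is wrapped.
const numbersOnly = input.replace(/\b\d+\b/g, '<span class="number">$&</span>');
console.log(numbersOnly); // x1 + <span class="number">42</span> * c2

// Hypothetical single-pass variant: one regex with two alternatives, so the
// inserted markup is never scanned again by a later replace.
function highlight(expr) {
  return expr.replace(/([a-zA-Z]+\d*)|(\b\d+\b)/g, (match, variable) => {
    const cls = variable !== undefined ? "variable" : "number";
    return `<span class="${cls}">${match}</span>`;
  });
}

console.log(highlight("x1 + 42"));
// <span class="variable">x1</span> + <span class="number">42</span>
```

The single pass matters because a later replace for [a-zA-Z]+\d* would otherwise also match words such as "span" and "class" inside the tags already inserted.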
I have the following code. I want to extract the last text (hello64) from it.
<span class="qnNum" id="qn">4</span><span>.</span> hello64 ?*
I used the code below but it removes all the integers
questionText = questionText.replace(/<span\b.*?>/ig, "");
questionText=questionText.replace(/<\/span>/ig, "");
questionText = questionText.replace(/\d+/g,"");
questionText = questionText.replace("*","");
questionText = questionText.replace(". ","");
I want to remove the first integer and keep the rest of the integers.
It's the third line .replace(/\d+/g,"") which is replacing the integers. If you want to keep the integers, then don't replace \d+, because that matches one or more digits.
You could achieve most of that all on one line, by the way - there's no need to have multiple replaces there:
var questionText = questionText.replace(/((<span\b.*?>)|(<\/span>)|(\d+))/ig, "");
That would do the same as the first three lines of your code. (Of course, you'd need to drop the |(\d+), as per the first part of the answer, if you didn't want to get rid of the digits.)
[EDIT]
Re your comment that you want to replace the first integer but not the subsequent ones:
The regex string to do this would depend very heavily on what the possible input looks like. The problem is that you've given us a bit of random HTML code; we don't know from that whether you're expecting it to always be in this precise format (ie a couple of spans with contents, followed by a bit at the end to keep). I'll assume that this is the case.
In this case, a much simpler regex for the whole thing would be to replace everything within <span....</span> with blank:
var questionText = questionText.replace(/(<span\b.*?>.*?<\/span>)/ig, "");
This will eliminate the whole of the <span> tags plus their contents, but leave anything outside of them alone.
In the case of your example this would provide the desired effect, but as I say, it's hard to know if this will work for you in all cases without knowing more about your expected input.
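Applied to the snippet from the question, the span-removing regex behaves as sketched below. The variable names are made up for illustration, and the input is assumed to always have the exact shape shown in the question.

```javascript
const html = '<span class="qnNum" id="qn">4</span><span>.</span> hello64 ?*';

// Remove each <span ...>...</span> pair together with its contents.
const withoutSpans = html.replace(/(<span\b.*?>.*?<\/span>)/ig, "");

console.log(JSON.stringify(withoutSpans)); // " hello64 ?*"
console.log(withoutSpans.includes("64"));  // true, the later digits survive
```

The leading "4" disappears because it sits inside the first span, while the digits in hello64 are outside any span and are left alone.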
In general, it's considered difficult to parse arbitrary HTML code with regex. Regex is a contraction of "Regular Expressions", which is a way of saying that they are good at handling strings which have a 'regular' syntax. Arbitrary HTML is not a 'regular' syntax due to its unlimited possible levels of nesting. What I'm trying to say here is that if you have anything more complex than the simple HTML snippets you've supplied, then you may be better off using an HTML parser to extract your data.
This will match the complete string and put the part after the last </span>, up to the next word boundary \b, into capturing group 1. You then just need to replace the match with group 1, i.e. $1.
searched_string = string.replace(/^.*<\/span>\s*([A-Za-z0-9]+)\b.*$/, "$1");
The captured word can consist of [A-Za-z0-9]. If you want to have anything else there just add it into that group.
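For reference, here is that replacement run against the snippet from the question, as a small sketch. The name sample is an assumption for illustration.

```javascript
const sample = '<span class="qnNum" id="qn">4</span><span>.</span> hello64 ?*';

// The greedy ^.* runs up to the last </span>; group 1 then captures the
// following run of letters and digits, and $1 replaces the whole match.
const searched = sample.replace(/^.*<\/span>\s*([A-Za-z0-9]+)\b.*$/, "$1");

console.log(searched); // hello64
```

If the input contains no </span> at all, the regex does not match and replace returns the string unchanged.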