I need to split the string below and get the values. I tried the JS string split() method, but it does not work for the backslash. Please help me. Thanks in advance.
Input string = "11,22;0:0/0\0}0#0&"
Output:
, => 11,
; => 22
: => 0
/ => 0
\ => 0
} => 0
# => 0
& => 0
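I made a modification after a lot of reading; I hope you find it useful. encodeURI turns the "\0" escape into "%00", which the replace call then converts into a delimiter followed by "0", so the '0' after the escape character is also printed in this solution. Please check it.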
let string = "26,67;4:79/9\0}0&";
string = encodeURI(string);          // "\0" becomes "%00"
string = string.replace("%0", "&");  // "%00" becomes "&0"
string = decodeURI(string);
let numbers = string.split(/,|;|:|\/|\\|}|&/);
let finalList = [];
for (let i = 0; i < numbers.length; i++) {
  if (numbers[i] !== "")
    finalList.push(parseInt(numbers[i], 10));
}
console.log(finalList); // [26, 67, 4, 79, 9, 0, 0]
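This works for me; please try the code below. Note that "\0" in the literal is a NUL character, not a backslash, so it stays attached to the "9" rather than acting as a delimiter.
let string = "26,67;4:79/9\0}0&";
let arr = string.split(/,|;|:|\/|\\|}|&/); // ["26", "67", "4", "79", "9\u0000", "0", ""]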
Thank you...
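For reference, here is a minimal sketch of one way to get the output the question asks for. It assumes the input can be written with String.raw (or with the backslash doubled, as in "0\\0"), so that the string really contains a backslash; the names input and values are illustrative only.

```javascript
// String.raw keeps the backslash as a literal character instead of
// interpreting "\0" as the NUL escape sequence.
const input = String.raw`11,22;0:0/0\0}0#0&`;

// A character class listing every delimiter; "/" and "\" are escaped.
const values = input
  .split(/[,;:\/\\}#&]/)
  .filter((part) => part !== "")        // drop the empty piece after the trailing "&"
  .map((part) => parseInt(part, 10));

console.log(values); // [11, 22, 0, 0, 0, 0, 0, 0]
```

If the string arrives at runtime (from a file or a form field) rather than from a literal in the source code, the backslash is already a real character and only the split is needed.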
I have a large text from which I read data according to the scheme. Key words are placed in the "smallArtName" array. The scheme looks like this:
(key word) xxx (cordX|cordY)
I can't convert the string I received to a number. It seems to me that the reason is white space, visible in the terminal in the picture. I tried to use the replace method which works for sample text, but not for my value.
I'm a beginner and I could probably do it simpler, but the code I wrote works, and this is the most important thing for now.
for (i = 0; i < smallArtName.length; i++) {
var n = art.artPrintScreen.indexOf(smallArtName[i]);
if (n > -1) {
var tempString = art.artPrintScreen.substring(n, n + 100);
betweenChar = tempString.indexOf('|');
for (k = betweenChar - 10; k <= betweenChar + 10; k++) {
if (tempString[k] == '(') {
xStart = k;
}
if (tempString[k] == ')') {
yEnd = k;
}
}
cordX = tempString.slice(xStart + 1, betweenChar);
cordY = tempString.slice(betweenChar + 1, yEnd);
strTest = " t est".replace(/\s/g, '')
var cordY2 = cordY.replace(/\s/g, '')
console.log(typeof (cordY))
console.log(cordY2)
console.log(cordY2[0])
console.log(cordY2[1])
console.log(cordY2[2])
console.log(cordY2[3])
console.log(cordY2[4])
console.log(cordY2[5])
console.log(strTest)
var cordYtest = parseInt(cordY2, 10);
console.log(cordYtest)
}
}
Terminal:
-181
-
1
8
1
test
NaN
string
-154
-
1
5
4
test
NaN
string
104
1
0
4
undefined
test
NaN
Fragment of input text:
Ukryta twierdza (Mapa podziemi I) 153 (−72|−155)
Ukryta twierdza (Amfora Mgły VI) 135 (73|104)
Ukryta twierdza (Mapa podziemi IV) 131 (154|−72)
Analysing your sample input strings, I found some Unicode characters, \u202c and \u202d, that should be stripped before converting to a number. Also, the negative values are prefixed by the character − (U+2212, the Unicode minus sign), which is different from the ASCII hyphen-minus -, so we need to replace it. That being said, all the parsing can be done with a single regex:
var input = "Ukryta twierdza (Mapa podziemi I) 153 (−72|−155)";
input = input.replace(/\u202d|\u202c/g, "");
input = input.replace(/−/g, "-");
var m = input.match(/.*\((.*)\)\s*(.+?)\s*\((.+)\|(.+)\)/);
console.log(m);
console.log(parseInt(m[3], 10)); // -72
console.log(parseInt(m[4], 10)); // -155
Explaining the regex:
.* - Something that will be ignored
\((.*)\) - Something enclosed in parentheses
\s*(.+?)\s* - Something possibly surrounded by spaces
\((.+)\|(.+)\) - Two parts split by a | and enclosed in parentheses
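The same idea can be wrapped in a small helper. This is only a sketch: the function name parseCoords is made up, the exact position of the hidden characters in the sample is an assumption, and the character range \u202a-\u202e is slightly wider than the two characters found above.

```javascript
// Hypothetical helper: extracts the trailing "(x|y)" pair from one line.
function parseCoords(line) {
  const cleaned = line
    .replace(/[\u202a-\u202e]/g, "") // strip bidirectional formatting characters
    .replace(/\u2212/g, "-");        // Unicode minus sign -> ASCII hyphen-minus
  const m = cleaned.match(/\((-?\d+)\|(-?\d+)\)\s*$/);
  if (!m) return null;               // no coordinate pair at the end of the line
  return { x: parseInt(m[1], 10), y: parseInt(m[2], 10) };
}

// Sample line with the hidden characters written as escapes (assumed placement).
const sample =
  "Ukryta twierdza (Mapa podziemi I) 153 (\u2212\u202d72\u202c|\u2212\u202d155\u202c)";
console.log(parseCoords(sample)); // { x: -72, y: -155 }
```

Matching only digits with an optional minus means any leftover invisible character makes the match fail and return null, instead of silently producing NaN.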
I'm trying to access an object within another object using an object key. I'm concatenating the values of a group of <select> elements to build a string that matches the key. Unfortunately this never works. Here's the code below:
var valueKeyString = "";
$(".keySelector").each(function(){
valueKeyString += $(this).val();
});
if (geoAttrs[selectedGeoAttr].rowCol == "row") {
mapData = rowValueGroups[valueKeyString];
} else if (geoAttrs[selectedGeoAttr].rowCol == "col") {
mapData = colValueGroups[valueKeyString];
}
After trying a good number of things I tested the strings character by character using charCodeAt():
var test1 = valueKeyString;
for (var i = 0, len = valueKeyString.length; i < len; i++) {
test1 += valueKeyString.charCodeAt(i)+" ";
}
if (geoAttrs[selectedGeoAttr].rowCol == "row") {
Object.keys(colValueGroups).forEach(function(group) {
var test2 = group;
for (var i = 0, len = group.length; i < len; i++) {
test2 += group.charCodeAt(i)+" ";
}
console.log(test1);
console.log(test2)
})
mapData = colValueGroups[valueKeyString];
}
My concatenated strings all had an extra character with a charCode of 0 at the point of concatenation. Couldn't figure out why it was there or how to get rid of it using a regEx or str.replace(). Ended up with an ugly but functional solution where I just test the object keys to see if they contain the values from the <select> elements:
var valueKeys = [];
$(".keySelector").each(function(){
valueKeys.push($(this).val());
});
if (geoAttrs[selectedGeoAttr].rowCol == "row") {
Object.keys(colValueGroups).forEach(function(group) {
var groupHasAllKeys = true;
valueKeys.forEach(function(key) {
if (group.indexOf(key) == -1 ) {
groupHasAllKeys = false;
}
});
if(groupHasAllKeys) {
mapData = colValueGroups[group];
}
});
} else if (geoAttrs[selectedGeoAttr].rowCol == "col") {
Object.keys(rowValueGroups).forEach(function(group) {
var groupHasAllKeys = true;
valueKeys.forEach(function(key) {
if (group.indexOf(key) == -1 ) {
groupHasAllKeys = false;
}
});
if(groupHasAllKeys) {
mapData = rowValueGroups[group];
}
});
}
There's got to be a better way, right? What the hell is going on here?
EDIT: rowValueGroups might look something like this:
{
"35_to_44_yearsMale": {
"4654387684": {
"value": 215
},
"4654387685": {
"value": 175
},
"4654387686": {
"value": 687
},
"4654387687": {
"value": 172
}
},
"45_to_54_yearsMale": {
"4654387684": {
"value": 516
},
"4654387685": {
"value": 223
},
"4654387686": {
"value": 54
},
"4654387687": {
"value": 164
}
}
}
valueKeyString should be "45_to_54_yearsMale" or the like.
This is survey data. I'm extracting the rowValueGroups data from the output of a nicolaskruchten/pivottable custom renderer. (Specifically from the pivotData.tree object if you're curious). I'm not looking to alter the pivottable core to change how those keys are formatted, so I just figured I'd concatenate a couple of values from a select element to make it work.
Here's part of the console output from the tests above:
35_to_44_yearsMale51 53 95 116 111 95 52 52 95 121 101 97 114 115 0 77 97 108 101
35_to_44_yearsMale51 53 95 116 111 95 52 52 95 121 101 97 114 115 77 97 108 101
First line is the test done on valueKeyString and the second on one of the keys from rowValueGroups. Note that the initial string looks identical, (the Firefox console actually outputs a little character-not-found square in between "years" and "Male" for the valueKeyString one) but the charCodeAt() turns up that weird 0 character at the point of concatenation.
You have null characters in your string. This regex will remove them:
valueKeyString = valueKeyString.replace( /\0/g, '' )
Reference: Removing null characters from a string in JavaScript
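A small sketch of the effect, using made-up stand-ins for the values read from the select elements and for rowValueGroups (neither is the asker's real data):

```javascript
// Hypothetical values as they might come back from the <select> elements;
// the first one carries a trailing NUL character.
const parts = ["35_to_44_years\0", "Male"];
const rowValueGroups = { "35_to_44_yearsMale": { value: 215 } };

const rawKey = parts.join("");
const cleanKey = rawKey.replace(/\0/g, ""); // remove every NUL character

console.log(rawKey in rowValueGroups);        // false: the hidden NUL breaks the lookup
console.log(cleanKey in rowValueGroups);      // true
console.log(rawKey.length - cleanKey.length); // 1
```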
If you identify the char code you can try to replace it using exactly its code:
var valueKeyString = "";
$(".keySelector").each(function(){
valueKeyString += $(this).val();
});
valueKeyString = valueKeyString.replace(new RegExp(String.fromCharCode(0),"g"),'');
Or, presuming the string already comes \0-terminated directly from jQuery's $(this).val(), you can try another approach:
var valueKeyString = "";
$(".keySelector").each(function(){
  var s = $(this).val();
  // strip a trailing NUL character, if present
  if ( s.charCodeAt(s.length - 1) == 0 )
    s = s.substring(0, s.length - 1);
  valueKeyString += s;
});
The above snippet checks whether the last character of each string freshly obtained from $(this).val() is a null character (just to be sure) and, if so, removes it with substring().
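Since you are using jQuery you might be tempted to try trim(), but note that trim() only strips whitespace and line terminators, and the null character (U+0000) is not one of them. If the values may also carry stray whitespace, combine it with the replacement above:
var valueKeyString = "";
$(".keySelector").each(function(){
  valueKeyString += $(this).val().replace(/\0/g, '').trim();
});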
I want to create a string in JavaScript that contains all ascii characters. How can I do this?
var s = ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~';
My JavaScript is a bit rusty, but something like this:
s = '';
for( var i = 32; i <= 126; i++ )
{
s += String.fromCharCode( i );
}
Not sure if the range is correct though.
Edit:
Seems it should be 32 to 127 then. Adjusted.
Edit 2:
Since char 127 isn't a printable character either, we'll have to narrow it down to 32 <= c <= 126, instead of 32 <= c <= 127.
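The range can be double-checked with a short sketch. The helper name isPrintableAscii is made up for illustration; it simply encodes the rule that codes 0 to 31 and 127 are control characters.

```javascript
// Illustrative helper: true when the char code is in the printable ASCII range.
function isPrintableAscii(code) {
  return code >= 0x20 && code <= 0x7e; // 32 (space) through 126 (~)
}

let printable = "";
for (let code = 0; code < 128; code++) {
  if (isPrintableAscii(code)) printable += String.fromCharCode(code);
}

console.log(printable.length);                // 95
console.log(printable[0] === " ");            // true
console.log(printable[printable.length - 1]); // ~
```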
Just loop the character codes and convert each to a character:
var s = '';
for (var i=32; i<=126;i++) s += String.fromCharCode(i); // stop at 126: 127 (DEL) is a control character
Just wanted to put this here for reference (it takes about 0.13 to 0.26 ms on my computer to generate).
var allAsciiPrintables = JSON.stringify((Array.from(Array(126 + 1).keys()).slice(32).map((item) => {
return String.fromCharCode(item);
})).join(''));
Decomposed:
var allAsciiPrintables = (function() {
/* ArrayIterator over the indices 0..126 */
var result = Array(126 + 1).keys();
/* [0, 1, ..., 126] */
result = Array.from(result);
/* [32, 33, ..., 126] */
result = result.slice(32);
/* transform each item from Number to its ASCII as String. */
result = result.map((item) => {
return String.fromCharCode(item);
});
/* convert from array of each string[1] to a single string */
result = result.join('');
/* create an escaped string so you can replace this code with the string
to avoid having to calculate this on each time the program runs */
result = JSON.stringify(result);
/* return the string */
return result;
})();
The most efficient solution, if you do want to generate the whole set each time the script runs, is probably the following (it takes around 0.03 to 0.35 ms on my computer to generate).
var allAsciiPrintables = (() => {
var result = new Array(126 - 32 + 1); // 95 printable characters
for (var i = 32; i <= 126; ++i) {
result[i - 32] = (String.fromCharCode(i));
}
return JSON.stringify(result.join(''));
})();
Strangely, this is only 3 to 10 times slower than assigning the string literal directly:
var x;
var t;
t = performance.now();
x = '!\"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~';
t = performance.now() - t;
console.log(t);
This is a version written in Python. It gives all ASCII characters in order as a single string.
all_ascii = ''.join(chr(k) for k in range(128)) # 7 bits
all_chars = ''.join(chr(k) for k in range(256)) # 8 bits
printable_ascii = ''.join(chr(k) for k in range(128) if len(repr(chr(k))) == 3)
>>> printable_ascii
' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~'
The last string here, printable_ascii, contains only those characters whose repr contains no escapes (i.e. have length == 1). Characters like \x05, \x06 or \t, \n, which do not have their own glyphs in your system's font, are filtered out. Note that the backslash is filtered out as well, because its repr is the escaped form '\\'.
len(repr(chr(k))) == 3 includes 2 quotes that come from repr call.
Without doing several appends:
var s = Array.apply(null, Array(127-32))
.map(function(x,i) {
return String.fromCharCode(i+32);
}).join("");
document.write(s);
Here is an ES6 one-liner:
const asciiChars = Array.from({ length: 95 }, (e, i) => String.fromCharCode(i + 32)).join('');
console.log(asciiChars)
let str = ''; // start with an empty string
for (var i = 32; i <= 126; i++)
{
  // String.fromCharCode takes an integer and returns the corresponding character;
  // each one is appended to str. The loop starts at 32 and ends at 126.
  str = str + String.fromCharCode(i);
}
Here is a version in CoffeeScript:
require 'fluentnode'
all_Ascii = ->
(String.fromCharCode(c) for c in [0..255])
describe 'all Ascii', ->
it 'all_Ascii', ->
all_Ascii.assert_Is_Function()
all_Ascii().assert_Size_Is 256
all_Ascii()[0x41].assert_Is 'A'
all_Ascii()[66 ].assert_Is 'B'
all_Ascii()[50 ].assert_Is '2'
all_Ascii()[150 ].assert_Is String.fromCharCode(150)
I have a string containing binary data in JavaScript. Now I want to read, for example, an integer from it. So I get the first 4 characters, use charCodeAt, do some shifting, etc. to get an integer.
The problem is that strings in JavaScript are UTF-16 (instead of ASCII) and charCodeAt often returns values higher than 255.
The Mozilla reference states that "The first 128 Unicode code points are a direct match of the ASCII character encoding." (what about ASCII values > 128?).
How can I convert the result of charCodeAt to an ASCII value? Or is there a better way to convert a string of four characters to a 4 byte integer?
I believe that you can do this with relatively simple bit operations:
function stringToBytes ( str ) {
var ch, st, re = [];
for (var i = 0; i < str.length; i++ ) {
ch = str.charCodeAt(i); // get char
st = []; // set up "stack"
do {
st.push( ch & 0xFF ); // push byte to stack
ch = ch >> 8; // shift value down by 1 byte
}
while ( ch );
// add stack contents to result
// done because chars have "wrong" endianness
re = re.concat( st.reverse() );
}
// return an array of bytes
return re;
}
stringToBytes( "A\u1242B\u4123C" ); // [65, 18, 66, 66, 65, 35, 67]
It should be a simple matter to sum the output up by reading the byte array as if it were memory and adding it up into larger numbers:
function getIntAt ( arr, offs ) {
return (arr[offs+0] << 24) +
(arr[offs+1] << 16) +
(arr[offs+2] << 8) +
arr[offs+3];
}
function getWordAt ( arr, offs ) {
return (arr[offs+0] << 8) +
arr[offs+1];
}
'\\u' + getWordAt( stringToBytes( "A\u1242" ), 1 ).toString(16); // "\\u1242"
Borgar's answer seems correct.
Just wanted to clarify one point. JavaScript treats the operands of bitwise operations as 32-bit signed integers, where the most significant (left-most) bit is the sign bit. I.e.,
getIntAt([0x7f,0,0,0],0).toString(16) // "7f000000"
getIntAt([0x80,0,0,0],0).toString(16) // "-80000000"
However, for octet-data processing (e.g. network streams), you usually want the unsigned representation. This can be accomplished by appending '>>> 0' (zero-fill right shift), which makes JavaScript interpret the 32-bit result as unsigned.
function getUIntAt ( arr, offs ) {
return (arr[offs+0] << 24) +
(arr[offs+1] << 16) +
(arr[offs+2] << 8) +
arr[offs+3] >>> 0;
}
getUIntAt([0x80,0,0,0],0).toString(16) // "80000000"
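If the bytes are available as a typed array rather than a plain array, the built-in DataView offers the same signed and unsigned reads without manual shifting. This is an alternative technique, not part of the answers above, and the sample bytes are made up:

```javascript
// Four sample bytes, most significant byte first (big-endian).
const bytes = Uint8Array.from([0x80, 0x00, 0x00, 0x01]);
const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

// The second argument selects little-endian; false means big-endian.
console.log(view.getUint32(0, false).toString(16)); // "80000001"
console.log(view.getInt32(0, false).toString(16));  // "-7fffffff"
console.log(view.getUint16(2, false));              // 1
```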
Here are two methods for encoding a string to a UTF-8 byte array and decoding it back.
var utf8 = {}
utf8.toByteArray = function(str) {
var byteArray = [];
for (var i = 0; i < str.length; i++)
if (str.charCodeAt(i) <= 0x7F)
byteArray.push(str.charCodeAt(i));
else {
var h = encodeURIComponent(str.charAt(i)).substr(1).split('%');
for (var j = 0; j < h.length; j++)
byteArray.push(parseInt(h[j], 16));
}
return byteArray;
};
utf8.parse = function(byteArray) {
var str = '';
for (var i = 0; i < byteArray.length; i++)
str += byteArray[i] <= 0x7F?
byteArray[i] === 0x25 ? "%25" : // %
String.fromCharCode(byteArray[i]) :
"%" + byteArray[i].toString(16).toUpperCase();
return decodeURIComponent(str);
};
// sample
var str = "Да!";
var ba = utf8.toByteArray(str);
alert(ba); // 208, 148, 208, 176, 33
alert(ba.length); // 5
alert(utf8.parse(ba)); // Да!
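In environments that provide the standard TextEncoder and TextDecoder (current browsers and Node.js), the same round trip can be sketched without percent-encoding. This is an alternative to the functions above, not a rewrite of them; it also handles characters outside the Basic Multilingual Plane, which a per-code-unit encodeURIComponent call cannot, because encodeURIComponent throws on a lone surrogate.

```javascript
const encoder = new TextEncoder();          // always encodes to UTF-8
const decoder = new TextDecoder("utf-8");

const encoded = encoder.encode("Да!");      // Uint8Array
console.log(Array.from(encoded));           // [208, 148, 208, 176, 33]
console.log(decoder.decode(encoded));       // "Да!"

// A character stored as a surrogate pair in UTF-16 becomes four UTF-8 bytes.
console.log(Array.from(encoder.encode("\u{1F600}"))); // [240, 159, 152, 128]
```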
While @Borgar answers the question correctly, his solution is pretty slow. It took me a while to track it down (I used his function somewhere in a larger project), so I thought I would share my insight.
I ended up with something like @Kadm's solution. It's not just a few percent faster, it's about 500 times faster (no exaggeration!). I wrote a little benchmark, so you can see it for yourself :)
function stringToBytesFaster ( str ) {
var ch, st, re = [], j=0;
for (var i = 0; i < str.length; i++ ) {
ch = str.charCodeAt(i);
if(ch < 127)
{
re[j++] = ch & 0xFF;
}
else
{
st = []; // clear stack
do {
st.push( ch & 0xFF ); // push byte to stack
ch = ch >> 8; // shift value down by 1 byte
}
while ( ch );
// add stack contents to result
// done because chars have "wrong" endianness
st = st.reverse();
for(var k=0;k<st.length; ++k)
re[j++] = st[k];
}
}
// return an array of bytes
return re;
}
Borgar's solution works perfectly. In case you want a more concrete implementation, you may want to have a look at the BinaryReader class from vjeux (which, for the record, is based on the binary-parser class from Jonas Raoni Soares Silva).
How did you get the binary data into the string in the first place? How the binary data gets encoded into a string is an IMPORTANT consideration, and you need an answer to that question before you can proceed.
One way I know of to get binary data into a string is to use the XHR object and set it to expect UTF-16.
Once it's in UTF-16, you can retrieve 16-bit numbers from the string using "....".charCodeAt(0), which will be a number between 0 and 65535.
Then, if you like, you can convert that number into two numbers between 0 and 255 like this:
var leftByte = mynumber>>>8;
var rightByte = mynumber&255;
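As a worked example of that split (the helper name codeUnitToBytes is made up for illustration):

```javascript
// Splits one 16-bit code unit into its high and low byte.
function codeUnitToBytes(codeUnit) {
  return [codeUnit >>> 8, codeUnit & 0xff];
}

const [left, right] = codeUnitToBytes("\u1242".charCodeAt(0));
console.log(left.toString(16), right.toString(16)); // 12 42

// Recombining the two bytes gives back the original code unit.
console.log(((left << 8) | right) === 0x1242); // true
```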
An improvement to Borgar's solution:
...
do {
st.unshift( ch & 0xFF ); // prepend byte, so the most significant byte ends up first
ch = ch >> 8; // shift value down by 1 byte
}
while ( ch );
// add stack contents to result
// no reverse() needed, because unshift already put the bytes in the right order
re = re.concat( st );
...
One nice and quick hack is to use a combination of encodeURI and unescape:
t=[];
for(s=unescape(encodeURI("zażółć gęślą jaźń")),i=0;i<s.length;++i)
t.push(s.charCodeAt(i));
t
[122, 97, 197, 188, 195, 179, 197, 130, 196, 135, 32, 103, 196, 153, 197, 155, 108, 196, 133, 32, 106, 97, 197, 186, 197, 132]
Perhaps some explanation is necessary why the heck it works, so let me split it into steps:
encodeURI("zażółć gęślą jaźń")
returns
"za%C5%BC%C3%B3%C5%82%C4%87%20g%C4%99%C5%9Bl%C4%85%20ja%C5%BA%C5%84"
which, if you look closely, is the original string in which every character with a value above 127 has been replaced with the percent-encoded hexadecimal representation of its (possibly several) UTF-8 bytes.
For example, the letter "ż" became "%C5%BC". encodeURI also escapes some regular ASCII characters, such as spaces, but that does not matter. What matters is that at this point each byte of the original string is represented either verbatim (as is the case with "z", "a", "g", or "j") or as a percent-encoded sequence of bytes (as is the case with "ż", which was originally the two bytes 197 and 188 and got converted to %C5 and %BC).
Now, we apply unescape:
unescape("za%C5%BC%C3%B3%C5%82%C4%87%20g%C4%99%C5%9Bl%C4%85%20ja%C5%BA%C5%84")
which gives
"zaÅ¼Ã³ÅÄ gÄÅlÄ jaÅºÅ"
If you are not a native Polish speaker, you might not notice that this result is in fact very different from the original "zażółć gęślą jaźń". For starters, it has a different number of characters :)
You can surely tell that these strange versions of the capital letter A do not belong to the standard ASCII set. In fact, this "Å" has the value 197 (which is exactly C5 in hexadecimal).
Now, if you are like me, you would ask yourself: wait a minute... if this is really a sequence of bytes with values 122, 97, 197, 188, and JS is really using UTF, then why do I see these "Å¼" characters, and not the original "ż"?
Well, the thing is (I believe) that this sequence 122, 97, 197, 188 (which we see when applying charCodeAt) is not a sequence of bytes, but a sequence of character codes. The character "Å" has the code 197, but encoded in UTF-8 it is actually a two-byte sequence: C3 85.
So, the trick works because unescape treats the numbers occurring in a percent-encoded string as character codes, not as byte values. To be more specific: unescape knows nothing about multibyte characters, so it decodes the bytes one by one. That works just great for values lower than 128, but not so well for values above 127 that belong to a multibyte sequence: in such cases unescape simply returns the character whose code equals the requested byte value. This "bug" is actually a useful feature.
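The trick can be wrapped into a pair of helpers, sketched below. The function names are made up, encodeURIComponent is used in place of encodeURI (either works here, since unescape decodes whatever was percent-encoded), and the reverse direction with escape and decodeURIComponent is an addition not covered above. Note that escape and unescape are legacy functions kept mainly for compatibility.

```javascript
// String -> array of UTF-8 byte values, via the percent-encoding round trip.
function utf8BytesViaUnescape(str) {
  const byteString = unescape(encodeURIComponent(str)); // one char per UTF-8 byte
  const bytes = [];
  for (let i = 0; i < byteString.length; i++) {
    bytes.push(byteString.charCodeAt(i));
  }
  return bytes;
}

// Array of UTF-8 byte values -> string (the same trick in reverse).
function stringFromUtf8Bytes(bytes) {
  const byteString = String.fromCharCode.apply(null, bytes);
  return decodeURIComponent(escape(byteString));
}

console.log(utf8BytesViaUnescape("zaż"));              // [122, 97, 197, 188]
console.log(stringFromUtf8Bytes([122, 97, 197, 188])); // "zaż"
```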
I'm going to assume for a second that your objective is to read arbitrary bytes from a string.
My first suggestion would be to make your string representation a hexadecimal representation of the binary data.
You can read the values using conversions to numbers from hex:
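var BITS_PER_BYTE = 8;
var BITS_PER_CHAR = 4; // one hexadecimal digit encodes 4 bits
function readBytes(hexString, numBytes) {
  // parse the first numBytes bytes (two hex digits each) as a number
  return parseInt(hexString.slice(0, numBytes * (BITS_PER_BYTE / BITS_PER_CHAR)), 16);
}
function removeBytes(hexString, numBytes) {
  // drop the first numBytes bytes from the hex string
  return hexString.slice(numBytes * (BITS_PER_BYTE / BITS_PER_CHAR));
}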
The functions can then be used to read whatever you want:
var hex = '4ef2c3382fd';
alert( 'We had: ' + hex );
var intVal = readBytes(hex,2);
alert( 'Two bytes: ' + intVal.toString(2) );
hex = removeBytes(hex,2);
alert( 'Now we have: ' + hex );
You can then interpret the byte string however you want.
Hope this helps!
Cheers!