JavaScript: problem splitting tab-delimited text with multiline cells

I have a strange issue on my web page, specifically with a textarea element that receives clipboard data from the user.
The user presses CTRL+V, and I handle the KeyUp event to read the pasted data.
This part works fine.
The problems start when I try to split the textarea's content into rows.
The data looks something like this:
Row1[0][HT]Row1[1][LF]"Row2[0] Comment line 1[LF]Row2[0] Comment line 2"[HT]Row2[1]
Where:
[HT] means {Tab}
[LF] means {New line}
I use:
var myData = document.getElementById("TextAreaElement").value;
var vArray = myData.split(/\n/);
But this array contains 3 items, although the data has only 2 rows (the second row has a quoted cell that spans two lines).
Does anybody know a solution or an alternative approach?

You get a text containing three lines, split it on the line breaks and get an array with three items. Seems like it works. :) Now, what you need to do, is take each of these items and split them on a tab ( = \t)
[edit]
Nope, now I see what you mean. You won't get there with this kind of splitting. You'll have to parse the string. A field value can contain a line break, in which case the value is enclosed in double quotes. So you'll have to walk through the string and not split on a line break while you're still inside a pair of quotes.
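The parsing described above can be sketched roughly as follows. This is only a minimal sketch, assuming the clipboard text follows the usual spreadsheet convention (cells separated by tabs, rows by line breaks, a cell containing line breaks wrapped in double quotes, and a literal quote inside such a cell written as two quotes). The function name `parseClipboardTable` is made up for illustration.

```javascript
// Parse tab-separated text in which a cell may be wrapped in double quotes
// and may then contain tabs or line breaks as ordinary data.
function parseClipboardTable(text) {
  var rows = [];
  var row = [];
  var cell = "";
  var inQuotes = false;
  var i = 0;

  while (i < text.length) {
    var ch = text.charAt(i);

    if (inQuotes) {
      if (ch === '"' && text.charAt(i + 1) === '"') {
        cell += '"';        // doubled quote inside a quoted cell
        i += 2;
        continue;
      }
      if (ch === '"') {
        inQuotes = false;   // closing quote
      } else {
        cell += ch;         // tabs and line breaks are plain data here
      }
    } else if (ch === '"' && cell === "") {
      inQuotes = true;      // a quote opens a quoted cell only at cell start
    } else if (ch === "\t") {
      row.push(cell);       // end of cell
      cell = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text.charAt(i + 1) === "\n") {
        i++;                // treat CRLF as a single line break
      }
      row.push(cell);       // end of cell and end of row
      rows.push(row);
      row = [];
      cell = "";
    } else {
      cell += ch;
    }
    i++;
  }

  // Flush the last row unless the text ended with a line break.
  if (cell !== "" || row.length > 0) {
    row.push(cell);
    rows.push(row);
  }
  return rows;
}

var sample = 'Row1[0]\tRow1[1]\n"Row2[0] Comment line 1\nRow2[0] Comment line 2"\tRow2[1]';
console.log(parseClipboardTable(sample).length); // 2
```

With the sample from the question this yields two rows of two cells each; the second row's first cell keeps its embedded line break.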

Regarding the problem of '.' not matching newlines, the standard method of doing that in JS is [\S\s] which will match anything.
It looks like all you want to do for starters is split the string by tabs, right? Then...
result = string.split(/\t/)
Then you'll have an array with each of the rows' data separate. Note that this only works if your data can't have extra erroneous tabs in it.
Whatever tool is getting the information into the clipboard should really do some escaping before it is copied out and parsed by your JS. If it can't do that, then really anything goes - you can't in that case guarantee that your string won't have double-quotes, tabs, or any other character you might try to use as a delimiter.

Well, I couldn't find a way to do this with a regular expression or a JavaScript method (though I believe it can be done), so I split the data a different way.
I used AJAX to send the text to the server and performed the split in VB.
In summary:
I get the maximum number of columns (splitting each line by tabs).
I then split the whole text by tabs and evaluate each item of that array.
If an item starts with a double quote, I look for the closing double quote (before that, I replace any doubled double quotes inside the cell with a unique placeholder text).
Each time an item of the original array has been evaluated, I remove it (so I always evaluate item 0).
If an item contains a new line (the end of a row), I remove only the text of the last column of the previous row from it.
I hope this helps someone with the same problem. Cheers.
Here is the code:
Public Function TEST(ByVal pText As String) As String
Try
Dim vText As String = pText
Dim vArray As New ArrayList
vArray.AddRange(vText.Split(vbNewLine))
Dim vActualIndex As Integer = 0
Dim vMaxColumns As Integer = 0
For Each vArrayItem In vArray
If vArrayItem.Split(vbTab).Length > vMaxColumns Then
vMaxColumns = vArrayItem.Split(vbTab).Length
End If
Next
Dim vActualArray(vMaxColumns - 1) As String
vArray = New ArrayList
vArray.AddRange(vText.Split(vbTab))
Dim vLen As Integer = vArray.Count
Dim vNewArray As New ArrayList
vActualIndex = 0
Do While vArray.Count <> 0
If vArray(0).Split(vbNewLine).Length = 1 Then
vActualArray(vActualIndex) = vArray(0)
vActualIndex += 1
Else
If vArray(0).Split(vbNewLine)(0).ToString.StartsWith("""") Then
vArray(0) = Mid(vArray(0), 2).Replace("""""", "*_IDIDIDUNIQUEID_*")
If InStr(vArray(0), """" & vbNewLine) <> 0 Then
vActualArray(vActualIndex) = Mid(vArray(0), 1, InStr(vArray(0), """" & vbNewLine) + 1)
vArray(0) = Mid(vArray(0), InStr(vArray(0), """" & vbNewLine) + 3)
vActualArray(vActualIndex) = vActualArray(vActualIndex).ToString.Replace("*_IDIDIDUNIQUEID_*", """""")
vArray(0) = vArray(0).ToString.Replace("*_IDIDIDUNIQUEID_*", """""")
vActualIndex += 1
GoTo Skip_remove
End If
vArray(0) = vArray(0).ToString.Replace("*_IDIDIDUNIQUEID_*", """""")
vActualArray(vActualIndex) = vArray(0)
vActualIndex += 1
Else
vActualArray(vActualIndex) = vArray(0).Split(vbNewLine)(0)
vActualIndex += 1
vArray(0) = vArray(0).ToString.Substring(vArray(0).Split(vbNewLine)(0).ToString.Length + 2)
GoTo Skip_remove
End If
End If
vArray.RemoveAt(0)
' This is a label in VB code
Skip_remove:
If vActualIndex >= vMaxColumns Then
vNewArray.Add(vActualArray)
ReDim vActualArray(vMaxColumns - 1)
vActualIndex = 0
End If
Loop
Catch ex As Exception
Return ""
End Try
End Function

Related

InputStream Encoding with Special Characters

Apologies, I'm not a JS developer and this is the first time I've worked with InputStream.
In the InputStream, I am processing one line of delimited text at a time that will always contain a character that is not UTF-8. My goal is to parse the InputStream to a string, split it by the delimiter, and read a certain value that is UTF-8 at an index.
The line will always be tab delimited, and will always contain the same number of delimiters. I might see something like this (two separate lines):
stuff morestuff 0.00 A ç F00012049333302129FF
stuff2 morestuff2 B è F00012205229521042CB
In my code, the value at the index position always seems to leave my variable undefined, and I'm assuming it's from the UTF-8 encoding in the toString method. My assumption is that the encoding is turning the non UTF-8 character into something that messes up the split function, but I'm not sure what or how. Here's some test code:
var InputStreamCallback = Java.type("org.apache.nifi.processor.io.InputStreamCallback");
var IOUtils = Java.type("org.apache.commons.io.IOUtils");
var StandardCharsets = Java.type("java.nio.charset.StandardCharsets");
var flowFile = session.get();
var index = 5;
session.read(flowFile,
new InputStreamCallback(function(inputStream) {
// Convert the single line of the flowfile into a UTF_8 encoded string
var line = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
// Split the delimited string into an array
var dataArray = line.split('\t');
// Capture the required value at the defined index position
var capturedValue = dataArray[index];
}));
if (typeof capturedValue === 'undefined') {
// log an error
}
else {
// do what it's supposed to do
}
I'm hoping someone could explain what exactly is happening, and help me find a solution that will allow me to look up the correct value at my predetermined index position.
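Separately from any encoding question, note that in the test code above `capturedValue` is declared with `var` inside the callback function. That makes it local to the callback, so the `typeof capturedValue === 'undefined'` check outside it would be true no matter what the split produced. Here is a minimal sketch of the scoping in plain JavaScript, assuming a hypothetical `readWithCallback` helper in place of NiFi's `session.read`:

```javascript
// Hypothetical stand-in for session.read(...): it just hands one line to a callback.
function readWithCallback(callback) {
  callback("stuff\tmorestuff\t0.00\tA\t\u00E7\tF00012049333302129FF");
}

var index = 5;
var capturedValue; // declared in the outer scope so it is visible after the read

readWithCallback(function (line) {
  var insideOnly = line.split("\t")[index]; // local to this function
  capturedValue = insideOnly;               // plain assignment, not a new `var`
});

console.log(typeof insideOnly); // "undefined": not visible out here
console.log(capturedValue);     // "F00012049333302129FF"
```

Whether the character set also plays a role cannot be told from the question alone; this only rules out the scoping issue.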

Replace some of a matched string in javascript

I'm working on a grid layout function. I'm about 1/2 way there and have hit a wall.
I am using a string and running a single RegExp function per box I'm placing in the grid to determine where it will fit (based on number of rows/columns it occupies). This works successfully. Demonstration:
function findSpace(columns, rows){
var totalColumns = 4;
var grid = "11002100020000200002";
//1 represents occupied space, 0 empty space, and 2 a new line
var reg = RegExp("(0{" + columns + "})(([0-2]{" + (totalColumns - columns + 1) + "})0{"+columns+"}){" + (rows-1) + "}");
var i = grid.search(reg);
return i; // search() already returns the index of the match (or -1)
}
Returns the index of the match, letting me know where in my grid this box will fall. See fiddle.
I fall short trying to replace the "0"s with "1"s. Doing grid.replace(reg, "1") of course replaces everything from the beginning of the match to the end with a single "1". I need to replace just the "0"s that will be occupied for row and column, each with a "1", and not any of the characters matched between.
This is an exercise in doing things differently. Yes, I could do this with an array data structure. What fun is that? I'm not looking for a "don't do it this way do it the way everyone else does" answer, I'm trying to determine the most efficient way to solve my scenario.
Thanks!
The String#replace method might be the ticket. If you pass a regular expression and a function as the second parameter, you can dynamically manipulate each match.
Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace
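To make the `String#replace` suggestion concrete, here is a minimal sketch, assuming the same grid encoding as in the question (`totalColumns` cells per row, each row followed by a `2`). The name `placeBox` and its parameter order are hypothetical. The callback receives the whole match and rewrites only the cells the box covers, leaving the characters in between untouched.

```javascript
// Mark the first free columns-by-rows area in the grid string as occupied.
function placeBox(grid, columns, rows, totalColumns) {
  var stride = totalColumns + 1; // cells per row plus the "2" row terminator
  var reg = new RegExp(
    "(0{" + columns + "})(([0-2]{" + (stride - columns) + "})0{" + columns + "}){" + (rows - 1) + "}"
  );
  return grid.replace(reg, function (match) {
    var chars = match.split("");
    for (var r = 0; r < rows; r++) {
      for (var c = 0; c < columns; c++) {
        chars[r * stride + c] = "1"; // only the cells covered by the box
      }
    }
    return chars.join("");
  });
}

console.log(placeBox("11002100020000200002", 2, 2, 4));
// "11112101120000200002"
```

If no area fits, `replace` finds no match and the grid string comes back unchanged.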

javascript - .replace() is working on one string but not another

I'm pulling data with AJAX into an array, and everything there works fine. Then I have this...
$.each(data, function (key, value){
var add = value[5]+value[6];
var sub = add.replace(" ","");
var link = 'http://'+sub+'.mydomain.com';
});
//OUTPUT: http://RR1 Box 22USHIGHWAY 67.NextHomeTown.com
This isn't working. It's not replacing any space characters.
Now, here's where it gets fun. This works on every other DB entry that is returned that has a space. Crazy, right?
Is there some type of character encoding that might be causing it not to recognize the space character that is used in this particular entry? The MySQL table has them entered as varchar, but at this point in the process, they're both just text strings right? So it shouldn't matter.
This will only replace the first space it matches. Use this to replace all whitespace characters:
var sub = add.replace(/\s/g,"");
Since you report the desired behaviour with other tables, it's perhaps not relevant - but don't forget that in javascript, the string replace function only replaces the first instance of the searchString, unless you use a regular expression.
"red, red, red".replace(/ /g, "");
"red,red,red"
"red, red, red".replace(" ", "");
"red,red, red"
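One possible explanation for the entry where not even the first space is removed, which the question does not confirm, is that the value contains a non-breaking space (U+00A0) rather than an ordinary one. A string search for `" "` does not match it, but `\s` does. A small sketch with a made-up value and a hypothetical helper name:

```javascript
// Remove every whitespace character, including non-breaking spaces.
function removeAllWhitespace(text) {
  return text.replace(/\s/g, "");
}

var withNbsp = "RR1\u00A0Box 22"; // assumed data: first gap is U+00A0, second is a normal space

console.log(withNbsp.replace(" ", "") === "RR1\u00A0Box22"); // true: only the ordinary space is removed
console.log(removeAllWhitespace(withNbsp));                  // "RR1Box22"
```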

JS - Remove all characters before/after a string (and keep that string)?

I've seen several results for removing characters after a specific character - my question is how would I do that with a string?
Basically, this applies to any given string of data, but let's take a URL: stackoverflow.com/question
With given string, and in JS, I'd like to remove everything after ".com", assign ".com" to a variable, and assign the text before ".com" to a separate variable.
So, end result: var x = "stackoverflow" var y = ".com"
What I've done so far:
1) Using a combination of split, substring, etc. I can get it to remove pieces, but not without removing part of the ".com" string. I'm pretty sure I can do what I want to do with substring and split, I think I'm just implementing it incorrectly.
2) I'm using indexOf to find the string ".com" within the full string
Any tips? I haven't posted my actual code because it's become so garbled with all the different things I've tried (I can go ahead and do so if necessary).
Thanks!
You should really look into Regular Expressions.
Here is some code that can get what you are trying to do:
var s = 'stackoverflow.com/question';
var re = /(.+)(\.com)(.+)/;
var result = s.match(re);
if (result && result.length >= 3) {
var x = result[1], //"stackoverflow"
y = result[2]; //".com"
console.log('x: ' + x);
console.log('y: ' + y);
}
Use regular expressions.
"stackoverflow.com".match(/(.+)(\.com)/)
results in
["stackoverflow.com", "stackoverflow", ".com"]
(Why would you want to assign .com to a variable, though?)
"stackoverflow.com".split(/\b(?=\.)/) => ["stackoverflow", ".com"]
Or,
"stackoverflow.com/question".split(/\b(?=\.)|(?=\/)/)
=> ["stackoverflow", ".com", "/question"]
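Since the question mentions `indexOf`, the same result can also be reached without a regular expression. This is a minimal sketch; the helper name `splitAtMarker` and the shape of its return value are made up for illustration.

```javascript
// Split text around the first occurrence of marker, keeping the marker itself.
function splitAtMarker(text, marker) {
  var pos = text.indexOf(marker);
  if (pos === -1) {
    return null; // marker not present
  }
  return {
    before: text.slice(0, pos),
    marker: marker,
    after: text.slice(pos + marker.length)
  };
}

var parts = splitAtMarker("stackoverflow.com/question", ".com");
console.log(parts.before); // "stackoverflow"
console.log(parts.marker); // ".com"
console.log(parts.after);  // "/question"
```

Slicing at `pos` and at `pos + marker.length` is what keeps the ".com" part intact on both sides.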

Javascript Regexp Duplicate Line Matching not working correctly

I am writing a Javascript code to parse some grammar files, it is quite some code but I will post relevant information here. I am using Javascript Regexp in order to match a duplicate line held within a string. The string contains, for example (assume the string name is lines):
if
else
;
print
{
}
test1
test1
=
+
-
*
/
(
)
num
string
comment
id
test2
test2
What should happen, is a match found on 'test1' and 'test2'. It should then delete the duplicate, leaving 1 instance of test1 and test2. What is happening is no match at all. I am confident in my regex but javascript may be doing something I am not expecting. Here is the code doing the work on the string given above:
var rex = new RegExp("(.*)(\r?\n\1)+","g");
var re = '/(.*)(\r?\n\1)+/g';
rex.lastIndex = 0;
var m = rex.exec(lines);
if (m) {
alert("Found Duplicate");
var linenum = lines.search(re); //Get line number of error
alert("Error: Symbol Defined twice\n");
alert("Error occured on line: " + linenum);
lines = lines.replace(rex,""); //Gets rid of the duplicate
}
It never gets into the if(m) statement. Therefore no match is found. I tested the regex here: http://regexpal.com/ using the regex in my code as well as the example text provided. It matches just fine, so I am at kind of a loss. If anyone can help, it would be great.
Thank you.
Edit:
Forgot to add, I am testing this in firefox, and it only has to work in firefox. Not sure if that matters.
First error: \ in a JS string is also an escape character.
var rex = new RegExp("(.*)(\r?\n\1)+","g");
should be written
var rex = new RegExp("(.*)(\\r?\\n\\1)+","g");
// or, shorter:
var rex = /(.*)(\r?\n\1)+/g;
if you want to make it work. In the case of the RegExp constructor, you’re passing the pattern as a string to the constructor function. This means you need to escape each \ backslash that occurs in the pattern. If you use a regexp literal, you don’t need to escape them, since they’re not in a string, but retain their ‘normal’ properties in the regexp pattern.
Second error, your expression
var re = '/(.*)(\r?\n\1)+/g';
is wrong. What you’re doing here is assigning a string literal to a variable. I’m assuming you meant to assign a regular expression literal, which should be written like this:
var re = /(.*)(\r?\n\1)+/g;
Third error: the last line
lines = lines.replace(rex,""); //Gets rid of the duplicate
removes both instances of all duplicate lines! If you want to keep the first instance of each duplicate, you should use
lines = lines.replace(rex, "$1");
And finally, this method only detects two consecutive identical lines. Is that what you want, or do you need to detect any duplicates, wherever they may be?
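One caveat with the pattern as written: `(.*)` can match an empty string, so the unanchored pattern can also match a lone line break between two different lines, and replacing it with `$1` deletes that line break. The sketch below is one possible variation, not the asker's original pattern. It anchors the match to whole lines with the `m` flag and requires a non-empty line; the function name is made up for illustration. Both ways of writing the pattern are shown to illustrate the doubled backslashes.

```javascript
// Regex literal: each backslash is written once.
var literalPattern = /^(.+)(?:\r?\n\1)+$/gm;

// RegExp constructor: the pattern is a string, so each backslash is doubled.
var constructedPattern = new RegExp("^(.+)(?:\\r?\\n\\1)+$", "gm");

// Collapse runs of identical consecutive lines into a single line.
function dropConsecutiveDuplicates(text, pattern) {
  return text.replace(pattern, "$1");
}

var sample = "id\ntest1\ntest1\n=\ntest2\ntest2\ntest2";
console.log(dropConsecutiveDuplicates(sample, literalPattern));
// prints the four lines: id, test1, =, test2
```

The `^` and `$` anchors also keep a line such as `test1` from being treated as a duplicate of a following `test10`.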
var str = 'if\nelse\n;\nprint\n{\n}\ntest1\ntest1\n=\n+\n-\n*\n/\n(\n)\nnum\nstring\ncomment\nid\ntest2\ntest2\ntest2\ntest2\ntest2';
console.log(str);
str = str.replace(/\r\n?/g,'\n');
// I prefer replacing all the newline characters with \n's here
str = str.replace(/(^|\n)([^\n]*)(\n\2)+/g,function(m0,m1,m2,m3,ind) {
var line = str.substr(0,ind).split(/\n/).length + 1;
var msg = '[Found duplicate]';
msg += '\nFollowing symbol defined more than once';
msg += '\n\tsymbol: ' + m2;
msg += '\n\ton line ' + line;
console.log(msg);
return m1 + m2;
});
console.log(str);
Otherwise you can skip the newline-normalizing replace and change the pattern to
/(^|\r\n?|\n)([^\r\n]*)((?:\r\n?|\n)\2)+/g
Note that [^\n]* will also catch multiple empty lines. If you want to make sure it matches (and replaces) non-empty lines then you might want to use [^\n]+.
[EDIT]
For the record, the m parameters are the callback's arguments: m0 is the whole match, m1 is the 1st subgroup ((^|\n)), m2 is the 2nd subgroup (([^\n]*)), and m3 is the last subgroup ((\n\2)). I could have used arguments[n] instead, but these are shorter.
As for the return value: due to the lack of lookbehind in the regex flavor used by JavaScript (at the time of writing), this pattern also captures a possible preceding newline (unless the match starts on the first line), so the callback needs to return that preceding newline, if any, together with the line itself. That's why it returns m1 + m2 rather than m2 alone.
