Remove ^M from CSV file

Remove ^M from CSV file - javascript

I'm trying to read CSV-formatted data into JavaScript using the jquery-csv library, but I get a CSVDataError: Illegal Data error caused by the ^M character at the end of each line.
It seems that no matter how a CSV is saved, I get this ^M. I can only see the ^M if I open the CSV file in vim; in a text editor or my IDE the data looks fine. I don't get this problem when working in other languages such as Python or R.
I am working on a Mac environment.
How can I fix this and avoid this problem in the future?

Use dos2unix to convert.
It's false that "no matter how it is saved" the CR (^M is a carriage return) is appended. For instance, echo 'a,b,c' > letters.csv does not append a CR. Check your text editor settings.

Take a look at the splitlines algorithm on the jquery-csv page; it seems to provide a function that will clean these problematic carriage returns for you.

Assuming ^M indicates a mac-style carriage return, support for carriage return was included in a previous release so your code should just work.
Source: I'm the author of jquery-csv
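
If an older jquery-csv build still chokes on the carriage returns, one workaround is to normalize the line endings before handing the text to the parser. A minimal sketch; normalizeLineEndings is a hypothetical helper name, not part of jquery-csv, and the sample data is made up:

```javascript
// Hypothetical helper (not part of jquery-csv): convert CRLF and lone CR to LF.
function normalizeLineEndings(text) {
  // \r\n? matches a CR optionally followed by LF, so CRLF becomes one LF, not two.
  return text.replace(/\r\n?/g, '\n');
}

// Made-up sample with Windows (CRLF) and old Mac (CR) endings mixed together.
const raw = 'name,city\r\nAnn,Leeds\r\nBob,York\r';
const clean = normalizeLineEndings(raw);

// JSON.stringify makes the otherwise invisible characters visible.
console.log(JSON.stringify(clean)); // "name,city\nAnn,Leeds\nBob,York\n"
```

The cleaned string can then be passed to the CSV parser in place of the raw file contents.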

Related

Weird character in a commit.template of git [duplicate]

I keep getting the ^M character in my .vimrc and it breaks my configuration.

Unix uses 0xA for a newline character. Windows uses a combination of two characters: 0xD 0xA. 0xD is the carriage return character. ^M happens to be the way vim displays 0xD (0x0D = 13, M is the 13th letter in the English alphabet).
You can remove all the ^M characters by running the following:
:%s/^M//g
Where ^M is entered by holding down Ctrl and typing v followed by m, and then releasing Ctrl. This is sometimes abbreviated as ^V^M, but note that you must enter it as described in the previous sentence, rather than typing it out literally.
This expression will replace all occurrences of ^M with the empty string (i.e. nothing). I use this to get rid of ^M in files copied from Windows to Unix (Solaris, Linux, OSX).
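
The ^M naming described above is ordinary caret notation: the displayed letter is the control code with 0x40 added (equivalently, XORed with 0x40). A small sketch to illustrate the arithmetic; caretNotation is a hypothetical name used only for this example:

```javascript
// Caret notation: a control character is shown as ^ plus the character
// whose code is the control code XOR 0x40 (so 0x0D -> 0x4D, which is "M").
function caretNotation(ch) {
  const code = ch.charCodeAt(0);
  const isControl = code < 0x20 || code === 0x7f;
  return isControl ? '^' + String.fromCharCode(code ^ 0x40) : ch;
}

console.log(caretNotation('\r')); // ^M  (carriage return, 0x0D)
console.log(caretNotation('\n')); // ^J  (line feed, 0x0A)
console.log(caretNotation('\t')); // ^I  (tab, 0x09)
```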

:%s/\r//g
worked for me today. But my situation may have been slightly different.

To translate the new line instead of removing it:
:%s/\r/\r/g

It probably means you've got carriage returns (different operating systems use different ways of signaling the end of line).
Use dos2unix to fix the files or set the fileformats in vim:
set ffs=unix,dos

Let's say your text file is file.txt; then run this command:
dos2unix file.txt
It converts the text file from dos to unix format.

I removed them all with sed:
sed -i -e 's/\r//g' <filename>
Could also replace with a different string or character. If there aren't line breaks already for example you can turn \r into \n:
sed -i -e 's/\r/\n/g' <filename>
Those sed commands work on the GNU/Linux version of sed but may need tweaking on BSDs (including macOS).
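
Where the sed flags differ between platforms, the same in-place cleanup can be sketched in Node.js instead. This is only an illustration of the first sed command above; stripCarriageReturns is a hypothetical helper name, and the file is read as latin1 so that every byte other than CR is written back unchanged:

```javascript
const fs = require('fs');

// Hypothetical helper: rewrite a file in place with every CR removed,
// roughly what the first sed command above does.
function stripCarriageReturns(path) {
  // latin1 maps each byte to exactly one character and back, so the
  // rest of the file round-trips byte for byte whatever its encoding.
  const original = fs.readFileSync(path, 'latin1');
  const cleaned = original.replace(/\r/g, '');
  if (cleaned !== original) {
    fs.writeFileSync(path, cleaned, 'latin1');
  }
  // Report how many carriage returns were removed.
  return original.length - cleaned.length;
}
```

Calling stripCarriageReturns('file.txt') returns the number of carriage returns it removed, which is 0 if the file was already clean.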

I got a text file originally generated on a Windows Machine by way of a Mac user and needed to import it into a Linux MySQL DB using the load data command.
Although VIM displayed the '^M' character, none of the above worked for my particular problem, the data would import but was always corrupted in some way. The solution was pretty easy in the end (after much frustration).
Solution:
Executing dos2unix TWICE on the same file did the trick! Using the file command shows what is happening along the way.
$ file 'file.txt'
file.txt: ASCII text, with CRLF, CR line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text, with CRLF line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text
And the final version of the file imported perfectly into the database.
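
To check what kind of line terminators a string contains without the file command, a rough equivalent can be sketched in JavaScript. countLineEndings is a hypothetical helper, and it only counts terminators; it does not try to reproduce the wording of file's output:

```javascript
// Hypothetical helper: count each kind of line terminator in a string.
function countLineEndings(text) {
  const counts = { crlf: 0, cr: 0, lf: 0 };
  // Alternation order matters: try CRLF first so it is not counted as CR + LF.
  for (const match of text.matchAll(/\r\n|\r|\n/g)) {
    if (match[0] === '\r\n') counts.crlf++;
    else if (match[0] === '\r') counts.cr++;
    else counts.lf++;
  }
  return counts;
}

console.log(countLineEndings('a\r\nb\rc\nd\r\n'));
// { crlf: 2, cr: 1, lf: 1 }
```

A non-zero count in more than one field means the text has mixed line endings.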

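In Unix it is probably easier to use the 'tr' command.
cat file1.txt | tr "\r" "\n" > file2.txt
Note that this turns every CR into a newline, which suits CR-only files. For CRLF files it leaves a blank line after each line, so delete the CR instead with tr -d "\r".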

This is the only thing that worked in my case:
:e ++ff=dos
:wq

You can fix this in vim using
:1,$s/^V^M//g
where ^V^M means pressing Ctrl+V followed by Ctrl+M (not typing the caret characters literally).

If you didn't specify a different fileformat intentionally (say, :e ++ff=unix for a Windows file), it's likely that the target file has mixed EOLs.
For example, if a file has some lines with <CR><NL> endings and others with <NL> endings, and fileformat is set to unix automatically by Vim when reading it, ^M (<CR>) will appear.
In such cases, fileformats (note: there's an extra s) comes into play. See :help ffs for the details.

If it breaks your configuration, and the ^M characters are required in mappings, you can simply replace the ^M characters by <Enter> or even <C-m> (both typed as simple character sequences, so 7 and 5 characters, respectively).
This is the single recommended, portable way of storing special keycodes in mappings.

In FreeBSD, you can clear the ^M manually by typing the following:
:%s/ Ctrl+V, then Ctrl+M, then Ctrl+M again.

I've discovered that I've been polluting files for weeks because my Homebrew mvim instance was set to use fileformat=dos. Made the required change in .vimrc.

try :%s/\^M// At least this worked for me.

Parsing unicode in unescaped XML

I'm trying to parse some poorly formatted XML.
I say poorly formatted - because everyone knows that you're not supposed to have un-escaped ampersands in an XML file.
Problem is, I need to collect some unicode formatted phrases from an XML file. I need the format to be as close to the original as possible. You can replicate this issue in your console log...
console.log($("<test>&acirc;</test>").text())
// Outputs 'â' instead of desired '&acirc;'
I've tried every combination of escape(), unescape(), encodeURI(), and decodeURI() I can think of.
I've tried both settings for jQuery's ajax({processData: bool}) flag. All answers I've found point to these solutions - and it seems like none of them work...
How can I modify the above code to output the original XML content?

Use new Option(yourUnescapedXml).innerHTML. So to answer your question directly,
console.log($(`<test>${new Option('&acirc;').innerHTML}</test>`).text())
This creates an HTMLOptionElement, then immediately reads its (escaped) innerHTML.

Convert strange unicode characters into emoji code

I have a dll that I suspect does not support UTF-8 for emojis (it's an addon for mIRC).
This dll turns mIRC (a text-based chat program) into a full HTML/JavaScript client.
My problem is that when I receive a message containing emojis, they are output like this:
ðŸ˜€
Four "strange" chars, because they are not converted properly, I suppose.
I thought about writing a JavaScript function that matches those and changes them back to the correct emoji code (maybe using a <span>, maybe not, since the following code type is translated correctly into smileys: 😈).
So, is there any way in JavaScript to catch/convert the erroneous ðŸ˜€ chars into 😈, for example? (Those are not the same emoji.)
for a correct example :
:grinning face: U+1F600
output this ðŸ˜€
Sending this 😀 finally outputs a square, not the correct smiley, so it doesn't even work for all emojis.
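
The four characters look like the UTF-8 bytes of the emoji (F0 9F 98 80) after being decoded as Windows-1252 somewhere along the way. That is an assumption; what the dll actually does is not known. Under that assumption, a minimal sketch of reversing it is to map each character back to its Windows-1252 byte and decode the bytes as UTF-8. fixMojibake is a hypothetical name, and the table below covers the 0x80-0x9F range where Windows-1252 differs from Latin-1:

```javascript
// Code points that Windows-1252 assigns to bytes 0x80..0x9F
// (unassigned bytes are kept as the matching C1 control code).
const CP1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];
const codePointToByte = new Map(CP1252_HIGH.map((cp, i) => [cp, 0x80 + i]));

// Hypothetical helper: undo "UTF-8 read as Windows-1252" damage.
// Returns the input unchanged if it does not fit that pattern.
function fixMojibake(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (codePointToByte.has(cp)) bytes.push(codePointToByte.get(cp));
    else if (cp <= 0xff) bytes.push(cp);
    else return text; // character cannot come from a Windows-1252 byte
  }
  try {
    // fatal: true makes invalid UTF-8 throw instead of yielding U+FFFD.
    return new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.from(bytes));
  } catch {
    return text; // bytes were not valid UTF-8, so leave the text alone
  }
}

// The four characters from the question, written as escapes.
console.log(fixMojibake('\u00f0\u0178\u02dc\u20ac')); // 😀
```

This only repairs text that still contains all of the original bytes. If the dll drops or replaces bytes (which the square shown for 😀 may indicate), the original emoji cannot be recovered this way.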

Javascript URL string adding %20 (space) after running grunt

TLDR;
FIXED AS FOLLOWS
selectedValue = selectedValue.replace(/\s+/g, '')
Thanks to Richard Macarthy and Aaron Digulla for their answers, which led me down the path to the correct answer.
Just to be clear, it seems Grunt was adding this whitespace for some reason. The fix is very simple...
ORIGINAL QUESTION
I have a JSON request that gets the contents of a JSON file to be used for data visualisations with d3.js.
This all works fine locally, but when I run grunt build the URL string gets a %20 injected into it from nowhere...
Here is how the string looks before I run Grunt:
d3.json("json/wards-info/"+selectedValue+"-wards-data.json", function(error, newDatas) {
newData = newDatas;
newWardsData = newWardsDatas;
drawMap(newData, newWardsData);
});
Which computes to:
http://localhost:8080/app/json/wards-info/liverpool-ward-data.json
After I run Grunt build the computed URL string changes to:
http://localhost:8080/dist/json/wards-info/liverpool%20-ward-data.json
As you can see, it appears to be adding %20 between liverpool and -ward.
Is this because of grunt, or due to something else?

%20 represents a space in URL (percent) encoding, so make sure there are no spaces in your output.
You can use something like this to help:
string.replace(/ /g,'') to strip the white spaces out. Where string is your URL.
Either that or try this:
.replace(/%20/g,'')

Simply check your selectedValue value. There is a space before - character. Either remove it, or call trim before using it.
Should solve your problem.

It's probably because of something else. %20 is added because of URL escaping rules (which d3.json() probably applies; Grunt shouldn't have an effect here) but what it means is that selectedValue ends with a space character. I've read in your comments that you're 100% sure that there isn't one but if that was true, then there wouldn't be a %20 in the URL. Computers don't add things just for fun, there is always a reason.
So my suggestion is to debug the code as it runs to see what the variable contains, then search your whole code base for -wards-data.json (because maybe there is a second place in the code that you forgot about).
If that doesn't work, then you'll have to tell us more about the Grunt config (are you compressing scripts, obfuscating? Do you have plugins installed?) Also show us the code which Grunt generates out of your input.
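
The trailing-space explanation from the answers above can be reproduced in isolation. In this sketch the value 'liverpool ' is a made-up stand-in for whatever selectedValue holds after the build, and encodeURI stands in for the escaping applied when the request is made:

```javascript
// Hypothetical value with a trailing space, as the answers suspect.
const selectedValue = 'liverpool ';

// encodeURI stands in for the URL escaping applied when the request is sent.
const buildPath = (value) =>
  encodeURI('json/wards-info/' + value + '-wards-data.json');

// JSON.stringify shows the otherwise invisible trailing space.
console.log(JSON.stringify(selectedValue)); // "liverpool "

console.log(buildPath(selectedValue));
// json/wards-info/liverpool%20-wards-data.json

console.log(buildPath(selectedValue.trim()));
// json/wards-info/liverpool-wards-data.json
```

Logging the value through JSON.stringify right before the d3.json() call is a quick way to confirm whether the space is really there.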

how to debug line break issues with javascript

I'm maintaining a website that accepts multi-line input from users and sends the data via JSON. Line breaks (\n) are encoded and decoded properly, but somehow the \r chars are not accepted on the server side, and I have the feeling I need to escape them before sending them over. Before making the fix, I want to reproduce the issue, but I can't find a way to do it!
Do you have any recommendations?
EDIT: After more investigation, it turns out that the issue occurs in IE only (the \r chars get added when copying/pasting into the text input). Hijacking the text area did not change anything in Firefox or Chrome, and doing a data.description.replace("\r","") did not solve the issue either. Still poking around.

if you just want to reproduce the error, just add some js to populate the textarea:
document.getElementById('textarea-id').value = 'test\r\ntest';
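
One likely reason the replace("\r","") attempt in the question had no visible effect: with a string as the first argument, replace() only removes the first match. A short sketch with made-up sample text, using JSON.stringify to make the line break characters visible:

```javascript
// Made-up sample resembling text pasted into the textarea in IE.
const pasted = 'line one\r\nline two\r\nline three';

// A string pattern replaces only the first occurrence.
const firstOnly = pasted.replace('\r', '');

// A regular expression with the g flag replaces every occurrence.
const allRemoved = pasted.replace(/\r/g, '');

console.log(JSON.stringify(pasted));     // "line one\r\nline two\r\nline three"
console.log(JSON.stringify(firstOnly));  // "line one\nline two\r\nline three"
console.log(JSON.stringify(allRemoved)); // "line one\nline two\nline three"
```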

The JavaScript console in Chrome's Developer Tools lets you send JSON to your server using friendly jQuery/MooTools/Prototype syntax.
