Weird character in a commit.template of git [duplicate] - javascript

I keep getting the ^M character in my .vimrc and it breaks my configuration.

Unix uses 0xA for a newline character. Windows uses a combination of two characters: 0xD 0xA. 0xD is the carriage return character. ^M happens to be the way vim displays 0xD (0x0D = 13, M is the 13th letter in the English alphabet).
You can remove all the ^M characters by running the following:
:%s/^M//g
Where ^M is entered by holding down Ctrl and typing v followed by m, and then releasing Ctrl. This is sometimes abbreviated as ^V^M, but note that you must enter it as described in the previous sentence, rather than typing it out literally.
This expression will replace all occurrences of ^M with the empty string (i.e. nothing). I use this to get rid of ^M in files copied from Windows to Unix (Solaris, Linux, OSX).
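The arithmetic behind the ^M notation can be checked outside vim. The following is a minimal JavaScript sketch (the variable names and the sample line are illustrative, not taken from the answer) showing that Ctrl+M corresponds to character code 13 and that a Windows line ending is the pair 0x0D 0x0A:

```javascript
// Caret notation: Ctrl+<letter> is the letter's code minus 64 (0x40).
const ctrlM = "M".charCodeAt(0) - 64;       // 77 - 64 = 13
const carriageReturn = "\r".charCodeAt(0);  // 0x0D

console.log(ctrlM === carriageReturn); // true

// An assumed Windows-style line, as it might appear in a copied .vimrc.
const windowsLine = "set number\r\n";
const codes = Array.from(windowsLine.slice(-2), (ch) => ch.charCodeAt(0));
console.log(codes); // [ 13, 10 ]

// In-memory equivalent of removing every carriage return.
const cleaned = windowsLine.replace(/\r/g, "");
console.log(JSON.stringify(cleaned)); // "set number\n"
```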

:%s/\r//g
worked for me today. But my situation may have been slightly different.

To translate the new line instead of removing it:
:%s/\r/\r/g

It probably means you've got carriage returns (different operating systems use different ways of signaling the end of line).
Use dos2unix to fix the files or set the fileformats in vim:
set ffs=unix,dos

Let's say your text file is - file.txt, then run this command -
dos2unix file.txt
It converts the text file from dos to unix format.

I removed them all with sed:
sed -i -e 's/\r//g' <filename>
Could also replace with a different string or character. If there aren't line breaks already for example you can turn \r into \n:
sed -i -e 's/\r/\n/g' <filename>
Those sed commands work on the GNU/Linux version of sed but may need tweaking on BSDs (including macOS).

I got a text file originally generated on a Windows Machine by way of a Mac user and needed to import it into a Linux MySQL DB using the load data command.
Although Vim displayed the '^M' character, none of the above worked for my particular problem: the data would import but was always corrupted in some way. The solution was pretty easy in the end (after much frustration).
Solution:
Executing dos2unix TWICE on the same file did the trick! Using the file command shows what is happening along the way.
$ file 'file.txt'
file.txt: ASCII text, with CRLF, CR line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text, with CRLF line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text
And the final version of the file imported perfectly into the database.
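The file output above reports two kinds of terminator in the same file. As a rough illustration, the mix can be counted in JavaScript. This is a sketch under assumptions: countLineEndings is a hypothetical helper (not part of dos2unix or file), and the sample assumes every line ends in CR CR LF, which is one plausible way to get that report:

```javascript
// Hypothetical helper: count each kind of line terminator in a string.
function countLineEndings(text) {
  const counts = { crlf: 0, cr: 0, lf: 0 };
  // CRLF is listed first so a "\r\n" pair is not counted as CR plus LF.
  for (const match of text.matchAll(/\r\n|\r|\n/g)) {
    if (match[0] === "\r\n") counts.crlf += 1;
    else if (match[0] === "\r") counts.cr += 1;
    else counts.lf += 1;
  }
  return counts;
}

// Assumed sample: every line ends in CR CR LF.
const doubled = "id,name\r\r\n1,Ada\r\r\n";
console.log(countLineEndings(doubled)); // { crlf: 2, cr: 2, lf: 0 }

// Deleting every CR in one pass leaves plain LF endings.
const unixText = doubled.replace(/\r/g, "");
console.log(countLineEndings(unixText)); // { crlf: 0, cr: 0, lf: 2 }
```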

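On Unix it is probably easier to use the 'tr' command. The command below turns every carriage return into a newline, which suits files with bare CR line endings; for CRLF files, tr -d "\r" deletes the carriage returns instead.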
cat file1.txt | tr "\r" "\n" > file2.txt

This is the only thing that worked in my case:
:e ++ff=dos
:wq

You can fix this in vim using
:1,$s/^V^M//g
where ^V^M means pressing Ctrl+V followed by Ctrl+M, not typing the caret characters literally.

If you didn't specify a different fileformat intentionally (say, :e ++ff=unix for a Windows file), it's likely that the target file has mixed EOLs.
For example, if a file has some lines with <CR><NL> endings and others with <NL> endings, and fileformat is set to unix automatically by Vim when reading it, ^M (<CR>) will appear.
In such cases, fileformats (note: there's an extra s) comes into play. See :help ffs for the details.

If it breaks your configuration, and the ^M characters are required in mappings, you can simply replace the ^M characters with <Enter> or even <C-m> (both typed as simple character sequences, so 7 and 5 characters, respectively).
This is the single recommended, portable way of storing special keycodes in mappings.

In FreeBSD, you can clear the ^M manually by typing the following:
Type :%s/ and then press Ctrl+V, then Ctrl+M (which inserts a literal ^M), then Ctrl+M again (which acts as Enter and runs the command).

I've discovered that I've been polluting files for weeks because my Homebrew MacVim instance was set to use fileformat=dos. I made the required change in my .vimrc.

Try :%s/\^M// (at least this worked for me).

Related

How to make meSpeak.js read special characters?

I would like to use the meSpeak.js script (based on speak.js, which is based on eSpeak) for text-to-speech. It has a Czech voice file, but for some reason it skips Czech special characters like ě, š, č, ř, ž and reads only the rest.
As espeak on Windows reads them correctly, I tried to compile a new voice file (cs.json), but the problem persists.
Thanks!
I don't know what those characters sound like, but your best bet might be to approximate each one with the closest-sounding English character combination.
For instance, if š sounds like sh in English (not saying it does), then just replace all instances of š with sh.
It may be better to use this eSpeak modification instead of meSpeak:
http://eeejay.github.io/espeak/emscripten/espeak.html (demo)

Remove ^M from CSV file

Trying to read CSV formatted data into javascript using the jquery-csv library, but am getting a CSVDataError: Illegal Data error from the ^M character at the end of each line.
It seems that no matter how a CSV is saved, I get this ^M. I can only see the ^M if I open the CSV file in vim; in a text editor or my IDE the data looks fine. I don't get this problem when working in other languages such as Python or R.
I am working on a Mac environment.
How can I fix this and avoid this problem in the future?
Use dos2unix to convert.
It's false that "no matter how it is saved" the CR (^M is a carriage return) is appended. For instance, echo 'a,b,c' > letters.csv does not append a CR. Check your text editor settings.
Take a look at the splitlines algorithm on the jquery-csv page. It seems to provide a function that will clean these problematic carriage returns for you.
Assuming ^M indicates a mac-style carriage return, support for carriage return was included in a previous release so your code should just work.
Source: I'm the author of jquery-csv
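If the file cannot be fixed at the source, the carriage returns can also be handled in JavaScript before the text reaches a parser. The sketch below is only an illustration: splitCsvLines is a hypothetical helper, not a jquery-csv function, the sample data is made up, and the naive comma split ignores quoted fields.

```javascript
// Hypothetical helper: split raw CSV text into lines, accepting CRLF, CR, or LF.
function splitCsvLines(rawText) {
  return rawText
    .split(/\r\n|\r|\n/)            // treat all three terminators as line breaks
    .filter((line) => line !== ""); // drop empty lines, including the piece after a trailing terminator
}

// Assumed sample with Windows line endings.
const rawCsv = "name,score\r\nAda,90\r\nLin,85\r\n";
const rows = splitCsvLines(rawCsv).map((line) => line.split(","));
console.log(rows);
```

Splitting on all three terminators at once avoids having to know in advance which editor or operating system produced the file.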

encodeURIComponent encodes differently, depending on environment

I am passing an object via the url using:
encodeURIComponent(JSON.stringify(myObject))
"ä" is encoded as "%C3%A4" on my local server.
Unfortunately it is encoded as "a%CC%88" on the webserver.
This breaks my app, because the string is part of the name of a database field, and the field isn't found when the name is encoded differently. I can't guarantee that there are no ä's in field names, because the app allows users to upload their own data.
How can I make sure that "ä" is always encoded correctly?
SORRY. To make this clear: the encoding happens client-side in the browser both times. But when the web-app is served from the webserver, the "ä" is encoded as "a%CC%88" instead of "%C3%A4" (I've tested both in the same Chrome browser).
Thanks for all your help. It got me to dig deeper:
I have code that runs on an event. It loops through checkboxes and creates an array of objects containing (also) the field names. The code gets the field names from an attribute named "feld" of the checkbox:
<div class="checkbox">
<label>
<input class="feld_waehlen" type="checkbox" dstyp="Taxonomie" datensammlung="SISF Index 2 (2005)" feld="Artname vollständig">Artname vollständig
</label>
</div>
running this code:
console.log("this.getAttribute('feld') = " + this.getAttribute('feld'));
gives, as expected: this.getAttribute('feld') = Artname vollständig
If while looping, I run:
console.log('encodeURIComponent("Artname vollständig") = ' + encodeURIComponent("Artname vollständig"));
the answer is correct: encodeURIComponent("Artname vollständig") = Artname%20vollst%C3%A4ndig
But if I run:
console.log("encodeURIComponent(this.getAttribute('feld')) = " + encodeURIComponent(this.getAttribute('feld')));
the answer is: encodeURIComponent(this.getAttribute('feld')) = Artname%20vollsta%CC%88ndig
This happens all in the browser. But the issue only appears, when the web-app is served from the webserver (a couchapp running on cloudant.com).
How can it be that the method "getAttribute" returns a different encoding?
The following code has been tested on Chrome 29 OS X, IE 8 Windows XP.
encodeURIComponent("ä") // "%C3%A4"
decodeURIComponent("%C3%A4") //ä
so basically "%C3%A4" should be the expected output.
I think the issue here might be that encodeURIComponent requires UTF-8 encoding while your server-side language returns something else.
encodeURIComponent - MDN
Just a follow-up in case somebody runs into this issue later.
It seems to be unique to cloudant.com where my couchapp was hosted.
This is the answer I got from their very helpful support:
OK - I think I've found the culprit. The issue is that, due to internal optimisations (which are not present in CouchDB), the form of unicode strings can get changed. In this case, ä is represented as:
U+0061 LATIN SMALL LETTER A character
U+0308 COMBINING DIAERESIS character (̈)
instead of
U+00E4 LATIN SMALL LETTER A WITH DIAERESIS character (ä)
Both are semantically equivalent, so the fix is to normalize your unicode strings before comparison. Unfortunately, JavaScript has no built-in unicode normalization, but you can use a library such as https://github.com/walling/unorm.
It's not an issue for me any more as I changed to a virtual server running on digitalocean.com with vanilla couchdb (and am very happy with it).
But I do think this could hit others developing couchapps in German or other languages needing utf8 and hosting them on cloudant.com
Thanks for your great help.
Alex
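Since that support reply was written, ECMAScript 2015 added String.prototype.normalize, so environments that implement it can normalize without a library. A minimal sketch of the mismatch and the fix, assuming such an environment (the variable names are illustrative):

```javascript
const composed = "\u00E4";     // ä as a single code point (U+00E4)
const decomposed = "a\u0308";  // a followed by a combining diaeresis (U+0308)

console.log(encodeURIComponent(composed));   // %C3%A4
console.log(encodeURIComponent(decomposed)); // a%CC%88
console.log(composed === decomposed);        // false

// Normalizing both sides to NFC makes them compare equal.
const sameAfterNfc = composed.normalize("NFC") === decomposed.normalize("NFC");
console.log(sameAfterNfc); // true

// Normalize a field name before encoding it for the URL.
const fieldName = "Artname vollsta\u0308ndig";
const encoded = encodeURIComponent(fieldName.normalize("NFC"));
console.log(encoded); // Artname%20vollst%C3%A4ndig
```

Normalizing on both the writing and the reading side, rather than only at comparison time, keeps stored field names in one consistent form.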

CKEDITOR.instance[x].setData not working in IE

OK, I'm using CKEditor in a web application. One thing I need to do is set the text in the text area. I've been using the line:
CKEDITOR.instances.setData(html);
...where html is a variable containing HTML.
This works fine in Chrome & Firefox, but not at all in Internet Explorer or Safari.
Can anyone provide an insight as to why, or suggest a work-around?
Many thanks in advance! :-)
Make sure to strip all newlines from the string you pass into setData(). An exception is thrown if you don't, with a message about an unterminated string. The newline characters used by CKEditor are the UNIX-style of \n (in other words, not the DOS version: \r\n).
The newline apparently throws off the parser, making it think that it's the end of the statement.
Also note that if you call getData() to get that value you just set again, CKEditor puts the line breaks and tabs back into it. You'll need to strip them out again if you need to set that value back using setData(). I use a regexp pattern like this to strip out the newlines (and tabs just for completeness):
[\n\t]+
If you use a regular expression to strip them, also make sure that the pattern matching will actually match the \n character (called "single-line" mode in .NET, but I don't know what you're using).
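As a small illustration of the stripping step, the pattern can be applied with String.prototype.replace before the value is handed to setData(). This is a sketch, not the answerer's exact code: stripLineBreaks is a hypothetical helper, \r is added to the character class on the assumption that the input may contain DOS line endings, and the editor call is left as a comment because it needs a live CKEditor instance.

```javascript
// Hypothetical helper: remove line breaks and tabs from an HTML string.
function stripLineBreaks(html) {
  return html.replace(/[\r\n\t]+/g, "");
}

// Assumed sample input with mixed line endings and a tab.
const rawHtml = "<p>First</p>\r\n\t<p>Second</p>\n";
const compactHtml = stripLineBreaks(rawHtml);
console.log(compactHtml); // <p>First</p><p>Second</p>

// With an editor on the page, the cleaned string would then be passed along:
// editor.setData(compactHtml);  // "editor" stands for your own CKEditor instance
```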

Non-terminating RegExp.exec in Rhino

I have the following JavaScript program saved in a file pre.js:
var pre = readFile("method-help.html");
RegExp.multiline = true;
print(/<pre>((?:.|\s)+)<\/pre>/.exec(pre)[1]);
The contents of method-help.html is simply the page at http://api.stackoverflow.com/1.0/help/method?method=answers/%7bid%7d. What I'm trying to do is get the JSON code in between the pre tags. However, when I run the program in Rhino, nothing is printed out and the program does not terminate. The command I use is:
java -jar js.jar pre.js
My Rhino version is 1_7R2.
The reason it doesn't seem to terminate is probably catastrophic back-tracking due to . and \s overlapping (it would end eventually, but it could be a long time). Here's a correct, fast, version:
var pre = readFile("method-help.html");
print(/<pre>([\s\S]*?)<\/pre>/.exec(pre)[1])
You don't need multiline. That only affects the meaning of ^ and $, which you're not using. However, we do use \s\S to mean all characters (including newline, etc.). We also use *? to mean zero or more characters, non-greedy. The question mark (non-greedy) doesn't matter here but it would if there were multiple pre blocks.
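To see where the non-greedy quantifier starts to matter, here is a short sketch with two pre blocks. The sample string is made up for illustration and is not the content of the help page from the question:

```javascript
// Assumed sample page with two <pre> blocks.
const page =
  '<pre>{\n  "a": 1\n}</pre>\n<p>text</p>\n<pre>{\n  "b": 2\n}</pre>';

// Non-greedy: the capture stops at the first closing tag.
const lazy = /<pre>([\s\S]*?)<\/pre>/.exec(page)[1];

// Greedy: the capture runs on to the last closing tag.
const greedy = /<pre>([\s\S]*)<\/pre>/.exec(page)[1];

console.log(lazy === '{\n  "a": 1\n}');  // true
console.log(greedy.includes("</pre>"));  // true, it swallowed the first block's end tag
```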
