Change string encoded in win1250 to utf8 - javascript

I'm loading a file that has encoding win1250, but when I load it, it has characters like p��jemce instead of příjemce (note diacritics.)
I'd like to change the encoding FROM win1250 TO UTF8.
I managed to do it in PHP with the following code
$content = iconv('windows-1250', 'UTF-8', $content);
but I am unable to do it in JavaScript. I need to do this conversion on the client without sending the file to the server (so I can't use PHP as an "encoding proxy").
I've tried the libraries iconv-lite and text-encoding (from NPM) like this:
var reader = new FileReader();
reader.onload = () => {
var data = reader.result;
// iconv-lite
var buf = iconv.encode(data, 'win1250');
var str1 = iconv.decode(new Buffer(buf), 'utf8');
// text-encoding
var uint8array = new TextEncoder('windows-1250').encode(data);
var str2 = new TextDecoder('utf-8').decode(uint8array);
console.log(str1);
console.log(str2);
};
reader.readAsText(file);
But neither has actually correctly changed the encoding. Is there anything I'm missing?

I think you could simply try reader.readAsArrayBuffer
var reader = new FileReader();
reader.onload = () => {
var buf = reader.result;
// iconv-lite (expects a Buffer, so wrap the ArrayBuffer first)
var str1 = iconv.decode(Buffer.from(buf), 'win1250');
// text-encoding
var str2 = new TextDecoder('windows-1250').decode(buf);
console.log(str1);
console.log(str2);
};
reader.readAsArrayBuffer(file);
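readAsArrayBuffer gives you the file's raw bytes, so you can decode them as windows-1250 yourself instead of letting readAsText interpret them as UTF-8.
I don't have a full dev environment at hand, so the code above is not fully tested; I hope it is at least a useful starting point.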
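As a minimal sketch of the idea above: once you have the raw bytes, a single decode step is enough, because JavaScript strings are already Unicode. The helper name decodeWin1250 and the sample bytes are assumptions made for illustration, and the sketch relies on the runtime's TextDecoder supporting the windows-1250 label.

```javascript
// Hypothetical helper: decode windows-1250 bytes into a JavaScript string.
// `buffer` stands for what reader.result holds after reader.readAsArrayBuffer(file).
function decodeWin1250(buffer) {
  return new TextDecoder('windows-1250').decode(buffer);
}

// Sample bytes (an assumption for illustration): the word from the question
// as it would be stored in a windows-1250 file.
const sample = Uint8Array.from([0x70, 0xf8, 0xed, 0x6a, 0x65, 0x6d, 0x63, 0x65]);
const text = decodeWin1250(sample.buffer);
console.log(text);
```

No separate "convert to UTF-8" step is needed unless you want to write the bytes out again.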

Related

How do I get the base64 encoding of a file that uses Windows-1251

I have a file upload input where I accept a file, convert it to a base64 string, and send the payload to a REST API.
This works fine for UTF-8 files. But when I try to get the base64 string of a "windows-1251" file, it is not converted properly, and the API throws an error because the base64 string is not valid content.
So my question is: how do I get the base64 string of a file that is encoded in windows-1251?
var reader2 = new FileReader();
reader2.readAsDataURL(file);
reader2.onload = function (e) {
var sContentStream = e.target.result;
};
Sorry, but the premise makes no sense.
FileReader.readAsDataURL always returns a valid base64 string for whatever you give it, because it treats the input as plain binary data.
The fact that these bytes represent a text file with a given encoding is simply ignored by the algorithm.
const rand_data = crypto.getRandomValues(new Uint8Array(50));
const blob = new Blob([rand_data]);
const reader = new FileReader();
reader.onload = e => {
const dataURL = reader.result
const base64 = dataURL.slice(dataURL.indexOf(',')+1);
console.log(base64);
console.log(atob(base64)); // would throw if invalid data
};
reader.readAsDataURL(blob);
So you are looking at the wrong end of the problem: The consumer may have issues with reading windows-1251 encoded text files, but that's not FileReader's fault.
Now, if you are willing to do the conversion from this encoding to UTF-8 in the browser, that is still doable, but you'd need a way to know which encoding the file you've been given is in.
const win_1251 = new Blob([Uint8Array.from([200])]); // И in windows-1251
// to prove it's not UTF-8
readUTF8Text(win_1251); // �
const reencode_reader = new FileReader();
reencode_reader.onload = e => {
const utf_8_arr = new TextDecoder('windows-1251')
.decode(new Uint8Array(reencode_reader.result));
const utf_8 = new Blob([utf_8_arr], {type: 'text/plain'})
makeDataURL(utf_8);
readUTF8Text(utf_8); // И
};
reencode_reader.readAsArrayBuffer(win_1251);
function makeDataURL(blob) {
const reader = new FileReader();
reader.onload = e => {
console.log(reader.result);
};
reader.readAsDataURL(blob);
}
function readUTF8Text(blob) {
const reader = new FileReader();
reader.onload = e => {
console.log(reader.result);
};
reader.readAsText(blob);
}
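If the goal is a base64 string of the UTF-8 version, the same re-encoding can be sketched without going through a second Blob. The helper name win1251ToUtf8Base64 is hypothetical, and the sketch assumes you already know the input is windows-1251.

```javascript
// Hypothetical helper: re-encode windows-1251 bytes as UTF-8 and return base64.
// `buffer` stands for what reader.result holds after reader.readAsArrayBuffer(file).
function win1251ToUtf8Base64(buffer) {
  // Decode the legacy bytes into a JavaScript string.
  const text = new TextDecoder('windows-1251').decode(buffer);
  // Encode that string as UTF-8 bytes (TextEncoder always produces UTF-8).
  const utf8 = new TextEncoder().encode(text);
  // btoa expects a "binary string" with one character per byte.
  let binary = '';
  for (const byte of utf8) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Same single-byte sample as above: 200 is И in windows-1251.
const base64 = win1251ToUtf8Base64(Uint8Array.from([200]).buffer);
console.log(base64);
```

For large files, building the binary string byte by byte may be slow; readAsDataURL on the re-encoded Blob, as shown above, avoids that loop.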

Blob is appending commas in typescript, when creating csv file

I'm trying to export data from an Angular 6 web application.
I have an array of string, where each string is a csv line, formatted like this:
var csvLines = ['val1,val2\n', 'val3,val4\n'...];
Once I've added all the data I need to the array, I write it to the console:
This looks fine...
Now I want to convert it to a Blob and download it as a .csv file.
The download is fine, but the format of the output is wrong.
When I run the following code:
const blob = new Blob([csvLines], {type: 'text/csv;encoding:utf-8'});
const reader = new FileReader();
reader.onload = () => {
console.log(reader.result);
};
reader.readAsText(blob);
I get this output.
Note the commas added to every line but the first; they mess up my CSV.
Can anyone tell me why this is happening and perhaps how to disable the comma appending?
I have tried to create the Blob with text/plain as mimetype and without the encoding, but the commas are still appended.
Because you are passing csvLines as [csvLines] to new Blob(..), you are passing an array containing an array. The Blob constructor converts the inner array to a string, and an array's default string conversion joins its elements with commas.
Just use new Blob(csvLines, { type: 'text/csv;encoding:utf-8' }); and you should be fine.
const csvLines = ['val1,val2\n', 'val3,val4\n'];
const blob = new Blob(csvLines, { type: 'text/csv;encoding:utf-8' });
const reader = new FileReader();
reader.onload = () => {
console.log(reader.result);
};
reader.readAsText(blob);
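The difference can be seen without a Blob at all. This is a small sketch that only reproduces the string conversion; the variable names nested and flat are made up for illustration.

```javascript
const csvLines = ['val1,val2\n', 'val3,val4\n'];

// What new Blob([csvLines]) effectively stores: the inner array converted
// to a string, which joins the elements with commas.
const nested = String(csvLines);

// What new Blob(csvLines) stores: the parts concatenated in order.
const flat = csvLines.join('');

// JSON.stringify makes the newlines and the extra comma visible.
console.log(JSON.stringify(nested));
console.log(JSON.stringify(flat));
```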

image to base64 in javascript and c# (frontend and backend) is different

I am trying to convert an image to base64 to upload it to a SharePoint site, but the request fails with a 400 Bad Request error. On closer inspection I found that the base64 string I am sending, which is encoded by JavaScript, differs from what SharePoint expects. I have attached two images showing the difference. Can anyone help me get the properly encoded data using JavaScript?
javascript encoded base64
c# encoded base64
var files = $("#myfile").get(0).files;
var reader = new FileReader();
reader.readAsDataURL(files[0]);
reader.onload = function () {
console.log(reader.result);
}
You could try: reader.result.split("base64,")[1]
This removes the data URL prefix (everything up to and including "base64,") from the start of the string.
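A minimal sketch of that suggestion as a reusable function follows. The name dataUrlToBase64 and the sample data URL are assumptions for illustration; it cuts at the first comma rather than at "base64,", which gives the same result for the data URLs that readAsDataURL produces.

```javascript
// Hypothetical helper: return only the base64 payload of a data URL.
function dataUrlToBase64(dataUrl) {
  // Everything after the first comma is the payload; the part before it
  // is the "data:<mime type>;base64" header.
  return dataUrl.slice(dataUrl.indexOf(',') + 1);
}

// Made-up sample data URL, for illustration only.
const sampleUrl = 'data:image/png;base64,iVBORw0KGgo=';
const payload = dataUrlToBase64(sampleUrl);
console.log(payload);
```

In the question's code this would be called on reader.result inside the onload handler.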
Please try this. I am using it in my project and it works for me:
if (file.ContentType.Contains("image"))
{
string theFileName = Path.GetFileName(file.FileName);
byte[] thePictureAsBytes = new byte[file.ContentLength];
using (BinaryReader theReader = new BinaryReader(file.InputStream))
{
thePictureAsBytes = theReader.ReadBytes(file.ContentLength);
}
string thePictureDataAsString = Convert.ToBase64String(thePictureAsBytes);
}
The variable thePictureDataAsString now holds the base64 string.
This is how I receive the file in my project:
public ActionResult SaveMake(string inputMakeName, HttpPostedFileBase file)
{
MakeModel objMakeModel = new MakeModel();
if (file.ContentType.Contains("image"))
{
string theFileName = Path.GetFileName(file.FileName);
byte[] thePictureAsBytes = new byte[file.ContentLength];
using (BinaryReader theReader = new BinaryReader(file.InputStream))
{
thePictureAsBytes = theReader.ReadBytes(file.ContentLength);
}
string thePictureDataAsString = Convert.ToBase64String(thePictureAsBytes);
objMakeModel.ImageBase64 = thePictureDataAsString;
objMakeModel.Make1 = inputMakeName;
}
string response = _apiHelper.ConvertIntoReturnStringPostRequest<MakeModel>(objMakeModel, "api/Transaction/SaveMakes/");
// string response = _apiHelper.SaveMake(objMakeModel, "api/Transaction/SaveMakes/");
return RedirectToAction("AddVehicleMaintenance");
}

Filereader JS API not reading files properly

I have written the following code to read text from any CSV or text file. However, it sometimes reads the file successfully and stores the text in the variable, and sometimes it doesn't. Is there something missing in my code?
groupCsvData = [];
$('#add-group-upload').change(function() {
var file = this.files[0];
var reader = new FileReader();
reader.onload = function(e) {
var text = reader.result;
groupCsvData = [text];
};
reader.readAsText(file);
});

Javascript html5 how to convert binary data into string

var reader = new FileReader();
var rawData = new ArrayBuffer();
//console.log(1);
reader.onload = function(e) {
var rawData = e.target.result; //binary data
console.log(rawData);
}
I want to see the raw binary data explicitly as a text string. Is that possible? The only thing I see when logging is:
ArrayBuffer {}
You can try
console.log(String.fromCharCode.apply(null, new Uint16Array(rawData)));
This is what I needed:
reader.readAsBinaryString(file);
Then the raw data is available as a string.
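If you would rather keep readAsArrayBuffer, a byte-wise conversion gives the same kind of one-character-per-byte string, and a hex dump is often easier to read for non-text data. Both helper names below are hypothetical; this is a sketch, not the only way to do it.

```javascript
// Hypothetical helper: turn an ArrayBuffer into a string with one character per byte.
function arrayBufferToBinaryString(buffer) {
  const bytes = new Uint8Array(buffer);
  let result = '';
  for (const byte of bytes) {
    result += String.fromCharCode(byte);
  }
  return result;
}

// Hypothetical helper: show each byte as two hex digits, separated by spaces.
function arrayBufferToHex(buffer) {
  return Array.from(new Uint8Array(buffer), (b) => b.toString(16).padStart(2, '0')).join(' ');
}

// Sample bytes for illustration; in the question this would be e.target.result.
const raw = Uint8Array.from([72, 105, 0, 255]).buffer;
console.log(arrayBufferToHex(raw));
```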
