Reading Arrow Feather files in GoLang or Javascript

I am looking for a way to read Feather files in GoLang or JavaScript, or in some other language that does not require users to install anything extra.
My goal is to provide a user interface that reads a Feather file and converts it back to a human-readable CSV. However, I can't find many resources on how to do this.
Currently I have a test Feather file generated by the code below.
import pandas as pd
import datetime
import numpy as np
import pyarrow.feather as feather
# Create a dummy dataframe
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')
columns = ['A','B', 'C']
df = pd.DataFrame(index=index, columns=columns)
df = df.fillna(0) # with 0s rather than NaNs
feather.write_feather(df, 'test_feather.csv')
Thanks in advance.

The JavaScript package apache-arrow comes with a script that does exactly this. You can find the source for the script here: https://github.com/apache/arrow/blob/master/js/bin/arrow2csv.js
If it does not do exactly what you want, the script should still serve as an example of how to use the API to read in a Feather file.

Thanks to @Pace for the hints. It turns out I can simply use the arrow.Table.from([arrow]) function to load the .feather file and then write it out as CSV.
For anyone who runs into the same issue, the code below may serve as a reference.
const apArrow = require('apache-arrow');
const fs = require('fs');
const outputDir = 'output/feather';
// Append a chunk of CSV text to the output file (outputDir must already exist).
const writeIntoFile = (data) => {
  // appendFileSync takes no callback; it throws on failure.
  fs.appendFileSync(`${outputDir}/test_feather.csv`, data);
};
// Build one CSV line from a table row.
const readDataFromRow = (fields, row) => {
  return fields
    .map((f) => row.get(f))
    .join(',');
};
// Note: Table.from and table.count() reflect the apache-arrow version used
// when this was written; later releases may expose a different API.
const arrowReader = (filePath) => {
  console.log('filePath', filePath);
  const arrow = fs.readFileSync(filePath);
  const table = apArrow.Table.from([arrow]);
  const columns = table.schema.fields.map((f) => f.name);
  let buf = columns.join(',') + '\n';
  for (let i = 0; i < table.count(); i++) {
    const rowData = readDataFromRow(columns, table.get(i));
    buf += `${rowData}\n`;
    // flush to the csv file every 10000 rows to keep the buffer small
    if (i % 10000 === 0) {
      writeIntoFile(buf);
      buf = '';
    }
  }
  // write whatever is left in the buffer
  writeIntoFile(buf);
};
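One caveat with joining values with a bare comma, as the snippet above does: a value that itself contains a comma, a double quote, or a line break will corrupt the row. A minimal sketch of the usual CSV quoting rule follows; escapeCsvValue and toCsvLine are hypothetical helper names, not part of apache-arrow.

```javascript
// Hypothetical helpers (not part of apache-arrow): quote values so that
// commas, double quotes and line breaks survive a round trip through CSV.
const escapeCsvValue = (value) => {
  if (value === null || value === undefined) return '';
  const text = String(value);
  // Quote only when needed; embedded double quotes are doubled.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
};

// Join already-extracted cell values into one CSV line.
const toCsvLine = (values) => values.map(escapeCsvValue).join(',');

console.log(toCsvLine(['a', 'b,c', 'say "hi"'])); // a,"b,c","say ""hi"""
```

In the reader above, this could stand in for the plain .join(',') inside readDataFromRow, assuming the row values convert sensibly with String().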

Related

How could I duplicate/copy file in an automatized way with JavaScript?

I have a GIF file stored in a directory called assets on my computer. I would like to create X duplicates of it, stored in the same directory, each with a different file name.
Example:
The assets directory contains a GIF file called 0.gif. I would like to duplicate it 10 times, and the duplicates should be called 1.gif, 2.gif, 3.gif and so on.
The simplest option is to use the copyFile function from the fs module:
const fs = require("fs");
const path = require("path");
let copyMultiple = (src, count) => {
let initCount = 0;
while (initCount < count) {
initCount++;// you can put this at bottom too acc to your needs
const newFileName = `${initCount}_${initCount}${path.extname(src)}`;
console.log(newFileName, "is new file name");
fs.copyFile(src, newFileName, (error) => {
// if errors comes
if (error) {
console.log(error);
}
});
}
};
copyMultiple("1.gif", 3);
Another, more elegant way of doing this is to promisify copyFile:
const util = require("util");
const fs = require("fs");
const path = require("path");
const copyFilePromise = util.promisify(fs.copyFile);
function copyFiles(srcFile, destDir, destFileNames) {
return Promise.all(
destFileNames.map((file) => {
return copyFilePromise(srcFile, path.join(destDir, file));
})
);
}
const myDestinationFileNames = ["second.gif", "third.gif"];
const sourceFileName = "1.gif";
copyFiles(sourceFileName, "", myDestinationFileNames)
.then(() => {
console.log("Copying is Done");
})
.catch((err) => {
console.log("Got an error", err);
});
Using this approach also has the advantage of telling you when all the copies are done.
You can read docs here
const fs = require("fs")
const filename = "0.gif".split(".") // split a name like 0.gif into ["0", "gif"]
const times = 10 // number of duplicates to create
for (let i = 1; i <= times; i++) {
  const newFilename = `${parseInt(filename[0], 10) + i}.${filename[1]}` // 0.gif becomes 1.gif, 2.gif, ...
  fs.copyFileSync(filename.join("."), newFilename)
}
Use readFile and writeFile from the fs module and a simple for loop.
I'm not sure which framework you're on, but fs.copyFile() is the standard way in Node.js: https://nodejs.org/api/fs.html#fscopyfilesrc-dest-mode-callback
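None of the snippets above guards against overwriting a file that already exists. As a minimal sketch, assuming the source file has a numeric base name as in the question, the copy mode flag fs.constants.COPYFILE_EXCL makes the copy fail instead of overwriting. The helper name duplicateNumbered is hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: copy `src` to `count` numbered siblings in the same
// directory (1.gif, 2.gif, ... when src is 0.gif). Assumes the source file
// has a numeric base name, as in the question.
function duplicateNumbered(src, count) {
  const dir = path.dirname(src);
  const ext = path.extname(src);
  const start = parseInt(path.basename(src, ext), 10);
  if (Number.isNaN(start)) {
    throw new Error(`Expected a numeric file name, got: ${src}`);
  }
  const created = [];
  for (let i = 1; i <= count; i++) {
    const dest = path.join(dir, `${start + i}${ext}`);
    // COPYFILE_EXCL makes the copy throw if dest already exists.
    fs.copyFileSync(src, dest, fs.constants.COPYFILE_EXCL);
    created.push(dest);
  }
  return created;
}
```

A call such as duplicateNumbered('assets/0.gif', 10) would then create assets/1.gif through assets/10.gif and throw if any of them is already there.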

txt file to json using Node JS

I have a simple txt file with data in this format with millions of lines:
{"a":9876312,"b":1572568981512}
{"a":9876312,"b":1572568981542}
I want to convert this into a file with a .json extension, using the reduce function in Node.js and a return statement, probably looking like this:
[{"a":9876312,"b":1572568981512},
{"a":9876312,"b":1572568981542}]
Any help will be really appreciated. Thanks :)
So far I have tried this:
const fs = require('fs');
const FILE_NAME = 'abc.txt';
const x = mapEvents(getJSONFileData(FILE_NAME));
function getJSONFileData(filename) {
return fs.readFileSync(filename, 'utf-8')
.split('\n')
.map(JSON.parse)
}
function mapEvents(events) {
events.reduce((acc, data) => {
return [{data.a, data.b}]
});
}
console.log(x)
I am getting an 'undefined' value constantly
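I have found some issues in your code.
You haven't returned anything from the mapEvents function, which makes the value of your variable x undefined.
getJSONFileData needs fixing too: splitting on '\n' can leave an empty string in the array (for example after a trailing line break), and JSON.parse throws on it, so empty lines have to be filtered out first.
You can use the code below: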
const fs = require('fs');
const FILE_NAME = 'abc.txt';
const x = mapEvents(getJSONFileData(FILE_NAME));
function getJSONFileData(filename) {
return fs
.readFileSync(filename, 'utf-8')
.split('\n')
.filter(Boolean)
.map(JSON.parse);
}
function mapEvents(events) {
return JSON.stringify(events);
}
console.log(x);
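Since the question asked specifically for reduce, the same conversion can be sketched with it. This is only an illustration; parseJsonLines is a hypothetical name, and the sample text is taken from the question.

```javascript
// Hypothetical helper: turn newline-delimited JSON text into an array of
// objects using reduce, skipping blank lines.
function parseJsonLines(text) {
  return text.split(/\r?\n/).reduce((records, line) => {
    if (line.trim() !== '') {
      records.push(JSON.parse(line));
    }
    return records;
  }, []);
}

const sample = '{"a":9876312,"b":1572568981512}\n{"a":9876312,"b":1572568981542}\n';
console.log(JSON.stringify(parseJsonLines(sample)));
// [{"a":9876312,"b":1572568981512},{"a":9876312,"b":1572568981542}]
```

The resulting string can be written to a .json file with fs.writeFileSync. With millions of lines, both this sketch and the code above hold the whole file in memory, so a line-by-line stream may be the better fit if memory becomes a problem.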

Firebase Storage: string does not match format base64: invalid character found. Only when debug is off

I'm trying to upload an image file to Firebase Storage, save the download URL, and load it after the upload is completed. When I run the app with "Debug JS Remotely" turned on, it works fine. When I turn debug mode off, it stops working with the invalid format exception. The same happens when I run on a real device (both iOS and Android).
The base64 response data from React Native Image Picker seems to be correct.
Here's my code
...
import * as ImagePicker from 'react-native-image-picker'; //0.26.10
import firebase from 'firebase'; //4.9.1
...
handleImagePicker = () => {
const { me } = this.props;
const options = {
title: 'Select pic',
storageOptions: {
skipBackup: true,
path: 'images'
},
mediaType: 'photo',
quality: 0.5,
};
ImagePicker.showImagePicker(options, async (response) => {
const storageRef = firebase.storage().ref(`/profile-images/user_${me.id}.jpg`);
const metadata = {
contentType: 'image/jpeg',
};
const task = storageRef.putString(response.data, 'base64', metadata);
return new Promise((resolve, reject) => {
task.on(
'state_changed',
(snapshot) => {
var progress = (snapshot.bytesTransferred / snapshot.totalBytes) * 100;
console.log('Upload is ' + progress + '% done');
},
(error) =>
console.log(error),
() => {
this.onChangeProfileImage();
}
);
});
}
}
onChangeProfileImage = async () => {
const { me } = this.props;
const storageRef = firebase.storage().ref(`/profile-images/user_${me.id}.jpg`);
const profileImageUrl = await new Promise((resolve, reject) => {
storageRef.getDownloadURL()
.then((url) => {
resolve(url);
})
.catch((error) => {
console.log(error);
});
});
// some more logic to store profileImageUrl in the database
}
Any idea how to solve this?
Thanks in advance.
After some research and debugging I found the cause of the issue and a solution for it.
Why does it happen?
Firebase uses the atob method to decode the base64 string sent by the putString method.
However, since JavaScriptCore doesn't support atob and btoa by default, the base64 string can't be converted, so this exception is triggered.
When we run the app in "Debug JS Remotely" mode, all JavaScript code runs in the Chrome environment, where atob and btoa are supported. That's why the code works when debugging is on and fails when it's off.
How to solve?
To handle atob and btoa in React Native, we should either write our own encode/decode methods or install a library to handle it for us.
In my case I preferred to install the base-64 lib.
But here's an example of an encode/decode script:
const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';
const Base64 = {
btoa: (input:string = '') => {
let str = input;
let output = '';
for (let block = 0, charCode, i = 0, map = chars;
str.charAt(i | 0) || (map = '=', i % 1);
output += map.charAt(63 & block >> 8 - i % 1 * 8)) {
charCode = str.charCodeAt(i += 3/4);
if (charCode > 0xFF) {
throw new Error("'btoa' failed: The string to be encoded contains characters outside of the Latin1 range.");
}
block = block << 8 | charCode;
}
return output;
},
atob: (input:string = '') => {
let str = input.replace(/=+$/, '');
let output = '';
if (str.length % 4 == 1) {
throw new Error("'atob' failed: The string to be decoded is not correctly encoded.");
}
for (let bc = 0, bs = 0, buffer, i = 0;
buffer = str.charAt(i++);
~buffer && (bs = bc % 4 ? bs * 64 + buffer : buffer,
bc++ % 4) ? output += String.fromCharCode(255 & bs >> (-2 * bc & 6)) : 0
) {
buffer = chars.indexOf(buffer);
}
return output;
}
};
export default Base64;
Usage:
import Base64 from '[path to your script]';
const stringToEncode = 'xxxx';
Base64.btoa(stringToEncode);
const stringToDecode = 'xxxx';
Base64.atob(stringToDecode);
After choosing either the custom script or the lib, we must add the following code to the index.js file:
import { decode, encode } from 'base-64';
if (!global.btoa) {
global.btoa = encode;
}
if (!global.atob) {
global.atob = decode;
}
AppRegistry.registerComponent(appName, () => App);
This declares atob and btoa globally. Whenever those functions are called anywhere in the app, React Native resolves them from the global scope and runs the encode and decode methods from the base-64 lib.
So this is the solution for the Base64 issue.
However, after this was solved, I ran into another issue when trying to upload larger images: "Firebase Storage: Max retry time for operation exceed. Please try again". It seems that firebase has limited support for React Native uploads, as this issue suggests.
I believe that react-native-firebase may not struggle with this, since it is built to run natively instead of using the web environment as firebase does. I haven't tested it yet to confirm, but it looks like the best approach to handle it.
Hope this can be helpful for someone else.
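The guard logic from the index.js snippet above can be factored into a small function so that it can be tried outside React Native. This is only a sketch: installBase64 is a hypothetical name, and the Buffer-based codecs are stand-ins that assume a Node.js environment; in a React Native app the encode and decode functions from base-64 would be passed instead.

```javascript
// Hypothetical helper: install btoa/atob on a global-like object only when
// they are missing, mirroring the guards in the index.js snippet.
function installBase64(target, { encode, decode }) {
  if (!target.btoa) {
    target.btoa = encode;
  }
  if (!target.atob) {
    target.atob = decode;
  }
  return target;
}

// Stand-in codecs for running this sketch under Node.js (an assumption);
// in React Native you would pass encode/decode from the base-64 package.
const nodeCodecs = {
  encode: (text) => Buffer.from(text, 'latin1').toString('base64'),
  decode: (b64) => Buffer.from(b64, 'base64').toString('latin1'),
};

const fakeGlobal = installBase64({}, nodeCodecs);
console.log(fakeGlobal.btoa('hello')); // aGVsbG8=
console.log(fakeGlobal.atob('aGVsbG8=')); // hello
```

Because of the guards, an environment that already provides atob and btoa keeps its own implementations.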
The problem can now be solved using the fetch() API. The response it resolves to can be converted to a blob, which you can then upload to Firebase Storage.
Here is an example
let storageRef = storage().ref();
let imageName = data.name + "image";
let imagesRef = storageRef.child(`images/${imageName}`);
const response = await fetch(image);
const blob = await response.blob(); // Here is the trick
imagesRef
.put(blob)
.then((snapshot) => {
console.log("uploaded an image.");
})
.catch((err) => console.log(err));

Javascript,Nodejs: search for a specific word string in files

I'm trying to make an app that searches for all files
containing a specified string under the current directory and its subdirectories.
As I understand it, I need to create a read stream, loop over it, load the data into an array, and print the filename and directory if the word is found, or a "not found" message otherwise.
Unfortunately, I could not make it work...
Any clue?
var path = require('path'),
fs=require('fs');
function fromDir(startPath,filter,ext){
if (!fs.existsSync(startPath)){
console.log("no dir ",startPath);
return;
};
var files=fs.readdirSync(startPath);
let found = files.find((file) => {
let thisFilename = path.join(startPath, file);
let stat = fs.lstatSync(thisFilename);
var readStream = fs.createReadStream(fs);
var readline = require('readline');
if (stat.isDirectory()) {
fromDir(thisFilename, filename,readline, ext);
} else {
if (path.extname(createReadStream) === ext && path.basename(thisFilename, ext) === filename) {
return true;
}
}
});
console.log('-- your word has found on : ',filename,__dirname);
}
if (!found) {
console.log("Sorry, we didn't find your term");
}
}
fromDir('./', process.argv[3], process.argv[2]);
Because not everything was included in the question, I made an assumption:
We are looking for full words (if that's not the case, replace the regex with a simple indexOf()).
Now, I've split the code into two functions - to make it both more readable and easier to recursively find the files.
Synchronous version:
const path = require('path');
const fs = require('fs');
function searchFilesInDirectory(dir, filter, ext) {
if (!fs.existsSync(dir)) {
console.log(`Specified directory: ${dir} does not exist`);
return;
}
const files = getFilesInDirectory(dir, ext);
files.forEach(file => {
const fileContent = fs.readFileSync(file);
// We want full words, so we use full word boundary in regex.
const regex = new RegExp('\\b' + filter + '\\b');
if (regex.test(fileContent)) {
console.log(`Your word was found in file: ${file}`);
}
});
}
// Using recursion, we find every file with the desired extension, even if it's deeply nested in subfolders.
function getFilesInDirectory(dir, ext) {
if (!fs.existsSync(dir)) {
console.log(`Specified directory: ${dir} does not exist`);
return;
}
let files = [];
fs.readdirSync(dir).forEach(file => {
const filePath = path.join(dir, file);
const stat = fs.lstatSync(filePath);
// If we hit a directory, apply our function to that dir. If we hit a file, add it to the array of files.
if (stat.isDirectory()) {
const nestedFiles = getFilesInDirectory(filePath, ext);
files = files.concat(nestedFiles);
} else {
if (path.extname(file) === ext) {
files.push(filePath);
}
}
});
return files;
}
Async version - because async is cool:
const path = require('path');
const fs = require('fs');
const util = require('util');
const fsReaddir = util.promisify(fs.readdir);
const fsReadFile = util.promisify(fs.readFile);
const fsLstat = util.promisify(fs.lstat);
async function searchFilesInDirectoryAsync(dir, filter, ext) {
const found = await getFilesInDirectoryAsync(dir, ext);
for (const file of found) {
const fileContent = await fsReadFile(file);
// We want full words, so we use full word boundary in regex.
const regex = new RegExp('\\b' + filter + '\\b');
if (regex.test(fileContent)) {
console.log(`Your word was found in file: ${file}`);
}
};
}
// Using recursion, we find every file with the desired extension, even if it's deeply nested in subfolders.
async function getFilesInDirectoryAsync(dir, ext) {
let files = [];
const filesFromDirectory = await fsReaddir(dir).catch(err => {
throw new Error(err.message);
});
for (let file of filesFromDirectory) {
const filePath = path.join(dir, file);
const stat = await fsLstat(filePath);
// If we hit a directory, apply our function to that dir. If we hit a file, add it to the array of files.
if (stat.isDirectory()) {
const nestedFiles = await getFilesInDirectoryAsync(filePath, ext);
files = files.concat(nestedFiles);
} else {
if (path.extname(file) === ext) {
files.push(filePath);
}
}
};
return files;
}
If you have not worked with async/await yet, or don't fully understand it, it is well worth learning as soon as possible. Trust me, you will love not seeing those ugly callbacks again!
UPDATE:
As you pointed in comments, you want it to execute the function after running node process on the file. You also want to pass the function parameters as node's arguments.
To do that, at the end of your file, you need to add:
searchFilesInDirectory(process.argv[2], process.argv[3], process.argv[4]);
This extracts our arguments and passes them to the function.
With that, you can call our process like so (example arguments):
node yourscriptname.js ./ james .txt
Personally, if I were to write this, I would leverage the beauty of asynchronous code, and Node.js's async / await.
As a very side note:
You can easily improve readability of your code, if you add proper formatting to it. Don't get me wrong, it's not terrible - but it can be improved:
Use spaces OR newlines after commas.
Use spaces around equality operators and arithmetic operators.
As long as you are consistent with formatting, everything looks much better.
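One more note on the search code above: it passes the search term straight into new RegExp, so a term containing characters such as . or ( is interpreted as a pattern rather than as literal text. A minimal sketch of escaping the term first is shown below; buildWordRegex is a hypothetical helper name.

```javascript
// Hypothetical helper: build a whole-word regex from user input, escaping
// characters that have a special meaning in regular expressions.
function buildWordRegex(word) {
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`\\b${escaped}\\b`);
}

console.log(buildWordRegex('james').test('hello james!')); // true
console.log(buildWordRegex('james').test('jameson'));      // false
console.log(buildWordRegex('a.b').test('axb'));            // false
```

Keep in mind that \b only matches between a word character and a non-word character, so terms that begin or end with punctuation may need a different boundary check.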

JavaScript - Convert CSV to XLSX (Preferably Without Use of Library(s))

As the title says, I currently have a CSV file created from SharePoint list data and in order to display this information as a spreadsheet, I want to convert it to an Excel XLSX file. I prefer to do this without relying on a third-party library. At first, I started to use ActiveX objects to try to recreate and/or save the CSV as XLSX, but there's a limitation with that since I can't really use it in other browsers besides IE. I was thinking using Blob to somehow convert it? That's where I'm stuck.
function createCsv(data) {
var result = "";
if (data == null || data.length == 0) {
return;
}
var columnDelimiter = ',';
var lineDelimiter = '\n';
var keys = Object.keys(data[0]);
// spreadsheet header
result += keys.join(columnDelimiter);
result += lineDelimiter;
// spreadsheet data
data.forEach(function (obj) {
var count = 0;
keys.forEach(function (key) {
if (count > 0) {
result += columnDelimiter;
}
result += obj[key];
count++;
});
result += lineDelimiter;
});
return result;
}
function downloadCsv(csv) {
if (csv == null) {
return;
}
var filename = "test.csv";
csv = "data:text/csv;charset=utf-8," + csv;
var data = encodeURI(csv);
console.log(data);
var link = document.getElementById('csv');
link.setAttribute('href', data);
link.setAttribute('download', filename);
console.log(link);
//displayCsv(csv);
}
function displayCsv() {
// using test csv here
var message = "data:text/csv;charset=utf-8, yo, hey, lol";
//var fileType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
var fileType = "application/msexcel";
var csvFile = new Blob([message], {type: fileType});
var csvUrl = URL.createObjectURL(csvFile);
console.log(csvFile);
console.log(csvUrl);
}
The CSV works fine as a spreadsheet when I download it and open it in Excel, but I really need a way to display it as a spreadsheet on a webpage and not as text, which is why I'm looking to convert it. Since I'm using this within SharePoint, I can use an Excel web part to display the XLSX, but it won't open CSV files like this. Thanks in advance.
It would be quite an undertaking to do this manually without libraries. While OpenXML files are XML based at their core, they are also bundled/zipped.
I would recommend taking a look at SheetJS: https://sheetjs.com/
You can take CSV as input and write it back out immediately as XLSX.
I'm not sure this will solve your issue, but if an xls file will suffice, you can create one simply by adding a separator line as the first line of the CSV and renaming the file to .xls.
Putting quotes around the values has also proven important.
E.g.:
"sep=,"
"Service","Reported","Total","%"
"a service","23","70","32.86%"
"yet_a_service","27","70","38.57%"
"more_services","20","70","28.57%"
If you are fine with using a third-party library (which I strongly recommend, considering the complexity of the conversion), this solution will suit your needs if it has to run in Node.js.
If you want to use it in the browser, the convertCsvToExcel function needs to be modified to turn the buffer into a Blob object and then offer that Blob as an XLSX download.
// Convert a CSV string to an XLSX workbook, returned as a base64 string
// To write xls or other formats instead, see the SheetJS documentation.
import * as XLSX from 'xlsx';
export const convertCsvToExcelBuffer = (csvString: string) => {
const arrayOfArrayCsv = csvString.split("\n").map((row: string) => {
return row.split(",")
});
const wb = XLSX.utils.book_new();
const newWs = XLSX.utils.aoa_to_sheet(arrayOfArrayCsv);
XLSX.utils.book_append_sheet(wb, newWs);
const rawExcel = XLSX.write(wb, { type: 'base64' })
return rawExcel
}
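// Express request handler for sending the Excel buffer in the response.
// fileBuffer is assumed to hold the uploaded CSV as a Buffer; how it is obtained is not shown here.
export const convertCsvToExcel = async (req: express.Request, res: express.Response) => {
const csvFileTxt = fileBuffer.toString()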
const excelBuffer = convertCsvToExcelBuffer(csvFileTxt)
res.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
res.status(200).send(Buffer.from(excelBuffer, 'base64'))
}
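Note that splitting each row with row.split(",") breaks as soon as a cell contains a comma inside quotes. As a rough sketch (in plain JavaScript, with parseCsvLine as a hypothetical helper name), a small parser that honours double-quoted fields could replace that split before the rows are handed to aoa_to_sheet:

```javascript
// Hypothetical helper: split one CSV line into cells, honouring double-quoted
// fields and doubled quotes (""), which a plain split(",") does not.
function parseCsvLine(line) {
  const cells = [];
  let cell = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      cells.push(cell);
      cell = '';
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

console.log(parseCsvLine('plain,"with, comma","say ""hi"""'));
// [ 'plain', 'with, comma', 'say "hi"' ]
```

This sketch still assumes one record per line, so quoted cells that contain line breaks would need a parser that works on the whole text rather than on pre-split lines.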
