With Node.js I need to fill an Excel file with data fetched from a CSV file.
I am using the ExcelJS npm package.
I successfully read the data from the CSV file and write it to console.log(), but the problem is that it comes out in a very strange format.
Code:
var Excel = require("exceljs");

exports.generateExcel = async () => {
  let workbookNew = new Excel.Workbook();
  // read the CSV into the workbook, then walk the first worksheet
  let data = await workbookNew.csv.readFile("./utilities/file.csv");
  const worksheet = workbookNew.worksheets[0];
  worksheet.eachRow(function (row, rowNumber) {
    console.log(JSON.stringify(row.values));
  });
};
Data looks like this:
[null,"Users;1;"]
[null,"name1;2;"]
[null,"name2;3;"]
[null,"name3;4;"]
[null,"Classes;5;"]
[null,"class1;6;"]
[null,"class2;7;"]
[null,"class3;8;"]
[null,"Teachers;9;"]
[null,"teacher1;10;"]
[null,"teacher2;11;"]
[null,"Grades;12;"]
[null,"grade1;13;"]
[null,"grade2;14;"]
[null,"grade3;15;"]
The Excel file which I need to fill with this data is quite complex: in specific cells I need to insert the users, in another sheet I need some images with grades, etc.
The main question for me is:
How can I parse the data displayed in my console.log() and store it in separate variables, e.g. Users in one variable, Grades in another and Teachers in another?
Example for users:
users = {
title: "Users",
names: ["name1", "name2", "name3"],
};
It does not need to look exactly like the example, just something reusable, so that when I read different CSV files with the same structure I can easily access the specific data and put it into a specific cell of the Excel file.
Thank you very much.
I prepared an example of how you could parse your file. As proposed in one of the answers above, we use fast-csv. The parsing is quite simple: you split each row by the separator and take line[0], which is the first element.
const fs = require('fs');
const csv = require('@fast-csv/parse');

fs.createReadStream('Test_12345.csv')
  .pipe(csv.parse())
  .on('error', error => console.error(error))
  .on('data', function (row) {
    // each parsed row is joined back into a string and split on the ";" separator
    var line = String(row);
    line = line.split(';');
    console.log(`${line[0]}`);
  })
  .on('end', rowCount => console.log(`Parsed ${rowCount} rows`));
If we use an input like this:
Value1;1101;
Value2;2202;
Value3;3303;
Value4;4404;
the output in this case looks like this:
Value1
Value2
Value3
Value4
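Building on the same idea, here is a minimal sketch of how the rows could be grouped into a structure like the one asked for in the question. The section names come from the sample data; the file path and everything else are assumptions:

const fs = require('fs');
const csv = require('@fast-csv/parse');

// section headers we expect in the CSV; any other value is treated as an entry
// belonging to the most recently seen section
const sectionTitles = ['Users', 'Classes', 'Teachers', 'Grades'];
const sections = {};
let current = null;

fs.createReadStream('./utilities/file.csv')
  .pipe(csv.parse())
  .on('error', error => console.error(error))
  .on('data', row => {
    const value = String(row).split(';')[0];
    if (sectionTitles.includes(value)) {
      current = value;
      sections[current] = { title: current, names: [] };
    } else if (current) {
      sections[current].names.push(value);
    }
  })
  .on('end', () => {
    // e.g. sections.Users -> { title: 'Users', names: ['name1', 'name2', 'name3'] }
    console.log(sections);
  });

Each section then ends up in its own object, so sections.Users, sections.Teachers and sections.Grades can be read independently when filling the Excel file.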
I'm working on a kind of magical xlsx parsing where I need to parse a complex xlsx file (it contains some presentational and useless information as a header, plus a data table).
What I'm trying to do is parse the file (that part works) and figure out where the table starts. For that, I want to have a kind of dictionary as JSON.
Since the xlsx files are dynamic and are all supposed to be about product ordering, I want to find the table's header row by "blindly" looking for column names according to the JSON dictionary below.
{
  "productName": [
    "description",
    "product",
    "name",
    "product name"
  ],
  "unit": [
    "unit",
    "bundle",
    "package",
    "packaging",
    "pack"
  ]
}
Here, for example, we suppose one column header is going to be "bundle" and another "description". Thanks to my JSON, I'm supposed to be able to find the keys "productName" and "unit". It's a kind of synonym search, actually.
My questions are:
1/ Is there a clean data structure or an efficient way of doing this search? The JSON could be huge, and I want the lookup to be as fast as possible without blowing up memory.
2/ I know this is not precise, but it could work. Do you have any suggestion on how to do it, or how would you do it?
3/ The operation is going to be costly because I have to parse the xlsx file, and at the same time I have to walk the JSON to do my dictionary matching. Do you have any advice?
I suppose I should do both in a streaming way, so I only parse one row of the xlsx at a time, and do my JSON lookup on that row before jumping to the next one?
Here is the start of the parsing code.
const ExcelJS = require('exceljs');
const { FILE_EXTESTIONS } = require('../constants');
const { allowExtention } = require('../helpers');

const executeXlsxParsing = () => {
  const filePath = process.argv.slice(2)[1];
  const workbook = new ExcelJS.Workbook();
  const isAllowed = allowExtention(FILE_EXTESTIONS.xlsx, filePath);

  if (!filePath || !isAllowed) return;

  return workbook.xlsx.readFile(filePath).then(() => {
    var inboundWorksheet = workbook.getWorksheet(1);
    inboundWorksheet.eachRow({ includeEmpty: true }, (row, rowNumber) => {
      console.log('Row ' + rowNumber + ' = ' + JSON.stringify(row.values));
    });
  });
};

module.exports.executeXlsxParsing = executeXlsxParsing();
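For what it's worth, a minimal sketch of one way the synonym lookup could be structured on top of this: invert the dictionary into a Map from synonym to canonical key, so each cell of a row becomes a single lookup. The dictionary shape is taken from the question; the function names are my own assumptions:

// Invert { productName: ['description', ...], unit: ['bundle', ...] } into a
// Map like 'description' -> 'productName', so matching a cell is O(1).
const buildSynonymIndex = (dictionary) => {
  const index = new Map();
  for (const [canonical, synonyms] of Object.entries(dictionary)) {
    for (const synonym of synonyms) {
      index.set(synonym.toLowerCase().trim(), canonical);
    }
  }
  return index;
};

// Returns the canonical keys matched in one worksheet row, e.g.
// ['productName', 'unit'] if the row contains "description" and "bundle".
const matchHeaderRow = (row, index) => {
  const matches = [];
  row.eachCell((cell) => {
    const canonical = index.get(String(cell.value).toLowerCase().trim());
    if (canonical) matches.push(canonical);
  });
  return matches;
};

Inside eachRow, the first row where matchHeaderRow returns enough matches could then be treated as the table header.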
Thank you all :)
I am trying to parse a CSV file in Node.js. I am able to parse the CSV file and print the contents, which come out as JSON objects. Now my target is to iterate over the JSON, take specific keys and values out of each block, and use them in a query which will do some DB operations. But the problem is that while I am trying to iterate the JSON, only the first key and values of the first block are printed. Let me post the code of what I have done:
fs.createReadStream(path)
  .pipe(csv.parse({ headers: true, ignoreEmpty: true }))
  .on("error", (error) => {
    throw error.message;
  })
  .on("data", function (data) {
    if (data && Object.keys(data).length > 0) {
      Object.keys(data).forEach(function (k) {
        if (k === 'name' || k === 'Office') {
          let selectQury = `select name,Office from myTable where name = ${data['name']} and Office = ${data['Office']}`;
          db.query(selectQury, (err, res) => {
            if (err) {
              console.log('error', null);
            }
          });
        }
      });
    }
  });
This is what the JSON I parse from the CSV looks like:
{
id:1,
name:"AS",
Office:"NJ"
........
ACTIVE: 1.
},
{
id:2,
name:"AKJS",
Office:"NK"
........
ACTIVE: 2.
}
So now what I want is for the parameters in the select query to be passed like this:
let selectQury = `select name,Office from myTable where name = "AS" and Office = "NJ"`;
in the first iteration,
let selectQury = `select name,Office from myTable where name = "AKJS" and Office = "NK"`;
in the second iteration, and so on as the CSV grows.
I am not able to do it, please help. Thanks in advance; I am new to Node.js and tricky JavaScript operations.
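A minimal sketch of one way this could look, assuming a mysql-style db.query(sql, values, callback) that supports "?" placeholders (the table and column names come from the question; everything else is an assumption):

fs.createReadStream(path)
  .pipe(csv.parse({ headers: true, ignoreEmpty: true }))
  .on("error", (error) => console.error(error))
  .on("data", (row) => {
    // row is one parsed record, e.g. { id: '1', name: 'AS', Office: 'NJ', ... }
    const selectQuery = "select name,Office from myTable where name = ? and Office = ?";
    db.query(selectQuery, [row.name, row.Office], (err, res) => {
      if (err) {
        console.error(err);
        return;
      }
      // use res for the DB result of this row
    });
  })
  .on("end", (rowCount) => console.log(`Parsed ${rowCount} rows`));

Using placeholders instead of interpolating the values into the query string also avoids quoting problems and SQL injection.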
So I have a text file test.txt with lines similar to:
08/12/2021
test1
test2
test3
... (some entries)
12/12/2021
test21
test22
test23
... (some entries)
24/12/2021
What should I write next in order to filter the text file and get the lines between the two newest dates?
const fs = require('fs');

fs.watchFile('test.txt', (eventType, filename) => {
  fs.readFile('test.txt', 'utf-8', (err, data) => {
    const arr = data.toString().replace(/\r\n/g, '\n').split('\n');
    ...
The output will be something such as:
test21
test22
test23
... (some entries)
These are the entries between the two newest dates.
Update:
The text file is actually constantly being written to, and the current date is appended at the end of the day. I am now trying to extract the entries between the previous and the newest date for further processing.
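A minimal sketch of one way to pick those lines out of the arr built above, assuming the date lines look like DD/MM/YYYY and everything in between is an entry:

const dateRegex = /^\d{2}\/\d{2}\/\d{4}$/;

// indexes of all lines that are dates
const dateIndexes = arr
  .map((line, i) => (dateRegex.test(line.trim()) ? i : -1))
  .filter((i) => i !== -1);

// everything strictly between the last two date lines
const [secondNewest, newest] = dateIndexes.slice(-2);
const entries = arr.slice(secondNewest + 1, newest);
console.log(entries); // e.g. [ 'test21', 'test22', 'test23', ... ]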
I don't know much about JavaScript, but it seems you are looking for something like this one:
Is there any way to find data between two dates which are present in the same string and store it to a JSON object
You can do one thing: find the indexOf the start date and of the end date, then you can slice the contents. You can do something like this:
try {
  const data = fs.readFileSync('test.txt', { encoding: 'utf8', flag: 'r' }),
    start = data.indexOf('start date'),
    end = data.lastIndexOf('end date');
  const trimText = data.slice(start, end);
} catch (e) {
  console.log(e);
}
This method will work well for a small file. If the file is large, we need to read it asynchronously and check for the start and end dates while reading it.
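For the large-file case, a minimal sketch of an asynchronous, line-by-line variant using Node's readline module (the file name and the two date strings are placeholders):

const fs = require('fs');
const readline = require('readline');

async function linesBetween(file, startDate, endDate) {
  const rl = readline.createInterface({
    input: fs.createReadStream(file, 'utf8'),
    crlfDelay: Infinity,
  });

  const collected = [];
  let inside = false;
  for await (const line of rl) {
    if (line.trim() === endDate) break;           // stop once the end date shows up
    if (inside) collected.push(line);             // keep lines after the start date
    if (line.trim() === startDate) inside = true;
  }
  return collected;
}

linesBetween('test.txt', '12/12/2021', '24/12/2021').then(console.log);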
// Split on the date lines and drop the chunk before the first date and the
// chunk after the last date, keeping only the entries in between.
const test_data = `08/12/2021
test0
test2
test3
12/12/2021`;

console.log(
  test_data.split(/\d+\/\d+\/\d+\n?/g).slice(1, -1)
);
stream-json noob here. I'm wondering why the below code is running out of memory.
context
I have a large JSON file. The JSON file's structure is something like this:
[
  {"id": 1, "avg_rating": 2},
  {"id": 1, "name": "Apple"}
]
I want to modify it to be
[
  {"id": 1, "avg_rating": 2, "name": "Apple"}
]
In other words, I want to run a reducer function on each element of the values array of the JSON (Object.values(data)) to check whether the same id shows up in different entries of the JSON, and if so "merge" them into one entry.
The code I wrote to do this is:
var chunk = [{ 'id': 1, 'name': 'a' }, { 'id': 1, 'avg_rating': 2 }];

const result = Object.values(chunk.reduce((j, c) => {
  if (j[c.id]) {
    j[c.id]['avg_rating'] = c.avg_rating;
  } else {
    j[c.id] = { ...c };
  }
  return j;
}, {}));

console.log(result);
The thing is, you cannot run this on a large JSON file without running out of memory. So I need to use JSON streaming.
the streaming code
Looking at the stream-json documentation, I think I need to use a Parser to take in text as a Readable stream of objects and output a stream of data items as Writeable buffer/text "things".
The code I can write to do that is:
const { chain } = require('stream-chain');
const { parser } = require('stream-json/Parser');
const { streamValues } = require('stream-json/streamers/StreamValues');
const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('test.json'),
  parser(),
  streamValues(),
  data => {
    var chunk = data.value;
    const result = Object.values(chunk.reduce((j, c) => {
      if (j[c.id]) {
        j[c.id]['avg_rating'] = c.avg_rating;
      } else {
        j[c.id] = { ...c };
      }
      return j;
    }, {}));
    //console.log(result)
    return JSON.stringify(result);
  },
  fs.createWriteStream(fpath)
]);
To create a write stream (since I do want an output JSON file), I just added fs.createWriteStream(filepath) at the end of the pipeline above, but it looks like, while this works on a small sample, it doesn't work for a large JSON file: I get the error "heap out of memory".
attempts to fix
I think the main issue with the code is that the "chunk" philosophy is wrong. If this works via "streaming" the JSON line by line (?), then "chunk" might be trying to hold all the data the program has run into so far, whereas I really only want it to run the reducer function in batches. That puts me kind of back at square one: how would I merge the key-value pairs of a JSON if the id is the same?
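For reference, a minimal sketch of how a per-id merge could look in a streaming pipeline, assuming the top level of the file is one big JSON array: it uses streamArray instead of streamValues, keeps only one merged object per id in a Map, and writes the result once at the end. The file names are placeholders:

const fs = require('fs');
const { chain } = require('stream-chain');
const { parser } = require('stream-json/Parser');
const { streamArray } = require('stream-json/streamers/StreamArray');

// one merged object per id; this grows with the number of distinct ids,
// but never holds the raw file or duplicate entries in memory
const byId = new Map();

const pipeline = chain([
  fs.createReadStream('test.json'),
  parser(),
  streamArray(),                    // emits { key, value } per array element
]);

pipeline.on('data', ({ value }) => {
  const existing = byId.get(value.id) || {};
  byId.set(value.id, { ...existing, ...value });
});

pipeline.on('end', () => {
  fs.writeFileSync('edited.json', JSON.stringify([...byId.values()]));
});

Whether this is enough depends on how many distinct ids there are; if that set is itself too large to hold, the input would need to be sorted or sharded by id first.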
If the custom data code isn't the problem, then I get the feeling I need to use a Stringer, since I want to edit the stream with custom code and save it back to a file.
However, I can't seem to work out how Stringer reads data, as the code below throws an error where data is undefined:
const pipeline = chain([
  fs.createReadStream('test.json'),
  parser(),
  data => {
    var chunk = data.value;
    const result = Object.values(chunk.reduce((j, c) => {
      if (j[c.id]) {
        j[c.id]['avg_rating'] = c.avg_rating;
      } else {
        j[c.id] = { ...c };
      }
      return j;
    }, {}));
    console.log(result);
    return JSON.stringify(result);
  },
  stringer(),
  zlib.createGzip(),
  fs.createWriteStream('edited.json.gz')
]);
I would greatly appreciate any advice on this situation or any help diagnosing the problems in my approach.
Thank you!!
While this is certainly an interesting question, I have the liberty to just restructure how the data is scraped, and as such can bypass having to do this at all.
Thanks all!
I have an array of text lines in JavaScript inside a variable named "dataArray":
[
'freq[Hz];re:Trc1_S11;im:Trc1_S11;re:Trc2_S21;im:Trc2_S21;',
'2.400000000000000E+009;1.548880785703659E-001;1.067966520786285E-001;1.141964457929134E-003;5.855074618011713E-003;',
'2.400166666666667E+009;1.546109169721603E-001;1.043454632163048E-001;1.287244027480483E-003;5.807569250464439E-003;',
'2.400333333333334E+009;1.546102017164230E-001;1.018797382712364E-001;1.497663557529450E-003;5.986513104289770E-003;',
'2.400500000000000E+009;1.545133888721466E-001;9.928287565708160E-002;1.647840370424092E-003;5.912321619689465E-003;',
'2.400666666666667E+009;1.544111520051956E-001;9.671460092067719E-002;1.589289400726557E-003;5.917594302445650E-003;',
...
]
The first line contains headers, and the other lines contain data.
I need to write this into a .csv file and store that file. How can I do this (I'm using Node.js)?
First I convert the array to valid CSV data; for that I replace all ; with ,.
Then I join all entries together with newlines (csv.join("\r\n")) and write the result to a file.
const fs = require("fs");
const data = [
'freq[Hz];re:Trc1_S11;im:Trc1_S11;re:Trc2_S21;im:Trc2_S21;',
'2.400000000000000E+009;1.548880785703659E-001;1.067966520786285E-001;1.141964457929134E-003;5.855074618011713E-003;',
'2.400166666666667E+009;1.546109169721603E-001;1.043454632163048E-001;1.287244027480483E-003;5.807569250464439E-003;',
'2.400333333333334E+009;1.546102017164230E-001;1.018797382712364E-001;1.497663557529450E-003;5.986513104289770E-003;',
'2.400500000000000E+009;1.545133888721466E-001;9.928287565708160E-002;1.647840370424092E-003;5.912321619689465E-003;',
'2.400666666666667E+009;1.544111520051956E-001;9.671460092067719E-002;1.589289400726557E-003;5.917594302445650E-003;'
];
const csv = data.map((e) => {
  return e.replace(/;/g, ",");
});

fs.writeFile("./data.csv", csv.join("\r\n"), (err) => {
  console.log(err || "done");
});
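One small caveat with the plain replace: each source line ends with a trailing ";", so the converted rows end with a trailing "," (an extra empty column). A slightly stricter variant of my own, not part of the original answer, strips the trailing separator first:

const csvRows = data.map((line) =>
  line.replace(/;$/, "").split(";").join(",")
);

fs.writeFileSync("./data.csv", csvRows.join("\r\n"));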