npm package csvtojson CSV Parse Error: Error: unclosed_quote - javascript

Node version: v10.19.0
Npm version: 6.13.4
Npm package: csvtojson
csvtojson({
"delimiter": ";",
"fork": true
})
.fromStream(fileReadStream)
.subscribe((dataObj) => {
console.log(dataObj);
}, (err) => {
console.error(err);
}, (success) => {
console.log(success);
});
While processing a large CSV file (about 1.3 million records), I get the error "CSV Parse Error: Error: unclosed_quote." after some records (e.g. 400+) have been processed successfully. I don't see any formatting problems in the CSV data, but the parser might be raising this error because a "\n" character appears inside a column/field value.
Is there a solution already available in this package, or
is there a workaround to handle this error? Alternatively,
is there a way to skip CSV rows that have any sort of error (not only this one), so that
the whole CSV-to-JSON conversion completes without getting stuck?
Any help will be much appreciated.

I've played about with this, and it's possible to hook into the parsing using a CSV File Line Hook (csv-file-line-hook, registered with preFileLine). In the hook you can check for invalid lines and either repair or simply invalidate them.
The example below simply skips the invalid lines (those with missing end quotes).
example.js
const fs = require("fs");
let fileReadStream = fs.createReadStream("test.csv");
let invalidLineCount = 0;
const csvtojson = require("csvtojson");
csvtojson({ "delimiter": ";", "fork": true })
.preFileLine((fileLineString, lineIdx)=> {
let invalidLinePattern = /^['"].*[^"'];/;
if (invalidLinePattern.test(fileLineString)) {
console.log(`Line #${lineIdx + 1} is invalid, skipping:`, fileLineString);
fileLineString = "";
invalidLineCount++;
}
return fileLineString
})
.fromStream(fileReadStream)
.subscribe((dataObj) => {
console.log(dataObj);
},
(err) => {
console.error("Error:", err);
},
(success) => {
console.log("Skipped lines:", invalidLineCount);
console.log("Success");
});
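To see which lines a skip rule would catch before running the full conversion, a check can be tried in isolation. The following is only a minimal sketch: hasBalancedQuotes is a hypothetical helper, not part of csvtojson, and it assumes that '"' is the quote character. A field that legitimately contains a line break spans several physical lines, and each of those would look unbalanced to this check.

```javascript
// Hypothetical helper (not part of csvtojson): returns true when the line
// contains an even number of quote characters.
function hasBalancedQuotes(line, quote = '"') {
  let count = 0;
  for (const ch of line) {
    if (ch === quote) count++;
  }
  return count % 2 === 0;
}

// Sample lines in the same semicolon-delimited format as the question.
const sampleLines = [
  'Bob;34;"Sales,Marketing"',
  'James;45;Driver',
  '"Billy, ;35;Manager',
];

for (const line of sampleLines) {
  console.log(hasBalancedQuotes(line) ? 'ok     ' : 'suspect', line);
}
```

A check like this could be called from inside the preFileLine hook in place of the regular expression.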
test.csv
Name;Age;Profession
Bob;34;"Sales,Marketing"
Sarah;31;"Software Engineer"
James;45;Driver
"Billy, ;35;Manager
"Timothy;23;"QA

This regex works better:
/^(?:[^"\\]|\\.|"(?:\\.|[^"\\])*")*$/g
Here is a more complex working script for big files that reads and converts one line at a time:
import csv from 'csvtojson'
import fs from 'fs-extra'
import lineReader from 'line-reader'
import { __dirname } from '../../../utils.js'
const CSV2JSON = async(dumb, editDumb, headers, {
options = {
trim: true,
delimiter: '|',
quote: '"',
escape: '"',
fork: true,
headers: headers
}
} = {}) => {
try {
log(`\n\nStarting CSV2JSON - Current directory: ${__dirname()} - Please wait..`)
await new Promise((resolve, reject) => {
let firstLine, counter = 0
lineReader.eachLine(dumb, async(line, last) => {
counter++
// log(`line before convert: ${line}`)
let json = (
await csv(options).fromString(headers + '\n\r' + line)
.preFileLine((fileLineString, lineIdx) => {
// only validate lines after the first one
// eslint-disable-next-line max-len
if (counter !== 1 && !fileLineString.match(/^(?:[^"\\]|\\.|"(?:\\.|[^"\\])*")*$/g)) {
// eslint-disable-next-line max-len
console.log(`Line #${lineIdx + 1} is invalid. It has unescaped quotes. We will skip this line.. Invalid Line: ${fileLineString}`)
fileLineString = ''
}
return fileLineString
})
.on('error', e => {
e = `Error while converting CSV to JSON.
Line before convert: ${line}
Error: ${e}`
throw new BaseError(e)
})
)[0]
// log(`line after convert: ${json}`)
if (json) {
json = JSON.stringify(json).replace(/\\"/g, '')
if (json.match(/^(?:[^"\\]|\\.|"(?:\\.|[^"\\])*")*$/g)) {
await fs.appendFile(editDumb, json)
}
}
if (last) {
resolve()
}
})
})
} catch (e) {
throw new BaseError(`Error while converting CSV to JSON - Error: ${e}`)
}
}
export { CSV2JSON }

Related

Where is memory leak in my JavaScript (NodeJS) code?

I have a simple script to handle a 10GB CSV file. The idea is pretty simple:
Open file as stream.
Parse CSV objects from it.
Modify objects.
Make output stream to new file.
I wrote the following code, but it causes a memory leak. I have tried a lot of different things, but nothing helps. The memory leak disappears if I remove the transformer from the pipes, so maybe the transformer causes it.
I run the code under NodeJS.
Can you help me find where I am wrong?
'use strict';
import fs from 'node:fs';
import {parse, transform, stringify} from 'csv';
import lineByLine from 'n-readlines';
// big input file
const inputFile = './input-data.csv';
// read headers first
const linesReader = new lineByLine(inputFile);
const firstLine = linesReader.next();
linesReader.close();
const headers = firstLine.toString()
.split(',')
.map(header => {
return header
.replace(/^"/, '')
.replace(/"$/, '')
.replace(/\s+/g, '_')
.replace('(', '_')
.replace(')', '_')
.replace('.', '_')
.replace(/_+$/, '');
});
// file stream
const fileStream1 = fs.createReadStream(inputFile);
// parser stream
const parserStream1 = parse({delimiter: ',', cast: true, columns: headers, from_line: 1});
// transformer
const transformer = transform(function(record) {
return Object.assign({}, record, {
SomeField: 'BlaBlaBla',
});
});
// stringifier stream
const stringifier = stringify({delimiter: ','});
console.log('Loading data...');
// chain of pipes
fileStream1.on('error', err => { console.log(err); })
.pipe(parserStream1).on('error', err => {console.log(err); })
.pipe(transformer).on('error', err => { console.log(err); })
.pipe(stringifier).on('error', err => { console.log(err); })
.pipe(fs.createWriteStream('./_data/new-data.csv')).on('error', err => { console.log(err); })
.on('finish', () => {
console.log('Loading data finished!');
});

A command to add data into another json file

I have a JSON file, test.json, with information like this:
{
"id": ["1234"]
}
Now I want to add another id, 2345, using discord.js. I am saving user ids in a JSON file, and I want to make a command that pushes more ids into the test.json file.
For example, with that command I can add another user id "2345" so that the JSON file will look like this:
{
"id": ["1234", "2345"]
}
Please help me with this.
There are several options below, all of which will have to be worked further to achieve the exact result you need but this will get you pointed in the right direction.
const fs = require('fs');
const cName = './path_to_json_from_bot_root_folder.json';
function requireUncached(module) {
delete require.cache[require.resolve(module)];
return require(module);
}
let file;
setInterval(() => {
file = requireUncached('../../path_to_json_from_this_file.json');
}, 500);
function prep(file) {
const myJson = JSON.stringify(file);
const JsonCut = myJson.slice(0, myJson.length - 1);
return JsonCut;
}
// Option 1
fs.writeFile(cName, 'information to input will completely overwrite the file', { format: 'json', }, (err) => {
if (err) throw err;
});
// Option 2
fs.writeFile(cName, prep(file) + 'add new information here will have to work through it to make sure that you have all the closing brackets' + '}', { format: 'json', }, (err) => {
if (err) throw err;
});
// Option 3
const oldInfo = file['subsectionName']
const newInfo = message.content // or whatever method you choose like args if you make it a command
fs.writeFile(cName, prep(file) + JSON.stringify(file['subsectionName']).trim().replace(`${oldInfo}`, `${newInfo}`) + '}', { format: 'json', }, (err) => {
if (err) throw err;
});
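For the specific case in the question (pushing one more id into the "id" array), it may be simpler to parse the file, change the object, and write it back than to splice strings. This is a minimal sketch, not the answer author's method: addId is a hypothetical helper, the demo file path is made up, and the synchronous fs calls are used only to keep the example short.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: appends userId to the "id" array stored in the JSON
// file at filePath, creating the array if it is missing and skipping
// ids that are already present.
function addId(filePath, userId) {
  const data = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  if (!Array.isArray(data.id)) data.id = [];
  if (!data.id.includes(userId)) data.id.push(userId);
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2));
  return data;
}

// Demo against a throwaway copy of test.json in the temp directory.
const demoFile = path.join(os.tmpdir(), 'test-ids-demo.json');
fs.writeFileSync(demoFile, JSON.stringify({ id: ['1234'] }));
console.log(JSON.stringify(addId(demoFile, '2345'))); // {"id":["1234","2345"]}
```

In a command handler the call might look like addId('./test.json', args[0]), where args is whatever your command parser provides.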

Unexpected token '?' using, Discord.js Slash Command handler [duplicate]

I'm not sure what's wrong. I deleted my code and downloaded it then uploaded it again and now I get this error.
Code: https://replit.com/#hi12167pies/webcord#index.js (Click code for code and output for output)
Error:
/home/runner/C8AU9ceLyjc/node_modules/discord.js/src/rest/RESTManager.js:32
const token = this.client.token ?? this.client.accessToken;
^
SyntaxError: Unexpected token '?'
I have no idea whats wrong since it's in the node_modules folder.
If you have problems viewing it here is the code:
const http = require("http")
const discord = require("discord.js")
const client = new discord.Client()
const config = require("./config.json")
const fs = require("fs")
// const readLine = require("readline")
// const rl = readLine.createInterface({
// input: process.stdin,
// output: process.stdout
// })
let msgs = {
"873195510251532348": [],
"873195522633105429": []
}
client.on("ready", () => {
console.log("ready discord")
})
client.on("message", (message) => {
if (message.author.bot) return
if (!config.chats.includes(message.channel.id.toString())) return
msgs[message.channel.id].push({
"username": message.author.tag,
"content": message.content,
"type": "0"
})
})
http.createServer((req,res) => {
const url = req.url.split("?")[0]
let query = {}
req.url.slice(req.url.split("").indexOf("?")).slice(1).split("&").forEach((e) => {
const splited = e.split("=")
query[splited[0]] = splited[1]
})
if (query.q == "messages") {
let msg = []
let i = 0
while (msgs[query.code].length > i) {
const e = msgs[query.code][msgs[query.code].length - (i+1)]
msg.push(e)
i++
}
res.write(JSON.stringify(msg))
res.end()
} else if (query.q == "post") {
let name = query.name.split("%20").join(" ")
let content = query.content.split("%20").join(" ")
client.channels.cache.get(query.code).send(`**${name}**: ${content}`)
msgs[query.code].push({
"username": name,
"content": content,
"type": "1"
})
res.end()
} else if (url == "/robot" && query.istrue == "true") {
res.write("Robot!")
res.end()
} else {
let path
if (!query.code) {
path = "./code.html"
} else {
if (!config.chats.includes(query.code)) {
path = "./invaildcode.html"
} else {
path = "./chat.html"
}
}
fs.readFile(path, (er, da) => {
if (er) res.write("Could not get index.html")
res.write(da)
res.end()
})
}
}).listen(80, (err) => {
if (err) throw err
console.log("listening webserver")
})
client.login(process.env.TOKEN)
I am aware my code is not good right now, I am rewriting it but I still want to know what the error is.
repl.it uses Node v12.22.1, but the nullish coalescing operator (??) is relatively new and was added in Node v14.
So to use the ?? operator you need to update Node in repl.it,
which you can do by following this repl.it forum post by lukenzy:
Create a file and name it .replit
Inside it, copy and paste the following code:
run = """
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.34.0/install.sh | bash
export NVM_DIR=\"$HOME/.nvm\"
[ -s \"$NVM_DIR/nvm.sh\" ] && \\. \"$NVM_DIR/nvm.sh\"
[ -s \"$NVM_DIR/bash_completion\" ] && \\. \"$NVM_DIR/bash_completion\"
nvm install 14
node index.js
"""
This will install and use the latest Node.js v14 (14.17.4).
If you want to use a different version, change nvm install 14 to any other number.
Also, change node index.js to the file you want to run.
You are getting this error because you are using an older version of Node that does not support the nullish coalescing operator (??) used by some packages.
Simply switch to a newer Node version.
You can change Node versions using 'nvm'; follow this git repo: https://github.com/nvm-sh/nvm
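Because the SyntaxError is raised while the package file is being parsed, a version check only helps if it runs before the package is required. The following is a small sketch of such a guard; nodeMajor and supportsNullishCoalescing are hypothetical helper names, and the code deliberately avoids ?? so that it also runs on old Node versions.

```javascript
// Hypothetical helper: extracts the major version number from a string
// such as "v12.22.1" or "14.17.4".
function nodeMajor(version) {
  return Number(String(version).replace(/^v/, '').split('.')[0]);
}

// Node 14 is the first release line that parses the ?? operator.
function supportsNullishCoalescing(version = process.version) {
  return nodeMajor(version) >= 14;
}

if (!supportsNullishCoalescing()) {
  console.error(`Node ${process.version} is too old; please use Node 14 or newer.`);
}
```

Placed at the top of index.js, before require("discord.js"), this prints a readable message instead of leaving only the parser error.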

Converting XML data to JSON on ReactJs

I am working on a healthcare project and I have to import some medicine data from another server. The other server's data is XML, and I want to convert it to JSON for use in my API.
fetch ("http://api.com/rest/api/something&q=")
.then(response => response.text())
.then((response) => {
parseString(responseText, function (err, result) {
console.log(response)
});
}).catch((err) => {
console.log('fetch', err)
})
},
I get this error:
fetch ReferenceError: parseString is not defined
I am using ReactJs, so can someone please help me find the correct way to convert XML to JSON?
Install these two npm libraries:
1. npm install react-native-xml2js
2. npm install --save xml-js
Step 1:
import axios from "axios";
const parseString = require('react-native-xml2js').parseString; //step-1 here
axios.get('your api url here',
{
headers:{Authorization:this.state.token}
}).then(response => {
parseString(response.data, function (err, result) {
console.log("response----"+response.data)
//step--2 here
var convert = require('xml-js');
var xml = response.data
var result1 = convert.xml2json(xml, {compact: true, spaces: 4});
var result2 = convert.xml2json(xml, {compact: false, spaces: 4});
console.log("result1----"+result1);
console.log("result2----"+result2);
//step--2 end here
});
}).catch((err) => {
console.log('fetch', err)
})
Use whichever result suits you.
This worked for me in my React Native project.

Replace a string in a file with nodejs

I use the md5 grunt task to generate MD5 filenames. Now I want to rename the sources in the HTML file with the new filename in the callback of the task. I wonder what's the easiest way to do this.
You could use simple regex:
var result = fileAsString.replace(/string to be replaced/g, 'replacement');
So...
var fs = require('fs')
fs.readFile(someFile, 'utf8', function (err,data) {
if (err) {
return console.log(err);
}
var result = data.replace(/string to be replaced/g, 'replacement');
fs.writeFile(someFile, result, 'utf8', function (err) {
if (err) return console.log(err);
});
});
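If the text to replace comes from a variable and may contain characters that are special in regular expressions (such as . or $), it has to be escaped before it is turned into a RegExp. A minimal sketch, where escapeRegExp and replaceAllLiteral are hypothetical helper names:

```javascript
// Escapes every character that has a special meaning in a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replaces every literal occurrence of search in source.
function replaceAllLiteral(source, search, replacement) {
  // A function replacement inserts the value verbatim, so "$" sequences
  // in the replacement text are not interpreted.
  return source.replace(new RegExp(escapeRegExp(search), 'g'), () => replacement);
}

console.log(replaceAllLiteral('a.b a.b axb', 'a.b', 'X')); // X X axb
```

Without the escaping step, the pattern a.b would also match axb, because an unescaped dot matches any character.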
Since replace wasn't working for me, I've created a simple npm package replace-in-file to quickly replace text in one or more files. It's partially based on @asgoth's answer.
Edit (3 October 2016): The package now supports promises and globs, and the usage instructions have been updated to reflect this.
Edit (16 March 2018): The package has amassed over 100k monthly downloads now and has been extended with additional features as well as a CLI tool.
Install:
npm install replace-in-file
Require module
const replace = require('replace-in-file');
Specify replacement options
const options = {
//Single file
files: 'path/to/file',
//Multiple files
files: [
'path/to/file',
'path/to/other/file',
],
//Glob(s)
files: [
'path/to/files/*.html',
'another/**/*.path',
],
//Replacement to make (string or regex)
from: /Find me/g,
to: 'Replacement',
};
Asynchronous replacement with promises:
replace(options)
.then(changedFiles => {
console.log('Modified files:', changedFiles.join(', '));
})
.catch(error => {
console.error('Error occurred:', error);
});
Asynchronous replacement with callback:
replace(options, (error, changedFiles) => {
if (error) {
return console.error('Error occurred:', error);
}
console.log('Modified files:', changedFiles.join(', '));
});
Synchronous replacement:
try {
let changedFiles = replace.sync(options);
console.log('Modified files:', changedFiles.join(', '));
}
catch (error) {
console.error('Error occurred:', error);
}
Perhaps the "replace" module (www.npmjs.org/package/replace) also would work for you. It would not require you to read and then write the file.
Adapted from the documentation:
// install:
npm install replace
// require:
var replace = require("replace");
// use:
replace({
regex: "string to be replaced",
replacement: "replacement string",
paths: ['path/to/your/file'],
recursive: true,
silent: true,
});
You can also use the 'sed' function that's part of ShellJS ...
$ npm install [-g] shelljs
require('shelljs/global');
sed('-i', 'search_pattern', 'replace_pattern', file);
Full documentation ...
ShellJS - sed()
ShellJS
If someone wants to use promise based 'fs' module for the task.
const fs = require('fs').promises;
// Below statements must be wrapped inside the 'async' function:
const data = await fs.readFile(someFile, 'utf8');
const result = data.replace(/string to be replaced/g, 'replacement');
await fs.writeFile(someFile, result,'utf8');
You could process the file while being read by using streams. It's just like using buffers but with a more convenient API.
var fs = require('fs');
function searchReplaceFile(regexpFind, replace, cssFileName) {
var file = fs.createReadStream(cssFileName, 'utf8');
var newCss = '';
file.on('data', function (chunk) {
newCss += chunk.toString().replace(regexpFind, replace);
});
file.on('end', function () {
fs.writeFile(cssFileName, newCss, function(err) {
if (err) {
return console.log(err);
} else {
console.log('Updated!');
}
});
});
}
searchReplaceFile(/foo/g, 'bar', 'file.txt');
On Linux or Mac, keep it simple and just use sed with the shell. No external libraries required. The following code works on Linux.
const shell = require('child_process').execSync
shell(`sed -i "s!oldString!newString!g" ./yourFile.js`)
The sed syntax is a little different on Mac. I can't test it right now, but I believe you just need to add an empty string after the "-i":
const shell = require('child_process').execSync
shell(`sed -i "" "s!oldString!newString!g" ./yourFile.js`)
The "g" after the final "!" makes sed replace all instances on a line. Remove it, and only the first occurrence per line will be replaced.
Expanding on @Sanbor's answer, the most efficient way to do this is to read the original file as a stream, stream each chunk into a new file, and then replace the original file with the new file.
const fs = require('fs');
async function findAndReplaceFile(regexFindPattern, replaceValue, originalFile) {
const updatedFile = `${originalFile}.updated`;
return new Promise((resolve, reject) => {
const readStream = fs.createReadStream(originalFile, { encoding: 'utf8', autoClose: true });
const writeStream = fs.createWriteStream(updatedFile, { encoding: 'utf8', autoClose: true });
// For each chunk, do the find & replace, and write it to the new file stream
readStream.on('data', (chunk) => {
chunk = chunk.toString().replace(regexFindPattern, replaceValue);
writeStream.write(chunk);
});
// Once we've finished reading the original file...
readStream.on('end', () => {
writeStream.end(); // emits 'finish' event, executes below statement
});
// Replace the original file with the updated file
writeStream.on('finish', async () => {
try {
await _renameFile(updatedFile, originalFile);
resolve();
} catch (error) {
reject(`Error: Error renaming ${updatedFile} to ${originalFile} => ${error.message}`);
}
});
readStream.on('error', (error) => reject(`Error: Error reading ${originalFile} => ${error.message}`));
writeStream.on('error', (error) => reject(`Error: Error writing to ${updatedFile} => ${error.message}`));
});
}
async function _renameFile(oldPath, newPath) {
return new Promise((resolve, reject) => {
fs.rename(oldPath, newPath, (error) => {
if (error) {
reject(error);
} else {
resolve();
}
});
});
}
// Testing it...
(async () => {
try {
await findAndReplaceFile(/"some regex"/g, "someReplaceValue", "someFilePath");
} catch(error) {
console.log(error);
}
})()
I ran into issues when replacing a small placeholder with a large string of code.
I was doing:
var replaced = original.replace('PLACEHOLDER', largeStringVar);
I figured out the problem was JavaScript's special replacement patterns, described here. Since the code I was using as the replacing string had some $ in it, it was messing up the output.
My solution was to use the function replacement option, which DOES NOT do any special replacement:
var replaced = original.replace('PLACEHOLDER', function() {
return largeStringVar;
});
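The difference is easy to reproduce with a short string. In this sketch the sample values are made up; "$&" is one of the special replacement patterns and stands for the matched text.

```javascript
const original = 'before PLACEHOLDER after';
const largeStringVar = 'cost $& total';

// String replacement: "$&" is expanded to the matched text.
const naive = original.replace('PLACEHOLDER', largeStringVar);

// Function replacement: the return value is inserted verbatim.
const safe = original.replace('PLACEHOLDER', () => largeStringVar);

console.log(naive); // before cost PLACEHOLDER total after
console.log(safe);  // before cost $& total after
```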
ES2017/8 for Node 7.6+ with a temporary write file for atomic replacement.
const Promise = require('bluebird')
const fs = Promise.promisifyAll(require('fs'))
async function replaceRegexInFile(file, search, replace){
let contents = await fs.readFileAsync(file, 'utf8')
let replaced_contents = contents.replace(search, replace)
let tmpfile = `${file}.jstmpreplace`
await fs.writeFileAsync(tmpfile, replaced_contents, 'utf8')
await fs.renameAsync(tmpfile, file)
return true
}
Note, only for smallish files as they will be read into memory.
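The same write-to-a-temporary-file-then-rename idea can be sketched with only the built-in fs module. This is an illustration under assumptions, not a drop-in replacement: replaceInFileSync is a hypothetical helper, the demo path is made up, and the synchronous calls are used to keep the example short. Like the version above, it reads the whole file into memory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes the replaced contents to a temporary file
// next to the original, then renames it over the original.
function replaceInFileSync(file, search, replacement) {
  const contents = fs.readFileSync(file, 'utf8');
  const tmpFile = `${file}.jstmpreplace`;
  fs.writeFileSync(tmpFile, contents.replace(search, replacement), 'utf8');
  fs.renameSync(tmpFile, file); // swap the finished temp file into place
}

// Demo on a throwaway file in the temp directory.
const demoPath = path.join(os.tmpdir(), 'replace-demo.txt');
fs.writeFileSync(demoPath, 'foo bar foo');
replaceInFileSync(demoPath, /foo/g, 'baz');
console.log(fs.readFileSync(demoPath, 'utf8')); // baz bar baz
```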
This may help someone:
This is a little different than just a global replace
from the terminal we run
node replace.js
replace.js:
function processFile(inputFile, repString = "../") {
var fs = require('fs'),
readline = require('readline'),
instream = fs.createReadStream(inputFile),
outstream = new (require('stream'))(),
rl = readline.createInterface(instream, outstream),
formatted = '';
const regex = /<xsl:include href="([^"]*)" \/>$/gm;
rl.on('line', function (line) {
let url = '';
let m;
while ((m = regex.exec(line)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
url = m[1];
}
let re = new RegExp('^.* <xsl:include href="(.*?)" \/>.*$', 'gm');
formatted += line.replace(re, `\t<xsl:include href="${repString}${url}" />`);
formatted += "\n";
});
rl.on('close', function (line) {
fs.writeFile(inputFile, formatted, 'utf8', function (err) {
if (err) return console.log(err);
});
});
}
// path is relative to where your running the command from
processFile('build/some.xslt');
Here is what this does.
We have several files that contain xsl:include elements.
However, in development the include paths need a "../" prefix.
From this
<xsl:include href="common/some.xslt" />
to this
<xsl:include href="../common/some.xslt" />
So we end up running two regex patterns: one to get the href and the other to write it.
There is probably a better way to do this, but it works for now.
Thanks
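One possible simplification is a single pattern with capture groups, applied to the whole text at once. This is a sketch rather than an equivalent of the script above: prefixIncludes is a hypothetical helper, and unlike the original it leaves the rest of each line (including its indentation) untouched.

```javascript
// Hypothetical helper: prefixes the href of every <xsl:include> element.
function prefixIncludes(xslt, prefix = '../') {
  return xslt.replace(
    /(<xsl:include href=")([^"]*)(")/g,
    (match, open, href, close) => open + prefix + href + close
  );
}

console.log(prefixIncludes('<xsl:include href="common/some.xslt" />'));
// <xsl:include href="../common/some.xslt" />
```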
Normally, I use tiny-replace-files to replace text in one or more files. This package is smaller and lighter.
https://github.com/Rabbitzzc/tiny-replace-files
import { replaceStringInFilesSync } from 'tiny-replace-files'
const options = {
files: 'src/targets/index.js',
from: 'test-plugin',
to: 'self-name',
}
// synchronous call, so no await is needed
const result = replaceStringInFilesSync(options)
console.info(result)
I would use a duplex stream instead, as documented in the Node.js docs on duplex streams:
A Transform stream is a Duplex stream where the output is computed in
some way from the input.
<p>Please click in the following {{link}} to verify the account</p>
function renderHTML(templatePath: string, object) {
const template = fileSystem.readFileSync(path.join(Application.staticDirectory, templatePath + '.html'), 'utf8');
return template.match(/\{{(.*?)\}}/ig).reduce((acc, binding) => {
const property = binding.substring(2, binding.length - 2);
return `${acc}${template.replace(/\{{(.*?)\}}/, object[property])}`;
}, '');
}
renderHTML(templateName, { link: 'SomeLink' })
You can certainly improve the template-reading function to read the file as a stream and assemble the bytes line by line to make it more efficient.
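When a template has more than one binding, the reduce above appears to append a full copy of the template per binding. A single replace call with a callback is one way to fill every {{name}} in one pass. This is a minimal sketch in plain JavaScript: renderTemplate is a hypothetical helper that works on a template string already loaded into memory, and it leaves unknown bindings unchanged.

```javascript
// Hypothetical helper: replaces every {{name}} binding with values[name].
function renderTemplate(template, values) {
  return template.replace(/\{\{\s*(.*?)\s*\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match // keep bindings that have no value
  );
}

const html = '<p>Please click in the following {{link}} to verify the account</p>';
console.log(renderTemplate(html, { link: 'SomeLink' }));
// <p>Please click in the following SomeLink to verify the account</p>
```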
