I have a CSV file that looks like this:
# Meta Data 1: ...
# Meta Data 2: ...
...
Header1, Header2...
actual data
I'm currently using the fast-csv library in a Node.js script to parse the actual data part into objects with:
const csv = require("fast-csv");
const fs = require("fs");
fs.createReadStream(file)
  .pipe(csv.parse({
    headers: true,
    comment: "#", // this ignores lines that begin with #
    skipLines: 2
  }));
I'm skipping over the comments, or else I won't get nice neat objects with header:data pairs, but I still want some of my metadata. Is there a way to get it? If not with fast-csv, is there another library that could accomplish this?
Thanks!
Edit: My current workaround is to just regex for the specific metadata I want, but this means I have to read the file twice. I don't expect my files to be very big, so this works for now, but I don't think it's the best solution.
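One single-pass sketch that avoids reading the file twice (assuming, as above, the files are small enough to buffer): strip off the leading # lines yourself and hand the remainder to fast-csv's parseString:
const csv = require("fast-csv");
const fs = require("fs");

const lines = fs.readFileSync(file, "utf8").split(/\r?\n/); // file as in the snippet above

// Keep the "# ..." lines as metadata; everything else is the real CSV.
const meta = lines.filter((line) => line.startsWith("#"));
const body = lines.filter((line) => !line.startsWith("#")).join("\n");

csv.parseString(body, { headers: true })
  .on("data", (row) => { /* header:data objects as before */ })
  .on("end", () => console.log("metadata:", meta));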
How can I create a function that loops through a folder in my directory called "data"? It contains only image files, and I will keep adding more image files to it. Previously, I was using the following function, which returns an array of URLs:
function _image_urls(){return(
  [
    "https://images.pexels.com/photos/4050284/pexels-photo-4050284.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/1323550/pexels-photo-1323550.jpeg?auto=compress&cs=tinysrgb&w=600",
    "https://images.pexels.com/photos/2002719/pexels-photo-2002719.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/919606/pexels-photo-919606.jpeg?auto=compress&cs=tinysrgb&w=600",
    "https://images.pexels.com/photos/1983038/pexels-photo-1983038.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/1702624/pexels-photo-1702624.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/3631430/pexels-photo-3631430.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/5011647/pexels-photo-5011647.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/135018/pexels-photo-135018.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1",
    "https://images.pexels.com/photos/161154/stained-glass-spiral-circle-pattern-161154.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1"
  ]
)}
I'm trying to create a function that returns an array of paths for all the images in the data folder. I have been trying the following approach:
function _image_urls() {
  const image_folder = 'data';
  const image_extension = '.jpg';
  let image_urls = [];
  for (let i = 0; i < 10; i++) {
    let image_url = image_folder + i + image_extension;
    image_urls.push(image_url);
  }
  return image_urls;
}
It seems like this will just return an array like:
[
"data0.jpg",
"data1.jpg",
"data2.jpg",
"data3.jpg",
"data4.jpg",
"data5.jpg",
"data6.jpg",
"data7.jpg",
"data8.jpg",
"data9.jpg"
]
If that's what you're getting, then you need to use i as the index for an array that contains the file names.
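For example, something like this, where the file_names array is hypothetical and would have to come from the server, as explained below:
// Hypothetical list of file names; in practice this has to come from the server.
const file_names = ["photo1.jpg", "photo2.jpg", "photo3.jpg"];

function _image_urls() {
  const image_folder = 'data/';
  let image_urls = [];
  for (let i = 0; i < file_names.length; i++) {
    image_urls.push(image_folder + file_names[i]); // i indexes the names, not the path
  }
  return image_urls;
}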
The bigger question is how you are getting that list of files in the first place. This is generally not something JavaScript can do on its own. If the files exist on the server, you need some server-side script to actually access the folder and output the array of file names. That array can then reach your JS in several ways: either written directly into the code (if you let your server-side code process the JS file) or, probably more likely, by using an XHR to request the file names and populating the array when you get the response.
If you write this server-side script so that it formats the output as JSON, then converting the response to an array is simply a matter of calling JSON.parse(), with no need to iterate over it the way the function in the question does.
EDIT/UPDATE after comment from OP:
Since you're using PHP on the server side, I would create a server-side script that reads the contents of the "data" folder and outputs a JSON-formatted string, which can then be parsed by the JS on the front end.
In general, this is done using the scandir function. See https://www.php.net/manual/en/function.scandir.php for details.
The steps would be as follows:
Use scandir to get an array of files in the Data folder
Remove the first two items in the array (. and ..)
Use the json_encode function to convert the array to a JSON formatted string
Echo that string
Then on the page where you have your JS, you have two options:
Include the PHP script described above such that it becomes a JS array using JSON.parse().
Use an XHR to request the PHP script, and when you get a response use JSON.parse() to set it as an array variable.
The first method is outdated but very simple, though it does require that your JS code is parsed by PHP, which may or may not be possible/advisable depending on your server configuration.
The second method is probably what you should do, as long as you're fine with the array being populated after the page loads and you wait for the XHR to complete before calling any functions that rely on the array (see the sketch below).
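Here's a minimal sketch of that second approach. The URL list_images.php is a placeholder for the PHP script described above, which is assumed to echo a JSON array of file names:
// Sketch: request the file list and build the array once the response arrives.
// 'list_images.php' is a placeholder for wherever your PHP script lives.
var image_urls = [];

var xhr = new XMLHttpRequest();
xhr.open('GET', 'list_images.php');
xhr.onload = function () {
  var files = JSON.parse(xhr.responseText); // e.g. ["a.jpg", "b.jpg"]
  image_urls = files.map(function (name) {
    return 'data/' + name; // prefix the folder so the paths resolve
  });
  // Only call functions that rely on image_urls from here on.
};
xhr.send();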
The main thing to know here is that what you want to do is not possible using only JavaScript, because JS cannot read the contents of a folder on the server. Your JS will need to interact with some server-side code in order to read the contents of a folder into an array.
So I have a script that organises an unformatted CSV file and presents an output.
One of the pieces of data we receive, and must return, is a link to an image stored on Google Drive. The problem is that Google Drive doesn't like to present you with a direct link to a file.
You can get the ID of a file (e.g. abc123DEFz) and view it online at https://drive.google.com/open?id=abc123DEFz. We need a direct link for another service to be able to process the file, not a redirect or some fancy website.
After poking around I discovered that https://drive.google.com/uc?export=view&id=abc123DEFz would redirect you directly to the file, and was what I somehow had to obtain inside the script.
The URL it gave me, though, didn't really seem to have any relation to the ID, so I couldn't just go ahead and swap the ID in; for each file I would have to resolve this uc?export link into the link that sends me directly to the file. (Where the redirect sent me: http://doc-0c-2s-docs.googleusercontent.com/docs/securesc/32-char-long-alphanumeric-thing/another-32-char-long-alphanumeric-thing/1234567891234/12345678901234567890/12345678901234567890/abc123DEFz?e=view&authuser=0&nonce=abcdefgh12345&user=12345678901234567890&hash=32-char-long-alphanumeric-hash)
No authentication is required to access the file, it is public.
My script works like this:
const csv = require('csv-parser'),
  fs = require('fs'),
  request = require('request');

let final = [],
  spuSet = [];

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (row) => {
    // ... data processing stuff, very boring so you don't care
    console.log(`
I'm now going to save this information and tell you about the row I'm processing
so you can see why something went wrong`);
    final.push(`[{"yes":"there is something here"},{"anditinvolves":${thatDataIJustGot}}]`);
    spuSet.push(`[{"morethings":123}]`);
  })
  .on('end', () => {
    console.log('CSV file successfully processed');
    console.log(`
COMPLETED! Check the output below and verify:
[${String(final).replace(/\r?\n|\r/g, " ")}]
COMPLETED! Check the output below and verify:
[${String(spuSet).replace(/\r?\n|\r/g, " ")}]`);
    // ... some more boring stuff where I upload the data somewhere and create a file containing said data
  });
I tried using request, but it takes a callback, so using the data outside of the function would be difficult, and wrapping everything inside the callback would remove my ability to push to the array.
The URL I get from the redirect would be included in the data I push to the array, for me to use later on.
I'm pretty bad at explaining crap, so if you have any questions please ask.
Thanks in advance for any help you can give.
Try requesting the webContentLink field in the files.get API call:
var webLink = drive.files.get({
  fileId: 'fileid',
  fields: 'webContentLink'
});
This will return the object:
{
"webContentLink": "https://drive.google.com/a/google.com/uc?id=fileId&export=download"
}
Then you can use split() to remove &export=download from the link, as we don't want to download it.
As fileId, you can get the IDs of your files by using the files.list API call, and then you can loop through the returned list, calling files.get from the first step (see the sketch below).
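For example, a minimal sketch using the googleapis Node.js client, assuming authentication (auth) is already set up as in the Quickstart linked below:
const { google } = require('googleapis');

async function directLinks(auth) {
  const drive = google.drive({ version: 'v3', auth });
  // List the files, requesting only the fields we actually need.
  const res = await drive.files.list({ fields: 'files(id, name, webContentLink)' });
  return res.data.files.map((f) =>
    // Drop the download flag so the link views the file instead of downloading it.
    f.webContentLink ? f.webContentLink.split('&export=download')[0] : null
  );
}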
My apologies if I misunderstood your issue.
In case you need help with authentication to the Google services, you can take a look at the Quickstart.
I have a JSON that looks like this:
{"marker":[{"#attributes":{"start":"Im Berge",
"finish":"Eichelberger Stra\u00dfe"
...
I am trying to parse the attributes inside the "#attributes", but have not found a way to do it. What I tried so far:
const fs = require('fs');
var jsonObj = JSON.parse(fs.readFileSync('route1.json', 'utf8'));
console.log(jsonObj['#attributes']);
Also tried the same with
console.log(jsonObj.marker['#attributes']);
Neither of these works. I understand that this is supposed to be JSON-LD and that I'm supposed to access a key that starts with "#" via ['#attributes'], but either way I always get an error or undefined. I got the JSON from an API I want to use, and the key appears in there multiple times, so I have no way around it.
.marker is an array so:
console.log(jsonObj.marker[0]['#attributes']);
But you may want to loop through it:
jsonObj.marker.forEach(marker => console.log(marker['#attributes']));
You can require a JSON file instead of using JSON.parse and fs.readFileSync:
var jsonObj = require('./route1.json');
I need to extract data from websites and found that node-unfluff does the job (see https://github.com/ageitgey/node-unfluff). There are two ways to call this module.
First, from the command line, which works!
Second, from Node.js, which doesn't.
const extractor = require('unfluff');
const data = extractor('test.html');
console.log(data);
Output : {"title":"","lang":null,"tags":[],"image":null,"videos":[],"text":""}
The data returned is an empty JSON object; it seems it cannot read test.html. The example in the docs passes "my html data"; is there a way to get the HTML data? Thanks.
From the docs of unfluff:
extractor(html, language)
html: The html you want to parse
language (optional): The document's two-letter language code. This will be auto-detected as best as possible, but there might be cases where you want to override it.
You are passing a filename, and it expects the actual HTML of the file to be passed in.
If you are doing this in a scripting context, I'd recommend doing:
const fs = require('fs');
data = extractor(fs.readFileSync('test.html', 'utf8'));
however, if you are doing this in the context of a server, or some time when blocking will be an issue, you should do:
fs.readFile('test.html', 'utf8', function(err, html){
  if (err) throw err; // surface read errors instead of parsing undefined
  var data = extractor(html);
  console.log(data);
});
I'm creating an Android app which takes in some JSON data. Is there a way to set up a directory such as:
http://......./jsons/*.json
Alternatively, is there a way to add to a JSON file called a.json and extend the array it contains, i.e. add more data into the .json file and so increase its size?
It could be done with PHP or JavaScript.
Look into parsing JSON: you can use the JSON.parse() function. As for getting all your JSON files from a directory call, I'm not sure; maybe someone else will explain that.
var data ='{"name":"Ray Wlison",
"position":"Staff Author",
"courses":[
"JavaScript & Ajax",
"Buildinf Facebook Apps"]}';
var info = JSON.parse(data);
//var infostoring = JSON.stringify(info);
One way to add to a JSON file is to parse it, add to it, then save it again. This might not be optimal if you have large amounts of data, but in that case you'll probably want a proper database anyway (like Mongo).
Using PHP:
$json_data = json_decode(file_get_contents('a.json'));
array_push($json_data, 'some value');
file_put_contents('a.json', json_encode($json_data));
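And, since the question allows JavaScript too, the same parse, push, rewrite idea as a Node.js sketch (assuming a.json holds a top-level array):
const fs = require('fs');

// Parse the file, append the new value, and write it back out.
const json_data = JSON.parse(fs.readFileSync('a.json', 'utf8'));
json_data.push('some value');
fs.writeFileSync('a.json', JSON.stringify(json_data));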