Disclaimer: I have been programming for about 4 months, so I'm still very new to this.
I am using Firebase Cloud Firestore as a database and inserting data from CSV files, each of which can be around 100k records long. The project I'm working on requires a user to upload these CSVs from a web page.
I created a file uploader and I'm using the PapaParse JS library to parse the CSV, and it does so very nicely: it returns an array almost instantly, even for very long files. I've tried the largest files I can find, and logging the parsed result to the console is very fast.
The problem is when I take that array, loop through it, and insert it into Cloud Firestore. It works, and the data is inserted exactly how I want it, but it's very slow: inserting only 50 records takes about 10-15 seconds, and if I close the browser window it stops inserting. So with files of 100k records this is not going to work.
I was considering using Cloud Functions, but before I try to learn how all of that works, maybe I'm just not doing this efficiently? So I thought I'd ask here.
Here is the JS:
// Get the file uploader in the DOM
var uploader = document.getElementById('vc-file-upload');

// Listen for when a file is selected
uploader.addEventListener('change', function (e) {
  // Get the file
  var file = e.target.files[0];

  // Parse the CSV file and insert each row into Firestore
  Papa.parse(file, {
    header: true,
    complete: function (results) {
      console.log(results);
      var sim = results.data;
      var simLength = sim.length;
      for (var i = 0; i < simLength; i++) {
        var indSim = sim[i];
        var iccid = indSim.iccid;
        var docRef = firestore.collection('vc_uploads').doc(iccid);
        docRef.set({ indSim }).then(function () {
          console.log('Insert complete.');
        }).catch(function (err) {
          console.log('Got an error: ' + err);
        });
      }
    }
  });
});
It will almost certainly be faster overall if you just upload the file (perhaps to Cloud Storage) and perform the database operations in Cloud Functions or some other backend, and it will complete even if the user leaves the app.
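If you go that route, here is a rough sketch of what it could look like with a Storage-triggered Cloud Function. This is a sketch only, assuming the Firebase Admin SDK, the firebase-functions v1 API, PapaParse on the server, and the same vc_uploads collection and iccid field from your code; error handling and file-type checks are omitted:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
const Papa = require('papaparse');

admin.initializeApp();

exports.importCsv = functions.storage.object().onFinalize(async (object) => {
  // Download the CSV the user uploaded to Cloud Storage
  const bucket = admin.storage().bucket(object.bucket);
  const [contents] = await bucket.file(object.name).download();

  // Parse it on the server, same as on the client
  const { data } = Papa.parse(contents.toString('utf8'), { header: true });

  // Write in batches of up to 500 documents (Firestore's per-batch limit)
  const db = admin.firestore();
  for (let i = 0; i < data.length; i += 500) {
    const batch = db.batch();
    data.slice(i, i + 500).forEach((indSim) => {
      batch.set(db.collection('vc_uploads').doc(indSim.iccid), { indSim });
    });
    await batch.commit();
  }
});
Most of the speed-up comes from batching (up to 500 writes per commit instead of one network round trip per document), and because it runs server-side the import keeps going even after the browser tab is closed.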
I have a bulk amount of data in CSV format. I am able to upload that data with Python by converting it to dictionaries (dict) in a loop, and the whole dataset gets uploaded.
But now I want to upload bulk data to Firebase and images to Storage, and I want to link each document to its image, because I am working on an e-commerce React app, so that I can retrieve documents along with their images.
Which is a good way to do this? Should I do it with JavaScript or Python?
I uploaded the data manually to Firebase by importing it there, but I am unable to upload bulk images to Storage and unable to create references between them. Please point me to a source where I can find a solution.
This is tough, because it's hard to fully understand exactly how your images and CSVs are linked. Generally, though, if you need to link something to items stored in Firebase Storage, you can get a link either manually (go into Storage, click an item, and the 'Name' field on the right-hand side is a link), or you can get it when you upload the item. For example, I have my images stored in Firebase and a Postgres database with a table storing their locations. In my API (Express), when I post an image to blob storage, I create the URL for the item, insert that as an entry in my table, and also set it as the blob's name. I'll put the code here, but obviously it's a completely different architecture to your problem, so I'll try to highlight the important bits (it's also JS, not Python, sorry!):
// Note: storage (a Google Cloud Storage client), bucketName, localFilename,
// pg and connectionString are all set up elsewhere in my app.
const uploadFile = async () => {
  var filename = "" + v4.v4() + ".png"; // uses the uuid library to create a unique value
  const options = {
    destination: filename,
    resumable: true,
    validation: "crc32c",
    metadata: {
      metadata: {
        firebaseStorageDownloadTokens: v4.v4(),
      },
    },
  };

  // Upload the local file to the bucket under the generated name
  storage
    .bucket(bucketName)
    .upload(localFilename, options, function (err, file) {});

  // Store the same name in Postgres so it can be looked up later
  pg.connect(connectionString, function (err, client, done) {
    client.query(
      `INSERT INTO table (image_location) VALUES ('${filename}')`, // inserts the filename we set earlier into the postgres table
      function (err, result) {
        done();
        if (err) return console.error(err);
        console.log(result.rows.length);
      }
    );
  });

  console.log(`${filename} uploaded to ${bucketName}.`);
};
Once you have a reference between the two like this, you can fetch the row from the table first, then use the stored location to pull the image back out of Storage.
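As a sketch of that retrieval step (assuming the same storage client and bucketName as above, and a filename read back from the image_location column), one option is to ask Cloud Storage for a signed download URL:
// A minimal sketch: turn a stored blob name back into a downloadable URL.
async function getImageUrl(filename) {
  // Request a time-limited signed URL for that blob
  const [url] = await storage
    .bucket(bucketName)
    .file(filename)
    .getSignedUrl({ action: 'read', expires: Date.now() + 60 * 60 * 1000 });
  return url;
}
In your case (Firebase documents plus Storage images rather than Postgres), the same idea applies: store the blob name or download URL as a field on each product document when you upload the image, then read it back when you render the product.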
I am writing a discord bot using javascript (discord.js).
I use json files to store my data and of course always need the latest data.
I do the following steps:
I start the bot
I run a function that requires the config.json file every time a message is sent
I increase the xp a user gets from the message he sent
I update the users xp in the config.json
I log the data
So after logging the first time (i.e. sending the first message) I get the data that was in the JSON file before I started the bot (makes sense). But after sending the second message, I expect the xp value to be higher than before, because the data should have been updated, the file loaded again, and the new data logged.
(Yes, I do update the file every time. When I look at the file myself, the data is always up to date.)
So is there any reason the file is not updated after requiring it the second time? Does require not reload the file?
Here is my code:
function loadJson() {
  var jsonData = require("./config.json")
  //here I navigate through my json file and end up getting to the ... That won't be needed I guess :)
  return jsonData
}

//edits the xp of a user
function changeUserXP(receivedMessage) {
  let xpPerMessage = getJsonData(receivedMessage)["levelSystemInfo"].xpPerMessage
  jsonReader('./config.json', (err, data) => {
    if (err) {
      console.log('Error reading file:', err)
      return
    }
    //increase the users xp
    data.guilds[receivedMessage.guild.id].members[receivedMessage.author.id].xp += Number(xpPerMessage)
    data.guilds[receivedMessage.guild.id].members[receivedMessage.author.id].stats.messagesSent += 1
    fs.writeFile('./test_config.json', JSON.stringify(data, null, 4), (err) => {
      if (err) console.log('Error writing file:', err)
    })
  })
}

client.on("message", (receivedMessage) => {
  changeUserXP(receivedMessage)
  console.log(loadJson(receivedMessage))
});
I hope the code helps :)
If my question was not precise enough or if you have further questions, feel free to comment
Thank you for your help <3
This is because require() reads the file only once and caches it. In order to read the same file again, you should first delete its key (the key is the resolved path to the file) from require.cache.
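A minimal sketch of that approach, using require.resolve to get the exact key the cache uses:
function loadJson() {
  // require caches modules by resolved absolute path, so resolve it first
  const filePath = require.resolve('./config.json');
  // Drop the cached copy, then require reads the file from disk again
  delete require.cache[filePath];
  return require(filePath);
}
An alternative that avoids the require cache entirely is to read the file with fs.readFileSync and parse it with JSON.parse on each call.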
I am new to programming, and I heard that some people on this website are quite angry, but please don't be. I am creating a web app that has a web page, makes some calculations, and works with a database (NeDB). I have an index.js:
const selects = document.getElementsByClassName("sel");
const arr = ["Yura", "Nairi", "Mher", "Hayko"];

for (let el in selects) {
  for (let key in arr) {
    selects[el].innerHTML += `<option>${arr[key]}</option>`;
  }
}
I have a function which fills the select elements with data from an array.
In another file, named getData.js:
var Datastore = require("nedb");
var users = new Datastore({ filename: "players" });
users.loadDatabase();

const names = [];
users.find({}, function (err, doc) {
  for (let key in doc) {
    names.push(doc[key].name);
  }
});
I have some code that gets data from the DB and puts it in an array. I need to use that data in the index.js mentioned above, but the problem is that I don't know how to transfer the data from getData.js to index.js. I have tried module.exports but it is not working; the browser console says that it can't recognize the require keyword. I also can't get the data directly in index.js, because the browser can't run the code related to the database.
You need to provide a server that is connected to the database.
Browser -> Server -> DB
Browser -> Server: the server provides endpoints the browser (client) can fetch data from. See https://expressjs.com/en/starter/hello-world.html
Server -> DB: the server gets the data out of the database and can do whatever it wants with it. In your case, it should provide that data to the client.
TODOs
Step 1: Set up a server, for example with Express.js (google it).
Step 2: Learn how to fetch data from the browser (client); "AJAX GET" are the keywords to google.
Step 3: Set up a database connection from your server and get your data.
Step 4: Do whatever you want with your data. (A minimal sketch of steps 1-3 follows below.)
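To make the steps concrete, here is a minimal sketch (not your exact code) of steps 1-3, assuming Express, the same NeDB players datastore from your getData.js, and a made-up /api/names endpoint:
// server.js - run with Node, not in the browser
const express = require('express');
const Datastore = require('nedb');

const users = new Datastore({ filename: 'players' });
users.loadDatabase();

const app = express();

// Endpoint the browser can fetch the names from
app.get('/api/names', (req, res) => {
  users.find({}, (err, docs) => {
    if (err) return res.status(500).json({ error: err.message });
    res.json(docs.map((doc) => doc.name));
  });
});

app.listen(3000, () => console.log('Listening on http://localhost:3000'));
In index.js (in the browser) you would then replace the hard-coded arr with something like fetch('/api/names').then((res) => res.json()).then((names) => { /* fill the selects */ });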
At first I thought it was a simple matter, but then I researched a little and realized that I didn't have enough information about how it really works. I have now solved the problem using promises and the template engine EJS. Thank you all for your time. I appreciate your help)
I am making a Discord bot in Node.js, mostly for fun and to get better at coding, and I want the bot to push a string into an array and update the array file permanently.
I have been using separate .js files for my arrays, such as this:
module.exports = [
  "Map: Battlefield",
  "Map: Final Destination",
  "Map: Pokemon Stadium II",
];
and then calling them in my main file. Now I tried using .push() and it adds the desired string, but only for that one run.
What is the best solution for having an array I can update and save? Apparently JSON files are good for this.
Thanks, Carl
Congratulations on the idea of writing a bot to get some coding practice. I bet you will succeed with it!
I suggest you try to split your problem into small chunks, so it is easier to reason about.
Step 1 - storing
I agree with you on using JSON files as data storage. For an app that is intended to be a "training gym" it is more than enough, and you have all the time in the world to start looking into databases like Postgres, MySQL or Mongo later on.
A JSON file to store a list of values may look like this:
{
  "values": [
    "Map: Battlefield",
    "Map: Final Destination",
    "Map: Pokemon Stadium II"
  ]
}
When you save this into list1.json you have your first data file.
Step 2 - reading
Reading a JSON file in NodeJS is easy:
const list1 = require('./path-to/list1.json');
console.log(list1.values);
This will load the entire content of the file in memory when your app starts. You can also look into more sophisticated ways to read files using the file system API.
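For example, one of those "more sophisticated ways" is re-reading the file on demand with the file system API instead of relying on require's cache (a small sketch):
const fs = require('fs');

// Read and parse the file each time it is needed
fs.readFile('./path-to/list1.json', 'utf8', (err, raw) => {
  if (err) {
    return console.log(err);
  }
  const list1 = JSON.parse(raw);
  console.log(list1.values);
});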
Step 3 - writing
Looks like you know your way around in-memory array modifications using APIs like push() or maybe splice().
Once you have updated the in-memory representation, you need to persist the change to your file, which basically means writing it back out in JSON format.
Option 1: you can use Node's file system API:
// https://stackoverflow.com/questions/2496710/writing-files-in-node-js
const fs = require('fs');

const filePath = './path-to/list1.json';
const fileContent = JSON.stringify(list1);

fs.writeFile(filePath, fileContent, function (err) {
  if (err) {
    return console.log(err);
  }
  console.log("The file was saved!");
});
Option 2: you can use fs-extra, which is an extension of the basic API:
const fs = require('fs-extra');

const filePath = './path-to/list1.json';

fs.writeJson(filePath, list1, function (err) {
  if (err) {
    return console.log(err);
  }
  console.log("The file was saved!");
});
In both cases list1 comes from the previous steps, and it is where you modified the array in memory.
Be careful with asynchronous code:
Both writing examples use non-blocking asynchronous API calls - the link points to a decent article.
For simplicity's sake, you can start with the synchronous APIs, which are basically:
fs.writeFileSync
fs.writeJsonSync
You can find all the details in the links above.
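As a minimal end-to-end sketch tying the three steps together with the synchronous API (the pushed value is just a made-up example):
const fs = require('fs');

const filePath = './path-to/list1.json';

// Step 2: read and parse the current list
const list1 = JSON.parse(fs.readFileSync(filePath, 'utf8'));

// modify it in memory (example value only)
list1.values.push("Map: Dream Land");

// Step 3: write it back in JSON format
fs.writeFileSync(filePath, JSON.stringify(list1, null, 2));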
Have fun with bot coding!
Hello all, I have a collection in MongoDB whose size is 30K documents.
When I run a find query (I am using Mongoose) from my Node server, the following problems occur:
1: It takes a long time to get the result back from the database server.
2: While creating a JSON object from the result data, the Node server crashes.
To solve the problem I tried to fetch the data in chunks (as stated in the docs).
Now I am getting the documents one by one in my stream.on('data') callback.
Here is my code:
var index = 1;
var stream = MyModel.find().stream();

stream.on('data', function (doc) {
  console.log("document number" + index);
  index++;
}).on('error', function (err) {
  // handle the error
}).on('close', function () {
  // the stream is closed
});
And the output of my code is:
document number1 document number2 ...... document number30000
The output shows that the database is sending the documents one by one.
Now my question is: is there any way to fetch the data in chunks of 5000 documents?
Or is there a better way to do the same?
Thanks in advance.
I tried using batch_size() but it did not solve my problem.
Can I use the same streaming for map-reduce?