I've been working on a Node project that involves fetching some data from BigQuery. Everything has been fine so far; I have my credential.json file (from BigQuery) and the project works as expected.
However, I want to implement a new feature in the project and this would involve fetching another set of data from BigQuery. I have an entirely different credential.json file for this new dataset. My project seems to recognize only the initial credential.json file I had (I named them differently though).
Here's a snippet of how I linked my first credential.json file:
function createCredentials(){
try{
const encodedCredentials = process.env.GOOGLE_AUTH_KEY;
if (typeof encodedCredentials === 'string' && encodedCredentials.length > 0) {
const google_auth = atob(encodedCredentials);
if (!fs.existsSync('credentials.json')) {
fs.writeFile("credentials.json", google_auth, function (err, google_auth) {
if (err) console.log(err);
console.log("Successfully Written to File.");
});
}
}
}
catch (error){
logger.warn(`Ensure that the environment variable for GOOGLE_AUTH_KEY is set correctly: full errors is given here: ${error.message}`)
process.kill(process.pid, 'SIGTERM')
}
}
Is there a way to fuse my two credential.json files together? If not, how can I separately declare which credential.json file to use?
I would create a single function that acts as the entry point to BigQuery and pass it an identifier that tells it which credential to generate. That credential is then used when calling BigQuery.
The code below assumes you changed this:
function createCredentials(){
try{
const encodedCredentials = process.env.GOOGLE_AUTH_KEY;
To this:
function createCredentials(auth){
try{
const encodedCredentials = auth;
And you can use it like this
import BigQuery from '@google-cloud/bigquery';
import {GoogApi} from "../apiManager" //Private code to get Token from client DB
if (!global._babelPolyfill) {
var a = require("babel-polyfill")
}
describe('Check routing', async () => {
it('Test stack ', async (done, auth) => {
//Fetch client Auth from local Database
//Replace the 2 value below with real values
const tableName = "myTest";
const dataset = "myDataset";
try {
const bigquery = new BigQuery({
projectId: `myProject`,
keyFilename: this.createCredentials(auth)
});
await bigquery.createDataset(dataset)
.then(
args => {
console.log(`Create dataset, result is: ${args}`)
})
.catch(err => {
console.log(`Error in the process: ${err.message}`)
})
} catch (err) {
console.log("err", err)
}
})
})
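Note that createCredentials, as shown in the question, writes a file asynchronously and returns nothing, so passing its result as keyFilename will not work as written. One alternative is to skip the file entirely and hand each decoded key straight to its own client. The following is a minimal sketch, assuming each service-account key is stored base64-encoded in its own environment variable. The helper name clientOptionsFor and the second variable name GOOGLE_AUTH_KEY_2 are hypothetical; projectId and credentials are options accepted by the BigQuery constructor.

```javascript
// Decode a base64-encoded service-account key held in an environment variable
// and turn it into constructor options for a BigQuery client.
// `clientOptionsFor` is an illustrative helper, not part of any library.
function clientOptionsFor(envVarName, env = process.env) {
  const encoded = env[envVarName];
  if (typeof encoded !== 'string' || encoded.length === 0) {
    throw new Error(`Environment variable ${envVarName} is not set`);
  }
  // Service-account key files are JSON documents.
  const key = JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'));
  return {
    projectId: key.project_id,
    credentials: {
      client_email: key.client_email,
      private_key: key.private_key,
    },
  };
}

// Usage (assumes @google-cloud/bigquery is installed and both variables are set;
// GOOGLE_AUTH_KEY_2 is a made-up name for the second key):
//   const { BigQuery } = require('@google-cloud/bigquery');
//   const firstClient = new BigQuery(clientOptionsFor('GOOGLE_AUTH_KEY'));
//   const secondClient = new BigQuery(clientOptionsFor('GOOGLE_AUTH_KEY_2'));
```

Each client then carries its own credentials, so queries against the two datasets never compete for a single credentials.json on disk.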
I've run into a problem that I can't solve.
I'm making an app where, on the first page, I need to choose one of two machines. There are two buttons on the page, and when one of them is clicked, I make a POST to /machineChoose, passing the id of the selected machine. Then I need to change the config.js file, where I keep all the params needed for the rest of the app.
const config = {
machineName: "Machine",
...
So in my code I need to change machineName. Right now I use the fs module to read and then write the file, but the problem is that I can't change this name more than once. When I restart the app, I'm able to change the name, but when I then try to choose the second machine, nothing happens.
router.post("/machineChoose", async (req, res) => {
console.log(req.body.machineChoose);
if (req.body.machineChoose == 1) {
machineX = "Machine1";
} else {
machineX = "Machine2";
}
console.log(machineX);
fs.readFile('./config.js', 'utf-8', function (err,data){
if (err){
console.log(err);
}
var result = data.replace(config.machineName,machineX);
fs.writeFileSync('./config.js', result, 'utf-8', function(err){
if (err) return console.log(err);
});
});
return res.send("")
})
Any idea how to solve it?
After writing to the file, you need to reload the config-object as it will still hold the previous state in-memory and thus further calls to data.replace(...) will not replace anything, since it will still be called with "Machine".
I would do something like this (although you should consider using a real database):
router.post("/machineChoose", async (req, res) => {
const chosenMachine = req.body.machineChoose == 1 ? "Machine1" : "Machine2";
const config = await readConfig();
config.machineName = chosenMachine;
await writeConfig(config);
res.status(204).end();
});
async function writeConfig(currentConfig) {
try {
await fs.promises.writeFile("./config.json", JSON.stringify(currentConfig));
} catch (e) {
console.log("Could not write config file", e)
throw e;
}
}
async function readConfig() {
try {
const rawConfig = await fs.promises.readFile("./config.json", {encoding: 'utf-8'});
return JSON.parse(rawConfig);
} catch (e) {
console.log("Could not read config file", e)
throw e;
}
}
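To illustrate why the original handler only works once, here is a small file-free sketch of the stale-value problem. The names alpha, beta and gamma and the helper applyChoice are made up for the demonstration; staleName plays the role of config.machineName as it was loaded at startup.

```javascript
// Mimics the original handler: replace the name held in memory with the new one.
function applyChoice(fileContents, inMemoryName, chosenName) {
  return fileContents.replace(inMemoryName, chosenName);
}

// The value loaded from the config module at startup never changes in memory.
const staleName = 'alpha';
let fileContents = 'const config = { machineName: "alpha" };';

// First request: the file still contains "alpha", so the replace succeeds.
fileContents = applyChoice(fileContents, staleName, 'beta');
const afterFirst = fileContents;

// Second request: the file now says "beta", but we still search for "alpha",
// so nothing matches and the file is written back unchanged.
fileContents = applyChoice(fileContents, staleName, 'gamma');
const afterSecond = fileContents;

console.log(afterFirst);
console.log(afterSecond);
```

Reading the current state from disk on every request, as readConfig does above, avoids depending on the value captured at startup.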
I have an array of doc ids that I want to delete using a cloud function. My code looks like the following:
// If the user decides to delete their account forever, we need to make sure nothing is left in the db afterwards!
// incoming payload array of 3 docs
data = {array : ['a302-5e9c8ae97b3b','92c8d309-090d','a302-5e932c8ae97b3b']}
export const deleteClients = functions.https.onCall(async (data, context) => {
try {
// declare batch
const batch = db.batch();
// set
data.arr.forEach((doc: string) => {
batch.delete(db.collection('Data'), doc);
});
// commit
await batch.commit();
} catch (e) {
console.log(e);
}
return null;
});
I am getting a syntax error on batch.delete. How do I pass the right arguments to batch.delete so that it references the doc I want to delete before the commit?
Delete takes a single param, the doc ref of the doc to be deleted.
data.arr.forEach((docId: string) => {
batch.delete(doc(db, "Data", docId));
});
There are several errors in your code:
data.arr.forEach() cannot work since your data object contains one element with the key array, not the key arr.
You are mixing up the syntax of the JS SDK v9 and the Admin SDK. See the write batch Admin SDK syntax here.
You need to send back some data to the client when all the asynchronous work is complete, to correctly terminate your CF.
You do return null; AFTER the try/catch block: this means that, in most of the cases, your Cloud Function will be terminated before asynchronous work is complete (see the link above)
So the following should do the trick (untested):
const db = admin.firestore();
const data = {array : ['a302-5e9c8ae97b3b','92c8d309-090d','a302-5e932c8ae97b3b']};
export const deleteClients = functions.https.onCall(async (data, context) => {
try {
const batch = db.batch();
const parentCollection = db.collection('Data')
data.array.forEach((docId) => {
batch.delete(parentCollection.doc(docId));
});
// commit
await batch.commit();
return {result: 'success'} // IMPORTANT, see explanations above
} catch (e) {
console.log(e);
// IMPORTANT See https://firebase.google.com/docs/functions/callable#handle_errors
}
});
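Since the first error above came from a mismatch between the payload key and the code, it may help to validate the payload before building the batch. This is only a sketch: extractDocIds is a hypothetical helper, and it assumes the client sends { array: [...] } as in the question.

```javascript
// Validate the callable payload and return the list of document ids.
// `extractDocIds` is an illustrative helper, not part of the Firebase SDKs.
function extractDocIds(data) {
  const ids = data && data.array;
  if (!Array.isArray(ids) || ids.length === 0) {
    throw new TypeError('Expected a payload of the form { array: [docId, ...] }');
  }
  for (const id of ids) {
    // A document id must be a non-empty string without path separators.
    if (typeof id !== 'string' || id.length === 0 || id.includes('/')) {
      throw new TypeError(`Invalid document id: ${JSON.stringify(id)}`);
    }
  }
  return ids;
}

// Usage inside the callable (assumes the db and functions objects from above):
//   let ids;
//   try {
//     ids = extractDocIds(data);
//   } catch (e) {
//     throw new functions.https.HttpsError('invalid-argument', e.message);
//   }
//   ids.forEach((docId) => batch.delete(db.collection('Data').doc(docId)));
```

Throwing an HttpsError gives the client a structured error instead of a silent undefined result.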
I need to load and interpret Parquet files from an S3 bucket using node.js. I've already tried parquetjs-lite and other npm libraries I could find, but none of them seems to interpret date-time fields correctly. So I'm trying to use AWS's own SDK instead, in the belief that it should be able to deserialize its own Parquet format correctly -- the objects were originally written from SageMaker.
The way to go about it, apparently, is to use the JS version of
https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html
but the documentation for that is horrifically out of date (it's referring to the 2006 API, https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#selectObjectContent-property). Likewise, the example they show in their blog post doesn't work either (data.Payload is neither a ReadableStream nor iterable).
I've already tried the responses in "Javascript - Read parquet data (with snappy compression) from AWS s3 bucket". Neither of them works: the first uses node-parquet, which doesn't currently compile, and the second uses parquetjs-lite (which doesn't work, see above).
So my question is, how is SelectObjectContent supposed to work nowadays, i.e., using aws-sdk v3?
import { S3Client, ListBucketsCommand, GetObjectCommand,
SelectObjectContentCommand } from "@aws-sdk/client-s3";
const REGION = "us-west-2";
const s3Client = new S3Client({ region: REGION });
const params = {
Bucket: "my-bucket-name",
Key: "mykey",
ExpressionType: 'SQL',
Expression: 'SELECT created_at FROM S3Object',
InputSerialization: {
Parquet: {}
},
OutputSerialization: {
CSV: {}
}
};
const run = async () => {
try {
const data = await s3Client.send(new SelectObjectContentCommand(params));
console.log("Success", data);
const events = data.Payload;
const eventStream = data.Payload;
// Read events as they are available
eventStream.on('data', (event) => { // <--- This fails
if (event.Records) {
// event.Records.Payload is a buffer containing
// a single record, partial records, or multiple records
process.stdout.write(event.Records.Payload.toString());
} else if (event.Stats) {
console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
} else if (event.End) {
console.log('SelectObjectContent completed');
}
});
// Handle errors encountered during the API call
eventStream.on('error', (err) => {
switch (err.name) {
// Check against specific error codes that need custom handling
}
});
eventStream.on('end', () => {
// Finished receiving events from S3
});
} catch (err) {
console.log("Error", err);
}
};
run();
The console.log shows data.Payload as:
Payload: {
[Symbol(Symbol.asyncIterator)]: [AsyncGeneratorFunction: [Symbol.asyncIterator]]
}
What should I do with that?
I was stuck on this exact same issue for quite some time. It looks like the best option now is to append .promise() to the call and then iterate over the payload with for await.
So far, I've made progress using the following (sorry, this is incomplete, but it should at least enable you to read data):
try {
const s3Data = await s3.selectObjectContent(params3).promise();
// using 'any' here temporarily, but will need to address type issues
const events: any = s3Data.Payload;
for await (const event of events) {
try {
if(event?.Records) {
if (event?.Records?.Payload) {
const record = decodeURIComponent(event.Records.Payload.toString().replace(/\+|\t/g, ' '));
records.push(record);
} else {
console.log('skipped event, payload: ', event?.Records?.Payload);
}
}
else if (event.Stats) {
console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
} else if (event.End) {
console.log('SelectObjectContent completed');
}
}
catch (err) {
if (err instanceof TypeError) {
console.log('error in events: ', err);
throw err;
}
}
}
}
catch (err) {
console.log('error fetching data: ', err);
throw err;
}
console.log("final records: ", records);
return records;
}
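For the aws-sdk v3 client in the question, the Payload shown in the console output is an async iterable, so the same for await loop should apply without .promise(). Below is a minimal sketch of the consuming side. collectRecords is a hypothetical helper, and it assumes each Records event carries a byte array in Records.Payload, as in the question's own code.

```javascript
// Concatenate the text carried by Records events from an async-iterable payload.
// `collectRecords` is an illustrative helper, not part of the AWS SDK.
async function collectRecords(payload) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for await (const event of payload) {
    if (event.Records && event.Records.Payload) {
      // A chunk may end mid-record or mid-character, so decode in streaming mode.
      text += decoder.decode(event.Records.Payload, { stream: true });
    } else if (event.Stats) {
      console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
    } else if (event.End) {
      console.log('SelectObjectContent completed');
    }
  }
  text += decoder.decode(); // flush any bytes still buffered in the decoder
  return text;
}

// Usage with aws-sdk v3 (assumes the s3Client and params from the question):
//   const data = await s3Client.send(new SelectObjectContentCommand(params));
//   const csv = await collectRecords(data.Payload);
```

Splitting the result into rows is best done after all chunks are joined, because a single record can span two events.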
I have inherited the following code. It is part of a CI/CD pipeline. It tries to get an object called "changes" from a bucket and does something with it. If it is able to grab the object, it sends a success message back to the pipeline. If it fails to grab the file for whatever reason, it sends a failure message back to CodePipeline.
This "changes" file is made in a previous step of the pipeline. However, sometimes it is valid for this file NOT to exist (i.e. when there IS no change).
Currently, the following code makes no distinction between the file simply not existing and the code failing to get it for some other reason (access denied etc.).
Desired:
I would like to send a success message back to CodePipeline if the file is simply not there.
If there is an access issue, then the current outcome of "failure" would still be valid.
Any help is greatly appreciated. Unfortunately I am not good enough with JavaScript to have any ideas to try.
RELEVANT PARTS OF THE CODE
const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const lambda = new AWS.Lambda();
const codePipeline = new AWS.CodePipeline();
// GET THESE FROM ENV Variables
const {
API_SOURCE_S3_BUCKET: s3Bucket,
ENV: env
} = process.env;
const jobSuccess = (CodePipeline, params) => {
return new Promise((resolve, reject) => {
CodePipeline.putJobSuccessResult(params, (err, data) => {
if (err) { reject(err); }
else { resolve(data); }
});
});
};
const jobFailure = (CodePipeline, params) => {
return new Promise((resolve, reject) => {
CodePipeline.putJobFailureResult(params, (err, data) => {
if (err) { reject(err); }
else { resolve(data); }
});
});
};
// MAIN CALLER FUNCTION. STARTING POINT
exports.handler = async (event, context, callback) => {
try {
// WHAT IS IN changes file in S3
let changesFile = await getObject(s3, s3Bucket, `lambda/${version}/changes`);
let changes = changesFile.trim().split("\n");
console.log("List of Changes");
console.log(changes);
let params = { jobId };
let jobSuccessResponse = await jobSuccess(codePipeline, params);
context.succeed("Job Success");
}
catch (exception) {
let message = "Job Failure (General)";
let failureParams = {
jobId,
failureDetails: {
message: JSON.stringify(message),
type: "JobFailed",
externalExecutionId: context.invokeid
}
};
let jobFailureResponse = await jobFailure(codePipeline, failureParams);
console.log(message, exception);
context.fail(`${message}: ${exception}`);
}
};
S3 should return an error code in the exception.
The ones you care about are below:
AccessDenied - Access Denied
NoSuchKey - The specified key does not exist.
So in your catch block you should be able to check exception.code to see whether it matches one of these two.
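As a rough sketch of that check, the helpers below treat NoSuchKey as "no changes" and rethrow everything else. isMissingObject and getObjectOrNull are hypothetical names, and the sketch assumes the error thrown by the question's getObject is the unmodified S3 error.

```javascript
// Returns true when the S3 error means "the object does not exist".
function isMissingObject(err) {
  // aws-sdk v2 errors expose `code`; `name` is checked as a fallback.
  const code = err && (err.code || err.name);
  return code === 'NoSuchKey';
}

// Resolve to null when the object is absent; rethrow any other failure
// (for example AccessDenied) so the existing catch block still reports it.
async function getObjectOrNull(fetchObject) {
  try {
    return await fetchObject();
  } catch (err) {
    if (isMissingObject(err)) return null;
    throw err;
  }
}

// Usage inside the handler (assumes getObject, s3, s3Bucket and version from the question):
//   const changesFile = await getObjectOrNull(
//     () => getObject(s3, s3Bucket, `lambda/${version}/changes`));
//   const changes = changesFile === null ? [] : changesFile.trim().split("\n");
```

One caveat: if the role lacks s3:ListBucket on the bucket, S3 reports a missing key as AccessDenied rather than NoSuchKey, so the permission may need to be granted for this distinction to work.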
I'm replicating this Google-authored tutorial and have run into an error that I can't figure out how to resolve.
In the Google Cloud Function that imports JSON into BigQuery, I get the error "TypeError: job.promise is not a function".
It occurs towards the bottom of the function; the code in question is:
.then(([job]) => job.promise())
The error led me to this discussion about the API used, but I don't understand how to resolve the error.
I tried .then(([ job ]) => waitJobFinish(job)), and removing the line resolves the error but doesn't insert anything.
Tertiary question: I also can't find documentation on how to trigger a test of the function so that I can read my console.logs in the Google Cloud Functions console, which would help me figure this out. I can test the JSON POST part of this function, but I can't work out what JSON to send to simulate a new file being written to Cloud Storage. The test says the payload must include a bucket, but I don't know how to format the JSON (the JSON I use to test the POST -> store to Cloud Storage step doesn't work).
Here is the full function, which I've pulled into its own function:
(function () {
'use strict';
// Get a reference to the Cloud Storage component
const storage = require('@google-cloud/storage')();
// Get a reference to the BigQuery component
const bigquery = require('@google-cloud/bigquery')();
function getTable () {
const dataset = bigquery.dataset("iterableToBigquery");
return dataset.get({ autoCreate: true })
.then(([dataset]) => dataset.table("iterableToBigquery").get({ autoCreate: true }));
}
//set trigger for new files to google storage bucket
exports.iterableToBigquery = (event) => {
const file = event.data;
if (file.resourceState === 'not_exists') {
// This was a deletion event, we don't want to process this
return;
}
return Promise.resolve()
.then(() => {
if (!file.bucket) {
throw new Error('Bucket not provided. Make sure you have a "bucket" property in your request');
} else if (!file.name) {
throw new Error('Filename not provided. Make sure you have a "name" property in your request');
}
return getTable();
})
.then(([table]) => {
const fileObj = storage.bucket(file.bucket).file(file.name);
console.log(`Starting job for ${file.name}`);
const metadata = {
autodetect: true,
sourceFormat: 'NEWLINE_DELIMITED_JSON'
};
return table.import(fileObj, metadata);
})
.then(([job]) => job.promise())
//.then(([ job ]) => waitJobFinish(job))
.then(() => console.log(`Job complete for ${file.name}`))
.catch((err) => {
console.log(`Job failed for ${file.name}`);
return Promise.reject(err);
});
};
}());
I couldn't figure out how to fix Google's example, but I was able to get this load from JS to work with the following code in a Google Cloud Function:
'use strict';
/*jshint esversion: 6 */
// Get a reference to the Cloud Storage component
const storage = require('@google-cloud/storage')();
// Get a reference to the BigQuery component
const bigquery = require('@google-cloud/bigquery')();
exports.iterableToBigquery = (event) => {
const file = event.data;
if (file.resourceState === 'not_exists') {
// This was a deletion event, we don't want to process this
return;
}
const importmetadata = {
autodetect: false,
sourceFormat: 'NEWLINE_DELIMITED_JSON'
};
let job;
// Loads data from a Google Cloud Storage file into the table
bigquery
.dataset("analytics")
.table("iterable")
.import(storage.bucket(file.bucket).file(file.name),importmetadata)
.then(results => {
job = results[0];
console.log(`Job ${job.id} started.`);
// Wait for the job to finish
return job;
})
.then(metadata => {
// Check the job's status for errors
const errors = metadata.status.errors;
if (errors && errors.length > 0) {
throw errors;
}
})
.then(() => {
console.log(`Job ${job.id} completed.`);
})
.catch(err => {
console.error('ERROR:', err);
});
};