Assume this code:
// puppeteer-manager.ts
import { Browser, executablePath, Page } from "puppeteer";
import puppeteerExtra from "puppeteer-extra";
import stealthPlugin from "puppeteer-extra-plugin-stealth";
puppeteerExtra.use(stealthPlugin());
interface PuppeteerInstance {
browser: Browser;
page: Page;
}
let instances: { [sessionId: string]: PuppeteerInstance } = {};
async function createPuppeteerInstance(
sessionId: string
): Promise<PuppeteerInstance> {
if (!instances[sessionId]) {
const browser = await puppeteerExtra.launch({
args: ["--no-sandbox"],
headless: false,
ignoreHTTPSErrors: true,
executablePath: executablePath(),
});
const page = await browser.newPage();
instances[sessionId] = { browser, page };
}
return instances[sessionId];
}
async function closePuppeteerInstance(sessionId: string): Promise<void> {
if (instances[sessionId]) {
const { browser } = instances[sessionId];
await browser.close();
delete instances[sessionId];
}
}
export { createPuppeteerInstance, closePuppeteerInstance };
This code creates new Puppeteer instances and stores them in a variable called instances, keyed by sessionId.
I'm building a Next.js app that gives every connected user their own Puppeteer instance.
I expect that if I call createPuppeteerInstance in file A with sessionId 1234 and then call it in file B with the same sessionId, I get the same Puppeteer instance back. For some reason, though, the instances variable in puppeteer-manager.ts does not keep the stored instances, and createPuppeteerInstance creates a new instance every time I call it. I know this works in an Express server, so I'm wondering: could Next.js be causing the instances variable to reset to an empty object?
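One workaround that is often tried for this kind of symptom is to keep the map on globalThis instead of in a module-level variable. The sketch below assumes (this is not confirmed in the question) that the map is lost because the module gets evaluated more than once. The Instance type, the launchInstance placeholder, and the __puppeteerInstances key are hypothetical names used only for illustration; in the real module launchInstance would call puppeteerExtra.launch() and browser.newPage().

```typescript
// Hypothetical stand-in for the { browser, page } pair in the question.
interface Instance {
  id: string;
}

// Keep the map on globalThis so that re-evaluating this module does not
// reset it. `__puppeteerInstances` is an illustrative name, not a Next.js API.
const store = globalThis as typeof globalThis & {
  __puppeteerInstances?: Map<string, Promise<Instance>>;
};
const instances: Map<string, Promise<Instance>> =
  store.__puppeteerInstances ?? (store.__puppeteerInstances = new Map());

// Counts how many times a launch was actually started.
let launches = 0;

// Placeholder for puppeteerExtra.launch() followed by browser.newPage().
async function launchInstance(sessionId: string): Promise<Instance> {
  launches += 1;
  return { id: `${sessionId}-${launches}` };
}

// Store the promise rather than the resolved value, so two concurrent
// calls for the same session share a single launch.
function getInstance(sessionId: string): Promise<Instance> {
  let pending = instances.get(sessionId);
  if (!pending) {
    pending = launchInstance(sessionId);
    instances.set(sessionId, pending);
  }
  return pending;
}
```

This only helps while all calls run in the same Node.js process. If requests are served by separate processes or serverless invocations, an in-memory map cannot be shared between them.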
I am using Next.js's Static HTML Export for my site, which has 10 million static pages, but I am running into RAM issues when building the app.
Is it even possible to export it in parts, for example 100k pages on the first build, then 100k on the second build, and so on?
To cut costs, I do not want to use Incremental Static Regeneration or getServerSideProps.
The site uses MongoDB and has only two pages, a home page and a posts page:
index.js
[postPage].js
In the home page I used this code:
export async function getStaticProps() {
const { db } = await connectToDatabase();
const postsFeed = await db
.collection("myCollection")
.aggregate([{ $sample: { size: 100 } }])
.toArray();
return {
props: {
postsFeed: JSON.parse(JSON.stringify(postsFeed)),
},
};
}
In the posts page I used this code:
export async function getStaticPaths() {
const { db } = await connectToDatabase();
const posts = await db
.collection("myCollection")
.find({})
.toArray();
const paths = posts.map((data) => {
return {
params: {
postPage: data.slug.toString(),
}
}
})
return {
paths,
fallback: 'blocking'
}
}
export async function getStaticProps(context) {
const postSlug = context.params.postPage;
const { db } = await connectToDatabase();
const posts = await db
.collection("myCollection")
.find({ slug: { $eq: postSlug } })
.toArray();
const postsFeed = await db
.collection("myCollection")
.aggregate([{ $sample: { size: 100 } }])
.toArray();
return {
props: {
posts: JSON.parse(JSON.stringify(posts)),
postsFeed: JSON.parse(JSON.stringify(postsFeed)),
},
};
}
There doesn't seem to be a built-in option to process static pages in batches: https://github.com/vercel/next.js/discussions/14929
The only approach I can think of is to divide the work with a bash script: set an env variable, use it in the code that fetches the data to generate the paths, and run the build command once per part. After each iteration, move the generated files to another directory that will hold your consolidated output.
COUNTER=1
PARTS=100 # change it to control the number of parts
while [ "$COUNTER" -le "$PARTS" ]; do
  # build part number COUNTER (1..PARTS)
  CURRENT=$COUNTER PARTS=$PARTS next build
  # move generated files to another directory
  COUNTER=$((COUNTER + 1))
done
In your getStaticPaths:
export async function getStaticPaths() {
const currentPercentage = process.env.CURRENT/process.env.PARTS
// logic to fetch the corresponding current percentage of the data
// 1% when there are 100 parts, 0.5% when 200 parts, etc.
}
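The "logic to fetch the corresponding current percentage" could be a small helper that turns CURRENT and PARTS into a skip/limit pair. This is only a sketch: chunkRange is a hypothetical name, it assumes CURRENT is 1-based (1 to PARTS), and it assumes the total document count is known up front, for example from a count query.

```typescript
interface ChunkRange {
  skip: number;
  limit: number;
}

// Maps a 1-based part number to the slice of records it should build.
// `current` is assumed to run from 1 to `parts` inclusive.
function chunkRange(current: number, parts: number, total: number): ChunkRange {
  if (
    !Number.isInteger(current) ||
    !Number.isInteger(parts) ||
    current < 1 ||
    current > parts
  ) {
    throw new RangeError(`CURRENT must be an integer between 1 and ${parts}`);
  }
  // Round up so that the last part picks up any remainder.
  const size = Math.ceil(total / parts);
  const skip = (current - 1) * size;
  // Clamp so that parts past the end of the data get an empty slice.
  const limit = Math.max(0, Math.min(size, total - skip));
  return { skip, limit };
}
```

Inside getStaticPaths the result would feed a query with a stable sort, such as .find({}).sort({ _id: 1 }).skip(skip).limit(limit), after converting the env variables with Number(process.env.CURRENT) and Number(process.env.PARTS). When limit is 0 the query should be skipped for that part, since MongoDB treats limit(0) as "no limit".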
Be aware that if the data changes often, you'll see incorrect results, such as repeated or skipped pages, because each chunk is queried at a different moment while the script runs. I believe you could write an auxiliary Node script (or one in another language) to better handle that quantity of records, maybe in a streamlined way, and generate a JSON file for each chunk of data to use in getStaticPaths instead of fetching directly from the DB.
I'm using SvelteKit and trying to understand the new features added since Sapper was retired. One of them is hooks.js, which runs on the server and is not accessible to the frontend, which makes it safe to work with the database there. I created a connection to my MongoDB to retrieve the user's data before using the results in my getSession function. It works, but I noticed that it accesses my database TWICE. Here is my hooks.js code:
import * as cookie from 'cookie';
import { connectToDatabase } from '$lib/mongodb.js';
export const handle = async ({event, resolve})=>{
const dbConnection = await connectToDatabase();
const db = dbConnection.db;
const userinfo = await db.collection('users').findOne({ username: "a" });
console.log("db user is :" , userinfo) //username : John
const response = await resolve(event)
response.headers.set(
'set-cookie', cookie.serialize("cookiewithjwt", "sticksafterrefresh")
)
return response
}
export const getSession = (event)=>{
return {
user : {
name : "whatever"
}
}
}
The console.log you see here prints the user data twice: once as soon as I start my app at localhost:3000 with npm run dev, and again less than a second later with the same information:
db user is : John
A second later, without my clicking on anything, a second console.log prints:
db user is : John
My understanding from the SvelteKit docs is that hooks.js runs every time SvelteKit receives a request. I removed all prerender and prefetch from my code and made sure index.svelte is the only page in my app, but it still prints twice. The connection code, which I copied from an online post, contains the following comment:
/**
* Global is used here to maintain a cached connection across hot reloads
* in development. This prevents connections growing exponentially
* during API Route usage.
*/
Here is my connection code:
import { MongoClient } from 'mongodb';
const mongoURI = "mongodb+srv://xxx:xxx@cluster0.qjeag.mongodb.net/xxxxdb?retryWrites=true&w=majority";
const mongoDB = "xxxxdb"
export const MONGODB_URI = mongoURI;
export const MONGODB_DB = mongoDB;
if (!MONGODB_URI) {
throw new Error('Please define the mongoURI property inside config/default.json');
}
if (!MONGODB_DB) {
throw new Error('Please define the mongoDB property inside config/default.json');
}
/**
* Global is used here to maintain a cached connection across hot reloads
* in development. This prevents connections growing exponentially
* during API Route usage.
*/
let cached = global.mongo;
if (!cached) {
cached = global.mongo = { conn: null, promise: null };
}
export const connectToDatabase = async() => {
if (cached.conn) {
return cached.conn;
}
if (!cached.promise) {
const opts = {
useNewUrlParser: true,
useUnifiedTopology: true
};
cached.promise = MongoClient.connect(MONGODB_URI).then((client) => {
return {
client,
db: client.db(MONGODB_DB)
};
});
}
cached.conn = await cached.promise;
return cached.conn;
};
So my question is: does hooks.js always run twice, once on the server and once on the frontend? If not, why is hooks.js running and printing the DB results twice in my case?
Anyone?
I am trying to use the revalidate function. I tried to follow the code that Vercel offers, but I keep getting an error. Here is the function that I am using:
export async function getServerSideProps() {
const client = await clientPromise;
const db = client.db("myFirstDatabase");
let users = await db.collection("users").find({}).toArray();
users = JSON.parse(JSON.stringify(users));
return {
props: {
users,
},
revalidate: 15,
};
}
And here is the mongodb file that returns the client:
import { MongoClient } from 'mongodb'
const uri = process.env.MONGODB_URI
const options = {
useUnifiedTopology: true,
useNewUrlParser: true,
}
let client
let clientPromise
if (!process.env.MONGODB_URI) {
throw new Error('Please add your Mongo URI to .env.local')
}
if (process.env.NODE_ENV === 'development') {
// In development mode, use a global variable so that the value
// is preserved across module reloads caused by HMR (Hot Module Replacement).
if (!global._mongoClientPromise) {
client = new MongoClient(uri, options)
global._mongoClientPromise = client.connect()
}
clientPromise = global._mongoClientPromise
} else {
// In production mode, it's best to not use a global variable.
client = new MongoClient(uri, options)
clientPromise = client.connect()
}
export default clientPromise
I have been able to connect to the database and the code works fine if I remove the revalidate part. The error that I get is :
Error: Additional keys were returned from getServerSideProps. Properties intended for your component must be nested under the props key, e.g.:
return { props: { title: 'My Title', content: '...' } }
Keys that need to be moved: revalidate.
Read more: https://nextjs.org/docs/messages/invalid-getstaticprops-value
I am not sure what I am doing wrong. I want to get data from the database and update it every 15 seconds. Any help would be greatly appreciated.
revalidate is an option for getStaticProps. You are using it in getServerSideProps, which does not allow it, and that is why Next.js reports revalidate as an additional key.
I also recommend taking a look at this library: https://swr.vercel.app/
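As a rough sketch of the first point, the same return value is accepted once the function is exported as getStaticProps. The loadUsers helper and the User type below are hypothetical stand-ins for the MongoDB query in the question, so that the example is self-contained.

```typescript
interface User {
  name: string;
}

// Hypothetical stand-in for the db.collection("users").find({}).toArray() call.
async function loadUsers(): Promise<User[]> {
  return [{ name: "example" }];
}

// Exported as getStaticProps, the function in which Next.js accepts
// a top-level `revalidate` key next to `props`.
export async function getStaticProps() {
  const users = await loadUsers();
  return {
    props: { users },
    // Ask Next.js to regenerate the page at most once every 15 seconds.
    revalidate: 15,
  };
}
```

Note that this regenerates the page on the server when requests come in. A page that is already open in the browser does not update by itself, which is where a client-side fetching library such as SWR comes in.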
Here is the scenario:
I have 3 files (modules):
app.ts
(async () => {
await connectToDB();
let newRec = new userModel({
...someprops
});
await newRec.save();
})();
app.ts is the entry point of the project.
database.ts
interface ConnectionInterface {
[name: string]: mongoose.Connection;
}
export class Connection {
public static connections: ConnectionInterface;
public static async setConnection(name: string, connection: mongoose.Connection) {
Connection.connections = {
...Connection.connections,
[name]: connection,
};
}
}
export async function connectToDB() {
const conn = await mongoose.createConnection('somePath');
await Connection.setConnection('report', conn);
}
model.ts
const userSchema = new mongoose.Schema(
{
...someprops
},
);
const userModel = Connection.connections.report.model('User', userSchema);
export default userModel;
What I am trying to do: I need multiple mongoose connections, so I use a static prop called connections in the Connection class (in database.ts). Every time I connect to a database, I use setConnection to store the connection in that static prop, so I can access it by name (report in this case) from every module in my project.
Later, in model.ts, I use Connection.connections.report to access the report connection and load my model.
Then, when I run app.ts, I get the following error, which is logical:
const aggregationModel = Connection.connections.report.model('User', userSchema)
^
TypeError: Cannot read property 'report' of undefined
The reason (I think) is that while the modules imported by app.ts are being loaded, .report is not yet defined, because app.ts hasn't finished running (connectToDB() is what defines the .report key).
The code shown here has been simplified to reduce complexity. The original app is an Express app.
Now, how should I solve this error?
Thanks in advance.
You can wait for the connection to finish before using it if you change up your class slightly.
const connection = await Connection.getConnection()
const model = connection.example
...
class Connection {
...
public static async getConnection() {
if (!Connection.connection) {
await Connection.setConnection()
}
return Connection.connection
}
}
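A related option is to look the connection up when the model is first needed rather than at import time. In the sketch below, ConnectionRegistry, ConnectionLike, and getUserModel are hypothetical names, and ConnectionLike is a minimal stand-in for mongoose.Connection so that the example runs without a database.

```typescript
// Minimal stand-in for the part of mongoose.Connection used here.
interface ConnectionLike {
  model(name: string): { modelName: string };
}

class ConnectionRegistry {
  private static connections: Record<string, ConnectionLike> = {};

  static set(name: string, connection: ConnectionLike): void {
    ConnectionRegistry.connections[name] = connection;
  }

  // Fails with a clear message instead of "Cannot read property ... of undefined".
  static get(name: string): ConnectionLike {
    const connection = ConnectionRegistry.connections[name];
    if (!connection) {
      throw new Error(`Connection "${name}" has not been opened yet`);
    }
    return connection;
  }
}

// model.ts equivalent: export a function instead of a value computed at
// import time, so the lookup happens after connectToDB() has run.
function getUserModel(): { modelName: string } {
  return ConnectionRegistry.get("report").model("User");
}
```

With this shape, app.ts would call getUserModel() after await connectToDB() has finished, so importing model.ts no longer touches the connection.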
I'm trying to put the Puppeteer call, the browser object, and the page object in a class, but I'm unable to use it with the Page Object Model: the page members are undefined.
In every example I've found, the call to Puppeteer and the browser and page objects live inside the main function. I guess this is a problem with context access.
Is it possible to achieve what I'm looking for and still be able to use at least the page object within the Page Object files?
main.mjs
import Browser from './browser.mjs'
import HomePage from './HomePage.mjs'
async function main () {
const browser = new Browser();
const homePage = new HomePage(browser.page);
await homePage.open();
console.log(await homePage.getTitle());
await browser.close();
}
main();
browser.mjs
class Browser {
constructor() {
return this.main(); // I know, that's ugly
}
async main() {
this.browser = await puppeteer.launch({
headless: false,
});
this.page = await browser.newPage();
return this.browser,this.page
}
}
export default Browser
HomePage.mjs
class HomePage {
constructor(page) {
this.page = page;
}
async open() {
this.page.goto('http://www.contoso.com');
}
async getTitle() {
return this.page.title();
}
}
export default HomePage
return this.browser,this.page
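Because of the comma operator, this returns only the last operand, this.page, and the browser is discarded. If you want access to both browser and page, return this instead. Your code should look like this:
return this
You can then access browser, page, and everything else from that object. Note that main() is async, so the constructor returns a promise: main.mjs has to use await new Browser() before reading browser.page.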
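An alternative that avoids returning a promise from the constructor is a static async factory. The sketch below uses hypothetical names (BrowserSession, Launcher) and minimal stand-in interfaces instead of the real Puppeteer types, so it runs without launching a browser.

```typescript
// Minimal stand-ins for the parts of Puppeteer's Page and Browser used here.
interface PageLike {
  title(): Promise<string>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
type Launcher = () => Promise<BrowserLike>;

class BrowserSession {
  readonly browser: BrowserLike;
  readonly page: PageLike;

  // Private: instances are only created through create(), once both
  // the browser and the page actually exist.
  private constructor(browser: BrowserLike, page: PageLike) {
    this.browser = browser;
    this.page = page;
  }

  // Does the async work first, then builds a fully initialised object.
  static async create(launch: Launcher): Promise<BrowserSession> {
    const browser = await launch();
    const page = await browser.newPage();
    return new BrowserSession(browser, page);
  }

  close(): Promise<void> {
    return this.browser.close();
  }
}
```

In main.mjs this would be used roughly as const session = await BrowserSession.create(() => puppeteer.launch({ headless: false })), followed by new HomePage(session.page).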