Next.js getStaticPaths creates heavy static data? - javascript

I have a page that displays hotel information by id, so I use getStaticPaths to create paths like /hotel-info/542711, where 542711 is the hotel id.
The problem is that there are thousands of hotels. Will Next.js pre-build all of those thousands of pages? (Incremental Static Generation)
Is there a memory problem from storing that many pre-built pages? ...

getStaticPaths() pre-renders all the paths at build time only.
Depending on whether or not you choose to pre-build all of them, yes, you could end up with thousands of pre-built pages in your build folder.
So unless performance is extremely important to you and you own a data center to store all of that, I wouldn't recommend doing so.
You don't have to use getStaticPaths(), and if you do, you should use it wisely.
A good use case is to pre-generate only your most popular pages and leave the less popular ones to be generated once they are actually visited for the first time.
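For illustration, here is a minimal sketch of that idea using Next.js's fallback: 'blocking' option; fetchPopularHotelIds and fetchHotelById are hypothetical helpers standing in for your own data layer:

// pages/hotel-info/[id].js
export async function getStaticPaths() {
  // Pre-build only the most popular hotels (hypothetical helper)
  const popularIds = await fetchPopularHotelIds();
  return {
    paths: popularIds.map((id) => ({ params: { id: String(id) } })),
    // Any other hotel page is rendered on its first request, then cached
    fallback: 'blocking',
  };
}

export async function getStaticProps({ params }) {
  const hotel = await fetchHotelById(params.id); // hypothetical helper
  // revalidate enables Incremental Static Regeneration for stale pages
  return { props: { hotel }, revalidate: 60 };
}

With fallback: 'blocking' the build stays small, and pages accumulate in the cache only as they are actually requested.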

Related

Node.js - SQL queries in a separate file?

I would like to know what you think about the following task. I want to write data from a JSON object into a database, and I would like to separate the SQL logic from the business logic.
I have read that this strategy performs poorly when the js file contains a lot of queries.
Which approach is best practice in your opinion? Can you provide a small example?
Your performance question is definitely a 'race your horses' scenario (i.e. test it and see). But in general, if you're going to do this I'd simply export an object with all your named queries like so:
// db/queries.js
module.exports = {
  getAllUsers: "SELECT username, email, displayName FROM users;",
  /* other queries */
};
Your calling code can then just require that file and get what it needs:
const queries = require('./db/queries');
queries.getAllUsers // <-- this is now that string
Performance should be about as good as it gets, since your require cache will ensure the file is only read once, and a key-based lookup in JS is pretty quick, even with a thousand or two entries.
I think it is always good practice to separate DB code from business code, and from API code if it exists.
By creating these different layers you get several advantages (a rough sketch follows below):
Testing every layer separately (with unit tests), mocking the other layers. With this you can detect errors very quickly when you change your code.
You can very easily change your DB connector, or even your database, without impacting your business code (e.g. swap MySQL for MongoDB).
You can change your API, or add a new one, without changing your business code (e.g. replace a REST API with GraphQL, or run both).
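As a rough illustration of those layers in Node.js (all file and function names here are made up, and the db layer assumes a pg-style connection pool):

// db/users.js - data layer: the only place SQL lives
const pool = require('./pool'); // assumes a configured pg-style pool
exports.findAll = () => pool.query('SELECT username, email FROM users');

// domain/users.js - business layer: no SQL, no HTTP
const usersDb = require('../db/users');
exports.listUsersWithEmail = async () => {
  const { rows } = await usersDb.findAll();
  return rows.filter((user) => user.email); // example business rule
};

// api/users.js - API layer: only HTTP concerns
const users = require('../domain/users');
module.exports = (app) =>
  app.get('/users', async (req, res) => res.json(await users.listUsersWithEmail()));

Each layer only requires the one directly below it, so a unit test can mock db/users.js without a real database.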
If you want to see a project with these layers, we recently published a simple project that lets you create a collaborative newsletter. You can check the backend part, which has a db folder, a domain folder and an api folder. Those are the 3 layers I was talking about:
Collaborative newsletter
Hope it helps you

Schema for handling duplicate filenames on user import?

I am developing a presentation-style app (e.g. PowerPoint, Keynote) in Electron where the user can import their image files into a project folder, and I am wondering how to handle potential file name conflicts.
I don't want to reinvent the wheel: are there schemas, design patterns or frameworks for this sort of thing? For example, on OS X, iPhoto uses the date and time of import plus seemingly random(?) ids to organize imported images.
I will be implementing this in JavaScript, but I am primarily interested in how to approach the problem, so the language doesn't matter.
iPhoto Library: (screenshot of the iPhoto Library's dated folder structure omitted)
There are two approaches I am aware of:
Unique folder names
By placing each file in a folder with a unique name, you can preserve the original file name and avoid any clashes. This method will make things much easier for you if you want to store associated files (e.g. image thumbnails) with the original.
You can generate unique folder names using a UUID, which is a string that is virtually guaranteed not to clash. There are plenty of Node.js libraries to generate a UUID, such as uuid. Or just use any random string and check if the folder already exists to be sure.
Alternatively, as in the example you gave, the folder names could be generated from the date and time, but then you have to be sure that you only ever add one item at any point in time, or add random sub-folders under the date as in the iPhoto example. The dated folders in iPhoto are probably unnecessary given that they also use random strings as folder names, but they make the library friendlier for a user browsing the folders manually, and there may be performance benefits if iPhoto needs a directory listing for a specific date.
You need to store a reference to both the folder name and the file name in order to load the file, but of course this could be one string e.g. "6c84fb90-12c4-11e1-840d-7b25c5ee775a/image.jpg".
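A minimal sketch of the unique-folder approach with the uuid package mentioned above (importFile and the paths are illustrative):

const { v4: uuidv4 } = require('uuid');
const path = require('path');
const fs = require('fs/promises');

async function importFile(srcPath, libraryRoot) {
  const folderName = uuidv4(); // e.g. '6c84fb90-12c4-11e1-840d-7b25c5ee775a'
  const folder = path.join(libraryRoot, folderName);
  await fs.mkdir(folder, { recursive: true });
  const fileName = path.basename(srcPath); // original name is preserved
  await fs.copyFile(srcPath, path.join(folder, fileName));
  return `${folderName}/${fileName}`; // the reference you store
}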
Unique file names
Another technique is to rename files to have a unique name whenever there is a clash. This is the approach used by the macOS Finder when you create new folders or duplicate a file. This approach is usually best if the user may interact directly with the files, as they will not have to navigate through folders with meaningless names.
As a simple example, let's say I am adding photos of penguins, and I've already added a photo called penguin.jpg.
Now I add a second photo which also happens to be called penguin.jpg.
Check if penguin.jpg exists. It does, so...
Generate a new name for the file, penguin-2.jpg
Check if penguin-2.jpg exists. It doesn't, so...
Save the new file as penguin-2.jpg
If I add more files also called penguin.jpg, the program keeps incrementing the name until it finds one which does not exist (e.g. penguin-3.jpg). This should not cause any performance issues unless you are adding thousands of files with the same name (which seems unlikely).
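A minimal sketch of that loop (synchronous for clarity; uniqueName is a made-up helper name):

const fs = require('fs');
const path = require('path');

function uniqueName(dir, filename) {
  const ext = path.extname(filename);        // '.jpg'
  const base = path.basename(filename, ext); // 'penguin'
  let candidate = filename;
  for (let i = 2; fs.existsSync(path.join(dir, candidate)); i++) {
    candidate = `${base}-${i}${ext}`;        // 'penguin-2.jpg', 'penguin-3.jpg', ...
  }
  return candidate;
}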
I found a Node.js module which can handle this approach for you: uniquefilename

i18n in React with .properties files

I am in the middle of a (large) app rewrite into React/Redux, and internationalization is the next problem.
I've been looking at some currently available libraries (redux-react-i18n, i18n-react) but none of them seems to fit.
Why? Because my localized strings are stored in separate .properties files, and this cannot be changed. However, it is possible to generate whatever format I need from these files at compile time.
Example en_US.properties:
key1=This is a constant string
key2=This is a string with {parameter}
....
and similarly with the de_de.properties file, and so on.
Also, the language can only change on a page refresh, which makes things a little easier.
My question is how to approach this problem. My first naive approach would be to generate a static constant js object available globally in the app, but I have a feeling that's against JavaScript best practices, and I have no idea how to deal with the parametrized strings.
As I'm fairly new to JavaScript, I'd like to hear any ideas.
In case somebody has the same problem:
I ended up writing a script that converts the .properties files into JSON files.
Then in the React code I created a HOC (higher-order component) which gets keys (or namespaces, depending on how you organize your JSON files) as a parameter and fetches the values from the server.
These keys usually cover a whole page, but sometimes a single component if that makes sense.
All it takes is one more HTTP request, and you can also cache the result.
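For anyone curious, a minimal sketch of the conversion step (file names assumed; a real .properties parser would also handle escapes and multi-line values):

const fs = require('fs');

function propertiesToJson(src, dest) {
  const translations = {};
  for (const line of fs.readFileSync(src, 'utf8').split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue; // skip blanks and comments
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue; // not a key=value line
    translations[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
  }
  fs.writeFileSync(dest, JSON.stringify(translations, null, 2));
}

propertiesToJson('en_US.properties', 'en_US.json');

// Parametrized strings like key2 can then be filled in at render time:
const interpolate = (str, params) =>
  str.replace(/\{(\w+)\}/g, (_, name) => params[name] ?? `{${name}}`);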
Hope it helps

How to make a multilingual website?

I am an amateur programmer and I want to make a multilingual website. My questions are (for my purposes, let the English website be website no. 1 and the Polish one no. 2):
1. Should it be en.example.com and pl.example.com, or maybe example.com/en and example.com/pl?
2. Should I make the full website in one language and then translate it?
3. How do I translate it? Using XML or what? Should website 1 and website 2 be different html files, or is there a way to translate an html file and then show the translation using XML or something?
If you need any code or anything else, tell me. Thank you in advance :)
1) I don't think it makes much difference. The most important thing is to ensure that Google can crawl both languages, so don't rely on JavaScript to switch between languages; have everything done server-side so both languages can be crawled and ranked by Google.
2) You can do one translation and then the other; you just have to ensure that the layout expands to accommodate more or less text without breaking. Maybe use lorem ipsum whilst designing.
3) I would put the translations into a database and then fetch the particular translation depending on whether it is EN or PL in the domain name. Ensure that the webpage and the database both use UTF-8 encoding, otherwise you will find 'funny' characters being displayed.
My advice is to use a framework.
For instance, if you use CakePHP, then you write
__('My name is')
and in the translation file
msgid "My name is"
msgstr "Nazywam się"
Then you can easily translate it into any other language, and it's pretty easy to implement.
Also, if you do not want to use a framework, you can check this link to see an example of how it works:
http://tympanus.net/codrops/2009/12/30/easy-php-site-translation/
While this question is probably not a good SO question due to its broad nature, it might be relevant to many users.
My approach would be templates.
Your suggestion of having two html files is bad for the obvious reason of duplication: if you need to change something on your site, you would always need to change two html files. Bad.
Having one html file and then parsing it and translating it sounds like a massive headache.
Some templating framework could help you massively. I have been using Smarty, but that's a personal choice and there are many options here.
The idea is you make a template file for your html and instead of actual content you use labels. Then in your php code you include the correct language depending on cookies, user settings or session data.
Storing the labels is another issue. Storing them in a database is a good option; however, remember that you do not want to make hundreds of queries against the database to fetch each label. What you can do is store them in a database and have it generate a language file (an array of labels->translations) for faster access, regenerating these files whenever you add or update labels.
Or you can skip the database altogether and just store them in files; however, as these grow they might not be as easy to maintain.
I think the easiest mistake for an "amateur programmer" to make in this area is to allow the two (or more) language versions of the site to diverge. (I've seen so-called professionals make that mistake too...) You need to design it so everything that's common between the two versions is shared, so that when the time comes to make changes, you only need to make the changes in one place. The way you do this will depend on your choice of tools, and I'm not going to start advising on that, because it depends on many factors, for example the extent to which your content is database-oriented.
Closely related to this are questions of who is doing the translation, how the technical developers and the translators work together, and how to keep track of which content needs to be re-translated when it changes. These are essentially process questions rather than coding questions, so not a good fit for SO.
Don't expect that a site for one language can be translated without any technical impact; you will find you have made all sorts of assumptions about the length of strings, the order of fields, about character coding and fonts, and about cultural things like postcodes, that turn out to be invalid when you try to convert the site to a different language.
You could make 2 different language files and use PHP constants to define the text you want to translate, for example:
lang_pl.php:
define("_TEST", "polish translation");
lang_en.php:
define("_TEST", "English translation");
Now you can let the user choose between the Polish and English translations, and based on that choice you include the corresponding language file.
So where you would otherwise hard-code text, you output _TEST (between PHP tags) instead,
and it will show the translation for the chosen language.
The place I worked at did it like this: they didn't have too much writing on their site, so they kept it in a database. As your tags include php, I assume you know how to use databases. They had a mysql table called languages with a language id (in your case 1 for en and 2 for pl) and the texts in columns, so the column names were main_heading, intro_text, about_us... When a user arrives and selects a language, a PHP request gets the page content in that language. This is easy because your content is static and can be translated before the site goes online; if your content is dynamic (users add content), you may need machine translation, because you cannot expect your users to write their entries in all languages.

Best way to scrape a set of pages with mixed content

I’m trying to show a list of lunch venues around the office with today’s menus. The problem is that the websites offering the lunch menus don’t always offer the same kind of content.
For instance, some of the websites offer a nice JSON output. Look at this one: it offers the English and Finnish course names separately, and everything I need is available. There are a couple of others like this.
But others don’t always have such nice output, like this one. The content is laid out in plain HTML, the English and Finnish food names are not in any particular order, and food properties like (L, VL, VS, G, etc.) are just normal text, like the food names.
What, in your opinion, is the best way to scrape all this data in its different formats and turn it into usable data? I tried to make a scraper with Node.js (& phantomjs, etc.), but it only works with one website, and it’s not that accurate with the food names.
Thanks in advance.
You could use something like kimonolabs.com; it is much easier to use, and it gives you APIs to keep your site updated.
Remember that such services work best for tabular data.
There may be simple algorithmic solutions to the problem. If there is a list of all available food names, that can be really helpful: you just find the occurrences of those food names inside a document (for today's menu).
If there is no such food list, you may use TF/IDF. TF/IDF lets you calculate the score of a word inside a document relative to the current document and the other documents. But this solution needs enough data to work.
I think the best solution is something like this (see the sketch after this list):
Create a list of all the websites that should be scraped.
Write a driver class for each website.
Each driver has the duty of creating the general domain entity from its site's particular document format.
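A minimal sketch of that driver idea in Node.js (site names, URLs and fields are all made up; assumes Node 18+ for the global fetch):

// Every driver returns menus in the same normalized shape,
// so the rest of the app doesn't care where a menu came from.
const drivers = {
  niceJsonSite: async () => {
    const res = await fetch('https://example.com/menu.json'); // hypothetical endpoint
    const data = await res.json();
    return data.courses.map((c) => ({ nameEn: c.title_en, nameFi: c.title_fi }));
  },
  plainHtmlSite: async () => {
    // fetch the page and parse the HTML here (e.g. with cheerio),
    // then map the result into the same { nameEn, nameFi } shape
    return [];
  },
};

async function todaysMenus() {
  const entries = Object.entries(drivers);
  const menus = await Promise.all(entries.map(([, fetchMenu]) => fetchMenu()));
  return Object.fromEntries(entries.map(([site], i) => [site, menus[i]]));
}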
If you can use PHP, Simple HTML DOM Parser along with Guzzle would be a great choice. These two give you a jQuery-like selector syntax and a nice wrapper around HTTP.
You are touching on a really difficult problem. Unfortunately, there are no easy solutions.
Actually there are two different parts to solve:
data scraping from different sources
data integration
Let's start with the first problem: data scraping from different sources. In my projects I usually process data in several steps. I have dedicated scrapers for each specific site I want, and I process the data in the following order:
fetch raw page (unstructured data)
extract data from page (unstructured data)
extract, convert and map data into page-specific model (fully structured data)
map data from fully structured model to common/normalized model
Steps 1-2 are scraping-oriented, and steps 3-4 are strictly data-extraction / data-integration oriented. In code, that pipeline might look like the sketch below.
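A minimal sketch of those four steps as a pipeline (every function name here is hypothetical):

// One pass over one site, from raw page to the normalized model
async function scrapeSite(site) {
  const html = await fetchRawPage(site.url);  // step 1: raw page (unstructured)
  const fragments = extractMenuSection(html); // step 2: relevant data (unstructured)
  const siteModel = site.parse(fragments);    // step 3: site-specific model (structured)
  return toCommonModel(siteModel);            // step 4: common/normalized model
}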
While you can implement steps 1-2 relatively easily using your own web scrapers or existing web services, data integration is the most difficult part in your case. You will probably require some machine-learning techniques (shallow, domain-specific Natural Language Processing) along with custom heuristics.
In the case of messy input like that second site, I would process the lines separately and use a dictionary to separate the Finnish and English words, then analyse what is left. But even then it will never be 100% accurate, due to the possibility of human input errors.
I am also worried that your stack is not very well suited to such tasks. For this kind of processing I use Java/Groovy along with integration frameworks (Mule ESB / Spring Integration) to coordinate the data processing.
In summary: it is a really difficult and complex problem. I would rather accept less input-data coverage than aim for 100% accuracy (unless it is really worth it).
