Background:
I am building a node.js-based Web app that needs to make use of various fonts. But it only needs to do so in the backend since the results will be delivered as an image. Consequently, the client/browser does not need access to the fonts at all in my case.
Question:
I will try to formulate the question as objectively as possible:
What are the typical options to provide a node.js backend with a large collection of fonts?
The options I came up with so far are:
Does one install these hundreds or thousands of fonts in the operating system of the (in my case: Ubuntu) server?
Does one somehow serve the fonts from a cloud storage such as S3 or (online) database such as a Mongo DB server?
Does one use a local file system to store the fonts and retrieve them?
...other options
I am currently leaning towards Option 1 because this is the way a layman like me does it on a local machine.
Without starting a discussion here, where could I find resources discussing the (dis-)advantages of the different options?
EDIT:
Thank you for all the responses.
Thanks to these, I noticed that I need to clarify something: I need the fonts to be used by SVG processing libraries such as p5.js, paper.js and raphael.js, so I need to make the fonts available to these libraries running on node.js.
The key to your question is
hundreds or thousands of fonts
Until I took that in, there was no real difference between your methods. But if that number is correct (kind of mind-boggling, though), I would:
Not install them in the OS. What happens if you move servers without an image? Or switch OS?
The local file system would be a sane way of doing it, though you would need to keep track manually of all the file names and paths in your code.
MongoDB: store the file names + paths in a collection, and store the actual font files on your file system.
In the event of moving servers you would have to take along the directory where all the actual font files are stored and the DB where you hold the file names + paths.
If you want, you can place it all in MongoDB, but then the database would also be huge, I assume; that is up to you.
Choice #3 is probably what I would do in such a case.
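For what it's worth, here is a minimal sketch of that split: paths in MongoDB, font files on disk. I'm assuming the official mongodb driver and node-canvas's registerFont, neither of which is named in the question, so substitute whatever mechanism your SVG library uses to load fonts.

    // Look up a font's path in MongoDB and register the file from disk.
    // Connection string, database and collection names are illustrative.
    const { MongoClient } = require('mongodb');
    const { registerFont } = require('canvas'); // node-canvas

    async function loadFont(family) {
      const client = await MongoClient.connect('mongodb://localhost:27017');
      try {
        // The collection stores only metadata; the font files live on disk.
        const doc = await client.db('assets').collection('fonts').findOne({ family });
        if (!doc) throw new Error(`Unknown font family: ${family}`);
        // Must be called before creating the canvas that uses the font.
        registerFont(doc.path, { family: doc.family }); // e.g. /srv/fonts/MyFont.ttf
      } finally {
        await client.close();
      }
    }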
If you have a decent enough server setup (e.g. a VPS or some other VM solution where you control what's installed) then another option you might want to consider is to do this job "out of node". For instance, in one of my projects where I need to build 175+ as-perfect-as-can-be maths statements, I offload that work to XeLaTeX instead:
I run a node script that takes the input text and builds a small but complete .tex file
I then tell node to call "xelatex theFileIJustMade.tex", which yields a PDF
I then tell node to call "pdfcrop" on that PDF, to remove the margins
I then tell node to call "pdf2svg", which is a free and amazingly effective utility
Then, as a final step, mostly to conserve space and bandwidth, I use "svgo", a node.js-based SVG optimizer that can run either as normal script code or as a CLI utility.
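A rough sketch of how such a chain can be driven from node, assuming xelatex, pdfcrop and pdf2svg are on the server's PATH (the function and file names are mine, not from the original pipeline):

    const { execFileSync } = require('child_process');
    const fs = require('fs');

    function typesetToSvg(texSource, jobName = 'statement') {
      fs.writeFileSync(`${jobName}.tex`, texSource);
      // .tex -> .pdf
      execFileSync('xelatex', ['-interaction=nonstopmode', `${jobName}.tex`]);
      // trim the page down to the content
      execFileSync('pdfcrop', [`${jobName}.pdf`, `${jobName}-crop.pdf`]);
      // .pdf -> .svg
      execFileSync('pdf2svg', [`${jobName}-crop.pdf`, `${jobName}.svg`]);
      return fs.readFileSync(`${jobName}.svg`, 'utf8');
    }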
Of course, depending on how responsive a system you need, you can do entirely without steps 3 and 5. There is a limit to how fast this pipeline can run, but as a server-side task there should never be the expectation of real-time responsiveness.
This is a good example of remembering that your server runs on top of a larger operating system that might also offer tools that can do the job. While you're using Node, and the obvious choice is a Node solution, Node is also a general-purpose programming environment and can call anything else through spawn and exec, much like Python, PHP, Java, C#, etc. As such, it's sometimes worth thinking about whether there might be another tool that is even better suited for your needs, especially when you're doing a highly specialized job like typesetting a string to SVG.
In this case, LaTeX was specifically created to typeset text from the command line, and XeLaTeX was created to do that with full Unicode awareness and clean, easy access to fonts both from file and from the system, with full OpenType feature control, so it would certainly qualify as just as worthwhile a candidate as any node-specific solution might be.
As for the tools used: XeLaTeX and pdfcrop come with TeX Live (installed using whatever package manager your OS uses, or through MiKTeX on Windows, though I suspect your server doesn't run on Windows), pdf2svg is freely available on GitHub, and svgo is available from npm.
Application:
I wish to publish a web application that takes input strings, searches for each string in about 5,000 plain-text files and returns the names of the files with matches. Each text file is about 4MB (uncompressed).
Problem:
In PHP, I could use exec("grep -l pattern dir/*") and get the job done. However, for cost reasons, I would go for a shared web-hosting plan, which normally does not allow executing external programs.
Could you please suggest any alternative to grep for a web environment?
I have understood the following so far:
A binary for any grep alternative (e.g. sift) could work. However, the problem of executing it on a shared server would remain.
The PHP function preg_match is inappropriate considering the large number of files and their size.
I am open to implementations of a grep-like function in other languages (e.g. Perl or JavaScript). However, I am not sure whether the performance would be comparable to grep and whether the problem of execution would remain.
I have looked at different web-hosting providers and understood that a virtual private server (VPS) might be the solution. However, the price of a VPS plan at all the hosting providers I have come across is unaffordable.
Any solutions or guidance for this problem?
Possible solutions depend on what your hosting provider offers and your budget.
Will you have an RDBMS available? You could then use full-text search, which many of them offer. If not, you could use SQLite, which has full-text search support.
If you have to stick to low-tech solutions, then the PHP solution linked on the right might work for you.
Perl has a File::Find module, which you could use.
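To make the SQLite suggestion concrete, here is a rough sketch in Node using the better-sqlite3 package and an FTS5 virtual table; the package choice and file layout are assumptions on my part, and note that FTS matches words and phrases rather than regular expressions. You index the ~5,000 files once, and subsequent queries stay fast without exec() or grep.

    const fs = require('fs');
    const path = require('path');
    const Database = require('better-sqlite3');

    const db = new Database('search.db');
    db.exec('CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(name, body)');

    // One-time (or on-change) indexing step over the directory of text files.
    function indexDirectory(dir) {
      const insert = db.prepare('INSERT INTO docs (name, body) VALUES (?, ?)');
      for (const name of fs.readdirSync(dir)) {
        insert.run(name, fs.readFileSync(path.join(dir, name), 'utf8'));
      }
    }

    // Returns the file names whose contents match the query, like grep -l.
    function search(query) {
      return db.prepare('SELECT name FROM docs WHERE docs MATCH ?').all(query)
               .map((row) => row.name);
    }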
I have a node.js app on Heroku that checks for emails, and if an email matches certain criteria it replies to it.
Because Heroku restarts the dyno every so often, I made a file that saves the ids of the emails I've already checked (a small array), so the app doesn't reply twice to the same email. But silly me, Heroku resets that file too, so no changes my app makes to it are saved.
Is there a way to persist the changes the app makes to that file?
Or do you know of a better way to do what I want?
Heroku enforces this behavior (the local file deletion stuff) because it is a best practice. Writing files locally doesn't scale well, and can lead to some odd edge-case behaviors when you have multiple processes on the same VM all performing file I/O operations.
What you should use instead is either a database, a cache (like Redis), or a file storage service like Amazon S3 that you write your file to directly.
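For your particular case (a small set of already-handled email ids), Redis is probably the least effort. A minimal sketch using the redis npm package (v4 API assumed) and a Redis set; the key name and the REDIS_URL environment variable are just examples:

    const { createClient } = require('redis');

    const client = createClient({ url: process.env.REDIS_URL }); // e.g. the Heroku Redis add-on URL

    // Returns true if this email id was seen before; records it otherwise.
    async function alreadyReplied(emailId) {
      if (!client.isOpen) await client.connect();
      // sAdd returns 1 if the id was new, 0 if it was already in the set.
      const added = await client.sAdd('replied-email-ids', emailId);
      return added === 0;
    }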
I realize it sounds annoying that you have to do these extra things even for a simple use case like yours, but Heroku's platform is geared around enforcing best practices to help people build scalable, reliable software.
If you're looking for a way to do stuff like this without the extra hassle, you might want to consider just purchasing a small VPS server from another company where you can have direct control over processes, disk, etc.
I just started my adventure with frontend development, most likely web design. I've been struggling with one technical question and haven't yet found a reasonable answer.
There are so many libraries you can load or download to make your web development faster. Hence my question:
Is it better to link to these libraries (e.g. Bootstrap, jQuery, Angular, fonts from Google and so on) externally from the official source, or to download them, upload them to your server, and then link to the local files (internal source) on your server?
My intuition tells me that downloading them, uploading them to my server, and linking to them there would make the whole website load quicker. Is that good thinking?
Pros of linking to externally hosted resources (be it JS libraries, images or whatever):
Spreading the load: your server doesn't have to serve all the content; it can concentrate on its main functionality and serve the application itself.
Spreading the HTTP connections: browsers limit the number of parallel HTTP connections per host, so with increasingly asynchronous applications it helps to deliver application data from your own server and load the additional resources from other hosts.
As Rafael mentioned above, CDNs scale very well and seldom go offline.
Cons:
Even with fast internet connections there is a high chance that resources will be served faster when they are located on the same Intranet. That's why some companies have their own "Micro-CDNs" inside their local networks to combine the advantages of multiple servers and local availability.
External dependency: as soon as the Internet connection becomes unavailable or a proxy server goes down, all external resources become unavailable, leaving the application in a broken state.
Sometimes it may actually be faster if you link from an external source. That's because the browser caches recently accessed data, and many sites use Bootstrap, jQuery and the like, so there is a good chance they are already cached. This is less likely with less popular libraries.
Keep in mind, though, that since you're downloading from external sources, you're at the mercy of their servers. If for one reason or another a source goes offline, your page won't work correctly. CDNs are not supposed to go offline for that very reason, but it's good to be aware of it. Also, if you're working on your page while offline, you won't be able to load those resources during development.
For security, it is generally better to download these files and host them locally when developing an application, so that you do not have to depend on the third-party server that hosts the CDN.
Talking about performance, using a CDN might be beneficial because the libraries that you require might already be cached in the user's browser, so the time to fetch the files is saved. But if the files are only available locally, loading them will definitely take time and space.
https://halfelf.org/2015/cdn-vs-local/
https://www.sitepoint.com/7-reasons-not-to-use-a-cdn/
I agree with Rafael's answer above, but wanted to note a few benefits of serving up these libraries locally that he or she omitted.
It is still considered best practice (until HTTP/2 becomes widespread) to minimize the number of downloads your site makes by concatenating many files into a single file. So, if you are using three JavaScript libraries/frameworks (for instance, Angular, jQuery and Moment.js) from a CDN, that is three separate script elements pulling down three separate .js files. However, if you host them locally, you can have your build process include a step that bundles the three libraries together into a single file called "vendor.js" or something along those lines, as sketched below. This has the added bonus of simplifying dependency loading to some degree, as you can concatenate them in a specific order should the need arise.
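A sketch of what such a bundling step could look like (file names are made up, and in practice you would likely let a task runner such as Gulp, Grunt or webpack do this for you):

    const fs = require('fs');

    // Concatenate the locally hosted libraries, in dependency order,
    // into a single vendor.js that the page loads with one request.
    const vendorFiles = [
      'vendor/jquery.min.js',
      'vendor/angular.min.js',
      'vendor/moment.min.js',
    ];

    const bundle = vendorFiles
      .map((file) => fs.readFileSync(file, 'utf8'))
      .join('\n;\n'); // the stray semicolon guards against files missing a trailing one

    fs.writeFileSync('public/vendor.js', bundle);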
Finally, although it is a little advanced if you are just getting started, if you are considering hosting your library files with your project it would definitely be worth looking into bower (https://bower.io/docs/api/). It is a Node-based package manager that allows you to define which packages need to be included in your project and then install them with a single command, which is particularly useful for keeping library files out of your version control. Good luck!
I'm developing a Cordova app for Android (so it's all HTML/CSS/Javascript code).
This app is going to feature content that I don't want to be freely distributed on the internet, mostly audio, video and some XML files.
Although that content will be loaded from a server and other content providers, a user could unzip the APK, look into the www folder, analyze the source code (mostly jQuery and jQuery Mobile stuff), find the direct paths to all that content, and then easily download it. Those paths might be inside the JavaScript code or inside XML files.
Is there any way to prevent this? I know of JS obfuscators, but I believe that they're pretty easy to reverse.
I think you've pretty much answered your own question. Obfuscation is the only way to "protect" the JavaScript code, and there really is no way to protect the content. You could try encryption, but the JavaScript code to decrypt it will be exposed, so that solution is practically useless.
Perhaps one option is to encrypt content on the server with a key provided by the user, then download it on the app's first run. This has obvious drawbacks as well: Some kind of separate user registration or account is required, entering a password every time the app starts is inconvenient, dealing with lost passwords, et cetera.
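If you do go down that road, the server-side half could look roughly like this, using Node's built-in crypto module. The key derivation parameters and names are only illustrative, and remember the decryption code shipped in the app remains exposed:

    const crypto = require('crypto');

    // Encrypt an asset with a key derived from a user-supplied password,
    // before it is downloaded on the app's first run.
    function encryptAsset(plaintext, password) {
      const salt = crypto.randomBytes(16);
      const key = crypto.scryptSync(password, salt, 32); // derive a 256-bit key
      const iv = crypto.randomBytes(12);
      const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
      const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
      // The client needs salt, iv and the auth tag (none are secret) to decrypt.
      return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
    }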
There are lots of obfuscation libraries for Javascript, just Google for them.
"Resources are world-readable by design.
Even if you were to not package the ""images or soundFX files"" as resources but were to download them on first run,
users with root access could still get to the files.
Since this is not significantly different than any other popular operating system humanity has developed,
it is unclear why you think this is an Android problem.
Sufficiently interested users can get at your ""images or soundFX files"" on iOS, Windows, OS X, Linux, and so on."
To learn node.js, I am writing a web site that allows users to play the online game Mafia. For those unfamiliar with Mafia, it is a game most commonly played on forums, and pits an uninformed majority (the "Town") against an informed minority (the "Mafia"). However, although this is an accurate brief overview, in fact every game session can exhibit widely varying house rules that can dramatically change the game mechanics.
I want my website to be able to handle all of these variations. At first I planned for my website to implement a comprehensive framework that could run all Mafia variants itself. However, after going over a ton of rule sets for finished games archived on several different forums, I realized that the space of reasonable rules and gameplay mechanics is so huge that I would essentially have to create a new domain-specific programming language to allow all possible variants. Inventing a new language for an otherwise straightforward personal project is rather silly and not something I'm interested in at the moment, especially given I have a perfectly good language at hand, namely JavaScript.
Therefore, I decided to let variant authors upload a JavaScript file containing the variant code that my website will call into at the appropriate points. Essentially, JavaScript modules implementing Mafia variant game logic (which my website code will require()) will act as a scripting language for my web site's "game engine". Think Lua for C++ games. Unfortunately, this introduces a severe security problem. Unlike in the browser, node-run JavaScript has access to the file system, the network, etc. So it would be trivial for a malicious user to upload a variant file that deletes the contents of my hard drive, or starts Bitcoin mining, or whatever.
My first thought was to do a replace() over each user's uploaded code, turning dangerous library names such as 'fs' and 'http' into invalid strings, and catch the consequent exceptions when I try to load the file. However, this ad-hoc blacklisting technique feels like the kind of approach that one of the many people smarter and more knowledgeable than me would be able to overcome in a heartbeat. What I really need is a way to whitelist the libraries that user-uploaded code has access to. Is there a way to do this using JavaScript in node.js? If not, how would you recommend I secure the computer my node server will be running on as much as possible?
My current strategy is to require myself and a small number of trusted users to review and then vote unanimously in favor of user-uploaded JavaScript variant code before it is brought into the system, but I'm hoping there is a more automatic way of doing it.
You need to use the vm module for this. Basically it allows you to run scripts in customized contexts, so you can put in whatever globals you want, define your own require, etc.
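A minimal sketch of what that can look like; the module whitelist below is illustrative, and note that vm on its own is not a hardened security boundary:

    const vm = require('vm');

    // Only modules in this map can be require()d by variant code.
    const allowedModules = { path: require('path') };

    function runVariant(code) {
      const sandbox = {
        console,
        require: (name) => {
          if (!(name in allowedModules)) throw new Error(`Module not allowed: ${name}`);
          return allowedModules[name];
        },
      };
      vm.createContext(sandbox);
      // The timeout catches simple infinite loops, but not all forms of abuse.
      new vm.Script(code).runInContext(sandbox, { timeout: 100 });
      return sandbox; // whatever the variant defined on its globals
    }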
You should also remember that in node.js it's possible to harm your app without any libraries: a user can simply add something like while (true) {}, which will block the whole process. So you need to run all untrusted code in separate processes and be ready to kill them when they start to abuse CPU or memory.
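For the separate-process part, something along these lines; variant-worker.js is a hypothetical entry point that evaluates the code it receives and posts back a result:

    const { fork } = require('child_process');

    function runVariantInWorker(code, timeoutMs = 2000) {
      return new Promise((resolve, reject) => {
        const worker = fork('./variant-worker.js');
        const timer = setTimeout(() => {
          worker.kill('SIGKILL'); // hard stop for while (true) {} style abuse
          reject(new Error('variant timed out'));
        }, timeoutMs);

        worker.once('message', (result) => {
          clearTimeout(timer);
          worker.kill();
          resolve(result);
        });

        worker.send({ code });
      });
    }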