I'm working on a websocket based project. The main server, which includes both a web and websocket server, mainly acts as a forwarding hub to other websocket servers. These secondary websocket servers are not expected to also have web servers running, but are expected to be hosting files that may or may not need to be downloaded (transferred directly via websocket depending on need).
While the files aren't expected to be very large (the current test file hovers around 2 KB, but we expect a standard of around 10-20 KB, and possibly much larger if we allow encoding of images and other data-heavy material), files are expected to be requested from an array of hosts repeatedly. That is, clients may 'request' a file from multiple, independent websocket servers (not at the same time). However, a single client might be expected to request the same file from a single websocket server as many as 20 times a day or more.
So, to cut down on bandwidth, I am wondering if it is possible to cache these dynamic files doing purely client-side work.
With HTML5 you can leverage the browser's Application Cache by using a manifest.
In such a manifest you can specify files which should be cached on the client side. Invalidation of these caches happens through JavaScript, so that is also on the client side.
You'll find more on the Application Cache and manifest files here: http://www.html5rocks.com/en/tutorials/appcache/beginner/
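As a minimal sketch (the file names here are hypothetical), the page opts in with the manifest attribute:

<html manifest="example.appcache">

and example.appcache lists what to cache:

CACHE MANIFEST
# v1 - bump this comment to invalidate the cached copies
index.html
styles.css
app.js

NETWORK:
*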
I am serving an Angular app as static content from an Express server. When serving static files, Express adds an ETag to them by default, so each subsequent request first checks whether the ETag matches and, if it does, the files are not sent again. I know that a Service Worker works similarly, in that it tries to match a hash. Does anyone know the main difference between these two approaches (caching with ETags and caching with Service Workers), and when we should use one over the other? What would be the most efficient when it comes to performance:
Server side caching and serving Angular app static files
Implementing Angular Service Worker for caching
Do both 1 and 2
To give a better perspective, I'll address a third cache option as well, to clarify the differences.
Types of caching
Basically we have 3 possible layers of caching, listed by the priority in which they are checked from the client:
Service Worker cache (client-side)
Browser Cache, also known as HTTP cache (client-side)
Server side cache (CDN)
PS: Some browsers, like Chrome, have an extra memory cache layer in front of the Service Worker cache.
Characteristics / differences
The Service Worker cache is the most reliable of the client-side options, since it defines its own rules for how to manage caching and provides extra capabilities and fine-grained control over exactly what is cached and how.
Browser caching is driven by HTTP headers on the asset response (Cache-Control and Expires), but the main issue is that there are many conditions under which those headers are ignored.
For instance, I've heard that files bigger than 25 MB are normally not cached, especially on mobile, where memory is limited (I believe this is getting even stricter lately, due to the increase in mobile usage).
So between those two options, I'd always choose the Service Worker cache for its reliability.
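To illustrate that fine-grained control, here is a minimal cache-first sketch (the cache name and file list are made up, not from the question):

// sw.js
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open('app-cache-v1').then((cache) =>
      cache.addAll(['/index.html', '/styles.css', '/main.js'])
    )
  );
});

self.addEventListener('fetch', (event) => {
  // serve from the cache when possible, hit the network otherwise
  event.respondWith(
    caches.match(event.request).then((cached) => cached || fetch(event.request))
  );
});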
Now, on to the 3rd option: the CDN checks the HTTP headers, looking at the ETag for cache busting.
The idea of server-side caching is to call the origin server only when the asset is not found on the CDN.
Now, between the 1st and the 3rd, the main difference is that Service Workers work best for slow or failing network connections and for offline use, since the cache lives client-side: if the network is down, the service worker serves the last cached information, allowing for a smooth user experience.
Server-side caching, on the other hand, only works when we are able to reach the server, but the caching happens off the user's device, saving local space and reducing the application's memory consumption.
So as you can see, there are no right or wrong answers, just what works best for your use case.
Some Sources
MDN Cache
MDN HTTP caching
Great article from web.dev
Facebook study on caching duration and efficiency
Let's answer your questions:
what is the main difference between these two approaches (caching with ETag and caching with Service Workers)
Both solutions cache files; the main difference is whether you need to reach the server or can stay local:
For the ETag, the browser hits the server asking for a file and sends along a hash (the ETag). Depending on the file stored on the server, the server will answer either "the file was not modified, use your local copy" with a 304 HTTP response, or "here is a new version of that file" with a 200 HTTP response and the new file. In both cases the server decides, and the user waits for a round trip.
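The round trip looks roughly like this (the hash value is made up):

GET /main.js HTTP/1.1
If-None-Match: "33a64df5"

HTTP/1.1 304 Not Modified
ETag: "33a64df5"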
With the Service Worker approach you can decide locally what to do. You can write logic to control what and when to use a local (cached) copy, and when to go to the server. This is very useful for offline capabilities, since the logic happens in the client and there is no need to hit the server.
when we should use one over the other?
You can use both together. For example, define logic in the service worker: if there is no connection, return the local copies; otherwise, go to the server (see the sketch below).
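A sketch of that logic with a generic service worker (not the Angular service worker's actual implementation):

// sw.js: try the network first so the server's ETag logic applies;
// fall back to the local cache when the connection fails
self.addEventListener('fetch', (event) => {
  if (event.request.method !== 'GET') return;
  event.respondWith(
    fetch(event.request)
      .then((response) => {
        // keep a copy of successful responses for offline use
        const copy = response.clone();
        caches.open('fallback-v1').then((cache) => cache.put(event.request, copy));
        return response;
      })
      .catch(() => caches.match(event.request))
  );
});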
What would be the most efficient when it comes to performance:
Server side caching and serving Angular app static files
Implementing Angular Service Worker for caching
Do both 1 and 2
My recommended approach is to use both. However, treat your files differently: 'index.html' can change, so for it use the service worker (to cover the case where there is no internet access) and, when there is connectivity, let the web server answer with the ETag. All the other static files (CSS and JS) should be immutable, i.e. files whose local copy you can be sure is valid; for those, add a hash to each file's name (so every version is a unique file) and cache them aggressively. When you ship a new version of your app, you modify 'index.html' to point to the new immutable files.
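With Express, that split can be sketched in the static-file configuration. This is a minimal sketch, assuming a 'dist' output folder (a hypothetical path); etag, maxAge, immutable, and setHeaders are real serve-static options:

const express = require('express');
const app = express();

app.use(express.static('dist', {
  etag: true,       // default; shown for clarity
  maxAge: '1y',     // hashed bundles (e.g. main.abc123.js) never change
  immutable: true,
  setHeaders(res, filePath) {
    // index.html must always be revalidated, so the ETag round trip decides
    if (filePath.endsWith('index.html')) {
      res.setHeader('Cache-Control', 'no-cache');
    }
  }
}));

app.listen(3000);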
Some background: I am working with legacy code, and am attempting to upload a binary file (~2MB) to an embedded microhttpd web server via an HTTP form (POST request). Lately I've noticed that the upload speed from Windows 10 machines is significantly slower than from non-Windows 10 machines: the upload will send a handful of bytes at a time (about 6-7 chunks of ~1500 bytes each) and will then pause, sometimes for 30-60 seconds, before sending another handful of bytes. The decrease in speed caused by this issue renders the whole upload process unusable.
I performed some debugging on the embedded server and found that it was indeed waiting for more data to come in on the socket created for the POST request, and that for some reason this data was not being sent by the client machine. After some analysis in Wireshark on Win10 vs non-Win10 traffic, the messages I am seeing appear to tally up with the issue described by Microsoft here: https://support.microsoft.com/en-gb/help/823764/slow-performance-occurs-when-you-copy-data-to-a-tcp-server-by-using-a.
Specifically, I am seeing that in the case of Windows 10, the first TCP packet sent to the embedded web server is indeed "a single send call [that] fills the whole underlying socket send buffer", as per the Microsoft article. This does not appear to be the case for non-Windows 10 machines. Hence, I need to be able to set up my sockets so that the web client does not send so much data as to fill up the receive buffer in one packet.
Unfortunately, major modifications to the web server itself (aside from little config tweaks) are out of the question, since the legacy code I'm working with is notoriously coupled and unstable. Therefore, I'm looking for a way to specify socket settings via JavaScript, if this is possible. I'm currently using the jQuery Form plugin, which operates on top of XMLHttpRequests. Since I have complete control over both the JavaScript page and the embedded web backend, I can hard-code the socket buffer sizes appropriately in both cases.
Is there a way to tweak low-level socket settings like this from JavaScript? If not, would there be another workaround for this issue?
There is no way to do the TCP-stack-specific tuning you need from JavaScript running inside a browser on the client side. The browser simply does not allow this kind of access.
This is kind of a weird question I think to ask, but I have browsing about for the past some time and cannot find a clear definite answer.
I understand that a client connects to a server and communicates with the web server through sockets, and I kind of see how that works in PHP (I have never used PHP, but I have used sockets before, so I understand the concept).
The issue is I'm trying to get a real view of this.
The question is: do websites generally use sockets and contact a web server to fetch data or the actual HTML? Or is that a rare choice made only in some areas?
If it is generally used, then is the "real" JS usually on the server, or is it client-side (for performance's sake)?
Context:
Let me explain a bit where I'm coming from, I'm not a web expert, but I am a computer engineering student so most concepts are easy to understand. A "real"-er view of this would be very helpful.
Now, on to why I'm asking this. I'm developing a web app as part of a project and have made a fair bit of progress on it, but everything was done on a local dev server (so basically a client?).
I've started wondering about this because I want to use a database for my website, and since I want to connect to something, I will need to connect to a web server first (for security's sake).
My question's intent is to guide me on how and most importantly, where, to setup this server.
I don't think showing any code would help here, but assume I have my client running on localhost:1234 and my database on localhost:3306. I think I should have a web server on another port so I can establish this communication, but I want to do it in a clean and legitimate way, so that all of my current solutions can be ported online with little to no change (except the obvious).
There's a bunch to unpack here.
First of all, servers can be remote or local. Usually they are remote; local servers are mostly used for development purposes.
Even if your server is on your local machine, it still isn't the client. The client is the part that connects to your server; for web development it is usually the user's browser.
JavaScript is a language that can be used server-side, with a Node.js server, but more often client-side, in the user's browser.
Your website, or web application, communicates with your server through various means. The most common one is the HTTP protocol, used to make server requests such as data requests to populate your page (in the case of an API server, REST or otherwise), or simply to request the actual page to display in the browser. The HTTP protocol works by resolving URLs and making requests to the server registered at that URL, using methods such as GET, POST, DELETE, etc.
Sockets are used to create a persistent connection with your server that works both ways. They are mostly used for realtime updates, such as a live chat, as they allow you to push updates from the server instead of having the client request everything.
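To make the contrast concrete, a small sketch (the URLs are made up): the first call is a one-off HTTP request; the second keeps a connection open so the server can push.

// one-off HTTP request: client asks, server answers, done
fetch('https://example.com/api/messages')
  .then((res) => res.json())
  .then((messages) => console.log(messages));

// persistent socket: the server can push updates at any time
const socket = new WebSocket('wss://example.com/chat');
socket.onmessage = (event) => console.log('server pushed:', event.data);
socket.onopen = () => socket.send('hello');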
In most cases the database is found on the same server as the one serving the website or application, as that is a lot easier to handle and often faster, avoiding extra network requests to get the data. However, it can be placed on another server, with its own API to get the data (not necessarily web-related).
Ports such as 1234 or 3306 are often used for local development; however, once you move your project to a hosting service, these are usually replaced by URLs, and the hosting service will provide you with a config to access the associated database. If you are building your own server you might still use ports. It is heavily dependent on your server config.
Hope this clears some things up.
In addition to @Morphyish's answer: in the simplest case, a web browser (the client) requests a URL from a server. The URL contains the domain name of the server and some parameters. The server responds with HTML code. The browser interprets the code and renders the webpage.
The browser and the server communicate using the HTTP protocol. HTTP is stateless: each request is independent of the previous ones (classically the connection was closed after each request; modern HTTP reuses connections, but no state is carried over).
The server can respond with static HTML, e.g. by serving a static HTML file, or with dynamic HTML. Serving dynamic HTML requires some kind of server language (e.g. Node.js, PHP, Python) that essentially concatenates strings to build the HTML code. Usually, the HTML is created by filling templates with data from the database (e.g. MySQL, Postgres).
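A toy example of that string concatenation with Node's built-in http module (the data is hardcoded here, where a real app would query the database):

const http = require('http');

http.createServer((req, res) => {
  const user = { name: 'Ada' }; // a real app would fetch this from MySQL/Postgres
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end('<html><body><h1>Hello, ' + user.name + '!</h1></body></html>');
}).listen(8080);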
There are countless languages, frameworks, libraries that help to achieve this.
In addition to HTML, the server can also serve JavaScript that is interpreted in the browser and adds dynamics to the webpage. However, there are two kinds of JavaScript that should not be mixed: Node.js JavaScript runs on the server and formats the server response, while client JavaScript runs in the browser. Remember, client and server are completely isolated and can communicate only through an HTTP connection.
That said, there are ways to make persistent connections between client and server, with WebSockets, and to add all kinds of exotic solutions. The core principle remains the same.
It does not matter whether the server software (e.g. Apache, nginx) is running on your local machine or anywhere else. The browser makes a request to an address, and DNS and the network stack figure out how to reach the server and make it work.
My two server machines are deployed behind a load balancer for high availability. When a user uploads a file through the browser, the file is stored on either of the server machines behind the load balancer.
I want to send that file as an email attachment to the user using Java, but I don't know how to get the file path behind the load balancer.
I faced this problem recently and did some research. The best solution I found was to keep shared storage on a third machine that is accessible by both of your server machines. The file would be read and written from that machine only, and its location kept either in central logs or in a central database.
Note that the way you are trying to do this requires the two machines to communicate with each other (network communication), which would increase the request service time. It also undermines the "high availability" requirement, because one download request would involve both server machines (in the worst case).
To use socket.io on the client side, we usually start a Node.js server and go like this:
<script src="/socket.io/socket.io.js"></script>
or with specific port:
<script src="http://localhost:3700/socket.io/socket.io.js"></script>
Question is:
is it necessary to use a Node.js server to serve socket.io.js?
...or is it possible to
make a local copy of socket.io.js instead of going to the server every single time we need socket.io?
Like, we go to View Source, copy everything we get from the source of the script tag,
then paste and save it as socket.io-local.js so that next time we use:
<script src="socket.io-local.js"></script>
Will that work?
Updates
Thanks for everyone's great responses.
I'm asking this because, in the case I'm involved in, I don't actually have access to the server:
I am writing the client side to connect to another developer's socket server, which is written in Java.
Therefore I'll have to find a way to work around the fact that I don't have a server there for me.
From what I've been testing, this approach seems to work, but I really don't know what's happening behind the scenes.
You obviously can host the socket.io client library anywhere and pull it into a page. However, it will almost certainly not work with your Java-based server.
To understand why, you need to understand what socket.io is really doing behind the scenes; the client library is only a small part of it.
Socket.io actually defines and implements its own protocol for realtime communication between a browser and a server. It does so in a way that supports multiple transports: if—for example—a user's browser or proxy doesn't support WebSockets, it can fall back to long polling.
What the socket.io client actually does is:
Makes an XHR GET request for /socket.io/1. The server responds with a session ID, configured timeouts, and supported transports.
The client chooses the best transport that the user's browser supports. In modern browsers, it will use WebSockets.
If WebSockets are supported, it creates a new WebSocket to initiate a WebSocket connection (HTTP GET with Upgrade: websocket header) to a special URL – /socket.io/1/websocket/<session id>.
If WebSockets aren't supported by the browser or fail to connect (there are lots of intermediaries in the wild, like proxies, filters, and network security devices, that don't support WebSocket requests), the library falls back to XHR long polling and makes an XHR request to /socket.io/1/xhr-polling/<session id>. The server does not respond to the request until a new message is available or a timeout is reached, at which point the client repeats the XHR request.
Socket.io's server component handles the other end of that mess. It handles all the URLs under /socket.io/, setting up sessions, parsing WebSocket upgrades, actually sending messages, and a bunch of other bookkeeping.
Without all of the services provided by the socket.io server, the client library is pretty useless. It will just make an XHR request to a URL that doesn't exist on your server.
My guess is that your Java-based server just implements the WebSockets protocol. You can connect directly to it using the browser-provided WebSocket APIs.
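If that guess is right, the connection is a few lines with the native API, with no socket.io involved (the URL and messages here are assumptions about your setup):

const ws = new WebSocket('ws://localhost:8080/endpoint');

ws.addEventListener('open', () => ws.send('hello'));
ws.addEventListener('message', (event) => console.log('received:', event.data));
ws.addEventListener('close', () => console.log('connection closed'));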
It is possible that your server does implement the socket.io protocol – there are a few abandoned Java projects that do – but that's unlikely. Talk with the developer of your server to find out exactly how he's implemented his "socket server".
A standalone build of socket.io-client is exposed automatically by the socket.io server as /socket.io/socket.io.js. Alternatively you can serve the file socket.io-client.js found at the root of this repository.
https://github.com/LearnBoost/socket.io-client
I have a module called shotgun-client that actually wraps socket.io. I needed to serve a custom client script as well as the socket.io client script, but I didn't want every user of my module to have to include multiple script references on their pages.
I found that, once it's installed, you can serve the generated client script from socket.io by reading the file /node_modules/socket.io/node_modules/socket.io-client/dist/socket.io.js. So my module adds a listener for its own URL, and when it serves my custom client script it also serves the socket.io client script with it. Voila! Only a single script reference for the users of my module :)
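Roughly, the trick looks like this (the nested node_modules path matches the old npm layout mentioned above; client.js and the URL are placeholders for my module's own files):

const fs = require('fs');
const http = require('http');
const path = require('path');

// concatenate the socket.io client with our custom client script
const bundle = [
  fs.readFileSync(path.join(__dirname,
    'node_modules/socket.io/node_modules/socket.io-client/dist/socket.io.js')),
  fs.readFileSync(path.join(__dirname, 'client.js'))
].join('\n');

http.createServer((req, res) => {
  if (req.url === '/shotgun-client.js') {
    res.writeHead(200, { 'Content-Type': 'application/javascript' });
    res.end(bundle);
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3700);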
While this is technically possible, I don't see why you'd need to do it. If you're concerned about reducing the data that goes over the wire, this change won't actually do much beyond the few characters saved in the shorter src attribute. Simply changing the location of the JS file on the server won't improve performance; the JS still has to be sent.
Proper caching (which Socket.IO has) will return a 304 Not Modified (and not re-send the JS file every time you load a page).