I have been developing a PHP application for quite a while. The (basic) idea is as follows: users can build web pages using blocks. These blocks can contain images, text, etc. Each of these blocks has its own options. The blocks are modelled in PHP following Domain-Driven Design.
I've built the application to use a PHP-based Controller that handles the requests from a jQuery/JavaScript front end. Each time the user edits an option, it is sent to this Controller, which unserialises a collection of blocks (PHP objects) from Redis and/or the PHP session, sets the attributes of the blocks that are edited, or adds/removes one of the blocks. This is to enforce the domain logic.
Which was fine while developing for myself; I never kept race conditions and such in mind. While moving forward with the product, I noticed that people lose data. I'll explain what happens:
User edits an option of a block
presses save
A request is made to the Controller, which
unserialises the collection,
sets the blocks based on their UUID,
puts the blocks back in the collection, and
serialises the collection again.
There are scenarios where two concurrent requests might be created, and one will overwrite the edits of the other.
I know I need to rewrite this part of the application. The question is what the best approach is. I could:
Implement some JavaScript library, which would take a lot of work because it would require me to rewrite that entire part of the application. I also do not have a lot of experience implementing JavaScript-based solutions, though I do not mind stepping into something new. I do want JavaScript testing to prevent future problems from occurring and to enable cross-browser testing.
Apply Redis/session locking so that the Controller processes only a single request at a time, preventing concurrent requests from overwriting the data set by the previous request. This lowers the chance of concurrent requests causing data loss, but does not remove it entirely. People with really slow internet connections might see their connections drop when they produce a lot of concurrent requests.
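A minimal sketch of such a lock, written in Node for illustration (the same SET NX PX pattern exists in PHP Redis clients; the key name, TTL, and error handling are all placeholders):

    const { createClient } = require('redis');

    // Acquire a short-lived lock, run the critical section, then release.
    async function withLock(client, key, fn) {
        // NX: only set if the key does not exist yet; PX: auto-expire after
        // 5s so a crashed request can never hold the lock forever.
        const acquired = await client.set(key, '1', { NX: true, PX: 5000 });
        if (!acquired) throw new Error('collection is busy, retry');
        try {
            return await fn();
        } finally {
            await client.del(key);
        }
    }

    async function main() {
        const client = createClient();
        await client.connect();
        await withLock(client, 'lock:blocks:user42', async () => {
            // unserialise, mutate and re-serialise the block collection here
        });
    }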
I'm curious what other approaches I might be missing, or if one of the two I mentioned above will suffice.
As far as I understand your problem, what you may want to implement is optimistic locking.
A simple way to implement it, is to version your aggregate.
Every time someone edits your object, increment its version.
When you POST your edited blocks, you send back the version on which you are trying to apply your changes.
Then, when getting your object back from your persistence storage, you compare the versions and ensure you are actually working on an up-to-date object.
If it is, save it; if it is not, reject the modification, notify the user, reload the object, and take the appropriate action (it depends on your needs).
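A minimal sketch of that version check, in JavaScript for illustration (store, applyEdits, and the field names are placeholders; with Redis the compare-and-save step should additionally be made atomic, e.g. with WATCH/MULTI or a Lua script):

    // Reject a save when someone else bumped the version in the meantime.
    function saveBlocks(store, key, editedBlocks, expectedVersion) {
        const current = JSON.parse(store.get(key)); // { version, blocks }
        if (current.version !== expectedVersion) {
            // A concurrent edit won the race: let the client reload and retry.
            return { ok: false, currentVersion: current.version };
        }
        current.blocks = applyEdits(current.blocks, editedBlocks);
        current.version += 1; // every successful save bumps the version
        store.set(key, JSON.stringify(current));
        return { ok: true, newVersion: current.version };
    }

    // Replace the blocks whose uuid matches an edited block.
    function applyEdits(blocks, edits) {
        const byId = new Map(edits.map(b => [b.uuid, b]));
        return blocks.map(b => byId.get(b.uuid) || b);
    }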
I'm creating a statistics web page that displays sensitive information.
The webpage has a sort of table with massive amounts of data in it, editable and stored in the server's database. It needs to be hidden until the user has passed proper authentication (like logging in), the table itself and its code included. Most of the questions on Stack Overflow say this is basically impossible, but lots of well-known websites seem to hide such things well, so I guess there are solutions to the problem.
First, I built a full stack using a React - Express - Node - MariaDB toolchain.
The React client is responsible for rendering the contents of the webpage and the editable tables, and for submitting requests with edited content.
Node with Express is responsible for retrieving data from the DB and updating the DB (it provides data for the client side to manipulate, and that's all).
The problem comes when I consider the security of the client-side code. I want to hide all content of the page (not just the data from the server, but also its logic and features).
To achieve my goal, I have considered several things, but I doubt whether they are right and would work well if I built them.
Using server-side rendering -- cannot be used due to performance reasons and a lack of available resources.
Server-side rendering can hide logic from the user because only HTML comes from the server; all actions are submitted to the server, and the server handles the actions and provides the result.
So I can provide only the login page at first, and if login is successful, I can send the rest of the HTML and its logic from the server.
The problem is that the content of my webpage is massive and will be interacted with very often, and since I'm applying virtualization to my table (for performance reasons), its data and rendering logic should be handled by the web browser.
Combining SSR and Client-Side Rendering
I'm not sure about this idea, and I doubt whether it is possible.
Use SSR to hide the content of the site from unauthorized users; once authorized, the web browser renders the full content on demand. (Code and logic should be hidden before authorization; an unauthorized user can only see the login page.)
Is it possible to do it?
Get code on demand.
This is also just my speculation, and it is what I am looking for, but I strongly doubt whether it is possible.
The workflow is as below:
If a user is not logged in: the user can only see the login page and its code
If the user is logged in: the user can see the features of the page, like management, statistics, etc.
If the user opens a specific feature: the rendering logic and HTTP request interface are downloaded from the server (or some less performance-hindering logic, or else...) and render what the user wants to see and do.
It is okay if the above ideas don't work out. Can you provide some outline for implementing such a web page? I'm quite new to web programming, so I cannot find the proper way. I want to know how I can achieve this, and with what kinds of solutions, libraries, and structure.
What lib or package should I use for this?
How can I implement this?
Or can you describe to me how modern websites achieve this? (I think the SAP system quite resembles what I want to achieve.)
Foreword
Security is a complex topic in which it is not possible to reach zero threat. I'll try to craft an answer that fulfils what you are looking for.
Back end: Token, credentials, authentication
So, you are currently using Express for your back end, hence the need to protect access to this part. Many solutions exist; I favor token authentication, but you can do something with username/password (or this) to let the users access the back end.
From what you are describing, you would use some sort of API (REST, GraphQL, etc.) to connect to the back end and make your queries (fetch, cross-fetch, apollo-link, etc.), adding the token to the call to the back end, usually in the headers.
If a user doesn't have the proper token, they have no data. Many sites use that method to block the consumption of data from the users (e.g. Twitter, Instagram). This should cover the security of the data for your back end, and no code is exposed.
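As a minimal sketch of the idea (the route, header, and toy token check are illustrative; in practice you would verify a JWT or look the token up in a store):

    const express = require('express');
    const app = express();

    // Refuse any API call that does not carry a valid token.
    function requireToken(req, res, next) {
        const token = req.headers['authorization'];
        if (token !== 'Bearer demo-token') { // stand-in for real verification
            return res.status(401).json({ error: 'unauthorized' });
        }
        next();
    }

    app.get('/api/statistics', requireToken, (req, res) => {
        res.json({ rows: [] }); // only authenticated clients ever receive data
    });

    app.listen(3000);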
Front-end: WebPack and application code splitting
Now the tricky part, so you want the client side not to have access to all the front-end at once but in several parts. This has 2 caveats:
It will be a bit slower than in normal use
Once the client has logged in once, they will have access to the application
The only workaround I see in this situation is to use only server-side rendering, if you want to limit the amount of data the client has of your front end to the bare minimum. Granted, it is slow, but if you want maximum protection, that is the only solution.
Yet, if you want to keep some interactions and have a faster front end while keeping a bit of security, you could use some code splitting with WebPack. I am not familiar with C so I can't say, but WebPack's multiple-page application setup, as I was mentioning in the comment, should give you a good start to build something more secure.
First, you would have, for example, 2 HTML files for entering the front end: one with the login and one with the application. The login page contains only the JavaScript modules that are needed for entering the application and shouldn't load the other JavaScript modules.
All in all, entry points are the way you can enter the application; this is a very broad topic that I can't cover in this answer, but I would recommend you follow WebPack's tutorial and find out how you can work this out.
I recommend the part on code splitting, but all the tutorial is worth having a look.
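As a rough illustration, a two-entry configuration could look like this (file names and paths are placeholders):

    // webpack.config.js: two independent entry points, so the login bundle
    // never pulls in the application code.
    const path = require('path');

    module.exports = {
        entry: {
            login: './src/login.js', // served to unauthenticated users
            app: './src/app.js',     // loaded only after authentication
        },
        output: {
            filename: '[name].bundle.js',
            path: path.resolve(__dirname, 'dist'),
        },
    };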
Second, you will have to tweak the optimisation module. It is usually a module that tries to reduce the size of the application by merging methods that are used by different parts or that are redundant: you don't want this.
In your case, you don't want unauthenticated users to have access, so you would probably have to change things there (again, too broad a topic to cover in a single answer, since you would have to decide what you keep for optimisation and what you remove for security), but here is the link to the optimisation module, and a heads-up: you will have to configure the SplitChunksPlugin not to do this optimisation.
I hope this helps. There are many solutions at hand and this is not a comprehensive guide, but that should give you enough material to get to what you need.
I've read a few Stack Overflow posts related to this subject, but I can't find anything that specifically helps me in my scenario.
We have multiple monitoring instances within our network, monitoring different environments (Nagios, Icinga, more...). Currently I have a poller script written in PHP which runs every minute via cron; it asks each instance to return all of its problems in JSON, then interprets this and pushes it into a MySQL database.
There is then an 'overview' page which simply reads the database and does some formatting. There's a bit of AJAX involved: every X seconds (currently 30) it checks for changes (a PHP script call), and if there are changes it requests them via AJAX and updates the page.
There are a few other little bits too (click a problem and another AJAX request goes off to fetch the problem details to display in a modal, etc.).
I've always been a PHP/MySQL dev, so the above methodology seemed logical to me and was quick/easy to write, and it works 'ok'. However, there are problems: the database is constantly being polled by many users, and there's a mesh of JavaScript on the front end doing half the logic with PHP on the back end doing the other half.
Would this use case benefit from switching to Node.js? I've done a bit of Node.js before, but nothing like this. Can I subscribe to MySQL updates? Or trigger them when a 'data fetcher' pushes data into the database? I've always been a bit confused, since I use PHP to create data and JavaScript to 'draw' the page: is there still a split, with Node.js doing the logic and front-end JavaScript creating all the elements, or does Node.js do all of this now? Sorry for the lack of knowledge in this area...
This is definitely an area where Node could offer improvements.
The short version: with websockets in the front-end and regular sockets or an API on the back-end you can eliminate the polling for new data across the board.
The long version:
Front-end:
You can remove all need for polling scripts by implementing websockets. That way, as soon as new data arrives on the server, you can broadcast it to all connected clients. I would advise Socket.io or the Primus websocket wrapper. Both are very easy to implement and incredibly powerful for what you want to achieve.
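For a feel of how little code this takes, a server-side broadcast with Socket.io is roughly this (the event name is illustrative):

    // Minimal Socket.io server: whenever new data arrives, push it to
    // every connected client at once.
    const http = require('http');
    const server = http.createServer();
    const io = require('socket.io')(server);

    function onNewData(problems) {
        io.emit('problems-updated', problems); // broadcast to all clients
    }

    server.listen(3000);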
All data processing logic should happen on the server. The data is then sent to the client and should be rendered on the existing page, and that is basically the only logic the client should contain. There are some frameworks that do all of this for you (e.g. Sails) but I don't have experience with any of those frameworks, since they require you to write your entire app according to their rules, which I personally don't like (but I know a lot of developers do).
If you want to render the data in the client without a huge framework, I highly recommend the lightweight but incredibly useful Transparency rendering library. Using this, you can format a Javascript object on the server using Node, JSONify it, send it to the client, and then all the client would have to do is de-JSONify it and call Transparency's .render.
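The client side then stays tiny; if I recall Transparency's API correctly, it boils down to something like this (the element id and event name are illustrative, and the page must load the Socket.io client script):

    // Receive the JSON over the websocket and let Transparency map it
    // onto the existing markup.
    const socket = io();
    socket.on('problems-updated', (problems) => {
        Transparency.render(document.getElementById('problems'), problems);
    });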
Back-end:
This one depends on how much control you have over the behaviour of the instances you need to check. I assume you have some control, since you can get all their data in a nice JSON format. So, there are multiple options.
You can keep polling every so often. This is the easiest solution since it requires no change to the external services. The Javascript setInterval function is very useful here. Depending on how you connect with the instances, you might be able to use a module like Request to do the actual request, so that takes out a bunch more of the heavy lifting.
The benefit of implementing the polling in your Node app as well, is that you will receive the data in your Node app and that way you can immediately broadcast it to the clients, even before inserting it into a database. This will greatly reduce the number of queries on your database.
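A rough sketch of that polling loop (the instance URLs are placeholders, io is the Socket.io server from the earlier snippet, and this uses Node's built-in fetch, available in recent versions, rather than the Request module):

    // Poll each monitoring instance once a minute and broadcast results
    // to the connected clients before persisting them.
    const INSTANCES = ['http://nagios.internal/api/problems'];

    setInterval(async () => {
        for (const url of INSTANCES) {
            try {
                const res = await fetch(url);
                const problems = await res.json();
                io.emit('problems-updated', problems); // instant client update
                // ...insert the problems into MySQL here...
            } catch (err) {
                console.error('poll failed for', url, err.message);
            }
        }
    }, 60 * 1000);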
An alternative to polling would be to set up a simple Express-based API where the applications can post their 'problems', as you call them. This way your application will get notified the moment a problem occurs, and combined with the websockets connection to the client this would result in practically real-time updates.
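Such an endpoint can be as small as this (the path and payload shape are illustrative, again reusing io from above):

    // Let the monitoring instances push their problems instead of being polled.
    const express = require('express');
    const app = express();
    app.use(express.json());

    app.post('/problems', (req, res) => {
        io.emit('problems-updated', req.body); // near-real-time fan-out
        // ...persist req.body to MySQL here...
        res.sendStatus(204);
    });

    app.listen(8080);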
To be more redundant, you would have a polling timer alongside the API, so that you can check the instances in case there's something wrong that causes them to not send over any more data.
An alternative to the more high-level API would be to just use direct socket communication, which is basically the same approach only using a different set of functions.
Lastly, you could also keep the PHP-based polling script. This would be the most efficient solution since you wouldn't go and replace everything. Then from the Node app that's connected to the clients with websockets, you could set an interval to query the database every so often and broadcast the updates. This will still greatly reduce the number of queries, since no matter how many clients are connected there will only be one query, the response of which then gets sent to all connected clients.
I hope my post has given you some ideas of how you could implement your application using Node. Keep in mind, though, that I am just one developer; this is how I would approach building your application in Node. There will definitely be others who have different opinions.
So say I were to implement a scrabble game, and I wanted it to be 100 percent client-side, i.e. Backbone handles all the game logic. Is it possible to protect such a solution so that users aren't able to spoof game moves?
Is this possible?
I think that several things must stay on the server side, even in an (almost) all-client solution:
Security - you must have some sort of authorization and authentication outside your client side
Validation - you can never trust user generated content, and a JSON model sent to the server during a backbone sync, is, in some way, user generated content (as anyone can open a console and mess with your models and save)
I know that solutions like Firebase handle #1 very well, but I'm not sure they handle #2
Therefore, in this case, Sébastien's answer is a great solution: instead of server validation, you have the peers validate that what they get from other peers is a valid move according to their representation of the game. However, how do you know who is right? Does the majority win? I don't see a way to avoid having some sort of server-side state which is the "master", validating that each move is a "valid" move.
One way of doing it is having your server side run on Node.js; this way you can avoid writing your validation logic in two different places. You don't need to run the entire logic on the server side, just the validation part.
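For instance, a single module can hold the rules and be loaded both by Node and by the browser bundle (a toy sketch; the actual scrabble rules are elided):

    // rules.js: one source of truth for move validation.
    function validateMove(board, move) {
        // Real checks go here: tiles connect to existing words, letters come
        // from the player's rack, the word exists in the dictionary, etc.
        return Array.isArray(move.tiles) && move.tiles.length > 0;
    }

    // Export for Node; in the browser, bundle this file or expose it globally.
    if (typeof module !== 'undefined') {
        module.exports = { validateMove };
    }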
There are also ways to run your entire Backbone app in the server side (e.g. this approach) but I'm not sure this is needed here.
A few other reasons you need server-side validation: how do you know what the user is saving? E.g. if you don't have a size limit, what stops them from storing their entire pirated ebook database in your app? If you have no validation on the server side, anyone with a console can theoretically push anything.
This is not possible unless you also build in a way for one client to tell the other client to stop cheating, or in other words, to locally validate every move. This has the reverse problem of allowing cheaters to block every move by their adversary, however.
You could extend this by having a third person with the client "indirectly observe" the game and provide a third point of view on the moves. If two people out of three deem a move legal, it goes through. This only breaks down if you get a significant number of cheaters/people modifying the script.
I think this will be one of your only solutions, as, if the app is entirely client-side, you can deem nothing in the code to be safe or unbreakable. You'll need to rely on peer validation more than building checks in the code, I think.
I'm rendering a news feed.
I'm planning to use Backbone.js for my JavaScript stuff because I'm sick of doing manual DOM binds with jQuery.
So right now I'm looking at 2 options.
1. When the user loads the page, the "news feed" container is blank. But the page triggers JavaScript which renders the items of the news feed onto the screen. This would tie into Backbone's models and collections, etc.
2. When the user loads the page, the "news feed" is rendered by the server. Even if JavaScript were turned off, the items would still show because the server rendered them via a templating engine.
I want to use Backbone.js to keep my JavaScript clean. So, I should pick #1, right? But #1 is much more complicated than #2.
By the way, the reason I'm asking this question is that I don't want to use the routing feature of Backbone.js. I would load each page individually and use Backbone for the individual items of the page. In other words, I'm using Backbone.js only halfway.
If I were to use the routing feature of Backbone.js, then the obvious answer would be #1, right? But I'm afraid it would take too much time to build the route system, and time should be balanced into my equation as well.
I'm sorry if this question is confusing; I just want to know the best practice for using Backbone.js while saving time as well.
There are advantages and disadvantages to both, so I would say this: choose the option that is best for you, according to your requirements.
I don't know Backbone.js, so I'm going to keep my answer to client- versus server-side rendering.
Client-side Rendering
This approach allows you to render your structure quickly on the server-side, then let the user's JavaScript pick up the actual content.
Pros:
Quicker perceived user experience: if there's enough static content on the initial render, then the user gets their page back (or at least the beginning of it) quicker and they won't be bothered about the dynamic content, because in all likelihood that will render reasonably quickly too.
Better control of caching: By requiring that the browser makes multiple requests, you can set up your server to use different caching headers for each URL, depending on your requirements. In this way, you could allow users to cache the initial page render, but require that a user fetch dynamic (changing) content every time.
Cons:
User must have JavaScript enabled: This is an obvious one and I shouldn't even need to mention it, but you are cutting out a (very small) portion of your user base if you don't provide a graceful alternative to your JS-heavy site.
Complexity: This one is a little subjective, but in some ways it's just simpler to have everything in your server-side language and not require so much back-and-forth. Of course, it can go both ways.
Slow post-processing: This depends on the browser, but the fact is that if a lot of DOM manipulation or other post-processing needs to occur after retrieving the dynamic content, it might be faster to let the server do it if the server is underutilized. Most browsers are good at basic DOM manipulation, but if you have to do JSON parsing, sorting, arithmetic, etc., some of that might be faster on the server.
Server-side Rendering
This approach allows the user to receive everything at once and also caters to browsers that don't have good JavaScript support, but it also means everything takes a bit longer before the browser gets the first <html> tag.
Pros:
Content appears all at once: If your server is fast, it will render everything all at once, and that's that. No messy XmlHttpRequests (does anyone still use those directly?).
Quick post-processing: Just like you wouldn't want your application layer to do sorting of a database queryset because the database is faster, you might also want to reserve a good amount of processing on the server-side. If you design for the client-side approach, it's easy to get carried away and put the processing in the wrong place.
Cons:
Slower perceived user experience: A user won't be able to see a single byte until the server's work is all done. Sure, the server is probably going to zip through it, but it's still a few extra seconds on the user's side and you would do them a favor by rendering what you can right away.
Does not scale as well because server spends more time on requests: It might be that you really want the server to finish a request quickly and move on to the next connection.
Which of these are most important to your requirements? That should inform your decision.
I don't know backbone, but here's a simple thought: if at all possible and secure, do everything on the client instead of the server. That way the server has less work to do and can therefore handle more connections and scale better.
But #1 is much more complicated than #2.
Not really. Once you get the hang of Backbone and jQuery and client-side templating (and maybe throw CoffeeScript into the mix, too), this is not really difficult. In fact, it greatly simplifies your server code, as all the display-related functions are now removed. You could even have different clients (a mobile version, for example) running against the same server.
Even if JavaScript were turned off, the items would still show because the server rendered them via a templating engine.
That is the important consideration here. If you want to support users without Javascript, then you need a non-JS version.
If you already have a non-JS version, you can think about whether you still need the "enhanced" version, and if you do, whether you want to re-use the server-side templating you already have coded, tested, and need to maintain anyway, or duplicate the effort client-side, which adds development cost but, as you say, may provide a superior experience and lower the load on the server (although I cannot imagine that fetching rendered data versus fetching XML data makes that much of a difference).
If you do not need to support users without Javascript, then by all means, render on the client.
I think Backbone's aim is to organize a JavaScript in-page client application. But first of all you should take a position on the following statement:
Even if javascript was turned off, the web-app still works in "post-back mode".
Is that one of your requirements? (This is not a simple requirement.) If not, then my advice is: "Do more JS". But if it is, then I believe your best friend is jQuery's load function.
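For reference, load fetches a server-rendered fragment and injects it into a container, roughly like this (the selector and URL are placeholders):

    // Load a server-rendered fragment into the feed container; without JS,
    // the user navigates to the same URL as a normal page instead.
    $('#news-feed').load('/feed?page=2', function () {
        // The fragment is in the DOM now; attach any enhancements here.
    });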
A note: I'm a Java programmer, and there are a lot of server-side frameworks that bring the ability to write applications that work ajax-ly when JS is enabled and switch to post-backs when it isn't. I think Wicket and Echo2 are two of them, but mind that they are server-side libraries...
I came across a site that does something very similar to Google Suggest. When you type in 2 characters in the search box (e.g. "ca" if you are searching for "canon" products), it makes 4 Ajax requests. Each request seems to get done in less than 125ms. I've casually observed Google Suggest taking 500ms or longer.
In either case, both sites are fast. What are the general concepts/strategies that should be followed in order to get super-fast requests/responses? Thanks.
EDIT 1: By the way, I plan to implement an autocomplete feature for an e-commerce site search where it 1) provides search suggestions based on what is being typed and 2) shows a list of potential product matches based on what has been typed so far. I'm aiming for something similar to SLI Systems search (see http://www.bedbathstore.com/ for example).
This is a bit of a "how long is a piece of string" question and so I'm making this a community wiki answer — everyone feel free to jump in on it.
I'd say it's a matter of ensuring that:
The server / server farm / cloud you're querying is sized correctly according to the load you're throwing at it and/or can resize itself according to that load
The server / server farm / cloud is attached to a good, quick network backbone
The data structures you're querying server-side (database tables or what-have-you) are tuned to respond to those precise requests as quickly as possible
You're not making unnecessary requests (HTTP requests can be expensive to set up; you want to avoid firing off four of them when one will do); you probably also want to throw in a bit of hysteresis management (delaying the request while people are typing, only sending it a couple of seconds after they stop, and resetting that timeout if they start again; see the sketch after this list)
You're sending as little information across the wire as can reasonably be used to do the job
Your servers are configured to re-use connections (HTTP 1.1) rather than re-establishing them (this will be the default in most cases)
You're using the right kind of server; if a server has a large number of keep-alive requests, it needs to be designed to handle that gracefully (NodeJS is designed for this, as an example; Apache isn't, particularly, although it is of course an extremely capable server)
You can cache results for common queries so as to avoid going to the underlying data store unnecessarily
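For the hysteresis point above, a minimal debounce sketch (the delay, endpoint, and renderSuggestions are placeholders):

    // Wait until the user pauses typing before firing the suggestion request.
    function debounce(fn, delayMs) {
        let timer = null;
        return function (...args) {
            clearTimeout(timer); // typing again: reset the countdown
            timer = setTimeout(() => fn(...args), delayMs);
        };
    }

    const suggest = debounce(async (query) => {
        const res = await fetch('/suggest?q=' + encodeURIComponent(query));
        renderSuggestions(await res.json()); // renderSuggestions is assumed
    }, 300); // only fire 300 ms after the last keystroke

    // input.addEventListener('input', e => suggest(e.target.value));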
You will need a web server that is able to respond quickly, but that is usually not the problem. You will also need a database server that is fast and can very quickly find which popular search results start with 'ca'. Google doesn't use a conventional database for this at all; it uses large clusters of servers and a Cassandra-like database, and most of that data is kept in memory as well for quicker access.
I'm not sure if you will need this, because you can probably get pretty good results using only a single server running PHP and MySQL, but you'll have to make some good choices about the way you store and retrieve the information. You won't get these fast results if you run a query like this:
    select q.search
    from previousqueries q
    where q.search LIKE 'ca%'
    group by q.search
    order by count(*) DESC
    limit 1
This will probably work as long as fewer than 20 people have used your search, but it will likely fail on you before you reach 100,000.
This link explains how they made instant previews fast. The whole site highscalability.com is very informative.
Furthermore, you should keep everything in memory and avoid retrieving data from the disk (slow!). Redis, for example, is lightning fast!
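As a toy illustration of the keep-it-in-memory idea, precomputing the top suggestions per prefix makes each lookup a single hash access (all names and numbers are illustrative):

    // Precompute the most popular queries for every short prefix, so a
    // suggestion lookup is one map access instead of a table scan.
    const topByPrefix = new Map();

    function indexQuery(query, count) {
        for (let len = 1; len <= Math.min(query.length, 5); len++) {
            const prefix = query.slice(0, len);
            const list = topByPrefix.get(prefix) || [];
            list.push({ query, count });
            list.sort((a, b) => b.count - a.count);
            topByPrefix.set(prefix, list.slice(0, 10)); // keep the top 10
        }
    }

    function suggest(prefix) {
        return (topByPrefix.get(prefix) || []).map(entry => entry.query);
    }

    indexQuery('canon', 120);
    indexQuery('camera', 300);
    console.log(suggest('ca')); // -> ['camera', 'canon']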
You could start by building a fast search engine for your products. Check out Lucene for full-text searching. It is available for PHP, Java, and .NET, amongst others.