I'm creating a statistics web page that displays sensitive information.
The page contains a sort of table with a massive amount of data in it, editable and stored in the server's database. The table (and the code that renders it) needs to be hidden until the user has properly authenticated (e.g. by logging in). Most of the questions I've found on Stack Overflow say this is basically impossible, yet plenty of well-known websites seem to hide their content well, so I assume there are solutions to this problem.
So far, I have built a full stack with a React - Express - Node - MariaDB toolchain.
The React client is responsible for rendering the page content and the editable tables, and for submitting edited content back to the server.
Node with Express is responsible for retrieving data from the DB and updating it (it only provides the data that the client side manipulates).
The problem arises when I consider the security of the client-side code. I want to hide all content of the page (not just the data from the server, but also its logic and features).
To achieve this, I have considered several approaches, but I'm not sure whether they are right or would work well if I built them.
Using server-side rendering -- I cannot use this for performance reasons and a lack of available resources.
Server-side rendering can hide the logic from the user because the server sends only HTML; all actions are submitted to the server, which handles them and returns the result.
So I could serve only the login page at first and, if login is successful, send the rest of the HTML and its logic from the server.
The problem is that the page content is massive and users will interact with it very often; and since I'm applying virtualization to my table (for performance reasons), its data and rendering logic have to be handled by the web browser.
Combining SSR and Client-Side Rendering
I'm not sure about this one; I doubt whether it is possible.
Use SSR to hide the site's content from unauthorized users; once authorized, the web browser renders the full content on demand. (Code and logic should be hidden before authorization; an unauthorized user can only see the login page.)
Is it possible to do it?
Get code on demand.
Again just my own idea, but this is what I am looking for. I strongly doubt whether it is possible.
The workflow would be as follows:
If a user is not logged in: the user can only see the login page and its code.
If the user is logged in: the user can see the features of the page, like management, statistics, etc.
If the user accesses a specific feature: the rendering logic and the HTTP request interface are downloaded from the server (or some less performance-hindering variant of them), and the browser renders what the user wants to see and do.
It is okay if none of the ideas above works out. Can you provide an outline for implementing this kind of web page? I'm quite new to web programming, so I cannot find the proper approach on my own. I want to know how I can achieve this, and with what kinds of solutions, libraries, and structure.
What libraries or packages should I use for this?
How can I implement this?
Or can you describe how modern websites achieve this? (I think an SAP system quite resembles what I want to achieve.)
Foreword
Security is a complex topic in which it is not possible to reach zero threat. I'll try to craft an answer that fulfils what you are looking for.
Back end: Token, credentials, authentication
So, you are currently using Express for your back end, hence the need to protect access to this part. Many solutions exist; I favor token authentication, but you can also do something with a username/password scheme to let users access the back end.
From what you are describing, you would use some sort of API (REST, GraphQL, etc.) to connect to the back end and make your queries (fetch, cross-fetch, apollo-link, etc.), adding the token to the call to the back end, usually in the headers.
If a user doesn't have the proper token, they get no data. Many sites use that method to block users from consuming data (e.g. Twitter, Instagram). This should cover the security of the data for your back end, and no code is exposed.
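As a minimal sketch of what such a token check could look like in Express (the route names are my own, and I'm assuming the jsonwebtoken package; treat it as an outline, not a drop-in implementation):

    // Sketch: token-based authentication in Express using jsonwebtoken.
    const express = require('express');
    const jwt = require('jsonwebtoken');

    const app = express();
    app.use(express.json());
    const SECRET = process.env.JWT_SECRET; // never hard-code the secret

    // Issued at login; the client stores it and sends it back on every call.
    app.post('/login', (req, res) => {
      // ...verify username/password against the database here...
      const token = jwt.sign({ user: req.body.username }, SECRET, { expiresIn: '1h' });
      res.json({ token });
    });

    // Middleware: reject any data request that lacks a valid token.
    function requireAuth(req, res, next) {
      const header = req.headers.authorization || '';
      const token = header.replace(/^Bearer /, '');
      try {
        req.user = jwt.verify(token, SECRET);
        next();
      } catch (err) {
        res.status(401).json({ error: 'unauthorized' });
      }
    }

    // The table data only exists behind this check.
    app.get('/api/table-data', requireAuth, (req, res) => {
      res.json({ rows: [] /* ...query MariaDB here... */ });
    });

    app.listen(3000);

The client then sends the token in an Authorization: Bearer header with every fetch to the API.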
Front-end: WebPack and application code splitting
Now the tricky part: you want the client not to have access to the whole front end at once, but only in several parts. This has two caveats:
It will be a bit slower than in normal use
Once the client has logged in once, they will have access to the application.
The only workaround I see in this situation is to use server-side rendering exclusively, if you want to limit the amount of your front end that the client holds to the bare minimum. Granted, it is slow, but if you want maximum protection that is the only solution.
Yet, if you still want to keep some interactivity and have a faster front end while keeping a bit of security, you could use some code splitting with WebPack. I can't say how well it fits your exact stack, but WebPack's Multiple Page Application setup, as I was mentioning in the comment, should give you a good start for building something more secure.
First, you would have, for example, two HTML files as entry points to the front end: one for the login and one for the application. The login page contains only the JavaScript modules needed to enter the application and shouldn't load the other JavaScript modules.
All in all, entry points are how you enter the application. This is a very broad topic that I can't cover in this answer, but I recommend following WebPack's tutorial and working out the details from there.
I particularly recommend the part on code splitting, but the whole tutorial is worth a look.
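To illustrate the code-splitting idea in the direction of your "code on demand" workflow, here is a rough sketch (the module and endpoint names are invented) in which the protected part of the app lives in a separate chunk that the browser only requests after a successful login:

    // login.js -- shipped to everyone; the protected bundle is only fetched on success.
    async function onLoginSubmit(username, password) {
      const res = await fetch('/login', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ username, password }),
      });
      if (!res.ok) {
        // show a login error to the user here
        return;
      }
      const { token } = await res.json();
      sessionStorage.setItem('token', token);

      // Dynamic import: WebPack emits this as a separate chunk, so the
      // table/statistics code is not part of the login bundle at all.
      const { startApp } = await import('./protectedApp.js');
      startApp(token);
    }

Keep in mind that the emitted chunk is still a static file; if you also want the server to refuse to serve it without a valid session, you have to protect that route on the server as well.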
Second, you will have to tweak the optimisation configuration. It normally tries to reduce the size of the application by merging code that is shared between different parts or that is redundant: you don't want this here.
In your case, you don't want unauthenticated users to have access, so you will probably have to change things there (again, too broad a topic to cover in a single answer, since you have to decide what you keep for optimisation and what you give up for security). Here is the link to the optimisation documentation; as a heads-up, you will have to configure the SplitChunksPlugin so that it doesn't perform this optimisation.
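As a rough sketch of what such a configuration could look like (the file names and options are assumptions, not a tested config):

    // webpack.config.js -- two entries: a public login bundle and an application
    // bundle that the login page never references.
    const path = require('path');
    const HtmlWebpackPlugin = require('html-webpack-plugin');

    module.exports = {
      entry: {
        login: './src/login.js', // public entry point
        app: './src/app.js',     // protected application entry point
      },
      output: {
        path: path.resolve(__dirname, 'dist'),
        filename: '[name].[contenthash].js',
      },
      plugins: [
        // login.html only includes the login chunk.
        new HtmlWebpackPlugin({ filename: 'login.html', chunks: ['login'] }),
        // app.html includes the application chunk; serve it only to
        // authenticated users (e.g. behind a server-side auth check).
        new HtmlWebpackPlugin({ filename: 'app.html', chunks: ['app'] }),
      ],
      optimization: {
        // Keep WebPack from extracting shared code into a common chunk that
        // the login page would end up loading as well.
        splitChunks: false,
      },
    };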
I hope this helps. There are many solutions at hand and this is not a comprehensive guide, but it should give you enough material to get to what you need.
I am working on a React-based web app that uses Tensorflow.js to run an AI model in real time on the client, in the browser. I've trained this AI model from scratch and I'd like to protect it from being intercepted and used in other projects. Are there any protections available for this (obfuscation, DRM, etc.)?
From a business perspective, I'd only like the model to work on my web app, nowhere else.
The discussions (1 2 3) I've been able to find on this are more geared toward native apps, not web apps.
Here is an example open source web app that uses Tensorflow.js. These weights are an example of what I would like to protect in my app.
Client-side code obfuscation will never fully prevent it. Use a server instead.
Obfuscation
If your client-side application contains the model, then the user will be able to somehow extract it. You can make it harder for the user, but it will always be possible. Some techniques to make it harder are:
Obfuscating your code: That way the user will not be able to read your code and comments easily. Depending on your build tools, this might already be done for you when you produce a "production ready" build.
Obfuscating the library and its public API: Even if your code is obfuscated, the user might still be able to guess what is going on by seeing the public API calls of the library. Example: It would be rather easy to set a break point at the model.predict function and debug your code from there on. By also obfuscating libraries and their API, this will become harder.
Put "special checks" in your code: You could also check if the page the code is running on is your page (e.g. if the domain matches), etc. You also want to obfuscate this code as well.
Even if your code is perfectly obfuscated and well protected, your client-side code still contains your model somewhere. With these methods it will always be possible to somehow extract your model.
Server-side approach
To make it impossible to get your model, you need a different approach. Only put your "dumb logic" on the client and exclude the part of the code that you want to protect. Instead, you offer an API on your server that executes the "protected part" of your code.
This way, instead of running model.predict on the client side, you make an AJAX request to your backend (with the parameters), which runs the prediction and returns the results. That way the user only sees the input and the output and cannot extract the model itself.
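A minimal sketch of what that could look like with tfjs-node on the server (the model path and route name are assumptions):

    // server.js -- the model never leaves the server; clients only see inputs and outputs.
    const express = require('express');
    const tf = require('@tensorflow/tfjs-node');

    const app = express();
    app.use(express.json());

    let model;
    tf.loadLayersModel('file://./model/model.json').then((m) => { model = m; });

    app.post('/api/predict', async (req, res) => {
      // req.body.input is assumed to be a plain array matching the model's input shape.
      const input = tf.tensor(req.body.input);
      const output = model.predict(input);
      res.json({ prediction: await output.array() });
      input.dispose();
      output.dispose();
    });

    app.listen(3000);

On the client, the call to model.predict is then replaced by a plain fetch to /api/predict that posts the input and reads the prediction from the JSON response.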
Keep in mind that this means a lot more work, as you not only have to write the code for your client-side application but also for your server-side application, including the API. Depending on what your application looks like (e.g. does it have a login?), this might be a lot more code.
Another way you can protect your model is to split it into more than one block: put some blocks on the server side and some on the client side. This method also introduces a lot of engineering work, but once you do it you can trade off computation load and network latency between the server and the client. Users can only get some of the model blocks, which are useless without the server-side blocks they cooperate with.
I've done some web-based projects, and most of the difficulties I've met with (questions, confusions) could be figured out with help. But I still have an important question, even after asking some experienced developers: When functionality can be implemented with both server-side code and client-side scripting (JavaScript), which one should be preferred?
A simple example:
To render a dynamic HTML page, I can format the page in server-side code (PHP, Python) and use Ajax to fetch the formatted page and render it directly (more logic on the server side, less on the client side).
I can also use Ajax to fetch the data (unformatted, as JSON) and use client-side scripting to format the page and render it with more processing (the server gets the data from a DB or other source and returns it to the client as JSON or XML; more logic on the client side and less on the server).
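For concreteness, a rough sketch of the two variants (the endpoints and element IDs are made up):

    // Variant 1: the server formats the HTML; the client just injects it.
    async function renderServerFormatted() {
      const html = await (await fetch('/items.html')).text();
      document.querySelector('#list').innerHTML = html;
    }

    // Variant 2: the server returns raw JSON; the client builds the markup.
    async function renderClientFormatted() {
      const items = await (await fetch('/api/items')).json();
      document.querySelector('#list').innerHTML =
        items.map((item) => `<li>${item.name}</li>`).join('');
    }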
So how can I decide which one is better? Which one offers better performance? Why? Which one is more user-friendly?
With browsers' JS engines evolving, JS can be interpreted in less time, so should I prefer client-side scripting?
On the other hand, with hardware evolving, server performance is growing and the cost of server-side logic will decrease, so should I prefer server-side scripting?
EDIT:
With the answers, I want to give a brief summary.
Pros of client-side logic:
Better user experience (faster).
Less network bandwidth (lower cost).
Increased scalability (reduced server load).
Pros of server-side logic:
Better security.
Better availability and accessibility (mobile devices and old browsers).
Better SEO.
Easily expandable (can add more servers, but can't make the browser faster).
It seems that we need to balance these two approaches when facing a specific scenario. But how? What's the best practice?
I will use client-side logic except in the following conditions:
Security critical.
Special groups (JavaScript disabled, mobile devices, and others).
In many cases, I'm afraid the best answer is both.
As Ricebowl stated, never trust the client. However, I feel that it's almost always a problem if you do trust the client. If your application is worth writing, it's worth properly securing. If anyone can break it by writing their own client and passing data you don't expect, that's a bad thing. For that reason, you need to validate on the server.
Unfortunately if you validate everything on the server, that often leaves the user with a poor user experience. They may fill out a form only to find that a number of things they entered are incorrect. This may have worked for "Internet 1.0", but people's expectations are higher on today's Internet.
This potentially leaves you writing quite a bit of redundant code, and maintaining it in two or more places (some of the definitions, such as maximum lengths, also need to be maintained in the data tier). For reasonably large applications, I tend to solve this issue using code generation. Personally I use a UML modeling tool (Sparx Systems' Enterprise Architect) to model the "input rules" of the system, then make use of partial classes (I'm usually working in .NET) to generate the validation logic. You can achieve a similar thing by coding your rules in a format such as XML and deriving a number of checks from that XML file (input length, input mask, etc.) on both the client and server tiers.
Probably not what you wanted to hear, but if you want to do it right, you need to enforce rules on both tiers.
I tend to prefer server-side logic. My reasons are fairly simple:
I don't trust the client; this may or may not be a real problem, but it's habitual.
Server-side reduces the volume per transaction (though it does increase the number of transactions)
Server-side means that I can be fairly sure about what logic is taking place (I don't have to worry about the Javascript engine available to the client's browser)
There are probably more -and better- reasons, but these are the ones at the top of my mind right now. If I think of more I'll add them, or up-vote those that come up with them before I do.
Edited, valya comments that using client-side logic (using Ajax/JSON) allows for the (easier) creation of an API. This may well be true, but I can only half-agree (which is why I've not up-voted that answer yet).
My notion of server-side logic is that it retrieves the data and organises it; if I've got this right, that logic is the 'controller' (the C in MVC). And this is then passed to the 'view'. I tend to use the controller to get the data, and then the 'view' deals with presenting it to the user/client. So I don't see that client/server distinctions are necessarily relevant to the argument for creating an API; basically: horses for courses. :)
...also, as a hobbyist, I recognise that I may have a slightly twisted usage of MVC, so I'm willing to stand corrected on that point. But I still keep the presentation separate from the logic. And that separation is the plus point so far as APIs go.
I generally implement as much as reasonable client-side. The only exceptions that would make me go server-side would be to resolve the following:
Trust issues
Anyone is capable of debugging JavaScript and reading passwords, etc. No-brainer here.
Performance issues
JavaScript engines are evolving fast so this is becoming less of an issue, but we're still in an IE-dominated world, so things will slow down when you deal with large sets of data.
Language issues
JavaScript is a weakly-typed language and it makes a lot of assumptions about your code. This can force you into spooky workarounds in order to get things working the way they should on certain browsers. I avoid this type of thing like the plague.
From your question, it sounds like you're simply trying to load values into a form. Barring any of the issues above, you have 3 options:
Pure client-side
The disadvantage is that your users' loading time would double (one load for the blank form, another load for the data). However, subsequent updates to the form would not require a page refresh. Users will like this if a lot of data is repeatedly fetched from the server and loaded into the same form.
Pure server-side
The advantage is that your page would load with the data. However, subsequent updates to the data would require refreshes to all/significant portions of the page.
Server-client hybrid
You would have the best of both worlds; however, you would need to create two data extraction points, causing your code to bloat slightly.
There are trade-offs with each option so you will have to weigh them and decide which one offers you the most benefit.
One consideration I have not heard mentioned is network bandwidth. To give a specific example, an app I was involved with was done entirely server-side and resulted in a 200 MB web page being sent to the client (it was impossible to do less without a major redesign of a bunch of apps), resulting in a 2-5 minute page load time.
When we re-implemented this by sending JSON-encoded data from the server and having local JS generate the page, the main benefit was that the data sent shrank to 20 MB, resulting in:
HTTP response size: 200 MB+ => 20 MB+ (with corresponding bandwidth savings!)
Time to load the page: 2-5 minutes => 20 seconds (10-15 of which are taken up by a DB query that was already optimized to hell and further).
IE process size: 200 MB+ => 80 MB+
Mind you, the last two points were mainly due to the fact that the server side had to use a crappy tables-within-tables tree implementation, whereas going client-side allowed us to redesign the view layer to use a much more lightweight page. But my main point was the network bandwidth savings.
I'd like to give my two cents on this subject.
I'm generally in favor of the server-side approach, and here is why.
More SEO-friendly. Google cannot execute JavaScript, therefore all that content will be invisible to search engines.
Performance is more controllable. User experience is always variable with SOA, due to the fact that you're relying almost entirely on the user's browser and machine to render things. Even though your server might be performing well, a user with a slow machine will think your site is the culprit.
Arguably, the server-side approach is more easily maintained and readable.
I've written several systems using both approaches, and in my experience, server-side is the way. However, that's not to say I don't use AJAX. All of the modern systems I've built incorporate both components.
Hope this helps.
I built a RESTful web application where all CRUD functionalities are available in the absence of JavaScript; in other words, all AJAX effects are strictly progressive enhancements.
I believe that, with enough dedication, most web applications can be designed this way, which erodes many of the server-logic vs client-logic "differences" raised in your question, such as security and expandability: in both cases the request is routed to the same controller, whose business logic is all the same until the last mile, where JSON/XML is returned for those XHRs instead of the full-page HTML.
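A rough Express sketch of that "same controller, different last mile" idea (the route, view name, and data-access helper are invented, and a view engine is assumed to be configured):

    // One controller serves both the no-JavaScript full page and the XHR enhancement.
    const express = require('express');
    const app = express();

    app.get('/tasks', async (req, res) => {
      const tasks = await loadTasksFromDb(); // same business logic either way

      res.format({
        // Progressive enhancement: XHR callers ask for JSON...
        'application/json': () => res.json(tasks),
        // ...while plain navigation gets the fully rendered page.
        'text/html': () => res.render('tasks', { tasks }),
      });
    });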
Only in the few cases where the AJAXified application is so vastly more advanced than its static counterpart (GMail being the best example that comes to my mind) does one need to create two versions and separate them completely (kudos to Google!).
I know this post is old, but I wanted to comment.
In my experience, the best approach is using a combination of client-side and server-side. Yes, Angular JS and similar frameworks are popular now, and they've made it easier to develop web applications that are lightweight, have improved performance, and work on most web servers. BUT the major requirement in enterprise applications is displaying report data, which can encompass 500+ records on one page. With pages that return large lists of data, users often want functionality that makes this huge list easy to filter and search, along with other interactive features. Because IE 11 and earlier IE browsers are the "browser of choice" at most companies, you have to be aware that these browsers still have compatibility issues with modern JavaScript, HTML5, and CSS3. Often, the requirement is to make a site or application compatible with all browsers. This requires adding shivs or prototype-based polyfills which, together with the code included to create a client-side application, adds to the page load in the browser.
All of this will reduce performance and can cause the dreaded IE error "A script on this page is causing Internet Explorer to run slowly", forcing the user to choose whether or not to continue running the script... creating a bad user experience.
Determine the complexity of the application and what the user wants now, and what they could want in the future, based on their preferences in their existing applications. If this is a simple site or app with a little-to-medium amount of data, use a JavaScript framework. But if they want to incorporate accessibility or SEO, or need to display large amounts of data, use server-side code to render data and client-side code sparingly. In both cases, use a tool like Fiddler or the Chrome developer tools to check page load and response times, and use best practices to optimize code.
Check out MVC apps developed with ASP.NET Core.
At this stage, client-side technology is leading the way. With the advent of many client-side libraries like Backbone, Knockout, and Spine, and then with the addition of client-side templates like JSrender, mustache, etc., client-side development has become much easier.
So, if my requirement is an interactive app, I will surely go for the client side.
If you have more static HTML content, then yes, go for the server side.
I did some experiments using both; I must say the server side is comparatively easier to implement than the client side.
As far as performance is concerned, read this and you will see where server-side performance scores:
http://engineering.twitter.com/2012/05/improving-performance-on-twittercom.html
I think the second variant is better. For example, if you implement something like 'skins' later, you will thank yourself for not formatting the HTML on the server :)
It also keeps the separation between view and controller. Ajax data is often produced by the controller, so let it just return data, not HTML.
If you're going to create an API later, you'll only need to make a few changes to your code.
Also, 'naked' data is more cacheable than HTML, I think. For example, if you add some styling to links, you'd need to reformat all the HTML... or just add one line to your JS. And it isn't as big as HTML (in bytes).
But if many heavy scripts are needed to format the data, it isn't so cool to ask users' browsers to do the formatting.
As long as you don't need to send a lot of data to the client to allow it to do the work, the client side will give you a more scalable system, as you are distributing the load to the clients rather than hammering your server to do everything.
On the flip side, if you need to process a lot of data to produce a tiny amount of HTML to send to the client, or if optimisations can be made so that the server's work supports many clients at once (e.g. process the data once and send the resulting HTML to all the clients), then it may be a more efficient use of resources to do the work on the server.
If you do it with Ajax:
You'll have to consider accessibility issues (search for web accessibility on Google) for disabled people, but also for old browsers, clients without JavaScript, bots (like Googlebot), etc.
You'll have to flirt with "progressive enhancement", which is not simple to do if you've never worked much with JavaScript. In short, you'll have to make your app work with old browsers and with clients that don't have JavaScript (some mobile devices, for example) or have it disabled.
But if time and money are not an issue, I'd go with progressive enhancement.
But also consider the "Back" button. I hate it when I'm browsing a 100% AJAX website that renders the back button useless.
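The usual way to keep the back button working in an AJAX-heavy page is the History API; a rough sketch (the API route and the render function are placeholders):

    // Push a history entry for each AJAX navigation and re-render
    // when the user navigates back or forward.
    async function showPage(path) {
      const data = await (await fetch(`/api${path}`)).json();
      renderPage(data); // placeholder for whatever updates the DOM
    }

    function navigate(path) {
      history.pushState({ path }, '', path);
      showPage(path);
    }

    window.addEventListener('popstate', (event) => {
      if (event.state && event.state.path) {
        showPage(event.state.path);
      }
    });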
Good luck!
2018 answer, now that Node.js exists
Since Node.js allows you to run JavaScript logic on the server, you can now reuse the same validation on both the server and the client side.
Make sure you set up or restructure the data so that you can reuse the validation without changing any code.
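A tiny sketch of what that sharing could look like (the module and field names are my own):

    // validation.js -- plain functions with no DOM or server dependencies,
    // so the same file can be bundled for the browser and required by Node.
    function validateUser({ username, email }) {
      const errors = [];
      if (!username || username.length < 3) errors.push('username too short');
      if (!/^[^@\s]+@[^@\s]+$/.test(email || '')) errors.push('invalid email');
      return errors;
    }

    module.exports = { validateUser };

On the client you call validateUser before submitting the form to show errors immediately; on the server you call the same function on the request body and reject the request if the returned array is non-empty.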
I've been trying to learn a little about SproutCore, following the "Todos" tutorial, and I have a couple of questions that I haven't been able to find answers to online.
SproutCore is supposed to move all of the business logic to the client. How is that not insecure? A malicious user could easily tamper with the code (since it's all on the client) and change the way the app behaves. How am I wrong here?
SproutCore uses "DataStores", and some of them can be remote. How can I prevent a malicious user from interacting with the backend on their own? Using some sort of API key wouldn't work, since the code is on the client side. Is there some sort of convention here? Any ideas? This really bugs me.
Thanks in advance!
PS: Does anyone think Cappuccino is a better alternative? I decided to go with SproutCore because the documentation on Cappuccino seemed pretty bad, although SproutCore's isn't much better.
Ian
Your concerns are valid. The thing is, they apply to all client-side code, no matter what framework. So:
Web applications are complicated things. Moving processing to the client is a good thing, because it speeds up the responsiveness of the application. However, it is imperative that the server validate all data inputs, just like in any other web application.
Additionally, all web applications should use the well-known authentication/authorization paradigms that are prevalent in system security. Authentication means you must verify that the user is who they say they are and may use the system, while authorization means that the server must verify that the user can do what they are trying to do, e.g. create a new data entry or edit an existing one. It is good design not to present users with UI options they are not allowed to use, but you should not rely on that.
All web applications must do those things.
With respect to the 'interacting with the back end' concern: again, all web applications have this concern. You can open up Firebug or the WebKit inspector, look at all the XHR requests that RIAs issue in their operations, and mimic them to try to do something on that system. Again, this concern is dealt with by the authentication/authorization checks that you must implement. Anybody can use any web client to send a request to the server; it is up to the developer to validate that request.
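As a small sketch of what such a server-side authorization check might look like in an Express backend (requireAuth stands for whatever authentication middleware you already use, and the data-access helpers are invented for illustration):

    // Every mutating request is re-checked on the server,
    // no matter what the UI did or did not allow.
    app.put('/api/entries/:id', requireAuth, async (req, res) => {
      const entry = await db.findEntry(req.params.id); // hypothetical data-access helper
      if (!entry) {
        return res.status(404).end();
      }
      // Authorization: only the owner of the entry may edit it.
      if (entry.ownerId !== req.user.id) {
        return res.status(403).json({ error: 'forbidden' });
      }
      await db.updateEntry(req.params.id, req.body);
      res.json({ ok: true });
    });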
The DataSources in SproutCore are just an abstraction around how SC apps interact with the server. At the end of the day, however, all SC is doing is making XHR requests to the server, like any other RIA.
I've written a simple web page that uses JavaScript to control a QuickTime plugin for movie playback. There's also some AJAX stuff using jQuery to get info on the movies from an MSSQL database. The web page is served to the user from an Apache 2.0 server, which also hosts MSSQL. The end users will view the page in IE6 (unfortunately).
My problem is that the end users now want to use an RS422 jog/shuttle deck control to drive the movie timeline, in place of another jog/shuttle unit that relied on emulating keypresses which was easy for me to detect.
As I'm not a programmer, I'm at a loss as to where to start looking for a solution that receives the RS422 data and then sends it to the JavaScript to control the timeline. Is this something that a custom ActiveX bit of code could do? I've googled ActiveX with JavaScript, but it's unclear to me (as a novice) how the two work together, or whether this would be suitable at all.
If anyone could give me an overview of what to start researching that'd be much appreciated.
Many thanks.
Jon
JavaScript runs in a sandbox and has no access to the computer at all (for security reasons; you really don't want to make it any simpler for fraudsters to get at your credit card data).
ActiveX would work, but it's a security risk, too. ActiveX is written in C++; no JavaScript there. You'll find information about that on the M$ website. Note that ActiveX is usually disabled today because of said security risks. Depending on how seriously your client takes security, the virus scanner might not even allow an ActiveX control to start.
Another option would be to write a small program, installed on the client's computer, that reads the serial port and sends the data to the web server, where your JavaScript can query it. Okay, that's more than a bit convoluted, but probably the least risky.
Or you could write a program that transforms the serial codes into key presses (just create the event and post it to Windows). Again, you need C++, or maybe Python with the win32 package.
Your client must understand that this is something that sounds incredibly simple, but you'll have to jump through a lot of hoops to make it work. A web browser is not a local application with full rein over the hardware (and it must never be).