I'm building an app and would like some feedback on my approach to building the data sync process and API that supports it. For context, these are the guiding principles for my app/API:
Free: I do not want to charge people at all to use the app/API.
Open source: the source code for both the app and API are available to the public to use as they wish.
Decentralised: the API service that supports the app can be run by anyone on any server, and made available for use to users of the app.
Anonymous: the user should not have to sign up for the service, or submit any personal identifying information that will be stored alongside their data.
Secure: the user's data should be encrypted before being sent to the server; anyone with access to the server should have no ability to read the user's data.
I will run an instance of the API on a public server, which will be selected in the app by default. That way, initial users of the app can sync their data straight away without needing to find or set up an instance of the API service. Over time, if the app is popular, users will hopefully set up other instances of the API service, either for themselves or to make available to other users of the app should they wish to use a different instance (or if the primary instance runs out of space, goes down, etc.). They may even access the API in their own apps. Essentially, I want them to have the choice to be self-sufficient and not necessarily have to rely on others providing an instance of the service for them, for reasons of privacy, resilience, cost saving, etc. Note: the data in question is not sensitive (i.e. financial, etc.), but it is personal.
The user's sync journey works like this:
User downloads the app, and creates their data in the process of using the app.
When the user is ready to initially sync, they enter a "password" in the password field, which is used to create a complex key with which to encrypt their data. Their password is stored locally in plain text but is never sent to the server.
User clicks the "Sync" button; their data is encrypted (using their password) and sent to the specified (or default) API instance, which responds with a unique ID that is saved by the app.
For future syncs, their data is encrypted locally using their saved password before being sent to the API along with their unique ID, which updates their synced data on the server.
When retrieving synced data, their unique ID is sent to the API which responds with their encrypted data. Their locally stored password is then used to decrypt the data for use by the app.
I've implemented the app in JavaScript, and the API in Node.js (restify) with MongoDB as a backend, so in practice sync requests to the server look like this:
1. Initial sync
POST /api/data
Post body:
{
"data":"DWCx6wR9ggPqPRrhU4O4oLN5P09onApoAULX4Xt+ckxswtFNH/QQ+Y/RgxdU+8+8/muo4jo/jKnHssSezvjq6aPvYK+EAzAoRmXenAgUwHOjbiAXFqF8gScbbuLRlF0MsTKn/puIyFnvJd..."
}
Response:
{
"id":"507f191e810c19729de860ea",
"lastUpdated":"2016-07-06T12:43:16.866Z"
}
2. Get sync data
GET /api/data/507f191e810c19729de860ea
Response:
{
"data":"DWCx6wR9ggPqPRrhU4O4oLN5P09onApoAULX4Xt+ckxswtFNH/QQ+Y/RgxdU+8+8/muo4jo/jKnHssSezvjq6aPvYK+EAzAoRmXenAgUwHOjbiAXFqF8gScbbuLRlF0MsTKn/puIyFnvJd...",
"lastUpdated":"2016-07-06T12:43:16.866Z"
}
3. Update synced data
POST /api/data/507f191e810c19729de860ea
Post body:
{
"data":"DWCx6wR9ggPqPRrhU4O4oLN5P09onApoAULX4Xt+ckxswtFNH/QQ+Y/RgxdU+8+8/muo4jo/jKnHssSezvjq6aPvYK+EAzAoRmXenAgUwHOjbiAXFqF8gScbbuLRlF0MsTKn/puIyFnvJd..."
}
Response:
{
"lastUpdated":"2016-07-06T13:21:23.837Z"
}
Their data in MongoDB will look like this:
{
"id":"507f191e810c19729de860ea",
"data":"DWCx6wR9ggPqPRrhU4O4oLN5P09onApoAULX4Xt+ckxswtFNH/QQ+Y/RgxdU+8+8/muo4jo/jKnHssSezvjq6aPvYK+EAzAoRmXenAgUwHOjbiAXFqF8gScbbuLRlF0MsTKn/puIyFnvJd...",
"lastUpdated":"2016-07-06T13:21:23.837Z"
}
Encryption is currently implemented using CryptoJS's AES implementation. The app provides the user's password as a passphrase to the AES "encrypt" function, which generates a 256-bit key with which to encrypt the user's data before it is sent to the API.
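For reference, here is a minimal sketch of that client-side step, assuming the crypto-js npm package; the userData and password variables are placeholders of my own:

// Sketch of the passphrase-based encrypt/decrypt described above.
// CryptoJS derives a 256-bit key (and IV) from the passphrase using
// its OpenSSL-compatible key derivation function.
var CryptoJS = require('crypto-js');

var password = 'correct horse battery staple'; // stored locally, never sent
var userData = JSON.stringify({ notes: ['first note'] });

// Before syncing: encrypt locally, send only the ciphertext
var ciphertext = CryptoJS.AES.encrypt(userData, password).toString();

// After retrieving: decrypt locally with the stored password
var plaintext = CryptoJS.AES.decrypt(ciphertext, password)
  .toString(CryptoJS.enc.Utf8);

console.log(plaintext === userData); // true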
That about sums up the sync process. It's fairly simple, but obviously it needs to be secure and reliable. My concerns are:
As the MongoDB ObjectID is fairly easy to guess, it is possible that a malicious user could request someone else's data (as per step 2. Get sync data) by guessing their ID. However, if they are successful they will only retrieve encrypted data and will not have the key with which to decrypt it. The same applies for anyone who has access to the database on the server.
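As an aside, if ID guessability itself ever becomes a worry, one possible mitigation (my suggestion, not part of the current design) is to issue a long random identifier instead of exposing the ObjectID:

// Node sketch: hand out an unguessable 128-bit identifier rather than
// the (partially predictable) MongoDB ObjectID.
var crypto = require('crypto');

function generateSyncId() {
  // 16 random bytes = 2^128 possibilities, infeasible to enumerate
  return crypto.randomBytes(16).toString('hex');
}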
Given the above, is the CryptoJS AES implementation secure enough so that in the real possibility that a user's encrypted data is retrieved by a malicious user, they will not realistically be able to decrypt the data?
Since the API is open to anyone and doesn't audit or check the submitted data, anyone could potentially submit any data they wish to be stored in the service, for example:
Post body:
{
"data":"This is my anyold data..."
}
Is there anything practical I can do to guard against this whilst adhering to the guiding principles above?
General abuse of the service is another concern, such as users spamming initial syncs (step 1 above) over and over to fill up the space on the server, or some users using disproportionately large amounts of server space. I've implemented some features to guard against this, such as logging IPs for initial syncs for one day (not kept any longer than that) in order to limit a single IP to a set number of initial syncs per day (sketched below). I'm also limiting the POST body size for syncs. These options are configurable in the API, however, so if a user doesn't like these limitations on a public API instance, they can host their own instance and tweak the settings to their liking.
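To illustrate, here is a rough sketch of that per-IP daily cap; the limit value and handler wiring are invented, and counters are kept in memory only so no IP is retained longer than a day:

// restify-style middleware sketch: cap initial syncs per IP per day.
var MAX_INITIAL_SYNCS_PER_DAY = 10; // hypothetical configurable limit
var syncCounts = {}; // ip -> { count, firstSeen }

function limitInitialSyncs(req, res, next) {
  var ip = req.connection.remoteAddress;
  var now = Date.now();
  var entry = syncCounts[ip];

  // Reset (and effectively forget) the entry after 24 hours
  if (!entry || now - entry.firstSeen > 24 * 60 * 60 * 1000) {
    entry = syncCounts[ip] = { count: 0, firstSeen: now };
  }
  if (++entry.count > MAX_INITIAL_SYNCS_PER_DAY) {
    res.send(429, { error: 'Daily initial sync limit reached' });
    return next(false); // stop the handler chain
  }
  return next();
}

// Wired up on the initial-sync route only:
// server.post('/api/data', limitInitialSyncs, createHandler);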
So that's it, I would appreciate anyone who has any thoughts or feedback regarding this approach given my guiding principles. I couldn't find any examples where other apps have attempted a similar approach, so if anyone knows of any and can link to them I'd be grateful.
I can't really comment on whether specific AES algorithms/keys are secure or not, but assuming they are (and the keys are generated properly), it should not be a problem if other users can access the encrypted data.
You can perhaps protect against abuse, without requiring accounts, by using captchas or similar guards against automated usage. If you require a captcha on new accounts, and set limits on data volume and call frequency for all accounts, you should be OK.
To guard against accidental clear-text data, you might generate a secondary key for each account, and then check on the server, using the public secondary key, whether the messages can be decrypted. Something like this:
data = secondary_key(user_private_key(cleartext))
This way the data will always be encrypted; in the worst case the server will be able to read it, but others won't.
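A sketch of how I read that scheme, using CryptoJS for both layers; serverCheckKey stands in for the per-account secondary key and is my own naming:

var CryptoJS = require('crypto-js');

// Inner layer: the user's private password, never sent to the server.
// Outer layer: a per-account secondary key the server also holds, so it
// can verify the payload is really ciphertext and not clear text.
function doubleEncrypt(cleartext, userPassword, serverCheckKey) {
  var inner = CryptoJS.AES.encrypt(cleartext, userPassword).toString();
  return CryptoJS.AES.encrypt(inner, serverCheckKey).toString();
}

// Server side: peel off the outer layer only; the inner ciphertext
// stays opaque to the server.
function outerLayerDecrypts(payload, serverCheckKey) {
  try {
    var inner = CryptoJS.AES.decrypt(payload, serverCheckKey)
      .toString(CryptoJS.enc.Utf8);
    return inner.length > 0;
  } catch (e) {
    return false; // wrong key, or not produced by doubleEncrypt
  }
}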
A few comments on your API :) If you're already using HTTP and POST, you don't really need an ID. A POST usually returns a URI that points to the created data. You can then GET that URI, or PUT to it to make changes:
POST /api/data
{"data": "..."}
Response:
Location: /api/data/12345
{"data": "...", "lastmodified": "..." }
To change it:
PUT /api/data/12345
{"data": "..."}
You don't have to do it this way, but it might be easier to implement on the client side, and maybe even help with caching and cache invalidation.
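In restify that might look something like this (createRecord is a hypothetical data-layer function):

// Sketch: POST creates the resource and returns its URI in Location.
// Assumes `server` was created with restify.createServer().
server.post('/api/data', function (req, res, next) {
  createRecord(req.body.data, function (err, record) {
    if (err) return next(err);
    res.header('Location', '/api/data/' + record.id);
    res.send(201, { data: record.data, lastmodified: record.lastUpdated });
    return next();
  });
});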
Related
Not sure if the title summarises my question well.
Basically, I am trying to protect routes such as checking whether a user exists. I only want requests coming from my frontend application to be approved, but since no user is signed in, there is no token to send.
API request:
mywebiste/checkUser/email
This route is unprotected on my backend because no user is logged in.
BUT I want to protect this route, in such a way that it's accessible only from the frontend.
One idea I came up with was adding a specific header from the frontend and checking it on the backend, but that could easily be replicated. Is there something more secure, like using tokens?
I am using React and Node.js
Same-origin policy is going to give you some basic protection, but basically, if an API endpoint is exposed publicly, it's exposed publicly. If you don't want that route to be publicly accessible, you need to add access control.
If you use that route to check if a user is already registered, you could, for example, merge it with the user registration route and send a different error code if the user already exists (which is not a great idea because it leaks which emails are registered on your system).
You can verify that a request was originated by a user (by authenticating them), but you cannot verify that a request comes from a particular client, for these two reasons:
If you include some API key in your client (web page or other), it's easily retrievable by everyone (the best you could do is obfuscate it, which makes things slightly harder, but still possible).
If you send an API key over the network, it's easily retrievable as well.
The only thing you can do is prevent other web pages from calling your backend on behalf of the user, by using CORS (which is actually active by default if you don't specify an Access-Control-Allow-Origin header).
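For instance, a minimal sketch of pinning that header to your own origin in Node (the origin value is a placeholder):

// Middleware sketch: only your frontend's origin is allowed in CORS
// responses, so other pages can't call the API from a browser on the
// user's behalf. (It does nothing against non-browser clients.)
function cors(req, res, next) {
  res.setHeader('Access-Control-Allow-Origin', 'https://myfrontend.example');
  res.setHeader('Vary', 'Origin');
  next();
}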
I ended up creating a kind of working solution: I create a new base64 string on my frontend and attach it to the header when making a request to the backend. The string is combined with your secret key and changes every minute, so even if the header is copied, it only stays valid for that minute.
I have made a package so that people can use it if they want - https://github.com/dhiraj1site/ncrypter
You can use it like so:
var ncrypter = require('ncrypter');
//use encrypt on your frontend with the number of seconds and your secret key
var encodedString = ncrypter.encrypt(2, 'mysecret1');
//use decrypt on your backend with the same seconds and secret
var decodedString = ncrypter.decrypt(encodedString, 2, 'mysecret1');
console.log('permission granted -->', decodedString);
I am building a "TODO" application which uses Service Workers to cache request responses; in case a user is offline, the cached data is displayed to the user.
The server exposes a RESTful API with POST, PUT, DELETE and GET endpoints for the resources.
When the user is offline and submits a TODO item, I save it to a local IndexedDB, but I can't send the POST request to the server since there is no network connection. The same is true for the PUT and DELETE requests where a user updates or deletes an existing TODO item.
Questions
What patterns are in use to sync the pending requests with the REST-ful Server when the connection is back online?
The Background Sync API is suitable for this scenario. It enables web applications to synchronize data in the background, deferring actions until the user has a reliable connection and ensuring that whatever the user wants to send is actually sent. Even if the user navigates away or closes the browser, the action is still performed, and you can notify the user if desired.
Since you're saving to IndexedDB, you can register for a sync event when the user adds, deletes or updates a TODO item:
function addTodo(todo) {
  return addToIndexedDB(todo).then(() => {
    // Wait for the scoped service worker registration to get a
    // service worker with an active state
    return navigator.serviceWorker.ready;
  }).then(reg => {
    return reg.sync.register('add-todo');
  }).then(() => {
    console.log('Sync registered!');
  }).catch(() => {
    console.log('Sync registration failed :(');
  });
}
You've registered a sync event of type add-todo, which you'll listen for in the service worker; when you get this event, you retrieve the data from IndexedDB and do a POST to your RESTful API.
self.addEventListener('sync', event => {
  if (event.tag == 'add-todo') {
    event.waitUntil(
      getTodo().then(todos => {
        // Post the messages to the server
        return fetch('/add', {
          method: 'POST',
          body: JSON.stringify(todos),
          headers: { 'Content-Type': 'application/json' }
        }).then(() => {
          // Success!
        });
      })
    );
  }
});
This is just an example of how you could achieve it using Background Sync. Note that you'll have to handle conflict resolution on the server.
You could use PouchDB on the client and Couchbase or CouchDB on the server. With PouchDB on the client, you can save data locally and set it to automatically sync/replicate whenever the user is online (see the sketch after the links below). When the databases synchronize and there are conflicting changes, CouchDB will detect this and flag the affected document with the special attribute "_conflicts":true. It determines which revision to use as the latest and saves the others as previous revisions of that record; it does not attempt to merge the conflicting revisions. It is up to you to dictate how the merging should be done in your application. Couchbase behaves similarly. See the links below for more on conflict resolution.
Conflict Management with CouchDB
Understanding CouchDB Conflict
Resolving Couchbase Conflict
Demystifying Conflict Resolution in Couchbase Mobile
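Here is a minimal PouchDB replication sketch; the database name and remote URL are placeholders:

var PouchDB = require('pouchdb');

// Local database in the browser; remote CouchDB (or Sync Gateway) URL.
var local = new PouchDB('todos');
var remote = 'http://localhost:5984/todos'; // placeholder

// live + retry keeps replicating whenever the user comes back online.
local.sync(remote, { live: true, retry: true })
  .on('change', function (info) {
    console.log('replicated batch', info);
  })
  .on('error', function (err) {
    console.error('sync error', err);
  });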
I've used PouchDB and Couchbase/CouchDB/IBM Cloudant, but I've done that through Hoodie. It has user authentication out of the box, handles conflict management, and more; think of it as your backend. In your TODO application, Hoodie would be a great fit. I've written about how to use Hoodie; see the links below:
How to build offline-smart application with Hoodie
Introduction to offline data storage and sync with PouchDB and Couchbase
At the moment I can think of two approaches, and the choice depends on what storage option you are using on your backend.
If you are using an RDBMS to back up all data:
The problem with offline-first systems in this approach is the possibility of conflicts when posting new data or updating existing data.
As a first measure to avoid conflicts, you will have to generate unique IDs for all objects on your clients, in such a way that they remain unique when posted to the server and saved in the database. For this you can safely rely on UUIDs: a UUID guarantees uniqueness across systems in a distributed setting, and whatever your language of implementation, you will have methods to generate UUIDs without any hassle.
Design your local database so that you can use the UUID as the primary key. On the server end you can have both an auto-incremented, indexed integer primary key and a VARCHAR column to hold the UUID. The primary key on the server uniquely identifies objects in that table, while the UUID uniquely identifies records across tables and databases.
When posting an object to the server at sync time, you just check whether any object with that UUID is already present and take the appropriate action from there. When you are fetching objects from the server, send both the object's primary key from your table and its UUID. This way, when you serialize the response into model objects or save them in the local database, you can tell the objects that have been synced apart from the ones that haven't: the objects that need syncing will not have a primary key in your local database, just a UUID.
There may be a case where your server malfunctions and refuses to save data while you are syncing. In this case you can keep an integer field on your objects that counts the number of times you have tried syncing them. If this number exceeds a certain value, say 3, you move on to sync the next object. What you do with the unsynced objects is then up to the policy you set for such objects; as a solution, you could discard them or keep them locally only. A sketch of the client side of this approach follows below.
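A small sketch of the client side of the UUID idea (crypto.randomUUID needs Node 15.6+ or a modern browser; the object shape is invented):

var crypto = require('crypto');

// Every object gets its UUID at creation time, so its identity is
// stable before it has ever reached the server.
function newTodo(title) {
  return {
    uuid: crypto.randomUUID(), // Node 15.6+; use a uuid library otherwise
    title: title,
    syncAttempts: 0,           // bumped on every failed sync, capped at 3
    serverId: null             // filled in once the server has saved it
  };
}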
If you are not using an RDBMS:
As an alternative approach, instead of keeping all objects, you could keep a log of the transactions each client performs locally and sync just those to the server. When fetching, you reconstruct the current state by applying all the transactions from the bottom up. This is very similar to what Git does: it saves changes in your repository in the form of transactions, recording what has been added (or removed) and by whom, and the current state of the repository for each user is derived from those transactions. This approach will not result in conflicts, but as you can see, it is a little trickier to develop (a toy illustration follows below).
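A toy illustration of replaying such a transaction log into current state; the transaction shape is my own invention:

// Apply a list of transactions (oldest first) to build current state.
function replay(transactions) {
  return transactions.reduce(function (state, tx) {
    switch (tx.op) {
      case 'add':    state[tx.id] = tx.value; break;
      case 'update': state[tx.id] = Object.assign({}, state[tx.id], tx.value); break;
      case 'remove': delete state[tx.id]; break;
    }
    return state;
  }, {});
}

var state = replay([
  { op: 'add',    id: 'a1', value: { title: 'buy milk' } },
  { op: 'update', id: 'a1', value: { done: true } }
]);
// state -> { a1: { title: 'buy milk', done: true } }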
Dreamcode, as described at http://nobackend.org/dreamcode.html (the idea that developers don't have to worry about the backend when developing web applications), is very interesting. However, I have a few questions about building application logic in the front-end.
The first question: even with authentication being processed in the backend, what are the ways to obfuscate the app logic so that it cannot easily be copied?
It is easy for a server to receive the application models. However, looking at the Store and Public Store ideas from Dreamcode, how can we handle fields that are not meant to be sent back to the front-end, for security purposes?
For example, this Gist shows how to get an object by ID:
// find one object
var type = 'note';
var id = 'abc4567';
store.find(type, id)
  .done(function (object) {});
The issue here is, for example, that I have an application where a guest user can post a document and edit it later with a password, so a guest user saves a document with an encrypted password in it.
When other users "view" the document from the front-end application, the Dreamcode data store will return all the fields of this document object (based on the Dreamcode specification), including the encrypted password, which is not good.
So how can we deal with making a front-end application with Dreamcode, given these potential limitations?
I am getting a remote JSON value into my client app as below.
var $Xhr = Ti.Network.createHTTPClient({
  onerror: function ($e) {
    Ti.API.info($e);
  },
  timeout: 5000
});

// Attach onload before send so the handler is in place when the
// response arrives.
$Xhr.onload = function () {
  if ($Xhr.status == 200) {
    try {
      Ti.API.info(this.responseText);
    } catch ($e) {
      Ti.API.info($e);
    } finally {
      $Xhr = null;
    }
  }
};

$Xhr.open("GET", "http://***********.json");
$Xhr.send();
My JSON URL is static. I would like to protect this URL from prying eyes after building the APK or publishing for iOS.
My server side also supports PHP. I have thought about MD5, SHA, etc., but I have never built a project around these algorithms.
Do you have any suggestions or approaches?
Thank you in advance.
I would just say that it is not possible for you to "hide" the endpoint. Your URL will always be visible to the user, because otherwise the user's browser wouldn't know how to send requests to your server.
If you meant only to hide the JSON object, even that is not totally possible. If your JavaScript knows what the values are, then any client smart enough to understand JavaScript will be able to decode your encoded JSON object. Remember, your JavaScript has the decoded object, and a user has full access to it. There is no protection against that. At best, you can hide it from everyday users by obscuring it, with MD5 or SHA as you put it.
If you wish to restrict access to app users only, you will need to authenticate your users first.
Once they are authenticated, you should generate a hash by concatenating the user ID (or any user-identifying data) and a key that you know (a string will do), and hashing the result using any hashing method; MD5 would probably be enough for this kind of usage, and SHA is better anyway.
The next step would be to send this hash with every AJAX request to your server; consider it additional data.
Finally, server-side, before handling the request and fetching the data to be sent, generate a hash the same way you did in your app, using the user ID of the requesting user and the same "secret" key you chose. You can then compare both hashes and see whether they're identical. If not, someone probably tried to forge a request from outside your app.
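A sketch of that check in Node; the secret and field names are placeholders, and an HMAC (crypto.createHmac) would be the more standard construction for this:

var crypto = require('crypto');

var SECRET = 'shared-secret'; // baked into the app and the server

// Same derivation on both ends: hash of user id + secret.
function requestHash(userId) {
  return crypto.createHash('sha256').update(userId + SECRET).digest('hex');
}

// Server side: recompute and compare against the hash the app sent.
function verifyRequest(userId, sentHash) {
  var expected = requestHash(userId);
  if (typeof sentHash !== 'string' || sentHash.length !== expected.length) {
    return false;
  }
  return crypto.timingSafeEqual(Buffer.from(expected), Buffer.from(sentHash));
}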
Note that it would be possible for someone authenticated to capture their own hash (which depends on their ID) and then use it in one of their own applications, so it may be a good idea to track requests server-side in order to check for suspicious usage of your API. You could as well change your "secret key" regularly (forcing an update of your app, though), or define an array with a different key for each day of the year in both your app and server code, so that each individual hash key changes every day, recurring each year.
If I have an XML database on my web server:
<Database>
<Client sid="0123456789abcdefg" name="John Doe" email="johndoe@mail.com" hash="9876543210abcdefg" salt="abcdefg9876543210">
<Setting>A Setting</Setting>
<Setting>Another Setting</Setting>
</Client>
...
</Database>
And I log in with the hash and salt, retrieve the SID, and redirect to the home page via PHP:
header("Location: home.html?sid=" . $sid);
And then use the SID in the location bar via JavaScript to retrieve the user settings from the same database, will I expose my clients' hash?
Is there a better way, or a more standard way, to set and get user settings on the web?
P.S.: Unless you have a really good reason, I really, really, really, don't want to use SQL. I prefer to be able to read my databases, and I like the tangibility and versatility of XML.
Edit: After a little more research, I learned that PHP supports a system for storing session variables ($_SESSION). This is perfect for me because I am, in fact, using sessions!
W3Schools says:
"A PHP session variable is used to store information about, or change settings for a user session. Session variables hold information about one single user, and are available to all pages in one application."
Much better than exposing various data in the address bar. =)
As long as your DB file is inaccessible over HTTP (i.e. locked by a .htaccess or equivalent) and other protocols (i.e. not sitting in a directory accessible by anonymous FTP), the only risk is to (inadvertently) let the hash and salt be collected among a bunch of other user-related data and sent to your clients.
If you have requests equivalent to SQL's * selector, that might be somewhat of a problem. You might want to put the critical data into a different DB file and encapsulate access to it in an interface dedicated to user registration and login, just to make sure no other piece of code can grab it (even by mistake) from your main DB.