What is actually meant by the direction of data flow?
Consider the composition pattern:
I have a class A which, upon instantiation, creates an instance of another class B.
Class A holds public data accessible to both class A and class B.
The instance of class B is instantiated with data from class A.
The instance of class B calls a method within class A to manipulate data for the instance of class A.
What is the data in the data flow considered as? The data held by a Class or hierarchy and permissions of Classes?
For example,
Child class should not be able to call Parent class methods on Instances of Parent class.
What is actually meant by the direction of data flow?
Let us consider a specific type of software, so that we can limit the data flow count to larger ideas.
There are two big divisions of data flow in telecom transport.
A. The Transport HW carries east/west data ... from one data transport equipment to another. Typically, this data flow is too fast for software to deal with directly, and the data flows both directions simultaneously.
B. The telecom transport data control is called north/south data. This flow contains 5 software data flows. The flow is to/from the local hw, and from/to an operation user or host.
There are typically 5 flows in B (telecom transport data control):
status-update -- software periodically reads hw status information (often once per second) and delivers the info captured to local 'fast' storage (where other commands can find it for display or delivery to host)
alarms-update -- very much like status update (i.e. a periodic read), but only alarms. Alarms have duration and timeouts, and are reasonably complex.
pm-update -- very much like status update (i.e. a periodic read), but the software collects summary counts of specific activities ... how many bytes output, how many seconds-in-error, etc. This also has round robin 15 minute time buckets, timeouts and other complications.
configuration control -- user applied commands can change the operational configuration. The sw, in response to user command, changes specific hw config registers. Most T1 / E1 hardware can run in either mode, and the user is required to configure each sub-system at startup.
provisioning control -- the user applies commands that enable (or disable) the availability of a specific hardware type to 'transport' east/west data. (think customer billable service)
Each of these flows, from an architectural approach, may be demand-pull OR supply-push, but probably not both.
Example of demand-pull: status update (direction of flow is north, from hw to local storage)
On most of the systems I worked on, the status-update was triggered off the system clock, and, in a typical demand-pull, the clock event triggered a read of all the status conditions from the hw. This collected info is typically stashed into local 'fast' storage, where other commands can find it for display or delivery to host.
Examples of supply-push: configuration (direction of flow is south, from user to hw)
Any config (or provisioning) command is asynchronous to the system clock (because humans don't like and are not good at syncing). Thus, an action is triggered when the user presses the enter-key, and the supply-pushed data (command parameters) flows to the hardware.
For supply-push, there are sometimes coordination efforts, i.e. no config (nor prov) changes are allowed while specific other things are going on, but this is typically handled with a simple mutex.
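The two patterns above can be sketched roughly as follows. This is only an illustration, not real telecom code: FakeHardware, statusCache, pollStatus and applyConfig are hypothetical names, and a plain boolean flag stands in for the mutex.

```javascript
// Hypothetical stand-in for the transport hardware registers.
class FakeHardware {
  constructor() {
    this.status = { linkUp: true, errors: 0 };
    this.config = { mode: "T1" };
  }
  // Status data flows north (hw -> software).
  readStatus() {
    return { ...this.status };
  }
  // Config data flows south (user -> hw).
  writeConfig(params) {
    Object.assign(this.config, params);
  }
}

const hw = new FakeHardware();
const statusCache = {};   // local 'fast' storage
let configLocked = false; // stands in for the simple mutex

// Demand-pull: a clock event triggers the read; software pulls data out of the hw.
function pollStatus() {
  Object.assign(statusCache, hw.readStatus());
}

// Supply-push: a user command arrives asynchronously and pushes parameters to the hw.
function applyConfig(params) {
  if (configLocked) return false; // some other activity currently forbids changes
  configLocked = true;
  try {
    hw.writeConfig(params);
    return true;
  } finally {
    configLocked = false;
  }
}

// In a real system pollStatus would be driven by a timer,
// e.g. setInterval(pollStatus, 1000); here it is called once by hand.
pollStatus();
applyConfig({ mode: "E1" });
```

Note that the direction of the flow is decided by where the data ends up, not by who makes the call: pollStatus is initiated by software, yet the data moves north.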
Summary - at the above architectural level, the flows may seem simpler than they actually can be.
For example, sometimes, to read during a demand-pull, the software must 'tickle' some feature of the hw, and that 'tickle' feels like the 'wrong direction' ... but in this case, the 'tickle' is not part of the data flow, just overhead to accomplish the flow by extracting / pulling the data out.
Similarly, to write config data, the sw must sometimes determine if the hw will allow the change to that next configuration, and perhaps that is checked by reading from the hw. These reads feel like the wrong direction, but again, it is not part of the data flow, just some flow overhead.
This duality happens at many levels.
I can't speak as much to desktop software, but I have seen demand-pull and supply-push in many places. Perhaps somewhat less disciplined in my view point, but that is more likely about my lack of experience with large desktop applications.
I'm creating a browser game, I know browsers aren't really safe but I was wondering if there's any workaround my issue.
A player kills a monster and the platform sends an ID to my backend like so:
Axios({
  url: this.server + "reward",
  headers: { token: "foo123", charToken: "bar123" },
  method: "POST",
  data: {
    id: 10001, // Monster ID
    value: 10 // How many monsters were killed
  },
});
The problem is, I see no possible way to prevent a user from just sending random requests to the server, saying he actually did this level 300 times in a second and getting 300x reward items/exp.
I thought about requesting a token before sending this reward request, but this just makes it harder and doesn't really solve anything.
Your current design
As it currently stands, the design you've implied here can't mitigate your concerns about a replay attack on your game. A helpful way to think about this is to frame it in the context of a "conversational" HTTP request/response paradigm. Currently, what you're allowing is something akin to:
Your client tells your server, "Hey, I've earned this reward!"
Your server, having no way to verify this, responds: "Great! I've added that reward to your account."
As you guessed, this paradigm doesn't provide very much protection against even a slightly motivated attacker.
Target design
What your design should enforce is something more along the lines of the following:
Your clients should only be able to ask the server for something. ("I've chosen to attack monster #123, what was the result?")
Your server should respond to that request with the information pertinent to the result of that ask.
"You're not in range of that enemy; you can't attack them."
"That enemy was already dispatched by you."
"That enemy was already dispatched by another player previously."
"You can't attack another enemy while you're in combat with another."
"You missed, and the enemy still has 100hp. The enemy struck back and hit you; you now have 90hp."
"You hit the enemy, and the enemy now has 1hp. The enemy struck back and missed; you still have 100hp."
"You hit the enemy, and the enemy now has 0hp. You've earned 5 coins as a reward."
In this design, your server would be the gatekeeper to all this information, and the data in particular that attackers would seek to modify. Instead of implicitly trusting that the client hasn't been modified, your server would be calculating all this on its own and simply sending the results of those calculations back to the client for display after recording it in a database or other storage medium.
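A minimal sketch of that gatekeeper idea, assuming in-memory maps in place of a real database. The names monsters, players and resolveAttack, as well as the damage roll, are made up for illustration; the point is only that the client sends an intent and the server computes the outcome and the reward.

```javascript
// Server-side state: the client never sends hit points or rewards.
const monsters = new Map([["m1", { hp: 10, reward: 5, alive: true }]]);
const players = new Map([["p1", { hp: 100, coins: 0 }]]);

// rollDamage is injectable so the outcome can be made deterministic in tests.
function resolveAttack(playerId, monsterId, rollDamage = () => Math.floor(Math.random() * 6)) {
  const player = players.get(playerId);
  const monster = monsters.get(monsterId);
  if (!player || !monster) {
    return { ok: false, message: "Unknown player or enemy." };
  }
  if (!monster.alive) {
    return { ok: false, message: "That enemy was already dispatched." };
  }

  const damage = rollDamage();
  if (damage === 0) {
    return { ok: true, message: `You missed, and the enemy still has ${monster.hp}hp.` };
  }

  monster.hp = Math.max(0, monster.hp - damage);
  if (monster.hp === 0) {
    monster.alive = false;
    player.coins += monster.reward; // the reward is decided here, never by the client
    return { ok: true, message: `You hit the enemy, and the enemy now has 0hp. You've earned ${monster.reward} coins as a reward.` };
  }
  return { ok: true, message: `You hit the enemy, and the enemy now has ${monster.hp}hp.` };
}
```

Replaying the same request after the kill simply gets the "already dispatched" answer, so a script that repeats it 300 times earns nothing extra.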
Other items for your consideration
Sequential identifiers
While your server might still coordinate all these actions and calculate the results of said actions itself, it's still possible that any sufficiently motivated attacker could still find ways to exploit your backend, as you've used predictably-incremented values to identify your back-end entities. A small amount of study of your endpoints along with a short script could still successfully yield unintended in-game riches.
When creating entities that you'd prefer players not be able to enumerate (read: it probably is all of them), you can use something akin to a UUID to generate difficult-to-predict unique identifiers. Instead of one enemy being #3130, and the next one being #3131, your enemies are now known internally in your application as 5e03e6b9-1ec2-45f6-927d-794e13d9fa82 and 31e95709-2187-4b02-a521-23b874e10a03. While these aren't, by mathematical definition, reliably cryptographically secure, this makes guessing the identifiers themselves several orders of magnitude more difficult than sequential integers.
I generally allow my database to generate UUIDs automatically for every entity I create so I don't have to think about it in my backend implementation; support for this out of the box will be dependent on the RDBMS you've chosen. As an example in SQL Server, you'd have your id field set to the type UNIQUEIDENTIFIER and have a default value of NEWID().
If you do choose to generate these for whatever reason in Node.js (as you've tagged), something like uuidjs/uuid can generate these for you.
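As a rough sketch of generating the identifier in Node.js: the snippet below uses the randomUUID function from Node's built-in crypto module instead of the uuidjs/uuid package mentioned above, which produces version 4 (random) UUIDs. createEnemy is a hypothetical helper.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical factory: every new entity gets a hard-to-guess identifier.
function createEnemy(name) {
  return { id: randomUUID(), name };
}

const a = createEnemy("slime");
const b = createEnemy("slime");
console.log(a.id, b.id); // two unrelated identifiers, not #3130 and #3131
```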
Rate limiting
You haven't mentioned anything about it in your question (whether you've already used it or not), but you really should be enforcing a rate limit for users of your endpoints that's reasonable for your game. Is it really plausible that your user JoeHacker could attack 15 times in the span of a second? No, probably not.
The way this would be accomplished varies based on whatever back-end server you're running. For Express-based servers, several packages including the aptly-named express-rate-limit will get the job done in a relatively simple yet effective manner.
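To show the idea without depending on any package, here is a hand-rolled fixed-window limiter. It is not how express-rate-limit is implemented or configured; createRateLimiter and allowAttack are hypothetical names, and the limit of 2 attacks per second is an arbitrary example.

```javascript
// Fixed-window limiter: at most `limit` actions per `windowMs` for each key.
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // key -> { start, count }
  return function isAllowed(key, now = Date.now()) {
    const w = windows.get(key);
    if (!w || now - w.start >= windowMs) {
      // First action, or the previous window has expired: start a new window.
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (w.count < limit) {
      w.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}

// Example: 2 attacks per second per user.
const allowAttack = createRateLimiter(2, 1000);
```

In a request handler you would call allowAttack(userId) before doing any work and answer with an error (HTTP 429 is the usual choice) when it returns false.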
I have a game with a worldwide highscore feature. It uses the firebase database, and writes the user's score if it is the highscore. The rules state that anyone can read or write, so other people can view the highscore.
My problem is that it's easy to manipulate the highscore without actually getting a score. How can I make it so when you achieve a new highscore, it is written to the database, but if you go into the console and change the data, it won't allow you to change it?
if (score > worldScore) {
  database.ref().update({highscore: score});
}
You can see that it is very easy to change the data.
In the Firebase console, there are never any restrictions on what you can read or write in the database. Security rules never apply there.
Strictly speaking, unless you involve some serverside component that can one way or another confirm the score was achieved legitimately, this is not possible. Clientside data is always subject to user manipulation; any confirmation checks on that data which you perform on the client would also be subject to user manipulation.
(As for how to actually perform that serverside confirmation: it'll depend on the details of the game, but one way might be to have the client periodically send significant game data to the server; if the server can determine that any of the data has changed in ways that should be impossible according to the game rules -- like a score jumping too far in too short a period of time -- then ignore any future score submissions from that user. Even this isn't perfect: the user can still cheat by manipulating the data that gets sent in that periodic poll, but they'd have to keep their changes at least within the bounds of plausibility.)
Typically you'll want to store not just the score, but also the way the player achieved that score. For example: if it is a board game, write their moves in addition to the result. If you have both you can:
Verify that the score they wrote is indeed the score that is gotten by applying the moves.
Perform some analysis to detect if the moves seem likely to be computer generated.
Both of these processes are cases of "trusted code", i.e. code that should be running in a trusted environment. For this you can use either an environment you control (a private server, your laptop, etc), Cloud Functions for Firebase, or (in some cases) Firebase's server side security rules. Which ones are feasible depends on your exact use-case, and your available time.
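The first check (re-deriving the score from the moves) could look roughly like this when run in trusted code. The move format and the rule "a capture is worth 10 points" are invented for the example; a real game would replay its own rules here.

```javascript
// Hypothetical rule set: each move is { type: "move" } or { type: "capture" },
// and a capture is worth 10 points.
function scoreFromMoves(moves) {
  let score = 0;
  for (const move of moves) {
    if (move.type === "capture") {
      score += 10;
    } else if (move.type !== "move") {
      throw new Error(`illegal move type: ${move.type}`);
    }
  }
  return score;
}

// Accept a submission only if replaying the moves yields the claimed score.
function isSubmissionValid(submission) {
  try {
    return scoreFromMoves(submission.moves) === submission.score;
  } catch {
    return false; // an illegal move invalidates the whole submission
  }
}
```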
I need to understand what the Xdata protocol is. I searched the internet but couldn't find anything helpful. We are an REU team trying to add a sensor to an interface that only takes sensors with the Xdata protocol.
Looking for the same information, this is the best document I could find:
ftp://aftp.cmdl.noaa.gov/user/jordan/iMet-1-RSB%20Radiosonde%20XDATA%20Daisy%20Chaining.pdf
Basically, an XDATA packet is ASCII-formatted, and looks like this:
xdata=01010123456789abcdef
Where:
"xdata=" - header
"01" - instrument ID
"01" - instrument position in daisy chain (more than one instrument can be connected)
the rest is data
Another document mentions that packet should be terminated with CR/LF: ftp://aftp.cmdl.noaa.gov/user/jordan/XDATA%20Packet%20Example.pdf
Please note that XDATA doesn't seem to specify the data structure. It appears that the data processing is up to the ground station.
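To make the layout above concrete, here is a small parser sketch based only on the field positions listed in this answer (header, two characters of instrument ID, two characters of daisy-chain position, then data). parseXdata is a hypothetical name, and the fields are kept as raw strings because their interpretation is left to the instrument and the ground station.

```javascript
// Splits one XDATA line into its fields; returns null if it is not an XDATA packet.
function parseXdata(line) {
  const packet = line.replace(/\r?\n$/, ""); // drop the CR/LF terminator if present
  const header = "xdata=";
  if (!packet.startsWith(header)) return null;

  const body = packet.slice(header.length);
  if (body.length < 4) return null; // need at least instrument ID + chain position

  return {
    instrumentId: body.slice(0, 2),
    chainPosition: body.slice(2, 4),
    data: body.slice(4),
  };
}

console.log(parseXdata("xdata=01010123456789abcdef\r\n"));
```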
Since you didn't specify the radiosonde and instrument you plan to use, I won't write too much here, but you can go to http://www.esrl.noaa.gov/gmd/ozwv/wvap/sw.html where there is a little bit more information about the protocol (but not too much).
I'm the creator of the docs and website linked in the previous answer. Blagus summarized it well. Any instrument you want to send data through an xdata-compatible radiosonde (typically the iMet-1-RSB) to the ground should output 3.3V (3V works too) UART serial packets at 9600 baud (8-N-1, typically no flow control) according to the protocol. We keep track of a list of instrument ID numbers here to avoid conflicts, feel free to contact us to add a new instrument in the future (while prototyping, you can just make one up).
You can also "daisy chain" several xdata instruments together. Any incoming xdata packets have their DC index incremented and then are forwarded down the chain. When xdata packets reach the radiosonde they are stripped of their header info, a CRC is calculated and appended, then they are transmitted as binary FSK radio data to the antenna on the ground. The antenna/preamp is connected to a receiver and then SkySonde Server/Client can decode the data (if using an iMet radiosonde) from the receiver's audio output.
I have a web page that shows a large amount of data from the server. The communication is done via ajax.
Every time the user interacts and changes this data (Say user A renames something) it tells the server to do the action and the server returns the new changed data.
If user B accesses the page at the same time and creates a new data object it will again tell the server via ajax and the server will return with the new object for the user.
On A's page we have the data with a renamed object. And on B's page we have the data with a new object. On the server the data has both a renamed object and a new object.
What are my options for keeping the page in sync with the server when multiple users are using it concurrently?
Options such as locking the entire page or dumping the entire state to the user on every change are best avoided.
If it helps, in this specific example the webpage calls a static webmethod that runs a stored procedure on the database. The stored procedure will return any data it has changed and no more. The static webmethod then forwards the return of the stored procedure to the client.
Bounty Edit:
How do you design a multi-user web application which uses Ajax to communicate with the server but avoids problems with concurrency?
I.e. concurrent access to functionality and to data on a database without any risk of data or state corruption
Overview:
Intro
Server architecture
Client architecture
Update case
Commit case
Conflict case
Performance & scalability
Hi Raynos,
I will not discuss any particular product here. What others mentioned is a good toolset to have a look at already (maybe add node.js to that list).
From an architectural viewpoint, you seem to have the same problem that can be seen in version control software. One user checks in a change to an object, another user wants to alter the same object in another way => conflict. You have to integrate users' changes to objects while at the same time being able to deliver updates timely and efficiently, detecting and resolving conflicts like the one above.
If I was in your shoes I would develop something like this:
1. Server-Side:
Determine a reasonable level at which you would define what I'd call "atomic artifacts" (the page? Objects on the page? Values inside objects?). This will depend on your webservers, database & caching hardware, # of user, # of objects, etc. Not an easy decision to make.
For each atomic artifact have:
an application-wide unique-id
an incrementing version-id
a locking mechanism for write-access (mutex maybe)
a small history or "changelog" inside a ringbuffer (shared memory works well for those). A single key-value pair might be OK too though less extendable. see http://en.wikipedia.org/wiki/Circular_buffer
A server or pseudo-server component that is able to deliver relevant changelogs to a connected user efficiently. Observer-Pattern is your friend for this.
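The per-artifact changelog could be sketched like this. It is an in-process toy, not shared memory: Changelog, push and since are hypothetical names, and each entry is assumed to carry a version field.

```javascript
// Fixed-size changelog: once full, the oldest entry is overwritten.
class Changelog {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Array(capacity);
    this.count = 0; // total number of entries ever written
  }

  push(entry) {
    this.entries[this.count % this.capacity] = entry;
    this.count += 1;
  }

  // Returns the retained entries with version > sinceVersion, oldest first.
  since(sinceVersion) {
    const kept = Math.min(this.count, this.capacity);
    const result = [];
    for (let i = this.count - kept; i < this.count; i++) {
      const entry = this.entries[i % this.capacity];
      if (entry.version > sinceVersion) result.push(entry);
    }
    return result;
  }
}

const log = new Changelog(3);
for (let v = 1; v <= 5; v++) log.push({ version: v, value: "value" + v });
console.log(log.since(3)); // entries for versions 4 and 5
```

A client that has fallen further behind than the buffer holds cannot be served from the changelog alone and would have to reload the artifact.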
2. Client-Side:
A javascript client that is able to have a long-running HTTP-Connection to said server above, or uses lightweight polling.
A javascript artifact-updater component that refreshes the sites content when the connected javascript client notifies of changes in the watched artifacts-history. (again an observer pattern might be a good choice)
A javascript artifact-committer component that may request to change an atomic artifact, trying to acquire the mutex lock. It will detect if the state of the artifact had been changed by another user just seconds before (latency of javascript client and commit process factors in) by comparing the known clientside artifact-version-id and the current serverside artifact-version-id.
A javascript conflict-solver allowing for a human which-change-is-the-right decision. You may not want to just tell the user "Someone was faster than you. I deleted your change. Go cry.". Many options from rather technical diffs or more user-friendly solutions seem possible.
So how would it roll ...
Case 1: kind-of-sequence-diagram for updating:
Browser renders page
javascript "sees" artifacts, each having at least one value field, a unique-id and a version-id
javascript client gets started, requesting to "watch" the found artifacts history starting from their found versions (older changes are not interesting)
Server process notes the request and continuously checks and/or sends the history
History entries may contain simple notifications "artifact x has changed, client pls request data" allowing the client to poll independently or full datasets "artifact x has changed to value foo"
javascript artifact-updater does what it can to fetch new values as soon as they become known to have updated. It executes new ajax requests or gets fed by the javascript client.
The page's DOM content is updated, the user is optionally notified. History-watching continues.
Case 2: Now for committing:
artifact-committer knows the desired new value from user input and sends a change-request to the server
serverside mutex is acquired
Server receives "Hey, I know artifact x's state from version 123, let me set it to value foo pls."
If the serverside version of artifact x is equal to 123 (it cannot be less), the new value is accepted and a new version id of 124 is generated.
The new state-information "updated to version 124" and optionally new value foo are put at the beginning of the artifact x's ringbuffer (changelog/history)
serverside mutex is released
requesting artifact committer is happy to receive a commit-confirmation together with the new id.
meanwhile serverside server component keeps polling/pushing the ringbuffers to connected clients. All clients watching the buffer of artifact x will get the new state information and value within their usual latency (See case 1.)
Case 3: for conflicts:
artifact committer knows desired new value from user input and sends a change-request to the server
in the meanwhile another user updated the same artifact successfully (see case 2.) but due to various latencies this is yet unknown to our other user.
So a serverside mutex is acquired (or waited on until the "faster" user committed his change)
Server receives "Hey, I know artifact x's state from version 123, let me set it to value foo."
On the Serverside the version of artifact x now is 124 already. The requesting client can not know the value he would be overwriting.
Obviously the Server has to reject the change request (not counting in god-intervening overwrite priorities), releases the mutex and is kind enough to send back the new version-id and new value directly to the client.
confronted with a rejected commit request and a value the change-requesting user did not yet know, the javascript artifact committer refers to the conflict resolver which displays and explains the issue to the user.
The user, being presented with some options by the smart conflict-resolver JS, is allowed another attempt to change the value.
Once the user selected a value he deems right, the process starts over from case 2 (or case 3 if someone else was faster, again)
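Cases 2 and 3 boil down to a compare-the-version-then-write step on the server. A minimal sketch, assuming a single Node.js process where a synchronous function cannot be interleaved (which stands in for the mutex); artifacts and commit are hypothetical names:

```javascript
// Server-side store: artifact id -> current version and value.
const artifacts = new Map([["x", { version: 123, value: "bar" }]]);

// The client says which version it last saw and what it wants to write.
function commit(artifactId, knownVersion, newValue) {
  const artifact = artifacts.get(artifactId);
  if (!artifact) {
    return { accepted: false, reason: "unknown artifact" };
  }
  if (artifact.version !== knownVersion) {
    // Case 3: someone else was faster. Reject, and hand back the current
    // state so the client's conflict resolver can show it to the user.
    return { accepted: false, version: artifact.version, value: artifact.value };
  }
  // Case 2: the client was up to date. Accept and bump the version.
  artifact.version += 1;
  artifact.value = newValue;
  return { accepted: true, version: artifact.version };
}

console.log(commit("x", 123, "foo")); // accepted, now at version 124
console.log(commit("x", 123, "baz")); // rejected, current value "foo" is returned
```

With several frontend servers the same check has to happen under a lock that all of them share, or as one atomic conditional update in the datastore.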
Some words on Performance & Scalability
HTTP Polling vs. HTTP "pushing"
Polling creates requests, one per second, 5 per second, whatever you regard as an acceptable latency. This can be rather cruel to your infrastructure if you do not configure your (Apache?) and (php?) well enough to be "lightweight" starters. It is desirable to optimize the polling request on the serverside so that it runs for far less time than the length of the polling interval. Splitting that runtime in half might well mean lowering your whole system load by up to 50%.
Pushing via HTTP (assuming webworkers are too far off to support them) will require you to have one apache/lighttpd process available for each user all the time. The resident memory reserved for each of these processes and your system's total memory will be one very certain scaling limit that you will encounter. Reducing the memory footprint of the connection will be necessary, as well as limiting the amount of continuous CPU and I/O work done in each of these (you want lots of sleep/idle time).
backend scaling
Forget database and filesystem, you will need some sort of shared memory based backend for the frequent polling (if the client does not poll directly then each running server process will)
if you go for memcache you can scale better, but it's still expensive
The mutex for commits has to work globally even if you want to have multiple frontend servers to loadbalance.
frontend scaling
regardless if you are polling or receiving "pushes", try to get information for all watched artifacts in one step.
"creative" tweaks
If clients are polling and many users tend to watch the same artifacts, you could try to publish the history of those artifacts as a static file, allowing apache to cache it, nevertheless refreshing it on the serverside when artifacts change. This takes PHP/memcache out of the game for some requests. Lighttpd is very efficient at serving static files.
use a content delivery network like cotendo.com to push artifact history there. The push-latency will be bigger but scalability's a dream
write a real server (not using HTTP) that users connect to using java or flash(?). You have to deal with serving many users in one server-thread. Cycling through open sockets, doing (or delegating) the work required. Can scale via forking processes or starting more servers. Mutexes have to remain globally unique though.
Depending on load scenarios group your frontend- and backend-servers by artifact-id ranges. This will allow for better usage of persistent memory (no database has all the data) and makes it possible to scale the mutexing. Your javascript has to maintain connections to multiple servers at the same time though.
Well I hope this can be a start for your own ideas. I am sure there are plenty more possibilities.
I am more than welcoming any criticism or enhancements to this post, wiki is enabled.
Christoph Strasen
I know this is an old question, but I thought I'd just chime in.
OT (operational transforms) seem like a good fit for your requirement for concurrent and consistent multi-user editing. It's a technique used in Google Docs (and was also used in Google Wave):
There's a JS-based library for using Operational Transforms - ShareJS (http://sharejs.org/), written by a member from the Google Wave team.
And if you want, there's a full MVC web-framework - DerbyJS (http://derbyjs.com/) built on ShareJS that does it all for you.
It uses BrowserChannel for communication between the server and clients (and I believe WebSockets support should be in the works - it was in there previously via Socket.IO, but was taken out due to the developer's issues with Socket.io) Beginner docs are a bit sparse at the moment, however.
I would consider adding time-based modified stamp for each dataset. So, if you're updating db tables, you would change the modified timestamp accordingly. Using AJAX, you can compare the client's modified timestamp with the data source's timestamp - if the user is ever behind, update the display. Similar to how this site checks a question periodically to see if anyone else has answered while you're typing an answer.
You need to use push techniques (also known as Comet or reverse Ajax) to propagate changes to the user as soon as they are made to the db. The best technique currently available for this seems to be Ajax long polling, but it isn't supported by every browser, so you need fallbacks. Fortunately there are already solutions that handle this for you. Among them are: orbited.org and the already mentioned socket.io.
In the future there will be an easier way to do this which is called WebSockets, but it isn't sure yet when that standard will be ready for prime time as there are security concerns about the current state of the standard.
There shouldn't be concurrency problems in the database with new objects. But when a user edits an object the server needs to have some logic that checks whether the object has been edited or deleted in the meantime. If the object has been deleted the solution is, again, simple: Just discard the edit.
But the most difficult problem appears when multiple users are editing the same object at the same time. If User 1 and 2 start editing an object at the same time, they will both make their edits on the same data. Let's say the changes User 1 made are sent to the server first while User 2 is still editing the data. You then have two options: You could try to merge User 1's changes into the data of User 2 or you could tell User 2 that his data is out of date and show him an error message as soon as his data gets sent to the server. The latter isn't a very user-friendly option here, but the former is very hard to implement.
One of the few implementations that really got this right for the first time was EtherPad, which was acquired by Google. I believe they then used some of EtherPad's technologies in Google Docs and Google Wave, but I can't tell that for sure. Google also opensourced EtherPad, so maybe that's worth a look, depending on what you're trying to do.
It's really not easy to do this simultaneous editing stuff, because it's not possible to do atomic operations on the web because of the latency. Maybe this article will help you to learn more about the topic.
Trying to write all this yourself is a big job, and it's very difficult to get it right. One option is to use a framework that's built to keep clients in sync with the database, and with each other, in realtime.
I've found that the Meteor framework does this well (http://docs.meteor.com/#reactivity).
"Meteor embraces the concept of reactive programming. This means that you can write your code in a simple imperative style, and the result will be automatically recalculated whenever data changes that your code depends on."
"This simple pattern (reactive computation + reactive data source) has wide applicability. The programmer is saved from writing unsubscribe/resubscribe calls and making sure they are called at the right time, eliminating whole classes of data propagation code which would otherwise clog up your application with error-prone logic."
I can't believe that nobody has mentioned Meteor. It's a new and immature framework for sure (and only officially supports one DB), but it takes all the grunt work and thinking out of a multi-user app like the poster is describing. In fact, you can't NOT build a multi-user live-updating app. Here's a quick summary:
Everything is in node.js (JavaScript or CoffeeScript), so you can share stuff like validations between the client and server.
It uses websockets, but can fall back for older browsers
It focuses on immediate updates to local object (i.e. the UI feels snappy), with changes sent to the server in the background. Only atomic updates are allowed to make mixing updates simpler. Updates rejected on the server are rolled back.
As a bonus, it handles live code reloads for you, and will preserve user state even when the app changes radically.
Meteor is simple enough that I would suggest you at least take a look at it for ideas to steal.
These Wikipedia pages may help add perspective to learning about concurrency and concurrent computing for designing an ajax web application that either pulls or is pushed state event (EDA) messages in a messaging pattern. Basically, messages are replicated out to channel subscribers which respond to change events and synchronization requests.
https://en.wikipedia.org/wiki/Category:Concurrency_control
https://en.wikipedia.org/wiki/Distributed_concurrency_control
https://en.wikipedia.org/wiki/CAP_theorem
https://en.wikipedia.org/wiki/Operational_transformation
https://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
There are many forms of concurrent web-based collaborative software.
There are a number of HTTP API client libraries for etherpad-lite, a collaborative real-time editor.
django-realtime-playground implements a realtime chat app in Django with various real-time technologies like Socket.io.
Both AppEngine and AppScale implement the AppEngine Channel API; which is distinct from the Google Realtime API, which is demonstrated by googledrive/realtime-playground.
Server-side push techniques are the way to go here. Comet is (or was?) a buzz word.
The particular direction you take depends heavily on your server stack, and how flexible you and it are. If you can, I would take a look at socket.io, which provides a cross-browser implementation of websockets, which provide a very streamlined way to have bidirectional communication with the server, allowing the server to push updates to the clients.
In particular, see this demonstration by the library's author, which demonstrates almost exactly the situation you describe.
In my JavaScript game (made with jQuery) I have the player position stored in a database. When the character is moving, I just send a request to a specific URL, e.g. mysite.com/map/x1/y3 (where the character's position is x=1, y=3).
That URL sends the coordinates to the database and checks whether any other players are near ours. If so, it also sends back a JSON object with the names and coordinates of those players.
And here is my question: how do I secure it? Someone could look into my JavaScript code and prepare a URL like mysite.com/map/x100/y234, and it would 'teleport' him to some other side of the map.
Any data/computation processed in JavaScript in the browser will be insecure since all the code runs on the local machine. I would recommend to list all the parameters critical to a fair experience of play, such as the player position, score, resources... and compute the management of these parameters on the server-side. You would only gather user inputs from the browser and send the updated state to the browser for display.
Even if you choose to compute some values on the browser side to avoid latency, you should not take them into account for the global state shared by players, and you should resynchronize the local state with the global state - always in the direction global to local - from time to time.
Like in a typical form handling, you should also check that the values sent by the browser for user inputs fall into reasonable bounds, e.g. relative movement in one second is less than a certain distance.
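A bounds check of that kind might look like the sketch below, run on the server against the position the server itself has stored. MAX_STEP, positions and tryMove are hypothetical names, and "at most one tile per request" is just an example rule.

```javascript
const MAX_STEP = 1; // hypothetical rule: at most one tile per move request

// Server-side record of where each player really is.
const positions = new Map([["player1", { x: 1, y: 3 }]]);

// Returns true and updates the position only if the requested move is plausible.
function tryMove(playerId, x, y) {
  const current = positions.get(playerId);
  if (!current) return false;
  if (!Number.isInteger(x) || !Number.isInteger(y)) return false;

  const dx = Math.abs(x - current.x);
  const dy = Math.abs(y - current.y);
  if (dx > MAX_STEP || dy > MAX_STEP) return false; // this would be a teleport

  positions.set(playerId, { x, y });
  return true;
}

console.log(tryMove("player1", 2, 3));     // an adjacent tile
console.log(tryMove("player1", 100, 234)); // the 'teleport' URL from the question
```

Combined with a limit on how often a player may move, this keeps a hand-crafted URL from doing anything the game itself would not allow.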
You could obfuscate your javascript source code. That will at least deter casual cheats, however there's probably no way to make it completely secure using javascript.