I am currently building a web application that acts as a storage tank level dashboard. It parses incoming data from a number of sensors in tanks and stores these values in a database. The application is built using express / node.js. The data is sampled every 5 minutes but is sent to the server every hour (12 samples per transmission).
I am currently trying to expand the application's capabilities to detect changes in the tank level due to filling or emptying. The end goal is to have a daily report that generates a summary of the filling / emptying events with the duration of time and quantity added or removed. This image shows a screenshot of tank capacity during one day - https://imgur.com/a/kZ50N.
My questions are:
What algorithms / functions are available that detect changes in tank level? How would I implement them in my application?
When should the data handling take place? As the data is parsed and saved to the server, or at the end of the day with a function that goes through all of that day's data?
Is it worth considering some sort of data cleaning during the parsing stage? I have noticed times when there are random spikes in the data due to noise.
How should I handle events where the tank starts emptying immediately after a delivery completes? The algorithm will need to be robust enough to treat a change in the direction of the slope as the end of an event. An example of this is in the provided image.
I realise that it may be difficult to put together a robust solution. There are times when the tank is being emptied at the same time that it is being filled, which makes these reductions difficult to measure. The only way to know that this took place is that the slope during the delivery flatlines for approximately 15 minutes and the delivery is a fixed amount less than the usual delivery total.
This has been a fun project to put together. Thanks for any assistance.
You should be able to develop an algorithm that specifies what you mean by a fill or an emptying (a change in tank level). A good place to start is X% in Y seconds. You then calibrate to avoid false positives and false negatives (e.g. showing a fill when there was none vs. missing a fill when one occurs). One potential approach is to average the fuel level over a period of time (say 10 minutes) and compare it with the average for the next 10 minutes. If the difference is above a threshold (say 5%), you can call this a change.
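As a rough sketch of that windowed-average comparison (assuming levels are percentages sampled every 5 minutes as in the question, so a 10-minute window is two samples; the window size and threshold are illustrative, not tuned values):

```javascript
// Hypothetical sketch of the windowed-average change detector described above.
// Assumes levels are percentages sampled every 5 minutes, so a 10-minute
// window is 2 samples; windowSize and thresholdPct are illustrative defaults.
function detectChange(levels, windowSize = 2, thresholdPct = 5) {
  if (levels.length < windowSize * 2) return null;
  const avg = (arr) => arr.reduce((a, b) => a + b, 0) / arr.length;
  const prev = avg(levels.slice(-windowSize * 2, -windowSize)); // earlier window
  const curr = avg(levels.slice(-windowSize));                  // latest window
  const diff = curr - prev;
  if (Math.abs(diff) < thresholdPct) return null; // below threshold: no event
  return diff > 0 ? 'fill' : 'empty';             // sign gives the direction
}
```

Running this over each hourly transmission's 12 samples (plus the tail of the previous hour) would flag candidate events, which could then be merged into start/end times and quantities for the daily report.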
When you process the data depends on when you need it: if the users need to be constantly informed of changes, this could be done when the data is queried. Processing the data into level changes on write to your datastore might be more efficient (you only do it once), but you lose the ability to tweak your algorithm. It could well come down to performance, e.g. if someone wants to pull a year's worth of data, can the system deal with this?
You will almost certainly need to apply something like a low-pass filter to the incoming data: you don't want to report a tank fill based on a temporary spike in level. This is easy to do with an array of values. As mentioned above, a moving average, say over the last 10 minutes of levels, is another way of smoothing the data. You may never get a 0% false positive rate or a 0% false negative rate; you can only aim for values as low as possible.
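A minimal version of that smoothing, as a simple moving average over the last N samples (N = 3 here is arbitrary; early entries average over a shorter window rather than being dropped):

```javascript
// A simple moving average as the "low pass filter" mentioned above.
// Each output value averages the current sample with up to n-1 previous ones.
function movingAverage(levels, n = 3) {
  return levels.map((_, i) => {
    const window = levels.slice(Math.max(0, i - n + 1), i + 1);
    return window.reduce((a, b) => a + b, 0) / window.length;
  });
}
```

With this in place, a single-sample spike of 40 in otherwise flat data of 10 is damped to 20, so a detector with a reasonable threshold no longer fires on it.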
In this case it looks like a fill followed by an emptying of the tank. If you consider these to be two separate events then you can simply detect changes on the incoming data. I'd suggest you create a graph marking fills and emptyings as symbols, so you can eyeball the data to ensure you are detecting changes. I would also say you could add some very useful unit tests for your calculations, using perhaps jasmine.js or cucumber.js.
I am making a neural network with reinforcement learning.
The model looks like this:
63 input neurons (environment state) - 21 neurons in the hidden layer - 4 outputs. The output neurons contain the probability of going up, down, left, right. ([0,0,0,1])
The neural network gives the result of the move, the agent performs an action.
Each new move, after the agent has performed actions, I give him a reward or a penalty.
How do I do backpropagation in TensorFlow.js? I need to propagate back not an error but a reward or a penalty, and only from a certain output neuron.
Example:
1. The neural network predicted the move to the right.
2. The agent went to the right and left the playing field.
3. This is a bad action, so the agent is charged a penalty of -0.02.
4. In the current model of the neural network, we determine the output neuron responsible for the move to the right.
5. We backpropagate from this neuron with a coefficient of -0.02. If it is not a penalty but a reward, the coefficient will be positive.
How to do step 5 in code?
UPD: I initially thought the task was simple and didn't require additional clarification, so I formulated the question briefly. I think it's worth giving more information :)

The playing field is 10 squares wide and 10 high, 100 in total. There are 20 chicken legs in static positions on the field. The agent's task is to collect as many chicken legs as possible.

I started my research with a genetic algorithm. I created a TensorFlow model that takes the state of the game as input. I didn't train the model; its tensors are just a set of random weights. After each pass of the game, we choose the winners, cross them, and mutate them a little. The crossing and mutation happen directly in the neural network attached to each agent: I don't train the system, I take the weights from the agent's neural network (its brain) and perform mutation and crossover directly, changing the coefficients in the tensors. Result = 10 chicken legs. This is not bad, the agents really do learn, but I am not satisfied with the result.

I now want to use reinforcement learning. I'm new to this field and can't find examples anywhere of exactly how to reward or penalise a neuron for wrong actions, in the form of code. I understand the concept of the reward, but I can't work out how to implement it without an order of actions to follow. For example, the agent walked across the playing field once. He made 4 moves to the left and went outside the playing field. On the 2nd move, he hit the square with a chicken leg. The experiment is over. Each move, I saved the state of the game (the environment for the input neurons) to one array and the rewards to another: [0,1,0,-1] => 1 is the reward for a chicken leg, -1 for going out of bounds.
How do I now teach the system with this data?
(I assumed it was necessary to reduce the weights along the branch from the wrong output neuron back to the inputs by a gradient, not training the neural network at all but working purely with the weights.)
Consider that RL agents do not learn from being told which actions are good or bad; they learn from rewards.
An action is never explicitly tagged as good or bad; only a reward is given back, and usually rewards come as the consequence of several actions.
Consider a simple agent that learns to play tic-tac-toe. Usually the reward is 1 for winning, -1 for losing, and 0 for ties. For any other action (i.e. intermediate moves while play is going on) the reward is 0.
Now the agent makes its first move: whatever it is, it will get 0 as reward. Was it a good or a bad move? You don't know until the end of the game. And more formally, you don't even know at the end: the agent may win a game despite a very bad initial move.
This is also true for any other action, not only the first. Guessing how good an action was in RL (and in any trial-and-error framework) is known as the credit assignment problem, and it is not explicitly solved.
This is what makes RL very different from supervised learning: in supervised learning, you have pairs of inputs and expected outputs that tell you how to solve your task. In RL you just know what to achieve (win a tic-tac-toe game) but not how to do it, so you simply pay the agent once it does well.
That said, there are several formulations of how to solve an RL problem. From your question I assume you're trying to do Q-learning: the agent learns a function called Q which takes a state and an action and tells you how good that action is in that state. When the function is modelled as a neural network, it usually takes a state as input and outputs the Q value for each action: exactly like the model you shared in the image.
And now we arrive at your question: how do you update the function? Well, first you need to take an action from your current state. Which one? The best? Since your agent is still learning, always taking the action it thinks is best won't help: it will be wrong most of the time and will only retry bad moves. A random one? That may help, but it is also problematic, since random play rarely achieves the task and gets paid, so the agent will not succeed and never learns. What I've introduced here is the exploration-exploitation dilemma: should the agent exploit its knowledge by taking the action it believes is best, or explore by trying something different to see what happens? You need to do both, and an appropriate balance between them is crucial. A simple yet effective way to achieve this is the epsilon-greedy strategy: with probability epsilon take a random action (explore); otherwise take the greedy action (the best up to the agent's knowledge).
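Epsilon-greedy selection over an array of Q-values can be sketched like this (the injectable rand parameter is only there for testability, and epsilon = 0.1 is a common starting point, not a tuned value):

```javascript
// Epsilon-greedy action selection as described above. With probability
// epsilon we explore (uniform random action); otherwise we exploit the
// current Q estimates by taking the argmax.
function epsilonGreedy(qValues, epsilon = 0.1, rand = Math.random) {
  if (rand() < epsilon) {
    return Math.floor(rand() * qValues.length); // explore: random action index
  }
  return qValues.indexOf(Math.max(...qValues)); // exploit: argmax over Q-values
}
```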
Once you take an action, you receive a reward, and it's time to learn from it and update your function. For this, you have your current Q function and the brand-new reward, and you move your estimate of Q(s, a) towards the target r + gamma * max_a' Q(s', a'), i.e. the reward plus the discounted value of the best action available in the next state.
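The update above can be sketched for a tabular Q function kept in a plain object; with a neural network you would instead fit the network towards the same target. The state key, learning rate, and discount below are illustrative assumptions:

```javascript
// One-step Q-learning update for a transition (state, action, reward, nextQ).
// qTable maps state keys to arrays of per-action Q-values; nextQ is the array
// of Q-values for the next state. alpha (learning rate) and gamma (discount)
// are illustrative defaults, not tuned values.
function qUpdate(qTable, state, action, reward, nextQ, alpha = 0.1, gamma = 0.9) {
  const q = qTable[state];
  const target = reward + gamma * Math.max(...nextQ); // r + gamma * max_a' Q(s',a')
  q[action] += alpha * (target - q[action]);          // move Q(s,a) towards target
  return q[action];
}
```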
There are some small details not covered here that are needed to make it work; I just wanted to clarify some points. For a Keras-based implementation, take a look at this.
I have a database where a few statistics are stored (PlayerID, Time, Kills, Deaths), 7 in total but these are enough to explain.
What I did was load the table in php, create an array with all the statistics (including statistics that are a product of multiple statistics, kills per death for example), sort them, and cut the top 10.
Then I passed this top10-array as JSON to javascript, where I just created tr- and td-elements and made the respective tables append them.
However, with just 10 different sets of statistics, it took an eternity (around 30 seconds, I'd say) to load the page. My guess was that the sorting takes too long on the server, so I tried just passing the initial array to JavaScript and commenting out the sorting part with /* */ to test whether it would go any faster. It did not.
Afterwards I turned the sorting back on, but disabled the JavaScript part and just displayed the result with var_dump(). It still didn't go much faster.
The actual code is kinda long and I think this isn't a question that you need the code for, but I can still post the code if you really need it. What exactly is causing the load time, and what would be the best way to sort and display the statistics?
Edit: I forgot to say that using ORDER BY in the SQL query doesn't work, because I need to calculate some statistics from others.
Sounds like you spend all your time retrieving and processing the whole data set, and then throw away everything but the top 10 records.
Without seeing the model and understanding how the top 10 are selected, the only advice would be to rethink your model: decide on an indexed field by which you can fetch just the top 10 records, and do the calculations for that field before you save the statistics into the DB.
It also depends on which operation is more time-sensitive for you, SELECT or UPDATE (i.e. when you fetch or when you save statistics). But I bet a couple of maths operations before you save the data will hardly affect the time spent saving it, while greatly improving the time spent generating reports, including the top 10 report.
I have several objects, each with a priority value. The priority can be between 1 (lowest) and 200 (highest). Every value is represented by a colour: the lowest value is green, rgba(0,255,0,1), and the highest is red, rgba(255,0,0,1).
I calculate the colour with a classic equation where every priority value produces a different colour, so in the end there are 200 possible colours, ranging from green (lowest) through yellow (middle) to red (highest), based on priority.
My question is: I redraw all objects on the canvas every 100 ms. Is it better to calculate those values every time to get the wanted colour, or to generate, only once in an initialisation function, an array of 200 colours where the value at array[100] is the colour for an object with priority 100?
I expect there won't be a big difference, but still, one of these approaches must be better.
Calculating once is the better option in almost every case (the result is classically called a lookup table). Memory is cheaper than CPU cycles, which means consumer hardware has plenty of RAM but is always short of cycles.
In this case you are right that 200 colours every 100 ms is insignificant, even at a full frame rate of 16.666 ms (60 fps), but clients will have many applications/tabs/services running on the device, and anything a programmer does to reduce CPU load will benefit the client.
There is also an added benefit that programmers tend to forget: CPU cycles require much more power than memory. For a single machine a few million extra cycles are nothing, but if every programmer wrote in a manner that reduced overall load, the worldwide savings in power would be considerable. I am off to hug a tree now; hope that helps.
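One possible shape for that lookup table, built once at initialisation (the green-to-yellow-to-red ramp below is a simple linear interpolation and may differ from your existing equation):

```javascript
// Build the 200-entry colour lookup table once; table[priority] then gives
// the colour string for the draw loop. Priority 1 maps to green, 200 to red,
// passing through yellow in the middle.
function buildColorTable(size = 200) {
  const table = new Array(size + 1); // index by priority directly (1..size)
  for (let p = 1; p <= size; p++) {
    const t = (p - 1) / (size - 1);                       // 0..1 across the range
    const r = Math.round(255 * Math.min(1, 2 * t));       // red ramps up first
    const g = Math.round(255 * Math.min(1, 2 * (1 - t))); // then green ramps down
    table[p] = `rgba(${r},${g},0,1)`;
  }
  return table;
}
```

The redraw loop then reduces to a single array read per object, e.g. `ctx.fillStyle = table[obj.priority]`.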
From the question I understand that you don't want the colour calculation to affect your animation; use a Web Worker to run the calculation on a separate thread. One more thing: read all styles at one time and write all styles at one time. Don't interleave reads and writes, because that can cause layout thrashing.
I noticed that the data I'm getting from the Leap Motion controller is quite noisy. Apart from the obvious (i.e. the position of the fingers), I've encountered events such as
fingers moving between hands,
"phantom" hands appearing,
fingers disappearing and reappearing immediately afterwards.
Does the API (in particular the JavaScript API) provide any means of cleaning this data, or is there any other way of making it less noisy? All of these events can of course be handled in user code, but having to do this yourself every time seems less than ideal.
In short, no: at the moment developers have to implement the logic for that themselves. Be aware that this might not be true in the future; the API changes fast.
I had problems with this as well. I solved it by using a circular queue with a max limit of (for example) 100 frames, and tracking the data for just one pointable. I would then filter out the readings I considered abnormal. For example width, which is very unreliable: I would get the modal value and accept a +2/-2 range around it, ignoring everything else. Works rather well :)
In short, as you already mentioned, you need to collect data and filter out the noise. Tool and width precision will change, they told me. Do a search on the forum for isTool and see how others found ways to get 'stabilized' data.
For me the solution (for what I wanted, which was to track one pointable and get a reliable width) was:
Hold a queue of max X items
Set a tolerance limit
Compare the data in the queue
Filter out what was considered noise
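The steps above can be sketched roughly like this (the queue size and tolerance follow the example numbers in the answer; the class and method names are my own):

```javascript
// A bounded queue of recent samples plus a modal-value filter that rejects
// readings outside mode +/- tolerance, as described in the steps above.
class NoiseFilter {
  constructor(maxSize = 100, tolerance = 2) {
    this.maxSize = maxSize;
    this.tolerance = tolerance;
    this.samples = [];
  }
  push(value) {
    this.samples.push(value);
    if (this.samples.length > this.maxSize) this.samples.shift(); // drop oldest
  }
  mode() {
    const counts = new Map();
    let best = this.samples[0];
    for (const v of this.samples) {
      counts.set(v, (counts.get(v) || 0) + 1);
      if (counts.get(v) > (counts.get(best) || 0)) best = v;
    }
    return best; // most frequent value in the window
  }
  accept(value) {
    // Keep only samples within +/- tolerance of the modal value.
    return Math.abs(value - this.mode()) <= this.tolerance;
  }
}
```

In a Leap Motion loop you would push each frame's width into the filter and only use readings that pass accept().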
I'm looking at storing and presenting event data in a web-based UI.
There are a few complications, however:
There is a very high density of events - hundreds of them every second.
Events are spaced at very small time intervals - e.g. every few microseconds.
There's a need to be able to zoom out to see the macro picture - e.g. over the day, or, over a few minutes, as well as the micro picture - i.e. down to individual events.
I was thinking of doing some kind of timeline that you could scroll in sync with a list of events; however, I'm not sure of the best way to achieve this.
Does anybody know of any existing HTML components, or even examples, that show how to present a large number of events in a timeline?
And any recommendations/thoughts on a suitable backend for storing this? There is a fairly large number of events, and we'd be storing them going back e.g. 6 months, and we'd want the web UI to be quite responsive when panning back and forth and looking up events.
Cheers,
Victor
On the backend I'd say go with Node.js and MongoDB, since you're going to be doing a lot of IO to retrieve your data. For the front end, you may want to take a look at the Google Charts annotated timeline for displaying the data.