Speed up simplex algorithm - javascript

I am playing around with a great simplex algorithm I have found here: https://github.com/JWally/jsLPSolver/
I have created a jsfiddle where I have set up a model and I solve the problem using the algorithm above. http://jsfiddle.net/Guill84/qds73u0f/
The model is basically a long array of variables and constraints. You can think of it as trying to find the cheapest means of transportation of passengers between different hubs (countries), where each country has a minimum demand for passengers, a maximum supply of passengers, and each connection has a price. I don't care where passengers go, I just want to find the cheapest way to distribute them. To achieve this I use the following minimising objective:
model = {
  "optimize": "cost",
  "opType": "min",
  "constraints": { // etc...
I am happy with the model and the answer provided by the algorithm ... but the latter takes a very long time to run (>15 seconds...) Is there any possible way I can speed up the calculation?
Kind regards and thank you.
G.

It sounds as though you have a minimum-cost flow problem. There's a reasonable-looking TopCoder tutorial on min-cost flow by Zealint, who covers the cycle-canceling algorithm that would be my first recommendation (assuming that there's no quick optimization that can be done for your LP solver). If that's still too slow, there's a whole literature out there.
Since you're determined to solve this problem with an LP solver, my suggestion would be to write a simpler solver that is fast and greedy but suboptimal and use it as a starting point for the LP by expressing the LP in terms of difference from the starting point.
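For illustration, a rough sketch of such a greedy warm start (this is not jsLPSolver API; the connection/supply/demand shapes and field names are assumptions about your model):

// Hypothetical greedy warm start: take the cheapest connections first and push
// as many passengers as supply and demand allow. The field names (from, to,
// price) and the supply/demand maps are assumptions, not part of jsLPSolver.
function greedyStart(connections, supply, demand) {
  var remainingSupply = Object.assign({}, supply);
  var remainingDemand = Object.assign({}, demand);
  var flow = {};
  connections
    .slice()
    .sort(function (a, b) { return a.price - b.price; })
    .forEach(function (c) {
      var amount = Math.min(remainingSupply[c.from] || 0, remainingDemand[c.to] || 0);
      if (amount > 0) {
        flow[c.from + "->" + c.to] = amount;
        remainingSupply[c.from] -= amount;
        remainingDemand[c.to] -= amount;
      }
    });
  return flow; // use this as the baseline and let the LP optimise the differences
}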

@Noobster, I'm glad that someone other than me is getting use out of my simplex library. I went through, looked at it, and was getting around the same runtime as you (10 - 20 seconds). There was a piece of the code that was needlessly transposing an array to turn the RHS from a 2d array into a 1d array. With your problem, this killed performance, eating up 60ms every time it happened (137 times for your problem).
I've corrected this in the repo and am seeing runtimes around 2 seconds. There are probably a ton of code clean-up optimizations like this that need to happen, but the problem sets I built this for (http://mathfood.com) are so small that I never knew this was an issue. Thanks!
For what it's worth, I took the simplex algorithm out of a college textbook and turned it into code; the MILP piece came from Wikipedia.

Figured it out. The most expensive piece of the code was the pivoting operation, which it turns out was doing a lot of work to update the matrix by adding 0. A little logic up front to prevent this dropped my run-time on node from ~12 seconds to ~0.5.
for (i = 0; i < length; i++) {
    if (i !== row) {
        pivot_row = tbl[i][col];
        for (j = 0; j < width; j++) {
            // No point in doing math if you're just adding
            // zero to the thing
            if (pivot_row !== 0 && tbl[row][j] !== 0) {
                tbl[i][j] += -pivot_row * tbl[row][j];
            }
        }
    }
}
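A further micro-tweak in the same spirit (a sketch, behaviourally equivalent): hoist the pivot_row check out of the inner loop so an entire row update is skipped when the multiplier is zero.

for (i = 0; i < length; i++) {
    if (i !== row) {
        pivot_row = tbl[i][col];
        // If the multiplier is zero, the whole inner loop would be a no-op.
        if (pivot_row !== 0) {
            for (j = 0; j < width; j++) {
                if (tbl[row][j] !== 0) {
                    tbl[i][j] += -pivot_row * tbl[row][j];
                }
            }
        }
    }
}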

Related

Why are two calls to string.charCodeAt() faster than having one with another one in a never reached if?

I discovered a weird behavior in nodejs/chrome/v8. It seems this code:
var x = str.charCodeAt(5);
x = str.charCodeAt(5);
is faster than this
var x = str.charCodeAt(5); // x is not greater than 170
if (x > 170) {
    x = str.charCodeAt(5);
}
At first I thought maybe the comparison is more expensive than the actual second call, but when the content inside the if block does not call str.charCodeAt(5), the performance is the same as with a single call.
Why is this? My best guess is v8 is optimizing/deoptimizing something, but I have no idea how to exactly figure this out or how to prevent this from happening.
Here is the link to jsperf that demonstrates this behavior pretty well at least on my machine:
https://jsperf.com/charcodeat-single-vs-ifstatment/1
Background: The reason I discovered this is that I was trying to optimize the token reading inside of babel-parser.
I tested and str.charCodeAt() is twice as fast as str.codePointAt(), so I thought I could replace this code:
var x = str.codePointAt(index);
with
var x = str.charCodeAt(index);
if (x >= 0xaa) {
    x = str.codePointAt(index);
}
But the second version does not give me any performance advantage, because of the behavior described above.
V8 developer here. As Bergi points out: don't use microbenchmarks to inform such decisions, because they will mislead you.
Seeing a result of hundreds of millions of operations per second usually means that the optimizing compiler was able to eliminate all your code, and you're measuring empty loops. You'll have to look at generated machine code to see if that's what's happening.
When I copy the four snippets into a small stand-alone file for local investigation, I see vastly different performance results. Which of them is closer to your real-world use case? No idea. And that kind of makes any further analysis of what's happening here meaningless.
As a general rule of thumb, branches are slower than straight-line code (on all CPUs, and with all programming languages). So (dead code elimination and other microbenchmarking pitfalls aside) I wouldn't be surprised if the "twice" case actually were faster than either of the two "if" cases. That said, calling String.charCodeAt could well be heavyweight enough to offset this effect.
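If you want to compare the variants outside of jsperf, a minimal stand-alone sketch for node could look like the following (the test string and iteration count are made up, and the running checksum exists only so the optimizer cannot throw the work away):

const str = "abcdef".repeat(1000);

function once(s) { return s.charCodeAt(5); }
function twice(s) { let x = s.charCodeAt(5); x = s.charCodeAt(5); return x; }
function branched(s) {
  let x = s.charCodeAt(5);
  if (x > 170) x = s.charCodeAt(5);
  return x;
}

for (const fn of [once, twice, branched]) {
  let sum = 0;
  const start = process.hrtime.bigint();
  for (let i = 0; i < 1e7; i++) sum += fn(str);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(fn.name, elapsedMs.toFixed(1), "ms", "(checksum " + sum + ")");
}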

WebAudio FDN Lossless Prototype unstable when using individual feedback gains

I'm trying to build an FDN Reverberator in WebAudio by following this article.
There is a simplified implementation of a Householder FDN which uses a common feedback gain for all delays and seems pretty stable.
However, when I try to implement the more general case that is mixed by a matrix I cannot seem to make it stable.
I have inlined most of the code to narrow down the issue, and put it in a JSFiddle.
EDIT: Warning, high volume in the unstable case.
The difference comes down to this:
var feedback = context.createGain();
feedback.gain.value = gainValue;
for (var i = 0; i < n; i++) {
    this.delays[i].connect(feedback);
    feedback.connect(this.delays[i]);
}
Compared to:
for (var i = 0; i < n; i++) {
    for (var o = 0; o < n; o++) {
        var feedback = context.createGain();
        feedback.gain.value = gainValue;
        this.delays[i].connect(feedback);
        feedback.connect(this.delays[o]);
    }
}
When I use a common feedback GainNode for all delays, it works fine. When I create individual feedback GainNodes for all delays, using the same gainValue, it becomes unstable.
What am I doing wrong?
EDIT: Clarification from the article.
As mentioned in §3.4, an "ideal" late reverberation impulse response should resemble exponentially decaying noise [314]. It is therefore useful when designing a reverberator to start with an infinite reverberation time (the "lossless case") and work on making the reverberator a good "noise generator". Such a starting point is [often] referred to as a lossless prototype [153,430]. Once smooth noise is heard in the impulse response of the lossless prototype, one can then work on obtaining the desired reverberation time in each frequency band (as will be discussed in §3.7.4 below).
The gain nodes add volume to each other. If you use multiple gain nodes, you need to divide their value by the number of active gain nodes.
So if you have 10 gain nodes playing simultaneously, the value for each gain node would be value / 10 (the number of active gain nodes). You also need to update the values of the gain nodes created earlier to the new value, so it is best to store all gain nodes in an array and loop over it.
I didn't try it, but I think it should work. It is physically total nonsense, because if you have ten people crying in the same room the dB meter reads about the same as if only one were crying. Try to think of gain nodes in a more mathematical way: they add their signals to each other.
Your Reverb is dope by the way.
In Code it means:
for (var i = 0; i < n; i++) {
    for (var o = 0; o < n; o++) {
        var feedback = context.createGain();
        feedback.gain.value = gainValue / 9;
        this.delays[i].connect(feedback);
        feedback.connect(this.delays[o]);
    }
}
I may reuse this code someday, right??? If I set n to 30 I get kind of a cymbal sound.

Why is using a loop to iterate from start of array to end faster than iterating both start to end and end to start?

Given an array having .length 100, containing elements with values 0 to 99 at the respective indexes, where the requirement is to find the element of the array equal to n: 51.
Why is using a loop to iterate from start of array to end faster than iterating both start to end and end to start?
const arr = Array.from({length: 100}, (_, i) => i);
const n = 51;
const len = arr.length;
console.time("iterate from start");
for (let i = 0; i < len; i++) {
    if (arr[i] === n) break;
}
console.timeEnd("iterate from start");
const arr = Array.from({length: 100}, (_, i) => i);
const n = 51;
const len = arr.length;
console.time("iterate from start and end");
for (let i = 0, k = len - 1; i < len && k >= 0; i++, k--) {
    if (arr[i] === n || arr[k] === n) break;
}
console.timeEnd("iterate from start and end");
jsperf https://jsperf.com/iterate-from-start-iterate-from-start-and-end/1
The answer is pretty obvious:
More operations take more time.
When judging the speed of code, you look at how many operations it will perform. Just step through and count them. Every instruction will take one or more CPU cycles, and the more there are the longer it will take to run. That different instructions take a different amount of cycles mostly does not matter - while an array lookup might be more costly than integer arithmetic, both of them basically take constant time and if there are too many, it dominates the cost of our algorithm.
In your example, there are a few different types of operations that you might want to count individually:
comparisons
increments/decrements
array lookup
conditional jumps
(we could be more granular, such as counting variable fetch and store operations, but those hardly matter - everything is in registers anyway - and their number basically is linear to the others).
Now both of your code snippets iterate about 50 times - the element on which they break the loop is in the middle of the array. Ignoring off-by-a-few errors, these are the counts:
               | forwards | forwards and backwards
---------------+----------+-----------------------
>=/===/<       |      100 |                    200
++/--          |       50 |                    100
a[b]           |       50 |                    100
&&/||/if/for   |      100 |                    200
Given that, it's not unexpected that doing twice the work takes considerably longer.
I'll also answer a few questions from your comments:
Is additional time needed for the second object lookup?
Yes, every individual lookup counts. It's not like they could be performed at once, or optimised into a single lookup (imaginable if they had looked up the same index).
Should there be two separate loops for each start to end and end to start?
Doesn't matter for the number of operations, just for their order.
Or, put differently still, what is the fastest approach to find an element in an array?
There is no "fastest" regarding the order; if you don't know where the element is (and the values are evenly distributed), you have to try every index. Any order - even a random one - would work the same. Notice however that your code is strictly worse, as it looks at each index twice when the element is not found - it does not stop in the middle.
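A version that does stop in the middle would look roughly like this (a sketch; each iteration still does double the work of the simple forward loop):

// Stop once the two indices cross, so each index is inspected only once
// even when the element is not found.
for (let i = 0, k = len - 1; i <= k; i++, k--) {
    if (arr[i] === n || arr[k] === n) break;
}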
But still, there are a few different approaches to micro-optimising such a loop - check these benchmarks.
let is (still?) slower than var, see Why is using `let` inside a `for` loop so slow on Chrome? and Why is let slower than var in a for loop in nodejs?. This set-up and tear-down (about 50 times) of the loop body scope in fact does dominate your runtime - that's why your inefficient code isn't completely twice as slow.
comparing against 0 is marginally faster than comparing against the length, which puts looping backwards at an advantage. See Why is iterating through an array backwards faster than forwards, JavaScript loop performance - Why is to decrement the iterator toward 0 faster than incrementing and Are loops really faster in reverse?
in general, see What's the fastest way to loop through an array in JavaScript?: it changes from engine update to engine update. Don't do anything weird, write idiomatic code, that's what will get optimised better.
@Bergi is correct. More operations means more time. Why? More CPU clock cycles.
Time is really a reference to how many clock cycles it takes to execute the code.
In order to get to the nitty-gritty of that you need to look at the machine level code (like assembly level code) to find the true evidence. Each CPU (core?) clock cycle can execute one instruction, so how many instructions are you executing?
I haven't counted clock cycles in a long time, since programming Motorola CPUs for embedded applications. If your code is taking longer, then it is in fact generating more machine-code instructions, even if the loop is shorter or runs an equal number of times.
Never forget that your code is actually getting compiled into a set of commands that the CPU is going to execute (memory pointers, instruction-level pointers, interrupts, etc.). That is how computers work, and it's easier to understand at the microcontroller level, like an ARM or Motorola processor, but the same is true for the sophisticated machines that we are running on today.
Your code simply does not run the way you write it (sounds crazy, right?). It runs as it is compiled into machine-level instructions (writing a compiler is no fun). Mathematical expressions and logic can be compiled into quite a heap of assembly-level machine code, and that is up to how the compiler chooses to interpret it (bit shifting, etc. - remember binary mathematics, anyone?).
Reference:
https://software.intel.com/en-us/articles/introduction-to-x64-assembly
Your question is hard to answer, but as @Bergi stated, more operations take longer - why? Because of the extra clock cycles it takes to execute your code. Dual core, quad core, threading, assembly (machine language) - it is complex. But no code gets executed as you have written it, whether C++, C, Pascal, JavaScript, or Java - unless you are writing in assembly (and even that compiles down to machine code, but it is closer to the actual execution code).
Do a master's in CS and you will get to counting clock cycles and sort times. You will likely make your own language framed on machine instruction sets.
Most people say who cares? Memory is cheap today and CPUs are screaming fast and getting faster.
But there are some critical applications where 10 ms matters, where an immediate interrupt is needed, etc.
Commerce, NASA, a Nuclear power plant, Defense Contractors, some robotics, you get the idea . . .
I vote let it ride and keep moving.
Cheers,
Wookie
Since the element you're looking for is always roughly in the middle of the array, you should expect the version that walks inward from both the start and end of the array to take about twice as long as one that just starts from the beginning.
Each variable update takes time, each comparison takes time, and you're doing twice as many of them. Since you know it will take only a few fewer iterations of the loop to terminate in this version, you should reason that it will cost about twice as much CPU time.
This strategy is still O(n) time complexity since it only looks at each item once; it's just specifically worse when the item is near the center of the list. If it's near the end, this approach will have a better expected runtime. Try looking for item 90 in both, for example.
The selected answer is excellent. I'd like to add another aspect: try findIndex(), it's 2-3 times faster than using loops:
const arr = Array.from({length: 900}, (_, i) => i);
const n = 51;
const len = arr.length;

console.time("iterate from start");
for (let i = 0; i < len; i++) {
    if (arr[i] === n) break;
}
console.timeEnd("iterate from start");

console.time("iterate using findIndex");
var i = arr.findIndex(function(v) {
    return v === n;
});
console.timeEnd("iterate using findIndex");
The other answers here cover the main reasons, but I think an interesting addition could be mentioning cache.
In general, sequentially accessing an array will be more efficient, particularly with large arrays. When your CPU reads an array from memory, it also fetches nearby memory locations into cache. This means that when you fetch element n, element n+1 is also probably loaded into cache. Now, cache is relatively big these days, so your 100 int array can probably fit comfortably in cache. However, on an array of much larger size, reading sequentially will be faster than switching between the beginning and the end of the array.
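If you want to experiment with that effect, here is a rough scaffold on a deliberately large typed array (the size is arbitrary and the outcome depends on your hardware; the sums exist only so the loops cannot be optimised away):

const big = new Float64Array(1e7).map((_, i) => i % 7);

console.time("sequential");
let s1 = 0;
for (let i = 0; i < big.length; i++) s1 += big[i];
console.timeEnd("sequential");

console.time("from both ends");
let s2 = 0;
for (let i = 0, k = big.length - 1; i <= k; i++, k--) s2 += big[i] + (i === k ? 0 : big[k]);
console.timeEnd("from both ends");

console.log(s1 === s2); // sanity check: both passes visit every element once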

Which Boolean is Faster? < or <=

I'm doing some work involving processing an insane amount of data in browser. As a result I'm trying to optimize everything down to the nuts and bolts. I don't need anyone telling me that I'm wasting my time or that premature optimization is the root of all evil.
I would just like to know if anyone who understands how JS works can tell me whether a less-than comparison runs faster than a less-than-or-equals comparison. What I mean by that is, would:
return (i<2? 0:1)
Be parsed and run faster than:
return (i<=1? 0:1)
In this example we're assuming that i is an integer. Thanks.
The JavaScript standard describes the steps that need to be taken in order to evaluate those expressions. You can take a look at the ECMAScript 2015 Language Specification, section 12.9.3.
Be aware that even if there is a slight difference between the steps of those two operations, other stuff in your application that you cannot control in JavaScript will have much more influence on performance than these simple operations - for example the work of the garbage collector, the just-in-time compiler, ...
Even if you try measuring time in JavaScript, this will not work, as just taking timestamps has a much bigger influence on performance than the actual expression you want to measure. Also, the code that you wrote might not be the code that is really evaluated, as some pre-optimizations might be applied by the engine before actually running it.
I wouldn't call this micro-optimisation, but rather nano-optimisation.
The cases are so similar that you'll most likely have a measurement precision below the gain you can expect...
(Edit)
If this code is optimised, the generated assembly code will just change from JA to JAE (on x86), and they use the same cycle count: a 0.0000% change.
If it is not, you might win one step within a select of the engine...
The annoying thing is that it makes you miss the larger picture: unless I'm wrong, you need a branch here, and if you're that worried about time, the statistical distribution of your input will influence the execution time WAY more (but still not that much...).
So take a step back and compare:
if (i < 2)
    return 0;
else
    return 1;
and:
if (i >= 2)
    return 1;
else
    return 0;
You see that for (100, 20, 10, 1, 50, 10) version (1) will branch way more, and for (0, 1, 0, 0, 20, 1) version (2) branches more.
That will make much more of a difference... and it might just as well be very difficult to measure!!!
(As a question left to the reader, I wonder how return +(i>1) compiles, and if there's a trick to avoid branching...)
(By the way, I'm not against early optimisation; I even posted some advice here, if it might interest you: https://gamealchemist.wordpress.com/2016/04/15/writing-efficient-javascript-a-few-tips/ )
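On that last parenthetical: a branchless formulation is easy to write in plain JS by coercing the boolean, though whether the engine actually emits branch-free code for it is up to the JIT - measure before trusting it (a sketch):

// Branchless variant of return (i < 2 ? 0 : 1)
function step(i) {
    return +(i > 1); // true -> 1, false -> 0
}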
I have created a fiddle using the performance.now API and the console.time API.
Both APIs report how many ms it took to execute the functions/loops.
I feel the major difference is in the result: performance.now gives a more accurate value, i.e. down to 1/1000th of a ms.
https://jsfiddle.net/ztacgxf1/
function lessThan() {
    var t0 = performance.now();
    console.time("lessThan");
    for (var i = 0; i < 100000; i++) {
        if (i < 1000) {}
    }
    console.timeEnd("lessThan");
    var t1 = performance.now();
    console.log("Perf -- >>" + (t1 - t0));
}

function lessThanEq() {
    var t0 = performance.now();
    console.time("lessThanEq");
    for (var i = 0; i < 100000; i++) {
        if (i <= 999) {}
    }
    console.timeEnd("lessThanEq");
    var t1 = performance.now();
    console.log("Perf -- >>" + (t1 - t0));
}

lessThan();
lessThanEq();
I haven't seen much difference. Maybe iterating more would give a different result.
Hope this helps you.

Solver for TSP-like Puzzle, perhaps in Javascript

I have created a puzzle which is a derivative of the travelling salesman problem, which I call Trace Perfect.
It is essentially an undirected graph with weighted edges. The goal is to traverse every edge at least once in any direction using minimal weight (unlike classical TSP where the goal is to visit every vertex using minimal weight).
As a final twist, an edge is assigned two weights, one for each direction of traversal.
I create a new puzzle instance everyday and publish it through a JSON interface.
Now I know TSP is NP-hard. But my puzzles typically have only a good handful of edges and vertices. After all they need to be humanly solvable. So a brute force with basic optimization might be good enough.
I would like to develop some (Javascript?) code that retrieves the puzzle from the server, and solves with an algorithm in a reasonable amount of time. Additionally, it may even post the solution to the server to be registered in the leader board.
I have written a basic brute-force solver for it in Java using my back-end Java model on the server, but the code is too fat and runs out of heap space quickly, as expected.
Is a Javascript solver possible and feasible?
The JSON API is simple. You can find it at: http://service.traceperfect.com/api/stov?pdate=20110218 where pdate is the date for the puzzle in yyyyMMdd format.
Basically a puzzle has many lines. Each line has two vertices (A and B). Each line has two weights (timeA for when traversing A -> B, and timeB for when traversing B -> A). And this should be all you need to construct a graph data structure. All other properties in the JSON objects are for visual purposes.
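For example, loading the puzzle and reducing it to a plain edge list could look something like this (a sketch; the lines array and the vertex/weight field names are guesses based on the description above - adjust them to the real payload):

// Hypothetical field names: change vertexA/vertexB/timeA/timeB to match the actual JSON.
async function loadPuzzle(pdate) {
  const res = await fetch("http://service.traceperfect.com/api/stov?pdate=" + pdate);
  const puzzle = await res.json();
  return puzzle.lines.map(line => ({
    a: line.vertexA,   // one endpoint
    b: line.vertexB,   // the other endpoint
    timeA: line.timeA, // weight for traversing a -> b
    timeB: line.timeB  // weight for traversing b -> a
  }));
}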
If you want to become familiar with the puzzle, you can play it through a flash client at http://www.TracePerfect.com/
If anyone is interested in implementing a solver for themselves, then I will post details about the API for submitting the solution to the server, which is also very simple.
Thank you for reading this longish post. I look forward to hearing your thoughts about this one.
If you are running out of heap space in Java, then you are solving it wrong.
The standard way to solve something like this is to do a breadth-first search, and filter out duplicates. For that you need three data structures. The first is your graph. The next is a queue named todo of "states" for work you have left to do. And the last is a hash that maps the possible "state" you are in to the pair (cost, last state).
In this case a "state" is the pair (current node, set of edges already traversed).
Assuming that you have those data structures, here is pseudocode for a full algorithm that should solve this problem fairly efficiently.
foreach possible starting_point:
    new_state = state(starting_point, {no edges visited})
    todo.add(new_state)
    seen[new_state] = (0, null)

while todo.workleft():
    this_state = todo.get()
    (cost, prev_state) = seen[this_state]
    edges = this_state[1]   # the set of edges already traversed
    foreach directed_edge in graph.directededges(this_state.current_node()):
        new_cost = cost + directed_edge.cost()
        new_visited = directed_edge.to()
        new_edges = edges + directed_edge.edge()
        new_state = state(new_visited, new_edges)
        if not exists seen[new_state] or new_cost < seen[new_state][0]:
            seen[new_state] = (new_cost, this_state)
            todo.add(new_state)

best_cost = infinity
full_edges = {all possible edges}
best_state = null
foreach possible location:
    end_state = (location, full_edges)
    (cost, last_move) = seen[end_state]
    if cost < best_cost:
        best_state = end_state
        best_cost = cost

# Now trace back the final answer.
path_in_reverse = []
current_state = best_state
while current_state[1] is not empty:
    previous_state = seen[current_state][1]
    path_in_reverse.push(edge from previous_state[0] to current_state[0])
    current_state = previous_state
Note that the hash seen is critical. It is what prevents you from getting into endless loops.
Looking at today's puzzle, this algorithm will have a maximum of a million or so states that you need to figure out. (There are 2**16 possible sets of edges, and 14 possible nodes you could be at.) That is likely to fit into RAM. But most of your nodes only have 2 edges connected. I would strongly advise collapsing those. This will reduce you to 4 nodes and 6 edges, for an upper limit of 256 states. (Not all are possible, and note that multiple edges now connect two nodes.) This should be able to run very quickly with little use of memory.
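Since the question asks whether a JavaScript solver is feasible, here is a minimal sketch of the same state-search idea in JS. It assumes the puzzle has already been reduced to an edge list of the form {a, b, timeA, timeB} with small integer node ids and at most ~30 edges (so a bitmask fits in a number); the encoding and names are mine, not part of the puzzle API:

function solve(edges, nodeCount) {
  const FULL = (1 << edges.length) - 1;          // bitmask: every edge traversed at least once
  const key = (node, mask) => node * (FULL + 1) + mask;

  const seen = new Map();                        // state key -> { cost, prev, via }
  const todo = [];                               // simple FIFO of [node, mask] states

  for (let node = 0; node < nodeCount; node++) { // any node may be the starting point
    seen.set(key(node, 0), { cost: 0, prev: null });
    todo.push([node, 0]);
  }

  while (todo.length) {
    const [node, mask] = todo.shift();
    const cost = seen.get(key(node, mask)).cost;
    edges.forEach((e, idx) => {
      // Traverse the edge in whichever direction leaves the current node.
      let next, weight;
      if (e.a === node) { next = e.b; weight = e.timeA; }
      else if (e.b === node) { next = e.a; weight = e.timeB; }
      else return;
      const newMask = mask | (1 << idx);         // re-traversing an edge leaves the mask unchanged
      const k = key(next, newMask);
      const newCost = cost + weight;
      if (!seen.has(k) || newCost < seen.get(k).cost) {
        seen.set(k, { cost: newCost, prev: [node, mask], via: idx });
        todo.push([next, newMask]);
      }
    });
  }

  // Best end state: any node with all edges traversed at least once.
  let best = null;
  for (let node = 0; node < nodeCount; node++) {
    const s = seen.get(key(node, FULL));
    if (s && (!best || s.cost < best.cost)) best = { node: node, cost: s.cost, prev: s.prev };
  }
  return best; // follow the prev/via pointers back through seen to reconstruct the path
}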
For most parts of the graph you can apply the idea behind the Seven Bridges of Königsberg: http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg.
This way you can obtain the number of lines that you will have to repeat in order to solve the puzzle.
At the beginning, you should not start at nodes which have short edges over which you will have to travel two times.
If I summarize:
start at a node with an odd number of edges.
do not travel over lines which sit on an even node more than once.
use the shortest path to travel from one odd node to another.
A simple recursive brute-force solver with this heuristic might be a good way to start.
Or another way:
Try to find the shortest edges such that, if you remove them from the graph, the remaining graph has only two odd-numbered nodes and can be considered solvable like the Königsberg bridges. The solution is then solving the reduced graph without picking up the pencil, and once you hit a node with a "removed" edge you just go back and forth over it.
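As a tiny illustration of that idea, counting the odd-degree nodes tells you how close the puzzle is to being solvable without repeating any line (a sketch, using the same assumed {a, b} edge fields as above):

// 0 odd nodes: an Eulerian circuit exists (no line needs to be repeated);
// 2 odd nodes: an Eulerian path exists, starting and ending at those nodes;
// more than 2: some lines will have to be travelled more than once.
function oddDegreeNodes(edges) {
  const degree = new Map();
  for (const e of edges) {
    degree.set(e.a, (degree.get(e.a) || 0) + 1);
    degree.set(e.b, (degree.get(e.b) || 0) + 1);
  }
  return [...degree].filter(([, d]) => d % 2 === 1).map(([node]) => node);
}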
On your Java backend you might be able to use this TSP code (work in progress), which uses Drools Planner (open source, Java).
