I've developed a business rule engine in which users can write rules in boolean syntax.
Example rules: R1, R2, R3
Sample Expression: (R1 AND R2) OR R3
I want to visualize this expression. For example, a visualization framework might display the expression in a tree view and color-code it.
Is there a JavaScript (or other) framework to achieve this?
(Application is an ASP.NET application)
I can't help but answer this one even though my answer may not help you easily solve your problem. Back in 1998, my very first Javascript project was precisely a boolean expression visualizer.
The code is not available anywhere, so I can't share it. (I doubt even my former employer still has a copy.) And even if it were, it ran on IE4, 5.0 and 5.5; I don't think it was ever updated for IE6, and I don't know if it ran there.
But I can still tell you the basic ideas, and even today, I'm still fairly proud of the results, although I know I would shudder to see the actual code.
Of course a boolean expression can easily be represented by a tree structure. Each non-leaf node was either an AND, an OR, or a NOT node in the tree, and the ANDs and ORs could have multiple children (so I represented "A and B and C and D" as AND(A, B, C, D), not just as combinations of binary ANDs). To display the data, I simply used nested boxes. ANDs ran horizontally, ORs ran vertically, with the keyword "and" or "or" repeated between the blocks. NOT was just a box in a box with the keyword "not" in the outer one.
My leaf nodes were associated with real data scenarios that the user could use for testing, so instead of just "A" and "B", they looked, for instance, like
age < 30
gender = 'F'
income > 40000
The user could enter sample data for fields age, gender, and income and the output would change to a red-green display to show whether each block of the expression, and of course the entire expression was true or false.
The fields to use were configurable, and the test cases were saved for future elaboration.
This was a very fun project, and it helped in communication between business people who were writing rules, and programmers who implemented them, groups who often had very different ideas of how one might use the word "and" in polite company. :-)
But the main point is that one very useful way to visualize a boolean expression is with simple boxes: NOT is a box in a box, with the word "not" in the outer one. ORs are boxes containing vertically grouped boxes with "or" in between, and ANDs are boxes containing horizontally grouped boxes with "and" in between. If you can actually assign truthy/falsey values to your primitives, then green for truthy boxes and red for falsey ones makes for a very compelling display.
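To make the nested-box idea concrete, here is a minimal sketch in present-day JavaScript. It is not the original code (which is lost); the node shapes and the function names evaluate and renderBoxes are invented for illustration, and the leaf labels are assumed to be plain text that needs no HTML escaping.

```javascript
// Hypothetical node shapes (not from the original project):
//   { leaf: "R1" }                      a primitive rule
//   { op: "not", child: node }          negation
//   { op: "and" | "or", children: [] }  n-ary AND / OR
function evaluate(node, values) {
  if ("leaf" in node) return Boolean(values[node.leaf]);
  if (node.op === "not") return !evaluate(node.child, values);
  const results = node.children.map((c) => evaluate(c, values));
  return node.op === "and" ? results.every(Boolean) : results.some(Boolean);
}

// Render nested boxes as an HTML string: ANDs run horizontally, ORs run
// vertically, NOT is a box in a box; green border = true, red = false.
function renderBoxes(node, values) {
  const color = evaluate(node, values) ? "green" : "red";
  const open = `<div style="border:2px solid ${color};padding:4px;`;
  if ("leaf" in node) return `${open}">${node.leaf}</div>`;
  if (node.op === "not") {
    return `${open}">not ${renderBoxes(node.child, values)}</div>`;
  }
  const direction = node.op === "and" ? "row" : "column";
  const inner = node.children
    .map((c) => renderBoxes(c, values))
    .join(`<span>${node.op}</span>`);
  return `${open}display:flex;flex-direction:${direction}">${inner}</div>`;
}

// (R1 AND R2) OR R3, with sample truth values for the three rules.
const expr = {
  op: "or",
  children: [
    { op: "and", children: [{ leaf: "R1" }, { leaf: "R2" }] },
    { leaf: "R3" },
  ],
};
const html = renderBoxes(expr, { R1: true, R2: false, R3: true });
```

Assigning html to an element's innerHTML would show a green outer box containing a red AND box and a green R3 box. Re-evaluating every subtree on each render is wasteful but keeps the sketch short.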
...
But you'll have to write your own code. Sorry.
I have eight different lists of RGB values, stored as whole numbers ranging from 0 to 255.
For example, one list has 8,349,042 values in it, or 2,783,014 RGB sets of three.
The core objective is the user selects a pixel in an image.
That pixel's (R,G,B) value is grabbed and searched for within these lists. It exists in one of these lists, as all the lists together contain all the possible RGB combinations (16,777,216)
I'm trying to figure out what's the best way to store and search through these values.
Again: these values don't change; they are hard-coded lists (see the bullet point below) with a known range.
The search query would be at minimum 3 times every event which would be every 10-30 seconds or so if the user was spamming.
OR Best case scenario, if the storage and search technique is fast enough I would like to: run the search on every pixel in an image (of maybe 800 x 600 or smaller resolutions) to have more data to play with. If I run into any memory limitations, I plan to work with it and use it as restrictions for my game design.
I used Javascript to generate these lists, going through and assigning each value based on how close it was to a base color.
[maybe unimportant how I made these lists]
I first assigned black and white RGB lists based on hard numeric limitations, then the rest of the RGB values were looped through and assigned to their closest base colors, red, yellow, cyan, green, blue and magenta. If there was a tie in distance I gave it to currently shortest list to try to keep it somewhat balanced. I may try to optimize this later and generate a new list, but not during runtime, just raw data.
I saved the results as plain text, and they are currently stored as text that I can dump into a large array.
At first I was trying to store the data as a JSON file along with my scripts. But I struggled to read the data and save it to an array. I ran into issues with using fetch and async and not being able to have the array where I needed it. Testing with console.log(arr) and getting undefined. I'm guessing because it wasn't loaded yet.
I can just paste it hard coded into the array but it's ginormous and I know there has to be a better way.
I've also been hearing about the differences between arrays, sets, and objects, and the different searching techniques within them.
Most of them seemed better suited to multi-variable data, like databases of names, ages, and locations.
Since my data is all numeric, I was thinking it could use a more bit/byte-based approach?
I was reading some things on trees and hashes, and bit crunching or encryption?
Trees seemed nice for quicker searching as I could try to assign each branch of the tree to each R, G and B of the value, but I would also need to figure out how to convert my Giant single list of numbers into that, it also could be just the search type and that may depend on how I store the data.
I also struggled to understand the difference between front end and back end. I believe everything I've tried would be considered front end as I'm only testing my code in a browser.
I was pointed towards Node for backend but got lost in trying to get the console to run things.
I'm willing to give any of these things a try, but I don't want to go down a path and find out it can't do what I want, or not optimally enough: a burden on the server, or a burden on the user, such as waiting too long, being blocked by user data security, or having to do something more than just load the game, permissions-wise.
So I'm hoping someone can give me suggestions on what I should pursue, so I can knuckle down and have a better focus on what I need to learn to be well versed and best tackle this problem.
EDIT: Simplified question: In Javascript, I have an array of 2 million (x,y,z) numbers. What's the best way to search that array for a specific (x,y,z) value?
Would it be better to store the data in a different format than an array for constant searches?
I'm not certain I've understood the overall goal here but have a suggestion to consider if you are trying to assign a predetermined value to some (almost random as it is picked as a pixel colour from an image) rgb number set.
I assume the list/dictionary you have made allocates some value to each rgb number set and that it can be regenerated or reformatted if needed.
There are a maximum of 16,777,216 RGB values (256^3). JavaScript arrays can have up to 2^32-1 (almost 4.3 billion) elements. Therefore the suggestion is to reformat your dictionary to be a 3-dimensional array where each dimension is indexed 0-255. The array can be declared and assigned as an array literal in a regular js script (text file) like colourDictionary= [[[val0]..[val255]]..[[]..[]],...[[val0]..[val255]]..[[]..[]]]; and each value accessed in constant time using the pixel values as indices: colourDictionary[r][g][b]
To be useful without writing lines to cater for missing values, your gaps (you mention a list of 8,349,042, around half the available number combinations) could be filled with the values of nearest neighbours.
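As a sketch of the same constant-time lookup, here is a variant that swaps the nested 3-dimensional array for one flat Uint8Array (one byte per colour, about 16 MB). That substitution is my assumption rather than part of the suggestion above, and the names lookupTable, colourIndex, fillTable and listFor, as well as the flat [r, g, b, r, g, b, ...] input layout, are hypothetical.

```javascript
// One byte per colour: 256^3 = 16,777,216 entries, about 16 MB in total.
const lookupTable = new Uint8Array(256 * 256 * 256);

// Pack (r, g, b), each 0..255, into a single index 0..16777215.
function colourIndex(r, g, b) {
  return (r << 16) | (g << 8) | b;
}

// Hypothetical loader: lists[i] is a flat array [r, g, b, r, g, b, ...]
// and every colour in it is stamped with the list number i.
function fillTable(lists) {
  lists.forEach((flat, listId) => {
    for (let i = 0; i < flat.length; i += 3) {
      lookupTable[colourIndex(flat[i], flat[i + 1], flat[i + 2])] = listId;
    }
  });
}

// Constant-time lookup: which list does this pixel's colour belong to?
function listFor(r, g, b) {
  return lookupTable[colourIndex(r, g, b)];
}

// Tiny made-up data set: three lists holding four colours between them.
fillTable([[0, 0, 0, 10, 10, 10], [255, 255, 255], [255, 0, 0]]);
```

Entries that are never filled stay 0, which is indistinguishable from list 0, so this only gives meaningful answers when the lists together cover every colour. A lookup is a single array read, so running it for every pixel of an 800 x 600 image is 480,000 reads.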
Apologies if I've missed the point.
I am beginning to think about how to transliterate an RTL string (e.g. Arabic, Hebrew) to an LTR string (i.e. the romanization of the sounds/letters). It's relatively straightforward if it's LTR -> LTR, but more tricky mentally for RTL -> LTR. For LTR -> LTR, you could have a simple mapping for each letter in A to each letter in B. Maybe multiple A's combined make a B in some cases, or a single A makes a chain of Bs.
a b
- -
X 1
YZ 2
ABC 3
D 456
E 78
Then given a string like XYZYZDDEABC you would get 122456456783. Basic enough, though the actual algorithm would be a bit tricky because it might have to lookahead and have a prioritization on the elements. But this is the gist of it.
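One possible shape for that algorithm is a greedy longest-match scan, sketched below. The "longest key wins" prioritization is an assumption on my part, as are the names mapping, keysByLength and transliterate; unmapped characters are simply passed through.

```javascript
// The mapping table from above: source sequences -> target sequences.
const mapping = new Map([
  ["X", "1"],
  ["YZ", "2"],
  ["ABC", "3"],
  ["D", "456"],
  ["E", "78"],
]);

// Try longer keys first so that "ABC" would win over a shorter key "A".
const keysByLength = [...mapping.keys()].sort((a, b) => b.length - a.length);

function transliterate(input) {
  let output = "";
  let i = 0;
  while (i < input.length) {
    // Look ahead: find the longest key that matches at position i.
    const key = keysByLength.find((k) => input.startsWith(k, i));
    if (key === undefined) {
      output += input[i]; // no rule applies: copy the character unchanged
      i += 1;
    } else {
      output += mapping.get(key);
      i += key.length;
    }
  }
  return output;
}

console.log(transliterate("XYZYZDDEABC")); // prints "122456456783"
```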
Now for a RTL -> LTR transformation, I'm confused on two levels. First, how do you iterate through a RTL string? The characters are actually in LTR order, correct? It's just the visual layout in browsers and such which makes it RTL. So from a code perspective, your RTL language is actually read LTR (it's not like we have to do anything in reverse or anything). Just making sure I'm interpreting this correctly. That would mean I can just do like the above LTR -> LTR transformation for all intents and purposes.
If it's not like that, and there's something else to consider, I would like to know generally how to do this. If a language is needed for a demo, then JavaScript would be good.
You're correct. Text is stored in "logical order", which is the order it would be typed (or, in most cases, the order in which it is spoken). So you don't need to take directionality into account during transliteration.
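A small check of this in JavaScript, using a deliberately crude toy mapping (the romanMap table is hypothetical and not a real romanization scheme): the first element of the string is the first letter typed, even though it is displayed on the right.

```javascript
// "shalom" in Hebrew, written with escapes so the source stays unambiguous:
// shin, lamed, vav, final mem, in the order they are typed and read.
const hebrew = "\u05E9\u05DC\u05D5\u05DD";

// Toy mapping for illustration only.
const romanMap = {
  "\u05E9": "sh",
  "\u05DC": "l",
  "\u05D5": "o",
  "\u05DD": "m",
};

// Index 0 is shin, the first letter in logical order.
const firstIsShin = hebrew[0] === "\u05E9";

// Plain left-to-right iteration over the string, no reversal needed.
const romanized = Array.from(hebrew, (ch) => romanMap[ch] ?? ch).join("");
console.log(firstIsShin, romanized); // prints: true shlom
```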
Note that in many writing systems, including both Arabic and Hebrew, numbers are written "big-endian", with the most significant digit on the left. They are also typed in this order, meaning that the text is actually bidirectional. That is also the case when texts of different directionality are mixed together, such as when names written in Latin script are included in an Arabic or Hebrew document. Fortunately, you don't need to worry about that either, unless you're writing a Unicode renderer. (If you are, you'd need to read Unicode Standard Annex #9, the Unicode Bidirectional Algorithm, which goes into all the details of bidirectional rendering.)
I'm building a website that should collect various news feeds, and I would like the texts to be compared for similarity. What I need is some sort of news text similarity algorithm.
I know that PHP has the similar_text function, but I'm not sure how good it is, and I need it for JavaScript.
Could anyone point me to an example, a plugin, or any instructions on how this is possible, or at least where to look and start investigating?
There's a javascript implementation of the Levenshtein distance metric, which is often used for text comparisons. If you want to compare whole articles or headlines though you might be better off looking at intersections between the sets of words that make up the text (and frequencies of those words) rather than just string similarity measures.
The question whether two texts are similar is a philosophical one as long as you don't specify exactly what it should mean. Consider the Strings "house" and "mouse". Seen from a semantic level they are not very similar, but they are very similar regarding their "physical appearance", because only one letter is different (and in this case you could go by Levenshtein distance).
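For reference, a compact sketch of the Levenshtein distance using the standard dynamic-programming recurrence with two rows. This is a generic textbook version, not any particular library's implementation, and it compares UTF-16 code units rather than full Unicode characters.

```javascript
// Minimum number of single-character insertions, deletions and
// substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and first j of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // deleting i characters turns a's prefix into ""
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein("house", "mouse")); // prints 1
```

The running time grows with the product of the two lengths, which is one more reason it suits headlines better than whole articles.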
To decide about similarity you need an appropriate text representation. You could – for instance – extract and count all n-grams and compare the two resulting frequency-vectors using a similarity measure as e.g. cosine similarity. Or you could stem the words to their root form after having removed all stopwords, sum up their occurrences and use this as input for a similarity measure.
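The n-gram variant could be sketched roughly as follows. The choice of character trigrams, lowercasing as the only normalization, and the names ngramCounts and cosineSimilarity are all assumptions made for the sake of the example; stemming and stopword removal are left out.

```javascript
// Count overlapping character n-grams (trigrams by default).
function ngramCounts(text, n = 3) {
  const counts = new Map();
  const s = text.toLowerCase();
  for (let i = 0; i + n <= s.length; i++) {
    const gram = s.slice(i, i + n);
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }
  return counts;
}

// Cosine of the angle between two frequency vectors stored as Maps:
// 1 means identical n-gram profiles, 0 means no n-gram in common.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [gram, count] of a) {
    normA += count * count;
    dot += count * (b.get(gram) || 0);
  }
  for (const count of b.values()) normB += count * count;
  if (normA === 0 || normB === 0) return 0; // text shorter than n
  return dot / Math.sqrt(normA * normB);
}

// Made-up headlines for illustration.
const h1 = ngramCounts("Central bank raises interest rates");
const h2 = ngramCounts("Central bank raises rates again");
const h3 = ngramCounts("Local team wins football final");
const similarPair = cosineSimilarity(h1, h2);
const differentPair = cosineSimilarity(h1, h3);
```

Here similarPair comes out larger than differentPair; where to put the threshold for "the same story" is something you would have to tune on real feeds.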
There are plenty of approaches and papers about that topic, e.g. this one about short texts. In any case: the higher the abstraction level at which you want to decide whether two texts are similar, the more difficult it will get. I think your question is a non-trivial one (and hence my answer rather abstract) ... ;-)
Suppose I have a large list of objects (thousands or tens of thousands), each of which is tagged with a handful of tags.
There are dozens or hundreds of possible tags and their usage follows a typical power law:
some tags are used extremely often but most are rare.
All but the most frequent couple dozen tags could typically be ignored, in fact.
Now the problem is how to visualize the relationship between these tags.
A tag cloud is a nice visualization of just their frequencies but it ignores which tags occur with which other tags.
Suppose tag :bar only occurs on objects also tagged :foo.
That should be visually apparent.
Similarly for three tags that tend to occur together.
You could make each tag a bubble and let them partially overlap with each other.
Technically that's a Venn diagram but treating it that way might be unwieldy.
For example, Google charts can create Venn diagrams, but only for 3 or fewer sets (tags):
http://code.google.com/apis/chart/docs/gallery/venn_charts.html
The reason they limit it to 3 sets is that any more and it looks horrendous.
See "Extensions to higher numbers of sets" on the Wikipedia page: http://en.wikipedia.org/wiki/Venn_diagrams
But that's only if every possible intersection is non-empty.
If no more than 3 tags ever co-occur (maybe after throwing out the rare tags) then a collection of Venn diagrams could work (with the sizes of the bubbles representing tag frequency).
Or perhaps a graph (as in vertices and edges) with visually thicker or thinner edges to represent frequency of co-occurrence.
Do you have any ideas, or pointers to tools or libraries?
Ideally I'd do this with javascript but I'm open to things like R and Mathematica or really anything else.
I'm happy to share some actual data (you'll laugh if I tell you what it represents) if anyone is curious.
Addendum: The application I originally had in mind was TagTime but it occurs to me that this also maps well to the problem of visualizing one's delicious bookmarks.
If I understand your question correctly, an image matrix should work nicely here. The implementation I have in mind would be an n x m matrix in which the tagged items are rows, and each tag is a separate column. Every cell in the matrix would be either a "1" or a "0", i.e., a particular item either has a given tag or it doesn't.
In the matrix below (which I rotated 90 degrees so it would fit better in this window, so columns actually represent tagged items, and each row shows the presence or absence of a given tag across all items), I simulated the scenario in which there are 8 tags and 200 tagged items. A "0" is blue and a "1" is light yellow.
All values in this matrix were randomly selected: each tagged item is eight draws from a box consisting of two tokens, one blue and one yellow (no tag and tag, respectively). So not surprisingly there's no visual evidence of a pattern here, but if there is one in your data, this technique, which is dead simple to implement, can help you find it.
I used R to generate and plot the simulated data, using only base graphics (no external packages or libraries):
# create the matrix, seeded with one random row of 8 tag indicators
r1 = sample(0:1, 8, replace=TRUE)
A = matrix(data=r1, nrow=1, ncol=8)
# append 199 more random rows, for 200 tagged items in total
for (i in seq(1, 199, 1)){r1 = sample(0:1, 8, replace=TRUE); A = rbind(A, r1)}
# now plot it
image(z=A, ann=F, axes=F, col=topo.colors(12))
I would create something like this if you are targeting the web. Edges connecting the nodes could be thicker or darker in color, or perhaps a stronger force connecting them so they are close in distance. I would also add the tag name inside the circle.
Some libraries that would be very good for this include:
Protovis (Javascript)
Flare (Adobe Flash)
Some other fun javascript libraries worth looking into are:
Processing for Javascript
Raphael
Although this is an old thread, I just came across it today.
You may also want to consider using a Self-Organizing Map.
Here is an example of a self-organizing map for world poverty. It used 39 of what you call your "tags" to arrange what you call your "objects".
http://www.cis.hut.fi/research/som-research/povertymap.gif
I'm not sure it would work, as I did not test it, but here is how I would start:
You can create a matrix as doug suggests in his answer, but instead of having documents as rows and tags as columns, you take a square matrix where tags are both rows and columns. The value of cell [T1;T2] will be the number of documents tagged with both T1 and T2 (note that by doing that you'll get a symmetric matrix, because [T1;T2] will have the same value as [T2;T1]).
Once you have done that, each row (or column) is a vector locating the tag in a space with T dimensions. Tags near each other in this space often occur together. To visualize co-occurrence you can then use a dimensionality reduction method or any clustering method. For example, you can use a Kohonen self-organizing map to project your T-dimensional space to a 2D space. You'll then get a 2D matrix where each cell represents an abstract vector in the tag space (meaning the vector won't necessarily exist in your data set). This vector reflects a topological constraint of your source space, and can be seen as a "model" vector reflecting a significant co-occurrence of some tags. Moreover, cells near each other on this map will represent vectors close to each other in the source space, thus allowing you to map the tag space onto a 2D matrix.
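Building that tag-by-tag matrix could look roughly like this in JavaScript. It is an untested-on-real-data sketch: the function name cooccurrence and the input shape (an array of tag arrays, one per document) are assumptions, and the diagonal is filled with each tag's own frequency.

```javascript
// items: array of tag lists, one per document, e.g. [["foo", "bar"], ...]
// tags:  the tags to keep, which fixes the row/column order of the matrix
function cooccurrence(items, tags) {
  const index = new Map(tags.map((tag, i) => [tag, i]));
  const matrix = tags.map(() => new Array(tags.length).fill(0));
  for (const itemTags of items) {
    // Deduplicate, and ignore tags that are not in the kept list.
    const ids = [...new Set(itemTags)]
      .map((tag) => index.get(tag))
      .filter((i) => i !== undefined);
    // Count every ordered pair, so the result is symmetric by construction;
    // pairing a tag with itself counts how often that tag occurs at all.
    for (const a of ids) {
      for (const b of ids) matrix[a][b] += 1;
    }
  }
  return matrix;
}

// Made-up sample data: "bar" only ever appears together with "foo".
const sampleItems = [["foo", "bar"], ["foo"], ["foo", "baz"], ["bar", "foo"]];
const sampleMatrix = cooccurrence(sampleItems, ["foo", "bar", "baz"]);
// sampleMatrix is [[4, 2, 1], [2, 2, 0], [1, 0, 1]]
```

In the sample, the row for "bar" has the same value on the diagonal as in the "foo" column (2 and 2), which is exactly the "bar only occurs with foo" situation from the question.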
Final visualization of the matrix can be done in many ways but I cannot give you advice on that without first seeing the results of the previous processing.
I cannot for the life of me figure out why Alternative is left recursive. It really throws a wrench into my parser.
Alternative ::
[empty]
Alternative Term
Here is a note in the semantics portion of the spec that is not exactly clear. Maybe the reasoning would be revealed once I understand this?
NOTE Consecutive Terms try to
simultaneously match consecutive
portions of the input String. If the
left Alternative, the right Term, and
the sequel of the regular expression
all have choice points, all choices in
the sequel are tried before moving on
to the next choice in the right Term,
and all choices in the right Term are
tried before moving on to the next
choice in the left Alternative.
What kind of parser can properly handle a left recursive grammar?
Because for certain types of parser left-recursion is much better (e.g. for yacc - see section 6.2 here for an explanation).
If it's causing trouble for your particular parser, then by all means swap it over - it doesn't affect the definition of the language in any way.
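For a hand-written recursive-descent parser, one common way to "swap it over" is to turn the left recursion into a loop. The sketch below makes a strong simplifying assumption, namely that a Term is any single character other than "|" or ")", and the node shapes are invented; it only illustrates that the loop can still build the same left-nested structure the grammar describes.

```javascript
// Alternative ::
//     [empty]
//     Alternative Term
// parsed iteratively: start from [empty], then fold in one Term per loop.
function parseAlternative(input, start = 0) {
  let node = { type: "Empty" };
  let pos = start;
  // Toy assumption: "|" and ")" end the Alternative; anything else is a Term.
  while (pos < input.length && input[pos] !== "|" && input[pos] !== ")") {
    // Alternative :: Alternative Term, with the tree built so far on the left
    node = { type: "Alternative", left: node, term: input[pos] };
    pos += 1;
  }
  return { node, pos };
}

const parsed = parseAlternative("ab|c");
// parsed.pos is 2, and parsed.node is
// { type: "Alternative",
//   left: { type: "Alternative", left: { type: "Empty" }, term: "a" },
//   term: "b" }
```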