I'm looking to calculate the number of terminal columns that various printing and non-printing ASCII/Unicode characters will occupy in a terminal view.
For example, horizontal tab (\t) occupies 8 columns, color codes (e.g. \x1b32m) occupy 0 columns, and fixed-size wide-character strings (e.g. 한) might occupy 2 columns. Of course, many characters in the primary ASCII set occupy only 1 column (i.e. a-z, A-Z, 0-9, punctuation, etc.).
I've come across the Node.js module wcwidth, which seems to help calculate the width of wide-character strings, but it doesn't do what I'd expect for other characters, like color codes and tabs.
For example:
var wcwidth = require('wcwidth');
console.log("TAB WIDTH", wcwidth('\t'));
console.log("한 WIDTH", wcwidth('한'));
console.log("Color Code WIDTH", wcwidth('\x1b32m'));
console.log("X WIDTH", wcwidth('X'));
Outputs:
TAB WIDTH 0
한 WIDTH 2
Color Code WIDTH 3
X WIDTH 1
I can't seem to find any information about this anywhere, though I'd imagine it would be a common thing people have had to solve in the ancient past.
If there might be a way using a bash script, or any library, application or tool, I'm totally open to that as well.
Any help much appreciated! :)
Thanks
A tab does not occupy 8 columns. It moves the cursor forward by at least one column, to the next column whose index is 0 mod 8 (or 1 mod 8 if you count from 1). In other words, you cannot tell how wide a tab is unless you know where you are on the line.
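The rule can be sketched in a few lines of JavaScript. This is only an illustration, assuming fixed tab stops every 8 columns and zero-based column indices; tabAdvance is a hypothetical helper name, not something wcwidth provides.

```javascript
// Columns a tab advances the cursor, given the zero-based column where
// the tab is encountered. Assumes fixed tab stops every `tabSize` columns.
function tabAdvance(column, tabSize = 8) {
  return tabSize - (column % tabSize);
}

console.log(tabAdvance(0)); // 8: a tab at the start of the line jumps to column 8
console.log(tabAdvance(7)); // 1: a tab just before a stop moves one column
console.log(tabAdvance(8)); // 8: a tab sitting on a stop jumps to the next one
```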
A color code (\x1b[32m) might occupy zero space, but it also might not; it depends on the nature of the terminal emulator for the console. Most terminal emulators will recognize the CSI Pm m (SGR) sequence, but other codes are quite a bit more idiosyncratic. For example,
printf $'\x1b]2;A window\x1b\\'
will set the window title in xterm, and hence will produce no output. But in a Linux console, the text ;A window will be displayed, occupying 9 characters.
In short, it is not so easy a problem, and you can only answer it with a lot of context because there is no absolute answer.
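For the common case only, here is a hedged sketch that removes SGR color sequences (ESC [ ... m) before measuring. It assumes the terminal honours those sequences and that every remaining character is one column wide, which is not true for tabs, wide characters or other control sequences. stripSgr and naiveWidth are made-up names for illustration.

```javascript
// Matches SGR sequences only: ESC, '[', digits/semicolons, then 'm'.
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

function stripSgr(text) {
  return text.replace(SGR_PATTERN, "");
}

// Naive width: counts the code points left after stripping SGR sequences.
function naiveWidth(text) {
  return Array.from(stripSgr(text)).length;
}

console.log(naiveWidth("\x1b[32mgreen\x1b[0m")); // 5
```

Anything that is not an SGR sequence (such as the window-title sequence above) is left in place and counted, so this is a starting point rather than a general answer.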
This is indeed an issue for any program that needs to know where the cursor is on screen, from tabular output in ls through editable command lines to full-screen applications. As you've noticed, it's not solved by wcwidth or wcswidth, which are defined only for (strings of) printable characters. (Even that is not well defined for many characters.) Also, control sequences can change not only colours but also the cursor position and even, where supported, font size.
Instead, terminal control libraries such as ncurses [npm search] are sometimes used. These don't seem to tell you string widths either, but because they track text attributes such as colour separately, and generate control sequences themselves to position and style text, they provide some assistance in putting things on screen in given locations.
Unfortunately I don't believe there's much available beyond that, with applications either ignoring the complexities or handling them in ad hoc ways.
To clear up a common misconception: Horizontal Tab (HT, \t) doesn't have a width as such; it's a 'format effector', like Carriage Return or Form Feed, that repositions the cursor according to certain rules.
HT (Horizontal Tabulation): A format effector which controls the
movement of the printing position to the next in a series of
predetermined positions along the printing line. (Applicable also to
display devices and the skip function on punched cards.)
— USA Standard Code for Information Interchange [ASCII], 1968, as reprinted in RFC 20
The most common implementation is to have fixed tab stops every eight columns:
columns:                          1       2
                  1.......9.......7.......5.....
1\tXYZ            1       XYZ
12\tXYZ           12      XYZ
1234567\tXYZ      1234567 XYZ
12345678\tXYZ     12345678        XYZ
123456789\tXYZ    123456789       XYZ
though some systems support control sequences or other ways to set the positions of the tab stops at arbitrary distances, like the ruler bar in some word processors.
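As a rough sketch of that behaviour, the function below expands tabs against a list of tab stops. It assumes a single line of text in which every non-tab character is one column wide; expandTabs and its parameters are hypothetical names, with stops given as zero-based columns and stops every 8 columns used once the explicit list runs out.

```javascript
// Expands tabs to spaces. `stops` holds zero-based tab-stop columns in
// ascending order; past the last stop, stops fall every `fallback` columns.
function expandTabs(text, stops = [], fallback = 8) {
  let out = "";
  for (const ch of text) {
    if (ch !== "\t") {
      out += ch;
      continue;
    }
    const column = out.length;
    const explicit = stops.find((stop) => stop > column);
    const next =
      explicit !== undefined
        ? explicit
        : column + (fallback - (column % fallback));
    out += " ".repeat(next - column);
  }
  return out;
}

console.log(expandTabs("1\tXYZ").indexOf("XYZ"));        // 8
console.log(expandTabs("12345678\tXYZ").indexOf("XYZ")); // 16
console.log(expandTabs("ab\tc", [4]).indexOf("c"));      // 4
```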
In JS, 2 spaces are the gold standard for indent size. With that said, if one were to try and share some code that had 3 or 4 spaces, that would probably annoy the heck out of whoever else needs to use it.
The problem is that I have a pretty tough time visually seeing two-space indents, specifically when the start and end of an "indent block" are a screen tall or more (also, I have dyslexia, and this is probably amplifying the difficulty). Regardless, I just spent nearly an hour trying to fix a "bug" that ended up just being me placing a variable in the wrong scope because I couldn't see the indent. I can't stand two spaces, it's too small! Having a larger indent would prevent problems like this from occurring in the future.
So, is there any way to display more than two indent spaces without having to actually change the indent size? (so that when I push or do a pull request on github, the resulting document only has 2 space indents.)
There is an extension, called indent-switcher, which does exactly what you need. You can manually change the indent from 2 to 4 or from 4 to 2.
I'm not sure if this indent will affect your code when you pull or push it to github. But you could just change it from 2 to 4 while working on it and change it from 4 to 2 when you are finished. You can also bind the commands to a keyboard shortcut. There is also another extension called indent-rainbow. This could also come in handy for you.
I am trying to create a number flip effect using the number-flip npm package as a basis. I wish to make it similar to Robinhood's stock ticker (when scrubbing across a stock chart).
I have been able to customise the base number-flip package so that leading zeros and commas are removed when not necessary. However when the number of digits increases, the rest of the digits simply jump over to the right in position. I would like them to be smoother when moving over.
Currently, any leading zeros/commas are simply hidden using the position property and made visible again when they are required. This hiding/unhiding causes all the visible digits to jump across, which makes the transition look jerky.
I have linked code sandbox below:
https://codesandbox.io/s/nostalgic-taussig-ymfn9?file=/src/index.js
Thank you very much for your help
I've been performing some research, in order to find the best approach to identify break points (trend direction change) in a dataset (with pairs of x/y coordinates), that allow me to identify the trend lines behind my data collections.
However, I had no luck finding anything that sheds some light on this.
The yellow dots in the following image represent the breakpoints I need to detect.
Any suggestion about an article, algorithm, or implementation example (TypeScript preferred) would be very helpful and appreciated.
Usually, people filter the data by looking only at maxima (resistance) or only at minima (support). A trend line could be the average of those. The breakpoints are where the data crosses the trend, but this gives a lot of false breakpoints. Because images are better than words, you can look at page 2 of http://www.meacse.org/ijcar/archives/128.pdf.
There are a lot of scripts available; look for "ZigZag" on
https://www.tradingview.com/
e.g. https://www.tradingview.com/script/lj8djt1n-ZigZag/ https://www.tradingview.com/script/prH14cfo-Trend-Direction-Helper-ZigZag-and-S-R-and-HH-LL-labels/
You can also find an interesting blog post here (but the code is in Python):
https://towardsdatascience.com/programmatic-identification-of-support-resistance-trend-lines-with-python-d797a4a90530
with code available: https://pypi.org/project/trendln/
If you can identify trend lines then can't you just identify a breakpoint as when the slope changes? If you can't identify trend lines, then can you for example, take a 5-day moving average and see when that changes slope?
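A minimal sketch of the moving-average idea, assuming a plain array of y values at evenly spaced x positions. movingAverage and slopeChanges are hypothetical helper names, and the window of 3 in the example is arbitrary.

```javascript
// Simple trailing moving average; result[k] averages the `window` values
// ending at original index k + window - 1.
function movingAverage(values, window) {
  const result = [];
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
    if (i >= window) sum -= values[i - window];
    if (i >= window - 1) result.push(sum / window);
  }
  return result;
}

// Indices (into the original series) where the moving average's slope
// changes sign. Flat stretches (slope exactly 0) are ignored.
function slopeChanges(values, window) {
  const ma = movingAverage(values, window);
  const changes = [];
  for (let i = 1; i < ma.length - 1; i++) {
    const before = ma[i] - ma[i - 1];
    const after = ma[i + 1] - ma[i];
    if (before * after < 0) changes.push(i + window - 1);
  }
  return changes;
}

const series = [1, 2, 3, 4, 5, 4, 3, 2, 1];
console.log(slopeChanges(series, 3)); // [ 5 ]
```

Note that the trailing average lags the data: the peak is at index 4, but the slope change is reported at index 5. A longer window smooths more and lags more.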
This might sound strange, or even controversial, but -- there are no "breakpoints". Even looking at your image, the fourth breakpoint might as well be on the local maximum immediately before its current position. So, different people might call "breakpoints" different points on the same graph (and, indeed, they do).
What you have in the numbers are several possible moving averages (calculated on a varying interval, so you might consider MA5 for a five-day average, or MA7 for a week average) and their first and maybe second derivatives (if you feel fancy you can experiment with third derivatives). If you plot all these lines, suitably smoothed, over your data, you will notice that the salient points of some of them will roughly intersect near the "breakpoints". Those are the parameters that your brain considers when you "see" the breakpoints; that is why you see the breakpoints there, and not somewhere else.
Another method that human vision employs to recognize features is trimming outliers: in the above calculations you discard either any value outside a given tolerance, or a fixed percentage of all values, starting from those farthest from the average. You can also experiment with not trimming values that are outliers for longer-period moving averages but not for shorter periods (this gives better responsiveness but will find more "breakpoints"). Then you run the same calculations on the remaining data.
Finally you can attribute a "breakpoint score" based on weighing the distance from nearby salient points. Then, choose a desired breakpoint distance and call "breakpoint" the highest scoring point in that interval; repeat for all subsequent breakpoints. Again, you may want to experiment with different intervals. This allows for a conveniently paced breakpoint set.
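To make the selection step concrete, here is a rough sketch. The score used here (the absolute second difference of an already smoothed series) is only an assumed stand-in for whatever weighting you settle on, and curvatureScores, pickBreakpoints and the spacing of 5 are all hypothetical.

```javascript
// Assumed score: absolute second difference, i.e. how sharply the slope
// changes at index i. The end points get a score of 0.
function curvatureScores(values) {
  return values.map((v, i) =>
    i === 0 || i === values.length - 1
      ? 0
      : Math.abs(values[i + 1] - 2 * v + values[i - 1])
  );
}

// Keeps the single highest-scoring index in each block of `spacing` points,
// skipping blocks where nothing scores above zero.
function pickBreakpoints(scores, spacing) {
  const picks = [];
  for (let start = 0; start < scores.length; start += spacing) {
    let best = start;
    const end = Math.min(start + spacing, scores.length);
    for (let i = start + 1; i < end; i++) {
      if (scores[i] > scores[best]) best = i;
    }
    if (scores[best] > 0) picks.push(best);
  }
  return picks;
}

const smoothed = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3];
console.log(pickBreakpoints(curvatureScores(smoothed), 5)); // [ 3, 6 ]
```

Changing the spacing changes how densely breakpoints are reported, which is the kind of parameter worth experimenting with.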
And, finally, you will probably notice that different kinds of signal sources have different "best" breakpoint parameters, so there is no one "One-Size-Fits-All" parameter set.
If you're building an interface to display data, leaving the control of these parameters to the user might be a good idea.
I am working on a process flow. I have used the vis library to show the flow, which it renders in a canvas.
The flow is long and dynamically generated based on user input, so I have provided scrolling inside the canvas.
So to go to a particular step, the user has to scroll down. I want to provide search functionality in the canvas to make it more user friendly.
Is there any way to provide search functionality in Canvas?
Do you mean character-recognition on existing canvas content (as in your image)?
It would be much easier to modify your vis library to emit {x:,y:,text:} objects for each step.
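Once you have those objects, search is just array filtering. A minimal sketch, assuming records shaped like {x, y, text}; the sample data and the findSteps name are made up for illustration and are not part of vis.

```javascript
// Hypothetical step records, one per drawn step: canvas position plus label.
const steps = [
  { x: 40, y: 20, text: "Receive order" },
  { x: 40, y: 420, text: "Check inventory" },
  { x: 40, y: 820, text: "Ship order" },
];

// Case-insensitive substring search over the step labels.
function findSteps(records, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return records.filter((step) => step.text.toLowerCase().includes(needle));
}

const [firstMatch] = findSteps(steps, "inventory");
console.log(firstMatch.y); // 420
```

In the page you would then scroll the canvas container to the matched y offset (for example by setting the container's scrollTop).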
Character-recognition is possible but not practical.
I did a verrrrry simple proof-of-concept demo where:
The font-size and font-face is known.
The background is transparent except for the text.
Each character is surrounded by transparency.
The process went like this:
1. fillText a random "unknown" character on a "main" canvas.
2. On the main canvas, find the top-left set of opaque pixels which are surrounded by transparency. This set of pixels is the unknown character.
3. Copy only that unknown character on a second canvas.
4. Set compositing mode to destination-out.
5. Draw an 'A' onto the second canvas over the unknown character. The compositing mode will erase the opaque pixels of the unknown character.
6. Count the remaining opaque pixels on the second canvas.
7. If there are "very few" opaque pixels remaining then 'A' is likely the unknown character.
8. If there are "many" opaque pixels remaining then 'A' is probably not the unknown character. So repeat steps #3+ using the 'B' character (and then 'C', 'D', etc.).
Refinements that helped recognition:
Remove any anti-aliasing on both the original main canvas and on each 'A, B, ...' character over-drawn.
When drawing the 'A, B, ...' to erase the original unknown character, draw the 'A' multiple times with a 1 pixel offset vertically, horizontally and diagonally. The offsets: [x+0,y+0], [x-1,y], [x+1,y], [x,y-1], [x,y+1], [x-1,y-1], [x-1,y+1], [x+1,y-1], [x+1,y+1].
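The erase-and-count idea can be sketched without a canvas at all. The code below is a simulation, not the demo code: glyphs are modelled as small 0/1 grids standing in for canvas pixel data, the "destination-out" draw is a lookup, and the "very few" threshold (maxRemaining) is an assumed parameter. remainingPixels and recognize are hypothetical names.

```javascript
// Glyphs are 2D arrays of 0/1 (1 = opaque pixel), with anti-aliasing
// assumed to be already removed.
const OFFSETS = [
  [0, 0], [-1, 0], [1, 0], [0, -1], [0, 1],
  [-1, -1], [-1, 1], [1, -1], [1, 1],
];

// Counts opaque pixels of `unknown` that survive after `candidate` is
// "drawn" over it at every offset, mimicking destination-out erasing.
function remainingPixels(unknown, candidate) {
  let remaining = 0;
  for (let y = 0; y < unknown.length; y++) {
    for (let x = 0; x < unknown[y].length; x++) {
      if (unknown[y][x] !== 1) continue;
      const erased = OFFSETS.some(([dx, dy]) => {
        const row = candidate[y - dy];
        return row !== undefined && row[x - dx] === 1;
      });
      if (!erased) remaining++;
    }
  }
  return remaining;
}

// Tries candidates in order; returns the first label that leaves at most
// `maxRemaining` pixels, or null if none does.
function recognize(unknown, candidates, maxRemaining = 0) {
  for (const [label, glyph] of candidates) {
    if (remainingPixels(unknown, glyph) <= maxRemaining) return label;
  }
  return null;
}

const bar = [0, 1, 2, 3, 4].map(() => [0, 0, 1, 0, 0]);
const dash = [0, 1, 2, 3, 4].map((y) => (y === 2 ? [1, 1, 1, 1, 1] : [0, 0, 0, 0, 0]));
console.log(remainingPixels(bar, dash));                     // 2
console.log(recognize(bar, [["-", dash], ["|", bar]]));      // |
```

As with the canvas version, a candidate whose shape covers the unknown one will also erase it completely, so the order of candidates and the threshold both matter.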
Results
The darned thing worked fairly well -- he says with genuine self-surprise! :-O
Using 36 pixel Verdana font, the code recognized all of the characters from ! through ~ (i.e. most of the printable ASCII characters).
But ... The double-quote character was not recognized because it's the one Verdana character that is broken into 2 parts. Visually the double-quote looks like two single quotes separated by space. Step#2 found the left part of the quote but not the right part because of the transparent space.
This is not an effective OCR system ... it's barely a proof-of-concept!!
The font-size & font-face must be exactly known. Since font faces may differ between browsers (and even versions of browsers), the technique probably won't work well across browsers.
The recognition is only for text written on html5 canvas. If given a paper image, the "noise" in the paper will likely cause the technique to fail.
However, it was the basis for a fairly good pattern-matching algorithm where other clues help with the identification process.
I have a small <p> about 140px wide aligned next to a picture. In total there is space for four lines of text. The first two lines are reserved for the title and there are two lines of other info.
I want the title to be cut off if it spans more than two lines; otherwise it will push the other info out of line with the bottom of the image.
The only solution I could think of was to create a div the height of two lines with overflow set to hidden. However, if the title is only one line it leaves a big gap.
The solution can be jQuery, plain JavaScript, CSS or even PHP (if it's possible).
TIA
Set the title to have a max-height of two lines
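A small sketch of that in plain JavaScript, assuming you know the title's line height in pixels; clampStyle is a hypothetical helper and the 18px in the example is made up.

```javascript
// Builds inline styles that cap an element at `lines` lines of text.
// `lineHeightPx` must match the line height actually used by the title.
function clampStyle(lineHeightPx, lines = 2) {
  return {
    lineHeight: `${lineHeightPx}px`,
    maxHeight: `${lineHeightPx * lines}px`,
    overflow: "hidden",
  };
}

console.log(clampStyle(18));
// { lineHeight: '18px', maxHeight: '36px', overflow: 'hidden' }
```

In the browser you could apply it with Object.assign(titleElement.style, clampStyle(18)), where titleElement is whatever element holds the title. Because max-height only caps the height, a one-line title still takes one line and leaves no gap.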
Keep in mind that the property max-height is not supported in IE6. In addition, limiting the size of text boxes can cause accessibility issues, and is generally not recommended.
As this is more of a content issue than a display issue, it's probably best to deal with it on the back end - if it's dynamic text, limit your database field to an appropriate character count, or chop it with some php (or whatever server side situation you're set up in). It's tough to establish a character count with a non-monospaced font, but if you don't limit it on the content side, you run the risk of upsetting your less visually-inclined users who may be using older browsers that don't zoom all fancy like the latest releases of safari and chrome.