How would you implement auto-capitalization in JavaScript/HTML

I need to implement auto-capitalization inside of a Telerik RadEditor control on an ASPX page as a user types.
This can be an IE specific solution (IE6+).
I currently capture every keystroke (down/up) as the user types to support a separate feature called "macros", which are essentially short keywords that expand into formatted text. For example, the macro "so" could auto-expand to "stackoverflow" upon hitting the spacebar.
That said, I have access to the keyCode information, and I am also using the TextRange methods to select a word ("so") and expand it to "stackoverflow". Thus, I have some semblance of context.
However, I need to check this context to know whether I should auto-capitalize. This also needs to work regardless of whether a macro is involved.
Since I'm monitoring keystrokes for the macros, should I just monitor for punctuation (it's more than just periods that signal a capital letter) and auto-cap the next letter typed, or should I use TextRange and analyze context?

Have you tried to apply the text-transform CSS style to your controls?

I'm not sure if this is what you're trying to do, but here is a function (reference) to convert a given string to title case:
function toTitleCase(str) {
return str.replace(/([\w&`'‘’"“.#:\/\{\(\[<>_]+-? *)/g, function(match, p1, index, title){ // ' fix syntax highlighting
if (index > 0 && title.charAt(index - 2) != ":" &&
match.search(/^(a(nd?|s|t)?|b(ut|y)|en|for|i[fn]|o[fnr]|t(he|o)|vs?\.?|via)[ -]/i) > -1)
return match.toLowerCase();
if (title.substring(index - 1, index + 1).search(/['"_{([]/) > -1)
return match.charAt(0) + match.charAt(1).toUpperCase() + match.substr(2);
if (match.substr(1).search(/[A-Z]+|&|[\w]+[._][\w]+/) > -1 ||
title.substring(index - 1, index + 1).search(/[\])}]/) > -1)
return match;
return match.charAt(0).toUpperCase() + match.substr(1);
});
}

Sometimes, not to do it is the right answer to a coding problem.
I really would NOT do this, unless you feel you can write a script to correctly set the case in the following sentence, if you were to first convert it to lowercase and pass it into the script.
Jean-Luc "The King" O'Brien MacHenry van d'Graaf IIV (PhD, OBE), left his Macintosh with in Macdonald's with his friends MacIntosh and MacDonald. Jesus gave His Atari ST at AT&T's "Aids for AIDS" gig in St George's st, with Van Halen in van Henry's van, performing The Tempest.
You have set yourself up for a fall by trying to create a Natural Language Parser. You can never do this as well as the user will. At best, you can do an approximation, and give the user the ability to edit and force a correction when you get it wrong. But often in such cases, the editing is more work than just doing it manually and right in the first place.
That said, if you have the space and power to store and search a large n-gram corpus of suitably capitalized words, you would at least be able to have a wild stab at the most likely desired case.

You pose an interesting question. Acting upon each key press may be more limiting because you will not know what comes immediately after a given keycode (the complexity of undoing a reaction that turns out to be incorrect could mean having to go to a TextRange-based routine anyway). Granted, I haven't wrestled with code on this problem to date, so this is a hypothesis in my head.
At any rate, here's a title-casing function (a JavaScript implementation inspired by a John Gruber blogging automation) which may spur ideas when it comes to handling the actual casing code:
http://individed.com/code/to-title-case/
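As a rough illustration of the keystroke-monitoring option raised in the question, the check below looks only at the text before the caret and decides whether the next typed letter should be capitalized. It is a minimal sketch, not a full solution: the names shouldCapitalizeNext and applyAutoCap are hypothetical, the set of sentence-ending punctuation (. ! ?) is an assumption, and it does nothing about the proper-noun cases listed in the earlier answer.

```javascript
// Hypothetical helper: true when the text before the caret is empty or only
// whitespace, or ends with sentence-ending punctuation followed by whitespace.
// The punctuation set (. ! ?) is an assumption; extend it as needed.
function shouldCapitalizeNext(textBeforeCaret) {
  return /(^\s*|[.!?]\s+)$/.test(textBeforeCaret);
}

// Hypothetical helper: returns the character to insert for a typed letter.
function applyAutoCap(textBeforeCaret, typedChar) {
  return shouldCapitalizeNext(textBeforeCaret)
    ? typedChar.toUpperCase()
    : typedChar;
}
```

In the editor, textBeforeCaret would presumably come from the same TextRange already used for the macros. Requiring whitespace after the punctuation keeps input such as "3.14" from being treated as a sentence boundary.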

Related

Properly modifying text pulled from a game and returning it

I'm really new to JavaScript, kinda just learned a little earlier today and been messing around with it, but I'm running into a few issues here and there. I'd appreciate help from some people that know their way around the code.
What's the best way to search a string for multiple words? I'm not completely sure how to explain what I mean, so I'll include my current test code and try to explain. I'm making an attached script to pull text from a text based game online, converting it to lowercase, and defining variables for the use of a money system that changes the input text. Once changes are made, I'm re-inputting the modified text into the game as a return.
let money = 0;
const modifier = (text) => {
let modifiedText = text;
const lowered = text.toLowerCase();
let moneyChange = 0;
// The text passed in is either the user's input or players output to modify.
if(lowered.includes('take their money') || lowered.includes('take ' + 'money')) {
moneyChange = (Math.floor(Math.random() * 500));
if ((moneyChange) > 1) {
console.log(moneyChange);
money += moneyChange;
modifiedText = `You find ${moneyChange} Credits. You now have ${money} Credits`;
} else {
modifiedText = 'You find nothing.';
console.log(modifiedText);
}
}
console.log(modifiedText);
// You must return an object with the text property defined.
return {text: modifiedText};
}
modifier(text);
Currently, as you can see, I have to specifically type "Take their money" or "Take money" as an action before the text pulled is recognized as me taking money from someone or taking some in general. My main issue is that with how the game works, it's somewhat impossible to guess exactly how the input or output is going to come out. The way it works is that the game takes your character's action or speech that you type out, processes it via AI into its own action or dialogue, and generates procedural story to make more sense with the setting, so that the player only has to type a vague idea of what's going to happen.
Here's an example:
There's a dead man on the street in front of you.
>loot him
You loot the man, digging through his pockets. You take some money from his wallet, but find nothing else.
The > is my only input and the rest is completely AI generated. My script looks through the AI result, so I could look for every possible result, from "take his money" to "take her money" and so forth, but that's a little too much to bother with if there's an easier way. Ideally I could have it search the result for specific words that may not be in the normal order and/or with other words in between. For example, it must contain the words "take" and "money", so that if the game says "You find some money, along with a gun. You take both", it recognizes that I'm taking the money. I also still need to write code for every other time I do anything with money, such as buying things, and if I have to write out every possible phrasing it's going to be a pain.
I know that it would be easier if this code was integrated into the game, but due to AI limitations, that kinda breaks how it works and it goes a little crazy... Any sort of help you can give me will be a help.
If you're looking for a way to search a string that includes multiple sub-phrases, you can use string.includes() in a loop as shown below:
function containsWords(string, words) {
for (let i=0, len=words.length; i<len; i++) { // iterate over the words, not the characters of the string
if (!string.includes(words[i])) {
return false;
}
}
return true;
}
However you also mentioned
search the result for specific words that may not be in the normal order and/or with other words in between
Which immediately brings to mind regex, a text and string matching technology. You can easily find tutorials for regex online, and this live tester is nice too.
I'll quickly build a search string to match "take *** money", where any word can be *** as a quick introduction and example to regex:
/take .+ money/g
Here it matches the literal string "take " (with its trailing space), then .+ matches one or more characters (the middle word, e.g. his/her), then it matches " money".
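For the "any order" case from the question ("You find some money, along with a gun. You take both"), one option is to test each word separately instead of writing a single ordered pattern. The sketch below is only an illustration: containsAllWords is a hypothetical name, and it assumes the words are plain letters with no regex metacharacters, so they are not escaped.

```javascript
// Hypothetical helper: true when every word appears somewhere in the text
// as a whole word, in any order and with anything in between.
// Assumes each word is plain letters (no regex metacharacters).
function containsAllWords(text, words) {
  const lowered = text.toLowerCase();
  return words.every((word) =>
    new RegExp(`\\b${word.toLowerCase()}\\b`).test(lowered)
  );
}
```

The \b word boundaries keep "take" from matching inside a longer word such as "mistaken". Because the check is per word, it will also accept text where the two words are unrelated, so it is best treated as a first filter.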

String logic in everyday chat using javascript

I'm sort of building an AI for a Telegram Bot, and currently I'm trying to process the text and respond to the user almost like a human does.
For example;
"I want to register"
As a human we understand that the user wants to register.
So I'd process this text using javascript's indexOf to look for want and register
var user_text = message.text;
if (user_text.indexOf('want') >= 0) {
if (user_text.indexOf('register') >= 0) {
console.log('He wants to register?')
}
}
But what if the text contains not somewhere in the string? Of course I'd have like a zillion of conditions for a zillion of cases. It'd be tiring to write this kind of logic.
My question is — Is there any other elegant way to do this? I don't really know the keyword to Google this...
The concept you're looking for is natural language processing and is a very broad field. Full NLP is very intricate and complicated, with all kinds of issues.
I would suggest starting with a much simpler solution, by splitting your input into words. You can do that using the String.prototype.split method with some tweaks. Filter out tokens you don't care about and don't contribute to the command, like "the", "a", "an". Take the remaining tokens, look for negation ("not", "don't") and keywords. You may need to combine adjacent tokens, if you have some two-word commands.
That could look something like:
var user_text = message.text;
var tokens = user_text.split(' '); // split on spaces, very simple "word boundary"
tokens = tokens.map(function (token) {
return token.toLowerCase();
});
var remove = ['the', 'a', 'an'];
tokens = tokens.filter(function (token) {
return remove.indexOf(token) === -1; // if remove array does *not* contain token
});
if (tokens.indexOf('register') !== -1) {
// User wants to register
} else if (tokens.indexOf('enable') !== -1) {
if (tokens.indexOf('not') !== -1) {
// User does not want to enable
} else {
// User does want to enable
}
}
This is not a full solution: you will eventually want to run the string through a real tokenizer and potentially even a full parser, and may want to employ a rule engine to simplify the logic.
If you can restrict the inputs you need to understand (a limited number of sentence forms and nouns/verbs), you can probably just use a simple parser with a few rules to handle most commands. Enforcing a predictable sentence structure with articles removed will make your life much easier.
You could also take the example above and replace the filter with a whitelist (only include words that are known). That would leave you with a small set of known tokens, but introduces the potential to strip useful words and misinterpret the command, so you should confirm with the user before running anything.
If you really want to parse and understand sentences expressed in natural language, you should look into the topic of natural language processing. This is usually done with some kind of neural network trained to "understand" different variations of sentences (aka machine learning), because specifying all of the different syntactic and semantic rules of the language appears to be an overwhelming task.
If however the amount of variations of these sentences is limited, then you could specify some rules in the form of commonly used word combinations, probably even regular expressions would do in the simplest case.
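When the set of sentence variations is small, the rule idea can be sketched as a table of regular expressions plus a separate negation check. Everything here is illustrative: the rules table, the negation pattern, and the detectIntent name are assumptions, not part of any bot library.

```javascript
// Hypothetical rule table: each rule pairs an intent name with a pattern.
const rules = [
  { intent: 'register', pattern: /\bregist(er|ration)\b/ },
  { intent: 'enable', pattern: /\benable\b/ },
];

// A deliberately small, assumed list of negation words.
const negation = /\b(not|never|don't|do not)\b/;

// Returns the first matching intent (or null) and whether the text is negated.
function detectIntent(text) {
  const lowered = text.toLowerCase();
  const rule = rules.find((r) => r.pattern.test(lowered));
  return {
    intent: rule ? rule.intent : null,
    negated: negation.test(lowered),
  };
}
```

Adding a command then means adding one entry to the table rather than another nested if. The negation check is sentence-wide, so it cannot tell which verb the "not" applies to.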

Strip GUID from field if present

My client's web service returns some fields with a GUID and some without.
I want to write a function that checks whether a field contains a GUID. If it does, the function should strip the GUID off; if not, it should return the field as it is.
Prototype of function:-
function stripGuid(field)
{
//check if field has GUID then strip off, return new strip field
//else return field as it is
}
Sample data:-
4922093F-148F-4220-B321-0FBB1843B5DDrec_guid
4922093F-148F-4220-B321-0FBB1843B5DDdate_add
tablenam
sessguid
How I call function:-
stripGuid(sampleData);
Expected Output:-
rec_guid
date_add
tablenam
sessguid
As with most things regex, the complexity of the answer depends on how many cases you need to cover, for example:
return field.replace(/[0-9a-fA-F\-]{36}/g, "");
will cover most cases and isn't too terrible to read, but it fails in some pretty important cases, so it might be that
return field.replace(/[0-9a-fA-F]{8}\-?[0-9a-fA-F]{4}\-?[0-9a-fA-F]{4}\-?[0-9a-fA-F]{4}\-?[0-9a-fA-F]{12}/g, "");
would work better for you (as it ensures that the dashes are all in the right place, and that there are the right number of them). The better option depends on how standardized you expect the input to be and where your project draws the line between readability and correctness. Without more details it's hard to say what would be best for you.
Edit: Nate Kerkhofs is right, I had left off the global flag on the above regexes, it's fixed now.
return field.replace(/[a-f0-9]{8}-(?:[a-f0-9]{4}-){3}[a-f0-9]{12}/gi, "");
Unlike teryret's solution, this removes all guids, and also ignores case for easier reading.
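Putting the last pattern into the function prototype from the question gives something like the sketch below. The constant name GUID_PATTERN is made up for this example; the sample values are the ones from the question.

```javascript
// Matches the 8-4-4-4-12 hex layout, case-insensitively, everywhere in the string.
const GUID_PATTERN = /[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}/gi;

// Returns the field with any GUID removed, or the field unchanged if none is present.
function stripGuid(field) {
  return field.replace(GUID_PATTERN, '');
}

const sampleData = [
  '4922093F-148F-4220-B321-0FBB1843B5DDrec_guid',
  '4922093F-148F-4220-B321-0FBB1843B5DDdate_add',
  'tablenam',
  'sessguid',
];
const stripped = sampleData.map(stripGuid);
```

Note that "date_add" starts with hex characters (d, a), so a looser pattern that does not fix the length of the last group could swallow part of the field name.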

Is there a way to detect what the input language setting is currently?

I basically want to know what the system's input language is currently set to (for users who have multiple language input methods set up). This will determine whether the text direction of a <textarea> should be rtl or not.
Please keep in mind that this setting can change after the page is loaded.
Is there a simple way of doing it in JavaScript/jQuery?
There is no way for the browser to tell what the current keyboard layout (input language) is. As @kirilloid mentioned, one possible workaround is to check the keycode on keyup and determine the language from that.
Javascript has no access to the Accept-Language HTTP header which is the way the browser transfers this information to the server.
This means that you'll have to use server-side scripting in some way or another and send the Accept-Language value as a javascript variable.
If you want to check it dynamically you might do an ajax call to a server side script which simply returns the Accept-Language header from the ajax request. That way you will probably catch those who change their language settings after loading the page.
I came across this situation, so I wrote a little jQuery plugin for that:
$.fn.checkDirection = function() {
var dir = checkRTL(this.val()[0]) ? 'RTL' : 'LTR';
this.css("direction", dir);
function checkRTL(s) {
var ltrChars = 'A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF' + '\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF',
rtlChars = '\u0591-\u07FF\uFB1D-\uFDFD\uFE70-\uFEFC',
rtlDirCheck = new RegExp('^[^' + ltrChars + ']*[' + rtlChars + ']');
return rtlDirCheck.test(s);
}
};
Basically, as soon as the user starts typing, it will check the language's direction, and set the selector's direction accordingly. The usage will be as:
$('textarea').on('input', function(){
$(this).checkDirection();
});
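The same check can be written without jQuery as a plain function, which makes it easier to try on sample strings. This is a sketch that reuses the character ranges from the plugin above; the names LTR_CHARS, RTL_CHARS, RTL_START and detectDirection are hypothetical. Unlike the plugin, which passes only the first character of the value, this version looks at the whole string, so leading digits or spaces are skipped.

```javascript
// Character ranges copied from the plugin above.
const LTR_CHARS = 'A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF' +
  '\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF';
const RTL_CHARS = '\u0591-\u07FF\uFB1D-\uFDFD\uFE70-\uFEFC';

// Matches when the first strongly directional character in the text is RTL.
const RTL_START = new RegExp('^[^' + LTR_CHARS + ']*[' + RTL_CHARS + ']');

// Hypothetical helper: returns 'rtl' or 'ltr' for a piece of text.
function detectDirection(text) {
  return RTL_START.test(text) ? 'rtl' : 'ltr';
}
```

The result could then be assigned to the element's dir attribute or its CSS direction property from the input handler.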
Maybe a solution would be to have a map between the key code (the place the user pressed on the keyboard)
and the actual letters/characters/symbols written, or their actual UTF-16 code points.
Different keyboards sometimes write out different characters even if the keys are positioned in the same place on the keyboard (marked by a key code).
So, for example, the same key (key code) on the keyboard could be mapped to these symbols/characters/letters:
Ç - in Spain
ú - in Italy
# - in Germany
µ - in Belgium
Here is some sample code that does not use any mapping to determine what language it could be, but it's a start, since it checks the key code, the actual character/letter/symbol or glyph, and its UTF-16 code point as an integer. Also, as far as I could see on MDN, none of the methods and properties used here are marked as deprecated, as opposed to charCode and keyCode, which are used in some examples out there.
var keyCode;
var character;
var utfCodePointInDecimal
function checkCodes(event) {
var that = event
checkKeyCode(that);
checkChar(that);
charUTFPoint();
console.log("keycode/location on keyboard: " + keyCode);
console.log("character: " + character);
console.log("utf code point: " + utfCodePointInDecimal)
}
function checkKeyCode(e) {
keyCode = e.code;
}
function checkChar(e) {
character = e.key;
}
function charUTFPoint() {
utfCodePointInDecimal = Number(character.codePointAt(0));
}
window.onkeydown = checkCodes;
Here is a link to a UTF-16 list of characters (note: the code points are in hex).
http://www.fileformat.info/info/charset/UTF-16/list.htm
And here is a site I found with different mappings per language. It is in C, I believe, but you can probably use it as a template and then write your own mapping in JavaScript.
https://beta.docs.qmk.fm/using-qmk/simple-keycodes/reference_keymap_extras

Coding convention in Javascript: use of spaces between parentheses

According to JSHint, a Javascript programmer should not add a space after the first parenthesis and before the last one.
I have seen a lot of good Javascript libraries that add spaces, like this:
( foo === bar ) // bad according to JSHint
instead of this way:
(foo === bar) // good according to JSHint
Frankly, I prefer the first way (more spaces) because it makes the code more readable. Is there a strong reason to prefer the second way, which is recommended by JSHint?
This is my personal preference with reasons as to why.
I will discuss the following items in the accepted answer but in reverse order.
note-one not picking on Alnitak, these comments are common to us all...
note-two Code examples are not written as code blocks, because syntax highlighting detracts from the actual question, which is about whitespace only.
I've always done it that way.
Not only is this never a good reason to defend a practice in programming, but it also is never a good reason to defend ANY idea opposing change.
JS file download size matters [although minification does of course fix that]
Size will always matter for Any file(s) that are to be sent over-the-wire, which is why we have minification to remove unnecessary whitespace. Since JS files can now be reduced, the debate over whitespace in production code is moot.
moot: of little or no practical value or meaning; purely academic.
moot definition
Now we move on to the core issue of this question. The following ideas are mine only, and I understand that debate may ensue. I do not profess that this practice is correct, merely that it is currently correct for me. I am willing to discuss alternatives to this idea if it is sufficiently shown to be a poor choice.
It's perfectly readable and follows the vast majority of formatting conventions in Javascript's ancestor languages
There are two parts to this statement: "It's perfectly readable,"; "and follows the vast majority of formatting conventions in Javascript's ancestor languages"
The second item can be dismissed as the same idea as "I've always done it that way."
So let's just focus on the first part of the statement: "It's perfectly readable,"
First, let's make a few statements regarding code.
Programming languages are not for computers to read, but for humans to read.
In the English language, we read left to right, top to bottom.
Following established practices in English grammar will result in more easily read code by a larger percentage of programmers that code in English.
NOTE: I am establishing my case for the English language only, but may apply generally to many Latin-based languages.
Let's reduce the first statement by removing the adverb perfectly as it assumes that there can be no improvement. Let's instead work on what's left: "It's readable". In fact, we could go all JS on it and create a variable: "isReadable" as a boolean.
THE QUESTION
The question provides two alternatives:
( foo === bar )
(foo === bar)
Lacking any context, we could err on the side of English grammar and go with the second option, which removes the whitespace. However, in both cases "isReadable" would easily be true.
So let's take this a step further and remove all whitespace...
(foo===bar)
Could we still claim isReadable to be true? This is where a boolean value might not apply so generally. Let's move isReadable to a float where 0 is unreadable and 1 is perfectly readable.
In the previous three examples, we could assume that we would get a collection of values ranging from 0 - 1 for each of the individual examples, from each person we asked: "On a scale of 0 - 1, how would you rate the readability of this text?"
Now let's add some JS context to the examples...
if ( foo === bar ) { } ;
if(foo === bar){};
if(foo===bar){};
Again, here is our question: "On a scale of 0 - 1, how would you rate the readability of this text?"
I will make the assumption here that there is a balance to whitespace: too little whitespace and isReadable approaches 0; too much whitespace and isReadable approaches 0.
example: "Howareyou?" and "How are you ?"
If we continued to ask this question after many JS examples, we may discover an average limit to acceptable whitespace, which may be close to the grammar rules in the English language.
But first, let's move on to another example of parentheses in JS: the function!
function isReadable(one, two, three){};
function examineString(string){};
The two function examples follow the current standard of no whitespace between () except after commas. The next argument below is not concerned with how whitespace is used when declaring a function like the examples above, but instead the most important part of the readability of code: where the code is invoked!
Ask this question regarding each of the examples below...
"On a scale of 0 - 1, how would you rate the readability of this text?"
examineString(isReadable(string));
examineString( isReadable( string ));
The second example makes use of my own rule
whitespace in-between parentheses between words, but not between opening or closing punctuation.
i.e. not like this examineString( isReadable( string ) ) ;
but like this examineString( isReadable( string ));
or this examineString( isReadable({ string: string, thing: thing }));
If we were to use English grammar rules, then we would space before the "(" and our code would be...
examineString (isReadable (string));
I am not in favor of this practice as it breaks apart the function invocation away from the function, which it should be part of.
examineString(); // yes; examineString (); // no;
Since we are not exactly mirroring proper English grammar, but English grammar does say that a break is needed, then perhaps adding whitespace in-between parentheses might get us closer to 1 with isReadable?
I'll leave it up to you all, but remember the basic question:
"Does this change make it more readable, or less?"
Here are some more examples in support of my case.
Assume functions and variables have already been declared...
input.$setViewValue(setToUpperLimit(inputValue));
Is this how we write a proper English sentence?
input.$setViewValue( setToUpperLimit( inputValue ));
closer to 1?
config.urls['pay-me-now'].initialize(filterSomeValues).then(magic);
or
config.urls[ 'pay-me-now' ].initialize( filterSomeValues ).then( magic );
(spaces just like we do with operators)
Could you imagine no whitespace around operators?
var hello='someting';
if(type===undefined){};
var string="I"+"can\'t"+"read"+"this";
What I do...
I space between (), {}, and []; as in the following examples
function hello( one, two, three ){
return one;
}
hello( one );
hello({ key: value, thing1: thing2 });
var array = [ 1, 2, 3, 4 ];
array.slice( 0, 1 );
chain[ 'things' ].together( andKeepThemReadable, withPunctuation, andWhitespace ).but( notTooMuch );
There are few if any technical reasons to prefer one over the other - the reasons are almost entirely subjective.
In my case I would use the second format, simply because:
It's perfectly readable, and follows the vast majority of formatting conventions in Javascript's ancestor languages
JS file download size matters [although minification does of course fix that]
I've always done it that way.
Quoting Code Conventions for the JavaScript Programming Language:
All binary operators except . (period) and ( (left parenthesis) and [ (left bracket) should be separated from their operands by a space.
and:
There should be no space between the name of a function and the ( (left parenthesis) of its parameter list.
I prefer the second format. However there are also coding style standards out there that insist on the first. Given the fact that javascript is often transmitted as source (e.g. any client-side code), one could see a slightly stronger case with it than with other languages, but only marginally so.
I find the second more readable, you find the first more readable, and since we aren't working on the same code we should each stick as we like. Were you and I to collaborate then it would probably be better that we picked one rather than mixed them (less readable than either), but while there have been holy wars on such matters since long before javascript was around (in other languages with similar syntax such as C), both have their merits.
I use the second (no space) style most of the time, but sometimes I put spaces if there are nested brackets - especially nested square brackets which for some reason I find harder to read than nested curved brackets (parentheses). Or to put that another way, I'll start any given expression without spaces, but if I find it hard to read I insert a few spaces to compare, and leave 'em in if they helped.
Regarding JSHint, I wouldn't worry: this particular recommendation is more a matter of opinion. You're not likely to introduce bugs because of this one.
I used JSHint to lint this code snippet and it didn't give any such advice:
if( window )
{
var me = 'me';
}
I personally use no spaces between the arguments in parentheses and the parentheses themselves for one reason: I use keyboard navigation and keyboard shortcuts. When I navigate around the code, I expect the cursor to jump to the next variable name, symbol etc, but adding spaces messes things up for me.
It's just personal preference as it all gets converted to the same bytecode/binary at the end of the day!
Standards are important and we should follow them, but not blindly.
To me, this question comes down to the idea that syntax styling should be all about readability.
this.someMethod(toString(value),max(value1,value2),myStream(fileName));
this.someMethod( toString( value ), max( value1, value2 ), myStream( fileName ) );
The second line is clearly more readable to me.
In the end, it may come down to personal preference, but I would ask those who prefer the 1st line whether they really make their choice because "they are used to it" or because they truly believe it's more readable.
If it's something you are used to, then a short time investment into a minor discomfort for a long term benefit might be worth the switch.
