Javascript split string by first instance of lowercase character

I would like to take PascalCase string inputs and split them into lowercase words joined by hyphens.
"HelloWorld" becomes "hello-world"
I can do that without a problem, but my regex attempts start to break down when a person supplies something like the following:
"FAQ" becomes "f-a-q"
I want it to keep FAQ as "faq", so I think I need to split the string at each first instance of a change from lowercase to uppercase, correct?
My regex right now is:
name.split(/(?=[A-Z])/).join('-').toLowerCase()

You could insert a hyphen after every character that is not an uppercase letter and is followed by an uppercase letter, then lowercase the result.
function hyphenate(string) {
    return string.replace(/[^A-Z](?=[A-Z])/g, '$&-').toLowerCase();
}
console.log(hyphenate("FAQ")); // faq
console.log(hyphenate("ReadTheFAQ")); // read-the-faq
console.log(hyphenate("HelloWorld")); // hello-world
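The function above keeps a run of capitals together, but an acronym followed by another word (for example "FAQPage") would come out as one word. A minimal sketch of one way to handle that case, assuming you want "faq-page"; the name hyphenateWithAcronyms is hypothetical and the rule for where an acronym ends is an assumption, not something the question specifies:

```javascript
// Hypothetical helper: splits PascalCase into hyphenated lowercase words,
// treating a run of capitals as one word (an assumption about the desired output).
function hyphenateWithAcronyms(name) {
    return name
        // "FAQPage" -> "FAQ-Page": break between an acronym and the next capitalized word
        .replace(/([A-Z]+)([A-Z][a-z])/g, "$1-$2")
        // "HelloWorld" -> "Hello-World": break between a lowercase letter or digit and a capital
        .replace(/([a-z0-9])([A-Z])/g, "$1-$2")
        .toLowerCase();
}

console.log(hyphenateWithAcronyms("FAQ"));        // faq
console.log(hyphenateWithAcronyms("FAQPage"));    // faq-page
console.log(hyphenateWithAcronyms("ReadTheFAQ")); // read-the-faq
console.log(hyphenateWithAcronyms("HelloWorld")); // hello-world
```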

Related

How to write regexp for finding :smile: in javascript?

I want to write a regular expression in JavaScript that finds substrings starting and ending with :.
For example, in the string "hello :smile: :sleeping:" I need to find the substrings that start and end with the : character. I tried the expression below, but it didn't work:
^:.*\:$

My guess is that you not only want to find the string, but also replace it. For that you should look at using a capture in the regexp combined with a replacement function.
const emojiPattern = /:(\w+):/g
function replaceEmojiTags(text) {
    return text.replace(emojiPattern, function (tag, emotion) {
        // The emotion will be the captured word between your tags,
        // so either "smile" or "sleeping" in your example
        //
        // In this function you would take that emotion and return
        // whatever you want based on the input parameter and the
        // whole tag would be replaced
        //
        // As an example, let's say you had a bunch of GIF images
        // for the different emotions:
        return '<img src="/img/emoji/' + emotion + '.gif" />';
    });
}
With that code you could then run your function on any input string and replace the tags to get the HTML for the actual images in them. As in your example:
replaceEmojiTags('hello :smile: :sleeping:')
// 'hello <img src="/img/emoji/smile.gif" /> <img src="/img/emoji/sleeping.gif" />'
EDIT: To support hyphens within the emotion, as in "big-smile", the pattern needs to change, since it only looks for word characters. Presumably a hyphen must join two words, so the pattern shouldn't accept "-big-smile" or "big-smile-". For that you need to change the pattern to:
const emojiPattern = /:(\w+(-\w+)*):/g
That pattern looks for a word followed by zero or more instances of a hyphen followed by a word. It would match any of the following: "smile", "big-smile", "big-smile-bigger".
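As a quick check of the hyphen rule, here is a small sketch that lists which tags the pattern accepts. The helper name extractEmojiNames is hypothetical, and the inner group is written as non-capturing, which is a small assumption-free variation that matches the same text:

```javascript
// Same pattern as above, with the inner group made non-capturing
const hyphenatedEmojiPattern = /:(\w+(?:-\w+)*):/g;

// Hypothetical helper: returns the names found between colons
function extractEmojiNames(text) {
    return Array.from(text.matchAll(hyphenatedEmojiPattern), (match) => match[1]);
}

console.log(extractEmojiNames("hello :smile: :big-smile:")); // [ 'smile', 'big-smile' ]
console.log(extractEmojiNames(":-big-smile:"));              // []
console.log(extractEmojiNames(":big-smile-:"));              // []
```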

The ^ and $ are anchors (start and end of the string respectively). They make your regex match only an entire string that starts with :, has anything in between, and ends with :.
If you want to match substrings within a larger string, remove the anchors.
Your * means zero or more, so you'll be matching :: as well. It is better to change it to +, which means one or more. In fact, if you're only looking for text you may want to use a character class such as [a-z0-9] with the case-insensitive modifier.
Putting it all together gives a regex like this: /:([a-z0-9]+):/gmi
It matches a :, then one or more alphanumeric characters, then a closing :. The modifiers are g (global), m (multi-line) and i (case insensitive, for things like :FacePalm:).
Using it in JavaScript we end up with:
var mytext = 'Hello :smile: and jolly :wave:';
var matches = mytext.match(/:([a-z0-9]+):/gmi);
// matches = [':smile:', ':wave:'];
You'll have an array with each match found.
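With the g flag, match returns only the full matches and drops the capture groups. If you want just the names between the colons, one option is matchAll, sketched here; the variable names are illustrative only:

```javascript
const emojiText = "Hello :smile: and jolly :wave: and :FacePalm:";

// matchAll yields one match object per hit; index 1 holds the first capture group
const emojiNames = Array.from(
    emojiText.matchAll(/:([a-z0-9]+):/gi),
    (match) => match[1]
);

console.log(emojiNames); // [ 'smile', 'wave', 'FacePalm' ]
```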

Check if sentence contains a phrase

Sentences:
Hey checkout Hello World <- SHOULD BE INCLUDED
hello world is nice! <- SHOULD BE INCLUDED
Hhello World should not work <- SHOULD NOT BE INCLUDED
This too Hhhello World <- SHOULD NOT BE INCLUDED
var phraseToSearch = "Hello World";
Do note: sentence.ToLower().IndexOf(phraseToSearch.ToLower()) would not work as it would include all the above sentences while the result should only include sentences 1 and 2

You can use a regular expression to match a character pattern against a string.
The regular expression simply looks for the exact letters of Hello World, wrapped in \b (a word boundary), with the i (case-insensitive) modifier.
A RegExp has a test method that runs the regular expression on the given string and returns true if it matched.
const phraseToSearch = /\bhello world\b/i
const str1 = 'Hey checkout Hello World'
const str2 = 'hello world is nice!'
const str3 = 'Hhello World should not work'
const str4 = 'This too Hhhello World'
console.log(
    phraseToSearch.test(str1),
    phraseToSearch.test(str2),
    phraseToSearch.test(str3),
    phraseToSearch.test(str4)
) // true true false false

You probably want to use a regular expression. Here are the things you want to match:
Text (with spaces surrounding it)
... Text (with space on one side, and end of text on the other)
Text ... (with space on one side, and start of text on the other)
Text (just the string, on its own)
One way to do it without a regular expression is to write 4 conditions (one for each case above) and join them with ||, but that would lead to messy code.
Another option is to split both strings by spaces and check whether one array is a subarray of the other.
However, my solution uses a regular expression, which is a pattern you can test on a string.
Our pattern should
Look for a space/start of string
Check for the string
Look for a space/end of string
\b matches a word boundary: the position between a word character and a non-word character (such as a space or other separator), including the start and end of the string.
Here is the code:
function doesContain(str, query) { // is query in str?
    // "\\b" is required: inside a string literal, "\b" is a backspace character
    return new RegExp("\\b" + query + "\\b", "i").test(str)
}
The i makes the match case insensitive.
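When the phrase comes from user input, characters such as . or ( would be interpreted as regex syntax. A minimal sketch of one way to guard against that, assuming the phrase starts and ends with a word character (otherwise \b will not behave as intended); escapeRegExp and containsPhrase are hypothetical helper names:

```javascript
// Hypothetical helper: escapes characters that have a special meaning in a regex
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: true if phrase occurs in sentence as whole words, ignoring case
function containsPhrase(sentence, phrase) {
    const pattern = new RegExp("\\b" + escapeRegExp(phrase) + "\\b", "i");
    return pattern.test(sentence);
}

console.log(containsPhrase("Hey checkout Hello World", "Hello World"));     // true
console.log(containsPhrase("hello world is nice!", "Hello World"));         // true
console.log(containsPhrase("Hhello World should not work", "Hello World")); // false
console.log(containsPhrase("This too Hhhello World", "Hello World"));       // false
```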

split on words except when phrase contains that word

I am trying to split where clauses, I want to split text on AND|OR|NOT except when NOT is in the 'phrase' NOT IN or NOT LIKE or IS NOT NULL.
1st example:
DEVLDATE IS NOT NULL AND STATUS = D AND PICKUPDATE IS NULL
I expect 3 segments, splitting on the AND's, but not on the NOT in this instance.
2nd ex:
(NOT (STATUS IN ('A','X') )) AND LINEHAUL = 0
I want to split on this NOT & AND, also expecting 3 segments in this instance
I'm trying this look ahead from another almost similar example but it is not splitting at all. I have next to zero regex experience. Not sure what I'm missing or if it's even possible.
Thanks in advance.
var ignoreRegex = /(?!.*\b([NOT IN]|[NOT LIKE]|[NOT BETWEEN]|[IS NOT NULL])\b)(?=.*\b(AND|OR|NOT)\b)/g
var filterArray = filterBy.split(new RegExp(ignoreRegex));

Try with:
\b(AND|OR|NOT(?!\s+(?:NULL|IN|LIKE|BETWEEN)\b))\b
About your regex:
(?!.*\b([NOT IN]|[NOT LIKE]|[NOT BETWEEN]|[IS NOT NULL])\b)(?=.*\b(AND|OR|NOT)\b)
[NOT IN] is a character class: [...] matches a single character from the set you put inside it (here N, O, T, I or a space), not the whole word or phrase.
([NOT IN]|[NOT LIKE]|[NOT BETWEEN]|[IS NOT NULL]) can therefore match only one character, because it uses no quantifiers or intervals, so it doesn't work as you expect at all.
As a whole, the regex matches only where some text containing AND, OR or NOT follows, and only if the rest of the line contains none of the letters or spaces from the character classes standing on their own between word boundaries, so it will probably not match anything.
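A minimal sketch of splitting with a negative lookahead of this kind, run against the two examples from the question. It assumes uppercase keywords, uses a non-capturing group so that split does not include the keywords in the result, and also swallows the whitespace around each keyword; the names CONNECTIVE and splitWhereClause are hypothetical:

```javascript
// AND, OR, or a NOT that is not followed by NULL, IN, LIKE or BETWEEN
const CONNECTIVE = /\s*\b(?:AND|OR|NOT(?!\s+(?:NULL|IN|LIKE|BETWEEN)\b))\b\s*/;

// Hypothetical helper: splits a WHERE clause into segments at the connectives
function splitWhereClause(clause) {
    return clause.split(CONNECTIVE);
}

console.log(splitWhereClause("DEVLDATE IS NOT NULL AND STATUS = D AND PICKUPDATE IS NULL"));
// [ 'DEVLDATE IS NOT NULL', 'STATUS = D', 'PICKUPDATE IS NULL' ]

console.log(splitWhereClause("(NOT (STATUS IN ('A','X') )) AND LINEHAUL = 0"));
// [ '(', "(STATUS IN ('A','X') ))", 'LINEHAUL = 0' ]
```

Note that splitting this way leaves parentheses unbalanced in the second example; handling nesting properly would need a parser rather than a single regex.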

How to make this simple regexp?

I need to validate a string that starts and ends with an alphanumeric character, is between 5 and 20 characters long, and has either one space or none between characters. I tried /^[a-z\s?A-Z0-9]{5,20}$/ but this is not working.
EDIT
test test -should pass
testtest -should pass
test test test -should not pass

You can't do this with a traditional regex without writing a ridiculously long expression, so you need to use a look-ahead:
/^(?=(\w| ){5,20}$)\w+ ?\w+$/
This says: make sure the whole match is between 5 and 20 characters long, then match /\w+ ?\w+/
Note that I used \w for simplicity. It is the same as your character class above except that it also accepts underscores. If you don't want to match them, you have to write:
/^(?=[a-zA-Z0-9 ]{5,20}$)[a-zA-Z0-9]+ ?[a-zA-Z0-9]+$/
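To see the look-ahead approach against the three inputs from the question, here is a small sketch using the 5 to 20 character limit the question asks for. The names NAME_PATTERN and isValidName are hypothetical:

```javascript
// Look-ahead checks the overall length (5 to 20 letters, digits or spaces);
// the rest allows at most one space, and never at the start or end.
const NAME_PATTERN = /^(?=[a-zA-Z0-9 ]{5,20}$)[a-zA-Z0-9]+ ?[a-zA-Z0-9]+$/;

// Hypothetical helper wrapping the pattern
function isValidName(value) {
    return NAME_PATTERN.test(value);
}

console.log(isValidName("test test"));      // true
console.log(isValidName("testtest"));       // true
console.log(isValidName("test test test")); // false
```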

You can't put a ? inside of [...]. [...] is used to specify a set of characters precisely; you can't "maybe" (?) have a character inside a set of characters. The occurrence of any specific character is already optional, and inside the brackets the ? is just a literal question mark.
If you allow any number of spaces inside your match, just remove the question mark. If you want to allow a single space but no more, a single character class can't express that; you'd need something like
if (myString.match(/^[a-z\sA-Z0-9]{5,20}$/) && (myString.match(/\s/g) || []).length <= 1)
You couldn't do this with a single traditional regex (one without look-ahead) without it being very long; regexes are meant for matching simpler patterns than this.
If you only want to use regexes, you could use two instead of one. The first matches the general pattern; the second ensures that at most one whitespace character is found.
if (myString.match(/^[a-z\sA-Z0-9]{5,20}$/) && myString.match(/^[^\s]*\s?[^\s]*$/)) {
Example Usage
var inputs = ["test test", "testtest", "test test test"];
for (var myString of inputs) {
    // first regex: allowed characters and length; second regex: at most one whitespace character
    if (myString.match(/^[a-z\sA-Z0-9]{5,20}$/) && myString.match(/^[^\s]*\s?[^\s]*$/)) {
        console.log(myString + " matches.")
    } else {
        console.log(myString + " does not match.")
    }
}
This produces the output specified in your question.

So here's the ridiculously long traditional regex for the same:
(?i)[a-z0-9]+( [a-z0-9]+)?{5,12}
JS version (without the nested quantifier):
/^([a-z0-9]( [a-z0-9])?){5,12}$/i
Note that the JS version allows a space after any character, so it also accepts "test test test".

Javascript RegExp Matching weirdness

I have a RegExp:
/.?(NCAA|Division|I|Basketball|Champions,|1939-2011).?/gi
and some text "Champion"
somehow, this is coming back as a match, am I crazy?
0: "pio"
1: "i"
index: 4
input: "Champion"
length: 2
the loop is here:
// construct the pattern dynamically
var someText = "Champion";
var phrase = ".?(NCAA|Division|I|Basketball|Champions,|1939-2011).?";
var pat = new RegExp(phrase, "gi"); // <- ends up being
var result;
while( result = pat.exec(someText) ) {
    // do stuff!
}
There has to be something wrong with my RegExp, right?
EDIT:
The .? thing was just a quick and dirty attempt to say that I'd like to match one of those words AND/OR one of those words with a single char on either side. ex:
\sNCAA\s
NCAA
NCAA\s
\sNCAA
GOAL:
I'm trying to do some simple hit highlighting based on some search words. I've got a function that gets all of the text nodes on a page, and I'd like to go through them all and highlight any matches to any of the terms in my phrase variable.
I think that I just need to rework how I am building my RegExp.

Well, first of all you're specifying case-insensitivity, and secondly, the letter I is one of your matchable strings.
Champion produces the match pio with the captured group i, because that substring matches /.?I.?/gi
It doesn't, however, match /.?Champions,.?/gi, because Champion lacks the trailing s and comma.

Add start (^) and end ($) anchors to the regexp.
/^.?(NCAA|Division|I|Basketball|Champions,|1939-2011).?$/gi
Without the anchors, the regexp's match can start and end anywhere in the string, which is why
/.?(NCAA|Division|I|Basketball|Champions,|1939-2011).?/gi.exec('Champion')
can match pio and i: because it's actually matching around the (case-insensitive) I. If you leave the anchors off, but remove the ...|I|..., the regex won't match 'Champion':
> /.?(NCAA|Division|Basketball|Champions,|1939-2011).?/gi.exec('Champion')
null

Champion matches /.?I.?/i.
Your own output notes that it's matching the substring "pio".
Perhaps you meant to bound the expression to the start and end of the input, with ^ and $ respectively:
/^.?(NCAA|Division|I|Basketball|Champions,|1939-2011).?$/gi
I know you said to ignore the .?, but I can't: it's most likely wrong, and it's most likely going to continue to cause you problems. Explain why they're there and we can tell you how to do it properly. :)
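For the stated goal of highlighting whole search terms, word boundaries may be a better fit than .? on either side. A minimal sketch, assuming the terms begin and end with a letter or digit (so "Champions," is listed here without its comma); escapeRegExp, buildTermPattern and findTerms are hypothetical names:

```javascript
// Hypothetical helper: escapes characters that have a special meaning in a regex
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: one pattern that matches any of the terms as whole words
function buildTermPattern(terms) {
    const alternatives = terms.map(escapeRegExp).join("|");
    return new RegExp("\\b(?:" + alternatives + ")\\b", "gi");
}

const termPattern = buildTermPattern(
    ["NCAA", "Division", "I", "Basketball", "Champions", "1939-2011"]
);

// Hypothetical helper: returns every term occurrence found in the text
function findTerms(text) {
    return Array.from(text.matchAll(termPattern), (match) => match[0]);
}

console.log(findTerms("Champion")); // []
console.log(findTerms("NCAA Division I Basketball Champions, 1939-2011"));
// [ 'NCAA', 'Division', 'I', 'Basketball', 'Champions', '1939-2011' ]
```

Because \b requires a word boundary on both sides, the i inside Champion no longer counts as a hit for the term I.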
