Remove Unfinished String with Ellipsis using regex - javascript

I'm trying to remove strings in which an ellipsis cuts off an unfinished word or sentence, but not every string that contains an ellipsis:
1. String ... String
2. String String Str...
3. String string String ...
4. String Strin... String
5. String String ... Stri==...
Output:
1. String ... String
3. String string String ...
My first thought was to iterate over each sentence, but I think a regex would be much better.
Is that possible with a regex?
If so, how? I tried a few regexes without success.
Any help will be appreciated.
PS: I can't post the actual strings (company policy), which is why I posted these dummy examples.
Edit:
I've tried regexes like:
/(\.*)\.\.\./mgi (I'm not an expert)
but it fails in some cases.
I will retrieve each sentence in an array of strings, not one huge, messy string.
Basically, I need to discard anything with an unfinished word or sentence (anything with a word or a single character directly in front of an ellipsis).

I assume an invalid sentence always has a word with ... immediately after it.
In the regex below, the character class lists the characters that separate your words, and you can put anything there. For now, I put . and the space character.
var str = `1. String ... String
2. String String Str...
3. String string String ...
4. String Strin... String
5. String String ... Stri==...`;
var cleaned = str.split('\n').filter(function (line) {
  return !line.match(/[^\. ]+\.{3}/);
}).join('\n');
console.log(cleaned);
/*
prints
1. String ... String
3. String string String ...
*/

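Yes, it is. You are basically looking for [one or more word characters][...], which as a regex is:
\w+\.{3}
This assumes that an ellipsis is always three dots; if it isn't, you can use \.+ instead. Use that to find the sentences you want to remove, then keep the other items. Note that \w matches only letters, digits, and the underscore, so a fragment such as Stri==... is not caught; use [^\s.]+ in place of \w+ if such characters can precede the ellipsis.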
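As a minimal sketch of the filtering step, assuming the sentences arrive as an array of strings (as the question's edit says) and that an ellipsis is exactly three dots. The helper name dropTruncated and the pattern [^\s.]\.{3} are illustrative choices, not taken from either answer. The pattern flags any character other than whitespace or a dot that sits directly before the three dots.

```javascript
// Hypothetical helper: keeps only the sentences that contain no truncated word.
// Assumption: an ellipsis is exactly three ASCII dots.
function dropTruncated(sentences) {
  // A character that is neither whitespace nor a dot, directly followed by "..."
  var truncated = /[^\s.]\.{3}/;
  return sentences.filter(function (sentence) {
    return !truncated.test(sentence);
  });
}

var sentences = [
  '1. String ... String',
  '2. String String Str...',
  '3. String string String ...',
  '4. String Strin... String',
  '5. String String ... Stri==...'
];

// Keeps sentences 1 and 3 from the question.
console.log(dropTruncated(sentences));
```

Filtering the array directly avoids joining and re-splitting on newlines.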

Related

Javascript split string by first instance of lowercase character

I would like to take Pascal-case string inputs and split them up with hyphens.
"HelloWorld" becomes "hello-world"
I'm able to do that no problem, however my regex attempts start to break down when say a person supplies the following:
"FAQ" becomes "f-a-q"
I want it to keep FAQ as "faq", so I think I need to split only where a lowercase letter is followed by an uppercase one, correct?
My regex right now is:
name.split(/(?=[A-Z])/).join('-').toLowerCase()
You could insert a hyphen after every non-uppercase character that is followed by an uppercase letter, then lowercase the result.
function hyphenate(string) {
  return string.replace(/[^A-Z](?=[A-Z])/g, '$&-').toLowerCase();
}
console.log(hyphenate("FAQ")); // faq
console.log(hyphenate("ReadTheFAQ")); // read-the-faq
console.log(hyphenate("HelloWorld")); // hello-world
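The replace above leaves an acronym glued to a following word ("FAQPage" becomes "faqpage"). If that case matters, a two-pass variant is one possible sketch. The name hyphenateWithAcronyms is made up here, and treating a run of capitals followed by a capitalized word as "acronym plus word" is an assumption about the desired output, not something the question specifies.

```javascript
// Hypothetical variant: also separates an acronym from the word that follows it.
function hyphenateWithAcronyms(name) {
  return name
    // Run of capitals followed by a capitalized word: "FAQPage" -> "FAQ-Page"
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1-$2')
    // Lowercase letter or digit followed by a capital: "HelloWorld" -> "Hello-World"
    .replace(/([a-z0-9])([A-Z])/g, '$1-$2')
    .toLowerCase();
}

console.log(hyphenateWithAcronyms('FAQ'));        // faq
console.log(hyphenateWithAcronyms('ReadTheFAQ')); // read-the-faq
console.log(hyphenateWithAcronyms('HelloWorld')); // hello-world
console.log(hyphenateWithAcronyms('FAQPage'));    // faq-page
```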

Extract numeric and text parts of a string, in varying formats

I'm trying to put together a RegEx to split a variety of possible user inputs, and while I've managed to succeed with some cases, I've not managed to cover every case that I'd like to.
Possible inputs, and expected outputs
"1 day" > [1,"day"]
"1day" > [1,"day"]
"10,000 days" > [10000,"days"]
Is it possible to split the numeric and text parts from the string without necessarily having a space, and to also remove the commas etc from the string at the same time?
This is what I've got at the moment
[a-zA-Z]+|[0-9]+
Which seems to split the numeric and text portions nicely, but is tripped up by commas. (Actually, as I write this, I'm thinking I could use the last part of the results array as the text part, and concatenate all the other parts as the numeric part?)
var test = [
  '1 day',
  '1day',
  '10,000 days',
];
console.log(test.map(function (a) {
  a = a.replace(/(\d),(\d)/g, '$1$2'); // remove the commas
  return a.match(/^(\d+)\s*(.+)$/); // split in two parts
}));
This regular expression works, apart from removing the comma from the matched number string:
([0-9,]+) *(.*)
You cannot "ignore" a character in a returned regular expression match string, so you will just have to remove the comma from the returned regex match afterwards.
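Putting the two steps together (match first, strip the commas afterwards), a minimal sketch could look like this. The function name splitQuantity is hypothetical, and returning null for input that does not start with a number is an assumption, since the question does not say what should happen in that case.

```javascript
// Hypothetical helper: "10,000 days" -> [10000, "days"]
function splitQuantity(input) {
  // Group 1: digits and commas; group 2: whatever follows the optional spaces
  var match = /^\s*([\d,]+)\s*(.*)$/.exec(input);
  if (!match) {
    return null; // assumption: no leading number means "not parseable"
  }
  // The comma cannot be skipped inside the match, so remove it afterwards
  var amount = Number(match[1].replace(/,/g, ''));
  return [amount, match[2].trim()];
}

console.log(splitQuantity('1 day'));       // [ 1, 'day' ]
console.log(splitQuantity('1day'));        // [ 1, 'day' ]
console.log(splitQuantity('10,000 days')); // [ 10000, 'days' ]
```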

replace first occurrence of string after another string

I tried to find this online without any success.
Let's say I have the following string:
this is a string test with a lot of string words here another string string there string here string.
I need to replace the first 'string' after the first 'here' with 'anotherString', so the output will be:
this is a string test with a lot of string words here another anotherString string there string here string.
Thank you all for the help!
You don't need to add the g modifier when replacing only the first occurrence.
str.replace(/\b(here\b.*?)\bstring\b/, "$1anotherString");
If you are looking for something that takes a sentence and replaces the first occurrence of "string" after "here" (using the example in your case):
Use split() with a capturing group, so that the string is split only at the first "here" and everything after it comes back as one piece. Take that second half.
Then use replace() to find "string" and change it to "anotherString". With a string pattern, replace() changes only the first occurrence.
Concatenate the part before "here", the word "here" itself, and the modified second half, and that will give you what you are looking for.
var inpStr = "this is a string test with a lot of string words here another string string there string here string.";
// The capturing group keeps everything after the first "here" as one piece.
var parts = inpStr.split(/here(.+)?/);
var firstHalf = parts[0];
// A string pattern makes replace() change only the first "string".
var secondHalf = parts[1].replace("string", "anotherString");
var resStr = firstHalf + "here" + secondHalf;
console.log(resStr);
Hope this helps.
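The same idea can be sketched without any regex, using indexOf and slice. The function name replaceFirstAfter and its parameters are hypothetical. The marker is matched as a plain substring, so this sketch would also find "here" inside "there" if that came first, which is an assumption that happens to hold for the example sentence.

```javascript
// Hypothetical helper: replaces the first `target` found after the first `marker`.
function replaceFirstAfter(text, marker, target, replacement) {
  var markerIndex = text.indexOf(marker);
  if (markerIndex === -1) {
    return text; // marker not found: leave the text unchanged
  }
  var tailStart = markerIndex + marker.length;
  // With a string pattern, replace() changes only the first occurrence
  return text.slice(0, tailStart) + text.slice(tailStart).replace(target, replacement);
}

var sentence = "this is a string test with a lot of string words here another string string there string here string.";
console.log(replaceFirstAfter(sentence, "here", "string", "anotherString"));
```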

What does this JS do?

var passwordArray = pwd.replace(/\s+/g, '').split(/\s*/);
I found the above line of code in a rather poorly documented JavaScript file, and I don't know exactly what it does. I think it splits a string into an array of characters, similar to PHP's str_split. Am I correct, and if so, is there a better way of doing this?
It removes all whitespace from the password and then splits the password into an array of characters.
It is a bit redundant to convert a string into an array of characters, because you can already access the characters of a string through brackets (though not in older IE) or through the string method charAt:
var a = "abcdefg";
alert(a[3]);//"d"
alert(a.charAt(1));//"b"
It does almost the same as pwd.split(/\s*/).
pwd.replace(/\s+/g, '').split(/\s*/) removes all whitespace (tab, space, CR/LF, etc.) and splits the remainder (the string returned by the replace operation) into an array of characters. The \s* in split(/\s*/) is odd and redundant, because there can't be any whitespace (\s) left after the replace.
Hence pwd.split(/\s*/) alone should be sufficient, as long as pwd has no leading or trailing whitespace (split would then produce an empty string at that end). So:
'hello cruel\nworld\t how are you?'.split(/\s*/)
// prints in alert: h,e,l,l,o,c,r,u,e,l,w,o,r,l,d,h,o,w,a,r,e,y,o,u,?
as will
'hello cruel\nworld\t how are you?'.replace(/\s+/g, '').split(/\s*/)
The replace portion removes all whitespace from the password. The \s+ atom matches one or more whitespace characters. The g flag makes it match every run of whitespace, and each one is replaced with an empty string.
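For the "is there a better way" part of the question, one possible sketch in modern JavaScript is to strip the whitespace and then let Array.from split the rest into characters. The helper name toChars is made up for illustration. The second call shows why split(/\s*/) on the raw string is not a full substitute when the input starts or ends with whitespace.

```javascript
// Hypothetical helper: whitespace-free array of characters.
function toChars(pwd) {
  // Array.from iterates the string character by character
  return Array.from(pwd.replace(/\s+/g, ''));
}

console.log(toChars(' a b\tc '));  // [ 'a', 'b', 'c' ]

// Splitting the raw string leaves empty strings at the padded ends
console.log(' ab '.split(/\s*/));  // [ '', 'a', 'b', '' ]
```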

Split string by HTML entities?

My string contains a lot of HTML entities, like this:
&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;
And I want to split it by HTML entities into this :
Hello
everybody
there
Can anybody suggest a way to do this, please? Maybe using a regex?
It looks like you can just split on the regex &[^;]*;. That is, the delimiters are strings that start with &, end with ;, and in between can contain anything but ;.
If you can have multiple delimiters in a row, and you don't want the empty strings between them, just use (&[^;]*;)+ (or in general a (delim)+ pattern).
If you can have delimiters at the beginning or end of the string, and you don't want the empty strings caused by them, then just trim them away before you split.
Example
Here's a snippet to demonstrate the above ideas (see also on ideone.com):
var s = "&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;";
print (s.split(/&[^;]*;/));
// ,Hello,,everybody,,there,
print (s.split(/(?:&[^;]*;)+/));
// ,Hello,everybody,there,
print (
  s.replace(/^(?:&[^;]*;)+/, "")
   .replace(/(?:&[^;]*;)+$/, "")
   .split(/(?:&[^;]*;)+/)
);
// Hello,everybody,there
var a = str.split(/\&[#a-z0-9]+\;/); should do it, although you'll end up with empty slots in the array when you have two entities next to each other.
split(/&.*?;(?=[^&]|$)/)
and cut the last and first result:
["", "Hello", "everybody", "there", ""]
>> "&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;".split(/(?:&[^;]+;)+/)
['', 'Hello', 'everybody', 'there', '']
The regex is: /(?:&[^;]+;)+/
Matches entities as & followed by 1+ non-; characters, followed by a ;. Then matches at least one of those (or more) as the split delimiter. The (?:expression) non-capturing syntax is used so that the delimiters captured don't get put into the result array (split() puts capture groups into the result array if they appear in the pattern).
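Instead of trimming the delimiters before splitting or cutting off the first and last results, the empty strings can also be filtered out afterwards. This is only a sketch: the function name splitByEntities is hypothetical, and the sample input is an assumed reconstruction of the question's string with its entities written out.

```javascript
// Hypothetical helper: splits on runs of HTML entities and drops empty pieces.
function splitByEntities(text) {
  // Non-capturing group, so the delimiters are not added to the result
  return text.split(/(?:&[^;]+;)+/).filter(function (piece) {
    return piece !== '';
  });
}

// Assumed sample input with the entities spelled out
var sample = '&quot;Hello&nbsp;&lt;everybody&gt;&nbsp;there&quot;';
console.log(splitByEntities(sample)); // [ 'Hello', 'everybody', 'there' ]
```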
