How do I turn this text:
• Ban Ki-moon calls for immediate ceasefire• Residents targeted in
al-Qusayr, witnesses tell HRWIsrael ignoring expanding violence by
settlers, EU reports9.18am: Footage from activists suggests that
opposition forces continue to resist government troops.This footage...
into this text:
Ban Ki-moon calls for immediate ceasefire. Residents targeted in
al-Qusayr, witnesses tell HRW. Israel ignoring expanding violence by
settlers, EU reports. 9.18am: Footage from activists suggests that
opposition forces continue to resist government troops. This
footage...
This needs to be done in JavaScript (multiple .replace calls are fine)
"• " has to be removed and replaced by a ". ", however the first "• " should just be removed
If there is no space after a dot ".", a space must be added (.This footage)
If there is no space before a time (9.18am), a space must be added
If there is no space before a capital letter (HRWIsrael) that is
followed by non-capital letters, then a dot and space ". " must be added in front
of that non-capital letter.
Breaking down into several replace statements (as listed below) is the way I would go about it (working fiddle).
The fixBullets function will turn all bullets into HTML Entities and the fixBulletEntities fixes those. I did this to normalize bullets as I'm not sure if they are just bullet characters or HTML entities in your source string.
The fixTimes function changes "9.18am:" into " 9:18am. " (otherwise, the fixPeriods function makes it look like " 9. 18am", which I am sure you do not want).
One major caveat regarding the fixCapitalsEndSentence function: it will also convert strings like "WOrDS" into "W. OrDS", which may not be what you want.
At the least, this should get you started...
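function fixBullets(text) {
    // Normalize literal bullet characters to the HTML entity.
    var bullets = /•/g;
    return text.replace(bullets, '&bull;');
}
function fixBulletEntities(text) {
    var bulletEntities = /&bull;/ig;
    text = text.replace(bulletEntities, '. ');
    // The first bullet should just be removed, not turned into ". ".
    if (text.indexOf('. ') === 0) {
        text = text.substring(2);
    }
    return text;
}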
function fixTimes(text) {
var times = /(\d+)[\.:](\d+[ap]m):?/ig;
return text.replace(times, ' $1:$2. ');
}
function fixPeriods(text) {
var periods = /[.](\w+)/g;
return text.replace(periods, '. $1');
}
function fixCapitalsEndSentence(text) {
    var capitalsEndSentence = /([A-Z]{2,})([a-z]+)/g;
    // capitals: the run of capitals (e.g. "HRWI"); lowercase: the letters after it (e.g. "srael")
    return text.replace(capitalsEndSentence, function(match, capitals, lowercase) {
        var len = capitals.length - 1;
        // Insert ". " before the last capital, which starts the next sentence.
        return capitals.substring(0, len) + '. ' + capitals.substring(len) + lowercase;
    });
}
function fixMultipleSpaces(text) {
var multipleSpaces = /\s+/g;
return text.replace(multipleSpaces, ' ');
}
function fixAll(text) {
text = fixBullets(text);
text = fixBulletEntities(text);
text = fixTimes(text);
text = fixPeriods(text);
text = fixCapitalsEndSentence(text);
text = fixMultipleSpaces(text);
return text;
}
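As a quick check of the capitals step and its caveat, here is a minimal sketch. It uses a simplified pattern that should behave like fixCapitalsEndSentence on these inputs; the name splitFusedSentences is made up for this example and is not part of the code above.

```javascript
// Hypothetical helper: inserts ". " before the last capital in a run of two
// or more capitals that is followed by lowercase letters.
function splitFusedSentences(text) {
    return text.replace(/([A-Z]+)([A-Z][a-z]+)/g, '$1. $2');
}

console.log(splitFusedSentences('witnesses tell HRWIsrael ignoring'));
// witnesses tell HRW. Israel ignoring
console.log(splitFusedSentences('EU reports'));
// EU reports (unchanged: the capitals are not followed by lowercase letters)
console.log(splitFusedSentences('WOrDS'));
// W. OrDS (the caveat: mixed-case tokens are split too)
```

If mixed-case tokens such as product names can appear in the input, this step probably needs a whitelist or a stricter pattern.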
Quick background - I have dyslexia and it can be challenging sometimes looking at certain phrases or numbers to not mix things up, so I've been told to use colour to fix this. Full disclosure, I am not a programmer, but I spoke to one of the developers at work, and they said that they can probably hack something together for me to help out if I can provide them with some base Javascript code to work off.
Is someone able to assist? I have no idea what I'm looking for or what to search for. I found this, but I think it needs to be more complex.
Basically, I want letters to be one colour, symbols to be another colour, numbers a third colour, then my "bad" characters highlighted in something else.
Bad Characters
E / 3 = Red / Orange
L / I = Red / Orange
Other characters
A - Z = Black
1 - 9 = Blue
! - ) = Purple
I hope this makes sense. Feel free to ask me any questions.
Thank you sincerely.
Update: To clarify, there is a box where passwords are generated and I need to transcribe that password into an application that does not accept copy/paste. It is a single phrase/area with no hyperlinks at all.
You do realise that what you want is not straightforward!
What if you have a hyperlink inside a paragraph? What if the HTML is malformed?
You said your developer needs something to start with so the below will hopefully give him a base to work from.
Basically, we grab all the likely candidates for text nodes (headings and paragraphs; you can extend this to other elements).
Then we loop over the candidates and check each character against our special character lists (var all). If it doesn't match, we just append it to the string that will replace the original later.
If it does match one of our special lists, we wrap it in a span with the relevant colour using the wrapSpan function.
Once we have checked the whole string for a candidate, we replace the inner HTML of that candidate.
There is a LOT of work you would have to do to make this functional because of the complexities of HTML, and the code below is very inefficient, but it should give you an idea of where to start.
var candidates = 'h1,h2,h3,h4,h5,h6,p';
var candidateContainers = document.querySelectorAll(candidates);
var red = 'EL';
var ora = '3I';
var blu = '124567890';
var pur = '!';
var all = red + ora + blu + pur;
var char;
candidateContainers.forEach(function(entry) {
var str = entry.innerHTML;
var newStr = "";
for (var i = 0; i < str.length; i++) {
char = str.charAt(i);
if(all.indexOf(char) == -1){
// console.log("do nothing", char);
newStr += char;
}
if(red.indexOf(char) > -1){
newStr += wrapSpan('red', char);
}
if(ora.indexOf(char) > -1){
newStr += wrapSpan('orange', char);
}
if(blu.indexOf(char) > -1){
newStr += wrapSpan('blue', char);
}
if(pur.indexOf(char) > -1){
newStr += wrapSpan('purple', char);
}
}
entry.innerHTML = newStr;
});
function wrapSpan(colour, char){
return '<span style="color: ' + colour + '">' + char + '</span>';
}
//console.log(candidateContainers);
<h1>THIS SENTENCE CONTAINS LETTERS THAT SHOULD BE HIGHLIGHTED SUCH AS E and 3!</h1>
<p>This sentence contains the numbers 1,2,3,4,5,6,7,8,9 and 3 should be a different colour, the exclamation point should also be a different colour!</p>
<p>This is as simple as it gets; as you can see, it fails on this sentence due to an existing hyperlink</p>
Unfortunately there's no CSS selector to target specific chars which would have been useful in this case.
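For the password box described in the update (a single phrase with no markup), one way to sidestep the hyperlink problem is to work from plain text rather than innerHTML. The following is only a sketch under that assumption; colourFor, escapeHtml and colourise are names invented for this example, and the colour mapping follows the lists in the question.

```javascript
// Hypothetical helper: maps a single character to a colour name.
function colourFor(ch) {
    if ('EL'.indexOf(ch) > -1) return 'red';
    if ('3I'.indexOf(ch) > -1) return 'orange';
    if (/[0-9]/.test(ch)) return 'blue';
    if (/[a-zA-Z]/.test(ch)) return 'black';
    return 'purple'; // everything else is treated as a symbol
}

// Escape the characters that would otherwise be parsed as markup.
function escapeHtml(ch) {
    return ch.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Takes plain text (e.g. element.textContent) and returns an HTML string
// with every character wrapped in a coloured span.
function colourise(text) {
    var out = '';
    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        out += '<span style="color: ' + colourFor(ch) + '">' + escapeHtml(ch) + '</span>';
    }
    return out;
}
```

In a page it might be used as box.innerHTML = colourise(box.textContent), where box is whatever element holds the generated password.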
This is an interesting problem for understanding the needs of people with dyslexia.
Some simple JavaScript, like the following code, can do it. All you have to do is enclose each character type in the correct marker.
Working with a configuration object that maps each colour to a regular expression is the way to go. In the code below I used the colour names as object keys, and they double as CSS class names (which not everybody considers appropriate).
function color_format(ch) {
var config={
'red' : /[EL]/,
'orange': /[3l]/,
'black' : /[a-zA-Z]/,
'blue' : /[0-9]/,
// purple is the default color for non matching chars
};
var res = '', cur = '', match; // declared locally instead of leaking globals
for (var i=0;i<ch.length;i++) {
match = "";
Object.keys(config).every(key => {
if (ch[i].match(config[key])) {
match += " " + key;
return false;
}
return true;
});
match=match.trim();
if (match != cur) {
if (cur) {
res += "</span>";
}
if (match) {
res+= "<span class='" + match + "'>";
}
cur=match;
}
res+=ch[i];
}
if (cur) res+='</span>';
return res;
}
document.getElementById("new-password").addEventListener("click",function(e){
// the two following lines are only used to generate a random password for the example
var s = "!(),;:.'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
password=Array.apply(null, Array(10)).map(function() { return s.charAt(Math.floor(Math.random() * s.length)); }).join('');
document.getElementById("password-container").innerHTML=color_format(password);
});
#password-container {
color: purple;
font-family: monospace;
font-size: 3em;
}
#password-container .black {
color: black;
}
#password-container .orange {
color: orange;
}
#password-container .blue {
color: blue;
}
#password-container .red {
color: red;
}
Your new password is : <div id="password-container">
</div>
<button id="new-password">
Generate password
</button>
I have a sentence stored in a variable. I need to split that sentence into 4 parts, which I have put into variables in my code. I can extract the parts and log them to the console, but I am not getting the whole text of each item inside the brackets, only the first word. Here is the code. Can anyone please help me?
HTML
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<ul class="messages"></ul>
SCRIPT
$(document).ready(function() {
regex = /.+\(|\d. \w+/g;
maintext = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
matches = maintext.match(regex);
text_split0 = matches[0].slice(0, -1);
text_split1 = matches[1];
text_split2 = matches[2];
text_split3 = matches[3];
text_split4 = matches[4];
console.log(text_split0);
console.log(text_split1);
console.log(text_split2);
console.log(text_split3);
console.log(text_split4);
$(".messages").append('<li>'+text_split0+'</li><li>'+text_split1+'</li><li>'+text_split2+'</li><li>'+text_split3+'</li><li>'+text_split4+'</li>');
// $("li:contains('undefined')").remove()
});
function buildMessages(text) {
let messages = text.split(/\d\.\s/);
messages.shift();
messages.forEach((v)=>{
let msg = v.replace(/\,/,'').replace(/\sor\s/,'').trim();
$('.messages').append(`<li>${msg}</li>`);
// console.log(`<li>${msg}</li>`);
});
}
let sentenceToParse = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
buildMessages(sentenceToParse);
Use the split function on the String, keying on the numbering (e.g. "1. "), to get the preface and each of the options into an array.
Use the shift function on the Array to remove the unneeded preface.
Use forEach to iterate over the values in the array and clean up the text.
Use replace to first remove the comma, then remove "or" with whitespace on either side.
Use trim to remove leading and trailing whitespace.
At this point, each value is sanitized copy for use in your <li> elements.
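The steps above can also be sketched as a pure function that returns the cleaned options, which makes them easy to check without a page. The name parseOptions is made up for this example, and stripping the closing bracket from the last option is an extra assumption that buildMessages does not make.

```javascript
// Hypothetical variant of buildMessages that returns an array instead of
// appending <li> elements.
function parseOptions(text) {
    var parts = text.split(/\d\.\s/); // split on "1. ", "2. ", ...
    parts.shift();                    // drop the preface before the first option
    return parts.map(function (part) {
        return part
            .replace(/,/, '')         // remove the separating comma
            .replace(/\sor\s/, '')    // remove the " or " before the last option
            .replace(/\)\s*$/, '')    // assumption: also drop the closing bracket
            .trim();
    });
}

var sentence = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
console.log(parseOptions(sentence));
// [ 'new user', 'test user', 'minor Accident', 'Major Accident' ]
```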
If you're only concerned with working through a regex and not re-factoring, the easiest way may be to use an online regex tool where you provide a few different string samples. Look at https://www.regextester.com/
OK, try another approach, because regex isn't the best way to do this. Try this:
$(document).ready(function() {
// First part of sentence.
var mainText = "Welcome to project, are you a here(";
// Users array.
var USERS = ['new user', 'test user', 'minor Accident', 'Major Accident'];
var uSize = USERS.length;
// Construct string & user list dynamically.
for(var i = 0; i < uSize; i++) {
var li = $('<li/>').text(USERS[i]);
if(i === uSize - 1)
mainText += (i+1) + ". " + USERS[i] + ")";
else if(i === uSize - 2)
mainText += (i+1) + ". " + USERS[i] + " or ";
else
mainText += (i+1) + ". " + USERS[i] + " , ";
$(".messages").append(li);
}
console.log(mainText); // You will have your complete sentence.
});
Why is this way better? Simple: you can add or remove users in the USERS array, and both the sentence and the list will be updated automatically. I hope that helps.
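The sentence-building part of this approach can be written without the index checks by numbering the options first and joining them. This is only a sketch; buildSentence is a hypothetical name, and it assumes the array holds at least one option.

```javascript
// Hypothetical helper: rebuilds the sentence from a prefix and a list of options.
// Assumes options contains at least one entry.
function buildSentence(prefix, options) {
    var numbered = options.map(function (option, i) {
        return (i + 1) + '. ' + option;
    });
    var last = numbered.pop();        // the final option is preceded by " or "
    var head = numbered.join(' , ');  // the others are separated by " , "
    return prefix + (head ? head + ' or ' : '') + last + ')';
}

console.log(buildSentence('Welcome to project, are you a here(',
    ['new user', 'test user', 'minor Accident', 'Major Accident']));
// Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)
```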
I'm working on my final project of the Winter 2017 quarter to demonstrate how to use Regular Expressions in both C# and JavaScript code behind pages. I've got the C# version of my demonstration program done, but the JavaScript version is making me pull what little hair I have left on my head out (no small achievement since I got a fresh buzz cut this morning!). The problem involves not getting any output after applying a Regular Expression in a While loop to get each instance of the expression and printing it out.
On my HTML page I have an input textarea, seven radio buttons, an output textarea, and two buttons underneath (one button is to move the output text to the input area to perform multiple iterations of applying expressions, and the other button to clear all textareas for starting from scratch). Each radio button links to a function that applies a regular expression to the text in the input area. Five of my seven functions work; the sixth is the one I can't figure out, and the seventh is essentially the same but with a slightly different RegEx pattern, so if I fix the sixth function, the seventh function will be a snap.
(I tried to insert/upload a JPG of the front end, but the photo upload doesn't seem to be working. Hopefully you get the drift of what I've set up.)
Here are my problem children from my JS code behind:
// RegEx_Demo_JS.js - code behind for RegEx_Demo_JS
var inputString; // Global variable for the input from the input text box.
var pattern; // Global variable for the regular expression.
var result; // Global variable for the result of applying the regular expression to the user input.
// Initializes a new instance of the StringBuilder class
// and appends the given value if supplied
function StringBuilder()
{
var strings = [];
this.append = function (string)
{
string = verify(string);
if (string.length > 0) strings[strings.length] = string;
}
this.appendLine = function (string)
{
string = verify(string);
if (this.isEmpty())
{
if (string.length > 0) strings[strings.length] = string;
else return;
}
else strings[strings.length] = string.length > 0 ? "\r\n" + string : "\r\n";
}
this.clear = function () { strings = []; };
this.isEmpty = function () { return strings.length == 0; };
this.toString = function () { return strings.join(""); };
var verify = function (string)
{
if (!defined(string)) return "";
if (getType(string) != getType(new String())) return String(string);
return string;
}
var defined = function (el)
{
// Changed per Ryan O'Hara's comment:
return el != null && typeof(el) != "undefined";
}
var getType = function (instance)
{
if (!defined(instance.constructor)) throw Error("Unexpected object type");
var type = String(instance.constructor).match(/function\s+(\w+)/);
return defined(type) ? type[1] : "undefined";
}
}
Within the code of the second radio button (which will be the seventh and last function to complete), I tested the StringBuilder with data in a local variable, and it ran successfully and produced output into the output textarea. But I get no output from this next function that invokes a While loop:
function RegEx_Match_TheOnly_AllInstances()
{
inputString = document.getElementById("txtUserInput").value;
pattern = /(\s+the\s+)/ig; // Using the i flag to match either the lowercase or uppercase version. With the g flag, each exec call finds the next occurrence surrounded by whitespace.
//result = pattern.exec(inputString); // Finds the first index location
var arrResult; // Array for the results of the search.
var sb = getStringBuilder(); // Variable to hold iterations of the result and the text
while ((arrResult = pattern.exec(inputString)) !==null)
{
sb.appendLine = "Match: " + arrResult[0] ;
}
document.getElementById("txtRegExOutput").value = sb.toString();
/* Original code from C# version:
// string pattern = @"\s+(?i)the\s+"; // Same as above, but using Option construct for case insensitive search.
string pattern = @"(^|\s+)(?i)the(\W|\s+)";
MatchCollection matches = Regex.Matches(userTextInput, pattern);
StringBuilder outputString = new StringBuilder();
foreach (Match match in matches)
{
string outputRegExs = "Match: " + "\"" + match.Value + "\"" + " at index [" + match.Index + ","
+ (match.Index + match.Length) + "]" + "\n";
outputString.Append(outputRegExs);
}
txtRegExOutput.Text = outputString.ToString();
*/
} // End RegEx_Match_The_AllInstances
I left the commented code in to show what I had used in the C# code behind version to illustrate what I'm trying to accomplish.
The test input/string I used for this function is:
Don’t go there. If you want to be the Man, you have to beat The Man.
That should return two hits. Ideally, I want it to show the word that it found and the index where it found the word, but at this point I'd be happy to just get some output showing every instance it found, and then build on that with the index and possibly the lastIndex.
So, is my problem in my While loop, the way I'm applying the StringBuilder, or a combination of the two? I know the StringBuilder code works, at least when not being used in a loop and using some test data from the site I found that code. And the code for simply finding the first instance of "the" as a standalone or inside another word does work and returns output, but that doesn't use a loop.
I've looked through Stack Overflow and several other JavaScript websites for inspiration, but nothing I've tried so far has worked. I appreciate any help anyone can provide! (If you need me to post any other code, please advise and I'll be happy to oblige.)
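For reference, here is a minimal sketch of the kind of loop described above. Two things stand out in the function as posted: sb.appendLine is assigned a string instead of being called, and getStringBuilder() is not defined in the code shown (the constructor is named StringBuilder). The sketch avoids both by collecting lines in a plain array. The name findTheMatches and the \bthe\b pattern are assumptions of this example, not the original code.

```javascript
// Hypothetical helper: lists every standalone "the" (any case) with its
// start index and the index just past its end.
function findTheMatches(input) {
    var pattern = /\bthe\b/ig; // the g flag lets exec() resume after each match
    var lines = [];
    var match;
    while ((match = pattern.exec(input)) !== null) {
        // match.index is where the match starts; pattern.lastIndex is where it ends.
        lines.push('Match: "' + match[0] + '" at index [' + match.index + ',' + pattern.lastIndex + ']');
    }
    return lines.join('\n');
}

console.log(findTheMatches('be the Man, beat The Man'));
// Match: "the" at index [3,6]
// Match: "The" at index [17,20]
```

With the test sentence from the question this should report two matches, since "there" does not contain a standalone "the".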
This is about a Chrome Extension.
Suppose a user selects any text on a page, then clicks a button to save it. Via window.getSelection() I can get that text without the underlying HTML markup.
I store that text. For demo purposes, let's say the text is:
"John was much more likely to buy if he knew the price beforehand"
The next time the user visits the page, I want to find that text on the page. The issue is, the html for that text is actually:
<b>John was much more likely to buy if he knew the price <span class="italic">beforehand</span></b>
The second issue is that this system needs to work even if the selection is dirty, i.e. it starts/ends mid DOM node.
What I've built is a bit of a fat solution, so I am curious how I can make it more efficient and/or smaller. This is the whole thing:
text.split("").map(function(el, i, arr){
if(specials.includes(el)){
return "\\"+el;
}
return el;
})
.join("(?:\\s*<[^>]+>\\s*)*\\s*");
where text is the saved text and specials is
var specials = [
'/', '.', '*', '+', '?', '|',
'(', ')', '[', ']', '{', '}', '\\'
];
The process is:
Split the text into single characters
For each character, check whether it's a special character and, if so, prepend it with \
Join all characters together with a regex that allows any whitespace or HTML tags in between
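The three steps above can be sketched as a pure function, with the escaping done by one replace call instead of a lookup array. Note that ^ and $ are also regex metacharacters and are not in the specials list; the sketch escapes them as well. The names escapeForRegExp and buildTagTolerantPattern are made up for this example.

```javascript
// Escape regex metacharacters (including ^ and $) with a backslash.
function escapeForRegExp(ch) {
    return ch.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Hypothetical helper: builds a pattern source that tolerates whitespace
// and HTML tags between any two characters of the saved text.
function buildTagTolerantPattern(text) {
    return text.split('')
        .map(escapeForRegExp)
        .join('(?:\\s*<[^>]+>\\s*)*\\s*');
}

var html = '<b>John knew the price <span class="italic">beforehand</span></b>';
var re = new RegExp(buildTagTolerantPattern('the price beforehand'));
console.log(re.test(html)); // true
```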
My question is, can it be done in a better way? I get the "bruteforcing" feeling with this solution and I don't know if it would actually cause lag on larger sites/selection texts.
Plus, it doesn't work for SPAs where text may update a bit after the DOM is ready.
Thank you for any input.
EDIT:
So initially I was using mark.js, which didn't handle this at all, but not 12 hours after I posted this question the maintainer released v8.0.0, which uses NodeList and handles my use case. The feature is "acrossElements", located here.
1. Create a Range object.
2. Set it so that it spans the entire document from start to end.
3. Check whether the string of interest is in its toString().
4. Clone the range twice.
5. Apply binary search by moving the start/end points of the subranges to roughly their midpoint. This can be approximated by finding the first descendant with more than one child node and then splitting the child list.
6. Go to step 3.
This should take roughly n log m steps, where n is the document text length and m is the number of nodes.
Build the entire text representation of the document manually from each node with nodeType of Node.TEXT_NODE, saving the node reference and its text's start/end positions relative to the overall string in an array. Do it just once as DOM is slow, and you might want to search for multiple strings. Otherwise the other answer might be much faster (without actual benchmarks it's a moot point).
Apply HTML whitespace coalescing rules.
Otherwise you'll end up with huge spans of spaces and newline characters.
For example, Range.toString() doesn't strip them, meaning you'd have to convert your string to a RegExp with [\s\n\r]+ instead of spaces and all other special characters like {}()[]|^$*.?+ escaped.
Anyway, it'd be wise to use the converted RegExp on document.body.textContent before proceeding (easy to implement, many examples on the net, thus not included below).
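A minimal sketch of that conversion, assuming the saved selection is plain text: each run of whitespace in the search string becomes \s+ (which already covers \n and \r), and every other regex metacharacter is escaped. The function name toWhitespaceTolerantRegExp is hypothetical.

```javascript
// Hypothetical helper: turns a plain search string into a RegExp that
// matches it regardless of how whitespace runs are laid out in the page text.
function toWhitespaceTolerantRegExp(str) {
    var words = str.trim().split(/\s+/).map(function (word) {
        // Escape regex metacharacters so they match literally.
        return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    });
    return new RegExp(words.join('\\s+'));
}

var re = toWhitespaceTolerantRegExp('knew the price');
console.log(re.test('he knew\n   the  price beforehand')); // true
```

The resulting RegExp could then be tested against document.body.textContent as a cheap pre-check before building the node map.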
A simplified implementation for plain-string search follows.
function TextMap(baseElement) {
this.baseElement = baseElement || document.body;
var textArray = [], textNodes = [], textLen = 0, collapseSpace = true;
var walker = document.createTreeWalker(this.baseElement, NodeFilter.SHOW_TEXT);
while (walker.nextNode()) {
var node = walker.currentNode;
var nodeText = node.textContent;
var parentName = node.parentNode.localName;
if (parentName==='noscript' || parentName==='script' || parentName==='style') {
continue;
}
if (parentName==='textarea' || parentName==='pre') {
nodeText = nodeText.replace(/^(\r\n|[\r\n])/, '');
collapseSpace = false;
} else {
nodeText = nodeText.replace(/^[\s\r\n]+/, collapseSpace ? '' : ' ')
.replace(/[\s\r\n]+$/, ' ');
collapseSpace = nodeText.endsWith(' ');
}
if (nodeText) {
var len = nodeText.length;
textArray.push(nodeText);
textNodes.push({
node: node,
start: textLen,
end: textLen + len - 1,
});
textLen += len;
}
}
this.text = textArray.join('');
this.nodeMap = textNodes;
}
TextMap.prototype.indexOf = function(str) {
var pos = this.text.indexOf(str);
if (pos < 0) {
return [];
}
var index1 = this.bisectLeft(pos);
var index2 = this.bisectRight(pos + str.length - 1, index1);
return this.nodeMap.slice(index1, index2 + 1)
.map(function(info) { return info.node });
}
TextMap.prototype.bisect =
TextMap.prototype.bisectLeft = function(pos) {
var a = 0, b = this.nodeMap.length - 1;
while (a < b - 1) {
var c = (a + b) / 2 |0;
if (this.nodeMap[c].start > pos) {
b = c;
} else {
a = c;
}
}
return this.nodeMap[b].start > pos ? a : b;
}
TextMap.prototype.bisectRight = function(pos, startIndex) {
var a = startIndex |0, b = this.nodeMap.length - 1;
while (a < b - 1) {
var c = (a + b) / 2 |0;
if (this.nodeMap[c].end > pos) {
b = c;
} else {
a = c;
}
}
return this.nodeMap[a].end >= pos ? a : b;
}
Usage:
var textNodes = new TextMap().indexOf('<span class="italic">');
When executed on this question's page:
[text, text, text, text, text, text]
Those are text nodes, so to access corresponding DOM elements use the standard .parentNode:
var textElements = textNodes.map(function(n) { return n.parentNode });
Array[6]
0: span.tag
1: span.pln
2: span.atn
3: span.pun
4: span.atv
5: span.tag
I would like to modify an existing JavaScript function that formats a user name properly by setting the first letter of the first name to upper case, as well as the first letter of the last name.
Some last names are hyphenated, and when that happens they come out like Hugo Bearsotti-potz, when in fact it should be Hugo Bearsotti-Potz.
I would like to ask for help modifying this function so that it applies proper case to hyphenated last names, if possible.
Here is the existing code (pertinent snippet only):
if (input) {
var out = '';
input.split(delimiter || ' ').forEach(function (sector) {
var x = sector.toLowerCase();
out += x.charAt(0).toUpperCase() + x.substr(1) + ' ';
});
if (out.length > 0) {
out = out.substring(0, out.length - 1);
}
return out;
}
Many thanks.
This should satisfy your set of test conditions: http://plnkr.co/edit/9welW6?p=preview
html:
<input type="text" ng-model="foo">
<br>
{{foo | nameCaps}}
js:
app.filter('nameCaps',function(){
return function(input) {
if (!input) return;
return input.toString().replace(/\b([a-z])/g, function(ch) {
return ch.toUpperCase();
});
};
});
although I'm wary about making assumptions about people's names http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
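Outside Angular, the same idea can be sketched as a plain function. This variant differs from the filter above in two ways that are assumptions of this example: it lowercases the input first (as the original function in the question does), and it only capitalizes after the start of the string, whitespace or a hyphen, because \b would also capitalize the letter after an apostrophe.

```javascript
// Hypothetical standalone version of the nameCaps filter.
function nameCaps(input) {
    if (!input) return '';
    return String(input).toLowerCase().replace(/(^|[\s-])([a-z])/g, function (match, separator, letter) {
        // Keep the separator and upper-case the letter that follows it.
        return separator + letter.toUpperCase();
    });
}

console.log(nameCaps('hugo bearsotti-potz')); // Hugo Bearsotti-Potz
console.log(nameCaps('HUGO BEARSOTTI-POTZ')); // Hugo Bearsotti-Potz
```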
You could also create a function for capitalizing the first character after any given delimiter. Not quite as succinct as the regex solution though.
function capitalizeAfter(input, delimiter) {
var output = '',
pieces = input.split(delimiter);
pieces.forEach(function(section, index) {
// capitalize the first character and add the remaining section back
output += section[0].toUpperCase() + section.substr(1);
// add the delimiter back if it isn't the last section
if (index !== pieces.length - 1) {
output += delimiter;
}
});
return output;
}
Then it would be used like so:
if (input) {
return capitalizeAfter(capitalizeAfter(input.toLowerCase(), ' '), '-');
}