regex jquery to replace number with link - javascript

I have a page of database results where users occasionally type in a reference to another post. (The database is a day event tracker for a scheduling office.)
The reference to the other post is always the post's ID (format 001234). We use these to match events with dockets and other paperwork from truck drivers. It is always a 6-digit number on its own.
<div class="eventsWrapper">
Data from DB is output here using PHP in a foreach loop.
Presents data in similar fashion to facebook for example.
</div>
What I need to do, once the data in the above DIV is loaded, is go through it and replace every whole 6-digit number (not part of a longer number) with the number as a hyperlink.
It is important that it only matches numbers with a space on either side:
EG: Ref 001122 <- like this - not like this -> ignore AB001122
Once I have the hyperlink tag I can make the reference number clickable to take users directly to that post.
I am not that good with regex but think it is something like:
\b(![0-9])?\d{6}\b
I have no idea how to search the DIV and then replace that regex with the hyperlink. Appreciate any help.

(?:^| )\d{6}(?= |$)
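You can use this and replace with <space><whatever you want>. See demo.
https://regex101.com/r/bW3aR1/7
\b won't do what you want here: it treats any non-word character, not just a space, as a boundary, and there is no word boundary between a letter and a digit (as in A1).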
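As a rough sketch of how that pattern could be applied in JavaScript: the helper below captures the leading space (or start of string) so it can be put back in front of the link. The function name linkifyIds and the /post/ URL prefix are assumptions for illustration, not part of the original page.

```javascript
// Hypothetical helper: wraps every standalone 6-digit ID in a link.
// baseUrl is an assumed URL scheme; adjust it to the real post URLs.
function linkifyIds(text, baseUrl = '/post/') {
  // (^| ) captures the leading space (or start) so it can be re-emitted;
  // (?= |$) requires a space or end of string without consuming it.
  return text.replace(/(^| )(\d{6})(?= |$)/g, (match, lead, id) =>
    lead + '<a href="' + baseUrl + id + '">' + id + '</a>');
}

console.log(linkifyIds('Ref 001122 <- like this - not like this -> ignore AB001122'));
```

With jQuery, this could probably be applied to the wrapper with $('.eventsWrapper').html(function (i, html) { return linkifyIds(html); }), assuming the existing markup has no bare 6-digit numbers inside tags or attributes.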

Something like this? Make an array of the individual posts and loop through. If there is only ever one ID in a post, you can do without the second loop.
var str = ['Ref 001122 <- like this - not like this -> ignore AB001122', 'Ref 001123 <- like this - not like this -> ignore AB001122', 'Ref 001124 <- like this - not like this -> ignore AB001122'];
var regex = /\b\d{6}\b/g;
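for (var j = 0; j < str.length; j++) {
    // match() returns null when a post contains no ID
    var urls = str[j].match(regex) || [];
    // start from the original post and accumulate the replacements
    // (assumes each ID appears at most once per post)
    var newString = str[j];
    for (var i = 0; i < urls.length; i++) {
        var url = urls[i];
        newString = newString.replace(url, '<a href="' + url + '">' + url + '</a>');
    }
    $('#output').append(newString);
}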
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<div id="output"></div>

Related

How to get all the text separately from the bracket using javascript regular expression

I have a sentence stored in a variable. I need to extract that sentence into 4 parts, depending on the sentence I have put into the variable in my code. I am able to extract the parts and log them to the console, but I am not getting the whole text inside the brackets, only the first word of each part. Here is the code below. Can anyone please help me?
HTML
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<ul class="messages"></ul>
SCRIPT
$(document).ready(function() {
    regex = /.+\(|\d. \w+/g;
    maintext = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
    matches = maintext.match(regex);
    text_split0 = matches[0].slice(0, -1);
    text_split1 = matches[1];
    text_split2 = matches[2];
    text_split3 = matches[3];
    text_split4 = matches[4];
    console.log(text_split0);
    console.log(text_split1);
    console.log(text_split2);
    console.log(text_split3);
    console.log(text_split4);
    $(".messages").append('<li>'+text_split0+'</li><li>'+text_split1+'</li><li>'+text_split2+'</li><li>'+text_split3+'</li><li>'+text_split4+'</li>');
    // $("li:contains('undefined')").remove()
});
function buildMessages(text) {
    let messages = text.split(/\d\.\s/);
    messages.shift();
    messages.forEach((v)=>{
        let msg = v.replace(/\,/,'').replace(/\sor\s/,'').trim();
        $('.messages').append(`<li>${msg}</li>`);
        // console.log(`<li>${msg}</li>`);
    });
}
let sentenceToParse = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
buildMessages(sentenceToParse);
Use the split function on the String, keying on the digits (e.g. 1.); you will get the preface and each of the steps in an array.
Use the shift function on the Array to remove the unneeded preface.
Use forEach to iterate over the values in the array and clean up the text.
Use replace to first remove commas, then remove "or" with spaces on either side.
Use trim to remove leading and trailing whitespace.
At this point, your array will have sanitized copy for use in your <li> elements.
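The steps above can also be sketched as a function that returns the cleaned array instead of appending to the page, which makes the result easy to inspect. The name parseOptions is hypothetical, and the extra step that strips a trailing ")" is an addition of this sketch (the last piece of the split still ends with the closing bracket).

```javascript
// Hypothetical helper: returns the cleaned option texts as an array.
function parseOptions(text) {
  const parts = text.split(/\d\.\s/); // split on "1. ", "2. ", ...
  parts.shift();                      // drop the preface before "1. "
  return parts.map((part) =>
    part
      .replace(/,/, '')       // remove the separating comma
      .replace(/\sor\s/, '')  // remove " or " before the last option
      .trim()                 // remove leading and trailing whitespace
      .replace(/\)$/, ''));   // addition: drop the closing bracket
}

const sentence = "Welcome to project, are you a here(1. new user , 2. test user , 3. minor Accident or 4. Major Accident)";
console.log(parseOptions(sentence));
```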
If you're only concerned with working through a regex and not re-factoring, the easiest way may be to use an online regex tool where you provide a few different string samples. Look at https://www.regextester.com/
OK, try another approach, because regex isn't the best way to do this. Try this:
$(document).ready(function() {
    // First part of sentence.
    var mainText = "Welcome to project, are you a here(";
    // Users array.
    var USERS = ['new user', 'test user', 'minor Accident', 'Major Accident'];
    var uSize = USERS.length;
    // Construct string & user list dynamically.
    for(var i = 0; i < uSize; i++) {
        var li = $('<li/>').text(USERS[i]);
        if(i === uSize - 1)
            mainText += (i+1) + ". " + USERS[i] + ")";
        else if(i === uSize - 2)
            mainText += (i+1) + ". " + USERS[i] + " or ";
        else
            mainText += (i+1) + ". " + USERS[i] + " , ";
        $(".messages").append(li);
    }
    console.log(mainText); // You will have your complete sentence.
});
Why is that way better? Simple: you can add or remove users in the USERS array, and the string and your user list will be updated automatically. I hope that helps.

How to extract a specific text from a string. The hard part is the desired text changes periodically

I have an HTML document which contains this text somewhere in it
function deleteFolder() {
var mailbox = "CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com";
var path = "/Inbox/";
//string of interest: "CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"
I just want to extract this text and store it in a variable in C#. My problem is that string of interest will slightly change each time the page is loaded, something like this:
"CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"
"CN=Jane Doe,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"
etc....
How do I extract that ever changing string, without regular expression?
Is it always a function deleteFolder() which has its first line as var mailbox = "somestring"? And you are interested in somestring?
Based on the requirements you told us, you could just search the string containing the HTML for var mailbox = " and then for the next ", and take all the text between these two occurrences.
var htmlstring= "..."; //
var i1 = htmlstring.IndexOf("var mailbox = \"");
var i2 = i1 >= 0 ? htmlstring.IndexOf("\"", i1+15) : -1;
var result = i2 >= 0 ? htmlstring.Substring(i1+15, i2-(i1+15)): "not found";
VERY, VERY ugly, not maintainable, but without more information, I can't do any better. However Regex would be much nicer!

Parse URL which contain string of two URL

I have a Node app and I'm getting the following URL in a header. I need to parse it and change the 3000 to 4000. How can I do that, given that I'm getting "two" URLs in req.headers.location?
"http://to-d6faorp:51001/oauth/auth?response_type=code&redirect_uri=http%3AF%2Fmo-d6fa3.ao.tzp.corp%3A3000%2Flogin%2Fcallback&client_id=x2.node"
The issue is that I cannot just use replace, since the value can change (it is dynamic: now it's 3000, later it can be any value).
If the part of the URL you always need to change is going to be a parameter of redirect_uri then you just need to find the index of the second %3A that comes after it.
Javascript indexOf has a second parameter which is the 'start position', so you can first do an indexOf the 'redirect_uri=' string, and then pass that position in to your next call to indexOf to look for the first '%3A' and then pass that result into your next call for the %3A that comes just before your '3000'. Once you have the positions of the tokens you are looking for you should be able to build a new string by using substrings... first substring will be up to the end of your second %3A and the second substring will be from the index of the %2F that comes after it.
Basically, you will be building your string by cutting up the string like so:
"http://to-d6faorp:51001/oauth/auth?response_type=code&redirect_uri=http%3AF%2Fmo-d6fa3.ao.tzp.corp%3A"
"%2Flogin%2Fcallback&client_id=x2.node"
... and appending in whatever port number you are trying to put in.
Hope this helps.
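As an alternative to counting %3A tokens by hand, here is a minimal sketch that uses the built-in URL and URLSearchParams classes instead. It assumes the header holds an absolute URL and that redirect_uri decodes to a well-formed URL; the function name setRedirectPort and the example.com-style hosts are made up for illustration.

```javascript
// Hypothetical helper: changes the port inside the redirect_uri parameter.
function setRedirectPort(location, newPort) {
  const outer = new URL(location);
  // searchParams.get() returns the percent-decoded value, or null if absent.
  const redirect = outer.searchParams.get('redirect_uri');
  if (redirect === null) {
    return location; // nothing to change
  }
  const inner = new URL(redirect);
  inner.port = String(newPort);
  // searchParams.set() re-encodes the value when the URL is serialized.
  outer.searchParams.set('redirect_uri', inner.toString());
  return outer.toString();
}

// Assumed example input, not the URL from the question.
const example = 'http://auth.example:51001/oauth/auth?response_type=code' +
  '&redirect_uri=http%3A%2F%2Fapp.example%3A3000%2Flogin%2Fcallback&client_id=x2.node';
console.log(setRedirectPort(example, 4000));
```

This works whatever the current port is, since the inner URL is parsed rather than searched for a literal 3000.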
This code should get you what you want:
var strURL = "http://to-d6faorp:51001/oauth/auth?response_type=code&redirect_uri=http%3AF%2Fmo-d6fa3.ao.tzp.corp%3A3000%2Flogin%2Fcallback&client_id=x2.node";
var strNewURL = strURL.substring(0,strURL.indexOf("%3A", strURL.indexOf("%3A", strURL.indexOf("redirect_uri") + 1) + 1) + 3) + "4000" + strURL.substring(strURL.indexOf("%2F",strURL.indexOf("%3A", strURL.indexOf("%3A", strURL.indexOf("redirect_uri") + 1) + 1) + 3));
Split the return string in its parameters:
var parts = req.headers.location.split("&");
then split the parts into fieldname and variable:
var subparts = [];
for (var i = 0; i < parts.length; i++)
    subparts[i] = parts[i].split("=");
then check which fieldname equals redirect_uri:
var ret = -1;
for (var i = 0; i < subparts.length; i++)
    if (subparts[i][0] == "redirect_uri")
        ret = i;
if (ret == -1) {
    // didn't find redirect_uri; handle this error somehow
}
now you know which subpart contains the redirect_uri.
Because I don't know which rules your redirect_uri follows, I can't tell you how to get the value; that's your task. But the problem is now isolated to subparts[ret][1], the string that contains the redirect_uri value.

deObfuscating in Python using transformed JS function

I needed to convert the following function to python to deobfuscate a text extracted while web scraping:
function obfuscateText(coded, key) {
    // Email obfuscator script 2.1 by Tim Williams, University of Arizona
    // Random encryption key feature by Andrew Moulden, Site Engineering Ltd
    // This code is freeware provided these four comment lines remain intact
    // A wizard to generate this code is at http://www.jottings.com/obfuscator/
    shift = coded.length
    link = ""
    for (i = 0; i < coded.length; i++) {
        if (key.indexOf(coded.charAt(i)) == -1) {
            ltr = coded.charAt(i)
            link += (ltr)
        }
        else {
            ltr = (key.indexOf(coded.charAt(i)) - shift + key.length) % key.length
            link += (key.charAt(ltr))
        }
    }
    document.write("<a href='mailto:" + link + "'>" + link + "</a>")
}
here is my converted python equivalent:
def obfuscateText(coded,key):
    shift = len(coded)
    link = ""
    for i in range(0,len(coded)):
        inkey=key.index(coded[i]) if coded[i] in key else None
        if ( not inkey):
            ltr = coded[i]
            link += ltr
        else:
            ltr = (key.index(coded[i]) - shift + len(key)) % len(key)
            link += key[ltr]
    return link
print obfuscateText("uw#287u##Guw#287Xw8Iwu!#W7L#", "WXYVZabUcdTefgShiRjklQmnoPpqOrstNuvMwxyLz01K23J456I789H.#G!#$F%&E'*+D-/=C?^B_`{A|}~")
actionattraction$comcastWnet
but I am getting slightly incorrect output: instead of actionattraction#comcast.net I get the above. Also, the above code often gives random characters for the same HTML page.
The target html page has a obfuscateText function in JS with the coded and key, I extract the function signature in obsfunc and execute it on the fly:
email=eval(obsfunc)
which stores the email in above variable, but the problem is that it works most of the time but fails certain times , I strongly feel that the problem is with the arguments supplied to the python function , they may need escaping or conversion as it contains special characters? I tried passing raw arguments and different castings like repr() but the problem persisted.
Some examples for actionattraction#comcast.net wrongly computed and correctly computed using the same python function(first line is email):
#ation#ttr#ationVaoma#st!nct
obfuscateText("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~")
}ction}ttr}ction#comc}st.net
obfuscateText("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~")
actionattraction#comcast.net
obfuscateText("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~")
I've rewritten the deobfuscator:
def deobfuscate_text(coded, key):
    offset = (len(key) - len(coded)) % len(key)
    shifted_key = key[offset:] + key[:offset]
    lookup = dict(zip(key, shifted_key))
    return "".join(lookup.get(ch, ch) for ch in coded)
and tested it as
tests = [
    ("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~"),
    ("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~"),
    ("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~"),
    ("ZUhq4uh#e4Om.04O", "ksYSozqUyFOx9uKvQa2P4lEBhMRGC8g6jZXiDwV5eJcAp7rIHL31bnTWmN0dft")
]
for coded,key in tests:
    print(deobfuscate_text(coded, key))
which gives
actionattraction#comcast.net
actionattraction#comcast.net
actionattraction#comcast.net
anybody#home.com
Note that all three key strings contain &amp;; replacing it with & fixes the problem. Presumably at some point the JavaScript was mistakenly HTML-escaped; Python has a module which will decode HTML special characters like so:
# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)
# Python 3.4+ (HTMLParser.unescape was removed in Python 3.9):
import html
unescaped = html.unescape(my_string)
First of all, index doesn't return None; it raises an exception (ValueError) when the character is absent. In your case, W appears instead of a dot because the index returned is 0, so not inkey is true and the code mistakenly believes that the character is not present in the key.
Second, the presence of &amp; suggests that you may indeed have to find and decode HTML entities.
Finally, I'd recommend to rewrite it like
def obfuscateText(code, key):
    len0 = len(code)
    len1 = len(key)
    link = ''
    for ch in code:
        try:
            # index() raises ValueError when ch is not in the key
            ch = key[(key.index(ch) - len0 + len1) % len1]
        except ValueError:
            pass
        link += ch
    return link
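For comparison, here is a sketch of the same lookup in JavaScript that returns the string instead of calling document.write. The name deobfuscateText is hypothetical. The point it illustrates is the explicit comparison against -1, so that a character found at index 0 is still treated as present.

```javascript
// Hypothetical pure-function variant of the page's obfuscateText.
function deobfuscateText(coded, key) {
  const shift = coded.length;
  const n = key.length;
  let link = '';
  for (const ch of coded) {
    const idx = key.indexOf(ch);
    // Compare against -1 explicitly: index 0 is a valid position.
    // The double modulo keeps the result non-negative in JavaScript.
    link += idx === -1 ? ch : key.charAt((((idx - shift) % n) + n) % n);
  }
  return link;
}

console.log(deobfuscateText(
  'ZUhq4uh#e4Om.04O',
  'ksYSozqUyFOx9uKvQa2P4lEBhMRGC8g6jZXiDwV5eJcAp7rIHL31bnTWmN0dft'));
```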

Url parsing in javascript and DOM

I am writing a support chat application where I want text to be parsed for urls. I have found answers for similar questions but nothing for the following.
what i have
function ReplaceUrlToAnchors(text) {
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/|www.)[-A-Z0-9+&##\/%?=~_|!:,.;]*[-A-Z0-9+&##\/%=~_|])/ig;
    return text.replace(exp,"<a href='$1' target='_blank'>$1</a>");
}
That pattern is a modified version of one I found on the internet. It includes www. in the first token, because not all URLs start with protocol://. However, www.google.com is then replaced with
<a href='www.google.com' target='_blank'>www.google.com</a>
which pulls up MySite.com/webchat/www.google.com, and I get a 404.
that is my first problem, my second is...
in my script for generating messages to the log, I am forced to do it a hacky way:
var last = 0;
function UpdateChatWindow(msgArray) {
    var chat = $get("MessageLog");
    for (var i = 0; i < msgArray.length; i++) {
        var element = document.createElement("div");
        var linkified = ReplaceUrlToAnchors(msgArray[i]);
        element.setAttribute("id", last.toString());
        element.innerHTML = linkified;
        chat.appendChild(element);
        last = last + 1;
    }
}
To get the "linkified" string to render as HTML correctly I have to use the non-standard .innerHTML attribute of element. I would prefer a way where I could parse the string as tokens (text tokens and anchor tokens), call either createTextNode or createElement("a"), and stitch them together with DOM.
So question 1 is: how should I go about parsing www.site.com, or even site.com?
And question 2 is: how could I do this using only DOM?
Another thing you could do is this:
function ReplaceUrlToAnchors(text) {
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/|www.)[-A-Z0-9+&##\/%?=~_|!:,.;]*[-A-Z0-9+&##\/%=~_|])/ig;
    return text.replace(exp, function(_, url) {
        // prefix bare "www." matches so the link is not treated as relative
        return '<a href="' +
            (/^www\./i.test(url) ? "http://" + url : url) +
            '" target="_blank">' +
            url +
            '</a>';
    });
}
That is kind-of like your solution, but it does the check for "www" URLs in that callback passed in to ".replace()".
Note that you won't be picking up "stackoverflow.com" or "newegg.com" or anything like that, which I understand may be unavoidable (and even desirable, given the false positives you'd pick up).
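For the second part of the question (avoiding innerHTML), one possible approach is to split the text into text and link tokens first and then build nodes from them. The sketch below is only an illustration: tokenizeUrls, appendTokens and the simplified URL_PATTERN are hypothetical names and not the regex from the question, and appendTokens takes the document as a parameter so the tokenizer can be tried outside a browser.

```javascript
// Simplified pattern for illustration only (an assumption, not the question's regex).
const URL_PATTERN = /\b(?:https?:\/\/|ftp:\/\/|www\.)[^\s<>"']*[^\s<>"'.,;:!?)]/gi;

// Splits text into { type: 'text' | 'link', value, href? } tokens.
function tokenizeUrls(text) {
  const tokens = [];
  let last = 0;
  for (const match of text.matchAll(URL_PATTERN)) {
    if (match.index > last) {
      tokens.push({ type: 'text', value: text.slice(last, match.index) });
    }
    const url = match[0];
    // Bare "www." links get a protocol so they are not treated as relative.
    const href = /^www\./i.test(url) ? 'http://' + url : url;
    tokens.push({ type: 'link', value: url, href: href });
    last = match.index + url.length;
  }
  if (last < text.length) {
    tokens.push({ type: 'text', value: text.slice(last) });
  }
  return tokens;
}

// Builds DOM nodes from the tokens; doc is the browser's document object.
function appendTokens(parent, tokens, doc) {
  for (const token of tokens) {
    if (token.type === 'link') {
      const a = doc.createElement('a');
      a.href = token.href;
      a.target = '_blank';
      a.textContent = token.value;
      parent.appendChild(a);
    } else {
      parent.appendChild(doc.createTextNode(token.value));
    }
  }
}

console.log(tokenizeUrls('see www.google.com now'));
```

In UpdateChatWindow, the innerHTML assignment could then be replaced by something like appendTokens(element, tokenizeUrls(msgArray[i]), document). Because the text goes through createTextNode, any markup typed by a user is shown literally rather than interpreted.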
Here is what I came up with, perhaps someone has something better?
function replaceUrlToAnchors(text) {
    var naked = /(\b(www.)[-A-Z0-9+&##\/%?=~_|!:,.;]*[-A-Z0-9+&##\/%=~_|](.com|.net|.org|.co.uk|.ca|.))/ig;
    text = text.replace(naked, "http://$1");
    var exp = /(\b(https?:\/\/|ftp:\/\/|file:\/\/)([-A-Z0-9+&##\/%?=~_|!:,.;]*[-A-Z0-9+&##\/%=~_|]))/ig;
    return text.replace(exp,"<a href='$1' target='_blank'>$3</a>");
}
The first regex will replace www.google.com with http://www.google.com, which is good enough for what I am doing. However, I will hold off marking this as the answer, because I would also like to make (www.) optional, but when I change it to (www.)? it replaces every word with http://word/.
