jQuery using Regex to find links within text but exclude if the link is in quotes - javascript

I am using jQuery and a regex to search a text string for http or https URLs and convert them to links. I need the code to skip a URL if it is preceded by a quote.
Below is my code:
// Get the content
var str = jQuery(this).html();
// Set the regex string
var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
var replaced_text = str.replace(exp, function(url) {
  // Drop the scheme from the visible link text
  var clean_url = url.replace(/https?:\/\//gi,'');
  return '<a href="' + url + '">' + clean_url + '</a>';
});
jQuery(this).html(replaced_text);
Here is an example of my issue:
Text The School of Computer Science and Informatics. She blogs at http://www.wordpress.com and can be found on Twitter #Abcdef.
The current code successfully finds the text that starts with http or https and converts it to a link, but it also converts the Twitter URL. I need it to ignore the text if it starts with a quote, is within an a tag, and so on.
Any help is much appreciated.

What about adding [^"'] to the exp variable, so that a URL directly preceded by a quote is skipped?
var exp = /(^|[^"'])(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
Snippet:
// Get the content
var str = jQuery("#text2replace").html();
// Set the regex string
var exp = /(^|[^"'])(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
var replaced_text = str.replace(exp, function(match, prefix, url) {
  var clean_url = url.replace(/https?:\/\//gi,'');
  // Put back the character matched by [^"'] so it is not swallowed by the link
  return prefix + '<a href="' + url + '">' + clean_url + '</a>';
});
jQuery("#text2replace").html(replaced_text);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="text2replace">
The School of Computer Science and Informatics. She blogs at http://www.wordpress.com and can be found on Twitter #Abcdef.
</div>

If you really just want to ignore the quotation marks, this could help:
var replaced_text = $("#selector").html().replace(/([^"])(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig, '$1<a href="$2">$2</a>');
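Checking for a preceding quote protects attribute values, but a URL that is the visible text of an existing link is preceded by > and would still be wrapped a second time. A rough sketch of one way around that is to split the markup on existing anchors and only linkify the pieces in between. The function name linkifyOutsideAnchors and the Twitter URL in the sample are assumptions made for illustration, and regex-based HTML handling like this is only suitable for simple, well-formed markup:

```javascript
// Hypothetical helper: linkify bare URLs, leaving existing <a>...</a> elements untouched.
function linkifyOutsideAnchors(html) {
  var urlExp = /(^|[^"'])(\b(?:https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  // split() with a capture group keeps the matched anchors at the odd indexes
  var chunks = html.split(/(<a\b[^>]*>[\s\S]*?<\/a>)/i);
  return chunks.map(function (chunk, i) {
    if (i % 2 === 1) {
      return chunk; // an existing anchor: leave as-is
    }
    return chunk.replace(urlExp, function (match, prefix, url) {
      var label = url.replace(/^https?:\/\//i, '');
      return prefix + '<a href="' + url + '">' + label + '</a>';
    });
  }).join('');
}

// Sample input; the Twitter href is assumed, not taken from the question
var sample = 'She blogs at http://www.wordpress.com and is on Twitter ' +
  '<a href="https://twitter.com/Abcdef">https://twitter.com/Abcdef</a>.';
console.log(linkifyOutsideAnchors(sample));
```

With jQuery, the result could then be written back with .html() in the same way as in the question.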

This works for me.
It recognizes URLs and converts them to hyperlinks, but ignores URLs wrapped in double quotes (for example inside an href attribute).
See the code below.
Example HTML:
<ul class="js-replaceUrls">
<li>
www.link-only-www.com
</li>
<li>
http://link-starts-with-HTTP.com
</li>
<li>
https://www.link-starts-with-https-and-www.com
</li>
<a href="https://link-starts-with-https.com">
Link in anchor tag
</a>
</ul>
RegEX:
/(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:#/?]*)?)(\s+|$)/gmi
jQuery:
// RECOGNIZE URLS AND CONVERT THEM TO HYPERLINKS
// Ignore if hyperlink is found in HTML attr, like "href"
$('.js-replaceUrls').each(function(){
// GET THE CONTENT
var str = $(this).html();
// SET THE REGEX STRING
var regex = /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:#/?]*)?)(\s+|$)/gmi;
// REPLACE PLAIN TEXT LINKS BY HYPERLINKS
var replaced_text = str.replace(regex, "<a href='$1' class='js-link'>$1</a>");
// ECHO LINK
$(this).html(replaced_text);
});
// DEFINE URLS WITHOUT "http" OR "https"
var linkHasNoHttp = $(".js-link:not([href*=http],[href*=https])");
// ADD "http://" TO "href"
$(linkHasNoHttp).each(function() {
var linkHref = $(this).attr("href");
$(this).attr("href" , "http://" + linkHref);
});
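Note that [href*=http] matches "http" anywhere in the attribute value, so a scheme-less address that merely contains those letters would be skipped. A stricter check on the start of the string could look like the sketch below; ensureScheme is a hypothetical helper name, not part of jQuery, and the third sample address is made up:

```javascript
// Hypothetical helper: prefix "http://" only when the href has no scheme yet.
function ensureScheme(href) {
  // A scheme here means letters, digits, "+", "." or "-" followed by "://" at the very start
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(href);
  return hasScheme ? href : 'http://' + href;
}

console.log(ensureScheme('www.link-only-www.com'));            // "http://www.link-only-www.com"
console.log(ensureScheme('https://www.link-starts-with-https-and-www.com')); // unchanged
console.log(ensureScheme('www.httpstatus.example'));           // "http://www.httpstatus.example"
```

Inside the .each() loop above, it could be applied as $(this).attr("href", ensureScheme(linkHref)) for every .js-link element.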

Related

Result of element is changing during getting it by script

I tried to get data with JavaScript:
<a id="link_Page" href="/product/23">The Text&nbsp;</a>
var link = document.getElementById('link_Page')
var text=link.innerHTML;
var href=link.href;
I expect to see:
"/product/23" and "The Text "
But the result is:
"http://localhost:60790/product/23" and "The Text&nbsp;"
Note: I tested on jsfiddle and the text result (not the link) was fine. I can't understand why it gives me '&nbsp;'.
https://jsfiddle.net/mahma/ocwnufqb/
.href returns the full URL of the linked resource. To get the exact value of the href attribute, use Element.getAttribute():
var link = document.getElementById('link_Page')
var text=link.innerHTML;
var href=link.getAttribute('href');
console.log(text);
console.log(href);
<a id="link_Page" href="/product/23">The Text&nbsp;</a>
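The resolution that .href performs can be reproduced outside the DOM with the standard URL constructor, which makes the difference from getAttribute() easier to see. This is only an illustration; the base URL is the one from the question's output:

```javascript
// Resolve the attribute value against the page's base URL, as .href does
var resolved = new URL('/product/23', 'http://localhost:60790/');

console.log(resolved.href);     // "http://localhost:60790/product/23"
console.log(resolved.pathname); // "/product/23"
```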
&nbsp; is the non-breaking space entity in HTML. You have one at the end of the a tag's text.
Here is the way you can do what you want.
var link = document.getElementById('link_Page')
var text = link.innerText;
var href = link.getAttribute('href');
console.log(text, href);
<a id="link_Page" href="/product/23">The Text&nbsp;</a>
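If the trailing space itself is unwanted, it can be removed from either form of the string. The sketch below assumes the two values a browser would typically hand back (the entity from innerHTML, the U+00A0 character from innerText); cleanLabel is a hypothetical helper name:

```javascript
// Hypothetical helper: normalise a label that may end in a non-breaking space.
function cleanLabel(raw) {
  // Replace the literal entity first, then let trim() drop whitespace,
  // which in JavaScript includes the U+00A0 non-breaking space character
  return raw.replace(/&nbsp;/g, ' ').trim();
}

console.log(cleanLabel('The Text&nbsp;')); // "The Text"
console.log(cleanLabel('The Text\u00a0')); // "The Text"
```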

How to split a string and then embed the second string (link) into the first string?

I'm fairly new to JavaScript and I have this RSS Feed I'm working with currently.
When I retrieve an item from the RSS feed, the following is displayed
Google Home Page http://www.google.com
How can I split this string, so that I can embed the second part of it (http://www.google.com) into the first part(Google Home Page)?
First, extract the link using the following regex pattern (it searches for a substring that starts with http://).
/http:\/\/.*[^\W+]/g
The matches (an array) are stored in url, so now we can create the anchor element. The value of href is element 0 of the matches array.
The link text is produced by replacing the URL in retrievedResult with an empty string. trim() is optional; I've used it just to remove the remaining space.
retrievedResult.replace(url[0], "").trim()
Finally you can append the built anchor element.
var retrievedResult = "Google Home Page http://www.google.com";
var re = /http:\/\/.*[^\W+]/g;
var url = retrievedResult.match(re);
var anchor = '<a href="' + url[0] + '">' + retrievedResult.replace(url[0], "").trim() + '</a>';
$('body').append(anchor);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
Okay, so this will be the string:
var string = "Google Home Page http://www.google.com";
Then we split it:
var split = string.split('http'); // ['Google Home Page ', '://www.google.com']
Then we create an a element:
var a = document.createElement('a');
Then we add the link as the href attribute of your anchor element:
a.href = 'http' + split[1];
And then we add the text as textContent of your anchor element:
a.textContent = split[0];
And finally we add the element to the body:
document.body.appendChild(a);
Here is an example:
var string = "Google Home Page http://www.google.com";
var split = string.split('http');
var a = document.createElement('a');
a.href = 'http' + split[1];
a.textContent = split[0];
document.body.appendChild(a);
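One caveat with split('http') is that it produces more than two pieces if "http" appears more than once in the item. A possible alternative, sketched here under the assumption that the URL is always the last whitespace-free token in the string, is a single regex with two capture groups; splitLabelAndUrl is a hypothetical name:

```javascript
// Hypothetical helper: separate "label text" from a trailing http(s) URL.
function splitLabelAndUrl(item) {
  // Group 1: the label (lazy, so trailing spaces are left out)
  // Group 2: the URL, which must run to the end of the string
  var match = /^(.*?)\s*(https?:\/\/\S+)\s*$/.exec(item);
  if (!match) {
    return null; // no URL found at the end of the string
  }
  return { text: match[1], url: match[2] };
}

var parts = splitLabelAndUrl('Google Home Page http://www.google.com');
console.log(parts.text); // "Google Home Page"
console.log(parts.url);  // "http://www.google.com"
```

The two fields can then be assigned to a.textContent and a.href exactly as in the example above.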
You can use jQuery to get your result.
Working Example:
//This is HTML part
<div id="linkcontainer"></div>
<input id="str" value='Google Home Page http://www.google.com'>
<a id="createlink">CreateLink</a>
//This is js part
$('#createlink').click(function(){
createLink();
});
//function that makes link
function createLink(){
var str = $('#str').val();
var http = str.indexOf('http');
var url = str.substring(http);
var text = str.substring(0,http);
$('#linkcontainer').html('<a href="'+url+'">'+text+'</a>');
}

How to strip specific tag into div in Javascript?

I have this html code
<div class="myDiv">
My link
<p>This is a paragraph</p>
<script>//This is a script</script>
</div>
And I have this JavaScript:
$('.myDiv').children().each(
function() {
var strToStrip = $('.myDiv').html();
if ( this.tagName != 'A' ) {
// Strip tag element if tagName is not 'A'
// and replace < or > with &lt; or &gt;
strToStrip.replace(/(<([^>]+)>)(?!(a))/ig, "");
}
}
);
How can I strip all tags except the a element?
I only need to keep the link and strip every tag that is not a link tag.
I can't find what is wrong with this code or which regex I should use.
Any help, please?
Try this regex example:
var strToStrip = $('.myDiv').html();
var temp = strToStrip.replace(/<[^a\/][a-z]*>/g, "<");
var result = temp.replace(/<\/[^a][a-z]*>/g, ">");
alert(result);
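If the goal is to remove the other tags entirely rather than replace them, a negative lookahead is one option. The following is a sketch only: stripTagsExceptAnchor is a hypothetical name, the href in the sample is assumed, and a regex like this is only reasonable for simple markup such as the snippet in the question. Note that it removes the tags but keeps their contents, so the text inside the script element stays in the result:

```javascript
// Hypothetical helper: remove every tag except <a ...> and </a>.
function stripTagsExceptAnchor(html) {
  // (?!\/?a\b) rejects "<a" and "</a" but still allows tags such as <abbr>
  return html.replace(/<(?!\/?a\b)[^>]*>/gi, '');
}

var markup = '<a href="#">My link</a><p>This is a paragraph</p><script>//This is a script</' + 'script>';
console.log(stripTagsExceptAnchor(markup));
```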
My goal with this question is to figure out how Twitter handles hashtags and user mentions using # or @.
You can use the string replace method with a regular expression:
var html = $("#main").html();
var result = html.replace(/[\<\>\/]/g,'');
alert(result);

Escape characters in String in a HTML page?

I have a string in the below non-escaped format in a HTML page:
<a href="http://somesite/product?page=blahk&id=EA393216&tabs=7,0&selections=quarter:Q2+2013^&wicket:pageMapName=wicket-2\">SomeThing</a>
What I need is to use jQuery/JavaScript to replace that string with just the link "SomeThing".
I have looked at some examples in StackOverflow, but they don't seem to work. I'm just getting started with jQuery and JavaScript, so would appreciate any help here.
Any ideas?
Try html() and text() in jquery to decode:
var str = '<a href="http://somesite/product?page=blahk&id=EA393216&tabs=7,0&selections=quarter:Q2+2013^&wicket:pageMapName=wicket-2\">SomeThing</a>';
var decoded = $('<div />').html(str).text();
alert($(decoded).text());
var str = '<a href="http://somesite/product?page=blahk&id=EA393216&tabs=7,0&selections=quarter:Q2+2013^&wicket:pageMapName=wicket-2\">SomeThing</a>';
var helper = document.createElement('p');
// evaluate as HTML once, text now "<a href..."
helper.innerHTML = str;
// evaluate as HTML again, helper now has child element a
helper.innerHTML = helper.innerText;
// get text content only ("SomeThing")
alert(helper.innerText);
Here is a possible starting point.
Hope this gets you started!
function parseString(){
var str = '<a href="http://somesite/product?page=blahk&id=EA393216&tabs=7,0&selections=quarter:Q2+2013^&wicket:pageMapName=wicket-2\">SomeThing</a>';
var begin = str.indexOf('\">',0)+2; //--determine where the opening anchor tag ends
var end = str.indexOf('</a>',0); //--determine where the closing anchor tag begins
var parsedString = str.substring(begin,end); //--grab whats in between;
/*//--or all inline
var parsedString = str.substring(str.indexOf('\">',0)+2,str.indexOf('</a>',0));
*/
console.log(parsedString);
}
parseString();
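The same indexOf/substring idea can also be written as a single regular expression. This is a sketch that assumes the string holds exactly one simple anchor with no nested tags; anchorText is a hypothetical name:

```javascript
// Hypothetical helper: return the text between <a ...> and </a>, or null if absent.
function anchorText(str) {
  var match = /<a\b[^>]*>(.*?)<\/a>/i.exec(str);
  return match ? match[1] : null;
}

var str = '<a href="http://somesite/product?page=blahk&id=EA393216&tabs=7,0&selections=quarter:Q2+2013^&wicket:pageMapName=wicket-2\">SomeThing</a>';
console.log(anchorText(str)); // "SomeThing"
```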

Extracting the source code of a facebook page with JavaScript

If I write code in the JavaScript console of Chrome, I can retrieve the whole HTML source code by entering:
var a = document.body.innerHTML; alert(a);
For fb_dtsg on Facebook, I can easily extract it by writing:
var fb_dtsg = document.getElementsByName('fb_dtsg')[0].value;
Now I am trying to extract the code "h=AfJSxEzzdTSrz-pS" from the Facebook page. The h value is needed for Facebook reporting.
How can I get the h value? I don't know it in advance, and it is different for every user you communicate with. Without the correct h value you cannot report. All I know is that the value has the form AfXXXXXXXXXXX (11 characters after 'Af').
Do you have any ideas for getting the value, or for a function that generates it on a Facebook page?
The Facebook source snippet is below. You can view the source of a Facebook profile and search for h=Af to find the value:
<code class="hidden_elem" id="ukftg4w44">
<!-- <div class="mtm mlm">
...
....
<span class="itemLabel fsm">Unfriend...</span></a></li>
<li class="uiMenuItem" data-label="Report/Block...">
<a class="itemAnchor" role="menuitem" tabindex="-1" href="/ajax/report/social.php?content_type=0&cid=1352686914&rid=1352686914&ref=http%3A%2F%2Fwww.facebook.com%2F%3Fq&h=AfjSxEzzdTSrz-pS&from_gear=timeline" rel="dialog">
<span class="itemLabel fsm">Report/Block...</span></a></li></ul></div>
...
....
</div> -->
</code>
Please guide me. How can I extract the value exactly?
I tried the following code, but the comment block prevents me from extracting the value. How can I extract a value that is inside a comment block?
var a = document.getElementsByClassName('hidden_elem')[3].innerHTML;alert(a);
Here's my first attempt, assuming you aren't afraid of a little jQuery:
// http://stackoverflow.com/a/5158301/74757
function getParameterByName(name, path) {
var match = RegExp('[?&]' + name + '=([^&]*)').exec(path);
return match && decodeURIComponent(match[1].replace(/\+/g, ' '));
}
var html = $('.hidden_elem')[0].innerHTML.replace('<!--', '').replace('-->', '');
var href = $(html).find('.itemAnchor').attr('href');
var fbId = getParameterByName('h', href); // fbId = AfjSxEzzdTSrz-pS
EDIT: A way without jQuery:
// http://stackoverflow.com/a/5158301/74757
function getParameterByName(name, path) {
var match = RegExp('[?&]' + name + '=([^&]*)').exec(path);
return match && decodeURIComponent(match[1].replace(/\+/g, ' '));
}
var hiddenElHtml = document.getElementsByClassName('hidden_elem')[0]
.innerHTML.replace('<!--', '').replace('-->', '');
var divObj = document.createElement('div');
divObj.innerHTML = hiddenElHtml;
var itemAnchor = divObj.getElementsByClassName('itemAnchor')[0];
var href = itemAnchor.getAttribute('href');
var fbId = getParameterByName('h', href);
I'd really like to offer a different solution for "uncommenting" the HTML, but I stink at regex :)
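For what it's worth, a regex version of the "uncommenting" step might look like the sketch below. Unlike two string replace() calls, which only touch the first <!-- and the first -->, the g flag handles every comment in the input. The uncomment name and the shortened sample markup are assumptions made for illustration:

```javascript
// Hypothetical helper: unwrap every HTML comment, keeping its contents.
function uncomment(html) {
  // [\s\S]*? matches across newlines and stops at the nearest "-->"
  return html.replace(/<!--([\s\S]*?)-->/g, '$1');
}

// Shortened sample based on the snippet in the question
var sample = '<!-- <a class="itemAnchor" href="/ajax/report/social.php?content_type=0' +
  '&h=AfjSxEzzdTSrz-pS&from_gear=timeline">Report/Block...</a> -->';
var inner = uncomment(sample);
var h = /[?&]h=([^&"]*)/.exec(inner);
console.log(h && h[1]); // "AfjSxEzzdTSrz-pS"
```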
