Accurately count number of links in PHP and Javascript - javascript

I have a form that I am validating with JS on the front-end and PHP on the server side. What I need is a way to reliably count the number of links in an HTML string. The best way that I could think of was to count the closing tags. However simply searching for this tag will not work because the user could circumvent the validation by adding spaces like so: </a >.
I am fairly new to regex and this is the pattern that I have been able to come up with so far:
<[ \n\t]*\/[ \n\t]*a[ \n\t]*>
In Javascript:
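function link_count(s){
    // match() takes only the pattern and returns null when nothing matches
    var matches = s.match(/<[ \n\t]*\/[ \n\t]*a[ \n\t]*>/gi); // i: also count </A>
    return matches ? matches.length : 0;
}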
In PHP:
function count_links($str){
    // A leading "<" and trailing ">" would be read as the pattern delimiters, so use "~" instead
    return preg_match_all('~<[ \n\t]*/[ \n\t]*a[ \n\t]*>~i', $str, $matches);
}
Is this the best approach? Will it affect the performance of my form (the html string could be very long)? I am looking for the most efficient and reliable solution.
Thanks in advance.

So, like @sgroves said, not every </a> closes a link. Checking for href might be more useful.
Also, why not check the opening tag directly?
I tried searching for <a .... href>
You might use the 's' modifier so that . also matches newlines...
/<\s*\ba\b.*?href/gs
http://regex101.com/r/bG8lN1/3
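As a minimal sketch of the "count opening tags that carry an href" idea, assuming a hypothetical helper named countLinks and a slightly stricter pattern than the one above (it refuses to cross a > while looking for href):

```javascript
// Hypothetical helper: count opening <a> tags that have an href attribute.
// [^>]* keeps the search inside a single tag, so <a name="x"> is not counted.
function countLinks(html) {
  const pattern = /<\s*a\b[^>]*\bhref\s*=/gi;
  const matches = html.match(pattern); // null when there is no match
  return matches ? matches.length : 0;
}

console.log(countLinks('<a href="/one">1</a> <a name="two">2</a> < A\nhref=three>3</a >'));
```

A regex like this is only a heuristic for validation. On the PHP side, an HTML parser would be the more reliable way to count anchors.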

Related

Split PHP and HTML tags and merge it together in JavaScript

I am working on an extension in JavaScript for a text editor. Right now I am working on the code formatter, but I have run into one thing I can't solve. If you use PHP and HTML together like this:
<?php
Some php
Some php
Some php
?>
<body>
<head></head>
<anotherTag></anotherTag>
</body>
I have that code as a string in JavaScript. I need to separate the PHP from the HTML, format each part separately, and then merge them again to return the result to the editor. Any ideas, for example with regex or something? :) Thanks.
You can do it in many ways. The general solution is to search your string for the opening/closing tags, slice out that part of the string, and put it in a variable or array that you can work with. Then simply merge the strings again.
This is a list of JavaScript methods with examples that may help you through this:
http://www.w3schools.com/js/js_string_methods.asp
This way you can catch the HTML and PHP code, but it will only catch it correctly if there only is one PHP statement:
~
(.*?) #catches HTML in the start
(?:<[?]php) #looks for opening php tag but doesn't capture it
(.*?) #catches all php code
(?:[?]>) #looks for php ending tag but doesn't capture it
(.*) #catches HTML afterwards
~xs
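The pattern above is written in PCRE extended syntax, which JavaScript does not support. A rough JavaScript sketch that also copes with several PHP blocks could look like this; splitPhpHtml and mergeSegments are hypothetical names, and the code assumes that ?> never appears inside a PHP string or comment:

```javascript
// Split a mixed PHP/HTML string into ordered segments.
// Assumption: "?>" does not occur inside PHP strings or comments.
function splitPhpHtml(source) {
  const segments = [];
  // A PHP block runs from "<?php" to the next "?>" (or to the end of input).
  const phpBlock = /<\?php[\s\S]*?(?:\?>|$)/g;
  let last = 0;
  for (const m of source.matchAll(phpBlock)) {
    if (m.index > last) {
      segments.push({ type: 'html', text: source.slice(last, m.index) });
    }
    segments.push({ type: 'php', text: m[0] });
    last = m.index + m[0].length;
  }
  if (last < source.length) {
    segments.push({ type: 'html', text: source.slice(last) });
  }
  return segments;
}

// Put the (possibly reformatted) segments back together in order.
function mergeSegments(segments) {
  return segments.map((s) => s.text).join('');
}
```

Each segment can be passed to the matching formatter based on its type before merging.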

How do I allow <img> and <a> tags for innerHTML, but no others? (Making a forum)

I am currently programming a forum using only javascript (No JQuery please). I am doing very well, however, there is one issue I would love help with.
Currently I am getting the post from a database, assigning it to variable MainPost, and then attaching it to a div via a text node:
var theDiv = document.getElementById("MainBody");
var content = document.createTextNode(MainPost);
theDiv.appendChild(content);
This is working quite well, however, I would LOVE to be able to do this:
document.getElementById("MainBody").innerHTML += MainPost;
But I know this would allow people to use ANY html tag they want, even something like "script" followed by javascript code. This would be bad for business, obviously, but I do like the idea of allowing posters to use the "img" tag as well as the "a href" tags. Is there a way to somehow disable all tags except these two for the innerHTML?
Thank you all so much for any help you can offer.
Ok, the first thought that came to my mind when I read this question was to find a regular expression to exclude a specific string in a word. Simple search gave a lot of results from SO.
Starting point - To remove all the HTML tags from a string (from this answer):
var regex = /(<([^>]+)>)/ig
, body = "<p>test</p>"
, result = body.replace(regex, "");
console.log(result);
To exclude a string you would do something like this (again from all the source mentioned above):
(?!StringToBeExcluded)
Since you want to keep the <a href and <img tags, a suitable regex in your case could be:
(<(?![\/]?a)(?![\/]?img)([^>]+)>)
Explanation:
Think of it as three parts in succession:
(?![\/]?a) : Negative lookahead asserting that what follows < is not "a", optionally preceded by a forward slash (this takes care of the a href tags). Note that it also spares every other tag whose name starts with "a", such as <abbr> or <audio>; adding \b after the a would restrict it to <a> itself.
(?![\/]?img) : Same as 1, except that it looks for the string "img". I don't know why I allowed the </img> tag. Yes, <img> doesn't have a closing tag. You could remove the [\/]? bit from it to fix this.
([^>]+) : Matches one or more characters that are not >, i.e. the rest of the tag up to its closing >.
All three parts lie between < and >. You might want to try a regex demo that I've created incorporating them to take care of ignoring all HTML elements except the image and link tags.
Sidenote - I haven't thoroughly given this regex a try. Feel free to play around with it and tweak it according to your needs. In any case, I hope this gets you started in the right direction.
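For illustration, here is a small sketch of that lookahead approach applied with replace. The function name stripDisallowedTags is hypothetical, and the pattern adds a \b so that only <a> and <img> are spared:

```javascript
// Hypothetical helper: remove every tag except <a ...>, </a> and <img ...>.
// Caveat: attributes are left untouched, so onclick="..." or href="javascript:..."
// on an allowed tag would still get through. This is not a complete sanitizer.
function stripDisallowedTags(html) {
  const disallowed = /<(?!\/?(?:a|img)\b)[^>]+>/gi;
  return html.replace(disallowed, '');
}

console.log(stripDisallowedTags('<p>Hi <a href="/x">link</a><script>alert(1)</script><img src="a.png"></p>'));
```

Note that the text between removed tags (such as the body of a script element) stays in the output as plain text; only the tags themselves are removed.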

Javascript/jQuery Dynamic text to link/href replace

I need to find the following words across my entire site and turn them into a link: Square-Technology UK. I've done some research into replacing text displayed on a site with a URL.
Here is the code:
http://jsfiddle.net/Hgtrh/1/
However, it doesn't do the replacement on my website for some reason. Here is the HTML I am using.
<div class="main_testimonials">
<div class="c_box"></div>
<div class="main_content_img">
<img src="images/news/thumbs/1184901_10151885560986667_1371257993_n_t2.jpg" alt="News" class="news-category"></div>
<div class="main_bubble_box">
Thank you Square-Technology UK for my new system!!
</div>
</div>
This is an example of different JavaScript that works; note the ' and "
echo "<script type='text/javascript'>
$(document).ready(function() {
$('body').removeClass('no-js'); $('#my-carousel-3').carousel({ itemsPerPage: 3, itemsPerTransition: 3, easing: 'swing', noOfRows: 1 }); });</script>\n";
Right managed to solve this very easily. My script didn't allow me to use double quotes inside the echo tag in PHP, which is quite obvious. Alternatively using single quotes does not work using the script I posted at the beginning. However the way to do it is just create another file.js, place the code inside it, and then attach it using the following:
echo"<script type='text/javascript' src='js/test_replace.js'></script>\n";
Did you try this?
$(document).ready(function() {
var thePage = $("body");
thePage.html(thePage.html().replace(/Square-Technology UK/ig, 'Square-Technology UK'));
})
Hope this helps..
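A minimal sketch of the string-level replacement, written as a plain function so it can be tried outside the page. The name linkifyPhrase and the URL https://example.com/ are placeholders, since the real target URL is not given here:

```javascript
// Hypothetical helper: wrap every occurrence of a phrase in an anchor tag.
function linkifyPhrase(html, phrase, url) {
  // Escape regex metacharacters so the phrase is matched literally.
  const escaped = phrase.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(escaped, 'gi');
  // Keep the original casing of each match inside the link text.
  return html.replace(pattern, (match) => `<a href="${url}">${match}</a>`);
}

console.log(linkifyPhrase(
  'Thank you Square-Technology UK for my new system!!',
  'Square-Technology UK',
  'https://example.com/' // placeholder URL
));
```

Running this over the whole body HTML also rewrites matches inside attributes and re-creates every element, which is one reason to target a specific container instead.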
Try using a more specific identifier to track what you want to replace instead of tracing the entire DOM to search for what you want to replace:
JS:
$(function() {
var $siteLink = $('.site-link'),
linkHtml = 'Square-Technology UK';
$siteLink.html(linkHtml);
});
HTML:
<span class="site-link"></span>
However, since your fiddle seems to work we can only guess what is happening. Can you provide more info about which jQuery version you are running, or how the page is laid out?
Here is a Fiddle: http://jsfiddle.net/Hgtrh/4/
It is also worth mentioning just like ikaros45 said, that this is normally not something you would want to do with Javascript, this seems more like something that the templates should be able to deal with.
It works in your fiddle example, but not on your site. I suggest confirming that the jQuery library is loading on your site as expected.

Why replace backslash code doesn't work in JS?

My code in JSP file looks like this :
<s:form namespace="/user" action="list" method="POST" id="filterListForm" theme="simple"
onsubmit="document.getElementById('filterSearchText').value=document.getElementById('filterSearchText').value.replace(/\\/g,'')">
It won't replace the backslash char. I've tried the following, none of them work :
replace('/\\/g','')
replace(/\\\\/g,'')
replace(\/\\\/g,'')
But if I change it to the following, it works :
<s:form namespace="/user" action="list" method="POST" id="filterListForm" theme="simple"
onsubmit="replaceBackslash()">
<script type="text/javascript">
function replaceBackslash() { document.getElementById('filterSearchText').value=document.getElementById('filterSearchText').value.replace(/\\/g,''); }
</script>
Why ? Is there a way to make it work in the first case ?
You want:
var replaced = original.replace(/\\/g, '');
In a regular expression literal, all you need to do is double the backslash to quote it.
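To illustrate how each extra layer of escaping doubles the backslashes, here is a small comparison of a regex literal with the same pattern built from a string. The sample value is made up for the example:

```javascript
// The JS string literal 'C:\\temp\\file' holds the text C:\temp\file.
const original = 'C:\\temp\\file';

// Regex literal: one layer of escaping, so two backslashes match one.
const viaLiteral = original.replace(/\\/g, '');

// RegExp built from a string: string escaping plus regex escaping,
// so four backslashes are needed to match one.
const viaConstructor = original.replace(new RegExp('\\\\', 'g'), '');

console.log(viaLiteral, viaConstructor);
```

If a template or tag library processes the attribute value first, it may act as one more such layer, which is why the inline version can behave differently from the same code in a script block.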
As to why it doesn't work when you try passing the code in via a JSP tag, well that would probably be JSP mangling the string for you. It might work to do this:
<s:form ... onsubmit=' ... .replace(/\\\\/g, "") ... ' >
but I don't have a good way to try that at the moment.
edit — actually I'm finding this challenging. It probably depends on what your tag library does. My framework (Stripes) likes to HTML-escape attribute values, so it's hard to pass through something like \ (well, impossible).
(This isn't really a solution, just a recommendation of a general practice that happens to solve this problem, too.)
Bottom line: Go with separated Javascript. If you feel it's too much work to completely separate it out into a different file (even though that would help you cleanly avoid all issues such as this), at least put it all in a script tag at the bottom. It helps separate layout and logic, and it keeps all the Javascript in one known place, making it easier to understand and maintain. You don't even need onclick/onsubmit attributes, you can assign those in Javascript too (usually keyed on html #id attributes). If you use the on[event] attributes anyway, just call one sensibly named function, and put the function's implementation in your main script.
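A sketch of that recommendation, assuming the element ids filterListForm and filterSearchText from the question. The function names are made up for illustration:

```javascript
// Pure helper: remove every backslash from a string.
function stripBackslashes(value) {
  return value.replace(/\\/g, '');
}

// Attach the cleanup to the form's submit event instead of an onsubmit attribute.
// Returns false when the expected elements are not on the page.
function attachBackslashFilter(doc) {
  const form = doc.getElementById('filterListForm');
  const field = doc.getElementById('filterSearchText');
  if (!form || !field) return false;
  form.addEventListener('submit', function () {
    field.value = stripBackslashes(field.value);
  });
  return true;
}

// Only wire things up when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    attachBackslashFilter(document);
  });
}
```

Because the script lives outside the JSP attribute, the regex literal is never rewritten by the tag library.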

Regex using js to strip js from html

I'm using jQuery to sort a column of emails, though they are base64 encoded in js... so I need a regex command to ignore the <script>.*?<script> tags and only sort what is after them (within the <noscript> tags).
Column HTML
<td>
<script type="text/javascript">
document.write(Base64.decode('PG5vYnI+PGEgaHJlZj0ibWFpbHRvOmJpY2VAdWNzYy5lZHUiIHRpdGxlPSJiaWNlQHVjc2MuZWR1Ij5iaWNlPC9hPjwvbm9icj48YnIgLz4K'));
</script>
<noscript>username</noscript>
</td>
Regex that needs some love
a.replace(/<script.*?<\/script>(.*?)/i,"$1");
Assuming that the structure of the html doesn't change, you can use this:
$(a).contents().filter(function(){
return this.nodeType === 3
}).eq(1).text();
It gets all child nodes, filters them down to the text nodes, picks the one at index 1, and gets its text value.
And if you want to stick with regexp, here's one:
a.replace(/(<script type="text\/javascript">[^>]+>|<noscript>.*<\/noscript>)/ig,"");
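If the goal is to sort on the text inside the noscript element, a capture group may be simpler than removing everything else. A minimal sketch, assuming each cell contains at most one noscript element; noscriptText is a hypothetical name:

```javascript
// Hypothetical helper: return the text inside the first <noscript> element.
// [\s\S] is used instead of . so that line breaks inside the element are matched.
function noscriptText(cellHtml) {
  const match = /<noscript[^>]*>([\s\S]*?)<\/noscript>/i.exec(cellHtml);
  return match ? match[1].trim() : '';
}

const cell = [
  '<td>',
  '<script type="text/javascript">',
  "document.write(Base64.decode('...'));",
  '</script>',
  '<noscript>username</noscript>',
  '</td>'
].join('\n');

console.log(noscriptText(cell));
```

The returned string can then be used as the sort key for the column.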
I know this isn't exactly what you're asking for (though I'm a little confused what you're asking for, to be honest...), but have you looked at using document.getElementsByTagName('noscript')? This function returns an array-like collection, the first element of which will be your noscript element.
Also, I'm not really clear on your overall approach to this problem, but it seems like you're misunderstanding the purpose of a noscript element. The content of a noscript element is only rendered when the browser does not support Javascript or has it disabled, which means the only time noscript content would be displayed to the user is when the Javascript that you're using to modify the noscript content wouldn't run.
Perhaps you could clarify what exactly you're trying to do?
