Do you know an open source Javascript extraction/regexp engine? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
We need a DOM parser that can run a set of patterns against a page and store the results. We are looking for open source libraries that we can build on. The library should:
be able to select elements by regexp (for example, grab all elements that contain "price" in their class, id, or other attributes such as meta attributes),
have plenty of helpers, such as removing comments, iframes, etc.,
be pretty fast,
be runnable from browser extensions.

OK, I'll say it:
You can use jQuery.
Pros:
it is a very good DOM parser
it is very good at manipulating the DOM (removing/adding/editing elements)
it has a great and intuitive API
it has a big and great community, so there are lots of answers to any jQuery-related question
it works in browser extensions (I tested it myself in Chrome, and it apparently works in Firefox extensions too: see "How to use jQuery in Firefox Extension")
it is lightweight (about 31KB minified and gzipped)
it is cross-browser
it is definitely open source
Cons:
it doesn't rely on regex (although this is a very good thing, as dda already mentioned), but regex can be used to filter the elements
I don't know whether it can access or manipulate comments
Here's an example of some jQuery action:
// select all the iframe elements with the class "advertisement"
// that have the word "porn" in their src attribute
$('iframe.advertisement[src*=porn]')
    // keep only the ones that contain the word "poney" in their title,
    // with the help of a regex
    .filter(function () {
        // contentDocument is null for cross-origin frames
        var title = this.title || (this.contentDocument && this.contentDocument.title) || '';
        return /poney/i.test(title);
    })
    // and remove them
    .remove()
    // return to the whole match
    .end()
    // filter them again, this time
    // affecting only the big ones
    .filter(function () {
        return $(this).width() > 100 && $(this).height() > 100;
    })
    // replace them with some html markup
    .replaceWith('<img src="harmless_bunnies_and_kitties.jpg" />');
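The question also asks for selecting elements by regexp across class, id, and other attributes, which jQuery selectors do not do directly. A minimal sketch using only standard DOM calls follows; the helper names `hasAttributeMatching` and `selectByAttributePattern` are made up for illustration, and the sketch assumes the pattern has no `g` flag (a global regex keeps state between `test` calls).

```javascript
// Hypothetical helper: true when any attribute name or value on the
// element matches the given regular expression.
function hasAttributeMatching(element, pattern) {
  var attrs = element.attributes; // a NamedNodeMap in a real DOM
  for (var i = 0; i < attrs.length; i++) {
    if (pattern.test(attrs[i].name) || pattern.test(attrs[i].value)) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: collects every descendant of `root` that has at
// least one matching attribute.
function selectByAttributePattern(root, pattern) {
  var all = root.querySelectorAll('*');
  var matches = [];
  for (var i = 0; i < all.length; i++) {
    if (hasAttributeMatching(all[i], pattern)) {
      matches.push(all[i]);
    }
  }
  return matches;
}

// Browser usage (skipped when no DOM is available):
if (typeof document !== 'undefined') {
  var priced = selectByAttributePattern(document, /price/i);
  console.log(priced.length + ' element(s) mention "price"');
}
```

The resulting array can be wrapped with `$(priced)` to continue with jQuery's manipulation helpers. Walking every element this way is linear in the size of the document, so it may be worth narrowing `root` on large pages.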

node-htmlparser can parse HTML, provides a DOM with a number of utils (also supports filtering by functions) and can be run in any context (even in WebWorkers).
I forked it a while back, improved it for better speed and got some insane results (read: even faster than native libexpat bindings).
Nevertheless, I would advise you to use the original version, as it supports browsers out of the box (my fork can be run in browsers using browserify, which adds some overhead).

Related

Performance using JS querySelector [closed]

Closed 10 years ago.
When using JavaScript in the web browser is there any performance difference between the following:
Existing getElementById
document.getElementById("elem");
Query Selector using #id
document.querySelector("#elem");
Query Selector using [id=elem]
document.querySelector("[id=elem]");
I'm assuming the first one will be fastest (only has to lookup elements with an ID). Also the final one looks like bad practice. I like the second one as using querySelector for everything makes the code easy to read.
Any suggestions?

Firstly,
document.querySelector("#elem");
has the advantage that, unlike document.getElementById, it accepts any CSS selector, so it can also select by class name. However, this is less useful than it sounds, because it returns only the first element that matches, so you might as well use an id unless you specifically want the first element with that class name. If you use
document.querySelectorAll
instead, it returns all matching elements as a NodeList (array-like, but not a true array); querySelector(selector) is roughly equivalent to querySelectorAll(selector)[0]. One more advantage is that you can run CSS3 queries through it, which can be quite useful.
Secondly,
document.getElementById("elem");
has a very good advantage over querySelector in that it is almost 5 times faster, so if you're sitting there with several thousand lines of code and you want to optimise it, then getElementById is the way to go.
Lastly,
document.querySelector("[id=elem]");
I personally don't see the need to use this in any situation. If you need querySelector, why not just use a #? It matches exactly the same element as your first querySelector example, but it has a lot of useless characters.
Edit: Just to be clear, in summary, you're probably better off using document.getElementById.

You can test it yourself. getElementById is the fastest method.
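One way to run such a test is sketched below. The `timeIt` helper and the iteration count are assumptions made for illustration, not part of any library, and a loop like this only gives a rough comparison because engines may optimise repeated identical calls.

```javascript
// Hypothetical timing helper: calls `fn` `iterations` times and returns
// the elapsed wall-clock time in milliseconds.
function timeIt(fn, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    fn();
  }
  return Date.now() - start;
}

// Browser usage (skipped when no DOM is available). Assumes the page
// contains an element with id "elem", as in the question.
if (typeof document !== 'undefined') {
  var runs = 1000000; // arbitrary; raise or lower to taste
  console.log('getElementById:', timeIt(function () {
    document.getElementById('elem');
  }, runs), 'ms');
  console.log('querySelector("#elem"):', timeIt(function () {
    document.querySelector('#elem');
  }, runs), 'ms');
  console.log('querySelector("[id=elem]"):', timeIt(function () {
    document.querySelector('[id=elem]');
  }, runs), 'ms');
}
```

Results vary by browser and by document size, so it is worth running this on a page that resembles the real one.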

is there any performance difference
Probably, since they are different functions. querySelector at least needs to parse the selector before detecting that it's equal to getElementById. And I doubt this optimisation takes place for the attribute selector at all, no one uses it. So I share your assumptions; and tests confirm them (thanks to @Silver_Clash).
Personally I do not like the second one, as it is more ambiguous and awful to use with dynamic id values. Explicitly using getElementById is just more concise.

Toggling visibility with javascript versus with CSS [closed]

Closed 10 years ago.
Just found out about the CSS tilde selector and it seems like it could be an elegant way to handle toggling input visibility.
The use-case I have in mind is when you only want to display an input to the user when they have checked a checkbox. Currently I do this with javascript and attach listeners to each checkbox, which then search through the DOM to find the appropriate input and toggle a class.
So the question is, why is this bad? And if it isn't, why isn't this prevalent? Why do we do this with .js rather than CSS? It seems to me they are both manipulating the presentation layer in fairly similar ways...
Bonus points for resources.

HTML is the Model, it contains the Data
CSS is the View, it contains the Styles
JS is the Controller, it contains the Interactions
This MVC structure makes for a powerful delegation of responsibility. When you want to change the state of a widget on a page, you can use JavaScript to change the available styling hooks:
JavaScript (jQuery used for brevity):
$('#button').on('click', function () {
    $('#widget').toggleClass('hidden');
});
CSS:
.hidden {
    display: none;
}
Using this structure will easily allow you to change how interactions are performed. If you decide later that the widget should be animated, it's a simple change.
Using JavaScript also has the advantage of being backwards compatible. The percentage of users who have JavaScript disabled is remarkably small. While I highly recommend supporting people with JS disabled, that can often be done simply by showing all the content by default, and hiding things only for users who have JS enabled.

The use-case I have in mind is when you only want to display an input
to the user when they have checked a checkbox.
If you're planning to use the :checked selector for that, your app won't run in old browsers (IE < 9). If you're OK with this restriction, that is, if you only target modern browsers, CSS will do fine as well.
Using JS ensures that your site will run on older browsers too, provided the users have JS enabled.
You can also use a mix of both and it will ensure that your page works in modern browsers even with JS disabled as well as JS-enabled old browsers.
It is often easier to detect whether the browser has JS disabled (e.g. using a stylesheet inside <noscript>) than determining whether a browser supports certain CSS selectors.
Therefore using a JS solution allows you to easily place a disclaimer asking the user to enable JS for the page to work properly.
Though, again, if your site is not aimed at general public that may be using IE8 and below, the CSS solution will do just fine.

I would say if you can get away with using the tilde selector, or CSS in general, then go for it. CSS is, IMHO, a cleaner, usually more concise, and better-performing way to accomplish the same thing as toggling the item in JS. Plus, browser support for the tilde selector is quite good; see http://www.quirksmode.org/css/contents.html
There are times when you must use JavaScript to accomplish this, for example when the element to hide is not a sibling or a descendant of the element in question.
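For that last case, a minimal sketch of the JavaScript route might look like the following. The helper name `bindVisibilityToggle` and the ids `show-details` and `details` are hypothetical, and the sketch assumes a `.hidden { display: none; }` rule like the one shown earlier in this thread. It also relies on `classList.toggle` with a second argument, which older browsers do not support.

```javascript
// Hypothetical helper: keeps `target` hidden while `checkbox` is
// unchecked, regardless of where the two elements sit in the DOM.
function bindVisibilityToggle(checkbox, target) {
  function sync() {
    // The second argument forces the class on (true) or off (false).
    target.classList.toggle('hidden', !checkbox.checked);
  }
  checkbox.addEventListener('change', sync, false);
  sync(); // apply the initial state
  return sync;
}

// Browser usage (skipped when no DOM is available):
if (typeof document !== 'undefined') {
  var box = document.getElementById('show-details');
  var panel = document.getElementById('details');
  if (box && panel) {
    bindVisibilityToggle(box, panel);
  }
}
```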

JavaScript / HTML Command Line Widget [closed]

Closed 10 years ago.
I'm making a text based game in JavaScript, and I'm looking for a command line widget, with the following features:
print - Be able to send a synchronous command to put some text (preferably support arbitrary HTML) into the command line.
commands - Be able to specify commands which a user can type in, which will then execute functions (should support parameters to such commands, and preferably also have an auto-complete).
prompt - Be able to request input (and send that input to a callback). This should queue prints, and disable commands.
The user shouldn't be able to edit output text (but should be able to copy and paste).
Preferably using semantic elements.
Browser Support: Latest Firefox & Chrome, preferably also IE9, latest Safari & Opera.
Does anyone know about any such preexisting widget, if not, can anyone give me tips for how to make one?

It's not as tricky as you'd think to make this. I made one recently for a project, using HTML elements and jQuery JS.
I used an input element for the input line - handling special keypresses like enter and tab. For previous commands and responses I used a scrollable area which I appended new elements to as new inputs or responses were available.
I displayed the commands and responses in DIVs, which allowed for copy and pasting (and HTML formatting). I even made the responses disclosable so I'd only show the command initially and you could click on the 'expand' button to show the rest of it.
It all worked really well in the end. Just a few simple HTML elements and some jQuery code which wasn't as complex as I thought it would be when starting it. By far the most complexity and most code was in tweaking the aesthetics!
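The command handling behind such a widget can be kept separate from the HTML. The sketch below is not the code from that project; `createCommandLine` and its methods are names invented here, and the `print` callback stands in for whatever appends a new element to the scrollable output area.

```javascript
// Hypothetical command registry with parameters and prefix completion.
function createCommandLine(print) {
  var commands = {};
  return {
    // Register a handler under a command name.
    register: function (name, handler) {
      commands[name] = handler;
    },
    // Parse a line such as "say hello" and run the matching handler.
    // Returns true when a command ran, false otherwise.
    execute: function (line) {
      var parts = line.trim().split(/\s+/);
      var name = parts[0];
      if (name === '') {
        return false;
      }
      if (!Object.prototype.hasOwnProperty.call(commands, name)) {
        print('Unknown command: ' + name);
        return false;
      }
      commands[name].apply(null, parts.slice(1));
      return true;
    },
    // Return the sorted command names that start with `prefix`,
    // for use when the user presses tab.
    complete: function (prefix) {
      return Object.keys(commands).filter(function (name) {
        return name.indexOf(prefix) === 0;
      }).sort();
    }
  };
}

// Usage: collect output in an array instead of the page.
var output = [];
var cli = createCommandLine(function (text) { output.push(text); });
cli.register('say', function (word) { output.push('You said: ' + word); });
cli.register('save', function () { output.push('Saved.'); });
cli.execute('say hello');
```

In a page, the keypress handler on the input element would call `execute` on enter and `complete` on tab.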

Look here: that is a »virtual box« written in JavaScript with Linux running in it. Its terminal might be worth a look. As far as I could tell, it is built with tables.
Greetings...

Is it O.K to eliminate 'a' tags? [closed]

Closed 10 years ago.
I get tired of the dotted box that appears when you click a tags...so that I started replacing them with p tags...and adding an event listener for click events.
I can't help but notice that other popular sites don't do this ( for example when you click Post Your Question on SO the annoying dotted box appears )
Is there any good reason not to replace a tags by p tags. I don't need any of the special a tag properties...in fact I have to use preventDefault() in my JavaScript to stop them from linking some times.
Is it O.K to pretty much eliminate the A tag?
This is a question regarding major modern browsers.
I'm about to rid myself of them...and am paranoid I'm missing something as I see them still in use pretty much everywhere.

No, this is not okay.
Anchor tags are needed for browsers that don't support your styling, and I'm not just talking about old versions of IE. Remember that screen readers, text-only users, and browser plugins all expect to find your anchor tags.
As Dash has pointed out, bots such as search engine crawlers also need your anchor tags to be able to follow them and index your pages.
HTML is for document structure. CSS is for styling. It is important to keep to this principle.
A note regarding that dotted box... that is there to show focus on an element. Not everyone uses a mouse you know. Some prefer to tab through the document with a keyboard, and that focus box helps with that. Even if you succeed in removing it with CSS... please don't.

Don't do this, because of html semantics. Screen readers and search engines may not be able to follow your link, for example.

If you don't want buttons and links to have a dotted box around them after they've been clicked, simply add a listener that blurs the element:
Look ma, no dots
There is also a CSS property you can set to get rid of the dots, but I can't remember it. Do a search or look at the Mozilla default stylesheet.
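A listener of that kind might look like the sketch below; `blurAfterClick` is a name chosen here for illustration. The CSS property in question is `outline`. Either approach removes the focus indicator that keyboard users rely on, so this is a sketch of the technique rather than a recommendation.

```javascript
// Hypothetical helper: drops focus from an element right after it is
// clicked, so the focus outline does not linger.
function blurAfterClick(element) {
  element.addEventListener('click', function (event) {
    event.currentTarget.blur();
  }, false);
}

// Browser usage (skipped when no DOM is available):
if (typeof document !== 'undefined') {
  var clickable = document.querySelectorAll('a, button');
  for (var i = 0; i < clickable.length; i++) {
    blurAfterClick(clickable[i]);
  }
}
```

Because the blur happens only on click, tabbing to the element with the keyboard still shows the outline.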

I disagree with the other two commenters a little. While it is true that it will break screen readers and not be backward compatible in general.....
IF you have a specific use case where you don't care about any of the above, well...it is your project. :)
And you COULD also mostly achieve this and keep the anchor tags by adding the following to your CSS:
a
{
text-decoration: none;
}

Is there an actual use case that requires getAttributeNode and/or getAttributeNodeNS? [closed]

Closed 11 years ago.
Can someone think of an actual use case whereby we will ever require the use of getAttributeNode and/or getAttributeNodeNS?
As far as I know, getAttribute and/or getAttributeNS settles all use cases, hence this question.

It lets you obtain the item as a Node, which is an interface shared by other DOM components like elements, processing instructions, comments, etc., so you could treat it similarly to other items. Not sure why, but you could... :)
For example:
<button id="choose_attr">choose attribute node</button>
<button id="choose_text">choose text node</button>
<button id="getchosennodevalue">get chosen node's value</button>
<script>
var chosenNode;
document.getElementById('choose_attr').addEventListener('click', function (e) {
chosenNode = e.target.getAttributeNode('id'); // Attribute node
}, false);
document.getElementById('choose_text').addEventListener('click', function (e) {
chosenNode = e.target.firstChild; // Text node
}, false);
document.getElementById('getchosennodevalue').addEventListener('click', function () {
alert(chosenNode.nodeValue); // We can get the value with a shared property, no matter if it is a text node, comment node, element, attribute node, etc.
}, false);
</script>
Even if you will only use the variable for storing attribute nodes, one might prefer to have it already pre-built into a special object distinct from other types like strings.
Although your question was about getAttributeNode*, as far as the use of attribute nodes in general, I think it might be more handy with the likes of document.createAttribute where you can create and then pass around such a node to set it on an element later. But getting an existing attribute indeed seems of less general utility (though one could imagine a situation where you were passing around attribute nodes, sometimes created anew without an element, and sometimes retrieved from an existing element--using getAttributeNode allows you to avoid building your own object which has getters and setters and handle them with the same interface).
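The create-then-attach pattern described above can be sketched as follows. `buildAttribute` and `applyAttribute` are hypothetical helper names, while `createAttribute` and `setAttributeNode` are the standard DOM methods; the attribute name and value in the usage part are made up.

```javascript
// Hypothetical helper: builds a detached attribute node that can be
// passed around before any element is chosen for it.
function buildAttribute(doc, name, value) {
  var attr = doc.createAttribute(name);
  attr.value = value;
  return attr;
}

// Hypothetical helper: attaches a previously built attribute node.
function applyAttribute(element, attr) {
  element.setAttributeNode(attr);
}

// Browser usage (skipped when no DOM is available):
if (typeof document !== 'undefined') {
  var titleAttr = buildAttribute(document, 'title', 'Built before it was attached');
  applyAttribute(document.documentElement, titleAttr);
  console.log(document.documentElement.getAttribute('title'));
}
```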

It's true there's rarely a reason to use getAttributeNode(). However, I've found it useful in the past as a workaround to Internet Explorer's getAttribute and setAttribute bugs. For instance, take the following HTML:
<div id="test" onclick="alert()">
and the following code:
var test = document.getElementById("test");
alert(typeof test.getAttribute("onclick"));
Internet Explorer 7 and lower will report function in the alert, unlike newer versions and other browsers which correctly report string. The workaround involves getAttributeNode():
alert(typeof test.getAttributeNode("onclick").nodeValue);
Correctly outputs string in all browsers, albeit at a small performance cost. The same problem applies to boolean and other non-string properties that can be set via attributes. Sure, this wouldn't be necessary if IE didn't have the bug, but it makes me thankful there's an alternative to getAttribute().
I've set up an example for you to test and play around with — http://jsfiddle.net/PVjn5/. There's also a ticket I filed over at jQuery.com about it.
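The workaround can be wrapped in a small helper, sketched here under the assumption that a missing attribute should yield `null`; the name `getAttributeString` is invented for this example.

```javascript
// Hypothetical helper: reads an attribute through its attribute node,
// so the raw attribute text is returned rather than whatever
// getAttribute() reports.
function getAttributeString(element, name) {
  var node = element.getAttributeNode(name);
  return node ? node.nodeValue : null;
}

// Browser usage (skipped when no DOM is available). Assumes the
// <div id="test" onclick="alert()"> element from the example above.
if (typeof document !== 'undefined') {
  var test = document.getElementById('test');
  if (test) {
    console.log(typeof getAttributeString(test, 'onclick'));
  }
}
```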

You'd use them in a situation where you needed to encapsulate the attribute name and value (and namespace, if applicable) in a single object. Use cases for doing this when manipulating an HTML user interface via JavaScript are likely to be rare. However, browsers implement the DOM specification and those methods must be provided to meet the spec. There are plenty of use cases for the DOM when it is not being used to manipulate HTML-based user interfaces.
