I am trying to make a Google Chrome extension using content script.
My goal is to have a display at the top of the page (which is already working on my own pages) that can interact with the page.
I need things which are very complicated to put together in an extension, due to security policies :
Using require.js on the extension (that works for now, using this Github repo)
Using a templating engine to describe my display : I need to add a lot of content to the page and I don't think writing HTML in javascript would be a good workflow.
For my current version I use jade with my server, but this is not possible with an extension. I think I need to use something like Angular.js or Backbone.js, but I can't make them work on the content script.
I need a lot of communication between my extension and the page : For example I need to detect almost constantly mouse moves
I need communication with my server using socket.io
Every bit of functionality of my extension have been developed and tried in a standalone web page, but now I need to integrate it in a real extension and I am really stuck
So due to these requirements, I am wondering what would be the right approach for building this : putting it all in an iFrame (would the server-side communication work? And how to communicate with the page ?), or a way to make a templating engine work nicely in there, or a solution I didn't think of?
Try this:
Develop the HUD part as a standalone page that the content script will include in an iframe. You should be able to use Angular.js etc. with this, but you will need local copies of as much as possible and you'll need appropriate entries in the manifest.json to get it working in the extension. See/create other questions for the details.
Have your content script inject the code to monitor mouse-moves, etc. into the target page. Have this code digest and summarize the data, so it's not spamming the system. Maybe message the summaries to the HUD page and/or content script five or six times a second.
After that, it should just be a matter of getting the pieces working, one at a time. Break it down to specific problems and ask a question on one specific problem at a time (If you can't find the answers in previous questions).
I'm pretty sure what you appear to want is do-able, but the details are too broad for a single Stack Overflow question.
Related
How can I screen scrape a multi page application? I want to do this using Javascript. Here are the approaches I have considered and the problems I have encountered.
Using the Fetch web API in a Node application to get the web pages
Problem: The web pages won't load properly when being fetched. I guess all javascript on the page does not run when the page is fetched.
Running JavaScript from the console
This is a very simple way to inject JavaScript straight into the document. But one problem is that opening the web page is a browser and pasting into the console is manual work. Another problem is that while this works for single page application it becomes very cumbersome for multi-page applications.
What better approach exists that solves the problems I have encountered?
Depends on what are you doing. If you just want to get some that from some website then injecting JS in the page is the way to go.
But as you said it's manual work from which I deduce you want to scrape the sites and save the data maybe. In this case a service side script is better suited. To fix the problem with the JavaScript not being loaded you can use things like PhantomJs or Horseman.
Take a look at this: https://medium.com/#designman/building-a-performant-web-scraper-in-node-js-5f4449674163
If you want to save website content (html, js, css files, images) to file system you can take a look on website-scraper package for nodejs https://www.npmjs.com/package/website-scraper
It also has plugin for PhantomJS which allows to handle single page applications
Its been a few years since I have worked on Chrome apps. I started messing around with making some simple examples to learn the new process. The problem is that I have found that using an <a> tag to change the html page that is loaded to another html page inside the packaged app will not work. I'm looking for different methods of changing the screen with retrieving user input (button click, link click. and so on). I have looked online and have found very little documentation about making chrome apps most examples show how to make a simple "Hello World" and how to publish it. Not many extensive tutorials beyond that. Just to clarify this app would be a real chrome app not a link to some site. all files would be packaged with the chrome application.
Chrome Apps are meant to be "single page apps", and cannot navigate links by design. <a> links should open in a regular browser window instead.
If you want to do "url routing" within your application to change views, you can just roll your own solution, but should probably use a framework to help you out instead.
Here are some examples:
Polymer
Angular
Ember
Meteor
Backbone
The list is extensive. There are likely many other stack overflow answers that compare each framework.
The DOM (Document Object Model) which represents the contents of a Chrome App window is initiated from an HTML file, but from that point on you can't change it by referencing any other file, which is what you're expecting navigation to do. (<a> elements that navigate to an external browser via a "target=_blank" attribute are perfectly OK.)
However, you are free to change the DOM from your JavaScript at runtime. If you like, set an event handler on the <a> element and change the DOM as you wish. If you want to change the DOM via HTML (not from a file), you might find the JavaScript method insertAdjacentHTML useful. Actually, you can get the HTML from a file, but you have to read that file yourself with the Chrome App file I/O API.
Advice in another answer to use a framework is, in my opinion, overkill. If you think of a Windows app, a Mac app, or any other kind of GUI-based app, you would never assume that you could just change the UI over to something completely different by referencing an HTML file. Think of a Chrome App as being similar to those technologies in that sense, and you'll be on the right track.
i have a sitation where i want to access HTML DOM object from within my application to update certain parts of web page through javascript commands at run time.
It is a local webpage opened in FireFox which would be accessed by my application, so that the final output is always shown at the webpage which is updated by appliation.
It would be great if you could give me some idea about how this can be accomplished.
I have similar requirement like the webmonkey extension of firefox but need to do it outside of browser from my application.
You can try QtWebKit from the Qt framework, it provides an OO set of classes to interact with webpages from basic actions to very complicated and advanced stuff. I believe you may find your answer there, a link is provided below...
Good Luck
see Here
What is the best thing to do when a user doesn't have JavaScript enabled? What is the best way to deliver content to that kind of user? What is the best way to keep a site readable by search engines?
I can think of two ways to achieve this, but do not know what is better (or if a 3rd option is better):
Rely on the meta-refresh tag to redirect users to a non-javascript version of site. Wrap the meta-refresh tag in a noscript tag so it will be ignored by those with javascript.
Rely on an iframe tag located within the body tag to deliver a non-javascript version of site. Wrap the iframe tag in a a noscript tag so it will be ignored by those with javascript.
I would also appreciate high-profile examples of the correct or incorrect way to do this.
--------- ADDITION TO QUESTION -----------
Here is an example of what I have done in the past to address this: http://photocontest.highpoint.edu/
I want to make sure there aren't better ways to do this.
You are talking about graceful degradation: Designing and making the site to work with javascript, then making the site still work with javascript turned off. The easiest thing to do is include the html "noscript" tag somewhere near the top of your page that gives a message saying that the site REQUIRES javascript or things won't work right. SO is a perfect example of this. Most of the buttons at the top of the screen run via javascript. Turn it off and you get a nice red banner and the drop down js effects are gone.
I prefer progressive enhancement development. Get the site working in it's entirety without javascript / flash / css3 / whatever, THEN enhance it bit by bit (still include the noscript tag) to improve the user experience. This ensures you have a fully working, readable website regardless if you're a disabled user with a screen reader or search engine, whilst providing a good user experience for users with newer browsers.
Bottom line: for any dynamically generated content (for example page elements generated via AJAX) there has to be a static page alternative where this content must be available via a standard link. If you are using javascript for tabbed content, then show all the content in a way that is consistent with the rest of the webpage.
An example is http://www.bbc.co.uk/news/ Turn off javascript and you have a full page of written content, pictures, links etc. Turn on javascript and you get scrolling news stories, tabbed content, scrolling pictures and so on.
I'm going to be naughty and post links to wikipedia:
Progressive Enhancement
Graceful Degredation
You have another option, just load the same page but make it work for noscript users (progressive enhancement/gracefull degradation).
A simple example:
You want to load content into a div with ajax, make an <a> tag linking to the full page with the new content (noscript behavior) and bind the <a> tag with jQuery to intercept clicks and load with ajax (script behavior).
$('a.ajax').click(function(){
var anchor = $(this);
$('#content').load(anchor.attr('href') + ' #content');
return false;
});
I'm not entirely sure if Progressive Enhancement is considered to be best practice these days but it's the approach I personally favour. In this case you write your server side code so that it functions like a standard web 1.0 web app (no JavaScript) to provide at least enough functionality for the system to work without JavaScript. You then start layering JavaScript functionality on top of this to make the system more user friendly. If done properly you should end up with a web app that at least provides enough functionality to be useful for non-JavaScript users.
A related process is known as Graceful Degradation, which works in a similar way but starts with the assumption that a user has JavaScript enabled and build in workarounds for cases where they don/t. This has a drawback, however, in that if you overlook something you can leave a non-JavaScript user without a fallback.
Progressive Enhancement example for a search page: Build your search page so that it normally just returns a HTML page of search results, but also add a flag that can be set via GET that when set, it returns XML or JSON instead. On the search page, include a script that does an AJAX request to the search page with the flag appended onto the query string and then replaces the main content of the page with the result of the AJAX call. JavaScript users get the benefit of AJAX but those without JavaScript still get a workable search page.
http://en.wikipedia.org/wiki/Progressive_enhancement
If your application must have javascript to function then there's nothing you can do except show them a polite message in a noscript tag.
Otherwise, you should be thinking the other way around.
Build your site without JS
Give awesome user experience and make it full functional
Add JS and make the UX even more functional. Layer the JS on top.
So if the user doesn't have JS, your site will still revert to step two of your site state.
As for crawling. If your site depends on AJAX and a lot of JS to work, you can make gogole aware of it : http://code.google.com/web/ajaxcrawling/docs/getting-started.html
One quick tip that may help you: just install lynx, a command-line web browser, and you'll immediately see how google and other seo see your site (and blind people too). This is very useful. Of course, in a command line windows, there's no graphics and javascript is disabled.
If you're doing "serious" Ajax (e.g. client side-routing) the following technique could be useful:
Use Urls without GET/"?"-parameters (it makes your life easier later on)
Use http://baseurl.com/#!/path/to/resource for client side-routing
Implement rendering of non-script HTML-version of your site (HTML snapshot is what Google calls it) at http://baseurl.com/path/to/resource
Wrap the whole content of your HTML snapshot in noscript-tags and redirect via top.location.href to the full version of the site
Handle http://baseurl.com/?_escaped_fragment=/path/to/resource - it should redirect via 301-response to http://baseurl.com/path/to/resource
Use a-tags only for GET-links, use forms for POST/PUT/DELETE-links - unstyle the hell out of them if necessary
A nice example code for links I found while researching "How to write proper Ajax-code":
Resource
This is of course a pretty complex solution but it should enable both SEO (including non-search engine crawlers) and accessibility. The problem is that you have to be able to render your page server- AND client side.
One solution could be to use a templating framework like mustache where implementations for different platforms exist.
Use something like {{#pagelet}}/path/to/partial{{/pagelet}} for dynamic parts of your page - example: {{#pagelet}}/image/{{image_id}}/preview{{/pagelet}}
In your client-side rendering, pagelet would be implemented to be dynamically replaced with something loaded via Ajax (for example: render )
In your server-side rendering, pagelet would just be rendered directly (in doubt just curl the pagelet and render it right away - or if you can write the code asynchronously do it just as you would do it client side: write some temporary span into a buffer, start fetching all the pagelets, replace the temporary spans as the pagelets arrive and flush the buffer once all pagelets have been rendered.
That's the best general design I found so far. You can deep link into your app, it's search engine friendly and it should force you to build a page that gracefully degrades.
P.S.: One advantage of the techniques described above is that both the Ajax- and the "Web 1.0"-rendering of a page could profit from memcached-caching of whole pagelets.
I would prefer to code the page without javascript and then if javascript is enabled, we redirect users to a similar page with javascript. (same concept as progressive enhancement)
redirecting with javascript
As part of a job I'm doing on a web site I have to copy a few thousand lines of text from several pages of the old site and paste them into the HTML for the new site. The long and painstaking way of going to the old page and copying the many lines of text and then going to my editor and pasting it there line by line is getting really old. I thought of using injected JavaScript to do this but I'm not quite sure where to start. Thanks in advance for any help.
Here are links to a page of the old site and a page of the new site. As you can see in the tables on each page it would take a ton of time to copy it all manually.
Old site: http://temp.delridgelegalformscom.officelive.com/macorporation1.aspx
New Site: http://ezwebsites.us/delridge/macorporation1.html
In order to do this type of work, you need two things: a way of injecting or executing your script on that page, and a good working knowledge of the Document Object Model for the target site.
I highly recommend using the Firefox plugin FireBug, or some equivalent tool on your browser of choice. FireBug lets you execute commands from a JavaScript console which will help. Hopefully the old site does not have a bunch of <FONT>, <OBJECT> or <IFRAME> tags which will make this even more tedious.
Using a library like Prototype or JQuery will also help selecting parts of the website you need. You can submit results using JQuery like this:
$(function() {
snippet = $('#content-id').html;
$.post('http://myserver/page', {content: snippet});
});
A problem you will very likely run into is the "same origination policy" many browsers enforce for JavaScript. So if your JavaScript was loaded from http://myserver as in this example, you would be OK.
Perhaps another route you can take is to use a scripting language like Ruby, Python, or (if you really have patience) VBA. The script can automate the list of pages to scrape and a target location for the information. It can just as easily package it up as a request to the new server if that's how pages get updated. This way you don't have to worry about injecting the JavaScript and hoping all works without problems.
I think you need Grease Monkey http://www.greasespot.net/