Javascript newb here. Creating a bookmarklet to automate a simple task at work. Mostly a learning exercise. It will scan a transcript on CNN.com, for instance: (http://transcripts.cnn.com/TRANSCRIPTS/1302/28/acd.01.html). It will grab the lead stories at the top of the page, the name and title of the guests on the show, and format them so that they can be copy pasted into another document.
I've come up with a simple version that includes some jQuery that grabs the subheading and then uses a regular expression to find the names of the guests. (It will also exclude everything between (begin videoclip) and (end videoclip), but I haven't gotten that far yet.) It then alerts them (it will eventually print them in a pop-up window; the alert is just for troubleshooting purposes).
I'm using http://benalman.com/code/test/jquery-run-code-bookmarklet/ to create the bookmarklet. My problem is that once the bookmarklet is created it is completely unresponsive. Click on it and nothing happens. I've tried minimizing the code first with no result. My guess is that cnn.com's javascript is conflicting with mine but I'm not sure how to get around that. Or do I need to include some code to load and store the text on the current page? Here's the code (I've included comments, but I took these out when I used the bookmarklet generator.) Thanks for any help!
//Grabs the subheading
var leadStories=$(".cnnTransSubHead").text();
//Scans the webpage for guest name and title. Includes a regular expression to find any
//string that starts with a capital letter, includes a comma, and ends in a colon.
var scanForGuests=/[A-Z ].+,[A-Z0-9 ].+:/g;
//Joins the array created by scanForGuests with a semicolon instead of a comma
var guests=scanForGuests.join('; ');
//Creates an alert in the proper format including stories and guests.
alert("Lead Stories: " + leadStories + ". " + guests + ". SEE TRANSCRIPT FIELD FOR FULL TRANSCRIPT.")
Go to the page. Open up developer tools (ctrl+shift+j in chrome) and paste your code in the console to see what's wrong.
The $ in var leadStories = $(".cnnTransSubHead").text(); is from jQuery and the link provided does not have jQuery loaded into the page.
On any modern browser you should be able to achieve the same results without jQuery:
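// getElementsByClassName returns an HTMLCollection, which has no .map() of its own
var leadStories = Array.prototype.map.call(
document.getElementsByClassName('cnnTransSubHead'),
function(el) { return el.innerText; });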
next we have:
var scanForGuests=/[A-Z ].+,[A-Z0-9 ].+:/g;
var guests=scanForGuests.join('; ');
scanForGuests is a regular expression, not an array. You never actually matched it against anything, and a RegExp has no .join() method, so that line throws a TypeError. I'm not exactly sure what you're trying to do. Are you trying to scan the full text of the page for that regex? In that case something like this would be your best bet:
document.body.innerText.match(scanForGuests);
Keep in mind that while innerText strips the HTML markup, it's far from perfect, and what shows up in it is very much at the mercy of how the page's HTML is structured. That said, on my quick test it seems to work.
Finally, for something like this you should use an immediately invoked function or you're sticking all your variables into the global context.
So putting it all together you get something like this:
(function() {
var leadStories = Array.prototype.map.call(
document.getElementsByClassName('cnnTransSubHead'),
function(el) { return el.innerText; });
var scanForGuests=/[A-Z ].+,[A-Z0-9 ].+:/g;
// match() returns null when nothing matches, so fall back to an empty array
var guests = (document.body.innerText.match(scanForGuests) || []).join("; ");
alert("Leads: " + leadStories + " Guests: " + guests);
})();
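For what it's worth, the guest-matching step can be pulled into a small function that works on any string, which makes the regex easy to try outside the browser. This is only a rough sketch: the name extractGuests and the sample transcript lines are made up for illustration, and only the regex comes from the question.

```javascript
// Regex from the question: a fragment that starts with a capital letter or
// space, contains a comma, and runs up to the last colon on the same line.
var scanForGuests = /[A-Z ].+,[A-Z0-9 ].+:/g;

// Hypothetical helper: returns all matches joined by "; ".
// String.prototype.match returns null when nothing matches, hence the || [].
function extractGuests(text) {
  var matches = text.match(scanForGuests) || [];
  return matches.join("; ");
}

// Made-up sample in the style of a transcript.
var sample = "ANDERSON COOPER, CNN ANCHOR: Good evening.\n" +
             "thanks for joining us\n" +
             "JOHN KING, CNN CHIEF NATIONAL CORRESPONDENT: Thanks.";
var guests = extractGuests(sample);
```

In the bookmarklet you would call extractGuests(document.body.innerText) instead of passing a sample string. Note that each match keeps its trailing colon, so you may want to strip that before printing.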
Trying something new, I was attempting to highlight text on this wikia page using javascript within the address bar (i.e. using "javascript:[code]").
When running the following code sample through Chrome's console, it produces the desired effect. When running it from the address bar, it results in only the affected text -- the rest of the page body is removed.
javascript:txt = document.getElementById("Ballas.27_Rebellion_and_Allying_With_Hunhow").parentElement.nextElementSibling;index = txt.innerHTML.indexOf(", but")+2;txt.innerHTML = txt.innerHTML.substring(0,index)+"<span style='background-color:yellow;'>"+txt.innerHTML.substring(index,index+40)+"</span>"+txt.innerHTML.substring(index+40);
Note: if you want to try this you will have to manually type javascript: into the address bar before pasting the code, as Chrome automatically removes it.
I'm curious as to why this would be, and also if there is a way to stop the address bar from removing the rest of the page body. Can anyone offer insight?
Thanks.
The quick solution to the problem you're experiencing is to add false; to the end of your query. This will prevent Chrome from removing the text from your page and should give you the result you expect.
Here's the fixed code:
javascript:txt = document.getElementById("Ballas.27_Rebellion_and_Allying_With_Hunhow").parentElement.nextElementSibling;index = txt.innerHTML.indexOf(", but")+2;txt.innerHTML = txt.innerHTML.substring(0,index)+"<span style='background-color:yellow;'>"+txt.innerHTML.substring(index,index+40)+"</span>"+txt.innerHTML.substring(index+40);false;
To fully answer the question, let me quickly explain what is happening. I'll start by splitting up your JS a bit to make it easier to read.
txt = document.getElementById("Ballas.27_Rebellion_and_Allying_With_Hunhow").parentElement.nextElementSibling;
index = txt.innerHTML.indexOf(", but")+2;
txt.innerHTML = txt.innerHTML.substring(0,index) +
"<span style='background-color:yellow;'>" +
txt.innerHTML.substring(index,index+40) +
"</span>"+txt.innerHTML.substring(index+40);
What you'll note is that the final statement is an assignment operation. In JavaScript the result of an assignment operation is the value of the assignment. In other words, if we say return x = 1 we will both set the value of x to 1 and return the value 1.
This brings us to the reason why Chrome is replacing your page content. The JavaScript you're providing is returning the content of the txt element (the paragraph you're deciding to highlight) and this is then being treated as the content of your new page, the same way that visiting data:text/plain,hello world or javascript:"hello world" in your browser will show the text "hello world" even though you haven't explicitly visited a website.
To fix this, you can make the script end in a falsy value in JavaScript, for example any one of the following:
0
false
null
undefined
Hence, adding false; at the end of your JavaScript will have Chrome run the code but not show the resulting text and will prevent it from changing the page content on you unexpectedly.
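A common way to get the same effect without remembering to append a value is to wrap the whole script in a function and apply the void operator, which always evaluates to undefined. Here is a small sketch of that idea; the helper name toBookmarklet is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: wraps a script body so that the javascript: URL
// always evaluates to undefined, whatever the body itself returns.
// The function wrapper also keeps the script's variables out of the global scope.
function toBookmarklet(source) {
  return "javascript:void (function () {" + source + "})();";
}

// The body returns a string, but void discards it.
var url = toBookmarklet("var s = 'some text'; return s;");
// url is: javascript:void (function () {var s = 'some text'; return s;})();
```

Depending on how you store the bookmarklet you may also need to URL-encode the script; that step is left out here.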
I have no scripting ability, but I'd like to edit an existing script which is currently restricted from running on any page other than the ones that have a certain string in the URL.
Here is the snippet of the script which limits it from running
if(location.href.indexOf("MODULE=MESSAGE")>0||location.href.indexOf("/message")>0)
This only allows the script to run on these pages
mysite/2014/home/11609?MODULE=MESSAGE1
and the pages range from Message1 to Message20
mysite/2014/home/11609?MODULE=MESSAGE20
I would like to also allow the script to be loaded and run on these pages
mysite/2014/options?L=11609&O=247&SEQNO=1&PRINTER=1
where the SEQNO=1 ranges from 1 to SEQNO=20, just like the MESSAGE1-MESSAGE20 do
Can someone show me how I can edit that small snippet of script to allow the SEQNO string found in the URL to work also?
Thanks
If you can't just remove the condition altogether (there's not enough context to know if that's an option), you can just add another or condition (||) like so:
if(location.href.indexOf("MODULE=MESSAGE")>0
||location.href.indexOf("/message")>0
||location.href.indexOf("SEQNO=")>0)
Note that the second clause there isn't actually being used in any of your examples, so it could potentially be removed. Also note that this isn't actually checking for a number, so it isn't restricted to MESSAGE1 to MESSAGE20 as you suggest. It would match MESSAGE21 or even MESSAGEFoo. That may or may not be a problem for you. You can make the conditions as restrictive or as loose as makes sense.
If you just want to check for the existence of "SEQNO", simply duplicate what is being done for "MODULE=MESSAGE".
if(location.href.indexOf("MODULE=MESSAGE")>0 ||
location.href.indexOf("SEQNO=")>0 ||
location.href.indexOf("/message")>0)
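If you want to also ensure that "MESSAGE" and "SEQNO=" are each followed by a number from 1 to 20, you can use a regex.
// create the end part of the regex: a number 1-20 that is not followed by another digit
// (a lookahead is used instead of [^0-9]*$ so a later digit, as in "&PRINTER=1", doesn't break the match)
var regexEnd = "(?:[1-9]|1[0-9]|20)(?![0-9])";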
// create the individual regexes
var messageRegex = new RegExp("MODULE=MESSAGE" + regexEnd);
var seqnoRegex = new RegExp("SEQNO=" + regexEnd);
// now comes your if statement, using the regex test() function, which returns true if it matches
if(messageRegex.test(location.href) ||
seqnoRegex.test(location.href) ||
location.href.indexOf("/message")>0)
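If the regex for the 1-20 range feels hard to read, an alternative is to capture whatever number follows the key and compare it numerically. The following is only a sketch of that approach: the helper names numberAfter, inRange and shouldRun are made up, and the URLs are the examples from the question.

```javascript
// Hypothetical helper: returns the integer that directly follows `key`
// in the URL, or null if the key is missing or not followed by digits.
function numberAfter(href, key) {
  var m = new RegExp(key + "(\\d+)").exec(href);
  return m ? parseInt(m[1], 10) : null;
}

// True for whole numbers from 1 to 20.
function inRange(n) {
  return n !== null && n >= 1 && n <= 20;
}

// Same three conditions as the if statement above, with numeric range checks.
function shouldRun(href) {
  return inRange(numberAfter(href, "MODULE=MESSAGE")) ||
         inRange(numberAfter(href, "SEQNO=")) ||
         href.indexOf("/message") > 0;
}

var okMessage = shouldRun("mysite/2014/home/11609?MODULE=MESSAGE20");
var okSeqno = shouldRun("mysite/2014/options?L=11609&O=247&SEQNO=1&PRINTER=1");
var tooHigh = shouldRun("mysite/2014/options?L=11609&O=247&SEQNO=21&PRINTER=1");
```

In the real script you would write if (shouldRun(location.href)) in place of the original condition. Changing the allowed range later then only means editing the two numbers in inRange.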
After I read about Hover Zoom being evil (yikes!), two articles made me instantly switch to another one, called Imagus:
Hoverzoom’s Malware controversy, and Imagus alternative - ghacks.net
Imagus is a Hover Zoom Replacement to Enlarge Images on Mouseover - LifeHacker
Imagus seems to fit the bill by doing pretty much what Hover Zoom also could, but in addition, it seems to support custom filters (to support more sites), in addition to the huge bunch it already comes packed with.
In the options page, on Chrome, the filters section looks deliciously hackable:
However, at the same time, it seems to be written in what I would call Perl Javascript.
I consider myself well-versed in JavaScript, DOM and regex, but it's just painful to try to guess what that is doing, so I looked for documentation. It seems like there was a MyOpera blog, and now the website of the project is, for the time being, hosted on Google Docs.
The page doesn't mention anything about how to develop "filters" (or "sieves", as written in that page?)
So, how can I develop a custom filter? I'm not aware of all the possibilities (it seems to be pretty flexible), but even a simple example like just modifying URLs would be good. (turning /thumb/123.jpg into /large/123.jpg or something).
Or even just an explanation of the fields. They seem to be:
link
url
res
img
to
note <- Probably Comment
The fields can contain a JavaScript function or a regex.
link receives the address of any link you hover over.
url uses the captured parentheses values from the link field to build a URL.
res receives the text of whatever page url or link pointed to.
If one of them is empty, that step is skipped, e.g. no url and res just loads from link's output.
A simple example is the xkcd filter:
link:
^(xkcd\.(?:org|com)/\d{1,5})/?$
Finds links to xkcd comics. If you're unfamiliar with regex, anything between the parentheses is saved and can be used in Imagus as "$n" to refer to the nth capture. Note that if there's a "?:" right after the opening parenthesis, that group won't get captured.
url:
$1/info.0.json
This simply appends "/info.0.json" to the address from link.
res:
:
if ($._[0] != '{') $ = null;
else $ = JSON.parse($._), $ = [$.img, [$.year, ('0'+$.month).slice(-2),
('0'+$.day).slice(-2)].join('-') + ' | ' + $.safe_title + ' - ' + $.alt + ' ' +
$.link];
return $;
This javascript function parses the JSON file and returns an array where the first element is the link and the second is the caption text displayed under the hoverzoomed image.
If you return just a link then the caption will be the alt text of the link.
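Since the res function above is fairly dense, here is the same sequence of steps spelled out as ordinary JavaScript. This is a sketch for reading purposes only, not something to paste into Imagus: the function name parseXkcdInfo and the sample data are made up, and the body parameter stands in for what the filter appears to read as $._.

```javascript
// Plain-JavaScript rendering of the res steps. `body` plays the role of the
// response text that the filter snippet reads as $._ (an assumption).
function parseXkcdInfo(body) {
  if (body.charAt(0) !== "{") return null;        // not JSON, so give up
  var info = JSON.parse(body);
  // Left-pad month and day to two digits, as ('0' + x).slice(-2) does above.
  var pad = function (n) { return ("0" + n).slice(-2); };
  var date = [info.year, pad(info.month), pad(info.day)].join("-");
  var caption = date + " | " + info.safe_title + " - " + info.alt + " " + info.link;
  return [info.img, caption];                     // [image URL, caption]
}

// Made-up sample data containing only the fields the filter reads.
var sample = JSON.stringify({
  img: "http://example.com/comic.png",
  year: "2013", month: "2", day: "8",
  safe_title: "Sample Title", alt: "Sample alt text", link: ""
});
var parsed = parseXkcdInfo(sample);
```

The first array element is the image to show and the second is the caption, which matches the description of the return value above.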
img is used as link is, but for image sources
to is used as res or url is
A simple use case is when you want to redirect from thumbnails to hires.
Like the filter for wikimapia.org.
img:
^(photos\.wikimapia\.org/p/[^_]+_(?!big))[^.]+
This finds any wikimapia image that doesn't have big in the name.
to:
$1big
Adds big to the url.
note is just for notes.
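The img/to pair behaves much like a regex replace with a "$1" substitution, so you can try a pattern in a browser console before putting it into a filter. The snippet below is just an analogy in plain JavaScript and makes no claim about how Imagus applies filters internally; the helper name rewrite and the sample image path are invented.

```javascript
// The img pattern from the wikimapia filter, written as a JavaScript regex
// literal (the slashes need escaping here).
var img = /^(photos\.wikimapia\.org\/p\/[^_]+_(?!big))[^.]+/;
// The to field: first capture group followed by "big".
var to = "$1big";

// Hypothetical helper: returns the rewritten address, or null if the
// pattern does not apply to this address.
function rewrite(address) {
  return img.test(address) ? address.replace(img, to) : null;
}

// Invented sample path, only meant to have the right shape.
var large = rewrite("photos.wikimapia.org/p/00/01/23/45/67_75.jpg");
var alreadyBig = rewrite("photos.wikimapia.org/p/00/01/23/45/67_big.jpg");
```

The same idea covers the /thumb/123.jpg to /large/123.jpg case from the question: capture the part you want to keep and put the new path segment in the to field.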
Some filters have links to API docs here.
Now, there's no documentation for this feature yet, so I probably missed a lot, but hopefully it'll be enough.
Cheers.
There is a page I can access that contains a bunch of links like this:
<a href="#" onclick="navigate(___VIEW_RAID_2, {raid_inst_id:556816});return false;">
The number after the raid_inst_id: is always going to be different and there will be multiples on the same page all with different numbers. I'm trying to put together a javascript that will scrape the page for these links, put them in an array and then cycle through clicking them.
Ideally, an alert causing a pause between onclicks would be helpful. I've been unsuccessful so far even trying to gather the numbers and just echoing them out let alone manipulating them.
Any hints or help would be greatly appreciated!
Below is a function I tried putting together just to see if I could capture some of the onclick values for further processing but, this produces nothing...
function closeraids(){
x=document.getElementsByTagName('a');
for(i=0;i<x.length;i++)
{
attnode=x.item(i).getAttributeNode('onclick');
alert("OnClick events are: " + attnode);
}
}
Wow - 4 months later and the same problem still exists. I decided to look into this again only to find my own posted question in my Google search! Does anyone have any thoughts on what could be done here? The function I'm trying to provide will be part of a Chrome extension I already provide to users. It uses a combination of a .js file I host on my webserver and injected html content.
Any help would be appreciated!
Had some fun while making this jsfiddle: http://jsfiddle.net/Ralt/ttkGG/
Mostly because I went onto using almost fully functional style... but well. Onto your question.
I'm using getAttribute('onclick') to get the string in there. It shows something like:
"navigate(___VIEW_RAID_2, {raid_inst_id:553516});return false;"
So I just built the necessary regex to match it, and capture the number after raid_inst_id:
var re = /navigate\(___VIEW_RAID_2, {raid_inst_id:(\d+)}\);return false;/;
It's mostly rewriting the string by escaping the parentheses and putting (\d+) where you want to capture the number. (\d+ is matching a number, () is capturing the matched string.)
Using match(), I can simply get the captured string as the last element. So, rewriting the code in old IE way:
var links = document.getElementsByTagName('a'),
re = /navigate\(___VIEW_RAID_2, {raid_inst_id:(\d+)}\);return false;/;
for (var i = 0, l = links.length; i < l; i++) {
var attribute = links[i].getAttribute('onclick'),
nb;
// getAttribute returns null for links without an onclick, so check that first
if (attribute && (nb = attribute.match(re))) {
alert(nb.pop());
}
}
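To collect the ids into an array instead of alerting them one by one, the matching step can be separated from the DOM lookup. A minimal sketch, assuming the onclick strings look like the one quoted in the question; the function name extractRaidIds is made up.

```javascript
// Same idea as the regex above, with the braces escaped and the trailing
// "return false;" left out so small variations after the call don't matter.
var raidRe = /navigate\(___VIEW_RAID_2, \{raid_inst_id:(\d+)\}\)/;

// Hypothetical helper: takes a list of onclick attribute values (strings or
// null) and returns the captured ids as strings, in page order.
function extractRaidIds(onclickValues) {
  var ids = [];
  for (var i = 0; i < onclickValues.length; i++) {
    var value = onclickValues[i];
    var m = value ? value.match(raidRe) : null;
    if (m) {
      ids.push(m[1]); // m[1] is the first capture group, the number
    }
  }
  return ids;
}

var ids = extractRaidIds([
  "navigate(___VIEW_RAID_2, {raid_inst_id:556816});return false;",
  null,                 // a link without an onclick attribute
  "somethingElse();"    // an unrelated handler
]);
```

On the page itself you would build the input list by calling getAttribute('onclick') on each link, as in the loop above, and then step through the resulting array at whatever pace you need.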
This might be a noob question, but I have tried to find an answer here and on other sites and I still have not found one. At least not one that I understand well enough to fix the problem.
This is used in a userscript for chrome.
I'm trying to select a date from a string. The string is the innerHTML from a tag that I have managed to select. The html structure, and also the string, is something like this: (the div is the selected tag so everything within is the content of the string)
<div id="the_selected_tag">
link
" 2011-02-18 23:02"
thing
</div>
If you have a solution that helps me select the date without this fuzz, it would also be great.
The javascript:
var pattern = /\"\s[\d\s:-]*\"/i;
var tag = document.querySelector('div.the_selected_tag');
var date_str = tag.innerHTML.match(pattern)[0]
When I use this script as ordinary javascript on a html document to test it, it works perfectly, but when I install it as a userscript in chrome, it doesn't find the pattern.
I can't figure out how to get around this problem.
Dump innerHTML into console. If it looks fine then start building regexp from more generic (/\d+/) to more specific ones and output everything into a console. There is a bunch of different quote characters in different encodings, many different types of dashes.
[\d\s:-]* is not a very good choice because it would match " 1", " ". I would rather write something as specific as possible:
/" \d{4}-\d{2}-\d{2} \d{2}:\d{2}"/
(Also document.querySelector('div.the_selected_tag') would return null on your sample but you probably wanted to write class instead of id)
It's much more likely that tag.innerHTML doesn't contain what you think it contains.
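One way to sidestep the quote-character problem entirely is to match only the date itself and ignore whatever surrounds it. A small sketch along those lines, assuming the date always has the YYYY-MM-DD HH:MM shape shown in the question; the function name extractDate is made up.

```javascript
// Matches a date and time like 2011-02-18 23:02, without the quotes around it.
var datePattern = /\d{4}-\d{2}-\d{2} \d{2}:\d{2}/;

// Hypothetical helper: returns the first date found in the text, or null.
function extractDate(text) {
  var m = datePattern.exec(text);
  return m ? m[0] : null;
}

// Sample string shaped like the content shown in the question.
var sample = 'link\n" 2011-02-18 23:02"\nthing';
var found = extractDate(sample);
```

In the userscript you would pass tag.textContent (or tag.innerHTML) instead of the sample string, after checking that querySelector actually returned an element.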