I've been scraping a website where I need to read the JavaScript so I can extract data such as a name, a created date, and some randomly generated codes, as shown below.
Is there an efficient way (or any way) to get text/attributes from a function call inside a <script type="text/javascript"> element of an HTML page?
I was able to find the script section using BeautifulSoup; the function embedded in it is as follows:
<script type="text/javascript">
//COMMENT// Some data already here
$(document).ready(function() {
name.init("<website Link>")
lang.init("en", "GB")
data.init("hello", "", "AT3K21SDV", "YIERE34ITEW832WCNG3VMASJKHO345JKRELRK5", "")
});
</script>
Specifically, I need to get the $(document).ready(function() section that will include AT3K21SDV and YIERE34ITEW832WCNG3VMASJKHO345JKRELRK5.
I've been racking my brains trying to get it by index, like so: soup[3:40], but that doesn't work.
BeautifulSoup doesn't parse JavaScript, so you need to use other tools. For example, you can use re to extract the information:
import re
from ast import literal_eval
txt = '''<script type="text/javascript">
//COMMENT// Some data already here
$(document).ready(function() {
name.init("<website Link>")
lang.init("en", "GB")
data.init("hello", "", "AT3K21SDV", "YIERE34ITEW832WCNG3VMASJKHO345JKRELRK5", "")
});
</script>'''
data = re.search(r'data\.init(\(.*?\))', txt).group(1)
data = literal_eval(data)
print(data[2], data[3])
Prints:
AT3K21SDV YIERE34ITEW832WCNG3VMASJKHO345JKRELRK5
EDIT: If there are newlines inside data.init(...), you must set flags=re.DOTALL in re.search():
import re
from ast import literal_eval
txt = '''<script type="text/javascript">
//COMMENT// Some data already here
$(document).ready(function() {
ab.info.init("sv", "pp", "f", "NONE",
"rw", "3r7u6565667",
"3435345")
});
</script>'''
data = re.search(r'info\.init(\(.*?\))', txt, flags=re.DOTALL).group(1)
data = literal_eval(data)
print(data)
Prints:
('sv', 'pp', 'f', 'NONE', 'rw', '3r7u6565667', '3435345')
I am trying to load a JSON file into a JavaScript script for use in a Mustache script on a webpage. The JSON in a string loads correctly, but the same JSON in a file does not.
This works:
function renderHello() {
var json = JSON.parse('{"cloudy": "This applies to Snowflake.","classic": "This applies to SQL Server."}');
var template = document.getElementById("template").innerHTML;
var rendered = Mustache.render(template, json);
document.getElementById("target").innerHTML = rendered;
}
This does not work:
function renderHello() {
var json = JSON.parse('./abc.json');
var template = document.getElementById("template").innerHTML;
var rendered = Mustache.render(template, json);
document.getElementById("target").innerHTML = rendered;
}
Variations I've tried in abc.json:
{
"cloudy": "This applies to Snowflake.",
"classic": "This applies to SQL Server."
}
{"cloudy": "This applies to Snowflake.","classic": "This applies to SQL Server."}
I've unsuccessfully tried
var json = './abc.json';
var json = JSON.parse('./abc.json');
var json = JSON.stringify('./abc.json');
var json = JSON.parse(JSON.parse('./abc.json'));
var json = JSON.parse(JSON.stringify('./abc.json'));
Frequently an error like this is returned in the Chrome Developer Tools:
VM223:1 Uncaught SyntaxError: Unexpected token . in JSON at position 0
at JSON.parse (<anonymous>)
at renderHello (render.js:2)
at onload (docs.html:91)
I added these lines in the script to reveal the location of the website file that includes the Mustache template, so I knew where to locate abc.json:
var loc = window.location.pathname;
var dir = loc.substring(0, loc.lastIndexOf('/'));
console.log(dir)
And the file structure is now this:
directory
--file-with-mustache-script.html
--abc.json
This may not be relevant, but this is the Mustache script, which of course already works with the JSON as a string.
<body onload="renderHello()">
<div id="target">Loading...</div>
<script id="template" type="x-tmpl-mustache">
Hello {{ cloudy }}, {{ classic }}!
</script>
</body>
The website build tool I'm using does not support Node or Handlebars.
If it is relevant at all, the goal is to populate values from a JSON file in a product documentation website page to indicate which paragraphs apply to which product version.
You need to load the content of the file via "AJAX". JSON.parse() only parses the string you pass to it; it does not read files, so giving it a path cannot work.
Something like this would do the job:
function renderHello() {
fetch('https://jsonplaceholder.typicode.com/users/3').then(r=>r.json())
.then(json=>{
// var template = document.getElementById("template").innerHTML;
// var rendered = Mustache.render(template, json);
document.getElementById("target").innerHTML = JSON.stringify(json,null,2);
})
}
renderHello();
<pre id="target"></pre>
As I don't have access to your abc.json file, I used a publicly available resource at http://typicode.com.
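For the local file from the question, a minimal sketch might look like the following. It assumes abc.json is served over HTTP from the same directory as the page (in most browsers fetch will not read it from a file:// page), and renderFromJsonFile is a hypothetical helper name, not part of Mustache. The fetch and render functions are passed in as parameters only so the sketch can also run outside a browser.

```javascript
// Hypothetical helper (not part of Mustache): load a JSON file, then render.
// fetchFn and renderFn are injected; in the page you would pass the browser's
// fetch and a small wrapper around Mustache.render.
async function renderFromJsonFile(url, template, fetchFn, renderFn) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error("Could not load " + url + " (HTTP " + response.status + ")");
  }
  // response.json() parses the file's contents; JSON.parse('./abc.json')
  // would only try to parse the path string itself.
  const data = await response.json();
  return renderFn(template, data);
}

// Possible use inside renderHello() in the browser (assumed wiring):
//   renderFromJsonFile("./abc.json", template,
//       function (u) { return fetch(u); },
//       function (t, d) { return Mustache.render(t, d); })
//     .then(function (html) { document.getElementById("target").innerHTML = html; });
```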
Hi, as the title says, I am trying to read an array from my Java servlet in my JavaScript code.
Java servlet code:
String graphData[] = dbHandler.select(attributes); // filling the array with data from database.
request.setAttribute("graphData", graphData);
RequestDispatcher dispatcher = request.getRequestDispatcher("/displayData.jsp");
dispatcher.forward(request, response);
<script type="text/javascript">
var graphData = ['${graphData}'];
var graphData= pageContext.getAttribute("graphData");
var graphData = document.getElementById("graphData");
var GraphData = ['${graphData}'];
log("test: " + graphData);
</script>
I tried all those options but none of them worked.
Can someone please tell me what the correct way is to read an array from a java servlet in a jsp page?
Thanks in advance!
edit:
What I can do is print out the data from the array on the JSP page (in the header) with this code:
<c:forEach items="${graphData}" var="temp">
<c:out value="${temp}"/><br />
</c:forEach>
But I want to use the data from the array in my JS code, which for some reason doesn't work.
You can do it like this:
<script>
var graphData = [
<c:forEach items="${graphData}" var="graph">
'<c:out value="${graph}" />',
</c:forEach>
];
console.log(graphData);
</script>
Developing in ASP.NET using VB.NET as code behind (I don't do this for a living :-) )
In an attempt to dynamically display all the contents of myDir, I replaced the following code that worked well by displaying a slide show in the specified div (culled for brevity):
<script type="text/javascript">
var mygallery1=new fadeSlideShow
({
wrapperid: "divIDBelow"
...
imagearray: [
["./myDir/image1.jpg", "", ""],
["./myDir/image2.jpg", "", ""]
],
displaymode: ...
...
})
</script>
with:
<script type="text/javascript">
var mygallery1=new fadeSlideShow
(
{
wrapperid: "divIDBelow"
...
imagearray: '<%=fileList.ToString() %>',
displaymode: ...
...
}
)
var imagearr = '<%=fileList.ToString() %>'; //for debugging purposes
alert(imagearr); // for debugging purposes
</script>
where fileList is a server side public StringBuilder variable that is initialized with the contents of "myDir".
The debugging alert outputs the following:
[["./myDir/image1.jpg", "", ""],
["./myDir/image2.jpg", "", ""]]
But the imagearray member in the fadeslideshow function call's variable does not seem to initialize properly as the slide show presents just a white image (as opposed to the results in the hardcoded path version).
Thank you in advance for any help.
imagearray is arriving as a string, not an array, because the server output is wrapped in quotes. Create a real array, fill it with the correct elements, and pass that to imagearray.
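A minimal sketch of that idea, assuming the server still emits the list as a string in the exact format shown in the debugging alert (which happens to be valid JSON). The variable names are illustrative only; dropping the quotes around <%=fileList.ToString() %> so the browser sees an array literal directly should have the same effect.

```javascript
// Stand-in for the text the server writes into the page (copied from the alert output).
var imageArrayText = '[["./myDir/image1.jpg", "", ""],\n["./myDir/image2.jpg", "", ""]]';

// JSON.parse turns the text into a real nested array.
var imageArray = JSON.parse(imageArrayText);

console.log(Array.isArray(imageArray)); // true
console.log(imageArray[1][0]);          // ./myDir/image2.jpg
```

The parsed value can then be passed to the slide show as imagearray: imageArray.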
I'm trying to find a way of dynamically loading a Google visualisation API table, populated from a dynamic query onto a Google spreadsheet into a Blogger blogpost.
Unfortunately, the blog style sheet seems to trash the style of the table, so I thought I'd try to inject the dynamically loaded table into an iframe and isolate it from the host page:
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load("jquery", "1.3.2");
google.setOnLoadCallback(f1dj_iframeloader);
function f1dj_iframeloader(){
$(function() {var $frame = $('iframe');
setTimeout( function() {
var doc = $frame[0].contentWindow.document;
var $body = $('body',doc);
$body.html("<script type='text/javascript' src='http://www.google.com/jsapi'></script><script type='text/javascript'>var f1dj_sskey="tQQIIA7x9VuyVKE7UVdrytg";var f1dj_sheet=8;var f1dj_authkey='CITwr80K';google.load('visualization', '1', {'packages':['table']});function f1dj_getData(){var url='http://spreadsheets.google.com/tq?tq=select%20*&key='+f1dj_sskey+'&authkey='+f1dj_authkey+'&gid='+f1dj_sheet;var query = new google.visualization.Query(url); query.send(f1dj_displayTable);} function f1dj_displayTable(response){if (response.isError()) return;var data = response.getDataTable(); visualization = new google.visualization.Table(document.getElementById('f1dj__table'));visualization.draw(data, null);} google.setOnLoadCallback(f1dj_getData)</script><div id='f1dj__table'></div>");}, 1 );
});
}</script>
This seems to work okay in a simple HTML text page EXCEPT that:
1) in the test page, ");}, 1 );});} is also rendered on the page (so something's obviously not right...)
2) the Blogger HTML editor/parser throws a parse error and blocks the saving of the page (maybe the same issue as in 1)
Any ideas how to fix this? Is there maybe something obvious I've missed?:-(
Your quotes don't match up: the double quotes for f1dj_sskey=... are closing the string being passed to $body.html.
And then you've got "</script>" unencoded in the strings within your script tag, so the HTML parser thinks the script tag ends there.
You have to be careful with inline JS and should really HTML-encode it all...
This line is your problem:
$body.html("<script type='text/javascript' src='http://www.google.com/jsapi'></script><script type='text/javascript'>var f1dj_sskey="tQQIIA7x9VuyVKE7UVdrytg";var f1dj_sheet=8;var f1dj_authkey='CITwr80K';google.load('visualization', '1', {'packages':['table']});function f1dj_getData(){var url='http://spreadsheets.google.com/tq?tq=select%20*&key='+f1dj_sskey+'&authkey='+f1dj_authkey+'&gid='+f1dj_sheet;var query = new google.visualization.Query(url); query.send(f1dj_displayTable);} function f1dj_displayTable(response){if (response.isError()) return;var data = response.getDataTable(); visualization = new google.visualization.Table(document.getElementById('f1dj__table'));visualization.draw(data, null);} google.setOnLoadCallback(f1dj_getData)</script><div id='f1dj__table'></div>");}, 1 );
You are calling .html() with a string contained in double quotes (") but your string contains double quotes when you initialize the f1dj_sskey variable. This means that your string is getting closed early. You need to change the quotes in the string either to single quotes or you need to escape them.
Single quotes (change " to '):
$body.html("<script type='text/javascript' src='http://www.google.com/jsapi'></script><script type='text/javascript'>var f1dj_sskey='tQQIIA7x9VuyVKE7UVdrytg';var f1dj_sheet=8;var f1dj_authkey='CITwr80K';google.load('visualization', '1', {'packages':['table']});function f1dj_getData(){var url='http://spreadsheets.google.com/tq?tq=select%20*&key='+f1dj_sskey+'&authkey='+f1dj_authkey+'&gid='+f1dj_sheet;var query = new google.visualization.Query(url); query.send(f1dj_displayTable);} function f1dj_displayTable(response){if (response.isError()) return;var data = response.getDataTable(); visualization = new google.visualization.Table(document.getElementById('f1dj__table'));visualization.draw(data, null);} google.setOnLoadCallback(f1dj_getData)</script><div id='f1dj__table'></div>");}, 1 );
Escaping (change " to \"):
$body.html("<script type='text/javascript' src='http://www.google.com/jsapi'></script><script type='text/javascript'>var f1dj_sskey=\"tQQIIA7x9VuyVKE7UVdrytg\";var f1dj_sheet=8;var f1dj_authkey='CITwr80K';google.load('visualization', '1', {'packages':['table']});function f1dj_getData(){var url='http://spreadsheets.google.com/tq?tq=select%20*&key='+f1dj_sskey+'&authkey='+f1dj_authkey+'&gid='+f1dj_sheet;var query = new google.visualization.Query(url); query.send(f1dj_displayTable);} function f1dj_displayTable(response){if (response.isError()) return;var data = response.getDataTable(); visualization = new google.visualization.Table(document.getElementById('f1dj__table'));visualization.draw(data, null);} google.setOnLoadCallback(f1dj_getData)</script><div id='f1dj__table'></div>");}, 1 );
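The quote fix alone may not be enough when this code sits inline in an HTML page, because of the second point in the first answer: the HTML parser ends the outer script element at the first literal closing script tag it sees, even inside a JavaScript string. A common workaround is to write the closing tag as <\/script> in the string. The sketch below uses a shortened, hypothetical payload to show that the escaped form yields the same string at run time.

```javascript
// "\/" is just "/" to JavaScript, but the HTML parser no longer sees a
// literal closing script tag inside the inline script.
var closingTag = "<\/script>";

// Shortened, hypothetical payload; the real one would carry the full loader code.
var payload =
  "<script type='text/javascript'>var f1dj_sheet=8;" + closingTag +
  "<div id='f1dj__table'></div>";

console.log(closingTag === "<" + "/script>"); // true
console.log(closingTag.length);               // 9
```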
I am using gettext in my PHP code, but I have a big problem: none of my JavaScript files are affected by the translation. Can somebody tell me an easy way to get the translations in the chosen language into JavaScript as well?
The easiest way is having a PHP file write the translations from gettext into JavaScript variables.
js_lang.php:
word_hello = "<?php echo gettext("hello"); ?>"
word_world = "<?php echo gettext("world"); ?>"
word_how_are_you = "<?php echo gettext("how_are_you"); ?>"
and then include it:
<script type="text/javascript" src="js_lang.php"></script>
I would also recommend this method in conjunction with the translation plugins S.Mark mentions (which are very interesting!).
You can define the dictionary in the current page's header, too, without including an external file, but that way, you would have to look up and send the data on every page load - quite unnecessary, as a dictionary tends to change very rarely.
I generally export the translations in a JavaScript structure:
var app = {};
app.translations = {
en: {
hello: "Hello, World!",
bye: "Goodbye!"
},
nl: {
hello: "Hallo, Wereld!",
bye: "Tot ziens!"
}
};
The current language of the page texts can be defined using: <html xml:lang="en" lang="nl">
This can be read in JavaScript:
var currentLanguage = document.documentElement.lang || "en";
app.lang = app.translations[ currentLanguage ] || app.translations.en;
And then you can write code like this:
alert( app.lang.hello );
Optionally, an i18n() or gettext() function can add some intelligence, such as returning the default text if the key does not exist. For example:
function gettext( key )
{
return app.lang[ key ] || app.translations.en[ key ] || "{translation key not found: " + key + "}";
}
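Putting the pieces above together, here is a self-contained sketch of the fallback behaviour. The dictionary contents are made up for illustration (the nl entry deliberately lacks a bye key), and the language is hard-coded instead of being read from document.documentElement.lang so the snippet can run outside a browser.

```javascript
var app = {};
app.translations = {
  en: { hello: "Hello, World!", bye: "Goodbye!" },
  nl: { hello: "Hallo, Wereld!" } // "bye" intentionally missing
};

// Hard-coded for the sketch; in a page this would come from the <html lang> attribute.
var currentLanguage = "nl";
app.lang = app.translations[currentLanguage] || app.translations.en;

// Look in the current language first, then English, then report the missing key.
function gettext(key) {
  return app.lang[key] || app.translations.en[key] ||
    "{translation key not found: " + key + "}";
}

console.log(gettext("hello"));   // Hallo, Wereld!
console.log(gettext("bye"));     // Goodbye!
console.log(gettext("unknown")); // {translation key not found: unknown}
```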
Try jQuery i18n or jQuery localisation.
Here is an example for jQuery i18n; of course, you need to generate a JSON-based dictionary from the language file in PHP:
var my_dictionary = {
"some text" : "a translation",
"some more text" : "another translation"
}
$.i18n.setDictionary(my_dictionary);
$('div#example').text($.i18n._('some text'));
JSGettext (archived link) is the best implementation of the GNU gettext spec.
First download the JSGettext package and include it in your page:
/js/Gettext.js
<?php
$locale = "ja_JP.utf8";
if(isSet($_GET["locale"]))$locale = $_GET["locale"];
?>
<html>
<head>
<link rel="gettext" type="application/x-po" href="/locale/<?php echo $locale ?>/LC_MESSAGES/messages.po" />
<script type="text/javascript" src="/js/Gettext.js"></script>
<script type="text/javascript" src="/js/test.js"></script>
</head>
<body>
Test!
</body>
</html>
Example JavaScript code:
window.onload = function init(){
var gt = new Gettext({ 'domain' : 'messages' });
alert(gt.gettext('Hello world'));
}
This works fine without converting the .js file to .php.
You can make your life much easier if you get rid of the bad habit of using string literals in your code. That is, instead of
alert("Some message")
use
alert($("#some_message_id").text())
where "#some_message_id" is a hidden div or span generated on the server side.
As a further hint, there's a Perl script called po2json which will generate JSON from a .po file.
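po2json itself is a Perl script and its output format is not shown here. Just to illustrate the idea of turning .po entries into a JSON-friendly object, a rough sketch in JavaScript follows. poToDictionary is a hypothetical function, it only handles single-line msgid/msgstr pairs, and it assumes the quoted strings use escapes that JSON.parse also understands.

```javascript
// Rough sketch, not a full .po parser: ignores plurals, contexts,
// and multi-line strings. poToDictionary is a hypothetical name.
function poToDictionary(poText) {
  var dictionary = {};
  var currentId = null;
  poText.split(/\r?\n/).forEach(function (rawLine) {
    var line = rawLine.trim();
    var match;
    if ((match = /^msgid\s+(".*")$/.exec(line))) {
      // Assumption: the quoted .po string is also a valid JSON string.
      currentId = JSON.parse(match[1]);
    } else if ((match = /^msgstr\s+(".*")$/.exec(line)) && currentId !== null) {
      var translated = JSON.parse(match[1]);
      // Skip the header entry (empty msgid) and untranslated entries.
      if (currentId !== "" && translated !== "") {
        dictionary[currentId] = translated;
      }
      currentId = null;
    }
  });
  return dictionary;
}

var samplePo = 'msgid ""\nmsgstr ""\n\n# a comment\nmsgid "hello"\nmsgstr "Bonjour"\n\nmsgid "bye"\nmsgstr ""\n';
console.log(JSON.stringify(poToDictionary(samplePo))); // {"hello":"Bonjour"}
```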
For a JavaScript implementation of the GNU gettext API, these links may also be useful:
http://tnga.github.io/lib.ijs
http://tnga.github.io/lib.ijs/docs/iJS.Gettext.html
//set the locale in which the messages will be translated
iJS.i18n.setlocale("fr_FR.utf8") ;
//add domain where to find messages data. can also be in .json or .mo
iJS.i18n.bindtextdomain("domain_po", "./path_to_locale", "po") ;
//Always do this after a `setlocale` or a `bindtextdomain` call.
iJS.i18n.try_load_lang() ; //will load and parse messages data from the setting catalog.
//now print your messages
alert( iJS.i18n.gettext("messages to be translated") ) ;
//or use the common way to print your messages
alert( iJS._("another way to get translated messages") ) ;
This library seems to be the best implementation of gettext in JavaScript:
http://messageformat.github.io/Jed/
https://github.com/messageformat/Jed
Example from the documentation:
<script src="jed.js"></script>
<script>
var i18n = new Jed({
// Generally output by a .po file conversion
locale_data : {
"messages" : {
"" : {
"domain" : "messages",
"lang" : "en",
"plural_forms" : "nplurals=2; plural=(n != 1);"
},
"some key" : [ "some value"]
}
},
"domain" : "messages"
});
alert( i18n.gettext( "some key" ) ); // alerts "some value"
</script>