Scrape Content from table generated by js - javascript

Using Python, I want to extract content from a table on a website. The table appears to be populated after the page loads. Some excerpts from the page's HTML:
I believe this displays the table:
<table class="table table-condensed table-bordered hoverRowHand" id="contentTable">
<thead>
<tr>
<th width="5%"> </th>
<th width="10%">Date Uploaded</th>
<th width="10%">Survey ID</th>
<th width="5%">Airport ID</th>
<th width="20%">Airport</th>
<th width="10%">Survey Type</th>
</tr>
</thead>
<tbody>
</tbody>
</table>
And I believe running these scripts generates the contents for the table:
<script src="/global/js/nfdc-suite.js"></script>
<script src="/global/bootstrap-3.3.4/js/bootstrap.min.js"></script>
<script src="/global/js/date.js"></script>
<script src="/global/js/spin.min.js"></script>
<script src="/global/js/jquery.twbsPagination.js"></script>
<script src="/nfdcApps/include/main.js"></script>
<script src="/nfdcApps/services/publicData/uddfList-dt.js"></script>
<script src="/global/js/nfdcPublicDataTable.js"></script>
Simply fetching the page's HTML does not provide any of the information I am looking for, presumably because the scripts that generate it have to run first.
Is this feasible with Python, or is it unreasonable to attempt?
Side note: if I use the "Inspect element" tool in Chrome, I can see the content I am looking for.

Related

download large records using jquery or some better alternative

My user interface has an Advanced Search section where the user selects 7 different parameters (shown in the diagram below). Based on these parameters, a web service is called. The web service returns the search results as JSON, and those records are displayed in a table in the UI.
Something like this (using jQWidgets).
Below the table where the records are displayed there is an Export to Excel button, just as in the JSFiddle above.
If the records have to be downloaded in Excel/CSV format as soon as the user clicks the download button, is there something better than the jQWidgets approach used in the JSFiddle?
The reason I want to move away from jQWidgets is that they ask me to supply a URL hosted on my RHEL server, which is causing me problems with setting up a virtual host on Apache and so on. They ask for this because there are around 87,000 records or more, and to handle that load their solution has to run on my server.
Here is more information about my UI (diagram below):
At first, the user sees only the Advanced Search section, down to the Search and Clear buttons. After selecting values from the drop-down boxes, the user clicks the Search button, which calls a search web service in the backend. A new table then appears with the search results (81702), as shown in the diagram above.
Below the table is the Export to Excel button. That is where I am having issues: the result set is large, and the jQWidgets component I am using cannot handle more than 600 records, which is why they are asking me to host their source file on my server.
To manage tabular data, I suggest using DataTables. It is easy to set up, comes with a clean default style, and has many useful controls: searching, paging, sorting, filtering, and exporting (Excel, CSV, PDF).
These are the libraries you will want to add:
CSS:
<link rel="stylesheet" href="https://cdn.datatables.net/1.10.21/css/jquery.dataTables.min.css" />
<link rel="stylesheet" href="https://cdn.datatables.net/buttons/1.6.2/css/buttons.dataTables.min.css" />
Scripts:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="https://cdn.datatables.net/1.10.21/js/jquery.dataTables.min.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.2/js/dataTables.buttons.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jszip/3.1.3/jszip.min.js"></script>
<script src="https://cdn.datatables.net/buttons/1.6.2/js/buttons.html5.min.js"></script>
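<script>
// Initialise once the DOM is ready so the table element already exists.
$(function () {
  $('table').DataTable({
    // B = buttons, f = filter box, r = processing, t = table, i = info, p = pagination
    dom: 'Bfrtip',
    buttons: [
      'excelHtml5', // Excel export (needs JSZip, loaded above)
      'csvHtml5'    // CSV export
    ]
  });
});
</script>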
<table class="table">
<thead>
<tr>
<th scope="col">#</th>
<th scope="col">First</th>
<th scope="col">Last</th>
<th scope="col">Handle</th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row">1</th>
<td>Mark</td>
<td>Otto</td>
<td>#mdo</td>
</tr>
<tr>
<th scope="row">2</th>
<td>Jacob</td>
<td>Thornton</td>
<td>#fat</td>
</tr>
<tr>
<th scope="row">3</th>
<td>Larry</td>
<td>the Bird</td>
<td>#twitter</td>
</tr>
</tbody>
</table>
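If the search web service already returns all records as JSON, the CSV can also be built in the browser without any table widget. The following is a minimal sketch under that assumption; `toCsv` and `downloadCsv` are hypothetical helper names, not part of DataTables or jQWidgets, and the sample records are made up for illustration.

```javascript
// Hypothetical helper: turn an array of record objects into CSV text.
function toCsv(records, columns) {
  // Quote a field when it contains a comma, a double quote, or a line break,
  // doubling any embedded double quotes (the usual CSV convention).
  const quoteField = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const header = columns.map(quoteField).join(',');
  const lines = records.map((rec) => columns.map((col) => quoteField(rec[col])).join(','));
  return [header, ...lines].join('\r\n');
}

// Browser-only (assumption: called from a click handler): offer the CSV as a file.
function downloadCsv(csvText, filename) {
  const blob = new Blob([csvText], { type: 'text/csv;charset=utf-8' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = filename;
  link.click();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(link.href), 0);
}

// Made-up sample records standing in for the web service response.
const sample = [
  { id: 1, first: 'Mark', last: 'Otto' },
  { id: 3, first: 'Larry', last: 'the "Bird", Jr.' },
];
const csv = toCsv(sample, ['id', 'first', 'last']);
console.log(csv);
```

Because the CSV is assembled from the data rather than from rendered table rows, the export would not depend on how many rows the table widget can display at once. Whether the browser copes comfortably with the full result set is something to measure rather than assume.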

How to set default sort order on angular smart table?

I have an AngularJS smart-table with a sort definition (including direction) set for each column.
I want to ensure that the original sort direction is preserved every time the page loads, ideally controlled by a variable. With some pagination code I am writing, I have noticed that selecting page 2 sometimes changes the sort order, so the second page does not follow on from page 1.
How can I keep control over which column and direction are being sorted, and maintain that state?
<table st-table="displayedCollection" st-pipe="getModelRuns" st-sort="EXECUTIONPOOL" st-sort-default="reverse" st-safe-src="rowCollection" class="table table-condensed" ng-show="displayedCollection.length > 0">
<thead>
<tr>
<th st-sort="NAME" st-skip-natural="true"><i ng-class="getSortIcon('JobName')"></i> <a data-ng-href="#" style="white-space: nowrap">Job Name</a></th>
<th st-sort="CLOUDJOBNAME" st-skip-natural="true"><i ng-class="getSortIcon('CloudJobName')"></i> <a data-ng-href="#" style="white-space: nowrap">Cloud Job Name</a></th>
<th st-sort="MODELVERSION" st-skip-natural="true"><i ng-class="getSortIcon('ModelVersion')"></i> Model Version</th>
<th st-sort="RUNSTATUS" st-skip-natural="true"><i ng-class="getSortIcon('Status')"></i> Status</th>
<th st-sort="REQUESTEDAT" st-skip-natural="true" st-sort-default="reverse"><i ng-class="getSortIcon('Requested')"></i> Requested At</th>
<th st-sort="STARTEDAT" st-skip-natural="true" st-sort-default="reverse"><i ng-class="getSortIcon('Started')"></i> Started At</th>
<th st-sort="FINISHEDAT" st-skip-natural="true" st-sort-default="reverse"><i ng-class="getSortIcon('Finished')"></i> Finished At</th>
<th st-sort="REQUESTEDBY" st-skip-natural="true"><i ng-class="getSortIcon('RequestedBy')"></i> Requested By</th>
<th st-sort="EXECUTIONPOOL" st-skip-natural="true"><i ng-class="getSortIcon('ExecutionPool')"></i> Executed on Pool</th>
<th />
</tr>
</thead>
</table>
I think you are asking for a default ascending order. You can set that with
st-sort-default="true"
but you have set it to reverse, so the table is probably sorted in descending order for you.
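With `st-pipe`, smart-table hands the current table state to the pipe function on every sort, search, or page change, so the pipe function is one place where a default sort could be enforced. A minimal sketch, assuming the state carries `sort.predicate` and `sort.reverse`; `withDefaultSort` and `DEFAULT_SORT` are hypothetical names chosen for illustration:

```javascript
// Hypothetical helper: fall back to a fixed column and direction
// whenever the table state has no sort column set.
function withDefaultSort(tableState, defaults) {
  const sort = tableState.sort || {};
  if (!sort.predicate) {
    // No column chosen yet: apply the default column and direction.
    tableState.sort = { predicate: defaults.predicate, reverse: defaults.reverse };
  }
  return tableState;
}

// Assumed default: newest requests first.
const DEFAULT_SORT = { predicate: 'REQUESTEDAT', reverse: true };

// First load: nothing is sorted yet, so the default is applied.
const firstLoad = withDefaultSort(
  { sort: {}, pagination: { start: 0, number: 10 } },
  DEFAULT_SORT
);
console.log(firstLoad.sort);

// The user clicked a column: that choice is left untouched.
const userSorted = withDefaultSort(
  { sort: { predicate: 'NAME', reverse: false }, pagination: { start: 10, number: 10 } },
  DEFAULT_SORT
);
console.log(userSorted.sort);
```

Calling something like this at the top of the pipe function (`getModelRuns` in the question), before the server request is built, would mean page 2 is requested with the same sort as page 1.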

Reload bootstrap table Refresh and change settings

I'm trying to dynamically change the configuration of a bootstrap table, based on user settings stored in localStorage, and then reload it.
The table is filled at page load, but events fired inside the page can change its configuration based on user settings.
I looked at the official documentation and at some questions on SO, and tried the following:
var $table = $('#grd-fatura');
$table.bootstrapTable('refresh', {
pageSize: 2
});
(The localStorage.getItem block and the rules were removed from the code above for testing purposes.)
And here is my table markup (note that some elements are in Portuguese):
<table
id="grd-fatura" data-click-to-select="true" data-response-handler="responseHandler"
data-pagination="true" #*data-height="460"*# data-show-footer="true" data-search="true"
data-show-columns="true" data-cache="false" data-show-toggle="true" data-show-export="true">
<thead>
<tr>
<th data-field="state" data-checkbox="true"></th>
<th data-field="ID" data-unique-id="ID" data-align="left" data-visible="false" data-sortable="true">Código</th>
<th data-field="estabelecimento" data-align="left" data-sortable="true">Estabelecimento</th>
<th data-field="historico" data-align="left" data-sortable="true">Histórico</th>
<th data-field="tipo" data-align="left" data-sortable="true">Tipo</th>
<th data-field="pessoa" data-align="left" data-sortable="true">Pessoa</th>
<th data-field="emissao" data-align="center" data-sortable="true">Emissão</th>
<th data-field="referencia" data-align="left" data-sortable="true">Referência</th>
<th data-field="situacao" data-align="left" data-sortable="true">Situação</th>
<th data-field="total" data-align="right" data-sortable="true" data-footer-formatter="Totalizador">Total</th>
</tr>
</thead>
</table>
The code does not work: the table does not refresh, and no errors are raised. I looked at some examples and could not find where I am going wrong. So how do I dynamically change the Bootstrap table settings? What am I doing wrong in the code above?
Note: I am using Bootstrap 3.
Try this:
$table.bootstrapTable('refresh', {
query: {pageSize: 2}
});
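If the goal is to change table options rather than to reload data, bootstrap-table also provides a `refreshOptions` method that re-initialises the table with new options. A minimal sketch, assuming the user's settings are stored as JSON under a hypothetical `grd-fatura-settings` key; `readTableOptions` and `applyUserSettings` are illustrative names, not part of the library:

```javascript
// Hypothetical helper: read saved table options from a Storage-like object
// (anything with a getItem(key) method, such as window.localStorage).
function readTableOptions(storage, key) {
  const raw = storage.getItem(key);
  if (!raw) return {};
  try {
    const saved = JSON.parse(raw);
    const options = {};
    // Only pass through values that look valid.
    if (Number.isInteger(saved.pageSize) && saved.pageSize > 0) {
      options.pageSize = saved.pageSize;
    }
    return options;
  } catch (err) {
    return {}; // Corrupt or unexpected JSON: fall back to the table's defaults.
  }
}

// Browser-only: apply the saved options to an initialised bootstrap table.
function applyUserSettings($table, storage) {
  $table.bootstrapTable('refreshOptions', readTableOptions(storage, 'grd-fatura-settings'));
}

// Stand-in for localStorage so the helper can be tried outside a browser.
const fakeStorage = {
  data: { 'grd-fatura-settings': '{"pageSize": 2}' },
  getItem(key) {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  },
};
const options = readTableOptions(fakeStorage, 'grd-fatura-settings');
console.log(options);
```

In the page this would be called as `applyUserSettings($('#grd-fatura'), window.localStorage)` from whichever event changes the user's settings.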

Access External webpage HTML table without table ID from JavaScript and convert to JSON

I have written a script that converts HTML table data to a JSON object.
For this task I used the lightswitch05 jQuery plugin.
With this code I can access the data of an HTML table on the same web page from JavaScript
using
var table = $('#example-table').tableToJSON();
but I need to access an HTML table on an external web page.
The table URL is http://ccmcwolf.byethost4.com/index.html
How can I change the "#example-table" selector above so that it points to the table on that external page?
<!DOCTYPE html>
<html>
<head>
<script
src="http://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js">
</script>
<script
src="http://lightswitch05.github.io/table-to-json/javascripts/jquery.tabletojson.min.js">
</script>
<script>
function myFunction() {
var table = $('#example-table').tableToJSON();
console.log(table);
alert(JSON.stringify(table));
}
$(document).ready(myFunction);
</script>
</head>
<body>
<table id='example-table' class="table table-striped">
<thead>
<tr>
<th>First Name</th>
<th>Last Name</th>
<th data-override="Score">Points</th></tr>
</thead>
<tbody>
<tr>
<td>Jill</td>
<td>Smith</td>
<td data-override="disqualified">50</td></tr>
<tr>
<td>Eve</td>
<td>Jackson</td>
<td>94</td></tr>
<tr>
<td>John</td>
<td>Doe</td>
<td>80</td></tr>
<tr>
<td>Adam</td>
<td>Johnson</td>
<td>67</td></tr>
</tbody>
</table>
</body>
</html>
Thank you very much!
Since JavaScript is scoped to the current page (it can only access and modify the content of the page in which it is embedded or linked via a script tag), I don't think there is a direct way of doing this.
But if the external page is on the same domain as the page running your JavaScript, you can fetch its HTML with an Ajax call and then run your code on that HTML to extract the table as JSON.
Please let me know if I am missing something, and I will edit my answer based on your input.
Thank you.
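For the same-domain case described above, the fetch-then-extract step could look roughly like this. It is a sketch only: `rowsToObjects` and `fetchFirstTable` are hypothetical helpers standing in for the plugin's `tableToJSON`, and the browser part relies on `fetch` and `DOMParser`, so only the conversion helper runs outside a browser.

```javascript
// Pair each row's cell texts with the header names.
function rowsToObjects(headers, rows) {
  return rows.map((cells) => {
    const record = {};
    headers.forEach((name, i) => {
      record[name] = cells[i] !== undefined ? cells[i] : null;
    });
    return record;
  });
}

// Browser-only: fetch a same-origin page and convert its first table.
// No table id is needed because the table is selected by tag name.
async function fetchFirstTable(url) {
  const response = await fetch(url);
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');
  const table = doc.querySelector('table');
  if (!table) return [];
  const cellText = (cell) => cell.textContent.trim();
  const headers = Array.from(table.querySelectorAll('thead th'), cellText);
  const rows = Array.from(table.querySelectorAll('tbody tr'), (tr) =>
    Array.from(tr.querySelectorAll('td'), cellText)
  );
  return rowsToObjects(headers, rows);
}

// The conversion step on its own, using two rows from the example table.
const result = rowsToObjects(
  ['First Name', 'Last Name', 'Points'],
  [['Jill', 'Smith', '50'], ['Eve', 'Jackson', '94']]
);
console.log(JSON.stringify(result));
```

In the page, `fetchFirstTable('/index.html').then(console.log)` would log the converted rows, provided the request is allowed by the browser's same-origin rules.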

Common mustache template for JS and PHP

I am using Mustache in both JS and PHP. I have created a template for JS, and now I want to use that template in PHP. Is it possible to reuse it?
For reference, see the code below:
JS template:
<table class="table table-bordered table-hover table-striped tablesorter">
<thead>
<tr>
<th class="header">Name</th>
<th class="header">Email ID</th>
<th class="header">Contact Number</th>
<th class="header">Edit</th>
</tr>
</thead>
<tbody>
<div id="eTableList"></div>
<script id="eList" type="text/template">
<tr>
<td>{{name}}</td>
<td>{{email}}</td>
<td>{{contact_number}}</td>
<td>View/Edit</td>
</tr>
</script>
</tbody>
</table>
PHP template:
<table class="table table-bordered table-hover table-striped tablesorter">
<thead>
<tr>
<th class="header">Name</th>
<th class="header">Email ID</th>
<th class="header">Contact Number</th>
<th class="header">Edit</th>
</tr>
</thead>
<tbody id="eTableList">
<tr>
<td>{{name}}</td>
<td>{{email}}</td>
<td>{{contact_number}}</td>
<td>View/Edit</td>
</tr>
</tbody>
</table>
Both templates produce the same output, but I call one or the other from PHP or JS as needed.
So is there a way to use a single template, instead of the two above, for both the JS and the PHP calls?
Yes, you can, as long as the template's location is accessible to PHP on the server side and to JavaScript on the client side.
You could keep the template file in a subfolder of your public directory, for example:
public
-- index.php
-- assets
-- -- img (your client side images)
-- -- css (your styles)
-- -- templates
-- -- -- my_template.tpl
-- -- js (your javascript)
Then, when you use it in a JS script, load it from the relative path ../templates/my_template.tpl, and in PHP load it from an absolute path, something like MY_ROOT_APP_CONSTANT . "assets/templates/my_template.tpl".
Hope this helps.
Best regards
