Getting data from Web Pages using Jsoup Java - javascript

This is my first question on this site, and I hope to stay longer :=)
I have read a lot of articles and examined many examples of extracting specific data from a web site using Jsoup. I have already managed to get some values, but I couldn't reach my goal, which is to read alarm states from a web server so that I can collect them and send them to a technician.
Unfortunately, I don't know the hierarchy. Can anyone tell me how to read the value outlined with the red square? I hope I have explained clearly what I need.
Thanks in advance
public static void main(String[] args) throws IOException {
File htmlFile = new File("http://162.196.43.36");
Document doc = Jsoup.parse(htmlFile, "UTF-8");
// First <div> element has class ="related-container"
Element div = doc.select("td.imgstatus").first();
System.out.println(div);
}

public static void mainjdk7(String ... args){
Connection connect = Jsoup.connect("http://www.yahoo.com");
try {
Document dom = connect.get();
dom.getElementsByTag("section").forEach(new Consumer<Element>() {
@Override
public void accept(Element element) {
Elements imgstatus = element.getElementsByClass("imgstatus");
if(!imgstatus.isEmpty()){ // getElementsByClass never returns null, so test for emptiness
//Do something
}
}
});
} catch (IOException e) {
e.printStackTrace();
}
}
public static void mainjdk8(String ... args){
Connection connect = Jsoup.connect("http://www.yahoo.com");
try {
Document dom = connect.get();
dom.getElementsByTag("section").forEach(element -> {
Elements imgstatus = element.getElementsByClass("imgstatus");
if(!imgstatus.isEmpty()){ // getElementsByClass never returns null, so test for emptiness
//Do something
}
});
} catch (IOException e) {
e.printStackTrace();
}
}
Hope this works for you. Happy coding :)

Related

CefSharp EvaluateScriptAsync - return result in a variable using visual basic language

I hope someone can help. I am trying to replicate this code in Visual Basic, but I can't find a working example anywhere on the web =( I managed to execute JavaScript code to press a button in the browser, but had no luck getting the innerText from a td.
Thanks in advance.
private void button1_Click(object sender, EventArgs e)
{
object EvaluateJavaScriptResult;
var frame = chromeBrowser.GetMainFrame();
var task = frame.EvaluateScriptAsync("(function() { return document.getElementsByClassName('mw-headline')[0].value; })();", null);
task.ContinueWith(t =>
{
if (!t.IsFaulted)
{
var response = t.Result;
EvaluateJavaScriptResult = response.Success ? (response.Result ?? "null") : response.Message;
MessageBox.Show(EvaluateJavaScriptResult.ToString());
}
}, TaskScheduler.FromCurrentSynchronizationContext());
}

How to block a webpage advertisement when it is inside android studio Webview? [duplicate]

I want to implement a mechanism in a custom WebView client (without JavaScript injection) that can block ads. Is there a way I can catch ads and replace them with other ads from a trusted source?
Thanks
In your custom WebViewClient, you can override the function shouldInterceptRequest(WebView, WebResourceRequest).
From Android docs:
Notify the host application of a resource request and allow the application to return the data.
So the general idea is to check whether the request is going to an ad URL (there are plenty of blacklist filters out there), and if so return a "fake" resource that isn't the ad.
For a more in-depth explanation plus an example, I recommend checking out this blog post.
To implement this, you have two options:
Use injected JavaScript code to do this (which you explicitly said you don't want).
In the WebView, instead of "http://example.com" load "http://myproxy.com?t=http://example.com" (properly escaped, of course) and set up "myproxy.com" as a proxy which fetches the upstream page (given in the "t" query parameter, or in any other way) and replaces ads with the trusted ones before sending the response to the client. This will be pretty complex, though, because ads come in many forms, they are usually injected by JavaScript themselves, and you would probably need to rewrite a lot of URLs in the fetched HTML, CSS and JS files.
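The "properly escaped" part of the proxy option can be sketched as follows. This is only an illustration: the class and method names are made up, "myproxy.com" and the "t" parameter are taken from the example above, and the proxy itself is not shown.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ProxyUrlBuilder {

    /**
     * Builds the URL to load in the WebView: the proxy base plus the
     * target URL, percent-encoded so that its own '?', '&' and '='
     * are not mistaken for parts of the proxy's query string.
     */
    public static String viaProxy(String proxyBase, String targetUrl) {
        try {
            return proxyBase + "?t=" + URLEncoder.encode(targetUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaProxy("http://myproxy.com", "http://example.com"));
    }
}
```

The proxy then has to decode the "t" value again before fetching the upstream page.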
I made a custom WebViewClient like:
public class MyWebViewClient extends WebViewClient {
@Override
public void onPageFinished(WebView view, String url) { }
@Override
public boolean shouldOverrideUrlLoading(WebView view, String url) {
if (url.endsWith(".mp4")) {
Intent intent = new Intent(Intent.ACTION_VIEW);
intent.setDataAndType(Uri.parse(url), "video/*");
intent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
view.getContext().startActivity(intent);
return true;
} else if (url.startsWith("tel:") || url.startsWith("sms:") || url.startsWith("smsto:")
|| url.startsWith("mms:") || url.startsWith("mmsto:")) {
Intent intent = new Intent(Intent.ACTION_VIEW, Uri.parse(url));
intent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
view.getContext().startActivity(intent);
return true;
} else {
return super.shouldOverrideUrlLoading(view, url);
}
}
private Map<String, Boolean> loadedUrls = new HashMap<>();
@SuppressWarnings("deprecation")
@Override
public WebResourceResponse shouldInterceptRequest(WebView view, String url) {
boolean ad;
if (!loadedUrls.containsKey(url)) {
ad = AdBlocker.isAd(url);
loadedUrls.put(url, ad);
} else {
ad = loadedUrls.get(url);
}
return ad ? AdBlocker.createEmptyResource() :
super.shouldInterceptRequest(view, url);
}
}
And created an AdBlocker class like:
public class AdBlocker {
private static final Set<String> AD_HOSTS = new HashSet<>();
public static boolean isAd(String url) {
try {
return isAdHost(getHost(url));
} catch (MalformedURLException e) {
Log.e("Devangi..", e.toString());
return false;
}
}
private static boolean isAdHost(String host) {
if (TextUtils.isEmpty(host)) {
return false;
}
int index = host.indexOf(".");
return index >= 0 && (AD_HOSTS.contains(host) ||
index + 1 < host.length() && isAdHost(host.substring(index + 1)));
}
public static WebResourceResponse createEmptyResource() {
return new WebResourceResponse("text/plain", "utf-8", new ByteArrayInputStream("".getBytes()));
}
public static String getHost(String url) throws MalformedURLException {
return new URL(url).getHost();
}
}
And use this WebViewClient in your onCreate like:
webview.setWebViewClient(new MyWebViewClient());
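Note that AD_HOSTS is never filled in the code as posted, so isAd would return false for every URL until hosts are added to it. A minimal sketch of one way to load such a set in plain Java, assuming a hypothetical blocklist with one host per line and '#' comment lines (the class name and file format are assumptions, not part of the answer):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class AdHostList {
    private final Set<String> hosts = new HashSet<>();

    /** Reads one host per line; blank lines and lines starting with '#' are skipped. */
    public static AdHostList load(Reader source) {
        AdHostList list = new AdHostList();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String host = line.trim().toLowerCase(Locale.ROOT);
                if (!host.isEmpty() && !host.startsWith("#")) {
                    list.hosts.add(host);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return list;
    }

    /** True if the host or one of its parent domains is listed (same idea as isAdHost). */
    public boolean contains(String host) {
        String candidate = host == null ? "" : host.toLowerCase(Locale.ROOT);
        while (!candidate.isEmpty()) {
            if (hosts.contains(candidate)) {
                return true;
            }
            int dot = candidate.indexOf('.');
            if (dot < 0) {
                return false;
            }
            // Drop the leftmost label and try the parent domain.
            candidate = candidate.substring(dot + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical entries, for illustration only.
        String sample = "# sample blocklist\nads.example.com\n\ntracker.example.net\n";
        AdHostList list = load(new StringReader(sample));
        System.out.println(list.contains("cdn.ads.example.com"));
        System.out.println(list.contains("www.example.com"));
    }
}
```

On Android the Reader could wrap a file shipped with the app, and loading should happen once, before the first request is intercepted.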

Android WebView not returning desired HTML

So, a quick overview of what I'm doing:
I am using an Android WebView to render JavaScript and then reading the resulting HTML in order to parse it.
I am currently having trouble with retrieving the HTML from a website called Sport Chek.
Here is the code for my SportChekSearch class:
public class SportChekSearch extends SearchQuery{
public Elements finalDoc;
private ArrayList<Item> processed;
private final Handler uiHandler = new Handler();
public int status = 0;
//This basically is just so that the class knows which Activity we're working with
private Context c;
protected class JSHtmlInterface {
@android.webkit.JavascriptInterface
public void showHTML(String html) {
final String htmlContent = html;
uiHandler.post(
new Runnable() {
@Override
public void run() {
Document doc = Jsoup.parse(htmlContent);
}
}
);
}
}
/**
* Constructor method
* @param context The context taken from the webview (So that the asynctask can show progress)
*/
public SportChekSearch(Context context, String query) {
final Context c = context;
try {
final WebView browser = new WebView(c);
browser.setVisibility(View.INVISIBLE);
browser.setLayerType(View.LAYER_TYPE_NONE, null);
browser.getSettings().setJavaScriptEnabled(true);
browser.getSettings().setBlockNetworkImage(true);
browser.getSettings().setDomStorageEnabled(true);
browser.getSettings().setCacheMode(WebSettings.LOAD_NO_CACHE);
browser.getSettings().setLoadsImagesAutomatically(false);
browser.getSettings().setGeolocationEnabled(false);
browser.getSettings().setSupportZoom(false);
browser.getSettings().setUserAgentString("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36");
browser.addJavascriptInterface(new JSHtmlInterface(), "JSBridge");
browser.setWebViewClient(
new WebViewClient() {
@Override
public void onPageStarted(WebView view, String url, Bitmap favicon) {
super.onPageStarted(view, url, favicon);
}
@Override
public void onPageFinished(WebView view, String url) {
browser.loadUrl("javascript:window.JSBridge.showHTML('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');");
}
}
);
browser.loadUrl("https://www.sportchek.ca/search.html#q=" + query.replaceAll(" ", "+") + "&lastVisibleProductNumber=3");
browser.loadUrl(browser.getUrl());
final String link = browser.getUrl();
new fetcher(c).execute(link);
}
catch(Exception e){
e.printStackTrace();
}
//Get the link from the WebView, and save it in a final string so it can be accessed from worker thread
}
/**
* This subclass is a worker thread meaning it does work in the background while the user interface is doing something else
* This is done to prevent "lag".
* To call this class you must write fetcher(Context c).execute(The link you want to connect to)
*
*/
class fetcher extends AsyncTask<String, Void, Elements> {
Context mContext;
ProgressDialog pdialog;
public fetcher(Context context) {
mContext = context;
}
@Override
protected void onPreExecute() {
super.onPreExecute();
pdialog = new ProgressDialog(mContext);
pdialog.setTitle(R.string.finding_results);
pdialog.setCancelable(false);
pdialog.show();
}
//This return elements because the postExecute() method needs an Elements object to parse its results
@Override
protected Elements doInBackground(String... strings) {
//You can pass in multiple strings, so this line just says to use the first string
String link = strings[0];
//For Debug Purposes, Do NOT Remove - **Important**
System.out.println("Connecting to: " + link);
try {
Document doc = Jsoup.connect(link)
.ignoreContentType(true)
.userAgent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36")
.timeout(10000)
.get();
finalDoc = doc.select("body section.product-grid-wrapper");
System.out.println(finalDoc.toString());
} catch (IOException e) {
e.printStackTrace();
}
return finalDoc;
}
@Override
protected void onPostExecute(Elements result) {
//This line clears the list of info in the Search activity
//I should probably be using a getter method but adapter is a static variable so it shouldn't matter
//parse separates document into elements
//crunch results formats those elements into item objects
//I am saving the result of this to an ArrayList<Item> called "processed"
processed = crunchResults(result);
//For debug purposes, do NOT remove - **Important**
System.out.println(processed.size() + " results have been crunched by Sport Chek.");
//Adds all of the processed results to the list of info in Search activity
ClothingSearch.adapter.addAll(processed);
//For debug purposes, do NOT remove - **Important**
System.out.println("Adapter has been notified by Sport Chek.");
//Closes the progress dialog called pdialog assigned to the AsyncTask
pdialog.dismiss();
ClothingSearch.adapter.notifyDataSetChanged();
SearchQueueHandler.makeRequest(mContext, processed, SearchQueueHandler.CLOTHING_SEARCH);
}
}
public ArrayList<Item> crunchResults(Elements e){
ArrayList<Item> results = new ArrayList<Item>();
try {
for (int i = 0; i < e.size(); i++) {
Element ele = e.get(i);
String link = "https://www.sportchek.ca" + ele.select(" a.product-grid__link").attr("href");
System.out.println("https://www.sportchek.ca" + ele.select(" a.product-grid__link").attr("href"));
String title = ele.select(" span.product-title-text").text();
String pricestring = ele.select(" span.product-price__wrap").text();
//Skip the "$" itself, otherwise parseDouble throws a NumberFormatException
double price = Double.parseDouble(pricestring.substring(pricestring.lastIndexOf("$") + 1));
System.out.println(pricestring);
//*******************************************
String store = "Sport Chek";
//Adds the formatted item to an ArrayList of items
results.add(new Item(title, store, price, link));
//Prints the object's to String to console
//For debug purposes, do NOT remove - **Important
System.out.println(results.get(i).toString());
}
} catch (Exception a){
a.printStackTrace();
}
return results;
}
public int getStatus(){
return status;
}
}
The two relevant methods are doInBackground in my AsyncTask and the crunchResults method.
Here is the result I get from using Ctrl+Shift+I on the actual website (Desired Result):
But when running the above code and using a println here is the result that I get for the tag section class="product-grid-wrapper" :
<section class="product-grid-wrapper">
<ul data-module-type="SearchProductGrid" class="product-grid__list product-grid__list_quickview">
<!-- #product-grid__item-template -->
</ul>
</section>
Can anyone help me figure out why I am not getting my desired result?
All help is appreciated
EDIT: for this specific search that the println data was collected from, the link was https://www.sportchek.ca/search.html#q=men+coat&lastVisibleProductNumber=3
It looks like what you are actually getting is the actual html sent by the server, and that your 'desired result' is what the DOM looks like after the JavaScript runs.
Your 'actual' is what I see if I use "View Source" in Chrome, while your "desired result" is what I see if I use Chrome's DOM inspector.
On further inspection, I see that you are not actually getting the HTML from the WebView; you are (indirectly) using Jsoup's Connection object to fetch the HTML directly. Unfortunately, that is not going to run the JavaScript.
Instead, you will have to get the HTML from the WebView after the JavaScript runs. For a possible way to do that, see How do I get the web page contents from a WebView?
Then pass the HTML you get from that to Jsoup with
Jsoup.parse(html);
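One detail that fits this explanation: in the link from the question, the search terms sit after the "#", i.e. in the URL fragment. A fragment is handled by the client and is not sent to the server as part of an HTTP request, so a direct fetch asks for the same /search.html whatever the search terms are. A small plain-Java sketch (the class and method names are made up for illustration) that splits the URL accordingly:

```java
import java.net.URI;

public class FragmentDemo {

    /** The part of the URL a client actually requests from the server: path plus query. */
    public static String requestTarget(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    /** The part after '#', which stays on the client side. */
    public static String fragment(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "https://www.sportchek.ca/search.html#q=men+coat&lastVisibleProductNumber=3";
        System.out.println(requestTarget(url)); // /search.html
        System.out.println(fragment(url));      // q=men+coat&lastVisibleProductNumber=3
    }
}
```

The page's own JavaScript reads the fragment and fills in the product grid, which is why the grid is only present in the DOM after the scripts have run.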

Wicket AbstractDefaultAjaxBehavior do recursive update the page

I have an Ajax behaviour that should pick up some data using JS and pass it back to Java. Sometimes it works, but quite often it just adds a URL parameter and refreshes the page.
public abstract class LoggedVKIdBehaviour extends AbstractDefaultAjaxBehavior {
private static final Logger logger = LoggerFactory.getLogger(LoggedVKIdBehaviour.class);
@Override
protected void respond(AjaxRequestTarget target) {
String loggedVkId = RequestCycle.get().getRequest().getRequestParameters().getParameterValue("logged_vkid").toString();
logger.info("ajax has comming with logged VK ID " + loggedVkId);
recived(target, loggedVkId);
}
protected abstract void recived(AjaxRequestTarget target, String loggedVkId);
@Override
public void renderHead(final Component component, IHeaderResponse response) {
super.renderHead(component, response);
Map<String, Object> map = new HashMap<>();
map.put("callbackFunction", getCallbackFunction(CallbackParameter.explicit("logged_vkid")));
//
PackageTextTemplate ptt = new PackageTextTemplate(LoggedVKIdBehaviour.class, "vkid_callback.js");
OnDomReadyHeaderItem onDomReadyHeaderItem = OnDomReadyHeaderItem.forScript(ptt.asString(map));
response.render(onDomReadyHeaderItem);
}
}
The JS template:
var calback = ${callbackFunction};
var logged_vk_id = 11;
function authInfo(response) {
if (response.session) {
logged_vk_id = response.session.mid;
calback(response.session.mid);
console.log("recived callback from VK " + logged_vk_id);
}
}
$(document).ready(function () {
VK.Auth.getLoginStatus(authInfo);
});
It causes a recursive redirection like http://localhost:8080/mytool/product/1?logged_vkid=332797331&logged_vkid=332797331&logged_vkid=332797331&logged_vkid=332797331&logged_vkid=332773...
As I understand Ajax technology, these are asynchronous requests that shouldn't touch the main URL at all. So what is the reason for the page refreshing?
This is the generated callback function:
function (logged_vkid) {
var attrs = {"u":"../wicket/bookmarkable/com.tac.kulik.pages.product.ProductPage?12-1.IBehaviorListener.0-&productID=1"};
var params = [{"name":"logged_vkid","value":logged_vkid}];
attrs.ep = params.concat(attrs.ep || []);
Wicket.Ajax.ajax(attrs);
}
I use Wicket 7.2.
I investigated this for a few days and found that when I remove
setPageManagerProvider(new NoSerializationPageManagerProvider(this));
the application throws this exception in the logs:
org.apache.wicket.WicketRuntimeException: A problem occurred while
trying to collect debug information about not serializable object look
like it is could come from aused by: java.io.NotSerializableException:
com.tac.kulik.panel.smaccounts.SMAccountsPanel$1
This means that the page was being serialized for some reason; the $1 suffix denotes an anonymous class. I had a few classes created anonymously so that some Ajax links coming from a ListView could be managed on the parent panel. After removing this anonymous class logic, everything started to work well.
So I am happy, but I still don't understand why the page was serialized after the Ajax call, or why the whole page was refreshed.
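Why an anonymous class like SMAccountsPanel$1 can break serialization is reproducible in plain Java, without Wicket. The sketch below uses hypothetical stand-in types (Panel, Callback, StandaloneCallback): an anonymous class that touches its enclosing instance keeps a hidden reference to it, so serializing the callback also tries to serialize the enclosing object, which fails if that object is not serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class AnonymousSerializationDemo {

    /** Hypothetical callback type; it is Serializable on its own. */
    interface Callback extends Serializable {
        void onClick();
    }

    /** Hypothetical enclosing class that is NOT Serializable. */
    static class Panel {
        int clicks;

        Callback anonymousCallback() {
            // Compiles to Panel$1; using 'clicks' makes it hold a reference to this Panel.
            return new Callback() {
                @Override
                public void onClick() {
                    clicks++;
                }
            };
        }
    }

    /** A static nested class has no hidden reference to an enclosing instance. */
    static class StandaloneCallback implements Callback {
        private static final long serialVersionUID = 1L;

        @Override
        public void onClick() { }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Panel().anonymousCallback())); // false
        System.out.println(canSerialize(new StandaloneCallback()));        // true
    }
}
```

This only illustrates the NotSerializableException; it does not explain why Wicket refreshed the whole page in this case.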

AsyncTask inside execute method of Cordova plugin not working properly

I am developing a Cordova plugin for the first time and am stuck on the following issue.
I have created a class extending CordovaPlugin and overridden the execute method as shown below. What I want is that, after the AsyncTask has completed its background task, the response is returned to the JS and the values are displayed in the HTML. What actually happens is that sometimes the values are displayed and sometimes they are not. Any help would be appreciated.
@Override
public boolean execute(String action, JSONArray args,
CallbackContext callbackContext) throws JSONException {
try {
context = this.cordova.getActivity().getApplicationContext();
this.mMyCallbackContext = callbackContext;
new WSCall().execute();
PluginResult pluginResult = new PluginResult(PluginResult.Status.NO_RESULT);
pluginResult.setKeepCallback(true);
mMyCallbackContext.sendPluginResult(pluginResult);
return true;
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
return false;
}
}
and in the AsyncTask's onPostExecute I have done this:
@Override
protected void onPostExecute(String result) {
PluginResult result_;
if(groups!=null)
result_ = new PluginResult(PluginResult.Status.OK, groups);
else if(ret_msg!=null)
result_ = new PluginResult(PluginResult.Status.OK, ret_msg);
else
result_ = new PluginResult(PluginResult.Status.OK, "");
result_.setKeepCallback(false);
mMyCallbackContext.sendPluginResult(result_);
pDialog.dismiss();
}
Use this link, and don't return true from the execute method; return the PluginResult only.
