http://www.biletix.com/search/TURKIYE/en#!subcat_interval:12/12/15TO19/12/15
I want to get data from this website. When I use jsoup, it doesn't work because the page content is generated by JavaScript. Despite all my efforts, I still couldn't manage it.
I only want to get the name and URL of each event. Then I can go to that URL and get the start and end times and the location.
I don't want to use headless browsers. Do you know any alternatives?
Sometimes JavaScript- and JSON-based web pages are easier to scrape than plain HTML ones.
If you carefully inspect the network traffic (for example, with the browser developer tools), you'll see that the page makes a GET request that returns a JSON string with all the data you need. You can parse that JSON with any JSON library.
The URL is:
http://www.biletix.com/solr/en/select/?start=0&rows=100&fq=end%3A[2015-12-12T00%3A00%3A00Z%20TO%202015-12-19T00%3A00%3A00Z%2B1DAY]&sort=vote%20desc,start%20asc&&wt=json
You can generate this URL in much the same way as you generate the URL in your question.
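As a rough illustration of that URL generation, here is a minimal JavaScript sketch. The function name buildSearchUrl and its arguments are hypothetical; the parameter names (start, rows, fq, sort, wt) are simply copied from the URL above, and the sketch assumes the server accepts percent-encoded brackets and commas.

```javascript
// Hypothetical helper: builds the Solr query URL for a date range.
// The parameter names are copied from the URL shown above.
function buildSearchUrl(fromDate, toDate, rows) {
  // Solr expects timestamps such as 2015-12-12T00:00:00Z
  const stamp = (d) => d.toISOString().slice(0, 10) + "T00:00:00Z";
  const range = "end:[" + stamp(fromDate) + " TO " + stamp(toDate) + "+1DAY]";
  return "http://www.biletix.com/solr/en/select/" +
    "?start=0" +
    "&rows=" + rows +
    "&fq=" + encodeURIComponent(range) +
    "&sort=" + encodeURIComponent("vote desc,start asc") +
    "&wt=json";
}

// Dates are created in UTC so the local time zone cannot shift the day
console.log(buildSearchUrl(new Date(Date.UTC(2015, 11, 12)), new Date(Date.UTC(2015, 11, 19)), 100));
```

The two dates correspond to the 12/12/15 and 19/12/15 values in the URL from the question.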
A fragment of the json you'll get is:
....
"id":"SZ683",
"venuecount":"1",
"category":"ART",
"start":"2015-12-12T18:30:00Z",
"subcategory":"tiyatro$ART",
"name":"The Last Couple to Meet Online",
"venuecode":"BT",
.....
There you can see the name, and the URL is easily generated from the id field (SZ683), for example: http://www.biletix.com/etkinlik/SZ683/TURKIYE/en
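To make the mapping from the JSON to name and link concrete, here is a small JavaScript sketch. The helper name toEvents is hypothetical, and it assumes the usual Solr layout in which the results sit under response.docs; the link pattern is the one shown above.

```javascript
// Hypothetical helper: turns the "docs" array of the Solr response into
// { name, link } pairs. Assumes the results are stored under response.docs.
function toEvents(solrResponse) {
  const docs = (solrResponse.response && solrResponse.response.docs) || [];
  return docs.map((doc) => ({
    name: doc.name,
    link: "http://www.biletix.com/etkinlik/" + encodeURIComponent(doc.id) + "/TURKIYE/en",
  }));
}

// Sample input trimmed to the fields from the fragment above
const sample = {
  response: { docs: [{ id: "SZ683", name: "The Last Couple to Meet Online" }] },
};
console.log(toEvents(sample));
```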
------- EDIT -------
Getting the JSON data is more difficult than I initially thought. The server requires a cookie in order to return the correct data, so we need to do a first GET, fetch the cookie, and then do a second GET to obtain the JSON data. This is easy using Jsoup.
Then we parse the response using org.json.
This is a working example:
//Only an example: please DON'T use it in production code without error handling and more robust parsing
//Note that the smallest change on the server will break this code!!
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.json.JSONArray;
import org.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

//The class name is arbitrary
public class BiletixScraper {
    public static void main(String[] args) throws IOException {
        //We do an initial GET to retrieve the cookie
        Document doc = Jsoup.connect("http://www.biletix.com/").get();
        Element head = doc.head();
        //needs error control
        String script = head.select("script").get(0).html();
        //Not the most robust way of doing it ...
        Pattern p = Pattern.compile("document\\.cookie\\s*=\\s*'(\\w+)=(.*?);");
        Matcher m = p.matcher(script);
        if (!m.find()) {
            throw new IOException("Cookie not found in the first script element");
        }
        String cookieName = m.group(1);
        String cookieValue = m.group(2);
        //I'm supposing the url is already built
        //with the last url part (json.wrf=jsonp1450136314484) removed, the result is easier to parse
        String url = "http://www.biletix.com/solr/tr/select/?start=0&rows=100&q=subcategory:tiyatro$ART&qt=standard&fq=region:%22ISTANBUL%22&fq=end%3A%5B2015-12-15T00%3A00%3A00Z%20TO%202017-12-15T00%3A00%3A00Z%2B1DAY%5D&sort=start%20asc&&wt=json";
        Document document = Jsoup.connect(url)
                .cookie(cookieName, cookieValue) //with the cookie set we get the correct results
                .get();
        String bodyText = document.body().text();
        //We parse the json and extract the data
        JSONObject jsonObject = new JSONObject(bodyText);
        JSONArray jsonArray = jsonObject.getJSONObject("response").getJSONArray("docs");
        for (Object object : jsonArray) {
            JSONObject item = (JSONObject) object;
            System.out.println("name = " + item.getString("name"));
            System.out.println("link = " + "http://www.biletix.com/etkinlik/" + item.getString("id") + "/TURKIYE/en");
            //similarly you can fetch more info ...
            System.out.println();
        }
    }
}
I skipped the URL generation as I suppose you know how to generate it.
I hope the explanation is clear; English isn't my first language, so it is difficult for me to explain myself.
I use an HTML table whose content can be changed with mouse drag and drop. Technically, you can move the data from any table cell to another. The table is 50 rows by 10 columns, and each cell has a unique identifier. I want to export it to .xlsx format with the C# EPPlus library and send the exported file back to the client.
So I need to pass the whole table data on a button press and post it to either a Web API or an MVC controller, create an Excel file (mirroring the original HTML table data), and send it back to the browser as a download.
So the idea is to create an array that contains the value of each table cell (of course there will be empty cells in that array) and post that array to the controller.
The problem with that approach lies in the download: if I call the API or MVC controller with a regular jQuery ajax POST, the response is not recognized as a file.
C# code after ajax post:
[HttpPost]
public IHttpActionResult PostSavedReportExcel([FromBody]List<SavedReports> savedReports, [FromUri] string dateid)
{
    //some excel creation code
    HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = new StreamContent(new MemoryStream(package.GetAsByteArray()))
    };
    response.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment")
    {
        FileName = dateid + "_report.xlsx"
    };
    ResponseMessageResult responseMessageResult = ResponseMessage(response);
    return responseMessageResult;
}
Usually, for this kind of result I could use window.location = myurltocontroller to download the file properly, but that only works for GET requests; it cannot POST anything.
I found some answers which could help me in this topic:
JavaScript post request like a form submit
This suggests I should create a form that passes the values, but I do not know how to do so with arrays (the table consists of 50*10 = 500 values that I have to pass in the form).
I tried some frontend-only solutions to the HTML-to-Excel export problem, which of course do not require building files on the API side, but the free jQuery add-ins are deprecated, not customizable, handle only the .xls format, etc.
I found the EPPlus NuGet package to be a highly customizable tool, which is why I want to try it first.
So the question is: how can I post an array of 500 elements in a way that the controller will recognize, generate the file, and have the browser download it automatically?
If you can provide some code that would be fantastic, but giving me the right direction is also helpful.
Thank you.
You can use fetch() to send the request from the JS frontend. When the browser (JS) has received the response, it can then offer its binary content as a download. Something like this:
fetch("http://your-api/convert-to-excel", // Send the POST request to the backend
    {
        method: "POST",
        // Tell the server the body is JSON so it can bind it (e.g. with [FromBody])
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(
            [[1,2],[3,4]] // Here you can put your matrix
        )
    })
    .then(response => response.blob())
    .then(blob => {
        // Put the response BLOB into a virtual download from JS
        if (navigator.appVersion.toString().indexOf('.NET') > 0) {
            window.navigator.msSaveBlob(blob, "my-excel-export.xlsx");
        } else {
            var a = window.document.createElement('a');
            a.href = URL.createObjectURL(blob);
            a.download = "my-excel-export.xlsx";
            a.click();
        }
    });
So the JS part of the browser first downloads the file behind the scenes, and only when that is done does it trigger the "download" from the browser's memory into a file on the hard disk.
This is quite a common scenario with REST APIs that require bearer token authentication.
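For the matrix that goes into the request body, a minimal sketch could look like the following. The helper name tableToMatrix and the element id "report-table" are hypothetical; the function only assumes an object shaped like an HTML table element (rows, cells, textContent), which is why a plain stand-in object is enough to try it outside a browser.

```javascript
// Hypothetical helper: reads every cell of a table into a 2D array.
// `table` is expected to look like an HTMLTableElement (rows -> cells -> textContent).
function tableToMatrix(table) {
  return Array.from(table.rows, (row) =>
    Array.from(row.cells, (cell) => (cell.textContent || "").trim())
  );
}

// In the browser: var matrix = tableToMatrix(document.getElementById("report-table"));
// Stand-in object with the same shape, so the sketch also runs outside a browser:
const fakeTable = {
  rows: [
    { cells: [{ textContent: "A1" }, { textContent: "" }] },
    { cells: [{ textContent: " B1 " }, { textContent: "B2" }] },
  ],
};
console.log(JSON.stringify(tableToMatrix(fakeTable))); // prints [["A1",""],["B1","B2"]]
```

Empty cells stay in the array as empty strings, so the row and column positions are preserved for the Excel export.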
Basically, I need to access some records or data from a view in Lotus Notes itself. I can't use @DbLookup since our goal is not to refresh the form. I know that it is possible using AJAX, though I haven't tried AJAX yet; if you have a detailed tutorial, please share it here.
My main question is: is there any other, easier way to access these records in the view, coding directly in the JavaScript part of a field? Thanks a lot.
Basically: @DbLookup = "?" in JavaScript (not AJAX).
I would do as Torsten suggests: create a LotusScript agent that performs the lookup and returns a JSON object with the data. You then make an Ajax call to that agent from your web page using JavaScript or (even easier) jQuery.
I posted some code on my blog a while back. It does something similar, but instead of performing a view lookup, it retrieves the values of a specific document based on a document ID. You can find the code and a more detailed explanation here:
http://blog.texasswede.com/code-snippet-jquery/
This is the jQuery code:
function loadNotesFields(docunid) {
    var notesfieldname = "";
    $.ajax({
        url: "/database.nsf/ajax_GetNotesFieldFields?OpenAgent",
        data: {"NotesUNID":docunid},
        cache: false
    }).done(function(data) {
        $('input[notesfield]').each(function() {
            notesfieldname = $(this).attr("notesfield");
            $(this).val(data[notesfieldname]);
        });
    });
}
And this is the LotusScript code:
Dim urldata List As String
Sub Initialize
    Dim session As New NotesSession
    Dim webform As NotesDocument
    Dim db As NotesDatabase
    Dim doc As NotesDocument
    Dim urlstring As String
    Dim urlarr As Variant
    Dim urlvaluename As Variant
    Dim i As Integer
    Dim json As String
    Set webform = session.DocumentContext
    '*** Remove leading "OpenAgent" from Query_String
    urlstring = StrRight(webform.Query_String_Decoded(0),"&")
    '*** Create list of arguments passed to agent
    urlarr = Split(urlstring,"&")
    For i = LBound(urlarr) To UBound(urlarr)
        urlvaluename = Split(urlarr(i),"=")
        urldata(urlvaluename(0)) = urlvaluename(1)
    Next
    Set db = session.CurrentDatabase
    '*** Create content header for return data
    Print "content-type: application/json"
    '*** Get Notes document based on NotesUNID argument
    Set doc = db.GetDocumentByUNID(urldata("NotesUNID"))
    '*** Build JSON for all fields in document except $fields
    json = "{" + Chr$(13)
    ForAll item In doc.Items
        If Left$(item.Name,1)<>"$" Then
            json = json + |"| + item.Name + |":"| + item.Text + |",|+ Chr$(13)
        End If
    End ForAll
    '*** Remove trailing comma and line break
    json = Left$(json,Len(json)-2)
    json = json + "}"
    '*** Return JSON
    Print json
End Sub
If that were an XPages question, the answer would be easy: use the JavaScript @DbLookup.
For "classic" web development it is not that easy. You need to write an agent that returns the result of the @DbLookup in whatever format you want, and call that agent with an ajax call. That would look like this:
LotusScript agent, trigger none
Dim ses As New NotesSession
Dim db As NotesDatabase
Dim viw As NotesView
Dim dc As NotesDocumentCollection
Dim doc As NotesDocument
Set db = ses.CurrentDatabase
Set viw = db.GetView( "YourLookupView" )
Set dc = viw.GetAllDocumentsByKey( "YourLookupKey" )
Set doc = dc.GetFirstDocument
While Not doc Is Nothing
    Print doc.GetItemValue( "NameOfItemToReturn" )(0)
    Set doc = dc.GetNextDocument( doc )
Wend
This agent returns a "page" with all the values, one per line. In your ajax callback you can then do whatever you want with these values.
Usually you would not simply print the values but return a JSON object, some XML structure, ready-made HTML such as an ordered list, or whatever; but the principle should be clear.
Then you call the agent (e.g. with an ajax call) like this: hxxp://server/db.nsf/AgentName?OpenAgent
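On the JavaScript side, handling a one-value-per-line response could be sketched as follows. The helper name parseAgentLines is hypothetical, and the sketch assumes the agent's response body really is plain text with one value per line (for instance because the agent first prints a text/plain content-type header); if the server wraps the output in HTML, the markup would have to be stripped first.

```javascript
// Hypothetical helper: splits the agent's plain-text response into values.
function parseAgentLines(responseText) {
  return responseText
    .split(/\r?\n/)                      // one value per line
    .map((line) => line.trim())          // drop stray whitespace
    .filter((line) => line.length > 0);  // skip empty lines
}

console.log(parseAgentLines("Alice\r\nBob\n\nCarol\n"));

// In the page you would pass it the body of the ajax response, for example:
// $.get("/db.nsf/AgentName?OpenAgent").done(function (text) {
//   var values = parseAgentLines(text);
// });
```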
Another possibility would be to use a URL like hxxp://server/db.nsf/YourLookupView?ReadViewEntries&restricttocategory=YourCategory or hxxp://server/db.nsf/YourLookupView?ReadViewEntries&restricttocategory=YourCategory&OutputFormat=json and parse the result with "native" JavaScript.
I have an object created in JavaScript with a lot of data, and I serialize it to JSON to send it to the server. After this, the server must do some work and create a dynamic file so that it can be downloaded.
For the last step I created an ASHX handler, but it can be modified. I already get an HttpContext, and I found in another question how to use it to get the data from the JSON, so my question is not about that.
The problem (more oriented to JS) is this one:
How can I send the JSON to the ASHX generic handler as a URL/GET/POST, avoiding the "ajax reply", so that the user opens a new window with the dynamically generated link?
Thanks, sorry for my English (please edit) and kind regards!
Note 1: I can't use third-party code
Note 2: I can't use JSON.NET
Note 3: I can't save the report on the server, so the response must be a generated file to download; moreover, the download itself is the response of the server.
---UPDATE----
I've read this question:
Can I post JSON without using AJAX?
The only thing I don't understand from that question is how to make it work, given that I have a "link" to download.
I assume you do not want to refresh the whole page so there is a workaround.
1) Ajax-load an iframe which is a separate aspx file for example.
2) In the codebehind of that separate aspx file, generate the file in memory and convert it to an array of bytes.
3) Then use Response to stream the bytes to the user.
Finally I resolved the issue with this (in the right way).
I just take my JSON object and send it through POST with a form generated dynamically with JavaScript:
var dataToPostInExport = JSON.stringify(queryToVerify);
//Convert To POST and send
var VerifyForm = document.createElement("form");
VerifyForm.target = "_blank";
VerifyForm.method = "POST";
VerifyForm.action = "file.ashx";
var dataInput = document.createElement("input");
dataInput.type = "hidden";
dataInput.name = "mydata";
dataInput.value = dataToPostInExport;
VerifyForm.appendChild(dataInput);
document.body.appendChild(VerifyForm);
VerifyForm.submit();
Then in the ashx file:
' Requires: Imports System.Web.Script.Serialization
' MyOwnType stands for your own class that matches the JSON structure
Dim DataToParse As String
DataToParse = HttpContext.Current.Request.Form("mydata")
Dim JSSerializer As New JavaScriptSerializer
Dim QueryToExport As MyOwnType
QueryToExport = JSSerializer.Deserialize(Of MyOwnType)(DataToParse)
How to extract messages from the dashboard in Mirth?
Basically, using JavaScript, how would I extract the information from the dashboard in Mirth?
For example, I want to extract the encoded data and the ACK returned by the destination.
One of the things I tried was to run the following in the postprocessor, but it only writes the raw message, not the encoded one.
var log1file=D:\TEST\log1.txt;
var ReportBody=(messageObject.getEncodedData());
FileUtil.write(log1file, true, ReportBody);
Any suggestions much appreciated.
Thank you.
Try this:
logger.info('start post script');
var status = responseMap.get('Destination Name').getStatus();
if (status == "ERROR" || status == "FAILURE")
{
    logger.info("Status = " + status);
    var errormsg = responseMap.get('Destination Name').getMessage();
    logger.info(errormsg);
}
return;
getMessage() returns the description of the error (exception).
You wouldn't want to extract messages from the Dashboard. The dashboard is only showing the stored history from the database it keeps.
If you want to write the encoded data to a log file as the messages are processed, move that code from your post-processor to a transformer JavaScript step in the source or in a destination (the encoded data changes from source to destination if you have transformer steps or if you change from HL7 to XML, etc.).
Is it actually creating the file? You don't have quotes around your file name and the backslashes should be forward slashes.
I am semi-new to ASP.NET MVC. I am building an app that is used internally for my company.
The scenario is this: there are two Html.ListBox controls. One has all the database information, and the other is initially empty. The user adds items from the database listbox to the empty listbox.
Every time the user adds a command, I call a JS function that calls an ActionResult "AddCommand" in my EditController. In the controller, the selected items that are added are saved to another database table.
Here is the code (this gets called every time an item is added):
function Add(listbox) {
    ...
    //skipping initializing code for brevity
    var url = "/Edit/AddCommand/" + cmd;
    $.post(url);
}
So the problem occurs when 'cmd' is an item that contains a '/', ':', '%', '?', etc. (some kind of special character).
So what I'm wondering is: what's the best way to escape these characters? Right now I'm checking the text of the database listbox item and rebuilding the string; then, in the controller, I'm taking that rebuilt string and turning it back into its original state.
So for example, if the item they are adding is 'Cats/Dogs', I am posting 'Cats[SLASH]Dogs' to the controller, and in the controller changing it back to 'Cats/Dogs'.
Obviously this is a horrible hack, so I must be missing something. Any help would be greatly appreciated.
Why not just take this out of the URI? You're doing a POST, so put it in the form.
If your action is:
public ActionResult AddCommand(string cmd) { // ...
...then you can do:
var url = "/Edit/AddCommand";
var data = { cmd: cmd };
$.post(url, data);
... and everything will "just work" with no separate encoding step.
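To see why no manual escaping is needed in that case, here is a small sketch of what a form-encoded body looks like for values with special characters. The function name formBody is hypothetical; URLSearchParams is used only as a stand-in so the sketch runs without jQuery, which serializes the { cmd: cmd } object in a comparable way.

```javascript
// Sketch: the form-encoded body for a value containing special characters.
// URLSearchParams stands in for the serialization jQuery performs on the data object.
function formBody(cmd) {
  return new URLSearchParams({ cmd: cmd }).toString();
}

console.log(formBody("Cats/Dogs")); // cmd=Cats%2FDogs
console.log(formBody("50% off?"));
```

The special characters travel percent-encoded in the request body, and the model binder hands the decoded string to the cmd parameter of the action.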
Have you tried using the 'escape' function, before sending the data? This way, all special characters are encoded in safe characters. On the server-side, you can decode the value.
function Add(listbox) { ...
    //skipping initializing code for brevity
    var url = "/Edit/AddCommand/" + escape(cmd);
    $.post(url);
}
Use JavaScript escaping; it does URL encoding.
Javascript encoding
Then in C# you can simply decode it.
It will look like this:
function Add(listbox) { ...
    //skipping initializing code for brevity
    var url = "/Edit/AddCommand/" + escape(cmd);
    $.post(url);
}
Have you tried just wrapping your cmd variable in a call to escape()?
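One caveat worth checking before relying on escape(): it is a legacy function and leaves some characters, including '/', unencoded, so it would not help with an item like 'Cats/Dogs'. The sketch below compares it with encodeURIComponent; the helper name buildAddCommandUrl is hypothetical. Note also that some server configurations still reject or decode encoded slashes inside a URL path, so sending the value in the POST body remains the simpler route.

```javascript
// Hypothetical helper: encodes cmd as a single path segment.
function buildAddCommandUrl(cmd) {
  return "/Edit/AddCommand/" + encodeURIComponent(cmd);
}

console.log(escape("Cats/Dogs"));             // Cats/Dogs  (the slash is left as-is)
console.log(encodeURIComponent("Cats/Dogs")); // Cats%2FDogs
console.log(buildAddCommandUrl("Cats/Dogs")); // /Edit/AddCommand/Cats%2FDogs
```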
You could pass the details as a query string. At the moment I'm guessing your action looks like:
public virtual ActionResult AddCommand( string id )
you could change it to:
public virtual ActionResult AddCommand( string cmd )
and then in your JavaScript call:
var url = "/Edit/AddCommand?cmd=" + encodeURIComponent(cmd);
That way the value no longer has to fit into a URL path segment, and encodeURIComponent takes care of the special characters.
A better way would be to pass the database item's id rather than a string. This would probably give better performance for your db as well.