Elegant way to make Array from xml (string)

I need to build an array for a data grid.
The input is an XML string containing a lot of unnecessary data; I only need the "a:Client" array.
Here is my code. It works, but drilling down to the array like this doesn't seem very clean.
parse(XML) {
  var parseString = require('react-native-xml2js').parseString;
  var xml = XML;
  parseString(xml, (err, result) => {
    this.setState({rows: result["s:Envelope"]["s:Body"][0].GetClientsResponse[0].GetClientsResult[0]["a:Client"]});
  });
}
Here is the XML string:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body><GetClientsResponse xmlns="http://tempuri.org/"><GetClientsResult xmlns:a="http://schemas.datacontract.org/2004/07/tt4t.Dispatching.ApplicationServer.Data.CommonData" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><a:Client><a:ClientID>0</a:ClientID><a:Name/></a:Client><a:Client><a:ClientID>12</a:ClientID><a:Name>Magistrát města Liberec</a:Name></a:Client><a:Client><a:ClientID>30</a:ClientID><a:Name>Krajský úřad Libereckého kraje</a:Name></a:Client><a:Client><a:ClientID>31</a:ClientID><a:Name>OC Nisa</a:Name></a:Client><a:Client><a:ClientID>32</a:ClientID><a:Name>Globus</a:Name></a:Client><a:Client><a:ClientID>33</a:ClientID><a:Name>Die Länderbahn GmbH DLB</a:Name></a:Client><a:Client><a:ClientID>34</a:ClientID><a:Name>Magistrát města Jablonec nad Nisou</a:Name></a:Client><a:Client><a:ClientID>35</a:ClientID><a:Name>Dopravní podnik měst Liberce a Jablonce n.N.</a:Name></a:Client><a:Client><a:ClientID>36</a:ClientID><a:Name>Liplastec</a:Name></a:Client><a:Client><a:ClientID>37</a:ClientID><a:Name>CBRE Česká republika</a:Name></a:Client><a:Client><a:ClientID>38</a:ClientID><a:Name>Cinestar</a:Name></a:Client><a:Client><a:ClientID>39</a:ClientID><a:Name>České dráhy a.s.</a:Name></a:Client><a:Client><a:ClientID>40</a:ClientID><a:Name>DENSO MANUFACTURING CZECH s.r.o.</a:Name></a:Client><a:Client><a:ClientID>41</a:ClientID><a:Name>Hasiči</a:Name></a:Client><a:Client><a:ClientID>42</a:ClientID><a:Name>MŠ Lísteček ((N.Ruda)</a:Name></a:Client><a:Client><a:ClientID>43</a:ClientID><a:Name>MŠ Tanvaldská (vč. 
pobočky Poštovní)</a:Name></a:Client><a:Client><a:ClientID>44</a:ClientID><a:Name>Policie ČR</a:Name></a:Client><a:Client><a:ClientID>45</a:ClientID><a:Name>SOŠ Kateřinky</a:Name></a:Client><a:Client><a:ClientID>46</a:ClientID><a:Name>Sportkids</a:Name></a:Client><a:Client><a:ClientID>47</a:ClientID><a:Name>SŽDC, s.o.</a:Name></a:Client><a:Client><a:ClientID>48</a:ClientID><a:Name>Záchranná služba</a:Name></a:Client><a:Client><a:ClientID>49</a:ClientID><a:Name>ZŠ Nad Školou</a:Name></a:Client><a:Client><a:ClientID>50</a:ClientID><a:Name>SKI KLUB Jizerska 50</a:Name></a:Client><a:Client><a:ClientID>51</a:ClientID><a:Name>TJ Dukla Liberec, z.s.</a:Name></a:Client><a:Client><a:ClientID>52</a:ClientID><a:Name>Jiný, viz poznámka</a:Name></a:Client><a:Client><a:ClientID>53</a:ClientID><a:Name>KORID LK, spol. s r.o.</a:Name></a:Client><a:Client><a:ClientID>54</a:ClientID><a:Name>STUDENT AGENCY k.s.</a:Name></a:Client><a:Client><a:ClientID>55</a:ClientID><a:Name>Boveraclub z.s.</a:Name></a:Client><a:Client><a:ClientID>56</a:ClientID><a:Name>Archa 13</a:Name></a:Client><a:Client><a:ClientID>57</a:ClientID><a:Name>Pekárny</a:Name></a:Client><a:Client><a:ClientID>58</a:ClientID><a:Name>MŠ Sídliště - Skloněná</a:Name></a:Client><a:Client><a:ClientID>59</a:ClientID><a:Name>Vratislavice</a:Name></a:Client><a:Client><a:ClientID>60</a:ClientID><a:Name>První festivalová s.r.o.</a:Name></a:Client><a:Client><a:ClientID>61</a:ClientID><a:Name>Městský obvod Liberec - Vratislavice nad Nisou</a:Name></a:Client><a:Client><a:ClientID>62</a:ClientID><a:Name>Preciosa</a:Name></a:Client><a:Client><a:ClientID>63</a:ClientID><a:Name>Central Europe Spartan Race ESR Enterprises Czech</a:Name></a:Client><a:Client><a:ClientID>64</a:ClientID><a:Name>Kümpers Textil, s.r.o.</a:Name></a:Client><a:Client><a:ClientID>65</a:ClientID><a:Name>Základní škola a Mateřská škola, Stráž n.N.</a:Name></a:Client><a:Client><a:ClientID>66</a:ClientID><a:Name>Základní škola a mateřská škola logopedická, 
LBC</a:Name></a:Client><a:Client><a:ClientID>165</a:ClientID><a:Name>Zájezd stř. 704</a:Name></a:Client><a:Client><a:ClientID>166</a:ClientID><a:Name>DPMLJ X11</a:Name></a:Client><a:Client><a:ClientID>167</a:ClientID><a:Name>DPMLJ 2 a 3</a:Name></a:Client><a:Client><a:ClientID>168</a:ClientID><a:Name>DPMLJ 5 a 11</a:Name></a:Client></GetClientsResult></GetClientsResponse></s:Body></s:Envelope>

Your code works, but I think two things could be done better.
1) It's not readable. If you come back to this code in a year, it will be hard to tell what you are looking for, so I would wrap this:
{rows: result["s:Envelope"]["s:Body"][0].GetClientsResponse[0].GetClientsResult[0]["a:Client"]}
in a function like:
getClient(result)
2) It's not reusable. If the XML changes, or you need to find some other data, it will break, so try to use a function with a parameter that finds exactly what you need, recursively if necessary.
(You can take a look here. I know that you have arrays, but you can search those recursively as well.)
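A recursive lookup of the kind suggested in point 2 might look like the sketch below. The helper name findKey and the trimmed sample object are made up for illustration; the sample only mimics the nested-array shape that the parser produces for the response in the question.

```javascript
// Hypothetical helper: depth-first search for the first property named `key`
// anywhere inside a parsed object (arrays are walked element by element).
function findKey(node, key) {
  if (node === null || typeof node !== 'object') {
    return undefined;
  }
  if (!Array.isArray(node) && Object.prototype.hasOwnProperty.call(node, key)) {
    return node[key];
  }
  for (const child of Object.values(node)) {
    const found = findKey(child, key);
    if (found !== undefined) {
      return found;
    }
  }
  return undefined;
}

// Sample shaped like the parsed SOAP response, trimmed to two clients.
const parsed = {
  's:Envelope': {
    's:Body': [{
      GetClientsResponse: [{
        GetClientsResult: [{
          'a:Client': [
            { 'a:ClientID': ['0'], 'a:Name': [''] },
            { 'a:ClientID': ['12'], 'a:Name': ['Magistrát města Liberec'] }
          ]
        }]
      }]
    }]
  }
};

// Fall back to an empty array when the key is not present at all.
const clients = findKey(parsed, 'a:Client') || [];
console.log(clients.length);
```

Because the search stops at the first match, this only suits keys that are unique in the response; for repeated keys you would need to collect all matches instead.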

The only thing that worries me about the code above is that it lacks error checking. I am not sure how confident you are that your API and its schema will behave as expected, but I think it is good to have error checking whenever you access a nested property of a nested object.
afterParsing(err, result) {
  if (err) {
    // Handle error
    return [];
  }
  try {
    const res = result["s:Envelope"]["s:Body"][0].GetClientsResponse[0].GetClientsResult[0]["a:Client"];
    return res;
  } catch (error) {
    return [];
  }
}
parse(XML) {
  var parseString = require('react-native-xml2js').parseString;
  var xml = XML;
  parseString(xml, (err, result) => {
    this.setState({rows: this.afterParsing(err, result)}); // if afterParsing and this function are in the class
  });
}
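As an alternative to try/catch, the same guard can be sketched with optional chaining, assuming your toolchain supports ES2020 syntax. The function name getClients is only illustrative.

```javascript
// Returns the client array, or [] as soon as any level of the path is missing.
function getClients(result) {
  return result?.['s:Envelope']?.['s:Body']?.[0]
    ?.GetClientsResponse?.[0]
    ?.GetClientsResult?.[0]
    ?.['a:Client'] ?? [];
}

console.log(getClients(undefined)); // nothing parsed at all
console.log(getClients({ 's:Envelope': {} })); // envelope without a body
```

Unlike the try/catch version, this does not swallow unrelated exceptions; it only handles missing levels in the path.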
If this piece of code is going to run in the browser, you can also try DOMParser.
const stringContainingXMLSource = `<your xml string>`;
var parser = new DOMParser();
var doc = parser.parseFromString(stringContainingXMLSource, "application/xml");
console.log(Array.from(doc.querySelectorAll('Envelope > Body > GetClientsResponse > GetClientsResult > Client')).map(n => n.textContent));

Related

Approach to selecting a document

I am using Couchbase in a Node app. Every time I insert a document, I use a random UUID as its id.
It inserts fine, and I can retrieve data based on this id.
In practice, though, I want to look documents up by a key called url inside the document, so that I can get, update, or delete them.
I could use the url as the id, I suppose, but that is not what I see in any database concepts. Ids are not urls
or other unique names. They are typically random or incremental numbers.
How can I approach this so that I keep a random UUID as the id but can still search by url?
Say the id was 56475-asdf-7856: I am not going to know this value to search for.
Whereas if the id was https://www.example.com, I know this url, and searching for it would give me what I want.
Is it a good idea to make the url the id?
This is in a Node app using Couchbase.
databaseRouter.put('/update/:id', (req, res) => {
  updateDocument(req)
    .then(({ document, error }) => {
      if (error) {
        // return so that a second response is not sent below
        return res.status(404).send(error);
      }
      res.json(document);
    })
    .catch(error => res.status(500).send(error));
});
export const updateDocument = async (req) => {
  try {
    const result = await collection.get(req.params.id); // Feels like id should be the way to do this, but doesn't make sense cos I won't know the id beforehand.
    const document = result.content; // the stored JSON is on the get result
    document.url = req.body.url || document.url;
    await collection.replace(req.params.id, document);
    return { document };
  } catch (error) {
    return { error };
  }
};

I think it's okay to use URLs as IDs, especially if that's the primary way you're going to look up documents and you don't need to change the URL later. Yes, IDs are often numbers or UUIDs, but there is no reason you have to be restricted to those.
However, another approach is to use a SQL query (SQL++, technically, since this is a JSON database).
Something like:
SELECT d.*
FROM mybucket.myscope.mydocuments d
WHERE d.url = 'http://example.com/foo/baz/bar'
You'll also need an index with that, something like:
CREATE INDEX ix_url ON mybucket.myscope.mydocuments (url)
I'd recommend checking out the docs for writing a SQL++ query (sometimes still known as "N1QL") with Node.js: https://docs.couchbase.com/nodejs-sdk/current/howtos/n1ql-queries-with-sdk.html
Here's the first example in the docs:
async function queryPlaceholders() {
  const query = `
  SELECT airportname, city FROM \`travel-sample\`.inventory.airport
  WHERE city=$1
  `;
  const options = { parameters: ['San Jose'] }
  try {
    let result = await cluster.query(query, options)
    console.log("Result:", result)
    return result
  } catch (error) {
    console.error('Query failed: ', error)
  }
}
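Applied to the url lookup from the question, the same pattern might look like the sketch below. The helper name findByUrl and the docId alias are made up for illustration, the keyspace is the placeholder used in the query above, and cluster is assumed to be an already connected Couchbase cluster object. The stub at the bottom only stands in for a real cluster so the flow can be followed without a server.

```javascript
// Hypothetical helper: look documents up by their `url` field with a
// parameterized SQL++ query, returning the document id alongside the fields.
async function findByUrl(cluster, url) {
  const query = `
    SELECT META(d).id AS docId, d.*
    FROM mybucket.myscope.mydocuments d
    WHERE d.url = $1
  `;
  const result = await cluster.query(query, { parameters: [url] });
  return result.rows;
}

// Stand-in for a real cluster (assumption for demonstration only).
const stubCluster = {
  query: async (statement, options) => ({
    rows: [{ docId: '56475-asdf-7856', url: options.parameters[0] }]
  })
};

findByUrl(stubCluster, 'https://www.example.com')
  .then(rows => console.log(rows));
```

With the docId in hand, the existing get and replace calls can keep working on the random UUID key.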

How to crawl a site using Node.js

I can't believe I'm asking such an obvious question, but I still get the wrong result in the console log.
The console shows the crawl result as "[]", even though I've checked at least 10 times for typos. Anyway, here's the JavaScript code.
I want to crawl the site.
This is the kangnam.js file:
const axios = require('axios');
const cheerio = require('cheerio');
const log = console.log;
const getHTML = async () => {
  try {
    return await axios.get('https://web.kangnam.ac.kr', {
      headers: {
        Accept: 'text/html'
      }
    });
  } catch (error) {
    console.log(error);
  }
};
getHTML()
  .then(html => {
    let ulList = [];
    const $ = cheerio.load(html.data);
    const $allNotices = $("ul.tab_listl div.list_txt");
    $allNotices.each(function(idx, element) {
      ulList[idx] = {
        title : $(this).find("list_txt title").text(),
        url : $(this).find("list_txt a").attr('href')
      };
    });
    const data = ulList.filter(n => n.title);
    return data;
  }).then(res => log(res));
I've checked and revised it at least 10 times.
Yet Node still prints this result:
root#goorm:/workspace/web_platform_test/myapp/kangnamCrawling(master)# node kangnam.js
[]

Mate, I think the issue is you're parsing it incorrectly.
$allNotices.each(function(idx, element) {
  ulList[idx] = {
    title : $(this).find("list_txt title").text(),
    url : $(this).find("list_txt a").attr('href')
  };
});
The data you're looking for lives on the DOM node that $(this) wraps. Keep in mind that Cheerio's .find() is not Array.prototype.find(): it takes a CSS selector and searches the descendants of the current element. The selector "list_txt title" has no leading dot, so it looks for a <title> element inside a <list_txt> element instead of anything with the class list_txt. Nothing matches, so .text() returns an empty string and .attr('href') returns undefined, and your filter then drops every entry.
The array method of the same name, for comparison: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find
One way around this is to skip the selector lookup and read the attributes straight off the node's first child, assuming that child is the link carrying the title and href attributes. You also don't need $(this), since the element parameter already gives you the same node.
$allNotices.each(function(idx, element) {
  ulList[idx] = {
    title : element.children[0].attribs.title,
    url : element.children[0].attribs.href
  };
});
This should now populate your data array correctly. You should always inspect the data structures you're parsing, since that's the only way to parse them correctly.
Anyway, I hope this solves your problem!

Unable to add async / await and then unable to export variable. Any help appreciated

Background: I've been trying for the last 2 days to resolve this myself by looking at various examples from both this website and others, and I'm still not getting it. Whenever I try adding callbacks or async/await, I get nowhere. I know this is where my problem is, but I can't resolve it myself.
I'm not from a programming background :( I'm sure it's a quick fix for the average programmer; I am well below that level.
When I console.log(final) within the 'ready' block it works as it should. Outside that block, console.log(final) outputs 'undefined', and console.log(ready) outputs the request/server info.
const request = require('request');
const ready =
  // I know 'request' is deprecated, but given my struggle with async/await (+ callbacks) in general, when I tried switching to axios I found it more confusing.
  request({url: 'https://www.website.com', json: true}, function(err, res, returnedData) {
    if (err) {
      throw err;
    }
    var filter = returnedData.result.map(entry => entry.instrument_name);
    var str = filter.toString();
    var addToStr = str.split(",").map(function(a) { return `"trades.` + a + `.raw", `; }).join("");
    var neater = addToStr.substr(0, addToStr.length-2);
    var final = "[" + neater + "]";
    // * * * Below works here but not outside this block* * *
    // console.log(final);
  });
// console.log(final);
// returns 'final is not defined'
console.log(ready);
// returns server info of GET req endpoint. This is as it is returning before actually returning the data. Not done as async.
module.exports = ready;
Below is a short example of the JSON that is returned by website.com. The actual call has 200+ 'result' objects.
What I'm ultimately trying to achieve is:
1) return all values of "instrument_name"
2) perform some manipulations (adding 'trades.' to the beginning of each value and '.raw' to the end of each value)
3) place these manipulations into an array.
["trades.BTC-26JUN20-8000-C.raw","trades.BTC-25SEP20-8000-C.raw"]
4) export/send this array to another file.
5) The array will be used as part of another request used in a websocket connection. The array cannot be hardcoded into this new request as the values of the array change daily.
{
  "jsonrpc": "2.0",
  "result": [
    {
      "kind": "option",
      "is_active": true,
      "instrument_name": "26JUN20-8000-C",
      "expiration_timestamp": 1593158400000,
      "creation_timestamp": 1575305837000,
      "contract_size": 1
    },
    {
      "kind": "option",
      "is_active": true,
      "instrument_name": "25SEP20-8000-C",
      "expiration_timestamp": 1601020800000,
      "creation_timestamp": 1569484801000,
      "contract_size": 1
    }
  ],
  "usIn": 1591185090022084,
  "usOut": 1591185090025382,
  "usDiff": 3298,
  "testnet": true
}

Looking at your code, there are two problems, related to the final and ready variables. The first is that you're trying to console.log(final) outside its scope.
The second problem is that request doesn't immediately return the result of your API request. The reason is simple: you're doing an asynchronous operation, and the result is only passed to your callback. Your ready variable is just a reference to the request object.
I'm not sure about the context of your code or why you want to module.exports the ready variable, but I suppose you want to export the result. If that's the case, I suggest exporting an async function that returns the response data instead of the request variable. That way you can control how to handle the response outside the module.

You can use the built-in fetch API instead of the deprecated request. I changed your code so that your module exports an asynchronous function called fetchData, which you can import elsewhere and execute. It returns the result, processed with your logic:
module.exports = {
  fetchData: async function fetchData() {
    try {
      // fetch takes the URL directly; the JSON body is read with response.json()
      const response = await fetch("https://www.website.com/");
      const returnedData = await response.json();
      var filter = returnedData.result.map(entry => entry.instrument_name);
      var str = filter.toString();
      var addToStr = str
        .split(",")
        .map(function(a) {
          return `"trades.` + a + `.raw", `;
        })
        .join("");
      var neater = addToStr.substr(0, addToStr.length - 2);
      return "[" + neater + "]";
    } catch (error) {
      console.error(error);
    }
  }
}
I hope this helps; otherwise, please share more of your code. Much depends on where you want to display the fetched data, and on how you handle the loading and error states.
EDIT:
I can't get responses from this website, because you need an account as well as credentials for the API. Judging by your code and your questions:
1) return all values of "instrument_name"
Your map function works:
var filter = returnedData.result.map(entry => entry.instrument_name);
2) perform some manipulations (adding 'trades.' to the beginning of each value and '.raw' to the end of each value)
3) place these manipulations into an array. ["trades.BTC-26JUN20-8000-C.raw","trades.BTC-25SEP20-8000-C.raw"]
This can be done using this function
const manipulatedData = filter.map(val => `trades.${val}.raw`);
You can now use manipulatedData in your next request. Whether you can export this variable depends on the component you use it in. To be honest, it sounds easier to me not to split this logic, including the websocket part, into two separate components.
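If the logic does stay in its own file, one way to hand the array to another file is to export an async function and await it there. The sketch below makes some assumptions: the name getChannels is invented, and the fetchImpl parameter exists only so the function can be tried with a stand-in instead of the real endpoint (it defaults to the global fetch found in recent Node versions).

```javascript
// Hypothetical module body: fetch the instruments and build the channel names.
async function getChannels(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  const returnedData = await response.json();
  return returnedData.result.map(entry => `trades.${entry.instrument_name}.raw`);
}

// Stand-in for the real endpoint, shaped like the JSON in the question.
const stubFetch = async () => ({
  json: async () => ({
    result: [
      { instrument_name: 'BTC-26JUN20-8000-C' },
      { instrument_name: 'BTC-25SEP20-8000-C' }
    ]
  })
});

getChannels('https://www.website.com', stubFetch)
  .then(channels => console.log(channels));
```

In a real project the function would be exported (for example with module.exports = { getChannels }) and the consuming file would call it with await or .then() before opening the websocket, since the array does not exist until the request has finished.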

How to save the result from collection.findone()

I have a simple question. I have read a lot of similar issues here, but they are not exactly the same or the answers don't work for me :-(
I have a REST function called "addevent". The function gets a JSON input (req) and iterates through the JSON array to collect some IDs and store them in an extra array. That works perfectly!
After that, the function should search MongoDB for every single ID and store some extra information for it (e.g. the stored URL of this ID). With "console.log(result.link)" this again works perfectly. My problem is that I need to store this link in an extra array (urlArray).
So how can I save the result of collection.findOne()? I read that findOne() doesn't return a document but a cursor. What does that mean? How do I have to handle that in my case?
That's the code:
exports.addevent = function(req, res) {
  var ids = req.body;
  var photoArray = new Array();
  var urlArray = new Array();
  var eventName = ids.name;
  for (var i in ids.photos) {
    photoArray.push(ids.photos[i]);
    var id = ids.photos[i]._id;
    var collection = db.get().collection('photos');
    collection.findOne({'_id': new mongo.ObjectID(id)}, function(err, result) {
      console.log(result.link);
    });
  }
}
Many thanks!
-------------------- Update --------------------
Ok, I think this has something to do with asynchronous callbacks. I found an article, but I don't know how to apply it in my case.
http://tobyho.com/2011/11/02/callbacks-in-loops/
And something about "promises" in JavaScript.

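You can save the result of your search by storing it inside the callback, something like:
collection.findOne({'_id': new mongo.ObjectID(id)}, function(err, photo) {
  if (!err && photo) {
    // the document is only available here, so this is where it has to be stored
    urlArray.push(photo.link);
  } else {
    console.log(err);
  }
});
This way each found link ends up in urlArray. A value returned from the callback is discarded, and the array is only filled once the callbacks have run, not right after the loop.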
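Because the lookups are asynchronous, the array is only complete after every callback has fired. If the driver in use returns a promise from findOne when no callback is passed, the waiting can be sketched with Promise.all. The helper name collectLinks and the toObjectId parameter are invented for this sketch, and the stub collection only imitates the promise-returning call.

```javascript
// Hypothetical helper: resolve every photo to its stored link.
// `collection.findOne(query)` is assumed to return a Promise, and
// `toObjectId` converts the raw id (e.g. id => new mongo.ObjectID(id)).
async function collectLinks(collection, photos, toObjectId) {
  const docs = await Promise.all(
    photos.map(photo => collection.findOne({ _id: toObjectId(photo._id) }))
  );
  // Skip ids that matched no document (represented as null here).
  return docs.filter(doc => doc !== null).map(doc => doc.link);
}

// Stand-in collection for demonstration: id "b" has no matching document.
const stubCollection = {
  findOne: async query =>
    query._id === 'b' ? null : { link: 'http://example.com/' + query._id }
};

collectLinks(stubCollection, [{ _id: 'a' }, { _id: 'b' }, { _id: 'c' }], id => id)
  .then(urlArray => console.log(urlArray));
```

The response to the client would then be sent from inside the .then() (or after an await), once urlArray is actually populated.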

Javascript memory consumption with map() over a large set and callbacks

I don't even know how to properly ask this question but I have concerns about the performance (mostly memory consumption) of the following code. I am anticipating that this code will consume a lot of memory because of map on a large set and a lot of 'hanging' functions that wait for external service. Are my concerns justified here? What would be a better approach?
var list = fs.readFileSync('./mailinglist.txt') // say 1.000.000 records
  .split("\n")
  .map( processEntry );
var processEntry = function _processEntry(i){
  i = i.split('\t');
  getEmailBody( function(emailBody, name){
    var msg = {
      "message" : emailBody,
      "name" : i[0]
    }
    request(msg, function reqCb(err, result){
      ...
    });
  }); // getEmailBody
}
var getEmailBody = function _getEmailBody(obj, cb){
  // read email template from file;
  // v() returns the correct form for person's name with web-based service
  v(obj.name, function(v){
    cb(obj, v)
  });
}

If you're worried about submitting a million HTTP requests in a very short time span (which you probably should be), you'll have to set up a buffer of some kind.
One simple way to do it:
// pass an encoding so readFileSync returns a string that can be split
var lines = fs.readFileSync('./mailinglist.txt', 'utf8').split("\n");
var entryIdx = 0;
var done = false;
var processNextEntry = function () {
  if (entryIdx < lines.length) {
    processEntry(lines[entryIdx++]);
  } else {
    done = true;
  }
};
var processEntry = function _processEntry(i){
  i = i.split('\t');
  getEmailBody( function(emailBody, name){
    var msg = {
      "message" : emailBody,
      "name" : name
    }
    request(msg, function reqCb(err, result){
      // ...
      !done && processNextEntry();
    });
  }); // getEmailBody
}
// getEmailBody didn't change
// you set the ball rolling by calling processNextEntry n times,
// where n is a sensible number of http requests to have pending at once.
for (var i=0; i<10; i++) processNextEntry();
Edit: according to this blog post, Node has an internal queue system that only allows 5 simultaneous requests. But you can still use this method to avoid filling up that internal queue with a million items if you're worried about memory consumption.
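The same idea of keeping only n requests pending can also be sketched with promises. This is an alternative formulation rather than the code above: the helper name processWithLimit is invented, and the worker here is a timer that stands in for the real request.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function processWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  // Each runner pulls the next unprocessed index until the list is exhausted.
  async function runner() {
    while (next < items.length) {
      const idx = next++;
      results[idx] = await worker(items[idx], idx);
    }
  }
  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Stand-in worker: pretends to send a request and resolves after a short delay.
const fakeSend = line =>
  new Promise(resolve => setTimeout(() => resolve(line.toUpperCase()), 5));

processWithLimit(['ann', 'bob', 'cid', 'dee'], 2, fakeSend)
  .then(sent => console.log(sent));
```

Only the pending workers hold closures at any one time, which is the point of the buffer: memory use depends on the limit, not on the length of the list.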

Firstly, I would advise against using readFileSync and instead favour the async equivalent. Blocking on IO operations should be avoided, as reading from a disk is very expensive. Whilst that's the sole purpose of your code now, I would consider how that might change in the future, and arbitrarily wasting clock cycles is never a good idea.
For large data files I would read them in defined chunks and process them. If you can come up with some schema, either sentinels to distinguish data blocks within the file or padding to boundaries, then process the file piece by piece.
This is just rough and untested, off the top of my head, but something like:
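var fs = require("fs");
function doMyCoolWork(startByteIndex, endByteIndex) {
  fs.open("path to your text file", 'r', function(err, fd) {
    if (err) throw err;
    var chunkSize = endByteIndex - startByteIndex;
    var buffer = Buffer.alloc(chunkSize); // new Buffer(size) is deprecated
    // read from startByteIndex rather than always from the start of the file
    fs.read(fd, buffer, 0, chunkSize, startByteIndex, function(err, byteCount) {
      fs.close(fd, function() {});
      if (err) throw err;
      var data = buffer.toString('utf-8', 0, byteCount);
      // process your data here
      // a full chunk means there may be more data after it
      var stillWorkToDo = byteCount === chunkSize;
      if (stillWorkToDo) {
        //recurse
        doMyCoolWork(endByteIndex, endByteIndex + 100);
      }
    });
  });
}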
Or look into one of the stream library functions for similar functionality.
HTH
PS. JavaScript and Node work extremely well with async and eventing. Using sync is an antipattern in my opinion, and likely to make the code a headache in future.
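As one possible take on the stream route, Node's built-in readline module can walk a file line by line without loading it all at once. This is a minimal sketch: the helper name forEachLine is invented, and the commented-out call uses the file name from the question.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: call `onLine` for every line of the file at `path`
// and resolve with the number of lines seen.
async function forEachLine(path, onLine) {
  const rl = readline.createInterface({
    input: fs.createReadStream(path, { encoding: 'utf8' }),
    crlfDelay: Infinity // treat \r\n as a single line break
  });
  let count = 0;
  for await (const line of rl) {
    await onLine(line);
    count++;
  }
  return count;
}

// Example call (the file must exist):
// forEachLine('./mailinglist.txt', line => console.log(line.split('\t')[0]));
```

Each line is handed to onLine as it is read, so the split and map over a million-entry array from the question is no longer needed.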
