I am having trouble consuming the response from my WebFlux server via JavaScript's new Streams API.
I can see via curl (with the help of --limit-rate) that the server is slowing down as expected, but when I try to consume the body in Google Chrome (64.0.3282.140), it is not slowing down like it should. In fact, Chrome downloads and buffers about 32 megabytes from the server even though only about 187 kB have been passed to write().
Is there something wrong with my JavaScript?
async function fetchStream(url, consumer) {
  const response = await fetch(url, {
    headers: {
      "Accept": "application/stream+json"
    }
  });
  const decoder = new TextDecoder("utf-8");
  let buffer = "";
  await response.body.pipeTo(new WritableStream({
    async write(chunk) {
      buffer += decoder.decode(chunk);
      const blocks = buffer.split("\n");
      if (blocks.length === 1) {
        return;
      }
      const indexOfLastBlock = blocks.length - 1;
      for (let index = 0; index < indexOfLastBlock; index++) {
        const block = blocks[index];
        const item = JSON.parse(block);
        await consumer(item);
      }
      buffer = blocks[indexOfLastBlock];
    }
  }));
}
According to the specification for Streams,
If no strategy is supplied, the default behavior will be the same as a
CountQueuingStrategy with a high water mark of 1.
So it should slow down when the promise returned by consumer(item) resolves very slowly, right?
Looking at the backpressure support in the Streams API, it seems that backpressure information is communicated within the Streams chain and not over the network. In this case, we can assume there is an unbounded queue somewhere, which would explain the behavior you're seeing.
This other GitHub issue suggests that the backpressure information indeed stops at the TCP level: the browser just stops reading from the TCP socket, which, depending on the current TCP window size/configuration, means the buffers fill up and TCP flow control kicks in. As that issue states, they can't set the window size manually and have to let the TCP stack handle things from there.
HTTP/2 supports flow control at the protocol level, but I don't know if the browser implementations leverage that with the Streams API.
I can't explain the behavior difference you're seeing, but I think you might be reading too much into the backpressure support here and that this works as expected according to the spec.
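If it helps, you can make the spec's default explicit by passing a strategy yourself. This is just a minimal sketch: the strategy only bounds how many chunks queue up inside the Streams chain, and it won't force the browser to throttle the download itself.

async function fetchWithExplicitStrategy(url, consumer) {
  // Same shape as your fetchStream(), but with the default queuing strategy
  // spelled out: at most one chunk is queued before backpressure is signalled
  // up the chain.
  const response = await fetch(url, {
    headers: { "Accept": "application/stream+json" }
  });
  await response.body.pipeTo(
    new WritableStream(
      {
        async write(chunk) {
          // The next write() is not called until this promise resolves.
          await consumer(chunk);
        }
      },
      new CountQueuingStrategy({ highWaterMark: 1 })
    )
  );
}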
Related
I would like to understand DoS (denial-of-service) attacks better, and I would like to know what my options are for learning about them with an example.
I have a basic Express server.
app.get('/ping', (req, res) => {
  res.send({ pong: 'pong', time: new Date().valueOf(), memory: process.memoryUsage() })
})
I will separately create some JavaScript code that will make multiple requests to the server, but I don't know how to devise strategies to try and bring down the server (consider that this is all running on localhost).
I want to see what the upper limit of requests is when testing this locally. I am experiencing what is described here: Sending thousands of fetch requests crashes the browser. Out of memory.
... the suggestions on that thread are more along the lines of "the browser is running out of memory" and that I should "throttle requests"... but I am actively trying to max out the number of requests the browser can make without crashing. So far my observations are that the server does not have any difficulty (so maybe I should also make requests from my phone and tablet?).
The code I have run in the browser isn't much more than:
const makeRequestAndLogTime = () => {
  const startTime = new Date().valueOf();
  fetch('http://localhost:4000/ping')
    .then(async (response) => {
      const { time, memory } = await response.json();
      console.log({
        startTime: 0,
        processTime: time - startTime,
        endTime: new Date().valueOf() - startTime,
        serverMemory: memory,
        browserMemory: performance['memory']
      })
    })
}

for (let x = 0; x < 100; x++) {
  makeRequestAndLogTime()
}
But depending on the value I use for the number of iterations of the for loop, performance gets slower and the browser eventually crashes (as expected)... I want to know if there is a way I could automate determining the upper limit of requests that I can make in my browsers.
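What I have in mind is something like the rough sketch below: keep ramping up the batch size until fetches start to fail, then report the last batch size that went through cleanly (the starting batch size, growth factor, and failure check are just placeholders).

// Rough sketch only: ramps up the number of parallel requests per batch
// until a batch contains failures, then reports the last fully successful size.
// If the tab itself runs out of memory, it may crash before reporting anything.
const findRequestCeiling = async (url, startBatch = 100, factor = 2) => {
  let batchSize = startBatch;
  let lastGoodBatch = 0;
  for (;;) {
    const results = await Promise.allSettled(
      Array.from({ length: batchSize }, () => fetch(url))
    );
    const failures = results.filter((r) => r.status === 'rejected').length;
    console.log({ batchSize, failures });
    if (failures > 0) {
      return lastGoodBatch;
    }
    lastGoodBatch = batchSize;
    batchSize *= factor;
  }
};

findRequestCeiling('http://localhost:4000/ping')
  .then((ceiling) => console.log('last fully successful batch size:', ceiling));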
I'm currently working on a small web app which should implement a cache-first scenario (users download the web app somewhere with Wi-Fi and should then be able to use it offline outside).
I'm not using any framework and therefore implement the caching (service worker) myself.
As I also integrate some PlayCanvas content (which has its own loading screen) via an iframe, I was wondering what overall loading strategy would make sense.
In a similar project I simply let the service worker download the assets in parallel to the (initial) load of the application.
But it occurred to me that it would be better to implement a workflow which is closer to native app behavior - showing an overall loading screen during the service worker download process and building/showing my main application after this process has finished (or has failed -> forced network scenario, or has already happened -> offline scenario). Another solution would be to show a non-blocking "Assets are still being downloaded" banner.
The main thoughts leading me to the second workflow were:
The SW loading screen / banner could provide better feedback to the user: "All assets downloaded - I'm safe to go offline", while the old scenario could cause issues here - successfully showing the user the first state while some critical files are still being downloaded in the background.
With the SW loading screen the download process is a bit more controllable/understandable for me, as the parallel processes of the SW download and the PlayCanvas loading, for example, become sequential.
It would be great if someone could provide me feedback/info on:
whether I'm on the right track with this second scenario being better, or whether it is just overhead
how / whether it might be possible to implement a cheap loading screen, showing for example "100 of 230 files downloaded" or similar
better strategies for this scenario in general
As always, thanks for any heads-up in advance.
A lot of this comes down to what you want your users to experience. The underlying technology is there to accomplish any of the scenarios you outline.
For instance, if you want to show information about the precaching progress during initial service worker installation, you could do that by adding code along the lines of the following.
In your service worker:
const PRECACHE_NAME = "...";
const URLS_TO_PRECACHE = [
  // ...
];

async function notifyClients(urlsCached, totalURLs) {
  const clients = await self.clients.matchAll({ includeUncontrolled: true });
  for (const client of clients) {
    client.postMessage({ urlsCached, totalURLs });
  }
}

self.addEventListener("install", (event) => {
  event.waitUntil(
    (async () => {
      const cache = await caches.open(PRECACHE_NAME);
      const totalURLs = URLS_TO_PRECACHE.length;
      let urlsCached = 0;
      for (const urlToPrecache of URLS_TO_PRECACHE) {
        await cache.add(urlToPrecache);
        urlsCached++;
        await notifyClients(urlsCached, totalURLs);
      }
    })()
  );
});
In your client pages:
// Optional: if controller is not set, then there isn't already a
// previous service worker, so this is a "first-time" install.
// If you would prefer, you could add this event listener
// unconditionally, and you'll get update messages even when there's an
// updated service worker.
if (!navigator.serviceWorker.controller) {
  navigator.serviceWorker.addEventListener("message", (event) => {
    const { urlsCached, totalURLs } = event.data;
    // Display a message about how many URLs have been cached.
  });
}
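If you wanted to turn that into the "100 of 230 files downloaded" indicator mentioned in the question, a minimal sketch of the message handler could look like the following (the #sw-progress element is just a hypothetical placeholder in your page):

// Hypothetical progress banner; assumes an element like
// <div id="sw-progress"></div> exists in the page.
navigator.serviceWorker.addEventListener("message", (event) => {
  const { urlsCached, totalURLs } = event.data;
  const banner = document.querySelector("#sw-progress");
  if (urlsCached === totalURLs) {
    banner.textContent = "All assets downloaded - you're safe to go offline";
  } else {
    banner.textContent = `${urlsCached} of ${totalURLs} files downloaded`;
  }
});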
Can anyone provide some guidance on pinpointing the bottleneck in a transform?
This is the Node.js version of Saxon-JS. I'm trying to increase the speed of transforming some XML documents so that I can provide a synchronous API that ideally responds in under 60 s (230 s is the hard limit of the Application Gateway). I also need to be able to handle XML files of up to 50 MB.
I've run Node's built-in profiler (https://nodejs.org/en/docs/guides/simple-profiling/), but it's tough to make sense of the results given that the source code of the free version of Saxon-JS is not really human-readable.
My Code
const path = require('path');
const SaxonJS = require('saxon-js');
const { loadCodelistsInMem } = require('../standards_cache/codelists');
const { writeFile } = require('../config/fileSystem');
const config = require('../config/config');
const { getStartTime, getElapsedTime } = require('../config/appInsights');

// Used for easy debugging the xslt stylesheet
// Runs iati.xslt transform on the supplied XML
const runTransform = async (sourceFile) => {
  try {
    const fileName = path.basename(sourceFile);
    const codelists = await loadCodelistsInMem();
    // this pulls the right array of SaxonJS resources from the resources object
    const collectionFinder = (url) => {
      if (url.includes('codelist')) {
        // get the right filepath (remove file:// and after the ?
        const versionPath = url.split('schemata/')[1].split('?')[0];
        if (codelists[versionPath]) return codelists[versionPath];
      }
      return [];
    };
    const start = getStartTime();
    const result = await SaxonJS.transform(
      {
        sourceFileName: sourceFile,
        stylesheetFileName: `${config.TMP_BASE_DIR}/data-quality/rules/iati.sef.json`,
        destination: 'serialized',
        collectionFinder,
        logLevel: 10,
      },
      'async'
    );
    console.log(`${getElapsedTime(start)} (s)`);
    await writeFile(`performance_tests/output/${fileName}`, result.principalResult);
  } catch (e) {
    console.log(e);
  }
};

runTransform('performance_tests/test_files/test8meg.xml');
Example console output:
❯ node --prof utils/runTransform.js
SEF generated by Saxon-JS 2.0 at 2021-01-27T17:10:38.029Z with -target:JS -relocate:true
79.938 (s)
❯ node --prof-process isolate-0x102d7b000-19859-v8.log > v8_log.txt
Files:
stylesheet
Example XML: test8meg.xml
Node profiling log: v8_log.txt
Snippet of the V8 log of the largest performance offender:
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 1.0% are not shown.
ticks parent name
33729 52.5% T __ZN2v88internal20Builtin_ConsoleClearEiPmPNS0_7IsolateE
6901 20.5% T __ZN2v88internal20Builtin_ConsoleClearEiPmPNS0_7IsolateE
3500 50.7% T __ZN2v88internal20Builtin_ConsoleClearEiPmPNS0_7IsolateE
3197 91.3% LazyCompile: *k /Users/nosvalds/Projects/validator-api/node_modules/saxon-js/SaxonJS2N.js:287:264
3182 99.5% LazyCompile: *<anonymous> /Users/nosvalds/Projects/validator-api/node_modules/saxon-js/SaxonJS2N.js:682:218
2880 90.5% LazyCompile: *d /Users/nosvalds/Projects/validator-api/node_modules/saxon-js/SaxonJS2N.js:734:184
Thanks a lot. There aren't a ton of resources on this anymore to walk myself through. I've also already tried:
Using the stylesheetInternal parameter with pre-parsed JSON (didn't make a large difference)
Splitting the document into separate documents that each contain only one <iati-activity> child element inside the <iati-activities> root element, transforming each separately, and putting the results back together - this ended up taking 2x as long.
Best,
Nik
You asked the same question at https://saxonica.plan.io/boards/5/topics/8105?r=8106, and I have responded there. I know StackOverflow doesn't like link-only answers, but I prefer to support users via our own support channels rather than via StackOverflow where possible.
There's a very common problem I have seen from many people who use different versions of their site for mobile and desktop; many themes have this feature. The issue is that Cloudflare caches the same page regardless of the user's device, causing mix-ups and inconsistencies between the desktop and mobile versions.
The most common solution is to serve the mobile version from a separate URL, but in my case I want to use the same URL and make Cloudflare's cache work properly for both desktop and mobile.
I found this very nice guide showing how to fix this issue; however, the worker code seems to be outdated and I had to modify some parts to make it work.
I created a new subdomain for my workers and then assigned the route to my site so it starts running.
The worker is caching everything; however, it does not have the desired behavior of keeping different cached versions according to the device.
async function run(event) {
  const { request } = event;
  const cache = caches.default;

  // Read the user agent of the request
  const ua = request.headers.get('user-agent');
  let uaValue;
  if (ua.match(/mobile/i)) {
    uaValue = 'mobile';
  } else {
    uaValue = 'desktop';
  }
  console.log(uaValue);

  // Construct a new request object which distinguishes the cache key by device
  // type.
  const url = new URL(request.url);
  url.searchParams.set('ua', uaValue);
  const newRequest = new Request(url, request);

  let response = await cache.match(newRequest);
  if (!response) {
    // Use the original request object when fetching the response from the
    // server to avoid passing on the query parameters to our backend.
    response = await fetch(request, { cf: { cacheTtl: 14400 } });
    // Store the cached response with our extended query parameters.
    event.waitUntil(cache.put(newRequest, response.clone()));
  }
  return response;
}

addEventListener('fetch', (event) => {
  event.respondWith(run(event));
});
It is indeed detecting the right user agent, but it should be keeping two separate cached versions according to the assigned query string...
I think maybe I'm missing some configuration; I don't know why it's not working as expected. As it is right now, my mobile and desktop cache versions still get mixed up.
The problem here is that fetch() itself already does normal caching, independent of your use of the Cache API around it. So fetch() might still return a cached response that is for the wrong UA.
If you could make your back-end ignore the query parameter, then you could include the query in the request passed to fetch(), so that it correctly caches the two results differently. (Enterprise customers can use custom cache keys as a way to accomplish this without changing the URL.)
If you do that, then you can also remove the cache.match() and cache.put() calls since fetch() itself will handle caching.
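For illustration, and assuming your back-end ignores the extra ua query parameter, the worker could then be reduced to something along these lines:

// Sketch only: assumes the origin ignores the extra `ua` query parameter.
// fetch() caches per URL, so adding ?ua=mobile|desktop keys the cache by device.
async function run(event) {
  const { request } = event;
  const ua = request.headers.get('user-agent') || '';
  const uaValue = /mobile/i.test(ua) ? 'mobile' : 'desktop';

  const url = new URL(request.url);
  url.searchParams.set('ua', uaValue);
  const newRequest = new Request(url, request);

  // Let fetch() handle caching; no cache.match()/cache.put() needed.
  return fetch(newRequest, { cf: { cacheTtl: 14400 } });
}

addEventListener('fetch', (event) => {
  event.respondWith(run(event));
});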
I am using WebSync3's JavaScript API and subscribing to approximately 9 different channels on one page. Firefox and Chrome have no problems, but IE9 is throwing an alert error stating "The request is too large for IE to process properly."
Unfortunately the internet has little to no information on this. So does anyone have any clues as to how to remedy this?
var client = fm.websync.client;

client.initialize({
  key: '********-****-****-****-************'
});

client.connect({
  autoDisconnect: true,
  onStreamFailure: function(args) {
    alert("Stream failure");
  },
  stayConnected: true
});

client.subscribe({
  channel: '/channel',
  onSuccess: function(args) {
    alert("Successfully connected to stream");
  },
  onFailure: function(args) {
    alert("Failed to connect to stream");
  },
  onSubscribersChange: function(args) {
    var change = args.change;
    for (var i = 0; i < change.clients.length; i++) {
      var changeClient = change.clients[i];
      if (change.type == 'subscribe') {
        // If someone subscribes to the channel
      } else {
        // If someone unsubscribes from the channel
      }
    }
  },
  onReceive: function(args) {
    text = args.data.text;
    text = text.split("=");
    text = text[1];
    if (text != "status" && text != "dummytext") {
      //receiveUpdates(id, serial_number, args.data.text);
      var update = eval('(' + args.data.text + ')');
    }
  }
});
This error occurs when WebSync is using the JSON-P protocol for transfers. This is mostly just for IE in cross-domain environments, meaning WebSync is on a different domain than the one your webpage is being served from, so IE doesn't want to make regular XHR requests for security reasons.
JSON-P basically encodes the up-stream data (your 9 channel subscriptions) as a URL-encoded string that is tacked onto a regular request to the server. The server is supposed to interpret that URL-encoded string and send back the response as a JavaScript block that gets executed by the page.
This works fine, except that IE also limits the overall URL of an HTTP request to roughly 2 kB. So if you pack too much into a single request to WebSync, you might exceed this 2 kB upstream limit.
The easiest solution is to either split up your WebSync requests into smaller pieces (i.e., subscribe to only a few channels at a time in JavaScript), or to subscribe to one "master channel" and then program a WebSync BeforeSubscribe event that watches for that channel and rewrites the subscription channel list.
I suspect, because you have a key in your example source above, that you are using WebSync On-Demand? If that's the case, the only way to add a BeforeSubscribe event handler is to create a WebSync proxy.
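For the first option, a rough sketch (reusing the client object from your snippet; the channel names, batch size, and timing are placeholders, and exactly how WebSync batches these calls into requests may vary) could look like this:

// Sketch only: subscribe a few channels at a time so each underlying
// JSON-P request stays well under IE's ~2 kB URL limit.
var channels = ['/channel1', '/channel2', '/channel3', '/channel4', '/channel5',
                '/channel6', '/channel7', '/channel8', '/channel9'];
var batchSize = 3;

function subscribeBatch(startIndex) {
  if (startIndex >= channels.length) {
    return;
  }
  var batch = channels.slice(startIndex, startIndex + batchSize);
  for (var i = 0; i < batch.length; i++) {
    client.subscribe({
      channel: batch[i],
      onFailure: function(args) {
        alert("Failed to subscribe");
      }
    });
  }
  // Give the previous requests a chance to go out before sending more.
  setTimeout(function() {
    subscribeBatch(startIndex + batchSize);
  }, 250);
}

subscribeBatch(0);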
So for the moment, since everyone else is stumped by this question as well, I put a trap in my PHP to not even load this JavaScript if the browser is Internet Destroyer (uhh, I mean Internet Explorer). Maybe a solution will come in the future though.