I'm trying to write a simple raycaster using an HTML5 canvas and I'm getting abysmal framerates. Firefox's profiler reports 80% of my runtime is spent in context2d.fillRect(), which I'm using per column for floors and ceilings and per pixel on the textured walls. I came across this question, and found that fillRect was faster than 1x1 px pictures by 40% on chrome and 4% on firefox. They mention how it's still calculating alpha stuff, although I'd assume if the alpha is 1 it'd have a pass through for opaque rendering? Are there any other methods for doing a lot of rectangle and pixel blitting with javascript I should try out?
A solution I have found to this is to put the fillRect() call inside a path each time you call it:
canvasContext.beginPath();
canvasContext.rect(1, 1, 10, 10);
canvasContext.fill();
canvasContext.closePath();
It seems that the call to rect() adds to the path of the previous rect() call, if used incorrectly this can lead to a memory leak or a buildup of resource usage.
There are two solutions that you could try in order to reduce the CPU usage of your renderings.
Try using requestAnimationFrame in order for your browser to render your canvas graphics whenever it is ready and especially only when the user has the canvas tab opened in the browser. More info: https://developer.mozilla.org/en-US/docs/Web/API/window.requestAnimationFrame
Second solution, depending on whether part or all of your content is dynamic, is to pre-render portions of your canvas on other "hidden" canvas in memory with JavaScript (for instance you split your main canvas in 4 sub canvas then only have to draw 4 elements on screen).
PS: If you're on a mac using Firefox combined with multiple canvas renderings will boost your CPU usage compared to Chrome
Hope this helps
Related
I am programming a simple webgl application, which draws multiple images (textures) over each other. Depending on the scroll position, the scale and opacity of the images is changed, to create a 3D multi-layer parallax effect. You can see the effect here:
http://gsstest.gluecksschmiede.io/ct2/
I am currently working on improving the performance, because the effect does not perform well on older devices (low fps). I lack the in-depth knowledge in webgl (and webgl debugging) to see what is the reason for the bad performance, so I need help. This question only is concerned with desktop devices.
I've tried / I am currently:
always working with the same program and shader pair
the images are in 2000x1067 and already compressed. I need png because of the transparency. I could compress them a little more, but not much. The resolution has to be that way.
already using requestAnimationFrame and non-blocking scroll listeners
The webgl functions I am using to draw the image can be read in this file:
http://gsstest.gluecksschmiede.io/ct2/js/setup.js
Shader code can be found here (just right click -> show sourcecode):
http://gsstest.gluecksschmiede.io/ct2/
Basically, I've used this tutorial/code and did just a few changes:
https://webglfundamentals.org/webgl/lessons/webgl-2d-drawimage.html
I'm then using this setup code to draw the images depending on current scroll position as seen in this file (see the "update" method):
http://gsstest.gluecksschmiede.io/ct2/js/para.js
In my application, about 15 images of 2000x1067 size are drawn on to each other for every frame. I expected this to perform way better than it actually is. I don't know what is causing the bottleneck.
How you can help me:
Provide hints or ideas what code / image compression / whatever changes could improve rendering performance
Provide help on how to debug the performance. Is there a more clever why then just printing out times with console.log and performance.now?
Provide ideas on how I could gracefully degrade or provide a fallback that performance better on older devices.
This is just a guess but ...
drawing 15 fullscreen images is going to be slow on many systems. It's just too many pixels. It's not the size of the images it's the size they are drawn. Like on my MacBook Air the resolution of the screen is 2560x1600
You're drawing 15 images. Those images are drawn into a canvas. That canvas is then drawn into the browser's window and the browser's window is then drawn on the desktop. So that's at least 17 draws or
2560 * 1600 * 17 = 70meg pixels
To get a smooth framerate we generally want to run at 60 frames a second. 60 frames a second means
60 frames a second * 70 meg pixels = 4.2gig pixels a second.
My GPU is rated for 8gig pixels a second so it looks like we might get 60fps here
Let's compare to a 2015 Macbook Air with a Intel HD Graphics 6000. Its screen resolution is 1440x900 which if we calculate things out comes to 1.3gig pixels at 60 frames a second. It's GPU is rated for 1.2gig pixels a second so we're not going to hit 60fps on a 2015 Macbook Air
Note that like everything, the specified max fillrate for a GPU is one of those theoretical max things, you'll probably never see it hit the top rate because of other overheads. In other words, if you look up the fillrate of a GPU multiply by 85% or something (just a guess) to get the fillrate you're more likely to see in reality.
You can test this easily, just make the browser window smaller. If you make the browser window 1/4 the size of the screen and it runs smooth then your issue was fillrate (assuming you are resizing the canvas's drawing buffer to match its display size). This is because once you do that less pixels are being draw (75% less) but all the other work stays the same (all the javascript, webgl, etc)
Assuming that shows your issue is fillrate then things you can do
Don't draw all 15 layers.
If some layers fade out to 100% transparent then don't draw those layers. If you can design the site so that only 4 to 7 layers are ever visible at once you'll go a long way to staying under your fillrate limit
Don't draw transparent areas
You said 15 layers but it appears some of those layers are mostly transparent. You could break those apart into say 9+ pieces (like a picture frame) and not draw the middle piece. Whether it's 9 pieces or 50 pieces it's probably better than 80% of the pixels being 100% transparent.
Many game engines if you give them an image they'll auto generate a mesh that only uses the parts of the texture that are > 0% opaque. For example I made this frame in photoshop
Then loading it into unity you can see Unity made a mesh that covers only the non 100% transparent parts
This is something you'd do offline either by writing a tool or doing it by hand or using some 3D mesh editor like blender to generate meshes that fit your images so you're not wasting time trying to render pixels that are 100% transparent.
Try discarding transparent pixels
This you'd have to test. In your fragment shader you can put something like
if (color.a <= alphaThreshold) {
discard; // don't draw this pixel
}
Where alphaThreashold is 0.0 or greater. Whether this saves time might depend on the GPU since using discarding is slower than not. The reason is if you don't use discard then the GPU can do certain checks early. In your case though I think it might be a win. Note that option #2 above, using a mesh for each plane that only covers the non-transparent parts is by far better than this option.
Pass more textures to a single shader
This one is overly complicated but you could make a drawMultiImages function that takes multiple textures and multiple texture matrices and draws N textures at once. They'd all have the same destination rectangle but by adjusting the source rectangle for each texture you'd get the same effect.
N would probably be 8 or less since there's a limit on the number of textures you can in one draw call depending on the GPU. 8 is the minimum limit IIRC meaning some GPUs will support more than 8 but if you want things to run everywhere you need to handle the minimum case.
GPUs like most processors can read faster than they can write so reading multiple textures and mixing them in the shader would be faster than doing each texture individually.
Finally it's not clear why you're using WebGL for this example.
Option 4 would be fastest but I wouldn't recommend it. Seems like too much work to me for such a simple effect. Still, I just want to point out that at least at a glance you could just use N <div>s and set their css transform and opacity and get the same effect. You'd still have the same issues, 15 full screen layers is too many and you should hide <div>s who's opacity is 0% (the browser might do that for you but best not to assume). You could also use the 2D canvas API and you should see similar perf. Of course if you're doing some kind of special effect (didn't look at the code) then feel free to use WebGL, just at a glance it wasn't clear.
A couple of things:
Performance profiling shows that your problem isn't webgl but memory leaking.
Image compression is worthless in webgl as webgl doesn't care about png, jpg or webp. Textures are always just arrays of bytes on the gpu, so each of your layers is 2000*1067*4 bytes = 8.536 megabytes.
Never construct data in your animation loop, you do, learn how to use the math library you're using.
I'm currently developing a web client (HTML5 and JavaScript) for my city building game project. The client interacts through SignalR with a web server written in C#/.NET, on which all the game logic resides.
The implementation requires quite a large map which is implemented by a set of canvas elements that represent different layers. The actual drawing of the map consists of drawing 25x25px cells, some of which are animated. That means that there's a lot of small 'drawImage'-invocations taking place on the '2D contexts'.
The current implementation works fine and smooth in Mozilla Firefox, Internet Explorer and Edge. It is however tremendously slow on Google Chrome, most likely due to the fact that the implementation does not play nice with its hardware accelerated rendering.
Acquisition of the tile cell .PNG-images are done through downloading them from the web server and storing them in in-memory 'Image'-objects. From there, I draw them directly to the canvas when necessary. If my current research is done correctly this is where the bottleneck resides; the source 'Image'-objects reside in CPU-memory whereas the target Canvas-element is optimized for GPU-memory access, resulting in a lot of swapping.
I have tried moving the 'Image'-objects to a large off-screen 'buffer' canvas (large enough so that the hardware acceleration is supposed to kick in on Chrome) but this does not produce a noticable difference:
https://github.com/Miragecoder/Urbanization/commit/86ac62a785b233eea28c53b8a7d474ef92ffc283
I have also tried implementing deferred invocations of the 'drawImage'-function through requestAnimationFrame but this too did not produce noticeable differences.
I have the following questions:
Do I understand the problem correctly?
What can I do to improve the performance of the web client?
Some links to questions I have researched but so far with no result:
HTML5 Canvas slow on Chrome, but fast on FireFox
https://gamedev.stackexchange.com/questions/32221/huge-performance-difference-when-using-drawimage-with-img-vs-canvas
https://code.google.com/p/chromium/issues/detail?id=170021
Google Chrome hardware acceleration making game run slow
Your main problem seems to be the number of different Image-objects you use to draw to the canvas. You absolutely should use a Textureatlas, putting as many graphics as possible in a single image.
You should then render your sprites from as few main-images as possible by specifying the relevant rectangle:
void ctx.drawImage(image, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight);
instead of the whole image:
void ctx.drawImage(image, dx, dy);
see CanvasRenderingContext2D.drawImage() for details. This way, you avoid too many context switches. As Blindman67 mentioned, you should be careful to switch between texture atlases as few times as possible - for example, you may want to use a single texture atlas for all your rendering to canvas1, another one for all your images for canvas2 etc.
I am currently combining two high resolution images into an html5 canvas element.
The first image is a JPEG and contains all color data, and the second image is an alpha mask PNG file.
I then combine the two to produce a single RGBA canvas.
The app is dealing with 2048x2048 resolution images, that need to also maintain their alpha channel. Therefore by using this method as ooposed to simply using PNG's, I have reduced the average filesize from around 2-3mb to 50-130kb plus a 10kb png alpha mask.
The method I use is as follows:
context.drawImage(alpha_mask_png, 0, 0, w, h);
context.globalCompositeOperation = 'source-in';
context.drawImage(main_image_jpeg, 0, 0, w, h);
Unfortunately this operation takes around 100-120ms. And is only carried out once for each pair of images as they are loaded. While this wouldn't normally be an issue, in this case an animation is being rendered to another main visible canvas (of which these high res images are the source art for) which suffers from a very noticable 100ms judder (mostly perceptible in firefox) whenever new source art is streamed in, loaded, and combined.
What I am looking for is a way to reduce this.
Here is what I have tried so far:
Implemented WebP for Google chrome, removing the need to combine the JPEG and PNG alpha mask altogether. Perfect in Chrome only but need a solution mostly for Firefox (IE 10/11 seems to perform the operation far quicker)
I have tried loading the images in a webworker and decoding them both, followed by combining them. All in pure javascript. This works but is far too slow to be of use.
I have also tried using WebP polyfills. Weppy is very fast and when ran in a webworker does not effect the main loop. However it does not support alpha transparency so is of no use which is a real shame as this method is very close. libwebpjs works okay within a webworker but like my manual decoding of the JPEG/PNG, is far too slow.
EDIT: To further clarify. I have tried transferring the data from my webworkers using transferrable objects and have even tried turning the result into a blob and creating an objectURL which can then be loaded by the main thread. Although there is no lag in the main thread any more, the images simply take far too long to decode.
This leaves me with WebGL. I have literally no understanding of how WebGL works other than I realise that I would need to load both the JPEG and PNG as seperate textures then combine them with a shader. But I really wouldn't know where to begin.
I have spent some time playing with the code from the following link:
Blend two canvases onto one with WebGL
But to no avail. And to be honest I am concerned that loading the images as textures might actually take longer than my original method anyway.
So to summarise, I am really looking for a way of speeding up the operation on high resolution images. (1920x1080 - 2048x2048) be it with the use of WebGL or indeed any other method.
Any help or suggestions would be greatly appreciated.
For a game, I'm trying to calculate light and shadows. For this, I break down my canvas into square areas and calculate, if a light ray would be blocked on the way from the player to each square position. I've managed now to reach a reasonably good performance for those calculations.
The results are then visualized by covering non-visible areas with dark squares (Canvas.fillRect(...)), but this step becomes too expensive when a want a nice resolution, i.e. ~10'000 squares for calculation. I've tried to first render them into an off-screen canvas (=buffer), then draw the buffer on my visible canvas, but I could not experience any remarkable performance improvement.
Is there something I missed, or are there other methods to fasten up drawing?
Update:
The affected code can be found here: https://github.com/otruffer/Ape_On_Tape/blob/master/src/client/js/visibility.js (Code is a bit too big to post here)
The actual drawing takes place in drawCloudAt(...) and flushBuffer() in the lower part of this file.
Doing real-time light calculation in software is always a performance killer. Did you consider using a real 3d engine instead which does the light calculation on the video card? Yes, Javascript can do that now - this new feature is called WebGL.
When you just need a faster way to apply your lightmap, you could manipulate the RGB values of your canvas directly instead of using fillRect. You can use getImageData to get an array of raw 8 bit RGBA values of your canvas. You can then manipulate this array and put it back with putImageData.
I'm investigating the possibility of producing a game using only HTML's canvas as the display media. To take an example task I need to do, I need to construct the game environment from a number of isometric tiles. Of course, working in 2D means they by necessity come in rectangular packages so there's a large overlap between tiles.
I'm old enough that the natural solution to this problem is to call BitBltMasked. Oh wait, no, an HTML canvas doesn't have something as simple and as pleasing as BitBlt. It seems that the only way to dump pixel data in to a canvas is either with drawImage() which has no useful drawing modes that ignore the alpha channel or to use ImageData objects that have the image data in an array.. to which every. access. is. bounds. checked. and. therefore. dog. slow.
OK, that's more of a rant than a question (things the W3C like tend to provoke that from me), but what I really want to know is how to draw fast to a canvas? I'm finding it very difficult to ditch the feeling that doing 100s of drawImages() a second where every draw respects the alpha channel is inherently sinful and likely to make my application perform like arse in many browsers. On the other hand, the only way to implement BitBlt proper relies heavily on a browser using a hotspot-like execution technique to make it run fast.
Is there any way to draw fast across every possible implementation, or do I just have to forget about performance?
This is a really interesting problem, and there's a few interesting things you can do to solve it.
First, you should know that drawImage can accept a Canvas, not just an image. The "sub-Canvas"es don't even need to be in the DOM. This means that you can do some compositing on one canvas, then draw it to another. This opens a whole world of optimization opportunities, especially in the context of isometric tiles.
Let's say you have an area that's 50 tiles long by 50 tiles wide (I'll say meters for the sake of my own sanity). You might divide the area into 10x10m chunks. Each chunk is represented by its own Canvas. To draw the full scene, you'd simply draw each of the chunks' Canvas objects to the main canvas that's shown to the user. If only four chunks (a 20x20m area), you would only perform four drawImage operations.
Of course, each of those individual chunks will need to render its own Canvas. On game ticks where nothing happens in the chunk, you simply don't do anything: the Canvas will remain unchanged and will be drawn as you'd expect. When something does change, you can do one of a few things depending on your game:
If your tiles extend into the third dimension (i.e.: you have a Z-axis), you can draw each "layer" of the chunk into its own Canvas and only update the layers that need to be updated. For example, if each chunk contains ten layers of depth, you'd have ten Canvas objects. If something on layer 6 was updated, you would only need to re-paint layer 6's Canvas (probably one drawImage per square meter, which would be 100), then perform one drawImage operation per layer in the chunk (ten) to re-draw the chunk's Canvas. Decreasing or increasing the chunk size may increase or decrease performance depending on the number of update you make to the environment in your game. Further optimizations can be made to eliminate drawImage calls for obscured tiles and the like.
If you don't have a third dimension, you can simply perform one drawImage per square meter of a chunk. If two chunks are updated, that's only 200 drawImage calls per tick (plus one call per chunk visible on the screen). If your game involves very few updates, decreasing the chunk size will decrease the number of calls even further.
You can perform updates to the chunks in their own game loop. If you're using requestAnimationFrame (as you should be), you only need to paint the chunk Canvas objects to the screen. Independently, you can perform game logic in a setTimeout loop or the like. Then, each chunk could be updated in its own tick between frames without affecting performance. This can also be done in a web worker using getImageData and putImageData to send the rendered chunk back to the main thread whenever it needs to be updated, though making this work seamlessly will take a good deal of effort.
The other option that you have is to use a library like pixi.js to render the scene using WebGL. Even for 2D, it will increase performance by decreasing the amount of work that the CPU needs to do and shifting that over to the GPU. I'd highly recommend checking it out.
I know that GameJS has blit operations, and I certainly assume any other html5 game libraries do as well (gameQuery, LimeJS, etc etc). I don't know if these packages have addressed the specific array-bounds-checking concern that you had, but in practice their samples seem to work plenty fast on all platforms.
You should not make assumptions about what speedups make sense. For example, the GameJS developer reports that he was going to implement dirty rectangle tracking but it turned out that modern browsers do this automatically---link.
For this reason and others, I suggest to get something working before thinking about the speed. Also, make use of drawing libraries, as the authors have presumably spent some time optimizing performance.
I have no personal knowledge about this, but you can look into the appMobi "direct canvas" HTML element which is allegedly a much faster version of normal canvas, link. I'm confused about whether this works in all browsers or just webkit browsers or just appMobi's own special browser.
Again, you should not make assumptions about what speedups make sense without a very deep knowledge of web browser internal processes. That webpage about "direct canvas" mentions a bunch of things that slow down canvas-drawing: "Reflowing text, mapping hot spots, creating indexes for reference links, on and on." Alpha-blending and array-bounds-checking are not mentioned as prominent causes of slowness!
Unfortunately, there's no way around the alpha composition overhead. Clipping may be one solution, but I doubt there would be much, if any, performance gain. Not to mention how complicated such a route would be to implement on irregular shapes.
When you have to draw the entire display, you're going to have to deal with the performance hit. Although afterwards, you have a whole screen's worth of pre-calculated alpha imagery and you can draw this image data at an offset in one drawImage call. Then, you would only have to individually draw the new tiles that are scrolled into view.
But still, the browser is having to redraw each pixel at a different location in the canvas. Which is quite expensive. It would be nice if there was a method for just scrolling pixels, but no luck there either.
One idea that comes to mind is that you could implement multiple canvases, translating each individual canvas instead of redrawing the pixels. This would allow the browser to decide how to redraw those pixels, in a more native way, at least in theory anyway. Then you could render the newly visible tiles on a new, or used/cached, canvas element. Positioning it to match up with the last screen render.
But that's just my two blits... I mean bits... duh, I mean cents :]