We are building an app that lets a user capture a number of images, from which it tries to create a 3D model of the target object. To help users capture useful images, we give them some guidance while they move their phone from one capture to the next.
We have a nice prototype working by means of navigator.mediaDevices.getUserMedia() that captures video, displays it in a <video> element, and shows an overlay indicating how to move the phone. When the user is ready, they press a button and we grab the current frame of the streamed video.
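For context, here is roughly what that frame-grab prototype looks like (the element IDs and camera constraints are illustrative, not our actual code):

```javascript
// Stream the rear camera into a <video> element, then copy the current
// frame to a canvas when the user presses the capture button.
// 'preview', 'capture' and 'shutter' are made-up element IDs.
const video = document.getElementById('preview');
const canvas = document.getElementById('capture');

navigator.mediaDevices.getUserMedia({ video: { facingMode: 'environment' } })
  .then((stream) => {
    video.srcObject = stream;
    return video.play();
  });

document.getElementById('shutter').addEventListener('click', () => {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0);
  // canvas.toBlob(blob => ...) would hand the frame off for reconstruction.
});
```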
We were quite happy with this until we realized that the captured images often lack quality; mainly, they tend to be a bit blurred because the user may not hold the device totally still. This causes the math behind creating the 3D model to fail.
I am now tasked with trying to improve this, but I don't think I have many options. Here is what I have been investigating, along with the drawbacks of each:
JavaScript's ImageCapture API. This seems to be exactly what we need: a way to actually take a picture instead of grabbing a frame from a video. While the API still has experimental status, it seems pretty stable, and Chrome has implemented it since version 59. The problem is that Safari (our main target) has not implemented it, and it seems they never will. I can't really find information on what their plan is, but as of today this is not an option.
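For reference, the ImageCapture usage would look roughly like this where it is supported (the 'result' element is a made-up example target):

```javascript
// Take an actual photo from the camera track instead of grabbing a
// video frame (Chrome 59+; not available in Safari).
navigator.mediaDevices.getUserMedia({ video: true })
  .then((stream) => {
    const track = stream.getVideoTracks()[0];
    const imageCapture = new ImageCapture(track);
    return imageCapture.takePhoto();       // resolves with a Blob
  })
  .then((blob) => {
    // 'result' is a made-up <img> element for the example.
    document.getElementById('result').src = URL.createObjectURL(blob);
  });
```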
Use an input element of type file with the capture attribute. While this lets me capture images with the native camera, as far as I know I cannot give the user any guidance.
Build a whole mobile app. This requires another year of work and asking our existing users to install an app, which may not be possible. It would also leave Android devices out, which we'd prefer to avoid.
While typing this I thought of perhaps using the video itself instead of capturing still images, but I'm not sure that would help in any way.
Instead of a different approach to capturing the image, I could try to grab the frame only when I can confirm that the device is as close to still as possible (using a threshold value). Perhaps I could use the gyroscope for this (we already use it to check that they have moved the device to a position and angle we consider useful for the process). The drawback is that I am not sure it would really mitigate our problem: how still is still enough? Is it possible for a person to be that still for a second?
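For what it's worth, a sketch of that stillness gate using the devicemotion event; the threshold and window values are assumptions that would need tuning on real devices:

```javascript
// Track recent accelerometer readings and only allow capture when the
// device has been (nearly) still for a short window.
const STILL_THRESHOLD = 0.15;  // residual motion in m/s^2 (assumed)
const STILL_WINDOW_MS = 500;   // how long it must stay below it (assumed)
let lastMotionAt = 0;

window.addEventListener('devicemotion', (e) => {
  const a = e.acceleration;    // gravity-compensated acceleration
  if (!a) return;
  const magnitude = Math.hypot(a.x || 0, a.y || 0, a.z || 0);
  if (magnitude > STILL_THRESHOLD) lastMotionAt = Date.now();
});

function deviceIsStill() {
  return Date.now() - lastMotionAt > STILL_WINDOW_MS;
}

// In the capture handler: if (!deviceIsStill()) ask the user to hold steady.
```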
So my question here is: can anyone think of another alternative to those I described, or perhaps an improvement to one of them?
BTW, does anyone know what Apple's plans are for the ImageCapture API?
I have a JavaScript canvas game built with pixi.js that requires a player to press a certain combination of buttons to complete a level. They basically have to press the button that matches a certain color.
It turns out that players are writing bots in Python to do this task, and they are getting the maximum score every time. The game is already live and users enjoy playing it, so I can't really change anything gameplay-wise.
So I thought about a few possible solutions, but I have some concerns:
1. Captcha between each level
2. Check the speed of the input
3. Check how consistent the input is
The captcha will hurt the user experience, and there are tons of videos on how to bypass it. Options 2 and 3 will fail once the creators of the bots understand what is happening. So I am really stuck on what I could do.
I would consider a random grace period before allowing the buttons to be clicked. This may stump some bots, but it is circumventable.
Besides that, I would profile the timing of the clicks/interactions (a sketch follows below). Every time the next level is requested, compare it to the profile, and if the timings are consistently the same, introduce a randomized button id, button shape (circle, oval, square, etc.), and button placement (swap buttons) to avoid easy scripting. The font and the actual text could also be varied.
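A rough sketch of the timing-profile idea; the variance check and the 5 ms cutoff are deliberately simplistic placeholders:

```javascript
// Record the intervals between clicks for a level and flag runs whose
// timing is too regular. Humans vary by tens of milliseconds; bots often
// click with near-constant timing.
const intervals = [];
let lastClick = 0;

function recordClick() {
  const now = performance.now();
  if (lastClick) intervals.push(now - lastClick);
  lastClick = now;
}

function looksRobotic() {
  if (intervals.length < 5) return false;
  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length;
  const variance = intervals.reduce((a, b) => a + Math.pow(b - mean, 2), 0)
                   / intervals.length;
  return Math.sqrt(variance) < 5;  // suspiciously uniform timing (assumed cutoff)
}
```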
I would also change the input element to <input type="image">, since it gives you the exact click coordinates (if possible - I'm not familiar with pixi.js), and this will aid in the profiling.
You could also implement some sort of mouse-position tracker, but people on touchscreens will not produce data for this. You could supplement it with an additional check for touch input, but a bot would easily be able to circumvent it.
EDIT
I don't know if a library that detects other JavaScript imports, and thereby potential bots, would be applicable. It might be one avenue to consider.
Doing something like checking whether the user has a certain Chrome extension installed, to verify that you are running in a browser and not in a Python environment, could be another avenue. It would mean restricting your users to certain browsers and, like a lot of other code, it could be circumvented. Cost/benefit should be kept in mind here.
If everything is being run through an actual browser with some sort of headless interface, it is not going to be useful at all.
EDIT 2
A quick googling of python automate browser game brings up a tutorial on how to automate browser games with Python. Based on a cursory glance, making your buttons move around and changing the font would be effective, and even resizing the playing area "randomly" (even if you have a full-screen function) may be a viable defense. Following that tutorial, trying to automate your own game with it, and seeing how to block it would be a good exercise.
You could also consider asking some students for help. This could be a good project idea for the many computer studies courses that offer project-based work. It could also be a student-job type deal, if you want to ensure that you get a result and a "report".
I think your approach is valid. It seems a bit excessive to add a captcha between each level; perhaps add it before the game starts.
It might be a good idea to check the interval between individual clicks, and define some threshold below which you can safely assume that it was a bot that clicked the button.
Another approach you could take is to make it more complicated to look up the correct buttons: randomize the element IDs, and render the labels not inside the buttons but as separate elements (I assume it is a game with some fixed window size and you don't care about mobile that much). A sketch of the randomization idea follows below.
I am not familiar with Pixi.js, but that could be an approach to consider.
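Here is the general shape of that randomization in plain JavaScript, assuming pixi.js-style objects with x/y properties (the slot list and id scheme are placeholders):

```javascript
// Shuffle button positions and ids each level so recorded coordinates
// and element lookups go stale.
function shuffle(array) {
  for (let i = array.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [array[i], array[j]] = [array[j], array[i]];
  }
  return array;
}

function randomizeButtons(buttons, slots) {
  shuffle(slots.slice()).forEach((slot, i) => {
    buttons[i].x = slot.x;  // move the button to a new slot
    buttons[i].y = slot.y;
    buttons[i].name = 'btn-' + Math.random().toString(36).slice(2);  // fresh id
  });
}

// Call randomizeButtons(levelButtons, layoutSlots) whenever a level loads.
```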
EDIT
What if you run your game in an iframe?
As we know, the host has the ability to disable virtual backgrounds on Zoom.
I was thinking maybe we can take care of the problem by adding another "camera" that will output what we want: pictures! So now, instead of people seeing what your camera shows, they will see the picture you put up. This would also solve the headaches of students who want to appear on Zoom while not being there: they can just make a GIF of themselves naturally moving around, and this hack will work (the teacher won't pay attention to the fact that one out of 25 students has his eye blinks on a loop :) ).
If someone can develop a script for that, it would be a life saver for everyone!
I'm not sure how the script would work either: intercept the video input of the built-in camera, or make another camera input available for this feature... and make sure it is cross-platform ;)
There is a program named "ManyCam" that functions as a webcam source in which you can set overlays, pictures, and even GIFs and videos. I think the free version already offers a lot of these options.
This would be my option, at least.
I have a web application used for virtual house tours. Currently I am using VRView for these tours and it has worked pretty well; however, I've run into an issue with the gyroscope that I need fixed as soon as possible.
VRView automatically rotates the camera based on the user's device orientation. As users turn their phone, the virtual house tour turns with it, so they are able to "look around" the house. This is great for most use cases, but lower-end devices have trouble processing this sort of change. I need a way for users to disable the automatic rotation and simply swipe on their phones to look around.
I've tried the Permissions API, attempting to revoke access to the gyroscope, but due to browser compatibility issues with that API it doesn't work. I also can't find any documentation on this in the VRView library. Any help is much appreciated.
tldr;
You're right, this doesn't seem to be available via their API. It looks like you may have to fork the library and make some adjustments. If you want to go down this path, I'd suggest forking the repo, seeing if you can successfully disable the motion emitter, and then see if you can use the webvr-polyfill to initiate drag controls. It may also be possible to just disable the gyro-based rotation via webvr-polyfill directly.
More in-depth explanation:
The motion information is being published to the VR View iframe (which I believe then feeds it to the webvr-polyfill controls) in two locations:
https://github.com/googlearchive/vrview/blob/bd12336b97eccd4adc9e877971c1a7da56df7d69/scripts/js/device-motion-sender.js#L35
https://github.com/googlearchive/vrview/blob/bd12336b97eccd4adc9e877971c1a7da56df7d69/src/api/iframe-message-sender.js#L45
When a browser's UA (user agent) flag indicates it can't use gyro controls, you would need to include a flag that disables this functionality (or disables the listener in the iframe).
Normally, to enable drag rotation, I think you would then need to write a listener for the start and end of drag events that translates those events into camera rotation. (Something similar to what this person is suggesting: https://github.com/googlearchive/vrview/issues/131#issuecomment-289522607)
However, it appears that the controls are imported via webvr-polyfill. The 'window.WebVRConfig' object is coming from webvr-polyfill, if I'm following this correctly.
See here: https://github.com/googlearchive/vrview/blob/bd12336b97eccd4adc9e877971c1a7da56df7d69/src/embed/main.js#L77
The code above shows VR View adjusting the WebVRConfig when it detects a certain flag (in this case the 'YAW_ONLY' attribute). I believe you would have to do something similar.
https://github.com/immersive-web/webvr-polyfill
See here for the YAW_ONLY attribute: https://github.com/immersive-web/webvr-polyfill/blob/e2958490653bfcda4df83881d554bcdb641cf45b/src/webvr-polyfill.js#L68
See here for an example adjusting controls in webvr-polyfill:
https://github.com/immersive-web/webvr-polyfill#using
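For illustration, a minimal sketch of the kind of change a fork might make, run before webvr-polyfill loads. YAW_ONLY and TOUCH_PANNER_DISABLED are documented webvr-polyfill options, but the no_gyro URL flag is hypothetical, and whether YAW_ONLY alone gives the desired swipe-only behavior would need testing:

```javascript
// Run before webvr-polyfill is loaded (e.g. in a fork of VR View's
// src/embed/main.js), mirroring how VR View handles its YAW_ONLY flag.
window.WebVRConfig = window.WebVRConfig || {};

if (/[?&]no_gyro=true/.test(window.location.search)) {
  // Restrict device-orientation input to yaw so pitch/roll jitter from
  // the gyro is ignored; touch panning stays on so users can swipe.
  window.WebVRConfig.YAW_ONLY = true;
  window.WebVRConfig.TOUCH_PANNER_DISABLED = false;
}
```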
Is there any way to have two or more (preferably three) HTML5 <video> tags playing simultaneously and in perfect sync?
If I have, let's say, three tiles of one video and I want them to appear in the browser as one big video, they need to be perfectly synchronized, without even the smallest visual/vertical hint that they are tiled.
Unfortunately I cannot use MediaController because it is not supported well enough.
I've tried some workarounds, including canvases, but I still get visible differences between the tiles. Has anyone had a similar problem/solution?
Disclaimer: I'm not a video guy, but here are some thoughts anyway.
If they need to be absolutely perfect... you are fighting several problems at once:
1. A device might not be powerful enough to acquire, synchronize, and render 3 streams at once.
2. Even if #1 is solved, a device is never totally dedicated to your task. For example, it might pause for garbage collection between processing stream #1 and stream #2, resulting in dropped/unsynchronized frames.
So to give yourself the best chance at perfection, you should first merge your 3 videos into 1 vertical video in the studio (or using studio software).
Then you can use the source-clipping arguments of the canvas context.drawImage to break each single frame into 2-3 separate tiles (see the sketch after these steps).
Additionally, buffer a few of the frames you acquire from the stream (this goes without saying!).
Use requestAnimationFrame (RAF) to control the drawing. RAF does a fairly good job of drawing frames when system resources are available and delaying frames when system resources are lacking.
Your result won't be perfect, but the tiles will be synchronized. You will always have to decide whether to drop or delay frames when system resources are unavailable, but at least the frames you do present will be synchronized.
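A minimal sketch of the tiling idea, assuming one merged <video> element and three canvases (the element IDs are made up for the example):

```javascript
// Draw three horizontal slices of one tall merged <video> onto three
// canvases, all from the same frame, so the tiles can never drift apart.
var video = document.getElementById('merged');
var canvases = ['tile0', 'tile1', 'tile2'].map(function (id) {
  return document.getElementById(id);
});

function draw() {
  var sliceH = video.videoHeight / canvases.length;
  canvases.forEach(function (canvas, i) {
    // drawImage(src, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight)
    canvas.getContext('2d').drawImage(
      video, 0, i * sliceH, video.videoWidth, sliceH,
      0, 0, canvas.width, canvas.height);
  });
  requestAnimationFrame(draw);  // schedule the next frame
}

video.addEventListener('play', function () {
  requestAnimationFrame(draw);
});
```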
As far as I know, it's currently impossible to play HTML5 video frame-by-frame or seek to a frame-accurate time code. The nearest seek seems to be precise to roughly one second.
But you can still get pretty close using some of the media frameworks:
Popcorn.js - a library made for synchronizing video with content.
mediagroup.js - another library, used to add support for the mediagroup attribute on HTML5 media elements.
The only feature that allowed this was named mediaGroup, and it was removed from Chrome (apparently for not being popular enough). It is still present in WebKit. Relevant discussion here and here.
I think you could implement your own "mediagroup"-like tag using WASM, though without DOM support it may be tricky.
I am thinking, as a challenge, I should write a JavaScript-based game. I want sound, images, and input. A background to simulate a screen (like 640x480 with all my images in it) would be useful to separate the rest of the page from the 'game'. What should I look at?
Some things I would need:
Frame control. A way to get the current time (or delta).
Images. Displaying them and moving them. How do I display a full image? Knowing about pixel access may be cool.
Input. A way to lock it in a box (like Flash does) is cool.
Sound. Play simple sounds on demand (like when I get a hit). Several sounds at once would be great.
Bottlenecks. What are the things that will kill the CPU?
Restrictions. What can't I do? I hear I can't 'sleep' to wait; I must set a callback.
Good or best practice. What are good things I can do to either keep speed up or to reduce glitches and compatibility problems?
I'm going to answer this looking at things from a MooTools JavaScript perspective:
Frame control. A way to get the current time (or delta).
periodical()
Image, displaying it and moving it. How do I display a full image?
setStyles()
Input. A way to lock it in a box (like Flash does) is cool.
Plain old CSS.
Sound. Play simple sounds on demand (like when I get a hit).
Swiff.remote()
Bottlenecks. What are things that will kill the CPU?
Internet Explorer.
Restrictions.
3D ... ?
What are good things I can do to lower glitch or compatibility problems?
Use a framework.
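For illustration, a bare-bones, framework-agnostic frame-control loop; the 30 fps target and the empty update/render bodies are placeholders:

```javascript
// setInterval stands in for MooTools' periodical(); any timer works.
var lastTime = new Date().getTime();

function tick() {
  var now = new Date().getTime();
  var delta = (now - lastTime) / 1000;  // seconds since the last frame
  lastTime = now;
  update(delta);                        // advance the game state by delta
  render();                             // redraw images/sprites
}

function update(delta) { /* e.g. sprite.x += speed * delta; */ }
function render() { /* e.g. set element styles or draw to canvas */ }

setInterval(tick, 1000 / 30);           // aim for roughly 30 fps
```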
As a starting point, you may want to write it for Opera, as Opera provides a game canvas that will help you out.
For some examples of games in javascript:
http://dev.opera.com/articles/view/3d-games-with-canvas-and-raycasting-part/
http://my.opera.com/WebApplications/blog/show.dml/200788
This one is interesting; it is a collection of demos of games using the canvas element:
http://www.canvasdemos.com/tag/games/
The best way to see where the problems are is to start writing the game; then you will see what may be a problem. By looking at demos you can get an idea of what performance issues their authors encountered. For example, a full 3D Doom game will have problems, but, as the first article above explains, there are some ways to optimize for JavaScript.
Once you get it working with Opera, then you can look at Firefox 3.5+ and Safari, as well as Chrome, and see if you can make some changes to have it work on those. How many platforms it works on depends on how much work you want to do for it. For a proof-of-concept pick the easiest browser for your task.
The best place to start would be to get very familiar with the <canvas> tag (it allows you to draw anything on screen).
This may help a lot:
http://benfirshman.com/projects/jsnes/
It's an online NES emulator that renders everything on screen; the source is also available.
Hope that helps =)