Pitch detection on array of floating point numbers - javascript

I'm doing voice recording in javascript, and storing the recording as an array of signed floats. What would I need to determine (and ultimately, adjust) pitch on the array? I've seen various algorithms for C++, but they don't seem to be very helpful in my situation. I even downloaded and tried this one to see if I could convert parts of it to javascript:
http://voicerecorder.codeplex.com/SourceControl/latest
But all that actually did was make the recording louder, regardless of the settings I chose.

I'm not going to try to provide an exhaustive answer here, but rather describe my own findings from wrestling with similar issues in audio programming.
Pitch Detection
If your sound is monophonic (as it sounds like it is, based on your comment to jeff), I've implemented pitch detection using auto-correlation techniques, mostly because it's relatively simple compared to other pitch-detection algorithms.
The idea, if you're unfamiliar, is as follows:
1. Slide a sample over itself (with a predetermined window size, in 1-sample increments).
2. At each step, calculate the absolute difference between the original wave and the slid window.
3. As you slide the window, keep a record of the score calculated in (2).
4. When the wave correlates with itself, the score hits a minimum, and the time location of that minimum gives the signal's period.
In my implementation, this was the only algorithm that worked well (when fed samples of my voice; I didn't try a wide variety of samples, however).
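For illustration, here is a minimal JavaScript sketch of the sliding-window difference score from the steps above. The function name, the frequency bounds, and the single-window simplification are my own assumptions, not code from any particular implementation:

    // A minimal AMDF-style sketch of the steps above. `samples` is the
    // recording (a Float32Array of signed floats), `sampleRate` is in Hz.
    // Requires at least 2 * maxLag samples.
    function detectPitch(samples, sampleRate, minFreq, maxFreq) {
      var minLag = Math.floor(sampleRate / maxFreq); // shortest period tried
      var maxLag = Math.floor(sampleRate / minFreq); // longest period tried
      var windowSize = maxLag;
      var bestLag = minLag, bestScore = Infinity;
      for (var lag = minLag; lag <= maxLag; lag++) {
        var score = 0;
        // Step 2: absolute difference between the wave and its shifted copy
        for (var i = 0; i < windowSize; i++) {
          score += Math.abs(samples[i] - samples[i + lag]);
        }
        // Steps 3-4: remember the lag where the score hits its minimum
        if (score < bestScore) { bestScore = score; bestLag = lag; }
      }
      return sampleRate / bestLag; // estimated fundamental frequency in Hz
    }

For speech, bounds of roughly 80-1000 Hz are a reasonable starting assumption; a real implementation would also step the window through the recording rather than scoring only its start.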
That was a crude explanation of how autocorrelation works, and this article provides a very nice comparison of different pitch-detection algorithms:
https://ccrma.stanford.edu/~pdelac/154/m154paper.htm
Pitch Shifting
Of course, you could get really cheap pitch shifting by just resampling, but that sounds similar to a record being played too fast, which is not acceptable in many circumstances.
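For what it's worth, a sketch of that cheap approach (my own illustration, not a recommendation): resampling by a ratio shifts the pitch but also changes the duration, which is exactly the record-played-too-fast effect.

    // A hypothetical sketch of pitch shifting by naive resampling.
    // ratio > 1 raises the pitch (and shortens the clip); ratio < 1 lowers it.
    function resample(samples, ratio) {
      var out = new Float32Array(Math.floor(samples.length / ratio));
      for (var i = 0; i < out.length; i++) {
        var pos = i * ratio;
        var j = Math.floor(pos);
        var frac = pos - j;
        var next = Math.min(j + 1, samples.length - 1);
        out[i] = samples[j] * (1 - frac) + samples[next] * frac; // linear interpolation
      }
      return out;
    }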
As far as pitch shifting goes, I haven't gotten that far yet in my implementation, but the last time I left off I was looking at phase vocoders as a possible solution. What's hard is finding a decent explanation of how these algorithms work that provides some intuition about why they work the way they do, instead of just presenting solely abstract mathematical equations.

Related

Need help dissecting and recreating the perfect scroll easing based on PastryKit

On the web I usually just use the native scroll mechanisms. They are fast and reliable, and there is no coding involved.
But while working more and more in Unity, I have discovered that the available scroll plugins, even the big ones such as Unity.UI or NGUI, are simply awful. I have asked around and found out that it is like that on most platforms. The physics is plain bad.
I did a bunch of research and tried a few solutions, from NGUI's scrollView to the web's iScroll.js and so on. I have found no solution as good as Apple's original PastryKit. Now, PastryKit is old, deprecated, has no API, and is as hard to read as hieroglyphs.
But what is important is that, while making it, they managed to exactly recreate the iOS kinetic scroll physics behavior.
I am not trying to implement PastryKit; I am trying to find out how it works. I am trying to understand and replicate it.
I am trying to find out the easings/formulas they use and the logical conditions they use them in. Apple has a crazy way of writing confusing JS, so even though I am a full-stack developer, I am having a hard time tracing everything down. And I figured a few brains are better than one, so let's see: does anyone understand this file? :D
https://github.com/jimeh/PastryKit/blob/master/mobile/dist/PastryKit.js
IN SHORT (so there are no misunderstandings): I am trying to extract a set of physics rules from this file, which I can use as guidelines to write my own implementation of scrolling on any platform I choose. :)
For example: a 'normal' scroll is defined by {>300ms && >10px}, and Apple uses the following Bézier curve when easing the slowdown animation: cubic-bezier.com/#.25,.46,.1,.94
UPDATE: We solved this a while ago. We discovered how Apple does its momentum.
https://medium.com/homullus/recreating-native-ios-scroll-and-momentum-2906d0d711ad
After many hours of dissecting the algorithm, we concluded that Apple is in fact using magic numbers. And the magic number is: (drumroll) momentum * 0.95.
Basically, while the touch lasts, Apple lets you move the screen 1:1.
On touch end, Apple computes the momentum by dividing the number of pixels the user swiped by the time the swipe took. If the distance was less than 10 px or the time was less than 0.5 s, the momentum would be clamped to zero.
Anyway, once the momentum (speed) was known, they would multiply it by 0.95 every frame and move the screen by that much.
So idiotically simple and elegant that it hurts. :)
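As a rough sketch of that loop (my own illustration, not the dissected source; moveContentBy and the ~60 fps assumption are hypothetical):

    // A minimal sketch of the decay described above. moveContentBy(px) is a
    // hypothetical helper that scrolls the content.
    var momentum = 0;
    function onTouchEnd(pixelsSwiped, secondsSwiped) {
      // Short or slow swipes get no momentum at all
      if (Math.abs(pixelsSwiped) < 10 || secondsSwiped < 0.5) {
        momentum = 0;
        return;
      }
      momentum = (pixelsSwiped / secondsSwiped) / 60; // px per frame at ~60 fps
      requestAnimationFrame(step);
    }
    function step() {
      moveContentBy(momentum);
      momentum *= 0.95; // the magic number
      if (Math.abs(momentum) > 0.5) requestAnimationFrame(step);
    }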

Vibration (shaking) of elements in Box2D after using ApplyForce

Sorry for my bad English.
I am making a little game in JavaScript using Box2D (though I think the language is not important, since the engine is the same, as is well known, and has been ported to JavaScript).
I connect several bodies via distance joints. When creating a joint I set frequencyHz = 11 and dampingRatio = 0 (as I understand it, the lower the dampingRatio, the faster the element returns to its original position, i.e. the spring is stiffer, which is what I need). To each body I apply a force:
body.ApplyForce(new b2Vec2(0, 5800), body.GetWorldCenter())
I need these bodies to have stronger gravity than the others.
After I create several elements, they begin to twitch and vibrate as if possessed, instead of stopping and settling into position.
I tried to play with the joint's frequencyHz and dampingRatio parameters so that the structure would not twitch, but then the construction becomes less stiff and too springy.
Is there some sort of solution?
Thanks.
I think the structure vibrates because, under the weight of several elements and the force applied to them, the object is pulled down and the joint is stretched; the joint then tries to contract back to its original length, and this back-and-forth produces the vibration, since the element keeps moving quickly in opposite directions.
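If it helps, here is a hedged sketch (Box2DWeb-style API; bodyA, bodyB, world, and the exact numbers are assumptions to experiment with). A nonzero dampingRatio lets the joint dissipate that back-and-forth energy instead of oscillating:

    // A hypothetical Box2DWeb sketch: damping bleeds energy out of the
    // spring so the bodies settle instead of jittering.
    var jointDef = new b2DistanceJointDef();
    jointDef.Initialize(bodyA, bodyB,
                        bodyA.GetWorldCenter(), bodyB.GetWorldCenter());
    jointDef.frequencyHz = 4;    // softer spring than 11 Hz
    jointDef.dampingRatio = 0.5; // 0 = undamped; values near 1 settle fast
    world.CreateJoint(jointDef);

The trade-off, as the question already notes, is a softer structure; raising frequencyHz back up stiffens it again at the cost of more solver work.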

How can I detect the direction/distance of movement on iOS with javascript? [duplicate]

I was looking into implementing an Inertial Navigation System for an Android phone, which I realise is hard given the accelerometer accuracy and the constant fluctuation of readings.
To start with, I set the phone on a flat surface and sampled 1000 accelerometer readings in the X and Y directions (parallel to the table, so no gravity acting in these directions). I then averaged these readings and used this value to calibrate the phone (subtracting this value from each subsequent reading).
I then tested the system by again placing it on the table and sampling 5000 accelerometer readings in the X and Y directions. I would expect, given the calibration, that these accelerations should add up to 0 (roughly) in each direction. However, this is not the case, and the total acceleration over 5000 iterations is nowhere near 0 (averaging around 10 on each axis).
I realise without seeing my code this might be difficult to answer but in a more general sense...
Is this simply an example of how inaccurate the accelerometer readings are on a mobile phone (HTC Desire S), or is it more likely that I've made some errors in my coding?
You get position by integrating the linear acceleration twice but the error is horrible. It is useless in practice.
Here is an explanation why (Google Tech Talk) at 23:20. I highly recommend this video.
It is not the accelerometer noise that causes the problem but the gyro white noise, see subsection 6.2.3 Propagation of Errors. (By the way, you will need the gyroscopes too.)
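To make the first point concrete, here is a minimal sketch (my own illustration) of the double integration; any constant sensor bias b grows into a position error of 0.5 * b * t^2, which is why the drift explodes:

    // Dead reckoning by double integration: bias in `accel` compounds fast.
    var velocity = 0, position = 0;
    function integrate(accel, dt) {
      velocity += accel * dt;    // first integration: velocity
      position += velocity * dt; // second integration: position
      return position;
    }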
As for indoor positioning, I have found these useful:
RSSI-Based Indoor Localization and Tracking Using Sigma-Point Kalman Smoothers
Pedestrian Tracking with Shoe-Mounted Inertial Sensors
Enhancing the Performance of Pedometers Using a Single Accelerometer
I have no idea how these methods would perform in real-life applications or how to turn them into a nice Android app.
A similar question is this.
UPDATE:
Apparently there is a newer version of the Oliver J. Woodman reference above, "An introduction to inertial navigation": his PhD thesis,
Pedestrian Localisation for Indoor Environments
I am just thinking out loud, and I haven't played with an Android accelerometer API yet, so bear with me.
First of all, traditionally, to get navigation from accelerometers you would need a 6-axis accelerometer: accelerations in X, Y, and Z, plus rotations Xr, Yr, and Zr. Without the rotation data, you don't have enough data to establish a vector unless you assume the device never changes its attitude, which would be pretty limiting.
Oh, and you know that INS drifts with the rotation of the earth, right? So there's that too. One hour later and you're mysteriously climbing on a 15° slope into space. That's assuming you had an INS capable of maintaining location that long, which a phone can't do yet.
A better way to utilize accelerometers -even with a 3-axis accelerometer- for navigation would be to tie into GPS to calibrate the INS whenever possible. Where GPS falls short, INS compliments nicely. GPS can suddenly shoot you off 3 blocks away because you got too close to a tree. INS isn't great, but at least it knows you weren't hit by a meteor.
What you could do is log the phone's accelerometer data, and a lot of it. Like weeks' worth. Compare it with good (I mean really good) GPS data and use data mining to establish correlations between trends in the accelerometer data and the known GPS data. (Pro tip: you'll want to check the GPS almanac for days with good geometry and a lot of satellites. Some days you may only have 4 satellites, and that's not enough.)
What you might be able to do is find that when a person is walking with their phone in their pocket, the accelerometer data logs a very specific pattern. Based on the data mining, you establish a profile for that device, with that user, and what sort of velocity that pattern represents when there was GPS data to go along with it. You should be able to detect turns, climbing stairs, sitting down (a calibrate-to-0-velocity moment!) and various other tasks. How the phone is being held would need to be treated as a separate data input entirely.
I smell a neural network being used to do the data mining: something blind to what the inputs mean, in other words. The algorithm would only look for trends in the patterns, not really paying attention to the actual measurements of the INS. All it would know is that historically, when this pattern occurs, the device is traveling at 2.72 m/s X, 0.17 m/s Y, 0.01 m/s Z, so the device must be doing that now, and it would move the position estimate forward accordingly. It's important that it's completely blind, because a phone in your pocket might sit in one of 4 different orientations, 8 if you switch pockets, and there are many ways to hold your phone as well. We're talking about a lot of data here.
You'll obviously still have a lot of drift, but I think you'd have better luck this way because the device will know when you stopped walking, and the positional drift will not perpetuate. It knows you're standing still based on historical data. Traditional INS systems don't have this feature: the drift carries into all future measurements and compounds exponentially. Ungodly accuracy, or a secondary navigation source to check against at regular intervals, is absolutely vital with traditional INS.
Each device and each person would need their own profile. That's a lot of data and a lot of calculations. Everyone walks at different speeds, with different strides, and puts their phone in different pockets, etc. Surely implementing this in the real world would require the number-crunching to be handled server-side.
If you did use GPS for the initial baseline, part of the problem is that GPS tends to have its own migrations over time, but those are non-perpetuating errors. Sit a receiver in one location and log the data: without WAAS corrections, you can easily get location fixes drifting in random directions 100 feet around you; with WAAS, maybe down to 6 feet. You might actually have better luck with a sub-meter RTK system on a backpack, at least to get the ANN's algorithm trained.
You will still have angular drift with the INS using my method. This is a problem. But if you went so far as to build an ANN to pore over weeks' worth of GPS and INS data from n users, and actually got it working to this point, you obviously don't mind big data so far. Keep going down that path and use more data to help resolve the angular drift: people are creatures of habit. We pretty much do the same things, like walk on sidewalks, through doors, and up stairs, and we don't do crazy things like walk across freeways, through walls, or off balconies.
So let's say you are taking a page from Big Brother and start storing data on where people are going. You can start mapping where people would be expected to walk. It's a pretty sure bet that if the user starts walking up stairs, she's at the same base of stairs that the person before her walked up. After 1000 iterations and some least-squares adjustments, your database pretty much knows where those stairs are with great accuracy. Now you can correct angular drift and location as the person starts walking. When she hits those stairs, or turns down that hall, or travels down a sidewalk, any drift can be corrected. Your database would contain sectors that are weighted by the likelihood that a person would walk there, or that this user has walked there in the past. Spatial databases are optimized for this using divide and conquer to only allocate sectors that are meaningful. It would be sort of like those MIT projects where the laser-equipped robot starts off with a black image, and paints the maze in memory by taking every turn, illuminating where all the walls are.
Areas of high traffic would get higher weights, and areas where no one has ever been get 0 weight. Higher-traffic areas have higher resolution. You would essentially end up with a map of everywhere anyone has been and use it as a prediction model.
I wouldn't be surprised if you could determine which seat a person took in a theater using this method. Given enough users going to the theater, and enough resolution, you would have data mapping each row of the theater and how wide each row is. The more people visit a location, the higher the fidelity with which you could predict where a person is.
Also, I highly recommend you get a (free) subscription to GPS World magazine if you're interested in the current research into this sort of stuff. Every month I geek out with it.
I'm not sure how great your offset is, because you forgot to include units. ("Around 10 on each axis" doesn't say much. :P) That said, it's still likely due to inaccuracy in the hardware.
The accelerometer is fine for things like determining the phone's orientation relative to gravity, or detecting gestures (shaking or bumping the phone, etc.)
However, trying to do dead reckoning using the accelerometer is going to subject you to a lot of compounding error. For that, the accelerometer would need to be insanely accurate, and this isn't a common use case, so I doubt hardware manufacturers are optimizing for it.
The Android accelerometer is digital; it samples acceleration into a fixed number of "buckets". Let's say there are 256 buckets and the accelerometer is capable of sensing from -2g to +2g. This means your output is quantized in terms of these buckets and will jump around within some set of values.
To calibrate an Android accelerometer, you need to sample a lot more than 1000 points and find the mode around which the accelerometer is fluctuating. Then measure by how many digital steps the output fluctuates, and use that for your filtering.
I recommend Kalman filtering once you get the mode and +/- fluctuation.
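As a rough sketch of the mode-finding step (my own illustration; the 0.01 quantization step is an assumed sensor resolution, not a known value):

    // Find the mode of many samples on one axis, plus the band the
    // readings fluctuate within, to feed into a filter's parameters.
    function calibrateAxis(readings) {
      var counts = {}, mode = 0, best = 0;
      for (var i = 0; i < readings.length; i++) {
        var bucket = readings[i].toFixed(2); // assumed sensor resolution
        counts[bucket] = (counts[bucket] || 0) + 1;
        if (counts[bucket] > best) { best = counts[bucket]; mode = Number(bucket); }
      }
      var spread = Math.max.apply(null, readings) - Math.min.apply(null, readings);
      return { mode: mode, fluctuation: spread / 2 };
    }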
I realise this is quite old, but the issue at hand is not addressed in ANY of the answers given.
What you are seeing is the linear acceleration of the device, including the effect of gravity. If you lay the phone on a flat surface, the sensor will report the acceleration due to gravity, which is approximately 9.80665 m/s², hence the 10 you are seeing. The sensors are inaccurate, but they are not THAT inaccurate! See here for some useful links and information about the sensor you may be after.
You are assuming that the accelerometer readings in the X and Y directions, which in this case are entirely hardware noise, form a normal distribution around your average. Apparently that is not the case.
One thing you can try is to plot these values on a graph and see whether any pattern emerges. If not, the noise is statistically random and cannot be calibrated against, at least for your particular phone hardware.

Best approach for collision detection with HTML5 and JavaScript?

I'm trying to make a little platform game with pure HTML5 and JavaScript. No frameworks.
So in order to make my character jump on top of enemies and floors/walls etc., it needs some proper collision detection algorithms.
Since I don't usually do this kind of thing, I really have no clue how to approach the problem.
Should I re-check every frame (it runs at 30 FPS) for all obstacles on the canvas and see whether they collide with my player, or is there a better and faster way to do so?
I even thought of making dynamic maps, so that the width, height, and x and y coordinates of each obstacle are stored in an object. Would that make it faster to check whether it's colliding with the player?
1. Should I re-check every frame (it runs at 30 FPS)?
Who says it runs at 30 FPS? I found no such thing in the HTML5 specification. The closest you'll get to having any say over the framerate at all is to programmatically call setInterval or the newer, preferred requestAnimationFrame function.
However, back to the story: you should always check for collisions as often as you can. Usually, when writing games on other platforms, where one has a greater ability to measure CPU load, this is one of those things you might scale back if the CPU has a hard time keeping up. In JavaScript, though, you're out of luck trying to implement advanced solutions like that.
I don't think there's a shortcut here. The computer has no way of knowing what collided, how, when, and where if you don't make that computation yourself. And yes, this is usually, if not always, done just before each new frame is painted.
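In practice that means a loop like this (a minimal sketch; update, detectCollisions, and render are hypothetical functions of your game, not a library API):

    // A minimal game-loop sketch: collisions are computed once per frame,
    // just before the new frame is painted.
    function frame() {
      update();           // move the player, enemies, obstacles
      detectCollisions(); // test the live objects against each other
      render();           // repaint the canvas
      requestAnimationFrame(frame);
    }
    requestAnimationFrame(frame);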
2. A dynamic map?
If by "map" you mean an array-like object or multidimensional array that maps coordinates to objects, then the short answer has to be no. But please do have an array of all objects on the scene. The width, height and coordinates of the object should be stored in variables in the object. Leaking these things would quickly become a burden; rendering the code complex and introduce bugs (please see separation of concerns and cohesion).
Do note that I just said "array of all objects on the scene" =) There is a subtle but very important point in that phrase:
Whenever you walk through the objects to determine their positions and whether they have collided with something, also have a look at your viewport boundaries and determine whether each object is still "on the scene" at all. For instance, if you have a spacecraft simulator of some kind, and a star has just passed out of the player's viewport on one side with no way of returning and becoming visible again, then there is no reason to keep that star in the system any more. It should be deleted and removed. It should definitely not be stored in an array and take part in future collision detection with the player's avatar! Such things can dramatically slow down your game.
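A hedged sketch of that culling pass (isOnScene and canReturn are hypothetical helpers, not part of any API):

    // Prune objects that have left the scene for good, so they never feed
    // another collision check.
    objects = objects.filter(function (obj) {
      return isOnScene(obj) || obj.canReturn; // hypothetical re-entry flag
    });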
Bonus: Collision quick tips
Divide the screen into parts. There is no reason to look for a collision between two objects if one of them is on the left side of the screen and the other is on the right side. You could also split the screen into more logical units than just left and right.
Always strive to run a cheap computation first. We kind of already did that in the last tip. But even when you know that two objects might be colliding, draw a logical square around each object. For instance, say you have two 2D airplanes: there is no reason to first check whether some part of their wings collide. Draw a square around each airplane, effectively capturing its largest width and largest height. If these two squares do not overlap, then, just as in the last tip, you know the objects cannot be colliding. But if this cheap first-phase computation hints that they might be, pass the two airplanes on to a more expensive computation to really look into the matter.
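The cheap first phase described above is usually an axis-aligned bounding-box test; a minimal sketch (boxesOverlap and the x/y/width/height fields are my naming assumptions):

    // Two axis-aligned boxes overlap only if they overlap on both axes.
    function boxesOverlap(a, b) {
      return a.x < b.x + b.width  && a.x + a.width  > b.x &&
             a.y < b.y + b.height && a.y + a.height > b.y;
    }
    // Only pairs that pass this test go on to the expensive,
    // shape-accurate second phase.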
I am still working on something similar: I wanted to make lots of divs and have them act under physics. I will share some things that weren't obvious to me at first.
Detect collisions in your data first (see the sketch after these tips). I was reading the x and y of boxes on screen and then checking them against other divs. After a week it occurred to me how wasteful this was: first I would assign a new value to a div, then read it back from the div. Accessing divs is expensive. Think of the DOM as a rendering stage.
Use web workers where it is easy to do so.
Use canvas if possible.
If possible, have each element carry a list of the elements it should be checked against for collision (this is helpful only in certain cases).
I learned that interactive collisions are far more expensive, because you have to check for changes in the environment as they happen; in a non-interactive scene you can simulate ahead of time what is going to happen, so your animation is more fluid and more CPU stays available.
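A minimal sketch of the first tip: keep the geometry in plain objects, run the collision math on those, and touch the DOM only when rendering (the field names and someDiv are my own, hypothetical):

    // Collision math runs on plain data; each div is written once per frame.
    var boxes = [{ el: someDiv, x: 0, y: 0, w: 50, h: 50 }];
    function render() {
      for (var i = 0; i < boxes.length; i++) {
        var b = boxes[i];
        b.el.style.transform = 'translate(' + b.x + 'px,' + b.y + 'px)';
      }
    }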
I made something at a very, very early stage, just for fun: http://www.lastnoob.com/

HTML5 Motion detection, grabbing and scrolling

I have followed a couple of tutorials (http://www.adobe.com/devnet/html5/articles/javascript-motion-detection.html, http://www.html5rocks.com/en/tutorials/canvas/notearsgame/) and spliced the two together to create a game (https://github.com/gazzwi86/HTML5-Motion-Detection). While I have a few things to work out with the blending to improve the quality of the detection, I was wondering how I would go about detecting grabbing and swipe gestures, say for navigating a web page.
Could you point me in the direction of some examples or outline the principles so that I may try it myself?
I wouldn't go for it. Reasonably good detection would require heavy processing on the client side.
You can simply track moving objects (like a hand) with some threshold (you can blur first to get rid of noise). The user's background will mostly stay the same, so you can ignore it too.
Then convert the image to black and white and try to reduce your moving object to a single polygon.
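A minimal sketch of that thresholding step, assuming two same-sized ImageData frames grabbed from a canvas (the threshold value is a guess to tune, not a known constant):

    // Pixels that changed more than `threshold` between frames become white,
    // everything else black; the white blob is your moving object.
    function diffToBinary(prev, curr, threshold) {
      var out = new Uint8ClampedArray(curr.data.length);
      for (var i = 0; i < curr.data.length; i += 4) {
        var d = Math.abs(curr.data[i]     - prev.data[i]) +
                Math.abs(curr.data[i + 1] - prev.data[i + 1]) +
                Math.abs(curr.data[i + 2] - prev.data[i + 2]);
        var v = d > threshold ? 255 : 0;
        out[i] = out[i + 1] = out[i + 2] = v;
        out[i + 3] = 255; // opaque alpha
      }
      return new ImageData(out, curr.width, curr.height);
    }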
What I would experiment with after that: set up a little neural network and train it myself by moving my hand.
Well, that's just my 2 cents on how I would try to implement it. It would be really nice to hear later how you did it and what the results were :)
