In the original pocketsphinx.js folder, where should I add the threshold code? I added it to recognizer.js but it didn't work. Here is the code that I found:
["-kws_threshold", '2']
I added like this (into recognizer.js):
function initialize(data, clbId) {
var config = new Module.Config();
config.push_back(["-kws_threshold", '300']);
And this is README of pocketsphinx..
Thank you
You need to optimize it on desktop with a prerecorded audio file, see details from the tutorial
Threshold must be specified for every keyphrase. For shorter keyphrase you can use smaller thresholds like 1e-1, for longer threshold must be bigger, up to 1e-50. If your keyphrase is very long, larger than 10 syllables, it is recommended to split it and spot for parts separately. For the best accuracy it is better to have keyphrase with 3-4 syllables. Too short phrases are easily confused.
Threshold must be tuned to balance between false alarms and missed detections, the best way to tune threshold is to use a prerecorded audio file. Tuning process is the following:
Take a long recording with few occurrences of your keywords and some other sounds. You can take a movie sound or something else. The length of the audio should be approximately 1 hour.
Run keyword spotting on that file with different thresholds for every keyword.
Use the following command:
pocketsphinx_continuous -infile <your_file.wav> -keyphrase <"your keyphrase"> -kws_threshold <your_threshold> -time yes
It will print many lines, some of them are keywords with detection times and confidences. You can also disable extra logs with -logfn your_file.log option to avoid clutter.
From keyword spotting results count how many false alarms and missed detections you've encountered.
Select the threshold with smallest amount of false alarms and missed detections.
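The last step of that tuning process can be sketched in code. This is only an illustration: the `runs` array, its field names and the function name `pickThreshold` are assumptions, and the counts would have to be tallied by hand from the output of each run.

```javascript
// Each entry describes one hypothetical tuning run: the value passed to
// -kws_threshold and the errors counted from that run's output.
function pickThreshold(runs) {
  let best = null
  for (const run of runs) {
    const errors = run.falseAlarms + run.misses
    // Keep the run with the fewest combined errors; ties keep the first one.
    if (best === null || errors < best.errors) {
      best = { threshold: run.threshold, errors }
    }
  }
  return best
}
```

Weighting false alarms and misses equally is a design choice; if one kind of error is worse for your application, multiply it by a larger factor before summing.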
I have been thinking about this for a few days trying to see if there is a generic way to write this function so that you don't ever need to worry about it breaking again. That is, it is as robust as it can be, and it can support using up all of the memory efficiently and effectively (in JavaScript).
So the question is about a basic thing. Often times when you create objects in JavaScript of a certain type, you might give them an ID. In the browser, for example with virtual DOM elements, you might just give them a globally unique ID (GUID) and set it to an incrementing integer.
GUID = 1
let a = createNode() // { id: 1 }
let b = createNode() // { id: 2 }
let c = createNode() // { id: 3 }
function createNode() {
return { id: GUID++ }
}
But what happens when you run out of integers? Number.MAX_SAFE_INTEGER == 2⁵³ - 1. That is obviously a very large number: 9,007,199,254,740,991, or about 9 quadrillion. But if JS can reach 10 million ops per second, let's say, as an off-the-cuff figure, then it takes about 900,719,925 s to reach that number, or about 10,425 days, or roughly 30 years (closer to 28.5). So in this case if you left your computer running for 30 years, it would eventually run out of incrementing IDs. This would be a hard bug to find!
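As a quick check of that estimate, here is the arithmetic spelled out. The rate of 10 million increments per second is the assumption from the paragraph above, not a measurement, and the variable names are just for illustration.

```javascript
// Assumed rate: 10 million ID increments per second.
const opsPerSecond = 1e7
const seconds = Number.MAX_SAFE_INTEGER / opsPerSecond
const days = seconds / 86400   // 86,400 seconds per day
const years = days / 365.25    // average year length including leap years
```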
If you parallelized the generation of the IDs, then you could more realistically (more quickly) run out of the incremented integers. Assuming you don't want to use a GUID scheme.
Given the memory limits of computers, you can only create a certain number of objects. In JS you probably can't create more than a few billion.
But my question is, as a theoretical exercise, how can you solve this problem of generating the incremented integers such that if you got up to Number.MAX_SAFE_INTEGER, you would cycle back from the beginning, yet not use the potentially billions (or just millions) that you already have "live and bound". What sort of scheme would you have to use to make it so you could simply cycle through the integers and always know you have a free one available?
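let i = 0
function getNextID() {
  // Wrap around once the counter reaches the largest safe integer.
  // Note: this naive version does not skip IDs that are still in use.
  if (i >= Number.MAX_SAFE_INTEGER) {
    i = 0
  } else {
    i++
  }
  return i
}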
Random notes:
The fastest overall was Chrome 11 (under 2 sec per billion iterations, or at most 4 CPU cycles per iteration); the slowest was IE8 (about 55 sec per billion iterations, or over 100 CPU cycles per iteration).
Basically, this question stems from the fact that our typical "practical" solutions will break in the super-edge case of running into Number.MAX_SAFE_INTEGER, which is very hard to test. I would like to know some ways where you could solve for that, without just erroring out in some way.
But what happens when you run out of integers?
You won't. Ever.
But if JS can reach 10 million ops per second [it'll take] about 30 years.
Not much to add. No computer will run for 30 years on the same program. Also in this very contrived example you only generate ids. In a realistic calculation you might spend 1/10000 of the time to generate ids, so the 30 years turn into 300000 years.
how can you solve this problem of generating the incremented integers such that if you got up to Number.MAX_SAFE_INTEGER, you would cycle back from the beginning,
If you "cycle back from the beginning", they won't be "incremental" anymore. One of your requirements cannot be fulfilled.
If you parallelized the generation of the IDs, then you could more realistically (more quickly) run out of the incremented integers.
No. For the ids to be strictly incremental, you have to share a counter between these parallelized agents. And access to shared memory is only possible through synchronization, so that won't be faster at all.
If you still really think that you'll run out of 53 bits, use BigInts. Or Symbols, depending on your use case.
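A minimal sketch of both options follows. The function names `createNode` and `createTaggedNode` are only illustrative.

```javascript
// BigInt counter: has no MAX_SAFE_INTEGER ceiling, so it never wraps.
let nextId = 0n
function createNode() {
  nextId += 1n
  return { id: nextId }
}

// Symbol IDs: every Symbol() call yields a unique value, no counter needed.
// They are unique but not ordered, so use them only if order does not matter.
function createTaggedNode() {
  return { id: Symbol('node') }
}
```

Keep in mind that a BigInt cannot be mixed with a Number in arithmetic without an explicit conversion, and that `JSON.stringify` throws on BigInt values unless you convert them yourself.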
I'm trying to create a "generative score" using beep.js based on some map data I have. I am using new Beep.Voice as a placeholder for notes associated with specific types of data (7 voices total). As data is displayed, a voice should be played. I'm doing things pretty "brute force" so far and I'd like it to be cleaner:
// in the data processing function
voice = voices[datavoice]
voice.play()
setTimeout(function(){killVoice(voice)}, 20)
// and the killvoice:
function killVoice(voice) {
voice.pause()
}
I'd like to just "play" the voice, assuming it would have a duration of, say, 20ms (basically just beep on data). I saw the duration property of voices but couldn't make it work.
The code is here (uses grunt/node/coffeescript):
https://github.com/mgiraldo/inspectorviz/blob/master/app/scripts/main.coffee
This is what it looks like so far:
https://vimeo.com/126519613
The reason Beep.Voice.duration is undocumented in the READ ME is because it’s not finished yet! ;) There’s a line in the source code that literally says “Right now these do nothing; just here as a stand-in for the future.” This applies to .duration, .attack, etc. There’s a pull request to implement some of this functionality here but I’ve had to make some significant structural changes since that request was submitted; will need to take a closer look soon once I’ve finished fixing some larger structural issues. (It’s in the pipeline, I promise!)
Your approach in the meantime seems right on the money. I’ve reduced it a bit here and made it 200 milliseconds—rather than 20—so I could hear it ring a bit more:
var voice = new Beep.Voice('4D♭')
voice.play()
setTimeout( function(){ voice.pause() }, 200 )
I saw you were using some pretty low notes in your sample code, like '1A♭' for example. If you’re just testing this out on normal laptop speakers—a position I am often myself in—you might find the tone is too low for your speakers; you’ll either hear a tick or dead silence. So don’t worry: it’s not a bug, just a hardware issue :)
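If you end up doing this for several voices, the play-then-pause pattern can be wrapped in a small helper. This is just a sketch: `playFor` is a made-up name, not part of Beep.js, and it only assumes the object has the play() and pause() methods used above. The optional `schedule` parameter is there purely so the timer can be swapped out.

```javascript
// Start a voice and stop it again after `ms` milliseconds.
// `voice` is anything with play() and pause(), such as a Beep.Voice.
function playFor(voice, ms, schedule = setTimeout) {
  voice.play()
  // Return the timer handle so the caller can cancel the pending pause.
  return schedule(function () { voice.pause() }, ms)
}
```

With that in place, the data processing function could call something like playFor(voices[datavoice], 200).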
Forget everything I said ;)
Inspired by your inquiry—and Sam’s old pull request—I’ve just completed a big ADSR push which includes support for Voice durations. So now with the latest Beep.js getting a quick “chiptune-y” chirp can be done like this:
var voice = new Beep.Voice( '4D♭' )
.setOscillatorType( 'square' )
.setAttackDuration( 0 )
.setDecayDuration( 0 )
.setSustainDuration( 0.002 )
.setReleaseDuration( 0 )
.play()
I’ve even included an ADSR ASCII-art diagram in the new Beep.Voice.js file for easy referencing. I hope this helps!
I am using the cytoscape js library for displaying a hierarchy of images. I followed the example on http://jsbin.com/gist/aedff159b0df05ccfaa5?js,output and found that the breadthfirst layout is what I need.
However, I find the rendered result unsatisfactory due to too much unused space. The arrows are too long. Even the example (http://jsbin.com/gist/aedff159b0df05ccfaa5?js,output) has this issue. For this example, I tried the following:
Increase the "height/width" in .selector('node') .css({
Muck around with the distanceX and distanceY (node spacing) variables in layout.breadthfirst.js (line 352).
I am unable to reduce the unused space or reduce the length of the arrows.
Ticket to follow: https://github.com/cytoscape/cytoscape.js/issues/737
If you want a new feature in future, please file a ticket.
For developers in a hurry, try this layout option:
spacingFactor: 0
The manual says :
spacingFactor: 1.75, // positive spacing factor,
// larger => more space between nodes (N.B. n/a if causes overlap)
That's the result of the ticket https://github.com/cytoscape/cytoscape.js/issues/737 reported by maxkfranz.
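For context, here is a sketch of where that option sits in a layout options object. The value 0.75 is only an illustrative starting point to tune from, and the variable name `layoutOptions` is made up.

```javascript
// Options object to pass as `layout` when initializing cytoscape.
const layoutOptions = {
  name: 'breadthfirst',
  // The manual lists 1.75 as the default; smaller values pull nodes closer
  // together. 0.75 is an illustrative value, not a recommendation.
  spacingFactor: 0.75
}
```

As the manual excerpt notes, the factor may not take effect if it would make nodes overlap, so try a few values against your own node sizes.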
I have created a puzzle which is a derivative of the travelling salesman problem, which I call Trace Perfect.
It is essentially an undirected graph with weighted edges. The goal is to traverse every edge at least once in any direction using minimal weight (unlike classical TSP where the goal is to visit every vertex using minimal weight).
As a final twist, an edge is assigned two weights, one for each direction of traversal.
I create a new puzzle instance every day and publish it through a JSON interface.
Now I know TSP is NP-hard. But my puzzles typically have only a good handful of edges and vertices. After all they need to be humanly solvable. So a brute force with basic optimization might be good enough.
I would like to develop some (Javascript?) code that retrieves the puzzle from the server, and solves with an algorithm in a reasonable amount of time. Additionally, it may even post the solution to the server to be registered in the leader board.
I have written a basic brute force solver for it in Java using my back-end Java model on the server, but the code is too fat and runs out of heap space quickly, as expected.
Is a Javascript solver possible and feasible?
The JSON API is simple. You can find it at: http://service.traceperfect.com/api/stov?pdate=20110218 where pdate is the date for the puzzle in yyyyMMdd format.
Basically a puzzle has many lines. Each line has two vertices (A and B). Each line has two weights (timeA for when traversing A -> B, and timeB for when traversing B -> A). And this should be all you need to construct a graph data structure. All other properties in the JSON objects are for visual purposes.
If you want to become familiar with the puzzle, you can play it through a flash client at http://www.TracePerfect.com/
If anyone is interested in implementing a solver for themselves, then I will post detail about the API for submitting the solution to the server, which is also very simple.
Thank you for reading this longish post. I look forward to hearing your thoughts about this one.
If you are running out of heap space in Java, then you are solving it wrong.
The standard way to solve something like this is to do a breadth-first search, and filter out duplicates. For that you need three data structures. The first is your graph. The next is a queue named todo of "states" for work you have left to do. And the last is a hash that maps the possible "state" you are in to the pair (cost, last state).
In this case a "state" is the pair (current node, set of edges already traversed).
Assuming that you have those data structures, here is pseudocode for a full algorithm that should solve this problem fairly efficiently.
foreach possible starting_point:
    new_state = state(starting_point, {no edges visited})
    todo.add(new_state)
    seen[new_state] = (0, null)

while todo.workleft():
    this_state = todo.get()
    (cost, previous_state) = seen[this_state]
    edges = this_state.edges_traversed()
    foreach directed_edge in graph.directededges(this_state.current_node()):
        new_cost = cost + directed_edge.cost()
        new_visited = directed_edge.to()
        new_edges = edges + directed_edge.edge()
        new_state = state(new_visited, new_edges)
        if not exists seen[new_state] or new_cost < seen[new_state][0]:
            seen[new_state] = (new_cost, this_state)
            todo.add(new_state)

best_cost = infinity
full_edges = {all possible edges}
best_state = null
foreach possible location:
    end_state = state(location, full_edges)
    if exists seen[end_state]:
        (cost, last_state) = seen[end_state]
        if cost < best_cost:
            best_state = end_state
            best_cost = cost

# Now trace back the final answer.
path_in_reverse = []
current_state = best_state
while seen[current_state][1] is not null:
    previous_state = seen[current_state][1]
    path_in_reverse.push(edge from previous_state.current_node() to current_state.current_node())
    current_state = previous_state
And now reverse(path_in_reverse) gives you your optimal path.
Note that the hash seen is critical. It is what prevents you from getting into endless loops.
Looking at today's puzzle, this algorithm will have a maximum of a million or so states that you need to figure out. (There are 2**16 possible sets of edges, and 14 possible nodes you could be at.) That is likely to fit into RAM. But most of your nodes only have 2 edges connected. I would strongly advise collapsing those. This will reduce you to 4 nodes and 6 edges, for an upper limit of 256 states. (Not all are possible, and note that multiple edges now connect two nodes.) This should be able to run very quickly with little use of memory.
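The search described above could be sketched in JavaScript roughly as follows. Several things here are assumptions: the input shape `{ a, b, timeA, timeB }` (the real JSON property names may differ), non-negative weights, a connected graph, and at most 30 lines so the set of traversed edges fits in a bitmask. The name `solveTrace` is made up, and the node-collapsing optimization is not included.

```javascript
// lines: array of { a, b, timeA, timeB } (hypothetical field names), where
// timeA is the cost of going a -> b and timeB the cost of going b -> a.
function solveTrace(lines) {
  if (lines.length > 30) throw new Error('bitmask supports at most 30 lines')

  // Directed moves out of each node; `index` identifies the undirected line.
  const moves = new Map()
  const addMove = (from, to, cost, index) => {
    if (!moves.has(from)) moves.set(from, [])
    moves.get(from).push({ to, cost, index })
  }
  lines.forEach((line, index) => {
    addMove(line.a, line.b, line.timeA, index)
    addMove(line.b, line.a, line.timeB, index)
  })

  // A state is (current node, bitmask of lines traversed so far).
  const key = (node, mask) => node + '|' + mask
  const seen = new Map() // key -> { cost, prev, node, mask }
  const todo = []
  for (const node of moves.keys()) {
    seen.set(key(node, 0), { cost: 0, prev: null, node, mask: 0 })
    todo.push(key(node, 0))
  }

  // Plain queue; a state is queued again whenever a cheaper route is found.
  for (let head = 0; head < todo.length; head++) {
    const here = seen.get(todo[head])
    for (const move of moves.get(here.node)) {
      const mask = here.mask | (1 << move.index)
      const cost = here.cost + move.cost
      const k = key(move.to, mask)
      const known = seen.get(k)
      if (!known || cost < known.cost) {
        seen.set(k, { cost, prev: todo[head], node: move.to, mask })
        todo.push(k)
      }
    }
  }

  // Pick the cheapest state in which every line has been traversed.
  const full = 2 ** lines.length - 1
  let best = null
  for (const node of moves.keys()) {
    const end = seen.get(key(node, full))
    if (end && (!best || end.cost < best.cost)) best = end
  }
  if (!best) return null

  // Walk the prev links back to a starting state, then reverse.
  const path = [best.node]
  for (let s = best; s.prev !== null; s = seen.get(s.prev)) {
    path.push(seen.get(s.prev).node)
  }
  return { cost: best.cost, path: path.reverse() }
}
```

The function returns the total cost and the sequence of nodes visited, or null when no walk covers every line. A priority queue would avoid re-processing states, but for puzzles of this size the plain queue keeps the code short.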
For most parts of the graph you can apply the reasoning from http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg.
This way you can obtain the number of lines that you must repeat in order to solve the puzzle.
At the beginning, you should not start at nodes that have short edges over which you will have to travel twice.
To summarize:
start at a node with an odd number of edges.
do not travel more than once over lines that sit on an even node.
use the shortest path to travel from one odd node to another.
A simple recursive brute-force solver with this heuristic might be a good way to start.
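Finding the odd nodes is the first step of that heuristic. A minimal sketch, assuming each line is given as `{ a, b }` naming its two end nodes (hypothetical field names, like the function name `oddDegreeNodes`):

```javascript
// Return the nodes that have an odd number of lines attached.
function oddDegreeNodes(lines) {
  const degree = new Map()
  for (const { a, b } of lines) {
    degree.set(a, (degree.get(a) || 0) + 1)
    degree.set(b, (degree.get(b) || 0) + 1)
  }
  return [...degree].filter(([, d]) => d % 2 === 1).map(([node]) => node)
}
```

If a connected graph has zero or two odd nodes, every line can be traced exactly once; otherwise some lines have to be repeated.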
Or another way.
Try to find the shortest edges such that, if you remove them from the graph, the remaining graph has only two odd-numbered nodes and can be solved like the Königsberg bridges. The solution is to trace this reduced graph without picking up the pencil, and once you hit a node with a "removed" edge you just go back and forth along it.
On your java backend you might be able to use this TSP code (work in progress) which uses Drools Planner (open source, java).
Is it possible to render a visualization of an audio file?
Maybe with SoundManager2 / Canvas / HTML5 Audio?
Do you know some techniques?
I want to create something like this:
You have a ton of samples and tutorials here: http://www.html5rocks.com/en/tutorials/#webaudio
For the moment it works in the latest Chrome and the very latest Firefox (Opera?).
Demos: http://www.chromeexperiments.com/tag/audio/
To do it now, for all visitors of a website, you can check SoundManagerV2.js, which goes through a Flash "proxy" to access audio data: http://www.schillmania.com/projects/soundmanager2/demo/api/ (They are already working on the HTML5 audio engine, to release it as soon as the major browsers implement it.)
It is up to you to draw the 3 different kinds of audio data in a canvas: waveform, equalizer and peak.
soundManager.defaultOptions.whileplaying = function() { // AUDIO analyzer !!!
$document.trigger({ // DISPATCH ALL DATA RELATIVE TO AUDIO STREAM // AUDIO ANALYZER
type : 'musicLoader:whileplaying',
sound : {
position : this.position, // In milliseconds
duration : this.duration,
waveformDataLeft : this.waveformData.left, // Array of 256 floating-point (three decimal place) values from -1 to 1
waveformDataRight: this.waveformData.right,
eqDataLeft : this.eqData.left, // Containing two arrays of 256 floating-point (three decimal place) values from 0 to 1
eqDataRight : this.eqData.right, // ... , the result of an FFT on the waveform data. Can be used to draw a spectrum (frequency range)
peakDataLeft : this.peakData.left, // Floating-point values ranging from 0 to 1, indicating "peak" (volume) level
peakDataRight : this.peakData.right
}
});
};
With HTML5 you can get :
var freqByteData = new Uint8Array(analyser.frequencyBinCount);
var timeByteData = new Uint8Array(analyser.frequencyBinCount);
function onaudioprocess() {
analyser.getByteFrequencyData(freqByteData);
analyser.getByteTimeDomainData(timeByteData);
/* draw your canvas */
}
Time to work ! ;)
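For the drawing part, here is a minimal sketch of a bar spectrum. The function name `drawSpectrum` is made up; it takes the byte frequency data (values 0 to 255) and any object with clearRect and fillRect, which a canvas 2D context provides.

```javascript
// Draw one frame: one vertical bar per frequency bin, scaled to the canvas.
function drawSpectrum(ctx2d, freqByteData, width, height) {
  const barWidth = width / freqByteData.length
  ctx2d.clearRect(0, 0, width, height)
  for (let i = 0; i < freqByteData.length; i++) {
    // Byte values range from 0 to 255; map them to 0..height.
    const barHeight = (freqByteData[i] / 255) * height
    // Canvas y grows downward, so a bar starts at height - barHeight.
    ctx2d.fillRect(i * barWidth, height - barHeight, barWidth, barHeight)
  }
}
```

In the browser you would call it once per frame, right after analyser.getByteFrequencyData(freqByteData), passing canvas.getContext('2d') along with the canvas width and height.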
Run samples through an FFT, and then display the energy within a given range of frequencies as the height of the graph at a given point. You'll normally want the frequency ranges going from around 20 Hz at the left to roughly the sampling rate/2 at the right (or 20 kHz if the sampling rate exceeds 40 kHz).
I'm not so sure about doing this in JavaScript though. Don't get me wrong: JavaScript is perfectly capable of implementing an FFT -- but I'm not at all sure about doing it in real time. OTOH, for user viewing, you can get by with around 5-10 updates per second, which is likely to be a considerably easier target to reach. For example, 20 ms of samples updated every 200 ms might be halfway reasonable to hope for, though I certainly can't guarantee that you'll be able to keep up with that.
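To make the idea concrete, here is a rough sketch of the per-band energy computation. Note that it uses a naive O(n²) DFT as a stand-in for a real FFT, purely to keep the example short; the function names are made up, and a production version would use a proper FFT routine.

```javascript
// Magnitude spectrum of a block of samples via a naive DFT (not an FFT).
// Returns one magnitude per bin for frequencies below half the sample rate.
function magnitudeSpectrum(samples) {
  const n = samples.length
  const half = Math.floor(n / 2)
  const mags = new Array(half)
  for (let k = 0; k < half; k++) {
    let re = 0
    let im = 0
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n
      re += samples[t] * Math.cos(angle)
      im -= samples[t] * Math.sin(angle)
    }
    mags[k] = Math.hypot(re, im) / n
  }
  return mags
}

// Energy (sum of squared magnitudes) of the bins with lowHz <= f < highHz.
function bandEnergy(samples, sampleRate, lowHz, highHz) {
  const mags = magnitudeSpectrum(samples)
  const binHz = sampleRate / samples.length // width of one frequency bin
  let energy = 0
  for (let k = 0; k < mags.length; k++) {
    const f = k * binHz
    if (f >= lowHz && f < highHz) energy += mags[k] * mags[k]
  }
  return energy
}
```

Each bar of the graph would then be the bandEnergy of one frequency range, recomputed for every new block of samples.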
http://ajaxian.com/archives/amazing-audio-sampling-in-javascript-with-firefox
Check out the source code to see how they're visualizing the audio
This isn't possible yet except by fetching the audio as binary data and unpacking the MP3 (not JavaScript's forte), or maybe by using Java or Flash to extract the bits of information you need (it seems possible but it also seems like more headache than I personally would want to take on).
But you might be interested in Dave Humphrey's audio experiments, which include some cool visualization stuff. He's doing this by making modifications to the browser source code and recompiling it, so this is obviously not a realistic solution for you. But those experiments could lead to new features being added to the <audio> element in the future.
For this you would need to do a Fourier transform (look for FFT) which will be slow in javascript, and not possible in realtime at present.
If you really want to do this in the browser, I would suggest doing it in java/silverlight, since they deliver the fastest number crunching speed in the browser.