How can I make withLatestFrom() work with itself? - javascript

I have an intermediate stream which is bound to a source but can also fire events from other sources (like user input). Elsewhere in my program I have a derived stream which needs to compare a new impulse from intermediate with the last value of source, so it all comes down to code like this:
const source = new Rx.Subject;
const derived = new Rx.Subject;
derived.subscribe( () => console.log( "derived" ) );
const intermediate = new Rx.Subject;
//motivation for having "intermediate" is that sometimes it fires on its own:
source.subscribe( intermediate );
intermediate
.withLatestFrom( source )
.subscribe( derived );
source.next();
<script src="https://cdnjs.cloudflare.com/ajax/libs/rxjs/5.4.3/Rx.min.js"></script>
The problem is that the "derived" message is never printed (the first event in source is ignored). How can I make a stream that, for every message from the intermediate stream, gets the last value of source, even while source's propagation is still in progress?

If I understand correctly, you have a source stream, then an intermediate stream which subscribes to source, emits all of its items, and mixes in some others from user input. Then you have two different requirements, which I'll address separately:
Emitting values of source with timings from intermediate: combineLatest will do the job easily:
const source_when_intermediate : Observable<TSource> = intermediate.combineLatest(source, (i, s) => s);
Comparing the latest value from intermediate with source: While this sounds very similar, combineLatest isn't very safe, because if you do something simple like:
const is_different : Observable<bool> = intermediate.combineLatest(source, (i, s) => i !== s);
intermediate might not have re-emitted a given value from source before source emits a new one, and you could mistakenly conclude that the stale value is unique to intermediate.
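The stale pairing is easier to see with a toy re-implementation. The mini `subject` and `combineLatestPair` below are illustrative stand-ins for the Rx machinery, not rxjs itself; they only reproduce the subscription ordering that causes the hazard:

```javascript
// Minimal subject + combineLatest, just enough to reproduce the ordering hazard.
function subject() {
  const subs = [];
  return {
    subscribe: (f) => subs.push(f),
    next: (v) => subs.slice().forEach((f) => f(v)),
  };
}

function combineLatestPair(a, b, project, out) {
  let la, lb, hasA = false, hasB = false;
  a.subscribe((v) => { la = v; hasA = true; if (hasB) out(project(la, lb)); });
  b.subscribe((v) => { lb = v; hasB = true; if (hasA) out(project(la, lb)); });
}

const source = subject();
const intermediate = subject();
source.subscribe((v) => intermediate.next(v)); // mirror, as in the question

const emissions = [];
combineLatestPair(intermediate, source, (i, s) => i !== s, (x) => emissions.push(x));

source.next("a");       // pairs ("a","a")                 -> false
intermediate.next("b"); // user input: ("b","a")           -> true
source.next("c");       // stale pair ("c","a") -> true, then ("c","c") -> false
```

The third emission is the spurious `true`: intermediate has already received `"c"` from source, but combineLatest still holds the stale `"a"` as the latest source value at that instant.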
Instead, for maximum safety, you'll need to buffer and use a subject:
// untested; a plain array serves as the FIFO queue
const derived : Observable<boolean> = (function() {
    const source_buffer = [];
    const subj = new Subject();
    source.forEach(s => {
        source_buffer.push(s);
    });
    intermediate.forEach(i => {
        subj.next(source_buffer[0] === i);
        if (source_buffer[0] === i) source_buffer.shift();
    });
    Promise.all([
        source.toPromise(),
        intermediate.toPromise()
    ]).then(() => subj.complete());
    return subj.asObservable();
})();

Related

WebSocket scaling on the client side with NodeJS

I have written the script below, which creates multiple WebSocket connections with a smart contract to listen to events. It's working fine, but I feel this is not an optimized solution and it could probably be done in a better way.
const main = async (PAIR_NAME, PAIR_ADDRESS_UNISWAP, PAIR_ADDRESS_SUSHISWAP) => {
    const PairContractHTTPUniswap = new Blockchain.web3http.eth.Contract(
        UniswapV2Pair.abi,
        PAIR_ADDRESS_UNISWAP
    );
    const PairContractWSSUniswap = new Blockchain.web3ws.eth.Contract(
        UniswapV2Pair.abi,
        PAIR_ADDRESS_UNISWAP
    );
    const PairContractHTTPSushiswap = new Blockchain.web3http.eth.Contract(
        UniswapV2Pair.abi,
        PAIR_ADDRESS_SUSHISWAP
    );
    const PairContractWSSSushiswap = new Blockchain.web3ws.eth.Contract(
        UniswapV2Pair.abi,
        PAIR_ADDRESS_SUSHISWAP
    );
    var Price_Uniswap = await getReserves(PairContractHTTPUniswap);
    var Price_Sushiswap = await getReserves(PairContractHTTPSushiswap);
    // subscribe to the Sync event of each Pair
    PairContractWSSUniswap.events.Sync({}).on("data", (data) => {
        Price_Uniswap = Big(data.returnValues.reserve0).div(Big(data.returnValues.reserve1));
        priceDifference(Price_Uniswap, Price_Sushiswap, PAIR_NAME);
    });
    PairContractWSSSushiswap.events.Sync({}).on("data", (data) => {
        Price_Sushiswap = Big(data.returnValues.reserve0).div(Big(data.returnValues.reserve1));
        priceDifference(Price_Uniswap, Price_Sushiswap, PAIR_NAME);
    });
};

for (let i = 0; i < pairsArray.length; i++) {
    main(pairsArray[i].tokenPair, pairsArray[i].addressUniswap, pairsArray[i].addressSushiswap);
}
In the end, I invoke the main function once per pair from a pairs array, in a for-loop. I think this approach is brute force and there must be a better way of doing this.
Any suggestions/opinions would be really appreciated.
Just to clear up the terms: You're opening a websocket connection to the WSS node provider - not to the smart contracts. But yes, your JS snippet subscribes to multiple channels (one for each contract) within this one connection (to the node provider).
You can collect event logs from multiple contracts through just one WSS channel using the web3.eth.subscribe("logs") function (docs), passing it the list of contract addresses as a param. Example:
const options = {
// list of contract addresses that you want to subscribe to their event logs
address: ["0x123", "0x456"]
};
web3.eth.subscribe("logs", options, (err, data) => {
console.log(data);
});
But it has a drawback - it doesn't decode the event log data for you. So your code will need to find the expected data types based on the event signature (returned in data.topics[0]). Once you know which event log is emitted based on the topics[0] event signature (real-life example value in this answer), you can use the decodeLog() function (docs) to get the decoded values.
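The fan-in side of that design can be sketched without web3 at all: once a single subscription delivers raw logs, you route each one back to its pair by contract address. Everything below (the addresses, the pair metadata, the `routeLog` helper) is made up for illustration; a real handler would call `decodeLog()` where the comment indicates:

```javascript
// Sketch: route raw logs from one shared subscription to per-pair handlers.
// Addresses and pair metadata are hypothetical.
const pairsByAddress = new Map([
  ["0x123", { tokenPair: "AAA/BBB", dex: "uniswap" }],
  ["0x456", { tokenPair: "AAA/BBB", dex: "sushiswap" }],
]);

function routeLog(log) {
  const pair = pairsByAddress.get(log.address.toLowerCase());
  if (!pair) return null; // log from a contract we did not subscribe to
  // topics[0] is the event signature; a real handler would decodeLog() here
  return { ...pair, signature: log.topics[0] };
}
```

With this shape, the for-loop over pairs only fills the map; there is one WSS channel instead of one per contract.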

Memory leak in Tensorflow.js: How to manage memory for a large dataset created using tf.data.generator?

There is a clear memory leak in my code that causes my used memory to go from 5 GB to 15.7 GB in a span of 40-60 seconds, and then crashes my program with an OOM error. I believe this happens when I am creating tensors to form the dataset, not when I am training the model. My data consists of 25,000 images stored locally. As such, I used the built-in tensorflow.js function tf.data.generator(generator) described here to create the dataset. I believe this is the best and most efficient way to create a large dataset, as mentioned here.
Example
I used a helper class to create my dataset by passing in the path to the images
class Dataset {
    constructor(dirPath) {
        this.paths = this.#generatePaths(dirPath);
    }
    // Generate file paths for all images to be read as buffers
    #generatePaths = (dirPath) => {
        const dir = fs.readdirSync(dirPath, { withFileTypes: true })
            .filter(dirent => dirent.isDirectory())
            .map(folder => folder.name);
        let imagePaths = [];
        dir.forEach(folder => {
            fs.readdirSync(path.join(dirPath, folder)).filter(file => {
                return path.extname(file).toLocaleLowerCase() === '.jpg';
            }).forEach(file => {
                imagePaths.push(path.resolve(path.join(dirPath, folder, file)));
            });
        });
        return imagePaths;
    }
    // Convert an image buffer to a Tensor object
    #generateTensor = (imagePath) => {
        const buffer = fs.readFileSync(imagePath);
        return tf.node.decodeJpeg(buffer, 3)
            .resizeNearestNeighbor([128, 128])
            .toFloat()
            .div(tf.scalar(255.0));
    }
    // Label the data with the corresponding class
    #labelArray(index) { return Array.from({ length: 2 }, (_, k) => k === index ? 1 : 0); }
    // Javascript generator function passed to tf.data.generator()
    * #imageGenerator() {
        for (let i = 0; i < this.paths.length; ++i) {
            let image;
            try {
                image = this.#generateTensor(this.paths[i]);
            } catch (error) {
                continue;
            }
            console.log(tf.memory());
            yield image;
        }
    }
    // Javascript generator function passed to tf.data.generator()
    * #labelGenerator() {
        for (let i = 0; i < this.paths.length; ++i) {
            const classIndex = (path.basename(path.dirname(this.paths[i])) === 'Cat' ? 0 : 1);
            const label = tf.tensor1d(this.#labelArray(classIndex), 'int32');
            console.log(tf.memory());
            yield label;
        }
    }
    // Load data
    loadData = () => {
        console.log('\n\nLoading data...');
        const xs = tf.data.generator(this.#imageGenerator.bind(this));
        const ys = tf.data.generator(this.#labelGenerator.bind(this));
        const ds = tf.data.zip({ xs, ys }).batch(32).shuffle(32);
        return ds;
    }
}
And I am creating my dataset like this:
const trainDS = new Dataset(trainPath).loadData();
Question
I am aware of built-in tfjs methods to manage memory such as tf.tidy() and tf.dispose(). However, I was unable to implement them in such a way to stop the memory leak, as the tensors are generated by the tf.data.generator function.
How would I go about successfully disposing the tensors from memory after they are yielded by the generators?
Every tensor you create, you need to dispose of: there is no garbage collection as you're used to in JS. That's because tensors are not kept in JS memory (they can be in GPU memory, a WASM module, etc.), so the JS engine cannot track them. They are more like pointers than normal variables.
For example, in your code:
return tf.node.decodeJpeg(buffer, 3)
.resizeNearestNeighbor([128, 128])
.toFloat()
.div(tf.scalar(255.0))
Each chained operation creates an interim tensor that never gets disposed. Read it this way:
const decoded = tf.node.decodeJpeg(buffer, 3)
const resized = decoded.resizeNearestNeighbor([128, 128])
const casted = resized.toFloat();
const normalized = casted.div(tf.scalar(255.0))
return normalized;
So you have four large tensors allocated somewhere. What you're missing is:
tf.dispose([decoded, resized, casted]);
and later, when you're done with the image, also tf.dispose(image), which disposes normalized.
The same goes for everything that is a tensor.
I am aware of built-in tfjs methods to manage memory such as tf.tidy() and tf.dispose(). However, I was unable to implement them in such a way to stop the memory leak, as the tensors are generated by the tf.data.generator function.
You say you're aware, but you're doing exactly the same thing: creating interim tensors you're not disposing.
You can help yourself by wrapping such functions in tf.tidy(), which creates a local scope so that everything not returned gets released automatically.
for example:
#generateTensor = (imagePath) => tf.tidy(() => {
    const buffer = fs.readFileSync(imagePath);
    return tf.node.decodeJpeg(buffer, 3)
        .resizeNearestNeighbor([128, 128])
        .toFloat()
        .div(tf.scalar(255.0));
});
This means the interim tensors will get disposed of, but you still need to dispose of the return value once you're done with it.
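The scope rule can be shown with a toy allocator in the spirit of tf.tidy(). None of this is the real tfjs implementation; it only mimics the bookkeeping: track what gets allocated inside the callback, and dispose everything except the returned value:

```javascript
// Toy memory tracker: alloc() registers a "tensor", tidy() disposes everything
// allocated inside the callback except the value it returns.
const live = new Set();

function alloc(id) {
  const t = { id, dispose: () => live.delete(t) };
  live.add(t);
  return t;
}

function tidy(fn) {
  const before = new Set(live);
  const result = fn();
  for (const t of [...live]) {
    if (!before.has(t) && t !== result) t.dispose(); // interim values released
  }
  return result; // still live: caller must dispose it when done
}

const normalized = tidy(() => {
  const decoded = alloc("decoded");
  const resized = alloc("resized");
  const casted = alloc("casted");
  return alloc("normalized"); // only this survives the scope
});
```

After the tidy() call, only `normalized` is still allocated, which is exactly why the caller still owns one dispose.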

RxJS - Create Auto-Complete Observable That First Returns Data From Cache And Then From Server

I found this article that explains how I can use RxJs to create an observable for auto-complete:
https://blog.strongbrew.io/building-a-safe-autocomplete-operator-with-rxjs
const autocomplete = (time, selector) => (source$) =>
    source$.pipe(
        debounceTime(time),
        switchMap((...args: any[]) =>
            selector(...args).pipe(
                takeUntil(
                    source$.pipe(
                        skip(1)
                    )
                )
            )
        )
    );

term$ = new BehaviorSubject<string>('');
results$ = this.term$.pipe(
    autocomplete(1000, term => this.fetch(term))
);
I want to improve this auto-complete observable by first returning data from local storage, displaying it to the user, and then continuing to the server to fetch data. The data returned from the server will not replace the result from local storage but will be added to it.
If I understand correctly, each time the user types, the observable should emit twice.
How can I build this in the most efficient way?
Kind Regards,
Tal Humy
I think you can take advantage of startWith.
const term$ = new BehaviorSubject('');
const localStorageResults = localStorage.getItem(''); // map it into the same shape as results$, but with the observable unwrapped
const results$ = term$
    .pipe(
        startWith(localStorageResults),
        debounceTime(1000),
        switchMap(term =>
            getAutocompleteSuggestions(term)
                .pipe(
                    takeUntil(
                        // skip 1 value
                        term$.pipe(skip(1))
                    )
                )
        )
    );
You may have to tinker with that; I am not sure it will play nicely with debounceTime, but it's an idea.
So after dealing with this for a few hours, I figured out that the solution was very straightforward:
autocomplete(1000, term => new Observable(s => {
    s.next(this.fetchFromStorage(term));
    const sub = this.fetchFromServer(term)
        .subscribe(r => s.next(r));
    return () => sub.unsubscribe(); // teardown, so switched-out requests are cancelled
}))
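The emission order that solution produces (cached value synchronously, server value appended when it arrives) can be sketched with plain promises. `fetchFromStorage` and `fetchFromServer` below are stand-ins for the question's methods, not real APIs:

```javascript
// Emits the cached result synchronously, then the server result when it arrives.
function cacheThenNetwork(term, fetchFromStorage, fetchFromServer, emit) {
  emit(fetchFromStorage(term)); // 1st emission: local storage, synchronous
  return fetchFromServer(term).then(emit); // 2nd emission: server, async
}

const out = [];
const done = cacheThenNetwork(
  "ab",
  (t) => "cached:" + t,
  (t) => Promise.resolve("server:" + t),
  (r) => out.push(r)
);
```

Immediately after the call, `out` already holds the cached entry; once `done` resolves, the server entry follows it, which matches the "emit twice per keystroke" requirement.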

Flush a RxJs Subject

I want to create a subject with a bufferTime pipe.
e.g.
subject.pipe(
    bufferTime(1000, null, this._bufferSize),
    filter(v => v.length !== 0)
)
After using this subject and finishing the work, I'd like the user to call onComplete (or some new method) to flush the remaining contents of the stream.
Since this is time based I could wait for the stream to flush itself, but as I'm using AWS Lambda runtime is money.
Is there a simple way to implement a flush?
I think you are looking for takeUntil operator:
const subject = new Subject();
const complete = new Subject();
const BUFFER_SIZE = 10;
subject
    .pipe(
        takeUntil(complete),
        bufferTime(1000, null, BUFFER_SIZE)
    )
    .subscribe(buffer => {
        console.log(Date.now(), buffer);
    });
I use another Subject called complete that is used for completing the Observable and consequently flushing the buffer in bufferTime.
See working example here: https://stackblitz.com/edit/typescript-ihjbxb
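The flush-on-complete behavior can be mimicked without rxjs. This toy buffer is only a model (the time axis of bufferTime is omitted), but it shows the contract: batches are emitted when full, and completing flushes whatever remains instead of waiting for the next timer tick:

```javascript
// Collects values until the buffer is full, emits the batch, and flushes
// whatever remains when complete() is called.
function makeBuffer(size, emit) {
  let buf = [];
  const flush = () => {
    if (buf.length) {
      emit(buf);
      buf = [];
    }
  };
  return {
    next(v) {
      buf.push(v);
      if (buf.length >= size) flush();
    },
    complete: flush, // completing flushes the partial buffer immediately
  };
}
```

On AWS Lambda this matters because the immediate flush lets the handler return without paying for the remaining buffer window.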

Repeating/Resetting an observable

I am using rxjs to create a "channel number" selector for a remote control on a smart TV. The idea is that as you enter the numbers you see them on the screen, and after you have finished entering them, you are taken to that channel.
I use two observables to achieve this:
A "progress" stream that listens to all number input and, via the scan operator, emits the concatenated number string as the numbers are entered.
A "completed" stream that, after n milliseconds with no number having been entered, emits the final numeric string as completed, e.g. 1-2-3 -> "123".
Here is the code that I use to try and solve this:
channelNumber:
module.exports = function (numberKeys, source, scheduler) {
    return function (completedDelay) {
        var toNumericString = function (name) {
                return numberKeys.indexOf(name).toString();
            },
            concat = function (current, numeric) {
                return current.length === 3 ? current : current + numeric;
            },
            live = createPress(source, scheduler)(numberKeys)
                .map(toNumericString)
                .scan(concat, '')
                .distinctUntilChanged(),
            completed = live.flatMapLatest(function (value) {
                return Rx.Observable.timer(completedDelay, scheduler).map(value);
            }),
            progress = live.takeUntil(completed).repeat();
        return {
            progress: progress,
            completed: completed
        };
    };
};
createPress:
module.exports = function (source, scheduler) {
    return function (keyName, throttle) {
        return source
            .filter(H.isKeyDownOf(keyName))
            .map(H.toKeyName);
    };
};
createSource:
module.exports = function (provider) {
    var createStream = function (event) {
        var filter = function (e) {
                return provider.hasCode(e.keyCode);
            },
            map = function (e) {
                return {
                    type: event,
                    name: provider.getName(e.keyCode),
                    code: e.keyCode
                };
            };
        return Rx.Observable.fromEvent(document, event)
            .filter(filter)
            .map(map);
    };
    return Rx.Observable.merge(createStream('keyup'), createStream('keydown'));
};
Interestingly, the above code works as expected under test conditions (mocking source and scheduler using Rx.TestScheduler). But in production, when no scheduler is passed and source is the result of createSource (above), the progress stream only ever emits until completed fires, and then never again. It's as if the repeat is completely ignored or redundant. I have no idea why.
Am I missing something here?
You can use Window. In this case, I would suggest WindowWithTime. You can also do more interesting things like use Window(windowBoundaries) and then pass the source with Debounce as boundary.
source
    .windowWithTime(1500)
    .flatMap(ob => ob.reduce((acc, cur) => acc + cur, ""))
Also, since each window is an observable that completes, we can use Reduce to accumulate the values from the window and concatenate our number.
Now, this variant will close a window after 1.5 seconds. Instead, we want to wait x seconds after the last keypress. Naïvely we could do source.window(source.debounce(1000)), but then we subscribe to our source twice, which we want to avoid for two reasons: first, we do not know whether subscribing has side effects; second, we do not know in what order the subscriptions will receive events. The last point isn't really a problem here, since debounce already adds a delay after the last keypress, but it's still something to consider.
The solution is to publish our source. In order to keep the publish inside the sequence, we wrap it into observable.create.
Rx.Observable.create(observer => {
    var ob = source.publish();
    return new Rx.CompositeDisposable(
        ob.window(ob.debounce(1000)).subscribe(observer),
        ob.connect()
    );
}).flatMap(ob => ob.reduce((acc, cur) => acc + cur, ""))
Edit: Or use publish like this:
source.publish(ob => ob.window(ob.debounce(1000)))
    .flatMap(ob => ob.reduce((acc, cur) => acc + cur, ""))
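The grouping that window(debounce(1000)) produces, i.e. key presses separated by quiet gaps, can be checked synchronously on timestamped events. This helper is only a model of that behavior, not an Rx operator:

```javascript
// Splits timestamped key presses into groups wherever the gap between two
// presses exceeds quietMs, then concatenates each group's digits.
function groupByGap(events, quietMs) {
  const groups = [];
  let current = null;
  let lastT = -Infinity;
  for (const { t, digit } of events) {
    if (t - lastT > quietMs) {
      current = [];            // quiet gap: start a new "window"
      groups.push(current);
    }
    current.push(digit);
    lastT = t;
  }
  return groups.map((g) => g.join(""));
}
```

For example, presses at t = 0, 200, 400 and then at t = 2000, 2100 with a 1000 ms quiet threshold yield the channel strings "123" and "45", which is the progress/completed split the question is after.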
