I am currently building a WebRTC application that streams audio (the classic server/client, one-to-many model). Communication and signaling are done through sockets.
The problem I have found is that there is a lot of variability when streaming to smart devices (mainly due to varying processing power), even on a local network.
Hence, I am trying to add functionality that syncs the stream between devices. At a high level, I was thinking of buffering the incoming stream; once all devices are connected, the last peer to connect would share something that indicates where its buffer starts, and all peers would then play their buffers from that position.
Does this sound possible? Is there a better way to sync up remote streams? If I were to go down this path, how would I go about buffering a remote MediaStream object (or data from a Blob URL), potentially into some form of array, which could be used to identify a common starting location between the streams?
Would I use the JavaScript AudioContext API for this?
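For concreteness, something like this rough, untested sketch is what I had in mind for tapping samples out of the remote stream (remoteStream would be the MediaStream from the peer connection, and the buffering scheme is just a guess):

```javascript
// Untested sketch: tap the remote MediaStream's audio into a plain array of sample blocks.
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
const source = audioCtx.createMediaStreamSource(remoteStream);

// ScriptProcessorNode is deprecated but simple; an AudioWorklet would be the modern alternative.
const processor = audioCtx.createScriptProcessor(4096, 1, 1);
const bufferedBlocks = []; // grows as audio arrives; an index into it could mark a shared start point

processor.onaudioprocess = (event) => {
  // Keep a copy of each block so every peer holds its own history of the stream.
  bufferedBlocks.push(new Float32Array(event.inputBuffer.getChannelData(0)));
};

source.connect(processor);
processor.connect(audioCtx.destination); // some browsers only fire onaudioprocess when connected
```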
I have also looked at NTP and other syncing mechanisms, but I couldn't work out how to apply them in the context of a WebRTC application.
Any help, pointers, or direction would be greatly appreciated.
Let me elaborate on my goal.
I want to open a remote desktop connection to a system running a Windows application, from my Angular.js application. My server is Google App Engine.
What I have thought of so far:
The Windows application takes screenshots and sends them to the Google App Engine Channel API.
The Channel API notifies the Angular app, sends it the screenshots, and the app displays them.
The problem with this method is that it's very costly and slow.
Request
Please suggest a tool, an API, or another way to build a screen-sharing application.
This will not be the answer you are looking for, but read on anyway.
tl;dr;
What you are trying to do is not an App Engine use case and you really shouldn't use App Engine to implement this kind of solution.
long version:
As you found out yourself, the Channel API will become costly and slow for what you are trying to do. This is because the Channel API simply isn't made to stream large amounts of data to the client. It's meant to send regular updates to the client, like incoming updates for a real-time chat or a news ticker. The best-case scenario is that you only notify the client of new content and the client requests this new content from the server; so you could send the URL of the new screenshot to the client and have the client request it. When you stream a desktop or a video this indirection doesn't buy you anything, though, because with that kind of streaming you want as many updates as you can get. You might as well just poll every few milliseconds.
Using screenshots to share a desktop is a particular kind of madness, because the data "stream" cannot be compressed properly and will thus be far larger than it has to be. Remote desktop systems usually use compression very similar to video compression algorithms, where only the changes / differences from the previous picture / frame are transmitted and a full key frame is sent once in a while. More data means more bandwidth and more latency in your stream, so it's really important that you at least try to minimize the data flow as much as possible.
The goal in most App Engine applications is to allow scaling to thousands of parallel connections. This is accomplished by allowing multiple instances to serve the same content and by enforcing several restrictions (like the 60-second request deadline for frontend / 10-minute deadline for backend requests, maximum bandwidth usage in a single request, etc.) which chop huge tasks into small requests that can then be served by the multitude of App Engine instances. The same restrictions will not allow you to create a long-running, continuous data stream for something like video or remote desktop streaming. If you poll every few milliseconds as suggested above, App Engine would spawn new instances on a regular basis, which would cause warm-up requests and further delays.
But enough of what won't work; here is an example of what should work:
Streaming servers are servers which allow direct streaming to clients
Streaming servers publish their service URL to your app engine application
Your AngularJS application requests a stream from the app engine application
App Engine tells the AngularJS application the streaming server information from above
The client requests the stream directly from the server
This approach leaves App Engine out as a proxy for your data, so you don't have to worry about the streaming traffic. It does, however, require your streaming server to be directly reachable on the internet.
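To make the hand-off concrete, here is a minimal sketch of the client side, assuming a hypothetical /api/stream-info endpoint on App Engine that does nothing but return the streaming server's URL:

```javascript
// Hypothetical AngularJS (1.x) controller: App Engine only hands out the stream URL,
// the video data itself never passes through App Engine.
angular.module('remoteDesktopApp').controller('StreamCtrl', function ($scope, $http) {
  $http.get('/api/stream-info') // e.g. responds with { "url": "https://stream.example.com/desktop/1" }
    .then(function (response) {
      // From here on the client talks to the streaming server directly.
      $scope.streamUrl = response.data.url;
    });
});
```

The heavy streaming traffic then flows only between the browser and the streaming server.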
Alternatively, there is a vast number of applications / services (twitch.tv, to name one example) available which allow desktop streaming without you writing a single line of code. Such streams could simply be embedded in your Angular application. Since this is not Software Recommendations, I don't want to go any deeper into this matter here.
PROBLEM:
WebRTC gives us peer-to-peer video/audio connections. It is perfect for p2p calls, hangouts. But what about broadcasting (one-to-many, for example, 1-to-10000)?
Let's say we have a broadcaster "B" and two attendees "A1", "A2". Of course it seems to be solvable: we just connect B with A1 and then B with A2. So B sends its video/audio stream directly to A1 and another copy to A2; B sends the stream twice.
Now let's imagine there are 10000 attendees: A1, A2, ..., A10000. That means B must send 10000 streams. Each stream is ~40 KB/s, which means B needs 400 MB/s of outgoing bandwidth to maintain this broadcast. Unacceptable.
ORIGINAL QUESTION (OBSOLETE)
Is it possible to solve this somehow, so that B sends only one stream to some server and attendees just pull the stream from that server? Yes, this means the outgoing speed on that server must be high, but I can maintain that.
Or does this defeat the whole idea of WebRTC?
NOTES
Flash does not work for my needs because of its poor UX for end customers.
SOLUTION (NOT REALLY)
26.05.2015 - As of now there is no solution for scalable WebRTC broadcasting that does not use media servers at all. There are server-side solutions as well as hybrid ones (p2p + server-side, depending on conditions) on the market.
There are some promising technologies, though, like https://github.com/muaz-khan/WebRTC-Scalable-Broadcast, but they still need to answer possible issues around latency, overall network connection stability, and the scalability formula (they are probably not infinitely scalable).
SUGGESTIONS
Decrease CPU/bandwidth usage by tweaking both audio and video codecs (see the sketch after this list);
Get a media server.
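As a rough illustration of the first suggestion (hedged: this relies on RTCRtpSender.setParameters(), which not every browser supported at the time), capping the video sender's bitrate is often simpler than switching codecs:

```javascript
// Sketch: cap the outgoing video bitrate on an existing RTCPeerConnection.
async function capVideoBitrate(peerConnection, maxBitrate = 300000) { // ~300 kb/s, illustrative value
  const sender = peerConnection.getSenders().find((s) => s.track && s.track.kind === 'video');
  if (!sender) return;
  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) params.encodings = [{}];
  params.encodings[0].maxBitrate = maxBitrate;
  await sender.setParameters(params);
}
```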
As has pretty much been covered here, what you are trying to do is not possible with plain, old-fashioned WebRTC (strictly peer-to-peer), because, as was said earlier, WebRTC connections negotiate encryption keys to encrypt the data for each session. So your broadcaster (B) will indeed need to upload its stream as many times as there are attendees.
However, there is a quite simple solution which works very well; I have tested it. It is called a WebRTC gateway, and Janus is a good example. It is completely open source (GitHub repo here).
This works as follows: your broadcaster contacts the gateway (Janus) which speaks WebRTC. So there is a key negotiation: B transmits securely (encrypted streams) to Janus.
Now, when attendees connect, they connect to Janus: again, WebRTC negotiation, secured keys, etc. From then on, Janus emits the streams back to each attendee.
This works well because the broadcaster (B) only uploads its stream once, to Janus. Janus decrypts the data using its own key, has access to the raw data (that is, the RTP packets), and can emit those packets back to each attendee (Janus takes care of encryption for you). And since you put Janus on a server, it has great upload bandwidth, so you will be able to stream to many peers.
So yes, it does involve a server, but that server speaks WebRTC, and you "own" it: you implement the Janus part, so you don't have to worry about data corruption or a man in the middle. Well, unless your server is compromised, of course, but there is only so much you can do about that.
To show you how easy it is to use: in Janus you have a function called incoming_rtp() (and incoming_rtcp()) which gives you a pointer to the RT(C)P packets. You can then send them to each attendee (they are stored in sessions that Janus makes very easy to use). Look here for one implementation of the incoming_rtp() function; a couple of lines below you can see how to transmit the packets to all attendees, and here you can see the actual function that relays an RTP packet.
It all works pretty well, and the documentation is fairly easy to read and understand. I suggest you start with the "echotest" example; it is the simplest, and you can understand the inner workings of Janus from it. I also suggest you edit the echo test file to make your own, because there is a lot of redundant code to write, so you might as well start from a complete file.
Have fun! Hope I helped.
As @MuazKhan noted above:
https://github.com/muaz-khan/WebRTC-Scalable-Broadcast
works in Chrome, with no audio broadcast yet, but it seems to be a first solution.
A Scalable WebRTC peer-to-peer broadcasting demo.
This module simply initializes socket.io and configures it in a way that single broadcast can be relayed over unlimited users without any bandwidth/CPU usage issues. Everything happens peer-to-peer!
This should definitely be possible to complete.
Others are also able to achieve this: http://www.streamroot.io/
AFAIK the only current implementation of this that is relevant and mature is Adobe Flash Player, which has supported p2p multicast for peer-to-peer video broadcasting since version 10.1.
http://tomkrcha.com/?p=1526.
"Scalable" broadcasting is not possible on the Internet, because the IP UDP multicasting is not allowed there. But in theory it's possible on a LAN. The problem with Websockets is that you don't have access to RAW UDP by design and it won't be allowed.
The problem with WebRTC is that its media streams use a form of SRTP, where each session has its own encryption key. So unless somebody "invents" a way, or an API allows one, to share a single session key between all clients, multicast is useless.
There is the solution of peer-assisted delivery, meaning the approach is hybrid. Both server and peers help distribute the resource. That's the approach peer5.com and peercdn.com have taken.
If we're talking specifically about live broadcast it'll look something like this:
Broadcaster sends the live video to a server.
The server saves the video (usually also transcodes it to all the relevant formats).
Metadata about the live stream is created, compatible with HLS, HDS, or MPEG-DASH.
Consumers browse to the relevant live stream, where the player gets the metadata and knows which chunks of the video to get next.
At the same time, the consumer is connected to other consumers (via WebRTC).
Then the player downloads the relevant chunk either directly from the server or from peers.
Following such a model can save up to ~90% of the server's bandwidth, depending on the bitrate of the live stream and the combined uplink of the viewers.
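A hedged sketch of the last two steps from the player's point of view; the data-channel message format and the names used here are invented for illustration:

```javascript
// Sketch: fetch a media chunk from a connected peer over an RTCDataChannel if possible,
// otherwise fall back to the CDN/origin URL taken from the HLS/DASH manifest.
function fetchChunk(chunkId, chunkUrl, peerChannel, timeoutMs = 1000) {
  const fetchFromServer = () => fetch(chunkUrl).then((response) => response.arrayBuffer());

  if (!peerChannel || peerChannel.readyState !== 'open') {
    return fetchFromServer();
  }
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(fetchFromServer()), timeoutMs); // peer too slow: use server
    peerChannel.onmessage = (event) => {
      clearTimeout(timer);
      resolve(event.data); // the peer answers with the chunk bytes
    };
    peerChannel.send(JSON.stringify({ type: 'chunk-request', id: chunkId }));
  });
}
```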
Disclaimer: the author works at Peer5.
My master's is focused on the development of a hybrid CDN/P2P live streaming protocol using WebRTC. I've published my first results at http://bem.tv
Everything is open source and I'm looking for contributors! :-)
The answer from Angel Genchev seems to be correct; however, there is a theoretical architecture that allows low-latency broadcasting via WebRTC. Imagine B (the broadcaster) streams to A1 (attendee 1). Then A2 (attendee 2) connects. Instead of streaming from B to A2, A1 starts relaying the video it receives from B on to A2. If A1 disconnects, A2 starts receiving from B again.
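For what it's worth, the relaying step itself is simple in browser code. A rough sketch (signaling omitted; the connection names are mine), where A1 forwards B's tracks onto its connection with A2:

```javascript
// pcFromB: A1's RTCPeerConnection receiving from the broadcaster B.
// pcToA2: A1's RTCPeerConnection towards the newly joined attendee A2.
pcFromB.ontrack = (event) => {
  const [streamFromB] = event.streams;
  // Re-publish B's tracks towards A2 (this triggers renegotiation with A2).
  streamFromB.getTracks().forEach((track) => pcToA2.addTrack(track, streamFromB));
};
```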
This architecture would only work if there were no latency and no connection timeouts, so it is right in theory but not in practice.
At the moment I am using a server-side solution.
I'm developing a WebRTC broadcasting system using the Kurento Media Server. Kurento supports several streaming protocols such as RTSP, WebRTC, and HLS, and it works well in terms of real-time delivery and scaling.
However, Kurento doesn't support RTMP, which is what YouTube and Twitch use now. One of my remaining problems with this is the number of concurrent users.
Hope it helps.
You are describing using WebRTC with a one-to-many requirement. WebRTC is designed for peer-to-peer streaming; however, there are configurations that will let you benefit from the low latency of WebRTC while delivering video to many viewers.
The trick is not to tax the streaming client with every viewer and, as you mentioned, to have a "relay" media server. You can build this yourself, but honestly the best solution is often to use something like Wowza's WebRTC Streaming product.
To stream efficiently from a phone you can use Wowza's GoCoder SDK, but in my experience a more advanced SDK like StreamGears works best.
If I am in a room with 7 other users, I am wondering whether WebRTC forces every user to establish a connection to each of the other participants.
Obviously that would consume something like 7 kb/s x 7 of download and just as much upload, and many connections cannot handle this if they are already busy.
With some kind of media relay instead, the upload would be only 7 kb/s, but you would lose bandwidth adaptation between peers.
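To spell out the numbers I have in mind (assuming ~7 kb/s per stream and 8 participants):

```javascript
// Full mesh vs. a relay, bandwidth per participant.
const participants = 8, streamKbps = 7;
const meshUpload    = (participants - 1) * streamKbps; // 49 kb/s up: one copy per other participant
const meshDownload  = (participants - 1) * streamKbps; // 49 kb/s down: one stream from each of them
const relayUpload   = streamKbps;                      // 7 kb/s up: a single copy sent to the relay
const relayDownload = (participants - 1) * streamKbps; // still 49 kb/s down, unless the server mixes
```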
Do you know of any media relay, or another way to solve this problem? Is a TURN server (like https://code.google.com/p/rfc5766-turn-server/) suitable for this kind of job (multicast included)?
A TURN server works as a fallback relay server in order to enable connectivity when direct peer-to-peer connectivity is impossible because of firewalls or other network issues. (More information here: press P for speaker notes.) TURN servers are not designed for media distribution.
A Multipoint Control Unit could solve the problem you refer to: there's an example topology for this here. As stated in the notes for that slide:
This is a server that's made specifically to do distribution of media, and can handle large numbers of participants; it can also do smart things like selective stream forwarding, mixing of the audio or video, or recording.
Have a look at https://datatracker.ietf.org/doc/html/draft-ietf-rtcweb-use-cases-and-requirements-06 for details about WebRTC use cases. The authors mention a multi-user conferencing solution that uses a central server. So the best solution for establishing multi-user A/V conferences with WebRTC is to have such a central server that does the audio mixing and A/V "broadcasting" to all peers.
This circumvents the bandwidth problems you mention in your question. Currently a whole bunch of start-ups and established service providers are working on WebRTC-based conferencing solutions; just let your favourite web search engine pick some examples.
A TURN server alone doesn't suffice since TURN is only used to relay data for hosts that can't be reached directly (possibly because of firewalls). TURN servers don't terminate WebRTC connections.
Yes, you would have to establish separate connections to each of your peers. In order to solve this you could use a media server like Kurento.
With a media server, every peer connects to the media server; the server then combines the video streams from your peers into one by placing them side by side and sends you the resulting stream. This saves peers the trouble of having to download streams from every other peer.
You are right that bandwidth adaption between peers is an issue.
A TURN server does not solve this issue since all it does is provide a stable endpoint, typically for people behind very restrictive NAT setups.
The solution to this issue lies in scalable video codecs. These video codecs are specifically designed to solve the multi-way video conferencing problem. H.264/SVC is one such scalable codec and it is currently used by Google+ Hangouts. VP8 also has temporal and spatial scalability and is used in WebRTC.
Scalable video codecs are designed so that parts of the stream, typically individual UDP packets, can be removed from the stream while preserving the ability to decode the video at a lower quality. At least three types of scalability are used:
Temporal, in which the frames-per-second is reduced.
Spatial, where the number of pixels is reduced.
Quality, where the color resolution is reduced.
If you implement a video conferencing server, you can go into the VP8 stream at a lower level than the WebRTC level, make the necessary changes to each video stream, and solve the bandwidth adaptation issue.
If the question still stands, here is my suggestion:
Depending on your SIP server, install RTP proxy software; for example, if you are using Kamailio, couple it with rtpengine.
I want to stream video between 2 clients without passing it through the server.
Each side sends real-time video and also receives the other side's real-time video.
Is there an open source project that allows that?
Is there an API for that? I'm willing to pay.
I want to create it as a web app for mobile:
JS, HTML, Ajax, WebSockets, CSS...
Thank you so much
VLC has a built-in streaming server; as well as the GUI, it can be used via the command line, so it could be scripted to suit your requirements.
http://www.videolan.org/doc/streaming-howto/en/
If you want to stream video directly from one client to another, you have to distinguish between two networking models: client-to-server and peer-to-peer.
A server is usually a static machine with networking infrastructure, a static IP, and other things that make it publicly accessible.
With peer-to-peer you will face many problems, the first of which is getting through NAT when you create a socket for receiving. One client might need to create a socket to accept the connection while the second connects to it, or they might try both simultaneously and stick with whichever connects first.
Streaming video using web technologies is not possible right now. There is only some beta development happening for Chrome and Firefox, and it will not be publicly available any time soon.
You also can't establish a peer-to-peer connection using WebSockets.
So there is no way to do it using web technologies.
You might want to look into native mobile development, but there you will face problems with peer-to-peer connections as well.