WebRTC Series <4> A comprehensive understanding of WebRTC for client-server web games

Reproduced: https://blog.brkho.com/2017/03/15/dive-into-client-server-web-games-webrtc/

Multiplayer games are fun. What they lack in single-player immersion, online games make up for in the uniquely rewarding experience of exploring with friends, meeting strangers online, and going head-to-head with capable peers. One need only look at giants such as League of Legends, Hearthstone, and Overwatch to realize the mass demand for multiplayer games [1]. While these franchises are successful, they have a significant barrier to entry in the form of their multi-gigabyte game clients. Of course, the installation won't deter hardcore gamers, but for many casual gamers, the extra step is a non-starter.

For this reason, browser games have huge potential for massively multiplayer experiences. While downloading and installing a client might be too much for some, playing a game by simply visiting a web page is low-friction and good for virality. I'm currently building such a game, and in this blog post I want to share my experience of establishing the connection between the browser and the game server.

tl;dr: If you're already familiar with these concepts, you can check out the complete sample code to get started.

TCP and UDP

The first step in developing any multiplayer game is to identify a transport layer protocol, the two most popular by far being TCP and UDP. The differences [2, 3] are already covered extensively by many resources, so I'll only treat this topic briefly. In short, UDP is a simple connectionless protocol that allows a source to send a single packet of data to a destination. Due to the unreliable nature of networks, some packets may be dropped or reach their destination out of order, and UDP provides no protection against either. TCP, on the other hand, is connection-based and guarantees that packets are delivered and received in the order they were sent by the source. Of course, this comes at the expense of speed: the receiver must acknowledge data, the sender must retransmit anything that was lost, and later packets are held back until the gap is filled.

While TCP has been used in many successful games (most notably World of Warcraft), most modern online games prefer UDP, because retransmitting dropped packets, with the delay that entails, is not only unnecessary but unacceptable during fast-paced gameplay. UDP is definitely more complicated to use, but with some effort [4] you can use its flexibility to your advantage and avoid bloating ping.
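To make the cost of retransmission concrete, here is a toy model of delivery times, not a real network simulation. The send interval, the arrival times, and `RETRANSMIT_DELAY_MS` are all assumed numbers chosen for illustration.

```javascript
// Assumed cost of noticing a loss and resending the packet (illustrative only).
const RETRANSMIT_DELAY_MS = 200;

// TCP-like: a lost packet is resent, and nothing is delivered out of order,
// so every later packet waits for the retransmission.
function deliverReliableOrdered(sendTimes, arrivals) {
  let lastDelivery = 0;
  return arrivals.map((arrival, i) => {
    const actualArrival = arrival === null ? sendTimes[i] + RETRANSMIT_DELAY_MS : arrival;
    lastDelivery = Math.max(lastDelivery, actualArrival);
    return lastDelivery;
  });
}

// UDP-like: packets are handed over as they arrive; a lost packet never shows up.
function deliverUnreliable(arrivals) {
  return arrivals.slice();
}

const sendTimes = [0, 16, 32, 48];   // one packet per 16 ms frame
const arrivals = [50, null, 82, 98]; // packet 1 is dropped on the way

console.log(deliverReliableOrdered(sendTimes, arrivals)); // [50, 216, 216, 216]
console.log(deliverUnreliable(arrivals));                 // [50, null, 82, 98]
```

In the TCP-like case, one lost packet stalls the two packets behind it by more than 100 ms, even though they arrived on time. In the UDP-like case the game simply skips the stale update.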

UDP on the browser

"That sounds good," you say, "but what's the catch?" Usually, as long as you take care to guard against dropped transmissions and network congestion, there isn't one. Unfortunately, there's a very big problem for browser games: for security reasons, there is no cross-platform way to send or receive packets over UDP in the browser [5]. Most online web games, such as agar.io, rely on WebSockets for networking, which expose a clean interface for TCP connections to servers. However, as I mentioned before, TCP breaks down where sub-second responses are required, so does that mean we're stuck distributing shooters and MOBAs as native clients?

Saved by WebRTC

Of course not! No matter how convoluted the solution, the web will always find a way. Enter WebRTC, a browser API that enables real-time communication over peer-to-peer connections [6]. While much of WebRTC is tailored for media transport (such as voice chat in Discord's web application or video calling in Messenger's web client), it includes a small, often overlooked specification called the data channel, which allows sending arbitrary messages between two peer browsers.

I mentioned earlier that TCP and UDP are the most popular transport layer protocols, but they are far from the only ones. WebRTC data channels use the Stream Control Transmission Protocol (SCTP), which is connection-oriented like TCP but allows for configurability in terms of reliability and packet delivery. In other words, SCTP can be configured to, like TCP, guarantee delivery and ordering of packets, or we can turn off these features to end up with UDP-like functionality.
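As a small sketch of what this configurability looks like from the browser, the helper below maps the two modes onto the `ordered` and `maxRetransmits` options accepted by `RTCPeerConnection.createDataChannel`. The `dataChannelOptions` function and its mode names are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: returns the options object to pass as the second
// argument of RTCPeerConnection.createDataChannel(label, options).
function dataChannelOptions(mode) {
  switch (mode) {
    case 'tcp-like':
      // Ordered and reliable; this is also what you get with no options at all.
      return { ordered: true };
    case 'udp-like':
      // Unordered, and a lost message is never retransmitted.
      return { ordered: false, maxRetransmits: 0 };
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
}

console.log(dataChannelOptions('udp-like'));
```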

So this is great; using WebRTC data channels, we can send messages over SCTP configured to behave like UDP, which solves our problem perfectly. However, it turns out that WebRTC is a roaring beast that shakes the earth with the force of a thousand monsoons when you try to set it up. In fact, the last few times WebRTC has been mentioned on Hacker News in a gaming context, many commenters have noted that they either couldn't get it to work or were too intimidated by its complexity to even try [7, 8]. Furthermore, WebRTC is designed for peer-to-peer connections, whereas most competitive online games today require a client-server model to prevent cheating [9]. Of course, we have no choice but to treat the server as just another "peer", which adds an extra hoop to jump through to establish a connection.

Peer-to-Peer Challenges

Continuing with the theme of inconvenience, peer-to-peer communication on modern networks presents another challenge of its own. Ideally, each client has its own fixed IP address that other clients can use to make direct connections. However, the IPv4 address space is limited to about 4.3 billion (2^32) unique addresses, not even enough for every person in the world to have one internet-connected computer, let alone additional tablets, laptops, and IoT immersion cookers. As a temporary fix during the lull before IPv6, most home and business networks employ a process called Network Address Translation (NAT).

Without going into too much detail, a NAT device such as a home router manages the connections of all the computers on its network. For example, all the internet-connected devices in your apartment are likely behind a single router with a public-facing IP, e.g. 50.50.50.50. To save IPv4 address space, your consumer devices all share the public IP of the NAT device, while each device is assigned its own local IP, which is only unique within the local network (for example, 192.168.0.10). Of course, computers on the wider Internet cannot contact or uniquely identify your home computer using its local address; thousands, if not millions, of devices around the world all have the same local IP.

This is where Network Address Translation comes into play. While external devices cannot contact your computer directly, they can come very close by contacting the public IP of the NAT device your computer sits behind. The router can then use a lookup table to translate the incoming request to a local address, and forward the request to your home computer.

More specifically, your computer contacts the server by first sending its request to the router, which in turn associates the computer's local IP with a free port on the NAT device. It then sends the request to the intended destination, replacing the sender address with the IP of the NAT device and the port just assigned to your home computer. For example, the NAT device might forward a request to the destination server so that it appears to have originated from 50.50.50.50:20000.

However, the server doesn't care whether the request came from behind a NAT; when ready, the server will simply send its response back to whatever address was provided in the sender field. This causes the server to send its response back to the NAT device on the port uniquely associated with your home computer. The NAT device will then take the server's response and use its lookup table to route it to the correct computer. The IPv4 address space is thus conserved, and all the indirection required to do so is abstracted away from clients and servers. With NAT, everyone is happy!
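The lookup table described above can be sketched as a toy class. This is a simplified model under stated assumptions, not how any particular router works: the `ToyNat` class, the local port 54321, and the server address 203.0.113.5 are made up for illustration.

```javascript
// Toy NAT device: maps (local IP, local port) pairs to public ports and back.
class ToyNat {
  constructor(publicIp, firstPort = 20000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.outbound = new Map(); // "localIp:localPort" -> public port
    this.inbound = new Map();  // public port -> { ip, port } of the local device
  }

  // Rewrites the source of an outgoing packet to the NAT's public address.
  translateOutgoing(packet) {
    const key = `${packet.srcIp}:${packet.srcPort}`;
    if (!this.outbound.has(key)) {
      const publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, { ip: packet.srcIp, port: packet.srcPort });
    }
    return { ...packet, srcIp: this.publicIp, srcPort: this.outbound.get(key) };
  }

  // Routes a reply addressed to the NAT back to the local device, or returns
  // null when no mapping exists (an unsolicited packet from the outside).
  translateIncoming(packet) {
    const local = this.inbound.get(packet.dstPort);
    if (!local) return null;
    return { ...packet, dstIp: local.ip, dstPort: local.port };
  }
}

const nat = new ToyNat('50.50.50.50');
const request = nat.translateOutgoing({
  srcIp: '192.168.0.10', srcPort: 54321, dstIp: '203.0.113.5', dstPort: 80,
});
console.log(`${request.srcIp}:${request.srcPort}`); // 50.50.50.50:20000

const reply = nat.translateIncoming({
  srcIp: '203.0.113.5', srcPort: 80, dstIp: '50.50.50.50', dstPort: request.srcPort,
});
console.log(`${reply.dstIp}:${reply.dstPort}`); // 192.168.0.10:54321
```

Note that `translateIncoming` returns `null` for a port with no mapping. That is exactly why an outside peer cannot simply open a connection to a machine behind a NAT.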

Well, except us. In the previous examples, we assumed that the home computer already knew the public IP of the server not behind NAT. WebRTC, on the other hand, is designed for peer-to-peer connections, where both parties may be behind a NAT device and neither address is known. Therefore, WebRTC requires an intermediate discovery step called NAT traversal, which we must implement even in our client-server use case, where the address of the server is actually known in advance.

The most lightweight protocol for this step is called STUN, where a peer pings a dedicated server called a STUN server to discover its public IP address and port combination (for example, 50.50.50.50:20000). Both peers request their addresses from the STUN server, which sends back the public IP and port on which the request was received. Both peers now effectively know their own "public" IP from the STUN server's response, which they can use to start establishing a WebRTC connection.
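For a sense of how lightweight STUN is, the sketch below builds the 20-byte header of a STUN Binding request as defined in RFC 5389, using Node.js. It only constructs the bytes. Sending them over UDP and parsing the XOR-MAPPED-ADDRESS attribute in the response are left out, and the `buildBindingRequest` name is my own.

```javascript
const crypto = require('crypto');

const STUN_BINDING_REQUEST = 0x0001;
const STUN_MAGIC_COOKIE = 0x2112a442;

// Builds the fixed 20-byte STUN header for a Binding request with no attributes.
function buildBindingRequest(transactionId = crypto.randomBytes(12)) {
  const header = Buffer.alloc(20);
  header.writeUInt16BE(STUN_BINDING_REQUEST, 0); // message type
  header.writeUInt16BE(0, 2);                    // message length (no attributes)
  header.writeUInt32BE(STUN_MAGIC_COOKIE, 4);    // magic cookie
  transactionId.copy(header, 8);                 // 96-bit transaction ID
  return header;
}

const bindingRequest = buildBindingRequest();
console.log(bindingRequest.toString('hex'));
```

In the browser you never build this packet yourself; `RTCPeerConnection` does it for every STUN server listed in its configuration.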

Unfortunately, as a final complication, enterprise networks often use special types of NAT, such as symmetric NAT, for which STUN is ineffective, for reasons we'll discuss at the end of this blog post. In these rare cases we are forced to use other protocols such as TURN to establish the connection. To manage the alphabet soup of possible NAT traversal protocols, WebRTC rules them all with another protocol called ICE. ICE performs checks on the network, using STUN if available, and falling back to a more complex protocol like TURN if not. We'll continue assuming we're using a traditional home network that supports STUN.
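From the application's point of view, choosing between STUN and TURN mostly comes down to which servers you list in the peer connection configuration. The fragment below is a sketch: the TURN URL and credentials are placeholders, not a real service.

```javascript
// Hypothetical configuration: the TURN entry and its credentials are placeholders.
const iceConfig = {
  iceServers: [
    // Address discovery via STUN.
    { urls: 'stun:stun.l.google.com:19302' },
    // Relay of last resort via TURN, for networks where STUN alone is not enough.
    {
      urls: 'turn:turn.example.com:3478',
      username: 'example-user',
      credential: 'example-password',
    },
  ],
};

// In the browser, this object would be passed to the constructor:
//   const peerConnection = new RTCPeerConnection(iceConfig);
console.log(iceConfig.iceServers.map((server) => server.urls));
```

ICE gathers candidates from every server listed and prefers direct paths over relayed ones, so adding a TURN entry does not slow down clients on ordinary home networks.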

WebRTC peer-to-peer connections and data channels

With all the background information out of the way, I'll now present a high-level overview of the WebRTC data channel creation process, before jumping into the actual code required to set up your own client and server.

WebRTC provides the RTCPeerConnection interface as a starting point for creating any kind of connection, data channel or otherwise. A client can initialize an RTCPeerConnection object and start looking for other peer clients to connect to and exchange data with. Of course, at this point, clients have no direct way of knowing where other clients are. In WebRTC terminology, we solve this problem through an application-specific process called signaling, where two peers exchange handshakes over a known server and learn about each other's "public" IPs using ICE and STUN. As a real-world example, two friends on Messenger can initiate a peer-to-peer call only after exchanging publicly accessible addresses through Facebook's central servers.

After the signaling process, both clients know how to contact each other directly and have all the information needed to send arbitrary packets. However, as mentioned earlier, WebRTC is geared toward media transport, and it also requires clients to exchange data about their media capabilities before any kind of connection can be completed. Even though we're not using any part of the media API, WebRTC still requires us to do a full media handshake before opening a data channel. This handshake uses the Session Description Protocol (SDP) and looks like this:

  1. Both Client 1 and Client 2 connect to some predefined server, called a signaling server.
  2. They learn about each other's presence through the signaling server and decide to initiate a connection.
  3. Client 1 creates an "offer" using RTCPeerConnection.createOffer, which includes information about Client 1's media capabilities (for example, whether it has a webcam or can play audio).
  4. Client 1 sends the offer to Client 2, with the signaling server acting as a proxy.
  5. Client 2 receives the offer from the signaling server, stores it, and calls RTCPeerConnection.createAnswer to create an "answer" describing Client 2's own media capabilities.
  6. Client 2 sends an answer back to Client 1 via the signaling server.
  7. Client 1 receives and validates the answer. It then starts the ICE protocol which, in our example, contacts the STUN server to discover its public IP. When the STUN server responds, it sends this information (called "ICE candidates") to Client 2 via the signaling server.
  8. Client 2 receives Client 1's ICE candidates, looks up its own ICE candidates through the same mechanism, and sends them to Client 1 through the signaling server.
  9. Each client now knows the other client's media capabilities and publicly accessible IP. They exchange direct pings and establish a connection without the help of the signaling server. Both clients can now happily send messages to each other via the RTCDataChannel API.
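The message flow in the steps above can be traced with an in-memory stand-in for the signaling server. This sketch involves no real WebRTC objects: the `ToySignalingServer` class and the placeholder SDP and candidate strings are assumptions made purely to show the order of messages.

```javascript
// In-memory stand-in for a signaling server: it only relays messages.
class ToySignalingServer {
  constructor() {
    this.peers = new Map(); // peer name -> message handler
    this.log = [];
  }

  register(name, onMessage) {
    this.peers.set(name, onMessage);
  }

  send(from, to, message) {
    this.log.push(`${from} -> ${to}: ${message.type}`);
    this.peers.get(to)(from, message);
  }
}

const signaling = new ToySignalingServer();

// Client 2 answers any offer, and shares its own candidate once it receives one.
signaling.register('client2', (from, message) => {
  if (message.type === 'offer') {
    signaling.send('client2', from, { type: 'answer', sdp: '<answer SDP>' });
  } else if (message.type === 'candidate') {
    signaling.send('client2', from, { type: 'candidate', candidate: '<client2 address>' });
  }
});

// Client 1 sends its candidate once the answer arrives.
signaling.register('client1', (from, message) => {
  if (message.type === 'answer') {
    signaling.send('client1', from, { type: 'candidate', candidate: '<client1 address>' });
  }
});

signaling.send('client1', 'client2', { type: 'offer', sdp: '<offer SDP>' });
console.log(signaling.log.join('\n'));
// client1 -> client2: offer
// client2 -> client1: answer
// client1 -> client2: candidate
// client2 -> client1: candidate
```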

WebRTC in a client-server model

At the end of the day, we can think of the game client as "Client 1" and the game server as "Client 2", following the complex but well-defined WebRTC protocol to establish the client-server connection. Implementing a WebRTC connection on the client is simple; WebRTC is first and foremost a browser API, so we can call the correct functions provided by most modern browsers.

While WebRTC has pretty decent browser support, using the WebRTC API on a server is a whole different story. As a matter of personal preference, I originally wrote my game server in JavaScript with Node.js. I started with node-webrtc, which is a JavaScript wrapper for the Chromium WebRTC library. However, I quickly discovered that it depends on very old WebRTC binaries that use an obsolete SDP handshake that is incompatible with modern Chrome [10]. I then turned to electron-webrtc, which simply runs a headless Electron client in the background, providing WebRTC functionality via inter-process communication. I was able to get a basic connection without trouble, but I was concerned about scalability due to the extra overhead of shuffling data between the main process and a full-fledged Electron app.

At the end of the day, I realized I wasn't that comfortable reasoning about JavaScript performance, and my game server needed a platform with strong multi-threading support. I decided to cut all the excess, go the traditional route, and build my game server in C++. For WebRTC functionality, I can link against Chromium's WebRTC library, which is also written in native code.

So now our clients are running JavaScript in the browser and our server is running C++, but we are still missing one piece of the puzzle: the signaling server connecting the two peers. Luckily, we can cut corners here, since the game server is a special peer whose direct address we actually know in advance. We can simply run a lightweight WebSockets library in the background of the game server and connect to it via TCP from the client. The client can then send a WebRTC offer over the WebSocket, and the server can process the data locally instead of forwarding it as a traditional signaling server would.

Implementation

We've covered a lot of information, so let's finally put it together in a minimal example of a client-server WebRTC connection. For reference, my client is running on OS X with Chrome 56, and my server is running Ubuntu 16.04 on a c4.xlarge EC2 instance (overkill for a development server, but hey, my credits are about to expire). I am compiling my server with gcc 5.4. Full source code for both client and server can be found on my GitHub, which should help you follow along.

The first thing we need to do is set up the server dependencies. If you're not very comfortable with C++ build tools, you can clone the full repository I linked above and use that as a starting point. We'll be using WebSocket++, a header-only C++ WebSockets implementation, for the pseudo-signaling server. WebSocket++ itself depends on Boost.Asio for asynchronous programming, which we can easily install with apt-get install libboost-all-dev. Since WebSocket++ is a header-only library, we can simply clone the repository and copy the websocketpp subdirectory into our include path.

We also need a format for sending structured messages between client and server. In production, I would use a compact and performant serialization solution such as Protocol Buffers, but for the purposes of this demo we'll just use JSON since it has first-class support in JavaScript. On the server side, I use rapidjson to parse and serialize the data. Like WebSocket++, this is a header-only library, so all you need to do is clone the repository and copy the include/rapidjson subdirectory to your include path.

Next, we must build and install Chromium's WebRTC library. This is the library used in Chrome for WebRTC functionality, so it can be expected to be correct and efficient. I originally built it from scratch, but it was a pain because you need to clone the repository, build it with Chromium-specific build tools, and put the output in a shared library folder. I recently found a nice collection of scripts that do the heavy lifting for you, and I highly recommend using them to preserve your sanity.

Even with this handy utility, I still had issues when the latest commits on Chromium master failed to build on my machine. I had to go back several commits before finding a green build. I settled on commit 3dda246b69, so if you have trouble building WebRTC, I recommend starting with the same commit hash. If you're using the aisouard libwebrtc scripts linked above, unfortunately the way of specifying the WebRTC commit to build has changed since I first started using them. I've therefore pinned my server setup process to commit 83814ef6f3 of the libwebrtc scripts, so please check out that revision if you want to follow along. To sum up, you can install WebRTC with just a few commands:

apt-get install build-essential libglib2.0-dev libgtk2.0-dev libxtst-dev \
  libxss-dev libpci-dev libdbus-1-dev libgconf2-dev \
  libgnome-keyring-dev libnss3-dev libasound2-dev libpulse-dev \
  libudev-dev cmake
git clone https://github.com/aisouard/libwebrtc.git
cd libwebrtc
# Optional: pin the scripts and the WebRTC revision to the versions used in this post.
# git checkout 83814ef6f3
# vim CMakeModules/Version.cmake
#   (change the LIBWEBRTC_WEBRTC_REVISION hash to 3dda246b69df7ff489660e0aee0378210104240b)
git submodule init
git submodule update
mkdir out
cd out
cmake ..
make
make install

We now have all our server dependencies, so let's start a basic WebSockets connection. Here's the full code to get it going:

[main.cpp]
#include <websocketpp/config/asio_no_tls.hpp>
#include <websocketpp/server.hpp>

#include <iostream>

using websocketpp::lib::placeholders::_1;
using websocketpp::lib::placeholders::_2;
using websocketpp::lib::bind;

typedef websocketpp::server<websocketpp::config::asio> WebSocketServer;
typedef WebSocketServer::message_ptr message_ptr;

// The WebSocket server being used to handshake with the clients.
WebSocketServer server;

// Callback for when the WebSocket server receives a message from the client.
void OnWebSocketMessage(WebSocketServer* s, websocketpp::connection_hdl hdl, message_ptr msg) {
  std::cout << msg->get_payload() << std::endl;
}

int main() {
  // In a real game server, you would run the WebSocket server as a separate thread so your main process can handle the game loop.
  server.set_message_handler(bind(OnWebSocketMessage, &server, ::_1, ::_2));
  server.init_asio();
  server.set_reuse_addr(true);
  server.listen(8080);
  server.start_accept();
  // I don't do it here, but you should gracefully handle closing the connection.
  server.run();
}

This code shouldn't look too complicated; we just create an asio-backed WebSocketServer object, set up a message handler, and call some configuration methods. As mentioned in the comments, server.run() blocks the main thread in the WebSocket listening loop, preventing it from doing anything else. In a real project, the WebSocket server should be run in a separate thread. You can verify that the WebSocket server is actually running by calling telnet <server IP> 8080 from your personal computer.

The corresponding client code to communicate with the WebSocket on the server is equally simple.

[example-client.js]
// URL to the server with the port we are using for WebSockets.
const webSocketUrl = 'ws://<replace with server address>:8080';
// The WebSocket object used to manage a connection.
let webSocketConnection = null;

// Callback for when the WebSocket is successfully opened.
function onWebSocketOpen() {
  console.log('Opened!');
  webSocketConnection.send('Hello, world!');
}

// Callback for when we receive a message from the server via the WebSocket.
function onWebSocketMessage(event) {
  console.log(event.data);
}

// Connects by creating a new WebSocket connection and associating some callbacks.
function connect() {
  webSocketConnection = new WebSocket(webSocketUrl);
  webSocketConnection.onopen = onWebSocketOpen;
  webSocketConnection.onmessage = onWebSocketMessage;
}

While simple, this demonstrates all the functionality we need: create a new WebSocket on the client, assign some callbacks, and send a message. If you call connect, you should see "Opened!" printed on the browser console and "Hello, world!" printed on the server's stdout.

We can now instantiate an RTCPeerConnection and an RTCDataChannel, which are part of the browser API. The RTCPeerConnection is then used to create an SDP offer, which is sent to the server over our WebSockets connection.

[example-client.js]
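// The peer connection and data channel, shared with the callbacks defined later.
let rtcPeerConnection = null;
let dataChannel = null;

function onWebSocketOpen() {
  // `urls` is the standard key for an ICE server entry; the older `url` is deprecated.
  const config = { iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] };
  rtcPeerConnection = new RTCPeerConnection(config);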
  const dataChannelConfig = { ordered: false, maxRetransmits: 0 };
  dataChannel = rtcPeerConnection.createDataChannel('dc', dataChannelConfig);
  dataChannel.onmessage = onDataChannelMessage;
  dataChannel.onopen = onDataChannelOpen;
  const sdpConstraints = {
    mandatory: {
      OfferToReceiveAudio: false,
      OfferToReceiveVideo: false,
    },
  };
  rtcPeerConnection.onicecandidate = onIceCandidate;
  rtcPeerConnection.createOffer(onOfferCreated, () => {}, sdpConstraints);
}

We create an RTCPeerConnection with a URL pointing to a STUN server. stun:stun.l.google.com:19302 is a public STUN server maintained by Google for development use, so it is recommended to set up your own STUN server for production. Next, we create a data channel associated with the RTCPeerConnection, specifying the use of unordered, unreliable SCTP in the configuration object. We bind some callbacks that we will return to later and try to create the SDP offer. The first parameter of createOffer is a callback for creation success, the second is a callback for creation failure, and the last is a self-explanatory configuration object. The actual offer will be passed to the success callback.
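The callback form of createOffer shown here is the legacy style; current browsers also expose a promise-based form. As a sketch of how the same step might look with promises, the hypothetical `createAndSendOffer` helper below takes the peer connection and socket as arguments, so it can be exercised with stand-in objects outside a browser. The fake objects are assumptions for the demo only.

```javascript
// Creates an offer on `peerConnection`, stores it as the local description,
// and sends it through `socket` in the same { type, payload } envelope.
async function createAndSendOffer(peerConnection, socket) {
  const offer = await peerConnection.createOffer();
  await peerConnection.setLocalDescription(offer);
  socket.send(JSON.stringify({ type: 'offer', payload: offer }));
  return offer;
}

// Stand-ins so the flow can run without a browser. In a page you would call
// createAndSendOffer(rtcPeerConnection, webSocketConnection) instead.
const fakePeerConnection = {
  localDescription: null,
  async createOffer() {
    return { type: 'offer', sdp: '<offer SDP>' };
  },
  async setLocalDescription(description) {
    this.localDescription = description;
  },
};
const sentMessages = [];
const fakeSocket = { send: (data) => sentMessages.push(data) };

createAndSendOffer(fakePeerConnection, fakeSocket)
  .then(() => console.log(sentMessages[0]));
```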

[example-client.js]
function onOfferCreated(description) {
  rtcPeerConnection.setLocalDescription(description);
  webSocketConnection.send(JSON.stringify({type: 'offer', payload: description}));
}

In the offer callback, we store the client's own media capabilities by calling setLocalDescription, and then send our offer as stringified JSON over the WebSocket. On the server side, we can handle this request by parsing the JSON.

[main.cpp]
#include <rapidjson/document.h>

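void OnWebSocketMessage(WebSocketServer* s, websocketpp::connection_hdl hdl, message_ptr msg) {
  rapidjson::Document message_object;
  message_object.Parse(msg->get_payload().c_str());
  // Minimal validation; a real server should check every field it reads.
  if (message_object.HasParseError() || !message_object.IsObject() ||
      !message_object.HasMember("type") || !message_object["type"].IsString()) {
    std::cout << "Malformed WebSocket message." << std::endl;
    return;
  }
  std::string type = message_object["type"].GetString();
  if (type == "offer") {
    std::string sdp = message_object["payload"]["sdp"].GetString();
    // Do some stuff with the offer.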
  } else {
    std::cout << "Unrecognized WebSocket message type." << std::endl;
  }
}
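The same kind of validation applies to any endpoint that reads these messages. Here is a sketch of the checks in JavaScript; the `parseSignalingMessage` helper and the set of accepted types are my own choices for illustration, based on the message types used in this post.

```javascript
const KNOWN_TYPES = new Set(['offer', 'answer', 'candidate']);

// Parses a raw WebSocket payload into { type, payload }, or throws on bad input.
function parseSignalingMessage(raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch (err) {
    throw new Error('Message is not valid JSON');
  }
  if (message === null || typeof message !== 'object') {
    throw new Error('Message must be a JSON object');
  }
  if (!KNOWN_TYPES.has(message.type)) {
    throw new Error(`Unrecognized message type: ${message.type}`);
  }
  if (message.payload === null || typeof message.payload !== 'object') {
    throw new Error('Message payload must be an object');
  }
  return { type: message.type, payload: message.payload };
}

const rawOffer = JSON.stringify({ type: 'offer', payload: { type: 'offer', sdp: '<offer SDP>' } });
console.log(parseSignalingMessage(rawOffer).type); // offer
```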

At this point we want to create an RTCPeerConnection and an RTCDataChannel on the server so we can process the client's offer and generate an answer. Unfortunately, with C++ comes a fair amount of boilerplate code to accomplish the same task that took 15 lines of JavaScript. The main difference is that the WebRTC library uses the observer pattern to handle WebRTC events, rather than convenient JS callbacks such as onmessage and onOfferCreated. In order to run a peer connection, we have to implement all 19 possible events by overriding a series of abstract webrtc::*Observer classes.

  • webrtc::PeerConnectionObserver is used for peer connection events such as receiving ICE candidates.
  • webrtc::CreateSessionDescriptionObserver is used to create an offer or answer.
  • webrtc::SetSessionDescriptionObserver is used to confirm and store the offer or answer.
  • webrtc::DataChannelObserver is used to receive data channel events such as SCTP messages.

I've provided observers.h, which implements no-ops for most of these event methods to simplify your development. In practice, we only care about a few of these events. For events that we do need to act on, we provide callback functions that we will define later in main.cpp.

[main.cpp]
#include "observers.h"

void OnDataChannelCreated(webrtc::DataChannelInterface* channel);
void OnIceCandidate(const webrtc::IceCandidateInterface* candidate);
void OnDataChannelMessage(const webrtc::DataBuffer& buffer);
void OnAnswerCreated(webrtc::SessionDescriptionInterface* desc);

PeerConnectionObserver peer_connection_observer(OnDataChannelCreated, OnIceCandidate);
DataChannelObserver data_channel_observer(OnDataChannelMessage);
CreateSessionDescriptionObserver create_session_description_observer(OnAnswerCreated);
SetSessionDescriptionObserver set_session_description_observer;

We now need to understand WebRTC's threading model. In a nutshell, WebRTC requires two threads to run: a signaling thread and a worker thread. The signaling thread handles the bulk of the WebRTC computation; it creates all the basic components and fires events that we can consume through the observer methods defined in observers.h. The worker thread, on the other hand, is delegated resource-intensive tasks (such as media streaming) to ensure that the signaling thread is not blocked. If we use a PeerConnectionFactory, WebRTC will automatically create the two threads for us.

[main.cpp]
#include <webrtc/api/peerconnectioninterface.h>
#include <webrtc/base/physicalsocketserver.h>
#include <webrtc/base/ssladapter.h>
#include <webrtc/base/thread.h>

#include <thread>

rtc::scoped_refptr<webrtc::PeerConnectionFactoryInterface> peer_connection_factory;
rtc::PhysicalSocketServer socket_server;
std::thread webrtc_thread;

void SignalThreadEntry() {
  // Create the PeerConnectionFactory.
  rtc::InitializeSSL();
  peer_connection_factory = webrtc::CreatePeerConnectionFactory();
  rtc::Thread* signaling_thread = rtc::Thread::Current();
  signaling_thread->set_socketserver(&socket_server);
  signaling_thread->Run();
  signaling_thread->set_socketserver(nullptr);
}

int main() {
  webrtc_thread = std::thread(SignalThreadEntry);
  // ... set up the WebSocket server.
}

CreatePeerConnectionFactory sets the current thread as the signaling thread and creates a worker thread in the background. Since we're using the main thread for the WebSocket listen loop, we need to create a new thread, webrtc_thread, so that WebRTC and WebSocket can co-exist.

In the WebRTC thread entry function, we instantiate a PeerConnectionFactory, which designates the thread as the signaling thread. After doing some setup, like providing a socket server to communicate with the worker thread, we can finally use the factory to generate an RTCPeerConnection and respond to the SDP offer.

[main.cpp]
rtc::scoped_refptr<webrtc::PeerConnectionInterface> peer_connection;
rtc::scoped_refptr<webrtc::DataChannelInterface> data_channel;

void OnWebSocketMessage(...) {
  // ... parse the JSON.
  if (type == "offer") {
    std::string sdp = message_object["payload"]["sdp"].GetString();
    webrtc::PeerConnectionInterface::RTCConfiguration configuration;
    webrtc::PeerConnectionInterface::IceServer ice_server;
    ice_server.uri = "stun:stun.l.google.com:19302";
    configuration.servers.push_back(ice_server);

    // Create the RTCPeerConnection with an observer.
    peer_connection = peer_connection_factory->CreatePeerConnection(configuration, nullptr, nullptr, &peer_connection_observer);
    webrtc::DataChannelInit data_channel_config;
    data_channel_config.ordered = false;
    data_channel_config.maxRetransmits = 0;
    // Create the RTCDataChannel with an observer.
    data_channel = peer_connection->CreateDataChannel("dc", &data_channel_config);
    data_channel->RegisterObserver(&data_channel_observer);

    webrtc::SdpParseError error;
    webrtc::SessionDescriptionInterface* session_description(webrtc::CreateSessionDescription("offer", sdp, &error));
    // Store the client's SDP offer.
    peer_connection->SetRemoteDescription(&set_session_description_observer, session_description);
    // Creates an answer to send back.
    peer_connection->CreateAnswer(&create_session_description_observer, nullptr);
  }
  // ... handle other cases.
}

While this looks complicated, it's essentially the same as the JavaScript code we wrote for our client. First, we create an RTCPeerConnection configured with Google's development STUN server and use it to create a data channel over unordered, unreliable SCTP. Finally, we use SetRemoteDescription to store the client's offer and CreateAnswer to create an answer to send back to the client. This will in turn call the OnSuccess method of our CreateSessionDescriptionObserver, which calls our OnAnswerCreated callback, to which we can add code to send the answer to the client.

[main.cpp]
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>

void OnAnswerCreated(webrtc::SessionDescriptionInterface* desc) {
  peer_connection->SetLocalDescription(&set_session_description_observer, desc);
  // Serialize the answer's SDP so it can be embedded in the JSON message.
  std::string answer_string;
  desc->ToString(&answer_string);
  rapidjson::Document message_object;
  message_object.SetObject();
  message_object.AddMember("type", "answer", message_object.GetAllocator());
  rapidjson::Value sdp_value;
  sdp_value.SetString(rapidjson::StringRef(answer_string.c_str()));
  rapidjson::Value message_payload;
  message_payload.SetObject();
  message_payload.AddMember("type", "answer", message_object.GetAllocator());
  message_payload.AddMember("sdp", sdp_value, message_object.GetAllocator());
  message_object.AddMember("payload", message_payload, message_object.GetAllocator());
  rapidjson::StringBuffer strbuf;
  rapidjson::Writer<rapidjson::StringBuffer> writer(strbuf);
  message_object.Accept(writer);
  std::string payload = strbuf.GetString();
  ws_server.send(websocket_connection_handler, payload, websocketpp::frame::opcode::value::text);
}

We use SetLocalDescription to store the server's own answer (passed in as a parameter). Here we run into rapidjson's abysmal code ergonomics, but hopefully it's obvious that all we're doing is building a simple JSON blob field by field. Once we've constructed the message, we stringify it and send the answer back to the client.

[example-client.js]
function onWebSocketMessage(event) {
  const messageObject = JSON.parse(event.data);
  if (messageObject.type === 'answer') {
    rtcPeerConnection.setRemoteDescription(new RTCSessionDescription(messageObject.payload));
  } else {
    console.log('Unrecognized WebSocket message type.');
  }
}

We process messages by parsing them to get their type and payload. The client proceeds to store the server's SDP answer by calling setRemoteDescription with the message payload.

Now that the client and server have exchanged their media capabilities as required by WebRTC, all that remains is to exchange their publicly accessible addresses in the form of ICE candidates. On the client side, RTCPeerConnection manages most of this for us; it executes the ICE protocol using the provided STUN server and passes every ICE candidate it finds to the rtcPeerConnection.onicecandidate callback. Then, all we have to do is send the ICE candidate to the server in the onicecandidate function we assigned earlier.

[example-client.js]
function onIceCandidate(event) {
  if (event && event.candidate) {
    webSocketConnection.send(JSON.stringify({type: 'candidate', payload: event.candidate}));
  }
}
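To see what actually travels in these messages, the sketch below pulls apart the leading fields of a candidate string (the `candidate` property of the event's RTCIceCandidate). The `parseIceCandidate` helper is hypothetical, and the example string is made up to match the NAT example from earlier rather than captured from a real session.

```javascript
// Parses the leading fields of an ICE candidate string. Extension attributes
// after the candidate type (such as raddr/rport) are ignored.
function parseIceCandidate(candidateString) {
  const fields = candidateString.replace(/^candidate:/, '').trim().split(/\s+/);
  const [foundation, component, protocol, priority, address, port, typKeyword, type] = fields;
  if (fields.length < 8 || typKeyword !== 'typ') {
    throw new Error('Not a valid ICE candidate string');
  }
  return {
    foundation,
    component: Number(component),
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type, // "host", "srflx" (discovered via STUN), "prflx", or "relay" (TURN)
  };
}

// Made-up candidate: a public address learned via STUN for a local machine.
const exampleCandidate =
  'candidate:842163049 1 udp 1677729535 50.50.50.50 20000 typ srflx raddr 192.168.0.10 rport 54321';
console.log(parseIceCandidate(exampleCandidate));
```

A candidate of type `srflx` ("server reflexive") is the public address and port that the STUN server reported back, which is what the other peer needs in order to reach a machine behind a NAT.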

We can process this message on the server in OnWebSocketMessage.

[main.cpp]
void OnWebSocketMessage(...) {
  // ... Parse JSON and handle an offer message.
  } else if (type == "candidate") {
    std::string candidate = message_object["payload"]["candidate"].GetString();
    int sdp_mline_index = message_object["payload"]["sdpMLineIndex"].GetInt();
    std::string sdp_mid = message_object["payload"]["sdpMid"].GetString();
    webrtc::SdpParseError error;
    auto candidate_object = webrtc::CreateIceCandidate(sdp_mid, sdp_mline_index, candidate, &error);
    // Skip candidates whose string could not be parsed (null result).
    if (candidate_object) {
      peer_connection->AddIceCandidate(candidate_object);
    }
  } else {
  // ... Handle unrecognized type.
}

The server parses the JSON blob's fields into an appropriate WebRTC ICE candidate object, and saves that object through AddIceCandidate.
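
To make the message format concrete, here is a minimal sketch of the candidate message as it crosses the WebSocket, with a check for the three fields the server reads. The helper names buildCandidateMessage and parseCandidateMessage are hypothetical, and the sample candidate values are made up for illustration.

```javascript
// Wrap an ICE candidate in the signaling envelope used in this post.
function buildCandidateMessage(candidate) {
  return JSON.stringify({ type: 'candidate', payload: candidate });
}

// Parse a signaling message and validate the fields the server code reads:
// candidate, sdpMid and sdpMLineIndex.
function parseCandidateMessage(text) {
  const message = JSON.parse(text);
  if (message.type !== 'candidate') {
    throw new Error('Not a candidate message: ' + message.type);
  }
  const { candidate, sdpMid, sdpMLineIndex } = message.payload || {};
  if (typeof candidate !== 'string' || typeof sdpMid !== 'string' ||
      !Number.isInteger(sdpMLineIndex)) {
    throw new Error('Malformed candidate payload');
  }
  return { candidate, sdpMid, sdpMLineIndex };
}

// Made-up sample values; 192.0.2.10 is a documentation-only address.
const sampleCandidate = {
  candidate: 'candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host',
  sdpMid: 'data',
  sdpMLineIndex: 0,
};
const wireText = buildCandidateMessage(sampleCandidate);
const parsedCandidate = parseCandidateMessage(wireText);
```

Validating the payload before touching it is a design choice of this sketch; the server excerpt above reads the fields directly.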

The server's own ICE candidates are similarly generated by the peer connection, but this time they are passed via OnIceCandidate in the PeerConnectionObserver. We provide our own callback for this function, where we can forward candidates to the client.

[main.cpp]
void OnIceCandidate(const webrtc::IceCandidateInterface* candidate) {
  std::string candidate_str;
  candidate->ToString(&candidate_str);
  rapidjson::Document message_object;
  message_object.SetObject();
  message_object.AddMember("type", "candidate", message_object.GetAllocator());
  rapidjson::Value candidate_value;
  candidate_value.SetString(rapidjson::StringRef(candidate_str.c_str()));
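  // Keep sdp_mid in a local so the StringRef below does not point at a temporary.
  std::string sdp_mid = candidate->sdp_mid();
  rapidjson::Value sdp_mid_value;
  sdp_mid_value.SetString(rapidjson::StringRef(sdp_mid.c_str()));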
  rapidjson::Value message_payload;
  message_payload.SetObject();
  message_payload.AddMember("candidate", candidate_value, message_object.GetAllocator());
  message_payload.AddMember("sdpMid", sdp_mid_value, message_object.GetAllocator());
  message_payload.AddMember("sdpMLineIndex", candidate->sdp_mline_index(),
      message_object.GetAllocator());
  message_object.AddMember("payload", message_payload, message_object.GetAllocator());
  rapidjson::StringBuffer strbuf;
  rapidjson::Writer<rapidjson::StringBuffer> writer(strbuf);
  message_object.Accept(writer);
  std::string payload = strbuf.GetString();
  ws_server.send(websocket_connection_handler, payload, websocketpp::frame::opcode::value::text);
}

Again, the code is overly verbose because of rapidjson, but it is straightforward and mirrors the client's onIceCandidate callback. The server takes the provided ICE candidate, parses its fields into a JSON object, and sends it over the WebSocket.

The client receives the server's ICE candidates in onWebSocketMessage, where it also calls addIceCandidate.

[example-client.js]
function onWebSocketMessage(event) {
  // ... Parse string and handle answer.
  } else if (messageObject.type === 'candidate') {
    rtcPeerConnection.addIceCandidate(new RTCIceCandidate(messageObject.payload));
  } else {
  // ... Handle unrecognized type.
}
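
Putting the two client-side fragments together, the complete handler could look roughly like the sketch below. It is not the post's exact code: createSignalingHandler is a hypothetical helper, and it passes the plain payload objects straight to setRemoteDescription and addIceCandidate (which current browsers accept) instead of wrapping them in RTCSessionDescription and RTCIceCandidate. Taking the peer connection as a parameter avoids the global and makes the handler easy to exercise with a stub.

```javascript
// Build the WebSocket message handler for a given peer connection.
// `peerConnection` must expose setRemoteDescription and addIceCandidate,
// as RTCPeerConnection does in the browser.
function createSignalingHandler(peerConnection, log = console.log) {
  const handlers = new Map([
    ['answer', (payload) => peerConnection.setRemoteDescription(payload)],
    ['candidate', (payload) => peerConnection.addIceCandidate(payload)],
  ]);

  // Resolves to true when the message was applied, false otherwise.
  return async function onWebSocketMessage(event) {
    const messageObject = JSON.parse(event.data);
    const handler = handlers.get(messageObject.type);
    if (!handler) {
      log('Unrecognized WebSocket message type.');
      return false;
    }
    try {
      // Both WebRTC calls return promises; report failures instead of dropping them.
      await handler(messageObject.payload);
      return true;
    } catch (error) {
      log('Failed to apply ' + messageObject.type + ' message: ' + error);
      return false;
    }
  };
}
```

In the browser this would be wired up with something like webSocketConnection.onmessage = createSignalingHandler(rtcPeerConnection).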

If you did everything correctly, calling connect on the client should now start and (hopefully) complete the handshake with the server. We can verify this with the onDataChannelOpen callback we assigned earlier to dataChannel.onopen.

[example-client.js]
function onDataChannelOpen() {
  console.log('Data channel opened!');
}

If the handshake is successful, onDataChannelOpen should fire and print a congratulatory message to the console! We can then use this newly opened data channel to ping the server.

[example-client.js]
function ping() {
  dataChannel.send('ping');
}

The server also receives an event when the data channel is successfully opened. This is fired via the OnDataChannelCreated callback of the PeerConnectionObserver. Unlike the client, however, the server has one extra step to do. When the data channel is opened, the WebRTC library creates a new data channel containing the updated fields, which is passed as an argument to OnDataChannelCreated. This step is abstracted away in the client code, but it's not terribly difficult to reassign the new data channel and rebind the DataChannelObserver on the server.

[main.cpp]
void OnDataChannelCreated(webrtc::DataChannelInterface* channel) {
  data_channel = channel;
  data_channel->RegisterObserver(&data_channel_observer);
}

Since the DataChannelObserver is now rebound to the correct data channel, the server can start receiving messages through its OnDataChannelMessage callback.

[main.cpp]
void OnDataChannelMessage(const webrtc::DataBuffer& buffer) {
  std::string data(buffer.data.data<char>(), buffer.data.size());
  std::cout << data << std::endl;
  std::string str = "pong";
  webrtc::DataBuffer resp(rtc::CopyOnWriteBuffer(str.c_str(), str.length()), false /* binary array */);
  data_channel->Send(resp);
}

This prints each received ping (wrapped in a WebRTC DataBuffer) to stdout and responds with a pong. The client can handle the pong through the onDataChannelMessage callback we assigned to dataChannel.onmessage.

[example-client.js]
function onDataChannelMessage(event) {
  console.log(event.data);
}

Finally, we're done! If everything is implemented correctly, we can reap the fruits of our labor by calling ping, which sends a "ping" message to the server. The server processes the client's message, printing "ping" to standard output and sending back a "pong" message. After receiving the server's message, the client outputs "pong" to the browser console.

Benchmarks

Phew, that's a lot of concepts and code just to make a simple connection. Setting up a similar connection with WebSockets requires only about 10 lines of client code and 20 lines of server code. Given this difference in upfront cost, is WebRTC and its associated boilerplate worth it? I ran some benchmarks to find out.

In the first test, I sent pings from the client to the server 20 times per second for 20 seconds and measured the round trip time. I did this on a "perfect connection" with no packet loss, for both WebRTC data channels (SCTP) and plain WebSocket connections (TCP).
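
The benchmark harness itself is not shown in this post. As a rough sketch of how such a measurement could be structured, the tracker below stamps each ping with an id and records the round trip time when the matching pong comes back. The names createRttTracker and percentile are hypothetical, and the sketch assumes the server echoes the ping's id in its pong, which the sample server above does not do.

```javascript
// Tracks outstanding pings and records a round trip time when the pong arrives.
// `now` is injectable so the tracker can be driven by a fake clock.
function createRttTracker(now = () => Date.now()) {
  const sentAt = new Map(); // ping id -> send timestamp in ms
  const samples = [];       // completed round trip times in ms
  let nextId = 0;
  return {
    // Call when sending; returns the id to embed in the ping payload.
    markSent() {
      const id = nextId++;
      sentAt.set(id, now());
      return id;
    },
    // Call when the matching pong arrives; unknown ids are ignored.
    markReceived(id) {
      if (!sentAt.has(id)) return;
      samples.push(now() - sentAt.get(id));
      sentAt.delete(id);
    },
    // Pings that have not received a pong yet.
    outstandingCount() {
      return sentAt.size;
    },
    // Copy of the recorded round trip times.
    samples() {
      return samples.slice();
    },
  };
}

// Nearest-rank percentile over a list of RTT samples, with p in (0, 100].
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}
```

Matching by id matters on an unreliable channel: with dropped or reordered packets, pairing the n-th pong with the n-th ping would attribute round trip times to the wrong pings.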

As expected, both WebRTC and WebSockets perform acceptably without packet loss, with WebRTC RTTs clustering around 40-50ms and WebSocket RTTs averaging around 80-90ms. There is definitely some overhead in the TCP protocol, but for most games, an extra 50ms or so won't make or break the player experience.

In the second test, I sent pings at the same rate for the same duration, but I also used a traffic shaper to drop 5% of outgoing packets and 5% of incoming packets. Again, I tested both WebRTC and WebSockets.

 

Admittedly, a 5% drop rate is a bit exaggerated, but regardless, the results are staggering. Since the WebRTC data channel runs over SCTP in unreliable mode, the distribution of RTTs is completely unaffected. We dropped about 40 packets, but in a game environment where the server sends state 20 times per second, this is not a problem. The WebSocket implementation, on the other hand, has a long tail, with some packets not arriving for more than 900ms. To make matters worse, a significant portion of packets have an RTT of over 250ms, which results in an extremely annoying experience, as any gamer can attest.
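
The figure of about 40 dropped packets can be sanity-checked with a little arithmetic. The sketch below assumes the two drop rates are independent and that the test sent about 400 pings in total (20 per second for 20 seconds, which is my reading of the test description). A ping/pong round trip survives only if both directions get through.

```javascript
// Probability that a ping/pong round trip is lost when each direction
// independently drops packets with the given probabilities.
function roundTripLossProbability(outgoingLoss, incomingLoss) {
  return 1 - (1 - outgoingLoss) * (1 - incomingLoss);
}

// Assumed total: 20 pings per second for 20 seconds.
const totalPings = 20 * 20;
const lossProbability = roundTripLossProbability(0.05, 0.05);
const expectedLostRoundTrips = totalPings * lossProbability;

console.log(expectedLostRoundTrips.toFixed(1));
```

Under those assumptions the loss probability per round trip is 1 - 0.95 * 0.95 = 0.0975, which gives roughly 39 lost round trips out of 400, in line with the number quoted above.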

Conclusion

Although it took a lot of persistence, we were eventually able to shoehorn WebRTC into a client-server architecture. We implemented an example data channel connection that performs much better than WebSockets, both on perfect connections and under packet loss. However, the sample code is largely illustrative and contains a lot of sub-optimal patterns. In addition to the global variables littering the code, the server is significantly inefficient in that it processes data channel messages immediately in the OnDataChannelMessage callback. In our example, the cost of doing this is negligible, but in a real game server, the message handler would be a more expensive function that must interact with game state. The handler would then block the signaling thread from processing any other messages on the wire for the duration of its execution. To avoid this, I recommend pushing all messages onto a thread-safe message queue; on the next game tick, the main game loop, running in a different thread, can then process the network messages in a batch. In my own game server, I use Facebook's lock-free queue for this. For ideas on how to better organize your game around WebRTC, feel free to check out my server code and client code.
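
The buffer-then-drain shape described above can be sketched as follows. This is written in JavaScript for consistency with the client code and only illustrates the structure: JavaScript runs on a single thread, so a plain array stands in for the thread-safe queue that the C++ server would need. MessageInbox and runTick are hypothetical names, not part of my server code.

```javascript
// Minimal inbox: network callbacks enqueue, the game loop drains once per tick.
class MessageInbox {
  constructor() {
    this.pending = [];
  }

  // Called from the network callback; does no game work, just stores the message.
  push(message) {
    this.pending.push(message);
  }

  // Called once per tick; returns everything received since the previous drain.
  drain() {
    const batch = this.pending;
    this.pending = [];
    return batch;
  }
}

// Hypothetical game tick: handle all queued messages, then advance the simulation.
function runTick(inbox, state) {
  for (const message of inbox.drain()) {
    state.processed.push(message);
  }
  state.tick += 1;
  return state;
}
```

The network callback then shrinks to a single inbox.push(data) call, so its cost stays constant no matter how expensive the game logic for a message becomes.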

There are a few other caveats worth mentioning about WebRTC. First, WebRTC isn't even supported in every major browser yet [11]. While Firefox and Chrome have long been on the list of supported browsers, Safari and Edge are notably absent. I'd be happy to only support modern Firefox and Chrome in my games, but depending on your target audience, it might make more sense to just distribute a native client.

Also, I mentioned earlier that corporate networks behind symmetric NAT devices cannot use STUN. This is because a symmetric NAT provides additional security by associating a local IP not only with a port, but also with a destination. The NAT device will then only accept connections on the associated port from the original destination server. This means that while the STUN server can still discover the client's NAT IP, that address is useless to other peers, since only the STUN server can reply through it.
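
A toy model may make the difference easier to see. The sketch below is a deliberately simplified simulation, not how any real NAT device is implemented: ToyNat is a hypothetical class, all addresses are made-up documentation addresses, and the non-symmetric case behaves like the most permissive kind of NAT, which accepts inbound packets from any source.

```javascript
// Toy NAT model. Endpoints are 'ip:port' strings.
class ToyNat {
  constructor(publicIp, symmetric) {
    this.publicIp = publicIp;
    this.symmetric = symmetric;
    this.nextPort = 40000;
    this.mappings = new Map(); // mapping key -> { publicPort, local, destination }
  }

  // Outbound packet from `local` to `destination`; returns the public endpoint used.
  send(local, destination) {
    // A symmetric NAT keys the mapping on the destination as well as the local endpoint.
    const key = this.symmetric ? local + '>' + destination : local;
    if (!this.mappings.has(key)) {
      this.mappings.set(key, { publicPort: this.nextPort++, local, destination });
    }
    return this.publicIp + ':' + this.mappings.get(key).publicPort;
  }

  // Inbound packet from `source` to `publicEndpoint`.
  // Returns the local endpoint it is forwarded to, or null if it is dropped.
  receive(source, publicEndpoint) {
    for (const mapping of this.mappings.values()) {
      if (this.publicIp + ':' + mapping.publicPort !== publicEndpoint) continue;
      // A symmetric NAT only accepts packets from the original destination.
      if (this.symmetric && mapping.destination !== source) return null;
      return mapping.local;
    }
    return null;
  }
}

const clientEndpoint = '10.0.0.5:5000';
const stunEndpoint = '198.51.100.1:3478';
const gameEndpoint = '203.0.113.7:9000';

// Permissive NAT: the address learned via STUN also works for the game server.
const permissiveNat = new ToyNat('192.0.2.1', false);
const permissiveAddress = permissiveNat.send(clientEndpoint, stunEndpoint);
const permissiveResult = permissiveNat.receive(gameEndpoint, permissiveAddress);

// Symmetric NAT: the same address only accepts packets from the STUN server.
const symmetricNat = new ToyNat('192.0.2.1', true);
const symmetricAddress = symmetricNat.send(clientEndpoint, stunEndpoint);
const symmetricFromGame = symmetricNat.receive(gameEndpoint, symmetricAddress);
const symmetricFromStun = symmetricNat.receive(stunEndpoint, symmetricAddress);
```

In this model permissiveResult is the client's local endpoint, while symmetricFromGame is null: the game server's packet is dropped even though it targets the address the STUN server reported.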

To solve this, we can use a different protocol called TURN, which simply acts as a relay server, forwarding packets between the two peers. However, this approach is suboptimal as it increases the round-trip time between peers due to indirection. An interesting approach that I think has not been explored is to combine a TURN server with a game server, but run a custom TURN implementation that pushes received packets directly to the game loop's message queue. This solves the symmetric NAT problem even more efficiently than the method described in this blog post. I will most likely experiment with it after I further flesh out my game. Stay tuned!

Despite these setbacks, WebRTC data channels are still a powerful tool that can be used to improve the responsiveness of many web games. I'm excited about the future of WebRTC adoption and hope it will usher in the next generation of massively multiplayer browser gaming experiences.

References

  1. https://www.superdataresearch.com/us-digital-games-market/
  2. http://gafferongames.com/networking-for-game-programmers/udp-vs-tcp/
  3. http://gamedev.stackexchange.com/questions/431/is-the-tcp-protocol-good-enough-for-real-time-multiplayer-games
  4. http://gafferongames.com/networking-for-game-programmers/
  5. http://new.gafferongames.com/post/why_cant_i_send_udp_packets_from_a_browser/
  6. https://www.html5rocks.com/en/tutorials/webrtc/basics/
  7. https://news.ycombinator.com/item?id=13741155
  8. https://news.ycombinator.com/item?id=13264952
  9. http://gamedev.stackexchange.com/questions/67738/limitations-of-p2p-multiplayer-games-vs-client-server
  10. https://github.com/js-platform/node-webrtc/issues/257
  11. http://caniuse.com/#feat=rtcpeerconnection

Tags: Game Development webrtc

Posted by VTS on Fri, 18 Nov 2022 14:52:15 +0530