"Writing a Live Server from Scratch" C++ implements the simplest RTSP streaming server

Streaming Media Development Series Articles

Foreword

In the security industry, ONVIF and GB (GB28181) are the two main standards: GB is the domestic (Chinese) standard and ONVIF is the international one. The video stream in GB28181 is a PS stream that the device or lower-level platform pushes to the upper-level platform, which makes it well suited to public cloud deployments. ONVIF video streams use RTSP, which is usually used for intranet access. When an RTSP stream needs to go to the public cloud, you can use R-RTSP, whose interaction is the reverse of the normal RTSP process: the server actively initiates the request.

1. What is an RTSP stream?

RTSP is an application-layer protocol similar to HTTP. For a typical streaming media system, refer to the following figure: RTSP carries the control commands, RTCP carries video quality feedback, and RTP carries the audio and video streams.


Important concepts:
1. RTSP (Real Time Streaming Protocol), RFC 2326, is an application-layer protocol in the TCP/IP protocol suite and an IETF RFC standard submitted by Columbia University, Netscape and RealNetworks. The protocol defines how one-to-many applications can efficiently transmit multimedia data over IP networks. Architecturally, RTSP sits on top of RTP and RTCP, and it uses TCP or UDP to transmit data.
2. RTP (Real-time Transport Protocol) was published as RFC 1889 by the IETF Audio/Video Transport working group in 1996. It specifies a standard packet format for delivering audio and video over the Internet, and it usually runs on top of UDP.
3. RTCP (Real-time Transport Control Protocol, or RTP Control Protocol) is the sister protocol of RTP. It is defined in RFC 3550, which replaced the obsolete RFC 1889. By convention RTP uses an even UDP port and RTCP uses the next port, which is odd. RTCP and RTP work together: RTP transmits the actual data, while RTCP sends control packets to each receiver in the session. Its main function is to give feedback on the quality of service that RTP is providing.
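To make the "standard packet format" of RTP more concrete, here is a minimal sketch that serializes the 12-byte fixed RTP header byte by byte in network byte order. The `RtpFields` struct and the `packRtpHeader` function are names made up for this illustration and do not belong to any library; the field layout is the one defined in RFC 3550.

```cpp
#include <array>
#include <cstdint>

// Hypothetical container for the fixed RTP header fields (RFC 3550, section 5.1).
struct RtpFields {
    uint8_t  version = 2;       // always 2 for current RTP
    bool     padding = false;
    bool     extension = false;
    uint8_t  csrcCount = 0;     // number of CSRC identifiers (0-15)
    bool     marker = false;
    uint8_t  payloadType = 96;  // dynamic payload type, e.g. 96 for H.264
    uint16_t seq = 0;
    uint32_t timestamp = 0;
    uint32_t ssrc = 0;
};

// Serialize the header into 12 bytes, most significant byte first.
std::array<uint8_t, 12> packRtpHeader(const RtpFields& f) {
    std::array<uint8_t, 12> b{};
    b[0] = static_cast<uint8_t>((f.version << 6) | (f.padding << 5) |
                                (f.extension << 4) | (f.csrcCount & 0x0F));
    b[1] = static_cast<uint8_t>((f.marker << 7) | (f.payloadType & 0x7F));
    b[2] = static_cast<uint8_t>(f.seq >> 8);
    b[3] = static_cast<uint8_t>(f.seq & 0xFF);
    for (int i = 0; i < 4; ++i) {
        b[4 + i] = static_cast<uint8_t>(f.timestamp >> (24 - 8 * i));
        b[8 + i] = static_cast<uint8_t>(f.ssrc >> (24 - 8 * i));
    }
    return b;
}
```

Building the bytes with shifts avoids any dependence on how the compiler lays out bit-fields, at the cost of a copy per packet.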

RTSP message format:
There are two types of RTSP messages: request messages and response messages. The two have different formats.

Request message format:
Method URI RTSP Version CR LF
Header CR LF CR LF
Message body CR LF
The methods include OPTIONS, SETUP, PLAY, TEARDOWN, and so on. The URI is the address of the receiver (the server), for example rtsp://192.168.22.136:5000/v0. The CR LF after each line is a carriage return followed by a line feed, which the receiving end must parse accordingly, and the last header must be followed by two CR LF sequences.
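As a rough illustration of the request format, the sketch below splits a raw request into its request line and headers. `RtspRequest` and `parseRtspRequest` are hypothetical names chosen for this example, and the parser is deliberately simple: it ignores the message body and does not validate the method or version.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical holder for a parsed RTSP request (names are illustrative).
struct RtspRequest {
    std::string method, uri, version;
    std::map<std::string, std::string> headers;
};

// Parse the request line and the headers; returns false on a malformed request line.
bool parseRtspRequest(const std::string& raw, RtspRequest& out) {
    std::istringstream in(raw);
    std::string line;
    if (!std::getline(in, line)) return false;
    if (!line.empty() && line.back() == '\r') line.pop_back();   // strip CR
    std::istringstream first(line);
    if (!(first >> out.method >> out.uri >> out.version)) return false;

    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;                       // blank line ends the headers
        std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;      // skip malformed header lines
        std::string name = line.substr(0, colon);
        std::string value = line.substr(colon + 1);
        value.erase(0, value.find_first_not_of(" \t")); // trim leading whitespace
        out.headers[name] = value;
    }
    return true;
}
```

For example, parsing "OPTIONS rtsp://192.168.22.136:5000/v0 RTSP/1.0" followed by a "CSeq: 1" header yields the method OPTIONS and a headers map with one CSeq entry.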

Response message format:
RTSP Version Status Code Explanation CR LF
Header CR LF CR LF
Message body CR LF
The RTSP version is generally RTSP/1.0. The status code is a number (200 means success), and the explanation is the text that corresponds to the status code.
The status code consists of three digits and indicates the result of executing the method. The classes are defined as follows:
1XX: informational, the request was received and processing continues;
2XX: success, the operation was received, understood, and accepted;
3XX: redirection, further action must be taken to complete the operation;
4XX: client error, the request has a syntax error or cannot be fulfilled;
5XX: server error, the server failed to fulfill a valid request.
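The response format can be sketched in the same spirit. The helper below assembles a status line, a CSeq header, an optional Content-Length header and the body. `buildRtspResponse` and `statusClassDigit` are illustrative names, and a real server would add further headers such as Session or Transport.

```cpp
#include <string>

// Build an RTSP/1.0 response: status line, headers, blank line, optional body.
std::string buildRtspResponse(int statusCode, const std::string& reason,
                              int cseq, const std::string& body = "") {
    std::string out = "RTSP/1.0 " + std::to_string(statusCode) + " " + reason + "\r\n";
    out += "CSeq: " + std::to_string(cseq) + "\r\n";
    if (!body.empty())
        out += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    out += "\r\n";   // the empty line that terminates the header section
    out += body;
    return out;
}

// The first digit of the status code gives its class (2 = success, 4 = client error, ...).
int statusClassDigit(int statusCode) { return statusCode / 100; }
```

Calling `buildRtspResponse(200, "OK", 2)` produces the same kind of reply that the OPTIONS handler in the server code formats with sprintf.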

RTSP process
1,OPTIONS
2,DESCRIBE
3,SETUP
4,PLAY
5,TEARDOWN
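The order of these requests can be thought of as a small state machine. The sketch below is a simplified model written for this article, not the complete state machine of RFC 2326: the `RtspState` enum and the `advance` function are hypothetical, OPTIONS and DESCRIBE are treated as not changing the state, and PAUSE and RECORD are left out.

```cpp
#include <string>

// Hypothetical session states for the simplified flow
// OPTIONS -> DESCRIBE -> SETUP -> PLAY -> TEARDOWN.
enum class RtspState { Init, Ready, Playing, Closed };

// Advance the state for a given method; returns false if the method
// is not expected in the current state.
bool advance(RtspState& s, const std::string& method) {
    if (s == RtspState::Closed) return false;
    if (method == "OPTIONS" || method == "DESCRIBE") return true;  // no state change
    if (method == "SETUP" && s != RtspState::Playing) {            // one SETUP per track
        s = RtspState::Ready;
        return true;
    }
    if (method == "PLAY" && s == RtspState::Ready) {
        s = RtspState::Playing;
        return true;
    }
    if (method == "TEARDOWN" && s != RtspState::Init) {
        s = RtspState::Closed;
        return true;
    }
    return false;
}
```

With this model a PLAY sent before any SETUP is rejected, while two SETUP requests in a row (one for video, one for audio) are accepted.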

2. Usage steps

1. Server code

rtp.cpp

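#include <sys/types.h>
#include <mutex>
#include "rtp.h"

void rtpHeaderInit(struct RtpPacket* rtpPacket, uint8_t csrcLen, uint8_t extension,
    uint8_t padding, uint8_t version, uint8_t payloadType, uint8_t marker,
    uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    rtpPacket->rtpHeader.csrcLen = csrcLen;
    rtpPacket->rtpHeader.extension = extension;
    rtpPacket->rtpHeader.padding = padding;
    rtpPacket->rtpHeader.version = version;
    rtpPacket->rtpHeader.payloadType = payloadType;
    rtpPacket->rtpHeader.marker = marker;
    rtpPacket->rtpHeader.seq = seq;
    rtpPacket->rtpHeader.timestamp = timestamp;
    rtpPacket->rtpHeader.ssrc = ssrc;
}

// The video and audio threads share one TCP connection, so writes are
// serialized to keep interleaved frames from being mixed with each other.
static std::mutex g_sendMutex;

int rtpSendPacketOverTcp(int clientSockfd, struct RtpPacket* rtpPacket, uint32_t dataSize, char channel)
{
    uint32_t rtpSize = RTP_HEADER_SIZE + dataSize;
    if (rtpSize > 0xFFFF) // the interleaved length field is only 16 bits wide
        return -1;

    char* tempBuf = (char *)malloc(4 + rtpSize);
    if (!tempBuf)
        return -1;

    // Convert the multi-byte header fields to network byte order before copying
    rtpPacket->rtpHeader.seq = htons(rtpPacket->rtpHeader.seq);
    rtpPacket->rtpHeader.timestamp = htonl(rtpPacket->rtpHeader.timestamp);
    rtpPacket->rtpHeader.ssrc = htonl(rtpPacket->rtpHeader.ssrc);

    tempBuf[0] = 0x24;    // '$' marks an interleaved frame
    tempBuf[1] = channel; // interleaved channel id, e.g. 0x00 for video RTP
    tempBuf[2] = (uint8_t)(((rtpSize) & 0xFF00) >> 8);
    tempBuf[3] = (uint8_t)((rtpSize) & 0xFF);
    memcpy(tempBuf + 4, (char*)rtpPacket, rtpSize);

    // Restore host byte order so the caller can keep incrementing seq and timestamp
    rtpPacket->rtpHeader.seq = ntohs(rtpPacket->rtpHeader.seq);
    rtpPacket->rtpHeader.timestamp = ntohl(rtpPacket->rtpHeader.timestamp);
    rtpPacket->rtpHeader.ssrc = ntohl(rtpPacket->rtpHeader.ssrc);

    int ret = 0;
    {
        std::lock_guard<std::mutex> lock(g_sendMutex);
        uint32_t sent = 0;
        // send() may write only part of the buffer, so loop until everything is sent.
        // MSG_NOSIGNAL (Linux) avoids SIGPIPE when the client has disconnected.
        while (sent < 4 + rtpSize) {
            ssize_t n = send(clientSockfd, tempBuf + sent, 4 + rtpSize - sent, MSG_NOSIGNAL);
            if (n <= 0) {
                ret = -1;
                break;
            }
            sent += (uint32_t)n;
        }
        if (ret == 0)
            ret = (int)sent;
    }

    free(tempBuf);
    tempBuf = NULL;

    return ret;
}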
int rtpSendPacketOverUdp(int serverRtpSockfd, const char* ip, int16_t port, struct RtpPacket* rtpPacket, uint32_t dataSize)
{
    struct sockaddr_in addr;
    int ret;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = inet_addr(ip);

    rtpPacket->rtpHeader.seq = htons(rtpPacket->rtpHeader.seq);
    rtpPacket->rtpHeader.timestamp = htonl(rtpPacket->rtpHeader.timestamp);
    rtpPacket->rtpHeader.ssrc = htonl(rtpPacket->rtpHeader.ssrc);

    ret = sendto(serverRtpSockfd, (char *)rtpPacket, dataSize + RTP_HEADER_SIZE, 0,
        (struct sockaddr*)&addr, sizeof(addr));

    rtpPacket->rtpHeader.seq = ntohs(rtpPacket->rtpHeader.seq);
    rtpPacket->rtpHeader.timestamp = ntohl(rtpPacket->rtpHeader.timestamp);
    rtpPacket->rtpHeader.ssrc = ntohl(rtpPacket->rtpHeader.ssrc);

    return ret;
}
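The 4-byte prefix that rtpSendPacketOverTcp writes in front of every packet is the RTSP interleaved framing: a '$' byte, a channel id, and a 16-bit length in network byte order. The following standalone sketch shows both directions of that framing. `frameInterleaved` and `parseInterleaved` are hypothetical helper names used only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Prefix an RTP/RTCP packet with the 4-byte interleaved header:
// '$' (0x24), channel id, then the 16-bit packet length, high byte first.
std::vector<uint8_t> frameInterleaved(uint8_t channel, const uint8_t* pkt, uint16_t len) {
    std::vector<uint8_t> out;
    out.reserve(4u + len);
    out.push_back(0x24);
    out.push_back(channel);
    out.push_back(static_cast<uint8_t>(len >> 8));
    out.push_back(static_cast<uint8_t>(len & 0xFF));
    out.insert(out.end(), pkt, pkt + len);
    return out;
}

// Read the header back. Returns false if the buffer does not start with '$'
// or does not yet hold the whole frame (more bytes must be received first).
bool parseInterleaved(const uint8_t* buf, size_t size, uint8_t& channel, uint16_t& len) {
    if (size < 4 || buf[0] != 0x24) return false;
    channel = buf[1];
    len = static_cast<uint16_t>((buf[2] << 8) | buf[3]);
    return size >= 4u + len;
}
```

A client reading the TCP stream would call something like `parseInterleaved` repeatedly, consuming `4 + len` bytes each time it returns true.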

rtp.h

#pragma once
#include <stdint.h>
#include <iostream>
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <cstring>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <string>
#include <memory>
#include <functional>
#include <thread>

#define RTP_VESION              2
#define RTP_PAYLOAD_TYPE_H264   96
#define RTP_PAYLOAD_TYPE_AAC    97

#define RTP_HEADER_SIZE         12
#define RTP_MAX_PKT_SIZE        1400

 /*
  *    0                   1                   2                   3
  *    7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0
  *   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  *   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
  *   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  *   |                           timestamp                           |
  *   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  *   |           synchronization source (SSRC) identifier            |
  *   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  *   |            contributing source (CSRC) identifiers             |
  *   :                             ....                              :
  *   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  *
  */
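/* Note: the bit-fields below are listed in reverse order within each byte.
 * This matches the wire layout only when the compiler allocates bit-fields
 * starting from the least significant bit, as GCC does on little-endian
 * targets such as x86. */
struct RtpHeader
{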
    /* byte 0 */
    uint8_t csrcLen : 4;
    uint8_t extension : 1;
    uint8_t padding : 1;
    uint8_t version : 2;

    /* byte 1 */
    uint8_t payloadType : 7;
    uint8_t marker : 1;

    /* bytes 2,3 */
    uint16_t seq;

    /* bytes 4-7 */
    uint32_t timestamp;

    /* bytes 8-11 */
    uint32_t ssrc;
};

struct RtpPacket
{
    struct RtpHeader rtpHeader;
    uint8_t payload[0];
};

void rtpHeaderInit(struct RtpPacket* rtpPacket, uint8_t csrcLen, uint8_t extension,
    uint8_t padding, uint8_t version, uint8_t payloadType, uint8_t marker,
    uint16_t seq, uint32_t timestamp, uint32_t ssrc);

int rtpSendPacketOverTcp(int clientSockfd, struct RtpPacket* rtpPacket, uint32_t dataSize, char channel);
int rtpSendPacketOverUdp(int serverRtpSockfd, const char* ip, int16_t port, struct RtpPacket* rtpPacket, uint32_t dataSize);

main.cpp

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <thread>
#include "rtp.h"

#define AAC_FILE_NAME   "/home/vagrant/Mycode/data/test.aac"
#define H264_FILE_NAME   "/home/vagrant/Mycode/data/test.h264"
#define SERVER_PORT      8554
#define BUF_MAX_SIZE     (1024*1024)

static int createTcpSocket()
{
    int sockfd;
    int on = 1;
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;
    setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, (const char*)&on, sizeof(on));
    return sockfd;
}

static int bindSocketAddr(int sockfd, const char* ip, int port)
{
    struct sockaddr_in addr;
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = inet_addr(ip);
    if (bind(sockfd, (struct sockaddr*)&addr, sizeof(struct sockaddr)) < 0){
        return -1;
    }
    return 0;
}

static int acceptClient(int sockfd, char* ip, int* port)
{
    int clientfd;
    socklen_t len = 0;
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    len = sizeof(addr);

    clientfd = accept(sockfd, (struct sockaddr*)&addr, &len);
    if (clientfd < 0)
        return -1;

    strcpy(ip, inet_ntoa(addr.sin_addr));
    *port = ntohs(addr.sin_port);

    return clientfd;
}

static inline int startCode3(char* buf)
{
    if (buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 1;
    else
        return 0;
}

static inline int startCode4(char* buf)
{
    if (buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 1;
    else
        return 0;
}

static char* findNextStartCode(char* buf, int len)
{
    int i;
    if (len < 3)
        return NULL;

    for (i = 0; i < len - 3; ++i)
    {
        if (startCode3(buf) || startCode4(buf))
            return buf;
        ++buf;
    }

    if (startCode3(buf))
        return buf;

    return NULL;
}

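static int getFrameFromH264File(FILE* fp, char* frame, int size) {
    int rSize, frameSize;
    char* nextStartCode;

    if (!fp)
        return -1;
    rSize = fread(frame, 1, size, fp);
    if (rSize < 4) // end of file, or too little data left to hold a start code
        return -1;
    if (!startCode3(frame) && !startCode4(frame))
        return -1;
    nextStartCode = findNextStartCode(frame + 3, rSize - 3);
    if (!nextStartCode){
        if (rSize == size) // buffer filled without reaching the next start code: NALU too large
            return -1;
        frameSize = rSize; // last NALU in the file: it runs to the end of the file
    }
    else{
        frameSize = (nextStartCode - frame);
        fseek(fp, frameSize - rSize, SEEK_CUR); // rewind to the start of the next NALU
    }
    return frameSize;
}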

struct AdtsHeader {
    unsigned int syncword;  // 12 bits, sync word '1111 1111 1111', marks the start of an ADTS frame
    unsigned int id;        // 1 bit, MPEG identifier: 0 for MPEG-4, 1 for MPEG-2
    unsigned int layer;     // 2 bits, always '00'
    unsigned int protectionAbsent;  // 1 bit, 1 means no CRC, 0 means a CRC is present
    unsigned int profile;           // 2 bits, indicates which AAC profile is used
    unsigned int samplingFreqIndex; // 4 bits, index of the sampling frequency used
    unsigned int privateBit;        // 1 bit
    unsigned int channelCfg;        // 3 bits, channel configuration (number of channels)
    unsigned int originalCopy;      // 1 bit
    unsigned int home;              // 1 bit

    /* The following fields can change from frame to frame */
    unsigned int copyrightIdentificationBit;   // 1 bit
    unsigned int copyrightIdentificationStart; // 1 bit
    unsigned int aacFrameLength;               // 13 bits, length of the ADTS frame including the ADTS header and the raw AAC data
    unsigned int adtsBufferFullness;           // 11 bits, 0x7FF indicates a variable-bitrate stream

    /* number_of_raw_data_blocks_in_frame:
     * the ADTS frame contains number_of_raw_data_blocks_in_frame + 1 raw AAC frames,
     * so a value of 0 means the ADTS frame holds one AAC data block, not none.
     * (One raw AAC frame contains 1024 samples and the related data.)
     */
    unsigned int numberOfRawDataBlockInFrame; // 2 bits
};

static int parseAdtsHeader(uint8_t* in, struct AdtsHeader* res) {
    memset(res, 0, sizeof(*res));

    if ((in[0] == 0xFF) && ((in[1] & 0xF0) == 0xF0)){
        res->id = ((unsigned int)in[1] & 0x08) >> 3;
        res->layer = ((unsigned int)in[1] & 0x06) >> 1;
        res->protectionAbsent = (unsigned int)in[1] & 0x01;
        res->profile = ((unsigned int)in[2] & 0xc0) >> 6;
        res->samplingFreqIndex = ((unsigned int)in[2] & 0x3c) >> 2;
        res->privateBit = ((unsigned int)in[2] & 0x02) >> 1;
        res->channelCfg = ((((unsigned int)in[2] & 0x01) << 2) | (((unsigned int)in[3] & 0xc0) >> 6));
        res->originalCopy = ((unsigned int)in[3] & 0x20) >> 5;
        res->home = ((unsigned int)in[3] & 0x10) >> 4;
        res->copyrightIdentificationBit = ((unsigned int)in[3] & 0x08) >> 3;
        // Parenthesize the mask: '>>' binds tighter than '&'
        res->copyrightIdentificationStart = ((unsigned int)in[3] & 0x04) >> 2;
        res->aacFrameLength = (((((unsigned int)in[3]) & 0x03) << 11) |
            (((unsigned int)in[4] & 0xFF) << 3) |
            ((unsigned int)in[5] & 0xE0) >> 5);
        res->adtsBufferFullness = (((unsigned int)in[5] & 0x1f) << 6 |
            ((unsigned int)in[6] & 0xfc) >> 2);
        res->numberOfRawDataBlockInFrame = ((unsigned int)in[6] & 0x03);
        return 0;
    }
    else{
        printf("failed to parse adts header\n");
        return -1;
    }
}

static int rtpSendAACFrame(int clientSockfd,
    struct RtpPacket* rtpPacket, uint8_t* frame, uint32_t frameSize) {
    int ret;

    // AU-header section (RFC 3640): a 16-bit AU-headers-length given in bits
    // (0x0010 = one 16-bit AU header), then a 13-bit AU size and a 3-bit AU index
    rtpPacket->payload[0] = 0x00;
    rtpPacket->payload[1] = 0x10;
    rtpPacket->payload[2] = (frameSize & 0x1FE0) >> 5; // upper 8 bits of the 13-bit size
    rtpPacket->payload[3] = (frameSize & 0x1F) << 3; // lower 5 bits of the size, AU index = 0
    memcpy(rtpPacket->payload + 4, frame, frameSize);
    ret = rtpSendPacketOverTcp(clientSockfd, rtpPacket, frameSize + 4,0x02);
    if (ret < 0){
        printf("failed to send rtp packet\n");
        return -1;
    }
    rtpPacket->rtpHeader.seq++;
    /*
     * The RTP clock of this stream runs at the sampling frequency (44100 Hz).
     * One AAC frame holds 1024 samples, so the timestamp advances by 1024 per frame.
     * That gives 44100 / 1024 = about 43 frames per second,
     * and one frame lasts 1024 / 44100 = about 23.2 ms.
     */
    rtpPacket->rtpHeader.timestamp += 1024;
    return 0;
}


static int rtpSendH264Frame(int clientSockfd,
    struct RtpPacket* rtpPacket, char* frame, uint32_t frameSize)
{

    uint8_t naluType; // first byte of the NALU (the NALU header)
    int sendByte = 0;
    int ret;
    naluType = frame[0];
    printf("%s frameSize=%u \n", __FUNCTION__, frameSize);
    if (frameSize <= RTP_MAX_PKT_SIZE) // NALU length is less than the maximum packet size: single NALU unit mode
    {
         //*   0 1 2 3 4 5 6 7 8 9
         //*  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         //*  |F|NRI|  Type   | a single NAL unit ... |
         //*  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        memcpy(rtpPacket->payload, frame, frameSize);
        ret = rtpSendPacketOverTcp(clientSockfd, rtpPacket, frameSize,0x00);
        if(ret < 0)
            return -1;

        rtpPacket->rtpHeader.seq++;
        sendByte += ret;
        // SPS (type 7) and PPS (type 8) carry no picture data. This function leaves
        // the timestamp untouched; the caller advances it after every NALU.

    }
    else // NALU length is greater than the maximum packet size: fragmentation mode (FU-A)
    {
         //*  0                   1                   2
         //*  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
         //* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         //* | FU indicator  |   FU header   |   FU payload   ...  |
         //* +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         //*     FU Indicator
         //*    0 1 2 3 4 5 6 7
         //*   +-+-+-+-+-+-+-+-+
         //*   |F|NRI|  Type   |
         //*   +---------------+

         //*      FU Header
         //*    0 1 2 3 4 5 6 7
         //*   +-+-+-+-+-+-+-+-+
         //*   |S|E|R|  Type   |
         //*   +---------------+


        // The NALU header byte travels in the FU indicator / FU header,
        // so only frameSize - 1 bytes have to be split into fragments.
        // Note: a NALU of exactly RTP_MAX_PKT_SIZE + 1 bytes yields a single
        // fragment that carries only the start bit.
        uint32_t dataLen = frameSize - 1;
        int pktNum = dataLen / RTP_MAX_PKT_SIZE;        // number of full-size fragments
        int remainPktSize = dataLen % RTP_MAX_PKT_SIZE; // size of the final, shorter fragment
        int i, pos = 1;                                  // skip the NALU header byte

        // send the full-size fragments
        for (i = 0; i < pktNum; i++)
        {
            rtpPacket->payload[0] = (naluType & 0x60) | 28; // FU indicator: NRI + type 28 (FU-A)
            rtpPacket->payload[1] = naluType & 0x1F;        // FU header: original NALU type

            if (i == 0) // first fragment
                rtpPacket->payload[1] |= 0x80; // start bit
            else if (remainPktSize == 0 && i == pktNum - 1) // last fragment
                rtpPacket->payload[1] |= 0x40; // end bit

            memcpy(rtpPacket->payload+2, frame+pos, RTP_MAX_PKT_SIZE);
            ret = rtpSendPacketOverTcp(clientSockfd, rtpPacket, RTP_MAX_PKT_SIZE+2,0x00);
            if(ret < 0)
                return -1;

            rtpPacket->rtpHeader.seq++;
            sendByte += ret;
            pos += RTP_MAX_PKT_SIZE;
        }

        // send the remaining data
        if (remainPktSize > 0){
            rtpPacket->payload[0] = (naluType & 0x60) | 28;
            rtpPacket->payload[1] = naluType & 0x1F;
            rtpPacket->payload[1] |= 0x40; // end bit

            // copy only the bytes that remain; the 2 header bytes are already in place
            memcpy(rtpPacket->payload+2, frame+pos, remainPktSize);
            ret = rtpSendPacketOverTcp(clientSockfd, rtpPacket, remainPktSize+2, 0x00);
            if(ret < 0)
                return -1;

            rtpPacket->rtpHeader.seq++;
            sendByte += ret;
        }
    }
    return sendByte;
}

static int handleCmd_OPTIONS(char* result, int cseq)
{
    sprintf(result, "RTSP/1.0 200 OK\r\n"
        "CSeq: %d\r\n"
        "Public: OPTIONS, DESCRIBE, SETUP, PLAY\r\n"
        "\r\n",
        cseq);
    return 0;
}

static int handleCmd_DESCRIBE(char* result, int cseq, char* url)
{
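    char sdp[500];
    char localIp[100] = {0};
    // Extract the host part of the URL (the text between "rtsp://" and the ':' before the port)
    sscanf(url, "rtsp://%99[^:]:", localIp);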
    sprintf(sdp, "v=0\r\n"
        "o=- 9%ld 1 IN IP4 %s\r\n"
        "t=0 0\r\n"
        "a=control:*\r\n"
        "m=video 0 RTP/AVP/TCP 96\r\n"
        "a=rtpmap:96 H264/90000\r\n"
        "a=control:track0\r\n"
        "m=audio 1 RTP/AVP/TCP 97\r\n"
        "a=rtpmap:97 mpeg4-generic/44100/2\r\n"
        "a=fmtp:97 profile-level-id=1;mode=AAC-hbr;sizelength=13;indexlength=3;indexdeltalength=3;config=1210;\r\n"
        "a=control:track1\r\n",
        time(NULL), localIp);

    sprintf(result, "RTSP/1.0 200 OK\r\nCSeq: %d\r\n"
        "Content-Base: %s\r\n"
        "Content-type: application/sdp\r\n"
        "Content-length: %zu\r\n\r\n"
        "%s",
        cseq,
        url,
        strlen(sdp),
        sdp);

    return 0;
}

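static int handleCmd_SETUP(char* result, int cseq)
{
    // This simple server assumes the client sends SETUP for the video track with
    // CSeq 3 and for the audio track with CSeq 4. Any other CSeq is rejected
    // instead of leaving a stale response in the buffer.
    if (cseq == 3) {
        sprintf(result, "RTSP/1.0 200 OK\r\n"
            "CSeq: %d\r\n"
            "Transport: RTP/AVP/TCP;unicast;interleaved=0-1\r\n"
            "Session: 66334873\r\n"
            "\r\n",
            cseq);
    }
    else if (cseq == 4) {
        sprintf(result, "RTSP/1.0 200 OK\r\n"
            "CSeq: %d\r\n"
            "Transport: RTP/AVP/TCP;unicast;interleaved=2-3\r\n"
            "Session: 66334873\r\n"
            "\r\n",
            cseq);
    }
    else {
        return -1;
    }
    return 0;
}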

static int handleCmd_PLAY(char* result, int cseq)
{
    sprintf(result, "RTSP/1.0 200 OK\r\n"
        "CSeq: %d\r\n"
        "Range: npt=0.000-\r\n"
        "Session: 66334873; timeout=10\r\n\r\n",
        cseq);
    return 0;
}

static void doClient(int clientSockfd, const char* clientIP, int clientPort) {
    char method[40] = {0};
    char url[100] = {0};
    char version[40] = {0};
    int CSeq = 0;

    char* rBuf = (char*)malloc(BUF_MAX_SIZE);
    char* sBuf = (char*)malloc(BUF_MAX_SIZE);

    while (true) {
        int recvLen;
        // Leave one byte free for the terminating '\0' written below
        recvLen = recv(clientSockfd, rBuf, BUF_MAX_SIZE - 1, 0);
        if (recvLen <= 0) {
            break;
        }
        rBuf[recvLen] = '\0';
        printf("Receive request rBuf = %s \n", rBuf);

        const char* sep = "\n";

        char* line = strtok(rBuf, sep);
        while (line) {
            if (strstr(line, "OPTIONS") ||
                strstr(line, "DESCRIBE") ||
                strstr(line, "SETUP") ||
                strstr(line, "PLAY")) {
                // Field widths keep sscanf from overflowing method[40], url[100] and version[40]
                if (sscanf(line, "%39s %99s %39s", method, url, version) != 3) {
                    // error
                    printf("parse error %d",__LINE__);
                }
            }
            else if (strstr(line, "CSeq")) {
                if (sscanf(line, "CSeq: %d\r\n", &CSeq) != 1) {
                    // error
                    printf("parse error %d",__LINE__);
                }
            }
            else if (!strncmp(line, "Transport:", strlen("Transport:"))) {
                // Only RTP over TCP (interleaved) is supported by this server, e.g.
                // Transport: RTP/AVP/TCP;unicast;interleaved=0-1
                if (!strstr(line, "RTP/AVP/TCP")) {
                    printf("unsupported Transport: %s\n", line);
                }
            }
            line = strtok(NULL, sep);
        }
        printf("method: %s seq %d\n",method,CSeq);
        if (!strcmp(method, "OPTIONS")) {
            if (handleCmd_OPTIONS(sBuf, CSeq)){
                printf("failed to handle options\n");
                break;
            }
        }
        else if (!strcmp(method, "DESCRIBE")) {
            if (handleCmd_DESCRIBE(sBuf, CSeq, url)){
                printf("failed to handle describe\n");
                break;
            }
        }
        else if (!strcmp(method, "SETUP")) {
            if (handleCmd_SETUP(sBuf, CSeq)){
                printf("failed to handle setup\n");
                break;
            }
        }
        else if (!strcmp(method, "PLAY")) {
            if (handleCmd_PLAY(sBuf, CSeq)){
                printf("failed to handle play\n");
                break;
            }
        }
        else {
            printf("undefined method = %s \n", method);
            break;
        }
        printf("Response sBuf = %s \n", sBuf);

        send(clientSockfd, sBuf, strlen(sBuf), 0);


        //Start playing, send RTP packets
        if (!strcmp(method, "PLAY")) {
            std::thread t1([&]() {
                int frameSize, startCode;
                FILE* fp = fopen(H264_FILE_NAME, "rb");
                if (!fp) {
                    printf("Failed to read %s\n", H264_FILE_NAME);
                    return;
                }
                // Allocate only after the file is open so nothing leaks on the early return
                char* frame = (char*)malloc(500000);
                struct RtpPacket* rtpPacket = (struct RtpPacket*)malloc(500000);
                if (!frame || !rtpPacket) {
                    free(frame);
                    free(rtpPacket);
                    fclose(fp);
                    return;
                }
                rtpHeaderInit(rtpPacket, 0, 0, 0, RTP_VESION, RTP_PAYLOAD_TYPE_H264, 0, 0, 0, 0x88923423);

                printf("start play\n");
                while (true) {
                    frameSize = getFrameFromH264File(fp, frame, 500000);
                    if (frameSize < 0){
                        printf("End of reading %s, frameSize=%d \n", H264_FILE_NAME, frameSize);
                        break;
                    }

                    if (startCode3(frame))
                        startCode = 3;
                    else
                        startCode = 4;

                    frameSize -= startCode;
                    if (rtpSendH264Frame(clientSockfd, rtpPacket, frame + startCode, frameSize) < 0) {
                        printf("failed to send h264 frame\n");
                        break; // the client has most likely disconnected
                    }

                    rtpPacket->rtpHeader.timestamp += 90000 / 25; // 90 kHz clock, 25 fps
                    // A frame at 25 fps lasts 1000/25 = 40 ms; sleeping 20 ms per NALU
                    // sends the stream somewhat faster than real time.
                    usleep(20000);
                }
                free(frame);
                free(rtpPacket);
                fclose(fp);
            });
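            std::thread t2([&]() {
                struct AdtsHeader adtsHeader;
                struct RtpPacket* rtpPacket;
                uint8_t* frame;
                int ret;

                FILE* fp = fopen(AAC_FILE_NAME, "rb");
                if (!fp) {
                    printf("Failed to read %s\n", AAC_FILE_NAME);
                    return;
                }

                // aacFrameLength is a 13-bit field, so an ADTS frame never exceeds 8191 bytes
                frame = (uint8_t*)malloc(8192);
                rtpPacket = (struct RtpPacket*)malloc(sizeof(struct RtpPacket) + 4 + 8192);
                if (!frame || !rtpPacket) {
                    free(frame);
                    free(rtpPacket);
                    fclose(fp);
                    return;
                }

                rtpHeaderInit(rtpPacket, 0, 0, 0, RTP_VESION, RTP_PAYLOAD_TYPE_AAC, 1, 0, 0, 0x32411);

                while (true){
                    ret = fread(frame, 1, 7, fp); // the fixed ADTS header is 7 bytes
                    if (ret < 7){
                        printf("End of reading %s\n", AAC_FILE_NAME);
                        break;
                    }
                    printf("fread ret=%d \n", ret);

                    if (parseAdtsHeader(frame, &adtsHeader) < 0){
                        printf("parseAdtsHeader err\n");
                        break;
                    }
                    if (adtsHeader.aacFrameLength <= 7){
                        printf("invalid adts frame length\n");
                        break;
                    }
                    int payloadSize = adtsHeader.aacFrameLength - 7;
                    ret = fread(frame, 1, payloadSize, fp);
                    if (ret < payloadSize){
                        printf("fread err\n");
                        break;
                    }
                    if (rtpSendAACFrame(clientSockfd, rtpPacket, frame, payloadSize) < 0)
                        break; // the client has most likely disconnected
                    usleep(23223); // 1024 samples / 44100 Hz = about 23.22 ms per frame
                }
                free(frame);
                free(rtpPacket);
                fclose(fp);
            });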
            t1.join();
            t2.join();

            break;
        }

        memset(method,0,sizeof(method)/sizeof(char));
        memset(url,0,sizeof(url)/sizeof(char));
        CSeq = 0;
    }

    close(clientSockfd);
    free(rBuf);
    free(sBuf);

}

int main(int argc, char* argv[])
{
    int serverSockfd;
    serverSockfd = createTcpSocket();
    if (serverSockfd < 0){
        printf("failed to create tcp socket\n");
        return -1;
    }

    if (bindSocketAddr(serverSockfd, "0.0.0.0", SERVER_PORT) < 0){
        printf("failed to bind addr\n");
        fprintf(stderr, "bind socket error %s errno: %d\n", strerror(errno), errno);
        return -1;
    }

    if (listen(serverSockfd, 10) < 0){
        printf("failed to listen\n");
        return -1;
    }
    printf("%s rtsp://10.20.39.168:%d/test\n",__FILE__, SERVER_PORT);

    while (true) {
        int clientSockfd = 0;
        char clientIp[40] = {"\0"};
        int clientPort = 0;

        clientSockfd = acceptClient(serverSockfd, clientIp, &clientPort);
        if (clientSockfd < 0){
            printf("failed to accept client\n");
            return -1;
        }
        printf("accept client;client ip:%s,client port:%d\n", clientIp, clientPort);

        doClient(clientSockfd, clientIp, clientPort);
    }
    close(serverSockfd);
    return 0;
}



/*
compile command
g++ main.cpp rtp.cpp -o rtsp.server -std=c++11 -lpthread
 Generate h264 and aac files from mp4
ffmpeg -i test.mp4 -an -vcodec copy -f h264 test.h264
ffmpeg -i test.mp4 -vn -acodec aac test.aac
*/
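The fragmentation mode used in rtpSendH264Frame follows the FU-A scheme of RFC 6184. As a standalone illustration of the arithmetic, the sketch below splits one NAL unit into FU-A payloads without touching any socket. `fragmentFuA` is a hypothetical helper written for this article, and `maxFragment` plays the role of RTP_MAX_PKT_SIZE.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split one NAL unit (without start code) into FU-A payloads (RFC 6184, section 5.8).
// Each returned vector is one RTP payload: FU indicator, FU header, fragment bytes.
// Returns an empty list when the NAL unit fits into a single fragment, in which
// case it should be sent in single NAL unit mode instead.
std::vector<std::vector<uint8_t>> fragmentFuA(const uint8_t* nalu, size_t size,
                                              size_t maxFragment) {
    std::vector<std::vector<uint8_t>> out;
    if (size < 2 || maxFragment == 0 || size - 1 <= maxFragment) return out;

    const uint8_t header = nalu[0];
    // Keep the F and NRI bits of the original header, set the type to 28 (FU-A)
    const uint8_t indicator = static_cast<uint8_t>((header & 0xE0) | 28);
    size_t pos = 1;  // the NAL header byte itself is not copied into the fragments
    while (pos < size) {
        const size_t chunk = std::min(maxFragment, size - pos);
        uint8_t fuHeader = header & 0x1F;           // original NAL unit type
        if (pos == 1) fuHeader |= 0x80;             // S bit on the first fragment
        if (pos + chunk == size) fuHeader |= 0x40;  // E bit on the last fragment
        std::vector<uint8_t> payload;
        payload.reserve(2 + chunk);
        payload.push_back(indicator);
        payload.push_back(fuHeader);
        payload.insert(payload.end(), nalu + pos, nalu + pos + chunk);
        out.push_back(payload);
        pos += chunk;
    }
    return out;
}
```

For a 3001-byte IDR NAL unit and a fragment size of 1400, this yields three payloads carrying 1400, 1400 and 200 bytes of data, each preceded by the two FU bytes.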

Running result:
Compile and run:

1,compile
g++ main.cpp rtp.cpp -o rtsp.server -std=c++11 -lpthread
2,run
./rtsp.server
3,play (replace ip with the address of the server)
ffplay.exe -rtsp_transport tcp rtsp://ip:8554/test

Summary

After working through this article, you should have a basic understanding of RTSP video streaming, and I hope it will be helpful to your further studies.

It is better to teach someone to fish than to give him a fish. If you like it, please like, collect and follow.

Tags: C++ server network

Posted by jazappi on Tue, 24 Jan 2023 08:48:13 +0530