Implement string class by simulation

One: Four default member functions

1.private Members

 private:
        char* _str;
        size_t _size;
        size_t _capacity;
        const static size_t npos;
  • _ str is a pointer to string storage, _ Size is used to store the end of the current string, _ capacity is used to store the size of a string
  • The value of npos is defined as -1, because the type of npos is size_t, is an unsigned integer, so the actual value of npos is size_ Maximum value of type T 4294967295
  • When NPOs is used as a parameter, it usually indicates the end of the string, because we do not think a string will be longer than npos, and when used as a return value it is usually used to look up, indicating that the value to look for is not found
  • Initialization of npos needs to be outside of class because npos is a static member

2. Constructors

 string(const char* str = "")
        {
            //Prevent nullptr transmission
            if (str == nullptr) str = "";
            _size = strlen(str);
            _capacity = _size;
            _str = new char[_capacity + 1];
            strcpy(_str, str);
        }
  • The constructor simply passes a string to initialize the string, but you cannot write nullptr when writing the default parameter
  • Because strlen cannot pass nullptr when calculating the length of a string, the default parameter is usually written as "", meaning an empty string
  • Also note that _ capacity is the length of a valid character stored, excluding the'\0'at the end, so give _ str allocates more space than _ capacity one more byte

3. Destructors

~string()
        {
            delete[] _str;
            _str = nullptr;
            _size = _capacity = 0;
        }
  • delete does nothing for nullptr, so you don't have to decide _ Is str empty

4. Copy construction

<1>Traditional

  string(const string& str)//Copy construction, must pass reference, otherwise infinite recursion
        :_str(new char[strlen(str._str)+1])
        {
           strcpy(_str, str._str);
        }
  • The copy constructor is very similar to the constructor, but with a string object _ str to construct a new string object _ str, is a copy of the process, call library functions can be
  • Note that parameters must be passed by reference
  • If no reference is passed, infinite recursion will result

<2>Modern Style

void swap(string& str)
        {
            ::swap(_str, str._str);
            ::swap(_size, str._size);
            ::swap(_capacity, str._capacity);
        }
string(const string& str)
            :_str(nullptr)
            , _size(0)
            , _capacity(0)
        {
            string tmp(str._str);
            swap(tmp);
        }
  • Now we can define a copy constructor without strcpy. First we initialize an empty string object, then use the constructor to construct a string with the same number of strings that need to be copied, and then swap them. The copy construct is complete, and it is also a deep copy
  • A copy constructor has been created, and our temporary object tmp automatically calls the destructor to destruct it
  • Here's what we need to do with _ str initializes, otherwise _ str is a random value and error will occur when a copy construct is swapped and a destructor is called for delete because _ str's space is not dynamically allocated
  • Here the swap function can be called from the library without writing it, but the efficiency of the library is not as efficient as our own rewrite because we are just swapping, and the swap in the library creates a new variable and makes three deep copies

5. Assignment operator overload (operator=)

<1>Traditional

 string& operator=(const string& str)
        {
            if (this!=&str)//Prevent s1=s1
            {
                delete[] _str;
                _str = new char[strlen(str._str) + 1];
                strcpy(_str, str._str);
            }
            return *this;
        }
  • Unlike copy construction, operator = needs delete to free up the original space and request as much space as STR to make copies (because to make sure they have as much space as str, they are smaller and more expansive and can only reapply if they are larger, it is better to reapply the space directly)
  • To prevent yourself from assigning values to yourself, add if judgment when this!=& Assignment when str

<2>Modern Style

string& operator=(string str)//When s1=s1_ str's address will change
        {
            swap(str);
            return *this;
        }
  • Like copy construction, operator = has a present way to make code more concise
  • Parameters are key here, and no references are passed, so a copy is constructed when the parameter is passed, and the assignment is done by swap directly.
  • A disadvantage is that you cannot prevent yourself from assigning values to yourself

6. Iterators and operator s[]

  const char& operator[](int pos) const
  //Specially overload const member functions to prevent modification_ Value of str
   {
       assert(pos < _size);
       return _str[pos];
   }
   char& operator[](int pos)
   {
       assert(pos < _size);
       return _str[pos];
   }
  typedef char* iterator;
  typedef const char* const_iterator;
  iterator begin()
  {
      return _str;
  }
  iterator end()
  {
      return _str + _size;
  }
  const_iterator begin() const
  {
      return _str;
  }
  const_iterator end() const
  {
      return _str + _size;
  }
  • Don't forget to add assert when writing operator[] to prevent cross-border access
  • The iterator of the string class is actually the encapsulation of char and const char pointers

Two: reserve and resize

1.reserve function

void reserve(size_t n)
{
   if (n > _capacity)
    {
       char* tmp = new char[n + 1];
       strncpy(tmp, _str, _size + 1);
       //Can't use strcpy, there may be more than \0 not copied here, copy_ size+1, \0 will also be copied
       delete[] _str;
       _str = tmp;
       _capacity = n;
    }
 }
  • N<_ Capacity without expansion
  • You cannot use strcpy to copy because a string class may contain `\0`, and the end flag of the string class is pos==_size
  • So we use strncpy to make a copy, directly _ size+1 character, \0 will also be copied

2.resize function

void resize(size_t n, char val = '\0')//Don't forget the default parameters
        {
            if (n < _size)//Divide into n<_ Size and n>=size
            {
                _size = n;
                _str[_size] = '\0';
            }
            else
            {
                if (n > _capacity)
                {
                    reserve(n);
                }
                while (_size < n)
                {
                    _str[_size++] = val;
                }
                _str[_size] = '\0';
            }
        }
  • resize can be divided into two cases
  • (1) n<_size (2)n>=_size
  • The first is simpler, simply change the character in the nth position to \0 and then modify _ The value of size is sufficient
  • The second is from _ size always ends with the val ue character and ends with0

3: Insert Delete and Find

1.push_back

void push_back(char c)
{
     if (_size == _capacity)
     {
         reserve(_capacity == 0 ? 4 : _capacity * 2);
     }
     _str[_size++] = c;
     _str[_size] = '\0';
 }
  • Before each new character is added, you need to check if there is enough space to expand it
  • If it is an empty string, allocate 4 bytes of space or double the capacity

2.append

 void append(const char* str)
 {
     int len = _size + strlen(str);
     if (len > _capacity)
     {
         reserve(len);
     }
     strcpy(_str + _size, str);
     _size = len;
 }
  • And push_back, you need to check if you need to expand

3.operator+=

string& operator+=(char c)
{
     push_back(c);
     return *this;
}
string& operator+=(const char* str)
 {
     append(str);
     return *this;
 }
  • Two versions of operator+= need to be overloaded, one character at the end and one string at the end, respectively
  • Multiplex push_back and append will do

4.insert

//Insert before pos position
 string& insert(size_t pos, char ch)
  {
      assert(pos <= _size);//Can only be inserted at and before size
      if (_size == _capacity)
      {
          reserve(_capacity == 0 ? 4 : _capacity * 2);//capacity may be 0
      }
      char* end = _str + _size;
      while (end >= _str + pos)
      {
          *(end + 1) = *end;
          end--;
      }
      _str[pos] = ch;
      _size++;
      return *this;
  }
 string& insert(size_t pos, const char* str)//Problem with passing constant string without const
 {
       assert(pos <= _size);
       size_t len = _size+strlen(str);
       if ( len > _capacity)
       {
           reserve(len);
       }
       char* end = _str + _size;
       while (end >= _str + pos)
       {
           *(end + len) = *end;
           end--;
       }
       _size += len;
       strncpy(_str + pos, str, len);
       return *this;
   }
  • Note: (1) The insertion position cannot exceed _ size
  • (2) Don't forget to expand capacity
  • (3) If you do not use the pointer to move a character, you need to move from _ size+1 for assignment
  • If From _ size starts, so _ Str[end+1]=_ If str[end], the loop condition is written while (end>=pos)
  • When pos is 0, end eventually becomes -1, but end is of type size_t, would be a very large positive number, would cause an infinite loop
 string& insert(size_t pos, char ch)
        {
            assert(pos <= _size);//Can only be inserted at and before size
            if (_size == _capacity)
            {
                reserve(_capacity == 0 ? 4 : _capacity * 2);//capacity may be 0
            }
             size_t end = _size+1;//Changing the end type to int is also not possible, implicit type conversion occurs
             while (end > pos)
             {
                 _str[end] = _str[end-1];
                 end--;
             }
            _str[pos] = ch;
            _size++;
            return *this;
        }

5.erase

 string& erase(size_t pos = 0, size_t len = npos)//Do not delete parameters by default
 {
      assert(pos < _size);//_ The size position is\0 and cannot be deleted
      //1. The number of characters to be deleted is greater than the remaining characters
      //2. The number of characters to be deleted is less than the remaining characters
      size_t leftLen = _size - pos;
      if (len >= leftLen)//pos and delete it altogether later
      {
          _str[pos] = '\0';
          _size = pos;
      }
      else
      {
          strcpy(_str + pos, _str + pos + len);
          _size -= len;
      }
      return *this;
  }
  • Note: (1) pos must be less than _ Size,_ The size location is\0 and cannot be deleted
  • (2) The length of deletion and the size of the remaining length after pos need to be calculated. If the length to be deleted is longer than the remaining length, delete all pos directly. Otherwise, delete only len characters, you can complete the character movement through strcpy

6.find

size_t find(char ch, size_t pos = 0)
{
     assert(pos < _size);
     for (int i = pos; i < _size; i++)
     {
         if (_str[i] == ch)
         {
             return i;
         }
     }
     return npos;
 }
        
size_t find(const char* str, size_t pos = 0)
{
    assert(pos < _size);
    const char* ret = strstr(_str + pos, str);
    if (ret)
    {
        return ret - _str;
    }
    return npos;
}
  • The find function also needs to be overloaded, which consists of finding a character and finding a string
  • Searching for a character can be done by traversing the search, which can be achieved by calling the library function strstr
  • Notice that find returns subscripts, strstr returns pointers when looking for strings, and ret-_str calculates subscript and returns

4: operator < and operator >>

ostream& operator<<(ostream& out, const string& s)
{
     for (auto ch : s)
     {
         out << ch;
     }
     return out;
 }
 istream& operator>>(istream& in, string& s)
 {
     s.clear();
     char ch;
     ch = in.get();
     while (ch != ' ' && ch != '\n')
     {
         s += ch;
         ch = in.get();
     }
     return in;
 }
  • Operator < and operator >> are not written as friend functions here because the internal members of string are not modified directly
  • Instead, indirect modifications are made through the string's interface, such as operator > which clears the original string before continuing +=
  • The output is through the range for ()

Five: Operator overload to compare sizes

	bool operator<(string& s1, string& s2)
    {
        return strcmp(s1.c_str(), s2.c_str()) < 0;
    }
    bool operator==(string& s1, string& s2)
    {
        return strcmp(s1.c_str(), s2.c_str()) == 0;
    }
    bool operator<=(string& s1, string& s2)
    {
        return s1 < s2 || s1 == s2;
    }
    bool operator>(string& s1, string& s2)
    {
        return !(s1 <= s2);
    }
    bool operator>=(string& s1, string& s2)
    {
        return !(s1 < s2);
    }
    bool operator!=(string& s1, string& s2)
    {
        return !(s1 == s2);
    }
  • When sorting string classes, we need to compare the size of the string class, so we need to overload operators such as >, <, ==
  • These overloads are not written as member functions in the stl library, so they can also be written outside the class
  • Fast implementation can be achieved by calling strcmp library functions

Tags: C C++

Posted by recklessgeneral on Fri, 26 Aug 2022 21:18:05 +0530