The question of case-insensitive strings has been asked a lot, and continues to be, and just as many answers can be found all over the internet. The go-to solution is to create a char_traits policy class whose eq, lt, and compare methods apply std::toupper before comparing characters, then instantiate std::basic_string with it: using istring = std::basic_string<char, char_itraits<char>>; This works well until you try to use your new type anywhere near code designed with std::string in mind. Then what?
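For reference, the traits-only approach described above might look roughly like the sketch below. The names ci_traits, ci_string, length_of, and demo are made up for this illustration and are not the names used in the implementation later in this post; the sketch also uses the single-argument std::toupper from <cctype> rather than the locale-aware overload, purely to keep it short.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits policy for this sketch: compares characters after upper-casing them.
struct ci_traits : std::char_traits<char>
{
    // Upper-case one character; the cast avoids undefined behaviour for negative char values.
    static char up(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }

    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }

    // Three-way comparison of the first n characters, ignoring case.
    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
        {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
};

// The traits-only "case-insensitive string".
using ci_string = std::basic_string<char, ci_traits>;

// A function written with std::string in mind.
inline std::size_t length_of(const std::string& s) { return s.size(); }

inline void demo()
{
    const ci_string a{"Hello"};
    const ci_string b{"HELLO"};
    assert(a == b);   // the traits ignore case
    // length_of(a);  // does not compile: ci_string does not convert to std::string
}
```

The commented-out call is the crux: ci_string and std::string are unrelated instantiations of the same template, so nothing written for one accepts the other.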
I never found a complete solution to this, so I decided to create one myself. I wanted the case-insensitive string to play nicely and seamlessly with its standard counterpart. I wanted the ability to pass it as a parameter anywhere std::string is accepted; to convert between istring and std::string in both directions; to push it to an output stream; to read it from an input stream; to compare it using all six operators (==, !=, <, <=, >, >=) with std::string, whether it appears on the left or right side of the operation; and to declare literals of its type, the way "std::string literal"s does for std::string. In other words, have it be indistinguishable from std::string except when comparisons are needed, like this:
```cpp
auto f_std = [](const string& s) {};
auto f_is = [](const istring& s) {};

auto std1 = string{"abc"};
auto istr1 = istring{"ABC"};
auto istr2 = "istring literal"_is;

f_std(istr1);
f_is(std1);

auto result = (std1 == istr1);

std1 = istr1;
istr1 = std1;
```
The starting point was the character traits policy class mentioned earlier, but instead of using it with std::basic_string directly I inherited publicly from the std::basic_string template configured with the case-insensitive traits policy. This pulled all its methods into the derived class scope; I only had to pull in the base class constructors:
```cpp
template<typename CharT, typename Alloc = std::allocator<CharT>>
class basic_istring : public std::basic_string<CharT, char_itraits<CharT>, Alloc>
{
public:
    using base = std::basic_string<CharT, char_itraits<CharT>, Alloc>;
    using base::base; // use base class constructors
    // ...
};
```
All this derived class needed now was a constructor that would allow it to be created from any other type of std::basic_string, as long as the character type was the same; an implicit type cast operator to seamlessly convert it to std::string; and four comparison operators: == and <=>, each declared twice, with istring as the first or the second parameter (in C++20 the compiler rewrites !=, <, <=, >, and >= in terms of these two). The constructor and comparison operators needed to be selectively enabled only for strings with a different character traits policy, otherwise they would cause ambiguity and compilation errors. The final step was declaring operator >>, operator <<, and operator""_is.
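To make the two conversions concrete, here is a stripped-down sketch of the idea for char only. It is not the actual implementation (that follows below); ci_traits, ci_string, length_of, and same_text are hypothetical names for this illustration, and the comparison operators against std::string are left out. The converting constructor is enabled only when the source string's traits differ from ci_traits, which is the selective enabling described above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical case-insensitive traits, kept minimal for this sketch.
struct ci_traits : std::char_traits<char>
{
    static char up(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }
    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
        {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
};

class ci_string : public std::basic_string<char, ci_traits>
{
public:
    using base = std::basic_string<char, ci_traits>;
    using base::base; // inherit the base class constructors

    // Converting constructor: participates only when the source traits differ from ours.
    template<typename Traits2, typename Alloc2,
        std::enable_if_t<!std::is_same_v<ci_traits, Traits2>, int> = 0>
    ci_string(const std::basic_string<char, Traits2, Alloc2>& str)
    : base(str.data(), str.size()) {}

    // Implicit conversion back to std::string (copies the characters).
    operator std::string() const { return std::string(data(), size()); }
};

// Accepts std::string; a ci_string argument goes through the conversion operator.
inline std::size_t length_of(const std::string& s) { return s.size(); }

// Accepts ci_string; std::string arguments go through the converting constructor.
inline bool same_text(const ci_string& a, const ci_string& b) { return a == b; }
```

With these two members in place, length_of(ci_string("abc")) and same_text(std::string("abc"), std::string("ABC")) both compile, each paying for one copy of the characters.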
P.S. Everything I just described also applies to wchar_t aka std::wstring.
The implementation is on my GitHub page: istring.hpp; example program: istring.cpp.
```cpp
#pragma once

#include <istream>
#include <ostream>
#include <compare>
#include <string>
#include <locale>
#include <memory>
#include <cstddef>
#include <utility>
#include <algorithm>
#include <type_traits>

inline namespace detail
{
    // true if c1 and c2 are equal once upper-cased in the given locale
    template<typename CharT>
    inline auto char_ieq(CharT c1, CharT c2, const std::locale& loc = std::locale())
    {
        return std::toupper(c1, loc) == std::toupper(c2, loc);
    }

    // true if c1 orders before c2 once both are upper-cased in the given locale
    template<typename CharT>
    inline auto char_ilt(CharT c1, CharT c2, const std::locale& loc = std::locale())
    {
        return std::toupper(c1, loc) < std::toupper(c2, loc);
    }

    // three-way case-insensitive comparison: returns -1, 0, or 1
    template<typename CharT>
    inline auto string_icmp(const CharT* s1, std::size_t n1, const CharT* s2, std::size_t n2, const std::locale& loc = std::locale())
    {
        if(std::lexicographical_compare(s1, s1 + n1, s2, s2 + n2,
            [&](CharT c1, CharT c2) { return char_ilt(c1, c2, loc); }))
            return -1;
        if(std::lexicographical_compare(s2, s2 + n2, s1, s1 + n1,
            [&](CharT c1, CharT c2) { return char_ilt(c1, c2, loc); }))
            return 1;
        return 0;
    }

    // traits policy: everything from std::char_traits except the comparisons
    template<typename CharT>
    struct char_itraits : std::char_traits<CharT>
    {
        static auto eq(CharT c1, CharT c2) { return char_ieq(c1, c2); }
        static auto lt(CharT c1, CharT c2) { return char_ilt(c1, c2); }
        static auto compare(const CharT* s1, const CharT* s2, std::size_t n) { return string_icmp(s1, n, s2, n); }
    };
}

template<typename CharT, typename Alloc = std::allocator<CharT>>
class basic_istring : public std::basic_string<CharT, char_itraits<CharT>, Alloc>
{
public:
    using base = std::basic_string<CharT, char_itraits<CharT>, Alloc>;
    using base::base;

    // construct from any basic_string with the same CharT but different traits
    template<typename Traits2, typename Alloc2,
        std::enable_if_t<not std::is_same_v<char_itraits<CharT>, Traits2>, void>* = nullptr>
    basic_istring(const std::basic_string<CharT, Traits2, Alloc2>& str)
    : base(str.data(), str.length()) {}

    // implicit conversion to the standard string of the same character type
    operator auto () const
    {
        return std::basic_string<CharT>(this->data(), this->length());
    }

    // Mixed comparisons, enabled only when the other string's traits differ.
    // The standard string is taken by const reference so that const objects
    // and temporaries can be compared as well.
    template<typename Traits2, typename Alloc2>
    std::enable_if_t<not std::is_same_v<char_itraits<CharT>, Traits2>, bool>
    friend operator == (const basic_istring& lhs, const std::basic_string<CharT, Traits2, Alloc2>& rhs)
    {
        return string_icmp(lhs.data(), lhs.length(), rhs.data(), rhs.length()) == 0;
    }

    template<typename Traits2, typename Alloc2>
    std::enable_if_t<not std::is_same_v<char_itraits<CharT>, Traits2>, std::strong_ordering>
    friend operator <=> (const basic_istring& lhs, const std::basic_string<CharT, Traits2, Alloc2>& rhs)
    {
        return string_icmp(lhs.data(), lhs.length(), rhs.data(), rhs.length()) <=> 0;
    }

    template<typename Traits2, typename Alloc2>
    std::enable_if_t<not std::is_same_v<char_itraits<CharT>, Traits2>, bool>
    friend operator == (const std::basic_string<CharT, Traits2, Alloc2>& lhs, const basic_istring& rhs)
    {
        return string_icmp(lhs.data(), lhs.length(), rhs.data(), rhs.length()) == 0;
    }

    template<typename Traits2, typename Alloc2>
    std::enable_if_t<not std::is_same_v<char_itraits<CharT>, Traits2>, std::strong_ordering>
    friend operator <=> (const std::basic_string<CharT, Traits2, Alloc2>& lhs, const basic_istring& rhs)
    {
        return string_icmp(lhs.data(), lhs.length(), rhs.data(), rhs.length()) <=> 0;
    }
};

using istring = basic_istring<char>;
using iwstring = basic_istring<wchar_t>;

// stream extraction: read into a standard string, then convert
inline auto& operator >> (std::istream& is, istring& istr)
{
    std::string temp;
    is >> temp;
    istr = std::move(temp);
    return is;
}

inline auto& operator >> (std::wistream& wis, iwstring& iwstr)
{
    std::wstring temp;
    wis >> temp;
    iwstr = std::move(temp);
    return wis;
}

// stream insertion: write the underlying characters
inline auto& operator << (std::ostream& os, const istring& istr)
{
    os << istr.c_str();
    return os;
}

inline auto& operator << (std::wostream& wos, const iwstring& iwstr)
{
    wos << iwstr.c_str();
    return wos;
}

// user-defined literals: "text"_is and L"text"_iws
inline auto operator ""_is(const char* istr, std::size_t len)
{
    return istring(istr, len);
}

inline auto operator ""_iws(const wchar_t* iwstr, std::size_t len)
{
    return iwstring(iwstr, len);
}
```
If you publicly inherit from a class that does case-sensitive comparisons, your class violates the Liskov Substitution Principle (LSP). You are allowed to do private inheritance from a case-sensitive string, or to contain one and forward member function calls.
In the absence of template parameters your observation would have been correct (assuming the base class compared case-sensitively and the derived not).
If you look closer at my class declaration and its template parameters…
```cpp
template<typename CharT, typename Alloc = std::allocator<CharT>>
class basic_istring : public std::basic_string<CharT, char_itraits<CharT>, Alloc>
```
…you will notice that the base class, std::basic_string, is inherited from with the possibility of using a different character type and allocator (the CharT and Alloc template parameters), HOWEVER the second template parameter passed to it is my char_itraits, which implements the case-insensitive compare (or rather the functions std::basic_string needs in order to perform comparisons). THEREFORE the base class of my istring already compares case-insensitively. istring is NOT the same type as std::string. SO NO LSP VIOLATION 😉
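The type relationships behind this argument can be checked at compile time. The sketch below uses a deliberately empty placeholder traits class (ci_traits, a hypothetical name) because only type identity matters here, not the comparison logic; ci_string stands in for istring.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Placeholder traits: the comparison overrides are omitted on purpose,
// since only the identity of the types is being examined.
struct ci_traits : std::char_traits<char> {};

class ci_string : public std::basic_string<char, ci_traits>
{
public:
    using base = std::basic_string<char, ci_traits>;
    using base::base;
};

// The base class is the traits-customised instantiation...
static_assert(std::is_base_of_v<std::basic_string<char, ci_traits>, ci_string>);
// ...which is a different type from std::string...
static_assert(!std::is_same_v<std::basic_string<char, ci_traits>, std::string>);
// ...so ci_string is not derived from std::string...
static_assert(!std::is_base_of_v<std::string, ci_string>);
// ...and a ci_string cannot be bound to a std::string& the way a derived object could.
static_assert(!std::is_convertible_v<ci_string&, std::string&>);
```

Since there is no is-a relationship with std::string, there is nothing for a derived class to substitute for.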
I could have stopped at declaring istring to be std::basic_string with my custom char_itraits class, but instead I inherited to expand the functionality; mainly the ability to create istring from std::string, and cast istring to std::string. Those are USER DEFINED TYPE CONVERSIONS, not a violation of LSP.
Perhaps the constructor defined in istring and the type cast operator should be declared explicit; this would make the conversion (NOT SUBSTITUTION) easier to see and would state that intent clearly. I will consider making that change…
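As a rough idea of what that change would mean for callers, here is a sketch with both conversions marked explicit. It is simplified to a non-template constructor taking std::string, uses an empty placeholder traits class, and the names ci_traits and ci_string are hypothetical; it is not a change that has been made to the implementation above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Placeholder traits: comparison overrides omitted, only the conversions matter here.
struct ci_traits : std::char_traits<char> {};

class ci_string : public std::basic_string<char, ci_traits>
{
public:
    using base = std::basic_string<char, ci_traits>;
    using base::base;

    // Explicit: a std::string must be converted on purpose.
    explicit ci_string(const std::string& str) : base(str.data(), str.size()) {}

    // Explicit: requires static_cast<std::string>(x) or direct initialisation.
    explicit operator std::string() const { return std::string(data(), size()); }
};

// Implicit conversion is gone in both directions...
static_assert(!std::is_convertible_v<std::string, ci_string>);
static_assert(!std::is_convertible_v<ci_string, std::string>);
// ...but deliberate construction still works.
static_assert(std::is_constructible_v<ci_string, std::string>);
static_assert(std::is_constructible_v<std::string, ci_string>);
```

The trade-off is that calls such as f_std(istr1) and assignments such as std1 = istr1 from the opening example would no longer compile without a cast, which gives up some of the seamlessness the class was designed for.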