martin wrote: ↑2019-08-13 11:51
Hmm, long time since I looked at this code. There are two string types in hMailServer: HM::String and HM::AnsiString. HM::String relies on wchar_t internally, while HM::AnsiString relies on char.
So HM::String which you refer to is in fact not ANSI, but UTF-16LE, right?
As long as we are talking about x86 CPUs, LE is fine, because they store the least significant byte (LSB) first anyway.
It would be a different story with non-x86 CPUs.
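To illustrate the byte order point, here is a minimal sketch. The function names toUtf16Le and hostIsLittleEndian are made up for this post and are not from the hMailServer source. It writes one UTF-16 code unit as little-endian bytes regardless of the host CPU, and probes which order the host uses natively.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: serialises one UTF-16 code unit as two bytes,
// least significant byte first, independent of the host byte order.
std::array<std::uint8_t, 2> toUtf16Le(char16_t unit) {
    return {{ static_cast<std::uint8_t>(unit & 0xFF),
              static_cast<std::uint8_t>((unit >> 8) & 0xFF) }};
}

// Hypothetical helper: reports whether the host stores the least
// significant byte at the lowest address (true on x86).
bool hostIsLittleEndian() {
    const std::uint16_t probe = 0x0001;
    return *reinterpret_cast<const unsigned char*>(&probe) == 0x01;
}
```

On a little-endian host the in-memory layout of a wchar_t buffer on Windows already matches UTF-16LE, so the explicit shifts only matter when the same bytes go to a file or socket read by another architecture.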
VS lets you specify globally, by defining the macro _UNICODE or _MBCS, how generic-text types like TCHAR are handled.
If your source, an included header file, or your Visual Studio project's preprocessor settings define _MBCS, TCHAR is a classic C char datatype.
But if _UNICODE is defined, declarations
of type TCHAR* foo; are automatically treated like wchar_t* foo; (plain char and wchar_t themselves are never remapped).
You can override this behavior at any time by using an ANSI- or wide-character-specific API call in your code, but this makes the code less readable
and harder to understand. That's why I think defaulting to _UNICODE in hMailServer's main VC++ project file would make sense.
You just have to keep in mind that TCHAR* is wchar_t* when you use it in your code.
It can also make the code run faster, because no string conversion is necessary: the Windows API can process wchar_t 1:1.
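A rough sketch of the macro switch, written so it compiles outside Visual Studio as well. As far as I know, char itself stays char whatever _UNICODE says; it is the generic-text names TCHAR and _T() from <tchar.h> that change. DemoTchar and DEMO_T below are hypothetical stand-ins for those two names, not the real header.

```cpp
#include <cstddef>

// Hypothetical stand-ins for TCHAR and _T() from <tchar.h>.
// The real header makes the same kind of choice based on _UNICODE.
#ifdef _UNICODE
typedef wchar_t DemoTchar;
#define DEMO_T(x) L##x
#else
typedef char DemoTchar;
#define DEMO_T(x) x
#endif

// The generic-text string follows the macro setting...
const DemoTchar* demoGreeting() { return DEMO_T("hello"); }

// ...while a plain char string is unaffected by it.
const char* narrowGreeting() { return "hello"; }

// Size of one generic-text character under the current setting.
std::size_t demoCharSize() { return sizeof(DemoTchar); }
```

Compiling the same file with and without -D_UNICODE (or /D_UNICODE) changes what demoGreeting() returns, but never narrowGreeting().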
hMailServer does use HM::AnsiString (char) as well. The idea with this one was to use it in places where using Unicode-encoded strings does not make sense. For example, text sent over SMTP/POP3/IMAP won't use a Unicode encoding. Another example is the query functions for MySQL, which take char* and not wchar_t*. If you have an algorithm which wants to scan through a sequence of chars, passing in a wchar_t* may then lead to issues, since each actual character may be represented by more than one byte.
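A small sketch of the scanning problem described above. utf16LeBytes and charScanLength are hypothetical names for illustration only. It lays out UTF-16 text as raw bytes and then runs a char-oriented scan over it, the way a function expecting char* would.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: lays out UTF-16 code units as raw bytes,
// least significant byte first (the layout wchar_t data has on x86 Windows).
std::string utf16LeBytes(const std::u16string& text) {
    std::string bytes;
    for (char16_t unit : text) {
        bytes.push_back(static_cast<char>(unit & 0xFF));
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF));
    }
    return bytes;
}

// A char-oriented scan: counts bytes up to the first zero byte, as strlen does.
std::size_t charScanLength(const std::string& bytes) {
    return std::strlen(bytes.c_str());
}
```

For the text "AB" the byte layout is 41 00 42 00, so the char scan stops after the first byte even though the string holds two characters.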
I've actually forgotten parts of this, but to me it looks like it's currently using UTF-16LE for a majority of cases (String) while still using AnsiString. Do I misunderstand you?
I think this wouldn't be a showstopper, because it's more of an internal, compiler-specific task behind the curtain: how the compiler generates its object code before it hits the linker.
Regarding connection encoding: I think you can't really rely on the encoding promises a client makes in the first place.
It could be a security risk if someone tampers with unchecked encoding data on an open socket connection.
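One way to avoid trusting the client's claim is to check the received bytes before decoding them. The following is only a sketch under the assumption that the client announced UTF-8. isValidUtf8 is a hypothetical helper, not something taken from hMailServer.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `data` is well-formed UTF-8.
// Rejects truncated sequences, stray continuation bytes, overlong forms,
// surrogate code points, and values above U+10FFFF.
bool isValidUtf8(const std::string& data) {
    static const unsigned long minForLen[] = {0, 0, 0x80, 0x800, 0x10000};
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(data[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                 // invalid lead byte
        if (i + len > n) return false;     // sequence is cut off
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(data[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len > 1 && cp < minForLen[len]) return false;  // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate
        if (cp > 0x10FFFF) return false;                   // out of range
        i += len;
    }
    return true;
}
```

Data that fails the check can be rejected or handled as opaque bytes instead of being decoded according to what the client promised.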