STEP file reader for non-ASCII file path

Hi,

I am using OCCT 7.1.0. According to the release notes for 6.8.0, Unicode file paths should be supported, but I cannot get this to work.

STEPControl_Reader aReader;

// teststr is a std::string holding the UTF-8 conversion of a std::basic_string<char16_t>.
IFSelect_ReturnStatus status = aReader.ReadFile(teststr.c_str()); // this line always returns an error status

Do I need to do anything else to handle a Unicode path in the reader?

Are there any examples of this?

Thank you in advance!

-Xing

 

Kirill Gavrilov

Most probably the problem is here:

//teststr std::string which is converted from std::basic_string<char16_t> using UTF-8 encoding.

i.e. the UTF-8 path you pass is probably invalid because of a bug in the conversion code.

You can try the MFC ImportExport sample to check whether you can import the same STEP file there.
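One way to test the suspicion above is to check whether the converted path is well-formed UTF-8 before handing it to the reader. The following is only a minimal sketch using the standard library; `isValidUtf8` is a hypothetical helper name, not an OCCT function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OCCT): returns true if `s` is well-formed UTF-8.
// Rejects truncated sequences, overlong encodings, UTF-16 surrogates
// and code points above U+10FFFF.
bool isValidUtf8(const std::string& s)
{
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n)
  {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    unsigned long cp = 0;
    if (lead < 0x80)                { len = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
    else return false;               // stray continuation byte or invalid lead byte

    if (i + len > n) return false;   // sequence is cut off at the end of the string

    for (std::size_t k = 1; k < len; ++k)
    {
      const unsigned char cont = static_cast<unsigned char>(s[i + k]);
      if ((cont & 0xC0) != 0x80) return false; // not a continuation byte
      cp = (cp << 6) | (cont & 0x3F);
    }

    if (len == 2 && cp < 0x80)    return false; // overlong encoding
    if (len == 3 && cp < 0x800)   return false; // overlong encoding
    if (len == 4 && cp < 0x10000) return false; // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate, not a character
    if (cp > 0x10FFFF) return false;                // outside the Unicode range

    i += len;
  }
  return true;
}
```

For example, `assert(isValidUtf8(teststr));` placed before the `ReadFile()` call would catch a path that was left in a legacy 8-bit encoding. It cannot tell whether the path actually exists on disk.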

xing gao

Thank you Kirill!

I tried the MFC ImportExport sample and it works on Windows!

My old code was:

STEPControl_Reader aReader;

// teststr is a std::string holding the UTF-8 conversion of a std::basic_string<char16_t>.
IFSelect_ReturnStatus status = aReader.ReadFile(teststr.c_str());

And my new code is:

STEPControl_Reader aReader;

// Same approach as the MFC ImportExport sample
TCollection_AsciiString aFileName((const wchar_t *)fileNameStr.c_str());
IFSelect_ReturnStatus status = aReader.ReadFile(aFileName.ToCString());

But then I noticed a problem:

On Linux/Mac my old code works, but the new code does not.

Since the MFC samples are Windows-only, I guess my new code is not the right approach for Linux/Mac.

So my question is:

Can I write this piece of code in a platform-independent way? Right now I am using the following workaround, but I don't think it is elegant enough.

// Windows
TCollection_AsciiString aFileName((const wchar_t *)fileNameStr.c_str());
IFSelect_ReturnStatus status = aReader.ReadFile(aFileName.ToCString());

// Linux/Mac: fall back to the UTF-8 path if the first attempt failed
if (status == IFSelect_RetError || status == IFSelect_RetFail)
{
  // toStringTmp is our own method, which converts std::basic_string<char16_t>
  // to a UTF-8 encoded std::string
  status = aReader.ReadFile(toStringTmp(fileNameStr).c_str());
}

Many thanks!

Xing

Kirill Gavrilov

Your code is confusing: if fileNameStr is a std::basic_string<char16_t>, how can you cast it to wchar_t on Linux?

wchar_t is 2 bytes long on Windows (so it can be treated as UTF-16 code units) and 4 bytes long on most other systems (so it can be treated as UTF-32).
Unlike Windows, Linux systems do not provide any system calls taking the wchar_t type (though there are some standard functions for string conversion); everything is expected in UTF-8 nowadays, which makes extra functions redundant (char* is sufficient).
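The size difference described above can be made concrete with a small standard-library sketch. The two function names are hypothetical and exist only for illustration. They show why a `(const wchar_t *)` cast on UTF-16 data can only line up where `wchar_t` happens to be 16 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption for this sketch: char16_t occupies 2 bytes, as on mainstream
// desktop platforms.
static_assert(sizeof(char16_t) == 2, "sketch assumes a 2-byte char16_t");

// Hypothetical helper: true only where wchar_t has the same size as char16_t
// (typically Windows). Elsewhere a cast from const char16_t* to const wchar_t*
// makes every wchar_t swallow several UTF-16 code units at once.
constexpr bool wcharIsUtf16Sized()
{
  return sizeof(wchar_t) == sizeof(char16_t);
}

// Hypothetical helper: number of wchar_t elements that the code units of `s`
// would span if its buffer were reinterpreted as wchar_t. Nothing is actually
// reinterpreted here; the function only does the size arithmetic.
std::size_t wcharElementsSpanned(const std::u16string& s)
{
  return s.size() * sizeof(char16_t) / sizeof(wchar_t);
}
```

With a 4-byte `wchar_t`, a path of four UTF-16 code units spans only two `wchar_t` elements, so the cast yields garbage characters and no reliable terminator. A real conversion is needed instead of a cast.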

Currently, TCollection_AsciiString does not define any constructor taking the char16_t type; there is only a constructor taking wchar_t, which has a different size on different systems.
In contrast, TCollection_ExtendedString defines a constructor taking Standard_ExtString, which is char16_t* on modern compilers.
So, if you really like char16_t (why?), you can create a TCollection_ExtendedString from it (without static casts, to avoid errors!) and then create a TCollection_AsciiString from that (which will perform the UTF-16 -> UTF-8 conversion for you).
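For readers who want to see what a UTF-16 -> UTF-8 conversion involves, here is a standard-library-only sketch. It is not the OCCT implementation, and `utf16ToUtf8` is a hypothetical name. Replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OCCT): converts UTF-16 to UTF-8.
// Unpaired surrogates are replaced with U+FFFD (a choice made for this sketch).
std::string utf16ToUtf8(const std::u16string& in)
{
  std::string out;
  out.reserve(in.size() * 3);
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    char32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF)
    {
      // High surrogate: must be followed by a low surrogate.
      if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
      {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i; // the low surrogate has been consumed
      }
      else
      {
        cp = 0xFFFD;
      }
    }
    else if (cp >= 0xDC00 && cp <= 0xDFFF)
    {
      cp = 0xFFFD; // low surrogate without a preceding high surrogate
    }

    if (cp < 0x80)
    {
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

The resulting `std::string` can be passed as `aReader.ReadFile(path.c_str())`, as in the original post. Within OCCT itself, the TCollection_ExtendedString to TCollection_AsciiString route described above is the more direct option.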

xing gao

Dear Kirill,

Many thanks for your reply! Somehow I did not get a notification email about it.

Based on your suggestion, I was able to get it working with UTF-8 on all our platforms!

Yes, as you said, I do not need char16_t. I had misunderstood how wchar_t works on Linux and Mac.

Thank you so much! 

Best Regards!

Xing