XCAFDoc get part name error

Hello,
Here is my code:
Handle(TDocStd_Document) m_Document = ReadSTEPToDocument(fileloc);
Handle(XCAFDoc_ShapeTool) aShapeTool = XCAFDoc_DocumentTool::ShapeTool(m_Document->Main());
TDF_LabelSequence aRootLabels;
aShapeTool->GetFreeShapes(aRootLabels);
TDF_Label labelMain;
std::vector<TDF_Label> vecShapelabels;
for (TDF_LabelSequence::Iterator aRootIter(aRootLabels); aRootIter.More(); aRootIter.Next())
{
  const TDF_Label& aRootLabel = aRootIter.Value();

  // Remember the first assembly among the free shapes.
  if (XCAFDoc_ShapeTool::IsAssembly(aRootLabel))
  {
    if (labelMain.IsNull())
    {
      labelMain = aRootLabel;
    }
  }
  GetAssemblyShapes(aRootLabel, vecShapelabels);
#if 1
  for (auto it_ls = vecShapelabels.begin(); it_ls != vecShapelabels.end(); ++it_ls)
  {
    TopoDS_Shape aRootShape;
    if (XCAFDoc_ShapeTool::GetShape(*it_ls, aRootShape))
    {
      aShMap.Add(aRootShape);
    }

    // Read the part name stored on the label.
    TCollection_ExtendedString asName;
    Handle(TDataStd_Name) attrName;
    if (it_ls->FindAttribute(TDataStd_Name::GetID(), attrName))
    {
      asName = attrName->Get();
    }
    std::string str = TCollection_AsciiString(asName).ToCString();
  }
#endif
}
The program runs, but sometimes "asName" and "str" contain a wrong name such as "ǰФͥ". How do I fix it?

Dmitrii Pasukhin's picture

Hello,

The string "str" works only with pure ASCII symbols. You need to specify a locale and convert the text to that locale, if possible (using the OCCT API or std::codecvt).

Or you can use a container that matches the encoding you extract from TCollection_ExtendedString:

ASCII  // std::string

UTF-8  // std::string before C++20, std::u8string in C++20

UTF-16 // std::wstring or std::u16string

UTF-32 // std::u32string
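
To illustrate the difference between the containers above, here is a minimal sketch in standard C++ (no OCCT calls). The helper names isPureAscii and utf8CodePointCount are made up for this example, and the counter assumes well-formed UTF-8 input. It shows that a std::string holding UTF-8 stores bytes rather than characters, so a non-ASCII name occupies more bytes than it has symbols.

```cpp
#include <cstddef>
#include <string>

// Returns true when every byte is below 0x80, i.e. the text is plain ASCII.
bool isPureAscii(const std::string& text) {
  for (unsigned char byte : text) {
    if (byte >= 0x80) return false;
  }
  return true;
}

// Counts code points in well-formed UTF-8 by skipping continuation bytes,
// which always have the bit pattern 10xxxxxx.
std::size_t utf8CodePointCount(const std::string& text) {
  std::size_t count = 0;
  for (unsigned char byte : text) {
    if ((byte & 0xC0) != 0x80) ++count;
  }
  return count;
}
```

If isPureAscii returns false for a name, treating each byte as one displayable character will not give a readable result; the bytes have to be interpreted in the encoding they were written in.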

Best regards, Dmitrii.

x xo's picture

Thanks,
I tried changing the encoding of "str", but the name is still wrong: it turns into garbled Chinese characters or some other wrong name. Do you have another suggestion?
Thanks again.

gkv311 n's picture

The program runs, but sometimes "asName" and "str" contain a wrong name such as "ǰФͥ". How do I fix it?

How exactly do you use this std::string later, and how do you check whether the name is wrong or correct? What is the original file, and can you share a sample? How are the names displayed in CAD Assistant?

Dmitrii Pasukhin's picture

You are importing a STEP file. Can you check the encoding of this file?
If the STEP file is written in UTF-8, we read it by simply converting to Unicode. But if it was written in a specific locale, or uses a special encoding for non-ASCII symbols, you need to configure the import process before reading.
The parameter that selects the source code page for reading is "read.step.codepage".

If possible, please share any entity with a Chinese symbol, e.g. "#1234 = PRODUCT ("...",...)". You can see all supported locales in "occt/src/Resource/Resource_FormatType.hxx". The default value is Resource_FormatType_UTF8. You can also disable the conversion by using Resource_FormatType_ANSI and convert the string afterwards in your own environment.

Best regards, Dmitrii.

x xo's picture

thanks,
The encoding of this file is ISO-8859. Does OCCT not support this encoding?

gkv311 n's picture

The encoding of this file is ISO-8859. Does OCCT not support this encoding?

You can see the list of built-in code page converters in the Resource_FormatType enumeration:

   // ISO8859 8-bit code pages
   Resource_FormatType_iso8859_1,    //!< ISO 8859-1 (Western European) encoding
   Resource_FormatType_iso8859_2,    //!< ISO 8859-2 (Central European) encoding
   Resource_FormatType_iso8859_3,    //!< ISO 8859-3 (Turkish) encoding
   Resource_FormatType_iso8859_4,    //!< ISO 8859-4 (Northern European) encoding
   Resource_FormatType_iso8859_5,    //!< ISO 8859-5 (Cyrillic) encoding
   Resource_FormatType_iso8859_6,    //!< ISO 8859-6 (Arabic) encoding
   Resource_FormatType_iso8859_7,    //!< ISO 8859-7 (Greek) encoding
   Resource_FormatType_iso8859_8,    //!< ISO 8859-8 (Hebrew) encoding
   Resource_FormatType_iso8859_9,    //!< ISO 8859-9 (Turkish) encoding

The STEP standard doesn't support these code pages: symbols should be encoded in UTF-8 or via a special format. As such, you'll need to pass the non-standard code page to OCCT explicitly.
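
As a rough illustration of what an 8-bit code page converter does, here is a sketch in standard C++ for the simplest case, ISO 8859-1, where each byte value equals the Unicode code point. The function name latin1ToUtf8 is made up for this example; it is not an OCCT function and does not reproduce OCCT's converter.

```cpp
#include <string>

// Decodes ISO 8859-1 bytes and re-encodes them as UTF-8.
// In ISO 8859-1 the byte value is the code point (U+0000..U+00FF),
// so every byte >= 0x80 becomes a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string& bytes) {
  std::string out;
  for (unsigned char b : bytes) {
    if (b < 0x80) {
      out += static_cast<char>(b);                  // ASCII passes through
    } else {
      out += static_cast<char>(0xC0 | (b >> 6));    // lead byte: 110xxxxx
      out += static_cast<char>(0x80 | (b & 0x3F));  // continuation: 10xxxxxx
    }
  }
  return out;
}
```

The same function also shows how garbled names arise: if the input bytes really belong to a double-byte encoding, each byte is still turned into its own Latin-1 character, and the result is valid text that has nothing to do with the original name.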

x xo's picture

Do you mean I should change the source STEP file first and then load it?

Dmitrii Pasukhin's picture

Hello,

You can change the parameter that controls how names from the STEP file are converted.

STEPControl_Controller::Init();
Interface_Static::SetCVal("read.step.codepage", "iso8859-1"); // to convert from iso8859-1
// Interface_Static::SetCVal("read.step.codepage", "NoConversion"); // to skip the internal conversion (keep the bytes as is)
STEPCAFControl_Reader aReader;
Standard_Boolean aStatus = aReader.Perform(...);
...

Best regards, Dmitrii.

x xo's picture

Hi,
The name is still wrong. I get the right name when I use "SystemLocale" on Windows, but I get nothing on Linux. How do I set the GB2312 encoding with SetCVal? And how do I convert a TCollection_ExtendedString to std::string without TCollection_AsciiString?
Thanks very much.

gkv311 n's picture

By the way, your file uses GB2312 code page.


x xo's picture

thanks,
So how do I set the GB2312 encoding with SetCVal("read.step.codepage", "xxx")? The call returns false when I try "GBK" or "GB2312". Can I only use "SystemLocale" to get the right name?

Dmitrii Pasukhin's picture

Hello,

Please try the following code:

Interface_Static::SetCVal("read.step.codepage", "GB"); // to convert from GB
// Interface_Static::SetIVal("read.step.codepage", 3); // same as "GB", but set by integer value

You can see the numbers for all locales below (copied from "occt\occt\src\STEPControl\STEPControl_Controller.cxx"):

    Interface_Static::Init("step", "read.step.codepage", 'e', "");
    Interface_Static::Init("step", "read.step.codepage", '&', "enum 0");
    Interface_Static::Init("step", "read.step.codepage", '&', "eval SJIS");         // Resource_FormatType_SJIS 0
    Interface_Static::Init("step", "read.step.codepage", '&', "eval EUC");          // Resource_FormatType_EUC 1
    Interface_Static::Init("step", "read.step.codepage", '&', "eval NoConversion"); // Resource_FormatType_NoConversion 2
    Interface_Static::Init("step", "read.step.codepage", '&', "eval GB");           // Resource_FormatType_GB 3
    Interface_Static::Init("step", "read.step.codepage", '&', "eval UTF8");         // Resource_FormatType_UTF8 4
    Interface_Static::Init("step", "read.step.codepage", '&', "eval SystemLocale"); // Resource_FormatType_SystemLocale 5 
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1250");       // Resource_FormatType_CP1250 6
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1251");       // Resource_FormatType_CP1251 7
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1252");       // Resource_FormatType_CP1252 8
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1253");       // Resource_FormatType_CP1253 9
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1254");       // Resource_FormatType_CP1254 10
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1255");       // Resource_FormatType_CP1255 11
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1256");       // Resource_FormatType_CP1256 12
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1257");       // Resource_FormatType_CP1257 13
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP1258");       // Resource_FormatType_CP1258 14 
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-1");    // Resource_FormatType_iso8859_1 15
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-2");    // Resource_FormatType_iso8859_2 16 
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-3");    // Resource_FormatType_iso8859_3 17
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-4");    // Resource_FormatType_iso8859_4 18
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-5");    // Resource_FormatType_iso8859_5 19
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-6");    // Resource_FormatType_iso8859_6 20
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-7");    // Resource_FormatType_iso8859_7 21
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-8");    // Resource_FormatType_iso8859_8 22
    Interface_Static::Init("step", "read.step.codepage", '&', "eval iso8859-9");    // Resource_FormatType_iso8859_9 23
    Interface_Static::Init("step", "read.step.codepage", '&', "eval CP850");        // Resource_FormatType_CP850 24
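
The listing above also explains why "GBK" and "GB2312" were rejected: only the registered names are accepted. The following standalone sketch copies those names into a table and looks up their numbers. It does not call OCCT; stepCodepageIndex and kStepCodepageNames are hypothetical names for this example, and the exact, case-sensitive comparison is an assumption of the sketch rather than a statement about how OCCT matches values.

```cpp
#include <cstring>

// Names in the order they are registered in the listing above;
// the array index is the integer value of the parameter.
static const char* const kStepCodepageNames[] = {
  "SJIS", "EUC", "NoConversion", "GB", "UTF8", "SystemLocale",
  "CP1250", "CP1251", "CP1252", "CP1253", "CP1254",
  "CP1255", "CP1256", "CP1257", "CP1258",
  "iso8859-1", "iso8859-2", "iso8859-3", "iso8859-4", "iso8859-5",
  "iso8859-6", "iso8859-7", "iso8859-8", "iso8859-9",
  "CP850"
};

// Returns the number of a registered code page name, or -1 if the
// name is not in the list.
int stepCodepageIndex(const char* name) {
  const int count = static_cast<int>(sizeof(kStepCodepageNames) /
                                     sizeof(kStepCodepageNames[0]));
  for (int i = 0; i < count; ++i) {
    if (std::strcmp(kStepCodepageNames[i], name) == 0) return i;
  }
  return -1;
}
```

With this table, "GB" maps to 3, while "GB2312" and "GBK" are not found, which matches the false return value reported earlier in the thread.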

Best regards, Dmitrii.

x xo's picture

Thanks very much!
I think I have fixed this now. My other question is how to convert a TCollection_ExtendedString to std::string without TCollection_AsciiString.

Thanks again.

Dmitrii Pasukhin's picture

Hello,

You are trying to convert from 16-bit to 8-bit characters, so some conversion is needed. TCollection_AsciiString helps with this.

But of course, you can do it directly, using the code from its constructor:

//---------------------------------------------------------------------------
//  Create an asciistring from an ExtendedString 
//---------------------------------------------------------------------------
TCollection_AsciiString::TCollection_AsciiString(const TCollection_ExtendedString& astring,
                                                 const Standard_Character replaceNonAscii) 
: mystring (0)
{
  if (replaceNonAscii)
  {
    mylength = astring.Length(); 
    mystring = Allocate(mylength+1);
    for(int i = 0; i < mylength; i++) {
      Standard_ExtCharacter c = astring.Value(i+1);
      mystring[i] = ( IsAnAscii(c) ? ToCharacter(c) : replaceNonAscii );
    }
    mystring[mylength] = '\0';
  }
  else {
    // create UTF-8 string
    mylength = astring.LengthOfCString();
    mystring = Allocate(mylength+1);
    astring.ToUTF8CString(mystring);
  }
}
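
For readers who want to see what the UTF-8 branch amounts to, here is a sketch of a UTF-16 to UTF-8 encoder in standard C++. It is not OCCT's implementation of ToUTF8CString. The function name utf16ToUtf8 is made up for this example, and replacing unpaired surrogates with U+FFFD is a choice made here. With OCCT, the 16-bit code units could be collected from the extended string with Length() and Value(i), as the constructor above does.

```cpp
#include <cstddef>
#include <string>

// Encodes UTF-16 code units as UTF-8 bytes in a std::string.
std::string utf16ToUtf8(const std::u16string& input) {
  std::string out;
  for (std::size_t i = 0; i < input.size(); ++i) {
    char32_t cp = input[i];
    const bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
    if (isHigh && i + 1 < input.size() &&
        input[i + 1] >= 0xDC00 && input[i + 1] <= 0xDFFF) {
      // Combine a surrogate pair into one code point above U+FFFF.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (input[i + 1] - 0xDC00);
      ++i;
    } else if (cp >= 0xD800 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // unpaired surrogate: emit the replacement character
    }
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

The resulting std::string holds UTF-8 bytes, so it displays correctly only where the terminal, log file, or GUI toolkit also expects UTF-8.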

Best regards, Dmitrii.