Not all IsEqual’s are really equal, or 6.5.3-induced regression

Hi folks,

When porting on OCC6.5.3 I have come across an issue that looks like a potential threat to multiple users’ codes. So this post is to spread the word and to get feedback, as I cannot fully comprehend the behavior. Specialists in symbol resolutions on Linux are badly needed (Windows works just fine as expected)
Symptoms – the following code in my library:

NCollection_Map aMap;
for (; ;) {
TopoDS_Shape aShape = …;
if (!aMap.Contains (aShape)) {
...
aMap.Add (aShape);
}
...
works inconsistently on Linux and Windows.

Contains() underneath calls Hasher::IsEqual(const TopoDS_Shape&) where Hasher is a new optional template parameter introduced in 6.5.3. It defaults to a proxy that calls a global function IsEqual(). The latter is to preserve source code compatibility with pre-6.5.3 versions where Hasher did not exist.

I provide my IsEqual() via including a header with the following definition:
inline Standard_Boolean IsEqual(const TopoDS_Shape& S1, const TopoDS_Shape& S2)
{ return S1.IsSame(S2); }

Note sure if this is important but the header gets included *after* NCollection_Map (due to 3rd party headers chain).

The code compiles and links fine. The configuration is -O0 -g.
However during run-time Contains() invokes another global IsEqual(), which is defined in ...TKV3d’s AIS_ConnectedShape.cxx as follows:
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft, const TopoDS_Shape& theRight )
{ return theLeft.IsEqual(theRight); }
and thus obviously changes the expected behavior.

I had no clue what’s happening underneath until invoked gdb to debug an ugly Linux-specific regression.

On Linux (unlike Windows), names are global and generally due to name collision, a symbol foo() defined in library A may be called from library B, even if B does not depend on A but does depend on C (where the same name foo() was defined). My library does not depend on libTKV3d.so but the latter gets loaded into process’ space.

However in my case above:
1. Initially the library used *inline* function, so why would the collision exist at all ?
2. TKV3d’s IsEqual() is declared static and thus should not be exported, so why is it ?
3. Why did this not happen in the past when there were different inlined IsEqual() ?

A couple of other questions:
4. What should be the reliable fix on the OCC side – I presume unnamed namespace in AIS_ConnectedShape.cxx that would encompass above IsEqual() ?
5. What are other possible side-effects of such Linux-specific behavior to be aware of ?

Thanks for the tips in advance.
Roman

JuryS's picture

Don't use inline functions anywere))) I think that only MS VC may work correctly with inline functions (any optimization ? can't see effect). Rewrite needed functions in cxx file.

Roman Lygin's picture

When you define a function or a class method in a header you effectively use inlining. There is nothing wrong about it.

Inline or outline - it does not explain why a non-exported symbol (static from TKV3d) is used from another library. See nm on libTKV3d.so:

nm -gC ./OCC/6.5.3/fix-lin64-gcc4-debug/lib/libTKV3d.so.0 | grep IsEqual 0000000000243066 W IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
000000000021cf96 W IsEqual(Handle_SelectMgr_SelectableObject const&, Handle_SelectMgr_SelectableObject const&)
U TColStd_MapIntegerHasher::IsEqual(int const&, int const&)
00000000001cdcc8 W NCollection_DefaultHasher::IsEqual(TopoDS_Shape const&, TopoDS_Shape const&)
000000000024319e W NCollection_DefaultHasher::IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
000000000021d09c W NCollection_DefaultHasher::IsEqual(Handle_SelectMgr_SelectableObject const&, Handle_SelectMgr_SelectableObject const&)
U TColStd_MapTransientHasher::IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
0000000000182fde W TopoDS_Shape::IsEqual(TopoDS_Shape const&) const
U Quantity_Color::IsEqual(Quantity_Color const&) const
U TopLoc_Location::IsEqual(TopLoc_Location const&) const
U TCollection_AsciiString::IsEqual(char const*) const
00000000002a93a6 T Graphic3d_MaterialAspect::IsEqual(Graphic3d_MaterialAspect const&) const
U TCollection_PrivCompareOfInteger::IsEqual(int const&, int const&) const
00000000001aac0c W gp_Dir::IsEqual(gp_Dir const&, double) const
00000000001a9a8c W gp_Pnt::IsEqual(gp_Pnt const&, double) const

Roman Lygin's picture

Just for a record.

Two fixes have worked out in AIS_ConnectedShape.cxx.

1. Defining IsEqual in unnamed namespace *before* inclusion of NCollection_DataMap (otherwise compilation fails to find it):

#include
namespace {
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft,
const TopoDS_Shape& theRight )
{
return theLeft.IsEqual(theRight);
}
}
...
#include

2. Defining an explicit hasher parameter as follows:

namespace {
class AIS_ConnectedShape_ShapeHasher : public NCollection_DefaultHasher {
public:
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft,
const TopoDS_Shape& theRight )
{
return theLeft.IsEqual(theRight);
}
};
}

...
typedef NCollection_DataMap
Shapes2EntitiesMap;
Of course unnamed namespace is redundant but kept for explicitness.

I tend prefer the second one for its explicitness and unambiguity, and being more compliant with Boost/STL approach.

Trying to simply define previous static IsEqual() with inline did not help - it was still invoked from my library.

Anyway, most questions above still remain, so any educative hints will still be much appreciated.

Roman Lygin's picture

Additional details (after more experiments and conversations with Linux knowledgeable folks around)

1. Formally this looks like an ODR (One Definition Rule, http://en.wikipedia.org/wiki/One_Definition_Rule) violation, and hence behavior is undefined. However, exporting static and calling it from another compilation unit is still unclear and looks like a compiler/linker bug (gcc 4.1.2)

By the way, invoking nm -C (i.e. without -G) shows it in libTKV3d.so.0:
00000000001cae64 t IsEqual(TopoDS_Shape const&, TopoDS_Shape const&)

2. Details of using LD_DEBUG=all are attached. You can see that a symbol libTKV3d.so really wins.

3. Recommendation given - always use namespace (e.g. unnamed namespace) to prevent this and isolate your symbols.

4. From 6.5.3 when using NCollection containers you better define a hasher and provide it as template argument instead of defining a global IsEqual() or Hash() functions.

5. The fix for AIS_ConnectedShape.cxx will be submitted.

Hope this will be helpful for those reading this or encountering a similar issue.
Roman

Attachments: 
Roman Lygin's picture