
Wed, 08/01/2012 - 02:02
Hi folks,
When porting on OCC6.5.3 I have come across an issue that looks like a potential threat to multiple users’ codes. So this post is to spread the word and to get feedback, as I cannot fully comprehend the behavior. Specialists in symbol resolutions on Linux are badly needed (Windows works just fine as expected)
Symptoms – the following code in my library:
NCollection_Map
for (; ;) {
TopoDS_Shape aShape = …;
if (!aMap.Contains (aShape)) {
...
aMap.Add (aShape);
}
...
works inconsistently on Linux and Windows.
Contains() underneath calls Hasher::IsEqual(const TopoDS_Shape&) where Hasher is a new optional template parameter introduced in 6.5.3. It defaults to a proxy that calls a global function IsEqual(). The latter is to preserve source code compatibility with pre-6.5.3 versions where Hasher did not exist.
I provide my IsEqual() via including a header with the following definition:
inline Standard_Boolean IsEqual(const TopoDS_Shape& S1, const TopoDS_Shape& S2)
{ return S1.IsSame(S2); }
Note sure if this is important but the header gets included *after* NCollection_Map (due to 3rd party headers chain).
The code compiles and links fine. The configuration is -O0 -g.
However during run-time Contains() invokes another global IsEqual(), which is defined in ...TKV3d’s AIS_ConnectedShape.cxx as follows:
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft, const TopoDS_Shape& theRight )
{ return theLeft.IsEqual(theRight); }
and thus obviously changes the expected behavior.
I had no clue what’s happening underneath until invoked gdb to debug an ugly Linux-specific regression.
On Linux (unlike Windows), names are global and generally due to name collision, a symbol foo() defined in library A may be called from library B, even if B does not depend on A but does depend on C (where the same name foo() was defined). My library does not depend on libTKV3d.so but the latter gets loaded into process’ space.
However in my case above:
1. Initially the library used *inline* function, so why would the collision exist at all ?
2. TKV3d’s IsEqual() is declared static and thus should not be exported, so why is it ?
3. Why did this not happen in the past when there were different inlined IsEqual() ?
A couple of other questions:
4. What should be the reliable fix on the OCC side – I presume unnamed namespace in AIS_ConnectedShape.cxx that would encompass above IsEqual() ?
5. What are other possible side-effects of such Linux-specific behavior to be aware of ?
Thanks for the tips in advance.
Roman
Wed, 08/01/2012 - 05:44
Don't use inline functions anywere))) I think that only MS VC may work correctly with inline functions (any optimization ? can't see effect). Rewrite needed functions in cxx file.
Wed, 08/01/2012 - 11:02
When you define a function or a class method in a header you effectively use inlining. There is nothing wrong about it.
Inline or outline - it does not explain why a non-exported symbol (static from TKV3d) is used from another library. See nm on libTKV3d.so:
nm -gC ./OCC/6.5.3/fix-lin64-gcc4-debug/lib/libTKV3d.so.0 | grep IsEqual 0000000000243066 W IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
000000000021cf96 W IsEqual(Handle_SelectMgr_SelectableObject const&, Handle_SelectMgr_SelectableObject const&)
U TColStd_MapIntegerHasher::IsEqual(int const&, int const&)
00000000001cdcc8 W NCollection_DefaultHasher::IsEqual(TopoDS_Shape const&, TopoDS_Shape const&)
000000000024319e W NCollection_DefaultHasher::IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
000000000021d09c W NCollection_DefaultHasher::IsEqual(Handle_SelectMgr_SelectableObject const&, Handle_SelectMgr_SelectableObject const&)
U TColStd_MapTransientHasher::IsEqual(Handle_Standard_Transient const&, Handle_Standard_Transient const&)
0000000000182fde W TopoDS_Shape::IsEqual(TopoDS_Shape const&) const
U Quantity_Color::IsEqual(Quantity_Color const&) const
U TopLoc_Location::IsEqual(TopLoc_Location const&) const
U TCollection_AsciiString::IsEqual(char const*) const
00000000002a93a6 T Graphic3d_MaterialAspect::IsEqual(Graphic3d_MaterialAspect const&) const
U TCollection_PrivCompareOfInteger::IsEqual(int const&, int const&) const
00000000001aac0c W gp_Dir::IsEqual(gp_Dir const&, double) const
00000000001a9a8c W gp_Pnt::IsEqual(gp_Pnt const&, double) const
Wed, 08/01/2012 - 12:12
Just for a record.
Two fixes have worked out in AIS_ConnectedShape.cxx.
1. Defining IsEqual in unnamed namespace *before* inclusion of NCollection_DataMap (otherwise compilation fails to find it):
#include
namespace {
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft,
const TopoDS_Shape& theRight )
{
return theLeft.IsEqual(theRight);
}
}
...
#include
2. Defining an explicit hasher parameter as follows:
namespace {
class AIS_ConnectedShape_ShapeHasher : public NCollection_DefaultHasher {
public:
static Standard_Boolean IsEqual( const TopoDS_Shape& theLeft,
const TopoDS_Shape& theRight )
{
return theLeft.IsEqual(theRight);
}
};
}
...
typedef NCollection_DataMap
Shapes2EntitiesMap;
Of course unnamed namespace is redundant but kept for explicitness.
I tend prefer the second one for its explicitness and unambiguity, and being more compliant with Boost/STL approach.
Trying to simply define previous static IsEqual() with inline did not help - it was still invoked from my library.
Anyway, most questions above still remain, so any educative hints will still be much appreciated.
Thu, 08/02/2012 - 00:13
Additional details (after more experiments and conversations with Linux knowledgeable folks around)
1. Formally this looks like an ODR (One Definition Rule, http://en.wikipedia.org/wiki/One_Definition_Rule) violation, and hence behavior is undefined. However, exporting static and calling it from another compilation unit is still unclear and looks like a compiler/linker bug (gcc 4.1.2)
By the way, invoking nm -C (i.e. without -G) shows it in libTKV3d.so.0:
00000000001cae64 t IsEqual(TopoDS_Shape const&, TopoDS_Shape const&)
2. Details of using LD_DEBUG=all are attached. You can see that a symbol libTKV3d.so really wins.
3. Recommendation given - always use namespace (e.g. unnamed namespace) to prevent this and isolate your symbols.
4. From 6.5.3 when using NCollection containers you better define a hasher and provide it as template argument instead of defining a global IsEqual() or Hash() functions.
5. The fix for AIS_ConnectedShape.cxx will be submitted.
Hope this will be helpful for those reading this or encountering a similar issue.
Roman
Thu, 08/02/2012 - 02:19
Fix pushed as CR23365 - see http://tracker.dev.opencascade.org/view.php?id=23365