Parallel computations with BRepExtrema_DistShapeShape

Hugues Delorme
Fri, 07/10/2009 - 18:47

Forums:

Hello all,

Having read the very interesting article serie of Roman Lygin related to parallel applications with OpenCascade, I decided to make some tests with an OCC class which is often used : BRepExtrema_DistShapeShape.

For my case BRepExtrema_DistShapeShape gives me the projection of a point (vertex) on a topologic shape.
My test is as follows :
1. Load a testing geom curve C (for example from a file)
2. Load a testing topologic shape S (for example from a file)
3. Discretize C with GCPnts_UniformAbscissa, getting Points
4. For each point Pi in Points, project Pi on S with BRepExtrema_DistShapeShape

I want to make the fourth step parallel, using TBB (Threading Building Blocks 2.1) for that purpose (tbb::parallel_for()).
Here is the code I wrote :

#include
#include
#include
#include
#include
#include
#include
#include
#include

// The Body functor that tbb::parallel_for() will execute in parallel.
struct apply_occ_dist_shape_shape
{
apply_occ_dist_shape_shape(const TopoDS_Shape& shape,
const std::vector& curvePoints) :
_shape(shape),
_curvePoints(curvePoints)
{
}

void operator()(const tbb::blocked_range& r) const
{
BRepExtrema_DistShapeShape distSS;
distSS.LoadS1(this->_shape);
std::cout for (int i = r.begin(); i != r.end(); ++i)
{
distSS.LoadS2(BRepBuilderAPI_MakeVertex(this->_curvePoints[i]));
distSS.Perform();
}
}

private:
const TopoDS_Shape& _shape;
const std::vector& _curvePoints;
}; // struct apply_occ_dist_shape_shape

int main()
{
tbb::task_scheduler_init tbbInit;

// 1. Load the testing geom curve (C).
const Handle_Geom_Curve curve1 = // A (long) curve ...
// Load the testing topologic shape (S).
const TopoDS_Shape shape1 = // A (complex) shape ...
// Discretize the curve (C).
const double discrStep = 10.;
GeomAdaptor_Curve curveAdaptor(curve1);
GCPnts_UniformAbscissa discretizer(curveAdaptor, discrStep);
std::cout std::vector points(discretizer.NbPoints());
for (int i = 1; i points[i - 1] = curve1->Value(discretizer.Parameter(i));
// Project.
Standard::SetReentrant(Standard_True);
const size_t grainSize = 100;
tbb::parallel_for(tbb::blocked_range(0, points.size(), grainSize),
::apply_occ_dist_shape_shape(shape1, points));
return 0;
}

The third parameter of the constructor of tbb::blocked_range() is the grain size.
This value tells TBB to split a range into two sub-ranges if its size is greater than grainSize.

So let me give the results :
- There is 334 points in my curve discretization (so points.size() == 334) -> it forces TBB to split the range [0,334) and activates parallelization.
- The computation starts with two sub-ranges :
range : 0, 83
range : 167, 250
And fails because of an exception (caught be TBB) :
terminate called after throwing an instance of 'tbb::captured_exception'
what(): Unidentified exception

Setting MMGT_OPT to 0 does not help.
Setting grainSize to a value greater than points.size() disables parallelization and then the computation works.

I have not gone deeper into OpenCascade sources, but as Roman pointed out in his article serie, the problem is surely related to static variables declared in some OpenCascade C++ implementation files. But where ?

Roman Lygin
Fri, 07/10/2009 - 21:11

Hi Hugues,

(Just came accross your post)
Great story ! Welcome to a parallel world ;-).
What you might want to do is to try Intel Parallel Inspector (http://www.intel.com/go/parall, free eval for 30 days) and to try its Thread Checker. Start with level 2 - this should give you data races occurrences (with shallow stack level - 2 or so). Using level 3 should give you full stacks at the expense of overhead.

By the way, make sure you use debug binaries with /Zi and /Debug options compiled and linked.

On TBB grain size. You should likely use size of 1 and tbb::simple_partitioner (not auto_partitioner). This is especially important if your workload per thread can be of very different size. I have encountered this when using Shape Healing on solids imported from ACIS (part of CAD Exchanger development). I'm going to share this experience in some posts soon.

Keep us posted with your results. Good luck !
Roman

---
opencascade.blogspot.com - the Open CASCADE blog
www.cadexchanger.com - CAD Exchanger, your 3D data translator

Hugues Delorme
Sat, 07/11/2009 - 14:16

Hi Roman,

Intel Parallel Inspector seems very useful. I could not find a trial version for Linux, do you know where I can find one ? I am working on kubuntu 8.04.

Using QtCreator 1.2 I could see that one of the execution crashes with this stack call :

Geom_BSplineSurface::D1()
|
v
Geom_BSplineSurface::ValidateCache()
|
v
BSplSLib::BuildCache()
|
v
PrepareEval()

There is some "ugly" code in Geom_BSplineSurface::D1() :

void Geom_BSplineSurface::D1 (const Standard_Real U,
const Standard_Real V,
gp_Pnt& P,
gp_Vec& D1U,
gp_Vec& D1V) const
{
Standard_Real new_u = U,
new_v = V ;
PeriodicNormalization(new_u,
new_v) ;
if (!IsCacheValid(new_u,
new_v))
{
Geom_BSplineSurface * my_surface = (Geom_BSplineSurface *) this ;
my_surface->ValidateCache(new_u,
new_v) ;
}

The const "this" instance is casted to a mutable "this" instance. A possible source of race condition ? Geom_BSplineSurface::ValidateCache() internally uses BSplCLib which is full of static variables, look at the beginning of source file BSplCLib.cxx :

// static Arrays for Bezier computation
static TColStd_Array1OfReal bidknots(1,2);
static TColStd_Array1OfInteger bidmults(1,2);

// static arrays for temporary computations
static math_Matrix *BsplineMatrixPtr = NULL ;
static Standard_Integer RowSize = 0 ;
static Standard_Integer ColSize = 0 ;
static Standard_Integer LocalArraySize = 0;
static Standard_Real *LocalRealArray ;

Roman Lygin
Mon, 07/13/2009 - 22:52

Hi Hugues,

Parallel Inspector (as well as the whole Studio) is Windows-only product. We are currently proceeding to developing new version which will have a Linux flavor.

Casting from const* to * is absolutely safe and has nothing to do with data races. The root-cause of course is in BSplCLib. This was discussed several times on the blog (including last time here - http://opencascade.blogspot.com/2009/06/developing-parallel-applications.... Open CASCADE 6.3.1 features thread-safety improvements for BSplCLib and BSplSLib (but not for PLib unfortunately). Until that you will need to patch BSplCLib et al. I'm going to upload my some accumulated patches which would let you proceed. Stay tuned.

Good luck.
Roman

---
opencascade.blogspot.com - the Open CASCADE blog
www.cadexchanger.com - CAD Exchanger, your 3D data translator

Roman Lygin
Tue, 07/14/2009 - 00:57

Fixes uploaded on Source Forge.

There are two files at https://sourceforge.net/projects/opencascade/files/ - OCC630-fixes100-199.zip and -fixes200-299.zip, and readme with instructions and description of fixes.

Source Forge seems to need some time to mirror the files before letting download them.

These are source code patches only but to apply them you may need to generate some additional files (e.g. *.hxx) and/or modify project files. I have never had time to complete these fixes for such a seamless installation, so if some volunteer could do that it would be a great help for the community.

Good luck anyway. Hope this will be useful.
Roman
---
opencascade.blogspot.com - the Open CASCADE blog
www.cadexchanger.com - CAD Exchanger, your 3D data translator