Performance improvements

Jul 22, 2015 at 2:33 PM
Edited Jul 22, 2015 at 7:49 PM
Hi,

I profiled the code under a memory profiler (after adapting it to my needs) and want to share some of my findings:
  • RobustPredicates.cs:InCircleAdapt allocates a lot of double arrays every time it is called. I guess this is a port from C, where the arrays are allocated on the stack, but in .NET an array is a reference type and is allocated on the heap. I would suggest changing the code (one possible solution is shown below)
  • Triangle, BadTriangle and Segment objects produce high memory traffic (a lot of heap allocations)
  • The Mesh object contains a lot of collections inside, so its construction causes memory traffic
I fixed these issues by introducing an object pool (a minimal sketch of the idea is below).
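As a rough illustration of what I mean by an object pool (this is a simplified sketch, not my actual implementation; it assumes the pooled type has a parameterless constructor and that the caller resets the instance state before reuse):

```csharp
using System.Collections.Generic;

// Simplified sketch of an object pool; not the actual implementation I use.
public class ObjectPool<T> where T : class, new()
{
    private readonly Stack<T> items = new Stack<T>();

    // Reuse a previously returned instance, or allocate one if the pool is empty.
    public T Rent()
    {
        return items.Count > 0 ? items.Pop() : new T();
    }

    // Hand the instance back so the next Rent() call avoids a heap allocation.
    public void Return(T item)
    {
        items.Push(item);
    }
}
```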

Here are some additional findings:
  • Vertex has a hash field which has the same meaning as id. So the type carries one additional integer field, which leads to an extra 8 bytes per object (or more, depending on memory model/alignment)
  • The Vertex and Point types are primitive by their nature, but in the library they are reference types, which means additional memory is allocated per instance (sync block and type pointer). This is critical if you're using collections of such types (a quick illustration follows this list)
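To illustrate the second point (this is only an illustration of .NET object layout, not a proposed patch; the type names are made up):

```csharp
// Illustration of the overhead argument; these types are not from the library.
public struct PointValue          // value type: stored inline, no per-object header
{
    public double X;
    public double Y;
}

public class PointReference       // reference type: object header + method table pointer per instance
{
    public double X;
    public double Y;
}

// new PointValue[n]     -> one contiguous block of roughly n * 16 bytes
// new PointReference[n] -> n references plus n separate heap objects,
//                          each paying the header overhead and an extra indirection
```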
Anyway, thanks for the library.
Coordinator
Jul 23, 2015 at 2:09 PM
RobustPredicates: yes, actually I think this class should be more like a service (IRobustPredicates interface), so each mesh object has its own instance (and the user could switch to a custom implementation).
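Roughly the shape I have in mind (the names below are only illustrative, nothing is final):

```csharp
// Illustrative sketch of the predicates-as-a-service idea; names are not final.
public interface IRobustPredicates
{
    // Positive if the points a, b, c occur in counterclockwise order.
    double CounterClockwise(Point a, Point b, Point c);

    // Positive if point d lies inside the circle through a, b, c.
    double InCircle(Point a, Point b, Point c, Point d);
}

// Each mesh would hold its own instance, defaulting to the robust implementation,
// and a user could pass a custom one instead.
```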

I'm not an expert at memory profiling (to say the least). Would you consider sharing your code?
Jul 23, 2015 at 3:00 PM
Edited Jul 23, 2015 at 3:00 PM
I don't think this class needs to be an interface. It should be internal: I guess no one is going to provide a different implementation of it. Personally, I prefer to keep the public API surface as small as possible and hide implementation details (internal classes), and I don't like single-implementation interfaces that are most likely never going to be implemented in a different way.

If you want to improve your knowledge of memory profiling, you can start with this nice book: http://www.writinghighperf.net (one of the chapters is available here: http://www.codeproject.com/Articles/812678/Performance-Considerations-of-Class-Design-and-Gen)

You may also try the JetBrains dotMemory profiler - it has a user-friendly GUI and a full trial version (for 5 days).

I'm not a memory profiling expert, but I tried to use the library in my project (https://www.youtube.com/watch?v=cQ_FmhjC12s) and its performance was unsatisfactory for my needs. That is not because of the data structures (e.g. which hashtable implementation to choose), but mostly because of unneeded memory allocations.

I cannot share the whole code because I've completely removed/replaced things that are not needed for my project (and it is not public yet). However, I created a gist with an example of my changes to the RobustPredicates class: https://gist.github.com/ibuiluk/bd3dad7ef24b2cca274b

The general idea is to allocate all arrays from an object pool and return them to the pool once they are no longer used in that context. Of course, the object pool should be thread safe. You may try to use a lock-free data structure to avoid locking, but the implementation depends on the application. My implementation is a static class, which I don't like, but it's OK for me so far.
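To show the pattern more concretely (this is a simplified variant, not the code from the gist; the DoubleArrayPool name is mine):

```csharp
using System.Collections.Generic;

// Simplified sketch of the rent/return pattern (not the code from the gist).
// A static pool of double arrays, bucketed by length and guarded by a lock;
// a lock-free structure could replace the lock if contention becomes a problem.
public static class DoubleArrayPool
{
    private static readonly object sync = new object();
    private static readonly Dictionary<int, Stack<double[]>> buckets =
        new Dictionary<int, Stack<double[]>>();

    public static double[] Rent(int length)
    {
        lock (sync)
        {
            Stack<double[]> bucket;
            if (buckets.TryGetValue(length, out bucket) && bucket.Count > 0)
            {
                return bucket.Pop();
            }
        }
        return new double[length];
    }

    public static void Return(double[] array)
    {
        lock (sync)
        {
            Stack<double[]> bucket;
            if (!buckets.TryGetValue(array.Length, out bucket))
            {
                bucket = new Stack<double[]>();
                buckets[array.Length] = bucket;
            }
            bucket.Push(array);
        }
    }
}
```

Inside a method like InCircleAdapt, the temporary arrays would then be rented at the start and returned before every exit point, so repeated calls stop generating garbage.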
Oct 20, 2015 at 5:16 AM
Hi! Awesome library! Great Work!

But again, there is a problem with performance and memory traffic when running parallel triangulation on multiple threads.
Due to the large amount of object creation, the GC starts blocking the execution of all threads for about 300-500 ms.
It would be cool to see an object pool for frequently created objects in the next update.

Thank you!
Coordinator
Dec 5, 2015 at 12:56 PM
Edited Dec 5, 2015 at 12:57 PM
Regarding the predicates:
Can someone confirm that the latest changes improve the situation?

Regarding making the predicates a service:
I agree that keeping the public API simple is a good thing. I hope the way I implemented it (making it optional to create your own instance and injecting it in the constructor) is a good compromise. BTW: there are at least two other implementations I found: ceometric.com and RobustGeometry.NET.
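To make the usage concrete (class and constructor names here are placeholders; check the source for the exact signatures):

```csharp
// Placeholder sketch of the injection idea; the released constructors may differ.
IRobustPredicates predicates = new RobustPredicates();   // default, or a custom implementation
var mesh = new Mesh(predicates);                         // each mesh gets its own instance
```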

Regarding "Vertex has a hash field which has the same meaning as id": that's not true. Vertex ids might be changed by the user, but the hash is fixed (it serves as the index into the dictionary).

Regarding "Vertex and Point types are primitive by their nature": that's true for Point, but not so much for Vertex, which is part of the topology.

Now the important stuff, regarding the collections in the Mesh class:
I guess you are referring to the dictionaries allocating additional KeyValuePairs? I have replaced the triangles dictionary with an array-based structure that retains the allocated triangles between runs (a rough sketch of the idea is at the end of this post). Here are some performance results (single-threaded, creating a mesh with approx. 80,000 triangles, repeated 10 times, average runtime):

  • 300 ms (old version using the dictionary)
  • 260 ms (new data structure, not re-using the pool)
  • 170 ms (new data structure, re-using the pool)

I will upload the code next week and I'd be glad if someone would test it and do some memory benchmarks.
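A rough sketch of the idea behind the new structure (names, sizes and the direct Triangle construction are illustrative; the actual class may differ):

```csharp
// Illustrative sketch only; the class in the repository may differ in detail.
// An array-backed store that keeps its Triangle instances alive between runs,
// so restarting the triangulation reuses objects instead of allocating new ones.
public class TrianglePool
{
    private Triangle[] items = new Triangle[1024];
    private int count;

    public Triangle Get()
    {
        if (count == items.Length)
        {
            System.Array.Resize(ref items, 2 * items.Length);
        }
        if (items[count] == null)
        {
            items[count] = new Triangle();   // allocated only on first use
        }
        return items[count++];
    }

    // Reset the logical count but keep the allocated triangles for the next run.
    public void Restart()
    {
        count = 0;
    }
}
```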
Coordinator
Dec 11, 2015 at 12:49 AM
Here's an example using the new triangle pool: Parallel mesh processing

Some remarks:
  • at the moment only triangles are pooled, but the same could easily be done for vertices and segments
  • the pool is not thread-safe, so each thread has to use its own pool (see the example code and the sketch below)
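A minimal sketch of the one-pool-per-thread pattern (the TrianglePool here is the sketch from my previous post, and Triangulate is just a stand-in for the per-job work; see the linked example for the real code):

```csharp
using System.Threading.Tasks;

public static class ParallelMeshing
{
    // Sketch of the "one pool per thread" rule using the thread-local overload
    // of Parallel.For: each worker thread gets its own TrianglePool instance.
    public static void Run(int jobCount)
    {
        Parallel.For(0, jobCount,
            () => new TrianglePool(),            // localInit: one pool per worker thread
            (i, state, pool) =>
            {
                pool.Restart();                  // reuse this thread's triangles for the next job
                Triangulate(i, pool);            // hypothetical per-job triangulation
                return pool;
            },
            pool => { });                        // localFinally: nothing to clean up
    }

    private static void Triangulate(int job, TrianglePool pool)
    {
        // Placeholder: build the mesh for job 'job', taking triangles from 'pool'.
    }
}
```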