| 01:19.42 | *** join/#brlcad infobot (ibot@rikers.org) | |
| 01:19.42 | *** topic/#brlcad is BRL-CAD and open source CAx discussion ! Also @ http://brlcad.zulipchat.com ! Logs @ http://infobot.rikers.org/%23brlcad/ | |
| 01:33.16 | *** join/#brlcad odqeywdqlmgagnir (~armin@dslb-088-066-132-181.088.066.pools.vodafone-ip.de) | |
| 04:09.24 | DenisP | Okay, I think I have got a basic idea how librt works. |
| 04:09.51 | DenisP | Appleseed uses boost::thread for multithreading. It divides an image into tiles and renders each tile in a separate thread. ( = pixel parallelization and not ray parallelization) |
| 04:10.08 | DenisP | Back to brl-cad. I believe each thread must have a separate resource structure to use rt_shootray() and all we can get from Appleseed is current thread's id (via boost::this_thread::get_id()). |
| 04:10.23 | DenisP | So, here's how I run rt_shootray() in multiple threads right now: |
| 04:10.40 | DenisP | I have got a BrlcadProceduralObject. It contains rt_i structure (which is initialized in the constructor); an array of resources for different threads; std::set for threads' Ids; a mutex (boost::mutex). In intersection function before calling rt_shootray(), I lock the mutex, add current thread's id to the set, get the index for current thread's resource structure, assign it to the application structure, unlock the mutex and call rt_s |
| 04:11.36 | DenisP | But it's probably not the best way to do this. Of course the whole mutex routine is not really computationally expensive compared to rt_shootray for large scenes, but still it's gonna be called pixels*samples times (plus additional pixels from tiles' overlaps) plus threads are gonna block each other a little. And resource assignment probably could be done more efficiently without std::set. |
| 04:11.44 | DenisP | But at least it's something to start with. |
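The locking scheme DenisP describes could look roughly like the sketch below. It is only an illustration under several assumptions: `Resource` is a hypothetical stand-in for librt's per-thread resource structure, `ResourcePool` and its methods are made-up names, the standard library's `std::thread` and `std::mutex` are swapped in for the boost equivalents, and a `std::map` replaces the `std::set` plus index lookup. No librt call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for librt's per-thread resource structure.
struct Resource {
    long rays_shot = 0;
};

// Hands out one preallocated Resource slot per thread, keyed by thread id.
// Every lookup takes the mutex, which is the per-ray cost discussed in the log.
class ResourcePool {
public:
    explicit ResourcePool(std::size_t max_threads) : slots_(max_threads) {}

    // Returns the slot owned by the calling thread, claiming the next
    // free slot the first time a given thread id is seen.
    Resource& for_current_thread() {
        const std::thread::id id = std::this_thread::get_id();
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = index_.find(id);
        if (it == index_.end()) {
            assert(index_.size() < slots_.size() && "more threads than slots");
            it = index_.emplace(id, index_.size()).first;
        }
        // The vector is never resized, so the reference stays valid
        // after the lock is released.
        return slots_[it->second];
    }

    // Number of distinct thread ids that have claimed a slot so far.
    std::size_t threads_seen() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return index_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::thread::id, std::size_t> index_;
    std::vector<Resource> slots_;
};
```

In an intersection callback, the slot returned by `for_current_thread()` would be the one assigned to the application structure before the ray is shot; the lock is held only for the lookup, not for the ray itself.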
| 04:13.53 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 05:17.33 | Stragus | Locking a mutex for every ray is some crazy overhead |
| 05:18.19 | Stragus | You should be able to buffer results in preallocated arrays without any mutex |
| 05:21.55 | *** join/#brlcad Radicarian (~Radicaria@cpe-72-231-246-183.buffalo.res.rr.com) | |
| 05:25.28 | DenisP | >is some crazy overhead |
| 05:25.33 | DenisP | Yeah, I thought so too at first. |
| 05:26.20 | DenisP | And what results? I use mutex for choosing a proper resource. There's nothing except thread's id as an input. And resources are allocated in the constructor. |
| 05:27.28 | Stragus | I missed some parts, what do you need/want to do with the ray tracing results? |
| 05:27.50 | Stragus | (what comes out of rt_shootray() ) |
| 05:28.39 | Stragus | Considering each CPU core can trace many million rays per second, there must be no per-ray memory allocation or per-ray mutex locking |
| 05:28.57 | DenisP | I have got no problems with results. |
| 05:29.22 | DenisP | The problem is a correct resource allocation for different threads |
| 05:32.50 | Stragus | I guess I'm lacking too much contextual information to comment properly |
| 05:32.54 | DenisP | In librt when rt_shootray is used in parallel, the resources are assigned at the start. And with appleseed without rewriting appleseed's core I believe the only way to assign allocated resources is with thread's id. |
| 08:03.41 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 08:04.31 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 08:05.16 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 08:06.09 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 08:06.54 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 13:49.12 | *** join/#brlcad teepee (~teepee@unaffiliated/teepee) | |
| 14:50.36 | *** join/#brlcad SaranNarayan (1b611e37@gateway/web/freenode/ip.27.97.30.55) | |
| 16:29.16 | *** join/#brlcad DenisP (d40d70a2@gateway/web/freenode/ip.212.13.112.162) | |
| 16:46.53 | *** join/#brlcad kintel (~kintel@unaffiliated/kintel) | |
| 17:58.36 | *** join/#brlcad witness (uid10044@gateway/web/irccloud.com/x-vplselmngmbimrpq) | |
| 19:51.50 | brlcad | DenisP: so you have a solution, mutex locking when getting resources, which is fine for getting started but obviously not ideal |
| 19:53.19 | brlcad | another trick that eliminates the overhead is to simply create a large array of resources (e.g., struct resource r[MAX_PSW]) and then just access get_id() % MAX_PSW |
| 19:54.11 | brlcad | of course, that's not guaranteed but in practice (for development) that works >90% of the time, so also good enough for testing/development |
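A minimal sketch of the lock-free modulo trick, with assumptions stated up front: a thread id is not an integer in either boost or the standard library, so this version hashes it with `std::hash<std::thread::id>` before taking the remainder. `Resource`, `kMaxSlots` (standing in for `MAX_PSW`, with an arbitrary value) and both function names are hypothetical. As noted in the log, nothing stops two threads from landing on the same slot, in which case they would share a resource unsafely.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical stand-in for librt's per-thread resource structure.
struct Resource {
    long rays_shot = 0;
};

// Stand-in for MAX_PSW; the value 64 is an assumption for this sketch.
constexpr std::size_t kMaxSlots = 64;

// Maps a thread id to a slot index without any locking.
// Distinct threads can collide on the same index.
inline std::size_t slot_for(std::thread::id id) {
    return std::hash<std::thread::id>{}(id) % kMaxSlots;
}

// Returns the slot the calling thread hashes to.
inline Resource& resource_for_current_thread(std::array<Resource, kMaxSlots>& slots) {
    const std::size_t idx = slot_for(std::this_thread::get_id());
    assert(idx < slots.size());
    return slots[idx];
}
```

The same thread always maps to the same slot, so no registration step or mutex is needed; the price is the collision risk, which is why this is suggested for testing and development only.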
| 19:54.51 | brlcad | the real fix needed requires a mod to appleseed, they need to set the thread ID during creation and make that accessible to the threads in some manner |
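The idea of assigning an index when each thread is created can be sketched generically as below. This is not appleseed's API or a proposed patch to it: `run_workers`, `g_worker_index` and `Resource` are hypothetical names, and a simple counting loop stands in for shooting rays. Each worker is handed a dense index at creation, keeps it in a `thread_local` variable, and uses it to pick its own slot with no lock and no collision.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for librt's per-thread resource structure.
struct Resource {
    long rays_shot = 0;
};

// Dense worker index, set once when a worker starts; -1 means "not a worker".
thread_local int g_worker_index = -1;

// Launches n workers. Each records the index it was given at creation,
// then updates only its own slot, so no synchronization is needed
// until the final join.
inline std::vector<Resource> run_workers(std::size_t n, long rays) {
    std::vector<Resource> slots(n);
    std::vector<std::thread> workers;
    workers.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        workers.emplace_back([&slots, i, rays] {
            g_worker_index = static_cast<int>(i);
            assert(g_worker_index >= 0);
            Resource& mine = slots[static_cast<std::size_t>(g_worker_index)];
            for (long r = 0; r < rays; ++r) {
                ++mine.rays_shot;  // stand-in for one ray shot
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return slots;
}
```

With an index like this available inside the render threads, the resource lookup becomes a plain array access.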
| 19:58.21 | brlcad | for gsoc planning, that's probably last on the priority list given the other two alternatives get things going |
| 20:11.42 | DenisP | yeah, I agree |
| 21:12.29 | *** join/#brlcad Radicarian (~Radicaria@cpe-72-231-246-183.buffalo.res.rr.com) | |
| 23:05.55 | *** join/#brlcad witness (uid10044@gateway/web/irccloud.com/x-qoaueyjaqdhccyka) | |