01:19.42 |
*** join/#brlcad infobot
(ibot@rikers.org) |
01:19.42 |
*** topic/#brlcad is BRL-CAD
and open source CAx discussion ! Also @ http://brlcad.zulipchat.com !
Logs @ http://infobot.rikers.org/%23brlcad/ |
01:33.16 |
*** join/#brlcad
odqeywdqlmgagnir
(~armin@dslb-088-066-132-181.088.066.pools.vodafone-ip.de) |
04:09.24 |
DenisP |
Okay, I think I have got a basic idea how
librt works. |
04:09.51 |
DenisP |
Appleseed uses boost::thread for
multithreading. It divides an image into tiles and renders each tile
in a separate thread. ( = pixel parallelization and not ray
parallelization) |
04:10.08 |
DenisP |
Back to brl-cad. I believe each thread must
have a separate resource structure to use rt_shootray(), and all we
can get from Appleseed is the current thread's id (via
boost::this_thread::get_id()). |
04:10.23 |
DenisP |
So, here's how I run rt_shootray() in multiple
threads right now: |
04:10.40 |
DenisP |
I have got a BrlcadProceduralObject. It
contains an rt_i structure (which is initialized in the constructor);
an array of resources for different threads; a std::set of thread
IDs; and a mutex (boost::mutex). In the intersection function, before
calling rt_shootray(), I lock the mutex, add the current thread's id to
the set, get the index of the current thread's resource structure,
assign it to the application structure, unlock the mutex and call
rt_s |
04:11.36 |
DenisP |
But it's probably not the best way to do this.
Of course the whole mutex routine is not really computationally
expensive compared to rt_shootray for large scenes, but still it's
gonna be called pixels*samples times (plus additional pixels from
tiles' overlaps) plus threads are gonna block each other a little.
And resource assignment probably could be done more efficiently
without std::set. |
04:11.44 |
DenisP |
But at least it's something to start
with. |
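The per-ray locking scheme DenisP describes above can be sketched roughly as follows. This is a minimal illustration, not his actual code: `Resource` is a stand-in for librt's `struct resource`, `ResourcePool` and `shoot_from_threads` are hypothetical names, and `std::thread`/`std::mutex` are used in place of the Boost equivalents so the sketch has no external dependencies. A `std::map` from thread id to slot index replaces the `std::set` plus separate index lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for librt's `struct resource` (assumption for illustration).
struct Resource { long rays_shot = 0; };

// Hypothetical helper: hands each calling thread its own Resource slot.
class ResourcePool {
public:
    explicit ResourcePool(std::size_t max_threads) : slots_(max_threads) {}

    // Called before every ray: lock, look up (or register) this thread, unlock.
    Resource& for_current_thread() {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::thread::id id = std::this_thread::get_id();
        auto it = index_of_.find(id);
        if (it == index_of_.end()) {
            assert(index_of_.size() < slots_.size() && "more threads than slots");
            it = index_of_.emplace(id, index_of_.size()).first;
        }
        return slots_[it->second];  // slots_ is never resized, so the reference stays valid
    }

    // Number of distinct thread ids registered so far.
    std::size_t threads_seen() {
        std::lock_guard<std::mutex> lock(mutex_);
        return index_of_.size();
    }

    // Sum over all slots; call only after the workers have been joined.
    long total_rays() const {
        long total = 0;
        for (const Resource& r : slots_) total += r.rays_shot;
        return total;
    }

private:
    std::mutex mutex_;
    std::map<std::thread::id, std::size_t> index_of_;
    std::vector<Resource> slots_;
};

// Usage sketch: each worker counts "rays" in its own slot.
void shoot_from_threads(ResourcePool& pool, int n_threads, int rays_per_thread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t)
        workers.emplace_back([&pool, rays_per_thread] {
            for (int i = 0; i < rays_per_thread; ++i)
                ++pool.for_current_thread().rays_shot;  // rt_shootray() would be called here
        });
    for (auto& w : workers) w.join();
}
```

The lock is taken once per ray, which is exactly the overhead questioned later in the log; the sketch only shows that the bookkeeping itself is correct.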
04:13.53 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
05:17.33 |
Stragus |
Locking a mutex for every ray is some crazy
overhead |
05:18.19 |
Stragus |
You should be able to buffer results in
preallocated arrays without any mutex |
05:21.55 |
*** join/#brlcad Radicarian
(~Radicaria@cpe-72-231-246-183.buffalo.res.rr.com) |
05:25.28 |
DenisP |
>is some crazy overhead |
05:25.33 |
DenisP |
Yeah, I thought so too at first. |
05:26.20 |
DenisP |
And what results? I use the mutex to pick the
right resource. The thread's id is the only input, and the
resources are allocated in the constructor. |
05:27.28 |
Stragus |
I missed some parts, what do you need/want to
do with the ray tracing results? |
05:27.50 |
Stragus |
(what comes out of rt_shootray() ) |
05:28.39 |
Stragus |
Considering each CPU core can trace many
million rays per second, there must be no per-ray memory allocation
or per-ray mutex locking |
05:28.57 |
DenisP |
I have got no problems with results. |
05:29.22 |
DenisP |
The problem is assigning the right resource
to each thread |
05:32.50 |
Stragus |
I guess I'm lacking too much contextual
information to comment properly |
05:32.54 |
DenisP |
In librt, when rt_shootray is used in parallel,
the resources are assigned at the start. With appleseed, short of
rewriting its core, I believe the only way to assign the
allocated resources is by thread id. |
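One possible way to reconcile the two positions above (no per-ray locking, yet no changes to the renderer's core) is to let each thread claim a slot the first time it asks and cache the index in thread-local storage. This is not something proposed in the discussion; it is a sketch under assumptions: `Resource` stands in for librt's `struct resource`, `kMaxThreads`, `claimed_resource` and `trace_in_parallel` are hypothetical names, and slots are never recycled when a thread exits.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Stand-in for librt's `struct resource` (assumption for illustration).
struct Resource { long rays_shot = 0; };

constexpr std::size_t kMaxThreads = 64;           // assumed upper bound on worker threads
static Resource g_resources[kMaxThreads];         // preallocated, one slot per thread
static std::atomic<std::size_t> g_next_slot{0};   // next unclaimed slot

// The first call on a thread claims the next free slot with one atomic increment;
// every later call on that thread reuses the cached index without synchronization.
inline Resource& claimed_resource() {
    thread_local const std::size_t slot = g_next_slot.fetch_add(1);
    assert(slot < kMaxThreads && "more threads than preallocated resources");
    return g_resources[slot];
}

// Usage sketch: each worker counts "rays" in its own slot; returns the grand total.
long trace_in_parallel(int n_threads, int rays_per_thread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t)
        workers.emplace_back([rays_per_thread] {
            for (int i = 0; i < rays_per_thread; ++i)
                ++claimed_resource().rays_shot;  // rt_shootray() would be called here
        });
    for (auto& w : workers) w.join();
    long total = 0;
    for (const Resource& r : g_resources) total += r.rays_shot;
    return total;
}
```

Because slots are not returned, a renderer that repeatedly creates and destroys threads would eventually exhaust the array; the sketch assumes a fixed pool of long-lived workers.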
08:03.41 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
08:04.31 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
08:05.16 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
08:06.09 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
08:06.54 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
13:49.12 |
*** join/#brlcad teepee
(~teepee@unaffiliated/teepee) |
14:50.36 |
*** join/#brlcad SaranNarayan
(1b611e37@gateway/web/freenode/ip.27.97.30.55) |
16:29.16 |
*** join/#brlcad DenisP
(d40d70a2@gateway/web/freenode/ip.212.13.112.162) |
16:46.53 |
*** join/#brlcad kintel
(~kintel@unaffiliated/kintel) |
17:58.36 |
*** join/#brlcad witness
(uid10044@gateway/web/irccloud.com/x-vplselmngmbimrpq) |
19:51.50 |
brlcad |
DenisP: so you have a solution, mutex locking
when getting resources, which is fine for getting started but
obviously not ideal |
19:53.19 |
brlcad |
another trick that eliminates the overhead is
to simply create a large array of resources (e.g., struct resource
r[MAX_PSW]) and then just access get_id() % MAX_PSW |
19:54.11 |
brlcad |
of course, that's not guaranteed to be collision-free, but in
practice (for development) that works >90% of the time, so also
good enough for testing/development |
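The modulo trick might look roughly like the sketch below. Thread ids are opaque objects rather than integers, so the sketch hashes the id before taking the remainder. Assumptions: `Resource` stands in for librt's `struct resource`, `kMaxPsw` is a placeholder for librt's `MAX_PSW` (its value here is arbitrary), `slot_for` and `hashed_resource` are hypothetical names, and `std::thread` is used instead of `boost::thread`, for which an equivalent hash of the id would be needed.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Stand-in for librt's `struct resource` (assumption for illustration).
struct Resource { long rays_shot = 0; };

constexpr std::size_t kMaxPsw = 32;     // placeholder value; librt defines MAX_PSW
static Resource g_slots[kMaxPsw];       // large preallocated array of resources

// Map an opaque thread id to an array index: hash it, then take the remainder.
// Two live threads can land on the same index, so this is not safe in general.
inline std::size_t slot_for(std::thread::id id) {
    return std::hash<std::thread::id>{}(id) % kMaxPsw;
}

// No lock and no lookup table: the slot is recomputed from the id on each call.
inline Resource& hashed_resource() {
    return g_slots[slot_for(std::this_thread::get_id())];
}
```

A collision means two threads share one resource structure at the same time, which is why this is suggested only for testing and development.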
19:54.51 |
brlcad |
the real fix requires a mod to
appleseed: they need to set the thread ID during creation and make
that accessible to the threads in some manner |
19:58.21 |
brlcad |
for gsoc planning, that's probably last on the
priority list given the other two alternatives get things
going |
20:11.42 |
DenisP |
yeah, I agree |
21:12.29 |
*** join/#brlcad Radicarian
(~Radicaria@cpe-72-231-246-183.buffalo.res.rr.com) |
23:05.55 |
*** join/#brlcad witness
(uid10044@gateway/web/irccloud.com/x-qoaueyjaqdhccyka) |