i'm wondering if the struct bu_list l
member of struct seg
is still used, or if it is deprecated. and if it is still used, what's the difference between the list of segments in a partition and the list in struct seg
? :-?
@Cezar I'm not really an expert on bu_list, but my understanding is that the "l" member of a bu_list is always pretty much essential, and if I'm remembering correctly structures whose first element is a bu_list "l" can function as bu_lists for some purposes
thanks for replying :D what i don't get is why struct seg
functions as a bu_list as well. comments say that if a ray intersects geometry in multiple places (for example shooting through a torus), the bu_list l
member points to the next intersection. but a partition is described as containing "information about intervals of the ray which pass through geometry". so aren't they doing the same thing?
well, i don't think it's that important, and so far i haven't seen struct seg
used as a linked list in code, but i was afraid i might have a wrong understanding of what segments and partitions are
thanks for replying :D what i don't get is why
struct seg
functions as a bu_list as well. comments say that if a ray intersects geometry in multiple places (for example shooting through a torus), thebu_list l
member points to the next intersection. but a partition is described as containing "information about intervals of the ray which pass through geometry". so aren't they doing the same thing?
@Cezar Understanding how bu_lists work takes a bit of in-depth understanding of C, how memory is managed, how structures are accessed (typewise). Putting a bu_list struct into ANY other structure (e.g., struct seg) turns it into a doubly linked list because of old C trick called pointer aliasing.
you can read up on that, but the gist is if you have struct A {struct B; int C;}, I can pretend A is B and vice versa because the first byte of A is also the first byte of B, so it all happens to work if I simply cast
i understand that and i understand how it works
later versions of C came along and said you should only be able to alias if A and B are the same size, which resulted in the addition of the -fno-strict-aliasing flag, but the trick remains because of how the memory is arranged
okay, so then did I misunderstand the question?
my question was, if struct seg
is a list of segments, and a partition is a list of segments as well, what's the difference between them?
ah, so that difference is struct seg "IS A" list of segments and struct partition "HAS A" list of segments and "IS A" list of partitions
a partition has a lot more information about an intersection event
a segment is a simple slice of a primitive
and is struct seg used anywhere as a list of segments?
because in a partition, the list of segments is stored as struct bu_ptbl pt_seglist;
, for example
and is struct seg used anywhere as a list of segments?
oof, now that's a broad question ... and the answer is "probably" :)
finding them might be tricky because it could be something operating on bu_list or on struct seg's
yeah, i was about to say that "anywhere" is a bit much :D
segments are not typically accessed that way, so the answer very well could be no
but what prompted the question was a comment in rt/seg.h
. it says that struct seg
"Contains forward link to additional intersection segments if the intersection spans multiple segments (e.g., shooting a ray through a torus)."
typically, callers iterate over partitions that have been woven and ordered, or they iterate over the bu_ptbl set
but my understanding was that if i shoot a ray through a torus, i should have a partition containing the two segments, not a struct seg with a link to the the other segment
right, they are definitely stored as a list -- lots of primitives can result in a list of segs
triangle meshes in particular
you can test that understanding easily enough, but basically I think you'll end up with the rt_tor_shot() routine giving back a seg list with two segments that are returned to the caller as a partition with in_seg on the first seg and out_seg on the latter seg and the ptbl containing them both
oh, ok, that makes sense
so you could iterate from in_seg to the next (which is equal to out_seg), or access out_seg, or access via the hash
for something simple like a torus, it's overkill, but there are obviously much more complex primitives that could return an infinite number of segments during shot()
so the primitives work with lists of segments, and a partition is more of a user-facing structure?
almost, yes
evaluating a ray against primitives gives segments (basically sets of
in-out hit points)
great, then this pretty much answers the original question
depending on the geometry definition (you could call this "user-facing" I suppose) determines how those segments are processed
if I have a region R1 that is "A union B", then it'll return a partition for R1 that is the segments of A merged with the segments of B, which might end up being just 1 segment -- consider A is [0,1] and B is [0.5,2] then it could return a partition with segment C [0,2] because they overlapped and were the same region (they're the same material)
i was looking at the code in bool_eval
, and a lot of the time there is spent checking if a solid in a region is in the partition to evaluate, because it scans the list of segments linearly. i think that can be sped up with a bit vector of solids present in the partition, since solids already have a bits assigned to them and i see bit vectors are already present in a few places
that's where you 'll see, mentioned throughout the code, "bool_weave" and "bool_final" making themselves relevant
oh, also, i started reading parts of the librt code last week, and i wrote here some of the things i learned https://cezarelnazli.github.io/librt.txt
yeah, there's definitely a few places like that where you could eliminate an O(n)
regarding the bitvectors, i reread the messages in the performance topic and i saw that one of the areas was "optimization/elimination of libbu pointer tables (bu_pbtl_* calls) and bit vectors"
regarding the terminology, the hardest part to understand is usually what regions are and why we need them
i was wondering why bit vectors would have to be eliminated :-? and i thought that using bitvectors for the solids in a partition might not be ok if bitvectors should be eliminated
it's visualized slightly in http://brlcad.org/w/images/5/52/MGED_Quick_Reference_Card.pdf but it basically comes down to having a means for knowing the difference between a when something represents some space/volume/area vs something physically occupying space/volume/area
and a solid represents some space and a region is physically occupying that space?
bitvectors is merely a book-keeping optimization, right? so consider why they are needed at all, and whether we could speed things up if we didn't have them
specifically to performance, bitv's currently pop up on a profile where they should not, taking too much time where they should take zero
a solid or combination can represent some space
and a solid represents some space and a region is physically occupying that space?
a solid or combination can represent some space ... marking it as a "region" is you saying that it physically occupies space
so having a solid A and B overlapping represent some volume .. and if you union them, you have a new volume (say C) or you have an error (e.g., if you had a region A' and region B' overlapping)
it's a little more evident if you apply material properties, say A is a 100mm sphere and B is a 1000mm sphere, both at the origin; if you union A and B, you obviously get a 1000mm sphere and no problems. If, however, you have an A' region that is "water" and a B' that is steel .. then you have an error because the 100mm inner space cannot be both water and steel simultaneously.
so i suppose this means that you can apply union/intersection/diff to primitives, but not to regions?
no, you can
but they just have to not overlap?
if A' and B' don't overlap, you could certainly "union" aka group them together and there's no problem
right
technically even if they were the same material, it probably shouldn't be an error, but currently is treated as one (if you have two water volumes overlapping, what is a "double" water??)
subtractions are far more common
one could conceptually argue that "subtracting a steel volume from a water volume" doesn't make sense, but it's such a common user operation that it's allowed and presumed they meant to subtract the shape, not the object itself
@Cezar per your notes, rays do appear in code -- see the a_ray entry in struct application in rtexample.c for example
you can also have regions with just one solid
oh, right. i actually modified rtexample to shoot along X instead of Z, but i didn't notice that a_ray was a field. i misread that as a_ray_r_dir
coming back to the bitvector discussion, i suppose they are needed in order to check if something (like a solid or a region) is present in a set. so was your point about eliminating the need to check in the first place?
right, eliminate the need or use a different method or speed up the existing method
one problem I distinctly noted a few months back was bu_bitv_shift() came up in a profile ... and that should never happen -- implies CHAR_BIT wasn't defined or something else was wrong in build logic (see include/bu/bitv.h)
means a couple things went wrong like CHAR_BIT not defined and the cmake 'inline' test failing
ok, another question would be, is rt_boolweave really in need of speeding up? i profiled raytracing havoc.g, and it took 25 minutes, and rt_boolweave was only 27 seconds. is there a case that was not highlighted by this database where rt_boolweave takes a lot of time?
also in that same profile, i didn't see bu_list showing up. heaviest stack traces were BU_PTBL_FOR and bu_ptbl_cat_uniq
you're not going to see bu_list in a profile -- the performance impact there is that they fully prevent automatic vectorization by the compiler
boolweave can take anywhere from 0 to 50% of the ray trace time, depending on the model and view. on havoc, zoom into the space where the rotor blades all come togheter -- profile that
i'm sure you could contrive a model that exacerbates boolean evaluation even more than 50% too, but typical performance is in the 10-30% range for a traditional detailed csg model
that said ... 25 minutes.... what were you doing? that's a LOT of time for havoc
i was running with -H127 -J0 -B :-?
aha, heh
-H511* actually
oh, i'm also curious what the effect of shooting multiple rays is. is there some randomness with each, which is then aggregated for better results?
with -B there is no effect, you just increased the work by the -H count
without -B, there is randomness within each cell with results averaged together
thanks for answering all these questions, i think i have a clearer picture now
it's important work, so keep the questions flowing and ping me/others again if you don't get an answer or don't get a satisfactory answer
it's an optimized system, but there's ALWAYS room for improvement and gains to be made, especially as you start to see the bigger picture
I imagine this was already hashed out, but would it be worthwhile/practical to start with replacing bu_ptbl with something else and see what that does, rather than deep diving into the boolean eval workflow? (serious question, not a hint to try something else...)
yeah, I agree. my understanding is that @Cezar already has several good leads from some profiles and is just exploring the landscape, but tackling one of them like ptbls is as good a place as any to start with
the bu_ptbl work is where i'll most likely start. these past two weeks i've been trying to get more familiar with the code
@Cezar No problem, that's a good thing - I just didn't want you to feel obligated to untangle all the yarn before doing any knitting ;-). We do need to deal with bu_list and its implications, but as you're learning that one will take a fair bit of digesting.
Are the shorthand # defines in search.h really necessary? It looks like a holdover from OpenBSD and they have caused me at least one hard to debug error. I'm tempted to remove them.
@Peter Pronai I'm sure they're there from the BSD code - I've no objection to making them explicit and removing the defines
@Peter Pronai There are a number of things about search that aren't really optimal - for example, right now it builds an array of all paths in a database and applies the filters to that. I ended up doing that because it was simpler than debugging problems with properly terminating tree walking (ended up with missing items that should have been found in the search but weren't, if I recall correctly) but for large databases that initial list build and processing all the extra paths when a condition might have terminated searching whole subtrees is quite expensive.
Generally speaking with search, the importance ordering is 1) accuracy 2) features 3) performance
Are the shorthand # defines in search.h really necessary? It looks like a holdover from OpenBSD and they have caused me at least one hard to debug error. I'm tempted to remove them.
shorthand? which defines are you referring too?
I think he's talking about src/librt/search.h starting around line 132
I have a patch that removes them, could you take a look? https://sourceforge.net/p/brlcad/patches/485/
I just wrote a simple Lua script that expanded the macros in search.c, everything built fine afterwards so I it should be fine
I updated the patch that removes the shorthand definitions
it _should_ work with SVN now
I updated the patch that removes the shorthand definitions
Well done Peter! Now just one more patch ....
if I asked "parallel tessellation?", how hard would everyone laugh? :big_smile:
<snort> depends on what you mean by parallel
Processing multiple objects in parallel - eventually yes. Parallel tessellation of a single NURBS brep object - not anytime soon, and for some stages of the watertight processing that would be quite difficult.
@Erik you mean parallel boolean evaluation (e.g., 'facetize' in parallel)? There's nothing really preventing it. It's actually imminently doable at the NMG layers but nobody has even tried as far as I know.
I'm thinking nmg(bot), single region facetize for stl->gcode. I vaguely recall it was a bit of a hot topic a decade ago and my computers seem to have more cores instead of faster ones lately :)
Yep, still an issue.
Nobody has even tried afaik.
Hello. have a question about central_ray in bundle. I will shoot bunch of rays which r_pt and r_dir taken from a .txt and actually there is no central ray among them. Do I have to add a central_ray to application inside bundle struct, can I add as NULL vect_t ?
Thnks for all your answers
Hello
have a question about central_ray in bundle. I will shoot bunch of rays which r_pt and r_dir taken from a .txt and actually there is no central ray among them. Do I have to add a central_ray to application inside bundle struct, can I add as NULL vect_t ?
Thnks for all your answers
@scorp08 Where do you see a central_ray reference? I'm not familiar with that or seeing anything named that in the bundle interface
Also, are you using the rt_shootrays() interface or the rt_shootray_bundle() interface?
@Sean rt_shootrays(). Filled the xrays structure and list. But I could not decide what to give to struct application.a_ray inside the struct application_bundle. I do not need it actually. xrays is enough for me
Hello @Sean I want to change global coordinate system in the model space by code such that after changing, GUI and model space for example, can show new "x "axis as rotated by 60 degrees of the default x axis . In which file can I define my own global coordinate system ??
@scorp08 Matrix transformations are applied to combinations, and you can have arbitrary nestings of combinations, so the typical way to do that would be a top-level transformation combination that applies the desired rotation.
By large, we've avoided implementing arbitrary coordinate systems as a generic non-combination construct as a matter of consistency and compatibility (it's dealt with better on import/export), but there's actually not been a user feature request for such a thing -- you're welcome to submit something if you can formalize what all your requirements would entail.
By large, we've avoided implementing arbitrary coordinate systems as a generic non-combination construct as a matter of consistency and compatibility (it's dealt with better on import/export), but there's actually not been a user feature request for such a thing -- you're welcome to submit something if you can formalize what all your requirements would entail.
@Sean as first step I would like to test by changing axis and rotate with trial and error to see . Is brlcad coord. system based on opennurbs-rhino (same) ?
Yes, I believe so -- both are a typical right-hand Z-up coordinate system that is typical among CAD software. It's those crazy content modeling systems that use Y-up that are typically rotated on import/export. :)
@Sean thanks. I checked the fun bu_parallel and could not understand one thing. max_psw is defined as 1024 and table [max_psw] is initialized. while calling with npu=0, it returns "bu_avail_cpus" (eg: npu=4). Does it meant that only table[4] is working parallel or 4*table[1024] is working parallel. like 4 cores each has 1024 proccess ?
I did some tests to see time difference for rtshootrays. generate 2 exe, one is defined SHOOTRAYS_IN_PARALLEL and one is not. Checked time using chrono and used 18k shoots. Run in 2 machines (4 cores and 48 cores) Some times parallel run - exe return shorter time than the other but sometimes vice versa. But results are quite similar. Why could not get a speed increase (even not-parallel exe return shorter times)
@scorp08 max_psw is simply the maximum amount of concurrent parallelism that libbu will support, no matter what your hardware will support. You generally should not need to think about that limit as it's rare to find a supercomputer with more parallelism per node.
the ncpu value that you specify to bu_parallel is how many concurrent threads of execution you are requesting. in general, this will be equal to the number of cores, so the default '0' for whatever your hardware has available is usually adequate.
as for testing performance, I would first make sure you have enough work -- 19k shots will be nearly instantaneous on most non-nurbs geometry models. literally sub-second. so that means your program time is going to be dominated by OS initialization, process startup time, model loading, and other factors.
For most models, especially in parallel, you want to profile with somewhere between 100k to 10M rays. That still might only be 1s of work because ray tracing is that fast. If you're not sure, run rt and look at the summary at the end to get an idea of the upper-bound (rt does rt_shootray() plus calls into liboptical, so the timings are an upper bound for rt_shootray()).
If you don't have that many rays, then you'll want to run a profiler like 'perf' on linux to see where your performance bottleneck is at.
For most models, especially in parallel, you want to profile with somewhere between 100k to 10M rays. That still might only be 1s of work because ray tracing is that fast. If you're not sure, run rt and look at the summary at the end to get an idea of the upper-bound (rt does rt_shootray() plus calls into liboptical, so the timings are an upper bound for rt_shootray()).
@Sean defined SHOOTRAYS_IN_PARALLEL macro in bundle.c and run but noticed that "while (rays->ap[i].a_magic != RT_AP_MAGIC)" is not working since I initialize ap with rt_ap_init in my codes and magic num is rt_ap_magic . Instead of "while" , I did a for loop iterating over nrays which is 18k. But this time did not hit anything with again longer time . if ap not have magic num as RT_AP_MAGIC, returning with errors naturally.
Could not solve , feeling that missing a small detail ?
Hey @Sean , should db_apply_state_from_one_member be a public function? At a glance, that looks like part of the librt tree internal implementation...
@starseeker There's a bunch of functions in db_tree.c that shouldn't be public -- looks like nobody has really given it consideration in there. db_apply_state_from_one_member() def looks like an implementation detail to db_follow_path().
Last updated: Jan 09 2025 at 00:46 UTC