Stream: brlcad

Topic: bot raytracing crash


starseeker (Aug 27 2024 at 01:57):

Digging into the lint raytracing crash in main... The following code is crashing (bot.c:1480):

          segp->seg_in = hits[i]; /* struct copy */
          trip = (triangle_s *)hits[i].hit_private;
          BOT_UNORIENTED_NORM(ap, &segp->seg_in, trip->face_norm, IN_SEG);

If I look at the memory contents:

(gdb) l
1475
1476        RT_GET_SEG(segp, ap->a_resource);
1477        segp->seg_stp = stp;
1478        segp->seg_in = hits[i]; /* struct copy */
1479        trip = (triangle_s *)hits[i].hit_private;
1480        BOT_UNORIENTED_NORM(ap, &segp->seg_in, trip->face_norm, IN_SEG);
1481        segp->seg_out = hits[i+1];  /* struct copy */
1482        trip = (triangle_s *)hits[i+1].hit_private;
1483        BOT_UNORIENTED_NORM(ap, &segp->seg_out, trip->face_norm, OUT_SEG);
1484        BU_LIST_INSERT(&(seghead->l), &(segp->l));
(gdb) print segp->seg_in
$35 = {hit_magic = 0, hit_dist = 0, hit_point = {0, 0, 0}, hit_normal = {0, 0, 0}, hit_vpriv = {0, 0, 0},
  hit_private = 0x0, hit_surfno = 0, hit_rayp = 0x0}
(gdb) print hits[i]
$36 = {hit_magic = 543713652, hit_dist = 1.9994999999998984, hit_point = {0, 0, 0}, hit_normal = {0, 0, 0},
  hit_vpriv = {-0.99999999999999978, 0.3333333333333337, 0.33333333333331677}, hit_private = 0x55555721dd18,
  hit_surfno = 1705, hit_rayp = 0x7ffde2bff960}
(gdb) print i
$37 = 8
(gdb) print hits[8]
$38 = {hit_magic = 543713652, hit_dist = 1.9994999999998984, hit_point = {0, 0, 0}, hit_normal = {0, 0, 0},
  hit_vpriv = {-0.99999999999999978, 0.3333333333333337, 0.33333333333331677}, hit_private = 0x55555721dd18,
  hit_surfno = 1705, hit_rayp = 0x7ffde2bff960}
(gdb) print trip
$39 = (triangle_s *) 0x0
(gdb) print (triangle_s *)hits[i].hit_private
$40 = (triangle_s *) 0x55555721dd18
(gdb)

Not quite sure what to make of that: segp->seg_in is all zeros and trip is NULL even though hits[i] holds valid data, as if the struct copy failed. If I point trip at the segp->seg_in and segp->seg_out copies instead, I still get the same crash, but with a different message printout:

$ ./bin/mged havoc.g lint r.nos5.bot
Checking for cyclic paths...
Checking for references to non-extant objects...
Checking for invalid objects...
        adding fictitious entry at 1.996570 (r.nos5.bot)
            ray = (14561.3 -813.522 308.446) -> (1 0 0)
        adding fictitious exit at 0.000000 (r.nos5.bot)
            ray = (14561.3 -813.522 308.446) -> (1 0 0)
        adding fictitious entry at 2.950000 (r.nos5.bot)
            ray = (13917.3 -741.636 1243.1) -> (-0.0124236 0.405285 -0.914106)
        adding fictitious exit at -3.659558 (r.nos5.bot)
            ray = (14428.2 819.536 329.255) -> (0.0500219 0.998748 0)
        adding fictitious exit at 0.000000 (r.nos5.bot)
            ray = (14428.2 819.536 329.255) -> (0.0500219 0.998748 0)
        adding fictitious entry at 1262.212986 (r.nos5.bot)
            ray = (13427.9 605.209 1300.15) -> (-0.000665746 1.45872e-06 1)
        adding fictitious entry at 4.712672 (r.nos5.bot)
            ray = (16242.7 -280.706 767.48) -> (-1 0 0)
        adding fictitious entry at 4.712672 (r.nos5.bot)
            ray = (16242.7 -334.541 826.966) -> (1 0 0)
        adding fictitious entry at 2.107612 (r.nos5.bot)
            ray = (16516.7 -200.724 1110.97) -> (-0.196252 0.172898 -0.96519)
        adding fictitious exit at 1.999500 (r.nos5.bot)
            ray = (16516.7 -200.724 1110.97) -> (-0.196252 0.172898 -0.96519)
        adding fictitious exit at 1.999500 (r.nos5.bot)
            ray = (16516.7 -200.724 1110.97) -> (-0.196252 0.172898 -0.96519)
        adding fictitious exit at 1.999500 (r.nos5.bot)
            ray = (16516.7 -200.724 1110.97) -> (-0.196252 0.172898 -0.96519)

If I run it under valgrind, it completes:

==352619== Memcheck, a memory error detector
==352619== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==352619== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==352619== Command: ./bin/mged havoc.g lint r.nos5.bot
==352619==
==352619== Warning: set address range perms: large range [0x59c93000, 0x159c93000) (defined)
==352619== Warning: set address range perms: large range [0x159c93000, 0x259c93000) (defined)
Checking for cyclic paths...
Checking for references to non-extant objects...
Checking for invalid objects...
Found invalid objects:
    r.nos5.bot [bot:close_face,bot:thin_volume]
==352619== Warning: set address range perms: large range [0x159c93000, 0x259c93000) (noaccess)
==352619== Warning: set address range perms: large range [0x59c93000, 0x159c93000) (noaccess)
==352619==
==352619== HEAP SUMMARY:
==352619==     in use at exit: 4,927,346 bytes in 374 blocks
==352619==   total heap usage: 90,470 allocs, 90,096 frees, 37,850,731 bytes allocated

starseeker (Aug 27 2024 at 01:58):

I'm using concurrentqueue.h for parallelism in lint - could that be incompatible somehow with the memory being used for hlbvh?

starseeker (Aug 27 2024 at 02:00):

Maybe I should switch to old-school bu_parallel... I was testing whether we could use that queue to avoid the long tail where a few long-running rays hold up an otherwise parallel ray interrogation, but now I'm thinking that might have been a mistake...
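
The idea of letting idle workers pick up the remaining rays can be sketched without any queue library. What follows is a minimal C++ sketch under stated assumptions, not the lint code from this thread: `shoot_all`, `WorkerState`, and the `shoot` callback are hypothetical names, and a shared atomic counter stands in for concurrentqueue.h. Each worker claims the next unprocessed ray index, so a slow ray ties up only one worker, and each worker writes only to its own state slot.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-worker scratch state. It stands in for whatever a real
// raytracer must keep one-per-thread; it is not a BRL-CAD type.
struct WorkerState {
    std::size_t rays_done = 0;
};

// Call shoot(ray_index, state) once for every index in [0, nrays), spread
// over nworkers threads. Workers pull indices from a shared atomic counter
// instead of receiving a fixed slice up front, so one long-running ray does
// not leave the other workers idle while work remains.
// Returns the per-worker state after all threads have joined.
std::vector<WorkerState> shoot_all(
    std::size_t nrays, std::size_t nworkers,
    const std::function<void(std::size_t, WorkerState &)> &shoot)
{
    assert(nworkers > 0);
    std::atomic<std::size_t> next{0};          // next unclaimed ray index
    std::vector<WorkerState> states(nworkers); // one slot per worker, never shared
    std::vector<std::thread> pool;
    pool.reserve(nworkers);

    for (std::size_t w = 0; w < nworkers; ++w) {
        pool.emplace_back([&, w]() {
            WorkerState &st = states[w];
            for (;;) {
                // fetch_add hands each index to exactly one worker.
                std::size_t i = next.fetch_add(1, std::memory_order_relaxed);
                if (i >= nrays)
                    break;
                shoot(i, st);
                ++st.rays_done;
            }
        });
    }
    for (std::thread &t : pool)
        t.join();
    return states;
}
```

Whether a scheme like this is safe in practice depends on every piece of per-ray scratch state being private to its worker; the thread does not establish which shared state, if any, caused the crash.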

starseeker (Aug 27 2024 at 02:48):

I do wonder about those "fictitious" messages and what's going on there, as well...

starseeker (Aug 27 2024 at 14:55):

Yeah, looks like concurrentqueue.h may have been to blame: after switching to bu_parallel, it runs clean.


Last updated: Oct 09 2024 at 00:44 UTC