Stream: brlcad

Topic: bugs


view this post on Zulip Cezar (Jul 03 2018 at 21:43):

there is a bug here. freesolid is iterated, the head element is removed, and then added again, resulting in an infinite loop. i don't know how to fix it, since i don't know what it's actually trying to accomplish

view this post on Zulip Sean (Jul 04 2018 at 04:51):

Cezar said:

there is a bug here. freesolid is iterated, the head element is removed, and then added again, resulting in an infinite loop. i don't know how to fix it, since i don't know what it's actually trying to accomplish

I'm not seeing it -- can you explain? are you referring to RT_G_DEBUG mode on line 738? that can be deleted if it's gotten out-of-date.

view this post on Zulip Cezar (Jul 04 2018 at 05:26):

FREE_SOLID (https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/include/rt/solid.h#l81) also does a BU_LIST_APPEND of sp (which has just been dequeued) to freesolid (from which it has just dequeued sp)
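To make the described loop concrete, here is a minimal sketch of the pattern, not BRL-CAD's actual code. `Solid`, `drain`, and the `max_steps` cap are hypothetical names invented for this illustration; a `std::list` stands in for the `bu_list` free list. It only shows why "dequeue the head, then free it onto the same list" can never empty the list.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for a solid on a free list (not BRL-CAD's struct).
struct Solid { int id; };

// Drains `freelist` by repeatedly taking the head off and "freeing" it.
// If freeing puts the element back on the same list (free_reappends == true),
// the list never becomes empty. `max_steps` caps the loop so the demo ends.
// Returns the number of iterations performed.
inline std::size_t drain(std::list<Solid>& freelist, bool free_reappends,
                         std::size_t max_steps)
{
    std::size_t steps = 0;
    while (!freelist.empty() && steps < max_steps) {
        Solid s = freelist.front();
        freelist.pop_front();          // dequeue the head
        if (free_reappends)
            freelist.push_front(s);    // "free" puts it right back
        ++steps;
    }
    return steps;
}
```

With `free_reappends` set to false a three-element list drains in three steps; with it set to true the loop only stops because of the cap.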

view this post on Zulip Cezar (Jul 04 2018 at 05:28):

it's true that RT_G_DEBUG has to be on. i ran an mged test with -x 801 -X 801 and stumbled upon the issue

view this post on Zulip Sean (Jul 04 2018 at 05:52):

I think that's code that got out of sync with other code. There almost certainly was other RT_G_DEBUG code that has since been deleted that made it work right. The "fix" is to probably just delete this offending code. We should not be doing manual data-specific pooling (freelists) like that any more. They're crazy slow, incoherent, pointer chasing.

view this post on Zulip Cezar (Jul 04 2018 at 07:17):

i removed the code in r71109. i ran the tests before committing, everything was fine

view this post on Zulip Cezar (Jul 09 2018 at 12:01):

regress-rtwizard-rtwiz_m35_D fails on mac, but not on linux. here's the output of pix-cmp -l and here are the different pixels (using imagemagick). any suggestions on how i could debug this? i get the same results on both release and debug

view this post on Zulip Cezar (Jul 09 2018 at 12:03):

@Sean should i pursue this further, or should i focus on the SoA work for now?

view this post on Zulip Sean (Jul 10 2018 at 03:26):

Cezar said:

@Sean should i pursue this further, or should i focus on the SoA work for now?

@Cezar if you would like to debug this, please do - it's some relatively recent change. you might have a good chance finding it simply by bisecting commits until you find the culprit.

I'm also seeing the failure on mac.

view this post on Zulip Sean (Jul 10 2018 at 03:30):

all of the differences are subtle shading changes at extreme angles (e.g., 0/0/0 vs 2/2/2 rgb values on the tire edges), implying something has changed mathematically somewhere. i have a mini backlog of commits I've not yet peer-reviewed, so I'll try to keep an eye out for likely cause but typically will be something like subtle bounding box change via redefinition of INFINITY, optimized math rounding, error accumulation, etc
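As a rough illustration of how such subtle differences can be told apart from gross ones, the sketch below buckets per-byte differences between two image buffers. It is an assumption-laden toy, not the comparison tool used in the regression test; `DiffCounts` and `compare_bytes` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical summary of a byte-wise image comparison.
struct DiffCounts {
    std::size_t identical = 0;    // bytes that match exactly
    std::size_t off_by_one = 0;   // bytes that differ by exactly 1
    std::size_t off_by_many = 0;  // bytes that differ by 2 or more
};

// Compares two raw RGB buffers byte by byte over their common length.
inline DiffCounts compare_bytes(const std::vector<std::uint8_t>& a,
                                const std::vector<std::uint8_t>& b)
{
    DiffCounts c;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d == 0)      ++c.identical;
        else if (d == 1) ++c.off_by_one;
        else             ++c.off_by_many;
    }
    return c;
}
```

A 0/0/0 versus 2/2/2 pixel like the one mentioned above would land three bytes in the `off_by_many` bucket, even though the visual change is tiny.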

view this post on Zulip Cezar (Jul 10 2018 at 08:20):

i tried with 7.26.4 from sourceforge, the test still fails. 7.26.0 doesn't build and i didn't try to fix it. it's either not that new, or maybe clang's libc++ could be at fault? i'll try building with -stdlib=libstdc++

view this post on Zulip Cezar (Jul 10 2018 at 08:22):

i'll try building with clang on linux next, hopefully that uses libc++ and not the gnu one. i tried gcc 8 on mac, but lld can't link stuff (something about unrecognized options), and gnu ld doesn't produce mach-o's

view this post on Zulip Sean (Jul 10 2018 at 15:56):

Cezar said:

i tried with 7.26.4 from sourceforge, the test still fails. 7.26.0 doesn't build and i didn't try to fix it. it's either not that new, or maybe clang's libc++ could be at fault? i'll try building with -stdlib=libstdc++

@cezar It's typically better when hunting for bugs to go by commit revisions, not releases, so you can linearly walk back in time until you find when it last worked/broke. That's the bisecting I referred to as it's a typical workflow for both svn and git (though slightly different for each). That said, it's sounding like this might be a little too far down the rabbit hole for you right now, so I'd say just move on and I'll take a look.
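The bisecting idea is just a binary search over revisions. The sketch below is a generic illustration, not an svn or git command; `first_bad_revision` and the `is_bad` predicate (for example "build this revision and run the failing test") are hypothetical. It assumes one known-good and one known-bad revision with a single good-to-bad transition between them.

```cpp
#include <functional>

// Returns the first revision in (good, bad] for which is_bad(rev) is true.
// Assumes is_bad(good) is false, is_bad(bad) is true, and that there is
// exactly one transition from good to bad in between.
inline int first_bad_revision(int good, int bad,
                              const std::function<bool(int)>& is_bad)
{
    while (bad - good > 1) {
        const int mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;    // culprit is at mid or earlier
        else
            good = mid;   // culprit is after mid
    }
    return bad;
}
```

The search needs roughly log2(bad - good) test runs, but it only works once a revision that passes has actually been found; without a known-good endpoint there is nothing to bisect against.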

view this post on Zulip Cezar (Jul 10 2018 at 15:58):

i know how to do that with revisions, it’s what i did with the flto bug a few days ago, but in this case, i went to r69000 and it was still failing, so i was curious if the release was ok or not :D

view this post on Zulip Cezar (Jul 10 2018 at 16:00):

would you like me to submit a bug report, maybe, so it’s not forgotten about?

view this post on Zulip Sean (Jul 10 2018 at 16:01):

no, don't bother

view this post on Zulip Sean (Jul 10 2018 at 16:01):

it won't be forgotten -- it's failing make test, so we can't release until it's fixed

view this post on Zulip Sean (Jul 10 2018 at 16:02):

obviously slipped through on 7.26.4, but now we know about it -- you could add a line to the TODO file's top section as those are release blockers

view this post on Zulip Peter Pronai (Aug 07 2018 at 19:11):

ran into this

mged> search / | /

ERROR: bad pointer 0x5629bc520f18: s/b db_full_path(x64626670), was librt directory(x5551212), file /home/rain/Sync/gsoc/build/svn-master/src/librt/db_fullpath.c, line 264

ERROR: bad pointer 0x5629bc520f18: s/b db_full_path(x64626670), was librt directory(x5551212), file /home/rain/Sync/gsoc/build/svn-master/src/librt/db_fullpath.c, line 264

Saving stack trace to mged-5688-bomb.log
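For readers unfamiliar with this kind of message: "s/b" is "should be", and the check compares a magic number stored at the start of a struct against the one the caller expects. The following is a simplified sketch of that idea, not libbu's implementation. The two magic values are copied from the error above; `Directory`, `FullPath`, and `check_magic` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>

// Magic values as printed in the error message above.
constexpr std::uint32_t MAGIC_FULL_PATH = 0x64626670;  // bytes spell "dbfp" in ASCII
constexpr std::uint32_t MAGIC_DIRECTORY = 0x5551212;

// Hypothetical structs that each begin with a magic number.
struct Directory { std::uint32_t magic = MAGIC_DIRECTORY; };
struct FullPath  { std::uint32_t magic = MAGIC_FULL_PATH; };

// Reads the leading 32-bit magic of *p and compares it with `expected`.
// Returns an empty string on success, otherwise a short diagnostic.
inline std::string check_magic(const void* p, std::uint32_t expected)
{
    std::uint32_t found = 0;
    std::memcpy(&found, p, sizeof found);
    if (found == expected)
        return std::string();
    std::ostringstream os;
    os << "bad pointer: s/b x" << std::hex << expected << ", was x" << found;
    return os.str();
}
```

Passing a `Directory` where a `FullPath` is expected reproduces the shape of the report: the pointer is valid memory, but it points at the wrong kind of object.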

view this post on Zulip starseeker (Aug 08 2018 at 02:15):

confirmed, I can reproduce it.

view this post on Zulip starseeker (Aug 08 2018 at 03:00):

@Peter Pronai can you explain to me again why you're thinking the hole syntax needs changing?

view this post on Zulip starseeker (Aug 08 2018 at 03:03):

OK, r71438 should take care of that crash.

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:27):

it doesn't necessarily, not with the latest patch

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:32):

1533767464.png

ran into this by clicking around randomly in the manpage viewer in MGED

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:33):

it keeps changing and stealing focus and i can't check out the Details>> thing or do anything but kill MGED

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:33):

it basically DOSes one's desktop

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:33):

no idea what the cause is but i'm more concerned about error messages being able to "DOS"

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:34):

there should be One (1) error window where all errors are collected and the user can scroll back and stuff

view this post on Zulip Peter Pronai (Aug 08 2018 at 22:35):

something like Unity's log window is a good example (the game engine)

view this post on Zulip Peter Pronai (Aug 08 2018 at 23:21):

since that's a bit vague, what happens more precisely is:
the node ID keeps incrementing in the error message, so telling it to skip the message doesn't work because (i assume) each message counts as different due to the different ID
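One way a single error log could avoid this is to group messages after masking out the numbers that change. This is only a sketch of that idea under stated assumptions: the actual error text is not shown in this thread, so the sample messages in the usage note are made up, and `normalize` and `ErrorLog` are hypothetical names unrelated to MGED's code.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Replaces every run of decimal digits with a single '#', so messages that
// differ only in an ID collapse to the same key.
inline std::string normalize(const std::string& msg)
{
    std::string out;
    bool in_digits = false;
    for (unsigned char ch : msg) {
        if (std::isdigit(ch)) {
            if (!in_digits)
                out += '#';
            in_digits = true;
        } else {
            out += static_cast<char>(ch);
            in_digits = false;
        }
    }
    return out;
}

// Collects errors in one place and counts repeats per normalized message.
class ErrorLog {
public:
    // Returns true only the first time this kind of message is reported,
    // i.e. the only time a dialog would need to be shown.
    bool report(const std::string& msg) { return ++counts_[normalize(msg)] == 1; }
    std::size_t distinct() const { return counts_.size(); }
private:
    std::map<std::string, std::size_t> counts_;
};
```

With this, a made-up pair such as "node 17 not found" and "node 18 not found" counts as one kind of error, so a "skip" choice would apply to both.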

view this post on Zulip starseeker (Aug 09 2018 at 01:45):

hmm. does that error appear right away or after clicking on a few man pages?

view this post on Zulip starseeker (Aug 09 2018 at 01:47):

I can't seem to get that error, although I can get an error when I try to follow a hyperlink...

view this post on Zulip Peter Pronai (Aug 09 2018 at 02:15):

never right away, i had to click on a number of pages on the left and idk which one triggered it

view this post on Zulip Peter Pronai (Aug 09 2018 at 02:15):

sometimes it's only ~20 clicks, sometimes a lot more

view this post on Zulip starseeker (Aug 09 2018 at 02:16):

that may be some quirk in the tkhtml widget - it's still kinda got wires sticking out around the edges in some respects.

view this post on Zulip Sean (Aug 09 2018 at 06:46):

Peter Pronai said:

it basically DOSes one's desktop

what's going on is you're generating mouse events on that window every time you move the mouse (and there's obviously some problem in the event handler that causes the dialog to pop up). there are probably LOTS of events queued up, and every time you click a button or move the mouse to dismiss the panel, even more events are generated (dozens, maybe hundreds). if you took the window out of full screen and moved it out of the way from the dialog so the mouse doesn't go over it, you could probably clear out all the dialogs (which would clear out all the pending events)..

view this post on Zulip Sean (Aug 09 2018 at 06:46):

Peter Pronai said:

there should be One (1) error window where all errors are collected and the user can scroll back and stuff

completely agree

view this post on Zulip Daniel Rossberg (Oct 13 2022 at 17:56):

Open toyjeep.g in mged + e all.
Related to https://github.com/BRL-CAD/brlcad/commit/30e45f823ceca8c37fc7591c2bcf37f628471724 ?

view this post on Zulip starseeker (Oct 14 2022 at 01:50):

I just tried with latest main and it seems to draw the wireframe - what are the symptoms you're seeing?

view this post on Zulip Daniel Rossberg (Oct 14 2022 at 08:40):

Same for me, after https://github.com/BRL-CAD/brlcad/commit/9cd246294876ef3b757914c4e1dbf272b6751fd5 "Apparently we can't depend on cts_s.ts_dbip always being valid - try rti_dbip instead."

view this post on Zulip Daniel Rossberg (Oct 14 2022 at 08:51):

BTW, I'm experiencing a build error with the new stepcode:

I'm seeing a kind of chicken-or-egg dilemma in stepcode with expscan.h: the file seems to be needed before it has been generated. The work-around is to copy the file from the generated directory in the source tree to the ExpScanner_expscan directory in the binaries tree. It will then be overwritten by the generated one.

view this post on Zulip starseeker (Oct 14 2022 at 11:31):

@Daniel Rossberg what failure are you seeing trying to draw? Is it crashing? Failing to create the wireframe correctly?

view this post on Zulip starseeker (Oct 14 2022 at 11:32):

I'll have to try and set up a Debian Stable VM to see if I can reproduce - I've not encountered that stepcode failure mode on any of the testing so far, and I've not seen it from the github runners, so it's a little puzzling... that'll be a few days before I can attempt

view this post on Zulip Daniel Rossberg (Oct 14 2022 at 11:37):

starseeker said:

Daniel Rossberg what failure are you seeing trying to draw? Is it crashing? Failing to create the wireframe correctly?

After your commit (I mentioned above), I don't get the error anymore. It should have fixed it. I meant with "same for me" that it draws the wireframe now. Before, I got a bad pointer message.

view this post on Zulip starseeker (Oct 14 2022 at 11:37):

Ah - I thought you meant "same" as in "no change to error" :-)

view this post on Zulip starseeker (Oct 14 2022 at 11:40):

@Daniel Rossberg I don't know if you can easily tell, but what in the initial build process is needing expscan.h? I'm wondering how that can be needed on just Debian stable...

view this post on Zulip starseeker (Oct 14 2022 at 11:41):

It might be that there's something wonky with the perplex/lemon setup, which should be the only time generated gets involved...

view this post on Zulip Daniel Rossberg (Oct 14 2022 at 12:11):

expscan.h is included in expparse.y. This is where the error comes from, during the generation phase, I think.

[ 26%] Creating directories for 'STEPCODE_BLD'
[ 26%] No download step for 'STEPCODE_BLD'
[ 26%] No update step for 'STEPCODE_BLD'
[ 26%] No patch step for 'STEPCODE_BLD'
[ 26%] Performing configure step for 'STEPCODE_BLD'
-- STEPCODE_BLD configure command succeeded.  See also /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-configure-*.log
[ 26%] Performing build step for 'STEPCODE_BLD'
CMake Error at /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-.cmake:37 (message):
  Command failed: 2

   'make'

  See also

    /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-*.log


-- stdout output is:
Scanning dependencies of target ExpParser_input_cpy
[  0%] Generating expparse.y.sentinel
[  1%] Generating expparse.y
[  1%] Built target ExpParser_input_cpy
[  1%] [LEMON][ExpParser] Building parser with /home/rossberg/Devel/BRL-CAD/build/brlcad/bin/lemon
Scanning dependencies of target objlib_expparse_c
[  2%] Building C object src/express/CMakeFiles/objlib_expparse_c.dir/expparse.c.o

-- stderr output is:
In file included from expparse.y:5:
/home/rossberg/Devel/BRL-CAD/brlcad/src/other/ext/stepcode/src/express/parse_data.h:3:10: fatal error: expscan.h: No such file or directory
    3 | #include "expscan.h"
      |          ^~~~~~~~~~~
compilation terminated.
make[5]: *** [src/express/CMakeFiles/objlib_expparse_c.dir/build.make:86: src/express/CMakeFiles/objlib_expparse_c.dir/expparse.c.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:1172: src/express/CMakeFiles/objlib_expparse_c.dir/all] Error 2
make[3]: *** [Makefile:149: all] Error 2

CMake Error at /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-.cmake:47 (message):
  Stopping after outputting logs.


make[2]: *** [src/other/ext/CMakeFiles/STEPCODE_BLD.dir/build.make:130: src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build] Error 1
make[1]: *** [CMakeFiles/Makefile2:15471: src/other/ext/CMakeFiles/STEPCODE_BLD.dir/all] Error 2
make: *** [Makefile:182: all] Error 2

view this post on Zulip Daniel Rossberg (Oct 14 2022 at 12:16):

Putting PERPLEX_TARGET before LEMON_TARGET doesn't help.

view this post on Zulip starseeker (Oct 15 2022 at 01:08):

I've seen an occasional, not reliably reproducible error that seems to be either a missing dependency specification or one of those situations where I/O operations aren't fully completing correctly. If you run "make" again after that error appears by going down into the build directory of the stepcode subbuild and manually running make, does it complete?

view this post on Zulip starseeker (Oct 15 2022 at 20:11):

@Daniel Rossberg Let me know if 658f8329fc2d works/helps.

view this post on Zulip Daniel Rossberg (Oct 16 2022 at 15:32):

I got a clean build now :smiley:

view this post on Zulip starseeker (Oct 17 2022 at 01:22):

We'll want to make sure it works by running it for a while - if that's the issue I've seen it doesn't always manifest. If it consistently fixes it I'll push it upstream.

view this post on Zulip Daniel Rossberg (Oct 17 2022 at 13:20):

I tested it on 3 machines before. On all, I got the same error. I applied your bugfix on two of them already. The builds on both are successful now.

view this post on Zulip Daniel Rossberg (Dec 13 2022 at 19:18):

However, solving one problem means running into the next one. This is probably related to https://github.com/BRL-CAD/brlcad/commit/d6980c17fc16c6c694e561f6f8697d2e57d69f0e#diff-0a778b8459b95589232e39c837dc5217b2fb7d691baf5f7bc7601cda19c14dc7.

I open a GED handle with GED_INIT() and close it with ged_close, but the wdbp handed to the init should still be intact after the GED handle was closed. Instead, I'm seeing an error like this

ERROR: bad pointer 0x55b1fa4ac120: s/b struct db_i(x57204381), was Unknown_Magic(xfa4a8250), file /home/rossberg/Devel/BRL-CAD/brlcad/src/librt/db_open.c, line 352

view this post on Zulip starseeker (Dec 14 2022 at 01:07):

@Daniel Rossberg Looking at the previous (7.32.6) release (https://github.com/BRL-CAD/brlcad/blob/rel-7-32-6/src/libged/ged.c#L83) it looks like ged_close is calling wdb_close. Also from the 7.32.6 release, GED_INIT is assigning the supplied wdbp to gedp->ged_wdbp (https://github.com/BRL-CAD/brlcad/blob/rel-7-32-6/include/ged/defines.h#L105)

Naively, my expectation would be that a wdbp passed to GED_INIT and then subjected to ged_close wouldn't be expected to be valid? Perhaps I'm missing something...
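The ownership question can be illustrated with a toy model. This is a sketch built only from the description above (init stores the handle, close also closes it); every name here (`MockDb`, `MockGed`, `mock_wdb_close`, `mock_ged_init`, `mock_ged_close`) is hypothetical and none of it is libged code. Instead of freeing memory, the mock just records the state so a second close can be observed safely.

```cpp
// Hypothetical database handle; `open` stands in for "memory still valid".
struct MockDb {
    bool open = true;
    int close_calls = 0;
};

// Closes the handle. Returns false if it was already closed, which is where
// real code would be touching freed memory.
inline bool mock_wdb_close(MockDb& db)
{
    ++db.close_calls;
    if (!db.open)
        return false;
    db.open = false;
    return true;
}

// Hypothetical GED-like handle that merely stores the pointer it is given.
struct MockGed { MockDb* db = nullptr; };

inline void mock_ged_init(MockGed& g, MockDb& db) { g.db = &db; }

// Closing the GED-like handle also closes the database it was handed.
inline void mock_ged_close(MockGed& g)
{
    if (g.db) {
        mock_wdb_close(*g.db);
        g.db = nullptr;
    }
}
```

Under these assumptions, calling `mock_ged_close` and then `mock_wdb_close` on the same database makes the second call fail, which matches the expectation that the handle is no longer valid once the owner has been closed.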

view this post on Zulip Daniel Rossberg (Dec 16 2022 at 21:56):

Maybe. I never really tested it. What I did is

resource* resp_db = static_cast<resource*>(bu_calloc(1, sizeof(resource), "resp_db"));
rt_init_resource(resp_db, 0, NULL);
struct db_i* dbip = db_open(fileName, "rw");
db_dirbuild(dbip);
rt_wdb* wdbp_db = wdb_dbopen(dbip, RT_WDB_TYPE_DB_DISK);
rt_i* rtip_db = rt_new_rti(wdbp_db->dbip);
rt_init_resource(resp_db, 0, rtip_db);

I.e., resp_db, rtip_db, and wdbp_db are my variables used for the writable database.

ged* gedp_ged;
BU_GET(gedp_ged, ged);
GED_INIT(gedp_ged, wdbp_db);
ged_close(gedp_ged);
wdb_close(wdbp_db);
rt_free_rti(rtip_db);
rt_clean_resource_complete(0, resp_db);
bu_free(resp_db, "resp_db");

view this post on Zulip Daniel Rossberg (Dec 16 2022 at 22:08):

I get the error in rt_free_rti (db_close_client). Maybe, I should make a copy first and hand this over to GED_INIT?

view this post on Zulip Daniel Rossberg (Dec 16 2022 at 22:58):

BTW, why does GED_INIT call ged_init twice?


Last updated: Oct 09 2024 at 00:44 UTC