there is a bug here. freesolid is iterated, the head element is removed, and then added again, resulting in an infinite loop. i don't know how to fix it, since i don't know what it's actually trying to accomplish
I'm not seeing it -- can you explain? are you referring to RT_G_DEBUG mode on line 738? that can be deleted if it's gotten out-of-date.
FREE_SOLID (https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/include/rt/solid.h#l81) also does a BU_LIST_APPEND of sp (which has just been dequeued) to freesolid (from which it has just dequeued sp)
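To make the loop described above concrete, here is a minimal sketch of the failure mode using a toy circular list. The `Node` type and the `list_*` helpers are hypothetical stand-ins written for this illustration, not the actual `BU_LIST_*` macros or the `FREE_SOLID` code; the iteration cap exists only so the demonstration terminates.

```cpp
#include <cassert>
#include <cstddef>

// Toy circular doubly linked list (hypothetical, not BRL-CAD's bu_list).
struct Node {
    Node *forw;
    Node *back;
};

inline void list_init(Node *head) { head->forw = head->back = head; }
inline bool list_empty(const Node *head) { return head->forw == head; }

// Insert n directly after pos.
inline void list_append(Node *pos, Node *n) {
    n->back = pos;
    n->forw = pos->forw;
    pos->forw->back = n;
    pos->forw = n;
}

// Unlink n from whatever list it is on.
inline void list_dequeue(Node *n) {
    assert(n->forw != nullptr && n->back != nullptr);
    n->forw->back = n->back;
    n->back->forw = n->forw;
    n->forw = n->back = nullptr;
}

// Drain loop that puts every dequeued node back on the SAME list.
// The list never becomes empty, so only the cap stops the loop.
std::size_t drain_with_reappend(Node *head, std::size_t max_iters) {
    std::size_t iters = 0;
    while (!list_empty(head) && iters < max_iters) {
        Node *n = head->forw;
        list_dequeue(n);
        list_append(head, n);  // the problematic step
        ++iters;
    }
    return iters;
}

// Same loop without the re-append: ends after one pass over the nodes.
std::size_t drain(Node *head, std::size_t max_iters) {
    std::size_t iters = 0;
    while (!list_empty(head) && iters < max_iters) {
        list_dequeue(head->forw);
        ++iters;
    }
    return iters;
}
```

With two nodes on the list, `drain_with_reappend` runs until it hits the cap, while `drain` stops after two iterations.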
it's true that RT_G_DEBUG has to be on. i ran an mged test with -x 801 -X 801 and stumbled upon the issue
I think that's code that got out of sync with other code. There almost certainly was other RT_G_DEBUG code that has since been deleted that made it work right. The "fix" is to probably just delete this offending code. We should not be doing manual data-specific pooling (freelists) like that any more. They're crazy slow, incoherent, pointer chasing.
i removed the code in r71109. i ran the tests before committing, everything was fine
regress-rtwizard-rtwiz_m35_D fails on mac, but not on linux. here's the output of pix-cmp -l and here are the different pixels (using imagemagick). any suggestions on how i could debug this? i get the same results on both release and debug
@Sean should i pursue this further, or should i focus on the SoA work for now?
@Cezar if you would like to debug this, please do - it's some relatively recent change. you might have a good chance finding it simply by bisecting commits until you find the culprit.
I'm also seeing the failure on mac.
all of the differences are subtle shading changes at extreme angles (e.g., 0/0/0 vs 2/2/2 rgb values on the tire edges), implying something has changed mathematically somewhere. i have a mini backlog of commits I've not yet peer-reviewed, so I'll try to keep an eye out for likely cause but typically will be something like subtle bounding box change via redefinition of INFINITY, optimized math rounding, error accumulation, etc
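One way to put a number on "subtle shading changes" is to bucket byte differences by magnitude. The sketch below is a hypothetical helper written for this note, not the regression suite's own comparison tool; `DiffSummary`, `compare_bytes`, and the `tolerance` parameter are all assumed names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical summary of a byte-wise comparison of two raw RGB buffers.
struct DiffSummary {
    std::size_t matching = 0;
    std::size_t off_by_few = 0;   // 0 < |a - b| <= tolerance
    std::size_t off_by_many = 0;  // |a - b| > tolerance
};

// Compare the common prefix of two buffers and bucket each byte difference.
DiffSummary compare_bytes(const std::vector<std::uint8_t> &a,
                          const std::vector<std::uint8_t> &b,
                          int tolerance) {
    assert(tolerance >= 0);
    DiffSummary s;
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const int delta =
            std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (delta == 0) {
            ++s.matching;
        } else if (delta <= tolerance) {
            ++s.off_by_few;
        } else {
            ++s.off_by_many;
        }
    }
    return s;
}
```

A 0/0/0 versus 2/2/2 pixel, as on the tire edges, lands entirely in `off_by_few` with a tolerance of 2. A result dominated by that bucket would be consistent with rounding or accumulated floating-point error rather than a geometry change.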
i tried with 7.26.4 from sourceforge, the test still fails. 7.26.0 doesn't build and i didn't try to fix it. it's either not that new, or maybe clang's libc++ could be at fault? i'll try building with -stdlib=libstdc++
i'll try building with clang on linux next, hopefully that uses libc++ and not the gnu one. i tried gcc 8 on mac, but lld can't link stuff (something about unrecognized options), and gnu ld doesn't produce mach-o's
@cezar It's typically better when hunting for bugs to go by commit revisions, not releases, so you can linearly walk back in time until you find when it last worked/broke. That's the bisecting I referred to as it's a typical workflow for both svn and git (though slightly different for each). That said, it's sounding like this might be a little too far down the rabbit hole for you right now, so I'd say just move on and I'll take a look.
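For reference, the bisecting workflow boils down to a binary search over revision numbers. The sketch below assumes a known-good and a known-bad revision and a hypothetical `is_bad` callback, which in practice would check out the revision, build it, and run the failing regression test. It also assumes that once the test breaks it stays broken.

```cpp
#include <cassert>
#include <functional>

// Returns the first revision in (good, bad] for which is_bad is true.
// Preconditions (assumed, not checked by running the test):
// is_bad(good) is false and is_bad(bad) is true.
long first_bad_revision(long good, long bad,
                        const std::function<bool(long)> &is_bad) {
    assert(good < bad);
    while (bad - good > 1) {
        const long mid = good + (bad - good) / 2;
        if (is_bad(mid)) {
            bad = mid;   // breakage is at mid or earlier
        } else {
            good = mid;  // breakage is after mid
        }
    }
    return bad;
}
```

The search needs about log2(bad - good) builds, so a range of a hundred revisions takes at most seven. It only works if a passing revision can be found to start from.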
i know how to do that with revisions, it’s what i did with the flto bug a few days ago, but in this case, i went to r69000 and it was still failing, so i was curious if the release was ok or not :D
would you like me to submit a bug report, maybe, so it’s not forgotten about?
no, don't bother
it won't be forgotten -- it's failing make test, so we can't release until it's fixed
obviously slipped through on 7.26.4, but now we know about it -- you could add a line to the TODO file's top section as those are release blockers
ran into this
mged> search / | /
ERROR: bad pointer 0x5629bc520f18: s/b db_full_path(x64626670), was librt directory(x5551212), file /home/rain/Sync/gsoc/build/svn-master/src/librt/db_fullpath.c, line 264
ERROR: bad pointer 0x5629bc520f18: s/b db_full_path(x64626670), was librt directory(x5551212), file /home/rain/Sync/gsoc/build/svn-master/src/librt/db_fullpath.c, line 264
Saving stack trace to mged-5688-bomb.log
confirmed, I can reproduce it.
@Peter Pronai can you explain to me again why you're thinking the hole syntax needs changing?
OK, r71438 should take care of that crash.
it doesn't necessarily, not with the latest patch
ran into this by clicking around randomly in the manpage viewer in MGED
it keeps changing and stealing focus and i can't check out the Details>> thing or do anything but kill MGED
it basically DOSes one's desktop
no idea what the cause is but i'm more concerned about error messages being able to "DOS"
there should be One (1) error window where all errors are collected and the user can scroll back and stuff
something like Unity's log window is a good example (the game engine)
since that's a bit vague, what happens more precisely is:
the node ID keeps incrementing in the error message, so telling it to skip the message doesn't work because (i assume) each message counts as different due to the different ID
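If that assumption is right, one possible remedy is to match skipped messages on a normalized key instead of the raw text. The sketch below is only an illustration of that idea, not how the MGED or Tk error dialog is implemented; `normalize_message` and `SkipList` are hypothetical names, and the sample messages in the usage note are made up.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Collapse every run of digits to '#', so messages that differ only in an
// incrementing ID map to the same key.
std::string normalize_message(const std::string &msg) {
    std::string key;
    bool in_digits = false;
    for (unsigned char c : msg) {
        if (std::isdigit(c)) {
            if (!in_digits) key += '#';
            in_digits = true;
        } else {
            key += static_cast<char>(c);
            in_digits = false;
        }
    }
    return key;
}

// Remembers which kinds of message the user asked to skip.
class SkipList {
public:
    void skip(const std::string &msg) {
        assert(!msg.empty());  // skipping an empty message is likely a caller bug
        skipped_.insert(normalize_message(msg));
    }
    bool should_show(const std::string &msg) const {
        return skipped_.count(normalize_message(msg)) == 0;
    }
private:
    std::set<std::string> skipped_;
};
```

After `skip("invalid node 17")`, a later `"invalid node 18"` is suppressed as well, while an unrelated message is still shown.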
hmm. does that error appear right away or after clicking on a few man pages?
I can't seem to get that error, although I can get an error when I try to follow a hyperlink...
never right away, i had to click on a number of pages on the left and idk which one triggered it
sometimes it's only ~20 clicks, sometimes a lot more
that may be some quirk in the tkhtml widget - it's still kinda got wires sticking out around the edges in some respects.
it basically DOSes one's desktop
what's going on is you're generating mouse events on that window every time you move the mouse (and there's obviously some problem in the event handler that causes the dialog to pop up). there are probably LOTS of events queued up, and every time you click a button or move the mouse to dismiss the panel, even more events are generated (dozens, maybe hundreds). if you took the window out of full screen and moved it out of the way from the dialog so the mouse doesn't go over it, you could probably clear out all the dialogs (which would clear out all the pending events).
there should be One (1) error window where all errors are collected and the user can scroll back and stuff
completely agree
Open toyjeep.g in mged + e all.
Related to https://github.com/BRL-CAD/brlcad/commit/30e45f823ceca8c37fc7591c2bcf37f628471724 ?
I just tried with latest main and it seems to draw the wireframe - what are the symptoms you're seeing?
Same for me, after https://github.com/BRL-CAD/brlcad/commit/9cd246294876ef3b757914c4e1dbf272b6751fd5 "Apparently we can't depend on cts_s.ts_dbip always being valid - try rti_dbip instead."
BTW, I'm experiencing a build error with the new stepcode:
I'm seeing a kind of chicken-and-egg dilemma in stepcode with expscan.h: the file seems to be needed for its own generation. The work-around is to copy the file from the generated directory in the source tree to the ExpScanner_expscan directory in the binaries tree. It will then be overwritten by the generated one.
@Daniel Rossberg what failure are you seeing trying to draw? Is it crashing? Failing to create the wireframe correctly?
I'll have to try and set up a Debian Stable VM to see if I can reproduce - I've not encountered that stepcode failure mode on any of the testing so far, and I've not seen it from the github runners, so it's a little puzzling... that'll be a few days before I can attempt
starseeker said:
Daniel Rossberg what failure are you seeing trying to draw? Is it crashing? Failing to create the wireframe correctly?
After your commit (I mentioned above), I don't get the error anymore. It should have fixed it. I meant with "same for me" that it draws the wireframe now. Before, I got a bad pointer message.
Ah - I thought you meant "same" as in "no change to error" :-)
@Daniel Rossberg I don't know if you can easily tell, but what in the initial build process is needing expscan.h? I'm wondering how that can be needed on just Debian stable...
It might be that there's something wonky with the perplex/lemon setup, which should be the only time generated gets involved...
expscan.h is included in expparse.y. This is where the error comes from, during the generation phase, I think.
[ 26%] Creating directories for 'STEPCODE_BLD'
[ 26%] No download step for 'STEPCODE_BLD'
[ 26%] No update step for 'STEPCODE_BLD'
[ 26%] No patch step for 'STEPCODE_BLD'
[ 26%] Performing configure step for 'STEPCODE_BLD'
-- STEPCODE_BLD configure command succeeded. See also /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-configure-*.log
[ 26%] Performing build step for 'STEPCODE_BLD'
CMake Error at /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-.cmake:37 (message):
Command failed: 2
'make'
See also
/home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-*.log
-- stdout output is:
Scanning dependencies of target ExpParser_input_cpy
[ 0%] Generating expparse.y.sentinel
[ 1%] Generating expparse.y
[ 1%] Built target ExpParser_input_cpy
[ 1%] [LEMON][ExpParser] Building parser with /home/rossberg/Devel/BRL-CAD/build/brlcad/bin/lemon
Scanning dependencies of target objlib_expparse_c
[ 2%] Building C object src/express/CMakeFiles/objlib_expparse_c.dir/expparse.c.o
-- stderr output is:
In file included from expparse.y:5:
/home/rossberg/Devel/BRL-CAD/brlcad/src/other/ext/stepcode/src/express/parse_data.h:3:10: fatal error: expscan.h: No such file or directory
3 | #include "expscan.h"
| ^~~~~~~~~~~
compilation terminated.
make[5]: *** [src/express/CMakeFiles/objlib_expparse_c.dir/build.make:86: src/express/CMakeFiles/objlib_expparse_c.dir/expparse.c.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:1172: src/express/CMakeFiles/objlib_expparse_c.dir/all] Error 2
make[3]: *** [Makefile:149: all] Error 2
CMake Error at /home/rossberg/Devel/BRL-CAD/build/brlcad/src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build-.cmake:47 (message):
Stopping after outputting logs.
make[2]: *** [src/other/ext/CMakeFiles/STEPCODE_BLD.dir/build.make:130: src/other/ext/STEPCODE_BLD-prefix/src/STEPCODE_BLD-stamp/STEPCODE_BLD-build] Error 1
make[1]: *** [CMakeFiles/Makefile2:15471: src/other/ext/CMakeFiles/STEPCODE_BLD.dir/all] Error 2
make: *** [Makefile:182: all] Error 2
Putting PERPLEX_TARGET before LEMON_TARGET doesn't help.
I've seen an occasional, not reliably reproducible error that seems to be either a missing dependency specification or one of those situations where I/O operations aren't fully completing correctly. If you run "make" again after that error appears by going down into the build directory of the stepcode subbuild and manually running make, does it complete?
@Daniel Rossberg Let me know if 658f8329fc2d works/helps.
I got a clean build now :smiley:
We'll want to make sure it works by running it for a while - if that's the issue I've seen it doesn't always manifest. If it consistently fixes it I'll push it upstream.
I tested it on 3 machines before. On all, I got the same error. I applied your bugfix on two of them already. The builds on both are successful now.
However, solving one problem means running into the next one. This is probably related to https://github.com/BRL-CAD/brlcad/commit/d6980c17fc16c6c694e561f6f8697d2e57d69f0e#diff-0a778b8459b95589232e39c837dc5217b2fb7d691baf5f7bc7601cda19c14dc7.
I open a GED handle with GED_INIT() and close it with ged_close, but the wdbp handed to the init should still be intact after the GED handle is closed. Instead, I'm seeing an error like this:
ERROR: bad pointer 0x55b1fa4ac120: s/b struct db_i(x57204381), was Unknown_Magic(xfa4a8250), file /home/rossberg/Devel/BRL-CAD/brlcad/src/librt/db_open.c, line 352
@Daniel Rossberg Looking at the previous (7.32.6) release (https://github.com/BRL-CAD/brlcad/blob/rel-7-32-6/src/libged/ged.c#L83) it looks like ged_close is calling wdb_close. Also from the 7.32.6 release, GED_INIT is assigning the supplied wdbp to gedp->ged_wdbp (https://github.com/BRL-CAD/brlcad/blob/rel-7-32-6/include/ged/defines.h#L105)
Naively, my expectation would be that a wdbp passed to GED_INIT and then subjected to ged_close wouldn't be expected to be valid? Perhaps I'm missing something...
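The ownership question can be modeled in a few lines. The sketch below assumes, as the 7.32.6 code linked above suggests, that closing the GED handle also closes the wdbp it was initialized with. Every type and function here (`Db`, `Wdb`, `Ged`, and the `*_sketch` functions) is a hypothetical stand-in rather than the librt or libged API, and the magic value is simply copied from the error message.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Magic value copied from the "s/b struct db_i(x57204381)" error message.
constexpr std::uint32_t kDbMagic = 0x57204381;

struct Db  { std::uint32_t magic; };
struct Wdb { Db *dbip; };
struct Ged { Wdb *wdbp; };

bool db_is_valid(const Db *d) {
    return d != nullptr && d->magic == kDbMagic;
}

// Closing invalidates the magic. Real code would also free the memory; the
// sketch keeps the storage alive so the later check stays well defined.
void db_close_sketch(Db *d) {
    if (!db_is_valid(d)) {
        throw std::runtime_error("bad pointer: s/b db_i");
    }
    d->magic = 0;
}

void wdb_close_sketch(Wdb *w) { db_close_sketch(w->dbip); }

// Models a GED close that also closes the wdbp it was initialized with.
void ged_close_sketch(Ged *g) {
    assert(g != nullptr);
    if (g->wdbp) {
        wdb_close_sketch(g->wdbp);
        g->wdbp = nullptr;
    }
}
```

Under that assumption, any later close through the caller's own copy of the pointer trips the magic check, which is the shape of the "bad pointer" report. Either the caller or the GED handle can own the handle, but not both.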
Maybe. I never really tested it. What I did is
resource* resp_db = static_cast<resource*>(bu_calloc(1, sizeof(resource), "resp_db"));
rt_init_resource(resp_db, 0, NULL);
struct db_i* dbip = db_open(fileName, "rw");
db_dirbuild(dbip);
rt_wdb* wdbp_db = wdb_dbopen(dbip, RT_WDB_TYPE_DB_DISK);
rt_i* rtip_db = rt_new_rti(wdbp_db->dbip);
rt_init_resource(resp_db, 0, rtip_db);
I.e., resp_db, rtip_db, and wdbp_db are my variables used for the writable database.
ged* gedp_ged;
BU_GET(gedp_ged, ged);
GED_INIT(gedp_ged, wdbp_db);
ged_close(gedp_ged);
wdb_close(wdbp_db);
rt_free_rti(rtip_db);
rt_clean_resource_complete(0, resp_db);
bu_free(resp_db, "resp_db");
I get the error in rt_free_rti (db_close_client). Maybe, I should make a copy first and hand this over to GED_INIT?
BTW, why does GED_INIT call ged_init twice?
Last updated: Oct 09 2024 at 00:44 UTC