Stream: brlcad

Topic: recent bugs


view this post on Zulip Sean (Jul 21 2020 at 04:27):

@starseeker did you happen to look at "Primitive select (mouse behavior) causes drawn solids to disappear instead of being highlighted. When hitting escape (return to normal mouse behavior) all solids reappear." ?

view this post on Zulip starseeker (Jul 21 2020 at 11:33):

That rings a bell with a change Nick made a while back - I'll have to check the history, something about transparency support in MGED. Don't know if it was that specifically, but there was some sort of drawing problem...

view this post on Zulip Sean (Jul 21 2020 at 20:42):

yeah, I thought it was nick's change, but I couldn't confirm it and there was no NEWS entry. probably a set of commits deep down in my inbox I haven't gotten to reviewing yet.

view this post on Zulip Sean (Aug 05 2020 at 02:46):

@starseeker: sorry I should have tested the process I/O change more carefully.

view this post on Zulip Sean (Aug 05 2020 at 02:49):

that is a little unsettling that it broke rt in archer on windows... do you have any more info on why? presumably fileno() is not returning 0/1/2 for stdin/err/out, which should imply something else is seriously wrong.

view this post on Zulip Sean (Aug 05 2020 at 03:03):

it is mildly concerning that the api is assuming 0/1/2 are in/err/out. not just being pedantic. it's going to be terribly difficult to debug out of context, particularly any code that happens to do perfectly normal pipe operations on 0/1/2. likely result in i/o just not working right mysteriously, things failing inconsistently.

view this post on Zulip Sean (Aug 05 2020 at 03:04):

I wouldn't be surprised if the rt-archer breakage wasn't a reverse assumption elsewhere in the code...

view this post on Zulip Sean (Aug 05 2020 at 03:10):

on a mildly related note to your commit comment yesterday, how about just jumping to what we talked about last year -- using capnproto for commands to talk back? the protocol could be just simple err/out log messages for now, set up and handled internal to the bu_process api. then you'd only need one descriptor (e.g., stdout) which could transmit both out+err log messages from the process.
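
A minimal sketch of the single-descriptor idea, assuming a hand-rolled framing rather than capnproto itself: each message carries a one-byte channel tag and a four-byte length, so out and err log messages can share one stream. The names CHAN_OUT, CHAN_ERR, write_msg, read_msgs and roundtrip are hypothetical and are not part of bu_process; Python is used only to keep the illustration short.

```python
import io
import struct

# Hypothetical channel tags; illustrative only, not taken from libbu.
CHAN_OUT = 1
CHAN_ERR = 2

# Header layout: 1-byte channel tag, then 4-byte big-endian payload length.
_HEADER = struct.Struct("!BI")


def write_msg(stream, channel, text):
    """Write one tagged, length-prefixed message to a binary stream."""
    payload = text.encode("utf-8")
    stream.write(_HEADER.pack(channel, len(payload)))
    stream.write(payload)


def read_msgs(stream):
    """Yield (channel, text) pairs until the stream is exhausted.

    Assumes stream.read(n) returns n bytes unless the stream has ended,
    which holds for io.BytesIO and for buffered binary pipes.
    """
    while True:
        header = stream.read(_HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < _HEADER.size:
            raise EOFError("truncated message header")
        channel, length = _HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated message payload")
        yield channel, payload.decode("utf-8")


def roundtrip(messages):
    """Encode messages into an in-memory buffer and decode them again."""
    buf = io.BytesIO()
    for channel, text in messages:
        write_msg(buf, channel, text)
    buf.seek(0)
    return list(read_msgs(buf))
```

The reader can then route each message by its tag instead of by which descriptor it arrived on.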

view this post on Zulip starseeker (Aug 05 2020 at 12:43):

When I spot checked, fileno(stdin) returned -2 on Windows.

view this post on Zulip starseeker (Aug 05 2020 at 12:44):

I switched to using an enum to specify which channel we're intending to use, so hopefully that will alleviate the issue?
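
Roughly what the enum approach buys, as a sketch: the caller names the channel it wants and the implementation maps that name to whatever handle the platform provides, so nothing depends on fileno() returning 0/1/2. The Channel enum and channel_stream helper are hypothetical, and Python's subprocess module stands in here for the bu_process code, which is C.

```python
import enum
import subprocess
import sys


class Channel(enum.Enum):
    """Which of the child's standard streams the caller means."""
    STDIN = "stdin"
    STDOUT = "stdout"
    STDERR = "stderr"


def channel_stream(proc, channel):
    """Return the parent-side pipe object for the requested channel.

    The lookup is by name, so no descriptor number is ever assumed.
    """
    stream = getattr(proc, channel.value)
    if stream is None:
        raise ValueError(f"{channel.name} was not opened as a pipe")
    return stream


if __name__ == "__main__":
    # Spawn a child that writes one short string to each output stream.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write('o'); sys.stderr.write('e')"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    print(channel_stream(child, Channel.STDOUT).read())
    print(channel_stream(child, Channel.STDERR).read())
    child.wait()
```

Asking for a channel that was never opened (STDIN in the demo above) raises an error immediately rather than silently reading from the wrong descriptor.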

view this post on Zulip starseeker (Aug 05 2020 at 12:46):

The rt-archer breakage (when I debugged) was the code testing fileno on Windows never getting an expected value for any of the three inputs (stdin/stdout/stderr) and simply giving up - that put us in an anomalous position because to archer it looked like the subprocess wasn't returning any info at all on any valid channel.

view this post on Zulip starseeker (Aug 05 2020 at 12:47):

I'd like to try the capnproto approach, but I'm not sure how easy/hard that will be to get working - it's not an area of programming I'm terribly familiar with, so there'd likely be a significant spin-up cost.

view this post on Zulip starseeker (Aug 05 2020 at 12:50):

I'm trying to excise bu_list from as much as possible of the libged drawing layer, both to make it easier to understand what the various pieces are doing and as a step towards being able to more easily use the bu_magic mechanism to validate things getting passed around as void *. That leads of course to the vlist and solid containers... another learning experience, but one I can no longer avoid if I'm going to really be able to follow what libtclcad/archer are doing about drawing.

view this post on Zulip starseeker (Aug 05 2020 at 12:51):

The gsh bits hooking up callbacks were simply trying to set up so I could get a simpler-to-debug (i.e. non-Tcl) method of executing subprocess commands for testing.

view this post on Zulip starseeker (Aug 05 2020 at 13:05):

@Sean I'm sure you've got quite a bit more expertise than I do, so I'd appreciate any insights, but what it's looking like to me so far:

  1. capnproto will allow us to serialize/de-serialize information being sent over the stdin/stdout/stderr channels, allowing for richer communication, but it doesn't itself solve the Inter-Process Communication problem.

  2. Since we've done well over the years with stdin/stdout/stderr piped IPC (which is what allows the MGED/rt connections to work cross platform) that seems like a good way to keep going, with capnproto being used to structure the I/O so we can easily/safely move much more than we currently do over those channels (right now the most complex communication I know of is rtcheck, which uses stdout and stderr for text/vlist drawing information and simply assumes what is coming back over each channel.)

  3. If we're going to use stdin/stdout/stderr, we don't actually want the parent process to do what I've currently got gsh doing, which is to periodically check in a thread whether any new input has arrived for processing. Rather, we would want to do an event based setup where action is triggered when the subprocess sends something down the pipe. I went with the simplest thing that got what I needed working for gsh, but that's not what we want/need to do long term.

  4. Setting up events based on activity in the I/O channels appears to be very very platform specific. I've been hunting around trying to find a small encapsulation of the necessary logic, and so far have come up pretty dry.
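
To make points 3 and 4 concrete, here is a sketch of the event-driven style using Python's selectors module in place of a polling thread. The collect_output helper is hypothetical and is not what gsh does. It also illustrates the platform problem: on Windows the underlying select() call only works on sockets, so this code as written is POSIX-only.

```python
import os
import selectors
import subprocess
import sys


def collect_output(cmd):
    """Run cmd and gather stdout/stderr data as each pipe becomes readable."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ, "out")
    sel.register(proc.stderr, selectors.EVENT_READ, "err")
    chunks = {"out": [], "err": []}
    open_streams = 2
    while open_streams:
        # Blocks until a pipe has data or has been closed by the child;
        # there is no timer-driven check for new input.
        for key, _events in sel.select():
            # Read from the raw descriptor to bypass Python-level buffering.
            data = os.read(key.fd, 4096)
            if data:
                chunks[key.data].append(data)
            else:
                # An empty read means the child closed this pipe.
                sel.unregister(key.fileobj)
                open_streams -= 1
    sel.close()
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return b"".join(chunks["out"]), b"".join(chunks["err"])


if __name__ == "__main__":
    out, err = collect_output(
        [sys.executable, "-c",
         "import sys; print('hi'); sys.stderr.write('warn')"])
    print(out, err)
```

In a GUI application the blocking sel.select() call would be replaced by registering the descriptors with the toolkit's own event loop, which is the part that differs per platform.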

view this post on Zulip starseeker (Aug 05 2020 at 13:18):

ASIO (https://think-async.com/Asio/asio-1.16.1/doc/asio/overview.html) has support for file descriptors and HANDLEs, but doesn't (so far as I can tell) wrap both mechanisms under one API we could use. (That by the way also appears to be what happens in Tcl, which is why we have the file-descriptor/Tcl_Channel ifdef for the tclcad I/O callbacks. I tried once to consolidate that into just Tcl_Channel, but it didn't work on Linux...)

Chromium's IPC solves a similar problem (https://www.chromium.org/developers/design-documents/inter-process-communication) but it's not stand-alone and it's not clear to me if it would adapt easily to what we need.

view this post on Zulip starseeker (Aug 05 2020 at 13:20):

https://source.chromium.org/chromium/chromium/src/+/master:ipc/

view this post on Zulip starseeker (Aug 05 2020 at 13:24):

In some ways I'm actually tempted to see what it would take to extract the Tcl bits for defining these particular events and I/O management into libbu - it's proven to work, and so far I've not come across any simple, stand-alone drop-in alternative...

view this post on Zulip starseeker (Aug 05 2020 at 13:47):

Hmm... looking again I see the capnproto code does seem to have IPC logic, but I can't tell if they can work without the socket APIs...

view this post on Zulip starseeker (Aug 05 2020 at 13:57):

Ah, there it is... looks like we use pipe if we go with "kj::newOneWayPipe()"

view this post on Zulip starseeker (Aug 05 2020 at 13:58):

/me 's brain hurts when bending it in purely C++ directions... oof.

view this post on Zulip starseeker (Aug 05 2020 at 14:00):

OK, so the question is - can gsh be made to work using capnproto for IPC, events and content?

view this post on Zulip starseeker (Aug 05 2020 at 14:02):

Or does this need to be wired in at the libbu subprocess management level for things like reading and writing?

view this post on Zulip starseeker (Aug 05 2020 at 14:10):

We need to be able to allow the Tcl_Event loop to manage the callback invocation for MGED/Archer to allow current behavior...

view this post on Zulip Sean (Aug 05 2020 at 14:27):

starseeker said:

When I spot checked, fileno(stdin) returned -2 on Windows.

That's interesting. It makes sense on Windows because there isn't a standard input descriptor set up by default for GUI apps on Windows unless it's a console application.

view this post on Zulip Sean (Aug 05 2020 at 14:29):

That probably means something else opened up an input pipe -- and that code wherever it is didn't register it as stdin. Probably is the same bug for out/err too.

view this post on Zulip Sean (Aug 05 2020 at 14:39):

you're right that capnproto doesn't solve IPC by itself because it turns it into an RPC solution. I actually wouldn't recommend going down that route until you're ready to abandon IPC because of the obvious performance implications. from a technique perspective, though, RPC is quite a bit simpler than dealing with cross-platform IPC.

view this post on Zulip Sean (Aug 05 2020 at 14:41):

an alternative to capn that might be worth trying instead is zeromq -- it supports in-process (inter-thread communication) and inter-process (IPC, ports) communication in addition to capn-style benefits for the data being exchanged.

view this post on Zulip Sean (Aug 05 2020 at 14:51):

Used asio a while back and wouldn't recommend it -- it's really meant for async client/server communication (e.g., pkg alternative) aside from it pulling in the boost ecosystem.

view this post on Zulip Sean (Aug 05 2020 at 14:55):

Boost.interprocess (https://www.boost.org/doc/libs/1_63_0/doc/html/interprocess.html) would be the one that does what you're needing for ports, but I'd still try 10 other things before pulling in boost myself... :)

view this post on Zulip Sean (Aug 05 2020 at 14:59):

there are a bunch of ways to do IPC and lots have wrapped it, so I'm sure you can find one that works. some of them may just rely on a particular IPC method and that'll require changing code a little bit. for example, this one (https://github.com/jarikomppa/ipc/) uses shared memory. so instead of using fwrite, calls get changed to sprintf since shared memory works like a malloc'd buffer with both sides of the pipe able to read/write that memory.

view this post on Zulip Sean (Aug 05 2020 at 15:15):

starseeker said:

In some ways I'm actually tempted to see what it would take to extract the Tcl bits for defining these particular events and I/O management into libbu - it's proven to work, and so far I've not come across any simple, stand-alone drop-in alternative...

I'd be cool with that! It really isn't much code that we're talking about. Only issue would be that it's essentially the same problem of consolidating to Tcl_Channel. From what I saw in the code, there's no reason it shouldn't work on linux, so there's almost certainly some other mistaken assumption going on somewhere in the code and until that assumption is found and eliminated, none of these solutions are going to work.

view this post on Zulip Sean (Aug 05 2020 at 15:17):

It's very much related to the concern I have with using 0/1/2 integers and assuming they are a particular port. I don't know if it's related to this specific problem, but this is the kind of problem that causes. Really hard to debug without unwinding the port from creation to destruction on both sides of the port.

view this post on Zulip Sean (Aug 05 2020 at 15:19):

starseeker said:

OK, so the question is - can gsh be made to work using capnproto for IPC, events and content?

No, it'd be the other way around -- you would adapt gsh to capnproto rpc approach instead of events and ipc. It'd look different, but it can work (just not as performant as IPC).

view this post on Zulip Sean (Aug 05 2020 at 15:21):

starseeker said:

Or does this need to be wired in at the libbu subprocess management level for things like reading and writing?

and when I say "adapt gsh" that doesn't preclude this belonging in libbu. libbu would ideally provide a call like subprocess_write and subprocess_read or something similar to abstract from the method underneath. The benefit of file descriptors is trying to avoid needing to do that so you can just use read/write or sprintf/sscanf.

view this post on Zulip starseeker (Aug 05 2020 at 15:24):

FWIW, there is a stand-alone ASIO that doesn't need boost...

view this post on Zulip starseeker (Aug 05 2020 at 15:26):

The Tcl refactor is in some ways the most incremental change, assuming it doesn't turn gnarly - even if we eventually opt for another solution, that has the advantage of knowing exactly what it should do if the migration is successful (since it's already working in place.)

view this post on Zulip starseeker (Aug 05 2020 at 15:30):

This is probably an embarrassing question, but what are the implications of abandoning IPC for RPC? I thought RPC was just one form of IPC?

view this post on Zulip Sean (Aug 05 2020 at 15:34):

So I know that's a lot and probably talking through too many issues to make sense of it all. In summary, I would recommend 1) trying again to consolidate to Tcl_Channel, as whatever is making that not work likely will affect other solutions until the assumption is inadvertently ripped out, 2) trying one of the many wrapped options like shared memory or named pipes.. starting with zeromq or a simpler header-only one, and finally 3) switching to RPC for libged with Capnproto after you've abandoned hope on IPC. ;)

view this post on Zulip starseeker (Aug 05 2020 at 15:36):

/me nods - the Tcl_Channel thing bothered me last time, and if I take it far enough apart to digest it for extraction I should be able to run it to ground one way or the other.

view this post on Zulip starseeker (Aug 05 2020 at 15:37):

Not to mention squashing another WIN32 ifdef... those are getting hard to remove these days...

view this post on Zulip Sean (Aug 05 2020 at 15:38):

starseeker said:

This is probably an embarrassing question, but what are the implications of abandoning IPC for RPC? I thought RPC was just one form of IPC?

Only embarrassing questions are the ones not asked. IPC uses a specific operating system method for allowing two processes to exchange data. typical examples are files (and file descriptors), named pipes, shared memory, message passing, and sockets. each method has significant implications on how you set up communication and how data is exchanged which is to say it's not generally possible to create a generic IPC interface that uses different methods.

view this post on Zulip Sean (Aug 05 2020 at 15:39):

you typically find one method that is implemented on different platforms similarly wrapped by a library

view this post on Zulip Sean (Aug 05 2020 at 15:40):

RPC for example is typically associated with the message passing form of IPC and message passing typically relies on the socket method of IPC data exchange

view this post on Zulip Sean (Aug 05 2020 at 15:41):

MPI is the elephant example of RPC/IPC

view this post on Zulip starseeker (Aug 05 2020 at 15:42):

So capnproto's RPC API won't guarantee a specific method of communication (say, pipes vs. sockets) even if it sometimes uses pipes under the hood?

view this post on Zulip Sean (Aug 05 2020 at 15:43):

RPC is an IPC method, but it's more strongly associated with sockets and that's what I was referring to when I mentioned "abandoning IPC" .. which really was "abandon file/pipe method of IPC"

view this post on Zulip Sean (Aug 05 2020 at 15:44):

starseeker said:

So capnproto's RPC API won't guarantee a specific method of communication (say, pipes vs. sockets) even if it sometimes uses pipes under the hood?

I don't know for sure, but when I was reading their docs, I didn't see any support for file/pipe-based methods, only socket-based methods

view this post on Zulip starseeker (Aug 05 2020 at 15:44):

https://github.com/capnproto/capnproto/blob/master/c%2B%2B/src/kj/async-io-unix.c%2B%2B#L1700

view this post on Zulip Sean (Aug 05 2020 at 15:44):

whereas zeromq explicitly calls them all out

view this post on Zulip Sean (Aug 05 2020 at 15:45):

yeah, that code would lead me to believe capnproto may also support pipes

view this post on Zulip Sean (Aug 05 2020 at 15:45):

I just didn't see any examples (didn't look very hard either though)

view this post on Zulip starseeker (Aug 05 2020 at 15:47):

(maybe?) https://github.com/capnproto/capnproto/blob/master/c%2B%2B/src/kj/async-io-test.c%2B%2B#L311

view this post on Zulip Sean (Aug 05 2020 at 15:47):

again, though, nearly every method is going to require adopting a data exchange method, whether that's reading/writing on pipes/files (this is your closest fit currently) or reading/writing on sockets (this is typical in client+server apps) or reading/writing buffers of memory

view this post on Zulip starseeker (Aug 05 2020 at 15:47):

The capnproto docs leave a bit to be desired, IMHO... at least for newbies

view this post on Zulip starseeker (Aug 05 2020 at 15:48):

/me nods

view this post on Zulip starseeker (Aug 05 2020 at 15:49):

Hmm... ZeroMQ is LGPLv3 and looks like they're working towards an MPL2 relicense. OK, that's workable...

view this post on Zulip Sean (Aug 05 2020 at 15:49):

So yeah, looks like capn can -- "As of version 0.4, the only supported way to communicate between threads is over pipes or socketpairs."

view this post on Zulip starseeker (Aug 05 2020 at 15:51):

/me would ideally prefer to avoid getting user bug reports that parts of the application can't talk to each other...

view this post on Zulip Sean (Aug 05 2020 at 15:51):

capn notes in https://capnproto.org/encoding.html that he adopted streaming as the data exchange method (implying file/pipe or socket method, not message passing or shared memory)

view this post on Zulip Sean (Aug 05 2020 at 15:55):

looks like https://capnproto.org/cxx.html has more specific detail, under Messages and I/O

view this post on Zulip Sean (Aug 05 2020 at 15:56):

still unclear if he has helpers that wrap the communication pipe setup

view this post on Zulip Sean (Aug 05 2020 at 15:57):

if he doesn't, that might be a case for something like that header only lib that used a shared memory method -- looks like you can just point capn to it

view this post on Zulip starseeker (Aug 05 2020 at 21:22):

@Sean one other note about capnproto - if we do adopt it, it bumps our minimum required C++ to C++14. Personally I'm OK with that, but I wanted to raise it in case it's of concern to you.

view this post on Zulip Sean (Aug 05 2020 at 21:26):

I’m okay with it for this, if it solves the need of communication with her commands. I would probably hesitate elsewhere but capnproto has compelling capability.

view this post on Zulip Sean (Feb 10 2021 at 05:45):

User reported a bug launching MGED on Windows 7 64-bit. They found an article mentioning a KB update, but that didn't fix it apparently. Any ideas?
147093015_10160483351802542_1761671887773690017_n.jpg

view this post on Zulip starseeker (Feb 10 2021 at 12:43):

Windows 7 is too old for that function: https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getsystemtimepreciseasfiletime

view this post on Zulip starseeker (Feb 10 2021 at 12:44):

src/libbu/datetime.c

view this post on Zulip starseeker (Feb 10 2021 at 12:46):

Maybe we could use https://stackoverflow.com/a/27856440 to implement this?

view this post on Zulip starseeker (Feb 10 2021 at 12:48):

I didn't realize there were still any users on Windows 7. If https://en.wikipedia.org/wiki/Windows_7 has it right even extended support ended January 2020.

view this post on Zulip Sean (Feb 24 2021 at 17:01):

@starseeker related to earlier discussion, this appears to be a consistent hard crasher: mged> search ./ebm.r /pnts.r

ERROR: bad pointer 0x7ffe677688f8: s/b db_full_path(x64626670), was librt directory(x5551212), file /Users/morrison/brlcad.trunk/src/librt/db_fullpath.c, line 264

ERROR: bad pointer 0x7ffe677688f8: s/b db_full_path(x64626670), was librt directory(x5551212), file /Users/morrison/brlcad.trunk/src/librt/db_fullpath.c, line 264

view this post on Zulip Sean (Feb 24 2021 at 17:02):

mixing relative and full path terms apparently makes it unhappy

view this post on Zulip starseeker (Feb 24 2021 at 18:12):

r78317 should fix it.

view this post on Zulip Sean (Feb 25 2021 at 17:45):

Cool. Now if only I could figure out why that resulted in zombie processes ... haven't seen them in ages!

view this post on Zulip Sean (Apr 19 2021 at 18:53):

@starseeker any ideas: https://sourceforge.net/p/brlcad/bugs/394/

view this post on Zulip starseeker (Apr 19 2021 at 18:55):

Whoa. Those are wacky looking.

view this post on Zulip starseeker (Apr 19 2021 at 18:57):

Almost looks like they're trying to move a binary from one system to another incompatible system, but if I'm reading that correctly it's the result of building and running on the same machine?

view this post on Zulip Erik (Apr 19 2021 at 19:12):

it's a power8, not an x86, if that means anything

view this post on Zulip starseeker (Apr 19 2021 at 19:13):

Those errors look like it's not finding system libs correctly

view this post on Zulip Sean (Apr 20 2021 at 16:10):

starseeker said:

Almost looks like they're trying to move a binary from one system to another incompatible system, but if I'm reading that correctly it's the result of building and running on the same machine?

Yes, they appear to have compiled it themselves with BRLCAD_BUNDLED_LIBS=ON.. any response we can give them? The archer error looks like a tcl/tk 8.6 error...

view this post on Zulip starseeker (Apr 20 2021 at 16:26):

/me shakes head - for the OpenGL bit all I could suggest is they try different drivers (maybe the modern Mesa gallium software rasterizing setup) and for the Archer bit I guess my first thought would be to see if bwish can run.

view this post on Zulip Sean (Apr 20 2021 at 17:59):

From the backtrace, it looks like they're already using Mesa. Being on Power9, their options are probably limited to Mesa or straight X. MGED did work with OpenGL disabled.

The archer failure is a little more concerning... as "hv3::formmanager" is from us. Main reason I can think for an "invalid command name" on it would be because the tclIndex or pkgIndex.tcl didn't get created/loaded.. which might imply something is wrong in our tcl/tk build system.

view this post on Zulip starseeker (Apr 20 2021 at 18:18):

That's why I was wondering what bwish does - it will pull in most of the packages but not hv3 (which is the web viewer, iirc) so that might help scope what's wrong.

view this post on Zulip Sean (Jul 21 2021 at 05:48):

@starseeker Getting:

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
XOPENGL_glu_LIBRARY (ADVANCED)
    linked by target "dm-ogl" in directory /home/sean/brlcad.main/src/libdm/glx

Is that new? This is a default build on a remote Linux system (that may or may not have glu, but I presume it doesn't and the logic isn't taking that into account correctly to disable X).

view this post on Zulip starseeker (Jul 21 2021 at 13:13):

It may be a side effect of my refactor a while back to contain the X OpenGL logic. Does 320af5adad96f fix it?

view this post on Zulip Sean (Jul 21 2021 at 14:31):

checking, I just yanked the glu line and it worked for me, but suspected a better fix.

view this post on Zulip starkaiser (Jul 21 2021 at 15:38):

Hi everyone! So I have been learning BRL-CAD for the past couple of weeks, but I have encountered a bug in archer that I was unable to solve. I use Linux Mint 20.2. I have downloaded the source from main and built it. All the tests passed with no error, but when using snap grid in archer I get an error. I tried to look at the code, but I don't know tcl and can't solve it. I also tried to build BRL-CAD on FreeBSD 13.0, but I get the same error in archer. Snap grid works in MGED though. Has anyone else got this error? archer_error.png

view this post on Zulip bch (Jul 21 2021 at 18:47):

@starkaiser is this simple for you to reproduce?

view this post on Zulip bch (Jul 21 2021 at 18:55):

I’d do a couple things:
1) find out what $v and $c are, for curiosity's sake; before the first if in the vscale proc, put something like:
set fh [open ~/starkaiser_debug.log a]; puts $fh "$v // $c"; chan close $fh

view this post on Zulip bch (Jul 21 2021 at 18:58):

This will (not surprisingly) open a log file in your homedir w the contents of v and c. The last line (which will have caused the crash) is the most interesting

view this post on Zulip starkaiser (Jul 21 2021 at 19:46):

Peek-2021-07-21-22-19.gif The error appears for both edges and faces for all arbs. I have added those lines and the values in the log file are:
0 0 0 // 1.000000
-nan -nan -nan // 1.000000
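
As a debugging aid for values like these, a small sketch of a finiteness check and a guarded normalization. One common way NaN components arise in C is dividing a zero-length vector by its own magnitude (0/0), but whether that is what happens in Archer here is an assumption and is not established in this thread. The helpers parse_vec, is_finite_vec and unitize are hypothetical, written in Python for brevity, and are not BRL-CAD functions.

```python
import math


def parse_vec(text):
    """Parse a whitespace-separated vector such as '0 0 0' or '-nan -nan -nan'."""
    return [float(tok) for tok in text.split()]


def is_finite_vec(v):
    """True only if every component is a finite number (no NaN, no infinity)."""
    return all(math.isfinite(x) for x in v)


def unitize(v, tol=1e-12):
    """Return v scaled to unit length, or None if v is too short to normalize.

    Returning None makes the degenerate case explicit instead of letting
    a 0/0 division propagate NaN into later calculations.
    """
    mag = math.sqrt(sum(x * x for x in v))
    if mag < tol:
        return None
    return [x / mag for x in v]
```

Checking is_finite_vec at the point where the vector is produced, rather than where it is consumed, tends to narrow down which calculation first went wrong.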

view this post on Zulip bch (Jul 21 2021 at 20:34):

The “-nan” gives us a specific clue. I’ve got a general patch candidate I’ll work on later that may be useful. Hopefully this isn’t a show-stopper for you…

view this post on Zulip starkaiser (Jul 21 2021 at 20:39):

Thank you! I have tried to solve it myself, but I have only managed to hop around different files, trying to understand the code. Mged works fine, so I'm using it to learn the basics

view this post on Zulip bch (Jul 21 2021 at 20:42):

I personally really like mged. Hopefully you come to enjoy it too. Definitely a “learning wall” associated with it, but it pays off

view this post on Zulip bch (Jul 21 2021 at 20:50):

@starkaiser - which version of brl-cad are you running?

view this post on Zulip starkaiser (Jul 21 2021 at 21:05):

7.32.3 I think. I compiled the latest versions from the main repository on github because older stable versions were giving me other errors, more serious. I compiled the version that I am currently using two days ago

view this post on Zulip starseeker (Jul 21 2021 at 21:10):

@starkaiser Just so you're aware - you're well into the "not-well-tested" aspects of the software interacting with Archer for geometry creation. MGED will usually be the more stable of the interfaces, since Archer doesn't currently get as much use/attention.

view this post on Zulip starseeker (Jul 21 2021 at 21:13):

Cool that you're digging into it - Archer's also got some of our most advanced GUI features (File->Open can trigger some of the converters to open other file types, for example.)

view this post on Zulip bch (Jul 21 2021 at 21:19):

@starkaiser - ok, so you’re working w the tip of main, then. I can do same

view this post on Zulip bch (Jul 21 2021 at 21:21):

@starseeker - I’m looking at some general fixes that may have knock-on effects. We’ll see how it turns out 🧐

view this post on Zulip Sean (Jul 22 2021 at 07:52):

starseeker said:

It may be a side effect of my refactor a while back to contain the X OpenGL logic. Does 320af5adad96f fix it?

That did the trick for libdm, but now there's another error from libpng being built wrong: undefined reference to `brl_png_init_filter_functions_vsx'

Looking at the png source code, it looks like we're missing all the platform-specific subdirs where that function comes from.

view this post on Zulip Sean (Jul 22 2021 at 07:53):

Can you fix that?

view this post on Zulip starseeker (Jul 22 2021 at 14:43):

What platform triggers the failure? Surprisingly, I've not encountered that error before...

view this post on Zulip starseeker (Jul 22 2021 at 15:14):

It looks like the POWERPC platform is defining PNG_FILTER_OPTIMIZATIONS. Unless we need that defined, my inclination would be to simply not define it.

view this post on Zulip starseeker (Jul 22 2021 at 15:16):

If I'm interpreting this correctly, PNG_FILTER_OPTIMIZATIONS is for platform specific optimization logic and there is a generic fallback we can use.

view this post on Zulip starseeker (Jul 22 2021 at 15:35):

@Sean does d4ecd86137fa fix it?

view this post on Zulip Sean (Jul 23 2021 at 05:16):

how does one clear out the src/other/ext builds?? make clean isn't working...

view this post on Zulip Sean (Jul 23 2021 at 05:17):

Can we get make clean fixed?.. if make builds it, make clean should still clear it.

view this post on Zulip Sean (Jul 23 2021 at 05:21):

curiously distclean appears to have deleted files I would have thought it had no business deleting, and still left bin/osdemo and src/other/libosmesa

view this post on Zulip Sean (Jul 23 2021 at 05:44):

starseeker said:

It looks like the POWERPC platform is defining PNG_FILTER_OPTIMIZATIONS. Unless we need that defined, my inclination would be to simply not define it.

But why is it customized? I would think it's far less complexity and risk to drop in as vanilla as strictly possible. It seems to be just a few files, so I can't see an argument for space/complexity/savings. No idea what the runtime implications are.

Plus, it's an impediment to upgrades and there's a real risk of cost... which it now incurred. I mean, between all the builds, rebuilding, inspecting, having to shift what I was doing elsewhere, trying again, I've now spent at least 4 hours unproductively because of it. :(

I have to hope there was some need or benefit beyond tidying up files? A benefit that saves us time and effort?

view this post on Zulip Sean (Jul 23 2021 at 05:45):

starseeker said:

Sean does d4ecd86137fa fix it?

That does seem to have fixed it. Thank you.

view this post on Zulip starseeker (Jul 23 2021 at 13:16):

Sean said:

Can we get make clean fixed?.. if make builds it, make clean should still clear it.

My understanding of how ExternalProject_Add works suggests that this will be difficult. ExternalProject builds are decoupled from the primary CMake logic internally, and each individual project's logic isn't even guaranteed to define a clean target at all. CMake doesn't provide a lot of good options for customizing the "make clean" target as far as I know...

view this post on Zulip starseeker (Jul 23 2021 at 13:19):

Sean said:

curiously distclean appears to have deleted files I would have thought it had no business deleting, and still left bin/osdemo and src/other/libosmesa

That is curious - latest main doesn't do that for me using a build folder. Are you doing a build from the src dir? It doesn't reproduce for me there either...

view this post on Zulip starseeker (Jul 23 2021 at 13:28):

Sean said:

I have to hope there was some need or benefit beyond tidying up files.? A benefit that saves us time and effort?

Looks like I did that back in 2016 (r68360) when I was trying to scrub everything we don't need out of src/other to reduce our overall tarball size.

view this post on Zulip starseeker (Jul 23 2021 at 13:37):

For src/other/ext, the obvious thing to try would be to define some custom logic to be executed by the ExternalProject_Add build steps that generates a list of all files added by the build step that the parent build could then remove, but I don't know of a way to customize the clean target in the parent CMake build to that degree.

view this post on Zulip starseeker (Jul 23 2021 at 13:44):

I think these folks have a similar issue: https://github.com/klee/klee/issues/718

view this post on Zulip starseeker (Jul 23 2021 at 13:44):

We could probably produce a clean-ext target that would invoke the needed steps...

view this post on Zulip Sean (Jul 24 2021 at 06:31):

starseeker said:

My understanding of how ExternalProject_Add works suggests that this will be difficult. ExternalProject builds are decoupled from the primary CMake logic internally, and each individual project's logic isn't even guaranteed to define a clean target at all. CMake doesn't provide a lot of good options for customizing the "make clean" target as far as I know...

From my reading, e.g., https://cmake.org/pipermail/cmake/2012-February/049208.html it looks like cmake is supposed to run clean on ExternalProjects. That person is trying to stop the behavior I'm expecting.

It didn't look like any of them cleaned, even if they had a cmake build..

view this post on Zulip Sean (Jul 24 2021 at 06:35):

starseeker said:

That is curious - latest main doesn't do that for me using a build folder. Are you doing a build from the src dir? It doesn't reproduce for me there either...

Nope, I'm using a build folder. I don't know what it'd do in a src tree. It was a straight up fresh cloning, cmake in build, and make calls, then make distclean when clean didn't clear out libpng. I was testing your libpng change, but couldn't get it to recompile again even with the file edited, so tried to make clean which failed, then distclean -- which left turds.

view this post on Zulip Sean (Jul 24 2021 at 06:51):

starseeker said:

Looks like I did that back in 2016 (r68360) when I was trying to scrub everything we don't need out of src/other to reduce our overall tarball size.

Oof. :(

Yeah, I don't think we should keep that then, long term, especially as upgrades happen. That was a really expensive impact..

Also, that edit requires a human in the loop at all future upgrade points (i.e., more time) and docs/knowledge of the edits complicating upgrades. That in turn puts us in a position where upgrades are resisted (e.g., gdal, opennurbs, stepcode, ...). Not a healthy pattern.

view this post on Zulip Sean (Jul 24 2021 at 06:54):

Of course, opennurbs has other reasons, so that one's not entirely equivalent, but it is a bit involved to upgrade in part because of cullings (in addition to our code edits).

view this post on Zulip starseeker (Jul 24 2021 at 13:32):

I saw those emails, but I'm wondering if the behavior they describe is out of date - I don't think I've ever seen the ExternalProject_Add builds follow a make clean...
exttest.tar.gz
I made a small test (attached) and the behavior I'm seeing here on a make clean is that p1 is removed, but bin/p2 is intact.

view this post on Zulip starseeker (Jul 24 2021 at 13:34):

Sean said:

 tried to make clean which failed, then distclean -- which left turds.

Which files were left after the distclean? One possibility is that if there are files left from older build states, distclean based on updated CMakeLists.txt files won't know it needs to remove them...

view this post on Zulip starseeker (Jul 24 2021 at 14:02):

Sean said:

Also, that edit requires a human in the loop at all future upgrade points (i.e., more time) and docs/knowledge of the edits complicating upgrades. That in turn puts us in a position where upgrades are resisted (e.g., gdal, opennurbs, stepcode, ...). Not a healthy pattern.

My hope is that once it is properly matured, the new src/other/ext approach to building will make vanilla upstreams more practical. Since the new logic (so far at least) is capable of replicating the CMake RPath magic without needing all up build system replacements, the incentive to clean up the third party directories goes down.

When I'm having to write and/or maintain the build systems myself, those messy directories are a problem - that was the other reason I was stripping them down, to make it easier to understand what I had to write build logic for. When I had to do major build system work on third party deps, all future upgrades needed a human in the loop anyway, so the simplification was an overall win. I'm trying to let the dust settle on the src/other/ext system before I introduce the additional complication of swapping in things like the upstream GDAL build, and I also wanted as much in the way of automated cross platform testing in place as possible before trying that step. (I haven't had bandwidth to do it anyway, but even if I had I would have been hesitant to pile the native build systems on top of everything else.)

Even if we get to completely vanilla src/other/ext, there's still going to be some disincentive to disrupt things by upgrading those deps. I'm thinking it might be helpful if we go ahead and break ext into its own git repo and add it as a submodule. If git will support this, we could set it up as follows:

  1. Point the src/other/ext submodule in the main repo to an equivalent of "STABLE" in the ext repo by default. (I.e., an out of the box recursive checkout of brlcad will pull a known working ext configuration.)

  2. In the ext repo, we can then upgrade third party deps without impinging on brlcad itself. We can then test both the ext build and its integration with the parent brlcad build by checking out different versions of ext within the brlcad checkout.

  3. If the above works (and I'm not 100% sure if it can, I need to do some experimenting with submodules) we might even be able to set up CI on the ext repo to do continual, ongoing merge and integration testing with upstream repos like GDAL that are also on github. We might define branches like ACTIVE (used by brlcad), STABLE (holds the latest released version of each of the deps, should match ACTIVE in most situations unless we need to patch a stable release for a CVE or some such), TESTING (used to keep an eye on the latest development versions of all the deps, expected to break regularly), and STAGING (similar to testing but where we can adjust problematic upstream versions back a bit if we need to to keep everything else going.)

view this post on Zulip Sean (Jul 28 2021 at 13:06):

starseeker said:

Sean said:

 tried to make clean which failed, then distclean -- which left turds.

Which files were left after the distclean? One possibility is that if there are files left from older build states, distclean based on updated CMakeLists.txt files won't know it needs to remove them...

Like I said, that was the entirety of its existence, so no prior build states, no git pulls besides the edit you made to fix the png issue. It was a clean checkout, cmake + make + cmake + make (tried diff compiler), and eventually make clean, then make distclean, which then left bin/osdemo and src/other/libosmesa with a few files in there.

view this post on Zulip Sean (Jul 28 2021 at 13:19):

The plan you describe sounds good except I would suggest we keep ext branching simpler. Having to hunt for which branch has the deps that works would be a bit .. frustrating to say the least.

I'd think we just have main track ext main and STABLE track ext STABLE and leave it at that for starters. I.e., what worked for the last release, and whatever is currently needed for main. This would make main be your ACTIVE+TESTING+STAGING branches and it'd be on us to make branches while testing risky efforts, but without any required formality beyond main and STABLE. I like the idea of possibly having main track ext STABLE for some added stability, but I could go either way. I'd hope any instability is very short lived.

view this post on Zulip scorp08 (Jul 29 2021 at 04:52):

vanilla ? :)

view this post on Zulip starkaiser (Jul 30 2021 at 16:35):

starkaiser said:

Peek-2021-07-21-22-19.gif The error appears for both edges and faces for all arbs. I have added those lines and the values in the log file are:
0 0 0 // 1.000000
-nan -nan -nan // 1.000000

I just downloaded and compiled the 7.32.4 release and now the snap to grid mode in Archer works great!

view this post on Zulip starseeker (Jul 30 2021 at 17:41):

The 7.32.4 release is based on older code, and doesn't incorporate most of the changes in main

view this post on Zulip starseeker (Jul 30 2021 at 17:42):

The most likely problem for issues in main is refactoring work I was doing to shift logic down the library stacks (primarily out of libtclcad, but also some out of libged into lower layers.)

view this post on Zulip starseeker (Jul 30 2021 at 17:42):

I may have missed a step when moving the snapping functions.

view this post on Zulip starseeker (Jul 30 2021 at 17:44):

I haven't gotten to the editing modes yet - they'll be one of the very last things to shift to the Qt GUI, because of the amount of work involved - and so the snapping behaviors haven't yet been tested post refactor (or rather, your Archer test has served as an inadvertent test).

view this post on Zulip starseeker (Aug 25 2021 at 14:32):

@Sean do we have a standard way to get SSIZE_MAX on Windows? limits.h doesn't seem to have it...

view this post on Zulip starseeker (Aug 25 2021 at 17:15):

Hmm. The only other use in our code is an ifndef test in common.h...

view this post on Zulip Sean (Sep 01 2021 at 02:28):

belated sorry about that, but I saw your fix and seems good enough. there's not a standard way other than including limits.h which we already do.
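A minimal sketch of the kind of fallback being discussed, assuming the goal is only to have some imposed upper bound for sanitizing tainted sizes. The fallback definition and the helper names `size_is_sane` and `to_signed_count` are hypothetical; this is not the definition used in common.h.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical fallback for platforms whose limits.h lacks SSIZE_MAX:
// use the maximum of the signed counterpart of size_t.
#ifndef SSIZE_MAX
#  define SSIZE_MAX (std::numeric_limits<std::make_signed<std::size_t>::type>::max())
#endif

// Reject sizes above the imposed limit (e.g. values read from untrusted input).
inline bool size_is_sane(std::size_t n)
{
    return n <= static_cast<std::size_t>(SSIZE_MAX);
}

// Convert an already validated size to a signed count.
inline long long to_signed_count(std::size_t n)
{
    assert(size_is_sane(n));
    return static_cast<long long>(n);
}
```

Any other limit would serve equally well here, since the check only needs a fixed bound to compare against.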

view this post on Zulip Sean (Sep 01 2021 at 02:29):

I could probably key off some other limit as it just needs to be some imposed limit to satisfy the cert/stig issue, tainted input sanitization

view this post on Zulip Sean (Nov 30 2021 at 08:14):

Both ogl and X framebuffer appear to be non-functional (on Mac) ... not sure since when as I've been in a different section of the code, but ogl fails and X crashes.

view this post on Zulip starseeker (Nov 30 2021 at 19:31):

Looks like it's bcc41b5798 that's causing the fb issues, at least for X.

view this post on Zulip starseeker (Dec 01 2021 at 02:16):

@Sean Not sure about ogl, but I think I addressed the X issue. I don't see an ogl failure on Linux...

view this post on Zulip Sean (Jan 08 2022 at 06:37):

oh, didn't report back on this until now, but the ogl issue never went away... still a hard failure on mac, no archer, no /dev/ogl

view this post on Zulip Sean (Jan 08 2022 at 06:40):

fbhelp reports:

=============== Current Selection ================
bu_shmget failed, errno=22
bu_shmget: Invalid argument
ogl_getmem:  Unable to attach to shared memory, using private
fb_ogl_open: double buffering not available. Using single buffer.
Assertion failed: (glx_dpy), function __glXSendError, file ../src/glx/glx_error.c, line 44.

Should look like this:

=============== Current Selection ================
ogl_getmem: shmget failed, errno=22
ogl_getmem:  Unable to attach to shared memory.
Description: Silicon Graphics OpenGL
Device: /dev/ogl
Max width height: 16384 16384
Default width height: 512 512
Usage: /dev/ogl[option letters]
   p   Private memory - else shared
   l   Lingering window
   t   Transient window
   d   Suppress dithering - else dither if not 24-bit buffer
   c   Perform software colormap - else use hardware colormap if possible
   s   Single buffer -  else double buffer if possible
   b   Fast pan and zoom using backbuffer copy -  else normal
   D   Don't update screen until fb_flush() is called.  (Double buffer sim)
   z   Zap (free) shared memory.  Can also be done with fbfree command

Current internal state:
    mi_doublebuffer=1
    mi_cmap_flag=0
    ogl_nwindows=1
X11 Visual:
    TrueColor: Fixed RGB maps, pixel RGB subfield indices
    RGB Masks: 0xff0000 0xff00 0xff
    Colormap Size: 256
    Bits per RGB: 8
    screen: 0
    depth (total bits per pixel): 24

view this post on Zulip Sean (Jan 08 2022 at 06:43):

also, definitely seeing some regression in the conversion code. was doing an obj-g conversion, was giving me some new errors -- checked against a rando prior release (7.30 I think) and prior succeeded where current main does not (completes, but results in bad/flipped faces).

view this post on Zulip Sean (Jan 08 2022 at 06:44):

Here's a visual example, left is prev, right is curr: image.png

view this post on Zulip Sean (Jan 08 2022 at 06:46):

Here's that geometry if you want to see if you can track it down.. PoliceLifterSpeed.obj

view this post on Zulip Sean (Jan 08 2022 at 06:46):

Shouldn't need this but just in case: PoliceLifterSpeed.mtl

view this post on Zulip starseeker (Jan 11 2022 at 14:44):

Can you double check what version succeeded? I've tried a number of 7.30 obj-g conversions, and so far they all produce the bad geometry here.

view this post on Zulip starseeker (Jan 11 2022 at 14:49):

7.28.2 shows bad as well

view this post on Zulip Sean (Jan 14 2022 at 07:22):

Oof, I thought I grabbed 7.30, but it's looking like if I just opened the .g file, then it would have fired up a 7.24 release.

view this post on Zulip Sean (Jan 14 2022 at 07:24):

looks like it might have been 7.26 actually

view this post on Zulip Sean (Jan 14 2022 at 07:25):

So I guess the good news is it's not a release blocker, thanks for checking it

view this post on Zulip Sean (Jan 27 2022 at 05:28):

@starseeker not sure if it’s recent but cmake summary has a blank entry for Iwidgets. I took a look but couldn’t follow the logic, appeared to be handled differently from the other _BUILD vars. Would you take a look? TCL is ON, Tk is Disabled, Itcl/Itk is ON (Itcl only), and Iwidgets is blank.

This is a server / no-X build on Ubuntu.

view this post on Zulip starseeker (Jan 27 2022 at 13:10):

@Sean got it, thanks - was missing an else case in the iwidgets.cmake flow

view this post on Zulip starseeker (Jan 27 2022 at 13:11):

(my real motivator for the Qt work - get rid of all the Tcl/Tk build logic ;-) )

view this post on Zulip Sean (Feb 14 2022 at 16:48):

@starseeker Here's one of the errors that a couple of them got:
warning: error while sourcing archer_launch.tcl: couldn't read file "tclscripts/archer/itk_redefines.tcl": no such file or directory

view this post on Zulip Sean (Feb 15 2022 at 03:39):

@starseeker I've run into that particular itk_redefines.tcl error before as well, if that helps. I'm not sure of the conditions, but it doesn't seem to interfere with the build as much as it did on Windows

view this post on Zulip Sean (Feb 15 2022 at 03:40):

I did get a trace on the all-apps-crashing again bug -- it appears to be something inside libdm during application shutdown. Valgrind is pointing at some unknown symbols in that library:

--11359-- Discarding syms at 0x107420000-0x107474000 in /Users/morrison/brlcad.main/.build/lib/libdm.20.0.1.dylib (have_dinfo 1)
--11359-- Discarding syms at 0x1074cc000-0x1074d4000 in /Users/morrison/brlcad.main/.build/lib/libpkg.20.0.1.dylib (have_dinfo 1)
--11359-- Discarding syms at 0x1092e4000-0x1092f0000 in /Users/morrison/brlcad.main/.build/libexec/dm/libdm-ps.dylib (have_dinfo 1)
==11359== Jump to the invalid address stated on the next line
==11359==    at 0x107446636: ???
==11359==    by 0x10744657E: ???
==11359==    by 0x10743CA42: ???
==11359==    by 0x7FFF2071ED24: ??? (in /dev/ttys000)
==11359==    by 0x7FFF2071F00F: ??? (in /dev/ttys000)
==11359==    by 0x7FFF2080AF43: ??? (in /dev/ttys000)
==11359==  Address 0x107446636 is not stack'd, malloc'd or (recently) free'd

view this post on Zulip Sean (Feb 15 2022 at 03:41):

Actually, if that output is strictly ordered, that may be the issue. It's unloaded libdm, but then goes to unload dm/libdm-ps.dylib and a call is made into libdm...

view this post on Zulip starseeker (Feb 15 2022 at 03:42):

Um. How do we control that ordering?

view this post on Zulip Sean (Feb 15 2022 at 03:44):

I could be wrong on that interpretation, but it def appears to be something libdm plugin-related

view this post on Zulip starseeker (Feb 15 2022 at 03:45):

For the itk_redefines.tcl error, the question I have is whether share/tclscripts/archer/itk_redefines.tcl is present - if not, it may be a missing dependency on the cp target for that file, if it is then it's something about the paths in the Tcl environment.

view this post on Zulip starseeker (Feb 15 2022 at 03:45):

Can we check if there are any undefined symbols in the so files?

view this post on Zulip starseeker (Feb 15 2022 at 03:46):

Let me see if I can make the target for copying that file an explicit dependency of archer...

view this post on Zulip Sean (Feb 15 2022 at 03:46):

I'll see if I can trigger the itk_redefines.tcl error, and check -- it was pretty consistent for me for a while, but I'd been ignoring it

view this post on Zulip Sean (Feb 15 2022 at 03:47):

which so files? libdm-plot or libdm or ?

view this post on Zulip starseeker (Feb 15 2022 at 03:47):

the libdm-ps.dylib file, if that's the one factoring into the crash above

view this post on Zulip Sean (Feb 15 2022 at 03:47):

I'm not seeing where/why in the code the plugins have any code that would be getting called

view this post on Zulip Sean (Feb 15 2022 at 03:48):

there's no apparent atexit handler. oh, maybe dlopen registers one.. that might be.

view this post on Zulip starseeker (Feb 15 2022 at 03:48):

If it's the unloading code, a quick check would be to comment out the unloading bits in libdm_clear

view this post on Zulip starseeker (Feb 15 2022 at 03:49):

in dm_init.cpp

view this post on Zulip Sean (Feb 15 2022 at 03:49):

(base) morrison@agua .build % nm libexec/dm/libdm-ps.dylib
                 U _Tcl_AppendStringsToObj
                 U _Tcl_DuplicateObj
                 U _Tcl_GetObjResult
                 U _Tcl_SetObjResult
                 U ___stack_chk_fail
                 U ___stack_chk_guard
00000000000100d0 d __dyld_private
                 U _bu_calloc
                 U _bu_free
                 U _bu_log
                 U _bu_vls_addr
                 U _bu_vls_free
                 U _bu_vls_init
                 U _bu_vls_printf
                 U _bu_vls_sprintf
                 U _bu_vls_strcpy
0000000000010550 b _disp_mat
000000000000b770 T _dm_plugin_info
0000000000010138 d _dm_ps
0000000000010150 d _dm_ps_impl
                 U _draw_Line3D
                 U _fclose
                 U _fflush
                 U _fopen
                 U _fprintf
                 U _fputs
00000000000106a8 s _head_ps_vars
                 U _memcpy
                 U _memset
00000000000104d0 b _mod_mat
                 U _null_String2DBBox
                 U _null_SwapBuffers
                 U _null_beginDList
                 U _null_configureWin
                 U _null_doevent
                 U _null_drawDList
                 U _null_drawPoint3D
                 U _null_drawPoints3D
                 U _null_endDList
                 U _null_freeDLists
                 U _null_genDLists
                 U _null_getDisplayImage
                 U _null_loadPMatrix
                 U _null_makeCurrent
                 U _null_openFb
                 U _null_reshape
                 U _null_setDepthMask
                 U _null_setLight
                 U _null_setTransparency
                 U _null_setZBuffer
000000000000c010 s _pinfo
0000000000009120 t _ps_close
000000000000b6a0 t _ps_debug
000000000000b290 t _ps_draw
00000000000092f0 t _ps_drawBegin
0000000000009360 t _ps_drawEnd
0000000000009af0 t _ps_drawLine2D
0000000000009c00 t _ps_drawLine3D
0000000000009c60 t _ps_drawLines3D
0000000000009cf0 t _ps_drawPoint2D
0000000000009940 t _ps_drawString2D
0000000000009d60 t _ps_drawVList
0000000000010690 b _ps_drawVList.fin
0000000000010650 b _ps_drawVList.last
0000000000010670 b _ps_drawVList.start
0000000000009430 t _ps_hud_begin
00000000000094a0 t _ps_hud_end
0000000000009510 t _ps_loadMatrix
000000000000b700 t _ps_logfile
0000000000008310 t _ps_open
00000000000104c0 b _ps_open.count
000000000000b3f0 t _ps_setBGColor
000000000000b340 t _ps_setFGColor
000000000000b480 t _ps_setLineAttr
000000000000b540 t _ps_setWinBounds
00000000000100e0 d _ps_usage
00000000000092a0 t _ps_viable
00000000000105d0 b _psmat
                 U _setbuf
                 U _sscanf
                 U _vclip
                 U dyld_stub_binder

view this post on Zulip starseeker (Feb 15 2022 at 03:50):

U is undefined?

view this post on Zulip Sean (Feb 15 2022 at 03:51):

oh, there it is! I was looking for whatever was triggering the unloading... plain as day in dm_init.cpp

view this post on Zulip Sean (Feb 15 2022 at 03:51):

yep

view this post on Zulip Sean (Feb 15 2022 at 03:52):

oh, I wonder if this is that age-old STL issue...

view this post on Zulip starseeker (Feb 15 2022 at 03:58):

0a1de42aee may help with the Archer bit

view this post on Zulip Sean (Feb 15 2022 at 04:00):

testing a fix for the dlclose issue, and I'm coincidentally getting the itk_redefines issue so I'll update here when I can and test

view this post on Zulip Sean (Feb 15 2022 at 04:01):

may be related to this huge blather: WARNING - bu_dir's bin value is set to ., but binary being run is located in /Users/morrison/brlcad.main/.build. This probably means you are running btclsh from a non-install directory with BRL-CAD already present in . - be aware that .tcl files from . will be loaded INSTEAD OF local files. Tcl script changes made to source files for testing purposes will not be loaded, even though btclsh will most likely 'work'. To test local changes, either clear ., specify a different install prefix (i.e. a directory without BRL-CAD installed) while building, or manually set the BRLCAD_ROOT environment variable.

view this post on Zulip starseeker (Feb 15 2022 at 04:02):

Urm. I've seen that too, but didn't seem to trigger the issue for me. However, that shouldn't be happening, so I'll see if I can take a quick look...

view this post on Zulip Sean (Feb 15 2022 at 04:02):

my change appears to have fixed the dynamic unloading crashes

view this post on Zulip Sean (Feb 15 2022 at 04:02):

I'll push that up here in a sec

view this post on Zulip Sean (Feb 15 2022 at 04:15):

okay pushed.. I think what was going on is because the iterator was registering ABC and then closing ABC .. and badness was happening. now unloads in reverse order, so ABC->CBA, and that appears to have resolved whatever dependency tracking badness was going on. I suspect it's either plugins that refer to other plugins (thus needing to be in order) or the dynamic linker doing recursive reference counting and thinking it was done with libdm as the dlclose() plugins were unloaded and references got updated.

view this post on Zulip Sean (Feb 15 2022 at 04:19):

on an unrelated note, we're probably going to need to set up code signing before our next major release. that's apparently the solution to all the firewall triggers that go off every build. Looks like these guys have a module: https://github.com/Monetra/libmonetra/blob/master/CMakeModules/CodeSign.cmake

view this post on Zulip Sean (Feb 15 2022 at 04:20):

cpack appears to have some built-in stuff too for what it produces, though that doesn't address the build tree like that module seems to

view this post on Zulip starseeker (Feb 15 2022 at 04:30):

Where do we get a certificate?

view this post on Zulip starseeker (Feb 15 2022 at 04:38):

@Sean c7cd672c8 appears to break Linux:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fe6ea5 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x00007ffff7fe6ea5 in ?? () from /lib64/ld-linux-x86-64.so.2
#1  0x00007ffff74e98b8 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd300, operate=<optimized out>,
    args=<optimized out>) at dl-error-skeleton.c:208
#2  0x00007ffff74e9983 in __GI__dl_catch_error (objname=0x55555558b830, errstring=0x55555558b838,
    mallocedp=0x55555558b828, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:227
#3  0x00007ffff39dab59 in _dlerror_run (operate=operate@entry=0x7ffff39da420 <dlclose_doit>, args=0x6) at dlerror.c:170
#4  0x00007ffff39da468 in __dlclose (handle=<optimized out>) at dlclose.c:46
#5  0x00007ffff4bd8a9a in bu_dlclose (handle=0x6) at /home/cyapp/brlcad/src/libbu/dylib.c:66
#6  0x00007ffff48ad034 in libdm_clear () at /home/cyapp/brlcad/src/libdm/dm_init.cpp:201
#7  0x00007ffff48ad654 in libdm_initializer::~libdm_initializer (this=0x7ffff48e8590 <LIBDM>,
    __in_chrg=<optimized out>) at /home/cyapp/brlcad/src/libdm/dm_init.cpp:217
#8  0x00007ffff73d015e in __cxa_finalize (d=0x7ffff48e6c20) at cxa_finalize.c:83
#9  0x00007ffff48970f7 in __do_global_dtors_aux () from /home/cyapp/brlcad-build/lib/libdm.so.20
#10 0x00007fffffffdd90 in ?? ()

view this post on Zulip starseeker (Feb 15 2022 at 04:39):

Don't tell me Mac wants it one way and Linux the other...

view this post on Zulip Sean (Feb 15 2022 at 05:01):

it indeed is working much better for me, but that 0x6 handle in your stack there is suspicious. I think I may have been careless with the iterator.

view this post on Zulip Sean (Feb 15 2022 at 05:02):

yeah, end() shouldn't be valid and that's where I made it start. surprisingly works...
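A minimal sketch of the reverse-order unload loop, assuming handles are recorded in load order. Dereferencing `end()` is undefined behavior, which fits the bogus 0x6 handle in the backtrace; reverse iterators avoid touching it. The `Handle` type and `unload_order` function are hypothetical stand-ins, and the real code would call bu_dlclose() where the comment indicates. Note that a std::set iterates in key order, so walking a set of pointers backwards gives reverse address order rather than reverse load order; a std::vector is used here to keep load order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a plugin handle (the real code stores pointers).
using Handle = int;

// Return the order in which handles would be closed: last loaded, first closed.
std::vector<Handle> unload_order(const std::vector<Handle> &loaded)
{
    std::vector<Handle> order;
    // rbegin()/rend() walk the container backwards without dereferencing end().
    for (auto it = loaded.rbegin(); it != loaded.rend(); ++it) {
        order.push_back(*it);  // real code would close the handle here
    }
    assert(order.size() == loaded.size());
    return order;
}
```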

view this post on Zulip Sean (Feb 15 2022 at 05:05):

fixing.

view this post on Zulip Sean (Feb 15 2022 at 05:17):

you're good enough to catch mistakes like that! heh. it was an invalid loop! bogus handle was a dead give-away..

view this post on Zulip Sean (Feb 15 2022 at 05:26):

I feel first response should never be to inject platform identifiers... revert if needed, or at least give me a chance to fix it... see if latest is any better.

view this post on Zulip starseeker (Feb 15 2022 at 13:43):

@Sean confirmed, that's got it

view this post on Zulip Sean (Feb 15 2022 at 14:15):

@starseeker bad news is that the crash isn't gone... declared victory too soon. there's still something very distinctly wrong in the loading/unloading..

view this post on Zulip starseeker (Feb 15 2022 at 14:17):

That's weird. As I recall that code is pretty straightforward - load on initialize, unload on exit. Not sure where to go hunting for trouble...

view this post on Zulip starseeker (Feb 15 2022 at 14:18):

unless the bu_dlopen/bu_dlclose wrappers are missing something maybe?

view this post on Zulip starseeker (Feb 15 2022 at 14:23):

@Sean I don't know if it helps any, but src/libbu/tests/dylib is intended to be a small, self-contained testing of that mechanism...

view this post on Zulip starseeker (Feb 15 2022 at 14:25):

@Sean do you know when this started? (i.e. has it been doing it ever since the dm/ged plugin work, or did some more recent change kick it off?)

view this post on Zulip starseeker (Feb 15 2022 at 14:29):

To me the strangest thing is that neither the local mac here nor the CI runners seem to be exhibiting it. And the CI build for the mac indicates it's running the ASCII to .g conversions, which (at least on the Linux box here) did trigger crashing when the unloading wasn't working.

view this post on Zulip Sean (Feb 15 2022 at 15:06):

It's the same behavior I've been seeing for months, I think since the dm/ged plugin work. It doesn't appear to be 100% deterministic as it seems to depend what symbols are in use, implying it's involving the dynamic linker and when a particular symbol or set of symbols are encountered.

view this post on Zulip Sean (Feb 15 2022 at 15:09):

It doesn't appear to affect more complicated apps that call lots of symbols (e.g., mged or gcv, etc) as much (or at least as visibly). Seems to be most noticeable on a handful of smaller simpler apps that essentially do nothing (but still load and unload nearly everything), and every now and then on something more complicated.

view this post on Zulip Sean (Feb 15 2022 at 15:13):

I think there's possibly something fundamental in play here (like the ordering) and Mac happens to be provoking it. When I watch the binary's DYLD loading/unloading, there is some strangeness going on. The libged plugins are loading, and then it loads dm and its dependency libraries. It appears to be choking up when it goes to unload dm and friends.

view this post on Zulip starseeker (Feb 15 2022 at 15:21):

Anything I can do to help?

view this post on Zulip Sean (Feb 15 2022 at 15:48):

Honestly, I'm not sure.. I just got a clue that it may be related to dm-X and dm-ogl, and the latter is still fully busted on Mac for me -- so that might be something you could check on -- if mged, archer, and such work for you on mac from a build dir. If it works, we can maybe trace backwards to figure out where things diverge.

view this post on Zulip starseeker (Feb 15 2022 at 15:50):

OK, I'll check on that. I know qged worked with Qt6 on mac, but I didn't try archer and I'm not sure if qged was doing its swrast fallback or not...

view this post on Zulip Sean (Feb 15 2022 at 15:50):

What I think I'm seeing is that apps that already link libdm and/or X11 have no problem. It's when an app doesn't use either, but then libged loads libged-dm.dylib, which loads libdm and all its deps. It's when libdm's deps get unloaded that it segfaults.

view this post on Zulip Sean (Feb 15 2022 at 15:50):

which makes sense since an app using libdm is not going to unload it

view this post on Zulip Sean (Feb 15 2022 at 15:51):

I don't have a qt build. I've been living in mged -c land for a while once ogl stopped working.

view this post on Zulip Sean (Feb 15 2022 at 15:52):

I haven't tried an opengl-disabled build to see if non-classic mode will fire up X correctly

view this post on Zulip starseeker (Feb 15 2022 at 15:52):

OH! Yeah, I had to refactor some code for a libgcv plugin because of that - apparently a dynamically loaded lib can't go and load another dynamically loaded lib.

view this post on Zulip starseeker (Feb 15 2022 at 15:54):

Hmm. I'd hate to give up the libged dm command...

view this post on Zulip starseeker (Feb 15 2022 at 16:05):

Archer does work from the build dir

view this post on Zulip Sean (Feb 15 2022 at 16:10):

Okay, I think I just ruled out X11/ogl -- if I remove libdm-X and libdm-ogl, it still segfaults

view this post on Zulip Sean (Feb 15 2022 at 16:11):

dyld: unloaded: <970A62D7-21A7-3363-92AC-41D3E3ED2AF5> /Users/morrison/brlcad.main/.build/libexec/ged/libged-autoview.dylib
!!! REMOVING 0x7fd514c08d00 unknown
dyld: unloaded: <D8509635-B237-3585-B70C-823C95F4B5CB> /Users/morrison/brlcad.main/.build/libexec/ged/libged-attr.dylib
!!! REMOVING 0x7fd514c08ba0 unknown
dyld: unloaded: <F4435FE5-243C-3286-B0D3-CEDC50774EEE> /Users/morrison/brlcad.main/.build/libexec/ged/libged-arot.dylib
!!! REMOVING 0x7fd514e05d90 unknown
dyld: unloaded: <B9E74045-8CE7-3438-B699-54645171BFC3> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-swrast.dylib
dyld: unloaded: <21900CBB-094E-349C-A1B2-BAD779BDCF15> /Users/morrison/brlcad.main/.build/lib/libosmesa.dylib
!!! REMOVING 0x7fd514e059b0 unknown
dyld: unloaded: <C363B743-FE6B-3D4A-8513-953A5F6FAF28> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-ps.dylib
!!! REMOVING 0x7fd514e054c0 unknown
dyld: unloaded: <E8A365D0-5923-386F-A9BD-7DA434D46324> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-plot.dylib
!!! REMOVING 0x7fd514d069b0 unknown
dyld: unloaded: <DE144727-5E38-36CE-BFDD-A11CB151703E> /Users/morrison/brlcad.main/.build/lib/libdm.20.dylib
dyld: unloaded: <219AC144-E743-3037-8F1C-9B313D82BB1A> /Users/morrison/brlcad.main/.build/lib/libpkg.20.dylib
dyld: unloaded: <0AC2C158-06D9-3273-962E-FD0F51813D60> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-txt.dylib
zsh: segmentation fault  DYLD_PRINT_LIBRARIES=1 bin/cad_user 2>&1

view this post on Zulip starseeker (Feb 15 2022 at 16:12):

So the "REMOVING <address> unknown" warnings are the problem?

view this post on Zulip Sean (Feb 15 2022 at 16:13):

so what's going on there is it's unloading everything, is unloading the last libdm-*.dylib (libdm-plot.dylib in this example) and it unloads libdm itself since reference-counting-wise, nothing else is using it.

view this post on Zulip starseeker (Feb 15 2022 at 16:13):

Ah, so it's getting to libdm before libdm-txt, which is a no-no?

view this post on Zulip Sean (Feb 15 2022 at 16:14):

no, that's my manual debug printing, I'm printing out all the library pointers on load and unload. You put them in a std::set, so the name is unknown, but it's basically the lines that follow -- and in the full log, the pointer address can be matched to the load statement where the name was known
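A minimal sketch of how the debug output could carry names instead of "unknown", assuming the path is recorded next to each handle at load time. The `HandleNames` class is hypothetical and not part of libdm or libbu.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical debug helper: remembers which path each handle was loaded from.
class HandleNames {
public:
    // Record the path at load time, when the name is still known.
    void record(const void *handle, const std::string &path)
    {
        assert(handle != nullptr);
        names_[handle] = path;
    }

    // Look the name up again at unload time; "unknown" if never recorded.
    std::string lookup(const void *handle) const
    {
        auto it = names_.find(handle);
        return it == names_.end() ? std::string("unknown") : it->second;
    }

private:
    std::map<const void *, std::string> names_;
};
```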

view this post on Zulip starseeker (Feb 15 2022 at 16:14):

Oh, OK.

view this post on Zulip Sean (Feb 15 2022 at 16:15):

I think I see what's wrong, but I'm not sure what to do about it.

view this post on Zulip starseeker (Feb 15 2022 at 16:27):

Do I need to redesign or back off the plugin approach?

view this post on Zulip Sean (Feb 15 2022 at 16:33):

1) libged static initializer runs and plugins get dlopened, each one getting resolved by the dynamic linker which loads its dependent libraries, among those being..
2) libged-dm loads, which dynamically loads libdm, whose static initializer runs and plugins get dlopened, each one getting... yada yada, and then
3) app runs, does its thing, returns from main
4) libged destructor runs, starts unloading ged plugins, libged-dm unloads for example but dependencies are not yet unloaded
5) libdm destructor runs, starts unloading dm plugins and dependencies (perhaps asynchronously), and when it gets to the last plugin ...
6) dynamic linker unloads libdm itself, and this appears to happen while libdm's destructor is still running
7) seg faults, presumably on next iteration of the loop or on return from the destructor

view this post on Zulip Sean (Feb 15 2022 at 16:36):

please don't go shotgunning the plugins just yet! -- I have a swath of unpushed commits rebased on main, hundreds of changes to eliminate the per-command API

view this post on Zulip starseeker (Feb 15 2022 at 16:36):

Don't worry, I'm not going to do anything drastic. Just trying to get a sense of what we're facing

view this post on Zulip Sean (Feb 15 2022 at 16:37):

it'll conflict for sure if you go ripping on it too much
I think what is needed is to either ensure destruction is deferred, or order is somehow guaranteed by symbols

view this post on Zulip starseeker (Feb 15 2022 at 16:38):

If I absolutely have to I can ditch the dm command as a libged command and make it available some other way, but it's still a potential issue if anyone else happens to set up a similar conundrum for the unloaders...

view this post on Zulip Sean (Feb 15 2022 at 16:38):

I mean there is one possibility of simply not auto-loading everything. Only load as called.

view this post on Zulip starseeker (Feb 15 2022 at 16:39):

Would that help in the unloading calls though? Or do you mean immediately unloading after execution as well?

view this post on Zulip Sean (Feb 15 2022 at 16:39):

yeah, I think the dm plugin just provokes the issue, and isn't the issue itself. seems reasonable/likely that future plugins will require some lib. only issue might be like you said -- a dylib with a static initializer that loaded another dylib with a static initializer, and trying to avoid that

view this post on Zulip Sean (Feb 15 2022 at 16:40):

oh gosh, no, not unloading after execution. only loading what is used, and unloading everything that was loaded on shutdown. that would handle this specific case (because very little uses the dm plugin)
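
A rough sketch of the "only load what is used, unload everything that was loaded on shutdown" idea. `LazyPlugins` and its members are hypothetical, and the opener/closer callbacks are stand-ins for whatever wraps dlopen()/dlclose(); nothing here reflects how libged actually looks up commands.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical on-demand plugin table. The callbacks stand in for
// dlopen()/dlclose() so the sketch stays portable.
class LazyPlugins {
public:
    using Opener = std::function<void *(const std::string &)>;
    using Closer = std::function<void(void *)>;

    LazyPlugins(Opener opener, Closer closer)
        : opener_(std::move(opener)), closer_(std::move(closer)) {}

    // Load the plugin for `cmd` the first time it is requested and
    // reuse the cached handle afterwards.
    void *get(const std::string &cmd) {
        auto it = loaded_.find(cmd);
        if (it != loaded_.end())
            return it->second;
        void *handle = opener_(cmd);
        if (handle)
            loaded_[cmd] = handle;
        return handle;
    }

    // Explicit shutdown: close only what was actually loaded.
    void shutdown() {
        for (auto &entry : loaded_)
            closer_(entry.second);
        loaded_.clear();
    }

    std::size_t count() const { return loaded_.size(); }

private:
    Opener opener_;
    Closer closer_;
    std::map<std::string, void *> loaded_;
};
```

An app that never runs the dm command would never open its plugin, so the nested load/unload chain would not be set up in the first place.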

view this post on Zulip Sean (Feb 15 2022 at 16:41):

and apps that DO use the dm plugin appear to be gui and link dm, so it's never unloaded

view this post on Zulip starseeker (Feb 15 2022 at 16:41):

Ah. So the gsh tool should provoke the issue then, if the dm command is used.

view this post on Zulip starseeker (Feb 15 2022 at 16:42):

Oh, nope - I added libdm to that lib list

view this post on Zulip Sean (Feb 15 2022 at 16:42):

the test would be to run something that dynamically loads libdm, run the dm command, and see if it behaves on exit

view this post on Zulip Sean (Feb 15 2022 at 16:43):

yeah, it'd require removing libdm from gsh's lib list, run dm command, and see if exit behaves

view this post on Zulip starseeker (Feb 15 2022 at 16:45):

I'll try that here... one sec. Looks like I've got some actual dm library calls in there, so I'll have to turn off a couple things.

view this post on Zulip Sean (Feb 15 2022 at 16:46):

interesting. so if I remove all dm plugins, it still loads libdm dynamic, and eventually unloads it some time after libged-dm is unloaded seemingly without issue. valgrind is clean.

view this post on Zulip Sean (Feb 15 2022 at 16:49):

which is to say it's not simply returning from libdm's destructor that's causing the seg fault. the corruption happens while it is in the plugin unloading loop and unloads a plugin

view this post on Zulip Sean (Feb 15 2022 at 16:49):

happens even with just the txt plugin and no others..

view this post on Zulip Sean (Feb 15 2022 at 16:50):

Is there anything different about the libdm plugins compared to the libged plugins?

view this post on Zulip starseeker (Feb 15 2022 at 16:52):

Not intentionally...

view this post on Zulip starseeker (Feb 15 2022 at 16:52):

OK, confirm - if I take out the libdm explicit library calls from gsh, it crashes on exit after running "dm types"

view this post on Zulip starseeker (Feb 15 2022 at 16:53):

I'll go ahead and commit that turned off so we have a simple test case - will be easy to turn back on later.

view this post on Zulip Sean (Feb 15 2022 at 16:56):

so this all centers around the c++ trick of using static initialization with a class we're using to ensure constructor/destructor code is called when a library is loaded/unloaded, and that's what is not playing well -- it's unloading the library before the destructor is done
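
For reference, the static-initialization trick looks roughly like the sketch below. `PluginTable` and `plugin_table` are hypothetical names, and the strings stand in for real dlopen() handles; the comment in the destructor describes the suspected failure from this thread, not a confirmed diagnosis.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a library's plugin bookkeeping.
struct PluginTable {
    std::vector<std::string> loaded;

    PluginTable() {
        // In a real library this is where each plugin would be dlopen()ed.
        loaded = {"libdm-txt", "libdm-plot"};
    }

    ~PluginTable() {
        // ...and this is where each would be dlclose()d. The suspicion is
        // that closing the last plugin drops the final reference to the
        // library containing this destructor, so the linker unloads it
        // while this loop is still running.
        while (!loaded.empty())
            loaded.pop_back();
    }
};

// One file-scope instance: constructed when the library is loaded,
// destroyed when it is unloaded (or at process exit).
static PluginTable plugin_table;
```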

view this post on Zulip Sean (Feb 15 2022 at 17:03):

Options I think are....
1) make libdm not plugin-based, as that would avoid a dynamic lib loading other dynamic-loading/unloading libs,
2) make libged only load plugins on-demand and hope any plugins like dm that load other dynamic-loading libs will already be loaded,
3) defer unloading to libbu unloading -- basically make bu_dlclose schedule something for closure and wait,
4) find a different mechanism (avoid using constructor/destructor since that's at the heart of why this fails)

view this post on Zulip starseeker (Feb 15 2022 at 17:05):

1) is possible - it was done primarily to keep Tcl out of the core libs, but I can also just put those backends requiring it behind an ENABLE_TCL check like that one shader in liboptical.

view this post on Zulip starseeker (Feb 15 2022 at 17:06):

My bigger concern is what happens if we start supporting 3rd party GED commands and someone else adds a command that does their own libdm-esque magic behind the scenes.

view this post on Zulip starseeker (Feb 15 2022 at 17:06):

3) appeals, but I don't know how practical it is

view this post on Zulip starseeker (Feb 15 2022 at 17:08):

1) is probably the shortest path back to working reliably, and realistically it's pretty unlikely we're going to get a lot of custom libdm backend implementations anytime soon to take advantage of the modularity.

view this post on Zulip starseeker (Feb 15 2022 at 17:10):

I also wonder what will happen if we expose libgcv through any of the GED commands - mightn't there be a similar issue?

view this post on Zulip starseeker (Feb 15 2022 at 17:22):

Would 4) involve (say) making ged_init and ged_free be responsible for plugin loading and unloading?

view this post on Zulip Sean (Feb 15 2022 at 17:25):

yeah, something like that - making the loading and unloading a little more explicit. I suspect just having the loop that does destruction be explicitly called would avoid the segfault because the dynamic loader would know that it can't unload the parent dm/ged/gcv library

view this post on Zulip Sean (Feb 15 2022 at 17:31):

I think I can try #3 pretty quickly, and see if it does the trick. I suspect it will. The downside is memory use until libbu is unloaded. Probably could have API forcibly unload on demand if that becomes an issue, but unlikely an issue in our case until we're talking about thousands of plugins.

view this post on Zulip starseeker (Feb 15 2022 at 17:33):

Sounds good. If that doesn't work let me know if you want me to do either 1) or 4)

view this post on Zulip Sean (Feb 15 2022 at 17:37):

may still be a benefit to doing #2 (faster load times) -- there is some occasional huge pause on certain (usually infrequent) runs that I assume is something the dynamic loader is doing. seen the pause especially on Windows, 30-60+sec before mged displays.

view this post on Zulip Sean (Feb 15 2022 at 18:46):

Deferred appears to work nicely:

...
!!! REMOVING 0x7fdd55c0a030 unknown
!!! REMOVING 0x7fdd55c09dc0 unknown
!!! REMOVING 0x7fdd55c09b50 unknown
!!! REMOVING 0x7fdd55c098e0 unknown
!!! REMOVING 0x7fdd55c09670 unknown
!!! REMOVING 0x7fdd55c09330 unknown
!!! REMOVING 0x7fdd55c09130 unknown
!!! REMOVING 0x7fdd55c08ef0 unknown
!!! REMOVING 0x7fdd55c08d90 unknown
!!! LIBDM DESTRUCTOR
!!! REMOVING 0x7fdd579054c0 unknown
!!! REMOVING 0x7fdd57905140 unknown
!!! REMOVING 0x7fdd57904a50 unknown
!!! REMOVING 0x7fdd5780ecc0 unknown
!!! REMOVING 0x7fdd55e06400 unknown
!!! REMOVING 0x7fdd55d07010 unknown
dyld: unloaded: <484BDA57-EC5C-3533-8271-1213BE720173> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-ogl.dylib
dyld: unloaded: <7CD794FB-07E7-3E51-B7CE-CB9585477278> /usr/local/opt/libxrender/lib/libXrender.1.dylib
dyld: unloaded: <466439D8-1576-33B8-AE38-F4AD4CBCDC3F> /opt/X11/lib/libGLU.1.dylib
dyld: unloaded: <0AC2C158-06D9-3273-962E-FD0F51813D60> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-txt.dylib
dyld: unloaded: <E8A365D0-5923-386F-A9BD-7DA434D46324> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-plot.dylib
dyld: unloaded: <C363B743-FE6B-3D4A-8513-953A5F6FAF28> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-ps.dylib
dyld: unloaded: <B9E74045-8CE7-3438-B699-54645171BFC3> /Users/morrison/brlcad.main/.build/libexec/dm/libdm-swrast.dylib
dyld: unloaded: <21900CBB-094E-349C-A1B2-BAD779BDCF15> /Users/morrison/brlcad.main/.build/lib/libosmesa.dylib
dyld: unloaded: <6461ED77-30C4-3D90-8FFE-224EF5B8365F> /Users/morrison/brlcad.main/.build/libexec/ged/libged-dsp.dylib
dyld: unloaded: <1536E3E1-0B02-3F94-92A2-00D48E37B256> /Users/morrison/brlcad.main/.build/libexec/ged/libged-edmater.dylib
dyld: unloaded: <F0DAA927-CCFA-3F6D-B79B-BC27BDB6A3A8> /Users/morrison/brlcad.main/.build/libexec/ged/libged-env.dylib
dyld: unloaded: <74AF84F3-50D3-398F-9470-8C0F4DC17813> /Users/morrison/brlcad.main/.build/libexec/ged/libged-erase.dylib
dyld: unloaded: <52AACC7B-7DD1-3EA6-BF05-7D1073E5ADC1> /Users/morrison/brlcad.main/.build/libexec/ged/libged-exists.dylib
dyld: unloaded: <35739FC4-A62C-3F93-8E41-B355D7E4D5A2> /Users/morrison/brlcad.main/.build/libexec/ged/libged-expand.dylib
dyld: unloaded: <C079326A-9961-3C29-9CB0-18D9CCA48C32> /Users/morrison/brlcad.main/.build/libexec/ged/libged-eye_pos.dylib
dyld: unloaded: <F40B173E-8839-3244-A200-C1BEAC11EB7E> /Users/morrison/brlcad.main/.build/libexec/ged/libged-facetize.dylib
dyld: unloaded: <47715816-3B66-3BDF-85E8-915D193BDDD4> /Users/morrison/brlcad.main/.build/libexec/ged/libged-fb2pix.dylib
dyld: unloaded: <CA5B33DE-CF2C-3D33-95D8-CDCD86B4C109> /Users/morrison/brlcad.main/.build/libexec/ged/libged-fbclear.dylib
dyld: unloaded: <479418C5-FF5B-3D14-BEEB-D095AD4D4C55> /Users/morrison/brlcad.main/.build/libexec/ged/libged-find.dylib
dyld: unloaded: <7F8475E5-81F6-3032-9465-72E7D321179A> /Users/morrison/brlcad.main/.build/libexec/ged/libged-form.dylib
dyld: unloaded: <0DABEEDD-2EBE-327A-8B17-6C9FFEDA693B> /Users/morrison/brlcad.main/.build/libexec/ged/libged-fracture.dylib
dyld: unloaded: <5E8181FA-7584-37BF-96BE-7E9819B89D52> /Users/morrison/brlcad.main/.build/libexec/ged/libged-gdiff.dylib
...

view this post on Zulip starseeker (Feb 15 2022 at 19:15):

FYA, I'm trying to get set up with Visual Studio 2022 now - I think the Github CI system made the upgrade.

view this post on Zulip starseeker (Feb 15 2022 at 19:16):

Rather worrisome in that the openNURBS build appears to be failing with an internal compiler error...

view this post on Zulip Sean (Feb 15 2022 at 19:20):

basically it cruises through the destructor and schedules all the dylibs for closing. then when libbu is unloaded or an explicit dlunload() is called, it actually closes them all.
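
A minimal sketch of that deferred approach, assuming a queue that only records handles and closes them later. `DeferredClose`, `schedule`, and `flush` are hypothetical names, and the closer callback stands in for the real dlclose(); this is not the actual bu_dlclose implementation.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical deferred-close queue. `Closer` stands in for dlclose().
class DeferredClose {
public:
    using Closer = std::function<void(void *)>;

    explicit DeferredClose(Closer closer) : closer_(std::move(closer)) {}

    // Final teardown closes whatever is still pending.
    ~DeferredClose() { flush(); }

    // Called where the library used to be closed: only record the handle.
    void schedule(void *handle) {
        if (handle)
            pending_.push_back(handle);
    }

    // Called on explicit request: actually close everything scheduled so
    // far and return how many handles were closed.
    std::size_t flush() {
        std::vector<void *> batch;
        batch.swap(pending_);  // tolerate closers that schedule more work
        for (void *handle : batch)
            closer_(handle);
        return batch.size();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    Closer closer_;
    std::vector<void *> pending_;
};
```

The trade-off is the one already mentioned: scheduled libraries stay mapped until the flush, so memory is held a bit longer in exchange for no library being unloaded out from under a running destructor.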

view this post on Zulip Sean (Feb 15 2022 at 19:21):

alrighty.. all tests back to passing. still no ogl, but progress!

view this post on Zulip Sean (Feb 15 2022 at 19:22):

I'm still sorting through compiler errors with the tamu students.. almost all ran into issues. any idea why CHECK_CXX_FLAG(fsanitize=fuzzer) would be passing on Windows??? It did, and then proceeded to fail during compile because of the flag.

view this post on Zulip starseeker (Feb 15 2022 at 19:23):

Not yet - it's something about Visual Studio 2022

view this post on Zulip Sean (Feb 15 2022 at 19:23):

another tried in WSL, which I've done myself, but their build ended up unable to find Tcl's configure for some reason

view this post on Zulip starseeker (Feb 15 2022 at 19:23):

I'm seeing it myself here, but I don't know yet why that test would pass

view this post on Zulip starseeker (Feb 15 2022 at 19:31):

CHECK_START: Performing Test FSANITIZE_FUZZER_CXX_FLAG_FOUND
CHECK_PASS: Success
Performing C++ SOURCE FILE Test FSANITIZE_FUZZER_CXX_FLAG_FOUND succeeded with the following output:
Change Dir: C:/brlcad-build/CMakeFiles/CMakeTmp

Run Build Command(s):C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe cmTC_30228.vcxproj /p:Configuration=Debug /p:Platform=x64 /p:VisualStudioVersion=17.0 /v:m && Microsoft (R) Build Engine version 17.1.0+ae57d105c for .NETFramework
Copyright (C) Microsoft Corporation. All rights reserved.

  Microsoft (R) C/C++ Optimizing Compiler Version 19.31.31104 for x64
  Copyright (C) Microsoft Corporation.  All rights reserved.
  cl /c /Zi /W3 /WX- /diagnostics:column /Od /Ob0 /D _MBCS /D WIN32 /D _WINDOWS /D _POSIX_C_SOURCE=200809L /D _XOPEN_SOURCE=700 /D FSANITIZE_FUZZER_CXX_FLAG_FOUND /D "CMAKE_INTDIR=\"Debug\"" /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"cmTC_30228.dir\Debug\\" /Fd"cmTC_30228.dir\Debug\vc143.pdb" /external:W3 /Gd /TP /errorReport:queue  -fsanitize=fuzzer "C:\brlcad-build\CMakeFiles\CMakeTmp\src.cxx"
  src.cxx
  cmTC_30228.vcxproj -> C:\brlcad-build\CMakeFiles\CMakeTmp\Debug\cmTC_30228.exe


Source file was:
int main() { return 0; }

view this post on Zulip Sean (Feb 15 2022 at 19:31):

ahh, https://www.google.com/search?client=safari&rls=en&q=visual+studio+2022+fsanitize%3Dfuzzer&ie=UTF-8&oe=UTF-8

view this post on Zulip Sean (Feb 15 2022 at 19:31):

they actually added it

view this post on Zulip Sean (Feb 15 2022 at 19:31):

so then the question is why does it fail later...

view this post on Zulip Sean (Feb 15 2022 at 19:35):

looks like the top-level unprotected one is stray. we have a fuzz regression test that does a direct test and links proper

view this post on Zulip Sean (Feb 15 2022 at 19:37):

I removed it, doesn't appear to be used

view this post on Zulip starseeker (Feb 15 2022 at 21:13):

https://discourse.mcneel.com/t/building-opennurbs-public-with-visualstudio2022-results-in-compiler-error/137817

view this post on Zulip starseeker (Feb 15 2022 at 21:22):

https://github.com/microsoft/vcpkg/issues/19561

view this post on Zulip starseeker (Feb 15 2022 at 21:41):

It narrows down fairly quickly to trying to call methods on the const_cast<ON_SerialNumberMap*>(this) pointer. Not sure yet how to work around it.

view this post on Zulip starseeker (Feb 15 2022 at 21:57):

Grr. I don't have access to the Microsoft compiler bug page referenced in the vcpkg discussion.

view this post on Zulip starseeker (Feb 15 2022 at 21:57):

Looks like all we can tell students until a workaround is found or Microsoft pushes a fix is to use VS2019

view this post on Zulip starseeker (Feb 15 2022 at 21:58):

removing the consts and the casts doesn't seem to help

view this post on Zulip starseeker (Feb 15 2022 at 22:13):

OK... from the "cheap but functional" school... It looks like that particular class method isn't actually used anywhere, so we can just turn it off completely.

view this post on Zulip starseeker (Feb 15 2022 at 22:17):

Blast, typoed the summary line. It's unused

view this post on Zulip Sean (Feb 15 2022 at 22:20):

I wonder if their build system works, implying it being something we're passing in that's untested. I don't see reference to that error in their tracker. If it uniquely affects us, it's probably our combination of flags..

view this post on Zulip Sean (Feb 15 2022 at 22:21):

looks like cmake guys encountered an issue with the /FS flag recently that they addressed, maybe related

view this post on Zulip starseeker (Feb 15 2022 at 22:26):

confirmed on the mac here that gsh now shuts down clean. That was some nice work @Sean

view this post on Zulip starseeker (Feb 15 2022 at 23:14):

@Sean as far as OpenGL is concerned - I think I may have asked you this already, but does glxgears or one of the other X11 OpenGL demos run successfully on your Mac?

view this post on Zulip starseeker (Feb 16 2022 at 00:19):

Just got a successful build with Visual Studio 2022 (needed a clean build dir)

view this post on Zulip Sean (Feb 17 2022 at 21:06):

@starseeker yep, no problems with glxgears or previous mgeds for that matter

view this post on Zulip Sean (Feb 17 2022 at 21:07):

any idea what this is about? MicrosoftTeams-image.png

view this post on Zulip Sean (Feb 17 2022 at 21:08):

Yeah, I was going to say, I have a clean 2022 from two students now .. but one has those errors in the external project builds (maybe all of them)

view this post on Zulip starseeker (Feb 17 2022 at 21:55):

maddening... why doesn't it fail with the mac here???

view this post on Zulip starseeker (Feb 17 2022 at 21:55):

That's one of those rather unhelpful Visual Studio errors you get when a custom target fails.

view this post on Zulip starseeker (Feb 17 2022 at 21:55):

First question - what version of CMake are they using?

view this post on Zulip starseeker (Feb 17 2022 at 21:59):

For libdm+opengl, what does running ./src/libdm/tests/dm_test from the build directory show?

view this post on Zulip Sean (Feb 18 2022 at 17:39):

@starseeker I figured that one out. The full log had better detail. Turns out MSVC automatically updated/updates itself, so the compiler that CMake had originally detected no longer existed.

view this post on Zulip Sean (Feb 18 2022 at 17:41):

I think that may explain a couple build failures commonly encountered by people who have recently installed MSVC. I'm not sure if we can detect that situation as that error is absolutely inscrutable, or maybe put some advice into the Compiling page instructions to ensure MSVC is completely updated before proceeding with CMake (but then MSVC could update at any time).

view this post on Zulip Sean (Feb 18 2022 at 17:41):

I suppose it's not as common on Mac/Linux/BSD simply because the compiler isn't sitting in a versioned directory like msvc's compiler is.

view this post on Zulip Sean (Feb 18 2022 at 17:44):

Also figured out one of the other common build errors some of them ran into. If you do a Git for Windows clone of the code, the build will fail in WSL (Ubuntu) because some of the build logic appears to require unix line endings (e.g., libpng seems to be running awk).

view this post on Zulip Sean (Feb 18 2022 at 17:46):

Not sure that can be detected either, but can put a note in Compiling that one must fully start in WSL if you're going that route.
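
If detection were ever attempted, the core check is small. The sketch below only shows the idea in C++; a real check would more likely live in the CMake logic, and `has_crlf` / `file_has_crlf` are hypothetical helpers, not existing BRL-CAD functions.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Returns true if the text contains at least one CRLF ("\r\n") line ending.
bool has_crlf(const std::string &text) {
    return text.find("\r\n") != std::string::npos;
}

// Reads a file in binary mode (so "\r" is preserved on every platform)
// and checks it. An unreadable file is reported as having no CRLF.
bool file_has_crlf(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    return has_crlf(data);
}
```

Running a check like this against one script known to need unix line endings would be enough to print a readable hint instead of failing later inside an external project build.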

view this post on Zulip Sean (Feb 18 2022 at 17:46):

starseeker said:

First question - what version of CMake are they using?

Always the latest.

view this post on Zulip Sean (Feb 18 2022 at 17:47):

starseeker said:

For libdm+opengl, what does running ./src/libdm/tests/dm_test from the build directory show?

(base) morrison@agua .build % src/libdm/tests/dm_test
load msgs: dlsym(0x7f80706048d0, fb_plugin_info): symbol not found
Unable to load symbols from './libexec/dm/libdm-plot.dylib' (skipping)
Could not find 'fb_plugin_info' symbol in plugin
dlsym(0x7f807040bf60, fb_plugin_info): symbol not found
Unable to load symbols from './libexec/dm/libdm-ps.dylib' (skipping)
Could not find 'fb_plugin_info' symbol in plugin

Available types:
    ogl
    X
    plot
    ps
    swrast
    txt
    nu
nu valid: 1
plot valid: 1
X valid: 1
ogl valid: 1
osgl valid: 0
wgl valid: 0
dmp name: nu
open called
dmp name: txt
close called
recommended type: ogl

view this post on Zulip Sean (Feb 28 2022 at 21:26):

anything else to check @starseeker ?

view this post on Zulip starseeker (Feb 28 2022 at 21:29):

Maybe check whether the older (working) versions are linking to any libraries that are different from the newer version?

I don't know how much trouble it would be, but it would be interesting to know if qged works on that platform or not (the qged setup shouldn't require X11 opengl, so I'm curious as to whether the problem also manifests if we take X out of the equation...)

view this post on Zulip starseeker (Mar 01 2022 at 20:40):

@Sean I think the bzflag reboot must have introduced a new default compiler - libbu's sort.c is suddenly making it unhappy...

view this post on Zulip Sean (Mar 01 2022 at 20:41):

@starseeker yes, see announcement -- major OS upgrade happened

view this post on Zulip starseeker (Mar 01 2022 at 20:41):

Ah, OK. I can see where the error is coming from, but I'm not sure what the "correct" thing to do instead is...

view this post on Zulip Sean (Mar 01 2022 at 20:43):

I can take a look at it, I hadn't gotten to compiling there yet. Been chasing fires, reviewing PR commits, and answering questions all day.

view this post on Zulip Sean (Mar 01 2022 at 20:43):

speaking of which... I'll create another thread for an e-mail that came in

view this post on Zulip Erik (Mar 02 2022 at 02:12):

@starseeker sorry, that was me bashing around. Clang 13 now, gcc 10.3 is also installed, so -DCMAKE_C_COMPILER=gcc

view this post on Zulip starseeker (Mar 02 2022 at 02:26):

@Erik no worries - we just need to fix the issue. I'm not confident I know what the "right" answer should be yet...

view this post on Zulip Erik (Mar 02 2022 at 02:28):

the subtracting a null ptr thing is buried in a macro from what I saw, could take a bit of doing to tease out. Using gcc pulls a sysinfo bridge header that mucks up libbu linking, that test should be moved from 'have the header' to 'can link the symbol' I think
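
One possible shape for a fix, assuming the macro in sort.c resembles the alignment check found in the classic BSD qsort (not confirmed against the actual file). Newer Clang flags pointer subtraction involving a null pointer as undefined behavior; converting the address to an integer first expresses the same test without pointer arithmetic. `misaligned_for_long` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Classic BSD-style form, shown for comparison only:
//   ((char *)a - (char *)0) % sizeof(long)
// Subtracting a null pointer like this is what the compiler objects to.

// Same intent: is the base address or the element size not a multiple
// of sizeof(long)? The address is converted to an integer first.
inline bool misaligned_for_long(const void *a, std::size_t elem_size) {
    return (reinterpret_cast<std::uintptr_t>(a) % sizeof(long)) != 0
        || (elem_size % sizeof(long)) != 0;
}
```

Since the original sits inside a macro, the equivalent change there would be swapping the subtraction for a cast of the pointer to `uintptr_t`.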

view this post on Zulip Sean (Mar 08 2022 at 16:20):

I just did a build on the latest Fedora, compiled clean, mged runs, but then abruptly closes after drawing anything and running rt…. Rt window displays the rendering. Terminal output says mged was Killed.

view this post on Zulip Sean (Mar 08 2022 at 16:22):

ran in gdb and there’s nothing to break on as there indeed appears to be something in the system that sends mged the kill signal after forking off the rt process.

view this post on Zulip Sean (Mar 08 2022 at 16:22):

I’ve only seen that before when a process attempts to allocate too much memory, but haven’t yet seen evidence that’s what’s going on here

view this post on Zulip Sean (Mar 08 2022 at 16:31):

Oof, okay I found the evidence. It is getting killed by the Out of memory monitor. Not seeing why as it only appears to be using 1mb…

view this post on Zulip Sean (Mar 08 2022 at 16:57):

Ah, so turns out mged is using 5.4GB just with mged open… and that laptop is my low-resource test box, only has 4GB + 4GB swap. It is running out of memory. Seems a bit nuts that mged is using that much with essentially nothing open.

view this post on Zulip Sean (Mar 08 2022 at 17:23):

Looks like it’s something in DM. Every attach X is adding 1.5GB usage. Kicking off the tcltk gui adds over 4GB (presumably from the dm+fb).

view this post on Zulip Sean (Mar 08 2022 at 17:37):

@starseeker can you see what mged does for you if you run mged -c share/db/moss.g, attach nu, attach X, close the window, attach X again, then draw all.g?

view this post on Zulip starseeker (Mar 08 2022 at 20:32):

@Sean Wipes out with the following error:

X Error of failed request:  BadDrawable (invalid Pixmap or Window parameter)
  Major opcode of failed request:  62 (X_CopyArea)
  Resource id in failed request:  0x460000a
  Serial number of failed request:  3473
  Current serial number in output stream:  3474

view this post on Zulip starseeker (Mar 08 2022 at 20:45):

I'm also seeing a memory bump here. Not sure why yet.

view this post on Zulip starseeker (Mar 08 2022 at 20:46):

(As an aside, here's something a little weird from valgrind when I run attach X):

==1409798== Invalid read of size 1
==1409798==    at 0x483FEF0: strcmp (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1409798==    by 0x90A999E: _XimUnRegisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x9090892: XUnregisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x90A9866: _XimRegisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x909080C: XRegisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x5A392F2: TkpOpenDisplay (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0701: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0567: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0F4E: TkCreateMainWindow (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59ABA1D: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59AB40C: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A349C: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==  Address 0xf205dc1 is 1 bytes inside a block of size 9 free'd
==1409798==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1409798==    by 0x909FB3F: XSetLocaleModifiers (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x5A39ACA: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x5A39A4F: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x90A9866: _XimRegisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x909080C: XRegisterIMInstantiateCallback (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x5A392F2: TkpOpenDisplay (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0701: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0567: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0F4E: TkCreateMainWindow (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59ABA1D: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59AB40C: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==  Block was alloc'd at
==1409798==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1409798==    by 0x909F756: _XlcDefaultMapModifiers (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x909FB2A: XSetLocaleModifiers (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==1409798==    by 0x5A39ACA: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x5A392D9: TkpOpenDisplay (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0701: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0567: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A0F4E: TkCreateMainWindow (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59ABA1D: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59AB40C: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x59A349C: ??? (in /usr/lib/x86_64-linux-gnu/libtk8.6.so)
==1409798==    by 0x129B98: gui_setup (attach.c:333)

view this post on Zulip starseeker (Mar 08 2022 at 20:57):

Memory is being allocated at if_X24.c:2065, from a size calculation at if_X24.c:1997

view this post on Zulip starseeker (Mar 08 2022 at 20:58):

both ifp->i->if_max_height and ifp->i->if_max_width are set to 20480
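
As a back-of-the-envelope check on those dimensions, the sketch below computes the buffer size for a 20480 x 20480 framebuffer. The bytes-per-pixel values are assumptions (the actual calculation at if_X24.c:1997 is not shown in this thread), and `fb_bytes` is a hypothetical helper.

```cpp
#include <cstdint>

// Bytes needed for a width x height pixel buffer.
constexpr std::uint64_t fb_bytes(std::uint64_t width,
                                 std::uint64_t height,
                                 std::uint64_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

constexpr std::uint64_t kMaxDim = 20480;  // value reported in the thread

// 20480 * 20480 = 419,430,400 pixels.
static_assert(fb_bytes(kMaxDim, kMaxDim, 3) == 1258291200ULL,
              "3 bytes/pixel: about 1.17 GiB");
static_assert(fb_bytes(kMaxDim, kMaxDim, 4) == 1677721600ULL,
              "4 bytes/pixel: about 1.56 GiB");
```

If the buffer uses 4 bytes per pixel, that lands close to the roughly 1.5GB per "attach X" reported earlier in the thread.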

view this post on Zulip starseeker (Mar 08 2022 at 21:00):

I think that's coming from src/libdm/include/private.h:129

view this post on Zulip starseeker (Mar 08 2022 at 21:04):

However, that doesn't explain what's going on when the window is closed...

view this post on Zulip starseeker (Mar 08 2022 at 21:06):

Interestingly, the same general problem happens with ogl:

ATTACHING ogl (X Windows with OpenGL graphics)
mged> X Error of failed request:  GLXBadDrawable
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  5 (X_GLXMakeCurrent)
  Serial number of failed request:  1252
  Current serial number in output stream:  1252

view this post on Zulip starseeker (Mar 08 2022 at 21:26):

Also seems to be specific to the first attach - if I attach multiple windows and close the second, I can attach a new one successfully.

view this post on Zulip starseeker (Mar 08 2022 at 21:27):

Clearly memory is not being freed when the window is closed...

view this post on Zulip starseeker (Mar 08 2022 at 21:30):

None of dm_close, fb_close nor fb_close_existing gets triggered when the window closes.

view this post on Zulip starseeker (Mar 08 2022 at 21:38):

Oof. The first thing that comes to mind is to have the dm_open command bind some Tcl command that will call dm_close to the Tk <Destroy> event....

view this post on Zulip Erik (Mar 09 2022 at 14:32):

(deleted)

view this post on Zulip Sean (Mar 09 2022 at 14:50):

starseeker said:

Clearly memory is not being freed when the window is closed...

yeah, I noticed that too. I think that may also be new, but more concerning is the crash. The 20480x20480 change was made back in 7.16.0 and just testing a 7.24 version, it doesn't appear to explode memory use and seems to release when windows are closed. I reduced the number down to 8096x8096 anyways, but some other change is likely involved, and I think the crash is definitely new.

view this post on Zulip Sean (Mar 09 2022 at 14:56):

starseeker said:

Sean Wipes out with the following error:

X Error of failed request:  BadDrawable (invalid Pixmap or Window parameter)
  Major opcode of failed request:  62 (X_CopyArea)
  Resource id in failed request:  0x460000a
  Serial number of failed request:  3473
  Current serial number in output stream:  3474

I think this is at the heart of the issue, but I'm not yet grokking that stack trace.. will try to catch it on Mac to see if it gives a different path or at least more complete symbols -- looks like your build isn't enable-all'd.

view this post on Zulip starseeker (Mar 09 2022 at 17:45):

@Sean I may have messed up the dm bookkeeping in MGED at some point - my recollection of MGED's management of those can be summed up as "messy", so it's actually quite likely I messed up somewhere. I'll see if I can tease a 7.24 build into working and try to figure out how the fb memory got freed...

view this post on Zulip starseeker (Mar 09 2022 at 17:52):

One likely culprit of the increased memory usage may be my attempt to set up things so each dm has a built-in embedded fb by default. Don't know if that's behind the window crash but it's likely why the dm's are suddenly taking up memory they didn't previously

view this post on Zulip starseeker (Mar 09 2022 at 18:29):

OK, got 7.24.4 building - here's what I'm seeing so far (I'm getting X11 windows and the first ogl window, but the second wipes out):

BRL-CAD Release 7.24.4  Geometry Editor (MGED)
    Wed, 09 Mar 2022 13:25:05 -0500, Compilation 0
    cyapp@ubuntu2019

attach (nu|txt|X|ogl)[nu]?
mged> attach X
ATTACHING X (X Window System (X11))
mged> attach X
ATTACHING X (X Window System (X11))
mged> attach ogl
ATTACHING ogl (X Windows with OpenGL graphics)
mged> attach ogl
ATTACHING ogl (X Windows with OpenGL graphics)
mged> X Error of failed request:  GLXBadDrawable
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  5 (X_GLXMakeCurrent)
  Serial number of failed request:  1181
  Current serial number in output stream:  1181

view this post on Zulip starseeker (Mar 09 2022 at 18:43):

Huh - now I'm seeing the exact same thing with latest main, fwiw. I can't get 7.22.0 to build easily - will probably need to set up a VM if I need to go that far back.

view this post on Zulip starseeker (Mar 09 2022 at 18:48):

(Oh - should make clear I'm closing each window above before proceeding to the next attach)

view this post on Zulip starseeker (Mar 09 2022 at 18:53):

Aaaaand now I can't get the X attach to reproduce the failure, even with the fb memory set large... what on earth...

view this post on Zulip starseeker (Mar 09 2022 at 22:03):

OK, per recent discussion, drawing in the second "attach X" does indeed crash in latest main.

view this post on Zulip starseeker (Mar 09 2022 at 22:16):

Also crashes in 69b1b1bed2 (Tcl/Tk 8.5, just before the 8.6 switch)

view this post on Zulip starseeker (Mar 09 2022 at 22:18):

Same thing with rel-7-24-4

view this post on Zulip starseeker (Mar 09 2022 at 22:24):

7.24.0 is too old to readily build on this machine...

view this post on Zulip Sean (Mar 12 2022 at 21:01):

Okay, I swear I'd tested 7.24 and it worked, but it's bombing for me too. I documented it.

view this post on Zulip starseeker (Mar 14 2022 at 18:54):

I believe you :smile: Interesting problem, in a hair-pulling sort of way...

view this post on Zulip Sean (Mar 21 2022 at 16:05):

Just FYI, I'm working on the build Action testing issues from the recent materials merge.

view this post on Zulip Himanshu (Mar 22 2022 at 04:59):

Sean said:

Just FYI, I'm working on the build Action testing issues from the recent materials merge.

Latest commit failing in Windows.

view this post on Zulip Himanshu (Mar 22 2022 at 12:37):

/me sometimes wonders why MSVC shows a weird "file not found" message when the file is still there. It builds fine now.

view this post on Zulip Sean (Mar 22 2022 at 22:26):

@Himanshu Sekhar Nayak hm, don't know what to say about that other than it helps to turn up the compilation verbosity (under Options -> Projects and Solutions -> Build and Run). I typically set output to Normal and log to Detailed. That way, I can get to what exactly happened if needed.

view this post on Zulip Sean (Apr 05 2022 at 19:07):

One of the unexpected side effects of the plugin changes is frequently running into runtime crashes now whenever something changes outside the plugin dll/so/dylib that is incompatible with whatever's going on inside the plugin (as it does not appear to automatically recompile). At least that seems to be what's going on. For example, just pulled latest view changes, compiled, and then all tools exhibit hard corruption, assert failures, bu_bombing, etc. Cleaning and recompiling is apparently more often than not necessary now. Rather unexpected and unintuitive that it's not updating/recompiling the plugins. Maybe some dependencies aren't listed correctly?

view this post on Zulip Sean (Apr 05 2022 at 19:09):

Also working with a student on a hard database I/O corruption situation that seems to be new. Any database creation on his system is resulting in corrupted .g files. Others with the same setup, same msvc, etc. are not experiencing the corruption. It appears to have just started in the past two weeks.

view this post on Zulip starseeker (Apr 05 2022 at 19:19):

That is unexpected - I would have figured the logic would rebuild anything that would result in such a pronounced failure.

view this post on Zulip starseeker (Apr 05 2022 at 19:19):

@Sean I'll switch to working in a branch for this, so I don't keep disrupting everyone else.

view this post on Zulip starseeker (Apr 05 2022 at 19:22):

It's probable I wouldn't see that breakage mode myself, as my normal MO is to clear and rebuild.

view this post on Zulip Sean (Apr 14 2022 at 17:52):

@starseeker I'm away from a computer to test, but getting multiple reports that mged is busted in recent updates, draw not working. Can you or someone else check?

view this post on Zulip starseeker (Apr 14 2022 at 18:11):

@Sean I just pushed a reversion that should put it back.

view this post on Zulip Erik (May 11 2022 at 21:43):

so, uh, 'sup with make test failing with asc and weight? :D is that just me?

view this post on Zulip Sean (May 11 2022 at 21:44):

No that’s my doing. The test is detecting a change due to new material object management and I need to resolve it.

view this post on Zulip Sean (May 11 2022 at 21:44):

Probably will yank the attribute sync code but needs a bit of testing

view this post on Zulip Erik (May 11 2022 at 22:03):

and that durn kryptonite slips in... :) I was mucking with converting jenkins to a pipeline (can be dropped into the repo as /Jenkinsfile and revision controlled)

view this post on Zulip Sean (May 12 2022 at 17:19):

That's cool. I've been wanting to do that myself too. IaC FTW.

view this post on Zulip starseeker (May 12 2022 at 18:13):

@Daniel Rossberg any chance we could wire up your cubes examples as unit/regression tests to make sure the gqa behavior stays correct in the future?

view this post on Zulip Sean (May 12 2022 at 18:16):

I was just looking at that PR too. Very interesting! Does it still interleave as resolution doubles? That is one of the current features, no ray is shot twice -- it (is supposed to) refines the gaps in-between recursively without ever reshooting the same ray.
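
The interleaving idea can be sketched in one dimension. This is only an illustration under stated assumptions, not the gqa source: the function name `new_samples` and its parameters are hypothetical. When the number of intervals doubles, the even-indexed positions coincide with rays from earlier levels, so only the odd-indexed positions need to be shot.

```c
#include <stddef.h>

/* Hypothetical 1-D illustration (not the gqa source): fill `out` with the
 * sample positions that are new at a given refinement level.
 *
 * Level 0 samples both ends of [lo, hi]. Each later level halves the
 * spacing; positions at even multiples of the new spacing were already
 * sampled at an earlier level, so only odd multiples are emitted.
 *
 * Returns the number of positions written (at most `cap`). */
static size_t
new_samples(double lo, double hi, unsigned level, double *out, size_t cap)
{
    size_t n = 0;

    if (level == 0) {
        if (cap > 0) out[n++] = lo;
        if (cap > 1) out[n++] = hi;
        return n;
    }

    size_t cells = (size_t)1 << level;          /* intervals at this level */
    double spacing = (hi - lo) / (double)cells;

    for (size_t i = 1; i < cells && n < cap; i += 2)
        out[n++] = lo + (double)i * spacing;    /* odd indices only */

    return n;
}
```

On the interval [0, 8], level 1 adds 4, level 2 adds 2 and 6, and level 3 adds 1, 3, 5 and 7, so no position is ever produced twice. A 2-D grid would presumably skip the points whose row and column indices are both even, but that extension is an assumption here.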

view this post on Zulip Daniel Rossberg (May 13 2022 at 11:46):

Sean said:

I was just looking at that PR too. Very interesting! Does it still interleave as resolution doubles? That is one of the current features, no ray is shot twice -- it (is supposed to) refines the gaps in-between recursively without ever reshooting the same ray.

You may have a point here. I'll review it.

BTW, that's why I made a PR and didn't commit it directly: to give it a better review and discuss it first.

view this post on Zulip Daniel Rossberg (May 13 2022 at 11:49):

starseeker said:

Daniel Rossberg any chance we could wire up your cubes examples as unit/regression tests to make sure the gqa behavior stays correct in the future?

I'll look into this and see how much effort it would be. Unfortunately, the result of gqa is a print-out, which has to be interpreted first.
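
Interpreting a print-out for a regression test could look roughly like the sketch below. Everything in it is an assumption: the helper names are hypothetical, and the label text passed in by the caller is not taken from real gqa output, whose exact wording is not shown in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Find `label` in `text` and parse the number that follows it.
 * Returns 1 and stores the number in *value on success, 0 otherwise. */
static int
value_after_label(const char *text, const char *label, double *value)
{
    const char *p = strstr(text, label);
    char *end = NULL;

    if (!p)
        return 0;

    p += strlen(label);
    double v = strtod(p, &end);   /* strtod skips leading whitespace */
    if (end == p)
        return 0;                 /* nothing numeric after the label */

    *value = v;
    return 1;
}

/* Compare against an expected value with a relative tolerance, since
 * a sampled result will not match an analytic volume exactly. */
static int
within_tolerance(double got, double expected, double rel_tol)
{
    double diff = got - expected;
    if (diff < 0.0)
        diff = -diff;
    double scale = expected < 0.0 ? -expected : expected;
    return diff <= rel_tol * scale;
}
```

A test for a cube of known size would then capture the command output, call `value_after_label` with whatever label the tool really prints, and pass only if `within_tolerance` accepts the result.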

view this post on Zulip Sean (May 13 2022 at 16:18):

@Daniel Rossberg even if it reshoots, correct is obviously more important than performance. I was just more wondering if that behavior changed (and the potential effect as the grid size continues to double, if half the rays are repeat work each level)

view this post on Zulip Daniel Rossberg (May 13 2022 at 18:00):

It didn't reshoot, but it also didn't reuse the old ray traces. I changed it back to the old grid generation.

The main fault was that in lines 1003-1005 the grid sizes were recomputed with the wrong number of steps (state->steps instead of state->steps-1).

The next improvement was to use gridSpacing there too. With every refinement the "old" moments have to be reduced, and it's a problem if they were computed with one value and readjusted based on a different one.

state.steps+1 in lines 2619-2661 ensures that the rays reach mdl_max.
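
The off-by-one can be shown with a generic sketch. It assumes a count of evenly spaced sample positions that must include both ends of the range; the names `grid_spacing` and `grid_position` are hypothetical and this is not the gqa code. With n positions there are n - 1 gaps, so dividing by n leaves the last position short of the maximum.

```c
#include <assert.h>

/* Hypothetical helper, not taken from gqa: spacing between `npoints`
 * evenly spaced sample positions that cover [lo, hi] including both
 * end points. There are npoints - 1 gaps, hence the divisor. */
static double
grid_spacing(double lo, double hi, int npoints)
{
    assert(npoints >= 2);
    return (hi - lo) / (double)(npoints - 1);
}

/* Position of sample i (0-based) for the spacing above. */
static double
grid_position(double lo, double spacing, int i)
{
    return lo + (double)i * spacing;
}
```

For the range 0 to 10 with 5 positions, the spacing is 2.5 and the last position lands on 10. Dividing by 5 instead gives a spacing of 2, and the last position stops at 8.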

view this post on Zulip Sean (Jun 02 2022 at 16:45):

@Daniel Rossberg thank you for that detail! really helps to understand what's going on there. that's awesome that you caught that off-by-one bug... would take me quite a while to fully re-understand what is going on in there, so glad you figured out what was wrong. :)

view this post on Zulip starseeker (Jul 11 2022 at 03:03):

@Sean Looks like the recent MSVC warning changes broke gcc linux building

view this post on Zulip Sean (Jul 11 2022 at 06:48):

Thanks @starseeker and sorry, should be fixed now! I hadn't cycled back to mac or linux yet as I was really trying to immerse in a windows dev workflow as much as possible last week so I could address categoric issues from that side I'm seeing in our stig listings. Took a heck of a lot longer than expected to get things off the ground (still not done, but putting a thumbtack in it for now).

view this post on Zulip Sean (Jul 11 2022 at 13:37):

thanks for clearing the last two. was waiting for the scan to see what else was left and you'd fixed it before I got to see the next (as it's building for me locally clean)

view this post on Zulip Sean (Jul 11 2022 at 13:39):

let me know if bio.h causes a problem; might get away with these vanilla environments, but I suspect that'll need to be handled differently to be fully portable

view this post on Zulip starseeker (Aug 02 2022 at 13:00):

@Sean gcc errors with latest changes:

/brlcad/src/conv/off/off-g.c: In function ‘off2nmg’:
/brlcad/src/conv/off/off-g.c:208:39: error: ‘%s’ directive output may be truncated writing up to 63 bytes into a region of size 62 [-Werror=format-truncation=]
  208 |     snprintf(sname, sizeof(sname), "s.%s", title);
      |                                       ^~   ~~~~~
/brlcad/src/conv/off/off-g.c:208:5: note: ‘snprintf’ output between 3 and 66 bytes into a destination of size 64
  208 |     snprintf(sname, sizeof(sname), "s.%s", title);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/brlcad/src/conv/off/off-g.c:209:39: error: ‘%s’ directive output may be truncated writing up to 63 bytes into a region of size 62 [-Werror=format-truncation=]
  209 |     snprintf(rname, sizeof(sname), "r.%s", title);
      |                                       ^~   ~~~~~
/brlcad/src/conv/off/off-g.c:209:5: note: ‘snprintf’ output between 3 and 66 bytes into a destination of size 64
  209 |     snprintf(rname, sizeof(sname), "r.%s", title);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
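
The arithmetic behind the diagnostic: the 2-byte prefix plus up to 63 bytes of title plus the terminating NUL is 66 bytes, which does not fit in a 64-byte destination. (Line 209 also passes sizeof(sname) for rname; the compiler note reports both buffers as 64 bytes, so that looks like a harmless copy/paste slip.) One common way to handle possible truncation is to check what snprintf returns. The sketch below is an assumption, not the fix that was actually committed, and `make_name` is a hypothetical helper.

```c
#include <stdio.h>

/* Hypothetical helper (not the fix that was committed): build
 * "<prefix><title>" in dst and report whether it fit.
 * Returns 0 on success, -1 if the result was truncated or on error. */
static int
make_name(char *dst, size_t dstsize, const char *prefix, const char *title)
{
    int n = snprintf(dst, dstsize, "%s%s", prefix, title);

    /* snprintf returns the length it wanted to write, excluding the
     * terminating NUL; anything >= dstsize means output was cut. */
    if (n < 0 || (size_t)n >= dstsize)
        return -1;

    return 0;
}
```

A caller can then decide whether a truncated name is acceptable or should be reported as an error, instead of silently using a shortened name.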

view this post on Zulip Sean (Aug 02 2022 at 13:26):

Thank you! Home stretch here with musl ... hopefully one of the last issues.

view this post on Zulip Sean (Aug 02 2022 at 13:33):

That's a fun one .. fixing one issue let it detect another underlying. Commit fix pushed.

view this post on Zulip Sean (Aug 02 2022 at 13:34):

I now appear to have a full build, so I'm going to let the gitlab folks know we're good to go. Hopefully we can stay stable until the end of this week for their demo.

view this post on Zulip starseeker (Aug 02 2022 at 22:08):

@Sean Hint taken - I'll shift to a branch (sorry, didn't see this until now)

view this post on Zulip starseeker (Aug 02 2022 at 22:56):

@Sean do you want me to merge to RELEASE so we can start 7.34.0 shakedown?

view this post on Zulip starseeker (Aug 03 2022 at 23:56):

@Sean - FYI, CheckCompilerFlag is 3.19 and newer: https://cmake.org/cmake/help/latest/module/CheckCompilerFlag.html

view this post on Zulip Sean (Aug 04 2022 at 01:32):

Yeah I just discovered that earlier today.. I fixed it but didn’t push yet.

view this post on Zulip Sean (Aug 04 2022 at 14:33):

Files known to Git are not accounted for in build logic:
doc/docbook/resources/brlcad/CMakeLists.txt
doc/docbook/resources/brlcad/brlcad-article-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-article-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-book-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-book-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-common.xsl.in
doc/docbook/resources/brlcad/brlcad-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-fonts.xsl.in
doc/docbook/resources/brlcad/brlcad-gendata.xsl
doc/docbook/resources/brlcad/brlcad-lesson-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-lesson-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-man-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-man-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-man-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-presentation-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-presentation-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-specification-fo-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-specification-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/brlcad-xhtml-header-navigation.xsl
doc/docbook/resources/brlcad/brlcad-xhtml-stylesheet.xsl.in
doc/docbook/resources/brlcad/center-table-print.xsl
doc/docbook/resources/brlcad/images/brlcad-logo-669966.svg
doc/docbook/resources/brlcad/images/brlcad-logo-6699cc.svg
doc/docbook/resources/brlcad/images/brlcad-logo-blue.svg
doc/docbook/resources/brlcad/images/brlcad-logo-cc6666.svg
doc/docbook/resources/brlcad/images/brlcad-logo-cc9966.svg
doc/docbook/resources/brlcad/images/brlcad-logo-green.svg
doc/docbook/resources/brlcad/images/brlcad-logo-limegreen.svg
doc/docbook/resources/brlcad/images/brlcad-logo-red.svg
doc/docbook/resources/brlcad/images/logo-vm-gears.png
doc/docbook/resources/brlcad/images/logo-vm-gears.svg
doc/docbook/resources/brlcad/presentation.xsl.in
doc/docbook/resources/brlcad/tutorial-cover-template.xsl.in
doc/docbook/resources/brlcad/tutorial-template.xsl.in
doc/docbook/resources/brlcad/wordpress.xsl.in

Files mentioned in build logic are not checked into the repository:
doc/docbook/resourcesCMakeLists.txt
doc/docbook/resourcesbrlcad-article-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-article-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-book-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-book-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-common.xsl.in
doc/docbook/resourcesbrlcad-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-fonts.xsl.in
doc/docbook/resourcesbrlcad-gendata.xsl
doc/docbook/resourcesbrlcad-lesson-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-lesson-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-man-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-man-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-man-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-presentation-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-presentation-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-specification-fo-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-specification-xhtml-stylesheet.xsl.in
doc/docbook/resourcesbrlcad-xhtml-header-navigation.xsl
doc/docbook/resourcesbrlcad-xhtml-stylesheet.xsl.in
doc/docbook/resourcescenter-table-print.xsl
doc/docbook/resourcesimages/brlcad-logo-669966.svg
doc/docbook/resourcesimages/brlcad-logo-6699cc.svg
doc/docbook/resourcesimages/brlcad-logo-blue.svg
doc/docbook/resourcesimages/brlcad-logo-cc6666.svg
doc/docbook/resourcesimages/brlcad-logo-cc9966.svg
doc/docbook/resourcesimages/brlcad-logo-green.svg
doc/docbook/resourcesimages/brlcad-logo-limegreen.svg
doc/docbook/resourcesimages/brlcad-logo-red.svg
doc/docbook/resourcesimages/logo-vm-gears.png
doc/docbook/resourcesimages/logo-vm-gears.svg
doc/docbook/resourcespresentation.xsl.in
doc/docbook/resourcestutorial-cover-template.xsl.in
doc/docbook/resourcestutorial-template.xsl.in
doc/docbook/resourceswordpress.xsl.in

CMake Error at CMakeTmp/distcheck_repo_verify.cmake:228 (message):
ERROR: Distcheck cannot proceed until build files and repo are in sync (set
-DFORCE_DISTCHECK=ON to override)

view this post on Zulip Sean (Aug 04 2022 at 19:31):

Sean said:

Yeah I just discovered that earlier today.. I fixed it but didn’t push yet.

pushed the fix last night, should be good to go. Possibly related, I'm seeing two "Attempt to add a custom rule to output" cmake error messages on libnetpbm.a.rule and libgdal.a.rule
Any ideas?

view this post on Zulip starseeker (Aug 05 2022 at 13:23):

Not offhand - the logic doing that management is src/other/ext/CMake/ExternalProject_Target.cmake:442 - it in turn uses the fcfgcpy function which defines custom rules

view this post on Zulip starseeker (Aug 05 2022 at 13:26):

You could try some message statements in those functions to see if you can bracket where that error is being generated

view this post on Zulip Sean (Sep 15 2022 at 03:13):

@starseeker appears to be a recent regression on draw -m2 ...

view this post on Zulip Sean (Sep 15 2022 at 03:13):

unknown-1.png

view this post on Zulip Sean (Sep 15 2022 at 03:13):

unknown.png

view this post on Zulip Sean (Sep 16 2022 at 00:59):

Your recent fix appears to have fixed it, nice! Thank you.

view this post on Zulip Sean (Apr 24 2023 at 15:23):

@Christopher looks like a few dirs are missing from the latest commit? (fbx, dxf, pbrt in regress/gcv)

view this post on Zulip Christopher (Apr 24 2023 at 15:28):

Forgot some cleanup. Fix pushed

view this post on Zulip GregoryLi (May 18 2023 at 08:55):

It seems we have some problems with the brep command. ged_brep_core used to receive four arguments. Now we only get two.
image.png

view this post on Zulip GregoryLi (May 18 2023 at 08:56):

image.png

view this post on Zulip Daniel Rossberg (May 18 2023 at 14:25):

GregoryLi said:

It seems we have some problems with the brep command. ged_brep_core used to receive four arguments. Now we only get two.

Just tested with a clean build of current brlcad on Linux. I got "arb8.s.brep is made." Can you repeat your test with a clean build from scratch?

view this post on Zulip GregoryLi (May 19 2023 at 03:47):

Sorry, a clean build works well. :grinning:

view this post on Zulip Sean (May 19 2023 at 20:30):

libged commands are loaded dynamically (as dynamic libs) and for some reason they don't always rebuild when a file has been edited despite having dependencies set in cmake (or perhaps one is missing).

view this post on Zulip Sean (May 19 2023 at 20:32):

so if anyone edits a header, especially a structure, they all need to be rebuilt and that doesn't always happen automatically. would be great if someone could make that not be a problem, but currently I make sure to delete the libged and libdm libs at a minimum so they're rebuilt.
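
One way to turn this kind of silent corruption into an explicit load failure is a layout check when a plugin is loaded. The sketch below is purely illustrative: every name in it is hypothetical and it is not a description of how libged or libdm actually validate plugins.

```c
#include <stddef.h>

/* All names here are hypothetical, for illustration only. */

struct cmd_state {            /* a structure shared by host and plugins */
    int   flags;
    void *data;
};

#define HOST_ABI_VERSION 3u   /* bumped whenever a shared struct changes */

struct plugin_info {          /* filled in when the plugin is compiled */
    unsigned abi_version;
    size_t   state_size;      /* sizeof(struct cmd_state) seen by the plugin */
};

/* Returns 1 if the plugin was built against the same layout as the host. */
static int
plugin_compatible(const struct plugin_info *p)
{
    if (!p)
        return 0;
    if (p->abi_version != HOST_ABI_VERSION)
        return 0;
    if (p->state_size != sizeof(struct cmd_state))
        return 0;
    return 1;
}
```

A stale plugin would then be rejected with a clear message rather than crashing later. It does not fix the missing rebuild dependency itself, only makes the mismatch visible.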

view this post on Zulip GregoryLi (Oct 03 2023 at 08:57):

Hi, I just pulled the latest code and found I can't open a .g database.
image.png

view this post on Zulip GregoryLi (Oct 03 2023 at 08:58):

I'm working on Ubuntu 20.04 at commit 753ca33.

view this post on Zulip GregoryLi (Oct 03 2023 at 09:18):

It's quite strange... For me, the problem has existed for many commits (it was already there before Sep 13), and it worked well on Aug 27. Does anyone else have this problem? Do I need to use the git bisect command to find where it was introduced?

view this post on Zulip starseeker (Oct 03 2023 at 15:31):

That might need a bisect - it's probably related to the work I did with the open/opendb GED command work.

view this post on Zulip starseeker (Oct 03 2023 at 15:32):

A naive guess is that I didn't change something from open to opendb, but it could be something else.

view this post on Zulip GregoryLi (Oct 04 2023 at 01:25):

I just located the error using bisect. a7bba28a948a1939e53ab224fdc4e4a381cddb23 is the first bad commit.

view this post on Zulip starseeker (Oct 04 2023 at 02:20):

@GregoryLi OK, that confirms somewhere in the Archer startup stack we're calling "open" where we should be calling "opendb"

view this post on Zulip starseeker (Oct 08 2023 at 01:50):

@GregoryLi Can you check if things are working again in the latest?

view this post on Zulip GregoryLi (Oct 09 2023 at 00:29):

starseeker said:

GregoryLi Can you check if things are working again in the latest?

Yeah, it works well now. :smile:

view this post on Zulip starseeker (Nov 04 2023 at 16:38):

Bah - bu_vls_vprintf tests 22 and 32 fail on Alpine Linux.

view this post on Zulip starseeker (Nov 04 2023 at 22:26):

I don't believe this - facetizing tor with a tolerance of r=0.0001 is causing an nmg_mdl_to_bot failure just on the mac, which seems to be why the lod drawing test is failing.

view this post on Zulip Sean (Oct 08 2024 at 16:28):

Here's a crash reading/writing BREP:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001018c4884 libOpenNURBS.dylib`ON_Object::IsKindOf(ON_ClassId const*) const + 20
    frame #1: 0x00000001017aeebc libOpenNURBS.dylib`ON_Geometry::Cast(ON_Object*) + 32
    frame #2: 0x00000001007be114 librt.20.dylib`brep_dbi2on(rt_db_internal const*, ONX_Model&) + 176
    frame #3: 0x00000001007be560 librt.20.dylib`rt_brep_export5 + 168
    frame #4: 0x0000000100809088 librt.20.dylib`rt_generic_xform + 340
    frame #5: 0x000000010003bdec mged`vls_solid + 168
    frame #6: 0x000000010004c618 mged`refresh + 1040
    frame #7: 0x0000000100049988 mged`main + 7080
    frame #8: 0x000000019234ff28 dyld`start + 2236

view this post on Zulip Sean (Oct 08 2024 at 16:46):

commands invoked by mged:

M
M $args
M 1 0 0
adc $args
adc draw
ae
aip f
attach
center
draw $esol_control($id,name)
has_embedded_fb
ill
ill -e -i $ri $spath
ill -e -i 1 $path
ill -e -i 1 [lindex $spath_and_pos 0
ill -e -n -i $ri $spath
ill -i 1 [lindex $paths 0
ill -i 1 \$mged_gui($id,mgs_path)
in $mged_gui($id,solid_name) dsp f \
keep
keep db_glob
ls -c
ls -r
make $mged_gui($id,solid_name) $type} msg
make_name $mged_default(solid_name_fmt)} name
make_name comb@\
matpick
matpick $item
matpick -n $path_pos
matpick -n \$item
matpick [lindex $spath_and_pos 1
nirt $args
opendb
pl
postscript
press
press oill
press reject
press reset
press sill
qray basename
qray echo
qray effects
qray evencolor
qray fmt f
qray fmt g
qray fmt h
qray fmt m
qray fmt o
qray fmt p
qray fmt r
qray oddcolor
qray overlapcolor
qray script
qray voidcolor
quit
rset grid anchor
rt
saveview
sed $mged_gui($id,solid_name)}
sed -i 1 $item
sed -i 1 $spath
size
size $size
status state
tie
tie $id
tie $id $mged_gui($id,active_dm)
tree
tree $args} result
units $mged_display(units)
view
view center
view size
view_ring
view_ring next
view_ring prev
view_ring toggle
who
who phony
x -1
x -2

view this post on Zulip Sean (Oct 08 2024 at 16:53):

generated from:

grep -r -e '[^a-z]_mged_' * | sed 's/.*_mged_//g' | sed 's/].*//g'  | sed 's/;.*//g' | sed 's/".*//g' | sort | uniq

view this post on Zulip starseeker (Oct 08 2024 at 17:10):

1738 const ON_ClassId* p = ClassId();
(gdb) print *this
$5 = {_vptr.ON_Object = 0x5d00000032, static m_s_ON_Object_ptr = 0x0, static m_ON_Object_class_rtti = {
static m_p0 = 0x7ffff3764e00 <ON_3dmObjectAttributes::m_ON_3dmObjectAttributes_class_rtti>,
static m_p1 = 0x7ffff37801e0 <ON_RdkUserData::m_ON_RdkUserData_class_rtti>, static m_mark0 = 0,
m_pNext = 0x7ffff376ed20 <ON_HistoryRecord::m_ON_HistoryRecord_class_rtti>, m_pBaseClassId = 0x0, m_sClassName = "ON_Object", '\000' <repeats 70 times>,
m_sBaseClassName = "0", '\000' <repeats 78 times>, m_create = 0x0, m_uuid = {Data1 = 1622531005, Data2 = 58976, Data3 = 4563,
Data4 = "\277\344\000\020\203\001", <incomplete sequence \360>}, m_mark = -2147483648, m_class_id_version = 0, m_f1 = 0x0, m_f2 = 0x0, m_f3 = 0x0, m_f4 = 0x0,
m_f5 = 0x0, m_f6 = 0x0, m_f7 = 0x0, m_f8 = 0x0}, m_userdata_list = 0x200000003a}

view this post on Zulip starseeker (Oct 08 2024 at 17:11):

opennurbs_object.h:

#define ON_VIRTUAL_OBJECT_IMPLEMENT( cls, basecls, uuid ) \
void* cls::m_s_##cls##_ptr = nullptr; \
const ON_ClassId cls::m_##cls##_class_rtti(#cls,#basecls,0,uuid);\
cls * cls::Cast( ON_Object* p) {return(p&&p->IsKindOf(&cls::m_##cls##_class_rtti))?static_cast< cls *>(p):nullptr;} \
const cls * cls::Cast( const ON_Object* p) {return(p&&p->IsKindOf(&cls::m_##cls##_class_rtti))?static_cast<const cls *>(p):nullptr;} \
const ON_ClassId* cls::ClassId() const {return &cls::m_##cls##_class_rtti;} \
bool cls::CopyFrom(const ON_Object*) {return false;} \
cls * cls::Duplicate() const {return static_cast< cls *>(this->Internal_DeepCopy());} \
ON_Object* cls::Internal_DeepCopy() const {return nullptr;}

view this post on Zulip starseeker (Oct 08 2024 at 17:16):

(gdb) print *bi->brep
$6 = {<ON_Geometry> = {<ON_Object> = {_vptr.ON_Object = 0x5d00000032, static m_s_ON_Object_ptr = 0x0, static m_ON_Object_class_rtti = {

view this post on Zulip starseeker (Oct 08 2024 at 18:14):

(gdb) print *this
$3 = {_vptr.ON_Object = 0x7ffff372ed10 <vtable for ON_Brep+16>, static m_s_ON_Object_ptr = 0x0,

view this post on Zulip starseeker (Oct 08 2024 at 18:14):

(gdb) print *this
$4 = {_vptr.ON_Object = 0xc00000004, static m_s_ON_Object_ptr = 0x0, static m_ON_Object_class_rtti = {

view this post on Zulip starseeker (Oct 08 2024 at 18:21):

#0 brep_dbi2on (intern=0x7fffffffd1c0, model=...) at /home/user/brlcad/src/librt/primitives/brep/brep.cpp:2321
#1 0x00007ffff75b4c82 in rt_brep_get (logstr=0x5555556a70a0, intern=0x7fffffffd1c0, attr=0x0)

(gdb) print *bi->brep
$1 = {<ON_Geometry> = {<ON_Object> = {_vptr.ON_Object = 0x7ffff372ed10 <vtable for ON_Brep+16>,
static m_s_ON_Object_ptr = 0x0, static m_ON_Object_class_rtti = {

view this post on Zulip starseeker (Oct 08 2024 at 18:23):

#0 brep_dbi2on (intern=0x55555565e320 <es_int>, model=...)
at /home/user/brlcad/src/librt/primitives/brep/brep.cpp:2331
#1 0x00007ffff75b544f in rt_brep_export5 (ep=0x7fffffffd1a0, ip=0x55555565e320 <es_int>, UNUSED_local2mm=1,

$3 = {<ON_Geometry> = {<ON_Object> = {_vptr.ON_Object = 0x2c00000030, static m_s_ON_Object_ptr = 0x0,
static m_ON_Object_class_rtti = {

view this post on Zulip starseeker (Oct 08 2024 at 18:36):

==690562== Invalid read of size 8
==690562== at 0x9569A20: ON_Object::IsKindOf(ON_ClassId const*) const (opennurbs_object.cpp:1738)
==690562== by 0x93E6722: ON_Geometry::Cast(ON_Object*) (opennurbs_geometry.cpp:24)
==690562== by 0x4CD2A99: brep_dbi2on(rt_db_internal const*, ONX_Model&) (brep.cpp:2345)
==690562== by 0x4CD344E: rt_brep_export5 (brep.cpp:2422)
==690562== by 0x4DB07D5: rt_generic_xform (generic.c:85)
==690562== by 0x4F7B733: rt_matrix_transform (transform.c:39)
==690562== by 0x16D989: transform_editing_solid (edsol.c:2712)
==690562== by 0x19247E: vls_solid (edsol.c:7349)
==690562== by 0x1D9835: create_text_overlay (titles.c:89)
==690562== by 0x1B9195: refresh (mged.c:2316)
==690562== by 0x1B72B0: main (mged.c:1695)
==690562== Address 0x13c29970 is 784 bytes inside an unallocated block of size 2,432 in arena "client"
==690562==
==690562== Invalid read of size 8
==690562== at 0x9569A23: ON_Object::IsKindOf(ON_ClassId const*) const (opennurbs_object.cpp:1738)
==690562== by 0x93E6722: ON_Geometry::Cast(ON_Object*) (opennurbs_geometry.cpp:24)
==690562== by 0x4CD2A99: brep_dbi2on(rt_db_internal const*, ONX_Model&) (brep.cpp:2345)
==690562== by 0x4CD344E: rt_brep_export5 (brep.cpp:2422)
==690562== by 0x4DB07D5: rt_generic_xform (generic.c:85)
==690562== by 0x4F7B733: rt_matrix_transform (transform.c:39)
==690562== by 0x16D989: transform_editing_solid (edsol.c:2712)
==690562== by 0x19247E: vls_solid (edsol.c:7349)
==690562== by 0x1D9835: create_text_overlay (titles.c:89)
==690562== by 0x1B9195: refresh (mged.c:2316)
==690562== by 0x1B72B0: main (mged.c:1695)
==690562== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==690562==
==690562==
==690562== Process terminating with default action of signal 11 (SIGSEGV)
==690562== Access not within mapped region at address 0x0
==690562== at 0x9569A23: ON_Object::IsKindOf(ON_ClassId const*) const (opennurbs_object.cpp:1738)
==690562== by 0x93E6722: ON_Geometry::Cast(ON_Object*) (opennurbs_geometry.cpp:24)
==690562== by 0x4CD2A99: brep_dbi2on(rt_db_internal const*, ONX_Model&) (brep.cpp:2345)
==690562== by 0x4CD344E: rt_brep_export5 (brep.cpp:2422)
==690562== by 0x4DB07D5: rt_generic_xform (generic.c:85)
==690562== by 0x4F7B733: rt_matrix_transform (transform.c:39)
==690562== by 0x16D989: transform_editing_solid (edsol.c:2712)
==690562== by 0x19247E: vls_solid (edsol.c:7349)
==690562== by 0x1D9835: create_text_overlay (titles.c:89)
==690562== by 0x1B9195: refresh (mged.c:2316)
==690562== by 0x1B72B0: main (mged.c:1695)

view this post on Zulip starseeker (Oct 08 2024 at 20:33):

@Sean I might have fixed it - let me know if the latest commit works for you. (I didn't put a NEWS item in yet, want more confirmation than just "works on my box" for this sucker...)

view this post on Zulip Sean (Oct 20 2024 at 05:54):

New build error on a default build (on Mac):

morrison@Miniagua TCL_BLD-build % make

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -c -I"." -I/Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/unix -I/Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/generic -I/Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/libtommath -O2 -pipe  -I/Volumes/X10/brlcad/.build/bext_output/install/include  -Wall -Wpointer-arith -fno-common -DBUILD_tcl -DPACKAGE_NAME=\"tcl\" -DPACKAGE_TARNAME=\"tcl\" -DPACKAGE_VERSION=\"8.6\" -DPACKAGE_STRING=\"tcl\ 8.6\" -DPACKAGE_BUGREPORT=\"\" -DNO_DIRENT_H=1 -DNO_VALUES_H=1 -DNO_STDLIB_H=1 -DNO_STRING_H=1 -DNO_SYS_WAIT_H=1 -DNO_DLFCN_H=1 -DUSE_THREAD_ALLOC=1 -D_REENTRANT=1 -D_THREAD_SAFE=1 -DHAVE_PTHREAD_ATTR_SETSTACKSIZE=1 -DHAVE_PTHREAD_ATFORK=1 -DTCL_THREADS=1 -DTCL_CFGVAL_ENCODING=\"iso8859-1\" -DHAVE_ZLIB=1 -DMODULE_SCOPE=extern\ __attribute__\(\(__visibility__\(\"hidden\"\)\)\) -DHAVE_HIDDEN=1 -DMAC_OSX_TCL=1 -DHAVE_CAST_TO_UNION=1 -DHAVE_VFORK=1 -DHAVE_POSIX_SPAWNP=1 -DHAVE_POSIX_SPAWN_FILE_ACTIONS_ADDDUP2=1 -DHAVE_POSIX_SPAWNATTR_SETFLAGS=1 -DTCL_SHLIB_EXT=\".dylib\" -DNDEBUG=1 -DTCL_CFG_OPTIMIZED=1 -DTCL_TOMMATH=1 -DMP_PREC=4 -DTCL_WIDE_INT_IS_LONG=1 -DWORDS_BIGENDIAN=1 -DHAVE_GETCWD=1 -DHAVE_MKSTEMP=1 -DHAVE_OPENDIR=1 -DHAVE_STRTOL=1 -DHAVE_WAITPID=1 -DHAVE_GETNAMEINFO=1 -DHAVE_GETADDRINFO=1 -DHAVE_FREEADDRINFO=1 -DHAVE_GAI_STRERROR=1 -DNEED_FAKE_RFC2553=1 -DHAVE_MTSAFE_GETHOSTBYNAME=1 -DHAVE_MTSAFE_GETHOSTBYADDR=1 -DNO_FD_SET=1 -DHAVE_GMTIME_R=1 -DHAVE_LOCALTIME_R=1 -DHAVE_MKTIME=1 -Dmode_t=int -Dpid_t=int -Dsize_t=unsigned -Duid_t=int -Dgid_t=int -Dsocklen_t=int -DNO_UNION_WAIT=1 -DGETTOD_NOT_DECLARED=1 -DHAVE_SIGNED_CHAR=1 -DHAVE_PUTENV_THAT_COPIES=1 -DHAVE_CHFLAGS=1 -DHAVE_MKSTEMPS=1 -DNO_ISNAN=1 -DHAVE_GETATTRLIST=1 -DHAVE_COPYFILE=1 -DTCL_DEFAULT_ENCODING=\"utf-8\" -DTCL_LOAD_FROM_MEMORY=1 -DTCL_WIDE_CLICKS=1 -DTCL_UNLOAD_DLLS=1     -DSTATIC_BUILD -fno-lto 
/Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/generic/tclStubLib.c

In file included from /Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/generic/tclStubLib.c:14:

In file included from /Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/generic/tclInt.h:36:

In file included from /Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/generic/tclPort.h:23:
/Volumes/X10/brlcad/.build/bext_build/tcl/TCL_BLD-prefix/src/TCL_BLD/unix/tclUnixPort.h:32:10: fatal error: 'errno.h' file not found

#include <errno.h>

         ^~~~~~~~~

1 error generated.

make: *** [tclStubLib.o] Error 1

view this post on Zulip starseeker (Oct 21 2024 at 13:15):

Does OSX not have errno.h?

view this post on Zulip Sean (Oct 21 2024 at 16:47):

@starseeker It most certainly does and always has. Nothing on the system has changed. Debug build worked just fine. Just the default build is dying on that error during Tcl's bext build.

view this post on Zulip Sean (Oct 21 2024 at 16:49):

Only thing I can see is all the -DNO_*_H=1 flags also look wrong, like something is wrong during/after tcl's configure phase.

view this post on Zulip Sean (Oct 21 2024 at 16:49):

e.g., saying there is no string.h or stdlib.h also

view this post on Zulip Sean (Oct 21 2024 at 16:50):

release and debug builds both seem to have worked, but I've not deleted them to check from scratch as I'm working on something else and the default build just surprised me that it's failing basic setup.


Last updated: Jan 07 2025 at 00:46 UTC