Stream: brlcad

Topic: Recent Compilation Errors


view this post on Zulip Sean (Apr 09 2020 at 04:41):

@starseeker Looks like build is Sad Mac. It finds system Tcl/Tk but then fails to find TCL_STUB_LIBRARY:

Compile Tcl ........................: OFF
Compile Tk .........................: OFF
Compile Itcl/Itk ...................: OFF
Compile Iwidgets ...................: OFF
Compile Tkhtml .....................: ON
Compile Tktable ....................: OFF
Compile libpng .....................: OFF
Compile libregex ...................: OFF
Compile zlib .......................: OFF
Compile Utah Raster Toolkit ........: ON
Compile openNURBS ..................: ON
Compile STEPcode....................: ON

OpenGL support (optional) ..........: ON
X11 support (optional) .............: ON
Qt support (optional) ..............: OFF
Run-time debuggability (optional) ..: ON

Build 32/64-bit release ............: 64BIT (Auto)
Build optimized release ............: OFF
Build static libraries .............: ON
Build dynamic libraries ............: ON
Install example geometry models ....: ON
Generate extra docs ................: ON (html/man)

Elapsed configuration time: 2 minutes 22 seconds
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
TCL_STUB_LIBRARY
    linked by target "Tkhtml" in directory /Users/morrison/brlcad.trunk/src/other/tkhtml

-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

view this post on Zulip starseeker (Apr 09 2020 at 12:53):

Erm. Does Apple's install of Tcl/Tk have the stub library? (Alternately, we might try just TCL_LIBRARY on Mac...)

view this post on Zulip starseeker (Apr 09 2020 at 12:54):

What does ENABLE_ALL do?

view this post on Zulip Sean (Apr 18 2020 at 03:08):

@starseeker build is still sad mac with enable all too

view this post on Zulip Sean (Apr 18 2020 at 03:08):

[ 56%] Building C object src/other/tcl/CMakeFiles/tcl.dir/unix/tclLoadDyld.c.o
[ 56%] Linking C shared library ../../../lib/libtcl.dylib
duplicate symbol '_TclpDlopen' in:
    CMakeFiles/tcl.dir/unix/tclLoadDl.c.o
    CMakeFiles/tcl.dir/unix/tclLoadDyld.c.o
duplicate symbol '_TclGuessPackageName' in:
    CMakeFiles/tcl.dir/unix/tclLoadDl.c.o
    CMakeFiles/tcl.dir/unix/tclLoadDyld.c.o
ld: 2 duplicate symbols for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [lib/libtcl.8.6.10.dylib] Error 1
make[1]: *** [src/other/tcl/CMakeFiles/tcl.dir/all] Error 2

view this post on Zulip Sean (Apr 18 2020 at 04:49):

Looks like Mac is even more sad if I skip that error. Termio compile errors next, missing libbu symbols.

view this post on Zulip starseeker (Apr 18 2020 at 13:01):

Ah, that's me - I switched a couple logging statements and string ops to bu versions

view this post on Zulip starseeker (Apr 18 2020 at 13:01):

forgot to update libtermio CMakeLists.txt

view this post on Zulip starseeker (Apr 18 2020 at 13:17):

I think I fixed where OSX build was pulling in the file with colliding symbols.

view this post on Zulip Daniel Rossberg (May 04 2020 at 16:06):

The (Linux) build is still broken?

Build Time: Mon May  4 18:01:26 2020

[  0%] Built target timestamp
[  0%] Built target lcheck
[  0%] Built target toplevel_DOCFILES_cp
[  0%] Built target lemon
[  1%] Built target re2c_bootstrap
[  1%] Built target re2c
[  1%] Built target perplex_template_cp
[  2%] Built target perplex
[  2%] Built target astyle
[  2%] Built target env2c
[  2%] Built target debug2c
[  2%] Built target 04238f4c3429a37ca335f9352ba2dc81_cp
[  2%] Built target openNURBS_headers_cp
[  2%] Built target lz4
[  2%] Built target lz4-static
[  2%] Built target netpbm-static
[  3%] Built target netpbm
[  4%] Built target utahrle-static
[  4%] Built target utahrle
[  4%] Built target itclstub
[  4%] Linking C shared library ../../../lib/libitcl.so
/usr/bin/ld: cannot find -ltclstub
collect2: error: ld returned 1 exit status
make[2]: *** [src/other/itcl3/CMakeFiles/itcl.dir/build.make:250: lib/libitcl.so.3.4] Error 1
make[1]: *** [CMakeFiles/Makefile2:3600: src/other/itcl3/CMakeFiles/itcl.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

view this post on Zulip Sean (May 04 2020 at 21:32):

This could be the same issue I ran across. I think the underlying issue was a system libtcl that did not install libtclstub? Or perhaps the cmake detection of it failed? I don't recall the detail, but does enable all work for you?

view this post on Zulip Daniel Rossberg (May 05 2020 at 14:22):

Right, libtcl8.6 is there, but no libtclstub. And, there seems to be no Debian package providing it.

view this post on Zulip Daniel Rossberg (May 05 2020 at 14:54):

It looks like the old find-TCL hasn't found the system TCL library, but the new one does.

view this post on Zulip starseeker (May 05 2020 at 19:44):

@Sean Are you able to build trunk on OSX now, or is it still busted?

view this post on Zulip Sean (May 05 2020 at 19:54):

Trunk built yesterday, but I haven't tested it since. I'll kick off a build

view this post on Zulip Sean (May 06 2020 at 09:27):

@starseeker new opennurbs-zlib-related compilation error encountered after a fresh cmake:

/usr/local/Cellar/llvm/10.0.0_3/bin/clang++  -w -m64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.14 -dynamiclib -Wl,-headerpad_max_install_names  -m64 -current_version 2012.10.245 -o ../../../lib/libopenNURBS.2012.10.245.dylib -install_name @rpath/libopenNURBS.2012.10.245.dylib CMakeFiles/openNURBS-obj.dir/opennurbs_basic.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_changesrf.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_kinky.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_x.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_3dm_attributes.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_3dm_properties.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_3dm_settings.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_annotation.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_annotation2.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_arc.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_arccurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_archive.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_array.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_base32.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_base64.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_beam.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_bezier.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_beziervolume.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_bitmap.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_bounding_box.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_box.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_extrude.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_io.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_isvalid.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_region.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_tools.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_brep_v2valid.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_circle.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_color.cpp.o 
CMakeFiles/openNURBS-obj.dir/opennurbs_compress.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_cone.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_crc.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_curve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_curveonsurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_curveproxy.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_cylinder.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_defines.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_detail.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_dimstyle.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_dll.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_ellipse.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_embedded_file.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_error.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_error_message.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_evaluate_nurbs.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_extensions.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_font.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_fsp.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_geometry.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_group.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_hatch.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_instance.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_intersect.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_knot.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_layer.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_light.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_line.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_linecurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_linetype.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_lookup.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_material.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_math.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_massprop.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_matrix.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_memory.c.o CMakeFiles/openNURBS-obj.dir/opennurbs_memory_util.c.o 
CMakeFiles/openNURBS-obj.dir/opennurbs_mesh.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_mesh_ngon.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_mesh_tools.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_morph.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_nurbscurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_nurbssurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_nurbsvolume.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_object.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_object_history.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_objref.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_offsetsurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_optimize.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_plane.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_planesurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_pluginlist.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_point.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_pointcloud.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_pointgeometry.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_pointgrid.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_polycurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_polyedgecurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_polyline.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_polylinecurve.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_rand.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_revsurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_rtree.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_sort.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_sphere.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_string.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_sum.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_sumsurface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_surface.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_surfaceproxy.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_textlog.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_torus.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_unicode.cpp.o 
CMakeFiles/openNURBS-obj.dir/opennurbs_userdata.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_uuid.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_viewport.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_workspace.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_wstring.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_xform.cpp.o CMakeFiles/openNURBS-obj.dir/opennurbs_zlib.cpp.o  /usr/lib/libz.dylib
Undefined symbols for architecture x86_64:
  "_brl_deflate", referenced from:
      ON_CompressStream::In(unsigned long long, void const*) in opennurbs_compress.cpp.o
      ON_CompressStream::End() in opennurbs_compress.cpp.o
      ON_BinaryArchive::WriteDeflate(unsigned long, void const*) in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::DeflateHelper(ON_CompressedBufferHelper*, unsigned long, void const*) in opennurbs_zlib.cpp.o
  "_brl_deflateEnd", referenced from:
      ON_CompressStream::End() in opennurbs_compress.cpp.o
      ON_BinaryArchive::CompressionEnd() in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::CompressionEnd(ON_CompressedBufferHelper*) const in opennurbs_zlib.cpp.o
  "_brl_deflateInit_", referenced from:
      ON_CompressStream::Begin() in opennurbs_compress.cpp.o
      ON_BinaryArchive::CompressionInit() in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::CompressionInit(ON_CompressedBufferHelper*) const in opennurbs_zlib.cpp.o
  "_brl_inflate", referenced from:
      ON_UncompressStream::In(unsigned long long, void const*) in opennurbs_compress.cpp.o
      ON_UncompressStream::End() in opennurbs_compress.cpp.o
      ON_BinaryArchive::ReadInflate(unsigned long, void*) in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::InflateHelper(ON_CompressedBufferHelper*, unsigned long, void*) const in opennurbs_zlib.cpp.o
  "_brl_inflateEnd", referenced from:
      ON_UncompressStream::End() in opennurbs_compress.cpp.o
      ON_BinaryArchive::CompressionEnd() in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::CompressionEnd(ON_CompressedBufferHelper*) const in opennurbs_zlib.cpp.o
  "_brl_inflateInit_", referenced from:
      ON_UncompressStream::Begin() in opennurbs_compress.cpp.o
      ON_BinaryArchive::CompressionInit() in opennurbs_zlib.cpp.o
      ON_CompressedBuffer::CompressionInit(ON_CompressedBufferHelper*) const in opennurbs_zlib.cpp.o
ld: symbol(s) not found for architecture x86_64

view this post on Zulip Sean (May 06 2020 at 09:28):

this was on mac

view this post on Zulip Sean (May 06 2020 at 09:31):

looks like it's still linking system libz there at the end

view this post on Zulip starseeker (May 06 2020 at 12:47):

That's... weird. Wonder why it's doing that only on one platform.

The license scanning is in place now - might be time to subsume the openNURBS build under libbrep.

view this post on Zulip Sean (May 06 2020 at 18:13):

Because we're only testing 2-3 platforms?

view this post on Zulip Sean (May 06 2020 at 18:15):

Characterizing it diminutively as "only on one platform" is not exactly a healthy perspective for the build... That was absolutely broken elsewhere as well.

It depends how ld is configured on the system, the type of library being compiled, and options in use (two of those are in the user's purview). Libraries can be resolved at compilation, at linkage, or at runtime (or combinations thereof). Mac happens to default to resolved dynamic libraries, but Linux, UNIX, and BSD linking can be configured this way as well (and some distros do).

view this post on Zulip Sean (May 06 2020 at 18:16):

All that was probably needed was listing our renamed libz as a dependency to cmake and it would have probably done the right thing.

view this post on Zulip Sean (May 06 2020 at 18:17):

We've hit that issue before repeatedly over the years. Usually from Mac, but also from some Linux platforms. Cmake figures it out if things are fully declared correctly.

view this post on Zulip Sean (May 06 2020 at 18:18):

@starseeker build now fails with a different set of issues:

In file included from /Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array.h:1788:
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:120:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
[ 12%] Built target SPSR
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:434:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:463:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:557:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:583:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:875:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:901:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:966:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
[ 12%] Building C object src/libnmg/CMakeFiles/libnmg-obj.dir/copy.c.o
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:1585:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:1627:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:1663:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:1890:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^
[ 12%] Building C object src/libbu/CMakeFiles/libbu-obj.dir/whereis.c.o
/Users/morrison/brlcad.trunk/src/libbrep/openNURBS/opennurbs_array_defs.h:1970:36: error: unknown
      warning group '-Wclass-memaccess', ignored [-Werror,-Wunknown-warning-option]
#  pragma clang diagnostic ignored "-Wclass-memaccess"
                                   ^

view this post on Zulip Sean (May 06 2020 at 18:30):

trying a different compiler...

view this post on Zulip Sean (May 06 2020 at 18:32):

also horked. looks like more pragmas. did you add those or were they already there? some seem to be in include/defines.h, others in opennurbs files.

view this post on Zulip Sean (May 06 2020 at 18:47):

@starseeker I see you riddled it with the pragmas. Please, please...
Please don't make huge changes like this. I get why as we've talked about moving opennurbs, but this is so unstable an approach with pragma hacks on top of a major dependency restructure.. all to fix what should have been a single line declaration that was missing.

view this post on Zulip starseeker (May 06 2020 at 18:53):

Reverted. FWIW, wasn't just for that - figured it would avoid the Windows zlib issue as well.

view this post on Zulip starseeker (May 06 2020 at 18:54):

Still need to figure out why openNURBS is getting system zlib on the mac - by the time it gets to src/other/openNURBS FindZLIB should be satisfied

view this post on Zulip Sean (May 06 2020 at 18:55):

I'm sure it wasn't, but it's just the unexpected stress of everything coming to a halt. Was dead in the water again and that went in a more unstable direction with maintenance issues embedded.

view this post on Zulip starseeker (May 06 2020 at 18:57):

fair enough. I'll see if I can coax a mac build out of github - that's failing when Ubuntu, BSD and Windows don't so I'm having a hard time predicting failures reliably.

view this post on Zulip Sean (May 06 2020 at 18:59):

Sorry if my frustration is coming through. I hear Lee from 15 years ago in my messages. Just needing things to be a bit more stable on trunk to make progress debugging.

view this post on Zulip starseeker (May 06 2020 at 19:00):

No problem - I'm getting a bit greedy trying to push through things that I've wanted to do for a long time, and now isn't the time for it.

view this post on Zulip Sean (May 06 2020 at 19:00):

BSD was also broken with the pragmas, fwiw.

view this post on Zulip starseeker (May 06 2020 at 19:01):

right, that one's on me - the class-memaccess warning must be unique to GCC right now.

view this post on Zulip starseeker (May 06 2020 at 19:01):

(probably a real issue though - that may bite us someday with the openNURBS code...)

view this post on Zulip Sean (May 06 2020 at 19:01):

there were 3-4 pragmas it was complaining about, different files, that was just the first

view this post on Zulip starseeker (May 06 2020 at 19:02):

Yeah, clang and GCC define different sets. Right now the src/other builds deliberately don't enable any of the warnings so we don't see any of that.
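Editorial sketch, not code from the BRL-CAD tree: one way to keep a diagnostic pragma from tripping -Wunknown-warning-option is to emit it only for a compiler that knows the warning group. The example assumes Clang's `__has_warning` feature check and assumes GCC 8 or newer in C++ mode for -Wclass-memaccess; `struct point` and `point_clear` are hypothetical names used only to give the pragma something to sit in front of.

```c
#include <assert.h>
#include <string.h>

/* Clang defines __has_warning; give other compilers a harmless fallback. */
#ifndef __has_warning
#  define __has_warning(x) 0
#endif

#if defined(__clang__)
/* Only silence the group if this Clang actually knows it. */
#  if __has_warning("-Wclass-memaccess")
#    pragma clang diagnostic ignored "-Wclass-memaccess"
#  endif
#elif defined(__GNUC__) && (__GNUC__ >= 8) && defined(__cplusplus)
/* Assumption: the group exists in GCC 8+ and applies to C++ only. */
#  pragma GCC diagnostic ignored "-Wclass-memaccess"
#endif

/* Hypothetical type and helper, for illustration only. */
struct point { double x, y, z; };

void point_clear(struct point *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p)); /* the kind of raw memory write the warning targets */
}
```

As noted in the thread, a build-system setting (-w or -Wno-error on the bundled sources, or -isystem for their headers) avoids touching the code at all; the guard above only matters when a pragma is unavoidable.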

view this post on Zulip Sean (May 06 2020 at 19:02):

and I get why, but pragmas should be last resort imho, when there is not a build system solution because that is intrinsically unstable

view this post on Zulip Sean (May 06 2020 at 19:03):

a -w flag forced on src/libbrep/openNURBS would have been better I think

view this post on Zulip Sean (May 06 2020 at 19:03):

or maybe -Wno-error

view this post on Zulip starseeker (May 06 2020 at 19:03):

Probably what would have had to happen in that case is including src/libbrep/openNURBS with -isystem

view this post on Zulip starseeker (May 06 2020 at 19:04):

Anyway, moot

view this post on Zulip Sean (May 06 2020 at 19:04):

pragmas went into our headers too, which means the instability infection spread

view this post on Zulip starseeker (May 06 2020 at 19:06):

Well, the revert should be clean - two commits only, not mixed with anything

view this post on Zulip starseeker (May 06 2020 at 19:06):

I'm less sure what to do about the system zlib creeping in - can you send me your CMakeCache.txt file?

view this post on Zulip Sean (May 06 2020 at 19:11):

sure

view this post on Zulip Sean (May 06 2020 at 19:17):

any idea what this is about? I don't get it all the time and it doesn't halt cmake despite the message:

CMake Error: The source directory "/Users/morrison/brlcad.trunk/.build/remove" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
CMake Error: The source directory "/Users/morrison/brlcad.trunk/.build/remove" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
CMake Error: The source directory "/Users/morrison/brlcad.trunk/.build/remove" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
...

view this post on Zulip starseeker (May 06 2020 at 19:17):

Not sure - I've been seeing that too, but haven't run it down yet.

view this post on Zulip Sean (May 06 2020 at 19:18):

Happens repeatedly intermixed throughout the cmake output, so it's undoubtedly in some macro/function

view this post on Zulip Sean (May 06 2020 at 19:19):

that message isn't in our code (at least I didn't find it)
not sure what to make of 'remove' either as that's not a file in our repo. we do test for the remove function, but that seems entirely unrelated to the messages and this message appearing just in the past few days.

view this post on Zulip starseeker (May 06 2020 at 19:21):

I think I noticed it when I started using a newer CMake, but I'll have to confirm that. It didn't seem to break anything, and I'm using a way newer CMake than most distros, so I wasn't sure if it needed to be run down for general use yet... Since you're seeing it too, that decides it

view this post on Zulip Sean (May 06 2020 at 19:24):

I'm on a newer cmake too 3.16

view this post on Zulip Sean (May 06 2020 at 19:25):

also have 3.12 to test with when I'm trying to make sure things work back a ways

view this post on Zulip starseeker (May 06 2020 at 19:42):

well hot diggity, looks like github workflows will let me kick off a Mac build

view this post on Zulip starseeker (May 06 2020 at 19:47):

And now (of course) it doesn't want to give me the remove error. Did it happen on a clean CMake run, or is it when reconfiguring with an existing build dir?

view this post on Zulip starseeker (May 06 2020 at 20:20):

growl... the non-enable-all github osx build didn't help any - it built

view this post on Zulip starseeker (May 06 2020 at 20:20):

enable_all might be more promising, but also quite slow...

view this post on Zulip starseeker (May 06 2020 at 20:45):

arrrrgh. that built as well.

view this post on Zulip starseeker (May 06 2020 at 20:46):

@Sean I've got Xcode_11.4.1.app with AppleClang 11.0.3.11030032 on the runner - is that radically different from yours?

view this post on Zulip Sean (May 06 2020 at 20:47):

may be a different tool chain or different compiler version or different cmake. i'll see if I can repeat the error with everything reverted.

view this post on Zulip Sean (May 06 2020 at 20:55):

appreciate you checking. maybe something in my newer cmake's find zlib or something with newer clang (I have multiple compilers installed, xcode's isn't sufficient for some things, e.g., fuzzing)

view this post on Zulip starseeker (May 06 2020 at 21:00):

The log and cache files may help, once you get them. If we absolutely have to we can revert back to pre 8.6.

view this post on Zulip starseeker (May 06 2020 at 22:28):

Yeah, struck out - got a successful make check run on github's osx runner. Only build failure I've got right now is on OpenBSD, and that's something totally different.

view this post on Zulip starseeker (May 07 2020 at 01:16):

OK, pretty sure I found the source of the spurious .build/remove messages (cmake --trace to the rescue.)

view this post on Zulip starseeker (May 07 2020 at 02:08):

@Sean There's a slight chance that the ".build/remove" error messages could have had an impact on what was being found on the Mac - if configurations were switched between system and local, that was the piece of logic that was supposed to be clearing out stale files. The files didn't end up removed. My initial guess would have been that that failure was more likely to result in a local copy being found instead of a system copy rather than the other way around, but it certainly could have caused something unexpected to happen.

view this post on Zulip Sean (May 07 2020 at 02:12):

okay, I've got several builds going at the moment so I'll let you know how things are looking

view this post on Zulip starseeker (May 07 2020 at 02:52):

I think that's about as good as it's going to get for OpenBSD right now, at least until they add a way to get the executable pathname.

view this post on Zulip Sean (May 07 2020 at 02:53):

how's that? none of the implemented methods work?

view this post on Zulip starseeker (May 07 2020 at 02:54):

Apparently not. Poked around online a bit and it seems like that's just a feature that OpenBSD either deliberately doesn't support or hasn't bothered to support.

view this post on Zulip Sean (May 07 2020 at 02:54):

but one of the methods is a gcc method

view this post on Zulip Sean (May 07 2020 at 02:54):

or are they llvm only?

view this post on Zulip starseeker (May 07 2020 at 02:55):

Primary compilation is clang/llvm now - don't know if gcc is still in there or not, but my understanding is that it's on its way out if it is

view this post on Zulip Sean (May 07 2020 at 02:55):

HAVE_PROGRAM_INVOCATION_NAME is false?

view this post on Zulip starseeker (May 07 2020 at 02:55):

Looking at the whoami.c code, I think they need some runtime support from the kernel to be able to know their location

view this post on Zulip starseeker (May 07 2020 at 02:57):

er, whereami.c rather

view this post on Zulip starseeker (May 07 2020 at 02:58):

https://stackoverflow.com/questions/31494901/how-to-get-the-executable-path-on-openbsd

view this post on Zulip starseeker (May 07 2020 at 03:02):

This seems to be the email referenced in the second comment: https://marc.info/?l=openbsd-misc&m=144987773230417&w=2

view this post on Zulip Sean (May 07 2020 at 03:05):

I think streams are getting crossed there. Theo is talking about the kernel providing some intrinsic storage/lookup capability. Similarly one of the responses is about whether it's immediately available, but even that is half wrong.

view this post on Zulip Sean (May 07 2020 at 03:06):

you don't need either of those.
in fact the second SO response is closer to being on track

view this post on Zulip Sean (May 07 2020 at 03:07):

guarantee there's a way to figure it out. worst case, we can repeat what the shell does to invoke the binary (which I originally had in there but found it to be unnecessary complexity, even as a fallback, so I stripped it out).

view this post on Zulip Sean (May 07 2020 at 03:08):

still, I'm surprised llvm doesn't supply program_invocation_name -- can you check the cmakeerror?

view this post on Zulip starseeker (May 07 2020 at 03:09):

ld: error: undefined symbol: program_invocation_name

view this post on Zulip starseeker (May 07 2020 at 03:10):

from CheckFunctionExists test

view this post on Zulip Sean (May 07 2020 at 03:11):

yeah, darn. that's the easiest. clang totally could have stashed it during _start

view this post on Zulip starseeker (May 07 2020 at 03:13):

Does getprogname do anything useful? https://man.openbsd.org/getprogname.3

view this post on Zulip Sean (May 07 2020 at 03:14):

well yeah, but it should already be testing for that and using it

view this post on Zulip Sean (May 07 2020 at 03:14):

grep GETPROGNAME in the cache

view this post on Zulip starseeker (May 07 2020 at 03:14):

Doesn't look like it's realpathed anyway...

view this post on Zulip Sean (May 07 2020 at 03:14):

bu_getprogname() uses it

view this post on Zulip Sean (May 07 2020 at 03:15):

neither is program_invocation_name, they are just faster alternatives to stashing argv0

view this post on Zulip starseeker (May 07 2020 at 03:15):

<not> - what we need is the all-up full path

view this post on Zulip starseeker (May 07 2020 at 03:15):

<nod> rather

view this post on Zulip Sean (May 07 2020 at 03:15):

it's undesirable to have to call setprogname(argv[0]) in every application, so half the code in question is simply trying to figure out the name of the running app.

view this post on Zulip Sean (May 07 2020 at 03:16):

the other half figures out the path to that app

view this post on Zulip Sean (May 07 2020 at 03:16):

if you have the name, you can always find the path brute force in all but theoretical edge cases that we don't care about.
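Editorial sketch of the brute-force lookup described above, i.e. repeating roughly what a shell does when it resolves a command name. This is not the bu_whereis() implementation; `find_in_path` and `is_executable_file` are hypothetical helpers, and the search list is passed in explicitly (a caller would typically hand it `getenv("PATH")`).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if path names a regular file the caller may execute. */
static int is_executable_file(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode) && access(path, X_OK) == 0;
}

/* Hypothetical helper: look for `name` in a colon-separated `pathlist`.
 * Writes the first match into `out`; returns 0 on success, -1 otherwise. */
int find_in_path(const char *name, const char *pathlist, char *out, size_t outsz)
{
    const char *p = pathlist;

    assert(name != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    /* A name containing a slash is used as given, with no list search. */
    if (strchr(name, '/') != NULL) {
        int n = snprintf(out, outsz, "%s", name);
        if (n > 0 && (size_t)n < outsz && is_executable_file(out))
            return 0;
        out[0] = '\0';
        return -1;
    }

    while (p != NULL) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        /* An empty entry in the list means the current directory. */
        int n = (len == 0)
            ? snprintf(out, outsz, "./%s", name)
            : snprintf(out, outsz, "%.*s/%s", (int)len, p, name);
        if (n > 0 && (size_t)n < outsz && is_executable_file(out))
            return 0;
        p = end ? end + 1 : NULL;
    }

    out[0] = '\0';
    return -1;
}
```

The result is only as good as the inputs: if the search list or working directory changed after the program was launched, the match may differ from the binary that is actually running.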

view this post on Zulip starseeker (May 07 2020 at 03:16):

we've got getprogname detected - it's the whereami.c code that doesn't know how to resolve the full path on OpenBSD

view this post on Zulip Sean (May 07 2020 at 03:17):

whereami's that external code you added for bu_dir right?

view this post on Zulip starseeker (May 07 2020 at 03:17):

right

view this post on Zulip Sean (May 07 2020 at 03:17):

you recall what exactly in bu_dir is using it? I just see the include.

view this post on Zulip starseeker (May 07 2020 at 03:18):

wai_getExecutablePath

view this post on Zulip Sean (May 07 2020 at 03:18):

ah, I found it

view this post on Zulip Sean (May 07 2020 at 03:18):

did whereis.c compile?

view this post on Zulip starseeker (May 07 2020 at 03:19):

Originally, it bailed.

view this post on Zulip starseeker (May 07 2020 at 03:19):

I added a "always return -1" fallback

view this post on Zulip Sean (May 07 2020 at 03:19):

not whereami.c .. the other, whereis.c

view this post on Zulip starseeker (May 07 2020 at 03:19):

ah - let me check

view this post on Zulip starseeker (May 07 2020 at 03:20):

yes, compiled - makes a .o file

view this post on Zulip Sean (May 07 2020 at 03:20):

so then we have the means

view this post on Zulip Sean (May 07 2020 at 03:20):

we just have to merge or replace whereami.c with whereis.c

view this post on Zulip Sean (May 07 2020 at 03:21):

they do the same thing, there were just platform robustness differences

view this post on Zulip Sean (May 07 2020 at 03:24):

I don't recall the exact motivation for why you pulled in whereami.c but I think you had some concern with whereis.c for some reason. You remember what the issue was?

view this post on Zulip starseeker (May 07 2020 at 03:26):

it wasn't robust to arbitrary maneuvering on the file system, IIRC

view this post on Zulip starseeker (May 07 2020 at 03:28):

Illustrated simply:

mged> bu_brlcad_root bin
/home/user/brlcad/build/bin
mged> cd ../../
mged> bu_brlcad_root bin
mged> bu_dir bin
/home/user/brlcad/build/bin

view this post on Zulip starseeker (May 07 2020 at 03:28):

(that's on Linux)

view this post on Zulip Sean (May 07 2020 at 03:29):

that's not bu_whereis() though ... what's the bu_whereis() call -- I'm not sure how whereis would even be put to work there because bin is a directory, not a command

view this post on Zulip Sean (May 07 2020 at 03:30):

maybe bu_brlcad_root had/has misuse?

view this post on Zulip starseeker (May 07 2020 at 03:31):

No, bu_brlcad_root doesn't use whereis. The above is just illustrating why I put whereami.c in.

view this post on Zulip Sean (May 07 2020 at 03:33):

heh, yeah I seemed to remember it being some bug and instead of debugging, you pulled in someone else's code. :)

view this post on Zulip Sean (May 07 2020 at 03:34):

and now we have code debt from both :D

view this post on Zulip starseeker (May 07 2020 at 03:34):

and a working feature :P

view this post on Zulip Sean (May 07 2020 at 03:35):

not on openbsd apparently

view this post on Zulip starseeker (May 07 2020 at 03:35):

I don't think it ever worked there with any of the methods except probably your shell mimicking code.

view this post on Zulip Sean (May 07 2020 at 03:35):

bu_dir didn't, you're right -- that was all stubbed empty

view this post on Zulip Sean (May 07 2020 at 03:36):

bu_brlcad_root is still just a straight up bug -- it shouldn't be invariant to cwd at all

view this post on Zulip Sean (May 07 2020 at 03:36):

s/invariant/variant/

view this post on Zulip Sean (May 07 2020 at 03:37):

the method that whereami code is using is the method to use on windows, and I think it's using the same as other code on linux

view this post on Zulip starseeker (May 07 2020 at 03:40):

In fairness - the above mged log I showed used an mged invoked from the build/lib dir as follows:

../bin/mged -c -a nu ../share/db/moss.g

view this post on Zulip Sean (May 07 2020 at 03:40):

I think the only immediate solution is extending bu_dir with different methods

view this post on Zulip Sean (May 07 2020 at 03:41):

or, hm

view this post on Zulip starseeker (May 07 2020 at 03:41):

So I should more precisely have said it isn't just cwd but cwd+invocation argv0

view this post on Zulip Sean (May 07 2020 at 03:42):

getExecutablePath() intrinsically relies on a global program name, we could just feed it the name we know

view this post on Zulip starseeker (May 07 2020 at 03:43):

you mean the relative path from argv[0]?

view this post on Zulip Sean (May 07 2020 at 03:43):

bu_getprogname()

view this post on Zulip Sean (May 07 2020 at 03:43):

which worst case will just be argv[0] passed from the app

view this post on Zulip Sean (May 07 2020 at 03:44):

it can call bu_whereis to get the resolved path as another method either in getExecutablePath() or in bu_dir() if we want to keep our code separate

view this post on Zulip starseeker (May 07 2020 at 03:45):

what will happen once the cwd changes though? Do we replace the recorded argv[0] with its resolution immediately upon program launch before the app has a chance to change its working directory?

view this post on Zulip Sean (May 07 2020 at 03:46):

I think you're getting issues mixed up -- that's some bu_brlcad_root issue

view this post on Zulip Sean (May 07 2020 at 03:47):

getprogname doesn't stop working after cwd

view this post on Zulip starseeker (May 07 2020 at 03:47):

Maybe - to me the "user level" issue is I want bu_dir to always resolve correctly.

view this post on Zulip Sean (May 07 2020 at 03:47):

bu_which logic has nothing that introspects cwd

view this post on Zulip Sean (May 07 2020 at 03:48):

heh, um, yes... and me too...

view this post on Zulip starseeker (May 07 2020 at 03:49):

Are you saying that the bu_brlcad_root issue is independent of the program name?

view this post on Zulip Sean (May 07 2020 at 03:50):

i'm saying nothing about the bu_brlcad_root issue at the moment -- there is some issue there to be looked into at some point, but I didn't refer to it

view this post on Zulip starseeker (May 07 2020 at 03:50):

OK

view this post on Zulip Sean (May 07 2020 at 03:52):

the issue at hand I thought was getting bu_dir to return the path that getExecutablePath() is/should be providing because there's some compilation issue on obsd

view this post on Zulip Sean (May 07 2020 at 03:52):

we have everything needed to get the current executable path on obsd

view this post on Zulip starseeker (May 07 2020 at 03:54):

OK... that's the whereis.c code?

view this post on Zulip Sean (May 07 2020 at 03:54):

that can be got using bu_getprogname and bu_which, or we could tweak the logic in whereami.c -- there is a bsd section in there, so I'm a little surprised it doesn't work

view this post on Zulip Sean (May 07 2020 at 03:54):

er, yes s/which/whereis/

view this post on Zulip starseeker (May 07 2020 at 03:54):

I tried - OpenBSD doesn't define KERN_PROC_PATHNAME

view this post on Zulip Sean (May 07 2020 at 03:55):

sure, but what about KERN_PROC_ARGS and KERN_PROC_ARGV ?

view this post on Zulip Sean (May 07 2020 at 03:58):

assuming it supplies access to argv globally, then we either call bu_whereis/bu_which or replicate that logic in whereami.c for obsd

view this post on Zulip starseeker (May 07 2020 at 03:59):

I believe it has those...

view this post on Zulip Sean (May 07 2020 at 03:59):

that's what I was originally saying -- the choice is between modifying whereami.c to extend it with openbsd support or using existing support at the higher call in bu_dir to call getprogname+whereis

view this post on Zulip Sean (May 07 2020 at 04:01):

Thomas' response at https://stackoverflow.com/questions/31494901/how-to-get-the-executable-path-on-openbsd looks like a near drop-in for whereami.c if you want to try it on openbsd; that might make a good upstream patch anyways

view this post on Zulip Sean (May 07 2020 at 04:02):

whereami.c supports a nice variety of platforms already, so second choice would be to extend it with a LIBBU fallback method using getprogname+whereis

view this post on Zulip Sean (May 07 2020 at 04:02):

I think either of those are superior to doing anything in bu_dir

view this post on Zulip starseeker (May 07 2020 at 04:03):

Agreed - I guess I'm going to have to try it before it will "click"

view this post on Zulip starseeker (May 07 2020 at 04:04):

I suppose this is something relatively safe for me to chew on...

view this post on Zulip Sean (May 07 2020 at 04:04):

for what it's worth, that stack overflow response is basically doing an argv0 lookup followed by a PATH search, which is essentially the exact same code as calling bu_getprogname+bu_whereis

view this post on Zulip Sean (May 07 2020 at 04:06):

bu_whereis searches the system path a lot harder, so I guess his method is actually identical to bu_which()

view this post on Zulip Sean (May 07 2020 at 04:06):

searching env(PATH)
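
The PATH search described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `find_in_path` and `is_executable_file` are hypothetical names, and the code is not the implementation of bu_which(), bu_whereis(), or the Stack Overflow answer. It walks a colon-separated directory list and returns the first candidate that is an executable regular file.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if `path` names a regular file the caller may execute. */
int
is_executable_file(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode) && access(path, X_OK) == 0;
}

/* Hypothetical helper (not a libbu function): look for `name` in the
 * colon-separated `path_list`.  Returns 0 and writes the candidate path
 * to `out` on success; returns -1 otherwise, in which case the contents
 * of `out` are unspecified. */
int
find_in_path(const char *name, const char *path_list, char *out, size_t outlen)
{
    const char *p;

    if (!name || !*name || !path_list || !out || outlen == 0)
        return -1;

    /* A name containing a slash is not searched for, mirroring shell behavior. */
    if (strchr(name, '/')) {
        int n = snprintf(out, outlen, "%s", name);
        return (n > 0 && (size_t)n < outlen && is_executable_file(out)) ? 0 : -1;
    }

    for (p = path_list; *p != '\0';) {
        size_t len = strcspn(p, ":");
        int n;

        /* An empty entry stands for the current directory. */
        if (len == 0)
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)len, p, name);

        if (n > 0 && (size_t)n < outlen && is_executable_file(out))
            return 0;

        p += len;
        if (*p == ':')
            p++;
    }
    return -1;
}
```

A caller would typically pass getenv("PATH") as `path_list`. Note that a relative result (from an empty PATH entry or a name with a slash) still depends on the current working directory, which is the weakness discussed in the messages that follow.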

view this post on Zulip starseeker (May 07 2020 at 04:08):

Am I wrong, or will bu_whereis fail in the case with a relative path to an executable that isn't in any normal binary launching path?

view this post on Zulip starseeker (May 07 2020 at 04:08):

I.e. not in PATH, not in any standard system bin dir

view this post on Zulip Sean (May 07 2020 at 04:11):

it should fail, that's what 'whereis' does. 'which' will return the relative path. calling code must call realpath on relative paths.

view this post on Zulip starseeker (May 07 2020 at 04:11):

But realpath will only succeed if the current path is still the same as it was when the original invocation occurred?

view this post on Zulip Sean (May 07 2020 at 04:12):

bingo

view this post on Zulip starseeker (May 07 2020 at 04:12):

and isn't that the problem?

view this post on Zulip Sean (May 07 2020 at 04:12):

(and yes, that's a problem)

view this post on Zulip starseeker (May 07 2020 at 04:12):

my understanding was that that is the issue whereami.c was trying to resolve

view this post on Zulip Sean (May 07 2020 at 04:12):

hah, my sausage fingers are slow tonight .. fell down the stairs

view this post on Zulip starseeker (May 07 2020 at 04:12):

yikes!

view this post on Zulip Sean (May 07 2020 at 04:13):

gonna be sore tomorrow

view this post on Zulip Sean (May 07 2020 at 04:14):

it is, it's the same reason bu_argv0_full_path() exists too

view this post on Zulip starseeker (May 07 2020 at 04:16):

So... won't what we're proposing to implement for OpenBSD end up failing in those cases without something like KERN_PROC_PATHNAME to fall back on?

view this post on Zulip Sean (May 07 2020 at 04:16):

same reason it's deprecated, but almost impossible to remove. the only foolproof way for it to work is to either do a realpath() lookup right away (which should be happening when bu_setprogname() is called) or otherwise stashing the cwd on init

view this post on Zulip Sean (May 07 2020 at 04:17):

if you just pull that person's code as-is, yes -- it will be susceptible to path changes, but we did resolve the initial path issue in our code

view this post on Zulip Sean (May 07 2020 at 04:17):

bu_getiwd()
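
As a rough sketch of the "stash the cwd on init" idea: if the initial working directory is recorded before anything calls chdir(), a relative argv[0] can still be turned into a usable path later. `join_with_iwd` is a hypothetical helper written for illustration, not a libbu function, and the `iwd` argument is assumed to be whatever was saved at startup (for instance the value bu_getiwd() reports).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: combine the initial working directory `iwd`
 * (captured at program start) with `argv0`.  Absolute argv0 values are
 * copied unchanged.  Returns 0 on success, -1 on bad input or if the
 * result does not fit in `out`. */
int
join_with_iwd(const char *iwd, const char *argv0, char *out, size_t outlen)
{
    int n;

    if (!iwd || !argv0 || !*argv0 || !out || outlen == 0)
        return -1;

    if (argv0[0] == '/')
        n = snprintf(out, outlen, "%s", argv0);
    else
        n = snprintf(out, outlen, "%s/%s", iwd, argv0);

    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With the earlier invocation (`../bin/mged` run from build/lib), this yields `<iwd>/../bin/mged`, which realpath() can canonicalize no matter where the process has since changed directory to. A bare command name with no slash would still need a PATH search first.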

view this post on Zulip starseeker (May 07 2020 at 04:17):

bu_setprogname is currently just copying argv0 - no realpath resolution at all that I can see...

view this post on Zulip Sean (May 07 2020 at 04:19):

that's likely the bit missing that makes all the realpath() callers in libbu fail after a chdir

view this post on Zulip starseeker (May 07 2020 at 04:20):

will try that tomorrow - might be able to ditch the whereami.c code if that does the trick...

view this post on Zulip Sean (May 07 2020 at 04:20):

like I said, I wouldn't ditch it.

view this post on Zulip Sean (May 07 2020 at 04:21):

it's got good platform case variety and they're good, just not piecewise generalized

view this post on Zulip Sean (May 07 2020 at 04:21):

that's why I didn't complain about it being added, it's a good path forward

view this post on Zulip starseeker (May 07 2020 at 04:22):

ah - so, bu_realpath the progname at set time, then use it as a fallback implementation if we don't have a platform answer?

view this post on Zulip starseeker (May 07 2020 at 04:22):

s/as/to support a/

view this post on Zulip Sean (May 07 2020 at 04:22):

to be ditched would probably be path_normalize and all the bu_brlcad* routines and the bu_argv0_full_path logic (and callers)

view this post on Zulip starseeker (May 07 2020 at 04:23):

Actually, I think I use path_normalize for some .g path operations, not just filesystem stuff

view this post on Zulip starseeker (May 07 2020 at 04:23):

would need to check, but I think search calls it

view this post on Zulip starseeker (May 07 2020 at 04:24):

yeah, libged/search.c

view this post on Zulip Sean (May 07 2020 at 04:25):

it's just redundant with something else in libbu iirc

view this post on Zulip starseeker (May 07 2020 at 04:25):

O.o

view this post on Zulip starseeker (May 07 2020 at 04:25):

I sure missed it if it is...

view this post on Zulip Sean (May 07 2020 at 04:26):

I could be wrong

view this post on Zulip Sean (May 07 2020 at 04:28):

it's just one func with no system calls too, so not a big deal even if that logic is somewhere, and like you said it might not be

view this post on Zulip starseeker (May 07 2020 at 04:28):

If you're thinking of realpath_bsd.c, that's the actual filesystem resolver from OpenBSD - using that to avoid that crazy crasher case we hit trying to use the Linux realpath

view this post on Zulip starseeker (May 07 2020 at 04:30):

OK, that's enough finger torture - get some rest!

view this post on Zulip Sean (May 07 2020 at 06:38):

yeah you're right, file_path_normalize() is cool. I think I was remembering bu_file_realpath() and GetFullPathName()'s behavior, but it's not the same as realpath().

view this post on Zulip starseeker (May 07 2020 at 13:15):

@Sean To be sure I understand - we shouldn't be using bio.h in our public header files, correct?

view this post on Zulip starseeker (May 07 2020 at 16:13):

@Sean you probably told me, but I'm afraid I don't recall - what's the reason we don't call bu_setprogname from libbu_init along with bu_getiwd?

view this post on Zulip Sean (May 07 2020 at 18:34):

starseeker said:

Sean To be sure I understand - we shouldn't be using bio.h in our public header files, correct?

No, it's fine for use in our headers. I mean, usage should be minimized, but there's not a problem. Why you ask?

view this post on Zulip starseeker (May 07 2020 at 18:38):

The repository.sh script has a test that says "make sure nobody includes private headers like bio.h in a public header"

view this post on Zulip Sean (May 07 2020 at 18:38):

starseeker said:

Sean you probably told me, but I'm afraid I don't recall - what's the reason we don't call bu_setprogname from libbu_init along with bu_getiwd?

Because we don't have the argument to pass that it needs. bu_init() is called before main(), before argv0 exists.

view this post on Zulip starseeker (May 07 2020 at 18:38):

Ah.

view this post on Zulip starseeker (May 07 2020 at 18:39):

I implemented a C++ version that looked for bio.h in include/ and there were a fair number. I scrubbed them back out, but I was wondering if the comment was wrong or the test was broken...

view this post on Zulip Sean (May 07 2020 at 18:45):

"why not both"?

view this post on Zulip Sean (May 07 2020 at 18:46):

I'm looking

view this post on Zulip Sean (May 07 2020 at 18:53):

yeah, that test was broken. I removed it. the single quotes stop expansion of ${i} so grep will never report a match

but also, the restriction that bio.h and friends not be included was removed a few years ago, so it's no longer needed.

view this post on Zulip starseeker (May 07 2020 at 18:53):

Well crud

view this post on Zulip Sean (May 07 2020 at 18:53):

good catch

view this post on Zulip starseeker (May 07 2020 at 18:54):

I'll put it back for the windows.h includes then - those are the messy ones.

view this post on Zulip Sean (May 07 2020 at 18:54):

note the other bio check is still good, make sure no redundant bio.h and stdio.h inclusions together

view this post on Zulip Sean (May 07 2020 at 18:55):

I think we've even used the wrapper headers in src/other .. though maybe not any longer

view this post on Zulip Sean (May 07 2020 at 18:56):

yeah looks like no longer

view this post on Zulip Sean (May 07 2020 at 18:57):

you mean put the bio.h public header inclusion test back? no entiendo

view this post on Zulip starseeker (May 07 2020 at 18:58):

brep/defines.h bu/tcl.h and fb/fb_wgl.h were using bio.h because they need windows.h - I switched it to just the guarded windows.h include, but it's a lot more verbose.

view this post on Zulip starseeker (May 07 2020 at 18:58):

So I'll put bio.h back in those three headers

view this post on Zulip Sean (May 07 2020 at 18:59):

Ah, I hadn't noticed you'd removed them yet

view this post on Zulip starseeker (May 07 2020 at 18:59):

I compile tested the snot out of it this time, so hopefully it shouldn't hurt anything...

view this post on Zulip Sean (May 07 2020 at 19:00):

you're on point, duplication and verbosity trump as being problematic

view this post on Zulip Sean (May 07 2020 at 19:02):

the only reason they had that rule at the beginning is because they weren't originally safe for public header inclusion. and once we had something they provided needed in a couple places, the refactoring and testing cleanup happened to make sure they were safe.

view this post on Zulip starseeker (May 07 2020 at 19:03):

Ah, yeah I remember that now.

view this post on Zulip starseeker (May 07 2020 at 19:04):

OK. I think I've got a working whereami.c fallback for OpenBSD now - once that's in I'll go test VS2019 and do a final across the board check. If that's all good it will mean trunk is building on everything I have access to.

view this post on Zulip Sean (May 07 2020 at 19:05):

the main issue is their usage of cmake-tested symbols to define/set/unset logic (for example HAVE_SYS_TIME_H is defined, so we know we can #include <sys/time.h>). we suddenly have a header that is dependent on compilation logic that may not be true on an installed system where the headers get used.

view this post on Zulip Sean (May 07 2020 at 19:06):

the solution was to simply minimize cmake-test-symbol usage as much as possible (which is a good practice for public headers regardless) and to at least document their usage in each header.
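
To make the concern concrete, here is a small sketch of a header fragment whose contents depend on a build-time test symbol. The `EXAMPLE_*` names are invented for illustration; only `HAVE_SYS_TIME_H` comes from the discussion. When BRL-CAD itself is built, the generated config header would define the symbol; a third-party program compiling against the installed header usually does not, so it silently gets the other branch.

```c
/* Fragment of a hypothetical public header.  HAVE_SYS_TIME_H is assumed
 * to be defined only by the project's own build configuration. */
#ifdef HAVE_SYS_TIME_H
#  include <sys/time.h>
#  define EXAMPLE_USES_TIMEVAL 1
#else
#  define EXAMPLE_USES_TIMEVAL 0
#endif

/* Reports which branch the including translation unit actually saw. */
int
example_uses_timeval(void)
{
    return EXAMPLE_USES_TIMEVAL;
}
```

The same header can therefore describe two different interfaces depending on who compiles it, which is why keeping such symbols out of public headers, or at least documenting them, matters.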

view this post on Zulip Sean (May 07 2020 at 19:06):

starseeker said:

OK. I think I've got a working whereami.c fallback for OpenBSD now - once that's in I'll go test VS2019 and do a final across the board check. If that's all good it will mean trunk is building on everything I have access to.

cool, did you end up using bu_getiwd() or doing something else?

view this post on Zulip starseeker (May 07 2020 at 19:06):

Yep, bu_getiwd

view this post on Zulip Sean (May 07 2020 at 19:08):

I have a distcheck-full running on RELEASE going, so should have a status check there once that finishes

view this post on Zulip starseeker (May 07 2020 at 19:08):

did you end up getting any CMakeCache.txt files or logs with failures?

view this post on Zulip starseeker (May 07 2020 at 19:10):

I need to test RELEASE with a large .g file.

view this post on Zulip Sean (May 07 2020 at 19:10):

that's what's running. I'm going methodically through settings to help make sure a failure doesn't come from something from a previous test, or different compiler, etc

view this post on Zulip starseeker (May 07 2020 at 19:11):

OK, sounds good. Once you've got that take a shot at trunk, if you can - I'm trying to get trunk into a solid state, and then I'll switch off into branches for a while to keep it stable.

view this post on Zulip starseeker (May 07 2020 at 19:23):

Yep, RELEASE on Windows was able to open a big .g test - phew

view this post on Zulip starseeker (May 07 2020 at 19:29):

alright, let's see what VS2019 has to say...

view this post on Zulip Sean (May 07 2020 at 19:50):

shouldn't 75729 be reverted? looks like all of the bio.h's pushed down into headers were because the bu headers didn't declare something they needed (FILE), which means the public headers are incomplete, possibly broken now

view this post on Zulip Sean (May 07 2020 at 19:51):

I don't think we have a test in there any more that tests isolated header compilation, but that'd probably be good to restore

view this post on Zulip Sean (May 07 2020 at 19:57):

hm, looks like we need that -- doing a quick test, looks like we have a dozen or so headers on trunk that have type declaration errors, missing headers, and most look like relatively recent changes. would be neat to get cmake doing something like this:

find include -name \*.h -exec g++ -Iinclude -I.build/include -Isrc/other/openNURBS -fsyntax-only -Wall -Wextra -Wno-deprecated {} \;

view this post on Zulip starseeker (May 07 2020 at 19:58):

Can revert it, but I build tested on multiple platforms... you're thinking it's still broken somewhere?

view this post on Zulip starseeker (May 07 2020 at 19:58):

I added stdio.h in lieu of bio.h when it just needed FILE, for example

view this post on Zulip starseeker (May 07 2020 at 20:00):

Hmm. That might be possible, actually...

view this post on Zulip starseeker (May 07 2020 at 20:01):

I'll try it in the branch

view this post on Zulip starseeker (May 07 2020 at 20:12):

Erm. How do I configure test for a compiler flag that by design produces no .o file?

view this post on Zulip starseeker (May 07 2020 at 20:53):

@Sean A rough attempt at such a header testing target is in trunk:

cmake .. -DBRLCAD_HDR_CHECK=ON && make check-headers -k

view this post on Zulip starseeker (May 07 2020 at 21:00):

I don't see any at first glance that look like they're due to r75729...

view this post on Zulip starseeker (May 07 2020 at 21:01):

Do you want me to start fixing them? The two I'm not sure how to handle are vector_x86.h and vector_fpu.h

view this post on Zulip Sean (May 07 2020 at 21:05):

starseeker said:

Can revert it, but I build tested on multiple platforms... you're thinking it's still broken somewhere?

Not a big deal yet, but there is an issue there.

The headers are likely broken by the nature of using a symbol that hasn't been declared yet, but compilation won't stop because you added the header in all the places it was used (thus masking the fact the header didn't declare it like it should have).

Later someone goes to use the header and gets the error, and they either perpetuate by adding stdio/bio to their source or they fix the header leaving a lot of unnecessary redundant includes in all the callers.

view this post on Zulip starseeker (May 07 2020 at 21:07):

Wouldn't the g++ check you ran see the undeclared symbols? Maybe I'm misunderstanding what that's telling us...

view this post on Zulip Sean (May 07 2020 at 21:07):

starseeker said:

I added stdio.h in lieu of bio.h when it just needed FILE, for example

if you actually did catch them all, then they're not broken, but then it means all the bio.h in the usage locations are redundant/unnecessary right?

view this post on Zulip starseeker (May 07 2020 at 21:08):

Some yes, some no - open/close calls need unistd.h

view this post on Zulip Sean (May 07 2020 at 21:08):

which is to say those headers aren't right

view this post on Zulip starseeker (May 07 2020 at 21:09):

sorry, I was talking about the source files

view this post on Zulip starseeker (May 07 2020 at 21:09):

you're talking about bio.h usage in the headers?

view this post on Zulip Sean (May 07 2020 at 21:09):

if the symbols were only in the source files then it's certainly fine

view this post on Zulip starseeker (May 07 2020 at 21:10):

there are only 3 bio.h usages in the headers, and they should all be for windows.h (things like HANDLE, iirc, or getting ahead of opennurbs.h in one case.)

view this post on Zulip Sean (May 07 2020 at 21:10):

I mean symbols in the public headers that were satisfied by bio.h (regardless of the comment about it being for whatever reason)

view this post on Zulip starseeker (May 07 2020 at 21:11):

right, I tried to get all of those when I made the switch.

view this post on Zulip Sean (May 07 2020 at 21:11):

but did you get them by making sure they included what they declared, or did you get them by putting a header in a source file

view this post on Zulip starseeker (May 07 2020 at 21:12):

including what they declared

view this post on Zulip Sean (May 07 2020 at 21:12):

then there's no problem

view this post on Zulip Sean (May 07 2020 at 21:12):

a little surprising, but not a problem if true ;)

view this post on Zulip starseeker (May 07 2020 at 21:13):

if that means what I think it means - i.e. bu_function(FILE *stream) needs stdio in the header because the function signature exposes the FILE type

view this post on Zulip Sean (May 07 2020 at 21:14):

yep, that's exactly it
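
A minimal sketch of that rule, using made-up names (`EXAMPLE_LOG_H`, `example_log`) rather than any real libbu API: because `FILE` appears in the declared signature, the header pulls in `<stdio.h>` itself instead of relying on every caller to have included it first.

```c
/* ---- contents of a hypothetical public header, example_log.h ---- */
#ifndef EXAMPLE_LOG_H
#define EXAMPLE_LOG_H

#include <stdio.h> /* FILE is exposed in the signature below */

/* Writes "log: <msg>\n" to `stream`.  Returns the number of characters
 * written, or -1 if either argument is NULL. */
int example_log(FILE *stream, const char *msg);

#endif /* EXAMPLE_LOG_H */

/* ---- a matching implementation, normally in its own .c file ---- */
int
example_log(FILE *stream, const char *msg)
{
    if (!stream || !msg)
        return -1;
    return fprintf(stream, "log: %s\n", msg);
}
```

A header written this way passes an isolated compile such as the `-fsyntax-only` check mentioned earlier, and source files that include it no longer need their own `<stdio.h>` just to satisfy the prototype.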

view this post on Zulip Sean (May 07 2020 at 21:14):

you tested windows? iirc, there's a handful of things that come from stdio.h on linux that come from windows.h on windows.

view this post on Zulip starseeker (May 07 2020 at 21:14):

I've built on windows - haven't tried your g++ style test there (can it be done with msvc?)

view this post on Zulip Sean (May 07 2020 at 21:15):

I don't think we have any in public headers, but wouldn't bet on it

view this post on Zulip starseeker (May 07 2020 at 21:15):

I think I listed the platforms I tested in the commit message... let me see... ah:

Passes make check on MSVC 2017, Linux GCC 9.2, Linux clang 9, OSX (github runner), and OpenBSD (clang 8)

view this post on Zulip Sean (May 07 2020 at 21:16):

starseeker said:

Do you want me to start fixing them? The two I'm not sure how to handle are vector_x86.h and vector_fpu.h

If you want, but I wouldn't mix it with that branch work. Adding missing headers should be low risk (because in theory all the callers will simply have redundant includes once they get fixed).

view this post on Zulip Sean (May 07 2020 at 21:17):

would be fun to test the new gcc that came out today

view this post on Zulip Sean (May 07 2020 at 21:17):

it's been a year I think?

view this post on Zulip starseeker (May 07 2020 at 21:18):

The branch is pretty well closed out at this point - adding the header check in basic form was simple and non-invasive, and if you decide you want it in that form we can juice it up later in trunk.

view this post on Zulip starseeker (May 07 2020 at 21:20):

Last thing I should need to do in trunk (unless your mac builds identify more problems) is clean up the bot command refactor I started.

view this post on Zulip Sean (May 07 2020 at 21:20):

don't understand -- you added the header check to trunk

view this post on Zulip starseeker (May 07 2020 at 21:20):

just now, yeah - I did the initial experimentation in the branch

view this post on Zulip starseeker (May 07 2020 at 21:21):

Once it proved simple to add, I went ahead and merged it to trunk

view this post on Zulip Sean (May 07 2020 at 21:21):

ah, it's no risk so shrug

view this post on Zulip Sean (May 07 2020 at 21:21):

the header check is cool, thanks!

view this post on Zulip starseeker (May 07 2020 at 21:22):

It's a little more work to tie in compiler flag detection and hook it into make regress (which presumably is where you eventually want it to go) but for now it's a quick manual check if the hard-coded build flags work for the current platform.

view this post on Zulip Sean (May 07 2020 at 21:22):

is there a way to get rid of all the lines added to every header dir though? it's ancillary to that file's purpose and begs questions to newcomers

view this post on Zulip Sean (May 07 2020 at 21:23):

make regress only after all headers are fixed. they were all last clean and verified maybe 5 years ago

view this post on Zulip starseeker (May 07 2020 at 21:24):

yes, but there are trade-offs - using the lists this way we can be sure we aren't testing stray files stashed in the directories (CMAKEFILES ignored files, for example.) I could do one toplevel build that reaches down into the lower directories, but then any renaming done at lower levels will have non-obvious breakage consequences.

view this post on Zulip Sean (May 07 2020 at 21:25):

this is one of the rare instances where recursive globbing would have been kosher :)

view this post on Zulip starseeker (May 07 2020 at 21:25):

how do we exclude files though?

view this post on Zulip starseeker (May 07 2020 at 21:25):

besides CMakeLists.txt I mean...

view this post on Zulip starseeker (May 07 2020 at 21:26):

For example, bu has column.h and tbl.h stashed in it - a recursive glob will grab those, which we don't want

view this post on Zulip starseeker (May 07 2020 at 21:27):

I suppose I can try it and see if any of those cause breakage - maybe it won't matter

view this post on Zulip Sean (May 07 2020 at 21:27):

why not?
I would expect to glob on *.h .. and if there's a header in our include folder, good to know if it's busted.

view this post on Zulip Sean (May 07 2020 at 21:27):

if they work, then no problem, great to know if we ever do add a broken one

view this post on Zulip Sean (May 07 2020 at 21:28):

if they don't work, then they specifically could be excluded ... the downside is completely localized to the test itself

view this post on Zulip starseeker (May 07 2020 at 21:28):

A couple are "headers in progress" I left there years ago. I suppose if they're a problem they've just now qualified for removal as a maintenance burden ;-)

view this post on Zulip Sean (May 07 2020 at 21:29):

it's also super easy to fix headers in general if they are broken. it's just they propagate a slew of unnecessary includes and #ifdef logic to callers when they're broken

view this post on Zulip Sean (May 07 2020 at 21:29):

bsd used to be notorious for that

view this post on Zulip starseeker (May 07 2020 at 21:40):

OK, r75745 localizes it

view this post on Zulip starseeker (May 07 2020 at 22:06):

r75746 gets most of them - we're down to the weirdly recursive trio of bn/dvec.h bn/vector_fpu.h and bn/vector_x86.h and RtServerImpl.h wanting JNIEXPORT (which our headers don't define anywhere.)

view this post on Zulip starseeker (May 07 2020 at 22:06):

my vote would be to yank RtServerImpl.h, unless someone is still using it...

view this post on Zulip starseeker (May 07 2020 at 22:09):

I've tangled with the bn/vec* files once or twice in the past and lost - IIRC they have something to do with the NURBS code

view this post on Zulip Sean (May 07 2020 at 22:36):

@starseeker back to clean build state and confirmed. distcheck-full failures are consistent and repeatable. iges error is real, i'm back debugging it. Looks like it's crashing during a boolean evaluation export.

view this post on Zulip starseeker (May 08 2020 at 00:13):

@Sean excellent

view this post on Zulip starseeker (May 08 2020 at 00:14):

There we go - make check-headers is now clean!

view this post on Zulip Erik (May 08 2020 at 00:20):

one project I did many many mmmmany years ago (before I met yall), I had a 'no recursive include' policy and occasionally tried commenting out includes, the compile time was amazeballs

view this post on Zulip starseeker (May 08 2020 at 00:33):

Heh - I'd hate to even contemplate what that would take for BRL-CAD

view this post on Zulip starseeker (May 08 2020 at 00:34):

Heh - seven commits away from 75757

view this post on Zulip Sean (May 08 2020 at 04:02):

@starseeker haven't yet isolated if it's bsd-specific, probably not, but it is only exhibiting on freebsd (thus far) and only in optimized builds. it's not limited to g-iges, just happens to be the tool more reliably triggering the race. error appears to be not new, so it doesn't have to be a release blocker.

view this post on Zulip starseeker (May 08 2020 at 14:15):

freebsd, not OSX?

view this post on Zulip starseeker (May 08 2020 at 14:15):

@Daniel Rossberg Is RELEASE branch still good for you?

view this post on Zulip starseeker (May 08 2020 at 14:16):

I don't think I've changed anything since your last test, but it's now last call ;-)

view this post on Zulip Daniel Rossberg (May 08 2020 at 14:19):

No, nothing has changed. It's still revision 75755.

view this post on Zulip Daniel Rossberg (May 08 2020 at 14:19):

I.e., it's still good.

view this post on Zulip starseeker (May 08 2020 at 14:20):

OK. Changelog updates are going in, release date gets updated (again), and then we're tagging!

view this post on Zulip starseeker (May 08 2020 at 14:28):

Whew

view this post on Zulip starseeker (May 08 2020 at 14:28):

Lot of fun for a "patch" release...

view this post on Zulip starseeker (May 09 2020 at 03:42):

Thank you, OpenBSD. Getting a core file from MGED (but no other failure report in make regress, so don't know what's producing it) and the old gdb can't read it. Compiled the new gdb from ports, ran egdb... and gdb itself crashed trying to read the core file. Now I have a gdb core dump.

view this post on Zulip starseeker (May 09 2020 at 03:43):

That's a big help.

view this post on Zulip Sean (May 09 2020 at 03:54):

starseeker said:

freebsd, not OSX?

ugh, I literally lost track of which platform I was debugging on (I have like 20 terminals open atm) and somehow got into an "i'm on freebsd" because the debugging looked exactly like the issue we've been having on freebsd.

I was wrong, very wrong. the corruption is on mac, release builds, not limited to g-iges.

view this post on Zulip Sean (May 09 2020 at 03:55):

talk about a massive brain fart.

view this post on Zulip Sean (May 09 2020 at 03:56):

anyways, I have a reliable stack trace and have a lead on the cause. going to keep at it this weekend to see if I can suss it out.

view this post on Zulip Erik (May 09 2020 at 13:24):

I think you might need to change your brain drawers, boy :D

view this post on Zulip Sean (May 09 2020 at 16:29):

I do. Got too much going on.

view this post on Zulip Erik (May 10 2020 at 12:00):

well, poop, I grabbed an rpi3b+ to pop up a new backup server and they only have usb2, I think writing to the disk may be slower than my network O.o :/

view this post on Zulip Sean (May 13 2020 at 18:19):

@starseeker Just did an update and now seeing this error again:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -std=c11  -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=700 -pipe -fno-strict-aliasing -fn\
o-common -fexceptions -m64 -ggdb -Qunused-arguments -fstack-protector-all -fno-omit-frame-pointer -pedantic -pedantic-errors -Wall -Wextra -Wundef -Wfloat-equal -Wshadow -Wbad-\
function-cast -Wc++-compat -Winline -Wno-long-long -Wno-variadic-macros -Wdocumentation -Wno-c11-extensions -Werror -isysroot /Applications/Xcode.app/Contents/Developer/Platfor\
ms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.14 -dynamiclib -Wl,-headerpad_max_install_names  -m64 -ggdb -compatibility_version 20.0.0 -current_ver\
sion 20.0.1 -o ../../lib/libicv.20.0.1.dylib -install_name @rpath/libicv.20.dylib CMakeFiles/libicv-obj.dir/fileformat.c.o CMakeFiles/libicv-obj.dir/rot.c.o CMakeFiles/libicv-o\
bj.dir/color_space.c.o CMakeFiles/libicv-obj.dir/crop.c.o CMakeFiles/libicv-obj.dir/filter.c.o CMakeFiles/libicv-obj.dir/encoding.c.o CMakeFiles/libicv-obj.dir/operations.c.o C\
MakeFiles/libicv-obj.dir/stat.c.o CMakeFiles/libicv-obj.dir/size.c.o CMakeFiles/libicv-obj.dir/pix.c.o CMakeFiles/libicv-obj.dir/png.c.o CMakeFiles/libicv-obj.dir/ppm.c.o CMake\
Files/libicv-obj.dir/bw.c.o CMakeFiles/libicv-obj.dir/dpix.c.o  -Wl,-rpath,/Users/morrison/brlcad.trunk/.build/lib ../../lib/libbn.20.0.1.dylib /usr/local/lib/libpng.dylib ../.\
./lib/libnetpbm.dylib ../../lib/libbu.20.0.1.dylib -framework Foundation -ldl -framework System /usr/lib/libc.dylib -lm
Undefined symbols for architecture x86_64:
  "_brl_png_create_info_struct", referenced from:
      _png_write in png.c.o
      _png_read in png.c.o
  "_brl_png_create_read_struct", referenced from:
      _png_read in png.c.o
  "_brl_png_create_write_struct", referenced from:
      _png_write in png.c.o
  "_brl_png_destroy_read_struct", referenced from:
      _png_write in png.c.o
  "_brl_png_destroy_write_struct", referenced from:
      _png_write in png.c.o
  "_brl_png_get_bKGD", referenced from:
      _png_read in png.c.o
  "_brl_png_get_bit_depth", referenced from:
      _png_read in png.c.o
  "_brl_png_get_color_type", referenced from:
      _png_read in png.c.o
  "_brl_png_get_gAMA", referenced from:
      _png_read in png.c.o
  "_brl_png_get_image_height", referenced from:
      _png_read in png.c.o
  "_brl_png_get_image_width", referenced from:
      _png_read in png.c.o
  "_brl_png_init_io", referenced from:
      _png_write in png.c.o
      _png_read in png.c.o
  "_brl_png_read_image", referenced from:
      _png_read in png.c.o
  "_brl_png_read_info", referenced from:
      _png_read in png.c.o
  "_brl_png_read_update_info", referenced from:
      _png_read in png.c.o
  "_brl_png_set_IHDR", referenced from:
      _png_write in png.c.o
  "_brl_png_set_background", referenced from:
      _png_read in png.c.o
  "_brl_png_set_expand", referenced from:
      _png_read in png.c.o
  "_brl_png_set_gAMA", referenced from:
      _png_read in png.c.o
  "_brl_png_set_gray_to_rgb", referenced from:
      _png_read in png.c.o
  "_brl_png_set_longjmp_fn", referenced from:
      _png_write in png.c.o
  "_brl_png_set_sig_bytes", referenced from:
      _png_read in png.c.o
  "_brl_png_set_strip_16", referenced from:
      _png_read in png.c.o
  "_brl_png_sig_cmp", referenced from:
      _png_read in png.c.o
  "_brl_png_write_end", referenced from:
      _png_write in png.c.o
  "_brl_png_write_info", referenced from:
      _png_write in png.c.o
  "_brl_png_write_row", referenced from:
      _png_write in png.c.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [lib/libicv.20.0.1.dylib] Error 1

view this post on Zulip Sean (May 13 2020 at 18:20):

Any ideas? It seems like maybe something stateful as it didn't error on a fresh build, but did after a simple re-run that had nothing to do with libicv.

view this post on Zulip starseeker (May 13 2020 at 20:29):

So it got /usr/local/lib/libpng.dylib somehow.... weird.

view this post on Zulip starseeker (May 13 2020 at 20:30):

@Sean CMakeCache.txt file available?

view this post on Zulip starseeker (May 13 2020 at 20:51):

Somehow PNG_LIBRARIES must be getting overridden - that's the variable libicv is referencing. When you say "rerun" is that a CMake re-run or a re-build?
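
One way to test the "overridden variable" theory is to read the value straight out of CMakeCache.txt before and after the failing re-run. The sketch below is only an illustration: the helper name `cache_value` and the saved-copy path are made up here, and it assumes the usual `NAME:TYPE=VALUE` cache entry layout seen in the grep output later in this thread.

```shell
# Print the value stored for one variable in a CMakeCache.txt file.
# Cache entries look like NAME:TYPE=VALUE, for example
#   PNG_LIBRARIES:STRING=/usr/local/lib/libpng.dylib
# (cache_value is a hypothetical helper name, not part of BRL-CAD or CMake)
cache_value() {
    cache_file=$1
    var_name=$2
    # Strip "NAME:TYPE=" and print what is left; non-matching lines are dropped.
    sed -n "s|^${var_name}:[^=]*=||p" "$cache_file"
}

# Hypothetical usage: keep a copy of the cache from a working configure,
# then compare after the re-run that breaks the build.
#   cp CMakeCache.txt /tmp/CMakeCache.good
#   cache_value /tmp/CMakeCache.good PNG_LIBRARIES
#   cache_value CMakeCache.txt PNG_LIBRARIES
```

If the two values differ, the re-run changed the variable; if they match, the problem is more likely elsewhere (a header, for instance).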

view this post on Zulip starseeker (May 13 2020 at 20:59):

If it's a CMake re-run you'll probably see something in the CMake output about "finding" PNG - that would be when things are getting messed up. The question would be why it thinks it needs to look for PNG in the first place. I'll see if I can try a double-configure on github...

view this post on Zulip Sean (May 13 2020 at 21:09):

Sorry, I already wiped some files out and have it working again

view this post on Zulip Sean (May 13 2020 at 21:12):

it was an svn up followed by make, which re-ran cmake, which then failed. did make clean, wiped out cmake cache, wiped out cmakefiles dir, and still failed. deleted build/cmake* and build/include, re-ran cmake, and seems to be back working. is there perhaps something getting written into the brlcad_config that gets screwed up on a second pass?

view this post on Zulip Sean (May 13 2020 at 21:14):

this is a simple default "cmake .." so it's the auto-detect default and there is a proper libpng for it to find

view this post on Zulip Sean (May 13 2020 at 21:14):

this is the fresh start cmakecache results that succeeds:

agua:.build morrison$ grep -r -i png CMakeCache.txt
CMakeCache.txt://define BRLCAD_PNG
CMakeCache.txt:BRLCAD_PNG:STRING=SYSTEM (AUTO)
CMakeCache.txt://PNG_INCLUDE_DIR
CMakeCache.txt:PNG_INCLUDE_DIR:STRING=PNG-NOTFOUND
CMakeCache.txt://PNG include directory
CMakeCache.txt:PNG_INCLUDE_DIRS:STRING=/usr/local/include
CMakeCache.txt://PNG_LIBRARIES
CMakeCache.txt:PNG_LIBRARIES:STRING=/usr/local/lib/libpng.dylib
CMakeCache.txt://PNG_LIBRARY
CMakeCache.txt:PNG_LIBRARY:STRING=PNG-NOTFOUND
CMakeCache.txt:PNG_LIBRARY_DEBUG:FILEPATH=PNG_LIBRARY_DEBUG-NOTFOUND
CMakeCache.txt:PNG_LIBRARY_RELEASE:FILEPATH=/usr/local/lib/libpng.dylib
CMakeCache.txt://Set PNG_MAN_DIR to the global MAN_DIR
CMakeCache.txt:PNG_MAN_DIR:STRING=share/man
CMakeCache.txt://Option to disable Console IO in PNG
CMakeCache.txt:PNG_NO_CONSOLE_IO:BOOL=OFF
CMakeCache.txt://Option to disable STDIO in PNG
CMakeCache.txt:PNG_NO_STDIO:BOOL=OFF
CMakeCache.txt:PNG_PNG_INCLUDE_DIR:PATH=/usr/local/include
CMakeCache.txt://BRL-CAD prefix for libpng
CMakeCache.txt:PNG_PREFIX:STRING=brl_
CMakeCache.txt://Disable building png test executables
CMakeCache.txt:PNG_TESTS:STRING=0
CMakeCache.txt:libdm_LIB_DEPENDS:STATIC=general;librt;general;libfb;general;/usr/X11/lib/libSM.dylib;general;/usr/X11/lib/libICE.dylib;general;/usr/X11/lib/libX11.dylib;general;/usr/X11/lib/libXext.dylib;general;/usr/X11/lib/libXi.dylib;general;/usr/lib/libtcl.dylib;general;/usr/lib/libtk.dylib;general;/usr/local/lib/libpng.dylib;
CMakeCache.txt:libfb_LIB_DEPENDS:STATIC=general;libbu;general;libpkg;general;/usr/local/lib/libpng.dylib;general;/usr/lib/libtcl.dylib;general;/usr/X11/lib/libSM.dylib;general;/usr/X11/lib/libICE.dylib;general;/usr/X11/lib/libX11.dylib;general;/usr/X11/lib/libXext.dylib;general;/usr/X11/lib/libXi.dylib;general;/usr/X11/lib/libGLU.dylib;general;/usr/X11/lib/libGL.dylib;general;/usr/X11/lib/libSM.dylib;general;/usr/X11/lib/libICE.dylib;general;/usr/X11/lib/libX11.dylib;general;/usr/X11/lib/libXext.dylib;general;/usr/X11/lib/libXi.dylib;general;/usr/lib/libtk.dylib;
CMakeCache.txt:libged_LIB_DEPENDS:STATIC=general;libwdb;general;liboptical;general;librt;general;libbrep;general;libnmg;general;libfb;general;libbg;general;libbn;general;libbu;general;libicv;general;libanalyze;general;/usr/local/lib/libpng.dylib;general;/usr/lib/libc.dylib;general;m;
CMakeCache.txt:libicv_LIB_DEPENDS:STATIC=general;libbu;general;libbn;general;/usr/local/lib/libpng.dylib;general;netpbm;
CMakeCache.txt://STRINGS property for variable: BRLCAD_PNG
CMakeCache.txt:BRLCAD_PNG-STRINGS:INTERNAL=AUTO;BUNDLED;SYSTEM
CMakeCache.txt://Details about finding PNG
CMakeCache.txt:FIND_PACKAGE_MESSAGE_DETAILS_PNG:INTERNAL=[/usr/local/lib/libpng.dylib][/usr/local/include][v1.6.37()]
CMakeCache.txt://ADVANCED property for variable: PNG_INCLUDE_DIR
CMakeCache.txt:PNG_INCLUDE_DIR-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_INCLUDE_DIRS
CMakeCache.txt:PNG_INCLUDE_DIRS-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_LIBRARIES
CMakeCache.txt:PNG_LIBRARIES-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_LIBRARY
CMakeCache.txt:PNG_LIBRARY-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_LIBRARY_DEBUG
CMakeCache.txt:PNG_LIBRARY_DEBUG-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_LIBRARY_RELEASE
CMakeCache.txt:PNG_LIBRARY_RELEASE-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_MAN_DIR
CMakeCache.txt:PNG_MAN_DIR-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_NO_CONSOLE_IO
CMakeCache.txt:PNG_NO_CONSOLE_IO-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_NO_STDIO
CMakeCache.txt:PNG_NO_STDIO-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_PNG_INCLUDE_DIR
CMakeCache.txt:PNG_PNG_INCLUDE_DIR-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_PREFIX
CMakeCache.txt:PNG_PREFIX-ADVANCED:INTERNAL=1
CMakeCache.txt://ADVANCED property for variable: PNG_TESTS
CMakeCache.txt:PNG_TESTS-ADVANCED:INTERNAL=1

view this post on Zulip starseeker (May 13 2020 at 21:15):

OK, so the error was the prefix definition being set when it should not have been set?

view this post on Zulip Sean (May 13 2020 at 21:15):

presumably

view this post on Zulip Sean (May 13 2020 at 21:15):

so second pass, somehow the header has _brl_ getting defined, so libicv can't resolve when it tries to link

view this post on Zulip starseeker (May 13 2020 at 21:17):

I'm a little surprised that succeeded - PNG_PREFIX is set.

view this post on Zulip Sean (May 13 2020 at 21:17):

testing a simple touch CMakelists.txt src/libicv/* + make, cmake rerunning now to see if it triggers

view this post on Zulip Sean (May 13 2020 at 21:19):

that worked, so it's something a bit more complex than just re-running cmake

view this post on Zulip starseeker (May 13 2020 at 21:21):

A stale pnglibconf.h is one possibility

view this post on Zulip Sean (May 13 2020 at 21:21):

$ nm src/libicv/CMakeFiles/libicv-obj.dir/png.c.o
does not have _brl_

view this post on Zulip starseeker (May 13 2020 at 21:21):

If it's finding a local pnglibconf.h but a system libpng, that'll most likely blow up
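
A quick way to confirm that kind of header/library mismatch is to count the prefixed references in the object file, as in the nm check above. This is a rough sketch, not project tooling: `count_undefined_with_prefix` is a made-up name, and it assumes the default BSD-style `nm` output where undefined symbols are listed with a `U` in the first field.

```shell
# Count undefined ("U") symbols in nm output whose name starts with a prefix.
# One optional leading underscore is ignored, since Mach-O objects on macOS
# prepend one to C symbol names.
# (count_undefined_with_prefix is a hypothetical helper name)
count_undefined_with_prefix() {
    awk -v prefix="$1" '
        $1 == "U" {
            name = $2
            sub(/^_/, "", name)
            if (index(name, prefix) == 1) n++
        }
        END { print n + 0 }
    '
}

# Usage, with the object file path from the discussion:
#   nm src/libicv/CMakeFiles/libicv-obj.dir/png.c.o | count_undefined_with_prefix brl_png_
```

A nonzero count in an object that is about to be linked against an unprefixed system libpng is the combination that produces the "Undefined symbols" list shown earlier.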

view this post on Zulip starseeker (May 13 2020 at 21:22):

Ah, that could very well be actually. pnglibconf.h, IIRC, is generated as a build output. Just switching the third party setting probably isn't clearing it out. Let me see where it ends up in the output...

view this post on Zulip starseeker (May 13 2020 at 21:27):

Looking at the cache file, an initial configure leaves PNG_INCLUDE_DIR set to NOTFOUND. The ENABLE_ALL is actually setting that. Let me see if libicv is looking at the wrong PNG include var...

view this post on Zulip starseeker (May 13 2020 at 21:27):

no...

view this post on Zulip starseeker (May 13 2020 at 21:32):

@Sean with that result, something somehow is getting an ENABLE_ALL header with a system libpng, since clearly the PNG_PREFIX CMake variable by itself didn't trigger the issue (and I see why it doesn't, looking at libpng)

view this post on Zulip starseeker (May 13 2020 at 21:33):

Maybe I should go back to getting that third party build system rework done, even if it does mean a rewrite of the top level flow... I've lost count of the number of times that ThirdParty.cmake state management has bitten me subtly over the years...

view this post on Zulip starseeker (May 13 2020 at 21:34):

(no, I'm not going to, don't worry...)

view this post on Zulip Sean (May 13 2020 at 22:37):

Don't think it needs to be a nuclear option. It's almost certainly a single var misspelled, misplaced, ifdef/ifndef switcheroo or something else really simple.

view this post on Zulip Sean (May 18 2020 at 18:43):

Updated, and now seeing after make test:

    881 - regress-flawfinder (Failed)
    882 - regress-mged (Failed)
    901 - regress-gcv-dem (Failed)
    905 - regress-pkg (Timeout)

view this post on Zulip starseeker (May 18 2020 at 18:45):

@Sean make test is the one that won't work reliably - it runs everything, even the non-working tests. You probably wanted make check?

view this post on Zulip Sean (May 18 2020 at 18:46):

I know that, but I thought the first three were working

view this post on Zulip Sean (May 18 2020 at 18:46):

I know the pkg one is new

view this post on Zulip starseeker (May 18 2020 at 18:48):

Flawfinder I don't think has ever worked fully (or at least, not in a long time.) If that mged test is the old one I was working on it hasn't worked in a long time. The gcv-dem is too sensitive, IIRC - it's md5summing the output dsp

view this post on Zulip Sean (May 18 2020 at 18:53):

okay, hrm. thanks! will ignore.

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 14:10):

Is Archer working in the latest revision?

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 14:11):

also MGED

view this post on Zulip starseeker (Jun 26 2020 at 15:15):

It should be...

view this post on Zulip starseeker (Jun 26 2020 at 15:15):

What are the errors?

view this post on Zulip starseeker (Jun 26 2020 at 15:20):

Which Visual Studio are you using? The testing I've done was with 2017, so 2019 might have some issues...

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:46):

Archer
C:\summer\brlcad-code\build\Debug\bin\archer.exe
ERROR: Requisite display manager is not available.
BRL-CAD may need to be recompiled with support for:
Run 'fbhelp' for a list of available display managers.

Unexpected error encountered while running Archer.
Aborting.

Process finished with exit code 1
MGED
This is what MGED looks like. image.png
Visual Studio 16 2019, no special build options used

view this post on Zulip starseeker (Jun 26 2020 at 15:49):

OK. Something about the new display manager setup must be unhappy. We've shifted to plugins, and that failure mode indicates that it can't locate the plugins.

view this post on Zulip starseeker (Jun 26 2020 at 15:50):

Let me see what 2017 does, then I'll see if I can try 2019

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:50):

I'll check that

view this post on Zulip starseeker (Jun 26 2020 at 15:52):

Do you have a "libexec" directory in your build output (in either Release or Debug, depending on which configuration you used to build...)

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:52):

I didn't build ALL_BUILD, but archer

view this post on Zulip starseeker (Jun 26 2020 at 15:52):

Ah. OK, you'll need to build dm-wgl as well

view this post on Zulip starseeker (Jun 26 2020 at 15:53):

That's a point, archer and mged should explicitly depend on those targets now...

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:55):

It works @starseeker . Thanks

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:55):

I thought building archer builds all dependencies

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:56):

Is it not what used to happen? (or did I build ALL_BUILD last time, because it used to work)

view this post on Zulip starseeker (Jun 26 2020 at 15:56):

It should. I didn't think through all the implications of moving to plugin based architectures - there's no longer a link time dependency between the backends and applications, so the build system won't automatically make the connection
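
Until the build system declares that relationship, the workaround from earlier in this exchange (build the plugin target yourself, then the application) can be scripted. A minimal sketch, assuming the target names mentioned above (`dm-wgl`, `archer`); `build_targets` and the `CMAKE` override variable are hypothetical names introduced only for this example.

```shell
# Build each listed target in order, stopping at the first failure.
# CMAKE can be overridden (e.g. CMAKE=echo) to preview the commands
# without running them. build_targets is a hypothetical helper name.
build_targets() {
    build_dir=$1
    config=$2
    shift 2
    for target in "$@"; do
        "${CMAKE:-cmake}" --build "$build_dir" --config "$config" --target "$target" || return 1
    done
}

# Usage, plugin first so it exists when the application starts:
#   build_targets build Debug dm-wgl archer
```

Building ALL_BUILD avoids the problem as well, at the cost of building everything.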

view this post on Zulip starseeker (Jun 26 2020 at 15:57):

The introduction of plugins into trunk is very new, so you might not have seen it previously, but ALL_BUILD would also have avoided it (that's why I didn't see it in earlier testing)

view this post on Zulip starseeker (Jun 26 2020 at 15:57):

Easy to fix - just need to specify the required backends as build dependencies.

view this post on Zulip starseeker (Jun 26 2020 at 15:57):

Testing now, hang on...

view this post on Zulip Sadeep Darshana (Jun 26 2020 at 15:58):

cool

view this post on Zulip starseeker (Jun 26 2020 at 16:04):

r76220 should handle it for the most obvious cases - there are more that will need similar logic to build in isolation, but those will catch the most common cases.

view this post on Zulip Sadeep Darshana (Jun 30 2020 at 06:40):

-- Could NOT find TCL (missing: TCL_STUB_LIBRARY TCL_INCLUDE_PATH)
-- Could NOT find TCLTK (missing: TCL_STUB_LIBRARY TCL_INCLUDE_PATH TK_STUB_LIBRARY TK_INCLUDE_PATH)
-- Could NOT find TK (missing: TK_STUB_LIBRARY TK_INCLUDE_PATH)
-- Could NOT find TCLTK (missing: TK_STUB_LIBRARY TK_INCLUDE_PATH)
-- Could NOT find TK (missing: TK_STUB_LIBRARY TK_INCLUDE_PATH)

Any idea why this happens even after I installed the required libraries listed in https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/doc/README.Linux
(apt-get install xserver-xorg-dev libx11-dev libxi-dev libxext-dev libfontconfig-dev libglu1-mesa-dev)
See everything is installed.

root@sadeep-VirtualBox:/media/sf_summer/brlcad-code/build_linux# apt-get install xserver-xorg-dev libx11-dev libxi-dev libxext-dev libfontconfig-dev libglu1-mesa-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'libfontconfig1-dev' instead of 'libfontconfig-dev'
libfontconfig1-dev is already the newest version (2.13.1-2ubuntu3).
libglu1-mesa-dev is already the newest version (9.0.1-1build1).
libx11-dev is already the newest version (2:1.6.9-2ubuntu1).
libxext-dev is already the newest version (2:1.3.4-0ubuntu1).
libxi-dev is already the newest version (2:1.7.10-0ubuntu1).
xserver-xorg-dev is already the newest version (2:1.20.8-2ubuntu2.1).
0 upgraded, 0 newly installed, 0 to remove and 218 not upgraded.

view this post on Zulip Sadeep Darshana (Jun 30 2020 at 06:47):

(Ubuntu)

view this post on Zulip Sadeep Darshana (Jun 30 2020 at 08:13):

Fixed this. Had to delete the cmake directory and reload after installing missing libraries. Simply reloading was not sufficient.

view this post on Zulip Sean (Jul 01 2020 at 07:15):

Yeah, it will usually default to cache'd results, so you have to nuke it all.
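
For reference, "nuking" the cached results usually means removing CMakeCache.txt and the top-level CMakeFiles directory and then configuring again. A small sketch of that step; `reset_cmake_cache` is a made-up helper name and `.build` is just the build directory name used elsewhere in this thread. As the libpng episode above showed, this is sometimes not enough and generated headers may need removing too.

```shell
# Remove CMake's cached configure state from a build directory so the next
# cmake run detects system libraries again. Other build outputs are left
# in place. (reset_cmake_cache is a hypothetical helper name)
reset_cmake_cache() {
    build_dir=$1
    [ -d "$build_dir" ] || { echo "no such build dir: $build_dir" >&2; return 1; }
    rm -rf "$build_dir/CMakeCache.txt" "$build_dir/CMakeFiles"
}

# Usage from the source tree, assuming a build directory named .build:
#   reset_cmake_cache .build && (cd .build && cmake ..)
```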

view this post on Zulip Daniel Rossberg (Aug 06 2020 at 08:05):

What is the background of revision 75933 "ON_DLL_EXPORTS/ON_DLL_IMPORTS is no longer specific enough for an MSVC only override."? It breaks my MS Visual Studio brlcad.dll build with static libs.

view this post on Zulip starseeker (Aug 06 2020 at 12:42):

The import/export logic was generalized to be enabled both on MSVC and on GCC/clang - the latter has had such an ability for a little while, but we didn't use it. When enabling it (mostly so we could detect changes that would break the import/export bit on platforms other than Windows) we had to adjust the import/export logic to be more general. OpenNURBS was one of the libraries that had to be adjusted to accommodate it.

view this post on Zulip starseeker (Aug 06 2020 at 12:47):

If I recall correctly, that change triggered MSVC only logic in opennurbs, which is what prompted the change - I think your adjustment should be fine.

view this post on Zulip Daniel Rossberg (Aug 06 2020 at 12:49):

Okay, good.

view this post on Zulip Sean (Aug 14 2020 at 19:22):

@starseeker got a new failure -- btclsh is crashing on launch on Mac inside dm_init()

view this post on Zulip Sean (Aug 14 2020 at 19:22):

frame #4: 0x000000010077cd2e libdm.20.dylib`libdm_init() at dm_init.cpp:109:18

view this post on Zulip Sean (Aug 14 2020 at 19:22):

dm_get_name() returned NULL, code didn't handle it

view this post on Zulip Sean (Aug 14 2020 at 19:25):

er, not just btclsh. looks like mged is crashing too, anything doing dm init

view this post on Zulip starseeker (Aug 14 2020 at 19:27):

trunk or RELEASE?

view this post on Zulip Sean (Aug 14 2020 at 19:30):

trunk

view this post on Zulip Sean (Aug 14 2020 at 19:31):

I haven't tested RELEASE in a long while.

view this post on Zulip starseeker (Aug 14 2020 at 19:32):

OK. Let me see what happens here... I've been hitting RELEASE so something may have slipped on trunk...

view this post on Zulip starseeker (Aug 14 2020 at 19:33):

That's a strange error... it indicates that the plugin loaded but didn't define a display manager with a name?

view this post on Zulip starseeker (Aug 14 2020 at 19:36):

Linux doesn't reproduce it...

view this post on Zulip Sean (Aug 14 2020 at 19:40):

Looking at the code, dmp must be NULL so there are at least two issues in the code. First issue is introducing a bunch of getter functions in dm-generic.c that can return NULL in r76200 -- every place those are called must check for NULL or the functions should get changed so NULL is not returnable. Second issue is obviously whatever caused dmp to be NULL.

view this post on Zulip starseeker (Aug 14 2020 at 19:43):

@Sean I just committed a check and err msg to dm_init.cpp - can you rebuild and run src/libdm/tests/dm_test to see what happens?

view this post on Zulip Sean (Aug 14 2020 at 19:43):

looks like that list is dm_interp, dm_get_fb, dm_get_dm_name, dm_get_dm_lname, dm_get_bg, dm_get_fg, dm_get_pathname, dm_get_name, dm_get_dname, dm_get_graphics_system, dm_get_tkname, dm_get_vp, dm_get_vparse, and dm_get_mvars

view this post on Zulip Sean (Aug 14 2020 at 19:43):

sure

view this post on Zulip Sean (Aug 14 2020 at 19:46):

hm, now cmake is failing

view this post on Zulip starseeker (Aug 14 2020 at 19:46):

/me closes eyes in pain

view this post on Zulip Sean (Aug 14 2020 at 19:47):

It's a build from scratch, but I wouldn't expect that necessarily for the minor changes just made.. don't yet see why it's actually halting.

view this post on Zulip Sean (Aug 14 2020 at 19:48):

agua:.build morrison$ make
**********************************************************
*** Configuring BRL-CAD Release 7.31.0, Build 20200814 ***
**********************************************************
X11 detected and enabled
Defining HAVE_X11_XLIB_H
Defining HAVE_X11_EXTENSIONS_XINPUT_H
-- Could NOT find Appleseed (missing: Appleseed_INCLUDE_DIR Appleseed_LIBRARY)

-------------------- BRL-CAD Release 7.31.0, Build 20200814 --------------------

        Prefix: /usr/brlcad/dev-7.31.0
      Binaries: /usr/brlcad/dev-7.31.0/bin
     Libraries: /usr/brlcad/dev-7.31.0/lib
  Manual pages: /usr/brlcad/dev-7.31.0/share/man
Data resources: /usr/brlcad/dev-7.31.0/share

CC       = /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
CXX      = /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
CFLAGS   = -std=c11 -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=700 -pipe
           -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions
           -m64 -ggdb -Qunused-arguments -fstack-protector-all
           -fno-omit-frame-pointer -pedantic -pedantic-errors -Wall -Wextra
           -Wundef -Wfloat-equal -Wshadow -Wbad-function-cast -Wc++-compat
           -Winline -Wno-long-long -Wno-variadic-macros -Wdocumentation
           -Wno-c11-extensions -Werror
CXXFLAGS = -std=c++11 -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=700 -pipe
           -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions
           -ftemplate-depth-128 -m64 -ggdb -Qunused-arguments
           -fstack-protector-all -fno-omit-frame-pointer -pedantic -Wall
           -Wextra -Wundef -Wfloat-equal -Wshadow -Wbad-function-cast -Winline
           -Wno-long-long -Wno-variadic-macros -Wdocumentation
           -Wno-c11-extensions -Werror
LDFLAGS  = -m64 -ggdb

Compile Tcl ........................: ON
Compile Tk .........................: ON
Compile Itcl/Itk ...................: ON
Compile Iwidgets ...................: ON
Compile Tkhtml .....................: ON
Compile Tktable ....................: ON
Compile libpng .....................: ON
Compile libregex ...................: ON
Compile zlib .......................: ON
Compile Utah Raster Toolkit ........: ON
Compile openNURBS ..................: ON
Compile STEPcode....................: ON

OpenGL support (optional) ..........: ON
X11 support (optional) .............: ON
Qt support (optional) ..............: OFF
Run-time debuggability (optional) ..: ON

Build 32/64-bit release ............: 64BIT (Auto)
Build optimized release ............: OFF
Build static libraries .............: ON
Build dynamic libraries ............: ON
Install example geometry models ....: ON
Generate extra docs ................: ON (html/man)


"/Users/morrison/brlcad.trunk/INSTALL" is out of date.  An updated version has been generated at "/Users/morrison/brlcad.trunk/.build/INSTALL.new"
To clear this warning, replace "/Users/morrison/brlcad.trunk/INSTALL" with "/Users/morrison/brlcad.trunk/.build/INSTALL.new"


"/Users/morrison/brlcad.trunk/configure" is out of date.  An updated version has been generated at "/Users/morrison/brlcad.trunk/.build/configure.new"
To clear this warning, replace "/Users/morrison/brlcad.trunk/configure" with "/Users/morrison/brlcad.trunk/.build/configure.new"

CMake Error at misc/CMake/BRLCAD_Util.cmake:90 (_message):
  Configure haulted.
Call Stack (most recent call first):
  CMakeLists.txt:3728 (message)


-- Configuring incomplete, errors occurred!
See also "/Users/morrison/brlcad.trunk/.build/CMakeFiles/CMakeOutput.log".
See also "/Users/morrison/brlcad.trunk/.build/CMakeFiles/CMakeError.log".

view this post on Zulip starseeker (Aug 14 2020 at 19:48):

Oh. For whatever reason, it's generated INSTALL and configure files that don't match the ones in the source tree

view this post on Zulip Sean (Aug 14 2020 at 19:48):

I've gotten those messages about INSTALL and configure before, so I can't imagine those are what's causing the halt?

view this post on Zulip Sean (Aug 14 2020 at 19:49):

Plus it says they are warnings...

view this post on Zulip starseeker (Aug 14 2020 at 19:49):

They're supposed to be fatal errors. The only reason they're warnings is so we can check for both files before halting the configure
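
The "report each one, then halt once" pattern described here can be illustrated outside CMake. The following is a shell sketch of the idea only, not the logic in CMakeLists.txt: `check_generated_files` is a hypothetical name, and the `NAME.new` convention is taken from the output pasted above.

```shell
# Compare each generated file against its checked-in copy, report every
# mismatch, and only afterwards return failure, so the user sees all
# stale files in one run. (check_generated_files is a hypothetical name)
check_generated_files() {
    src_dir=$1
    build_dir=$2
    stale=0
    for name in INSTALL configure; do
        # cmp -s is silent and returns non-zero if the files differ
        # or if either one is missing.
        if ! cmp -s "$src_dir/$name" "$build_dir/$name.new"; then
            echo "\"$src_dir/$name\" is out of date; see \"$build_dir/$name.new\""
            stale=1
        fi
    done
    return "$stale"
}

# Usage from the source tree, then inspect what actually changed:
#   check_generated_files . .build || diff -u INSTALL .build/INSTALL.new
```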

view this post on Zulip starseeker (Aug 14 2020 at 19:50):

what's the diff between INSTALL.new and INSTALL?

view this post on Zulip Sean (Aug 14 2020 at 19:50):

Hm. Is there not a way for it to at least report them as errors instead of warnings, and still check both?

view this post on Zulip starseeker (Aug 14 2020 at 19:51):

We could alter the message that's printed

view this post on Zulip starseeker (Aug 14 2020 at 19:51):

CMakeLists.txt line 3702

view this post on Zulip Sean (Aug 14 2020 at 19:52):

configure.new is missing all of the dependency toggles (e.g., --enable-scl, --enable-step, --enable-gdiam, etc)

view this post on Zulip starseeker (Aug 14 2020 at 19:52):

O.o

view this post on Zulip starseeker (Aug 14 2020 at 19:52):

Which version of CMake are you using?

view this post on Zulip Sean (Aug 14 2020 at 19:52):

that's the same problem with INSTALL.new .. they're all gone

view this post on Zulip Sean (Aug 14 2020 at 19:52):

cmake version 3.16.20200125-g33e7bd6

view this post on Zulip starseeker (Aug 14 2020 at 19:53):

I wouldn't think that would be an issue...

view this post on Zulip Sean (Aug 14 2020 at 19:53):

this was fine a week ago, so I'm not sure what changed

view this post on Zulip starseeker (Aug 14 2020 at 19:53):

This is from a clean build dir?

view this post on Zulip Sean (Aug 14 2020 at 19:53):

of course not, I have a lot of in-progress work in there

view this post on Zulip Sean (Aug 14 2020 at 19:54):

I can blow away the cmake bits though

view this post on Zulip starseeker (Aug 14 2020 at 19:54):

Maybe try removing CMakeCache.txt

view this post on Zulip Sean (Aug 14 2020 at 19:54):

rm -rfing cmake* CMake*

view this post on Zulip Sean (Aug 14 2020 at 19:55):

how is the INSTALL/configure file contents built up? implies there's some bug there at least.

view this post on Zulip Sean (Aug 14 2020 at 19:56):

should be obvious/categoric

view this post on Zulip Sean (Aug 14 2020 at 20:00):

updated the error message

view this post on Zulip starseeker (Aug 14 2020 at 20:02):

The BRLCAD_OPTION function in misc/CMake/BRLCAD_Options.cmake does that, calling some helper functions to generate the text. BRLCAD_OPTION is usually invoked by ThirdParty.cmake

view this post on Zulip Sean (Aug 14 2020 at 20:03):

looks like the OPTIONS file it wrote has them

view this post on Zulip Sean (Aug 14 2020 at 20:03):

at least now, wish I'd looked before running clean

view this post on Zulip Sean (Aug 14 2020 at 20:07):

starseeker said:

The BRLCAD_OPTION function in misc/CMake/BRLCAD_Options.cmake does that, calling some helper functions to generate the text. BRLCAD_OPTION is usually invoked by ThirdParty.cmake

Thanks, will see if I can catch it next time.

view this post on Zulip Sean (Aug 14 2020 at 20:08):

gah, now the X11 error again. I really need to figure out what's wrong there.

view this post on Zulip starseeker (Aug 14 2020 at 20:09):

I'm not too surprised if you're running for long periods of time in the same build dir without clearing it - that's not a mode I usually work in (it tends to produce weird intermediate build system states that are hard to debug) so I won't usually know ahead of time if those issues are in there...

view this post on Zulip starseeker (Aug 14 2020 at 20:11):

Clearing the CMake files should do it though.

view this post on Zulip Sean (Aug 14 2020 at 20:11):

I know, you say that every time. To me it's a mode we've always supported and that's even worked in our cmake era robustly-enough until relatively recently.

view this post on Zulip Sean (Aug 14 2020 at 20:12):

maybe an indication that issues are piling up, maybe just random chance, but it is absolutely more unproductive time if I have to wait for cmake to fully re-run every single time. these failures aren't that often, so they really should be isolated issues.

view this post on Zulip Sean (Aug 14 2020 at 20:13):

the fact that it works at least 9/10 times is a testament that it can work and usually does

view this post on Zulip Sean (Aug 14 2020 at 20:13):

or 19/20, whatever it is

view this post on Zulip starseeker (Aug 14 2020 at 20:14):

Heh. OK, fair enough - I don't re-run every time either. Hopefully clearing the CMake files will right the ship

view this post on Zulip Sean (Aug 14 2020 at 20:15):

case in point, I have to wait another 10 minutes (5min into compile, 5 to re-run cmake) to re-run because default build has been non-functional since the tcl upgrade. have to enable all and run cmake again, then rebuild.

view this post on Zulip Sean (Aug 14 2020 at 20:16):

ugh, and it doesn't help when I typo the darn var name like I just did and have to wait for cmake yet again...

view this post on Zulip starseeker (Aug 14 2020 at 20:17):

/me nods. That's a good point - I forget how slow it is to run CMake on the mac. I'm spoiled on Linux - it's fast enough there to be a much less severe pain point.

view this post on Zulip starseeker (Aug 14 2020 at 20:18):

The github macOS runner does build successfully - maybe because it doesn't have Tk enabled?

view this post on Zulip Sean (Aug 14 2020 at 20:18):

even when I'm on the 128 core linux box and compile takes 2min, it's annoying that cmake takes 2 min too :)

view this post on Zulip Sean (Aug 14 2020 at 20:20):

starseeker said:

The github macOS runner does build successfully - maybe because it doesn't have Tk enabled?

probably, I don't remember the specifics. I spent time debugging months back, but didn't get it resolved. it was broken on the branch and working on trunk and it'd been merged, so I didn't look too much into it. eventually appeared on trunk too.

view this post on Zulip starseeker (Aug 14 2020 at 20:21):

I wish Mac had some kind of remote open source development environments we could set up with for testing - the main environment I can't test is the one you use as your primary environment.

view this post on Zulip starseeker (Aug 14 2020 at 20:21):

Short of buying a Mac (which I suppose is what Apple wants) it's a conundrum

view this post on Zulip starseeker (Aug 14 2020 at 20:22):

Even the github CI doesn't seem to help much

view this post on Zulip Sean (Aug 14 2020 at 20:22):

I think cmake is doing the right thing -- it found a viable system Tcl/Tk. the problem (I think) is that it's finding Tk's X11 stub headers before the actual system X11 headers.

view this post on Zulip Sean (Aug 14 2020 at 20:22):

I've seen both compilation and linkage errors, so likely two issues.

view this post on Zulip starseeker (Aug 14 2020 at 20:23):

Oh. That reminds me of the fink/macports issues - I never did know what the "right" answer to that was. I ended up forcing an ordering, but I think you really didn't like what I did... it was pretty quirky/custom and specific to that setup.

view this post on Zulip Sean (Aug 14 2020 at 20:24):

Example build output (parallel unfortunately so it's a bit messy to read):

view this post on Zulip Sean (Aug 14 2020 at 20:24):

[ 61%] Building CXX object src/gtools/CMakeFiles/glint.dir/glint.cpp.o
Scanning dependencies of target gex
Scanning dependencies of target libdm
[ 61%] Linking CXX shared library ../../lib/libdm.dylib
[ 61%] Linking C executable gtransfer
[ 61%] Building CXX object src/gtools/CMakeFiles/gex.dir/gex.cpp.o
[ 61%] Linking C executable ../../../bin/vdeck
[ 61%] Built target gtransfer
Scanning dependencies of target libwdb
[ 61%] Built target libdm
[ 61%] Linking CXX shared library ../../lib/libwdb.dylib
Scanning dependencies of target liboptical
[ 61%] Linking C shared library ../../lib/liboptical.dylib
[ 61%] Built target vdeck
[ 61%] Linking CXX static library ../../lib/libdm.a
[ 61%] Built target libwdb
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../lib/libdm.a(knob.c.o) has no symbols
Scanning dependencies of target dm-X
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: ../../lib/libdm.a(knob.c.o) has no symbols
[ 61%] Built target liboptical
[ 61%] Built target libdm-static
Scanning dependencies of target dm-ogl
Scanning dependencies of target dm-plot
Scanning dependencies of target dm-ps
[ 61%] Building C object src/libdm/X/CMakeFiles/dm-X.dir/dm-X.c.o
[ 61%] Building C object src/libdm/plot/CMakeFiles/dm-plot.dir/dm-plot.c.o
[ 61%] Building C object src/libdm/postscript/CMakeFiles/dm-ps.dir/dm-ps.c.o
[ 61%] Linking CXX executable gen-attributes-file
[ 61%] Building C object src/libdm/glx/CMakeFiles/dm-ogl.dir/dm-ogl.c.o
[ 61%] Built target libanalyze-obj
[ 61%] Built target gen-attributes-file
[ 61%] Building C object src/libdm/X/CMakeFiles/dm-X.dir/color.c.o
Scanning dependencies of target dm-txt
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:118:10: error: implicit
      declaration of function 'XAllocColor' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
    st = XAllocColor(dpy, cmap, color);
         ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:182:5: error: implicit
      declaration of function 'XGetWindowAttributes' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
    XGetWindowAttributes(pubvars->dpy,
    ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:88:5: error: implicit
      declaration of function 'XQueryColors' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
    XQueryColors(dpy, src, colors, ncolors);
    ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:212:7: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
             XLoadQueryFont(pubvars->dpy, FONT9)) == NULL) {
             ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:91:2: error: implicit
      declaration of function 'XStoreColors' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
        XStoreColors(dpy, dest, colors, ncolors);
        ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:91:2: note: did you mean
      'XQueryColors'?
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:88:5: note: 'XQueryColors'
      declared here
    XQueryColors(dpy, src, colors, ncolors);
    ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:94:6: error: implicit
      declaration of function 'XAllocColor' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
            XAllocColor(dpy, dest, &colors[i]);
            ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:211:27: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
        if ((pubvars->fontstruct =
                                 ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:214:31: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
            if ((pubvars->fontstruct =
                                     ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:222:2: error: implicit
      declaration of function 'XChangeGC' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
        XChangeGC(pubvars->dpy,
        ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:139:7: error: implicit
      declaration of function 'XStoreColor' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                    XStoreColor(dpy, cmap, &color);
                    ^
/Users/morrison/brlcad.trunk/src/libdm/X/color.c:141:7: error: implicit
      declaration of function 'XAllocColor' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                    XAllocColor(dpy, cmap, &color);
                    ^
5 errors generated.
make[2]: *** [src/libdm/X/CMakeFiles/dm-X.dir/color.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:231:27: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                                 ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:231:25: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                               ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:233:3: error: implicit
      declaration of function 'XFreeFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XFreeFont(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:233:3: note: did you mean
      'Tk_FreeFont'?
/Users/morrison/brlcad.trunk/.build/include/tkDecls.h:270:14: note:
      'Tk_FreeFont' declared here
EXTERN void             Tk_FreeFont(Tk_Font f);
                        ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:237:3: error: implicit
      declaration of function 'XChangeGC' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XChangeGC(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:243:27: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                                 ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:243:25: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                               ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:245:3: error: implicit
      declaration of function 'XFreeFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XFreeFont(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:249:3: error: implicit
      declaration of function 'XChangeGC' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XChangeGC(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:255:27: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                                 ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:255:25: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                               ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:257:3: error: implicit
      declaration of function 'XFreeFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XFreeFont(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:261:3: error: implicit
      declaration of function 'XChangeGC' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
                XChangeGC(pubvars->dpy,
                ^
/Users/morrison/brlcad.trunk/src/libdm/X/dm-X.c:267:27: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
            if ((newfontstruct = XLoadQueryFont(pubvars->dpy,
                                 ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
[ 61%] Building C object src/libdm/glx/CMakeFiles/dm-ogl.dir/if_ogl.c.o
20 errors generated.
make[2]: *** [src/libdm/X/CMakeFiles/dm-X.dir/dm-X.c.o] Error 1
make[1]: *** [src/libdm/X/CMakeFiles/dm-X.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 61%] Linking C shared library ../../../libexec/dm/libdm-ps.dylib
[ 61%] Linking C shared library ../../../libexec/dm/libdm-plot.dylib
[ 61%] Building C object src/libdm/txt/CMakeFiles/dm-txt.dir/if_debug.c.o
[ 61%] Building C object src/libdm/txt/CMakeFiles/dm-txt.dir/dm-txt.c.o
/Users/morrison/brlcad.trunk/src/libdm/glx/dm-ogl.c:236:5: error: implicit
      declaration of function 'XGetWindowAttributes' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
    XGetWindowAttributes(pubvars->dpy,
    ^
[ 61%] Built target libgcv_fastgen4-obj
/Users/morrison/brlcad.trunk/src/libdm/glx/dm-ogl.c:250:7: error: implicit
      declaration of function 'XLoadQueryFont' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
             XLoadQueryFont(pubvars->dpy,
             ^
/Users/morrison/brlcad.trunk/src/libdm/glx/dm-ogl.c:249:27: error: incompatible
      integer to pointer conversion assigning to 'XFontStruct *' from 'int'
      [-Werror,-Wint-conversion]
        if ((pubvars->fontstruct =
                                 ^

view this post on Zulip starseeker (Aug 14 2020 at 20:26):

Would putting the X11 header includes before the Tk includes deal with that?

view this post on Zulip Sean (Aug 14 2020 at 20:26):

I actually don't know the status of fink/macports... homebrew kind of took over that market

view this post on Zulip Sean (Aug 14 2020 at 20:26):

maybe, like I said, I didn't dig

view this post on Zulip starseeker (Aug 14 2020 at 20:26):

/me looks...

view this post on Zulip Sean (Aug 14 2020 at 20:27):

looks like at least for dm-ogl.c that they're already first

view this post on Zulip Sean (Aug 14 2020 at 20:28):

probably need to do a header-trace to see where the declarations are coming from (or not coming from)

view this post on Zulip Sean (Aug 14 2020 at 20:29):

hah, that's great! (just saw the benchmark results)

view this post on Zulip Sean (Aug 14 2020 at 20:30):

"Run 'C:/RELEASE-build/Release/bin/benchmark clean' to remove generated pix files." ... Seeing a windows path there might be a first

view this post on Zulip starseeker (Aug 14 2020 at 20:34):

@Sean one thing to make sure of with the plugins is that any stale .so files are cleared out of the libexec directory - if they aren't removed they'll stand a decent chance of causing problems.

view this post on Zulip starseeker (Aug 14 2020 at 20:35):

src/libdm/tests/dm_test is my "diagnostic" go-to to figure out what's going on, at least with the libdm plugins.

view this post on Zulip Sean (Aug 14 2020 at 21:16):

starseeker said:

Sean one thing to make sure of with the plugins is that any stale .so files are cleared out of the libexec directory - if they aren't removed they'll stand a decent chance of causing problems.

Why is that? None of the other binary products have an issue with that I'm aware of.

view this post on Zulip Sean (Aug 14 2020 at 21:18):

that's not something i'm likely going to remember a month from now, so if something isn't stable, that should get fixed.

view this post on Zulip starseeker (Aug 14 2020 at 21:49):

The library initializes at runtime, looking for whatever .so files are in the directory. It has no way of knowing what is and isn't a stale file.
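A minimal sketch of that kind of runtime scan, assuming a POSIX system. The names `example_has_suffix` and `example_count_plugins` are hypothetical and are not the libdm implementation; the point is only that a directory walk sees file names and nothing else, so it cannot distinguish a fresh plugin from a leftover one.

```c
#include <dirent.h>
#include <string.h>

/* Return 1 if name ends with suffix and has at least one character
 * before it, 0 otherwise. */
static int
example_has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);
    return n > s && strcmp(name + n - s, suffix) == 0;
}

/* Count the entries in dir whose names end with suffix (e.g. ".so").
 * Returns -1 if the directory cannot be opened.  Every match would be
 * a load candidate, regardless of which build produced it. */
static int
example_count_plugins(const char *dir, const char *suffix)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int count = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (example_has_suffix(e->d_name, suffix))
            count++;
    }
    closedir(d);
    return count;
}
```

Calling `example_count_plugins("libexec/dm", ".so")` would report every matching file present, including any left behind by an earlier build.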

view this post on Zulip Sean (Aug 14 2020 at 21:51):

what does stale mean in that context? I would still expect the plugins to recompile if they were changed

view this post on Zulip Sean (Aug 14 2020 at 21:54):

even with dynamic loading, it's possible to detect actual incompatibility and we may need to if this is a potential issue

view this post on Zulip Sean (Aug 14 2020 at 21:54):

I'm not sure what staleness potential you're referring to though..

view this post on Zulip starseeker (Aug 14 2020 at 21:57):

Stale would be a plugin file that dates to an older build (say, a dm backend that was removed from the build and then the plugin API changed.) Attempting to load the old .so file in that case wouldn't work.

view this post on Zulip starseeker (Aug 14 2020 at 21:58):

Also might be an issue if the .so file linked to older versions of BRL-CAD libraries that no longer are compatible with the current environment.

view this post on Zulip Sean (Aug 14 2020 at 21:59):

well, so that's a different issue no? that's got nothing to do with my build dir and needing to remember to wipe out .so files

view this post on Zulip Sean (Aug 14 2020 at 22:00):

unless you just recently changed the plugin interface and I only partially recompile, that's not so much a concern. the headers will have changed and the plugins will recompile (at least I would expect them to)

view this post on Zulip starseeker (Aug 14 2020 at 22:02):

Well, if you do a lot of builds over time in a build dir without doing make clean, it's possible to accumulate such files (similar to how stale symlinks were sometimes left over). The scenario I was thinking of:

  1. Build plugins
  2. Remove (say) the plot backend from the CMake build.
  3. Change plugin API.
  4. Rebuild.

The old libdm plot plugin would still be in libexec in that scenario.

view this post on Zulip Sean (Aug 14 2020 at 22:04):

I see, so how about preventing that then. sounds like this is a new category of concern.

view this post on Zulip starseeker (Aug 14 2020 at 22:04):

I'm open to suggestions...

view this post on Zulip Sean (Aug 14 2020 at 22:05):

could prevent it just by putting a plugin api version number into the header file. if the loaded version is less, skips that plugin. just have to remember to bump the api version whenever the calling interface changes.

view this post on Zulip Sean (Aug 14 2020 at 22:06):

could make the api number simply be the composite of the version number -- that way they're always locked to their compiled version only.

view this post on Zulip starseeker (Aug 14 2020 at 22:09):

So it would be a "bu_dlsym" to look for something like "dm_api_version" from the plugin?

view this post on Zulip Sean (Aug 14 2020 at 22:09):

something like:

#define DM_API ((BRLCAD_MAJOR*10000) + (BRLCAD_MINOR*100) + BRLCAD_PATCH)

so 7.32.12 becomes api version 73212

view this post on Zulip Sean (Aug 14 2020 at 22:11):

you'll have to make it plugin-specific, like an: int dm_ogl_version=DM_API;
then bu_dlsym that out after loading the plugin, before looking at any symbols

view this post on Zulip Sean (Aug 14 2020 at 22:12):

if var < DM_API, skip it

view this post on Zulip Sean (Aug 14 2020 at 22:12):

heck, could start if var != DM_API, skip it
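A rough sketch of the check being described, using hypothetical `EXAMPLE_*` names and hard-coded version components in place of the real BRL-CAD version macros. It uses the stricter "not equal means skip" rule from the last message.

```c
#include <stdio.h>

/* Hypothetical version components; a real build would take these from
 * its generated version header instead of hard-coding them. */
#define EXAMPLE_MAJOR 7
#define EXAMPLE_MINOR 32
#define EXAMPLE_PATCH 12

/* Composite API number: major*10000 + minor*100 + patch. */
#define EXAMPLE_DM_API ((EXAMPLE_MAJOR*10000) + (EXAMPLE_MINOR*100) + EXAMPLE_PATCH)

/* Return 1 if a plugin reporting plugin_api may be used, 0 if it
 * should be skipped.  Strict equality ties each plugin to the exact
 * version it was compiled against. */
static int
example_plugin_compatible(int plugin_api)
{
    if (plugin_api != EXAMPLE_DM_API) {
        fprintf(stderr, "skipping plugin: api %d, expected %d\n",
                plugin_api, EXAMPLE_DM_API);
        return 0;
    }
    return 1;
}
```

With strict equality there is nothing to remember to bump: any plugin left over from a different release is rejected automatically, at the cost of also rejecting plugins that would have been binary compatible.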

view this post on Zulip starseeker (Aug 14 2020 at 22:12):

Why plugin specific?

view this post on Zulip starseeker (Aug 14 2020 at 22:13):

that ties the variable name to the file name of the plugin...

view this post on Zulip Sean (Aug 14 2020 at 22:14):

oh my, you're using the same symbol names across all plugins??

view this post on Zulip Sean (Aug 14 2020 at 22:15):

ah, I see, yes and no

view this post on Zulip Sean (Aug 14 2020 at 22:15):

the symbols are all unique, but in a dm_plugin pinfo struct

view this post on Zulip starseeker (Aug 14 2020 at 22:16):

Yes. That's a single point of entry - went that way for simplicity.

view this post on Zulip Sean (Aug 14 2020 at 22:18):

hm, well that is fine for now I guess but it's a little awkward for dynamic libs to do that. I don't think that's entirely portable in the general case for dlsym, though those cases aren't necessarily a major concern any more.

view this post on Zulip Sean (Aug 14 2020 at 22:19):

it's the same reason why Tcl has each library name its loading function specifically LIB_Init()

view this post on Zulip Sean (Aug 14 2020 at 22:20):

it also precludes portability to any new systems that don't (yet) support dynamic libraries and precludes static linking if for some case we later need/want that.

view this post on Zulip Sean (Aug 14 2020 at 22:21):

OS X was out for years before they added support for dynamic libs for example. now Mac does, of course, but the potential for other/new OS environments is definitely still a modern concern.

view this post on Zulip Sean (Aug 14 2020 at 22:22):

we can cross that bridge later. it's trivial to change pinfo to ogl_pinfo or whatever, and to change the version var.

view this post on Zulip Sean (Aug 14 2020 at 22:23):

hm, it will screw with tags (msvc, emacs, vim jump-to-symbol/definition support) so maybe still something to consider

view this post on Zulip Sean (Aug 14 2020 at 22:24):

would simplify the code by eliminating the preprocessor conditional, one benefit

view this post on Zulip starseeker (Aug 14 2020 at 22:24):

Are those environments capable of following those linkages? I figured as a runtime-only connection they were left out anyway...

view this post on Zulip Sean (Aug 14 2020 at 22:25):

What do you mean?

view this post on Zulip starseeker (Aug 14 2020 at 22:26):

I just am not clear what conditions would allow emacs/vim to jump to the name - as a plugin, wouldn't it just be a pointer anyway?

view this post on Zulip Sean (Aug 14 2020 at 22:26):

the editors? they can certainly follow a symbol if I'm looking at a header on struct dm_plugin and ask it to show me the definition

view this post on Zulip Sean (Aug 14 2020 at 22:27):

it's going to jump to one of them

view this post on Zulip Sean (Aug 14 2020 at 22:27):

some do dynamic symbol searching, some do string matching

view this post on Zulip Sean (Aug 14 2020 at 22:28):

(most do the latter I think, especially the 'tags' system)

view this post on Zulip starseeker (Aug 14 2020 at 22:28):

OK, so just directly (say) return dm_ogl instead of pinfo

view this post on Zulip starseeker (Aug 14 2020 at 22:29):

Except I thought we needed that static wrapper

view this post on Zulip Sean (Aug 14 2020 at 22:31):

it's whatever symbol you load -- I'm guessing you load "dm_plugin_info" currently, so that would change to "dm_ogl_plugin" or whatever. that pinfo struct is already static so it's not going to collide with anything, even if compiled static.

view this post on Zulip starseeker (Aug 14 2020 at 22:32):

OK. So I just need something then to make sure the names match the filenames (so the init function knows what to ask for for any given file)
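One way the file-name-to-symbol mapping could look, as a sketch only. The convention shown here ("libdm-ogl.so" maps to "dm_ogl_plugin") is an assumption for illustration, and `example_symbol_for_file` is a hypothetical helper, not an existing BRL-CAD function.

```c
#include <stdio.h>
#include <string.h>

/* Build "<stem>_plugin" from a plugin file name: drop any directory
 * part, strip a leading "lib", cut at the first '.', and map '-' to
 * '_'.  Returns 0 on success, -1 if the name is empty or the result
 * does not fit in out. */
static int
example_symbol_for_file(const char *filename, char *out, size_t outlen)
{
    const char *base = strrchr(filename, '/');
    const char *dot;
    size_t stemlen;
    size_t i;
    int n;

    base = base ? base + 1 : filename;
    if (strncmp(base, "lib", 3) == 0)
        base += 3;
    dot = strchr(base, '.');
    stemlen = dot ? (size_t)(dot - base) : strlen(base);
    if (stemlen == 0)
        return -1;

    n = snprintf(out, outlen, "%.*s_plugin", (int)stemlen, base);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    /* Only the stem can contain '-', so only that part is rewritten. */
    for (i = 0; i < stemlen; i++) {
        if (out[i] == '-')
            out[i] = '_';
    }
    return 0;
}
```

The init function could then pass the resulting string to its symbol lookup for each file it finds, so the plugin source only has to follow the naming rule.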

view this post on Zulip Sean (Aug 14 2020 at 22:32):

could put the version field right into the dm_impl as the first field

view this post on Zulip Sean (Aug 14 2020 at 22:32):

then you don't need to actually introduce another lookup
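A sketch of the "version as first field" idea, with a hypothetical `struct example_plugin` standing in for the real dm_impl layout. Since C guarantees that a pointer to a struct also points at its first member, the loader can read the version from the address it already looked up and only then trust the rest of the layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical plugin descriptor.  api_version is deliberately the
 * first member so it can be read without knowing the rest of the
 * layout. */
struct example_plugin {
    uint32_t api_version;
    const char *name;
};

/* sym is the address returned by the single symbol lookup.  Read the
 * leading version field and return the descriptor only if it matches
 * expected; otherwise return NULL so the caller skips the plugin. */
static const struct example_plugin *
example_plugin_check(const void *sym, uint32_t expected)
{
    uint32_t version;

    if (!sym)
        return NULL;
    memcpy(&version, sym, sizeof(version));
    if (version != expected)
        return NULL;
    return (const struct example_plugin *)sym;
}
```

The contract this imposes is that the version field can never move from the front of the struct, since an old plugin has to be recognizable even when everything after that field has changed.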

view this post on Zulip starseeker (Aug 14 2020 at 22:33):

Sort of like the bu_magic trick?

view this post on Zulip Sean (Aug 14 2020 at 22:33):

sort of

view this post on Zulip starseeker (Aug 14 2020 at 22:34):

I take it we should get this in pre-release?

view this post on Zulip Sean (Aug 14 2020 at 22:35):

I would .. this is going to be 7.32 right?

view this post on Zulip Sean (Aug 14 2020 at 22:36):

seems pretty major

view this post on Zulip starseeker (Aug 14 2020 at 22:36):

Yes. (Or what's in RELEASE will be, at any rate - I branched off of trunk a while back.)

view this post on Zulip Sean (Aug 14 2020 at 22:37):

also warrants a NEWS mention I think, dynamic behavior is a user-visible architecture change

view this post on Zulip starseeker (Aug 14 2020 at 22:37):

OK - I'll hit that first, then see if I can make rtweight do something saner.

view this post on Zulip starseeker (Aug 14 2020 at 22:37):

Really? I had thought if I got it right the user wouldn't notice anything...

view this post on Zulip Sean (Aug 14 2020 at 22:46):

you're right, user shouldn't -- but it's a different runtime and exposure profile, and definitely results in different runtime behavior when something is not what we expected

view this post on Zulip Sean (Aug 14 2020 at 22:47):

it's like when we made shaders optionally load dynamically. user didn't necessarily see that except it resulted in dynamic load failure messages when a typo'd shader name wasn't found. not something that was anticipated, but the arch change was user visible and the messages obviously could be traced to its introduction.

view this post on Zulip Sean (Aug 14 2020 at 22:51):

documented

view this post on Zulip starseeker (Aug 14 2020 at 22:54):

so same deal for the ged commands

view this post on Zulip starseeker (Aug 15 2020 at 00:36):

@Sean was https://sourceforge.net/p/brlcad/code/76767/ what you were thinking (more or less)?

view this post on Zulip starseeker (Aug 15 2020 at 00:38):

(Want to settle on a pattern before I do libged, that'll be more painful...)

view this post on Zulip Sean (Aug 15 2020 at 14:13):

starseeker said:

Sean was https://sourceforge.net/p/brlcad/code/76767/ what you were thinking (more or less)?

Yes, that's spot on! You could even make it an explicit plugin->api_version check if you want to support the version moving around the struct, but what you have checking the first byte is a good contract to require too.

view this post on Zulip starseeker (Aug 15 2020 at 15:13):

2020-08-15T14:51:35.0885582Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_debug/brlcad-7.31.0/include/bu/magic.h:244:21: error: the address of ‘jaunt_tbl’ will always evaluate as ‘true’ [-Werror=address]
2020-08-15T14:51:35.0886308Z      if (UNLIKELY(( (!(_ptr)) /* non-NULL pointer */ \
2020-08-15T14:51:35.0886756Z                      ^
2020-08-15T14:51:35.0887896Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_debug/brlcad-7.31.0/include/common.h:364:50: note: in definition of macro ‘UNLIKELY’
2020-08-15T14:51:35.0888280Z  #  define UNLIKELY(expression) __builtin_expect((expression), 0)
2020-08-15T14:51:35.0888663Z                                                   ^~~~~~~~~~
2020-08-15T14:51:35.0889257Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_debug/brlcad-7.31.0/include/bu/ptbl.h:65:24: note: in expansion of macro ‘BU_CKMAG’
2020-08-15T14:51:35.0889945Z  #define BU_CK_PTBL(_p) BU_CKMAG(_p, BU_PTBL_MAGIC, "bu_ptbl")
2020-08-15T14:51:35.0890236Z                         ^~~~~~~~
2020-08-15T14:51:35.0890818Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_debug/brlcad-7.31.0/src/libnmg/mod.c:3061:2: note: in expansion of macro ‘BU_CK_PTBL’
2020-08-15T14:51:35.0891507Z   BU_CK_PTBL(&jaunt_tbl);

view this post on Zulip starseeker (Aug 15 2020 at 15:21):

Oh, I see - r76785

view this post on Zulip starseeker (Aug 15 2020 at 15:23):

If we want the extra test, maybe devise a configure time check to see whether the platform in question lets us do it without the error?

view this post on Zulip starseeker (Aug 16 2020 at 00:42):

@Sean What issues do you know of that you would consider release blockers?

view this post on Zulip Sean (Aug 16 2020 at 18:30):

ah, right, constant expression. hrm.
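For context, a sketch of why the macro form trips the warning and one common way around it. This is not a description of what the actual commits did; `example_tbl`, `example_bad_magic`, `EXAMPLE_CK` and the magic value are all hypothetical. When a macro expands `!(_ptr)` with `&local_var` as the argument, the compiler sees a NULL test on an address that can never be NULL. Routing the pointer through a function parameter keeps the same runtime check but hides that constant from the warning.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAGIC 0x7074626cu /* arbitrary illustrative value */

struct example_tbl {
    uint32_t magic;
};

/* Inside this function ptr is an ordinary parameter, so testing it
 * against NULL is not a compile-time constant here, even when every
 * caller happens to pass the address of a local variable.
 * Returns 1 if the pointer is NULL or the magic does not match. */
static inline int
example_bad_magic(const struct example_tbl *ptr, uint32_t magic)
{
    return (!ptr || ptr->magic != magic);
}

/* Macro wrapper that keeps the call sites short. */
#define EXAMPLE_CK(_p) example_bad_magic((_p), EXAMPLE_MAGIC)
```

Whether a given compiler still warns after inlining is something to confirm on the CI runners rather than assume.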

view this post on Zulip Sean (Aug 16 2020 at 18:32):

I have vague recollection of that issue being the reason it was removed now too..

view this post on Zulip Sean (Aug 16 2020 at 18:53):

can you recheck the runner with the latest? trying a different approach.

view this post on Zulip Sean (Aug 16 2020 at 18:55):

starseeker said:

Sean What issues do you know of that you would consider release blockers?

rtweight no longer reporting volume unless there's a material defined is a pretty significant change. the spew wouldn't be a blocker, but I think lacking a report is. also didn't seem to be outputting the report when run from mged -c, but I'd need to verify with a clean build.

view this post on Zulip Sean (Aug 16 2020 at 18:57):

only other issue that comes to mind is peer review isn't yet complete on all outstanding commits, and that's supposed to be a blocker for IA compliance, particularly if it's a build intended to be deployed (which I think it is, yes?) I think I can get through the remaining commits by tuesday if you can hold off tagging.

view this post on Zulip Sean (Aug 16 2020 at 18:58):

also, still seeing this:

view this post on Zulip Sean (Aug 16 2020 at 18:58):

[ 80%] Linking C shared library ../../../libexec/dm/libdm-ps.dylib
Undefined symbols for architecture x86_64:
  "_Tcl_AppendStringsToObj", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_DuplicateObj", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_GetObjResult", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_SetObjResult", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o

view this post on Zulip starseeker (Aug 16 2020 at 18:58):

Huh, that means it's still not linking Tcl - I thought I fixed that.

view this post on Zulip Sean (Aug 16 2020 at 18:59):

I saw the commit and thought so too. I've re-run cmake cleanly since.

view this post on Zulip starseeker (Aug 16 2020 at 18:59):

Yeah, I can hold off til tuesday - still banging my head on the Ninja Windows build anyway (The msbuild run on the runner is less than 100% reliable, so far...)

view this post on Zulip starseeker (Aug 16 2020 at 19:00):

That's really strange then. Is TCL_LIBRARY set correctly in CMakeCache.txt?

view this post on Zulip Sean (Aug 16 2020 at 19:00):

agua:.build morrison$ grep TCL_LIBRARY CMakeCache.txt
//ITCL_LIBRARY
ITCL_LIBRARY:STRING=itcl
//TCL_LIBRARY
TCL_LIBRARY:STRING=tcl
//TCL_LIBRARY
//ADVANCED property for variable: ITCL_LIBRARY
ITCL_LIBRARY-ADVANCED:INTERNAL=1
//ADVANCED property for variable: TCL_LIBRARY
TCL_LIBRARY-ADVANCED:INTERNAL=1

view this post on Zulip starseeker (Aug 16 2020 at 19:01):

OK, that should be correct.

view this post on Zulip starseeker (Aug 16 2020 at 19:01):

What does otool say about what the dm-ps libexec lib is seeing?

view this post on Zulip Sean (Aug 16 2020 at 19:01):

here's verbose:

[100%] Linking C shared library ../../../libexec/dm/libdm-ps.dylib
cd /Users/morrison/brlcad.trunk/.build/src/libdm/postscript && /Users/morrison/Applications/bin/cmake -E cmake_link_script CMakeFiles/dm-ps.dir/link.txt --verbose=1
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -std=c11  -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=700 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -m64 -ggdb -Qunused-arguments -fstack-protector-all -fno-omit-frame-pointer -pedantic -pedantic-errors -Wall -Wextra -Wundef -Wfloat-equal -Wshadow -Wbad-function-cast -Wc++-compat -Winline -Wno-long-long -Wno-variadic-macros -Wdocumentation -Wno-c11-extensions -Werror -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.14 -dynamiclib -Wl,-headerpad_max_install_names  -m64 -ggdb -o ../../../libexec/dm/libdm-ps.dylib -install_name @rpath/libdm-ps.dylib CMakeFiles/dm-ps.dir/dm-ps.c.o  -Wl,-rpath,/Users/morrison/brlcad.trunk/.build/lib ../../../lib/libdm.20.0.1.dylib ../../../lib/librt.20.0.1.dylib ../../../lib/libgdiam.dylib ../../../lib/libvds.dylib ../../../lib/libbrep.20.0.1.dylib ../../../lib/libnmg.dylib ../../../lib/libbg.20.0.1.dylib ../../../lib/libSPSR.dylib ../../../lib/libopenNURBS.2012.10.245.dylib ../../../lib/libpoly2tri.dylib ../../../lib/libicv.20.0.1.dylib ../../../lib/libbn.20.0.1.dylib ../../../lib/libnetpbm.dylib ../../../lib/libpkg.20.0.1.dylib ../../../lib/libbu.20.0.1.dylib -framework Foundation -ldl -framework System ../../../lib/libregex_brl.1.0.4.dylib -lm /usr/X11/lib/libGLU.dylib /usr/X11/lib/libGL.dylib /usr/X11/lib/libSM.dylib /usr/X11/lib/libICE.dylib /usr/X11/lib/libX11.dylib /usr/X11/lib/libXext.dylib /usr/X11/lib/libXi.dylib ../../../lib/libpng_brl.16.37.0.dylib ../../../lib/libz_brl.1.2.11.dylib
Undefined symbols for architecture x86_64:
  "_Tcl_AppendStringsToObj", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_DuplicateObj", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_GetObjResult", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o
  "_Tcl_SetObjResult", referenced from:
      _ps_open in dm-ps.c.o
      _ps_loadMatrix in dm-ps.c.o

view this post on Zulip starseeker (Aug 16 2020 at 19:03):

Yeah, I don't see tcl in that list.. what the heck?

view this post on Zulip Sean (Aug 16 2020 at 19:03):

there's no tcl listed

view this post on Zulip starseeker (Aug 16 2020 at 19:04):

Does line 20 in src/libdm/postscript/CMakeLists.txt have TCL_LIBRARY on it?

view this post on Zulip Sean (Aug 16 2020 at 19:06):

hm, it didn't but does now -- different tree got updated. this is probably my bad. rebuilding.

view this post on Zulip Sean (Aug 16 2020 at 19:13):

yep, my issue. sorry! enable-all is all good.

view this post on Zulip starseeker (Aug 16 2020 at 19:16):

No problem. Github runner kicked off - I'm trying with Ninja again, so if that fails I'll have to go again with msbuild, but either way should have an answer within about 2-3 hours.

view this post on Zulip starseeker (Aug 16 2020 at 19:17):

Actually, the Ubuntu GCC should work either way, come to think of it

view this post on Zulip starseeker (Aug 16 2020 at 21:26):

2020-08-16T19:39:35.8704061Z FAILED: src/librt/CMakeFiles/librt-obj.dir/reduce_db.cpp.o
2020-08-16T19:39:35.8706430Z /usr/bin/c++  -DBRLCADBUILD -DDB5_DLL_EXPORTS -DGDIAM_DLL_IMPORTS -DHAVE_CONFIG_H -DRT_DLL_EXPORTS -DTIE_DLL_EXPORTS -DVDS_DLL_IMPORTS -I/home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/include -Iinclude -I/home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/librt -Iinclude/brlcad -isystem src/other/openNURBS -isystem src/other/libz -isystem src/other/libregex -isystem src/other/libvds -isystem src/other/libgdiam -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/poly2tri -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/openNURBS -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/libz -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/libregex -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/libvds -isystem /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/other/libgdiam -std=c++11  -D_POSIX_C_SOURCE=200809L -D_XOPEN_SOURCE=700 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -g -ggdb3 -O3 -fipa-pta -fstrength-reduce -fexpensive-optimizations -finline-functions -flto -fno-omit-frame-pointer -pedantic -Wall -Wextra -Wundef -Wfloat-equal -Wshadow -Wno-inline -Wno-long-long -Wno-variadic-macros -Werror -fPIC   -std=c++11 -MD -MT src/librt/CMakeFiles/librt-obj.dir/reduce_db.cpp.o -MF src/librt/CMakeFiles/librt-obj.dir/reduce_db.cpp.o.d -o src/librt/CMakeFiles/librt-obj.dir/reduce_db.cpp.o -c /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/librt/reduce_db.cpp
2020-08-16T19:39:35.8707398Z In file included from /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/librt/reduce_db.cpp:27:0:
2020-08-16T19:39:35.8708379Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/src/librt/reduce_db.cpp: In function ‘reduce_db::remove_dead_references(db_i&)’:
2020-08-16T19:39:35.8708883Z /home/runner/work/cadcitest/cadcitest/build/distcheck-enableall_release/brlcad-7.31.0/include/bu/magic.h:245:6: error: nonnull argument ‘db’ compared to NULL [-Werror=nonnull-compare]
2020-08-16T19:39:35.8709297Z      if (UNLIKELY(( ((uintptr_t)(_ptr) == (uintptr_t)NULL) /* non-NULL pointer */ \
2020-08-16T19:39:35.8709452Z                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-08-16T19:39:35.8709757Z       || ((uintptr_t)(_ptr) == 0) /* non-zero pointer */ \
2020-08-16T19:39:35.8709894Z       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2020-08-16T19:39:35.8710114Z cc1plus: all warnings being treated as errors

view this post on Zulip starseeker (Aug 18 2020 at 00:09):

rtweight breakage looks like it was introduced by commit 72190

view this post on Zulip starseeker (Aug 18 2020 at 03:19):

OK, now I'm really confused. In 7.26.4, in non-interactive mode, src/mged/cmd.c line 1043 is executed, the program quits by eventually calling Tcl_Exit(), and then the queued-up stdout output is dumped to the terminal. In trunk this doesn't happen, even though the same sequence still seems to be followed, and I am now getting output in the interactive classic mode.

view this post on Zulip Sean (Aug 18 2020 at 06:06):

that's a terrible commit message. :P

view this post on Zulip starseeker (Aug 18 2020 at 12:10):

My that was painful. @Sean I think I've gotten the output restored in all three modes - GUI, interactive classic, and command execution. Only tested on Linux so far.

view this post on Zulip Sean (Aug 18 2020 at 17:46):

IMG_0399.jpeg @starseeker any thoughts on the source of this? recent build on windows by bill.

view this post on Zulip Sean (Aug 18 2020 at 18:05):

cmake 3.18.1, so pretty recent

view this post on Zulip Sean (Aug 18 2020 at 18:06):

build succeeded so it is apparently innocuous (perhaps regress is busted if it could try)

view this post on Zulip starseeker (Aug 18 2020 at 18:30):

That matches a problem I had trying to run the "clean" target with msbuild

view this post on Zulip starseeker (Aug 18 2020 at 18:31):

Strange issue - when I grepped through the directory I could only find one match for a regress target, so I'm not quite sure where it's getting the duplicate.

view this post on Zulip starseeker (Aug 18 2020 at 18:32):

https://github.com/dotnet/msbuild/issues/3019 might be related if CMake's generation of names is doing something a little quirky, but not sure.

view this post on Zulip starseeker (Aug 18 2020 at 18:38):

Let me try again with 3.18.1 - I don't recall getting a GUI pop-up, it showed only on the command prompt. 'course I was using a newer VS, that may make a difference too.

view this post on Zulip starseeker (Aug 18 2020 at 19:01):

Yeah, 2017 loaded without that message, with 3.18.1. Not sure if I have the space to install 2015 on this laptop...

view this post on Zulip starseeker (Aug 18 2020 at 19:01):

Which version of CAD is he building?

view this post on Zulip Sean (Aug 18 2020 at 19:02):

latest checkout

view this post on Zulip starseeker (Aug 18 2020 at 19:02):

OK, yeah - that's what I tried too. Weird.

view this post on Zulip Sean (Aug 22 2020 at 05:37):

@starseeker did you test if_disk and if_remote?

view this post on Zulip Sean (Aug 22 2020 at 05:42):

and what's the deal with bu_exit()?? that's not ignorable/replaceable. there are over 1500 calls to bu_exit

view this post on Zulip Sean (Aug 22 2020 at 05:43):

if it's crashing, we have to investigate and fix it

view this post on Zulip Sean (Aug 22 2020 at 05:56):

if bu_exit() is crashing dauto, that's pretty much guaranteed to be corruption

view this post on Zulip starseeker (Aug 22 2020 at 13:24):

I'm still working through all the if_* cases - should be done by the end of the day.

view this post on Zulip starseeker (Aug 22 2020 at 13:26):

I didn't have much luck debugging the dauto bit - it cropped up when I was trying the usage.sh script, appeared only some of the time, and the only thing GDB could tell me when I popped a sleep into bu_bomb so I could attach was that the BU_SETJUMP test was true when bu_exit was called.

view this post on Zulip starseeker (Aug 22 2020 at 13:27):

I couldn't figure out where corruption might be introduced - dauto basically did the isatty checks and then quit immediately.

view this post on Zulip starseeker (Aug 22 2020 at 13:44):

bu_setprogname was the only other call.

view this post on Zulip starseeker (Aug 22 2020 at 13:46):

The only other thing I can think of is that maybe we need to initialize the bu_jmpbuf array somehow...

view this post on Zulip starseeker (Aug 22 2020 at 14:31):

AH! My bad - I misinterpreted where the failure was occurring. I thought bu_bomb would be called only when something went wrong, but that's not the case. I put my test in the wrong part of the code.

view this post on Zulip starseeker (Aug 22 2020 at 14:44):

dauto needs to validate the atoi conversion before mallocing
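
A minimal C sketch of that kind of validation, for illustration only. It is not dauto's actual code: parse_count, alloc_samples, and the upper bound are hypothetical, and it assumes the argument is meant to be a positive element count. It uses strtol rather than atoi because atoi cannot report a failed conversion.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not BRL-CAD API): parse a positive element count.
 * Returns 0 and stores the value in *out on success, -1 on any failure. */
int parse_count(const char *arg, size_t max, size_t *out)
{
    char *end = NULL;
    long val;

    if (arg == NULL || out == NULL || *arg == '\0')
        return -1;

    errno = 0;
    val = strtol(arg, &end, 10);

    /* Reject overflow, no digits at all, and trailing garbage. */
    if (errno == ERANGE || end == arg || *end != '\0')
        return -1;
    /* Reject zero, negatives, and anything above the caller's limit. */
    if (val <= 0 || (unsigned long)val > max)
        return -1;

    *out = (size_t)val;
    return 0;
}

/* Allocate a zeroed buffer only after the count has been validated.
 * The 1M-element cap is an arbitrary value chosen for this sketch. */
double *alloc_samples(const char *arg, size_t *count)
{
    if (parse_count(arg, (size_t)1 << 20, count) != 0)
        return NULL;
    return calloc(*count, sizeof(double));
}
```

With this shape, a caller can print a usage message and exit cleanly when alloc_samples returns NULL, instead of handing an unchecked value to the allocator.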

view this post on Zulip starseeker (Aug 22 2020 at 15:26):

Grr. Now I can't get fbstretch to repeat its bomb log file generation, with or without the dauto failure in the usage.sh run. Will just have to keep an eye out to see if it repeats.

view this post on Zulip Sean (Aug 22 2020 at 20:46):

starseeker said:

I couldn't figure out where corruption might be introduced - dauto basically did the isatty checks and then quit immediately.

sounds like you figured it out, but another thing to keep in mind is all static initialization also occurs before main(). this includes globals and statics getting initialized (which in the case of C++ can be arbitrarily complex) as well as any library initialization like we're doing in libbu.

view this post on Zulip Sean (Aug 22 2020 at 20:48):

starseeker said:

Grr. Now I can't get fbstretch to repeat its bomb log file generation, with or without the dauto failure in the usage.sh run. Will just have to keep an eye out to see if it repeats.

yeah, we actually had to remove the __attribute__ "never returns" marking on it because technically it can (and often does) return control -- bu_exit() is guaranteed to not return (it calls _exit() hard), but bu_bomb is not.
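
A small C sketch of why a function like that cannot be marked as never returning. This is not libbu's implementation: sketch_bomb, run_guarded, and the jump buffer are hypothetical stand-ins, assuming a setjmp/longjmp recovery point in the spirit of BU_SETJUMP. When a recovery point is armed, control comes back to the caller; only the unarmed path terminates the process.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf recovery_point;   /* hypothetical saved recovery context */
static int recovery_armed = 0;   /* nonzero while recovery_point is valid */

/* Hypothetical "bomb": unwinds to the armed recovery point if there is one,
 * otherwise terminates the process. */
void sketch_bomb(const char *msg)
{
    fprintf(stderr, "bomb: %s\n", msg);
    if (recovery_armed) {
        recovery_armed = 0;
        longjmp(recovery_point, 1);   /* control resumes inside run_guarded() */
    }
    abort();
}

/* Runs work() under a recovery point.
 * Returns 0 if work() finished normally, 1 if a bomb unwound back here. */
int run_guarded(void (*work)(void))
{
    if (setjmp(recovery_point) != 0)
        return 1;

    recovery_armed = 1;
    work();
    recovery_armed = 0;
    return 0;
}

/* Two sample workloads for the sketch. */
void good_work(void) { }
void bad_work(void) { sketch_bomb("something went wrong"); }
```

Here run_guarded(bad_work) returns 1 to its caller, so the bomb did hand control back, while an exit-style function would never do so.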

view this post on Zulip starseeker (Aug 22 2020 at 22:24):

@Sean how exactly do I test the disk and stack interfaces? I'm not seeing where they get used.

view this post on Zulip Sean (Aug 22 2020 at 22:32):

You can/could specify all types via the -F option, though some of the builtins use conventions (e.g., if_network is used when you use a hostname). I believe disk is used when you specify a filepath, mem uses a convention iirc. They have been documented in several places over the decades (especially the old green manuals).

view this post on Zulip Sean (Aug 22 2020 at 22:33):

I think I recall some examples in the code. Don't know but libfb man page might have relevant info (where FB_FILE is mentioned).

view this post on Zulip Sean (Aug 22 2020 at 22:33):

the env var is another way they can be set

view this post on Zulip starseeker (Aug 22 2020 at 23:37):

The mem gets used by rtwizard, and -F<filepath> seems to work for disk.

stack I can trigger and it wants some sort of args (but don't know what to feed it, old or new config)

view this post on Zulip starseeker (Aug 22 2020 at 23:43):

Ah, there we go (buried in one of the old emails)

export FB_FILE="/dev/stack /dev/debug;/home/user/moss.pix;"
./bin/rt share/db/moss.g all.g

view this post on Zulip starseeker (Aug 22 2020 at 23:50):

OK, looks like we're in business.

btw: not reproducible, but this happened once if it's of interest:
bu_semaphore_free(): pthread_mutex_destroy() failed on [18] of [20]

view this post on Zulip Sean (Aug 22 2020 at 23:54):

I don't recall stack being explicit, could be wrong though.

view this post on Zulip Sean (Aug 22 2020 at 23:55):

thought it let you do things like -F"/dev/mem -" to read from stdin and write to a memory buffer

view this post on Zulip starseeker (Aug 22 2020 at 23:55):

html/ReleaseNotes/email3.0.html was where I pulled that from, FWIW

view this post on Zulip Sean (Aug 22 2020 at 23:55):

ah, great good find then

view this post on Zulip starseeker (Aug 22 2020 at 23:56):

First time I've actually used one of those old emails practically... nifty!

view this post on Zulip Sean (Aug 22 2020 at 23:56):

how's it actually going through /dev/disk without it being in that switch list?

view this post on Zulip starseeker (Aug 22 2020 at 23:57):

fb_open - falls through to the disk interface if no other matches

view this post on Zulip starseeker (Aug 22 2020 at 23:58):

dm_plugins.cpp:405

view this post on Zulip Sean (Aug 22 2020 at 23:58):

cool, and remote? triggered on something else?

view this post on Zulip starseeker (Aug 22 2020 at 23:59):

if (fb_totally_numeric(file) || strchr(file, ':') != NULL) {

view this post on Zulip Sean (Aug 22 2020 at 23:59):

hm, interesting

view this post on Zulip starseeker (Aug 22 2020 at 23:59):

Not sure what happens with a full C:\ windows path, that could get ugly...
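
To make the Windows concern concrete, here is a hedged C sketch of the same test. all_digits is a hypothetical stand-in for fb_totally_numeric (its real behavior is not shown in this thread), and looks_remote is a name invented for the sketch. It only demonstrates that any name containing a colon, including a drive-letter path, takes the remote branch.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for fb_totally_numeric(): nonzero if s is all digits. */
static int all_digits(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}

/* Mirrors the quoted conditional: a bare number or anything with a ':'
 * is treated as a remote framebuffer specification. */
int looks_remote(const char *file)
{
    if (file == NULL)
        return 0;
    return all_digits(file) || strchr(file, ':') != NULL;
}
```

Under these assumptions, looks_remote("localhost:0") and looks_remote("0") are true as intended, but so is looks_remote("C:\\images\\moss.pix"), while a plain "moss.pix" falls through to the disk interface.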

view this post on Zulip starseeker (Aug 23 2020 at 00:00):

That's been there though, I don't think I changed any of that...

view this post on Zulip Sean (Aug 23 2020 at 00:00):

yeah, probably wrong for that, but probably always been wrong for that

view this post on Zulip Sean (Aug 23 2020 at 00:00):

I would have thought -Fbrlcad.org would use the default port, but it might have always required the colon

view this post on Zulip starseeker (Aug 23 2020 at 00:01):

/me double checks the rel-7-30-10 code...

view this post on Zulip starseeker (Aug 23 2020 at 00:02):

yep, same conditional

view this post on Zulip Sean (Aug 23 2020 at 00:02):

/me checks 7.24, though it's been broken it was probably broken long before that

view this post on Zulip starseeker (Aug 23 2020 at 00:04):

At least as far back as 1990 per git blame

view this post on Zulip starseeker (Aug 23 2020 at 00:04):

svn:revision:4527

view this post on Zulip starseeker (Aug 23 2020 at 00:05):

/me must concede that was fun... git blame FTW

view this post on Zulip Sean (Aug 23 2020 at 00:05):

cool

view this post on Zulip Sean (Aug 23 2020 at 00:06):

yeah, 7.24 treats it as a file

view this post on Zulip Sean (Aug 23 2020 at 00:06):

don't see a documented way to override in the man page, good to go

view this post on Zulip starseeker (Aug 23 2020 at 00:07):

svn:revision:3680
I think that's where its original form got added.

view this post on Zulip Sean (Aug 23 2020 at 00:07):

er, so did you test that remote actually works? :) run fbserv on brlcad.org something

view this post on Zulip Sean (Aug 23 2020 at 00:08):

something like -Fbrlcad.org:file.pix

view this post on Zulip Sean (Aug 23 2020 at 00:08):

that should be a remote /dev/disk

view this post on Zulip starseeker (Aug 23 2020 at 00:08):

What is that supposed to do? Upload a file to brlcad.org?

view this post on Zulip Sean (Aug 23 2020 at 00:09):

wherever the framebuffer server is, yeah

view this post on Zulip Sean (Aug 23 2020 at 00:09):

manpage example is using it to launch remote transient windows

view this post on Zulip starseeker (Aug 23 2020 at 00:09):

Do I have permissions to make that work?

view this post on Zulip Sean (Aug 23 2020 at 00:10):

like from brlcad.org, you could -Fyour_local_ip:/dev/Xl assuming networking routes are open

view this post on Zulip Sean (Aug 23 2020 at 00:10):

try localhost -- that should still go through the interface

view this post on Zulip Sean (Aug 23 2020 at 00:10):

-/

view this post on Zulip Sean (Aug 23 2020 at 00:10):

things like -Flocalhost:/dev/X and -Flocalhost:file.pix

view this post on Zulip Sean (Aug 23 2020 at 00:11):

can see examples in brlman libfb

view this post on Zulip starseeker (Aug 23 2020 at 00:12):

How do I launch an fbserver to listen for that?

view this post on Zulip Sean (Aug 23 2020 at 00:12):

man fbserv ;)

view this post on Zulip Sean (Aug 23 2020 at 00:13):

fbserv 0 /dev/X

view this post on Zulip starseeker (Aug 23 2020 at 00:13):

So localhost:/dev/X is an alternative to just specifying the port number?

view this post on Zulip Sean (Aug 23 2020 at 00:13):

no

view this post on Zulip Sean (Aug 23 2020 at 00:15):

at least I don't think so -- it's saying launch a transient /dev/X on the remote host

view this post on Zulip starseeker (Aug 23 2020 at 00:15):

./bin/fbserv 0 /dev/X
./bin/rt -Flocalhost:/dev/X share/db/moss.g all.g
pkg_open(localhost, remotefb): unknown service
pkg_open: client connect: errno=111
rem_open: can't connect to remotefb server on host "localhost".
fb_open: can't open device "localhost:/dev/X", ret=-4.
rt: can't open frame buffer

view this post on Zulip starseeker (Aug 23 2020 at 00:16):

Oh, duh

view this post on Zulip starseeker (Aug 23 2020 at 00:16):

./bin/rt -Flocalhost:0:/dev/X share/db/moss.g all.g

view this post on Zulip starseeker (Aug 23 2020 at 00:17):

OK, so yes - works in trunk

view this post on Zulip Sean (Aug 23 2020 at 00:17):

testing 24 behavior

view this post on Zulip Sean (Aug 23 2020 at 00:19):

hm, it's not doing anything with anything after the :0 if you put a port

view this post on Zulip starseeker (Aug 23 2020 at 00:21):

That whole aspect of the framebuffer I/O gets very little exercise, to the best of my knowledge - the closest I know of that happens semi-regularly is when rtwizard does its wackier stuff with multiple programs targeting one fb.

view this post on Zulip Sean (Aug 23 2020 at 00:21):

looks like it's a space if the man page is right

view this post on Zulip Sean (Aug 23 2020 at 00:24):

yeah, it's not working even following the man page example in 7.24 so not a release issue at least. maybe broken long ago.

view this post on Zulip starseeker (Aug 23 2020 at 00:24):

The only other instance I can think of where any of the over-the-network features were at play was your distributed rendering

view this post on Zulip starseeker (Aug 23 2020 at 00:26):

That's a thought - wonder if remrt works

view this post on Zulip Sean (Aug 23 2020 at 00:26):

give it a go

view this post on Zulip starseeker (Aug 23 2020 at 00:27):

I'm not much good with the multiple machines and network communication stuff Sean - I can try, but the only thing that comes to mind is running a remrt here and trying to enlist bz

view this post on Zulip Sean (Aug 23 2020 at 00:28):

that'll work -- I've used it before that way

view this post on Zulip starseeker (Aug 23 2020 at 00:28):

fb_open(name?name:framebuffer, xx, yy)) is its invocation pattern - is that what we need to test here?

view this post on Zulip starseeker (Aug 23 2020 at 00:29):

Do you usually launch it from within MGED?

view this post on Zulip Sean (Aug 23 2020 at 00:29):

it's really just replace an rt call with remrt, then run rtsrv pointing it to that host

view this post on Zulip Sean (Aug 23 2020 at 00:29):

no, that'd get ugly

view this post on Zulip Sean (Aug 23 2020 at 00:29):

usually start with a saveview script

view this post on Zulip starseeker (Aug 23 2020 at 00:29):

Man page calls out "rrt remrt -M -s###"

view this post on Zulip Sean (Aug 23 2020 at 00:29):

then edit the script to replace rt with remrt

view this post on Zulip Sean (Aug 23 2020 at 00:30):

rrt should work, but that's introducing a whole other thing too just fyi

view this post on Zulip Sean (Aug 23 2020 at 00:30):

er, and rrt -M is going to wait for stdin

view this post on Zulip starseeker (Aug 23 2020 at 00:30):

/me just noting that the man page could use a step-by-step example

view this post on Zulip Sean (Aug 23 2020 at 00:30):

which is a lil tricky from mged

view this post on Zulip Sean (Aug 23 2020 at 00:31):

i mean, with all the proc changes, should also make sure rrt still works ...

view this post on Zulip Sean (Aug 23 2020 at 00:31):

but would test that separate from remrt

view this post on Zulip Sean (Aug 23 2020 at 00:31):

just try like rrt rtedge

view this post on Zulip starseeker (Aug 23 2020 at 00:31):

/me nods

view this post on Zulip Sean (Aug 23 2020 at 00:32):

but yeah, remrt is usually for big long-running jobs -- no way I'd do that inside mged

view this post on Zulip starseeker (Aug 23 2020 at 00:32):

First let me see if I can get anything to work... considering how long it took me to get hello world going with libpkg this could be good for some comedy

view this post on Zulip Sean (Aug 23 2020 at 00:36):

so to assist, here's what you'll need (from memory):

0) start an fbserv -S2048 0 /dev/mem
1) run mged saveview
2) edit script to call remrt
3) edit script switches to remove -o outputfile and file logging lines
4) edit script to -F0 or whatever your fbserv is running on
5) run script, make sure it's waiting for clients
6) run rtsrv from two hosts

view this post on Zulip Sean (Aug 23 2020 at 00:37):

you'll of course want to make sure the render job is expensive enough to not immediately terminate. one of the nurbs samples with ambient occlusion should easily do, so you could add -c "set ambSamples=1024 ambSlow=1" to the script

view this post on Zulip Sean (Aug 23 2020 at 00:38):

unrelated 76891 needs NEWS

view this post on Zulip Sean (Aug 23 2020 at 00:41):

also, your 76890 comment sounds like you might not be familiar with that convention, but '-' means stdin or stdout depending on the context and tool, it's like a magic name so a user can specify stdin/stdout via args

view this post on Zulip Sean (Aug 23 2020 at 00:42):

we're not consistent: a few of our tools support that, but many don't yet

view this post on Zulip starseeker (Aug 23 2020 at 00:42):

/me nods. Not familiar with it, so good to know.

view this post on Zulip Sean (Aug 23 2020 at 00:44):

can see examples of it in action: grep \"-\" src/util/*.c

view this post on Zulip Sean (Aug 23 2020 at 00:46):

usually it'll be something like "cat file2.pix | pixdiff file1.pix -"
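
A minimal C sketch of how a tool can honor that convention on the input side. open_input and close_input are hypothetical helper names, not functions from src/util; the only assumption is that "-" should map to stdin and everything else is treated as a file path.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "-" selects stdin, any other name is opened as a file.
 * Returns NULL if the name is NULL or the file cannot be opened. */
FILE *open_input(const char *name)
{
    if (name == NULL)
        return NULL;
    if (strcmp(name, "-") == 0)
        return stdin;
    return fopen(name, "rb");
}

/* Close whatever open_input() returned, but never close stdin itself. */
void close_input(FILE *fp)
{
    if (fp != NULL && fp != stdin)
        fclose(fp);
}
```

A tool written this way reads the piped data when invoked as in the pixdiff example above, because its second argument resolves to stdin.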

view this post on Zulip starseeker (Aug 23 2020 at 00:48):

So... How will I know when the rtsrv command has connected with remrt?

view this post on Zulip Sean (Aug 23 2020 at 00:48):

and of course -- typically means "stop processing args, pass the rest along to the next thing that processes args"

view this post on Zulip starseeker (Aug 23 2020 at 00:49):

Blast - can't contact

view this post on Zulip Sean (Aug 23 2020 at 00:49):

remrt logs it getting jobs

view this post on Zulip starseeker (Aug 23 2020 at 00:49):

rtsrv: unable to contact <ip>, port <num>

view this post on Zulip Sean (Aug 23 2020 at 00:49):

it's instantaneous when everything is correct

view this post on Zulip Sean (Aug 23 2020 at 00:51):

use rtsrv -d for debugging

view this post on Zulip starseeker (Aug 23 2020 at 00:53):

OK the port it was listening on wasn't the port that remrt claimed it was listening on

view this post on Zulip starseeker (Aug 23 2020 at 00:53):

Uh... "setpgid: Operation not permitted" and rtsrv died.

view this post on Zulip Sean (Aug 23 2020 at 00:54):

defaults to port 4446

view this post on Zulip starseeker (Aug 23 2020 at 00:54):

08/22 20:45:45 Automatic REMRT on ubuntu2019
08/22 20:45:45 Listening at port 24081, reading script on stdin

what port is it referring to there?

view this post on Zulip Sean (Aug 23 2020 at 00:54):

the "Listening at port .. #" is that whole automatic transient port thing we were talking about months ago

view this post on Zulip starseeker (Aug 23 2020 at 00:55):

Oh, that business. ugh.

view this post on Zulip Sean (Aug 23 2020 at 00:55):

that's not the connection port, that's just the internal listening reassignment the kernel gave it

view this post on Zulip Sean (Aug 23 2020 at 00:55):

there's not really much value in it printing it .. obviously misleading

view this post on Zulip Sean (Aug 23 2020 at 00:56):

all it really means is that it's listening, and a rtsrv -d localhost 4446 should attach

view this post on Zulip starseeker (Aug 23 2020 at 00:59):

It does, but rtsrv is dying when it tries to fork with the setpgid error

view this post on Zulip Sean (Aug 23 2020 at 01:02):

uh oh, looks like fbclear -c is broken

view this post on Zulip starseeker (Aug 23 2020 at 01:06):

And launching rtsrv from bz isn't getting through to remrt

view this post on Zulip starseeker (Aug 23 2020 at 01:09):

@Sean I don't think my machine is accessible from the outside back in...

view this post on Zulip Sean (Aug 23 2020 at 01:09):

try via localhost

view this post on Zulip Sean (Aug 23 2020 at 01:10):

and yes, you'd definitely need to configure your router if you're not using a DMZ

view this post on Zulip starseeker (Aug 23 2020 at 01:13):

remrt sees a connection attempt, but that's when I get setpgid: Operation not permitted

view this post on Zulip starseeker (Aug 23 2020 at 01:13):

and rtsrv dies

view this post on Zulip Sean (Aug 23 2020 at 01:14):

add -d

view this post on Zulip starseeker (Aug 23 2020 at 01:14):

PROTOCOL_VERSION='BRL-CAD REMRT Protocol v2.0'
using 12 of 12 cpus
ph_loglvl 1
ph_dirbuild: NIST_MBE_PMI_2.g

ph_dirbuild: rt_dirbuild(NIST_MBE_PMI_2.g) failure

view this post on Zulip Sean (Aug 23 2020 at 01:15):

make sure the .g is in cwd

view this post on Zulip Sean (Aug 23 2020 at 01:15):

there are ways to tell remrt to transfer it, but I don't know how to set that up

view this post on Zulip starseeker (Aug 23 2020 at 01:15):

Ah, there we go. I thought it would send the .g over the wire

view this post on Zulip Sean (Aug 23 2020 at 01:15):

default assumes all rtsrv's have access

view this post on Zulip starseeker (Aug 23 2020 at 01:16):

-d or not, it should probably print the file not found error - that setpgid thing was extremely confusing

view this post on Zulip Sean (Aug 23 2020 at 01:17):

looking at the code, setpgid is probably just a perror warning

view this post on Zulip Sean (Aug 23 2020 at 01:17):

meaning it's not authorized to run in a proper daemon server mode

view this post on Zulip Sean (Aug 23 2020 at 01:17):

which is part of what it's trying to set up there

view this post on Zulip Sean (Aug 23 2020 at 01:18):

debug mode skips all that including skipping the fork+exec

view this post on Zulip starseeker (Aug 23 2020 at 01:20):

If you don't mind, I'm going to add these steps as an example to the remrt man page...

view this post on Zulip Sean (Aug 23 2020 at 01:22):

heh, why would I mind?? It is a whole section in the advanced rendering guide I started last year

view this post on Zulip Sean (Aug 23 2020 at 01:22):

I really need to format that into docbook

view this post on Zulip starseeker (Aug 23 2020 at 01:23):

Wanted to make sure you weren't already altering the man page.

view this post on Zulip Sean (Aug 23 2020 at 01:27):

no, as awesome as it is for production rendering, it's been strictly minimally maintained thus far -- every big render I've done I've come across dozens of bugs and inflexibilities and things that need to be improved.

view this post on Zulip Sean (Aug 23 2020 at 01:27):

I've only done the minimum needed to get it working, try to justify making a few improvements until the next time.

view this post on Zulip Sean (Aug 23 2020 at 01:27):

things like setpgid() and the error reporting just scratch the surface.

view this post on Zulip starseeker (Aug 23 2020 at 01:28):

I don't want to get distracted from the release, but I'll at least try to write down what it took to get it running so I don't have to repeat the flopping a second time...

view this post on Zulip Sean (Aug 23 2020 at 01:28):

I'm not sure what's wrong, but remrt isn't working here for me

view this post on Zulip Sean (Aug 23 2020 at 01:28):

and fbclear -c does seem to be fully busted

view this post on Zulip Sean (Aug 23 2020 at 01:28):

don't know if that's new

view this post on Zulip starseeker (Aug 23 2020 at 01:29):

I'll check in a few minutes, once I'm done with the man page... one sec...

view this post on Zulip starseeker (Aug 23 2020 at 02:14):

@Sean is the fbclear breakage new?

view this post on Zulip starseeker (Aug 23 2020 at 02:20):

If it's hanging indefinitely, I'm seeing that in 7.30.10. if_remote.c:702 is doing a pkg_send, then waiting on 719 for a response that's apparently not coming.

view this post on Zulip Sean (Aug 23 2020 at 03:53):

what's up with r76889? sounds like you disabled a failing regression test...

view this post on Zulip starseeker (Aug 23 2020 at 12:55):

I partially re-enabled a test that I had fully disabled earlier. The input file being fed to asc2dsp for the other part is reporting an invalid character - I can't tell if it ever worked or even should have worked...

view this post on Zulip starseeker (Aug 23 2020 at 13:05):

That's a drawback with how some of the regression tests are set up (or at least, it has been - I've not done a systematic audit) - intermediate commands can fail, but as long as the final command executed by the script returns 0 the intermediate failures don't always fail the test.

If I remember correctly with asc2dsp, the input file names passed to asc2dsp were incorrect (probably after something got moved or renamed). However, asc2dsp still creates an empty output file in that case. So the test created two empty output files, which cmp found were the same, ergo the test passed.

view this post on Zulip starseeker (Aug 23 2020 at 13:07):

I'm not sure if asc2dsp just doesn't really support the "old" format, or this is one of those cases where a .dsp file got tagged as ASCII rather than binary (or vice versa) and the file contents at SVN checkout don't match what asc2dsp is expecting.

view this post on Zulip Sean (Aug 24 2020 at 13:46):

Barring bugs or new deviations introduced, all of the tests were structured to accumulate any errors and return that as a result. Relying on the last command's status is not a good idea for a whole host of reasons.
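
The difference between the two policies can be modeled in a few lines of C. The regression tests themselves are shell scripts; C is used here only to model the control flow, and every name below (step_fn, run_last_status, run_accumulated, step_ok, step_fail) is hypothetical.

```c
#include <stddef.h>

/* A test step returns 0 on success and nonzero on failure. */
typedef int (*step_fn)(void);

/* Fragile policy: the result is whatever the final step returned,
 * so earlier failures are silently lost. */
int run_last_status(const step_fn *steps, size_t n)
{
    int status = 0;
    size_t i;

    for (i = 0; i < n; i++)
        status = steps[i]();
    return status;
}

/* Accumulating policy: every failing step is counted, and the count
 * is the overall result, so any failure makes the test fail. */
int run_accumulated(const step_fn *steps, size_t n)
{
    int failures = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (steps[i]() != 0)
            failures++;
    }
    return failures;
}

/* Sample steps for the sketch. */
int step_ok(void) { return 0; }
int step_fail(void) { return 1; }
```

For the sequence ok, fail, ok, run_last_status reports success while run_accumulated reports one failure, which is the asc2dsp situation described above: a broken intermediate command followed by a final cmp that happens to succeed.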

view this post on Zulip Sean (Aug 24 2020 at 13:50):

To that specific commit, sounds good if the test was already disabled and it's a partial re-enabling. My only concern was that we shouldn't disable a test that was working.

view this post on Zulip Sean (Aug 24 2020 at 13:52):

It's not doing any regression testing from the looks of the code either, so since it's presenting a maintenance cost, may make sense to remove it.

view this post on Zulip Sean (Aug 24 2020 at 13:54):

@starseeker on the matter of NULL checking, given the state of all the calling sites needing a null check and most still not really looking like they'd gracefully handle a NULL condition at all, it's worth considering making the API never return NULL (i.e., making it impossible).

view this post on Zulip starseeker (Aug 24 2020 at 13:56):

You mean returning an "(NULL)" string?

view this post on Zulip Sean (Aug 24 2020 at 13:57):

General way to do that would be to make the error condition a special value instead of a special pointer. Any value could work but ideally one that might make sense when printed.

view this post on Zulip starseeker (Aug 24 2020 at 13:57):

I'm working my way through the list of functions you itemized earlier, trying to make sure they do something sane.

view this post on Zulip Sean (Aug 24 2020 at 13:57):

so yeah, "(NULL)" could work though that's what stdio uses on Linux, so wouldn't be able to disambiguate.
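
A hedged C sketch of the never-returns-NULL idea for a string accessor. Nothing here is libdm API: struct sketch_dm, sketch_dm_name, and the sentinel text are all invented for illustration. The sentinel is deliberately not "(NULL)", so it stays distinguishable from what stdio prints for a null string on Linux.

```c
#include <string.h>

/* Hypothetical display-manager handle with just a name field. */
struct sketch_dm {
    const char *name;
};

/* Printable sentinel returned instead of NULL. */
static const char SKETCH_DM_UNSET[] = "(unset dm)";

/* Never returns NULL: a missing handle or name yields the sentinel,
 * so callers can print or compare the result without a null check. */
const char *sketch_dm_name(const struct sketch_dm *dmp)
{
    if (dmp == NULL || dmp->name == NULL)
        return SKETCH_DM_UNSET;
    return dmp->name;
}

/* Callers that do care about the error case test for the special value. */
int sketch_dm_name_is_unset(const char *name)
{
    return name != NULL && strcmp(name, SKETCH_DM_UNSET) == 0;
}
```

The trade-off is that the error check moves from the pointer to the value: callers that ignore it still get safe, printable output, and callers that need to react call sketch_dm_name_is_unset.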

view this post on Zulip Sean (Aug 24 2020 at 13:58):

I noticed, appreciated as there were a slew of potential null derefs introduced -- that's why I mention it.

view this post on Zulip starseeker (Aug 24 2020 at 13:58):

@Sean I'm leery of going too far down this rabbit hole, since the case of a null DMP isn't going to come up in normal usage and the entire structure of this whole thing (IMHO) needs some serious TLC

view this post on Zulip Sean (Aug 24 2020 at 13:59):

yes

view this post on Zulip Sean (Aug 24 2020 at 14:00):

that's why I mention it, I was leery too -- adding all the null checks doesn't feel right because it's decreasing readability, increasing complexity

view this post on Zulip starseeker (Aug 24 2020 at 14:00):

bu_list everywhere, globals galore... feels like trying to paint a rust bucket, in some ways

view this post on Zulip Sean (Aug 24 2020 at 14:00):

but that's the API. so either the API should change to not return NULL, or the callers are technically required to test it, however rare a condition

view this post on Zulip starseeker (Aug 24 2020 at 14:01):

I'm seeing a fair number of null checks already, actually, as I'm going through - found a few missing

view this post on Zulip starseeker (Aug 24 2020 at 14:01):

whether the calling code does something "sane" after doing the check is another matter, but that was presumably baked in from the get go...

view this post on Zulip starseeker (Aug 24 2020 at 14:03):

I think the reason it looks funky now is mostly that I straight up replaced all variable accesses with function calls, which resulted in a lot of double calls in if tests and bu_vls string outputs - that's most of what I'm trying to clean up now.

view this post on Zulip starseeker (Aug 24 2020 at 14:03):

I'm game to make it return not null, but I'm not sure what to do about the functions returning bu_vls strings in that respect.

view this post on Zulip starseeker (Aug 24 2020 at 14:04):

Do we malloc a vls internally in the null case, and then the caller has to check for a null vls and free it if that's what they get? that feels wrong...

view this post on Zulip Sean (Aug 24 2020 at 14:05):

starseeker said:

bu_list everywhere, globals galore... feels like trying to paint a rust bucket, in some ways

That can be said of almost any code, but I don't think it's a constructive perspective. Dismissing any issue in code and more importantly allowing new ones to be introduced because someone else dismissed something is not something I support (and HACKING has explicitly called out since inception). That's very much a canonical fallacy in my experience.

view this post on Zulip Sean (Aug 24 2020 at 14:07):

starseeker said:

I think the reason it looks funky now is mostly that I straight up replaced all variable accesses with function calls, which resulted in a lot of double calls in if tests and bu_vls string outputs - that's most of what I'm trying to clean up now.

I realize that, but therein is now a new issue that did not exist before as functions express contracts, variables do not. So you kind of birthed this beast :smile:

view this post on Zulip starseeker (Aug 24 2020 at 14:09):

Public struct members don't constitute a contract?

view this post on Zulip Sean (Aug 24 2020 at 14:09):

starseeker said:

I'm game to make it return not null, but I'm not sure what to do about the functions returning bu_vls strings in that respect.

That's tricky as there's dynamic memory associated. How about eliminating them?

view this post on Zulip starseeker (Aug 24 2020 at 14:10):

Not trivial to eliminate. Calling code assumes ability to stash the tcl path in them, IIRC

view this post on Zulip starseeker (Aug 24 2020 at 14:11):

Tk window path rather

view this post on Zulip Sean (Aug 24 2020 at 14:12):

starseeker said:

Public struct members don't constitute a contract?

They're just containers. There are no expectations for C or C++. There often are assumptions on those containers, but they're all generally subject to error. That's why for C and C++ at least the best you can generally do is dictate a convention of who owns the memory.

view this post on Zulip Sean (Aug 24 2020 at 14:12):

C++ formalizes that a fair bit better, but in C it's typically whomever allocated is responsible for releasing.

view this post on Zulip Sean (Aug 24 2020 at 14:13):

so whoever created the struct would be responsible for ensuring its embedded values and pointers are "sane". That's also still not a contract on those values by callers though -- it was just the creator/initializer's responsibility to minimize the risk, ideally provide a safe API, which is what codifies expectations.

view this post on Zulip starseeker (Aug 24 2020 at 14:15):

So, meta issue for a second... up until now we've not considered the libdm/libfb APIs truly public, since they were not intended to survive in their current form. That's still true - this refactor is not what I would propose if the intent was to establish a proper API. It feels like you're wanting to treat the function wrappers as something more "public" than the old libdm/fb structs - am I misunderstanding the intent?

view this post on Zulip Sean (Aug 24 2020 at 14:15):

starseeker said:

Not trivial to eliminate. Calling code assumes ability to stash the tcl path in them, IIRC

How about making the caller pass a vls to it

view this post on Zulip starseeker (Aug 24 2020 at 14:16):

Not sure how hard that would be - don't know if calling functions are set up to properly manage memory. We might have to make application global vls strings for that to work in MGED.

view this post on Zulip starseeker (Aug 24 2020 at 14:17):

Would probably work, but even in MGED introducing new globals just feels wrong...

view this post on Zulip Sean (Aug 24 2020 at 14:17):

To the meta point, no -- public API has a much higher burden still. This is an issue with any functions, not one just applied to public functions.

view this post on Zulip Sean (Aug 24 2020 at 14:18):

The issue is functions that generate NULL. It's one thing to be handed NULL and then pass that along, it's another to be the returning source of NULL.

view this post on Zulip starseeker (Aug 24 2020 at 14:19):

For the code having issues from a NULL from one of these functions though, wouldn't it have had the same problems trying to access an invalid struct member in the old code?

view this post on Zulip Sean (Aug 24 2020 at 14:19):

The only place that should be given a pass is performance-critical code, so you will/should find deviations in librt and some parts of libbn/libbu used in those performance paths.

view this post on Zulip Sean (Aug 24 2020 at 14:25):

starseeker said:

For the code having issues from a NULL from one of these functions though, wouldn't it have had the same problems trying to access an invalid struct member in the old code?

It really depends on the specifics, devil in the details. There are entire theories of corporate development that prohibit refactoring in NULL as it's exceptionally common for code to coincidentally work. So there could be swaths of logic that happen to gracefully handle a variety of invalid conditions by accident, but handled none the less. I don't think that's our issue here or concern -- it's the simple principle of having a function that can return NULL, you're supposed to check it otherwise there is no point whatsoever in returning NULL and the consequence will be eventual obscure crashing.

view this post on Zulip Sean (Aug 24 2020 at 14:26):

We've run into this issue many times over the years too, so it's not just a rare hypothetical.

view this post on Zulip starseeker (Aug 24 2020 at 14:26):

So is the shortest path forward for me to finish the check through the code for the null checks, or should I try to look at excising the bu_vls slots from the dm implementation struct?

view this post on Zulip starseeker (Aug 24 2020 at 14:27):

Need to tie this off - I'm game to do the work, but need to have a finish line where this can be called done

view this post on Zulip Sean (Aug 24 2020 at 14:28):

Shortest and safest path depends if we're talking long or short term, but the difference is probably only an hour or two effort either way.

view this post on Zulip starseeker (Aug 24 2020 at 14:29):

All right - what in your estimate is the best way forward?

view this post on Zulip Sean (Aug 24 2020 at 14:30):

Make it so caller doesn't have to check null, make caller pass in container if it would otherwise result in an allocation
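
A rough sketch of "make caller pass in container", using a plain character buffer rather than a bu_vls so it stays self-contained. `toy_dm` and `toy_dm_copy_pathname` are hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a display manager; not the real libdm struct. */
struct toy_dm {
    const char *pathname;   /* may be NULL */
};

/* Copy the path into caller-owned storage, so nothing is allocated here
 * and no pointer is returned that could be NULL.
 * Returns 0 on success, -1 if there is no path or it does not fit.
 * When outlen > 0, out is always NUL-terminated on return. */
int
toy_dm_copy_pathname(const struct toy_dm *dmp, char *out, size_t outlen)
{
    int n;

    if (!out || outlen == 0)
        return -1;
    out[0] = '\0';
    if (!dmp || !dmp->pathname)
        return -1;

    n = snprintf(out, outlen, "%s", dmp->pathname);
    if (n < 0 || (size_t)n >= outlen)
        return -1;   /* output was truncated to fit */
    return 0;
}
```

The caller owns the buffer, so there is nothing to free and no ownership convention to document; the status code replaces the NULL test.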

view this post on Zulip Sean (Aug 24 2020 at 14:31):

if null must be returned for some deeper hard-to-untangle reason, then the call sites should get checked

view this post on Zulip starseeker (Aug 24 2020 at 14:32):

I'll see what I can do.

btw - is this the last big blocker, or have you seen other things that'll need to be handled?

view this post on Zulip starseeker (Aug 24 2020 at 14:32):

(so far - I know you're still commit reviewing...)

view this post on Zulip Sean (Aug 24 2020 at 14:32):

I've been getting through about 100 commits per day

view this post on Zulip Sean (Aug 24 2020 at 14:32):

so more than half way there now

view this post on Zulip Sean (Aug 24 2020 at 14:33):

course that includes reviewing the new stuff as it's coming in, so it's an uphill struggle :)

view this post on Zulip Sean (Aug 24 2020 at 14:33):

there's a lot of little undocumented things that I'm concerned about but not enough to hold things up

view this post on Zulip starseeker (Aug 24 2020 at 14:33):

I could hold off and do one big commit to try and clean up libdm :-P

view this post on Zulip Sean (Aug 24 2020 at 14:34):

lots of things that beg testing :(

view this post on Zulip Sean (Aug 24 2020 at 14:34):

that'd just screw up the commit review rate :P

view this post on Zulip Sean (Aug 24 2020 at 14:34):

you heard about paragon's give to linux kernel? :)

view this post on Zulip Sean (Aug 24 2020 at 14:35):

just heard about that yesterday

view this post on Zulip starseeker (Aug 24 2020 at 14:35):

No, haven't been plugged in to outside world much lately - what'd they give?

view this post on Zulip Sean (Aug 24 2020 at 14:35):

they submitted their ntfs driver

view this post on Zulip Sean (Aug 24 2020 at 14:36):

it's a 27k line pull request

view this post on Zulip starseeker (Aug 24 2020 at 14:36):

<snork> Given how the Linux folks reacted to ZFS, I can't wait to see the comments on that one...

view this post on Zulip starseeker (Aug 24 2020 at 14:37):

Looks like we've got 4 vls strings in the dm container - two Tcl/Tk drawing window strings, the display name, and a log.

view this post on Zulip Sean (Aug 24 2020 at 14:37):

paragon is known for having one of the best ntfs drivers and linux needs it, but they're also notorious for having crappy issues in their drivers (including the ntfs one)

view this post on Zulip Sean (Aug 24 2020 at 14:38):

starseeker said:

Looks like we've got 4 vls strings in the dm container - two Tcl/Tk drawing window strings, the display name, and a log.

A real cheap 'punt' that avoids lots of restructuring could be to convert them into char[]

view this post on Zulip Sean (Aug 24 2020 at 14:39):

then you'd only need to worry about the handful of sites that write

view this post on Zulip starseeker (Aug 24 2020 at 14:39):

/me nods - got to check where they're being accessed.

view this post on Zulip Sean (Aug 24 2020 at 14:40):

like if you made them four char[255] arrays, just need to make sure nobody writes past that

view this post on Zulip Sean (Aug 24 2020 at 14:41):

dm_log looks like it should have been a flag from the comment

view this post on Zulip Sean (Aug 24 2020 at 14:41):

unless it's the actual log, then the comment is just wrong

view this post on Zulip starseeker (Aug 24 2020 at 14:41):

Probably want the Tk paths a bit longer - some of the Itk names can get long.

I'll have to check, but I think it's supposed to be a log filename.

view this post on Zulip Sean (Aug 24 2020 at 14:42):

the other three look like they could get away with #define MAXNAME 255 ; char[MAXNAME] or similar

view this post on Zulip Sean (Aug 24 2020 at 14:42):

ah, if it's a filepath, then we really gobble up memory but that's fine

view this post on Zulip Sean (Aug 24 2020 at 14:42):

for filesystem paths use MAXPATHLEN

view this post on Zulip starseeker (Aug 24 2020 at 14:43):

Does Tk have a maximum path length for their window names? If so that's what we'd want there...

view this post on Zulip starseeker (Aug 24 2020 at 14:43):

/me flips a type to see what blows up in the build...

view this post on Zulip Sean (Aug 24 2020 at 14:43):

I doubt it.

view this post on Zulip Sean (Aug 24 2020 at 14:44):

in practice for dm, though, the path names are things like .id_0.some_window.childsite.widgetfoo_window.dialog

view this post on Zulip Sean (Aug 24 2020 at 14:44):

that would be a pretty extreme example too. most are far more concise

view this post on Zulip starseeker (Aug 24 2020 at 14:45):

OK - as long as Archer doesn't do anything crazy - I remember Bob command line manipulating Itk windows once, and the names were kinda nutty

view this post on Zulip Sean (Aug 24 2020 at 14:46):

I've never seen any get close to 256

view this post on Zulip starseeker (Aug 24 2020 at 14:47):

/me nods we can give 255 a shot. I imagine we'll know pretty quick if there's an issue...

view this post on Zulip Sean (Aug 24 2020 at 14:48):

I was thinking indexing when I wrote that 0-255 .. size should be 256 so it stays aligned in memory

view this post on Zulip Sean (Aug 24 2020 at 14:49):

that's also where the calling sites that write into the buffer can then just bu_bomb justifiably and continuing code can assume it always works
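
The char[] "punt" might look roughly like this. All names are hypothetical, and the setter returns an error code where the thread suggests bu_bomb, purely so the sketch can be exercised without aborting.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAXNAME 256   /* room for 255 characters plus the terminating NUL */

/* Hypothetical container: fixed arrays instead of dynamic strings. */
struct toy_dm {
    char pathname[TOY_MAXNAME];
    char tkname[TOY_MAXNAME];
    char dname[TOY_MAXNAME];
};

/* Bounded write used by the handful of sites that set a name.
 * Returns 0 on success, -1 if src is NULL or would not fit;
 * dst is left unchanged on failure. */
int
toy_set_name(char dst[TOY_MAXNAME], const char *src)
{
    size_t len;

    if (!src)
        return -1;
    len = strlen(src);
    if (len >= TOY_MAXNAME)
        return -1;
    memcpy(dst, src, len + 1);   /* copies the NUL as well */
    return 0;
}

/* Reader side: an array member of a valid struct cannot be NULL. */
const char *
toy_dm_get_pathname(const struct toy_dm *dmp)
{
    return dmp->pathname;
}
```

Only the writers need a length check; readers get a pointer into the struct and, given a valid struct, have nothing to test.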

view this post on Zulip Sean (Aug 24 2020 at 14:50):

oh wait

view this post on Zulip Sean (Aug 24 2020 at 14:51):

so there is another solution here that may be even less work

view this post on Zulip Sean (Aug 24 2020 at 14:51):

and can keep the vls

view this post on Zulip Sean (Aug 24 2020 at 14:53):

the issue is strictly that they return NULL, so like dm_get_pathname() you could just make it halt on that condition. then callers never need to check. It's not a great pattern to do aggressively, but it certainly seems reasonable for the four vls functions as they're in the dmp.

view this post on Zulip Sean (Aug 24 2020 at 14:53):

i.e., they're not pointers in the dmp

view this post on Zulip Sean (Aug 24 2020 at 14:54):

dm_get_vp() is a good counter-example. it returns null and is potentially null itself in the dmp.

view this post on Zulip starseeker (Aug 24 2020 at 14:56):

Make it halt? You mean bu_exit?

view this post on Zulip Sean (Aug 24 2020 at 14:57):

no, bu_exit is for applications

view this post on Zulip Sean (Aug 24 2020 at 14:57):

bu_bomb

view this post on Zulip Sean (Aug 24 2020 at 14:57):

or a magic number check

view this post on Zulip Sean (Aug 24 2020 at 14:57):

(which bu_bombs)

view this post on Zulip Sean (Aug 24 2020 at 14:58):

since that's conceptually what you're doing with the tests anyways, just checking the dmp

view this post on Zulip Sean (Aug 24 2020 at 14:58):

might as well shove a magic in there and BU_CKMAG it

view this post on Zulip Sean (Aug 24 2020 at 14:58):

then no null return possible
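
A generic sketch of the magic-number pattern, written with the C standard library only. It is in the spirit of a BU_CKMAG check but does not reproduce that macro; `TOY_DM_MAGIC` and the `toy_*` functions are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_DM_MAGIC 0x646d3031u   /* arbitrary illustrative value */

/* Hypothetical container with a magic number as its first member. */
struct toy_dm {
    uint32_t magic;
    char pathname[256];
};

/* Every allocation/initialization path must set the magic. */
void
toy_dm_init(struct toy_dm *dmp)
{
    dmp->magic = TOY_DM_MAGIC;
    dmp->pathname[0] = '\0';
}

/* Non-fatal form of the check, handy for tests. */
int
toy_dm_is_valid(const struct toy_dm *dmp)
{
    return dmp != NULL && dmp->magic == TOY_DM_MAGIC;
}

/* Fatal form: halt instead of letting a bad pointer propagate. */
void
toy_dm_check(const struct toy_dm *dmp)
{
    if (!toy_dm_is_valid(dmp)) {
        fprintf(stderr, "toy_dm_check: bad or uninitialized dm pointer\n");
        abort();
    }
}

/* The getter either halts or returns a usable pointer, never NULL. */
const char *
toy_dm_get_pathname(const struct toy_dm *dmp)
{
    toy_dm_check(dmp);
    return dmp->pathname;
}
```

The trade-off is that any code path that creates the struct without calling the init step will now halt on first use, rather than quietly carrying on.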

view this post on Zulip starseeker (Aug 24 2020 at 15:27):

@Sean a quick experiment with a BU_CKMAG in the dm causes a couple of MGED crashes...

view this post on Zulip starseeker (Aug 24 2020 at 15:27):

Working through them...

view this post on Zulip starseeker (Aug 24 2020 at 15:31):

There we go - was r76921 what you were thinking?

view this post on Zulip starseeker (Aug 24 2020 at 15:41):

(along with r76922)

view this post on Zulip starseeker (Aug 24 2020 at 15:42):

If so I'll have to make another pass in the calling code to handle the (DM-NULL) returns instead of NULL (there may still be cases where we don't want to feed that to a Tcl script for execution...)

view this post on Zulip Sean (Aug 24 2020 at 16:17):

starseeker said:

Sean a quick experiment with a BU_CKMAG in the dm causes a couple of MGED crashes...

Nominally means NULL is/was being returned at least somewhere/sometime! That's why all the fuss... even if the condition is an unexpected/unintended behavior or even if it happened to work today (might not tomorrow after some other change or recompile).

view this post on Zulip Sean (Aug 24 2020 at 16:18):

I noticed several of the funcs are unused -- what's up with that? :)

view this post on Zulip starseeker (Aug 24 2020 at 17:04):

At least one of them I just added as a get/set pairing, when only set was needed

view this post on Zulip starseeker (Aug 24 2020 at 17:06):

Actually, crash ended up being silly - just me missing one of the magic initialization cases in the allocation logic.

view this post on Zulip Sean (Aug 24 2020 at 19:32):

starseeker said:

At least one of them I just added as a get/set pairing, when only set was needed

But if nothing reads them, why bother setting them... can they be removed?

view this post on Zulip starseeker (Aug 24 2020 at 20:01):

It's read internally by the ogl backend, IIRC - a parameter passed from the app to the backend. I'm sure it can be refactored somehow - there's a TODO note about it - but I haven't had time to fool with it. That level of the drawing logic, it's easy to break things in subtle ways that are hard to catch.

view this post on Zulip starseeker (Aug 25 2020 at 15:31):

OK, so upon further investigation (at least in latest trunk) the X dm seems to work embedded in Tk on Linux - it's only the dmtype set (runtime switching of the display manager types) that's failing.

Setting it in the .mgedrc works fine. That's not too surprising, really - dmtype was always a rather evil hack...

view this post on Zulip Sean (Aug 25 2020 at 16:44):

does attach work?

view this post on Zulip Sean (Aug 25 2020 at 16:44):

attach X .. attach ogl

view this post on Zulip starseeker (Aug 25 2020 at 18:33):

Yes, attach works.

view this post on Zulip Sean (Aug 25 2020 at 21:04):

both simultaneously?

view this post on Zulip starseeker (Aug 25 2020 at 21:08):

Yes.

view this post on Zulip starseeker (Aug 25 2020 at 21:10):

mged -c -a X share/db/moss brings up the X based version, then I can "attach ogl" and also get an OpenGL version. "e all.g" draws to both.

view this post on Zulip Sean (Aug 25 2020 at 21:11):

That's great. It should accept an arbitrary number of attach calls. That's wicked fun way to set up mged on CAVE walls and projection displays.

view this post on Zulip Sean (Aug 26 2020 at 03:59):

just fyi, in regex [12] and [1-2] are equivalent (saw the real error, missing a right bracket)

view this post on Zulip starseeker (Aug 31 2020 at 12:09):

@Sean any more issues pop up in commit review?

view this post on Zulip Sean (Aug 31 2020 at 19:54):

quite a few flagged as needing testing, but none yet that warrant release blocking. some still to document; more likely to miss one or two of those :/ .. I think I'm over 90% through, so should be done soon.

view this post on Zulip starseeker (Aug 31 2020 at 23:56):

/me stares at the MGED and dm-ogl code, and reluctantly concludes that a major shift in the way dm/fb rendering events are managed is too much of a time sink to tangle with for now...

view this post on Zulip starseeker (Aug 31 2020 at 23:58):

Will require a deep dive into Tk window setup, management, and events... blegh.

view this post on Zulip Sean (Sep 01 2020 at 06:14):

@starseeker one thing you could check for me -- can you see if obj export is working? see if an object from bot_dump and g-obj will open in anything else.

view this post on Zulip Sean (Sep 01 2020 at 06:17):

was reported and recorded but don't think it got checked at all as a release blocker, don't think we have anything that validates obj, only verification

view this post on Zulip Sean (Sep 01 2020 at 07:35):

/me is down to 165 remaining, woot!

view this post on Zulip starseeker (Sep 01 2020 at 11:46):

bot_dump'ed a facetized sph to an obj file, and ran ./bin/g-obj -o g_obj.obj sph_bot.g sph.s on the same sph - both .obj files opened successfully in meshlab.

So working at a basic level, at least on Linux.

view this post on Zulip starseeker (Sep 01 2020 at 14:44):

Is it me, or is sourceforge slow today?

view this post on Zulip Sean (Sep 04 2020 at 07:05):

huh, okay. thanks for checking! that's really bizarre because I do recall that distinctly not working. Does Blender open them? I'll have to recheck xcode and cura, or maybe it's somehow mac-specific.

view this post on Zulip starseeker (Sep 04 2020 at 12:16):

I'll try on Windows - rtwizard is barking only on Windows, so maybe it'll expose this as well...

view this post on Zulip starseeker (Sep 04 2020 at 16:09):

Seems OK on Windows as well. Only thing noticed so far was an empty file getting created if I tried the -b option with -t obj for bot_dump. It provided an informative error message, but ideally shouldn't have produced the empty output file.

view this post on Zulip Sean (Sep 11 2020 at 18:22):

@starseeker do you have any more information you can give about the NUL file getting created -- what does "test -f /dev/null" report?

view this post on Zulip starseeker (Sep 11 2020 at 18:23):

I'll see if I can tell - it only fails when run inside of CTest - a straight command-line run succeeds

view this post on Zulip Sean (Sep 11 2020 at 18:28):

all the more bizarre.. 'test' is used all throughout the benchmark and other scripts run by ctest.

view this post on Zulip Sean (Sep 11 2020 at 18:29):

very concerning if something this fundamental isn't reliable

view this post on Zulip starseeker (Sep 11 2020 at 18:34):

I'm not sure what to make of this. If I put the following in:

if test -f /dev/null ; then
    echo $?
    echo "have /dev/null"
else
    echo $?
    echo "no /dev/null"
fi

I get:

875: 1
875: no /dev/null

view this post on Zulip starseeker (Sep 11 2020 at 18:35):

I get that both in and out of CTest, but if I make the "NUL" file that gets created read-only CTest will abort after trying to write to it and failing, but running outside of CTest benchmark still proceeds.

view this post on Zulip Sean (Sep 11 2020 at 18:35):

put in something more explicit like "echo ls -la /dev/null" before the test and

view this post on Zulip Sean (Sep 11 2020 at 18:35):

those were backticked

view this post on Zulip Sean (Sep 11 2020 at 18:36):

could just put ls -la /dev/null to make sure it's not running in some kind of environment, also "which test" and see if /bin/test -f /dev/null and if [ -f /dev/null ] behave differently...

view this post on Zulip starseeker (Sep 11 2020 at 18:37):

crw-rw-rw- 1 root root 1, 3 Sep 2 16:50 /dev/null

view this post on Zulip starseeker (Sep 11 2020 at 18:40):

/bin/test and /usr/bin/test don't seem to make a difference...

view this post on Zulip Sean (Sep 11 2020 at 18:41):

what about: if [ -f /dev/null ] ; then ...

view this post on Zulip starseeker (Sep 11 2020 at 18:41):

No change.

view this post on Zulip Sean (Sep 11 2020 at 18:42):

and this is linux you said??

view this post on Zulip starseeker (Sep 11 2020 at 18:42):

Yes - Linux + CTest

view this post on Zulip Sean (Sep 11 2020 at 18:42):

how are you invoking ctest?

view this post on Zulip starseeker (Sep 11 2020 at 18:42):

ctest -R benchmark --verbose

view this post on Zulip starseeker (Sep 11 2020 at 18:44):

dvn=$(test -f /dev/null)
echo $dvn
echo $?

returns:

0

view this post on Zulip Sean (Sep 11 2020 at 18:44):

and silly question, but sure it's regenerating the benchmark script, picking up the changes?

view this post on Zulip Sean (Sep 11 2020 at 18:45):

this literally makes no sense...

view this post on Zulip starseeker (Sep 11 2020 at 18:45):

I'm directly editing the copy in bin/benchmark, and not re-running CMake (not silly question, I made that mistake when I first started trying to fix)

view this post on Zulip Sean (Sep 11 2020 at 18:46):

with ctest --verbose, does it show the actual shell command it's invoking?

view this post on Zulip starseeker (Sep 11 2020 at 18:46):

/usr/bin/sh "/home/cyapp/RELEASE/build2/bin/benchmark" "run" "TIMEFRAME=1"

view this post on Zulip starseeker (Sep 11 2020 at 18:47):

But if I run that straight up, it does start running even with the NUL file read only (all the printouts seem to be the same though...)

view this post on Zulip Sean (Sep 11 2020 at 18:48):

is there a trace flag or something that can give more info?

view this post on Zulip starseeker (Sep 11 2020 at 18:51):

-x to the shell script, but that's not helping much...

view this post on Zulip starseeker (Sep 11 2020 at 18:52):

+ ls -la /dev/null
+ echo crw-rw-rw- 1 root root 1, 3 Sep 2 16:50 /dev/null
crw-rw-rw- 1 root root 1, 3 Sep 2 16:50 /dev/null
+ test -f /dev/null
+ dvn=
+ echo

+ echo 0
0
+ [ -f /dev/null ]
+ echo 1
1
+ echo no /dev/null
no /dev/null
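
One reading of this trace, which the thread itself does not spell out: the `ls` line starts with `c`, so /dev/null is a character device, and `test -f` is true only for regular files. A small C sketch of the same distinction, using `stat()` and the standard `S_ISREG`/`S_ISCHR` macros (the `toy_*` names are made up):

```c
#include <sys/stat.h>

/* Hypothetical classification of a filesystem path. */
enum toy_kind { TOY_MISSING, TOY_REGULAR, TOY_CHAR_DEVICE, TOY_OTHER };

enum toy_kind
toy_classify(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return TOY_MISSING;        /* roughly: `test -e` fails */
    if (S_ISREG(sb.st_mode))
        return TOY_REGULAR;        /* what `test -f` checks for */
    if (S_ISCHR(sb.st_mode))
        return TOY_CHAR_DEVICE;    /* what `test -c` checks for */
    return TOY_OTHER;
}
```

On a typical Linux system `toy_classify("/dev/null")` reports a character device, so a script probing for /dev/null would want `test -c` or `test -e` rather than `test -f`. The `0` printed earlier in the trace is the exit status of the preceding `echo`, not of `test`.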

view this post on Zulip Sean (Sep 11 2020 at 18:57):

Interesting, I didn't notice before, but I'm seeing the behavior here in ctest too

view this post on Zulip Sean (Sep 11 2020 at 18:57):

this is beyond messed up...

view this post on Zulip Sean (Sep 11 2020 at 19:03):

oh crap, right, huh.

view this post on Zulip Sean (Sep 11 2020 at 19:03):

my mistake all along.

view this post on Zulip starseeker (Sep 11 2020 at 19:04):

While we're at it - should I go ahead and pull in the gcv changes?

view this post on Zulip Sean (Sep 11 2020 at 19:04):

fixed

view this post on Zulip Sean (Sep 11 2020 at 19:06):

no, I'd hold up on them. lots of potential for brokenness.

view this post on Zulip starseeker (Sep 11 2020 at 19:06):

/me nods - agreed. So RELEASE isn't quite equal to trunk, I misspoke earlier

view this post on Zulip Sean (Sep 11 2020 at 19:07):

I didn't look in detail, but at a glance appeared to just wildcard it

view this post on Zulip starseeker (Sep 11 2020 at 19:08):

The gcv changes? Well, the intent was to explicitly set the context, instead of assuming MODEL.

view this post on Zulip starseeker (Sep 11 2020 at 19:08):

Essentially remove the implicit assumption, and check extensions and the like against more types than just the model types.

view this post on Zulip Sean (Sep 11 2020 at 19:08):

right, but a plugin could be multiple / any contexts

view this post on Zulip starseeker (Sep 11 2020 at 19:09):

Right. In which case, it sets the generalized match, the gcv program will pass its inputs to the plugin regardless, and it's up to the plugin to figure it out.

view this post on Zulip Sean (Sep 11 2020 at 19:10):

which then begs what are the contexts for, just complicates api and callers if most of our tools will want to allow multiple types

view this post on Zulip starseeker (Sep 11 2020 at 19:11):

I'm not sure they will - for example, wouldn't icv want to accept just image types?

view this post on Zulip Sean (Sep 11 2020 at 19:11):

already ran into a problem with sumanga's plugin where he had to force PNG mime type because some other plugin was happily parsing it as pix data

view this post on Zulip Sean (Sep 11 2020 at 19:12):

no, probably not .. same reasons it's a problem in gcv. someone might want to import image data from a spreadsheet or some random application/vendor proprietary type

view this post on Zulip starseeker (Sep 11 2020 at 19:12):

That's going to place a massive input validation burden on each individual plugin though - that feels wrong.

view this post on Zulip Sean (Sep 11 2020 at 19:13):

that would be wrong, I agree, and it's worse than that

view this post on Zulip Sean (Sep 11 2020 at 19:14):

pix reader plugin isn't going to have any basis for denying input really

view this post on Zulip Sean (Sep 11 2020 at 19:14):

for example

view this post on Zulip starseeker (Sep 11 2020 at 19:15):

Right. That's I think the original motivation for the contexts - there's no guarantee that file extensions are unique mappings to types, so we may need the context to disambiguate if (for example) an image extension and a geometry extension are the same.

view this post on Zulip Sean (Sep 11 2020 at 19:15):

it's up to icv/gcv to somehow prioritize unless overridden based on file suffix and/or signature or some other measure .. a callback can't/shouldn't know about other formats

view this post on Zulip Sean (Sep 11 2020 at 19:16):

that's a case of clear ambiguity, that's fine

view this post on Zulip Sean (Sep 11 2020 at 19:16):

that has to be called out, it's fine if that case has to be explicit

view this post on Zulip starseeker (Sep 11 2020 at 19:17):

How do we allow the user to specify the resolution?

view this post on Zulip Sean (Sep 11 2020 at 19:18):

before even getting to options, there's a question of whether the file is even getting routed to the right plugin. maybe a scoring system?

view this post on Zulip starseeker (Sep 11 2020 at 19:18):

I don't quite follow about the callback - the callbacks can't really be format specific. GDAL is currently the canonical example, but assimp will be similar.

view this post on Zulip Sean (Sep 11 2020 at 19:19):

what do you mean the callbacks can't be format specific?

view this post on Zulip Sean (Sep 11 2020 at 19:19):

they are plugin specific

view this post on Zulip Sean (Sep 11 2020 at 19:19):

a plugin certainly could be specific to a format

view this post on Zulip Sean (Sep 11 2020 at 19:20):

and their callback would be specific to that format...

view this post on Zulip starseeker (Sep 11 2020 at 19:20):

Right, but plugins in general aren't necessarily format specific. Hence the "wildcard" specifier for types when hooking in a plugin

view this post on Zulip starseeker (Sep 11 2020 at 19:20):

(if that plugin does support multiple types - otherwise you specify the one it does support)

view this post on Zulip starseeker (Sep 11 2020 at 19:21):

Unless you're looking for some way for a plugin to register itself for multiple formats - that'd be a completely different architecture than what we've got now...

view this post on Zulip Sean (Sep 11 2020 at 19:21):

sure, but that's a different statement than "can't" .. they can and are .. and they might not be, and sometimes aren't

view this post on Zulip Sean (Sep 11 2020 at 19:22):

if they're all wildcard, then there wouldn't really be much point in typing at all and the api complexity it entails

view this post on Zulip starseeker (Sep 11 2020 at 19:23):

Sure, but the majority aren't general and if we can limit the types of data those limited ones get fed it bounds the data validation problem somewhat.

view this post on Zulip Sean (Sep 11 2020 at 19:24):

sure but it doesn't solve the problem :)

view this post on Zulip Sean (Sep 11 2020 at 19:26):

i mean png-to-vol is a good example. if we marked the mime type as auto, then gdal proceeded to grab the input and crash-n-burned on it. if marked as png, then the new plugin got to try.

view this post on Zulip Sean (Sep 11 2020 at 19:26):

new plugin was made to use icv, so in theory it handles any icv type now, but can't set a type that gets it routed to it without it being specific

view this post on Zulip starseeker (Sep 11 2020 at 19:27):

My off-the-cuff thought there was to set a generic type and an image context - that way the plugin would get a crack at anything gcv identifies as image data.

view this post on Zulip starseeker (Sep 11 2020 at 19:28):

The png only version would get a PNG image type instead of the generic type and the image context, which means it would only be invoked for PNG image data.

view this post on Zulip Sean (Sep 11 2020 at 19:29):

I mean in fairness, this is probably a case where there's a genuine conflict -- both gdal and an icv reader can read images

view this post on Zulip Sean (Sep 11 2020 at 19:29):

so needs some way to force which and/or fall back to another when the one chosen fails and there were alternatives
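
A toy illustration of "force which and/or fall back", not based on the actual gcv plugin API. The `toy_plugin` struct, the reader callbacks and `toy_dispatch` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader callback: returns 0 on success, nonzero on failure. */
typedef int (*toy_read_fn)(const char *path);

struct toy_plugin {
    const char *name;
    toy_read_fn read;
};

/* Two stand-in readers so the sketch can be exercised. */
int toy_reader_fails(const char *path) { (void)path; return 1; }
int toy_reader_ok(const char *path)    { (void)path; return 0; }

/* If 'forced' is non-NULL, only the plugin with that name is tried.
 * Otherwise each plugin is tried in order until one succeeds.
 * Returns the index of the plugin that read the file, or -1 if none did. */
int
toy_dispatch(const struct toy_plugin *plugins, size_t n,
             const char *forced, const char *path)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (forced && strcmp(plugins[i].name, forced) != 0)
            continue;              /* caller asked for a specific plugin */
        if (plugins[i].read(path) == 0)
            return (int)i;         /* first success wins */
    }
    return -1;
}
```

Here the order of the table is the only priority; a reader that crashes instead of returning an error would still defeat the fallback, so this only helps with plugins that fail cleanly.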

view this post on Zulip starseeker (Sep 11 2020 at 19:30):

/me nods - I don't think we've worked out that part of the command option set yet - this may be the first case that's popped up that actually results in it tripping on the issue.

view this post on Zulip Sean (Sep 11 2020 at 19:30):

in this case, we didn't actually want a png-only version -- want it to be any icv format

view this post on Zulip starseeker (Sep 11 2020 at 19:32):

Right, but that's an interesting point. If we did have a PNG specific version and the generic icv version both present, it would be a reasonable inference that the format specific one is in there because it could do a better job on that specific format than the generic alternative. (Otherwise, why have both in the first place?)

That breaks down when we get to more complex things like geometry where it's too complex to guarantee one of anything is better than another, of course, but the inference will likely still be drawn.

view this post on Zulip starseeker (Sep 11 2020 at 19:35):

I don't really have any good answers - the existing gcv app was implemented with the mindset of replacing all of the -g and g- tools (which didn't impose any of the multiple plugin issues of the more general solution.) So it's probably got quite a few limitations of that sort...

view this post on Zulip Sean (Sep 11 2020 at 19:35):

yeah, I don't think we want to build assumptions like that into the system. we either need to find a way to let plugins encode/grade/characterize themselves or let it get resolved explicitly by the caller

view this post on Zulip Sean (Sep 11 2020 at 19:36):

just because one plugin reads multiple formats (e.g., assimp) and another one format (e.g., stl) doesn't mean either is better at it and likely isn't the case

view this post on Zulip starseeker (Sep 11 2020 at 19:37):

I know, but a naive user looking for solutions will see they have an stl file, see "stl" in the listed plugins, and almost certainly jump to that one. If we're lucky, they may think to look for other options if that one fails, but even money they won't.

view this post on Zulip Sean (Sep 11 2020 at 19:37):

if it were the same author, then I could see the case for inferring "png" plugin trumps "*" plugin when presented with a png file, but that's not an assumption we can make

view this post on Zulip Sean (Sep 11 2020 at 19:38):

starseeker said:

I know, but a naive user looking for solutions will see they have an stl file, see "stl" in the listed plugins, and almost certainly jump to that one. If we're lucky, they may think to look for other options if that one fails, but even money they won't.

sure and that behavior doesn't present a problem... let them explore

view this post on Zulip Sean (Sep 11 2020 at 19:38):

they can infer and assume whatever they like

view this post on Zulip Sean (Sep 11 2020 at 19:38):

it's whether the code is inferring and assuming anything that we have control over

view this post on Zulip Sean (Sep 11 2020 at 19:40):

I'm suggesting the code should not, for example, "prefer" sending a .png file to a particular plugin because its name has "png" in it or its type was set to "image/png" whereas the icv and gdal and other plugins declared multiple image support

view this post on Zulip starseeker (Sep 11 2020 at 19:41):

But it will need to prefer something, unless we want to scatter-gun multiple conversion attempts in parallel automatically and try to detect the "best" result somehow...

view this post on Zulip Sean (Sep 11 2020 at 19:41):

i.e., specific type shouldn't intrinsically trump a wildcard type; wildcard should just be like a globbed expansion of types (and maybe we need to make it be something like that)
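A minimal sketch of the "wildcard as globbed expansion" idea, assuming a hypothetical plugin record with one declared type string. The names sketch_plugin, sketch_type_matches, and sketch_candidates are illustrative only and are not part of libgcv. A wildcard entry matches exactly as if it had listed each concrete type, and candidates come back in registration order with no preference given to the more specific declaration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin record; not the real libgcv structure. */
struct sketch_plugin {
    const char *name;
    const char *type;   /* exact MIME type, or a major type followed by a star */
};

/* Return 1 when the declared type covers the requested MIME type. */
static int sketch_type_matches(const char *declared, const char *mime)
{
    size_t n = strlen(declared);

    /* A trailing slash-star is treated as an expansion over the major type. */
    if (n >= 2 && declared[n - 2] == '/' && declared[n - 1] == '*')
        return strncmp(declared, mime, n - 1) == 0;

    return strcmp(declared, mime) == 0;
}

/* Collect every plugin that can handle mime, in registration order.
 * Specific and wildcard declarations are treated identically. */
size_t sketch_candidates(const struct sketch_plugin *plugins, size_t nplugins,
                         const char *mime,
                         const struct sketch_plugin **out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < nplugins && found < max_out; i++) {
        if (sketch_type_matches(plugins[i].type, mime))
            out[found++] = &plugins[i];
    }
    return found;
}
```

With more than one candidate returned, the choice would still have to come from somewhere else, such as an explicit caller option or a fallback loop; the sketch only shows that matching need not imply ranking.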

view this post on Zulip starseeker (Sep 11 2020 at 19:42):

We can establish a plugin ranking for plugins we are providing, but if users start hooking in their own 3rd party plugins how will that interact?

view this post on Zulip Sean (Sep 11 2020 at 19:42):

it could need to pick something, but not necessarily "prefer"

view this post on Zulip starseeker (Sep 11 2020 at 19:42):

Oh, blast it:

Files present after distclean in /home/user/RELEASE-build/distcheck-autodetect_release/build:
bench/NUL

view this post on Zulip Sean (Sep 11 2020 at 19:43):

yeah, I don't see how ranking would work (and not be totally gameable)

view this post on Zulip Sean (Sep 11 2020 at 19:43):

starseeker said:

Oh, blast it:

Files present after distclean in /home/user/RELEASE-build/distcheck-autodetect_release/build:
bench/NUL

that's certainly from earlier testing.

view this post on Zulip starseeker (Sep 11 2020 at 19:43):

/me double checks to make sure he's got the test -e update from trunk

view this post on Zulip Sean (Sep 11 2020 at 19:44):

I don't get a NUL any more

view this post on Zulip starseeker (Sep 11 2020 at 19:44):

Doesn't look like it - let me make absolutely sure and start clean again...

view this post on Zulip Sean (Sep 11 2020 at 19:45):

kind of funny that it seemed to work writing out a NUL file.. it almost certainly wasn't fully behaving and just appeared to work

view this post on Zulip Sean (Sep 11 2020 at 19:45):

make sure you don't still have a customized installed script

view this post on Zulip Sean (Sep 11 2020 at 19:46):

with debug printings and such

view this post on Zulip starseeker (Sep 11 2020 at 19:46):

/me nods - clean SVN release branch, updated, with the trunk fix pulled in.

view this post on Zulip Sean (Sep 11 2020 at 19:47):

it all makes sense now. I got my flags mixed up forgetting that /dev/null is a file in the eyes of the filesystem, but not in the eyes of the test command where it only distinguishes "regular" files

view this post on Zulip Sean (Sep 11 2020 at 19:49):

note, test -f /dev/null did not return 0 for you. looking back, your snippet printed the return code of running the 'echo' command. ;-)
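The distinction can be sketched in C with stat(): "test -e" only asks whether the path exists, while "test -f" also requires a regular file, and /dev/null is a character device. The function names below are hypothetical, and note that they return 1 for "true" whereas the shell test command signals "true" with exit status 0.

```c
#include <sys/stat.h>

/* Roughly what "test -e" asks: does the path exist at all? */
int sketch_exists(const char *path)
{
    struct stat sb;

    return stat(path, &sb) == 0;
}

/* Roughly what "test -f" asks: does it exist and is it a regular file? */
int sketch_is_regular(const char *path)
{
    struct stat sb;

    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode);
}
```

On a typical POSIX system, sketch_exists("/dev/null") is true while sketch_is_regular("/dev/null") is false, which matches "test -f /dev/null" returning 1.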

view this post on Zulip starseeker (Sep 11 2020 at 19:49):

OK, works in a clean toplevel test... kicking off distcheck-full again

view this post on Zulip Sean (Sep 11 2020 at 19:49):

it returns 1

view this post on Zulip starseeker (Sep 11 2020 at 19:50):

Ah. Figures. I still can't believe I'm enabling you to keep shell around by making it work on Windows... ;-)

view this post on Zulip starseeker (Sep 11 2020 at 19:51):

It's even worse than regex expressions

view this post on Zulip starseeker (Sep 11 2020 at 19:52):

OpenBSD behaved itself, by the way.

view this post on Zulip Sean (Sep 11 2020 at 19:53):

I'd still love to create a proper geometry shell, gash or whatever for navigating a geometry filesystem using libged commands and shell intrinsics

view this post on Zulip Sean (Sep 11 2020 at 19:53):

then we could have some seriously (more) powerful scripting constructs for creating and manipulating geometry

view this post on Zulip Sean (Sep 11 2020 at 19:56):

we're so close

view this post on Zulip starseeker (Sep 11 2020 at 19:57):

gsh is a start - still just a raw argc/argv interface right now, and I'm not happy with the subprocess callback, but it's heading in that direction.

view this post on Zulip Sean (Sep 11 2020 at 19:59):

gsh is cool, but that's completely on the other end of the spectrum.

view this post on Zulip starseeker (Sep 11 2020 at 19:59):

Oh, just so I know where to look - are you planning to update the NEWS file in trunk, or RELEASE?

view this post on Zulip Sean (Sep 11 2020 at 20:00):

i'm talking about taking something like zsh or bash and replacing all the libc I/O calls with libg I/O calls

view this post on Zulip starseeker (Sep 11 2020 at 20:01):

Ah - you mean a complete port of the shell environment, not just some shell interaction constructs on top of the argc/argv calls

view this post on Zulip Sean (Sep 11 2020 at 20:02):

having a full "geometry shell" environment including notion of current working dir, most of the built-in commands like cd, pwd, ls, stat, etc, plus command-line editing, command history, key bindings, terminal interfacing, etc

view this post on Zulip Sean (Sep 11 2020 at 20:02):

right, full deal

view this post on Zulip Sean (Sep 11 2020 at 20:02):

we're literally less than a gsoc-project away from that being realizable

view this post on Zulip Sean (Sep 11 2020 at 20:03):

that's why I started putting in the libbu/librt dir.h API as that's the foundation of the shell

view this post on Zulip Sean (Sep 11 2020 at 20:05):

I considered for a while how we could actually use shells unmodified, but that would require some pretty heavy FUSE integration work. when I last looked, it felt like creating a full FUSE filesystem driver was going to be quite a bit more work than porting and customizing a shell.

view this post on Zulip Sean (Sep 11 2020 at 20:05):

plus there are some legacy shell constructs we probably don't want to keep

view this post on Zulip starseeker (Sep 11 2020 at 20:09):

/me nods. How would terminal interaction work with Windows though? Only native sh port I'm familiar with other than the Git thing is the Windows zsh port from about 10 years ago... zsh-nt or some such.

view this post on Zulip Sean (Sep 11 2020 at 20:11):

terminal is a separate issue from the shell. can display a shell in any text environment, even dumb ones like a Qt text widget (just a lot won't display and behave right)

view this post on Zulip Sean (Sep 11 2020 at 20:12):

terminal would likely be something like what git-bash is using (i.e., a customized msys)

view this post on Zulip starseeker (Sep 11 2020 at 20:15):

Phew! There we go - one of the distcheck tests passed clean. Must have had a stale file somewhere in the old build.

view this post on Zulip starseeker (Sep 11 2020 at 22:02):

@Sean distcheck full succeeded

view this post on Zulip starseeker (Sep 12 2020 at 04:43):

(Initial success was on Ubuntu - also checking CentOS and GhostBSD)

view this post on Zulip starseeker (Sep 12 2020 at 17:59):

distcheck-full passed on GhostBSD except for what looks like another manifestation of the BSD threading issue.

view this post on Zulip starseeker (Sep 12 2020 at 22:55):

CentOS 8 distcheck-full passed.

view this post on Zulip Sean (Sep 22 2020 at 21:35):

@starseeker you are off by one throughout r77191 .. strl*() all take the size of the allocated buffer. they ensure the last byte is nul, so you don't want to drop the +1 that was added for nul.

view this post on Zulip starseeker (Sep 23 2020 at 02:18):

@Sean - did I follow correctly? (r77193)

view this post on Zulip Sean (Sep 23 2020 at 02:18):

nope, wrong way

view this post on Zulip Sean (Sep 23 2020 at 02:18):

now off by 2

view this post on Zulip Sean (Sep 23 2020 at 02:18):

it's the size of the allocation

view this post on Zulip Sean (Sep 23 2020 at 02:18):

that's what strl wants

view this post on Zulip Sean (Sep 23 2020 at 02:19):

so find the buffer if it's static or the malloc/calloc size if it's dynamic

view this post on Zulip Sean (Sep 23 2020 at 02:19):

that's the size to put... they made it simple

view this post on Zulip starseeker (Sep 23 2020 at 02:42):

so it's:

char str[COUNT];
func(..., str, COUNT);

You're saying inside func, it would be bu_strlcat(str, in_str, COUNT+1) to correctly copy in_str into str?

view this post on Zulip Sean (Sep 23 2020 at 02:42):

heh, no...

view this post on Zulip Sean (Sep 23 2020 at 02:43):

str[COUNT] ... strlcat(str, str2, COUNT)
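A small sketch of the size convention being described. Because strlcat is not available in every C library, the block carries its own stand-in, sketch_strlcat, written to follow the usual BSD semantics; it is an assumption for illustration and is not bu_strlcat. The third argument is always the full size of the destination: sizeof for a static array, or the exact malloc size (cnt+1 if that is what was allocated) for a dynamic one.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for strlcat: append src to dst, where dstsize is the FULL size
 * of the dst buffer. The result is NUL-terminated whenever dst already held
 * a terminated string. Returns the length the result would have had
 * without truncation. */
size_t sketch_strlcat(char *dst, const char *src, size_t dstsize)
{
    size_t dlen = 0;
    size_t slen = strlen(src);

    /* Find the end of dst, but never look past the buffer. */
    while (dlen < dstsize && dst[dlen] != '\0')
        dlen++;

    /* No terminator inside the buffer: nothing can be appended safely. */
    if (dlen == dstsize)
        return dstsize + slen;

    size_t room = dstsize - dlen - 1;          /* one byte reserved for NUL */
    size_t ncopy = slen < room ? slen : room;

    memcpy(dst + dlen, src, ncopy);
    dst[dlen + ncopy] = '\0';
    return dlen + slen;
}

enum { COUNT = 8 };

/* Usage: pass the size of the allocation, not size - 1 and not size + 1.
 * Returns nonzero when the appended text did not fit. */
int sketch_demo_truncates(void)
{
    char str[COUNT] = "abc";
    size_t want = sketch_strlcat(str, "defghij", sizeof(str));

    return want >= sizeof(str);
}
```

Passing COUNT-1 wastes a byte, and passing COUNT+1 lets the terminator land one byte past the array, which is why the allocation size is the only value that is right in both the static and dynamic cases.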

view this post on Zulip starseeker (Sep 23 2020 at 02:43):

But... isn't that what I had in 77191?

view this post on Zulip Sean (Sep 23 2020 at 02:44):

I saw str = malloc(cnt+1, ...) ... strlcat(str, .., cnt)

view this post on Zulip starseeker (Sep 23 2020 at 02:45):

They may have both...

view this post on Zulip starseeker (Sep 23 2020 at 02:46):

So looking at get_style_tag_for_cell...

view this post on Zulip starseeker (Sep 23 2020 at 02:47):

fort.c:2397 it looks like it's getting cell_style_tag, which is TEXT_STYLE_TAG_MAX_SIZE

view this post on Zulip starseeker (Sep 23 2020 at 02:51):

There was F_MALLOC((sz + 1) that was in the logic that all got replaced with a call to bu_strdup in 77191....

view this post on Zulip Sean (Sep 23 2020 at 02:54):

yeah, that's probably the sz+1 I saw replaced with a strlcat sz, but then the definition of sz changed too

view this post on Zulip starseeker (Sep 23 2020 at 02:57):

As far as I can tell, it's these 4 functions that are calling the strl logic:

get_style_tag_for_cell
get_reset_style_tag_for_cell
get_style_tag_for_content
get_reset_style_tag_for_content

view this post on Zulip starseeker (Sep 23 2020 at 02:59):

They're all, as far as I can tell, writing into static buffers defined with constants passed to sz. The only more complex case is line 4469.

view this post on Zulip starseeker (Sep 23 2020 at 02:59):

And I think that's right?

view this post on Zulip starseeker (Sep 23 2020 at 03:01):

(I reverted 77193, btw)

view this post on Zulip Sean (Sep 23 2020 at 03:02):

4469 looks right

view this post on Zulip Sean (Sep 23 2020 at 03:02):

yeah, I should have read the file instead of the patch, greater context

view this post on Zulip starseeker (Sep 23 2020 at 03:06):

Wonder if they'd take a strlcat patch upstream? As far as I know strlcpy is still non-standard...

view this post on Zulip starseeker (Sep 23 2020 at 03:06):

Although we might be able to use strncpy there, actually...

view this post on Zulip Sean (Sep 23 2020 at 03:23):

can always use strn .. it's just much more error prone, especially when cat'ing into existing buffers like the 4469 case where it's appending
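To illustrate why the strn route is more error prone when appending: strncat's count is the maximum number of characters to append, not the size of the destination, and it writes a terminating NUL on top of that. The caller therefore has to compute the remaining room by hand. The helper name below is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Append src to dst using standard strncat.
 * dstsize is the full size of the dst buffer.
 * Returns 1 if src was truncated, 0 if it fit, -1 if dst is already full
 * or unterminated within dstsize as far as strlen can tell. */
int sketch_append_strncat(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 > dstsize)
        return -1;

    /* Room left for new characters, excluding the NUL strncat adds. */
    size_t room = dstsize - used - 1;

    strncat(dst, src, room);
    return strlen(src) > room;
}
```

Forgetting either the "- used" or the "- 1" compiles cleanly and overflows the buffer, which is the kind of slip the strl interface was designed to rule out.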

view this post on Zulip starseeker (Sep 23 2020 at 03:25):

Did glibc ever add strlcpy? I know they were adamantly opposed to it for a long time...

view this post on Zulip starseeker (Sep 23 2020 at 03:29):

Oh well, no biggie - easy to adjust for regress once the pattern is clear.

view this post on Zulip starseeker (Sep 23 2020 at 03:29):

Now just need to make sure it works on Windows...

view this post on Zulip starseeker (Sep 23 2020 at 03:33):

Ah, there we go.

view this post on Zulip Sean (Sep 23 2020 at 03:35):

https://sourceware.org/glibc/wiki/strlcpy

view this post on Zulip starseeker (Sep 23 2020 at 03:36):

sigh. figures

view this post on Zulip Sean (Sep 23 2020 at 03:36):

looks like there was a second patch 6 years ago that's getting better reception but still not integrated / released

view this post on Zulip starseeker (Sep 23 2020 at 03:36):

bu_strlcpy it is

view this post on Zulip starseeker (Sep 23 2020 at 03:36):

I see MSVCRT isn't on board either

view this post on Zulip Sean (Oct 21 2020 at 16:47):

@starseeker seeing a weird ged plugin issue. the bigdb gtools test is failing on mac. It works when called directly in the build dir (e.g., src/gtools/tests/bigdb 1), but fails when run via ctest with:

    Start 874: slow-bigdb_1gb

874: Test command: /Users/morrison/brlcad.trunk/.build/src/gtools/tests/bigdb "1"
874: Test timeout computed to be: 1500
874: bigdb_tops="unknown command: tops"
874: bigdb_idlen=43
874: bigdb_szlen=1073741825
874: bigdb_lenmatch=0
874: bigdb_strmatch=0
1/1 Test #874: slow-bigdb_1gb ...................***Failed    8.97 sec

view this post on Zulip Sean (Oct 21 2020 at 16:48):

I can only imagine this has something to do with the plugin system, but don't have time to debug it at the moment.

view this post on Zulip Sean (Oct 21 2020 at 16:51):

all the rtwiz tests have also been out for a while with tclcad init failures, assumed you know about those but maybe not? they report "can't find package cadwidgets::RtImage"
just fyi, haven't tried a rebuild today if you made a recent change as I'm in the middle of debugging session.

view this post on Zulip starseeker (Oct 21 2020 at 22:13):

I think r77523 will take care of the bigdb error. Wasn't aware of rtwiz failures - they're not part of the standard tests because they can't be safely run in parallel. I'll see if I can run it to ground.

view this post on Zulip starseeker (Oct 21 2020 at 23:22):

How are you launching the rtwizard tests? ninja regress-rtwizard just succeeded on Linux...

view this post on Zulip starseeker (Oct 21 2020 at 23:32):

Ah, fails on BSD

view this post on Zulip starseeker (Oct 21 2020 at 23:32):

OK...

view this post on Zulip starseeker (Oct 21 2020 at 23:45):

Oh - it's a bundled vs system tcl issue, I'll bet...

view this post on Zulip starseeker (Oct 21 2020 at 23:52):

@Sean I think r77525 should have it.

view this post on Zulip starseeker (Oct 21 2020 at 23:52):

just got a successful regress-rtwizard on bz

view this post on Zulip Sean (Nov 24 2020 at 06:55):

@starseeker I don't know how long it's been an issue, but I'm seeing a pretty big behavior change on trunk with rt blocking until the framebuffer window is closed. This is on a default build on Mac, affecting both classic and non-classic runtime.

Anyone else seeing similar? (just run "rt" in windows on some geometry)

view this post on Zulip starseeker (Nov 24 2020 at 14:58):

You mean launching rt from within MGED? Not seeing that on Linux...

view this post on Zulip Sean (Nov 24 2020 at 15:00):

Yes, it's blocking for some reason.

view this post on Zulip starseeker (Nov 24 2020 at 15:00):

Might be the r77436-77454 libtclcad changes to fix an issue with Windows.

view this post on Zulip starseeker (Dec 17 2020 at 23:53):

I finally got a test Mac graphical setup. Can confirm the MGED prompt ignoring input while rt is processing, although it seems to be a bit more subtle than just the framebuffer being up - if I wait long enough, it will resume taking input with the fb lingering.

It's almost as if rt and MGED are sharing channels. Definitely something to do with the rt callback handler rework

view this post on Zulip starseeker (Dec 17 2020 at 23:54):

Very surprised this is unique to the Mac... wonder what's different?

view this post on Zulip starseeker (Dec 18 2020 at 00:07):

the bad news is right at this second I have absolutely no idea how to fix it...

view this post on Zulip starseeker (Dec 18 2020 at 00:26):

r77435 isn't any better, so it's an older issue than those changes...

view this post on Zulip starseeker (Dec 18 2020 at 00:41):

r77073 has the same problem...

view this post on Zulip starseeker (Dec 18 2020 at 01:18):

r76199 works

view this post on Zulip starseeker (Dec 18 2020 at 01:29):

r76226 works

view this post on Zulip starseeker (Dec 18 2020 at 01:38):

r76621 works

view this post on Zulip starseeker (Dec 18 2020 at 01:46):

r76654 works

view this post on Zulip starseeker (Dec 18 2020 at 01:57):

r76670 works

view this post on Zulip starseeker (Dec 18 2020 at 02:05):

r76903 does not work

view this post on Zulip starseeker (Dec 18 2020 at 02:11):

r76800 works

view this post on Zulip starseeker (Dec 18 2020 at 03:02):

r76850 does not work

view this post on Zulip starseeker (Dec 18 2020 at 03:08):

r76825 does not work

view this post on Zulip starseeker (Dec 18 2020 at 03:25):

r76824 is the one that broke the GUI MGED interactivity

view this post on Zulip starseeker (Dec 18 2020 at 04:41):

@Sean Can you see if r77998 fixes your issue on the Mac?

view this post on Zulip Sean (Dec 18 2020 at 04:52):

Sure, will do

view this post on Zulip Sean (Dec 18 2020 at 06:09):

Confirmed, appears to fix the issue.

view this post on Zulip Sean (Dec 18 2020 at 06:09):

nice work

view this post on Zulip Sean (Dec 18 2020 at 06:18):

so what was the problem?

view this post on Zulip starseeker (Dec 18 2020 at 13:22):

I set up a handler on STDOUT, trying to fix a crash on Windows. It avoided the crash on Windows, but it was the wrong way to do it. It looks in the end like a copy-paste error setting up the original logic meant I was deleting a structure before stderr was actually clear (checked stdout twice instead of stdout+stderr), which (surprise) messed up the application. A stdout handler avoided the crash all right, but messed with a channel the callbacks shouldn't have been hooked up to.
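The copy-paste slip described above can be sketched as follows. This is not the actual libtclcad code; the struct and function names are hypothetical and only model the "checked stdout twice instead of stdout+stderr" condition that guards the cleanup.

```c
/* Hypothetical per-subprocess bookkeeping: one flag per output channel. */
struct sketch_proc_state {
    int stdout_done;   /* nonzero once stdout has been fully drained */
    int stderr_done;   /* nonzero once stderr has been fully drained */
};

/* Buggy form: the second operand was meant to be stderr_done, so cleanup
 * is allowed while stderr may still have pending data. */
int sketch_can_free_buggy(const struct sketch_proc_state *s)
{
    return s->stdout_done && s->stdout_done;
}

/* Intended form: free the state only after both channels are clear. */
int sketch_can_free(const struct sketch_proc_state *s)
{
    return s->stdout_done && s->stderr_done;
}
```

With stdout finished and stderr still open, the buggy check permits the free while the intended check does not, which is consistent with a structure being deleted before stderr was actually clear.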

view this post on Zulip starseeker (Dec 18 2020 at 13:22):

It might explain a bit of quirky I/O behavior that's been reported on Windows too, although it's hard to know that for sure.

view this post on Zulip starseeker (Dec 18 2020 at 13:23):

In retrospect I'm surprised it only showed in the Mac GUI configuration. Tcl/Tk has a rather irritating habit of "almost" working and masking problems, sometimes...

view this post on Zulip starseeker (Dec 18 2020 at 13:25):

I'll have to double check everything is working now across all platforms, but if that's got it the last significant blocker I know of now is the gqa plot file problem. That was a user report, so I'll need to run it down before releasing.

view this post on Zulip starseeker (Dec 18 2020 at 13:27):

@Sean I shifted a fair number of things down in the "do before release" queue - let me know if any of the ones I moved are "need to haves".

view this post on Zulip Sean (Dec 18 2020 at 18:14):

Okay will do.

view this post on Zulip Sean (Dec 18 2020 at 18:15):

The only concerning thing I've noticed is that I encountered mged command line corruption again recently. I need to fully rebuild clean to confirm as it's tricky to reproduce, but it's the same behavior we had a year ago when there was a bug in static initialization.

view this post on Zulip Sean (Dec 18 2020 at 18:17):

Becomes apparent when jumping around the mged command line in console mode, cutting to end of line, pasting, navigating with arrow keys, up arrow. Resulted in unpredictable but obvious command line corruption.

view this post on Zulip starseeker (Dec 18 2020 at 18:17):

Hmm. Not ringing a bell - it was caused by static initialization error?

view this post on Zulip Sean (Dec 18 2020 at 18:21):

Yes, fortunately/unfortunately the bug was found before release and never made public, but that means there's no NEWS breadcrumb on what the exact cause was, not that it matters here. I seem to recall the prior being bu init related and I don't think that code has changed. It's probably just some other corruption.

view this post on Zulip Sean (Dec 18 2020 at 18:22):

But like I said, I need to confirm with a clean build before throwing up red flags. Just a yellow caution for now. :)

view this post on Zulip Jeffrey Liu (Feb 16 2021 at 02:58):

Was having some errors compiling libbu because there was an unresolved external symbol BU_SEM_DATETIME. Seems like in bu_init.cpp, there is extern "C" int BU_SEM_DATETIME , but BU_SEM_DATETIME is defined in datetime.cpp which is a C++ file. Removing the "C" part fixed the error for me.
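One common way to keep a declaration and its definition in agreement on linkage is to put the declaration in a shared header wrapped in the usual __cplusplus guard and include that header from both files. The sketch below collapses the header and the defining file into one block, and SKETCH_SEM_DATETIME is a hypothetical stand-in rather than the real libbu symbol. When a C++ file defines a variable without ever seeing the extern "C" declaration, the definition gets C++ linkage, and on toolchains that decorate C++ variable names the C-linkage reference may then go unresolved.

```c
/* Shared header portion: every file that uses the symbol sees the same
 * linkage. In C the guard vanishes; in C++ it requests C linkage. */
#ifdef __cplusplus
extern "C" {
#endif

extern int SKETCH_SEM_DATETIME;   /* hypothetical stand-in symbol */

#ifdef __cplusplus
}
#endif

/* Defining file portion: because the declaration above is visible here,
 * the definition keeps the same linkage even when compiled as C++. */
int SKETCH_SEM_DATETIME = 0;

/* Small accessor so the symbol is actually referenced. */
int sketch_get_sem_datetime(void)
{
    return SKETCH_SEM_DATETIME;
}
```

Dropping the "C" from the declaring file, as reported above, is the other way to make the two sides agree; which is preferable depends on whether any C code needs to reference the symbol.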

view this post on Zulip Jeffrey Liu (Feb 16 2021 at 03:05):

^ would changing something like this cause issues outside of Windows?

view this post on Zulip starseeker (Feb 16 2021 at 03:34):

Possibly - what if we do r78245 instead? Does that work?

view this post on Zulip Jeffrey Liu (Feb 16 2021 at 03:38):

I'm still compiling r78244 right now, but I'll let you know.


Last updated: Jan 09 2025 at 00:46 UTC