Stream: brlcad

Topic: conversion


view this post on Zulip Sean (Jul 15 2024 at 23:06):

Interesting, some preliminary results coming in from testing main vs 7.36.0 on Mac.

Overall, main is definitely succeeding more. There are some that time out after 5min and some that throw mtx errors and "SHOULD NOT HAPPEN" errors for both old and new at a seemingly similar rate.

Looks like conversion/facetize is also taking about 2x longer with the new approach. Most simple objects that succeed seem to do so in about 3s whereas they're 1.5s on average in 7.36.0. Running the conversion across lots of files, that of course adds up but then is partially offset by slightly more of the old runs hitting the 5min timeout limit (I think, still have to verify).

view this post on Zulip starseeker (Jul 16 2024 at 11:01):

For the fully successful run, I had MAXTIME=5000 as my upper limit - 300s was definitely much too short, so the timeouts are no surprise.

view this post on Zulip starseeker (Jul 16 2024 at 11:02):

I noticed the simple cases being slower - I think, at least in my case, what appeared to be happening was the overhead of starting up the subprocess to process a single primitive was adding the extra time (which would explain the 1.5s vs 3s difference - two process startup/teardowns vs 1 for 7.36.0 - the actual primitive facetize in such cases should be virtually instantaneous.)
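A rough way to get a feel for per-process startup/teardown cost is to time a child process that does nothing. This is only a sketch: it spawns the Python interpreter as a stand-in child (an assumption for illustration, not the actual facetize subprocess), so the number it prints won't carry over directly, and `spawn_cost` is a made-up helper name.

```python
import subprocess
import sys
import time


def spawn_cost(n=5):
    """Average wall-clock seconds to start and tear down a do-nothing child process."""
    start = time.perf_counter()
    for _ in range(n):
        # The child exits immediately, so nearly all the time is startup/teardown.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / n


print(f"~{spawn_cost():.3f}s per process startup/teardown")
```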

view this post on Zulip starseeker (Jul 16 2024 at 11:04):

For the SHOULD NOT HAPPEN errors, one possible approach would be to get the newer mmesh version of that logic working: https://github.com/BRL-CAD/mmesh

view this post on Zulip starseeker (Jul 16 2024 at 11:05):

When I took a quick look it's not quite a 1-1 drop in replacement for the gct calls, so I could use a little help getting it figured out and set up.

view this post on Zulip Sean (Jul 28 2024 at 23:57):

Still don't yet have the delta, but here's full run results I got on Linux w/ main:

Summary
=======
Converted: 96.9%  ( 9923 of 10245 objects, 40 files )
   Passed: 9923   ( 9974 NMG 10231 BoT 10116 Brep )
   Failed: 303   ( 243 NMG 12 BoT 126 Brep )
  Timeout:  19   ( 28 NMG 2 BoT 3 Brep )
 NMG rate: 97.4%  ( 9974 of 10245 )
 BoT rate: 99.9%  ( 10231 of 10245 )
Brep rate: 98.7%  ( 10116 of 10245 )
Prim rate: 99.7%  ( 7029 of 7050 )
 Reg rate: 95.1%  ( 2316 of 2436 )
  Elapsed: 61697.0 seconds
  Average: 6.0 seconds per object
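
As a sanity check on the summary, the rates and the average can be recomputed from the raw counts. A minimal sketch: the counts are copied from the summary above, and the `rate` helper is just an illustrative name, not part of the actual test script.

```python
# Counts copied from the summary above.
TOTAL = 10245
ELAPSED_S = 61697.0
passed = {"NMG": 9974, "BoT": 10231, "Brep": 10116}
failed = {"NMG": 243, "BoT": 12, "Brep": 126}
timeout = {"NMG": 28, "BoT": 2, "Brep": 3}


def rate(ok, total):
    """Success percentage rounded to one decimal, as in the summary."""
    return round(100.0 * ok / total, 1)


for fmt in passed:
    # Every object lands in exactly one bucket per format.
    assert passed[fmt] + failed[fmt] + timeout[fmt] == TOTAL
    print(f"{fmt:>5} rate: {rate(passed[fmt], TOTAL)}%")

print(f"Average: {ELAPSED_S / TOTAL:.1f} seconds per object")
print(f"Elapsed: {ELAPSED_S / 3600:.1f} hours")
```

The per-format pass, fail, and timeout counts each sum to the 10245 total, so failures and timeouts are already reported as separate buckets.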

view this post on Zulip starseeker (Jul 29 2024 at 12:30):

That's a surprisingly high failure rate for BoT - the input test set is our sample models?

view this post on Zulip Sean (Jul 29 2024 at 16:36):

Yep, it's a straight up run on db/, no additions.

view this post on Zulip Sean (Jul 29 2024 at 16:37):

0.1% is a high failure rate?

view this post on Zulip starseeker (Jul 29 2024 at 16:37):

On my local machine, at least with a timeout of 5000, I had a completely clean run

view this post on Zulip Sean (Jul 29 2024 at 16:37):

I was more surprised that BoT succeeds more than NMG -- in theory could go bot-to-nmg on those failure cases to get both in sync.

view this post on Zulip starseeker (Jul 29 2024 at 16:38):

NMG has to use the old NMG boolean - that'll fail more often

view this post on Zulip starseeker (Jul 29 2024 at 16:38):

Manifold is BoT only

view this post on Zulip Sean (Jul 29 2024 at 16:39):

NMG has to result in an NMG...

view this post on Zulip Sean (Jul 29 2024 at 16:39):

so if we had a bot-to-nmg(), which is pretty straightforward, they could be in sync, more success all around.

view this post on Zulip starseeker (Jul 29 2024 at 16:39):

I mean, if you're willing to ditch the fancy polygons for NMG triangles I guess - but doesn't that defeat the point of NMG in the first place?

view this post on Zulip Sean (Jul 29 2024 at 16:40):

nmg would then still give nice quad-mesh results for some things, but still give some mesh where it'd otherwise fail.

view this post on Zulip Sean (Jul 29 2024 at 16:40):

No no, that's why I said when nmg fails

view this post on Zulip starseeker (Jul 29 2024 at 16:40):

Oh, gotcha

view this post on Zulip Sean (Jul 29 2024 at 16:40):

a fail is useless for everyone
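
The fallback idea discussed above could be sketched roughly like this. Everything here is hypothetical: `nmg_boolean`, `bot_boolean`, and `bot_to_nmg` are placeholder callables passed in by the caller, not existing BRL-CAD functions, and the sketch only shows the control flow (native NMG first, BoT plus conversion only when that fails).

```python
def facetize_to_nmg(obj, nmg_boolean, bot_boolean, bot_to_nmg):
    """Try the native NMG path first; use BoT + conversion only on failure.

    All three callables are hypothetical stand-ins supplied by the caller.
    Returns (result, method) so a summary can report which path was used.
    """
    try:
        return nmg_boolean(obj), "nmg"
    except Exception:
        # Triangles only, but still an NMG result rather than a failure.
        return bot_to_nmg(bot_boolean(obj)), "bot-to-nmg"


# Usage with stub callables standing in for the real evaluators:
def failing_nmg(obj):
    raise RuntimeError("NMG boolean failed")


result, method = facetize_to_nmg(
    "example.r", failing_nmg, lambda o: f"bot({o})", lambda b: f"nmg({b})"
)
print(result, method)
```

Returning the method alongside the result keeps the two cases distinguishable, so quad-mesh NMG output and triangle-only fallback output aren't conflated in the stats.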

view this post on Zulip starseeker (Jul 29 2024 at 16:41):

True - I suppose the only time you'd want to see it would be a dev trying to use the Manifold techniques with NMG data types

view this post on Zulip starseeker (Jul 29 2024 at 16:44):

I'm really curious what primitives out-and-out failed. Timeout I can see - the fallback methods can be expensive - but failure...

view this post on Zulip Sean (Jul 29 2024 at 16:46):

starseeker said:

On my local machine, at least with a timeout of 5000, I had a completely clean run

That could be the difference. I used the default 5min limit. 50min is nuts imho... :) That said, good point about whether it succeeds at all vs cannot.

I'll re-run with a higher limit, but definitely a UX argument to be made given how simple all the sample models are. As an outsider, I would expect them all to be sub-minute or something is "wrong". Even 5min (again, per object) seems pretty generous.

view this post on Zulip Sean (Jul 29 2024 at 16:47):

Also, I didn't dive into the log yet to see if they failed because of timeout. So will have to check that.

view this post on Zulip Sean (Jul 29 2024 at 16:48):

I will have to get it running in parallel before doing that -- it took a super long time to get through everything as it is.

view this post on Zulip starseeker (Jul 29 2024 at 16:48):

plate mode to vol and brep point sampling are the two worst offenders

view this post on Zulip starseeker (Jul 29 2024 at 16:49):

Generic twin booleans are nothing to sneeze at, even if you pre-convert the plate modes

view this post on Zulip Sean (Jul 29 2024 at 16:49):

17 hours to run everything...

view this post on Zulip Sean (Jul 29 2024 at 16:49):

(with a 5min timeout!)

view this post on Zulip starseeker (Jul 29 2024 at 16:50):

my numbers are in the facetize thread, IIRC

view this post on Zulip Sean (Jul 29 2024 at 16:50):

I'm kind of assuming that real models are going to take 10-100x for full conversions.

view this post on Zulip starseeker (Jul 29 2024 at 16:52):

<nod> Not claiming we're "done" in any way - all the fallback method uses except DSP are an indication of a problem in our primitive conversions that should be fixed.

view this post on Zulip starseeker (Jul 29 2024 at 16:53):

Given where we have been historically, I was willing to take any sort of "working" I could get to start with. Frankly I didn't expect them all to succeed even with the long timeout, I was rather surprised when they did

view this post on Zulip Sean (Jul 29 2024 at 16:53):

That's not a dig or push, just talking out loud my thoughts on implications

view this post on Zulip starseeker (Jul 29 2024 at 16:54):

If your run did have all timeouts, we need to fix the summary - what you posted above made it look like some timeouts and some failures

view this post on Zulip starseeker (Jul 29 2024 at 16:55):

I kinda buy the fallback methods being quirky depending on the environment, since the point sampling is pseudorandom

view this post on Zulip Sean (Jul 29 2024 at 16:55):

Immediate goal is still to get a conversion trajectory over time, but hit a load of build issues that I'm working through. Would be nice to track the finish line progress on %conversion, time, and conversion methods, dashboard it up onto a graph. Then throw more models into the mix.

view this post on Zulip Sean (Jul 29 2024 at 16:57):

starseeker said:

If your run did have all timeouts, we need to fix the summary - what you posted above made it look like some timeouts and some failures

Can change it, but the pragmatic issue is picking a line in the sand that is "too much" no matter the reason. If a single object conversion took 10 days, I would kind of say it doesn't matter -- that's a fail for all intents.

view this post on Zulip Sean (Jul 29 2024 at 16:59):

Original line was 5min as a general rule that a full model would potentially have two orders more, which would be approximately a full day for a full model to convert. Something that doesn't complete overnight is a difficult proposition.

view this post on Zulip Sean (Jul 29 2024 at 17:00):

Bumping to an hour per object increases that by an order of magnitude, so over a week to convert... potentially useful to know where we are algorithmically, but definitely long enough to give anyone pause.

view this post on Zulip Sean (Jul 29 2024 at 17:00):

(on a real model)

view this post on Zulip Sean (Jul 29 2024 at 17:01):

Still, point taken that it can+should include the second number (#timeouts) in the summary just so it doesn't conflate the two. That's definitely important for our own purposes.
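
One way to keep the two numbers apart is to classify each run as passed, failed, or timed out at the point where the time limit is enforced. A minimal sketch, assuming each object is converted by a separate command; `classify` is a hypothetical helper, and the demo commands just spawn the Python interpreter rather than an actual conversion.

```python
import subprocess
import sys


def classify(cmd, limit_s):
    """Run cmd with a wall-clock limit and report passed / failed / timeout."""
    try:
        result = subprocess.run(cmd, timeout=limit_s, capture_output=True)
    except subprocess.TimeoutExpired:
        # The child is killed when the limit is hit; count it separately.
        return "timeout"
    return "passed" if result.returncode == 0 else "failed"


# Stand-in commands for illustration only:
print(classify([sys.executable, "-c", "pass"], 10))
print(classify([sys.executable, "-c", "raise SystemExit(1)"], 10))
print(classify([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```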

view this post on Zulip Sean (Jul 29 2024 at 21:51):

Sean said:

Still, point taken that it can+should include the second number (#timeouts) in the summary just so it doesn't conflate the two. That's definitely important for our own purposes.

and I'm blind... it already calls out timeouts. There were only 2 bot timeouts. The rest were actual failures.

view this post on Zulip Sean (Jul 29 2024 at 21:54):

So it does affect the success percentages, but not terribly so for bot. It should probably list the timeout and other details (like the version identifier) in the summary for sure.

view this post on Zulip starseeker (Jul 29 2024 at 23:13):

The failures are what surprise me - I didn't see those here

view this post on Zulip Sean (Jul 30 2024 at 03:33):

Not terribly surprising if there's some random perturbations involved. That'd make it non-deterministic.

view this post on Zulip Sean (Jul 30 2024 at 04:20):

Even if there's somehow no randomness involved (which would be a little surprising), floating point fuzz can definitely still do it. Floating point issues often only show up across platforms or compilation setting changes. To be expected unless that was pretty exhaustively tested for specifically.
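
As a small illustration of the floating point point: addition isn't associative, so anything that changes evaluation order (a different compiler, optimization flags, or math library) can change the low bits of a result. This is a generic sketch, not taken from the conversion code.

```python
import math

a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one evaluation order
right = a + (b + c)  # same math, different order

print(left == right)              # exact comparison differs
print(math.isclose(left, right))  # tolerance-based comparison agrees
print(left, right)
```

An exact comparison or a tight threshold in a geometric test can flip on a difference that small, which is enough for one platform to pass where another fails.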


Last updated: Oct 09 2024 at 00:44 UTC