IRC log for #brlcad on 20110422

00:01.27 brlcad gprof excels at function-level analysis
00:01.42 starseeker growl... for whatever reason, that's not working on the mac either
00:01.50 starseeker either in isolation or as part of the build
00:01.58 starseeker to linux...
00:02.35 brlcad TODO: fix profiling option :)
00:03.05 brlcad (presuming you're using gprof right)
00:03.28 starseeker yep, but I'm assuming it's my mac's fault until proven otherwise
00:04.26 brlcad unlikely, it's not like shark
00:04.55 brlcad it's not cpu metrics, it injects code around all your functions
00:05.14 brlcad so if gcc works, it should work
00:05.26 starseeker gprof works on inux
00:05.29 starseeker linux even
00:07.03 brlcad ok, whatever works
00:09.25 starseeker must... fix... mac...
00:10.47 starseeker yeah, ok - my uvpoints.cpp file is wasting most of its time with strings
00:10.53 starseeker no wonder
00:19.57 brlcad (fwiw, "using std namespace" is evil)
00:20.10 brlcad er, using namespace std;
00:20.22 starseeker I know - it's just a test program, not intended for production in its current form
00:20.31 brlcad ah, interesting
00:20.52 starseeker scratchpad for experimenting with uv coordinates, memory pools, etc. without the complexity of the SurfaceTree
00:20.55 brlcad perhaps a less on references, pointers, and std:: data types :)
00:21.06 brlcad damn, can't type tonight
00:21.15 brlcad lesson
00:21.18 starseeker winces - yeah, I'm quite sure I'm using it all wrong
00:21.29 brlcad you're passing std::string in and out of functions
00:21.41 brlcad that makes a copy of the string every function call
00:21.47 brlcad and every return
00:22.00 brlcad malloc-style
00:22.29 brlcad you want to either pass around std::string*'s or std::string&'s depending on the use
00:22.53 brlcad depends who creates the string and how it's used
00:23.37 brlcad safest is to pass a pointer, but best is usually a ref .. but then you have to be more careful on how it's accessed and stored
00:24.57 starseeker nods - any good tutorials?
00:25.08 brlcad e.g. UVKey::UVKey() is fine taking a std::string newkey (so it'll make a copy on construction), but UVKey::getKey() should return a std::string&
00:31.52 CIA-105 BRL-CAD: 03starseeker * r44486 10/brlcad/trunk/src/librt/uvpoints.cpp: Try returning a reference
00:31.55 starseeker brlcad: like that?
00:32.34 brlcad yeah, that's better
00:33.17 brlcad one thing to remember, that change now changes the contract for UVKey -- it's no longer just a string, it's UVKey's string
00:33.37 starseeker nods - that was the idea originally, actually
00:33.43 brlcad so if you delete UVKey, any references to a std::string from getKey() would be dangling references (invalid)
00:33.47 starseeker yow, string comparison is brutal
00:35.26 brlcad yeah, that was my next question .. what's up with string keys??
00:35.33 brlcad numeric ftw
00:35.41 starseeker they're UV coordinates
00:36.40 brlcad so?
00:36.50 starseeker I don't recall the original motivator - I think it was to make it easier to mash together two pairs into one unique descriptive string
00:36.52 brlcad yeah, AppendKeys() .. nfg
00:37.25 brlcad ints_to_key looks like some sort of string-based hash
00:37.51 brlcad just store them into a u,v array, O(1) lookups
00:38.01 starseeker ah, right - 0100 and 0001 don't convert to different integers
00:38.34 starseeker and those strings are the keys for a minimal perfect hash
00:39.01 brlcad perfectly inefficient maybe :)
00:39.07 starseeker probably I should stash them as numbers and only build the strings when I need to
00:39.57 starseeker the original effort there was to find unique identifiers for all the UV points that might get evaluated into 3-space during a subdivision of depth N
00:40.15 brlcad either way, even as a perfect hash, shouldn't have a sprintf() call in there for generating a hash (heck, shouldn't need to manually hash yourself anyways)
00:40.17 starseeker because there are a lot of shared points, I wanted a good way to avoid duplicate evaluations
00:41.13 brlcad the std:: container hashing is pretty frickin good
00:41.40 starseeker the simplest way to avoid duplicates was to have a way for any subdivision, at any depth up until the maximum depth, be able to look up a given UV coordinate to see if it had been evaluated
00:42.32 brlcad you probably want a std::map
00:43.09 brlcad heck, maybe even a simple std::map<u, std::map<v, value> >
00:43.23 starseeker how fast are those lookups?
00:43.33 brlcad guaranteed to be at least logN iirc
00:43.57 starseeker in theory, if I understand correctly, a minimal perfect hash is O(1)
00:44.50 starseeker and as a tree deepens, we can build up a LOT of points and do a LOT of lookups
00:46.06 starseeker so a minimal perfect hash of a unique string descriptor of the UV coordinates, which is possible because we know the candidate sites that might be evaluated in advance, allows us to generate such a hash up front
00:46.38 brlcad if it's O(1) lookup and O(insane) to generate the hash, you've kinda missed the boat
00:46.56 brlcad that's still just the minimum guarantee
00:47.02 starseeker except the hash only has to be generated at compile time
00:47.13 brlcad you mean prep?
00:47.18 starseeker no, compile
00:47.22 starseeker source code compile
00:47.38 brlcad er, there's no such thing going on in your code there now
00:47.43 starseeker right
00:47.56 starseeker because it was early days when I had to shift to other stuff
00:48.14 starseeker but it DOES generate the keys.txt file, which contains the full list of unique keys
00:48.31 brlcad if there is a limited set of hash sites, why not just direct index into a set then?
00:49.25 starseeker all I know when I'm subdividing is the UV coordinates of the point I'm at - I have to map that to some index value
00:49.37 brlcad i'd still try the std::map first -- dead simple implementation, easy to maintain, and really good performance behavior guarantees
00:50.42 brlcad so the idea sounds fine, but it shouldn't be based on or involve strings
00:51.47 starseeker uh... the string is key for hashing - apparently hashing numbers usually doesn't work out well
00:52.11 starseeker my problem is I'm storing the info as a string, which is wrong
00:52.12 brlcad http://www.sgi.com/tech/stl/hash_set.html
00:52.42 brlcad again, configurable key type, even configurable hash function
00:53.26 brlcad a hash set of hash maps perhaps for u,v to value mapping
00:53.29 brlcad http://www.sgi.com/tech/stl/hash_map.html
00:53.43 brlcad or hash_map of hash_maps
00:59.07 brlcad the tradeoff with a std::map<> is that std::hash_map<> will be faster (O(1) on average O(N) worst case even with perfect hashing, have to account for collisions) but it's not in any particular order if you need to iterate or search
00:59.48 brlcad so find *this* UV is fast.. finding your closest neighbor will be relatively expensive
01:00.25 starseeker nods - so far there hasn't been a need for nearest UV neighbor (famous last words...)
01:00.56 starseeker brlcad: let me see if I can take strings out of what I'm doing now, at least up until keys.txt is generated
01:01.04 starseeker that works in the right direction regardless
01:01.52 starseeker once I'm doing that right, we can tangle with the various hashing/mapping options
01:08.04 starseeker the minimal perfect hash appeals to me because with this situation we don't NEED to have any collisions, but I suppose I could be full of... err... converted dog food too
01:14.28 *** join/#brlcad KimK (~Kim__@wlnt-02-246.dsl.netins.net)
02:31.04 *** join/#brlcad IriX64 (~kvirc@bas2-sudbury98-1128565712.dsl.bell.ca)
03:22.05 *** join/#brlcad juanman (~quassel@unaffiliated/juanman)
04:29.39 starseeker brlcad wins - will investigate stl map tomorrow, then look into hash_map if that's not enough
04:29.55 starseeker braces for another C++ learning curve
05:06.15 brlcad it's important enough to warrant a simple performance comparison
06:14.42 brlcad starseeker: here's a code snippet I just whipped up for you to play with
06:15.35 brlcad shows how to use map, hash map, and shows the comparative performance of each
06:41.24 brlcad http://brlcad.org/tmp/hasher.cxx
06:42.42 brlcad just for kicks, I stubbed in some code sort of showing worst case and best case perfect hashing .. though the devil is truly in the details for all of them since a simple datatype change or iteration change can completely skew the comparison
06:44.12 brlcad only played two use-case scenarios, setting and reading .. not iterating over all, nearest neighbor, deletions, etc
06:45.30 brlcad also using the non-bounds-checked [] operator instead of insert(), find(), and other functions
06:46.55 brlcad I'd suggest looking at a real worst case trace, see how many insertions/lookups/deletions and what the tree looks like -- shallowest, average, deepest
06:47.10 brlcad then mapping that into a test program simulating the same behavior
06:50.51 brlcad should hopefully be a little more clear how/why I think map or unordered_map will probably be more than "good enough" given how much simpler they are to run with .. and if not, looking into exactly why
07:33.21 *** join/#brlcad Stattrav (~Stattrav@117.202.22.183)
07:33.21 *** join/#brlcad Stattrav (~Stattrav@unaffiliated/stattrav)
08:10.31 *** join/#brlcad crazy_imp (~mj@a89-182-215-216.net-htp.de)
08:28.56 *** join/#brlcad merzo (~merzo@193.254.217.44)
08:30.38 *** join/#brlcad b0ef (~b0ef@226.27.202.84.customer.cdi.no)
09:38.38 *** join/#brlcad KimK (~Kim__@wlnt-02-246.dsl.netins.net)
11:35.09 ``Erik *readreadread* obsessing over a perfect hash may be self defeating... also; O() is just one of the bits to consider, there are 5 categories of asymptotic notation. qsort is a poor performer on O(), but awesome in real life (omega is tight, k is low, etc)
11:40.48 ``Erik hm, no system hash funcs in your demo, brlcad? md5? sha1? :)
12:45.34 starseeker brlcad: thanks for the demo - that will be a big help
12:48.28 starseeker hmm... u+v won't work, isn't unique 0 + 1 = 1 + 0... probably need something like u * (some power of 10) + v
12:48.52 starseeker that's a detail though
12:49.20 starseeker rummages around for some energy and gets moving
13:13.00 ``Erik collisions may be acceptable... ergo; quadratic (http://en.wikipedia.org/wiki/Quadratic_probing) or chained (each hash bin is a linked list)
13:17.15 *** join/#brlcad KimK (~Kim__@wlnt-02-246.dsl.netins.net)
13:33.11 brlcad starseeker: it'll "work" .. it's just a really really bad hash :)
13:33.15 brlcad lots of collisions
13:33.31 brlcad more showing the full range of impact it can have
13:34.06 brlcad with the right hash, you approach the best case instant update time of an array but worst case can be an order worse than a simple map
13:34.54 brlcad all the more motivation to just stick with the algo that makes most sense
13:36.35 brlcad the data is a simple mapping of uv keys to a 3dpoint, so the immediate thought that comes to mind is std::map<std::pair<int, int>, vect_t>
13:37.56 brlcad (do not use ON_3dpoint .. it's got lots of junk, you can set the vect_t direct on read: vect_t v = VINIT_ZERO; ON_3dpoint p = v; ON_3dpoint *pp = new ON_3dpoint(v);
13:38.25 brlcad they added operator support for getting the value from a double*
13:46.10 brlcad good exercise for the reader to attempt adding std::map<std::pair<int, int>, vect_t> and using proper std::map::iterator instead of using the [] operator .. you'll get a lot of insight
13:57.11 brlcad then switch it to [] and compare :)
14:31.14 *** join/#brlcad dli (~dli@67.55.46.44)
14:46.25 starseeker brlcad: the map is good if we need things like nearest-neighbor?
14:48.17 starseeker wonders if triangulation might just want that information, even if we don't need it for raytracing
15:25.06 ``Erik r-tree is the shizzle for nearest neighbor
15:25.12 *** join/#brlcad Stattrav (~Stattrav@117.202.18.42)
15:25.12 *** join/#brlcad Stattrav (~Stattrav@unaffiliated/stattrav)
15:25.41 ``Erik but data repacking should never take more than nlgn
15:25.59 ``Erik poor cats, still reeling from the bathing this morning
15:43.01 starseeker ``Erik: this should appeal to your bit-level side - can I take a floating point value and map it to an int simply?
15:43.38 starseeker std::map is slow with floats, but for storage the important thing is to have an easy, unique identifer
15:44.15 starseeker if the UV floating point coordinates can be mapped to integers...
15:45.30 starseeker I hadn't properly considered what wanting to split first on the knot points implies - it pretty much trashes my subdivision gridding coordinate scheme
15:46.34 starseeker might be able to get away with just being aware of which leaves contain knots, but more general solution of being able to pick any given UV point, subdivide using it, and using the std::map or something like it all the while sounds nice
15:53.03 *** part/#brlcad willdye (~willdye@198.183.6.23)
16:27.29 *** join/#brlcad juanman (~quassel@unaffiliated/juanman)
17:35.13 *** join/#brlcad Stattrav (~Stattrav@117.202.18.42)
17:35.13 *** join/#brlcad Stattrav (~Stattrav@unaffiliated/stattrav)
17:47.37 starseeker hmm... http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
18:45.43 kanzure did commercial CAD packages ever take more than a few seconds to do a boolean operation on two solids?
19:08.49 starseeker kanzure: I doubt it...
19:12.44 kanzure hrm. weird. i'm reading some papers from the 90s where these programmers are cheering because they can merge a few meshes "IN ALMOST EIGHT SECONDS!!"
19:13.03 kanzure on some sort of sgi onyx machine (200 mhz.. etc.)
19:17.49 bhinesley I'm off to the beach for the weekend. Have a good weekend, everyone.
19:25.24 *** join/#brlcad juanman (~quassel@unaffiliated/juanman)
19:36.48 kanzure wow "CATIA uses polyhedral approximations for intersection and boundary computations"
19:38.36 *** join/#brlcad juanman (~quassel@unaffiliated/juanman)
19:44.36 kanzure "We then use the trapezoidation of the trimming polygon (using Seidel's algorithm) to perform logarithmic time point location queries.
19:44.39 kanzure "However, the accuracy of this result depends on the magnitude of errors introduced by approximating the high degree trimming curve through the boundary of the Gaudl invariants."
19:44.43 kanzure ok i give up on this paper
19:45.42 starseeker kanzure: which paper?
19:45.53 kanzure one of krishnan's
19:46.14 kanzure i'll use his other one where he uses matrix math for constructing the intersection curve
20:39.27 starseeker brlcad: looks like the pair iteration is slightly slower, I'm assuming due to the make_pair overhead: http://brlcad.org/~starseeker/map_test.cxx
20:43.37 starseeker I can't get the iterator to work thus far
20:56.05 starseeker ah, there we go - I THINK that works... anyway, if I do full assignment instead of using drand48 the count returned seems to match up
21:00.53 starseeker hmm, interesting - iterator is faster if I replace drand48 with i in the insertion loops
21:01.03 *** join/#brlcad juanman (~quassel@unaffiliated/juanman)
22:26.36 ``Erik hm, you could always do floor() or ceiling()... could also grab a bit mask ( ((uint320t)f) >> 16) )
22:26.52 ``Erik s/0/_/
22:27.51 starseeker ``Erik: actually, even the worst case floating point performance may not be a showstopper - plus, it looks like to get performance benefit the data type in question needs to be small anyway
22:29.22 starseeker suspects, based on conversations concerning STL libraries, that they probably are doing whatever the sanest thing is for floats under the hood
22:30.25 ``Erik the guys who did stl are pretty good... don't try to outsmart them, the key is to not use their stuff wrong...
22:31.06 starseeker nods
22:31.24 ``Erik at the moment, we assume a full fpu... I'm sure BRL-CAD on my arm7 would suck arse
22:31.40 starseeker heh
22:32.02 ``Erik http://brlcad.org/~erik/20110422/ they are not happy :)
22:32.31 starseeker hehe
23:04.18 brlcad starseeker: excellent that you got the pair working along with iterators
23:04.28 brlcad that was more the point than actual performance
23:05.07 brlcad make_pair() is probably injecting a tiny overhead, you'd probably just call the pair<> constructor directly where you can set values by const reference instead of by value
23:06.18 brlcad the other thing you should try is printing the pair values through the iterator
23:06.43 brlcad instead of just incrementing counter, try to use the iterator
23:22.45 ``Erik if I groked correctly, iterator classes are insanely efficient once invoked... a machine optimizable deref, iirc
23:23.31 ``Erik same level of machine nerdiness as a ^= b ^= a ^= b

Generated by irclog2html.pl Modified by Tim Riker to work with infobot.