irclog2html for #brlcad on 20060928

01:00.10 *** join/#brlcad DTRemenak (n=DTRemena@adsl-68-126-0-210.dsl.irvnca.pacbell.net)
03:01.12 *** join/#brlcad digitalfredy (n=digitalf@200.71.62.161)
03:09.40 Maloeran That 1.7 million triangles frigate really kills the raytracing performance, with all its diagonal ropes through the scene. Very stressful test for a raytracer... I would be interested in knowing how my 200mb of RAM use on this compares with ADRT
03:10.44 Maloeran Or 400mb if I push the quality ( and performance ) high
03:19.04 CIA-9 BRL-CAD: brlcad * brlcad/sh/ (footer.sh header.sh): add support for C++ and Objective-C/C++ to the mix
04:04.52 *** join/#brlcad digitalfredy (n=digitalf@200.71.62.161)
04:05.19 *** join/#brlcad dan_falck (n=danfalck@pool-71-111-76-8.ptldor.dsl-w.verizon.net)
04:19.38 *** join/#brlcad IriX64 (n=Who@bas3-sudbury98-1168052970.dsl.bell.ca)
04:54.19 *** join/#brlcad DTRemenak (n=DTRemena@adsl-68-126-0-210.dsl.irvnca.pacbell.net)
05:46.36 *** join/#brlcad clock_ (i=clock@84-72-60-185.dclient.hispeed.ch)
07:20.50 *** join/#brlcad clock_ (n=clock@zux221-122-143.adsl.green.ch)
11:20.58 *** join/#brlcad rossberg (n=rossberg@bz.bzflag.bz)
11:54.20 CIA-9 BRL-CAD: d_rossberg * brlcad/BUGS: fixed rendering toyjeep.g on Windows bug (on 7/6/2006) by using a less rigorous function to invert a 4x4 matrix in rt_bend_pipe_prep
12:35.38 *** join/#brlcad Twingy (n=justin@74.92.144.217)
13:04.20 Maloeran Does anyone have a recommendation for the best reference for doxygen comments in the BRL-CAD code?
13:05.14 Maloeran I noticed Lee working on libuu's doxygen documentation, though I'm not sure where that libuu is. Not much comes out on find
13:05.56 Maloeran Ah, or perhaps it was libbu
14:01.31 Maloeran Eh, Doxygen is confused about GCC's __attribute__()
14:22.54 ``Erik O.o
14:24.44 Maloeran Feeling any better, Erik?
14:34.32 ``Erik not much, heh
14:35.00 Maloeran :/ Did you go through a x-ray scan just to make sure?
14:38.24 ``Erik yeah, several xrays and a catscan
14:38.50 ``Erik btw, I think I may have an idea on why your code doesn't run so hot on g4/g5 ... gcc 4.0.0
14:39.20 Maloeran Oh hum, that's a possibility. The assembly looked very poor, as little as I know that arch
14:39.49 Maloeran The demo now loads the 1.7 million triangles frigate with caching, if you want
14:40.19 ``Erik yeah, been building for a few minutes
14:40.23 ``Erik it segfaults on my amd64
14:40.35 ``Erik #0 0x0000000801758c88 in stepComputeValue (step=0x522030) at ../../../RF/prepmodel.c:701
14:40.35 ``Erik 701 step->linkcost[RF_EDGE_MAXZ] = WALK_LINKCOUNT_COST( step->linkcount[ RF_EDGE_MAXZ ] );
14:40.44 Maloeran Hum. Okay
14:41.34 Maloeran I seriously need to speed up that prep eventually, it does a decent job but isn't fast at it
14:42.29 Maloeran Could you p step->linkcount[ RF_EDGE_MAXZ ] on that segfault? It's rather curious
14:44.33 Maloeran Even with low preparation quality, the 'prep' can eat up to 500mb ; if it takes minutes, I think you are swapping...
14:50.23 brlcad src/lib*, there's a list of what each of the various libs do in HACKING and src/README
14:51.11 Maloeran I was more looking for the best reference for the desired doxygen comment style, rather than a specific library
14:54.48 ``Erik it's consuming one whole cpu, 508.19m real, 719.54m virtual, and has been going for 22 minutes
14:54.55 Maloeran Thanks Erik, bug reproducible if I fill all malloc'ed memory with garbage
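The reproduction trick mentioned here, filling freshly malloc'ed memory with garbage, can be sketched as a small debug wrapper. This is only an illustration: `debug_malloc` and `DEBUG_FILL_BYTE` are hypothetical names and are not taken from the RF sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debug wrapper (not from the RF sources): behaves like
 * malloc() but fills the new block with a recognizable non-zero byte, so
 * code that reads memory before initializing it misbehaves the same way
 * on every platform instead of only where the heap happens to be dirty. */
#define DEBUG_FILL_BYTE 0xA5

void *debug_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        memset(p, DEBUG_FILL_BYTE, size);
    return p;
}
```

Swapping such a wrapper in for malloc() during debugging makes a read of uninitialized memory show up on systems whose fresh pages otherwise arrive zeroed.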
14:55.09 ``Erik mal: yet another linux vs restoftheworld type issue
14:55.28 Maloeran Woah, it takes less than a minute on a good Athlon
14:56.10 ``Erik I got 1.3m r/s with the m1 a couple days ago
14:56.27 ``Erik I'm wondering if maybe it's caught in an infinite loop due to different rounding behaviors or something
14:56.33 Maloeran I got 2.5-3.0m on my desktop, but the frigate is much more demanding
14:56.55 Maloeran That shouldn't happen, then again, I might have missed something in this new prep written from scratch
14:56.58 ``Erik oh, and *HUGE* stalls on some ops, heh
14:57.09 ``Erik but I think it's a compiler problem more than anything else :/
14:57.20 ``Erik and stupid darwinports won't compile gcc42
14:58.01 Maloeran Yes, the dotproduct4 assembly code was loading all the values just before working on them, instead of scheduling a bit
14:58.30 ``Erik hm, 'real' memory dropped a bit and is creeping back up
14:58.33 ``Erik it must still be doing SOMETHING
14:58.36 ``Erik uh
14:58.38 Maloeran Ahah
14:58.42 ``Erik you don't do something like realloc in that prep, do you?
14:59.04 Maloeran Very rarely, but it will happen
14:59.08 ``Erik hrm
14:59.30 ``Erik it's horrendously expensive on the bsd family since phkmalloc and dmalloc work differently
14:59.32 Maloeran I realloc the table of pages for pointer directories, for sectors/steps/nodes
14:59.37 Maloeran I see.
15:00.20 ``Erik phkmalloc tries to keep things more secure from mmu smashes, so it tries to force memory to be contiguous on the wire, which means a realloc is an ugly naive alloc/copy/dealloc instead of dmalloc's page mangling
15:00.33 Maloeran Gah!
15:00.44 ``Erik MOST unix has a very very slow realloc
15:01.42 ``Erik but mallocing more than you need is 'free', it won't actually hit wire until it's written to, so malloc 2g, use what you want, don't worry about it *shrug* :)
15:02.28 Maloeran Then it's swapping around happily, hence why it takes 22 minutes instead of 40 seconds
15:02.56 ``Erik swap is totally unused right now
15:03.15 Maloeran What is system doing?
15:03.21 ``Erik I d'no *shrug*
15:03.37 ``Erik you're making system calls (wrapped via libc calls, I'm sure) that are expensive
15:04.04 Maloeran There are no system calls but malloc/free/realloc in there
15:04.19 ``Erik malloc and free should be fast
15:04.21 ``Erik there it is
15:04.24 ``Erik realloc is dog slow
15:04.44 Maloeran It's really realloc? The one in mmDir* in mm.c ?
15:07.33 ``Erik hrm, in the raytrace portion, 9.6% of the time is spent on one op... "cror" (but it's stalled pretty heavy)
15:08.16 Maloeran In the dot product again? :)
15:08.54 ``Erik graphTraceDualOut line 635, the "if(dstdist<=0.0)", which looks like it has to do two sequential tests and then or the results before choosing to branch
15:09.37 ``Erik so to the machine, it looks like "if( dstdist<0.0 || dstdist==0.0 )", requiring both to get out of the pipeline, then feed back in for the or? *shrug*
15:09.52 Maloeran That's quite possible, weird chip you got
15:09.53 ``Erik vs if(!(dstdist>0.0)) which can be streamed
15:09.58 ``Erik it's risc *shrug*
15:10.12 Maloeran dstdist < 0.0 if you prefer, won't make a difference
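The two forms of the test being compared can be written side by side. This is a minimal sketch with hypothetical function names; it says nothing about how a given compiler schedules them, only about when the two expressions agree. They give the same answer for every ordinary value and differ only when the distance is NaN.

```c
#include <assert.h>
#include <math.h>   /* NAN, for exercising the edge case */

/* The test as quoted from graphTraceDualOut in the log. */
int behind_le(double dstdist)
{
    return dstdist <= 0.0;
}

/* The single-comparison rewrite suggested in the log. */
int behind_not_gt(double dstdist)
{
    return !(dstdist > 0.0);
}

/* Returns 1 when both forms give the same answer for x. */
int forms_agree(double x)
{
    return behind_le(x) == behind_not_gt(x);
}
```

Any comparison with NaN is false, so `behind_le(NAN)` is 0 while `behind_not_gt(NAN)` is 1. The rewrite is therefore only a safe substitution if the distance can never be NaN.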
15:10.32 ``Erik I'm kinda guessing based on what the little comment in shark says, heh
15:10.48 Maloeran Yes I remember
15:10.50 ``Erik 14% of compute time is on that dstdist = _mathPlanePoint(tri->plane, dst) on 634
15:11.22 ``Erik '
15:11.24 ``Erik gheh
15:12.29 Maloeran So I suppose it finished prep'in in the end. Care to profile that part?..
15:12.48 Maloeran I can't see what would take so long, as lazy as some of the code is
15:15.37 Maloeran If you do so, make sure to delete the cache or it will just load it
15:24.40 ``Erik sure, uh, I'll gzip the cache instead, heh...
15:24.55 ``Erik rtch ?
15:24.58 Maloeran Right
15:25.13 ``Erik 100 meg file, huh
15:25.55 Maloeran I was aiming for a bit packed version earlier, I'll switch back to that later
15:26.19 Maloeran ( So if you need 13 bits to identify a sector, it will use that instead of 32 bits )
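The bit-packing idea (13 bits per sector index instead of 32) could look roughly like the following. All names and the bit order are assumptions made for illustration; the actual cache format is not shown in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit-stream helpers; names and layout are illustrative and
 * not taken from the cache format discussed in the log. Values are stored
 * least-significant bit first. `buf` must be zero-initialized before the
 * first put_bits() call, and `width` must be between 1 and 32. */
void put_bits(uint8_t *buf, size_t *bitpos, uint32_t value, unsigned width)
{
    for (unsigned i = 0; i < width; i++, (*bitpos)++) {
        if ((value >> i) & 1u)
            buf[*bitpos / 8] |= (uint8_t)(1u << (*bitpos % 8));
    }
}

uint32_t get_bits(const uint8_t *buf, size_t *bitpos, unsigned width)
{
    uint32_t value = 0;
    for (unsigned i = 0; i < width; i++, (*bitpos)++) {
        if (buf[*bitpos / 8] & (1u << (*bitpos % 8)))
            value |= (uint32_t)1 << i;
    }
    return value;
}

/* Smallest field width able to hold every index in [0, count). */
unsigned bits_needed(uint32_t count)
{
    unsigned width = 1;
    while (width < 32 && ((uint32_t)1 << width) < count)
        width++;
    return width;
}
```

With 8192 sectors, `bits_needed` gives 13, so three sector indices occupy 39 bits (5 bytes) rather than 12 bytes. The per-bit loop keeps the sketch short; a real packer would likely move whole words at a time.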
15:26.46 ``Erik interesting, it starts very user based, and linearly ramps to very system based
15:27.21 Maloeran Anything more precise on what's going on in system?
15:29.49 ``Erik "shandler" sounds familiar?
15:31.03 Maloeran Hum, no?
15:32.25 ``Erik only 15.6 spent outside of mach_kernel
15:32.48 ``Erik the biggest single symbol being vm_map_enter
15:33.02 ``Erik which kinda smells like lots of small alloc's
15:33.33 ``Erik O.O holy forshizzle
15:33.56 ``Erik chunk->prev = (void *)&(mmList); is grievously expensive, if I'm reading this right
15:34.18 Maloeran But... how?
15:34.27 ``Erik stw r0,12(r3)
15:35.24 ``Erik okie, readin that wrong...
15:35.48 ``Erik of the 3% of program time, that op was the big consumer there... still less than 3% total
15:36.07 Maloeran :) I prefer that
15:53.44 ``Erik *shrug* comments and docs would allow other people to understand your stuff more readily and maybe make comments on possible concerns or bottlenecks that you'd otherwise spend a lot of time tracking
15:53.55 ``Erik especially since your environment is pretty homogeneous
15:54.31 Maloeran I wanted to try Justin's fbsd box but it only has 256mb of ram
15:55.22 ``Erik mine only has 384, heh
15:55.39 ``Erik my home one, that is
15:56.17 Maloeran I just tried profiling in gprof, and it doesn't profile anything in shared libraries :p, so I profiled my main.c
15:56.45 ``Erik you need to build profiling forms of the shared libraries
15:57.06 ``Erik uhmmm, on fbsd, you'd see like libc.so and libc_p.so where _p.so is for the profiling lib
15:57.25 ``Erik I'm too out of leenewx to remember there, heh
15:57.26 Maloeran Shared libraries were built with -pg as well, anything else?
15:58.47 Maloeran Any sensitive results out of Sharp?
16:01.50 Maloeran "Support for gprof profiling of shared libraries is available on 32-bit systems only." What the...
16:02.20 Maloeran Sorry, nevermind that, specific to HP-Unix
16:02.22 ``Erik shark? I don't think I ran it right, so I'm rerunning it :/
16:06.12 ``Erik stepSampleSort is a bit pricey
16:06.56 Maloeran Like 5% or 40%?
16:07.05 ``Erik 22.6
16:07.31 Maloeran Okay. That's one of the things I have marked to fix, I'm more wondering about the time spent on "system"
16:08.27 ``Erik sampleAddTri() is a tiny bit expensive, ...
16:09.30 Maloeran Yes... and I'm not even using these lists yet, planning ahead for improvements of the prep
16:10.28 Maloeran Can you throw all the profiling text at me?
16:12.01 ``Erik uhmmmmm, I'm running another set with different time variables
16:35.39 Maloeran So 50% is spent outside the executable itself, that's... cute ;)
16:36.39 ``Erik I d'no if that's because it's a single thread on a dual proc machine, or if it's just not seeing the frame stack correctly when it samples, or if sdl throws threads, or what
16:39.42 Maloeran The model is built before SDL is initialized, and you mentioned the system share starts growing later on
16:40.06 ``Erik hm, part of sdl is initialized before main() iirc
16:40.20 ``Erik it immediately pops up an sdl icon in the doc
16:40.23 ``Erik before the window appears
16:40.24 ``Erik dock
16:41.17 Maloeran Right I see
16:47.26 Maloeran I think I would know how to build shared libraries for gprof'iling, except that everything goes through this libtool thing
16:48.33 ``Erik yeah, I'm not terribly keen on libtool, but dynamic libraries are different on every os :/
16:49.09 ``Erik btw, I msg'd the url there because I can't msg here and I don't know how public you want that info... I'll delete it if you want
16:50.20 Maloeran Ah, nothing sensitive in there
16:53.49 ``Erik ok, thandler is the 'trap handler' and shandler is the 'syscall handler', in the mach kernel (micro, so it's handled via messages and 'servers', not function calls)
16:54.33 Maloeran Trap handler sounds like handling of page faults when running out of ram
16:54.53 Maloeran Syscall handler... Growing the heap size? 25% of the processing time? Gez.
17:06.47 ``Erik hrm, dude, I have 2g of ram and I'm only using like 200m
17:06.53 ``Erik and I never touched swap
17:07.13 ``Erik now the trap might be cache line related or something else *shrug* and it might be system wide, not just applied to your application
17:09.17 ``Erik I just ran a program to allocate a gig in 1m chunks and write crap to every page... almost no system time consumed in that (16s user, 3s sys)
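A test program of the kind described here (allocate in fixed-size chunks, write to every page) might look like the sketch below. The original program is not shown in the log, so the function name, the page-size constant, and the structure are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed page size; query sysconf(_SC_PAGESIZE) for the real value on a
 * POSIX system. */
#define PAGE_SIZE_GUESS 4096

/* Allocate `nchunks` blocks of `chunk_size` bytes and write one byte to
 * every page so the kernel must actually back each page with memory.
 * Returns the number of pages touched, or 0 if an allocation failed. */
size_t touch_chunks(size_t nchunks, size_t chunk_size)
{
    size_t touched = 0;
    char **chunks = calloc(nchunks, sizeof *chunks);
    if (chunks == NULL)
        return 0;

    for (size_t i = 0; i < nchunks; i++) {
        chunks[i] = malloc(chunk_size);
        if (chunks[i] == NULL) {
            touched = 0;
            break;
        }
        for (size_t off = 0; off < chunk_size; off += PAGE_SIZE_GUESS) {
            /* volatile keeps the compiler from discarding the store */
            ((volatile char *)chunks[i])[off] = (char)(i + off);
            touched++;
        }
    }

    for (size_t i = 0; i < nchunks; i++)
        free(chunks[i]);   /* free(NULL) is a no-op */
    free(chunks);
    return touched;
}
```

Calling `touch_chunks(1024, 1 << 20)` from a small main() and running it under `time` would approximate the one-gigabyte experiment and show how the cost splits between user and system time.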
17:09.35 ``Erik no slowdown in it, so no swap hit
17:10.23 ``Erik about 1.5g I start seeing swap hits
17:11.28 Maloeran Right. I could be mistaken, but the trap handler handles page faults and I don't see what else could be causing faults..
17:13.59 ``Erik page fault is just one kind of trap
17:16.26 ``Erik ok, in the midst of the ugly, the syscall handler is 54% and the trap handler is 21.5%,
17:16.40 ``Erik the trap that consumes most time looks to be "ml_set_interrupts_enabled"
17:17.07 ``Erik only 1% of the time is vm_fault
17:17.28 Maloeran I can't think of any other syscall being made but malloc() and friends
17:17.43 ``Erik "isync" is the big trap abuse
17:17.57 ``Erik context switches force traps and shit, too
17:19.21 ``Erik ok, isync stops new ops from entering the pipeline and waits until the pipeline is empty, "This instruction is context synchronizing"
17:19.39 ``Erik for OS memory management tasks, like changes in the mmu
17:23.22 ``Erik "large_and_huge_malloc" might be related, in mmAlloc under sampleAddTri
17:24.46 Maloeran 20-40k is "large and huge" ?
17:25.15 ``Erik bigger than a page *shrug* I d'no, heh, I'm looking through this stuff more or less lost...
17:25.18 ``Erik <-- doesn't know ppc asm :)
17:25.26 Maloeran #define SAMPLE_TRIANGLES_PER_LIST (4096) could be set to 200k or something *shrug*, to have fewer calls
17:51.38 Maloeran Erik, could one of OSX's "security feature" be to zero malloc() chunks or something? I'm running out of hypotheses
17:52.38 ``Erik might be *shrug* I d'no
17:55.12 Maloeran "The default malloc on OS X causes a large performance degradation relative to the default mallocs on Linux and Solaris."
17:55.16 Maloeran Gah.
17:56.42 Maloeran 50% slower, nothing of the scale we saw here
18:07.06 ``Erik interesting, a significant portion of time looks like it's attributed to handling l2 cache misses
18:09.45 ``Erik ahhhhhhhhh
18:10.05 ``Erik mmAlloc() cooks up time in a kernel function called "Zero Fill"
18:10.15 Maloeran AHH!!
18:10.26 ``Erik which'd explain cache thrashing
18:10.35 Maloeran _That_ is the reason, I'm allocating a whole bunch and freeing, sometimes without even using the chunks
18:10.59 ``Erik learn somethin' new every day
18:11.22 Maloeran Can you fix that?
18:11.30 Maloeran Can you make malloc() behave in a sane manner?
18:12.32 ``Erik googling for that now... and 'sane' is a phrase that can be argued against... :D quit abusing malloc? *duck*
18:12.37 ``Erik http://lists.apple.com/archives/Darwin-development/2003/Apr/msg00217.html mentions some
18:12.46 Maloeran Maybe there are multiple memory managers on OSX, as there are multiple threading libraries on fbsd ( and the default one is horrible too )
18:13.17 Maloeran Why would an OS ever memset() malloc'ed chunks? I can do that myself I need it, that's absurd
18:13.28 Maloeran if* I need it
18:13.49 Maloeran The segfault mentioned earlier was fixed too
18:13.56 ``Erik http://lists.apple.com/archives/Darwin-development/2003/Apr/msg00210.html answers that, heh
18:14.01 ``Erik security mechanism
18:14.11 Maloeran Absurd.
18:16.01 ``Erik http://developer.apple.com/tools/performance/optimizingwithsystemtrace.html and search for "zero-fill"
18:17.25 Maloeran So I have to write my own full-featured memory manager because the OSX manager is too incompetent to care about performance
18:17.48 ``Erik well, the converse argument is that the linux memory manager is too incompetent to care about security
18:17.52 Maloeran That also explains why even the m1a2 was taking so long to prep on your laptops, it's supposed to be a few seconds
18:18.26 Maloeran If a process puts sensitive stuff in RAM, it's the duty of _that_ process to mlock() the memory and clear it accordingly
18:18.44 Maloeran Don't slow down the whole OS for a few chunks of ram that might possibly contain something sensitive
18:19.09 ``Erik heh
18:19.22 ``Erik in the land of incompetent coders... :)
18:19.32 Maloeran mlock() and related functions exist for a good reason
18:19.45 ``Erik yes, as do calloc(), etc...
18:20.33 Maloeran Grah, this is so absurd
18:20.58 ``Erik freebsd does the same thing, apparently
18:21.04 ``Erik http://kerneltrap.org/node/72
18:22.55 Maloeran Seriously, this makes no sense at all. There are POSIX functions to take care of storing sensitive information in RAM
18:23.16 ``Erik ... and if people USED them, then os's wouldn't have to step up and cover
18:24.06 Maloeran This is a _very_ bad fix. Fix the software, don't hack a slow and patchy solution in the OS
18:24.46 ``Erik heh, and it seems to be a hot issue in linux kernel development right now
18:25.23 ``Erik (and if the software is designed to break the os? malicious code exists :/ )
18:26.08 Maloeran Okay. Do you have a full-featured and complete memory manager in BRL-CAD already?
18:26.22 ``Erik http://lists.apple.com/archives/darwin-development/2003/Apr/msg00227.html has more
18:26.29 ``Erik yeah, in libbu
18:26.31 ``Erik um
18:26.52 ``Erik but the behavior of "lots of allocs and deallocs" is gonna be slow if it's passed to the os...
18:27.02 Maloeran Seriously, the OS could bzero() pages as the heap grows, but OSX seems to clear even reused pages ; malloc'ing without expanding the heap
18:27.30 Maloeran Normally, malloc() only reaches the OS if the heap has to be extended. Otherwise, it stays entirely in user space
18:27.34 Maloeran On a sane and decent OS anyway
18:28.02 ``Erik erm, ... vm and wm are different, dude
18:29.13 ``Erik (heh, and this is exactly where compacting gc's shine)
18:29.45 Maloeran Checking libbu, I only saw red-black tree stuff there last time
18:30.34 ``Erik I'm pretty sure the libbu memory management is just portable passthrough stuff, though
18:31.36 ``Erik stupid headache *grr*
18:32.00 Maloeran I really don't feel like writing a memory manager to handle broken malloc() implementations, but if I must..
18:32.17 ``Erik <-- thinks it's less broken than linux's :(
18:33.03 Maloeran Surely you agree that if software deals with sensitive information, there are robust and _efficient_ mechanisms to deal with this, instead of having every malloc() call being zero'ed?
18:33.28 ``Erik given the quality of 95% of coders writing 'real' applications, no. I don't.
18:33.34 Maloeran malloc()'ed memory is not supposed to be cleared, it's supposed to be fast
18:34.07 ``Erik hm, I've never thought of malloc as a fast operation *shrug* if you want fast, allocate a big honkin' heap and do it yourself in that...
18:34.33 Maloeran Clearing the new pages as the heap grows would have made a certain sense, but for every malloc call, this is highly absurd
18:34.48 ``Erik ...
18:35.00 ``Erik you cannot make that statement because of how mmu's work.
18:35.21 ``Erik you can free 4k, and then "immediately" alloc 4k, and you are not guaranteed that you got the same 4k back
18:35.33 ``Erik you could've gotten one of my pages, or a completely different page altogether
18:35.50 Maloeran Of course not, but it's likely to be within the heap for the process address space
18:36.11 ``Erik ... for the process address space, yes... but not the wired address space
18:36.33 ``Erik physical memory doesn't line up to process memory, that's what the mmu does...
18:36.39 Maloeran The heap never shrinks, the OS doesn't know that the page is now unused
18:36.59 ``Erik erm, which heap? heh
18:37.27 Maloeran The heap of the process ; the memory manager is likely to reuse that page and you'll get what you had previously stored there, without ever making a syscall
18:37.28 ``Erik free() is to mark a heap as unused so it can be culled...
18:37.43 ``Erik and it disassociates it from the wired page
18:37.45 Maloeran So the heap can shrink on OSX? It never does on Linux
18:38.56 Maloeran That seems to be a logical explanation as to why every malloc() call is zero'ed
18:40.27 ``Erik the process heap should be able to shrink on every os :/
18:40.46 ``Erik now the memory address of new allocations is up in the air, but *shrug*
18:42.04 Maloeran You can't shrink the heap on Linux. If it grows high and shrinks, unused high pages will eventually be put on swap to make room for other processes, and just forgotten
18:42.15 Maloeran That design has its flaws too ( the swapping )
18:42.17 *** join/#brlcad cadguy (n=butler@bz.bzflag.bz)
18:42.26 ``Erik heh, and eventually oom
18:42.56 cadguy Yo! How is everyone?
18:43.00 ``Erik (might be why I've seen ugly oom's on linux, its malloc is broken... O:-) )
18:43.18 Maloeran Good afternoon Lee
18:43.35 ``Erik email is sent, lee... subj "Sql"
18:43.36 Maloeran BSD's malloc() seems less broken than OSX still, it clears new pages but not the content of every malloc() call
18:43.44 cadguy Howdy Maloeran
18:44.06 Maloeran Just having a long debate with Erik about why the raytracer's prep is so terribly slow on OSX
18:44.09 ``Erik osX only zerofills when the freshly allocated page is touched, as far as I can tell
18:45.26 Maloeran Now reading libbu's memory manager, I suppose that's the solution to work around inefficient malloc implementations
18:45.27 cadguy Hmm. How many pages are we allocating? Lots?
18:45.40 Maloeran Lots of pages, which are often just unused and freed
18:45.57 Maloeran malloc() is quite fast on Linux as pages are never cleared
18:45.58 cadguy Yes, that's a notorious performance killer.
18:46.09 cadguy That's a security issue.
18:46.45 Maloeran When dealing with sensitive information, processes can mlock() the memory, there are POSIX functions to take care of that
18:47.25 Maloeran But as Erik argued, a dirty and inefficient fix at the OS level seems to be required due to the amount of bad software out there... *shakes head*
18:47.42 cadguy The usual technique is to keep a buffer pool if you want to alloc/free a lot to keep the code easy. Then allocate through your own buffer pool.
18:48.31 ``Erik *nod* allocate a slew of pages, keep 'free' and 'used' linked lists, when one is freed or allocated, just change which list it lives in
18:48.39 cadguy Yea. Lots of lame code mucking around with privileges. Remember mlock() didn't appear until 4.4BSD.
18:48.41 Maloeran Right. I'm checking libbu, but I won't hide that I'm used to dealing with an efficient malloc implementation
18:49.13 ``Erik if you allocate with nothing in the free list, malloc more... if you're worried about memory consumption, free() some out of the free list when it reaches a threshold
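The buffer-pool scheme described by cadguy and Erik (keep freed buffers on a list, reuse them, and hand some back to the system past a threshold) can be sketched as follows. This is a simplified single-list variant for fixed-size buffers; every name is hypothetical and none of it comes from libbu or from mm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size buffer pool. Freed buffers are pushed onto a
 * singly linked free list and handed back out by pool_alloc() without
 * calling malloc() again. */
typedef struct pool_node {
    struct pool_node *next;
} pool_node;

typedef struct {
    size_t buf_size;      /* usable bytes per buffer */
    pool_node *free_list;
    size_t free_count;
    size_t max_free;      /* release to the system beyond this many */
} pool;

void pool_init(pool *p, size_t buf_size, size_t max_free)
{
    /* Each buffer must be able to hold the list link while it is free. */
    p->buf_size = buf_size < sizeof(pool_node) ? sizeof(pool_node) : buf_size;
    p->free_list = NULL;
    p->free_count = 0;
    p->max_free = max_free;
}

void *pool_alloc(pool *p)
{
    if (p->free_list != NULL) {
        pool_node *n = p->free_list;
        p->free_list = n->next;
        p->free_count--;
        return n;
    }
    return malloc(p->buf_size);   /* free list empty: ask the system */
}

void pool_free(pool *p, void *buf)
{
    if (buf == NULL)
        return;
    if (p->free_count >= p->max_free) {
        free(buf);                /* over the threshold: give it back */
        return;
    }
    pool_node *n = buf;
    n->next = p->free_list;
    p->free_list = n;
    p->free_count++;
}

void pool_destroy(pool *p)
{
    while (p->free_list != NULL) {
        pool_node *n = p->free_list;
        p->free_list = n->next;
        free(n);
    }
    p->free_count = 0;
}
```

Only buffers on the free list are tracked, so the separate 'used' list from the description is left out; a buffer in use is simply owned by the caller. Repeated allocate/free cycles then stay in user space once the pool has warmed up.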
18:49.28 ``Erik s/efficient/insecure/ :)
18:49.51 Maloeran Yes yes, I got that to deal with many small chunks. I haven't got a full memory manager to deal with chunks of all sizes and shapes
18:49.52 cadguy No reason to hide. Just be aware that there are space/time/security tradeoffs that different OS's make.
18:50.03 ``Erik my bike goes 20kph and stays together, yours goes 30 and kicks the wheels off every 50km
18:50.05 ``Erik :D
18:50.52 Maloeran :) Eh well, time to write a memory manager then!
18:51.24 ``Erik <-- thought that's what mm was supposed to be o.O :)
18:51.56 Maloeran It's not a full-blown memory manager, it has efficient handling of packed tiny chunks, balanced trees, etc.
18:52.47 Maloeran since Linux's malloc() always performed decently for management of medium to large sized chunks
18:53.26 cadguy In general, any time you can avoid a system call, it is worth doing.
18:54.46 Maloeran On Linux, free() never shrinks the heap, so malloc() will always remain in user-space unless the heap has to grow. I realize it's quite different on OSX
18:55.36 cadguy And different on solaris and other Unix's
19:07.54 Maloeran That model really is a challenge for any acceleration structure, the planned second 'prep' pass should improve things a bit... but mostly, ray bundles will
19:08.05 Maloeran That and threads
19:09.48 ``Erik oohhhhh, rfTraceRays() calls malloc, too
19:11.58 Maloeran Only if there are no already allocated 'job' struct in the list, nothing to worry about there
19:15.13 ``Erik that dstdist=mathPlanePoint() line (634) is a major contributor to L2 cache misses (27.5%)
19:15.39 ``Erik second being line 582 "if(src[linkflags&RF_NODE_AXIS_MASK]<NODE(root)->plane)" at 6.6%
19:16.56 Maloeran The prototype had prefetch instructions for caching triangles before the actual tests, that should help
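Prefetching triangle data ahead of the test that needs it might look like this. It is a sketch under several assumptions: the `tri` layout, the prefetch distance, and the flat array traversal are invented for illustration, and `__builtin_prefetch` is a GCC/Clang builtin that acts only as a hint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical triangle record; the real layout used by the raytracer
 * is not shown in the log. */
typedef struct {
    float plane[4];   /* plane equation: a*x + b*y + c*z + d */
} tri;

/* Portable wrapper: elsewhere than GCC/Clang the macro does nothing. */
#if defined(__GNUC__)
#define PREFETCH(addr) __builtin_prefetch((addr), 0, 3)
#else
#define PREFETCH(addr) ((void)0)
#endif

#define PREFETCH_AHEAD 4   /* assumed distance; tune by measurement */

/* Count the triangles whose plane has `pt` on or behind it, requesting
 * a later triangle's cache line while the current one is tested. */
size_t count_behind(const tri *tris, size_t n, const float pt[3])
{
    size_t behind = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            PREFETCH(&tris[i + PREFETCH_AHEAD]);
        const float *pl = tris[i].plane;
        float dist = pl[0] * pt[0] + pl[1] * pt[1] + pl[2] * pt[2] + pl[3];
        if (dist <= 0.0f)
            behind++;
    }
    return behind;
}
```

Hardware prefetchers already handle a linear array well, so any gain is more likely where triangles are reached through pointers inside the acceleration structure. Whether it helps at all has to be measured with a profiler.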
19:17.19 ``Erik memory bandwidth looks like, um, around 200-300 MB/s read and 20-30MB/s write
19:17.32 Maloeran You know, I really like your profiler :)
19:17.50 ``Erik heh, me too, this thing is gnarly
19:18.06 cadguy You really should try to pick it up.
19:18.39 cadguy Want me to talk with Mark?
19:19.30 Maloeran Thanks, just give me 33 hours to receive my first real pay check from Survice assuming the 30 days delay after the end of the month is respected
19:20.19 ``Erik you got your travel expenses and per diem all sorted out, correct?
19:21.07 CIA-9 BRL-CAD: lbutler * brlcad/sh/gforge.sh: script for querying a gforge site
19:21.34 Maloeran I had no per diem expenses in August, but sure
19:23.24 *** join/#brlcad IriX64 (n=IriX64@bas3-sudbury98-1168052970.dsl.bell.ca)
19:23.38 ``Erik dude, if you ever do work related travel, the employer should set everything up and take care of all the (reasonable) expenses...
19:24.34 ``Erik it's chump change to them, a no brainer investment...
19:27.16 Maloeran Ah don't worry, I'll be quite fine. The 30 days delay for a monthly pay is just a bit annoying, after 2-3 months of unpaid vacation anyway ;)
19:27.41 ``Erik rtiBatchNsCallback() is your flat shadow-less shader?
19:27.53 Maloeran Somewhat, yes
20:57.14 CIA-9 BRL-CAD: lbutler * brlcad/sh/gforge.sh: make script adaptable to host
21:13.00 Maloeran Erik, before I write a bunch of code, do you have Hoard handy to see if the memory manager does a better job?
21:13.25 Maloeran It might clear pages the BSD way even on OSX
22:43.50 ``Erik hoard? nope
23:16.54 Maloeran Oh well. Everything but sectors and steps are now allocated by sliced blocks, these chunks of variable size will have their own personal little memory manager
