irclog2html for #brlcad on 20070212

01:46.05 *** join/#brlcad IriX64 (n=IriX64@bas3-sudbury98-1168057882.dsl.bell.ca)
02:42.55 *** join/#brlcad Twingy (n=justin@74.92.144.217)
03:22.11 Maloeran Erik or brlcad, are you there? I could use your knowledge of non-Linux platforms
03:23.14 Maloeran Basically, I would like to solve the problem of running on NUMA platforms. Keeping a copy of the dataset in each memory bank, with specific threads on specific cores working on each copy, is something that can be done on Linux
03:24.26 Maloeran Since it might be tricky to support all platforms that way, I was thinking about a more general approach... Each die, each group of cores accessing the same memory bank, could run its own process; the processes would work together by distributed processing
03:25.22 Maloeran Are other platforms clever enough to allocate memory in the bank local to the processor a process is running on? Are they bright enough to put all the threads of a process on the cores of the same die?
03:25.57 Maloeran That really would be a simple solution. The processes can synchronize with each other through shared memory to avoid most of the networking overhead
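As a concrete illustration of the Linux scheme described above, here is a minimal sketch assuming libnuma is available (link with -lnuma). DATASET_SIZE and process_dataset() are hypothetical placeholders, not anything from the channel; the numa_*() and pthread calls are real interfaces:

    /* One worker per NUMA node: pin the thread to that node's cores,
     * then allocate the node's private copy of the dataset locally. */
    #include <numa.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define DATASET_SIZE (64 * 1024 * 1024)   /* hypothetical size */

    static void process_dataset(char *data, size_t len) {
        (void)data; (void)len;                /* hypothetical workload */
    }

    static void *worker(void *arg) {
        int node = (int)(long)arg;
        numa_run_on_node(node);               /* run on this bank's cores */
        char *copy = numa_alloc_onnode(DATASET_SIZE, node);
        if (copy == NULL)
            return NULL;
        memset(copy, 0, DATASET_SIZE);        /* fault the pages in locally */
        process_dataset(copy, DATASET_SIZE);
        numa_free(copy, DATASET_SIZE);
        return NULL;
    }

    int main(void) {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this system\n");
            return 1;
        }
        int nodes = numa_max_node() + 1;      /* number of memory banks */
        pthread_t tid[nodes];
        for (int n = 0; n < nodes; n++)
            pthread_create(&tid[n], NULL, worker, (void *)(long)n);
        for (int n = 0; n < nodes; n++)
            pthread_join(tid[n], NULL);
        return 0;
    }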
03:29.28 ``Erik not always
03:29.44 Maloeran Can it be manually forced?
03:29.51 ``Erik fbsd doesn't seem to do bank association on amd64 from dorking around... no way to force it
03:30.00 Maloeran I think that's far less trouble than building NUMA awareness in; just having a process per memory bank
03:30.05 Maloeran Ouch!
03:30.32 ``Erik and if you have one hard-running thread on a dual proc mac, it'll aggressively rotate it between procs to keep temps even
03:30.43 Maloeran This is terrible
03:30.53 ``Erik *shrug* it's the way things go
03:31.22 ``Erik (the bsd thing needs to be fixed... if I had free time, I'd get elbow deep into the allocator and scheduler and make it happen... but time is a rare commodity)
03:31.58 Maloeran Do you have any thoughts about NUMA-aware code within a single process, storing multiple copies of the dataset, versus just having multiple synchronized processes?
03:32.32 ``Erik that all depends on if there's enough ram *shrug*
03:32.32 Maloeran The second way seems easier to get to work on different OSes, if the OSes themselves are numa-aware
03:32.40 Maloeran Right, of course
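A rough sketch of that second approach, one process per memory bank coordinating through POSIX shared memory instead of the network. The segment name /bank_sync and the job counter are illustrative only, and the creator/attacher startup race is glossed over:

    /* Each per-bank process attaches to one shared segment and claims
     * work under a process-shared mutex; no sockets involved.
     * Compile with -pthread (and -lrt on older Linux). */
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    struct shared_state {
        pthread_mutex_t lock;    /* PTHREAD_PROCESS_SHARED mutex */
        long next_job;           /* illustrative work counter */
    };

    int main(void) {
        /* First process creates and initializes; the rest attach.
         * (A real version needs a handshake so attachers never see
         * the mutex before it is initialized.) */
        int creator = 1;
        int fd = shm_open("/bank_sync", O_CREAT | O_EXCL | O_RDWR, 0600);
        if (fd < 0) {
            creator = 0;
            fd = shm_open("/bank_sync", O_RDWR, 0600);
            if (fd < 0) { perror("shm_open"); return 1; }
        } else if (ftruncate(fd, sizeof(struct shared_state)) < 0) {
            perror("ftruncate"); return 1;
        }

        struct shared_state *st = mmap(NULL, sizeof *st,
                                       PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
        if (st == MAP_FAILED) { perror("mmap"); return 1; }

        if (creator) {
            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
            pthread_mutex_init(&st->lock, &attr);
            st->next_job = 0;
        }

        pthread_mutex_lock(&st->lock);        /* claim a work unit */
        long job = st->next_job++;
        pthread_mutex_unlock(&st->lock);
        printf("pid %d took job %ld\n", (int)getpid(), job);

        munmap(st, sizeof *st);
        close(fd);
        return 0;
    }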
03:35.22 Maloeran I'm surprised that, even manually, one can't force threads onto cores and allocations into banks... That's probably part of the explanation of why clusters don't run BSD
03:37.29 brlcad a lot of similar concepts to numa
03:38.01 brlcad additional reading with details on threading: http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/POWER5+Architecture
03:38.34 Maloeran Thanks brlcad, seems similar to the Opteron docs I read at first glance
03:39.31 Maloeran I'm mostly wondering about how to solve the software aspect of the problem
03:41.01 brlcad eh, the devil's in the details .. exceptionally high-end server market, no commodity aspects
03:41.15 brlcad the documents go into software implications
03:41.23 Maloeran Right, great
03:41.48 brlcad in particular the latter, which details execution, threading, and memory management
03:42.25 brlcad could probably get an account on an sp4 to play with
03:42.47 Maloeran Wouldn't that be Power5-OSX specific? It's awfully specific to the OS, there's no standard for NUMA management
03:43.26 brlcad os x doesn't run on power5
03:43.45 Maloeran MacOS9 then :), I really didn't follow that line of software
03:44.06 brlcad the power series are what are used by the high-end supercomputers
03:44.21 brlcad they have no relation to apple/mac
03:45.05 Maloeran Oh. Great
03:46.24 brlcad the G4 and G5 have architecture aspects similar to the power series, and some have suggested that the G5 is effectively the Power3 or Power4 with some of the high-end supercomputing facilities removed (data management, simultaneous core execution, larger L1/L2/L3 memories, etc, etc)
03:47.02 Maloeran Thanks, that clears things up
03:47.28 ``Erik 'cept the g[45] series have altivec, ibm/ppc doesn't
03:47.44 ``Erik 'cluster' is an awfully broad term o.O
03:47.57 Maloeran Exactly :)
03:48.16 ``Erik that's like saying you want to learn how to write assembly for computers...
03:49.11 Maloeran The comparison is valid; learning assembly for the main architectures, or learning scalable software for the main cluster architectures
03:51.33 brlcad valid, but potentially very misleading -- comparing athlon/G5/P4/whatever to the Power architecture is sort of like comparing the GeForce 2 to the Quadro FX .. there are correlations, but one is the exceptional high end, with various features that can be leveraged for extra order(s) of magnitude in performance
03:54.53 ``Erik heh, my analogy was to point out how vague the notion in mal's statement was, as there are many radically different archs... as there are cluster technologies *shrug*
03:55.32 ``Erik heh, yeah, the power line displaced the mips line, its immediate ancestor.. :D
03:56.04 brlcad yeah, and have been king ever since.. for what? a decade now?
03:56.26 brlcad since at least 1998 iirc
03:56.48 Maloeran Erik, and I'm interested in learning scalable programming for the main ones
03:56.48 ``Erik the unf/$ leans more towards opterons, though *shrug*
03:56.52 brlcad opteron has certainly been on the rise with the revival of cray
03:57.10 ``Erik some amusing quotes from seymour
03:58.08 Maloeran NUMA-aware threading code isn't too much trouble on Linux, but as for some other OSes..
03:58.08 ``Erik 'numa' is a pretty broad category
03:58.13 Maloeran Assigning threads to memory banks is a fairly simple concept
03:58.47 ``Erik in the simplest of forms, and provided the OS exposes it, sure *shrug*
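A small sketch of that caveat: pin a thread to a core where the platform exposes affinity, and fall back to scheduler placement where it does not (as on the FreeBSD and Mac OS X discussed earlier). pthread_setaffinity_np() is a real GNU extension; everything else here is illustrative:

    /* Pin a thread to one core where the OS exposes affinity;
     * elsewhere just let the scheduler place it. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    static int pin_to_core(pthread_t tid, int core) {
    #if defined(__linux__)
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return pthread_setaffinity_np(tid, sizeof set, &set);
    #else
        /* e.g. the FreeBSD or Mac OS X of this era: no way to force
         * placement, so report failure and let the scheduler decide. */
        (void)tid; (void)core;
        return -1;
    #endif
    }

A worker would typically call pin_to_core(pthread_self(), n) before first touching its dataset copy, so that first-touch page allocation lands in the intended bank.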
03:59.01 brlcad at the top 500 level, it rarely has to do with $$.. it's reliability and performance first followed by probably support and installation impact
03:59.35 brlcad the technology is usually second to just computing things as fast as possible
08:33.22 *** join/#brlcad IriX64 (n=IriX64@bas3-sudbury98-1168057882.dsl.bell.ca)
09:17.33 *** join/#brlcad dtidrow (n=dtidrow@c-69-255-182-248.hsd1.va.comcast.net)
09:59.21 *** join/#brlcad clock_ (n=clock@zux221-122-143.adsl.green.ch)
10:03.35 *** join/#brlcad cad32 (n=503708da@bz.bzflag.bz)
13:52.38 *** join/#brlcad b0ef (n=b0ef@084202025057.customer.alfanett.no)
14:55.18 *** join/#brlcad docelic (n=docelic@212.15.183.78)
15:27.54 *** join/#brlcad docelic (n=docelic@212.15.174.172)
15:55.47 *** join/#brlcad brlcad (n=sean@bz.bzflag.bz) [NETSPLIT VICTIM]
15:59.33 *** join/#brlcad b0ef (n=b0ef@084202025057.customer.alfanett.no) [NETSPLIT VICTIM]
16:00.07 *** join/#brlcad docelic (n=docelic@212.15.174.172) [NETSPLIT VICTIM]
16:00.33 *** mode/#brlcad [+o brlcad] by ChanServ
16:58.10 *** join/#brlcad docelic (n=docelic@212.15.185.121)
17:55.50 *** join/#brlcad debarshi (n=rishi@202.141.130.198)
22:37.02 *** join/#brlcad FthrNtr (n=IriX64@bas3-sudbury98-1168056909.dsl.bell.ca)
