01:46.05 | *** join/#brlcad IriX64 (n=IriX64@bas3-sudbury98-1168057882.dsl.bell.ca)
02:42.55 | *** join/#brlcad Twingy (n=justin@74.92.144.217)
03:22.11 | Maloeran | Erik or brlcad, are you there? I could use your knowledge of non-Linux platforms
03:23.14 | Maloeran | Basically, I would like to solve the problem of running on NUMA platforms. Having a copy of the datasets in each memory bank, and having specific threads on specific cores working on each copy, is something that can be done on Linux
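A minimal sketch of what that Linux-only approach could look like, assuming libnuma is available (link with -lnuma): one worker thread per NUMA node, each pinned to that node's cores and working on its own copy of the data allocated from that node's bank. The dataset buffer, its size, and the worker body are placeholders, not BRL-CAD code.

/*
 * One worker thread per NUMA node, each pinned to that node's cores and
 * working on a private replica of the dataset allocated from that node's
 * memory bank.  Compile with: cc -pthread numa_workers.c -lnuma
 */
#include <numa.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DATASET_SIZE (64UL * 1024 * 1024)   /* hypothetical dataset size */

static const char *dataset;                 /* master copy, loaded elsewhere */

struct worker_arg { int node; };

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    /* Restrict this thread to the cores attached to its NUMA node. */
    numa_run_on_node(arg->node);

    /* Allocate the replica from this node's bank and copy the data in. */
    char *local = numa_alloc_onnode(DATASET_SIZE, arg->node);
    if (!local)
        return NULL;
    memcpy(local, dataset, DATASET_SIZE);

    /* ... do the real work against the node-local copy here ... */

    numa_free(local, DATASET_SIZE);
    return NULL;
}

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this kernel\n");
        return 1;
    }

    dataset = calloc(1, DATASET_SIZE);      /* stand-in for the real data */

    int nodes = numa_max_node() + 1;
    pthread_t *tids = calloc(nodes, sizeof *tids);
    struct worker_arg *args = calloc(nodes, sizeof *args);

    for (int n = 0; n < nodes; n++) {
        args[n].node = n;
        pthread_create(&tids[n], NULL, worker, &args[n]);
    }
    for (int n = 0; n < nodes; n++)
        pthread_join(tids[n], NULL);

    return 0;
}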
03:24.26 | Maloeran | Since it might be tricky to support all platforms this way, I was thinking about a more general way... Each die, each chunk of cores accessing the same memory bank, could have its own process running; the processes would work together by distributed processing
03:25.22 | Maloeran | Are other platforms clever enough to allocate memory into the memory bank specific to the processor a process is running on? Are they bright enough to put all the threads of the process on the cores of the same die?
03:25.57 | Maloeran | That really would be a simple solution. The processes can synchronize with each other through shared memory to avoid most of the networking overhead
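As a rough illustration of that multi-process idea, here is a sketch of coordinating one-process-per-bank workers through a POSIX shared-memory segment with a process-shared mutex instead of sockets. The segment name /rt_sync, the jobs_done counter, and the create-vs-join handling are all hypothetical, and real code would also need a startup barrier before joiners touch the lock.

/*
 * Coordination between worker processes via POSIX shared memory.
 * Compile with: cc -pthread shm_sync.c  (add -lrt on older glibc)
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct shared_state {
    pthread_mutex_t lock;     /* must be PTHREAD_PROCESS_SHARED */
    long            jobs_done;
};

int main(void)
{
    /* The first process creates the segment; later processes just open it. */
    int created = 1;
    int fd = shm_open("/rt_sync", O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        created = 0;
        fd = shm_open("/rt_sync", O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
    }
    if (created && ftruncate(fd, sizeof(struct shared_state)) != 0) {
        perror("ftruncate");
        return 1;
    }

    struct shared_state *st = mmap(NULL, sizeof *st,
                                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (st == MAP_FAILED) { perror("mmap"); return 1; }

    /* Only the creating process sets up the process-shared mutex. */
    if (created) {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&st->lock, &attr);
    }

    /* Every worker then updates shared state with no networking involved. */
    pthread_mutex_lock(&st->lock);
    st->jobs_done++;
    pthread_mutex_unlock(&st->lock);

    printf("jobs done so far: %ld\n", st->jobs_done);
    return 0;
}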
03:29.28 | ``Erik | not always
03:29.44 | Maloeran | Can it be manually forced?
03:29.51 | ``Erik | fbsd doesn't seem to do bank association on amd64 from dorking around... no way to force it
03:30.00 | Maloeran | I think that's far less trouble than having NUMA-awareness built in; just having a process per memory bank
03:30.05 | Maloeran | Ouch!
03:30.32 | ``Erik | and if you have one hard-running thread on a dual-proc mac, it'll aggressively rotate it between procs to keep temps even
03:30.43 | Maloeran | This is terrible
03:30.53 | ``Erik | *shrug* it's the way things go
03:31.22 | ``Erik | (the bsd thing needs to be fixed... if I had free time, I'd get elbow deep into the allocator and scheduler and make it happen... but time is a rare commodity)
03:31.58 | Maloeran | Do you have any thoughts about numa-aware code within a single process, storing multiple copies of the dataset, or just having multiple synchronized processes?
03:32.32 | ``Erik | that all depends on if there's enough ram *shrug*
03:32.32 | Maloeran | The second way seems easier to get to work on different OSes, if the OSes themselves are numa-aware
03:32.40 | Maloeran | Right, of course
03:35.22 | Maloeran | I'm surprised that, even manually, one can't force threads onto cores and allocations into banks... That's probably part of the explanation of why clusters don't run BSD
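For contrast, this is roughly what the manual forcing looks like on Linux: sched_setaffinity() to pin the calling thread to a core, and mbind() from <numaif.h> (link with -lnuma) to pin a mapping to one bank. The core and node numbers here are arbitrary examples.

/*
 * Pin the calling thread to core 0 and a 16 MB mapping to NUMA node 0.
 * Compile with: cc pin.c -lnuma
 */
#define _GNU_SOURCE
#include <numaif.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* 1. Pin this thread to core 0. */
    cpu_set_t cpus;
    CPU_ZERO(&cpus);
    CPU_SET(0, &cpus);
    if (sched_setaffinity(0, sizeof(cpus), &cpus) != 0)
        perror("sched_setaffinity");

    /* 2. Force a 16 MB anonymous mapping onto NUMA node 0. */
    size_t len = 16UL * 1024 * 1024;
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    unsigned long nodemask = 1UL << 0;       /* bit 0 == node 0 */
    if (mbind(buf, len, MPOL_BIND, &nodemask, sizeof(nodemask) * 8, 0) != 0)
        perror("mbind");

    /* Pages land on node 0 when first touched. */
    ((char *)buf)[0] = 1;
    printf("bound and touched %zu bytes on node 0\n", len);
    return 0;
}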
03:37.29 | brlcad | a lot of similar concepts to numa
03:38.01 | brlcad | additional reading with details on threading: http://www-941.ibm.com/collaboration/wiki/display/WikiPtype/POWER5+Architecture
03:38.34 | Maloeran | Thanks brlcad, seems similar to the Opteron docs I read at first glance
03:39.31 | Maloeran | I'm mostly wondering about how to solve the software aspect of the problem
03:41.01 | brlcad | eh, the devil's in the details .. exceptionally high-end server market, no commodity aspects
03:41.15 | brlcad | the documents go into software implications
03:41.23 | Maloeran | Right, great
03:41.48 | brlcad | in particular the latter, which details execution, threading, and memory management
03:42.25 | brlcad | could probably get an account on an sp4 to play with
03:42.47 | Maloeran | Wouldn't that be Power5-OSX specific? It's awfully specific to the OS, there's no standard for NUMA management
03:43.26 | brlcad | os x doesn't run on power5
03:43.45 | Maloeran | MacOS9 then :), I really didn't follow that line of software
03:44.06 | brlcad | the power series are what are used by the high-end supercomputers
03:44.21 | brlcad | they have no relation to apple/mac
03:45.05 | Maloeran | Oh. Great
03:46.24 | brlcad | the G4 and G5 have architecture aspects similar to the power series, and some have suggested that the G5 is effectively the Power3 or Power4 with some of the high-end supercomputing facilities removed (data management, simultaneous core execution, larger L1/L2/L3 memories, etc, etc)
03:47.02 | Maloeran | Thanks, that clears things up
03:47.28 | ``Erik | 'cept the g[45] series have altivec, ibm/ppc doesn't
03:47.44 | ``Erik | 'cluster' is an awfully broad term o.O
03:47.57 | Maloeran | Exactly :)
03:48.16 | ``Erik | that's like saying you want to learn how to write assembly for computers...
03:49.11 | Maloeran | The comparison is valid; learning assembly for the main architectures, or learning scalable software for the main cluster architectures
03:51.33 | brlcad | valid, but potentially very misleading -- comparing athlon/G5/P4/whatever to the Power architecture is sort of like comparing the GeForce 2 to the Quadro FX .. there are correlations, but one is the exceptional high-end with various features that can be leveraged for extra order(s) of performance
03:54.53 | ``Erik | heh, my analogy was to point out how vague the notion in mal's statement was, as there are many radically different archs... as there are cluster technologies *shrug*
03:55.32 | ``Erik | heh, yeah, the power line displaced the mips line, its immediate ancestor.. :D
03:56.04 | brlcad | yeah, and it's been king ever since.. for what? a decade now?
03:56.26 | brlcad | since at least 1998 iirc
03:56.48 | Maloeran | Erik, and I'm interested in learning scalable programming for the main ones
03:56.48 | ``Erik | the unf/$ leans more towards opterons, though *shrug*
03:56.52 | brlcad | opteron has certainly been on the rise with the revival of cray
03:57.10 | ``Erik | some amusing quotes from seymour
03:58.08 | Maloeran | NUMA-aware threading code isn't too much trouble on Linux, but as for some other OSes..
03:58.08 | ``Erik | 'numa' is a pretty broad category
03:58.13 | Maloeran | Assigning threads to memory banks is a fairly simple concept
03:58.47 | ``Erik | in the simplest of forms, and provided the OS exposes it, sure *shrug*
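That "fairly simple concept", written out as a small sketch: a helper a worker thread could call once at startup so that both its execution and its later allocations stay on one bank. It assumes Linux with libnuma (link with -lnuma); bind_thread_to_bank() is a hypothetical name, not an existing API.

#include <numa.h>
#include <stdio.h>

/* Returns 0 on success, -1 if NUMA is unavailable or the node is invalid. */
static int bind_thread_to_bank(int node)
{
    if (numa_available() < 0 || node > numa_max_node())
        return -1;

    /* Run only on the cores attached to this bank... */
    if (numa_run_on_node(node) != 0)
        return -1;

    /* ...and prefer that bank for this thread's future allocations. */
    numa_set_preferred(node);
    return 0;
}

int main(void)
{
    /* Example: pin the main thread (and its allocations) to bank 0. */
    if (bind_thread_to_bank(0) != 0) {
        fprintf(stderr, "could not bind to bank 0\n");
        return 1;
    }
    printf("bound to bank 0\n");
    return 0;
}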
03:59.01 | brlcad | at the top 500 level, it rarely has to do with $$.. it's reliability and performance first, followed probably by support and installation impact
03:59.35 | brlcad | the technology is usually second to just computing things as fast as possible
08:33.22 | *** join/#brlcad IriX64 (n=IriX64@bas3-sudbury98-1168057882.dsl.bell.ca)
09:17.33 | *** join/#brlcad dtidrow (n=dtidrow@c-69-255-182-248.hsd1.va.comcast.net)
09:59.21 | *** join/#brlcad clock_ (n=clock@zux221-122-143.adsl.green.ch)
10:03.35 | *** join/#brlcad cad32 (n=503708da@bz.bzflag.bz)
13:52.38 | *** join/#brlcad b0ef (n=b0ef@084202025057.customer.alfanett.no)
14:55.18 | *** join/#brlcad docelic (n=docelic@212.15.183.78)
15:27.54 | *** join/#brlcad docelic (n=docelic@212.15.174.172)
15:55.47 | *** join/#brlcad brlcad (n=sean@bz.bzflag.bz) [NETSPLIT VICTIM]
15:59.33 | *** join/#brlcad b0ef (n=b0ef@084202025057.customer.alfanett.no) [NETSPLIT VICTIM]
16:00.07 | *** join/#brlcad docelic (n=docelic@212.15.174.172) [NETSPLIT VICTIM]
16:00.33 | *** mode/#brlcad [+o brlcad] by ChanServ
16:58.10 | *** join/#brlcad docelic (n=docelic@212.15.185.121)
17:55.50 | *** join/#brlcad debarshi (n=rishi@202.141.130.198)
22:37.02 | *** join/#brlcad FthrNtr (n=IriX64@bas3-sudbury98-1168056909.dsl.bell.ca)