01:01.00 |
Maloeran |
I know I have whined about this before, but
the C parser of Doxygen sure is broken on many points |
01:02.07 |
Twingy |
but Lee sure seems to love it |
01:02.18 |
Maloeran |
Arrays of function pointers are functions,
struct variables with gnu99 __attribute__ are "class methods" (
hello, this is C ) |
01:03.14 |
Maloeran |
It's nice on some aspects, but the parser
makes some gross mistakes |
01:07.43 |
Maloeran |
Whenever the data type of something is
complex, it thinks it's a function or a method. Oh, and I had
Doxygen tell me that for(;;) and if() were undocumented
"functions" before |
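For reference, the kinds of declarations Maloeran describes look roughly like the sketch below. It is only an illustration: the names (op_add, op_table, vec4, origin, apply_op) are hypothetical and not taken from his code, and the __attribute__ syntax assumes a GCC-compatible compiler. Both op_table and origin are variables, which is what a documentation parser would be expected to report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handlers, named only for this illustration. */
static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }

/* An array of function pointers: a variable, not a function. */
static int (*const op_table[])(int, int) = { op_add, op_sub };

/* A struct variable carrying a GNU attribute: plain C data, not a method. */
struct vec4 { float x, y, z, w; };
static struct vec4 origin __attribute__((aligned(16))) = { 0.0f, 0.0f, 0.0f, 1.0f };

/* Dispatch through the table; returns 0 for an out-of-range index. */
static int apply_op(size_t index, int a, int b)
{
    size_t count = sizeof op_table / sizeof op_table[0];
    if (index >= count)
        return 0;
    assert(op_table[index] != NULL);
    return op_table[index](a, b);
}
```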
01:11.06 |
``Erik |
nice |
01:11.21 |
``Erik |
if only for() and if() WERE functions in
C |
01:11.22 |
``Erik |
*sigh* |
01:12.40 |
Maloeran |
Lisp is nice, but I'm not against having some
fundamental language constructs that are not functions |
01:35.56 |
``Erik |
http://www.calarts.edu/~jwhite/gbj/SeeHim.html |
01:36.10 |
``Erik |
lithp and thcheme have language constructs
that aren't functions |
07:08.23 |
*** join/#brlcad clock_
(i=clock@84-72-60-30.dclient.hispeed.ch) |
08:15.17 |
*** join/#brlcad clock_
(n=clock@zux221-122-143.adsl.green.ch) |
09:21.10 |
*** join/#brlcad clock_
(n=clock@zux221-122-143.adsl.green.ch) |
10:49.29 |
*** join/#brlcad docelic
(i=docelic@ri01-092.dialin.iskon.hr) |
14:51.31 |
*** join/#brlcad docelic
(i=docelic@ri02-084.dialin.iskon.hr) |
16:51.15 |
brlcad |
``Erik: you have any 6.2 discs
burnt? |
17:18.03 |
``Erik |
uhmmmmmmmm, no, 6.2 isn't released
yet |
17:18.21 |
``Erik |
it's rc1 |
17:45.45 |
*** join/#brlcad ntroutman
(n=nathanie@prox.snu.edu) |
18:02.42 |
*** join/#brlcad ntroutman
(n=nathanie@prox.snu.edu) |
18:36.22 |
``Erik |
mal: fixes rendering, yup, but generation is
still effed up |
18:36.32 |
``Erik |
linux 8core amd64... |
18:36.33 |
``Erik |
linkListAddPair (listhead0=0x2a95d3a668,
listhead1=0x2a95bfe290, step0=0x2a95d3a618, step1=0x2a95bfe238,
memblock=0x513bf0) at ../../../RF/prepmodel.c:718 |
18:36.34 |
``Erik |
718 if( ( linklist1->used ==
LINKS_PER_LIST ) ) { |
18:36.34 |
``Erik |
(gdb) print *linklist1 |
18:36.34 |
``Erik |
Cannot access memory at address 0x0 |
18:37.02 |
``Erik |
(and fbsd just sits, no crash) |
18:40.45 |
Maloeran |
Okay, every time or sometimes? |
18:42.52 |
Maloeran |
And fbsd freezes constantly or it's less
consistent? I'm just trying to get an idea of what's going
on |
18:43.25 |
*** join/#brlcad docelic
(i=docelic@ri01-139.dialin.iskon.hr) |
18:44.06 |
``Erik |
linux: |
18:44.07 |
``Erik |
0x0000002a9585ba41 in jobModelPrepStep
(engine=0x50aa20, job=Variable "job" is not available. |
18:44.07 |
``Erik |
) at ../../../RF/prepmodel.c:1720 |
18:44.07 |
``Erik |
1720 if( steplink->edge[axismin]
< plane ) { |
18:44.47 |
Maloeran |
Right okay, not too consistent, though I guess
it does crash pretty much all the time |
18:45.19 |
Maloeran |
And always in the same pass, okay |
18:46.18 |
``Erik |
if there's a valid cache, it's all grand...
gettin' ~40fps on both the 4 core fbsd and the 8 core linux
(hardcoded at 4 threads, I guess) |
18:46.58 |
``Erik |
hah, linux: Can't attach LWP -1023750857: No
such process |
18:47.10 |
``Erik |
fbsd seems stuck |
18:47.11 |
Maloeran |
Yes, fix rfdemo.c to increase that count,
hard-coded #define |
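One way to avoid editing a hard-coded count is to ask the system how many processors are online. The sketch below is an assumption about how that could be done, not the contents of rfdemo.c: demo_thread_count, DEMO_DEFAULT_THREADS and DEMO_MAX_THREADS are hypothetical names, and _SC_NPROCESSORS_ONLN is a widely supported sysconf extension (Linux, FreeBSD, Solaris) rather than something every POSIX system guarantees.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical values standing in for the hard-coded #define in the log. */
#define DEMO_DEFAULT_THREADS 4
#define DEMO_MAX_THREADS 128

/* Pick a worker thread count: an explicit request wins, otherwise use the
 * number of online processors, otherwise fall back to the default. */
static int demo_thread_count(const char *requested)
{
    long n = 0;

    if (requested != NULL)
        n = strtol(requested, NULL, 10);   /* non-numeric input yields 0 */
    if (n <= 0)
        n = sysconf(_SC_NPROCESSORS_ONLN); /* -1 if the query is unsupported */
    if (n <= 0)
        n = DEMO_DEFAULT_THREADS;
    if (n > DEMO_MAX_THREADS)
        n = DEMO_MAX_THREADS;

    assert(n >= 1 && n <= DEMO_MAX_THREADS);
    return (int)n;
}
```

A caller might pass a command-line argument or an environment variable as `requested`, and NULL to let the system decide.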
18:47.21 |
Maloeran |
No such process, eh? :) |
18:48.47 |
Maloeran |
I guess I now have some information to figure
out this bug |
18:50.22 |
Maloeran |
The more backtrace you can throw at me, the
easier I think it would be to see what's happening in
there |
18:51.17 |
``Erik |
0x0000002a9585ad7f in jobModelPrepEvaluate
(engine=0xc4f5d489c4f61b90, job=Variable "job" is not
available. |
18:51.17 |
``Erik |
) at ../../../RF/prepmodel.c:974 |
18:51.17 |
``Erik |
974 } while( --a ); |
18:51.19 |
``Erik |
wonky |
18:51.32 |
``Erik |
bt on that is a blown stack |
18:51.53 |
Maloeran |
Groovy |
18:52.55 |
``Erik |
meh, paste.lisp.org seems to be down |
18:53.06 |
Maloeran |
rafb.net/paste of course |
18:53.17 |
``Erik |
<-- couldn't remember the url |
18:56.10 |
``Erik |
http://rafb.net/paste/results/kONUFQ20.html |
18:56.24 |
``Erik |
those seem to be the 4 cases I get on linux
2.6 amd64 |
18:56.51 |
Maloeran |
Great, thank you |
18:56.56 |
``Erik |
np |
18:57.10 |
``Erik |
I have a 2 hour meeting starting in 30
minutes... |
18:57.51 |
Maloeran |
:) Have fun! |
19:00.16 |
``Erik |
heh |
19:00.34 |
``Erik |
just letting you know in case you have any
modifications you want to see run on multi-cache machines |
19:01.56 |
``Erik |
hrm, I'm going to try to bump the thread count
to something a bit more straining... hope that linux doesn't drop
the ball on it... I know it's ok on fbsd and solaris |
19:02.25 |
``Erik |
heh, gdb might be the problem here
:) |
19:04.07 |
``Erik |
goddamn, it's slugging down ugly :( |
19:11.06 |
``Erik |
http://rafb.net/paste/results/wJKR5372.html
<-- new error |
19:11.10 |
``Erik |
128 threads |
19:11.28 |
``Erik |
that the 'job' variable is unavailable on all
the errors is... suspicious |
19:19.29 |
Maloeran |
New error, always the same pass, and
consistent with a corruption of the lists of reverse
links |
19:20.06 |
Maloeran |
128 threads, I haven't tried that yet :). I
think I can figure out what's going on, the problem has been
narrowed a bit |
20:01.31 |
*** join/#brlcad dtidrow
(n=dtidrow@c-69-255-182-248.hsd1.va.comcast.net) |
20:16.54 |
*** join/#brlcad clock_
(i=clock@84-72-63-88.dclient.hispeed.ch) |
21:09.42 |
*** join/#brlcad dtidrow_work
(n=dtidrow@host169.objectsciences.com) |
21:14.10 |
*** join/#brlcad ntroutman
(n=nathanie@prox.snu.edu) |
22:05.43 |
*** join/#brlcad Twingy
(n=justin@74.92.144.217) |
22:31.01 |
brlcad |
looks like Maloeran needs some
valgrindage |
22:31.37 |
Maloeran |
I can't reproduce the bug, it's of no use
;) |
22:31.55 |
brlcad |
valgrind will still report the memory
problems |
22:32.36 |
brlcad |
assuming it's not a platformness but actually
just a problem being masked by linux behavior |
22:33.10 |
Maloeran |
It's a problem caused by multiple cores not
sharing the same cache, probably a faulty mutex somewhere |
22:33.19 |
brlcad |
valgrind is very good in what it does, way
better than the memory bounds checkers |
22:34.36 |
Maloeran |
Indeed, I'm just skeptical of its ability to
catch a bug that requires multiple caches to occur |
22:34.52 |
Maloeran |
Anyway. If I haven't solved this by tomorrow,
I'll call a friend to ask to boot his 4 cores desktop on
Linux |
22:35.53 |
brlcad |
you've narrowed it down that far for sure, or
guessing? :) |
22:36.30 |
Maloeran |
I can never reproduce the bug when it is
running on a single core, or dual cores but with shared
cache |
22:36.59 |
brlcad |
you mean you can't reproduce the
crash |
22:38.19 |
brlcad |
point being that you could run for years on a
single-thread single core machine and never see bad behavior, yet
there still be a memory problem (that is simply masked by OS or C
lib behavior) |
22:38.42 |
brlcad |
running gentoo? |
22:38.52 |
Maloeran |
Of course so, but that memory problem might
not be a "problem" in that case just because threads all share the
same cache |
22:38.55 |
Maloeran |
Yes |
22:40.58 |
dtidrow_work |
gonna get nasty cold around here tonight
:-( |
22:41.00 |
Maloeran |
An unsafe memory instruction might just happen
to have a direct memory operand, so you'll never see a problem if
threads are executed one at a time on the same cache |
22:41.49 |
Maloeran |
And no debugger could ever figure that out,
unless you trigger the bug with threads running on distinct
caches |
22:42.22 |
brlcad |
you're already assuming that's the bug too
(and maybe it is, but it's generally not good to assume when
hunting stack corruption) |
22:42.45 |
Maloeran |
The stack corruption was caused by something
else, probably a consequence of the first bug |
22:42.51 |
brlcad |
easy enough to recompile once, run and get a
valgrind report |
22:43.11 |
brlcad |
comes up clean and then at least basic
operation is sound |
22:43.43 |
brlcad |
if you get lucky, though, it might indicate a
problem elsewhere that hadn't even been considered, or thought to
be fine |
22:44.28 |
``Erik |
heh |
22:44.33 |
Maloeran |
Syscall param write() points to uninitialized
bytes in XOpenDisplay() Eheh okay, that was unexpected |
22:57.16 |
Maloeran |
( It's not done prep'ing, I think I'll get
results before tomorrow ) |
23:10.16 |
*** join/#brlcad ntroutman
(n=nathanie@prox.snu.edu) |
23:11.30 |
Maloeran |
No error found by Valgrind,
unfortunately |
23:19.59 |
``Erik |
hum |
23:20.16 |
``Erik |
what about grabbing bochs, setting it up as a
true smp and grabbing a leenewx smp disk image |
23:20.19 |
``Erik |
and see if it can crash in that? |
23:20.32 |
``Erik |
(won't be fast, but it might be a way to
reproduce the error) |
23:49.04 |
*** join/#brlcad ntroutman_
(n=nathanie@prox.snu.edu) |