IRC log for #brlcad on 20170713

`00:19.42`	`*** join/#brlcad infobot (~infobot@rikers.org)`
`00:19.42`	`*** topic/#brlcad is GSoC students: if you have a question, ask and wait for an answer ... responses may take minutes or hours. Ask and WAIT. ;)`
`08:55.54`	`*** join/#brlcad infobot (~infobot@rikers.org)`
`08:55.54`	`*** topic/#brlcad is GSoC students: if you have a question, ask and wait for an answer ... responses may take minutes or hours. Ask and WAIT. ;)`
`09:22.05`	`*** join/#brlcad mdtwenty[m] (mdtwentyma@gateway/shell/matrix.org/x-iwpdlhgermucyhhk)`
`09:23.32`	`*** join/#brlcad Caterpillar2 (~caterpill@unaffiliated/caterpillar)`
`09:59.54`	`*** join/#brlcad merzo (~merzo@252-22-132-95.pool.ukrtel.net)`
`10:35.58`	`*** join/#brlcad teepee (~teepee@unaffiliated/teepee)`
`11:15.02`	`*** join/#brlcad teepee (~teepee@unaffiliated/teepee)`
`11:31.33`	`*** join/#brlcad teepee (~teepee@unaffiliated/teepee)`
`11:37.21`	`Notify`	`BRL-CAD:Amritpal singh * 10098 /wiki/User:Amritpal_singh/GSoC17/logs: /* Coding Period */`
`11:39.42`	`Notify`	`BRL-CAD:Amritpal singh * 10099 /wiki/User:Amritpal_singh/GSoC17/logs: /* Coding Period */`
`12:01.06`	`*** join/#brlcad teepee (~teepee@unaffiliated/teepee)`
`12:44.16`	`*** join/#brlcad gabbar1947 (uid205515@gateway/web/irccloud.com/x-jhwmzfdcblkcoioz)`
`12:56.45`	`*** join/#brlcad kintel (~kintel@unaffiliated/kintel)`
`13:37.12`	`*** join/#brlcad deep-book-gk_ (~1wm_su@94.242.252.58)`
`13:37.42`	`*** part/#brlcad deep-book-gk_ (~1wm_su@94.242.252.58)`
`13:42.33`	`*** join/#brlcad yorik (~yorik@2804:431:f720:9892:290:f5ff:fedc:3bb2)`
`13:46.45`	`*** join/#brlcad teepee (~teepee@unaffiliated/teepee)`
`15:17.34`	`*** join/#brlcad vasc (~vasc@bl4-6-201.dsl.telepac.pt)`
`15:19.21`	`*** join/#brlcad vasc (~vasc@bl4-6-201.dsl.telepac.pt)`
`15:19.24`	`vasc`	`hello mdtwenty[m]`
`15:20.28`	`mdtwenty[m]`	`hello`
`15:20.38`	`mdtwenty[m]`	`sorry about yesterday, i was having lunch`
`15:21.02`	`vasc`	`no problem. i had to leave early in the afternoon as well.`
`15:21.14`	`vasc`	`how's the code going?`
`15:23.24`	`mdtwenty[m]`	`hm so after translating the bits of the boolean tree i could get the operators scene to work for some views (i found later that some views have a strange behaviour)`
`15:23.53`	`vasc`	`so what's the difference between translating the bits and not translating the bits, in terms of visual output?`
`15:23.59`	`Notify`	`BRL-CAD Wiki:95.18.89.88 * 10100 /wiki/User:Mariomeissner/logs:`
`15:25.27`	`mdtwenty[m]`	`if i don't translate the bits, the renders always differ, because the boolean tree changes and so do the partitions evaluated`
`15:26.03`	`mdtwenty[m]`	`but when i translate the tree the output is fixed`
`15:26.26`	`vasc`	`ok`
`15:26.48`	`vasc`	`well`
`15:27.22`	`vasc`	`i think it's like this. we clean up the code a bit and try to put it into the SVN branch.`
`15:27.37`	`vasc`	`there's still some bugs, but i think it's close to optimal.`
`15:28.02`	`vasc`	`at least as far as an alpha release can go`
`15:28.51`	`vasc`	`from now on, make your code against the opencl branch: https://svn.code.sf.net/p/brlcad/code/brlcad/branches/opencl/src/librt/`
`15:28.52`	`gcibot`	`[ p/brlcad/code - Revision 69941: /brlcad/branches/opencl/src/librt ]`
`15:29.09`	`mdtwenty[m]`	`yes, i think so!`
`15:29.17`	`vasc`	`download that, and make a patch against that version`
`15:29.27`	`vasc`	`then i'll review it and we'll apply it`
`15:29.55`	`vasc`	`this isn't good enough to go in the trunk yet, but i think we need to keep it stored someplace.`
`15:30.50`	`*** join/#brlcad merzo (~merzo@136-3-133-95.pool.ukrtel.net)`
`15:31.12`	`mdtwenty[m]`	`will do that!`
`15:31.31`	`mdtwenty[m]`	`this is one example of a bug that is happening`
`15:32.03`	`mdtwenty[m]`	`uploaded an image: operators.png (138KB) <https://matrix.org/_matrix/media/v1/download/matrix.org/QRtTVgTwEFAWVKOtedoSguQa>`
`15:32.03`	`mdtwenty[m]`	`in some views these "holes" happen`
`15:32.53`	`vasc`	`that's really weird.`
`15:33.27`	`mdtwenty[m]`	`hm it is not that weird with the wireframe`
`15:33.31`	`mdtwenty[m]`	`sec`
`15:34.05`	`vasc`	`it's like it's evaluating the tree wrong?`
`15:34.28`	`vasc`	`the segments seem ok`
`15:35.24`	`mdtwenty[m]`	`uploaded an image: operators_wire.png (192KB) <https://matrix.org/_matrix/media/v1/download/matrix.org/QEPwQVVQWceIapNElAnxOMfn>`
`15:35.37`	`mdtwenty[m]`	`it seems like the geometry in yellow is interfering`
`15:35.57`	`vasc`	`yeah that helps. like i thought, it's like it's evaluating the boolean csg wrong but the segments seem to be ok.`
`15:36.09`	`mdtwenty[m]`	`this can be an error with the regiontable, or the lack of support for overlapping partitions`
`15:36.22`	`mdtwenty[m]`	`i think, not sure yet`
`15:37.13`	`vasc`	`well once we get this into svn, we need to fix that issue with the region's primitive id translation`
`15:37.41`	`vasc`	`and we need to get some kind of translator so that we can compare the output of the intermediate steps in the opencl and ansi c code.`
`15:37.59`	`vasc`	`or just fix the bugs.`
`15:38.38`	`vasc`	`lack of support for overlapping partitions?`
`15:38.58`	`vasc`	`in which part of the code is that supposed to be in?`
`15:39.36`	`mdtwenty[m]`	`it's part of the rt_boolfinal function`
`15:39.39`	`vasc`	`oh`
`15:39.53`	`vasc`	`but the boolweave is feature complete right?`
`15:40.51`	`mdtwenty[m]`	`i didn't implement it yet because i was trying to understand the FASTGEN regions`
`15:41.18`	`vasc`	`ah. THAT`
`15:41.23`	`mdtwenty[m]`	`yeah i think boolweave is complete now`
`15:41.32`	`vasc`	`ignore those. in fact just strip the FASTGEN code from the opencl port.`
`15:42.15`	`vasc`	`FASTGEN is like legacy support for an older solid modelling system that the US military used from what I understand.`
`15:42.35`	`vasc`	`fact is, i didn't even port the FASTGEN primitives.`
`15:42.48`	`vasc`	`so it's kinda pointless to implement the FASTGEN csg code.`
`15:43.41`	`vasc`	`if for whatever reason it's necessary to implement FASTGEN someday, then we'll think about it.`
`15:43.55`	`mdtwenty[m]`	`hm i see. i was not sure if it was important for the ocl code so thanks for clarifying`
`15:44.24`	`vasc`	`BRL-CAD has a FASTGEN import module that imports FASTGEN scenes.`
`15:44.44`	`vasc`	`the database format for FASTGEN is kinda weird. it has like special primitives and the way the rendering works is also different.`
`15:46.15`	`Notify`	`BRL-CAD Wiki:Mariomeissner * 10101 /wiki/User:Mariomeissner/logs:`
`15:46.32`	`mdtwenty[m]`	`ok`
`15:46.55`	`mdtwenty[m]`	`the other day you said something about storing the results of the boolean evaluation in the struct partition`
`15:47.48`	`mdtwenty[m]`	`which i currently do`
`15:48.18`	`vasc`	`that should probably be kept in a separate data structure`
`15:48.36`	`vasc`	`or we should just make the evaluation work faster so that we don't need to cache that in the first place.`
`15:50.20`	`*** join/#brlcad skat00sh (uid103741@gateway/web/irccloud.com/x-owitdxmtukjpgbew)`
`15:51.35`	`vasc`	`btw the opencl branch in svn already has the boolean tree code.`
`15:51.47`	`vasc`	`so you might get merge conflicts because of that.`
`15:52.02`	`vasc`	`you'll have to manually apply the patch.`
`15:53.11`	`mdtwenty[m]`	`ok thanks for the heads up`
`15:53.19`	`mdtwenty[m]`	`will apply it manually`
`16:01.33`	`mdtwenty[m]`	`i will just remove some debug code from the code and will clean it a bit before submitting the patch to the opencl branch`
`16:02.23`	`mdtwenty[m]`	`or should i finish rt_boolfinal first? (overlapping partitions)`
`16:03.37`	`vasc`	`well`
`16:04.09`	`vasc`	`just show me what you have`
`16:04.29`	`vasc`	`there should be some debug and log code in there`
`16:05.52`	`vasc`	`some of those should be kept i think`
`16:15.01`	`mdtwenty[m]`	`yeah it's probably a good idea to have some debug code in there`
`16:15.12`	`mdtwenty[m]`	`posted a file: rt_bool_final.patch (62KB) <https://matrix.org/_matrix/media/v1/download/matrix.org/uomSumphCREJqrcPmarkhjYX>`
`16:15.19`	`mdtwenty[m]`	`this is what i got right now`
`16:15.50`	`mdtwenty[m]`	`i'm already using a dynamic bitvector for the regiontable`
`16:16.30`	`vasc`	`yeah but that's against trunk/ not branches/opencl right?`
`16:16.51`	`mdtwenty[m]`	`ah yes it is not against the opencl branch yet`
`16:17.44`	`mdtwenty[m]`	`i just checked out the opencl branch from the svn, so if you give me some time i can apply the patch manually and send it to you`
`16:18.16`	`vasc`	`there also seems to be some noise`
`16:18.28`	`vasc`	`like in the rendering function rt.cl`
`16:18.44`	`vasc`	`because you indented some code differently some things are reported as changed even though the code is the same`
`16:20.43`	`mdtwenty[m]`	`oh i see.. will fix that`
`16:30.08`	`vasc`	`this code is missing in opencl boolweave:`
`16:30.16`	`vasc`	`if (segp->seg_stp->st_aradius < INFINITY &&`
`16:30.16`	`vasc`	`<PROTECTED>`
`16:30.16`	`vasc`	`<PROTECTED>`
`16:30.18`	`vasc`	`...`
`16:32.01`	`vasc`	`also this is kinda strange:`
`16:32.02`	`vasc`	`<PROTECTED>`
`16:32.32`	`vasc`	`<PROTECTED>`
`16:33.33`	`vasc`	`is that the circular 'pointer' in the head and tail of the list again?`
`16:33.55`	`vasc`	`shouldn't it be, like, 'j = head_pp' or whatever?`
`16:37.33`	`mdtwenty[m]`	`hum yes, j = head_pp is equivalent, but shorter`
`16:38.00`	`mdtwenty[m]`	`i already have it that way in rt_boolfinal`
`16:39.15`	`vasc`	`also name those functions boolweave and boolfinal`
`16:39.22`	`vasc`	`don't use different names than the ANSI C names`
`16:39.35`	`vasc`	`it makes it harder to understand which is which`
`16:40.03`	`vasc`	`and yeah eval_partitions/rt_boolfinal needs to be cleaned up`
`16:40.13`	`vasc`	`and some things need to be refactored.`
`16:43.45`	`vasc`	`yeah we'll need an overlap handler...`
`16:44.04`	`vasc`	`ah well.`
`16:44.13`	`vasc`	`fix the things i said and make a patch against branches/opencl`
`16:44.33`	`vasc`	`i'll then apply the boolweave code, but the boolfinal code still needs some work`
`16:45.35`	`mdtwenty[m]`	`sure will do that`
`17:08.19`	`Stragus`	`Hrm... perhaps these INFINITY should be replaced with FLT_MAX or DBL_MAX`
`17:08.39`	`Stragus`	`Some chips become up to 900 times slower when performing a floating point operation where an infinity or NaN is involved`
`17:08.51`	`Stragus`	`(including comparisons)`
`17:10.01`	`vasc`	`well, we want bug for bug compatibility with the ANSI C code though.`
`17:10.27`	`Stragus`	`Right. That comment was mostly for the CPU side of things`
`17:10.31`	`vasc`	`i mean if we wanted speed we wouldn't be using doubles on a GPU in the first place.`
`17:10.38`	`Stragus`	`Eh, indeed`
`17:11.10`	`vasc`	`still it's a reasonable argument, considering we have some defines for doubles as floats as an option.`
`17:11.13`	`Stragus`	`But even modern CPU Intel chips are 200-300 times slower with infinities. AMD doesn't care about inf/NaN`
`17:11.47`	`vasc`	`so you say #undef INFINITY and #define INFINITY DBL_MAX?`
`17:12.30`	`Stragus`	`Basically yes, though I would personally prefer some custom foo_MAX macro rather than replacing INFINITY`
`17:13.39`	`vasc`	`in opencl that would be MAXFLOAT it seems`
`17:14.09`	`vasc`	`ah no`
`17:14.11`	`vasc`	`that's SP`
`17:14.15`	`Stragus`	`nods`
`17:15.04`	`vasc`	`doesn't the compiler do those kinds of optimizations if you use ffast-math or something?`
`17:15.18`	`Stragus`	`No, that would change the behavior of the code`
`17:16.28`	`vasc`	`i think ffast-math disables those checks though`
`17:16.28`	`Stragus`	`I'm not currently aware of the performance of Inf/NaN on GPUs, but it's a good idea to avoid these in any case`
`17:16.45`	`vasc`	`https://gcc.gnu.org/wiki/FloatingPointMath`
`17:16.46`	`gcibot`	`[ FloatingPointMath - GCC Wiki ]`
`17:17.20`	`vasc`	`"In addition GCC offers the -ffast-math flag which is a shortcut for several options, presenting the least conforming but fastest math mode. It enables -fno-trapping-math, -funsafe-math-optimizations, -ffinite-math-only, -fno-errno-math, -fno-signaling-nans, -fno-rounding-math, -fcx-limited-range and -fno-signed-zeros."`
`17:17.20`	`vasc`	`-ffinite-math-only`
`17:17.56`	`Stragus`	`Hrm -ffinite-math-only, indeed`
`17:18.19`	`Stragus`	`Though I'm aware of a performance gain when I removed infinity checks on code that was using ffast-math several years ago`
`17:19.16`	`Stragus`	`And assuming you do want to check for overflows, a check against DBL_MAX is still a good idea, eh`
`17:19.19`	`vasc`	`well it wouldn't be the first time a compiler wouldn't behave like it's supposed to.`
`17:21.53`	`mdtwenty[m]`	`hm, should i use DBL_MAX then?`
`17:22.26`	`vasc`	`keep it as is for now`
`17:22.26`	`vasc`	`we don't want even more weird behavior right now.`
`17:22.26`	`vasc`	`leave the optimizations for later.`
`17:22.38`	`vasc`	`just make a note for it.`
`17:22.54`	`mdtwenty[m]`	`ok :)`
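
A minimal C sketch of the custom-maximum idea Stragus describes above, kept only as a note since the decision in the channel was to leave INFINITY alone for now. The macro name `SKETCH_RADIUS_MAX` and the helper `radius_is_bounded` are hypothetical, not BRL-CAD identifiers. The sketch assumes an unbounded solid would store the finite sentinel in place of INFINITY, which changes behavior relative to the ANSI C code and is the compatibility concern vasc raises.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical name for illustration; not an identifier from BRL-CAD. */
#define SKETCH_RADIUS_MAX DBL_MAX

/* Returns 1 when the approximating radius is a usable finite bound.
 * Assumption: a solid with no finite bound stores SKETCH_RADIUS_MAX
 * rather than INFINITY, so no infinity ever enters the comparison. */
static int radius_is_bounded(double aradius)
{
    assert(aradius >= 0.0); /* a radius is never negative */
    return aradius < SKETCH_RADIUS_MAX;
}
```

Note that a radius equal to the sentinel is reported as unbounded, whereas a comparison against INFINITY would accept `DBL_MAX` as a finite value.
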
`17:42.30`	`vasc`	`given the amount of things which need to be optimized...`
`17:42.45`	`vasc`	`we'll go for algorithmic improvements first.`
`17:46.54`	`vasc`	`there's lots of O(N^2) things, spurious memory usage and access and things like that which need to be fixed first`
`17:48.00`	`vasc`	`besides i'm not sure the compiler doesn't do that in the first place`
`17:48.21`	`vasc`	`without looking at the assembly code output i wouldn't make changes like that.`
`17:51.12`	`Stragus`	`Ah right... but I think perhaps that should all have been done before porting to GPUs?`
`17:52.47`	`Stragus`	`Optimization and debugging on GPUs is more troublesome, it's easier to settle the algorithm and code on CPUs first`
`17:52.47`	`Stragus`	`And I'm not entirely convinced about GPU performance considering the need for double precision, compared to CPU AVX2`
`17:53.18`	`vasc`	`well. this is OpenCL. it runs on the CPU as well. in fact mdtwenty[m] has been running and testing it there.`
`17:54.03`	`vasc`	`and i did prototype the boolean evaluator in ANSI C before mdtwenty[m] ported it over.`
`17:54.30`	`vasc`	`the boolean weaving code is also a relatively straightforward port.`
`17:54.38`	`Stragus`	`All right then`
`17:54.52`	`vasc`	`the boolfinal might not be, because i suspect the current way of doing it isn't optimal. but mdtwenty[m]'s still working on that.`
`17:55.33`	`vasc`	`also it's not that GPUs are slow at doubles. it's that NVIDIA cripples the budget GPUs.`
`17:56.43`	`vasc`	`have you looked at the DP FLOPS of the V100?`
`17:57.08`	`Stragus`	`Sure sure, it's all right in the $3k GPUs`
`17:57.09`	`vasc`	`7014 GFLOPS on the PCIe V100`
`17:57.16`	`vasc`	`DP GFLOPS`
`17:58.10`	`Stragus`	`On consumer GPUs, I have had better performance using dual-float math instead of doubles (for similar accuracy)`
`17:58.16`	`vasc`	`how many GFLOPS do those Skylake server processors or the AMD Epyc have?`
`17:58.52`	`vasc`	`it says in this article`
`17:58.56`	`vasc`	`http://www.eetimes.com/document.asp?doc_id=1331988&page_number=2`
`17:58.57`	`gcibot`	`[ Intel Skylake Counters AMD Epyc \| EE Times ]`
`17:59.11`	`vasc`	`32 FLOPS/cycle`
`18:00.14`	`vasc`	`DP FLOPS`
`18:00.27`	`vasc`	`28 cores`
`18:00.30`	`vasc`	`3.6 GHz`
`18:01.22`	`Stragus`	`So about half of a $3k GPU`
`18:01.28`	`vasc`	`3225.6 DP GFLOPS/peak?`
`18:01.33`	`Stragus`	`Right`
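
The peak figure vasc quotes follows from multiplying the three numbers given above. A small C sketch of that arithmetic: the function name `peak_gflops` is made up for illustration, and the inputs are the values quoted in the channel, not independently checked specifications.

```c
#include <assert.h>

/* Theoretical peak in GFLOPS: FLOPS per cycle per core, times the
 * number of cores, times the clock in GHz. Hypothetical helper. */
static double peak_gflops(double flops_per_cycle, int cores, double clock_ghz)
{
    assert(cores > 0);
    return flops_per_cycle * (double)cores * clock_ghz;
}
```

With the quoted values, `peak_gflops(32.0, 28, 3.6)` comes to about 3225.6, a little under half of the 7014 DP GFLOPS quoted for the PCIe V100.
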
`18:02.30`	`vasc`	`https://en.wikichip.org/wiki/intel/xeon_platinum/8180`
`18:02.31`	`gcibot`	`[ Xeon Platinum 8180 - Intel - WikiChip ]`
`18:02.37`	`vasc`	`Release Price $10009.00`
`18:02.49`	`vasc`	`GPU wins that one.`
`18:03.20`	`vasc`	`let's see how much the entry level costs.`
`18:03.21`	`Stragus`	`Screw you Intel :), I'm waiting for dual-socket Epyc motherboards to upgrade my desktop`
`18:04.06`	`vasc`	`https://en.wikichip.org/wiki/intel/xeon_bronze`
`18:04.07`	`gcibot`	`[ Xeon Bronze - Intel - WikiChip ]`
`18:04.09`	`vasc`	`those are cheaper.`
`18:04.47`	`vasc`	`also half the clockspeed.`
`18:04.47`	`Stragus`	`And I know professional grade GPUs are better at double precision. But on a typical desktop machine with a gaming GPU, it's not so clear`
`18:05.06`	`vasc`	`yeah, it's a good question, what's better on a typical desktop.`
`18:05.29`	`vasc`	`which is one reason why we went for opencl and not cuda, despite all the extra work in it.`
`18:05.35`	`vasc`	`because of the crap libraries.`
`18:06.30`	`Stragus`	`The best double-precision-like performance I had on gaming GPUs was a healthy mix of regular floats and dual-floats`
`18:06.59`	`Stragus`	`Just in case you could use that, here's my code for double-float arithmetics: http://www.rayforce.net/ddm.h`
`18:09.21`	`vasc`	`what's the license?`
`18:09.54`	`vasc`	`oh it uses sse`
`18:09.54`	`Stragus`	`"Do whatever you want with it", I should put a header`
`18:09.54`	`Stragus`	`No no, that was just some optional optimization attempt`
`18:10.25`	`Stragus`	`The double-double math is also useful when you need higher accuracy than double but with decent performance`
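
For readers unfamiliar with the dual-float technique Stragus mentions, here is a minimal C sketch of the standard error-free addition it is usually built on (Knuth's two-sum). It is not taken from ddm.h, whose contents are not shown in this log. The type `ff_t` and the function names are hypothetical, and the sketch assumes the compiler evaluates float expressions in float precision without fast-math reassociation.

```c
#include <assert.h>

/* A value represented as an unevaluated sum hi + lo of two floats. */
typedef struct { float hi; float lo; } ff_t;

/* Error-free sum: hi is the rounded a + b, lo is the rounding error. */
static ff_t ff_two_sum(float a, float b)
{
    ff_t r;
    float bv;
    r.hi = a + b;
    bv = r.hi - a;                        /* the part of b that reached hi */
    r.lo = (a - (r.hi - bv)) + (b - bv);  /* what rounding dropped */
    return r;
}

/* Adds two dual-float values and renormalizes the result. */
static ff_t ff_add(ff_t x, ff_t y)
{
    ff_t s = ff_two_sum(x.hi, y.hi);
    float e = s.lo + (x.lo + y.lo);       /* collect the low-order parts */
    ff_t r;
    r.hi = s.hi + e;
    r.lo = e - (r.hi - s.hi);             /* valid because |e| <= |s.hi| */
    return r;
}
```

Adding `1e-8f` to `1.0f` in plain float arithmetic loses the small term entirely, whereas `ff_add` keeps it in the `lo` component. The same construction with `double` in place of `float` gives the double-double variant mentioned for accuracy beyond double precision.
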
`18:10.48`	`vasc`	`ok i'll keep this under my hat`
`18:10.51`	`vasc`	`:-)`
`18:10.56`	`vasc`	`now really bbl`
`18:11.02`	`Stragus`	`:) Okay`
`18:24.54`	`*** part/#brlcad mdtwenty[m] (mdtwentyma@gateway/shell/matrix.org/x-iwpdlhgermucyhhk)`
`20:13.44`	`*** join/#brlcad merzo (~merzo@136-3-133-95.pool.ukrtel.net)`
`21:28.54`	`*** join/#brlcad infobot (~infobot@rikers.org)`
`21:28.55`	`*** topic/#brlcad is GSoC students: if you have a question, ask and wait for an answer ... responses may take minutes or hours. Ask and WAIT. ;)`
`21:34.44`	`*** join/#brlcad kintel (~kintel@unaffiliated/kintel)`
`21:45.21`	`Notify`	`BRL-CAD:starseeker * 69942 (brlcad/trunk/misc/CMake/BRLCAD_Targets.cmake brlcad/trunk/src/libbu/CMakeLists.txt): Tweak astyle validation logic`

Generated by irclog2html.pl Modified by Tim Riker to work with infobot.