Not scanning at all, actually - that's probably where I was getting the cmakefiles.cmake reference. I use that as my initial list, so there's no scanning at all.
I mean scanning file contents, i.e., reading from disk
hierarchy traversal is nearly instantaneous these days on all platforms
Oh. I'm resetting the offset in the stream once per pass type - getting finer than that is going to get pretty tricky.
traversal is measured in hundreds of thousands of inodes per second, thousands of dirs per second
so I open the ifstream, do a bio.h check, reset seeking to 0, do the bnetwork.h check, etc.
hm
that definitely is going to involve disk caching
In principle I might be able to reuse each getline to do all the processing, but I'm leery of trying - that would involve managing multiple scanning states simultaneously, and just to save 5 or 6 seconds I doubt it's worth it.
to eliminate the effect, you'd need to parse the ifstream into a stringstream or other buffer type and do all your work on that
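for instance, something roughly like this (just a sketch -- "path" and the per-pass bodies are placeholders):
#include <fstream>
#include <sstream>
#include <string>

std::ifstream ifs(path, std::ios::binary);
std::stringstream buf;
buf << ifs.rdbuf();               // one read from disk (or cache) into memory
ifs.close();

std::string sline;
while (std::getline(buf, sline)) { /* bio.h pass */ }
buf.clear();                      // clear the eof state
buf.seekg(0);                     // rewind in memory for the next pass type
while (std::getline(buf, sline)) { /* bnetwork.h pass */ }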
Hmm. More ram usage, but none of our files are big enough to be an issue anyway on any sane system...
not even close to being an issue
the only downside would be if you don't actually have enough work, i.e., if the cost of a malloc+memcpy is more than your regex (which seems unlikely)
the memcpy can be avoided, but you'd need to use bu_mapped_file or mmap or similar and just scan the buffer instead of using getline. it's an order of magnitude faster for read-only work.
relies on the virtual memory manager to do the best thing, and it's really hard to do better oneself
in a quick test stringstream didn't do much
for what it's worth, it's very likely possible to get the repo check down to a sub-second time compiled. not necessary of course, and any improvement will be gravy.
you're already disk-cached -- you'd have to invalidate the cache to see the effect
ah
I'd still expect it to be a smidgen gain, but it's mostly saving seek time. if it's cached, though, it's all in memory already and there is no actual seeking going on
I'm probably not using it correctly - I literally dropped in the stringstream instead of the ifstream, and I doubt that's right
If you want to watch it blow up on OSX, you can try it with cmake -DREPOCHECK_TEST=ON to have "make regress-repository" use the new version.
mapped file really will be the best for this.
for myself I'd rather figure out how to get it to check all of our files defining main to make sure they call bu_setprogname
it's as simple as
struct bu_mapped_file *mf = bu_open_mapped_file(myfile, "whatever");
and just start searching (char*)mf->buf for content.
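filled out a little (still a sketch -- "filename", "needle", and the "repocheck" appl string are placeholder names):
#include <algorithm>
#include <cstring>
#include "bu/mapped_file.h"

struct bu_mapped_file *mf = bu_open_mapped_file(filename, "repocheck");
if (mf) {
    const char *start = (const char *)mf->buf;
    const char *end = start + mf->buflen;    // buflen is the file's byte count
    // bounded search, so nothing depends on the buffer being NUL-terminated
    const char *hit = std::search(start, end, needle, needle + strlen(needle));
    if (hit != end) {
        /* found "needle" at byte offset (hit - start) */
    }
    bu_close_mapped_file(mf);
}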
Don't know if I've added a new program in the last 5 years without forgetting to do that and then having to debug why the relative lookup acted weird...
bu_getline will work on that? (been a while since I used a mapped file)
need lines for the line number reporting...
you don't need bu_getline, you just use it almost like it was getline's result
just, it's all lines
instantaneously. it's a beautiful thing.
Well, if you think it's worth it I can take a look tomorrow. Need to try it on some more machines, but for myself if it's reliably <10sec (or even <30 sec) across the board that's probably good enough for now...
Yeah, I agree with you on the bu_setprogname -- that's bitten us repeatedly. ideally we shouldn't need to call it at all, but we still don't have all platforms yet.
I don't think we can get around it for OpenBSD - I couldn't find any other solution that worked reliably
It's actually pretty cool that the iwd/setprogname pairing does work - I think that might be an unusual capability
starseeker said:
Well, if you think it's worth it I can take a look tomorrow. Need to try it on some more machines, but for myself if it's reliably <10sec (or even <30 sec) across the board that's probably good enough for now...
if one of your goals/reasons for transcoding it was to make it faster, then it's definitely worth it. I think it would be worthwhile simply because the code definitely has a 10x cyclomatic complexity increase, so we should at least take advantage of the decreased iterative testing times it'll give.
if it can be made fast enough, then we might even be able to break up the tests themselves so it's not a compound test
Well, kinda - I wanted to get to a place where we don't see those crazy outliers that sometimes break the CI results with a "timed out" fail
(and have it run on Windows...)
breaking it up will certainly do that too :)
and it could run in parallel
I could probably do that now, actually (running the individual tests bit) - didn't bother since it's no longer a drop-in replacement for repository.sh when I do that, but shouldn't be too hard.
the only reason all those tests were munged together was because of shell scripting I/O performance limitations. I would have needed to use shared memory or ports/socket communication to break it up effectively, which is a pain in shell scripts.
@Sean Most of the complexity is to support threading - if you want to go simpler and single threaded it can probably be about the same or maybe a little simpler than the shell script
threading o.O
Launches the test suite against multiple files in parallel
what complexity are you referring to?
I see no threading in repocheck.cpp
look around 748 (you may need to svn update if you haven't in the last few hours)
alright, I gotta call it a night
ah, I see what you mean now
yeah, I think it should be single threaded..
It did make a difference, at least in earlier testing - maybe less so now.
this is already turning 1-line shell script commands into 100 lines of C++ logic, which has its obvious benefits, but I don't think it's healthy to add to that complexity unnecessarily. it won't be as maintainable.
repository.sh is 382 lines, repocheck.cpp is (currently) 885 - more like 3x overall...
repository.sh has an unrealistic amount of error checking for typical scripting
but fair enough since even the error checking and logging is transcoded
the script could be distilled down to something like 10 lines, though, one grep for each test
<snort> if we don't factor in readability of the lines, sure you can win that way ;-)
it's not a competition
Well, maybe in the sense of competing solutions - the C++ code has to be good enough to warrant replacement of an existing, working solution
Without, as you say, insane maintenance burdens being taken on as a result.
readability of both is dominated by reader experience. assuming an experienced programmer, the difference here is going to predominantly be lines of code and potential for error, long-term maintainability
cyclomatic complexity of c++ is nowhere near as low as any higher level language including shell script. it's way higher.
When I'm doing the coding you can add a 2x multiplier for error potential to shell scripting just due to spaces and special characters in path names.
it's a win in the other benefits, portability and performance foremost
Anyway, I'll crank it back to the single threaded version tomorrow and see if mapped files do anything helpful for performance.
any single developer's experience or preference, mine and yours included, shouldn't be the dominant calculus in my opinion. that leads to things like sticking with tcl/tk.
heh. In fairness, Tcl/Tk was the rational choice in 1990. They just decided they liked it in 1990 and stayed put too long.
alright, uncle. need sleep
certainly, and shell scripts can still be written in a fraction of the time it takes in c++; they have their place
cya
fyi, jenkins is barking a failure: make[3]: don't know how to make embedded_licenses.txt. Stop
just saw c75789 .. that's technically wrong too. the correct form of a URI reference for a relative reference is simply the relative path (i.e., no file: prefix). see https://en.wikipedia.org/wiki/Uniform_Resource_Identifier
Was considering the "root" of the reference system to be the root of the BRL-CAD src tree (not the filesystem) so in that sense they're absolute. However, if you prefer something like the reuse.software format we can do that - I just want to pick one answer, set it up, and not worry about it anymore.
stale build tree caught me - missed a variable. r75796 should do it.
Well, that gets to < 2s on Linux, but still ~15 on FreeBSD.
@Sean is there any equivalent to perf on FreeBSD?
starseeker said:
Was considering the "root" of the reference system to be the root of the BRL-CAD src tree (not the filesystem) so in that sense they're absolute. However, if you prefer something like the reuse.software format we can do that - I just want to pick one answer, set it up, and not worry about it anymore.
I know what you were considering; it was clear. Just letting you know that it's technically wrong per the URI spec if that was the reason for using them. There's little reason to prefer file:/path/to/file over file:path/to/file and vice versa. The actual correct form is simply path/to/file or ./path/to/file as a URI reference, which is combined with a root path to create a valid URL, like file:/usr/home/brlcad/path/to/file or file:`pwd`/$filepath or file:${PWD}/${filepath} etc.
Note that of the two relative forms, lots of software does support the latter (file:rel/path/to/file) despite it being invalid (e.g., I believe some terminals will handle it)
starseeker said:
Sean is there any equivalent to perf on FreeBSD?
yes, it's called the pmc interface on freebsd. I think the main comparable tool is hwpmc but I have not tried it
https://people.freebsd.org/~jkoshy/projects/perf-measurement/
looks like latest docs are https://wiki.freebsd.org/PmcTools
Looks like pmcstat is the user-facing side, but I can't get it to kick off...
pmcstat: ERROR: Initialization of the pmc(3) library failed: No such file or directory
Not sure if that's my fault or if there's something missing on the system...
@starseeker just looking at 75799, you're likely not going to see a performance improvement with that approach. you're making three copies of every file. without copy-on-write, that could even be slower than what you had before.
@Sean I didn't see a way with mapped file to go line-by-line - it's just a big char buffer. That won't work for this...
the benefit of mapped files is when you use the mmap->buf directly. you do that to completely avoid user memory allocation, and it avoids unnecessary disk seeking.
sure it can
Short of manually storing offsets along the buffer...
especially with regexes, it's geared for that use case.
not following? why would you store offsets?
(we're in the wrong topic - one sec)
For reporting I need line numbers, which means I'm processing per-line, not per file string
there, fixed
file stream rather
it seems to be literally the bytes in a buffer, which isn't enough structure by itself
just because you need line numbers doesn't mean you have to process per-line
?
not following
C/C++ literally has no concept of lines, it's a construct by convention. a line only exists because we say it exists as denoted by special character values.
there is std api that implements that convention, but it's not rote and not the only way
the performance gain is had from just keeping track of where things are in the ->buf
you can certainly find a match with a regex regardless of newlines or with newlines being taken into account
for example
to find a line number, you just count newlines backwards from a match point (of course, probably would make sense to wrap that 2-liner into a whatsthislinenumber(off_t position) function)
but that happens entirely in memory, without a memory copy, so it's instantaneous compared to the alternatives
there's probably a c++ helper, but the C way would be to iteratively call strchr() from a match point
or literally for (i=pos; i>0; i--) if (buf[i] == newline) linenumber++;
just with strchr, you might get lucky and get a vectorized implementation
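roughly like this, for instance (a sketch -- the helper name, "start"/"end", and the regex "expr" are just illustrative placeholders):
#include <regex>
#include <cstddef>

// count newlines before the match offset to get a 1-based line number
static size_t
pos_to_line(const char *buf, size_t pos)
{
    size_t line = 1;
    for (size_t i = 0; i < pos; i++)
        if (buf[i] == '\n')
            line++;
    return line;
}

// usage against the mapped buffer, no per-line copies:
std::cmatch m;
if (std::regex_search(start, end, m, expr)) {    // start/end bound the (char *)buf
    size_t lineno = pos_to_line(start, (size_t)m.position(0));
    /* report the hit on line 'lineno' */
}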
groan... that's a lot of really finicky overhaul for a few seconds, assuming I can get it to work - that would involve ripping out virtually all the std::string usage.
<shrug> would be a REALLY good real use case to learn about performance, but certainly up to you
this is applicable everywhere in the code, and particularly relevant given how much you like to use std::string and the stl containers.
also it doesn't get any simpler than this.
but you're also right that it's certainly not necessary for this unit test, just rare to have learning opportunities isolated like this.
for what it's worth:
std::string fbuff((char *)ifile->buf); // A) this potentially reads entire file from disk, mallocs memory, and copies the entire memory span
std::istringstream fs(fbuff); // B) this potentially mallocs more memory and copies memory, albeit typically in blocksizes but could result in a full allocation duplicate of fbuff
...
while (std::getline(fs, sline)) ... // C) this necessarily causes an allocation, albeit typically only one blocksize worth
<nod> - there's also some risk that the regex_match calls may malloc under the hood if I feed them const char * inputs - not sure about that
gcc and llvm have optimizations that try to defer the allocations in A and B, but they're not guaranteed or portably reliable
could be wrong, but I don't think so
regfree() releases things allocated by regcomp()
The C library probably doesn't - was thinking about the C++11 implementations.
I'd be surprised if they do too... that's old behavior, there's no need for it to allocate unless you do something like request the match in a new object
which you definitely don't need for this
I mean the implementation certainly could under the hood if they wanted, but that's true of the C version and nearly any API that doesn't explicitly say they don't make syscalls.
@Sean Is r75802 headed in the direction you were suggesting?
yeah, that's definitely in the right direction! awesome. fast study!
you could make pos_to_line_num a bit faster as you don't need to calculate offset. can just check NULL result.
the fastest version of this probably involves carrying the scan position forward start-to-end so you can accumulate instead of parsing all the way back to the start repeatedly. still should be really fast, but O(n) vs O(n^2) in the worst case
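something along these lines, maybe (a sketch -- the names are made up):
// carry the scan position and line count forward between lookups so each
// byte of the buffer is examined once; call with offsets in increasing order
struct line_counter {
    const char *buf;   /* start of the file buffer */
    size_t pos;        /* bytes already counted */
    size_t line;       /* 1-based line number at 'pos' */
};

static size_t
line_at(struct line_counter *lc, size_t match_pos)
{
    for (; lc->pos < match_pos; lc->pos++)
        if (lc->buf[lc->pos] == '\n')
            lc->line++;
    return lc->line;
}

/* per file: struct line_counter lc = { (const char *)mf->buf, 0, 1 }; */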
you also might not gain much from the bio.h precheck, might even be slowing it down, but that's easy enough to test and observe.
@Sean here's the weird part - compared to the prior commit, this version is slower on FreeBSD
the only additional work is the repeated backscanning, so pos_to_line_num() could be a bigger issue than I gave it credit for. not that surprising -- it's going to be more sensitive to cpu. I can try a profile if you want me to dig deeper, or install the profiling tools if they're not there if you want to give it a go.
I'll fiddle with it a bit first, if you think that's the likely source. I don't trust FreeBSD's regex performance, but my code is the likelier issue.
Would it be worth making a libbu function in the bu/mapped_files.h API that wraps the "correct" position to line number translation, once that's done? Would be a waste to have to re-invent it down the road if we're going to use mapped file more.
I'm not quite following what you're proposing about not calculating the offset. Don't I have to know that to know when to stop accumulating line counts?
perf on linux puts most of the time in the bu_open_mapped_file command, but it's also much faster than FreeBSD so that's probably not a helpful indication.
don't think it belongs with mapped files, but could exist as some sort of helper function elsewhere, bu_str_ maybe. I'd wait until there's a second use case. More importantly, though -- the faster version of this is probably going to require a different function signature, different args.
hm, thinking about it some more, there's potential for a really big performance boost by only loading the top of the file. that works for the header checks but I guess wouldn't be so great for some of the other checks.
I'm only checking the first 500 lines for some of the checks...
Even with the C++-isms I'm sub-second on my ubuntu box, so I'm going to go ahead and wire it in - if nothing else we should be able to get rid of that 600 second timeout special case...
cool.
Hah - taught it to look for bu_setprogname in files with main()
Not totally debugged yet - it's not catching them all - but progress
Ah, right - int may be on a different line. That's more like it - 303 that don't call it
<rolls up sleeves>
/me peels eyeballs off the monitor and heads for bed
@starseeker it looks like you changed some of the repository regular expressions, was there a problem?
Not really - adjusted slightly what I was checking at which times, mostly
Had to un-quote some stuff as well
just noticing some of them will match differently now
which ones?
not sure if it was intentional or fixing a mismatch
looks like you fixed one of the ones changed in 75812
I think in one or two cases I simplified while testing, I may have forgotten to put back one or two of the more specific regex strings
I know that was one of those cases, there may be others I missed...
matching space chars is rather different from matching [[:space:]] ... :)
that's not a simplification
Heh - I mean simplifying the regex expression while I was debugging setting them up in C++
When translating the longer ones, a single-char misstep could wipe out badly - if I couldn't see what the problem was, the first step was to take it back down to something I could readily parse by eye to make sure I hadn't made some more basic mistake.
I introduced each problem deliberately into my source tree to detect them, so there is some positive evidence they'll catch things
(runs on Windows in <10s, btw)
in doing so, I think some matching behavior has changed. could be wrong, but the regular expressions for the function matches and platform checks look different. some less aggressive, some more aggressive.
great that it's running fast
The one behavior I know has changed is that platform symbols are counted per-symbol, not per line - so some of the lines with WIN32 and CYGWIN on one line, for example, are now registering 2 counts instead of 1
you're also catching mixed case now, which it wasn't doing before (not saying that's good or bad, just different)
you mean the platform symbols? I thought that was still matching just all upper and all lower...
can I make a request on the function tests -- can you make it so the things to match are only on one line?
maybe I misread
/me checks
Oh, you mean the lists of specific functions and platforms? sure
list of platforms is fine, they're one per line
functions are on two lines
worried that one will get updated and not the other over time
@Sean where are you looking? not following...
api_func_strs
oh, wait I see what you're doing now
Right - the static array. Are you referring to the api_exemptions?
no, I thought strings were being combined, but I get it now