Stream: brlcad

Topic: GitHub


view this post on Zulip Sean (Sep 10 2019 at 18:39):

As some of you already know, we're planning on moving our main repository and operations from SourceForge to GitHub real soon now. It's taken approximately two years (yes, years, though worked predominantly on weekends and evenings) to get the entirety of BRL-CAD's repository converted from Subversion to Git. This work, by Cliff Yapp, has included fairly extensive and complicated mappings to preserve as much data as possible, to fix old corruption, to track changes across major disruptions, and to verify and validate that everything is preserved.

view this post on Zulip Sean (Sep 10 2019 at 18:42):

As this is a big change to our development operations, this is an intentional "open comments" period for folks to talk, to adjust, ask questions, give feedback, get prepared, explore tutorials, etc. The intention is to flip the switch in a few weeks.

view this post on Zulip Sean (Sep 10 2019 at 18:44):

One question that's already been a point of discussion (and some of you have already shared your views privately, thank you) is how to handle the commit e-mail associated with past commits. If people want them associated with their current GitHub profile/e-mail, then we'll need to set those before migration is complete. As it currently stands, everyone's commits are associated with a fictitious "USER@sf" e-mail.

view this post on Zulip Sean (Sep 10 2019 at 18:47):

If you'd like your commits associated with a specific name and/or address, please contact me in private or make the change yourself in misc/repoconv/account-map

view this post on Zulip scorp08 (Sep 11 2019 at 10:24):

@Sean so I guess it is possible to fork from the brlcad git?

view this post on Zulip Sean (Sep 11 2019 at 18:40):

@scorp08 Yes, of course it will be possible. Technically it's not hard to fork the Svn repo now, but it will become even easier.

view this post on Zulip Sean (Sep 11 2019 at 18:40):

We'll still be maintaining a central repository structure to encourage collaboration and accelerated development, but it's all good. If people feel more empowered to work on the code in a fork than they do in a clone, I'll still be happy to see their development. Hopefully it won't get too messy and we can actually improve coordination and make it even easier for new developers to get involved with improving the code base.

view this post on Zulip Erik (Feb 29 2020 at 13:30):

Git?

view this post on Zulip Sean (Feb 29 2020 at 14:51):

I need a couple more days to contact the last remaining committers to get their e-mails, create aliases for the handful that aren't reachable, then assume another 2 weeks for @starseeker to run the reconstruction, followed by maybe 1 more week of validation and testing while uploading to GitHub, and if all goes well, we should be up and running by the end of the month!

view this post on Zulip starseeker (Mar 14 2020 at 14:09):

<squeaky wheel noise>

view this post on Zulip Erik (Mar 17 2020 at 12:14):

@Sean status? anything anyone can do to help? do I need to swing by the farm supply store for a salt lick and a cattle prod to do the "carrot and stick" thing? :D

view this post on Zulip starseeker (Mar 20 2020 at 02:51):

ping...

view this post on Zulip Erik (Mar 22 2020 at 22:37):

less hearts, more answers, boy. What's the holdup? my git-fu is pretty strong these days and my drives tend to be more than 8 gigs (the drive my home server used when we did cvs->svn) these days, so I'm not complaining about repo size :D

view this post on Zulip starseeker (Mar 23 2020 at 00:04):

@Erik If you want a preview, you can take a look at https://github.com/starseeker/git_conv_test - it's about 5 months out of date now and the non-email committer names mess with github's stat calculators, but it should be a pretty fair representation of what the conversion will end up looking like otherwise. If you want to, check it out to see how it behaves for you (and see if you spot anything wrong). If you want the git notes that have the SVN numbers, you'll need to explicitly grab the notes as well with: git fetch origin refs/notes/commits:refs/notes/commits

view this post on Zulip starseeker (Mar 23 2020 at 00:12):

The hideous conversion process is laid out in misc/repoconv/CONVERT.sh - it's about as ugly as it gets: C++ mixed with shell scripts mixed with sed and stream of consciousness quick and dirty hackery, but it seems to (slowly) get the job done. I'm still not very skilled with using git day-to-day, but I now know quite a bit more than I wanted to about fast import and export and friends.

view this post on Zulip Sean (Mar 23 2020 at 01:57):

@Erik he's been patiently waiting on me. I'm the holdup. I've been confirming with every past committer since I have contacts for nearly everyone and they've responded with a plethora of e-mails to use. I just had a few remaining to contact which got delayed with a tasker at the office and GCI and GSoC prep and server issue and ... delays. Now with all this ample time on hand (hah), at least time at keyboard, I've been getting through mad backloggage so we should be able to wrap this up with the final pass this week I think.

view this post on Zulip Sean (Mar 23 2020 at 01:58):

It's not a space issue, it's about having a complete history that doesn't lose anything, which @starseeker has gone to exceptional lengths to preserve. The rest is limitations of github that require real contact info if we want to have real stat preservation.

view this post on Zulip Erik (Mar 23 2020 at 11:07):

git history is rewritable. mistakes at this stage can be fixed. (rewriting git history is dangerous and expert friendly, but we're not ... committed.)

view this post on Zulip Erik (Mar 23 2020 at 11:08):

good luck wrapping up the last few, if'n ya'll need git or shell help, lemme know, I've been using git almost exclusively since... hm, was it '13 that I left arl to check out the modern world? :D

view this post on Zulip starseeker (Mar 23 2020 at 12:14):

We may not be committed, but once we go live with the new github repo and people start forking, rewriting the full history would be highly disruptive. On the order of what we nearly had to do with the Great SVN Duplicate Commit ID crisis a number of years back.

view this post on Zulip starseeker (Mar 23 2020 at 12:22):

The chaining of SHA1 hashes is neat for repository integrity, but it means there's no such thing as a local history change. I spent a lot of time thrashing trying to figure out if I could splice the newer SVN conversion onto the older CVS git conversion, and it took me longer than it should have to realize that it's actually structurally impossible to do that with anything other than a full commit replay of the post-CVS commits on top of the CVS conversion (hello, rabbit hole).

So since I REALLY don't want to have to wade through all of that any more times than I need to (there are some finicky manual steps that have to get updated each time the committer emails change, not to mention the delightful experience of mucking around in the swamp mud of my conversion logic) I'm willing to wait for @Sean to get it right the first time :-)
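
To illustrate why there is "no such thing as a local history change", here is a toy sketch of hash chaining. It is only a model: real git commit ids also cover the tree, author, and timestamps, and the function names and commit messages below are made up for the example.

```python
import hashlib

def commit_id(parent_id: str, message: str) -> str:
    """Toy commit id: hash the parent's id together with this commit's message."""
    return hashlib.sha1(f"{parent_id}\n{message}".encode()).hexdigest()

def build_history(messages):
    """Return the commit ids of a linear history, oldest first."""
    ids, parent = [], ""
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids

original = build_history(["import from CVS", "fix typo", "add feature"])
# Change only the middle commit and replay the rest unchanged.
rewritten = build_history(["import from CVS", "fix typo (reworded)", "add feature"])

# The edited commit and everything after it get new ids.
changed = [a != b for a, b in zip(original, rewritten)]
print(changed)  # [False, True, True]
```

Because each id depends on its parent's id, an edit anywhere forces a replay of every later commit, which is the "full commit replay" described above.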

view this post on Zulip starseeker (Mar 23 2020 at 12:25):

Whadya mean "modern"? We use CMake and everything these days! We're even embracing this newfangled C++11 thing! Now get off my lawn! :-P

view this post on Zulip Erik (Mar 23 2020 at 12:55):

hehe, ch'know, there's a c++17 now :D

view this post on Zulip Erik (Mar 23 2020 at 12:56):

(still c++, though... swift and go are way nicer... I hear good things about rust, too)

view this post on Zulip starseeker (Mar 23 2020 at 16:36):

@Daniel Rossberg The plan right now is to also convert all the smaller project histories (including rt^3) to their own individual git repos. Does git+github work for you for rt^3? (We have to change the name to rt_3 - the ^ character causes some problems for git..)

view this post on Zulip Daniel Rossberg (Mar 23 2020 at 16:55):

Well, sure, I didn't create this name. However, why don't we call it rt3?

view this post on Zulip starseeker (Mar 23 2020 at 17:55):

That's fine too, assuming it works for the conversions - rt_3 was just what I had put in the original svn-fast-export mapping files when I found out ^ wouldn't work.

view this post on Zulip Sean (Mar 24 2020 at 06:32):

The "rt-cubed" name doesn't need to be preserved. I would suggest renaming the repo to "moose" since that's the name we decided on.

view this post on Zulip Sean (Mar 24 2020 at 06:32):

or MOOSE ?

view this post on Zulip Sean (Mar 24 2020 at 06:39):

would be good to disambiguate from https://en.wikipedia.org/wiki/MOOSE_(software) in some manner, maybe moose++ or just be fine with moose or ...

view this post on Zulip Sean (Mar 24 2020 at 06:41):

@Erik git history is typically rewritable, but we're (thus far) using a feature of git (notes) that precludes rewriting history without rebuilding hashes. that's because the way git notes are currently implemented, they attach to specific hashes and are not updated on history edits. they get orphaned. it's lame, but it's the best solution so far for attaching svn's metadata to specific commits. open to other solutions.
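
A toy sketch of the orphaning described above, assuming notes are simply records stored outside the commits and keyed by commit id. The hashing is a stand-in for git's real object ids, and the `svn:r1` / `svn:r2` values are made up.

```python
import hashlib

def chain_ids(messages):
    """Toy linear history: each id hashes the parent id plus the message."""
    ids, parent = [], ""
    for message in messages:
        parent = hashlib.sha1(f"{parent}\n{message}".encode()).hexdigest()
        ids.append(parent)
    return ids

before = chain_ids(["first import", "second change"])
# Notes live outside the commits, keyed by commit id (hypothetical values).
notes = {before[0]: "svn:r1", before[1]: "svn:r2"}

# Rewriting the first commit (say, a corrected author) yields new ids throughout.
after = chain_ids(["first import [author corrected]", "second change"])

still_attached = [cid for cid in after if cid in notes]
orphaned = [cid for cid in notes if cid not in after]
print(len(still_attached), len(orphaned))  # 0 2
```

In this model the notes themselves are untouched, but nothing in the rewritten history points at them any more, so the SVN metadata would have to be re-attached against the new ids.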

view this post on Zulip Daniel Rossberg (Mar 24 2020 at 07:45):

Sean said:

The "rt-cubed" name doesn't need to be preserved. I would suggest renaming the repo to "moose" since that's the name we decided on.

I recommend staying with the "rt-cubed" name for the conversion, because this branch is more of a sandbox for experimental extensions than something specific to the C++ interface.

However, I agree with you that we should aim for a separate moose repository for the C++ interface and its related pieces in the future.

view this post on Zulip Daniel Rossberg (Mar 24 2020 at 07:51):

Sean said:

would be good to disambiguate from https://en.wikipedia.org/wiki/MOOSE_(software) in some manner, maybe moose++ or just be fine with moose or ...

I would officially name it BRL-CAD MOOSE for "BRL-CAD Modular Object Oriented Software Extension". I.e., the MOOSE acronym only makes sense with the BRL-CAD prefix.

view this post on Zulip Sean (Mar 24 2020 at 07:57):

software extension? I thought we had a better backronym. ;)

view this post on Zulip Sean (Mar 24 2020 at 07:58):

Modular Object-Oriented Solidity Engine

view this post on Zulip Daniel Rossberg (Mar 24 2020 at 08:02):

Whatever :grinning_face_with_smiling_eyes:
Important is the name MOOSE with its wonderful logo.

view this post on Zulip Sean (Mar 24 2020 at 08:03):

so true

view this post on Zulip starseeker (Mar 24 2020 at 11:59):

@Erik about the git-notes usage - I did that so the git commit messages could exactly match their SVN counterparts, which allows for a fairly straightforward analysis to map SVN ids to older commits.

For the CVS portion of the conversion (i.e. the commits put in Git straight from the cvs repo) the ordering and specifics of the generated commits varies a bit from the cvs->svn results (which is one of the reasons I went to all this trouble - cvs-git produced better results with the very early commits). That means the commit messages (when unique) are the best available way to find SVN id mappings to older git commits, hence I needed to keep them the same in both conversions. (Even that isn't enough to reliably peg all cvs->git commits with SVN ids, but an upside of using notes is that if someone someday wants to do a better job of SVN id mapping than I managed they can do so without disturbing the main Git history.)
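
A minimal sketch of the "match on unique commit messages" idea, not the actual logic in misc/repoconv. The function name, the data layout, and the sample commits are all hypothetical.

```python
from collections import Counter

def map_svn_ids(git_commits, svn_commits):
    """Map git commit id -> SVN revision where the message is unique on both sides.

    git_commits: list of (git_id, message)
    svn_commits: list of (revision, message)
    """
    git_counts = Counter(message for _, message in git_commits)
    svn_counts = Counter(message for _, message in svn_commits)
    # Only messages that occur exactly once in SVN are safe lookup keys.
    svn_by_message = {msg: rev for rev, msg in svn_commits if svn_counts[msg] == 1}
    mapping = {}
    for git_id, message in git_commits:
        if git_counts[message] == 1 and message in svn_by_message:
            mapping[git_id] = svn_by_message[message]
    return mapping

git_commits = [("a1", "Initial revision"), ("b2", "fix rt shading bug"), ("c3", "Initial revision")]
svn_commits = [(1, "Initial revision"), (2, "fix rt shading bug"), (3, "Initial revision")]
print(map_svn_ids(git_commits, svn_commits))  # {'b2': 2}
```

Commits whose messages repeat (here "Initial revision") stay unmapped in this sketch; resolving those would need extra evidence such as timestamps, which is where the reliability limits mentioned above come in.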

view this post on Zulip starseeker (Mar 31 2020 at 11:48):

ping...

view this post on Zulip starseeker (Apr 07 2020 at 18:56):

ping...

view this post on Zulip Erik (Apr 08 2020 at 13:52):

does he need rebooted?

view this post on Zulip starseeker (Apr 08 2020 at 19:18):

Heh - just high load compared to available bandwidth

view this post on Zulip starseeker (Apr 13 2020 at 12:43):

ping...

view this post on Zulip Erik (Apr 17 2020 at 23:39):

traceroute?

view this post on Zulip starseeker (Apr 19 2020 at 18:44):

ping...

view this post on Zulip Erik (Apr 22 2020 at 17:00):

alacaPING

view this post on Zulip starseeker (Apr 22 2020 at 19:34):

@Sean Are we still waiting on responses?

view this post on Zulip Sean (Apr 22 2020 at 19:37):

No, I've not had time to work on it, chasing other issues.

view this post on Zulip Sean (Apr 22 2020 at 19:37):

I need to set up aliases and update a couple things and it'll be good to go.

view this post on Zulip starseeker (Apr 22 2020 at 19:39):

Ah, K.

view this post on Zulip starseeker (Apr 29 2020 at 16:03):

ping...

view this post on Zulip starseeker (Apr 29 2020 at 16:05):

@Sean would it help if you sent me the updated info and I integrated it into updated author maps? As long as those are finalized we can start the conversion without actually requiring the aliases be present on bz...

view this post on Zulip starseeker (May 02 2020 at 17:21):

ping...

view this post on Zulip Erik (May 03 2020 at 14:44):

I'm starting to wonder if starseeker's pinger is broken

view this post on Zulip Sean (May 03 2020 at 14:44):

Heh, not broken, almost there
Lots going on

view this post on Zulip Erik (May 03 2020 at 14:46):

tautological. High interest topic, so nosey noses want to know :)

view this post on Zulip Sean (May 03 2020 at 14:48):

Like whether everyone got their invite -- I resent it again for a third time to all, you get yours?

view this post on Zulip Erik (May 03 2020 at 14:48):

(burndown list? blockers we can help with? rough eta? have you sourced adequate caffeine?)

view this post on Zulip Sean (May 03 2020 at 14:49):

I just got a new coffee grinder, it's been amazing

view this post on Zulip Erik (May 03 2020 at 14:49):

yup, enrolling now

view this post on Zulip Erik (May 03 2020 at 14:50):

my grinder broke :( I have to drive 5 minutes to the office for real coffee

view this post on Zulip Erik (May 03 2020 at 14:51):

IMG_4974.JPG

view this post on Zulip Erik (May 03 2020 at 15:00):

hey, neato, I guess I'm a "mentor" now

view this post on Zulip starseeker (May 03 2020 at 18:30):

Ah - so that's mentor invites not github invites?

view this post on Zulip Sean (May 03 2020 at 18:45):

That's quite a contraption @Erik .. you got that 5+ years ago I think, right?

view this post on Zulip starseeker (May 03 2020 at 22:17):

<snort> from the looks of that contraption you're lucky breakage didn't involve an explosion

view this post on Zulip starseeker (May 03 2020 at 22:18):

There must be quite a science to proper coffee grinding

view this post on Zulip Sean (May 03 2020 at 22:46):

I have a similar espresso machine I've used for 15+ years. Sounds like it could explode any minute, but that's just how they work. They build up steam pressure to heat and force liquid through the grounds. Which also means you want really fine grounds, not the same grinding used in drip coffee machines.

view this post on Zulip Erik (May 05 2020 at 23:20):

the work one, has two cafe grade grinders to the left of it, it's a beast (and was some of my first on the job training at this place). I'm stuck with keurig's, a moka and an old target espresso maker that gathers dust :/

view this post on Zulip Erik (May 05 2020 at 23:20):

github? ping?

view this post on Zulip Sean (May 05 2020 at 23:21):

spending all my time fixing builds and debugging. want to help? you could figure out why mysqld is using so much memory, and see if it can be cut in half

view this post on Zulip Sean (May 05 2020 at 23:21):

i'm looking at whether/how jenkins' usage can be reduced

view this post on Zulip Erik (May 05 2020 at 23:22):

sure, people are using databases. remove the db's and terminate access, problem solved

view this post on Zulip Erik (May 05 2020 at 23:22):

I mean, uh, O:-) for jenkins, you might be able to tune max vm size, but java historically has a habit of not releasing much memory, it likes to hold onto it for its own allocator

view this post on Zulip Sean (May 05 2020 at 23:22):

heh, well I'm almost certain the largest offender is the wiki .. but that's a hypothesis and still doesn't mean there's not some configuration options that might reduce usage too

view this post on Zulip Sean (May 05 2020 at 23:23):

yeah, I know java is a notorious pig, but almost certainly can get it to use less than 6GB

view this post on Zulip Erik (May 05 2020 at 23:23):

mysql probably has vm tuning options, too

view this post on Zulip Sean (May 05 2020 at 23:23):

I suspect it's loading a lot of stuff from our side that it doesn't need to

view this post on Zulip Sean (May 05 2020 at 23:24):

mysql is the one that's almost certainly loaded with attack attempts

view this post on Zulip Sean (May 05 2020 at 23:24):

website gets hit constantly, and could just be gradual accumulation of crap

view this post on Zulip Erik (May 05 2020 at 23:26):

probably... could try just restarting those services and see what happens, certainly something we could tune to fix, but might be a quick bandaid^Wadhesive bandage

view this post on Zulip Sean (May 05 2020 at 23:27):

mysql gets restarted frequently, it's sitting around 2GB

view this post on Zulip Erik (May 05 2020 at 23:30):

I tuned down a couple of its buffers, looks like it's at 1/2 vm and 1/3 res, we'll see how far it drifts up.

view this post on Zulip Erik (May 05 2020 at 23:31):

and/or what asplodes :D pooters are fun!

view this post on Zulip Sean (May 06 2020 at 00:06):

it's set to use zero swap, so it's at 50% capacity if everything goes resident

view this post on Zulip starseeker (May 06 2020 at 01:55):

@Sean If I'm still breaking the OSX build, I can shift entirely to working in branches until we finish the github migration...

view this post on Zulip starseeker (May 06 2020 at 01:58):

Alternately, I can put a snapshot of trunk up on my own github and see if I can figure out how to hook up the OSX CI system

view this post on Zulip starseeker (May 18 2020 at 12:13):

ping...

view this post on Zulip starseeker (May 28 2020 at 16:45):

ping...

view this post on Zulip Erik (May 31 2020 at 13:18):

Eager minds want to know how close this sausage is to being made. :)

view this post on Zulip starseeker (Jun 13 2020 at 03:15):

ping...

view this post on Zulip Sumagna Das (Jun 13 2020 at 03:16):

what is this channel for?

view this post on Zulip starseeker (Jun 13 2020 at 03:16):

discussion of an eventual transition of the BRL-CAD source repository to using the Git version control system

view this post on Zulip Sumagna Das (Jun 13 2020 at 03:17):

you guys were talking about CI system, right?

view this post on Zulip starseeker (Jun 13 2020 at 03:17):

Continuous Integration is a separate topic

view this post on Zulip Sumagna Das (Jun 13 2020 at 03:17):

ok

view this post on Zulip starseeker (Jun 22 2020 at 21:22):

ping...

view this post on Zulip Erik (Jun 27 2020 at 00:44):

what resource is missing to put a bow on this? lack of Seans? we can do the star trek thing and split him into the saucer Sean section and the other Chris section, right? "Make it so, number :poop: "

view this post on Zulip Sean (Jul 02 2020 at 08:35):

@starseeker Probably missed your window, but ... it'll be there for whenever you get back.
It's done!

view this post on Zulip Sean (Jul 02 2020 at 08:38):

Aliases have been added and lots of confirmations and updates for others. Apparently took some 20+ hours to finish it up. Lots of proper e-mails in there, though, so worth it. Lots of simple awareness too.

view this post on Zulip Sean (Jul 02 2020 at 08:39):

a bunch of folks link through an alias to a noreply@ address in cases where I didn't have and couldn't find any contact information or if they were unreachable.

view this post on Zulip starseeker (Jul 02 2020 at 12:48):

Awesome - thanks! Will have to wait to kick off the main run, but I should be able to start on some of the preliminaries (in particular, updating the bridging commits between cvs and svn, which require manual adjustment.)

view this post on Zulip starseeker (Jul 02 2020 at 12:51):

There's still a + in front of jgrosh - is that significant?

view this post on Zulip Sean (Jul 02 2020 at 22:10):

starseeker said:

There's still a + in front of jgrosh - is that significant?

Oh, good catch. Yes, significant, as it means I hadn't reconciled his yet. Yay for book-keeping that served its purpose! No response, so he's replaced with an alias.

view this post on Zulip Sean (Jul 02 2020 at 22:12):

At this point, any remaining uncertainty or issues I'm just replacing with brlcad.org aliases. Frankly, they could have all been replaced with brlcad.org aliases and captured inside GitHub, but this way I don't have to be in the loop (as they are DNS MX records).

view this post on Zulip starseeker (Jul 03 2020 at 01:36):

@Sean The only other thing I noticed is the cvs_authormap has a "jebbly" entry for Jeffrey Liu, which wasn't in the svn map (probably my fault.) Should that just be jebbly@brlcad.org ?

view this post on Zulip starseeker (Jul 03 2020 at 01:52):

Actually, per r75095 that's a recent committer?

view this post on Zulip starseeker (Jul 03 2020 at 01:52):

Presumably the same Jeffery Liu in the chat now... I got misled by seeing the name only in the CVS authormap.

view this post on Zulip Sean (Jul 03 2020 at 02:29):

yeah, I did not reconcile against the other file, so another good one to catch

view this post on Zulip starseeker (Jul 03 2020 at 02:46):

I pulled a list of committers from the svn log and compared it to the ones in the map - I think we're good now. I'll run a basic conversion of the CVS history and upload it to github to see what happens with the new email addresses

view this post on Zulip Sean (Jul 03 2020 at 02:56):

If there's any problem, we should either just switch EVERYTHING to brlcad.org aliases for the historic commits (to preserve the username as-is, even duplicates), or we should check out what GitLab does with the same info.

view this post on Zulip starseeker (Jul 03 2020 at 03:01):

/me nods

view this post on Zulip starseeker (Jul 03 2020 at 03:04):

I got the CVS part to convert, which will hopefully be enough to tell the tale

view this post on Zulip starseeker (Jul 03 2020 at 03:20):

https://github.com/starseeker/brlcad_cvs_git/

view this post on Zulip starseeker (Jul 03 2020 at 03:20):

I imagine the stats will have to crunch for a little while

view this post on Zulip starseeker (Jul 03 2020 at 03:23):

Ah. Well, contributors so far only appear to be those tied to a github account? https://github.com/starseeker/brlcad_cvs_git/graphs/contributors

view this post on Zulip starseeker (Jul 03 2020 at 03:29):

Might be worth a support question...

view this post on Zulip starseeker (Jul 03 2020 at 03:32):

Not sure if individuals' profiles will pick up retroactively on commits made before they joined github...

view this post on Zulip starseeker (Jul 03 2020 at 03:34):

Also, this being in my own personal grouping (as opposed to an org) might have some impact...

view this post on Zulip Sean (Jul 03 2020 at 03:35):

They do, the commits go all the way back (e.g., look at John's)

view this post on Zulip Sean (Jul 03 2020 at 03:37):

We could create github accounts for all of the brlcad.org aliased accounts, that way they'd at least show up.

view this post on Zulip Sean (Jul 03 2020 at 03:37):

at least for one in particular...

view this post on Zulip Sean (Jul 03 2020 at 03:44):

so... let's see. there are 102 entries of which 88 are unique. minus the 19 accounts it found. minus 26 aliased.
that leaves 43 unaccounted and unaccountable.

view this post on Zulip Sean (Jul 03 2020 at 03:52):

284281400@qq.com
abhijit.nandy@gmail.com
agkphysics@gmail.com
andrecastelo@gmail.com
anuragmurty@gmail.com
ben.e.saunders@gmail.com
bhinesley@gmail.com
bilmer1@comcast.net
brlcad@mail.lordofbikes.de
carl.nuzman@nokia-bell-labs.com
carlm0404@gmail.com
cdueck93@gmail.com
cezar.elnazli2@gmail.com
cprecup@cisco.com
dgodbey@yahoo.com
dloman77@gmail.com
doug@survice.com
ebautu@gmail.com
g.sayol@gmail.com
indianlarry@verizon.net
jdoliner@gmail.com
kunigami@gmail.com
manuel.montezelo@gmail.com
marcodomingues20@gmail.com
maths22@gmail.com
michael.j.gillich@gmail.com
mireastefangabriel@gmail.com
mohitdaga.lnmiit@gmail.com
nreed1@umbc.edu
popescu.andrei1991@gmail.com
robert.reschly@gmail.com
sam@hocevar.net
sharan.nyn@gmail.com
shubhamrathore1947@gmail.com
thedawnthomas@gmail.com
tim@jvsw.com
tom.browder@gmail.com
u2isaac@gmail.com
vladbogolin@gmail.com
zaqcloud@hotmail.com

view this post on Zulip Sean (Jul 03 2020 at 03:54):

indianlarry has an account, so could check with him to see what's up, what he set it to

view this post on Zulip Sean (Jul 03 2020 at 03:54):

several of those are surprising

view this post on Zulip Sean (Jul 03 2020 at 04:07):

question for you @starseeker , looking at the docs it looks like both username and email get recorded. are the old usernames being preserved or collapsed? just wondering.

view this post on Zulip starseeker (Jul 03 2020 at 12:23):

@Sean Right now they're collapsed - you would need to pull the map file to associate a sourceforge name with the github id, and for individuals with multiple svn ids I didn't preserve which commit was made with which id. Could probably do so using the notes mechanism, now that I think about it, if that's of interest.

view this post on Zulip starseeker (Jul 03 2020 at 12:25):

Would github allow the creation of accounts by someone other than the individual in question? My thought was to inquire if there was any way to have the contributors page report non-github contributors in some fashion...

view this post on Zulip starseeker (Jul 03 2020 at 12:32):

btw, did you switch Erik's email as a test of the alias mechanism? He had given us another email earlier, if I'm remembering correctly.

view this post on Zulip starseeker (Jul 03 2020 at 12:36):

When I look at (say) your individual page, it's only reporting your contribution activity back to when you joined github in 2011 - it doesn't look like it's picking up on the older commits and associating them with your account

view this post on Zulip starseeker (Jul 03 2020 at 12:37):

might be another question for the github folks, if anyone has good contacts there...

view this post on Zulip Erik (Jul 03 2020 at 13:12):

eh, I think I asked to be erik@brlcad.org

view this post on Zulip starseeker (Jul 04 2020 at 12:42):

@Erik Oh, OK - good. I couldn't remember.

view this post on Zulip starseeker (Jul 04 2020 at 12:43):

@Sean What do you think? Want to go ahead with the conversion with the email addresses as-is? Or try to contact github to find out more?

view this post on Zulip starseeker (Jul 04 2020 at 13:12):

I'm thinking it's probably not worth it to tweak too much more, unless you want to track down the root cause of the "surprising" accounts that ought to show up even by github's current contributor criteria but aren't... We can provide something like https://brlcad.org/~starseeker/git_stats/general.html on our own project site to document contributions with more control.

Not ideal certainly - it would be nice if we could get the github site to more accurately reflect the full history - but so far I'm not having much luck trying to research whether that is doable...

view this post on Zulip starseeker (Jul 04 2020 at 13:20):

(one thought - did indianlarry commit to the repository prior to our conversion to SVN? If he doesn't have any CVS commits he wouldn't show up in this test...)

view this post on Zulip Erik (Jul 04 2020 at 13:46):

He had a run in the long ling ago, perhaps under rcs, right?

view this post on Zulip Erik (Jul 04 2020 at 13:46):

Long long ago

view this post on Zulip starseeker (Jul 04 2020 at 14:38):

2009 was indianlarry's earliest commit, according to the previous git conversion.

view this post on Zulip starseeker (Jul 04 2020 at 14:39):

Yep, post-dates CVS - last commit there was end of 2007. OK, that explains it.

view this post on Zulip starseeker (Jul 04 2020 at 14:40):

Better test will be once I've got the SVN history spliced on, but that's the hard part. Looks like it's time to update the bridge commits...

view this post on Zulip starseeker (Jul 04 2020 at 14:43):

(or more precisely, I'm out of excuses to avoid updating the bridging commits... ick.)

view this post on Zulip Sean (Jul 08 2020 at 18:24):

@starseeker I think it's worth doing both - asking github if there's a way to show/list contributors that don't have a github account, just to make sure we're not doing something wrong, and proceeding ahead.

The surprising accounts are probably worth looking into just to double-check that they're not a typo or dead e-mail. There were only a couple, one that I just fixed yesterday. Some people we had gmail accounts for have switched to different gmail accounts.

I think if we can account for everyone and the vast majority - say 95% - show up under the contributor list, we're good. We may get to that percentage just by ensuring one or two accounts.

view this post on Zulip starseeker (Jul 08 2020 at 19:51):

https://github.com/starseeker/brlcad_convtest has what I'd gotten as of this morning - up to 33 github contributor links

view this post on Zulip starseeker (Jul 08 2020 at 19:52):

indianlarry is in there

view this post on Zulip Sean (Jul 08 2020 at 19:54):

I created an account for mike, so that should get another ten thousand commits. I'm looking through the list and going to see if any heavy contributors are missing.

view this post on Zulip starseeker (Jul 08 2020 at 19:56):

I'll do another upload in the next day or two, if my laptop doesn't die

view this post on Zulip Sean (Jul 08 2020 at 19:57):

parker has a curious commit count.. that page is showing 4431 but I'm seeing 5105 in svn, is that because it's only played through 2012?

view this post on Zulip Sean (Jul 08 2020 at 19:58):

tbrowder is similar -- shows 57, but svn has 1637

view this post on Zulip starseeker (Jul 08 2020 at 19:58):

Yeah, only into the 40000s on commits

view this post on Zulip Sean (Jul 08 2020 at 19:59):

okay, maybe around the right time before browder did a lot

view this post on Zulip starseeker (Jul 08 2020 at 19:59):

It'll be next week sometime before I get all the way through

view this post on Zulip Sean (Jul 08 2020 at 20:00):

looks like we're only missing one >1k commit author

view this post on Zulip Sean (Jul 08 2020 at 20:01):

why would someone have more commits in git than they had in svn?

view this post on Zulip starseeker (Jul 08 2020 at 20:02):

If we're missing any from the cvs era that's a problem - newer SVN committers (post 2011) won't show yet.

view this post on Zulip starseeker (Jul 08 2020 at 20:02):

cvs -> git conversion may have broken out the commits differently

view this post on Zulip starseeker (Jul 08 2020 at 20:02):

(than cvs2svn)

view this post on Zulip Sean (Jul 08 2020 at 20:03):

Can you check gdurf? He had 682 in svn, 710 in git.

view this post on Zulip Sean (Jul 08 2020 at 20:04):

don't know if you have an easy way to compare, I just did an svn log > log and counted them up

view this post on Zulip starseeker (Jul 08 2020 at 20:07):

git log --author="Glenn Durfee" --pretty=oneline - gives a quick overview of the git commits

SVN is harder...
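
For the SVN side, one rough way to get per-author counts is to tally the header lines in saved `svn log` output. The sketch below assumes the default plain-text log layout (a 72-dash separator followed by an `rN | author | date | N lines` header); the function name and the sample usernames are made up.

```python
import re
from collections import Counter

SEPARATOR = "-" * 72  # assumed separator line in default `svn log` output
HEADER = re.compile(r"^r(\d+) \| (.+?) \| ")

def count_svn_authors(log_text):
    """Count commits per author, reading only header lines that follow a separator."""
    counts = Counter()
    previous = None
    for line in log_text.splitlines():
        if previous == SEPARATOR:
            match = HEADER.match(line)
            if match:
                counts[match.group(2)] += 1
        previous = line
    return counts

sample = "\n".join([
    SEPARATOR,
    "r3 | alice | 2020-07-08 12:00:00 -0400 (Wed, 08 Jul 2020) | 1 line",
    "",
    "third change",
    SEPARATOR,
    "r2 | bob | 2020-07-07 12:00:00 -0400 (Tue, 07 Jul 2020) | 1 line",
    "",
    "second change",
    SEPARATOR,
    "r1 | alice | 2020-07-06 12:00:00 -0400 (Mon, 06 Jul 2020) | 1 line",
    "",
    "first change",
    SEPARATOR,
])
print(count_svn_authors(sample).most_common())  # [('alice', 2), ('bob', 1)]
```

Requiring the separator on the previous line keeps message text that happens to look like a header from being counted. The resulting numbers can then be compared with the git-side count from the `git log --author=...` command above.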

view this post on Zulip starseeker (Jul 08 2020 at 20:08):

I can already see early commits in the git history that have the same git message - that's a probable source

view this post on Zulip Sean (Jul 08 2020 at 20:09):

Are they mistakes or denoting something like a file move?

view this post on Zulip starseeker (Jul 08 2020 at 20:33):

My understanding is that cvs-fast-export had to deduce which cvs operations in different files denote "commits" for git, when those timestamps don't exactly line up. The tolerance on how much the time span is allowed to vary before a new commit is declared is one of the settings that can be altered on the tool.
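
A simplified sketch of the kind of time-window grouping described above. This is not cvs-fast-export's actual algorithm; the function, the `window_seconds` parameter name, and the sample data are hypothetical.

```python
def coalesce(file_changes, window_seconds=300):
    """Group per-file CVS-style changes into commits.

    file_changes: iterable of (timestamp_seconds, author, message, path).
    A change joins the latest commit with the same author and message if it
    falls within window_seconds of that commit's most recent change;
    otherwise it starts a new commit.
    """
    commits = []
    latest = {}  # (author, message) -> index of the most recent matching commit
    for ts, author, message, path in sorted(file_changes):
        key = (author, message)
        index = latest.get(key)
        if index is not None and ts - commits[index]["end"] <= window_seconds:
            commits[index]["paths"].append(path)
            commits[index]["end"] = ts
        else:
            commits.append({"author": author, "message": message,
                            "start": ts, "end": ts, "paths": [path]})
            latest[key] = len(commits) - 1
    return commits

changes = [
    (0, "dev", "update headers", "a.c"),
    (60, "dev", "update headers", "b.c"),
    (292, "dev", "update headers", "c.c"),
]
print(len(coalesce(changes, window_seconds=300)))  # 1
print(len(coalesce(changes, window_seconds=120)))  # 2
```

With the wider window all three file changes collapse into one commit; with the narrower one the last change is split off into its own commit carrying the same message. A difference like that between two conversion tools would produce both duplicate messages and differing per-author commit counts.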

view this post on Zulip starseeker (Jul 08 2020 at 20:36):

I haven't tried to adjust that too much - it only impacts the CVS portion of the history. The CONVERT.sh script has the setup used for the initial cvs conversion, and that's quite fast if you want to do some experimentation.

view this post on Zulip Sean (Jul 08 2020 at 20:38):

I don't want to experiment, but I would like to confirm that is exactly what's happening here, as opposed to some other unexpected behavior or a bug or bad data or ...

view this post on Zulip Sean (Jul 08 2020 at 20:40):

if you look at a couple of the duplicates, do they differ in files, timestamps separated by a few seconds, or something else? might be concerning if they're different changes to the same files.

view this post on Zulip starseeker (Jul 08 2020 at 20:45):

I see a commit 1996-03-25 16:42:45 that doesn't have a corresponding svn commit id

view this post on Zulip starseeker (Jul 08 2020 at 20:47):

That means the analysis scripts couldn't find a commit message with a close timestamp

view this post on Zulip Sean (Jul 08 2020 at 20:48):

what's the actual content of the commit? does the other commit with the matching log message match an svn commit id? does its change match the svn change? (should it)

view this post on Zulip starseeker (Jul 08 2020 at 20:51):

That one doesn't have a matching message. I'm seeing more that don't (at least 10 so far) which is a bit surprising. There is one at 1994-12-16 15:33:48 that does have a subsequent commit with an SVN id assigned - 1994-12-16 15:38:40

view this post on Zulip starseeker (Jul 08 2020 at 21:16):

r10215 looks like it got split up into a couple commits in git, and the time ordering is slightly different.

view this post on Zulip Sean (Jul 08 2020 at 21:24):

okay, cool ... does that fully explain it? how much time are we talking about? couple seconds?

view this post on Zulip starseeker (Jul 08 2020 at 22:22):

looks like a few minutes

view this post on Zulip starseeker (Jul 08 2020 at 22:48):

If there's a way to get git and svn to generate identically formatted diffs, we could identify when we have actually differing commits - that would be the best/only way to get true assignment of SVN commits to exactly corresponding commits. My estimate was that the maximal utility was to go ahead and assign the numbers based on the commit ids, since it would localize the commit to the general portion of the git history containing the corresponding changes.

view this post on Zulip starseeker (Jul 08 2020 at 22:48):

The next best thing would be to generate a list of all commits that don't have svn ids assigned and inspect what's happening around them.

view this post on Zulip starseeker (Jul 08 2020 at 23:32):

git log --notes --invert-grep --grep=".svn." --pretty=format:"%h %an %ad %s"

view this post on Zulip starseeker (Jul 08 2020 at 23:34):

gitk --notes --invert-grep --grep=".svn." allows for inspection in gitk

view this post on Zulip starseeker (Jul 09 2020 at 00:03):

Bah - github's web history doesn't have --follow enabled, from the looks of it - ell.c history stops at the restructure.

view this post on Zulip starseeker (Jul 09 2020 at 00:13):

Sigh... https://github.com/isaacs/github/issues/900

view this post on Zulip starseeker (Jul 09 2020 at 16:24):

@Sean Today's upload sees Mike as a contributor: https://github.com/starseeker/brlcad_convtest2/graphs/contributors

Looks like github doesn't retroactively add contributors when they create accounts - yesterday's upload still doesn't show him.

view this post on Zulip Sean (Jul 09 2020 at 22:00):

So I'll need to make sure others are created before final upload. Good to know. Was going to crunch the numbers, but I think carl is the only >1k committer missing. He's not likely to create a github account, so he can be switched to a brlcad.org alias.

view this post on Zulip starseeker (Jul 13 2020 at 00:45):

@Sean looks like it will be a few more days at least to finish up the test run (in the mid 60000s range now) - I'll upload it as before once it's done so we can inspect the github integration.

Seeing as we now appear to be getting very close, what's the procedure for flipping the switch from sf to github? Lock the SVN repo as read-only, upload the github repo, and update the web page links are the obvious first steps - will we keep using the existing email lists for the time being?

view this post on Zulip starseeker (Jul 13 2020 at 00:55):

Possibly of interest: https://github.com/cmungall/gosf2github

view this post on Zulip Sean (Jul 13 2020 at 05:48):

yeah, I'd definitely like to import the feature request, support, and bug report trackers to issues, so that's more than of interest. I've come across a couple similar efforts to import sf data. the one you link looks pretty good. suggests we utilize a non-dev account, which is probably a good idea. the patches tracker is a separate beast and will need to be dealt with differently.

view this post on Zulip Sean (Jul 13 2020 at 05:50):

would help to have a checklist on the wiki so we don't miss an action. willing to write down what you know so we can look into ordering and making sure we got everything? I can add my notes as well.

view this post on Zulip starseeker (Jul 13 2020 at 12:52):

Made a quick start here: https://brlcad.org/wiki/Github_Migration

I've got notes scattered around (most in misc/repoconv/NOTES) with more details.

view this post on Zulip Sean (Jul 13 2020 at 19:17):

Cool, I'll add some of mine to it. Awesome! Thanks!

view this post on Zulip starseeker (Jul 14 2020 at 18:33):

@Erik We're still a few weeks out (at a minimum I need to re-run the conversion with the final account-map in place) but we can see the finish line now

view this post on Zulip starseeker (Jul 14 2020 at 18:37):

@Sean Is it worth putting out an email to the brlcad-devel list with a "last chance" call for any account info updates? Or is that not needed?

view this post on Zulip Sean (Jul 14 2020 at 20:38):

Good idea, I'll write up and send an announcement.

view this post on Zulip starseeker (Jul 16 2020 at 18:15):

Test conversion complete: https://github.com/starseeker/brlcad_conv3

Unless there are more email changes needed, we should now be ready to begin the final conversion run.

view this post on Zulip Sean (Jul 16 2020 at 18:17):

/me looks

view this post on Zulip Sean (Jul 16 2020 at 18:19):

Nice! Looks like it's recognizing 55 contributors now. Not too shabby. Did you start that before I changed Carl's address?

view this post on Zulip starseeker (Jul 16 2020 at 18:21):

Unfortunately, yes - doesn't have that change nor Ben's. That's why I'll have to run one more time.

view this post on Zulip starseeker (Jul 16 2020 at 18:21):

Also why I was suggesting sending out the "last chance" email - once I kick off this time, we're locked in.

view this post on Zulip Sean (Jul 16 2020 at 18:21):

okay, I have a couple more to make and we should be good to go

view this post on Zulip Sean (Jul 16 2020 at 18:22):

yep

view this post on Zulip Sean (Jul 16 2020 at 18:23):

I can send the email now then, unless you wanted to send it? appreciated seeing the draft.

view this post on Zulip starseeker (Jul 16 2020 at 18:23):

If that looks good I can send it - you're the better wordsmith, so I wanted you to have a crack at it

view this post on Zulip starseeker (Jul 16 2020 at 18:25):

Assuming a repeat performance, it looks like a bit shy of two weeks for the run - so around the beginning of August we should plan to lock the SVN repository and open up the github repo.

view this post on Zulip Sean (Jul 16 2020 at 18:25):

I'd reduce it down a bit and put the main point first, but it looks good enough as is too.

view this post on Zulip starseeker (Jul 16 2020 at 18:26):

OK, go ahead and send it - makes more sense really for you to do so since you've been POC for the emails all along

view this post on Zulip Sean (Jul 16 2020 at 18:27):

I like the automatic stale branch designation

view this post on Zulip starseeker (Jul 16 2020 at 18:27):

I'm planning to start the run on the 19th, if you want a fixed deadline for the email

view this post on Zulip starseeker (Jul 16 2020 at 18:30):

We may want to adjust our tag names going forward, so the tar.gz file github generates from the tag will be more meaningful name wise - right now we get "rel-7-30-8.tar.gz"
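
For illustration only, a small Python sketch of turning the existing tag style into a dotted version string. The `rel-MAJOR-MINOR-PATCH` pattern comes from the tag quoted above; the `brlcad-` file name prefix and the function names are assumptions, not an agreed naming scheme.

```python
import re

# Matches tags such as "rel-7-30-8".
TAG = re.compile(r"^rel-(\d+)-(\d+)-(\d+)$")


def tag_to_version(tag):
    """Convert a 'rel-7-30-8' style tag into '7.30.8'."""
    m = TAG.match(tag)
    if m is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return ".".join(m.groups())


def tarball_name(tag, project="brlcad"):
    """Hypothetical archive name built from the project and version."""
    return f"{project}-{tag_to_version(tag)}.tar.gz"


if __name__ == "__main__":
    print(tarball_name("rel-7-30-8"))
```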

view this post on Zulip Sean (Jul 16 2020 at 18:31):

sure, I can put "before Monday" in the mail

view this post on Zulip Sean (Jul 16 2020 at 18:40):

starseeker said:

We may want to adjust our tag names going forward, so the tar.gz file github generates from the tag will be more meaningful name wise - right now we get "rel-7-30-8.tar.gz"

I think we're good. If anything, we could adopt Semantic Versioning (which is simply v1.2.3), but we don't fully comply so that would be a bit misleading.

The tag downloads aren't necessarily meant to be release tarballs (though they obviously should be the same), or at least their convenience priority is geared to devs on the command line, not downloaders. Some notable examples:
https://github.com/tensorflow/tensorflow/tags
https://github.com/torvalds/linux/tags
https://github.com/redis/redis/tags

view this post on Zulip starseeker (Jul 16 2020 at 18:41):

OK, cool. As long as nobody complains that our Github tar.gz download links aren't compliant with HACKING ;-)

view this post on Zulip Sean (Jul 16 2020 at 18:41):

What we have is simple and self-consistent, historically accurate.

view this post on Zulip starseeker (Jul 16 2020 at 18:42):

Ah, so Releases are the fancier version. OK, that's a corner of Github I've not delved into yet

view this post on Zulip Sean (Jul 16 2020 at 18:43):

Yeah, like I said, those won't necessarily be our source tarballs. They'll be wherever we host our binary platform releases, since we still need those too.

view this post on Zulip Sean (Jul 16 2020 at 18:45):

Binary downloads aren't something Github supports directly -- some use GitHub's LFS, some self-host, others still continue to use SourceForge for downloads since that's the one thing they actually are still good at.

view this post on Zulip starseeker (Jul 16 2020 at 18:45):

https://docs.github.com/en/github/administering-a-repository/managing-releases-in-a-repository seems to suggest you can upload binaries?

view this post on Zulip starseeker (Jul 16 2020 at 18:46):

(#7 in that list)

view this post on Zulip Sean (Jul 16 2020 at 18:46):

Well what do you know, they added it.

view this post on Zulip Sean (Jul 16 2020 at 18:48):

heh, looks like they added it 7 years ago. shows how closely I've been paying attention to it.

view this post on Zulip starseeker (Jul 16 2020 at 18:50):

<grin> I've mostly been trying to figure out the CI bits - I don't have any projects that do releases, so I hadn't noticed either

view this post on Zulip starseeker (Jul 16 2020 at 18:53):

/me doesn't speak fluent YAML yet...

view this post on Zulip Sean (Jul 16 2020 at 19:06):

okay, so looks like we're at 89% commit coverage across those 55 accounts
63715 commits out of 71289

Carl's will get us to 93%. I think a few more will probably get us into the 95-98% ballpark.
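
The arithmetic behind those percentages, as a minimal Python sketch. The two counts come from the message above; the function names are made up for illustration, and `commits_needed` only shows what a target percentage implies in total, not any one committer's actual count.

```python
import math


def coverage(attributed, total):
    """Percentage of commits attributed to recognized accounts."""
    return 100.0 * attributed / total


def commits_needed(attributed, total, target_percent):
    """Additional attributed commits required to reach target_percent."""
    return max(0, math.ceil(target_percent / 100 * total) - attributed)


if __name__ == "__main__":
    print(f"{coverage(63715, 71289):.1f}%")   # prints 89.4%
    print(commits_needed(63715, 71289, 93))   # prints 2584
```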

view this post on Zulip starseeker (Jul 16 2020 at 19:06):

That's a lot better than I expected, to be honest

view this post on Zulip starseeker (Jul 16 2020 at 19:07):

I just wanted to be sure we didn't miss anyone currently active on github who wanted to be correctly linked into the history

view this post on Zulip Sean (Jul 16 2020 at 19:07):

curiously, tom browder's commits are actually the first i've noticed lower than his Svn count... should there be any reason for that?

view this post on Zulip starseeker (Jul 16 2020 at 19:08):

Is the svn count counting any repos other than the main BRL-CAD history?

view this post on Zulip Sean (Jul 16 2020 at 19:08):

nope, it's just an svn log dump off trunk

view this post on Zulip starseeker (Jul 16 2020 at 19:08):

If it's an admin dump that'd be everything (rt^3, geomcore, etc.)

view this post on Zulip starseeker (Jul 16 2020 at 19:09):

Ah, you mean from a checkout

view this post on Zulip starseeker (Jul 16 2020 at 19:09):

one sec...

view this post on Zulip Sean (Jul 16 2020 at 19:09):

can check on your end: svn log > log && grep -E '^r[[:digit:]]+[[:space:]]\|' log | awk '{print $3}' | sort | uniq -c | sort -n
tom's showing 1637 commits. git count is 1630.
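
The same per-author count can be sketched in Python, working on a saved `svn log` dump. This assumes the usual `rN | author | date | N lines` header format; the sample log and its author names are invented for the example.

```python
import re
from collections import Counter

# Header lines in `svn log` output look like:
#   r1234 | username | 2020-07-16 15:09:00 -0400 (Thu, 16 Jul 2020) | 1 line
HEADER = re.compile(r"^r(\d+)\s+\|\s+(\S+)\s+\|")


def count_commits_by_author(log_text):
    """Return a Counter mapping author name to number of revisions."""
    counts = Counter()
    for line in log_text.splitlines():
        m = HEADER.match(line)
        if m:
            counts[m.group(2)] += 1
    return counts


SAMPLE_LOG = """\
------------------------------------------------------------------------
r3 | alice | 2020-07-16 15:09:00 -0400 (Thu, 16 Jul 2020) | 1 line

fix typo
------------------------------------------------------------------------
r2 | bob | 2020-07-15 10:00:00 -0400 (Wed, 15 Jul 2020) | 1 line

update docs
------------------------------------------------------------------------
r1 | alice | 2020-07-14 09:00:00 -0400 (Tue, 14 Jul 2020) | 1 line

initial import
"""

if __name__ == "__main__":
    counts = count_commits_by_author(SAMPLE_LOG)
    # Ascending by count, like the `sort -n` at the end of the pipeline.
    for author, n in sorted(counts.items(), key=lambda kv: kv[1]):
        print(n, author)
```

Note that an `svn log` run from a trunk checkout only sees trunk history, so the result depends on where the log was taken.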

view this post on Zulip Sean (Jul 16 2020 at 19:12):

I'm not sure what to make of this ... https://github.com/starseeker/brlcad_conv3/pulse

view this post on Zulip Sean (Jul 16 2020 at 19:13):

oh, I get it. That's activity in the last N days.

view this post on Zulip starseeker (Jul 16 2020 at 19:13):

git log --pretty=oneline --author="Thomas Browder" --branches="*" |wc -l gives me a count of 1634

view this post on Zulip starseeker (Jul 16 2020 at 19:13):

Working on svn

view this post on Zulip Sean (Jul 16 2020 at 19:15):

https://github.com/starseeker/brlcad_conv3/graphs/contributors lists tom at 1630 .. so then there's two discrepancies

view this post on Zulip starseeker (Jul 16 2020 at 19:15):

I don't know that the contributors is looking across all branches

view this post on Zulip Sean (Jul 16 2020 at 19:16):

probably not, but seems odd that tom would commit .. 4 times to a branch

view this post on Zulip Sean (Jul 16 2020 at 19:16):

easy enough to verify

view this post on Zulip starseeker (Jul 16 2020 at 19:17):

I had something in the notes for doing a deeper dive into the git history... one sec...

view this post on Zulip starseeker (Jul 16 2020 at 19:18):

When I do your svn log count on my local brlcad_repo copy I get:

1988 tbrowder2

view this post on Zulip starseeker (Jul 16 2020 at 19:18):

That's across everything

view this post on Zulip Sean (Jul 16 2020 at 19:18):

o.O

view this post on Zulip Sean (Jul 16 2020 at 19:19):

send me your log

view this post on Zulip starseeker (Jul 16 2020 at 19:21):

sent

view this post on Zulip Sean (Jul 16 2020 at 19:22):

heh, you know you can just drag n drop them into here? :)

view this post on Zulip Sean (Jul 16 2020 at 19:22):

got it

view this post on Zulip starseeker (Jul 16 2020 at 19:23):

Ah, right - sorry, my reflexes still think this is a fancy version of irssi

view this post on Zulip starseeker (Jul 16 2020 at 19:31):

If I run the following script I end up with 73882 commit messages, whereas github (and git log itself) give only 71289: log.sh

view this post on Zulip Sean (Jul 16 2020 at 19:33):

so he did make a lot of commits into the ova repository, so that's one difference albeit to be expected

view this post on Zulip Sean (Jul 16 2020 at 19:37):

and he did make a branch for working on binary attributes, so that's explaining the 297 additional commits. if you ran off trunk, you would have seen 1637.

view this post on Zulip Sean (Jul 16 2020 at 19:38):

Which is also interesting in itself ... github is not counting branch commits?

view this post on Zulip Sean (Jul 16 2020 at 19:38):

yeah, says it right on the page -- Contributions to master

view this post on Zulip starseeker (Jul 16 2020 at 19:39):

/me is thinking it's probably still worthwhile to have our own gitstats page...

view this post on Zulip starseeker (Jul 16 2020 at 19:39):

which reminds me

view this post on Zulip Sean (Jul 16 2020 at 19:41):

So only the original question remains -- why his commit count is 7 commits short.

view this post on Zulip starseeker (Jul 16 2020 at 19:41):

(btw, the attached script reports 1930 commits for Tom in git.) log_browder.sh

view this post on Zulip Sean (Jul 16 2020 at 19:43):

so similar question since svn is saying he had 1988

view this post on Zulip starseeker (Jul 16 2020 at 19:44):

I have some C++ code in misc/repoconv (I think it's in the svn_map_commit_revs.cxx file) which could probably be repurposed to actually diff the logs, with some work - svn makes those types of comparisons very annoying...

view this post on Zulip Sean (Jul 16 2020 at 19:44):

wouldn't be unusual except that everyone else is slightly higher in git with the fake/duplicate commits

view this post on Zulip starseeker (Jul 16 2020 at 19:46):

Did he ever adjust svn ignore properties or mime types?

view this post on Zulip starseeker (Jul 16 2020 at 19:46):

Those commits didn't translate, so we may have lost a few there if he did do that and didn't do any move+change commits

view this post on Zulip Sean (Jul 16 2020 at 19:48):

Should be able to figure this out easily by process of elimination.
I just got the repo cloned -- how do we access the svn rev?

view this post on Zulip starseeker (Jul 16 2020 at 19:49):

It's in the notes. I have a convenience script in misc/repoconv/NOTES

view this post on Zulip Sean (Jul 16 2020 at 19:49):

if we can get a list of svn-to-sha, they can be eliminated from the svn list or sha list and vice versa .. should be just a handful remaining in both
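
A minimal Python sketch of that elimination, assuming the svn revisions and the sha-to-revision notes have already been extracted into plain Python values. The function name and the sample data are hypothetical.

```python
def eliminate(svn_revs, git_notes):
    """Cross off every revision that appears on both sides.

    svn_revs:  iterable of SVN revision numbers (ints) from the svn log.
    git_notes: dict mapping git SHA -> SVN revision number, or None when
               the commit carries no revision note.
    Returns (svn_only, git_only): revisions with no git commit, and SHAs
    whose note is missing or points at a revision not in svn_revs.
    """
    svn_set = set(svn_revs)
    noted = {rev for rev in git_notes.values() if rev is not None}
    svn_only = sorted(svn_set - noted)
    git_only = sorted(sha for sha, rev in git_notes.items() if rev not in svn_set)
    return svn_only, git_only


if __name__ == "__main__":
    # Invented sample: r102 never made it to git, "ccc" has no note.
    revs = [100, 101, 102, 103]
    notes = {"aaa": 100, "bbb": 101, "ccc": None, "ddd": 103}
    print(eliminate(revs, notes))
```

Whatever remains in either list is the handful worth inspecting by hand.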

view this post on Zulip starseeker (Jul 16 2020 at 19:50):

.gitconfig helpers section

view this post on Zulip starseeker (Jul 16 2020 at 19:50):

svnrev

view this post on Zulip Sean (Jul 16 2020 at 19:51):

i'm okay unaliased, rather know what's going on first

view this post on Zulip Sean (Jul 16 2020 at 19:51):

got it: git log --all --pretty=format:"%H %N" --grep svn:revision:29886|awk '{system("git checkout "$1)}'

view this post on Zulip starseeker (Jul 16 2020 at 19:51):

It's probably pretty slow for scripting - I wasn't trying to performance optimize, thinking it was just for checking out one svn rev...

view this post on Zulip starseeker (Jul 16 2020 at 19:53):

Make sure you cloned the notes too, by the way, or that won't work: git fetch origin refs/notes/commits:refs/notes/commits

view this post on Zulip Sean (Jul 16 2020 at 19:54):

ah, was just about to say - I have no notes

view this post on Zulip Sean (Jul 16 2020 at 19:54):

is there a way to fetch all?

view this post on Zulip starseeker (Jul 16 2020 at 19:55):

git clone --mirror https://github.com/starseeker/brlcad_conv3.git

view this post on Zulip starseeker (Jul 16 2020 at 19:55):

I think that'll do it

view this post on Zulip Sean (Jul 16 2020 at 19:55):

k

view this post on Zulip starseeker (Jul 16 2020 at 19:56):

it drives me nuts that git won't pull the notes by default... probably another one of those decisions like not tracking file moves.

view this post on Zulip Sean (Jul 16 2020 at 19:56):

okay, so now it's just a process of elimination

view this post on Zulip starseeker (Jul 16 2020 at 19:59):

Was Tom an SVN era only committer or did he have CVS commits? Things get a lot more wonky when we cross the CVS threshold...

view this post on Zulip starseeker (Jul 16 2020 at 20:02):

I wish bob would put at least his name on his github account - his account name by itself looks rather bleak

view this post on Zulip Sean (Jul 16 2020 at 20:07):

this is going to take a lil while, but getting closer .. lots of curious little discrepancies to chase down

view this post on Zulip Sean (Jul 16 2020 at 20:08):

just looking at svn revisions, some clearly didn't map, so it'll be easy to find those -- I suspect they're something categoric like adding directories or moving files

view this post on Zulip Sean (Jul 16 2020 at 20:08):

or attributes like you mentioned

view this post on Zulip starseeker (Jul 16 2020 at 20:10):

@Sean Are you checking just Tom's commits, or doing a whole-history analysis? If the latter you'll probably see on the order of a couple thousand commits that won't line up, at a guess.

view this post on Zulip starseeker (Jul 16 2020 at 20:14):

Any git commit without a note doesn't have a matching SVN commit (or at least, an identified one) although the "preliminary move commit" commits arguably do map to specific revisions (I just didn't bother assigning the rev number, since the subsequent change commit is the one that should actually restore the tree to the state that matches the SVN commit.)

view this post on Zulip Sean (Jul 16 2020 at 20:15):

heh, why would I expand scope on a specific discrepancy? that'd be a terrible way to go about v&v :)

view this post on Zulip Sean (Jul 16 2020 at 20:16):

just checking tom's to understand this delta. it's 7 commits, should be easy to isolate and understand.

view this post on Zulip starseeker (Jul 16 2020 at 20:16):

Wasn't sure what you were up to - "lots of curious little discrepancies" sounded ominous

view this post on Zulip starseeker (Jul 16 2020 at 20:18):

At various points when debugging the conversion, I generated lists of sets of unmapped commits. Can't say I'd look forward to it, but if you need me to I can prepare a complete list of SVN brlcad commits and the corresponding git log and produce the sets of commit deltas.

view this post on Zulip Sean (Jul 16 2020 at 20:18):

well, in trying to pin it down, a couple more numbers aren't adding up. if I do a git log on browder and pull all that have an svn ID, I get 1626 commits on trunk, 1919 on all .. which is slightly off the 1630 on the public site and the 1930 you reported via some script

view this post on Zulip starseeker (Jul 16 2020 at 20:20):

If I remember correctly commits that are only locatable by tags won't show up in the default git log listings, which was the reason for that crazy script to introspect everything.

view this post on Zulip Sean (Jul 16 2020 at 20:22):

No need, it's easy to pull the mapping with: git log --pretty=format:"%H %N" | grep revision | sed 's/svn:revision://g'

view this post on Zulip Sean (Jul 16 2020 at 20:22):

can write an awk to show the gaps or just inverse grep as needed
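
A Python take on "show the gaps", assuming the mapping has been reduced to lines of the form `<sha> <rev>` as the sed pipeline above would leave them. The helper names and sample lines are invented.

```python
def parse_mapping(text):
    """Parse '<sha> <rev>' lines into a dict of rev (int) -> sha."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            mapping[int(parts[1])] = parts[0]
    return mapping


def gaps(revs):
    """Return (first, last) ranges of revision numbers that are absent."""
    ordered = sorted(set(revs))
    missing = []
    for a, b in zip(ordered, ordered[1:]):
        if b - a > 1:
            missing.append((a + 1, b - 1))
    return missing


SAMPLE = """\
aaaa 1
bbbb 2
cccc 5
dddd 6
eeee 9
"""

if __name__ == "__main__":
    print(gaps(parse_mapping(SAMPLE)))
```

A gap is not necessarily a lost commit; it only marks a revision with no note pointing at it.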

view this post on Zulip Sean (Jul 16 2020 at 20:27):

Just FYI, now that we're really close, I'm planning on doing actual v&v on the repo to sanity check everything. I don't expect to find anything cause I know you poured heart and soul into the previous revisions, but still better to find any problems now rather than later.

view this post on Zulip Sean (Jul 16 2020 at 20:29):

Basically just looking for any actual data loss, like commits missing that shouldn't be missing or something that's off by one or some other bug.
Nothing exhaustive, nothing to hold anything up either. Just basic comparative testing to see if we understand and expect all the differences.

view this post on Zulip Sean (Jul 16 2020 at 20:30):

tom's actually a rather convenient delta to investigate.

view this post on Zulip Sean (Jul 16 2020 at 20:43):

@starseeker so is the account name actually preserved anywhere? It's okay if it's not, but I'm not seeing it and thought it was getting collapsed and preserved somewhere.

view this post on Zulip starseeker (Jul 16 2020 at 20:44):

It's not, except in the account-map file. Only other approach I can think of is to add another note line to the commits with the cvs/svn commit name, and that'd be a bit of a job to do right.

view this post on Zulip starseeker (Jul 16 2020 at 20:45):

I'm afraid to touch the main logic at this point if I don't absolutely have to, which would mean appending another note line with a post-conversion analysis. Doable, but not trivial.

view this post on Zulip starseeker (Jul 16 2020 at 20:49):

If it's helpful, here's the list my logic generates of SVN commits that have no identifiable corresponding git commit (at least, without analyzing the contents of the diffs themselves, which I have not attempted): svn_list.txt

view this post on Zulip Sean (Jul 16 2020 at 20:50):

hmm, interesting

view this post on Zulip Sean (Jul 16 2020 at 21:01):

starseeker said:

If it's helpful, here's the list my logic generates of SVN commits that have no identifiable corresponding git commit (at least, without analyzing the contents of the diffs themselves, which I have not attempted): svn_list.txt

Have you gone through the list already exhaustively? Anything unexpected?

view this post on Zulip Sean (Jul 16 2020 at 21:03):

obviously there are lots of categoric ones to not worry about that I'd distill it down to, like all the generated ones and tag commits, which are non-issues.

view this post on Zulip Sean (Jul 16 2020 at 21:07):

starseeker said:

It's not, except in the account-map file. Only other approach I can think of is to add another note line to the commits with the cvs/svn commit name, and that'd be a bit of a job to do right.

I have mixed feelings on this. On one hand, it would be nice to preserve the actual user name recorded on that specific commit, but the historic merit is questionable (beyond provenance, which is already lost) and can't think of an actual use case unless mappings are wrong (which reminds me, should check the first and last author in the mapping file specifically).

view this post on Zulip Sean (Jul 16 2020 at 21:08):

It's fine without. If you want to add it, that'd be fine too.

view this post on Zulip starseeker (Jul 16 2020 at 21:17):

Not exhaustively, and that particular list is CVS era only - I'm working on a more comprehensive one.

view this post on Zulip starseeker (Jul 16 2020 at 21:19):

I'm inclined to skip it for now - since git notes can be added without impacting the main sha1 repo history, we can always go back and generate the mappings later if we discover it's worthwhile. ( I plan to put the original CVS and SVN repos up in a single archived git repository on the project, to preserve them for potential use when something comes along to dethrone git and some poor sucker gets to do this again.)

view this post on Zulip starseeker (Jul 16 2020 at 21:30):

These might be a bit more interesting - I disabled the limiter and ran the check for all svn commits, as well as printing out unmapped (or at least, not uniquely mapped by commit message) git commits.
svn_list.txt
git_list.txt

view this post on Zulip starseeker (Jul 16 2020 at 21:32):

The git version is less sophisticated - duplicate commit messages on different commits will show up - but it's a start.

view this post on Zulip starseeker (Jul 16 2020 at 21:37):

Update - better version of the git_list.txt file that also removes unique timestamp + message matches. < 2k as opposed to almost 6k, and visually most of them look like cvs-fast-export breaking down commits differently:
git_list.txt

view this post on Zulip starseeker (Jul 16 2020 at 21:39):

The branch delete commits are needed to preserve when a branch was removed in SVN, since we can't actually delete the branches in git without unreachable commits being garbage collected.

view this post on Zulip starseeker (Jul 16 2020 at 21:55):

First number in both lists is the timestamp, so they're sorted chronologically. SVN has commit ids, and git has sha1 hashes. Then for both commit message is shown, which usually gives a hint as to why there's no mapping in the other system.
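
A rough Python sketch of the "remove unique timestamp + message matches" step on two lists shaped like that. The exact file layout is an assumption here (whitespace-separated timestamp, id, then the message on one line), and the sample entries are invented.

```python
from collections import Counter


def parse_list(text):
    """Parse 'timestamp id message' lines into (timestamp, id, message)."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        message = parts[2] if len(parts) == 3 else ""
        entries.append((int(parts[0]), parts[1], message))
    return entries


def key(entry):
    """Match on timestamp and message, ignoring the id column."""
    return (entry[0], entry[2])


def drop_unique_matches(svn_entries, git_entries):
    """Remove entries whose (timestamp, message) occurs exactly once on each side."""
    svn_keys = Counter(key(e) for e in svn_entries)
    git_keys = Counter(key(e) for e in git_entries)
    matched = {k for k, n in svn_keys.items() if n == 1 and git_keys.get(k) == 1}
    return (
        [e for e in svn_entries if key(e) not in matched],
        [e for e in git_entries if key(e) not in matched],
    )


SVN = "1000 r1 add file\n2000 r2 fix bug\n3000 r3 set svn:ignore\n"
GIT = "1000 a1 add file\n2000 b2 fix bug\n2000 c3 fix bug\n4000 d4 preliminary move commit\n"

if __name__ == "__main__":
    svn_left, git_left = drop_unique_matches(parse_list(SVN), parse_list(GIT))
    print([e[1] for e in svn_left], [e[1] for e in git_left])
```

Entries with a duplicated message on one side stay in the output on purpose, since those are the ambiguous cases.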

view this post on Zulip Sean (Jul 16 2020 at 22:05):

posted announcement to mailing lists, facebook, and twitter

view this post on Zulip starseeker (Jul 17 2020 at 00:08):

git-stats looks like it's pretty much working - probably need to tweak the output some for our specific needs, but right general idea:
https://brlcad.org/~starseeker/git_stats/authors/best_authors.html

view this post on Zulip starseeker (Jul 17 2020 at 13:27):

/me pushes his luck by feeding the full 70k+ commit history through the git->fossil converter... curious to see if fossil can handle this.

view this post on Zulip Sean (Jul 17 2020 at 19:06):

of course it can, no reason it shouldn't. he's pretty consistent in making things robust to scale.

hey question on the svn revisions and git notes...

view this post on Zulip Sean (Jul 17 2020 at 19:18):

I know this is coming late and maybe we hashed it out earlier(??), but given the tooling issues, what about just stashing the cvs/svn rev info as the last line in the commit log?

view this post on Zulip Sean (Jul 17 2020 at 19:40):

@Daniel Rossberg per your e-mail, you're also welcome to use your brlcad.org alias (rossberg).. which can be pointed to anything, and can be claimed in your github account as an additional address.

view this post on Zulip starseeker (Jul 17 2020 at 20:12):

Took most of a day to run, but it did work - cool! I present BRL-CAD, in fossil:
brlcad-fossil.jpg

view this post on Zulip starseeker (Jul 17 2020 at 20:15):

Theoretically possible to stash it there, but once we do we lose the trivial 1-1 commit message correspondence with the earlier repositories. The latter is what let me generate the svn_list.txt and git_list.txt files above - I know we could work around adding the extra info, but the git notes appealed to me semantically (metadata on the commit, rather than part of the core message/data/parent relationship)

view this post on Zulip starseeker (Jul 17 2020 at 20:16):

Also, I can't incorporate it into the CVS portion of the history without trying to hack the cvs-git tool in some weird way - I'm taking their git output and assigning the notes with our ID numbers post-conversion, rather than during.

view this post on Zulip Sean (Jul 17 2020 at 20:19):

Once they're in git, the log messages can be edited, so CVS could still be annotated too.

view this post on Zulip starseeker (Jul 17 2020 at 20:20):

Editing the log messages is (I think) like editing the commit names - it will propagate invalidating the SHA1 hashes all the way up the chain.
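
Why an edit propagates can be shown with a toy hash chain in Python. This is a simplified model, not git's real commit object format: each id here is just the SHA-1 of the parent id plus the message, and the function names are made up.

```python
import hashlib


def commit_id(parent_id, message):
    """Toy commit id: SHA-1 over the parent id and the log message."""
    data = f"parent {parent_id}\n\n{message}".encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def build_chain(messages):
    """Return the ids of a linear history with the given messages."""
    ids = []
    parent = ""
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


if __name__ == "__main__":
    original = build_chain(["first", "second", "third"])
    edited = build_chain(["first", "second\nsvn:revision:2", "third"])
    for old, new in zip(original, edited):
        print(old[:8], new[:8], "same" if old == new else "changed")
```

Only the commits before the edited one keep their ids; the edited commit and everything after it get new ones, even though "third" itself was not touched.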

view this post on Zulip Sean (Jul 17 2020 at 20:20):

I get the appeal, but the downsides are starting to dominate the more I work with it.

view this post on Zulip Sean (Jul 17 2020 at 20:21):

invalidating the sha hashes was a notes issue though, wasn't it

view this post on Zulip Sean (Jul 17 2020 at 20:21):

if we're not using notes, then that's no longer an issue

view this post on Zulip starseeker (Jul 17 2020 at 20:21):

Actually, when you asked the git list they gave us a theoretical way around that.

view this post on Zulip starseeker (Jul 17 2020 at 20:22):

I never tested it, but it's not a huge issue - in principle I could do a complete regeneration of all the notes information given timestamps and commit messages that match the CVS/SVN messages.

view this post on Zulip starseeker (Jul 17 2020 at 20:23):

What downsides are you encountering?

view this post on Zulip Sean (Jul 17 2020 at 20:23):

right but that's the whole point -- it's really a half-baked feature that isn't working well. the log message is part of the commit and the only reliable place to stash it.

view this post on Zulip starseeker (Jul 17 2020 at 20:24):

I could do it for the SVN portion of the history, although there is a risk I'll break something - CVS is much harder.

view this post on Zulip starseeker (Jul 17 2020 at 20:28):

How much do you envision using that information? I was figuring the "svnrev" alias for the gitconfig file would cover the most common use case - check out an svn revision - and those ids would grow steadily less relevant with time... Is the part you're not liking that you don't get the notes in a default git clone?

view this post on Zulip starseeker (Jul 17 2020 at 20:36):

Actually, doing it even with the SVN history would be a substantial effort as I look at it - over 300 commits would have to be manually updated, plus the correct surgery on the C++ commit header generation code.

view this post on Zulip Sean (Jul 17 2020 at 20:36):

well let's see.. there's:
1) people have to be told that notes exist and use a command they've probably never used before to pull them
2) additional options that must be learned to work with them (e.g., --pretty=format: %N)
3) 72354 commits to add them that show up in log, have to be ignored or scripted around
4) the restriction that if we change any historic commit, we'll need to do surgery to reattach the note
5) the general feeling that notes are half-baked and they're not prioritized to change anytime soon
6) needing to have additional customizations/macros that have to be remembered, maintained, explained
7) the fact that it presents to users simply as the last line of the log, so it didn't really buy us more than logical separation
8) logical separation isn't compelling by itself as it could just as easily be stripped from logs (with less machinery than adding it)
9) the svn revs are not visible to an observer without it being explained...
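
Points 1, 2 and 9 can be seen in a small experiment. This is only a sketch under assumptions: the throwaway repositories, the file name, and the "svn:revision:1" note text are made up for illustration and are not the real conversion data.

```shell
# Throwaway demo repos; paths and the note text are made up for illustration.
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "origin" has one commit with a note attached to it.
git init -q "$tmp/origin"
git -C "$tmp/origin" symbolic-ref HEAD refs/heads/master
echo hello > "$tmp/origin/file.txt"
git -C "$tmp/origin" add file.txt
git -C "$tmp/origin" commit -q -m "initial commit"
git -C "$tmp/origin" notes add -m "svn:revision:1" HEAD

# A default clone copies branches and tags, but not refs/notes/*.
git clone -q "$tmp/origin" "$tmp/clone"
note_before=$(git -C "$tmp/clone" log -1 --pretty=format:%N)

# The notes have to be fetched explicitly...
git -C "$tmp/clone" fetch -q origin 'refs/notes/*:refs/notes/*'
# ...and requested with %N when a custom log format is used.
note_after=$(git -C "$tmp/clone" log -1 --pretty=format:%N)

echo "before fetch: '$note_before'"
echo "after fetch:  '$note_after'"
```

Nothing in the fresh clone hints that the note exists until the extra fetch is run, which is the discoverability problem described above.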

view this post on Zulip Sean (Jul 17 2020 at 20:37):

I suppose #1 and #9 are related, but separate points on needing to know they exist, and needing to actively take steps to do something about it

view this post on Zulip starseeker (Jul 17 2020 at 20:37):

@Sean Another possibility is to write a utility to take the completed conversion and construct a new repository from that, incorporating the notes as commits.

view this post on Zulip starseeker (Jul 17 2020 at 20:38):

(essentially, "replay" the history again, but this time from git->git rather than through all the custom insanity.)

view this post on Zulip starseeker (Jul 17 2020 at 20:39):

That's probably the most practical option by a long shot, actually, now that I think about it.

view this post on Zulip starseeker (Jul 17 2020 at 20:40):

er, incorporating the notes on the end of the commit messages rather.

view this post on Zulip Sean (Jul 17 2020 at 20:40):

starseeker said:

How much do you envision using that information? I was figuring the "svnrev" alias for the gitconfig file would cover the most common use case - check out an svn revision - and those ids would grow steadily less relevant with time... Is the part you're not liking that you don't get the notes in a default git clone?

I actually envision using it on the regular for at least a while until references in trackers and notes and other places become less frequent.. but again, I don't need machinery to do that. I just need it somewhere. A file in the repo would work if the revs didn't change. Since they do, the log becomes the next best place I think. One can grep a log and grab a sha.

view this post on Zulip starseeker (Jul 17 2020 at 20:41):

So you're wanting something robust even to a full history rewrite, if it comes to that?

view this post on Zulip Sean (Jul 17 2020 at 20:41):

Is there value in the branch note? aren't they on their respective branches?

view this post on Zulip starseeker (Jul 17 2020 at 20:42):

Not really, git doesn't have the same notion of branch specific histories that svn does

view this post on Zulip starseeker (Jul 17 2020 at 20:42):

If you want the ability to find the commits made to a branch, and only those commits, you need the branch notes

view this post on Zulip starseeker (Jul 17 2020 at 20:43):

I think I've got a link somewhere that explains how that works - it's a low level consequence of Git's world view

view this post on Zulip Sean (Jul 17 2020 at 20:43):

I do recall the conversation a while back
I guess I've just not needed to know that specifically

view this post on Zulip Sean (Jul 17 2020 at 20:45):

and can't it be derived? I mean I can pull a git tree view and see all the commits on that branch

view this post on Zulip Sean (Jul 17 2020 at 20:46):

it's of course squirrelly when commits are cherry picked over, but from svn's perspective, they would have presented as being made on the branch too,
unless one peeks at the mergeinfo

view this post on Zulip starseeker (Jul 17 2020 at 20:46):

You'll see the commits, but git doesn't retain the origin branch for the commit. Once the commit is referenced by multiple branches, they're equal - there's nothing that remembers what the "first" branch was. It will work up to a point, but once you start merging multiple directions between branches you lose the origin information

view this post on Zulip Sean (Jul 17 2020 at 20:47):

but... it'd be the first chronologically

view this post on Zulip starseeker (Jul 17 2020 at 20:48):

https://stackoverflow.com/questions/4629358/show-only-history-of-one-branch-in-a-git-log discusses some of the issues

view this post on Zulip Sean (Jul 17 2020 at 20:48):

and even then, I'm not sure what knowing the branch is going to help with. knowing the committer, sure. knowing when or a commit message saying why, sure.

view this post on Zulip starseeker (Jul 17 2020 at 20:49):

I sometimes want it to know if a particular change took place while I was working in a topic branch, or whether the change took place in trunk.

view this post on Zulip Sean (Jul 17 2020 at 20:49):

so that joker already does something I really despise.. --squash

view this post on Zulip starseeker (Jul 17 2020 at 20:50):

@Sean If I remember correctly, you can see the issue by trying to look at the history of the bullet branch - use git's own tools, and then the method I have in the NOTES file using the branch notes.

view this post on Zulip Sean (Jul 17 2020 at 20:51):

starseeker said:

I sometimes want it to know if a particular change took place while I was working in a topic branch, or whether the change took place in trunk.

but that's my point, if you annotate the line and find the hash, and look at the first instance on a git tree, won't you know that?

view this post on Zulip starseeker (Jul 17 2020 at 20:51):

If I'm trying to review what was done in the branch, but I've merged in trunk/master, it gets hard because suddenly a whole bunch of "master" commits are now part of that branch's history, interwoven with the commits made on the branch

view this post on Zulip starseeker (Jul 17 2020 at 20:52):

I'd have to try that for an individual commit, but if both branches that reference the individual commit are older than the commit itself I don't think you can distinguish which one created it.

view this post on Zulip starseeker (Jul 17 2020 at 20:55):

Another concrete case - if I want to look at the original development of the CMake build system in the cmake branch, in SVN I can log just in that branch and not see any trunk commits that happened while that branch was live. In Git, once I merged the cmake branch back into master, suddenly all the master commits that took place while the cmake branch was live are effectively part of the history of both branches.
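
The effect can be reproduced in a throwaway repository. This is a sketch, not the BRL-CAD history: the branch name "topic", the commit names m1/t1/m2/t2, and the small commit helper are all invented for the demo.

```shell
# Throwaway repo; branch and commit names are invented for the demo.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master

# Helper: one new file per commit, so merges never conflict.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit m1                        # on master, before the branch exists
git checkout -q -b topic
commit t1                        # on topic
git checkout -q master
commit m2                        # on master while topic is live
git checkout -q topic
git merge -q -m "merge master into topic" master
commit t2                        # on topic, after pulling in master

# Plain log of topic: master's m2 is now interwoven with topic's own work.
all=$(git log --pretty=format:%s topic)
# --first-parent hides m2, but still lists m1 (the branch point is not recorded).
first_parent=$(git log --first-parent --pretty=format:%s topic)
# master..topic isolates topic's commits, but only until topic is merged back.
not_on_master=$(git log --pretty=format:%s master..topic)

git checkout -q master
git merge -q topic               # fast-forward: master now contains everything
after_merge=$(git log --pretty=format:%s master..topic)

printf 'all:\n%s\n\nfirst-parent:\n%s\n\nmaster..topic:\n%s\n\nafter merge back: [%s]\n' \
    "$all" "$first_parent" "$not_on_master" "$after_merge"
```

Once the branch is merged back, master..topic comes back empty, so the "which commits were made on this branch" question can no longer be answered from the refs alone. That is the gap the branch notes are meant to fill.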

view this post on Zulip Sean (Jul 17 2020 at 20:55):

I'm still not seeing how that's a problem that needs to be solved. So commits are interwoven... that means cherry picking might be hard. It probably means I should merge more frequently or will make me merge less frequently or, better yet, not be working on a branch for a long time.

view this post on Zulip starseeker (Jul 17 2020 at 20:56):

It makes it hard for me to follow the commit history of a particular feature's development, without interference from commits in other branches. If I'm the only one that has the problem it doesn't matter particularly, but that was my motivation since it is something that can be done now in SVN (and I have done on occasion).

view this post on Zulip Sean (Jul 17 2020 at 20:59):

I may use it more than I realize, but I'm still struggling to come up with a case where knowing the branch is going to change my behavior or awareness on something. I'm usually wondering "who wrote this chunk of code, why was it written". I suppose knowing a branch might help indicate that but to date the info's either not existed or come from log messages because branch use has historically been big isolated things.

view this post on Zulip starseeker (Jul 17 2020 at 21:00):

Right - that's the point though, in Git we lose that isolation. Hang on, let me see if I can give you a concrete example with bullet...

view this post on Zulip Sean (Jul 17 2020 at 21:00):

like I might consider the binary attributes or opencl branches, they both have lots of changes, so it might be nice to know what changes aren't on trunk

view this post on Zulip starseeker (Jul 17 2020 at 21:01):

@Sean do you want me to start trying to figure out how to replay the history and consolidate the notes into the commit message?

view this post on Zulip Sean (Jul 17 2020 at 21:01):

but then maybe I should check out those histories, because I expect the tree view to clearly show what was done on the branch

view this post on Zulip starseeker (Jul 17 2020 at 21:02):

in my experience it does not

view this post on Zulip starseeker (Jul 17 2020 at 21:02):

I may be missing something - see if you can use (say) gitk to visualize the history of the bullet branch

view this post on Zulip starseeker (Jul 17 2020 at 21:03):

(by the way, for general history browsing I generally use gitk --branches="*" to avoid seeing the notes commits)

view this post on Zulip Sean (Jul 17 2020 at 21:04):

okay, so that convinced me it's worth keeping for now -- the branch info -- if only because we have a dozen branches with work worth isolating and if it helps isolate them, fair enough

view this post on Zulip starseeker (Jul 17 2020 at 21:06):

OK, so in the NOTES file I have two aliases defined - logb and logsvnb. The former tries to use git's "standard" information to follow the branch history, and the logsvnb alias uses the notes.

view this post on Zulip starseeker (Jul 17 2020 at 21:07):

If you checkout the bullet branch, then do:

git logb

you'll get one result, and

git logsvnb

will produce another.

view this post on Zulip starseeker (Jul 17 2020 at 21:08):

(you can also do what the aliases are doing in scripts, that was just an easy way for me to achieve the result)

view this post on Zulip Sean (Jul 17 2020 at 21:09):

after I set something up, right, to get those aliases?

view this post on Zulip Sean (Jul 17 2020 at 21:10):

Screen-Shot-2020-07-17-at-5.09.07-PM.png <-- another downside...

view this post on Zulip starseeker (Jul 17 2020 at 21:13):

Try --branches="*" instead of --all - does that help?

view this post on Zulip starseeker (Jul 17 2020 at 21:14):

Yes, the NOTES definitions get added to your ~/.gitconfig file

view this post on Zulip Sean (Jul 17 2020 at 21:15):

ah, need a [alias] header
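
For reference, git config writes that section header itself when an alias is set from the command line. A small sketch, using a scratch file instead of ~/.gitconfig; the alias name "lognotes" and its definition are made up here and are not the logb/logsvnb definitions from the NOTES file.

```shell
# Scratch config file so the real ~/.gitconfig is not touched.
cfg=$(mktemp)

# "lognotes" is a made-up alias name, not one of the NOTES file definitions.
git config --file "$cfg" alias.lognotes "log --pretty=format:'%h %s%n%N'"

# The file now starts with an [alias] section containing the definition.
cat "$cfg"
```

Running the same command with --global instead of --file would put the entry under [alias] in ~/.gitconfig.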

view this post on Zulip starseeker (Jul 17 2020 at 21:16):

Oh, sorry - I figured for the docs to put in a fully populated .gitconfig file as an example, but I haven't assembled it yet (if we decide not to keep the notes in this form it's moot anyway).

view this post on Zulip Sean (Jul 17 2020 at 21:16):

I know this is all one-time setup, but it really does feel clunky -- I think if we can make it work as the last two lines of the log message, we should and most if not all of this customization can go away

view this post on Zulip starseeker (Jul 17 2020 at 21:16):

Well, logsvnb won't but --all will behave better, that's true

view this post on Zulip Sean (Jul 17 2020 at 21:17):

well there simply won't be 72k commits that sometimes appear and have to be explained/ignored/parsed over/etc

view this post on Zulip Sean (Jul 17 2020 at 21:17):

so that'd be a plus

view this post on Zulip Sean (Jul 17 2020 at 21:18):

curious -- are they on a branch or something?

view this post on Zulip starseeker (Jul 17 2020 at 21:19):

It's some sort of separate mechanism .git/refs/notes/commits I think

view this post on Zulip Sean (Jul 17 2020 at 21:20):

very bizarre presentation

view this post on Zulip Sean (Jul 17 2020 at 21:20):

when I inspected one, it was presented as a change to file /dev/null

view this post on Zulip Sean (Jul 17 2020 at 21:21):

user "CVS_SVN_GIT Mapper <cvs_svn_git>" which I presume you set

view this post on Zulip starseeker (Jul 17 2020 at 21:21):

yes

view this post on Zulip Sean (Jul 17 2020 at 21:21):

and are predominantly at the end of the git log --all listing, but then are partially interwoven ...odd ordering

view this post on Zulip Sean (Jul 17 2020 at 21:26):

I wish git embraced a feature like svn attributes. I think mercurial supports arbitrary key/value attributes on their objects. sigh

view this post on Zulip starseeker (Jul 17 2020 at 21:35):

https://github.com/newren/git-filter-repo/ might have some possibilities

view this post on Zulip Sean (Jul 17 2020 at 21:38):

you sure it's not easier to update the tooling? seems like it should be easier to not write notes and simply append to the log messages as they are committed.

view this post on Zulip Sean (Jul 17 2020 at 21:41):

I think I could also probably write a script that adds them to the existing log if that'd help

view this post on Zulip starseeker (Jul 17 2020 at 21:42):

I've got over 300 manually adjusted commits which would have to be updated by hand (and being off by one character length in any of them will halt the commit) - plus it's now been close to a year since I've mucked in the code that generates the commit headers. And that's still just the SVN portion of the history - I'd need something like git-filter-repo anyway to get the CVS version.

view this post on Zulip Sean (Jul 17 2020 at 21:43):

ah, you're not rebuilding the cvs portion repeatedly?

view this post on Zulip Sean (Jul 17 2020 at 21:43):

just picking up at 17k or wherever

view this post on Zulip starseeker (Jul 17 2020 at 21:44):

Correct - cvs-git generates that, I then post-process it to match SVN commits to CVS->GIT commits

view this post on Zulip starseeker (Jul 17 2020 at 21:44):

~29k, IIRC

view this post on Zulip Sean (Jul 17 2020 at 21:44):

ah, right, the reorg was cvs

view this post on Zulip Sean (Jul 17 2020 at 21:45):

and that was nearly 23k

view this post on Zulip starseeker (Jul 17 2020 at 21:46):

/me nods - I could have put the svn numbers in the commit messages when I was originally writing that code - in fact I considered it - but it wouldn't have been a universal solution and it complicated the commit message mappings, which had to happen for CVS anyway.

view this post on Zulip Sean (Jul 17 2020 at 21:46):

/me nods

view this post on Zulip Sean (Jul 17 2020 at 21:46):

it wasn't really apparent the burden or full implications until working with it more

view this post on Zulip starseeker (Jul 17 2020 at 21:47):

If you want to help, you could take a look at https://github.com/newren/git-filter-repo/ and see if that provides enough power to rewrite the history by pulling the note (if any) from each commit and appending it to the commit message.

view this post on Zulip starseeker (Jul 17 2020 at 21:50):

The notes associate the information with the commit, so the problem becomes to (for each commit) retrieve the information and assemble the new commit message. Then, it needs to be applied and the history above it rewritten to accommodate the new sha1.
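
A minimal sketch of that per-commit rewrite, using git's built-in filter-branch rather than git-filter-repo (a swapped-in tool, chosen only because it needs no extra install). The throwaway repository, the "svn:revision:1" note text, and the timestamps are assumptions for illustration, not the real conversion data.

```shell
# Throwaway repo with two commits; only the first one carries a note.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master

echo one > f.txt
git add f.txt
GIT_AUTHOR_DATE="1047583133 +0000" GIT_COMMITTER_DATE="1047583133 +0000" \
    git commit -q -m "first"
git notes add -m "svn:revision:1" HEAD

echo two >> f.txt
GIT_AUTHOR_DATE="1047590000 +0000" GIT_COMMITTER_DATE="1047590000 +0000" \
    git commit -q -a -m "second"

# Rewrite every branch: keep the message, then append the note if one exists.
# filter-branch exports GIT_COMMIT (the original sha) to the filter.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the startup warning and delay
git filter-branch -f --msg-filter '
    cat
    if note=$(git notes show "$GIT_COMMIT" 2>/dev/null); then
        printf "\n%s\n" "$note"
    fi
' -- --branches >/dev/null

# The sha changes, but the message gains the note and the dates are kept.
new_msg=$(git log -1 --pretty=format:%B master~1)
new_ct=$(git log -1 --pretty=format:%ct master~1)
printf '%s\n---\ncommitter timestamp: %s\n' "$new_msg" "$new_ct"
```

Every descendant of a changed commit gets a new sha as well, so the whole history above the first annotated commit is rewritten in one pass. filter-branch also leaves the old refs under refs/original as a fallback.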

view this post on Zulip starseeker (Jul 17 2020 at 21:51):

Even with a well tuned process that'll be quite slow, especially for the older commits...

view this post on Zulip starseeker (Jul 17 2020 at 21:53):

@Sean if you're OK with a mapping file, what about a mapping file for timestamp plus commit message to SVN id? That should be robust if we can supply a way to look up a given commit using those inputs, even if we skip the notes

view this post on Zulip starseeker (Jul 17 2020 at 21:53):

(by the way, since a default git clone from github doesn't pull the notes, they're not going to be an issue for people unless they go looking for them...)

view this post on Zulip Sean (Jul 17 2020 at 21:54):

I would just shell script it myself, something like:
oldmessage="$(git log ...)"
git commit --amend -m "$oldmessage" -m "svn:revision:$revision"

view this post on Zulip starseeker (Jul 17 2020 at 21:55):

If you want to give that a go, the repo on github now should be a suitable test

view this post on Zulip Sean (Jul 17 2020 at 21:56):

I can, but it might delay things for monday -- still working through yesterday's validation check and need to create a few more accounts for the final upload

view this post on Zulip starseeker (Jul 17 2020 at 21:57):

If we need to figure out another solution that involves the conversion process, Monday is shot anyway...

view this post on Zulip Sean (Jul 17 2020 at 21:58):

a couple other things I wanted to test too, like what happens if we garbage collect -- are there any orphans now?

view this post on Zulip starseeker (Jul 17 2020 at 21:58):

git fsck --lost-found can check for that, IIRC
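
A small sketch of that check, run in a throwaway repository. The orphan commit is created artificially with git commit-tree just to have something for fsck to find; it is not how orphans would arise in the converted repo.

```shell
# Throwaway repo with one normal commit.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master
echo hello > file.txt
git add file.txt
git commit -q -m "initial commit"

# Create a commit object that no ref or reflog points at (an "orphan").
orphan=$(git commit-tree -m "unreferenced commit" 'HEAD^{tree}')

# fsck reports it as a dangling commit.
before=$(git fsck --lost-found 2>/dev/null)

# Garbage collect with immediate pruning, then check again.
git gc -q --prune=now
after=$(git fsck --lost-found 2>/dev/null)

printf 'before gc:\n%s\n\nafter gc:\n%s\n' "$before" "$after"
```

On a real clone the same two fsck calls, before and after the gc, give a quick view of whether the conversion left unreachable objects behind.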

view this post on Zulip Sean (Jul 17 2020 at 21:58):

also, what happens after deleting all the note commits. . and then garbage collecting. is there more to clean up.

view this post on Zulip Sean (Jul 17 2020 at 21:59):

right, I know -- that's just one of a couple dozen validation things to check on my list

view this post on Zulip Sean (Jul 17 2020 at 21:59):

common ops someone might do to their checkout

view this post on Zulip starseeker (Jul 17 2020 at 22:00):

/me shakes his head - I think we'd better not plan on Monday. You may find more issues, so let's just wait until you're either confident or have identified specifically where we need to end up to be ready.

view this post on Zulip Sean (Jul 17 2020 at 22:03):

We did originally plan for there being about 2 weeks of validation. I was going to try and cram as much as possible in 4 days :smile:

view this post on Zulip Sean (Jul 17 2020 at 22:04):

okay, time to stretch legs.. oof. giving myself nerve issues with so much sitting for months now.

view this post on Zulip starseeker (Jul 17 2020 at 22:05):

/me nods. Let me know.

FWIW, this might generate a SHA1 independent map:
git log --all --pretty=format:"%ct%nGITMSG%n%B%nGITMSGEND%n%N%n"

view this post on Zulip Sean (Jul 17 2020 at 22:06):

oh nice, that eliminates the indentation too.... I was just going to sed that out, but this is better.

view this post on Zulip starseeker (Jul 17 2020 at 22:06):

We could just commit that, and then the notes wouldn't matter much... most clones wouldn't have them.

view this post on Zulip starseeker (Jul 17 2020 at 22:07):

That's one thing about git I unreservedly approve of over SVN - it is way way better about programmatic extraction of information.

view this post on Zulip starseeker (Jul 17 2020 at 22:10):

If you re-clone from github, without pulling the notes, your git log (and gitk) won't show the notes commits even with the --all option.

view this post on Zulip starseeker (Jul 17 2020 at 22:28):

Here's a demonstration of a command pair that can use a timestamp and message to checkout a specific commit:
sha1=$(git log -F --after=1047583133 --before=1047583133 --grep="* empty log message *" --pretty=format:"%H") && git checkout $sha1

view this post on Zulip Sean (Jul 18 2020 at 00:04):

starseeker said:

We could just commit that, and then the notes wouldn't matter much... most clones wouldn't have them.

Just commit what?

view this post on Zulip Sean (Jul 18 2020 at 00:08):

good to know about the notes, so the extra commits wouldn't be ongoing nuisance unless someone pulls them. but if they're on the commit log, would there be a reason for keeping both? or is that not what you meant?

view this post on Zulip starseeker (Jul 18 2020 at 00:15):

If we generate a script that is capable of checking out the matching git commit without requiring the sha1, based on the timestamp and some or all of the commit message, then the git notes won't be needed anymore.

view this post on Zulip starseeker (Jul 18 2020 at 00:16):

I suppose we could strip them, but I'd rather leave them (at least in the primary github repo, even if we don't tell people to grab them by default) in case whatever script we come up with proves to have some sort of problem - then they'd be available as a fallback.

view this post on Zulip Sean (Jul 18 2020 at 00:20):

Won't they get disassociated when the commits get amended? guess we can find out..

view this post on Zulip starseeker (Jul 18 2020 at 00:22):

Amended? Why would we do that, if the script can check out the SVN id?

view this post on Zulip starseeker (Jul 18 2020 at 00:23):

If we eventually have to change the repo for some reason we'd have to either try the solution the git folks gave us or re-generate the notes, I suppose

view this post on Zulip Sean (Jul 18 2020 at 00:23):

maybe talking about different things?

to append the svn rev to the log message, that's an amend

view this post on Zulip starseeker (Jul 18 2020 at 00:24):

Right - I'm trying to avoid having to do that

view this post on Zulip Sean (Jul 18 2020 at 00:24):

i'm not following then what the suggestion was

view this post on Zulip starseeker (Jul 18 2020 at 00:25):

To generate a shell script that can accept a SVN revision number as an input, and do the appropriate checkout based on timestamp and commit message matching to check out the corresponding git commit.

view this post on Zulip starseeker (Jul 18 2020 at 00:25):

That won't tie the script to a particular sha1 hash, and so should be robust.

view this post on Zulip starseeker (Jul 18 2020 at 00:26):

Then we won't need to worry particularly about notes, updating log messages, etc.

view this post on Zulip Sean (Jul 18 2020 at 00:26):

how's it mapping svn rev to commit message? talking to origin or something?

view this post on Zulip starseeker (Jul 18 2020 at 00:27):

It would hard code the timestamp and message associations into a case statement, which would use the SVN rev as the lookup key
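
Roughly, such a script could look like the sketch below. Everything specific in it is hypothetical: the function name svn_to_sha, the revision numbers 101 and 102, the timestamps, and the commit messages stand in for the real generated table.

```shell
# Throwaway repo with two commits at fixed committer timestamps.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master

echo a > f.txt
git add f.txt
GIT_AUTHOR_DATE="1047583133 +0000" GIT_COMMITTER_DATE="1047583133 +0000" \
    git commit -q -m "add widget"
echo b >> f.txt
GIT_AUTHOR_DATE="1047590000 +0000" GIT_COMMITTER_DATE="1047590000 +0000" \
    git commit -q -a -m "fix widget"

# Hypothetical lookup: SVN revision -> (timestamp, message) -> current sha.
# No sha1 is hard coded, so the table survives a history rewrite.
svn_to_sha() {
    case "$1" in
        101) ts=1047583133; msg="add widget" ;;
        102) ts=1047590000; msg="fix widget" ;;
        *)   echo "unknown svn revision: $1" >&2; return 1 ;;
    esac
    # --after/--before compare against the committer date; -F makes --grep literal.
    git log --all -F --after="$ts" --before="$ts" --grep="$msg" --pretty=format:%H
}

sha_101=$(svn_to_sha 101)
sha_102=$(svn_to_sha 102)
echo "r101 -> $sha_101"
echo "r102 -> $sha_102"
```

A checkout is then git checkout "$(svn_to_sha 102)". If two commits ever share both a timestamp and a matching message, the function prints more than one sha, so a real script would need to detect that case.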

view this post on Zulip Sean (Jul 18 2020 at 00:28):

part of the issue was also one of simplicity and obviousness, not having to know some special knowledge to discover the svn rev or have it explained or documented

view this post on Zulip starseeker (Jul 18 2020 at 00:29):

Changing the commit messages is the most disruptive of all the options - are you sure it's worth it?

view this post on Zulip starseeker (Jul 18 2020 at 00:30):

You, Nick and I are probably the most likely to need SVN revs, and we're the most able to handle something less obvious...

view this post on Zulip Sean (Jul 18 2020 at 00:30):

agreed, but looking up commits wasn't the only issue

view this post on Zulip Sean (Jul 18 2020 at 00:31):

kanzure's first response weighs on me

view this post on Zulip Sean (Jul 18 2020 at 00:31):

his comment, questioning where the revs are and hoping we saved them somewhere.

view this post on Zulip Sean (Jul 18 2020 at 00:31):

it's not just that the data is or isn't available.

view this post on Zulip Sean (Jul 18 2020 at 00:31):

it's that he had to ask

view this post on Zulip Sean (Jul 18 2020 at 00:32):

and even then, that's still only 3 of the 7 issues that came to mind..

view this post on Zulip Sean (Jul 18 2020 at 00:35):

with all the churn and back and forth, an e-mail change seems inevitable ... like if github suddenly becomes persona non grata and we move to gitlab. we might want/need to rewrite all those stupid github privacy aliases .. talk about f'ing vendor lock in.

view this post on Zulip Sean (Jul 18 2020 at 00:35):

of course, that one is less interesting as future us may have other options

view this post on Zulip Sean (Jul 18 2020 at 00:35):

s/interesting/compelling/

view this post on Zulip starseeker (Jul 18 2020 at 00:35):

Heh - that's an argument to go to all brlcad.org emails, in some ways...

view this post on Zulip Sean (Jul 18 2020 at 00:36):

I thought about that

view this post on Zulip Sean (Jul 18 2020 at 00:36):

but then I think we'd end up with even less coverage displayed

view this post on Zulip starseeker (Jul 18 2020 at 00:36):

Ah - because people have to add the brlcad.org email to their profile?

view this post on Zulip Sean (Jul 18 2020 at 00:37):

unlikely to get the old devs like gary to associate an alias he's never used to his github account that he probably never uses.

view this post on Zulip Sean (Jul 18 2020 at 00:37):

right

view this post on Zulip Sean (Jul 18 2020 at 00:37):

I mean, 55 out of about 75 is not too shabby

view this post on Zulip starseeker (Jul 18 2020 at 00:39):

OK, I'll see if I can figure out the amending thing, since there are multiple potential applications/use cases. Just be aware I'm running out of steam, to a degree.

view this post on Zulip Sean (Jul 18 2020 at 00:40):

I know, I hated bringing it up. Don't mean to cause more work.

view this post on Zulip Sean (Jul 18 2020 at 00:43):

The usability implications have been somewhat jarring/unexpected, and simpler may be better. We're not losing anything.
And we probably could revert back to notes or attributes if some other feature ends up getting developed. I have to imagine something eventually will..

view this post on Zulip starseeker (Jul 18 2020 at 00:43):

/me winces. Once the repo goes live, a change of that sort will be disruptive for all forks even if we figure out how to do it.

view this post on Zulip Sean (Jul 18 2020 at 00:44):

yeah, I'm thinking more like if/when we change hosts again

view this post on Zulip Sean (Jul 18 2020 at 00:44):

it took us so long to get of sourceforge that github is bound to be obsolete soon.

view this post on Zulip Sean (Jul 18 2020 at 00:44):

off*

view this post on Zulip Sean (Jul 18 2020 at 00:44):

</sarcasm>

view this post on Zulip starseeker (Jul 18 2020 at 00:45):

Um. Even then, in principle we could migrate the git repo without breaking forks, if I understand correctly - it would just be a change in origins. The breakage would be if we needed to change emails on old commits (as opposed to associating them with the new accounts, say...)

view this post on Zulip Sean (Jul 18 2020 at 00:46):

well, yeah -- I think that'd be implicit because of all the github-specific aliases. that only works on github.

view this post on Zulip starseeker (Jul 18 2020 at 00:46):

Heh - how many times did sourceforge get sold before they started having trouble? That might be a decent yardstick...

view this post on Zulip Sean (Jul 18 2020 at 00:46):

if github were shuttered, there'd be no way to authenticate/claim those addresses

view this post on Zulip starseeker (Jul 18 2020 at 00:47):

Hm. That's true enough.

view this post on Zulip Sean (Jul 18 2020 at 00:47):

I think people just assume they'd rewrite their author names. I would if I were using one.

view this post on Zulip Sean (Jul 18 2020 at 00:48):

I find the idea of using a content provider's e-mail alias a bit wonky personally. Unless it's something "too big to fail" like gmail.com ... tech is notoriously unreliable, even fickle ... looking at you yahoo.com

view this post on Zulip starseeker (Jul 18 2020 at 00:50):

/me likes the idea of spam going to /dev/null with the noreply email...

view this post on Zulip Erik (Jul 18 2020 at 00:58):

so just put your email address someplace permanent, like geocities.com

view this post on Zulip starseeker (Jul 18 2020 at 01:00):

To be honest, I'm not all that worried (on a personal level) about my commits showing up anywhere - they haven't for a decade, and I'll live if they don't... as long as the project's stats behave reasonably, whether it ties to my account is secondary.

view this post on Zulip starseeker (Jul 18 2020 at 01:00):

@Erik you mentioned your git fu being strong - we have a case where help would be appreciated, if you have any ideas

view this post on Zulip Erik (Jul 18 2020 at 01:01):

a git repo is a git repo, they can be rewritten, there is no "exporting", it just is

view this post on Zulip starseeker (Jul 18 2020 at 01:01):

We need to rewrite a git history to take those commits that have notes, and append them to the end of the commit message instead.

view this post on Zulip Erik (Jul 18 2020 at 01:02):

shrug so slap a hook in and clone it to fire it or something

view this post on Zulip starseeker (Jul 18 2020 at 01:02):

words like append, rebase, filter-branch, and such hover around this question, but there are additional challenges - such as preserving the original timestamps while doing all this.

view this post on Zulip starseeker (Jul 18 2020 at 01:04):

we are striving for a degree of fidelity in history preservation that I conclude is somewhat unusual among git users...

view this post on Zulip Erik (Jul 18 2020 at 01:04):

there are several types of dates kept in git.. author date, commit date, merge date... um, read all the formatting options in the pretty printing section of man git-log

view this post on Zulip Erik (Jul 18 2020 at 01:05):

it's probably the quickest most comprehensive way to grok what git stores
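
For the timestamp concern specifically, the two dates git stores per commit are the author date (%at) and the committer date (%ct). A sketch of a message-only amend that keeps both, in a throwaway repository; the timestamps and the "svn:revision:1" trailer are made up for the demo.

```shell
# Throwaway repo; one commit whose author and committer dates differ.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q .
git symbolic-ref HEAD refs/heads/master
echo hello > file.txt
git add file.txt
GIT_AUTHOR_DATE="1047583133 +0000" GIT_COMMITTER_DATE="1047590000 +0000" \
    git commit -q -m "original message"

before=$(git log -1 --pretty=format:'%at %ct')

# --amend keeps the author date on its own, but resets the committer date
# to "now" unless GIT_COMMITTER_DATE is supplied, so pass the old one back in.
GIT_COMMITTER_DATE="$(git log -1 --pretty=format:%cD)" \
    git commit -q --amend -m "original message" -m "svn:revision:1"

after=$(git log -1 --pretty=format:'%at %ct')
echo "before: $before"
echo "after:  $after"
git log -1 --pretty=format:%B
```

The second -m becomes its own paragraph, so the appended line ends up separated from the original message by a blank line.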

view this post on Zulip starseeker (Jul 18 2020 at 01:07):

Is there some sort of standard "advanced" script for a situation like this, needing extensive (and non-unique) commit msg updates?

view this post on Zulip starseeker (Jul 18 2020 at 01:08):

If push comes to shove I can manipulate the data at whatever level is required, but it would be nice if there's a pre-packaged answer...

view this post on Zulip starseeker (Jul 18 2020 at 01:08):

/me realizes he should probably eat dinner...

view this post on Zulip Erik (Jul 18 2020 at 01:10):

there's a hooks directory that can be used for crap like this, the script just has to do one commit, then ask git to clone using it, or filter if you want to try to do it in place, or whatever. Or just write a script to iterate the commits and --amend them. Or ...

view this post on Zulip Erik (Jul 18 2020 at 01:11):

and it's a dvcs, so, y'know, if you break it, just grab another copy

view this post on Zulip starseeker (Jul 18 2020 at 01:26):

It's not the breaking it, it's the 200 iterations of breaking it before I manage not to break it...

view this post on Zulip Daniel Rossberg (Jul 19 2020 at 15:11):

Sean said:

Daniel Rossberg per your e-mail, you're also welcome to use your brlcad.org alias (rossberg).. which can be pointed to anything, and can be claimed in your github account as an additional address.

For github I want to stay with my github address.

Another issue is that my brlcad.org address is dead. It points to a sourceforge address, which doesn't accept mails from outside. ~/.forward seems to not work.

view this post on Zulip Sean (Jul 20 2020 at 23:02):

@Daniel Rossberg your alias no longer points to any sourceforge addresses -- they were all updated recently for everyone for that very reason.

view this post on Zulip Sean (Jul 20 2020 at 23:03):

just fyi.

view this post on Zulip starseeker (Jul 21 2020 at 02:38):

@Sean As long as I'm doing this anyway, would you prefer a different format for the SVN revision and branch info than what I was using? It wouldn't be too much more work to change the formatting once I get the initial logic working, if you would prefer something different.

view this post on Zulip Sean (Jul 21 2020 at 02:55):

Daniel Rossberg said:
For github I want to stay with my github address.

Another issue is that my brlcad.org address is dead. It points to a sourceforge address, which doesn't accept mails from outside. ~/.forward seems to not work.

And of course, it's not a problem to keep it on your github address either, it can be whatever you want. Just was letting you know it was an option. The aliases are DNS MX records, so they are aliased before they even hit a mail server.

view this post on Zulip Sean (Jul 21 2020 at 04:17):

starseeker said:

Sean As long as I'm doing this anyway, would you prefer a different format for the SVN revision and branch info than what I was using? It wouldn't be too much more work to change the formatting once I get the initial logic working, if you would prefer something different.

I think what you used is perfectly reasonable.

view this post on Zulip starseeker (Jul 23 2020 at 04:05):

https://github.com/starseeker/brlcad_nonotes

view this post on Zulip Sean (Jul 23 2020 at 04:07):

how's that 5 contributors less?

view this post on Zulip starseeker (Jul 23 2020 at 04:38):

I'll have to check... just got it working a couple hours ago, pretty fried.

view this post on Zulip starseeker (Jul 23 2020 at 04:39):

I'm actually seeing more?

view this post on Zulip starseeker (Jul 23 2020 at 04:40):

60 on the new vs 55 on the previous?

view this post on Zulip starseeker (Jul 23 2020 at 04:40):

might be taking time to populate

view this post on Zulip Sean (Jul 23 2020 at 04:44):

interesting, now it says sixty here too. guess it wasn't done processing

view this post on Zulip Sean (Jul 23 2020 at 04:45):

I got 7 more names sorted out, so we should be up to 62 now

view this post on Zulip starseeker (Jul 23 2020 at 05:17):

Ugh. Alright, can't wrap this up tonight (quite.)

view this post on Zulip starseeker (Jul 23 2020 at 05:17):

So close...

view this post on Zulip Sean (Jul 23 2020 at 05:18):

still got a bit more validation too, but yeah, the last couple names I got were huge wins

view this post on Zulip Sean (Jul 23 2020 at 06:16):

this is looking good. we're up to 74% of authors - 64 of 86 - and will be up to at least 95% commits after the next run. that should do it!

view this post on Zulip Sean (Jul 23 2020 at 06:16):

it should be at least 96.4%

view this post on Zulip Sean (Jul 23 2020 at 06:21):

there are flags on just three accounts with anomalies that I'll need to investigate. one with too few, one with way too many, and one not linking to their github

view this post on Zulip starseeker (Jul 23 2020 at 14:14):

Here's an upload with all the bells and whistles - converted all the emails as of account-maps earlier this morning, notes consolidated into commit messages, and just for grins I also wrapped single line commit messages to 72 chars:

https://github.com/starseeker/brlcad_conv4

view this post on Zulip starseeker (Jul 23 2020 at 14:18):

Needs a validation check to make sure I didn't accidentally mess something up in the processing, still...

view this post on Zulip starseeker (Jul 23 2020 at 15:14):

Ah crud, typo'ed a couple of the mappings. screeech, rerun...

view this post on Zulip starseeker (Jul 23 2020 at 16:01):

There we go (still populating site info...)
https://github.com/starseeker/brlcad_conv5

view this post on Zulip starseeker (Jul 23 2020 at 22:51):

@Sean is this what you were looking for with the svn commit names?

https://github.com/starseeker/brlcad_conv6/commit/6dc9436d0fc5f17176a0a5fc5d00b54b1194f75c

view this post on Zulip starseeker (Jul 23 2020 at 23:00):

I think I'm pretty much out of stuff I know still has to be done (aside from pulling newer commits of course) - let me know if you spot anything else.

view this post on Zulip starseeker (Jul 24 2020 at 00:35):

Ugh, 80 col experiment didn't work well. conv6 removed, replaced:
https://github.com/starseeker/brlcad_conv7

view this post on Zulip starseeker (Jul 24 2020 at 00:36):

(github must be wondering what on earth I'm doing...)

view this post on Zulip starseeker (Jul 24 2020 at 00:40):

/me proceeds to unplug brain for recharging...

view this post on Zulip Sean (Jul 24 2020 at 01:22):

yeah, I've got very strongly mixed feelings about inserting newlines where they didn't exist. I feel that's just bad git presentation defaults. Apparently they can be overcome (e.g., default interactive pager is LESS=-S even though it can auto-wrap to screen correctly).

view this post on Zulip Sean (Jul 24 2020 at 01:24):

starseeker said:

I think I'm pretty much out of stuff I know still has to be done (aside from pulling newer commits of course) - let me know if you spot anything else.

I have commits for three accounts to investigate, which I hope to finish up with tomorrow. I'm done with accounts -- we nearly got everyone that made at least 100 commits (woot!). We're definitely getting super close.

view this post on Zulip Sean (Jul 24 2020 at 01:24):

The log additions for branch and revision look like they were flawless. Trying to find one with no log message to see what it did...

view this post on Zulip starseeker (Jul 24 2020 at 10:09):

@Sean That was my experience with wrapping - gitk I know can deal with it, but doesn't by default??? (I can only conclude that it's a deliberate design decision, given the feature does exist and works...)

I've got notes somewhere on which option to set (at least for gitk), which we'll probably still want to advise people to do regardless because there are some cases I don't detect as wrappable.

My motivation for wrapping was two fold - 1) if we wrap lines, we'll get better behavior for new users with default tool settings and 2) interfaces/websites/tools that assume "standard" git commit message settings may behave better.

It's quite literally an option in the post-processing tool, so trivial to disable if you decide we shouldn't wrap them.

view this post on Zulip starseeker (Jul 24 2020 at 10:12):

@Sean If you just want any commit without a note migration, here's one:
https://github.com/starseeker/brlcad_conv7/commit/a8161859aa2d1d3935a257be9e725daff89e8157

view this post on Zulip starseeker (Jul 24 2020 at 10:16):

git log --invert-grep --grep="svn:revision" will list the ones without an svn tag

view this post on Zulip starseeker (Jul 24 2020 at 10:20):

Here's a no-svn-id commit with a longer message:
https://github.com/starseeker/brlcad_conv7/commit/0758d43db1ef6e2bb518d4e1db355bf6dc864527

view this post on Zulip starseeker (Jul 24 2020 at 10:21):

CVS era commit without svn id (i.e. not an artifact of SVN conversion)
https://github.com/starseeker/brlcad_conv7/commit/dd3a2e848c19e8610c82f81a40e6c9d7fdbc8c81

view this post on Zulip starseeker (Jul 24 2020 at 10:23):

I don't think we had any commits with an actual empty string in the git history - the preliminary conversions produced some like this:
https://github.com/starseeker/brlcad_conv7/commit/a4ad5e277ff55f47cc70bb36dd12097b31d03c02

view this post on Zulip starseeker (Jul 24 2020 at 11:24):

Ah, whoops - sorry Sean, just messed up that repo with an experiment. Hang on, creating a new one.

view this post on Zulip starseeker (Jul 24 2020 at 11:33):

https://github.com/starseeker/brlcad_r76458

view this post on Zulip Sean (Jul 24 2020 at 21:08):

Nice, I like the annotations.

view this post on Zulip Sean (Jul 24 2020 at 21:09):

how about "account" instead of "author" though

view this post on Zulip Sean (Jul 24 2020 at 21:09):

author has implications that may or may not be true

view this post on Zulip Sean (Jul 24 2020 at 21:10):

"account" or "username"

view this post on Zulip starseeker (Jul 24 2020 at 21:39):

What about "committer" ? Other than that I'd vote for account.

view this post on Zulip starseeker (Jul 24 2020 at 21:41):

@Sean can I do anything to help analyze the remaining concerns?

view this post on Zulip starseeker (Jul 24 2020 at 21:42):

(fwiw, "committer" is the equivalent term from git)

view this post on Zulip Sean (Jul 25 2020 at 01:09):

I would stick with svn's nomenclature

view this post on Zulip Sean (Jul 25 2020 at 01:09):

it's an account username, so either works

view this post on Zulip starseeker (Jul 25 2020 at 01:45):

Well, I'd prefer to match git, but it's not worth bikeshedding - account it is.

view this post on Zulip starseeker (Jul 25 2020 at 01:49):

Looks like the distcheck-repo_verify adaptation for Git is working.

view this post on Zulip starseeker (Jul 25 2020 at 01:51):

/me should confirm the svn-fast-export method is working for the other repos, actually - been a while since I tested that.

view this post on Zulip starseeker (Jul 25 2020 at 14:27):

Rather crude, but this should encapsulate what's needed for doing a verification between git and svn (at least, as far back as the end of the CVS history)
https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/misc/repoconv/verify.cpp

view this post on Zulip starseeker (Jul 25 2020 at 14:28):

Will need to run it against this version, with fixed svn branch names:
https://github.com/starseeker/brlcad_conv8

view this post on Zulip starseeker (Jul 25 2020 at 14:29):

I've not run it myself, beyond a few commits to see if it looks like it's working - it will be very slow, and there may be more optimal ways to go about checking - this is very much a brute force approach.
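
For readers following along, the brute-force idea can be sketched in a few lines of Python. This is not verify.cpp; it is a minimal illustration that assumes you already have a git checkout and an svn checkout of the same revision side by side. The function names and the list of skipped metadata directories are hypothetical, and the sketch ignores things a real check would need to consider, such as keyword expansion and symlinks.

```python
import hashlib
import os

# Version-control metadata directories that should not count as content
# (assumed list, adjust as needed).
SKIP_DIRS = {".git", ".svn", "CVS"}


def tree_digest(root):
    """Map each relative file path under root to the SHA-1 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def compare_trees(git_root, svn_root):
    """Return (only_in_git, only_in_svn, differing) as sorted path lists."""
    git_files = tree_digest(git_root)
    svn_files = tree_digest(svn_root)
    only_git = sorted(set(git_files) - set(svn_files))
    only_svn = sorted(set(svn_files) - set(git_files))
    differing = sorted(
        path
        for path in set(git_files) & set(svn_files)
        if git_files[path] != svn_files[path]
    )
    return only_git, only_svn, differing
```

Three empty lists mean the two checkouts agree file for file. Running this once per revision is what makes the approach slow.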

view this post on Zulip starseeker (Jul 25 2020 at 18:38):

We can compare the CVS portion of the history as well if you want to, but I'm not sure what we'd do about any discrepancies - I'm just using the output from cvs-fast-export, so any changes would be quite difficult.

view this post on Zulip starseeker (Jul 25 2020 at 18:40):

And in that case the true "ground truth" would actually be the equivalent CVS checkout, if we can map the svn revisions back to CVS in some fashion.

view this post on Zulip starseeker (Jul 25 2020 at 23:02):

I must be out of my mind, but I taught https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/misc/repoconv/verify/verify.cpp to check both the SVN and the CVS repositories, once things get that far back. (recommended to replace the sphflake.pix,v file as documented in the beginning of CONVERT.sh)

view this post on Zulip starseeker (Jul 26 2020 at 01:43):

@Sean when you said there were authors you need to check, was that in the conversion or the Github integration?

(trying to think of anything else useful I can do...)

view this post on Zulip Sean (Jul 27 2020 at 13:59):

In the conversion

view this post on Zulip Sean (Jul 27 2020 at 14:03):

I hope to have them all inspected today, will ask if there are questions. It's more a matter of tracing down all their master commit shas and seeing what the delta is against their trunk commits, to make sure all the differences can be explained.
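
A rough sketch of that kind of per-account delta check, in Python. It assumes you have already extracted one account name per commit from each side (for example from the svn log and from the git log); how those lists are produced is not shown, and the function name is hypothetical.

```python
from collections import Counter


def account_deltas(svn_accounts, git_accounts):
    """Return {account: git_count - svn_count} for accounts whose counts differ.

    Each argument is an iterable with one account name per commit.
    """
    svn_counts = Counter(svn_accounts)
    git_counts = Counter(git_accounts)
    deltas = {}
    for account in sorted(set(svn_counts) | set(git_counts)):
        diff = git_counts[account] - svn_counts[account]
        if diff:
            deltas[account] = diff
    return deltas
```

A negative number flags an account with too few commits on the git side, a positive number one with too many. Each nonzero entry is a difference that still needs an explanation.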

view this post on Zulip Sean (Jul 27 2020 at 14:05):

starseeker said:

Sean That was my experience with wrapping - gitk I know can deal with it, but doesn't by default??? (I can only conclude that it's a deliberate design decision, given the feature does exist and works...)

I'll just note that is your assumption, and not one I would make. Yet you're using it to justify a subsequent decision that all have to live with.

I've got notes somewhere on which option to set (at least for gitk), which we'll probably still want to advise people to do regardless because there are some cases I don't detect as wrappable.

We will also because I have little intention of manually injecting newlines once we're on github for command-line commits. I don't do it for other git repos and don't plan to on ours either except when the commit warrants a longer description and I'm in an editor.

My motivation for wrapping was two fold - 1) if we wrap lines, we'll get better behavior for new users with default tool settings and 2) interfaces/websites/tools that assume "standard" git commit message settings may behave better.

That said, these are sound reasons, save for the caveat I just stated -- that it just means the historic commits might be pretty but not the more recent ones.

It's quite literally an option in the post-processing tool, so trivial to disable if you decide we shouldn't wrap them.


view this post on Zulip Sean (Jul 27 2020 at 14:11):

starseeker said:

Sean That was my experience with wrapping - gitk I know can deal with it, but doesn't by default??? (I can only conclude that it's a deliberate design decision, given the feature does exist and works...)

I'll just note that is your assumption, and not one I would make. Yet you're using it to justify a subsequent decision that all have to live with.

From my perspective, this is a feature that git and github have wrong. Line wrapping is a presentation issue that is trivially handled by apps. Other distributed vcs didn't make the same decisions, and if we'd picked another we wouldn't even be having this consideration. Which is to say that it's possibly something we'll regret in the future when we migrate to git's successor. Unfortunately, it's trivial to add newlines but it's not trivial to remove them.

I've got notes somewhere on which option to set (at least for gitk), which we'll probably still want to advise people to do regardless because there are some cases I don't detect as wrappable.

We will also because I have little intention of manually injecting newlines once we're on github for command-line commits. I don't do it for other git repos and don't plan to on ours either except when the commit warrants a longer description and I'm in an editor.

My motivation for wrapping was two fold - 1) if we wrap lines, we'll get better behavior for new users with default tool settings and 2) interfaces/websites/tools that assume "standard" git commit message settings may behave better.

That said, these are sound reasons, save for the caveat I just stated -- that it just means the historic commits might be pretty but not the more recent ones.

It's quite literally an option in the post-processing tool, so trivial to disable if you decide we shouldn't wrap them.

I don't feel that strongly to oppose it. I do like it neat and tidy though it begs a couple questions (like what column did you wrap on? what about things like URLs? what about punctuation? ..).

It's a little concerning that it's not preserving what actually was written. It's slightly complicating the review process because they don't match and I have to do additional scripting (but I'll deal, just slows things down). Those are not strong enough to argue against though. I think you said you limited it to commits that had only 1-line comments? That's probably a good balance.

view this post on Zulip Sean (Jul 27 2020 at 14:27):

Sean said:

From my perspective, this is a feature that git and github have wrong.

Github actually appears to handle the long lines just fine (ellipses on presentation). This really is just a git tooling convention / defaults issue. I think I even read how one can make git log behave for a different format line.

But again, not enough to fight against it, just sharing my perspective. I probably wouldn't, but if you want to inject them on the single line commits, I won't fuss too much. :)

view this post on Zulip Sean (Jul 27 2020 at 16:26):

Hm... one possibility comes to mind. Putting svn:log:wrapped could be used to denote the ones wrapped, which would then make them invertible and an encoding of the original data.
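
To make the invertibility idea concrete, here is a minimal Python sketch. It is not the post-processing tool: it uses the standard `textwrap` module as a stand-in for whatever wrapping algorithm the conversion uses, assumes the marker goes on its own trailing line, and only round-trips exactly when the original message has single spaces between words. The function names are hypothetical.

```python
import textwrap

# Marker name taken from the suggestion above; its placement on a
# trailing line of its own is an assumption.
WRAP_TAG = "svn:log:wrapped"


def wrap_single_line(message, width=72):
    """Wrap a one-line message and tag it; leave anything else untouched."""
    if "\n" in message or len(message) <= width:
        return message
    wrapped = textwrap.fill(
        message,
        width=width,
        break_long_words=False,   # keep URLs and long identifiers intact
        break_on_hyphens=False,
    )
    return f"{wrapped}\n\n{WRAP_TAG}"


def unwrap(message):
    """Undo wrap_single_line for tagged messages; pass others through."""
    lines = message.split("\n")
    if lines[-1] != WRAP_TAG:
        return message
    # Drop the tag and the blank separator, then rejoin with single spaces.
    return " ".join(line for line in lines[:-1] if line)
```

Because only tagged messages are ever unwrapped, commits that were written with deliberate newlines are left alone in both directions.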

view this post on Zulip starseeker (Jul 27 2020 at 16:43):

I'll just note that is your assumption, and not one I would make. Yet you're using it to justify a subsequent decision that all have to live with.

Didn't intend it to be a justification - more an assessment of likelihood of it being changed.

From my perspective, this is a feature that git and github have wrong. Line wrapping is a presentation issue that is trivially handled by apps. Other distributed vcs didn't make the same decisions, and if we'd picked another we wouldn't even be having this consideration. Which is to say that it's possibly something we'll regret in the future when we migrate to git's successor. Unfortunately, it's trivial to add newlines but it's not trivial to remove them.

Point. I wasn't strongly advocating for it - I just put it in the test conversion as a demonstration of what I could achieve if it was of interest.

We will also because I have little intention of manually injecting newlines once we're on github for command-line commits. I don't do it for other git repos and don't plan to on ours either except when the commit warrants a longer description and I'm in an editor.

Fair enough.

That said, these are sound reasons, save for the caveat I just stated -- that it just means the historic commits might be pretty but not the more recent ones.

Which actually argues against doing it - don't want newer stuff to look "worse" in some sense.

I don't feel that strongly to oppose it. I do like it neat and tidy though it begs a couple questions (like what column did you wrap on? what about things like URLs? what about punctuation? ..).

Column 72 - used the "TextFlow" algorithm, which I gather is similar to what editors do for word wrapping.

It's a little concerning that it's not preserving what actually was written. It's slightly complicating the review process because they don't match and I have to do additional scripting (but I'll deal, just slows things down). Those are not strong enough to argue against though. I think you said you limited it to commits that had only 1-line comments? That's probably a good balance.

I wish you'd said something, I could have generated another version without the wrapping. (Still can, for that matter...)

The posted version is not the final version anyway... Over the weekend I think I figured out how to actually audit and fix the CVS era commits so the git checkout for each commit will match what cvs would produce (still testing, and will take a while to run, but initial results are promising.)

view this post on Zulip Sean (Jul 27 2020 at 17:07):

Column 74 is I think the minimum only because Git defaults to presenting 4 char indents on log output.

view this post on Zulip Sean (Jul 27 2020 at 17:08):

starseeker said:

I wish you'd said something, I could have generated another version without the wrapping. (Still can, for that matter...)

The posted version is not the final version anyway... Over the weekend I think I figured out how to actually audit and fix the CVS era commits so the git checkout for each commit will match what cvs would produce (still testing, and will take a while to run, but initial results are promising.)

Only slightly. But each upload has meant I need to regenerate my list of comparison hashes... ;)

view this post on Zulip Sean (Jul 27 2020 at 17:09):

On the latter point, what do you think about having the log tags actually denote cvs:revision:### (in addition to the svn revision) for the cvs portion?

view this post on Zulip Sean (Jul 27 2020 at 17:10):

if there's a way to record the actual account name used (not just the mapped account name) for both cvs and svn, that would be a nice-to-have preservation. if not, no biggie.

view this post on Zulip Sean (Jul 27 2020 at 17:13):

and wouldn't do it to only get cvs or only svn.. that could be confusing

view this post on Zulip Sean (Jul 27 2020 at 17:15):

by the way, I updated https://brlcad.org/wiki/Github_Migration with all the migration steps as I'd envisioned them. I may have forgotten a step or two, but I think most of it is there. I did try to make sure it incorporated all the points you mentioned in your (more elaborate) discussion.

view this post on Zulip Sean (Jul 27 2020 at 17:16):

Of course, some of the verification steps may cause more verification steps, but it's got the gist of what's needed.

view this post on Zulip starseeker (Jul 27 2020 at 20:05):

Growl... well, I can change CVS era commits but auditing them is proving trickier than I'd hoped in some ways... specifically, what do I check out from CVS when Git says a particular commit is on a dozen branches?

view this post on Zulip starseeker (Jul 27 2020 at 20:17):

About cvs:revision - correct me if I'm wrong, but did the CVS tool actually have revision numbers? I thought all we had was the numbers SVN assigned various commits when the cvs2svn migration occurred.

view this post on Zulip starseeker (Jul 27 2020 at 20:18):

I'm checking out by date and -r tag (when trunk/master isn't available) - is there another option?

view this post on Zulip starseeker (Jul 27 2020 at 20:19):

Are the CVS commit names different from the SVN authors? I'd been assuming a 1-1 mapping there, but perhaps I'm wrong?

view this post on Zulip starseeker (Jul 27 2020 at 20:40):

I suppose one possibility might be to add the cvs checkout lines corresponding to each commit...

cvs:checkout:cvs co -ko -D "<date>" [-r tag] -P brlcad

view this post on Zulip starseeker (Jul 27 2020 at 20:42):

probably overkill

view this post on Zulip Sean (Jul 27 2020 at 21:00):

starseeker said:

About cvs:revision - correct me if I'm wrong, but did the CVS tool actually have revision numbers? I thought all we had was the numbers SVN assigned various commits when the cvs2svn migration occurred.

revisions in cvs are per file -- akin to git. there is no global number like svn.

view this post on Zulip Sean (Jul 27 2020 at 21:02):

starseeker said:

Are the CVS commit names different from the SVN authors? I'd been assuming a 1-1 mapping there, but perhaps I'm wrong?

They're different.

view this post on Zulip Sean (Jul 27 2020 at 21:02):

At least, there's a swath of names that only exist in cvs, a swath that exist in cvs and svn, and a swath that are only in svn

view this post on Zulip Sean (Jul 27 2020 at 21:04):

it's whatever account we committed from

view this post on Zulip Sean (Jul 27 2020 at 21:06):

so for example, Markowski had commits as 'mmark' under rcs and 'mm' under cvs (or vice versa). I had commits as 'morrison' under cvs, never as that via svn though.

view this post on Zulip Sean (Jul 27 2020 at 21:08):

not a terrible loss, but it'd be really cool if we could preserve that original commit account name per commit. there's some semantic repo history that would be preserved just by knowing the name.

view this post on Zulip starseeker (Jul 27 2020 at 21:56):

So if we can ID which account names are unique to CVS, we could flag them. A quick check shows svn:account:mmark and svn:account:mm both present in the conversion, so the names made it. A cvs prefix could probably be added based on which commits originally came from the CVS conversion - I'd have to think about that, but it's probably possible.

view this post on Zulip starseeker (Jul 27 2020 at 21:56):

Actually, that might be best - just prefix with cvs:account or svn:account based on which VCS the commits came from.

view this post on Zulip starseeker (Jul 27 2020 at 21:57):

revisions will always be svn:revision (those commits that have it) since the numbers came from SVN.
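
A small Python sketch of the tagging rule described here, for illustration only. The `svn:revision:<number>` and `<vcs>:account:<name>` line layouts and the blank line before the tags are assumptions based on the tags quoted in this thread, not a description of what repowork actually emits; the function name and the account in the usage example are made up.

```python
def annotate(message, origin, account, svn_revision=None):
    """Append provenance tags to a commit message.

    origin is "cvs" or "svn", depending on which VCS the commit came from.
    The revision tag always uses the svn prefix, since the numbers were
    assigned by SVN, and is omitted for commits that have no number.
    """
    if origin not in ("cvs", "svn"):
        raise ValueError(f"unknown origin: {origin!r}")
    tags = []
    if svn_revision is not None:
        tags.append(f"svn:revision:{svn_revision}")
    tags.append(f"{origin}:account:{account}")
    return message.rstrip("\n") + "\n\n" + "\n".join(tags) + "\n"
```

For example, `annotate("Fix build", "svn", "exampleuser", 12345)` yields the message followed by a blank line, `svn:revision:12345`, and `svn:account:exampleuser`.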

view this post on Zulip starseeker (Jul 27 2020 at 21:58):

branches are trickier, but based on my experiences so far I'd rather just leave the svn branches alone - my brain hurts trying to sort out the various mappings, and I doubt it's terribly critical as long as git blame can walk back through the history successfully.

view this post on Zulip Sean (Jul 28 2020 at 01:46):

Actually since you mentioned it about mmark and mm, it looks like you already have it doing the right thing -- it's using the account username as originally committed for both svn and cvs. That's great!

view this post on Zulip Sean (Jul 28 2020 at 01:48):

starseeker said:

Actually, that might be best - just prefix with cvs:account or svn:account based on which VCS the commits came from.

That would be a pretty slick detail. We'd actually be able to distinguish three "generations" of commits, hah.

view this post on Zulip starseeker (Jul 28 2020 at 11:35):

I think I've found a way to associate the author ids (and cvs-fast-export's branch analysis) with the comments in the final conversion. I'll need to actually test applying the data in repowork, but I've got a script now that looks like it is successfully extracting the information. (misc/repoconv/cvs_info.sh)

view this post on Zulip starseeker (Jul 28 2020 at 20:50):

OK, managed to apply the CVS account/branch information, FWIW:
https://github.com/starseeker/brlcad_conv9

view this post on Zulip starseeker (Jul 29 2020 at 13:04):

@Sean Is there anything more I can do? I'm not sure it makes sense to have me do the check steps, since I'd basically be re-using the same logic I put together to do the conversion in the first place, but if there's anything that will move the process forward I'd like to help...

view this post on Zulip Sean (Jul 29 2020 at 14:26):

best you can probably do is just have a bit of patience, however frustrating.. :) you're right -- you can't / shouldn't verify since you may unintentionally dismiss or overlook something whereas someone else won't know to. Sumagna and I don't know your conversion logic at all, so this is nice indep validation. :)

view this post on Zulip starseeker (Jul 29 2020 at 14:33):

OK, will do.

view this post on Zulip starseeker (Jul 29 2020 at 14:47):

I had to make one adjustment post brlcad_conv9 to get the spacing right for the CVS-only comments - should I upload that version of the repo?

view this post on Zulip starseeker (Jul 29 2020 at 14:49):

(I know you mentioned the changing sha1 values in the various versions was a pain, so I wanted to check...)

view this post on Zulip Sumagna Das (Jul 29 2020 at 19:17):

starseeker said:

OK, managed to apply the CVS account/branch information, FWIW:
https://github.com/starseeker/brlcad_conv9

should i pull this one right now and run the script? :worried:

view this post on Zulip Sumagna Das (Jul 29 2020 at 19:19):

or should i operate on the brlcad_conv8 repo?

view this post on Zulip starseeker (Jul 29 2020 at 19:57):

brlcad_conv8 is fine.

view this post on Zulip starseeker (Jul 29 2020 at 19:57):

The newer ones are just minor variations on the commit message formatting

view this post on Zulip Sumagna Das (Jul 29 2020 at 19:57):

thanks for the info :smile:

view this post on Zulip starseeker (Jul 29 2020 at 19:59):

I'll let you know if one appears that would motivate a restart in the check, but unless someone finds an actual error I doubt it will be necessary at this point...

view this post on Zulip starseeker (Jul 31 2020 at 19:49):

@Sumagna Das how are the checks going?

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:50):

somehow the local copy of the github repo got wiped and all i know was that the last revision being checked was 75007

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:51):

now if i clone it, it will start from the beginning

view this post on Zulip starseeker (Jul 31 2020 at 19:51):

Can you tell your script to start at a lower revision?

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:52):

it starts first with the github repo commits, check them and then checkout the svn revision

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:52):

i have to find a way to go backwards then

view this post on Zulip starseeker (Jul 31 2020 at 19:56):

@Sumagna Das The github checkout has the svn revisions in the comments - could you just filter out any commits that have a number higher than 75007?
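
One way to do that filtering, sketched in Python. It assumes each converted commit message carries a line of the form `svn:revision:<number>` (the exact layout is an assumption) and that you already have (sha, message) pairs from the git log. The function names are hypothetical and this is not the script being discussed.

```python
import re

# Assumed tag layout: a line "svn:revision:<number>" somewhere in the message.
SVN_REV = re.compile(r"^svn:revision:(\d+)\s*$", re.MULTILINE)


def svn_revision(message):
    """Return the SVN revision recorded in a commit message, or None."""
    match = SVN_REV.search(message)
    return int(match.group(1)) if match else None


def commits_to_check(commits, last_checked):
    """Keep (sha, revision) pairs whose revision is at or below last_checked.

    commits is an iterable of (sha, message) pairs. Commits without a
    revision tag are skipped because they cannot be matched to SVN.
    """
    keep = []
    for sha, message in commits:
        rev = svn_revision(message)
        if rev is not None and rev <= last_checked:
            keep.append((sha, rev))
    return keep
```

Calling `commits_to_check(pairs, 75007)` would then drop everything newer than the last revision that was being checked.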

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:56):

i was thinking about that

view this post on Zulip starseeker (Jul 31 2020 at 19:57):

(btw, if you're going to check it out again go with https://github.com/starseeker/brlcad_conv10 )

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:57):

i will make it skip them but not save them in the skipped_commits.txt

view this post on Zulip Sumagna Das (Jul 31 2020 at 19:57):

starseeker said:

(btw, if you're going to check it out again go with https://github.com/starseeker/brlcad_conv10 )

will be a good idea

view this post on Zulip starseeker (Jul 31 2020 at 19:58):

/me cleans up older conversion tests...

view this post on Zulip starseeker (Jul 31 2020 at 20:01):

There we go - now my github account looks less manically busy.

view this post on Zulip Sumagna Das (Jul 31 2020 at 20:01):

:grinning_face_with_smiling_eyes:

view this post on Zulip Sumagna Das (Jul 31 2020 at 20:29):

found it through grep checked it out and restarted from the same point

view this post on Zulip Sumagna Das (Jul 31 2020 at 20:30):

my script starts checking from the commit which is checked out at the moment on the git repo

view this post on Zulip starseeker (Aug 09 2020 at 16:45):

@Sean ping?

view this post on Zulip Sean (Aug 09 2020 at 17:24):

Got through two checks the past week, looking good so far. Few more to go. Hoping we will be able to go live soon, maybe next weekend if these checks go well.

view this post on Zulip Sean (Aug 09 2020 at 17:24):

How’s your scan going @Sadeep Darshana ?

view this post on Zulip starseeker (Aug 09 2020 at 17:25):

I believe he finished it, results in the "issues in migrated repo" thread

view this post on Zulip starseeker (Aug 09 2020 at 17:26):

When I checked, all the SVN era differences I saw were when his script tried to compare brep-debug commits with trunk. CVS era is messier, as expected.

view this post on Zulip Sumagna Das (Aug 13 2020 at 07:01):

how much of the migration is left?

view this post on Zulip starseeker (Aug 15 2020 at 15:10):

@Sumagna Das not sure - @Sean , is https://brlcad.org/wiki/Github_Migration still current?

view this post on Zulip starseeker (Aug 15 2020 at 15:11):

@Sumagna Das we're shaking down for a release, which will slow things up a bit

view this post on Zulip Sumagna Das (Aug 15 2020 at 16:42):

if any help is needed, i will try to help if i can

view this post on Zulip Sean (Aug 16 2020 at 18:14):

starseeker said:

Sumagna Das not sure - Sean , is https://brlcad.org/wiki/Github_Migration still current?

I completed a couple more tasks, will update.

view this post on Zulip starseeker (Sep 02 2020 at 22:54):

ping? (I know commit reviews are competing with this...)

view this post on Zulip Sean (Sep 03 2020 at 05:38):

not really a competition, it's been a full-stop shift to eye-bleeding commit reading for hours on end...

view this post on Zulip starseeker (Sep 03 2020 at 11:45):

/me winces. Well, hopefully the commit storm will be letting up after this for a while.

view this post on Zulip starseeker (Sep 03 2020 at 16:30):

Anything helpful I can do? (Testing, etc?)

view this post on Zulip starseeker (Sep 15 2020 at 17:26):

@Sean Just FYI, realized my updates were missing the svn commit ids for newer commits, in case you were using my github test conversion. New version, current as of last night with all commits, up at https://github.com/starseeker/brlcad_conv11

view this post on Zulip Sumagna Das (Sep 19 2020 at 15:44):

how much of the migration is done or left?

view this post on Zulip starseeker (Sep 19 2020 at 18:54):

https://brlcad.org/wiki/Github_Migration is the place to watch

view this post on Zulip Sumagna Das (Sep 19 2020 at 18:55):

nothing changed i think

view this post on Zulip starseeker (Sep 19 2020 at 18:56):

From a technical standpoint the main SVN->Git conversion is essentially complete (barring discovery of some significant, heretofore unnoticed problem).

The migration of the secondary data hasn't been as thoroughly explored - that'll probably be tricky, and hasn't (yet) been tested.

view this post on Zulip starseeker (Sep 19 2020 at 19:02):

@Sumagna Das If you want to do a little experimenting you might see if you can figure out how https://github.com/cmungall/gosf2github works...

view this post on Zulip starseeker (Sep 19 2020 at 19:05):

Ah, right, now I remember. Unfortunately I don't have admin privileges on BRL-CAD necessary to do the export...

view this post on Zulip Sumagna Das (Sep 19 2020 at 19:21):

or at least someone who has admin privileges who can give the exported stuff needed

view this post on Zulip starseeker (Sep 19 2020 at 19:22):

I don't think any of my old projects have anything to export in this department, certainly not on a scale like BRL-CAD's...

view this post on Zulip starseeker (Sep 19 2020 at 19:24):

@Sumagna Das One question I don't know the answer to yet is what the best way to handle unmerged patches is. On github they're pull requests, but on sourceforge they're patch files... I don't know off hand how we're going to handle patch file submissions to github. Have you seen anything about how people address that problem?

view this post on Zulip starseeker (Sep 19 2020 at 19:26):

Maybe the gosf2github script migrates them somehow, since it looks like sourceforge categorizes bugs, patches and feature requests as tickets...

view this post on Zulip Sumagna Das (Sep 19 2020 at 19:28):

that is the main question....if it can migrate them correctly

view this post on Zulip starseeker (Sep 19 2020 at 20:13):

/me is not quite sure what gosf2github is talking about with setting up oauth... never done that before

view this post on Zulip starseeker (Sep 19 2020 at 20:37):

OK, I think the "Personal access tokens" will work, but the perl script is a bit cranky...

view this post on Zulip starseeker (Sep 19 2020 at 20:40):

Blegh. This begs for a detailed, step-by-step guide for folks unfamiliar with any of this...

view this post on Zulip starseeker (Sep 19 2020 at 20:46):

OK, It looks like to get the "collaborators" list needed by gosf2github the repo needs to be an organization-owned repository: https://docs.github.com/en/rest/reference/repos#collaborators

view this post on Zulip starseeker (Sep 19 2020 at 20:47):

Yeesh. I guess someone needs to experiment with this stuff on a test import of BRL-CAD in the org project...

view this post on Zulip starseeker (Sep 19 2020 at 20:48):

Oh well... at least it's not necessary to stand up the primary VCS repos.

view this post on Zulip starseeker (Sep 19 2020 at 20:52):

/me makes a note to check the more recently updated fork at https://github.com/n-soda/gosf2github

view this post on Zulip Sean (Sep 19 2020 at 20:56):

remotely relevant, https://github.com/github/renaming .. so we could / probably should adopt main instead of master

view this post on Zulip starseeker (Sep 19 2020 at 20:59):

/me tries renaming...

view this post on Zulip starseeker (Sep 19 2020 at 21:09):

OK, we can rename master->main - proved out on brlcad_conv11

view this post on Zulip starseeker (Sep 19 2020 at 21:20):

Essentially painless if we do it before pushing to github - I'll just note it in the CONVERT.sh script.

view this post on Zulip starseeker (Sep 19 2020 at 21:29):

Good note to be aware of when we eventually try migrating issues (last 2 comments in particular): https://github.com/beanshell/beanshell/issues/44

view this post on Zulip starseeker (Sep 19 2020 at 21:31):

OK, looks like our contributions stats are still there after the default branch rename too.

Phew. Had a few bad moments wondering if we were going to have to re-run the whole thing again to get commits reassigned...

view this post on Zulip starseeker (Sep 25 2020 at 23:43):

ping?

view this post on Zulip Sumagna Das (Oct 13 2020 at 06:54):

@starseeker @Sean any update on the migration?

view this post on Zulip Erik (Oct 15 2020 at 22:53):

whatcha renaming master to, "tyrant"? :D

view this post on Zulip starseeker (Oct 16 2020 at 02:48):

"main" appears to be the new convention. I like it - it's shorter and still starts with the same letters.

view this post on Zulip starseeker (Oct 20 2020 at 01:30):

@Sean ping?

view this post on Zulip Sean (Oct 20 2020 at 01:32):

I worked on it some this past weekend. Will update the checklist with things done tomorrow to see where we're at.

view this post on Zulip starseeker (Oct 20 2020 at 02:50):

We've got a problem - my incremental conversion process just broke.

view this post on Zulip starseeker (Oct 20 2020 at 02:51):

Not Good.

view this post on Zulip starseeker (Oct 20 2020 at 02:57):

/me apprehensively re-runs to see if he can diagnose the failure...

view this post on Zulip starseeker (Oct 20 2020 at 03:12):

OK, I think I know what happened... let's see if I can adjust and recover.

view this post on Zulip starseeker (Oct 20 2020 at 03:14):

Alright, run kicked off - I've got to crash

view this post on Zulip starseeker (Oct 20 2020 at 12:42):

phew. Looks like that adjustment got past it.

view this post on Zulip starseeker (Oct 27 2020 at 01:40):

@Sean ping?

view this post on Zulip starseeker (Oct 30 2020 at 15:19):

@Sean ping?

view this post on Zulip starseeker (Nov 09 2020 at 23:30):

@Sean ping?

view this post on Zulip Erik (Nov 11 2020 at 22:49):

is he doing that thing where he defers working on it a week every time someone bugs him about it? :D

view this post on Zulip Sean (Nov 11 2020 at 22:50):

No :P

view this post on Zulip starseeker (Dec 02 2020 at 18:34):

https://github.com/starseeker/brlcad_conv11 is updated through r77867

view this post on Zulip Sumagna Das (Dec 03 2020 at 13:49):

(deleted)

view this post on Zulip starseeker (Dec 03 2020 at 22:22):

http://bsdimp.blogspot.com/2020/09/freebsd-subversion-to-git-migration.html

view this post on Zulip starseeker (Dec 03 2020 at 22:26):

leave it to OpenBSD to try and improve on the git frontend: http://gameoftrees.org/

view this post on Zulip Sean (Dec 04 2020 at 02:01):

excellent, I'm going to try and push on it this friday now that a particular render task is finishing up.

view this post on Zulip Sean (Dec 04 2020 at 02:02):

(excellent == updated through r...)

view this post on Zulip starseeker (Dec 04 2020 at 13:07):

Now through 77924

view this post on Zulip starseeker (Dec 05 2020 at 18:48):

@Sean Just wanted to check if you were/are able to push on the Git conversion - I can make more of an effort to keep the github repo in sync with SVN if it is helpful, but otherwise it's a little simpler to only do it every few hundred commits...

view this post on Zulip starseeker (Dec 06 2020 at 18:21):

Current through r77936

view this post on Zulip starseeker (Dec 10 2020 at 19:05):

ping?

view this post on Zulip starseeker (Dec 17 2020 at 15:07):

https://github.com/starseeker/brlcad_conv11 is updated through r77978

view this post on Zulip starseeker (Dec 17 2020 at 15:08):

@Sean any chance we'll be able to move before 2021?

view this post on Zulip starseeker (Dec 17 2020 at 16:08):

Hey, cool - you can select a range on the commit graph for Github. image.png

view this post on Zulip Sean (Dec 17 2020 at 19:02):

You're on fire!

view this post on Zulip Sean (Dec 17 2020 at 19:03):

Yeah, I think so. Was thinking the same thing myself. working on it!

view this post on Zulip starseeker (Dec 17 2020 at 20:32):

Heh, well, like you said, it's a fire sale :-P

view this post on Zulip Sean (Dec 17 2020 at 20:32):

according to those graphs, the fire's been raging for a couple years

view this post on Zulip starseeker (Dec 17 2020 at 20:32):

Gotta say, I like the dark github theme - previously their website was the brightest thing on my desktop

view this post on Zulip starseeker (Dec 20 2020 at 20:08):

/me blinks - rsyncing the SVN repo from sf.net didn't complete. That's a new one...

view this post on Zulip starseeker (Dec 20 2020 at 21:00):

There we go. Github brlcad_conv11 updated through r78038

view this post on Zulip starseeker (Dec 20 2020 at 21:41):

@Sean barring something unforeseen, that's probably my last update of both SVN and Github for the year.

view this post on Zulip starseeker (Dec 20 2020 at 21:49):

I've not done the full cross platform distcheck-full hammering for release testing since the gqa multithreaded test will currently fail, but otherwise things are generally looking like they're in fairly good shape...

view this post on Zulip Aniket Khandagale (Dec 21 2020 at 12:22):

Hey can i get the link for github repo where i can look for beginner level,easy to fix problems and try to fix them

view this post on Zulip Thusal Ranawaka (Dec 21 2020 at 13:29):

Hey @Aniket Khandagale welcome to BRL-CAD Community, I think you can have a look at BRL-CAD Wiki www.brlcad.org/wiki and start with compiling BRL-CAD to your PC. You can find build instructions from here, https://brlcad.org/wiki/Building_from_SVN

view this post on Zulip Thusal Ranawaka (Dec 21 2020 at 13:31):

If you want assistance, ask from Community and also you can ask from Sean and starseeker.

view this post on Zulip Thusal Ranawaka (Dec 21 2020 at 13:32):

Aniket Khandagale said:

Hey can i get the link for github repo where i can look for beginner level,easy to fix problems and try to fix them

In this case, I am not sure about the Github Repo, please help @Sean

view this post on Zulip Sumagna Das (Dec 21 2020 at 17:43):

Aniket Khandagale said:

Hey can i get the link for github repo where i can look for beginner level,easy to fix problems and try to fix them

@Aniket Khandagale BRL-CAD is available on SourceForge (SVN). It is being migrated from SourceForge (SVN) to GitHub (Git), so there is no official GitHub repo yet. (There is one where the migration is happening, but it is not up to date and is behind the main repo by a couple of commits, or revisions in SVN terminology.)

view this post on Zulip Aniket Khandagale (Dec 21 2020 at 18:01):

Thanks @Sumagna Das should i wait till the time its been migrated to github?

view this post on Zulip Sumagna Das (Dec 21 2020 at 18:03):

@Aniket Khandagale you dont need to wait for the migration.

view this post on Zulip Aniket Khandagale (Dec 21 2020 at 18:04):

@Sumagna Das can i get the link for sourceforge

view this post on Zulip scorp08 (Dec 26 2020 at 05:38):

@starseeker Is it possible to fork from the brlcad GitHub repo and push stuff, or should we wait for the migration to finish?

view this post on Zulip starseeker (Dec 26 2020 at 14:55):

It's technically possible to fork, but the repository of record is still SVN at the moment. I'd recommend waiting for us to complete the migration.

view this post on Zulip Sean (Dec 26 2020 at 14:57):

Yes, best to wait or things could get messy when it comes time to switch it. I've been going through the repo so it hopefully won't be a long additional wait for folks.

view this post on Zulip starseeker (Jan 03 2021 at 20:39):

ping?

view this post on Zulip starseeker (Jan 16 2021 at 21:48):

ping?

view this post on Zulip starseeker (Jan 26 2021 at 19:02):

I'll post an announcement to the email list as well, but my plan is to lock the SVN repository sometime on Friday, Jan. 29th to finalize the repository contents for the Git conversion.

view this post on Zulip Sumagna Das (Jan 27 2021 at 02:35):

starseeker said:

I'll post an announcement to the email list as well, but my plan is to lock the SVN repository sometime on Friday, Jan. 29th to finalize the repository contents for the Git conversion.

does this mean that the migration is done or are there some more things left?

view this post on Zulip starseeker (Jan 27 2021 at 13:28):

@Sean is doing final review - I'm going to start uploading the secondary repositories while he finishes looking at the main repository.

view this post on Zulip Sumagna Das (Jan 27 2021 at 14:39):

Okay
tell me when the whole operation is done.

view this post on Zulip starseeker (Jan 27 2021 at 15:03):

@Daniel Rossberg I've uploaded the svn-all-fast-export conversions of all the projects except BRL-CAD itself to https://github.com/BRL-CAD - can you take a look at rt-cubed and make sure it looks OK to you before anyone starts committing to it?

view this post on Zulip Daniel Rossberg (Jan 27 2021 at 16:31):

@starseeker It looks good to me.

Git doesn't know empty directories, that's why they got lost from src/other/ogre. But, as far as I know, Ogre isn't used anywhere.

view this post on Zulip starseeker (Jan 30 2021 at 23:58):

@Sean status?

view this post on Zulip Sean (Feb 01 2021 at 18:18):

I spent most of the weekend validating and reviewing. It's looking really fantastic to me. I have questions, but no show-stoppers. I actually got through the laundry checklist I'd written up to identify all the deltas and document discrepancies. Filed support request for --follow and doing one more pass through the log of missing commits now. Planning to upload repo myself today so I know the process, unless there's some reason not to.

view this post on Zulip starseeker (Feb 01 2021 at 22:57):

@Sean brlcad_conv11 is current with the latest commits.

view this post on Zulip starseeker (Feb 01 2021 at 22:58):

Any other questions I can help answer?

view this post on Zulip Erik (Feb 02 2021 at 12:55):

if an empty directory is needed by git, typically a .do_not_delete file is touched
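For anyone who does want empty directories to survive in Git, the placeholder approach can be sketched roughly as follows. The helper name `add_placeholders` is hypothetical, and skipping `.git` is an assumption about where it would be run; this is only an illustration, not something the conversion scripts do.

```python
import os

def add_placeholders(root, name=".do_not_delete"):
    """Touch a placeholder file in every empty directory under root.

    Git tracks files, not directories, so an otherwise empty directory
    only survives a commit if it contains at least one file.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: never descend into Git's own metadata directory.
        if ".git" in dirnames:
            dirnames.remove(".git")
        if not dirnames and not filenames:
            path = os.path.join(dirpath, name)
            open(path, "w").close()  # equivalent of `touch`
            created.append(path)
    return created
```

Only leaf directories get a placeholder; a parent that holds nothing but empty subdirectories becomes tracked through them.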

view this post on Zulip starseeker (Feb 02 2021 at 13:10):

I'm not aware of any situation where we actually need an empty directory in the raw source repo - if nothing else, it's simple to have the build system or the code create such directories on the fly...

view this post on Zulip Erik (Feb 02 2021 at 13:11):

yup, just quipping what I done seened :)

view this post on Zulip starseeker (Feb 02 2021 at 13:12):

@Erik It looked like your git isst repo had everything from SVN's isst as well - is that correct?

view this post on Zulip Erik (Feb 02 2021 at 13:14):

I'm sure it was just an import, I hope I tried to make the introduction commit as basic as possible and did the "tidy" as a next commit... it's been a few, yo :)

view this post on Zulip Erik (Feb 02 2021 at 13:15):

if not, I hope someone is archiving the svn repo and the latest snapshot, y'know, "just in case". I'm sure the DoD can still afford the bits :)

view this post on Zulip starseeker (Feb 02 2021 at 13:15):

https://github.com/BRL-CAD/vcs-history

view this post on Zulip Erik (Feb 02 2021 at 13:16):

neato :D (keep an in-house copy or 20)

view this post on Zulip starseeker (Feb 02 2021 at 13:17):

I may have to "top off" the SVN portion depending on whether I need to make more SVN commits before the final switch, but if some poor soul has to repeat the VCS conversions for whatever reason they should have the necessary inputs to work with.

view this post on Zulip starseeker (Feb 02 2021 at 13:18):

(straight jacket not included)

view this post on Zulip Sean (Feb 02 2021 at 18:08):

@starseeker still no show-stoppers but found a couple oddities. there are about 100 "* empty log message *" cvs commits that exist in git and svn but are missing the corresponding svn:revision:#### line. would that be because of timestamps or something else?

view this post on Zulip Sean (Feb 02 2021 at 18:09):

some appear to have it while others do not.

view this post on Zulip Sean (Feb 02 2021 at 18:13):

there are also 139 empty log message cvs commits in addition to the 100 that don't seem to be in git, but I'm writing them off as different cvs2git vs cvs2svn translation until I see evidence otherwise.

view this post on Zulip starseeker (Feb 02 2021 at 18:49):

@Sean I was probably somewhat hesitant about mapping SVN numbers to those commits - with such ambiguous messages, all I had to go on for those was the timestamps, and the git and svn conversions of CVS didn't always end up exactly mapping those.

svn number assignment logic is in misc/repoconv/svn_map_commit_revs.cxx FWIW

view this post on Zulip starseeker (Feb 02 2021 at 18:51):

Looking at the logic, I don't know that I did a whole lot with the empty log message commits.

view this post on Zulip starseeker (Feb 02 2021 at 18:52):

I think by that point I figured we were well into diminishing returns.

view this post on Zulip starseeker (Feb 02 2021 at 18:54):

I see that I categorized some commits as "non-unique, has exact timestamp match" but I think without manual inspection of the diffs I wouldn't have had the confidence to assign them SVN ids

view this post on Zulip starseeker (Feb 02 2021 at 18:54):

Even the "Initial revision" commit assignments, which I did make, are a bit dubious

view this post on Zulip starseeker (Feb 02 2021 at 18:57):

(non-unique in that context would be "non-unique commit message string")

view this post on Zulip starseeker (Feb 02 2021 at 19:02):

AH! I think I found the processing log

view this post on Zulip starseeker (Feb 02 2021 at 19:03):

svn_unmapped.txt

view this post on Zulip starseeker (Feb 02 2021 at 19:12):

Looks like when I ran that it was against a git repo that didn't have newer 76300+ commits. The following is cleaned up and sorted:

view this post on Zulip starseeker (Feb 02 2021 at 19:12):

svn_unmapped_sorted.txt

view this post on Zulip Sean (Feb 02 2021 at 19:48):

So 735 there is an example -- it's the first in my list. It's not got a timestamp match, so you didn't know which svn:revision that was (which is curious in itself)

view this post on Zulip Sean (Feb 02 2021 at 19:49):

at least maybe? that's the curious part because you did know it was 735 ...

view this post on Zulip Sean (Feb 02 2021 at 19:51):

under what conditions would cvs commits get or not get the svn:revision:### note?

view this post on Zulip starseeker (Feb 02 2021 at 19:56):

I know about SVN commit 735, but in the git repository I don't have a commit with an exact timestamp match with the same commit message

view this post on Zulip starseeker (Feb 02 2021 at 19:57):

What you're seeing in that log is a processing of a detailed log from SVN, combined with a log of available git commits.

view this post on Zulip starseeker (Feb 02 2021 at 19:58):

For that printout, all SVN commits were checked against what was/is available in git

view this post on Zulip starseeker (Feb 02 2021 at 20:04):

CVS commits would get the svn:revision note under the following conditions:

1) there exists an exact, unique commit message that is shared by an SVN commit and a Git commit
2) There exists an SVN commit with a non-unique commit message match that also shares an exact timestamp with a Git commit having the same commit message
3) The special case commit message "Initial revision" when there exists a Git commit with an exact timestamp match, and the timestamp match is outside the known "bad" range of early commits with unreliable timestamps.
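As a rough illustration of those three rules (not the actual logic in misc/repoconv/svn_map_commit_revs.cxx, which this does not reproduce), a minimal Python sketch might look like the following. The tuple layouts, the function name `assign_svn_revisions`, the `bad_range` parameter, and the choice to skip a match unless exactly one Git commit qualifies are all assumptions made for the example.

```python
from collections import Counter

INITIAL = "Initial revision"

def assign_svn_revisions(svn_commits, git_commits, bad_range=None):
    """Return {git_sha: svn_rev} for commits matching one of the three rules.

    svn_commits: list of (rev, message, timestamp)
    git_commits: list of (sha, message, timestamp)
    bad_range:   optional (start, end) of timestamps considered unreliable
    """
    svn_msg_counts = Counter(msg for _, msg, _ in svn_commits)
    git_msg_counts = Counter(msg for _, msg, _ in git_commits)
    git_by_msg = {}
    git_by_msg_time = {}
    for sha, msg, ts in git_commits:
        git_by_msg.setdefault(msg, []).append(sha)
        git_by_msg_time.setdefault((msg, ts), []).append(sha)

    mapping = {}
    for rev, msg, ts in svn_commits:
        if msg == INITIAL:
            # Rule 3: "Initial revision" needs an exact timestamp match
            # that falls outside the known-bad range.
            if bad_range and bad_range[0] <= ts <= bad_range[1]:
                continue
            candidates = git_by_msg_time.get((msg, ts), [])
        elif svn_msg_counts[msg] == 1 and git_msg_counts[msg] == 1:
            # Rule 1: message is unique on both sides.
            candidates = git_by_msg[msg]
        else:
            # Rule 2: non-unique message, exact timestamp match required.
            candidates = git_by_msg_time.get((msg, ts), [])
        if len(candidates) == 1:  # assumption: ambiguous matches are skipped
            mapping[candidates[0]] = rev
    return mapping
```

Under these rules a commit like r735, whose Git and SVN timestamps disagree and whose message is not unique, simply gets no svn:revision note.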

view this post on Zulip Sean (Feb 02 2021 at 20:05):

Yeah, that's odd. The timestamp you have for 735 is different in cvs2git from what svn had...

view this post on Zulip Sean (Feb 02 2021 at 20:05):

Looks like r735 is 771b3183f9e315f6e1451a1e3462e6f84724a9cd

view this post on Zulip Sean (Feb 02 2021 at 20:06):

svn lists that date as 1986-08-12 23:18:25 -0400

view this post on Zulip Sean (Feb 02 2021 at 20:07):

git is a solid ten minutes off at Wed Aug 13 03:08:40 1986 +0000

view this post on Zulip Sean (Feb 02 2021 at 20:07):

(that's not to suggest git's is wrong -- svn could of course be wrong)
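For reference, the gap between those two timestamps can be checked by normalizing both to the same offset. This is just a worked calculation using the two date strings quoted above; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps as reported by svn and git for r735, with their UTC offsets.
svn_time = datetime.strptime("1986-08-12 23:18:25 -0400",
                             "%Y-%m-%d %H:%M:%S %z")
git_time = datetime.strptime("Wed Aug 13 03:08:40 1986 +0000",
                             "%a %b %d %H:%M:%S %Y %z")

# Subtracting timezone-aware datetimes compares them in absolute time.
delta = svn_time - git_time
print(delta)  # 0:09:45
```

So the SVN timestamp is 9 minutes 45 seconds later than the Git one, which is why an exact timestamp match fails.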

view this post on Zulip starseeker (Feb 02 2021 at 20:10):

I will readily admit I didn't delve into the details of how cvs2git and cvs2svn differed in their processing, so I can't say which one is right or better. For myself I wasn't worried about it - in some circumstance where precision for a given commit's timing mattered, I'd want to query CVS directly...

A word of caution - now that we're no longer using git notes for svn revision information, any updates to add more mappings are going to be difficult (not impossible, but it will be another custom processing implementation in repowork.)

view this post on Zulip Sean (Feb 02 2021 at 20:12):

No worries, not seeing any reason to reprocess anything -- just accounting to make sure nothing is missing. And simply trying to understand.

view this post on Zulip Sean (Feb 02 2021 at 20:14):

I was able to rule out all 81 "Initial revision" commits for example, as they're clearly all categorically present, just not labeled (likely your #3 above).

view this post on Zulip Sean (Feb 02 2021 at 20:16):

I think #1 may also be accounting for a lot of the 400 remaining. Many have a repeat commit message but that was made (sometimes seconds) later. If there's some discrepancy between the clock being used, that would also potentially account for more. I should know more definitively here in a bit.

view this post on Zulip starseeker (Feb 02 2021 at 20:17):

@Sean If you're doing the grunt work to go identify mappings manually, you may as well make a note of the mappings. If you're going to that degree of trouble, I might as well do the extra work to capture it in the commit messages...

view this post on Zulip starseeker (Feb 02 2021 at 20:21):

Just something like:

sha1;#
sha1;#
...

should do it.
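A file in that format would be easy to read back in. The sketch below is one hypothetical way to parse it in Python; the function name `parse_rev_map` and the strictness (40-character hex sha1, numeric revision, one revision per sha1) are assumptions, not the behavior of the actual repowork tooling.

```python
import string

def parse_rev_map(text):
    """Parse 'sha1;rev' lines into a dict of {sha1: svn_revision}."""
    mapping = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        sha1, sep, rev = line.partition(";")
        is_sha1 = len(sha1) == 40 and all(c in string.hexdigits for c in sha1)
        is_rev = rev.isascii() and rev.isdigit()
        if not sep or not is_sha1 or not is_rev:
            raise ValueError(f"line {lineno}: expected '<sha1>;<rev>', got {line!r}")
        # Assumption: a sha1 may only be assigned a single SVN revision.
        if mapping.get(sha1, int(rev)) != int(rev):
            raise ValueError(f"line {lineno}: {sha1} already mapped to r{mapping[sha1]}")
        mapping[sha1] = int(rev)
    return mapping
```

For example, a line such as `771b3183f9e315f6e1451a1e3462e6f84724a9cd;735` would yield a single entry mapping that commit to revision 735.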

view this post on Zulip starseeker (Feb 02 2021 at 20:25):

One of the problems though is I don't expect some of them to have exact 1-1 mappings at all, since cvs2git may have grouped things differently.

view this post on Zulip starseeker (Feb 02 2021 at 20:33):

Pardon, my terminology was loose - the tool we're using is cvs-fast-export, not cvs2git.

view this post on Zulip Sean (Feb 02 2021 at 22:04):

Hm, yeah I have that info. I basically wrote two 1-liners to pull a diff of the missing svn revs and of all git revs, then a 1-liner to make sure they're all accounted for. I could make it print which commit is actually which missing rev.

view this post on Zulip starseeker (Feb 02 2021 at 22:33):

You're actually comparing the commit diffs themselves? nifty

view this post on Zulip Sean (Feb 02 2021 at 22:37):

technically I'm comparing the md5 sum of just the changed/added/removed lines, but yeah. it was also needed to figure out which missing commits were because they were just propset changes.

view this post on Zulip Sean (Feb 02 2021 at 22:37):

they show up as empty diffs, so easy to cull them from review
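The idea of fingerprinting a commit by only its changed lines can be sketched as below. This is not Sean's actual 1-liner (which isn't shown here); the function names and the way hunks are detected are assumptions for illustration.

```python
import hashlib

def changed_lines(diff_text):
    """Return only the added/removed lines from a unified diff."""
    changed = []
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True       # hunk header: content lines follow
        elif line.startswith(("diff ", "Index: ")):
            in_hunk = False      # next file: skip its ---/+++ headers
        elif in_hunk and line.startswith(("+", "-")):
            changed.append(line)
    return changed

def changed_lines_digest(diff_text):
    """MD5 of the changed lines, ignoring headers and context lines."""
    joined = "\n".join(changed_lines(diff_text))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()
```

Because file headers, revision labels, and context lines are ignored, an `svn diff` and a `git diff` of the same change hash identically, and a propset-only commit reduces to the digest of the empty string.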

view this post on Zulip starseeker (Feb 02 2021 at 22:39):

Ah, I hadn't thought of extracting just the diff lines - good call

view this post on Zulip starseeker (Feb 03 2021 at 02:22):

@Sean upon reflection, I'm second guessing myself - if I update the older commit messages, it changes all the sha1s again and arguably we would need to do more verification to make sure the new step didn't mess with anything. Maybe it's not worth it for the stray svn:revision tags?

view this post on Zulip starseeker (Feb 03 2021 at 03:17):

With a diff based approach, you might in principle be able to spot if any of the Initial Revision commits ended up mapped wrong despite exact timestamp matches...

view this post on Zulip starseeker (Feb 03 2021 at 03:18):

If I ended up assigning demonstrably incorrect numbers, that's probably worth fixing...

view this post on Zulip Sean (Feb 03 2021 at 03:21):

Sure, can revisit the decision -- my priority has been on finding / validating they're there somewhere. If they're all there and just not tagged, I agree that's less of a concern. I mean it'd be cool to have them all tagged, but that can happen at a later date even and we make everyone re-clone.

view this post on Zulip Sean (Feb 03 2021 at 03:22):

Hey, can you take a look at a couple commits and tell me what I'm seeing...
4850989e3a2f9624127ae043c6094076a60bc472 and 97d02527843ffb84f8bb3da0e64ef5f7db6df28c

view this post on Zulip starseeker (Feb 03 2021 at 13:40):

I'm not entirely sure what those are - some artifact of the cvs-to-git conversion process, obviously, but I'm not entirely clear on what they're trying to represent.

view this post on Zulip starseeker (Feb 03 2021 at 14:42):

I haven't considered trying to "clean up" any of the cvs era artifacts of the conversion, since I don't know which of them might be added to preserve content that would otherwise be garbage collected out.

view this post on Zulip Sean (Feb 03 2021 at 16:08):

at a quick glance, they look like the entire repository was deleted. they're the two largest commits in the git repo. they're fortunately in branches, but would be good to understand what's going on there because it smells like something went wrong

view this post on Zulip Sean (Feb 03 2021 at 16:09):

I recall the 7.0 branch and don't remember any sort of merge event like that happening ...

view this post on Zulip starseeker (Feb 03 2021 at 17:49):

The "cvsconvert" tool did generate some sort of audit...

view this post on Zulip starseeker (Feb 03 2021 at 17:49):

cvs_all_fast_export_audit.txt

view this post on Zulip starseeker (Feb 03 2021 at 17:51):

@Sean How do you want to proceed?

view this post on Zulip starseeker (Feb 03 2021 at 20:37):

@Sean FWIW, I think I've gotten the necessary piece in place to do the sha1;rev# updating successfully. I'd still want to run your diff check on the final results and probably inspect the updated commits to be sure, but a quick test with your 735 example succeeded.

view this post on Zulip starseeker (Feb 03 2021 at 20:38):

bbl

view this post on Zulip Sean (Feb 03 2021 at 20:40):

starseeker said:

Sean How do you want to proceed?

It would be nice to understand why either of those branches appear to wipe out everything (if that's indeed what happened), even if we're not going to do anything about it. I think that'd entail checking out one of those branches and looking at the commits before/after to see if there's an explanation. Not a show-stopper since they're on branches, but concerning from a data anomaly perspective.

view this post on Zulip Sean (Feb 03 2021 at 20:43):

Good to know about the sha/rev updating. At a quick glance, lookup succeeded on about 1/2 to 2/3rds of the commits missing. I'm looking at the ones that didn't match to see if they're actually missing or if there's something in the diffing method pooching things. There are a few dozen that map 1:many that we can either ignore or map manually by their date, but I wasn't going to worry about them.

view this post on Zulip starseeker (Feb 04 2021 at 01:13):

@Sean The immediate question is whether they did do that or cvs-fast-export is misinterpreting some aspect of the CVS data.

view this post on Zulip starseeker (Feb 04 2021 at 01:16):

FWIW, I think a685e85ff730450f669a0d853c69ef545c30b46f may be related to the 97d02527843ffb84f8bb3da0e64ef5f7db6df28c commit

view this post on Zulip starseeker (Feb 04 2021 at 01:17):

"remove the cvs tag relic" may be why the prior incomplete tag commit removed everything?

view this post on Zulip starseeker (Feb 04 2021 at 01:20):

Ah, wait a minute - I wasn't looking closely enough. "merge-to-head" incomplete tag (4850989e3a2f9624127ae043c6094076a60bc472) is an SVN era commit, and also seems to have an associated commit (dd2bb79965568f5aab4f7458606d875d22b74b40)

view this post on Zulip starseeker (Feb 04 2021 at 01:21):

Yeah, those are both SVN era commits - my apologies.

view this post on Zulip starseeker (Feb 04 2021 at 01:21):

I was fooled by the "cvs" in the commit messages and didn't look closely enough

view this post on Zulip starseeker (Feb 04 2021 at 01:28):

OK, so checking more carefully, here's the breakdown:

97d02527843ffb84f8bb3da0e64ef5f7db6df28c - Synthetic commit for incomplete tag release-7-0 - CVS era commit
a685e85ff730450f669a0d853c69ef545c30b46f - child of 97d02, SVN era commit. Message:
clearly not actually release 7.0 .. remove the cvs tag relic that was made on a few files just before the project was converted to open source. (svn branch delete)

4850989e3a2f9624127ae043c6094076a60bc472 - Synthetic commit for incomplete tag merge-to-head-20051223 - CVS era commit
dd2bb79965568f5aab4f7458606d875d22b74b40 - child of 485098, SVN era commit. Message:
move cvs branch tagging artifact removal (svn branch delete)

view this post on Zulip starseeker (Feb 04 2021 at 01:33):

So, my guess is that the CVS conversions (evidently both of them, cvs2svn and cvs-fast-export) found something in the data that prompted tagging. Based on the 7.0 message, it looks like a stray tag was on a few files, the converter interpreted that as a tag in Git that preserved only those files and removed everything else (hence the massive diff.)

Back in 2011, you did some cleanup on the SVN branches and spotted those as spurious. So, we've got the cvs-fast-export generated tags and associated branches, and then the 2011 SVN cleanup of the cvs2svn versions of the same thing.

view this post on Zulip starseeker (Feb 04 2021 at 01:38):

@Sean Your call how you want to handle the 1-many - I'm pretty sure I can handle that in the svn:revision assignment, as long as each sha1 maps to only one SVN rev.

view this post on Zulip starseeker (Feb 04 2021 at 03:12):

(I think if we delete the two branches in question from Git we can probably garbage collect them out, by the way - do we want to preserve that, or would it be better to remove?)

view this post on Zulip Sean (Feb 04 2021 at 03:20):

starseeker said:

So, my guess is that the CVS conversions (evidently both of them, cvs2svn and cvs-fast-export) found something in the data that prompted tagging. Based on the 7.0 message, it looks like a stray tag was on a few files, the converter interpreted that as a tag in Git that preserved only those files and removed everything else (hence the massive diff.)

Back in 2011, you did some cleanup on the SVN branches and spotted those as spurious. So, we've got the cvs-fast-export generated tags and associated branches, and then the 2011 SVN cleanup of the cvs2svn versions of the same thing.

This is the explanation I was hoping for!

view this post on Zulip Sean (Feb 04 2021 at 03:21):

Yeah, okay I can see that happening and how it might have gotten interpreted -- a branch was tagged, the branch was removed, but a few stray files from that branch ended up remaining tagged/referenced, so it generated delete commits to preserve their lineage.

view this post on Zulip Sean (Feb 04 2021 at 03:23):

There were actually a dozen or two commits very similar to those, which is also why it was concerning (they were just the biggest two), but I think that fully explains them.

view this post on Zulip Sean (Feb 04 2021 at 03:24):

starseeker said:

(I think if we delete the two branches in question from Git we can probably garbage collect them out, by the way - do we want to preserve that, or would it be better to remove?)

I think we can just ignore them for now. They're not the only ones, they just stood out during validation as potential processing corruption.

view this post on Zulip Sean (Feb 04 2021 at 03:24):

Knowing that they're not, they are out of sight, out of mind. Excellent!

view this post on Zulip Erik (Feb 04 2021 at 14:30):

might wanna docco that with stashed history

view this post on Zulip starseeker (Feb 04 2021 at 14:33):

@Erik Did folks manually edit CVS files to tag releases or some such? I know from what Sean said the history was edited at least once to deal with some Tcl/Tk issues (which can be seen comparing CVS checkouts vs git checkouts, actually)...

view this post on Zulip Erik (Feb 04 2021 at 14:35):

I have no recollection of manually tweaking CVS files for a release O.o I was slid off to muves3 around that time I think, I think 7 happened without me

view this post on Zulip Erik (Feb 04 2021 at 14:36):

I mostly just did fbsd support and autoconf before reassignment (plus a few side projects, uh, some parser for matrex federations, uh, something else for Geoff, too... )

view this post on Zulip starseeker (Feb 04 2021 at 14:37):

Fair enough. The more I see of all this the more grateful I am that I got to come on board just as SVN was introduced.

view this post on Zulip starseeker (Feb 04 2021 at 14:39):

CVS is... weird. At one point I even considered https://github.com/rcls/crap as an alternative to cvs-fast-export, since it seems to reproduce in Git what CVS checks out, but after discussions with Sean (and I think I noticed this myself at one point) I learned even CVS itself won't accurately check out some parts of our history (accurately in the sense of reproducing the tree that the users would have seen at the time) due to the edits made to work around the libtcl/libtk problems.

view this post on Zulip Erik (Feb 04 2021 at 14:41):

CVS is a "remote RCS server", grokking rcs is kinda important for grokking cvs

view this post on Zulip starseeker (Feb 04 2021 at 14:42):

<snort> I guess as a young whippersnapper I joined the software community too late to properly appreciate them. "RCS" to me mostly means annoying tags at the beginning of files that complicate diffing :-P

view this post on Zulip starseeker (Feb 04 2021 at 14:44):

Which is not to say I could have designed anything better than RCS back in the day, of course - I get the sense that VCS is one of those problems where only experience with the day-to-day requirements of the problems at scale can really result in good designs.

view this post on Zulip Sean (Feb 04 2021 at 15:29):

I don't know of CVS files being edited for releases. They were mostly edited to "fix" things CVS couldn't do, like renaming a directory or eliminating a bad commit.

view this post on Zulip Sean (Feb 04 2021 at 15:29):

@starseeker another something to investigate... do you know why this doesn't work?
git diff 2686445fedcfeadcbc8a2960fd8690f2d0ccbf47~1 2686445fedcfeadcbc8a2960fd8690f2d0ccbf47

view this post on Zulip Sean (Feb 04 2021 at 15:30):

show works, but can't diff it... somehow it doesn't have an ancestor or has multiple or ... ?

view this post on Zulip starseeker (Feb 04 2021 at 15:32):

Author: Douglas Kingston <dpk@randomnotes.org>
Date:   Fri Dec 16 00:10:31 1983 +0000

    Original 4.2 Distribution Source

    svn:revision:2
    cvs:account:dpk
    cvs:branch:trunk

That's the earliest commit in the history - it doesn't have an ancestor

view this post on Zulip Sean (Feb 04 2021 at 15:33):

OH!

view this post on Zulip Sean (Feb 04 2021 at 15:33):

g'dammit.. okay, I can special case it. haha.

view this post on Zulip starseeker (Feb 04 2021 at 15:33):

I've noticed git tools aren't always graceful when they encounter that case.

view this post on Zulip Sean (Feb 04 2021 at 15:33):

I was able to identify most of the missing revisions, but there are about 160 that didn't match, and when I investigated it was because git's show syntax doesn't match diff syntax for merge commits. manually looking at one of the ones that didn't match, it was indeed a merge commit that didn't match because of the format. so I regen'd the diffs but it barfed on that one.

view this post on Zulip starseeker (Feb 04 2021 at 15:34):

/me grins - you found the last turtle!

view this post on Zulip Sean (Feb 04 2021 at 15:34):

there must be some other diff syntax that will get that commit?

view this post on Zulip Sean (Feb 04 2021 at 15:35):

tried ^ ...

view this post on Zulip starseeker (Feb 04 2021 at 15:35):

git show will print it... don't know about diff.

view this post on Zulip Sean (Feb 04 2021 at 15:35):

show is the wrong format

view this post on Zulip starseeker (Feb 04 2021 at 15:35):

Conceptually, what are we diffing against?

view this post on Zulip Sean (Feb 04 2021 at 15:35):

I want the patch for that commit

view this post on Zulip Sean (Feb 04 2021 at 15:35):

the commit

view this post on Zulip Sean (Feb 04 2021 at 15:36):

so patch format, it'll be lines added.

view this post on Zulip starseeker (Feb 04 2021 at 15:36):

git diff 4b825dc642cb6eb9a060e54bf8d69288fbee4904 2686445fedcfeadcbc8a2960fd8690f2d0ccbf47

view this post on Zulip Sean (Feb 04 2021 at 15:36):

^! seems to work, but I'm not sure what that means...

view this post on Zulip starseeker (Feb 04 2021 at 15:36):

https://stackoverflow.com/a/40884093

view this post on Zulip starseeker (Feb 04 2021 at 15:36):

Does that work?

view this post on Zulip Sean (Feb 04 2021 at 15:37):

eh... that's f'ing retarded.

view this post on Zulip Sean (Feb 04 2021 at 15:37):

I'm sure it will. There's got to be some shorthand for that shit.

view this post on Zulip starseeker (Feb 04 2021 at 15:38):

Maybe ^! is a shorthand for that? Dunno, haven't encountered that syntax before - @Erik ?

view this post on Zulip Sean (Feb 04 2021 at 15:39):

I found that on some other SO but can't find it in the docs to know what it means.

view this post on Zulip starseeker (Feb 04 2021 at 15:39):

Yeah, that's a tough one to google...

view this post on Zulip Erik (Feb 04 2021 at 15:40):

I dunno !
^ means "previous"
~<n> means "nth previous"

view this post on Zulip Sean (Feb 04 2021 at 15:40):

here we go: The r1^! notation includes commit r1 but excludes all of its parents. By itself, this notation denotes the single commit r1.

view this post on Zulip Sean (Feb 04 2021 at 15:41):

sounds like that might be correct

view this post on Zulip Erik (Feb 04 2021 at 15:41):

ah, neat
'git show' is what I use for a single commit

view this post on Zulip Sean (Feb 04 2021 at 15:41):

me too, but its syntax is wrong for merge commits (at least for patch and diffing purposes)

view this post on Zulip Sean (Feb 04 2021 at 15:41):

it does ++ and -- lines

view this post on Zulip Sean (Feb 04 2021 at 15:42):

there's undoubtedly other options to change the format, but diff is the command that does it in the right format by default, so it was a hunt to find the right syntax for "just this commit"

view this post on Zulip Sean (Feb 04 2021 at 15:42):

apparently it's rev^! ...

view this post on Zulip Sean (Feb 04 2021 at 15:44):

/me reruns a diff dump.. should have answers within the hour

view this post on Zulip Erik (Feb 04 2021 at 15:44):

I think I would have ended up doing "git diff rev^..rev"

view this post on Zulip Erik (Feb 04 2021 at 15:44):

or something-ish

view this post on Zulip Sean (Feb 04 2021 at 15:44):

yep, did that and that's what barfs when it encounters the last rev

view this post on Zulip Sean (Feb 04 2021 at 15:45):

er, first commit

view this post on Zulip Sean (Feb 04 2021 at 15:45):

git diff rev~ rev
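
The root-commit special case discussed above can be sketched as a small shell helper. This is only an illustration, not the verification script itself: `diff_args` and `parent_count` are hypothetical names, and the hard-coded id is the empty-tree hash from the `git diff 4b825dc... 2686445...` suggestion earlier in the thread.

```shell
# Hash of git's empty tree, as used in the suggestion above.
EMPTY_TREE=4b825dc642cb6eb9a060e54bf8d69288fbee4904

# Print the arguments to hand to `git diff` for one commit.
#   $1 = commit id, $2 = number of parents that commit has.
# A root commit has no parent, so it is diffed against the empty tree;
# anything else is diffed against its first parent (the rev~ rev form).
diff_args() {
    if [ "$2" -eq 0 ]; then
        printf '%s %s\n' "$EMPTY_TREE" "$1"
    else
        printf '%s~1 %s\n' "$1" "$1"
    fi
}

# Count the parents of a commit: `git rev-list --parents` prints the
# commit id followed by its parent ids on a single line.
parent_count() {
    git rev-list --parents -n 1 "$1" | awk '{ print NF - 1 }'
}
```

Inside a clone, something like `git diff $(diff_args "$sha" "$(parent_count "$sha")")` would then produce a patch for every commit, including the very first one.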

view this post on Zulip starseeker (Feb 04 2021 at 15:48):

@Sean If you think it's worthwhile, I'd stick your verification script or at least notes about the key gotchas associated with creating it in misc/repoconv once this is all over - it can't be any worse than my conversion logic, and it might be useful someday if we ever have to dive back into this swamp...

view this post on Zulip Sean (Feb 04 2021 at 15:48):

sure

view this post on Zulip Sean (Feb 04 2021 at 15:49):

I've been stashing notes just in case I need to reference one of the 1-liners later, and notes on missing revs as they've been explained

view this post on Zulip starseeker (Feb 04 2021 at 15:49):

/me is embarrassed that he didn't think of cutting down the diff into +/- lines - should have considered that when the commit messages didn't resolve things unambiguously

view this post on Zulip Sean (Feb 04 2021 at 15:50):

well still remains to be seen -- they may need to be sorted too, but wasn't going to do that until there's evidence it's needed

view this post on Zulip Sean (Feb 04 2021 at 15:51):

e.g., if there are multiple file changes and svn shows A, B, C but then git displays C, B, A or similar ... shouldn't but might be possible. so far I'm thinking not just because so many are matching.

view this post on Zulip Sean (Feb 04 2021 at 15:52):

but the bigger set was empty merges next so should see how many of the 160 this eliminates

view this post on Zulip starseeker (Feb 04 2021 at 16:03):

/me will be curious to see if any of the commit message + timestamp based mappings prove to be incorrect.

view this post on Zulip Sean (Feb 04 2021 at 16:16):

I can re-run on everything next but immediate priority was just identifying potentially missing commits

view this post on Zulip starseeker (Feb 04 2021 at 16:16):

/me nods

view this post on Zulip starseeker (Feb 04 2021 at 16:17):

Whatever you think best - just want to do whatever I can to put a bow on this sucker.

view this post on Zulip Sean (Feb 04 2021 at 16:17):

that will require pulling all the svn diffs, which takes a while. took longer to pull 720 svn diffs from sf than it took to pull 70000 git diffs locally ... not much longer but still was a while

view this post on Zulip Sean (Feb 04 2021 at 16:17):

I'm fine just making sure we're not missing data. if a commit is mis-tagged, that could be fixed later.

view this post on Zulip starseeker (Feb 04 2021 at 16:18):

Might be faster to rsync the SVN repo and pull it locally - that's how I've worked with it

view this post on Zulip Sean (Feb 04 2021 at 16:18):

oh it definitely would

view this post on Zulip Sean (Feb 04 2021 at 16:18):

I just didn't bother

view this post on Zulip Sean (Feb 04 2021 at 16:18):

that'd take like two lines.

view this post on Zulip Sean (Feb 04 2021 at 16:18):

i'm all about the 1-liners

view this post on Zulip starseeker (Feb 04 2021 at 16:19):

K. If you've got the data to hand though, now that I've got what should be a means to correct them implemented, I'd kinda like to go ahead and fix them. Remember, if we have to ask everyone to re-clone, it's also going to wipe out any pull requests, etc. on github folks may have open.

view this post on Zulip starseeker (Feb 04 2021 at 16:20):

Remember what happened when I messed up the web git repo

view this post on Zulip Sean (Feb 04 2021 at 16:21):

ah, okay, good point. I'll poke that next then.

view this post on Zulip starseeker (Feb 04 2021 at 16:37):

One trick will be the known cases where cvs-fast-export split things more finely than cvs2svn with those desc tags from CVS - any commit with that in play won't match in diff - for those cases (most of them, anyway) the commit message would actually be more reliable.

view this post on Zulip starseeker (Feb 04 2021 at 16:40):

So I guess the priority ordering would be:

1) unique commit message mapping
2) diff match

view this post on Zulip starseeker (Feb 04 2021 at 16:44):

Starts getting a bit more iffy if we have non-unique commit message, matching timestamp, but non-matching diff (with no other exact matching diff) - if the diff is a subset of the SVN diff a case could be made for assigning the number, but that'd probably take some laborious manual inspection...

view this post on Zulip starseeker (Feb 04 2021 at 16:44):

Hopefully we'll have few/no cases that fall into those categories.

view this post on Zulip starseeker (Feb 04 2021 at 16:56):

Also, fair warning - I'd expect some differences (due to line endings especially) in the CVS era commits.

view this post on Zulip starseeker (Feb 04 2021 at 16:57):

If you want to focus on just the commits that are currently unmapped, and ignore trying to validate all of them based on diffs, I'd personally be fine with that given the difficulties of the latter.

view this post on Zulip Sean (Feb 05 2021 at 07:33):

I haven't run into any of those yet, but split commits could be handled pretty easily I think. If they're already tagged, I'd just ignore them and rely on all the unsplit matching as sufficient validation.

view this post on Zulip Sean (Feb 05 2021 at 07:36):

Update -- my reprocessing using diff format indeed improved things significantly. Found more than half the remaining missing commits. Down to just 64 commits unidentified.

view this post on Zulip Sean (Feb 05 2021 at 07:38):

Digging in, turns out at least a portion of them are due to changed lines with internal whitespace differences. It's some sort of expanded tabs issue, possibly where cvs-fast-export preserved tabs correctly whereas cvs2svn did not preserve them. That's unconfirmed, but matches the commit I was checking. Rerunning it now with internal space stripped and should know in the morning what's left.
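
The normalization described over the last few messages (keep only the added and removed lines, then ignore internal whitespace) might look roughly like the following. It is a sketch under assumptions, not the actual script: `normalize_patch` is a hypothetical name, it reads a patch on stdin, and the final sort is the optional step mentioned earlier as possibly needed if the two tools order changes differently.

```shell
# Reduce a unified diff (git or svn flavoured) to a comparable fingerprint.
normalize_patch() {
    # 1. keep only added/removed lines
    # 2. drop the "--- file" / "+++ file" headers
    #    (limitation: a removed line whose text starts with "-- " is dropped too)
    # 3. delete spaces, tabs and carriage returns
    # 4. sort so the order of changes no longer matters
    grep -E '^[+-]' \
        | grep -Ev '^(\+\+\+|---) ' \
        | tr -d ' \t\r' \
        | LC_ALL=C sort
}
```

Two patches can then be treated as the same change when `git diff ... | normalize_patch` and `svn diff -c N | normalize_patch` print identical output. Hunk boundaries and context lines never enter the comparison, which sidesteps differences in where each tool starts and ends its patch blocks.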

view this post on Zulip starseeker (Feb 05 2021 at 17:46):

Huh - I expected some line ending oddities, but I'm surprised there's actual internal whitespace diffs.

view this post on Zulip starseeker (Feb 06 2021 at 04:13):

@Sean anything useful I can do?

view this post on Zulip Erik (Feb 06 2021 at 23:54):

/me shuts his mouth O:-)

view this post on Zulip Erik (Feb 06 2021 at 23:55):

is there a public preview of the current incarnation of the git repo?

view this post on Zulip starseeker (Feb 07 2021 at 02:13):

https://github.com/starseeker/brlcad_conv11

view this post on Zulip starseeker (Feb 07 2021 at 03:00):

@Erik Unless @Sean spots something, the only planned remaining changes (other than updates until SVN closes) are the application of some additional SVN commit -> Git commit mappings @Sean has identified during his validation.

view this post on Zulip Sean (Feb 08 2021 at 15:19):

I'm down to reviewing the last few remaining missing commits -- it's down to about 50 missing, so I should hopefully figure out what happened without too much trouble (e.g., if they're categoric processing artifacts or actually missing data). It's a manual process for the few remaining until I find a categoric pattern.

So far, I'm genuinely having trouble finding one of them, but not done hunting for it (I found a fragment but then couldn't find its commit, so have to re-find the fragment to see if that was combined/merged with something else or just a coincidental edit to the same line in an unrelated commit.)

view this post on Zulip starseeker (Feb 08 2021 at 17:21):

@Sean you're much deeper in than I at this point, but is there anything I might be able to help with?

view this post on Zulip starseeker (Feb 08 2021 at 23:53):

Confound it - @Sean any SHA1s after r77842 are most likely going to change

view this post on Zulip starseeker (Feb 08 2021 at 23:54):

I'm going to see if I can arrange a partial re-run, but there's a glitch in one of my processing filters

view this post on Zulip starseeker (Feb 09 2021 at 13:25):

Phew. Tightrope walked, looks like: https://github.com/starseeker/brlcad_conv12

view this post on Zulip Sean (Feb 09 2021 at 17:28):

@starseeker can you take a look at 4d401a8617869d3594b5948de12a374a5bd292fe and ea6d4c16bae6ecf30d4439d92c8dd72f56b3e942 and r19440

view this post on Zulip Sean (Feb 09 2021 at 17:32):

there are no remappings or missings after 77231 so it's fine

view this post on Zulip Sean (Feb 09 2021 at 17:40):

pulling the new repo

view this post on Zulip starseeker (Feb 09 2021 at 18:31):

Sean said:

starseeker can you take a look at 4d401a8617869d3594b5948de12a374a5bd292fe and ea6d4c16bae6ecf30d4439d92c8dd72f56b3e942 and r19440

If I'm interpreting this correctly, 4d401a8617869d3594b5948de12a374a5bd292fe matches the r19440 change on trunk, and ea6d4c16bae6ecf30d4439d92c8dd72f56b3e942 is the same change applied to the rel-5-3 branch. However, the r19440 label was applied to the branch commit rather than the trunk commit.

view this post on Zulip starseeker (Feb 09 2021 at 18:33):

Which, since SVN reports a diff in trunk for r19440, means the timestamp must have matched for the branch application, but it should instead have been applied to the trunk version

view this post on Zulip starseeker (Feb 09 2021 at 18:35):

So the "correct" fix there would be to apply the SVN revision to the trunk commit and strip it from the branch commit. I can do the former, but I'll have to tweak things to support the latter.

view this post on Zulip starseeker (Feb 09 2021 at 18:36):

If you want, we can establish a line with the convention SHA1; to denote commits that I should clear an SVN revision assignment from.

view this post on Zulip Sean (Feb 09 2021 at 18:36):

woo hoo, resolved another 13... damn commit messages :)

view this post on Zulip Sean (Feb 09 2021 at 18:37):

I don't think it's a huge deal, I'm not sure how many of those there are. possibly quite unlikely if it was just because those commits were within a few seconds of each other?

view this post on Zulip Sean (Feb 09 2021 at 18:38):

I'm getting a count now -- there's some number of trunk commits that are tagged on branches

view this post on Zulip starseeker (Feb 09 2021 at 18:39):

If cvs2svn consolidated the timestamps on those two commits as "identical" and picked the newer timestamp, then it will happen every time cvs-fast-export resolved those cases into individual commits and the branch commit was the newer of the two.

view this post on Zulip starseeker (Feb 09 2021 at 18:43):

Might as well fix them if it's easy to pull the data set - it won't be appreciably more work than adding the missing mappings in the first place.

view this post on Zulip Sean (Feb 09 2021 at 18:45):

I think, if I've counted correctly, there are at least 121 commit revisions that were on trunk but are tagged in git on a branch.

view this post on Zulip starseeker (Feb 09 2021 at 18:45):

Blegh.

view this post on Zulip starseeker (Feb 09 2021 at 18:46):

OK, I'll set up to fix 'em

view this post on Zulip starseeker (Feb 09 2021 at 18:46):

Fortunately, you've identified a test case already ;-)

view this post on Zulip Sean (Feb 09 2021 at 18:47):

that one was an anomaly... wasn't even looking, I was collapsing the multiple-match revs manually for the 30 or so that match multiple diffs and that matched two... and noticed it seemed flipped

view this post on Zulip starseeker (Feb 09 2021 at 18:48):

Ah. Well, either way, good catch.

view this post on Zulip Sean (Feb 09 2021 at 18:48):

I'm going to double-check that 121 too. That seems high to me.

view this post on Zulip Sean (Feb 09 2021 at 19:28):

Is the branch/trunk label reliable? I've been assuming it was generated based off commit location with no guessing involved, but realize I should double-check that assumption.

view this post on Zulip starseeker (Feb 09 2021 at 19:52):

For SVN it should be reliable. CVS identifications were up to cvs-fast-export/cvs2svn and I'm not as certain there

view this post on Zulip starseeker (Feb 09 2021 at 20:05):

The cvs:branch labels were based off of a fairly low-level analysis of the git conversion data - misc/repoconv/cvs_info.sh IIRC

view this post on Zulip starseeker (Feb 09 2021 at 20:07):

The root was the git rev-list --first-parent reporting, which depends on cvs-fast-export correctly assigning the first parent based on CVS branch data.

view this post on Zulip starseeker (Feb 09 2021 at 20:09):

(and on me having correctly interpreted the information, of course)

view this post on Zulip starseeker (Feb 11 2021 at 03:44):

@Sean are we still at about 40 unresolved?

view this post on Zulip Sean (Feb 11 2021 at 07:32):

There was a categoric anomaly so I cleaned up and changed some things to check, and am re-running the comparison to make sure.

view this post on Zulip Sean (Feb 11 2021 at 07:38):

The 40 count was wrong (it was higher). On the plus side, scripting is cleaned up (had to rewrite everything) to the point that it can check all revs easily now. Got svn cloned too so it can do that quickly. Got it matching files and log messages cleanly now too. It's running through re-processing the missing batch now and should have an update in the morning.

view this post on Zulip Sean (Feb 11 2021 at 09:19):

Can you see if you can find c1644? There's a number of initial rev commits like that that I can't find. I'd hope it simply got merged with something else, but trying to verify that on one of them like 1644.

view this post on Zulip starseeker (Feb 11 2021 at 13:31):

Hmm. Well, if I cheat a bit and use https://stackoverflow.com/a/13598028 to find when the files added in c1644 were added in Git, I get:

git log --diff-filter=A -- util/pl-X.c
commit 86a7fcc40057934832f61255b606c0bd6f7fc12b
Author: Phillip Dykstra <phil@pdykstra.com>
Date:   Thu Apr 28 17:40:50 1988 +0000

    Unix-plot to X Window System display (X11)

    cvs:branch:trunk
    cvs:account:phil

and

git log --diff-filter=A -- util/pl-X10.c
commit a6feb76ce1551b09222463514f15e65db0343b55
Author: Phillip Dykstra <phil@pdykstra.com>
Date:   Thu Apr 28 17:43:26 1988 +0000

    Unix-plot to X Window System Display (X10R4)

    cvs:branch:trunk
    cvs:account:phil

I didn't do a detailed diff analysis, but it looks like the difference is splitting up the commit to get the distinct commit messages?

view this post on Zulip Sean (Feb 12 2021 at 08:13):

Cool, that was super helpful. I'm not sure about the general case but I'm guessing it's split them up because they were far apart enough in time (couple min), so cvs2git decided to handle them differently. Checking down through, that rules out a bunch but I have to figure out how to automate the check across all 135 missing. I have checks for matching diffs vs logs vs changed files but obviously doesn't catch split/merge changes unless all that changed was the log message (did verify a slew with that lil trick).

view this post on Zulip Sean (Feb 12 2021 at 08:21):

Initial revisions seem to be a large portion of the bulk missing. Took some work to figure out they're not just on branches.

Three commits you could check on for me are r51428, r54352, and r64428. They're fairly modern commits, so they stick out like a sore thumb for not matching. Haven't dove in to figure out what's up with them.

Set up the check across all svn commits and that's chugging along now. When that finishes up, should have a list of commits that are mistagged on branch vs trunk.

view this post on Zulip starseeker (Feb 12 2021 at 13:12):

@Sean Starting with r51428... The checkouts of the files are identical, so I pulled the diffs:

git format-patch -1 be5072cb90113d7c0d75839cc4f183d8cde1646b
svn diff -c51428 > r51428.patch

The patch formatting is different, so I brought them up in meld and applied all the SVN style headers to the git patch. Doing that, I was left with:
diff.png

view this post on Zulip starseeker (Feb 12 2021 at 13:12):

It looks like git and svn made very slightly different decisions on where to start and end their patch blocks.

view this post on Zulip starseeker (Feb 12 2021 at 13:23):

r54352 is similar, but less subtle - identical files in checkouts, but different ordering on the subtraction line instructions in the diff: diff_r54352.png

view this post on Zulip starseeker (Feb 12 2021 at 13:35):

r64428 is the most spectacularly different of the diffs, but checking the Git and SVN checkouts of r64427 and 64428 all files appear to agree, so the two different diffs appear to end up doing the same job.

view this post on Zulip starseeker (Feb 13 2021 at 18:59):

@Sean was that what you were looking for, or is there something else about those commits that is concerning?

view this post on Zulip Sean (Feb 13 2021 at 19:28):

No that was great, helpful. I hypothesized that'd happen but hadn't actually seen it (or at least hadn't noticed). Those stuck out because they were new. I've been going through the list ruling out others like those.

view this post on Zulip starseeker (Feb 13 2021 at 19:46):

Any more you'd like me to check?

view this post on Zulip starseeker (Feb 15 2021 at 13:28):

@Sean How did the re-run go?

view this post on Zulip Sean (Feb 17 2021 at 08:58):

Went well! Took a while to process, but went really well. I'm double-checking a couple of lists, but here's the list of trunk commits that are misattributed to branches in git. It's not as many as originally seemed, fortunately, but it's a few:
mistagged_trunk_commits.log

view this post on Zulip starseeker (Feb 17 2021 at 12:24):

Looks good - thanks!

view this post on Zulip starseeker (Feb 17 2021 at 12:26):

r66607 is surprising - I wouldn't have expected any issues like that in the SVN era

view this post on Zulip starseeker (Feb 17 2021 at 14:07):

OK. It looks like r66607 was a multi-branch commit, making changes to both the branch and trunk in the same commit. I didn't realize we had any of those in the modern era - all the instances I had spotted were much earlier.

view this post on Zulip starseeker (Feb 17 2021 at 14:09):

What the conversion ended up doing was to apply the changes from r66607 to trunk in commit r66672.

view this post on Zulip starseeker (Feb 17 2021 at 14:11):

Which will also mean that the r66672 diff won't match that from SVN, since the SVN change was just the HAVE_ANALYZER_NORETURN test.

view this post on Zulip starseeker (Feb 17 2021 at 14:13):

That'll be tricky to fix. Hmm.

view this post on Zulip starseeker (Feb 17 2021 at 14:17):

Going through the rest of the list, I haven't identified obvious reassignment candidates yet for the following:

19033
19757
19759
19761
19763

view this post on Zulip starseeker (Feb 17 2021 at 21:30):

Ooof. https://github.com/starseeker/brlcad_conv13/commits/main?before=e9c0d7ea2dd93965dd2037357f9992480cc1bc12+35&branch=main

I think I've got the trunk portion of r66607 spliced in correctly.

view this post on Zulip starseeker (Feb 17 2021 at 21:38):

Identified 19033

view this post on Zulip starseeker (Feb 17 2021 at 21:41):

Ah, I see. The other four are cvs2svn artifacts - so rather than reassigning, they simply don't have direct analog commits at all and we just remove the assignments.

view this post on Zulip starseeker (Feb 17 2021 at 21:41):

@Sean OK, next! ;-)

view this post on Zulip Sean (Feb 18 2021 at 06:56):

Cool, glad you could deduce them. I wasn't 100% sure if you have it tagging revs separate from branches. I didn't check whether the :branch: tag was correct or not, only that that rev definitely didn't happen on a branch.

view this post on Zulip Sean (Feb 18 2021 at 07:02):

Next set is the inverse -- looking a lot better (half done and it has found just one :trunk mis-assignment) but taking longer to process for some reason. Should be done here soon.

view this post on Zulip Sean (Feb 18 2021 at 07:03):

Will share the list of found assignments missing in the morn.

view this post on Zulip Sean (Feb 18 2021 at 07:15):

starseeker said:

Ooof. https://github.com/starseeker/brlcad_conv13/commits/main?before=e9c0d7ea2dd93965dd2037357f9992480cc1bc12+35&branch=main

I think I've got the trunk portion of r66607 spliced in correctly.

What am I looking at in that github date view?

view this post on Zulip starseeker (Feb 18 2021 at 13:18):

Sean said:

starseeker said:

Ooof. https://github.com/starseeker/brlcad_conv13/commits/main?before=e9c0d7ea2dd93965dd2037357f9992480cc1bc12+35&branch=main

I think I've got the trunk portion of r66607 spliced in correctly.

What am I looking at in that github date view?

The insertion of this commit into the history: https://github.com/starseeker/brlcad_conv13/commit/e977c035ec8a79967cb3d2a0874af08d86a89764

view this post on Zulip starseeker (Feb 18 2021 at 13:23):

I used the date view to illustrate it's not just an isolated commit in the repo, but part of the main history

view this post on Zulip starseeker (Feb 18 2021 at 13:28):

@Sean Where are we with the list of previously unidentified SVN id matches found by your diffing method? I'd be glad to help if you have a set of commits for manual review.

Also, just conceptually, what is your preference for cases like the one identified earlier where a single cvs2svn commit got split up into multiple git commits? Did you want to assign the SVN id to each "portion" commit in Git, if they can be identified?

view this post on Zulip Sean (Feb 18 2021 at 20:00):

That would be totally awesome to tag both commits, and similarly, tag merged commits with multiple revision tags. I know some of them but haven't been fully tracking. I do think there are probably 100-200 in that category.

view this post on Zulip starseeker (Feb 18 2021 at 21:04):

Tagging multiple svn revs onto a single Git commit would require some rework of the assignment code - let me know if that's something you definitely want to do.

view this post on Zulip Sean (Feb 19 2021 at 06:47):

If you want to, go for it, but I don't think it's strictly necessary. So long as the commit is tagged somewhere on one of the rev parts, that should be sufficient for tracing.

view this post on Zulip Sean (Feb 19 2021 at 06:49):

I finished checking the inverse and the only anomaly was 30804. It's tagged as "svn:branch:trunk-UNNAMED-BRANCH" but was branch "unlabeled-2.5.1" in svn.

view this post on Zulip Sean (Feb 19 2021 at 07:22):

Am seeing some other anomalies on these tagged revisions, what's going on with r30687 ? The tags don't appear to match svn at all.

view this post on Zulip Sean (Feb 19 2021 at 07:28):

Another curious one is 46324 -- it's tagged as being on four branches but it was a tag, never committed to branches. Saw some others like that.

view this post on Zulip Sean (Feb 19 2021 at 07:33):

Even if tags are treated as branches, it's tagged on:

    svn:branch:ansi-20040316-freeze
    svn:branch:bobWinPort-20051223-freeze
    svn:branch:ctj-4-5-post
    svn:branch:ctj-4-5-pre
    svn:branch:hartley-6-0-post
    svn:branch:offsite-5-3-pre
    svn:branch:opensource-pre
    svn:branch:windows-20040315-freeze

but it was on these tags in svn:

ansi-20040316-freeze
ansi-20040405-merged
autoconf-freeze
bobWinPort-20051223-freeze
ctj-4-5-post
ctj-4-5-pre
hartley-6-0-post
hartley-6-0-pre
offsite-5-3-pre
opensource-post
opensource-pre
windows-20040315-freeze

view this post on Zulip Sean (Feb 19 2021 at 07:40):

Sean said:

I finished checking the inverse and the only anomaly was 30804. It's tagged as "svn:branch:trunk-UNNAMED-BRANCH" but was branch "unlabeled-2.5.1" in svn.

Looks like 30688 is also tagged as trunk-UNNAMED-BRANCH but also cjohnson-mac-hack, but I don't see that in svn. Svn only lists it affecting:

unlabeled-1.1.1
unlabeled-1.1.2
unlabeled-1.2.1
unlabeled-11.1.1
unlabeled-2.12.1
unlabeled-2.6.1
unlabeled-9.1.1
unlabeled-9.10.1
unlabeled-9.12.1
unlabeled-9.2.1
unlabeled-9.3.1
unlabeled-9.7.1
unlabeled-9.9.1

view this post on Zulip starseeker (Feb 19 2021 at 14:39):

@Sean I've added the ability to correct the r30804 and r30688 branch assignments.

view this post on Zulip starseeker (Feb 19 2021 at 15:08):

Looking at r46324, here's what I'm seeing:

the svn:revision:46324 label is on four commits:

commit 44e3d7341c5680250d65091b2aff6ed051720a11 (HEAD, origin/itcl3-2, itcl3-2)
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Tue Aug 23 12:19:43 2011 +0000

    revmoed additional 3rd party dependencies that don't really belong amongst our other tags (svn branch delete)

    svn:revision:46324
    svn:branch:itcl3-2
    svn:account:brlcad

commit a988903bbe27985e0dd94228e07079e91e98be4d (origin/libpng_1_0_2, libpng_1_0_2)
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Tue Aug 23 12:19:43 2011 +0000

    revmoed additional 3rd party dependencies that don't really belong amongst our other tags (svn branch delete)

    svn:revision:46324
    svn:branch:libpng_1_0_2
    svn:account:brlcad

commit c54b9b07158d4a904aabddae264290854ecb250c (origin/tcl8-3, tcl8-3)
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Tue Aug 23 12:19:43 2011 +0000

    revmoed additional 3rd party dependencies that don't really belong amongst our other tags (svn branch delete)

    svn:revision:46324
    svn:branch:tcl8-3
    svn:account:brlcad

commit 03af105da8dd3cf85a29cc7f056513cc8e79d751 (origin/tk8-3, tk8-3)
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Tue Aug 23 12:19:43 2011 +0000

    revmoed additional 3rd party dependencies that don't really belong amongst our other tags (svn branch delete)

    svn:revision:46324
    svn:branch:tk8-3
    svn:account:brlcad

When I look at what r46324 did in SVN, it eliminated branches/tags/itcl3-2, branches/tags/tcl8-3, branches/tags/tk8-3, and branches/tags/libpng_1_0_2 - this seems to correspond to what is recorded in those Git commits (which can't actually delete the branches without any commits being uniquely referenced by them getting garbage collected.)

view this post on Zulip starseeker (Feb 19 2021 at 15:11):

Similarly, for r30687:

$ git log --all --grep 30687
commit 004ec0ae439f0ca3c814d22a46957012cd8fb239
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:Original
    svn:account:brlcad

commit 720f9b9b75588e35d3cce0f9f5b802abea2259ab
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:itcl3-2
    svn:account:brlcad

commit f206b315ca475d3a3e55e98ec42d772c6b05baee
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:libpng_1_0_2
    svn:account:brlcad

commit cbff64617866cc3fc2b25db15cd610e651561958
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:tcl8-3
    svn:account:brlcad

commit 87bc784daf7f15cc8d9c9fa980a934a98a17de95
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:tk8-3
    svn:account:brlcad

commit d748c2ea214b699008563e18f5a7105de39faba9
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Wed Apr 16 14:40:20 2008 +0000

    remove branches that have no meaning and are for 3rd-party dependencies (svn branch delete)

    svn:revision:30687
    svn:branch:zlib_1_0_4
    svn:account:brlcad

view this post on Zulip starseeker (Feb 19 2021 at 15:14):

I think you're right that SVN tags are getting treated as branches - that's the only way to handle SVN tags with edits - and I doubt I attempted to distinguish when assigning the svn:branch labels.

view this post on Zulip starseeker (Feb 19 2021 at 15:15):

@Sean I'm not following how you're getting an association between (say) svn:branch:ansi-20040316-freeze and r46324 ?

view this post on Zulip starseeker (Feb 19 2021 at 15:17):

I guess I could try to take a list of commits made to tags instead of branches and update the svn:branch: labels to be svn:tag: labels instead?

view this post on Zulip starseeker (Feb 19 2021 at 15:37):

Ah. I think this might actually work:

svn log file:///home/user/brlcad_repo/brlcad/tags|grep \^r|awk '{print $1}' > tags.log

That gives us (more or less) the set of tag commits. If we then look for any of them that match commit messages, we get a set of commits. tag_commits.txt

view this post on Zulip starseeker (Feb 19 2021 at 15:39):

So those commit labels could then be switched from svn:branch:* to svn:tag:*
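
A minimal sketch of that relabeling, assuming the usual `svn log` header line (`rNNN | user | date | N lines`) and the `svn:revision:` / `svn:branch:` trailer lines shown in the commit messages above. The function names are hypothetical and this is not the actual repoconv code. Unlike the grep/awk pipeline, the regex only accepts `r` followed by digits and a `|`, so log message lines that happen to start with "r" are skipped.

```python
import re

def tag_revisions(svn_log_text):
    """Collect revision numbers from `svn log` header lines (r<digits> | ...)."""
    revs = set()
    for line in svn_log_text.splitlines():
        m = re.match(r"r(\d+) \|", line)
        if m:
            revs.add(int(m.group(1)))
    return revs

def relabel(message, tag_revs):
    """Switch svn:branch: to svn:tag: when the message's svn:revision is a tag revision."""
    m = re.search(r"^[ \t]*svn:revision:(\d+)[ \t]*$", message, re.MULTILINE)
    if not m or int(m.group(1)) not in tag_revs:
        return message  # not a tag commit: leave the message untouched
    return re.sub(r"^([ \t]*)svn:branch:", r"\1svn:tag:", message, flags=re.MULTILINE)
```

Messages whose revision is not in the tag set come back unchanged, so the function can be run over every commit message in one pass.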

view this post on Zulip Sean (Feb 19 2021 at 16:45):

I can generate a list of all branches/tags associated with each commit easily enough -- that's what I was doing to validate specific sets, just not systematically on all commits.

view this post on Zulip starseeker (Feb 19 2021 at 16:47):

Did my breakdown of r46324 make sense? I may be missing something

view this post on Zulip Sean (Feb 19 2021 at 16:48):

you mean the question about the ansi one?

view this post on Zulip Sean (Feb 19 2021 at 16:48):

or in general?

view this post on Zulip starseeker (Feb 19 2021 at 16:49):

I wasn't (am not) seeing how you associated that commit with that branch, either in SVN or git?

view this post on Zulip Sean (Feb 19 2021 at 16:50):

let me check where I got it from because I agree, I'm only seeing it on four git commits now... maybe misprocessed on a subsequent validation

view this post on Zulip Sean (Feb 19 2021 at 16:52):

ah, yeah, looks like i wrote the wrong rev here in the chat.. 46324 is good...
that list was for 46322 ... which looks like it matches so I just got those two crossed when I was checking them manually

view this post on Zulip Sean (Feb 19 2021 at 16:54):

cool, that's great -- could be more thorough but that's good enough for non-branch commits -- means all non-branch commits that are tagged look like they're mostly tagged correctly besides the two trunk-UNNAMED-BRANCH commits.

view this post on Zulip starseeker (Feb 19 2021 at 16:56):

Those are partially my fault - a regex match was too loose and turned master-UNNAMED-BRANCH into trunk-UNNAMED-BRANCH. Either way though they had the wrong branch somehow, so I added corrections

view this post on Zulip Sean (Feb 19 2021 at 16:56):

uploading a mapping of all revs to non-trunk branches

view this post on Zulip Sean (Feb 19 2021 at 17:03):

can you check on something unusual... commits 21570 through 21634 in svn
I got nothing but a log message.

view this post on Zulip Sean (Feb 19 2021 at 17:04):

perhaps cvs2svn garbage of some sort? did cvs2git fix/import any of those better?

view this post on Zulip starseeker (Feb 19 2021 at 17:05):

Are there equivalent git commits?

view this post on Zulip starseeker (Feb 19 2021 at 17:06):

Hmm. No matching commit message anywhere for 21570

view this post on Zulip starseeker (Feb 19 2021 at 17:09):

Here's the portion of brlcad/h/Attic/tclIntPlatDecls.h,v from CVS that seems to have generated that commit:

1.1
log
@file tclIntPlatDecls.h was initially added on branch windows-6-0-branch.
@
text
@d1 585
@


1.1.2.1
log

view this post on Zulip starseeker (Feb 19 2021 at 17:09):

At a guess, cvs2svn put in an empty commit and cvs-fast-export ignored it as an empty commit...

view this post on Zulip Sean (Feb 19 2021 at 17:13):

what about one of the revs in the middle?

view this post on Zulip Sean (Feb 19 2021 at 17:14):

that's a huge range of commits, all with detailed log messages indicating activity

view this post on Zulip Sean (Feb 19 2021 at 17:15):

I mean, I guess it's garbage or old cvs issue of some sort, so not a problem, but odd

view this post on Zulip Sean (Feb 19 2021 at 17:16):

also, how'd you manage to catch/fix r62027 ? looks like it was added alongside trunk and you somehow fixed it (or at least tagged it better) as being a branch

view this post on Zulip starseeker (Feb 19 2021 at 17:16):

Same deal with 21600 from brlcad/libpkg/Attic/libpkg.dsp,v

1.1
log
@file libpkg.dsp was initially added on branch windows-6-0-branch.
@
text
@d1 115
@


1.1.2.1

view this post on Zulip starseeker (Feb 19 2021 at 17:18):

I was the one who messed that up, so I knew it was coming and did some manual work in the initial conversion to special case that.

view this post on Zulip Sean (Feb 19 2021 at 17:18):

neat

view this post on Zulip Sean (Feb 19 2021 at 17:34):

Okay! Finally... here's the list of commits that appear to have applied to multiple branches at the same time: commits_to_multiple_branches.txt

view this post on Zulip Sean (Feb 19 2021 at 17:36):

Might want to double-check me there, but that's only looking at the svn side. You may already be handling some of them differently like the branches AUTOCONF vs autoconf-branch ?

view this post on Zulip starseeker (Feb 19 2021 at 17:36):

Maybe. I recognize 19033 - it's one of the ones you flagged as being missing on trunk. I had removed its commit id from the branch, but if that's right it actually needs to be on both

view this post on Zulip starseeker (Feb 19 2021 at 17:38):

Blegh. Well, I uploaded the latest state at brlcad_conv14 to demonstrate the switch to svn:tag: labeling for those commits made to tags, but don't use that for SHA1 lists of any sort - stick to brlcad_conv12. Clearly the post-processing isn't done yet...

view this post on Zulip Sean (Feb 19 2021 at 17:38):

That transcript is derived by pulling a diff of all commits and extracting all the filepaths that changed.

view this post on Zulip starseeker (Feb 19 2021 at 17:39):

I'll take a run through - probably it's just going to mean an adjustment/expansion of the branch and/or trunk commits I need to manually specify revisions for

view this post on Zulip starseeker (Feb 19 2021 at 17:40):

The other list I know we still need is the svn commit IDs you were able to identify that I had never mapped, like 735 - did that prove practical or were there roadblocks?

view this post on Zulip Sean (Feb 19 2021 at 17:44):

yes, that's in one of the windows ... Screen-Shot-2021-02-19-at-12.41.50-PM.png :)

view this post on Zulip starseeker (Feb 19 2021 at 17:44):

/me grins - it's like my desk, but with yellow terminals instead of dark gray!

view this post on Zulip starseeker (Feb 19 2021 at 17:45):

I'm not sure what to make of 18999 - I'm not seeing two commits associated with that in Git

view this post on Zulip Sean (Feb 19 2021 at 17:45):

I was dark, but eyes needed a different hue some months back

view this post on Zulip starseeker (Feb 19 2021 at 17:46):

/me nods - it's surprising how much difference that makes over long stretches of time.

view this post on Zulip starseeker (Feb 19 2021 at 17:47):

/me goes through the list to see if he can quickly spot any candidates for svn revision labels...

view this post on Zulip starseeker (Feb 19 2021 at 18:10):

@Sean How authoritative was the cvs2svn branch identification for commits? A lot of these in git are tagged as rel-5-2 rather than rel-5-1-branch - given the process I used to try and determine which branch was the "origin" branch in CVS relied on the git conversion itself, it's possible I've not correctly identified the original branches...

view this post on Zulip starseeker (Feb 19 2021 at 18:30):

A sizable chunk of these are proving to be the mirror image of the other case - instead of the branch getting the svn id and trunk not getting it, it's trunk that got the id and the branch didn't.

view this post on Zulip Sean (Feb 19 2021 at 19:32):

How are the rev updates committed in the repo still valid? Doesn't assigning a different tag on earlier commits affect the future commit shas?

view this post on Zulip Sean (Feb 19 2021 at 19:33):

Can you give me an example of the rel-5-2 vs. rel-5-1-branch case? Could be a bug, but the processing was pretty straightforward to have it report what actually changed.

view this post on Zulip starseeker (Feb 19 2021 at 19:33):

Yes. Every time I have to do that, I have to upload a new repository. That's why the brlcad_conv13 and brlcad_conv14 repos were up briefly

view this post on Zulip starseeker (Feb 19 2021 at 19:34):

Sure, one sec

view this post on Zulip starseeker (Feb 19 2021 at 19:36):

OK, to pick one example (this is typical): r19440 in your list has:

branches/rel-5-1-branch trunk

The corresponding commits in the Git conversion report:
cvs:branch:rel-5-3 cvs:branch:trunk

view this post on Zulip starseeker (Feb 19 2021 at 19:36):

4d401a8617869d3594b5948de12a374a5bd292fe and ea6d4c16bae6ecf30d4439d92c8dd72f56b3e942

view this post on Zulip Sean (Feb 19 2021 at 19:58):

morrison@agua brlcad_conv11 % svn diff -c 19440  svn+ssh://brlcad@svn.code.sf.net/p/brlcad/code|grep "^Index: brlcad"
Index: brlcad/branches/rel-5-1-branch/tclscripts/mged/grid.tcl
Index: brlcad/trunk/tclscripts/mged/grid.tcl

No mention of rel-5-3 ...
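
The same check can be scripted. This is a rough sketch that pulls the trunk/branch/tag roots out of the `Index:` lines of `svn diff -c REV` output, assuming the `brlcad/trunk`, `brlcad/branches/NAME`, `brlcad/tags/NAME` layout seen in this thread. The function name is made up for illustration.

```python
import re

def touched_branches(svn_diff_text, project="brlcad"):
    """Return the set of trunk/branch/tag roots named in `Index:` lines of an svn diff."""
    pattern = re.compile(
        r"^Index: " + re.escape(project) + r"/(trunk|branches/[^/]+|tags/[^/]+)/"
    )
    roots = set()
    for line in svn_diff_text.splitlines():
        m = pattern.match(line)
        if m:
            roots.add(m.group(1))
    return roots
```

A revision whose result has more than one element is a candidate multi-branch commit.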

view this post on Zulip Sean (Feb 19 2021 at 20:03):

Looks like the git commit log and diff are correct, just incorrectly associated with rel-5-3

view this post on Zulip Sean (Feb 19 2021 at 20:04):

How is the branch/tag figured out? I sort of assumed it was coming from the processing. If they're suspect, that might explain some of the missing trunk tags.

view this post on Zulip starseeker (Feb 19 2021 at 20:29):

The SVN era branch assignments should come directly from repository information. The CVS era branch assignments were done using the script in misc/repoconv/cvs_info.sh

view this post on Zulip starseeker (Feb 19 2021 at 20:30):

Fundamentally, it uses git rev-list --first-parent to follow commit chains back up the branches.
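
For anyone unfamiliar with what a first-parent walk does, here is a toy model on a hand-written parent map. It is an illustration only (not what cvs_info.sh does internally), and the commit names are invented.

```python
def first_parent_chain(parents, tip):
    """Walk from tip back to a root, always taking the first listed parent."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        listed = parents.get(commit, [])
        commit = listed[0] if listed else None
    return chain

# Hypothetical history: A-B-C on trunk, X branched from B, M merges X into trunk.
parents = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"], "M": ["C", "X"]}
trunk_chain = first_parent_chain(parents, "M")   # M, C, B, A
branch_chain = first_parent_chain(parents, "X")  # X, B, A
```

Note that the chain from the branch tip also runs through B and A, which were made on trunk, so the walk by itself does not say on which branch a shared commit originated.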

view this post on Zulip starseeker (Feb 19 2021 at 20:32):

I was trying to identify the commit chains independent of the SVN history.

view this post on Zulip starseeker (Feb 19 2021 at 20:40):

So my question was whether cvs2svn was more likely to assign the correct branch of origin to a commit. If that's the case, then I'll have to reassign the CVS era branches somehow.

view this post on Zulip starseeker (Feb 19 2021 at 20:48):

I'm not conversant enough with CVS to know how to try and directly coax the information out of the original repo, so my reasoning was that since the cvs-fast-export conversion was the one we were using from the CVS era the branch assignments were the ones to use for that part of the history.

view this post on Zulip starseeker (Feb 19 2021 at 21:04):

FWIW, SVN commit r19990 "Release 5.3" was right in amongst the latter of the multibranch commits SVN reported as being on rel-5-1-branch. It seems a bit suspect that all the multibranch commits would be originating on rel-5-1-branch when they were about to release 5.3...

view this post on Zulip Sean (Feb 19 2021 at 22:16):

I can't say for sure, but I do recall that branches in cvs are recorded explicitly so there's no guessing. Any tool converting has perfect branch knowledge so I would expect cvs2svn (and cvs-to-git) to correctly reflect what was in cvs in svn.

view this post on Zulip Sean (Feb 19 2021 at 22:18):

Perhaps --first-parent isn't appropriate? What if something is a branch of a branch or similar? Git could be tracking through to a grandparent branch.

view this post on Zulip Sean (Feb 19 2021 at 22:19):

the branch names are in the ,v files, if you want to see if/when rel-5-1-branch vs 5-3 branch are associated with a particular commit. They're in a "symbolic names:" block near the top.

view this post on Zulip Sean (Feb 19 2021 at 22:22):

Some explanation here: https://www.astro.princeton.edu/~rhl/cvs-branches.html#branchnumbers
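
As a sketch of how those numbers map to names: CVS stores a branch tag as a "magic" number with a 0 in the second-to-last position (e.g. 1.1.0.2), and revisions on that branch drop the 0 (1.1.2.1, 1.1.2.2, ...). The symbol table below is hypothetical, chosen to line up with the 1.1.2.1 revision quoted earlier; the helper names are made up, and vendor branches such as 1.1.1 are not handled.

```python
def branch_prefix(symbol_number):
    """Turn a magic branch number (1.1.0.2) into its revision prefix (1.1.2); None for plain tags."""
    parts = symbol_number.split(".")
    if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == "0":
        return ".".join(parts[:-2] + parts[-1:])
    return None

def branch_of(revision, symbols):
    """Name the branch a revision sits on, given a {symbol name: number} map."""
    rev_parts = revision.split(".")
    if len(rev_parts) == 2:
        return "trunk"  # two-part revisions such as 1.4 live on the main line
    prefix = ".".join(rev_parts[:-1])
    for name, number in symbols.items():
        if branch_prefix(number) == prefix:
            return name
    return None

# Hypothetical symbol table for one ,v file.
symbols = {"windows-6-0-branch": "1.1.0.2", "some-release-tag": "1.1"}
```

With that table, `branch_of("1.1.2.1", symbols)` resolves to the branch name directly from the file, with no history walking involved.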

view this post on Zulip starseeker (Feb 19 2021 at 23:23):

I'm beginning to think git just literally doesn't track this properly at all, at ANY level. If I'm interpreting these numbers correctly per the Princeton site, it looks like SVN has it correct.

view this post on Zulip starseeker (Feb 19 2021 at 23:24):

That's really, really annoying.

view this post on Zulip starseeker (Feb 19 2021 at 23:26):

@Sean I don't suppose in that pile of scripts you've got one that will generate the set of branches for all SVN commits?

view this post on Zulip starseeker (Feb 20 2021 at 00:21):

Nevermind, got it.

view this post on Zulip starseeker (Feb 20 2021 at 02:31):

OK, there we go. Can now scrub out the existing cvs:branch labels and replace them with SVN data.

view this post on Zulip Sean (Feb 20 2021 at 06:13):

Yeah, that finished processing. Careful if you used the previous script, had a bug.

view this post on Zulip Sean (Feb 20 2021 at 06:14):

Here's all revisions, all branches and tags:
commits_to_multiple_branches2.txt

view this post on Zulip Sean (Feb 20 2021 at 06:24):

Er, rather that's all multiple branchpoint commits. This is all commits in the repo: all_branches2.log

view this post on Zulip Sean (Feb 20 2021 at 06:25):

note the multiple branches list did update, if that changes anything on the processing

view this post on Zulip starseeker (Feb 22 2021 at 15:02):

OK, I think I've got the branch assignments working using SVN data now. Here's the diff that shows the changes to the commit messages in brlcad_conv12 diff.txt

view this post on Zulip starseeker (Feb 22 2021 at 16:34):

The sha1s won't match, but I can upload that version of the repository if it is useful.

view this post on Zulip starseeker (Feb 22 2021 at 18:43):

Updated diff file: diff.txt

view this post on Zulip starseeker (Feb 23 2021 at 19:14):

@Sean Any luck with generating the mappings? As an alternative if you want you can post the brlcad_conv11 repo you were using (I haven't kept that iteration so I'd need a copy of what you're using) and your existing SHA1 sets - I think I've hammered out an update script now.

view this post on Zulip Sean (Feb 23 2021 at 19:48):

Yeah, it's processing now.

view this post on Zulip Sean (Feb 23 2021 at 19:48):

Should be done soon. Taking a while to recompute all the hashes. Looks like the first few hundred ended up unmodified (same sha) but once a commit message changed, everything after had to be re-associated with the new shas and that process takes a couple hours (and it's a couple hours in, so almost done).
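
The cascade follows from how git names commits: the id is a SHA-1 over the commit object, and that object contains the parent's id. Below is a simplified model (real commit objects also carry author and committer lines, omitted here; the messages are invented for illustration).

```python
import hashlib

def commit_id(tree, parent, message):
    """Hash a simplified commit object: sha1 over a 'commit <size>' header, a NUL byte, and the body."""
    body = f"tree {tree}\n"
    if parent:
        body += f"parent {parent}\n"
    body += f"\n{message}\n"
    data = body.encode()
    return hashlib.sha1(b"commit %d\0" % len(data) + data).hexdigest()

tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree id

# Original chain: one relabeled commit followed by an untouched child.
a_old = commit_id(tree, None, "first\n\nsvn:branch:tk8-3")
b_old = commit_id(tree, a_old, "second")

# Same chain after editing only the first commit's message.
a_new = commit_id(tree, None, "first\n\nsvn:tag:tk8-3")
b_new = commit_id(tree, a_new, "second")
```

`b_old` and `b_new` differ even though the second commit's own message and tree are identical, which is why every commit after the first edited one needs a new sha mapping while the commits before it keep their ids.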

view this post on Zulip Sean (Feb 23 2021 at 19:49):

One curiosity that you can maybe help explain / educate me on ... do you know why a commit like 944 would be in git log --all but not in git log --follow . ?

view this post on Zulip Sean (Feb 23 2021 at 19:50):

maybe a bad example -- I didn't check if it was a commit to a different repo or something, just the first I noticed

view this post on Zulip starseeker (Feb 23 2021 at 20:26):

Hmm. If I save the log output of git log --follow . to a file and then search for c037a5e3a6eb97d2f9455225bbafeffec5b79be4 (which I think is the commit corresponding to 944 in brlcad_conv12) it is there.

view this post on Zulip starseeker (Feb 23 2021 at 20:27):

I do know in general that git log --all will incorporate the history from all branches, not just the currently checked out branch.

view this post on Zulip starseeker (Feb 23 2021 at 20:28):

https://stackoverflow.com/a/7203551 is sometimes useful in the context of tracking back specific files.

view this post on Zulip starseeker (Feb 23 2021 at 20:28):

This also seems to work: git log --follow --full-history -- src/fb/fb-orle.c

view this post on Zulip starseeker (Feb 23 2021 at 20:33):

Ah, whoops, sorry - that's not the right commit/hash. One sec...

view this post on Zulip starseeker (Feb 23 2021 at 20:36):

Checking SVN, that's a property change - so the bug is the SVN revision getting assigned at all.

view this post on Zulip starseeker (Feb 23 2021 at 20:37):

I.e. it shouldn't be in either git log --all or git log --follow .

view this post on Zulip starseeker (Feb 23 2021 at 20:53):

OK. Here are my thoughts so far: 944 looks like a timestamp match with 52036a8b4569b8ffe90e2e8fb0b43f5ed36ba040. It's got one of the generic log messages, so my revision assignment code went ahead and assigned it that revision.

Based on the diff report from SVN, that's an incorrect assignment and needs to be changed/cleared. Hopefully the diff based checking will catch that.

view this post on Zulip starseeker (Feb 23 2021 at 20:55):

That probably explains why it doesn't show in git log --follow . - that search is based on all the files in the currently checked out branch, working backwards. Since the incorrectly identified "944" has no files associated with it, there's no way for git to associate it with the history walking backwards from the tree as a starting point.

view this post on Zulip starseeker (Feb 23 2021 at 20:57):

Or, another possibility - even if it can associate it following the commit chains, an empty commit won't match the "." specifier.

view this post on Zulip starseeker (Feb 23 2021 at 21:03):

OK - I see "Added fb_close", which is the parent of 52036a8b4569b8ffe90e2e8fb0b43f5ed36ba040, does make it into the git log --follow . output. That suggests it's following the chain through that commit, but not matching "." and skipping reporting it.

view this post on Zulip starseeker (Feb 23 2021 at 21:04):

(Sorry, that's probably a little more stream of consciousness than you were looking for...)

view this post on Zulip starseeker (Feb 23 2021 at 21:07):

In some ways it's tempting to try to scrub empty commits like that with generic commit messages out, but at this juncture I'd be worried about inadvertently breaking something else...

view this post on Zulip starseeker (Feb 23 2021 at 21:34):

Hmm. Actually, repowork already has the info to detect empty commits, in principle, and even categorize them...

view this post on Zulip starseeker (Feb 23 2021 at 21:35):

Some I know we need (branch creation/deletion), some are marginal (commits removing empty directories, which are no-ops in git) and some of them are useless (empty generic message, empty contents).

view this post on Zulip starseeker (Feb 23 2021 at 21:45):

Here's the CVS era empties: empty.log

view this post on Zulip starseeker (Feb 23 2021 at 21:49):

@Sean What do you think - should I scrub out the empty commits with "* empty log message*" and maybe some of the other obvious ones?

view this post on Zulip starseeker (Feb 23 2021 at 21:51):

"BRL CAD Distribution Release 1.10" has a couple non-empties in addition to the 4 empties, for example...

view this post on Zulip starseeker (Feb 23 2021 at 23:05):

Yeah, it's a variation on the splicing problem. Have the ability to remove specified commits now.

view this post on Zulip scorp08 (Feb 24 2021 at 20:18):

starseeker said:

Hmm. If I save the log output of git log --follow . to a file and then search for c037a5e3a6eb97d2f9455225bbafeffec5b79be4 (which I think is the commit corresponding to 944 in brlcad_conv12) it is there.

by the way what is the difference between conv11 and 12 ?? :)

view this post on Zulip starseeker (Feb 25 2021 at 01:01):

I don't recall at this point - they were iterative refinements to the process of correcting the output from the main svnfexport conversion (merging git notes into comments, correcting emails, etc.)

view this post on Zulip starseeker (Feb 25 2021 at 01:02):

conv12 is the "target" for a third series of refinements at this point, mainly because I need stable SHA1s to target for processing. (In principle I've prepared a script to translate between old and new repositories if necessary, but I'd rather not have to use it... this is already complicated enough.)

view this post on Zulip Sean (Feb 25 2021 at 17:31):

I see you're getting ahead of my own validation pace... Sorry it's taking so long, I'm just chasing down issues in the multiassignment, a couple bugs in the scripting, wanted the list I give to be more certain than a blanket wash as I'm seeing lots of little discrepancies and ways to mis-associate.

view this post on Zulip Sean (Feb 25 2021 at 17:34):

@starseeker can you validate this list against the assignments you made: svn.to.git.complete_matches.log

view this post on Zulip starseeker (Feb 25 2021 at 19:47):

I don't see any collisions - you've got about a dozen that I haven't got yet, but I suspect that's probably because I forgot to use the version of the SVN repository that had the RCS tags scrubbed down.

view this post on Zulip starseeker (Feb 25 2021 at 19:48):

I'll have to look more closely at the ones that popped up on mine as matching that you don't have... may be an issue I haven't found yet.

view this post on Zulip starseeker (Feb 25 2021 at 19:56):

420b6c86aebaab8d233b9124aac2dfcaab390158;2253
3626fd67e335d89391ce624b8a3246bd99adffec;2470
231fd989a63e842f6ed485d8ac49caec4eee3660;2471
bbd7e8166d10d1f8c1c3355f87814fd9c4e652df;2489
1a79a71444aa3900b25c61c321c270d3f83d7065;2657
68245f26449e72b3fa8362bfdaa8ec4b458566bc;2841
0857eeb72eb573cad76f86b770e765349a85a671;2875
b026f5c0fe0aa8e2d9ca34051a20fb9afb92162a;2884
3732cf651af0b526eb3ec6bdf5893892f22afef4;2886
84c054fee5394109f52dfaf15add46d671ede196;2890
de144fde847a9fe45cac391c97bd7abaeacc3b0b;2900
d0f9348a8847c22a1f5cb4846f9c7414c7c1081b;3578

Just for reference, those are the ones in your set I've not spotted yet.

view this post on Zulip starseeker (Feb 25 2021 at 20:06):

I can pick up 68245f26449e72b3fa8362bfdaa8ec4b458566bc;2841 if I sort the diff contents ahead of doing the md5sum.
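
One way such an order-insensitive fingerprint could look, as a sketch: hash only the added/removed lines, sorted, so that file ordering and header differences between `svn diff` and `git diff` drop out. Exactly what was sorted in the real processing isn't stated here, so treat this as an assumption; it also drops any changed line whose content makes it start with `---` or `+++`.

```python
import hashlib

def diff_fingerprint(diff_text):
    """md5 over the sorted added/removed lines, ignoring headers and the order of files."""
    changed = [
        line for line in diff_text.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
    return hashlib.md5("\n".join(sorted(changed)).encode()).hexdigest()
```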

view this post on Zulip Sean (Feb 25 2021 at 20:11):

I have others that partially match, I just haven't validated them so didn't share them yet.

view this post on Zulip starseeker (Feb 25 2021 at 20:14):

Ah, it looks like the rest categorized in my processing as having non-unique content matching.

view this post on Zulip starseeker (Feb 25 2021 at 20:15):

/me inspects...

view this post on Zulip starseeker (Feb 25 2021 at 20:18):

OK. Diff content wasn't unique for r2253 - matches with r22190 - so it takes the path and/or date to resolve. 420b6c86aebaa is correct

view this post on Zulip starseeker (Feb 25 2021 at 20:31):

@Sean Other matches all check out - your list looks good.

view this post on Zulip starseeker (Feb 26 2021 at 01:07):

@Sean Sorry, just read back up through chat history - not trying to replace your work (defeats the point of independent V&V) - goal was/is to get representative inputs to make sure my repo updating logic can handle something similar to what the final pass will look like. Just stashed the various bits and pieces (and notes) in case they prove to be useful.

view this post on Zulip Sean (Feb 26 2021 at 18:07):

@starseeker e6417be98f27d570d863744f566f5aaf738abbe6 .. I'm seeing listed as branch commit, but it was a trunk commit 19763

view this post on Zulip starseeker (Feb 26 2021 at 18:09):

/me nods. I'll add it to branch_corrections.txt

view this post on Zulip Sean (Feb 26 2021 at 18:09):

Here's a second update with about 70 more matches svn.to.git.complete_matches2.log

view this post on Zulip Sean (Feb 26 2021 at 18:10):

had a bug that had to get sorted out in verifying the other 70

view this post on Zulip starseeker (Feb 26 2021 at 18:10):

One sec while I merge/verify.

view this post on Zulip Sean (Feb 26 2021 at 18:18):

Here are more that appear to be mistagged as branch/tag commits:

19033 LOG+FILE MATCH ON c365a032935f99d5cbcc5e0b7316253e918183f5
19211 LOG+FILE MATCH ON 2ec20a87d6e216cc3af62da933a2917e96459ce2
19282 LOG+FILE MATCH ON 4d5fe4e8afa57a275c04f0a11cbf20c1378ce600
19283 LOG+FILE MATCH ON 4af5f01acc93a65ba8e158c1e407e6fa30f0a867
19288 LOG+FILE MATCH ON 9f4472b6c4a9d77005a25bac0e6ea9d0b45c6829
19289 LOG+FILE MATCH ON 3312597ec11da607ad8cdecb8e86ecd6cd43a21c
19440 LOG+FILE MATCH ON ea6d4c16bae6ecf30d4439d92c8dd72f56b3e942
19449 LOG+FILE MATCH ON af33297408e4ec0b38fa37d211104ae8e3f4b850
19558 LOG+FILE MATCH ON 3a6fdd142e59c7fee7dfb06fdaecc3b30f28d633
19587 LOG+FILE MATCH ON a53d24a82016e59e54ad3fa0750238b077313a33
19720 LOG+FILE MATCH ON f1c200f10e9d5c0f896508b2967f644abafad234
19723 LOG+FILE MATCH ON 45a67834524348e32e2c1d34071b59dbb1360d9e
19763 FILE+DIFF MATCH ON e6417be98f27d570d863744f566f5aaf738abbe6
19772 LOG+FILE MATCH ON 6f4104bd83cf4a930bda9cbaa1b811d3e0d236b3
19783 LOG+FILE MATCH ON 6af6602bcdb5227c51a6b467226d5fc70d321855
19797 LOG+FILE MATCH ON 4b51763bd75123f81f069bba1b873c4538776530
19798 LOG+FILE MATCH ON d82708b47d89c008a20ce23ba23ce4aca80cf232
19839 LOG+FILE MATCH ON 8dcb60d4529dc5e0cf99729338e05869cf270c06

view this post on Zulip starseeker (Feb 26 2021 at 18:20):

Confirmed - no collisions.

view this post on Zulip Sean (Feb 26 2021 at 18:20):

This one is an outlier I'm not sure about, 11077485329842c81213eab68006fe5d58b5925f ...

view this post on Zulip Sean (Feb 26 2021 at 18:21):

it says it was 21565 but that was a trunk cvs2svn conversion commit. Commit message on 11077.. is that of 21564

view this post on Zulip Sean (Feb 26 2021 at 18:22):

21564 is not found tagged in git

view this post on Zulip starseeker (Feb 26 2021 at 18:22):

/me nods. Probably means it should be 21564

view this post on Zulip Sean (Feb 26 2021 at 18:25):

I need to investigate why 21564 isn't in my list of missing commits... should have caught that but didn't

view this post on Zulip starseeker (Feb 26 2021 at 18:26):

@Sean am I correct that all the commits you listed are on trunk?

view this post on Zulip starseeker (Feb 26 2021 at 18:30):

I think I've got 19033 set up as follows: aec4367dafd37a7b0657c4b27414caa21ac4c1be is the trunk portion of that commit, and c365a032935f99d5cbcc5e0b7316253e918183f5 is the rel-5-1-branch portion

view this post on Zulip Sean (Feb 26 2021 at 18:42):

starseeker said:

Sean am I correct that all the commits you listed are on trunk?

I'll have to confirm that myself, as I've been toggling between processing all commits and only those on trunk.

view this post on Zulip Sean (Feb 26 2021 at 18:47):

aha! yes, that explains it. that's why 21564 wasn't in my list. thought I was going crazy. that was a branch commit.

view this post on Zulip Sean (Feb 26 2021 at 18:49):

so in svn, 21564 was committed to branch, then 21565 committed to trunk to compensate?? I'm not sure what cvs2svn did there.
regardless, in git .. 21564's diff turned into 11077485.. and perhaps properly tagged as branch, despite being tagged as trunk commit 21565. do I have that right?

view this post on Zulip Sean (Feb 26 2021 at 18:50):

21565 appears to be empty in svn

view this post on Zulip starseeker (Feb 26 2021 at 18:53):

In git, if I'm interpreting gitk's display properly, 11077485329842c81213eab68006fe5d58b5925f is a branch commit. If 21564 was the branch commit in SVN, that's probably what it should be in Git. Not 100% sure why it got the 21565 assignment instead.

view this post on Zulip starseeker (Feb 26 2021 at 18:54):

Best guess is something funky happened because the timestamps of those two commits are identical in SVN, as far as I can tell.

view this post on Zulip Sean (Feb 26 2021 at 18:56):

Okay, yeah, that's what I thought I was seeing as well. Don't see how it got 21565 either. Is there a way to check, see if that happened anywhere else? Not too worried but if it's scannable, we can do a quick check.

view this post on Zulip starseeker (Feb 26 2021 at 18:58):

Only thing I can think of would be to look for identical timestamp commits in SVN and double check the Git assignments, but not sure how script-able that is (especially since we're accumulating a fair set of revision number assignments/updates.)
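
If it does turn out to be worth scripting, the SVN half could be as small as the sketch below, which groups revisions by the timestamp printed in `svn log` header lines. It assumes the standard header layout (`rNNN | author | YYYY-MM-DD HH:MM:SS +ZZZZ (...) | N lines`); plain `svn log` prints timestamps to the second, so exact string equality is enough and no date parsing is needed. The function name is hypothetical.

```python
import re
from collections import defaultdict

HEADER = re.compile(
    r"^r(\d+) \| [^|]* \| (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)

def same_timestamp_groups(svn_log_text):
    """Map each timestamp shared by two or more revisions to the sorted revision list."""
    by_time = defaultdict(list)
    for line in svn_log_text.splitlines():
        m = HEADER.match(line)
        if m:
            by_time[m.group(2)].append(int(m.group(1)))
    return {stamp: sorted(revs) for stamp, revs in by_time.items() if len(revs) > 1}
```

The resulting groups would still need their Git revision assignments checked by hand or against the existing mapping files.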

view this post on Zulip starseeker (Feb 26 2021 at 18:59):

f5a1b0037fec2927cba073d118db24cdbd681975
a098425430db227021617976961e6b51ce5569cb
e6417be98f27d570d863744f566f5aaf738abbe6

Those might be worth checking - I think they also had incorrect revision numbers

view this post on Zulip starseeker (Feb 26 2021 at 19:03):

It might get to the point where I should run the updates we've accumulated and establish a new baseline for additional comparisons, so we can focus without re-discovering what we've already fixed, but I know that would require regenerating the sha1/md5 mappings again. Let me know if you think things reach the point where that would be worthwhile.

view this post on Zulip Sean (Feb 26 2021 at 19:42):

Yeah, I'm ignoring timestamps because it'd be a fair bit of work to parse the date string into something that could be fuzzy compared in script land

view this post on Zulip Sean (Feb 26 2021 at 23:01):

You may already have, but here's a couple outliers that are partial matches, appear to be probably split commits?:

2125 LOG+DIFF MATCH ON 0c1f4a88c5c960bd7de51ef8a05e7f53f00fb1a2 (NOT TAGGED)
3102 LOG+DIFF MATCH ON 402419dac49d3abe9bd6036f76696b43a70a66f5 (NOT TAGGED)

view this post on Zulip Sean (Feb 27 2021 at 20:21):

awesome! got it doing the comparisons in parallel now... that should speed things up a bit!

view this post on Zulip starseeker (Feb 28 2021 at 00:38):

I've put up a demo repo at https://github.com/starseeker/brlcad_conv15 showing all the accumulated changes thus far.

view this post on Zulip starseeker (Feb 28 2021 at 01:42):

Simple way to compare the brlcad_conv12 and the brlcad_conv15 logs to see changes seems to be:

git log --all |grep -v ^commit |grep -v ^Merge > all_nc.log

view this post on Zulip starseeker (Feb 28 2021 at 02:12):

That filters out the sha1s so the message and other changes can be seen easily in a diff.

view this post on Zulip starseeker (Mar 01 2021 at 14:45):

@Sean It's looking like SVN and git use subtly different diffing algorithms, so the diff file changes don't always map up.

view this post on Zulip starseeker (Mar 01 2021 at 15:15):

I think I've pretty well reached my limits: https://github.com/starseeker/brlcad_conv16

view this post on Zulip starseeker (Mar 01 2021 at 15:27):

@Sean what comparisons do you have it doing?

view this post on Zulip Sean (Mar 01 2021 at 16:26):

I compared every rev.

view this post on Zulip starseeker (Mar 01 2021 at 16:27):

/me winces. Yeah, that's a slow process.

view this post on Zulip Sean (Mar 01 2021 at 16:27):

Cool thing is that now takes about 3-4 hours total to test every rev.

view this post on Zulip Sean (Mar 01 2021 at 16:27):

Crunches in parallel.

view this post on Zulip Sean (Mar 01 2021 at 16:27):

Finished over the weekend pretty quickly actually, but I was too exhausted to verify+upload it.. sorry.

view this post on Zulip Sean (Mar 01 2021 at 16:27):

Did a workout on Sat that wiped me out.

view this post on Zulip Sean (Mar 01 2021 at 16:28):

I have a laundry list now.. will post it in the categoric sets here in a few min.

view this post on Zulip starseeker (Mar 01 2021 at 16:28):

Np, happens. I ended up manually hunting up a bunch of Git commits in SVN - hopefully that'll be helpful.

view this post on Zulip Sean (Mar 01 2021 at 16:29):

Yeah, you may have already found/fixed a lot or all of them.

view this post on Zulip Sean (Mar 01 2021 at 16:30):

I've not done anything with 15 or 16. I can kick off a final pass on 17 assuming there are a few updates, but still working on 12 to keep shas in sync.

view this post on Zulip starseeker (Mar 01 2021 at 16:30):

/me nods - sounds good.

view this post on Zulip starseeker (Mar 01 2021 at 16:31):

Hopefully there won't be too much more to do...

view this post on Zulip starseeker (Mar 01 2021 at 16:32):

FWIW, I'm not convinced all the CVS era commits will be diff free, even if the revisions line up.

view this post on Zulip Sean (Mar 01 2021 at 16:53):

Yeah, I think we already found a few differences where commits were split differently. They seem to be very few overall.

view this post on Zulip starseeker (Mar 01 2021 at 17:44):

I think cvs2svn and cvs-fast-export might have picked different contents for their "synthetic commit to represent incomplete tag" commits... I suppose a case can be made either way for assigning the corresponding SVN revs if that's what happened. I went ahead and did so, but I could go either way.

view this post on Zulip starseeker (Mar 01 2021 at 18:05):

r4778 actually is a nice compact illustration of different diff picks - at least with the svn and git versions I have, git produces:

diff --git a/librt/db_io.c b/librt/db_io.c
index 3645cea1dc..7faa9be6ba 100644
--- a/librt/db_io.c
+++ b/librt/db_io.c
@@ -32,8 +32,8 @@ static char RCSid[] = "@(#)$Header$ (BRL)";

 #include "machine.h"
 #include "vmath.h"
-#include "raytrace.h"
 #include "db.h"
+#include "raytrace.h"

 #include "./debug.h"

and SVN produces:

Index: brlcad/trunk/librt/db_io.c
===================================================================
--- brlcad/trunk/librt/db_io.c  (revision 4777)
+++ brlcad/trunk/librt/db_io.c  (revision 4778)
@@ -32,8 +32,8 @@

 #include "machine.h"
 #include "vmath.h"
+#include "db.h"
 #include "raytrace.h"
-#include "db.h"

 #include "./debug.h"

Git moves raytrace.h down, and SVN moves db.h up, both to the same effect.
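
A minimal sketch of why those two hunks are equivalent, using only the hunk bodies quoted above: rebuild the "before" text (context plus removed lines) and the "after" text (context plus added lines) from each hunk and compare them. The names `hunk_sides` and `same_change` are illustrative and not part of the conversion tooling.

```python
def hunk_sides(hunk_text):
    """Return (old_lines, new_lines) reconstructed from a unified-diff hunk body."""
    old, new = [], []
    for line in hunk_text.splitlines():
        # An empty line is treated as blank context.
        tag, text = (line[:1] or " "), line[1:]
        if tag in (" ", "-"):
            old.append(text)
        if tag in (" ", "+"):
            new.append(text)
    return old, new


GIT_HUNK = """\
 #include "machine.h"
 #include "vmath.h"
-#include "raytrace.h"
 #include "db.h"
+#include "raytrace.h"
"""

SVN_HUNK = """\
 #include "machine.h"
 #include "vmath.h"
+#include "db.h"
 #include "raytrace.h"
-#include "db.h"
"""


def same_change(hunk_a, hunk_b):
    """True when both hunks turn the same old text into the same new text."""
    return hunk_sides(hunk_a) == hunk_sides(hunk_b)
```

Comparing the reconstructed sides instead of the raw hunk text is one way to keep a diff-based check from flagging cases like r4778.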

view this post on Zulip starseeker (Mar 01 2021 at 18:06):

Shouldn't impact a full-up revision check of course, but does illustrate the limits of diff comparisons nicely.
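
For contrast, a full-revision check compares file contents and never looks at diff text. A rough sketch, assuming two already-exported working trees on disk (the directory arguments and the function names `tree_digest` and `trees_match` are hypothetical):

```python
import hashlib
import os


def tree_digest(root, skip=(".git", ".svn")):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata directories in place.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digests[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def trees_match(svn_checkout, git_checkout):
    """True when both trees hold the same files with identical contents."""
    return tree_digest(svn_checkout) == tree_digest(git_checkout)
```

Because only the resulting contents are hashed, it does not matter which lines either tool chose to report as added or removed.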

view this post on Zulip Sean (Mar 01 2021 at 18:09):

I should have one of the lists cleaned up here soon now. Trying to make sure I don't feed you bad data... so much scripting...

view this post on Zulip Sean (Mar 01 2021 at 18:09):

The good news is I'd say the vast majority match and map well.

view this post on Zulip starseeker (Mar 01 2021 at 18:11):

/me can imagine - once this is done I'm going to have to scrub my home dir to clean out a truly amazing pile of intermediate scripting files, checkouts, test dirs, etc.

view this post on Zulip Sean (Mar 01 2021 at 18:11):

Yeah, I noticed some of the different diffs like that. Pretty interesting. I found a couple more complex cases where an entire function appeared to be added/removed when in reality all that happened was the end parenthesis on one function was moved and the signature on the next function had an edit. Somehow git's diff engine decided it would represent that as some mangled movement.

view this post on Zulip starseeker (Mar 01 2021 at 21:28):

@Sean I'm seeing a big swath of differences between r702 and r3735 - given the timing I'd guess that's tied up with that timestamp business in the SVN conversion?

view this post on Zulip starseeker (Mar 01 2021 at 21:30):

Datestamp wise it lines up, as near as I can tell.

view this post on Zulip Sean (Mar 01 2021 at 22:08):

yeah, I noticed them a while back. found many/most of them (or ruled them out as splits/inconsequential).

view this post on Zulip starseeker (Mar 02 2021 at 03:00):

@Sean if we hit a situation where a commit message matches to one revision but the change matches a different revision, which mapping do you prefer to use?

view this post on Zulip Sean (Mar 02 2021 at 06:17):

o.O I'd wonder how that happened...

view this post on Zulip Sean (Mar 02 2021 at 06:18):

regardless, I think it's more important the rev match the diff since we're notionally using these numbers to trace back changes in a file

view this post on Zulip Sean (Mar 02 2021 at 06:21):

unrelated, here's a neat little find in the commits. there appear to be exactly 7 commits that were perfectly duplicated on branches and trunk:

10 19514 LOG+FILE+DIFF PERFECT MATCH ON c9cc663089d441f8a7d40f63757b0080dec5af10 f5419dcbab0e9edc78c90af24b5318b04686a7b2 (TAGGED MISMATCH f5419dcbab0e9edc78c90af24b5318b04686a7b2)
10 19595 LOG+FILE+DIFF PERFECT MATCH ON 0c2cb0cf51b8f543cd740e758ea3ebe2be964336 afbcb106f05606065ae3ce11b602fa566efb0031 (TAGGED MISMATCH afbcb106f05606065ae3ce11b602fa566efb0031)
10 19605 LOG+FILE+DIFF PERFECT MATCH ON 9bacc2b9ac94977113d3d68617ac4c896a37da60 c614ed067a631ba7d56fee51d1fc289359efb64b (TAGGED MISMATCH 9bacc2b9ac94977113d3d68617ac4c896a37da60)
10 19697 LOG+FILE+DIFF PERFECT MATCH ON e49447b2d924385b7272c6ba8d78e490590f1778 f363b6cbec7bdd415f20e77a9d3734ecfa6cbf98 (TAGGED MISMATCH f363b6cbec7bdd415f20e77a9d3734ecfa6cbf98)
10 19892 LOG+FILE+DIFF PERFECT MATCH ON 200ca9ba685b57dbc4bd0dcd9600649a7bec8117 f5787013aff6a38adc807bcc5a8db617510818a3 (TAGGED MISMATCH 200ca9ba685b57dbc4bd0dcd9600649a7bec8117)
10 19992 LOG+FILE+DIFF PERFECT MATCH ON 1b8fd04c74f8b99551e35ec87d4980bb27735a62 ae67110218bc3d71c5f3301707b5d86a60564cf7 (TAGGED MISMATCH 1b8fd04c74f8b99551e35ec87d4980bb27735a62)
10 64506 LOG+FILE+DIFF PERFECT MATCH ON eb5c98bf8799083d4d946f1f63f9e1edd8e61631 2ca450a34b29f37d58b4ed8288c3f41a4b155a78 (TAGGED MISMATCH eb5c98bf8799083d4d946f1f63f9e1edd8e61631)

Ignore the mismatch, I manually verified and they're all correct in git. It was just interesting because there appear to be so few of those. I kind of expected more, but they were apparently pretty rare to be exactly the same message, the same files, same diff.
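
The labels in that listing can be read as "which of log message, file list, and diff agree". A small sketch of that idea, assuming each side has been reduced to those three fields. The `Change` record, `match_label`, and the `NONE` fallback are illustrative, not the actual scripts.

```python
from collections import namedtuple

# One SVN revision or one Git commit, reduced to the three compared fields.
Change = namedtuple("Change", "log files diff")


def match_label(svn, git):
    """Return a label such as 'LOG+FILE+DIFF' naming the fields that agree."""
    parts = []
    if svn.log.strip() == git.log.strip():
        parts.append("LOG")
    if set(svn.files) == set(git.files):
        parts.append("FILE")
    if svn.diff == git.diff:
        parts.append("DIFF")
    return "+".join(parts) or "NONE"
```

A commit duplicated verbatim on a branch and on trunk would score `LOG+FILE+DIFF` against both copies, which is the situation in the seven revisions listed above.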

view this post on Zulip starseeker (Mar 02 2021 at 20:54):

How's it going?

view this post on Zulip starseeker (Mar 03 2021 at 04:05):

So, here's a question - 7496c761e580e1935607fc336ff85bf06c524caf was initially unassigned. It got assigned r10209 based on commit message and history position, but diffing it with the SVN checkout indicates some of the changes for r10209 in SVN got grouped into the git commit labeled r10210 instead.

So we can assign r10209 to 7496c761e5 and be "approximately" correct - presumably the best match available in the git history to that SVN commit, but with a checkout that won't match - or skip assigning r10209 to any commit (losing some mapping info, but skipping a mapping that can't produce a matching output.)

What's the preferred answer in such situations?

view this post on Zulip Sean (Mar 03 2021 at 04:38):

So I've been down a rabbit hole trying to sort out how git handles encoding, but it's looking like it's not just that -- I think there's a couple categoric issues potentially. check these out:

b17a2836c85b43422c15faf7b111088bc4e445e3
a9daa166161d57ee6ed486cc9488880ffc5da843
ed4c28dcc1f17520d6596192e2ccae808d44ba4f
bc320ea12852890495809d142600a97eb241bd6f
d1e7455ffff304d2b8f25aba0cf144c6dc0fb4b4
9594f3ce737b98e902379066be02337eabc8db53
18ea6afa636886ee2ba5fb7d7807a920db3ee35e
8a97709dae7e86479bc04ab8d52dcaa65c2b4beb
9ae7c9024838f140c1cb20d0ddaf0606e2e486ef
c13ba71962660bcd2bb471671a08d61c94827e30
9ef20d544982d92f0b1d9183477c42543c4d45c4
4d6d7aad28eed5f23e31aa3f3fc37576de05b6dc
03eab0819b8a74d2a046273443ff14122f2d7e98
92cc90f7397cf45802a70f70260cfa2f57b1fc3b
106637f9c2913d3cc43d8a02a0f955c9709f67d4
6c20c610b10b3c098ad8c8bd53fc111791bca7e6
9f1e2c92eb250b39ac64b981c6246236f0cdb2c5
4c103440e2947d6990386e2767b9778266dd1517
b7a0eb56822e52c1a18ca30f312abea93ead6867
c90bfc8e507ea27863d82ea9ff514d2c79253b98
cb8ebedb7da7eb7981d0038fc826b61f4315e699
b0f3314a23e067051d520b11da483d068b73ebe6

view this post on Zulip Sean (Mar 03 2021 at 04:39):

there's clearly some utf-8 going on there that wasn't preserved, but then there's also some utf-8 getting added where it previously did not exist. I didn't scan all commits for the condition -- these are the ones that came up as matching DIFF+FILES but not matching the log message.

view this post on Zulip Sean (Mar 03 2021 at 04:40):

looks like about half of them have messed up log messages where there was an apostrophe or a double quote. I checked svn and they were indeed just simple single/double quotes, so I'm thinking it's something in the scripting

view this post on Zulip Sean (Mar 03 2021 at 04:43):

As to your question (sorry, had to offload before I lost the context) ... I have that commit matching 10209 and 10210 as well because of the log message match. They match these git commits:

5222348e9f8c57c3a7623700413d0f37a1d74122
7496c761e580e1935607fc336ff85bf06c524caf
46472340020700642675b9613c7ddce85c391bea
becf17cb8e73ddbef7a0e840090712714ef4cff0
5846eaff72182de5f744baf4ef8c757b1e44b615

view this post on Zulip Sean (Mar 03 2021 at 04:44):

So you could tag them all or just the first, shrug, all valid enough choices I think

view this post on Zulip Sean (Mar 03 2021 at 04:44):

presumably some are 10209 and some are 10210

view this post on Zulip Sean (Mar 03 2021 at 04:46):

I have a list of others like that, 1156 only match log message

view this post on Zulip Sean (Mar 03 2021 at 04:46):

when I gave them a prelim scan, it looked like most are commits split up differently than they were in svn

view this post on Zulip starseeker (Mar 03 2021 at 05:31):

So looking at the first one on that list (b17a2836c85b43422c15faf7b111088bc4e445e3) I'm seeing the following:

CVS

add Roßberg to list of contributors

SVN

add Roßberg to list of contributors

Git:

add Roßberg to list of contributors

You're saying your scripts indicate the SVN and Git messages don't match? All three lines appear to have the same utf8 character, at least here...

view this post on Zulip starseeker (Mar 03 2021 at 05:32):

I'll take a look at the rest of the list tomorrow...

view this post on Zulip Sean (Mar 03 2021 at 05:35):

yeah, I don't get the utf chars here when I query git. I could have done something that caused them, but if I just run git show, I get encoded mess

view this post on Zulip Sean (Mar 03 2021 at 05:36):

another set of oddities to check on, search git for svn 30687

view this post on Zulip Sean (Mar 03 2021 at 05:36):

appears to be tagged across a variety of branches (which maybe happened, I hadn't checked that yet)

view this post on Zulip Sean (Mar 03 2021 at 05:40):

okay, so looks like that's part of the story:

svn diff -c30687 file:///Users/morrison/brlcad.github/svn.sfmirror/code | grep ^Index | cut -f3 -d/ | sort | uniq
VendorARL
libpng
scriptics
zlib

yet the git side of things is:

for i in `echo "004ec0ae439f0ca3c814d22a46957012cd8fb239
720f9b9b75588e35d3cce0f9f5b802abea2259ab
f206b315ca475d3a3e55e98ec42d772c6b05baee
cbff64617866cc3fc2b25db15cd610e651561958
87bc784daf7f15cc8d9c9fa980a934a98a17de95
d748c2ea214b699008563e18f5a7105de39faba9
004ec0ae439f0ca3c814d22a46957012cd8fb239"` ; do git show $i | grep svn:branch ; done | sort | uniq
    svn:branch:Original
    svn:branch:itcl3-2
    svn:branch:libpng_1_0_2
    svn:branch:tcl8-3
    svn:branch:tk8-3
    svn:branch:zlib_1_0_4
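
The same comparison can be sketched in Python on captured command output. The first function roughly mirrors `grep ^Index | cut -f3 -d/ | sort | uniq`, and the second collects the `svn:branch:` lines from `git show` output. The function names and any sample paths fed to them are assumptions for illustration.

```python
def svn_third_fields(diff_text):
    """Third '/'-separated field of each 'Index' line, sorted and de-duplicated."""
    names = set()
    for line in diff_text.splitlines():
        if line.startswith("Index"):
            fields = line.split("/")
            if len(fields) >= 3:
                names.add(fields[2])
    return sorted(names)


def git_branch_notes(show_text):
    """Branch names recorded on 'svn:branch:NAME' lines of `git show` output."""
    marker = "svn:branch:"
    return sorted({
        line.split(marker, 1)[1].strip()
        for line in show_text.splitlines()
        if marker in line
    })


def branch_disagreement(diff_text, show_text):
    """Return (names only on the SVN side, names only on the Git side)."""
    svn, git = set(svn_third_fields(diff_text)), set(git_branch_notes(show_text))
    return sorted(svn - git), sorted(git - svn)
```

For r30687 the two lists quoted above share no names at all, so every name would land in one of the two "only on" lists.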

view this post on Zulip starseeker (Mar 03 2021 at 13:01):

@Sean what version of Git are you using?

view this post on Zulip starseeker (Mar 03 2021 at 13:05):

Is https://stackoverflow.com/a/19436421 related?

view this post on Zulip starseeker (Mar 03 2021 at 13:05):

Also, what do you see in gitk as opposed to on the console?

view this post on Zulip starseeker (Mar 03 2021 at 14:24):

OK, so it looks like the git commits are spurious - I may have messed up a correction or some such. Of the 4 branches from r30687, only VendorARL is present and it looks like that's because I custom-added it.

view this post on Zulip starseeker (Mar 03 2021 at 15:06):

r15365 created the libpng branch in SVN, if I'm not mistaken. In Git, that revision got assigned to f8fa716f5077cdde438f676c1b24244a09eb3fcd

view this post on Zulip starseeker (Mar 03 2021 at 15:17):

r15338 created the zlib branch in SVN. In Git, that looks like e85b06be0fa6632e097d8c728506ab5251a2b635

view this post on Zulip starseeker (Mar 03 2021 at 15:24):

The scriptics branch has 4 commits - r19756, r19758, r19760 and r19762. Those don't have assignments right now, but it looks like the corresponding commits have r19757, r19759, r19761 and r19763. Looking at them, I'd say the four earlier commits are probably the better content choices for assignment (not to mention having the mapping commit messages.)

view this post on Zulip starseeker (Mar 03 2021 at 15:40):

@Sean OK, I think I've got the corrective files in place for r30687. Basically, since cvs-fast-export put the commits on other branches, we don't have png, zlib or scriptics branch deletes. I added the proper VendorARL delete, and removed the spurious itcl3-2, etc. deletes incorrectly associated with r30687 in Git.

view this post on Zulip starseeker (Mar 03 2021 at 15:41):

Also updated the scriptics commit revision assignments.

view this post on Zulip Sean (Mar 03 2021 at 15:42):

I don't think the encoding was a git version issue, I think it's just encoding. I think I have it sorted out.

view this post on Zulip Sean (Mar 03 2021 at 15:42):

Looks like the git command I used to dump the log and the svn command used to dump the log ended up dumping differently is all.
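
One way two dumps of the same message can disagree is if the same UTF-8 bytes get decoded with different codecs along the way. That cause is an assumption here, as is the choice of Latin-1 as the wrong codec. A sketch of undoing it before comparing (`repair_mojibake` is a made-up helper name):

```python
def repair_mojibake(text, wrong_codec="latin-1"):
    """Undo UTF-8 bytes that were decoded as `wrong_codec`; return text unchanged otherwise."""
    try:
        return text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside wrong_codec, or the bytes
        # are not valid UTF-8, so it was most likely decoded correctly.
        return text


raw = "add Roßberg to list of contributors".encode("utf-8")
decoded_right = raw.decode("utf-8")
decoded_wrong = raw.decode("latin-1")  # same bytes, different text
```

With that normalization applied to both sides, the two decodings compare equal again, and plain ASCII messages pass through untouched.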

view this post on Zulip Sean (Mar 03 2021 at 15:43):

So that's pretty much the entirety of commits that had UTF-8 characters in them. The suspicious quote-related ones look like they're actually smart single quotes, probably copy-pasted from some output.
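
A quick way to tell a plain apostrophe from a smart quote in a log message is to list every non-ASCII character with its Unicode name. A sketch, with `non_ascii` as an illustrative helper:

```python
import unicodedata


def non_ascii(message):
    """Return (character, Unicode name) for every non-ASCII character in message."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in message
        if ord(ch) > 127
    ]
```

A message typed with a plain `'` yields an empty list, while one pasted with a curly quote reports `RIGHT SINGLE QUOTATION MARK`.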

view this post on Zulip starseeker (Mar 03 2021 at 15:44):

Ah, cute.

view this post on Zulip Sean (Mar 03 2021 at 15:44):

starseeker said:

OK, so it looks like the git commits are spurious - I may have messed up a correction or some such. Of the 4 branches from r30687, only VendorARL is present and it looks like that's because I custom-added it.

I can pull the rest... r30687 was just one example. There are others like that.

view this post on Zulip starseeker (Mar 04 2021 at 01:01):

Well, that was mostly a blind alley I should have known better than to chase, but it did result in characterizing some of the commit diffs... looks like cvs-fast-export and cvs2svn sometimes picked different commit ordering for commits with the same timestamps.

@Sean unless you feel really strongly about that I'd rather not try to switch them around - it'll take some effort on the repowork code to support doing so.
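
To illustrate how identical timestamps leave the order open: two converters that sort by time but break ties differently will emit the same commits in different sequences. The tie-break rules below are invented for the example. The actual rules used by cvs2svn and cvs-fast-export are not stated here.

```python
# Hypothetical commits: A and B share a timestamp.
COMMITS = [
    {"id": "A", "time": 100, "path": "librt/db_io.c"},
    {"id": "B", "time": 100, "path": "h/db.h"},
    {"id": "C", "time": 101, "path": "mged/edsol.c"},
]


def order_by(commits, tiebreak):
    """Sort by timestamp, then by whatever `tiebreak` returns for each commit."""
    return [c["id"] for c in sorted(commits, key=lambda c: (c["time"], tiebreak(c)))]


# Constant tie-break: the stable sort keeps input order among equal timestamps.
by_input_order = order_by(COMMITS, lambda c: 0)
# Path tie-break: equal timestamps are ordered alphabetically by path.
by_path = order_by(COMMITS, lambda c: c["path"])
```

Both orderings are consistent with the timestamps, which is why neither can be called wrong.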

view this post on Zulip starseeker (Mar 04 2021 at 01:14):

We need some kind of "good enough" criteria... my sense is that chasing down all the CVS vs SVN vs Git differences has the potential to be nearly endless...

view this post on Zulip Sean (Mar 04 2021 at 07:17):

Yeah, I'm not worried about commit ordering. The oddity was the multitude of seemingly unrelated branches. Working on pulling that list still, had some diffs that had to get recomputed and worked on tallying where we're at.

view this post on Zulip Sean (Mar 04 2021 at 07:23):

My criterion has been to identify or explain all the non-empty trunk commits: that all are tagged or otherwise accounted for correctly (i.e., with something matching, or it's a split commit). We're definitely closing that gap.

view this post on Zulip Sean (Mar 04 2021 at 07:55):

I can pull the rest... r30687 was just one example. There are others like that.

Here's the others that were like that:

view this post on Zulip Sean (Mar 04 2021 at 07:55):

30687 NOT FOUND (empty files) (TAGGED MISMATCH 004ec0ae439f0ca3c814d22a46957012cd8fb239 720f9b9b75588e35d3cce0f9f5b802abea2259ab f206b315ca475d3a3e55e98ec42d772c6b05baee cbff64617866cc3fc2b25db15cd610e651561958 87bc784daf7f15cc8d9c9fa980a934a98a17de95 d748c2ea214b699008563e18f5a7105de39faba9 )
30688 NOT FOUND (empty files) (TAGGED MISMATCH 3882bb89a329277499b8b6c2246be115544740a0 68aeb784b3ee698c854878c190eb4b229b88e1fe )
30690 NOT FOUND (empty files) (TAGGED MISMATCH bab9cb74c7cf403e3c6ffb862367e7e921d5de5e 1f1d7a7f607d5b4c673d1d73cc7bcb126b0da82b ebaea28c7f234f5af88bbe8f60e8cae1026d7f08 9a4972e8d397e2fe1457987252531bbb08aae2b5 2000a7fd53ba7f017eadb55168ed737a0e6d2906 47ca01661701d59ee6aa948cd914b42e9ae9e36e )
36471 NOT FOUND (empty files) (TAGGED MISMATCH 5d5a16ac1af3bef7ea3acd9df913a882ecb2c450 cf54441bbb9da781638c782f0330e2399b114ba2 f3402be29c09993717319df0a8045087c3c1efcc 29fb00141b4040de08c9319404bfe44946ef43f2 2c43fbad65f4bc373dfa80a6254077b5913623d0 e19308e9b43771204ad04daa015bb646ffda7077 )
36472 LOG+FILE MATCH ON 96a3e5fb75628744e4835d9ce2f7cbf8dbca8ec4 (TAGGED MISMATCH 96a3e5fb75628744e4835d9ce2f7cbf8dbca8ec4 c0737a9252506872ce5ce6cd14207f7c375741da )
46324 NOT FOUND (empty files) (TAGGED MISMATCH 44e3d7341c5680250d65091b2aff6ed051720a11 a988903bbe27985e0dd94228e07079e91e98be4d c54b9b07158d4a904aabddae264290854ecb250c 03af105da8dd3cf85a29cc7f056513cc8e79d751 )
46322 NOT FOUND (empty files) (TAGGED MISMATCH e56ca9ed3e746b0f0531a5a90a50706dc4486786 cbd805930e92e0174548d245eee8a50f79f4be6a 8db928ed630bba609e98e97045dc91377539353e f64cf35a3a10e027863a68a07f6d4dda041d0fb4 3e54caeb944540d809a8c123289f9fb3624b7509 f199b69dd1f620bfa299a9e8fd520c37cc9b3c26 33b42ffbd7c4aa5e42e5854d020a8d66dd69ccfc a6225b252463bcb48ce3376200227c1e783c77d5 )
46328 NOT FOUND (empty files) (TAGGED MISMATCH ccb829355adc0829b9a5a7a3f0b5ac72dc13ea45 a442ff82f39e00b14ef139fb8f62b18c0ec32046 e4fba5a7cdbc525184d64170eec22e7eeedbd1f2 4396a0b1cd513d4ce9945589c4aadb17eda9a6d0 5452bab5c382ff6f2d0af42c2d4b367a0fdc13aa 9884b41b3aa6f790c80dbb4a55cf5cea4844fc8b 46d4a300710516c5547fa1b6f64ba29ec64ab3b4 )
46335 NOT FOUND (empty files) (TAGGED MISMATCH dd2bb79965568f5aab4f7458606d875d22b74b40 f5e6fc5ebfaaedceb7538a1f2ba1a3fc1589c399 )
62127 LOG+FILE MATCH ON 797d0138514136e2e95b0dfa1cc7d2e774fef2ab (TAGGED MISMATCH bae6fd511505e5e4f12f16b1cd73b5381f4f47f6 797d0138514136e2e95b0dfa1cc7d2e774fef2ab )
62975 NOT FOUND (empty files) (TAGGED MISMATCH dbaf54ff6b25ad2f576f82f26086101bc5015dec e047bc1116cc3199bbdbf58101ef281c153c2b74 )
69921 NOT FOUND (TAGGED MISMATCH cca216f058fe5791dbbd082ad7293911b6aae9f6 f828d1c0b1f6e68879a1bdecb2c58d1dc9a9207b )

view this post on Zulip Sean (Mar 04 2021 at 07:57):

can ignore the NOT FOUND / empty files -- that's just me not tracking branches. What's interesting is those are revs tagged to multiple git commits. Some of course may be intentional, but that's all of them (on conv12).
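
Finding revs tagged on more than one Git commit is a group-by over (commit, rev) pairs. A sketch, assuming the tags have already been extracted into such pairs. The extraction step and the name `multi_tagged` are not from the actual scripts, and the hashes are abbreviated from ones quoted in this thread.

```python
from collections import defaultdict


def multi_tagged(pairs):
    """Given (sha, rev) pairs, return {rev: sorted shas} for revs on 2+ commits."""
    by_rev = defaultdict(set)
    for sha, rev in pairs:
        by_rev[rev].add(sha)
    return {rev: sorted(shas) for rev, shas in by_rev.items() if len(shas) > 1}


PAIRS = [
    ("96a3e5fb75", 36472),
    ("c0737a9252", 36472),
    ("3882bb89a3", 30688),
    ("68aeb784b3", 30688),
    ("f8fa716f50", 15365),
]
```

Revs that appear on exactly one commit, like r15365 here, drop out of the result.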

view this post on Zulip starseeker (Mar 04 2021 at 14:20):

OK. Going through the list...

view this post on Zulip starseeker (Mar 04 2021 at 14:22):

r30688 is on two commits because it eliminated both the branch with the Mac Hack commit and other "unlabeled" branches which mapped (collapsed) to "master-UNNAMED-BRANCH" in the cvs-fast-export conversion. So master-UNNAMED-BRANCH has two branch delete commits assigned to it. Might as well delete 68aeb784b3ee698c, since it doesn't add anything.

view this post on Zulip starseeker (Mar 04 2021 at 14:25):

r30690 is deliberate - multiple branches removed in a single commit.

view this post on Zulip starseeker (Mar 04 2021 at 14:27):

r36471 is deliberate - multiple branches removed in a single SVN commit

view this post on Zulip starseeker (Mar 04 2021 at 14:30):

r36472 is the result of a branch naming consolidation - c0737a92525 can be removed.

view this post on Zulip starseeker (Mar 04 2021 at 14:31):

r46324 is deliberate - multiple deletions in a single SVN commit

view this post on Zulip starseeker (Mar 04 2021 at 14:31):

ditto r46322 - multiple tag deletions, single SVN commit

view this post on Zulip starseeker (Mar 04 2021 at 14:32):

Same with r46328 - multiple tag deletions, single SVN commit

view this post on Zulip starseeker (Mar 04 2021 at 14:32):

Same with r46335 - deliberate

view this post on Zulip starseeker (Mar 04 2021 at 14:36):

r62127 looks like a branch delete that registered an empty commit on trunk for some reason - 797d0138514136e can be removed.

view this post on Zulip starseeker (Mar 04 2021 at 14:36):

r62975 is deliberate - multiple branch delete

view this post on Zulip starseeker (Mar 04 2021 at 14:40):

r69921 - looks like a branch rebase got recorded somehow as a branch delete plus re-creation - f828d1c0b1f6e68879a1bdecb2c58d1dc9a9207b can be removed.

view this post on Zulip starseeker (Mar 04 2021 at 14:47):

@Sean Note that I went and manually tagged a lot of Git commits as mapping to multiple SVN revisions in the post-conv12 update logic...

view this post on Zulip Sean (Mar 04 2021 at 16:02):

I did notice that... it "should" just mean a lot more multiple matches no? If so, I think we can just do a post-process check later to make sure there wasn't a typo or other blatant mistake in the manual tagging, but shouldn't affect the upload V&V.

view this post on Zulip starseeker (Mar 04 2021 at 16:05):

Yes, more multiple matches from the CVS era commits.

view this post on Zulip starseeker (Mar 04 2021 at 16:07):

I wasn't sure how "deep" you wanted to go checking those manual tags - the majority are based on context (unmapped commit that is immediately before a mapped commit, with a file missing from the "mapped" commit compared to the SVN file list) but I'm not set up to actually try and validate all the diffs as being part of the SVN commits.
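
The context rule described there can be sketched as a file-set test: an unmapped commit sitting right before a mapped one is a candidate for the same SVN rev if it supplies files the mapped commit is missing and touches nothing outside the rev's file list. This is an illustration of the heuristic only. The function name and the exact conditions are assumptions, and it does not validate any diffs.

```python
def completes_revision(svn_files, mapped_files, unmapped_files):
    """True if the unmapped commit plausibly carries the rest of the SVN rev."""
    svn, mapped, unmapped = set(svn_files), set(mapped_files), set(unmapped_files)
    missing = svn - mapped  # files SVN changed that the mapped commit lacks
    return bool(missing) and bool(unmapped & missing) and unmapped <= svn
```

A commit that also touches files outside the SVN file list fails the test, which keeps unrelated neighbors from being pulled in.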

view this post on Zulip starseeker (Mar 04 2021 at 16:08):

It may not be possible in all cases anyway, if one git commit ended up getting deltas from two SVN commits - in that case the best that can be done is an "approximate" assignment.

view this post on Zulip Sean (Mar 04 2021 at 16:11):

ankle deep, just a blatant sanity check to see whether they are deliberate or mistakes, since they were outliers.

view this post on Zulip Sean (Mar 05 2021 at 05:45):

Okay, @starseeker here's a batch for you to check out, myriad issues. These are all the commits that do not map uniquely. Most are probably correct as-is and simply aren't unique because they were a common log message applied to the same files or similar or were branch commits (keep in mind that I'm ignoring branch-only diff data so they show up as "not found"), BUT the rest are all multiple candidate diffs. Could be entirely benign or correct, but could use your eyes on at least some of them.
svn.to.git5.multiple_matches.sorted

view this post on Zulip Sean (Mar 05 2021 at 06:02):

Here's one you may have already captured with changes you made a couple days ago, but here are all the commits that match svn revs in LOG+FILES+DIFF, but aren't tagged revs in git (or at least weren't as of brlcad_conv12). That's not to say that they should be all mapped -- it's entirely possible for a commit to have gotten split and just happens to map to another with the same files and log message. I'm not sure how to rule that out, but maybe you can verify them easily. There's 167 in this category:
svn.to.git5.matching_not_tagged

view this post on Zulip Sean (Mar 05 2021 at 06:12):

Feel like these two might be swapped:

+9016 LOG+FILE+DIFF MATCH ON 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b (TAGGED MISMATCH 68b56645d7a689a3af445bb5dfef16c78a4a4270)
+9015 LOG+FILE+DIFF MATCH ON 68b56645d7a689a3af445bb5dfef16c78a4a4270 (TAGGED MISMATCH 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b)

view this post on Zulip Sean (Mar 05 2021 at 06:12):

or they're splits because of cvs screwery and they're right because of other adjacent commits?

view this post on Zulip Sean (Mar 05 2021 at 06:30):

another set similar to matching_not_tagged is this batch that aren't/weren't in git but match a log+file pairing, possible candidates. Note some are non-unique.
svn.to.git5.matching_lf_not_tagged

view this post on Zulip Sean (Mar 05 2021 at 06:47):

here's a much smaller but similar set of untagged commits where a matching log message was found, possible candidates for manual tagging. affects 37 commits:
svn.to.git5.matching_l_not_tagged

view this post on Zulip Sean (Mar 05 2021 at 07:30):

In theory, I think those 5 data sets reconciled fully should nearly result in full coverage... the only ones missing should be ambiguous cases. I can run a final trunk pass on any changes you make (conv16?) and we can see if there are any left! This might be it.

view this post on Zulip Sean (Mar 05 2021 at 07:30):

Kind of exciting!

view this post on Zulip Sean (Mar 05 2021 at 07:32):

Here's where the tallies stand:

 78233 total unique commits
-10544 PERFECT MATCH
- 9356 NOT FOUND (branch changes)
-  939 EMPTY (prop changes)
-50807 LOG+FILE+DIFF (matching)
       167 LOG+FILE+DIFF MATCH but not tagged (all UNIQUE)
       141 MISMATCH or duplicated candidates
- 5001 LOG+FILE (matching)
       38 LOG+FILE MATCH but not tagged (13 UNIQUE)
       776 MISMATCH or duplicated candidates
-  180 LOG+DIFF (matching)
-   29 FILE+DIFF (matching)
-   30 DIFF (matching)
-    0 FILE (matching)
- 1117 LOG (matching)
       90 MATCH but not tagged (all UNIQUE)
       350 MISMATCH or duplicated candidates
------
   229 unaccounted for in mismatches not tagged
  - 90 LOG not tagged
  - 13 LOG+FILE not tagged
  -167 LOG+FILE+DIFF not tagged
------
   -41 dupes not excluded properly (oops)

view this post on Zulip starseeker (Mar 05 2021 at 14:27):

Sean said:

Feel like these two might be swapped:

+9016 LOG+FILE+DIFF MATCH ON 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b (TAGGED MISMATCH 68b56645d7a689a3af445bb5dfef16c78a4a4270)
+9015 LOG+FILE+DIFF MATCH ON 68b56645d7a689a3af445bb5dfef16c78a4a4270 (TAGGED MISMATCH 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b)

Looks like they are swapped based on the diffs, yes.
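
A swap like this can be picked out mechanically: rev A's best match is the commit tagged as rev B, and vice versa. A sketch that parses report lines of the single-hash form quoted above (the regex and `find_swaps` are illustrative, and lines listing several tagged hashes are simply skipped):

```python
import re

# Matches e.g. "+9016 LOG+FILE+DIFF MATCH ON <sha> (TAGGED MISMATCH <sha>)".
LINE_RE = re.compile(
    r"^\+?(\d+) \S+ MATCH ON ([0-9a-f]+) \(TAGGED MISMATCH ([0-9a-f]+)\)"
)

REPORT = [
    "+9016 LOG+FILE+DIFF MATCH ON 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b (TAGGED MISMATCH 68b56645d7a689a3af445bb5dfef16c78a4a4270)",
    "+9015 LOG+FILE+DIFF MATCH ON 68b56645d7a689a3af445bb5dfef16c78a4a4270 (TAGGED MISMATCH 24c9f6ebb84eba2bb53211c3012b7dfb68672a2b)",
]


def find_swaps(report_lines):
    """Return sorted (rev_a, rev_b) pairs whose matched and tagged commits are crossed."""
    seen = {}
    for line in report_lines:
        m = LINE_RE.match(line.strip())
        if m:
            seen[int(m.group(1))] = (m.group(2), m.group(3))
    swaps = []
    for rev_a, (match_a, tagged_a) in seen.items():
        for rev_b, (match_b, tagged_b) in seen.items():
            if rev_a < rev_b and match_a == tagged_b and match_b == tagged_a:
                swaps.append((rev_a, rev_b))
    return sorted(swaps)
```

Run on the two lines above, this reports the pair (9015, 9016).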

view this post on Zulip starseeker (Mar 05 2021 at 14:34):

Going through the multiple_matches file, it's looking like the SVN era commits are mostly "checkpoint" or similarly ambiguous commit messages on similar file sets (which is what I would expect for the SVN era - given how the commits were generated for that portion of the history I'm not sure how we'd get an SVN revision number mis-assignment, since the commits were generated on a per-SVN commit basis to begin with...)

view this post on Zulip starseeker (Mar 05 2021 at 14:44):

r62027 and r62708 are a bit more interesting - they are branch creation and deletion commits that represent me adding and deleting a branch in the wrong place. They are candidates for removal, unless you want to keep them to preserve the history of what happened at those particular SVN commits.

view this post on Zulip starseeker (Mar 05 2021 at 15:01):

@Sean I've scanned the logs for the SVN era multiple_matches commits, and r62027 is the only one that jumped out - the rest appear to be either checkpoints, branch syncs, throwaway test commits, or applying identical changes to different branches.

view this post on Zulip starseeker (Mar 05 2021 at 15:02):

or a few that are different changes to the same file with the same commit message.

view this post on Zulip starseeker (Mar 05 2021 at 15:05):

One I'm not following - how come da4ace8194f81d0f92565b428dfa309143b37914 and ae970a06e7d02f63e7c77ff927af5ca90721a111 are getting flagged as 75110 match?

view this post on Zulip starseeker (Mar 05 2021 at 15:07):

I see they're "rename" commit message commits, but I'm not seeing any file matching...

view this post on Zulip starseeker (Mar 05 2021 at 15:30):

r799 is an example where the commit groupings ended up different - match.c isn't in SVN 799, but the vdeck.c changes in that commit do appear to align with the r799 changes.

view this post on Zulip starseeker (Mar 05 2021 at 16:10):

I'm trying to go through the CVS era a bit more carefully, but so far none of the multiple_matches seem to indicate mis-mapped files. A couple untagged commits that matched entries on my list, and one minor correction to a git rev assignment.

view this post on Zulip starseeker (Mar 05 2021 at 16:28):

The matching_not_tagged I confirmed as being part of svn_rev_updates.txt, and the lf_not_tagged had a few that appear to be valid matches as well (some aren't). I think I've accounted for the l_not_tagged commits as well.

view this post on Zulip starseeker (Mar 05 2021 at 16:45):

@Sean It will take me a bit more time to manually confirm that none of the "TAGGED MISMATCH" cvs era git commits are actually incorrectly identified, but I'm hopeful they're good. I've uploaded the current state at https://github.com/starseeker/brlcad_conv17 - if neither of us finds anything else, I'll do a final update from SVN and we'll be ready to roll.

(The most likely source of any remaining issues is if you spot something in my manually assigned commits - they're more extensive than the ones from the svn.to.git lists, since I was making a stab at mapping all the commits I could back to SVN.)

view this post on Zulip Sean (Mar 05 2021 at 21:07):

starseeker said:

r62027 and r62708 are a bit more interesting - they are branch creation and deletion commits that represent me adding and deleting a branch in the wrong place. They are candidates for removal, unless you want to keep them to preserve the history of what happened at those particular SVN commits.

I can go either way. I would probably preserve, but not a big deal either way.

view this post on Zulip Sean (Mar 05 2021 at 21:22):

starseeker said:

One I'm not following - how come da4ace8194f81d0f92565b428dfa309143b37914 and ae970a06e7d02f63e7c77ff927af5ca90721a111 are getting flagged as 75110 match?

False positive, can ignore them. They're an artifact of how branches were handled from svn. They have empty file lists, so it erroneously thinks it has a better match than it really does. I didn't get around to detecting and handling that case differently. If you come across a rev that is branch activity, you can just skip it.
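
The false positive comes from two empty file lists comparing as equal. One possible guard, sketched under the assumption that file lists are compared as sets (`files_match` is a hypothetical helper), is to treat an empty list as "undetermined" instead of as a match:

```python
def files_match(svn_files, git_files):
    """True/False for a real comparison; None when either side has no files."""
    if not svn_files or not git_files:
        return None  # branch activity or similar: nothing to compare
    return set(svn_files) == set(git_files)
```

A caller can then skip revs that return `None` instead of counting them as FILE matches.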

view this post on Zulip Sean (Mar 05 2021 at 22:20):

So.... based on an assumption that svn_rev_updates.txt is correct enough, we're down to just 48 to resolve...HOME STRETCH!
...

view this post on Zulip Sean (Mar 05 2021 at 22:21):

and now down to 8!

view this post on Zulip Sean (Mar 05 2021 at 22:43):

and now 0.

view this post on Zulip starseeker (Mar 05 2021 at 22:44):

O.o

view this post on Zulip Sean (Mar 05 2021 at 22:44):

@starseeker Is it possible to tag a merged commit with two revs?

view this post on Zulip starseeker (Mar 05 2021 at 22:45):

Maybe... if we replace the commit with a new commit having a custom message. A bit tricky, but doable if it's only one or two

view this post on Zulip Sean (Mar 05 2021 at 22:45):

there are a handful of svn_rev_updates.txt that didn't apply because the commit was already tagged as something else

view this post on Zulip Sean (Mar 05 2021 at 22:45):

when I looked, it's because it's both

view this post on Zulip starseeker (Mar 05 2021 at 22:46):

Hmm. How many?

view this post on Zulip Sean (Mar 05 2021 at 22:48):

I don't know, can run a script to find out -- but basically it's all the entries in svn_rev_updates that are on a commit that has something else

view this post on Zulip Sean (Mar 05 2021 at 22:48):

example f9fd3ad956d23e854df73294083cb37ef3c2f341

view this post on Zulip Sean (Mar 05 2021 at 22:48):

that's 9011 and 9012 iirc

view this post on Zulip starseeker (Mar 05 2021 at 22:49):

Oh, OK - we're talking CVS era commits. Yeah, I'm not surprised.

view this post on Zulip Sean (Mar 05 2021 at 22:49):

yeah, oldest I found was r14320 which is in 76e74e9e9ce955bf6602171e67cbcd9539bfbec9

view this post on Zulip starseeker (Mar 05 2021 at 22:50):

My original thought was that as long as we had one rev number assigned in the right general range, that would provide timeline and history context. Is it worth trying to tease out the multiple commit mappings?

view this post on Zulip starseeker (Mar 05 2021 at 22:50):

Seeing as they won't be 1-1 regardless in that case...

view this post on Zulip Sean (Mar 05 2021 at 22:50):

nah, I think it's fine -- just didn't know if it was possible/easy

view this post on Zulip Sean (Mar 05 2021 at 22:51):

let me do a quick check to see if we're talking about a few dozen or hundreds

view this post on Zulip starseeker (Mar 05 2021 at 22:51):

Not trivially - to really do that "Right" I would have had to generate independent per-file diffs for all the git and SVN revisions, then find all the corresponding changes and do all the multi-mappings.

view this post on Zulip starseeker (Mar 05 2021 at 22:52):

If it's just a couple we can fake it by doing hand-assembled replacement commits, but anything more intensive would be really tough.

view this post on Zulip starseeker (Mar 05 2021 at 22:53):

Need to leave something for the next generation of historians to figure out ;-)

view this post on Zulip Sean (Mar 05 2021 at 22:53):

I'm not looking to find them beyond what's already in svn_rev_mappings.txt ... there's some unknown number of them in there already

view this post on Zulip Sean (Mar 05 2021 at 22:54):

In hunting down the last few missing trunk commits, it turned out they were already identified in the mappings file

view this post on Zulip starseeker (Mar 05 2021 at 22:54):

Ah.

view this post on Zulip Sean (Mar 05 2021 at 22:54):

they just weren't tagged because that commit got tagged again as something else

view this post on Zulip Sean (Mar 05 2021 at 22:54):

only writing out the latter

view this post on Zulip starseeker (Mar 05 2021 at 22:55):

Whoops. I thought I had checked for those - must have re-introduced a couple. Hang on - awk + sort + uniq to the rescue...

view this post on Zulip starseeker (Mar 05 2021 at 22:57):

/me blinks - all the sha1 keys in svn_rev_updates.txt appear to only be in there once...
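
For anyone repeating that check, a minimal sketch of the awk + sort + uniq pass could look like the following. It assumes a whitespace-separated "sha1 rev" layout, which is a guess at the format of svn_rev_updates.txt, and the sample rows are made up for illustration from the revision pairs discussed in this thread.

```shell
# Hypothetical sample in an assumed "sha1 rev" layout; the real file may differ.
map="$(mktemp)"
cat > "$map" <<'EOF'
c7da20384024574fddc07c59dcdfcc2879560e31 858
f9fd3ad956d23e854df73294083cb37ef3c2f341 9011
f9fd3ad956d23e854df73294083cb37ef3c2f341 9012
EOF

# Extract the key column, sort it, and print each key that occurs more than once.
dups="$(awk '{print $1}' "$map" | sort | uniq -d)"
echo "$dups"
```

An empty result means every sha1 key appears exactly once, which is what the check above reported for the real file.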

view this post on Zulip starseeker (Mar 05 2021 at 23:00):

Must have happened earlier in the process.

view this post on Zulip starseeker (Mar 05 2021 at 23:00):

If we need to do it for the missing ones, as long as it's not too many, I can do what I did for the "Mac Hack" commit to fix its data and make replacement commits to apply.

view this post on Zulip Sean (Mar 05 2021 at 23:01):

it's not that they're listed more than once

view this post on Zulip Sean (Mar 05 2021 at 23:03):

it's that whatever processing association that normally happens happened (or it's because I'm on 12, and if I were testing 17 then I wouldn't find 9011 instead of 9012 and vice versa)

view this post on Zulip starseeker (Mar 05 2021 at 23:03):

/me nods - I get it, and we actually want both so we don't have missing svn rev mappings.

view this post on Zulip Sean (Mar 05 2021 at 23:04):

or we ignore it because it's just 3-4 of them

view this post on Zulip starseeker (Mar 05 2021 at 23:04):

i.e. something for grep to match for both 9011 and 9012, even if it goes to the same commit.

view this post on Zulip Sean (Mar 05 2021 at 23:04):

or edit the log message at the last minute

view this post on Zulip starseeker (Mar 05 2021 at 23:04):

If it's just 3-4 I can deal with it pretty quickly (knock on wood)

view this post on Zulip Sean (Mar 05 2021 at 23:04):

I'm running a check now

view this post on Zulip starseeker (Mar 05 2021 at 23:05):

If it's a dozen or more I'd be more inclined to punt

view this post on Zulip Sean (Mar 05 2021 at 23:05):

looks like 859 and 858 is the first it found

view this post on Zulip starseeker (Mar 05 2021 at 23:05):

/me goes fishing...

view this post on Zulip Sean (Mar 05 2021 at 23:05):

it's up to 1500 so I suspect it'll take it 10min to get through them all

view this post on Zulip starseeker (Mar 05 2021 at 23:06):

You said the highest one was in the 14k range?

view this post on Zulip Sean (Mar 05 2021 at 23:06):

well just that I ran into looking for missing trunks, but that won't have all the assignments you did

view this post on Zulip Sean (Mar 05 2021 at 23:06):

I'd have to check 17 for that

view this post on Zulip starseeker (Mar 05 2021 at 23:06):

Ah, right.

view this post on Zulip starseeker (Mar 05 2021 at 23:07):

/me tries test with 858...

view this post on Zulip Sean (Mar 05 2021 at 23:07):

I think this scan over svn_rev_mappings will be a good enough check... if they're really rare, then we can just punt

view this post on Zulip Sean (Mar 05 2021 at 23:07):

or can amend the log and let the shas shift prior to upload

view this post on Zulip Sean (Mar 05 2021 at 23:09):

so far it's only found 4 and it's up to 10k

view this post on Zulip Sean (Mar 05 2021 at 23:12):

Here's the list:

MISMATCH on 859 and 858 in c7da20384024574fddc07c59dcdfcc2879560e31
MISMATCH on 9011 and 9012 in f9fd3ad956d23e854df73294083cb37ef3c2f341
MISMATCH on 11406 and 11407 in eb0179c08aefd8ea90697c42eba31244e4904eed
MISMATCH on 12424 and 12425 in fc9e5a26cba18a926c644a4e2bb4b321855f2a88
MISMATCH on 14320 and 14321 in 76e74e9e9ce955bf6602171e67cbcd9539bfbec9
MISMATCH on 18892 and 18993 in 67c46ada661fdab789632885c34bf77a277962db
MISMATCH on 21564 and 21565 in 11077485329842c81213eab68006fe5d58b5925f
MISMATCH on 22525 and 22521 in edf3df35c8c44492fa25cb3999788338b1f2570b
MISMATCH on 19756 and 19757 in a7c85f280677d70b8eef9aadf79302736ed26ffc
MISMATCH on 19758 and 19759 in f5a1b0037fec2927cba073d118db24cdbd681975
MISMATCH on 19760 and 19761 in a098425430db227021617976961e6b51ce5569cb
MISMATCH on 19762 and 19763 in e6417be98f27d570d863744f566f5aaf738abbe6

view this post on Zulip Sean (Mar 05 2021 at 23:13):

9015 and 9016 look suspicious...

view this post on Zulip starseeker (Mar 05 2021 at 23:13):

Weren't those the ones you thought were flipped earlier?

view this post on Zulip starseeker (Mar 05 2021 at 23:13):

I think I looked and concurred...

view this post on Zulip Sean (Mar 05 2021 at 23:14):

yes, they were. okay, so probably showing up just because I'm comparing them to conv12

view this post on Zulip starseeker (Mar 05 2021 at 23:15):

OK, that's not too bad - my 858 test seems to be going smoothly, so I can probably get the others.

view this post on Zulip starseeker (Mar 05 2021 at 23:15):

Any other TODOs before switch flipping?

view this post on Zulip Sean (Mar 05 2021 at 23:16):

were you still sorting through the other potential branch taggings or done with them?

view this post on Zulip starseeker (Mar 05 2021 at 23:16):

You mean the multiple_matches file?

view this post on Zulip Sean (Mar 05 2021 at 23:17):

well that one, but more importantly the three not_tagged files to see which if any don't have a tagging

view this post on Zulip Sean (Mar 05 2021 at 23:18):

since those were content-based lookups, some were unique untagged matches

view this post on Zulip starseeker (Mar 05 2021 at 23:18):

I think I checked the non-tagged and all of those commits were listed in svn_rev_updates.txt

view this post on Zulip starseeker (Mar 05 2021 at 23:20):

multiple_matches has a pretty high false-positive rate - I'm basically checking the TAGGED MISMATCH commits against the trunk diff visually to make sure they look like they're correctly lined up. LOG+FILE is apparently not a terribly unique key in the revision set (mostly my fault, too, from what I've seen so far... I should go back in time and tell myself to use more unique commit messages.)

view this post on Zulip Sean (Mar 05 2021 at 23:20):

yep

view this post on Zulip Sean (Mar 05 2021 at 23:21):

the thing about multiple matches is those are revs for tags that were not tagged in conv12

view this post on Zulip starseeker (Mar 05 2021 at 23:22):

you mean sha1 keys that didn't have a matching SVN rev?

view this post on Zulip Sean (Mar 05 2021 at 23:22):

probably would make sense to only check the multiple_match revs to see which if any are NOT listed in svn_rev_mappings , since they're potential new info

view this post on Zulip Sean (Mar 05 2021 at 23:24):

svn revs that aren't referenced in the git log

view this post on Zulip Sean (Mar 05 2021 at 23:25):

it should be exclusively branch commits since I half ignored them

view this post on Zulip Sean (Mar 05 2021 at 23:25):

no worries, it was just a thought.. I think it's good to go

view this post on Zulip starseeker (Mar 05 2021 at 23:26):

Did you want me to do the multi-svn labeling? That'll probably take about an hour

view this post on Zulip Sean (Mar 05 2021 at 23:26):

would it be faster to just edit the commit?

view this post on Zulip starseeker (Mar 05 2021 at 23:27):

Maybe, but I'd have to figure out how to do that...

view this post on Zulip Sean (Mar 05 2021 at 23:27):

I mean just edit the log message
git commit --amend or whatever it is

view this post on Zulip starseeker (Mar 05 2021 at 23:28):

Um. Let me try that once...

view this post on Zulip starseeker (Mar 05 2021 at 23:29):

I think it's actually "git rebase" for older commit messages.
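
A small sketch of the difference, run in a throwaway repository: `git commit --amend` only rewrites the most recent commit, and the commit hash changes as a result. The file name, commit messages, and the "r9011, r9012" labelling are hypothetical, not the format used in the conversion.

```shell
# Scratch repository with a single commit.
repo="$(mktemp -d)"
git init -q "$repo"
printf 'hello\n' > "$repo/README"
git -C "$repo" add README
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial import"
before="$(git -C "$repo" rev-parse HEAD)"

# Rewrite only the log message of the most recent commit.
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --amend -m "initial import (r9011, r9012)"
after="$(git -C "$repo" rev-parse HEAD)"

# The two hashes differ because the message is part of the commit object.
echo "before: $before"
echo "after:  $after"
```

For a commit further back, `git rebase -i` with the `reword` action does the same job, and every descendant commit gets a new hash as well.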

view this post on Zulip starseeker (Mar 05 2021 at 23:49):

@Sean 22521 and 22525 should both still be there after svn_rev_update - 22521 was moved to afd806bf472d0ac4b2685be406966a5a6eb28e5c

view this post on Zulip starseeker (Mar 06 2021 at 00:28):

Ah, I'm supposed to be correcting 18992, not 18892 - that's why

view this post on Zulip starseeker (Mar 06 2021 at 00:29):

@Sean OK, re-running - I'll upload to brlcad_conv18 once it's done.

view this post on Zulip starseeker (Mar 06 2021 at 00:54):

@Sean ec2350e47ab0a7a6a2e4f798aaf3a348775077ef is tagged as both 2103 and 2185

view this post on Zulip starseeker (Mar 06 2021 at 00:56):

Ah, I see - 2103 is right

view this post on Zulip starseeker (Mar 06 2021 at 01:20):

https://github.com/starseeker/brlcad_conv18

view this post on Zulip starseeker (Mar 06 2021 at 01:20):

@Sean I think that's got everything.

view this post on Zulip Sean (Mar 06 2021 at 01:36):

Cool, checking it now.

view this post on Zulip starseeker (Mar 06 2021 at 13:55):

@Sean How we looking?

view this post on Zulip Sean (Mar 06 2021 at 18:13):

It's still chugging through it all; should know how it looks here in a bit. Per the checklist, we're done with the repo itself if there are no problems on this final pass! so exciting!

view this post on Zulip Sean (Mar 06 2021 at 18:13):

Then on to the dang trackers and such...

view this post on Zulip Sean (Mar 06 2021 at 18:15):

This is taking a little longer because I had to re-extract the diffs that ran last night. I forgot to set diff.renameLimit on the new conv18 which caused slews of false differences. The re-extraction is running.

view this post on Zulip Sean (Mar 06 2021 at 18:17):

I should probably figure out how to set that in my personal config, instead of having to set it every cloning.

view this post on Zulip starseeker (Mar 06 2021 at 19:36):

I pulled all the latest SVN commits in - brlcad_conv18 should now be up-to-the-minute (i.e. r78389)

view this post on Zulip starseeker (Mar 06 2021 at 19:37):

Unless you see an issue or someone commits before validation completes, brlcad_conv18 should be ready to upload.

view this post on Zulip Sean (Mar 06 2021 at 20:30):

I'm only checking through 78233 just so numbers can be compared with 12, but sounds good!

view this post on Zulip starseeker (Mar 06 2021 at 23:23):

You had indicated you wanted to do the final upload to the BRL-CAD github site - after setting the origin, this is what I use to push to upload everything:

git push --all -u origin && git push --follow-tags

view this post on Zulip starseeker (Mar 06 2021 at 23:25):

I'm not sure if a basic clone from github will get all the branches, so I'd recommend pulling a mirror clone:

git clone --mirror https://github.com/starseeker/brlcad_conv18.git
cd brlcad_conv18.git
git remote set-url origin git@github.com:BRL-CAD/brlcad.git
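
The mirror-clone idea can be tried locally before touching the real remotes. This is only a sketch with scratch repositories standing in for both ends; the branch and tag names are hypothetical, and it uses `git push --mirror` as one possible way to send every ref, not the push command quoted above.

```shell
work="$(mktemp -d)"

# Stand-in for the converted repository: one extra branch and one tag.
git init -q "$work/src"
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git -C "$work/src" branch STABLE
git -C "$work/src" tag rel-0-0-1

# Stand-in for the empty destination repository.
git init -q --bare "$work/dest.git"

# A mirror clone copies every ref, not just the default branch.
git clone -q --mirror "$work/src" "$work/mirror.git"
git -C "$work/mirror.git" remote set-url origin "$work/dest.git"
git -C "$work/mirror.git" push -q --mirror origin

# List the refs that arrived at the destination.
git -C "$work/dest.git" for-each-ref --format='%(refname)'
```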

view this post on Zulip Erik (Mar 07 2021 at 01:00):

I find myself using --git-dir a lot in scripts (cuz pwd is so passe)

view this post on Zulip starseeker (Mar 07 2021 at 14:40):

@Erik that's helpful, thanks!

view this post on Zulip starseeker (Mar 07 2021 at 14:41):

where is that documented?

view this post on Zulip starseeker (Mar 07 2021 at 14:58):

@Sean did the updated run succeed? (by the way, I think that limit can be set with: git config diff.renameLimit 999999 )

view this post on Zulip starseeker (Mar 07 2021 at 14:59):

git config merge.renamelimit 999999 may also be relevant

view this post on Zulip Erik (Mar 07 2021 at 15:11):

@starseeker: man page? :D a lot of cmds have similar (ninja -C, make -C, cmake -B <dir> -S <dir> ..)

view this post on Zulip starseeker (Mar 07 2021 at 16:04):

@Erik Oh, I see where I went wrong - it's a top level option supplied before the subcommands, so it's not in their --help statements.
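
A quick sketch of that ordering, using a scratch repository: `--git-dir` and `--work-tree` go between `git` and the subcommand, so a script never has to change directory. The path and commit message are made up for the demo.

```shell
repo="$(mktemp -d)"
git init -q "$repo"

# Top-level options come before the subcommand, not after it.
git --git-dir="$repo/.git" --work-tree="$repo" \
    -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"

# Query the repository from wherever the script happens to be running.
count="$(git --git-dir="$repo/.git" rev-list --count HEAD)"
echo "$count"
```

Top-level options like these are described in the main git manual page (`man git`).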

view this post on Zulip Erik (Mar 07 2021 at 16:12):

yeah, it's a strange beast, git args/cmds are applied in order with side effects.

view this post on Zulip Sean (Mar 08 2021 at 14:30):

starseeker said:

Sean did the updated run succeed? (by the way, I think that limit can be set with: git config diff.renameLimit 999999 )

That's what I set, and it's needed to get the right diffs/logs for our history. The problem is that config is per cloning, so have to remember to do it every time.
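
One way around the per-clone setup is to store the limits with `git config --global`, which writes to the personal config instead of the repository's. The sketch below points HOME at a scratch directory so the demo leaves the real ~/.gitconfig alone; that redirection is only for the demo and is not part of the fix.

```shell
# Scratch HOME so this demo does not touch the real ~/.gitconfig;
# omit these two lines to change your actual personal config.
HOME="$(mktemp -d)"
export HOME

# --global stores the value per user, so every clone picks it up.
git config --global diff.renameLimit 999999
git config --global merge.renameLimit 999999

# Read the value back from the personal config.
git config --global --get diff.renameLimit
```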

view this post on Zulip Sean (Mar 08 2021 at 14:31):

Working on the upload, I think we're golden...

view this post on Zulip Sean (Mar 08 2021 at 14:31):

Few changes in the numbers I've been looking at, but nothing turning the train around.

view this post on Zulip Sean (Mar 08 2021 at 14:33):

Could use a test, but the svn repo "should" be read only now for everyone.

view this post on Zulip starseeker (Mar 08 2021 at 14:34):

I'll poke it quick

view this post on Zulip starseeker (Mar 08 2021 at 14:36):

Bingo:

svn: E000013: Commit failed (details follow):
svn: E000013: Can't open file '/svn/p/brlcad/code/db/txn-current-lock': Permission denied

view this post on Zulip Sean (Mar 08 2021 at 14:40):

Excellent.

view this post on Zulip Sumagna Das (Mar 08 2021 at 15:43):

The migration is almost complete then I guess

view this post on Zulip Sean (Mar 08 2021 at 15:44):

Yes it is, @Sumagna Das ... coming super soon. Just tallying some final stats.

view this post on Zulip Sumagna Das (Mar 08 2021 at 15:45):

Awesome.🤩🤩

view this post on Zulip Sean (Mar 08 2021 at 15:46):

The migration of one of the world's oldest continuously developed source code repositories should be complete later today... ;)

view this post on Zulip Sumagna Das (Mar 08 2021 at 15:48):

Noice

view this post on Zulip Sumagna Das (Mar 08 2021 at 19:17):

how many more hours will it take from now to see the repository on the github?

view this post on Zulip Sumagna Das (Mar 08 2021 at 19:18):

will it be up tomorrow morning (for me) as it is past 12 midnight?

view this post on Zulip Sean (Mar 08 2021 at 21:26):

@Sumagna Das I'm hopeful, but I'm still trying to figure out something that changed.

view this post on Zulip starseeker (Mar 08 2021 at 21:31):

Can I assist?

view this post on Zulip starseeker (Mar 08 2021 at 21:39):

@Sean I'll be back on a bit later - please post anything I can help with. I'll be glad to re-run the final fixup pass again if necessary...

view this post on Zulip Sean (Mar 08 2021 at 21:41):

I need to make sure it's not something I did differently. I should know here in a bit. I need to pull conv12 again to confirm.

view this post on Zulip Sean (Mar 08 2021 at 21:42):

Still might not be enough to stop the gravy train, but it was a surprise. I'm hoping I just fat-fingered something.

view this post on Zulip starseeker (Mar 08 2021 at 21:46):

svn commit counts changed?

view this post on Zulip Sean (Mar 08 2021 at 21:50):

numbers are off. I don't want to speculate too much until I rule out a couple things.

view this post on Zulip starseeker (Mar 08 2021 at 21:51):

K. The good news is that as long as I don't need to do significant rework in the repowork C++, the post-brlcad_conv12 portion of the conversion runs pretty quickly.

view this post on Zulip starseeker (Mar 08 2021 at 21:52):

/me will be chewing off his fingernails elsewhere for a few hours...

view this post on Zulip starseeker (Mar 09 2021 at 17:50):

@Sean How's it going?

view this post on Zulip Sean (Mar 10 2021 at 05:58):

It is done. https://github.com/BRL-CAD/brlcad

view this post on Zulip Sean (Mar 10 2021 at 06:21):

@starseeker Please check and let me know if you see any mistakes.

view this post on Zulip Sean (Mar 10 2021 at 06:22):

I was ultimately able to reconcile most of the big differences, many looked like branch commit improvements (e.g., looks like you categorically eliminated the "initially added on branch" commits, that was 118 of them).

view this post on Zulip Sumagna Das (Mar 10 2021 at 08:13):

Can I clone the repo right now or are you guys still checking if there are any issues?

view this post on Zulip starseeker (Mar 10 2021 at 11:05):

Awesome - thank you @Sean for grinding through it!

view this post on Zulip starseeker (Mar 10 2021 at 11:06):

@Sumagna Das Give me a couple hours to check - this is almost certainly it, but I've got a couple things I need to do before I can focus properly on it.

view this post on Zulip Sumagna Das (Mar 10 2021 at 11:07):

ohk

view this post on Zulip Erik (Mar 10 2021 at 11:56):

(starseeker can focus properly now? :astonished: ) :D

view this post on Zulip starseeker (Mar 10 2021 at 12:14):

<snort> Only on one thing at a time. There are days when that's a significant handicap...

view this post on Zulip starseeker (Mar 10 2021 at 12:35):

OK, pull request and direct commit both worked, branches are present, tags are present, logs match, Contributors is populated. Looks good!

view this post on Zulip starseeker (Mar 10 2021 at 12:37):

Looks like I should have made that a rebase for the pull request... generated a merge commit too. Oh well.

view this post on Zulip starseeker (Mar 10 2021 at 13:18):

@Sumagna Das Looks like it's good to go.

view this post on Zulip Sean (Mar 10 2021 at 14:43):

I'm not as concerned about having a "clean history", let it be what it be. ;)

view this post on Zulip starseeker (Mar 10 2021 at 14:44):

The thought that worried me is that checking out (say) prior release tags will produce checkouts that won't build.

view this post on Zulip Sean (Mar 10 2021 at 14:44):

@starseeker I hadn't looked at permissions yet, can do that today. There's a lot to do.

view this post on Zulip starseeker (Mar 10 2021 at 14:45):

Unfortunately, the fix requires a full (multi-week) re-run of the full process...

view this post on Zulip Sean (Mar 10 2021 at 14:45):

what do you mean??

view this post on Zulip starseeker (Mar 10 2021 at 14:46):

If we want (say) 7.30.0 to check out with tkhtml in a build-able state, I need to re-generate the history leaving the tkhtml RCS tags in place. That's an adjustment to the filters, which means a full re-run.

view this post on Zulip Sean (Mar 10 2021 at 15:04):

Hm, I'm still not following. Why would a checkout of the rel-7-30-0 tag be any different than what it was? Is it not right? Or not right for src/other because of how history was spliced?

view this post on Zulip starseeker (Mar 10 2021 at 15:06):

It's not right because I made a point of stripping out the RCS tags to make the git history following cleaner. So, for example,

static const char rcsid[] = "$Id: cssparser.c,v 1.8 2008/01/19 06:08:13 danielk1977 Exp $";

became

static const char rcsid[] = "$Id$";

I exempted a few specific directories early on that were problematic (mostly the step related stuff) but tkhtml was one of the ones that got stripped.

view this post on Zulip starseeker (Mar 10 2021 at 15:07):

It never crossed my mind that those headers might be a compilation necessity, and apparently for all the scrutiny I put on the conversion (diffing, log messages, etc.) I never actually tried a full compile of the generated checkout.

view this post on Zulip starseeker (Mar 10 2021 at 15:10):

It looks like tkhtml does some cute trick where it generates a list of source files that go into a generated file, and the script that generates that source file is matching on those rcsid lines.

view this post on Zulip starseeker (Mar 10 2021 at 15:28):

Actually, I should probably confirm that they were originally populated in the raw SVN data - since SVN will do RCS keyword expansion, it's theoretically possible that they were stored unevaluated internally. If that's the case, even exempting src/other/tkhtml won't fix it because Git doesn't populate RCS tags.

view this post on Zulip starseeker (Mar 10 2021 at 15:31):

I'm not sure what to do in that case, actually, short of using something like https://github.com/turon/git-rcs-keywords

view this post on Zulip starseeker (Mar 10 2021 at 16:03):

@Sean It looks like git-rcs-keywords can populate the RCS tags.

view this post on Zulip Erik (Mar 10 2021 at 16:23):

a force push to fix something like that would be pretty traumatic once this is "the way"

view this post on Zulip starseeker (Mar 10 2021 at 16:27):

I know. I think the answer is going to be git-rcs-keywords

view this post on Zulip starseeker (Mar 10 2021 at 16:27):

I'm testing now, and I think it can work.

view this post on Zulip starseeker (Mar 10 2021 at 16:44):

OK. Got it.

view this post on Zulip starseeker (Mar 10 2021 at 16:46):

I'm working with the https://github.com/kimmormh/git-rcs-keywords fork - there may be a better way to go at it, but that's functional in testing.

view this post on Zulip starseeker (Mar 10 2021 at 16:50):

Here's how it works:

Install the two scripts (rcs-keywords.clean and rcs-keywords.smudge) to /usr/local/share/git_filters

Add the following section to ~/.gitconfig

[filter "rcs-keywords"]
        clean  = /usr/local/share/git_filters/rcs-keywords.clean
        smudge = /usr/local/share/git_filters/rcs-keywords.smudge %f

Check out the git repository:

git clone https://github.com/BRL-CAD/brlcad.git

Copy the attached file to brlcad/.git/info/attributes: tkhtml_attributes

Note that attributes is the file name, not a directory - the file should be renamed.

view this post on Zulip starseeker (Mar 10 2021 at 16:54):

What this will do is match the particular tkhtml files in question, and populate the tags. Placing it in .git/info (rather than .gitattributes) means it will be active for all checkout activities (a .gitattributes file wouldn't exist in older checkouts, defeating the purpose.)

I've adjusted main's copy of Tkhtml to not use an RCS tag for what it is doing, so it will not be altered by this filter. The older checkouts still using the $Id: tag, which are the ones that need to be populated, will match and be populated (thus being viable for compilation.)

view this post on Zulip starseeker (Mar 10 2021 at 16:56):

The attributes file specifically calls out only the tkhtml files in question, to minimize processing time overall.

view this post on Zulip starseeker (Mar 10 2021 at 16:59):

@Sean If that looks workable to you, I can write it up for inclusion in the src tree

view this post on Zulip Erik (Mar 10 2021 at 17:33):

and then a multi-hour regeneration and multi-day re-review?

view this post on Zulip starseeker (Mar 10 2021 at 17:55):

@Erik pardon? the rcs-keywords works with the existing repository

view this post on Zulip Erik (Mar 10 2021 at 17:56):

oh, awesome, i thought it'd be a history rewrite to add the appropriate bits

view this post on Zulip starseeker (Mar 10 2021 at 17:56):

Going that route means we don't need to worry about re-inserting any RCS expansions into the history - they're just populated on checkout

view this post on Zulip starseeker (Mar 10 2021 at 17:57):

If we want it to work without any RCS expansion, yes - that's a multi-day rewrite, not multi-hour. If, however, we use the filters to do the expansion just where we need to, all a user has to do is set up the .gitconfig and attributes file.

view this post on Zulip starseeker (Mar 10 2021 at 18:00):

That has the advantage, once set up, of giving us expanded RCS keywords anywhere we need them. I'm 90% sure the expanded tkhtml tags were originally in the commit history, but if I'm wrong about that even a full regeneration of the history wouldn't be enough - I'd actually have to inject the expanded tags into the commit history.

view this post on Zulip Erik (Mar 10 2021 at 18:01):

cool, please leave breadcrumbs for the next poor fool who tries to compile something old :grinning_face_with_smiling_eyes: (i had a 43bsd compiling an old old version in a simh vax11, crazy people will do crazy unexpected things...)

view this post on Zulip starseeker (Mar 10 2021 at 18:02):

The drawback of this is it's not a "working out of the box" solution - because Git has no way to expand RCS tags by default, it requires some work by the user to prepare the solution.

The best we can do is pre-bake everything and tell folks exactly how to set it up.

view this post on Zulip starseeker (Mar 10 2021 at 18:04):

@Erik And that's the primary drawback - someone coming into things cold and not knowing they need to set up RCS expansion for older checkouts.

view this post on Zulip starseeker (Mar 10 2021 at 18:20):

Ooo! There might be an even better way...

view this post on Zulip starseeker (Mar 10 2021 at 18:21):

/me tests

view this post on Zulip starseeker (Mar 10 2021 at 18:21):

Sweet!!!!

view this post on Zulip starseeker (Mar 10 2021 at 18:22):

We don't need to mess with rcs-keywords at all.

view this post on Zulip starseeker (Mar 10 2021 at 18:23):

All we need to do is put this file at .git/info/attributes : attributes

view this post on Zulip Sean (Mar 10 2021 at 18:23):

Do tell, but I have a couple thoughts on this. First off, I'm not terribly concerned about tkhtml working but would be concerned if there's not a simple workaround that can be discovered when the failure is encountered.

view this post on Zulip Sean (Mar 10 2021 at 18:24):

what's that attributes file do? looks like it'll match on any files with those names??

view this post on Zulip starseeker (Mar 10 2021 at 18:25):

Yes, it's a filename match. I haven't figured out yet how to do a full-path match successfully.

view this post on Zulip starseeker (Mar 10 2021 at 18:25):

It uses the ident property as documented here: https://git-scm.com/docs/gitattributes

view this post on Zulip Sean (Mar 10 2021 at 18:26):

looks like src/other/tkhtml/** is what you want

view this post on Zulip starseeker (Mar 10 2021 at 18:26):

As it happens $Id$ is the RCS keyword at issue, and although the Git expansion is totally different from RCS/CVS/SVN it satisfies the compilation requirement.

view this post on Zulip starseeker (Mar 10 2021 at 18:26):

/me tries

view this post on Zulip Sean (Mar 10 2021 at 18:27):

I'm still more concerned about what the error looks like... is there a non-git workaround possible?

view this post on Zulip Sean (Mar 10 2021 at 18:27):

like "install system tkhtml" if that happened to work (which I bet it doesn't)

view this post on Zulip starseeker (Mar 10 2021 at 18:28):

Um. It might, actually, if anyone packages it...

view this post on Zulip Sean (Mar 10 2021 at 18:28):

or comment something in cmake

view this post on Zulip Sean (Mar 10 2021 at 18:28):

we test and detect tkhtml??

view this post on Zulip Sean (Mar 10 2021 at 18:28):

I thought that was one we didn't check

view this post on Zulip starseeker (Mar 10 2021 at 18:29):

At one point I did try, but I may have removed it - it's impossible on headless build nodes and problematic otherwise (almost no systems install Tkhtml)

view this post on Zulip starseeker (Mar 10 2021 at 18:29):

It's set up as a 3rd party Tcl package build in CMake, so maybe.

view this post on Zulip starseeker (Mar 10 2021 at 18:29):

/me tries

view this post on Zulip Sean (Mar 10 2021 at 18:30):

if there is a simple way to turn it off (even if it disabled the man viewer), that'd be an acceptable workaround

view this post on Zulip starseeker (Mar 10 2021 at 18:31):

Looks like the src/other/tkhtml/** line works.

view this post on Zulip starseeker (Mar 10 2021 at 18:32):

I'm fairly sure I didn't set up to disable just the Tkhtml dependent components.

view this post on Zulip starseeker (Mar 10 2021 at 18:33):

I could do it now of course, but that wouldn't help older builds.

view this post on Zulip starseeker (Mar 10 2021 at 18:33):

Let me see if I can get a system tkhtml install.

view this post on Zulip Sean (Mar 10 2021 at 18:33):

what happens if you -DENABLE_TKHTML=OFF ?

view this post on Zulip Sean (Mar 10 2021 at 18:34):

or if you just comment out the THIRD_PARTY_TCL_PACKAGE line in src/other/CMakeLists.txt

view this post on Zulip Sean (Mar 10 2021 at 18:35):

those would be generic workarounds we could live with

view this post on Zulip starseeker (Mar 10 2021 at 18:35):

Configure fails. (unnoticed dependency in tktable build on tkhtml build logic being loaded first.)

view this post on Zulip Sean (Mar 10 2021 at 18:36):

damn

view this post on Zulip starseeker (Mar 10 2021 at 18:36):

Actually, wait... let me double check my SVN state.

view this post on Zulip starseeker (Mar 10 2021 at 18:37):

Yeah, OK - it actually did find the system tkhtml, but then tktable wasn't happy. However...

view this post on Zulip starseeker (Mar 10 2021 at 18:38):

If we install BOTH tkhtml and tktable, that worked.

view this post on Zulip starseeker (Mar 10 2021 at 18:38):

/me builds

view this post on Zulip starseeker (Mar 10 2021 at 18:41):

Urmf. It builds successfully, but at least on Ubuntu those packages don't seem to work - Archer can't load and while MGED will load, the man viewer won't come up.

view this post on Zulip starseeker (Mar 10 2021 at 18:47):

rtwizard will load

view this post on Zulip Sean (Mar 10 2021 at 18:48):

/me facepalms sadly

view this post on Zulip Sean (Mar 10 2021 at 18:48):

so what's the minimum to turn it all off?

view this post on Zulip Sean (Mar 10 2021 at 18:48):

comment package line for both tkhtml and tktable?

view this post on Zulip Sean (Mar 10 2021 at 18:49):

it's so absurd that they did that with the $Id tag... ugh.

view this post on Zulip starseeker (Mar 10 2021 at 18:49):

That would produce the same result. Archer requires them and so wouldn't work, and MGED would work but without the man viewer...

view this post on Zulip starseeker (Mar 10 2021 at 18:50):

Installing the system tkhtml and tktable produced as much of a working config as we could expect without tkhtml/tktable built.

view this post on Zulip Sean (Mar 10 2021 at 18:50):

because of require lines in archer?

view this post on Zulip starseeker (Mar 10 2021 at 18:50):

I believe so - Archer's right panel uses tktable, and both the help viewer and man viewer use tkhtml. The way the Itcl class system works, if I'm remembering correctly, it all gets defined and loaded up front.

view this post on Zulip Erik (Mar 10 2021 at 18:51):

** will do recursive globbing into subdirs, you can prefix / as the system root, too ( so /db/**/*.g )

view this post on Zulip Sean (Mar 10 2021 at 18:51):

okay, so archer's got to have it

view this post on Zulip Sean (Mar 10 2021 at 18:51):

what about surgery?

view this post on Zulip Erik (Mar 10 2021 at 18:51):

aieee, not bold, double asterisk will do recursive globbing :D

view this post on Zulip Sean (Mar 10 2021 at 18:52):

that's a finite list -- can it be reduced to a one-line edit in tkhtml's build logic with the list of file names that it would have extracted (or a glob)?

view this post on Zulip starseeker (Mar 10 2021 at 18:53):

The simplest possible source code fix is probably a one line sed line of some sort that replaces all the $Id$ lines with $Id: tkhtml$

view this post on Zulip starseeker (Mar 10 2021 at 18:53):

That'll be way simpler than trying to do anything with the Archer codebase.

view this post on Zulip starseeker (Mar 10 2021 at 18:54):

The issue is already fixed in the Git main - the only problem is the older checkouts that we can't change.

view this post on Zulip Sean (Mar 10 2021 at 18:57):

right, which I think is a problem unless we can figure out a trivial workaround :(

view this post on Zulip Sean (Mar 10 2021 at 18:58):

otherwise, there'd be almost no point to all the old tags since all the ones since tkhtml was introduced won't/can't work at all

view this post on Zulip starseeker (Mar 10 2021 at 18:58):

echo "/src/other/tkhtml/** ident" > .git/info/attributes doesn't count as trivial?
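
To see what that one-liner does, here is a sketch in a scratch repository. The file demo.c is a hypothetical stand-in for the real tkhtml sources. Git's `ident` attribute expands `$Id$` to the blob's object name on checkout, which differs from what RCS/CVS/SVN would have written.

```shell
repo="$(mktemp -d)"
git init -q "$repo"

# Commit a file carrying an unexpanded $Id$ keyword.
mkdir -p "$repo/src/other/tkhtml/src"
printf 'static const char rcsid[] = "$Id$";\n' > "$repo/src/other/tkhtml/src/demo.c"
git -C "$repo" add .
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add demo.c"

# Enable ident expansion locally, outside the tracked tree.
mkdir -p "$repo/.git/info"
echo "/src/other/tkhtml/** ident" > "$repo/.git/info/attributes"

# Expansion happens when the file is written out, so force a fresh checkout.
rm "$repo/src/other/tkhtml/src/demo.c"
git -C "$repo" checkout -q -- src/other/tkhtml/src/demo.c
cat "$repo/src/other/tkhtml/src/demo.c"
```

Because the setting lives in .git/info/attributes, it applies to whichever tag or branch is checked out afterwards, including ones that predate any .gitattributes file.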

view this post on Zulip starseeker (Mar 10 2021 at 19:00):

find . -path \*tkhtml\* -type f -exec sed -i 's/\$Id\$/\$Id: tkhtml\$/g' {} \; should also do the trick.

view this post on Zulip Sean (Mar 10 2021 at 19:00):

it's a solution, but then it assumes git and is completely non-intuitive... that doesn't feel right

view this post on Zulip Sean (Mar 10 2021 at 19:01):

I'm not seeing where Id is being used...

view this post on Zulip Sean (Mar 10 2021 at 19:01):

in either tkhtml or tktable

view this post on Zulip starseeker (Mar 10 2021 at 19:01):

In main it's not, any longer. It was in mkdefaultstyle.tcl

view this post on Zulip starseeker (Mar 10 2021 at 19:02):

src/other/tkhtml/src/mkdefaultstyle.tcl:21

view this post on Zulip Sean (Mar 10 2021 at 19:04):

ah, ${DOLLAR} is why...

view this post on Zulip Sean (Mar 10 2021 at 19:05):

so looks like that writes out four files...

view this post on Zulip Sean (Mar 10 2021 at 19:11):

Hm, I'm not following this logic ... I think I need to get a branch and see how it was

view this post on Zulip starseeker (Mar 10 2021 at 19:12):

You mean what the tags originally looked like?

view this post on Zulip Sean (Mar 10 2021 at 19:12):

no, I'm looking at the logic in mkdefaultstyle.tcl

view this post on Zulip Sean (Mar 10 2021 at 19:13):

it's writing out #define lines (to whomever is calling mkdefaultstyle.tcl), reading from just four files (html.css, tkhtml.css, quirks.css,

view this post on Zulip Sean (Mar 10 2021 at 19:13):

and *.c (okay, so not four files, but four categories)

view this post on Zulip starseeker (Mar 10 2021 at 19:13):

https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/src/other/tkhtml/src/mkdefaultstyle.tcl

view this post on Zulip Sean (Mar 10 2021 at 19:14):

yes, that's what I'm looking at...

view this post on Zulip starseeker (Mar 10 2021 at 19:14):

That file is very old... I don't know that I ever changed it.

view this post on Zulip Sean (Mar 10 2021 at 19:15):

I'm saying I don't see yet what that file is actually doing.

view this post on Zulip Sean (Mar 10 2021 at 19:15):

because the logic says it's writing out #define lines, but none of them exist (yet)

view this post on Zulip starseeker (Mar 10 2021 at 19:16):

They go into build/src/other/tkhtml/htmldefaultstyle.c

view this post on Zulip Sean (Mar 10 2021 at 19:16):

during compilation

view this post on Zulip starseeker (Mar 10 2021 at 19:16):

Target in CMakeLists.txt line 34

view this post on Zulip starseeker (Mar 10 2021 at 19:17):

Yes - custom build step, prior to building Tkhtml lib

view this post on Zulip Sean (Mar 10 2021 at 19:18):

hm, so maybe ... can you generate it on trunk and see what happens on branch if you just drop htmldefaultstyle.c into the build dir?

view this post on Zulip starseeker (Mar 10 2021 at 19:19):

So, check out an older branch in git, take the trunk copy of that file, and drop it in the build dir?

view this post on Zulip starseeker (Mar 10 2021 at 19:21):

That seemed to work.

view this post on Zulip starseeker (Mar 10 2021 at 19:24):

@Sean You're thinking to stash the older htmldefaultstyle.c somewhere and just have folks drop it into the build directory?

view this post on Zulip Sean (Mar 10 2021 at 19:26):

something like that, it's agnostic

view this post on Zulip Sean (Mar 10 2021 at 19:27):

does it work if it's dropped into the source tree? could commit changes to the tags...

view this post on Zulip Sean (Mar 10 2021 at 19:27):

course, could just expand the Ids in the tags too..

view this post on Zulip Sean (Mar 10 2021 at 19:27):

one time patching

view this post on Zulip starseeker (Mar 10 2021 at 19:27):

Yeah, that'd be the way to go. That file is entirely build dir generated.

view this post on Zulip Sean (Mar 10 2021 at 19:28):

what about just checking out all tags and committing the Id expansion?

view this post on Zulip Sean (Mar 10 2021 at 19:28):

does that screw with anything?

view this post on Zulip starseeker (Mar 10 2021 at 19:29):

I think we'd need to make new branches for those tags that don't already have one - a tag in git is just a pointer to a commit, so we'd be making a new branch from that change for the new tag.

view this post on Zulip starseeker (Mar 10 2021 at 19:30):

If we did that I'd want to use the same fix I made to main, so we can still keep the attributes option without breaking anything. The attributes solution allows arbitrary older commits to work, tag fixes will only address the tags.

view this post on Zulip starseeker (Mar 10 2021 at 19:30):

Actually though, with an ident based solution there, it probably won't matter anyway.

view this post on Zulip Sean (Mar 10 2021 at 19:31):

right, I was thinking tags==branches

view this post on Zulip Sean (Mar 10 2021 at 19:31):

going to take a while to adjust my mental model

view this post on Zulip starseeker (Mar 10 2021 at 19:31):

/me got it pounded into him for a year writing converter logic ;-)

view this post on Zulip Sean (Mar 10 2021 at 19:31):

they're just dirs in svn, so you can keep committing to the tag/branch

view this post on Zulip starseeker (Mar 10 2021 at 19:32):

Right. The ones where we did that I believe have branches already (or they should) since we can't commit to tags. However, if any tags weren't edited that'll be new branches to introduce.

view this post on Zulip starseeker (Mar 10 2021 at 19:35):

/me is game to do that if you think it's the best solution. I'm also willing (gulp) to redo the process with the correct tkhtml contents, if you decide that's best - it was my mistake, and I'll do what it takes to fix it.

view this post on Zulip Sean (Mar 10 2021 at 19:35):

let me do a checkout to see how confusing it actually looks when it fails

view this post on Zulip starseeker (Mar 10 2021 at 19:36):

I've been using git checkout rel-7-32-2 fwiw

view this post on Zulip Sean (Mar 10 2021 at 19:36):

it's not just on you... checking out a couple tags was on my verification list and I got lazy and skipped that one ... :(

view this post on Zulip Sean (Mar 10 2021 at 22:58):

looks like after turning off STRICT, adding include(CheckSymbolExists), all that was required to at least compile was commenting out the line it failed on -- Tcl_SetResult(interp, HTML_SOURCE_FILES, TCL_STATIC); in htmltcl.c:2787

view this post on Zulip Sean (Mar 10 2021 at 22:58):

and then both archer and mged work

view this post on Zulip Sean (Mar 10 2021 at 22:59):

I think that's an acceptable hardship since there's always going to be tweaking of old builds required.

view this post on Zulip starseeker (Mar 10 2021 at 23:23):

@Sean Actually, I may have just coaxed repowork into redoing just the tkhtml sources: https://github.com/starseeker/brlcad_tkhtml_fix

view this post on Zulip starseeker (Mar 10 2021 at 23:23):

Testing now, stand by...

view this post on Zulip starseeker (Mar 10 2021 at 23:27):

Ah hah! rel-7-32-2 builds clean.

view this post on Zulip starseeker (Mar 10 2021 at 23:30):

@Daniel Rossberg , @Sumagna Das - don't do anything yet with your forks. I may have achieved a deeper fix for the tkhtml issue, and if so it will require swapping out the current git repository.

view this post on Zulip starseeker (Mar 10 2021 at 23:31):

@Sean This is not a brlcad_conv12 derivative - it is the current repository up on BRL-CAD/brlcad with a targeted set of blob sha1 replacements centered on just the relevant Tkhtml files.

view this post on Zulip starseeker (Mar 10 2021 at 23:32):

As such it has the commits we had already made to that repository. Indeed, the commit where I had restored the rcsid strings is now in this new repository almost a no-op (there was one .txt file that got its revision restored in that commit but not in my remapping work.)

view this post on Zulip starseeker (Mar 10 2021 at 23:37):

Once I contemplated re-running the whole conversion again to avoid filtering the tkhtml sources, I realized it was far easier to take the existing conversion (which already has a multitude of corrections applied that would be tricky to re-apply again) and target just the necessary sources. So I scripted a git log --follow of all the relevant .c files in git, and for each log entry for each file extracted the svn revision. Then I checked out both the git and the SVN tkhtml sources for the corresponding revisions, calculated the git hashes for the git and svn versions of the file, and made a git fast-import blob out of each revision of the svn version of the file. That gave me a way to map the git internal state to what it should have been had the SVN files been properly included, and also gave me the raw blob inputs to feed to fast import so the blobs would be available to reference.
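
The "git hashes" in that mapping are blob ids, which depend only on file contents. As a hedged illustration (not the script actually used), a blob id is the SHA-1 of the header `blob <size>`, a NUL byte, and the contents. The helper name below is made up, and `sha1sum` from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
# Sketch only: compute the id git would assign to a file's contents as a blob.
# git_blob_sha1 is a hypothetical helper name; sha1sum is assumed installed.
set -eu

git_blob_sha1() {
    size=$(wc -c < "$1")
    # Hash the header "blob <size>\0" followed by the raw file contents.
    { printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

sample=$(mktemp)
printf 'hello\n' > "$sample"
hello_id=$(git_blob_sha1 "$sample")

: > "$sample"
empty_id=$(git_blob_sha1 "$sample")

echo "$hello_id"
echo "$empty_id"
```

Running `git hash-object <file>` on the same contents should print the same ids, which is why an SVN checkout and a git checkout of one file can be compared by hash alone.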

view this post on Zulip starseeker (Mar 10 2021 at 23:39):

It looks as if it worked. To switch this repository in I would recommend deleting BRL-CAD/brlcad and re-creating it, since all the revisions after somewhere in the 32k range will have different SHA1 values.

view this post on Zulip starseeker (Mar 11 2021 at 00:10):

Here's the set of commits where those files changed:

32868
32899
40243
40405
47234
49599
49602
49607
49608
49609
49611
49613
57890

view this post on Zulip starseeker (Mar 11 2021 at 01:44):

Sigh. Another oddity. Now that I can try distcheck, run.sh isn't set to executable and benchmark isn't cleaning up properly.

view this post on Zulip starseeker (Mar 11 2021 at 02:49):

https://github.com/starseeker/brlcad_bench_fix supersedes the tkhtml_fix repository - it has those fixes and 100755 setting for the SVN executable files that git didn't have listed.

For the latter it's a brute force approach - I generated a list of all the file paths rel-7-32-2 had set executable that git did not, and set those paths' modes to 100755 throughout the git history. I didn't attempt an analysis of when SVN did or didn't set that property on those files.

make distcheck now passes on a git checkout rel-7-32-2

view this post on Zulip starseeker (Mar 11 2021 at 04:10):

/me fires distcheck-full on rel-7-32-2, notes that brain has reached "E", and heads off to the charging station...

view this post on Zulip Daniel Rossberg (Mar 11 2021 at 07:26):

Thanks for the warning. I removed my fork for now.

view this post on Zulip Sumagna Das (Mar 11 2021 at 08:19):

Should I also remove the fork?

view this post on Zulip starseeker (Mar 11 2021 at 12:53):

@Sumagna Das I would suggest doing so - @Sean will need to review what I've done and see what he wants to do.

view this post on Zulip starseeker (Mar 11 2021 at 12:55):

@Sean distcheck-full succeeded on rel-7-32-2 from the brlcad_bench_fix repo

view this post on Zulip starseeker (Mar 11 2021 at 14:53):

One additional quirk I just hit, but something that's not specific to our setup as far as I can tell - git checked out the .3dm file in text mode by default on Windows. When I added .3dm binary to the .gitattributes file that seems to address it, but since older checkouts won't have a .gitattributes file on Windows they'll probably get the wrong checkout by default.

view this post on Zulip starseeker (Mar 11 2021 at 15:03):

Workarounds would either be the .git/info/attributes approach discussed earlier for $Id$, or setting .3dm in a global attributes file: https://stackoverflow.com/a/28027656

view this post on Zulip starseeker (Mar 11 2021 at 15:10):

With that caveat, brlcad_bench_fix rel-7-32-2 built successfully on Windows with MSVC

view this post on Zulip starseeker (Mar 11 2021 at 15:35):

rel-7-32-0 distcheck-full passed on Ubuntu

view this post on Zulip starseeker (Mar 11 2021 at 16:11):

rel-7-32-2 enable-all build succeeded on OSX

view this post on Zulip starseeker (Mar 11 2021 at 16:24):

rel-7-30-8 is too old to distcheck vanilla on this box without modding the system (system proj_api.h interferes )

view this post on Zulip starseeker (Mar 11 2021 at 16:25):

@Sean If you have other tests you'd like me to do I'm game...

view this post on Zulip Erik (Mar 11 2021 at 16:28):

what about 7.0? :D I think that was the first release I contributed to (fbsd support and autoconf)

view this post on Zulip Erik (Mar 11 2021 at 16:29):

"here's a cd with fbsd, here's a cd with our source code, here's a computer. We'll try to get you on the network in the next couple of weeks." haha, the good old days :D

view this post on Zulip starseeker (Mar 11 2021 at 16:30):

/me chuckles. I'd almost certainly need a VM to try building something that old.

view this post on Zulip Erik (Mar 11 2021 at 16:35):

speaking of! A lot of my life lately is building singularity and occasionally docker images. I know we had a raw disk image a while back for loading into vmware or bochs or whatever, do we/should we(you) provide container images? :D

view this post on Zulip starseeker (Mar 11 2021 at 16:36):

@Erik Checking the diffs, it looks like whitespace changes (line endings) and expanded vs. unexpanded RCS tags. Plus a couple files like .cvsignore and .gitignore

view this post on Zulip Erik (Mar 11 2021 at 16:38):

he, hehe, he, yeh... 'find . -type f | xargs sed -i.bak 's/[ ^t]*//'` ... I think I did an indent back then, too...

view this post on Zulip starseeker (Mar 11 2021 at 16:39):

I mean differences between the SVN and git checkouts. Although if git's history following breaks in there I'll know who to blame ;-)

view this post on Zulip Erik (Mar 11 2021 at 17:09):

c'mon, I was new, had to spraypaint my name all over the place and get established, ch'know :D

view this post on Zulip starseeker (Mar 12 2021 at 01:10):

/me tries a crazy idea to handle the .3dm checkout issue...

view this post on Zulip starseeker (Mar 12 2021 at 02:59):

@Sean I've figured out how to insert a .gitattributes file at strategic points in the git history so the .3dm file gets flagged by git as a binary checkout. Ironically, this means the rel-7-32-2 distcheck will break on the default repo_verify step, since by that point: 1. the CMake logic had been taught how to use git for bookkeeping and 2. the .gitattributes file is present and unaccounted for in the CMake logic. However, I think it's still better to insert it - the error message tells the user what flag to supply to avoid the problem (or they can just delete the .gitattributes file, since it has done its job by that point.) A corrupted 3dm file, on the other hand, has no easy fix.

https://github.com/starseeker/brlcad_added_gitattributes

view this post on Zulip Sean (Mar 12 2021 at 05:27):

@starseeker I found a good way to quickly extract all the files ever marked executable. It's a lot more than a handful. Quick scan of just a few dozen revisions found 2635 files. As expected, some are bogus but most look good. I'll do a manual pass over the list in the morning to weed out the ones that clearly shouldn't have exec set. The rest should be harmless.

view this post on Zulip Sean (Mar 12 2021 at 06:01):

Think it's better that the build work and the files be valid/usable on checkout. Distcheck failing isn't critical, so it's a reasonable trade. I would be cautious making more changes like that though. Surgery on the history to inject and edit files is risky in a manner that might not be realized for months and need to be re-uploaded to fix if there's some obscure but important bug.

view this post on Zulip Sean (Mar 12 2021 at 08:12):

Couldn't wait till morning. I went over the list manually, eliminated all the ones that looked like the exec bit was wrong/unnecessary, and here they are: executables.txt

view this post on Zulip Sean (Mar 12 2021 at 08:19):

I only looked at every 250th commit for expediency, but did look through the entire history up through r77000. I also only looked at trunk, so anything only existing on a branch isn't covered. Intentionally delisted all the itcl/itk files, makefile logic, and other outright errors (many of which are still wrong on trunk, albeit harmlessly). Identified/kept 1770 files, but the list could use another pass from fresh eyes.

view this post on Zulip starseeker (Mar 12 2021 at 12:38):

Sean said:

Think it's better that the build work and the files be valid/usable on checkout. Distcheck failing isn't critical, so it's a reasonable trade. I would be cautious making more changes like that though. Surgery on the history to inject and edit files is risky in a manner that might not be realized for months and need to be re-uploaded to fix if there's some obscure but important bug.

Agreed. It might be better in that sense not to change it, even, since the .git/info/attributes answer would also address the issue and does not require history editing.

If we do opt for adding .gitattributes, there is one final question - the repo I posted last night puts a minimal .gitattributes in at two places - once when terra.dsp is introduced, and the second time when the .3dm file is introduced. The .gitattributes contents are focused tightly on those two file extensions. However, if we're going to more closely mimic the SVN checkout behavior, it would actually make more sense to inject a more comprehensive .gitattributes at the beginning of the history that covers more file types. My initial impulse was to go minimal to avoid surprises, but since SVN did have those mime types set there is an argument that it is more surprising for git not to have them. Thoughts?

view this post on Zulip starseeker (Mar 12 2021 at 14:32):

@Sean on the executable files - I set up some checks as well, using a brute force approach. (SSD speeds are nice) I checked all commits for trunk/, and branches/ is finishing up now.

Here is the full, unedited trunk list: trunk.txt

view this post on Zulip starseeker (Mar 12 2021 at 14:33):

(That's all trunk commits from the beginning.)

view this post on Zulip Sean (Mar 12 2021 at 14:42):

@starseeker That's basically the list I started with. I edited it down to the executables.txt list as there are many subtly and blatantly wrong entries in there.

view this post on Zulip Sean (Mar 12 2021 at 14:43):

We shouldn't set all those. There are entire folders that were checked in with executable bit set, including source files, header files, build files, images, ...

view this post on Zulip starseeker (Mar 12 2021 at 14:43):

Agreed.

view this post on Zulip Sean (Mar 12 2021 at 14:43):

I spent an hour whittling it down to what looked like should be the set to set.

view this post on Zulip Sean (Mar 12 2021 at 14:44):

What's the alternative to .gitattributes?

view this post on Zulip starseeker (Mar 12 2021 at 14:44):

I'll give it a quick check to see if a full rev check caught any that the 250-per jumping skipped over, but I don't expect to find much.

view this post on Zulip Sean (Mar 12 2021 at 14:44):

I'm not terribly a fan of littering folders with vcs files.

view this post on Zulip starseeker (Mar 12 2021 at 14:45):

After doing a main checkout, the user can add "*.3dm binary" to the .git/info/attributes file.

view this post on Zulip starseeker (Mar 12 2021 at 14:45):

That has to be a manual step, but because it's not a file in the repo history and it has highest precedence, once it's there any checkouts of other branches or tags will use it.
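
A minimal sketch of that manual step, run in a scratch repository so it can be tried safely. The two paths queried are invented; `git check-attr` only evaluates the attribute rules, so the files do not need to exist.

```shell
#!/bin/sh
# Sketch only: add a per-clone "binary" rule and ask git how it classifies paths.
# db/example.3dm and src/example.c are hypothetical paths used for the query.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .

# .git/info/attributes lives outside the history, so it stays in effect
# across checkouts of older branches and tags.
mkdir -p .git/info
echo '*.3dm binary' >> .git/info/attributes

attr_3dm=$(git check-attr binary -- db/example.3dm)
attr_c=$(git check-attr binary -- src/example.c)
echo "$attr_3dm"
echo "$attr_c"
```

In a real clone only the `echo ... >> .git/info/attributes` line is needed. Files already checked out in text mode would need to be checked out again to pick up the rule.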

view this post on Zulip Sean (Mar 12 2021 at 14:46):

starseeker said:

I'll give it a quick check to see if a full rev check caught any that the 250-per jumping skipped over, but I don't expect to find much.

You're comparing apples to oranges a bit there: the missing entries will include any that lived ephemerally between the 250-revision jumps, but 99% will be intentional removals. I can give you the full list I started with.

view this post on Zulip starseeker (Mar 12 2021 at 14:47):

Either way - I'd be good to go with your list if you're comfortable with it.

view this post on Zulip Sean (Mar 12 2021 at 14:48):

Here's the list I started with -- can compare with this to see what got skipped: executables_250.txt

view this post on Zulip Sean (Mar 12 2021 at 14:50):

Here's what pulls the list: for i in $(seq 0 250 77000) ; do echo $i ; svn -R -r $i propget svn:executable file:///Users/morrison/brlcad.github/svn.sfmirror/code/brlcad/trunk ; done | tee executables.log
Then just sort | uniq -c | sort -nr | awk | sed | sort ;)

view this post on Zulip starseeker (Mar 12 2021 at 14:52):

Here's what got added by all revs: 250_vs_trunk_all.diff

view this post on Zulip starseeker (Mar 12 2021 at 14:56):

Checking all tags, there was only one path that got added compared to trunk - "misc/archlinux/brlcad.sh"

view this post on Zulip Sean (Mar 12 2021 at 15:01):

that's diff'd against executables_250.txt ??

view this post on Zulip starseeker (Mar 12 2021 at 15:01):

250_vs_trunk_all.diff? Yes.

view this post on Zulip Sean (Mar 12 2021 at 15:02):

because that looks more like the list of what I deleted

view this post on Zulip starseeker (Mar 12 2021 at 15:02):

/me redoes it to be sure...

view this post on Zulip Sean (Mar 12 2021 at 15:03):

oh, I posted the wrong file dammits

view this post on Zulip Sean (Mar 12 2021 at 15:04):

gimmie a sec

view this post on Zulip Sean (Mar 12 2021 at 15:06):

This is the full list!
executables_redux250.txt

view this post on Zulip Sean (Mar 12 2021 at 15:07):

FYI, I deleted the brlcad repo :(

view this post on Zulip starseeker (Mar 12 2021 at 15:07):

250_vs_trunk_all_take2.diff

view this post on Zulip starseeker (Mar 12 2021 at 15:08):

Mostly culls - couple configure scripts and the like that may pass...

view this post on Zulip starseeker (Mar 12 2021 at 15:11):

I'm three quarters of the way through the remaining branch checks - so far it looks like a little over 1200 files set exec that are unique to branches.

view this post on Zulip Sean (Mar 12 2021 at 15:40):

I'll comb through that diff list. There are a few in there that should be preserved.

view this post on Zulip starseeker (Mar 12 2021 at 15:49):

Here's the set that are unique to branch commits: branches_uniq.txt

view this post on Zulip starseeker (Mar 12 2021 at 15:50):

Here's the set with some of the obvious culls removed: branches_uniq_reduced.txt

Line used: cat branches_uniq.txt |grep -v \\.h|grep -v \\.msg |grep -v \\.itk |grep -v \\.cpp |grep -v tzdata > branches_uniq_reduced.txt
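
One caveat with that chain: the patterns are unanchored, so `\.h` also drops any path that merely contains ".h" followed by more characters. A hedged alternative is a single `grep -Ev` with the extensions anchored at the end of the line. The sample paths below are invented for the demo and are not taken from branches_uniq.txt.

```shell
#!/bin/sh
# Sketch only: cull by extension with anchored patterns in one grep call.
# Every path in this sample list is hypothetical.
set -eu

list=$(mktemp)
cat > "$list" <<'EOF'
sh/run.sh
include/bu.h
misc/setup.hpux.sh
src/other/tcl/library/msgs/de.msg
src/archer/Archer.itk
src/foo/bar.cpp
src/other/tcl/library/tzdata/UTC
misc/configure
EOF

# Drop .h/.msg/.itk/.cpp files (extension must end the line) and tzdata paths.
reduced=$(grep -Ev '(\.(h|msg|itk|cpp)$|tzdata)' "$list")
echo "$reduced"
```

With the anchored form, misc/setup.hpux.sh survives the cull; the unanchored `\.h` filter would have removed it.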

view this post on Zulip Sean (Mar 12 2021 at 16:55):

still some inappropriates in there
I should be done going through the diff list here in a jiffy after I grab a bite

view this post on Zulip Sean (Mar 12 2021 at 18:02):

@starseeker thoughts on the creo3plugin snafu? inclined to ignore it from an exec bit perspective

view this post on Zulip starseeker (Mar 12 2021 at 18:03):

Agreed. Not significant.

view this post on Zulip starseeker (Mar 12 2021 at 18:11):

@Sean As far as the .gitattributes thing, which option do you want to go with? We've got:

a) Insert minimal .gitattributes files at strategic points (the brlcad_added_gitattributes repo)

b) Insert more fully populated .gitattributes file for overall repo (closer match to SVN mime types in many cases, but problematic if we get unanticipated matches - personally I'm inclined not to do this)

c) No .gitattributes insertions, require user to set either per-checkout attributes or some form of global git attribute.

I'd be OK with a) or c) - if we go with c) however, we'll need to prominently document what to do to get "proper" older checkout behavior on Windows. The .dsp file isn't particularly noticeable if it gets munged up by the checkout, but the 3dm file is...

view this post on Zulip Sean (Mar 12 2021 at 18:14):

Here's the merged reduced set:
executables2.txt

view this post on Zulip Sean (Mar 12 2021 at 18:15):

how does git normally handle the exec bit?

view this post on Zulip starseeker (Mar 12 2021 at 18:16):

I think it stashes it as part of the commit internally.

view this post on Zulip Sean (Mar 12 2021 at 18:16):

ah, I see https://stackoverflow.com/questions/21691202/how-to-create-file-execute-mode-permissions-in-git-on-windows

view this post on Zulip Sean (Mar 12 2021 at 18:16):

yeah, it stashes the mode on commit

view this post on Zulip Sean (Mar 12 2021 at 18:17):

so ... I think it looks like that's a global property that's just set, not something tracked per commit?

view this post on Zulip Sean (Mar 12 2021 at 18:17):

am I reading that right?

view this post on Zulip Sean (Mar 12 2021 at 18:17):

the fact that one can do git update-index --chmod=+x foo.sh

view this post on Zulip Sean (Mar 12 2021 at 18:17):

the "index" being some sort of permissions ledger

view this post on Zulip starseeker (Mar 12 2021 at 18:18):

I don't... think so? I think the index update is going to alter the tree entries git uses to track the checkout states?

view this post on Zulip starseeker (Mar 12 2021 at 18:18):

I know in the fast-export file that's how it's represented... hang on, let me generate something quick.

view this post on Zulip Sean (Mar 12 2021 at 18:19):

I guess to answer your question, I'm looking for an option #4 where it's just set in the repo transparently instead of explicitly as a bandaid

view this post on Zulip Sean (Mar 12 2021 at 18:19):

since we're going through the rigor to fix it, might as well .. fix it

view this post on Zulip starseeker (Mar 12 2021 at 18:20):

Oh, you're talking about the binary vs. text mode checkout?

view this post on Zulip starseeker (Mar 12 2021 at 18:20):

That's different from the exec mode.

view this post on Zulip Sean (Mar 12 2021 at 18:20):

what about checking out each rev and doing a git update-index on each of the files in our ledger?

view this post on Zulip Sean (Mar 12 2021 at 18:20):

starseeker said:

Oh, you're talking about the binary vs. text mode checkout?

No, I'm not

view this post on Zulip Sean (Mar 12 2021 at 18:21):

I've not even considered the binary property...

view this post on Zulip starseeker (Mar 12 2021 at 18:21):

Here's the fast-export stream from the repository without the blobs (i.e. small enough to be viewed): https://brlcad.org/~starseeker/no_blobs.fi.gz

view this post on Zulip starseeker (Mar 12 2021 at 18:22):

I find that helpful for understanding how things are stored by git.

view this post on Zulip Sean (Mar 12 2021 at 18:24):

So every time it's checked in, its mode is potentially changed

view this post on Zulip starseeker (Mar 12 2021 at 18:24):

Yes.

view this post on Zulip starseeker (Mar 12 2021 at 18:25):

That's my understanding - I can do an experiment quick to confirm that.
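
One way such an experiment could look, as a sketch in a throwaway repository (foo.sh and the commit messages are invented): the mode is recorded in each commit's tree entry, so changing it produces a new commit while the earlier commit keeps the old mode.

```shell
#!/bin/sh
# Sketch only: show that the exec bit is stored per tree entry, per commit.
# The repository, file, and identity used here are all throwaway.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 'echo hi' > foo.sh
git add foo.sh
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add foo.sh'
mode_before=$(git ls-tree HEAD foo.sh | cut -d' ' -f1)

# Flip the mode in the index only, then commit that change.
git update-index --chmod=+x foo.sh
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'mark foo.sh executable'
mode_after=$(git ls-tree HEAD foo.sh | cut -d' ' -f1)
mode_parent=$(git ls-tree HEAD~1 foo.sh | cut -d' ' -f1)

echo "first commit:          $mode_before"
echo "second commit:         $mode_after"
echo "first commit, re-read: $mode_parent"
```

This is also why an already-published commit cannot simply be "updated": fixing modes in old history means rewriting those commits, which changes their SHA1 values.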

view this post on Zulip Sean (Mar 12 2021 at 18:25):

So a script that walks every commit and scans for all the executables2.txt files?

view this post on Zulip Sean (Mar 12 2021 at 18:26):

Can you do an update-index on a committed sha?

view this post on Zulip starseeker (Mar 12 2021 at 18:26):

I was planning to do what I did for the previous case - take executables2.txt, reformat it for repowork, and operate on the fast import stream.

view this post on Zulip Sean (Mar 12 2021 at 18:30):

I'm not familiar with what "operate on the fast import stream" means... :)

view this post on Zulip Sean (Mar 12 2021 at 18:30):

how's it doing surgery on the repo commits?

view this post on Zulip Sean (Mar 12 2021 at 18:30):

or is it not?

view this post on Zulip starseeker (Mar 12 2021 at 18:31):

Heh. Sorry. repowork takes the output of "git fast-export", reads it into C++ data structures, manipulates it, and dumps out a new fast-import stream that is in turn fed to "git fast-import"

view this post on Zulip Sean (Mar 12 2021 at 18:32):

so re-running the conversion

view this post on Zulip Sean (Mar 12 2021 at 18:33):

and fixing it then

view this post on Zulip Sean (Mar 12 2021 at 18:33):

or dumping the conversion, fixing, and re-importing

view this post on Zulip Sean (Mar 12 2021 at 18:34):

I think you mean you are dumping 12, doing all those corrections+fixes+etc, and then ending up with a new repo (call it 19 or 20 or whatever)?

view this post on Zulip starseeker (Mar 12 2021 at 18:34):

The latter. So the full sequence is:

cd old_repo && git fast-export --all --show-original-ids > ~/old.fi
./repowork --mode-map exec_update.txt ~/old.fi new.fi
mkdir new_repo && cd new_repo && git init
cat ../new.fi | git fast-import
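
To make "operate on the fast import stream" concrete, here is a toy stand-in for the middle step. It is not repowork and does not use its input format: it assumes a hypothetical list with one path per line and a hand-written stream fragment, and flips matching `M 100644` entries to `M 100755` with awk.

```shell
#!/bin/sh
# Sketch only: naive mode rewrite on a fast-export style stream.
# NOT repowork; exec_paths.txt and its one-path-per-line format are assumptions.
set -eu

work=$(mktemp -d)
printf '%s\n' 'sh/run.sh' > "$work/exec_paths.txt"

# Hand-written fragment in the style of "git fast-export" output (blobs omitted).
cat > "$work/old.fi" <<'EOF'
commit refs/heads/main
mark :4
committer Demo <demo@example.com> 1615573000 +0000
data 10
add files
M 100644 :1 sh/run.sh
M 100644 :3 README

EOF

awk -v list="$work/exec_paths.txt" '
    BEGIN {
        # Load the set of paths that should become executable.
        while ((getline p < list) > 0) want_exec[p] = 1
    }
    /^M 100644 / {
        path = $0
        sub(/^M 100644 [^ ]+ /, "", path)   # strip "M <mode> <dataref> "
        if (path in want_exec) sub(/^M 100644 /, "M 100755 ")
    }
    { print }
' "$work/old.fi" > "$work/new.fi"

grep '^M ' "$work/new.fi"
```

A line-based filter like this is only safe on a toy input: a real tool has to honor the `data <n>` byte counts so that commit messages or blob contents that happen to look like `M 100644 ...` lines are left alone, and it has to handle quoted paths.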

view this post on Zulip starseeker (Mar 12 2021 at 18:34):

No, dumping (in this case) brlcad_added_gitattributes.

view this post on Zulip starseeker (Mar 12 2021 at 18:35):

That way I don't have to redo all the 12->18 corrections - they're already there.

view this post on Zulip Sean (Mar 12 2021 at 18:35):

okay, I think I get it

view this post on Zulip starseeker (Mar 12 2021 at 18:36):

That's why I was asking about the .gitattributes solution - I can also dump brlcad_tkhtml_fix and not incorporate the .gitattributes changes.

view this post on Zulip Sean (Mar 12 2021 at 18:36):

sounds good. Okay, so then ... is there anything to be done about the binary files? we could audit them similarly

view this post on Zulip Sean (Mar 12 2021 at 18:36):

I'd like to avoid .gitattributes if we can.

view this post on Zulip starseeker (Mar 12 2021 at 18:36):

Hrm. I hadn't considered that beyond the known breakages...

view this post on Zulip starseeker (Mar 12 2021 at 18:37):

Well, we can't avoid something like that, unless you know about a Git feature I don't.

view this post on Zulip Sean (Mar 12 2021 at 18:37):

I can pull the list of known correct and incorrect binaries the same way. even doing every rev would probably take an hour or so

view this post on Zulip starseeker (Mar 12 2021 at 18:38):

Whether the dsp or 3dm files get checked out as text or binary I don't think is governed by anything stored internally in the repo.

view this post on Zulip Sean (Mar 12 2021 at 18:38):

What do you mean?

view this post on Zulip starseeker (Mar 12 2021 at 18:39):

That's why I posted the no_blob.fi file. If you check pretty much any commit, you'll see that only the mode and the blob sha1 are associated with the path. There doesn't seem to be an equivalent to the svn:mime-type

view this post on Zulip starseeker (Mar 12 2021 at 18:40):

https://git-scm.com/docs/gitattributes

view this post on Zulip Sean (Mar 12 2021 at 18:41):

right, my understanding is git doesn't actually store mixed encodings like svn

view this post on Zulip Sean (Mar 12 2021 at 18:41):

that everything is essentially just stored (binary) and whether it displays them or treats them as binary depends on it detecting non-ascii bytes

view this post on Zulip starseeker (Mar 12 2021 at 18:42):

Right, which means if you want it to treat a file (say) as binary anyway (or keep Windows line endings on Linux, for that matter) you need some form of gitattributes override. The dsp and 3dm files are getting detected as text, as far as I can tell.

view this post on Zulip starseeker (Mar 12 2021 at 18:44):

We could look for other files in the repository that should be binary but will match a text detection, although I'm not 100% sure how to set that up. Even once we know that, though, there's no per-path property we can set in git (that I know of) that doesn't involve the .gitattributes file

view this post on Zulip Sean (Mar 12 2021 at 18:45):

gotcha

view this post on Zulip Sean (Mar 12 2021 at 18:45):

in that case, I'm inclined to see-no-evil

view this post on Zulip Sean (Mar 12 2021 at 18:46):

too much potential to screw up something. e.g., .dsp files .. those were msvc6 project files iirc, so they usually are/were text files

view this post on Zulip starseeker (Mar 12 2021 at 18:47):

Right - that's where terra.dsp got so messed up historically - when people auto-set all the mime types for .dsp files.

view this post on Zulip Sean (Mar 12 2021 at 18:47):

i mean, we can fix our little terra.dsp and 3dm, but probably not worth seeking out more

view this post on Zulip Sean (Mar 12 2021 at 18:48):

potential for error would probably be the few .g's that have been committed, but those are almost certainly correctly detected as binary

view this post on Zulip Sean (Mar 12 2021 at 18:49):

i'll do a spot check just to see if it looks like there were any important binaries in the history

view this post on Zulip Sean (Mar 12 2021 at 18:49):

similar 250 jumping to see what was binary

view this post on Zulip Sean (Mar 12 2021 at 18:49):

that just takes a couple min

view this post on Zulip Sean (Mar 12 2021 at 18:53):

oh that'll be handy -- this will also tell which files we changed the mime-type on, which might be an indicator that it was important

view this post on Zulip Sean (Mar 12 2021 at 18:54):

half done

view this post on Zulip starseeker (Mar 12 2021 at 18:54):

Sounds good. This repo has the exec updates from your new file: https://github.com/starseeker/brlcad_exec2

view this post on Zulip Sean (Mar 12 2021 at 19:00):

4868 unique binary file paths

view this post on Zulip Sean (Mar 12 2021 at 19:02):

man... there's a lot of mime-type mistakes in there

view this post on Zulip starseeker (Mar 12 2021 at 19:10):

/me fires distcheck-full on rel-7-32-2 from brlcad_exec2

view this post on Zulip starseeker (Mar 12 2021 at 19:11):

(or more specifically, cmake .. -DFORCE_DISTCHECK=ON && make distcheck-full)

view this post on Zulip starseeker (Mar 12 2021 at 19:41):

@Sean If you do find more important binary paths that test as text files, what did you want to do about them - make similar insertions of .gitattributes to protect them?

view this post on Zulip Sean (Mar 12 2021 at 19:48):

still going through the list

view this post on Zulip Sean (Mar 12 2021 at 19:51):

@starseeker do you have an existing .gitconfig or other file specifying file extensions being binary or not somewhere?

view this post on Zulip starseeker (Mar 12 2021 at 19:52):

My current heuristics are here: https://github.com/starseeker/brlcad_exec2/blob/main/.gitattributes

view this post on Zulip Sean (Mar 12 2021 at 19:53):

but you just added that, no?

view this post on Zulip Sean (Mar 12 2021 at 19:54):

that's not what I'm getting at

view this post on Zulip starseeker (Mar 12 2021 at 19:54):

Oh, you mean do I have something else on my system setting attributes?

view this post on Zulip starseeker (Mar 12 2021 at 19:54):

Not to my knowledge... let me see if there's a system file...

view this post on Zulip starseeker (Mar 12 2021 at 19:55):

It doesn't look like it, no.

view this post on Zulip Sean (Mar 12 2021 at 19:57):

The fact that it got terra.dsp wrong is a little surprising as it clearly has non-printable characters. The only reason I can think of where it would have committed that as text is somewhere something saying '.dsp files are text'. I'm not finding that so it's a little concerning where that came from.

view this post on Zulip Sean (Mar 12 2021 at 19:57):

by any automatic measure, terra.dsp would have come up binary

view this post on Zulip starseeker (Mar 12 2021 at 19:58):

Oh. That may be my error.

view this post on Zulip starseeker (Mar 12 2021 at 19:58):

terra.dsp may have gotten the svn mime-type set by pattern match.

view this post on Zulip starseeker (Mar 12 2021 at 19:58):

It may be that left to itself it would be fine in git...

view this post on Zulip Sean (Mar 12 2021 at 19:59):

that file's been set both ways in svn, a source of issues over the years

view this post on Zulip starseeker (Mar 12 2021 at 19:59):

That would simplify matters, actually - we'd only have to add .gitattributes for the 3dm file.

view this post on Zulip starseeker (Mar 12 2021 at 20:00):

As long as git doesn't have any built-in file extension awareness for *.dsp... I doubt it...

view this post on Zulip Sean (Mar 12 2021 at 20:00):

I can't imagine it does either.

view this post on Zulip Sean (Mar 12 2021 at 20:01):

is NIST_MBE_PMI_7-10.3dm the only 3dm file or was something else triggering it?

view this post on Zulip Sean (Mar 12 2021 at 20:02):

that's another that should get detected as binary... I mean unless the detection method is onerously too simple.

view this post on Zulip starseeker (Mar 12 2021 at 20:02):

I believe it's the only uncompressed 3dm file

view this post on Zulip starseeker (Mar 12 2021 at 20:02):

src/libbrep/tests/ayam_hyperbolid.3dm I guess would be another one.

view this post on Zulip starseeker (Mar 12 2021 at 20:03):

but I don't think I've got that hooked into any build tests right now...

view this post on Zulip Sean (Mar 12 2021 at 20:03):

is that in the history or something?

view this post on Zulip Sean (Mar 12 2021 at 20:03):

oh strange

view this post on Zulip starseeker (Mar 12 2021 at 20:03):

/me blinks - it should be in trunk...

view this post on Zulip Sean (Mar 12 2021 at 20:03):

so that file is in svn trunk, but it's not in conv18

view this post on Zulip Sean (Mar 12 2021 at 20:04):

oh right, my bad -- that was my 7.30 test

view this post on Zulip Sean (Mar 12 2021 at 20:05):

there it is

view this post on Zulip starseeker (Mar 12 2021 at 20:05):

/me blinks - terra.dsp is coming out different in SVN and git checkouts according to diff, even though I fixed the SVN mime type in 70882

view this post on Zulip starseeker (Mar 12 2021 at 20:05):

/me growls and heads for a CVS checkout... what's the right file here??

view this post on Zulip starseeker (Mar 12 2021 at 20:09):

OK, I guess that makes sense, kind of. Both r18847 and latest trunk SVN checkout of terra.dsp diff with the CVS checkout, but the git checkout matches the CVS checkout.

view this post on Zulip starseeker (Mar 12 2021 at 20:09):

/me doesn't know why NONE of the SVN checkouts match CVS, but I guess it doesn't really matter at this point...

view this post on Zulip Sean (Mar 12 2021 at 20:14):

interesting. so the difference is really subtle.

view this post on Zulip Sean (Mar 12 2021 at 20:16):

trunk's terra.dsp doesn't appear to have any 0x0d bytes (carriage returns)

view this post on Zulip Sean (Mar 12 2021 at 20:16):

it does have 0x0a bytes (newlines)

view this post on Zulip Sean (Mar 12 2021 at 20:17):

conv18's terra.dsp has both 0x0d and 0x0a bytes which at a glance is probably correct

view this post on Zulip Sean (Mar 12 2021 at 20:18):

it's also worth mentioning that both are perfectly valid dsp data files for the same dimensional specification. the difference is going to be a 1/32768 difference in elevation at those points.

view this post on Zulip starseeker (Mar 12 2021 at 20:19):

I just tested brlcad_tkhtml_fix on Windows, which for older checkouts doesn't have .gitattributes. terra.dsp checkout matches the CVS version according to diff, so you're correct - we don't need .dsp flagged as binary explicitly. We just need to make sure we don't flag it as Windows line ending in git.

view this post on Zulip starseeker (Mar 12 2021 at 20:21):

It's probably worth leaving the entry in the new .gitattributes to avoid that, but we don't need to insert it in the old history for that purpose. I'll adjust my logic to only add the .3dm version.

view this post on Zulip Sean (Mar 12 2021 at 20:21):

can we get rid of the top-level .gitattributes altogether? rather we stick to defaults if we can manage.

view this post on Zulip starseeker (Mar 12 2021 at 20:21):

We still need it for .3dm

view this post on Zulip Sean (Mar 12 2021 at 20:23):

can't that be a single-line .gitattributes in that lone 3dm's folder?

view this post on Zulip starseeker (Mar 12 2021 at 20:23):

Let me see if git supports that...

view this post on Zulip Sean (Mar 12 2021 at 20:23):

there's so much override specified in that file, I can see that coming to bite down the road or at least being a debugging discovery journey

view this post on Zulip Sean (Mar 12 2021 at 20:24):

but that also still begs the question how that 3dm is getting treated as text... it's full of binary

view this post on Zulip Sean (Mar 12 2021 at 20:24):

'file' sees it as binary

view this post on Zulip starseeker (Mar 12 2021 at 20:25):

Yeah, I don't know why Windows treated it differently.

view this post on Zulip starseeker (Mar 12 2021 at 20:26):

OK, if I'm reading this right, ".gitattributes file in the same directory as the path in question" is in the precedence list, so you should be correct we can target locally.
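
A minimal sketch of the directory-local approach, run in a throwaway repository. The directory and file names mirror the ones mentioned in this thread, but the one-line rule itself is an assumption about what such a file could contain, not what was actually committed. `binary` is git's built-in macro attribute, which unsets `text`, `diff` and `merge`.

```sh
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
mkdir -p "$repo/db/nist"

# Single-line attributes file that only affects db/nist and below.
printf '*.3dm binary\n' > "$repo/db/nist/.gitattributes"

# Ask git which value of "text" applies; the paths do not need to exist.
git -C "$repo" check-attr text -- db/nist/NIST_MBE_PMI_7-10.3dm
git -C "$repo" check-attr text -- src/libbrep/tests/ayam_hyperbolid.3dm
```

The first query should report `text: unset` and the second `text: unspecified`, which shows that the rule does not leak outside the directory holding the `.gitattributes` file.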

view this post on Zulip Sean (Mar 12 2021 at 20:26):

I would target as specific and minimally as possible for this initial thrust

view this post on Zulip starseeker (Mar 12 2021 at 20:27):

@Sean I'm game to ditch the top level .gitattributes in main - I added it mostly trying to match the subversion default rules you had set up...

view this post on Zulip starseeker (Mar 12 2021 at 20:27):

Give me a few minutes to rework the repowork inputs...

view this post on Zulip Sean (Mar 12 2021 at 20:28):

okay, so apparently their method is essentially ..."check for any occurrence of a zero/nul byte in the first 8000 bytes"

view this post on Zulip Sean (Mar 12 2021 at 20:28):

/me checks that file

view this post on Zulip Sean (Mar 12 2021 at 20:29):

so terra.dsp qualifies. it's got a zero around byte 280

view this post on Zulip Sean (Mar 12 2021 at 20:31):

okay, so NIST_MBE_PMI_7-10.3dm has a zero byte at the 34th byte in the file...

view this post on Zulip Sean (Mar 12 2021 at 20:32):

34th, 35th, 36th, 39th, 40th are all zero...
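
The check described above (any NUL byte within the first 8000 bytes means binary) can be approximated in a few lines of shell. This is a rough sketch of the idea, not git's actual code; the name `looks_binary` is made up for illustration, and it assumes a `head` that supports `-c`.

```sh
# Hypothetical helper: succeed (exit 0) if FILE contains a NUL byte
# within its first 8000 bytes, fail otherwise.
looks_binary() {
    # Size of the leading window (at most 8000 bytes).
    window=$(( $(head -c 8000 "$1" | wc -c) ))
    # Size of the same window after deleting every NUL byte.
    without_nul=$(( $(head -c 8000 "$1" | LC_ALL=C tr -d '\000' | wc -c) ))
    # A difference means at least one NUL byte was in the window.
    [ "$window" -ne "$without_nul" ]
}
```

Usage would be along the lines of `looks_binary db/nist/NIST_MBE_PMI_7-10.3dm && echo binary`. Given the zero bytes reported above for both files, it would be expected to succeed on each of them.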

view this post on Zulip Sean (Mar 12 2021 at 20:33):

did you maybe use some git checkout tool that had a built-in config such that it was an individual issue?

view this post on Zulip starseeker (Mar 12 2021 at 20:33):

/me shrugs. Maybe - I'll try again.

view this post on Zulip Sean (Mar 12 2021 at 20:38):

according to git on mac, it thinks they're binary...
this tells which it thinks are binary: git diff --numstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep '^-'
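
The hash in that command is git's well-known empty-tree object for SHA-1 repositories, so the diff amounts to "everything in HEAD". A hedged variant that computes the hash rather than hard-coding it and prints only the paths follows; `list_binary_paths` is a made-up name, and the sketch assumes it is run inside the repository and that no path contains tabs or newlines.

```sh
# Hypothetical helper: list paths in HEAD that git treats as binary.
list_binary_paths() {
    # Hash of an empty tree (4b825dc6... in SHA-1 repositories).
    empty_tree=$(git hash-object -t tree /dev/null)
    # --numstat prints "-<TAB>-<TAB>path" for files treated as binary.
    git diff --numstat "$empty_tree" HEAD -- |
        awk -F '\t' '$1 == "-" && $2 == "-" { print $3 }'
}
```

Piping the result through `grep 3dm` gives the same kind of listing as the command above, minus the two dash columns.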

view this post on Zulip starseeker (Mar 12 2021 at 20:41):

I just tried again cloning with Git on Windows - the SVN checkout and the git checkout differ.

view this post on Zulip Sean (Mar 12 2021 at 20:41):

those would seem to imply we don't need to do anything for them and terra.dsp is getting historically and contemporarily fixed by the migration

view this post on Zulip Sean (Mar 12 2021 at 20:41):

maybe the svn checkout is wrong

view this post on Zulip starseeker (Mar 12 2021 at 20:42):

terra.dsp agreed. The 3dm files are the issue - ayam_hyperbolid.3dm also differs between git and SVN checkouts.

view this post on Zulip starseeker (Mar 12 2021 at 20:42):

3dm-g works on the SVN checkouts, but not the git versions.

view this post on Zulip starseeker (Mar 12 2021 at 20:43):

/me double checks that...

view this post on Zulip starseeker (Mar 12 2021 at 20:44):

Confirmed. Git checkouts of both 3dm files on Windows fail to convert with 3dm-g

view this post on Zulip Sean (Mar 12 2021 at 20:45):

what's the checkout tool?

view this post on Zulip Sean (Mar 12 2021 at 20:45):

this: https://git-scm.com/download/win ?

view this post on Zulip starseeker (Mar 12 2021 at 20:45):

Just the standard Git windows install, from the bash command line

view this post on Zulip starseeker (Mar 12 2021 at 20:46):

I believe so, yes.

view this post on Zulip Sean (Mar 12 2021 at 20:46):

what does this report on windows: git diff --numstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep '^-' | grep 3dm

view this post on Zulip Sean (Mar 12 2021 at 20:47):

I get:

morrison@agua brlcad_conv18 % git diff --numstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep '^-' | grep 3dm
-   -   db/nist/NIST_MBE_PMI_7-10.3dm
-   -   regress/nurbs/brep-3dm.tar.bz2
-   -   src/libbrep/tests/ayam_hyperbolid.3dm

view this post on Zulip Sean (Mar 12 2021 at 20:49):

or git diff --stat ... it reports Bin too

view this post on Zulip starseeker (Mar 12 2021 at 20:49):

Yes, that matches

view this post on Zulip Sean (Mar 12 2021 at 20:50):

Are the bytes the same?

% git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep 3dm
 db/nist/NIST_MBE_PMI_7-10.3dm                      |    Bin 0 -> 4232626 bytes
 regress/nurbs/brep-3dm.tar.bz2                     |    Bin 0 -> 103242 bytes
 src/conv/3dm/3dm-g.c                               |    137 +
 src/conv/3dm/CMakeLists.txt                        |     16 +
 src/libbrep/tests/ayam_hyperbolid.3dm              |    Bin 0 -> 4189 bytes
 src/other/openNURBS/opennurbs_3dm.h                |    528 +
 src/other/openNURBS/opennurbs_3dm_attributes.cpp   |   1528 +
 src/other/openNURBS/opennurbs_3dm_attributes.h     |    573 +
 src/other/openNURBS/opennurbs_3dm_properties.cpp   |    598 +
 src/other/openNURBS/opennurbs_3dm_properties.h     |    142 +
 src/other/openNURBS/opennurbs_3dm_settings.cpp     |   4036 +
 src/other/openNURBS/opennurbs_3dm_settings.h       |    891 +

view this post on Zulip starseeker (Mar 12 2021 at 20:51):

https://brlcad.org/~starseeker/ayam_hyperbolid.3dm.gz

view this post on Zulip starseeker (Mar 12 2021 at 20:51):

there's one of the windows checkouts...

view this post on Zulip starseeker (Mar 12 2021 at 20:52):

https://brlcad.org/~starseeker/NIST_MBE_PMI_7-10.3dm.gz

view this post on Zulip Sean (Mar 12 2021 at 20:53):

yeah, it's bigger 4199 bytes

view this post on Zulip Sean (Mar 12 2021 at 20:53):

and confirmed in hex mode, it's converted 0a's to 0d0a's
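
The size difference lines up with that: 4199 - 4189 = 10 extra bytes, which is what ten 0a bytes each gaining a 0d would produce. A small sketch for checking this kind of thing on any file; `eol_report` is a hypothetical name, and the predicted size assumes every LF byte gets expanded, which may not hold if some are already preceded by CR.

```sh
# Hypothetical helper: print FILE's size, its LF and CR byte counts, and
# the size it would have if every LF byte were expanded to CR LF.
eol_report() {
    size=$(( $(wc -c < "$1") ))
    # Keep only LF (or only CR) bytes, then count what is left.
    lf=$(( $(LC_ALL=C tr -cd '\n' < "$1" | wc -c) ))
    cr=$(( $(LC_ALL=C tr -cd '\r' < "$1" | wc -c) ))
    printf 'size=%d lf=%d cr=%d size_if_lf_to_crlf=%d\n' \
        "$size" "$lf" "$cr" "$(( size + lf ))"
}
```

Running it on a known-good copy and comparing `size_if_lf_to_crlf` with the size of the suspect copy is a quick way to tell line-ending conversion apart from other kinds of corruption.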

view this post on Zulip Sean (Mar 12 2021 at 20:54):

WHY? does stat say they're Bin for you ??

view this post on Zulip starseeker (Mar 12 2021 at 20:55):

Interestingly, I get the same report when I ask git:

$ git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep 3dm
 db/nist/NIST_MBE_PMI_7-10.3dm                      |    Bin 0 -> 4232626 bytes
 regress/nurbs/brep-3dm.tar.bz2                     |    Bin 0 -> 103242 bytes
 src/conv/3dm/3dm-g.c                               |    137 +
 src/conv/3dm/CMakeLists.txt                        |     16 +
 src/libbrep/tests/ayam_hyperbolid.3dm              |    Bin 0 -> 4189 bytes
 src/other/openNURBS/opennurbs_3dm.h                |    528 +
 src/other/openNURBS/opennurbs_3dm_attributes.cpp   |   1528 +
 src/other/openNURBS/opennurbs_3dm_attributes.h     |    573 +
 src/other/openNURBS/opennurbs_3dm_properties.cpp   |    598 +
 src/other/openNURBS/opennurbs_3dm_properties.h     |    142 +
 src/other/openNURBS/opennurbs_3dm_settings.cpp     |   4036 +
 src/other/openNURBS/opennurbs_3dm_settings.h       |    891 +

view this post on Zulip Sean (Mar 12 2021 at 20:55):

yeah, see that makes no f'ing sense...

view this post on Zulip Sean (Mar 12 2021 at 20:55):

it thinks they're binary ... yet it's been translated by something

view this post on Zulip Sean (Mar 12 2021 at 20:56):

"something"

view this post on Zulip starseeker (Mar 12 2021 at 20:56):

Here's what stat says:

$ stat db/nist/NIST_MBE_PMI_7-10.3dm
  File: db/nist/NIST_MBE_PMI_7-10.3dm
  Size: 4232626         Blocks: 4136       IO Block: 65536  regular file
Device: e02c1581h/3760985473d   Inode: 3096224743825018  Links: 1
Access: (0644/-rw-r--r--)  Uid: (197612/   cliff)   Gid: (197612/ UNKNOWN)
Access: 2021-03-12 15:55:21.920148200 -0500
Modify: 2021-03-12 15:53:06.716503800 -0500
Change: 2021-03-12 15:53:06.716503800 -0500
 Birth: 2021-03-12 15:53:06.716503800 -0500

view this post on Zulip Sean (Mar 12 2021 at 20:57):

woah, that's odd too

view this post on Zulip Sean (Mar 12 2021 at 20:59):

4232626 bytes ... yet the file you sent me is 4243206 bytes

view this post on Zulip Sean (Mar 12 2021 at 20:59):

how's that possible?

view this post on Zulip Sean (Mar 12 2021 at 21:00):

stat is saying it has no carriage returns, yet the file you sent has carriage returns

view this post on Zulip Sean (Mar 12 2021 at 21:00):

that's stat on windows?

view this post on Zulip starseeker (Mar 12 2021 at 21:00):

Yes. All those were run from the git bash command prompt

view this post on Zulip Sean (Mar 12 2021 at 21:01):

what's ls say?

view this post on Zulip starseeker (Mar 12 2021 at 21:01):

$ ls -l db/nist/NIST_MBE_PMI_7-10.3dm
-rw-r--r-- 1 cliff 197612 4232626 Mar 12 15:53 db/nist/NIST_MBE_PMI_7-10.3dm

view this post on Zulip Sean (Mar 12 2021 at 21:01):

!

view this post on Zulip Sean (Mar 12 2021 at 21:02):

and you're saying 3dm-g in that same shell dies on it?

view this post on Zulip starseeker (Mar 12 2021 at 21:05):

Well, I first got this:

 /c/brlcad-build/Debug/bin/3dm-g.exe -o test.g brlcad_tkhtml_fix/src/libbrep/tests/ayam_hyperbolid.3dm
invalid input file ('ONX_Model::Read() failed.

Note:  if this file was saved from Rhino3D, make sure it was saved using
Rhino's v5 format or lower - newer versions of the 3dm format are not
currently supported by BRL-CAD.')

failed to load input file

but I just tried it again and it seemed to work???

view this post on Zulip starseeker (Mar 12 2021 at 21:06):

And now the diff matches????????!!!!!

view this post on Zulip Sean (Mar 12 2021 at 21:06):

that's what's confusing because all the numbers are pointing at it being correct (now)

view this post on Zulip starseeker (Mar 12 2021 at 21:06):

What in the world?

view this post on Zulip Sean (Mar 12 2021 at 21:07):

maybe make sure you don't have a git diff or checkout again or something

view this post on Zulip starseeker (Mar 12 2021 at 21:07):

It's almost as if it wrote an intermediate version of the file and then went back and changed it

view this post on Zulip Sean (Mar 12 2021 at 21:07):

not a .gitattributes or .gitconfig or something...

view this post on Zulip starseeker (Mar 12 2021 at 21:07):

/me starts over...

view this post on Zulip Sean (Mar 12 2021 at 21:08):

check that size after every step
4232626 is correct... 423 good 424 bad
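
One way to automate that size check is to compare the size git has stored for the file with the size on disk. This is only a sketch: `check_blob_size` is a hypothetical name, it assumes it is run from the top of the working tree, and a mismatch is only suspicious for files that are supposed to be checked out byte-for-byte (for text files under line-ending conversion a difference is expected).

```sh
# Hypothetical helper: compare the size of PATH as stored in HEAD with
# the size of the checked-out file. Succeeds only if they match.
check_blob_size() {
    # Size of the blob recorded in HEAD, before any checkout conversion.
    blob=$(git cat-file -s "HEAD:$1") || return 2
    # Size of the file as it currently sits in the working tree.
    disk=$(( $(wc -c < "$1") ))
    printf '%s: blob=%s worktree=%s\n' "$1" "$blob" "$disk"
    [ "$blob" -eq "$disk" ]
}
```

This separates "what git stored" from "what git wrote to disk", which is the distinction the numbers above keep tripping over: `git diff --stat` reports the stored size, while `ls` and `stat` report the file on disk.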

view this post on Zulip starseeker (Mar 12 2021 at 21:12):

OK, fresh checkout, back to bad file:

MINGW64 /c
$ ls -l brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm
-rw-r--r-- 1 cliff 197612 4243206 Mar 12 16:08 brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm

MINGW64 /c
$ diff brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm brlcad/db/nist/NIST_MBE_PMI_7-10.3dm
Binary files brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm and brlcad/db/nist/NIST_MBE_PMI_7-10.3dm differ

MINGW64 /c
$ ls -l brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm
-rw-r--r-- 1 cliff 197612 4243206 Mar 12 16:08 brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm

MINGW64 /c
$ /c/brlcad-build/Debug/bin/3dm-g.exe -o /c/brlcad_tkhtml_fix/test.g brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm
invalid input file ('ONX_Model::Read() failed.

Note:  if this file was saved from Rhino3D, make sure it was saved using
Rhino's v5 format or lower - newer versions of the 3dm format are not
currently supported by BRL-CAD.')

failed to load input file

MINGW64 /c
$ ls -l brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm
-rw-r--r-- 1 cliff 197612 4243206 Mar 12 16:08 brlcad_tkhtml_fix/db/nist/NIST_MBE_PMI_7-10.3dm

MINGW64 /c
$ date
Fri Mar 12 16:11:59 EST 2021

view this post on Zulip starseeker (Mar 12 2021 at 21:18):

So far it hasn't fixed itself again...

view this post on Zulip starseeker (Mar 12 2021 at 21:19):

stat at least agrees with the wrong version this time:

$ stat db/nist/NIST_MBE_PMI_7-10.3dm
  File: db/nist/NIST_MBE_PMI_7-10.3dm
  Size: 4243206         Blocks: 4144       IO Block: 65536  regular file
Device: e02c1581h/3760985473d   Inode: 48132221017637258  Links: 1
Access: (0644/-rw-r--r--)  Uid: (197612/   cliff)   Gid: (197612/ UNKNOWN)
Access: 2021-03-12 16:17:43.513220300 -0500
Modify: 2021-03-12 16:16:43.532223700 -0500
Change: 2021-03-12 16:16:43.532223700 -0500
 Birth: 2021-03-12 16:08:53.242552400 -0500

view this post on Zulip starseeker (Mar 12 2021 at 21:20):

MINGW64 /c/brlcad_tkhtml_fix (main)
$ git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- | grep 3dm
 db/nist/NIST_MBE_PMI_7-10.3dm                      |    Bin 0 -> 4232626 bytes
 regress/nurbs/brep-3dm.tar.bz2                     |    Bin 0 -> 103242 bytes
 src/conv/3dm/3dm-g.c                               |    137 +
 src/conv/3dm/CMakeLists.txt                        |     16 +
 src/libbrep/tests/ayam_hyperbolid.3dm              |    Bin 0 -> 4189 bytes
 src/other/openNURBS/opennurbs_3dm.h                |    528 +
 src/other/openNURBS/opennurbs_3dm_attributes.cpp   |   1528 +
 src/other/openNURBS/opennurbs_3dm_attributes.h     |    573 +
 src/other/openNURBS/opennurbs_3dm_properties.cpp   |    598 +
 src/other/openNURBS/opennurbs_3dm_properties.h     |    142 +
 src/other/openNURBS/opennurbs_3dm_settings.cpp     |   4036 +
 src/other/openNURBS/opennurbs_3dm_settings.h       |    891 +

MINGW64 /c/brlcad_tkhtml_fix (main)
$ stat db/nist/NIST_MBE_PMI_7-10.3dm
  File: db/nist/NIST_MBE_PMI_7-10.3dm
  Size: 4243206         Blocks: 4144       IO Block: 65536  regular file
Device: e02c1581h/3760985473d   Inode: 48132221017637258  Links: 1
Access: (0644/-rw-r--r--)  Uid: (197612/   cliff)   Gid: (197612/ UNKNOWN)
Access: 2021-03-12 16:17:43.852005600 -0500
Modify: 2021-03-12 16:16:43.532223700 -0500
Change: 2021-03-12 16:16:43.532223700 -0500
 Birth: 2021-03-12 16:08:53.242552400 -0500

view this post on Zulip starseeker (Mar 12 2021 at 21:23):

@Sean I don't suppose you have access to a similar environment?

view this post on Zulip Sean (Mar 12 2021 at 21:23):

what about a git up or git stat .. wondering if that fixed it

view this post on Zulip starseeker (Mar 12 2021 at 21:24):

I've tried deleting it and re-checking it out - to no avail

view this post on Zulip Sean (Mar 12 2021 at 21:26):

it has to have been a git tool that fixed it.. you tried git diff --stat or git diff --numstat on it?

view this post on Zulip Sean (Mar 12 2021 at 21:26):

I'm thinking the GUI checkout tool has a bug

view this post on Zulip starseeker (Mar 12 2021 at 21:27):

I didn't use the GUI though - just the command line

view this post on Zulip Sean (Mar 12 2021 at 21:27):

well torpedo'd that thought good

view this post on Zulip Sean (Mar 12 2021 at 21:27):

and your .gitconfig is empty? and no .gitattributes?

view this post on Zulip Sean (Mar 12 2021 at 21:28):

cause now it's a double mystery. why it's cloning wrong and how it got fixed.

view this post on Zulip starseeker (Mar 12 2021 at 21:28):

Only thing I can think of is that .gitattributes file I added in main may actually be the problem.

view this post on Zulip starseeker (Mar 12 2021 at 21:30):

Yep. I'll bet that's it.

view this post on Zulip starseeker (Mar 12 2021 at 21:31):

It checked out wrong in main because some rule I stuck in there must have matched the 3dm file, then when the branch checked out it kept the file that had been "modified" by the .gitattributes file.

view this post on Zulip starseeker (Mar 12 2021 at 21:31):

OK, I'm convinced - no gitattributes file.

view this post on Zulip Sean (Mar 12 2021 at 21:32):

:thumbs_up: That's what I suspected would eventually happen... just not so soon.

view this post on Zulip starseeker (Mar 12 2021 at 21:32):

When I switched to the branch that didn't have the file, blew away the modded 3dm file, and restored from the branch rather than main, that's where the good file came from.

view this post on Zulip Sean (Mar 12 2021 at 21:33):

ahh! yep, that'd explain it. whew.

view this post on Zulip starseeker (Mar 12 2021 at 21:33):

OK. I'll back up to conv18 (or the one with your readme update if I've got it) and re-apply the tkhtml fix and the exec settings. We should then be Good To Go - minimalism wins again.

view this post on Zulip Sean (Mar 12 2021 at 21:33):

Ima need a hard drink tonight after that

view this post on Zulip Sean (Mar 12 2021 at 21:34):

README update wasn't important. that was just testing commit

view this post on Zulip Sean (Mar 12 2021 at 21:34):

I plan to completely overhaul the readme soon once all the other docs and tickets are in place.

view this post on Zulip starseeker (Mar 12 2021 at 21:35):

/me has been rather bad for your stress levels this week. OK, give me a few minutes to do a final pass and I'll upload the final candidate.

view this post on Zulip starseeker (Mar 12 2021 at 21:53):

@Sean remind me after we're done here to check the fast4 regression test - one of those files is deliberately Windows line endings and one is deliberately Linux - we may have to switch the in repo copies of those files to be .bz2 or something so they don't get autoupdated as text files on checkout.

view this post on Zulip starseeker (Mar 12 2021 at 21:55):

@Sean While we're thinking about it, did you also want to eliminate .gitignore? It's in there now because it gave us some non-empty SVN commits for id mapping, but maybe we want to eliminate it now.

view this post on Zulip starseeker (Mar 12 2021 at 21:57):

https://github.com/starseeker/brlcad_conv19

view this post on Zulip starseeker (Mar 12 2021 at 22:19):

Windows build and distcheck-full running on rel-7-32-2 tag from that repo now. Will confirm if successful in a few hours

view this post on Zulip Sean (Mar 12 2021 at 22:46):

I'll start testing 19 now too unless there's a reason to wait.

view this post on Zulip starseeker (Mar 12 2021 at 22:46):

Go for it

view this post on Zulip starseeker (Mar 13 2021 at 01:23):

Windows build passed, distcheck full Ubuntu passed for rel-7-32-2

view this post on Zulip starseeker (Mar 13 2021 at 17:20):

@Sean Any other checks you'd like me to run?

view this post on Zulip Sean (Mar 15 2021 at 16:55):

maybe just make sure the few .g's that are in the repo open and seem valid?

view this post on Zulip Sean (Mar 15 2021 at 16:56):

I'm running out of checks on my end, I think I'll upload the update this afternoon unless you found anything

view this post on Zulip starseeker (Mar 15 2021 at 17:43):

find ../ -name \*.g -exec ./bin/mged {} ls \; came through clean, as far as I can tell.

view this post on Zulip starseeker (Mar 15 2021 at 18:31):

Ditto on Windows (git bash shell)

view this post on Zulip Prabhat Singh (Mar 15 2021 at 19:31):

Hello everyone, will it be possible if someone can point me out to the getting started docs or quick start docs for opencax ?

view this post on Zulip Sean (Mar 15 2021 at 19:58):

@starseeker something that exercises external-to-internal form, like doing a get * or draw *. ls doesn't crack them iirc.
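
A sweep like the `find ... -exec` one above can be wrapped so that it reports which files failed. The sketch below is generic on purpose: `check_g_files` and the callback convention are made up for illustration, it assumes no newlines in file names, and whether mged's exit status actually reflects a bad database is an assumption that would need checking.

```sh
# Hypothetical helper: run CHECK (a command or shell function that takes
# one path argument) on every .g file under DIR and report failures.
check_g_files() {
    dir=$1
    check=$2
    status=0
    list=$(mktemp)
    find "$dir" -type f -name '*.g' > "$list"
    while IFS= read -r path; do
        # Give the check an empty stdin so it cannot consume the file list.
        if ! "$check" "$path" < /dev/null; then
            printf 'FAILED: %s\n' "$path"
            status=1
        fi
    done < "$list"
    rm -f "$list"
    return "$status"
}
```

With a wrapper such as `open_db() { ./bin/mged "$1" ls; }` this would be invoked as `check_g_files ../ open_db`. Swapping the command inside the wrapper is where a check that exercises external-to-internal form would go.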

view this post on Zulip starseeker (Mar 15 2021 at 22:28):

@Sean draw console output matches that from SVN build.

view this post on Zulip starseeker (Mar 15 2021 at 22:33):

g2asc output for all of them matches as well, except for a few chars in the openNURBS serializations of the breps.

view this post on Zulip starseeker (Mar 16 2021 at 01:32):

OK, yeah - the openNURBS serializations differ even building from the same sources, when different build dirs are used.

view this post on Zulip starseeker (Mar 16 2021 at 01:33):

@Sean no known blockers on my side...

view this post on Zulip starseeker (Mar 16 2021 at 22:10):

Anything I can do to help?

view this post on Zulip Sean (Mar 17 2021 at 06:49):

I just finished up, didn't find anything else. Will upload in the morning.

view this post on Zulip starseeker (Mar 17 2021 at 23:42):

@Sean If it's the vanilla brlcad_conv19 repo, shall I go ahead and upload it?

view this post on Zulip Sean (Mar 17 2021 at 23:49):

No, not yet

view this post on Zulip Sean (Mar 19 2021 at 03:01):

So I think a "soft opening" is probably in order. It's uploaded and live, but perhaps we could give it a few days to "simmer" .. not announce it publicly just yet.

view this post on Zulip starseeker (Mar 19 2021 at 03:35):

Sounds good. So to be sure - I'm clear to commit?

view this post on Zulip Sean (Mar 19 2021 at 03:45):

@starseeker haha, that sounds terrifying when you say it like that

view this post on Zulip Sean (Mar 19 2021 at 03:52):

but yeah, I don't see a reason why not. if anything, we should exercise it to make sure it's correct.

view this post on Zulip starseeker (Mar 19 2021 at 12:07):

Woo hoo! https://github.com/BRL-CAD/brlcad/actions

view this post on Zulip starseeker (Mar 19 2021 at 12:24):

/me has been working towards that for quite a while now

view this post on Zulip starseeker (Mar 19 2021 at 12:38):

@Sean Do we want to bother setting up an email mailing list for folks to get commit emails without having a Github account? If so this may be useful... https://docs.github.com/en/github/administering-a-repository/about-email-notifications-for-pushes-to-your-repository

view this post on Zulip starseeker (Mar 19 2021 at 13:26):

@Sean Right now I've got the "check" target building on the runners, but that's not going to succeed reliably due to the threading issues - looks like regress-gqa is failing some of the time on the OSX runner. Should I disable the check portion of the test until we have an expectation it can reliably run?

view this post on Zulip starseeker (Mar 19 2021 at 15:50):

I took a run at updating HACKING, but without doing an all-up release I'm sure I've missed something.

view this post on Zulip starseeker (Mar 19 2021 at 15:54):

One thing that is clear - if we want to keep providing the GNU style ChangeLog files, we'll have to put some effort into it.

view this post on Zulip starseeker (Mar 19 2021 at 15:58):

My thought, since now each git clone has the whole history locally, would be to either dispense with the ChangeLog all together or simply use the git log output. The only real utility to the ChangeLog would be for folks looking at tarballs without any access to either a local or github version of the history - I would expect that to be a rare case, and even in that scenario I would expect git log (or maybe git log --stat) output to be as useful as the current ChangeLog.

view this post on Zulip starseeker (Mar 19 2021 at 19:51):

Started populating the releases - that's going to be a job if we want to get all the binaries, source tarballs and notes moved. I got the majority of the release notes set up - all but a couple back to 7.0, except for a couple without obvious corresponding tags. However, I've only gotten a few of the uploads.

view this post on Zulip starseeker (Mar 20 2021 at 01:09):

@Sean Seems to be working pretty well so far.

view this post on Zulip starseeker (Mar 20 2021 at 03:12):

/me needs to add a tag for rel-7-10-4

view this post on Zulip starseeker (Mar 20 2021 at 03:15):

rel-7-6-2 as well

view this post on Zulip starseeker (Mar 20 2021 at 03:16):

and rel-7-4-2

view this post on Zulip Erik (Mar 20 2021 at 13:00):

ehhh, if github is central to development, people who want to see that far into development should come watch...

view this post on Zulip starseeker (Mar 20 2021 at 15:07):

Phew! OK, missing tags added, source and binary tarballs uploaded. Needs someone to double check to make sure I didn't miss any. OVA image (just barely) uploaded to Release on OVA repository.

Only binaries I don't have up yet are the old ProE plugins - not sure where to put them.

view this post on Zulip starseeker (Mar 20 2021 at 15:16):

Options would be either to set up a separate project for the creo plugins, or add tags for the plugins (something like proe-plugin-0-2-0 maybe?) and upload the plugins to those tags. If we do want to add older tags for the plugins, we'll need to be careful about setting tag dates once we identify the corresponding commits. (Just got bit by that - it's fixable https://stackoverflow.com/a/21741848/2037687 but we may as well get it right up front...)

view this post on Zulip starseeker (Mar 20 2021 at 15:26):

Do we want to use https://pages.github.com/ for the site?

view this post on Zulip starseeker (Mar 21 2021 at 13:25):

@Erik Do you know anything about the Github "packages" feature? Is that anything that might be useful for BRL-CAD?

view this post on Zulip starseeker (Mar 21 2021 at 13:26):

https://docs.github.com/en/packages/learn-github-packages/core-concepts-for-github-packages looks like the starting point but I'm not clear yet on what a "BRL-CAD package" would be or mean... Is that where we'd stick (say) a docker image?

view this post on Zulip Erik (Mar 22 2021 at 12:47):

no clue, never heard of it... my daily is on bitbucket :/

view this post on Zulip Sumagna Das (Mar 23 2021 at 19:56):

https://github.com/nektos/act

view this post on Zulip Sumagna Das (Mar 23 2021 at 19:57):

this repo looks like a good one for testing out the workflow files for github

view this post on Zulip starseeker (Mar 23 2021 at 20:16):

@Sumagna Das Interesting! Have you tried that with the BRL-CAD actions?

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:17):

i just can't keep my laptop open for more than 2-3 hr, mainly for classes

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:18):

but i can try it right now if you want and keep the laptop open for the night....

view this post on Zulip starseeker (Mar 23 2021 at 20:19):

@Sumagna Das Up to you - I'd actually be surprised if it can do anything much with our action files, since they call for Windows and OSX vms as well as Linux...

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:20):

as per the readme, it seems like it cant work with windows and macos

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:20):

only works with linux

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:21):

so might not be a good one for testing out workflow files except for linux ones

view this post on Zulip starseeker (Mar 23 2021 at 20:22):

So the question would be whether it knows to skip the non-Linux entries automatically or would we need to edit the files down before running it.

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:24):

might skip them

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:24):

let me try a dry run

view this post on Zulip Sean (Mar 23 2021 at 20:24):

Sumagna Das said:

https://github.com/nektos/act

How is this offtopic?? :D

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:25):

i didnt know that it can actually help with BRL-CAD's github actions so thought it was off topic :smile:

view this post on Zulip starseeker (Mar 23 2021 at 20:27):

@Sumagna Das fits with the Github topic, if you want to shift it over there.

view this post on Zulip Sumagna Das (Mar 23 2021 at 20:27):

done :smile:

view this post on Zulip starseeker (Mar 23 2021 at 20:27):

I've lost count of the number of things I've done on this conversion that I've considered where I didn't know whether or not it would help...

view this post on Zulip Sean (Mar 24 2021 at 03:09):

starseeker said:

Sean Do we want to bother setting up an email mailing list for folks to get commit emails without having a Github account? If so this may be useful... https://docs.github.com/en/github/administering-a-repository/about-email-notifications-for-pushes-to-your-repository

Yes, though I'm not fond of Github's default that merely links to the diff. It should really be in the e-mail (up to some kb limit) since the entire point of commit notification is quick review of the code change.

view this post on Zulip Sean (Mar 24 2021 at 03:10):

Looks like the way to handle it will be to set up a clone on .bz that pulls periodically with a receive hook
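
A rough sketch of what the mail body on such a clone could look like, assuming the goal is "diff in the e-mail up to some size limit". The function name, the default byte budget, and the mailing step in the trailing comment are all assumptions, not anything that is set up yet.

```shell
# Print a notification body for one commit: header lines, diffstat, and the
# patch itself, cut off at a byte budget so huge commits stay manageable.
# Usage: commit_mail_body <commit> [max-bytes]
# (hypothetical helper; the 100000-byte default is an arbitrary choice)
commit_mail_body() {
    commit=$1
    limit=${2:-100000}
    git show --stat --patch \
        --format='From: %an <%ae>%nDate: %ad%nSubject: %s%n%n%b' \
        "$commit" | head -c "$limit"
}

# One possible wiring after a periodic fetch (old/new refs and the list
# address are placeholders):
#   git rev-list --reverse "$old..$new" | while read -r sha; do
#       commit_mail_body "$sha" | mail -s "commit $sha" commits@example.org
#   done
```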

view this post on Zulip Sean (Mar 24 2021 at 03:11):

starseeker said:

Sean Right now I've got the "check" target building on the runners, but that's not going to succeed reliably due to the threading issues - looks like regress-gqa is failing some of the time on the OSX runner. Should I disable the check portion of the test until we have an expectation it can reliably run?

Yes, advisory in the meantime.

view this post on Zulip Sean (Mar 24 2021 at 03:52):

For the ChangeLog, we can start without it. I think it will be good to include one in future source tarballs, though I don't think it matters so much what tool generates it. Including more than the git 1-liner would be essential, but a git log of all changes since last release would probably be adequate.

At a glance, looks like there are a couple that wrap git log, and looks like emacs can do it, or we can just sort out the magic needed to automatically extract commits since the previous release (a little tricky, but not terribly hard).
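
As a starting point for the "extract commits since the previous release" part, here is a minimal sketch. It assumes release tags are reachable from the branch being released and that `git describe` picking the nearest tag is good enough; the function name and output format are made up for illustration.

```shell
# Print full commit messages (not just the one-line subjects) for everything
# after the most recent tag reachable from the given revision.
# Usage: changes_since_last_tag [rev]
# (hypothetical helper name)
changes_since_last_tag() {
    head_rev=${1:-HEAD}
    # --abbrev=0 makes describe print only the nearest tag name.
    last_tag=$(git describe --tags --abbrev=0 "$head_rev") || return 1
    git log --no-merges --date=short \
        --format='%ad  %an  (%h)%n%n%B' "$last_tag..$head_rev"
}
```

One catch: if the revision passed in is itself already tagged, the nearest tag is that same commit and the range comes out empty, so this would need to run before tagging or be pointed at the parent of the new tag.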

view this post on Zulip Sean (Mar 24 2021 at 03:53):

starseeker said:

OVA image (just barely) uploaded to Release on OVA repository.

How close was it to the file limit?

view this post on Zulip starseeker (Mar 24 2021 at 12:45):

The github file size limit is, IIRC, 2 gigs. Compressed, it was on the order of 1.8

view this post on Zulip Sean (Mar 25 2021 at 04:29):

I thought we determined that was a soft limit, not a hard one...

view this post on Zulip starseeker (Mar 25 2021 at 10:59):

Maybe... I don't recall for sure

view this post on Zulip starseeker (Mar 30 2021 at 02:01):

/me bemusedly wonders if @Sean is planning to announce the migration on April 1st...

view this post on Zulip Sean (Mar 30 2021 at 15:42):

Okay, I've sent out 16 invitations to add people to our list of members (i.e., people that have commit access to any repo). It's only a fraction of what we had on SourceForge, but it should be a good start.

view this post on Zulip Sean (Mar 30 2021 at 15:54):

@starseeker you also apparently lacked the admin bit on the brlcad repo and weren't a member of the dev team, which looks like why you couldn't add anyone.

view this post on Zulip starseeker (Mar 30 2021 at 15:54):

/me nods - that'll do it.

view this post on Zulip Sean (Mar 30 2021 at 15:55):

That's fixed and I've added the new repos to the existing teams

view this post on Zulip Sean (Mar 30 2021 at 15:55):

right now, being in a team pretty much gives full administrative control, so we may want to change that later, but that's essentially how it was on sourceforge

view this post on Zulip starseeker (Mar 30 2021 at 15:56):

Does github offer finer granularity?

view this post on Zulip Sean (Mar 30 2021 at 15:56):

oh heck yes, it's quite granular and with two separate layers

view this post on Zulip Sean (Mar 30 2021 at 15:57):

permissions are set on repos themselves or they're set on teams (which then have permissions attached to them) or they're set on members (which have permissions attached to them)

view this post on Zulip Sean (Mar 30 2021 at 15:57):

three layers I guess

view this post on Zulip starseeker (Mar 30 2021 at 15:58):

Wow - nifty!

view this post on Zulip Sean (Mar 30 2021 at 15:58):

so for example you were a member, which lets you create repos, but you weren't on the dev team, so you couldn't add people to brlcad

view this post on Zulip Sean (Mar 30 2021 at 15:59):

It looks like it's set up this way so you can have teams with admin access, teams without, all accessing some or not having access to other repos. It's not a strict hierarchy of permissions, it's more of a matrix.

view this post on Zulip starseeker (Mar 30 2021 at 16:00):

A bit complex to manage, but also potentially quite useful for preventing accidents and the like.

view this post on Zulip Sean (Mar 30 2021 at 16:53):

right now I just have two teams set up, devs and webdevs, with devs having all repos but only admin on the compiled-code repos, and webdevs having admin over the web-related repos including the website and web projects

view this post on Zulip Sean (Mar 31 2021 at 19:17):

@starseeker :( ... git log --follow src/conv/iges/g-iges.c

view this post on Zulip Sean (Mar 31 2021 at 19:40):

Looks like "nearly" everything in there has no traceability after the 25XXX converter movements. Looking at iges.h for example, it stops at 25521. Git log appears to have the other changes, for example if I git log --follow src/conv/iges/iges.h, it looks corrupted to me.

view this post on Zulip Sean (Mar 31 2021 at 19:41):

the last commit is shown as 0fe9bf30dc0f7980df6486014bb29567bec09a84 (r4502) which was a change to sig/i-a.c ... similarly 1cdf453b9d355b1a7fb10bea445ab18b262a0252 (r5920) was sig/u-a.c

view this post on Zulip Sean (Mar 31 2021 at 19:42):

the two commits before that seem to have nothing to do with sig and are other random commits

view this post on Zulip Sean (Mar 31 2021 at 19:46):

looks like it's not until 3408f5ba1220271623a90b3740eb43abe06a857a a dozen or so commits prior that it starts to get back on track

view this post on Zulip Sean (Mar 31 2021 at 19:50):

If I trace back commits in subversion, the last five on iges.h are r13453, r10561, r9487, r8144, r7715. Commit r13453 is 994dcc97ee6d9f60e670aa9a2ed110273920294c for example and r7715 is split across three commits: 317460fce22e6ba835a08bef126e2b75a123ee78
b9f6d30bd15f4c66ed5e7506877b6ae35c80ea06
eb458e30c765b2758097abc1cb5909422e050e90
so the commits are somewhere in the full history, I'm just not sure where. :(

view this post on Zulip Sean (Mar 31 2021 at 19:54):

/me hopes this is limited to conv/ or conv/iges and not all directory renames besides r22798... because there were a dozen or so others

view this post on Zulip Sean (Mar 31 2021 at 20:11):

looks like it got vfont correct, looks like it got src/external/Creo wrong ..

view this post on Zulip starseeker (Mar 31 2021 at 20:31):

For whatever reason, the --follow algorithm isn't finding the src/iges/iges.h file starting from src/conv/iges/iges.h. Looking at the gitk history, following the parent commits does get to the rename commit, so my initial guess is that it's not data corruption per se but a limitation of the implementation of --follow (which apparently has some issues...)

view this post on Zulip Sean (Mar 31 2021 at 20:34):

That doesn't add up though -- it lists some older commits on some files, commits that have absolutely nothing to do with that directory entirely.

view this post on Zulip Sean (Mar 31 2021 at 20:34):

like the sig/ files

view this post on Zulip starseeker (Mar 31 2021 at 20:34):

If I'm reading this right, git's interpretation (or cvs-fast-export's, at any rate) was that r25518 removed the iges files rather than moving them, 25519 and 25520 were then committed, and 25521 added the iges files back in.

view this post on Zulip starseeker (Mar 31 2021 at 20:35):

that may be what breaks the --follow chain

view this post on Zulip starseeker (Mar 31 2021 at 20:36):

I don't get any --follow output past 25521

view this post on Zulip starseeker (Mar 31 2021 at 20:37):

Your --follow is giving you bogus commits prior to 25521 with follow on iges.h?

view this post on Zulip starseeker (Mar 31 2021 at 20:40):

git log --full-history -- "**/iges.h" may be useful here

view this post on Zulip Sean (Mar 31 2021 at 20:40):

git log --follow src/conv/iges/iges.h

view this post on Zulip Sean (Mar 31 2021 at 20:40):

The last dozen or two commits have nothing to do with iges.h

view this post on Zulip starseeker (Mar 31 2021 at 20:43):

Are they empty commits?

view this post on Zulip Sean (Mar 31 2021 at 20:44):

starseeker said:

git log --full-history -- "**/iges.h" may be useful here

Doesn't that just mean that the history is attached somewhere? That much is already confirmed, the commits exist in the history, just seemingly not where they should be. Like, where is r13453? What file can I do a log on to find it? (Inclined to see if it's attached to some other random file like the u-a.c commit.)

view this post on Zulip Sean (Mar 31 2021 at 20:45):

starseeker said:

Are they empty commits?

Definitely not, they're genuine changes to other files not even related to src/conv in any way.

view this post on Zulip Sean (Mar 31 2021 at 20:46):

git show 0fe9bf30dc0f7980df6486014bb29567bec09a84 ... it says that was the first commit to iges.h in that location (sans follow)

view this post on Zulip starseeker (Mar 31 2021 at 20:48):

The parent commit of 25521 is 3408f5ba1220 (25520) which is an empty commit as far as iges.h is concerned (and iges.h doesn't exist in the tree at that point.) That may break the --follow chain, but I'm not clear yet on why --follow is reporting anything else before src/conv/iges/ iges.h in that case

view this post on Zulip starseeker (Mar 31 2021 at 20:48):

Wonder if this is related somehow? https://blog.plover.com/prog/git-log-follow.html

view this post on Zulip starseeker (Mar 31 2021 at 21:01):

Commits back through 22798 in the follow history do have changes that pertain to iges.h, from the looks of things.

view this post on Zulip starseeker (Mar 31 2021 at 21:02):

It goes off the rails from 22798 to 22606.

view this post on Zulip Sean (Mar 31 2021 at 21:02):

some do in a general sense, like license header updates, others not so much

view this post on Zulip Sean (Mar 31 2021 at 21:04):

I haven't been able to find the iges/iges.h history which had several dozen commits prior to the move around 25520

view this post on Zulip starseeker (Mar 31 2021 at 21:22):

I'm not seeing several dozen? Here's what I can find for iges.h: iges_h_svn.txt

view this post on Zulip starseeker (Mar 31 2021 at 21:29):

@Sean I agree git log --follow is going off the rails in a bizarre way, but if I diff the svn commits and those found by git log -- "**/iges.h" the delta is pretty small:

--- svnrevs.txt 2021-03-31 17:19:14.593937412 -0400
+++ gitrevs.txt 2021-03-31 17:19:50.609358451 -0400
@@ -18,12 +18,13 @@
 27341
 26074
 25521
+25518
 23807
 23633
 23577
+22839
 22798
 13453
-10561
 9487
 8144
 7715

view this post on Zulip starseeker (Mar 31 2021 at 21:36):

gitk "**/iges.h" is also useful...

view this post on Zulip starseeker (Mar 31 2021 at 21:37):

bbl

view this post on Zulip starseeker (Mar 31 2021 at 22:11):

Discussion of how --follow works: https://stackoverflow.com/a/43960010/2037687

view this post on Zulip Sean (Apr 01 2021 at 04:25):

starseeker said:

I'm not seeing several dozen? Here's what I can find for iges.h: iges_h_svn.txt

Sorry, I meant iges.c for that one -- I was trying to find its full history the same way and can't get it to report the 30 commits prior to it getting moved around even with git log --full-history -- **/iges.c

view this post on Zulip Sean (Apr 01 2021 at 04:26):

Comparing against: svn log svn+ssh://brlcad@svn.code.sf.net/p/brlcad/code/brlcad/trunk/iges/iges.c@22500 | grep '^r'

view this post on Zulip Sean (Apr 01 2021 at 04:31):

How can I manually traverse the actual history on the git side? In svn, one would see a log stops at r12345, then one pulls log on a path mentioned in the comment at a few revs prior (e.g., r12340), and repeat as needed. If it wasn't mentioned in a comment, one can still pull the tree at r12340, find the file, then continue the log on it.

view this post on Zulip Sean (Apr 01 2021 at 04:32):

in general, that system works even if the file was renamed.

view this post on Zulip Sean (Apr 01 2021 at 04:35):

I mean, I can think of a really brute force way, checking out the sha prior (-1), but what's the right way?

view this post on Zulip Sean (Apr 01 2021 at 05:00):

Relying on "git log -- **/file" feels inadequate in the general case because it 1) only works if the file wasn't renamed, 2) can erroneously catch other same-named files (good luck tracking a subdir README that moved..), and 3) doesn't seem to help figure out where the commit exists..only that it exists.

view this post on Zulip Sean (Apr 01 2021 at 05:01):

Any idea what happened with ProEngineer? It seems to similarly have lost track. I didn't check the others.

view this post on Zulip starseeker (Apr 01 2021 at 11:28):

Sean said:

Comparing against: svn log svn+ssh://brlcad@svn.code.sf.net/p/brlcad/code/brlcad/trunk/iges/iges.c@22500 | grep '^r'

So if I do the following: git log -- "**/iges.c"|grep svn:revision|awk -F':' '{print $3}' the last few returns are:

22367
21028
20508
19942
19550
19335
19139
19131
18043
17500
16912
13453
12989
11582
9951
9831
9693
9487
9283
9227
9221
9133
9080
8573
8144
8129
7790
7716
7715

view this post on Zulip starseeker (Apr 01 2021 at 11:30):

With SVN svn log https://svn.code.sf.net/p/brlcad/code/brlcad/trunk/iges/iges.c@22500 | grep '^r'|awk '{print $1}'|sed 's/r//' I get:

22367
21028
20508
19942
19550
19335
19139
19131
18043
17500
16912
13453
12989
11582
10561
9951
9831
9693
9487
9283
9227
9221
9133
9080
8573
8144
8129
7790
7716
7715

view this post on Zulip starseeker (Apr 01 2021 at 11:31):

r10561 is the only one missing from Git, and that's expected as it was an SVN property change.
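
For repeating that comparison on other files, a small sketch: it assumes the two pipelines above have been saved to files with one revision number per line (svnrevs.txt and gitrevs.txt are the names used in the earlier diff). The function name is made up.

```shell
# Print revision numbers that appear in the first list but not the second.
# Both arguments are files with one revision number per line.
# Usage: revs_missing_from <svn-revs-file> <git-revs-file>
# (hypothetical helper name)
revs_missing_from() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm needs both inputs sorted the same way; -u also drops duplicates.
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    # -23 suppresses "only in second" and "in both", leaving "only in first".
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Swapping the arguments shows the other direction, i.e. revisions git reports that the svn log did not.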

view this post on Zulip starseeker (Apr 01 2021 at 11:42):

Sean said:

How can I manually traverse the actual history on the git side? In svn, one would see a log stops at r12345, then one pulls log on a path mentioned in the comment at a few revs prior (e.g., r12340), and repeat as needed. If it wasn't mentioned in a comment, one can still pull the tree at r12340, find the file, then continue the log on it.

In that situation what I would usually do is bring up gitk (or maybe gitk --all) and go to the last known relevant commit, then browse my way back up the history.

Following file history in Git is a known weak point (
https://stackoverflow.com/questions/5743739/how-to-really-show-logs-of-renamed-files-with-git)

IMHO not tracking file moves was a mistake, since it fundamentally limits what you can successfully pull out of the history in cases like this.

view this post on Zulip starseeker (Apr 01 2021 at 11:51):

git log --follow and variations on git log -- "**/filename" are the best answers I'm currently aware of, but I'll keep my eyes peeled for better ones.

view this post on Zulip starseeker (Apr 01 2021 at 12:06):

Sean said:

Any idea what happened with ProEngineer? It seems to similarly have lost track. I didn't check the others.

If I'm interpreting 69329 correctly, the CREO directory was added while the ProEngineer directory was still present.

view this post on Zulip starseeker (Apr 01 2021 at 12:07):

That may be why it's not following Creo back into ProEngineer - it wasn't a folder rename.

view this post on Zulip starseeker (Apr 01 2021 at 12:09):

gitk's blame feature might be slightly better in some cases at following changes back, since some of the comments I've seen seem to suggest it's using a more powerful search mechanism than the --follow option...

view this post on Zulip Sean (Apr 01 2021 at 21:30):

Woah, okay, hah... huge difference between:

git log -- **/iges.c

and

git log -- "**/iges.c"

... I'd missed quoting the glob, so it was only matching src/conv/iges/iges.c history.
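
A small self-contained illustration of why the quoting matters, using a scratch directory rather than the real tree (the layout here is made up, and only one directory level deep so it behaves the same whether or not the shell treats `**` recursively). It uses printf in place of git just to show what arguments the command actually receives.

```shell
# Scratch directory with a single iges.c, standing in for the working tree.
demo=$(mktemp -d)
mkdir -p "$demo/iges"
touch "$demo/iges/iges.c"
cd "$demo"

# Unquoted: the shell expands the glob against files that exist right now,
# so the command only ever sees current working-tree paths.
unquoted=$(printf '%s\n' **/iges.c)

# Quoted: the pattern reaches the command untouched, which is what lets git
# match it against every path in history, including long-gone directories.
quoted=$(printf '%s\n' "**/iges.c")

echo "unquoted -> $unquoted"
echo "quoted   -> $quoted"
```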

view this post on Zulip Sean (Apr 01 2021 at 21:31):

With or without --full-history/--all/etc that was what was causing me grief.

view this post on Zulip Sean (Apr 01 2021 at 21:35):

starseeker said:

Sean said:
In that situation what I would usually do is bring up gitk (or maybe gitk --all) and go to the last known relevant commit, then browse my way back up the history.

Er, that's rather error prone I'd think, trying to follow a text line potentially next to a half dozen other | lines, scrolling up for pages, maybe 10k commits back. Still that's also only good in GUI mode -- I'm looking for something lower-level that will work even when I'm remote in a console. I mean is "git log -1 sha" where the gitk line connects up to? Or is it sha^! or something else?

view this post on Zulip starseeker (Apr 01 2021 at 23:12):

Maybe I'm not quite following what you're after... Do you mean something like the following?:

For commit 22798:

$ git log -1 a1e49c
commit a1e49c5edbb4df8eb10f7ae014ae6efeb12fc966
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Thu May 20 15:22:02 2004 +0000

    Vast reorganization begins.  Sources moved from top-level directories into src/.

    svn:revision:22798
    cvs:account:morrison
    cvs:branch:trunk

If I want info about the immediately preceding commit:

$ git log -1 a1e49c~1
commit be1f3137808b681347a7665a05049911c55166a1
Author: Christopher Sean Morrison <brlcad@gmail.com>
Date:   Thu May 20 14:54:22 2004 +0000

    Sources that are external to BRL-CAD are moved from the top level to src/other/.

    svn:revision:22797
    cvs:account:morrison
    cvs:branch:trunk

I can then get (for example) a top level view of the tree at that previous revision:

$ git ls-tree a1e49c~1
100644 blob cf056985dbd9086d3db465d486471e1e4ec5427f    .gitignore
100644 blob 20214282c2426fcf91b0cd7635598aedb1ae06a7    AUTHORS
100644 blob c750a69b34d9c6cd4a914966f104176d00edf5f4    BUGS
100644 blob 1df557054723c07a38dc07014966a80bf024fdbc    COPYING
...

If I want to see into a subdirectory:

$ git ls-tree a1e49c~1 iges/
100644 blob 2ce043e0fec731921623a30b66a61350a6ca8f28    iges/Makefile.am
100644 blob 77a5bc69e89aae54230f3594a29345b4a6210c43    iges/add_face.c
100644 blob de7c126c87da7202a0fff25b39915c1605b6624e    iges/add_inner_shell.c
...

view this post on Zulip starseeker (Apr 01 2021 at 23:14):

Or look recursively for a specific path:

$ git ls-tree -r a1e49c~1 |grep /iges\\.c
100644 blob 3cc309a9a5cc94b19ac1ffcda9f4a1204f889bbc    iges/iges.c

view this post on Zulip starseeker (Apr 01 2021 at 23:17):

To follow back up the parent-child chain starting from that commit, I can just pull a local log:

$ git log --oneline -10 a1e49c
a1e49c5edb Vast reorganization begins.  Sources moved from top-level directories into src/.
be1f313780 Sources that are external to BRL-CAD are moved from the top level to src/other/.
4440f1c095 Sources that are external to BRL-CAD are moved from the top level to src/other/.
fa32f6950a The old regression test scripts are being replaced by something else.  Likely it'll be Corredor with some unit test framework.  The old scripts are so far out of sync and so inadequate that it's simply not worth it any more.
074785b939 moved from html/ to doc/html/
4e5eaaaa87 s/.doc/.tr/
b51a0ee5e9 renamed .doc files to .tr since they are [tng]roff files
40e36bc94e old nmake visual studio file no longer exists
679e068d94 cake is no more and theres no incentive to maintain it any more so .. buh bye.
29ba93efce rename the text files from .doc to a .txt extension.  reserve .doc extension for groff files

view this post on Zulip starseeker (Apr 01 2021 at 23:33):

Other cute tricks... this finds all the file paths that had the file name TODO:

$ git log --all --name-only --pretty=format:"" "**/TODO" |sort|uniq

doc/docbook/resources/other/standard/xsl/TODO
doc/docbook/resources/standard/xsl/TODO
doc/docbook/system/man3/en/TODO
doc/docbook/system/man3/TODO
libitcl3.2/TODO
libitcl/TODO
libpng/TODO
misc/d-bindings/TODO
misc/tools/astyle/TODO
misc/tools/svn2cl/TODO
src/archer/TODO
src/libdm/TODO
src/libged/TODO
src/libicv/TODO
src/libpc/TODO
src/other/blt/src/TODO
src/other/ext/stepcode/TODO
src/other/ext/tcl/compat/zlib/contrib/iostream3/TODO
src/other/ext/tcl/pkgs/itcl4.2.0/TODO
src/other/ext/tcl/pkgs/tdbcpostgres1.1.1/TODO
src/other/flex/TODO
src/other/freetype/docs/TODO
src/other/incrTcl/itcl/TODO
src/other/incrTcl/itk/TODO
src/other/incrTcl/TODO
src/other/libitcl/TODO
src/other/libnetpbm/TODO
src/other/libpng/TODO
src/other/libz/contrib/iostream3/TODO
src/other/openscenegraph/TODO
src/other/stepcode/TODO
src/other/step/TODO
src/other/tcl/compat/zlib/contrib/iostream3/TODO
src/other/tcl/pkgs/itcl4.0.4/TODO
src/other/tcl/pkgs/itcl4.2.0/TODO
src/other/tcl/pkgs/tdbcpostgres1.1.1/TODO
src/other/uuid/TODO
src/qbrlcad/TODO
src/qged/TODO
src/superbuild/stepcode/TODO
src/superbuild/tcl/compat/zlib/contrib/iostream3/TODO
src/superbuild/tcl/pkgs/itcl4.2.0/TODO
src/superbuild/tcl/pkgs/tdbcpostgres1.1.1/TODO
src/tclscripts/checker/TODO

view this post on Zulip starseeker (Apr 01 2021 at 23:52):

If you know something about the contents, you can use git grep - for example, if I think the historical version of "iges.c" that I'm looking for has the string "Code to support the g-iges converter" in it but I don't know if the file name changed, I can do the following to grep for it back 5 commits:

$ git grep "Code to support the g-iges converter" $(git log -5 --pretty=format:"%H" 3408f5ba122027)
3408f5ba1220271623a90b3740eb43abe06a857a:src/conv/iges/iges.c: *  Code to support the g-iges converter
90f783ca790a5a2f7d176c1b9c0a5eba4c880927:src/iges/iges.c: *  Code to support the g-iges converter
f89fb406daf8348bf215ed96f115bdcf9bbd072c:src/iges/iges.c: *  Code to support the g-iges converter
b6414214c3cdd7e883be1d5f3cd19f9102deb9ec:src/iges/iges.c: *  Code to support the g-iges converter

Notice only 4 commits reported matching that content. If we look at the straight log for 5 commits from that point:

$ git log --oneline -5 3408f5ba122
3408f5ba12 (HEAD) moved all the geometry converter directories from src/. to src/conv/.
48a6bed946 a single iges file didn't make it for some bizzare reason, manually move from src/iges to src/conv/iges
90f783ca79 iges converter moved
f89fb406da moved all the geometry converter directories from src/. to src/conv/.
b6414214c3 formatting, spelling, reference the tasker too

Commit 48a6bed946's tree does not have a file (by any name) matching that string.

view this post on Zulip starseeker (Apr 02 2021 at 00:24):

iges.h has similar results, being missing from 3 commits (this is from a checkout of 3408f5ba12):

$ git grep "I G E S . H" $(git log -5 --pretty=format:"%H")
3408f5ba1220271623a90b3740eb43abe06a857a:src/conv/iges/iges.h:/*                          I G E S . H
b6414214c3cdd7e883be1d5f3cd19f9102deb9ec:src/iges/iges.h:/*                          I G E S . H
$ git log --oneline -5
3408f5ba12 (HEAD) moved all the geometry converter directories from src/. to src/conv/.
48a6bed946 a single iges file didn't make it for some bizzare reason, manually move from src/iges to src/conv/iges
90f783ca79 iges converter moved
f89fb406da moved all the geometry converter directories from src/. to src/conv/.
b6414214c3 formatting, spelling, reference the tasker too

view this post on Zulip starseeker (Apr 02 2021 at 00:34):

I'm still not sure why git log --follow pulls in 23649 for iges.h when doing the src/conv/iges/iges.h path search - it's clearly wrong. However, if I check out the first commit that does have the iges.h contents again (b6414214c3) git log --follow looks like it can go the rest of the way successfully.

view this post on Zulip starseeker (Apr 02 2021 at 00:36):

@Sean My thinking is it's more likely we found a bug in git log --follow than in the repo data...

view this post on Zulip starseeker (Apr 04 2021 at 16:44):

@Sean did we want to set the BRL-CAD github org's icon to the BRL-CAD logo? Right now it's just one of the generic Github images...

view this post on Zulip Sean (Apr 05 2021 at 19:26):

That's helpful. Looks like ls-tree in combo with a couple other commands can help me walk it back.
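
One way the ls-tree/log combination could be strung together for console use, as a sketch only: it assumes the file keeps its base name across moves and simply takes the first same-named file it finds in the parent tree, so it will be fooled by things like a README that exists in many directories. The function name is made up.

```shell
# Walk a file's history backwards across directory moves, one path at a time.
# Prints "<sha> <path>" for the commit that introduced each path.
# Usage: walk_back <path> [start-rev]
# (hypothetical helper name)
walk_back() {
    path=$1
    start=${2:-HEAD}
    name=$(basename "$path")
    while [ -n "$path" ]; do
        # Oldest commit reachable from $start that touched this exact path.
        first=$(git log --format=%H "$start" -- "$path" | tail -n 1)
        [ -n "$first" ] || break
        echo "$first $path"
        # A root commit has no parent to step back into.
        git rev-parse -q --verify "$first^" >/dev/null || break
        start="$first^"
        # Look for a file with the same base name in the parent's tree.
        path=$(git ls-tree -r --name-only "$start" |
               awk -F/ -v n="$name" '$NF == n' | head -n 1)
    done
}
```

If the file is absent from the parent tree entirely (as with the iges files between the remove and the re-add), the walk stops there and the next step back has to be picked by hand, for example with the git grep trick above.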

view this post on Zulip Sean (Apr 05 2021 at 19:28):

starseeker said:

Sean did we want to set the BRL-CAD github org's icon to the BRL-CAD logo? Right now it's just one of the generic Github images...

Yep, good idea. Hadn't gotten to cosmetic yet.

Really needing commits with diffs... but apparently that's going to require some customization. Will have to live with links to the changes for now.

view this post on Zulip Himanshu (Jun 14 2021 at 14:46):

I just saw some other organizations in GitHub and they have verified tag. We can have it too right?

view this post on Zulip Sean (Jun 14 2021 at 15:56):

Later, sure. Not a priority right now.

view this post on Zulip Armin (LordOfBikes) (Jun 15 2021 at 21:40):

The verified tag is the committers responsibility.
Commits, made online by GitHub web interface, are verified with a GitHub key automatically.
Committer with push access have to set up a GPG key to their GitHub account and sign local commits using this key.
Then commits are verified by the developers key. A click on the verified tag shows the key owner.
See https://docs.github.com/en/github/authenticating-to-github/managing-commit-signature-verification
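
On the local side, the git settings involved look roughly like the sketch below. It assumes a GPG key already exists and has been added to the GitHub account; the function name is made up, and the key ID passed in is whatever `gpg --list-secret-keys` reports for that key.

```shell
# Turn on commit signing for the current repository only.
# Usage: enable_commit_signing <gpg-key-id>
# (hypothetical helper name; run it inside a clone)
enable_commit_signing() {
    key_id=$1
    [ -n "$key_id" ] || { echo "usage: enable_commit_signing <gpg-key-id>" >&2; return 1; }
    git config user.signingkey "$key_id"   # which key to sign with
    git config commit.gpgsign true         # sign every commit by default
}
```

Without the second setting, individual commits can still be signed by passing `-S` to `git commit`.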

view this post on Zulip starseeker (Jun 24 2021 at 15:41):

I added the new repository location to OpenHub and removed the older ones: https://www.openhub.net/p/brlcad/enlistments

So far the new repo hasn't "taken" for analysis - it tried to pull it down last night, but towards the end of the processing this morning something must have gone wrong. I added some paths to the ignore files (src/other, etc.) - that may help it complete successfully. Fingers crossed...

view this post on Zulip starseeker (Jun 24 2021 at 17:19):

There we go - 15 hour delay updating. BRL-CAD's OpenHub page is back!

view this post on Zulip Sean (Jun 25 2021 at 04:58):

That's really great to have our stats back online fully. Awesome!

view this post on Zulip Erik (Jul 16 2021 at 14:41):

is it, like, really really official, or still moving bits into place? :D

view this post on Zulip starseeker (Jul 16 2021 at 22:44):

It's official from my perspective - the SVN repo is frozen and all dev activity is now on github

view this post on Zulip starseeker (Jul 16 2021 at 22:45):

There's still a lot of polishing to do on the site - get our logo up, see if we can migrate the sf metadata (patches, bug reports, etc.) somehow, etc. But Github is now the active development center.

view this post on Zulip Erik (Jul 17 2021 at 00:19):

rock on, now we're all moving to ... :D

view this post on Zulip Erik (Jul 17 2021 at 00:21):

grats on accomplishing such a mega-effort

view this post on Zulip starseeker (Jul 17 2021 at 00:23):

thanks :-). It's satisfying to have it complete, although I'm still finding myself in the "confound it, why doesn't git record file moves" camp

view this post on Zulip starseeker (Jul 17 2021 at 00:24):

The CI testing has been Really Useful though - it's already caught me a number of times.

view this post on Zulip starseeker (Jul 17 2021 at 00:25):

I tried turning on CodeQL to see what happens - early signs suggest we may be too big a bite for that setup to handle.

view this post on Zulip Erik (Jul 17 2021 at 00:37):

blehhhh, ci/cd stacks, that's my life lately. On software that takes 40 minutes on a 64 core (128 hyperthread) machine to compile and, uh, a test sys that is heavy enough that it'd cost ~$500 on aws to run once and has a minimum 10 hour turnaround... I hear ya on the pain of bein' too big :D

view this post on Zulip starseeker (Jul 17 2021 at 00:46):

I had already evolved a script to target the clang static analyzer at our core libs selectively, but I figured that would be a local machine only affair. However, I found some examples recently which suggested it might actually be possible to install the necessary pieces on the runner to set that up as a github action. I'm letting CodeQL run a bit to see what happens, but I wouldn't be surprised to see it time out without finishing.

view this post on Zulip starseeker (Jul 17 2021 at 00:48):

The static analyzer script looks like it may be able to complete in about an hour, which isn't too bad.

view this post on Zulip starseeker (Jul 17 2021 at 00:48):

We're deliberately building serially in order to minimize stress on the I/O subsystem - I pushed it harder in some early tests and had a few cases where file writes didn't complete properly.

view this post on Zulip Erik (Jul 17 2021 at 00:52):

ya'll should get a lil nvme raid with one of them melly-nox connectx5's :D beastly i/o pair

view this post on Zulip Erik (Jul 17 2021 at 00:53):

(if the file writes didn't complete properly, either there're kernel bugs or your writer doesn't check return values and bitbuckets data when the buffers are full)

view this post on Zulip starseeker (Jul 17 2021 at 00:56):

I'm not sure what sort of backend system the Actions setup is using for its runners, so I can't say for sure.

view this post on Zulip starseeker (Jul 17 2021 at 00:57):

So far at least none of the issues we've hit is anything like that sourceforge failure that led to the duplicate SVN commit id crisis (knock on wood)

view this post on Zulip starseeker (Jul 17 2021 at 00:58):

Usually when that sort of thing happens I suspect another parallel compilation bug, but in this case it was a single .c file that failed to build - not much opportunity there for parallel issues...

view this post on Zulip starseeker (Jul 17 2021 at 00:59):

https://github.com/actions/runner/issues/718

view this post on Zulip Sumagna Das (Apr 02 2022 at 07:47):

@Sean @starseeker i am thinking about trying to migrate the bugs to start getting back to work......and while migrating look at the bugs i can try to fix as starters to getting to know the code

view this post on Zulip Sumagna Das (Apr 02 2022 at 07:47):

where can i start?

view this post on Zulip starseeker (Apr 02 2022 at 22:16):

@Sumagna Das You'll want to check with @Sean on that one - I know he has some thoughts about migrating SF data

view this post on Zulip Sumagna Das (Apr 03 2022 at 05:30):

@starseeker meanwhile....can I transfer the bugs and to-dos from the 2 files?

view this post on Zulip Sean (Apr 03 2022 at 19:16):

@Sumagna Das getting started with bug migration sounds great!

view this post on Zulip Sean (Apr 03 2022 at 19:17):

were you thinking the BUGS file? I wouldn't migrate those to github issues without first confirming that they are still issues. The BUGS file is intended to be for devs to leave notes on issues that may or may not be user visible, may or may not be fixed, may or may not be opinions on design, etc. It's great for finding things to work on, but I wouldn't necessarily think we want to elevate all of them to a github "issue".

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:17):

Sean said:

Sumagna Das getting started with bug migration sounds great!

well right now my target is the already present BUGS and TODO files....after that i will try the online issues

view this post on Zulip Sean (Apr 03 2022 at 19:17):

A better starting point would be to look at the bugs reported at http://sourceforge.net/p/brlcad/bugs/ ... those could all be migrated automatically or manually

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:19):

Sean said:

were you thinking the BUGS file? I wouldn't migrate those to github issues without first confirming that they are still issues. The BUGS file is intended to be for devs to leave notes on issues that may or may not be user visible, may or may not be fixed, may or may not be opinions on design, etc. It's great for finding things to work on, but I wouldn't necessarily think we want to elevate all of them to a github "issue".

should i try the todo file ?

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:20):

Sean said:

A better starting point would be to look at the bugs reported at http://sourceforge.net/p/brlcad/bugs/ ... those could all be migrated automatically or manually

well i was going to try the sf2github script but it needs the bugs.json file to start which i dont have

view this post on Zulip Sean (Apr 03 2022 at 19:20):

there are 126 bugs listed on sf.net, 67 feature requests on sf.net, 51 support requests, 4 geometry, and 214 patches. there's about 166 entries in the BUGS file and 492 ideas in the TODO file. :)

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:22):

Sean said:

there are 126 bugs listed on sf.net, 67 feature requests on sf.net, 51 support requests, 4 geometry, and 214 patches. there's about 166 entries in the BUGS file and 492 ideas in the TODO file. :smile:

that was fast

view this post on Zulip Sean (Apr 03 2022 at 19:22):

I mean it all depends on what interests you. working on any of those will be helpful!

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:23):

anyways i saw that the sf2github script is not updated but i can fix it to work as per our need i think

view this post on Zulip Sean (Apr 03 2022 at 19:23):

personally, I'd probably start with the smallest (geometry) and next smallest (support requests), etc just because I like to shorten lists.

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:23):

Sean said:

personally, I'd probably start with the smallest (geometry) and next smallest (support requests), etc just because I like to shorten lists.

thats not a bad idea actually

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:24):

right now i was trying to parse the TODO file....should i continue with it or start doing the sf requests?

view this post on Zulip Sumagna Das (Apr 03 2022 at 19:26):

Sean said:

personally, I'd probably start with the smallest (geometry) and next smallest (support requests), etc just because I like to shorten lists.

to start with this i think i need the bugs.json file or something like that

view this post on Zulip Sean (Apr 03 2022 at 19:30):

well, I meant actually address the item, not really migrate it -- or migrate it manually (copy-paste and link to the sf item)

view this post on Zulip Sean (Apr 03 2022 at 19:30):

I can look into generating the .json file -- there's a script I have to run as admin, I believe

view this post on Zulip Sean (Apr 03 2022 at 19:31):

alternatively, could just look through the list of bugs in BUGS like you'd said and find one you think you understand -- then add it to issues, then work on it ;)

view this post on Zulip starseeker (Apr 03 2022 at 20:04):

Just as an observation - the BUGS and TODO files, by virtue of being part of the repo, are already preserved on Github. The data in the Sourceforge systems isn't migrated at all, so from a data preservation standpoint it's the data we don't have migrated at all, in any form.

view this post on Zulip starseeker (Apr 03 2022 at 20:06):

For the SF data, my thinking (again for what it's worth) is that it's probably worth migrating them by hand, and doing some checking to see if the original issue is still valid for the current codebase. The end result would be a better set of issues than just a mechanical migration.

view this post on Zulip Sean (Apr 03 2022 at 20:36):

Yeah definitely would be most valuable to have someone manually migrate and validate sf tracker items.

view this post on Zulip Sean (Apr 03 2022 at 20:39):

That’s where I’d probably start with the geometry because there’s just four of them and they could easily turn into four pull requests for new sample geom. iirc they just needed docs and some minor cleanup like making sure top level object name made sense, minimal overlaps, make sure title is set, etc

view this post on Zulip Sumagna Das (Apr 03 2022 at 21:22):

so i tried pulling all of the tickets through the SF api.....one thing i have to know is that there are a few tickets with attachments, right?

view this post on Zulip Sumagna Das (Apr 03 2022 at 21:25):

starseeker said:

For the SF data, my thinking (again for what it's worth) is that it's probably worth migrating them by hand, and doing some checking to see if the original issue is still valid for the current codebase. The end result would be a better set of issues than just a mechanical migration.

if manual checking is needed then i can try putting all of the tickets i got through API into a text file and then manually checking the needed ones?

view this post on Zulip Sean (Apr 03 2022 at 22:50):

There are a lot of tickets with attachments (especially the patches and geometry trackers), but not so much for the feature and support request trackers.

view this post on Zulip Sean (Apr 03 2022 at 22:51):

Sumagna Das said:

starseeker said:
if manual checking is needed then i can try putting all of the tickets i got through API into a text file and then manually checking the needed ones?

Yes, that would definitely work and be helpful! Any trackers that are still relevant could be manually submitted as a gh issue or pr (in the case of the patches and geometry).

view this post on Zulip Sumagna Das (Apr 04 2022 at 06:52):

Sean said:

There are a lot of tickets with attachments (especially the patches and geometry trackers), but not so much for the feature and support request trackers.

i am giving only the urls of the attachments because nothing else can be gotten from the API

view this post on Zulip Sumagna Das (Apr 04 2022 at 06:55):

Sean said:

Sumagna Das said:

starseeker said:
if manual checking is needed then i can try putting all of the tickets i got through API into a text file and then manually checking the needed ones?

Yes, that would definitely work and be helpful! Any trackers that are still relevant could be manually submitted as a gh issue or pr (in the case of the patches and geometry).

i will make a text file for an intermediate place for the tickets then......after the manual checking, the text file can again be parsed and then put onto github if that works

view this post on Zulip Sumagna Das (Apr 04 2022 at 08:06):

bugs
feature-requests
support-requests
geometry

these files contain tickets with their information i got from the sourceforge API.....if this works, then i can make a parser which will parse the checked tickets and get it into github as issues
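The intermediate text file and the later parsing step could look roughly like the Python sketch below. It is an illustration only, not the format of the attached files: the `[ ]` / `[x]` review marker, the field names `ticket_num`, `summary`, and `attachments`, and both function names are assumptions made for this example. It assumes the tickets have already been fetched and that summaries fit on one line.

```python
def dump_tickets(tickets):
    """Render tickets as review blocks.

    A reviewer changes '[ ]' to '[x]' on tickets that are still relevant.
    Field names here are hypothetical, not the SourceForge schema.
    """
    blocks = []
    for ticket in tickets:
        lines = [f"[ ] #{ticket['ticket_num']} {ticket['summary']}"]
        for url in ticket.get("attachments", []):
            lines.append(f"    attachment: {url}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def parse_checked(text):
    """Return only the tickets whose block was marked '[x]'."""
    kept = []
    current = None
    for line in text.splitlines():
        if line.startswith("[x] #") or line.startswith("[ ] #"):
            # A header line starts a new ticket block.
            checked = line.startswith("[x]")
            number, _, summary = line[5:].partition(" ")
            current = None
            if checked:
                current = {"ticket_num": int(number),
                           "summary": summary,
                           "attachments": []}
                kept.append(current)
        elif current is not None and line.strip().startswith("attachment: "):
            current["attachments"].append(line.strip()[len("attachment: "):])
    return kept
```

Only the tickets returned by `parse_checked` would then be submitted as GitHub issues; unchecked ones stay in the text file for later review.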

view this post on Zulip Sean (Apr 04 2022 at 20:05):

@Sumagna Das that sounds good, but I don't want to cause you work if there's a tool I can run as admin to migrate everything -- what about this: https://github.com/cmungall/gosf2github ?

view this post on Zulip Sumagna Das (Apr 05 2022 at 05:54):

Sean said:

Sumagna Das that sounds good, but I don't want to cause you work if there's a tool I can run as admin to migrate everything -- what about this: https://github.com/cmungall/gosf2github ?

wait.....there was an updated tool....the last time i checked there were no updated tools for this

view this post on Zulip Sumagna Das (Apr 05 2022 at 05:54):

gotta look out for stuff

view this post on Zulip Sean (Apr 05 2022 at 15:10):

@Sumagna Das there's no mention whether that tool does anything with file uploads, but I was going to test it out on the geometry tracker since it's so small.. If it goes bad, probably won't be hard to clean up after it.

view this post on Zulip Sumagna Das (Apr 06 2022 at 05:21):

Sean said:

Sumagna Das there's no mention whether that tool does anything with file uploads, but I was going to test it out on the geometry tracker since it's so small.. If it goes bad, probably won't be hard to clean up after it.

geometry tracker doesnt have any attachments and its small so not a problem i guess

view this post on Zulip Sean (Apr 06 2022 at 06:37):

@Sumagna Das the geometry tracker does have attachments... they're in the comments

view this post on Zulip Sean (Apr 06 2022 at 06:37):

that's its whole point, they're people submitting geometry models (.g files)

view this post on Zulip Sumagna Das (Apr 06 2022 at 14:44):

Sean said:

that's its whole point, they're people submitting geometry models (.g files)

i can do something about that i think

view this post on Zulip Sumagna Das (Apr 06 2022 at 14:45):

the SF API supports providing the discussion (posts) as well as its uploads/attachments via requests i think

view this post on Zulip Sean (May 02 2022 at 13:27):

Profanity aside, this is actually a really useful reference for common git issues: https://ohshitgit.com

view this post on Zulip Sean (Dec 02 2023 at 21:40):

This looks fun… https://github.com/AmrDeveloper/GQL

view this post on Zulip starseeker (Dec 02 2023 at 22:20):

Huh, interesting. Certainly feels like it should be useful for some sort of repo report generation

view this post on Zulip Alexis Naveros (Dec 02 2023 at 23:58):

Hey Sean, it has been years, how's everything? I have received the "ok" from Mark to work fewer hours to do that point cloud thing. I'm planning the algorithm on paper before I get started, there are some details I'm undecided how to handle

view this post on Zulip Alexis Naveros (Dec 02 2023 at 23:58):

And that post of mine was off-topic. I'm not used to this Zulip topic-based chat

view this post on Zulip Alexis Naveros (Dec 03 2023 at 02:47):

I would have a couple questions... Cliff said you already had Screened Poisson reconstruction, the wording suggested that it was satisfactory but very slow. Could it be just a matter of beating the hell out of that code with threads, SSE/AVX/AVX-512, atomics, NUMA awareness? I briefly looked at the code but was a bit lost backtracking beyond SPSR.cpp

view this post on Zulip Alexis Naveros (Dec 03 2023 at 02:49):

And do you have some kind of deadline or desired date for the mesh reconstruction algorithm? Just to have an idea how I'll weight the couple different things that need to be done

view this post on Zulip Sean (Mar 13 2024 at 05:35):

hey @Alexis Naveros very delayed reply!... everything has been going really great, and glad to hear they're going well for you too. short answer is "I dunno" on the screened poisson, at least to say for sure. I'm fairly certain it's typical unstable non-performant academic code, so yeah, probably tons of room for optimizations and improvement.

On that point, I listened to a talk just last week by someone that was comparing screened poisson with other methods, outlining the general deficiencies of the algorithm. I believe they were approaching it from a completely different perspective, incorporating ML into the pipeline to make more dynamic decisions, with good results.

view this post on Zulip Sean (Mar 13 2024 at 05:38):

if it wasn't obvious, we don't have deadlines here. or better still, there's many many many desired deadlines to choose from and they often go wooshing by, but we make progress steadily still.

I consequently just finished implementing a Monte Carlo approach to external surface area estimation that samples the hell out of the exterior surfaces and would love a robust point-cloud to solid mesh routine. My current tactic is going to be to sample it very densely, make thin cylinders at each surface hit point, mesh and union them all together, and (if sampled densely enough) I should be able to eliminate all the interior faces/points. It's stupid, but it just might work well.
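As a rough illustration of what Monte Carlo surface area estimation can look like, here is a small Python sketch. It is not the implementation described above, whose details are not given here. It swaps in Cauchy's formula, which holds for convex shapes only (surface area equals four times the mean projected area), and tests it on an axis-aligned box centered at the origin. All function names are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Uniformly distributed direction on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_hits_box(origin, direction, half):
    """Slab test: does the infinite line intersect the centered box?"""
    t_min, t_max = -math.inf, math.inf
    for o, d, h in zip(origin, direction, half):
        if abs(d) < 1e-12:
            if abs(o) > h:
                return False  # parallel to this slab and outside it
        else:
            t1, t2 = (-h - o) / d, (h - o) / d
            t_min = max(t_min, min(t1, t2))
            t_max = min(t_max, max(t1, t2))
    return t_min <= t_max


def estimate_box_area(half, samples, seed=0):
    """Estimate the surface area of a box with half-extents `half`."""
    rng = random.Random(seed)
    radius = math.sqrt(sum(h * h for h in half))  # bounding sphere radius
    hits = 0
    for _ in range(samples):
        d = random_unit_vector(rng)
        helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
        u = cross(d, helper)
        norm = math.sqrt(sum(c * c for c in u))
        u = tuple(c / norm for c in u)
        v = cross(d, u)  # u and v span the plane perpendicular to d
        # Uniform point in the bounding disk perpendicular to d.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        origin = tuple(r * (math.cos(theta) * ui + math.sin(theta) * vi)
                       for ui, vi in zip(u, v))
        if line_hits_box(origin, d, half):
            hits += 1
    # Hit fraction times disk area is the mean projected area;
    # Cauchy's formula multiplies that by 4.
    return 4.0 * math.pi * radius * radius * hits / samples
```

For a 1 x 2 x 3 box (half-extents 0.5, 1.0, 1.5) the exact area is 22, and `estimate_box_area((0.5, 1.0, 1.5), 100000)` should land close to that value. Non-convex geometry would need a different estimator, for example one that counts every surface crossing along each ray.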

view this post on Zulip Sean (Mar 13 2024 at 05:39):

If you came up with a better way, I'd gladly use it!


Last updated: Oct 09 2024 at 00:44 UTC