Stream: brlcad

Topic: libpkg


view this post on Zulip starseeker (May 18 2020 at 18:22):

@Sean I'm having a hard time figuring out how to get pkg_permserver to just pick a random usable ephemeral port - it seems to either want a specific number or a string input for getservbyname.

view this post on Zulip starseeker (May 18 2020 at 18:22):

Hopefully I'm missing something obvious...

view this post on Zulip Sean (May 18 2020 at 18:41):

IP doesn't work that way. servers must open a specific port. best you can do is to try a specific port number and then incrementally try ports above it to a specific brief range like 7000-7005. you'll want to make sure whichever port you pick is not in /etc/services as a registered port, which would tell you 7000-7005 is reserved (and thus shouldn't be one you use)

view this post on Zulip starseeker (May 18 2020 at 18:44):

I must be misunderstanding what ephemeral ports are for... https://www.cymru.com/jtk/misc/ephemeralports.html

view this post on Zulip Sean (May 18 2020 at 18:52):

A little knowledge is leading you astray... they're not something you specify. For example, ssh into .bz and run netstat -n | grep 22 and you'll see that the server is on port 22 and your ssh client is on some other port like 59382. That's an ephemeral port.

view this post on Zulip starseeker (May 18 2020 at 18:55):

What I'm trying to do is have that libpkg test pick a random, not utilized port so I can run multiple separate client/server pairs simultaneously. The only other notion I've been able to come up with is having the server keep trying increments until it finds one that works, which just seems... crude

view this post on Zulip starseeker (May 18 2020 at 18:56):

I don't know if that's why it failed on your side, but I know on mine it caused a distcheck failure when one of the tests tried to run when another build already had a server up and running

view this post on Zulip starseeker (May 18 2020 at 18:58):

Ultimately it doesn't matter too much - I mostly threw that in to make it easier to test the IPC on Windows to set the stage for trying to figure out why we've got Windows and non-Windows specific fbserv communications code going on.

view this post on Zulip starseeker (May 18 2020 at 19:00):

The basic libpkg back and forth looks like it is succeeding without any Tcl_Channel help, so I'll have to dig deeper.

view this post on Zulip starseeker (May 18 2020 at 19:08):

FWIW, I was looking at the Cap'n Proto calculator example to see what it does about ports: https://github.com/capnproto/capnproto/tree/master/c%2B%2B/samples
If you don't specify a port, it (or rather the libraries it calls) gets one from the OS by setting the sockaddr_in field sin_port to 0. The calculator client then needs to be pointed at that same port of course, but once it is, things seem to work...
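
A minimal sketch of the port-0 approach described above, written in Python rather than against libpkg itself. The helper name `listen_on_os_assigned_port` is hypothetical; the underlying calls (bind to port 0, then ask the socket which port it got) are the standard BSD socket ones.

```python
import socket

def listen_on_os_assigned_port(host="127.0.0.1"):
    """Bind a TCP listener to port 0 so the OS picks a free port.

    Returns the listening socket and the port number the OS assigned.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))              # port 0 means "any free port"
    srv.listen(1)
    port = srv.getsockname()[1]      # read back the port actually assigned
    return srv, port

# The client still has to be told which port was picked.
srv, port = listen_on_os_assigned_port()
client = socket.create_connection(("127.0.0.1", port), timeout=2.0)
conn, _addr = srv.accept()
client.sendall(b"ping")
reply = conn.recv(4)
for s in (client, conn, srv):
    s.close()
```

In a test harness the server would typically print or otherwise report `port` so the client process can be pointed at it.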

view this post on Zulip Sean (May 18 2020 at 19:10):

starseeker said:

What I'm trying to do is have that libpkg test pick a random, not utilized port so I can run multiple separate client/server pairs simultaneously. The only other notion I've been able to come up with is having the server keep trying increments until it finds one that works, which just seems... crude

Can't pick random, but incrementally checking is common. Note that bad/buggy applications can also lock up a port if it doesn't clean up after itself properly or if it crashes while a port is in use. The OS will eventually harvest the port, but it can leave a port unusable for a while.
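
The incremental check can be sketched roughly as follows, in Python rather than libpkg's C. `bind_first_free` is a hypothetical helper, and the range of 20 fallback ports is an arbitrary choice for the demo.

```python
import socket

def bind_first_free(host, first_port, last_port):
    """Try each port in [first_port, last_port] and listen on the first free one."""
    for port in range(first_port, last_port + 1):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            srv.bind((host, port))
        except OSError:
            srv.close()              # port is taken, try the next one
            continue
        srv.listen(1)
        return srv, port
    raise OSError("no free port in the requested range")

# Demo: occupy one port, then ask for a port starting at that same number.
occupied = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
occupied.bind(("127.0.0.1", 0))
occupied.listen(1)
taken_port = occupied.getsockname()[1]

last_port = min(taken_port + 20, 65535)
srv, port = bind_first_free("127.0.0.1", taken_port, last_port)
srv.close()
occupied.close()
```

The number of fallback ports bounds how many servers can run at once, so the range needs to be at least as large as the expected number of parallel tests.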

view this post on Zulip Sean (May 18 2020 at 19:13):

starseeker said:

Ultimately it doesn't matter too much - I mostly threw that in to make it easier to test the IPC on Windows to set the stage for trying to figure out why we've got Windows and non-Windows specific fbserv communications code going on.

When Bob first ported to Windows, pkg was not fully compiled and working on Windows, so he had to help the socket initialization. A short while later, pkg porting was finished (I think it just needed a single winsock init call), but he never went back to clean things up.

view this post on Zulip starseeker (May 18 2020 at 19:15):

Ah. I take it if I'm going to muck around in there I should probably be in a branch?

view this post on Zulip Sean (May 18 2020 at 19:26):

starseeker said:

If you don't specify a port, it (or rather the libraries it calls) get one from the OS by setting the sockaddr_in field sin_port to 0. The calculator client then needs to be pointed at that same port of course, but once it does things seem to work...

That should work with libpkg too, definitely not anything specific to them. sin_port is ultimately passed to bind(), which is how one requests a port from the operating system. Client and server ultimately have to know what port to listen on and connect to. More concerning, however, is why you can't just specify a high port not in use with a fallback when it is in use? For that matter, you could use the same port as rt so existing firewalls won't trigger (which they will otherwise.)

view this post on Zulip starseeker (May 18 2020 at 19:27):

As long as we have more valid fallbacks than potential parallel executions of the regress-pkg test that should work.

view this post on Zulip Sean (May 18 2020 at 19:27):

starseeker said:

Ah. I take it if I'm going to muck around in there I should probably be in a branch?

Not unless you start to modify libpkg, I'd think. any app changes are above the platform code or specific to the platform you're testing on (e.g., the windows-specific port code).

view this post on Zulip starseeker (May 18 2020 at 19:28):

There's a real risk I'll break the embedded fb display in MGED while working though - I already did that once the last time I tried using the Tcl_Channel code everywhere...

view this post on Zulip Sean (May 18 2020 at 19:28):

starseeker said:

As long as we have more valid fallbacks than potential parallel executions of the regress-pkg test that should work.

that's also addressable in code. think about how an http server supports hundreds of connections per second on port 80.
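
One reading of the hint above, sketched in Python: a single listening socket stays on one port, and each `accept()` returns a separate connected socket per client. The `serve` function and the echo protocol are made up for illustration and are not libpkg API.

```python
import socket
import threading

def serve(srv, n_clients):
    """Handle n_clients connections, one after another, on one listening port."""
    for _ in range(n_clients):
        conn, _addr = srv.accept()   # a new socket per client, same server port
        with conn:
            data = conn.recv(64)
            conn.sendall(b"echo:" + data)

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))           # OS-assigned port, for the demo only
srv.listen(8)
port = srv.getsockname()[1]

worker = threading.Thread(target=serve, args=(srv, 3))
worker.start()

replies = []
for i in range(3):
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as c:
        c.sendall(b"client%d" % i)
        replies.append(c.recv(64))

worker.join()
srv.close()
```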

view this post on Zulip Sean (May 18 2020 at 19:29):

try NOT using the tcl channel code everywhere ;)

view this post on Zulip Sean (May 18 2020 at 19:29):

there should be zero reason to use it any more

view this post on Zulip starseeker (May 18 2020 at 19:29):

heh - that was my plan/hope - it's the last remaining use of Tcl in the dm/fb merge, not to mention a headache to read

view this post on Zulip starseeker (May 18 2020 at 19:30):

I've never tried - can multiple http servers run on the same machine at the same time and share port 80?

view this post on Zulip Sean (May 18 2020 at 19:32):

nope, that's not how it works

view this post on Zulip Sean (May 18 2020 at 19:32):

there's always just one process on a port

view this post on Zulip starseeker (May 18 2020 at 19:32):

<nod> that's what I thought. The analogy here, then, is that each regress-pkg server is an http server program and will need its own port

view this post on Zulip Sean (May 18 2020 at 19:33):

why are there multiples though? you thinking like distcheck-full running multiple copies or something else?

view this post on Zulip starseeker (May 18 2020 at 19:33):

yep

view this post on Zulip Sean (May 18 2020 at 19:34):

which is it? :)

view this post on Zulip starseeker (May 18 2020 at 19:34):

distcheck-full - exact same scenario that keeps me from enabling the rtwizard tests

view this post on Zulip starseeker (May 18 2020 at 19:35):

or running the various individual rtwizard tests in parallel, for that matter

view this post on Zulip Sean (May 18 2020 at 19:35):

sure, though rtwizard also has file ops in the way

view this post on Zulip Sean (May 18 2020 at 19:35):

so there's a couple issues

view this post on Zulip starseeker (May 18 2020 at 19:37):

I'd have to double check - I think as currently implemented under the hood the various intermediate stages are always talking to a framebuffer via a port, but I could be wrong - it's been a while.

view this post on Zulip Sean (May 18 2020 at 19:39):

the biggest is that each port we utilize is (actually, not just theoretically) a not-insignificant security issue from the operating system's standpoint and from a configuration perspective. We have a little liberty during distcheck as the burden will mostly be on ourselves, but best practice is still that all ports must be documented and managed.

view this post on Zulip Sean (May 18 2020 at 19:39):

consider brl-cad's certificate of networthiness, for example, where we had to spell out every application that opens a port, which port numbers it uses (including any fallback ports), and what protocols are on those ports.

view this post on Zulip Sean (May 18 2020 at 19:40):

it's also common now that the latest default security profiles for windows and linux have firewalling enabled that prevents starting up unregistered servers (i.e., every server must be formally registered during install)

view this post on Zulip starseeker (May 18 2020 at 19:40):

<nod> I still (eventually) plan to get rtwizard rendered down to library calls to fb/rt/icv. That won't work for a libpkg test though - the whole point is communication over ports.

view this post on Zulip Sean (May 18 2020 at 19:41):

this is quite the hot topic these days

view this post on Zulip Sean (May 18 2020 at 19:41):

what libpkg really could benefit from is using a local named port instead of socket communication

view this post on Zulip Sean (May 18 2020 at 19:42):

they're WAY faster and compatible with how we typically use them and think about them (i.e., as a general IPC mechanism)

view this post on Zulip starseeker (May 18 2020 at 19:42):

hmm. How tricky would that be to pull off?

view this post on Zulip Sean (May 18 2020 at 19:43):

I'm not sure, honestly. probably would take a few days but you could also probably address distcheck-full by making sure the test doesn't linger and fully cleans up after itself.

view this post on Zulip Sean (May 18 2020 at 19:44):

i.e., if the server is short lived, then you can poll on opening the socket for 1-2 seconds, and then run the server

view this post on Zulip Sean (May 18 2020 at 19:44):

select() will tell you if it's in use reliably and with a timer

view this post on Zulip Sean (May 18 2020 at 19:48):

take a bit to find a better example, but here's a snippet of python that does what I'm thinking in the probe() method: https://github.com/Apstra/aeon-venos/blob/master/pylib/aeon/base/device.py
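
A rough sketch of the poll-then-run idea, assuming a connect attempt with a timeout is an acceptable stand-in for calling select() directly. `port_in_use` and `wait_until_free` are hypothetical names, not taken from the linked file.

```python
import socket
import time

def port_in_use(host, port, timeout=0.5):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False                 # refused or timed out

def wait_until_free(host, port, total=2.0, interval=0.1):
    """Poll for up to `total` seconds; return True once nothing is listening."""
    deadline = time.monotonic() + total
    while port_in_use(host, port):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True

# Demo: a short-lived listener makes the port busy, then releases it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

busy_before = port_in_use("127.0.0.1", port)
listener.close()
free_after = wait_until_free("127.0.0.1", port)
```

A test would call `wait_until_free` on its fixed port first and only start its own server once it returns True.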

view this post on Zulip starseeker (May 18 2020 at 20:10):

@Sean I'm seeing some references to "named pipes" but not much for "named ports" - can you point me to what you're referring to there?

view this post on Zulip Sean (May 18 2020 at 20:19):

I meant a pipe

view this post on Zulip Sean (May 18 2020 at 20:19):

named or unnamed, really

view this post on Zulip starseeker (May 18 2020 at 20:22):

Hmm. I'd have to check the libpkg API - are we generic enough in the public API that we could actually swap mechanisms behind the scenes? I guess we'd have to translate specified "port numbers" into pipes somehow?

view this post on Zulip Sean (May 18 2020 at 20:22):

it's for local communication, doesn't typically go through the network stack
for example, "ls | grep foo" creates a pipe between the ls and grep processes where ls writes data to a pipe and grep reads from the pipe.
a named pipe lets you create a persistent communication context. apps usually use that method to perform some communication protocol within it, especially if dealing with multiple writers or readers. an unnamed pipe is shorter-lived, life of the apps.
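
For illustration, the "ls | grep foo" arrangement can be reproduced with an unnamed pipe between two child processes. This Python sketch uses small inline Python commands as stand-ins for ls and grep so it runs anywhere; none of it is libpkg code.

```python
import subprocess
import sys

# Stand-in for "ls": writes three lines to its stdout.
producer_code = "print('foo'); print('bar'); print('food')"
# Stand-in for "grep foo": copies stdin lines containing 'foo' to stdout.
consumer_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'foo' in line:\n"
    "        sys.stdout.write(line)\n"
)

producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,          # write end of the unnamed pipe
)
consumer = subprocess.Popen(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,           # read end of the same pipe
    stdout=subprocess.PIPE,
    text=True,
)
producer.stdout.close()              # the consumer now owns the read end
out, _ = consumer.communicate()
producer.wait()
matches = out.split()
```

No port number is involved at any point, which is why this kind of channel sidesteps the port-collision problem for purely local communication.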

view this post on Zulip Sean (May 18 2020 at 20:25):

starseeker said:

Hmm. I'd have to check the libpkg API - are we generic enough in the public API that we could actually swap mechanisms behind the scenes? I guess we'd have to translate specified "port numbers" into pipes somehow?

I would expect a different call or two to be introduced like pkg_popen() or pkg_perm_pipeserver or something similar ... haven't thought through it in detail honestly. it couldn't reliably be completely transparent because you only know it's local after you have both processes wanting to talk.

view this post on Zulip starseeker (Jul 14 2023 at 13:38):

@Sean Have we ever considered making the pkg_conn struct a PIMPL container? At a naive first glance I'm wondering if it could be made opaque in order to allow for multiple communications mechanisms (à la libuv) to be hidden under the hood...

view this post on Zulip starseeker (Jul 14 2023 at 16:33):

This is an initial stab at making pkg_conn opaque - interested in whether it's a viable approach or not: https://github.com/BRL-CAD/brlcad/commit/b11f84b1ed8e2d77512606c65991bc303d86d2ce

view this post on Zulip starseeker (Jul 14 2023 at 16:41):

Short and sweet summary is I just moved all the data slots to a private impl struct and added accessor functions for the bits being accessed beyond libpkg itself. No intentional logic changes.

view this post on Zulip Sean (Jul 15 2023 at 06:09):

@starseeker really like the pimpl encapsulation as that's good practice in general, but I don't think that will really work well for strapping another library underneath.

view this post on Zulip Sean (Jul 15 2023 at 06:09):

libpkg is barely a wrapper around tcp.

view this post on Zulip Sean (Jul 15 2023 at 06:22):

that said, welcome to try.

it's such a low-level library that it's almost certainly a mismatch (e.g., pkg expects you to manually select() and read() etc), which means you'll likely end up adding awkward options and/or additional API that simply won't make sense for some connection types, or won't make sense when using a different backend. it's probably gonna be fugly and/or buggy.

but MAYBE NOT... if you stick to libpkg's existing calls and options and just make it work.

FWIW, I would not be opposed to replacing libpkg with libuv. It's proven to be a very capable I/O library and it has tcp support (and then some) so it probably wouldn't be too terribly difficult to supplant pkg with it. just would have to be VERY careful and test test test. I do believe there's pkg calls in our beloved analytic codes too, so we'd probably need a huge (multi-year) deprecation window with runtime notification and in-person hand-holding.

view this post on Zulip starseeker (Jul 17 2023 at 13:13):

Hmm. I hadn't thought too hard about whether we could directly use libuv, actually... The main reason I was wanting to see about sticking it under libpkg (aside from not changing our existing codes any more than necessary) was to determine if we could hide the pipe vs. tcp communication selection from the calling codes. I suppose that's probably unlikely, unfortunately - I've noticed most of the codes I've looked at that support both pipes and tcp don't seem to wrap them in that fashion, and I imagine there's a reason they don't try to abstract it...

view this post on Zulip Sean (Jul 18 2023 at 02:47):

Yeah they seem like they’re the same thing but they’re really not, at least on the initiation side of things. I have seen plenty of libs that wrap them the same once the connection is made (for the most part). Capnproto does that iirc. You set up as tcp or pipe, but then it’s the same until shutdown.


Last updated: Oct 09 2024 at 00:44 UTC