Upgrading from gpg1 to gpg2: lots of trouble, need help

raf gnupg at raf.org
Thu Dec 21 06:19:00 CET 2017


Daniel Kahn Gillmor wrote:

> Hi raf--
> 
> On Wed 2017-12-20 14:11:26 +1100, gnupg at raf.org wrote:
> > Daniel Kahn Gillmor wrote:
> >> On Mon 2017-12-18 20:01:02 +1100, gnupg at raf.org wrote:
> >> > For most of my decryption use cases I can't use a
> >> > pinentry program. Instead, I have to start gpg-agent in
> >> > advance (despite what its manpage says) with
> >> > --allow-preset-passphrase so that I can then use
> >> > gpg-preset-passphrase so that when gpg is run later, it
> >> > can decrypt unaided.
> >> 
> >> can you explain more about this use case?  it sounds to me like you
> >> might prefer to just keep your secret keys without a passphrase in the
> >> first place.
> >
> > I'm assuming that you are referring to the use case in Question 1.
> >
> > Definitely not. That would make it possible for the decryption to
> > take place at any time. I need it to only be able to take place
> > for short periods of time when I am expecting it.
> 
> OK, so your preferred outcome is some way to enable a key for a limited
> period of time.  is that right?

Yes.

> > I think the real problem with this use case is that the incoming
> > ssh connections from the other hosts are starting their own
> > gpg-agent (I'm guessing using the S.gpg-agent.ssh socket) rather
> > than just connecting to the existing gpg-agent that I have put
> > the passphrase into (I'm guessing that gpg-agent uses the
> > S.gpg-agent socket).
> 
> there should be only one S.gpg-agent.ssh socket, and therefore only one
> agent.  If you were using systemd and dbus user sessions, those system
> management tools would make sure that these things exist.  This is the
> entire point of session management.  It's complex to do by hand, and
> choosing to abandon the tools that offer it to you seems gratuitously
> masochistic.  But ok…

There is only one S.gpg-agent.ssh socket (I think). I'm
pretty sure that I was mistaken when I guessed that
S.gpg-agent.ssh had something to do with the incoming ssh
connection using gpg which started up its own gpg-agent
process. I now think that S.gpg-agent.ssh has to do with
ssh-agent support and nothing to do with this.

With gnupg-2.1.11 on ubuntu16, there is only a single socket:

  ~/.gnupg/S.gpg-agent

With gnupg-2.1.18 on debian9, there are four sockets:

  ~/.gnupg/S.gpg-agent
  ~/.gnupg/S.gpg-agent.browser
  ~/.gnupg/S.gpg-agent.extra
  ~/.gnupg/S.gpg-agent.ssh

This may have something to do with why what I am trying to
do works with gnupg-2.1.11 but not with gnupg-2.1.18.
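
A way to check where gpg actually expects to find the agent
socket, rather than guessing, is probably gpgconf:

  > gpgconf --list-dirs

and looking for the agent-socket line (assuming the version
is recent enough to print one). I haven't compared the
output on the two hosts yet.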

The incoming ssh connection did start its own gpg-agent
process (even though there was already one running), but I
no longer think that it had anything to do with
S.gpg-agent.ssh. In fact, since the "user session" in which
the first gpg-agent process was started could no longer
access the passphrase, it seems as though the new gpg-agent
process took over the sockets: every attempt to communicate
with gpg-agent via those sockets reached the new gpg-agent
process, which knew nothing, while the original gpg-agent
process, which knew the passphrase, was uncontactable. But
again, I'm only guessing.

I saw a comment of yours in a mailing list archive saying
that one of the purposes of gpg-agent is to prevent any
process that merely has permission to use the sockets from
getting at its contents without alerting the user. It sounds
like that could be what is preventing my use case from
working. But again, I'm only guessing.

  https://lists.gnupg.org/pipermail/gnupg-devel/2015-May/029804.html

Since you say that systemd, if it were handling this, would
make sure that these sockets exist, perhaps my attempt to
mask them had no effect, because as soon as I start the
first gpg-agent, all four sockets are created. I assume that
it is gpg-agent itself that creates them rather than
systemd. They disappear again when gpg-agent terminates. But
that's the same behaviour as on macos without systemd: the
sockets are created when gpg-agent starts and they are
deleted when it stops. Which seems sensible. Hardly
masochistic. But perhaps my masochism threshold is too
high. :-)
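
For reference, the masking I attempted was roughly this
(from memory, so I may have the exact list of units or the
--user vs. --global variant wrong):

  > systemctl --user mask --now gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket
  > systemctl --user mask gpg-agent.service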

> > What I want is to have gpg and encrypted data and a
> > key-with-a-strong-passphrase on a small number of servers and
> > then, when needed and only when needed, I want to be able to
> > enable unassisted decryption by the uid that owns the
> > data/keys/gpg-agent. Other hosts that need access to the
> > decrypted data need to be able to ssh to the host that has
> > gpg/keys/data to get that data without my interaction.
> >
> > I need to be able to ssh to the server with gpg/keys/data to set
> > things up. Then I need to be able to log out without gpg-agent
> > disappearing. Then the other servers need to be able to ssh to
> > that server and use the gpg-agent that I prepared earlier so as
> > to decrypt the data. Then I need to be able to ssh back in and
> > turn off gpg-agent.
> 
> I'm still not sure i understand your threat model -- apparently your
> theorized attacker is capable of compromising the account on the
> targeted host, but *only* between the times before you enable (and after
> you disable) gpg-agent.  Is that right?

Well, for physical theft of the servers, yes.

> Why do you need these multi-detached operations?  by "multi-detached" i
> mean that your sequence of operations appears to be:
> 
>  * attach
>  * enable gpg-agent
>  * detach
>  * other things use…
>  * attach
>  * disable gpg-agent
>  * detach
> 
> wouldn't you rather monitor these potentially-vulnerable accounts (by
> staying attached or keeping a session open while they're in use)?

I usually do, but I want to be able to detach from the
screen session. It's only for a few minutes, though, and
being able to detach is not important. What is important is
having the incoming ssh connections communicate with the
existing gpg-agent process.

In my testing of this, I didn't actually detach from the screen
session so that is not what is causing this problem.

> > The big picture is that there are some publically accessible
> > servers that need access to sensitive data (e.g. database
> > passwords and symmetric encryption keys and similar) that I
> > don't want stored on those servers at all. Instead there are
> > service processes that fetch the data from a set of several
> > other servers that are not publically accessible. This fetching
> > of data only needs to happen when the publically accessible
> > servers reboot or when the data fetching services are
> > restarted/reconfigured.
> 
> so what is the outcome if the gpg-agent is disabled when these
> reboots/restarts happen?  how do you coordinate that access?

If gpg-agent is disabled when the reboots happen, the client servers
fail to obtain the data until I enable gpg-agent. The clients
keep trying until it works.
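
(The client side is nothing fancy. It's essentially just a
retry loop along these lines, where fetch-secrets and
keyhost are stand-in names for whatever actually asks the
key host to decrypt and return the data:

  until data=$(ssh thing@keyhost fetch-secrets); do
      sleep 60  # retry until gpg-agent has been enabled
  done

and the services are then started with that data held in
memory rather than written to disk.)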

> > I want to be able to enter the passphrase once (on each of the
> > gpg/data/key hosts) before I reboot the publically accessible
> > hosts, and I want that to be sufficient to enable multiple
> > incoming ssh connections from the rebooting hosts to get what
> > they need, and when the hosts have successfully rebooted I want
> > to be able to turn off gpg-agent.
> >
> > If you prefer, the confirmation of the use of private keys is me
> > entering the passphrase into gpg-agent before the other hosts
> > make their ssh connections.
> 
> this approach seems congruent with my single-attach proposal:
> 
>  * you log into "key management" host (this enables the systemd
>    gpg-agent user service)
>    
>  * on "key management" host, enable key access using
>    gpg-preset-passphrase or something similar
> 
>  * you trigger restart of public-facing service
> 
>  * public-facing service connects to "key management" host, gets the
>    data it needs
> 
>  * you verify that the restart of the public-facing service is successful
> 
>  * you log out of "key management" host.  dbus-user-session closes the
>    gpg-agent automatically with your logout, thereby closing the agent
>    and disabling access to those keys.
> 
> can you explain why that doesn't meet your goals?

Sorry, I thought I already did. The 4th point above does not
work. When the public-facing host connects via ssh to the
key management host and runs gpg, instead of successfully
connecting to the existing gpg-agent process that I started
minutes earlier, it starts a new gpg-agent process which
doesn't know the passphrase, and so the decryption fails.
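
For completeness, the presetting itself is done with
gpg-preset-passphrase (which isn't on PATH by default and
lives under a libexec-style directory such as
/usr/lib/gnupg2/), with the passphrase supplied on stdin and
KEYGRIP being the keygrip of the decryption subkey, roughly:

  > GNUPGHOME=/etc/thing/.gnupg /usr/lib/gnupg2/gpg-preset-passphrase --preset KEYGRIP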

Here are the gpg-agent processes after I start the first gpg-agent
process and preset the passphrase:

  /usr/bin/gpg-agent --homedir /etc/thing/.gnupg --allow-preset-passphrase \
    --default-cache-ttl 3600 --max-cache-ttl 3600 --daemon -- /bin/bash --login

Here are the gpg-agent processes after an incoming ssh
connection that attempts to use gpg:

  /usr/bin/gpg-agent --homedir /etc/thing/.gnupg --allow-preset-passphrase \
    --default-cache-ttl 3600 --max-cache-ttl 3600 --daemon -- /bin/bash --login
  gpg-agent --homedir /etc/thing/.gnupg --use-standard-socket --daemon

That second gpg-agent process should not exist. The gpg
process that caused it to be started should have connected
to the existing gpg-agent process. The sockets for it
existed but perhaps there was some reason why it didn't use
them.

There must be some reason why gpg thinks it needs to start
gpg-agent. Perhaps it's because it's a different "user
session". They are from two different ssh connections after
all.
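
One thing I could try next is comparing which agent each
session actually talks to, e.g.:

  > GNUPGHOME=/etc/thing/.gnupg gpg-connect-agent 'getinfo pid' /bye

run from my screen session and again from the incoming ssh
session, to see whether they report the same pid.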

> > Even if I consider those servers to be "local", it's still not what I
> > want because that assumes that it is the server with the keys that
> > connects to the other servers with data that needs to be decrypted
> > with those keys. In this case, it is those other servers that will be
> > making the connections to the server with the keys (and the data). I
> > don't want their rebooting to be delayed by my having to log in to
> > each of them with a passphrase or a forwarded gpg-agent connection. I
> > want them to make the connection by themselves as soon as they are
> > ready to, obtain the data they need, and continue booting up.
> 
> Here, i think you're making an efficiency argument -- you want to
> prepare the "key management" host in advance, so that during the boot
> process of the public-facing service, it gets what it needs without
> you needing to manipulate it directly.

That's correct.

> > I'm not sure I understand your reasons for asking all these
> > questions. Is it that you don't think that want I want to do is
> > still possible with gnupg2.1+ and are you trying to convince me
> > to fundamentally change what I'm doing?
> 
> I'm trying to extract high-level, security-conscious, sensible goals
> from your descriptions, so that i can help you figure out how to meet
> them.  It's possible that your existing choices don't actually meet your
> goals as well as you thought they did, and newer tools can help get you
> closer to meeting your goals.
> 
> This may mean some amount of change, but it's change in the direction of
> what you actually want, so hopefully it's worth the pain.

I'm sure that's probably true and I do appreciate your efforts.

> > Can incoming ssh connections use the existing gpg-agent that I
> > have already started and preset with a passphrase or not? Does
> > anyone know?
> 
> yes, i've tested it.  it works.

That's hopeful but I wonder why it doesn't work for me.

> > Is continuing to use gpg1 indefinitely an option? Will it
> > continue to work with recent versions of gpg-agent?
> 
> gpg1 only "works" with versions of gpg-agent as a passphrase cache, but
> modern versions of GnuPG use gpg-agent as an actual cryptographic agent,
> which does not release the secret key at all.

And I noticed that gpg1 can't use preset passphrases anymore anyway.
And gnupg-1.4.22 in macports says that it doesn't use the agent at all
anymore so that's not an option (probably for the best).

> This is actually what i think you want, as it minimizes exposure of the
> secret key itself.  gpg1 has access to the full secret key, while gpg2
> deliberately does not.
> 
> gpg-preset-passphrase only unlocks access to secret key material in the
> agent -- that is, it does *not* touch the passphrase cache.  This means
> that it is incompatible with gpg1, as noted in the manual page.
> 
> > Debian says that gpg1 is deprecated but I've read that gpg1 is
> > now mostly only useful for embedded systems (or servers).
> 
> where did you read this?

I can't remember.

> imho, gpg1 is now mostly only useful for
> people with bizarre legacy constraints (like using an ancient, known-bad
> PGP-2 key to maintain a system that is so crufty it cannot update the
> list of administrator keys).
> 
> > Since IoT and servers will never go away, does that mean that gpg1
> > will never go away? I'd be happy to keep using gpg1 if I knew that it
> > wouldn't go away and if I knew that it would keep working with recent
> > versions of gpg-agent.
> 
> i advise against this approach.  please use the modern version.  it is
> well-maintained and should meet your needs.
> 
>                 --dkg

Don't worry. I will. But it hasn't met many of my needs so far. :-)


Another reason that I disabled/masked systemd's handling of
the sockets is for consistency between the ubuntu16 host
with gnupg-2.1.11 and the debian9 host with gnupg-2.1.18.
Only the debian9 host has the systemd handling of the
sockets (it started with gnupg-2.1.17).

Ah, systemd puts the sockets in a completely different
place: /run/user/*/gnupg/ instead of ~/.gnupg/. So much for
a standard socket location :-). That might be relevant. But
it shouldn't be if systemd is not handling the sockets.

Perhaps I didn't disable systemd's handling of the sockets
properly and it's still partially managing things. But it claims
to be masked so I don't think that's the problem.

No, something's not right. I've globally unmasked and enabled
the sockets but...

As my user, I can do:

  > systemctl --global is-enabled gpg-agent.service gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket
  static
  enabled
  enabled
  enabled
  enabled

And:

  > systemctl --user is-enabled gpg-agent.service gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket
  static
  enabled
  enabled
  enabled
  enabled

I had to specifically enable them with --user, otherwise it
said disabled with --user even though it said enabled with
--global. I might have done --user disable in the past as
well. It's all a bit of a blur.
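
The enabling itself was just something like:

  > systemctl --user enable gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket

(and the --global equivalent, run as root).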

But when I su to the user in question, I get:

  > systemctl --user is-enabled gpg-agent.service gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket
  Failed to connect to bus: No such file or directory

But it still reports as enabled with --global.
Maybe that's enough. I don't know.
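
I'm guessing the bus error is because su doesn't go through
the same login path as ssh does, so there's no systemd
--user instance or session dbus for that user and nothing
sets XDG_RUNTIME_DIR. If that's right, something like:

  > loginctl enable-linger thing

run as root might be what's needed to get a user manager
(and a /run/user/<uid> directory) started for that account.
But that's just a guess at this point.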

And, as that user, gpg can --list-secret-keys, but when I
try to decrypt something, it doesn't ask for a passphrase
and it fails to decrypt. It does start gpg-agent, though,
and sockets are created in ~/.gnupg even though systemd is
now supposed to be handling the sockets. This is without me
starting up the screen/sudo/gpg-agent/bash processes first.

  > gpg --list-secret-keys
  /etc/thing/.gnupg/pubring.gpg
  -----------------------------
  sec   rsa2048 2016-01-13 [SC]
        25EB4337C3CA32DE46774E1B17B64F00CD3C41D1
  uid           [ultimate] user <user at domain.com>
  ssb   rsa2048 2016-01-13 [E]

Hmm, it mentions the old keyring above, not the
migrated one in ~/.gnupg/private-keys-v1.d.
Maybe that's why --list-secret-keys worked but
the rest below doesn't.

  > echo OK | gpg -e --default-recipient-self | gpg -d
  gpg: encrypted with 2048-bit RSA key, ID 6E76F4FAAE42FC15, created 2016-01-13
        "user <user at domain.com>"
  gpg: public key decryption failed: Inappropriate ioctl for device
  gpg: decryption failed: No secret key

  > ls -alsp .gnupg/S*
  0 srwx------ 1 thing thing 0 Dec 21 15:45 .gnupg/S.gpg-agent
  0 srwx------ 1 thing thing 0 Dec 21 14:47 .gnupg/S.gpg-agent.browser
  0 srwx------ 1 thing thing 0 Dec 21 14:47 .gnupg/S.gpg-agent.extra
  0 srwx------ 1 thing thing 0 Dec 21 14:47 .gnupg/S.gpg-agent.ssh
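
The "Inappropriate ioctl for device" part looks like the
usual symptom of pinentry having no usable tty, which the
gpg-agent manpage suggests addressing with something like:

  > export GPG_TTY=$(tty)

in the shell profile. That might at least get a passphrase
prompt working here, although for the preset-passphrase use
case above there shouldn't be any prompt at all.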

I am completely failing to understand what's going on here. :-)
Is systemd handling the sockets or not? There's no /run/user
directory for this user so probably not. Maybe I don't
understand --user and --global or systemd in general.

Sorry for taking up so much of your time.
I appreciate your effort to help.

cheers,
raf



