[Fim4l] CoCo / attributes for authorization (was: Re: Fwd: RA21 Adopts GEANT Data Protection Code of Conduct)

Tue Mar 19 16:52:41 CET 2019

* Peter Gietz <peter.gietz at daasi.de> [2019-03-19 15:46]:
> I have a few comments here based on also reading between the lines:
> 
> > Earlier this year (2019), the RA21 Security & Privacy group endorsed
> > the GEANT Data Protection Code of Conduct as guidance that RA21 should
> > follow: data minimization, purpose limitation, data retention, and
> > more.
> Basically "RA21 endorses Coco" only says "RA21 endorses GDPR" which in
> Europe has not a lot more meaning than "we want to follow the law". But
> outside of Europe, especially in the US this is of course significant.

I remember the old (v1) GEANT CoCo saying that contracts always
overrule whatever the CoCo requires because they're more specific.
(Last time I looked I didn't find language to that effect in v2, but
that doesn't mean that contracts between two parties won't prevail
nonetheless.)

Since institutionally licensed resources will almost always be covered
by contracts I don't see the significance of that announcement
either. But of course getting the word out about the GEANT CoCo and
maybe getting it validated in commercial contexts (licensed resources
cost a lot of money, which satisfies my ad hoc criterion of "commercial
context" here) can only be beneficial.

> This is more or less the opposite of:
> 
> > unless the Service Provider (such as a publisher or
> > other content vendor) has a specific agreement with an Identity
> > Provider (IdP - usually an individual’s institution) to receive
> > additional data the IdP should only send anonymous and pseudonymous
> > identifiers to the Service Provider.

Well, ignoring "anonymous" for now (as there's never any use for that
within federated identity management, so it could well be removed from
any documents), there's a realization that pretty much everything today
(under GDPR at least) will be considered personal data,
incl. pseudonyms, and so the CoCo could still provide legal support
for transferring and processing this.

But see the argument about contracts > codes of conduct above.

> > Specifically, the service
> > provider should only ask for eduPersonEntitlement and, optionally, a
> > pseudonymous pairwise user identifier (e.g., eduPersonTargetedID)
> 
> eduPersonTargetedID is a very good choice since it does not allow
> for user tracking beyond one SP, since every SP gets a different ID
> for the same user.

The idea of "targeted"/"pairwise"/SP-specific identifiers is good, but
not the eduPersonTargetedID attribute itself, in 2019.
eduPersonTargetedID is dead, or should long have been.
https://wiki.univie.ac.at/display/federation/eduPersonTargetedID

1. It should never have been defined for use with SAML2 (according to
some of the people involved), instead it should have been replaced
with proper SAML2 persistent NameIDs once SAML2 defined those in 2005.

2. saml2int ("old") -- the best standard for interop we have -- has
been recommending against eduPersonTargetedID (and for proper persistent
NameIDs, i.e., in the Subject element of the SAML2 Assertion) for
many, many years now.

3. There's a case-folding issue with "proper" persistent NameIDs,
and therefore equally with eduPersonTargetedID attributes, that could
cause information disclosure (one person seeing someone else's data at
an SP, e.g. the history of the content they searched for!).
We/You can't continue pretending this does not exist while at the same
time claiming we are sooo concerned about privacy, even more so than
the federations themselves (cf. "talk to InCommon about privacy").

4. saml2int ("new") requires use of the replacement identifiers and is
very clearly worded in this regard.
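
To make the case-folding problem from point 3 concrete, here is a
minimal sketch of one way it can arise. It assumes the identifier
values are opaque, case-sensitive strings and that the SP stores or
compares them case-insensitively (e.g. because of a database
collation). The store class and both identifier values are made up for
illustration.

```python
# Hypothetical SP-side user store that (wrongly) folds case on lookup,
# e.g. because of a case-insensitive database collation.
class CaseFoldingStore:
    def __init__(self):
        self._records = {}

    def save(self, name_id, data):
        self._records[name_id.lower()] = data

    def load(self, name_id):
        return self._records.get(name_id.lower())


# Two made-up opaque identifiers that differ only in letter case.
# For the IDP these denote two different people.
alice_id = "aB3dEfGh"
bob_id = "Ab3DeFgH"

store = CaseFoldingStore()
store.save(alice_id, {"search_history": ["alice's queries"]})

# Bob logs in; the SP looks up "his" record and finds Alice's.
leaked = store.load(bob_id)
```

After this runs, `leaked` holds the record that was saved for the other
identifier, which is the information disclosure described above.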

Short version: We all need to adopt and promote pairwise-id and stop
even mentioning the old eduPersonTargetedID attribute, except in some
legacy/migration from old-to-current documents.
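
As a rough illustration of what a pairwise identifier is, the following
sketch derives a per-SP value from a keyed hash. Everything in it is an
assumption for illustration: the function name, the secret, the
entityIDs, the scope, and the choice of HMAC-SHA256 with lower-cased
base32 (picked here so the value does not depend on letter case). It is
not taken from any IDP implementation or quoted from the specification.

```python
import base64
import hashlib
import hmac


def pairwise_id(secret: bytes, user: str, sp_entity_id: str, scope: str) -> str:
    """Derive a stable, SP-specific identifier for a user (illustrative only)."""
    msg = f"{sp_entity_id}!{user}".encode("utf-8")
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # Base32 uses only A-Z and 2-7, so lower-casing loses no information.
    local = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return f"{local}@{scope}"


secret = b"example-secret-known-only-to-the-idp"  # hypothetical
id_a = pairwise_id(secret, "jdoe", "https://sp-a.example.org/shibboleth", "example.ac.at")
id_b = pairwise_id(secret, "jdoe", "https://sp-b.example.org/shibboleth", "example.ac.at")
```

The same user gets a different value at each SP, so two SPs cannot
correlate the user by comparing identifiers, while each SP sees the same
value on every visit.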

> But there are other Attributes in use in addition or instead of the
> second attribute mentioned, eduPersonEntitlement, namely
> eduPersonScopedAffiliation. So why does RA21 recommend entitlement?

Here's the simple reasoning:
https://wiki.univie.ac.at/display/federation/Library+Services
The common-lib-terms value for the eduPersonEntitlement attribute is
static and the same for every SP and from every IDP. (No other
entitlement is relevant for this discussion/sector.)

That alone creates massive scalability benefits for everyone, because
IDPs do not have to tell the SP just what eduPersonScopedAffiliation
attribute values they consider to be covered by the specific contract.
(Unless the SP makes assumptions about how the IDP implemented
eduPersonAffiliation, every IDP will need to tell every SP just what
affiliation values are to be considered authorized.)
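
The difference can be sketched from the SP's point of view. This is an
illustration under assumptions, not any SP's actual code: the function
names and the dictionary layout are made up, the per-customer table is
just an example, and scoped affiliation values are assumed to have the
form affiliation@scope.

```python
COMMON_LIB_TERMS = "urn:mace:dir:entitlement:common-lib-terms"


def authorized_by_entitlement(attributes):
    """One static check that is identical for every IDP."""
    return COMMON_LIB_TERMS in attributes.get("eduPersonEntitlement", [])


# Hypothetical table the SP has to collect and maintain, one entry per
# customer IDP, when it authorizes on scoped affiliations instead.
ALLOWED_AFFILIATIONS = {
    "univie.ac.at": {"staff", "faculty", "student", "employee"},
    "jku.at": {"staff", "faculty", "student"},
    "mci.edu": {"faculty", "student"},
}


def authorized_by_affiliation(attributes):
    """Check that depends on per-IDP configuration held at the SP."""
    for value in attributes.get("eduPersonScopedAffiliation", []):
        affiliation, _, scope = value.partition("@")
        if affiliation in ALLOWED_AFFILIATIONS.get(scope, set()):
            return True
    return False
```

The first function needs no per-IDP data at all. The second only works
once every customer has told the SP which of its affiliation values the
contract covers.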

But the common-lib-terms entitlement (value) also has further
scalability advantages: In SAML 2.0 Metadata (cf. eduID.at) you can
list not only the RequestedAttribute names (as is commonly done) but
also the *values* a given SP requires. With sufficiently powerful
software (e.g. the Shibboleth IDP) that means the attributes released
by the IDP can be automatically filtered to the values provided in
metadata, without manual intervention.
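
Here is a small sketch of that idea. The metadata fragment is made up
(it is not copied from eduID.at), and the two Python functions only
mimic the effect of value-based filtering. The Shibboleth IDP does this
through its own attribute filter configuration, which is not shown
here. Releasing all values when metadata lists none is a choice made
for this sketch.

```python
import xml.etree.ElementTree as ET

NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}

# Made-up SP metadata fragment listing a required attribute *value*.
METADATA = """
<md:AttributeConsumingService index="1"
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <md:ServiceName xml:lang="en">Example e-resource</md:ServiceName>
  <md:RequestedAttribute FriendlyName="eduPersonEntitlement"
      Name="urn:oid:1.3.6.1.4.1.5923.1.1.1.7"
      NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
    <saml:AttributeValue>urn:mace:dir:entitlement:common-lib-terms</saml:AttributeValue>
  </md:RequestedAttribute>
</md:AttributeConsumingService>
"""


def requested_values(metadata_xml):
    """Map each requested attribute name to the set of values listed for it."""
    root = ET.fromstring(metadata_xml.strip())
    wanted = {}
    for req in root.iter("{%s}RequestedAttribute" % NS["md"]):
        values = {v.text.strip() for v in req.findall("saml:AttributeValue", NS) if v.text}
        wanted[req.get("Name")] = values
    return wanted


def filter_release(user_attributes, wanted):
    """Release only requested attributes, narrowed to the requested values."""
    released = {}
    for name, values in user_attributes.items():
        if name not in wanted:
            continue
        allowed = wanted[name]
        # No values listed in metadata: release all values of that attribute.
        kept = [v for v in values if not allowed or v in allowed]
        if kept:
            released[name] = kept
    return released


user = {
    "urn:oid:1.3.6.1.4.1.5923.1.1.1.7": [
        "urn:mace:dir:entitlement:common-lib-terms",
        "urn:example:some-other-entitlement",
    ],
    "urn:oid:1.3.6.1.4.1.5923.1.1.1.9": ["staff@example.ac.at"],
}
released = filter_release(user, requested_values(METADATA))
```

Only the common-lib-terms value ends up in `released`; the other
entitlement and the scoped affiliation are withheld because the SP did
not ask for them.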

A single and static configuration rule in the IDP can tell the IDP to
release the (non-PII) common-lib-terms value to any SP that says it
needs it. *Nothing* *else* is required from the IDP if the SP supports
this attribute (and can work without identifiers).

On the other hand affiliations are mostly (or exclusively) used in
their "scoped" variant, and that prevents an SP from ever listing the
expected attribute *values* in metadata, as the SP would have to list
the acceptable affiliations multiplied by the scopes of all their
customers, i.e.:
staff@univie.ac.at
faculty@univie.ac.at
student@univie.ac.at
employee@univie.ac.at
staff@jku.at
faculty@jku.at
student@jku.at
faculty@mci.edu
student@mci.edu
etc.
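
The growth is easy to see with a few lines of code. The lists below
reuse the example names from above. Taking the full product gives an
upper bound, since not every institution uses every affiliation.

```python
from itertools import product

affiliations = ["staff", "faculty", "student", "employee"]
scopes = ["univie.ac.at", "jku.at", "mci.edu"]

# Every value an SP would potentially have to list in its metadata.
scoped_values = [f"{a}@{s}" for s, a in product(scopes, affiliations)]

# With the entitlement there is exactly one value, whatever the customer count.
entitlement_values = ["urn:mace:dir:entitlement:common-lib-terms"]
```

Three customers already mean up to twelve values to list, and every new
customer adds up to four more, whereas the entitlement list never grows.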

> Here is my hypothesis:
> entitlement means that the IdP side knows about the rights at the service.
[...]
> This means that the complex algorithm, evaluating contracts to specify
> entitlements has to be implemented on the IdP side.

That's correct only in the abstract, and only if you ignore the /one/
attribute *value* for the eduPersonEntitlement attribute that has been
agreed for this use-case specifically (access to institutionally
licensed resources). So I also only speak to that one entitlement
*value*, not to using entitlements in general.

In fact looking at the eduID.at documentation (link below) you can see
that an IDP can simply and automatically derive the common-lib-terms
entitlement from an existing affiliation.

I.e., if an IDP can produce eduPerson(Scoped)Affiliation attributes
-- which arguably are the main/only remaining value proposition of
academic IDPs and so should be supported by any academic IDP on the
planet -- then the IDP only needs to add one trivial rule that creates
the common-lib-terms entitlement from a set of (locally decided)
affiliations and be done with this. For *all* SPs that can support
common-lib-terms entitlements.

Example from our documentation: "If the subject has the 'member'
affiliation then give it the common-lib-terms entitlement, too".
https://wiki.univie.ac.at/display/federation/IDP+3+Attribute+resolution#IDP3Attributeresolution-eduPersonEntitlement
(That's the simplest possible case since 'member' is defined in
eduPerson to include 'staff', 'faculty', 'employee' and 'student'. So
all of these will also have 'member' and so all of these groups will
get the common-lib-terms entitlement added.)
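
In plain code the rule amounts to the following. This is only a sketch
of the logic; the real thing is a few lines of attribute resolver
configuration in the IDP (see the link above). The function name and
the dictionary layout are made up, and the set of entitling
affiliations is whatever the institution decides locally.

```python
COMMON_LIB_TERMS = "urn:mace:dir:entitlement:common-lib-terms"

# Locally decided; 'member' is the simplest possible choice.
ENTITLING_AFFILIATIONS = {"member"}


def add_common_lib_terms(attributes):
    """Return a copy of the attributes with the entitlement added if due."""
    result = {name: list(values) for name, values in attributes.items()}
    affiliations = set(result.get("eduPersonAffiliation", []))
    if affiliations & ENTITLING_AFFILIATIONS:
        entitlements = result.setdefault("eduPersonEntitlement", [])
        if COMMON_LIB_TERMS not in entitlements:
            entitlements.append(COMMON_LIB_TERMS)
    return result


student = add_common_lib_terms({"eduPersonAffiliation": ["student", "member"]})
alum = add_common_lib_terms({"eduPersonAffiliation": ["alum"]})
```

The student (who is also a 'member') gets the entitlement, the alum
does not, and no SP-specific information was needed to decide that.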

It's trivial to create on the IDP side and makes comparison and
agreement much easier for everyone involved: for IDPs (no guessing and
no per-SP configuration overhead about which affiliation values the SP
wants), for federation operators (all e-resource services are handled
the same way) and for SPs (one attribute with one static value means no
more per-IDP configuration).

> The IdP can quite easily release the affiliation (IdPs generally
> know what relation exists between the user and the institution, and
> to which subdomain a user belongs). If this attribute is sent to
> publishers, the computing and comparing with the contracts is on the
> SP side.

Not yet achievable just from your description, though: The IDP would
also have to configure (via a self-service interface at the SP or via
their request/ticket system or sales contact) *which* of their values
should be allowed and which should not.
Repeated for every SP they use.
Repeated at every IDP that uses such SPs.
And it's all completely avoidable!

> If publishers require entitlement it means IMO that they trust the
> institutions to tell the truth and that they want less work on their
> own side.

Obviously that's always the case (trusting that IDPs don't lie), same
with affiliations or anything else.
If an IDP lies that IDP risks getting thrown out of the federation and
possibly worse (contract violations).
(In other words: My IDP might as well lie to just one SP using
affiliations. How is anyone supposed to detect that?)

> Thus my recommendations to libraries would be to rather agree to
> contracts based on affiliation than on entitlement.

For the reasons stated above (and on the wiki page I referenced) I
think the opposite should be the case. ;)

Currently IDPs simply have to support both models because some SPs
only support one or the other, which increases the effort slightly
(both are trivially easy to support, but knowing when to use which is
an overhead, of course).
Within eduID.at I curate the metadata such that any SP that's known to
support the common-lib-terms entitlement for authorization (either by
default or on request) only has that attribute listed.
I.e., those that play by the rules have it Just Work.
Let those that insist on doing it in less efficient ways do the extra steps.

Best regards,
-peter