
* Bernd Oberknapp bo@ub.uni-freiburg.de [2019-05-11 12:38]:
> Regarding RA21, this is to some extent based on the fact that some publishers already have tried to enforce in contract negotiations, with reference to RA21, that libraries switch to SAML as the only authentication method and in some cases that they not only provide a persistent/targeted/pairwise ID but also personal data like names and email addresses.
So on the one hand libraries agree to such contract terms, releasing other people's personal data to such publishers for no reason and without a legal basis (for the sake of the argument we'll have to assume that the SP does not in fact need any of that data, otherwise there'd be no problem to begin with). On the other hand, libraries here are campaigning and acting as if they were the last and only defenders of privacy.
The argument (made earlier on this list, IIRC) that SAML shouldn't be used because it's possible to misconfigure it is also interesting. Web and e-mail servers can also be (and sometimes are) misconfigured, sometimes resulting in leaking personal data. Still that doesn't stop anyone from using the technology. Why should this be different for SAML?
Wanting some magic bullet that works consistently everywhere, every time, is secure (per the current state of the art), is as privacy-preserving as possible but sufficiently flexible to cater to all relevant use-cases, requires no client set-up whatsoever, does not require subjects to change their content discovery strategies or tools and CANNOT POSSIBLY BE MISCONFIGURED to "leak" personal data... is an interesting set of requirements. I'd sure like to see any alternative that satisfies those criteria.
> That's why many libraries, at least in Germany, wouldn't support any recommendation that promotes SAML as the only authentication method or doesn't include anonymous access via SAML.
A blanket recommendation to send more data than is (sometimes) necessary would violate a fundamental principle of data protection (data minimisation) and would possibly risk violating European data protection law. (Though you might ask at what point you're trying to be more Catholic than the Pope.)
So a recommendation would probably have to take into account the differences between two types of service: Those that cannot work at all without recognising returning subjects (need a stable identifier for the subject) and those that do not (need as little data as possible) while in both cases still fulfilling requirements to perform access control as needed. SAML Metadata is of course suitable to express this in as much detail as needed on a per-service basis. But if an Entity Category is needed (not that we know it is, yet) that would mean we'd need two different categories, even for the same general use-case of anonymous/pseudonymous access to licensed e-resources: one with the ability to recognise returning subjects, one without.
Whether that added granularity is worth the added complexity (yet another thing that could be misconfigured!) -- for the exact same use-case -- is an open question. Seems we're in for another contradiction: Libraries want stuff that cannot be misconfigured, but they also need more than "one-size-fits-all" to ensure the least amount of data is sent for each of the common cases.
Unless we know that the "cannot work at all without recognising returning subjects" case is not a current (and will not become a common) requirement? Then a single category (or configuration recommendation) would suffice, one that would not recommend releasing a stable identifier for the subject.
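To make the two-category idea above a bit more concrete, here is a rough sketch of what an IDP-side release decision could look like. This is only an illustration under assumptions: the two category URIs, the function names, and the HMAC-based derivation of the pairwise identifier are all made up for this example and are not taken from RA21, any federation policy, or any IDP product. Only the attribute names (eduPersonScopedAffiliation, eduPersonEntitlement, pairwise-id) are existing ones.

```python
import hashlib
import hmac

# Hypothetical entity category URIs, invented for this sketch only.
ANONYMOUS = "https://example.org/entity-category/anonymous-access"
PSEUDONYMOUS = "https://example.org/entity-category/pseudonymous-access"

# Attributes sufficient for access control; none of them identify the subject.
AUTHZ_ATTRIBUTES = ("eduPersonScopedAffiliation", "eduPersonEntitlement")


def pairwise_id(secret: bytes, user: str, sp_entity_id: str, scope: str) -> str:
    """Derive a stable, per-SP identifier (assumed scheme: HMAC-SHA256).

    The same user gets a different value at each SP, so SPs cannot
    correlate subjects by comparing identifiers.
    """
    message = f"{user}!{sp_entity_id}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{digest[:32]}@{scope}"


def release_attributes(user_attrs, sp_entity_id, sp_categories, secret, scope):
    """Return only the attributes the SP's (hypothetical) category calls for."""
    # Both categories get the access-control attributes and nothing personal.
    released = {k: user_attrs[k] for k in AUTHZ_ATTRIBUTES if k in user_attrs}
    # Only SPs that must recognise returning subjects get a stable identifier.
    if PSEUDONYMOUS in sp_categories:
        released["pairwise-id"] = pairwise_id(
            secret, user_attrs["uid"], sp_entity_id, scope
        )
    return released
```

Names and email addresses are never released in either case, and an SP that claims neither category still gets only the access-control attributes. Whether a second category is worth having is exactly the open question; the sketch only shows that the difference between the two comes down to a single identifier.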
Of course then we're still left with the problem of optional personalisation (resulting in Yet Another Username and Password for the subject, at each and every SP where that's required for the desired features to work).
-peter