
* Jos Westerbeke jos.westerbeke@eur.nl [2019-05-15 09:51]:
I think we made a lot of progress and two (principle) choices appear already on the surface: 5.a and 5.b in the Guidelines document.
Nick pointed out to me that in the US many IDPs release personal data (so-called "directory data") to all SPs in order to get many of them working with one simple configuration.
None of the methods I suggested would work in this case. (Those were about how existing SAML 2.0 Metadata for an SP should be used to also express the lack of data needed: just don't request any data!)
So I'll look some more into the effects on existing popular SAML implementations to find out just how bad a "negative" category (one that mandates NOT sending any personal data to such SPs, not even stable pseudonymous identifiers) would be and whether it could be made to work. I'll report back here once I know more about this.
FYI, the plan would be to codify at least 5.a and 5.b from the guidelines into categories (but phrase them in a positive way where possible): 5.a would be something like the "privacy star" for an SP that does not need (and MUST NOT process) any attributes except for access control and possibly statistical reporting. Ideally an SP would find it desirable to get this "trust mark". (One may dream, right?) (That would also contain prescriptions for how the SAML metadata would have to look: e.g. no NameIDFormats or only "transient", a subject-id Entity Attribute of "none", no RequestedAttribute elements other than a "whitelisted" set of attrs for authz/statistics. This should enable consistent creation/curation of metadata for such SPs across federations, cultures and countries while reducing to zero any signals that would cause an ordinary IDP -- one that doesn't release data to anyone -- to release data anyway. The inconsistency in application by federations would still remain in whether /this/ or /another/ or no category will be assigned. But if it's this one, it would have rules attached that minimize the chance of unwanted data being transferred/processed.)
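To make the metadata prescriptions for 5.a a bit more concrete, here is a rough sketch of how a federation operator might check an SP's metadata against them. It is only an illustration under assumptions: the function name check_5a and the whitelist ALLOWED_ATTRS (here just eduPersonScopedAffiliation) are made up for this example, and the actual rules would be whatever the category ends up specifying.

```python
import xml.etree.ElementTree as ET

NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "mdattr": "urn:oasis:names:tc:SAML:metadata:attribute",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}
TRANSIENT = "urn:oasis:names:tc:SAML:2.0:nameid-format:transient"
SUBJECT_ID_REQ = "urn:oasis:names:tc:SAML:profiles:subject-id:req"

# ASSUMPTION: a hypothetical whitelist of attributes allowed for
# authz/statistics; the real list is not decided anywhere yet.
ALLOWED_ATTRS = {"urn:oid:1.3.6.1.4.1.5923.1.1.1.9"}  # eduPersonScopedAffiliation


def check_5a(entity_xml):
    """Return a list of problems; an empty list means the sketch's rules pass."""
    root = ET.fromstring(entity_xml)
    problems = []

    # Rule 1: no NameIDFormat elements, or only "transient".
    for fmt in root.iterfind(".//md:SPSSODescriptor/md:NameIDFormat", NS):
        value = (fmt.text or "").strip()
        if value != TRANSIENT:
            problems.append("NameIDFormat not allowed: " + value)

    # Rule 2: the subject-id requirement entity attribute must be exactly "none".
    req_values = [
        (v.text or "").strip()
        for attr in root.iterfind(".//mdattr:EntityAttributes/saml:Attribute", NS)
        if attr.get("Name") == SUBJECT_ID_REQ
        for v in attr.iterfind("saml:AttributeValue", NS)
    ]
    if req_values != ["none"]:
        problems.append("subject-id requirement is not 'none': " + repr(req_values))

    # Rule 3: no RequestedAttribute outside the whitelist.
    path = ".//md:AttributeConsumingService/md:RequestedAttribute"
    for ra in root.iterfind(path, NS):
        name = ra.get("Name")
        if name not in ALLOWED_ATTRS:
            problems.append("RequestedAttribute not whitelisted: " + str(name))

    return problems
```

Calling check_5a() with the XML of a single EntityDescriptor returns the list of violations, so a federation could run it before assigning the category.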
5.b would then maybe be something like "subject tracking and personalisation possible" (so spelling out the danger right away along with the possible feature) and would *only* have a persistent pseudonymous identifier as part of the MUST set (pairwise-id, possibly with allowed fallbacks to persistent NameID or eduPersonTargetedID) in addition to the data in 5.a. I'm still a bit undecided wrt optional data, but I think all of that should then come from the subject herself: We help enable that by making a stable but pseudonymous identifier available, the rest should be up to the SP and the subject. So 5.b probably only differs from 5.a in the identifier.
I'm not certain what, if anything, we'd do about 5.c (SPs needing more data). Probably nothing other than recommending that such SPs opt into the GEANT Data Protection Code of Conduct (v2, really), since at that point the SP will probably have to provide added incentive for IDPs to *release* that data, not to avoid the release.
-peter