Skip to content

Conversation

@kshepherd
Copy link
Member

@kshepherd kshepherd commented Feb 13, 2025

References

Description

Request a copy is an existing feature in DSpace, whereby users can request access to files they are not authorized to download, and the item author or a delegated helpdesk can approve or deny these requests.

Up until 8.x, all approved requests resulted in an email sent to the requester as attachments.

This is not a suitable solution for large files and doesn’t allow as much control over future authorization to those resources.

In this improvement, a secret key is generated to provide limited access to the requested bitstream(s) with a URL, rather than sending files as attachments. This link can be configured to allow access to just selected bitstreams, or all bitstreams of an item, and it can be configured with an “access period”, after which the secret URL will no longer work.

Example "access by secure link" item page:
screenshot_2025-02-13_15:31:09_selection

Notes for Reviewers

To test, you will also need the DSpace pull request applied: DSpace/DSpace#10407

This is halfway between an improvement and a new feature. For those doing practical testing, I recommend reading the attached PDF.
Request-a-copy-secure-links.pdf

There is a flyway sql migration - two columns are added, so make sure you are testing on an instance that doesn't need to quickly roll back to an earlier version with the existing requestitem table.

There is a new captcha added - A new GDPR complicant, proof-of-work captcha is added to the "request a copy" form. I would like to propose we make this the default for Register User as well, and am happy to open a new PR accordingly. It is under the MIT license: https://altcha.org/captcha/

Instructions for Reviewers

  1. Read the supplied PDF: Request-a-copy-secure-links.pdf
  2. Check out or merge this PR into your local environment, edit configuration to suit the way you wish to test, and build + deploy
    a. If you are a non-developer tester, you may be able to follow the docker testing instructions instead
  3. Make some items with a mix of restricted and public files, of various sizes (depending on the threshold you set, if any)
  4. As anonymous and/or authenticated (but not authorized) users, make requests for the bitstreams or 'all files of items' in question. Take note of how the request form operates, ensuring that the captcha works well (it does not require human interaction).
  5. As the owner or helpdesk delegate, check for the item request (if you are using a test environment and have mail server disabled in your config, you can see the email and copy the link from your log files). Follow the link to the approve/deny form
  6. Try approvals and denials for requests as an owner or helpdesk delegate, with different access periods.
  7. Check emails (or logs if disabled) for both kinds of emails, depending on your configuration. You may have some smaller files sent as attachments, larger ones sent as a secure link instead, you may have everything configured to send as a link, and so on.
  8. Test out access to the bitstreams / item page using the secure link provided. Note the behaviour of downloading a bitstream explicitly allowed by this link, compared to a bitstream the user already had access to, compared to a bitstream that is still unauthorized (and which should lead to a new request copy form)
  9. Try accessing files with the link outside of the access period
  10. Try images and videos with the media viewer enabled

List of changes in this PR:

  1. New components added to support secure access (via secret URL) to item page with file section and bitstream download links
  2. New components for media viewers (video and image) to support access via secret URLs
  3. New Altcha GDPR compliant captcha for the request item form
  4. Changes to the approve/deny form to allow access period to be determined
  5. Routing changes to support item and itemrequest resolution in new components

Attribution

Developed by The Library Code with the support of Technische Universität Berlin

Checklist

This checklist provides a reminder of what we are going to look for when reviewing your PR. You do not need to complete this checklist prior creating your PR (draft PRs are always welcome).
However, reviewers may request that you complete any actions in this list if you have not done so. If you are unsure about an item in the checklist, don't hesitate to ask. We're here to help!

  • My PR is created against the main branch of code (unless it is a backport or is fixing an issue specific to an older branch).
  • My PR is small in size (e.g. less than 1,000 lines of code, not including comments & specs/tests), or I have provided reasons as to why that's not possible.
  • My PR passes ESLint validation using npm run lint
  • My PR doesn't introduce circular dependencies (verified via npm run check-circ-deps)
  • My PR includes TypeDoc comments for all new (or modified) public methods and classes. It also includes TypeDoc for large or complex private methods.
  • My PR passes all specs/tests and includes new/updated specs or tests based on the Code Testing Guide.
  • My PR aligns with Accessibility guidelines if it makes changes to the user interface.
  • My PR uses i18n (internationalization) keys instead of hardcoded English text, to allow for translations.
  • My PR includes details on how to test it. I've provided clear instructions to reviewers on how to successfully test this fix or feature.
  • If my PR includes new libraries/dependencies (in package.json), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.
  • If my PR includes new features or configurations, I've provided basic technical documentation in the PR itself.
  • If my PR fixes an issue ticket, I've linked them together.

@kshepherd
Copy link
Member Author

Just a note to say that it's taken me a little while longer to fit in the google/altcha "generic" recaptcha for this, register user, feedback etc, than I first anticipated -- if it takes too long I will pull back on that "bonus extra" and we can consider this email form as just (optionally) protected by Altcha

@MW3000
Copy link
Contributor

MW3000 commented Feb 20, 2025

I tested the main functionality and it works great and as advertised ✅. I will test the different configuration possibilities in the next days.

@tdonohue tdonohue requested review from tdonohue and ybnd February 20, 2025 15:40
@MW3000
Copy link
Contributor

MW3000 commented Feb 26, 2025

I did thoroughly test the different configuration options:

Test basic configuration

  • Request all files (of item) in restricted access, grant access => Works ✅
  • Request one of many files, grant access => Works ✅
  • Deny access request => Works ✅

Test the captcha

=> Works ✅
(don't forget to set altcha.hmac.key when testing)

Test using a size limit

Request single file below limit
=> Works ✅
Request single file above limit
=> Works ✅
If 'all files' was indicated in the request, this threshold will be activated if any of the files meet the configured size minimum.
=> Works ✅

Test if access is closed again, after end of access period

=> Works insofar as access is no longer possible ✅
💡 But, after the end of the access period, I just get presented with a login page. It might be a better UX to show the custom download page and add a hint, that the access period expired?

Test restricted images and videos with media viewer enabled

=> Image Viewer works on download page ✅
💡 Should be ported to work on regular item page too! (unrelated to this issue! but we could perhaps use the solution from this pull request?)

=> Video player works on download page ✅
💡 Should be ported to work on regular item page too! (unrelated to this issue! but we could perhaps use the solution from this pull request?)

Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshepherd : Thanks for this feature! I still haven't had a chance to test the code, but I've done an initial code review. Overall, the code looks OK, but I have to admit I'm worried about the code duplication that I see with this PR.

I'm not fond of all the new item-page/access-by-token/ components, because they all look to be a copy & paste of an existing component, updated slightly to support accessTokens. I'm worried this code duplication is going to be difficult to maintain over time, and we'll eventually have bugs or improvements to the original components that don't get copied into these access-by-token versions.

I'd much rather we find a way to modify or extend the existing components to support accessTokens. Conceptually, it seems like it's just a giant "if" clause: "If accessToken exists in the URL param, then use it to download the file in our component(s). Else, do things as normal." But, the approach of this PR currently appears to be: "If accessToken exists in the URL param, then use a separate set of components entirely. Else, use the normal, default components."

All that said, I don't have a solution to present other than to look into modifying the existing components to check for the "accessToken" and use it when found. If you already did try that approach and hit a "wall" or major issue, then I'd like to better understand what that issue was, so that we can see if we can find a workaround.

Beyond that major concern, everything else inline below should be minor code cleanup. Thanks again, as this looks like a great idea, but I just want to make sure the code is easier to maintain.

<!-- </a>-->
<a [attr.routerLink]="previewLinkOptions.routerLink" class="dont-break-out d-block" [target]="'_blank'"
[attr.queryParams]="previewLinkOptions.queryParams"
[attr.rel]=""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using _blank we should set rel="noopener noreferrer"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@kshepherd
Copy link
Member Author

Thanks for your comments @tdonohue - I will work to address your specific points.

Regarding the duplication of components, you are correct; it's a tradeoff where we took the option of "maintain very similar / near-duplicates of some common components, and avoid adding all the additional if tests / access token handling to those core common components".
Some repositories that never enable request-a-copy might wonder why they now have to deal with / maintain / execute all these extra lines in components like item page, file section, download link and so on...
I'm happy to accept that my original approach is a little too conservative, and to go with the consensus. I'd prefer that we still kept a separate 'item page' so it can be modified and presented in a different way to a "normal" item page, but we can probably consolidate the code in:

  • full file section
  • download link
  • media viewers

Does that sound reasonable?

@tdonohue
Copy link
Member

tdonohue commented Mar 5, 2025

@kshepherd : I see. I'm OK with the Item pages being different, provided we find a way to document (in code comments of the component) why we have a separate copy of the Item page component.

I agree though that I'd rather see the other individual components (that you've listed) consolidated to avoid code duplication. That way if we ever update the media viewers (or similar), we aren't having to do so in multiple places.

I do also want to point out that Request a Copy seems to be enabled by default as of DSpace 7. So, I expect most/many repositories do use it now (unless they've specifically turned it off).

@ybnd
Copy link
Member

ybnd commented Mar 6, 2025

@kshepherd @tdonohue

From our point of view the duplicated Bitstream sections/links and Item pages are also a red flag, especially when taking into account theming and custom Entities.

Regarding the duplication of components, you are correct; it's a tradeoff where we took the option of "maintain very similar / near-duplicates of some common components, and avoid adding all the additional if tests / access token handling to those core common components".
Some repositories that never enable request-a-copy might wonder why they now have to deal with / maintain / execute all these extra lines in components like item page, file section, download link and so on...

While I agree that limiting the duplication to Item components only is an improvement, I'd still argue that those components are some of the most commonly customized in DSpace as a whole.
IMO this trade-off leans heavily to reducing duplicate code, as the file section / download link code is a lot less likely to be modified in the first place.

I'd like to propose two alternatives to keep Item pages deduplicated:

  1. Move the accessToken logic to a resolver and (optionally) use route data within the Item page components
    • Using a resolver would make it very easy to use the ItemRequest object in existing components without having to change their inputs at all
      • e.g. the warning could be moved to ds-item-alerts along with conceptually similar things
    • If the access token logic is limited to the components where it matters (warning, download links, ...), themed Item pages would just work with this feature out of the box
  2. Rethink the /access-by-token page as completely separate from the Item page, and don't try to keep it "in sync"
    • It could contain the file section and warning, along with links to the simple or full Item pages.
    • A clearly distinct title/layout could serve as another nudge for users not to share the access link if they happen to gloss over the warning
    • If someone wants to mirror the simple Item page here, they can always use theming

@kshepherd
Copy link
Member Author

kshepherd commented Mar 6, 2025

thanks @ybnd , I see the points and I'll think about the suggestions.. a resolver might be a good approach. Do you mind if I ping you for specific feedback on that if I try something out?

regarding suggestion 2 - that is kind of what i am going for in the current code, but i think i see your concern -- that any breaking changes to ItemComponent would also affect these pages?

@kshepherd
Copy link
Member Author

kshepherd commented Mar 6, 2025

@tdonohue @ybnd just another note to say i do like a resolver as a way of separating and simplifying the way of providing the ItemRequest object to whatever component is ultimately loaded, but in addition:

  • all entity types are supported.. the custom "access by token item page" extends item component -- the item-page-routes run for all entity types and untyped item anyway... still though, right now the metadata view is handled in that custom page so I understand it'd look different

  • if i could use the ThemedItemPageComponent with an if test to either include the full or 'view mode' file section, it'd be nice, but unfortunately it hands everything off to the generic listable object component loader.
    If i do any extra template work here, i'd have to test for access token and then include either a whole custom item view, or the listable object viewer -- or, i direct to ThemedFullItemPageComponent instead (which does have nicer separation of sub-components in an item viewer) and just do some templating to make the metadata look a bit more like the simple item view...

So I'm a bit stuck of the smoothest way forward on this... yes, I want to farm out all the parts of the item 'view' except for the file section... perhaps it's enough to hide the file section in an item view mode where an access token is present, and include the full file section in ThemedItemPageComponent afterwards, where it is present, but that means modifying this super-generic "ListableObjectComponentLoader" class to handle accessTokens when we only care about it in this super-specific "the object is an item and we are calling it in Standalone View Mode" case...

(UPDATE - see next comment, I may have a way to move forward by wrapping the item/dso in an extended model and passing it down to base components?)

@github-actions
Copy link

Hi @kshepherd,
Conflicts have been detected against the base branch.
Please resolve these conflicts as soon as you can. Thanks!

@kshepherd
Copy link
Member Author

@tdonohue @ybnd I have pushed a refactored version (I still have the original branch here to compare if/when we need it) -- it has a little console output still included so please ignore that, but I submit it to get your approval on the new approach:

  1. A resolver will resolve any accessToken query param to an ItemRequest object, in the parent item page route path
  2. A new ItemWithSupplementaryData (name TBD, just there as an example) extends Item with an optional ItemRequest var
  3. The MediaViewerItem component was changed to also contain an optional accessToken string, for building URLs
  4. The ItemPageComponent, if itemRequest is present from the resolver, will replace the itemRD.payload with a new one using the new object model above - this means Item will still work as normal, but if it happens to be instanceof ItemWithSupplementaryData, we can use the itemrequest data
  5. The FileSection component, depending on whether item request data exists, will use either the normal or secure download link component to render the download link (note: i am happy to consolidate this into the main download link too, but this does have quite a bit of extra testing just to determine paths, icons, etc so for now i kept it separate)
  6. The Image and Video media viewers will look for an accessToken in its MediaViewerItem data and use it to construct the URLs, if present

Is this closer to an acceptable solution?

@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from 85512ca to 5bf7344 Compare March 11, 2025 11:28
@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch 4 times, most recently from a41b218 to 1fd8f7e Compare March 11, 2025 14:05
@github-actions
Copy link

Hi @kshepherd,
Conflicts have been detected against the base branch.
Please resolve these conflicts as soon as you can. Thanks!

@tdonohue
Copy link
Member

NOTE: This PR has merge conflicts caused by the merger of #3997. You will need to rebase this PR on the latest main and update it to use "control flow" syntax by replacing any ngIf, ngFor, or ngSwitch usages in HTML templates with @if, @for or @switch. See the control flow docs or the above PR for more details.

@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from 1fd8f7e to c1b8ec4 Compare March 13, 2025 10:39
@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from fce5493 to 58cee5f Compare March 26, 2025 16:08
Copy link
Member

@ybnd ybnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshepherd @tdonohue sorry, I found a few more things to bring up after testing with another theme and "faking" an expired token

return itemRequestDataService.getSanitizedRequestByAccessToken(accessToken).pipe(
getFirstCompletedRemoteData(),
// Handle authorization errors, not found errors and forbidden errors as normal
redirectOn4xx(router, authService),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this means that you'll be shown a login page if you try to follow the access link after it expires.

It would be more informative if we could still show the Item page, but with a notice to explain what's going on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notifications now show 2 kinds of errors, depending on if the token is expired, or just was not granted or was revoked. the redirect operator here is still in place, but won't trigger in those scenarios because the backend returns 200 for those cases in the findByAccessToken method (of course, it still throws a hard authorize error in the actual bitstream download check)

* to help components determine how the Item or its data/bitstream should be delivered
* and presented to the users, but not part of the actual database model.
*/
export class ItemWithSupplementaryData extends Item {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is still here; as far as I can tell it's only used in one other place as part of an instanceof check that isn't needed anymore?

}
</div>
}
</ng-container>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be moved to a separate component like the other alerts below?

If we change the REST/resolver logic to still expose expired RequestItems this would also be the place to show an alternative message, e.g. something along the lines of "you're viewing this page with an expired access token, so you no longer have access to certain files"

The date is shown as an ISO timestamp, which is a bit hard to read. Could you transform it with e.g. | date:"YYYY/MM/dd hh:mm"?
20250327-141723

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
.request-a-copy-access-icon {
margin-right: 4px;
color: #26a269;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This colour is defined in two places.

Could you use a Bootstrap variable like --bs-success for it, or create a new --ds- variable in _custom_variables.scss to make sure it's easy to theme?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

<!-- Access period dropdown -->
<div ngbDropdownMenu aria-labelledby="accessPeriod">
@for (accessPeriod of (validAccessPeriods$ | async); track accessPeriod) {
<button
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dropdown looks very weird in the custom theme:
20250327-141128

...vs. in the dspace theme:
20250327-141325

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i hadn't seen this and it just followed the same pattern of ngbDropdownMenu used elsewhere but I'm trying a few things out, please let me know if you still see that

@github-project-automation github-project-automation bot moved this from 👍 Reviewer Approved to 👀 Under Review in DSpace 9.0 Release Mar 27, 2025
@kshepherd
Copy link
Member Author

@ybnd thanks, I'm working on these right now. I don't want to do a large refactor to support the special expiry alert vs 4xx redirects so I'm seeing what I can do with error messages rather than altering the models..

@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from b8263a0 to 2c435a0 Compare March 27, 2025 16:15
@kshepherd
Copy link
Member Author

kshepherd commented Mar 27, 2025

@ybnd just pushed changes now. It requires the latest version of the DSpace backend PR too, since I no longer thrown an AuthorizeException in that findByAccessToken repository get method, but instead return the object with "expired" set.

  • New accessExpired flag can be checked along with acceptRequest, so expired or not-granted links  will show an error and draw 'normal' download links instead of doing a hard 4xx redirect
  • The download links will just be the usual restricted ones, so if a token has expired, the user can just click to submit a new request-a-copy request
  • Old ItemWithSupplementaryData component removed
  • Notifications moved to sub-component of item page and display a warning on valid access token, or a danger alert if expired or not-granted, with an appropriate error message
  • Tests written for each scenario to ensure correct alert is shown
  • Date formatted in messages as yyyy-MM-dd

You can see examples of the "expired" and "not granted / revoked" errors below:

screenshot_2025-03-27_16:44:20_selection
screenshot_2025-03-27_16:37:11_selection

@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from 2c435a0 to 293e4c4 Compare March 27, 2025 16:23
@kshepherd kshepherd requested a review from ybnd March 27, 2025 16:24
@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from 1eab960 to abd3b71 Compare March 27, 2025 17:02
* New accessExpired flag can be checked along with
  acceptRequest, so expired or not-granted links
  will show an error and draw 'normal' download links
  instead of doing a hard 4xx redirect
* Old ItemWithSupplementaryData component removed
* Notifications moved to sub-component of item page
  and display a warning on valid access token, or a
  danger alert if expired or not-granted, with an
  appropriate error message
* Date formatted in messages as yyyy-MM-dd
@kshepherd kshepherd force-pushed the request-a-copy-secure-links_main branch from abd3b71 to e9cf183 Compare March 27, 2025 17:18
@kshepherd
Copy link
Member Author

btw my tests should pass, for some reason the last few CI/CD tests fail to download+build the docker image, so please don't let that put you off

Copy link
Member

@ybnd ybnd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for the quick turnaround, and sorry again for the short notice.

Very happy with the overall shape of the codebase & UX now 👍

The dropdown issue in the custom theme needs one more tiny change to fully work and then we can merge!

@pnbecker pnbecker force-pushed the request-a-copy-secure-links_main branch from bd40391 to 87624c7 Compare March 28, 2025 10:49
@pnbecker pnbecker requested a review from ybnd March 28, 2025 10:50
@kshepherd
Copy link
Member Author

thanks @ybnd , as you see the change is cherry-picked, and I tested it in my custom theme and it's looking fine
screenshot_2025-03-28_13:43:39_selection

@tdonohue
Copy link
Member

Thanks all! As I see that @ybnd gave this a 👍 in his last comment, and his one minor suggestion has been accepted, I'm considering this at +2. I see the backend PR is also at +2.

Therefore, I'm merging this immediately. Thanks again @kshepherd and @pnbecker for all the hard work on this!

Keep in mind, we will need documentation both for the Request a Copy enhancements and the Altcha captcha options added to https://wiki.lyrasis.org/display/DSDOC9x/ Therefore, I'm adding the needs documentation label until that is complete.

@tdonohue tdonohue added this to the 9.0 milestone Mar 28, 2025
@tdonohue tdonohue merged commit bb18278 into DSpace:main Mar 28, 2025
15 checks passed
@github-project-automation github-project-automation bot moved this from 👀 Under Review to ✅ Done in DSpace 9.0 Release Mar 28, 2025
@tdonohue tdonohue added the needs documentation PR is missing documentation. All new features and config changes require documentation. label Mar 28, 2025
@tdonohue
Copy link
Member

tdonohue commented Apr 1, 2025

@kshepherd or @pnbecker : Would one of you be able to add documentation for this feature in our 9.0 docs? https://wiki.lyrasis.org/display/DSDOC9x Ideally, basic documentation would exist prior to our 9.0 Testathon (which starts on Monday, April 7). But, if it cannot exist by then, it should hopefully be created shortly thereafter.

@tdonohue
Copy link
Member

tdonohue commented Apr 8, 2025

@tdonohue tdonohue removed the needs documentation PR is missing documentation. All new features and config changes require documentation. label Apr 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

No open projects
Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

5 participants