Added Google Cloud Storage (GCS) integration for image uploads #533
Conversation
Walkthrough: The changes add support for image uploads to Google Cloud Storage (GCS) in the application. A new authenticated API endpoint accepts image files and stores them in a GCS bucket. Unit tests for the storage handler cover validation errors and successful uploads. New Cucumber step definitions and a feature scenario are added to test file upload functionality end-to-end.
Actionable comments posted: 4
🧹 Nitpick comments (4)
docker-compose.yml (1)
23-23: Consider environment-specific bucket names.

The GCP bucket name is hardcoded to "stg-activitypub" across all environments (development, testing). While fine for development, consider making this configurable per environment to prevent accidentally using staging resources in other environments.
```diff
 services:
   activitypub:
     # ...
     environment:
-      - GCP_BUCKET_NAME=stg-activitypub
+      - GCP_BUCKET_NAME=${GCP_BUCKET_NAME:-dev-activitypub}
     # ...
   activitypub-testing:
     # ...
     environment:
-      - GCP_BUCKET_NAME=stg-activitypub
+      - GCP_BUCKET_NAME=${GCP_BUCKET_NAME:-dev-activitypub}
     # ...
   cucumber-tests:
     # ...
     environment:
-      - GCP_BUCKET_NAME=stg-activitypub
+      - GCP_BUCKET_NAME=${GCP_BUCKET_NAME:-dev-activitypub}
```

Also applies to: 130-130, 170-170
src/storage/gcloud-storage/storage.ts (1)
54-56: Consider preserving the file extension for better file management.

The current path generation uses UUIDs but doesn't preserve file extensions, which can make it harder to identify file types in storage.
```diff
 function getStoragePath(account: Account) {
-    return `images/${account.uuid}/${uuidv4()}`;
+    return (file: File) => {
+        const uuid = uuidv4();
+        // Extract extension from the original filename or MIME type
+        let extension = '';
+        if (file.name) {
+            const parts = file.name.split('.');
+            if (parts.length > 1) {
+                extension = '.' + parts.pop();
+            }
+        } else if (file.type) {
+            // Fall back to the MIME type if the filename is not available
+            const mimeToExt: Record<string, string> = {
+                'image/jpeg': '.jpg',
+                'image/png': '.png',
+                'image/gif': '.gif',
+                'image/webp': '.webp',
+            };
+            extension = mimeToExt[file.type] || '';
+        }
+
+        return `images/${account.uuid}/${uuid}${extension}`;
+    };
 }
```

Then update the handler:
```diff
 const account = await accountService.getAccountForSite(ctx.get('site'));
-const storagePath = getStoragePath(account);
+const getPathForFile = getStoragePath(account);

 const formData = await ctx.req.formData();
 const file = formData.get('file') as File;

 if (!file) {
     return new Response('No file provided', { status: 400 });
 }

+const storagePath = getPathForFile(file);
+
 try {
     const bucket = storage.bucket(bucketName);
```

src/storage/gcloud-storage/storage.unit.test.ts (2)
1-128: Consider adding tests for additional edge cases.

The test suite covers basic scenarios well but could benefit from additional tests for important edge cases such as:
- File size limits
- Handling of different file types
- Content type validation
- Malformed file data
96-99: Enhance URL validation with more specific checks.

While the regex pattern for validating the file URL is good, consider making it more specific by verifying that the account UUID from your test setup is correctly incorporated in the URL path.
```diff
 expect(responseData.fileUrl).toMatch(
-    /^https:\/\/storage.googleapis.com\/test-bucket\/images\/f4ec91bd-56b7-406f-a174-91495df6e6c\/[a-f0-9-]+$/,
+    new RegExp(`^https://storage.googleapis.com/test-bucket/images/${account.uuid}/[a-f0-9-]+$`),
 );
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`yarn.lock` is excluded by `!**/yarn.lock`, `!**/*.lock`
📒 Files selected for processing (5)
- docker-compose.yml (3 hunks)
- package.json (1 hunks)
- src/app.ts (2 hunks)
- src/storage/gcloud-storage/storage.ts (1 hunks)
- src/storage/gcloud-storage/storage.unit.test.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/app.ts (2)
- src/instrumentation.ts (1): spanWrapper (48-60)
- src/storage/gcloud-storage/storage.ts (1): createStorageHandler (7-57)

src/storage/gcloud-storage/storage.ts (1)
- src/account/account.service.ts (1): AccountService (46-512)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build, Test and Push
🔇 Additional comments (10)
package.json (1)
63-63: Details
✅ Verification successful
LGTM: Added Google Cloud Storage dependency.
The addition of the `@google-cloud/storage` dependency at version 7.16.0 looks good and aligns with the GCP storage integration work.
🏁 Script executed:
```shell
#!/bin/bash
# Verify whether the dependency version has any known vulnerabilities or is outdated

# Check the latest version
npm view @google-cloud/storage version

# Check for any npm audit issues
npm info @google-cloud/storage
```

Length of output: 1244
Final Verification: Approved – Dependency version is up-to-date and secure.
- The dependency was verified with `npm view` and confirmed as version 7.16.0, which aligns with the latest release.
- No vulnerability or audit issues were found according to the output of `npm info`.

LGTM: The addition of the `@google-cloud/storage` dependency meets the integration requirements.

src/app.ts (2)
129-129: LGTM: Import statement for GCP storage handler.

Clean import for the new storage handler module.
1056-1060: LGTM: Added upload endpoint with appropriate permissions.

The implementation follows the project's patterns by:
- Using the appropriate HTTP method (POST) for file uploads
- Restricting access to Owner and Administrator roles
- Wrapping the handler with spanWrapper for tracing
src/storage/gcloud-storage/storage.ts (1)
1-6: LGTM: Imports look good.

All necessary dependencies are imported correctly.
src/storage/gcloud-storage/storage.unit.test.ts (6)
1-7: Appropriate imports for the GCP storage feature.

The imports look well-structured, including the necessary dependencies for Google Cloud Storage, account services, and testing utilities.

8-10: Good use of mocking for the external GCP dependency.

The Storage class from @google-cloud/storage is properly mocked, which is essential for unit testing without actual cloud dependencies.

20-38: Well-structured mock context with necessary properties.

The mock context correctly provides the site information and logger needed by the storage handler, making tests more reliable.

39-70: Comprehensive test setup with proper mocks and environment variables.

The test setup properly configures all necessary mocks and environment variables. The account creation is thorough and includes all required properties.

72-79: Good error-handling test for the missing-file scenario.

This test effectively verifies that the handler responds with a 400 status when no file is provided.

81-108: Thorough test for the successful file upload operation.

The test verifies the upload workflow, the URL format, and proper interaction with the GCP Storage APIs. The response validation is comprehensive.
sagzy left a comment
Generally, I'm thinking about whether an API endpoint that is specific to image uploads would be better:
- we don't intend to support other file-type uploads until GA
- it allows for validation of the media file type (reject any file that is not an image)
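A minimal sketch of the image-only validation being suggested here; the allowlist contents are an assumption, not the PR's actual list:

```typescript
// Hypothetical allowlist -- the actual supported types are decided in the PR.
const ALLOWED_IMAGE_TYPES = new Set(['image/jpeg', 'image/png', 'image/webp']);

// Reject anything whose MIME type is not an allowed image type.
function isAllowedImage(mimeType: string): boolean {
    return ALLOWED_IMAGE_TYPES.has(mimeType);
}
```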
```ts
function getStoragePath(account: Account) {
    return `images/${account.uuid}/${uuidv4()}`;
}
```
The path starts with `images/`, whereas the rest of the API endpoint is media-type agnostic. Think either:
- We have an API endpoint that is specific to image uploads (which will also allow us to reject any files that are not supported atm)
- Or create a path helper that takes into account the media type
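A sketch of the second option, a path helper keyed off the media type; the folder names here are assumptions:

```typescript
// Hypothetical mapping from MIME type to a storage folder. Only images are
// supported today, so everything else falls back to a generic folder.
function folderForMimeType(mimeType: string): string {
    if (mimeType.startsWith('image/')) {
        return 'images';
    }
    if (mimeType.startsWith('video/')) {
        return 'videos';
    }
    return 'files';
}
```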
Also, I think we're missing the file extension.
```ts
const bucketName = process.env.GCP_BUCKET_NAME;

if (!bucketName) {
    return new Response('Bucket name is not configured', {
```
I think this should error at application boot time, not on a request!
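A sketch of failing fast at boot instead of per request. The env var name matches docker-compose, but the function name and error message are assumptions:

```typescript
// Read configuration once at startup and throw if it is missing, so the
// process refuses to boot rather than failing on the first upload request.
function requireBucketName(env: Record<string, string | undefined>): string {
    const bucketName = env.GCP_BUCKET_NAME;
    if (!bucketName) {
        throw new Error('GCP_BUCKET_NAME is not configured');
    }
    return bucketName;
}
```

This would be called as `requireBucketName(process.env)` during app initialization; a bucket-existence check against GCS could follow once the client is constructed.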
Done!
Actionable comments posted: 3
🧹 Nitpick comments (3)
features/step_definitions/stepdefs.js (1)
2015-2026: Well-structured assertion for file URL validation.

The assertions properly validate that the response contains a fileUrl that's a string and starts with 'http'. You might consider adding a more comprehensive URL validation using a regex pattern to ensure the returned URL matches the expected format.
```diff
 Then('the response contains a file URL', async function () {
     const responseJson = await this.response.clone().json();

     assert(responseJson.fileUrl, 'Response should contain a fileUrl');
     assert(
         typeof responseJson.fileUrl === 'string',
         'fileUrl should be a string',
     );
     assert(
         responseJson.fileUrl.startsWith('http'),
-        'fileUrl should be a valid URL',
+        'fileUrl should start with http',
     );
+    assert(
+        /^https?:\/\/[^\s/$.?#].[^\s]*$/.test(responseJson.fileUrl),
+        'fileUrl should be a valid URL format',
+    );
 });
```

src/storage/gcloud-storage/storage.ts (2)
60-60: Remove debug logging statement.

This console.log statement should be replaced with the logger or removed before production deployment.
```diff
-console.log('emulatorHost: ', emulatorHost);
+logger.debug('Storage emulator host:', emulatorHost);
```
95-102: Improve storage path generation.

Good implementation of storage path generation that ensures uniqueness. There are two small issues to address:
- The function doesn't check if the extension is valid
- The hardcoded 'images/' prefix might not be appropriate if we later support non-image files
```diff
 function getStoragePath(account: Account, fileName: string) {
-    const extension = fileName.split('.').pop();
+    // Get the extension and normalize it to lowercase for consistency
+    const extension = fileName.split('.').pop()?.toLowerCase();
+
     let path = `images/${account.uuid}/${uuidv4()}`;

-    if (extension) {
+    // Only add the extension if it is valid (non-empty and alphanumeric)
+    if (extension && /^[a-z0-9]+$/.test(extension)) {
         path = `${path}.${extension}`;
     }

     return path;
 }
```

For the second point, if non-image files are supported later, the helper could additionally take the file's MIME type as a parameter and choose the folder ('images' vs. a generic 'files') based on it, rather than hardcoding the `images/` prefix.
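A self-contained sketch of the helper under the assumptions above (lowercased, validated extension), with `node:crypto`'s `randomUUID` standing in for `uuidv4` so it runs without dependencies:

```typescript
import { randomUUID } from 'node:crypto';

// Sketch of unique-path generation with a validated extension; the
// 'images/' prefix and validation rule follow the review suggestion.
function getStoragePath(accountUuid: string, fileName: string): string {
    const dotIndex = fileName.lastIndexOf('.');
    const extension =
        dotIndex > 0 ? fileName.slice(dotIndex + 1).toLowerCase() : '';

    let path = `images/${accountUuid}/${randomUUID()}`;

    // Only append the extension when it is non-empty and alphanumeric.
    if (extension && /^[a-z0-9]+$/.test(extension)) {
        path = `${path}.${extension}`;
    }

    return path;
}
```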
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- docker-compose.yml (6 hunks)
- features/step_definitions/stepdefs.js (1 hunks)
- features/upload-file.feature (1 hunks)
- src/app.ts (4 hunks)
- src/storage/gcloud-storage/storage.ts (1 hunks)
- src/storage/gcloud-storage/storage.unit.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- features/upload-file.feature
🚧 Files skipped from review as they are similar to previous changes (2)
- src/storage/gcloud-storage/storage.unit.test.ts
- src/app.ts
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/storage/gcloud-storage/storage.ts (2)
- features/step_definitions/stepdefs.js (2): formData (2002-2002), file (1999-2001)
- src/core/result.ts (5): isError (15-17), error (41-43), getError (23-25), Result (3-3), ok (37-39)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build, Test and Push
🔇 Additional comments (9)
features/step_definitions/stepdefs.js (1)
1995-2013: Good implementation of the file upload test step!

This step definition correctly creates a test file and sends it as multipart/form-data to the endpoint, which matches how real clients would upload files.
src/storage/gcloud-storage/storage.ts (4)
1-16: Good organization of imports and constants.

The code has well-structured imports and clearly defined constants for allowed image types. The TODO comment about potentially supporting additional formats (gif, svg) is a good reminder for future consideration.

31-48: Excellent validation and error handling.

The file validation logic is well implemented using the Result pattern. Appropriate HTTP status codes are returned for different error scenarios (413 for file too large, 415 for unsupported file type) with clear error messages.

104-114: Good file validation logic.

The file validation logic is well implemented with clear error types. The 25MB size limit is reasonable for images.
17-22: Details
❓ Verification inconclusive
Add bucket existence validation at startup.
The code should verify bucket existence at application boot time instead of on each request, following the pattern used for PubSub in this project.
Refactor to validate bucket existence at startup:
🏁 Script executed:
```shell
#!/bin/bash
# Look for the PubSub validation pattern to understand how resources are validated at boot time
rg "topic.*exists|check.*topic|validate.*topic" --type ts

# Also look at app startup code to see where bucket validation should be added
rg "bootstrap|init|startup" --type ts --include "app.ts|index.ts|main.ts|server.ts"
```

Length of output: 654
Bucket Existence Validation at Application Startup
The current implementation in src/storage/gcloud-storage/storage.ts sets up the upload handler without any pre-validation of the bucket. To align with our pattern for PubSub resource validation (where, for example, the PubSub topic's existence is checked at startup; see the comment in src/mq/gcloud-pubsub-push/mq.ts), you should refactor the code so that the bucket's existence is verified once during application boot rather than on each upload request.

Action required:
- Move the bucket existence check out of the per-request handler.
- Integrate this check into the application initialization process (e.g., in a startup file such as app.ts, main.ts, or server.ts). Since our automated search for startup routines using `rg "bootstrap|init|startup"` did not yield clear results due to command issues, please verify manually where the application boot logic resides.
- Once the correct startup location is identified, add validation logic similar to what is used for PubSub.
docker-compose.yml (4)
23-24: Good configuration for GCP storage in all environments.

The environment variables for the GCP bucket name and storage emulator host are correctly added to all necessary services (activitypub, activitypub-testing, cucumber-tests).
Also applies to: 131-132, 174-175
34-35: Proper service dependencies setup.

The services correctly define dependencies on the fake-gcs service with an appropriate condition, ensuring they wait for the emulator to be available before starting.
Also applies to: 140-141, 185-186
255-270: Well-configured GCS emulator service.

The fake-gcs service is properly configured with:
- Correct image (fsouza/fake-gcs-server)
- Port mapping (4443)
- Network aliases in both default and test networks
- Persistent volume for data storage
- Appropriate command-line arguments
This setup provides a reliable GCS emulator for both development and testing.
271-272: Good practice using a named volume for GCS emulator data.

Using a named volume for fake-gcs-data ensures persistence between container restarts, which is important for testing file uploads and retrievals.
Actionable comments posted: 3
♻️ Duplicate comments (3)
src/storage/gcloud-storage/storage.ts (3)
53-58: Improve the error response for missing bucket configuration.

The current error response returns an empty JSON object with status 500 when the bucket name is not configured. This should include a proper error message in the response and set the Content-Type header.
```diff
 if (!bucketName) {
     logger.error('Bucket name is not configured');
-    return new Response(JSON.stringify({}), {
+    return new Response(JSON.stringify({
+        error: 'Storage configuration error',
+        message: 'Bucket name is not configured'
+    }), {
         status: 500,
+        headers: {
+            'Content-Type': 'application/json'
+        }
     });
 }
```
89-92: Set the Content-Type header in the successful response.

The successful response doesn't specify a Content-Type header, which should be set to application/json for JSON responses.
```diff
 return new Response(JSON.stringify({ fileUrl }), {
     status: 200,
+    headers: {
+        'Content-Type': 'application/json'
+    }
 });
```
79-83: ⚠️ Potential issue: Add a try/catch block for the file upload operation.
The file upload operation could fail for various reasons (network issues, permissions, etc.), but there's no error handling for this critical operation.
```diff
 const bucket = storage.bucket(bucketName);
+try {
     await bucket.file(storagePath).save(file.stream(), {
         metadata: {
             contentType: file.type,
-        }
+        },
     });
+} catch (err) {
+    logger.error('Error uploading file:', err);
+    return new Response(JSON.stringify({
+        error: 'Upload failed',
+        message: err instanceof Error ? err.message : 'Unknown error'
+    }), {
+        status: 500,
+        headers: {
+            'Content-Type': 'application/json'
+        }
+    });
+}
```

Note: This also fixes the formatting issue on line 82 that was detected by the pipeline.
🧰 Tools
🪛 GitHub Actions: CICD
[error] 82-82: File content differs from formatting output. Biome formatting check failed. Run the formatter to fix code style issues.
🧹 Nitpick comments (3)
src/storage/gcloud-storage/storage.ts (1)
60-60: Remove console.log statement.

This appears to be debugging code that shouldn't be included in the production version.
```diff
-console.log('emulatorHost: ', emulatorHost);
```

src/storage/gcloud-storage/storage.unit.test.ts (2)
81-100: Use a consistent size limit in tests and implementation.

The unit test checks for a file larger than 26MB, but the implementation rejects files larger than 25MB. These should use the same constant.
Extract the limit to a constant that can be shared between implementation and tests:
```diff
 it('handles large files', async () => {
     const ctx = getMockContext();
     const largeFile = new File(
-        ['x'.repeat(26 * 1024 * 1024)],
+        ['x'.repeat((25 * 1024 * 1024) + 1)], // Just over the 25MB limit
         'large.jpg',
         { type: 'image/jpeg' },
     );
```

Alternatively, export the size limit constant from the implementation file and import it in the test.
152-155: Strengthen URL format validation in tests.

The current test only verifies that the URL matches a very generic pattern. Consider strengthening this to verify the expected format based on whether the emulator host is set or not.
```diff
 const responseData = await response.json();
-expect(responseData.fileUrl).toMatch(/^https?:\/\/.+\/.+$/);
+// Check for the specific format based on the bucket name
+expect(responseData.fileUrl).toMatch(/^https:\/\/storage\.googleapis\.com\/test-bucket\/.+$/);
+
+// Test with the emulator too
+it('formats URLs correctly when using emulator', async () => {
+    // Set up the emulator environment
+    process.env.GCP_STORAGE_EMULATOR_HOST = 'http://fake-gcs:4443';
+
+    const ctx = getMockContext();
+    const testFile = new File(['test content'], 'test.png', { type: 'image/png' });
+    const formData = new FormData();
+    formData.append('file', testFile);
+    (ctx.req.formData as Mock).mockResolvedValue(formData);
+
+    const handler = createStorageHandler(accountService);
+    const response = await handler(ctx);
+
+    expect(response.status).toBe(200);
+
+    const responseData = await response.json();
+    // Check for the emulator-specific format
+    expect(responseData.fileUrl).toMatch(/^http:\/\/localhost:4443\/download\/storage\/v1\/b\/test-bucket\/o\/.+\?alt=media$/);
+
+    // Clean up
+    delete process.env.GCP_STORAGE_EMULATOR_HOST;
+});
```
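A pure sketch of the URL construction that both assertions imply. Both URL shapes here are inferred from the tests in this thread and are assumptions about the service:

```typescript
// Build the public URL for an uploaded object. With an emulator host set,
// fake-gcs serves downloads via its JSON API download path; otherwise the
// standard public GCS URL is used.
function buildPublicUrl(
    bucketName: string,
    storagePath: string,
    emulatorHost?: string,
): string {
    if (emulatorHost) {
        const encodedPath = encodeURIComponent(storagePath);
        return `${emulatorHost}/download/storage/v1/b/${bucketName}/o/${encodedPath}?alt=media`;
    }
    return `https://storage.googleapis.com/${bucketName}/${storagePath}`;
}
```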
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/storage/gcloud-storage/storage.ts (1 hunks)
- src/storage/gcloud-storage/storage.unit.test.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/storage/gcloud-storage/storage.ts (2)
- features/step_definitions/stepdefs.js (2): formData (2002-2002), file (1999-2001)
- src/core/result.ts (5): isError (15-17), error (41-43), getError (23-25), Result (3-3), ok (37-39)
🪛 GitHub Actions: CICD
src/storage/gcloud-storage/storage.ts
[error] 82-82: File content differs from formatting output. Biome formatting check failed. Run the formatter to fix code style issues.
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/storage/gcloud-storage/gcp-storage.service.unit.test.ts (1)
68-123: Include tests for upload error scenarios.

The tests cover successful file uploads, but there are no tests for when the file upload operation fails.
Add a test case that simulates a failure during the file save operation:
```ts
it('handles upload failures', async () => {
    // Mock a save failure
    mockBucket.file = vi.fn().mockReturnValue({
        save: vi.fn().mockRejectedValue(new Error('Upload failed')),
    });

    const validFile = new globalThis.File(['test'], 'test.png', {
        type: 'image/png',
    });

    // Verify that the method throws an error
    await expect(service.saveFile(validFile, 'test-uuid')).rejects.toThrow('Upload failed');
});
```
🧹 Nitpick comments (1)
src/storage/gcloud-storage/gcp-storage.service.unit.test.ts (1)
97-101: Use a more robust URL verification method.

The current test uses a regex pattern to verify the URL format, which might be brittle if the URL format changes. Consider extracting the URL verification logic to a separate helper function that can be reused across tests.
```ts
function verifyValidGcsUrl(url: string, bucketName: string, path: string, extension: string) {
    const baseUrl = `http://localhost:4443/storage/v1/b/${bucketName}/o/`;
    const encodedPath = encodeURIComponent(`${path}`);
    const pattern = new RegExp(`^${baseUrl}${encodedPath}\\/[a-f0-9-]+${extension}\\?alt=media$`);
    return pattern.test(url);
}

// Then in the test:
expect(result.isOk()).toBeTruthy();
if (result.isOk()) {
    expect(verifyValidGcsUrl(result.value, 'test-bucket', 'images/test-uuid', '.png')).toBeTruthy();
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- fake-gcs/Dockerfile (1 hunks)
- fake-gcs/start.sh (1 hunks)
- src/app.ts (4 hunks)
- src/storage/gcloud-storage/gcp-storage.service.ts (1 hunks)
- src/storage/gcloud-storage/gcp-storage.service.unit.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- fake-gcs/Dockerfile
- fake-gcs/start.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- src/app.ts
- src/storage/gcloud-storage/gcp-storage.service.ts
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build, Test and Push
Ref https://linear.app/ghost/issue/AP-1062/
Summary of changes:
- Added `fake-gcs` to simulate GCS locally
- Initializes the GCS client
  - Validates and creates the bucket (if running with the emulator)
- Added a new endpoint `./ghost/activitypub/upload/image` that:
  - Validates file type and size
  - Uploads the image to the GCS bucket with a unique path
  - Returns a public URL to access the image
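A hedged client-side sketch of how the endpoint described above would be called. The path is taken from this summary, the response shape (`fileUrl`) from the tests in this PR; auth headers and error details a real client needs are omitted:

```typescript
// Upload an image via multipart/form-data and return the public URL from
// the JSON response. Everything beyond the path and response shape from
// the PR description is an assumption.
async function uploadImage(baseUrl: string, file: File): Promise<string> {
    const formData = new FormData();
    formData.append('file', file);

    const response = await fetch(`${baseUrl}/ghost/activitypub/upload/image`, {
        method: 'POST',
        body: formData,
    });

    if (!response.ok) {
        throw new Error(`Upload failed with status ${response.status}`);
    }

    const { fileUrl } = (await response.json()) as { fileUrl: string };
    return fileUrl;
}
```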