Releases: data-dot-all/dataall
v2.9.0
Release v2.9.0
What's Changed
🚀 Major Features
- Automated metadata generation using genAI by @dlpzx in #1670
- [GH-1806] iam user support by @TejasRGitHub in #1817
🔧 Technical Upgrades
- upgrade python to 3.12 by @petrkalos in #1850
- upgrade to SQLAv2 by @petrkalos in #1849
🔒 Security & Permission Hardening
- Restrict Glue IAM permissions for pivot role (#1189) and Restrict RAM IAM permissions in pivot role #1195 by @arjunp99 in #1853
- Fix: Scope CDK execution policy IAM permissions to prevent privilege escalation (#1614) by @arjunp99 in #1864
- fix pivot role ram policy by @petrkalos in #1876
- add codecommit:GitPull permissions by @petrkalos in #1851
- add missing permission by @petrkalos in #1872
- must be able to assume the default pivotRole which is prefixed with d… by @petrkalos in #1873
✨ Enhancements
- [Gh-1863] Support for editing KMS keys by @TejasRGitHub in #1865
- [GH-1824] Worksheet fixes by @TejasRGitHub in #1871
- Issue1856: Allow periods in S3 bucket name by @anushka-singh in #1857
- add CloudTrail perms and improve EnvironmentCreateForm by @petrkalos in #1852
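As an illustration of what allowing periods in S3 bucket names implies for validation, here is a minimal sketch of a bucket-name check. It covers only a subset of the published Amazon S3 naming rules for general purpose buckets. It is not data.all's actual validator: the function name `is_valid_bucket_name` is hypothetical, and other S3 rules (for example reserved prefixes and suffixes) are deliberately left out.

```python
import re

# Subset of the S3 general purpose bucket naming rules:
# 3-63 characters; lowercase letters, digits, periods and hyphens only;
# must begin and end with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address (e.g. 192.168.5.4) are not allowed.
_IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Hypothetical helper: check a name against the rule subset above."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    if ".." in name:  # two adjacent periods are not allowed
        return False
    if _IP_LIKE.fullmatch(name):
        return False
    return True


for candidate in ["my.data.bucket", "my..bucket", "192.168.5.4", "My.Bucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```

Periods are accepted inside the name, while consecutive periods, uppercase letters and IP-like names are still rejected.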
🐛 Bug Fixes
- Issue1867: Fix entity type in trigger function by @anushka-singh in #1874
- fix migrations image dep by @petrkalos in #1855
- drop IsMultiDialectView by @petrkalos in #1875
- sync alembic revisions by @petrkalos in #1877
📦 Dependency Updates
- upgrade ddk, urllib3, pytest and opensearch to latest by @petrkalos in #1848
- upgrade form-data by @petrkalos in #1846
- Bump requests from 2.32.2 to 2.32.4 in /backend/dataall/base/cdkproxy by @dependabot[bot] in #1832
- Bump on-headers and compression in /frontend by @dependabot[bot] in #1844
- Bump axios from 1.8.4 to 1.12.1 in /frontend by @dependabot[bot] in #1861
New Contributors
Full Changelog: v2.8.0...v2.9.0
v2.8.0
Summary
🚨 This release includes a major infrastructure update that automates the migration from Aurora V1 to V2 with minimal configuration changes. 🚨 It also delivers key improvements such as asynchronous notifications in metadata form for enforcement rules and the migration of the user guide to GitHub Pages. As part of keeping the infrastructure up to date, Node.js 18 (now deprecated and nearing end-of-life) has been removed and replaced with Node.js 22 for CDK code builds and cdk synth. UI bugs identified after the 2.7.0 release have been addressed, and additional issues in the metadata forms module have been resolved. Integration test performance has been improved through faster execution and role configuration, and database session handling has been made more robust. The release also includes important security and compatibility updates to dependencies like urllib3 and requests.
What's Changed
Major changes
- Aurora RDS upgrade from v1 to v2 using CDK build #1759 – by @SofiaSazonova
- Drop user guide endpoint & migrate to GitHub Pages #1829 – by @petrkalos
Other changes
Features and Enhancements
- Asynchronous notifications for MF enforcement rules #1804 – by @SofiaSazonova
- Update Node to v22 #1830 – by @petrkalos
- Integ test consumption role + pytest early exit improvements #1827 – by @petrkalos
Bug Fixes
- Fix MF delete_rule #1794 – by @petrkalos
- Prevent scoped_session from closing active connections #1786 – by @petrkalos
- Fix for blank screen (GH-1796) #1797 – by @TejasRGitHub
- Critical MF bug fixes (GH-1781) #1795 – by @TejasRGitHub
- Bug fixes post 2.7.0-rc #1785 – by @TejasRGitHub
Security
- Fix CDK-Nag for cross-region stack #1820 – by @petrkalos
Dependency Updates
- Bump urllib3 from 1.26.19 to 2.5.0 in custom_authorizer #1825 – by @dependabot
- Bump requests from 2.32.2 to 2.32.4 in custom_authorizer #1823 – by @dependabot
- Bump requests from 2.32.2 to 2.32.4 in backend #1826 – by @dependabot
Full Changelog: v2.7.0...v2.8.0-rc
v2.7.0
The data.all 2.7.0 release places a strong emphasis on fortifying platform security, while simultaneously delivering significant new capabilities. Major advancements, such as the robust Amazon Redshift integration with enhanced sharing controls and the introduction of row and column level data filtering, dramatically improve granular access governance for diverse data assets. Furthermore, dynamic metadata forms now enable programmatic enforcement of security policies, adding another layer of data protection. These pivotal features are backed by comprehensive security enhancements including strengthened input validation, critical dependency upgrades, platform hardening (like S3 bucket versioning), improved logging and monitoring, and advanced network security controls, all contributing to a more secure and resilient data ecosystem.
Finally, a warm welcome to @anushka-singh, @rbernotas, and @TejasRGitHub from Yahoo to data.all's maintainers team.
What's Changed
Security Related Changes
- fix DatabaseResourceArn SSM param by @petrkalos in #1398
- Add init for resource lock by @noah-paige in #1426
- Fix: Typo, missing @staticmethod in ResourcePolicyRepository method by @dlpzx in #1439
- Redshift data sharing - Cluster encryption guardrails and information by @dlpzx in #1447
- update checkov baseline for cdk synth output by @noah-paige in #1450
- Updated glue crawler security config by @mourya-33 in #1434
- allow dbmigrations lambda to invoke any alembic command by @petrkalos in #1488
- Import Datasets: Validate that bucket is unique by @SofiaSazonova in #1498
- check bucket encryption type: key|alias by @SofiaSazonova in #1499
- Validate imported resource names via NamingConventionService by @SofiaSazonova in #1501
- S3Bucket WRITE/MODIFY permissions by @petrkalos in #1472
- Allow origins conf changes by @mourya-33 in #1486
- fix importing sse encrypted buckets by @petrkalos in #1514
- Redshift data sharing - Add interface for share validations and Redshift guardrails by @dlpzx in #1484
- Update baseline removing checkov exception for glue security config by @noah-paige in #1516
- Add External Id Conditions to Deployment Roles by @noah-paige in #1521
- Add bucket versioning by @noah-paige in #1522
- Add bucket versioning pt 2 by @noah-paige in #1529
- Increase access point creation buffer time and fix bug in share cross account if condition by @SofiaSazonova in #1552
- Bandit fix: explicitly install typing-extensions by @SofiaSazonova in #1600
- New permission model for Redshift ADMIN connections by @dlpzx in #1573
- warn users when evaluating a non-readonly share request by @petrkalos in #1568
- try to create AP every time, catch if already exists by @SofiaSazonova in #1609
- Restrict invitation to Redshift Connections and edit permission name by @dlpzx in #1638
- Add forceDelete to shareObjects to clean-up all shareItems by @dlpzx in #1646
- Add permission checks to markNotificationAsRead + deleteNotification by @noah-paige in #1654
- Add Removal Policy Retain to Bucket Policy IaC by @noah-paige in #1660
- Extend Tenant Perms Coverage by @noah-paige in #1630
- add custom domain support for apigw by @petrkalos in #1679
- add warning to untrust data.all account when removing an environment by @petrkalos in #1685
- Restrict pivotRole permissions with DENY statement by @dlpzx in #1681
- Added Token Validations by @noah-paige in #1682
- Updating overly permissive policies tagged by checkov for environment role using least privilege principles by @mourya-33 in #1632
- Update sanitization technique by @noah-paige in #1692
- Fix/input validation by @noah-paige in #1693
- Add MANAGE_SHARES permissions by @dlpzx in #1702
- Disable introspection on prod sizing by @noah-paige in #1704
- Bump python runtime to bump cdk klayers cryptography version by @noah-paige in #1707
- tenant-permission tests by @dlpzx in #1694
- Added permission check - is tenant to update SSM parameters API by @dlpzx in #1714
- Add GET_SHARE_OBJECT permissions to get data filters API by @dlpzx in #1717
- Add permissions on list datasets for env group + cosmetic S3 Datasets by @dlpzx in #1718
- Add GET_WORKSHEET permission in RUN_SQL_QUERY by @dlpzx in #1716
- Added permissions to Quicksight monitoring service layer by @dlpzx in #1715
- Add LIST_ENVIRONMENT_DATASETS permission for listing shared datasets and cleanup unused code by @dlpzx in #1719
- Add omics create_run unauthorized test and improve other tests by @dlpzx in #1723
- Introduce is_owner permissions to Glossary mutations + add new integration tests by @dlpzx in #1721
- Refactor env permissions + modify getTrustAccount by @dlpzx in #1712
- Avoid infinite loop in glossaries checks by @dlpzx in #1725
- Feed consistent permissions by @dlpzx in #1722
- Votes consistent permissions by @dlpzx in #1724
- Consistent get_<DATA_ASSET> permissions - Dashboards by @dlpzx in #1729
- add resource permission checks by @petrkalos in #1711
- Consistent get_<DATA_ASSET> permissions - S3_Datasets by @dlpzx in #1727
- [BUGFIX] gh-1734 by @TejasRGitHub in #1741
- [Gh 884] IAM policy splitting for requestor IAM policies by @TejasRGitHub in #1650
- [Bugfix] - Changes in logic to delete share db by @TejasRGitHub in #1706
- [Bugfix] | GH-1749 - Fixing share expiration task by @TejasRGitHub in #1750
- Fix: Add conditional to not lock empty list of resources by @dlpzx in #1760
- disable apigw data tracing to avoid leaking sensitive information by @petrkalos in #1798
- allow customization of waf rate limits and api gateway throttling limits by @petrkalos in #1800
- add s3 server access logging by @petrkalos in #1811
- git CodeBuild baseline role permissions to use GitHub connection by @SofiaSazonova in #1813
- create a new access logs bucket instead of importing by @petrkalos in #1815
- Fix/custom auth 500 by @petrkalos in #1792
- change all lambdas to structured logging by @petrkalos in #1801
- add explicit token duration config for both JWTs by @noah-paige in #1698
- Userguide signout flow by @noah-paige in #1629
- log API handler response only for LOG_LEVEL DEBUG. Set log level INFO for prod deployments by @dlpzx in #1662
- Separating Out Access Logging by @noah-paige in #1695
Major Changes
Redshift Integration
This section details significant advancements in integrating Amazon Redshift, enabling better management, sharing, and security of Redshift datasets within the platform.
- Add Redshift datasets module by @dlpzx in #1424 - Introduces a new module for managing Redshift datasets.
- Redshift dataset module testing: Re-added client factories, mocking clients by @dlpzx in #1449 - Enhances testing capabilities for the Redshift dataset module by re-adding client factories and mocking clients.
- Redshift data sharing - Redshift connection types and namespace Id by @dlpzx in #1451 - Adds support for different Redshift connection types and namespace IDs for data sharing.
- Redshift data sharing - Boilerplate for redshift dataset sharing module by @dlpzx in #1461 - Provides foundational code for the Redshift dataset sharing module.
- Redshift data sharing - Make ShareObject.IAMRole a generic "Role" by @dlpzx in #1462 - Generalizes the IAM Role definition within ShareObject for Redshift data sharing.
- Redshift data sharing - Polish frontend views for Redshift shares by @dlpzx in #1477 - Improves the user interface for managing Redshift shares.
- Redshift data sharing - Add sharing tasks to process Redshift datashares by @dlpzx in #1467 - Implements tasks to process Redshift data shares.
- Redshift data sharing - Added methods from sharing back to redshift datasets (check_on_delete, list_shared_datasets...) by @dlpzx in #1511 - Adds methods for managing shared Redshift datasets, including checks on deletion and listing shared datasets.
- Redshift data sharing - Documentation 1 - Redshift Connections and Datasets by @dlpzx in #1512 - First part of the Redshift connections and datasets documentation.
- Redshift data sharing - Documentation 2 - Redshift Sharing by @dlpzx in #1519 - Second part of the Redshift sharing documentation.
- Redshift data sharing - frontend changes in the Catalog - clean by @dlpzx in #1458 - Cleans up frontend changes related to Redshift data sharing in the Catalog.
- Fix wrong environment in the verification of redshift role by @dlpzx in #1587 - Corrects an issue with Redshift role verification related to environments.
- Add Redshift connection tooltips and info + restrict to DATA_USER connections for import Redshift Dataset by @dlpzx in #1565 - Adds helpful tooltips and restricts Redshift dataset import to DATA_USER connections.
- Integration tests executed on a real deployment as part of the CICD - Redshift Connections by @dlpzx in #1628 - Introduces integration tests for Redshift connections within the CI/CD pipeline, executed on a real deployment.
- Integration tests executed on a real deployment as part of the CICD - Redshift Datasets by @dlpzx in #1636 - Adds integration tests for Redshift datasets within the CI/CD pipeline, executed on a real deployment.
- Fix error message of Redshift share verifier by @dlpzx in #1647 - Resolves an issue with the error message from the Redshift share verifier.
- Fix: check if Redshift table exists before publishing it to data.all by @dlpzx in #1644 - Ensures Redshift tables exist before being published to data.all.
- Integration tests executed on a real deployment as part of the CICD - Redshift Shares by @dlpzx in #1643 - Implements integration tests for Redshift shares within the CI/CD pipeline, executed on a real deployment.
Test improvements
This section highlights a series of enhancements to the tes...
v2.7.0-rc1
What's Changed
- fix DatabaseResourceArn SSM param by @petrkalos in #1398
- fix delete_env parameter by @petrkalos in #1397
- Fix deprecated mui tree view by @noah-paige in #1427
- Add init for resource lock by @noah-paige in #1426
- Database tables and enums for metadata forms by @SofiaSazonova in #1422
- Dependencies: Upgrade fast-xml-parser to 4.4.1 by @dlpzx in #1441
- Fix: Typo, missing @staticmethod in ResourcePolicyRepository method by @dlpzx in #1439
- Feat: API call to query Enum values by @SofiaSazonova in #1435
- Feat: API call to query Enum values - continuation - semgrep fix by @SofiaSazonova in #1445
- Add Redshift datasets module by @dlpzx in #1424
- Fix for getting correct gluedb name for central cataloged dataset by @TejasRGitHub in #1433
- pass ShareableType instead of its value and log exception details by @petrkalos in #1452
- Redshift dataset module testing: Re-added client factories, mocking clients by @dlpzx in #1449
- Redshift data sharing - Cluster encryption guardrails and information by @dlpzx in #1447
- Redshift data sharing - frontend changes in the Catalog - clean by @dlpzx in #1458
- Issue1456: Fix for persistent email reminders by @anushka-singh in #1457
- Redshift data sharing - Redshift connection types and namespace Id by @dlpzx in #1451
- Redshift data sharing - Boilerplate for redshift dataset sharing module by @dlpzx in #1461
- hide access point consumer details if access points feature is disabled by @fourtyplustwo in #1466
- Redshift data sharing - Make ShareObject.IAMRole a generic "Role" by @dlpzx in #1462
- Metadata forms-2: Create, display list, search list by @SofiaSazonova in #1444
- Fix: Remove enums from i-tests for MFs by @SofiaSazonova in #1473
- move backend approval_tests as the last step within the backend stage by @petrkalos in #1423
- Fix local share processors registered by @noah-paige in #1470
- Issue1468: Submit request redirect by @anushka-singh in #1469
- update checkov baseline for cdk synth output by @noah-paige in #1450
- Metadata forms 3: Metadata Form View page. Add, Edit fields by @SofiaSazonova in #1455
- Row/Column Level Data Filters by @noah-paige in #1438
- Fix history of alembic migration scripts data filters vs metadata forms by @dlpzx in #1478
- Redshift data sharing - Polish frontend views for Redshift shares by @dlpzx in #1477
- Bugfix: Parsing error in Admin settings tab by @SofiaSazonova in #1482
- Redshift data sharing - Add sharing tasks to process Redshift datashares by @dlpzx in #1467
- Upgrade axios version by @noah-paige in #1483
- Run reapply automatically if Share Verifier Task detects Unhealthy Shared Items by @noah-paige in #1476
- Save data filter perms before backfilling by @noah-paige in #1485
- Updated glue crawler security config by @mourya-33 in #1434
- Metadata forms 4: Access Control by @SofiaSazonova in #1474
- fix table share revoke with no filters by @noah-paige in #1493
- allow dbmigrations lambda to invoke any alembic command by @petrkalos in #1488
- Metadata forms 5: UI improvement + possible values validation by @SofiaSazonova in #1480
- Import Datasets: Validate that bucket is unique by @SofiaSazonova in #1498
- check bucket encryption type: key|alias by @SofiaSazonova in #1499
- Modifying Regex for fixing redirection not working when visiting s3-datasets by @TejasRGitHub in #1494
- Make log query period configurable by @SofiaSazonova in #1503
- Validate imported resource names via NamingConventionService by @SofiaSazonova in #1501
- S3Bucket WRITE/MODIFY permissions by @petrkalos in #1472
- Allow origins conf changes by @mourya-33 in #1486
- fix importing sse encrypted buckets by @petrkalos in #1514
- feat(GH-1083) share expiration by @TejasRGitHub in #1489
- Redshift data sharing - Add interface for share validations and Redshift guardrails by @dlpzx in #1484
- Bump flask-cors from 4.0.1 to 5.0.0 in /backend by @dependabot in #1515
- Bump webpack to 5.94.0 by @noah-paige in #1517
- Bump micromatch from 4.0.7 to 4.0.8 in /frontend by @dependabot in #1518
- Update baseline removing checkov exception for glue security config by @noah-paige in #1516
- Redshift data sharing - Added methods from sharing back to redshift datasets (check_on_delete, list_shared_datasets...) by @dlpzx in #1511
- add docs on how to create table filters and assign to shares by @noah-paige in #1506
- Metadata forms 6: attach MF to Orgs, Envs and Datasets by @SofiaSazonova in #1495
- Redshift data sharing - Documentation 1 - Redshift Connections and Datasets by @dlpzx in #1512
- Redshift data sharing - Documentation 2 - Redshift Sharing by @dlpzx in #1519
- Upgrade path-to-regexp to 0.1.10 by @dlpzx in #1525
- Add External Id Conditions to Deployment Roles by @noah-paige in #1521
- Add bucket versioning by @noah-paige in #1522
- Upgrade body parser dependency by @noah-paige in #1530
- Increase CodeBuild timeout for integration tests by @dlpzx in #1532
- Add bucket versioning pt 2 by @noah-paige in #1529
- Upgrade send to 0.19.0 and express to 4.20.0 by @dlpzx in #1542
- Config log retention by @noah-paige in #1527
- Add check to skip processor initialization if there are not shareable items in revoke, verify and reapply by @dlpzx in #1538
- Updating logic to check if expiration is changed on the UI by @TejasRGitHub in #1545
- Add Dataset integration tests - Tables, Folders by @noah-paige in #1391
- add mlstudio integ tests by @petrkalos in #1535
- Allow configurable Region to run CDK IaC checks by @noah-paige in #1531
- Feat/integration tests dataset filters by @noah-paige in #1539
- Increase access point creation buffer time and fix bug in share cross account if condition by @SofiaSazonova in #1552
- Add Dataset integration tests - Dataset missing tests, Table Profiling by @dlpzx in #1533
- Add Permissions integration tests by @dlpzx in #1550
- Add Stacks and KeyValueTags integration tests by @dlpzx in #1551
- Add VPC network integration tests + fix tags bug in networks by @dlpzx in #1555
- Add Glossaries integration tests by @dlpzx in #1556
- Add Redshift connection tooltips and info + restrict to DATA_USER connections for import Redshift Dataset by @dlpzx in #1565
- fix setting maintenance modes enum by @noah-paige in #1567
- Feat/integration tests dashboards by @noah-paige in #1560
- Upgrade rollup to n...
v2.6.2
🔐 Security
- Update sanitization technique for terms filtering by @noah-paige in #1692 and in #1693
- Move access logging to a separate environment logging bucket by @noah-paige in #1695
- Add explicit token duration config for both JWTs by @noah-paige in #1698
- Disable GraphQL introspection if prod sizing by @noah-paige in #1704
- Add snyk workflow on schedule by @noah-paige in #1705, #1708, #1713, #1745 and in #1746
- Unify Logger Config for Tasks by @noah-paige in #1709
- Updating overly permissive policies tagged by checkov for environment role using least privilege principles by @mourya-33 in #1632
The data.all permission model has been reviewed to ensure all Mutations and Queries have proper permissions:
- Add MANAGE_SHARES permissions by @dlpzx in #1702
- Add permission check - is tenant to update SSM parameters API by @dlpzx in #1714
- Add GET_SHARE_OBJECT permissions to get data filters API by @dlpzx in #1717
- Add permissions on list datasets for env group + cosmetic S3 Datasets by @dlpzx in #1718
- Add GET_WORKSHEET permission in RUN_SQL_QUERY by @dlpzx in #1716
- Add permissions to Quicksight monitoring service layer by @dlpzx in #1715
- Add LIST_ENVIRONMENT_DATASETS permission for listing shared datasets and cleanup unused code by @dlpzx in #1719
- Add is_owner permissions to Glossary mutations + add new integration tests by @dlpzx in #1721
- Refactor env permissions + modify getTrustAccount by @dlpzx in #1712
- Add Feed consistent permissions by @dlpzx in #1722
- Add Votes consistent permissions by @dlpzx in #1724
- Consistent get_<DATA_ASSET> permissions - Dashboards by @dlpzx in #1729
🧪 Test improvements
Integration tests are in sync with main without 2.7 planned features. In this PR all core modules, optional modules and submodules are tested. That includes: tenant-permissions, omics, mlstudio, votes, notifications and backwards compatibility of s3 shares. by @SofiaSazonova, @noah-paige, @petrkalos and @dlpzx
In addition, the following PR adds functional tests that ensure the permission model of data.all is not corrupted.
- ⭐ Add resource permission checks by @petrkalos in #1711
Dependencies
- Update FastAPI by @petrkalos in #1577
- update fastapi dependency by @noah-paige in #1699
- Upgrade "cross-spawn" to "7.0.5" by @dlpzx in #1701
- Bump python runtime to bump cdk klayers cryptography version by @noah-paige in #1707
v2.6.1
What's Changed
This release is focused on security enhancements
- Added Token Validations (#1682) + small fix in get-parameter CloudfrontDistributionDomainName from us-east-1 (#1687)
- Add warning to untrust data.all account when removing an environment (#1685)
- Add custom domain support for apigw (#1679)
- Lambda Event Logs Handling (#1678)
- Upgrade Spark version to 3.3 (#1675)
- ES Search Query Collect All Response (#1631)
- Extend Tenant Perms Coverage (#1630)
- Limit Response info dataset queries (#1665)
- Add Removal Policy Retain to Bucket Policy IaC (#1660)
- log API handler response only for LOG_LEVEL DEBUG. Set log level INFO for prod deployments (#1662)
- Add permission checks to markNotificationAsRead + deleteNotification (#1654)
- Added error view and unified utility to check tenant user (#1657)
- Userguide signout flow (#1629)
Full Changelog: v2.6.0...v2.6.1
v2.6.0
What's Changed
New features 🆕
- 🩺 ❗ 🩺 ❗ 🩺 ❗ Adding AWS HealthOmics as a Module in "Play" tools by @ironspur5 in #954
- Allow DA admins to view share logs by @SofiaSazonova in #1274
- Maintenance window by @TejasRGitHub in #1236 and documentation in #1333
- Persistent Email Reminders by @anushka-singh in #1354
- Bulk share reapply on dataset by @TejasRGitHub in #1363
- Convert Dataset Lock Mechanism to Generic Resource Lock by @noah-paige in #1338
Refactoring 💻
- Generic dataset module and specific s3_datasets module by @dlpzx ( #1258 , #1276 , #1281 , #1282 , #1292 , #1297 )
- Generic shares_base module and specific s3_datasets_shares module by @dlpzx ( #1284 , #1294 , #1298 , #1311 , #1312 , #1320, #1340, #1350, #1351 , #1357 , #1359 )
- Refactoring getStack API by @noah-paige in #1182 and #1344
- Gql schema cleanup sdkcli by @noah-paige in #1330
- Move quicksight monitoring to config.json and disable it in FE by @dlpzx in #1328
- Remove global imports in modules by @dlpzx in #1270
Enhancements 🥇
- Add confirmation pop-ups for deletion of team roles and groups by @SofiaSazonova in #1231
- UI improvement of "Request Access" by @SofiaSazonova in #1228
- ShareView remake by @SofiaSazonova in #1277
- Create RDS database snapshot before executing alembic migrations by @dlpzx in #1267
- Set DataSearch fuzziness to 0 -- strict search by @SofiaSazonova in #1279
- Add dependency of SSM to cognito url trigger by @dlpzx in #1395
- Ignore ruff change in blame by @petrkalos in #1372
- Allow descriptions schema by @noah-paige in #1305
- Update safety check ignorelist by @petrkalos in #1310
- Misc logging improvements by @petrkalos in #1317
- Update FE dependency and re-create lock files by @noah-paige in #1326
- Updating encryption for lambda env vars - cont by @mourya-33 in #1322
- Organization Group Permissions Add|Edit by @SofiaSazonova in #1306
- Add support for full or partially updating Config params from SSM by @petrkalos in #1318
- Enhance Share Health Status Verify/ReApply by @noah-paige in #1346
- Split cognito urls setup and cognito user creation by @petrkalos in #1366
- Enforce non null on GQL query string if non null defined by @noah-paige in #1362
- Add search (Autocomplete) in dropdowns by @dlpzx (#1368 , #1356 , #1335 , #1347 , #1367 )
- Rename alias for env_vars kms key in cognito lambdas FE and BE by @dlpzx in #1385
- Add check in delete environment for create_failed stacks by @dlpzx in #1386
- Add delete docs not found when re indexing in catalog task by @noah-paige in #1365
- Introduce check for IAM actions in share_verify bucket and access points + reapply with list of allowed actions by @SofiaSazonova in #1407
- Add cognito urls config trigger func frontend by @noah-paige in #1413
Tests 🧪
- Automate bootstrapping of integrations tests by @petrkalos in #1289
- Codebuild integration tests reads cognito-test-users param from environment account by @petrkalos in #1295
- Add environment tests by @petrkalos in #1371, #1334 and Update gql apis + update_environment tests by @petrkalos in #1348
- Add group/consumption_role invite/remove tests by @petrkalos in #1387
- Add Dataset integration tests - Dataset CRUD + actions outside of data.all by @dlpzx in #1379
- Add Worksheet integration tests - all except run sql query by @dlpzx in #1393
- Add Notebook integration tests by @noah-paige in #1400
Fixes 🪲
- Scope down dataset sharing requester IAM role managed IAM policy S3 permissions by @mourya-33 in #1280
- Fix: timeout error when listing Consumption Roles by @SofiaSazonova in #1303
- Fix: upgrade react avoid ip by @dlpzx in #1308
- Fix: Upgrade Github actions/checkout to v4 by @dlpzx in #1307
- Fix positional args generate env access by @noah-paige in #1316
- Fix s3_datasets and s3_datasets_shares tests by @dlpzx in #1325
- Update profiler run status on Refresh by @SofiaSazonova in #1404
- Share UI Submit fix by @SofiaSazonova in #1403
- Share UI fix: revoke items from share in revoked state by @SofiaSazonova in #1394
- Fix: Env Group Option Forms for create Pipelines and Omic Runs by @noah-paige in #1399
- Fix path deequ jar by @noah-paige in #1402
- Fix/remove edit team modals by @noah-paige in #1412
- Fix error while calling get_cognito_groups function by @TejasRGitHub in #1315
- Fix local dev gql request by @noah-paige in #1337
- Fix get author session API QuickSight by @noah-paige in #1383
- Fix Init Share Base by @noah-paige in #1360
- Fix listOrganizationGroupPermissions by @noah-paige in #1369
- Fix migration to not rely on OrganizationService or RequestContext by @noah-paige in #1361
- Fix: glossary status by @noah-paige in #1373
- Fix lambda_env_key out of scope for vpc-facing cognito setup by @dlpzx in #1384
- Script fix by @SofiaSazonova in #1355
- Fix getOrg query by @petrkalos in #1352
- Fix: Add Maintenance Guard Component separate from AuthGuard by @noah-paige in #1321
- Fix: Extend Sagemaker permissions and fix typo by @noah-paige in #1401
- Fix: Alembic sync by @SofiaSazonova in #1336
Dependencies 📦
- Safety checks - Ignore disputed issue on pip by @dlpzx in #1271
- Bump certifi from 2023.7.22 to 2024.7.4 in /deploy/custom_resources/custom_authorizer by @dependabot in #1390
- Upgrade ejs to 3.1.10 in yarn npm by @dlpzx in #1265
- Bump requests from 2.31.0 to 2.32.0 in /backend by @dependabot in #1291
- Bump requests from 2.31.0 to 2.32.0 in /backend/dataall/base/cdkproxy by @dependabot in #1293
- Bump requests from 2.31.0 to 2.32.2 in /deploy/custom_resources/custom_authorizer by @dependabot in #1309
- Upgrade flask packages to satisfy safety check by @petrkalos in #1313
- Fix npm audit findings by @noah-paige in #1341
- Bump urllib3 from 1.26.18 to 1.26.19 in /deploy/custom_resources/custom_authorizer by @dependabot in #1339
- Update version auth at edge to use node v20 by @noah-paige in #1327
New Contributors
- @ironspur5 made their first contribution in #954
Full Changelog: v2.5.0...v2.6.0
v2.5.0
What's Changed
New features 🆕
- Make visibility of auto-approval toggle configurable based on confidentiality by @anushka-singh in #1223
Refactoring 💻
- Uncouple datasets and dataset_sharing modules by @dlpzx in #1184, #1186, #1185, #1187, #1213, #1214 and #1242
- Refactor core - Stacks by @SofiaSazonova in #1194
- Rename datasets as s3_datasets by @dlpzx in #1250
Enhancements 🥇
- Enable encryption for lambda environment variables by @mourya-33 in #1225
- Add integration tests on a real API client and integrate the tests in CICD by @dlpzx in #1219
- Update lambda_api.py to add encryption for lambda env vars by @mourya-33 in #1255
Fixes 🪲
- Fix Profiling job by @SofiaSazonova in #1222
- Fix Notification link routes to a share request page by @SofiaSazonova in #1227
- Fix listValidEnvironments called only once by @noah-paige in #1238
- Fix Alembic Migration: has table checks by @noah-paige in #1240
- Fix EnvironmentGroup can remove other groups by @SofiaSazonova in #1234
- Fix local test groups listing for listGroups query by @noah-paige in #1239
- Fix DATASET_READ_TABLE read permissions by @SofiaSazonova in #1237
- Add order_by for paginated queries by @noah-paige in #1249
- Explicitly specify dataset_client s3 endpoint_url - fix CORS issue in upload files by @petrkalos in #1260
- Fix TABLE/FOLDER READ shared permissions by @SofiaSazonova in #1259
Dependencies 📦
- Bump werkzeug from 3.0.1 to 3.0.3 in /tests_new/integration_tests by @dependabot in #1254
- Bump werkzeug from 3.0.1 to 3.0.3 in /backend/dataall/base/cdkproxy by @dependabot in #1252
- Bump werkzeug from 3.0.1 to 3.0.3 in /tests by @dependabot in #1253
Full Changelog: v2.4.0...2.5.0
v2.4.0
What's Changed
New features 🆕
- Allow multiple environments in the same account with cdk-pivot role by @dlpzx in #1064
- Add high throughput SSM on prod_sizing by @noah-paige in #1154
- Add share_reapply ECS task - ON DEMAND for data.all admins by @dlpzx in #1151
- Initialise RDS database data.all permissions once per deployment by @petrkalos in #1145 and small fix in #1170
- Run RDS database migrations in a custom resource by @petrkalos in #1177
- Ruff code auto-format by @SofiaSazonova in #1105, #1112 and in #1129 and by @petrkalos in #1159 and in #1160
Big Refactoring 💻
- Refactor core/groups by @SofiaSazonova in #1113
- Refactor core/permissions by @SofiaSazonova in #1114
- Refactor core/environment and core/stack by @SofiaSazonova in #1164, in #1169, in #1178 and in #1181
Enhancements 🥇
- Remove allowAll bucket policy statement by @dlpzx in #1106
- Adding check to remove any spaces in confidentiality names by @TejasRGitHub in #1126
- Worksheet UI improvements - fix Team and list Environments of Team by @dlpzx in #1111
- WAF rule parameters in cdk.json + Documentation by @SofiaSazonova in #1140
- Update cdkExecPolicy.yaml to cleanup overly excessive permissions by @mourya-33 in #1085
- Add grants to pivot role in verify tables functions by @dlpzx in #1149
- Implement guardrails and mechanisms to deal with deleted IAM roles in share requests by @SofiaSazonova in #1161
- Implement least privilege principle for cloudfront, lambda and db migration stacks by @mourya-33 in #1134
- Implement less restrictive trust policy for local development pivot roles by @dlpzx in #1176
Fixes 🪲
- Fix EnvUri to check GET_ENV permission for worksheet by @noah-paige in #1125
- Grant IAM permissions to read data to environment team IAM roles independently from CREATE_DATASET permissions by @SofiaSazonova in #1137
- Allow ListEnv to get associated organization information by @noah-paige in #1139
- Redirect the user to correct URL after login by @TejasRGitHub in #1094
- Fixes for email notifications not sending share link in the body by @TejasRGitHub in #1143
- Fix folder pagination missing page by @dlpzx in #1158
- Add "/" to prefix in crawlers if it is not specified in input by @dlpzx in #1156
- Add Athena List permissions to use AWS SDK for Pandas in SageMaker by @dlpzx in #1155
- Add new data.all permissions REMOVE_ORGANIZATION_GROUP, INVITE_ORGANIZATION_GROUP to teams invited to an Organization by @SofiaSazonova in #1162
- Fix missing GET_FOLDER permissions by @dlpzx in #1163
- Fix input parameters for get credentials get environment group by @dlpzx in #1198
- Update CDK exec role Policy name with region in template by @dlpzx in #1197
- Remove creation of log-groups in Lambdas by @dlpzx in #1192
- Fix missing session in resolve_environment by @dlpzx in #1199
- Fix missing $ in CDK custom policy by @dlpzx in #1204
- Fix unnecessary permission check in resolve_stack functions (failure in list datasets when there are shared datasets) by @dlpzx in #1205
- Fix reference to locationUri by @dlpzx in #1209
- Fix sagemaker tagging permissions by @dlpzx in #1211
Documentation 📚
- Documentation in GitHub pages for release 2.4.0 by @dlpzx in #1191
- Documentation in Userguide for release 2.4 by @dlpzx in #1218
Dependencies 📦
v2.3.0
What's Changed
- Using cdk.json parameter `enable_update_dataall_stacks_in_cicd_pipeline` --> automatically updates the environments and dataset stacks in the CICD pipeline
- Waiting for overnight update stack task --> same as the above, but it runs at a daily schedule.
- Updating environments in Environment > Stack tab > click on `Update` button --> manual update
New features 🆕
- Introduce dataset lock for data sharing, increasing robustness of parallel data sharing by @anushka-singh in #1072
- Add verification of data sharing and reapplying if "unhealthy" by @noah-paige in #1062
- Enable Central Catalog Glue databases import by @TejasRGitHub in #1021 and list them in worksheets in #1079
- Replace IAM inline policies by configurable Managed Policies for folder and bucket sharing by @SofiaSazonova and @dlpzx in #1068
- Simplify LakeFormation Glue database shares - single shared_db and single resource link table by @dlpzx in #1016 and add sharing guardrails drop permissions in #1055 and update Worksheet database names in UI in #1063
- Add data sharing auto-approval option for datasets by @SofiaSazonova in #988
- Introduce feature flags for topics and confidentiality and custom confidentiality list by @TejasRGitHub in #1049
Enhancements 🥇
- Enable key rotation for KMS in CodePipeline by @mourya-33 in #923
- Add support for custom environment linking text with sanitization by @zsaltys in #934
- Add KMS encryption for Aurora DB secrets by @mourya-33 in #935
- Implement Docker user directives by @mourya-33 in #895 and by @noah-paige in #968
- Add checkov GitHub actions by @dlpzx in #962
- Add word-wrap in strings in share lists by @dlpzx in #972
- Add logic to serialize bytes and bytearray datatypes to string by @awskaran in #973
- Add network information to listValidEnvironments by @dlpzx in #986
- Introduce data.all version parameter by @SofiaSazonova in #991
- Add WAF ACL to Cognito User Pool by @noah-paige in #976 and in #1097
- Add checkov baseline by @noah-paige in #1019
- Add dataset Description on shares UI page by @TejasRGitHub in #1026
- Allow update consumption role ownership by @petrkalos in #1020
- Add validation of AWS account and region in environment creation by @dlpzx in #1043
- Remove policies-updater ECS task by @dlpzx in #1046
- Remove git_release functionality by @dlpzx in #1042
- Clean-up auto create pivot permissions by @mourya-33 in #1075
- Add email notification metadata by @TejasRGitHub in #1082
- Add guardrails to alembic sync upgrade/downgrade by @noah-paige in #1084
Fixes 🪲
- Fix reAuth re-renders glitch by @noah-paige in #918
- Fix s3 bucket sharing for federated roles by @zsaltys in #920
- Fix Disappearing Env Value Request Access Modal by @noah-paige in #919
- Fix Frontend Config Role Issue while switching from Cognito Idp to Custom Auth by @TejasRGitHub in #938
- Investigate why some shares did not go to failed state (issue 932), but remained stuck or in-progress by @anushka-singh in #933
- Fix when migrating from Manually Created Pivot Role to Auto Create Pivot Role by @TejasRGitHub in #948
- Validate consumer roles by @SofiaSazonova in #951
- Fix local dev environment broken after recent changes by @TejasRGitHub in #967
- Bugfix 956 by @anushka-singh in #961
- Add lakeformation in trust policy of dataset role by @dlpzx in #970
- Add else if condition to get tables into InSync state by @TejasRGitHub in #980
- Fix consumption role filtering by @TejasRGitHub in #975
- Replace dataall prefix by resourcePrefix in data pipeline creation by @dlpzx in #985
- Remove AWS Managed Lake Formation Service Linked Role from Pivot Role Nested Stack by @TejasRGitHub in #999
- Fix created dataset naming convention by @noah-paige in #1002
- Add CloudFormation permission to PivotRoleNestedStack by @TejasRGitHub in #1040
- Fix userguide dockerfile by @dlpzx in #1089
- Create DatasetLock for new datasets by @noah-paige in #1090
- Fix verify share table items and access point share no bucket policy by @noah-paige in #1095
- Add check and reapply for attaching S3 IAM policy by @dlpzx in #1096
- Fix counter on paged responses by @petrkalos in #1091
- Handle Error on clean up share and not get stuck in IN_PROGRESS status by @noah-paige in #1099
- Fix issue in SageMaker Create permissions by @dlpzx in #1102
Refactoring 💻
- Refactor Core/Organization to follow api/services/db layers by @dbalintx in #989
- Refactor Core/Vpc to follow api/services/db layers by @dlpzx in #1044
- Refactor Enums by @SofiaSazonova in #978
Documentation 📚
- Update Userguide documentation for v2.3 updates by @noah-paige in #1100
- Add alembic documentation by @SofiaSazonova in #1033
Dependencies 📦
- Upgrade Aurora postgreSQL engine 11 --> 13 by @noah-paige in #963
- Upgrade `axios` package to resolve follow-redirect vulnerability by @noah-paige in #952
- Remove unused packages: `jinja2`, `deprecated` by @dlpzx in #969
- Upgrade npm packages: `axios`, `css-tools` by @dlpzx in #1052
- Upgrade `postcss` and add yarn resolutions by @dlpzx in #1059
- Apply `boto3==1.34.35` in DeployFrontend action by @anandsumit2000 in #1054
- Upgrade `starlette` version and dependencies to avoid ReDoS by @dlpzx in #1038
- Upgrade `ip` package in frontend for yarn and npm by @dlpzx in #1070
New Contributors 👨‍💻 👩‍💻
- @SofiaSazonova made their first contribution in #951
- @awskaran made their first contribution in #973
- @petrkalos made their first contribution in #1020
- @anandsumit2000 made their first contribution in #1054
Full Changelog: v2.2.0...v2.3.0